Extract HTML Structure From Page With Simple Script
Monday, October 5th, 2009 by Jesper Rønn-Jensen
I often want to extract parts of an HTML page. I usually end up parsing the document with Nokogiri and pulling out the part I need with a CSS selector. It's pretty straightforward to write from scratch.
To follow the Don’t Repeat Yourself principle, I created a script to make it a one-liner. Just run:
ruby parsepage.rb [url] [css_selector]
Feel free to use and [...]