Sun May 01 2011 23:33 Overview: Extracting article text from HTML documents | My tech blog.:
Mon Mar 28 2011 01:36 boilerpipe - Boilerplate Removal and Fulltext Extraction from HTML pages - Google Project Hosting:
Whereas this is work-related
Sat Mar 26 2011 16:17 http://packages.python.org/pyquery/:
A good DSL for querying HTML with jQuery-style selectors.
© 2000-2013 Leonard Richardson.