Little useful frameworks - JSoup
On several occasions, I have needed to parse HTML pages and extract data from specific tags.
For instance, I had to migrate a wiki, and to transform and bulk-import pages into a CMS.
JSoup, a Java library, makes these operations easier.
Built around HTML5, JSoup parses HTML from a URL, a String or a file, lets you find elements with CSS selectors or DOM traversal, and provides helpers to manipulate the results: you can easily replace content or wrap elements in HTML tags.
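
To make this concrete, here is a minimal sketch of the kind of extraction and rewriting I mean. The markup, the class name JsoupSketch and the helper methods extractText and migrate are invented for illustration, and it assumes the jsoup jar is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupSketch {

    // Hypothetical wiki-style markup, used only for illustration.
    static final String HTML =
        "<html><body>"
        + "<h1 class=\"title\">Old wiki page</h1>"
        + "<div id=\"content\"><p>First paragraph.</p><p>Second paragraph.</p></div>"
        + "</body></html>";

    // Extraction: collect the text of every element matching a CSS selector.
    static List<String> extractText(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element element : doc.select(cssQuery)) {
            texts.add(element.text());
        }
        return texts;
    }

    // Manipulation: replace the title text, then wrap each paragraph in a new tag.
    static String migrate(String html) {
        Document doc = Jsoup.parse(html);
        Element title = doc.select("h1.title").first();
        if (title != null) {
            title.text("Migrated page");
        }
        doc.select("#content p").wrap("<section class=\"imported\"></section>");
        return doc.body().html();
    }

    public static void main(String[] args) {
        System.out.println(extractText(HTML, "#content p"));
        System.out.println(migrate(HTML));
    }
}
```

The same Document API applies whatever the source: Jsoup.connect(url).get() fetches and parses a page from a URL, and Jsoup.parse(file, "UTF-8") reads a file. I parse a String here only so that the sketch runs without network access.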