HTML and XML Manipulation Utilities - HTML-XML-utils

HTML-XML-utils is a set of small C programs (filters) that read HTML and XML files. They can add a table of contents, an alphabetical index, a bibliography, cross-references, or numbered headings, and they can remove elements, count elements, pretty-print the markup, and more. When reading HTML, the tools assume the input is correct HTML 4.0 or close to it.

The package includes, among others, the following utilities:
 asc2xml      -  convert from &#nnn; entities to UTF-8
 xml2asc      -  convert from UTF-8 to &#nnn; entities
 hxaddid      -  add IDs to selected elements
 hxcite       -  replace bibliographic references by hyperlinks
 hxcite-mkbib -  expand references and create bibliography
 hxclean      -  apply heuristics to correct an HTML file
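
To make the entity conversion in the list above concrete, here is a minimal Python sketch of mapping between UTF-8 text and decimal &#nnn; references. It only illustrates the idea and is not the code of the utilities themselves, which are written in C. The function names to_entities and from_entities are hypothetical, and the sketch assumes only decimal references need handling (hexadecimal &#xhh; forms are left untouched).

```python
import re


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal &#nnn; reference."""
    # ASCII characters (code points below 128) pass through unchanged.
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


def from_entities(text: str) -> str:
    """Replace decimal &#nnn; references with the characters they denote."""
    # Assumption: every reference names a valid Unicode code point.
    return re.sub(r"&#([0-9]+);", lambda m: chr(int(m.group(1))), text)


if __name__ == "__main__":
    sample = "café"
    encoded = to_entities(sample)
    print(encoded)                 # caf&#233;
    print(from_entities(encoded))  # café
```

The result of to_entities is pure ASCII, which is why this kind of filter is handy when a file has to pass through software that cannot handle UTF-8.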

source:http://linuxpoison.blogspot.com/2012/01/135781677518703.html