Category Archives: Main

On-topic posts

Broken HTML parsing in Java

16-Aug-07

Given: HTML code that is invalid and not well-formed. Task: turn it into well-formed XHTML, in Java. We considered JTidy, but its source looked too messy and hacky. I looked for a parser that could correct XML, not in the sense of schema validation, but one that fixes any mistakes and outputs well-formed XML along with an error list. I was able […]