
Author Archives:

Broken HTML parsing in Java

16-Aug-07

Given: HTML code that is invalid and not well-formed. Task: turn it into well-formed XHTML, in Java. We considered JTidy, but its source looked too messy and hacky. I looked for a parser that could correct XML, not in the sense of schema validation, but one that fixes any mistakes and outputs well-formed XML along with an error list. I was able […]

About this blog

16-Aug-07

I’m going to start by reposting my older articles, originally written in Russian. To be honest, it’s a self-promotion blog that has a chance of becoming commercial. OTOH, this also means the articles will be carefully reviewed, filtered and refined. I promise that I’m intending to maybe do it 🙂