Harman Patil (Editor)

Comparison of HTML parsers

Updated on
Edit
Like
Comment
Share on FacebookTweet on TwitterShare on LinkedInShare on Reddit

HTML parsers are software for automated Hypertext Markup Language (HTML) parsing. They have two main purposes:

  • HTML traversal: offer an interface for programmers to easily access and modify of the "HTML string code". Canonical example: DOM parsers.
  • HTML clean: to fix invalid HTML and to improve the layout and indent style of the resulting markup. Canonical example: HTML Tidy.
  • * Latest release (of significant changes) date.** sanitize (generating standard-compatible web-page, reduce spam, etc.) and clean (strip out surplus presentational tags, remove XSS code, etc.) HTML code.*** Updates HTML4.X to XHTML or to HTML5, converting deprecated tags (ex. CENTER) to valid ones (ex. DIV with style="text-align:center;").

    References

    Comparison of HTML parsers Wikipedia


    Similar Topics