Difference between revisions of "Tagsoup"

Latest revision as of 23:41, 23 October 2009

TagSoup is a Haskell library for extracting information out of unstructured HTML code, sometimes known as tag soup. The HTML does not have to be well-formed or render properly within any particular framework. The library is intended for situations where the author of the HTML is not cooperating with the person trying to extract the information, but is also not trying to hide it.

The library provides a basic data type for a list of unstructured tags, a parser to convert HTML into this tag type, and useful functions and combinators for finding and extracting information.
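As a rough illustration of those three pieces (the tag type, the parser, and the helper functions), here is a minimal sketch that pulls link targets out of deliberately malformed HTML. It assumes the `Text.HTML.TagSoup` module from the tagsoup package; `extractLinks` and the sample string are hypothetical names made up for this example, not part of the library.

```haskell
module Main where

import Text.HTML.TagSoup (fromAttrib, isTagOpenName, parseTags)

-- | Hypothetical helper: collect the href of every <a> open tag.
-- parseTags yields a flat list of tags, so missing close tags
-- or bad nesting in the input do not matter here.
extractLinks :: String -> [String]
extractLinks html =
  [ href
  | tag <- parseTags html
  , isTagOpenName "a" tag
  , let href = fromAttrib "href" tag  -- empty string when the attribute is absent
  , not (null href)
  ]

main :: IO ()
main = mapM_ putStrLn (extractLinks sample)
  where
    -- Illustrative input: the <p> and <a> tags are never closed.
    sample = "<p>Unclosed <a href=\"http://example.org/\">link<p>next"
```

Because the parser produces a flat list rather than a tree, ordinary list functions and comprehensions are enough for most extraction tasks.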

For more information, see the tagsoup website (http://community.haskell.org/~ndm/tagsoup).

Related Projects

Tagsoup for Java - an independently written parser for malformed HTML, implemented in Java. Its page also includes links to other HTML parsers.

@@ Line 3: / Line 3: @@
 The library provides a basic data type for a list of unstructured tags, a parser to convert HTML into this tag type, and useful functions and combinators for finding and extracting information.
-For more information, see the [http://community.haskell.org/~ndm/tagsoup tagsoup website]
+For more information, see the [http://community.haskell.org/~ndm/tagsoup tagsoup website].
 ==Related Projects==
