author | John MacFarlane <jgm@berkeley.edu> | 2010-12-22 20:25:15 -0800 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2010-12-30 13:55:40 -0800 |
commit | 904050fa36715e18522d80432a2666fcbaacd105 (patch) | |
tree | 4745876e797d400539dd80309d31c330a013e969 /Benchmark.hs | |
parent | 220fe5fab89ce84fcb98f0430c4126281ca8362d (diff) |
New HTML reader using tagsoup as a lexer.
* The new reader is faster and more accurate.
* API changes for Text.Pandoc.Readers.HTML:
  - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag,
    anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType,
    htmlBlockElement, htmlComment
  - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag
* tagsoup is a new dependency.
* Text.Pandoc.Parsing: Generalized type on readWith.
* Benchmark.hs: Added length calculation to force full evaluation.
* Updated HTML reader tests.
* Updated markdown and textile readers to use the functions from
  the HTML reader.
* Note: The markdown reader now correctly handles some cases it did not
  before. For example:

      <hr/>

  is reproduced without adding a space.

      <script>
      a = '<b>';
      </script>

  is parsed correctly.
Diffstat (limited to 'Benchmark.hs')
-rw-r--r-- | Benchmark.hs | 7 |
1 files changed, 5 insertions, 2 deletions
diff --git a/Benchmark.hs b/Benchmark.hs
index 67c790526..0f3520fde 100644
--- a/Benchmark.hs
+++ b/Benchmark.hs
@@ -13,8 +13,11 @@ readerBench doc (name, reader) =
       inp = writer defaultWriterOptions{ writerWrapText = True
                                        , writerLiterateHaskell = "+lhs" `isSuffixOf` name
                                        } doc
-  in  bench (name ++ " reader") $ whnf
-        (reader defaultParserState{stateSmart = True
+      -- we compute the length to force full evaluation
+      getLength (Pandoc (Meta a b c) d) =
+            length a + length b + length c + length d
+  in  bench (name ++ " reader") $ whnf (getLength .
+        reader defaultParserState{ stateSmart = True
                                  , stateStandalone = True
                                  , stateLiterateHaskell = "+lhs" `isSuffixOf` name
                                  }) inp
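
Why the length calculation matters: criterion's whnf evaluates a result only to weak head normal form, so benchmarking a reader that returns a lazy document may stop at the outermost constructor without forcing the lists inside it. The sketch below illustrates the effect using only base. The Doc type, docLength, and brokenDoc are hypothetical stand-ins for illustration, not Pandoc's actual definitions. Note that length forces the spine of each list, not every element.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical stand-in for a document type with lazy list fields
-- (an assumption for illustration; not the real Pandoc definition).
data Doc = Doc [String] [String]

-- Sum of the list lengths, analogous to getLength in the diff.
-- Evaluating the resulting Int requires walking every list spine.
docLength :: Doc -> Int
docLength (Doc meta blocks) = length meta + length blocks

-- A document whose block list fails partway along its spine,
-- so we can observe whether the spine was actually traversed.
brokenDoc :: Doc
brokenDoc = Doc ["title"] ("para" : error "list spine was forced")

main :: IO ()
main = do
  -- Evaluating the Doc itself to WHNF only exposes the outer constructor,
  -- so the error hidden in the list is never reached.
  r1 <- try (evaluate brokenDoc) :: IO (Either SomeException Doc)
  putStrLn (either (const "Doc to WHNF: list forced")
                   (const "Doc to WHNF: list untouched") r1)
  -- Evaluating the Int to WHNF has to traverse both lists,
  -- so the error surfaces.
  r2 <- try (evaluate (docLength brokenDoc)) :: IO (Either SomeException Int)
  putStrLn (either (const "length to WHNF: list forced")
                   (const "length to WHNF: list untouched") r2)
```

Composing a cheap summary function such as a length with the reader keeps whnf as the benchmark driver without adding a dependency on a deep-evaluation class.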