path: root/src/Text/Pandoc/Parsing.hs
Commit message (Author, Date)
* Tighten up parsing of raw email addresses. (John MacFarlane, 2016-10-23)
| Technically `**@user` is a valid email address, but if we allow things like this, we get bad results in markdown flavors that autolink raw email addresses. (See #2940.) So we exclude a few valid email addresses in order to avoid these more common bad cases. Closes #2940.
* Allow empty lines when parsing line blocks (Albert Krewinkel, 2016-10-13)
| Line blocks are allowed to contain empty lines and should be parsed as a single block in that case. Previously an empty (line block) line would have terminated parsing of the line block element.
* Remove TagSoup compat (Jesse Rosenthal, 2016-09-02)
| We already lower-bound tagsoup at 0.13.7, which means we were always running the compatibility layer (it was conditional on min value 0.13). Better to just use `lookupEntity` from the library directly, and convert a string to a char if need be.
* Remove Compat.Monoid (Jesse Rosenthal, 2016-09-02)
| This was only necessary for GHC versions with base below 4.5 (i.e., ghc < 7.4).
* Use liftM since otherwise Functor type constraint needed in ghc 7.8. (John MacFarlane, 2016-07-15)
|
* Fixed compiler warnings. (John MacFarlane, 2016-07-14)
|
* Updated copyright dates to include 2016. (John MacFarlane, 2016-03-22)
|
* Changed type of Shared.uniqueIdent argument from [String] to Set String. (John MacFarlane, 2016-01-22)
| This avoids performance problems in documents with many identically named headers. Closes #2671.
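The effect of the `[String]` to `Set String` change can be illustrated with a minimal sketch. This is not pandoc's `uniqueIdent`: the name `uniqueIdentSketch`, the plain `String` base identifier, and the `-1`, `-2`, ... suffix scheme are assumptions made for illustration. The point is only that each membership test against a `Set` is logarithmic rather than linear in the number of identifiers already used.

```haskell
import qualified Data.Set as Set

-- Hypothetical helper (not pandoc's actual uniqueIdent): pick the first
-- identifier derived from `base` that is not already in the set of used
-- identifiers. Each membership test on a Set is O(log n).
uniqueIdentSketch :: String -> Set.Set String -> String
uniqueIdentSketch base used
  | base `Set.notMember` used = base
  | otherwise                 = go 1
  where
    go :: Int -> String
    go n
      | candidate `Set.member` used = go (n + 1)
      | otherwise                   = candidate
      where
        candidate = base ++ "-" ++ show n
```

For instance, with `"intro"` and `"intro-1"` already in the set, the sketch yields `"intro-2"`.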
* Work around tagsoup bug - not allowing uppercase x in hex entities. (John MacFarlane, 2016-01-08)
| Issue submitted at tagsoup.
* Entity handling fixes: (John MacFarlane, 2016-01-08)
| - Text.Pandoc.XML.fromEntities: handle entities without a semicolon. Always lookup character references with the trailing ';', even if it wasn't present. And never add it when looking up numerical entities. (This is what tagsoup seems to require.)
| - Text.Pandoc.Parsing.characterReference: Always lookup character references with the trailing ';', and leave off the ';' when looking up numerical entities. This fixes a regression for e.g. `&lang;`.
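The lookup rule described in this commit (named references are looked up with a trailing `;`, numerical ones without) can be sketched as a key-normalization step. The function name `entityLookupKey` is hypothetical, and the sketch covers only the normalization; the actual lookup is done by tagsoup and is not reproduced here.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical helper: normalize the text between '&' and the end of a
-- character reference into the key used for lookup. Named references
-- always get a trailing ';'; numerical references ("#65", "#x41") never do.
entityLookupKey :: String -> String
entityLookupKey raw
  | "#" `isPrefixOf` name = name
  | otherwise             = name ++ ";"
  where
    -- Drop a trailing ';' if the input had one.
    name = takeWhile (/= ';') raw
```

Under these assumptions both `lang` and `lang;` normalize to `lang;`, while `#x41;` normalizes to `#x41`.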
* Fixed cite key parsing regression. (John MacFarlane, 2015-12-12)
| We were capturing final colons as in [@foo: bar]; the citation id was being parsed as "@foo:". Closes jgm/pandoc-citeproc#201.
* Merge branch 'new-image-attributes' of https://github.com/mb21/pandoc into mb21-new-image-attributes (John MacFarlane, 2015-11-19)
|\
| | - Bumped version to 1.16.
| | - Added Attr field to Link and Image.
| | - Added `common_link_attributes` extension.
| | - Updated readers for link attributes.
| | - Updated writers for link attributes.
| | - Updated tests
| | - Updated stack.yaml to build against unreleased versions of pandoc-types and texmath.
| | - Fixed various compiler warnings.
| | Closes #261.
| | TODO:
| | - Relative (percentage) image widths in docx writer.
| | - ODT/OpenDocument writer (untested, same issue about percentage widths).
| | - Update pandoc-citeproc.
| * Parsing: Add `extractIdClass`, modified type of `KeyTable`. (John MacFarlane, 2015-08-05)
| | (mb21)
* | Allow `://` in citation keys. (John MacFarlane, 2015-11-13)
| | Closes jgm/pandoc-citeproc#166.
* | Restored Text.Pandoc.Compat.Monoid. (John MacFarlane, 2015-11-09)
| | Don't use custom prelude for latest ghc. This is a better approach to making 'stack ghci' and 'cabal repl' work. Instead of using NoImplicitPrelude, we only use the custom prelude for older ghc versions. The custom prelude presents a uniform API that matches the current base version's prelude. So, when developing (presumably with latest ghc), we don't use a custom prelude at all and hence have no trouble with ghci.
| | The custom prelude no longer exports (<>): we now want to match the base 4.8 prelude behavior.
* | Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly." (John MacFarlane, 2015-11-09)
| | This reverts commit c423dbb5a34c2d1195020e0f0ca3aae883d0749b.
* | Use -XNoImplicitPrelude and 'import Prelude' explicitly. (John MacFarlane, 2015-11-08)
| | This is needed for ghci to work with pandoc, given that we now use a custom prelude. Closes #2503.
* | Use custom Prelude to avoid compiler warnings. (John MacFarlane, 2015-10-14)
|/
| - The (non-exported) prelude is in prelude/Prelude.hs.
| - It exports Monoid and Applicative, like base 4.8 prelude, but works with older base versions.
| - It exports (<>) for mappend.
| - It hides 'catch' on older base versions.
| This allows us to remove many imports of Data.Monoid and Control.Applicative, and remove Text.Pandoc.Compat.Monoid. It should allow us to use -Wall again for ghc 7.10.
* Parsing: toKey: strip off outer brackets. (John MacFarlane, 2015-07-23)
| This makes keys with extra space at the beginning and end work: e.g., given the definition `[foo]: bar`, the reference `[ foo ]` will now be a link to bar (it wasn't before).
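The bracket-stripping behaviour can be sketched as follows. This is an illustration, not the code from Parsing.hs: `toKeySketch` returns a plain `String` rather than pandoc's key type, and the lowercasing and whitespace collapsing are assumptions about how keys are otherwise normalized.

```haskell
import Data.Char (toLower)
import Data.List (isSuffixOf)

-- Hypothetical sketch of key normalization: strip one pair of outer
-- brackets, collapse runs of whitespace (which also trims both ends),
-- and lowercase the result.
toKeySketch :: String -> String
toKeySketch = map toLower . unwords . words . unbracket
  where
    -- Remove the outer brackets only when both are present.
    unbracket ('[' : xs) | "]" `isSuffixOf` xs = init xs
    unbracket xs = xs
```

With this normalization `[foo]` and `[ foo ]` both map to the key `foo`, so they resolve to the same link definition.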
* Improved bare autolink detection. (John MacFarlane, 2015-07-14)
| Previously we disallowed `-` at the end of an autolink, and disallowed the combination `=-`. This commit liberalizes the rules for allowing punctuation in a bare URI. Added test cases.
| One potential drawback is that you can no longer put a bare URI in em dashes like this: `this uri---http://example.com---is an example.` But in this respect we now match github's treatment of bare URIs. Closes #2299.
* Markdown reader: Made implicit header references case-insensitive. (John MacFarlane, 2015-05-13)
| Added `stateHeaderKeys` to `ParserState`; this is a `KeyTable` like `stateKeys`, but it only gets consulted if we don't find a match in `stateKeys`, and if `Ext_implicit_header_references` is enabled. Closes #1606.
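The two-stage lookup described here can be sketched with two maps and a flag. The record `RefState`, its field names, and `lookupRef` are hypothetical stand-ins for `ParserState`, `stateKeys`, `stateHeaderKeys`, and the extension check; the sketch assumes keys are stored lowercased, which is what makes the lookup case-insensitive.

```haskell
import Control.Applicative ((<|>))
import Data.Char (toLower)
import qualified Data.Map as Map

-- Hypothetical, simplified stand-in for the relevant parser state.
data RefState = RefState
  { explicitKeys       :: Map.Map String String  -- like stateKeys
  , headerKeys         :: Map.Map String String  -- like stateHeaderKeys
  , implicitHeaderRefs :: Bool                   -- extension enabled?
  }

-- Look in the explicit reference table first; fall back to the header
-- table only if nothing was found and the extension is enabled.
lookupRef :: RefState -> String -> Maybe String
lookupRef st rawKey =
  Map.lookup key (explicitKeys st)
    <|> (if implicitHeaderRefs st
           then Map.lookup key (headerKeys st)
           else Nothing)
  where
    key = map toLower rawKey  -- assumes stored keys are lowercase
```

Explicit definitions therefore always win over implicit header references with the same key.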
* HTML reader: Fixed detection of self-closing tags. (John MacFarlane, 2015-05-11)
| Earlier versions had a bug and would wrongly think opening tags containing attributes with slashes in them were self-closing. Closes #2146.
* Updated copyright notices to -2015. Closes #2111. (John MacFarlane, 2015-04-26)
|
* Revert "Merge pull request #1947 from mpickering/Fmonad" (John MacFarlane, 2015-04-18)
| Closes #2062.
| This reverts commit c302bdcdbe97b38721015fe82403b2a8f488a702, reversing changes made to b983adf0d0cbc98d2da1e2751f46ae1f93352be6.
| Conflicts:
|   src/Text/Pandoc/Parsing.hs
|   src/Text/Pandoc/Readers/Markdown.hs
|   src/Text/Pandoc/Readers/Org.hs
|   src/Text/Pandoc/Readers/RST.hs
* Merge pull request #1954 from mcmtroffaes/feature/citekey-firstchar-alphanum (John MacFarlane, 2015-04-17)
|\
| | Allow digit as first character of a citation key.
| * Allow digit as first character of a citation key. (Matthias C. M. Troffaes, 2015-02-18)
| | - Update parser to recognize citation keys starting with a digit.
| | - Update documentation accordingly.
| | - Test case added.
| | See https://github.com/jgm/pandoc-citeproc/issues/97
* | MD Reader: Smart `'` after inline math (Nikolay Yakimov, 2015-04-18)
| | Closes #1909.
| | Adds a new parser combinator to Parsing.hs, `a <+?> b`: if `a` succeeds, it applies `b` and mappends its output (if any) to the result of `a`. If `b` fails, the result is just that of `a`; if `a` fails, the whole expression fails.
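The described semantics of `<+?>` can be sketched in a few lines on top of parsec. This follows the prose of the commit message and is not copied from Parsing.hs; the fixity declaration and the use of `try` (so that a failing `b` consumes no input) are assumptions.

```haskell
import Text.Parsec

infixr 5 <+?>

-- Sketch of the combinator as described: run `a`; then try `b` and
-- append its result. If `b` fails, fall back to `mempty`, so the
-- overall result is just that of `a`. If `a` fails, the whole parser fails.
(<+?>) :: Monoid a => ParsecT s u m a -> ParsecT s u m a -> ParsecT s u m a
a <+?> b = do
  x <- a
  y <- try b <|> return mempty
  return (x `mappend` y)
```

For example, `string "ab" <+?> string "cd"` yields `"abcd"` on the input `abcd`, yields `"ab"` on `abxy`, and fails on `xy`.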
* | Add Text.Pandoc.Error module with PandocError type (Matthew Pickering, 2015-02-18)
| |
* | Factor out "returnState" into Parsing module (Matthew Pickering, 2015-02-18)
| |
* | Generalise signature of addWarning (Matthew Pickering, 2015-02-18)
| |
* | Add check to see whether in a footnote to ParserState (to avoid circular footnotes) (Matthew Pickering, 2015-02-18)
| |
* | Remove F monad from Parsing (Matthew Pickering, 2015-02-18)
| |
* | Changed parseWithWarnings to the more general returnWarnings parser transformer (Matthew Pickering, 2015-02-18)
| |
* | Added generalize function which can be used to lift specialised parsers. (Matthew Pickering, 2015-02-18)
|/
| Monad m => Parsec s st a -> ParsecT s st m a
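One way to write a function with the signature above is to run the `Identity`-based parser and re-wrap each layer of its result in the target monad. The sketch below does this with parsec's low-level `mkPT` and `runParsecT`; the name `generalizeSketch` is hypothetical and the body is an assumption about how such lifting can be done, not a quotation of pandoc's definition.

```haskell
import Data.Functor.Identity (runIdentity)
import Text.Parsec
import Text.Parsec.Prim (mkPT, runParsecT)

-- Lift a pure parser (Parsec s u a = ParsecT s u Identity a) into
-- ParsecT over an arbitrary monad m. The pure parser is run on the
-- current state; the outer and inner Identity layers of its result
-- are unwrapped and re-wrapped with `return` in m.
generalizeSketch :: Monad m => Parsec s u a -> ParsecT s u m a
generalizeSketch p = mkPT $ \st ->
  return (fmap (return . runIdentity) (runIdentity (runParsecT p st)))
```

A parser such as `many1 digit`, written once against `Parsec`, can then be reused inside a `ParsecT` stack over `IO` or any other monad.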
* Text.Pandoc.Parsing: Change parseFromString to fail if not all input is consumed. (Matthew Pickering, 2014-12-15)
|
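The behaviour this commit describes can be sketched with parsec's input and position primitives. The name `parseFromStringSketch` is hypothetical, and skipping trailing whitespace before the end-of-input check is an assumption; the essential part is the `eof` check, which makes the sub-parse fail when the parser leaves part of the string unconsumed.

```haskell
import Text.Parsec

-- Run `parser` on `str` instead of the real input, requiring that it
-- consume the whole string (trailing whitespace aside), then restore
-- the original input and source position.
parseFromStringSketch :: Monad m
                      => ParsecT String u m a -> String -> ParsecT String u m a
parseFromStringSketch parser str = do
  oldPos   <- getPosition
  oldInput <- getInput
  setInput str
  result <- parser
  spaces
  eof                  -- fail if part of `str` was left unconsumed
  setInput oldInput
  setPosition oldPos
  return result
```

For instance, `parseFromStringSketch (many1 letter) "abc"` succeeds, whereas the same parser applied to `"abc1"` fails because the `1` is left over.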
* Merge pull request #1805 from bergey/rst (John MacFarlane, 2014-12-15)
|\
| | RST Reader - Improved Role Support
| * RST Reader: compute Attrs when role is defined (Daniel Bergey, 2014-12-12)
| | Move recursive role lookup from renderRole to addNewRole. The Attr value will be the same for every occurrence of this role, so there's no reason to compute it every time. This allows simplifying the stateRstCustomRoles map considerably. We could go even further, and remove the fmt and attr arguments to renderRole, which are null except for custom roles.
| * expose warnings from RST reader; refactor (Daniel Bergey, 2014-12-12)
| | This commit moves some code which was only used for the Markdown Reader into a generic form which can be used for any Reader. Otherwise, it takes naming and interface cues from the preexisting Markdown code.
| * RST Reader: Warn about skipped directives (Daniel Bergey, 2014-12-08)
| | move `addWarning` to Parsing.hs, so it can be used by Markdown & RST readers.
* | Fixed autolinks with following punctuation. (John MacFarlane, 2014-12-14)
|/
| Closes #1811. The price of this is that autolinked bare URIs can no longer contain `>` characters, but this is not a big issue.
* Parsing: fixed `inlineMath` so it handles `\text{..}` containing `$`. (John MacFarlane, 2014-10-19)
| For example: `$x = \text{the $n$th root of $y$}`. Closes #1677.
* Use texmath 0.7 interface. (John MacFarlane, 2014-08-04)
|
* Parsing: Added isbn and pmid schemes (Matthew Pickering, 2014-07-27)
|
* Generalised more in Parsing.hs to enable the use of custom state (Matthew Pickering, 2014-07-26)
|
* Exported runParserT and Stream (Matthew Pickering, 2014-07-22)
|
* Generalised readWith to readWithM (Matthew Pickering, 2014-07-22)
|
* Fix behavior of `markdown_attribute` extension. (John MacFarlane, 2014-07-20)
| It now works as in PHP markdown extra. Setting `markdown="1"` on an outer tag affects all contained tags until it is reversed with `markdown="0"`. Closes #1378.
| Added `stateMarkdownAttribute` to `ParserState`.
* readWith: reverted generalization from f201bdcb. (John MacFarlane, 2014-07-20)
| We need input to be a string so we can print the offending line on an error.
* Parsing: Simplified dash and ellipsis. (John MacFarlane, 2014-07-12)
| This originated with @dubiousjim's observation in #1419 that there was a typo in the definition of enDash. It returned an em dash character instead of an en dash. I thought about why this had not been noticed before, and realized that en dashes were just being parsed as regular symbols. That made me realize that, now that we no longer have dedicated EnDash, EmDash, and Ellipses inline elements, as we used to in pandoc, we no longer need to parse the unicode characters specially. This allowed a considerable simplification of the code. Partially resolves #1419.
* Removed space at ends of lines in source. (John MacFarlane, 2014-07-12)
|