path: root/src/Text/Pandoc/Readers
Commit message / Author / Age
* Docx reader: Import traverse for ghc 7.8Jesse Rosenthal2016-08-29
| | | | The GHC 7.8 build was erroring without it.
* Docx reader: clean up function with `traverse`Jesse Rosenthal2016-08-29
|
* Merge branch 'org-meta-handling'Albert Krewinkel2016-08-29
|\
| * Org reader: respect `creator` export optionAlbert Krewinkel2016-08-29
| | | | | | | | | | | | | | | | | | | | | | The `creator` option controls whether the creator meta-field should be included in the final markup. Setting `#+OPTIONS: creator:nil` will drop the creator field from the final meta-data output. Org-mode recognizes the special value `comment` for this field, causing the creator to be included in a comment. This is difficult to translate to Pandoc internals and is hence interpreted the same as other truish values (i.e. the meta field is kept if it's present).
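The effect of such export options on the metadata can be sketched roughly as follows. This is an illustration in Python, not the reader's Haskell code: the names `parse_options` and `apply_export_options` are hypothetical, and keeping a field when its option is absent is an assumption of the sketch, not a statement about pandoc's defaults.

```python
def parse_options(options_line):
    """Parse the text after `#+OPTIONS:` into a dict of key -> raw value."""
    settings = {}
    for item in options_line.split():
        key, sep, value = item.partition(":")
        if sep:  # ignore tokens that are not of the form key:value
            settings[key] = value
    return settings


def apply_export_options(meta, settings, controlled=("author", "email", "creator")):
    """Drop a controlled meta field only when its option is `nil`.

    Any other value (including `comment` for `creator`) counts as truish,
    so the field is kept if it is present.
    """
    return {
        key: value
        for key, value in meta.items()
        if not (key in controlled and settings.get(key) == "nil")
    }


meta = {"title": "Report", "creator": "Emacs", "email": "a@example.org"}
kept = apply_export_options(meta, parse_options("creator:nil email:t"))
```

With these inputs `kept` retains `title` and `email` but not `creator`; with `creator:comment` the creator field would survive.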
| * Org reader: respect `email` export optionAlbert Krewinkel2016-08-29
| | | | | | | | | | | | The `email` option controls whether the email meta-field should be included in the final markup. Setting `#+OPTIONS: email:nil` will drop the email field from the final meta-data output.
| * Org reader: respect `author` export optionAlbert Krewinkel2016-08-29
| | | | | | | | | | | | The `author` option controls whether the author should be included in the final markup. Setting `#+OPTIONS: author:nil` will drop the author from the final meta-data output.
| * Org reader: read HTML_head as header-includesAlbert Krewinkel2016-08-29
| | | | | | | | | | | | HTML-specific head content can be defined in `#+HTML_head` lines. They are parsed as format-specific inlines to ensure that they will only show up in HTML output.
| * Org reader: set classoption meta from LaTeX_class_optionsAlbert Krewinkel2016-08-29
| |
| * Org reader: set documentclass meta from LaTeX_classAlbert Krewinkel2016-08-29
| |
| * Org reader: read LaTeX_header as header-includesAlbert Krewinkel2016-08-29
| | | | | | | | | | | | LaTeX-specific header commands can be defined in `#+LaTeX_header` lines. They are parsed as format-specific inlines to ensure that they will only show up in LaTeX output.
| * Org reader: give precedence to later meta linesAlbert Krewinkel2016-08-29
| | | | | | | | | | | | The last meta-line of any given type is the significant line. Previously the value of the first line was kept, even if more lines of the same type were encountered.
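The precedence rule amounts to "last write wins" when the meta lines are folded into a map. A minimal sketch in Python (not the reader's Haskell code), assuming the lines arrive as `(key, value)` pairs in document order; `collect_meta` is a hypothetical name:

```python
def collect_meta(meta_lines):
    """Fold (key, value) pairs into a dict; a later line overrides an earlier one."""
    meta = {}
    for key, value in meta_lines:
        # Assumption of this sketch: keys are compared case-insensitively.
        meta[key.lower()] = value
    return meta


lines = [("TITLE", "Draft"), ("AUTHOR", "A. Writer"), ("TITLE", "Final")]
result = collect_meta(lines)
```

Here `result["title"]` is `"Final"`, the value of the later `TITLE` line.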
| * Org reader: allow multiple, comma-separated authorsAlbert Krewinkel2016-08-29
| | | | | | | | | | Multiple authors can be specified in the `#+AUTHOR` meta line if they are given as a comma-separated list.
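Splitting the author line can be pictured as below. This is a Python sketch under the assumption that names themselves contain no commas; `split_authors` is a hypothetical helper, not a pandoc function.

```python
def split_authors(author_line):
    """Split the value of an `#+AUTHOR` line on commas, trimming whitespace."""
    return [name.strip() for name in author_line.split(",") if name.strip()]


authors = split_authors("Jane Doe, John Roe")
```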
| * Org reader: read markup only for special meta keysAlbert Krewinkel2016-08-29
| | | | | | | | | | Most meta-keys should be read as normal string values; only a few are interpreted as marked-up text.
| * Org reader: extract meta parsing code to moduleAlbert Krewinkel2016-08-29
| | | | | | | | | | Parsing of meta-data is well separable from other block parsing tasks. Moving into new module to get small files and clearly arranged code.
* | Docx reader: update copyright.Jesse Rosenthal2016-08-28
| |
* | Docx reader: use all anchor spans for header ids.Jesse Rosenthal2016-08-28
| | | | | | | | | | | | | | | | Previously we only used the first anchor span to affect header ids. This allows us to use all the anchor spans in a header, whether they're nested or not. Along with 62882f97, this closes #3088.
* | Docx reader: Let headers use existing id.Jesse Rosenthal2016-08-28
| | | | | | | | | | | | Previously we always generated an id for headers (since they wouldn't bring one from Docx). Now we let it use an existing one if possible. This should allow us to recurse through anchor spans.
* | Docx reader: Handle anchor spans with content in headers.Jesse Rosenthal2016-08-28
|/ | | | | | | | | Previously, we would only be able to figure out internal links to a header in a docx if the anchor span was empty. We change that to read the inlines out of the first anchor span in a header. This still leaves another problem: what to do if there are multiple anchor spans in a header. That will be addressed in a future commit.
* StyleMap: export functions on StyleMap instancesJesse Rosenthal2016-08-15
| | | | We're going to want `getMap` in the Docx Writer.
* Docx parser: Use xml convenience functionsJesse Rosenthal2016-08-13
| | | | | | The functions `isElem` and `elemName` (defined in Docx/Util.hs) make the code a lot cleaner than the original XML.Light functions, but they had been used inconsistently. This puts them in wherever applicable.
* Merge pull request #3048 from tarleb/latex-mini-fixJohn MacFarlane2016-08-11
|\ | | | | LaTeX reader: drop duplicate `*` in bibtexKeyChars
| * LaTeX reader: drop duplicate `*` in bibtexKeyCharsAlbert Krewinkel2016-07-29
| |
* | Merge pull request #3065 from tarleb/org-verse-indentJohn MacFarlane2016-08-09
|\ \ | | | | | | Org reader: preserve indentation of verse lines
| * | Org reader: preserve indentation of verse linesAlbert Krewinkel2016-08-08
| | | | | | | | | | | | | | | | | | | | | Leading spaces in verse lines are converted to non-breaking spaces, so indentation is preserved. This fixes #3064.
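The conversion described here can be sketched as follows, in Python rather than the reader's Haskell. Only leading ASCII spaces are replaced; `preserve_indent` is a hypothetical name used for illustration.

```python
NBSP = "\u00a0"  # non-breaking space


def preserve_indent(line):
    """Replace each leading space of a verse line with a non-breaking space."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    return NBSP * indent + stripped


verse = [preserve_indent(l) for l in ["Roses are red,", "  violets are blue"]]
```

Spaces inside a line are left alone, so only the indentation is affected.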
* | | Org reader: ensure image sources are proper linksAlbert Krewinkel2016-08-09
|/ / | | | | | | | | | | | | | | | | | | | | | | | | Image sources, such as those in plain images, image links, or figures, must be proper URIs or relative file paths to be recognized as images. This restriction is now enforced for all image sources. This also fixes the reader's usage of uncleaned image sources, which led to `file:` prefixes not being deleted from figure images (e.g. `[[file:image.jpg]]` leading to a broken image `<img src="file:image.jpg"/>`). Thanks to @bsag for noticing this bug.
* | MediaWiki reader: properly interpret XML tags in pre environments.John MacFarlane2016-08-06
| | | | | | | | | | They are meant to be interpreted as literal text in textile. Closes #3042.
* | Improved mediawiki reader's treatment of verbatim constructions.John MacFarlane2016-08-06
| | | | | | | | | | | | | | | | Previously these yielded strings of alternating Code and Space elements; we now incorporate the spaces into the Code. Emphasis etc. is still possible inside these. Closes #3055.
* | Fix for unquoted attribute values in mediawiki tables.John MacFarlane2016-08-06
|/ | | | | | | | | Previously an unquoted attribute value in a table row could cause parsing problems. Fixes #3053 (well, proper rowspans and colspans aren't created, but that's a bigger limitation with the current Pandoc document model for tables).
* Textile reader: disallow empty URL in explicit link.John MacFarlane2016-07-22
| | | | Closes #3036.
* Textile reader: support `bc..` extended code blocks.John MacFarlane2016-07-22
| | | | | Also, remove trailing newline in code blocks (consistently with Markdown reader).
* LaTeX reader: be more forgiving of non-standard characters.John MacFarlane2016-07-20
| | | | | | E.g. `^` outside of math. Some custom environments give these a meaning, so we should try not to fall over when we encounter them.
* LaTeX reader: more robust parsing of unknown environments.John MacFarlane2016-07-20
| | | | | We no longer fail on things like `^` inside options for tikz. Closes #3026.
* RST reader: use Div for admonitions.John MacFarlane2016-07-20
| | | | | | | | | | | | | | Previously blockquotes were used. Now a Div is used with class `admonition` and (if relevant) one of the following: `attention`, `caution`, `danger`, `error`, `hint`, `important`, `note`, `tip`, `warning`. `sidebar` is also put into a Div. Note: This will change rendering of RST documents! It should provide much more flexibility. Closes #3031.
* Textile reader: improve definition list parsing.John MacFarlane2016-07-19
| | | | | - Allow multiple terms (which we concatenate with linebreaks). - Fix exponential parsing bug (closes #3020 for real this time).
* Textile reader: improved table parsing.John MacFarlane2016-07-18
| | | | | | | | | | | We now handle cell and row attributes, mostly by skipping them. However, alignments are now handled properly. Since in pandoc alignment is per-column, not per-cell, we try to divine column alignments from cell alignments. Table captions are also now parsed, and textile indicators for thead and tfoot no longer cause parse failure. (However, a row designated as tfoot will just be a regular row in pandoc.)
* Don't require haddock-library 1.4.John MacFarlane2016-07-15
| | | | Instead use CPP to work around version differences.
* Fixed compiler warnings.John MacFarlane2016-07-14
|
* Haddock reader - support math.John MacFarlane2016-07-14
| | | | | The Haddock document model added elements for math in 1.4.
* Removed some redundant class constraints.John MacFarlane2016-07-14
|
* Merge pull request #3019 from tarleb/org-verbatim-fixJohn MacFarlane2016-07-14
|\ | | | | Org reader: fix parsing of verbatim inlines
| * Org reader: fix parsing of verbatim inlinesAlbert Krewinkel2016-07-14
| | | | | | | | | | | | | | | | | | | | Org rules for allowed characters before or after markup chars were not checked for verbatim text. This resulted in wrong parsing outcomes if the verbatim text contained, e.g., space-enclosed markup characters as part of the text (`=is_substr = True=`). Forcing the parser to update the positions of allowed/forbidden markup border characters fixes this. This fixes #3016.
* | Fixed exponential parsing bug in textile reader.John MacFarlane2016-07-14
|/ | | | Closes #3020.
* Org reader: replace ugly code with view patternAlbert Krewinkel2016-07-04
| | | | | | | Some less-than-smart code required a pragma switching off overlapping pattern warnings in order to compile cleanly. Using view patterns makes the code easier to read and also doesn't require overlapping pattern checks to be disabled.
* Merge pull request #3010 from tarleb/org-header-treeJohn MacFarlane2016-07-03
|\ | | | | Org reader: support archived trees, headline levels export setting
| * Org reader: support headline levels export settingAlbert Krewinkel2016-07-03
| | | | | | | | | | The depths of headlines can be modified using the `H` option. Deeper headlines will be converted to lists.
| * Org reader: put export setting parser into moduleAlbert Krewinkel2016-07-02
| | | | | | | | | | Export option parsing is distinct enough from general block parsing to justify putting it into a separate module.
| * Org reader: support archived trees export optionsAlbert Krewinkel2016-07-01
| | | | | | | | | | | | | | Handling of archived trees can be modified using the `arch` option. Archived trees are either dropped, exported completely, or collapsed to include just the header when the `arch` option is nil, non-nil, or `headline`, respectively.
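The three cases can be summarised in a small Python sketch (not the reader's Haskell code). The function name `export_archived` and the representation of a tree as a headline plus a list of body lines are assumptions made for illustration.

```python
def export_archived(headline, body, arch):
    """Return what is exported for an archived tree under the `arch` setting."""
    if arch == "nil":
        return []                      # tree is dropped entirely
    if arch == "headline":
        return [headline]              # tree is collapsed to its header
    return [headline] + list(body)     # any other non-nil value: full export


tree = ("* Old notes", ["first line", "second line"])
collapsed = export_archived(tree[0], tree[1], "headline")
```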
| * Org reader: refactor comment tree handlingAlbert Krewinkel2016-07-01
| | | | | | | | | | | | Comment trees were handled after parsing, as pattern matching on lists is easier than matching on sequences. The new method of reading documents as trees allows for more elegant subtree removal.
| * Org reader: parse as headlines, convert to blocksAlbert Krewinkel2016-07-01
| | | | | | | | | | | | | | Emacs org-mode is based on outline-mode, which treats documents as trees with headlines as nodes. The reader is refactored to parse into a similar tree structure. This simplifies transformations acting on document (sub-)trees.
| * Org reader: improve tag and properties type safetyAlbert Krewinkel2016-07-01
| | | | | | | | | | Specific newtype definitions are used to replace stringly typing of tags and properties. Type safety is increased while readability is improved.