path: root/src/Text/Pandoc/Readers/Org.hs
Commit message | Author | Age
* Update copyright notices to include 2018Albert Krewinkel2018-01-05
|
* Move CR filtering from tabFilter to the readers.John MacFarlane2017-06-20
| | | | | | | | | | The readers previously assumed that CRs had been filtered from the input. Now we strip the CRs in the readers themselves, before parsing. (The point of this is just to simplify the parsers.) Shared now exports a new function `crFilter`. [API change] And `tabFilter` no longer filters CRs.
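A minimal sketch of what a CR filter of this kind could look like, assuming strict `Text` input as introduced by the 2017-06-10 commit. The name `crFilter` comes from the commit message; the body is an illustrative assumption rather than a copy of the code in `Text.Pandoc.Shared`.

```haskell
import qualified Data.Text as T

-- | Drop every carriage return so that later parsing stages only
-- ever see '\n' as a line ending. (Illustrative body, see lead-in.)
crFilter :: T.Text -> T.Text
crFilter = T.filter (/= '\r')
```

A reader would apply such a filter once to the raw input before handing it to the parser, so the parsers themselves never need a CR case.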
* Changed all readers to take Text instead of String.John MacFarlane2017-06-10
| | | | | | | | Readers: Renamed StringReader -> TextReader. Updated tests. API change.
* Update dates in copyright noticesAlbert Krewinkel2017-05-13
| | | | | This follows the suggestions given by the FSF for GPL licensed software. <https://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html>
* Issue warning for duplicate header identifiers.John MacFarlane2017-03-12
| | | | | | | | | | | | | | | As noted in the previous commit, an autogenerated identifier may still coincide with an explicit identifier that is given for a header later in the document, or with an identifier on a div, span, link, or image. This commit adds a warning in this case, so users can supply an explicit identifier. * Added `DuplicateIdentifier` to LogMessage. * Modified HTML, Org, MediaWiki readers so their custom state type is an instance of HasLogMessages. This is necessary for `registerHeader` to issue warnings. See #1745.
* Stylish-haskell automatic formatting changes.John MacFarlane2017-03-04
|
* Unify Errors.Jesse Rosenthal2017-01-25
|
* Working on readers.Jesse Rosenthal2017-01-25
|
* Org reader: refactor comment tree handlingAlbert Krewinkel2016-07-01
| | | | | | Comment trees were handled after parsing, as pattern matching on lists is easier than matching on sequences. The new method of reading documents as trees allows for more elegant subtree removal.
* Org reader: support smart quotes export optionAlbert Krewinkel2016-06-03
| | | | Reading of smart quotes can be toggled using the `'` option.
* Org reader: extract blocks parser to moduleAlbert Krewinkel2016-05-25
| | | | | | Block parsing code is moved to a separate module. This is part of the Org-mode reader cleanup effort.
* Org reader: extract inline parser to moduleAlbert Krewinkel2016-05-25
| | | | | | | Inline parsing code is moved to a separate module. Parsers for block starts are extracted as well, as those are used in the `endline` parser. This is part of the Org-mode reader cleanup effort.
* Org reader: extract parsing function to moduleAlbert Krewinkel2016-05-25
| | | | | | | | | | | | The Org-mode reader uses many functions defined in the `Text.Pandoc.Parsing` utility module. Some of the functions are overwritten with versions adapted to Org-mode idiosyncrasies. These special functions, as well as the normal Pandoc versions, are combined in a single module to increase the ease of use. This leads to decoupling of Org-mode and Pandoc and hence to slightly cleaner code. The downside is code-bloat due to repeated import/export statements.
* Org reader: respect drawer export settingAlbert Krewinkel2016-05-23
| | | | | The `d` export option can be used to control which drawers are exported and which are discarded. Basic support for this option is added here.
* Org reader/writer: use CUSTOM_ID in propertiesAlbert Krewinkel2016-05-22
| | | | | | | | | The `ID` property is reserved for internal use by Org-mode and should not be used. The `CUSTOM_ID` property is to be used instead; it is converted to the `ID` property for certain export formats. The reader and writer erroneously used `ID`. This is corrected by using `CUSTOM_ID` where appropriate.
* Org reader: add :PROPERTIES: drawer supportAlbert Krewinkel2016-05-20
| | | | | | | | | | | | | | Headers can have optional `:PROPERTIES:` drawers associated with them. These drawers contain key/value pairs like the header's `id`. The reader adds all listed pairs to the header's attributes; `id` and `class` attributes are handled specially to match the way `Attr` is defined. This also changes how drawers of unknown type are handled: instead of being included, they are no longer read or exported, which matches current Emacs behavior. This closes #1877.
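The mapping from drawer pairs to header attributes described above can be sketched roughly as follows. The function name `propertiesToAttr` and the local `Attr` alias are hypothetical stand-ins for illustration, and the sketch assumes property keys have already been lower-cased.

```haskell
import Data.Maybe (fromMaybe)

-- Local stand-in for an attribute triple:
-- (identifier, classes, remaining key/value pairs).
type Attr = (String, [String], [(String, String)])

-- | Turn the key/value pairs of a :PROPERTIES: drawer into attributes.
-- `id` becomes the identifier, `class` is split into class names,
-- and every other pair is passed through unchanged.
-- (Hypothetical helper, not the reader's actual function.)
propertiesToAttr :: [(String, String)] -> Attr
propertiesToAttr props = (ident, classes, rest)
  where
    ident   = fromMaybe "" (lookup "id" props)
    classes = maybe [] words (lookup "class" props)
    rest    = [kv | kv@(k, _) <- props, k `notElem` ["id", "class"]]
```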
* Org reader: add support for ATTR_HTML attributesAlbert Krewinkel2016-05-19
| | | | | | | | | | | Arbitrary key-value pairs can be added to some block types using a `#+ATTR_HTML` line before the block. Emacs Org-mode only includes these when exporting to HTML, but since we cannot make this distinction here, the attributes are always added. The functionality is now supported for figures. This closes #1906.
* Org reader: use custom `anyLine`Albert Krewinkel2016-05-19
| | | | | | | | | | | | | Additional state changes need to be made after a newline is parsed, otherwise markup may not be recognized correctly. This fixes a bug where markup after certain block-types would not be recognized. E.g. `/emph/` in the following snippet was not parsed as emphasized:
| | | | | | | | | | | | |     foo
| | | | | | | | | | | | |     # comment
| | | | | | | | | | | | |     /emph/
* Org reader: refactor block attribute handlingAlbert Krewinkel2016-05-19
| | | | | | | A parser state attribute was used to keep track of block attributes defined in meta-lines. Global state is undesirable, so block attributes are no longer saved as part of the parser state. Old functions and the respective part of the parser state are removed.
* Org reader: parse but ignore export optionsAlbert Krewinkel2016-05-11
| | | | All known export options are parsed but ignored.
* Org reader: add support for sub/superscript export optionsAlbert Krewinkel2016-05-11
| | | | | | Org-mode allows export settings to be specified via `#+OPTIONS` lines. Disabling simple sub- and superscripts is one of these export options; this option is now supported.
* Org reader: move parser state into separate moduleAlbert Krewinkel2016-05-11
| | | | | The org reader code has become large and confusing. Extracting smaller parts into submodules should help to clean things up.
* Org reader: fix inline-LaTeX regressionAlbert Krewinkel2016-05-09
| | | | | | | The last fix for whitespace handling of inline LaTeX commands was incorrect, preventing correct recognition of inline LaTeX commands which contain spaces. This fix ensures that only trailing whitespace is cut off.
* Merge pull request #2898 from tarleb/org-table-refactoringJohn MacFarlane2016-05-05
|\ | | | | Org reader: table parsing code refactoring and fixes
| * Org reader: fix handling of empty table cells, rowsAlbert Krewinkel2016-05-04
| | | | | | | | | | | | | | | | | | | | This fixes Org mode parsing of some corner cases regarding empty cells and rows. Empty cells weren't parsed correctly, e.g. `|||` should be two empty cells, but would be parsed as a single cell containing a pipe character. Empty rows were parsed as alignment rows and dropped from the output. This fixes #2616.
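The intended cell splitting can be illustrated with a small stand-alone sketch. It works on plain strings rather than the reader's parser combinators, and the helper names (`tableCells`, `splitOnPipe`, `trim`) are hypothetical; it only shows why `|||` should yield two empty cells.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Split at every pipe; always returns at least one piece.
splitOnPipe :: String -> [String]
splitOnPipe s = case break (== '|') s of
  (cell, [])       -> [cell]
  (cell, _ : rest) -> cell : splitOnPipe rest

-- | Cells of one table row. The leading pipe opens the row; a blank
-- piece after the last pipe is the row's end, not a cell.
tableCells :: String -> [String]
tableCells row = case dropWhile isSpace row of
  '|' : rest -> map trim (dropClosing (splitOnPipe rest))
  _          -> []
  where
    dropClosing cells
      | all isSpace (last cells) = init cells
      | otherwise                = cells
```

With this splitting, every pipe after the first closes a cell, so empty cells survive instead of being merged into a neighbouring cell.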
| * Org reader: refactor rows-to-table conversionAlbert Krewinkel2016-05-04
| | | | | | | | | | This refactors the code converting a list of table lines to an org table ADT. The old code was simplified and is now slightly less ugly.
| * Org reader: stop padding short table rowsAlbert Krewinkel2016-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | Emacs Org-mode doesn't add any padding to table rows. The first row (header or first body row) is used to determine the column count, no other magic is performed. The org reader was padding rows to the length of the longest table row. This was done due to a misunderstanding of how Org handles tables. This feature reflected how Org-mode handles tables when pressing <TAB>. The Org exporter however, which is what the reader should implement, doesn't do any of this. So this was a mis-feature that made the reader more complex and reduced comparability. It was hence removed.
* | Org reader: fix spacing after LaTeX-style symbolsAlbert Krewinkel2016-05-04
|/ | | | | | | The org-reader was dropping space after unescaped LaTeX-style symbol commands: `\ForAll \Auml` resulted in `∀Ä` but should give `∀ Ä` instead. This seems to be because the LaTeX-reader treats the command-terminating space as part of the command. Dropping the trailing space from the symbol-command fixes this issue.
* Ignore leading space in org code blocksEmanuel Evans2016-04-26
| | | | | | Fixes #2862. Also fix up tab handling for leading whitespace in code blocks.
* Merge pull request #2646 from tarleb/org-figure-with-no-nameJohn MacFarlane2016-02-20
|\ | | | | Prefix even empty figure names with "fig:"
| * Prefix even empty figure names with "fig:"Albert Krewinkel2016-01-11
| | | | | | | | | | | | | | | | The convention used by pandoc for figures is to mark them by prefixing the name with "fig:". The org reader failed to do this if a figure had no name. The test for this was broken as well. This fixes #2643.
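A minimal sketch of the prefixing convention, assuming figure names are plain strings. The helper name `figureName` is hypothetical, and leaving already-prefixed names untouched is an assumption made for the example.

```haskell
import Data.List (isPrefixOf)

-- | Mark a name as belonging to a figure by prefixing it with "fig:".
-- An empty name still gets the prefix, which is the case this fix
-- is about. (Hypothetical helper name.)
figureName :: String -> String
figureName name
  | "fig:" `isPrefixOf` name = name          -- assumed: do not prefix twice
  | otherwise                = "fig:" ++ name
```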
* | Org reader: Refactor link-target processingAlbert Krewinkel2016-01-31
| | | | | | | | | | | | | | Cleanup of the code for link target handling. Most notably, the canonicalization of a link is handled by a separate function. This fixes #2684.
* | Changed type of Shared.uniqueIdent argument from [String] to Set String.John MacFarlane2016-01-22
|/ | | | | | | This avoids performance problems in documents with many identically named headers. Closes #2671.
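The performance argument can be illustrated with a small sketch: each candidate identifier is checked against the set of used identifiers, which costs O(log n) per lookup in a `Set` instead of O(n) in a list. The function `uniqueFrom` and its `-1`, `-2`, ... suffix scheme are assumptions for illustration, not the actual `Shared.uniqueIdent`.

```haskell
import qualified Data.Set as Set

-- | Pick an identifier not yet in `used`: the base name itself if it
-- is free, otherwise the first free "base-N" for N = 1, 2, ...
-- (Hypothetical helper; the suffix scheme is assumed.)
uniqueFrom :: String -> Set.Set String -> String
uniqueFrom base used
  | base `Set.notMember` used = base
  | otherwise = head
      [ candidate
      | n <- [1 :: Int ..]
      , let candidate = base ++ "-" ++ show n
      , candidate `Set.notMember` used
      ]
```

In a document with many identically named headers, every header triggers a run of membership tests, so the cheaper lookup matters for each of them.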
* Fix function dropping subtrees tagged :noexport:Albert Krewinkel2016-01-07
| | | | | | | | | | | Continue scanning for comment subtrees beyond only the first block. Note to self: when writing a recursive function, don't forget to, you know, actually recurse. Shout to @mrvdb for noticing this. This fixes #2628.
* Modified readers to emit SoftBreak when appropriate.John MacFarlane2015-12-12
|
* Merge pull request #2526 from tarleb/org-definition-lists-fixJohn MacFarlane2015-11-13
|\ | | | | Org reader: Require whitespace around def list markers
| * Org reader: Require whitespace around def list markersAlbert Krewinkel2015-11-13
| | | | | | | | | | | | | | | | | | | | Definition list markers (i.e. double colons `::`) must be surrounded by whitespace to start a definition item. This rule was not checked before, resulting in bugs with footnotes and some link types. Thanks to @conklech for noticing and reporting this issue. This fixes #2518.
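The whitespace rule can be sketched on plain strings as follows. The helper `splitDefinition` is hypothetical and assumes it receives the text of a list item after the bullet; splitting with `words` is a simplification that normalizes the spacing inside term and definition.

```haskell
-- | Split an item into (term, definition) at the first "::" that
-- stands alone as a word, i.e. has whitespace on both sides or ends
-- the line. A "::" embedded in other text (as in a footnote or link)
-- does not count. (Hypothetical helper, simplified via `words`.)
splitDefinition :: String -> Maybe (String, String)
splitDefinition item =
  case break (== "::") (words item) of
    (term@(_ : _), _ : definition) -> Just (unwords term, unwords definition)
    _                              -> Nothing
```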
* | Org reader: Fix emphasis rules for smart parsingAlbert Krewinkel2015-11-13
|/ | | | | | | | | | Smart quotes, ellipses, and dashes should behave like normal quotes, single dashes, and dots with respect to text markup parsing. The parser state was not updated properly in all cases, which has been fixed. Thanks to @conklech for reporting this issue. This fixes #2513.
* Restored Text.Pandoc.Compat.Monoid.John MacFarlane2015-11-09
| | | | | | | | | | | | | | Don't use custom prelude for latest ghc. This is a better approach to making 'stack ghci' and 'cabal repl' work. Instead of using NoImplicitPrelude, we only use the custom prelude for older ghc versions. The custom prelude presents a uniform API that matches the current base version's prelude. So, when developing (presumably with latest ghc), we don't use a custom prelude at all and hence have no trouble with ghci. The custom prelude no longer exports (<>): we now want to match the base 4.8 prelude behavior.
* Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly."John MacFarlane2015-11-09
| | | | This reverts commit c423dbb5a34c2d1195020e0f0ca3aae883d0749b.
* Merge pull request #2505 from tarleb/org-header-markup-fixJohn MacFarlane2015-11-08
|\ | | | | Org reader: fix markup parsing in headers
| * Org reader: fix markup parsing in headersAlbert Krewinkel2015-11-08
| | | | | | | | | | | | | | | | | | | | Markup as the very first item in a header wasn't recognized. This was caused by an incorrect parser state: positions at which inline markup can start need to be marked explicitly by changing the parser state. This wasn't done for headers. The proper function to update the state is now called at the beginning of the header parser, fixing this issue. This fixes #2504.
* | Use -XNoImplicitPrelude and 'import Prelude' explicitly.John MacFarlane2015-11-08
|/ | | | | | | This is needed for ghci to work with pandoc, given that we now use a custom prelude. Closes #2503.
* Merge pull request #2477 from tarleb/org-toggling-header-argsJohn MacFarlane2015-10-25
|\ | | | | Org reader: allow toggling header args
| * Org reader: allow toggling header argsAlbert Krewinkel2015-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Org-mode allows the argument of a code block header argument to be skipped if it's toggling a value. Argument-less headers are now recognized, avoiding weird parsing errors. The fixes are not exactly pretty, but neither is the code that was fixed. So I guess it's about par for the course. However, a rewrite of the header parsing code wouldn't hurt in the long run. Thanks to @jo-tham for filing the bug report. This fixes #2269.
* | Org reader: fix paragraph/list interactionAlbert Krewinkel2015-10-24
|/ | | | | | | | | | Paragraphs can be followed by lists, even if there is no blank line between the two blocks. However, this should only be true if the paragraph is not within a list, where the preceding block should be parsed as a plain instead of a paragraph (to allow for compact lists). Thanks to @rgaiacs for bringing this up. This fixes #2464.
* Use custom Prelude to avoid compiler warnings.John MacFarlane2015-10-14
| | | | | | | | | | | | | - The (non-exported) prelude is in prelude/Prelude.hs. - It exports Monoid and Applicative, like base 4.8 prelude, but works with older base versions. - It exports (<>) for mappend. - It hides 'catch' on older base versions. This allows us to remove many imports of Data.Monoid and Control.Applicative, and remove Text.Pandoc.Compat.Monoid. It should allow us to use -Wall again for ghc 7.10.
* Make sure verse blocks can contain empty linesAlbert Krewinkel2015-09-19
| | | | | | | | | | | | The previous verse parsing code made the faulty assumption that empty strings are valid (and empty) inlines. This isn't the case, so lines are changed to contain at least a newline. It would generally be nicer and faster to keep the newlines while splitting the string. However, this would require more code, which seems unjustified for a simple (and fairly rare) block as *verse*. This fixes #2402.
* Org reader: add auto identifiers if not present on headersJuliusz Gonera2015-08-15
| | | | | | | Refs #2354. This should also fix the table of contents (--toc) when generating an HTML file from Org input.
* Merge pull request #2170 from tarleb/org-generalize-result-blockJohn MacFarlane2015-05-26
|\ | | | | Org generalize result block