author Jonas Smedegaard <dr@jones.dk> 2016-06-06 22:32:09 +0200
committer Jonas Smedegaard <dr@jones.dk> 2016-06-06 22:32:09 +0200
commit f8389d48da9921672da3896b85f7ed444ede714d
tree d01ffee40dfdfa03f9b6a18980e5beb0c5dc262e
parent 29583a109043c7b2128b37f13135a326add16996
Imported Upstream version 1.17.1~dfsg
-rw-r--r-- README 98
-rw-r--r-- changelog 224
-rw-r--r-- data/templates/default.docbook5 30
-rw-r--r-- data/templates/default.latex 3
-rw-r--r-- man/pandoc.1 113
-rw-r--r-- pandoc.cabal 31
-rw-r--r-- pandoc.hs 4
-rw-r--r-- src/Text/Pandoc.hs 2
-rw-r--r-- src/Text/Pandoc/Options.hs 2
-rw-r--r-- src/Text/Pandoc/Readers/Docx.hs 9
-rw-r--r-- src/Text/Pandoc/Readers/Docx/Parse.hs 4
-rw-r--r-- src/Text/Pandoc/Readers/EPUB.hs 18
-rw-r--r-- src/Text/Pandoc/Readers/HTML.hs 27
-rw-r--r-- src/Text/Pandoc/Readers/Markdown.hs 42
-rw-r--r-- src/Text/Pandoc/Readers/MediaWiki.hs 2
-rw-r--r-- src/Text/Pandoc/Readers/Odt.hs 4
-rw-r--r-- src/Text/Pandoc/Readers/Org.hs 1563
-rw-r--r-- src/Text/Pandoc/Readers/Org/BlockStarts.hs 112
-rw-r--r-- src/Text/Pandoc/Readers/Org/Blocks.hs 901
-rw-r--r-- src/Text/Pandoc/Readers/Org/Inlines.hs 762
-rw-r--r-- src/Text/Pandoc/Readers/Org/ParserState.hs 237
-rw-r--r-- src/Text/Pandoc/Readers/Org/Parsing.hs 214
-rw-r--r-- src/Text/Pandoc/Readers/Org/Shared.hs 76
-rw-r--r-- src/Text/Pandoc/Readers/RST.hs 14
-rw-r--r-- src/Text/Pandoc/Writers/Docbook.hs 21
-rw-r--r-- src/Text/Pandoc/Writers/EPUB.hs 3
-rw-r--r-- src/Text/Pandoc/Writers/HTML.hs 5
-rw-r--r-- src/Text/Pandoc/Writers/LaTeX.hs 60
-rw-r--r-- src/Text/Pandoc/Writers/Org.hs 36
-rw-r--r-- stack.yaml 10
-rw-r--r-- tests/Tests/Old.hs 5
-rw-r--r-- tests/Tests/Readers/Docx.hs 12
-rw-r--r-- tests/Tests/Readers/Org.hs 167
-rw-r--r-- tests/Tests/Readers/RST.hs 29
-rw-r--r-- tests/docx/track_changes_move.docx bin 0 -> 26151 bytes
-rw-r--r-- tests/docx/track_changes_move_accept.native 3
-rw-r--r-- tests/docx/track_changes_move_all.native 4
-rw-r--r-- tests/docx/track_changes_move_reject.native 3
-rw-r--r-- tests/mallard-reader.native 3
-rw-r--r-- tests/markdown-reader-more.native 4
-rw-r--r-- tests/mediawiki-reader.native 5
-rw-r--r-- tests/mediawiki-reader.wiki 8
-rw-r--r-- tests/tables.docbook5 432
-rw-r--r-- tests/tables.latex 144
-rw-r--r-- tests/writer.docbook5 1395
-rw-r--r-- tests/writer.org 93
-rw-r--r-- tests/writers-lang-and-dir.latex 8
47 files changed, 5125 insertions, 1817 deletions
diff --git a/README b/README
index 9aaa561bb..c87a8070a 100644
--- a/README
+++ b/README
@@ -1,6 +1,6 @@
% Pandoc User's Guide
% John MacFarlane
-% January 12, 2016
+% June 4, 2016
Synopsis
========
@@ -273,26 +273,26 @@ General options
(LaTeX), `beamer` (LaTeX beamer slide show), `context` (ConTeXt),
`man` (groff man), `mediawiki` (MediaWiki markup), `dokuwiki`
(DokuWiki markup), `textile` (Textile), `org` (Emacs Org mode),
- `texinfo` (GNU Texinfo), `opml` (OPML), `docbook` (DocBook),
- `opendocument` (OpenDocument), `odt` (OpenOffice text document),
- `docx` (Word docx), `haddock` (Haddock markup), `rtf` (rich text
- format), `epub` (EPUB v2 book), `epub3` (EPUB v3), `fb2`
- (FictionBook2 e-book), `asciidoc` (AsciiDoc), `icml` (InDesign
- ICML), `tei` (TEI Simple), `slidy` (Slidy HTML and javascript slide
- show), `slideous` (Slideous HTML and javascript slide show),
- `dzslides` (DZSlides HTML5 + javascript slide show), `revealjs`
- (reveal.js HTML5 + javascript slide show), `s5` (S5 HTML and javascript
- slide show), or the path of a custom lua writer (see [Custom
- writers], below). Note that `odt`, `epub`, and
- `epub3` output will not be directed to *stdout*; an output
- filename must be specified using the `-o/--output` option. If
- `+lhs` is appended to `markdown`, `rst`, `latex`, `beamer`,
- `html`, or `html5`, the output will be rendered as literate
- Haskell source: see [Literate Haskell
- support], below. Markdown syntax
- extensions can be individually enabled or disabled by appending
- `+EXTENSION` or `-EXTENSION` to the format name, as described
- above under `-f`.
+ `texinfo` (GNU Texinfo), `opml` (OPML), `docbook` (DocBook 4),
+ `docbook5` (DocBook 5), `opendocument` (OpenDocument), `odt`
+ (OpenOffice text document), `docx` (Word docx), `haddock`
+ (Haddock markup), `rtf` (rich text format), `epub` (EPUB v2
+ book), `epub3` (EPUB v3), `fb2` (FictionBook2 e-book),
+ `asciidoc` (AsciiDoc), `icml` (InDesign ICML), `tei` (TEI
+ Simple), `slidy` (Slidy HTML and javascript slide show),
+ `slideous` (Slideous HTML and javascript slide show),
+ `dzslides` (DZSlides HTML5 + javascript slide show),
+ `revealjs` (reveal.js HTML5 + javascript slide show), `s5`
+ (S5 HTML and javascript slide show), or the path of a custom
+ lua writer (see [Custom writers], below). Note that `odt`,
+ `epub`, and `epub3` output will not be directed to *stdout*;
+ an output filename must be specified using the `-o/--output`
+ option. If `+lhs` is appended to `markdown`, `rst`, `latex`,
+ `beamer`, `html`, or `html5`, the output will be rendered as
+ literate Haskell source: see [Literate Haskell support],
+ below. Markdown syntax extensions can be individually
+ enabled or disabled by appending `+EXTENSION` or
+ `-EXTENSION` to the format name, as described above under `-f`.
`-o` *FILE*, `--output=`*FILE*
@@ -538,16 +538,16 @@ General writer options
`--columns=`*NUMBER*
-: Specify length of lines in characters (for text wrapping).
- This affects only the generated source code, not the layout on
- the rendered page.
+: Specify length of lines in characters. This affects text wrapping
+ in the generated source code (see `--wrap`). It also affects
+ calculation of column widths for plain text tables (see [Tables] below).
`--toc`, `--table-of-contents`
: Include an automatically generated table of contents (or, in
the case of `latex`, `context`, `docx`, and `rst`, an instruction to create
one) in the output document. This option has no effect on `man`,
- `docbook`, `slidy`, `slideous`, `s5`, or `odt` output.
+ `docbook`, `docbook5`, `slidy`, `slideous`, `s5`, or `odt` output.
`--toc-depth=`*NUMBER*
@@ -909,7 +909,7 @@ Math rendering in HTML
`--mathml`[`=`*URL*]
-: Convert TeX math to [MathML] (in `docbook` as well as `html` and `html5`).
+: Convert TeX math to [MathML] (in `docbook`, `docbook5`, `html` and `html5`).
In standalone `html` output, a small javascript (or a link to such a
script if a *URL* is supplied) will be inserted that allows the MathML to
be viewed on some browsers.
@@ -1591,21 +1591,26 @@ CSS.
#### Extension: `implicit_header_references` ####
Pandoc behaves as if reference links have been defined for each header.
-So, instead of
+So, to link to a header
- [header identifiers](#header-identifiers-in-html)
+ # Header identifiers in HTML
you can simply write
- [header identifiers]
+ [Header identifiers in HTML]
or
- [header identifiers][]
+ [Header identifiers in HTML][]
or
- [the section on header identifiers][header identifiers]
+ [the section on header identifiers][header identifiers in
+ HTML]
+
+instead of giving the identifier explicitly:
+
+ [Header identifiers in HTML](#header-identifiers-in-html)
If there are multiple headers with identical text, the corresponding
reference will link to the first one only, and you will need to use explicit
@@ -3835,6 +3840,7 @@ any kind. (See COPYRIGHT for full copyright and warranty notices.)
Contributors include
Aaron Wolen,
Albert Krewinkel,
+Alex Vong,
Alexander Kondratskiy,
Alexander Sulfrian,
Alexander V Vershilov,
@@ -3847,6 +3853,7 @@ Arlo O'Keeffe,
Artyom Kazak,
Ben Gamari,
Beni Cherniavsky-Paskin,
+Benoit Schweblin,
Bjorn Buckwalter,
Bradley Kuhn,
Brent Yorgey,
@@ -3854,12 +3861,16 @@ Bryan O'Sullivan,
B. Scott Michel,
Caleb McDaniel,
Calvin Beck,
+Carlos Sosa,
+Chris Black,
+Christian Conkle,
Christoffer Ackelman,
Christoffer Sawicki,
Clare Macrae,
Clint Adams,
Conal Elliott,
Craig S. Bosma,
+csforste,
Daniel Bergey,
Daniel T. Staal,
David Lazar,
@@ -3867,27 +3878,35 @@ David Röthlisberger,
Denis Laxalde,
Douglas Calvert,
Douglas F. Calvert,
+Emanuel Evans,
+Emily Eisenberg,
Eric Kow,
Eric Seidel,
Florian Eitel,
François Gannaz,
Freiric Barral,
+Freirich Raabe,
Fyodor Sheremetyev,
Gabor Pali,
Gavin Beatty,
+Gottfried Haider,
Greg Maslov,
Grégory Bataille,
Greg Rundlett,
gwern,
Gwern Branwen,
Hans-Peter Deifel,
+Henrik Tramberend,
Henry de Valence,
+ickc,
Ilya V. Portnov,
infinity0x,
+Ivo Clarysse,
Jaime Marquínez Ferrándiz,
James Aspnes,
Jamie F. Olson,
Jan Larres,
+Jan Schulz,
Jason Ronallo,
Jeff Arnold,
Jeff Runningen,
@@ -3902,22 +3921,30 @@ Jonathan Daugherty,
Josef Svenningsson,
Jose Luis Duran,
Julien Cretel,
+Juliusz Gonera,
Justin Bogner,
Kelsey Hightower,
+Kolen Cheung,
Konstantin Zudov,
+Kristof Bastiaensen,
Lars-Dominik Braun,
Luke Plant,
Mark Szepieniec,
Mark Wright,
+Martin Linn,
Masayoshi Takahashi,
Matej Kollar,
Mathias Schenner,
+Mathieu Duponchelle,
+Matthew Eddey,
Matthew Pickering,
Matthias C. M. Troffaes,
Mauro Bieg,
Max Bolingbroke,
Max Rydahl Andersen,
Merijn Verstraaten,
+Michael Beaumont,
+Michael Chladek,
Michael Snoyman,
Michael Thompson,
MinRK,
@@ -3927,22 +3954,29 @@ Nick Bart,
Nicolas Kaiser,
Nikolay Yakimov,
nkalvi,
+Ophir Lifshitz,
+Pablo Rodríguez,
Paulo Tanimoto,
Paul Rivier,
Peter Wang,
Philippe Ombredanne,
Phillip Alday,
+Prayag Verma,
Puneeth Chaganti,
qerub,
Ralf Stephan,
+Raniere Silva,
Recai Oktaş,
+robabla,
rodja.trappe,
+rski,
RyanGlScott,
Scott Morrison,
Sergei Trofimovich,
Sergey Astanin,
Shahbaz Youssefi,
Shaun Attfield,
+Sidarth Kapur,
shreevatsa.public,
Simon Hengel,
Sumit Sahrawat,
@@ -3950,6 +3984,8 @@ takahashim,
thsutton,
Tim Lin,
Timothy Humphries,
+Tiziano Müller,
+Thomas Hodgson,
Todd Sifleet,
Tom Leese,
Uli Köhler,
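Reviewer note on the `implicit_header_references` hunk above: the rewritten README example relies on the header text mapping to the identifier `header-identifiers-in-html`, and on the reference label matching the header regardless of case or line breaks. The following is a minimal Python sketch of that mapping, not pandoc's Haskell implementation. The function names are hypothetical, the identifier rules follow the ones described in the Pandoc User's Guide, and the case-insensitive label matching is an assumption inferred from the example in the hunk.

```python
def header_identifier(text: str) -> str:
    """Derive an automatic identifier from plain header text (sketch)."""
    # Lowercase, then keep only letters, digits, underscores, hyphens,
    # periods, and whitespace; all other punctuation is dropped.
    kept = "".join(
        ch for ch in text.lower()
        if ch.isalnum() or ch in "_-." or ch.isspace()
    )
    # Join the whitespace-separated words with hyphens.
    ident = "-".join(kept.split())
    # Drop everything before the first letter.
    start = 0
    while start < len(ident) and not ident[start].isalpha():
        start += 1
    # Fall back to "section" when nothing is left.
    return ident[start:] or "section"


def resolve_implicit_reference(label, headers):
    """Return '#identifier' for the first header matching label, else None."""
    def normalize(s):
        # Assumption: labels match case-insensitively, with runs of
        # whitespace (including newlines) treated as a single space.
        return " ".join(s.lower().split())

    wanted = normalize(label)
    for header in headers:
        if normalize(header) == wanted:
            # Only the first matching header is linked, as the README notes.
            return "#" + header_identifier(header)
    return None
```

With this sketch, the label `header identifiers in HTML` (even when wrapped across two lines, as in the hunk) resolves to the same target as the explicit link `(#header-identifiers-in-html)`.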
diff --git a/changelog b/changelog
index e93d85b18..c9250e3e9 100644
--- a/changelog
+++ b/changelog
@@ -1,3 +1,227 @@
+pandoc (1.17.1)
+
+ * New output format: `docbook5` (Ivo Clarysse).
+
+ * `Text.Pandoc.Options`: Add `writerDocBook5` to `WriterOptions`
+ (API change).
+
+ * Org writer:
+
+ + Add :PROPERTIES: drawer support (Albert Krewinkel, #1962).
+ This allows header attributes to be added to org documents in the form
+ of `:PROPERTIES:` drawers. All available attributes are stored as
+ key/value pairs. This reflects the way the org reader handles
+ `:PROPERTIES:` blocks.
+ + Add drawer capability (Carlos Sosa). For the implementation of the
+ Drawer element in the Org Writer, we make use of a generic Block
+ container with attributes. The presence of a `drawer` class defines
+ that the `Div` constructor is a drawer. The first class defines the
+ drawer name to use. The key-value list in the attributes defines
+ the keys to add inside the Drawer. Lastly, the list of Block elements
+ contains miscellaneous blocks elements to add inside of the Drawer.
+ + Use `CUSTOM_ID` in properties (Albert Krewinkel). The `ID` property is
+ reserved for internal use by Org-mode and should not be used.
+ The `CUSTOM_ID` property is to be used instead; it is converted to the
+ `ID` property for certain export formats.
+
+ * LaTeX writer:
+
+ + Ignore `--incremental` unless output format is beamer (#2843).
+ + Fix polyglossia to babel env mapping (Mauro Bieg, #2728).
+ Allow for optional argument in square brackets.
+ + Recognize `la-x-classic` as Classical Latin (Andrew Dunning).
+ This allows one to access the hyphenation patterns in CTAN's
+ hyph-utf8.
+ + Add missing languages from hyph-utf8 (Andrew Dunning).
+ + Improve use of `\strut` with `\minipage` inside tables
+ (Jose Luis Duran). This improves spacing in multiline
+ tables.
+ + Use `{}` around options containing special chars (#2892).
+ + Avoid lazy `foldl`.
+ + Don't escape underscore in labels (#2921). Previously they were
+ escaped as `ux5f`.
+ + brazilian -> brazil for polyglossia (#2953).
+
+ * HTML writer: Ensure mathjax link is added when math appears in footnote
+ (#2881). Previously if a document only had math in a footnote, the
+ MathJax link would not be added.
+
+ * EPUB writer: set `navpage` variable on nav page.
+ This allows templates to treat it differently.
+
+ * DocBook writer:
+
+ + Use docbook5 if `writerDocbook5` is set (Ivo Clarysse).
+ + Properly handle `ulink`/`link` (Ivo Clarysse).
+
+ * EPUB reader:
+
+ + Unescape URIs in spine (#2924).
+ + EPUB reader: normalise link id (Mauro Bieg).
+
+ * Docx Reader:
+
+ + Parse `moveTo` and `moveFrom` (Jesse Rosenthal).
+ `moveTo` and `moveFrom` are track-changes tags that are used when a
+ block of text is moved in the document. We now recognize these tags and
+ treat them the same as `insert` and `delete`, respectively. So,
+ `--track-changes=accept` will show the moved version, while
+ `--track-changes=reject` will show the original version.
+ + Tests for track-changes moving (Jesse Rosenthal).
+
+ * ODT, EPUB, Docx readers: throw `PandocError` on unzip failure
+ (Jesse Rosenthal). Previously, `readDocx`, `readEPUB`, and `readOdt`
+ would error out if zip-archive failed. We change the archive extraction
+ step from `toArchive` to `toArchiveOrFail`, which returns an Either value.
+
+ * Markdown, HTML readers: be more forgiving about unescaped `&` in
+ HTML (#2410). We are now more forgiving about parsing invalid HTML with
+ unescaped `&` as raw HTML. (Previously any unescaped `&`
+ would cause pandoc not to recognize the string as raw HTML.)
+
+ * Markdown reader:
+
+ + Fix pandoc title blocks with lines ending in 2 spaces (#2799).
+ + Added `-s` to markdown-reader-more test.
+
+ * HTML reader: fixed bug in `pClose`. This caused exponential parsing
+ behavior in documents with unclosed tags in `dl`, `dd`, `dt`.
+
+ * MediaWiki reader: Allow spaces before `!` in MediaWiki table header
+ (roblabla).
+
+ * RST reader: Support `:class:` option for code block in RST reader
+ (Sidharth Kapur).
+
+ * Org reader (all Albert Krewinkel, except where noted otherwise):
+
+ + Stop padding short table rows.
+ Emacs Org-mode doesn't add any padding to table rows. The first
+ row (header or first body row) is used to determine the column count,
+ no other magic is performed.
+ + Refactor rows-to-table conversion. This refactors
+ the code converting a list of table lines to an org table ADT.
+ The old code was simplified and is now slightly less ugly.
+ + Fix handling of empty table cells, rows (Albert Krewinkel, #2616).
+ This fixes Org mode parsing of some corner cases regarding empty cells
+ and rows. Empty cells weren't parsed correctly, e.g. `|||` should be
+ two empty cells, but would be parsed as a single cell containing a pipe
+ character. Empty rows were parsed as alignment rows and dropped from
+ the output.
+ + Fix spacing after LaTeX-style symbols.
+ The org-reader was dropping space after unescaped LaTeX-style symbol
+ commands: `\ForAll \Auml` resulted in `∀Ä` but should give `∀ Ä`
+ instead. This seems to be because the LaTeX-reader treats the
+ command-terminating space as part of the command. Dropping the trailing
+ space from the symbol-command fixes this issue.
+ + Print empty table rows. Empty table rows should not
+ be dropped from the output, so row-height is always set to be at least 1.
+ + Move parser state into separate module.
+ The org reader code has become large and confusing. Extracting smaller
+ parts into submodules should help to clean things up.
+ + Add support for sub/superscript export options.
+ Org-mode allows export settings to be specified via `#+OPTIONS` lines.
+ Disabling simple sub- and superscripts is one of these export options;
+ this option is now supported.
+ + Support special strings export option. Parsing of special strings
+ (like `...` as ellipsis or `--` as en dash) can be toggled using the `-`
+ option.
+ + Support emphasized text export option. Parsing of emphasized text can
+ be toggled using the `*` option. This influences parsing of text marked
+ as emphasized, strong, strikeout, and underline. Parsing of inline math,
+ code, and verbatim text is not affected by this option.
+ + Support smart quotes export option. Reading of smart quotes can be
+ toggled using the `'` option.
+ + Parse but ignore export options. All known export options are parsed
+ but ignored.
+ + Refactor block attribute handling. A parser state attribute was used
+ to keep track of block attributes defined in meta-lines. Global state
+ is undesirable, so block attributes are no longer saved as part of the
+ parser state. Old functions and the respective part of the parser state
+ are removed.
+ + Use custom `anyLine`. Additional state changes need to be made after
+ a newline is parsed, otherwise markup may not be recognized correctly.
+ This fixes a bug where markup after certain block-types would not be
+ recognized.
+ + Add support for `ATTR_HTML` attributes (#1906).
+ Arbitrary key-value pairs can be added to some block types using a
+ `#+ATTR_HTML` line before the block. Emacs Org-mode only includes these
+ when exporting to HTML, but since we cannot make this distinction here,
+ the attributes are always added. The functionality is now supported
+ for figures.
+ + Add `:PROPERTIES:` drawer support (#1877).
+ Headers can have optional `:PROPERTIES:` drawers associated with them.
+ These drawers contain key/value pairs like the header's `id`. The
+ reader adds all listed pairs to the header's attributes; `id` and
+ `class` attributes are handled specially to match the way `Attr` are
+ defined. This also changes behavior of how drawers of unknown type
+ are handled. Instead of including all unknown drawers, those are not
+ read/exported, thereby matching current Emacs behavior.
+ + Use `CUSTOM_ID` in properties. See above on Org writer changes.
+ + Respect drawer export setting. The `d` export option can be used
+ to control which drawers are exported and which are discarded.
+ Basic support for this option is added here.
+ + Ignore leading space in org code blocks (Emanuel Evans, #2862).
+ Also fix up tab handling for leading whitespace in code blocks.
+ + Support new syntax for export blocks. Org-mode version 9
+ uses a new syntax for export blocks. Instead of `#+BEGIN_<FORMAT>`,
+ where `<FORMAT>` is the format of the block's content, the new
+ format uses `#+BEGIN_export <FORMAT>` instead. Both types are
+ supported.
+ + Refactor `BEGIN...END` block parsing.
+ + Fix handling of whitespace in blocks, allowing content to be indented
+ less than the block header.
+ + Support org-ref style citations. The *org-ref* package is an
+ org-mode extension commonly used to manage citations in org
+ documents. Basic support for the `cite:citeKey` and
+ `[[cite:citeKey][prefix text::suffix text]]` syntax is added.
+ + Split code into separate modules, making for cleaner code and
+ better decoupling.
+
+ * Added `docbook5` template.
+
+ * `--mathjax` improvements:
+
+ + Use new CommonHTML output for MathJax (updated default MathJax URL,
+ #2858).
+ + Change default mathjax setup to use `TeX-AMS_CHTML` configuration.
+ This is designed for cases where the input is always TeX and maximal
+ conformity with TeX is desired. It seems to be smaller and load faster
+ than what we used before. See #2858.
+ + Load the full MathJax config to maximize loading speed (KolenCheung).
+
+ * Bumped upper version bounds to allow use of latest packages
+ and compilation with ghc 8.
+
+ * Require texmath 0.8.6.2. Closes several texmath-related bugs (#2775,
+ #2310, #2310, #2824). This fixes behavior of roots, e.g.
+ `\sqrt[3]{x}`, and issues with sub/superscript positioning
+ and matrix column alignment in docx.
+
+ * README:
+
+ + Clarified documentation of `implicit_header_references` (#2904).
+ + Improved documentation of `--columns` option.
+
+ * Added appveyor setup, with artefacts (Jan Schulz).
+
+ * stack.yaml versions: Use proper flags used for texmath, pandoc-citeproc.
+
+ * LaTeX template: support for custom font families (vladipus).
+ Needed for correct polyglossia operation with Cyrillic fonts and perhaps
+ can find some other usages. Example usage in YAML metadata:
+
+ fontfamilies:
+ - name: \cyrillicfont
+ font: Liberation Serif
+ - name: \cyrillicfonttt
+ options: Scale=MatchLowercase
+ font: Liberation
+
+ * Create unsigned msi as build artifact in appveyor build.
+
+ * On travis, test with ghc 8.0.1; drop testing for ghc 7.4.1.
+
pandoc (1.17.0.3)
* LaTeX writer: Fixed position of label in figures (#2813).
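Reviewer note on the Docx reader entry in the changelog hunk above: `moveTo` and `moveFrom` are now treated the same as `insert` and `delete`, so `--track-changes=accept` shows the moved text and `--track-changes=reject` shows the original. The Python sketch below illustrates that rule only. The function name and the `(kind, text)` run representation are hypothetical, and the `all` mode here simply keeps every run, which is a simplification of what the reader does.

```python
def resolve_track_changes(runs, mode):
    """Filter (kind, text) runs according to a track-changes mode (sketch).

    kind is one of "plain", "insert", "delete", "moveTo", "moveFrom";
    mode is one of "accept", "reject", "all".
    """
    # moveTo behaves like an insertion, moveFrom like a deletion.
    alias = {"moveTo": "insert", "moveFrom": "delete"}
    kept = []
    for kind, text in runs:
        kind = alias.get(kind, kind)
        if kind == "plain" or mode == "all":
            kept.append(text)
        elif kind == "insert" and mode == "accept":
            kept.append(text)
        elif kind == "delete" and mode == "reject":
            kept.append(text)
    return kept
```

For a paragraph whose text was moved, `accept` keeps the `moveTo` run at its new position and `reject` keeps the `moveFrom` run at its old one, which is what the new `track_changes_move_*.native` test fixtures exercise.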
diff --git a/data/templates/default.docbook5 b/data/templates/default.docbook5
new file mode 100644
index 000000000..b3a0b6def
--- /dev/null
+++ b/data/templates/default.docbook5
@@ -0,0 +1,30 @@
+<?xml version="1.0" encoding="utf-8" ?>
+<!DOCTYPE article>
+$if(mathml)$
+<article xmlns="http://docbook.org/ns/docbook" version="5.0">
+$else$
+<article xmlns="http://docbook.org/ns/docbook" version="5.0">
+$endif$
+ <info>
+ <title>$title$</title>
+$if(author)$
+ <authorgroup>
+$for(author)$
+ <author>
+ $author$
+ </author>
+$endfor$
+ </authorgroup>
+$endif$
+$if(date)$
+ <date>$date$</date>
+$endif$
+ </info>
+$for(include-before)$
+$include-before$
+$endfor$
+$body$
+$for(include-after)$
+$include-after$
+$endfor$
+</article>
diff --git a/data/templates/default.latex b/data/templates/default.latex
index 0a1c47391..bc84520a3 100644
--- a/data/templates/default.latex
+++ b/data/templates/default.latex
@@ -24,6 +24,9 @@ $endif$
\usepackage{fontspec}
\fi
\defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
+$for(fontfamilies)$
+ \newfontfamily{$fontfamilies.name$}[$fontfamilies.options$]{$fontfamilies.font$}
+$endfor$
$if(euro)$
\newcommand{\euro}{€}
$endif$
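Reviewer note on the `fontfamilies` loop added in the hunk above: each metadata entry is expanded into one `\newfontfamily` line built from its `name`, `options`, and `font` fields. The Python sketch below mimics that expansion for the YAML example given in the changelog. The function name is hypothetical, and rendering a missing `options` field as an empty string is an assumption about how unset template variables behave.

```python
def render_fontfamilies(fontfamilies):
    """Expand fontfamilies entries the way the template loop would (sketch)."""
    lines = []
    for family in fontfamilies:
        # Assumption: a field that is not set renders as an empty string.
        name = family.get("name", "")
        options = family.get("options", "")
        font = family.get("font", "")
        lines.append("\\newfontfamily{%s}[%s]{%s}" % (name, options, font))
    return "\n".join(lines)


# The two entries from the changelog's YAML example.
example = [
    {"name": "\\cyrillicfont", "font": "Liberation Serif"},
    {"name": "\\cyrillicfonttt",
     "options": "Scale=MatchLowercase",
     "font": "Liberation"},
]
print(render_fontfamilies(example))
```

Under these assumptions, an entry without `options` yields an empty optional argument (`[]`), which fontspec accepts.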
diff --git a/man/pandoc.1 b/man/pandoc.1
index 372673d84..db4a7793c 100644
--- a/man/pandoc.1
+++ b/man/pandoc.1
@@ -1,5 +1,5 @@
.\"t
-.TH PANDOC 1 "January 12, 2016" "pandoc 1.17.0.3"
+.TH PANDOC 1 "June 4, 2016" "pandoc 1.17.1"
.SH NAME
pandoc - general markup converter
.SH SYNOPSIS
@@ -248,17 +248,17 @@ Specify output format.
(MediaWiki markup), \f[C]dokuwiki\f[] (DokuWiki markup),
\f[C]textile\f[] (Textile), \f[C]org\f[] (Emacs Org mode),
\f[C]texinfo\f[] (GNU Texinfo), \f[C]opml\f[] (OPML), \f[C]docbook\f[]
-(DocBook), \f[C]opendocument\f[] (OpenDocument), \f[C]odt\f[]
-(OpenOffice text document), \f[C]docx\f[] (Word docx), \f[C]haddock\f[]
-(Haddock markup), \f[C]rtf\f[] (rich text format), \f[C]epub\f[] (EPUB
-v2 book), \f[C]epub3\f[] (EPUB v3), \f[C]fb2\f[] (FictionBook2 e\-book),
-\f[C]asciidoc\f[] (AsciiDoc), \f[C]icml\f[] (InDesign ICML),
-\f[C]tei\f[] (TEI Simple), \f[C]slidy\f[] (Slidy HTML and javascript
-slide show), \f[C]slideous\f[] (Slideous HTML and javascript slide
-show), \f[C]dzslides\f[] (DZSlides HTML5 + javascript slide show),
-\f[C]revealjs\f[] (reveal.js HTML5 + javascript slide show), \f[C]s5\f[]
-(S5 HTML and javascript slide show), or the path of a custom lua writer
-(see Custom writers, below).
+(DocBook 4), \f[C]docbook5\f[] (DocBook 5), \f[C]opendocument\f[]
+(OpenDocument), \f[C]odt\f[] (OpenOffice text document), \f[C]docx\f[]
+(Word docx), \f[C]haddock\f[] (Haddock markup), \f[C]rtf\f[] (rich text
+format), \f[C]epub\f[] (EPUB v2 book), \f[C]epub3\f[] (EPUB v3),
+\f[C]fb2\f[] (FictionBook2 e\-book), \f[C]asciidoc\f[] (AsciiDoc),
+\f[C]icml\f[] (InDesign ICML), \f[C]tei\f[] (TEI Simple), \f[C]slidy\f[]
+(Slidy HTML and javascript slide show), \f[C]slideous\f[] (Slideous HTML
+and javascript slide show), \f[C]dzslides\f[] (DZSlides HTML5 +
+javascript slide show), \f[C]revealjs\f[] (reveal.js HTML5 + javascript
+slide show), \f[C]s5\f[] (S5 HTML and javascript slide show), or the
+path of a custom lua writer (see Custom writers, below).
Note that \f[C]odt\f[], \f[C]epub\f[], and \f[C]epub3\f[] output will
not be directed to \f[I]stdout\f[]; an output filename must be specified
using the \f[C]\-o/\-\-output\f[] option.
@@ -584,9 +584,11 @@ Deprecated synonym for \f[C]\-\-wrap=none\f[].
.RE
.TP
.B \f[C]\-\-columns=\f[]\f[I]NUMBER\f[]
-Specify length of lines in characters (for text wrapping).
-This affects only the generated source code, not the layout on the
-rendered page.
+Specify length of lines in characters.
+This affects text wrapping in the generated source code (see
+\f[C]\-\-wrap\f[]).
+It also affects calculation of column widths for plain text tables (see
+Tables below).
.RS
.RE
.TP
@@ -595,7 +597,8 @@ Include an automatically generated table of contents (or, in the case of
\f[C]latex\f[], \f[C]context\f[], \f[C]docx\f[], and \f[C]rst\f[], an
instruction to create one) in the output document.
This option has no effect on \f[C]man\f[], \f[C]docbook\f[],
-\f[C]slidy\f[], \f[C]slideous\f[], \f[C]s5\f[], or \f[C]odt\f[] output.
+\f[C]docbook5\f[], \f[C]slidy\f[], \f[C]slideous\f[], \f[C]s5\f[], or
+\f[C]odt\f[] output.
.RS
.RE
.TP
@@ -1038,8 +1041,8 @@ copy of the script, so it can be cached.
.RE
.TP
.B \f[C]\-\-mathml\f[][\f[C]=\f[]\f[I]URL\f[]]
-Convert TeX math to MathML (in \f[C]docbook\f[] as well as \f[C]html\f[]
-and \f[C]html5\f[]).
+Convert TeX math to MathML (in \f[C]docbook\f[], \f[C]docbook5\f[],
+\f[C]html\f[] and \f[C]html5\f[]).
In standalone \f[C]html\f[] output, a small javascript (or a link to
such a script if a \f[I]URL\f[] is supplied) will be inserted that
allows the MathML to be viewed on some browsers.
@@ -1950,11 +1953,11 @@ treated differently in CSS.
.SS Extension: \f[C]implicit_header_references\f[]
.PP
Pandoc behaves as if reference links have been defined for each header.
-So, instead of
+So, to link to a header
.IP
.nf
\f[C]
-[header\ identifiers](#header\-identifiers\-in\-html)
+#\ Header\ identifiers\ in\ HTML
\f[]
.fi
.PP
@@ -1962,7 +1965,7 @@ you can simply write
.IP
.nf
\f[C]
-[header\ identifiers]
+[Header\ identifiers\ in\ HTML]
\f[]
.fi
.PP
@@ -1970,7 +1973,7 @@ or
.IP
.nf
\f[C]
-[header\ identifiers][]
+[Header\ identifiers\ in\ HTML][]
\f[]
.fi
.PP
@@ -1978,7 +1981,16 @@ or
.IP
.nf
\f[C]
-[the\ section\ on\ header\ identifiers][header\ identifiers]
+[the\ section\ on\ header\ identifiers][header\ identifiers\ in
+HTML]
+\f[]
+.fi
+.PP
+instead of giving the identifier explicitly:
+.IP
+.nf
+\f[C]
+[Header\ identifiers\ in\ HTML](#header\-identifiers\-in\-html)
\f[]
.fi
.PP
@@ -4780,40 +4792,47 @@ Released under the GPL, version 2 or greater.
This software carries no warranty of any kind.
(See COPYRIGHT for full copyright and warranty notices.)
.PP
-Contributors include Aaron Wolen, Albert Krewinkel, Alexander
+Contributors include Aaron Wolen, Albert Krewinkel, Alex Vong, Alexander
Kondratskiy, Alexander Sulfrian, Alexander V Vershilov, Alfred
Wechselberger, Andreas Lööw, Andrew Dunning, Antoine Latter, Arata
Mizuki, Arlo O\[aq]Keeffe, Artyom Kazak, Ben Gamari, Beni
-Cherniavsky\-Paskin, Bjorn Buckwalter, Bradley Kuhn, Brent Yorgey, Bryan
-O\[aq]Sullivan, B.
-Scott Michel, Caleb McDaniel, Calvin Beck, Christoffer Ackelman,
-Christoffer Sawicki, Clare Macrae, Clint Adams, Conal Elliott, Craig S.
-Bosma, Daniel Bergey, Daniel T.
+Cherniavsky\-Paskin, Benoit Schweblin, Bjorn Buckwalter, Bradley Kuhn,
+Brent Yorgey, Bryan O\[aq]Sullivan, B.
+Scott Michel, Caleb McDaniel, Calvin Beck, Carlos Sosa, Chris Black,
+Christian Conkle, Christoffer Ackelman, Christoffer Sawicki, Clare
+Macrae, Clint Adams, Conal Elliott, Craig S.
+Bosma, csforste, Daniel Bergey, Daniel T.
Staal, David Lazar, David Röthlisberger, Denis Laxalde, Douglas Calvert,
Douglas F.
-Calvert, Eric Kow, Eric Seidel, Florian Eitel, François Gannaz, Freiric
-Barral, Fyodor Sheremetyev, Gabor Pali, Gavin Beatty, Greg Maslov,
+Calvert, Emanuel Evans, Emily Eisenberg, Eric Kow, Eric Seidel, Florian
+Eitel, François Gannaz, Freiric Barral, Freirich Raabe, Fyodor
+Sheremetyev, Gabor Pali, Gavin Beatty, Gottfried Haider, Greg Maslov,
Grégory Bataille, Greg Rundlett, gwern, Gwern Branwen, Hans\-Peter
-Deifel, Henry de Valence, Ilya V.
-Portnov, infinity0x, Jaime Marquínez Ferrándiz, James Aspnes, Jamie F.
-Olson, Jan Larres, Jason Ronallo, Jeff Arnold, Jeff Runningen, Jens
-Petersen, Jérémy Bobbio, Jesse Rosenthal, J.
+Deifel, Henrik Tramberend, Henry de Valence, ickc, Ilya V.
+Portnov, infinity0x, Ivo Clarysse, Jaime Marquínez Ferrándiz, James
+Aspnes, Jamie F.
+Olson, Jan Larres, Jan Schulz, Jason Ronallo, Jeff Arnold, Jeff
+Runningen, Jens Petersen, Jérémy Bobbio, Jesse Rosenthal, J.
Lewis Muir, Joe Hillenbrand, John MacFarlane, Jonas Smedegaard, Jonathan
-Daugherty, Josef Svenningsson, Jose Luis Duran, Julien Cretel, Justin
-Bogner, Kelsey Hightower, Konstantin Zudov, Lars\-Dominik Braun, Luke
-Plant, Mark Szepieniec, Mark Wright, Masayoshi Takahashi, Matej Kollar,
-Mathias Schenner, Matthew Pickering, Matthias C.
+Daugherty, Josef Svenningsson, Jose Luis Duran, Julien Cretel, Juliusz
+Gonera, Justin Bogner, Kelsey Hightower, Kolen Cheung, Konstantin Zudov,
+Kristof Bastiaensen, Lars\-Dominik Braun, Luke Plant, Mark Szepieniec,
+Mark Wright, Martin Linn, Masayoshi Takahashi, Matej Kollar, Mathias
+Schenner, Mathieu Duponchelle, Matthew Eddey, Matthew Pickering,
+Matthias C.
M.
Troffaes, Mauro Bieg, Max Bolingbroke, Max Rydahl Andersen, Merijn
-Verstraaten, Michael Snoyman, Michael Thompson, MinRK, Nathan Gass, Neil
-Mayhew, Nick Bart, Nicolas Kaiser, Nikolay Yakimov, nkalvi, Paulo
+Verstraaten, Michael Beaumont, Michael Chladek, Michael Snoyman, Michael
+Thompson, MinRK, Nathan Gass, Neil Mayhew, Nick Bart, Nicolas Kaiser,
+Nikolay Yakimov, nkalvi, Ophir Lifshitz, Pablo Rodríguez, Paulo
Tanimoto, Paul Rivier, Peter Wang, Philippe Ombredanne, Phillip Alday,
-Puneeth Chaganti, qerub, Ralf Stephan, Recai Oktaş, rodja.trappe,
-RyanGlScott, Scott Morrison, Sergei Trofimovich, Sergey Astanin, Shahbaz
-Youssefi, Shaun Attfield, shreevatsa.public, Simon Hengel, Sumit
-Sahrawat, takahashim, thsutton, Tim Lin, Timothy Humphries, Todd
-Sifleet, Tom Leese, Uli Köhler, Václav Zeman, Viktor Kronvall, Vincent,
-Wikiwide, and Xavier Olive.
+Prayag Verma, Puneeth Chaganti, qerub, Ralf Stephan, Raniere Silva,
+Recai Oktaş, robabla, rodja.trappe, rski, RyanGlScott, Scott Morrison,
+Sergei Trofimovich, Sergey Astanin, Shahbaz Youssefi, Shaun Attfield,
+Sidarth Kapur, shreevatsa.public, Simon Hengel, Sumit Sahrawat,
+takahashim, thsutton, Tim Lin, Timothy Humphries, Tiziano Müller, Thomas
+Hodgson, Todd Sifleet, Tom Leese, Uli Köhler, Václav Zeman, Viktor
+Kronvall, Vincent, Wikiwide, and Xavier Olive.
.PP
The Pandoc source code and all documentation may be downloaded
from <http://pandoc.org>.
diff --git a/pandoc.cabal b/pandoc.cabal
index 578cdf2ee..820e417a5 100644
--- a/pandoc.cabal
+++ b/pandoc.cabal
@@ -1,5 +1,5 @@
Name: pandoc
-Version: 1.17.0.3
+Version: 1.17.1
Cabal-Version: >= 1.10
Build-Type: Custom
License: GPL
@@ -11,7 +11,7 @@ Bug-Reports: https://github.com/jgm/pandoc/issues
Stability: alpha
Homepage: http://pandoc.org
Category: Text
-Tested-With: GHC == 7.4.2, GHC == 7.6.3, GHC == 7.8.4, GHC == 7.10.2
+Tested-With: GHC == 7.6.3, GHC == 7.8.4, GHC == 7.10.2, GHC == 8.0.1
Synopsis: Conversion between markup formats
Description: Pandoc is a Haskell library for converting from one markup
format to another, and a command-line tool that uses
@@ -39,6 +39,7 @@ Data-Files:
data/templates/default.html
data/templates/default.html5
data/templates/default.docbook
+ data/templates/default.docbook5
data/templates/default.tei
data/templates/default.beamer
data/templates/default.opendocument
@@ -145,6 +146,7 @@ Extra-Source-Files:
tests/s5-inserts.html
tests/tables.context
tests/tables.docbook
+ tests/tables.docbook5
tests/tables.dokuwiki
tests/tables.icml
tests/tables.html
@@ -168,6 +170,7 @@ Extra-Source-Files:
tests/writer.latex
tests/writer.context
tests/writer.docbook
+ tests/writer.docbook5
tests/writer.html
tests/writer.man
tests/writer.markdown
@@ -257,7 +260,7 @@ Library
text >= 0.11 && < 1.3,
zip-archive >= 0.2.3.4 && < 0.4,
HTTP >= 4000.0.5 && < 4000.4,
- texmath >= 0.8.4.1 && < 0.9,
+ texmath >= 0.8.6.2 && < 0.9,
xml >= 1.3.12 && < 1.4,
random >= 1 && < 1.2,
extensible-exceptions >= 0.1 && < 0.2,
@@ -266,8 +269,8 @@ Library
tagsoup >= 0.13.7 && < 0.14,
base64-bytestring >= 0.1 && < 1.1,
zlib >= 0.5 && < 0.7,
- highlighting-kate >= 0.6.1 && < 0.7,
- data-default >= 0.4 && < 0.6,
+ highlighting-kate >= 0.6.2 && < 0.7,
+ data-default >= 0.4 && < 0.8,
temporary >= 1.1 && < 1.3,
blaze-html >= 0.5 && < 0.9,
blaze-markup >= 0.5.1 && < 0.8,
@@ -277,7 +280,7 @@ Library
hslua >= 0.3 && < 0.5,
binary >= 0.5 && < 0.9,
SHA >= 1.6 && < 1.7,
- haddock-library >= 1.1 && < 1.3,
+ haddock-library >= 1.1 && < 1.5,
old-time,
deepseq >= 1.3 && < 1.5,
JuicyPixels >= 3.1.6.1 && < 3.3,
@@ -288,7 +291,7 @@ Library
Build-Depends: old-locale >= 1 && < 1.1,
time >= 1.2 && < 1.5
else
- Build-Depends: time >= 1.5 && < 1.6
+ Build-Depends: time >= 1.5 && < 1.7
if flag(network-uri)
Build-Depends: network-uri >= 2.6 && < 2.7, network >= 2.6
else
@@ -304,8 +307,8 @@ Library
other-modules: Text.Pandoc.Data
if os(windows)
Cpp-options: -D_WINDOWS
- Ghc-Options: -rtsopts -Wall -fno-warn-unused-do-bind
- Ghc-Prof-Options: -fprof-auto-exported -rtsopts
+ Ghc-Options: -Wall -fno-warn-unused-do-bind
+ Ghc-Prof-Options: -fprof-auto-exported
Default-Language: Haskell98
Other-Extensions: PatternGuards, OverloadedStrings,
ScopedTypeVariables, GeneralizedNewtypeDeriving,
@@ -390,6 +393,12 @@ Library
Text.Pandoc.Readers.Odt.Generic.XMLConverter,
Text.Pandoc.Readers.Odt.Arrows.State,
Text.Pandoc.Readers.Odt.Arrows.Utils,
+ Text.Pandoc.Readers.Org.BlockStarts,
+ Text.Pandoc.Readers.Org.Blocks,
+ Text.Pandoc.Readers.Org.Inlines,
+ Text.Pandoc.Readers.Org.ParserState,
+ Text.Pandoc.Readers.Org.Parsing,
+ Text.Pandoc.Readers.Org.Shared,
Text.Pandoc.Writers.Shared,
Text.Pandoc.Asciify,
Text.Pandoc.MIME,
@@ -417,7 +426,7 @@ Executable pandoc
text >= 0.11 && < 1.3,
bytestring >= 0.9 && < 0.11,
extensible-exceptions >= 0.1 && < 0.2,
- highlighting-kate >= 0.6.1 && < 0.7,
+ highlighting-kate >= 0.6.2 && < 0.7,
aeson >= 0.7.0.5 && < 0.12,
yaml >= 0.8.8.2 && < 0.9,
containers >= 0.1 && < 0.6,
@@ -473,7 +482,7 @@ Test-Suite test-pandoc
directory >= 1 && < 1.3,
filepath >= 1.1 && < 1.5,
process >= 1 && < 1.5,
- highlighting-kate >= 0.6.1 && < 0.7,
+ highlighting-kate >= 0.6.2 && < 0.7,
Diff >= 0.2 && < 0.4,
test-framework >= 0.3 && < 0.9,
test-framework-hunit >= 0.2 && < 0.4,
diff --git a/pandoc.hs b/pandoc.hs
index e8a971de7..cb3d1e04a 100644
--- a/pandoc.hs
+++ b/pandoc.hs
@@ -1,4 +1,4 @@
-{-# LANGUAGE CPP, TupleSections #-}
+{-# LANGUAGE CPP, TupleSections, ScopedTypeVariables #-}
{-
Copyright (C) 2006-2016 John MacFarlane <jgm@berkeley.edu>
@@ -836,7 +836,7 @@ options =
, Option "" ["mathjax"]
(OptArg
(\arg opt -> do
- let url' = fromMaybe "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" arg
+ let url' = fromMaybe "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_CHTML-full" arg
return opt { optHTMLMathMethod = MathJax url'})
"URL")
"" -- "Use MathJax for HTML math"
diff --git a/src/Text/Pandoc.hs b/src/Text/Pandoc.hs
index b67a53f5b..0330c46e2 100644
--- a/src/Text/Pandoc.hs
+++ b/src/Text/Pandoc.hs
@@ -291,6 +291,8 @@ writers = [
writeHtmlString o{ writerSlideVariant = RevealJsSlides
, writerHtml5 = True })
,("docbook" , PureStringWriter writeDocbook)
+ ,("docbook5" , PureStringWriter $ \o ->
+ writeDocbook o{ writerDocbook5 = True })
,("opml" , PureStringWriter writeOPML)
,("opendocument" , PureStringWriter writeOpenDocument)
,("latex" , PureStringWriter writeLaTeX)
diff --git a/src/Text/Pandoc/Options.hs b/src/Text/Pandoc/Options.hs
index 171210962..701cd8bd1 100644
--- a/src/Text/Pandoc/Options.hs
+++ b/src/Text/Pandoc/Options.hs
@@ -357,6 +357,7 @@ data WriterOptions = WriterOptions
, writerSourceURL :: Maybe String -- ^ Absolute URL + directory of 1st source file
, writerUserDataDir :: Maybe FilePath -- ^ Path of user data directory
, writerCiteMethod :: CiteMethod -- ^ How to print cites
+ , writerDocbook5 :: Bool -- ^ Produce DocBook5
, writerHtml5 :: Bool -- ^ Produce HTML5
, writerHtmlQTags :: Bool -- ^ Use @<q>@ tags for quotes in HTML
, writerBeamer :: Bool -- ^ Produce beamer LaTeX slide show
@@ -403,6 +404,7 @@ instance Default WriterOptions where
, writerSourceURL = Nothing
, writerUserDataDir = Nothing
, writerCiteMethod = Citeproc
+ , writerDocbook5 = False
, writerHtml5 = False
, writerHtmlQTags = False
, writerBeamer = False
diff --git a/src/Text/Pandoc/Readers/Docx.hs b/src/Text/Pandoc/Readers/Docx.hs
index 604bc20de..9c7c3b264 100644
--- a/src/Text/Pandoc/Readers/Docx.hs
+++ b/src/Text/Pandoc/Readers/Docx.hs
@@ -100,12 +100,13 @@ import Text.Pandoc.Compat.Except
readDocxWithWarnings :: ReaderOptions
-> B.ByteString
-> Either PandocError (Pandoc, MediaBag, [String])
-readDocxWithWarnings opts bytes =
- case archiveToDocxWithWarnings (toArchive bytes) of
- Right (docx, warnings) -> do
+readDocxWithWarnings opts bytes
+ | Right archive <- toArchiveOrFail bytes
+ , Right (docx, warnings) <- archiveToDocxWithWarnings archive = do
(meta, blks, mediaBag) <- docxToOutput opts docx
return (Pandoc meta blks, mediaBag, warnings)
- Left _ -> Left (ParseFailure "couldn't parse docx file")
+readDocxWithWarnings _ _ =
+ Left (ParseFailure "couldn't parse docx file")
readDocx :: ReaderOptions
-> B.ByteString
diff --git a/src/Text/Pandoc/Readers/Docx/Parse.hs b/src/Text/Pandoc/Readers/Docx/Parse.hs
index 364483929..7265ef8dd 100644
--- a/src/Text/Pandoc/Readers/Docx/Parse.hs
+++ b/src/Text/Pandoc/Readers/Docx/Parse.hs
@@ -661,14 +661,14 @@ elemToParPart ns element
| isElem ns "w" "r" element =
elemToRun ns element >>= (\r -> return $ PlainRun r)
elemToParPart ns element
- | isElem ns "w" "ins" element
+ | isElem ns "w" "ins" element || isElem ns "w" "moveTo" element
, Just cId <- findAttr (elemName ns "w" "id") element
, Just cAuthor <- findAttr (elemName ns "w" "author") element
, Just cDate <- findAttr (elemName ns "w" "date") element = do
runs <- mapD (elemToRun ns) (elChildren element)
return $ Insertion cId cAuthor cDate runs
elemToParPart ns element
- | isElem ns "w" "del" element
+ | isElem ns "w" "del" element || isElem ns "w" "moveFrom" element
, Just cId <- findAttr (elemName ns "w" "id") element
, Just cAuthor <- findAttr (elemName ns "w" "author") element
, Just cDate <- findAttr (elemName ns "w" "date") element = do
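Note on the hunk above: `w:moveTo` is now handled by the same clause as `w:ins`, and `w:moveFrom` by the same clause as `w:del`. A minimal sketch of that mapping, using only base. The names `ChangeKind` and `changeKind` are hypothetical and not part of the patch, and the deletion result is an assumption, since the body of the `del` clause lies outside the hunk.

```haskell
-- Hypothetical classification of tracked-change element names,
-- mirroring the two widened guards in elemToParPart.
data ChangeKind = Inserted | Deleted
  deriving (Eq, Show)

-- Takes the local element name (without the "w" prefix).
changeKind :: String -> Maybe ChangeKind
changeKind "ins"      = Just Inserted
changeKind "moveTo"   = Just Inserted  -- moved-in text treated like an insertion
changeKind "del"      = Just Deleted
changeKind "moveFrom" = Just Deleted   -- moved-out text treated like a deletion
changeKind _          = Nothing
```

Under this reading, a move becomes a deletion at the old position plus an insertion at the new one, which would let the existing track-changes options apply to moves without further code.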
diff --git a/src/Text/Pandoc/Readers/EPUB.hs b/src/Text/Pandoc/Readers/EPUB.hs
index 07d282708..b8a0b47e7 100644
--- a/src/Text/Pandoc/Readers/EPUB.hs
+++ b/src/Text/Pandoc/Readers/EPUB.hs
@@ -14,12 +14,13 @@ import Text.Pandoc.Walk (walk, query)
import Text.Pandoc.Readers.HTML (readHtml)
import Text.Pandoc.Options ( ReaderOptions(..), readerTrace)
import Text.Pandoc.Shared (escapeURI, collapseFilePath, addMetaField)
+import Network.URI (unEscapeString)
import Text.Pandoc.MediaBag (MediaBag, insertMedia)
import Text.Pandoc.Compat.Except (MonadError, throwError, runExcept, Except)
import Text.Pandoc.Compat.Monoid ((<>))
import Text.Pandoc.MIME (MimeType)
import qualified Text.Pandoc.Builder as B
-import Codec.Archive.Zip ( Archive (..), toArchive, fromEntry
+import Codec.Archive.Zip ( Archive (..), toArchiveOrFail, fromEntry
, findEntryByPath, Entry)
import qualified Data.ByteString.Lazy as BL (ByteString)
import System.FilePath ( takeFileName, (</>), dropFileName, normalise
@@ -39,7 +40,9 @@ import Text.Pandoc.Error
type Items = M.Map String (FilePath, MimeType)
readEPUB :: ReaderOptions -> BL.ByteString -> Either PandocError (Pandoc, MediaBag)
-readEPUB opts bytes = runEPUB (archiveToEPUB opts $ toArchive bytes)
+readEPUB opts bytes = case toArchiveOrFail bytes of
+ Right archive -> runEPUB $ archiveToEPUB opts $ archive
+ Left _ -> Left $ ParseFailure "Couldn't extract ePub file"
runEPUB :: Except PandocError a -> Either PandocError a
runEPUB = runExcept
@@ -72,14 +75,15 @@ archiveToEPUB os archive = do
let docSpan = B.doc $ B.para $ B.spanWith (takeFileName path, [], []) mempty
return $ docSpan <> doc
mimeToReader :: MonadError PandocError m => MimeType -> FilePath -> FilePath -> m Pandoc
- mimeToReader "application/xhtml+xml" (normalise -> root) (normalise -> path) = do
+ mimeToReader "application/xhtml+xml" (unEscapeString -> root)
+ (unEscapeString -> path) = do
fname <- findEntryByPathE (root </> path) archive
html <- either throwError return .
readHtml os' .
UTF8.toStringLazy $
fromEntry fname
return $ fixInternalReferences path html
- mimeToReader s _ path
+ mimeToReader s _ (unEscapeString -> path)
| s `elem` imageMimes = return $ imageToPandoc path
| otherwise = return $ mempty
@@ -190,8 +194,10 @@ fixInlineIRs s (Span as v) =
Span (fixAttrs s as) v
fixInlineIRs s (Code as code) =
Code (fixAttrs s as) code
-fixInlineIRs s (Link attr t ('#':url, tit)) =
- Link attr t (addHash s url, tit)
+fixInlineIRs s (Link as is ('#':url, tit)) =
+ Link (fixAttrs s as) is (addHash s url, tit)
+fixInlineIRs s (Link as is t) =
+ Link (fixAttrs s as) is t
fixInlineIRs _ v = v
prependHash :: [String] -> Inline -> Inline
diff --git a/src/Text/Pandoc/Readers/HTML.hs b/src/Text/Pandoc/Readers/HTML.hs
index fb936cff7..164e3a98f 100644
--- a/src/Text/Pandoc/Readers/HTML.hs
+++ b/src/Text/Pandoc/Readers/HTML.hs
@@ -707,7 +707,7 @@ pCloses tagtype = try $ do
(TagOpen t' _) | t' `closes` tagtype -> return ()
(TagClose "ul") | tagtype == "li" -> return ()
(TagClose "ol") | tagtype == "li" -> return ()
- (TagClose "dl") | tagtype == "li" -> return ()
+ (TagClose "dl") | tagtype == "dd" -> return ()
(TagClose "table") | tagtype == "td" -> return ()
(TagClose "table") | tagtype == "tr" -> return ()
_ -> mzero
@@ -971,11 +971,20 @@ htmlTag :: Monad m
htmlTag f = try $ do
lookAhead (char '<')
inp <- getInput
- let (next : rest) = canonicalizeTags $ parseTagsOptions
- parseOptions{ optTagWarning = True } inp
+ let (next : _) = canonicalizeTags $ parseTagsOptions
+ parseOptions{ optTagWarning = False } inp
guard $ f next
+ let handleTag tagname = do
+ -- <www.boe.es/buscar/act.php?id=BOE-A-1996-8930#a66>
+ -- should NOT be parsed as an HTML tag, see #2277
+ guard $ not ('.' `elem` tagname)
+ -- <https://example.org> should NOT be a tag either.
+ -- tagsoup will parse it as TagOpen "https:" [("example.org","")]
+ guard $ not (null tagname)
+ guard $ last tagname /= ':'
+ rendered <- manyTill anyChar (char '>')
+ return (next, rendered ++ ">")
case next of
- TagWarning _ -> fail "encountered TagWarning"
TagComment s
| "<!--" `isPrefixOf` inp -> do
count (length s + 4) anyChar
@@ -983,13 +992,9 @@ htmlTag f = try $ do
char '>'
return (next, "<!--" ++ s ++ "-->")
| otherwise -> fail "bogus comment mode, HTML5 parse error"
- _ -> do
- -- we get a TagWarning on things like
- -- <www.boe.es/buscar/act.php?id=BOE-A-1996-8930#a66>
- -- which should NOT be parsed as an HTML tag, see #2277
- guard $ not $ hasTagWarning rest
- rendered <- manyTill anyChar (char '>')
- return (next, rendered ++ ">")
+ TagOpen tagname _attr -> handleTag tagname
+ TagClose tagname -> handleTag tagname
+ _ -> mzero
mkAttr :: [(String, String)] -> Attr
mkAttr attr = (attribsId, attribsClasses, attribsKV)
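Note on the `htmlTag` hunk above: the `TagWarning` check is replaced by explicit guards on the tag name. A minimal standalone sketch of those three guards as one predicate, using only base. The name `plausibleTagName` is hypothetical and does not appear in the patch.

```haskell
-- Hypothetical helper combining the three guards in handleTag.
plausibleTagName :: String -> Bool
plausibleTagName name =
     not (null name)      -- an empty tag name is rejected
  && '.' `notElem` name   -- names containing '.', e.g. from <www.boe.es/...> (#2277)
  && last name /= ':'     -- e.g. "https:" as produced for <https://example.org>
```

The null check comes first so that `last` is never evaluated on an empty string. In the patch itself the guards run inside the parser monad, so a failing guard makes `htmlTag` fail and lets other parsers try the input.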
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index b5d175453..e43714526 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -122,9 +122,6 @@ inList = do
ctx <- stateParserContext <$> getState
guard (ctx == ListItemState)
-isNull :: F Inlines -> Bool
-isNull ils = B.isNull $ runF ils def
-
spnl :: Parser [Char] st ()
spnl = try $ do
skipSpaces
@@ -188,31 +185,38 @@ charsInBalancedBrackets openBrackets =
-- document structure
--
-titleLine :: MarkdownParser (F Inlines)
-titleLine = try $ do
+rawTitleBlockLine :: MarkdownParser String
+rawTitleBlockLine = do
char '%'
skipSpaces
- res <- many $ (notFollowedBy newline >> inline)
- <|> try (endline >> whitespace)
- newline
+ first <- anyLine
+ rest <- many $ try $ do spaceChar
+ notFollowedBy blankline
+ skipSpaces
+ anyLine
+ return $ trim $ unlines (first:rest)
+
+titleLine :: MarkdownParser (F Inlines)
+titleLine = try $ do
+ raw <- rawTitleBlockLine
+ res <- parseFromString (many inline) raw
return $ trimInlinesF $ mconcat res
authorsLine :: MarkdownParser (F [Inlines])
authorsLine = try $ do
- char '%'
- skipSpaces
- authors <- sepEndBy (many (notFollowedBy (satisfy $ \c ->
- c == ';' || c == '\n') >> inline))
- (char ';' <|>
- try (newline >> notFollowedBy blankline >> spaceChar))
- newline
- return $ sequence $ filter (not . isNull) $ map (trimInlinesF . mconcat) authors
+ raw <- rawTitleBlockLine
+ let sep = (char ';' <* spaces) <|> newline
+ let pAuthors = sepEndBy
+ (trimInlinesF . mconcat <$> many
+ (try $ notFollowedBy sep >> inline))
+ sep
+ sequence <$> parseFromString pAuthors raw
dateLine :: MarkdownParser (F Inlines)
dateLine = try $ do
- char '%'
- skipSpaces
- trimInlinesF . mconcat <$> manyTill inline newline
+ raw <- rawTitleBlockLine
+ res <- parseFromString (many inline) raw
+ return $ trimInlinesF $ mconcat res
titleBlock :: MarkdownParser ()
titleBlock = pandocTitleBlock <|> mmdTitleBlock
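Note on the Markdown hunk above: the title, author, and date lines now share `rawTitleBlockLine`, which first collects the raw text of a `%` field (including continuation lines that start with a space) and only then parses it. The following is a rough plain-string sketch of that two-step idea, using only base. The names `rawField`, `splitAuthors`, and `trim'` are hypothetical. Unlike the patch, it does not parse inline markup, and it drops empty author entries, which is a simplification on my part.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Strip leading and trailing whitespace.
trim' :: String -> String
trim' = dropWhileEnd isSpace . dropWhile isSpace

-- Collect one "% ..." field from a list of lines: the first line plus
-- any non-blank continuation lines starting with a space or tab.
-- Returns the field text and the lines that were not consumed.
rawField :: [String] -> Maybe (String, [String])
rawField (('%' : first) : rest) =
  let isCont l = case l of
        (c : _) -> (c == ' ' || c == '\t') && any (not . isSpace) l
        []      -> False
      (conts, remaining) = span isCont rest
      body = unlines (dropWhile isSpace first : map (dropWhile isSpace) conts)
  in Just (trim' body, remaining)
rawField _ = Nothing

-- Split an author field on ';' or newline, dropping empty entries.
splitAuthors :: String -> [String]
splitAuthors = filter (not . null) . map trim' . pieces
  where
    pieces s = case break (\c -> c == ';' || c == '\n') s of
      (a, [])       -> [a]
      (a, _ : more) -> a : pieces more
```

For example, `rawField ["% Jane Doe;", "  John Roe", "% 2016"]` yields the field text `"Jane Doe;\nJohn Roe"` with `["% 2016"]` left over, and `splitAuthors` on that text gives the two author names.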
diff --git a/src/Text/Pandoc/Readers/MediaWiki.hs b/src/Text/Pandoc/Readers/MediaWiki.hs
index 950497992..d3cee08e2 100644
--- a/src/Text/Pandoc/Readers/MediaWiki.hs
+++ b/src/Text/Pandoc/Readers/MediaWiki.hs
@@ -225,7 +225,7 @@ table = do
Nothing -> 1.0
caption <- option mempty tableCaption
optional rowsep
- hasheader <- option False $ True <$ (lookAhead (char '!'))
+ hasheader <- option False $ True <$ (lookAhead (skipSpaces *> char '!'))
(cellspecs',hdr) <- unzip <$> tableRow
let widths = map ((tableWidth *) . snd) cellspecs'
let restwidth = tableWidth - sum widths
diff --git a/src/Text/Pandoc/Readers/Odt.hs b/src/Text/Pandoc/Readers/Odt.hs
index a925c1d84..68e89263c 100644
--- a/src/Text/Pandoc/Readers/Odt.hs
+++ b/src/Text/Pandoc/Readers/Odt.hs
@@ -59,7 +59,9 @@ readOdt _ bytes = case bytesToOdt bytes of
--
bytesToOdt :: B.ByteString -> Either PandocError Pandoc
-bytesToOdt bytes = archiveToOdt $ toArchive bytes
+bytesToOdt bytes = case toArchiveOrFail bytes of
+ Right archive -> archiveToOdt archive
+ Left _ -> Left $ ParseFailure "Couldn't parse odt file."
--
archiveToOdt :: Archive -> Either PandocError Pandoc
diff --git a/src/Text/Pandoc/Readers/Org.hs b/src/Text/Pandoc/Readers/Org.hs
index 7dd611be3..d593f856d 100644
--- a/src/Text/Pandoc/Readers/Org.hs
+++ b/src/Text/Pandoc/Readers/Org.hs
@@ -1,6 +1,3 @@
-{-# LANGUAGE OverloadedStrings #-}
-{-# LANGUAGE GeneralizedNewtypeDeriving #-}
-{-# LANGUAGE MultiParamTypeClasses, FlexibleContexts, FlexibleInstances #-}
{-
Copyright (C) 2014-2016 Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
@@ -30,60 +27,35 @@ Conversion of org-mode formatted plain text to 'Pandoc' document.
-}
module Text.Pandoc.Readers.Org ( readOrg ) where
-import qualified Text.Pandoc.Builder as B
-import Text.Pandoc.Builder ( Inlines, Blocks, HasMeta(..),
- trimInlines )
+import Text.Pandoc.Readers.Org.Blocks ( blockList, meta )
+import Text.Pandoc.Readers.Org.Parsing ( OrgParser, readWithM )
+import Text.Pandoc.Readers.Org.ParserState ( optionsToParserState )
+
import Text.Pandoc.Definition
-import Text.Pandoc.Compat.Monoid ((<>))
+import Text.Pandoc.Error
import Text.Pandoc.Options
-import qualified Text.Pandoc.Parsing as P
-import Text.Pandoc.Parsing hiding ( F, unF, askF, asksF, runF
- , newline, orderedListMarker
- , parseFromString, blanklines
- )
-import Text.Pandoc.Readers.LaTeX (inlineCommand, rawLaTeXInline)
-import Text.Pandoc.Shared (compactify', compactify'DL)
-import Text.TeXMath (readTeX, writePandoc, DisplayType(..))
-import qualified Text.TeXMath.Readers.MathML.EntityMap as MathMLEntityMap
-import Control.Arrow (first)
-import Control.Monad (foldM, guard, liftM, liftM2, mplus, mzero, when)
-import Control.Monad.Reader (Reader, runReader, ask, asks, local)
-import Data.Char (isAlphaNum, toLower)
-import Data.Default
-import Data.List (intersperse, isPrefixOf, isSuffixOf)
-import qualified Data.Map as M
-import qualified Data.Set as Set
-import Data.Maybe (fromMaybe, isJust)
-import Network.HTTP (urlEncode)
+import Control.Monad.Reader ( runReader )
-import Text.Pandoc.Error
-- | Parse org-mode string and return a Pandoc document.
readOrg :: ReaderOptions -- ^ Reader options
-> String -- ^ String to parse (assuming @'\n'@ line endings)
-> Either PandocError Pandoc
-readOrg opts s = flip runReader def $ readWithM parseOrg def{ orgStateOptions = opts } (s ++ "\n\n")
-
-data OrgParserLocal = OrgParserLocal { orgLocalQuoteContext :: QuoteContext }
-
-type OrgParser = ParserT [Char] OrgParserState (Reader OrgParserLocal)
-
-instance HasIdentifierList OrgParserState where
- extractIdentifierList = orgStateIdentifiers
- updateIdentifierList f s = s{ orgStateIdentifiers = f (orgStateIdentifiers s) }
-
-instance HasHeaderMap OrgParserState where
- extractHeaderMap = orgStateHeaderMap
- updateHeaderMap f s = s{ orgStateHeaderMap = f (orgStateHeaderMap s) }
+readOrg opts s = flip runReader def $
+ readWithM parseOrg (optionsToParserState opts) (s ++ "\n\n")
+--
+-- Parser
+--
parseOrg :: OrgParser Pandoc
parseOrg = do
- blocks' <- parseBlocks
- st <- getState
- let meta = runF (orgStateMeta' st) st
- let removeUnwantedBlocks = dropCommentTrees . filter (/= Null)
- return $ Pandoc meta $ removeUnwantedBlocks (B.toList $ runF blocks' st)
+ blocks' <- blockList
+ meta' <- meta
+ return . Pandoc meta' $ removeUnwantedBlocks blocks'
+ where
+ removeUnwantedBlocks :: [Block] -> [Block]
+ removeUnwantedBlocks = dropCommentTrees . filter (/= Null)
-- | Drop COMMENT headers and the document tree below those headers.
dropCommentTrees :: [Block] -> [Block]
@@ -118,1504 +90,3 @@ isHeaderLevelLowerEq n blk =
case blk of
(Header level _ _) -> n >= level
_ -> False
-
---
--- Parser State for Org
---
-
-type OrgNoteRecord = (String, F Blocks)
-type OrgNoteTable = [OrgNoteRecord]
-
-type OrgBlockAttributes = M.Map String String
-
-type OrgLinkFormatters = M.Map String (String -> String)
-
--- | Org-mode parser state
-data OrgParserState = OrgParserState
- { orgStateOptions :: ReaderOptions
- , orgStateAnchorIds :: [String]
- , orgStateBlockAttributes :: OrgBlockAttributes
- , orgStateEmphasisCharStack :: [Char]
- , orgStateEmphasisNewlines :: Maybe Int
- , orgStateLastForbiddenCharPos :: Maybe SourcePos
- , orgStateLastPreCharPos :: Maybe SourcePos
- , orgStateLastStrPos :: Maybe SourcePos
- , orgStateLinkFormatters :: OrgLinkFormatters
- , orgStateMeta :: Meta
- , orgStateMeta' :: F Meta
- , orgStateNotes' :: OrgNoteTable
- , orgStateParserContext :: ParserContext
- , orgStateIdentifiers :: Set.Set String
- , orgStateHeaderMap :: M.Map Inlines String
- }
-
-instance Default OrgParserLocal where
- def = OrgParserLocal NoQuote
-
-instance HasReaderOptions OrgParserState where
- extractReaderOptions = orgStateOptions
-
-instance HasMeta OrgParserState where
- setMeta field val st =
- st{ orgStateMeta = setMeta field val $ orgStateMeta st }
- deleteMeta field st =
- st{ orgStateMeta = deleteMeta field $ orgStateMeta st }
-
-instance HasLastStrPosition OrgParserState where
- getLastStrPos = orgStateLastStrPos
- setLastStrPos pos st = st{ orgStateLastStrPos = Just pos }
-
-instance HasQuoteContext st (Reader OrgParserLocal) where
- getQuoteContext = asks orgLocalQuoteContext
- withQuoteContext q = local (\s -> s{orgLocalQuoteContext = q})
-
-instance Default OrgParserState where
- def = defaultOrgParserState
-
-defaultOrgParserState :: OrgParserState
-defaultOrgParserState = OrgParserState
- { orgStateOptions = def
- , orgStateAnchorIds = []
- , orgStateBlockAttributes = M.empty
- , orgStateEmphasisCharStack = []
- , orgStateEmphasisNewlines = Nothing
- , orgStateLastForbiddenCharPos = Nothing
- , orgStateLastPreCharPos = Nothing
- , orgStateLastStrPos = Nothing
- , orgStateLinkFormatters = M.empty
- , orgStateMeta = nullMeta
- , orgStateMeta' = return nullMeta
- , orgStateNotes' = []
- , orgStateParserContext = NullState
- , orgStateIdentifiers = Set.empty
- , orgStateHeaderMap = M.empty
- }
-
-recordAnchorId :: String -> OrgParser ()
-recordAnchorId i = updateState $ \s ->
- s{ orgStateAnchorIds = i : (orgStateAnchorIds s) }
-
-updateLastForbiddenCharPos :: OrgParser ()
-updateLastForbiddenCharPos = getPosition >>= \p ->
- updateState $ \s -> s{ orgStateLastForbiddenCharPos = Just p}
-
-updateLastPreCharPos :: OrgParser ()
-updateLastPreCharPos = getPosition >>= \p ->
- updateState $ \s -> s{ orgStateLastPreCharPos = Just p}
-
-pushToInlineCharStack :: Char -> OrgParser ()
-pushToInlineCharStack c = updateState $ \s ->
- s{ orgStateEmphasisCharStack = c:orgStateEmphasisCharStack s }
-
-popInlineCharStack :: OrgParser ()
-popInlineCharStack = updateState $ \s ->
- s{ orgStateEmphasisCharStack = drop 1 . orgStateEmphasisCharStack $ s }
-
-surroundingEmphasisChar :: OrgParser [Char]
-surroundingEmphasisChar =
- take 1 . drop 1 . orgStateEmphasisCharStack <$> getState
-
-startEmphasisNewlinesCounting :: Int -> OrgParser ()
-startEmphasisNewlinesCounting maxNewlines = updateState $ \s ->
- s{ orgStateEmphasisNewlines = Just maxNewlines }
-
-decEmphasisNewlinesCount :: OrgParser ()
-decEmphasisNewlinesCount = updateState $ \s ->
- s{ orgStateEmphasisNewlines = (\n -> n - 1) <$> orgStateEmphasisNewlines s }
-
-newlinesCountWithinLimits :: OrgParser Bool
-newlinesCountWithinLimits = do
- st <- getState
- return $ ((< 0) <$> orgStateEmphasisNewlines st) /= Just True
-
-resetEmphasisNewlines :: OrgParser ()
-resetEmphasisNewlines = updateState $ \s ->
- s{ orgStateEmphasisNewlines = Nothing }
-
-addLinkFormat :: String
- -> (String -> String)
- -> OrgParser ()
-addLinkFormat key formatter = updateState $ \s ->
- let fs = orgStateLinkFormatters s
- in s{ orgStateLinkFormatters = M.insert key formatter fs }
-
-addToNotesTable :: OrgNoteRecord -> OrgParser ()
-addToNotesTable note = do
- oldnotes <- orgStateNotes' <$> getState
- updateState $ \s -> s{ orgStateNotes' = note:oldnotes }
-
--- The version Text.Pandoc.Parsing cannot be used, as we need additional parts
--- of the state saved and restored.
-parseFromString :: OrgParser a -> String -> OrgParser a
-parseFromString parser str' = do
- oldLastPreCharPos <- orgStateLastPreCharPos <$> getState
- updateState $ \s -> s{ orgStateLastPreCharPos = Nothing }
- result <- P.parseFromString parser str'
- updateState $ \s -> s{ orgStateLastPreCharPos = oldLastPreCharPos }
- return result
-
-
---
--- Adaptions and specializations of parsing utilities
---
-
-newtype F a = F { unF :: Reader OrgParserState a
- } deriving (Monad, Applicative, Functor)
-
-runF :: F a -> OrgParserState -> a
-runF = runReader . unF
-
-askF :: F OrgParserState
-askF = F ask
-
-asksF :: (OrgParserState -> a) -> F a
-asksF f = F $ asks f
-
-instance Monoid a => Monoid (F a) where
- mempty = return mempty
- mappend = liftM2 mappend
- mconcat = fmap mconcat . sequence
-
-trimInlinesF :: F Inlines -> F Inlines
-trimInlinesF = liftM trimInlines
-
-returnF :: a -> OrgParser (F a)
-returnF = return . return
-
-
--- | Like @Text.Parsec.Char.newline@, but causes additional state changes.
-newline :: OrgParser Char
-newline =
- P.newline
- <* updateLastPreCharPos
- <* updateLastForbiddenCharPos
-
--- | Like @Text.Parsec.Char.blanklines@, but causes additional state changes.
-blanklines :: OrgParser [Char]
-blanklines =
- P.blanklines
- <* updateLastPreCharPos
- <* updateLastForbiddenCharPos
-
--- | Succeeds when we're in list context.
-inList :: OrgParser ()
-inList = do
- ctx <- orgStateParserContext <$> getState
- guard (ctx == ListItemState)
-
--- | Parse in different context
-withContext :: ParserContext -- ^ New parser context
- -> OrgParser a -- ^ Parser to run in that context
- -> OrgParser a
-withContext context parser = do
- oldContext <- orgStateParserContext <$> getState
- updateState $ \s -> s{ orgStateParserContext = context }
- result <- parser
- updateState $ \s -> s{ orgStateParserContext = oldContext }
- return result
-
---
--- parsing blocks
---
-
-parseBlocks :: OrgParser (F Blocks)
-parseBlocks = mconcat <$> manyTill block eof
-
-block :: OrgParser (F Blocks)
-block = choice [ mempty <$ blanklines
- , optionalAttributes $ choice
- [ orgBlock
- , figure
- , table
- ]
- , example
- , drawer
- , specialLine
- , header
- , return <$> hline
- , list
- , latexFragment
- , noteBlock
- , paraOrPlain
- ] <?> "block"
-
---
--- Block Attributes
---
-
--- | Parse optional block attributes (like #+TITLE or #+NAME)
-optionalAttributes :: OrgParser (F Blocks) -> OrgParser (F Blocks)
-optionalAttributes parser = try $
- resetBlockAttributes *> parseBlockAttributes *> parser
- where
- resetBlockAttributes :: OrgParser ()
- resetBlockAttributes = updateState $ \s ->
- s{ orgStateBlockAttributes = orgStateBlockAttributes def }
-
-parseBlockAttributes :: OrgParser ()
-parseBlockAttributes = do
- attrs <- many attribute
- mapM_ (uncurry parseAndAddAttribute) attrs
- where
- attribute :: OrgParser (String, String)
- attribute = try $ do
- key <- metaLineStart *> many1Till nonspaceChar (char ':')
- val <- skipSpaces *> anyLine
- return (map toLower key, val)
-
-parseAndAddAttribute :: String -> String -> OrgParser ()
-parseAndAddAttribute key value = do
- let key' = map toLower key
- () <$ addBlockAttribute key' value
-
-lookupInlinesAttr :: String -> OrgParser (Maybe (F Inlines))
-lookupInlinesAttr attr = try $ do
- val <- lookupBlockAttribute attr
- maybe (return Nothing)
- (fmap Just . parseFromString parseInlines)
- val
-
-addBlockAttribute :: String -> String -> OrgParser ()
-addBlockAttribute key val = updateState $ \s ->
- let attrs = orgStateBlockAttributes s
- in s{ orgStateBlockAttributes = M.insert key val attrs }
-
-lookupBlockAttribute :: String -> OrgParser (Maybe String)
-lookupBlockAttribute key =
- M.lookup key . orgStateBlockAttributes <$> getState
-
-
---
--- Org Blocks (#+BEGIN_... / #+END_...)
---
-
-type BlockProperties = (Int, String) -- (Indentation, Block-Type)
-
-orgBlock :: OrgParser (F Blocks)
-orgBlock = try $ do
- blockProp@(_, blkType) <- blockHeaderStart
- ($ blockProp) $
- case blkType of
- "comment" -> withRaw' (const mempty)
- "html" -> withRaw' (return . (B.rawBlock blkType))
- "latex" -> withRaw' (return . (B.rawBlock blkType))
- "ascii" -> withRaw' (return . (B.rawBlock blkType))
- "example" -> withRaw' (return . exampleCode)
- "quote" -> withParsed (fmap B.blockQuote)
- "verse" -> verseBlock
- "src" -> codeBlock
- _ -> withParsed (fmap $ divWithClass blkType)
-
-blockHeaderStart :: OrgParser (Int, String)
-blockHeaderStart = try $ (,) <$> indent <*> blockType
- where
- indent = length <$> many spaceChar
- blockType = map toLower <$> (stringAnyCase "#+begin_" *> orgArgWord)
-
-withRaw' :: (String -> F Blocks) -> BlockProperties -> OrgParser (F Blocks)
-withRaw' f blockProp = (ignHeaders *> (f <$> rawBlockContent blockProp))
-
-withParsed :: (F Blocks -> F Blocks) -> BlockProperties -> OrgParser (F Blocks)
-withParsed f blockProp = (ignHeaders *> (f <$> parsedBlockContent blockProp))
-
-ignHeaders :: OrgParser ()
-ignHeaders = (() <$ newline) <|> (() <$ anyLine)
-
-divWithClass :: String -> Blocks -> Blocks
-divWithClass cls = B.divWith ("", [cls], [])
-
-verseBlock :: BlockProperties -> OrgParser (F Blocks)
-verseBlock blkProp = try $ do
- ignHeaders
- content <- rawBlockContent blkProp
- fmap B.para . mconcat . intersperse (pure B.linebreak)
- <$> mapM (parseFromString parseInlines) (map (++ "\n") . lines $ content)
-
-exportsCode :: [(String, String)] -> Bool
-exportsCode attrs = not (("rundoc-exports", "none") `elem` attrs
- || ("rundoc-exports", "results") `elem` attrs)
-
-exportsResults :: [(String, String)] -> Bool
-exportsResults attrs = ("rundoc-exports", "results") `elem` attrs
- || ("rundoc-exports", "both") `elem` attrs
-
-followingResultsBlock :: OrgParser (Maybe (F Blocks))
-followingResultsBlock =
- optionMaybe (try $ blanklines *> stringAnyCase "#+RESULTS:"
- *> blankline
- *> block)
-
-codeBlock :: BlockProperties -> OrgParser (F Blocks)
-codeBlock blkProp = do
- skipSpaces
- (classes, kv) <- codeHeaderArgs <|> (mempty <$ ignHeaders)
- id' <- fromMaybe "" <$> lookupBlockAttribute "name"
- content <- rawBlockContent blkProp
- resultsContent <- followingResultsBlock
- let includeCode = exportsCode kv
- let includeResults = exportsResults kv
- let codeBlck = B.codeBlockWith ( id', classes, kv ) content
- labelledBlck <- maybe (pure codeBlck)
- (labelDiv codeBlck)
- <$> lookupInlinesAttr "caption"
- let resultBlck = fromMaybe mempty resultsContent
- return $ (if includeCode then labelledBlck else mempty)
- <> (if includeResults then resultBlck else mempty)
- where
- labelDiv blk value =
- B.divWith nullAttr <$> (mappend <$> labelledBlock value
- <*> pure blk)
- labelledBlock = fmap (B.plain . B.spanWith ("", ["label"], []))
-
-rawBlockContent :: BlockProperties -> OrgParser String
-rawBlockContent (indent, blockType) = try $
- unlines . map commaEscaped <$> manyTill indentedLine blockEnder
- where
- indentedLine = try $ ("" <$ blankline) <|> (indentWith indent *> anyLine)
- blockEnder = try $ indentWith indent *> stringAnyCase ("#+end_" <> blockType)
-
-parsedBlockContent :: BlockProperties -> OrgParser (F Blocks)
-parsedBlockContent blkProps = try $ do
- raw <- rawBlockContent blkProps
- parseFromString parseBlocks (raw ++ "\n")
-
--- indent by specified number of spaces (or equiv. tabs)
-indentWith :: Int -> OrgParser String
-indentWith num = do
- tabStop <- getOption readerTabStop
- if num < tabStop
- then count num (char ' ')
- else choice [ try (count num (char ' '))
- , try (char '\t' >> count (num - tabStop) (char ' ')) ]
-
-type SwitchOption = (Char, Maybe String)
-
-orgArgWord :: OrgParser String
-orgArgWord = many1 orgArgWordChar
-
--- | Parse code block arguments
--- TODO: We currently don't handle switches.
-codeHeaderArgs :: OrgParser ([String], [(String, String)])
-codeHeaderArgs = try $ do
- language <- skipSpaces *> orgArgWord
- _ <- skipSpaces *> (try $ switch `sepBy` (many1 spaceChar))
- parameters <- manyTill blockOption newline
- let pandocLang = translateLang language
- return $
- if hasRundocParameters parameters
- then ( [ pandocLang, rundocBlockClass ]
- , map toRundocAttrib (("language", language) : parameters)
- )
- else ([ pandocLang ], parameters)
- where hasRundocParameters = not . null
-
-switch :: OrgParser SwitchOption
-switch = try $ simpleSwitch <|> lineNumbersSwitch
- where
- simpleSwitch = (\c -> (c, Nothing)) <$> (oneOf "-+" *> letter)
- lineNumbersSwitch = (\ls -> ('l', Just ls)) <$>
- (string "-l \"" *> many1Till nonspaceChar (char '"'))
-
-translateLang :: String -> String
-translateLang "C" = "c"
-translateLang "C++" = "cpp"
-translateLang "emacs-lisp" = "commonlisp" -- emacs lisp is not supported
-translateLang "js" = "javascript"
-translateLang "lisp" = "commonlisp"
-translateLang "R" = "r"
-translateLang "sh" = "bash"
-translateLang "sqlite" = "sql"
-translateLang cs = cs
-
--- | Prefix used for Rundoc classes and arguments.
-rundocPrefix :: String
-rundocPrefix = "rundoc-"
-
--- | The class-name used to mark rundoc blocks.
-rundocBlockClass :: String
-rundocBlockClass = rundocPrefix ++ "block"
-
-blockOption :: OrgParser (String, String)
-blockOption = try $ do
- argKey <- orgArgKey
- paramValue <- option "yes" orgParamValue
- return (argKey, paramValue)
-
-inlineBlockOption :: OrgParser (String, String)
-inlineBlockOption = try $ do
- argKey <- orgArgKey
- paramValue <- option "yes" orgInlineParamValue
- return (argKey, paramValue)
-
-orgArgKey :: OrgParser String
-orgArgKey = try $
- skipSpaces *> char ':'
- *> many1 orgArgWordChar
-
-orgParamValue :: OrgParser String
-orgParamValue = try $
- skipSpaces
- *> notFollowedBy (char ':' )
- *> many1 (noneOf "\t\n\r ")
- <* skipSpaces
-
-orgInlineParamValue :: OrgParser String
-orgInlineParamValue = try $
- skipSpaces
- *> notFollowedBy (char ':')
- *> many1 (noneOf "\t\n\r ]")
- <* skipSpaces
-
-orgArgWordChar :: OrgParser Char
-orgArgWordChar = alphaNum <|> oneOf "-_"
-
-toRundocAttrib :: (String, String) -> (String, String)
-toRundocAttrib = first ("rundoc-" ++)
-
-commaEscaped :: String -> String
-commaEscaped (',':cs@('*':_)) = cs
-commaEscaped (',':cs@('#':'+':_)) = cs
-commaEscaped cs = cs
-
-example :: OrgParser (F Blocks)
-example = try $ do
- return . return . exampleCode =<< unlines <$> many1 exampleLine
-
-exampleCode :: String -> Blocks
-exampleCode = B.codeBlockWith ("", ["example"], [])
-
-exampleLine :: OrgParser String
-exampleLine = try $ skipSpaces *> string ": " *> anyLine
-
--- Drawers for properties or a logbook
-drawer :: OrgParser (F Blocks)
-drawer = try $ do
- drawerStart
- manyTill drawerLine (try drawerEnd)
- return mempty
-
-drawerStart :: OrgParser String
-drawerStart = try $
- skipSpaces *> drawerName <* skipSpaces <* P.newline
- where drawerName = try $ char ':' *> validDrawerName <* char ':'
- validDrawerName = stringAnyCase "PROPERTIES"
- <|> stringAnyCase "LOGBOOK"
-
-drawerLine :: OrgParser String
-drawerLine = try anyLine
-
-drawerEnd :: OrgParser String
-drawerEnd = try $
- skipSpaces *> stringAnyCase ":END:" <* skipSpaces <* P.newline
-
-
---
--- Figures
---
-
--- Figures (Image on a line by itself, preceded by name and/or caption)
-figure :: OrgParser (F Blocks)
-figure = try $ do
- (cap, nam) <- nameAndCaption
- src <- skipSpaces *> selfTarget <* skipSpaces <* P.newline
- guard (isImageFilename src)
- return $ do
- cap' <- cap
- return $ B.para $ B.image src nam cap'
- where
- nameAndCaption =
- do
- maybeCap <- lookupInlinesAttr "caption"
- maybeNam <- lookupBlockAttribute "name"
- guard $ isJust maybeCap || isJust maybeNam
- return ( fromMaybe mempty maybeCap
- , withFigPrefix $ fromMaybe mempty maybeNam )
- withFigPrefix cs =
- if "fig:" `isPrefixOf` cs
- then cs
- else "fig:" ++ cs
-
---
--- Comments, Options and Metadata
-specialLine :: OrgParser (F Blocks)
-specialLine = fmap return . try $ metaLine <|> commentLine
-
-metaLine :: OrgParser Blocks
-metaLine = try $ mempty
- <$ (metaLineStart *> (optionLine <|> declarationLine))
-
-commentLine :: OrgParser Blocks
-commentLine = try $ commentLineStart *> anyLine *> pure mempty
-
--- The order in which blocks are tried ensures that we're not looking at
--- the beginning of a block, so we don't need to check for it
-metaLineStart :: OrgParser String
-metaLineStart = try $ mappend <$> many spaceChar <*> string "#+"
-
-commentLineStart :: OrgParser String
-commentLineStart = try $ mappend <$> many spaceChar <*> string "# "
-
-declarationLine :: OrgParser ()
-declarationLine = try $ do
- key <- metaKey
- inlinesF <- metaInlines
- updateState $ \st ->
- let meta' = B.setMeta <$> pure key <*> inlinesF <*> pure nullMeta
- in st { orgStateMeta' = orgStateMeta' st <> meta' }
- return ()
-
-metaInlines :: OrgParser (F MetaValue)
-metaInlines = fmap (MetaInlines . B.toList) <$> inlinesTillNewline
-
-metaKey :: OrgParser String
-metaKey = map toLower <$> many1 (noneOf ": \n\r")
- <* char ':'
- <* skipSpaces
-
-optionLine :: OrgParser ()
-optionLine = try $ do
- key <- metaKey
- case key of
- "link" -> parseLinkFormat >>= uncurry addLinkFormat
- _ -> mzero
-
-parseLinkFormat :: OrgParser ((String, String -> String))
-parseLinkFormat = try $ do
- linkType <- (:) <$> letter <*> many (alphaNum <|> oneOf "-_") <* skipSpaces
- linkSubst <- parseFormat
- return (linkType, linkSubst)
-
--- | An ad-hoc, single-argument-only implementation of a printf-style format
--- parser.
-parseFormat :: OrgParser (String -> String)
-parseFormat = try $ do
- replacePlain <|> replaceUrl <|> justAppend
- where
- -- inefficient, but who cares
- replacePlain = try $ (\x -> concat . flip intersperse x)
- <$> sequence [tillSpecifier 's', rest]
- replaceUrl = try $ (\x -> concat . flip intersperse x . urlEncode)
- <$> sequence [tillSpecifier 'h', rest]
- justAppend = try $ (++) <$> rest
-
- rest = manyTill anyChar (eof <|> () <$ oneOf "\n\r")
- tillSpecifier c = manyTill (noneOf "\n\r") (try $ string ('%':c:""))
-
---
--- Headers
---
-
--- | Headers
-header :: OrgParser (F Blocks)
-header = try $ do
- level <- headerStart
- title <- manyTill inline (lookAhead headerEnd)
- tags <- headerEnd
- let inlns = trimInlinesF . mconcat $ title <> map tagToInlineF tags
- st <- getState
- let inlines = runF inlns st
- attr <- registerHeader nullAttr inlines
- return $ pure (B.headerWith attr level inlines)
- where
- tagToInlineF :: String -> F Inlines
- tagToInlineF t = return $ B.spanWith ("", ["tag"], [("data-tag-name", t)]) mempty
-
-headerEnd :: OrgParser [String]
-headerEnd = option [] headerTags <* newline
-
-headerTags :: OrgParser [String]
-headerTags = try $
- skipSpaces
- *> char ':'
- *> many1 tag
- <* skipSpaces
- where tag = many1 (alphaNum <|> oneOf "@%#_")
- <* char ':'
-
-headerStart :: OrgParser Int
-headerStart = try $
- (length <$> many1 (char '*')) <* many1 (char ' ') <* updateLastPreCharPos
-
-
--- Don't use (or need) the reader wrapper here, we want hline to be
--- @show@able. Otherwise we can't use it with @notFollowedBy'@.
-
--- | Horizontal line (five or more dashes)
-hline :: OrgParser Blocks
-hline = try $ do
- skipSpaces
- string "-----"
- many (char '-')
- skipSpaces
- newline
- return B.horizontalRule
-
---
--- Tables
---
-
-data OrgTableRow = OrgContentRow (F [Blocks])
- | OrgAlignRow [Alignment]
- | OrgHlineRow
-
-data OrgTable = OrgTable
- { orgTableColumns :: Int
- , orgTableAlignments :: [Alignment]
- , orgTableHeader :: [Blocks]
- , orgTableRows :: [[Blocks]]
- }
-
-table :: OrgParser (F Blocks)
-table = try $ do
- lookAhead tableStart
- do
- rows <- tableRows
- cptn <- fromMaybe (pure "") <$> lookupInlinesAttr "caption"
- return $ (<$> cptn) . orgToPandocTable . normalizeTable =<< rowsToTable rows
-
-orgToPandocTable :: OrgTable
- -> Inlines
- -> Blocks
-orgToPandocTable (OrgTable _ aligns heads lns) caption =
- B.table caption (zip aligns $ repeat 0) heads lns
-
-tableStart :: OrgParser Char
-tableStart = try $ skipSpaces *> char '|'
-
-tableRows :: OrgParser [OrgTableRow]
-tableRows = try $ many (tableAlignRow <|> tableHline <|> tableContentRow)
-
-tableContentRow :: OrgParser OrgTableRow
-tableContentRow = try $
- OrgContentRow . sequence <$> (tableStart *> manyTill tableContentCell newline)
-
-tableContentCell :: OrgParser (F Blocks)
-tableContentCell = try $
- fmap B.plain . trimInlinesF . mconcat <$> many1Till inline endOfCell
-
-endOfCell :: OrgParser Char
-endOfCell = try $ char '|' <|> lookAhead newline
-
-tableAlignRow :: OrgParser OrgTableRow
-tableAlignRow = try $
- OrgAlignRow <$> (tableStart *> manyTill tableAlignCell newline)
-
-tableAlignCell :: OrgParser Alignment
-tableAlignCell =
- choice [ try $ emptyCell *> return AlignDefault
- , try $ skipSpaces
- *> char '<'
- *> tableAlignFromChar
- <* many digit
- <* char '>'
- <* emptyCell
- ] <?> "alignment info"
- where emptyCell = try $ skipSpaces *> endOfCell
-
-tableAlignFromChar :: OrgParser Alignment
-tableAlignFromChar = try $ choice [ char 'l' *> return AlignLeft
- , char 'c' *> return AlignCenter
- , char 'r' *> return AlignRight
- ]
-
-tableHline :: OrgParser OrgTableRow
-tableHline = try $
- OrgHlineRow <$ (tableStart *> char '-' *> anyLine)
-
-rowsToTable :: [OrgTableRow]
- -> F OrgTable
-rowsToTable = foldM (flip rowToContent) zeroTable
- where zeroTable = OrgTable 0 mempty mempty mempty
-
-normalizeTable :: OrgTable
- -> OrgTable
-normalizeTable (OrgTable cols aligns heads lns) =
- let aligns' = fillColumns aligns AlignDefault
- heads' = if heads == mempty
- then mempty
- else fillColumns heads (B.plain mempty)
- lns' = map (`fillColumns` B.plain mempty) lns
- fillColumns base padding = take cols $ base ++ repeat padding
- in OrgTable cols aligns' heads' lns'
-
-
--- One or more horizontal rules after the first content line mark the previous
--- line as a header. All other horizontal lines are discarded.
-rowToContent :: OrgTableRow
- -> OrgTable
- -> F OrgTable
-rowToContent OrgHlineRow t = maybeBodyToHeader t
-rowToContent (OrgAlignRow as) t = setLongestRow as =<< setAligns as t
-rowToContent (OrgContentRow rf) t = do
- rs <- rf
- setLongestRow rs =<< appendToBody rs t
-
-setLongestRow :: [a]
- -> OrgTable
- -> F OrgTable
-setLongestRow rs t =
- return t{ orgTableColumns = max (length rs) (orgTableColumns t) }
-
-maybeBodyToHeader :: OrgTable
- -> F OrgTable
-maybeBodyToHeader t = case t of
- OrgTable{ orgTableHeader = [], orgTableRows = b:[] } ->
- return t{ orgTableHeader = b , orgTableRows = [] }
- _ -> return t
-
-appendToBody :: [Blocks]
- -> OrgTable
- -> F OrgTable
-appendToBody r t = return t{ orgTableRows = orgTableRows t ++ [r] }
-
-setAligns :: [Alignment]
- -> OrgTable
- -> F OrgTable
-setAligns aligns t = return $ t{ orgTableAlignments = aligns }
-
-
---
--- LaTeX fragments
---
-latexFragment :: OrgParser (F Blocks)
-latexFragment = try $ do
- envName <- latexEnvStart
- content <- mconcat <$> manyTill anyLineNewline (latexEnd envName)
- return . return $ B.rawBlock "latex" (content `inLatexEnv` envName)
- where
- c `inLatexEnv` e = mconcat [ "\\begin{", e, "}\n"
- , c
- , "\\end{", e, "}\n"
- ]
-
-latexEnvStart :: OrgParser String
-latexEnvStart = try $ do
- skipSpaces *> string "\\begin{"
- *> latexEnvName
- <* string "}"
- <* blankline
-
-latexEnd :: String -> OrgParser ()
-latexEnd envName = try $
- () <$ skipSpaces
- <* string ("\\end{" ++ envName ++ "}")
- <* blankline
-
--- | Parses a LaTeX environment name.
-latexEnvName :: OrgParser String
-latexEnvName = try $ do
- mappend <$> many1 alphaNum
- <*> option "" (string "*")
-
-
---
--- Footnote definitions
---
-noteBlock :: OrgParser (F Blocks)
-noteBlock = try $ do
- ref <- noteMarker <* skipSpaces
- content <- mconcat <$> blocksTillHeaderOrNote
- addToNotesTable (ref, content)
- return mempty
- where
- blocksTillHeaderOrNote =
- many1Till block (eof <|> () <$ lookAhead noteMarker
- <|> () <$ lookAhead headerStart)
-
--- Paragraphs or Plain text
-paraOrPlain :: OrgParser (F Blocks)
-paraOrPlain = try $ do
- ils <- parseInlines
- nl <- option False (newline *> return True)
- -- Read block as paragraph, except if we are in a list context and the block
- -- is directly followed by a list item, in which case the block is read as
- -- plain text.
- try (guard nl
- *> notFollowedBy (inList *> (orderedListStart <|> bulletListStart))
- *> return (B.para <$> ils))
- <|> (return (B.plain <$> ils))
-
-inlinesTillNewline :: OrgParser (F Inlines)
-inlinesTillNewline = trimInlinesF . mconcat <$> manyTill inline newline
-
-
---
--- list blocks
---
-
-list :: OrgParser (F Blocks)
-list = choice [ definitionList, bulletList, orderedList ] <?> "list"
-
-definitionList :: OrgParser (F Blocks)
-definitionList = try $ do n <- lookAhead (bulletListStart' Nothing)
- fmap B.definitionList . fmap compactify'DL . sequence
- <$> many1 (definitionListItem $ bulletListStart' (Just n))
-
-bulletList :: OrgParser (F Blocks)
-bulletList = try $ do n <- lookAhead (bulletListStart' Nothing)
- fmap B.bulletList . fmap compactify' . sequence
- <$> many1 (listItem (bulletListStart' $ Just n))
-
-orderedList :: OrgParser (F Blocks)
-orderedList = fmap B.orderedList . fmap compactify' . sequence
- <$> many1 (listItem orderedListStart)
-
-genericListStart :: OrgParser String
- -> OrgParser Int
-genericListStart listMarker = try $
- (+) <$> (length <$> many spaceChar)
- <*> (length <$> listMarker <* many1 spaceChar)
-
--- parses bullet list marker. maybe we know the indent level
-bulletListStart :: OrgParser Int
-bulletListStart = bulletListStart' Nothing
-
-bulletListStart' :: Maybe Int -> OrgParser Int
--- returns length of bulletList prefix, inclusive of marker
-bulletListStart' Nothing = do ind <- length <$> many spaceChar
- when (ind == 0) $ notFollowedBy (char '*')
- oneOf bullets
- many1 spaceChar
- return (ind + 1)
- -- Unindented lists are legal, but they can't use '*' bullets
- -- We return n to maintain compatibility with the generic listItem
-bulletListStart' (Just n) = do count (n-1) spaceChar
- when (n == 1) $ notFollowedBy (char '*')
- oneOf bullets
- many1 spaceChar
- return n
-
-bullets :: String
-bullets = "*+-"
-
-orderedListStart :: OrgParser Int
-orderedListStart = genericListStart orderedListMarker
- -- Ordered list markers allowed in org-mode
- where orderedListMarker = mappend <$> many1 digit <*> (pure <$> oneOf ".)")
-
-definitionListItem :: OrgParser Int
- -> OrgParser (F (Inlines, [Blocks]))
-definitionListItem parseMarkerGetLength = try $ do
- markerLength <- parseMarkerGetLength
- term <- manyTill (noneOf "\n\r") (try definitionMarker)
- line1 <- anyLineNewline
- blank <- option "" ("\n" <$ blankline)
- cont <- concat <$> many (listContinuation markerLength)
- term' <- parseFromString parseInlines term
- contents' <- parseFromString parseBlocks $ line1 ++ blank ++ cont
- return $ (,) <$> term' <*> fmap (:[]) contents'
- where
- definitionMarker =
- spaceChar *> string "::" <* (spaceChar <|> lookAhead P.newline)
-
-
--- parse raw text for one list item, excluding start marker and continuations
-listItem :: OrgParser Int
- -> OrgParser (F Blocks)
-listItem start = try . withContext ListItemState $ do
- markerLength <- try start
- firstLine <- anyLineNewline
- blank <- option "" ("\n" <$ blankline)
- rest <- concat <$> many (listContinuation markerLength)
- parseFromString parseBlocks $ firstLine ++ blank ++ rest
-
--- continuation of a list item - indented and separated by blankline or endline.
--- Note: nested lists are parsed as continuations.
-listContinuation :: Int
- -> OrgParser String
-listContinuation markerLength = try $
- notFollowedBy' blankline
- *> (mappend <$> (concat <$> many1 listLine)
- <*> many blankline)
- where listLine = try $ indentWith markerLength *> anyLineNewline
-
-anyLineNewline :: OrgParser String
-anyLineNewline = (++ "\n") <$> anyLine
-
-
---
--- inline
---
-
-inline :: OrgParser (F Inlines)
-inline =
- choice [ whitespace
- , linebreak
- , cite
- , footnote
- , linkOrImage
- , anchor
- , inlineCodeBlock
- , str
- , endline
- , emph
- , strong
- , strikeout
- , underline
- , code
- , math
- , displayMath
- , verbatim
- , subscript
- , superscript
- , inlineLaTeX
- , smart
- , symbol
- ] <* (guard =<< newlinesCountWithinLimits)
- <?> "inline"
-
-parseInlines :: OrgParser (F Inlines)
-parseInlines = trimInlinesF . mconcat <$> many1 inline
-
--- treat these as potentially non-text when parsing inline:
-specialChars :: [Char]
-specialChars = "\"$'()*+-,./:<=>[\\]^_{|}~"
-
-
-whitespace :: OrgParser (F Inlines)
-whitespace = pure B.space <$ skipMany1 spaceChar
- <* updateLastPreCharPos
- <* updateLastForbiddenCharPos
- <?> "whitespace"
-
-linebreak :: OrgParser (F Inlines)
-linebreak = try $ pure B.linebreak <$ string "\\\\" <* skipSpaces <* newline
-
-str :: OrgParser (F Inlines)
-str = return . B.str <$> many1 (noneOf $ specialChars ++ "\n\r ")
- <* updateLastStrPos
-
--- | An endline character that can be treated as a space, not a structural
--- break. This should reflect the values of the Emacs variable
--- @org-element-paragraph-separate@.
-endline :: OrgParser (F Inlines)
-endline = try $ do
- newline
- notFollowedBy blankline
- notFollowedBy' exampleLine
- notFollowedBy' hline
- notFollowedBy' noteMarker
- notFollowedBy' tableStart
- notFollowedBy' drawerStart
- notFollowedBy' headerStart
- notFollowedBy' metaLineStart
- notFollowedBy' latexEnvStart
- notFollowedBy' commentLineStart
- notFollowedBy' bulletListStart
- notFollowedBy' orderedListStart
- decEmphasisNewlinesCount
- guard =<< newlinesCountWithinLimits
- updateLastPreCharPos
- return . return $ B.softbreak
-
-cite :: OrgParser (F Inlines)
-cite = try $ do
- guardEnabled Ext_citations
- (cs, raw) <- withRaw normalCite
- return $ (flip B.cite (B.text raw)) <$> cs
-
-normalCite :: OrgParser (F [Citation])
-normalCite = try $ char '['
- *> skipSpaces
- *> citeList
- <* skipSpaces
- <* char ']'
-
-citeList :: OrgParser (F [Citation])
-citeList = sequence <$> sepBy1 citation (try $ char ';' *> skipSpaces)
-
-citation :: OrgParser (F Citation)
-citation = try $ do
- pref <- prefix
- (suppress_author, key) <- citeKey
- suff <- suffix
- return $ do
- x <- pref
- y <- suff
- return $ Citation{ citationId = key
- , citationPrefix = B.toList x
- , citationSuffix = B.toList y
- , citationMode = if suppress_author
- then SuppressAuthor
- else NormalCitation
- , citationNoteNum = 0
- , citationHash = 0
- }
- where
- prefix = trimInlinesF . mconcat <$>
- manyTill inline (char ']' <|> (']' <$ lookAhead citeKey))
- suffix = try $ do
- hasSpace <- option False (notFollowedBy nonspaceChar >> return True)
- skipSpaces
- rest <- trimInlinesF . mconcat <$>
- many (notFollowedBy (oneOf ";]") *> inline)
- return $ if hasSpace
- then (B.space <>) <$> rest
- else rest
-
-footnote :: OrgParser (F Inlines)
-footnote = try $ inlineNote <|> referencedNote
-
-inlineNote :: OrgParser (F Inlines)
-inlineNote = try $ do
- string "[fn:"
- ref <- many alphaNum
- char ':'
- note <- fmap B.para . trimInlinesF . mconcat <$> many1Till inline (char ']')
- when (not $ null ref) $
- addToNotesTable ("fn:" ++ ref, note)
- return $ B.note <$> note
-
-referencedNote :: OrgParser (F Inlines)
-referencedNote = try $ do
- ref <- noteMarker
- return $ do
- notes <- asksF orgStateNotes'
- case lookup ref notes of
- Nothing -> return $ B.str $ "[" ++ ref ++ "]"
- Just contents -> do
- st <- askF
- let contents' = runF contents st{ orgStateNotes' = [] }
- return $ B.note contents'
-
-noteMarker :: OrgParser String
-noteMarker = try $ do
- char '['
- choice [ many1Till digit (char ']')
- , (++) <$> string "fn:"
- <*> many1Till (noneOf "\n\r\t ") (char ']')
- ]
-
-linkOrImage :: OrgParser (F Inlines)
-linkOrImage = explicitOrImageLink
- <|> selflinkOrImage
- <|> angleLink
- <|> plainLink
- <?> "link or image"
-
-explicitOrImageLink :: OrgParser (F Inlines)
-explicitOrImageLink = try $ do
- char '['
- srcF <- applyCustomLinkFormat =<< possiblyEmptyLinkTarget
- title <- enclosedRaw (char '[') (char ']')
- title' <- parseFromString (mconcat <$> many inline) title
- char ']'
- return $ do
- src <- srcF
- if isImageFilename title
- then pure $ B.link src "" $ B.image title mempty mempty
- else linkToInlinesF src =<< title'
-
-selflinkOrImage :: OrgParser (F Inlines)
-selflinkOrImage = try $ do
- src <- char '[' *> linkTarget <* char ']'
- return $ linkToInlinesF src (B.str src)
-
-plainLink :: OrgParser (F Inlines)
-plainLink = try $ do
- (orig, src) <- uri
- returnF $ B.link src "" (B.str orig)
-
-angleLink :: OrgParser (F Inlines)
-angleLink = try $ do
- char '<'
- link <- plainLink
- char '>'
- return link
-
-selfTarget :: OrgParser String
-selfTarget = try $ char '[' *> linkTarget <* char ']'
-
-linkTarget :: OrgParser String
-linkTarget = enclosedByPair '[' ']' (noneOf "\n\r[]")
-
-possiblyEmptyLinkTarget :: OrgParser String
-possiblyEmptyLinkTarget = try linkTarget <|> ("" <$ string "[]")
-
-applyCustomLinkFormat :: String -> OrgParser (F String)
-applyCustomLinkFormat link = do
- let (linkType, rest) = break (== ':') link
- return $ do
- formatter <- M.lookup linkType <$> asksF orgStateLinkFormatters
- return $ maybe link ($ drop 1 rest) formatter
-
--- | Take a link and return a function which produces new inlines when given
--- description inlines.
-linkToInlinesF :: String -> Inlines -> F Inlines
-linkToInlinesF linkStr =
- case linkStr of
- "" -> pure . B.link mempty "" -- wiki link (empty by convention)
- ('#':_) -> pure . B.link linkStr "" -- document-local fragment
- _ -> case cleanLinkString linkStr of
- (Just cleanedLink) -> if isImageFilename cleanedLink
- then const . pure $ B.image cleanedLink "" ""
- else pure . B.link cleanedLink ""
- Nothing -> internalLink linkStr -- other internal link
-
--- | Cleanup and canonicalize a string describing a link. Return @Nothing@ if
--- the string does not appear to be a link.
-cleanLinkString :: String -> Maybe String
-cleanLinkString s =
- case s of
- '/':_ -> Just $ "file://" ++ s -- absolute path
- '.':'/':_ -> Just s -- relative path
- '.':'.':'/':_ -> Just s -- relative path
- -- Relative path or URL (file scheme)
- 'f':'i':'l':'e':':':s' -> Just $ if ("//" `isPrefixOf` s') then s else s'
- _ | isUrl s -> Just s -- URL
- _ -> Nothing
- where
- isUrl :: String -> Bool
- isUrl cs =
- let (scheme, path) = break (== ':') cs
- in all (\c -> isAlphaNum c || c `elem` (".-"::String)) scheme
- && not (null path)
-
-isImageFilename :: String -> Bool
-isImageFilename filename =
- any (\x -> ('.':x) `isSuffixOf` filename) imageExtensions &&
- (any (\x -> (x++":") `isPrefixOf` filename) protocols ||
- ':' `notElem` filename)
- where
- imageExtensions = [ "jpeg" , "jpg" , "png" , "gif" , "svg" ]
- protocols = [ "file", "http", "https" ]
-
-internalLink :: String -> Inlines -> F Inlines
-internalLink link title = do
- anchorB <- (link `elem`) <$> asksF orgStateAnchorIds
- if anchorB
- then return $ B.link ('#':link) "" title
- else return $ B.emph title
-
--- | Parse an anchor like @<<anchor-id>>@ and return an empty span with
--- @anchor-id@ set as id. Legal anchors in org-mode are defined through
--- @org-target-regexp@, which is fairly liberal. Since no link is created if
--- @anchor-id@ contains spaces, we are more restrictive in what is accepted as
--- an anchor.
-
-anchor :: OrgParser (F Inlines)
-anchor = try $ do
- anchorId <- parseAnchor
- recordAnchorId anchorId
- returnF $ B.spanWith (solidify anchorId, [], []) mempty
- where
- parseAnchor = string "<<"
- *> many1 (noneOf "\t\n\r<>\"' ")
- <* string ">>"
- <* skipSpaces
-
--- | Replace every char but [a-zA-Z0-9_.-:] with a hyphen '-'. This mirrors
--- the org function @org-export-solidify-link-text@.
-
-solidify :: String -> String
-solidify = map replaceSpecialChar
- where replaceSpecialChar c
- | isAlphaNum c = c
- | c `elem` ("_.-:" :: String) = c
- | otherwise = '-'
-
--- | Parses an inline code block and marks it as a babel block.
-inlineCodeBlock :: OrgParser (F Inlines)
-inlineCodeBlock = try $ do
- string "src_"
- lang <- many1 orgArgWordChar
- opts <- option [] $ enclosedByPair '[' ']' inlineBlockOption
- inlineCode <- enclosedByPair '{' '}' (noneOf "\n\r")
- let attrClasses = [translateLang lang, rundocBlockClass]
- let attrKeyVal = map toRundocAttrib (("language", lang) : opts)
- returnF $ B.codeWith ("", attrClasses, attrKeyVal) inlineCode
-
-enclosedByPair :: Char -- ^ opening char
- -> Char -- ^ closing char
- -> OrgParser a -- ^ parser
- -> OrgParser [a]
-enclosedByPair s e p = char s *> many1Till p (char e)
-
-emph :: OrgParser (F Inlines)
-emph = fmap B.emph <$> emphasisBetween '/'
-
-strong :: OrgParser (F Inlines)
-strong = fmap B.strong <$> emphasisBetween '*'
-
-strikeout :: OrgParser (F Inlines)
-strikeout = fmap B.strikeout <$> emphasisBetween '+'
-
--- There is no underline, so we use strong instead.
-underline :: OrgParser (F Inlines)
-underline = fmap B.strong <$> emphasisBetween '_'
-
-verbatim :: OrgParser (F Inlines)
-verbatim = return . B.code <$> verbatimBetween '='
-
-code :: OrgParser (F Inlines)
-code = return . B.code <$> verbatimBetween '~'
-
-subscript :: OrgParser (F Inlines)
-subscript = fmap B.subscript <$> try (char '_' *> subOrSuperExpr)
-
-superscript :: OrgParser (F Inlines)
-superscript = fmap B.superscript <$> try (char '^' *> subOrSuperExpr)
-
-math :: OrgParser (F Inlines)
-math = return . B.math <$> choice [ math1CharBetween '$'
- , mathStringBetween '$'
- , rawMathBetween "\\(" "\\)"
- ]
-
-displayMath :: OrgParser (F Inlines)
-displayMath = return . B.displayMath <$> choice [ rawMathBetween "\\[" "\\]"
- , rawMathBetween "$$" "$$"
- ]
-
-updatePositions :: Char
- -> OrgParser (Char)
-updatePositions c = do
- when (c `elem` emphasisPreChars) updateLastPreCharPos
- when (c `elem` emphasisForbiddenBorderChars) updateLastForbiddenCharPos
- return c
-
-symbol :: OrgParser (F Inlines)
-symbol = return . B.str . (: "") <$> (oneOf specialChars >>= updatePositions)
-
-emphasisBetween :: Char
- -> OrgParser (F Inlines)
-emphasisBetween c = try $ do
- startEmphasisNewlinesCounting emphasisAllowedNewlines
- res <- enclosedInlines (emphasisStart c) (emphasisEnd c)
- isTopLevelEmphasis <- null . orgStateEmphasisCharStack <$> getState
- when isTopLevelEmphasis
- resetEmphasisNewlines
- return res
-
-verbatimBetween :: Char
- -> OrgParser String
-verbatimBetween c = try $
- emphasisStart c *>
- many1TillNOrLessNewlines 1 (noneOf "\n\r") (emphasisEnd c)
-
--- | Parses a raw string delimited by @c@ using Org's math rules
-mathStringBetween :: Char
- -> OrgParser String
-mathStringBetween c = try $ do
- mathStart c
- body <- many1TillNOrLessNewlines mathAllowedNewlines
- (noneOf (c:"\n\r"))
- (lookAhead $ mathEnd c)
- final <- mathEnd c
- return $ body ++ [final]
-
--- | Parse a single character between @c@ using math rules
-math1CharBetween :: Char
- -> OrgParser String
-math1CharBetween c = try $ do
- char c
- res <- noneOf $ c:mathForbiddenBorderChars
- char c
- eof <|> () <$ lookAhead (oneOf mathPostChars)
- return [res]
-
-rawMathBetween :: String
- -> String
- -> OrgParser String
-rawMathBetween s e = try $ string s *> manyTill anyChar (try $ string e)
-
--- | Parses the start (opening character) of emphasis
-emphasisStart :: Char -> OrgParser Char
-emphasisStart c = try $ do
- guard =<< afterEmphasisPreChar
- guard =<< notAfterString
- char c
- lookAhead (noneOf emphasisForbiddenBorderChars)
- pushToInlineCharStack c
- return c
-
--- | Parses the closing character of emphasis
-emphasisEnd :: Char -> OrgParser Char
-emphasisEnd c = try $ do
- guard =<< notAfterForbiddenBorderChar
- char c
- eof <|> () <$ lookAhead acceptablePostChars
- updateLastStrPos
- popInlineCharStack
- return c
- where acceptablePostChars =
- surroundingEmphasisChar >>= \x -> oneOf (x ++ emphasisPostChars)
-
-mathStart :: Char -> OrgParser Char
-mathStart c = try $
- char c <* notFollowedBy' (oneOf (c:mathForbiddenBorderChars))
-
-mathEnd :: Char -> OrgParser Char
-mathEnd c = try $ do
- res <- noneOf (c:mathForbiddenBorderChars)
- char c
- eof <|> () <$ lookAhead (oneOf mathPostChars)
- return res
-
-
-enclosedInlines :: OrgParser a
- -> OrgParser b
- -> OrgParser (F Inlines)
-enclosedInlines start end = try $
- trimInlinesF . mconcat <$> enclosed start end inline
-
-enclosedRaw :: OrgParser a
- -> OrgParser b
- -> OrgParser String
-enclosedRaw start end = try $
- start *> (onSingleLine <|> spanningTwoLines)
- where onSingleLine = try $ many1Till (noneOf "\n\r") end
- spanningTwoLines = try $
- anyLine >>= \f -> mappend (f <> " ") <$> onSingleLine
-
--- | Like many1Till, but parses at most @n+1@ lines. @p@ must not consume
--- newlines.
-many1TillNOrLessNewlines :: Int
- -> OrgParser Char
- -> OrgParser a
- -> OrgParser String
-many1TillNOrLessNewlines n p end = try $
- nMoreLines (Just n) mempty >>= oneOrMore
- where
- nMoreLines Nothing cs = return cs
- nMoreLines (Just 0) cs = try $ (cs ++) <$> finalLine
- nMoreLines k cs = try $ (final k cs <|> rest k cs)
- >>= uncurry nMoreLines
- final _ cs = (\x -> (Nothing, cs ++ x)) <$> try finalLine
- rest m cs = (\x -> (minus1 <$> m, cs ++ x ++ "\n")) <$> try (manyTill p P.newline)
- finalLine = try $ manyTill p end
- minus1 k = k - 1
- oneOrMore cs = guard (not $ null cs) *> return cs
-
--- Org allows customization of the way it reads emphasis. We use the defaults
--- here (see, e.g., the Emacs Lisp variable `org-emphasis-regexp-components`
--- for details).
-
--- | Chars allowed to occur before emphasis (spaces and newlines are ok, too)
-emphasisPreChars :: [Char]
-emphasisPreChars = "\t \"'({"
-
--- | Chars allowed after emphasis
-emphasisPostChars :: [Char]
-emphasisPostChars = "\t\n !\"'),-.:;?\\}"
-
--- | Chars not allowed at the (inner) border of emphasis
-emphasisForbiddenBorderChars :: [Char]
-emphasisForbiddenBorderChars = "\t\n\r \"',"
-
--- | The maximum number of newlines within emphasized text
-emphasisAllowedNewlines :: Int
-emphasisAllowedNewlines = 1
-
--- LaTeX-style math: see `org-latex-regexps` for details
-
--- | Chars allowed after an inline ($...$) math statement
-mathPostChars :: [Char]
-mathPostChars = "\t\n \"'),-.:;?"
-
--- | Chars not allowed at the (inner) border of math
-mathForbiddenBorderChars :: [Char]
-mathForbiddenBorderChars = "\t\n\r ,;.$"
-
--- | Maximum number of newlines in an inline math statement
-mathAllowedNewlines :: Int
-mathAllowedNewlines = 2
-
--- | Whether we are right behind a char allowed before emphasis
-afterEmphasisPreChar :: OrgParser Bool
-afterEmphasisPreChar = do
- pos <- getPosition
- lastPrePos <- orgStateLastPreCharPos <$> getState
- return . fromMaybe True $ (== pos) <$> lastPrePos
-
--- | Whether the parser is right after a forbidden border char
-notAfterForbiddenBorderChar :: OrgParser Bool
-notAfterForbiddenBorderChar = do
- pos <- getPosition
- lastFBCPos <- orgStateLastForbiddenCharPos <$> getState
- return $ lastFBCPos /= Just pos
-
--- | Read a sub- or superscript expression
-subOrSuperExpr :: OrgParser (F Inlines)
-subOrSuperExpr = try $
- choice [ id <$> charsInBalanced '{' '}' (noneOf "\n\r")
- , enclosing ('(', ')') <$> charsInBalanced '(' ')' (noneOf "\n\r")
- , simpleSubOrSuperString
- ] >>= parseFromString (mconcat <$> many inline)
- where enclosing (left, right) s = left : s ++ [right]
-
-simpleSubOrSuperString :: OrgParser String
-simpleSubOrSuperString = try $
- choice [ string "*"
- , mappend <$> option [] ((:[]) <$> oneOf "+-")
- <*> many1 alphaNum
- ]
-
-inlineLaTeX :: OrgParser (F Inlines)
-inlineLaTeX = try $ do
- cmd <- inlineLaTeXCommand
- maybe mzero returnF $
- parseAsMath cmd `mplus` parseAsMathMLSym cmd `mplus` parseAsInlineLaTeX cmd
- where
- parseAsMath :: String -> Maybe Inlines
- parseAsMath cs = B.fromList <$> texMathToPandoc cs
-
- parseAsInlineLaTeX :: String -> Maybe Inlines
- parseAsInlineLaTeX cs = maybeRight $ runParser inlineCommand state "" cs
-
- parseAsMathMLSym :: String -> Maybe Inlines
- parseAsMathMLSym cs = B.str <$> MathMLEntityMap.getUnicode (clean cs)
- -- dropWhileEnd would be nice here, but it's not available before base 4.5
- where clean = reverse . dropWhile (`elem` ("{}" :: String)) . reverse . drop 1
-
- state :: ParserState
- state = def{ stateOptions = def{ readerParseRaw = True }}
-
- texMathToPandoc inp = (maybeRight $ readTeX inp) >>=
- writePandoc DisplayInline
-
-maybeRight :: Either a b -> Maybe b
-maybeRight = either (const Nothing) Just
-
-inlineLaTeXCommand :: OrgParser String
-inlineLaTeXCommand = try $ do
- rest <- getInput
- case runParser rawLaTeXInline def "source" rest of
- Right (RawInline _ cs) -> do
- let len = length cs
- count len anyChar
- return cs
- _ -> mzero
-
-smart :: OrgParser (F Inlines)
-smart = do
- getOption readerSmart >>= guard
- doubleQuoted <|> singleQuoted <|>
- choice (map (return <$>) [orgApostrophe, orgDash, orgEllipses])
- where
- orgDash = dash <* updatePositions '-'
- orgEllipses = ellipses <* updatePositions '.'
- orgApostrophe =
- (char '\'' <|> char '\8217') <* updateLastPreCharPos
- <* updateLastForbiddenCharPos
- *> return (B.str "\x2019")
-
-singleQuoted :: OrgParser (F Inlines)
-singleQuoted = try $ do
- singleQuoteStart
- updatePositions '\''
- withQuoteContext InSingleQuote $
- fmap B.singleQuoted . trimInlinesF . mconcat <$>
- many1Till inline (singleQuoteEnd <* updatePositions '\'')
-
--- doubleQuoted will handle regular double-quoted sections, as well
--- as dialogues with an open double-quote without a close double-quote
--- in the same paragraph.
-doubleQuoted :: OrgParser (F Inlines)
-doubleQuoted = try $ do
- doubleQuoteStart
- updatePositions '"'
- contents <- mconcat <$> many (try $ notFollowedBy doubleQuoteEnd >> inline)
- (withQuoteContext InDoubleQuote $ (doubleQuoteEnd <* updateLastForbiddenCharPos) >> return
- (fmap B.doubleQuoted . trimInlinesF $ contents))
- <|> (return $ return (B.str "\8220") <> contents)
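Note on the `parseFormat` helper removed from Org.hs above: its comment describes it as an ad-hoc, single-argument printf-style format parser used for `#+LINK` abbreviations. The sketch below restates that behaviour as a pure function, without Parsec, to make the substitution rule easier to see. It is an illustration under stated assumptions, not pandoc's code: `expandLinkFormat` and `breakOn` are hypothetical names, only the `%s` specifier is handled (the original also accepts `%h`, which URL-encodes the argument first), and the example URL is invented.

```haskell
import Data.List (isPrefixOf)

-- | Hypothetical helper, not part of pandoc. Expands a link template
-- the way the removed 'parseFormat' does for the "%s" case:
--   * if the template contains "%s", the first occurrence is replaced
--     by the argument;
--   * otherwise the argument is appended to the template.
expandLinkFormat :: String -> String -> String
expandLinkFormat template arg =
  case breakOn "%s" template of
    Just (before, after) -> before ++ arg ++ after
    Nothing              -> template ++ arg

-- | Hypothetical helper. Splits a string at the first occurrence of a
-- non-empty needle, dropping the needle itself.
breakOn :: String -> String -> Maybe (String, String)
breakOn needle = go []
  where
    go _ [] = Nothing
    go acc rest@(c:cs)
      | needle `isPrefixOf` rest = Just (reverse acc, drop (length needle) rest)
      | otherwise                = go (c : acc) cs
```

With a template such as `https://example.org/%s.html`, the argument `foo` would be spliced in place of `%s`; with a template that has no specifier, the argument is simply appended, which matches the `justAppend` fallback in the removed code.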
diff --git a/src/Text/Pandoc/Readers/Org/BlockStarts.hs b/src/Text/Pandoc/Readers/Org/BlockStarts.hs
new file mode 100644
index 000000000..e4dc31342
--- /dev/null
+++ b/src/Text/Pandoc/Readers/Org/BlockStarts.hs
@@ -0,0 +1,112 @@
+{-
+Copyright (C) 2014-2016 Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+-}
+
+{- |
+ Module : Text.Pandoc.Readers.Org.BlockStarts
+ Copyright : Copyright (C) 2014-2016 Albert Krewinkel
+ License : GNU GPL, version 2 or above
+
+ Maintainer : Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+Parsers for the start of Org-mode block elements.
+-}
+module Text.Pandoc.Readers.Org.BlockStarts
+ ( exampleLineStart
+ , hline
+ , noteMarker
+ , tableStart
+ , drawerStart
+ , headerStart
+ , metaLineStart
+ , latexEnvStart
+ , commentLineStart
+ , bulletListStart
+ , orderedListStart
+ ) where
+
+import Text.Pandoc.Readers.Org.Parsing
+
+-- | Horizontal line (five or more dashes)
+hline :: OrgParser ()
+hline = try $ do
+ skipSpaces
+ string "-----"
+ many (char '-')
+ skipSpaces
+ newline
+ return ()
+
+-- | Read the start of a header line, return the header level
+headerStart :: OrgParser Int
+headerStart = try $
+ (length <$> many1 (char '*')) <* many1 (char ' ') <* updateLastPreCharPos
+
+tableStart :: OrgParser Char
+tableStart = try $ skipSpaces *> char '|'
+
+latexEnvStart :: OrgParser String
+latexEnvStart = try $ do
+ skipSpaces *> string "\\begin{"
+ *> latexEnvName
+ <* string "}"
+ <* blankline
+ where
+ latexEnvName :: OrgParser String
+ latexEnvName = try $ mappend <$> many1 alphaNum <*> option "" (string "*")
+
+
+-- | Parses bullet list marker.
+bulletListStart :: OrgParser ()
+bulletListStart = try $
+ choice
+ [ () <$ skipSpaces <* oneOf "+-" <* skipSpaces1
+ , () <$ skipSpaces1 <* char '*' <* skipSpaces1
+ ]
+
+genericListStart :: OrgParser String
+ -> OrgParser Int
+genericListStart listMarker = try $
+ (+) <$> (length <$> many spaceChar)
+ <*> (length <$> listMarker <* many1 spaceChar)
+
+orderedListStart :: OrgParser Int
+orderedListStart = genericListStart orderedListMarker
+ -- Ordered list markers allowed in org-mode
+ where orderedListMarker = mappend <$> many1 digit <*> (pure <$> oneOf ".)")
+
+drawerStart :: OrgParser String
+drawerStart = try $
+ skipSpaces *> drawerName <* skipSpaces <* newline
+ where drawerName = char ':' *> manyTill nonspaceChar (char ':')
+
+metaLineStart :: OrgParser ()
+metaLineStart = try $ skipSpaces <* string "#+"
+
+commentLineStart :: OrgParser ()
+commentLineStart = try $ skipSpaces <* string "# "
+
+exampleLineStart :: OrgParser ()
+exampleLineStart = () <$ try (skipSpaces *> string ": ")
+
+noteMarker :: OrgParser String
+noteMarker = try $ do
+ char '['
+ choice [ many1Till digit (char ']')
+ , (++) <$> string "fn:"
+ <*> many1Till (noneOf "\n\r\t ") (char ']')
+ ]
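Reviewer note on `genericListStart` / `orderedListStart`: the value they return is the number of leading spaces plus the length of the marker itself, and the spaces after the marker are consumed but not counted. A minimal sketch of the same idea in plain Parsec follows. The names `orderedMarker` and `orderedStart` are hypothetical, and `char ' '` stands in for pandoc's `spaceChar`, so tabs are not handled here.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for orderedListMarker: digits followed by '.' or ')'.
orderedMarker :: Parser String
orderedMarker = (++) <$> many1 digit <*> (pure <$> oneOf ".)")

-- Hypothetical stand-in for orderedListStart: returns the indentation plus
-- the marker length. At least one space must follow the marker; those
-- trailing spaces are consumed but not included in the result.
orderedStart :: Parser Int
orderedStart = try $
  (+) <$> (length <$> many (char ' '))
      <*> (length <$> orderedMarker <* many1 (char ' '))

main :: IO ()
main = mapM_ (print . parse orderedStart "")
  [ "  12. item"  -- two spaces of indentation, marker "12."
  , "3) item"     -- no indentation, marker "3)"
  , "3.item"      -- no space after the marker, so the parse fails
  ]
```

In the Blocks module this length is what `listItem` hands to `listContinuation` to decide how far continuation lines must be indented.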
diff --git a/src/Text/Pandoc/Readers/Org/Blocks.hs b/src/Text/Pandoc/Readers/Org/Blocks.hs
new file mode 100644
index 000000000..75e564f2f
--- /dev/null
+++ b/src/Text/Pandoc/Readers/Org/Blocks.hs
@@ -0,0 +1,901 @@
+{-# LANGUAGE FlexibleContexts #-}
+{-
+Copyright (C) 2014-2016 Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+-}
+
+{- |
+ Module : Text.Pandoc.Readers.Org.Blocks
+ Copyright : Copyright (C) 2014-2016 Albert Krewinkel
+ License : GNU GPL, version 2 or above
+
+ Maintainer : Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+Parsers for Org-mode block elements.
+-}
+module Text.Pandoc.Readers.Org.Blocks
+ ( blockList
+ , meta
+ ) where
+
+import Text.Pandoc.Readers.Org.BlockStarts
+import Text.Pandoc.Readers.Org.Inlines
+import Text.Pandoc.Readers.Org.ParserState
+import Text.Pandoc.Readers.Org.Parsing
+import Text.Pandoc.Readers.Org.Shared
+ ( isImageFilename, rundocBlockClass, toRundocAttrib
+ , translateLang )
+
+import qualified Text.Pandoc.Builder as B
+import Text.Pandoc.Builder ( Inlines, Blocks )
+import Text.Pandoc.Definition
+import Text.Pandoc.Compat.Monoid ((<>))
+import Text.Pandoc.Options
+import Text.Pandoc.Shared ( compactify', compactify'DL )
+
+import Control.Monad ( foldM, guard, mzero )
+import Data.Char ( isSpace, toLower, toUpper)
+import Data.List ( foldl', intersperse, isPrefixOf )
+import qualified Data.Map as M
+import Data.Maybe ( fromMaybe, isNothing )
+import Network.HTTP ( urlEncode )
+
+
+--
+-- parsing blocks
+--
+
+-- | Get a list of blocks.
+blockList :: OrgParser [Block]
+blockList = do
+ blocks' <- blocks
+ st <- getState
+ return . B.toList $ runF blocks' st
+
+-- | Get the meta information saved in the state.
+meta :: OrgParser Meta
+meta = do
+ st <- getState
+ return $ runF (orgStateMeta st) st
+
+blocks :: OrgParser (F Blocks)
+blocks = mconcat <$> manyTill block eof
+
+block :: OrgParser (F Blocks)
+block = choice [ mempty <$ blanklines
+ , table
+ , orgBlock
+ , figure
+ , example
+ , genericDrawer
+ , specialLine
+ , header
+ , horizontalRule
+ , list
+ , latexFragment
+ , noteBlock
+ , paraOrPlain
+ ] <?> "block"
+
+
+--
+-- Block Attributes
+--
+
+-- | Attributes that may be added to figures (like a name or caption).
+data BlockAttributes = BlockAttributes
+ { blockAttrName :: Maybe String
+ , blockAttrCaption :: Maybe (F Inlines)
+ , blockAttrKeyValues :: [(String, String)]
+ }
+
+stringyMetaAttribute :: (String -> Bool) -> OrgParser (String, String)
+stringyMetaAttribute attrCheck = try $ do
+ metaLineStart
+ attrName <- map toUpper <$> many1Till nonspaceChar (char ':')
+ guard $ attrCheck attrName
+ skipSpaces
+ attrValue <- anyLine
+ return (attrName, attrValue)
+
+blockAttributes :: OrgParser BlockAttributes
+blockAttributes = try $ do
+ kv <- many (stringyMetaAttribute attrCheck)
+ let caption = foldl' (appendValues "CAPTION") Nothing kv
+ let kvAttrs = foldl' (appendValues "ATTR_HTML") Nothing kv
+ let name = lookup "NAME" kv
+ caption' <- maybe (return Nothing)
+ (fmap Just . parseFromString inlines)
+ caption
+ kvAttrs' <- parseFromString keyValues . (++ "\n") $ fromMaybe mempty kvAttrs
+ return $ BlockAttributes
+ { blockAttrName = name
+ , blockAttrCaption = caption'
+ , blockAttrKeyValues = kvAttrs'
+ }
+ where
+ attrCheck :: String -> Bool
+ attrCheck attr =
+ case attr of
+ "NAME" -> True
+ "CAPTION" -> True
+ "ATTR_HTML" -> True
+ _ -> False
+
+ appendValues :: String -> Maybe String -> (String, String) -> Maybe String
+ appendValues attrName accValue (key, value) =
+ if key /= attrName
+ then accValue
+ else case accValue of
+ Just acc -> Just $ acc ++ ' ':value
+ Nothing -> Just value
+
+keyValues :: OrgParser [(String, String)]
+keyValues = try $
+ manyTill ((,) <$> key <*> value) newline
+ where
+ key :: OrgParser String
+ key = try $ skipSpaces *> char ':' *> many1 nonspaceChar
+
+ value :: OrgParser String
+ value = skipSpaces *> manyTill anyChar endOfValue
+
+ endOfValue :: OrgParser ()
+ endOfValue =
+ lookAhead $ (() <$ try (many1 spaceChar <* key))
+ <|> () <$ newline
+
+
+--
+-- Org Blocks (#+BEGIN_... / #+END_...)
+--
+
+-- | Read an org-mode block delimited by #+BEGIN_TYPE and #+END_TYPE.
+orgBlock :: OrgParser (F Blocks)
+orgBlock = try $ do
+ blockAttrs <- blockAttributes
+ blkType <- blockHeaderStart
+ ($ blkType) $
+ case blkType of
+ "export" -> exportBlock
+ "comment" -> rawBlockLines (const mempty)
+ "html" -> rawBlockLines (return . (B.rawBlock blkType))
+ "latex" -> rawBlockLines (return . (B.rawBlock blkType))
+ "ascii" -> rawBlockLines (return . (B.rawBlock blkType))
+ "example" -> rawBlockLines (return . exampleCode)
+ "quote" -> parseBlockLines (fmap B.blockQuote)
+ "verse" -> verseBlock
+ "src" -> codeBlock blockAttrs
+ _ -> parseBlockLines (fmap $ B.divWith (mempty, [blkType], mempty))
+ where
+ blockHeaderStart :: OrgParser String
+ blockHeaderStart = try $ do
+ skipSpaces
+ blockType <- stringAnyCase "#+begin_" *> orgArgWord
+ return (map toLower blockType)
+
+rawBlockLines :: (String -> F Blocks) -> String -> OrgParser (F Blocks)
+rawBlockLines f blockType = (ignHeaders *> (f <$> rawBlockContent blockType))
+
+parseBlockLines :: (F Blocks -> F Blocks) -> String -> OrgParser (F Blocks)
+parseBlockLines f blockType = (ignHeaders *> (f <$> parsedBlockContent))
+ where
+ parsedBlockContent :: OrgParser (F Blocks)
+ parsedBlockContent = try $ do
+ raw <- rawBlockContent blockType
+ parseFromString blocks (raw ++ "\n")
+
+-- | Read the raw string content of a block
+rawBlockContent :: String -> OrgParser String
+rawBlockContent blockType = try $ do
+ blkLines <- manyTill rawLine blockEnder
+ tabLen <- getOption readerTabStop
+ return
+ . unlines
+ . stripIndent
+ . map (tabsToSpaces tabLen . commaEscaped)
+ $ blkLines
+ where
+ rawLine :: OrgParser String
+ rawLine = try $ ("" <$ blankline) <|> anyLine
+
+ blockEnder :: OrgParser ()
+ blockEnder = try $ skipSpaces <* stringAnyCase ("#+end_" <> blockType)
+
+ stripIndent :: [String] -> [String]
+ stripIndent strs = map (drop (shortestIndent strs)) strs
+
+ shortestIndent :: [String] -> Int
+ shortestIndent = minimum
+ . map (length . takeWhile isSpace)
+ . filter (not . null)
+
+ tabsToSpaces :: Int -> String -> String
+ tabsToSpaces _ [] = []
+ tabsToSpaces tabLen cs'@(c:cs) =
+ case c of
+ ' ' -> ' ':tabsToSpaces tabLen cs
+ '\t' -> (take tabLen $ repeat ' ') ++ tabsToSpaces tabLen cs
+ _ -> cs'
+
+ commaEscaped :: String -> String
+ commaEscaped (',':cs@('*':_)) = cs
+ commaEscaped (',':cs@('#':'+':_)) = cs
+ commaEscaped (' ':cs) = ' ':commaEscaped cs
+ commaEscaped ('\t':cs) = '\t':commaEscaped cs
+ commaEscaped cs = cs
+
+-- | Read but ignore all remaining block headers.
+ignHeaders :: OrgParser ()
+ignHeaders = (() <$ newline) <|> (() <$ anyLine)
+
+-- | Read a block containing code intended for export in specific backends
+-- only.
+exportBlock :: String -> OrgParser (F Blocks)
+exportBlock blockType = try $ do
+ exportType <- skipSpaces *> orgArgWord <* ignHeaders
+ contents <- rawBlockContent blockType
+ returnF (B.rawBlock (map toLower exportType) contents)
+
+verseBlock :: String -> OrgParser (F Blocks)
+verseBlock blockType = try $ do
+ ignHeaders
+ content <- rawBlockContent blockType
+ fmap B.para . mconcat . intersperse (pure B.linebreak)
+ <$> mapM (parseFromString inlines) (map (++ "\n") . lines $ content)
+
+-- | Read a code block and the associated results block if present. Which of
+-- the two blocks is included in the output is determined by the "exports"
+-- argument in the block header.
+codeBlock :: BlockAttributes -> String -> OrgParser (F Blocks)
+codeBlock blockAttrs blockType = do
+ skipSpaces
+ (classes, kv) <- codeHeaderArgs <|> (mempty <$ ignHeaders)
+ content <- rawBlockContent blockType
+ resultsContent <- trailingResultsBlock
+ let id' = fromMaybe mempty $ blockAttrName blockAttrs
+ let includeCode = exportsCode kv
+ let includeResults = exportsResults kv
+ let codeBlck = B.codeBlockWith ( id', classes, kv ) content
+ let labelledBlck = maybe (pure codeBlck)
+ (labelDiv codeBlck)
+ (blockAttrCaption blockAttrs)
+ let resultBlck = fromMaybe mempty resultsContent
+ return $
+ (if includeCode then labelledBlck else mempty) <>
+ (if includeResults then resultBlck else mempty)
+ where
+ labelDiv :: Blocks -> F Inlines -> F Blocks
+ labelDiv blk value =
+ B.divWith nullAttr <$> (mappend <$> labelledBlock value <*> pure blk)
+
+ labelledBlock :: F Inlines -> F Blocks
+ labelledBlock = fmap (B.plain . B.spanWith ("", ["label"], []))
+
+exportsCode :: [(String, String)] -> Bool
+exportsCode attrs = not (("rundoc-exports", "none") `elem` attrs
+ || ("rundoc-exports", "results") `elem` attrs)
+
+exportsResults :: [(String, String)] -> Bool
+exportsResults attrs = ("rundoc-exports", "results") `elem` attrs
+ || ("rundoc-exports", "both") `elem` attrs
+
+trailingResultsBlock :: OrgParser (Maybe (F Blocks))
+trailingResultsBlock = optionMaybe . try $ do
+ blanklines
+ stringAnyCase "#+RESULTS:"
+ blankline
+ block
+
+-- | Parse code block arguments
+-- TODO: We currently don't handle switches.
+codeHeaderArgs :: OrgParser ([String], [(String, String)])
+codeHeaderArgs = try $ do
+ language <- skipSpaces *> orgArgWord
+ _ <- skipSpaces *> (try $ switch `sepBy` (many1 spaceChar))
+ parameters <- manyTill blockOption newline
+ let pandocLang = translateLang language
+ return $
+ if hasRundocParameters parameters
+ then ( [ pandocLang, rundocBlockClass ]
+ , map toRundocAttrib (("language", language) : parameters)
+ )
+ else ([ pandocLang ], parameters)
+ where
+ hasRundocParameters = not . null
+
+switch :: OrgParser (Char, Maybe String)
+switch = try $ simpleSwitch <|> lineNumbersSwitch
+ where
+ simpleSwitch = (\c -> (c, Nothing)) <$> (oneOf "-+" *> letter)
+ lineNumbersSwitch = (\ls -> ('l', Just ls)) <$>
+ (string "-l \"" *> many1Till nonspaceChar (char '"'))
+
+blockOption :: OrgParser (String, String)
+blockOption = try $ do
+ argKey <- orgArgKey
+ paramValue <- option "yes" orgParamValue
+ return (argKey, paramValue)
+
+orgParamValue :: OrgParser String
+orgParamValue = try $
+ skipSpaces
+ *> notFollowedBy (char ':' )
+ *> many1 nonspaceChar
+ <* skipSpaces
+
+horizontalRule :: OrgParser (F Blocks)
+horizontalRule = return B.horizontalRule <$ try hline
+
+
+--
+-- Drawers
+--
+
+-- | A generic drawer which has no special meaning for org-mode.
+-- Whether or not this drawer is included in the output depends on the drawers
+-- export setting.
+genericDrawer :: OrgParser (F Blocks)
+genericDrawer = try $ do
+ name <- map toUpper <$> drawerStart
+ content <- manyTill drawerLine (try drawerEnd)
+ state <- getState
+ -- Include drawer if it is explicitly included in or not explicitly excluded
+ -- from the list of drawers that should be exported. PROPERTIES drawers are
+ -- never exported.
+ case (exportDrawers . orgStateExportSettings $ state) of
+ _ | name == "PROPERTIES" -> return mempty
+ Left names | name `elem` names -> return mempty
+ Right names | name `notElem` names -> return mempty
+ _ -> drawerDiv name <$> parseLines content
+ where
+ parseLines :: [String] -> OrgParser (F Blocks)
+ parseLines = parseFromString blocks . (++ "\n") . unlines
+
+ drawerDiv :: String -> F Blocks -> F Blocks
+ drawerDiv drawerName = fmap $ B.divWith (mempty, [drawerName, "drawer"], mempty)
+
+drawerLine :: OrgParser String
+drawerLine = anyLine
+
+drawerEnd :: OrgParser String
+drawerEnd = try $
+ skipSpaces *> stringAnyCase ":END:" <* skipSpaces <* newline
+
+-- | Read a :PROPERTIES: drawer and return the key/value pairs contained
+-- within.
+propertiesDrawer :: OrgParser [(String, String)]
+propertiesDrawer = try $ do
+ drawerType <- drawerStart
+ guard $ map toUpper drawerType == "PROPERTIES"
+ manyTill property (try drawerEnd)
+ where
+ property :: OrgParser (String, String)
+ property = try $ (,) <$> key <*> value
+
+ key :: OrgParser String
+ key = try $ skipSpaces *> char ':' *> many1Till nonspaceChar (char ':')
+
+ value :: OrgParser String
+ value = try $ skipSpaces *> manyTill anyChar (try $ skipSpaces *> newline)
+
+keyValuesToAttr :: [(String, String)] -> Attr
+keyValuesToAttr kvs =
+ let
+ lowerKvs = map (\(k, v) -> (map toLower k, v)) kvs
+ id' = fromMaybe mempty . lookup "custom_id" $ lowerKvs
+ cls = fromMaybe mempty . lookup "class" $ lowerKvs
+ kvs' = filter (flip notElem ["custom_id", "class"] . fst) lowerKvs
+ in
+ (id', words cls, kvs')
+
+
+--
+-- Figures
+--
+
+-- | Figures (Image on a line by itself, preceded by name and/or caption)
+figure :: OrgParser (F Blocks)
+figure = try $ do
+ figAttrs <- blockAttributes
+ src <- skipSpaces *> selfTarget <* skipSpaces <* newline
+ guard . not . isNothing . blockAttrCaption $ figAttrs
+ guard (isImageFilename src)
+ let figName = fromMaybe mempty $ blockAttrName figAttrs
+ let figCaption = fromMaybe mempty $ blockAttrCaption figAttrs
+ let figKeyVals = blockAttrKeyValues figAttrs
+ let attr = (mempty, mempty, figKeyVals)
+ return $ (B.para . B.imageWith attr src (withFigPrefix figName) <$> figCaption)
+ where
+ withFigPrefix :: String -> String
+ withFigPrefix cs =
+ if "fig:" `isPrefixOf` cs
+ then cs
+ else "fig:" ++ cs
+
+ selfTarget :: OrgParser String
+ selfTarget = try $ char '[' *> linkTarget <* char ']'
+
+--
+-- Examples
+--
+
+-- | Example code marked up by a leading colon.
+example :: OrgParser (F Blocks)
+example = try $ do
+ return . return . exampleCode =<< unlines <$> many1 exampleLine
+ where
+ exampleLine :: OrgParser String
+ exampleLine = try $ exampleLineStart *> anyLine
+
+exampleCode :: String -> Blocks
+exampleCode = B.codeBlockWith ("", ["example"], [])
+
+
+--
+-- Comments, Options and Metadata
+--
+
+specialLine :: OrgParser (F Blocks)
+specialLine = fmap return . try $ metaLine <|> commentLine
+
+-- The order in which blocks are tried ensures that we are not looking at
+-- the beginning of a block, so we don't need to check for it.
+metaLine :: OrgParser Blocks
+metaLine = mempty <$ metaLineStart <* (optionLine <|> declarationLine)
+
+commentLine :: OrgParser Blocks
+commentLine = commentLineStart *> anyLine *> pure mempty
+
+declarationLine :: OrgParser ()
+declarationLine = try $ do
+ key <- metaKey
+ value <- metaInlines
+ updateState $ \st ->
+ let meta' = B.setMeta key <$> value <*> pure nullMeta
+ in st { orgStateMeta = orgStateMeta st <> meta' }
+
+metaInlines :: OrgParser (F MetaValue)
+metaInlines = fmap (MetaInlines . B.toList) <$> inlinesTillNewline
+
+metaKey :: OrgParser String
+metaKey = map toLower <$> many1 (noneOf ": \n\r")
+ <* char ':'
+ <* skipSpaces
+
+optionLine :: OrgParser ()
+optionLine = try $ do
+ key <- metaKey
+ case key of
+ "link" -> parseLinkFormat >>= uncurry addLinkFormat
+ "options" -> () <$ sepBy spaces exportSetting
+ _ -> mzero
+
+addLinkFormat :: String
+ -> (String -> String)
+ -> OrgParser ()
+addLinkFormat key formatter = updateState $ \s ->
+ let fs = orgStateLinkFormatters s
+ in s{ orgStateLinkFormatters = M.insert key formatter fs }
+
+
+--
+-- Export Settings
+--
+
+-- | Read and process org-mode specific export options.
+exportSetting :: OrgParser ()
+exportSetting = choice
+ [ booleanSetting "^" setExportSubSuperscripts
+ , booleanSetting "'" setExportSmartQuotes
+ , booleanSetting "*" setExportEmphasizedText
+ , booleanSetting "-" setExportSpecialStrings
+ , ignoredSetting ":"
+ , ignoredSetting "<"
+ , ignoredSetting "\\n"
+ , ignoredSetting "arch"
+ , ignoredSetting "author"
+ , ignoredSetting "c"
+ , ignoredSetting "creator"
+ , complementableListSetting "d" setExportDrawers
+ , ignoredSetting "date"
+ , ignoredSetting "e"
+ , ignoredSetting "email"
+ , ignoredSetting "f"
+ , ignoredSetting "H"
+ , ignoredSetting "inline"
+ , ignoredSetting "num"
+ , ignoredSetting "p"
+ , ignoredSetting "pri"
+ , ignoredSetting "prop"
+ , ignoredSetting "stat"
+ , ignoredSetting "tags"
+ , ignoredSetting "tasks"
+ , ignoredSetting "tex"
+ , ignoredSetting "timestamp"
+ , ignoredSetting "title"
+ , ignoredSetting "toc"
+ , ignoredSetting "todo"
+ , ignoredSetting "|"
+ ] <?> "export setting"
+
+booleanSetting :: String -> ExportSettingSetter Bool -> OrgParser ()
+booleanSetting settingIdentifier setter = try $ do
+ string settingIdentifier
+ char ':'
+ value <- elispBoolean
+ updateState $ modifyExportSettings setter value
+
+-- | Read an elisp boolean. The values `nil`, `{}` and `()` are treated as
+-- false (case-insensitively); all other values are interpreted as true.
+elispBoolean :: OrgParser Bool
+elispBoolean = try $ do
+ value <- many1 nonspaceChar
+ return $ case map toLower value of
+ "nil" -> False
+ "{}" -> False
+ "()" -> False
+ _ -> True
+
+-- | A list or a complement list (i.e. a list starting with `not`).
+complementableListSetting :: String
+ -> ExportSettingSetter (Either [String] [String])
+ -> OrgParser ()
+complementableListSetting settingIdentifier setter = try $ do
+ _ <- string settingIdentifier <* char ':'
+ value <- choice [ Left <$> complementStringList
+ , Right <$> stringList
+ , (\b -> if b then Left [] else Right []) <$> elispBoolean
+ ]
+ updateState $ modifyExportSettings setter value
+ where
+ -- Read a plain list of strings.
+ stringList :: OrgParser [String]
+ stringList = try $
+ char '('
+ *> sepBy elispString spaces
+ <* char ')'
+
+ -- Read an emacs lisp list specifying a complement set.
+ complementStringList :: OrgParser [String]
+ complementStringList = try $
+ string "(not "
+ *> sepBy elispString spaces
+ <* char ')'
+
+ elispString :: OrgParser String
+ elispString = try $
+ char '"'
+ *> manyTill alphaNum (char '"')
+
+ignoredSetting :: String -> OrgParser ()
+ignoredSetting s = try (() <$ string s <* char ':' <* many1 nonspaceChar)
+
+
+parseLinkFormat :: OrgParser ((String, String -> String))
+parseLinkFormat = try $ do
+ linkType <- (:) <$> letter <*> many (alphaNum <|> oneOf "-_") <* skipSpaces
+ linkSubst <- parseFormat
+ return (linkType, linkSubst)
+
+-- | An ad-hoc, single-argument-only implementation of a printf-style format
+-- parser.
+parseFormat :: OrgParser (String -> String)
+parseFormat = try $ do
+ replacePlain <|> replaceUrl <|> justAppend
+ where
+ -- inefficient, but who cares
+ replacePlain = try $ (\x -> concat . flip intersperse x)
+ <$> sequence [tillSpecifier 's', rest]
+ replaceUrl = try $ (\x -> concat . flip intersperse x . urlEncode)
+ <$> sequence [tillSpecifier 'h', rest]
+ justAppend = try $ (++) <$> rest
+
+ rest = manyTill anyChar (eof <|> () <$ oneOf "\n\r")
+ tillSpecifier c = manyTill (noneOf "\n\r") (try $ string ('%':c:""))
+
+--
+-- Headers
+--
+
+-- | Headers
+header :: OrgParser (F Blocks)
+header = try $ do
+ level <- headerStart
+ title <- manyTill inline (lookAhead $ optional headerTags <* newline)
+ tags <- option [] headerTags
+ newline
+ let text = tagTitle title tags
+ propAttr <- option nullAttr (keyValuesToAttr <$> propertiesDrawer)
+ attr <- registerHeader propAttr (runF text def)
+ return (B.headerWith attr level <$> text)
+ where
+ tagTitle :: [F Inlines] -> [String] -> F Inlines
+ tagTitle title tags = trimInlinesF . mconcat $ title <> map tagToInlineF tags
+
+ tagToInlineF :: String -> F Inlines
+ tagToInlineF t = return $ B.spanWith ("", ["tag"], [("data-tag-name", t)]) mempty
+
+ headerTags :: OrgParser [String]
+ headerTags = try $
+ let tag = many1 (alphaNum <|> oneOf "@%#_") <* char ':'
+ in skipSpaces
+ *> char ':'
+ *> many1 tag
+ <* skipSpaces
+
+
+--
+-- Tables
+--
+
+data OrgTableRow = OrgContentRow (F [Blocks])
+ | OrgAlignRow [Alignment]
+ | OrgHlineRow
+
+-- OrgTable is strongly related to the pandoc table ADT. Using the same
+-- (i.e. pandoc-global) ADT would mean that the reader would break if the
+-- global structure was to be changed, which would be bad. The final table
+-- should be generated using a builder function. Column widths aren't
+-- implemented yet, so they are not tracked here.
+data OrgTable = OrgTable
+ { orgTableAlignments :: [Alignment]
+ , orgTableHeader :: [Blocks]
+ , orgTableRows :: [[Blocks]]
+ }
+
+table :: OrgParser (F Blocks)
+table = try $ do
+ blockAttrs <- blockAttributes
+ lookAhead tableStart
+ do
+ rows <- tableRows
+ let caption = fromMaybe (return mempty) $ blockAttrCaption blockAttrs
+ return $ (<$> caption) . orgToPandocTable . normalizeTable =<< rowsToTable rows
+
+orgToPandocTable :: OrgTable
+ -> Inlines
+ -> Blocks
+orgToPandocTable (OrgTable aligns heads lns) caption =
+ B.table caption (zip aligns $ repeat 0) heads lns
+
+tableRows :: OrgParser [OrgTableRow]
+tableRows = try $ many (tableAlignRow <|> tableHline <|> tableContentRow)
+
+tableContentRow :: OrgParser OrgTableRow
+tableContentRow = try $
+ OrgContentRow . sequence <$> (tableStart *> many1Till tableContentCell newline)
+
+tableContentCell :: OrgParser (F Blocks)
+tableContentCell = try $
+ fmap B.plain . trimInlinesF . mconcat <$> manyTill inline endOfCell
+
+tableAlignRow :: OrgParser OrgTableRow
+tableAlignRow = try $ do
+ tableStart
+ cells <- many1Till tableAlignCell newline
+ -- Empty rows are regular (i.e. content) rows, not alignment rows.
+ guard $ any (/= AlignDefault) cells
+ return $ OrgAlignRow cells
+
+tableAlignCell :: OrgParser Alignment
+tableAlignCell =
+ choice [ try $ emptyCell *> return AlignDefault
+ , try $ skipSpaces
+ *> char '<'
+ *> tableAlignFromChar
+ <* many digit
+ <* char '>'
+ <* emptyCell
+ ] <?> "alignment info"
+ where emptyCell = try $ skipSpaces *> endOfCell
+
+tableAlignFromChar :: OrgParser Alignment
+tableAlignFromChar = try $
+ choice [ char 'l' *> return AlignLeft
+ , char 'c' *> return AlignCenter
+ , char 'r' *> return AlignRight
+ ]
+
+tableHline :: OrgParser OrgTableRow
+tableHline = try $
+ OrgHlineRow <$ (tableStart *> char '-' *> anyLine)
+
+endOfCell :: OrgParser Char
+endOfCell = try $ char '|' <|> lookAhead newline
+
+rowsToTable :: [OrgTableRow]
+ -> F OrgTable
+rowsToTable = foldM rowToContent emptyTable
+ where emptyTable = OrgTable mempty mempty mempty
+
+normalizeTable :: OrgTable -> OrgTable
+normalizeTable (OrgTable aligns heads rows) = OrgTable aligns' heads rows
+ where
+ refRow = if heads /= mempty
+ then heads
+ else if rows == mempty then mempty else head rows
+ cols = length refRow
+ fillColumns base padding = take cols $ base ++ repeat padding
+ aligns' = fillColumns aligns AlignDefault
+
+-- One or more horizontal rules after the first content line mark the previous
+-- line as a header. All other horizontal lines are discarded.
+rowToContent :: OrgTable
+ -> OrgTableRow
+ -> F OrgTable
+rowToContent orgTable row =
+ case row of
+ OrgHlineRow -> return singleRowPromotedToHeader
+ OrgAlignRow as -> return . setAligns $ as
+ OrgContentRow cs -> appendToBody cs
+ where
+ singleRowPromotedToHeader :: OrgTable
+ singleRowPromotedToHeader = case orgTable of
+ OrgTable{ orgTableHeader = [], orgTableRows = b:[] } ->
+ orgTable{ orgTableHeader = b , orgTableRows = [] }
+ _ -> orgTable
+
+ setAligns :: [Alignment] -> OrgTable
+ setAligns aligns = orgTable{ orgTableAlignments = aligns }
+
+ appendToBody :: F [Blocks] -> F OrgTable
+ appendToBody frow = do
+ newRow <- frow
+ let oldRows = orgTableRows orgTable
+ -- NOTE: This is an inefficient O(n) operation. This should be changed
+ -- if performance ever becomes a problem.
+ return orgTable{ orgTableRows = oldRows ++ [newRow] }
+
+
+--
+-- LaTeX fragments
+--
+latexFragment :: OrgParser (F Blocks)
+latexFragment = try $ do
+ envName <- latexEnvStart
+ content <- mconcat <$> manyTill anyLineNewline (latexEnd envName)
+ return . return $ B.rawBlock "latex" (content `inLatexEnv` envName)
+ where
+ c `inLatexEnv` e = mconcat [ "\\begin{", e, "}\n"
+ , c
+ , "\\end{", e, "}\n"
+ ]
+
+latexEnd :: String -> OrgParser ()
+latexEnd envName = try $
+ () <$ skipSpaces
+ <* string ("\\end{" ++ envName ++ "}")
+ <* blankline
+
+
+--
+-- Footnote definitions
+--
+noteBlock :: OrgParser (F Blocks)
+noteBlock = try $ do
+ ref <- noteMarker <* skipSpaces
+ content <- mconcat <$> blocksTillHeaderOrNote
+ addToNotesTable (ref, content)
+ return mempty
+ where
+ blocksTillHeaderOrNote =
+ many1Till block (eof <|> () <$ lookAhead noteMarker
+ <|> () <$ lookAhead headerStart)
+
+-- Paragraphs or Plain text
+paraOrPlain :: OrgParser (F Blocks)
+paraOrPlain = try $ do
+ ils <- inlines
+ nl <- option False (newline *> return True)
+ -- Read block as paragraph, except if we are in a list context and the block
+ -- is directly followed by a list item, in which case the block is read as
+ -- plain text.
+ try (guard nl
+ *> notFollowedBy (inList *> (() <$ orderedListStart <|> bulletListStart))
+ *> return (B.para <$> ils))
+ <|> (return (B.plain <$> ils))
+
+inlinesTillNewline :: OrgParser (F Inlines)
+inlinesTillNewline = trimInlinesF . mconcat <$> manyTill inline newline
+
+
+--
+-- list blocks
+--
+
+list :: OrgParser (F Blocks)
+list = choice [ definitionList, bulletList, orderedList ] <?> "list"
+
+definitionList :: OrgParser (F Blocks)
+definitionList = try $ do n <- lookAhead (bulletListStart' Nothing)
+ fmap B.definitionList . fmap compactify'DL . sequence
+ <$> many1 (definitionListItem $ bulletListStart' (Just n))
+
+bulletList :: OrgParser (F Blocks)
+bulletList = try $ do n <- lookAhead (bulletListStart' Nothing)
+ fmap B.bulletList . fmap compactify' . sequence
+ <$> many1 (listItem (bulletListStart' $ Just n))
+
+orderedList :: OrgParser (F Blocks)
+orderedList = fmap B.orderedList . fmap compactify' . sequence
+ <$> many1 (listItem orderedListStart)
+
+bulletListStart' :: Maybe Int -> OrgParser Int
+-- returns length of bulletList prefix, inclusive of marker
+bulletListStart' Nothing = do ind <- length <$> many spaceChar
+ oneOf (bullets $ ind == 0)
+ skipSpaces1
+ return (ind + 1)
+bulletListStart' (Just n) = do count (n-1) spaceChar
+ oneOf (bullets $ n == 1)
+ many1 spaceChar
+ return n
+
+-- Unindented lists are legal, but they can't use '*' bullets.
+-- We return n to maintain compatibility with the generic listItem.
+bullets :: Bool -> String
+bullets unindented = if unindented then "+-" else "*+-"
+
+definitionListItem :: OrgParser Int
+ -> OrgParser (F (Inlines, [Blocks]))
+definitionListItem parseMarkerGetLength = try $ do
+ markerLength <- parseMarkerGetLength
+ term <- manyTill (noneOf "\n\r") (try definitionMarker)
+ line1 <- anyLineNewline
+ blank <- option "" ("\n" <$ blankline)
+ cont <- concat <$> many (listContinuation markerLength)
+ term' <- parseFromString inlines term
+ contents' <- parseFromString blocks $ line1 ++ blank ++ cont
+ return $ (,) <$> term' <*> fmap (:[]) contents'
+ where
+ definitionMarker =
+ spaceChar *> string "::" <* (spaceChar <|> lookAhead newline)
+
+
+-- parse raw text for one list item, excluding start marker and continuations
+listItem :: OrgParser Int
+ -> OrgParser (F Blocks)
+listItem start = try . withContext ListItemState $ do
+ markerLength <- try start
+ firstLine <- anyLineNewline
+ blank <- option "" ("\n" <$ blankline)
+ rest <- concat <$> many (listContinuation markerLength)
+ parseFromString blocks $ firstLine ++ blank ++ rest
+
+-- continuation of a list item - indented and separated by blankline or endline.
+-- Note: nested lists are parsed as continuations.
+listContinuation :: Int
+ -> OrgParser String
+listContinuation markerLength = try $
+ notFollowedBy' blankline
+ *> (mappend <$> (concat <$> many1 listLine)
+ <*> many blankline)
+ where
+ listLine = try $ indentWith markerLength *> anyLineNewline
+
+ -- indent by specified number of spaces (or equiv. tabs)
+ indentWith :: Int -> OrgParser String
+ indentWith num = do
+ tabStop <- getOption readerTabStop
+ if num < tabStop
+ then count num (char ' ')
+ else choice [ try (count num (char ' '))
+ , try (char '\t' >> count (num - tabStop) (char ' ')) ]
+
+-- | Parse any line, include the final newline in the output.
+anyLineNewline :: OrgParser String
+anyLineNewline = (++ "\n") <$> anyLine
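Reviewer note on `rawBlockContent`: after the lines of a block are collected, a leading comma that protects `*` or `#+` is dropped and the indentation shared by all non-empty lines is stripped. The sketch below reproduces that post-processing with `base` only, so it can be tried in isolation. The names `unescapeComma` and `stripCommonIndent` are hypothetical. One deliberate difference is an assumption of this sketch, not the patch's behaviour: the patch applies `minimum` directly, whereas the sketch returns the input unchanged when there are no non-empty lines.

```haskell
import Data.Char (isSpace)

-- Hypothetical standalone copy of the comma-unescaping step: a comma is
-- removed only when it directly precedes '*' or "#+", after optional
-- leading spaces or tabs.
unescapeComma :: String -> String
unescapeComma (',':cs@('*':_))     = cs
unescapeComma (',':cs@('#':'+':_)) = cs
unescapeComma (' ':cs)             = ' '  : unescapeComma cs
unescapeComma ('\t':cs)            = '\t' : unescapeComma cs
unescapeComma cs                   = cs

-- Strip the smallest indentation found among the non-empty lines.
-- Assumption of this sketch: input without non-empty lines is returned as is.
stripCommonIndent :: [String] -> [String]
stripCommonIndent strs =
  case map (length . takeWhile isSpace) (filter (not . null) strs) of
    []      -> strs
    indents -> map (drop (minimum indents)) strs

main :: IO ()
main = mapM_ putStrLn . stripCommonIndent . map unescapeComma $
  [ "  ,* not a headline"
  , "    ,#+BEGIN_SRC is kept literally"
  , "  plain, with a comma"
  ]
```

The third input line shows that commas elsewhere in a line are left alone; only the escaping comma at the start of the text is dropped.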
diff --git a/src/Text/Pandoc/Readers/Org/Inlines.hs b/src/Text/Pandoc/Readers/Org/Inlines.hs
new file mode 100644
index 000000000..001aeb569
--- /dev/null
+++ b/src/Text/Pandoc/Readers/Org/Inlines.hs
@@ -0,0 +1,762 @@
+{-# LANGUAGE OverloadedStrings #-}
+{-
+Copyright (C) 2014-2016 Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+-}
+
+{- |
+ Module : Text.Pandoc.Readers.Org.Inlines
+ Copyright : Copyright (C) 2014-2016 Albert Krewinkel
+ License : GNU GPL, version 2 or above
+
+ Maintainer : Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+Parsers for Org-mode inline elements.
+-}
+module Text.Pandoc.Readers.Org.Inlines
+ ( inline
+ , inlines
+ , addToNotesTable
+ , linkTarget
+ ) where
+
+import Text.Pandoc.Readers.Org.BlockStarts
+import Text.Pandoc.Readers.Org.ParserState
+import Text.Pandoc.Readers.Org.Parsing
+import Text.Pandoc.Readers.Org.Shared
+ ( isImageFilename, rundocBlockClass, toRundocAttrib
+ , translateLang )
+
+import qualified Text.Pandoc.Builder as B
+import Text.Pandoc.Builder ( Inlines )
+import Text.Pandoc.Definition
+import Text.Pandoc.Compat.Monoid ( (<>) )
+import Text.Pandoc.Options
+import Text.Pandoc.Readers.LaTeX ( inlineCommand, rawLaTeXInline )
+import Text.TeXMath ( readTeX, writePandoc, DisplayType(..) )
+import qualified Text.TeXMath.Readers.MathML.EntityMap as MathMLEntityMap
+
+import Control.Monad ( guard, mplus, mzero, when )
+import Data.Char ( isAlphaNum, isSpace )
+import Data.List ( isPrefixOf )
+import Data.Maybe ( fromMaybe )
+import qualified Data.Map as M
+
+--
+-- Functions acting on the parser state
+--
+recordAnchorId :: String -> OrgParser ()
+recordAnchorId i = updateState $ \s ->
+ s{ orgStateAnchorIds = i : (orgStateAnchorIds s) }
+
+pushToInlineCharStack :: Char -> OrgParser ()
+pushToInlineCharStack c = updateState $ \s ->
+ s{ orgStateEmphasisCharStack = c:orgStateEmphasisCharStack s }
+
+popInlineCharStack :: OrgParser ()
+popInlineCharStack = updateState $ \s ->
+ s{ orgStateEmphasisCharStack = drop 1 . orgStateEmphasisCharStack $ s }
+
+surroundingEmphasisChar :: OrgParser [Char]
+surroundingEmphasisChar =
+ take 1 . drop 1 . orgStateEmphasisCharStack <$> getState
+
+startEmphasisNewlinesCounting :: Int -> OrgParser ()
+startEmphasisNewlinesCounting maxNewlines = updateState $ \s ->
+ s{ orgStateEmphasisNewlines = Just maxNewlines }
+
+decEmphasisNewlinesCount :: OrgParser ()
+decEmphasisNewlinesCount = updateState $ \s ->
+ s{ orgStateEmphasisNewlines = (\n -> n - 1) <$> orgStateEmphasisNewlines s }
+
+newlinesCountWithinLimits :: OrgParser Bool
+newlinesCountWithinLimits = do
+ st <- getState
+ return $ ((< 0) <$> orgStateEmphasisNewlines st) /= Just True
+
+resetEmphasisNewlines :: OrgParser ()
+resetEmphasisNewlines = updateState $ \s ->
+ s{ orgStateEmphasisNewlines = Nothing }
+
+addToNotesTable :: OrgNoteRecord -> OrgParser ()
+addToNotesTable note = do
+ oldnotes <- orgStateNotes' <$> getState
+ updateState $ \s -> s{ orgStateNotes' = note:oldnotes }
+
+-- | Parse a single Org-mode inline element
+inline :: OrgParser (F Inlines)
+inline =
+ choice [ whitespace
+ , linebreak
+ , cite
+ , footnote
+ , linkOrImage
+ , anchor
+ , inlineCodeBlock
+ , str
+ , endline
+ , emphasizedText
+ , code
+ , math
+ , displayMath
+ , verbatim
+ , subscript
+ , superscript
+ , inlineLaTeX
+ , smart
+ , symbol
+ ] <* (guard =<< newlinesCountWithinLimits)
+ <?> "inline"
+
+-- | Read the rest of the input as inlines.
+inlines :: OrgParser (F Inlines)
+inlines = trimInlinesF . mconcat <$> many1 inline
+
+-- treat these as potentially non-text when parsing inline:
+specialChars :: [Char]
+specialChars = "\"$'()*+-,./:<=>[\\]^_{|}~"
+
+
+whitespace :: OrgParser (F Inlines)
+whitespace = pure B.space <$ skipMany1 spaceChar
+ <* updateLastPreCharPos
+ <* updateLastForbiddenCharPos
+ <?> "whitespace"
+
+linebreak :: OrgParser (F Inlines)
+linebreak = try $ pure B.linebreak <$ string "\\\\" <* skipSpaces <* newline
+
+str :: OrgParser (F Inlines)
+str = return . B.str <$> many1 (noneOf $ specialChars ++ "\n\r ")
+ <* updateLastStrPos
+
+-- | An endline character that can be treated as a space, not a structural
+-- break. This should reflect the values of the Emacs variable
+-- @org-element-paragraph-separate@.
+endline :: OrgParser (F Inlines)
+endline = try $ do
+ newline
+ notFollowedBy blankline
+ notFollowedBy' exampleLineStart
+ notFollowedBy' hline
+ notFollowedBy' noteMarker
+ notFollowedBy' tableStart
+ notFollowedBy' drawerStart
+ notFollowedBy' headerStart
+ notFollowedBy' metaLineStart
+ notFollowedBy' latexEnvStart
+ notFollowedBy' commentLineStart
+ notFollowedBy' bulletListStart
+ notFollowedBy' orderedListStart
+ decEmphasisNewlinesCount
+ guard =<< newlinesCountWithinLimits
+ updateLastPreCharPos
+ return . return $ B.softbreak
+
+cite :: OrgParser (F Inlines)
+cite = try $ do
+ guardEnabled Ext_citations
+ (cs, raw) <- withRaw (pandocOrgCite <|> orgRefCite)
+ return $ (flip B.cite (B.text raw)) <$> cs
+
+-- | A citation in Pandoc Org-mode style (@[\@citekey]@).
+pandocOrgCite :: OrgParser (F [Citation])
+pandocOrgCite = try $
+ char '[' *> skipSpaces *> citeList <* skipSpaces <* char ']'
+
+orgRefCite :: OrgParser (F [Citation])
+orgRefCite = try $ normalOrgRefCite <|> (fmap (:[]) <$> linkLikeOrgRefCite)
+
+normalOrgRefCite :: OrgParser (F [Citation])
+normalOrgRefCite = try $ do
+ mode <- orgRefCiteMode
+ sequence <$> sepBy1 (orgRefCiteList mode) (char ',')
+ where
+ -- | A list of org-ref style citation keys, parsed as citation of the given
+ -- citation mode.
+ orgRefCiteList :: CitationMode -> OrgParser (F Citation)
+ orgRefCiteList citeMode = try $ do
+ key <- orgRefCiteKey
+ returnF $ Citation
+ { citationId = key
+ , citationPrefix = mempty
+ , citationSuffix = mempty
+ , citationMode = citeMode
+ , citationNoteNum = 0
+ , citationHash = 0
+ }
+
+-- | Read a link-like org-ref style citation. The citation includes pre and
+-- post text. However, multiple citations are not possible due to limitations
+-- in the syntax.
+linkLikeOrgRefCite :: OrgParser (F Citation)
+linkLikeOrgRefCite = try $ do
+ _ <- string "[["
+ mode <- orgRefCiteMode
+ key <- orgRefCiteKey
+ _ <- string "]["
+ pre <- trimInlinesF . mconcat <$> manyTill inline (try $ string "::")
+ spc <- option False (True <$ spaceChar)
+ suf <- trimInlinesF . mconcat <$> manyTill inline (try $ string "]]")
+ return $ do
+ pre' <- pre
+ suf' <- suf
+ return Citation
+ { citationId = key
+ , citationPrefix = B.toList pre'
+ , citationSuffix = B.toList (if spc then B.space <> suf' else suf')
+ , citationMode = mode
+ , citationNoteNum = 0
+ , citationHash = 0
+ }
+
+-- | Read a citation key. The characters allowed in citation keys are taken
+-- from the `org-ref-cite-re` variable in `org-ref.el`.
+orgRefCiteKey :: OrgParser String
+orgRefCiteKey = try . many1 . satisfy $ \c ->
+ isAlphaNum c || c `elem` ("-_:\\./"::String)
+
+-- | Supported citation types. Only a small subset of org-ref types is
+-- supported for now. TODO: rewrite this, use LaTeX reader as template.
+orgRefCiteMode :: OrgParser CitationMode
+orgRefCiteMode =
+ choice $ map (\(s, mode) -> mode <$ try (string s <* char ':'))
+ [ ("cite", AuthorInText)
+ , ("citep", NormalCitation)
+ , ("citep*", NormalCitation)
+ , ("citet", AuthorInText)
+ , ("citet*", AuthorInText)
+ , ("citeyear", SuppressAuthor)
+ ]
+
+citeList :: OrgParser (F [Citation])
+citeList = sequence <$> sepBy1 citation (try $ char ';' *> skipSpaces)
+
+citation :: OrgParser (F Citation)
+citation = try $ do
+ pref <- prefix
+ (suppress_author, key) <- citeKey
+ suff <- suffix
+ return $ do
+ x <- pref
+ y <- suff
+ return $ Citation{ citationId = key
+ , citationPrefix = B.toList x
+ , citationSuffix = B.toList y
+ , citationMode = if suppress_author
+ then SuppressAuthor
+ else NormalCitation
+ , citationNoteNum = 0
+ , citationHash = 0
+ }
+ where
+ prefix = trimInlinesF . mconcat <$>
+ manyTill inline (char ']' <|> (']' <$ lookAhead citeKey))
+ suffix = try $ do
+ hasSpace <- option False (notFollowedBy nonspaceChar >> return True)
+ skipSpaces
+ rest <- trimInlinesF . mconcat <$>
+ many (notFollowedBy (oneOf ";]") *> inline)
+ return $ if hasSpace
+ then (B.space <>) <$> rest
+ else rest
+
+footnote :: OrgParser (F Inlines)
+footnote = try $ inlineNote <|> referencedNote
+
+inlineNote :: OrgParser (F Inlines)
+inlineNote = try $ do
+ string "[fn:"
+ ref <- many alphaNum
+ char ':'
+ note <- fmap B.para . trimInlinesF . mconcat <$> many1Till inline (char ']')
+ when (not $ null ref) $
+ addToNotesTable ("fn:" ++ ref, note)
+ return $ B.note <$> note
+
+referencedNote :: OrgParser (F Inlines)
+referencedNote = try $ do
+ ref <- noteMarker
+ return $ do
+ notes <- asksF orgStateNotes'
+ case lookup ref notes of
+ Nothing -> return $ B.str $ "[" ++ ref ++ "]"
+ Just contents -> do
+ st <- askF
+ let contents' = runF contents st{ orgStateNotes' = [] }
+ return $ B.note contents'
+
+linkOrImage :: OrgParser (F Inlines)
+linkOrImage = explicitOrImageLink
+ <|> selflinkOrImage
+ <|> angleLink
+ <|> plainLink
+ <?> "link or image"
+
+explicitOrImageLink :: OrgParser (F Inlines)
+explicitOrImageLink = try $ do
+ char '['
+ srcF <- applyCustomLinkFormat =<< possiblyEmptyLinkTarget
+ title <- enclosedRaw (char '[') (char ']')
+ title' <- parseFromString (mconcat <$> many inline) title
+ char ']'
+ return $ do
+ src <- srcF
+ if isImageFilename title
+ then pure $ B.link src "" $ B.image title mempty mempty
+ else linkToInlinesF src =<< title'
+
+selflinkOrImage :: OrgParser (F Inlines)
+selflinkOrImage = try $ do
+ src <- char '[' *> linkTarget <* char ']'
+ return $ linkToInlinesF src (B.str src)
+
+plainLink :: OrgParser (F Inlines)
+plainLink = try $ do
+ (orig, src) <- uri
+ returnF $ B.link src "" (B.str orig)
+
+angleLink :: OrgParser (F Inlines)
+angleLink = try $ do
+ char '<'
+ link <- plainLink
+ char '>'
+ return link
+
+linkTarget :: OrgParser String
+linkTarget = enclosedByPair '[' ']' (noneOf "\n\r[]")
+
+possiblyEmptyLinkTarget :: OrgParser String
+possiblyEmptyLinkTarget = try linkTarget <|> ("" <$ string "[]")
+
+applyCustomLinkFormat :: String -> OrgParser (F String)
+applyCustomLinkFormat link = do
+ let (linkType, rest) = break (== ':') link
+ return $ do
+ formatter <- M.lookup linkType <$> asksF orgStateLinkFormatters
+ return $ maybe link ($ drop 1 rest) formatter
+
+-- | Take a link and return a function which produces new inlines when given
+-- description inlines.
+linkToInlinesF :: String -> Inlines -> F Inlines
+linkToInlinesF linkStr =
+ case linkStr of
+ "" -> pure . B.link mempty "" -- wiki link (empty by convention)
+ ('#':_) -> pure . B.link linkStr "" -- document-local fragment
+ _ -> case cleanLinkString linkStr of
+ (Just cleanedLink) -> if isImageFilename cleanedLink
+ then const . pure $ B.image cleanedLink "" ""
+ else pure . B.link cleanedLink ""
+ Nothing -> internalLink linkStr -- other internal link
+
+-- | Cleanup and canonicalize a string describing a link. Return @Nothing@ if
+-- the string does not appear to be a link.
+cleanLinkString :: String -> Maybe String
+cleanLinkString s =
+ case s of
+ '/':_ -> Just $ "file://" ++ s -- absolute path
+ '.':'/':_ -> Just s -- relative path
+ '.':'.':'/':_ -> Just s -- relative path
+ -- Relative path or URL (file scheme)
+ 'f':'i':'l':'e':':':s' -> Just $ if ("//" `isPrefixOf` s') then s else s'
+ _ | isUrl s -> Just s -- URL
+ _ -> Nothing
+ where
+ isUrl :: String -> Bool
+ isUrl cs =
+ let (scheme, path) = break (== ':') cs
+ in all (\c -> isAlphaNum c || c `elem` (".-"::String)) scheme
+ && not (null path)
+
+internalLink :: String -> Inlines -> F Inlines
+internalLink link title = do
+ anchorB <- (link `elem`) <$> asksF orgStateAnchorIds
+ if anchorB
+ then return $ B.link ('#':link) "" title
+ else return $ B.emph title
+
+-- | Parse an anchor like @<<anchor-id>>@ and return an empty span with
+-- @anchor-id@ set as id. Legal anchors in org-mode are defined through
+-- @org-target-regexp@, which is fairly liberal. Since no link is created if
+-- @anchor-id@ contains spaces, we are more restrictive in what is accepted as
+-- an anchor.
+
+anchor :: OrgParser (F Inlines)
+anchor = try $ do
+ anchorId <- parseAnchor
+ recordAnchorId anchorId
+ returnF $ B.spanWith (solidify anchorId, [], []) mempty
+ where
+ parseAnchor = string "<<"
+ *> many1 (noneOf "\t\n\r<>\"' ")
+ <* string ">>"
+ <* skipSpaces
+
+-- | Replace every char but [a-zA-Z0-9_.-:] with a hyphen '-'. This mirrors
+-- the org function @org-export-solidify-link-text@.
+
+solidify :: String -> String
+solidify = map replaceSpecialChar
+ where replaceSpecialChar c
+ | isAlphaNum c = c
+ | c `elem` ("_.-:" :: String) = c
+ | otherwise = '-'
+
+-- | Parses an inline code block and marks it as a babel block.
+inlineCodeBlock :: OrgParser (F Inlines)
+inlineCodeBlock = try $ do
+ string "src_"
+ lang <- many1 orgArgWordChar
+ opts <- option [] $ enclosedByPair '[' ']' inlineBlockOption
+ inlineCode <- enclosedByPair '{' '}' (noneOf "\n\r")
+ let attrClasses = [translateLang lang, rundocBlockClass]
+ let attrKeyVal = map toRundocAttrib (("language", lang) : opts)
+ returnF $ B.codeWith ("", attrClasses, attrKeyVal) inlineCode
+ where
+ inlineBlockOption :: OrgParser (String, String)
+ inlineBlockOption = try $ do
+ argKey <- orgArgKey
+ paramValue <- option "yes" orgInlineParamValue
+ return (argKey, paramValue)
+
+ orgInlineParamValue :: OrgParser String
+ orgInlineParamValue = try $
+ skipSpaces
+ *> notFollowedBy (char ':')
+ *> many1 (noneOf "\t\n\r ]")
+ <* skipSpaces
+
+
+emphasizedText :: OrgParser (F Inlines)
+emphasizedText = do
+ state <- getState
+ guard . exportEmphasizedText . orgStateExportSettings $ state
+ try $ choice
+ [ emph
+ , strong
+ , strikeout
+ , underline
+ ]
+
+enclosedByPair :: Char -- ^ opening char
+ -> Char -- ^ closing char
+ -> OrgParser a -- ^ parser
+ -> OrgParser [a]
+enclosedByPair s e p = char s *> many1Till p (char e)
+
+emph :: OrgParser (F Inlines)
+emph = fmap B.emph <$> emphasisBetween '/'
+
+strong :: OrgParser (F Inlines)
+strong = fmap B.strong <$> emphasisBetween '*'
+
+strikeout :: OrgParser (F Inlines)
+strikeout = fmap B.strikeout <$> emphasisBetween '+'
+
+-- There is no underline, so we use strong instead.
+underline :: OrgParser (F Inlines)
+underline = fmap B.strong <$> emphasisBetween '_'
+
+verbatim :: OrgParser (F Inlines)
+verbatim = return . B.code <$> verbatimBetween '='
+
+code :: OrgParser (F Inlines)
+code = return . B.code <$> verbatimBetween '~'
+
+subscript :: OrgParser (F Inlines)
+subscript = fmap B.subscript <$> try (char '_' *> subOrSuperExpr)
+
+superscript :: OrgParser (F Inlines)
+superscript = fmap B.superscript <$> try (char '^' *> subOrSuperExpr)
+
+math :: OrgParser (F Inlines)
+math = return . B.math <$> choice [ math1CharBetween '$'
+ , mathStringBetween '$'
+ , rawMathBetween "\\(" "\\)"
+ ]
+
+displayMath :: OrgParser (F Inlines)
+displayMath = return . B.displayMath <$> choice [ rawMathBetween "\\[" "\\]"
+ , rawMathBetween "$$" "$$"
+ ]
+
+updatePositions :: Char
+ -> OrgParser (Char)
+updatePositions c = do
+ when (c `elem` emphasisPreChars) updateLastPreCharPos
+ when (c `elem` emphasisForbiddenBorderChars) updateLastForbiddenCharPos
+ return c
+
+symbol :: OrgParser (F Inlines)
+symbol = return . B.str . (: "") <$> (oneOf specialChars >>= updatePositions)
+
+emphasisBetween :: Char
+ -> OrgParser (F Inlines)
+emphasisBetween c = try $ do
+ startEmphasisNewlinesCounting emphasisAllowedNewlines
+ res <- enclosedInlines (emphasisStart c) (emphasisEnd c)
+ isTopLevelEmphasis <- null . orgStateEmphasisCharStack <$> getState
+ when isTopLevelEmphasis
+ resetEmphasisNewlines
+ return res
+
+verbatimBetween :: Char
+ -> OrgParser String
+verbatimBetween c = try $
+ emphasisStart c *>
+ many1TillNOrLessNewlines 1 (noneOf "\n\r") (emphasisEnd c)
+
+-- | Parses a raw string delimited by @c@ using Org's math rules
+mathStringBetween :: Char
+ -> OrgParser String
+mathStringBetween c = try $ do
+ mathStart c
+ body <- many1TillNOrLessNewlines mathAllowedNewlines
+ (noneOf (c:"\n\r"))
+ (lookAhead $ mathEnd c)
+ final <- mathEnd c
+ return $ body ++ [final]
+
+-- | Parse a single character between @c@ using math rules
+math1CharBetween :: Char
+ -> OrgParser String
+math1CharBetween c = try $ do
+ char c
+ res <- noneOf $ c:mathForbiddenBorderChars
+ char c
+ eof <|> () <$ lookAhead (oneOf mathPostChars)
+ return [res]
+
+rawMathBetween :: String
+ -> String
+ -> OrgParser String
+rawMathBetween s e = try $ string s *> manyTill anyChar (try $ string e)
+
+-- | Parses the start (opening character) of emphasis
+emphasisStart :: Char -> OrgParser Char
+emphasisStart c = try $ do
+ guard =<< afterEmphasisPreChar
+ guard =<< notAfterString
+ char c
+ lookAhead (noneOf emphasisForbiddenBorderChars)
+ pushToInlineCharStack c
+ return c
+
+-- | Parses the closing character of emphasis
+emphasisEnd :: Char -> OrgParser Char
+emphasisEnd c = try $ do
+ guard =<< notAfterForbiddenBorderChar
+ char c
+ eof <|> () <$ lookAhead acceptablePostChars
+ updateLastStrPos
+ popInlineCharStack
+ return c
+ where acceptablePostChars =
+ surroundingEmphasisChar >>= \x -> oneOf (x ++ emphasisPostChars)
+
+mathStart :: Char -> OrgParser Char
+mathStart c = try $
+ char c <* notFollowedBy' (oneOf (c:mathForbiddenBorderChars))
+
+mathEnd :: Char -> OrgParser Char
+mathEnd c = try $ do
+ res <- noneOf (c:mathForbiddenBorderChars)
+ char c
+ eof <|> () <$ lookAhead (oneOf mathPostChars)
+ return res
+
+
+enclosedInlines :: OrgParser a
+ -> OrgParser b
+ -> OrgParser (F Inlines)
+enclosedInlines start end = try $
+ trimInlinesF . mconcat <$> enclosed start end inline
+
+enclosedRaw :: OrgParser a
+ -> OrgParser b
+ -> OrgParser String
+enclosedRaw start end = try $
+ start *> (onSingleLine <|> spanningTwoLines)
+ where onSingleLine = try $ many1Till (noneOf "\n\r") end
+ spanningTwoLines = try $
+ anyLine >>= \f -> mappend (f <> " ") <$> onSingleLine
+
+-- | Like many1Till, but parses at most @n+1@ lines. @p@ must not consume
+-- newlines.
+many1TillNOrLessNewlines :: Int
+ -> OrgParser Char
+ -> OrgParser a
+ -> OrgParser String
+many1TillNOrLessNewlines n p end = try $
+ nMoreLines (Just n) mempty >>= oneOrMore
+ where
+ nMoreLines Nothing cs = return cs
+ nMoreLines (Just 0) cs = try $ (cs ++) <$> finalLine
+ nMoreLines k cs = try $ (final k cs <|> rest k cs)
+ >>= uncurry nMoreLines
+ final _ cs = (\x -> (Nothing, cs ++ x)) <$> try finalLine
+ rest m cs = (\x -> (minus1 <$> m, cs ++ x ++ "\n")) <$> try (manyTill p newline)
+ finalLine = try $ manyTill p end
+ minus1 k = k - 1
+ oneOrMore cs = guard (not $ null cs) *> return cs
+
+-- Org allows customization of the way it reads emphasis. We use the defaults
+-- here (see, e.g., the Emacs Lisp variable `org-emphasis-regexp-components`
+-- for details).
+
+-- | Chars allowed to occur before emphasis (spaces and newlines are ok, too)
+emphasisPreChars :: [Char]
+emphasisPreChars = "\t \"'({"
+
+-- | Chars allowed after emphasis
+emphasisPostChars :: [Char]
+emphasisPostChars = "\t\n !\"'),-.:;?\\}"
+
+-- | Chars not allowed at the (inner) border of emphasis
+emphasisForbiddenBorderChars :: [Char]
+emphasisForbiddenBorderChars = "\t\n\r \"',"
+
+-- | The maximum number of newlines within emphasized text
+emphasisAllowedNewlines :: Int
+emphasisAllowedNewlines = 1
+
+-- LaTeX-style math: see `org-latex-regexps` for details
+
+-- | Chars allowed after an inline ($...$) math statement
+mathPostChars :: [Char]
+mathPostChars = "\t\n \"'),-.:;?"
+
+-- | Chars not allowed at the (inner) border of math
+mathForbiddenBorderChars :: [Char]
+mathForbiddenBorderChars = "\t\n\r ,;.$"
+
+-- | Maximum number of newlines in an inline math statement
+mathAllowedNewlines :: Int
+mathAllowedNewlines = 2
+
+-- | Whether we are right after a char allowed before emphasis
+afterEmphasisPreChar :: OrgParser Bool
+afterEmphasisPreChar = do
+ pos <- getPosition
+ lastPrePos <- orgStateLastPreCharPos <$> getState
+ return . fromMaybe True $ (== pos) <$> lastPrePos
+
+-- | Whether the parser is right after a forbidden border char
+notAfterForbiddenBorderChar :: OrgParser Bool
+notAfterForbiddenBorderChar = do
+ pos <- getPosition
+ lastFBCPos <- orgStateLastForbiddenCharPos <$> getState
+ return $ lastFBCPos /= Just pos
+
+-- | Read a sub- or superscript expression
+subOrSuperExpr :: OrgParser (F Inlines)
+subOrSuperExpr = try $
+ choice [ id <$> charsInBalanced '{' '}' (noneOf "\n\r")
+ , enclosing ('(', ')') <$> charsInBalanced '(' ')' (noneOf "\n\r")
+ , simpleSubOrSuperString
+ ] >>= parseFromString (mconcat <$> many inline)
+ where enclosing (left, right) s = left : s ++ [right]
+
+simpleSubOrSuperString :: OrgParser String
+simpleSubOrSuperString = try $ do
+ state <- getState
+ guard . exportSubSuperscripts . orgStateExportSettings $ state
+ choice [ string "*"
+ , mappend <$> option [] ((:[]) <$> oneOf "+-")
+ <*> many1 alphaNum
+ ]
+
+inlineLaTeX :: OrgParser (F Inlines)
+inlineLaTeX = try $ do
+ cmd <- inlineLaTeXCommand
+ maybe mzero returnF $
+ parseAsMath cmd `mplus` parseAsMathMLSym cmd `mplus` parseAsInlineLaTeX cmd
+ where
+ parseAsMath :: String -> Maybe Inlines
+ parseAsMath cs = B.fromList <$> texMathToPandoc cs
+
+ parseAsInlineLaTeX :: String -> Maybe Inlines
+ parseAsInlineLaTeX cs = maybeRight $ runParser inlineCommand state "" cs
+
+ parseAsMathMLSym :: String -> Maybe Inlines
+ parseAsMathMLSym cs = B.str <$> MathMLEntityMap.getUnicode (clean cs)
+ -- drop initial backslash and any trailing "{}"
+ where clean = dropWhileEnd (`elem` ("{}" :: String)) . drop 1
+
+ state :: ParserState
+ state = def{ stateOptions = def{ readerParseRaw = True }}
+
+ texMathToPandoc :: String -> Maybe [Inline]
+ texMathToPandoc cs = (maybeRight $ readTeX cs) >>= writePandoc DisplayInline
+
+maybeRight :: Either a b -> Maybe b
+maybeRight = either (const Nothing) Just
+
+inlineLaTeXCommand :: OrgParser String
+inlineLaTeXCommand = try $ do
+ rest <- getInput
+ case runParser rawLaTeXInline def "source" rest of
+ Right (RawInline _ cs) -> do
+ -- drop any trailing whitespace; it is not part of the command as
+ -- far as org mode is concerned.
+ let cmdNoSpc = dropWhileEnd isSpace cs
+ let len = length cmdNoSpc
+ count len anyChar
+ return cmdNoSpc
+ _ -> mzero
+
+-- Taken from Data.OldList.
+dropWhileEnd :: (a -> Bool) -> [a] -> [a]
+dropWhileEnd p = foldr (\x xs -> if p x && null xs then [] else x : xs) []
+
+smart :: OrgParser (F Inlines)
+smart = do
+ getOption readerSmart >>= guard
+ doubleQuoted <|> singleQuoted <|>
+ choice (map (return <$>) [orgApostrophe, orgDash, orgEllipses])
+ where
+ orgDash = do
+ guard =<< getExportSetting exportSpecialStrings
+ dash <* updatePositions '-'
+ orgEllipses = do
+ guard =<< getExportSetting exportSpecialStrings
+ ellipses <* updatePositions '.'
+ orgApostrophe =
+ (char '\'' <|> char '\8217') <* updateLastPreCharPos
+ <* updateLastForbiddenCharPos
+ *> return (B.str "\x2019")
+
+singleQuoted :: OrgParser (F Inlines)
+singleQuoted = try $ do
+ guard =<< getExportSetting exportSmartQuotes
+ singleQuoteStart
+ updatePositions '\''
+ withQuoteContext InSingleQuote $
+ fmap B.singleQuoted . trimInlinesF . mconcat <$>
+ many1Till inline (singleQuoteEnd <* updatePositions '\'')
+
+-- doubleQuoted will handle regular double-quoted sections, as well
+-- as dialogues with an open double-quote without a close double-quote
+-- in the same paragraph.
+doubleQuoted :: OrgParser (F Inlines)
+doubleQuoted = try $ do
+ guard =<< getExportSetting exportSmartQuotes
+ doubleQuoteStart
+ updatePositions '"'
+ contents <- mconcat <$> many (try $ notFollowedBy doubleQuoteEnd >> inline)
+ (withQuoteContext InDoubleQuote $ (doubleQuoteEnd <* updateLastForbiddenCharPos) >> return
+ (fmap B.doubleQuoted . trimInlinesF $ contents))
+ <|> (return $ return (B.str "\8220") <> contents)
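
To make the classification done by `cleanLinkString` in the file above easier to follow, here is a standalone copy of it that loads without the rest of pandoc. The body is taken from the patch; only the layout and the removal of the `::String` annotations (needed there because of `OverloadedStrings`) differ. The sample inputs in the note below are illustrative assumptions, not test data from the commit.

```haskell
import Data.Char (isAlphaNum)
import Data.List (isPrefixOf)

-- | Standalone copy of 'cleanLinkString': canonicalize a link string, or
-- return Nothing if the string does not look like a link.
cleanLinkString :: String -> Maybe String
cleanLinkString s =
  case s of
    '/':_                  -> Just $ "file://" ++ s   -- absolute path
    '.':'/':_              -> Just s                  -- relative path
    '.':'.':'/':_          -> Just s                  -- relative path
    -- "file:" prefix: keep real file URLs, strip the prefix otherwise
    'f':'i':'l':'e':':':s' -> Just $ if "//" `isPrefixOf` s' then s else s'
    _ | isUrl s            -> Just s                  -- URL with a scheme
    _                      -> Nothing
  where
    isUrl :: String -> Bool
    isUrl cs =
      let (scheme, path) = break (== ':') cs
      in all (\c -> isAlphaNum c || c `elem` ".-") scheme
         && not (null path)
```

For instance, `cleanLinkString "file:notes.org"` gives `Just "notes.org"`, while a bare word such as `"target"` gives `Nothing`, which is the case `linkToInlinesF` hands to `internalLink`.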
diff --git a/src/Text/Pandoc/Readers/Org/ParserState.hs b/src/Text/Pandoc/Readers/Org/ParserState.hs
new file mode 100644
index 000000000..0c58183f9
--- /dev/null
+++ b/src/Text/Pandoc/Readers/Org/ParserState.hs
@@ -0,0 +1,237 @@
+{-# LANGUAGE FlexibleInstances #-}
+{-# LANGUAGE GeneralizedNewtypeDeriving #-}
+{-# LANGUAGE MultiParamTypeClasses #-}
+{-
+Copyright (C) 2014-2016 Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+-}
+
+{- |
+ Module : Text.Pandoc.Readers.Org.ParserState
+ Copyright : Copyright (C) 2014-2016 Albert Krewinkel
+ License : GNU GPL, version 2 or above
+
+ Maintainer : Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+Define the Org-mode parser state.
+-}
+module Text.Pandoc.Readers.Org.ParserState
+ ( OrgParserState (..)
+ , OrgParserLocal (..)
+ , OrgNoteRecord
+ , HasReaderOptions (..)
+ , HasQuoteContext (..)
+ , F(..)
+ , askF
+ , asksF
+ , trimInlinesF
+ , runF
+ , returnF
+ , ExportSettingSetter
+ , ExportSettings (..)
+ , setExportDrawers
+ , setExportEmphasizedText
+ , setExportSmartQuotes
+ , setExportSpecialStrings
+ , setExportSubSuperscripts
+ , modifyExportSettings
+ , optionsToParserState
+ ) where
+
+import Control.Monad (liftM, liftM2)
+import Control.Monad.Reader (Reader, runReader, ask, asks, local)
+
+import Data.Default (Default(..))
+import qualified Data.Map as M
+import qualified Data.Set as Set
+
+import Text.Pandoc.Builder ( Inlines, Blocks, trimInlines )
+import Text.Pandoc.Definition ( Meta(..), nullMeta )
+import Text.Pandoc.Options ( ReaderOptions(..) )
+import Text.Pandoc.Parsing ( HasHeaderMap(..)
+ , HasIdentifierList(..)
+ , HasLastStrPosition(..)
+ , HasQuoteContext(..)
+ , HasReaderOptions(..)
+ , ParserContext(..)
+ , QuoteContext(..)
+ , SourcePos )
+
+-- | An inline note / footnote containing the note key and its (inline) value.
+type OrgNoteRecord = (String, F Blocks)
+-- | Table of footnotes
+type OrgNoteTable = [OrgNoteRecord]
+-- | Map of functions for link transformations. The map key refers to the
+-- link-type, the corresponding function transforms the given link string.
+type OrgLinkFormatters = M.Map String (String -> String)
+
+-- | Export settings <http://orgmode.org/manual/Export-settings.html>
+-- These settings can be changed via OPTIONS statements.
+data ExportSettings = ExportSettings
+ { exportDrawers :: Either [String] [String]
+ -- ^ Specify drawer names which should be exported. @Left@ names are
+ -- explicitly excluded from the resulting output while @Right@ means that
+ -- only the listed drawer names should be included.
+ , exportEmphasizedText :: Bool -- ^ Parse emphasized text
+ , exportSmartQuotes :: Bool -- ^ Parse quotes smartly
+ , exportSpecialStrings :: Bool -- ^ Parse ellipses and dashes smartly
+ , exportSubSuperscripts :: Bool -- ^ TeX-like syntax for sub- and superscripts
+ }
+
+-- | Org-mode parser state
+data OrgParserState = OrgParserState
+ { orgStateAnchorIds :: [String]
+ , orgStateEmphasisCharStack :: [Char]
+ , orgStateEmphasisNewlines :: Maybe Int
+ , orgStateExportSettings :: ExportSettings
+ , orgStateHeaderMap :: M.Map Inlines String
+ , orgStateIdentifiers :: Set.Set String
+ , orgStateLastForbiddenCharPos :: Maybe SourcePos
+ , orgStateLastPreCharPos :: Maybe SourcePos
+ , orgStateLastStrPos :: Maybe SourcePos
+ , orgStateLinkFormatters :: OrgLinkFormatters
+ , orgStateMeta :: F Meta
+ , orgStateNotes' :: OrgNoteTable
+ , orgStateOptions :: ReaderOptions
+ , orgStateParserContext :: ParserContext
+ }
+
+data OrgParserLocal = OrgParserLocal { orgLocalQuoteContext :: QuoteContext }
+
+instance Default OrgParserLocal where
+ def = OrgParserLocal NoQuote
+
+instance HasReaderOptions OrgParserState where
+ extractReaderOptions = orgStateOptions
+
+instance HasLastStrPosition OrgParserState where
+ getLastStrPos = orgStateLastStrPos
+ setLastStrPos pos st = st{ orgStateLastStrPos = Just pos }
+
+instance HasQuoteContext st (Reader OrgParserLocal) where
+ getQuoteContext = asks orgLocalQuoteContext
+ withQuoteContext q = local (\s -> s{orgLocalQuoteContext = q})
+
+instance HasIdentifierList OrgParserState where
+ extractIdentifierList = orgStateIdentifiers
+ updateIdentifierList f s = s{ orgStateIdentifiers = f (orgStateIdentifiers s) }
+
+instance HasHeaderMap OrgParserState where
+ extractHeaderMap = orgStateHeaderMap
+ updateHeaderMap f s = s{ orgStateHeaderMap = f (orgStateHeaderMap s) }
+
+instance Default ExportSettings where
+ def = defaultExportSettings
+
+instance Default OrgParserState where
+ def = defaultOrgParserState
+
+defaultOrgParserState :: OrgParserState
+defaultOrgParserState = OrgParserState
+ { orgStateAnchorIds = []
+ , orgStateEmphasisCharStack = []
+ , orgStateEmphasisNewlines = Nothing
+ , orgStateExportSettings = def
+ , orgStateHeaderMap = M.empty
+ , orgStateIdentifiers = Set.empty
+ , orgStateLastForbiddenCharPos = Nothing
+ , orgStateLastPreCharPos = Nothing
+ , orgStateLastStrPos = Nothing
+ , orgStateLinkFormatters = M.empty
+ , orgStateMeta = return nullMeta
+ , orgStateNotes' = []
+ , orgStateOptions = def
+ , orgStateParserContext = NullState
+ }
+
+defaultExportSettings :: ExportSettings
+defaultExportSettings = ExportSettings
+ { exportDrawers = Left ["LOGBOOK"]
+ , exportEmphasizedText = True
+ , exportSmartQuotes = True
+ , exportSpecialStrings = True
+ , exportSubSuperscripts = True
+ }
+
+optionsToParserState :: ReaderOptions -> OrgParserState
+optionsToParserState opts =
+ def { orgStateOptions = opts }
+
+
+--
+-- Setter for exporting options
+--
+type ExportSettingSetter a = a -> ExportSettings -> ExportSettings
+
+-- | Set export options for drawers. See the @exportDrawers@ field of
+-- @ExportSettings@ for details.
+setExportDrawers :: ExportSettingSetter (Either [String] [String])
+setExportDrawers val es = es { exportDrawers = val }
+
+-- | Set export options for emphasis parsing.
+setExportEmphasizedText :: ExportSettingSetter Bool
+setExportEmphasizedText val es = es { exportEmphasizedText = val }
+
+-- | Set export options for parsing of smart quotes.
+setExportSmartQuotes :: ExportSettingSetter Bool
+setExportSmartQuotes val es = es { exportSmartQuotes = val }
+
+-- | Set export options for parsing of special strings (like em/en dashes or
+-- ellipses).
+setExportSpecialStrings :: ExportSettingSetter Bool
+setExportSpecialStrings val es = es { exportSpecialStrings = val }
+
+-- | Set export options for sub/superscript parsing. The short syntax will
+-- not be parsed if this is set to @False@.
+setExportSubSuperscripts :: ExportSettingSetter Bool
+setExportSubSuperscripts val es = es { exportSubSuperscripts = val }
+
+-- | Modify a parser state
+modifyExportSettings :: ExportSettingSetter a -> a -> OrgParserState -> OrgParserState
+modifyExportSettings setter val state =
+ state { orgStateExportSettings = setter val . orgStateExportSettings $ state }
+
+
+--
+-- Parser state reader
+--
+
+-- | Reader monad wrapping the parser state. This is used to delay evaluation
+-- until all relevant information has been parsed and made available in the
+-- parser state. See also the newtype of the same name in
+-- Text.Pandoc.Parsing.
+newtype F a = F { unF :: Reader OrgParserState a
+ } deriving (Functor, Applicative, Monad)
+
+instance Monoid a => Monoid (F a) where
+ mempty = return mempty
+ mappend = liftM2 mappend
+ mconcat = fmap mconcat . sequence
+
+runF :: F a -> OrgParserState -> a
+runF = runReader . unF
+
+askF :: F OrgParserState
+askF = F ask
+
+asksF :: (OrgParserState -> a) -> F a
+asksF f = F $ asks f
+
+trimInlinesF :: F Inlines -> F Inlines
+trimInlinesF = liftM trimInlines
+
+returnF :: Monad m => a -> m (F a)
+returnF = return . return
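
The delayed evaluation that `F` provides in the file above can be sketched without the `mtl` dependency. This is a simplified model under stated assumptions: `NoteState` and `resolveNote` are hypothetical names introduced here, notes are plain strings rather than `F Blocks`, and `F` is written as a bare function wrapper instead of a `Reader` newtype.

```haskell
import Data.Maybe (fromMaybe)

-- | Hypothetical, much reduced stand-in for OrgParserState.
newtype NoteState = NoteState { notes :: [(String, String)] }

-- | Simplified stand-in for F: a value that is computed only once the
-- final parser state is supplied.
newtype F a = F { runF :: NoteState -> a }

instance Functor F where
  fmap f (F g) = F (f . g)

instance Applicative F where
  pure = F . const
  F f <*> F g = F (\s -> f s (g s))

instance Monad F where
  F g >>= k = F (\s -> runF (k (g s)) s)

-- | Read a projection of the final state.
asksF :: (NoteState -> a) -> F a
asksF = F

-- | Resolve a footnote reference against the notes table. If the key is
-- unknown, fall back to the literal marker, as referencedNote does.
resolveNote :: String -> F String
resolveNote ref = do
  table <- asksF notes
  return $ fromMaybe ("[" ++ ref ++ "]") (lookup ref table)
```

Because `resolveNote` only builds a function of the state, a reference can be parsed before its definition; the lookup happens when `runF` is applied to the state collected over the whole document.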
diff --git a/src/Text/Pandoc/Readers/Org/Parsing.hs b/src/Text/Pandoc/Readers/Org/Parsing.hs
new file mode 100644
index 000000000..8cf0c696c
--- /dev/null
+++ b/src/Text/Pandoc/Readers/Org/Parsing.hs
@@ -0,0 +1,214 @@
+{-
+Copyright (C) 2014-2016 Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+-}
+
+{- |
+ Module : Text.Pandoc.Readers.Org.Parsing
+ Copyright : Copyright (C) 2014-2016 Albert Krewinkel
+ License : GNU GPL, version 2 or above
+
+ Maintainer : Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+Org-mode parsing utilities.
+
+Most functions are simply re-exports from @Text.Pandoc.Parsing@, some
+functions are adapted to Org-mode specific functionality.
+-}
+module Text.Pandoc.Readers.Org.Parsing
+ ( OrgParser
+ , anyLine
+ , blanklines
+ , newline
+ , parseFromString
+ , skipSpaces1
+ , inList
+ , withContext
+ , getExportSetting
+ , updateLastForbiddenCharPos
+ , updateLastPreCharPos
+ , orgArgKey
+ , orgArgWord
+ , orgArgWordChar
+ -- * Re-exports from Text.Pandoc.Parsing
+ , ParserContext (..)
+ , many1Till
+ , notFollowedBy'
+ , spaceChar
+ , nonspaceChar
+ , skipSpaces
+ , blankline
+ , enclosed
+ , stringAnyCase
+ , charsInBalanced
+ , uri
+ , withRaw
+ , readWithM
+ , guardEnabled
+ , updateLastStrPos
+ , notAfterString
+ , ParserState (..)
+ , registerHeader
+ , QuoteContext (..)
+ , singleQuoteStart
+ , singleQuoteEnd
+ , doubleQuoteStart
+ , doubleQuoteEnd
+ , dash
+ , ellipses
+ , citeKey
+ -- * Re-exports from Text.Parsec
+ , runParser
+ , getInput
+ , char
+ , letter
+ , digit
+ , alphaNum
+ , skipMany1
+ , spaces
+ , anyChar
+ , satisfy
+ , string
+ , count
+ , eof
+ , noneOf
+ , oneOf
+ , lookAhead
+ , notFollowedBy
+ , many
+ , many1
+ , manyTill
+ , (<|>)
+ , (<?>)
+ , choice
+ , try
+ , sepBy
+ , sepBy1
+ , option
+ , optional
+ , optionMaybe
+ , getState
+ , updateState
+ , SourcePos
+ , getPosition
+ ) where
+
+import Text.Pandoc.Readers.Org.ParserState
+
+import qualified Text.Pandoc.Parsing as P
+import Text.Pandoc.Parsing hiding ( anyLine, blanklines, newline
+ , parseFromString )
+
+import Control.Monad ( guard )
+import Control.Monad.Reader ( Reader )
+
+-- | The parser used to read org files.
+type OrgParser = ParserT [Char] OrgParserState (Reader OrgParserLocal)
+
+--
+-- Adaptions and specializations of parsing utilities
+--
+
+-- | Parse any line of text
+anyLine :: OrgParser String
+anyLine =
+ P.anyLine
+ <* updateLastPreCharPos
+ <* updateLastForbiddenCharPos
+
+-- The version in Text.Pandoc.Parsing cannot be used, as we need additional
+-- parts of the state saved and restored.
+parseFromString :: OrgParser a -> String -> OrgParser a
+parseFromString parser str' = do
+ oldLastPreCharPos <- orgStateLastPreCharPos <$> getState
+ updateState $ \s -> s{ orgStateLastPreCharPos = Nothing }
+ result <- P.parseFromString parser str'
+ updateState $ \s -> s{ orgStateLastPreCharPos = oldLastPreCharPos }
+ return result
+
+-- | Skip one or more tab or space characters.
+skipSpaces1 :: OrgParser ()
+skipSpaces1 = skipMany1 spaceChar
+
+-- | Like @Text.Parsec.Char.newline@, but causes additional state changes.
+newline :: OrgParser Char
+newline =
+ P.newline
+ <* updateLastPreCharPos
+ <* updateLastForbiddenCharPos
+
+-- | Like @Text.Pandoc.Parsing.blanklines@, but causes additional state changes.
+blanklines :: OrgParser [Char]
+blanklines =
+ P.blanklines
+ <* updateLastPreCharPos
+ <* updateLastForbiddenCharPos
+
+-- | Succeeds when we're in list context.
+inList :: OrgParser ()
+inList = do
+ ctx <- orgStateParserContext <$> getState
+ guard (ctx == ListItemState)
+
+-- | Run a parser in a different context.
+withContext :: ParserContext -- ^ New parser context
+ -> OrgParser a -- ^ Parser to run in that context
+ -> OrgParser a
+withContext context parser = do
+ oldContext <- orgStateParserContext <$> getState
+ updateState $ \s -> s{ orgStateParserContext = context }
+ result <- parser
+ updateState $ \s -> s{ orgStateParserContext = oldContext }
+ return result
+
+--
+-- Parser state functions
+--
+
+-- | Get an export setting.
+getExportSetting :: (ExportSettings -> a) -> OrgParser a
+getExportSetting s = s . orgStateExportSettings <$> getState
+
+-- | Set the current position as the last position at which a forbidden char
+-- was found (i.e. a character which is not allowed at the inner border of
+-- markup).
+updateLastForbiddenCharPos :: OrgParser ()
+updateLastForbiddenCharPos = getPosition >>= \p ->
+ updateState $ \s -> s{ orgStateLastForbiddenCharPos = Just p}
+
+-- | Set the current parser position as the position at which a character was
+-- seen which allows inline markup to follow.
+updateLastPreCharPos :: OrgParser ()
+updateLastPreCharPos = getPosition >>= \p ->
+ updateState $ \s -> s{ orgStateLastPreCharPos = Just p}
+
+--
+-- Org key-value parsing
+--
+
+-- | Read the key of a plist style key-value list.
+orgArgKey :: OrgParser String
+orgArgKey = try $
+ skipSpaces *> char ':'
+ *> many1 orgArgWordChar
+
+-- | Read the value of a plist style key-value list.
+orgArgWord :: OrgParser String
+orgArgWord = many1 orgArgWordChar
+
+-- | Chars treated as part of a word in plists.
+orgArgWordChar :: OrgParser Char
+orgArgWordChar = alphaNum <|> oneOf "-_"
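Review note: the plist helpers above (`orgArgKey`, `orgArgWord`, `orgArgWordChar`) can be tried in isolation. The sketch below swaps Parsec for `Text.ParserCombinators.ReadP` from `base` so that it runs without extra packages. `keyValue` and `parsePlist` are hypothetical helpers that are not in the patch, and ReadP's `skipSpaces` also skips newlines, unlike pandoc's `skipSpaces`.

```haskell
import Data.Char (isAlphaNum)
import Text.ParserCombinators.ReadP
  (ReadP, char, eof, many, munch1, readP_to_S, skipSpaces)

-- Characters treated as part of a word in plists (mirrors orgArgWordChar).
argWordChar :: Char -> Bool
argWordChar c = isAlphaNum c || c `elem` "-_"

-- A key is a colon followed by a word, e.g. ":exports".
argKey :: ReadP String
argKey = skipSpaces *> char ':' *> munch1 argWordChar

-- A value is a bare word.
argWord :: ReadP String
argWord = munch1 argWordChar

-- Hypothetical helper: one ":key value" pair.
keyValue :: ReadP (String, String)
keyValue = (,) <$> argKey <*> (skipSpaces *> argWord)

-- Hypothetical helper: parse a whole plist, failing on trailing garbage.
parsePlist :: String -> Maybe [(String, String)]
parsePlist s =
  case readP_to_S (many keyValue <* skipSpaces <* eof) s of
    [(kvs, "")] -> Just kvs
    _           -> Nothing
```

For instance, `parsePlist ":exports code :results output"` yields both pairs, while input lacking the leading colon is rejected.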
diff --git a/src/Text/Pandoc/Readers/Org/Shared.hs b/src/Text/Pandoc/Readers/Org/Shared.hs
new file mode 100644
index 000000000..3ba46b9e4
--- /dev/null
+++ b/src/Text/Pandoc/Readers/Org/Shared.hs
@@ -0,0 +1,76 @@
+{-# LANGUAGE OverloadedStrings #-}
+{-
+Copyright (C) 2014-2016 Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+-}
+
+{- |
+ Module : Text.Pandoc.Readers.Org.Shared
+ Copyright : Copyright (C) 2014-2016 Albert Krewinkel
+ License : GNU GPL, version 2 or above
+
+ Maintainer : Albert Krewinkel <tarleb+pandoc@moltkeplatz.de>
+
+Utility functions used in other Pandoc Org modules.
+-}
+module Text.Pandoc.Readers.Org.Shared
+ ( isImageFilename
+ , rundocBlockClass
+ , toRundocAttrib
+ , translateLang
+ ) where
+
+import Control.Arrow ( first )
+import Data.List ( isPrefixOf, isSuffixOf )
+
+
+-- | Check whether the given string looks like the path to or URL of an image.
+isImageFilename :: String -> Bool
+isImageFilename filename =
+ any (\x -> ('.':x) `isSuffixOf` filename) imageExtensions &&
+ (any (\x -> (x++":") `isPrefixOf` filename) protocols ||
+ ':' `notElem` filename)
+ where
+ imageExtensions = [ "jpeg" , "jpg" , "png" , "gif" , "svg" ]
+ protocols = [ "file", "http", "https" ]
+
+-- | Prefix used for Rundoc classes and arguments.
+rundocPrefix :: String
+rundocPrefix = "rundoc-"
+
+-- | The class-name used to mark rundoc blocks.
+rundocBlockClass :: String
+rundocBlockClass = rundocPrefix ++ "block"
+
+-- | Prefix the name of an attribute, marking it as a code execution parameter.
+toRundocAttrib :: (String, String) -> (String, String)
+toRundocAttrib = first (rundocPrefix ++)
+
+-- | Translate from Org-mode's programming language identifiers to those used
+-- by Pandoc. This is useful to allow for proper syntax highlighting in
+-- Pandoc output.
+translateLang :: String -> String
+translateLang cs =
+ case cs of
+ "C" -> "c"
+ "C++" -> "cpp"
+ "emacs-lisp" -> "commonlisp" -- emacs lisp is not supported
+ "js" -> "javascript"
+ "lisp" -> "commonlisp"
+ "R" -> "r"
+ "sh" -> "bash"
+ "sqlite" -> "sql"
+ _ -> cs
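Review note: `isImageFilename` combines an extension check with a protocol check, and the interaction is easy to misread. The block below repeats the definition from the hunk above, unchanged in logic, so the edge cases can be tried in GHCi. The sample file names are made up for illustration.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- Same logic as in the patch: a known image extension, and either a known
-- protocol prefix or no colon at all (i.e. a plain path).
isImageFilename :: String -> Bool
isImageFilename filename =
  any (\x -> ('.':x) `isSuffixOf` filename) imageExtensions &&
  (any (\x -> (x ++ ":") `isPrefixOf` filename) protocols ||
   ':' `notElem` filename)
  where
    imageExtensions = [ "jpeg", "jpg", "png", "gif", "svg" ]
    protocols       = [ "file", "http", "https" ]

-- Illustrative inputs (hypothetical names) paired with the result.
examples :: [(String, Bool)]
examples = map (\f -> (f, isImageFilename f))
  [ "lambdacat.jpg"               -- plain path with image extension
  , "http://example.com/cat.png"  -- known protocol
  , "mailto:cat.png"              -- unknown protocol is rejected
  , "CAT.PNG"                     -- the extension match is case-sensitive
  ]
```

Note that the check is case-sensitive as written, so upper-case extensions are not recognised as images.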
diff --git a/src/Text/Pandoc/Readers/RST.hs b/src/Text/Pandoc/Readers/RST.hs
index 7be0cd392..296c55f32 100644
--- a/src/Text/Pandoc/Readers/RST.hs
+++ b/src/Text/Pandoc/Readers/RST.hs
@@ -586,8 +586,9 @@ directive' = do
case trim top of
"" -> stateRstDefaultRole def
role -> role })
- "code" -> codeblock (lookup "number-lines" fields) (trim top) body
- "code-block" -> codeblock (lookup "number-lines" fields) (trim top) body
+ x | x == "code" || x == "code-block" ->
+ codeblock (words $ fromMaybe [] $ lookup "class" fields)
+ (lookup "number-lines" fields) (trim top) body
"aafig" -> do
let attribs = ("", ["aafig"], map (\(k,v) -> (k, trimr v)) fields)
return $ B.codeBlockWith attribs $ stripTrailingNewlines body
@@ -713,12 +714,13 @@ toChunks = dropWhile null
. map (trim . unlines)
. splitBy (all (`elem` (" \t" :: String))) . lines
-codeblock :: Maybe String -> String -> String -> RSTParser Blocks
-codeblock numberLines lang body =
+codeblock :: [String] -> Maybe String -> String -> String -> RSTParser Blocks
+codeblock classes numberLines lang body =
return $ B.codeBlockWith attribs $ stripTrailingNewlines body
- where attribs = ("", classes, kvs)
- classes = "sourceCode" : lang
+ where attribs = ("", classes', kvs)
+ classes' = "sourceCode" : lang
: maybe [] (\_ -> ["numberLines"]) numberLines
+ ++ classes
kvs = case numberLines of
Just "" -> []
Nothing -> []
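Review note: the RST change threads the `:class:` field into the class list of the code block. The sketch below isolates just the list construction so the resulting order is visible. `codeClasses` and `classesFromFields` are hypothetical names for illustration; the patch builds the same list inline in `codeblock` and `directive'`.

```haskell
import Data.Maybe (fromMaybe)

-- Extract user-supplied classes from the directive's fields, as the patch
-- does with: words $ fromMaybe [] $ lookup "class" fields
classesFromFields :: [(String, String)] -> [String]
classesFromFields fields = words $ fromMaybe [] $ lookup "class" fields

-- Assemble the final class list: fixed marker, language, optional
-- line-numbering marker, then the user-supplied classes.
codeClasses :: [String] -> Maybe String -> String -> [String]
codeClasses classes numberLines lang =
  "sourceCode" : lang
    : maybe [] (const ["numberLines"]) numberLines
    ++ classes
```

Because `:` and `++` share the same right-associative precedence, the user classes end up last, after `numberLines`.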
diff --git a/src/Text/Pandoc/Writers/Docbook.hs b/src/Text/Pandoc/Writers/Docbook.hs
index 2aaebf99f..9acfe289a 100644
--- a/src/Text/Pandoc/Writers/Docbook.hs
+++ b/src/Text/Pandoc/Writers/Docbook.hs
@@ -112,10 +112,15 @@ elementToDocbook opts lvl (Sec _ _num (id',_,_) title elements) =
else elements
tag = case lvl of
n | n == 0 -> "chapter"
- | n >= 1 && n <= 5 -> "sect" ++ show n
+ | n >= 1 && n <= 5 -> if writerDocbook5 opts
+ then "section"
+ else "sect" ++ show n
| otherwise -> "simplesect"
- in inTags True tag [("id", writerIdentifierPrefix opts ++ id') |
- not (null id')] $
+ idAttr = [("id", writerIdentifierPrefix opts ++ id') | not (null id')]
+ nsAttr = if writerDocbook5 opts && lvl == 0 then [("xmlns", "http://docbook.org/ns/docbook")]
+ else []
+ attribs = nsAttr ++ idAttr
+ in inTags True tag attribs $
inTagsSimple "title" (inlinesToDocbook opts title) $$
vcat (map (elementToDocbook opts (lvl + 1)) elements')
@@ -227,9 +232,11 @@ blockToDocbook opts (OrderedList (start, numstyle, _) (first:rest)) =
blockToDocbook opts (DefinitionList lst) =
let attribs = [("spacing", "compact") | isTightList $ concatMap snd lst]
in inTags True "variablelist" attribs $ deflistItemsToDocbook opts lst
-blockToDocbook _ (RawBlock f str)
+blockToDocbook opts (RawBlock f str)
| f == "docbook" = text str -- raw XML block
- | f == "html" = text str -- allow html for backwards compatibility
+ | f == "html" = if writerDocbook5 opts
+ then empty -- No html in Docbook5
+ else text str -- allow html for backwards compatibility
| otherwise = empty
blockToDocbook _ HorizontalRule = empty -- not semantic
blockToDocbook opts (Table caption aligns widths headers rows) =
@@ -344,7 +351,9 @@ inlineToDocbook opts (Link attr txt (src, _))
| otherwise =
(if isPrefixOf "#" src
then inTags False "link" $ ("linkend", drop 1 src) : idAndRole attr
- else inTags False "ulink" $ ("url", src) : idAndRole attr ) $
+ else if writerDocbook5 opts
+ then inTags False "link" $ ("xlink:href", src) : idAndRole attr
+ else inTags False "ulink" $ ("url", src) : idAndRole attr ) $
inlinesToDocbook opts txt
inlineToDocbook opts (Image attr _ (src, tit)) =
let titleDoc = if null tit
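Review note: the section-tag choice in `elementToDocbook` now depends on both the nesting level and the `writerDocbook5` option. A minimal sketch of that decision follows, with the option passed as a plain `Bool`. `sectionTag` is a hypothetical name; the patch computes the tag in a local `case` expression.

```haskell
-- Pick the DocBook element name for a section at the given level.
-- The first argument stands in for (writerDocbook5 opts).
sectionTag :: Bool -> Int -> String
sectionTag docbook5 lvl
  | lvl == 0             = "chapter"
  | lvl >= 1 && lvl <= 5 = if docbook5
                             then "section"           -- DocBook 5: generic, nestable
                             else "sect" ++ show lvl  -- DocBook 4: sect1 .. sect5
  | otherwise            = "simplesect"
```

Levels outside the range 0 to 5 fall back to `simplesect` in both modes, which matches the guard order in the hunk.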
diff --git a/src/Text/Pandoc/Writers/EPUB.hs b/src/Text/Pandoc/Writers/EPUB.hs
index 804dbb926..90f502f6f 100644
--- a/src/Text/Pandoc/Writers/EPUB.hs
+++ b/src/Text/Pandoc/Writers/EPUB.hs
@@ -667,7 +667,8 @@ writeEPUB opts doc@(Pandoc meta _) = do
]
]
else []
- let navData = renderHtml $ writeHtml opts'
+ let navData = renderHtml $ writeHtml
+ opts'{ writerVariables = ("navpage","true"):vars }
(Pandoc (setMeta "title"
(walk removeNote $ fromList $ docTitle' meta) nullMeta)
(navBlocks ++ landmarks))
diff --git a/src/Text/Pandoc/Writers/HTML.hs b/src/Text/Pandoc/Writers/HTML.hs
index c5b6a6db2..d8b8384e7 100644
--- a/src/Text/Pandoc/Writers/HTML.hs
+++ b/src/Text/Pandoc/Writers/HTML.hs
@@ -855,13 +855,12 @@ inlineToHtml opts inline =
(Note contents)
| writerIgnoreNotes opts -> return mempty
| otherwise -> do
- st <- get
- let notes = stNotes st
+ notes <- gets stNotes
let number = (length notes) + 1
let ref = show number
htmlContents <- blockListToNote opts ref contents
-- push contents onto front of notes
- put $ st {stNotes = (htmlContents:notes)}
+ modify $ \st -> st {stNotes = (htmlContents:notes)}
let revealSlash = ['/' | writerSlideVariant opts
== RevealJsSlides]
let link = H.a ! A.href (toValue $ "#" ++
diff --git a/src/Text/Pandoc/Writers/LaTeX.hs b/src/Text/Pandoc/Writers/LaTeX.hs
index dd5b14424..888c866a6 100644
--- a/src/Text/Pandoc/Writers/LaTeX.hs
+++ b/src/Text/Pandoc/Writers/LaTeX.hs
@@ -39,8 +39,10 @@ import Text.Pandoc.Templates
import Text.Printf ( printf )
import Network.URI ( isURI, unEscapeString )
import Data.Aeson (object, (.=), FromJSON)
-import Data.List ( (\\), isInfixOf, stripPrefix, intercalate, intersperse, nub, nubBy )
-import Data.Char ( toLower, isPunctuation, isAscii, isLetter, isDigit, ord )
+import Data.List ( (\\), isInfixOf, stripPrefix, intercalate, intersperse,
+ nub, nubBy, foldl' )
+import Data.Char ( toLower, isPunctuation, isAscii, isLetter, isDigit,
+ ord, isAlphaNum )
import Data.Maybe ( fromMaybe, isJust, catMaybes )
import qualified Data.Text as T
import Control.Applicative ((<|>))
@@ -223,7 +225,7 @@ pandocToLaTeX options (Pandoc meta blocks) = do
++ poly ++ "}{##2}}}\n"
else "\\newcommand{\\text" ++ poly ++ "}[2][]{\\foreignlanguage{"
++ babel ++ "}{#2}}\n" ++
- "\\newenvironment{" ++ poly ++ "}[1]{\\begin{otherlanguage}{"
+ "\\newenvironment{" ++ poly ++ "}[2][]{\\begin{otherlanguage}{"
++ babel ++ "}}{\\end{otherlanguage}}\n"
)
-- eliminate duplicates that have same polyglossia name
@@ -308,7 +310,7 @@ toLabel z = go `fmap` stringToLaTeX URLString z
where go [] = ""
go (x:xs)
| (isLetter x || isDigit x) && isAscii x = x:go xs
- | elem x ("-+=:;." :: String) = x:go xs
+ | elem x ("_-+=:;." :: String) = x:go xs
| otherwise = "ux" ++ printf "%x" (ord x) ++ go xs
-- | Puts contents into LaTeX command.
@@ -409,8 +411,6 @@ blockToLaTeX (Para [Image attr@(ident, _, _) txt (src,'f':'i':'g':':':tit)]) = d
capt <- inlineListToLaTeX txt
notes <- gets stNotes
modify $ \st -> st{ stInMinipage = False, stNotes = [] }
- ref <- text `fmap` toLabel ident
- internalLinks <- gets stInternalLinks
-- We can't have footnotes in the list of figures, so remove them:
captForLof <- if null notes
@@ -473,23 +473,27 @@ blockToLaTeX (CodeBlock (identifier,classes,keyvalAttr) str) = do
st <- get
let params = if writerListings (stOptions st)
then (case getListingsLanguage classes of
- Just l -> [ "language=" ++ l ]
+ Just l -> [ "language=" ++ mbBraced l ]
Nothing -> []) ++
[ "numbers=left" | "numberLines" `elem` classes
|| "number" `elem` classes
|| "number-lines" `elem` classes ] ++
[ (if key == "startFrom"
then "firstnumber"
- else key) ++ "=" ++ attr |
+ else key) ++ "=" ++ mbBraced attr |
(key,attr) <- keyvalAttr ] ++
(if identifier == ""
then []
else [ "label=" ++ ref ])
else []
+ mbBraced x = if not (all isAlphaNum x)
+ then "{" <> x <> "}"
+ else x
printParams
| null params = empty
- | otherwise = brackets $ hcat (intersperse ", " (map text params))
+ | otherwise = brackets $ hcat (intersperse ", "
+ (map text params))
return $ flush ("\\begin{lstlisting}" <> printParams $$ text str $$
"\\end{lstlisting}") $$ cr
let highlightedCodeBlock =
@@ -510,7 +514,8 @@ blockToLaTeX (RawBlock f x)
blockToLaTeX (BulletList []) = return empty -- otherwise latex error
blockToLaTeX (BulletList lst) = do
incremental <- gets stIncremental
- let inc = if incremental then "[<+->]" else ""
+ beamer <- writerBeamer `fmap` gets stOptions
+ let inc = if beamer && incremental then "[<+->]" else ""
items <- mapM listItemToLaTeX lst
let spacing = if isTightList lst
then text "\\tightlist"
@@ -670,8 +675,8 @@ tableCellToLaTeX header (width, align, blocks) = do
AlignDefault -> "\\raggedright"
return $ ("\\begin{minipage}" <> valign <>
braces (text (printf "%.2f\\columnwidth" width)) <>
- (halign <> "\\strut" <> cr <> cellContents <> cr) <>
- "\\strut\\end{minipage}") $$
+ (halign <> "\\strut" <> cr <> cellContents <> "\\strut" <> cr) <>
+ "\\end{minipage}") $$
notesToLaTeX notes
notesToLaTeX :: [Doc] -> Doc
@@ -722,7 +727,7 @@ sectionHeader :: Bool -- True for unnumbered
-> State WriterState Doc
sectionHeader unnumbered ident level lst = do
txt <- inlineListToLaTeX lst
- plain <- stringToLaTeX TextString $ foldl (++) "" $ map stringify lst
+ plain <- stringToLaTeX TextString $ concatMap stringify lst
let noNote (Note _) = Str ""
noNote x = x
let lstNoNotes = walk noNote lst
@@ -1034,7 +1039,7 @@ citationsToNatbib (c:cs) | citationMode c == AuthorInText = do
citationsToNatbib cits = do
cits' <- mapM convertOne cits
- return $ text "\\citetext{" <> foldl combineTwo empty cits' <> text "}"
+ return $ text "\\citetext{" <> foldl' combineTwo empty cits' <> text "}"
where
combineTwo a b | isEmpty a = b
| otherwise = a <> text "; " <> b
@@ -1083,7 +1088,7 @@ citationsToBiblatex (one:[])
citationsToBiblatex (c:cs) = do
args <- mapM convertOne (c:cs)
- return $ text cmd <> foldl (<>) empty args
+ return $ text cmd <> foldl' (<>) empty args
where
cmd = case citationMode c of
AuthorInText -> "\\textcites"
@@ -1127,7 +1132,7 @@ toPolyglossiaEnv l =
-- Takes a list of the constituents of a BCP 47 language code and
-- converts it to a Polyglossia (language, options) tuple
--- http://mirrors.concertpass.com/tex-archive/macros/latex/contrib/polyglossia/polyglossia.pdf
+-- http://mirrors.ctan.org/macros/latex/contrib/polyglossia/polyglossia.pdf
toPolyglossia :: [String] -> (String, String)
toPolyglossia ("ar":"DZ":_) = ("arabic", "locale=algeria")
toPolyglossia ("ar":"IQ":_) = ("arabic", "locale=mashriq")
@@ -1155,17 +1160,21 @@ toPolyglossia ("en":"UK":_) = ("english", "variant=british")
toPolyglossia ("en":"US":_) = ("english", "variant=american")
toPolyglossia ("grc":_) = ("greek", "variant=ancient")
toPolyglossia ("hsb":_) = ("usorbian", "")
+toPolyglossia ("la":"x":"classic":_) = ("latin", "variant=classic")
toPolyglossia ("sl":_) = ("slovenian", "")
toPolyglossia x = (commonFromBcp47 x, "")
-- Takes a list of the constituents of a BCP 47 language code and
-- converts it to a Babel language string.
--- http://mirrors.concertpass.com/tex-archive/macros/latex/required/babel/base/babel.pdf
--- Note that the PDF unfortunately does not contain a complete list of supported languages.
+-- http://mirrors.ctan.org/macros/latex/required/babel/base/babel.pdf
+-- List of supported languages (slightly outdated):
+-- http://tug.ctan.org/language/hyph-utf8/doc/generic/hyph-utf8/hyphenation.pdf
toBabel :: [String] -> String
toBabel ("de":"1901":_) = "german"
toBabel ("de":"AT":"1901":_) = "austrian"
toBabel ("de":"AT":_) = "naustrian"
+toBabel ("de":"CH":"1901":_) = "swissgerman"
+toBabel ("de":"CH":_) = "nswissgerman"
toBabel ("de":_) = "ngerman"
toBabel ("dsb":_) = "lowersorbian"
toBabel ("el":"polyton":_) = "polutonikogreek"
@@ -1179,6 +1188,7 @@ toBabel ("fr":"CA":_) = "canadien"
toBabel ("fra":"aca":_) = "acadian"
toBabel ("grc":_) = "polutonikogreek"
toBabel ("hsb":_) = "uppersorbian"
+toBabel ("la":"x":"classic":_) = "classiclatin"
toBabel ("sl":_) = "slovene"
toBabel x = commonFromBcp47 x
@@ -1187,12 +1197,17 @@ toBabel x = commonFromBcp47 x
-- https://tools.ietf.org/html/bcp47#section-2.1
commonFromBcp47 :: [String] -> String
commonFromBcp47 [] = ""
-commonFromBcp47 ("pt":"BR":_) = "brazilian"
+commonFromBcp47 ("pt":"BR":_) = "brazil"
+-- Note: documentation says "brazilian" works too, but it doesn't seem to work
+-- on some systems. See #2953.
+commonFromBcp47 ("sr":"Cyrl":_) = "serbianc"
+commonFromBcp47 ("zh":"Latn":"pinyin":_) = "pinyin"
commonFromBcp47 x = fromIso $ head x
where
fromIso "af" = "afrikaans"
fromIso "am" = "amharic"
fromIso "ar" = "arabic"
+ fromIso "as" = "assamese"
fromIso "ast" = "asturian"
fromIso "bg" = "bulgarian"
fromIso "bn" = "bengali"
@@ -1216,12 +1231,13 @@ commonFromBcp47 x = fromIso $ head x
fromIso "fur" = "friulan"
fromIso "ga" = "irish"
fromIso "gd" = "scottish"
+ fromIso "gez" = "ethiopic"
fromIso "gl" = "galician"
fromIso "he" = "hebrew"
fromIso "hi" = "hindi"
fromIso "hr" = "croatian"
- fromIso "hy" = "armenian"
fromIso "hu" = "magyar"
+ fromIso "hy" = "armenian"
fromIso "ia" = "interlingua"
fromIso "id" = "indonesian"
fromIso "ie" = "interlingua"
@@ -1229,6 +1245,7 @@ commonFromBcp47 x = fromIso $ head x
fromIso "it" = "italian"
fromIso "jp" = "japanese"
fromIso "km" = "khmer"
+ fromIso "kmr" = "kurmanji"
fromIso "kn" = "kannada"
fromIso "ko" = "korean"
fromIso "la" = "latin"
@@ -1244,6 +1261,7 @@ commonFromBcp47 x = fromIso $ head x
fromIso "no" = "norsk"
fromIso "nqo" = "nko"
fromIso "oc" = "occitan"
+ fromIso "pa" = "panjabi"
fromIso "pl" = "polish"
fromIso "pms" = "piedmontese"
fromIso "pt" = "portuguese"
@@ -1260,6 +1278,7 @@ commonFromBcp47 x = fromIso $ head x
fromIso "ta" = "tamil"
fromIso "te" = "telugu"
fromIso "th" = "thai"
+ fromIso "ti" = "ethiopic"
fromIso "tk" = "turkmen"
fromIso "tr" = "turkish"
fromIso "uk" = "ukrainian"
@@ -1290,4 +1309,3 @@ pDocumentClass =
else do P.skipMany (P.satisfy (/='{'))
P.char '{'
P.manyTill P.letter (P.char '}')
-
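Review note: the new `mbBraced` helper protects `listings` option values that contain anything other than alphanumeric characters. The sketch below reproduces it on plain strings (using `++` instead of the `Doc` combinators) together with a simplified version of the parameter rendering. `listingsParams` is a hypothetical name and covers only the key/value attributes, not the language, numbering, or label options.

```haskell
import Data.Char (isAlphaNum)
import Data.List (intercalate)

-- Wrap a value in braces unless it is purely alphanumeric (as in the patch).
mbBraced :: String -> String
mbBraced x = if not (all isAlphaNum x)
                then "{" ++ x ++ "}"
                else x

-- Simplified rendering of key/value attributes as a listings option list.
listingsParams :: [(String, String)] -> String
listingsParams kvs
  | null kvs  = ""
  | otherwise =
      "[" ++ intercalate ", "
               [ rename k ++ "=" ++ mbBraced v | (k, v) <- kvs ] ++ "]"
  where
    -- pandoc's startFrom attribute maps to the listings key firstnumber
    rename k = if k == "startFrom" then "firstnumber" else k
```

A value such as `a, b` would otherwise be split at the comma by the option parser, which is presumably the case the braces guard against.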
diff --git a/src/Text/Pandoc/Writers/Org.hs b/src/Text/Pandoc/Writers/Org.hs
index 20086ed19..f87aeca81 100644
--- a/src/Text/Pandoc/Writers/Org.hs
+++ b/src/Text/Pandoc/Writers/Org.hs
@@ -110,6 +110,17 @@ isRawFormat f =
blockToOrg :: Block -- ^ Block element
-> State WriterState Doc
blockToOrg Null = return empty
+blockToOrg (Div (_,classes@(cls:_),kvs) bs) | "drawer" `elem` classes = do
+ contents <- blockListToOrg bs
+ let drawerNameTag = ":" <> text cls <> ":"
+ let keys = vcat $ map (\(k,v) ->
+ ":" <> text k <> ":"
+ <> space <> text v) kvs
+ let drawerEndTag = text ":END:"
+ return $ drawerNameTag $$ cr $$ keys $$
+ blankline $$ contents $$
+ blankline $$ drawerEndTag $$
+ blankline
blockToOrg (Div attrs bs) = do
contents <- blockListToOrg bs
let startTag = tagWithAttrs "div" attrs
@@ -137,10 +148,13 @@ blockToOrg (RawBlock f str) | isRawFormat f =
return $ text str
blockToOrg (RawBlock _ _) = return empty
blockToOrg HorizontalRule = return $ blankline $$ "--------------" $$ blankline
-blockToOrg (Header level _ inlines) = do
+blockToOrg (Header level attr inlines) = do
contents <- inlineListToOrg inlines
let headerStr = text $ if level > 999 then " " else replicate level '*'
- return $ headerStr <> " " <> contents <> blankline
+ let drawerStr = if attr == nullAttr
+ then empty
+ else cr <> nest (level + 1) (propertiesDrawer attr)
+ return $ headerStr <> " " <> contents <> drawerStr <> blankline
blockToOrg (CodeBlock (_,classes,_) str) = do
opts <- stOptions <$> get
let tabstop = writerTabStop opts
@@ -170,7 +184,7 @@ blockToOrg (Table caption' _ _ headers rows) = do
map ((+2) . numChars) $ transpose (headers' : rawRows)
-- FIXME: Org doesn't allow blocks with height more than 1.
let hpipeBlocks blocks = hcat [beg, middle, end]
- where h = maximum (map height blocks)
+ where h = maximum (1 : map height blocks)
sep' = lblock 3 $ vcat (map text $ replicate h " | ")
beg = lblock 2 $ vcat (map text $ replicate h "| ")
end = lblock 2 $ vcat (map text $ replicate h " |")
@@ -230,6 +244,22 @@ definitionListItemToOrg (label, defs) = do
contents <- liftM vcat $ mapM blockListToOrg defs
return $ hang 3 "- " $ label' <> " :: " <> (contents <> cr)
+-- | Convert list of key/value pairs to Org :PROPERTIES: drawer.
+propertiesDrawer :: Attr -> Doc
+propertiesDrawer (ident, classes, kv) =
+ let
+ drawerStart = text ":PROPERTIES:"
+ drawerEnd = text ":END:"
+ kv' = if (classes == mempty) then kv else ("CLASS", unwords classes):kv
+ kv'' = if (ident == mempty) then kv' else ("CUSTOM_ID", ident):kv'
+ properties = vcat $ map kvToOrgProperty kv''
+ in
+ drawerStart <> cr <> properties <> cr <> drawerEnd
+ where
+ kvToOrgProperty :: (String, String) -> Doc
+ kvToOrgProperty (key, value) =
+ text ":" <> text key <> text ": " <> text value <> cr
+
-- | Convert list of Pandoc block elements to Org.
blockListToOrg :: [Block] -- ^ List of block elements
-> State WriterState Doc
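Review note: `propertiesDrawer` prepends `CLASS` and then `CUSTOM_ID` to the key/value list, which fixes the order of the emitted properties. The sketch below models the drawer as a list of lines rather than a `Doc`, so layout details such as `cr` and `nest` are ignored. `propertiesLines` is a hypothetical name, and `Attr` is redefined locally to keep the block self-contained.

```haskell
-- Local copy of pandoc's attribute triple: (identifier, classes, key/values).
type Attr = (String, [String], [(String, String)])

-- Lines of a :PROPERTIES: drawer for the given attributes.
propertiesLines :: Attr -> [String]
propertiesLines (ident, classes, kv) =
  [":PROPERTIES:"] ++ map render kv'' ++ [":END:"]
  where
    -- Classes are stored under CLASS, the identifier under CUSTOM_ID;
    -- both are skipped when empty.
    kv'  = if null classes then kv  else ("CLASS", unwords classes) : kv
    kv'' = if null ident   then kv' else ("CUSTOM_ID", ident) : kv'
    render (key, value) = ":" ++ key ++ ": " ++ value
```

With a non-empty identifier and class list, `CUSTOM_ID` comes first, then `CLASS`, then the remaining pairs in their original order.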
diff --git a/stack.yaml b/stack.yaml
index e25ad9b07..a8a71f47e 100644
--- a/stack.yaml
+++ b/stack.yaml
@@ -7,9 +7,7 @@ flags:
network-uri: true
packages:
- '.'
-extra-deps: []
-# to compile against aeson 0.11.0.0:
-# - 'aeson-0.11.0.0'
-# - 'fail-4.9.0.0'
-# - 'pandoc-types-1.16.1'
-resolver: lts-5.8
+extra-deps:
+- data-default-0.6.0
+- data-default-instances-base-0.1.0
+resolver: lts-6.1
diff --git a/tests/Tests/Old.hs b/tests/Tests/Old.hs
index 36bb3398e..4e0eb46a4 100644
--- a/tests/Tests/Old.hs
+++ b/tests/Tests/Old.hs
@@ -57,7 +57,7 @@ tests = [ testGroup "markdown"
"tables.txt" "tables.native"
, test "pipe tables" ["-r", "markdown", "-w", "native", "--columns=80"]
"pipe-tables.txt" "pipe-tables.native"
- , test "more" ["-r", "markdown", "-w", "native", "-S"]
+ , test "more" ["-r", "markdown", "-w", "native", "-s", "-S"]
"markdown-reader-more.txt" "markdown-reader-more.native"
, lhsReaderTest "markdown+lhs"
]
@@ -108,6 +108,9 @@ tests = [ testGroup "markdown"
, test "reader" ["-r", "docbook", "-w", "native", "-s"]
"docbook-xref.docbook" "docbook-xref.native"
]
+ , testGroup "docbook5"
+ [ testGroup "writer" $ writerTests "docbook5"
+ ]
, testGroup "native"
[ testGroup "writer" $ writerTests "native"
, test "reader" ["-r", "native", "-w", "native", "-s"]
diff --git a/tests/Tests/Readers/Docx.hs b/tests/Tests/Readers/Docx.hs
index e09d56529..aeb6bf939 100644
--- a/tests/Tests/Readers/Docx.hs
+++ b/tests/Tests/Readers/Docx.hs
@@ -266,6 +266,18 @@ tests = [ testGroup "inlines"
"keep deletion (all)"
"docx/track_changes_deletion.docx"
"docx/track_changes_deletion_all.native"
+ , testCompareWithOpts def{readerTrackChanges=AcceptChanges}
+ "move text (accept)"
+ "docx/track_changes_move.docx"
+ "docx/track_changes_move_accept.native"
+ , testCompareWithOpts def{readerTrackChanges=RejectChanges}
+ "move text (reject)"
+ "docx/track_changes_move.docx"
+ "docx/track_changes_move_reject.native"
+ , testCompareWithOpts def{readerTrackChanges=AllChanges}
+ "move text (all)"
+ "docx/track_changes_move.docx"
+ "docx/track_changes_move_all.native"
]
, testGroup "media"
[ testMediaBag
diff --git a/tests/Tests/Readers/Org.hs b/tests/Tests/Readers/Org.hs
index b095ac60a..9bd999b01 100644
--- a/tests/Tests/Readers/Org.hs
+++ b/tests/Tests/Readers/Org.hs
@@ -300,6 +300,42 @@ tests =
, citationHash = 0}
in (para $ cite [citation] "[see @item1 p. 34-35]")
+ , "Org-ref simple citation" =:
+ "cite:pandoc" =?>
+ let citation = Citation
+ { citationId = "pandoc"
+ , citationPrefix = mempty
+ , citationSuffix = mempty
+ , citationMode = AuthorInText
+ , citationNoteNum = 0
+ , citationHash = 0
+ }
+ in (para $ cite [citation] "cite:pandoc")
+
+ , "Org-ref simple citep citation" =:
+ "citep:pandoc" =?>
+ let citation = Citation
+ { citationId = "pandoc"
+ , citationPrefix = mempty
+ , citationSuffix = mempty
+ , citationMode = NormalCitation
+ , citationNoteNum = 0
+ , citationHash = 0
+ }
+ in (para $ cite [citation] "citep:pandoc")
+
+ , "Org-ref extended citation" =:
+ "[[citep:Dominik201408][See page 20::, for example]]" =?>
+ let citation = Citation
+ { citationId = "Dominik201408"
+ , citationPrefix = toList "See page 20"
+ , citationSuffix = toList ", for example"
+ , citationMode = NormalCitation
+ , citationNoteNum = 0
+ , citationHash = 0
+ }
+ in (para $ cite [citation] "[[citep:Dominik201408][See page 20::, for example]]")
+
, "Inline LaTeX symbol" =:
"\\dots" =?>
para "…"
@@ -308,6 +344,10 @@ tests =
"\\textit{Emphasised}" =?>
para (emph "Emphasised")
+ , "Inline LaTeX command with spaces" =:
+ "\\emph{Emphasis mine}" =?>
+ para (emph "Emphasis mine")
+
, "Inline LaTeX math symbol" =:
"\\tau" =?>
para (emph "τ")
@@ -328,6 +368,10 @@ tests =
"\\copy" =?>
para "©"
+ , "MathML symbols, space separated" =:
+ "\\ForAll \\Auml" =?>
+ para "∀ Ä"
+
, "LaTeX citation" =:
"\\cite{Coffee}" =?>
let citation = Citation
@@ -404,17 +448,18 @@ tests =
] =?>
para "Before" <> para "After"
- , "Drawer start is the only text in first line of a drawer" =:
+ , "Drawer markers must be the only text in the line" =:
unlines [ " :LOGBOOK: foo"
- , " :END:"
+ , " :END: bar"
] =?>
- para (":LOGBOOK:" <> space <> "foo" <> softbreak <> ":END:")
+ para (":LOGBOOK: foo" <> softbreak <> ":END: bar")
- , "Drawers with unknown names are just text" =:
+ , "Drawers can be arbitrary" =:
unlines [ ":FOO:"
+ , "/bar/"
, ":END:"
] =?>
- para (":FOO:" <> softbreak <> ":END:")
+ divWith (mempty, ["FOO", "drawer"], mempty) (para $ emph "bar")
, "Anchor reference" =:
unlines [ "<<link-here>> Target."
@@ -461,6 +506,34 @@ tests =
, "[[expl:foo][bar]]"
] =?>
(para (link "http://example.com/foo" "" "bar"))
+
+ , "Export option: Disable simple sub/superscript syntax" =:
+ unlines [ "#+OPTIONS: ^:nil"
+ , "a^b"
+ ] =?>
+ para "a^b"
+
+ , "Export option: directly select drawers to be exported" =:
+ unlines [ "#+OPTIONS: d:(\"IMPORTANT\")"
+ , ":IMPORTANT:"
+ , "23"
+ , ":END:"
+ , ":BORING:"
+ , "very boring"
+ , ":END:"
+ ] =?>
+ divWith (mempty, ["IMPORTANT", "drawer"], mempty) (para "23")
+
+ , "Export option: exclude drawers from being exported" =:
+ unlines [ "#+OPTIONS: d:(not \"BORING\")"
+ , ":IMPORTANT:"
+ , "5"
+ , ":END:"
+ , ":BORING:"
+ , "very boring"
+ , ":END:"
+ ] =?>
+ divWith (mempty, ["IMPORTANT", "drawer"], mempty) (para "5")
]
, testGroup "Basic Blocks" $
@@ -583,6 +656,15 @@ tests =
, headerWith ("but-this-is", [], []) 2 "But this is"
]
+ , "Properties are treated as header attributes" =:
+ unlines [ "* foo"
+ , " :PROPERTIES:"
+ , " :custom_id: fubar"
+ , " :bar: baz"
+ , " :END:"
+ ] =?>
+ headerWith ("fubar", [], [("bar", "baz")]) 1 "foo"
+
, "Paragraph starting with an asterisk" =:
"*five" =?>
para "*five"
@@ -653,6 +735,17 @@ tests =
para (image "the-red-queen.jpg" "fig:redqueen"
"Used as a metapher in evolutionary biology.")
+ , "Figure with HTML attributes" =:
+ unlines [ "#+CAPTION: mah brain just explodid"
+ , "#+NAME: lambdacat"
+ , "#+ATTR_HTML: :style color: blue :role button"
+ , "[[lambdacat.jpg]]"
+ ] =?>
+ let kv = [("style", "color: blue"), ("role", "button")]
+ name = "fig:lambdacat"
+ caption = "mah brain just explodid"
+ in para (imageWith (mempty, mempty, kv) "lambdacat.jpg" name caption)
+
, "Footnote" =:
unlines [ "A footnote[1]"
, ""
@@ -941,7 +1034,7 @@ tests =
, "Empty table" =:
"||" =?>
- simpleTable' 1 mempty mempty
+ simpleTable' 1 mempty [[mempty]]
, "Glider Table" =:
unlines [ "| 1 | 0 | 0 |"
@@ -996,6 +1089,17 @@ tests =
, [ plain "dynamic", plain "Lisp" ]
]
+ , "Table with empty cells" =:
+ "|||c|" =?>
+ simpleTable' 3 mempty [[mempty, mempty, plain "c"]]
+
+ , "Table with empty rows" =:
+ unlines [ "| first |"
+ , "| |"
+ , "| third |"
+ ] =?>
+ simpleTable' 1 mempty [[plain "first"], [mempty], [plain "third"]]
+
, "Table with alignment row" =:
unlines [ "| Numbers | Text | More |"
, "| <c> | <r> | |"
@@ -1024,10 +1128,10 @@ tests =
, "| 1 | One | foo |"
, "| 2"
] =?>
- table "" (zip [AlignCenter, AlignRight, AlignDefault] [0, 0, 0])
- [ plain "Numbers", plain "Text" , plain mempty ]
- [ [ plain "1" , plain "One" , plain "foo" ]
- , [ plain "2" , plain mempty , plain mempty ]
+ table "" (zip [AlignCenter, AlignRight] [0, 0])
+ [ plain "Numbers", plain "Text" ]
+ [ [ plain "1" , plain "One" , plain "foo" ]
+ , [ plain "2" ]
]
, "Table with caption" =:
@@ -1054,6 +1158,33 @@ tests =
" where greeting = \"moin\"\n"
in codeBlockWith attr' code'
+ , "Source block with indented code" =:
+ unlines [ " #+BEGIN_SRC haskell"
+ , " main = putStrLn greeting"
+ , " where greeting = \"moin\""
+ , " #+END_SRC" ] =?>
+ let attr' = ("", ["haskell"], [])
+ code' = "main = putStrLn greeting\n" ++
+ " where greeting = \"moin\"\n"
+ in codeBlockWith attr' code'
+
+ , "Source block with tab-indented code" =:
+ unlines [ "\t#+BEGIN_SRC haskell"
+ , "\tmain = putStrLn greeting"
+ , "\t where greeting = \"moin\""
+ , "\t#+END_SRC" ] =?>
+ let attr' = ("", ["haskell"], [])
+ code' = "main = putStrLn greeting\n" ++
+ " where greeting = \"moin\"\n"
+ in codeBlockWith attr' code'
+
+ , "Empty source block" =:
+ unlines [ " #+BEGIN_SRC haskell"
+ , " #+END_SRC" ] =?>
+ let attr' = ("", ["haskell"], [])
+ code' = ""
+ in codeBlockWith attr' code'
+
, "Source block between paragraphs" =:
unlines [ "Low German greeting"
, " #+BEGIN_SRC haskell"
@@ -1198,7 +1329,7 @@ tests =
]
]
- , "Verse block with newlines" =:
+ , "Verse block with blank lines" =:
unlines [ "#+BEGIN_VERSE"
, "foo"
, ""
@@ -1207,6 +1338,20 @@ tests =
] =?>
para ("foo" <> linebreak <> linebreak <> "bar")
+ , "Raw block LaTeX" =:
+ unlines [ "#+BEGIN_LaTeX"
+ , "The category $\\cat{Set}$ is adhesive."
+ , "#+END_LaTeX"
+ ] =?>
+ rawBlock "latex" "The category $\\cat{Set}$ is adhesive.\n"
+
+ , "Export block HTML" =:
+ unlines [ "#+BEGIN_export html"
+ , "<samp>Hello, World!</samp>"
+ , "#+END_export"
+ ] =?>
+ rawBlock "html" "<samp>Hello, World!</samp>\n"
+
, "LaTeX fragment" =:
unlines [ "\\begin{equation}"
, "X_i = \\begin{cases}"
diff --git a/tests/Tests/Readers/RST.hs b/tests/Tests/Readers/RST.hs
index ea85a5929..622f5e48b 100644
--- a/tests/Tests/Readers/RST.hs
+++ b/tests/Tests/Readers/RST.hs
@@ -94,6 +94,35 @@ tests = [ "line block with blank line" =:
("A-1-B_2_C:3:D+4+E.5.F_\n\n" ++
".. _A-1-B_2_C:3:D+4+E.5.F: https://example.com\n") =?>
para (link "https://example.com" "" "A-1-B_2_C:3:D+4+E.5.F")
+ , "Code directive with class and number-lines" =: unlines
+ [ ".. code::python"
+ , " :number-lines: 34"
+ , " :class: class1 class2 class3"
+ , ""
+ , " def func(x):"
+ , " return y"
+ ] =?>
+ ( doc $ codeBlockWith
+ ( ""
+ , ["sourceCode", "python", "numberLines", "class1", "class2", "class3"]
+ , [ ("startFrom", "34") ]
+ )
+ "def func(x):\n return y"
+ )
+ , "Code directive with number-lines, no line specified" =: unlines
+ [ ".. code::python"
+ , " :number-lines: "
+ , ""
+ , " def func(x):"
+ , " return y"
+ ] =?>
+ ( doc $ codeBlockWith
+ ( ""
+ , ["sourceCode", "python", "numberLines"]
+ , [ ("startFrom", "") ]
+ )
+ "def func(x):\n return y"
+ )
, testGroup "literal / line / code blocks"
[ "indented literal block" =: unlines
[ "::"
diff --git a/tests/docx/track_changes_move.docx b/tests/docx/track_changes_move.docx
new file mode 100644
index 000000000..b70779fd4
--- /dev/null
+++ b/tests/docx/track_changes_move.docx
Binary files differ
diff --git a/tests/docx/track_changes_move_accept.native b/tests/docx/track_changes_move_accept.native
new file mode 100644
index 000000000..0cf276768
--- /dev/null
+++ b/tests/docx/track_changes_move_accept.native
@@ -0,0 +1,3 @@
+[Para [Str "Here",Space,Str "is",Space,Str "some",Space,Str "text."]
+,Para [Str "Here",Space,Str "is",Space,Str "the",Space,Str "text",Space,Str "to",Space,Str "be",Space,Str "moved."]
+,Para [Str "Here",Space,Str "is",Space,Str "some",Space,Str "more",Space,Str "text."]]
diff --git a/tests/docx/track_changes_move_all.native b/tests/docx/track_changes_move_all.native
new file mode 100644
index 000000000..3afae83a5
--- /dev/null
+++ b/tests/docx/track_changes_move_all.native
@@ -0,0 +1,4 @@
+[Para [Str "Here",Space,Str "is",Space,Str "some",Space,Str "text."]
+,Para [Span ("",["insertion"],[("author","Jesse Rosenthal"),("date","2016-04-16T08:20:00Z")]) [Str "Here",Space,Str "is",Space,Str "the",Space,Str "text",Space,Str "to",Space,Str "be",Space,Str "moved."]]
+,Para [Str "Here",Space,Str "is",Space,Str "some",Space,Str "more",Space,Str "text."]
+,Para [Span ("",["deletion"],[("author","Jesse Rosenthal"),("date","2016-04-16T08:20:00Z")]) [Str "Here",Space,Str "is",Space,Str "the",Space,Str "text",Space,Str "to",Space,Str "be",Space,Str "moved."]]]
diff --git a/tests/docx/track_changes_move_reject.native b/tests/docx/track_changes_move_reject.native
new file mode 100644
index 000000000..9c57871b6
--- /dev/null
+++ b/tests/docx/track_changes_move_reject.native
@@ -0,0 +1,3 @@
+[Para [Str "Here",Space,Str "is",Space,Str "some",Space,Str "text."]
+,Para [Str "Here",Space,Str "is",Space,Str "some",Space,Str "more",Space,Str "text."]
+,Para [Str "Here",Space,Str "is",Space,Str "the",Space,Str "text",Space,Str "to",Space,Str "be",Space,Str "moved."]]
diff --git a/tests/mallard-reader.native b/tests/mallard-reader.native
new file mode 100644
index 000000000..16274f00a
--- /dev/null
+++ b/tests/mallard-reader.native
@@ -0,0 +1,3 @@
+Pandoc (Meta {unMeta = fromList [("guide-group",MetaInlines [Str ""]),("guide-xref",MetaInlines [Str "index#intro"]),("title",MetaInlines [Str "Title"])]})
+[Header 1 ("introduction",[],[]) [Str "Title"]
+,Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "test."]]
diff --git a/tests/markdown-reader-more.native b/tests/markdown-reader-more.native
index 0148e9394..c38ffe038 100644
--- a/tests/markdown-reader-more.native
+++ b/tests/markdown-reader-more.native
@@ -1,5 +1,5 @@
-[Para [Str "spanning",Space,Str "multiple",Space,Str "lines",SoftBreak,Str "%",Space,Str "Author",Space,Str "One",SoftBreak,Str "Author",Space,Str "Two;",Space,Str "Author",Space,Str "Three;",SoftBreak,Str "Author",Space,Str "Four"]
-,Header 1 ("additional-markdown-reader-tests",[],[]) [Str "Additional",Space,Str "markdown",Space,Str "reader",Space,Str "tests"]
+Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "Author",Space,Str "One"],MetaInlines [Str "Author",Space,Str "Two"],MetaInlines [Str "Author",Space,Str "Three"],MetaInlines [Str "Author",Space,Str "Four"]]),("title",MetaInlines [Str "Title",SoftBreak,Str "spanning",Space,Str "multiple",Space,Str "lines"])]})
+[Header 1 ("additional-markdown-reader-tests",[],[]) [Str "Additional",Space,Str "markdown",Space,Str "reader",Space,Str "tests"]
,Header 2 ("blank-line-before-url-in-link-reference",[],[]) [Str "Blank",Space,Str "line",Space,Str "before",Space,Str "URL",Space,Str "in",Space,Str "link",Space,Str "reference"]
,Para [Link ("",[],[]) [Str "foo"] ("/url",""),Space,Str "and",Space,Link ("",[],[]) [Str "bar"] ("/url","title")]
,Header 2 ("raw-context-environments",[],[]) [Str "Raw",Space,Str "ConTeXt",Space,Str "environments"]
diff --git a/tests/mediawiki-reader.native b/tests/mediawiki-reader.native
index cf80d0664..6afeb602c 100644
--- a/tests/mediawiki-reader.native
+++ b/tests/mediawiki-reader.native
@@ -252,6 +252,11 @@ Pandoc (Meta {unMeta = fromList []})
[[]]
[[[Para [Str "Orange"]]]]
,Para [Str "Paragraph",Space,Str "after",Space,Str "the",Space,Str "table."]
+,Table [] [AlignDefault,AlignDefault] [0.0,0.0]
+ [[Para [Str "fruit"]]
+ ,[Para [Str "topping"]]]
+ [[[Para [Str "apple"]]
+ ,[Para [Str "ice",Space,Str "cream"]]]]
,Header 2 ("notes",[],[]) [Str "notes"]
,Para [Str "My",Space,Str "note!",Note [Plain [Str "This."]]]
,Para [Str "URL",Space,Str "note.",Note [Plain [Link ("",[],[]) [Str "http://docs.python.org/library/functions.html#range"] ("http://docs.python.org/library/functions.html#range","")]]]]
diff --git a/tests/mediawiki-reader.wiki b/tests/mediawiki-reader.wiki
index 862bb3b48..11cd52d9c 100644
--- a/tests/mediawiki-reader.wiki
+++ b/tests/mediawiki-reader.wiki
@@ -381,6 +381,14 @@ and cheese
|Orange
|}Paragraph after the table.
+{|
+ !fruit
+ !topping
+ |-
+ |apple
+ |ice cream
+ |}
+
== notes ==
My note!<ref>This.</ref>
diff --git a/tests/tables.docbook5 b/tests/tables.docbook5
new file mode 100644
index 000000000..6224cf222
--- /dev/null
+++ b/tests/tables.docbook5
@@ -0,0 +1,432 @@
+<para>
+ Simple table with caption:
+</para>
+<table>
+ <title>
+ Demonstration of simple table syntax.
+ </title>
+ <tgroup cols="4">
+ <colspec align="right" />
+ <colspec align="left" />
+ <colspec align="center" />
+ <colspec align="left" />
+ <thead>
+ <row>
+ <entry>
+ Right
+ </entry>
+ <entry>
+ Left
+ </entry>
+ <entry>
+ Center
+ </entry>
+ <entry>
+ Default
+ </entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ </row>
+ <row>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ </row>
+ <row>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+</table>
+<para>
+ Simple table without caption:
+</para>
+<informaltable>
+ <tgroup cols="4">
+ <colspec align="right" />
+ <colspec align="left" />
+ <colspec align="center" />
+ <colspec align="left" />
+ <thead>
+ <row>
+ <entry>
+ Right
+ </entry>
+ <entry>
+ Left
+ </entry>
+ <entry>
+ Center
+ </entry>
+ <entry>
+ Default
+ </entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ </row>
+ <row>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ </row>
+ <row>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+</informaltable>
+<para>
+ Simple table indented two spaces:
+</para>
+<table>
+ <title>
+ Demonstration of simple table syntax.
+ </title>
+ <tgroup cols="4">
+ <colspec align="right" />
+ <colspec align="left" />
+ <colspec align="center" />
+ <colspec align="left" />
+ <thead>
+ <row>
+ <entry>
+ Right
+ </entry>
+ <entry>
+ Left
+ </entry>
+ <entry>
+ Center
+ </entry>
+ <entry>
+ Default
+ </entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ </row>
+ <row>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ </row>
+ <row>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+</table>
+<para>
+ Multiline table with caption:
+</para>
+<table>
+ <title>
+ Here's the caption. It may span multiple lines.
+ </title>
+ <tgroup cols="4">
+ <colspec colwidth="15*" align="center" />
+ <colspec colwidth="13*" align="left" />
+ <colspec colwidth="16*" align="right" />
+ <colspec colwidth="33*" align="left" />
+ <thead>
+ <row>
+ <entry>
+ Centered Header
+ </entry>
+ <entry>
+ Left Aligned
+ </entry>
+ <entry>
+ Right Aligned
+ </entry>
+ <entry>
+ Default aligned
+ </entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ First
+ </entry>
+ <entry>
+ row
+ </entry>
+ <entry>
+ 12.0
+ </entry>
+ <entry>
+ Example of a row that spans multiple lines.
+ </entry>
+ </row>
+ <row>
+ <entry>
+ Second
+ </entry>
+ <entry>
+ row
+ </entry>
+ <entry>
+ 5.0
+ </entry>
+ <entry>
+ Here's another one. Note the blank line between rows.
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+</table>
+<para>
+ Multiline table without caption:
+</para>
+<informaltable>
+ <tgroup cols="4">
+ <colspec colwidth="15*" align="center" />
+ <colspec colwidth="13*" align="left" />
+ <colspec colwidth="16*" align="right" />
+ <colspec colwidth="33*" align="left" />
+ <thead>
+ <row>
+ <entry>
+ Centered Header
+ </entry>
+ <entry>
+ Left Aligned
+ </entry>
+ <entry>
+ Right Aligned
+ </entry>
+ <entry>
+ Default aligned
+ </entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ First
+ </entry>
+ <entry>
+ row
+ </entry>
+ <entry>
+ 12.0
+ </entry>
+ <entry>
+ Example of a row that spans multiple lines.
+ </entry>
+ </row>
+ <row>
+ <entry>
+ Second
+ </entry>
+ <entry>
+ row
+ </entry>
+ <entry>
+ 5.0
+ </entry>
+ <entry>
+ Here's another one. Note the blank line between rows.
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+</informaltable>
+<para>
+ Table without column headers:
+</para>
+<informaltable>
+ <tgroup cols="4">
+ <colspec align="right" />
+ <colspec align="left" />
+ <colspec align="center" />
+ <colspec align="right" />
+ <tbody>
+ <row>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ <entry>
+ 12
+ </entry>
+ </row>
+ <row>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ <entry>
+ 123
+ </entry>
+ </row>
+ <row>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ <entry>
+ 1
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+</informaltable>
+<para>
+ Multiline table without column headers:
+</para>
+<informaltable>
+ <tgroup cols="4">
+ <colspec colwidth="15*" align="center" />
+ <colspec colwidth="13*" align="left" />
+ <colspec colwidth="16*" align="right" />
+ <colspec colwidth="33*" align="left" />
+ <tbody>
+ <row>
+ <entry>
+ First
+ </entry>
+ <entry>
+ row
+ </entry>
+ <entry>
+ 12.0
+ </entry>
+ <entry>
+ Example of a row that spans multiple lines.
+ </entry>
+ </row>
+ <row>
+ <entry>
+ Second
+ </entry>
+ <entry>
+ row
+ </entry>
+ <entry>
+ 5.0
+ </entry>
+ <entry>
+ Here's another one. Note the blank line between rows.
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+</informaltable>
diff --git a/tests/tables.latex b/tests/tables.latex
index 96cbc9579..38d4d089e 100644
--- a/tests/tables.latex
+++ b/tests/tables.latex
@@ -53,46 +53,46 @@ Multiline table with caption:
\caption{Here's the caption. It may span multiple lines.}\tabularnewline
\toprule
\begin{minipage}[b]{0.13\columnwidth}\centering\strut
-Centered Header
-\strut\end{minipage} & \begin{minipage}[b]{0.12\columnwidth}\raggedright\strut
-Left Aligned
-\strut\end{minipage} & \begin{minipage}[b]{0.14\columnwidth}\raggedleft\strut
-Right Aligned
-\strut\end{minipage} & \begin{minipage}[b]{0.30\columnwidth}\raggedright\strut
-Default aligned
-\strut\end{minipage}\tabularnewline
+Centered Header\strut
+\end{minipage} & \begin{minipage}[b]{0.12\columnwidth}\raggedright\strut
+Left Aligned\strut
+\end{minipage} & \begin{minipage}[b]{0.14\columnwidth}\raggedleft\strut
+Right Aligned\strut
+\end{minipage} & \begin{minipage}[b]{0.30\columnwidth}\raggedright\strut
+Default aligned\strut
+\end{minipage}\tabularnewline
\midrule
\endfirsthead
\toprule
\begin{minipage}[b]{0.13\columnwidth}\centering\strut
-Centered Header
-\strut\end{minipage} & \begin{minipage}[b]{0.12\columnwidth}\raggedright\strut
-Left Aligned
-\strut\end{minipage} & \begin{minipage}[b]{0.14\columnwidth}\raggedleft\strut
-Right Aligned
-\strut\end{minipage} & \begin{minipage}[b]{0.30\columnwidth}\raggedright\strut
-Default aligned
-\strut\end{minipage}\tabularnewline
+Centered Header\strut
+\end{minipage} & \begin{minipage}[b]{0.12\columnwidth}\raggedright\strut
+Left Aligned\strut
+\end{minipage} & \begin{minipage}[b]{0.14\columnwidth}\raggedleft\strut
+Right Aligned\strut
+\end{minipage} & \begin{minipage}[b]{0.30\columnwidth}\raggedright\strut
+Default aligned\strut
+\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.13\columnwidth}\centering\strut
-First
-\strut\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
-row
-\strut\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
-12.0
-\strut\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
-Example of a row that spans multiple lines.
-\strut\end{minipage}\tabularnewline
+First\strut
+\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
+row\strut
+\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
+12.0\strut
+\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
+Example of a row that spans multiple lines.\strut
+\end{minipage}\tabularnewline
\begin{minipage}[t]{0.13\columnwidth}\centering\strut
-Second
-\strut\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
-row
-\strut\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
-5.0
-\strut\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
-Here's another one. Note the blank line between rows.
-\strut\end{minipage}\tabularnewline
+Second\strut
+\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
+row\strut
+\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
+5.0\strut
+\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
+Here's another one. Note the blank line between rows.\strut
+\end{minipage}\tabularnewline
\bottomrule
\end{longtable}
@@ -101,34 +101,34 @@ Multiline table without caption:
\begin{longtable}[]{@{}clrl@{}}
\toprule
\begin{minipage}[b]{0.13\columnwidth}\centering\strut
-Centered Header
-\strut\end{minipage} & \begin{minipage}[b]{0.12\columnwidth}\raggedright\strut
-Left Aligned
-\strut\end{minipage} & \begin{minipage}[b]{0.14\columnwidth}\raggedleft\strut
-Right Aligned
-\strut\end{minipage} & \begin{minipage}[b]{0.30\columnwidth}\raggedright\strut
-Default aligned
-\strut\end{minipage}\tabularnewline
+Centered Header\strut
+\end{minipage} & \begin{minipage}[b]{0.12\columnwidth}\raggedright\strut
+Left Aligned\strut
+\end{minipage} & \begin{minipage}[b]{0.14\columnwidth}\raggedleft\strut
+Right Aligned\strut
+\end{minipage} & \begin{minipage}[b]{0.30\columnwidth}\raggedright\strut
+Default aligned\strut
+\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.13\columnwidth}\centering\strut
-First
-\strut\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
-row
-\strut\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
-12.0
-\strut\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
-Example of a row that spans multiple lines.
-\strut\end{minipage}\tabularnewline
+First\strut
+\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
+row\strut
+\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
+12.0\strut
+\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
+Example of a row that spans multiple lines.\strut
+\end{minipage}\tabularnewline
\begin{minipage}[t]{0.13\columnwidth}\centering\strut
-Second
-\strut\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
-row
-\strut\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
-5.0
-\strut\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
-Here's another one. Note the blank line between rows.
-\strut\end{minipage}\tabularnewline
+Second\strut
+\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
+row\strut
+\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
+5.0\strut
+\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
+Here's another one. Note the blank line between rows.\strut
+\end{minipage}\tabularnewline
\bottomrule
\end{longtable}
@@ -147,22 +147,22 @@ Multiline table without column headers:
\begin{longtable}[]{@{}clrl@{}}
\toprule
\begin{minipage}[t]{0.13\columnwidth}\centering\strut
-First
-\strut\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
-row
-\strut\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
-12.0
-\strut\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
-Example of a row that spans multiple lines.
-\strut\end{minipage}\tabularnewline
+First\strut
+\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
+row\strut
+\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
+12.0\strut
+\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
+Example of a row that spans multiple lines.\strut
+\end{minipage}\tabularnewline
\begin{minipage}[t]{0.13\columnwidth}\centering\strut
-Second
-\strut\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
-row
-\strut\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
-5.0
-\strut\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
-Here's another one. Note the blank line between rows.
-\strut\end{minipage}\tabularnewline
+Second\strut
+\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright\strut
+row\strut
+\end{minipage} & \begin{minipage}[t]{0.14\columnwidth}\raggedleft\strut
+5.0\strut
+\end{minipage} & \begin{minipage}[t]{0.30\columnwidth}\raggedright\strut
+Here's another one. Note the blank line between rows.\strut
+\end{minipage}\tabularnewline
\bottomrule
\end{longtable}
diff --git a/tests/writer.docbook5 b/tests/writer.docbook5
new file mode 100644
index 000000000..5261a35be
--- /dev/null
+++ b/tests/writer.docbook5
@@ -0,0 +1,1395 @@
+<?xml version="1.0" encoding="utf-8" ?>
+<!DOCTYPE article>
+<article xmlns="http://docbook.org/ns/docbook" version="5.0">
+ <info>
+ <title>Pandoc Test Suite</title>
+ <authorgroup>
+ <author>
+ <firstname>John</firstname>
+ <surname>MacFarlane</surname>
+ </author>
+ <author>
+ <firstname></firstname>
+ <surname>Anonymous</surname>
+ </author>
+ </authorgroup>
+ <date>July 17, 2006</date>
+ </info>
+<para>
+ This is a set of tests for pandoc. Most of them are adapted from John
+ Gruber’s markdown test suite.
+</para>
+<section id="headers">
+ <title>Headers</title>
+ <section id="level-2-with-an-embedded-link">
+ <title>Level 2 with an <link xlink:href="/url">embedded
+ link</link></title>
+ <section id="level-3-with-emphasis">
+ <title>Level 3 with <emphasis>emphasis</emphasis></title>
+ <section id="level-4">
+ <title>Level 4</title>
+ <section id="level-5">
+ <title>Level 5</title>
+ <para>
+ </para>
+ </section>
+ </section>
+ </section>
+ </section>
+</section>
+<section id="level-1">
+ <title>Level 1</title>
+ <section id="level-2-with-emphasis">
+ <title>Level 2 with <emphasis>emphasis</emphasis></title>
+ <section id="level-3">
+ <title>Level 3</title>
+ <para>
+ with no blank line
+ </para>
+ </section>
+ </section>
+ <section id="level-2">
+ <title>Level 2</title>
+ <para>
+ with no blank line
+ </para>
+ </section>
+</section>
+<section id="paragraphs">
+ <title>Paragraphs</title>
+ <para>
+ Here’s a regular paragraph.
+ </para>
+ <para>
+ In Markdown 1.0.0 and earlier. Version 8. This line turns into a list
+ item. Because a hard-wrapped line in the middle of a paragraph looked like
+ a list item.
+ </para>
+ <para>
+ Here’s one with a bullet. * criminey.
+ </para>
+<literallayout>There should be a hard line break
+here.</literallayout>
+</section>
+<section id="block-quotes">
+ <title>Block Quotes</title>
+ <para>
+ E-mail style:
+ </para>
+ <blockquote>
+ <para>
+ This is a block quote. It is pretty short.
+ </para>
+ </blockquote>
+ <blockquote>
+ <para>
+ Code in a block quote:
+ </para>
+ <programlisting>
+sub status {
+ print &quot;working&quot;;
+}
+</programlisting>
+ <para>
+ A list:
+ </para>
+ <orderedlist numeration="arabic" spacing="compact">
+ <listitem>
+ <para>
+ item one
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ item two
+ </para>
+ </listitem>
+ </orderedlist>
+ <para>
+ Nested block quotes:
+ </para>
+ <blockquote>
+ <para>
+ nested
+ </para>
+ </blockquote>
+ <blockquote>
+ <para>
+ nested
+ </para>
+ </blockquote>
+ </blockquote>
+ <para>
+ This should not be a block quote: 2 &gt; 1.
+ </para>
+ <para>
+ And a following paragraph.
+ </para>
+</section>
+<section id="code-blocks">
+ <title>Code Blocks</title>
+ <para>
+ Code:
+ </para>
+ <programlisting>
+---- (should be four hyphens)
+
+sub status {
+ print &quot;working&quot;;
+}
+
+this code block is indented by one tab
+</programlisting>
+ <para>
+ And:
+ </para>
+ <programlisting>
+ this code block is indented by two tabs
+
+These should not be escaped: \$ \\ \&gt; \[ \{
+</programlisting>
+</section>
+<section id="lists">
+ <title>Lists</title>
+ <section id="unordered">
+ <title>Unordered</title>
+ <para>
+ Asterisks tight:
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ asterisk 1
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ asterisk 2
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ asterisk 3
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Asterisks loose:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ asterisk 1
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ asterisk 2
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ asterisk 3
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Pluses tight:
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ Plus 1
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Plus 2
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Plus 3
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Pluses loose:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Plus 1
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Plus 2
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Plus 3
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Minuses tight:
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ Minus 1
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Minus 2
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Minus 3
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Minuses loose:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Minus 1
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Minus 2
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Minus 3
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ <section id="ordered">
+ <title>Ordered</title>
+ <para>
+ Tight:
+ </para>
+ <orderedlist numeration="arabic" spacing="compact">
+ <listitem>
+ <para>
+ First
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Second
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Third
+ </para>
+ </listitem>
+ </orderedlist>
+ <para>
+ and:
+ </para>
+ <orderedlist numeration="arabic" spacing="compact">
+ <listitem>
+ <para>
+ One
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Two
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Three
+ </para>
+ </listitem>
+ </orderedlist>
+ <para>
+ Loose using tabs:
+ </para>
+ <orderedlist numeration="arabic">
+ <listitem>
+ <para>
+ First
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Second
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Third
+ </para>
+ </listitem>
+ </orderedlist>
+ <para>
+ and using spaces:
+ </para>
+ <orderedlist numeration="arabic">
+ <listitem>
+ <para>
+ One
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Two
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Three
+ </para>
+ </listitem>
+ </orderedlist>
+ <para>
+ Multiple paragraphs:
+ </para>
+ <orderedlist numeration="arabic">
+ <listitem>
+ <para>
+ Item 1, graf one.
+ </para>
+ <para>
+ Item 1. graf two. The quick brown fox jumped over the lazy dog’s
+ back.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Item 2.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Item 3.
+ </para>
+ </listitem>
+ </orderedlist>
+ </section>
+ <section id="nested">
+ <title>Nested</title>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ Tab
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ Tab
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ Tab
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Here’s another:
+ </para>
+ <orderedlist numeration="arabic" spacing="compact">
+ <listitem>
+ <para>
+ First
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Second:
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ Fee
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Fie
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Foe
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Third
+ </para>
+ </listitem>
+ </orderedlist>
+ <para>
+ Same thing but with paragraphs:
+ </para>
+ <orderedlist numeration="arabic">
+ <listitem>
+ <para>
+ First
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Second:
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ Fee
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Fie
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Foe
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Third
+ </para>
+ </listitem>
+ </orderedlist>
+ </section>
+ <section id="tabs-and-spaces">
+ <title>Tabs and spaces</title>
+ <itemizedlist>
+ <listitem>
+ <para>
+ this is a list item indented with tabs
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ this is a list item indented with spaces
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ this is an example list item indented with tabs
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ this is an example list item indented with spaces
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </section>
+ <section id="fancy-list-markers">
+ <title>Fancy list markers</title>
+ <orderedlist numeration="arabic">
+ <listitem override="2">
+ <para>
+ begins with 2
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ and now 3
+ </para>
+ <para>
+ with a continuation
+ </para>
+ <orderedlist numeration="lowerroman" spacing="compact">
+ <listitem override="4">
+ <para>
+ sublist with roman numerals, starting with 4
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ more items
+ </para>
+ <orderedlist numeration="upperalpha" spacing="compact">
+ <listitem>
+ <para>
+ a subsublist
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ a subsublist
+ </para>
+ </listitem>
+ </orderedlist>
+ </listitem>
+ </orderedlist>
+ </listitem>
+ </orderedlist>
+ <para>
+ Nesting:
+ </para>
+ <orderedlist numeration="upperalpha" spacing="compact">
+ <listitem>
+ <para>
+ Upper Alpha
+ </para>
+ <orderedlist numeration="upperroman" spacing="compact">
+ <listitem>
+ <para>
+ Upper Roman.
+ </para>
+ <orderedlist numeration="arabic" spacing="compact">
+ <listitem override="6">
+ <para>
+ Decimal start with 6
+ </para>
+ <orderedlist numeration="loweralpha" spacing="compact">
+ <listitem override="3">
+ <para>
+ Lower alpha with paren
+ </para>
+ </listitem>
+ </orderedlist>
+ </listitem>
+ </orderedlist>
+ </listitem>
+ </orderedlist>
+ </listitem>
+ </orderedlist>
+ <para>
+ Autonumbering:
+ </para>
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Autonumber.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ More.
+ </para>
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Nested.
+ </para>
+ </listitem>
+ </orderedlist>
+ </listitem>
+ </orderedlist>
+ <para>
+ Should not be a list item:
+ </para>
+ <para>
+ M.A. 2007
+ </para>
+ <para>
+ B. Williams
+ </para>
+ </section>
+</section>
+<section id="definition-lists">
+ <title>Definition Lists</title>
+ <para>
+ Tight using spaces:
+ </para>
+ <variablelist spacing="compact">
+ <varlistentry>
+ <term>
+ apple
+ </term>
+ <listitem>
+ <para>
+ red fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ orange
+ </term>
+ <listitem>
+ <para>
+ orange fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ banana
+ </term>
+ <listitem>
+ <para>
+ yellow fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Tight using tabs:
+ </para>
+ <variablelist spacing="compact">
+ <varlistentry>
+ <term>
+ apple
+ </term>
+ <listitem>
+ <para>
+ red fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ orange
+ </term>
+ <listitem>
+ <para>
+ orange fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ banana
+ </term>
+ <listitem>
+ <para>
+ yellow fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Loose:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term>
+ apple
+ </term>
+ <listitem>
+ <para>
+ red fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ orange
+ </term>
+ <listitem>
+ <para>
+ orange fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ banana
+ </term>
+ <listitem>
+ <para>
+ yellow fruit
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Multiple blocks with italics:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term>
+ <emphasis>apple</emphasis>
+ </term>
+ <listitem>
+ <para>
+ red fruit
+ </para>
+ <para>
+ contains seeds, crisp, pleasant to taste
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ <emphasis>orange</emphasis>
+ </term>
+ <listitem>
+ <para>
+ orange fruit
+ </para>
+ <programlisting>
+{ orange code block }
+</programlisting>
+ <blockquote>
+ <para>
+ orange block quote
+ </para>
+ </blockquote>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Multiple definitions, tight:
+ </para>
+ <variablelist spacing="compact">
+ <varlistentry>
+ <term>
+ apple
+ </term>
+ <listitem>
+ <para>
+ red fruit
+ </para>
+ <para>
+ computer
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ orange
+ </term>
+ <listitem>
+ <para>
+ orange fruit
+ </para>
+ <para>
+ bank
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Multiple definitions, loose:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term>
+ apple
+ </term>
+ <listitem>
+ <para>
+ red fruit
+ </para>
+ <para>
+ computer
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ orange
+ </term>
+ <listitem>
+ <para>
+ orange fruit
+ </para>
+ <para>
+ bank
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Blank line after term, indented marker, alternate markers:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term>
+ apple
+ </term>
+ <listitem>
+ <para>
+ red fruit
+ </para>
+ <para>
+ computer
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ orange
+ </term>
+ <listitem>
+ <para>
+ orange fruit
+ </para>
+ <orderedlist numeration="arabic" spacing="compact">
+ <listitem>
+ <para>
+ sublist
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ sublist
+ </para>
+ </listitem>
+ </orderedlist>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+</section>
+<section id="html-blocks">
+ <title>HTML Blocks</title>
+ <para>
+ Simple block on one line:
+ </para>
+ <para>
+ foo
+ </para>
+ <para>
+ And nested without indentation:
+ </para>
+ <para>
+ foo
+ </para>
+ <para>
+ bar
+ </para>
+ <para>
+ Interpreted markdown in a table:
+ </para>
+ This is <emphasis>emphasized</emphasis>
+ And this is <emphasis role="strong">strong</emphasis>
+ <para>
+ Here’s a simple block:
+ </para>
+ <para>
+ foo
+ </para>
+ <para>
+ This should be a code block, though:
+ </para>
+ <programlisting>
+&lt;div&gt;
+ foo
+&lt;/div&gt;
+</programlisting>
+ <para>
+ As should this:
+ </para>
+ <programlisting>
+&lt;div&gt;foo&lt;/div&gt;
+</programlisting>
+ <para>
+ Now, nested:
+ </para>
+ <para>
+ foo
+ </para>
+ <para>
+ This should just be an HTML comment:
+ </para>
+ <para>
+ Multiline:
+ </para>
+ <para>
+ Code block:
+ </para>
+ <programlisting>
+&lt;!-- Comment --&gt;
+</programlisting>
+ <para>
+ Just plain comment, with trailing spaces on the line:
+ </para>
+ <para>
+ Code:
+ </para>
+ <programlisting>
+&lt;hr /&gt;
+</programlisting>
+ <para>
+ Hr’s:
+ </para>
+</section>
+<section id="inline-markup">
+ <title>Inline Markup</title>
+ <para>
+ This is <emphasis>emphasized</emphasis>, and so <emphasis>is
+ this</emphasis>.
+ </para>
+ <para>
+ This is <emphasis role="strong">strong</emphasis>, and so
+ <emphasis role="strong">is this</emphasis>.
+ </para>
+ <para>
+ An <emphasis><link xlink:href="/url">emphasized link</link></emphasis>.
+ </para>
+ <para>
+ <emphasis role="strong"><emphasis>This is strong and
+ em.</emphasis></emphasis>
+ </para>
+ <para>
+ So is <emphasis role="strong"><emphasis>this</emphasis></emphasis> word.
+ </para>
+ <para>
+ <emphasis role="strong"><emphasis>This is strong and
+ em.</emphasis></emphasis>
+ </para>
+ <para>
+ So is <emphasis role="strong"><emphasis>this</emphasis></emphasis> word.
+ </para>
+ <para>
+ This is code: <literal>&gt;</literal>, <literal>$</literal>,
+ <literal>\</literal>, <literal>\$</literal>,
+ <literal>&lt;html&gt;</literal>.
+ </para>
+ <para>
+ <emphasis role="strikethrough">This is
+ <emphasis>strikeout</emphasis>.</emphasis>
+ </para>
+ <para>
+ Superscripts: a<superscript>bc</superscript>d
+ a<superscript><emphasis>hello</emphasis></superscript>
+ a<superscript>hello there</superscript>.
+ </para>
+ <para>
+ Subscripts: H<subscript>2</subscript>O, H<subscript>23</subscript>O,
+ H<subscript>many of them</subscript>O.
+ </para>
+ <para>
+ These should not be superscripts or subscripts, because of the unescaped
+ spaces: a^b c^d, a~b c~d.
+ </para>
+</section>
+<section id="smart-quotes-ellipses-dashes">
+ <title>Smart quotes, ellipses, dashes</title>
+ <para>
+ <quote>Hello,</quote> said the spider. <quote><quote>Shelob</quote> is my
+ name.</quote>
+ </para>
+ <para>
+ <quote>A</quote>, <quote>B</quote>, and <quote>C</quote> are letters.
+ </para>
+ <para>
+ <quote>Oak,</quote> <quote>elm,</quote> and <quote>beech</quote> are names
+ of trees. So is <quote>pine.</quote>
+ </para>
+ <para>
+ <quote>He said, <quote>I want to go.</quote></quote> Were you alive in the
+ 70’s?
+ </para>
+ <para>
+ Here is some quoted <quote><literal>code</literal></quote> and a
+ <quote><link xlink:href="http://example.com/?foo=1&amp;bar=2">quoted
+ link</link></quote>.
+ </para>
+ <para>
+ Some dashes: one—two — three—four — five.
+ </para>
+ <para>
+ Dashes between numbers: 5–7, 255–66, 1987–1999.
+ </para>
+ <para>
+ Ellipses…and…and….
+ </para>
+</section>
+<section id="latex">
+ <title>LaTeX</title>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ 2 + 2 = 4
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <emphasis>x</emphasis> ∈ <emphasis>y</emphasis>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <emphasis>α</emphasis> ∧ <emphasis>ω</emphasis>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ 223
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <emphasis>p</emphasis>-Tree
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Here’s some display math:
+ $$\frac{d}{dx}f(x)=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}$$
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Here’s one that has a line break in it:
+ <emphasis>α</emphasis> + <emphasis>ω</emphasis> × <emphasis>x</emphasis><superscript>2</superscript>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ These shouldn’t be math:
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ To get the famous equation, write <literal>$e = mc^2$</literal>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ $22,000 is a <emphasis>lot</emphasis> of money. So is $34,000. (It
+ worked if <quote>lot</quote> is emphasized.)
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Shoes ($20) and socks ($5).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Escaped <literal>$</literal>: $73 <emphasis>this should be
+ emphasized</emphasis> 23$.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Here’s a LaTeX table:
+ </para>
+</section>
+<section id="special-characters">
+ <title>Special Characters</title>
+ <para>
+ Here is some unicode:
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ I hat: Î
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ o umlaut: ö
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ section: §
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ set membership: ∈
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ copyright: ©
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ AT&amp;T has an ampersand in their name.
+ </para>
+ <para>
+ AT&amp;T is another way to write it.
+ </para>
+ <para>
+ This &amp; that.
+ </para>
+ <para>
+ 4 &lt; 5.
+ </para>
+ <para>
+ 6 &gt; 5.
+ </para>
+ <para>
+ Backslash: \
+ </para>
+ <para>
+ Backtick: `
+ </para>
+ <para>
+ Asterisk: *
+ </para>
+ <para>
+ Underscore: _
+ </para>
+ <para>
+ Left brace: {
+ </para>
+ <para>
+ Right brace: }
+ </para>
+ <para>
+ Left bracket: [
+ </para>
+ <para>
+ Right bracket: ]
+ </para>
+ <para>
+ Left paren: (
+ </para>
+ <para>
+ Right paren: )
+ </para>
+ <para>
+ Greater-than: &gt;
+ </para>
+ <para>
+ Hash: #
+ </para>
+ <para>
+ Period: .
+ </para>
+ <para>
+ Bang: !
+ </para>
+ <para>
+ Plus: +
+ </para>
+ <para>
+ Minus: -
+ </para>
+</section>
+<section id="links">
+ <title>Links</title>
+ <section id="explicit">
+ <title>Explicit</title>
+ <para>
+ Just a <link xlink:href="/url/">URL</link>.
+ </para>
+ <para>
+ <link xlink:href="/url/">URL and title</link>.
+ </para>
+ <para>
+ <link xlink:href="/url/">URL and title</link>.
+ </para>
+ <para>
+ <link xlink:href="/url/">URL and title</link>.
+ </para>
+ <para>
+ <link xlink:href="/url/">URL and title</link>
+ </para>
+ <para>
+ <link xlink:href="/url/">URL and title</link>
+ </para>
+ <para>
+ <link xlink:href="/url/with_underscore">with_underscore</link>
+ </para>
+ <para>
+ Email link (<email>nobody@nowhere.net</email>)
+ </para>
+ <para>
+ <link xlink:href="">Empty</link>.
+ </para>
+ </section>
+ <section id="reference">
+ <title>Reference</title>
+ <para>
+ Foo <link xlink:href="/url/">bar</link>.
+ </para>
+ <para>
+ Foo <link xlink:href="/url/">bar</link>.
+ </para>
+ <para>
+ Foo <link xlink:href="/url/">bar</link>.
+ </para>
+ <para>
+ With <link xlink:href="/url/">embedded [brackets]</link>.
+ </para>
+ <para>
+ <link xlink:href="/url/">b</link> by itself should be a link.
+ </para>
+ <para>
+ Indented <link xlink:href="/url">once</link>.
+ </para>
+ <para>
+ Indented <link xlink:href="/url">twice</link>.
+ </para>
+ <para>
+ Indented <link xlink:href="/url">thrice</link>.
+ </para>
+ <para>
+ This should [not][] be a link.
+ </para>
+ <programlisting>
+[not]: /url
+</programlisting>
+ <para>
+ Foo <link xlink:href="/url/">bar</link>.
+ </para>
+ <para>
+ Foo <link xlink:href="/url/">biz</link>.
+ </para>
+ </section>
+ <section id="with-ampersands">
+ <title>With ampersands</title>
+ <para>
+ Here’s a <link xlink:href="http://example.com/?foo=1&amp;bar=2">link
+ with an ampersand in the URL</link>.
+ </para>
+ <para>
+ Here’s a link with an amersand in the link text:
+ <link xlink:href="http://att.com/">AT&amp;T</link>.
+ </para>
+ <para>
+ Here’s an <link xlink:href="/script?foo=1&amp;bar=2">inline link</link>.
+ </para>
+ <para>
+ Here’s an <link xlink:href="/script?foo=1&amp;bar=2">inline link in
+ pointy braces</link>.
+ </para>
+ </section>
+ <section id="autolinks">
+ <title>Autolinks</title>
+ <para>
+ With an ampersand:
+ <link xlink:href="http://example.com/?foo=1&amp;bar=2">http://example.com/?foo=1&amp;bar=2</link>
+ </para>
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ In a list?
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <link xlink:href="http://example.com/">http://example.com/</link>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ It should.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ An e-mail address: <email>nobody@nowhere.net</email>
+ </para>
+ <blockquote>
+ <para>
+ Blockquoted:
+ <link xlink:href="http://example.com/">http://example.com/</link>
+ </para>
+ </blockquote>
+ <para>
+ Auto-links should not occur here:
+ <literal>&lt;http://example.com/&gt;</literal>
+ </para>
+ <programlisting>
+or here: &lt;http://example.com/&gt;
+</programlisting>
+ </section>
+</section>
+<section id="images">
+ <title>Images</title>
+ <para>
+ From <quote>Voyage dans la Lune</quote> by Georges Melies (1902):
+ </para>
+ <figure>
+ <title>lalune</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="lalune.jpg" />
+ </imageobject>
+ <textobject><phrase>lalune</phrase></textobject>
+ </mediaobject>
+ </figure>
+ <para>
+ Here is a movie <inlinemediaobject>
+ <imageobject>
+ <imagedata fileref="movie.jpg" />
+ </imageobject>
+ </inlinemediaobject> icon.
+ </para>
+</section>
+<section id="footnotes">
+ <title>Footnotes</title>
+ <para>
+ Here is a footnote reference,<footnote>
+ <para>
+ Here is the footnote. It can go anywhere after the footnote reference.
+ It need not be placed at the end of the document.
+ </para>
+ </footnote> and another.<footnote>
+ <para>
+ Here’s the long note. This one contains multiple blocks.
+ </para>
+ <para>
+ Subsequent blocks are indented to show that they belong to the
+ footnote (as with list items).
+ </para>
+ <programlisting>
+ { &lt;code&gt; }
+</programlisting>
+ <para>
+ If you want, you can indent every line, but you can also be lazy and
+ just indent the first line of each block.
+ </para>
+ </footnote> This should <emphasis>not</emphasis> be a footnote reference,
+ because it contains a space.[^my note] Here is an inline note.<footnote>
+ <para>
+ This is <emphasis>easier</emphasis> to type. Inline notes may contain
+ <link xlink:href="http://google.com">links</link> and
+ <literal>]</literal> verbatim characters, as well as [bracketed text].
+ </para>
+ </footnote>
+ </para>
+ <blockquote>
+ <para>
+ Notes can go in quotes.<footnote>
+ <para>
+ In quote.
+ </para>
+ </footnote>
+ </para>
+ </blockquote>
+ <orderedlist numeration="arabic" spacing="compact">
+ <listitem>
+ <para>
+ And in list items.<footnote>
+ <para>
+ In list.
+ </para>
+ </footnote>
+ </para>
+ </listitem>
+ </orderedlist>
+ <para>
+ This paragraph should not be part of the note, as it is not indented.
+ </para>
+</section>
+</article>
diff --git a/tests/writer.org b/tests/writer.org
index 13bacdfa6..4c7f363a6 100644
--- a/tests/writer.org
+++ b/tests/writer.org
@@ -9,30 +9,60 @@ markdown test suite.
--------------
* Headers
+ :PROPERTIES:
+ :CUSTOM_ID: headers
+ :END:
** Level 2 with an [[/url][embedded link]]
+ :PROPERTIES:
+ :CUSTOM_ID: level-2-with-an-embedded-link
+ :END:
*** Level 3 with /emphasis/
+ :PROPERTIES:
+ :CUSTOM_ID: level-3-with-emphasis
+ :END:
**** Level 4
+ :PROPERTIES:
+ :CUSTOM_ID: level-4
+ :END:
***** Level 5
+ :PROPERTIES:
+ :CUSTOM_ID: level-5
+ :END:
* Level 1
+ :PROPERTIES:
+ :CUSTOM_ID: level-1
+ :END:
** Level 2 with /emphasis/
+ :PROPERTIES:
+ :CUSTOM_ID: level-2-with-emphasis
+ :END:
*** Level 3
+ :PROPERTIES:
+ :CUSTOM_ID: level-3
+ :END:
with no blank line
** Level 2
+ :PROPERTIES:
+ :CUSTOM_ID: level-2
+ :END:
with no blank line
--------------
* Paragraphs
+ :PROPERTIES:
+ :CUSTOM_ID: paragraphs
+ :END:
Here's a regular paragraph.
@@ -48,6 +78,9 @@ here.
--------------
* Block Quotes
+ :PROPERTIES:
+ :CUSTOM_ID: block-quotes
+ :END:
E-mail style:
@@ -87,6 +120,9 @@ And a following paragraph.
--------------
* Code Blocks
+ :PROPERTIES:
+ :CUSTOM_ID: code-blocks
+ :END:
Code:
@@ -111,8 +147,14 @@ And:
--------------
* Lists
+ :PROPERTIES:
+ :CUSTOM_ID: lists
+ :END:
** Unordered
+ :PROPERTIES:
+ :CUSTOM_ID: unordered
+ :END:
Asterisks tight:
@@ -157,6 +199,9 @@ Minuses loose:
- Minus 3
** Ordered
+ :PROPERTIES:
+ :CUSTOM_ID: ordered
+ :END:
Tight:
@@ -197,6 +242,9 @@ Multiple paragraphs:
3. Item 3.
** Nested
+ :PROPERTIES:
+ :CUSTOM_ID: nested
+ :END:
- Tab
@@ -228,6 +276,9 @@ Same thing but with paragraphs:
3. Third
** Tabs and spaces
+ :PROPERTIES:
+ :CUSTOM_ID: tabs-and-spaces
+ :END:
- this is a list item indented with tabs
@@ -238,6 +289,9 @@ Same thing but with paragraphs:
- this is an example list item indented with spaces
** Fancy list markers
+ :PROPERTIES:
+ :CUSTOM_ID: fancy-list-markers
+ :END:
2) begins with 2
3) and now 3
@@ -276,6 +330,9 @@ B. Williams
--------------
* Definition Lists
+ :PROPERTIES:
+ :CUSTOM_ID: definition-lists
+ :END:
Tight using spaces:
@@ -342,6 +399,9 @@ Blank line after term, indented marker, alternate markers:
2. sublist
* HTML Blocks
+ :PROPERTIES:
+ :CUSTOM_ID: html-blocks
+ :END:
Simple block on one line:
@@ -569,6 +629,9 @@ Hr's:
--------------
* Inline Markup
+ :PROPERTIES:
+ :CUSTOM_ID: inline-markup
+ :END:
This is /emphasized/, and so /is this/.
@@ -598,6 +661,9 @@ spaces: a\^b c\^d, a~b c~d.
--------------
* Smart quotes, ellipses, dashes
+ :PROPERTIES:
+ :CUSTOM_ID: smart-quotes-ellipses-dashes
+ :END:
"Hello," said the spider. "'Shelob' is my name."
@@ -619,6 +685,9 @@ Ellipses...and...and....
--------------
* LaTeX
+ :PROPERTIES:
+ :CUSTOM_ID: latex
+ :END:
- \cite[22-23]{smith.1899}
- $2+2=4$
@@ -649,6 +718,9 @@ Cat & 1 \\ \hline
--------------
* Special Characters
+ :PROPERTIES:
+ :CUSTOM_ID: special-characters
+ :END:
Here is some unicode:
@@ -703,8 +775,14 @@ Minus: -
--------------
* Links
+ :PROPERTIES:
+ :CUSTOM_ID: links
+ :END:
** Explicit
+ :PROPERTIES:
+ :CUSTOM_ID: explicit
+ :END:
Just a [[/url/][URL]].
@@ -725,6 +803,9 @@ Just a [[/url/][URL]].
[[][Empty]].
** Reference
+ :PROPERTIES:
+ :CUSTOM_ID: reference
+ :END:
Foo [[/url/][bar]].
@@ -753,6 +834,9 @@ Foo [[/url/][bar]].
Foo [[/url/][biz]].
** With ampersands
+ :PROPERTIES:
+ :CUSTOM_ID: with-ampersands
+ :END:
Here's a [[http://example.com/?foo=1&bar=2][link with an ampersand in the
URL]].
@@ -764,6 +848,9 @@ Here's an [[/script?foo=1&bar=2][inline link]].
Here's an [[/script?foo=1&bar=2][inline link in pointy braces]].
** Autolinks
+ :PROPERTIES:
+ :CUSTOM_ID: autolinks
+ :END:
With an ampersand: [[http://example.com/?foo=1&bar=2]]
@@ -786,6 +873,9 @@ Auto-links should not occur here: =<http://example.com/>=
--------------
* Images
+ :PROPERTIES:
+ :CUSTOM_ID: images
+ :END:
From "Voyage dans la Lune" by Georges Melies (1902):
@@ -797,6 +887,9 @@ Here is a movie [[movie.jpg]] icon.
--------------
* Footnotes
+ :PROPERTIES:
+ :CUSTOM_ID: footnotes
+ :END:
Here is a footnote reference, [1] and another. [2] This should /not/ be a
footnote reference, because it contains a space.[\^my note] Here is an inline
diff --git a/tests/writers-lang-and-dir.latex b/tests/writers-lang-and-dir.latex
index 056809a5e..346675353 100644
--- a/tests/writers-lang-and-dir.latex
+++ b/tests/writers-lang-and-dir.latex
@@ -27,16 +27,16 @@
breaklinks=true}
\urlstyle{same} % don't use monospace font for urls
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
- \usepackage[shorthands=off,ngerman,british,ngerman,spanish,french,main=english]{babel}
+ \usepackage[shorthands=off,ngerman,british,nswissgerman,spanish,french,main=english]{babel}
\newcommand{\textgerman}[2][]{\foreignlanguage{ngerman}{#2}}
- \newenvironment{german}[1]{\begin{otherlanguage}{ngerman}}{\end{otherlanguage}}
+ \newenvironment{german}[2][]{\begin{otherlanguage}{ngerman}}{\end{otherlanguage}}
\newcommand{\textenglish}[2][]{\foreignlanguage{british}{#2}}
- \newenvironment{english}[1]{\begin{otherlanguage}{british}}{\end{otherlanguage}}
+ \newenvironment{english}[2][]{\begin{otherlanguage}{british}}{\end{otherlanguage}}
\let\oritextspanish\textspanish
\AddBabelHook{spanish}{beforeextras}{\renewcommand{\textspanish}{\oritextspanish}}
\AddBabelHook{spanish}{afterextras}{\renewcommand{\textspanish}[2][]{\foreignlanguage{spanish}{##2}}}
\newcommand{\textfrench}[2][]{\foreignlanguage{french}{#2}}
- \newenvironment{french}[1]{\begin{otherlanguage}{french}}{\end{otherlanguage}}
+ \newenvironment{french}[2][]{\begin{otherlanguage}{french}}{\end{otherlanguage}}
\else
\usepackage{polyglossia}
\setmainlanguage[]{english}