Add break-on.plaintext-command option
The new option enables the interpretation of paragraph text as LaTeX
commands, which allows `\newpage` and similar commands to be used
without requiring the `raw_tex` extension.

Closes: #6
tarleb committed May 7, 2024
1 parent da0ef99 commit bdf52a0
Showing 9 changed files with 77 additions and 11 deletions.
16 changes: 12 additions & 4 deletions README.md
@@ -30,10 +30,10 @@ reference document (Configuration).
In all other formats, the page break is represented using the
form feed character.

Input support is enabled (by default) with: _markdown_ and _opml_ formats.
If you want to enable another input format, go to [raw_tex](https://pandoc.org/MANUAL.html#extension-raw_tex) and
hover over `±` to check which formats support the extension,
because this filter is based on that extension.
Note that not all input formats support the `raw_tex` format
extension, which is required to use the filter in the default
configuration. Enable the `break-on.plaintext-command` option to
use this filter if `raw_tex` is unavailable.

Usage
-----
@@ -74,6 +74,9 @@ pagebreak:
# Treat paragraphs that contain just a form feed
# character as pagebreak markers.
form-feed: true
# Allow plaintext commands, i.e., respect LaTeX newpage
# commands even if they are not in a raw TeX block.
plaintext-command: true

# Use a div with this class instead of hard-coded CSS
html-class: 'page-break'
@@ -91,6 +94,11 @@ Currently supported options:
a significant performance impact for large documents and is
therefore *disabled by default*.

- `break-on.plaintext-command`: boolean value that controls
whether paragraphs that contain only a LaTeX page break command,
such as `\newpage` or `\pagebreak`, should be interpreted as
pagebreak markers. Enabling this option may impact performance,
so it is *disabled* by default.

- `html-class`: If you want to use an HTML class rather than an
inline style set the value of the metadata key `html-class` or
the environment variable `PANDOC_PAGEBREAK_HTML_CLASS` (the
33 changes: 26 additions & 7 deletions pagebreak.lua
@@ -103,11 +103,23 @@ local function latex_pagebreak (pagebreak)
end
end

-- Turning paragraphs which contain nothing but a form feed
-- characters into line breaks.
local function ascii_pagebreak (raw_pagebreak)
return function (el)
if #el.content == 1 and el.content[1].text == '\f' then
--- Checks if a paragraph contains nothing but a form feed character.
local formfeed_check = function (para)
return #para.content == 1 and para.content[1].text == '\f'
end

--- Checks if a paragraph looks like a LaTeX newpage command.
local function plaintext_check (para)
return #para.content == 1 and is_newpage_command(para.content[1].text)
end

--- Replaces a paragraph with a pagebreak if one of the `checks` returns true.
local function para_pagebreak(raw_pagebreak, checks)
local is_pb = function (para)
return checks:find_if(function (pred) return pred(para) end)
end
return function (para)
if is_pb(para) then
return raw_pagebreak
end
end
@@ -118,11 +130,18 @@ function Pandoc (doc)
local config = doc.meta.pagebreak or {}
local break_on = config['break-on'] or {}
local raw_pagebreak = newpage(FORMAT, pagebreak_from_config(doc.meta))
local paragraph_checks = pandoc.List{}
if break_on['form-feed'] then
paragraph_checks:insert(formfeed_check)
end
if break_on['plaintext-command'] then
paragraph_checks:insert(plaintext_check)
end
return doc:walk {
RawBlock = latex_pagebreak(raw_pagebreak),
-- Replace paragraphs that contain just a form feed char.
Para = break_on['form-feed']
and ascii_pagebreak(raw_pagebreak)
Para = #paragraph_checks > 0
and para_pagebreak(raw_pagebreak, paragraph_checks)
or nil
}
end
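
The predicate-list dispatch introduced in `pagebreak.lua` can be tried outside of pandoc with a minimal sketch like the one below. It is an illustration under stated assumptions, not the filter's actual code: `is_newpage_command` is a hypothetical stand-in (the real helper is not part of this diff), `any_check` replaces `pandoc.List:find_if` with a plain loop, paragraphs are modelled as plain tables via the hypothetical `para` helper, and the string `'<PAGEBREAK>'` stands in for the raw pagebreak block.

```lua
-- Hypothetical stand-in for the filter's `is_newpage_command`;
-- the real helper is not shown in this diff and may accept more forms.
local function is_newpage_command (text)
  return text == '\\newpage' or text == '\\pagebreak'
end

-- True if the paragraph contains nothing but a form feed character.
local function formfeed_check (para)
  return #para.content == 1 and para.content[1].text == '\f'
end

-- True if the paragraph consists of a single newpage-like command.
local function plaintext_check (para)
  return #para.content == 1 and is_newpage_command(para.content[1].text)
end

-- Plain-Lua substitute for `pandoc.List:find_if` (assumption: only a
-- yes/no answer is needed, not the matching predicate itself).
local function any_check (checks, para)
  for _, pred in ipairs(checks) do
    if pred(para) then return true end
  end
  return false
end

-- Returns a handler that replaces matching paragraphs with `raw_pagebreak`
-- and returns nothing (leaving the paragraph unchanged) otherwise.
local function para_pagebreak (raw_pagebreak, checks)
  return function (para)
    if any_check(checks, para) then
      return raw_pagebreak
    end
  end
end

-- Hypothetical helper: models a paragraph holding one text element.
local function para (text)
  return {content = {{text = text}}}
end

local handler = para_pagebreak('<PAGEBREAK>', {formfeed_check, plaintext_check})
assert(handler(para('\\pagebreak')) == '<PAGEBREAK>')
assert(handler(para('\f')) == '<PAGEBREAK>')
assert(handler(para('Lorem ipsum')) == nil)
```

Keeping the checks in a list means the `Para` handler only needs to be installed when at least one option is enabled, which matches the `#paragraph_checks > 0` guard in the diff above.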
8 changes: 8 additions & 0 deletions test/expected.adoc
@@ -11,4 +11,12 @@ ridiculus mus. Nulla posuere. Donec vitae dolor.
Pellentesque dapibus suscipit ligula. Donec posuere augue in quam.
Suspendisse potenti.

The following does not mark a pagebreak unless the interpretation of
LaTeX commands in plain paragraphs is enabled.

<<<

Cum sociis natoque penatibus et magnis dis parturient montes, nascetur
ridiculus mus.

Final paragraph without a preceding pagebreak.
3 changes: 3 additions & 0 deletions test/expected.html
@@ -3,4 +3,7 @@
<p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Nulla posuere. Donec vitae dolor.</p>
<div style="page-break-after: always;"></div>
<p>Pellentesque dapibus suscipit ligula. Donec posuere augue in quam. Suspendisse potenti.</p>
<p>The following does not mark a pagebreak unless the interpretation of LaTeX commands in plain paragraphs is enabled.</p>
<p>\pagebreak</p>
<p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</p>
<p>Final paragraph without a preceding pagebreak.</p>
8 changes: 8 additions & 0 deletions test/expected.ms
@@ -14,4 +14,12 @@ Pellentesque dapibus suscipit ligula.
Donec posuere augue in quam.
Suspendisse potenti.
.PP
The following does not mark a pagebreak unless the interpretation of
LaTeX commands in plain paragraphs is enabled.
.PP
\[rs]pagebreak
.PP
Cum sociis natoque penatibus et magnis dis parturient montes, nascetur
ridiculus mus.
.PP
Final paragraph without a preceding pagebreak.
3 changes: 3 additions & 0 deletions test/expected.no-form-feed.html
@@ -3,4 +3,7 @@
<p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Nulla posuere. Donec vitae dolor.</p>
<p> </p>
<p>Pellentesque dapibus suscipit ligula. Donec posuere augue in quam. Suspendisse potenti.</p>
<p>The following does not mark a pagebreak unless the interpretation of LaTeX commands in plain paragraphs is enabled.</p>
<p>\pagebreak</p>
<p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</p>
<p>Final paragraph without a preceding pagebreak.</p>
8 changes: 8 additions & 0 deletions test/expected.typst
@@ -11,4 +11,12 @@ ridiculus mus. Nulla posuere. Donec vitae dolor.
Pellentesque dapibus suscipit ligula. Donec posuere augue in quam.
Suspendisse potenti.

The following does not mark a pagebreak unless the interpretation of
LaTeX commands in plain paragraphs is enabled.

\\pagebreak

Cum sociis natoque penatibus et magnis dis parturient montes, nascetur
ridiculus mus.

Final paragraph without a preceding pagebreak.
8 changes: 8 additions & 0 deletions test/input.md
@@ -11,4 +11,12 @@ nascetur ridiculus mus. Nulla posuere. Donec vitae dolor.
Pellentesque dapibus suscipit ligula. Donec posuere augue in
quam. Suspendisse potenti.

The following does not mark a pagebreak unless the interpretation
of LaTeX commands in plain paragraphs is enabled.

\\pagebreak

Cum sociis natoque penatibus et magnis dis parturient montes,
nascetur ridiculus mus.

Final paragraph without a preceding pagebreak.
1 change: 1 addition & 0 deletions test/test-adoc.yaml
@@ -7,3 +7,4 @@ metadata:
pagebreak:
break-on:
form-feed: true
plaintext-command: true
