Here is the first draft of the Lua interface of user-defined syntax extensions as it is implemented in #182:
And we are live! 😉 http://witiko.github.io/markdown/#option-extensions

@writersglen, this should be interesting to you. Since version 2.17.0, the Markdown package supports user-defined syntax extensions; see the above link for an example. This means that you can extend Markdown without going to the trouble of forking it.
In #138, we added a mechanism for creating separate syntax extension objects in order to cut down on the code complexity of the markdown reader and to reduce the time cost of developing new syntax extensions for Markdown. This change has been a net benefit, reducing the development time of a syntax extension roughly by half, and enabling the rapid development of several long-overdue syntax extensions in #149 (to be released in Markdown 2.16.0).
The resounding success of this new mechanism raises the natural question of whether it should be exposed to TeX users, so that package creators can dynamically add new syntax extensions at runtime. In this discussion thread, I will describe a possible implementation and semantics of user extensions, and the technical challenges of the proposed implementation.
Possible Implementation
I will use the strikethrough syntax extension added in #160 as an example throughout, because it is simple and sufficient to illustrate the main points. Adding the strikethrough syntax extension to Markdown amounted to adding the following code: e1c6157. As you can see, most changes are confined to creating an independent extension object, which defines how strike-throughs are written in markdown. #160 further adds the strikeThrough Lua option, defines renderers for the new strikeThrough element, updates example documents, and adds unit tests. However, most of these changes are inconsequential for user extensions, which would be applied at runtime.

We can add a \markdownSetupExtension TeX command, which would allow us to define our strike-through user extension at runtime. Whereas current syntax extensions have separate extend_reader and extend_writer parts, which modify the markdown reader and the TeX writer, respectively, it seems sufficient to expose just the extend_reader part to user extensions, because adding named functions to the TeX writer just invites needless naming collisions between user extensions, which we can easily avoid by using local functions like write_strike_through below:

The only remaining point of contention here is syntax.StrikeThrough, which can introduce naming collisions between user extensions.

Unlike the \markdownSetupSnippet command, which is LaTeX-specific, the \markdownSetupExtension command would be available in plain TeX as well, but should honor the namespacing imposed by Markdown LaTeX themes when used with LaTeX, similarly to how setup snippets are namespaced. For example, if we defined the strikeThrough user extension in the witiko/extensions theme, we would enable the syntax extension as follows:

The new syntax for importing setup snippets proposed in #107 does not need to be extended to support user extensions, because snippets can subsume user extensions and it's simpler to have just snippets as the only named thing that can be passed around:
\markdownSetupSnippet{strikeThrough}{ extension=witiko/extensions/strikeThrough }
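As a hedged illustration of how such a snippet might then be used: the snippet option is the existing LaTeX mechanism for invoking a setup snippet, whereas the extension= key inside the snippet is only the option proposed in this thread and is an assumption here.

```tex
% Sketch only: relies on the proposed (hypothetical) extension= option
% having been stored in the strikeThrough snippet defined above.
\markdownSetup{snippet=strikeThrough}
```

Invoking the snippet would apply whatever options it stores, so the user extension would be enabled without the document ever naming it directly.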
Challenges
There are multiple challenges to implementing user extensions according to the above proposal that may not be obvious at first glance. In this section, I go through the challenges and suggest how we may tackle them.
Extending the Markdown grammar without cheating
Not all changes in e1c6157 are related to creating an extension object. We create the parsers.doubletildes global variable, which is no big issue and can be created as a local variable inside the user extension. However, we also cheat by defining the syntax.StrikeThrough grammar rule in the markdown reader with the default value of parsers.fail, which essentially tells the reader to ignore the rule until the syntax extension has been enabled. Furthermore, we add the V("StrikeThrough") pattern to the middle of the syntax.Inline grammar rule, so that the markdown reader knows where in a document strike-throughs may appear:

If we want to add user extensions at runtime, we cannot cheat. Without the knowledge of where strike-throughs may appear baked into the markdown reader, we would need to either atomize the grammar rules, so that we only need to prepend or append to them, or we would need to add a method (e.g. insert_pattern()) that could walk the grammar rules and insert new patterns at arbitrary places. In either case, the current grammar of Markdown would effectively be frozen after this and any refactoring of the Markdown grammar would be a breaking change.

In our user extension, we would replace syntax.StrikeThrough = read_strike_through with self.insert_pattern('StrikeThrough', 'Inline after Emph', read_strike_through). Notice that this imposes a partial order in which user extensions must be loaded: If two different user extensions insert their patterns after Emph, loading the user extensions in a different order may produce different results (in PEG grammars, the choice operator is ordered and therefore not commutative). If a user extension inserts its pattern after StrikeThrough, then it must be loaded after our user extension; however, this is perhaps less of an issue, because one can imagine that the insert_pattern() method would complain loudly instead of silently doing the wrong thing. Regardless, in the absence of a dependency solver, the responsibility of loading extensions in the correct order would fall onto the user. Let's see how different implementations of markdown handle this:

Side note: An alternative to self.insert_pattern('StrikeThrough', 'Inline after Emph', read_strike_through) would be to make the inserted patterns anonymous, e.g. self.insert_pattern('Inline after Emph', read_strike_through). This would remove any possible naming collisions between user extensions, but it would also make it impossible for user extensions to extend other user extensions. It is unclear whether this is a good trade-off to make, although I would lean toward a reluctant yes: User extensions may use Inline before InlineNote instead of Inline after StrikeThrough to control the placement of a pattern. Furthermore, any user extensions that make a sufficient mess of things to force other extensions to work around them would perhaps best be canonicalized and incorporated into the Markdown package. Compelling counterexamples are welcome.

Recovering base reader and writer after extension
At the moment, the markdown reader and the TeX writer are instantiated first and then syntax extensions are applied to them before finalizing the Markdown grammar. This is simple to implement but hampers flexibility: syntax extensions directly modify the underlying reader and writer objects and cannot be unapplied. Admittedly, we don't currently need to unapply syntax extensions, because we instantiate a new reader and writer every time we typeset a piece of markdown text. However, having the ability to recover the base reader and writer without reinstantiating them may be necessary if we need to improve performance in the future.
Ideally, syntax extensions would be decorator classes that would wrap around either the base markdown reader (which includes a reference to the TeX writer) or around previously applied syntax extensions, much like a tasty onion:
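A minimal sketch of this kind of wrapping, not the package's actual code: it assumes a grammar can be modelled as a plain Lua table that maps a rule name to an ordered list of alternative names, and that every layer exposes a finalize_grammar() method. The names new_base_reader, wrap_with_extension, and unwrap are hypothetical and exist only for this illustration.

```lua
-- Hypothetical base reader: builds a fresh grammar table on every call,
-- so wrappers never mutate shared state.
local function new_base_reader()
  local reader = {}
  function reader.finalize_grammar()
    return { Inline = { "Str", "Emph", "InlineNote" } }
  end
  return reader
end

-- Hypothetical decorator: wraps an inner reader (or another wrapper) and
-- inserts the alternative `name` right after `anchor` in rule `rule`.
local function wrap_with_extension(inner, name, rule, anchor)
  local wrapper = { inner = inner }
  function wrapper.finalize_grammar()
    local grammar = inner.finalize_grammar()
    local alternatives = grammar[rule]
    for position, alternative in ipairs(alternatives) do
      if alternative == anchor then
        table.insert(alternatives, position + 1, name)
        return grammar
      end
    end
    -- Complain loudly instead of silently doing the wrong thing.
    error("pattern " .. anchor .. " not found in rule " .. rule)
  end
  return wrapper
end

-- Peel off the layers until only the base reader remains.
local function unwrap(reader)
  while reader.inner ~= nil do
    reader = reader.inner
  end
  return reader
end

local base = new_base_reader()
local extended = wrap_with_extension(
  wrap_with_extension(base, "StrikeThrough", "Inline", "Emph"),
  "Superscript", "Inline", "Emph")

print(table.concat(extended.finalize_grammar().Inline, " / "))
-- prints: Str / Emph / Superscript / StrikeThrough / InlineNote
print(table.concat(unwrap(extended).finalize_grammar().Inline, " / "))
-- prints: Str / Emph / InlineNote
```

In this model, the extension applied last ends up closest to Emph, which shows why the order of application matters for an ordered choice; the base reader itself is never modified, so unwrapping recovers it without reinstantiation.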
The extensions would redefine public methods such as finalize_grammar(), where they would insert their own patterns into the Markdown grammar. Consequently, user extensions could be easily unapplied by unwrapping the nested objects until only the base markdown reader would remain.

Passing user extensions from TeX to Lua
In the proposed implementation, user extensions live as token lists in TeX and would need to be passed to Lua whenever we typeset a piece of markdown text. This is not necessarily difficult, but it raises the question of whether writing Lua code in TeX is more trouble than it is worth. Consider the following:

- … % initiate a comment inside the code of a user extension or not?)

A better way might be to force user extensions to live in separate markdownextension_*.lua files with similar naming conventions to Markdown LaTeX themes. We can then dispense with the \markdownSetupExtension command: To load the witiko/extensions/strikeThrough extension, the user would directly say \markdownSetup{extension=witiko/extensions/strikeThrough} and the file markdownextension_witiko_extensions_strikeThrough.lua would be loaded by the markdown reader without any passing of Lua code between TeX and Lua. This also dispenses with any interaction between Markdown LaTeX themes and user extensions in terms of namespacing, leading to a simpler implementation and clearer reasoning about what a piece of Markdown code does.

Plain TeX and LaTeX interfaces
Suppose the user wants to load multiple user extensions such as witiko/extensions/strikeThrough and witiko/extensions/superscript. One way to do this in the LaTeX interface might be to specify the extension=... option several times:

This matches the behaviour of several other LaTeX-specific options, such as theme, snippet, and code, which all correspond to imperative commands rather than variable declarations.

In the plain TeX interface, the corresponding \markdownOptionExtension option would be a comma list, which would constitute a new option type (clist) in addition to the current boolean, counter, number, path, and slice:

A LaTeX user who wants to remove items from \markdownOptionExtension, perhaps to remove default extensions set for the document, would need to use the plain TeX interface, which is less than ideal. Furthermore, the name \markdownOptionExtension seems to imply that the option specifies a single user extension rather than a list, which seems confusing. To fix these issues, the plain TeX option could be named \markdownOptionExtensions (plural) and LaTeX users would be given two corresponding options:

- extension, which would add a user extension to the end of the current list of extensions
- extensions, which would explicitly set the list of user extensions:
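To make the proposed interfaces concrete, here is a rough sketch of how they might look side by side. All three forms are assumptions based on the proposal in this section: the extension and extensions keys and the \markdownOptionExtensions macro are the hypothetical names suggested above, and the \def form follows the usual plain TeX convention for setting Markdown options.

```tex
% LaTeX (proposed): append user extensions one at a time.
\markdownSetup{
  extension = witiko/extensions/strikeThrough,
  extension = witiko/extensions/superscript,
}

% LaTeX (proposed): set the whole list explicitly; the braces protect
% the inner comma from the key-value parser.
\markdownSetup{
  extensions = {witiko/extensions/strikeThrough, witiko/extensions/superscript},
}

% Plain TeX (proposed): the comma-list option, named in the plural.
\def\markdownOptionExtensions{%
  witiko/extensions/strikeThrough,witiko/extensions/superscript}
```

Under this reading, setting extensions to an empty list would be the LaTeX-side way to drop any default extensions without falling back to the plain TeX interface.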