Roles (Inline Directives) #398

Open
mentalisttraceur opened this issue Nov 4, 2024 · 1 comment

Comments

@mentalisttraceur
Contributor

mentalisttraceur commented Nov 4, 2024

I wrote a Mistune plug-in to add support for MyST Markdown's "role" syntax:

roles.py:
import re


_TYPE_PATTERN = r'\{(?P<type>[a-zA-Z0-9_-]+)\}'


class Role:
    def __init__(self, plugins, max_nested_level: int = 6):
        marker_pattern = '(?P<marker>`{{1,{:d}}})'.format(max_nested_level)
        self._start_pattern = _TYPE_PATTERN + marker_pattern
        self._methods = {}
        self._plugins = plugins

    def register(self, name, fn):
        self._methods[name] = fn

    def parse_method(self, inline, m, state):
        _type = m.group('type')
        method = self._methods.get(_type)
        if method:
            try:
                token = method(inline, m, state)
            except ValueError as e:
                token = {'type': 'role_error', 'raw': str(e)}
        else:
            text = m.group(0)
            token = {'type': 'role_error', 'raw': text}

        if isinstance(token, list):
            for tok in token:
                state.append_token(tok)
        else:
            state.append_token(token)
        return token

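    def parse_role(self, inline, m, state):
        # m matched only the opening `{type}` and backtick run; now look for
        # a closing backtick run of the same length, as code spans do.
        marker = m.group('marker')
        full_pattern = self._start_pattern + '(?P<text>.*?[^`])' + marker + '(?!`)'
        full_re = re.compile(full_pattern, re.DOTALL)
        full_m = full_re.match(state.src, m.start())
        if not full_m:
            # No closing marker: emit the opening as plain text and move on.
            state.append_token({'type': 'text', 'raw': m.group(0)})
            return m.end()
        self.parse_method(inline, full_m, state)
        return full_m.end()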

    def __call__(self, md):
        for plugin in self._plugins:
            plugin(self, md)
        md.inline.register('role', self._start_pattern, self.parse_role, before='codespan')
        if md.renderer and md.renderer.NAME == 'html':
            md.renderer.register('role_error', render_role_error)


def render_role_error(self, text: str) -> str:
    return '<span class="error"><pre>' + text + '</pre></span>'


class RolePlugin:
    @staticmethod
    def parse_type(m):
        return m.group('type')

    @staticmethod
    def parse_content(m):
        text = m.group('text')
        if len(text.strip()) and text[0] in ' \n' and text[-1] in ' \n':
            text = text[1:-1]
        return text

    @staticmethod
    def parse_tokens(inline, text, state):
        new_state = state.copy()
        new_state.src = text
        return inline.render(new_state)

    def parse(self, inline, m, state):
        raise NotImplementedError()

    def __call__(self, md):
        raise NotImplementedError()

Just as directives make it easy to add custom blocks without needing distinct syntax for each block type, roles make it easy to add custom inline spans without needing distinct syntax for each one. (Similarly, for a Mistune plugin writer, role plugins are like directive plugins: easier to write than a standalone plugin from scratch.)
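To make the syntax concrete, here is a minimal standalone sketch of how a role such as {phone}`555-0100` can be recognized. It reuses the two regular-expression pieces from roles.py, but `find_role` is a hypothetical helper written for illustration only; it is not part of Mistune or of the plug-in.

```python
import re

# Same building blocks as roles.py: a {type} prefix followed by a backtick run.
TYPE_PATTERN = r'\{(?P<type>[a-zA-Z0-9_-]+)\}'
MARKER_PATTERN = r'(?P<marker>`{1,6})'


def find_role(src, pos=0):
    """Return (type, text, end) for the first role at or after pos, else None.

    Hypothetical helper for illustration; the plug-in does this inside
    Mistune's inline parser instead.
    """
    start = re.compile(TYPE_PATTERN + MARKER_PATTERN).search(src, pos)
    if start is None:
        return None
    # The closing run must repeat the opening run and not be followed by
    # another backtick, mirroring how code spans are delimited.
    marker = start.group('marker')
    full = re.compile(
        TYPE_PATTERN + MARKER_PATTERN + r'(?P<text>.*?[^`])' + marker + r'(?!`)',
        re.DOTALL,
    ).match(src, start.start())
    if full is None:
        return None
    return full.group('type'), full.group('text'), full.end()


print(find_role('Call {phone}`555-0100` now'))
print(find_role('{code}``a ` b``'))
```

Using a longer backtick run as the delimiter lets the role content contain shorter runs of backticks, the same trick Markdown code spans rely on.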

The combination of directives and roles puts Markdown's expressive power on par with reStructuredText's.

@lepture if you're interested in merging this, I can flesh it out into a full PR (docstrings, tests). In the meantime, if anyone wants to use it, you can have it under the same license as Mistune. (If there's no response for a while or lepture doesn't want to include this in-package, I'm open to publishing this separately on PyPI.)

Note

Many parts of roles.py are very similar to code in Mistune. We could get good code reuse if this were merged, and it would help keep the behavior consistent. For example: Role.parse_method is just a couple of name replacements away from BaseDirective.parse_method, Role.parse_tokens is the same as the recursive inline parsing done by built-in plugins like strikethrough, and the unique parts of InlineParser.parse_codespan are almost identical to Role.parse_content.
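As an illustration of the code-span-like behavior, the padding rule in RolePlugin.parse_content can be tried on its own. The function name `strip_role_padding` is hypothetical; the body follows the same rule as the method in roles.py.

```python
def strip_role_padding(text):
    """Drop one leading and one trailing space or newline, code-span style.

    Only applies when the content is not all whitespace and is padded on
    BOTH sides; otherwise the text is returned unchanged.
    """
    if text.strip() and text[0] in ' \n' and text[-1] in ' \n':
        return text[1:-1]
    return text


# Padding lets the content itself begin or end with a backtick.
print(repr(strip_role_padding(' `x` ')))
# Padding on one side only is left alone.
print(repr(strip_role_padding(' a')))
```

Only a single character is removed from each side, so deliberate extra spacing inside the role survives.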

Example 1: Substitution/Templating

This lets you inject variables into your Markdown, like a template, and optionally you can have the injected value be parsed as Markdown.

`Substitute` role implementation:
class Substitute(RolePlugin):
    NAME = 'substitute'
    
    def __init__(self, substitutions):
        self._substitutions = substitutions

    def parse(self, inline, m, state):
        key = self.parse_content(m)

        if key.startswith('!'):
            name = key[1:]
        else:
            name = key

        value = self._substitutions.get(name, None)
        if value is None:
            raise ValueError('no substitution available for {!r}'.format(name))

        if key.startswith('!'):
            return self.parse_tokens(inline, value, state)
        return {'type': 'text', 'raw': value}

    def __call__(self, role, md):
        # Register under NAME so the role name is defined in one place.
        role.register(self.NAME, self.parse)

You'd enable this similarly to a Directive plugin:

import mistune

# Role comes from roles.py above; Substitute is the class just defined.
substitutions = {
    'email': '[email protected]',
    'foo': '**qux**',
}

md = mistune.Markdown(
    mistune.HTMLRenderer(),
    plugins=[
        Role([
            Substitute(substitutions),
        ]),
    ],
)

And then the Markdown looks like this:

You can reach me at {substitute}`email`.

You can do either

* {substitute}`foo`, or
* {substitute}`!foo`.

Which is as if you had written:

You can reach me at [email protected].

You can do either

* \*\*qux\*\*, or
* **qux**.
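The two modes can be approximated at the text level with a short standalone sketch. This is only an approximation under stated assumptions: the real plug-in emits Mistune tokens rather than rewriting source text, `expand_substitutions` is a hypothetical helper, and the set of escaped characters is a small illustrative subset.

```python
import re

# Hypothetical text-level approximation of the Substitute role.
_ROLE = re.compile(r'\{substitute\}`(?P<key>[^`]+)`')
# Small illustrative subset of characters that Markdown treats specially.
_MARKDOWN_SPECIALS = re.compile(r'([\\`*_])')


def expand_substitutions(text, substitutions):
    def replace(match):
        key = match.group('key')
        parse_as_markdown = key.startswith('!')
        name = key[1:] if parse_as_markdown else key
        # A missing name raises KeyError here; the plug-in instead raises
        # ValueError, which Role.parse_method turns into a role_error token.
        value = substitutions[name]
        if parse_as_markdown:
            return value  # value is left as live Markdown
        # Plain mode: backslash-escape so the value reads as literal text.
        return _MARKDOWN_SPECIALS.sub(r'\\\1', value)

    return _ROLE.sub(replace, text)


subs = {'foo': '**qux**'}
print(expand_substitutions('* {substitute}`foo`, or', subs))
print(expand_substitutions('* {substitute}`!foo`.', subs))
```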

Example 2: Nicer Link Shorthands

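I have some roles that expand to links. For example, the `phone` role below turns a phone number into a `tel:` link, displaying the number with spaces in place of hyphens.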

`Phone` role implementation:
class Phone(RolePlugin):
    NAME = 'phone'

    def parse(self, inline, m, state):
        content = self.parse_content(m)
        children = self.parse_tokens(inline, content, state)
        number = _extract_parsed_text(children)
        return {
            'type': 'link',
            'children': [
                {
                    'type': 'text',
                    'raw': number.replace('-', ' '),
                }
            ],
            'attrs': {
                'url': 'tel:' + number,
            },
        }

    def __call__(self, role, md):
        role.register(self.NAME, self.parse)


def _extract_parsed_text(tokens):
    text = ''
    for token in tokens:
        if 'raw' in token:
            text += token['raw']
        elif 'children' in token:
            text += _extract_parsed_text(token['children'])
    return text
@lepture
Owner

lepture commented Nov 7, 2024

Great. I'd like to accept this feature.
