Apply the standard rule mechanism also to text nodes #304
…plify node processing. 706dc1d to 6f3306e
Hey @martincizek!
Thanks for all your work and your detailed pull requests 💫
I really like how this cleans up the process function. However, I'm not sure it's the right solution yet. As you've figured out, escaping in Turndown is simple, but naive. It works for most cases, but it's often customised. I'd like to offer an improved system for customising escaping, perhaps along the lines of addEscape, removeEscape (custom escaping is also discussed in #242 (comment)). Is this something you'd be interested in helping out on?
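To make the addEscape / removeEscape idea a bit more concrete, here is a minimal sketch of what such a registry might look like. Everything in it is hypothetical: createEscaper, the name-keyed entries and the chaining style are assumptions for illustration, not an existing Turndown API.

```javascript
// Hypothetical sketch of a customisable escape registry.
// None of these names exist in Turndown; they only illustrate the idea.
function createEscaper (initialEscapes) {
  // Each entry is { name, pattern, replacement }; naming entries makes removal possible.
  var escapes = (initialEscapes || []).slice()
  return {
    addEscape: function (name, pattern, replacement) {
      escapes.push({ name: name, pattern: pattern, replacement: replacement })
      return this
    },
    removeEscape: function (name) {
      escapes = escapes.filter(function (entry) { return entry.name !== name })
      return this
    },
    escape: function (text) {
      // Apply every registered escape in registration order.
      return escapes.reduce(function (accumulator, entry) {
        return accumulator.replace(entry.pattern, entry.replacement)
      }, text)
    }
  }
}

var escaper = createEscaper()
  .addEscape('backslash', /\\/g, '\\\\')
  .addEscape('asterisk', /\*/g, '\\*')
  .addEscape('underscore', /_/g, '\\_')

var escaped = escaper.escape('snake_case * 2')
escaper.removeEscape('underscore')
var relaxed = escaper.escape('snake_case * 2')
```

Keying the entries by name is one possible design choice; it lets a user drop a single built-in escape without having to replace the whole list.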
var replacement = ''
if (node.nodeType === 3) {
  replacement = node.isCode ? node.nodeValue : self.escape(node.nodeValue)
} else if (node.nodeType === 1) {
  replacement = replacementForNode.call(self, node)
}
😍
  replacement: function (content, node, options) {
    if (node.isCode) return node.nodeValue
    return options.escapes.reduce(function (accumulator, escape) {
      return accumulator.replace(escape[0], escape[1])
    }, node.nodeValue).trim()
  }
}
A slight concern here is that it requires a fair bit of knowledge to add to the behaviour, i.e. a developer has to remember to:
- pass code content through unchanged
- iterate over the existing escapes and perform the replacements
- trim the value
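One way to reduce that burden could be a small wrapper that takes care of the three steps, so a rule author only supplies the custom part. This is only a sketch: textReplacement is a hypothetical helper name, and the plain objects below merely stand in for DOM text nodes and for an options object carrying escapes as [pattern, replacement] pairs, as in the diff above.

```javascript
// Hypothetical helper (not part of Turndown): wraps a custom text transform
// so that code passthrough, escaping and trimming are handled in one place.
function textReplacement (transform) {
  return function (content, node, options) {
    // 1. Code content passes through unchanged.
    if (node.isCode) return node.nodeValue
    // 2. Apply the existing escapes, given as [pattern, replacement] pairs.
    var escaped = options.escapes.reduce(function (accumulator, escape) {
      return accumulator.replace(escape[0], escape[1])
    }, node.nodeValue)
    // 3. Run the custom transform, then trim the value.
    return transform(escaped, node, options).trim()
  }
}

// Stand-in options and nodes for illustration only.
var options = { escapes: [[/\*/g, '\\*']] }
var replacement = textReplacement(function (text) {
  return text.replace(/\u00a0/g, ' ') // example customisation: normalise non-breaking spaces
})

var fromText = replacement('', { nodeValue: ' 2 * 3\u00a0= 6 ', isCode: false }, options)
var fromCode = replacement('', { nodeValue: ' a * b ', isCode: true }, options)
```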
Hey @domchristie, thank you for your reply! As I went through all the Turndown's code over time (again, nice job!), I admit another approach would be better - introducing We have done it, please check this commit, we just somehow forgot to make a new PR and deprecate this PR. :) Is this the way to go? Regarding relation to escaping:
Regarding help with the escaping subsystem:
So I believe that Turndown would need something like The mentioned project GfmEscape is actually very thoroughly configured But
Is integrating If it were done this way, the comprehensive
Closing this PR in favour of #339, which provides related hooks in a way more consistent with the rest of Turndown.
There might be strong reasons for creating rules also for text nodes. This simple patch makes it possible. We believe it even makes the current code more consistent, as we can rely on the node name "#text".
Background:
Our use case is related to text escaping. As the docs admit, it is quite simplistic and adds unnecessary backslashes. Unfortunately any simple GFM escaping also necessarily corrupts data. We aim at nearly-lossless conversions, so we have developed GfmEscape to address context-dependent escaping.
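To illustrate the "unnecessary backslashes" point, here is a small self-contained sketch comparing a naive underscore escape with a context-dependent one. It is a deliberate simplification and not how GfmEscape works internally: the function names are made up, and the only context considered is that an underscore between two ASCII alphanumerics cannot open or close emphasis, so it can be left alone.

```javascript
// Naive approach: escape every underscore, whether or not it could start emphasis.
function escapeUnderscoresNaive (text) {
  return text.replace(/_/g, '\\_')
}

// Simplified context-dependent approach (assumption: ASCII alphanumerics only).
// An underscore with a letter or digit on both sides is intraword and is left as-is.
function escapeUnderscoresInContext (text) {
  return text.replace(/_/g, function (match, offset, whole) {
    var before = whole.charAt(offset - 1) // '' at the start of the string
    var after = whole.charAt(offset + 1) // '' at the end of the string
    var intraword = /[A-Za-z0-9]/.test(before) && /[A-Za-z0-9]/.test(after)
    return intraword ? '_' : '\\_'
  })
}

var input = 'use snake_case, not _emphasis_'
var naive = escapeUnderscoresNaive(input)
var contextual = escapeUnderscoresInContext(input)
```

The naive version turns snake_case into snake\_case, while the contextual one only escapes the underscores around "emphasis".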
I admit that full escaping complexity might be overkill for Turndown.
But Turndown is still a fantastic framework for user-developed conversions, so let's just provide the desired hook. :)
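As a rough sketch of the kind of hook meant here, the following shows text nodes going through the same rule lookup as elements, keyed on the node name "#text". It is not Turndown's actual code: the rule table, ruleFor and convert are hypothetical, and plain objects stand in for DOM nodes.

```javascript
// Hypothetical rule table; the '#text' entry is the user-supplied hook.
var rules = [
  {
    filter: '#text',
    replacement: function (content, node) {
      return node.nodeValue.replace(/\s+/g, ' ') // custom text handling goes here
    }
  },
  {
    filter: 'em',
    replacement: function (content) { return '_' + content + '_' }
  }
]

// Look up a rule by lower-cased node name, for elements and text nodes alike.
function ruleFor (node) {
  var name = node.nodeName.toLowerCase()
  for (var i = 0; i < rules.length; i++) {
    if (rules[i].filter === name) return rules[i]
  }
  return null
}

// Convert children first, then let the matching rule produce the replacement.
function convert (node) {
  var content = (node.childNodes || []).map(convert).join('')
  var rule = ruleFor(node)
  return rule ? rule.replacement(content, node) : content
}

// Plain objects standing in for <em>hello   world</em>.
var tree = {
  nodeName: 'EM',
  childNodes: [{ nodeName: '#text', nodeValue: 'hello   world' }]
}
var markdown = convert(tree)
```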
Thank you!