Plain-text TOC: Can the Table of Contents be made to ignore HTML tags within headings? #9596
-
Hello, I tend to individually format headings, e.g. with line breaks and big and small tags, so I can control the number of words per line in long titles; and sometimes I render some words in a different font (small caps, swash italics), as you might see in a magazine for example. (Of course I specify a default style with CSS, but these tweaks are by nature discretionary.) All of this works just fine; however, it’s duplicated in the automatically generated Table of Contents, where it’s unwanted. Is there a way to have the TOC simply ignore inline formatting, and just generate the list of section titles in plain text? Currently the only workarounds I’ve found are (1) manually editing documents after conversion to clean up the TOC, or (2) hiding the actual header elements via CSS, then adding redundant text in DIVs styled the same way right after each one. Obviously, neither approach is optimal, so I just thought I’d ask here if there’s a more direct fix. The automatic generation of a hyperlinked TOC is a swell feature, and a real time saver; however, there’s no need for it to reproduce any formatting that happens to exist within headers. I think that’s taking things a step too far; my only complaint, then, is that this feature works too well! 😅 Cheers
Replies: 8 comments 5 replies
-
Well, in general, we certainly wouldn't want to remove all formatting. For example, if a section heading is 'The plot of the Iliad', we'd like 'Iliad' to be italicized in the TOC too. I think that currently we do remove links and footnotes.
-
@artistofmind If you’re working from Markdown, you might want to play around with my little gh-toc Markdown ToC generation tool, which removes HTML from headings (but leaves content). Just because I also needed that feature… I typically use it before Pandoc conversion, since it was originally meant to make ToCs for GitHub. In general I agree: it would be nice to have something like "strip some" (as we have) and "strip all". But of course the next question would be: remove contents between HTML tags or not?
-
Here's an approach that depends on implementation details, so there's no guarantee that it's going to keep working in the future. But with most templates, setting the table-of-contents metadata field overrides the automatically generated TOC:

    function Pandoc (doc)
      doc.meta['table-of-contents'] = pandoc.structure.table_of_contents(doc):walk{
        -- Unlink links and replace notes with their inline content
        Link = function (link) return link.content end,
        Note = function (note) return pandoc.utils.blocks_to_inlines(note.content) end,
      }
      return doc
    end

Use the above filter by saving it to a file, and then passing it to pandoc via the --lua-filter option.
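For anyone who wants to strip more than links and notes, here is a minimal sketch of a variant in the same style. It is not something pandoc documents as a feature: it relies on the same implementation detail as the filter above (the table-of-contents metadata field), and it assumes that HTML tags written inside Markdown headings reach the filter as RawInline elements of format html. The choice of handlers (drop the tags but keep their contents, turn line breaks into spaces, keep italics) is only one guess at what a "plain" TOC should look like; add or delete handlers to taste.

```lua
-- Sketch only: rebuild the TOC and flatten discretionary heading formatting.
-- Assumes raw HTML tags in headings are parsed as RawInline elements.
local function flatten_raw (raw)
  if raw.format ~= 'html' then
    return nil                 -- leave non-HTML raw content untouched
  end
  if raw.text:lower():match('^<br') then
    return pandoc.Space()      -- a forced line break becomes a plain space
  end
  return {}                    -- drop the tag itself; its contents are separate inlines
end

function Pandoc (doc)
  local toc = pandoc.structure.table_of_contents(doc)
  doc.meta['table-of-contents'] = toc:walk{
    RawInline = flatten_raw,
    LineBreak = function () return pandoc.Space() end,
    SmallCaps = function (el) return el.content end,
    -- Emph and Strong are left alone, so 'Iliad' stays italic in the TOC
  }
  return doc
end
```

Assuming the sketch is saved as plain-toc.lua (a hypothetical file name), it would be run with something like: pandoc --toc --lua-filter plain-toc.lua input.md -o output.html. The --toc option is still needed so that the template renders the TOC block at all.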
-
Interesting. I've done that too, but I didn't realize that it relied on an implementation detail. Could the
-
Makes sense (thanks), but all the same, might it not be worth mentioning in the manual? I'd be happy to create an issue/PR.
-
Thanks, everyone, for sharing all these different ideas! It’s heartening to know I’m not the only user who’s found the current functionality less than ideally suited to my needs, despite its working so well. 😊 Do I understand correctly that most of these approaches involve disabling the TOC in Pandoc, and generating one with another extension, instead? . . . That’s something I hadn’t considered doing, and I confess I’m a bit overwhelmed at the prospect. But I’ll give it due consideration. Has anyone tried more than one of the above methods? Which one is the easiest to implement, gives you the most control, is the most compatible, or for some other reason should be tried first? I guess I’m looking for the “least radical” solution. Finally, is it at all likely that this functionality will eventually be added to Pandoc? Should I just create an issue for it, instead of asking questions here?
-
@artistofmind, I can offer this Lua function (see below for usage examples):
How I use it (the sidebar and navigation ToCs are used by a custom template):
FYI this is the
Please let me know if I forgot anything or if you have any questions.
-
I figure I should share with you all the simple solution I came up with, or rather stumbled upon.
I noticed HTML headers don’t generate an entry in the TOC, only Markdown ones do. (So Pandoc adds ###, but ignores <h3>.) I now use both, and place the Markdown one inside a div with “display: hidden.” I then have the ability to style each version of the title independently. 🙂
So, using an example from earlier in this thread, I can still italicize Iliad in the TOC if I wish, and just avoid any line breaks there. Meanwhile, I can stylize the in-text chapter title to my heart’s content, without affecting the TOC in any way. Perhaps even add a subtitle there, while displaying only the main title in the TOC. 🤷
As an added benefit, VS Code’s internally generated index also sees the Markdown and ignores the HTML. Since the former is generally simpler, this makes it easier to navigate in the editor. 😉
Finally, since it’s linking to hidden text, the anchor point can be independently placed. So I can put a chapter number in between the two versions of the title, and the TOC link effectively takes you to that number, displaying the title under it. (Before, it would go right to the title, and you’d have to scroll up to see the number.)
Just a little bit more work, for a great deal more control. 👌