mattiase/regexp-disasm

regexp-disasm – Emacs regexp bytecode disassembler

regexp-disasm disassembles compiled Emacs regexp bytecode to something that can (just) be understood by a human. It is mainly intended for those debugging the Emacs regexp engine, but anyone curious about how it works and how regexps are interpreted may find it interesting.

This is a specialised package: it requires some knowledge of the regexp internals to be useful. See src/regex-emacs.c in the Emacs source tree, and re_opcode_t in particular.

Installation

This package requires either Emacs 30 or a patched Emacs 28 or 29. For Emacs 29 and older, apply regexp-bytecode.patch to your Emacs source tree and rebuild.

Emacs 30 does not need patching, but since it had not yet been released at the time of writing, you may need to build a development snapshot from source.

Start Emacs and load regexp-disasm.el.

Usage

  • (regexp-disassemble REGEXP &optional CASE-TABLE)

    Compile REGEXP with the optional CASE-TABLE and display a pretty-printed disassembly in a separate buffer. This function can be used as an interactive command; note that the regexp is then input as a Lisp expression, not a string.

  • (regexp-disasm REGEXP &optional CASE-TABLE)

    Compile REGEXP with the optional CASE-TABLE and return the disassembly as a list of instructions. Each instruction takes the form (ADDRESS . INSTR) where ADDRESS is the byte offset and INSTR is the instruction in a symbolic form.

For both functions, previously compiled (cached) regexp bytecode may be used. It should be functionally equivalent but may not be identical, since Emacs regexps sometimes use self-modifying code.
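
The following is a minimal sketch of programmatic use, not part of the package itself. It relies only on the documented return shape of regexp-disasm, a list of (ADDRESS . INSTR) pairs. The helper name my-regexp-disasm-lines, the sample regexp "a+b" and the file path in the comment are assumptions chosen for illustration.

```elisp
;; Assumption: regexp-disasm.el has already been loaded, for example with
;;   (load-file "/path/to/regexp-disasm.el")
;; where the path is a placeholder for wherever you saved the file.

(defun my-regexp-disasm-lines (instructions)
  "Return one formatted string per (ADDRESS . INSTR) pair in INSTRUCTIONS.
This is a hypothetical helper for illustration, not part of regexp-disasm."
  (mapcar (lambda (entry)
            ;; ADDRESS is a byte offset; INSTR is printed as a Lisp object.
            (format "%4d  %S" (car entry) (cdr entry)))
          instructions))

;; Only call into the package when it is actually available.
(when (fboundp 'regexp-disasm)
  (dolist (line (my-regexp-disasm-lines (regexp-disasm "a+b")))
    (message "%s" line)))
```

To get the pretty-printed buffer instead, evaluate (regexp-disassemble "a+b"). When invoking the command with M-x regexp-disassemble, type the regexp at the prompt as a Lisp expression, for example "a+b" including the double quotes.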
