mishegos

A differential fuzzer for x86 decoders.

Usage

Start with a clone, including submodules:

git clone --recurse-submodules https://github.com/trailofbits/mishegos

Building

mishegos is most easily built within Docker:

docker build -t mishegos .

Alternatively, you can try building it directly.

Make sure you have binutils-dev (or however your system provides libopcodes) installed:

make
# or
make debug

Running

Run the fuzzer for a bit:

./src/mishegos/mishegos ./workers.spec > /tmp/mishegos

mishegos checks for the following environment variables:

V=1 enables verbose output on stderr
D=1 enables the "dummy" mutation mode for debugging purposes
M=1 enables the "manual" mutation mode (i.e., read from stdin)
MODE=mode can be used to configure the mutation mode in the absence of D and M
- Valid mutation modes are sliding (default), havoc, and structured
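
As a rough illustration of how these variables interact, the sketch below resolves a mutation mode from an environment mapping. It is not mishegos's own code: the function name `resolve_mutation_mode`, the choice to check D before M, and the error raised on an unknown MODE value are all assumptions made for the example.

```python
import os

# Modes listed in this README as valid values for MODE.
VALID_MODES = ("sliding", "havoc", "structured")


def resolve_mutation_mode(env=None):
    """Return the mutation mode implied by the environment (illustrative only)."""
    env = os.environ if env is None else env
    # Assumption: D is checked before M; the README does not say which wins.
    if env.get("D") == "1":
        return "dummy"
    if env.get("M") == "1":
        return "manual"
    # MODE only applies in the absence of D and M; sliding is the default.
    mode = env.get("MODE", "sliding")
    if mode not in VALID_MODES:
        # Assumption: the real tool may handle an unknown mode differently.
        raise ValueError(f"unknown mutation mode: {mode!r}")
    return mode


print(resolve_mutation_mode({"MODE": "havoc"}))  # havoc
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to try without changing the shell environment.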

Convert mishegos's raw output into JSONL suitable for analysis:

./src/mish2jsonl/mish2jsonl /tmp/mishegos > /tmp/mishegos.jsonl

mish2jsonl checks for V=1 to enable verbose output on stderr.
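
Since JSONL is one JSON document per line, the converted file can be sanity-checked with a few lines of Python before running any analysis. The sketch below is schema-agnostic and assumes nothing about the fields mish2jsonl emits; the helper names `iter_records` and `count_records` are invented for this example.

```python
import json


def iter_records(lines):
    """Yield one parsed JSON value per non-empty line."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno}: invalid JSON: {exc}") from exc


def count_records(path):
    """Count the records in a JSONL file without loading it all into memory."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in iter_records(f))
```

For example, `count_records("/tmp/mishegos.jsonl")` should return the number of records produced by the conversion step above.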

Run an analysis/filter pass group on the results:

./src/analysis/analysis -p same-size-different-decodings < /tmp/mishegos.jsonl > /tmp/mishegos.interesting
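
Judging only from its name, a pass like same-size-different-decodings plausibly keeps inputs that every decoder consumed to the same length but rendered differently. The sketch below shows that idea in isolation; it is not the analysis tool's implementation, and the record shape (an `outputs` list whose entries carry `len` and `result`) is a hypothetical layout chosen for the example.

```python
def same_size_different_decodings(record):
    """True if all outputs agree on length but disagree on decoded text.

    Hypothetical record shape (an assumption, not the documented schema):
        {"input": "...", "outputs": [{"len": 2, "result": "..."}, ...]}
    """
    outputs = record.get("outputs", [])
    if len(outputs) < 2:
        return False  # nothing to compare
    sizes = {o["len"] for o in outputs}
    texts = {o["result"] for o in outputs}
    return len(sizes) == 1 and len(texts) > 1


def filter_records(records):
    """Keep only the records the predicate above flags as interesting."""
    return [r for r in records if same_size_different_decodings(r)]
```

Comparing raw strings this way also flags purely cosmetic differences, such as spacing or immediate formatting.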

Generate an ~~ugly~~ pretty visualization of the filtered results:

./src/mishmat/mishmat < /tmp/mishegos.interesting > /tmp/mishegos.html
open /tmp/mishegos.html

Contributing

We welcome contributors to mishegos!

A guide for adding new disassembler workers can be found in the docs directory.

Performance notes

All numbers below correspond to the following run:

V=1 timeout 60s ./src/mishegos/mishegos ./workers.spec > /tmp/mishegos

Outside Docker:

On a Linux desktop (Ubuntu 20.04, Ryzen 5 3600, 32GB DDR4):
- Commit d80063a
- 8 workers (no udis86) + 1 mishegos fuzzer process
- 8.7M outputs/minute
- 9 cores pinned
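
To put the figure above in more familiar units, the arithmetic below converts it to a per-second rate and to an approximate input rate. The second number rests on an assumption not stated here: that each fuzzed input yields exactly one output from each of the 8 workers.

```python
OUTPUTS_PER_MINUTE = 8_700_000  # "8.7M outputs/minute" from the run above
WORKERS = 8

outputs_per_second = OUTPUTS_PER_MINUTE / 60
# Assumption: one output per worker per input, so inputs = outputs / workers.
inputs_per_minute = OUTPUTS_PER_MINUTE / WORKERS

print(outputs_per_second)  # 145000.0
print(inputs_per_minute)   # 1087500.0
```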

TODO

Performance improvements
- Break cohort collection out into a separate process (requires re-addition of semaphores)
- Maybe use a better data structure for input/output/cohort slots
Add a scaling factor for workers, e.g. spawn N of each worker
Pre-analysis normalization (whitespace, immediate representation, prefixes)
Analysis strategies:
- Filter by length, decode status discrepancies
- Easy: lexical comparison
- Easy: reassembly + effects modeling (maybe with microx?)
Scoring ideas:
- Low value: Flag/prefix discrepancies
- Medium value: Decode success/failure/crash discrepancies
- High value: Decode discrepancies with differing control flow, operands, maybe some immediates
Visualization ideas:
- Basic but not really basic: some kind of mouse-over differential visualization
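
The pre-analysis normalization and lexical comparison items could start from something like the sketch below. It is one possible approach rather than a planned design: it lowercases, collapses whitespace, unifies comma spacing, and rewrites hexadecimal immediates (`0x0010` or `10h`) into one canonical form. The function names are invented for the example, and decimal immediates and prefixes are deliberately left alone.

```python
import re

_HEX_PREFIXED = re.compile(r"\b0x[0-9a-f]+\b")
_HEX_SUFFIXED = re.compile(r"\b([0-9][0-9a-f]*)h\b")


def normalize(text):
    """Reduce cosmetic differences between two disassembly strings."""
    text = " ".join(text.lower().split())  # lowercase, collapse whitespace
    text = re.sub(r"\s*,\s*", ", ", text)  # uniform comma spacing
    # Rewrite "10h" style immediates as "0x10" (the digit-first rule
    # keeps register names such as "ah" untouched).
    text = _HEX_SUFFIXED.sub(lambda m: hex(int(m.group(1), 16)), text)
    # Strip leading zeros from "0x0010" style immediates.
    text = _HEX_PREFIXED.sub(lambda m: hex(int(m.group(0), 16)), text)
    return text


def lexically_equal(a, b):
    """Lexical comparison after normalization."""
    return normalize(a) == normalize(b)


print(normalize("MOV  EAX,0x0010"))  # mov eax, 0x10
```

Under these rules `lexically_equal("mov eax, 10h", "MOV EAX,0x10")` is true, while a decimal `16` would still compare as different, since a bare number cannot be told apart from hex without knowing the decoder's conventions.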
