Proof of concept implementation of code coverage guided fuzzing for scrypto blueprints #1780

bbarwikowski-hacken · 2024-04-21T16:10:56Z

This is a proof-of-concept implementation of coverage-guided fuzzing with honggfuzz for Scrypto blueprints and WASM projects in general.

The crate scrypto-wasm-fuzzer builds the Scrypto blueprint fuzz_blueprint with the sanitizer-coverage-inline-8bit-counters flag, which is used to track program coverage. After each execution, the coverage data is passed to honggfuzz, which uses it to prepare new test cases.
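To illustrate how 8-bit counters can guide a fuzzer, here is a minimal sketch of the feedback step. It is not honggfuzz's actual logic: the function names `bucket` and `merge_coverage`, the bucket boundaries, and the idea of keeping one "best bucket" per edge are all assumptions made for illustration.

```rust
/// Collapses a raw 8-bit hit counter into a coarse bucket, so that
/// "hit once" and "hit many times" are distinguished without treating
/// every single count as a new behaviour. The boundaries are illustrative.
fn bucket(count: u8) -> u8 {
    match count {
        0 => 0,
        1 => 1,
        2 => 2,
        3..=7 => 3,
        8..=31 => 4,
        _ => 5,
    }
}

/// Merges the counters of one execution into the global map.
/// Returns true if any edge reached a higher bucket than in all
/// earlier runs, which is the signal to keep the input in the corpus.
pub fn merge_coverage(global: &mut [u8], run: &[u8]) -> bool {
    assert_eq!(global.len(), run.len(), "maps must cover the same edges");
    let mut new_coverage = false;
    for (seen, &count) in global.iter_mut().zip(run) {
        let b = bucket(count);
        if b > *seen {
            *seen = b;
            new_coverage = true;
        }
    }
    new_coverage
}
```

An input whose run makes `merge_coverage` return true would be saved and mutated further; inputs that add nothing would be discarded.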

FuzzBlueprint contains a bug that would be very hard to find just by guessing its inputs. With the coverage data, the fuzzer is able to find this bug within a few minutes.
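The following sketch shows the kind of bug this refers to. It is a hypothetical stand-in written as a plain Rust function, not the actual FuzzBlueprint code; the name `check_input` and the magic bytes are invented for illustration.

```rust
/// Hypothetical example of a bug hidden behind nested byte checks.
/// NOT the real FuzzBlueprint: the name and the magic value are made up.
pub fn check_input(input: &[u8]) -> u32 {
    if input.len() >= 4 && input[0] == b'F' {
        if input[1] == b'U' {
            if input[2] == b'Z' {
                if input[3] == b'Z' {
                    // The "bug": only reachable with all four bytes correct.
                    panic!("bug reached");
                }
                return 3;
            }
            return 2;
        }
        return 1;
    }
    0
}
```

Guessing all four bytes blindly succeeds once in 256^4 = 4,294,967,296 attempts. With coverage feedback, each correct byte reaches a new branch, so the fuzzer can keep that input and solve the next byte separately, which needs on the order of 4 × 256 guesses in the ideal case.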

To run the fuzzer on Debian or Ubuntu, you need to have Rust installed and then run the following commands:

sudo apt-get update && sudo apt-get install -y cmake make llvm clang binutils-dev libunwind-dev libblocksruntime-dev git
rustup default nightly && rustup target add wasm32-unknown-unknown && rustup component add rust-src
git clone --branch honggfuzz-wasm-fuzzer-proof-of-concept --depth 1 --recurse-submodules https://github.com/hknio/radixdlt-scrypto
cd radixdlt-scrypto/scrypto-wasm-fuzzer
./run.sh

This is only a proof of concept showing that fuzzing WASM smart contracts is possible. To make this feature useful, the following should be done:

Execution is currently slow: only about 1000 executions per second for a very simple program. To speed it up, radix-engine should use wasmer with ahead-of-time compilation instead of wasmi, preferably with the LLVM backend. Everything related to executing the program, such as loading and compiling the WASM code, should be done just once and then cached. I believe a 10x speedup would be achievable this way.
More data should be tracked by coverage, especially data from the trace-cmp flag.
The interface for fuzzing should be better implemented: a function should have some #[fuzz] macro that handles everything related to parameters and returned data.
Coverage data should also be gathered from radix-engine, especially when native components are called. Fortunately, honggfuzz supports this and can collect coverage data from multiple sources.
A custom mutator or dictionary should be implemented so that fuzzing works correctly with native components, because right now guessing a 29-byte NodeId is practically impossible. I would even consider changing NodeId to 2 or 4 bytes for fuzzing and adding incremental NodeId generation (I have already created a POC of that in a different commit). Then, with trace-cmp or load/store tracking, the fuzzer could even find them automatically.
It should be possible to convert existing test cases into fuzz test cases.
More anomalies than just panics should be detected.
More than a single transaction/operation should be supported. I would somehow modify preview to support more than one transaction at once.
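The incremental NodeId idea from the list above could look roughly like the sketch below. This is not the POC from the other commit and does not use radix-engine's real NodeId type; `FuzzNodeId` and `SequentialIdAllocator` are hypothetical names, and the 2-byte width is just one of the sizes mentioned.

```rust
/// Hypothetical short node identifier, used only in fuzzing builds.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FuzzNodeId(pub [u8; 2]);

/// Hands out ids 0, 1, 2, ... so that valid ids are dense and small,
/// instead of being spread over a large, effectively unguessable space.
#[derive(Default)]
pub struct SequentialIdAllocator {
    next: u32,
}

impl SequentialIdAllocator {
    /// Returns the next unused id, or None once all 65536 ids are taken.
    pub fn allocate(&mut self) -> Option<FuzzNodeId> {
        let id = u16::try_from(self.next).ok()?;
        self.next += 1;
        Some(FuzzNodeId(id.to_be_bytes()))
    }

    /// Reports whether the given id has already been handed out.
    pub fn is_allocated(&self, id: FuzzNodeId) -> bool {
        u32::from(u16::from_be_bytes(id.0)) < self.next
    }
}
```

With ids allocated this way, two bytes taken from the fuzzer input have a realistic chance of naming an existing node, and comparison tracing has only a short value to match.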

There is a lot to be done, but in short: it would be possible to build an equivalent of Echidna, or something even better, for Scrypto blueprints in a reasonable time, perhaps just 2-3 months. At some point in the development of Scrypto and Radix, having such a feature will be important.

proof of concept implementation of honggfuzz fuzzer for WASM

b534a1b
