Question About Initial Input Seeds for Fuzzing Libraries in OSS-Fuzz #12422
Comments
The initial seeds are usually added in the project's build.sh, for example in oss-fuzz/projects/llvm/build.sh, lines 187 to 189 at commit 802a321.
If you want to go back in time, you'll have to follow the git history of the
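To illustrate what "adding seeds in build.sh" typically amounts to: OSS-Fuzz picks up a seed corpus from an archive named `<fuzz_target>_seed_corpus.zip` placed next to the fuzz target in the output directory. Build scripts usually create that archive with the `zip` command; the sketch below does the equivalent in Python. The function name `pack_seed_corpus` and the example paths are hypothetical and are not taken from any of the projects mentioned in this thread.

```python
import zipfile
from pathlib import Path


def pack_seed_corpus(seed_dir: str, out_dir: str, fuzz_target: str) -> Path:
    """Zip every regular file under seed_dir into <fuzz_target>_seed_corpus.zip.

    Hypothetical helper: mirrors the OSS-Fuzz naming convention for seed
    corpora, not the build script of any specific project.
    """
    root = Path(seed_dir)
    # Collect the seed files first so the archive never includes itself.
    seeds = sorted(p for p in root.rglob("*") if p.is_file())

    out_path = Path(out_dir) / f"{fuzz_target}_seed_corpus.zip"
    out_path.parent.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for seed in seeds:
            # Store paths relative to seed_dir to keep the archive portable.
            archive.write(seed, arcname=seed.relative_to(root).as_posix())
    return out_path
```

For example, `pack_seed_corpus("tests/data", "out", "my_fuzzer")` would write `out/my_fuzzer_seed_corpus.zip`. Both the directory names and the target name are placeholders.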
Could you clarify what you mean by "created by humans"? I think for each OSS-Fuzz project a human has been involved in setting up the seeds. There is a spectrum of involvement, though: the seed files may have been pre-existing and simply copied into the harness corpus folder; a human may have done some curation, e.g. finding relevant pre-existing images to use as seeds; a human may have assembled the seeds programmatically, e.g. through structured generation; or a human may have written a given seed file manually, byte by byte. I don't think there are any cases of the last kind, but there are many variations of the first three. Do they count as "created by humans", though?
There are no initial input seeds used by OSS-Fuzz that are "generated through fuzzing campaigns", at least not campaigns run by OSS-Fuzz itself. A developer may have run a fuzzer locally and uploaded the results, but that's not something OSS-Fuzz maintainers would keep track of.
By "created by humans," I mean that there was human involvement in creating the seeds, as opposed to the seeds simply resulting from a series of fuzzing campaigns where the "best" seeds from one campaign are taken as input for the next. I think the last part of your message has clarified my doubts: OSS-Fuzz uses seeds uploaded by developers rather than seeds generated through automatically run fuzzing campaigns.
OSS-Fuzz naturally saves the generated corpus and carries it forward across iterations, which as far as I can tell is what you describe here. OSS-Fuzz also does corpus minimization to "narrow down the corpus to a set of optimal inputs", but that is all done by ClusterFuzz: https://github.com/google/clusterfuzz
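As a rough picture of what corpus minimization means, here is a conceptual sketch only. It is not ClusterFuzz's actual implementation, and the coverage map it takes as input is a hypothetical stand-in for whatever coverage data a fuzzer reports. The idea is to keep a small subset of inputs that together still cover everything the full corpus covers, here via a greedy set-cover heuristic.

```python
from typing import Dict, FrozenSet, List


def minimize_corpus(coverage: Dict[str, FrozenSet[int]]) -> List[str]:
    """Greedily pick inputs until every covered edge is represented.

    `coverage` maps an input name to the set of edge IDs it exercises
    (hypothetical format, assumed for illustration).
    """
    remaining = set().union(*coverage.values()) if coverage else set()
    kept: List[str] = []
    while remaining:
        # Choose the input covering the most still-uncovered edges;
        # iterating in sorted order makes ties resolve to the smallest name.
        best = max(sorted(coverage), key=lambda name: len(coverage[name] & remaining))
        kept.append(best)
        remaining -= coverage[best]
    return kept


example = {
    "a": frozenset({1, 2, 3}),
    "b": frozenset({2, 3}),
    "c": frozenset({4}),
    "d": frozenset({1, 4}),
}
print(minimize_corpus(example))  # ['a', 'c']
```

Greedy selection does not guarantee the smallest possible subset, but it preserves total coverage, which is the property that matters when a corpus is carried forward between runs.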
I have some questions regarding the initial input seeds used for fuzzing these libraries.
The libraries in question are binutils, cairo, libzip, llvm, mupdf, and sqlite3.
I would like to know:
Are the initial input seeds used by OSS-Fuzz manually created by humans, or are they generated through fuzzing campaigns?
If there are human-made initial input seeds, can the most recent versions be accessed?
Alternatively, if only seeds from fuzzing campaigns are available, could I obtain the oldest initial input seeds that have undergone the fewest fuzzing iterations?