Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import null pointer information from PDG into static analysis #1086

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

ahomescu
Copy link
Contributor

No description provided.

@ahomescu
Copy link
Contributor Author

I still need to test this on some actual code (maybe lighttpd?)

@kkysen kkysen changed the title Import null pointer information from PDF into static analysis Import null pointer information from PDG into static analysis Apr 25, 2024
c2rust-analyze/src/analyze.rs Outdated Show resolved Hide resolved
c2rust-analyze/src/analyze.rs Outdated Show resolved Hide resolved
pdg/src/builder.rs Outdated Show resolved Hide resolved
@spernsteiner
Copy link
Collaborator

Is this waiting on anything, other than needing a cargo fmt to fix the CI?

Copy link
Contributor

@fw-immunant fw-immunant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. The brokenness in visit_place is surprising but I agree that the current behavior is wrong as described; see inline comment.

@ahomescu ahomescu force-pushed the ahomescu/non_null branch 9 times, most recently from f0cadd4 to de844f5 Compare June 13, 2024 07:21
@ahomescu ahomescu force-pushed the ahomescu/non_null branch 4 times, most recently from 3a5ab91 to 710b626 Compare June 15, 2024 05:54
@spernsteiner
Copy link
Collaborator

@ahomescu Don't forget to fix the CI. Currently it's failing on the cargo fmt style check.

Copy link
Collaborator

@spernsteiner spernsteiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial comments. I haven't looked at the PDG code yet.

analysis/runtime/src/events.rs Outdated Show resolved Hide resolved
analysis/runtime/src/events.rs Outdated Show resolved Hide resolved
analysis/runtime/src/events.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@spernsteiner spernsteiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems generally reasonable. This approach of collapsing nested projections down to a single byte offset will limit our ability to extract borrowchecker aliasing information from the PDG, if we end up wanting to do that in the future.

Are there any automated tests related to instrumentation or the PDG? I thought we had one or two, but I could be wrong.

pdg/src/builder.rs Outdated Show resolved Hide resolved
CopyRef => unimplemented!(),
AddrOfLocal(ptr, _, size) => {
// TODO: is this a local from another function?
mapping.size = size.try_into().ok();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this conversion ever fail? If not, it seems like you could make the field size: usize instead of Option<usize> and avoid the None case above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field would still need to be size: Option<usize> because we have other cases where we get a pointer without a known size. That comes from the None on line 268.

Some of this is happening because we get pointers from native libraries (e.g. glibc) which we have no control over. I think the best we can do right now is say "this is a pointer we know nothing about", and maybe add some clever heuristics later.

Re the conversion: I could do Some(size.try_into().unwrap()), that might be better for debugging?

pdg/src/graph.rs Outdated Show resolved Hide resolved
@ahomescu ahomescu force-pushed the ahomescu/non_null branch 2 times, most recently from bcfaceb to 2a94d8f Compare August 5, 2024 23:38
Record field and index projections for pointers
by recording a Project event with a base and target
pointer pair.
@ahomescu ahomescu force-pushed the ahomescu/non_null branch 2 times, most recently from db8c192 to e9b6204 Compare August 14, 2024 06:27
Copy link
Collaborator

@spernsteiner spernsteiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable overall. Left a few comments.

analysis/runtime/src/metadata.rs Outdated Show resolved Hide resolved
c2rust-analyze/src/analyze.rs Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
pdg/src/builder.rs Outdated Show resolved Hide resolved
@ahomescu ahomescu force-pushed the ahomescu/non_null branch 5 times, most recently from 34ddf68 to b0351bf Compare August 21, 2024 22:51
Add an index value to Project events to differentiate between
different projections with the same pointer offset, e.g.,
(*p).x and (*p).x.a if a is the first field of the structure.

This is implemented as a HashMap<u64, Vec<usize>> where each element
is a unique combination of field projections. The Project key
points to an element in this map. The keys are randomly-generated
64-bit keys with very low probability of collision.
The event assignment-based PDG
construction steps were giving incorrect results
for indirect accesses, so remove them completely.
Track the provenance of pointers with finer granularity
by storing the size of every allocation, local, and constant
inside the corresponding events. With this information, we can
keep track of the boundaries of every object and track whether
a projected pointer falls inside the original allocation.
Keep track of the size of every local for the
new provenance algorithm.
Add a new arg_ptr() method that allows passing
a pointer directly to an event handler.
The handler needs to be generic on the pointee type:
fn foo_handler<T: ?Sized>(ptr: *const T) { ... }

This change allows us to get more information about
the pointees from the event handler body, e.g.,
call std::mem::size_of_val().
Add a new AddrOfSized event and use it to keep
track of the sizes of all constant operands for
the new provenance algorithm.
Handle Offset nodes with a base pointer of 0
where the offset is non-zero, potentially resulting
in a brand new pointer.

One such case occurs in mod_cgi from lighttpd:
    const uintptr_t baseptr = (uintptr_t)env->b->ptr;
    for (i = 0; i < env->oused; ++i)
            envp[i] += baseptr;
Mark each PDG graph with a boolean flag that
represents whether that graph corresponds to the
null pointer or not. The PDG construction algorithm
seems to build one unique graph for all null pointers
in the entire program.
Add one test where a function argument can be either
null or non-null in the recur() function of the
analysis/tests/misc example code.
The dynamic instrumentation code inserts additional
locals for its own instrumentation code.
Remove the NON_NULL permission from all nodes in
the null graph from the PDG.
If the dynamic analysis tells us a pointer is never NULL
(i.e. it never shows up in an is_null graph), then
force the NON_NULL permission on it using updates_forbidden.
This is potentially an unsound transformation because the
pointer may actually be NULL on other inputs not covered
by the sample ones.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants