Skip to content

Commit

Permalink
[analyzer] Never create Regions wrapping reference TypedValueRegions …
Browse files Browse the repository at this point in the history
…(NFCI) (#118096)

Like in the test case:
```c++
struct String {
  String(const String &) {}
};

struct MatchComponent {
  unsigned numbers[2];
  String prerelease;
  MatchComponent(MatchComponent const &) = default;
};

MatchComponent get();
void consume(MatchComponent const &);

MatchComponent parseMatchComponent() {
  MatchComponent component = get();
  component.numbers[0] = 10;
  component.numbers[1] = 20;
  return component; // We should have no stack addr escape warning here.
}

void top() {
  consume(parseMatchComponent());
}
```

When calling `consume(parseMatchComponent())` the
`parseMatchComponent()` would return a copy of a temporary of
`component`. That copy would invoke the
`MatchComponent::MatchComponent(const MatchComponent &)` ctor.

That ctor would have a (reference typed) ParamVarRegion, holding the
location (lvalue) of the object we are about to copy (&component). So
far so good, but just before evaluating the binding operation for
initializing the `numbers` field of the temporary, we evaluate the
ArrayInitLoopExpr representing the by-value elementwise copy of the
array `component.numbers`. This is represented by a LazyCompoundVal,
because we (usually) don't just copy large arrays and bind many
individual direct bindings. Rather, we take a snapshot by using a LCV.

However, notice that the LCV representing this copy would look like
this:
  lazyCompoundVal{ParamVarRegion{"reference param of cctor"}.numbers}

Notice that it refers to the numbers field of a reference. It would be
much better to desugar the reference to the actual object, thus it
should be: `lazyCompoundVal{component.numbers}`

Actually, when binding the result of the ArrayInitLoopExpr to the
`temp_object.numbers` in the compiler-generated member initializer of
the cctor, we should have two individual direct bindings because this is
a "small array":
```
  binding &Element{temp_object.numbers, 0} <- loadFrom(&Element{component.numbers, 0})
  binding &Element{temp_object.numbers, 1} <- loadFrom(&Element{component.numbers, 1})
```
Where `loadFrom(...)` would be:
```
  loadFrom(&Element{component.numbers, 0}): 10 U32b
  loadFrom(&Element{component.numbers, 1}): 20 U32b
```
So the store should look like this, after PostInitializer of
`temp_object.numbers`:
```
  temp_object at offset  0: 10 U32b
  temp_object at offset 32: 20 U32b
```

The lesson is that it's okay to have TypedValueRegions of references as
long as we don't form subregions of those. If we ever want to refer to a
subregion of a "reference" we actually meant to "desugar" the reference
and slice a subregion of the pointee of the reference instead.

Once this canonicalization is in place, we can also drop the special
handling of references in `ProcessInitializer`, because now reference
TypedValueRegions are eagerly desugared into their referee region when
forming a subregion of it.

There should be no practical differences, but there are of course bugs
that this patch may surface.
  • Loading branch information
steakhal authored Nov 29, 2024
1 parent 9ebab70 commit 820403c
Show file tree
Hide file tree
Showing 5 changed files with 48 additions and 9 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -487,6 +487,7 @@ class ProgramState : public llvm::FoldingSetNode {
friend void ProgramStateRetain(const ProgramState *state);
friend void ProgramStateRelease(const ProgramState *state);

SVal desugarReference(SVal Val) const;
SVal wrapSymbolicRegion(SVal Base) const;
};

Expand Down
10 changes: 1 addition & 9 deletions clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1206,15 +1206,7 @@ void ExprEngine::ProcessInitializer(const CFGInitializer CFGInit,
while ((ASE = dyn_cast<ArraySubscriptExpr>(Init)))
Init = ASE->getBase()->IgnoreImplicit();

SVal LValue = State->getSVal(Init, stackFrame);
if (!Field->getType()->isReferenceType()) {
if (std::optional<Loc> LValueLoc = LValue.getAs<Loc>()) {
InitVal = State->getSVal(*LValueLoc);
} else if (auto CV = LValue.getAs<nonloc::CompoundVal>()) {
// Initializer list for an array.
InitVal = *CV;
}
}
InitVal = State->getSVal(Init, stackFrame);

// If we fail to get the value for some reason, use a symbolic value.
if (InitVal.isUnknownOrUndef()) {
Expand Down
8 changes: 8 additions & 0 deletions clang/lib/StaticAnalyzer/Core/MemRegion.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,11 @@ using namespace ento;
// MemRegion Construction.
//===----------------------------------------------------------------------===//

[[maybe_unused]] static bool isAReferenceTypedValueRegion(const MemRegion *R) {
const auto *TyReg = llvm::dyn_cast<TypedValueRegion>(R);
return TyReg && TyReg->getValueType()->isReferenceType();
}

template <typename RegionTy, typename SuperTy, typename Arg1Ty>
RegionTy* MemRegionManager::getSubRegion(const Arg1Ty arg1,
const SuperTy *superRegion) {
Expand All @@ -76,6 +81,7 @@ RegionTy* MemRegionManager::getSubRegion(const Arg1Ty arg1,
if (!R) {
R = new (A) RegionTy(arg1, superRegion);
Regions.InsertNode(R, InsertPos);
assert(!isAReferenceTypedValueRegion(superRegion));
}

return R;
Expand All @@ -92,6 +98,7 @@ RegionTy* MemRegionManager::getSubRegion(const Arg1Ty arg1, const Arg2Ty arg2,
if (!R) {
R = new (A) RegionTy(arg1, arg2, superRegion);
Regions.InsertNode(R, InsertPos);
assert(!isAReferenceTypedValueRegion(superRegion));
}

return R;
Expand All @@ -110,6 +117,7 @@ RegionTy* MemRegionManager::getSubRegion(const Arg1Ty arg1, const Arg2Ty arg2,
if (!R) {
R = new (A) RegionTy(arg1, arg2, arg3, superRegion);
Regions.InsertNode(R, InsertPos);
assert(!isAReferenceTypedValueRegion(superRegion));
}

return R;
Expand Down
12 changes: 12 additions & 0 deletions clang/lib/StaticAnalyzer/Core/ProgramState.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -205,6 +205,16 @@ ProgramStateRef ProgramState::killBinding(Loc LV) const {
return makeWithStore(newStore);
}

/// We should never form a MemRegion that would wrap a TypedValueRegion of a
/// reference type. What we actually wanted was to create a MemRegion refering
/// to the pointee of that reference.
SVal ProgramState::desugarReference(SVal Val) const {
const auto *TyReg = dyn_cast_or_null<TypedValueRegion>(Val.getAsRegion());
if (!TyReg || !TyReg->getValueType()->isReferenceType())
return Val;
return getSVal(TyReg);
}

/// SymbolicRegions are expected to be wrapped by an ElementRegion as a
/// canonical representation. As a canonical representation, SymbolicRegions
/// should be wrapped by ElementRegions before getting a FieldRegion.
Expand Down Expand Up @@ -445,12 +455,14 @@ void ProgramState::setStore(const StoreRef &newStore) {
}

SVal ProgramState::getLValue(const FieldDecl *D, SVal Base) const {
Base = desugarReference(Base);
Base = wrapSymbolicRegion(Base);
return getStateManager().StoreMgr->getLValueField(D, Base);
}

SVal ProgramState::getLValue(const IndirectFieldDecl *D, SVal Base) const {
StoreManager &SM = *getStateManager().StoreMgr;
Base = desugarReference(Base);
Base = wrapSymbolicRegion(Base);

// FIXME: This should work with `SM.getLValueField(D->getAnonField(), Base)`,
Expand Down
26 changes: 26 additions & 0 deletions clang/test/Analysis/initializer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -366,3 +366,29 @@ void testI() {
clang_analyzer_eval(B::b == 2); // expected-warning{{TRUE}}
}
} // namespace dont_skip_vbase_initializers_in_most_derived_class

namespace elementwise_copy_small_array_from_post_initializer_of_cctor {
struct String {
String(const String &) {}
};

struct MatchComponent {
unsigned numbers[2];
String prerelease;
MatchComponent(MatchComponent const &) = default;
};

MatchComponent get();
void consume(MatchComponent const &);

MatchComponent parseMatchComponent() {
MatchComponent component = get();
component.numbers[0] = 10;
component.numbers[1] = 20;
return component; // We should have no stack addr escape warning here.
}

void top() {
consume(parseMatchComponent());
}
} // namespace elementwise_copy_small_array_from_post_initializer_of_cctor

0 comments on commit 820403c

Please sign in to comment.