Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsafe prebuilt boundary in MediaTek backend #6074

Open
dbort opened this issue Oct 9, 2024 · 1 comment
Open

Unsafe prebuilt boundary in MediaTek backend #6074

dbort opened this issue Oct 9, 2024 · 1 comment
Assignees
Labels
bug Something isn't working module: mediatek Issues related to the Mediatek delegate triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@dbort
Copy link
Contributor

dbort commented Oct 9, 2024

🐛 Describe the bug

While making some changes to backends/mediatek, I noticed:

ET does not make any ABI compatibility promises. If the MT libraries really do contain a prebuilt version of a subclass of an ET type, ET could legally make changes to MemoryAllocator that would break that prebuilt code, possibly in subtle or dangerous ways.

For example, if we made a change that reordered the vtable, a call like this could end up calling a different method, which wouldn't expect the same parameters or register values:

mediatek::BufferAllocator ba;  // points to a vtable with an old layout
executorch::runtime::MemoryAllocator *ma = &ba;
ma->allocate(...);  // expects a new vtable layout

In the best case this would immediately crash, but in the worst case it would silently corrupt memory or execute random code.

Things would also go bad if we added a new field to the base class, because it could alias to an offset used by a subclass field.

At a higher level, it is unsafe to have prebuilt code that refers to any ExecuTorch type, unless it's guaranteed that the prebuilt code will link against a very specific (i.e., a single git hash) version of the ET runtime.

Solution

The prebuilt code should not refer to any ExecuTorch types.

Mitigation/detection

There's no way to reliably check these sorts of things; it's just plain unsafe.

But as a first-order check, the backend could compare sizeof(MemoryAllocator) across the prebuilt and non-prebuilt boundaries. E.g., it could have a function like size_t get_prebuilt_memory_allocator_size() whose implementation is inside the prebuilt, and returns sizeof(MemoryAllocator) (which will be determined at prebuild time). Then some code in a non-prebuilt .cpp file could assert sizeof(MemoryAllocator) == get_prebuilt_memory_allocator_size() before constructing a BufferAllocator, and return an error if the sizes don't match.

But this check could be defeated if the base class's fields were reordered in a way that did not affect the size of the type. It also wouldn't catch the re-ordering, addition, or removal of a virtual method, since the vtable isn't part of the size of the type.

Versions

Issue is present in main as of at least afb2664 on 2024-10-09

@dbort dbort added bug Something isn't working triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module module: mediatek Issues related to the Mediatek delegate labels Oct 9, 2024
@dbort
Copy link
Contributor Author

dbort commented Oct 9, 2024

cc: @neuropilot-captain (I can't seem to assign this issue to you)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working module: mediatek Issues related to the Mediatek delegate triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

No branches or pull requests

3 participants