How to change malloc memory alignment: 64-bit systems must align to 16 bytes, 32-bit systems to 8 bytes. #16

Open
jianjunjiang opened this issue Nov 24, 2020 · 7 comments

jianjunjiang commented Nov 24, 2020

The memory address returned by malloc must be aligned to twice the size of void *, which means 16-byte alignment on a 64-bit system and 8-byte alignment on a 32-bit system.

If you simply change the alignment constants, as in the diff below, TLSF crashes immediately. What is the correct way to make this modification?

diff --git a/tlsf.c b/tlsf.c
index af57573..6e87539 100644
--- a/tlsf.c
+++ b/tlsf.c
@@ -211,15 +211,15 @@ enum tlsf_public
        SL_INDEX_COUNT_LOG2 = 5,
 };

-/* Private constants: do not modify. */
+/* Private constants: do not modify. why ??? */
 enum tlsf_private
 {
 #if defined (TLSF_64BIT)
-       /* All allocation sizes and addresses are aligned to 8 bytes. */
-       ALIGN_SIZE_LOG2 = 3,
+       /* All allocation sizes and addresses are aligned to 8(->16) bytes. */
+       ALIGN_SIZE_LOG2 = 4,
 #else
-       /* All allocation sizes and addresses are aligned to 4 bytes. */
-       ALIGN_SIZE_LOG2 = 2,
+       /* All allocation sizes and addresses are aligned to 4(->8) bytes. */
+       ALIGN_SIZE_LOG2 = 3,
 #endif
        ALIGN_SIZE = (1 << ALIGN_SIZE_LOG2),
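
For readers who need the stricter alignment without touching the allocator's private constants, one common workaround is to over-allocate and align the returned pointer by hand. The sketch below is only an illustration of that technique: it uses the standard `malloc`/`free` as a stand-in for the underlying allocator, and the names `WANTED_ALIGN`, `aligned_alloc_wrap` and `aligned_free_wrap` are hypothetical, not part of TLSF.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names for illustration; not part of TLSF.
 * Target alignment from the issue text: twice the size of void *. */
#define WANTED_ALIGN (2 * sizeof(void *))

/* Over-allocate, round the address up to WANTED_ALIGN, and stash the raw
 * pointer just below the aligned address so it can be freed later. */
static void *aligned_alloc_wrap(size_t size)
{
    /* Reject sizes whose padded total would overflow size_t. */
    if (size > SIZE_MAX - WANTED_ALIGN - sizeof(void *)) {
        return NULL;
    }
    /* Stand-in for the underlying allocator (assumption: plain malloc). */
    void *raw = malloc(size + WANTED_ALIGN + sizeof(void *));
    if (raw == NULL) {
        return NULL;
    }
    /* Leave room for the saved pointer, then round up to the alignment. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + WANTED_ALIGN - 1) & ~(uintptr_t)(WANTED_ALIGN - 1);
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Recover the raw pointer stored by aligned_alloc_wrap and release it. */
static void aligned_free_wrap(void *p)
{
    if (p != NULL) {
        free(((void **)p)[-1]);
    }
}
```

The cost is up to `WANTED_ALIGN + sizeof(void *)` extra bytes per allocation, so this trades memory for not having to modify the allocator itself.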
        
@pavel-kirienko

Consider o1heap perhaps. It is also a constant-complexity, well-characterized allocator, but it doesn't make nontrivial assumptions about the target platform (such as limiting the memory alignment to the pointer size). I am obviously biased though because I am the author of that library, so beware.

@gulmezmerve

@pavel-kirienko I am wondering whether o1heap can work with 16-byte alignment on 64-bit targets?

@pavel-kirienko

@gulmezmerve Yes.

lfnoise commented Aug 15, 2024 via email

pavel-kirienko commented Aug 15, 2024

@lfnoise The worst case occurs at the power of two minus the overhead size plus one; on a 32-bit platform that would be the power of two minus 15 bytes. The statement about the consumed size is correct though.

The point of o1heap is that its worst-case memory consumption in the case of the worst-case heap fragmentation is lower than that of TLSF due to its more favorable placement of used blocks. There is an overview of this matter in the README; the most relevant part is quoted below:

(image: excerpt from the o1heap README)

lfnoise commented Aug 15, 2024 via email

@pavel-kirienko

> OK, is it true that if I am allocating 1024 bytes, for example, that a fragment of size 2048 bytes will be consumed because of this line?:

Yes. Power-of-two allocation is indeed close to the worst case for inner fragmentation. To illustrate the point: on a 32-bit platform, where the per-allocation overhead is 16 bytes, the inner fragmentation would be zero if you were allocating, say, 1008 bytes, since 1008 + 16 = 1024.

> And for any other power of two size, it also will consume twice the requested amount?

This is accurate.
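
The arithmetic in this exchange can be sketched with a simplified model. It assumes that the consumed fragment is the request plus a fixed per-allocation overhead, rounded up to the next power of two, and that the overhead is 16 bytes on a 32-bit platform (as implied by the "power of two minus 15 bytes" figure above). The function names are hypothetical and this is not o1heap's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two that is >= x; returns 0 if it does not fit in size_t. */
static size_t round_up_pow2(size_t x)
{
    if (x > (SIZE_MAX >> 1) + 1) {
        return 0;
    }
    size_t p = 1;
    while (p < x) {
        p <<= 1;
    }
    return p;
}

/* Simplified model (assumption): bytes taken from the heap for one request. */
static size_t consumed_size(size_t request, size_t overhead)
{
    if (request > SIZE_MAX - overhead) {
        return 0; /* request + overhead would overflow */
    }
    return round_up_pow2(request + overhead);
}

/* Inner fragmentation under this model: bytes consumed but not requested. */
static size_t inner_waste(size_t request, size_t overhead)
{
    return consumed_size(request, overhead) - request - overhead;
}
```

Under this model with a 16-byte overhead, a 1024-byte request consumes a 2048-byte fragment, a 1008-byte request fits a 1024-byte fragment exactly, and a 1009-byte request (a power of two minus 15) is the smallest one that spills into the 2048-byte fragment.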

> So, if my use case is that I will almost always be allocating exact powers of two, that this allocator will consume [a lot of memory]?

Yes (for the edited quote). In general, if you care about average-case performance rather than worst-case performance, then o1heap is unlikely to suit your application. It is intended for mission-critical real-time systems (such as embedded systems, including safety-critical ones), where the worst case is the primary concern while the average-case performance is often irrelevant.

Keep in mind the difference between inner fragmentation and outer fragmentation. Half-fit allocators (of which o1heap is one example) do have much higher inner fragmentation due to the power-of-2 rounding, but the upshot is that their worst-case outer fragmentation has a much tighter bound than that of best-fit allocators like TLSF. The result is that the worst-case total memory consumption is lower for half-fit allocators than for best-fit ones. If you're curious about the background, there are excellent articles linked from the README I mentioned earlier.

It is neither correct nor useful to gauge the worst-case memory consumption of an allocator without considering the outer fragmentation.
