-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mmc: sdhci: fix max req size based on spec #5571
Conversation
For almost 2 decades, the max allowed requests were limited to 512KB because of SDMA's max 512KiB boundary limit. ADMA2 and ADMA3 do not have such limits and were effectively made so any kind of block count would not impose interrupt and managing stress to the host. By limiting that to 512KiB, it effectively downgrades these DMA modes to SDMA. Fix that by actually following the spec: When ADMA is selected tuning mode is advised. On lesser modes 4MiB transfers is selected as max, so re-tuning if timer trigger or if requested by host interrupt, can be done in time. Otherwise, the only limit is the variable size of types used. In this implementation, 16MiB is used as maximum since tests showed that after that point, there are diminishing returns. Also 16MiB in worst case scenarios, when card is eMMC and its max speed is a generous 350MiB/s, will generate interrupts every 45ms on huge data transfers. A new `adma_get_req_limit` sdhci host function was also introduced, to let vendors override imposed limits by the generic implementation if needed. For example, on local tests with rigorous CPU/GPU burn-in tests and abrupt cut-offs to generate huge temperature changes (upwards/downwards) to the card, tested host was fine up to 128MB/s transfers on slow cards that used SDR104 bus timing without re-tuning. In that case the 4MiB limit was overridden with a more than safe 8MiB value. In all testing cases and boards, that change brought the following: Depending on bus timing and eMMC/SD specs: * Max Read throughput increased by 2-20% * Max Write throughput increased by 50-200% Depending on CPU frequency and transfer sizes: * Reduced mmcqd cpu core usage by 4-50% Signed-off-by: CTCaer <[email protected]>
@CTCaer this is your patch from CTCaer/switch-l4t-kernel-4.9@fa86ebb but with the whitespace changes and custom vendor override hook removed (if required then they should be additional independent patches). We've been asked to test and look at upstreaming it, but as per https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin that requires a genuine name alongside the email address. Is there any chance you could provide such? Replying via this PR is generally considered sufficient as it is a public record, and I can then rewrite the commit and re-push. Thanks. |
Looks interesting. Please drop the following part from the commit message:
|
if (host->tuning_mode != SDHCI_TUNING_MODE_3) | ||
mmc->max_req_size = 4194304; | ||
else | ||
mmc->max_req_size = 16777216; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are defines which are easier to read: SZ_4M and SZ_16M
Sorry, the referenced user isn't interested in merging this way due to the public nature of the PR. I wrote in the original forum thread how it should be done but that was not followed. |
It's an open source project, so how can it not be public? |
@6by9 no you are missing the point and context. the commit is written with a signed-off-by tag already under the users pseudonym. The user requested that they be contacted via that email address directly (how it is done in mainline to resolve such issues) so that they could either A: change the commit and place it under a real name without reference to the original commit (as the author, they can do that) or B: give authorship writes to someone else entirely. A/B keep the user pseudonym CTCaer's anonymity. Doing the change here as part of a PR that is easily searchable does not. |
THIS NEEDS A CORRECT SoB BEFORE BEING CONSIDERED FOR MERGE.
https://forums.raspberrypi.com/viewtopic.php?t=354518 for testing.
For almost 2 decades, the max allowed requests were limited to 512KB because of SDMA's max 512KiB boundary limit.
ADMA2 and ADMA3 do not have such limits and were effectively made so any kind of block count would not impose interrupt and managing stress to the host.
By limiting that to 512KiB, it effectively downgrades these DMA modes to SDMA.
Fix that by actually following the spec:
When ADMA is selected tuning mode is advised.
On lesser modes 4MiB transfers is selected as max, so re-tuning if timer trigger or if requested by host interrupt, can be done in time. Otherwise, the only limit is the variable size of types used. In this implementation, 16MiB is used as maximum since tests showed that after that point, there are diminishing returns.
Also 16MiB in worst case scenarios, when card is eMMC and its max speed is a generous 350MiB/s, will generate interrupts every 45ms on huge data transfers.
A new
adma_get_req_limit
sdhci host function was also introduced, to let vendors override imposed limits by the generic implementation if needed.For example, on local tests with rigorous CPU/GPU burn-in tests and abrupt cut-offs to generate huge temperature changes (upwards/downwards) to the card, tested host was fine up to 128MB/s transfers on slow cards that used SDR104 bus timing without re-tuning.
In that case the 4MiB limit was overridden with a more than safe 8MiB value.
In all testing cases and boards, that change brought the following:
Depending on bus timing and eMMC/SD specs: