Refactor LPSPI with word packing, futures #157

mciantyre · 2024-04-30T17:57:13Z

Instead of using LPSPI continuous transactions, we pack the user's data
into u32 words. The primitives for translating the user's data to / from
the data FIFOs should help us as we consider async LPSPI drivers. And,
in order to implement embedded-hal 1.0 traits, we'll need something like
the dummy transmit / receive helpers, since we need to handle differing
transmit / receive buffer sizes. The included unit tests don't trigger
an error in Miri, and they try to simulate how we'd use the primitives
in firmware. (This is an "it's not obviously wrong" test, not an "it's
correct" test; help me review here.)

The commit introduces spinning futures into the LPSPI driver. By
combining and spinning on these futures, we can realize in-place
transfers, read-only transactions, write-only transactions, etc.

We no longer return the Busy error; we'll wait for transmit FIFO space.
We also never return the NoData error, instead returning success when
there's no I/O to do. Since this commit is a non-breaking change, the
two errors are still available in the error enum. I'll remove them
later.

I'm moving the blocking SPI example into RTIC and rewriting the driver
test. The tests demonstrate overlapping writes, writes with flushes, and
in-place transfers with a physical loopback. There's also tests that
show how word sizes and bit orders interact. I'd appreciate if folks
could test these changes in their system, since it affects how the
embedded-hal implementations behave. I'm only testing this commit on a
1170EVK with the new example.

Florob

I've tested this on my custom i.MX-RT 1020 board. It breaks the SPI display I'm using, which used to work (with a small workaround for the BUSY issue). Unfortunately I can't sniff that bus. I've bisected the issue to the last commit, introducing word packing and async functions.
I'll try to see if I can also see issues with the EVK where I can look at what's being transmitted.
Generally I'm not sure how to feel about using async. Looking at the generated code I have to say it optimizes much better than I would have expected. However, there is still some code size and runtime overhead.

src/lib.rs

mciantyre · 2024-05-20T11:59:55Z

Thanks for the feedback. One thing I noticed about this change: without a spin on the busy flag, this driver reduces the delay between transfers, or the time that chip select is deasserted between subsequent transactions. Given a 1MHz CLK, I see a reduction from 1430ns to 530ns for that delay.

If your workaround includes a spin on the busy flag, if you're bursting write(...)s, and if you were previous at your SPI display's minimum delay between transfers, then this could affect you.

There's some refactoring we could do to reclaim code space. I pushed a commit to show those changes. I agree that it's still not optimal; in particular, defining transfer with a try_join combinator doesn't get close to what might be ideal.

It's convenient (and fun) to define I/O like this, but I'm happy to drop it if folks need that space. mciantyre/imxrt-hal@ab053df isn't applied here, but it drops the futures from this branch.

Florob · 2024-05-20T19:17:39Z

Thanks for the feedback. One thing I noticed about this change: without a spin on the busy flag, this driver reduces the delay between transfers, or the time that chip select is deasserted between subsequent transactions. Given a 1MHz CLK, I see a reduction from 1430ns to 530ns for that delay.

That's interesting. I had tested the EVK yesterday and thought I saw the opposite result. So I re-tested and it turns out that inter-transfer gaps can be really strange at higher clock frequencies (I had 11 MHz). There are sometimes gaps in the order of multiple microseconds. Toggling a GPIO between each write() call indicates that in these cases there is actually more time spend in the write() call. However, that is AFAICT not a regression, and there are interactions with I-cache that might cause this, so it's probably fine.
The implication being that this is absolutely fine on i.MX RT 1021 and at least as far as bus performance goes does not appear to have a negative impact. Rather spinning on FIFO space instead of Busy seems to be an improvement.

If your workaround includes a spin on the busy flag, if you're bursting write(...)s, and if you were previous at your SPI display's minimum delay between transfers, then this could affect you.

Fortunately (because that would have been virtually impossible to fix) it's likely something else. For one the minimum allowed transfer gap is specified as 40ns, but I've also realized something else:
SPI MIPI interfaces typically have a separate D/C (data/command) pin, which is used to indicate what the data on the SPI bus means. This requires the SPI transfers to be blocking, or at least the ability to flush. That does not entirely explain why spinning for Busy on subsequent write() calls worked, but fundamentally brings us back to a variation on the original bug report: Users of e-h 0.2 expect the trait's methods to block. e-h 1.0 allows not blocking for the SpiBus trait, adding flush(). The SpiDevice trait's methods are however still blocking (implicitly flush).

There's some refactoring we could do to reclaim code space. I pushed a commit to show those changes. I agree that it's still not optimal; in particular, defining transfer with a try_join combinator doesn't get close to what might be ideal.

Interesting, that having two spin_on() (one via wait_for_fifo_space()) calls per operation reduces code size. I assume that is partially due to the inliner refusing to inline the wait_for_fifo_space().

It's convenient (and fun) to define I/O like this, but I'm happy to drop it if folks need that space. mciantyre/imxrt-hal@ab053df isn't applied here, but it drops the futures from this branch.

It's probably better to worry about this later. Having the bug fixed and e-h 1.0 support is more important than code size at least to me. If this turns out to be a real problem we can still try to optimize further down the line.

mciantyre · 2024-05-24T12:49:38Z

SPI MIPI interfaces typically have a separate D/C (data/command) pin, which is used to indicate what the data on the SPI bus means. This requires the SPI transfers to be blocking, or at least the ability to flush. That does not entirely explain why spinning for Busy on subsequent write() calls worked, but fundamentally brings us back to a variation on the original bug report: Users of e-h 0.2 expect the trait's methods to block.

I keep forgetting folks want to synchronize external components with the LPSPI through these eh02 interfaces. The latest commit adds a flush at the end of writes and transfers, striving for approach 1 in #136. I'm also not sure why a busy spin in the subsequent write worked, but a concluding flush should provide better guarantees.

Any other thoughts on these changes? Otherwise, I plan to release this into the 0.5 series for more user testing.

CHANGELOG.md

teburd

This looks good to me, and makes sense. I love all the tests here.

I guess if you move to an async API some form of completion notification (interrupt) would be needed to mark a Future<> complete is that right? Is there an interrupt for FIFO empty/FIFO watermark/FIFO full events? I'd guess so? I don't have the ref manual handy or lpspi IP block handy in my mind to answer those, and not really needed here, but were curiosity questions that popped up while reading through this.

Florob · 2024-05-24T19:05:45Z

I keep forgetting folks want to synchronize external components with the LPSPI through these eh02 interfaces.

Obviously so do I, even though I'm technically one of these folks…

Any other thoughts on these changes? Otherwise, I plan to release this into the 0.5 series for more user testing.

Sounds and looks good to me.

Instead of using LPSPI continuous transactions, we pack the user's data into u32 words. The primitives for translating the user's data to / from the data FIFOs should help us as we consider async LPSPI drivers. And, in order to implement embedded-hal 1.0 traits, we'll need something like the dummy transmit / receive helpers, since we need to handle differing transmit / receive buffer sizes. The included unit tests don't trigger an error in Miri, and they try to simulate how we'd use the primitives in firmware. (This is an "it's not obviously wrong" test, not an "it's correct" test; help me review here.) The commit introduces spinning futures into the LPSPI driver. By combining and spinning on these futures, we can realize in-place transfers, read-only transactions, write-only transactions, etc. These implementations flush the FIFOs, allowing users to synchronize external components with LPSPI I/O. We no longer return the Busy error; we'll wait for transmit FIFO space. We also never return the NoData error, instead returning success when there's no I/O to do. Since this commit is a non-breaking change, the two errors are still available in the error enum. I'll remove them later. I'm moving the blocking SPI example into RTIC and rewriting the driver test. The tests demonstrate overlapping writes, writes with flushes, and in-place transfers with a physical loopback. There's also tests that show how word sizes and bit orders interact. I'd appreciate if folks could test these changes in their system, since it affects how the embedded-hal implementations behave. I'm only testing this commit on a 1170EVK with the new example.

mciantyre · 2024-05-27T22:50:04Z

Thanks for the reviews!

I guess if you move to an async API some form of completion notification (interrupt) would be needed to mark a Future<> complete is that right? Is there an interrupt for FIFO empty/FIFO watermark/FIFO full events? I'd guess so?

Yup, the LPSPI supports watermarks and interrupt events for both FIFOs. We could use those to drive async I/O.

mciantyre mentioned this pull request Apr 30, 2024

Implementations of blocking embedded_hal traits should never return "Busy" #136

Closed

mciantyre force-pushed the lpspi-spin branch from 3ab5bdf to 6162a70 Compare May 9, 2024 12:41

mciantyre marked this pull request as ready for review May 9, 2024 12:41

Florob reviewed May 14, 2024

View reviewed changes

src/lib.rs Outdated Show resolved Hide resolved

mciantyre commented May 24, 2024

View reviewed changes

CHANGELOG.md Show resolved Hide resolved

teburd approved these changes May 24, 2024

View reviewed changes

mciantyre force-pushed the lpspi-spin branch from 4562999 to f4a369e Compare May 27, 2024 22:48

mciantyre merged commit f4a369e into imxrt-rs:main May 27, 2024
31 checks passed

mciantyre deleted the lpspi-spin branch May 27, 2024 22:51

mciantyre mentioned this pull request Jun 13, 2024

Remove LPSPI prototyping leftovers #164

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor LPSPI with word packing, futures #157

Refactor LPSPI with word packing, futures #157

mciantyre commented Apr 30, 2024 •

edited

Loading

Florob left a comment

mciantyre commented May 20, 2024 •

edited

Loading

Florob commented May 20, 2024

mciantyre commented May 24, 2024

teburd left a comment •

edited

Loading

Florob commented May 24, 2024

mciantyre commented May 27, 2024

Refactor LPSPI with word packing, futures #157

Refactor LPSPI with word packing, futures #157

Conversation

mciantyre commented Apr 30, 2024 • edited Loading

Florob left a comment

Choose a reason for hiding this comment

mciantyre commented May 20, 2024 • edited Loading

Florob commented May 20, 2024

mciantyre commented May 24, 2024

teburd left a comment • edited Loading

Choose a reason for hiding this comment

Florob commented May 24, 2024

mciantyre commented May 27, 2024

mciantyre commented Apr 30, 2024 •

edited

Loading

mciantyre commented May 20, 2024 •

edited

Loading

teburd left a comment •

edited

Loading