Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAN-ISOTP kernel driver limited to messages of 8200 bytes #5371

Closed
tlalexander opened this issue Mar 7, 2023 · 23 comments
Closed

CAN-ISOTP kernel driver limited to messages of 8200 bytes #5371

tlalexander opened this issue Mar 7, 2023 · 23 comments

Comments

@tlalexander
Copy link

Describe the bug

I really appreciate all the hard work folks have put in to this code. I am working on sending firmware updates to an embedded device over CAN bus and I ran in to a problem. The CAN-ISOTP driver has a hard coded limit of 8200 bytes, but my firmware is 103,000 bytes.

You can see in the code comment that this limit is recognized as arbitrary, and that technically up to 4GB is possible. However for very large values this would probably need more code architecture around it.

linux/net/can/isotp.c

Lines 87 to 92 in 652a330

/* ISO 15765-2:2016 supports more than 4095 byte per ISO PDU as the FF_DL can
* take full 32 bit values (4 Gbyte). We would need some good concept to handle
* this between user space and kernel space. For now increase the static buffer
* to something about 8 kbyte to be able to test this new functionality.
*/
#define MAX_MSG_LENGTH 8200

I recompiled my kernel to raise this value to 1,000,000, and I have had good success sending my firmware which is 103,000 bytes. My embedded device, an RP2040 processor, only has about 240k of ram so I would never need to send more than about 200k max.

We will open source all of our code, so maybe others will want to try firmware updates over CAN bus. However the hard limit in the kernel would need to be raised. I understand that we cannot raise it to the maximum without more work, but I wonder if we could raise it to perhaps 1 megabyte without much trouble? This would allow more embedded users to try firmware updates over CAN bus. And it would be nice if I did not have to use a custom kernel for my robots.

If this value is raised, then it would make sense to update the corresponding values in the can-utils, like this one:
https://github.com/linux-can/can-utils/blob/master/isotpsend.c#L60

It looks like all of this is maintained by @hartkopp (hello again and many thanks for your assistance last time, my application is finally working nicely and I really appreciate all your code!)

Steps to reproduce the behaviour

Using the CAN-ISOTP kernel driver, attempt to send more than 8200 bytes. It will fail.

Device (s)

Raspberry Pi CM4

System

Sorry the device is away at the moment, but you can see from my first link that this limit is hard coded in the code and the comment acknowledges this. Please let me know if I should retrieve this system info.

Logs

No response

Additional context

No response

@6by9
Copy link
Contributor

6by9 commented Mar 7, 2023

This is a file that is unmodified from mainline Linux, so requests should really be routed there.

I do note that 5.18 and above have increased that limit to 64kB via 9c0c191

@hartkopp
Copy link
Contributor

hartkopp commented Mar 7, 2023

Hi @tlalexander
There are two reasons for the current MAX_MSG_LENGTH 66000 in isotp.c

  1. The original idea was to to flash a 64kbyte bootloader in one piece. And that led to a 32 bit length value. An enormous length but hey 32 bit values looked good for that.
  2. The rx/tx buffers in isotp.c are currently static buffers that are allocated when opening an isotp socket.

For a quick fix please increase the MAX_MSG_LENGTH in your kernel source on your own and build the can-isotp.ko module for your setup.

In general we could think about replacing the static buffers with a dynamic memory allocation implementation and probably also some sendfile() syscall which would simplify the usage on the sending side. So far there was no use-case nor people asking for such feature.
Even if we move to dynamic memory allocation the MAX_MSG_LENGTH value would be limited to some sane length maybe around 4MB.

@pelwell
Copy link
Contributor

pelwell commented Mar 7, 2023

Would it not make sense for the maximum size to be a module parameter? It's less effort than going fully dynamic, without forcing everyone to allocate more memory.

@hartkopp
Copy link
Contributor

hartkopp commented Mar 7, 2023

Yes, a module parameter would make sense! I'm not sure if a module parameter is still 'appropriate' in kernel developments these days or if there would be a need for providing a sysfs API. I will take a look at it. Thx!

@pelwell
Copy link
Contributor

pelwell commented Mar 7, 2023

Upstream seem to prefer module parameters to configuration via Device Tree, so I hope it would be acceptable.

@hartkopp
Copy link
Contributor

hartkopp commented Mar 7, 2023

I think this only applies to drivers. In our case we are in the networking subsystem.

@tlalexander
Copy link
Author

Something configurable would be lovely. Thanks for the explanation on the concerns re: static buffers.

I don't know how parameters work but is there anything I can do to help?

@hartkopp
Copy link
Contributor

hartkopp commented Mar 8, 2023

I don't know how parameters work but is there anything I can do to help?

Thanks. The main part is to rework the code for non-static buffers. I already looked into the code yesterday - but I'm currently pretty busy with my 'real' job. Maybe I can provide a PoC patch at the weekend.

@tlalexander
Copy link
Author

Ah well very cool. Much appreciated and take your time.

@hartkopp
Copy link
Contributor

I already checked some things out and even if we would provide a module parameter we should limit this value to a reasonable value.

What would you propose for a maximum PDU size value before people need to change the values in kernel code (again)? 260kbyte? 520kbyte? (just something above 256 or 512 to add checksums etc)

@tlalexander
Copy link
Author

I would be inclined to say 1 or 2 megabytes. I know I would need a maximum of around 260kB, and I assume someone out there could need more. Could you briefly explain the drawbacks of larger values? Is it purely a concern of RAM usage or is there more to it? If it was purely about RAM usage I would make the number much higher, possibly up to the maximum which I think I read is 4GB. Only if we are talking about user-modified parameters. Very few people would change this value and if they do it might be safe to assume they are willing to live with the consequences, though I do not yet understand what those are.

@hartkopp
Copy link
Contributor

Hi @tlalexander ,

I posted a RFC patch which allows to extend the PDU size up to 1025 kByte (1MByte + extra space) here:

https://lore.kernel.org/linux-can/[email protected]/T/#u

The patch can be downloaded from the raw link:
https://lore.kernel.org/linux-can/[email protected]/raw

It took some more time, but now the extended buffers are only allocated for the specific socket/connection once when 'big' PDUs need to be transferred.

Would you like to check out this patch whether it fits your requirements?

@hartkopp
Copy link
Contributor

Hi all,

I've sent an updated version (v2) of the patch on the Linux-CAN mailing list:

https://lore.kernel.org/linux-can/[email protected]/T/#u

The patch can be downloaded from the raw link:
https://lore.kernel.org/linux-can/[email protected]/raw

This new version limits the minimum 'max_pdu_size' to 4095 to maintain the classic behavior before ISO 15765-2:2016

Is anyone interested to test and/or approve the patch in their setup?

Thanks,
Oliver

@tlalexander
Copy link
Author

Hello, thank you for the patches!
I can't tell from your second comment - is the PDU limited to 4095 again, or will I be able to set larger values? I saw the email about UDS breaking things in to chunks, but since ISOTP already breaks things in to chunks and reassembles them, I found it convenient not to implement another chunking layer on top.
Just let me know, I can try to test this out on Friday this week.

@hartkopp
Copy link
Contributor

The second comment is only about the configuration that you can not set the PDU size below 4095 byte.

UDS currently is not using ISO-TP PDU sizes > 4095. When there are things to transfer like 'big' firmware blobs, UDS is breaking the blob into ISO-TP PDU chunks while ISO-TP breaks it into CAN frame (chunks).

@tlalexander
Copy link
Author

Thank you. Sorry for the delays but I believe I should be able to try this Monday. Have a nice weekend!

@tlalexander
Copy link
Author

Okay I finally tested this! I was in to a big PCB design project so it took me a while to go back to CAN bus and kernel stuff. but I appreciate the work you do and I want to make sure I am helping where I can.

I pulled in the kernel source and compiled the module with no changes first, to test on a fresh system just to make sure I had things working. As expected I was able to send PDUs up to the hard coded maximum value from the original code, and beyond that it would throw an error.

Then I applied your patch, recompiled and tested that module. With no parameters applied I hit the default limit of max 8300 bytes. I unloaded the module and reloaded it with the limit set to 20kbytes. Then I was able to send 20,000 bytes, but 20,001 bytes would produce an error, as expected. I unloaded the module and reloaded it with a higher limit, testing again I could send the maximum but it would fail with one byte beyond the maximum. I tested this at 20k, 68k, 100k, 160k, and 175k. I can't send much more than 175k because my microcontroller only has 240k of ram. But from my tests it looks like it is working great! If you would like, I can run this between two raspberry pi compute modules with my CAN hardware and test it with higher values. It would take a little setup time, but I want to make sure you feel the code is well tested.

Thank you for working up these changes! The loadable kernel parameter is really nice, I had never used that before. I also never knew how to load just one module - originally I recompiled and replaced the entire kernel and all modules to raise the limit.

intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this issue Apr 4, 2023
With ISO 15765-2:2016 the PDU size is not limited to 2^12 - 1 (4095)
bytes but can be represented as a 32 bit unsigned integer value which
allows 2^32 - 1 bytes (~4GB). The use-cases like automotive unified
diagnostic services (UDS) and flashing of ECUs still use the small
static buffers which are provided at socket creation time.

When a use-case requires to transfer PDUs up to 1025 kByte the maximum
PDU size can now be extended by setting the module parameter
max_pdu_size. The extended size buffers are only allocated on a
per-socket/connection base when needed at run-time.

changes since v2: https://lore.kernel.org/all/[email protected]
- use ARRAY_SIZE() to reference DEFAULT_MAX_PDU_SIZE only at one place

changes since v1: https://lore.kernel.org/all/[email protected]
- limit the minimum 'max_pdu_size' to 4095 to maintain the classic
  behavior before ISO 15765-2:2016

Link: raspberrypi/linux#5371
Signed-off-by: Oliver Hartkopp <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
@hartkopp
Copy link
Contributor

hartkopp commented Apr 4, 2023

But from my tests it looks like it is working great! If you would like, I can run this between two raspberry pi compute modules with my CAN hardware and test it with higher values. It would take a little setup time, but I want to make sure you feel the code is well tested.

I think that is not needed. I also tested the new module option several times and just wanted to make sure I did not forget anything vital ;-)

Many thanks for testing! Have fun with it!

Best,
Oliver

@tlalexander
Copy link
Author

Hi Oliver,
Thanks again for all your help with this code change. I have been working on compiling updated kernel modules and for some reason my raspberry pi is not liking my cross compiled modules so I am doing a native build on the Pi to see if that helps.
However I did notice that while this change was accepted on the torvalds linux branch, it has not been pulled in to the raspberry pi linux branch. My life would be easier if this change was just built in to the raspberry pi kernel. Do you think the change could be pulled in?

Here is the commit history for the isotp.c file on the raspberry pi branch:
https://github.com/raspberrypi/linux/commits/rpi-6.1.y/net/can/isotp.c

And here it is on the torvalds linux branch:
https://github.com/torvalds/linux/commits/master/net/can/isotp.c

I noticed some newer commits have come to raspberry pi but this one has not, so I am unsure what to expect there in the future and thought I would ask.
Thank you!

6by9 pushed a commit to 6by9/linux that referenced this issue Jul 29, 2023
commit 96d1c81 upstream.

With ISO 15765-2:2016 the PDU size is not limited to 2^12 - 1 (4095)
bytes but can be represented as a 32 bit unsigned integer value which
allows 2^32 - 1 bytes (~4GB). The use-cases like automotive unified
diagnostic services (UDS) and flashing of ECUs still use the small
static buffers which are provided at socket creation time.

When a use-case requires to transfer PDUs up to 1025 kByte the maximum
PDU size can now be extended by setting the module parameter
max_pdu_size. The extended size buffers are only allocated on a
per-socket/connection base when needed at run-time.

changes since v2: https://lore.kernel.org/all/[email protected]
- use ARRAY_SIZE() to reference DEFAULT_MAX_PDU_SIZE only at one place

changes since v1: https://lore.kernel.org/all/[email protected]
- limit the minimum 'max_pdu_size' to 4095 to maintain the classic
  behavior before ISO 15765-2:2016

Link: raspberrypi#5371
Signed-off-by: Oliver Hartkopp <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
6by9 pushed a commit to 6by9/linux that referenced this issue Jul 29, 2023
With ISO 15765-2:2016 the PDU size is not limited to 2^12 - 1 (4095)
bytes but can be represented as a 32 bit unsigned integer value which
allows 2^32 - 1 bytes (~4GB). The use-cases like automotive unified
diagnostic services (UDS) and flashing of ECUs still use the small
static buffers which are provided at socket creation time.

When a use-case requires to transfer PDUs up to 1025 kByte the maximum
PDU size can now be extended by setting the module parameter
max_pdu_size. The extended size buffers are only allocated on a
per-socket/connection base when needed at run-time.

changes since v2: https://lore.kernel.org/all/[email protected]
- use ARRAY_SIZE() to reference DEFAULT_MAX_PDU_SIZE only at one place

changes since v1: https://lore.kernel.org/all/[email protected]
- limit the minimum 'max_pdu_size' to 4095 to maintain the classic
  behavior before ISO 15765-2:2016

Link: raspberrypi#5371
Signed-off-by: Oliver Hartkopp <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
@6by9
Copy link
Contributor

6by9 commented Jul 29, 2023

Raspberry Pi stick with the Long Term Supported (LTS) kernels, which at present is 6.1.
Mainline Linux backports patches considered to be significant fixes to the LTS kernels (generally anything with a Fixes: ....tag in the commit text), but not general improvements.

The patch increasing the buffer size would appear to be torvalds/linux@96d1c81. If you look at that patch in Github it tells you that it is in master, and tags from v6.5-rc3 to v6.4-rc1, ie it was merged to the 6.4 kernel.

Three options if you want to test it now.

  • Raspberry Pi do build the 6.4 and 6.5 kernels. If you use sudo rpi-update rpi-6.4.yyou will get the 6.4 branch, and make the obvious change to get the rpi-6.5.y branch. You have to accept that the non-LTS branches get far less testing than the other branches, and you may get issues. Please test it on a non-critical Pi first as you may have to reinstall.
  • Build your own kernel, and cherry-pick the commit. https://www.raspberrypi.com/documentation/computers/linux_kernel.html#building for our instructions on building the kernel. If you don't want to actually build the kernel yourself but can create a pull request, then CI attached to this repo will build PRs, and the artefacts can be installed via ```sudo rpi-update pulls/XXXX`` (where XXXX is the PR number).
  • Persuade us that backporting it is worthwhile.

Whilst typing this out I went down the route of 2, did the cherry-pick and created #5557. Give it an hour and sudo rpi-update pulls/5557should give you a current 6.1 kernel with that patch.
If you can test it and confirm that it works, then I suspect pelwell will be prepared to accept it.

@hartkopp
Copy link
Contributor

Thanks for the comprehensive documentation how to handle such cases for the RasPi environment!

@tlalexander
Copy link
Author

Thank you @6by9 for getting this together for me! I have tested this on multiple Raspberry Pi CM4's and they are all working well. I also ran this on my robot and did some firmware updates for the CAN devices which require CAN-ISOTP packets of about 150kbytes in size and that worked flawlessly. In addition to that I ran the robot in normal operation and all looks good. Seems like this would be okay to merge as far as I can tell.

pelwell pushed a commit that referenced this issue Aug 1, 2023
commit 96d1c81 upstream.

With ISO 15765-2:2016 the PDU size is not limited to 2^12 - 1 (4095)
bytes but can be represented as a 32 bit unsigned integer value which
allows 2^32 - 1 bytes (~4GB). The use-cases like automotive unified
diagnostic services (UDS) and flashing of ECUs still use the small
static buffers which are provided at socket creation time.

When a use-case requires to transfer PDUs up to 1025 kByte the maximum
PDU size can now be extended by setting the module parameter
max_pdu_size. The extended size buffers are only allocated on a
per-socket/connection base when needed at run-time.

changes since v2: https://lore.kernel.org/all/[email protected]
- use ARRAY_SIZE() to reference DEFAULT_MAX_PDU_SIZE only at one place

changes since v1: https://lore.kernel.org/all/[email protected]
- limit the minimum 'max_pdu_size' to 4095 to maintain the classic
  behavior before ISO 15765-2:2016

Link: #5371
Signed-off-by: Oliver Hartkopp <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
@tlalexander
Copy link
Author

This has been merged in to 6.1. Thanks everyone! Our open source farming robot is better for this change. 🌱

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants