-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Random Premature/Late Transmissions with Taprio Traffic Controller #17
Comments
So isochron uses gatemask 0x2, and PTP uses gatemask 0x4.
When you increase isochron's window beyond 0.002 seconds (2 ms, the interval of the "gatemask 0x2" entry, it starts overlapping with the PTP schedule entry. And when that happens, sometimes isochron will wake up just early enough to send a timestampable packet and "steal" ptp4l's chance of doing the same thing. It's an i210 hardware limitation where there cannot be more than one TX timestampable packet in flight at any given time, see the many threads (only on isochron) on the topic: Check ethtool -S enp1s0 | grep tx_hwtstamp_skipped to make sure I'm right.
I don't think those are a problem of the tool. I forget if the log is saved when the program aborts due to a premature/late transmission. If it is, can you do a postmortem of the log file with isochron-report, and see what were the S, T, t and s times for the failed (last) seqid? If we don't find something obviously miscalculated/off-looking, I'd tend to trust the data. Can you please post your taprio configuration script for I210? I know the igb driver doesn't even support full taprio offload. I think it must use the txtime-assist feature ( It almost sounds like there's no actual packet scheduling in your NIC, and you're just relying on pure chance and on the operating system's scheduler to send packets out at the right time. |
Hey, I should've specified that when I change the isochron window times, I also change the taprio configurations to match them. With regards to the isochron window size being bigger causing ptp to fault, this was the taprio configuration I was using:
and the isochron send command would also change to reflect these changes (different window size and cycle time). Here is the command that I am using to change the taprio configurations:
You do seem to be right about the isochron packet replacing the ptp packet and it can be seen in the ethtool command (output: |
Because, as mentioned, with "flags 0" (implicit if you don't specify it) and "clockid CLOCK_TAI", you use software-only packet scheduling (the hrtimer-based kernel implementation from net/sched/sch-taprio.c). That will do a best-effort attempt of delivering packets for transmission to the NIC. You are not protected against gate overruns. It is entirely plausible that, even with isochron sitting quietly in its time slot, the software qdisc will still schedule a ptp4l packet a bit late, and isochron will tramp over it. With "flags 0", you get just about as much TSN out of the i210 NIC as from any non-TSN Realtek, Mediatek, or $insert_vendor card here. With the added downside that the i210 can't handle multiple TX timestamps in flight. |
I think this wiki is maintained by Intel people. See how they recommend configuring the taprio root with |
Hey,
I'm currently using isochron to measure the delay from one I210 to another using a taprio traffic controller over VLAN. Unfortunately, it seem to be getting a lot of variable results. Sometimes the program will run perfectly fine for over a minute, other times it will result in late/premature transmissions within seconds, discarding all other timestamps.
The taprio traffic controller configuration is as follows:
where isochron has socket priority 6 (using traffic class 1).
The isochron send command I am using is as follows:
sudo isochron send -i enp1s0.5 -s 64 --client 10.0.0.1 -c 0.01 -t 1 -w 0.002 -F ${fileprefix}-isochron.dat -n 6500 -o -O 37 --cpu-mask $((1 << 1)) --sched-fifo --sched-priority 99 -4 -J 10.0.0.1 -p 6 -S 0.002 -Q
and the following is the isochron receive:
taskset -a $((1 << 1)) sudo isochron rcv -i enp5s0.5 -t 1 -O 37 --sched-fifo --sched-priority 99 -4
The resulting isochron output on the send side is:
I've noticed that increasing the window size of isochron causes ptp (which is running on traffic class 2) to fault:
and decreasing the window size any more results in isochron having consistent premature/late transmissions.
I was mainly wondering if this was common behavior with isochron, if there were any configuration changes that could be made to help stabilize the results, and finally if there are any other known causes that may be interfering with the isochron processes.
The text was updated successfully, but these errors were encountered: