Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add readRTX function for TrackRemote #2519

Closed
wants to merge 1 commit into from

Conversation

San9H0
Copy link

@San9H0 San9H0 commented Jul 27, 2023

Add readRTX function for TrackRemote

Description

You can use readRTX function in onTrack

In SDP, like this

a=rtpmap:96 VP8/90000
a=rtcp-fb:96 goog-remb
a=rtcp-fb:96 transport-cc
a=rtcp-fb:96 ccm fir
a=rtcp-fb:96 nack
a=rtcp-fb:96 nack pli
a=rtpmap:97 rtx/90000
a=fmtp:97 apt=96

If rtx stream is negotiated, TrackRemote can read packets from rtx tracks

There is sample code in examples/read-rtx folder

@codecov
Copy link

codecov bot commented Aug 1, 2023

Codecov Report

Patch coverage is 25.00% of modified lines.

Files Changed Coverage
track_remote.go 7.40%
rtpreceiver.go 52.94%

📢 Thoughts on this report? Let us know!.

@adriancable
Copy link
Contributor

@San9H0 - thank you for your contribution! This is an important improvement because right now distinct-SSRC RTX packets are not retrievable. So I would like to get this PR merged.

A few things needed before we can do this. First some changes which I've made - please can kindly update your branch with these changes and re-push. (I don't have write access to your fork.) Changes are as follows:

  1. Add godoc comments for the public API functions (e.g. ReadRTX). I have done this (see attachments).
  2. RTX streams can contain padding probes (see: https://chromium.googlesource.com/external/webrtc/+/master/pc/g3doc/rtp.md). These do not have an OSN prefix added (since they don't correspond to packets in the 'unrepaired' stream) and right now your ReadRTX() chokes on these, first by reading past the end of the packet buffer and then ultimately giving a buffer-to-small error later on. I have fixed this (see my attachments).

Two things you will need to do yourself:

  1. Add a CI test for ReadRTX()
  2. (Optional but I highly recommend it) add an example showing ReadRTX() usage

One thing I am not sure about (@Sean-Der, @davidzhao - thoughts please!) - in this PR you have removed the goroutine in receiveForRid() which polls repairInterceptor.Read() and moved this to readRTX(). So after this change, this poll does not occur if you are not yourself calling ReadRTX() (which wraps readRTX()) to pull the RTX packets. In this case, the TWCC generator interceptor will miss the RTX packets so it will erroneously flag them as 'not received'. I am not sure if offering e.g. a recvonly video/rtx and then not actually calling ReadRTX() is a supported/sensible use case, so not sure if this behavioural change is OK or not, but it is a behavioural change.

I have done a bit of testing on the amended version and it does appear to work very nicely, so thank you.

track_remote.go.txt
rtpreceiver.go.txt

@San9H0
Copy link
Author

San9H0 commented Sep 2, 2023

@adriancable you're right
In the old code, If readRTX is not called, It seems that problems will is occur.

I think it would be good to roll back to repairInterceptor.Read() in receiveForRtx.
And It would be better to handle ReadRTX inside receiveForRtx.

I'll try to fix it

@adriancable
Copy link
Contributor

@San9H0 - that sounds like a better approach. I'm happy to review any changes you make. I think correct handling of distinct-SSRC RTX is important so am keen to see this one through, but we need to watch for implementation gotchas.

@San9H0 San9H0 force-pushed the rtxfortrackremote branch 3 times, most recently from d234cdd to af30fb6 Compare September 3, 2023 16:12
@adriancable
Copy link
Contributor

@San9H0 - this looks good. It looks like you have committed an errors.go with some 'no op' changes. Can you kindly remove that from the commit? I'll then do a bit of testing of the code and we'll go from there.

@Sean-Der
Copy link
Member

Sean-Der commented Sep 3, 2023

Amazing work @San9H0! I am so glad you are tackling this. This is the biggest deficiency in Pion currently.

API wise I had these goals originally.

  • Users shouldn't need to call a new API to get RTX packet.
    Many Pion users don't understand WebRTC. Things should work out of the box if thy call Read

  • Should work with Simulcast also

  • Users should know if an RTX is happening.
    For debug purposes and just to empower users. This doesn't have to be possible via public API, but seems like an important use case still?

The first commit doesn't have to be perfect either. I would prefer to see small commits land sooner. As long as we have a plan to get somewhere good eventually :)

This is going to be a big change. Would love to hear everyone else's thoughts if they have it.

@davidzhao @adriancable @at-wat @cnderrauber @jech @boks1971 @Antonito @kevmo314 @mengelbart @streamer45

@adriancable
Copy link
Contributor

@Sean-Der et al. - my thoughts as follows.

Users shouldn't need to call a new API to get RTX packet.
Many Pion users don't understand WebRTC. Things should work out of the box if thy call Read

I agree, but I actually think @San9H0's approach fits this principle! Right now, 'out of the box', using the default interceptors, you will get NACK feedback which asks the remote to retransmit on the same SSRC and you get everything using Read(). @San9H0's work doesn't interfere with this 'out of the box' use case.

But if the user does want the remote to RTX on a distinct SSRC, presumably that's because they want to separate out the RTX stream from the original stream. If they want this, then they have to call RegisterCodec for video/rtx and then they need to poll ReadRTX().

So to me, how things work with @San9H0's approach feels 'right': you can choose, out of the box, just call Read, get original & retransmitted packets together (just like before). Or, if you want RTX on a separate stream - now you can do that too - just RegisterCodec video/rtx, then use Read to get the original packets & ReadRTX to get the RTX packets.

Thoughts?

@cnderrauber
Copy link
Member

cnderrauber commented Sep 4, 2023

Looks good to me! @Sean-Der I would agree with @adriancable about the API design. We don't enable rtx codec by default so if a user wants rtx on a distinct SSRC, he might need it for two reasons:

  1. Accurate loss rate stats on the wire
  2. Separate padding packets from the media stream to avoid extra work in forward path (sequence munge, NACK for padding...)
    These will require an explicit read from RTX packet. So a separate ReadRTX() API will satisfy both advanced and basic use cases.

Add readRTX function for TrackRemote
Add example/read-rtx
@San9H0
Copy link
Author

San9H0 commented Sep 4, 2023

@adriancable What is 'no op' changes ? I don't understand
In errors.go, Just added errRTPReceiverWithRtxSSRCTrackStreamNotFound. It is helpful for debug

@jech
Copy link
Member

jech commented Sep 4, 2023

I agree that we need to be able to distinguish non-RTX from RTX packets. However, I'm not sure it makes sense to have two distinct flows of packets. What about the following API:

  • the existing Read/ReadRTP functions return either an original or an RTX packet, indifferently;
  • there's a new ReadPacket function (not too keen on the name) that returns both the packet and a boolean indicating whether it was an RTX packet.

Thus, we get a single flow of packets, but indications about which packets were resent.

@boks1971
Copy link
Contributor

boks1971 commented Sep 4, 2023

I second @jech 's choice for the API. On read side, having a single flow of packets with an attribute of RTX vs not would be good. If a separate API like ReadPacket makes that possible, apps can adopt that if they need that information.

On the write side (which this PR does not have, but leaving my 2c 😄 ), a separate method like WriteRTX would be great. It would be easy to use when handling NACKs or when sending padding packets for bandwidth probing. WriteRTX can maintain sequence numbers for RTX stream and wrap the incoming data in an RTX packet and write downstream. Maybe, this could done with an interceptor too a la the TWCC sender?

@adriancable
Copy link
Contributor

@boks1971 - I totally appreciate the API design is a subjective matter, so I am not trying to say what is 'right' and 'wrong'. But I think that @San9H0's original API design is preferable, for two reasons. Let me explain.

First, it would indeed be ideal and super clean if we could simply extend the current track.Read() with a flag indicating if the returned packet is RTX or not. Unfortunately this is not really possible. In the same-SSRC RTX situation (which is the default), it is not possible to distinguish RTX packets. So if we made some API modification to track.Read() with an RTX flag we would end up in a situation where the flag would be present always, but only work in distinct-SSRC RTX. This would be an 'unexpected' subtlety of the kind which is likely to surprise developers, and in general when designing APIs good design is largely about preventing unexpected behaviour. In this case the simplest approach to avoid this is to have a separate API function that just deals with reading the RTX track. That way the developer can choose to deal with it if they want, or not if they don't, but in neither case do unexpected things happen. This is basically @San9H0 approach. So +1 points for the design choice there, from me.

Second, we do as you point out in the future need to implement a track.WriteRTX(). Pion in general has symmetrical API calls for reading and writing, this is nice and clean and we should be careful about breaking this. If we do anything other than track.ReadRTX() now, we would end up introducing a schism in the API when we later want to do track.WriteRTX().

@jech
Copy link
Member

jech commented Sep 4, 2023

In this case the simplest approach to avoid this is to have a separate API function that just deals with reading the RTX track. That way the developer can choose to deal with it if they want, or not if they don't,

I don't understand. If RTX packets are only visible to a new ReadRTX function, then all clients must be modified to call both Read and ReadRTX (in separate goroutines). If they don't, they'll miss all of the RTX packets, right?

To my untrained eyes, this seems to imply that we need to retrofit the existing Read functions to return RTX packets when available, as this is the only way to avoid modifying all existing clients.

Pion in general has symmetrical API calls for reading and writing

The symmetry is only skin-deep. On the writing side, having separate functions causes little complication, since the writer may choose which of the two to call, and may safely ignore the new function. On the reading side, however, having separate functions causes significant complication, since all users must be updated to call both functions. In parallel (in separate goroutines).

@adriancable
Copy link
Contributor

adriancable commented Sep 4, 2023

I don't understand. If RTX packets are only visible to a new ReadRTX function, then all clients must be modified to call both Read and ReadRTX (in separate goroutines). If they don't, they'll miss all of the RTX packets, right?

Regardless of the approach taken, clients will need to be modified if they wish to use distinct-SSRC RTX (by, at the very least, calling RegisterCodec for video/rtx). I don't think we want to change Pion's default behaviour, here, and register video/rtx by default since this would definitely break things.

If clients want to continue to use same-SSRC RTX like they're doing today, they don't need to change anything, and they won't miss RTX packets since they're sent on the same track.

To my untrained eyes, this seems to imply that we need to retrofit the existing Read functions to return RTX packets when available, as this is the only way to avoid modifying all existing clients.

If we can find a clean, non-breaking way to do this, I am supportive. I don't like retrofitting Read but then also having another function (ReadPacket, ReadRTX etc.) since this feels over-complicated. So the question is, can we do everything we need by modifying Read without breaking backwards compatibility or introducing unexpected behaviours?

One possibility that comes to mind is to modify Read to (1) rewrite the SSRC in the returned packet with the original video track, if it's coming from the RTX track, and (2) add something in the interceptor.Attributes return value with the 'actual' unrewritten track SSRC. I think we would need to do (1) to make it transparent to clients, otherwise they will end up getting some packets with the expected SSRC from Read, and some packets with the RTX track SSRC from Read - if we rewrite the SSRC in the latter case then this reduces the 'modification' needed on the client side. (2) allows us to differentiate distinct-SSRC RTX packets from regular track packets if we wish - we can just check if the 'actual' track SSRC value returned in interceptor.Attributes matches the video track SSRC or the RTX SSRC. @jech, this seems closer to your line of thought, what do you think?

@jech
Copy link
Member

jech commented Sep 5, 2023

The point I'm making is that there needs to be a single function (method) that returns the next packet, whether it came on the main track or the RTX track. Currently, a read loop for a track looks like this:

for {
    count, _, err := track.Read(buf)
    if err != nil {
        break
    }
    handlePacket(buf[:count])
}

I'm hoping that the new API can make something like this possible:

for {
    count, isRTX, _, err := track.ReadPacket(buf)
    if err != nil {
        break
    }
    handlePacket(buf[:count], isRTX)
}

I'd like to avoid having to write something like this in every application:

var wg sync.WaitGroup
wg.Add(2)
go func() {
    defer wg.Done()
    for {
        count, _, err := track.Read(buf)
        if err != nil {
            break
        }
        handlePacket(buf[:count], false)
    }
}()
go func() {
    defer wg.Done()
    for {
        count, _, err := track.ReadRTX(buf2)
        if err != nil {
            break
        }
        handlePacket(buf2[:count], true)
    }
}()
wg.Wait()

(Adding error handling and avoiding goroutine leaks is left as an exercise to the reader.)

@adriancable
Copy link
Contributor

adriancable commented Sep 5, 2023

@jech -

The point I'm making is that there needs to be a single function (method) that returns the next packet, whether it came on the main track or the RTX track.

I agree this would be very nice! What I am arguing for (and inviting people to shoot down if I have missed something) is that this single function doesn't need to be a new function. Instead, we can make Read() return packets both from the main track or the RTX track and return something in the interceptor.Attributes parameter (e.g. the 'real' SSRC of the packet) which will let someone identify the actual source of the packet, if they want that information, but equally they can ignore it if they don't want it.

So your example right now:

for {
    count, _, err := track.Read(buf)
    if err != nil {
        break
    }
    handlePacket(buf[:count])
}

... would continue to work just the same with video/rtx. But if you want, instead of throwing away the interceptor.Attributes parameter, you could do something like:

for {
    count, a, err := track.Read(buf)
    if err != nil {
        break
    }
    if uint32(a.Get("actualSSRC")) != uint32(track.SSRC()) {
      // it's from the RTX track
    }
    handlePacket(buf[:count])
}

Does this make sense? If so, we don't need to add any new function at all. We can just continue to use track.Read() and everything will just work, but the user can differentiate separate-SSRC RTX packets from main track packets if they wish in their read loop.

@kcaffrey
Copy link

I like the approach of returning both regular and RTX packets in Read(), with something in the attributes to indicate whether the packet was from an RTX stream. It has a lot of benefits:

  • Backwards compatible API. Existing callers, if they ignore the new attribute, would see the same thing they see with same-SSRC RTX.
  • It is possible to distinguish RTX from regular packets via the interceptor attributes
  • No extra goroutines needed

As far as the attribute goes, I think we can have some helper to make it more user-friendly, e.g. (I don't care about the exact API or naming, just an example):

for {
    count, a, err := track.Read(buf)
    if err != nil {
        break
    }
    if rtxSSRC, ok := webrtc.CheckIfRTX(a); ok {
      // it's from the RTX track
    }
    handlePacket(buf[:count])
}

As far as writes go, I agree that it's not a big deal to have a separate method to call. It would still be backwards compatible, and any caller that would need to use that API needs to do extra work anyway (such as finding a recent unacknowledged packet to send, if using for probing).

@Sean-Der
Copy link
Member

Sean-Der commented Sep 24, 2023

I would like to put a is_rtx(name?) inside the interceptor.Attributes. This isn't an API break and follows the design of interceptors!

If a user really cares they can check. The majority of Pion WebRTC users I don't believe will care.

@adriancable
Copy link
Contributor

adriancable commented Sep 24, 2023

@Sean-Der - I have been thinking some more about this - here's where I'm at right now.

Summary: I'm beginning to think it's better to have separate Read() and ReadRTX() functions, as per the original PR. But ultimately it boils down to a philosophy of how high-level pion should abstract what's happening behind the scenes.

Detail: When operating with distinct SSRC RTX, the main video and RTX video are on separate tracks. So if we do everything via Read(), we would need to peek at both the main and RTX RTP streams to see if there's any data there, and based on the result, read from one of those tracks. If there's data on both tracks, then we would need to return whichever RTP packet was received first, which means we also need to timestamp RTP packets on receipt, or maintain some kind of transport-wide sequence number. Both of these things add overhead, which may be non-negligible since the RTP packet rate can be quite high and pion often runs on low-end hardware.

The main argument for abstracting away main vs RTX packets and just returning either via Read() is that it's simpler for the user if they don't need to differentiate between main and RTX packets. But to that I would say: if the user doesn't need to differentiate, they should/would just use same-SSRC RTX and then 'everything just works' already today. As far as I can see, the only case where the user would want RTX on a separate track is if they want to handle main and RTX packets differently in some way. So, they would always need to check the result of Read() to see if it's an RTX packet or not. In this case, it's no more complex for the user to make separate calls to Read() and ReadRTX() and then we avoid the additional overhead in Read() which would be borne by everyone.

Do you agree/disagree?

@kcaffrey
Copy link

if the user doesn't need to differentiate, they should/would just use same-SSRC RTX and then 'everything just works' already today

This is not always the case. libwebrtc prefers to use RTX for probing both because you can probe with larger packets (instead of being limited to the maximum padding-only packet size) and because probing with redundancy provides value, whereas probing with padding is purely overhead. However, libwebrtc will only use RTX for probing when there is an RTX stream (ie, distinct SSRC) negotiated. There are many reasons for this (some of which are discussed in RFC 4588), but despite the reasoning, if you want to use libwebrtc to communicate with Pion and have libwebrtc use RTX for probing, distinct SSRC is required. However, many users in such a situation do not actually care about the distinction between RTX and non-RTX packets.

Note that other than better bandwidth efficiency, one good reason to negotiate RTX streams (and therefore have libwebrtc use RTX for probing), is that apparently libwebrtc's bandwidth estimation is not well tested in the case where padding is used to probe for video. A recent issue (1477602) filed regarding libwebrtc's BWE was partially attributed to not having RTX streams negotiated. Google Meet and MS Teams (two of largest WebRTC apps, driving a lot of decisions for libwebrtc) both negotiate RTX, therefore the code paths for probing with padding data are not given as much attention.

@adriancable
Copy link
Contributor

@kcaffrey - thanks for the insight. Do you think this swings things far enough to make a single Read() function handling both the main and RTX tracks the right approach? (NB, handling probe packets on RTX requires special handling anyway, since they lack the OSN field, but this can/should be done on the pion side)

I am concerned about the overhead. If we did it this way, we would need to do 2 peeks + 1 read internally per call (instead of just 1 read). I haven't looked at the internal implementation so I don't know how significant this would be, but I think overhead is at least a consideration in the opposite direction.

@kcaffrey
Copy link

kcaffrey commented Sep 25, 2023

Overall I still am in favor of a single Read() method, but the concern about overhead is a fair point.

One thing to consider is that for a user that has negotiated RTX, they will have to pay that overhead somewhere. If there are separate Read() vs ReadRTX() methods, there is the overhead of additional goroutines (which can become an issue for mixer servers) as well as overhead of synchronization between the two reading goroutines.

In the case where RTX is not negotiated, is it possible to have little to no overhead with the single Read() approach? That is, only paying the overhead cost when RTX is used? If so, I think that pushes things more strongly towards the single method. If all users, whether or not they have negotiated RTX, have to pay the overhead, then I'm not sure.

@adriancable
Copy link
Contributor

adriancable commented Sep 25, 2023

@kcaffrey - we can definitely avoid any/most overhead if there's no separate RTX track.

I'll have a go later this week to see if I can put together a reasonable implementation. Thinking out loud the way it will probably work is that data on the RTX track will be read via a goroutine and passed to the TrackRemote via a channel. The Read() call on TrackRemote will not block on the RTX track, but will return a packet from the channel if available (along with some attribute that would allow the caller to check if it's RTX, e.g. the SSRC which the caller can then compare against the main track/RTX track SSRC if they wish, or ignore otherwise), otherwise will block on the main track like it does today. This will minimise overhead.

@Sean-Der
Copy link
Member

@adriancable

Would you be ok with having one Read if it had no performance overhead?

I believe distinct SSRC is always better. These are the two things I have seen recently.

  • The ReplayDetector may discard very old packets
  • It allows the user to distinguish between late packets and a rtx (better stats)

So would be nice if 'out of the box' we gave that to users.If we can't make it work would a 'High Performance/Single SSRC' mode be a ok compromise?

@adriancable
Copy link
Contributor

adriancable commented Sep 25, 2023

@Sean-Der - yes. I'll see if I can put together something on this basis. (minimal/no overhead, no special mode needed)

Quick question: it looks right now that we're registering video/rtx in webrtc.RegisterDefaultCodecs. Surely this can't possibly be right because right now we just ignore the RTX track? Won't that have the effect of making the NACK generator effectively a no-op? (NACKs get sent by us correctly, but RTX will be sent by the remote on a separate track so we'll lose all retransmissions?)

@adriancable
Copy link
Contributor

@San9H0 - can you kindly give me write access to your repo? I'd like to make a bunch of changes based on this discussion, and it's easier to base off your PR rather than create a new PR. Thank you!

@adriancable
Copy link
Contributor

Superceded by #2592.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants