🎻 May 2023 fiddle
Tsahi: Hi there. It is time for our monthly fiddle, and this time we’re going to do something slightly different. Instead of looking at WebRTC, we’re going to look at WebCodecs, and we’re going to check how they deal with frame losses. And that would also teach us some things about codecs when it comes to WebRTC. With me is Philipp Hancke, and let’s start.
Philipp: Yes. So not WebRTC this time, but we’re going to use a lot of APIs that you are already familiar with from WebRTC. Let’s first look at the elements we have. We have the usual localvideo and remotevideo elements. We have a drop-down for selecting the codec; we are going to use either VP8 or H.264. We have a start button. We have a button to drop a frame, which generates frame loss, and a button to generate a keyframe. That part is pretty easy. Let’s look at the JavaScript part. We have an encoder, which is a WebCodecs API. What we’re going to do is wire up this encoder to a WebCodecs decoder. You always need to configure your decoder. If we get some metadata from the encoder along with the encoded frame, we’re going to configure the decoder accordingly. In a production app, you would need to transfer this configuration somehow, similar to what you do in WebRTC with the SDP. We have a MediaStreamTrackGenerator, which is a WebRTC-ish API that we can put our decoded frames into, and which then makes them available as a MediaStreamTrack that we’re already used to from WebRTC.
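The wiring described above can be sketched roughly as follows. This is not the fiddle's actual source. The function name `createLoopback` and the `api` parameter are assumptions made for illustration; `api` defaults to the browser globals and exists only so the wiring can be exercised with stand-in classes outside a browser.

```javascript
// Sketch: connect a WebCodecs encoder straight to a WebCodecs decoder.
// `writeFrame` receives every decoded VideoFrame (hypothetical callback name).
function createLoopback(writeFrame, api = globalThis) {
  const decoder = new api.VideoDecoder({
    output: (frame) => writeFrame(frame),
    error: (e) => console.error('decoder error', e),
  });

  const encoder = new api.VideoEncoder({
    output: (chunk, metadata) => {
      // The encoder attaches a decoder configuration to its output metadata
      // when one is available; use it to (re)configure the decoder.
      if (metadata && metadata.decoderConfig) {
        decoder.configure(metadata.decoderConfig);
      }
      decoder.decode(chunk);
    },
    error: (e) => console.error('encoder error', e),
  });

  return { encoder, decoder };
}
```

In a browser that supports MediaStreamTrackGenerator, `writeFrame` could be `(frame) => writer.write(frame)`, where `writer` comes from `new MediaStreamTrackGenerator({ kind: 'video' }).writable.getWriter()`, and the generator itself can be wrapped in a `MediaStream` and assigned to the remote video element.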
Tsahi: We’re doing that in order to be able to drop the frames in this case?
Philipp: Well, we’re going to need something to take the decoded frames and display them.
Tsahi: Okay.
Philipp: That’s basically the way to do it. MediaStreamTrackGenerator is something that you can put frames into, and then it will give you the MediaStreamTrack object that you’re used to from WebRTC or from getUserMedia. Okay. We have our decoder, which is simply going to feed into the generator. Then we have our actual main function, which is going to be triggered when you hit the start button. It enables and disables some buttons. It configures the encoder to use the codec that we see in the drop-down, which is either VP8 or H.264. We do 640×360, the bitrate is 1 megabit per second, 30 frames per second. You can also play with hardware acceleration, so whether you want the hardware encoder or the software encoder, or just let the browser pick. Then we have the stream that we get from getUserMedia. We show it in the local video element. Then we have a MediaStreamTrackProcessor, which is another breakout box API, which lets us read frames from the local track and feed them into the encoder. That is just what we do here. We have the processor, we get a reader on it, and then we just do a loop.
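As a rough illustration of the encoder settings listed above, here is a sketch of a config builder. The helper name `buildEncoderConfig` and the codec strings are assumptions: the transcript only says "VP8 or H.264", and `avc1.42001f` (H.264 Baseline, level 3.1) is just one plausible choice.

```javascript
// Hypothetical mapping from the drop-down value to a WebCodecs codec string.
const CODEC_STRINGS = {
  'VP8': 'vp8',
  'H.264': 'avc1.42001f', // assumed profile/level, not taken from the fiddle
};

// Build a VideoEncoderConfig-shaped object with the values from the transcript.
// hardwareAcceleration may be 'no-preference', 'prefer-hardware'
// or 'prefer-software'.
function buildEncoderConfig(selectedCodec, hardwareAcceleration = 'no-preference') {
  const codec = CODEC_STRINGS[selectedCodec];
  if (!codec) {
    throw new Error(`Unsupported codec: ${selectedCodec}`);
  }
  return {
    codec,
    width: 640,
    height: 360,
    bitrate: 1_000_000, // 1 megabit per second
    framerate: 30,
    hardwareAcceleration,
  };
}
```

In a browser, the result would be passed to `encoder.configure(...)`. Checking it first with `VideoEncoder.isConfigSupported(...)` is a reasonable precaution, since codec and hardware support vary between devices.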
Philipp: We read a frame from the local track and encode it using the encoder. Whether we want a keyframe or not depends on whether it’s the first frame or whether the button was hit, so we reset this keyframe flag here, and then we close the local frame. That will then trigger the encoder’s output function, which goes and calls decode on the decoder, which will in turn call the output function on the decoder, which feeds into the generator.
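The read loop described above might look something like this sketch. The names `pumpFrames` and `state.keyFrameRequested` are made up for illustration; in a browser, `reader` would come from `new MediaStreamTrackProcessor({ track }).readable.getReader()` and `encoder` would be a configured `VideoEncoder`.

```javascript
// Read frames from the local track and feed them to the encoder.
// `state.keyFrameRequested` is a hypothetical flag that the
// "generate keyframe" button would set to true.
async function pumpFrames(reader, encoder, state) {
  let isFirstFrame = true;
  while (true) {
    const { done, value: frame } = await reader.read();
    if (done) break;

    // Ask for a keyframe on the first frame or when the button was hit,
    // then reset the flag so only one keyframe is produced per request.
    const keyFrame = isFirstFrame || state.keyFrameRequested;
    isFirstFrame = false;
    state.keyFrameRequested = false;

    encoder.encode(frame, { keyFrame });
    // Frames hold scarce resources, so close each one once it is encoded.
    frame.close();
  }
}
```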
Tsahi: Okay. And we’re generating keyframes on a button click on the encoder side, and we’re dropping on the decoder side.
Philipp: Yes.
Tsahi: Okay.
Philipp: Okay, so let’s just do that. We start, we see video. So far, that’s just a simple WebCodecs pipeline running locally. But what happens if we drop a frame? Not much to see here. Let’s drop a couple more. Oh, now we see this corruption. The codec is not happy anymore; it got something wrong. The frames that we’re now decoding depend on frames that we lost, so they no longer represent the actual state of the encoder.
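To make the dependency problem concrete, here is a toy model, not a real decoder. It assumes the simplest possible reference structure, where every delta frame depends on the frame right before it, so a single loss taints everything up to the next keyframe. Real decoders conceal some of this damage, which is why a single drop often shows "not much". The function name `markCorruption` is invented for this sketch.

```javascript
// Toy model: each delta frame references the frame immediately before it.
// frameTypes: array of 'key' | 'delta'; droppedIndexes: frames that are lost.
function markCorruption(frameTypes, droppedIndexes) {
  const dropped = new Set(droppedIndexes);
  let referenceIntact = false;
  return frameTypes.map((type, index) => {
    if (dropped.has(index)) {
      referenceIntact = false; // the next frame's reference is now missing
      return 'dropped';
    }
    if (type === 'key') {
      referenceIntact = true; // a keyframe needs no reference and resets state
      return 'ok';
    }
    return referenceIntact ? 'ok' : 'corrupt';
  });
}

console.log(markCorruption(['key', 'delta', 'delta', 'delta', 'key', 'delta'], [2]));
// prints [ 'ok', 'ok', 'dropped', 'corrupt', 'ok', 'ok' ]
```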
Tsahi: Usually, you don’t see these things in WebRTC?
Philipp: Yes, because WebRTC does a lot of things to prevent this kind of frame loss from happening.
Tsahi: For example, it can freeze, it can generate keyframes, it can retransmit. It tries to get over that, and this is why doing that manually here makes sense if you want to learn more about these types of artifacts or problems.
Philipp: Yes. What WebRTC typically does, if it can’t decode a frame for 200 milliseconds, is ask the remote end to send a keyframe, which we’re going to do by clicking this button, which generates a keyframe, and we’re good again. So that is VP8. It gets ugly pretty quickly. But what if we choose H.264 instead? So no frame loss yet, but what happens if we drop a frame? We dropped a couple of frames now, 20, 25, and it still looks okay. If I move my hand, you see these artifacts. Dropping more gives more artifacts, but it is repairing itself pretty well. So this shows H.264 was a bit more robust to this frame loss.
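The 200 millisecond rule mentioned above can be sketched as a small watchdog. This only illustrates the idea as stated in the conversation; it is not how any WebRTC implementation is actually written, and the class and method names are hypothetical. Timestamps are passed in explicitly so the logic does not depend on a real clock.

```javascript
// Tracks when a frame was last decoded and reports when a keyframe
// should be requested from the sender.
class KeyframeWatchdog {
  constructor(timeoutMs = 200) {
    this.timeoutMs = timeoutMs;
    this.lastDecodedAt = null;
  }

  // Call whenever a frame was decoded successfully.
  onDecoded(nowMs) {
    this.lastDecodedAt = nowMs;
  }

  // True once no frame has been decoded for at least timeoutMs.
  needsKeyframe(nowMs) {
    return this.lastDecodedAt !== null &&
      nowMs - this.lastDecodedAt >= this.timeoutMs;
  }
}
```

In the fiddle, the "generate keyframe" button plays the role of that request: it makes the encoder produce a keyframe, which gives the decoder a clean starting point again.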
Tsahi: I wonder if that would be the case if you had used VP8 with temporal scalability?
Philipp: That’s a good question. I’m not sure if you can configure that in WebCodecs.
Tsahi: I don’t know, but there are other mechanisms at play as well that can configure the encoder itself, in specific codecs, to be more robust to these kinds of things.
Philipp: Yes. If you lose a VP8 frame from a higher temporal layer, the decoder would still be able to decode without the lost frame, because that’s what scalable video encoding gives you: you encode with different references. And the problem with these artifacts is that you don’t have the complete references you need.
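The effect of temporal layers on frame loss can be illustrated with another toy model. It assumes a two-layer pattern in which even frames form the base layer (T0) and odd frames form the upper layer (T1), and every frame references only the most recent T0 frame. The function name is invented for this sketch. On the open question above: the WebCodecs `VideoEncoderConfig` does define a `scalabilityMode` field with values such as `"L1T2"`, though whether a given browser and codec honor it is something to verify rather than assume.

```javascript
// Toy model of two temporal layers: even index = T0 (base), odd index = T1.
// Frame 0 is treated as the keyframe. Every other frame references only the
// latest T0 frame, so losing a T1 frame never damages a later frame.
function decodeWithTwoTemporalLayers(frameCount, droppedIndexes) {
  const dropped = new Set(droppedIndexes);
  let baseIntact = false;
  const result = [];
  for (let i = 0; i < frameCount; i++) {
    const layer = i % 2;
    if (dropped.has(i)) {
      if (layer === 0) baseIntact = false; // base layer chain is broken
      result.push('dropped');
    } else if (i === 0) {
      baseIntact = true; // keyframe
      result.push('ok');
    } else {
      result.push(baseIntact ? 'ok' : 'corrupt');
    }
  }
  return result;
}

console.log(decodeWithTwoTemporalLayers(6, [3]));
// prints [ 'ok', 'ok', 'ok', 'dropped', 'ok', 'ok' ]
console.log(decodeWithTwoTemporalLayers(6, [2]));
// prints [ 'ok', 'ok', 'dropped', 'corrupt', 'corrupt', 'corrupt' ]
```

Losing an upper-layer frame costs only that one frame, while losing a base-layer frame still breaks every frame after it until the next keyframe.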
Tsahi: So what have we seen exactly? We’ve seen how to use WebCodecs and breakout box, or Insertable Streams, whatever you want to call them. We’ve played with frame losses, or we can even call them packet losses in some way, and seen what artifacts they cause in the video. Then there are all the other things around how to fix that, which we’re not covering here.
Philipp: Okay.
Tsahi: Thank you for this, and see you in our next Fiddle of the Month.