Eric F. Steimle
19 May 2010
Before diving into this article, you should already have a basic understanding of how to build a simple Collaboration peer-to-peer video chat client. If you don't have your own solution running, or haven't at least built and run Adobe's example client, I urge you to do that first. A basic understanding of UDP networking is helpful but not required.
Adobe LiveCycle Collaboration Service provides a quick and easy way to get started with peer-to-peer video chat. You can quickly connect standard Collaboration components together and have your own video chat running in no time. That all works great in the lab, but when you take your solution out into the wild, you may start to notice the audio is choppy, or the video is really slow to update. If you want to fix these problems so your customers can actually use your product, you're going to have to learn to manage your bandwidth. In this article, I'll explain what I came up with to manage bandwidth in our production peer-to-peer video chat.
If you're reading this, I'm guessing you have a Collaboration video chat client and you are experiencing choppy audio, slow video, or one of the many other symptoms of a bad video chat experience. I hope to help you solve these problems by showing you how to manage your bandwidth.
The first thing you need to understand is that Collaboration peer-to-peer video and audio streams are all sent over UDP, not TCP connections. Remember how the package delivery guy used to come by, ring your doorbell, actually hand you your package, and make sure you got it? Well that's TCP. UDP is more like the delivery guy who throws your package out of his moving truck towards your door. He's got a lot of packages to deliver in a hurry and he couldn't care less about whether you get it.
It's probably also worth taking a second to note that Collaboration will automatically fall back to a hub-and-spoke mode if it cannot establish a direct peer-to-peer connection. This has the added benefit of helping you make a connection under difficult network conditions; however, it does come with the penalty of added latency.
Plenty of bandwidth
The basic idea behind using UDP is you don't have to wait for anything before you send your next packet. So, when your video chat program is sending audio data, it just happily streams out audio packet after audio packet without any idea how much data the receiver is actually getting. This is great in terms of latency for real-time data, and that's why it's used. Everything will work fine as long as the pipe between the sender and receiver can handle what's being stuffed down it. When it can't, that's when things start to fall apart.
If the pipe between the sender and receiver is not wide enough to send all that data, then packets are going to start getting dropped. So pieces of your audio and video data are just going to disappear forever. This is going to translate into choppy audio and lost video frames.
At this point you might be tempted to wish for more guaranteed delivery of your audio and video data, but audio from 20 seconds ago isn't much use in a real-time conversation. Imagine if our delivery guy was delivering you crates of bananas, but he starts to get backed up with banana deliveries. You don't want him to let those bananas go bad in the warehouse while he works through the backlog, delivering crates that have already rotted. You want him to throw out the old bananas and deliver only the fresh ones. That's the same basic concept behind dropping UDP packets here; what good is a garage full of moldy bananas?
Fixing the problem
The only way to fix this problem is to add some feedback from receiver to sender. You need to pick up the phone and tell the guy shipping you bananas to slow it down (see Figure 1). In the rest of this article, I'll cover how to know when to tell the sender to slow down and how to do the slowing down.
In a two-way peer-to-peer audio conference, you can collect data on two things: what you send and what you receive. If you want to know what the other guy received, you have to ask him.
Getting audio metrics
You know how much audio data you are sending from your audioPublisher's netStreamInfo property. You can also collect information on the audio data that is being sent to you through your audioSubscriber, like this:
streamInfo = audioSub.getNetStreamInfo(streamDesc.initiatorID);
You can get your streamDescriptors for the previous call from your streamManager by doing something like this:
var audioStreams:Object = session.streamManager.getStreamsOfType(StreamManager.AUDIO_STREAM);
Which metric to choose?
At Family Health Network, after a bunch of experimenting with restricted bandwidth between two clients, I think we got the most bang for our buck by watching the Audio Loss metric. It basically tells you when you're losing audio frames, and for us that always came with choppy audio.
Slowing it down
Collaboration can use one of two codecs for encoding audio: one is NellyMoser, and the other is the open-source codec Speex. Make sure you are using the Speex codec, as that will give you the best quality. When you create your audioPublisher, set its codec property like this:
audioPub.codec = SoundCodec.SPEEX;
With your codec set to SPEEX, you can now adjust the quality of the audio encoding, which in turn adjusts how much bandwidth it takes to transmit your audio stream. Our receiving chat client monitors its received audio loss every second. When the receiver sees its audio loss rate go over a certain threshold, it sends an out-of-band message that says "hey, slow down on the audio transmissions" (you could do this with a shared object, for instance).
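To make the receiver side concrete, here is a minimal sketch of that monitoring loop, written in plain JavaScript rather than ActionScript for illustration. The getAudioLossRate() and sendSlowDownSignal() callbacks are hypothetical stand-ins for your stats call and your out-of-band message:

```javascript
// Hypothetical receiver-side monitor: sample audio loss once a second and
// signal the sender when loss crosses a threshold.
const LOSS_THRESHOLD = 0.10; // fraction of audio frames lost

function makeLossMonitor(getAudioLossRate, sendSlowDownSignal) {
  return function check() {
    const loss = getAudioLossRate(); // e.g. read from the subscriber's stream info
    if (loss > LOSS_THRESHOLD) {
      sendSlowDownSignal(); // out-of-band, e.g. via a shared object
      return true;
    }
    return false;
  };
}

// In a real client you would run this on a one-second timer:
// setInterval(makeLossMonitor(getAudioLossRate, sendSlowDownSignal), 1000);
```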
When the slow-down signal comes in, the sender can then lower the Speex quality to one of our preset thresholds. You can use your audioPublisher's microphoneManager to set your Speex encode quality like this:
audioPub.microphoneManager.encodeQuality = 8;
The Flash documentation lists the bit rates used for Speex in Table 1.
| Quality value | Required bit rate (kilobits per second) |
|---|---|
| 0 | 3.95 |
| 1 | 5.75 |
| 2 | 7.75 |
| 3 | 9.80 |
| 4 | 12.8 |
| 5 | 16.8 |
| 6 | 20.6 |
| 7 | 23.8 |
| 8 | 27.8 |
| 9 | 34.2 |
| 10 | 42.2 |
If you are at a quality setting of 10, then you are using roughly 42.2kbps plus any overhead required to send that audio data. If your data connection can't handle a constant upstream bandwidth of 42.2kbps, then you're going to get dropped packets and choppy audio. So when the audio loss metric crosses the threshold you define, say 0.10, just adjust your Speex quality down to a lower quality value. You'll get lower quality encoding, but a much better conversation. In our solution I dropped several quality levels when there was trouble just to give us some overhead.
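The step-down can be sketched like this, in plain JavaScript for illustration. The bit-rate table mirrors the Flash documentation; the step size and quality floor are assumptions you would tune for your own app:

```javascript
// Sender-side sketch: when a slow-down signal arrives, drop the Speex
// encodeQuality several levels at once to leave some headroom.
// Bit rates (kbps) per quality level 0-10, per the Flash Microphone docs.
const SPEEX_KBPS = [3.95, 5.75, 7.75, 9.80, 12.8, 16.8, 20.6, 23.8, 27.8, 34.2, 42.2];
const STEP_DOWN = 3;   // drop several levels at once, as described above
const MIN_QUALITY = 2; // don't degrade past a floor you find acceptable

function lowerQuality(currentQuality) {
  return Math.max(MIN_QUALITY, currentQuality - STEP_DOWN);
}
```

In the ActionScript client you would apply the result with something like audioPub.microphoneManager.encodeQuality = lowerQuality(currentQuality).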
I should also mention that Audio Loss is only a valid metric when you are in peer-to-peer mode. In hub-and-spoke mode, you'll have to take the difference of the sent and received audio bandwidth. I'd use a similar approach to the one I describe for video metrics below.
Video
Video data is a little trickier, and you can make a lot more trade-offs. Do you want high-quality images at all times? Then prepare to sacrifice frame rate; in other words, you'll get clear images that rarely update.
We went the other way; our users get frantic if they don't see the other party moving. Our basic plan was: when we're in trouble, lower resolution first, then lower frame rate, and finally set a bandwidth cap.
One nice thing about using Collaboration is that it already prioritizes audio over video traffic, which is great. You need to keep in mind, however, that your available video bandwidth will really be your total bandwidth minus whatever audio bandwidth you're using.
Finding the right metric
Why not just use video packet loss like you do with audio data? Ah, because there's no video packet loss metric. Well, there is total frame loss, and we started off with that: keeping track of the changes every second, looking for spikes, and so on. In the end, though, this proved not to be good enough for what we wanted to do. The obvious thing seemed to be to compare the data rate the sender thought he was sending with the data rate the receiver was actually getting.
In practice, that was a little harder, since the receive statistics could be delayed by a second or more, and we don't have a shared time source. What we ended up doing was using a one-second timer to constantly sample the received video data rate. Then we package up that received bandwidth table in thirty-second chunks and send it back to our sender.
The received bandwidth table comes to the sender with a timestamp showing when it was sent. We use this timestamp to offset the timestamps found in the bandwidth data (the user's clock could be set to 1959 for some reason or another). Now all we have to do is line up the data we thought we sent with the data that was actually received (see Table 2).
| Timestamp | What I think I sent | What was actually received |
|---|---|---|
Then you just compare the mean transmitted rate with the mean received rate. We chose a threshold of 10%: if the mean received bandwidth falls more than 10% below the mean transmitted bandwidth, we take the mean received bandwidth and adjust our webcamPublisher to send data at a lower rate.
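Here is a sketch of that comparison, in plain JavaScript for illustration. The sample format and the reportOffset parameter are assumptions standing in for however you package your per-second bandwidth samples and derive the clock offset from the report timestamp:

```javascript
// Sender-side sketch: offset the receiver's timestamps into our own clock,
// line the samples up with our send log, and compare the means.
// Samples are arrays of { t: seconds, kbps: number }.
function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function isStarved(sentSamples, recvSamples, reportOffset, threshold = 0.10) {
  // Normalize receiver timestamps into our clock using the report offset.
  const recvByTime = new Map(recvSamples.map(s => [s.t + reportOffset, s.kbps]));
  // Keep only the seconds where both sides have a sample.
  const pairs = sentSamples.filter(s => recvByTime.has(s.t));
  if (pairs.length === 0) return false; // nothing lines up yet
  const sentMean = mean(pairs.map(s => s.kbps));
  const recvMean = mean(pairs.map(s => recvByTime.get(s.t)));
  // Trouble if the receiver saw more than `threshold` less than we sent.
  return (sentMean - recvMean) / sentMean > threshold;
}
```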
Adjusting your audio bandwidth was easy: there was just one setting. Video is not that simple. We did a bunch of testing, slowly lowering the bandwidth between the two clients and hand-adjusting the webcamPublisher settings to find what worked for us at each level. Once you have a table like that, you can just choose a setting based on the mean received bandwidth. We adjusted the following settings.
resolutionFactor
This is basically how much raw data the camera is capturing: it's the multiplier applied to Flash's basic camera resolution, which is pretty low (160x120 in the spec, though it seems to be 160x160 for me). If you want to drop your bandwidth in a hurry, just lower your resolution.
fps
The same basic principle applies here: fewer frames per second means less data needs to be sent.
bandwidth
This seems like an obvious one, but in practice it was a little tricky. The Flash camera documentation says you can use the setQuality() method like this:

camera.setQuality(25600, 0);

That's supposed to tell the camera to sacrifice quality or frame rate as needed to stay under your bandwidth cap (the first argument is the cap in bytes per second; 25600 bytes per second is roughly 200kbps, and the numbers here are just an example). In practice, this didn't always work for us unless we also adjusted resolutionFactor. As best I can tell, if your resolutionFactor is too high, the Flash Player ignores the bandwidth setting.
I'm sure I could just be doing something wrong here, but I could clearly see that happening in our graphs when we adjusted the bandwidth settings. We could not push the bandwidth lower than a certain point until we lowered our resolution.
Collaboration does provide three standard bandwidth caps to start with: LAN, DSL, and Modem. I think you will find that you'll need more levels from about 500kbps down to about 50kbps. Those settings are entirely up to you, though.
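Picking a level can be as simple as a lookup in a sorted table of preset tiers. This sketch is in plain JavaScript for illustration, and every number in it is a placeholder you would replace with values from your own testing:

```javascript
// Hypothetical preset table: one row per bandwidth tier, best tiers first.
// bandwidthBytes is the Camera.setQuality() cap in bytes per second (0 = no cap).
const VIDEO_TIERS = [
  { minKbps: 400, resolutionFactor: 3, fps: 15, bandwidthBytes: 0 },
  { minKbps: 250, resolutionFactor: 2, fps: 15, bandwidthBytes: 37500 }, // ~300kbps
  { minKbps: 120, resolutionFactor: 2, fps: 10, bandwidthBytes: 25000 }, // ~200kbps
  { minKbps: 0,   resolutionFactor: 1, fps: 8,  bandwidthBytes: 12500 }, // ~100kbps
];

// Return the best tier the measured mean received bandwidth can support.
function pickTier(meanReceivedKbps) {
  return VIDEO_TIERS.find(t => meanReceivedKbps >= t.minKbps);
}
```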
These last two sections are based on our experiments with our Collaboration video chat clients. We found that handling both of these cases greatly improved our quality, so I'll include them here.
Flushing the buffer
This interesting and somewhat annoying problem happens when switching from a high-bandwidth situation to a low-bandwidth one. It could just be how we were testing, but I've seen it in the wild on extremely slow links. Anyway, you're basically humming along on your super-fast link, and then bam: your bandwidth gets restricted down to 200kbps.
Now it seems the webcamPublisher is trying to flush all that higher-resolution data out over the new 200kbps link before it gets to the new low-resolution data. So you end up with a massive delay that sticks around forever.
I couldn't find a good way to just flush the buffer, so we set our app to stop and restart the camera automatically, and that seems to fix things. Be warned, though: if you do that, your previous bandwidth setting will be erased, so you need to reset it once you know the camera is up. You can do this by resetting the bandwidth in your camera restart function, or by overriding the onbwActualChange() function in WebcamPublisher to call camera.setQuality() with your bandwidth again.
Hold-off timers
I should probably also add that we use some hold-off timers, so we don't degrade to one level and then quickly degrade to the next. Also, for video, we have a three-strikes rule: you have to violate our mean bandwidth threshold three times before we do anything about it. All of this is designed to keep us from dropping to the lowest possible quality settings at the first sign of trouble.
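The hold-off and three-strikes logic can be sketched like this, in plain JavaScript for illustration; the window length and strike count are whatever works for your app:

```javascript
// Sketch: only degrade after several threshold violations, and never
// twice within the hold-off window after a degrade.
function makeDegrader(degrade, holdOffMs = 10000, strikesNeeded = 3) {
  let strikes = 0;
  let lastDegrade = -Infinity;
  return function onViolation(nowMs) {
    if (nowMs - lastDegrade < holdOffMs) return false; // still in hold-off
    strikes++;
    if (strikes < strikesNeeded) return false; // not three strikes yet
    strikes = 0;
    lastDegrade = nowMs;
    degrade(); // drop to the next lower preset
    return true;
  };
}
```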
Going back up
This article talks all about lowering your bandwidth to improve quality when bandwidth is scarce, but I'm sure many people will want to know: how do you go back up? It's an interesting question that all video chat developers face.
The only way to find out how wide the pipe between the sender and receiver is, is to test it. There's no magic metric that's going to tell you that you can now send 300kbps over your pipe where before you could only send 200kbps. Plus, you really don't want to constantly try to increase your bandwidth to see if it works, because that will be a bad experience for your user. Think about it: if things are going fine at 200kbps and you raise your rates to 300kbps, all those reasons why you dropped down to 200kbps are going to come right back.
I suppose the RTT (Round Trip Time) metric might give you some indication; if all of a sudden you saw it drop like a rock, it could indicate that your pipe is less congested. The problem with RTT is that it's dependent on what data you are sending and the condition of your pipe. For example, you'll see it drop on its own if you stop speaking and let your audio rate drop.
Keeping track of what RTT is at a given bandwidth level might be a good start. Personally, I think the way to go would be to open a second channel between sender and receiver and start trying to push some data down it. For instance, if you found you could reliably get an extra 50kbps, you could up your video bandwidth from 200kbps to 250kbps.
What you really want for this is a separate data connection with a lower priority than the video and audio streams. This doesn't exist yet as far as I can tell, but it would be a great feature. You could try to hack it now by using an extra hidden stream, but without the prioritizing in place, I think you'd find it just as bad for the user as periodically trying to push the bandwidth up.
For now I'm sticking with only degrading our video chats; in the future, perhaps, we'll make something that goes back up as well.
Where to go from here
Using the methods outlined in this article should help you give your users a much better video chat experience out in the real world. One of the things I didn't go over was how to test all this: If you're like me and don't have the cash for expensive network simulators, I urge you to check out the open-source IPFirewall (IPFW). I used IPFW's pipes to restrict bandwidth between sender and receiver during most of our lab testing.
IPFW comes standard with OS X, and I used a great GUI interface to it called WaterRoof.
There is also a version of IPFW for Windows on SourceForge.net, but I have not tested it myself.
Another thing we found helpful was to graph all of the incoming and outgoing statistics. That way we could see what was happening as we adjusted the bandwidth between the two clients. We basically just modified the Adobe Stratus example code. It uses the Flex data visualization library, which you can use for free if you don't mind the watermark. Figure 2 shows ours in action.