I was recently on a job installing a system with a somewhat new DSP device. We could not get the system to perform with full duplex during conferencing. The levels looked good. The echo canceller was working. It was a fairly difficult room, but nothing crazy. The problem was with the processing blocks from this manufacturer.
In an effort to keep things simple, the manufacturer provided an easy-to-use interface for Acoustic Echo Cancelling (AEC). First, you set the tail length of the room: 100 ms, 200 ms or 400 ms. Then, if there are still hints of echo, you enable the Non-Linear Processing (NLP) to clean up any residual echo sent back to the far end. It’s that simple. In principle, it makes good sense. In most rooms, it works well. In difficult rooms, the easy solution is sometimes too easy. Unfortunately, this was one of those difficult rooms.
(Author’s Note: There are many other resources available that do a much better job than I could of explaining how AEC and NLP work together. For this blog, I’ll just stick to the experience aspect of the story. That will also prevent those who don’t like to geek out on DSP algorithms from hitting their heads on their desks as they fall asleep. You’re welcome!)
We could get rid of the echo with NLP enabled, but we then had poor duplex performance. That is, when both sides of a conference spoke at the same time, the outgoing audio from the room was severely ducked. It was distracting to the users. If someone on the far end coughed or verbally agreed (e.g., “uh huh”), a few syllables of what the room was saying were lost. Conferences could be decently effective, but the experience was just fine. I hate “just fine” experiences.
If we disabled NLP, there was still some residual echo. We had to keep it on. When we spoke with the manufacturer, they agreed/admitted that this was the best their device could sound in this particular environment. However, they remembered that they kept some poor DSP programmer locked in a basement somewhere for just such an occasion. I kid you not, within 24 hours they had a revised firmware that included a dial to manage how much NLP was included in the echo cancelling process. (I feel so bad for that poor programmer. I’m not sure he remembers what sunlight looks like. I do appreciate the effort though.) You could dial in the NLP from 0 to 1 in increments of 0.01. I was blown away by the response. We loaded the new firmware, dialed the NLP to 0.72 and had an excellent conference experience with great, full-duplex performance.
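For the curious, here is a rough sketch of what a 0-to-1 NLP dial could look like in code. To be clear, this is not the manufacturer's algorithm, which I have never seen. The function names, the `echo_likelihood` input and the 40 dB maximum attenuation are all my own assumptions, chosen only to illustrate the idea of scaling how hard the outgoing audio gets ducked.

```python
def quantize_amount(amount: float, step: float = 0.01) -> float:
    """Clamp the dial to [0, 1] and snap it to the nearest step (0.01 assumed)."""
    clamped = min(1.0, max(0.0, amount))
    return round(round(clamped / step) * step, 2)


def nlp_gain(echo_likelihood: float, amount: float,
             max_atten_db: float = 40.0) -> float:
    """Return a linear gain for the outgoing (room) audio.

    echo_likelihood: hypothetical 0..1 estimate that what remains after
                     the linear echo canceller is residual echo.
    amount:          the NLP dial, 0 (off) to 1 (most aggressive).
    max_atten_db:    assumed ceiling on suppression; illustrative only.
    """
    amount = quantize_amount(amount)
    likelihood = min(1.0, max(0.0, echo_likelihood))
    # Attenuation grows with both the dial setting and the echo estimate.
    atten_db = max_atten_db * amount * likelihood
    # Convert decibels of attenuation to a linear gain factor.
    return 10 ** (-atten_db / 20)


if __name__ == "__main__":
    for setting in (0.0, 0.72, 1.0):
        print(f"dial={setting:.2f} gain={nlp_gain(1.0, setting):.4f}")
```

In a sketch like this, a dial setting of 0 leaves the outgoing audio untouched, and lower settings duck the room less when the far end makes noise. That is the trade-off we were tuning by ear: enough suppression to hide the residual echo, but not so much that double-talk gets chopped.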
In the olden days, we could coarsely set the severity of the NLP: None, Soft, Medium, Aggressive. There were four options…so quaint. Now, we have NLP granularity of roughly 100 settings. When I’m older, I think I’ll tell that to my grandkids: “In my day…we only had four NLP settings…and had to move our computer mice uphill…both ways!” We live in an amazing world where a manufacturer can revise code in less than 24 hours and give you algorithms that solve your problem perfectly. It was an excellent experience. I should say that this is technically a one-off firmware version, and it is not yet supported by the manufacturer. It worked so well, I can’t see why they wouldn’t include it in a general release. Only time will tell if flexibility and power will beat out simplicity and ease of use. I think it’s a PC vs. Mac kind of thing.