RS232 Overload


Checklist Item Under Test: 3.4.17: The port and processor are of an adequate size and buffer given their intended function.

Reasoning: RS232’s popularity may be fading, but it is still a common method of bidirectional control over AV devices. Without access to debugging tools, it is often not possible to observe all the data moving back and forth between the control system and the device. There may be incredible amounts of data being passed, which can flood a port and stall the control processor. There are also effects from the data rates that may create issues with control systems. And the polling cadence to obtain status updates may also be problematic. Flooded RS232 ports have definitely caused their fair share of sporadic, mysterious control system lockups requiring a processor reboot. With some experience and attention to detail, they can be avoided.

The Story: “Call me Ishmael. Test Mic 1.”

“What? Stop messing around. What did you just do?”

“Nothing. I didn’t do anything.”

“Well, you did something, because the processor just locked again. Did you press anything?”

“I just cleared the cue on the mixer.”

“…Do it again…Holy schnikeys! That’s it. That’s the one.”

“Clearing the cue? But that shouldn’t do anything to the control system….”

We were called onto a jobsite where the control system would randomly lock up and require a reboot. It might happen twice a day. It might not happen for a week. It was a big system, but operators had narrowed the problem down to the control processor interfacing with the system mixing console. The users needed some way to recall scenes on the console, so an RS232 connection was run for this communication. It was for a large company where performance was way more important than price, so the first step they took was to buy an entire control processor dedicated to this mixer scene recall function. That’s it. You read that correctly: an entire processor to recall eight different scenes on a mixer. However, when the lockups continued even after the mixer had its own processor, they figured they needed to drill down a little deeper into the issue.

We sat and observed the operators for a bit, and nothing happened. Scenes were recalled rapidly, and the performance seemed perfectly fine. All devices were behaving. Levels were adjusted. Mock events were set up and tested. No lockups were observed. We were logged into the control system and could monitor all the communications between the mixer and processor, and the strings of commands were very straightforward. It was very frustrating because there was no obvious cause. We continued our hunt.

Then we started exercising the mixer a bit. We adjusted faders. We cued sources. We basically just wanted to go through all the features used during an event. And that’s when the processor started to lock up, but the lockups followed an operator, not a task. For example, when Iris set up the event, it was fine, but when Rachel did it, the processor locked up. We were getting closer to finding the issue. We were so close, we could taste it. The search for the issue was starting to consume us.

So, we had Rachel set up the event, one press/command at a time…and nothing. Then it was Iris’ turn, but before they changed seats, the system locked up. That’s also when the exchange at the start of our tale occurred.

When Iris would clear the board, she would clear the sources she cued (PFL) individually. Rachel, on the other hand, would use the handy-dandy “Clear Cue” button. That Clear Cue button was the troublemaker. When it was pressed, it would send an incredibly large amount of data to the control processor for some reason. I’m not entirely sure of the specifics, but my guess is that it exceeded the buffer on the RS232 port of the control processor, and when that happens, the processor freezes. It took a very long time to find this particular idiosyncrasy in the system, but at long last, our hunt was over. We had found our Moby Dick of control protocols.

Fixing it was easy, of course. Once you find the problem, creating a solution is a piece of cake. The programmer simply told the processor to ignore all data after the header information from the Clear Cue button press. We also could have switched to an IP-based control protocol, which would have eased the bandwidth and buffer limitations of the serial port. Either way would have worked fine.
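The "ignore everything after the header" fix can be sketched roughly as follows. This is not the code used on that job, and the story does not record the mixer's actual protocol, so the `ClearCueFilter` class, the `b"CLRCUE"` header, and the carriage-return terminator are all hypothetical stand-ins. The sketch also assumes the terminator never appears inside the Clear Cue payload.

```python
from typing import List


class ClearCueFilter:
    """Passes normal messages through, but drops the payload that follows
    a Clear Cue header. Header and terminator values are hypothetical."""

    def __init__(self, header: bytes, terminator: bytes = b"\r",
                 max_buffer: int = 256):
        self.header = header
        self.terminator = terminator
        self.max_buffer = max_buffer  # cap on bytes held for one message
        self._buf = bytearray()
        self._discarding = False

    def feed(self, chunk: bytes) -> List[bytes]:
        """Accept raw bytes from the port; return the messages worth handling."""
        messages = []
        for byte in chunk:
            self._buf.append(byte)
            if self._discarding:
                # Keep only enough of the tail to recognize the terminator.
                del self._buf[:-len(self.terminator)]
                if bytes(self._buf) == self.terminator:
                    self._buf.clear()
                    self._discarding = False
            elif bytes(self._buf) == self.header:
                # Report the button press, then throw away what follows.
                messages.append(bytes(self._buf))
                self._buf.clear()
                self._discarding = True
            elif self._buf.endswith(self.terminator):
                messages.append(bytes(self._buf[:-len(self.terminator)]))
                self._buf.clear()
            elif len(self._buf) > self.max_buffer:
                # Never let an unterminated message grow without bound.
                self._buf.clear()
        return messages


port_filter = ClearCueFilter(header=b"CLRCUE")
flood = b"SCENE 3\rCLRCUE" + b"x" * 5000 + b"\rSCENE 4\r"
print(port_filter.feed(flood))  # [b'SCENE 3', b'CLRCUE', b'SCENE 4']
```

The point of the design is that memory use stays flat no matter how much data the button press produces: while discarding, the filter never holds more than one terminator's worth of bytes.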

The lesson was learned, though. RS232 ports need to be kept happy to keep the processor happy. Many aspects of the connection are set haphazardly in the field, and any of them can disrupt the control system. The baud rate might be dropped because of a long cable run (runs of more than 50 feet may only work at 9600 baud). The control system could poll a device several times a second for status updates that may contain a lot of information. Commands can get backed up if they are not sequenced properly within the programming.
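A quick back-of-the-envelope check shows how baud rate and polling interact. The sketch below assumes the common 8N1 framing (one start bit, eight data bits, one stop bit, so ten bits on the wire per byte). The poll rate and response size are illustrative numbers, not measurements from this job, and the function names are made up for the example.

```python
def link_utilization(baud: int, polls_per_second: float,
                     bytes_per_poll: int, bits_per_byte: int = 10) -> float:
    """Fraction of the serial link consumed by polling traffic.
    A value above 1.0 means data arrives faster than the link can carry it."""
    capacity_bytes_per_second = baud / bits_per_byte
    demand_bytes_per_second = polls_per_second * bytes_per_poll
    return demand_bytes_per_second / capacity_bytes_per_second


def burst_seconds(baud: int, burst_bytes: int, bits_per_byte: int = 10) -> float:
    """Time the link needs to carry a single burst of data."""
    return burst_bytes / (baud / bits_per_byte)


# Polling 5 times a second for a 300-byte status response at 9600 baud:
print(link_utilization(9600, 5, 300))    # 1.5625, more than the link can carry
# The same polling at 115200 baud uses roughly 13% of the link:
print(link_utilization(115200, 5, 300))
# Seconds needed to move a 5000-byte burst at 9600 baud:
print(burst_seconds(9600, 5000))
```

Whenever the first number is at or above 1.0, responses pile up somewhere, and on a small control processor that somewhere is usually the port buffer.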

All these things would probably be managed by an experienced programmer familiar with the equipment. But what about an inexperienced programmer, or some brand new kid? To add to the problem, these are the types of issues that are only uncovered after the system has been working fine for a few weeks, and then someone decides to go and press the Clear Cue button for some reason. These are also the types of issues that lead to long service calls, time spent trying to reproduce the problem, and extra, unnecessary equipment being purchased. It is very important to manage control system communication protocols that may be susceptible to overload. Overload problems are difficult to uncover, and they have the ability to sink the entire ship.

“It was the devious-cruising Rachel, that in her retracing search after her missing children [or pre-fader cues, as the case may be], only found another orphan.” –Herman Melville, Moby Dick
