We are having a strange network issue on one of our systems in the field. Every 20 minutes or so we get a brief dropout to a serial-to-Ethernet device that we regularly poll for data (about every 25 milliseconds). Our systems have several of these serial-to-Ethernet devices, but the dropout only happens on one of them. We’ve already replaced the device with a spare, but the problem persists.
I got a tcpdump capture of when it happened, and at the highlighted point you can see the traffic stop for a full second. We poll for data, so the 1-second dropout means we stopped polling for that second.
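For anyone who wants to find these gaps in a long capture without scrolling through it, here is a minimal sketch in Python. It assumes the packet timestamps have already been exported as floating-point seconds (for example from tcpdump -tt output). The function name find_gaps and the 0.5-second threshold are my own illustrative choices, not anything from our system.

```python
def find_gaps(timestamps, threshold_s=0.5):
    """Return (previous, current, delta) for each pair of consecutive
    packet timestamps that are more than threshold_s seconds apart.

    timestamps: packet times in seconds, in capture order (assumed input).
    threshold_s: hypothetical cutoff; with a ~25 ms poll interval,
                 anything over half a second is clearly a dropout.
    """
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        delta = curr - prev
        if delta > threshold_s:
            gaps.append((prev, curr, delta))
    return gaps
```

Feeding it only the timestamps of traffic to and from the affected device should list each dropout with its start time, end time, and length.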
When I removed the filter to see why it started working again, I noticed this: there is an ARP request from the QNX PC looking for the .150 node. Once it finds it, everything starts working again.
Googling around, I found evidence that QNX ARP table entries go stale roughly every 20 minutes, which matches how often our problem happens. But we are getting plenty of traffic from the other side, so the entry should never go stale.
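One way to test the 20-minute theory against more than one capture is to check the spacing between dropouts. The sketch below is Python and assumes a list of dropout start times in seconds. The 1200-second period is just the "roughly 20 minutes" figure above, and the 60-second tolerance and the function names are hypothetical.

```python
def dropout_spacing(dropout_times):
    """Seconds between consecutive dropouts, given their start times."""
    return [b - a for a, b in zip(dropout_times, dropout_times[1:])]


def matches_period(spacings, period_s=1200.0, tolerance_s=60.0):
    """True if there is at least one spacing and every spacing is
    within tolerance_s of period_s (both values are assumptions)."""
    return bool(spacings) and all(
        abs(s - period_s) <= tolerance_s for s in spacings
    )
```

If the spacings cluster tightly around 1200 seconds, that points toward a timer such as ARP cache expiry rather than a random cable or switch fault.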
I had them run nicinfo, and everything looks fine (no errors of any kind that would indicate a bad network card or faulty cable, at least on the QNX PC).
I’m out of ideas as to why, on just this system and to just one device, we’d get a 1-second hang every 20 minutes that looks like a stale ARP cache issue. Has anyone seen anything like this, or does anyone have suggestions for diagnosing what’s going on? Our next steps are to replace the network switch or the cable to the device that has the problem.
Tim

