Standby node "locks up" after around 5 hours operation

Hello Guys,

Over the past 3 days we have had 4 “lock-ups” of the “Standby” node.

For the first two days we had Node 1 “Active” and Node 2 as “Standby” most of the time - since the communications lines were only connected to Node 1.

During this time, on 3 occasions RealFlex reported that the “Standby” had failed. Node 2 could not be accessed via the network (e.g. ping) and the console had “locked-up” (i.e. either the screen was black or frozen with the mouse and keyboard not having any effect). The server had to be powered off and on again.

We thought this might be a hardware issue with Node 2. To confirm this, we swapped the communications lines over to Node 2 and today ran Node 1 as “Standby” and Node 2 as “Active”. After about 5 hours Node 1 locked up in exactly the same manner as Node 2 had.

It looks like there is an issue with the Standby node locking-up after running for a number of hours.

We have not had this issue on our in-house test system running the same software and database. However, our test system does not have the same system load (i.e. we don’t have actual RTUs connected etc.)

Setup: Dell R200 server, QNX 6.3 and RealFlex 6.4.76

Any suggestions on how to debug this issue?

Are you using an Adaptec SCSI controller on the Dell Server, and is it multi-processor?

No, we are not using an Adaptec SCSI controller on the Dell server.
Yes, it is multi-processor (2 processors).

Now we have upgraded to QNX 6.5/RealFlex 6.5.XX.

Kind Regards,
Naresh

Please let us know if upgrading to QNX 6.5 solves the problem. :)
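
If the lock-ups come back, it may help to pin down exactly when the Standby stops answering pings, so the time can be compared against uptime and RealFlex events. Below is a minimal sketch of a reachability logger, assuming it runs on a third machine with Python 3 and a `ping` command available. The function names, the log format, and the 10-second polling interval are illustrative choices, not anything provided by QNX or RealFlex.

```python
import datetime as dt
import platform
import subprocess
import time


def ping_once(host, timeout_s=2):
    """Return True if a single ICMP echo request to host succeeds."""
    # The packet-count flag is -n on Windows and -c on Unix-like systems.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # No reply within the timeout, or ping could not be started.
        return False
    return result.returncode == 0


def first_sustained_failure(samples, min_failures=3):
    """Find where the first run of consecutive failed probes begins.

    samples is a list of (timestamp, reachable) pairs in time order.
    Returns the timestamp of the first probe in the first run of at
    least min_failures failures, or None if there is no such run.
    """
    run_start = None
    run_len = 0
    for ts, ok in samples:
        if ok:
            run_start, run_len = None, 0
            continue
        if run_len == 0:
            run_start = ts
        run_len += 1
        if run_len >= min_failures:
            return run_start
    return None


def monitor(host, log_path, interval_s=10):
    """Poll host forever, appending one timestamped line per probe."""
    with open(log_path, "a") as log:
        while True:
            now = dt.datetime.now().isoformat(timespec="seconds")
            state = "up" if ping_once(host) else "DOWN"
            log.write(f"{now} {state}\n")
            log.flush()  # keep the log current while the loop is running
            time.sleep(interval_s)
```

To use it, call something like `monitor("node2", "standby_ping.log")`, where `node2` is a placeholder for the Standby's hostname or IP address. Requiring several consecutive failures in `first_sustained_failure` keeps a single dropped ping from being mistaken for a lock-up.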