Process Monitor for monitoring the threads of a process

I have a requirement to develop a process monitor module
that starts, stops and continually monitors certain processes (and
their threads) on an embedded QNX system.
The idea is that if a thread is detected to have exited,
hung, deadlocked or started consuming excessive CPU, this monitor would kill and restart the parent process and all
its threads.

However, most threads are created to run in a blocking type
scenario, and there is no guarantee that the events they block on
would occur often enough to determine if the thread was in trouble or
merely legitimately blocked for a long time.
Any ideas on how to go about this requirement?

You should look at HAM (High Availability Manager). It will detect what you are describing if set up correctly.

Is it possible to detect even hung or CPU-consuming processes via HAM?
I thought that HAM only detects the death of a process…

HAM can be configured to require a heartbeat, so detecting a hung process is relatively straightforward. As for high CPU utilization, I’m not sure how to implement it, but QNX’s documentation says:

“Also, by studying the source code, it is possible to add the capability of detecting other conditions into the HAM (e.g. low memory, high CPU utilization, low disk space, etc.) to suit your HA application.” … mover.html

Hi. I have a low-end system with a 500 MHz CPU and 256 MB RAM.
There can be around 50 apps running on this system, excluding drivers and other system utilities.
I need to monitor all 50 of these apps. Can I register each of these 50 processes with HAM
without affecting the performance of the system?
Apart from HAM, is there anything I can do in code to monitor them?

Every process affects the performance of a system. The real question is whether it will affect the system so negatively that it will not work. The answer is simple: if you have sufficient resources left in the system, such as CPU and memory, it will work. If you don’t, it will not. To get an idea, you could look at the percentage of CPU being used by Idle and how much free memory is in the system. To know for sure, you will have to test.

The alternative to using HAM is to write your own monitor. Will this use fewer resources than HAM? Possibly, but unlikely.

HAM requires very few resources. Detecting process death is event-based, so it doesn’t consume CPU. Sending the heartbeats consumes very little CPU.
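
To illustrate the heartbeat idea for threads that may legitimately block for a long time, here is a rough sketch in plain POSIX C. It is not the HAM API; `heartbeat_t`, `hb_init`, `hb_beat` and `hb_is_stale` are hypothetical names made up for this example. The assumption is that each worker uses a timed variant of its blocking call, so it wakes up periodically even when no event arrives, and stamps its heartbeat on every wake.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-thread heartbeat record (not part of HAM). */
typedef struct {
    pthread_mutex_t lock;
    struct timespec last_beat;   /* CLOCK_MONOTONIC time of the last beat */
} heartbeat_t;

/* Initialise the record and count "now" as the first beat. */
static void hb_init(heartbeat_t *hb)
{
    pthread_mutex_init(&hb->lock, NULL);
    clock_gettime(CLOCK_MONOTONIC, &hb->last_beat);
}

/* Worker side: call on every wake-up, including wake-ups that were
 * only a timeout with no real event to process. */
static void hb_beat(heartbeat_t *hb)
{
    pthread_mutex_lock(&hb->lock);
    clock_gettime(CLOCK_MONOTONIC, &hb->last_beat);
    pthread_mutex_unlock(&hb->lock);
}

/* Monitor side: true if the worker has been silent for longer than
 * max_silence_ms milliseconds. */
static bool hb_is_stale(heartbeat_t *hb, long max_silence_ms)
{
    struct timespec now, last;

    assert(max_silence_ms >= 0);
    clock_gettime(CLOCK_MONOTONIC, &now);

    pthread_mutex_lock(&hb->lock);
    last = hb->last_beat;
    pthread_mutex_unlock(&hb->lock);

    long elapsed_ms = (long)(now.tv_sec - last.tv_sec) * 1000L
                    + (now.tv_nsec - last.tv_nsec) / 1000000L;
    return elapsed_ms > max_silence_ms;
}
```

A worker would then block with something like `pthread_cond_timedwait()` (or whatever timed call matches its blocking primitive), call `hb_beat()` after each return, and go back to waiting. A thread that is legitimately idle keeps beating, while one that is deadlocked or spinning elsewhere stops, and the monitor sees `hb_is_stale()` return true. The timeout and the staleness limit are values you would have to choose per thread.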

If a process consumes too much CPU, others will not be able to send their heartbeats and you can react. How to react is another matter, and there is no easy answer. You would need to log CPU usage on a regular basis to find out who is using “too much” CPU, and you have to define how much that is. Adaptive Partitioning is probably a better approach for this type of problem.
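
For the "log CPU usage on a regular basis" part, one possible sketch is to sample a POSIX CPU-time clock alongside a monotonic wall clock and divide the two deltas. `cpu_sample_t`, `cpu_sample_take` and `cpu_utilization` are hypothetical names for this example, and what counts as "too much" is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* One observation: wall-clock time and CPU time consumed so far. */
typedef struct {
    struct timespec wall;
    struct timespec cpu;
} cpu_sample_t;

static double ts_to_sec(struct timespec t)
{
    return (double)t.tv_sec + (double)t.tv_nsec / 1e9;
}

/* Read both clocks. cpu_clock is a CPU-time clock id, for example
 * CLOCK_PROCESS_CPUTIME_ID for the calling process. */
static void cpu_sample_take(clockid_t cpu_clock, cpu_sample_t *s)
{
    clock_gettime(CLOCK_MONOTONIC, &s->wall);
    clock_gettime(cpu_clock, &s->cpu);
}

/* Fraction of one CPU used between two samples (0.25 means 25%).
 * Returns 0.0 if no wall-clock time has elapsed. */
static double cpu_utilization(const cpu_sample_t *prev,
                              const cpu_sample_t *cur)
{
    double wall = ts_to_sec(cur->wall) - ts_to_sec(prev->wall);
    double cpu  = ts_to_sec(cur->cpu)  - ts_to_sec(prev->cpu);

    assert(wall >= 0.0);
    if (wall <= 0.0)
        return 0.0;
    return cpu / wall;
}
```

A monitor would take a sample for each watched process every few seconds, compare it with the previous one, and log or act when the result stays above its chosen limit for several intervals in a row. To watch another process rather than the caller, POSIX provides `clock_getcpuclockid()` to obtain the clock id for a given pid; whether that is permitted for arbitrary processes on your target is something to verify.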