PtTimer stops working?

Hi Folks-

Has anyone seen a case where a PtTimer widget stopped
working after a long period of continuous operation (like
months)? Apparently one of our machines stopped refreshing
certain fields on its screen after running correctly for a
significant amount of time. These fields are all driven by
the same PtTimer callback. (I say “apparently” because
the machine got rebooted before I could actually check it
out in person.) :( Other than that one bug, our GUI
program had continued to run correctly.

FWIW, all of the program’s IPC is handled in an input
function. The program also does some file I/O (using both
file descriptors and file pointers), but doesn’t use a
file-descriptor function. It doesn’t use select(), nor
does it use pipes.

Here’s the output from “sin ver”:

PROGRAM                  NAME          VERSION  DATE
sys/Proc32               Proc          4.25I    Nov 25 1998
sys/Proc32               Slib16        4.23G    Oct 04 1996
sys/Slib32               Slib32        4.24B    Aug 12 1997
/bin/Fsys                Fsys32        4.24T    Feb 26 1999
/bin/Fsys                Floppy        4.24B    Aug 19 1997
/bin/Fsys                DOC2000_TFFS  4.24A    Aug 21 2000
//1/bin/Dev              Dev32         4.23G    Oct 04 1996
//1/bin/Dev.con          Dev32.ansi    4.23H    Nov 21 1996
//1/bin/Dev.ser          Dev32.ser     4.23X    Apr 26 2001
//1/bin/Dev.par          Dev.par       4.26     Feb 24 2000
//1/bin/Mqueue           mqueue        4.24A    Aug 30 1999
//1/*/photon/bin/Photon  Photon        1.13D    Sep 03 1998
//1/*/bin/phfontphf      Photon Font   1.13A    Jul 07 1998


TIA,

- Pete


+----- Pete DiMarco ------+---------------------------------------+
| Staff Software Engineer | Web:   www.ifspurity.com              |
| Integrated Flow Systems | Email: peted [At] ifspurity [Dot] com |
+-------------------------+---------------------------------------+
<< Opinions expressed here are my own, not those of my employer. >>

Pete DiMarco wrote:

> Hi Folks-
>
> Has anyone seen a case where a PtTimer widget stopped
> working after a long period of continuous operation (like
> months)? Apparently one of our machines stopped refreshing
> certain fields on its screen after running correctly for a
> significant amount of time. These fields are all driven by
> the same PtTimer callback. (I say “apparently” because
> the machine got rebooted before I could actually check it
> out in person.) :( Other than that one bug, our GUI
> program had continued to run correctly.

It’s unfortunate you couldn’t get to it personally to check it out.
Next time, you should extract a ‘sin rt’ (along with other sin outputs
such as proxy, info, irq etc).

Without much else, it’s a little premature to guess at what happened.
That said, I’ve fixed a few timer-related issues in QNX4 (the fixes are in
4.25 patch G). Under certain load conditions, it was possible for timers
to go negative and never fire from that point on; sin rt would show that.

Let us know if you manage to extract more info, if it happens again.


Cheers,
Adam

QNX Software Systems Ltd.
[ amallory@qnx.com ]

With a PC, I always felt limited by the software available.
On Unix, I am limited only by my knowledge.
–Peter J. Schoenster <pschon@baste.magibox.net>

I ran into this problem a few days ago (the first time I have seen it). My
Photon application (vizohr), after 7 months of trouble-free running,
suddenly stopped refreshing itself (the PtTimer stopped activating its
callback). I was lucky: I had enough time to look for the problem
(fortunately(?) not in my application :) ). The application still performed
callbacks based on user input (from a PtButton, for example); only the
PtTimers stopped activating their callbacks. When I restarted the
application (not the node), everything was OK. I saved this information:

Info:

QNX 4.25+patchE
Photon 1.14+patchC

sin in

Node  CPU        Machine  Speed  Memory     Ticksize  Display    Flags
11    1586/1587  PCI      48221  106M/129M  10.0ms    VGA Color  -3P+----------8P

Heapp  Heapf  Heapl  Heapn  Hands  Names  Sessions  Procs  Timers  Nodes  Virtual
0      0      22488  0      64     100    64        500    125     99     43M/142M

sin

Boot from Hard at Dec 16 15:07    Locators: 9 10 11 14 13 17
SID PID   PROGRAM                  PRI STATE BLK CODE  DATA
--  --    Microkernel              --- ----- --- 10524 0
0   1     sys/Proc32               30f READY --- 118k  921k
0   2     sys/Slib32               10r RECV  0   53k   4096
0   4     /bin/Fsys                10r RECV  0   77k   18636k
0   5     /bin/Fsys.eide           22r RECV  0   61k   114k
0   8     idle                     0r  READY --- 0     40k
0   16    //11/bin/Dev32           24f RECV  0   32k   94k
0   19    //11/bin/Dev32.ansi      20r RECV  0   40k   135k
0   21    //11/bin/Dev32.ser       20r RECV  0   16k   24k
0   22    //11/bin/Dev32.par       9o  RECV  0   8192  16k
0   23    //11/bin/Dev32.pty       20r RECV  0   12k   32k
0   28    //11/bin/Mouse           12o RECV  0   16k   20k
0   31    //11/bin/Input           12o RECV  0   16k   28k
0   33    //11/bin/Input           10o RECV  0   16k   28k
0   36    //11/bin/Fsys.floppy     10o RECV  0   20k   40k
0   38    //11/bin/Pipe            10r RECV  0   16k   32k
0   42    //11/bin/nameloc         20o RECV  0   6144  20k
0   44    //11/bin/nameloc         20o REPLY 0   6144  24k
0   47    //11/bin/Net             23r RECV  0   32k   81k
0   60    //11/bin/Net.rtl         20r RECV  0   40k   118k
0   67    //11/bin/dumper          29o RECV  0   16k   24k
1   70    //11/*/bin/Photon        12r RECV  0   57k   86k
2   74    //11/*/bin/phfontpfr     12r RECV  0   126k  385k
0   76    //11/*/drivers/Null.ms   10o RECV  0   12k   20k
0   78    //11/*/Pg.rage128        12r REPLY 70  176k  192k
0   83    //11/bin/Input           12o RECV  0   16k   28k
0   87    //11/bin/Input           10o RECV  0   16k   28k
0   92    //11/bin/tinit           10o WAIT  -1  16k   28k
3   106   //11/*/photon/bin/pwm    10o RECV  0   94k   147k
4   116   //11/bin/login           10o REPLY 16  24k   20k
3   6873  //11/*/photon/bin/pterm  10o RECV  0   32k   110k
5   6876  //11/bin/ksh             10o WAIT  -1  31k   36k
5   6889  //11/bin/ksh             10o WAIT  -1  31k   45k
5   13905 //11/*/bin/vizohr        10o REPLY 70  110k  1187k
6   30099 //11/bin/ksh             10o WAIT  -1  31k   36k
6   30522 //11/bin/sin             10o REPLY 1   45k   49k
3   30608 //11/*/photon/bin/pterm  10o RECV  0   32k   110k

sin ti

SID PID   PROGRAM                  PRI START TIME   UTIME    STIME   CUTIME  CSTIME
0   1     sys/Proc32               30f Dec 16 15:07 3732     1346    561     14869
0   2     sys/Slib32               10r --- -- --:-- 0.000    0.000   0.000   0.000
0   4     /bin/Fsys                10r --- -- --:-- 8.859    3.050   0.000   0.000
0   5     /bin/Fsys.eide           22r --- -- --:-- 134      0.280   0.000   0.000
0   8     idle                     0r  --- -- --:-- 4294967  463     0.000   0.000
0   16    //11/bin/Dev32           24f Dec 16 15:07 9.859    3.220   0.000   0.000
0   19    //11/bin/Dev32.ansi      20r Dec 16 15:07 0.000    0.000   0.000   0.000
0   21    //11/bin/Dev32.ser       20r Dec 16 15:07 0.000    0.000   0.000   0.000
0   22    //11/bin/Dev32.par       9o  Dec 16 15:07 0.000    0.000   0.000   0.000
0   23    //11/bin/Dev32.pty       20r Dec 16 15:07 0.140    0.010   0.000   0.000
0   28    //11/bin/Mouse           12o Dec 16 15:07 0.000    0.000   0.000   0.000
0   31    //11/bin/Input           12o Dec 16 15:07 5.850    1.900   0.000   0.000
0   33    //11/bin/Input           10o Dec 16 15:07 0.000    0.000   0.000   0.000
0   36    //11/bin/Fsys.floppy     10o Dec 16 15:07 0.010    0.000   0.000   0.000
0   38    //11/bin/Pipe            10r Dec 16 15:07 0.000    0.000   0.000   0.000
0   42    //11/bin/nameloc         20o Dec 16 15:07 305      285     0.000   0.000
0   44    //11/bin/nameloc         20o Dec 16 15:07 20.319   18.429  0.000   0.000
0   47    //11/bin/Net             23r Dec 16 15:07 26262    1546    0.000   0.000
0   60    //11/bin/Net.rtl         20r Dec 16 15:07 0.020    0.000   0.000   0.000
0   67    //11/bin/dumper          29o Dec 16 15:07 0.000    0.020   0.000   0.000
1   70    //11/*/bin/Photon        10r Dec 16 15:07 250      123     0.010   0.000
2   74    //11/*/bin/phfontpfr     12r Dec 16 15:07 1398     261     0.000   0.000
0   76    //11/*/drivers/Null.ms   10o Dec 16 15:07 0.000    0.000   0.000   0.000
0   78    //11/*/Pg.rage128        12r Dec 16 15:07 5213     45.927  0.000   0.000
0   83    //11/bin/Input           12o Dec 16 15:07 5.590    1.870   0.000   0.000
0   87    //11/bin/Input           10o Dec 16 15:07 5.240    1.930   0.000   0.000
0   92    //11/bin/tinit           10o Dec 16 15:07 0.000    0.000   0.000   0.000
3   106   //11/*/photon/bin/pwm    10o Dec 16 15:07 6.380    2.870   2.720   28.710
4   116   //11/bin/login           10o Dec 16 15:07 0.000    0.000   0.000   0.000
3   6873  //11/*/photon/bin/pterm  10o Feb 02 23:11 0.040    0.000   0.000   0.000
5   6876  //11/bin/ksh             10o Feb 02 23:11 0.000    0.000   0.000   0.000
5   6889  //11/bin/ksh             10o Feb 02 23:11 0.020    0.000   0.240   0.000
5   13905 //11/*/bin/vizohr        10o Feb 02 23:21 2689     196     0.000   0.000
6   21471 //11/bin/sin             10o Jul 02 15:22 0.000    0.000   0.000   0.000
6   30099 //11/bin/ksh             10o Jul 02 15:21 0.000    0.000   0.010   0.000
3   30608 //11/*/photon/bin/pterm  10o Jul 02 15:21 0.010    0.000   0.000   0.000

date

Fri Jul 02 15:39:05 cest 2004

I didn’t save the sin rt output, but I didn’t see any “vizohr” entries in
the “sin rt” or “sin pr” output.

Is this problem solved in patch G? Where can I find information about
patch G? Since the QNX web site shake-up, this is all there is:

  • These patches contain Patch G update to the QNX RTOS v4.25 operating
    system
  • Filename: qnx-4.25-01G.tarx
  • Check Sum: 1965590861 4017156
  • Size: 3.83 Mb

Pretty brief…

PS: Any hope of a “normal” .tar.F patch G for my nodes without a CD-ROM? :)
A couple of them don’t even have Photon :(

Martin Michalek wrote:

> I ran into this problem a few days ago (the first time I have seen it). My
> Photon application (vizohr), after 7 months of trouble-free running,
> suddenly stopped refreshing itself (the PtTimer stopped activating its
> callback). I was lucky: I had enough time to look for the problem
> (fortunately(?) not in my application :) ). The application still performed
> callbacks based on user input (from a PtButton, for example); only the
> PtTimers stopped activating their callbacks. When I restarted the
> application (not the node), everything was OK. I saved this information:

This doesn’t sound like the negative timer issue. With that problem, the
non-proxy-notified timers would stop firing forever, and restarting an
application wouldn’t “fix” it; only a reset of the machine would.

> I didn’t save the sin rt output, but I didn’t see any “vizohr” entries in
> the “sin rt” or “sin pr” output.

sin rt is still a valuable source of information for timers in Photon
and other processes. It’s a shame you didn’t collect the information.

> Is this problem solved in patch G? Where can I find information about
> patch G? Since the QNX web site shake-up, this is all there is:
>
>   • These patches contain Patch G update to the QNX RTOS v4.25 operating
>     system
>   • Filename: qnx-4.25-01G.tarx
>   • Check Sum: 1965590861 4017156
>   • Size: 3.83 Mb
>
> Pretty brief…
>
> PS: Any hope of a “normal” .tar.F patch G for my nodes without a CD-ROM? :)
> A couple of them don’t even have Photon :(

The negative timer issue is addressed in patch G. You’ll have to install
the whole patch on one development machine, and extract the elements you
wish to install on the target nodes manually.

Cheers,
Adam

QNX Software Systems Ltd.
[ amallory@qnx.com ]


“Adam Mallory” <amallory@qnx.com> wrote in message
news:ccedf6$k67$1@inn.qnx.com

> The negative timer issue is addressed in patch G. You’ll have to install
> the whole patch on one development machine, and extract the elements you
> wish to install on the target nodes manually.

For people out there: Patch G doesn’t resolve all timer-related problems.
If an application consumes all the timers, then even after it terminates
and returns the used-up timers to the OS, the OS becomes very unstable
(hangs, kernel crashes, etc.).

I was given a beta that fixed the problem, but got no information as to
when it would make it into the next patch.


Mario, do you have any more specific info on the beta? I had a problem
with the PtTimer on patch version B, and I have recently had the problem
again on patch G.

The problem is that after 5 months of running, the PtTimer that I put on
the start-up page decided to stop running. I have a local timer on each
individual page that seemed to restart when the page was refreshed, but
then would stop again a short time later. So none of the callbacks that
are supposed to run over the whole hierarchy are running. It appears that
the rest of the system is fine.

Do you know of any workarounds for the PtTimer? Or any other widgets
that could replace it?

Thanks

mpickell wrote:

> The problem is that after 5 months of running, the PtTimer that I put on
> the start-up page decided to stop running. I have a local timer on each
> individual page that seemed to restart when the page was refreshed, but
> then would stop again a short time later. So none of the callbacks that
> are supposed to run over the whole hierarchy are running. It appears that
> the rest of the system is fine.

Is the rest of the system really fine, or have all Photon timers stopped
working? For instance, is the clock on the pwm taskbar still running?
Are the cursors in your pterms still blinking, and are their window
titles still updating themselves correctly? If you click, wait a few
seconds or longer, and click again, is that interpreted as two single
clicks rather than a double click?

As far as I can tell (I did not think to check these) these things
were working. Every display has a PtTimer on it. These all seemed
to restart themselves when the user switched away from, and then back
to, a page (thus re-realizing the widget). The display application
itself seems to remain working, except that anything that depends on
the main PtTimer (the one on the start-up page that is used for the
entire hierarchy) does not update. Other parts of Photon and the OS
seem to be working fine.

When debugging the last occurrence of this (on 4.25B; this occurrence
was on 4.25G), we noticed that the display application seemed to go
into a RECV state and get stuck there. Whatever it was waiting for
never showed up. The application still allowed us to switch displays
and use controls, but nothing that was based on the main PtTimer
updated, and none of that widget’s callbacks were running anymore.
None of those callbacks would put the application in a RECV state.

Is there any problem with having the startup display be a display that
is also frequently navigated back to from all the other displays
instead of a special startup display that has no other purpose than
running the PtTimer? Is there the possibility of more than one
instance of the main PtTimer occurring and causing a problem?

Are there any tricks to using the PtTimer? Any way to monitor it and
force it to restart if it stops? I attempted to unrealize and
re-realize it, but that didn’t seem to work.

mpickell wrote:

> As far as I can tell (I did not think to check these) these things
> were working. Every display has a PtTimer on it. These all seemed
> to restart themselves when the user switched away from, and then back
> to, a page (thus re-realizing the widget). The display application
> itself seems to remain working, except that anything that depends on
> the main PtTimer (the one on the start-up page that is used for the
> entire hierarchy) does not update. Other parts of Photon and the OS
> seem to be working fine.

OK, then it’s not what I suspected it might be. That’s a good thing, in
a way.

> When debugging the last occurrence of this (on 4.25B; this occurrence
> was on 4.25G), we noticed that the display application seemed to go
> into a RECV state and get stuck there. Whatever it was waiting for
> never showed up. The application still allowed us to switch displays
> and use controls, but nothing that was based on the main PtTimer
> updated, and none of that widget’s callbacks were running anymore.
> None of those callbacks would put the application in a RECV state.

It’s normal for most Photon applications to spend most of the time
RECEIVE-blocked on Photon. That’s how they wait for Photon to let them
know when something happens that requires their attention. If your app
still reacts to events such as mouse or keyboard input, it means that
it’s not stuck.

> Is there any problem with having the startup display be a display that
> is also frequently navigated back to from all the other displays
> instead of a special startup display that has no other purpose than
> running the PtTimer? Is there the possibility of more than one
> instance of the main PtTimer occurring and causing a problem?

That may depend on what exactly you mean by “display” and “navigated”.
Is it a PhAB application? Are your “displays” separate window or dialog
modules that you “navigate” to and from by creating and destroying the
widgets? Or are they picture modules that you create and destroy inside
one window?

Normally, having multiple PtTimer widgets is not a problem. It’s not
recommended to have a lot of PtTimers set to very short intervals (under
20ms or so), because that causes Photon to re-arm its timer constantly,
which makes its timekeeping inaccurate. (It also increases the chance
of triggering the bug that, as it turns out, is not the problem in your
case.)

> Are there any tricks to using the PtTimer? Any way to monitor it and
> force it to restart if it stops? I attempted to unrealize and
> re-realize it, but that didn’t seem to work.

No tricks should be necessary; since a PtTimer has no battery and no
moving parts and is immune to rust, it should keep running forever
without stopping. :wink:

If it does stop nevertheless, the trick is to figure out what the cause
is, and remove it.

A couple of things that might help:

#1 Do you ever set Pt_ARG_TIMER_INITIAL or Pt_ARG_TIMER_REPEAT from your
code after the widget has been realized, or is the widget supposed to
keep running with the values it was initialized with (e.g. in PhAB)? If
you do, does it happen inside or outside of the timer’s callback?

#2 Does your timer callback do anything that might cause it to get stuck
in a “modal” event-processing loop? Could you put a printf() at the
very top of the callback function, and another one just before its
return, to confirm that you’re not getting stuck inside the callback?
(A PtTimer does not keep ticking while in the callback – the code
receives a timer event from Photon, then calls your function, and then
asks Photon for another timer event. The repeat delay does not start
until after your callback has returned. If your callback never returns,
the delay never starts. There may be exceptions to this rule if you
re-realize the widget or set its resources without returning from the
callback; but hopefully, you’re not doing that, except to see what
happens if you do…)
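
The printf() check suggested in #2 can be kept as a small bookkeeping helper. What follows is only a sketch in plain standard C, not Photon code: TimerTrace, trace_enter(), trace_return(), trace_inside_callback() and trace_is_stale() are hypothetical names invented for this illustration. The assumption is that the real timer callback calls trace_enter() as its first statement and trace_return() just before every return.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical bookkeeping for one timer callback (illustrative names only). */
typedef struct {
    unsigned long entered;   /* how many times the callback was entered */
    unsigned long returned;  /* how many times the callback returned */
    time_t last_entry;       /* wall-clock time of the most recent entry */
} TimerTrace;

/* Call as the first statement of the timer callback. */
void trace_enter(TimerTrace *t, time_t now)
{
    t->entered++;
    t->last_entry = now;
    fprintf(stderr, "timer cb: enter #%lu\n", t->entered);
}

/* Call immediately before every return of the timer callback. */
void trace_return(TimerTrace *t)
{
    assert(t->returned < t->entered);  /* a return must match an earlier enter */
    t->returned++;
    fprintf(stderr, "timer cb: return #%lu\n", t->returned);
}

/* Nonzero if the callback was entered but has not returned yet. */
int trace_inside_callback(const TimerTrace *t)
{
    return t->entered > t->returned;
}

/* Nonzero if the callback was last entered more than max_gap seconds ago. */
int trace_is_stale(const TimerTrace *t, time_t now, double max_gap)
{
    return difftime(now, t->last_entry) > max_gap;
}
```

Since button callbacks reportedly keep working when the timer stops, one of them could call trace_inside_callback() and trace_is_stale() and log the result. That would separate "stuck inside the callback" from "the timer event never arrived", which is the distinction #2 is after.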

“Wojtek Lerch” <Wojtek_L@yahoo.ca> wrote in message
news:d783q2$rf$1@inn.qnx.com

> It’s normal for most Photon applications to spend most of the time
> RECEIVE-blocked on Photon. That’s how they wait for Photon to let them
> know when something happens that requires their attention.

Correction: I shouldn’t have written “RECEIVE-blocked on Photon”; I was
thinking REPLY-blocked on Photon. But RECEIVE-blocked is still normal –
it’s the “on Photon” part that was a lie.

Photon applications never RECEIVE from Photon because Photon never sends to
them. They wait for events by being either REPLY-blocked on Photon or
RECEIVE-blocked on zero (i.e. waiting for a message from anybody who sends
it). The former is the default mode for applications that don’t need to
wait for anything other than Photon events – it’s an equivalent of a
blocking read(). The latter is the mode apps switch to when they need to
wait for other things besides Photon events – it’s an equivalent of
select(). In this mode, Photon uses a proxy to tell the application to do a
non-blocking read(). If your application is RECEIVE-blocked, it means that
it has called PtAppAddInput() or some other function that uses
PtAppAddInput() internally.

Sorry for the confusion.
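
The two waiting modes described in this correction can be read straight off the STATE and BLK columns of a "sin" listing. The sketch below is plain C with hypothetical names (WaitMode, classify_wait); it only encodes the rule as stated in the post and assumes you already know Photon's PID from the same listing.

```c
#include <string.h>

typedef enum {
    WAIT_PHOTON_EVENTS,  /* REPLY-blocked on Photon: the blocking read() style */
    WAIT_ANY_MESSAGE,    /* RECEIVE-blocked on 0: the select() style */
    WAIT_OTHER           /* anything else: not one of the two modes above */
} WaitMode;

/* state and blk are the STATE and BLK columns from "sin";
 * photon_pid is the PID of the Photon process in that listing. */
WaitMode classify_wait(const char *state, int blk, int photon_pid)
{
    if (strcmp(state, "REPLY") == 0 && blk == photon_pid)
        return WAIT_PHOTON_EVENTS;
    if (strcmp(state, "RECV") == 0 && blk == 0)
        return WAIT_ANY_MESSAGE;
    return WAIT_OTHER;
}
```

In the "sin" output posted earlier in the thread, vizohr shows REPLY 70 and Photon is PID 70, so classify_wait("REPLY", 70, 70) gives WAIT_PHOTON_EVENTS. By this rule vizohr was waiting for Photon events in the default mode, which by itself says nothing about why its timers stopped.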

I have this problem!!

“Is the rest of the system really fine, or have all Photon timers stopped
working? For instance, is the clock on the pwm taskbar still running?
Are the cursors in your pterms still blinking, and are their window
titles still updating themselves correctly? If you click, wait a few
seconds or longer, and click again, is that interpreted as two single
clicks rather than a double click?”

Yes, the clock in the taskbar isn’t running.
Here is the sin rt output (look at the Photon proxy):

ID  PID   PROGRAM                  ACTION       TRIGGER      REPEAT
0   4     /bin/Fsys                proxy 11     0.079        0.500
1   36    //81/bin/Net.epic        proxy 37     1.588        3.000
2   29516 //81/*/photon/bin/pterm  sleep        -.---        -.---
3   38    //81/bin/Net.ct100tx     proxy 40     1.668        3.000
4   5     /bin/Fsys.eide           proxy 13     0.139        2.000
5   41    //81/bin/Net.ct100tx     proxy 42     1.718        3.000
6   85    //81/bin/tinit           sleep        -.---        -.---
7   57    //81/*/usr/ucb/Socket    proxy 69     0.060        0.000
10  86    //81/bin/cron            signal 14    21075.092    0.000
13  85    //81/bin/tinit           signal 14    33.376       60.000
18  274   //81/*/bin/Photon        proxy 288    4284463.450  0.000
19  455   //81/bin/Input           sleep        -.---        -.---
22  459   //81/bin/Input           proxy 461    -.---        -.---
23  463   //81/*/photon/bin/pwm    sleep        -.---        -.---
24  82    //81/*/usr/ucb/inetd     sleep        -.---        -.---
25  996   //81/*/photon/bin/pfm    sleep        -.---        -.---
29  9539  //81/*/bin/phrelay       proxy 32073  0.796        1.000
32  21807 //81/*/photon/bin/pterm  sleep        -.---        -.---
43  27493 //81/*/bin/phrelay       proxy 27498  0.445        1.000
44  27540 //81/*/bin/Photon        proxy 27551  0.010        0.000
45  27624 //81/*/photon/bin/pwm    sleep        -.---        -.---
46  27679 //81/*/photon/bin/pdm    sleep        -.---        -.---
49  27716 //81/*/photon/bin/pdmd   sleep        3.690        0.000
50  6098  //81/*/photon/bin/pterm  sleep        -.---        -.---
51  11941 //81/*/photon/bin/pterm  sleep        -.---        -.---
52  26036 //81/bin/sleep           signal 14    0.470        0.000
Somebody help please!!
Thanks
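
One possible reading of the Photon proxy line above: its TRIGGER of 4284463.450 seconds is just under 2^32 milliseconds (4294967.296 seconds). That would fit the "timers went negative" symptom mentioned earlier in the thread, if the value were a negative 32-bit millisecond count printed as unsigned. Nothing in the thread confirms how sin rt stores or prints the field, so treat this purely as an assumption. The helper name below is hypothetical; the arithmetic is plain C:

```c
#include <stdint.h>

/* Reinterpret an unsigned 32-bit millisecond count as a signed value.
 * ASSUMPTION: the trigger is a 32-bit millisecond counter that wrapped
 * below zero; the thread does not confirm this. */
int64_t trigger_as_signed_ms(uint32_t trigger_ms)
{
    int64_t ms = trigger_ms;
    if (ms >= ((int64_t)1 << 31))  /* top bit set: negative in two's complement */
        ms -= ((int64_t)1 << 32);
    return ms;
}
```

Under that assumption, 4284463450 ms maps to -10503846 ms, i.e. a timer that is about 2.9 hours overdue rather than one due in 49.6 days. Small values such as the 79 ms on the Fsys line come back unchanged.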

Did anyone ever figure out the cause of this problem? Is there a
fix/workaround other than periodic rebooting?

Thanks,
ame