Fsys trouble

We extended our system, which is based on QNX 4.24, four months ago. Since
then we have had sporadic hangups (on up to 20% of 4000 devices). The fault
was difficult to reproduce. Yesterday I discovered that some of our
processes are REPLY-blocked on Fsys (4.24K).


Fsys options:
$ Fsys -c 0 -r 8000



  PID  PROGRAM           PRI  STATE  BLK
    1  Proc32             30  READY    0
    2  Slib32             10  RECV     0
    3  Fsys               22  RECV     0
    4  Fsys.diskonchip    10  RECV     0
    6  Dev32              24  RECV     0
    7  Dev32.ser          20  RECV     0
    9  idle                0  READY    0
   17  startup            10  RECV     0
   26  Dev.ansi_1         20  RECV     0
   28  Dev.pty_1          20  RECV     0
   30  nameloc_1          20  RECV     0
   31  nameloc_1          20  REPLY    0
   33  Audio_1            10  RECV     0
   37  Fsys.floppy_1      10  RECV     0
   40  Dosfsys_1          10  RECV     0
   42  Photon_2           17  RECV     0
   44  phfontpfr_2        12  RECV     0
   45  Hydra.ms_2         10  RECV     0
   47  Pg.flat_1          12  REPLY   42
   48  Hydra.ms_2         10  RECV     0
   49  Pg.flat_1          12  REPLY   42
   50  Input_2            14  HELD     0
   54  Input_2            12  RECV     0
24709  zubehoer_mail      10  REPLY    3
12978  sema_2             10  RECV     0
12979  sram_4             10  RECV     0
12980  logger_8           10  RECV     0
12986  lsp_serv_2         12  RECV     0
 3260  scan2srv_2         10  RECV     0
12992  scanner_9          10  RECV     0
12994  ocr_6              10  REPLY    0
12995  sound_2            10  RECV     0
18629  iom_12             10  RECV     0
 3278  assist_12          10  REPLY    3
 3289  bipar_11           10  RECV     0
 3292  drucker_10         10  RECV     0
 3295  barcode_7          10  RECV     0
 3299  werbeinfo_8        10  REPLY    3
 3301  benutzer_11        10  RECV     0
 3302  watchdog_1         10  REPLY    1
 3303  nibble             10  REPLY    0
 3305  komm_15            10  RECV     0
 2801  x28_10             12  RECV     0
 3317  zentralkomm_11     10  RECV     0
 8958  wettschein         10  REPLY    3
13065  sportwettschein    10  REPLY    3
13068  instant_12         10  REPLY    3
13071  secmain_13         10  RECV     0
12583  zubehoer_11        10  REPLY    3
12603  kunden_10          10  REPLY    3
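
For anyone triaging listings like this across many devices, here is a minimal sketch (not part of the original post) that groups REPLY-blocked processes by the PID in the BLK column. It assumes the listing has been captured as plain text with the five whitespace-separated columns shown above. The names `SAMPLE` and `reply_blocked` are hypothetical, and the sample holds only a few rows copied from the listing.

```python
from collections import defaultdict

# A few rows copied from the listing above; a real capture would be longer.
SAMPLE = """\
PID PROGRAM PRI STATE BLK
3 Fsys 22 RECV 0
47 Pg.flat_1 12 REPLY 42
24709 zubehoer_mail 10 REPLY 3
3278 assist_12 10 REPLY 3
3302 watchdog_1 10 REPLY 1
"""


def reply_blocked(listing):
    """Map each BLK PID to the programs that are REPLY-blocked on it."""
    blocked = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a 5-column process row.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        pid, program, pri, state, blk = fields
        if state == "REPLY":
            blocked[int(blk)].append(program)
    return dict(blocked)


if __name__ == "__main__":
    for server, clients in sorted(reply_blocked(SAMPLE).items()):
        print(server, len(clients), " ".join(clients))
```

Run against the full listing, this would report the eight application processes that are REPLY-blocked on PID 3 (Fsys), while Fsys itself sits in RECV.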

What is happening here?
What can I do to get to the root of the problem?

Any information is useful.


Gerald Aichwalder

Hi Gerald,

Can you upgrade to the latest version of the OS (4.25E is the latest
release) and see whether the problem still exists?

E.


In principle it is possible to upgrade, but we would have to upgrade more
than 4000 devices.
I think that if we upgrade, we would also have to recompile and reinstall
our whole application.
I would prefer to get a new Fsys for QNX 4.24 (and change only .boot) or a
workaround for this problem.
As far as I remember, an Fsys heavy-load problem was reported on Quics some
years ago, but I cannot find any hints about it.

I think the problem is caused by several read and write accesses to the
DiskOnChip and the ramdisk from different tasks.

Gerald




“Gerald Aichwalder” <gerald.aichwalder@lottery.co.at> wrote in message
news:9pvqu3$4uc$1@inn.qnx.com

> In principle it is possible to upgrade, but we would have to upgrade more
> than 4000 devices.
>
> I think that if we upgrade, we would also have to recompile and reinstall
> our whole application.

No, you don’t have to recompile. But any change in Fsys will require a
new boot image anyway.

> I would prefer to get a new Fsys for QNX 4.24 (and change only .boot) or
> a workaround for this problem.

You might be able to take Fsys and family from QNX 4.25 and use them on
QNX 4.24.

> As far as I remember, an Fsys heavy-load problem was reported on Quics
> some years ago, but I cannot find any hints about it.
>
> I think the problem is caused by several read and write accesses to the
> DiskOnChip and the ramdisk from different tasks.

You can try using vdir, which is a different way to create a RAM disk.
It is included with QNX 4.25.

You might still want to try your software on 4.25 and see if the problem
goes away. If it does, you could try to update only the Fsys part.
If not, back to square one ;-(


Gerald,

At a minimum, get to Fsys 4.24S. 4.24K has some major internal race
conditions, which John Garvey fixed a couple of years back, that result in
exactly what you are seeing: processes blocked against Fsys.
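
With thousands of devices in the field, it may help to flag which ones still run an Fsys older than 4.24S. The following is only a sketch under stated assumptions: it assumes the version string has already been collected from each device in the form used in this thread (for example `4.24K`), and that within one numeric release a later letter means a later build. The function names are hypothetical.

```python
import re


def parse_fsys_version(text):
    """Split a version such as '4.24K' into (4, 24, 'K')."""
    match = re.fullmatch(r"(\d+)\.(\d+)([A-Z]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    major, minor, letter = match.groups()
    return int(major), int(minor), letter


def needs_update(installed, minimum="4.24S"):
    """True if 'installed' sorts before 'minimum'.

    Assumption: letters are ordered alphabetically within a release,
    and a version with no letter sorts before any lettered build.
    """
    return parse_fsys_version(installed) < parse_fsys_version(minimum)


if __name__ == "__main__":
    for version in ("4.24K", "4.24S", "4.24V", "4.25E"):
        print(version, needs_update(version))
```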

It is possible to run Fsys 4.25E on a Proc 4.24 but you must make sure
you get the related pieces - if memory serves it is Fsys, Fsys.* drivers,
dinit, and chkfsys.

Good luck,
Jay


“Jay Hogg” <Jay.Hogg@t-netix.com.r-e-m-o-v-e> wrote in message
news:9qcgv6$6ru$1@inn.qnx.com

> At the minimum get to Fsys 4.24S. 4.24K has some major race conditions
> internally that John Garvey fixed a couple years back that result in
> exactly what you are seeing - processes blocked against Fsys.

Like Jay said, you should really try upgrading to a newer Fsys (4.24V/W).
Race conditions (leading to deadlock) under heavy load were fixed in
open(), write(), rename(), and link(); plus some semantic fixes with
O_SYNC, fsync(), mount() and umount(); and some bugs fixed with
removable (compact flash/Sandisk) media.

> It is possible to run Fsys 4.25E on a Proc 4.24 but you must make sure
> you get the related pieces - if memory serves it is Fsys, Fsys.* drivers,
> dinit, and chkfsys.