corrupted file system issues

Subject Qnx 6.2 m.n.c.
Hardware: 450Mhz/500MB Ram. Qnx on 3.2 Gig master ide.

This is easy to reproduce. Screw around with Mozilla for a while in a memory
tight environment (it will totally hang qnx) or simply pull out the mains
plug.

Related to: httpd (apache), ssh and samba.

During boot, I can just about catch a glimpse of httpd reporting a
“corrupted file system error” which is good because then at least I know
where to start looking.
Qnx will boot ok in its graphical desktop. Networking is up. Voyager works
ok. No http, ssh or samba. The System information window shows none of the
processes (httpd, smbd, nmbd and sshd) are running.

Upon closer investigation, these are the reasons (thank God the File Manager
browser has this Inspect dialog, or else I would have had to do a fresh
install of everything and probably drop Qnx altogether):

opt/logs/error_log was corrupted. It could not be read by ped and the
inspector reported:
"can’t read ‘error_log’ (Corrupted file system)"
Size: 2,894

Deleting this file made httpd and ssh start successfully.

For samba more files had to be deleted:
/opt/var/log.nmbd
/opt/var/log.smbd
/opt/var/locks/nmbd.pid
/opt/var/locks/smbd.pid
/opt/var/locks/unexpected.tdb
/opt/var/locks/message.tdb
/opt/var/locks/locking.tdb
/opt/var/locks/connections.tdb
/opt/var/locks/brlock.tdb

They were all “corrupted”.
Now for the 64-thousand-dollar question:
“Is this the normal day-to-day life of a Qnx maintenance dude after a crash?”
Is there a tool to deal with this? If a starting process fails to open and
write to the system error_log, is it normal practice to abort?
I am not trying to start a religious debate here!!! I am merely trying to
get a grip on what it entails to run Qnx as a server os (primarily) and
desktop os (secondarily) compared to other alternatives. Qnx interests me
greatly, particularly from a developer’s point of view, or I wouldn’t be
spending the time. I am trying to figure out what the “entrance fee” will be
in the end, quite apart from what a Qnx professional OS license and a QNX
Momentics 6.2 Professional Edition license will cost.

Cheers,
Conrad Weyns
Stokke, Norway.

weyns@online.no sed in <b15ecc$aob$1@inn.qnx.com>:

> Now for the 64-thousand-dollar question:
> “Is this the normal day-to-day life of a Qnx maintenance dude after a crash?”
> Is there a tool to deal with this? If a starting process fails to open and
> write to the system error_log, is it normal practice to abort?

Pulling the plug while files are open will corrupt most filesystems;
this is not specific to QNX.
Use a journaling filesystem (e.g. ReiserFS) when being pull-plug-proof is mandatory.

> I am not trying to start a religious debate here!!!

I’d be very interested to hear what direction you were expecting here…

> I am merely trying to
> get a grip on what it entails to run Qnx as a server os (primarily) and
> desktop os (secondarily) compared to other alternatives.

IMHO QNX is for embedded applications, not for desktops.

Are Telco applications called “server”, QSSL?

(Hmm, what’s a “server” anyway?
Anything without a keyboard, mouse and display?)

kabe

<kabe@sra-tohoku.co.jp> wrote in message news:b17is8$o89$1@inn.qnx.com

> weyns@online.no sed in <b15ecc$aob$1@inn.qnx.com>:
>
>> Now for the 64-thousand-dollar question:
>> “Is this the normal day-to-day life of a Qnx maintenance dude after a crash?”
>> Is there a tool to deal with this? If a starting process fails to open and
>> write to the system error_log, is it normal practice to abort?
>
> Pulling the plug while files are open will corrupt most filesystems;
> this is not specific to QNX.

Pulling the plug was simply an easy way to reproduce the specific corrupted-files
issue. During the few days that I have screwed around with qnx, it has
hung several times, either after some action taken on the qnx box itself or
by, for instance, accessing it via samba from windoze boxes.
The result is consistent: the cpu monitor sticks at 100% and hangs. A reboot
is the only way out, resulting in the same files being corrupted over and over
again:
opt/logs/error_log
opt/logs/httpd.pid
opt/logs/httpd.scoreboard

/opt/var/log.nmbd
/opt/var/log.smbd
/opt/var/locks/nmbd.pid
/opt/var/locks/smbd.pid
/opt/var/locks/unexpected.tdb
/opt/var/locks/message.tdb
/opt/var/locks/locking.tdb
/opt/var/locks/connections.tdb
/opt/var/locks/brlock.tdb

Why a corrupted error_log makes a starting process abort completely is
questionable, i.m.o. I could live with httpd and samba being down, but ssh is
a different matter…
It seems to me this is valuable information coming from a “newbie” :-)

> Use a journaling filesystem (e.g. ReiserFS) when being pull-plug-proof is
> mandatory.

Thanks for the tip.

>> I am not trying to start a religious debate here!!!
>
> I’d be very interested to hear what direction you were expecting here…

I think you are pulling things out of context.

>> I am merely trying to
>> get a grip on what it entails to run Qnx as a server os (primarily) and
>> desktop os (secondarily) compared to other alternatives.
>
> IMHO QNX is for embedded applications, not for desktops.

You may be right, but it seems to me that qnx is putting a considerable
amount of effort into their desktop. Why?

> Are Telco applications called “server”, QSSL?
>
> (Hmm, what’s a “server” anyway?
> Anything without a keyboard, mouse and display?)

Perhaps anything that “serves you well” instead of kicking your ass :-)
/conrad

> kabe

kabe@sra-tohoku.co.jp wrote:

> weyns@online.no sed in <b15ecc$aob$1@inn.qnx.com>:
>
>> Now for the 64-thousand-dollar question:
>> “Is this the normal day-to-day life of a Qnx maintenance dude after a crash?”
>> Is there a tool to deal with this? If a starting process fails to open and
>> write to the system error_log, is it normal practice to abort?
>
> Pulling the plug while files are open will corrupt most filesystems;
> this is not specific to QNX.
> Use a journaling filesystem (e.g. ReiserFS) when being pull-plug-proof is mandatory.
>
>> I am not trying to start a religious debate here!!!
>
> I’d be very interested to hear what direction you were expecting here…
>
>> I am merely trying to
>> get a grip on what it entails to run Qnx as a server os (primarily) and
>> desktop os (secondarily) compared to other alternatives.
>
> IMHO QNX is for embedded applications, not for desktops.

I have used QNX for many desktop/server applications. It works just
fine. Whenever files are being written sequentially and there is a
system crash, there is a risk that some data is going to be lost.

If a program is using local buffers there is nothing the OS can do
about that. Even if your program writes directly to the filesystem, the
odds are that it will be cached briefly before it is written out to
disk. Hell, for that matter, even if the OS does everything possible
to write data to disk, many hard drives can cache data being written.
So a power failure can still cause problems.

But there are several things that you can do to minimize the probability
of data loss. Your software should do synchronous writes or issue
flushes after appropriate write operations. The file system driver
should also be configured to flush as often as possible, or at least
when told to. Both of these techniques, while they will help to
preserve data, will have a slight performance impact on the system.
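
As a rough illustration of the "flush after appropriate writes" advice, here is a minimal POSIX C sketch. The helper name append_record_durably is hypothetical and not taken from httpd, samba or any QNX source; it simply appends one record and calls fsync() before returning. It assumes a POSIX-style open/write/fsync interface and does not retry on EINTR.

```c
#include <fcntl.h>
#include <unistd.h>

/* Append one record to a log and ask the OS to push it to disk before
 * returning. Returns 0 on success, -1 on any failure.
 * Hypothetical helper, for illustration only. */
int append_record_durably(const char *path, const char *rec, size_t len)
{
    /* O_APPEND makes every write land at the current end of file. */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    /* write() may be short; loop until the whole record is handed over. */
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, rec + done, len - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* fsync() blocks until the filesystem reports the data as written.
     * A drive's own write cache can still defeat this. */
    int rc = fsync(fd);
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

Opening with O_SYNC instead of calling fsync() would be the "synchronous writes" variant; either way each record costs a disk round trip, which is the performance impact mentioned above.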

Another technique that I have used: when a logging process starts
back up, it should lseek to EOF minus the size of the largest record that can
be written, then read forward to find the next start of record and
confirm that that record exists in its entirety. If it does not,
you can either eliminate the bad/truncated record by lseeking to its
beginning (so the next write overwrites it) or fill in the missing data
with fake data just to fill up the record slot. (The application will
dictate which is more acceptable.)
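
A minimal sketch of that recovery step in C, under assumptions that are not in the post: records are newline-terminated text, max_rec is the largest record size including the newline, and the "eliminate" option is chosen by truncating the file at the start of the partial record. The function name trim_partial_record is made up for illustration.

```c
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>

/* Drop a truncated final record from a log of newline-terminated records.
 * max_rec is the largest record size in bytes, including its '\n'.
 * Returns the resulting file length, or -1 on error. */
off_t trim_partial_record(int fd, size_t max_rec)
{
    off_t size = lseek(fd, 0, SEEK_END);
    if (size < 0)
        return -1;
    if (size == 0)
        return 0;

    /* Start reading at EOF minus the largest possible record. */
    off_t start = size > (off_t)max_rec ? size - (off_t)max_rec : 0;
    size_t want = (size_t)(size - start);

    char *tail = malloc(want);
    if (tail == NULL)
        return -1;
    if (lseek(fd, start, SEEK_SET) < 0 ||
        read(fd, tail, want) != (ssize_t)want) {
        free(tail);
        return -1;
    }

    /* The byte after the last '\n' is where an incomplete record begins. */
    off_t keep = -1;
    for (size_t i = want; i > 0; i--) {
        if (tail[i - 1] == '\n') {
            keep = start + (off_t)i;
            break;
        }
    }
    free(tail);

    if (keep < 0) {
        /* No terminator in the tail: only valid if we saw the whole file. */
        if (start != 0)
            return -1;
        keep = 0;
    }
    if (keep < size && ftruncate(fd, keep) < 0)
        return -1;
    return keep;
}
```

A logger would call this once on its descriptor at startup, before appending anything new. Fixed-size binary records would need a different check (for example a length field or checksum per slot), which is where the "fill with fake data" option fits better.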


Bill Caroselli – Q-TPS Consulting
1-(626) 824-7983
qtps@earthlink.net

qtps@earthlink.net sed in <b195ik$lbk$1@inn.qnx.com>:

> But there are several things that you can do to minimize the probability
> of data loss.

[other excellent tips regrettably suppressed]

Ya.
But methinks his Real Question was how to get around fsck/chkfsys-ing
when Third Party Software (samba, httpd) went wild and
he had to pull the plug.
This could be an issue when there’s no babysitter for the box…

So the answer to “do you have to chkfsys?” is yes, but
the answer to “how do you automate it?” could be “plant chkfsys in *.ifs”.

kabe

When a file is opened for “write” it is first tagged as “busy”. If you
yank the power, the file system will report EBADFSYS for these files
until the file system checker has had a look. Run “chkfsys -P” when you
boot up; just be careful where you run it from, because it sometimes
causes the file system to re-mount and close all open files (including
the script file that chkfsys is running from!).
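
On the earlier question of whether a daemon has to abort when its log cannot be opened: the sketch below shows one possible alternative in C, falling back to stderr. It is not how httpd, sshd or samba actually behave, and open_log_or_stderr is a hypothetical name. EBADFSYS is the QNX errno mentioned above; it is guarded with #ifdef because other systems do not define it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Open a log file for appending; if that fails for any reason, report the
 * error and fall back to stderr so the daemon can keep starting up.
 * Returns a usable file descriptor in every case. */
int open_log_or_stderr(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd >= 0)
        return fd;

    int err = errno;
#ifdef EBADFSYS
    /* QNX-specific errno; absent on most other systems. */
    if (err == EBADFSYS)
        fprintf(stderr, "%s: marked busy/corrupt, run the filesystem checker\n", path);
    else
#endif
        fprintf(stderr, "%s: %s\n", path, strerror(err));

    return STDERR_FILENO;
}
```

The design choice here is simply "degrade, don't die": the service stays reachable (which matters most for sshd) while the message still tells the admin that the file needs chkfsys.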

Daryl Low

kabe@sra-tohoku.co.jp wrote:

> qtps@earthlink.net sed in <b195ik$lbk$1@inn.qnx.com>:
>
>> But there are several things that you can do to minimize the probability
>> of data loss.
>
> [other excellent tips regrettably suppressed]
>
> Ya.
> But methinks his Real Question was how to get around fsck/chkfsys-ing
> when Third Party Software (samba, httpd) went wild and
> he had to pull the plug.
> This could be an issue when there’s no babysitter for the box…
>
> So the answer to “do you have to chkfsys?” is yes, but
> the answer to “how do you automate it?” could be “plant chkfsys in *.ifs”.

You’re right. I tend to forget what was said in previous posts and just
look at the post I’m replying to.

In QNX4’s Fsys there was an option to force all writes to be
synchronous. Does a similar option exist for QNX6? Of course, if it
does, it would have a negative impact on overall system performance.
And I guess it still won’t help the BUSY flag issue.


Bill Caroselli – Q-TPS Consulting
1-(626) 824-7983
qtps@earthlink.net

“Daryl Low” <dlo*w@qnx.com> wrote in message
news:3E390240.2000707@qnx.com

> When a file is opened for “write” it is first tagged as “busy”. If you
> yank the power, the file system will report EBADFSYS for these files
> until the file system checker has had a look. Run “chkfsys -P” when you
> boot up; just be careful where you run it from, because it sometimes
> causes the file system to re-mount and close all open files (including
> the script file that chkfsys is running from!).
>
> Daryl Low

Thanks Daryl, this is useful info for a true “qnx newbie” like me.
I am happy there is a dedicated group for this.

Thanks to Kabe and Bill as well. I am, albeit very slowly, learning…

There are a few aspects of Qnx Neutrino 6.2 (my installation of it, that is)
that concern me. On the one hand, installing it went like a dream and was
quick and easy. The desktop came up, I logged in as root, and qnx proceeded to
install more goodies from the cd. The disk was going like a
machine gun! At the same time I was able to use the photon browser and start a
download of the 452MB qnxpub100.iso image, which stabilized smoothly at
about 60kb/sec (I have a 1MBit sdsl line), and still qnx let me launch
applications, and the gui continued to be responsive to my mouse and keyboard
activity. And finally, someone got it right: the extended desktops are not
separate workspaces but part of a bigger picture. Being able to drag a
window across workspaces in the worldview while everything else is going on
is really a show! We’d call that pron among my developer colleagues. Believe it
or not, at first this was on an old 200MHz/64MB Pentium Pro! A few days
later I repeated the process on a 450MHz/520MB PIII. Everything was just as
smooth, but still no usb trackball though.

Yet on the other hand, I can
easily make the whole thing hang forever using either Mozilla (in a tight
memory situation - mozilla is slow!) or screwing about with samba
up- and downloads from some windoze boxes (difficult to reproduce, but after a
while it will happen). Yanking the power is the only option. (I think qnx
is taking over both the reboot button and the on/off switch. Ok, I have seen
NT do that sometimes, but never on that machine though…)

Now, remember: I am a newbie. When qnx reboots after such an event and I
find that many processes simply haven’t been able to start due to a “corrupt
file system”, what can you expect me to do? Had I been slightly pressed for
time and not taken the trouble to investigate, I might have re-installed the
whole shebang once more - but I can tell you with 100% certainty, not
twice!

So-called “free non-commercial” downloads are nothing but bait and never
really free. That’s ok, I knew that much :-) But there are other things,
like config and log files living in weird places. It is hard to find them,
and their locations are rather inconsistent. I know that qnx cannot take the blame
for third party products, but there are limits. I found several discrepancies
between the documentation and help files and what is actually going on. I have a
feeling it is simply copied from the ‘unixy’ world and not properly adapted
to qnx. I eventually had to make a note on paper of the most important
paths. What is expected from a qnx newbie? How much unix knowledge is
required, and will this unix knowledge be applicable?

I think sshd should be made an integral part of the core installation, and it
should do its utmost to be able to run after a system crash, at least as
long as the services it requires, such as networking, are running, which in my
case they always have been. The reason for this is simple: it is the key to
getting outside help. Telnet is an option, obviously, but not well regarded
among my few ‘unixy’ heavy friends :-)

Cheers,
Conrad.

Conrad Weyns <weyns@online.no> wrote:
: I found several discrepancies
: between the documentation and help files and what is actually going on. I have a
: feeling it is simply copied from the ‘unixy’ world and not properly adapted
: to qnx. I eventually had to make a note on paper of the most important
: paths. What is expected from a qnx newbie? How much unix knowledge is
: required, and will this unix knowledge be applicable?

Can you please post the discrepancies so we can fix the docs? Thanks.


Steve Reid stever@qnx.com
TechPubs (Technical Publications)
QNX Software Systems

qtps@earthlink.net sed in <b1bon3$jvt$2@inn.qnx.com>:

> In QNX4’s Fsys there was an option to force all writes to be
> synchronous. Does a similar option exist for QNX6? Of course, if it
> does, it would have a negative impact on overall system performance.

Is
devb-* blk delwri=0,commit=high
correct for this? The io-blk.so manual says commit=high is synchronous.

Reminder:

> And I guess it still won’t help the BUSY flag issue.

kabe