Preventing File System Corruption

What strategies can we use to prevent file system corruption caused by
unexpected power loss? I’m mainly concerned with protecting files that the
OS relies on, not our application files. Here are some of the ideas we’ve
already kicked around.

  • UPS
  • Soft power key delivers an interrupt to the OS, allowing time to shut down
    the file system
  • Run the OS from a read-only partition
    – Will QNX and Photon run correctly from a read-only partition?
  • Running chkfsys after the fact will not always recover lost data

Thanks, Dennis

  • Running chkfsys after the fact will not always recover lost data

Hello, I use chkfsys in /etc/system/sysinit (qnx6)
David.

# Start the pseudo-tty manager (probably should let users fiddle the parm)
devc-pty -n 32
echo "Starting chkfsys"; chkfsys -Prq /dev/hd0t79
if test -x /etc/rc.d/rc.sysinit; then
    . /etc/rc.d/rc.sysinit
fi

Depending on the file system corruption, the file system is not always repairable, even if you run chkfsys during startup. Even if chkfsys succeeds, it only means that the file system is self-consistent again. Data may still have been lost.

Also, when you run chkfsys with the -r option and it updates the bitmap, you should run chkfsys again. I’ve never seen a startup script handle this, though, by looping until chkfsys reports success without making changes.
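
The retry loop described above could be sketched roughly as follows. This is only an illustration: check_until_clean is a hypothetical helper, and it assumes the check command exits non-zero whenever it found or repaired something, which should be verified against the chkfsys documentation for the release in use. The pass limit is there so that a disk that never comes up clean cannot hang the boot.

```shell
#!/bin/sh
# Hypothetical helper: re-run a file system check until it exits 0
# or the pass limit is reached.
# ASSUMPTION: the check command exits non-zero whenever it found or
# repaired a problem. Confirm this for chkfsys on your QNX release.
check_until_clean() {
    max_passes=$1
    shift
    pass=1
    while [ "$pass" -le "$max_passes" ]; do
        echo "File system check, pass $pass"
        if "$@"; then
            return 0        # clean pass, nothing left to fix
        fi
        pass=$((pass + 1))
    done
    return 1                # still not clean after max_passes
}

# Possible use in sysinit (flags and device copied from the script above):
# check_until_clean 3 chkfsys -Prq /dev/hd0t79 || echo "chkfsys: giving up"
```

If the exit status turns out not to reflect whether changes were made, the same loop could instead test chkfsys's output, but that depends on message text that may differ between releases.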

Dennis
“David Brdička” <dbrdicka@retia.cz> wrote in message news:f2h16h$mkp$1@inn.qnx.com

Hello, I use chkfsys in /etc/system/sysinit (qnx6)

QNX runs quite well with the system in a read-only partition.
In fact, you can run with the OS in ROM, or on a CD-ROM drive.
For many embedded applications, that’s a good choice.
There’s much to be said for building your own boot image and
putting it on some medium that’s truly read-only. Then you’re
guaranteed that, short of physical damage to the hardware, the
system will start.

John Nagle

John Nagle <nagle@downside.com> wrote:

QNX runs quite well with the system in a read-only partition.
In fact, you can run with the OS in ROM, or on a CD-ROM drive.
For many embedded applications, that’s a good choice.

I definitely agree with this suggestion. OS components, your
application components – all in a RO partition, or on a RO
device.

Dynamic data – logging, configuration, data collection, etc.,
in a separate writable partition or on a separate writable device.

If you are HD based, separate partitions are probably enough.
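
As a rough sketch of that split, a startup fragment might mount the OS partition read-only and the data partition read-write. Everything specific here is an assumption made for illustration: the function name, the device names and mount points in the example, the qnx4 file system type, and the -r flag being the read-only option of the mount utility on the release in use.

```shell
#!/bin/sh
# Sketch only: device names and mount points are hypothetical.
# ASSUMPTION: "mount -r" requests a read-only mount and "-t qnx4" selects
# the file system type; check the mount usage message on your release.
MOUNT=${MOUNT:-mount}          # set MOUNT="echo mount" for a dry run

mount_split_layout() {
    os_dev=$1; os_dir=$2; data_dev=$3; data_dir=$4
    # OS and application binaries: read-only, so no writes are in flight
    $MOUNT -r -t qnx4 "$os_dev" "$os_dir" || return 1
    # Logs, configuration, collected data: the only writable file system
    $MOUNT -t qnx4 "$data_dev" "$data_dir" || return 1
}

# Example (hypothetical partitions):
# mount_split_layout /dev/hd0t79 /os /dev/hd0t78 /data
```

With this layout, only the data partition would need a chkfsys pass after a power loss.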

If you’re using something like compact flash that sits behind
an IDE controller, you may still be at risk. While the OS
will treat the partitions as separate, it is hard to know what
wear-levelling algorithms the CF device uses for its erase blocks,
and whether an unexpected power cycle in the middle of a
wear-levelling operation could affect nominally read-only data.

-David

QNX Training Services
http://www.qnx.com/services/training/
Please follow up in this newsgroup if you have further questions.

John, David,

Thanks! That’s what I was looking for.

Dennis


“David Gibbs” <dagibbs@qnx.com> wrote in message
news:f2k9pc$j6p$1@nntp.qnx.com

I definitely agree with this suggestion. OS components, your
application components – all in a RO partition, or on a RO
device.


Evidently chkfsys doesn’t catch 100% of corruptions. On my 6.3.2 SP3 machine
right now, I have a corrupt filesystem, as reported by rm:

  rm: Can't remove directory cvsmoz/mozilla/gc/boehm/cord/private: Corrupted file system detected

But when I run chkfsys -u on the mount point of that partition, it does not
detect any corruption and flags the filesystem CLEAN.


“Dennis Miller” <dmiller@NOSPAMminnetronix.com> wrote in message
news:f2hme7$35g$1@inn.qnx.com
Depending on the file system corruption, the file system is not always
repairable, even if you run chkfsys during startup. Even if chkfsys
succeeds, it only means that the file system is self-consistent again. Data
may still have been lost.
