Robert.
http://www.paschiche.com/forum/viewtopic.php?t=32
To me, it’s obviously the CF card that is DEAD! The constant 3.2% of
corruption, together with the contiguous bad clusters turning up in
different locations, indicates that part of the memory DIE has… DIED! It is
as if several contiguous memory cells on adjacent lines (of the cell matrix)
have burnt out. To me, either a square region of the memory die is dead, or
an imperfection was introduced in the silicon and shows up only under some
conditions (temperature, voltage, …).
Kochise
Robert Muil wrote:
Robert,
Not sure if you are still interested in this issue - I hope so!
I have had big problems with flash corruption under QNX. We use 1 GB
CompactFlash cards with QNX 6.2.1A.
We seem to be severely exercising the problem now with a recurrent rebooting
problem. Because of a separate problem, our system reboots every hour. After
about 4 days the flash disk on which the OS resides becomes heavily
corrupted.
This has happened to at least 2 disks. The patterns of bad blocks are
interesting: in both cases, the bad blocks occupied exactly 3.2% of the disk
and were almost entirely contiguous. They were in different locations,
though.
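One way to put numbers on a pattern like this is to take the list of bad
block indices from a surface scan and compute both the bad fraction and the
longest contiguous run. This is only a sketch: the function name, the input
format, and the sample numbers are made up for illustration and are not
taken from the disks described above.

```python
def bad_block_summary(bad_blocks, total_blocks):
    """Return (fraction_bad, longest_run) for a collection of bad block indices."""
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    blocks = sorted(set(bad_blocks))  # ignore duplicates, process in order
    longest = current = 0
    previous = None
    for block in blocks:
        if previous is not None and block == previous + 1:
            current += 1   # extends the current contiguous run
        else:
            current = 1    # starts a new run
        longest = max(longest, current)
        previous = block
    return len(blocks) / total_blocks, longest


# Hypothetical scan result: 1000 blocks, 32 bad ones in a single run.
fraction, run = bad_block_summary(range(400, 432), 1000)
print(f"{fraction:.1%} bad, longest contiguous run: {run} blocks")
```

A longest run close to the total bad count would match the "almost entirely
contiguous" description; many short runs would point to something else.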
I have been racking my brain about what in the OS could possibly be writing
heavily to the disk during normal operation. There really should be very
little activity on the disk beyond reading it at bootup. However, your
suggestion that it has to do with unclean shutdown would make a lot of sense
in my experience.
Flash Disk: Sandisk 1GB CompactFlash
I/F: IDE (treat as standard BIOS hard drive)
OS: QNX 6.2.1A
Regards,
Robert Muil.
“Robert Krten” <rk@parse.com> wrote in message
news:c7d9l7$o4u$1@inn.qnx.com…
Donald Backstrom <donaldb@cstgroup.com.au> wrote:
I’m running QNX4 on a network of 1996 vintage boards which were recently
upgraded to IDE flash modules (the ones that go into the sockets directly,
but look to the software like IDE drives). The original set of 18 was from
ICP but I sent these back when we had three failures. They were replaced
by others also from ICP, but they boot as Sandisk SDC1-32. They are the
44-pin version (i.e. 2mm pitch connector).
These ones have not failed, but they do suffer data corruption. It is
always related to power down, but I’m not certain that it necessarily
depends on whether the drive was written to since it was last powered up.
It is curious that we are using these modules since we could not make
Compact Flash work, even though we were using boards that allegedly
supported CF as an IDE drive. We could not load a QNX disk image onto
these drives and have them last reliably. The boards we were using for
this were current - PCM5823 3.5inch form factor.
We don’t have a good record with flash at all.
We originally used M-Systems PC104 flash disks, and apart from a number of
drive failures over the years (probably 4 in 18 systems in 6 years) they
did not corrupt data. This tends to support the TFFS data integrity theory.
Hi Donald,
thanks for the datapoint! I will be working on this issue some more this
week / next week. I did receive input from another user that there is
definitely some corruption related to powerdown, and that the last write
needs to take place something like 30 to 120 seconds before the flash
is powered down; this is related to “automatic” block shuffling done
by the flash’s built-in controller.
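If that report holds, the application could enforce the wait itself. Below
is a rough sketch of such a guard. The class name, its methods, and the
120-second default (the upper end of the range quoted above) are
illustrative assumptions, not anything supplied by QNX or the flash vendors.

```python
import time


class QuietPeriodGuard:
    """Tracks the last write and reports when power-down is considered safe."""

    def __init__(self, quiet_seconds=120.0, clock=time.monotonic):
        self.quiet_seconds = quiet_seconds
        self._clock = clock          # injectable so the guard can be tested
        self._last_write = None      # None: no write seen since start-up

    def note_write(self):
        """Call after every write to the flash device."""
        self._last_write = self._clock()

    def seconds_remaining(self):
        """Seconds still to wait before the quiet period has elapsed."""
        if self._last_write is None:
            # Assumption: with no writes since start-up there is nothing to wait for.
            return 0.0
        elapsed = self._clock() - self._last_write
        return max(0.0, self.quiet_seconds - elapsed)

    def safe_to_power_down(self):
        return self.seconds_remaining() == 0.0


guard = QuietPeriodGuard(quiet_seconds=120.0)
guard.note_write()
print(guard.safe_to_power_down(), round(guard.seconds_remaining()))
```

A shutdown routine would poll safe_to_power_down() (or sleep for
seconds_remaining()) before cutting power. This only helps with orderly
shutdowns; it does nothing for an unexpected power loss.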
More as I have it.
Cheers,
-RK
nntp.qnx.com wrote:
Hey Robert,
I sing the same tune as Miguel, with the exception that DiskOnChip has never
given me any problems, but IDE flash devices always have, and plain vanilla
flash devices always have as well.
Basically, if we overwrote anything on the flash disk and powered down
before giving it around 10 seconds, things would be corrupted. Also, we got
into the habit of using QNX 4’s sync command, which helped as well. I’ve
never really come up with a solution with QNX6, except to use DiskOnChip
devices only.
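For anyone who wants the effect of the sync habit from inside a program, a
minimal sketch follows. The helper name and the sync_all flag are made up
for illustration; flush(), os.fsync() and os.sync() are standard calls.
Note the limits: these calls only push data out of the OS caches, and may
not tell you when the card’s own controller has finished, so the settle
time mentioned above would still apply.

```python
import os
import tempfile


def write_and_flush(path, data, sync_all=False):
    """Write bytes to path and ask the OS to push them out to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # drain Python's userspace buffer
        os.fsync(f.fileno())   # ask the kernel to write this file's data out
    if sync_all and hasattr(os, "sync"):
        os.sync()              # whole-system flush, like the sync command
    return len(data)


# Demonstration against a throwaway directory (not a real flash device).
with tempfile.TemporaryDirectory() as d:
    written = write_and_flush(os.path.join(d, "config.bin"), b"\x01\x02\x03")
    print(written, "bytes written and flushed")
```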
I tend to trust DOC much more than just plain vanilla flash because of the
TrueFFS driver. They apparently have algorithms to help ensure data
integrity. Also, the DOC devices have functionality (or is it in the
TrueFFS driver? I don’t remember) that keeps old data until the new data has
been fully committed. It’s a write-then-delete process instead of
delete-then-write. This makes the device tolerant of being powered down
before all the data has been committed to the device. Don’t remember the
specifics. But you might check it out.
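The same write-then-delete idea can be imitated at the file level, whatever
the device does internally. The sketch below is not how TrueFFS works
inside; it is a generic pattern with a made-up helper name, and it assumes
the filesystem performs the final rename atomically (POSIX requires this of
rename, but whether a given QNX flash filesystem honours it is untested
here).

```python
import os
import tempfile


def replace_file_safely(path, data):
    """Write the new contents to a temp file first, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".new-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # new data is pushed out before the swap
        os.replace(tmp_path, path)   # old file is retired only at this point
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)      # leave no half-written temp file behind
        raise


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "app.bin")
    replace_file_safely(target, b"version 1")
    replace_file_safely(target, b"version 2")
    with open(target, "rb") as f:
        print(f.read())
```

If power is lost before the swap, the old file should still be intact; the
cost is that the disk needs room for both copies for a moment.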
Hope it helps.
Kevin
“Miguel Simon” <simon@ou.edu> wrote in message
news:c5slto$ol1$1@inn.qnx.com…
Hi RK…
I hope that all is well.
I will get back to you with more details at a later time. But for now,
here is a way in which I have been able to corrupt flash disk in a
consistent manner:
I. subject: SanDisk CompactFlash 64 MB and 120 MB
OS: QNX 6.2.1-B PE
hardware: VMIC cPCI, 933 MHz
- can boot QNX ok,
- copy over the old binary without deleting it first
- flash memory gets corrupted
- must reformat the memory after a while
- mean time between failures: about a year or so, depending on how often
  you write to the flash
- time of last incident: 1 month ago
II. subject: DiskOnChip 2000
OS: QNX 6.2.0
hardware: Adastra EBX board (do not recall actual board number)
- can boot QNX ok,
- DiskOnChip gets corrupted at random, no pattern perceived…
- time of last incident: 1 year ago
- NOTE: we changed to Prometheus PC104 from Diamond Systems because of
  this problem. Things are a little better, but…
III. subject: Prometheus Flash Disk Module
OS: QNX 6.2.1-B PE
hardware: Prometheus PC104
- can boot QNX ok,
- copy over the old binary without deleting it first
- flash memory gets filled and reports files that have been deleted
- must reformat the memory after a while
- mean time between failures: about a year or so, depending on how often
  you write to the flash
- time of last incident: 1 month ago
It seems that with QNX OS, when we write over old binaries, the flash
memory gets corrupted regardless of media and hardware. You can try to
do this for yourself and see if you get the same results. However,
notice that if I delete the old binaries first, it seems that I can
delay the onset of flash corruption (until I forget to do this and write
over old binaries anyway). Also, notice that flash gets corrupted when
I write to the flash disk repeatedly, over and over. Finally, the
DiskOnChip + Adastra board were a real bad match, but newer Adastra
boards may be ok. Also, I do not ever use MS embedded products.
As time passes I will collect better data, and I will let you know
(provided that you still need the information).
Regards…
Miguel.
Robert Krten wrote:
I’m about to start investigating flash corruption on behalf of three
distinct customers; I don’t have very many details right at this point,
but the general consensus is that these devices “work just fine” under
Windows CE, DOS, and other OSes, and “experience corruption” when used
with QNX 6. That’s all the “hard facts” I have at this point.
The purpose of this post is to solicit input from the field on flash
corruption: I’m looking for things like model numbers, flash technology
used, usage patterns when it failed, and whether this is a QNX 6-specific
problem or not (as far as you are able to tell). I’ll summarize the
results and analysis as much as I’m able to when given possible NDA
constraints etc.
Thanks in advance for your input!
Cheers,
-RK
[If replying via email, you’ll need to click on the URL that’s emailed to
you afterwards to forward the email to me; spam filters and all that]
Robert Krten, PDP minicomputer collector http://www.parse.com/~pdp8/