System files deleted after a driver crash. Is recovery possible?

Hi

I have repeatedly run into a “system files deleted” problem and need your help. The problem is described below. Thanks in advance!

I run a program on QNX 6.3.2. The program uses a CAN bus card that requires a driver. The driver is an executable that runs in the background. (FYI, I run the program as root because it needs to run at high priority and needs I/O access.)

When I press Ctrl-C to kill the program, the driver exits/crashes. (The driver did not crash often before.)
Then QNX cannot find shell commands such as pwd, cd, etc.
I have to reboot the system. Not surprisingly, the system cannot boot.

Trying to recover, I boot from a QNX CD. I notice that the hard drive is already mounted normally as /fs/hd0-qnx4/. But when I cd into it, most files are gone. The remaining files are: “.bitmap .boot .filenames .altboot .inodes usr”. Their sizes add up to a few MB.

I tried fdisk /dev/hd0. Everything looks normal (size, partition information).
I tried dcheck: no bad sectors.
I tried chkfsys: some fixable errors were detected. I chose to fix them. The system still cannot boot.

I spent a whole day reinstalling the system, the software and my program. When I ran the program it happened *** again ***: system files were deleted, in the same way as I described above. So it does not look like an isolated incident.

I don’t want to keep reinstalling, so I have to find out why the system files are being deleted.
Any idea what mistakes I might have made?

Thank you very much!

Does the CAN bus driver perform DMA? That is the only scenario in which I can imagine the level of damage you describe to the filesystem. Have you tuned the filesystem for reliability over performance (see io-blk documentation)?

The CAN card is a CAN-PCI/266 made by esd-electronics.
I have not found any information on its website about whether it uses DMA or not.

About the tuning, are you referring to adjusting “commit=level”? The CAN driver is an executable file and does not have this option. I read the io-blk.so documentation at qnx.com/developers/docs/6.3. … lk.so.html

BTW, the hard drive is mounted on a robot that moves around. The motion could damage the hard drive. However, if dcheck reports no bad sectors, can I assume the hard drive is not broken?

Thanks!

The “commit=level” option is for the disk driver, not the CAN driver. It allows you to set the level of synchronicity between a file write (or metadata update) and its physical storage on the media. Setting this level affects how tolerant the filesystem is of power failure and/or a crash. Since the CAN bus driver is crashing the system, it might be prudent to set the commit level to high until you sort out the issue…
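
To illustrate what “synchronicity between a file write and its physical storage” means, here is a minimal sketch of the same idea at the level of a single file, using the POSIX fsync() call. It is only an analogy, not the io-blk implementation: commit=level is applied by the disk driver to the whole filesystem, while this affects one file. The function name durable_write is hypothetical, and Python is used purely for brevity (the equivalent open/write/fsync calls exist in C).

```python
import os


def durable_write(path, data):
    """Write bytes to path and return only after fsync() has completed.

    Hypothetical helper for illustration; it is not part of QNX or io-blk.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Ask the OS to push the file's data to the storage device.
        os.fsync(fd)
    finally:
        os.close(fd)
    return len(data)
```

A higher commit level trades write performance for a smaller window in which a crash can leave data or metadata only in memory, which is the same trade-off as calling fsync() after every write here.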

If you set commit to high and the problem still happens, it suggests that something other than the filesystem or its drivers is messing with the hard drive. Another avenue, if you have enough memory, would be to mount a RAM disk, copy all needed files to it, then remount the RAM disk as root and unmount your hard drive. If the hard drive is still trashed, then my first observation would be confirmed.

BTW, I’ve never heard of physical damage to a disk causing the type of result that you are reporting.

One last observation. Except for usr, the list of files left looks like what one would expect after a dinit.
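
As a rough way to make that comparison repeatable after each crash, here is a small sketch that lists whatever is in the mounted partition besides the names quoted earlier in this thread. The set of names is taken from that listing, not from the dinit documentation, so treat it as an assumption; the function name and the example mount point are hypothetical too.

```python
import os

# Assumption: these are the names reported in the listing earlier in the
# thread, not an authoritative list of what dinit creates.
DINIT_LEFTOVERS = {".bitmap", ".boot", ".filenames", ".altboot", ".inodes"}


def unexpected_entries(mount_point):
    """Return the sorted top-level names that are not in DINIT_LEFTOVERS."""
    return sorted(set(os.listdir(mount_point)) - DINIT_LEFTOVERS)


def missing_leftovers(mount_point):
    """Return the sorted names from DINIT_LEFTOVERS that are absent."""
    return sorted(DINIT_LEFTOVERS - set(os.listdir(mount_point)))
```

For example, unexpected_entries("/fs/hd0-qnx4") on the disk described above would return only the usr entry, which is what makes the result look like a freshly initialized filesystem rather than random corruption.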

Don’t use devb-eide’s RAM disk; it’s buggy (I don’t remember the exact problem).

devb-ram implements a RAM disk without any problems.
If the damage is done by software, an interesting question is whether the CAN bus driver is to blame, or applications going berserk when the driver is terminated (crashes).

Applications that don’t talk to hardware that does DMA can’t damage the kernel. That’s why I would want to know if the CAN driver does DMA…

I wasn’t talking about devb-ram but devb-eide.

Mario,
I completely agree with you.

The io-blk.so option ramdisk=size, which may be used with devb-eide, is not perfect in some versions.
devb-ram (a separate driver) is OK.
Yuriy