Service Pack 2 and ifs file "issues"

wretched · November 28, 2005, 1:56am

Has anyone else noticed anything weird when generating compact flash images using Momentics with SP2 applied?

We used to build a comprehensive image (~8Mb) as an ifs and loaded onto one partition of a CF card (taking the obvious memory hit when the whole thing is loaded as a read-only RAM partition), but noticed that there were several strange behaviours when a “known good” image was generated using SP2.

Firstly, files which end up near the end of the image are occasionally corrupted (file sizes do not match, going up in some cases when it would be expected that the file stripping which occurs should at least keep the files the same size). This seemed to include libraries getting damaged, resulting in much pain and suffering trying to find the cause of programs aborting before entering main(). We believe that this is because there is a fundamental limit to the size of an ifs system. The vague references available suggest that this is the case, but not what that particular limit is.

Secondly, even with smaller images (testing purposes) there seems to be nothing systematic in the way that the addition or removal of resources affects the operation of the system: I created an image from the “generic x86” project and added binaries one at a time until the bsh script was unable to execute the individual commands - say devb-eide at the start of the file (image was about 4.5Mb at this stage). Removing the single binary added (note that these were binaries not used in the bsh file) sometimes got it going again, but often it did not. Sometimes adding the binary back didn’t cause problems the second time around. Fiddling with the “page align image” option also changed the behaviour.

We are wondering if there is a page-alignment or similar issue with the mkifs utility, or a fundamental limit to the operational size of the ifs as a result of this…

The good news, however, is that we seem to have a good solution (which, it turns out, is probably closer to what the QNX creators of mkifs intended): create an ifs with the minimum set of objects needed to start the system and mount a secondary partition, all in /proc/boot (devb-eide, devc-con, pci-bios, seedres, sh, waitfor, libc.so.2, libcam.so.2, libcpp.so.2, cam-disk.so, fs-dos.so, fs-qnx.so, io-blk.so, libcam.so->/proc/boot/libcam.so.2, libcpp.so->/proc/boot/libcpp.so.2), then mount a file system containing the rest of the things needed (/bin, /sbin, /usr/bin, /usr/sbin, /opt), creating symlinks in the image so that this looks clean from the perspective of the resulting system (e.g. /sbin → /sbin).

There was one issue: the contents of /usr/lib needed to be included in the ifs file as the bsh script appears to require ldqnx.so.2 (symlink to libc.so.2) to be located in /usr/lib and does not work when it is in /proc/boot. This seems to be hardcoded somewhere…

If anyone else has had these types of problems, or knows why they may have occurred, we’d be interested in hearing.

ed1k · November 28, 2005, 2:45pm

What is your target platform? How much memory does it have? What is the [memory=] option in your build file? Do you create a compressed or an uncompressed image? If it is x86, the loader will load your image starting from (1M+64K), then it will uncompress or copy the image to the region specified by the [memory=] option. If your image is 8M big and you specified [memory=8m], you will have an overlap and the end of the image in RAM will be corrupted.
Eduard.
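The overlap described in the post above can be checked with a little arithmetic. The following is only a minimal sketch: the 1M+64K load address and the 8M figures come from the post, while the function name, the second 16M destination, and the assumption that the destination region is as large as the image are illustrative choices, not anything taken from mkifs or the loader.

```python
MIB = 1024 * 1024
KIB = 1024

# Address where the loader first places the image on x86, as stated in the post.
LOAD_ADDR = 1 * MIB + 64 * KIB


def regions_overlap(start_a, size_a, start_b, size_b):
    """Return True if the half-open ranges [start, start + size) share any byte."""
    return start_a < start_b + size_b and start_b < start_a + size_a


# Hypothetical example values: an 8M image, tried at two destination addresses.
image_size = 8 * MIB
for dest in (8 * MIB, 16 * MIB):
    clash = regions_overlap(LOAD_ADDR, image_size, dest, image_size)
    print(f"destination {dest // MIB}M: overlap={clash}")
```

With these numbers the loaded image occupies everything from 1M+64K up to 9M+64K, so a destination of 8M collides with its tail, while a destination of 16M does not.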

wretched · November 29, 2005, 12:53am

The image was built using QNX System Builder in Momentics, so the options seem to be called different things.

Platform: x86 platform, 128Mb RAM, P266MHz
Image: Uncompressed, image address = Default, RAM address = Default
I presumed that this meant that the file system is copied/decompressed into the region immediately following the image itself and that the system RAM starts either: after the loaded image, or at the default location but with the image region marked as used. I could well be wrong in my understanding of the process, however.

The strange behaviour is the inconsistency, rather than the apparent image corruption at large file sizes. Even at a size as “small” as 4Mb, a size which has operated reliably for several years under earlier versions on the same hardware, the addition of a binary can stop the system working, and removing it might fix it (maybe on the second or third add-remove cycle). The failure mode of the smaller images is that the shared libraries seem to be unavailable at boot-time and nothing from the startup script will run (making it difficult to interrogate further!).

Thanks for pointing me in an interesting direction.

Thunderblade · November 29, 2005, 8:46am

Have you tried a different CF card? In my experience some CF cards don’t last as long as they should. I met corruption in the form that file size was correct, but the cksum for the file was wrong. Re-copying the image to the CF, the same file was affected, which means the CF controller does not apply wear-leveling, i.e. when you copy an image again and again you hit the same flash blocks all the time. In the end I tried to re-format the CF under Windows with a FAT filesystem and even that failed. Then I tried a new CF and everything worked…
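The check described above (same size, different checksum) can be sketched as follows. This is an assumption-laden illustration: the function names are hypothetical, and it uses zlib's CRC-32, which is not the same algorithm as the cksum utility, so the values will not match cksum output. It only shows the idea of comparing a file on the card against the original.

```python
import zlib


def size_and_crc(path, chunk_size=64 * 1024):
    """Read a file in chunks and return (size in bytes, CRC-32 of its contents)."""
    size = 0
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            size += len(chunk)
            crc = zlib.crc32(chunk, crc)
    return size, crc & 0xFFFFFFFF


def compare_copies(original, copy):
    """Classify a copy as identical, a different size, or same size but corrupted."""
    o_size, o_crc = size_and_crc(original)
    c_size, c_crc = size_and_crc(copy)
    if o_size != c_size:
        return "size mismatch"
    if o_crc != c_crc:
        return "same size, different contents"
    return "identical"
```

Calling compare_copies() with the file on the development host and the same file read back from the CF card would distinguish the failure described here (same size, different contents) from the size changes reported in the first post.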

wretched · November 30, 2005, 12:42am

Unfortunately, this problem still occurs on new/unused cards.

ed1k · November 30, 2005, 1:12am

Huh, I never used Momentics, so I can’t comment on this. I guess that GUI thing somehow creates a build file and launches mkifs on this autogenerated build file. If that’s the case, you should be able to click somewhere and enter the memory address where you want to put your image in RAM; in other words, it adds [memory=] options to the build file.
The default memory region for placing the image was 16M on x86, I believe. They could easily change that; actually it’s not good for memory-restricted systems (yes, some systems don’t have as much as 16MB of memory), and it doesn’t really matter that the image would be placed well in the middle of RAM as long as you don’t need physically contiguous memory for DMA or something (the MMU will take care of it and memory will look contiguous to applications).
From my experience (long before QNX6.3), the QNX secondary loader (boot sector) loads the image from disk to memory starting at 0x111000, leaving a copy of the first 64K of the image in low memory at 0x1000. This low memory code switches the CPU to protected mode and copies/decompresses the image from 0x111000 to the physical memory address you specified (if you didn’t, it will be 16m or something else).
Building the image in a console is much more informative because mkifs -vvv will print a lot of useful information (though to redirect this info to a file you have to use “mkifs … 2>logfile.mkifs”).
Also, there is the “dumpifs” utility, which could help you. If mkifs is broken, dumpifs will show the size mismatch you noticed before, with no need to boot the image. If everything looks fine, then something is wrong with the CF or the build file. I remember there was a similar problem related to h/w issues on some computers (some IBM laptops?.. I think it was the low memory preboot code and an issue with the switch to protected mode).
Anyway, I don’t think such a huge image is a good idea, and you understand that. The worst outcome would be if you have really nailed a problem in mkifs and then abandon it to live there forever.
Hope this helps,
Eduard.

ed1k · November 30, 2005, 1:45am

BTW,
Because the QNX loader uses the int 0x15 ah=0x87 BIOS function to load the image from disk to memory on a PC, the starting address is 0x111000, and this BIOS function works below 16M only, the max. size of a bootable image is ~15290KB.
Eduard.
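The ~15290KB figure follows directly from the two numbers quoted above. A minimal sketch of the arithmetic, assuming the limit is exactly 16 MiB and the load address is exactly 0x111000 as stated in the post (the constant and function names are made up for illustration):

```python
# Values taken from the post; the names are hypothetical.
LOAD_ADDR = 0x111000                 # where the loader starts placing the image
BIOS_COPY_LIMIT = 16 * 1024 * 1024   # the BIOS copy function works below 16M only


def max_image_bytes(load_addr=LOAD_ADDR, limit=BIOS_COPY_LIMIT):
    """Largest image that fits between the load address and the 16M limit."""
    return limit - load_addr


size = max_image_bytes()
print(f"{size} bytes, {size // 1024} KB")  # prints "15659008 bytes, 15292 KB"
```

The result, 15292 KB, agrees with the approximate value given in the post.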

wretched · December 1, 2005, 7:16am

Thanks heaps for the new info. I will try to get some time to look into the mkifs problem at some stage (not really a high priority as the much smaller, cleaner image is working perfectly for us and, unfortunately, as always there is never enough time to do everything you want.)