devb-eide CPU usage with SATA drives

We are currently unhappy with the vendor that supplies some of our QNX blades. Due to the environment we operate in, the current blades are rusting prematurely. On top of that, the vendor has been switching hardware (motherboards, hard drives, etc.), sometimes notifying us and sometimes not. In any case, it makes configuration a bit problematic, especially for systems we have to support for 7-10 years.

So we’ve been looking at some specialized vendors who supply long-life (guaranteed 10-year availability) motherboards. We just got in the first evaluation system from the first vendor to test out.

I installed QNX 6.3.0 SP3 without any problems and configured it for our needs and started testing.

What I immediately noticed is that hard drive performance is very slow. Specifically, devb-eide is eating up tons of CPU time in order to write to the disk.

The new vendor system consists of:
2.66 GHz Intel Core 2 Duo CPU
2 GB of RAM
1 x 250 GB SATA II drive
2 x Realtek 1 Gbit NICs

This is the same configuration as the ones we currently have (other than different NICs).

In the BIOS on the new MB I’ve set the following:

IDE HDD Block Mode - Enabled
IDE DMA Transfer Access - Enabled
IDE Primary Master UDMA - Enabled
IDE Primary Slave UDMA - Enabled
Legacy Mode Support - Enabled

Without Legacy mode QNX won’t boot. I’ve fiddled with the others and there isn’t anything that seems to make any difference to my devb-eide performance.

The arguments to devb-eide in my boot image are as follows:

devb-eide eide noslave,noreset blk cache=2M,automount=hd0t79:/

These are the same arguments to devb-eide that we use on our current blades.

As an example of my testing, I copy a 1 MB file while running hogs at a 1-second interval to report CPU usage over 2%.

On the current blades, this copy of a 1 MB file doesn’t even show any process using 2% CPU (in Photon I also don’t see the CPU use graph move, and the hard drive graph barely flickers).

On the new blade, this copy of a 1 MB file takes 15-22% of the CPU for 2-3 seconds (in Photon I can see the CPU graph spike for 2-3 seconds while the hard drive graph barely flickers).

What’s interesting is that the cp command returns to the prompt, and it’s a full second or two later before I see the CPU spike when devb-eide is doing whatever it’s doing.
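
To put a rough number on the gap, the figures above can be turned into CPU-seconds per copy. This is only back-of-the-envelope arithmetic on the reported 15-22% for 2-3 seconds. The helper name is made up for illustration, and it assumes the hogs percentages are a share of total CPU time.

```python
def cpu_seconds(percent, seconds):
    """CPU time consumed when `percent` of the CPU is busy for `seconds`."""
    return percent / 100.0 * seconds


# Best case and worst case from the numbers reported for the new blade.
low = cpu_seconds(15, 2)
high = cpu_seconds(22, 3)
print(f"{low:.2f} to {high:.2f} CPU-seconds per 1 MB copy")
```

By comparison, the current blades never cross the 2% reporting threshold for the same copy.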

So my question is, does anyone have any idea why devb-eide might be consuming so much CPU to do a simple file copy? I can’t see any arguments I might add or change that would make any difference. I was sure some BIOS setting would fix it, but nothing seems to make any real difference. This long-life motherboard, by the way, does not have a logo on it from someone like Abit or Asus, so I have to assume they build it specially so that it can be bought for 10 years. The video, sound and NICs are all integrated, as on modern motherboards. The BIOS is an Award BIOS. I just wonder if really poor SATA-to-IDE legacy support code in the BIOS is causing this slowness.

At the moment we are going to reject this system (which otherwise looks good) due to this problem.

TIA,

Tim

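My guess is devb-eide is running in PIO mode (post the output of sloginfo). In PIO mode the CPU moves the data itself instead of the controller using DMA, which would account for the load.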

We had a similar issue and it got fixed with a new devb-eide that was posted on community.qnx.com/sf/discussion/ … p.topc2851.

The second or so in delay before you see a spike is probably due to the cache holding on to the data before flushing it to disk.
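
The same effect can be seen on most systems by timing the write and the flush separately. This is a generic sketch of write-back caching, not a description of how devb-eide or the QNX block cache works internally. Using Python is an assumption, as is the helper name.

```python
import os
import tempfile
import time


def timed_write_and_flush(size=1024 * 1024):
    """Write `size` bytes to a temp file, timing the write and the flush separately."""
    data = b"\0" * size
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            t0 = time.perf_counter()
            f.write(data)
            f.flush()                 # hand the data to the OS; it may still sit in a cache
            t1 = time.perf_counter()
            os.fsync(f.fileno())      # ask the OS to push the cached data to the device
            t2 = time.perf_counter()
        written = os.path.getsize(path)
    finally:
        os.remove(path)
    return t1 - t0, t2 - t1, written


write_s, flush_s, written = timed_write_and_flush()
print(f"write: {write_s:.4f}s  flush: {flush_s:.4f}s  bytes: {written}")
```

If the write returns quickly and the flush takes noticeably longer, the cache is absorbing the data first, which matches cp returning before the CPU spike shows up.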

When you say Blade, do you really mean Blade? Would you mind sharing the vendors you evaluated and/or are currently using?

Hello Tim,
Try passing the device ID and vendor ID to the driver:
devb-eide eide vid=0xXXXX,did=0xYYYY,noslave,noreset blk cache=2M,automount=hd0t79:/
Regards,
Yuriy

Mario,

Here is my sloginfo output (I clipped some entries at the end that I don’t think are meaningful and that repeat endlessly). Note that I start sloginfo in my custom rc.sysinit file instead of in the boot image, so I may be missing early devb-eide messages…

[code]
Time Sev Major Minor Args
Aug 06 15:17:48 5 14 0 tcpip starting
Aug 06 15:17:48 3 14 0 Using pseudo random generator. See “random” option
Aug 06 15:17:57 2 5 0 libcam.so (Jun 9 2006 15:31:26)
Aug 06 15:17:57 2 5 100 cam-disk.so (May 4 2006 15:35:16)
Aug 06 15:17:57 2 5 0 scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, scsi_status=2, cmd=5a, error=70, sense=5, asc=24, ascq=0
Aug 06 15:17:57 2 5 0 scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, scsi_status=2, cmd=5a, error=70, sense=5, asc=24, ascq=0
Aug 06 15:17:57 2 5 0 scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, scsi_status=2, cmd=5a, error=70, sense=5, asc=24, ascq=0
Aug 06 15:17:57 2 5 0 scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, scsi_status=2, cmd=35, error=70, sense=5, asc=24, ascq=0
Aug 06 15:18:18 1 8 0 phfont_init
Aug 06 15:18:18 1 8 0 phfontXX started OK
Aug 06 15:18:19 5 8 0 Process fontsleuth initialized.
Aug 06 15:18:20 1 8 0 phfont_init
Aug 06 15:18:20 1 8 0 phfontXX started OK
Aug 06 15:18:20 6 8 0 VGA primary : bus 0x0 dev/func 0x10
Aug 06 15:18:20 6 8 0 Secondary : bus 0x0 dev/func 0x11
Aug 06 15:18:20 6 8 0 Found 2 PCI/AGP display devices
Aug 06 15:18:20 6 8 0 Primary active: 0 10 4
Aug 06 15:18:20 6 8 0 pci_init: found PCI device 8086:29b2
Aug 06 15:18:20 6 8 0 SetDisplayOffset pos: 0 0
Aug 06 15:18:20 6 8 0 SetDisplayOffset pos: 0 0
Aug 06 15:18:20 5 8 0 Attached /dev/io-graphics/vesabios0, id = 0
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x54)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x56)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x56)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x56)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x56)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x56)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x56)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x76)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x76)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x76)
Aug 06 15:18:25 2 12 0 ps2 - Device Timeout (0x76)
Aug 06 15:18:26 2 9 0 hidd_report_attach failed (5)
Aug 06 15:18:26 2 9 0 Cannot attach mouse input report (error code 2)

Aug 06 15:18:26 2 12 0 ps2 - Device Timeout (0x76)
Aug 06 15:18:26 2 12 0 ps2 - Device Timeout (0x56)
Aug 06 15:21:50 2 5 0 scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, scsi_status=2, cmd=35, error=70, sense=5, asc=24, ascq=0
Aug 06 15:21:57 2 5 0 scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, scsi_status=2, cmd=35, error=70, sense=5, asc=24, ascq=0
Aug 06 15:22:07 2 5 0 scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, scsi_status=2, cmd=35, error=70, sense=5, asc=24, ascq=0
Aug 06 15:22:17 2 5 0 scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, scsi_status

[/code]

So do I have the PIO problem that requires a new devb-eide?
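
The repeated scsi_interpret_sense lines in the log above can be decoded with the standard SCSI tables. A minimal sketch follows. The lookup tables cover only the values that appear in this log, the function name is made up, and the numeric fields are assumed to be hexadecimal (cam_status=c4 and cmd=5a suggest so).

```python
import re

# Only the values seen in the log above; taken from the standard SCSI tables.
OPCODES = {0x5A: "MODE SENSE(10)", 0x35: "SYNCHRONIZE CACHE(10)"}
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x24, 0x00): "INVALID FIELD IN CDB"}

FIELD = re.compile(r"(\w+)=([0-9a-fA-F]+)")


def decode_sense(line):
    """Turn one scsi_interpret_sense log line into readable names."""
    fields = {key: int(value, 16) for key, value in FIELD.findall(line)}
    return {
        "command": OPCODES.get(fields["cmd"], "unknown"),
        "sense_key": SENSE_KEYS.get(fields["sense"], "unknown"),
        "additional": ASC_ASCQ.get((fields["asc"], fields["ascq"]), "unknown"),
    }


line = ("scsi_interpret_sense: path=0, target=0, lun=0, cam_status=c4, "
        "scsi_status=2, cmd=35, error=70, sense=5, asc=24, ascq=0")
decoded = decode_sense(line)
print(decoded)
```

As far as this decoding goes, the entries are rejected commands and say nothing about the transfer mode, so the PIO question still needs the early devb-eide messages.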

As for Vendors:

We currently use Cubix:

cubix.com/

The laserblades/cardcage on the main page are what we use (2 windows blades and 2 QNX blades make up our system)

If you aren’t in a harsh environment (we are in a marine one with salt air) they otherwise work great. For QNX, they put in 3rd party NICs since the onboard NICs were not supported. With all the new MBs they are getting and with new drivers, there may now be an MB that is supported without the 3rd party NICs.

They are slated to send us a ‘better’ blade with a new MB and rustproofing (we sent them a couple of rusty blades so they can see the issue first hand).

The system I am having trouble with is from Core Systems:

coresystemsusa.com/

These are not really blades like the Cubix ones are. They are 1U chassis, and we’ll end up taking out the card cage Cubix supplies and replacing it with a rack for these chassis.

The thing we like here is that they are MIL spec and offer long-life off-the-shelf items, which is important for us.

The most promising vendor is Corvalent:

corvalent.com/

They still haven’t sent us the system yet. Mostly because when we asked for QNX compatibility they actually took the trouble to ship a configured system directly to QNX to ensure it would all work. That’s the kind of touch I appreciate so I don’t have to do it myself!

Also they offer long life products etc.

Theirs are also 1U and 2U rackmount chassis rather than blades.

All 3 of these vendors at least took the link to the QNX supported hardware site and sent working stuff on the 1st try. Plus they took the time to listen to our needs, and they at least claim to do custom solutions (the Cubix blades are custom for QNX since we needed 3rd party NICs). So they will probably work with you for what you need if your needs are different in terms of blade/size etc.

Hope that helps, as the definition of a blade can vary quite a bit. The last 2 vendors aren’t blades, but rather chassis. I consider Cubix to be the only true blade.

Tim

Mario,

I grabbed the devb-eide from the link you posted and used it to build another boot image using the same command arguments as in my original post.

But I don’t see any difference in terms of CPU usage.

Were there any specific commands you needed to pass to this new devb-eide to get it to work? I see there are some specific options you can set regarding udma mode, PIO mode etc so I was curious if I needed to set any of those.

Tim

I didn’t have to.

You could try posting the device and vendor id ( or better output of pci -vv ) to the BSP section of foundry. Maybe they have a more recent devb-eide.

Just realized I forgot to thank you for the info on the blades. Did you get the system from Corvalent? Currently we are using IBM x3 series, but it changes too often on us.

Mario,

I fiddled around a bit more with the new driver. Basically, I created a RAM drive and copied bin, sbin and usr/bin to the RAM drive so I can stop/start devb-eide and try some options without needing to rebuild images and reboot. Plus that lets me get actual debug info from devb-eide in sloginfo.

I found that removing the ‘noreset’ option so that I start as:

devb-eide eide noslave blk cache=2M,automount=hd0t79:/

seems to make a slight difference with the new driver (does nothing for the old one). It still uses CPU though, so something is still not right. I’ll post the sloginfo and pci -vv output on Foundry 27 as you suggest.

As for the blades, you’re welcome, I hope it’s of some use.

I still don’t have the Corvalent system. Supposedly it’s set to arrive either tomorrow or early next week. Once it does and I finish testing it, I’ll let you know what I find if you’re interested.

Tim

Yes, I’m interested. Currently IBM wants to work with us and QNX to get their blade running QNX6. However, there is no guarantee of availability. So far our experience with the IBM xSeries stuff hasn’t been that great.

Mario,

I realize I owe you a reply on the Corvalent system. It arrived late last week and I haven’t had a ton of time testing it but I can tell you the following.

Our system came equipped with a 3 Gig processor (we asked for 2.66 or better). This is the long life MB in it and it has plenty of BIOS options if that matters to you. What we wanted and they have is the ability to flash the BIOS to a ‘factory’ state that we can specify so that we don’t have to worry about configuring the BIOS here and then having it revert to a factory state of something else if there is a power surge.

corvalent.com/02b_ind_boards/q35atx.shtml

QNX 6.3.0 SP3 runtime installed with no problems.

Only 1 of the 2 Ethernet cards (the 82573) works out of the box, but we only need 1. However, a quick look on Foundry 27 shows a beta driver for the other one (the 82566). I have no plans to install it but will if you are really interested in knowing whether the other card works.

I needed the beta devb-eide driver to overcome the PIO mode problem that started this thread. Incidentally, that same driver fixed the problem on the other system (Core) that we are evaluating.

USB works or at least the 2 on the front of the blade chassis do as I have plugged in thumb drives and tested them.

The built in Video is not photon friendly so it only uses the default vesa driver. That doesn’t matter to us since our QNX blades have no monitors by default as we use Windows for our GUI. No idea if the built in sound works or not as again it’s not an issue for us.

The blade is noisy if that matters to you. I’m hoping they can change the fan (I need to look at controlling fan speed in the BIOS). Ideally I’ll ask about a fanless solution. They did give us SATA connectors that have clips to hold them in place, since our system can suffer vibration that often jars things like that loose over time. The HD comes in a removable tray.

Is there anything specific you are interested in knowing about?

We’ll be running this system in a bake off with the Core system and a new Cubix blade over the next few weeks to determine a winner.

Tim

You don’t owe me anything :wink: This is great stuff Tim, thanks a lot. What form factor are you setting this up in? Do you have an email address I can contact at Corvalent?

Cheers,

Mario

Mario,

I can’t find the exact picture of the system we have. This one must be it, though ours looks slightly different on the front and of course we have a completely different MB and CPU (a 3 GHz E8400) than what they list here.

corvalent.com/02_products/02 … _265.shtml

It’s a 2U form factor. We went with 2U because we are also going to buy our Windows blades from them as well (they can continue to supply a version of XP until 2016 which is again important to us) and so we needed 2U in order to get the riser card option to use a GTX video card under Windows.

If we didn’t need the riser card for windows it might have been possible to get the 1U form factor they have (you’d have to ask if that’s possible).

I believe this is the person who is our contact at Corvalent:

wade.wyatt@corvalent.com

Just mention QNX and Perry Slingsby Systems (the company I work for) and that should trigger his memory.

Tim

Mario and Tim,

I can offer you an alternative vendor in Chassis Plans. See chassis-plans.com/custom-showcase.html for some examples of custom designs we’ve done. Neither Corvalent nor Core designs their own enclosures.

See also chassis-plans.com/rackmount- … puter.html for a system we designed for L3 for military transit case installation. Similar rugged aluminum construction. This one is unique in that 2 dual quad core Xeon SBCs are installed, one running Solaris and one XP. We could easily provide similar construction in a 2U enclosure tailored to the marine environment.

We’ve used the Corvalent Q35 boards and had trouble with them. We are no longer using them and can provide systems with the Radisys long-life Q35 boards. We’re finding very good system compatibility and support from Radisys. The motherboards on Core’s website are from Itox, I believe.

We have experience with QNX and would be happy to provide a configured system for evaluation. As an example of our level of service, our Systems Engineer just helped Johns Hopkins University get Suse Linux running on the Tyan Thunder e6550NX quad Opteron motherboard used in the 4U enclosure above. Not the easiest thing he’s done.

You can contact Chassis Plans at www.chassis-plans.com.

David

We have switched to the Corvalent SB5000 MB and so far are very happy with it.

Mario,

Glad to hear you like them but I don’t know what the SB5000 model is since I don’t see it on their website.

We have no complaints either and are very happy with what we get and the level of support when we found a hardware problem with one system.

Tim

corvalent.com/02b_ind_boards/sb5000p.shtml

Mario,

Nice! Are you actually getting dual processors or just going with 1 Quad Core?

I guess there must be support for that particular NIC under QNX as I don’t recognize it at first glance.

Tim

Tim,

Depends on the product it’s going to; some are single Quad Cores, others are dual! QNX 6.4.1 has support for the NICs, which are from Intel.

I don’t recall if USB was working; we aren’t using it. USB and network have a bad habit of sharing interrupts, and we have LOTS of traffic on the 2 NICs. The overhead created by going into the USB ISR for nothing is significant (2-3%), hence we don’t use USB.
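
A quick way to spot this kind of sharing is to group devices by IRQ. A minimal sketch, assuming the (device, IRQ) pairs have already been collected by hand, for example from pci -vv output. The device names and IRQ numbers below are hypothetical and not taken from any system in this thread.

```python
from collections import defaultdict


def shared_irqs(devices):
    """Return {irq: [device names]} for every IRQ used by more than one device."""
    by_irq = defaultdict(list)
    for name, irq in devices:
        by_irq[irq].append(name)
    return {irq: names for irq, names in by_irq.items() if len(names) > 1}


# Hypothetical assignments, for illustration only.
devices = [("nic0", 10), ("nic1", 11), ("usb0", 10), ("sata", 14)]
print(shared_irqs(devices))
```

Any NIC that turns up in the result next to a USB controller is a candidate for the wasted ISR calls described above.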

We don’t use Photon with a graphics card. Didn’t bother to check, but the boot CD worked so at least VGA is ok.