network boot help please! (QNX 6.1.0A)

Abstract: Help! I’m trying to get QNX6 to boot from a floppy and then
get its entire filesystem over the network from a QNX6 “filesystem
server”. Text mode works, but Photon is not working out.

I’m trying to set up the QNX6 (6.1.0A) equivalent of our QNX4 network.
We run two networks in parallel: a corporate Windows net and a
departmental QNX4 net. On the QNX4 network, we have file,
application, print/connectivity (Samba), and session (Phindows) servers.
I make up a boot floppy for each workstation that will be using the QNX
network. It works quite well for us, but with QNX4’s decline, it’s
time for the upstart QNX6 to take its place. It’s being very stubborn
though!

Tech details:

  • Network: switched 100Mbit ethernet

  • Workstation: 500MHz PIII, 256MB RAM, Dell Precision 220, Matrox G400
    video.

  • Server: 800MHz PIII, 512MB RAM, Dell Precision 420.

I’ve constructed a build file for a QNX6 network boot floppy, which
works well (if a bit klugey; attached below). The startup script loads
a minimum of drivers, including io-net, the NIC driver, qnet, tcpip,
and libsocket. It runs dhcp.client to get its IP address.

Then it mounts its filesystem. I tried different approaches:

  • Use “procmgr_symlink /net/odin /” (odin is the server): doesn’t
    work. This is the closest I could come to QNX4’s “sinit -r //1/”
    (which mounts node 1’s filesystem at the network-booted node’s root).

  • Use “fs-nfs2 192.168.6.80:/ /” to do the same thing (6.80 is odin’s
    IP). This worked to a degree, but I had problems with the
    pseudo-directories like /net: I got the server’s copy of
    everything (except /dev, perhaps). I couldn’t get Photon to start
    though.

  • Use “procmgr_symlink /net/odin/bin /bin”, similarly for /etc, /lib,
    /sbin, and /usr. Once connected, /etc/system/sysinit calls a script
    that makes “ln -sP” symlinks for any directories that exist on the
    server but not locally. This way, I keep my own /dev, /net, /proc,
    etc.
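
As an illustration of that last approach, the linking script called from
sysinit might look roughly like the sketch below. It is not the actual
script: the function name link_missing_dirs and its two arguments are made
up for this example, and it only prints the “ln -sP” commands instead of
running them, so the list can be checked before it is piped to sh.

```shell
#!/bin/sh
# Sketch only: link_missing_dirs and its arguments are hypothetical names.
# Prints one "ln -sP" command for each top-level directory that exists
# under SERVER_ROOT but has no counterpart under LOCAL_ROOT.

link_missing_dirs() {
    server_root=$1          # e.g. /net/odin
    local_root=${2%/}       # e.g. / (a trailing slash is stripped)
    for dir in "$server_root"/*/; do
        # Skip the unexpanded pattern if the server has no directories.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # Anything that already exists locally (/dev, /net, /proc, ...)
        # is left alone.
        if [ ! -e "$local_root/$name" ]; then
            echo "ln -sP $server_root/$name $local_root/$name"
        fi
    done
}
```

On a netbooted node this would be called as “link_missing_dirs /net/odin /”,
and the output piped to sh once it looks right.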

It’s a kluge, but this is the best result I’ve had. The build file
below reflects this approach.

Next the startup script runs a modified /etc/system/sysinit. This
starts up the rest of the drivers, and does a few special things:

  • Link /etc/system/config to a host-specific directory (creating it
    first if necessary)::

        ln -sP /etc/system/config.host/$HOSTNAME /etc/system/config

    This must be done before rc.devices is called, because rc.devices
    modifies /etc/system/config/graphics-traplist and other files. The
    link allows a single filesystem to hold multiple hosts’
    configurations.

(BTW, the crttrap docs are wrong. They say that
/etc/config/trap/crt.$NODE files are created and used, as in QNX4,
but they aren’t. I wish they were!)

Is there a better way to do this? I want the same functionality as
QNX4’s /etc/config/trap files.

  • Link /etc/net.cfg to /etc/system/config/net.cfg, which is now
    host-specific. Again, this is so that the server’s filesystem can
    support multiple hosts.

  • Run a slightly crippled /etc/rc.d/rc.devices
    (/etc/system/enum/devices/net has been removed, and is dynamically
    added only for non-netboot hosts).

  • Do some host-specific setup (sysinit contains a big case
    statement, one entry per host), including network setup for
    non-netboot hosts such as the server itself.

  • Run most of the stuff from /etc/rc.d/rc.sysinit.

  • Finish up by exec’ing tinit.
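
The host-specific steps in this list could be sketched like this. It is a
simplified illustration, not the real sysinit: the function names are
hypothetical, the commands are printed rather than executed, and the
per-host actions are placeholders. Only the paths and the host names odin
(the server) and qnet (from the build file) come from this post.

```shell
#!/bin/sh
# Sketch only: prints the host-specific commands instead of running them.
# Function names are hypothetical; the paths are the ones described above.

host_config_cmds() {
    host=$1
    hostdir=/etc/system/config.host/$host
    # Create the host's directory if necessary, then point the shared
    # config path at it (this has to happen before rc.devices runs).
    echo "mkdir -p $hostdir"
    echo "ln -sP $hostdir /etc/system/config"
    # net.cfg lives in the (now host-specific) config directory.
    echo "ln -sP /etc/system/config/net.cfg /etc/net.cfg"
}

host_setup() {
    # One case entry per host; the actions here are placeholders.
    case $1 in
        odin) echo "non-netboot host: set up the network locally" ;;
        *)    echo "netboot host: network already started by the boot image" ;;
    esac
}

# Fall back to the build file's host name if HOSTNAME is unset.
host_config_cmds "${HOSTNAME:-qnet}"
host_setup "${HOSTNAME:-qnet}"
```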

(If you’d like to see the sysinit system I’ve cobbled together, please
let me know.)

This setup lets me boot over the network and mount the server’s
filesystem almost as if it were my own (full of ugly symlinks though).
I log in in text mode, and everything seems fine. There’s some delay,
but it’s attributable to the network.

When I try to run Photon (ph), I get the following symptoms:

  • Photon starts up very slowly. It takes a long time, up to a minute
    or two for the shelves to appear. Sometimes the shelves never
    appear. According to netstat and nicinfo, the network is working
    fine. How many megabytes of object code does Photon go through when
    it starts up? Can that and the network account for the long delay?

  • I can’t run Voyager. I get the splash screen and the outline of the
    browser window, but no contents. The windows don’t go away. I can’t
    close the browser window by mouse, only by slaying the process.

I saw these dialogs during one session (the text was cut off; this
is exactly what I saw)::

    Application Error: Voyager: Unable to make CONFIG d
    not implemented) in (init g

    Config Error: Unable to save the configuration. Restore back
    to the old
    (No such file or directory)

  • The Launch menu doesn’t work at all. I click on it, nothing happens.

  • Sometimes I can’t end the Photon session. The shelves and backdrop
    disappear, and I end up with a black screen and mouse pointer.

  • Sometimes the shelf process won’t terminate, and I have to kill it.
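
On the question of whether the network alone could account for the slow
Photon startup, a rough lower bound can be worked out. The sketch below
assumes the link runs at its full nominal rate and ignores latency and
protocol overhead. The function name is made up, and the 50 MB figure is
purely hypothetical, since the amount Photon actually reads at startup is
not known here.

```shell
#!/bin/sh
# min_transfer_ms MEGABYTES LINK_MBIT   (hypothetical helper)
# Prints the minimum time in milliseconds needed to move MEGABYTES
# (10^6 bytes each) over a link of LINK_MBIT megabits per second,
# ignoring all overhead.
min_transfer_ms() {
    echo $(( $1 * 8 * 1000 / $2 ))
}

# Hypothetical 50 MB over 100 Mbit Ethernet: prints 4000 (4 seconds).
min_transfer_ms 50 100
```

At that rate even a few hundred megabytes would move in well under a
minute, so raw bandwidth alone would not seem to explain a delay of a
minute or two.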

Does anybody have any suggestions? I’m open to any ideas, no matter
how unlikely.

Thank you.

David Goodger, Systems Administrator & Programmer
Automation Tooling Systems Inc., Advanced Systems
730 Fountain Street, Building 3, Cambridge, Ontario, Canada N3H 4R7
direct: +1-519-653-4483 ext. 7121 fax: +1-519-650-6695
e-mail: dgoodger@atsautomation.com

############################################################

/boot/build/netfloppy:

[search=/bin:/usr/bin:/lib:/usr/lib:/sbin:/usr/sbin:/lib/dll:/boot/sys]

# Bootstrap Script
# ================

[virtual=x86,bios +compress] boot = {
startup-bios
PATH=/proc/boot:/x86/bin LD_LIBRARY_PATH=/proc/boot:/lib:/lib/dll:/usr/lib:/dev/shmem procnto
}

# Startup Script
# ==============

[+script] startup-script = {

LD_LIBRARY_PATH=/proc/boot:/lib:/lib/dll:/usr/lib:/dev/shmem:/x86/lib:/x86/dll

# Start up some consoles

devc-con -n9 &
waitfor /dev/con1
reopen /dev/con1

display_msg ""
display_msg ATS Advanced Systems QNX6.1 Network Boot Test (QNET)

# Fill kernel data structure with appropriate
# system-specific values (IRQs, DMA channels, etc.).
seedres

# Start the pci server
pci-bios &
waitfor /dev/pci

# These env variables inherited by all the programs which follow

SYSNAME=nto
TERM=qansi-m
HOME=/

reopen /dev/con1

display_msg "Starting up network..."

# Start NIC driver with QNET networking. (@@@)
io-net -d el900 -pqnet bind=ether,host=qnet.atsauto.com -ptcpip &
waitfor /dev/socket

# (@@@) no domain after hostname
dhcp.client #-h qnet -i en0 -u

display_msg "Mounting server filesystem..."

waitfor /net/odin
procmgr_symlink /net/odin/bin /bin
procmgr_symlink /net/odin/etc /etc
procmgr_symlink /net/odin/lib /lib
procmgr_symlink /net/odin/sbin /sbin
procmgr_symlink /net/odin/usr /usr

# Give control to the network system initialization file.
/etc/system/sysinit
}

# File List
# =========

libc.so # C shared lib (also contains the runtime linker)

# Programs require the runtime linker (ldqnx.so) to be at a fixed location
[type=link] /usr/lib/ldqnx.so.2=/proc/boot/libc.so

# Networking libs:

devn-el900.so # 3C90x
npm-qnet.so
npm-tcpip.so
libsocket.so

# The files above this line can be shared by multiple processes
[data=c]

# List executables below this line

devc-con # console driver
pci-bios # pci server
seedres
io-net
dhcp.client # for dynamic IP