How to make qnet more deterministic

I’m pretty sure I’ve found the problem. I’m willing to bet that if you go into your BIOS setup, you will find that your parallel port is set to ECP.
Now not to point fingers, but ECP was invented by guess who. ;-).

Even so, the default for an ECP port should be to work just like a regular port. But remember, devc-par has been running. I’ll bet it puts the ECP into a higher performance mode, and when you slay it, nothing works quite right.

Here are two solutions.
1) Put the port into a different mode. Normal, Bi-Directional, or even EPP should work fine.
2) Read up on how to program an ECP port, to get you out of the mode you are in.

When I tried 1), the problem went away.
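
For anyone who wants to try 2) instead, here is a rough sketch of the register arithmetic only. It assumes the usual ECP layout, where the Extended Control Register (ECR) sits at base + 0x402 and its top three bits select the mode (000 = standard, 001 = byte/bidirectional, 011 = ECP). The 0x378 base address and the function names are my own assumptions for illustration, and the actual port read/write is left out, so this just computes the value you would write back.

```python
# Sketch only: computes the ECR value to write; it performs no port I/O.
# Assumed layout: ECR at base + 0x402, mode field in bits 7..5.
ECR_OFFSET = 0x402
MODE_SHIFT = 5
MODE_MASK = 0b111 << MODE_SHIFT

# Mode codes for the ECR mode field (names are illustrative).
ECR_MODES = {
    "spp": 0b000,   # standard / compatibility mode
    "byte": 0b001,  # PS/2-style bidirectional mode
    "ecp": 0b011,   # ECP FIFO mode
}


def ecr_address(base=0x378):
    """Return the I/O address of the ECR for a given port base (0x378 assumed)."""
    return base + ECR_OFFSET


def ecr_with_mode(current, mode):
    """Return the ECR byte with only the mode bits changed; low 5 bits are kept."""
    return (current & ~MODE_MASK & 0xFF) | (ECR_MODES[mode] << MODE_SHIFT)
```

So if the ECR currently reads 0x74 (ECP mode), `ecr_with_mode(0x74, "spp")` gives the byte that would drop it back to standard mode while leaving the other bits alone.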

Have any of you guys who duplicated this tried running without photon? I was messing around with qnet options etc. today to try to fix this, and finally, out of desperation, I ran without photon (I was only using photon on the P4, not the P3) - and bingo - the delays are gone.

Please let me know if this fixes it for you guys too if you get a chance.

Starting photon back up definitely re-introduces the delay, even when the system is idle. Note that photon is only running on the P4 (the system that is running one server and one client). The delay is being noticed from the system that never runs photon (the p3) that just runs one server.

Thanks

I didn’t try, but on both machines photon itself was not updating its display, as I was working with Phindows. So I wouldn’t expect any graphics operations to occur.

Doesn’t matter. Even with display idle I get the delay. Then I exited out of local photon on the qnx pc and started a Phindows session. SAME DELAY!

Again with it just sitting there idle.

So something related to photon is causing the problem, and it’s not the graphics driver.

It is strange, but I would bet you if you simply exit photon on both your machines and run your test again it will come up deterministic. It has for me :)

NOTE: I also just tested it using the windows IDE and running the programs through that and it still has no delay unless I start Phindows or a local ph.

So even with the network traffic from qconn and tcpip, it’s not the issue. I’m so happy :P WOOT! I can go demo this now!

I’m not yet ready to set up all the equipment again, but I’d like to point out something. If you are correct, and the problem goes away when Photon is not running, then there is something seriously wrong, either with Photon or the network. Both machines I’ve been testing on are quad processors. That would mean either 1) Photon is somehow able to tie up 4 processors for a relatively long period of time, or 2) Something even darker is going on between Photon and the network.

Could it be that the graphics card holds the bus for too long, disrupting the network card transfer rate?

Possibly, but that would be an awfully long time to mess things up.

No, because it does the exact same thing (causes the delay) when there is no graphics driver running - just Phindows.

This indicates to me that maybe the photon server is doing something with the network, as Maschoen has suggested.

I emailed the instructions and code below to Colin at qnx and he will check it out. He asked for a trace, but I don’t currently know how to generate one. I’ll figure it out as soon as I can.

=======================

Here’s updated code and instructions on how to replicate the issue.

Run the parportserver on two nodes, note the PID and CHID that it prints out.

Run the parporttimerclient on one of the two nodes, usage is

parporttimerclient

Yeah I know I just haven’t implemented GNS lookups.

For example, my nodes are called “p4” and “p3”

So I run

on p3: parportserver

Console output will be like: 720441 1

on p4: parportserver

Console output will be like: 883423 1

Then I run on P4:

parporttimerclient p4 883423 1 p3 720441 1
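
Going by that example, the arguments look like one node / PID / CHID triple per server. Here is a small sketch of how such an argument list could be split up. This is a hypothetical helper (the `ServerAddr` and `parse_servers` names are made up), not the actual client code.

```python
from typing import List, NamedTuple


class ServerAddr(NamedTuple):
    """One server to message: node name plus the PID and CHID it printed."""
    node: str
    pid: int
    chid: int


def parse_servers(args: List[str]) -> List[ServerAddr]:
    """Group a flat argument list into (node, pid, chid) triples."""
    if not args or len(args) % 3 != 0:
        raise ValueError("expected one or more node/pid/chid triples")
    return [
        ServerAddr(args[i], int(args[i + 1]), int(args[i + 2]))
        for i in range(0, len(args), 3)
    ]
```

For the command line above, `parse_servers("p4 883423 1 p3 720441 1".split())` yields the p4 server first and the p3 server second, which matches the order the messages are sent in.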

The parporttimerclient on p4 will send a msg to the server on p4 and then to the server on p3. The servers will toggle the parallel port. The timer runs at 8ms, so a complete waveform is 16ms.

If you start photon locally on P4, an 8ms delay comes up about once a second on the remote (p3) node’s output.

If you have no photon running locally anywhere, but start up a Phindows session (add a windows PC to your network), you get the delay too.

Your photon sessions can be completely idle, you still get the delay.
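
If you capture the p3 output (scope or logic analyzer export) and want to pick out the delayed toggles automatically, something like this would do it. It is a sketch that assumes you have a list of toggle timestamps in milliseconds; the function name and the 2ms tolerance are my own choices, not part of the test programs.

```python
def find_late_toggles(timestamps_ms, period_ms=8.0, tolerance_ms=2.0):
    """Return (timestamp, extra_delay_ms) for each toggle that arrived late.

    A toggle counts as late when the gap since the previous toggle exceeds
    the expected timer period by more than the tolerance.
    """
    late = []
    for prev, cur in zip(timestamps_ms, timestamps_ms[1:]):
        gap = cur - prev
        if gap > period_ms + tolerance_ms:
            late.append((cur, gap - period_ms))
    return late
```

With toggles at 0, 8, 16, 32 and 40 ms, only the toggle at 32 is reported, with 8 ms of extra delay, which is the kind of skipped timer tick described above. Small jitter of a millisecond or so stays under the tolerance and is not reported.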

Nick, a thought just occurred - are you running shelf? It has a plugin which polls the network driver for statistics. Maybe that is causing some amount of delay?
If you are running shelf, does slaying it cause the problem to go away?

Yes shelf is running - I will try killing it. E-mailed you the code.

Hahaha lol zomg.

Yeah it was shelf.

:P

I still get a 1ms or so jitter but that happens all the time, and is acceptable.

Colin = the man.

I talk to my boss today about possibly using QNX on one project right now. Hope he is amenable ><

BTW much thanks to everyone who helped out on this thread, especially mario, maschoen, and the rest! I really appreciate you guys taking your time to look into a stranger’s problem.

Another update. There is a new version of devn-speedo.so that Colin sent me which eliminates the problem so you can run photon and have Shelf running and the delay is not there.

I was also using speedo …

Yah thats why u had the issue too.

I find this quite ironic. I assume that the network statistics being gathered are the ones that are displayed in the gauge with cpu and disk usage. I’ve never seen that widget show me anything at all. Maybe I just haven’t pumped up the network enough.