Wojtek Lerch wrote:
John Parsons <parsonsj@esi.com> wrote:
Tried a telnet from another station to the station running the wd
application, to the point of the execl() command. Do the execl() command the
telnet cannot connect to the remote site. At the bottom of the wd screen is
‘task complete’. Exit the wd session screen goes blank. No display of any
error condition. Do a sin from remote station, the initial process is held
and the child is (zombie) dead.
Does Photon behave normally until you exit from wd?
Photon behaves as normal to the point of failure of the testing process.
My point of view is that as long as we’re still investigating this, the
testing process has not failed.
But I’ll assume you meant “yes”.
Okay maybe we should try this one again. I do not exit from WD. During WD the screen
goes blank.
Is that immediately after executing the exec() call?
It seems so, but on further investigation I believe that the intended program is getting to
some point and either failing or locking the system. Not sure how to prove either.
A few seconds after the exec()?
A random amount of time after the exec()?
When you press a key after the exec()?
When you just sit and wait long enough?
None of the above?
None of the above.
What I am trying to ask is whether the screen going blank seems to be an
immediate reaction to something that you do, or does it always happen a
fixed amount of time after something you do, or does it perhaps seem to
happen after a completely unpredictable amount of time while you just
sit and stare at the monitor?
It always happens a fixed amount of time after the execl() command. That is why I started with
this command as the problem. After the execl() command I’m unable to debug any further.
BTW Forgive me if I sound a bit impatient, but you must understand that
all I know about your software is what you have told me. Try to be
careful what you say and how you say it – English is not my first
language and I may not interpret things like “Exit the wd session screen
goes blank” the way you intended. And we don’t want to waste time
trying to investigate things that don’t exist, do we?
Don’t worry about it. I’m sure we will figure this out. I’ll try to improve how I describe
the different items. English as a written language is very difficult at the best of times.
Trying to be technical with correct English, so that someone else understands, ain’t easy
sometimes.
By “blank”, do you mean the screen is completely black? Is there a mouse
cursor? Can you move it?
By “blank”, I mean the screen is completely black, no movement of anything because
there is nothing there. From a remote station telnet session using the sin
command, it would appear that chipsbios.ms and the display driver have been closed.
They do not show up on the list of active processes.
Is there a traceinfo entry mentioning them?
Not sure what you are asking!
“traceinfo” is a QNX utility that gives you a log of important events
that happened in your system recently. Crashes are listed among them.
If your graphics driver crashes, there will be an entry in the traceinfo
log.
That would be nice, but just how do I look at this information from traceinfo?
…
There are four sin output files attached.
sinbefore - Is during the debug session just before the execl() is executed.
Why am I seeing two sets of WD and “userintfapp” in there? Were you
running two WD sessions? Are they related? It would make it easier to
analyze the output from “sin” if there were as few irrelevant processes
running on your machine as possible.
The first session is initial start of testing. The second is a fork() just prior to the
execl(). So yes there are two WD sessions running. I have got as few processes running
as I know how at this time.
sintaskcomplete - Is right after the execl() command is executed and the screen
goes blank.
Don’t you mean after execl() but before the screen goes blank? The
graphics driver is still running at this point.
I mean that after I execute the execl() command the WD screen shows task complete at the
bottom of the screen. At the remote station the sin command does nothing until I exit
the WD at the host.
That’s because you’re running it locally in a telnet session, and the WD
problem that I mentioned before prevents “sin” from running. But if you
run "sin -n " from another node, it will work immediately. This
machine is connected to a QNX network, isn’t it?
BTW But the second WD is not, neither is the second “userintfapp”. Did
you exit from the second WD without executing the execl() call?
You exit the second WD after the execl() command is executed and the task complete
message is displayed.
Now this is getting a bit too complicated for me. From what you have
said so far, this is how I imagine what is happening:
You start “wd userintfapp” in a pterm and run it until it forks.
Then, you find the forked child’s pid and attach a new WD to it; you
leave the first WD alone and only play with the second WD from now
on. When you’re talking about “the” WD, you’re referring to this
second WD and not the first one.
Yes you are correct so far. Helpful future hint WD1 (first wd session) WD2 (second wd
session).
Under the second WD, you let the child call exec(). WD says “task
terminated”. From now on, Photon behaves more or less normally
until ,
I exit WD2, because the task is said to be complete. In the QNX4.24 world when I get the task
complete and exit WD2 the diagapp continues to run and the diagnostics works correctly. It
would seem that on the QNX4.25 system diagapp fails for some reason. I cannot debug past the
execl() so it shows up as the problem. The problem may be in something that the diagapp is
trying to do.
but you can’t run things like
“sin” in your telnet session until you exit from the (second) WD
(that’s what I call “the WD/Proc problem”).
Once you exit from the (second) WD, “sin” runs. From its output, I
can see that the graphics driver is still alive at this point, and
“diagapp” is doing some file I/O.
This could be true, but I see nothing at this point.
After ,
Not sure what to insert at this point. It does appear that “diagapp has died and turned into
a zombie, and that neither the graphics driver is running”. At this point I’m lost as to
what to do next. I do not have a graphics driver, the screen is blank and diagapp is dead.
the screen turns blank. When you
run sin, it shows that “diagapp” has died and turned into a zombie,
and that neither the graphics driver is running.
If the above is correct, could you fill in the blanks? If not, could you
please give me the exact scenario in at least as much detail as the
above?
blankscreen - Is when the blank screen is displayed.
Yes, the graphics driver and the mode switcher are missing from this
one, and process 1122 “diagapp” has turned into a zombie. And these are
the only differences between the two logs – both the “userintfapp” and
the WD are still running. This doesn’t seem consistent with my
assumption that it’s exiting from WD that makes the screen go blank.
What exactly did you do between the previous “sin” and this one?
Or does the graphics driver die simply because you’re letting “diagapp”
run for a while, without having to touch the keyboard or the mouse?
I exited the WD after execl(). This allowed the ‘sintaskcomplete’ to complete. The
screen goes blank and then I did a blankscreen. The reason was to show the difference
just before and just after blank screen condition.
Didn’t you just say that the screen goes blank “during WD”?
You’re not just trying to confuse me, are you?
Sorry, you’re correct. I have exited WD2, then the screen goes blank.
sinafterreset - Is after the reset process (Ctrl-Alt-Shift-Backspace) and I have
returned to a normal screen.
I can’t explain how the Ctrl-Shift-Alt-Bkspace can do anything to the
screen if the mode switcher is already dead. But it does bring you back
to text mode, with the shell prompt and any previous shell commands and
their output visible, correct?
Yes!
There is only one explanation I can think of at this point: the blank
screen is already in text mode, but you’re looking at an empty console.
The Ctrl-Alt-Shift-BkSp shuts down your Photon, but also causes a
text-mode console switch to your first console.
To see whether that is the case, run a shell on every text-mode console
before starting Photon. This way, there will be something on every
text-mode console, and you’ll be able to distinguish between a
graphics-mode blank screen and text mode.
BTW If you run “crttrap start” from your telnet session when the screen
is blank, it should restart the graphics driver. It might be
interesting to see what is going on in Photon before you kill it…
Did the “crttrap start” idea and I got Photon back up. A sin command shows that
diagapp is dead. So this is where the problem must be. Diagapp should continue to run until
the completion of the diagnostic test. How do I start a WD3 after the execl() so I can find
out where diagapp is being killed?
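A process that sin lists as a zombie has already terminated, but its parent has not yet collected its exit status. Collecting that status is one way to tell whether the child exited by itself or was killed by a signal. The following is only a minimal sketch in C, assuming a POSIX-style environment; describe_exit() and run_and_report() are names made up for this illustration and are not part of userintfapp, and whether userintfapp can afford to wait for diagapp at all is an assumption.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Turn a wait status into a short readable description. */
int describe_exit(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        return snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    return snprintf(buf, len, "unrecognized status 0x%x", (unsigned)status);
}

/* Fork, exec the program at 'path' in the child, then wait for it in
 * the parent and describe how it ended. Returns 0 on success, -1 if
 * fork() or waitpid() failed. */
int run_and_report(const char *path, char *const argv[], char *buf, size_t len)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execv(path, argv);
        _exit(127);              /* only reached if execv() failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;               /* waiting for the child failed */
    describe_exit(status, buf, len);
    return 0;
}
```

A description such as "killed by signal N" would point at something sending diagapp a signal, while "exited with code N" would point at diagapp giving up on its own.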
…
Can you tell me more about this “diagapp” program:
diagapp is a diagnostic application that could possibly kill chipsbios.ms and Pg.chips
so that a new display screen size could be generated. Is there a way to stop the
Uh… Isn’t it then possible that your “diagapp” indeed kills
chipsbios.ms and Pg.chips, and then dies? Why didn’t you mention before
that it can do that?
Yes it is possible. I did not mention it because I was not aware of where the problem was or
is. You must remember that it works fine on a QNX4.24 system and does not work on a QNX4.25.
So I’m trying to determine where in the whole application the failure may be. This application
is a proven customer code of very large size.
execl() so that a debug session could be started to determine where in diagapp the
failure occurs?
You could have a command-line option or environment variable causing
“diagapp” to call raise(SIGSTOP) at startup – this will let you attach
a WD to it.
Do not know how to do this.
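The raise(SIGSTOP) suggestion could look roughly like the C sketch below. The "-S" option and the DIAGAPP_DEBUG_STOP environment variable are hypothetical names chosen for this example only, as are the two function names; nothing here is taken from the real diagapp source.

```c
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if a debugger stop was requested, either through the
 * hypothetical DIAGAPP_DEBUG_STOP environment variable or the
 * hypothetical "-S" command-line option; return 0 otherwise. */
int wants_debugger_stop(int argc, char *argv[])
{
    int i;

    if (getenv("DIAGAPP_DEBUG_STOP") != NULL)
        return 1;
    for (i = 1; i < argc; i++)
        if (strcmp(argv[i], "-S") == 0)
            return 1;
    return 0;
}

/* Meant to be called as the first statement of main(). */
void stop_for_debugger_if_requested(int argc, char *argv[])
{
    if (wants_debugger_stop(argc, argv))
        raise(SIGSTOP);   /* the process stays stopped here until it is continued */
}
```

An environment variable may be the easier of the two to try, since a child normally inherits its parent's environment across fork() and execl(), so it could be set in the shell before userintfapp is started without changing how userintfapp builds the diagapp command line. While diagapp sits stopped, a WD can be attached to its pid the same way WD2 was attached to the forked child.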
Or, you could have a command-line option to “userintfapp”
that makes it run “wd diagapp …” instead of just “diagapp …”.
I do not think this would be possible under the present structure of the code.
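For reference, the “wd diagapp …” idea amounts to putting one extra word in front of the argument list before the exec call. This is a rough C sketch under the assumption that the arguments are available as a NULL-terminated array; build_child_argv() and exec_diag() are hypothetical helpers, not functions from userintfapp, and the present code structure may well not allow it.

```c
#include <stddef.h>
#include <unistd.h>

/* Fill out[] with the argument vector to exec: diag_argv unchanged, or
 * the same list with "wd" in front of it. Returns the number of entries
 * written (not counting the terminating NULL), or -1 if out[] cannot
 * hold them all. */
int build_child_argv(int use_debugger, char *const diag_argv[],
                     char *out[], size_t max)
{
    size_t n = 0, i;

    if (max == 0)
        return -1;
    if (use_debugger) {
        if (n + 1 >= max)
            return -1;
        out[n++] = "wd";
    }
    for (i = 0; diag_argv[i] != NULL; i++) {
        if (n + 1 >= max)
            return -1;
        out[n++] = diag_argv[i];
    }
    out[n] = NULL;
    return (int)n;
}

/* Replace the current process image; returns only if something failed. */
int exec_diag(int use_debugger, char *const diag_argv[])
{
    char *argv[64];

    if (build_child_argv(use_debugger, diag_argv, argv, 64) < 0)
        return -1;
    return execvp(argv[0], argv);
}
```

The use_debugger flag would come from whatever command-line option userintfapp is given; with the flag off, the exec call behaves as before.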
Can you run it from a pterm, or does it have to be execed by
“userintfapp”? If you run it from a pterm, does it also cause
problems? Can you run it under the debugger?
You need to run userintfapp to set the correct params for the test to be run.
Does it do any drawing on the screen? Is it using any Pg calls, or
is all the drawing done by widgets (other than PtRaw)?
It could do so at different points, but I have no way to determine if I ever get close to
any of that code.
You could fprintf some messages to a file… Or run diagapp under WD
the way I described above.
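Writing progress messages to a file is useful here because the screen cannot be trusted once the graphics driver is gone. A small C sketch of such a helper follows; trace_to_file() is a hypothetical name, and any log path passed to it is the caller's choice rather than something diagapp already uses.

```c
#include <stdarg.h>
#include <stdio.h>

/* Append one formatted line to the file at 'path'. The file is opened
 * and closed on every call, so the line leaves the stdio buffer before
 * the function returns and is not lost in that buffer if the process
 * is killed right afterwards. Returns 0 on success, -1 on failure. */
int trace_to_file(const char *path, const char *fmt, ...)
{
    va_list ap;
    FILE *fp = fopen(path, "a");

    if (fp == NULL)
        return -1;
    va_start(ap, fmt);
    vfprintf(fp, fmt, ap);
    va_end(ap);
    fputc('\n', fp);
    return fclose(fp) == 0 ? 0 : -1;
}
```

Calls placed at the start of main() and just before each step that touches the display would leave a file whose last line shows how far diagapp got before it died.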
Is “userintfapp” also a Photon application?
It needs to be run under Photon to run the testing.
What I meant was does it make any Photon library calls – does it
perhaps create any widgets?
NO.
This shouldn’t really matter, unless there’s
a bug somewhere that makes the graphics driver crash when a Photon app
execs another Photon app. But we don’t know at this point whether the
driver crashes or gets killed, do we…
If I get into diagapp far enough it is possible that the driver is getting killed.
By the way, I have inherited this code from people that are no longer at the company. So
there are great parts of this code that I do not know very well. This makes it hard to know
just what diagapp does at what time.
Sure hope this helps. Sounds to me we have narrowed it down to something in diagapp.
–
Wojtek Lerch (wojtek@qnx.com) QNX Software Systems Ltd.