Get back trace from within program?

randyc4053 · October 26, 2005, 2:12pm

I am trying to diagnose a problem where an application (QNX 6, written in C++) crashes occasionally: sometimes several times a day, sometimes not for several days of continuous running.

This is a remote support problem with access only by modem. The best I can hope for is to instrument the code to provide more info when it crashes. The core file has narrowed the scope, but it sometimes can’t identify all the function calls, so it’s incomplete. Also I don’t get function parameters. I know the program dies with a SIGSEGV, and I have installed signal handling to generate a file drop when it croaks. However, I would like to be able to get more data.

I seem to sorta remember when I was doing C programming on HP-UX some years ago, that there was a way to get stack information from within a program, but I can’t remember what it was, and it might be a facility that QNX lacks, anyway.

Can anyone suggest a way to dump program state on receipt of a signal?

Thanks,
Randy C.
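A minimal sketch of one way to dump some state on receipt of the signal, assuming the POSIX sigaction() interface with SA_SIGINFO is available on the target. It records the signal number, si_code and faulting address using only open/write/close, then restores the default action and re-raises so a core file can still be produced. The log path /tmp/crash.log and the names format_hex, put, on_segv and install_crash_handler are hypothetical, chosen for illustration; this does not walk the call stack.

```cpp
#include <signal.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>
#include <string.h>

// Hypothetical log location; pick a path that exists on the target.
static const char kCrashLog[] = "/tmp/crash.log";

// Format value as "0x<hex>" into buf, which must hold at least
// 2 + 2*sizeof(uintptr_t) + 1 bytes. Returns the length without the NUL.
// Uses no library calls, so it can be used inside a signal handler.
static size_t format_hex(uintptr_t value, char *buf)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof(uintptr_t)];
    size_t n = 0;
    do {
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);
    size_t len = 0;
    buf[len++] = '0';
    buf[len++] = 'x';
    while (n > 0) buf[len++] = tmp[--n];  // digits were produced in reverse
    buf[len] = '\0';
    return len;
}

// Write a NUL-terminated string with write(); stdio is avoided in handlers.
static void put(int fd, const char *s)
{
    size_t len = 0;
    while (s[len] != '\0') ++len;
    ssize_t rc = write(fd, s, len);
    (void)rc;  // nothing useful can be done on failure here
}

static void on_segv(int signo, siginfo_t *info, void *context)
{
    (void)context;  // machine context is platform specific; unused in this sketch
    char num[2 + 2 * sizeof(uintptr_t) + 1];
    int fd = open(kCrashLog, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd >= 0) {
        put(fd, "signal ");
        format_hex((uintptr_t)signo, num);
        put(fd, num);
        put(fd, " si_code ");
        format_hex((uintptr_t)info->si_code, num);
        put(fd, num);
        put(fd, " fault address ");
        format_hex((uintptr_t)info->si_addr, num);
        put(fd, num);
        put(fd, "\n");
        close(fd);
    }
    // Restore the default action and re-raise so the process still
    // terminates in the usual way for this signal.
    signal(signo, SIG_DFL);
    raise(signo);
}

// Call once early in main(). Returns true if the handler was installed.
bool install_crash_handler()
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL) == 0;
}
```

Application state that matters (for example, the last message processed) could be kept in a global buffer and written out by the handler in the same way, as long as the handler sticks to async-signal-safe calls.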

cburgess · October 27, 2005, 1:36pm

How is a backtrace going to give you more than the core dump in gdb? It still won’t identify function names and parameters.

If you don’t get function parameters, I imagine your executable isn’t compiled with debug information? If the debug version is too big, you can make a stripped version to place on the target and use the debug version on the host.

ntox86-strip -g -o stripped_exec debug_exec

ntox86-gdb debug_exec stripped_exec.core

randyc4053 · October 27, 2005, 4:22pm

I have done as you suggest - bringing the core file over and analyzing it with an executable built debuggable. There is at least one library involved, and I can see calls that were made from within it. But there is still one call, very high on the stack, so probably important, that is just identified as ??. Maybe it’s from another library, but the address shown in the gdb backtrace is very close to the ones before it.

My conception is that if I can figure out how to access the call stack programmatically, I can have better control over the data that is dumped at the time of the crash. I have done this in the dim, dark recesses of the past, but I don’t remember how, and I can’t figure out how to do it in QNX.

I am trying to download the gdb source and see how it does it, but I have had some bandwidth problems in getting that done.

There is also potentially a code/executable mismatch. I’ll try to resolve that by placing my own executable on the target so I know what code it was built with.

Thanks,
Randy C.
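One way to make a code/executable mismatch easier to spot is to bake a build stamp into the program and log it at startup. The sketch below assumes nothing beyond the standard __DATE__ and __TIME__ macros; the names kBuildStamp, build_stamp and log_build_stamp and the "build-stamp: " prefix are hypothetical.

```cpp
#include <stdio.h>
#include <string.h>

// Fixed at compile time; changes whenever this translation unit is rebuilt.
static const char kBuildStamp[] = "build-stamp: " __DATE__ " " __TIME__;

// Accessor so other modules can include the stamp in their own logs.
const char *build_stamp()
{
    return kBuildStamp;
}

// Write the stamp to the given stream, e.g. the application's log file.
void log_build_stamp(FILE *out)
{
    fprintf(out, "%s\n", build_stamp());
}
```

Calling log_build_stamp(stderr) early in main() records which build is running. Since the stamp is an ordinary string constant, searching the debug executable on the host for "build-stamp:" (for example with the strings utility) should show whether it matches what the target logged.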

cburgess · October 27, 2005, 4:44pm

Yes, if you have mismatched code (especially in shared libs etc.) then gdb will just get confused.
The easiest way to avoid it is to put links to all the shared libs you used to build your target filesystem in the directory you run gdb from, e.g.

ln -sf /usr/qnx630/target/qnx6/x86/lib/libc.so.2 ldqnx.so.2
ln -sf /usr/qnx630/target/qnx6/x86/lib/libc.so.2 libc.so.2
ln -sf /usr/qnx630/target/qnx6/x86/lib/libcpp.so.4 libcpp.so.4

and then do an info shared to make sure that you loaded the right ones.

If you still see ?? then do info frame to get some idea of where the address is.

randyc4053 · October 28, 2005, 1:29pm

Thank you, sounds like a good suggestion. I’m pretty sure this is part of what’s happening, and yes, there are shared libs involved.

I have also placed new executables & libs on the computer in question, so I know what sources they were built with. Next crash, I should get better diagnostics from the core file.

Thanks,
Randy C.