Failed running application on X86.

I have narrowed it down to the following code:

int mncmgrfd = 0;
char hostname[MAXNAMELEN]; // Our hostname
char devname[MAXNAMELEN]; // Our devname
int ifunit; // interface unit
int main(int argc, char **argv)
{
    printf("ty:/dev/socket/mncmgr active\n");
    mncmgrfd = open("/dev/socket/mncmgr", O_RDWR);
    if (mncmgrfd < 0) {
        printf("Failed to open mncmgr\n");
        exit(1);
    }
    strcpy(devname, ttyname(0));
    printf("ty::device name: %s\n", devname);
    if (devctl(mncmgrfd, PPPIOATTACH, devname, strlen(devname)+1, NULL) == -1) {
        printf("Failed attach\n");
        exit(1);
    }
    if (devctl(mncmgrfd, PPPIOCGUNIT, &ifunit) < 0) {
        printf("Failed to get interface unit\n");
        exit(1);
    }
    if (dlopen("swRdnIP.so", RTLD_NOW) == NULL) {
        printf("dlopen failed, %s...\n", dlerror());
    } else
        printf("dlopen() successful...\n");
    printf("ty::ifunit = %d\n", ifunit);
}

First, this test program runs fine on ppc but fails on x86.
Second, if I comment out either one of the devctl() calls, the dlopen() succeeds.
One difference I noticed between running on ppc and x86 is the value of ttyname(0):
it was /dev/ser1 on ppc and /dev/ttyp2 when running on the x86 processor.

Setting DL_DEBUG didn’t give me much information. The following is the output
of the test program.

With the second devctl call commented out:

./test

load_object: attempt load of libm.so.2
load_elf32: loaded lib at addr b034e000(text) b035de48(data)
load_object: attempt load of libsocket.so.2
load_elf32: loaded lib at addr b035f000(text) b036e4e0(data)
load_object: attempt load of swUtil.so
load_elf32: loaded lib at addr b8200000(text) b8206b40(data)
ty:/dev/socket/mncmgr active
ty::device name: /dev/ttyp2
dlopen("swRdnIP.so",2)
load_object: attempt load of swRdnIP.so
load_elf32: loaded lib at addr b8208000(text) b820bf48(data)
dlsym(804d488,_btext)=80484f0
Library loaded; type 'add-sym libswRdnIP.so.1 80484f0' in gdb to load symbols
dlopen() successful...
ty::ifunit = 0

With both devctl calls in place:

./test

load_object: attempt load of libm.so.2
load_elf32: loaded lib at addr b034e000(text) b035de48(data)
load_object: attempt load of libsocket.so.2
load_elf32: loaded lib at addr b035f000(text) b036e4e0(data)
load_object: attempt load of swUtil.so
load_elf32: loaded lib at addr b8200000(text) b8206b40(data)
ty:/dev/socket/mncmgr active
ty::device name: /dev/ttyp2
dlopen failed, Library cannot be found...
ty::ifunit = 0

Thanks for your help.

-Beth

Colin Burgess wrote:

Can you try to get the DL_DEBUG output? This prints to stderr, so make sure
you don’t close it.

Cheers,

Colin

Beth <id@address.com> wrote:
There is a lot of difference between these two programs. My test program only does
dlopen() and nothing else. The real application has more than ten other objects. But
the dlopen() call is the same.

That still doesn’t explain why the application worked on ppc and arm but not x86.
BTW, I used the same common.mk file from the application for building the little test
program.

I have tried everything I could think of. Any other suggestions?

-Beth

Colin Burgess wrote:

Beth <id@address.com> wrote:
It is loaded from one of the local directories. I copied the library to /lib, but
that didn’t help.
I don’t think the error message is accurate. I wrote a simple test program to
load this shared library using dlopen(), and it worked fine on the x86 processor.

Alright, so the dll itself appears to be ok. What’s different in your
program from the test case?


cburgess@qnx.com



Beth <id@address.com> wrote:

I have narrowed it down to the following code:

int mncmgrfd = 0;
char hostname[MAXNAMELEN]; // Our hostname
char devname[MAXNAMELEN]; // Our devname
int ifunit; // interface unit
int main(int argc, char **argv)
{
    printf("ty:/dev/socket/mncmgr active\n");
    mncmgrfd = open("/dev/socket/mncmgr", O_RDWR);
    if (mncmgrfd < 0) {
        printf("Failed to open mncmgr\n");
        exit(1);
    }
    strcpy(devname, ttyname(0));
    printf("ty::device name: %s\n", devname);
    if (devctl(mncmgrfd, PPPIOATTACH, devname, strlen(devname)+1, NULL) == -1) {
        printf("Failed attach\n");
        exit(1);
    }
    if (devctl(mncmgrfd, PPPIOCGUNIT, &ifunit) < 0) {

Are you sure this is not a typo? Maybe you meant ioctl() here.

devctl() takes 5 parameters (like the call above). You are only passing
3, which means whatever is left on the stack is passed in for the other two.
Change it to:

if (devctl(mncmgrfd, PPPIOCGUNIT, &ifunit, sizeof(ifunit), NULL) != 0)

Also please note that devctl() returns EOK on success and an error number
on failure, rather than 0 or -1.
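
For reference, a minimal sketch of the corrected call with the return value treated as an error number. The helper name get_ppp_unit is made up for illustration. The non-QNX stub and the placeholder PPPIOCGUNIT value are assumptions that exist only so the sketch builds off-target; on QNX the real declarations come from <devctl.h> and the PPP driver's own header.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#ifdef __QNXNTO__
#include <devctl.h>
#else
/* Off-target stand-in (assumption): mimics only the calling convention of
 * QNX devctl(), which returns 0 (EOK) or an errno value, never -1. */
static int devctl(int fd, int dcmd, void *data, size_t nbytes, int *info)
{
    (void)dcmd;
    (void)info;
    if (fd < 0)
        return EBADF;
    if (data == NULL || nbytes < sizeof(int))
        return EINVAL;
    *(int *)data = 0;   /* pretend the driver reported unit 0 */
    return 0;
}
#endif

#ifndef PPPIOCGUNIT
#define PPPIOCGUNIT 0   /* hypothetical placeholder, not the real command code */
#endif

/* Fetch the interface unit. Returns 0 on success, otherwise the error
 * number that devctl() handed back. */
static int get_ppp_unit(int fd, int *unit)
{
    /* All five arguments are supplied; the size matches the buffer. */
    int rc = devctl(fd, PPPIOCGUNIT, unit, sizeof(*unit), NULL);

    if (rc != 0)
        fprintf(stderr, "PPPIOCGUNIT failed: %s\n", strerror(rc));
    return rc;
}
```

The same check (!= 0 instead of == -1) would apply to the PPPIOATTACH call as well, since a failure there is also reported as an error number.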

-xtang

        printf("Failed to get interface unit\n");
        exit(1);
    }
    if (dlopen("swRdnIP.so", RTLD_NOW) == NULL) {
        printf("dlopen failed, %s...\n", dlerror());
    } else
        printf("dlopen() successful...\n");
    printf("ty::ifunit = %d\n", ifunit);
}

It’s strange - you don’t see the dlopen("swRdnIP.so",2) message in the
second version, but that’s the first line of code in dlopen!

If you change your devctl line to what Xiaodan suggests, does that affect
it?

What is /dev/socket/mncmgr - is that one of your resource managers?


cburgess@qnx.com

Xiaodan, thank you very much for your help.

Changing the code as you pointed out helped. I can now load the shared library;
dlopen() succeeds in my test program.
However, I get “memory fault (core dumped)” after loading the library when I
make the same change in the application.

-Beth


Colin,

To answer your question, /dev/socket/mncmgr is one of our resource managers.
I changed the code as Xiaodan suggested; now my application gets past loading the
shared library but stops with a “memory fault”.
Again, the same code worked on PowerPC and StrongARM. How do you explain that?

Appreciate your help.

-Beth



x86, PPC, and ARM will lay things out differently in memory. If you’re
munging stuff on the stack, or corrupting something, then you’ll be
corrupting something different. That is, it is probably broken on
non-x86 as well, but just “happens” to work.

Not a solution, but maybe part of the reason for the discrepancy.
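
As a rough illustration of the point about layout (a sketch, not a reproduction of the actual fault): the stand-in fill_reply() below trusts whatever length it is given, and struct frame is just one made-up arrangement of an int next to unrelated data. Both names are hypothetical. With the right length the neighbouring bytes survive; with a wrong length they do not, and what those neighbouring bytes happen to be differs between x86, PPC, and ARM.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callee that trusts the length argument and writes that
 * many reply bytes into the caller's buffer. */
static void fill_reply(void *buf, size_t nbytes)
{
    memset(buf, 0xAA, nbytes);
}

/* One possible layout: the intended target followed by unrelated data. */
struct frame {
    int ifunit;
    unsigned char neighbour[16];
};

/* Count how many neighbour bytes are overwritten when nbytes are written
 * starting at the beginning of the frame. The length is clamped to the
 * frame size so the demonstration itself stays well defined. */
static size_t clobbered_bytes(size_t nbytes)
{
    struct frame f;
    size_t i, count = 0;

    if (nbytes > sizeof f)
        nbytes = sizeof f;
    memset(&f, 0, sizeof f);
    fill_reply(&f, nbytes);

    for (i = 0; i < sizeof f.neighbour; i++) {
        if (f.neighbour[i] != 0)
            count++;
    }
    return count;
}
```

Calling clobbered_bytes() with the offset of neighbour reports nothing overwritten, while a length 8 bytes larger reports 8 overwritten bytes. In a real program the "neighbour" could be a saved return address on one architecture and unused padding on another, which is one way the same bug can crash on x86 and go unnoticed elsewhere.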

-David

QNX Training Services
http://www.qnx.com/support/training/
Please follow up in this newsgroup if you have further questions.

That makes sense. How do I find out more about where the code failed?
How do I read the core dump?

Thanks,
-Beth


gdb myapp

(gdb) set solib-search-path etc etc
(gdb) target core /var/dumps/myapp.core

Things may look a little weird in the dump if you are overwriting your stack,
though.


cburgess@qnx.com