bug: qnx or j9?

I would like to submit a bug report but I am a bit stumped.

My java app froze. Figuring it was my problem I attempted
to slay/kill the process, no luck. Even went as far as “slay -9 j9”
as root with no luck.

I killed the shell & re-ran the program no problem. But the frozen
process is still on my machine, consuming a big chunk of memory and
no cpu.

I don’t need the memory, so the process is still hanging around. I can attach
with gdb and get the following stack trace:

Attaching to program `/opt/vame1.4/ive/bin/j9', process 66285623
0xb0327ce2 in ?? () from /x86/lib/libc.so.2
(gdb) bt
#0 0xb0327ce2 in ?? () from /x86/lib/libc.so.2
#1 0xb825844c in ?? () from /opt/vame1.4/ive/bin/libj9thr14.so
#2 0xb829b40c in ?? () from /opt/vame1.4/ive/bin/libj9max14.so
#3 0xb823f087 in ?? () from /opt/vame1.4/ive/bin/libj9vm14.so
#4 0xb824e589 in ?? () from /opt/vame1.4/ive/bin/libj9prt14.so
#5 0xb823f7cd in ?? () from /opt/vame1.4/ive/bin/libj9vm14.so
#6 0xb823dd0f in ?? () from /opt/vame1.4/ive/bin/libj9vm14.so
#7 0x804bef5 in main ()
#8 0x804a1c6 in gpProtectedMain ()
#9 0xb824e589 in ?? () from /opt/vame1.4/ive/bin/libj9prt14.so
#10 0x804a58b in main ()
(gdb) The program is running. Quit anyway (and detach it)? (y or n) y
Detaching from program: /opt/vame1.4/ive/bin/j9 process 136377104
bash-2.04# ps -A|grep j9
66285623 ? 00:00:04 j9
bash-2.04# slay -9 j9
bash-2.04# ps -A|grep j9
66285623 ? 00:00:04 j9
bash-2.04$ uname -a
QNX qnxdev2 6.1.0 2001/06/25-15:31:48edt x86pc x86

Unless anyone has some clever ideas, I’m going to reboot the machine, but I’ll
leave the process hanging for a few days first, like a virus in a test tube.

Chris Goebel <cgoebel@tridium.com> wrote:

My java app froze. Figuring it was my problem I attempted
to slay/kill the process, no luck. Even went as far as “slay -9 j9”
as root with no luck.

The key is to look at what and where the process is blocking. You need
to post the output of “pidin | grep j9”.

chris

Chris McKillop <cdm@qnx.com>
Software Engineer, QSSL
“The faster I go, the behinder I get.” – Lewis Carroll

doh! shoulda thought of that…

bash-2.04# pidin |grep j9
66285623 1 vame1.4/ive/bin/j9 10o STOPPED
66285623 2 vame1.4/ive/bin/j9 10r CONDVAR b825a644
66285623 3 vame1.4/ive/bin/j9 10r STOPPED
66285623 4 vame1.4/ive/bin/j9 10r STOPPED
66285623 5 vame1.4/ive/bin/j9 10r MUTEX 66285623-04 #1



Try laying down a SIGCONT on the process - looks like they are still using
the ThreadCtl() method of synchronizing things. Does anyone from OTI read this
forum?

chris

Chris Goebel <cgoebel@tridium.com> wrote:

doh! shoulda thought of that…

bash-2.04# pidin |grep j9
66285623 1 vame1.4/ive/bin/j9 10o STOPPED
66285623 2 vame1.4/ive/bin/j9 10r CONDVAR b825a644
66285623 3 vame1.4/ive/bin/j9 10r STOPPED
66285623 4 vame1.4/ive/bin/j9 10r STOPPED
66285623 5 vame1.4/ive/bin/j9 10r MUTEX 66285623-04 #1



Hmmm, no luck, SIGCONT didn’t appear to do anything.

bash-2.04# kill -SIGCONT 66285623
bash-2.04# pidin |grep j9
66285623 1 vame1.4/ive/bin/j9 10o STOPPED
66285623 2 vame1.4/ive/bin/j9 10r CONDVAR b825a644
66285623 3 vame1.4/ive/bin/j9 10r STOPPED
66285623 4 vame1.4/ive/bin/j9 10r STOPPED
66285623 5 vame1.4/ive/bin/j9 10r MUTEX 66285623-04 #1
bash-2.04#



Yeah - if they are still doing the ThreadCtl() in the released version then
it could very well be that there is nothing you can do. I know it is fixed
because I helped them fix it. ;-) In your java app are you trying to run any
external programs?

chris



I wouldn’t be surprised. It is a very large java app that does
a little bit of everything, and it froze during a diagnostic test, but
I don’t remember which step.

Good to know that the problem has been addressed. I guess I will
go ahead and reboot the machine & wait for the next revision.

So what’s the scoop on ThreadCtl() btw?



They are using the ThreadCtl() that stops all threads but the calling one to
implement some synchronization, not realizing that an API already existed
for doing what they wanted. ;-)

chris



Ok, so the lesson here is: don’t use ThreadCtl(), because it can lead to
a nasty deadlock that generates an unkillable process.
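
For contrast, here is a minimal sketch of the conventional alternative to stopping every other thread: a plain POSIX mutex around the shared state. This is not the J9 code. The names `shared_lock`, `shared_counter`, `worker` and `run_workers` are hypothetical, and the counter merely stands in for whatever the VM was protecting. In the pidin output earlier in the thread, thread 5 appears to be blocked on a mutex whose owner (thread 4) is STOPPED; with a mutex alone, only the threads that contend for the resource ever wait.

```c
#include <pthread.h>
#include <stddef.h>

#define WORKERS 4
#define ROUNDS  10000

/* Hypothetical shared state, standing in for the resource being protected. */
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        /* Only threads that need the resource block here;
         * no thread is ever put into a stopped state. */
        pthread_mutex_lock(&shared_lock);
        shared_counter++;
        pthread_mutex_unlock(&shared_lock);
    }
    return NULL;
}

/* Starts WORKERS threads, waits for them, and returns the final counter.
 * Returns -1 if a thread could not be created. */
static long run_workers(void)
{
    pthread_t tids[WORKERS];
    int created = 0;
    int failed = 0;

    shared_counter = 0;
    for (; created < WORKERS; created++) {
        if (pthread_create(&tids[created], NULL, worker, NULL) != 0) {
            failed = 1;
            break;
        }
    }
    /* Join whatever was started, even on failure. */
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    return failed ? -1 : shared_counter;
}
```

Because every increment happens under the lock, `run_workers()` should return `WORKERS * ROUNDS` when all threads start successfully.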



On 15 Dec 2001 01:04:32 GMT, Chris McKillop <cdm@qnx.com> wrote:

Try laying down a SIGCONT on the process - looks like they are still using
the ThreadCtl() method of synchronizing things. Does anyone from OTI read this
forum?

I try to follow on a semi-regular basis. I’ll forward this on to the
VM team.

-Andrew


On 17 Dec 2001 18:19:24 GMT, Chris McKillop <cdm@qnx.com> wrote:

Yeah - if they are still doing the ThreadCtl() in the released version then
it could very well be that there is nothing you can do. I know it is fixed
because I helped them fix it. ;-) In your java app are you trying to run any
external programs?

Chris - who did you work with to fix this?
-Andrew


Andrew Sandstrom <andrew_sandstrom@oti.com> wrote:

Chris - who did you work with to fix this?

DeLoy and I made the change and we talked with a guy in Ottawa about it.
This was some time ago now on one of my visits to your office. ;-) Basically it
was using the ThreadCtl() before doing a spawn or a fork/exec combo to stop
other threads from accessing stdin/stdout/stderr before the child process
was created. We changed it to use the fd array you can pass into spawn().

chris
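
The fix described above used the fd array that QNX's spawn() accepts. As a rough, portable analogue (this is an assumption, not the actual VM change), the same idea can be sketched with POSIX posix_spawnp() and file actions: the child's stdout is arranged as part of the spawn itself, so the parent never has to rearrange its own fds 0/1/2 and no other thread has to be stopped. The helper name `run_with_stdout_pipe` is hypothetical.

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: runs argv[0] (looked up on PATH) with the child's
 * stdout connected to a pipe, copies what the child prints into `out`
 * (NUL-terminated, truncated to out_len - 1 bytes), and returns the child's
 * exit status, or -1 on error. The parent's own stdin/stdout/stderr are
 * never modified. */
static int run_with_stdout_pipe(char *const argv[], char *out, size_t out_len)
{
    int pipefd[2];
    if (out_len == 0 || pipe(pipefd) != 0)
        return -1;

    posix_spawn_file_actions_t fa;
    if (posix_spawn_file_actions_init(&fa) != 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    /* These actions are carried out in the child only. */
    posix_spawn_file_actions_adddup2(&fa, pipefd[1], STDOUT_FILENO);
    posix_spawn_file_actions_addclose(&fa, pipefd[0]);
    posix_spawn_file_actions_addclose(&fa, pipefd[1]);

    pid_t pid;
    int rc = posix_spawnp(&pid, argv[0], &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    close(pipefd[1]);            /* parent keeps only the read end */
    if (rc != 0) {
        close(pipefd[0]);
        return -1;
    }

    /* Read until EOF or until the buffer is full. */
    size_t used = 0;
    ssize_t n;
    while (used + 1 < out_len &&
           (n = read(pipefd[0], out + used, out_len - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(pipefd[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the argument vector `{ "echo", "hello", NULL }` and a 64-byte buffer, this should return 0 and leave `hello` plus a newline in the buffer, assuming `echo` is on the PATH.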


Oh, now I remember. That was quite a while ago. For some reason I was
thinking about this in a different context and didn’t make the
connection. Sorry 'bout that.

-Andrew

On 18 Dec 2001 22:39:26 GMT, Chris McKillop <cdm@qnx.com> wrote:

DeLoy and I made the change and we talked with a guy in Ottawa about it.
This was some time ago now on one of my visits to your office. ;-) Basically it
was using the ThreadCtl() before doing a spawn or a fork/exec combo to stop
other threads from accessing stdin/stdout/stderr before the child process
was created. We changed it to use the fd array you can pass into spawn().

chris

Andrew Sandstrom <andrew_sandstrom@oti.com> wrote:

Oh, now I remember. That was quite a while ago. For some reason I was
thinking about this in a different context and didn’t make the
connection. Sorry 'bout that.

Yep - but that is the only way to get all those threads into the STOPPED
state (unless it is being debugged). So it is either a case of that fix
not making it into 6.1.0’s VAME package or there is another place in the
VM’s code using that call to stop all the threads.

chris
