incomplete fwrite?

What happens when fwrite doesn’t complete? How do you recover? Say I
ask fwrite to write 1 object, and it returns a value other than 1.
According to the documentation, a return value less than the number of
objects requested is an error. What happens then? Does it write part of
the object to the disk, thus corrupting the file? Or does it write nothing?

I’ve found that it appears to complete OK, even when interrupted by a
signal, and it doesn’t set errno to EINTR.

Scott

Previously, J. Scott Franko wrote in qdn.public.qnx4:

What happens when fwrite doesn’t complete? How do you recover? Say I
ask fwrite to write 1 object, and it returns a value other than 1.
According to the documentation, a return value less than the number of
objects requested is an error. What happens then? Does it write part of
the object to the disk, thus corrupting the file? Or does it write nothing?

I’ve found that it appears to complete OK, even when interrupted by a
signal, and it doesn’t set errno to EINTR.

How do you know you are interrupting fwrite()? A C function call isn’t
interruptible, only system calls are. fwrite() sometimes makes a system
call (write), and sometimes doesn’t. I’m curious what kind of test you’ve
done that you think is interrupting fwrite().

I looked at the BSD source, and it appears to me that if fwrite() returns
a number of complete objects that is less than what you asked for,
then it is possible for a partial object to have been written.

If you think about it, a successful return from fwrite() CAN’T indicate
that the data is safely on disk. If you fwrite() 8 10-byte objects, fwrite()
will place them in the FILE buffer and return success. At this point
absolutely nothing has been written to disk. Later, when the data is flushed
to disk, an error may occur. If you need guarantees, you must call fflush()
after each fwrite() call, and even then you only know that an error has
occurred. Unless Watcom’s library is very different, which I doubt,
stdio is NOT signal safe. Use the system calls!
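
A minimal sketch of the fwrite()-then-fflush() pattern described above. The helper name write_objects is hypothetical, not something from the thread. Note that fflush() only moves data from the stdio buffer to the operating system; whether it has reached the disk is a separate question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write `count` objects of `size` bytes each and push
 * them out of the stdio buffer. Returns 0 on success, -1 on any error. */
int write_objects(FILE *fp, const void *objs, size_t size, size_t count)
{
    assert(fp != NULL && objs != NULL);

    if (fwrite(objs, size, count, fp) != count)
        return -1;      /* short count: part of an object may already be out */
    if (fflush(fp) != 0)
        return -1;      /* buffered data could not be handed to the OS */
    return ferror(fp) ? -1 : 0;
}
```

A caller would treat -1 as "the file may now end in a partial object" and decide what to do about it; the sketch itself makes no attempt at recovery.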

Sam


Sam Roberts (sam@cogent.ca), Cogent Real-Time Systems (www.cogent.ca)

Sam Roberts wrote:

How do you know you are interrupting fwrite()? A C function call isn’t
interruptible, only system calls are. fwrite() sometimes makes a system
call (write), and sometimes doesn’t. I’m curious what kind of test you’ve
done that you think is interrupting fwrite().

I wrote a small test program. It makes a big array of structures (several MB),
sets up a handler for SIGALRM, sets a 10-second timer with alarm(), prints the
size of the whole array using printf, calls fwrite using the size of each
array element structure (which Mario helped me correct; see my previous fwrite
thread) to send several array elements, then prints the completion errno and
status, and closes the file.

Meanwhile the fwrite takes long enough that the signal goes off before the
fwrite completes, and the signal handler has a printf in it. I know that the
signal goes off before fwrite is done, because its printf output comes out
before the completion message with the errno and status from fwrite, which
directly follows fwrite. fwrite returns the same number of elements as I
requested, and then I ls -al the file and see that its byte size is the same
as the byte size of the array elements.

So even though fwrite doesn’t appear to be re-entrant from looking in the
Signals chapter of “Advanced Programming in the Unix Environment”, it does
re-enter and pick up where it left off. I’ve varied the number of elements in
the array, and written it many times, all with the same good results. Also,
from reading the Advanced book, I didn’t see any indication that signals only
interrupt system calls. I therefore assumed that signals asynchronously
interrupt any part of the code, unless you specifically set up to ignore them.
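
A rough sketch of a test along the lines described above, not the original program. The record layout, the function name timed_fwrite and the flag got_alarm are all assumptions. The handler only sets a flag instead of calling printf, and the result is reported after fwrite() returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct sample { int id; double value[15]; };    /* hypothetical record */

static volatile sig_atomic_t got_alarm = 0;

static void on_alarm(int signo)
{
    (void)signo;
    got_alarm = 1;              /* only set a flag; no stdio in the handler */
}

/* Writes nelem records to path with an alarm pending. Returns the number of
 * records fwrite() reports, or -1 if the setup fails. */
long timed_fwrite(const char *path, size_t nelem, unsigned seconds)
{
    struct sigaction sa;
    struct sample *data;
    FILE *fp;
    size_t written;

    assert(path != NULL && nelem > 0);

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    data = calloc(nelem, sizeof *data);
    fp = fopen(path, "wb");
    if (data == NULL || fp == NULL) {
        free(data);
        if (fp != NULL)
            fclose(fp);
        return -1;
    }

    alarm(seconds);
    written = fwrite(data, sizeof *data, nelem, fp);
    alarm(0);                   /* cancel the alarm if it has not fired yet */

    printf("wrote %lu of %lu records, alarm fired: %d\n",
           (unsigned long)written, (unsigned long)nelem, (int)got_alarm);
    fclose(fp);
    free(data);
    return (long)written;
}
```

For the alarm to land in the middle of the write, the array has to be large enough (or the device slow enough) that fwrite() is still running when the timer expires.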

I looked at the BSD source, and it appears to me that if fwrite() returns
a number of complete objects that is less than what you asked for,
then it is possible for a partial object to have been written.

If this is so, how do I recover? How can I take back the partial object it
has written, and rewrite the complete object?

If you think about it, a successful return from fwrite() CAN’T indicate
that the data is safely on disk. If you fwrite() 8 10-byte objects, fwrite()
will place them in the FILE buffer and return success. At this point
absolutely nothing has been written to disk. Later, when the data is flushed
to disk, an error may occur. If you need guarantees, you must call fflush()
after each fwrite() call, and even then you only know that an error has
occurred. Unless Watcom’s library is very different, which I doubt,
stdio is NOT signal safe. Use the system calls!

By system calls, you mean the POSIX write and read? If I used them, I have
to roll my own object write? There is no equivalent to fwrite in the system
calls, right?

We have another one of those tricky problems. Code that has worked for 20
years on VMS systems, when ported to QNX, suddenly gets what appears to be a
corrupted file, which we don’t find out about until we try to read it back
in. It’s a calibration file where we save data about the rolling
characteristics of a particular railroad car, for future use in controlling
its speed through a trainyard. But every 1-4 months, a record can’t be read
back in. Analysis shows that the length field in the headers is either zero
or some impossibly huge number, which throws the read program off. I’m trying
to come up with theories as to why it’s happening so we can decide where to
look to stop it.

Scott


“J. Scott Franko” <jsfranko@switch.com> wrote in message
news:39B92D27.14606E62@switch.com

I wrote a small test program. It makes a big array of structures (several MB),
sets up a handler for SIGALRM, sets a 10-second timer with alarm(), prints the
size of the whole array using printf, calls fwrite using the size of each
array element structure (which Mario helped me correct; see my previous fwrite
thread) to send several array elements, then prints the completion errno and
status, and closes the file.

Meanwhile the fwrite takes long enough that the signal goes off before the
fwrite completes, and the signal handler has a printf in it.

printf() is NOT signal safe; you’re asking for trouble…

I know that the signal goes off before fwrite is done, because its printf
output comes out before the completion message with the errno and status
from fwrite, which directly follows fwrite. fwrite returns the same number
of elements as I requested, and then I ls -al the file and see that its byte
size is the same as the byte size of the array elements.

That’s normal, where is the problem?

So even though fwrite doesn’t appear to be re-entrant from looking in the
Signals chapter of “Advanced Programming in the Unix Environment”, it does
re-enter, pick up where it left off.

Reentrancy doesn’t apply here. Because you provided your own handler,
the fwrite code resumes where it left off once the handler returns. That’s
not being reentrant; that’s simply like an interrupt.

Reentrancy is the ability of a function to call itself, or to be
invoked at the same time by different threads/processes/interrupts, and
still work.
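
To make the definition concrete, here is a small sketch contrasting a function that keeps its result in shared static storage with one that works only on storage supplied by the caller. Both function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Not reentrant: every call shares the one static buffer, so a second call
 * that starts before the first caller is done overwrites its result. */
const char *repeat_static(char c, size_t n)
{
    static char buf[16];

    if (n > sizeof buf - 1)
        n = sizeof buf - 1;
    memset(buf, c, n);
    buf[n] = '\0';
    return buf;
}

/* Reentrant: all state lives in storage owned by the caller. */
char *repeat_r(char c, size_t n, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (n > len - 1)
        n = len - 1;
    memset(buf, c, n);
    buf[n] = '\0';
    return buf;
}
```

Two calls to repeat_static() hand back the same pointer, so the second result replaces the first; two calls to repeat_r() with separate buffers do not disturb each other. stdio functions behave like the first kind with respect to the FILE they operate on.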


By system calls, you mean the POSIX write and read? If I used them, I have
to roll my own object write? There is no equivalent to fwrite in the system
calls, right?

Yes, it’s write(). You can replace fopen() with open() and fwrite() with write().
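
One thing changes when moving from fwrite() to write(): write() may transfer fewer bytes than requested, or fail with EINTR, so callers usually loop. A minimal sketch, with the hypothetical helper name write_all:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes are out.
 * Returns 0 on success, -1 on error (errno is left as set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any byte moved: retry */
            return -1;
        }
        p += n;                 /* partial write: advance and go again */
        len -= (size_t)n;
    }
    return 0;
}
```

Writing one object then becomes write_all(fd, &obj, sizeof obj), with fd obtained from open() rather than fopen().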

We have another one of those tricky problems. Code that has worked for 20
years on VMS systems, when ported to QNX, suddenly gets what appears to be a
corrupted file, which we don’t find out about until we try to read it back
in. It’s a calibration file where we save data about the rolling
characteristics of a particular railroad car, for future use in controlling
its speed through a trainyard. But every 1-4 months, a record can’t be read
back in. Analysis shows that the length field in the headers is either zero
or some impossibly huge number, which throws the read program off. I’m trying
to come up with theories as to why it’s happening so we can decide where to
look to stop it.

I wouldn’t look into fwrite or other issues like that. I would look
into issues like memory trashing or corruption, possibly wild pointers.
The fact that the code worked for 20 years on VMS doesn’t mean
it’s reliable. It could also be a difference in the behavior of some
system call; if the code is 20 years old, I don’t think it’s POSIX :wink:




Previously, J. Scott Franko wrote in qdn.public.qnx4:

So even though fwrite doesn’t appear to be re-entrant from looking in the
Signals chapter of “Advanced Programming in the Unix Environment”, it does
re-enter and pick up where it left off.

This is not re-entering! In your printf() you are accessing global data
(stdout). In your fwrite() you are also accessing global data (but it’s
a FILE that you opened). Since they are accessing different global data, you
are OK. If you were in the middle of a printf(), and called printf() from
a signal handler at one of the critical times, stdout would be corrupted.
This is what they mean by non-reentrant.

I’ve varied the number of elements in the array, and written it many times,
all with the same good results. Also, from reading the Advanced book, I didn’t
see any indication that signals only interrupt system calls. I therefore
assumed that signals asynchronously interrupt any part of the code, unless
you specifically set up to ignore them.

“Interrupt” has a precise meaning, you’re using it too loosely, though I
can understand why.

“Interrupt” and “occur during” are not the same. During fwrite() there are
times when fwrite() is doing a memcpy() into a buffer, and times when
it does a Send() of some data to Fsys. When a signal occurs during the
memory copy, the kernel arranges for the signal handler to be called and
executed by the process immediately. This doesn’t take too much work. When
the signal occurs during the Send(), the process is blocked. It can’t
execute code while it’s blocked, so it has to break out of the Send(). The
exact mechanism varies between Unix and QNX, because of the message passing
in QNX, but the effect is the same. This breaking out is done in a way
that the process can’t just magically go back to being Send() blocked. This
breaking of the Send() is the “interruption” caused by a signal, and is VERY
different from the “immediately called” action that takes place when a signal
occurs during normal process execution.

So what I’m saying is that some proportion of the fwrite() of megabytes is
spent in-process copying memory, and some proportion is spent in a Send().
Interruption only occurs if the signal occurs during the Send().
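
The "interruption" case can be seen with plain POSIX calls, without QNX Send(). This is a sketch under the assumption that the handler is installed without SA_RESTART; the function name blocked_read_errno is made up. The process blocks in read() on an empty pipe, SIGALRM arrives, the handler runs, and read() comes back with an error instead of resuming.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static void on_alarm(int signo)
{
    (void)signo;                    /* the handler just returns */
}

/* Blocks in read() on an empty pipe until SIGALRM arrives. Returns the
 * errno left by the interrupted read(), 0 if read() did not fail, or -1 if
 * the setup fails. */
int blocked_read_errno(unsigned seconds)
{
    struct sigaction sa;
    int fds[2];
    char c;
    ssize_t n;
    int err;

    assert(seconds > 0);

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                /* no SA_RESTART: do not resume read() */
    if (sigaction(SIGALRM, &sa, NULL) != 0 || pipe(fds) != 0)
        return -1;

    alarm(seconds);
    n = read(fds[0], &c, 1);        /* nothing to read, so this blocks */
    err = (n < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On a POSIX system the value returned here is expected to be EINTR. A signal that arrives while the process is merely copying memory produces no such error; the handler runs and the copy carries on.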

I looked at the BSD source, and it appears to me that if fwrite() returns
a number of complete objects that is less than what you asked for,
then it is possible for a partial object to have been written.

If this is so, how do I recover? How can I take back the partial object it
has written, and rewrite the complete object.

I don’t think there is a portable way. I’m not kidding when I claim signals
are a problem under Unix. It’s very unfortunate that the s/w you are
maintaining uses them so much; they are inherently unpredictable, and subtle
timing changes can uncover long-dormant bugs.
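
With stdio there is no portable way back, but with file descriptors one possible approach is to remember where the record starts and cut the file back to that point if the write fails. This is only a sketch under strong assumptions: records are appended at the end of the file, there is a single writer, and the filesystem supports ftruncate(). The names open_record_file and append_record are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) the record file. */
int open_record_file(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0644);
}

/* Appends one record. Returns 0 on success, -1 if the write failed and the
 * file was cut back to its previous length, -2 if the rollback failed too
 * (the file may then end in a partial record). */
int append_record(int fd, const void *rec, size_t len)
{
    const char *p = rec;
    size_t left = len;
    off_t start;

    assert(rec != NULL && len > 0);

    start = lseek(fd, 0, SEEK_END);     /* remember where the record begins */
    if (start == (off_t)-1)
        return -1;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            int saved, rolled_back;
            if (errno == EINTR)
                continue;               /* nothing written: just retry */
            saved = errno;
            rolled_back = (ftruncate(fd, start) == 0);
            errno = saved;              /* report the write() error */
            return rolled_back ? -1 : -2;
        }
        p += n;
        left -= (size_t)n;
    }
    return 0;
}
```

This does nothing for a crash in the middle of a record; a reader still has to validate what it finds.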

By system calls, you mean the posix, write and read? If I used them, I have
to roll my own object write? There is no equivalent to fwrite in the system
calls, right?

Look at the docs for write(). fwrite() is a cheesy wrapper on write(); it is
exactly equivalent, except that write() has a useful return value, and fwrite()
multiplies two numbers for you!


Sam Roberts (sam@cogent.ca), Cogent Real-Time Systems (www.cogent.ca)

Thanks both to Sam and Mario. You’ve given me lots of good information in my
struggle to become a unix programmer. Have either of you considered writing a
book? You have at least one guaranteed customer right here.

So an “interrupt” boils down to either some minor function context switch to execute
the handler, or it interrupts the I/O in a way that can’t be restored, and returns
the EINTR errno. I shouldn’t use fwrite. I can use printf in association with signals
as long as I am not interrupting a printf with a printf, or more specifically,
clobbering the same global stdio space.

My signal test here was just to see what the results would be, as the “Advanced” book
says that it is dependent on the implementation of fwrite, i.e. it differs from
compiler to compiler (or C library to C library).

The only use of a signal in the process with the problem is a sigsuspend on SIGUSR2
to hibernate the process when it is done processing, and to wake it again, when data
is put on its queue for processing. There is no explicit handler. Another process
sets SIGUSR2. A process table keeps track of who it was set for and awakes the
proper process. It appears this was a way to port the VMS method of hibernating
code over to QNX.
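
For reference, a sigsuspend()-based hibernate/wake pair is often written along the following lines. This is a guess at the general shape, not the code from the system described: it assumes a handler that sets a flag (the post says there is no explicit handler), and the names hibernate_init, hibernate and wake_requested are invented. Keeping SIGUSR2 blocked outside sigsuspend() means a wake-up sent before the process goes to sleep stays pending instead of being lost.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t wake_requested = 0;

static void on_usr2(int signo)
{
    (void)signo;
    wake_requested = 1;
}

/* Install the handler and block SIGUSR2, so that it can only be delivered
 * while the process sits inside sigsuspend(). Returns 0 on success. */
int hibernate_init(void)
{
    struct sigaction sa;
    sigset_t block;

    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/* Sleep until SIGUSR2 has been delivered at least once. */
void hibernate(void)
{
    sigset_t waitmask;
    int rc = sigprocmask(SIG_BLOCK, NULL, &waitmask);  /* fetch current mask */

    assert(rc == 0);
    (void)rc;
    sigdelset(&waitmask, SIGUSR2);
    while (!wake_requested)
        sigsuspend(&waitmask);      /* atomically unblock SIGUSR2 and sleep */
    wake_requested = 0;
}
```

Another process would wake this one with kill(pid, SIGUSR2).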

I’m going to wait for one more occurrence of the problem, now that we have additional
diags in, and if I can’t figure out where the problem is, I’ll take your advice and
rewrite using write().

Scott


Mario Charest wrote:


Reentrancy doesn’t apply here. Because you provided your own handler,
the fwrite code resumes where it left off once the handler returns. That’s
not being reentrant; that’s simply like an interrupt.

Reentrancy is the ability of a function to call itself, or to be
invoked at the same time by different threads/processes/interrupts, and
still work.

Acknowledged. I’m still learning! :0)


Yes, it’s write(). You can replace fopen() with open() and fwrite() with write().

With some minor mods.


I wouldn’t look into fwrite or other issues like that. I would look
into issues like memory trashing or corruption, possibly wild pointers.
The fact that the code worked for 20 years on VMS doesn’t mean
it’s reliable. It could also be a difference in the behavior of some
system call; if the code is 20 years old, I don’t think it’s POSIX :wink:

Right, it’s not POSIX, it’s ANTIQUE! ;O)

Scott



“J. Scott Franko” wrote:


The only use of a signal in the process with the problem is a sigsuspend on SIGUSR2
to hibernate the process when it is done processing, and to wake it again, when data
is put on its queue for processing. There is no explicit handler. Another process
sets SIGUSR2. A process table keeps track of who it was set for and awakes the
proper process. It appears this was a way to port the VMS method of hibernating
code over to QNX.

I would suggest that using signals in this way is definitely NOT the
way to handle what amounts to a simple client/server setup. There are
many problems with signals and asynchronism, all of which are avoided by
simply using QNX messaging to obtain services as needed. BTW, signals
would not be the way in Unix either.


“J. Scott Franko” <jsfranko@switch.com> wrote in message
news:39BD028A.53FD3CB7@switch.com

Thanks both to Sam and Mario. You’ve given me lots of good information in my
struggle to become a unix programmer. Have either of you considered writing a
book?

I’m not a unix programmer :wink: I’m a QNX or NTO programmer; that’s different.

So an “interrupt” boils down to either some minor function context switch to
execute the handler, or it interrupts the I/O in a way that can’t be restored,
and returns the EINTR errno.

Both, it depends on what your interrupt handler does.

I shouldn’t use fwrite.

I can use printf in association with signals as long as I am not interrupting
a printf with a printf, or more specifically, clobbering the same global stdio
space.

This is VERY difficult to do, you would have to protect every access to
stdout.
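
One way to "protect every access" is to hold the signal off for the duration of each stdio call, so the handler can never run in the middle of one. A sketch only, assuming the handler in question is for SIGALRM; the wrapper name guarded_printf is hypothetical, and every printf() in the program (and every other use of stdout) would have to go through something like it for the protection to mean anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: printf() with SIGALRM blocked while stdout is being
 * updated. Returns whatever vprintf() returns. */
int guarded_printf(const char *fmt, ...)
{
    sigset_t block, saved;
    va_list ap;
    int n;

    assert(fmt != NULL);

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    sigprocmask(SIG_BLOCK, &block, &saved);   /* hold the signal off */

    va_start(ap, fmt);
    n = vprintf(fmt, ap);
    va_end(ap);

    sigprocmask(SIG_SETMASK, &saved, NULL);   /* deliver anything pending */
    return n;
}
```

The simpler alternative is the one used elsewhere in this thread: keep stdio out of the handler altogether and have it set a flag or call write().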

My signal test here was just to see what the results would be, as the
“Advanced” book says that it is dependent on the implementation of fwrite,
i.e. it differs from compiler to compiler (or C library to C library).

The only use of a signal in the process with the problem is a sigsuspend on
SIGUSR2 to hibernate the process when it is done processing, and to wake it
again when data is put on its queue for processing. There is no explicit
handler. Another process sets SIGUSR2. A process table keeps track of who it
was set for and awakes the proper process. It appears this was a way to port
the VMS method of hibernating code over to QNX.

This is something that is best done with the QNX4 kernel and its IPC
primitives.

Dean Douthat wrote:


I would suggest that using signals in this way is definitely NOT the
way to handle what amounts to a simple client/server setup. There are
many problems with signals and asynchronism, all of which are avoided by
simply using QNX messaging to obtain services as needed. BTW, signals
would not be the way in Unix either.

Forgive me if I have given the impression that I am working with a simple client/server
setup. I am, in fact, working with a couple million lines of process control code that
operates a railroad classification yard. It was poorly ported from VMS to QNX, but it was
done in a somewhat portable way. The QNX messaging is a proprietary solution. When the
company I’m temping for decides that QNX was the right way to go (if ever), they may
decide to redo some of this stuff, but for now, it’s just being made to work, with as
little chewing gum as possible. Something about the realities of making money, they tell
me.

It’s a wonder signals were invented at all the way everyone puts them down.


Mario Charest wrote:

“J. Scott Franko” <jsfranko@switch.com> wrote in message
news:39BD028A.53FD3CB7@switch.com

Thanks both to Sam and Mario. You’ve given me lots of good information in my
struggle to become a unix programmer. Have either of you considered writing a
book?

I’m not a unix programmer :wink: I’m a QNX or NTO programmer; that’s different.

Sure it’s different! QNX has some proprietary features like message passing,
and a kernel and driver architecture that is a helluva lot easier to administer
than any unix I’ve ever encountered (Solaris, Irix, Digital Unix, Linux, and
MacOS X). And I’m sure you can tell me a few more differences! But all in all,
I’d say it has the same C libraries, most of the same command line tools, etc.
If it walks like a Unix and talks like a Unix, and you program on it… ;O)

