Empty slog

We occasionally see a condition where sloginfo shows no log, i.e., it
appears to be empty. Some time later, it will contain entries, even
from during the time it appeared to be empty. It seems that perhaps
under a load, the log sometimes “disappears” and then shows up later.
Since we’re coming to depend on this log, this is fairly inconvenient…
of course, it typically fails right when we need to look at something!

Is this a known problem? We could try to gather more information as to
what makes it happen, but wanted to check first before spending a lot of
time on it.

Thanks,

lew

Can you specify which OS version you’re using? Also, do you have a test
case that demonstrates your issue, or at least detail some conditions that
your system is under, when this behaviour is observed.

Cheers,
Adam

QNX Software Systems Ltd.
[ amallory@qnx.com ]

With a PC, I always felt limited by the software available.
On Unix, I am limited only by my knowledge.
–Peter J. Schoenster <pschon@baste.magibox.net>

Lewis Donzis <lew@nospam.donzis.com> wrote in message
news:3DC5DBB7.E3DEA147@nospam.donzis.com

We occasionally see a condition where sloginfo shows no log, i.e., it
appears to be empty. Some time later, it will contain entries, even
from during the time it appeared to be empty. It seems that perhaps
under a load, the log sometimes “disappears” and then shows up later.
Since we’re coming to depend on this log, this is fairly inconvenient…
of course, it typically fails right when we need to look at something!

Is this a known problem? We could try to gather more information as to
what makes it happen, but wanted to check first before spending a lot of
time on it.

Thanks,

lew

Adam Mallory wrote:

Can you specify which OS version you’re using?

Sorry, it’s 6.2.0.

Also, do you have a test
case that demonstrates your issue, or at least detail some conditions that
your system is under, when this behaviour is observed.

Sort of. We really don’t know what causes it, so we just wrote a little
test program. Every second, it opens the log, reads it to the end
counting the number of bytes, closes the file, and prints the time and
the number of bytes found in the log. We ran it for a while on one of
our systems and here’s what happened:

Mon Nov 4 17:01:35 2002: read 16328 bytes
Mon Nov 4 17:01:36 2002: read 16328 bytes
Mon Nov 4 17:01:37 2002: read 16328 bytes
Mon Nov 4 17:01:38 2002: read 0 bytes
Mon Nov 4 17:01:39 2002: read 0 bytes
Mon Nov 4 17:01:40 2002: read 0 bytes
…6.5 minutes of the same…
Mon Nov 4 17:08:11 2002: read 0 bytes
Mon Nov 4 17:08:12 2002: read 0 bytes
Mon Nov 4 17:08:13 2002: read 16332 bytes
Mon Nov 4 17:08:14 2002: read 16332 bytes
Mon Nov 4 17:08:16 2002: read 16332 bytes

So it was going along just fine at 16328 bytes, and then all of a
sudden, for no apparent reason, the log appeared to be empty for about
6.5 minutes. Then, just as suddenly, it had 16332 bytes in it.
Checking the actual contents of the log entries, during the “dead
period,” seven new entries had been logged and looked fine, but this
raises the point that it does not appear to be load-related. We’ve also
seen this behavior at one time or another on just about every QNX system
that we have.

We can try to run some more experiments, but not sure where to go from
here…

lew

Some more test results: the 6.5 minute period during which the log appears to be
zero bytes long is exactly bracketed by a pair of log entries; i.e., we were
unable to read the log for six minutes and 35 seconds, but when we COULD read it
again, we found a pair of entries exactly six minutes and 35 seconds apart. It
appears that the log gets into some state, perhaps some certain size, as the
result of an event being logged, and cannot be read (even though more events are
being recorded) until some other event kicks it out of that state.

Murf

Lewis Donzis wrote:

Adam Mallory wrote:
Can you specify which OS version you’re using?

Sorry, it’s 6.2.0.

Also, do you have a test
case that demonstrates your issue, or at least detail some conditions that
your system is under, when this behaviour is observed.

Sort of. We really don’t know what causes it, so we just wrote a little
test program. Every second, it opens the log, reads it to the end
counting the number of bytes, closes the file, and prints the time and
the number of bytes found in the log. We ran it for a while on one of
our systems and here’s what happened:

Mon Nov 4 17:01:35 2002: read 16328 bytes
Mon Nov 4 17:01:36 2002: read 16328 bytes
Mon Nov 4 17:01:37 2002: read 16328 bytes
Mon Nov 4 17:01:38 2002: read 0 bytes
Mon Nov 4 17:01:39 2002: read 0 bytes
Mon Nov 4 17:01:40 2002: read 0 bytes
…6.5 minutes of the same…
Mon Nov 4 17:08:11 2002: read 0 bytes
Mon Nov 4 17:08:12 2002: read 0 bytes
Mon Nov 4 17:08:13 2002: read 16332 bytes
Mon Nov 4 17:08:14 2002: read 16332 bytes
Mon Nov 4 17:08:16 2002: read 16332 bytes

So it was going along just fine at 16328 bytes, and then all of a
sudden, for no apparent reason, the log appeared to be empty for about
6.5 minutes. Then, just as suddenly, it had 16332 bytes in it.
Checking the actual contents of the log entries, during the “dead
period,” seven new entries had been logged and looked fine, but this
raises the point that it does not appear to be load-related. We’ve also
seen this behavior at one time or another on just about every QNX system
that we have.

We can try to run some more experiments, but not sure where to go from
here…

lew

Yes it definately seems size related - what happens when you increase the
size of the log file?

In the meantime, I’ll take a peek.

-Adam
John A. Murphy <murf@perftech.com> wrote in message
news:3DC7B710.23A79C03@perftech.com

Some more test results: the 6.5 minute period during which the log appears
to be
zero bytes long is exactly bracketed by a pair of log entries; i.e., we
were
unable to read the log for six minutes and 35 seconds, but when we COULD
read it
again, we found a pair of entries exactly six minutes and 35 seconds
apart. It
appears that the log gets into some state, perhaps some certain size, as
the
result of an event being logged, and cannot be read (even though more
events are
being recorded) until some other event kicks it out of that state.

Murf

Lewis Donzis wrote:

Adam Mallory wrote:
Can you specify which OS version you’re using?

Sorry, it’s 6.2.0.

Also, do you have a test
case that demonstrates your issue, or at least detail some conditions
that
your system is under, when this behaviour is observed.

Sort of. We really don’t know what causes it, so we just wrote a little
test program. Every second, it opens the log, reads it to the end
counting the number of bytes, closes the file, and prints the time and
the number of bytes found in the log. We ran it for a while on one of
our systems and here’s what happened:

Mon Nov 4 17:01:35 2002: read 16328 bytes
Mon Nov 4 17:01:36 2002: read 16328 bytes
Mon Nov 4 17:01:37 2002: read 16328 bytes
Mon Nov 4 17:01:38 2002: read 0 bytes
Mon Nov 4 17:01:39 2002: read 0 bytes
Mon Nov 4 17:01:40 2002: read 0 bytes
…6.5 minutes of the same…
Mon Nov 4 17:08:11 2002: read 0 bytes
Mon Nov 4 17:08:12 2002: read 0 bytes
Mon Nov 4 17:08:13 2002: read 16332 bytes
Mon Nov 4 17:08:14 2002: read 16332 bytes
Mon Nov 4 17:08:16 2002: read 16332 bytes

So it was going along just fine at 16328 bytes, and then all of a
sudden, for no apparent reason, the log appeared to be empty for about
6.5 minutes. Then, just as suddenly, it had 16332 bytes in it.
Checking the actual contents of the log entries, during the “dead
period,” seven new entries had been logged and looked fine, but this
raises the point that it does not appear to be load-related. We’ve also
seen this behavior at one time or another on just about every QNX system
that we have.

We can try to run some more experiments, but not sure where to go from
here…

lew

I haven’t tried changing the size of the log file yet, but I did make an
addition to my little test program such that I can make a fixed size entry to
the log every second. If I write 1024 bytes to the log (via
slogb(0,0,data,1012)) the reads all fail after awhile; if I make entries of 512
bytes, every second read shows zero bytes; with entries of 256 bytes, every
fourth read shows zero. This sure sounds like the classic problem of two
pointers walking around a circular buffer — when the pointers are equal, is
the buffer empty or full? Looks like maybe a full buffer is being seen as
empty…

Murf

Adam Mallory wrote:

Yes it definately seems size related - what happens when you increase the
size of the log file?

In the meantime, I’ll take a peek.

-Adam
John A. Murphy <> murf@perftech.com> > wrote in message
news:> 3DC7B710.23A79C03@perftech.com> …
Some more test results: the 6.5 minute period during which the log appears
to be
zero bytes long is exactly bracketed by a pair of log entries; i.e., we
were
unable to read the log for six minutes and 35 seconds, but when we COULD
read it
again, we found a pair of entries exactly six minutes and 35 seconds
apart. It
appears that the log gets into some state, perhaps some certain size, as
the
result of an event being logged, and cannot be read (even though more
events are
being recorded) until some other event kicks it out of that state.

Murf

Lewis Donzis wrote:

Adam Mallory wrote:
Can you specify which OS version you’re using?

Sorry, it’s 6.2.0.

Also, do you have a test
case that demonstrates your issue, or at least detail some conditions
that
your system is under, when this behaviour is observed.

Sort of. We really don’t know what causes it, so we just wrote a little
test program. Every second, it opens the log, reads it to the end
counting the number of bytes, closes the file, and prints the time and
the number of bytes found in the log. We ran it for a while on one of
our systems and here’s what happened:

Mon Nov 4 17:01:35 2002: read 16328 bytes
Mon Nov 4 17:01:36 2002: read 16328 bytes
Mon Nov 4 17:01:37 2002: read 16328 bytes
Mon Nov 4 17:01:38 2002: read 0 bytes
Mon Nov 4 17:01:39 2002: read 0 bytes
Mon Nov 4 17:01:40 2002: read 0 bytes
…6.5 minutes of the same…
Mon Nov 4 17:08:11 2002: read 0 bytes
Mon Nov 4 17:08:12 2002: read 0 bytes
Mon Nov 4 17:08:13 2002: read 16332 bytes
Mon Nov 4 17:08:14 2002: read 16332 bytes
Mon Nov 4 17:08:16 2002: read 16332 bytes

So it was going along just fine at 16328 bytes, and then all of a
sudden, for no apparent reason, the log appeared to be empty for about
6.5 minutes. Then, just as suddenly, it had 16332 bytes in it.
Checking the actual contents of the log entries, during the “dead
period,” seven new entries had been logged and looked fine, but this
raises the point that it does not appear to be load-related. We’ve also
seen this behavior at one time or another on just about every QNX system
that we have.

We can try to run some more experiments, but not sure where to go from
here…

lew

Yes this is exactly the problem, it’s a false positive on an empty buffer.
I’ve fixed this issue and it should be in the next release. If you require
an experimental version with the fix, you’ll need to contact technical
support with your support plan, or your sales rep to organize it.

If you’re having issues getting things rolling let me know.

-Adam

John A. Murphy <murf@perftech.com> wrote in message
news:3DC7EC99.81371601@perftech.com

I haven’t tried changing the size of the log file yet, but I did make an
addition to my little test program such that I can make a fixed size entry
to
the log every second. If I write 1024 bytes to the log (via
slogb(0,0,data,1012)) the reads all fail after awhile; if I make entries
of 512
bytes, every second read shows zero bytes; with entries of 256 bytes,
every
fourth read shows zero. This sure sounds like the classic problem of two
pointers walking around a circular buffer — when the pointers are equal,
is
the buffer empty or full? Looks like maybe a full buffer is being seen as
empty…

Murf

Adam Mallory wrote:

Yes it definately seems size related - what happens when you increase
the
size of the log file?

In the meantime, I’ll take a peek.

-Adam
John A. Murphy <> murf@perftech.com> > wrote in message
news:> 3DC7B710.23A79C03@perftech.com> …
Some more test results: the 6.5 minute period during which the log
appears
to be
zero bytes long is exactly bracketed by a pair of log entries; i.e.,
we
were
unable to read the log for six minutes and 35 seconds, but when we
COULD
read it
again, we found a pair of entries exactly six minutes and 35 seconds
apart. It
appears that the log gets into some state, perhaps some certain size,
as
the
result of an event being logged, and cannot be read (even though more
events are
being recorded) until some other event kicks it out of that state.

Murf

Lewis Donzis wrote:

Adam Mallory wrote:
Can you specify which OS version you’re using?

Sorry, it’s 6.2.0.

Also, do you have a test
case that demonstrates your issue, or at least detail some
conditions
that
your system is under, when this behaviour is observed.

Sort of. We really don’t know what causes it, so we just wrote a
little
test program. Every second, it opens the log, reads it to the end
counting the number of bytes, closes the file, and prints the time
and
the number of bytes found in the log. We ran it for a while on one
of
our systems and here’s what happened:

Mon Nov 4 17:01:35 2002: read 16328 bytes
Mon Nov 4 17:01:36 2002: read 16328 bytes
Mon Nov 4 17:01:37 2002: read 16328 bytes
Mon Nov 4 17:01:38 2002: read 0 bytes
Mon Nov 4 17:01:39 2002: read 0 bytes
Mon Nov 4 17:01:40 2002: read 0 bytes
…6.5 minutes of the same…
Mon Nov 4 17:08:11 2002: read 0 bytes
Mon Nov 4 17:08:12 2002: read 0 bytes
Mon Nov 4 17:08:13 2002: read 16332 bytes
Mon Nov 4 17:08:14 2002: read 16332 bytes
Mon Nov 4 17:08:16 2002: read 16332 bytes

So it was going along just fine at 16328 bytes, and then all of a
sudden, for no apparent reason, the log appeared to be empty for
about
6.5 minutes. Then, just as suddenly, it had 16332 bytes in it.
Checking the actual contents of the log entries, during the “dead
period,” seven new entries had been logged and looked fine, but this
raises the point that it does not appear to be load-related. We’ve
also
seen this behavior at one time or another on just about every QNX
system
that we have.

We can try to run some more experiments, but not sure where to go
from
here…

lew

Great! We’ll follow up through sales/support. Thanks for the quick response!

Murf

Adam Mallory wrote:

Yes this is exactly the problem, it’s a false positive on an empty buffer.
I’ve fixed this issue and it should be in the next release. If you require
an experimental version with the fix, you’ll need to contact technical
support with your support plan, or your sales rep to organize it.

If you’re having issues getting things rolling let me know.

-Adam

John A. Murphy <> murf@perftech.com> > wrote in message
news:> 3DC7EC99.81371601@perftech.com> …
I haven’t tried changing the size of the log file yet, but I did make an
addition to my little test program such that I can make a fixed size entry
to
the log every second. If I write 1024 bytes to the log (via
slogb(0,0,data,1012)) the reads all fail after awhile; if I make entries
of 512
bytes, every second read shows zero bytes; with entries of 256 bytes,
every
fourth read shows zero. This sure sounds like the classic problem of two
pointers walking around a circular buffer — when the pointers are equal,
is
the buffer empty or full? Looks like maybe a full buffer is being seen as
empty…

Murf

Adam Mallory wrote:

Yes it definately seems size related - what happens when you increase
the
size of the log file?

In the meantime, I’ll take a peek.

-Adam
John A. Murphy <> murf@perftech.com> > wrote in message
news:> 3DC7B710.23A79C03@perftech.com> …
Some more test results: the 6.5 minute period during which the log
appears
to be
zero bytes long is exactly bracketed by a pair of log entries; i.e.,
we
were
unable to read the log for six minutes and 35 seconds, but when we
COULD
read it
again, we found a pair of entries exactly six minutes and 35 seconds
apart. It
appears that the log gets into some state, perhaps some certain size,
as
the
result of an event being logged, and cannot be read (even though more
events are
being recorded) until some other event kicks it out of that state.

Murf

Lewis Donzis wrote:

Adam Mallory wrote:
Can you specify which OS version you’re using?

Sorry, it’s 6.2.0.

Also, do you have a test
case that demonstrates your issue, or at least detail some
conditions
that
your system is under, when this behaviour is observed.

Sort of. We really don’t know what causes it, so we just wrote a
little
test program. Every second, it opens the log, reads it to the end
counting the number of bytes, closes the file, and prints the time
and
the number of bytes found in the log. We ran it for a while on one
of
our systems and here’s what happened:

Mon Nov 4 17:01:35 2002: read 16328 bytes
Mon Nov 4 17:01:36 2002: read 16328 bytes
Mon Nov 4 17:01:37 2002: read 16328 bytes
Mon Nov 4 17:01:38 2002: read 0 bytes
Mon Nov 4 17:01:39 2002: read 0 bytes
Mon Nov 4 17:01:40 2002: read 0 bytes
…6.5 minutes of the same…
Mon Nov 4 17:08:11 2002: read 0 bytes
Mon Nov 4 17:08:12 2002: read 0 bytes
Mon Nov 4 17:08:13 2002: read 16332 bytes
Mon Nov 4 17:08:14 2002: read 16332 bytes
Mon Nov 4 17:08:16 2002: read 16332 bytes

So it was going along just fine at 16328 bytes, and then all of a
sudden, for no apparent reason, the log appeared to be empty for
about
6.5 minutes. Then, just as suddenly, it had 16332 bytes in it.
Checking the actual contents of the log entries, during the “dead
period,” seven new entries had been logged and looked fine, but this
raises the point that it does not appear to be load-related. We’ve
also
seen this behavior at one time or another on just about every QNX
system
that we have.

We can try to run some more experiments, but not sure where to go
from
here…

lew