Duplicate printf output

Greetings all,

I am working with a co-worker on a problem where some times, but not all of the times, our printf() output is duplicated. When the problem occurs we always get the duplicate lines one after the other.

For example:
[size=75][color=darkred]Output line 1
Output line 1
Output line 2
Output line 2
Output line 3
Output line 4[/size]

We have a multi-threaded applications where some output is coming from one thread and other output is comfing from another thread. From the example above, lines 1, 3 and 4 are from thread A and line 2 is from thread B.

We are using round robin scheduling. So it is possible, and probable that the threads are pre-empting each other once a timeslice completes.

A co-worker has a theory that printf is not thread safe. That we are being preempted in the middle of the printf and is somehow leading to the duplicated output.

This just does not seem likily to me. First the QNX documentation indicates that printf() is thread safe. And considering that QNX’s main line of buisiness is a multi-threaded environment, I could not imagine something so common and simple as this occuring – but who knows.

I have added code to validate that we are not invoking the functions twice, by using a static global that gets printed and incremented in the print statement. So if it was truly called twice the globals value would increment.

I have also added code to display the thread id so we can confirm that this is truly coming from the same thread.

I am baffled by this.

Can anyone suggest other debugging methods? Has anyone else encoutered this before? Is my co-worker correct that printf() doesn’t work correclty in a multi-threaded environment?

I have replicated this problem with QNX on a power PC board, as well as with the x86 vmWare QNX simulation.

Thanks in advance for your time.

-=John

I never saw that behavior myself. It’s not 100% clear to me what the thread safety of printf is because it seems to conflict with other stuff I read ( don’t remember where ). Functions like fprintf, fwrite and so forth are thread safe, but the file descriptors are not. That means you can use fprintf in multiple threads but only if they are using different file descriptor. That seems to conflict with printf which can be view as fprintf( stdout,…). Maybe printf uses some mutex to protect usage of stdout, that I don’t know.

It should be easy to test, just wrap the printfs with mutex, if the problem goes away then printf isn’t really thread-safe. It the problem is there you may be dealing with some sort of memory corruption or buffer over/underrun.

I have seen problems like this befor, in a few guises.
Just when you think you have found an odd bug in qnx, you realise that it’s a programming error.
How are the threads started, what triggers the printf, is it a loop…

Try to re-create the problem with simple code.

I will put money on the problem being in your code, not in the thread safety of printf or other libc code.

I am not criticising you as a programmer, just basing my judgement on ?? (too many) years of experience (having made many simmilar errors) . :slight_smile:

Mario’s suggestion, wrap a mutex around printf, seems the right one. On a paranoid note, it is possible to be fooled by incrementing a value. It is easier to explain if you have more than one processor, but even without it, you could have code that works like this:

int count = 0;

sub()
{
int local_count;
local_count = count;
printf(“count: %d\n”,count);
local_count++;
count = local_count;
}

I don’t mean that you have ‘C’ code like this, just that the underlying machine code might function this way.
Since printf would cause a context switch, it would be easy for two threads to grab the count value, and
each increment it separately and save it back.

John,

I assume your printf’s are being sent to the console?

Can you try doing a re-direct to a file (ie ‘myapp > tmp’).

Something is triggering in my memory about doing multi-threaded printf’s to the console and seeing duplicate messages. Once I re-directed to a file the issue went away.

I don’t think the problem is printf but rather some weird interaction with the actual console (dev-con or pterm) itself when threads are switching context. I never was able to fully explain what I saw other than to use the re-direct to a file to remove the problem (which may I suppose effectively wrap printf in a mutex as part of the file re-direct).

Tim

Thanks all.

Here is what I have found out so far.

1. Mutex the printf()

Mutex wrapping the printf() does indeed solve the problem. However, this seems like a really heavy handed approach. I mean really, grabbing a mutex for each printf()? A kernal call for each printf()? Especially since it is documented as thread safe.

I know… I know… I am being snippy.

2. Triplicates

I checked another scenario where 3 threads where attempting to display output and now sometimes I would see duplicates and other times I would see triplicates.

Moreover, one really strange interaction was that it looked like one of the output lines was a partial output from the previous two combined. For example:

------------------------------------------------------------Line1: Test Status: unixNatEng_4T1_Test : count 0 Line2: Swerr: ngdEoGoSignalWait%ERR_COND_WAIT_FAILED: 2-[7] pthread_cond_wait() failed. (0x16) Invalid argument Line3: ED: 2-[7] pthread_cond_wait() failed. (0x16) Invalid argument------------------------------------------------------------

I know its not very clear from above since this is not a monospaced font, and code doesn’t seem to make it monospaced (at least not when I previewed it).

Anyway… line-3 looks to be the tail of line-2. Morevoer, if you overlaid line-1 on line-2 and then added one more byte for the NULL termination, then the next string would begeing with ED

3. Console output

Yes, this is console output. I will attempt the redirection test. Thanks for the tip.

4. Incrementing value

I understand your point about two threads grabbing the count value. I think I have it covered. The code is as follows:

------------------------------------------------------------[code] …
{
static volatile int count;

    atomic_add(&count, 1);

    printf("%d-[%d] %s", count, pthread_self(), "the string");
}[/code]------------------------------------------------------------

So I get output like:

1-[9] The string

What I observed was that for the same thread the count remained the same such as:

1-[3]ngdCmTltStateMachine(): error_code = ngdCmInitPassed(p_elem)
1-[3]ngdCmTltStateMachine(): error_code = ngdCmInitPassed(p_elem)

From the above, since the pthread_self() identifes the source as thread 3 each time, and this is the only output for the “ngdCmInitPassed” routine I am fairly confident that its not being called by two separate threads.

Thanks all

-=John

Just a comment about 1. When you call fprintf(), your output might be buffered, but ultimately you end up with a message pass to some process, either the file system, or the console driver, and a message pass already involves a kernel call/context switch. This is just the way the OS works. And it does quite well. If you are curious enough you might want to actually time this.

And clearly your results show that printf, is not thread safe. This is a little surprising.

I would not worry about the overhead of a mutex for each printf ;-)

However the fact that printf is appears not to be thread safe is strange, given that it’s documented otherwise.

More test results for those that are interested:

1. Flush after print

All printf() and vprintf() where changed to have fflush(stdout)
immediately following them.

[color=darkred]i.e. printf(…); fflush(stdout);

This did not resolve the problem. Moreover, if you think about it, its not suprising that it did not resolve the problem. If this is a buffe/premption issue and you consider that you could be preempted after the printf() and before the fflush() we would always have this window of failure.

2. Change printf/vprintf to fprintf/vfprintf(stderr)

This caused even more garbled output since stderr isn’t buffered at all. I observed that in some cases when two threads printed a string “at the same time” that the output would be interleaved. No duplicates though.

For example:

[color=darkred]S1w-e[r3r]:ngdCmTltStateMachine (ngdEoGoSignalWait)%:ERR_COND_WAIT_FAILED :error_code = ngdCmInitPassed(p_elem) 1-[6] pthread_cond_wait() failed. (0x16) Invalid argument

is the missing of the following two outputlines:

[color=darkred]Swerr: ngdEoGoSignalWait %:ERR_COND_WAIT_FAILED: 1-[6] pthread_cond_wait() failed. (0x16) Invalid argument

and

[color=darkred]1-[3]ngdCmTltStateMachine():error_code = ngdCmInitPassed(p_elem)

3. Change printf/vprintf to fprintf/vfprintf(some file)

In this case the output was dumped into a file. This worked perfectly. No duplication and no interleaving. The output was exactly as desired.

4. Change stdout to unbuffered

In some online reading I stumbled across the setbuf() command. It can use it to change the buffering of a stream. Another reference online indicated that output to a file, such as fprintf() was unbuffered. So I used the setbuf() command to make stdout unbuffered.

This resulted in output the same as printing to stderr, another unbuffered output. This makes me think that even writing to a file has to be buffered at a certain level because the output to file was not interleaved or garbled.

5. Redirecting Output
I tried Tim’s suggestion of redirecting the output such as:

[color=darkred]program.exe > output.txt

This ‘worked’. The output was not duplicated, nor was it garbled.

Since the underlying console processes are provided as part of the QNX OS shouldn’t this be done correctly there?

John,

Apprently we posted at almost the exact same time. So I am re-editing my post to reflect your findings. Here’s what I think happens:

printf’s output must be buffered someplace for display purposes. It wouldn’t make sense to malloc space for that inside printf itself since it would never know when to free it since it obviously does a message send to a dev-conn/pterm for the real display. So I suspect the final recipient who sends to those processes must provide that space.

So I suspect that the provider of the buffer for printf isn’t thread safe even tho printf itself is. If you run the debugger you can see printf ends up calling a LOT of other deeper library routines. Since you can use a variety of libraries (Dinkum, GNU, Intel etc) it’s almost impossible for printf to guarantee everything it calls is threadsafe. It can only state that the few lines of code in it’s own routine is thread safe.

Tim

P.S. One thing I wanted to ask you. In your Makefile, what kinds of compile/link options are you using? I know in mine I always use -D_PTHREADS=1 for multi-threading. I believe this uses different libraries at link time that are thread safe. Maybe this is your issue in that your not linking the thread safe libraries for printf.

Hi Tim,

Redirecting the output to a file isn’t really feasable for us because this is an interactive system.

I am thinking that I might want to add a printf redirection layer. There are some reasons that the printf/mutex scenario won’t always work for us.

I am thinking of having a ‘printf-monitor’ thread whose sole purpose is to multiplex printfs from all the other threads, and behave as the single output source to the console.

I am very green to input/output redirection, but I think we might be able to do this by steps such as:

  1. During application init duplicate stdout to another file descriptor (newStdout), close the old stdout, and then reopen stdout for reading and writing.

I think at this point any printf’s will send to stdout, but instead of being sent out, it will be placed in the stdout ‘stream’.

  1. in the printf-monitor thread, read from the stdout stream and forward it via fprintf() to the newStdout stream. If this works I could even log the stuff to a file as well as output to the console, etc.

I kind of see this as an internal pipe between threads, as opposed to a pipe between processes.

Thanks for your insights Tim, and everyone else too.

-=John

Before going to that extent which I don’t think will solve the problem. You might want to get in touch with QNX support to confirm that printf is indeed not thread safe.

To have a own thread for printf is an idea too i think, so just open a proxy (err pulse) and trigger it from your other threads, output the pid (err rcvid) to see what thread it came from.

John,

Did you confirm that your using the -D_PTHREADS=1 in your makefile for the compiler and linker?

There most definitely is some non thread safe library code shipped with the compilers for QNX. The reason for that is speed purposes since many apps are not threaded and need to run as fast as possible.

My understanding is the compiler default is to use non-thread safe libraries which is why you need that option.

Also, you definitely do NOT need to go to all the work you described. Why not just change all your printf’s to fprintf’s (since you know they worked) and open STDOUT as your file in fopen. That will save you a LOT of effort and give you the thread safety you need and the text will go to the screen just as it does with printf.

But you should first confirm the compiler options to make sure your using thread safe libraries.

Tim

use fprintf with the same handle is NOT thread safe.

Mario,

That’s interesting. The doc’s don’t mention there is a problem since they say fprintf is thread safe.

Plus John’s findings (reported as #3) a couple of posts above indicated that he didn’t have any problems at all when he changed to using fprintf.

How would you characterize it as being not thread safe? I understand that both threads sharing an fd would write to the same file, but I can’t see how they’d corrupt each others output to that file if your using thread safe libraries because the buffers deep in the library are wrapped in a mutex (you can see this in the debugger).

Tim

I don’t recall where I read this, I think someone from QNX made a post on the newsgroup. Anyway he was pretty clear that although fprintf is thread safe, usage of a file descriptor wasn’t. Now I could be wrong about this, I couldn’t find anything specific in the doc.

For example fprintf could be thread safe, but what about doing a fprintf in a thread and a ftell in another one , a fwrite in another, and so forth?

Maybe this was in the 6.2 days, maybe it doesn’t apply to 6.3.

Mario,

What I expect is that each of those operations (fprintf, fwrite, ftell) would be atomic in that they would complete without any thread corruption of their data.

So for example if you have Thread A and ThreadB sharing an fd and Thread A does an fprintf of ‘Hello World’ and Thread B does an fwrite of ‘Goodbye All’ that there would appear in the file:

Hello World
Goodbye All

These could be reversed in order obviously depending on which thread ran first. What I would NOT expect is that there would be jumbling of the individual data so that you might get ‘HGeloodlobye WAorllld’.

The reason I believe that is because when you do the fopen call you only get 1 actual message channel to Fsys, not one per thread. So even tho both threads may send, only one can possibly send on the channel to Fsys since Fsys will only have 1 thread listening to your app. So the messages would get processed completely rather than being jumbled.

Of course it’s not smart programming to share fd’s since one thread can close the fd and pull the rug out from under the other thread (or worse, close and open another file and suddenly the other thread starts writing to the wrong file as I demonstrated in a test progam a couple of months ago)

Tim

Tim,

I want to clarify a few things:

fprintf(stderr…) this was extra bad. It caused the output to be mixed on a per character basis.

fprintf(stdout…) this behaved the same as a standard printf(), with some output lines being duplicatated and others repeating just a tail section (as mentioned in earlier posts.

fprintf(some file…) this worked fine.

The theory we are working with is that its one of the processes underneath us that might not be thread safe – or maybe it is the library itself. Anyway, since this seems to occur around context switches I think the pipe idea might work because the pipe writes are atomic up to ‘_PC_PIPE_BUF’ (as per QNX documentation qnx.com/developers/docs/6.3. … f.html#id9)

So the strings stuffed into the pipe would be atomicly added up to that limit. Moreover documentation indicates that these values can be changed via the fpathconf() command.

Anyway, I think this would take care of the “multiple threads writing” aspect, and then all of this would be funneled into a single thread doing printf() to the console so there wouldn’t be any problem with multiple threads accessing the console output simultaneously.

I have quickly prototyped this but I haven’t plugged into my system yet. I just did a quick, can I redirect my own output test.

I also checked and we do not have the -D_PTHREADS=1 our makefile. (In case you’re smacking your head thinking how on earth could I have missed this… I inherited a lot of the makefile compilation and linking instructions. I would gladly appreciate any documentation references that might educate me more in this particular area.).

I tried recompiling with -D_PTHREADS=1 in my [qcc] compilation command line and it did not resolve the duplicate output problem.

We are also in touch with the QNX support team but they are “swamped” right now. So we are also continuing furhter investigations on our own in parallel.

Thanks all for your time, help and comments.

-=John

If you are interested in the quickly slapped together prototype code, it is as follows below. I will plug it into the actual application some time tomorrow to see how it impacts things.

[size=75][color=brown][code]#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

int gNgd_Printf_Pipe[2];
int gNgd_Console_Stdout_Fd;

FILE *gNgd_Old_Stdout;
FILE *gNgd_Console_Stdout_Stream;

int printfRedirectInit(void)
{

/* create the printf redirection pipe */
if (-1 == pipe(gNgd_Printf_Pipe)) {
    perror("Pipe failed");
    return (1);
}


/*
 * turn the pipe's read side into a stream referenced by gNgd_Old_Stdout,
 * output sent stdout will be redirected into the pipe, and then this is
 * where we read that redirected data from.
 */
gNgd_Old_Stdout = fdopen(gNgd_Printf_Pipe[0], "r");
if (NULL == gNgd_Old_Stdout) {
    perror("fdopen(gNgd_Printf_Pipe[0] failed");
    return (1);
}


/*
 * duplicate stdout to another file descriptor so we still have access to
 * console output by way of fprintf() to the new file descriptor
 */
gNgd_Console_Stdout_Fd = dup(STDOUT_FILENO);
if (-1 == gNgd_Console_Stdout_Fd) {
    perror("Dup failed");
    return (1);
}


/* change the new stdout output file descriptor into a stream */
gNgd_Console_Stdout_Stream = fdopen(gNgd_Console_Stdout_Fd, "w");
if (NULL == gNgd_Console_Stdout_Stream) {
    perror("fdopen failed");
    return (1);
}



fprintf(gNgd_Console_Stdout_Stream,
        "This is output is from the dup'd fd\n");


/* close the original STDOUT file descriptor */
if (close(STDOUT_FILENO) != 0) {
    perror("close(STDOUT_FILENO) failed");
    return (1);
}

/*
 * duplicate the pipe input side, it uses the first available file
 * descriptor which is STDOUT because we just closed it.
 */
if (dup(gNgd_Printf_Pipe[1]) != STDOUT_FILENO) {
    perror("dup(gNgd_Printf_Pipe) failed");
    return (1);
}

/*
 * now close the pipe's FD, STDOUT is now duplicated to it and we don't
 * need the pipes fd anymore
 */
if (close(gNgd_Printf_Pipe[1]) != 0) {
    perror("close(gNgd_Printf_Pipe) failed");
    return (1);
}

}

void *thread1 (void *p_arg)
{
p_arg = p_arg;

printf("Sending into the pipe with a printf\n");



/* WARNING this blocks if no newline is presented, so we need to make this
   newline friendly */

fprintf(stdout, "Sending into the pipe with fprintf()\n");


sleep(1);

}

void *printfMonitor (void *p_arg)
{
char buffer[300] = {0};

while (fgets(buffer, sizeof(buffer), gNgd_Old_Stdout)) {
    fprintf(gNgd_Console_Stdout_Stream, "Redirected: %s", buffer);
}

}

#define EXIT_ON_ERROR(val, args…)
val = args;
if (val) {
printf("%s failed\n", #args);
perror("FAIL MSG= ");
exit (1);
}

int main (int argc, char *argv[])
{
pthread_attr_t tattr;
pthread_t tid;
int rc;

EXIT_ON_ERROR(rc, pthread_attr_init(&tattr));

EXIT_ON_ERROR(rc, pthread_attr_setschedpolicy(&tattr, SCHED_RR));


printf("Starting %s\n", __FUNCTION__);


printfRedirectInit();


EXIT_ON_ERROR(rc, pthread_create(&tid, &tattr, printfMonitor, NULL));

EXIT_ON_ERROR(rc, pthread_create(&tid, &tattr, thread1, NULL));

EXIT_ON_ERROR(rc, pthread_join(tid, NULL));


fprintf(gNgd_Console_Stdout_Stream, "Ending %s\n", __FUNCTION__);
sleep(1);
return (0);

}
[/code][/size]