Timing problems in a multi-threaded program

I am working on a multi-threaded automation system. I wrote a simple multi-threaded program consisting of only two threads: one generates the sensor data and the other logs the sensor data to a solid-state disk.

The program runs like this: within a main loop, the first thread is activated and runs to completion, then the main loop activates the second thread, which also runs to completion. The main loop has a 20 ms period, while the execution times of the two threads are usually below 2 ms and 8 ms respectively.

The problem is this: when I check the time at which the first thread generates new data in each main loop, I find that as the amount of logged data increases (even though the total execution time should still be much less than 20 ms), the timing intervals of the first thread vary, which I cannot understand.

Attached is my sample code. Thanks.

Eric

Eric,

I’m afraid I don’t understand your question/problem.

I took your code and compiled it on my machine, which has a hard drive. I added a couple more couts to the code to help show what’s going on. The places I added them were in your main loop and the EveryRun() functions of the 2 thread classes. So it looks like:


    for (m_nCount = 0; ; m_nCount++)
    {
        MsgReceivePulse(m_chMain, &pulse, sizeof(_pulse), NULL);
        cout << "Main Start " << ::GetTime() << endl;

        if (_imu.IsReady())
        {
            _imu.SendPulse(PRIORITY_IMU, PULSECODE_DATA);
            _imu.WaitForEvent(nsecIMU);
        }

        if (_dlg.IsReady())
        {
            _dlg.SendPulse(PRIORITY_DLG, PULSECODE_DATA);
            _dlg.WaitForEvent(nsecDLG);
        }

        cout << "Main End " << ::GetTime() << endl;
    }


    int clsIMU::EveryRun()
    {
        cout << "IMU Start " << ::GetTime() << endl;
        pthread_mutex_lock(&m_mtxIMU);

        // Fill the next slot of the shared buffer with dummy sensor values.
        // m_imuData must hold at least 50 entries, since clsDLG drains it
        // only every 50th cycle.
        m_imuData[m_nIMU].a = 1.0;
        m_imuData[m_nIMU].b = 1.0;
        m_imuData[m_nIMU].c = 1.0;
        m_imuData[m_nIMU].p = 1.0;
        m_imuData[m_nIMU].q = 1.0;
        m_imuData[m_nIMU].r = 1.0;
        m_imuData[m_nIMU].acx = 1.0;
        m_imuData[m_nIMU].acy = 1.0;
        m_imuData[m_nIMU].acz = 1.0;
        m_imuData[m_nIMU].magx = 2.0;
        m_imuData[m_nIMU].magy = 2.0;
        m_imuData[m_nIMU].magz = 2.0;
        m_imuData[m_nIMU++].tmp = 3.0;

        pthread_mutex_unlock(&m_mtxIMU);
        cout << "IMU End " << ::GetTime() << endl;

        return TRUE;
    }

    int clsDLG::EveryRun()
    {
    //	cout << "[DLG] " << m_nCount << endl;
        cout << "DLG Start " << ::GetTime() << endl;
        // Only log every 50th cycle (once per second); note this early
        // return also skips the "DLG End" print below.
        if (m_nCount % 50 != 0)
            return 1;

        // Take a snapshot of the IMU buffer under the mutex, then reset it.
        if (_imu.m_nIMU != 0)
        {
            pthread_mutex_lock(&_imu.m_mtxIMU);
            m_nIMU = _imu.m_nIMU;
            ::memcpy(m_imu, _imu.m_imuData, m_nIMU * sizeof(IMUPACK));
            _imu.m_nIMU = 0;
            pthread_mutex_unlock(&_imu.m_mtxIMU);
        }

        // start writing data to the disk
        m_pfLogb = ::fopen(m_szFileb, "a");

        int i;
        for (i = 0; i < m_nIMU; i++)
        {
            if (m_pfLogb != NULL)
                ::fwrite(&m_imu[i], sizeof(IMUPACK), 1, m_pfLogb);
        }
        m_nIMU = 0;

        if (m_pfLogb)
        {
            ::fclose(m_pfLogb);
            m_pfLogb = NULL;
        }
        cout << "DLG End " << ::GetTime() << endl;

        return TRUE;
    }

I then ran this code and I got the following output:

[IMU] start
[DLG] Start
Main Start 0.0215247
IMU Start 0.0218264
IMU End 0.0218545
DLG Start 0.0218793
DLG End 0.0221828
Main End 0.0222096
Main Start 0.0415208
IMU Start 0.041585
IMU End 0.0416093
DLG Start 0.0416348
Main End 0.0416577
Main Start 0.0615115
IMU Start 0.061546
IMU End 0.0615685
DLG Start 0.0615926
Main End 0.0616151
Main Start 0.0816569
IMU Start 0.0826545
IMU End 0.0826919
DLG Start 0.0827186
Main End 0.0827411
Main Start 0.101497
IMU Start 0.101521
IMU End 0.101543
DLG Start 0.101566
Main End 0.101588
Main Start 0.12149
IMU Start 0.121517
IMU End 0.121539
DLG Start 0.121563
Main End 0.121585
Main Start 0.141484
IMU Start 0.141507
IMU End 0.141528
DLG Start 0.141552
Main End 0.141574
Main Start 0.161477
IMU Start 0.161505
IMU End 0.161527
DLG Start 0.161551
Main End 0.161573
Main Start 0.18147
IMU Start 0.181496
IMU End 0.181518
DLG Start 0.181541
Main End 0.181563
Main Start 0.201464
IMU Start 0.201498
IMU End 0.20152
DLG Start 0.201544
Main End 0.201566
Main Start 0.221456
IMU Start 0.221479
IMU End 0.221503
DLG Start 0.221527
Main End 0.221549
Main Start 0.241449
IMU Start 0.241475
IMU End 0.241497
DLG Start 0.24152
Main End 0.241542
Main Start 0.261442
IMU Start 0.261466
IMU End 0.261487
DLG Start 0.261511
Main End 0.261533
Main Start 0.281436
IMU Start 0.281476
IMU End 0.281499
DLG Start 0.281523
Main End 0.281546
Main Start 0.301428
IMU Start 0.301452
IMU End 0.301473
DLG Start 0.301497
Main End 0.301519
Main Start 0.321421
IMU Start 0.321448
IMU End 0.32147
DLG Start 0.321494
Main End 0.321516
Main Start 0.341416
IMU Start 0.341478
IMU End 0.341501
DLG Start 0.341526
Main End 0.341548
Main Start 0.361407
IMU Start 0.361431
IMU End 0.361453
DLG Start 0.361476
Main End 0.361499
Main Start 0.3814
IMU Start 0.381427
IMU End 0.381448
DLG Start 0.381472
Main End 0.381494
Main Start 0.401394
IMU Start 0.40142
IMU End 0.401441
DLG Start 0.401465
Main End 0.401487
Main Start 0.421387
IMU Start 0.421413
IMU End 0.421435
DLG Start 0.421459
Main End 0.421481
Main Start 0.44138
IMU Start 0.441405
IMU End 0.441427
DLG Start 0.441454
Main End 0.441477
Main Start 0.461373
IMU Start 0.461399
IMU End 0.461421
DLG Start 0.461445
Main End 0.461467
Main Start 0.481366
IMU Start 0.481395
IMU End 0.481416
DLG Start 0.48144
Main End 0.481462
Main Start 0.501359
IMU Start 0.501382
IMU End 0.501404
DLG Start 0.501427
Main End 0.501449
Main Start 0.521352
IMU Start 0.521379
IMU End 0.521401
DLG Start 0.521425
Main End 0.521447
Main Start 0.541345
IMU Start 0.541383
IMU End 0.541406
DLG Start 0.54143
Main End 0.541453
Main Start 0.561338
IMU Start 0.561367
IMU End 0.561389
DLG Start 0.561412
Main End 0.561435
Main Start 0.581331
IMU Start 0.581355
IMU End 0.581377
DLG Start 0.5814
Main End 0.581422
Main Start 0.601325
IMU Start 0.601375
IMU End 0.601397
DLG Start 0.601422
Main End 0.601444
Main Start 0.621317
IMU Start 0.621342
IMU End 0.621364
DLG Start 0.621387
Main End 0.62141
Main Start 0.64131
IMU Start 0.641336
IMU End 0.641358
DLG Start 0.641382
Main End 0.641405
Main Start 0.661304
IMU Start 0.661327
IMU End 0.661348
DLG Start 0.661375
Main End 0.661397
Main Start 0.681297
IMU Start 0.681326
IMU End 0.681348
DLG Start 0.681372
Main End 0.681394
Main Start 0.70129
IMU Start 0.701313
IMU End 0.701335
DLG Start 0.701358
Main End 0.70138
Main Start 0.721283
IMU Start 0.721316
IMU End 0.721338
DLG Start 0.721361
Main End 0.721384
Main Start 0.741276
IMU Start 0.741303
IMU End 0.741324
DLG Start 0.741348
Main End 0.74137
Main Start 0.761269
IMU Start 0.761292
IMU End 0.761314
DLG Start 0.761337
Main End 0.761359
Main Start 0.781273
IMU Start 0.781539
IMU End 0.781572
DLG Start 0.781598
Main End 0.78162
Main Start 0.801255
IMU Start 0.801279
IMU End 0.801301
DLG Start 0.801324
Main End 0.801346
Main Start 0.821248
IMU Start 0.821275
IMU End 0.821297
DLG Start 0.82132
Main End 0.821342
Main Start 0.841244
IMU Start 0.841303
IMU End 0.841326
DLG Start 0.841351
Main End 0.841374
Main Start 0.861235
IMU Start 0.861269
IMU End 0.861292
DLG Start 0.861316
Main End 0.861338
Main Start 0.881227
IMU Start 0.881256
IMU End 0.881278
DLG Start 0.881302
Main End 0.881328
Main Start 0.90122
IMU Start 0.901244
IMU End 0.901265
DLG Start 0.901289
Main End 0.901311
Main Start 0.921214
IMU Start 0.921241
IMU End 0.921263
DLG Start 0.921287
Main End 0.921309
Main Start 0.941207
IMU Start 0.941232
IMU End 0.941254
DLG Start 0.941278
Main End 0.9413
Main Start 0.9612
IMU Start 0.961226
IMU End 0.961247
DLG Start 0.961271
Main End 0.961293
Main Start 0.981193
IMU Start 0.981217
IMU End 0.981238
DLG Start 0.981262
Main End 0.981285
Main Start 1.00119
IMU Start 1.00122
IMU End 1.00124
DLG Start 1.00126
Main End 1.00129
Main Start 1.02118
IMU Start 1.0212
IMU End 1.02122
DLG Start 1.02125
DLG End 1.02149

So looking at the start of the output I see:

Main Start 0.0215247
IMU Start 0.0218264
IMU End 0.0218545
DLG Start 0.0218793
DLG End 0.0221828
Main End 0.0222096

which shows your main loop and the 2 threads completing in about 1 ms total. That’s far quicker than you expected.

Then there’s a 20 ms delay (0.0415208 − 0.0215247) until your main thread wakes up again, and I then see:

Main Start 0.0415208
IMU Start 0.041585
IMU End 0.0416093
DLG Start 0.0416348
Main End 0.0416577

This time your DLG code exits early, so it never performs the write (not sure why you are doing that). Again though, it all completes in about 1 ms.

So what is it exactly that you are seeing that you don’t understand?

All the output I see shows the main thread waking every 20 ms and running 2 other threads which themselves run in about 1 ms of time.

Tim

Sorry that I failed to make my problem clear. I used the printed time information to derive the intervals between each timestamp and the previous one. I found that the intervals vary unacceptably once the data-logging thread is started. Attached are two figures of the time intervals in the two cases, without DLG and with DLG. I set the DLG thread to run every 50th cycle, i.e. to log data once per second.
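
The interval computation amounts to something like this (a minimal sketch, not the exact script I used):

    // Sketch: read the captured program output and print the gap between
    // successive "IMU Start" timestamps; ideally every gap is near 20 ms.
    // "run.log" is just an example name for the captured output.
    #include <fstream>
    #include <iostream>
    #include <string>

    int main(int argc, char *argv[])
    {
        std::ifstream log(argc > 1 ? argv[1] : "run.log");
        std::string line;
        double prev = -1.0;

        while (std::getline(log, line))
        {
            if (line.compare(0, 10, "IMU Start ") != 0)
                continue;                            // skip non-IMU lines
            double t = std::stod(line.substr(10));   // timestamp in seconds
            if (prev >= 0.0)
                std::cout << (t - prev) * 1000.0 << " ms" << std::endl;
            prev = t;
        }
        return 0;
    }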

Eric

Eric,

So if I read those graphs right, you ran for about 300 seconds total (5 minutes), then graphed the difference between each pair of printed timestamps to determine how many ms had passed since your thread last ran. Is that correct?

Then you observed that, with the logging thread running, the interval occasionally varied far more than expected, which is what you are asking about.

Assuming that’s all true, I ran again for 5 minutes on my machine (a 4-year-old desktop PC with a hard drive), printing out the same data as before about when each thread starts/stops. I then looked at my data, and nowhere in my 5 minutes of data do I ever see the main + 2 threads take more than 1 ms to complete.

So that leads me to ask about the following things:

  1. What kind of setup are you running, hardware-wise (CPU, RAM, CF card type)?
  2. Are you redirecting the output to a file (i.e. ‘onboardThreads > tmp’) or printing to the screen? Whichever one you are doing, be aware that the very fact you are sending output someplace costs a fraction of each period (though it should definitely be less than 1 ms).
  3. Are you developing using the Momentics suite? If you are, there are excellent tools in Momentics that can track where every last CPU cycle goes for every process, so you can see exactly what causes the occasional delay.

Since I don’t see anything like what you do with a hard drive, I suspect that the disk driver isn’t optimally configured for the CF card. It may not be using DMA mode, falling back to PIO mode instead, which dramatically increases the CPU load of the disk driver.

Quick Test to Try:

If you are running Photon, watch the CPU usage graph. If you are not running Photon, run hogs at a 1-second interval.

Then copy a large file (at least 10 megs in size) and see if the CPU usage jumps really high for several seconds. If it does, it means your disk driver is not running optimally.
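
Something like this from a shell on the target (the paths are just examples; see ‘use hogs’ for the interval option on your QNX version):

    hogs &                        # watch per-process CPU usage during the copy
    cp /tmp/bigfile /cf/bigfile   # any file of at least 10 megs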

Tim

Actually, I care more about the points in time when the IMU thread starts executing. The IMU thread should start just after the main loop starts, so the intervals should be almost identical as long as the DLG thread finishes within the current main loop. So I have no idea where the variations in the intervals come from. Is it because the DLG thread cannot finish within 20 ms? I have put a timer on each thread: if DLG cannot finish within 5 ms, it may continue to consume CPU time as long as the previous IMU thread has finished. Even so, when the 20 ms is reached, the main loop will definitely send an activation pulse to start the IMU thread, so that should not make the IMU start time vary so greatly.

PC104 (500 MHz), 256 MB RAM, this link: diamondsystems.com/products/flashdisk

I print the time to the screen. The time this consumes should not affect the intervals, since the variations do not follow any specific trend or pattern.

Yes, I am using the IDE too. Could you tell me the name of this tool in the debug environment?

The solid-state disk claims no driver is needed. It is probably not optimally configured. I set it to DMA mode, but with no good effect.

And I will do the CPU usage test tomorrow. Thanks.

Eric

Eric,

The first thing you need to do is use the instrumented kernel (if you are not already doing so).

So in your build file (the one that creates your .boot file) you need to replace the part that says “PATH=/proc/boot procnto” with “PATH=/proc/boot procnto-instr”. Then create a new boot image (.boot file) and put it on your target.
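
For example, a build file might contain something like this (your build file will differ; the only change is the procnto line):

    # Illustrative x86 build file excerpt for mkifs
    [virtual=x86,bios +compress] boot = {
        startup-bios
        # was: PATH=/proc/boot procnto
        PATH=/proc/boot procnto-instr
    }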

Then in the IDE, go to your “Target Navigator” view, right-click, and select ‘Kernel Events Tracing’. You then get to fill in some things like how long you want to trace for and what to trace (I’d avoid interrupts/kernel calls, since those aren’t likely your issue and they make the log file large).

Run your program for however long you need to reproduce the problem (maybe 1-2 minutes) and let the events get traced.

Then in the IDE, go to the QNX System Profiler perspective. At this point you have to start digging into what was logged. There are ways to filter out what you don’t want so you can see what you do want. Far too much to explain here, so you’ll have to try it to see what I mean. But you can literally see every message and pulse, how long each process/thread ran, etc.

Tim

Thanks, Tim. I have configured the system for the instrumented kernel. One point I am confused about: I want to start logging only after I have run the debugger for some time. Currently, the log file generated does not cover the time period I want. In other words, I just want to log after I have started the program being debugged. How should I do this?

Eric

Eric,

If you only want to log while your program is running, then you don’t use the IDE.

Instead in your code, add a line of:

TraceEvent(_NTO_TRACE_START);

at the start of your code and a line of:

TraceEvent(_NTO_TRACE_STOP);

at the end.

Check out TraceEvent in the helpviewer/docs for more info on how to filter out some events (if you want to do so).

Basically, the TraceEvent() calls let you do everything you can do from the IDE, but controlled by your program instead, so you can turn tracing on just when you need it.

Note: You’ll need to run tracelogger as a daemon process (tracelogger -d &) before starting your program when logging in this manner.
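
Putting it together, a minimal sketch (run_main_loop() is just a placeholder for your actual loop):

    // Sketch: trace kernel events only while this program runs.
    // Requires the instrumented kernel (procnto-instr) and a running
    // "tracelogger -d &" as described above.
    #include <sys/trace.h>

    int main(void)
    {
        TraceEvent(_NTO_TRACE_START);   // begin logging kernel events

        run_main_loop();                // placeholder: your 20 ms loop goes here

        TraceEvent(_NTO_TRACE_STOP);    // stop logging before exit
        return 0;
    }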

Tim

Hi Tim, I have done this test. I copied a file of more than 30 MB to the target while the target was running hogs. devb-eide clearly consumes a great deal of CPU, varying from 20% to more than 50%. So may I say that the CF card is not optimally configured? And what configuration changes should I consider in my case? Thanks.

Eric

devb-eide is not using DMA; it’s using the CPU to write to the device.

Thanks, I have disabled the DMA mode. The start script for the CF card is:

devb-eide blk automount=hd0t77:/ eide nobmstr &

I am wondering what other configurations need consideration :slight_smile:

Eric

Eric,

Actually, you WANT DMA mode enabled. That’s why your writes are so slow, as Mario and I pointed out.

The commands I use are pretty much the same as yours, except I make sure to pass everything to devb-eide BEFORE I start passing arguments to blk, since I am not sure anything afterwards gets picked up. I.e. (my startup):

devb-eide eide noslave blk cache=2m,automount=hd0t79:/ &

I just took a look at the PC104 board spec link you posted above. According to the specs:

"An enhanced set of I/O ports is provided to support any application’s requirements, including 10/100Mbps Ethernet, UDMA-33 IDE, parallel port, PS/2 keyboard and mouse ports, and 4 USB 1.1 ports. The board also has 4 16450-compatible RS-232 serial ports. The watchdog timer provides protection from software crashes and is programmable for delays up to 2 seconds. "

The board has UDMA-33. That’s awfully old technology; I can’t remember the last time I used a PC/PC104 board with that (probably around 1999-2000). The UDMA modes for UDMA-33 are 0-2 (out of 6). You’ll probably have to specify the UDMA mode to devb-eide (i.e. add udma=2 to your devb-eide line) to get it to use it.
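
For example, combining your line with the ordering fix and the UDMA mode (the cache size is just what I use):

devb-eide eide udma=2 blk cache=2m,automount=hd0t77:/ &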

Personally I’d try to get away from that PC104 board if at all possible. There are lots of other PC104 board suppliers out there (we get ours from Ampro) with much newer hardware support.

Tim

Here is the output from slogger about devb-eide:

Time Sev Major Minor Args
Apr 27 23:45:18 3 17 0 Range check failed (IO) - Dev 571 - Vend 1106 - Class 1018e - Addr 170 - Size 8
Apr 27 23:45:18 2 17 0 scan_windows: Alloc failed 171 - Size 8
Apr 27 23:45:18 2 19 0 devb-eide 1.00A (Jul 27 2007 14:49:50)
Apr 27 23:45:18 2 5 0 libcam.so (Jul 27 2007 14:27:15) bver 42000001
Apr 27 23:45:18 2 19 0 eide_mmap_device_io: (contig) status 50 != altstatus ff
Apr 27 23:45:18 2 19 0 eide_identify_devices: Generic IDE vid 0, did 0, class 0 rev 0, busno 0, dfunc 0
Apr 27 23:45:18 2 19 0 eide_identify_devices: cmd_addr 1f0, cntl_addr 3f4, irq e, chnl 0, udma -1, mdma 0, sdma 0, pio 0
Apr 27 23:45:18 2 19 0 eide_identify_devices: probing dev 0 status 50, error 0
Apr 27 23:45:18 2 19 0 eide_identify_devices: probing dev 1 status 0, error 0
Apr 27 23:45:18 2 19 0 eide_display_devices: mdl Hitachi XX.V.3.5.0.0 sn X0814 20060217141147 fw Rev 0.00 tid 0, cable 40, max udma -1, cur udma -1, max mdma -1, cur mdma -1, max sdma -1, cur sdma -1, pio 4, mblk 1
Apr 27 23:45:18 2 19 0 eide_init_devices: Hitachi XX.V.3.5.0.0 path 0, tid 0, udma -1, mdma -1, sdma -1, pio 4, mblk 1
Apr 27 23:45:18 2 19 0 eide_identify_devices: VIA 82C586A/B vid 1106, did 571, class 1018e rev 6, busno 0, dfunc 39
Apr 27 23:45:18 2 19 0 eide_identify_devices: cmd_addr 170, cntl_addr 374, irq ff, chnl 1, udma 5, mdma 2, sdma 0, pio 4
Apr 27 23:45:18 2 19 0 eide_identify_devices: probing dev 0 status ff, error ff
Apr 27 23:45:18 2 19 0 eide_identify_devices: probing dev 1 status ff, error ff
Apr 27 23:45:18 2 5 100 cam-disk.so (Aug 22 2007 14:29:05)

It seems that the current CF card does not support UDMA mode, and in the BIOS of the Athena II the UDMA option is also inaccessible (greyed out). So I started the CF card with DMA but no UDMA: devb-eide blk automount=hd0t77:/ &

The current status is that I have several task threads running one by one in a while loop. If the data logging (CF card writing) is started, then one task (reading from a serial port) also cannot read as much data as it should. I have no idea how the logging task could affect the execution of the other tasks, since logging has the lowest priority. The tasks are activated one by one, and each task has a timer attached.

As for the PC104, since this one incorporates 4 serial ports and, as you know, is not cheap, we have decided to try further configuration of the disk driver instead. To what extent can I change the configuration?

Thanks,
Eric

Eric,

What you don’t realize is that the disk driver (devb-eide) runs at a higher priority than any of your threads. So even though your low-priority logging thread isn’t going to affect anything else, the disk driver most certainly is, once it gets data from the logging thread.

If you did the system profiling I suggested, you’d see devb-eide using CPU at exactly those times when you have problems.

There are really only 3 suggestions:

  1. Did you even try adding udma=2 to the devb-eide line? If not, you might want to try it regardless of what slogger shows; devb-eide can be fooled on occasion, so it’s worth at least seeing if this helps. If no DMA is supported at all, then you are never going to get speedy disk writes.

  2. You can increase your program/thread priorities to be greater than devb-eide (21) and less than the serial driver (24); see the sketch after this list. This guarantees devb-eide won’t affect your threads/serial reading. This could of course cause other issues, depending on whether you use other hardware (Ethernet, other cards, etc.) or whether your program can ever produce data faster than it can be written to disk (i.e. you starve devb-eide).

  3. Depending on your requirements, you could log to a RAM disk (memory) instead of the CF card, and then copy those RAM logs to the CF card at predetermined intervals. Of course there is a chance you’d lose the logs on a power loss before they were copied from RAM to the CF card, so you have to weigh that against how important the logs are.
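
For suggestion 2, the sketch looks something like this (priority 22 is just one valid choice between 21 and 24; add error handling as needed):

    // Sketch: put the calling thread between devb-eide (priority 21)
    // and the serial driver (priority 24).
    #include <pthread.h>
    #include <sched.h>

    int raise_above_disk_driver(void)
    {
        struct sched_param param;
        param.sched_priority = 22;   // above devb-eide, below the serial driver
        return pthread_setschedparam(pthread_self(), SCHED_RR, &param);
    }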

Tim