Control loop timing validation

Hi all,

I've developed a PID controller for a setup that we are using at the university. The PID controller itself is code generated with 20sim. Within the simulation there is no overshoot of the motor position, and when deploying the code on RTAI there is none either. So it seems to me that the 20sim code works properly.

However, I've used the same code on QNX, and there it does overshoot: you can actually see the robot arm move back. It's only a little bit and you have to look very closely, but it's there.

Since the code works™, I figure that I'm doing something wrong while implementing it on QNX.

First I created a timer which sends a pulse to the application. The timer runs at 1000 Hz (a 1 ms interval). I've increased the timer resolution, but I'm not sure about a) whether the pulse arrives on time every time, and b) the amount of time needed to calculate the control loop (which I expect to be very small, though).

while (xx_time < xx_finish_time)
{
    /* block until the timer pulse arrives */
    rcvid = MsgReceive(chid, &msg, sizeof(msg), NULL);
    if (rcvid == 0)  /* a return of 0 means a pulse, not a message */
    {
        /* call the submodel to calculate the output */
        /* printf("new cycle %f\n", xx_time); */
        XXCalculateSubmodel(u, y, xx_time);

        pwm_write(device, y[0]);     /* actuate the motor */
        u[0] = enc_read(device, 2);  /* sample the encoder for the next cycle */
    }
}

I've used the IDE to trace the kernel, but that tool's timeline is extremely sluggish on my system, so I thought: let's use the SAT. However, I'm only interested in events generated by the application and events destined for the application.

Is there a way to log only those events, and to emit events to the tracelogger, for instance when I'm done with a control-loop cycle?

About the IDE: make sure you don't sample for too long; 0.5 seconds should be plenty.

About the timer, first I suggest you read this: qnx.com/developers/articles/ … 826_2.html

You didn't mention at what priority the timer pulse is delivered. If it's the default priority (10), then your program will get disrupted by interrupts and by other programs of similar or higher priority.
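For reference, a minimal sketch of a 1 ms pulse timer that delivers its pulse at an explicit priority rather than the default (the channel setup and the priority value are assumptions, not the OP's actual code):

#include <sys/neutrino.h>
#include <sys/netmgr.h>
#include <time.h>

#define PULSE_CODE  _PULSE_CODE_MINAVAIL
#define PULSE_PRIO  15                      /* hypothetical; pick via analysis */

static int setup_cycle_timer(void)
{
    int chid = ChannelCreate(0);
    int coid = ConnectAttach(ND_LOCAL_NODE, 0, chid, _NTO_SIDE_CHANNEL, 0);

    /* deliver the timer pulse at PULSE_PRIO instead of the default of 10 */
    struct sigevent ev;
    SIGEV_PULSE_INIT(&ev, coid, PULSE_PRIO, PULSE_CODE, 0);

    timer_t tid;
    timer_create(CLOCK_REALTIME, &ev, &tid);

    struct itimerspec its = {
        .it_value    = { 0, 1000000 },      /* first expiry after 1 ms */
        .it_interval = { 0, 1000000 },      /* then every 1 ms (1000 Hz) */
    };
    timer_settime(tid, 0, &its, NULL);

    return chid;                            /* MsgReceive() on this channel */
}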

If your code uses floating point it might be subject to differences in behavior. However, I very much doubt that such a difference could affect a PID in such a way that you can visually see it on the robot.

I don't know what you mean by RTAI, but if it's a simulation thingy as well, you might have to consider the latency that the "pwm_write" might introduce in the hardware.

You might also be affected by the dreaded SMI. It's an interrupt that is handled by the BIOS and is invisible to the OS. Usually this is active when things like USB emulation are enabled.

By RTAI I mean the Real-Time Application Interface for the Linux kernel.

Furthermore, pwm_write is a function which calls out16(); it writes to an FPGA.

The PID controller is steered by a motion profile; this profile lasts 2 seconds, so with a sampling rate of 1000 Hz the control loop is executed 2000 times. I want to see the jitter/latency of each cycle and I want to see how 'deterministic' the timer is. The priority of the timer pulse is set to 15, and I've got USB disabled.

It's not only the extra overshoot that's introduced by implementing the loop on QNX; I just want to show the results and measure at what time the system got the pulse. (The overshoot is added by something else, I think: reading the encoders at the wrong place, resulting in position[n-1] instead of position[n].)

I did a clock_gettime() in the while loop just before it receives a pulse; the difference between successive readings is about 0.000977 seconds. Fair enough, but it's not constant, and I want to show these differences. Looking through 2000 values by hand is a little hard, though.

clock_gettime() is only as precise as the system clock. Instead, use ClockCycles(), store the results in memory for, say, 10000 loops (10 seconds), then save them to a file which you can import into, say, Excel and do all sorts of nice stats.

ClockCycles() is as precise as the CPU clock (given it's not dynamically changing clock speed on you).
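A minimal sketch of that idea (the buffer size and file name are arbitrary): collect one ClockCycles() stamp per pulse, then convert to periods using the cycles_per_sec field from the system page:

#include <stdio.h>
#include <stdint.h>
#include <sys/neutrino.h>
#include <sys/syspage.h>

#define NUM_SAMPLES 10000                   /* ~10 s worth at 1 kHz */

static uint64_t stamp[NUM_SAMPLES];

/* in the control loop, right after MsgReceive() returns:
 *     stamp[n++] = ClockCycles();
 */

static void dump_periods(int n)
{
    /* cycles_per_sec converts the free-running counter to seconds */
    uint64_t cps = SYSPAGE_ENTRY(qtime)->cycles_per_sec;
    FILE *f = fopen("/dev/shmem/jitter.csv", "w");

    for (int i = 1; i < n; i++) {
        double dt = (double)(stamp[i] - stamp[i - 1]) / (double)cps;
        fprintf(f, "%d,%.9f\n", i, dt);     /* cycle index, period in seconds */
    }
    fclose(f);
}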

Depending on the jitter you can live with, priority 15 may not be sufficient. Go to the max, 255, to get an idea of the best you can get.

Okay, I'll try that, but I thought there might be some nice features within the SAT I could use to make this more generic.

Btw, I did set the QNX timer rate higher, to 10 kHz.

Thx a bunch

Jeffry

Btw, in the article qnx.com/download/feature.htm … amid=17751 they talk about exactly what I want, but there are no real examples.

Not sure I understand this comment. The SAT (or System Profiler as it is now called) is for diagnosing problems. I thought the discussion here was related to how to design the application to achieve the objective. The System Profiler can help you validate that your implementation of your design is correct, but first you need a correct design to implement.

You need to perform RMA (rate monotonic analysis) on your system to determine the priority that specific tasks should run at (you shouldn't just pick a priority out of the air).

Mario's suggestion allows you to determine the overhead imposed by the OS (i.e. run at priority 255 to measure scheduling latency), but he isn't implying that you should ultimately run this task at priority 255.

Did you set the tick period using the command-line option to startup?

The profiler is nice, but I would only use it to get a general idea. To measure things like potential jitter down to a level that might affect a PID, I would be worried about the overhead that things like tracelogger can create. I'm probably exaggerating a bit, because in most cases using the tracelogger and the System Profiler is just fine.

Setting the ClockPeriod to 10 kHz might not be a good idea, because you are increasing system overhead 10-fold: there are now 10 times more timer interrupts in the system. There should be no need to do that. First make sure what the real problem is.

The System Profiler is fine (you aren't going to get any lower overhead doing your own logging). You need to know what you're doing, though, as the kernel events could create unacceptable overhead in the hands of a naive user.

Okay, I understand what you're all saying. I just thought that I could get more specific output by using the SAT, because in essence it too uses the same kernel buffers to collect the information. The SAT manual says that the instrumented kernel runs at about 98% of normal speed, so a performance loss is always there whenever you use qconn + System Profiler or the tracelogger, right?

However, I'll stick to the suggestions here.

I set the system clock by using a system call, but I'll set it back in order to drop the extra overhead.
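Presumably the system call in question is ClockPeriod(); a sketch of setting a 100 µs tick (10 kHz) and restoring the saved period afterwards:

#include <sys/neutrino.h>
#include <time.h>

static struct _clockperiod saved;

static void set_fast_tick(void)
{
    struct _clockperiod fast = { .nsec = 100000, .fract = 0 };  /* 100 us = 10 kHz */
    ClockPeriod(CLOCK_REALTIME, &fast, &saved, 0);              /* set new, keep old */
}

static void restore_tick(void)
{
    ClockPeriod(CLOCK_REALTIME, &saved, NULL, 0);               /* back to the default */
}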

Yes, I think the overhead is always there; there are two versions of the kernel, one that is instrumented and one that is not.

Well, you can use tracelogger only (start it manually); this way qconn isn't involved. By default tracelogger will write the file to /dev/shmem, so at least no hard-disk operations are involved. Note that tracelogger will run while your program does, which is why I mentioned that doing your own profiling will not affect the system as much.

This is wrong. The tracelogger can be operated in mapped mode, where it will, in fact, produce a much lower overhead than you could obtain by instrumenting your own program.

If what you say is true, how can I make it so? What I basically want isn't that hard, now is it? The kernel sends a pulse to my application and I want to trace that specific event.

Furthermore, I want to have some construct that allows me to 'post' events to the tracelogger, like loop_entered <time_stamp> and loop_exited <time_stamp>. With the kernel buffers in place, all I need to do is inject data into them.

I just want some method that is generic (I have 6 different motors for the robot arms, so 6 different controllers).

Gila, meet TraceEvent(). TraceEvent(), meet Gila :slight_smile:

Check the docs (built into the IDE) for TraceEvent(); it allows you to control what events get logged, and to insert user events. You should insert non-string user events, since an unterminated string user event can cause a kernel crash. Your binary user events can be decoded by the IDE by supplying an XML file that describes their structure (right-click on the .kev file under the Navigator view, select Properties, and then "User Event Data" to point the IDE to the XML file that describes your user event data).
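For example, a sketch of event filtering plus binary (simple) user events, along the lines of the loop_entered/loop_exited pair mentioned earlier; the event codes and the cycle/motor payload are made up for illustration:

#include <sys/neutrino.h>
#include <sys/trace.h>

#define EV_LOOP_ENTERED (_NTO_TRACE_USERFIRST + 1)
#define EV_LOOP_EXITED  (_NTO_TRACE_USERFIRST + 2)

static void setup_trace_filter(void)
{
    /* log only what we care about: drop everything, then re-add the
     * communication class (messages/pulses) and the user-event class */
    TraceEvent(_NTO_TRACE_DELALLCLASSES);
    TraceEvent(_NTO_TRACE_ADDCLASS, _NTO_TRACE_COMM);
    TraceEvent(_NTO_TRACE_ADDCLASS, _NTO_TRACE_USER);
}

static void log_cycle(unsigned cycle, unsigned motor)
{
    /* a simple (binary) user event carries two 32-bit data words;
     * the kernel timestamps it automatically */
    TraceEvent(_NTO_TRACE_INSERTSUSEREVENT, EV_LOOP_ENTERED, cycle, motor);
    /* ... XXCalculateSubmodel(), pwm_write(), enc_read() ... */
    TraceEvent(_NTO_TRACE_INSERTSUSEREVENT, EV_LOOP_EXITED, cycle, motor);
}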

To log in mapped mode see the docs for tracelogger, but an executive summary is:

tracelogger -M -S10M

(the above will use mapped mode to collect 10M of data).

Note that logging your user events is somewhat expensive this way, but in practice your user events should be very high level (i.e. infrequent), as you can rely on the built-in events to provide the detail about what is going on. If you are really keen, you can download Tau from Foundry27 and try the integrated block function profiling that allows you to look inside a thread and see what functions are executing over time (very cool).

Ah, now I understand what you meant: I was saying that custom "profiling" was faster than the one provided by QNX, but I was comparing it to TraceEvent(), which is a kernel call, not to the profiling done within the kernel.

About the -M option (I didn't know about it): does it mean the kernel writes directly into that shared memory, or that tracelogger copies the data from the kernel buffer to this shared memory?

While we are at it: I tried to experiment with tracelogger (thanks to this thread), and it seems that if I use ring mode, the .kev file won't have the process names in it. Did I miss something?

I must admit I'm a bit confused about ring and linear mode; what do they imply?

What I would like to do is set up some sort of tracelogger session that runs continuously, and be able to get a snapshot of the buffer at any given time. It seems the way to go is to run tracelogger in ring mode, terminate it when the snapshot is needed, make a copy of the file, and restart tracelogger. Is that how it should be done? Unfortunately, if ring mode implies the process/thread names are lost, it makes things a bit less practical.

Nope. Since ring mode works backwards, it doesn't have the opportunity to determine the process names.

Linear mode means:

  • tracelogger starts, gets the names of all current processes, and synthesizes process create entries
  • runs until some point (controlled by user)
  • exits

Ring mode means:

  • tracelogger starts and does nothing.
  • at some point the user arranges for a trap (usually some bad state that they are trying to diagnose)
  • tracelogger then (if in mapped mode) does nothing more than write the header and exit, or (if in non-mapped mode) writes out the current kernel ring to the file specified

Yup, that’s exactly how it works and what it’s for.

Yes, it is a bit more difficult to use, and it really ought to be able to get process names for pids that are still running at the time the ring is dumped, but I have used it many times to solve some incredibly difficult-to-find bugs, so you'd have to pry it (process names or not) from my cold dead fingers at this point :slight_smile:

Ahhh, OK, I see what you mean. But as I said, typically the detail you are interested in is at the kernel level anyway (including the timing information that the OP was interested in), and the user-level stuff is very high level (most of the time the purpose of user events is to allow you to quickly identify which part of the trace you want to look at by providing markers).

Hi all,

I got the TraceEvent() stuff working now, and it is exactly what I need. One small question left :slight_smile:

I insert a user event into the system: TraceEvent(_NTO_TRACE_INSERTUSRSTREVENT, _NTO_TRACE_USERFIRST, "CTRL_START");

However, the events do not show up in the IDE timeline (the event is in the log file, it's in the filter, etc.). I think the reason for this is that one field is not filled in, namely the Owner field, so it's not getting 'drawn'? Just a wild guess. Or is there no icon/definition for it in the software that draws the actual timeline?

Any ideas?