connection timeout in mq_receive()

Hi all,
I am facing connection timeout problems in my QNX application.
There are two processes in the app. Process B sends a message to Process A using POSIX mq_send(), then waits for the response using mq_receive() with the timeout set to O_BLOCK. But mq_receive() returns error status -1, and when I check errno it says connection timed out.

Can anyone help me with this, so that I won’t get these connection timeouts?

How it is???

Sorry, I meant the timeout is 0 (meaning a blocking mq_receive).
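For reference, the pattern described in the first post (send a request, then block in mq_receive() for the reply) might look roughly like the sketch below. The helper names `open_queue` and `request_reply`, the queue attributes, and the message size are made up for illustration; the real application's will differ. A queue opened without O_NONBLOCK makes mq_receive() wait until a message arrives.

```c
#include <errno.h>
#include <fcntl.h>
#include <mqueue.h>
#include <string.h>
#include <sys/types.h>

#define MSG_SIZE 64  /* hypothetical maximum message size */

/* Open (or create) a blocking queue; O_NONBLOCK is deliberately not set. */
mqd_t open_queue(const char *name)
{
    struct mq_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.mq_maxmsg = 4;
    attr.mq_msgsize = MSG_SIZE;
    return mq_open(name, O_CREAT | O_RDWR, 0600, &attr);
}

/* Send `req` on req_q, then block on rsp_q until a reply arrives.
 * Returns the reply length, or -1 with errno set by the failing call. */
ssize_t request_reply(mqd_t req_q, mqd_t rsp_q, const char *req,
                      char *rsp, size_t rsp_len)
{
    if (mq_send(req_q, req, strlen(req) + 1, 0) == -1) {
        return -1;
    }
    /* rsp_len must be at least the queue's mq_msgsize. */
    return mq_receive(rsp_q, rsp, rsp_len, NULL);
}
```

In the real setup the two queues would be separate and the reply would come from the other process; passing the same queue for both arguments just loops the message back, which is handy for a quick local check.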

ravindraqnx,

I don’t see an errno of connection timeout for mq_receive. What is the actual errno value you’re getting?

Tim

Hi Tim,
The errno value is ETIMEDOUT. It is defined in errno.h as a macro with the value 260.

Hi Tim,
ETIMEDOUT means connection timed out. That is what the help documentation for errno says.

ravindraqnx,

Are you trying to use mq_receive across a network?

That error only occurs on network connections (look in /usr/include/errno.h).

In the docs on mq_receive, ETIMEDOUT isn’t listed as a possible errno. That means mq_receive itself isn’t reporting it; something deeper down is. That’s why I asked if you’re trying to do the receive across a network.
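The reasoning above (an errno that the documentation doesn't list for a call probably came from somewhere else) can be turned into a small check. The list in this sketch is an assumption based on the usual description of mq_receive(); compare it against the documentation for your own system. `is_documented_mq_receive_errno` is a made-up helper name.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `err` is in the (assumed) documented set for mq_receive(). */
bool is_documented_mq_receive_errno(int err)
{
    /* Assumed list; verify against your platform's mq_receive() docs. */
    static const int documented[] = { EAGAIN, EBADF, EINTR, EINVAL, EMSGSIZE };
    for (size_t i = 0; i < sizeof documented / sizeof documented[0]; i++) {
        if (err == documented[i]) {
            return true;
        }
    }
    return false;
}
```

Logging a warning whenever this returns false would flag cases like the ETIMEDOUT seen here.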

Tim

Hi Tim,
The IPC I am using is between two processes running on QNX. One process (the one performing mq_receive) is run from a telnet session. The other process I run directly.
But libmq.so.1 is on a shared network folder. Could that be causing any problems?

These two processes are running on the ARM side. This is a multi-processor environment: an ARM processor (QNX image) and a TI DaVinci processor (DSP/BIOS OS).

Maybe the problem is the different CPUs? QNet is not, as far as I’m aware, cross-endian, while TCP/IP is. What CPU is in the DSP? Maybe a struct member alignment problem?

Can you successfully work over the network with other programs?

Regards,
Albrecht

ravindraqnx,

I’ve never used mqueue across a network or multiple CPUs before, so you’ve reached the limit of my knowledge of the inner workings of mqueue at this point.

A couple of other questions tho:

  1. Does this message just happen a couple of times at startup, or is it happening intermittently?

  2. Are you ever able to get any messages through?

I am just wondering if this only occurs a few times at startup until everything is sync’d between the CPUs (queues created and both sides ready to send/receive across the sockets).

Tim

Hi Albrecht,

I am using this POSIX IPC only between processes on the ARM. We have the DSPLINK (TI’s) MSGQ component for communication between the ARM and the DSP. Our DSP processor is a TMX320DRA4461.
Regards,
RavindraQnx

Hi Tim,
1. It happens only sometimes, very randomly.
2. If I loop on mq_receive after the first failure, I am able to get the message on the second or third attempt.
Is it due to a timer expiring, like in mq_timedreceive?
regards,
ravindraqnx

ravindraqnx,

The answer then is to look for that error, ignore it, and just go back and receive again till you get a legit message ;)

In all seriousness, what you see is very strange. I can understand it happening at startup before both sides are sync’d, but once that happens it shouldn’t be occurring.

Your setup again is:

     computer 1                         computer 2

ARM1 - process1 - mqueue - telnet <--> telnet - mqueue - process2 - ARM2

Correct?

If so, I wonder if telnet is causing the hiccup in some way by sending control characters across the telnet session that mqueue isn’t happy with. Maybe a ‘keep alive’ for the session.

Is there a reason you’re using telnet vs. say something like qnet? At this point you probably need better advice than I can give, since I haven’t attempted using mqueue like this before.

Tim

Hi Tim,
The configuration looks like this:
ARM1 - process1 - mq <--> mq - process2 - ARM1

ARM1 ---- <DSPLINK> ---- DSP

regards,
ravindraQnx

ravindraqnx,

A quick question here.

You typed in mq instead of mqueue. You aren’t by chance using mq (and linking with -l mq) instead of mqueue, are you? I ask because mq isn’t supported over a network (i.e. it works on a local machine only).

Tim

Hi Tim,
I have one doubt here. Actually, my two processes are running on one machine only, but I am controlling one process from another machine through telnet.

I am using mq currently.

regards,
Ravindraqnx

ravindraqnx,

So then what you have is more like:

ARM1 - process1
|      |
|      mq
|      |
|      process2 - telnet <----> telnet - your control - ARM2
|
--- DSP

OK, how are you determining that you’re getting the connection timed out error? Does your code look like this:

if (mq_receive() == -1)
{
    printf("errno = %d\n", errno);
}

or are you checking errno someplace else?

Which process is getting the error: process1, process2 or both?

Is your code multi-threaded in either process?

I believe you’re getting false error codes (which is why I asked about your code for checking the mq_receive error), because mq_receive can’t return connection timed out. That means another error is occurring after the mq_receive fails and overwriting errno (which is a global variable), either in the same thread or in another one.
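One way to rule out a later call clobbering errno is to copy it into a local variable immediately after the failing call, before printf() or anything else runs. A minimal sketch; `checked_receive` and the `saved_errno` out-parameter are made up for illustration.

```c
#include <errno.h>
#include <mqueue.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Wrap mq_receive() and capture errno before any other call can change it.
 * On failure *saved_errno holds the captured value; on success it is 0. */
ssize_t checked_receive(mqd_t q, char *buf, size_t len, int *saved_errno)
{
    ssize_t n = mq_receive(q, buf, len, NULL);
    if (n == -1) {
        int err = errno;   /* copy first; later calls may change errno */
        fprintf(stderr, "mq_receive failed: errno = %d (%s)\n",
                err, strerror(err));
        *saved_errno = err;
    } else {
        *saved_errno = 0;
    }
    return n;
}
```

If the value captured this way is still ETIMEDOUT, then the error was already in errno when mq_receive() returned, and the overwrite theory can be dropped.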

Tim

I am using mq only, but both processes are running on the same machine.

Tim,
What you explained is correct. I am getting the error here:
if (mq_receive() == -1)
{
    printf("errno = %d\n", errno);
}

Both processes are multi-threaded.
Isn’t errno a per-thread variable (like task variables in VxWorks), so that this overwrite problem is avoided?

ravindraqnx,

errno is not like task variables in vxWorks. There is only one errno per process (not one per thread). You can see this by manually setting errno in one thread and reading it in another one.

So it is possible that more than one error occurs at once and in that case the last one reported would be the one setting errno.
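The experiment suggested above is easy to try. Here is a small sketch using pthreads (the function names are illustrative): a worker thread sets errno and records where its errno lives, and the caller compares that with its own. One caveat worth checking on the target: POSIX specifies errno as a per-thread value, and on a system that follows this the function below returns 1, in which case an overwrite would have to come from a later call in the same thread rather than from another thread.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

static void *errno_worker(void *arg)
{
    int **where = arg;
    errno = ETIMEDOUT;   /* set errno in the worker thread */
    *where = &errno;     /* record where this thread's errno lives */
    return NULL;
}

/* Returns 1 if the worker thread had its own errno, 0 if it shared the
 * caller's, or -1 if the worker thread could not be started. */
int errno_is_per_thread(void)
{
    pthread_t tid;
    int *worker_errno = NULL;
    if (pthread_create(&tid, NULL, errno_worker, &worker_errno) != 0) {
        return -1;
    }
    pthread_join(tid, NULL);
    return worker_errno != &errno;
}
```

Comparing addresses rather than values avoids any doubt about whether the thread calls themselves touched errno along the way.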

The bigger question is what error is actually occurring that’s causing everything to break down. It would be ideal if there was a way to take out the telnet session from your 2nd ARM processor, where you’re doing the manual control from. You could do that either by writing a script and executing it on the 1st ARM processor, or by sending commands via a serial port or in some manner other than telnet.

Tim