I am facing connection timeout problems in my QNX application.
There are two processes in the app. Process B sends a message to Process A using POSIX mq_send(), then waits for the response using mq_receive() with the timeout set to O_BLOCK. But mq_receive() returns -1, and when I check errno it says connection timed out.
Can anyone help me with this, so that I won't get these connection timeouts?
Sorry, I meant the timeout is 0 (i.e. a blocking mq_receive()).
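For reference, POSIX message queues have no O_BLOCK flag or timeout argument on mq_receive(); a queue blocks unless it was opened with O_NONBLOCK. A minimal sketch of how one might confirm that a queue descriptor really is in blocking mode is below. The helper name queue_is_blocking is made up for illustration, and this assumes the standard mq_getattr() call behaves on your QNX target as POSIX describes.

```c
#include <fcntl.h>
#include <mqueue.h>

/* Hypothetical helper: returns 1 if the queue blocks on receive,
 * 0 if O_NONBLOCK is set, and -1 if mq_getattr() fails. */
int queue_is_blocking(mqd_t q)
{
    struct mq_attr attr;

    if (mq_getattr(q, &attr) == -1)
        return -1;                      /* errno is set by mq_getattr() */

    return (attr.mq_flags & O_NONBLOCK) ? 0 : 1;
}
```

Calling this right after mq_open() in the receiving process would show whether the receive is expected to block at all.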
I don't see an errno of connection timeout listed for mq_receive. What is the actual errno value you're getting?
The errno value is ETIMEDOUT. It is defined in errno.h as a macro with value 260.
ETIMEDOUT means connection timed out; that is what the help documentation for errno says.
Are you trying to use mq_receive across a network?
That error only occurs on network connections (look in /usr/include/errno.h).
In the docs on mq_receive, the errno of ETIMEDOUT isn't listed. That means mq_receive itself isn't reporting it; something deeper down is. That's why I asked if you're trying to do the receive across a network.
The IPC I am using is between two processes running on QNX. One process (the one performing mq_receive) is run from a telnet session. The other process I run directly.
But libmq.so.1 is on a shared network folder. Could that be causing any problems?
These two processes are running on the ARM side. This is a multi-processor environment: an ARM processor (QNX image) and a TI DaVinci processor (DSP/BIOS OS).
Maybe the problem is the different CPUs? QNet is not, as far as I’m aware, cross-endian, while TCP/IP is. What CPU is in the DSP? Maybe a struct member alignment problem?
Can you successfully work over the network with other programs?
I've never used mqueue across a network or multiple CPUs before, so you've reached the limit of my knowledge on the inner workings of mqueue at this point.
A couple of other questions tho:
Does this error just happen a couple of times at startup, or is it happening intermittently?
Are you ever able to get any messages through?
I am just wondering if this only occurs a few times at startup until everything is sync'd between the CPUs (queues created and both sides ready to send/receive across the sockets).
I am using this POSIX IPC only between processes on the ARM. We have the DSPLINK (TI's) MSGQ component
for communication between the ARM and the DSP. Our DSP processor is TMX320DRA4461.
1. It happens only sometimes, very randomly.
2. If I loop on mq_receive, then when it fails the first time, I am able to get the message on the second or third try.
Is it due to a timer expiring, like in mq_timedreceive?
The answer is then to look for that error, ignore it, and just go back and receive again until you get a legit message.
In all seriousness, what you see is very strange. I can understand it happening at startup before both sides are sync'd, but once that happens it shouldn't be occurring.
Your setup again is:
computer 1 computer 2
ARM1 - process1 -mqueue - telnet <—> telnet - mqueue -process2 - ARM2
If so, I wonder if telnet is causing the hiccup in some way by sending control characters across the telnet session that mqueue isn't happy with. Maybe a 'keep alive' for the session.
Is there a reason you're using telnet vs. say something like qnet? At this point you're probably going to need better advice than what I can give, since I haven't attempted using mqueue like this before.
The configuration looks like this:
ARM1 - process1 -mq - <—> - mq -process2 - ARM1
ARM1---- < DSPLINK >— DSP
A quick question here.
You typed in mq instead of mqueue. You aren’t by chance using mq (and linking with -l mq) instead of mqueue are you? I ask because mq isn’t supported over a network (ie it works on a local machine only).
I have one doubt here. Actually my two processes are running on one machine only, but I am controlling one process from another machine through telnet.
I am using mq currently.
So then what you have is more like:
ARM1 - process1
| process2 - telnet <----> telnet - your control - ARM2
OK, how are you determining you're getting the connection timed out error? Does your code look like this:
if (mq_receive() == -1)
    printf("errno = %d\n", errno);
or are you checking errno someplace else?
Which process is getting the error: process1, process2 or both?
Is your code multi-threaded in either process?
I believe you're getting false error codes (which is why I asked about your code for checking the mq_receive error), because mq_receive can't return connection timed out. So that means another error occurs after the mq_receive fails and overwrites errno (which is a global variable), either in the same thread or in another one.
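One way to rule out a later call clobbering the value within the same thread is to copy errno into a local variable the moment mq_receive fails, before calling anything else. A minimal sketch is below; receive_logged and the saved_errno out-parameter are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <mqueue.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: receive one message and, on failure, report the
 * errno value captured immediately after mq_receive() returned. */
ssize_t receive_logged(mqd_t q, char *buf, size_t len, int *saved_errno)
{
    assert(buf != NULL);

    ssize_t n = mq_receive(q, buf, len, NULL);
    if (n == -1) {
        int err = errno;    /* snapshot before any other library call */
        fprintf(stderr, "mq_receive failed: errno = %d (%s)\n",
                err, strerror(err));
        if (saved_errno != NULL)
            *saved_errno = err;
    }
    return n;
}
```

If the snapshot still shows ETIMEDOUT, then the value really is what mq_receive left behind and not a leftover from some later call.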
I am using mq only, but both processes are running on the same machine.
What you explained is correct. I am getting the error here:
if (mq_receive() == -1)
    printf("errno = %d\n", errno);
Both processes are multi-threaded.
Is errno not a per-thread variable (like task variables in VxWorks), so that this overwrite problem is avoided?
errno is not like task variables in vxWorks. There is only one errno per process (not one per thread). You can see this by manually setting errno in one thread and reading it in another one.
So it is possible that more than one error occurs at once and in that case the last one reported would be the one setting errno.
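That claim is easy to check on the actual target. Note that POSIX requires errno to be per-thread, so on a conforming system one thread should not see another thread's errno; whether the QNX build in question behaves that way is worth verifying rather than assuming. A small sketch of such a check is below. It compares the address of errno in two threads instead of racing on the value, and the function names are made up for illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Worker thread: records the address of its own errno as an integer. */
static void *report_errno_address(void *arg)
{
    *(uintptr_t *)arg = (uintptr_t)&errno;
    return NULL;
}

/* Hypothetical check: returns 1 if the worker thread's errno lives at a
 * different address than the caller's (per-thread errno), 0 if both
 * threads share one errno, and -1 if the thread could not be run. */
int errno_is_per_thread(void)
{
    pthread_t t;
    uintptr_t other = 0;

    if (pthread_create(&t, NULL, report_errno_address, &other) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;

    return other != (uintptr_t)&errno;
}
```

If this returns 1 on the target, an error in another thread cannot be what overwrites errno here, and the overwrite would have to come from a later call in the same thread.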
The bigger question is what error is actually occurring that's causing everything to break down. It would be ideal if there was a way to take the telnet session out of the picture, i.e. the one from your 2nd ARM processor where you're doing the manual control. You could do that by writing a script and executing it on the 1st ARM processor, by sending commands via a serial port, or in some manner other than telnet.