Problem with fd_set in RPC communication (Linux->QNX port)

Dear all,
at the moment I’m trying to run our Linux application on a QNX host PC. Everything works fine, except that the RPC communication hangs when an
RPC request is received and the function svc_getreqset( &readfds ); is called. Normally, when this function is called, my registered ApplFunc_1 should be entered, but that is not the case (please have a look at the C code below).
1.) I start rpcbind (or portmap, which does the same)
2.) I start my application on the QNX-Host PC
3.) When starting the application the function initRPC( ) is called one time. There the function ApplFunc_1( ) will be registered by calling svc_register( ). initRPC() returns with “true”, that means success.
4.) I send a RPC-request via TCP/IP from my WindowsPC to the QNX PC
5.) When calling svc_getreqset( &readfds ); the QNX Application hangs up

This stuff runs on my Linux platform, so either I forgot something or there is a difference between Linux and QNX. I assume that the fd_set variables are not correct. I have investigated them, but I am not sure what the right content should be, so I cannot say whether they are good or bad.
Searching the web and the QNX forums did not deliver a solution for my problem.

I would be very happy about a little help from you all.
Thanks and best regards
Marc

#include <rpc/rpc.h>
#include <rpc/pmap_clnt.h>
#include <errno.h>

#define MYAPPL ((u_long)400000)
#define MYAPPL_VERS_1 ((u_long)1)
#define MYAPPL_NULL ((u_long)0)


// ApplFunc_1 is the function which should be entered if a rpc-call is received
extern "C" 
{
   void ApplFunc_1( struct svc_req *rqstp, register SVCXPRT *transp );
};

bool initRPC()
{
  bool retval( false );
  register SVCXPRT *transp;

  pmap_unset( MYAPPL, MYAPPL_VERS_1 );

   // create UDP socket and register it
   transp = svcudp_create( RPC_ANYSOCK );
   if( transp != NULL )
   {
      if( svc_register( transp, MYAPPL, MYAPPL_VERS_1, ApplFunc_1, IPPROTO_UDP ) )
      {
         retval = true;
      }
      else
      {
        // unable to register
      }
   }  
   else
   {
      // cannot create UDP service.
   }

   if( retval )
   {
      // create TCP socket and register it
      transp = svctcp_create( RPC_ANYSOCK, 0, 0 );
      if( transp != NULL )
      {
         if( !svc_register( transp, MYAPPL, MYAPPL_VERS_1, ApplFunc_1, IPPROTO_TCP ) )
         {
            retval = false;
            /// unable to register
         }
         else
         {
            // register done.
         }
      }
      else
      {
         retval = false;
         // cannot create TCP service.
      }
   }
   return( retval );
}

void execRPC()
{
   uint16 dtbsz( getdtablesize() );     		// max file descriptor no.
   fd_set readfds( svc_fdset );         		// file descriptor set for server.
   struct timeval timeout = { 0L, 1000L }; 	// 1ms wait for select.

   // check if any requests have arrived.
   switch(select( dtbsz, &readfds, NULL, NULL, &timeout ) )
   {
      case -1: // error
         if( errno == EBADF ) // select() reports an invalid descriptor as EBADF
         {
           // Call of select(..) failed
         }
         break;
      case 0: // no requests available
         break;
      default: // process requests
         // !! When an RPC request arrives, the default branch is entered and the printf output is shown,
         // !! but ApplFunc_1(), which was registered with svc_register(), is never reached
         // !! and the complete process hangs forever.
         printf("rpc request received \n");
         svc_getreqset( &readfds );
         break;
   }
}

Marmau,

Normally with fd_set you use the built-in QNX macros for manipulating your set of fds.

i.e.

fd_set foo;

FD_ZERO(&foo); // Clear the set

FD_SET(myFd, &foo); // add myFd to the set

fd_set tmp = foo; // Working set for the select

select(tmp…); // Your select statement

// See if myFd triggered if you have more than 1 fd in your set
if (FD_ISSET(myFd, &tmp))
{
}

FD_CLR(myFd, &foo); // This is how you remove myFd from a set if you need to do so

Tim
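
The macro pattern described above can be put together into one small function. This is only a minimal sketch: wait_readable is a hypothetical helper name, and it watches a single descriptor rather than the RPC library's svc_fdset.

```cpp
#include <cstddef>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper for illustration; not part of any RPC library.
// Returns true if `fd` becomes readable within `timeout_ms` milliseconds.
bool wait_readable(int fd, long timeout_ms)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return false;                 // fd would not fit into an fd_set

    fd_set master;
    FD_ZERO(&master);                 // clear the set
    FD_SET(fd, &master);              // add fd to the set

    fd_set working = master;          // select() modifies its argument, so pass a copy

    struct timeval timeout;
    timeout.tv_sec  = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    // The first argument is the highest descriptor in the set plus one.
    int ready = select(fd + 1, &working, NULL, NULL, &timeout);

    // FD_CLR(fd, &master) would remove fd again if the set were kept around.
    return ready > 0 && FD_ISSET(fd, &working);
}
```

Calling wait_readable(sock, 1) on a connected socket would correspond to the 1 ms poll in execRPC().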

Also, you’ll want to check that FD_SETSIZE is big enough for the fd’s you are trying to insert into the set…

Dear Tim, Dear rgallen,
thanks for your response. What you tell me is right and matches the QNX manual example shown under
qnx.org/developers/docs/6.3. … elect.html

But I have a given RPC communication design in my project which runs successfully under Windows and Linux and which
uses the svc_xxx functions (svc.h) from the RPC library (as shown in my code example):

During init:
svcudp_create()
svc_register()
After init(), the global read-only variable svc_fdset, which holds the file descriptors, should contain my new fd.

During executing:
fd_set readfds( svc_fdset );

svc_getreqset( &readfds );

I tried to use the macros you told me about within my design, but that does not work either. Normally all of this should be handled by the svc_xxx functions.
select() wakes up with return code “1” (which means that one RPC request was received), but as I wrote, entering
svc_getreqset(&readfds); crashes the application. Normally this function should work out which registered function has to be called and then call it.

To make debugging easier I have reduced FD_SETSIZE from 256 to 32 with the compile option “FD_SETSIZE=32” (it doesn’t work with FD_SETSIZE=256 either).
Printing out the global variable svc_fdset under QNX before and after calling my init() function delivers the following.

BEFORE init():
Request svc_fdset.fds_bits[0] 0
Request svc_fdset.fds_bits[1] 0
Request svc_fdset.fds_bits[2] 0
Request svc_fdset.fds_bits[3] 0
Request svc_fdset.fds_bits[4] 0
Request svc_fdset.fds_bits[5] 0
Request svc_fdset.fds_bits[6] 0
Request svc_fdset.fds_bits[7] 0
Request svc_fdset.fds_bits[8] -1204686484
Request svc_fdset.fds_bits[9] -1204686464
Request svc_fdset.fds_bits[10] 6
Request svc_fdset.fds_bits[11] 0
Request svc_fdset.fds_bits[12] 4098
Request svc_fdset.fds_bits[13] 0
Request svc_fdset.fds_bits[14] 0
Request svc_fdset.fds_bits[15] 0
Request svc_fdset.fds_bits[16] 0
Request svc_fdset.fds_bits[17] 0
Request svc_fdset.fds_bits[18] 0
Request svc_fdset.fds_bits[19] 0
Request svc_fdset.fds_bits[20] 0
Request svc_fdset.fds_bits[21] 0
Request svc_fdset.fds_bits[22] 0
Request svc_fdset.fds_bits[23] 0
Request svc_fdset.fds_bits[24] 0
Request svc_fdset.fds_bits[25] 0
Request svc_fdset.fds_bits[26] 0
Request svc_fdset.fds_bits[27] 0
Request svc_fdset.fds_bits[28] 0
Request svc_fdset.fds_bits[29] 0
Request svc_fdset.fds_bits[30] 0
Request svc_fdset.fds_bits[31] 0

AFTER calling init():
Request svc_fdset.fds_bits[0] 48
Request svc_fdset.fds_bits[1] 0
Request svc_fdset.fds_bits[2] 0
Request svc_fdset.fds_bits[3] 0
Request svc_fdset.fds_bits[4] 0
Request svc_fdset.fds_bits[5] 0
Request svc_fdset.fds_bits[6] 0
Request svc_fdset.fds_bits[7] 0
Request svc_fdset.fds_bits[8] -1204686484
Request svc_fdset.fds_bits[9] -1204686464
Request svc_fdset.fds_bits[10] 6
Request svc_fdset.fds_bits[11] 0
Request svc_fdset.fds_bits[12] 4098
Request svc_fdset.fds_bits[13] 0
Request svc_fdset.fds_bits[14] 0
Request svc_fdset.fds_bits[15] 0
Request svc_fdset.fds_bits[16] 0
Request svc_fdset.fds_bits[17] 0
Request svc_fdset.fds_bits[18] 0
Request svc_fdset.fds_bits[19] 0
Request svc_fdset.fds_bits[20] 0
Request svc_fdset.fds_bits[21] 0
Request svc_fdset.fds_bits[22] 0
Request svc_fdset.fds_bits[23] 0
Request svc_fdset.fds_bits[24] 0
Request svc_fdset.fds_bits[25] 0
Request svc_fdset.fds_bits[26] 0
Request svc_fdset.fds_bits[27] 0
Request svc_fdset.fds_bits[28] 0
Request svc_fdset.fds_bits[29] 0
Request svc_fdset.fds_bits[30] 0
Request svc_fdset.fds_bits[31] 0

You can see at [0] that something happens during init(). I think this has to be my new file descriptor.
The entries in [8], [9] and [12] look strange to me. Under Windows I can see that my new fd shows up in [0]
and all other entries are 0. Using FD_ZERO(&svc_fdset) before calling init() to set all entries to 0 at start-up has no effect.

So I tried the following before calling svc_getreqset( &readfds ); in the execute part:

→ Init all entries of the temp variable readfds with zero except [0], which is set to my fd value (always 48)
readfds.fds_bits[0] = 48;
readfds.fds_bits[1] = 0;
readfds.fds_bits[2] = 0;

→ Then call:
svc_getreqset( &readfds );

Interesting: the application doesn’t crash and continues when entering svc_getreqset(), but this function never returns.
Strange: printing out the contents of readfds after the init shown above (and before entering svc_getreqset) delivers:

readfds.fds_bits[0] 48
readfds.fds_bits[1] 65536 // What’s that???
readfds.fds_bits[2] 0
readfds.fds_bits[3] 0
…all others are 0

I am not the author of all this, I just have to port it to QNX. So perhaps I forgot something important.

Here a few more ideas and questions:
1.) Is it enough to start portmap (rpcbind) on the QNX host PC, or do I have to start more programs to get the RPC communication to work?
2.) I am working with QNX Momentics 6.3.2, but because of other problems I have upgraded to GCC 4.2.1 and binutils 2.17. Could this be a
compatibility problem for the RPC lib?
3.) Am I the first person on this planet who works with the svc_xxx functions under QNX? Searching through openqnx.com, qnx.com and the rest of
the web delivers no information about using the svc_xxx functions under QNX.

After a few days of trying I am very frustrated at the moment. Perhaps someone can bring me back from this deep and dark valley.

Thanks in Advance and Best Regards
Marc

P.S. Here one more note for information…this is the comment text of the file svc.h from the librpc:
/*
 * This interface must manage two items concerning remote procedure calling:
 *
 * 1) An arbitrary number of transport connections upon which rpc requests
 * are received. The two most notable transports are TCP and UDP; they are
 * created and registered by routines in svc_tcp.c and svc_udp.c, respectively;
 * they in turn call xprt_register and xprt_unregister.
 *
 * 2) An arbitrary number of locally registered services. Services are
 * described by the following four data: program number, version number,
 * "service dispatch" function, a transport handle, and a boolean that
 * indicates whether or not the exported program should be registered with a
 * local binder service; if true the program's number and version and the
 * port number from the transport handle are registered with the binder.
 * These data are registered with the rpc svc system via svc_register.
 *
 * A service's dispatch function is called whenever an rpc request comes in
 * on a transport. The request's program and version numbers must match
 * those of the registered service. The dispatch function is passed two
 * parameters, struct svc_req * and SVCXPRT *, defined below.
 */

Marc,

Your printout of the fd_set shows you don’t quite understand what it is.

If you look in /usr/include/sys/select.h you’ll see that fd_set is an array of bits, not an array of integers. So what you printed out was random memory after the first array position of svc_fdset.fds_bits[0].

With FD_SETSIZE set to 32, the fds_bits[] array is size 1 (32 bits fit into 1 integer). So to print out svc_fdset you only need to print array position 0 as an integer.

The value of 48 seems suspicious to me because if you only have 1 fd, it should be a power of 2 (1,2,4,8,16,32 etc). Unless of course your printf is messed up.

If the printf is not messed up, the first thing you need to figure out is why you’re not setting just a single bit in your svc_register() call since you are only registering 1 socket.

What I suggest you do is print out the fd associated with transp (it must be somewhere in the SVCXPRT struct). That should be a value like 5. After the svc_register() call your svc_fdset should be 16 (assuming the fd is 5, the 5th bit in your svc_fdset should be ‘on’ which equals a value of 16).

As for the rest…

The fact you are hand setting 32 integer values (when the array size is only 1) after returning from select is likely killing your stack and I’m surprised your app doesn’t immediately crash.

Tim
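
One way to inspect an fd_set without reading fds_bits directly is to ask FD_ISSET about every possible descriptor. The following is a minimal sketch; fds_in_set is a hypothetical helper name and not part of the RPC library.

```cpp
#include <sys/select.h>
#include <vector>

// Hypothetical helper for illustration.
// Returns every descriptor that is a member of `set`, in ascending order.
// The set is taken by value so the caller's copy is never modified.
std::vector<int> fds_in_set(fd_set set)
{
    std::vector<int> fds;
    for (int fd = 0; fd < FD_SETSIZE; ++fd) {
        if (FD_ISSET(fd, &set))
            fds.push_back(fd);
    }
    return fds;
}
```

Printing the result of fds_in_set(svc_fdset) before and after init() would show the registered descriptors as plain numbers, independent of how wide the underlying array is.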

Hallo Tim,

Yes, you are right. My understanding of fd_set was wrong, but your description has clarified it.
I followed your suggestion and printed out the fd associated with transp after svctcp_create( ):

/* create TCP socket and register it */
transp = svctcp_create( RPC_ANYSOCK, 0, 0 );
if( transp != NULL )
{
printf("transp->xp_fd %d\n", transp->xp_fd);
printf("transp->xp_port %d\n", transp->xp_port);

Within the create call, xprt_register( ) assigns my newly created fd (variable sock) to the global variable (svc_fds):

void xprt_register(xprt)
SVCXPRT *xprt;
{

if (sock < NOFILE)
{
xports[sock] = xprt;
svc_fds |= (1 << sock);
}

}

The print out delivers:
transp->xp_fd 5
transp->xp_port 1022

Yes, you are right again. The fd has the value 5. But printing out svc_fdset still delivers the value 48 (instead of 16);
this means that 2 bits are set in the 32-bit bitfield during the register instead of 1 (48 → 110000 ??).
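
The value 48 can be decoded by hand using the same rule as the xprt_register() snippet above, where descriptor n sets the bit 1 << n. Under that rule fd 5 alone gives 32 rather than 16, and 48 = 16 + 32 corresponds to descriptors 4 and 5. That would fit initRPC() registering two transports (one UDP, one TCP), although the thread never prints the fd of the UDP socket, so fd 4 is an assumption. A minimal sketch of the decoding, with bits_set as a hypothetical helper name:

```cpp
#include <vector>

// Hypothetical helper for illustration.
// Returns the positions of all bits that are set in `mask`, lowest first.
std::vector<int> bits_set(unsigned long mask)
{
    std::vector<int> bits;
    for (int bit = 0; mask != 0; ++bit, mask >>= 1) {
        if (mask & 1UL)
            bits.push_back(bit);
    }
    return bits;
}
```

bits_set(48) yields the positions 4 and 5, while bits_set(1UL << 5) yields only position 5.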

Debugging under Windows shows that two file descriptors are set there, too. I don’t know why, but it seems to be O.K.
In the end I found a way to get it to work. I detected and solved two problems.

The first one was a wrong value for the fd set size passed to the select() function.
So I have to differentiate between the three OSs in the following manner:

#ifdef _WIN32
uint16 dtbsz( _SYS_OPEN );
#else
#ifdef _QNX
uint16 dtbsz(FD_SETSIZE); // max file descriptor no.
#else // LINUX
uint16 dtbsz( getdtablesize() ); // max file descriptor no.
#endif
#endif
switch ( select(dtbsz, &readfds, 0, 0, &timeout ) )
{…
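
As an alternative to a per-OS constant, the first argument of select() can be derived from the set itself. POSIX describes it as the highest descriptor to check plus one, and specifies an EINVAL error when it is greater than FD_SETSIZE. The following is only a sketch of that idea and not what was used in this project; nfds_for is a hypothetical helper name.

```cpp
#include <sys/select.h>

// Hypothetical helper for illustration.
// Returns the value to pass as select()'s first argument for `set`:
// the highest member descriptor plus one, or 0 if the set is empty.
// The result can never exceed FD_SETSIZE.
int nfds_for(fd_set set)
{
    for (int fd = FD_SETSIZE - 1; fd >= 0; --fd) {
        if (FD_ISSET(fd, &set))
            return fd + 1;
    }
    return 0;
}
```

With this, the call would read select( nfds_for(readfds), &readfds, 0, 0, &timeout ) on all three platforms, assuming their fd_set macros behave as POSIX describes.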

The function getdtablesize() is known under QNX, but it returns the value 1000. If I pass this value to select(), then
select() is entered and never returns. After I replaced it with FD_SETSIZE (its value is 256), the application crashed a few lines later
when calling the function svc_getreqset( &readfds );. But this was the second problem: a NULL pointer access within my application.
The application crashed so quickly that the output of my printf calls did not become visible on the console, although they
are placed before the NULL pointer crash. So I had assumed the crash was in the wrong place.
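
One possible explanation for the missing output (an assumption, the thread does not confirm it) is that stdout was buffered and the pending text was lost when the process died. A common way to rule that out is to write trace messages to stderr and flush after each one. A minimal sketch, with trace as a hypothetical helper name:

```cpp
#include <cstdio>

// Hypothetical helper for illustration.
// Writes `msg` plus a newline to stderr and flushes immediately, so the
// text is handed to the OS before the next statement runs.
// Returns true if both the write and the flush succeeded.
bool trace(const char *msg)
{
    const bool written = std::fprintf(stderr, "%s\n", msg) >= 0;
    const bool flushed = std::fflush(stderr) == 0;
    return written && flushed;
}
```

Calling trace("rpc request received") right before svc_getreqset( &readfds ); would then narrow down where the crash happens.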

Now the complete RPC communication works between my Windows PC Tool and my QNX PC Application.
So at the end I learned a lot, and I thank you for your useful help.

Best Regards
Marc from good old Germany ;-)