Problem with multi-threaded resource managers

Hello All.


I have encountered a strange problem with multi-threaded resource managers.
I am taking part in the development of a large system based on multi-threaded resource managers.
Each resource manager registers a handler for the _IO_MSG message type and
handles client requests sent with simple MsgSend() calls:



// initialize the default message handlers and install our own handler for the _IO_MSG message type

iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
_RESMGR_IO_NFUNCS, &io_funcs);

io_funcs.msg = io_msg;


Yesterday I was surprised when I tested one critical resource manager. Although the
resource manager is multi-threaded, only one message is processed at a time.

I have found that the resource manager does not process any subsequent message until the
previous io_msg() call has finished! All other threads that receive a message stop in the
CONDVAR state until io_msg() finishes.

I do not use any synchronization calls explicitly in my own code!



I checked all the multi-threaded resource manager sources available to me. All of them
behave the same way.

Finally, I took the sample from the Neutrino help and got the same result.



It is easy to reproduce the problem:

In the resource manager: call sleep(10) after MsgReply() in the io_msg() message handler.

In the client: send messages to the resource manager in an infinite loop.

Then take a look at the resource manager threads:

Some threads are in the RECEIVE state (waiting for a message).

One thread is in the NANOSLEEP state (executing sleep()).

One thread is in the CONDVAR state. This thread has received a message but has not started processing it yet.

pidin | grep resmgr

33280030 1 ./resmgr 10o CONDVAR b034f040
33280030 2 ./resmgr 10o RECEIVE 1
33280030 3 ./resmgr 10o NANOSLEEP
33280030 4 ./resmgr 10o RECEIVE 1
33374244 1 ./resmgrclient 10o REPLY 33280030
33378341 1 ./resmgr 10o RECEIVE 1
33378341 2 ./resmgr 10o RECEIVE 1
33378341 3 ./resmgr 10o RECEIVE 1




If the client program is started several times, the resource manager has several threads in the CONDVAR state.
The number of such threads equals the number of client processes started. New clients are also slow to
connect to the resource manager, because the open() call is handled by message processing too.



Maybe there are undocumented flags for the resource manager API functions or for the thread pool functions.
I do not know of any.

Why does a multi-threaded resource manager behave this way? Is this correct? Of course the problem can be
worked around, but then what does the thread pool do for me?

Any help would be appreciated.

With best regards, Andrew

Programmer, Volgasoft Ltd. Russia.



The sources of the resource manager and the client that illustrate the problem follow (the resource manager
is taken from the Neutrino help):



// file resmgr.cpp -----------------------------------------------------------------------------------------

// test multi-threaded resource manager -------------------------------------------------------------------

#include <errno.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#include <string.h>

/*
 * define THREAD_POOL_PARAM_T such that we can avoid a compiler
 * warning when we use the dispatch_*() functions below
 */
#define THREAD_POOL_PARAM_T dispatch_context_t

#include <sys/iofunc.h>
#include <sys/dispatch.h>

static resmgr_connect_funcs_t connect_funcs;
static resmgr_io_funcs_t io_funcs;
static iofunc_attr_t attr;


/* _IO_MSG handler: reply at once, then stay busy for 10 seconds */
int io_msg( resmgr_context_t *ctp, io_msg_t *iomsg, iofunc_ocb_t *ocb )
{
    printf("Message received\n");
    MsgReply(ctp->rcvid, 0, NULL, 0);

    sleep(10);

    return _RESMGR_NOREPLY;
}

int main(int argc, char **argv)
{
    /* declare variables we'll be using */
    thread_pool_attr_t pool_attr;
    resmgr_attr_t resmgr_attr;
    dispatch_t *dpp;
    thread_pool_t *tpp;
    dispatch_context_t *ctp;
    int id;

    /* initialize dispatch interface */
    if((dpp = dispatch_create()) == NULL) {
        fprintf(stderr, "%s: Unable to allocate dispatch handle.\n",
                argv[0]);
        return EXIT_FAILURE;
    }

    /* initialize resource manager attributes */
    memset(&resmgr_attr, 0, sizeof resmgr_attr);
    resmgr_attr.nparts_max = 1;
    resmgr_attr.msg_max_size = 2048;

    /* initialize functions for handling messages */
    iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
                     _RESMGR_IO_NFUNCS, &io_funcs);

    /* install our own handler for _IO_MSG */
    io_funcs.msg = io_msg;

    /* initialize attribute structure used by the device */
    iofunc_attr_init(&attr, S_IFNAM | 0666, 0, 0);

    /* attach our device name */
    id = resmgr_attach(dpp,            /* dispatch handle        */
                       &resmgr_attr,   /* resource manager attrs */
                       "/dev/sample",  /* device name            */
                       _FTYPE_ANY,     /* open type              */
                       0,              /* flags                  */
                       &connect_funcs, /* connect routines       */
                       &io_funcs,      /* I/O routines           */
                       &attr);         /* handle                 */
    if(id == -1) {
        fprintf(stderr, "%s: Unable to attach name.\n", argv[0]);
        return EXIT_FAILURE;
    }

    /* initialize thread pool attributes */
    memset(&pool_attr, 0, sizeof pool_attr);
    pool_attr.handle = dpp;
    pool_attr.context_alloc = dispatch_context_alloc;
    pool_attr.block_func = dispatch_block;
    pool_attr.handler_func = dispatch_handler;
    pool_attr.context_free = dispatch_context_free;
    pool_attr.lo_water = 2;
    pool_attr.hi_water = 4;
    pool_attr.increment = 1;
    pool_attr.maximum = 50;

    /* allocate a thread pool handle */
    if((tpp = thread_pool_create(&pool_attr,
                                 POOL_FLAG_EXIT_SELF)) == NULL) {
        fprintf(stderr, "%s: Unable to initialize thread pool.\n",
                argv[0]);
        return EXIT_FAILURE;
    }

    /* start the threads, will not return */
    thread_pool_start(tpp);
}


Test client sources:



// file resmgrclient.cpp -----------------------------------------------------------------------------

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <sys/iofunc.h>
#include <sys/neutrino.h>

int main(int argc, char **argv)
{
    // open() is used only to obtain a connection to the resource manager
    int nCoid = ::open("/dev/sample", O_WRONLY);
    if ( nCoid == -1)
    {
        perror("Can't open resource");
        return EXIT_FAILURE;
    }

    while (1)
    {
        _io_msg rMsg;
        memset( &rMsg, 0, sizeof(rMsg));

        rMsg.type = _IO_MSG;

        printf("Send message to resmgr.\n");
        MsgSend(nCoid, &rMsg, sizeof(rMsg), NULL, 0);
        printf("Reply received\n");
    }
}

“Andrew Chesnokov” <achesnokov@nospam-datac-control.com> wrote in message
news:bafpg3$5u4$1@inn.qnx.com

[…]

Andrew,

My guess is that what you are encountering is due to the attributes lock.
Unless the code in your io_read/write, etc. explicitly unlocks the attributes
lock, then it doesn’t matter how many threads you have in your pool -
they will be blocked waiting for the lock until the call that has it returns
to the resource manager library, which will release the lock. I believe
it is true that the thread with the lock will potentially be bumped up
in priority if a higher priority thread becomes blocked.

Of course, you have to be careful about unlocking the attributes
lock. It is there for a reason!

I hope this helps.

/Kirk
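
The serialization Kirk describes can be modelled outside QNX with a plain mutex. What follows is only a minimal, portable C++ sketch and not resource manager code: `Attr`, `dispatch_locked` and `peak_concurrency_same_attr` are hypothetical names, and the mutex merely stands in for the attributes lock that the library is said to take before calling a handler and release afterwards.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for one iofunc_attr_t: all we model is its lock (assumption).
struct Attr {
    std::mutex lock;
};

// Model of the library side: lock the attribute, call the handler, unlock.
template <typename Handler>
void dispatch_locked(Attr& attr, Handler handler) {
    std::lock_guard<std::mutex> guard(attr.lock);
    handler();
}

// Run `nthreads` pool threads against ONE attribute and report the highest
// number of threads that were ever inside the handler at the same moment.
int peak_concurrency_same_attr(int nthreads) {
    Attr attr;
    std::atomic<int> inside{0};
    std::atomic<int> peak{0};

    auto handler = [&] {
        int now = ++inside;
        int seen = peak.load();
        // Record a new maximum if this thread observed one.
        while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
        std::this_thread::sleep_for(std::chrono::milliseconds(20)); // slow handler
        --inside;
    };

    std::vector<std::thread> pool;
    for (int i = 0; i < nthreads; ++i)
        pool.emplace_back([&] { dispatch_locked(attr, handler); });
    for (auto& t : pool) t.join();
    return peak.load();
}
```

However many threads are in the pool, the peak for a single attribute stays at one, which matches the single NANOSLEEP thread in the pidin listing.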

“Kirk Bailey” <kirk.a.bailey@delphi.com> wrote in message news:bafran$21a$1@nntp.qnx.com

[…]

Thank you very much!
This is exactly what I needed.
I think this problem should be described better in the documentation.
Neither the cause of the problem nor the solution is obvious.

To summarize for everyone:

The multi-threaded resource manager from the samples behaves as single-threaded unless you
call iofunc_attr_unlock(ocb->attr) explicitly in your message handler!
All threads in the thread pool are blocked until this function is called, implicitly or explicitly.

But the question is: when is it safe to unlock the ocb->attr attributes?
Maybe in most cases I can call iofunc_attr_unlock(ocb->attr) at the beginning of the message handler,
provided I don't use (or change) the ocb parameter in the handler.

Andrew

Kirk Bailey <kirk.a.bailey@delphi.com> wrote:

[…]

I think Kirk is correct – before any callout for an IO message to
be handled, there is a lock done on the attribute structure that was
opened, and it is unlocked after each message is processed.

This is important in the normal case.

In your message handling do you use the attribute structure and OCB
at all?

If not, you might want to look at message_attach(), which is useful for
messages that are not open() oriented (basically using open for location
and that’s it).

On the sending side, your message could use any message type greater
than _IO_MAX (511).

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.
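
To picture the routing David suggests, here is a small portable C++ model of dispatching by message type range. It is a sketch under assumptions and does not reproduce the real message_attach() call or its signature: `Router`, `Range`, `attach`, `dispatch` and `kIoMax` are hypothetical names, the returned strings are labels only, and refusing ranges at or below 511 is a choice made by this model.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Upper bound of the message types used by the resmgr framework,
// taken from the value 511 quoted in the post above.
const int kIoMax = 511;

// One privately attached range of message types (hypothetical model type).
struct Range {
    int low;
    int high;
    std::function<std::string(int)> handler;
};

class Router {
public:
    // Attach a handler for types in [low, high]. This model simply refuses
    // ranges that reach down into the framework's own message types.
    bool attach(int low, int high, std::function<std::string(int)> handler) {
        if (low <= kIoMax || high < low) return false;
        ranges_.push_back({low, high, std::move(handler)});
        return true;
    }

    // Route one message by its type.
    std::string dispatch(int type) const {
        for (const Range& r : ranges_)
            if (type >= r.low && type <= r.high) return r.handler(type);
        if (type <= kIoMax) return "framework path (attribute locked)";
        return "unhandled";
    }

private:
    std::vector<Range> ranges_;
};
```

In this model a message with a type above 511 never reaches the framework path, so no attribute lock is involved in handling it.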

It is not ‘single threaded’. It is ‘serialized’ by default, since it can
access a common resource.
That definitely begs to be put in a bold font somewhere in the docs.

Of course, as long as you don’t update attributes in your I/O handlers you
can just set attribute locking functions to NULL and that would
‘unserialize’ it.

“Andrew Chesnokov” <achesnokov@nospam-datac-control.com> wrote in message
news:bafu0u$bep$1@inn.qnx.com

[…]

Andrew Chesnokov <achesnokov@nospam-datac-control.com> wrote:

The multi-threaded resource manager from the samples is single threaded
if you don’t use iofunc_attr_unlock(ocb->attr) function explicitly in your
message processing function! All threads in thread pool are blocked until
this function is called implicitly or explicitly.

Just to clarify, this is only true for each individual “entity” (openable
iofunc_attr_t object). So if you have just a single (non-dir) pathname,
then this serialisation will appear to be largely single-threaded; if
you have a full filesystem then you can do multi-threaded access to
different attrs/files at the same time. Note that you don’t want to
be gratuitously unlocking the attr in the general case because this
is not POSIX (read/write must be atomic) and you could get corruption
updating fields in the iofunc_attr_t. It is arguable though that if
you have a resmgr with just a single non-directory pathname that you
may as well just be single-threaded; in this case setting the lock/unlock
ocb callouts (now unneeded) to NULL will gain you some performance.
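
The per-entity point can be checked with a portable C++ model. This is a sketch under assumptions, not QNX code: `Attr` and `handlers_overlap` are hypothetical names, and each `Attr` mutex stands in for the lock of one openable iofunc_attr_t object.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Stand-in for one iofunc_attr_t: only its lock is modelled (assumption).
struct Attr {
    std::mutex lock;
};

// Two pool threads each handle one message, the first on `a`, the second on
// `b` (which may be the same object). Each handler waits up to `timeout_ms`
// for the other handler to be running too. Returns true if both threads
// were inside their handlers at the same time.
bool handlers_overlap(Attr& a, Attr& b, int timeout_ms) {
    std::mutex m;
    std::condition_variable cv;
    int inside = 0;
    bool met = false;

    auto serve = [&](Attr& attr) {
        std::lock_guard<std::mutex> attr_guard(attr.lock); // library locks the attr
        std::unique_lock<std::mutex> lk(m);
        ++inside;
        if (inside == 2) {          // the other handler is running as well
            met = true;
            cv.notify_all();
        }
        cv.wait_for(lk, std::chrono::milliseconds(timeout_ms), [&] { return met; });
        --inside;
    };                              // attr lock released when the handler returns

    std::thread t1(serve, std::ref(a));
    std::thread t2(serve, std::ref(b));
    t1.join();
    t2.join();
    return met;
}
```

With two different `Attr` objects the handlers meet. With the same object passed twice, the second handler cannot start until the first has timed out and returned, so they never meet.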

But the question is: when it is safe to unlock ocb->attr attributes?
May be in most cases, I may call iofunc_attr_unlock(ocb->attr) at the
beginning of message handler when I don’t use (change) the ocb parameter
in message processing function.

You should only manually unlock it if you intend to block the client
without replying (ie store the rcvid/scoid away on a notify list
somewhere because the request can’t be immediately handled and return
_RESMGR_NOREPLY). In all other cases the attr should remain locked
during the message processing. See above re suggestion to just be
single-threaded. Note that if you manually unlock as you describe
you must also manually re-lock, or else the implicit one will EPERM.
Also note that it is not really the OCB that this lock protects, it
is the iofunc_attr_t, which will be updated by all the iofunc_*()
helper routines.
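
The re-lock rule can be illustrated with an owner-checked lock in portable C++. This is a model of the behaviour described in the post, under assumptions, and not the library's implementation: `AttrLock`, `dispatch`, `handler_relocks` and `handler_forgets_relock` are hypothetical names.

```cpp
#include <cerrno>
#include <condition_variable>
#include <mutex>
#include <thread>

// Owner-checked lock standing in for the attribute lock (hypothetical model).
class AttrLock {
public:
    void lock() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !held_; });
        held_ = true;
        owner_ = std::this_thread::get_id();
    }

    // Fails with EPERM when the calling thread does not hold the lock.
    int unlock() {
        std::lock_guard<std::mutex> lk(m_);
        if (!held_ || owner_ != std::this_thread::get_id()) return EPERM;
        held_ = false;
        cv_.notify_one();
        return 0;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::thread::id owner_;
    bool held_ = false;
};

// Model of the library side: lock, call the handler, then unlock implicitly.
// Returns the result of that implicit unlock.
template <typename Handler>
int dispatch(AttrLock& attr, Handler handler) {
    attr.lock();
    handler(attr);
    return attr.unlock();
}

// Handler that unlocks around slow work and re-locks before returning.
inline void handler_relocks(AttrLock& attr) {
    attr.unlock();
    /* ... slow work that does not touch the attribute ... */
    attr.lock();
}

// Handler that unlocks and returns without taking the lock again.
inline void handler_forgets_relock(AttrLock& attr) {
    attr.unlock();
}
```

In this model `dispatch(attr, handler_relocks)` yields 0, while `dispatch(attr, handler_forgets_relock)` yields EPERM from the implicit unlock.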

Hi Andrew,
That’s normal!
Normally a multi-threaded resmgr is useful for serving multiple resmgrs, so
as you only supply /dev/sample, multiple threads (normally) don’t help you.
Here we can ask the following question: do you really need a resmgr, or
just a message-passing facility?
Personally, I also tend to use a resmgr even for plain io_msg-style
communication.
So in that case, and if you know what you are doing with your resmgr, there is
a solution.
In fact, when a client accesses a resmgr, the library locks the
resmgr’s attribute structure, which contains some ‘critical fields’ such
as access rights, file offset, file length, etc.
In the file-access case, we can understand that it could be harmful to
allow several clients to access the same file at the same time.

If that is not your use case, as with io_msg handling, you can
unlock the attribute structure at the beginning of your io_msg() function
and relock it before returning, so that another client can access your
/dev/sample while the first one is still inside io_msg(). In that case
multi-threading becomes useful again.



int io_msg( resmgr_context_t *ctp, io_msg_t *iomsg, iofunc_ocb_t *ocb )
{
    /* release the attribute lock so other threads can serve this device */
    iofunc_attr_unlock(ocb->attr);

    printf("Message received\n");
    MsgReply(ctp->rcvid, 0, NULL, 0);

    sleep(10);

    /* take the lock again: the library unlocks it after we return */
    iofunc_attr_lock(ocb->attr);

    return _RESMGR_NOREPLY;
}


Regards,
Alain.
Andrew Chesnokov wrote:

[…]