Regarding use of named semaphores

I am using named semaphores to implement locks across applications.
I create the semaphore with sem_open and then use sem_wait and sem_post.
However, in some cases sem_post fails. As a result, sem_wait blocks indefinitely the next time it is used. What are the possible reasons a post on a semaphore would fail?

:unamused:

:frowning:

Sarswati,

As always, it would be helpful to see the code block relating to your sem_post(), sem_init(), etc. calls.

Also, when sem_post() fails, what does errno say?

Tim

Bad programming, maybe? Not reading the documentation, maybe? The universe has something against you? This is a common occurrence in the programming world.

I have created a wrapper class which encapsulates the semaphore functionality.
Here it is:
#include "CNamedSemaphore.h"

CNamedSemaphore::CNamedSemaphore(const string& strSemName, const Rhp_uint32_t uiInitCount) : m_phSemHandle(NULL),
 m_strName("") {
    //#[ operation CNamedSemaphore(const string&,const Rhp_uint32_t)
    
    // Save the name in the member variable
    m_strName = strSemName; 
    
    // Opening a named semaphore with the name that was passed in the
    // arguments. Open it in create mode.
    // Give the semaphore permission for read, write 
    // and execute all across. Declare initial value as specified.
    m_phSemHandle = sem_open(strSemName.c_str(), O_CREAT, S_IRWXG | S_IRWXO | S_IRWXU, 
    													uiInitCount);
    //#]
}

CNamedSemaphore::~CNamedSemaphore() {
    //#[ operation ~CNamedSemaphore()
    
    // Close the semaphore (this does not unlink it)
    sem_close(m_phSemHandle);
    
    //#]
}

Rhp_int32_t CNamedSemaphore::getValue(Rhp_int32_t& iValue) const {
    //#[ operation getValue(Rhp_int32_t&) const
    
    return sem_getvalue(m_phSemHandle, &iValue);
    //#]
}

RhpBoolean CNamedSemaphore::post() const {
    //#[ operation post() const
    RhpBoolean bStatus = false;
    
    // Increment a semaphore
    const Rhp_int32_t iSemStatus = sem_post(m_phSemHandle);
    if( 0 == iSemStatus)
    {
    	bStatus = true;
    }
    return bStatus; 
    //#]
}

RhpBoolean CNamedSemaphore::tryWait() const {
    //#[ operation tryWait() const
    RhpBoolean bStatus = false;
    
    // Wait on a semaphore, but don't block
    const Rhp_int32_t iSemStatus = sem_trywait(m_phSemHandle);
    if( 0 == iSemStatus)
    {
    	bStatus = true;
    }
    
    return bStatus; 
    //#]
}

RhpBoolean CNamedSemaphore::wait() const {
    //#[ operation wait() const
    RhpBoolean bStatus = false;
    
    // Wait on a semaphore
    const Rhp_int32_t iSemStatus = sem_wait(m_phSemHandle);
    
    if( 0 == iSemStatus)
    {
    	bStatus = true;
    }
    return bStatus; 
    //#]
}

}

Now to use it:
I create a CNamedSemaphore obj("mysem");
obj.wait();
//my code//

bStatus = obj.post();

In some cases post returns false,
and then my next process, which needs to access the same semaphore, doesn’t get it and hangs…

I don’t see any sem_init() in there?

I have used a named semaphore, for which the sem_open call is used. sem_init is used for unnamed semaphores. Another thing I would like to point out is that this happens rarely. I don’t even have steps to reproduce it. All of a sudden I realize my application is stuck, and on examining the logs I find that the sem_post has failed…

Could it be that my code between the wait and the post is sometimes doing something that leads to memory corruption?

Sarswati,

A few things:

  1. In the sem_open call in the constructor you never check the status of what was returned to see if it was valid or not.

  2. You didn’t post the CNamedSemaphore.h content that shows the rest of the class. Specifically I’m interested in knowing whether the default constructor, copy constructor and assignment operator are forbidden. If they aren’t you may get a classic error where you do something like:

CNamedSemaphore obj("mysem");
CNamedSemaphore obj1;
obj1 = obj; // Big mistake as you now have 2 instances of the class managing 1 semaphore and you’ll definitely end up with problems.

  3. Are you managing this semaphore between processes or threads within a process? I ask because the constructor is always trying to create the semaphore. If you are using it between 2 processes, the 2nd process doesn’t need to create it since the first one already did. It should just be trying to open an existing one. This is why I asked about you checking the failure in the sem_open call.
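
Points 1 and 2 can be sketched roughly as follows. This is only an illustration, not the poster's actual class: the name CheckedNamedSemaphore, the exception on failure, and the owner-only permission bits are all assumptions (processes running as different users would need wider permissions), and `= delete` assumes a C++11 compiler. On older compilers, private undefined declarations achieve the same thing.

```cpp
#include <fcntl.h>      // O_CREAT
#include <semaphore.h>  // sem_open, sem_close, sem_wait, sem_post, SEM_FAILED
#include <sys/stat.h>   // S_IRUSR, S_IWUSR
#include <cassert>
#include <cerrno>
#include <cstring>      // std::strerror
#include <stdexcept>
#include <string>

// Hypothetical wrapper used for illustration only.
class CheckedNamedSemaphore {
public:
    CheckedNamedSemaphore(const std::string& name, unsigned int initCount)
        : m_handle(SEM_FAILED) {
        assert(!name.empty());
        // O_CREAT without O_EXCL creates the semaphore if it is missing and
        // otherwise attaches to the existing one (initCount is then ignored).
        m_handle = sem_open(name.c_str(), O_CREAT, S_IRUSR | S_IWUSR, initCount);
        // sem_open reports failure with SEM_FAILED, which need not be NULL.
        if (m_handle == SEM_FAILED) {
            const int err = errno;  // save errno before anything can overwrite it
            throw std::runtime_error("sem_open(" + name + "): " + std::strerror(err));
        }
    }

    ~CheckedNamedSemaphore() { sem_close(m_handle); }

    // Forbid copying so two objects can never close the same handle.
    CheckedNamedSemaphore(const CheckedNamedSemaphore&) = delete;
    CheckedNamedSemaphore& operator=(const CheckedNamedSemaphore&) = delete;

    bool wait() { return sem_wait(m_handle) == 0; }
    bool post() { return sem_post(m_handle) == 0; }

private:
    sem_t* m_handle;
};
```

With the check in the constructor, a failed open shows up immediately at construction time instead of later as a failing wait or post.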

Tim

Hi Tim,
Answer to your queries:

  1. sem_open returns a pointer to a sem_t structure containing 2 fields (owner, count)… What should I check for failure? I don’t think it would allow checking against -1, which is the usual return code for failure.
  2. My default constructor, copy constructor and assignment operator are all private, so the 2nd issue won’t arise.
  3. This class is used for managing resources across processes. As per the QNX help, an attempt to create a semaphore with an existing name simply attaches to the existing semaphore.
    Another point to note is that sem_wait never fails. If sem_open had failed, then sem_wait would also have failed. It is always sem_post that fails.

Compare m_phSemHandle against SEM_FAILED, which is what sem_open returns on failure. When the post fails, what does errno say?

Sarswati,

Are you checking the count of your Semaphore after creation anywhere (the code you posted doesn’t)?

According to the documentation, named semaphores persist as long as the node isn’t rebooted. So between runs of your code, the semaphore will still exist. Thus the second/third etc time you run, the semaphore may not be in the state (count) you expect.

I would also print the error message (errno) that occurs when the sem_post call fails. Maybe somehow you’ve corrupted the m_phSemHandle during the running of your code which causes sem_post to fail.
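
A minimal sketch of both suggestions (reporting the count right after opening, and logging errno together with the handle when a post fails) might look like this. The function names openAndReport and postAndLog, the log format, and the owner-only permission bits are made up for illustration; only the sem_* calls themselves are standard.

```cpp
#include <fcntl.h>      // O_CREAT
#include <semaphore.h>  // sem_open, sem_post, sem_getvalue, SEM_FAILED
#include <sys/stat.h>   // S_IRUSR, S_IWUSR
#include <unistd.h>     // getpid
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>      // std::strerror

// Hypothetical helper: opens (or attaches to) the semaphore and reports the
// count it has right now, which may be left over from an earlier run.
sem_t* openAndReport(const char* name, unsigned int initCount) {
    assert(name != nullptr);
    sem_t* h = sem_open(name, O_CREAT, S_IRUSR | S_IWUSR, initCount);
    if (h == SEM_FAILED) {
        const int err = errno;
        std::fprintf(stderr, "sem_open(%s) failed: errno=%d (%s)\n",
                     name, err, std::strerror(err));
        return SEM_FAILED;
    }
    int count = 0;
    if (sem_getvalue(h, &count) == 0) {
        std::fprintf(stderr, "%s: opened with count %d\n", name, count);
    }
    return h;
}

// Hypothetical helper: posts and, on failure, logs everything that is cheap
// to capture so that a rare failure leaves evidence behind.
bool postAndLog(sem_t* handle, const char* name) {
    assert(name != nullptr);
    if (sem_post(handle) == 0) {
        return true;
    }
    const int err = errno;  // save before fprintf can overwrite it
    std::fprintf(stderr, "sem_post(%s) failed: handle=%p errno=%d (%s) pid=%ld\n",
                 name, static_cast<void*>(handle), err, std::strerror(err),
                 static_cast<long>(getpid()));
    errno = err;  // leave errno as sem_post set it for the caller
    return false;
}
```

Logging the handle value next to errno matters here: comparing it with the value printed at open time shows whether the pointer itself changed.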

Tim

Unfortunately I have never been able to get the errno, as my code doesn’t print it, and whenever I have included the code to print errno the situation hasn’t arisen. As I said before, post fails only very rarely, but it is an issue since my entire system hangs.
However, on checking the help I saw that there are only 2 errno values associated with post:
EINVAL
Invalid semaphore descriptor sem.
ENOSYS
The sem_post() function isn’t supported.
I don’t think the second would arise, else my code would never have worked. So my guess is that I am getting the invalid descriptor error.

Also, on initialization my code physically removes all the semaphores from the system, so whenever I rerun my code I start from a clean slate.

Sarswati,

Ah the dreaded ‘Quantum Observability Problem’ or Hawthorne Effect.

I suggest you simply always leave the printf in your code then and the problem will be solved ;)

I am pretty sure you are going to get an EINVAL error when the problem happens. That isn’t what’s going to be interesting. What will be interesting is what your handle is at that point. Either it’s (A) corrupted or (B) you’ve closed the semaphore.

Since (B) seems unlikely from the code you posted, (A) is almost assuredly what has happened. The trick will be printing out the right information to help detect what overflowed and corrupted your handle.
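
One way to gather that information, sketched under assumptions: keep guard words on either side of the handle plus a second copy of it, and check them before each post. Everything here (GuardedHandle, kGuard, looksIntact) is a hypothetical debugging aid rather than part of the posted class, and it can only catch overwrites that actually touch these fields.

```cpp
#include <semaphore.h>  // sem_t, SEM_FAILED
#include <cassert>
#include <cstdio>

// Hypothetical debugging aid; none of these names come from the original class.
const unsigned long kGuard = 0xDEADBEEFUL;

struct GuardedHandle {
    unsigned long guardBefore;  // changes if something overruns into this struct
    sem_t*        handle;       // the pointer actually passed to sem_wait/sem_post
    unsigned long guardAfter;
    sem_t*        shadow;       // second copy taken right after a successful sem_open
};

GuardedHandle makeGuarded(sem_t* h) {
    assert(h != SEM_FAILED);  // only wrap handles that sem_open really returned
    return GuardedHandle{kGuard, h, kGuard, h};
}

// Returns true if nothing looks overwritten; otherwise logs what changed.
bool looksIntact(const GuardedHandle& g, const char* where) {
    const bool ok = g.guardBefore == kGuard &&
                    g.guardAfter == kGuard &&
                    g.handle == g.shadow;
    if (!ok) {
        std::fprintf(stderr, "%s: handle=%p shadow=%p guards=%#lx/%#lx\n",
                     where, static_cast<void*>(g.handle),
                     static_cast<void*>(g.shadow), g.guardBefore, g.guardAfter);
    }
    return ok;
}
```

Calling looksIntact just before sem_post, and again when sem_post fails, narrows down whether the handle was already damaged by the code that ran between the wait and the post.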

Tim

Thanks for this. I have the same problem, and when I tried this it simply solved the problem.