]{ristoph <news2@kristoph.net> wrote:
: Greetings,
: Please ignore my last, rather alarmist, post. I dropped the towel as Igor
: would say ;o)
: Let me just step back and give you more of an idea of what I am doing. I
: noticed that there were a number of problems with the version of samba that
: is posted on qnxstart. The main symptoms were an accumulation of zombies and
: basically no support for SIGHUP.
: I had a closer look at the problem and I found that, in fact, signals were
: not working at all in any of the samba binaries. The problem turned out to be
: that RTP advertises SA_RESTART but it does not support it. In samba, if
: SA_RESTART is defined it is used in calls to sigaction. sigaction then fails
: but samba does not notice.
: So, I removed SA_RESTART and sigaction was working fine. However, a new
: problem appeared. I was running smbd as a daemon and as soon as a connection
: was closed the daemon would stop working. I traced that to the fact that
: sys_select (a samba function) stopped returning after a signal, such as
: SIGCHLD, interrupted it. I only quickly looked at the sys_select and
: assumed, incorrectly, that it simply mapped to select. Hence, I got the
: impression that select was hanging …
: In fact, sys_select looks something like this (removing all the ifdef’s) …
select() will clear your fdset while waiting for the event. You
need to reset it before the next call. You’re probably waiting
for a SIGSELECT but no manager has been armed to send you one.
: int sys_select(int maxfd, fd_set *fds,struct timeval *tval)
: {
: struct timeval t2;
fd_set fds2;
: int selrtn;
: do
: {
memcpy(&fds2, fds, sizeof *fds);
: if (tval) memcpy((void *)&t2,(void *)tval,sizeof(t2));
: errno = 0;
: selrtn = select(maxfd,SELECT_CAST &fds2,NULL,NULL,tval?&t2:NULL);
: }
: while (selrtn<0 && errno == EINTR);
: return(selrtn);
: }
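To spell out the "reset it before the next call" point, here is a stand-alone sketch of a retry loop that re-arms a working copy of the descriptor set on every pass and hands the result back to the caller only on success. The name sys_select_retry is hypothetical and this is not the samba function; it assumes a plain POSIX select() with no SELECT_CAST.

```c
#include <errno.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical wrapper: retry select() on EINTR without losing the
 * caller's descriptor set or timeout. */
int sys_select_retry(int maxfd, fd_set *fds, struct timeval *tval)
{
    fd_set work;
    struct timeval t2;
    int selrtn;

    do {
        /* select() may modify both arguments, so start each pass from
         * the caller's untouched originals. */
        memcpy(&work, fds, sizeof work);
        if (tval)
            t2 = *tval;
        errno = 0;
        selrtn = select(maxfd, &work, NULL, NULL, tval ? &t2 : NULL);
    } while (selrtn < 0 && errno == EINTR);

    /* Report the ready descriptors back only when select() succeeded. */
    if (selrtn >= 0)
        memcpy(fds, &work, sizeof work);
    return selrtn;
}
```

Note that the full timeout is restarted after each interruption, which matches the quoted loop but means a steady stream of signals can keep the call waiting longer than tval.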
: So, now I’ll try again to describe the problem …
: If the daemon is sitting in the select (typically without a timeout) and it
: receives a SIGHUP (or SIGCHLD) select will return with a -1 and errno will
: be EINTR, as expected. The do … while will cause select to be called again.
: At that point the select will no longer respond to connections, which is not
: expected. It will respond to signals though and return with EINTR again if a
: signal is received.
: Now, if SIGTERM is received (which is also caught by samba) while in this
: “dead” select state the entire system will freeze at the point where the
: SIGTERM handler hits “exit(0)”.
There was a bug where the stack would run READY if a listening socket
was closed that had queued connections on it (hadn’t called accept()
yet). This sounds like what may be happening. Can you verify this?
This is fixed for the next patch.
: If you would like to experience this yourself …
: (make sure you have the full stack running)
: 1) Get the latest samba source (2.0.7)
: 2) Get the latest config.* files
: 3) Run ./configure
: 4) Edit source/include/config.h, add ‘#define HAVE_FCNTL_LOCK 1’
: 5) Edit source/lib/signal.c
: * Comment out the following two lines in the CatchSignal function.
: * if ( signum != SIGALRM )
: * act.sa_flags = SA_RESTART;
: *
: 6) make install
: 7) /usr/local/samba/bin/smbd -D
: 8) kill -SIGHUP smbdpid
: 9) kill -SIGHUP smbdpid
: Expect that your system will be dead after step 9. Actually, for step 8 you
: can use fs-cifs to make a connection (you’ll need to set-up smb.conf) and
: then kill that connection. smbd will fork on the connection and as the child
: dies it will send a SIGCHLD to the parent.
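Since the zombies come from children whose SIGCHLD is never acted on, a parent would typically reap them along these lines. This is a generic POSIX sketch, not what smbd actually does; reap_children and sigchld_handler are hypothetical names.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: collect every child that has already exited,
 * without blocking. Returns the number of children reaped. */
int reap_children(void)
{
    int reaped = 0;
    int status;

    /* One SIGCHLD can stand for several exits, so loop until
     * waitpid() reports nothing more to collect. */
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;
    return reaped;
}

/* Hypothetical SIGCHLD handler: waitpid() is async-signal-safe, and
 * errno is saved because the interrupted code may be inspecting it. */
void sigchld_handler(int signum)
{
    int saved_errno = errno;

    (void)signum;
    reap_children();
    errno = saved_errno;
}
```

Saving errno matters here because the handler can run between select() returning -1 and the caller testing for EINTR.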
: Clearly, a fairly serious bug in NTO.
: I found some clean workarounds for all of the above issues so I will post
: both the source and a new samba binary in the next day or so.
: ]{ristoph