Dmitri Poustovalov wrote:
That’s fine. But in order to be POSIX compliant kill(pid, 0) is supposed
to answer “is it alive?” question not “a different question”, isn’t it?
That’s the point: it isn’t. kill() just tells you whether the pid matches an
existing process, without making a distinction between live processes and
zombie processes.
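A minimal sketch of that behaviour, assuming a POSIX system where waitid() supports WNOWAIT. The helper name probe_zombie() is made up for illustration. It forks a child that exits at once, probes it with kill(pid, 0) while it is still an unreaped zombie, then probes again after reaping it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of any library.
 * Each out-parameter receives 0 if kill(pid, 0) succeeded, or the errno
 * value if it failed. Returns 0 on success, -1 if the setup itself failed. */
int probe_zombie(int *while_zombie, int *after_reap)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);                        /* child: terminate immediately */

    /* Block until the child has terminated, but leave it unreaped
     * (WNOWAIT), so it stays a zombie from here on. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return -1;

    /* The child is dead, yet its pid still names an existing process. */
    *while_zombie = (kill(pid, 0) == 0) ? 0 : errno;

    if (waitpid(pid, NULL, 0) != pid)    /* reap: the pid becomes invalid */
        return -1;

    /* Assumes the pid has not been reused in the meantime. */
    *after_reap = (kill(pid, 0) == 0) ? 0 : errno;
    return 0;
}
```

On a typical system the first probe should report success and the second ESRCH: the answer changes when the parent reaps the child, not when the child dies.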
…
I am failing to see how zombie business is applicable to the testcase I
described above. There is no parent-child relationship.
The zombie business was supposed to be an obvious example of why process
termination is not an atomic thing. I guess it wasn’t that obvious
after all.
When a process terminates, a lot of things happen: fds are closed;
signals and pulses are sent; memory is unmapped; the process becomes a
zombie; its parent returns from waitpid(); the pid becomes invalid. If
you try to detect the order of those things, you shouldn’t be surprised
that they happen in a certain order. Some of them are guaranteed to
happen during termination (i.e. before the process becomes a zombie) and
some after (e.g. the child must complete its termination before
waitpid() returns in the parent). But beyond that, you won’t find many
promises in POSIX or our docs about the order of things.
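One ordering that is easy to observe from the parent is sketched below, under the assumption of a plain POSIX fork/pipe setup. The function name is invented for this example. The child holds the write end of a pipe until it exits. The parent sees EOF once the child’s fds are closed, and at that moment the pid is still valid because nobody has called waitpid() yet.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Returns 1 if the parent saw EOF on the pipe
 * (the child's fds were closed) while kill(pid, 0) still reported the
 * pid as valid, 0 if not, and -1 if the setup failed. */
int eof_seen_before_pid_invalid(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        _exit(0);           /* the write end is closed during termination */
    }

    close(fds[1]);          /* parent keeps only the read end */

    char c;
    ssize_t n;
    do {
        n = read(fds[0], &c, 1);   /* returns 0 (EOF) once the child's fd is gone */
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    /* The child has not been reaped yet, so its pid must still exist. */
    int pid_still_valid = (n == 0) && (kill(pid, 0) == 0);

    waitpid(pid, NULL, 0);  /* only now does the pid become invalid */
    return pid_still_valid;
}
```

This only demonstrates the one ordering that is guaranteed (the pid outlives the fds until the parent reaps). It says nothing about the relative order of the other termination steps.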
Since the OS doesn’t know in general how long a process will remain in
the zombie state, it’s desirable to free up its resources before it
turns into a zombie. In particular, it seems reasonable that we do
not promise that you can access the memory of a process that has
completed its termination and turned into a zombie. Since the
/proc/pid/as entry of the process represents its address space, you
shouldn’t be surprised that it goes away sooner than the pid. That was
the main point I was trying to make with the zombie business. In
general, there’s a stage in the life cycle of a process when its pid is
still valid, and kill() tells you it’s still valid, but most of its
other resources are gone, and any API that normally lets you access them
fails.
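To watch that stage from the outside, something like the following sketch could be polled while a process terminates. It assumes the /proc/<pid>/as path mentioned above. The struct and function names are hypothetical, and on systems that have no such /proc entry the second field will simply always be 0.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type for this example. */
struct pid_probe {
    int pid_valid;     /* 1 if kill(pid, 0) says the pid exists */
    int as_openable;   /* 1 if /proc/<pid>/as could be opened read-only */
};

/* Asks two independent questions about the same pid. */
struct pid_probe probe_pid(pid_t pid)
{
    struct pid_probe p = {0, 0};
    char path[64];

    /* EPERM means the pid exists but we may not signal it. */
    if (kill(pid, 0) == 0 || errno == EPERM)
        p.pid_valid = 1;

    /* Assumed path layout, taken from the discussion above. */
    snprintf(path, sizeof path, "/proc/%ld/as", (long)pid);
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        p.as_openable = 1;
        close(fd);
    }
    return p;
}
```

A result of pid_valid == 1 with as_openable == 0 for a process that used to be openable would be the intermediate stage described above: the pid is still there, the address space is not.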
If you run the testcase you would see that Dummy process was not a
zombie and pidin reported it Ready. And one can make Dummy a daemon or
use SPAWN_NOZOMBIE flag, the result is going to be the same – kill()
has no clue what Dummy’s real status is.
But it’s not the job of kill() to tell you the “real status” of a
process. All it tells you is whether the pid is valid. In your test case,
you’re just making the transitions take indefinitely longer, which makes
it easier to notice that they don’t happen instantaneously.