debugging audio driver

I am hunting a weird bug in maestro-3 driver. It seems to be working fine
for anything but Doom and Quake :wink:
More precisely, Doom has ‘delayed’ sound, as if all sound effects are being
buffered before playing. Delay is about 1-2 secs, very consistent. It does
not happen with MPEG video, only Doom. I can’t deduce anything from log
file, it seems like everything works perfectly. Any ideas what could
possibly cause that effect?

Quake does not have sound at all because aquire() always fails within
ado_shm_alloc(), which is the first function called in aquire(). Specifically,
shm_ctl() inside it returns ‘Function not implemented’, which I take to
mean nothing other than memory corruption. I am really confused: this
happens even immediately after the driver loads, when no code has been
executed before that except for ctrl_init(). The DMA size passed from
io-audio is the same as for Doom (64*4096), yet this only happens with Quake.
The only difference I can see is that the capabilities() function is called
only once before the failing aquire() call. In all other cases (different
applications) capabilities() is always called twice before aquire().
What does it mean? Why once or twice?
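
For what it's worth, ‘Function not implemented’ is the usual strerror() text for ENOSYS, and errno can be overwritten by any library call made between the failure and the log statement. Below is a minimal sketch in C of capturing errno right at the failure point, so that the logged code is the one the failing call actually set. The names failing_call(), call_and_capture() and format_failure() are hypothetical and exist only for illustration; failing_call() merely stands in for the shm_ctl() call and is not io-audio code.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a call that fails with ENOSYS,
 * the way shm_ctl() is reported to fail in this thread. */
static int failing_call(void)
{
    errno = ENOSYS;
    return -1;
}

/* Run the call and return 0 on success, or the errno value captured
 * immediately after the failure, before anything else can change it. */
static int call_and_capture(int (*fn)(void))
{
    assert(fn != NULL);
    errno = 0;
    if (fn() == -1)
        return errno;
    return 0;
}

/* Write "<what>: errno <n> (<text>)" into out; returns snprintf's result. */
static int format_failure(char *out, size_t outlen, const char *what,
                          int saved_errno)
{
    return snprintf(out, outlen, "%s: errno %d (%s)",
                    what, saved_errno, strerror(saved_errno));
}
```

Logging the numeric value next to the text makes it easier to tell a genuine ENOSYS from a stale errno left behind by an earlier call.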

One difference about Quake I’ve heard of is that it seems to be using MMAP
mode. I tried running aplay in MMAP mode and it seems to work. Anyway, I
don’t see how MMAP mode would affect anything; it would probably only make
a difference after aquire(). Does Quake do anything else differently?

None of this makes sense to me. ctrl_init() works the same way for all
apps. Immediately after that, Doom asks for a 64*4096-byte DMA buffer and
ado_shm_alloc() works; Quake asks for the same thing in the same order and
shm_ctl() inside ado_shm_alloc() mysteriously fails. If the same DLL fails in
the same execution path with the same arguments, it can only mean one thing:
corruption happens somewhere else in io-audio.

It does not seem to be fatal. I can continue using driver with other
applications after failing Quake.

I also tried to use ADO_DEBUG/ado_memory_dump to trace possible corruption
in driver, but apparently ado_alloc_debug/ado_free_debug are undefined, so
driver does not load. Why is that?

Thanks,

  • igor

Igor Kovalenko <kovalenko@home.com> wrote:

I am hunting a weird bug in maestro-3 driver. It seems to be working fine
for anything but Doom and Quake :wink:
More precisely, Doom has ‘delayed’ sound, as if all sound effects are being
buffered before playing. Delay is about 1-2 secs, very consistent. It does
not happen with MPEG video, only Doom. I can’t deduce anything from log
file, it seems like everything works perfectly. Any ideas what could
possibly cause that effect?

Usually, we see these types of bugs coming from a mistake in the
driver’s position callback. However, after a quick glance at the
Doom code, this doesn’t appear to be the case here.
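
One quick sanity check on the buffering theory is to work out how much audio the DMA buffer itself can hold. Below is a minimal sketch in C, assuming the 64*4096-byte DMA size mentioned in the original post. The helper buffer_seconds() is hypothetical, and the stream format used in the note after it is an assumption for illustration, not a value taken from Doom or from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Seconds of audio that fit in a buffer of buffer_bytes bytes for a
 * PCM stream with the given rate, channel count and sample width. */
static double buffer_seconds(size_t buffer_bytes, unsigned rate_hz,
                             unsigned channels, unsigned bytes_per_sample)
{
    assert(rate_hz > 0 && channels > 0 && bytes_per_sample > 0);
    double bytes_per_second = (double)rate_hz * channels * bytes_per_sample;
    return (double)buffer_bytes / bytes_per_second;
}
```

For example, buffer_seconds(64 * 4096, 44100, 2, 2) comes to roughly 1.5 seconds for 16-bit stereo at 44.1 kHz (an assumed format), which is in the range of the reported delay. That would be consistent with sound being queued a full buffer ahead, though on its own it proves nothing about the cause.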

One difference about Quake I’ve heard of is that it seems to be using MMAP
mode. I tried running aplay in MMAP mode and it seems to work. Anyway, I
don’t see how MMAP mode would affect anything; it would probably only make
a difference after aquire(). Does Quake do anything else differently?

MMAP mode should make no difference in this case. Also, at the point where
it is failing, the driver doesn’t know yet that the application will be
using MMAP mode.

None of this makes sense to me. ctrl_init() works the same way for all
apps. Immediately after that, Doom asks for a 64*4096-byte DMA buffer and
ado_shm_alloc() works; Quake asks for the same thing in the same order and
shm_ctl() inside ado_shm_alloc() mysteriously fails. If the same DLL fails in
the same execution path with the same arguments, it can only mean one thing:
corruption happens somewhere else in io-audio.

Since we have dozens of drivers on even more flavors of hardware, all
running this common code (even on other platforms), the possibility
seems a little remote.

I also tried to use ADO_DEBUG/ado_memory_dump to trace possible corruption
in driver, but apparently ado_alloc_debug/ado_free_debug are undefined, so
driver does not load. Why is that?

Use io-audio_g (it should come with the DDK) when you are compiling with
debug. It has all the extra symbols for memory accounting and tracing.

I figured out a few more things, which unfortunately only make me more
confused.

  1. Both Doom and Quake appear to be using MMAP mode.
  2. The problem with ado_shm_alloc() usually does not happen right
    after a fresh boot. It takes some time to happen, but once it has happened
    it repeats every time (but with Quake only). Restarting io-audio
    does not help; I need to reboot for ado_shm_alloc() to ‘work’ again.
  3. There’s still no sound at all in Quake when ado_shm_alloc() ‘works’.

To me that sounds like the corruption happens outside of the io-audio
address space; otherwise the problem would not persist after restarting
io-audio. It is probably caused by an incorrect DMA operation, but I
really don’t understand how that can happen if all other applications
are working properly, including aplay in MMAP mode.

I need some insight into what Quake and Doom are doing with sound. One
difference I noticed on driver level, all other applications
(including aplay) appear to always split DMA buffer into 4 fragments.
Doom uses no more than 3 fragments (depending on max DMA size set by
driver), Quake always uses 2.

Could that make any difference? Any other ideas? I realize this is a
remote possibility, but it still smells like something is not right with
the way MMAP mode is implemented. Perhaps it is specific to my hardware,
but it is still very interesting to figure out.

io-audio_g does indeed allow tracking memory allocations, but apparently
it is not very useful in my case.

Thanks,

  • igor

Igor Kovalenko <Igor.Kovalenko@motorola.com> wrote:

I figured out a few more things, which unfortunately only make me more
confused.

  1. Both Doom and Quake appear to be using MMAP mode.
  2. The problem with ado_shm_alloc() usually does not happen right
    after a fresh boot. It takes some time to happen, but once it has happened
    it repeats every time (but with Quake only). Restarting io-audio
    does not help; I need to reboot for ado_shm_alloc() to ‘work’ again.
  3. There’s still no sound at all in Quake when ado_shm_alloc() ‘works’.

To me that sounds like the corruption happens outside of the io-audio
address space; otherwise the problem would not persist after restarting
io-audio. It is probably caused by an incorrect DMA operation, but I
really don’t understand how that can happen if all other applications
are working properly, including aplay in MMAP mode.

Also, since you are playing, not recording, the DMA should only be reading
data from memory. It shouldn’t be writing (unless there is something
really weird about the maestro’s DMA that I am not aware of). I
believe that it would be worth your while to thoroughly investigate how the
Maestro’s DMA engine works. The fact that you must cold boot to fix it
does indicate that there is a problem at the hardware level.

I forget: is this a PCI card that you are testing on, or a laptop? If
possible, you should definitely try this out in a different machine
to rule out bad hardware other than the soundcard (e.g. bad RAM).

I need some insight into what Quake and Doom are doing with sound. One
difference I noticed on driver level, all other applications
(including aplay) appear to always split DMA buffer into 4 fragments.
Doom uses no more than 3 fragments (depending on max DMA size set by
driver), Quake always uses 2.

For your hypothesis, there is a simple test. Take a simple sound player
(e.g. wave.c) that you have the source for and force its parameters to
match those that you are suspicious of (e.g. 2 fragments) to see if it
fails in the same way.
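
Before forcing the fragment count in a test player, it may help to see what fragment sizes each count implies for the 64*4096-byte DMA buffer. Below is a minimal sketch in C. The helpers frag_bytes() and unused_bytes() are hypothetical, and rounding each fragment down to a whole number of frames is an assumption, since the thread does not say how io-audio or the driver actually rounds.

```c
#include <assert.h>
#include <stddef.h>

/* Size of one fragment when total_bytes is split into nfrags equal
 * fragments, rounded down to a multiple of frame_bytes.
 * ASSUMPTION: the round-down policy is chosen for illustration only. */
static size_t frag_bytes(size_t total_bytes, unsigned nfrags,
                         size_t frame_bytes)
{
    assert(nfrags > 0 && frame_bytes > 0);
    size_t raw = total_bytes / nfrags;
    return raw - (raw % frame_bytes);
}

/* Bytes at the end of the buffer that no fragment covers. */
static size_t unused_bytes(size_t total_bytes, unsigned nfrags,
                           size_t frame_bytes)
{
    return total_bytes
         - (size_t)nfrags * frag_bytes(total_bytes, nfrags, frame_bytes);
}
```

With a 4-byte frame (16-bit stereo, an assumed format), 4 and 2 fragments divide the buffer evenly into 65536 and 131072 bytes, while 3 fragments give 87380 bytes each and leave 4 bytes unused. A test player forced to these counts would show whether the uneven case matters on this hardware.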

Note: DOOM uses MMAP indirectly. It doesn’t call snd_pcm_mmap directly.
The plugin interface does it automatically on its behalf because it
doesn’t call snd_pcm_plugin_disable to turn off MMAP mode.

Could that make any difference? Any other ideas? I realize this is a
remote possibility, but it still smells like something is not right with
the way MMAP mode is implemented. Perhaps it is specific to my hardware,
but it is still very interesting to figure out.

I would like to too, but without a reproducible case, there is not
much that I can do beyond acting as a sounding board.

Audio Support wrote:

Also, since you are playing, not recording, the DMA should only be reading
data from memory. It shouldn’t be writing (unless there is something
really weird about the maestro’s DMA that I am not aware of).

The Maestro-3 has a DSP chip called ASSP, and that chip runs its own
‘kernel’, in which clients are linked into lists that are managed
internally by the ASSP kernel code. And you stuff DMA addresses somewhere
into that black hole.

I believe that it would be worth your while to thoroughly investigate how the
Maestro’s DMA engine works. The fact that you must cold boot to fix it
does indicate that there is a problem at the hardware level.

I believe that too, but the ‘kernel’ and ‘client’ code comes in the form of
a huge array initialized with hex values :wink: It would be fun to
investigate it thoroughly. There are no docs available beyond the ALSA code.

I forget: is this a PCI card that you are testing on, or a laptop? If
possible, you should definitely try this out in a different machine
to rule out bad hardware other than the soundcard (e.g. bad RAM).

It is a PCI card (IOMagic StormSurge). I do not have any other problems
with that machine.

I need some insight into what Quake and Doom are doing with sound. One
difference I noticed on driver level, all other applications
(including aplay) appear to always split DMA buffer into 4 fragments.
Doom uses no more than 3 fragments (depending on max DMA size set by
driver), Quake always uses 2.

For your hypothesis, there is a simple test. Take a simple sound player
(e.g. wave.c) that you have the source for and force its parameters to
match those that you are suspicious of (e.g. 2 fragments) to see if it
fails in the same way.

I tried aplay, since it has MMAP mode. Results are interesting. When
compiled using audio headers from 6.1, it works properly with any number
of fragments.

However, when compiled with pre-6.1 audio headers it does not work
properly at all. I can hear some sounds (some distorted fragments of
original) and what exactly I hear depends on number of fragments used.
It also reports UNDERRUNS with any frag number lower than 4, which
appears to be ‘default’ number (aplay normally sets frags_max to -1 and
then snd_pcm_channel_setup sets number of frags to 4)

The reason I bother mentioning the pre-6.1 stuff is that ‘objdump -x’
reports that both Doom and Quake are linked against libc.so.1 and
libasound.so.1.

Would it be jumping to conclusions if I suspected that to be a reason?
Can you give me binaries built for 6.1?

I would like to too, but without a reproducible case, there is not
much that I can do beyond acting as a sounding board.

I could send you the board, but it is $29 retail at CompUSA. Shipping
both ways will probably cost more. It is up to QNX to determine if such
an expense would be justified; I can only hint that the chipset in question
is being used by a large number of laptop models currently in production
(Dell, HP and Compaq).

  • igor