tx_down failed ENOBUFS

arno_stoum · June 14, 2002, 12:42am

Hi,

I wrote a down producer which worked fine for some days, and then it started
to act weird.
I didn’t change anything in the way I allocate the packets but, more often,
when I mount this producer to io-net, it tells me "tx_down failed" for 4 or
5 seconds and then works ok for two or three minutes and starts again to
fail, then stops… Sometimes it just works fine for ten minutes…
(I’m talking about the tx_down of the producer, not of the converter)

errno is set to ENOBUFS (no buffer space available)

npkt is allocated with no problem.

I send a packet every ms.

Anybody have an idea?

Thanks

Arnaud

arno_stoum · June 14, 2002, 12:48am

Wow, I checked the news something like 3 hours ago and Shaun’s question
wasn’t there!!
I just saw it when I checked to see if my question was posted! That’s a
funny coincidence!!

But I’m going to sleep now (3am)…

Sean_Boudreau1 · June 14, 2002, 1:16pm

The driver returns this when you’re getting ahead of it: sending
packets faster than it can tx.

-seanb
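
One way a producer could cope with this is to hold the packet and retry on its next timer tick instead of treating ENOBUFS as fatal. The following is only a sketch of that idea: `send_fn_t`, `wait_fn_t`, `send_with_retry` and the `flaky_*` stand-ins are invented names, and the `send` callback is assumed to wrap the real tx_down call and to return -1 with errno set on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback types. send() returns 0 on success, or -1 with
 * errno set. wait() blocks until the next transmit opportunity (for
 * example the next 1 ms pulse). */
typedef int  (*send_fn_t)(void *ctx);
typedef void (*wait_fn_t)(void *ctx);

/* Try to send; on ENOBUFS wait and try again, up to max_retries extra
 * attempts. Returns 0 on success, -1 on any other error or when the
 * retries run out (errno is the one set by the last send attempt). */
int send_with_retry(send_fn_t send, wait_fn_t wait, void *ctx, int max_retries)
{
    assert(send != NULL && wait != NULL && max_retries >= 0);
    for (int attempt = 0; ; attempt++) {
        if (send(ctx) == 0)
            return 0;
        if (errno != ENOBUFS || attempt >= max_retries)
            return -1;
        wait(ctx);
    }
}

/* Stand-in sender used only to exercise the helper: it fails with
 * ENOBUFS a fixed number of times, then succeeds. */
struct flaky_sender {
    int failures_left;
    int sends;
    int waits;
};

int flaky_send(void *ctx)
{
    struct flaky_sender *f = ctx;
    f->sends++;
    if (f->failures_left > 0) {
        f->failures_left--;
        errno = ENOBUFS;
        return -1;
    }
    return 0;
}

void flaky_wait(void *ctx)
{
    struct flaky_sender *f = ctx;
    f->waits++;
}
```

In a real producer the wait callback would block on the pulse that already paces the transmit thread, so a burst of early pulses would not translate into a burst of failed sends.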

arno_stoum · June 15, 2002, 9:46pm

How is it possible to determine how fast the driver can tx?
I guess it depends on the processor speed.

I send a pulse every ms to wake up the thread that transmits the packets and I
noticed that, mostly just after mounting the down producer, pulses are sent
much too fast for the first ten packets (I use a sniffer on the other side
on Windows and I thought it was maybe Windows’s fault) and that’s exactly
when the “tx_down failed” happens.

So this might be the problem? Pulses arriving faster than asked?

Thanks a lot

Arnaud

Sreekanth · June 17, 2002, 2:57am

I think this is the number of transmit buffer descriptors that the card
has. You can bump up this value using the driver’s options and get
better performance.

Sreekanth

John_A_Murphy1 · June 17, 2002, 2:04pm

I can’t find any mention of this in the network DDK. The result from rx_down is
either TX_DOWN_OK, or TX_DOWN_FAILED, with no mention of setting errno; does
io-net set errno to ENOBUFS when the driver returns TX_DOWN_FAILED?
Also, all the drivers in the DDK seem to queue tx buffers without limit, although
several of them will indicate an error if the link is down. So where does the
ENOBUFS error actually come from?

Murf

Sean_Boudreau1 · June 17, 2002, 2:11pm

The driver can either manage its queue length itself, or it
can use the DCMD_IO_NET_MAX_QUEUE devctl, which tells io-net to
enforce the limit. And yes, errno is set.

-seanb

John_A_Murphy1 · June 17, 2002, 2:51pm

Aha, I’d missed that one! Thanks! It might be a good idea to make some mention of
that in the DDK docs…

Murf

Steve_Reid1 · June 17, 2002, 3:01pm

John A. Murphy <murf@perftech.com> wrote:
: Aha, I’d missed that one! Thanks! It might be a good idea to make some mention of
: that in the DDK docs…

I’ve added it to the queue. Thanks.

Steve Reid stever@qnx.com
TechPubs (Technical Publications)
QNX Software Systems

Shaun_Jackman · June 17, 2002, 6:14pm

I noticed in pcnet_transmit_packets() (the rx_down function) the driver
never actually refuses a packet. So, io-net must be returning ENOBUFS to me
like you said. My problem is that once I receive a single
TX_DOWN_FAILED/ENOBUFS, I can no longer send any packets. Any attempt to
send a packet results in this error. So, what mechanism does io-net use to
enforce this limit? How does it calculate the number of outstanding packets?
I think somehow I’m not playing by the rules, and io-net believes that
some number of packets that have actually been sent are unaccounted for.

Thanks,
Shaun

Sean_Boudreau1 · June 17, 2002, 6:34pm

If io-net is enforcing the limit, it increments a
counter when the driver gets a packet, and decrements
it when the driver calls ion->tx_done() on a packet.
If the number of packets in the driver is at the limit,
the driver won’t see any more until some outstanding
ones are released; in the interim, io-net itself calls
tx_done() on the new packet and returns the error.

-seanb
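
The accounting described above can be pictured with a toy model. This is a sketch only and is not taken from io-net's source: `txq_model`, `model_tx_down` and `model_tx_done` are invented names, and the unsigned counter type is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model of per-driver transmit accounting. Names and types are
 * invented for this sketch; they are not io-net symbols. */
struct txq_model {
    unsigned queued;  /* packets handed to the driver and not yet released */
    unsigned limit;   /* maximum number the driver may hold at once */
};

/* An upper module sends a packet down. Returns 0 if the packet would be
 * passed to the driver, or -1 with errno = ENOBUFS if the driver is at
 * its limit (the driver never sees the packet in that case). */
int model_tx_down(struct txq_model *q)
{
    assert(q != NULL);
    if (q->queued >= q->limit) {
        errno = ENOBUFS;
        return -1;
    }
    q->queued++;
    return 0;
}

/* The driver releases a packet. Deliberately unchecked: one release too
 * many wraps the unsigned counter to a huge value. */
void model_tx_done(struct txq_model *q)
{
    assert(q != NULL);
    q->queued--;
}
```

With a limit of 2, the third `model_tx_down()` fails with ENOBUFS until one `model_tx_done()` arrives. In this model a single extra `model_tx_done()` wraps the counter, so every later send fails, which resembles the "once it fails, it always fails" symptom reported earlier in the thread.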

John_A_Murphy1 · February 28, 2003, 12:13pm

I’ve noticed this same problem, “once I receive a single
TX_DOWN_FAILED/ENOBUFS, I can no longer send any packets. Any attempt to send a
packet results in this error”, in several different drivers. Anybody got any
idea what’s going on?

Murf

Sean_Boudreau1 · February 28, 2003, 2:18pm

I think the algorithm has been described in this thread. If you
always get ENOBUFS, the driver isn’t releasing packets. Or a
packet has been released twice and a counter has wrapped around…

Which versions of which drivers are you seeing this with?

-seanb

John_A_Murphy1 · February 28, 2003, 2:54pm

At the moment I’m seeing it with one of my own drivers, which is why I’m fairly sure
that the driver has released all its packets (I have a devctl that tells me if there
are outstanding xmit packets). Is there any way for a driver to find the address of
_ion, so that I can watch the num_queued count myself? I may well have a bug in this
driver that’s causing all the problems, but it would sure help the debug effort to
know what io-net thinks is happening.

Murf

Sean_Boudreau1 · February 28, 2003, 3:04pm

If your driver isn’t seeing the packet to tx, the error is being raised by
io-net because it thinks the driver has reached its limit as far as how
many packets it’s allowed to queue up for tx. The default limit is 64K
and can be manipulated with the DCMD_IO_NET_MAX_QUEUE ioctl. Every time
you receive a packet to tx, io-net increments the count. Every time you
call ion->tx_done() on a packet, io-net decrements the count.

I’d suggest making the limit a large value and monitoring memory usage
with ‘pidin me’. If io-net grows until ENOBUFS, you’re probably leaking
packets. If it doesn’t, you may be releasing a packet twice?

-seanb
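
Since io-net's own count is not visible from the driver, one option is to keep a shadow count inside the driver and compare it against the two failure modes mentioned above. The sketch below assumes the driver can call a hook wherever it accepts a packet and wherever it calls tx_done(); `tx_audit` and the `audit_*` functions are invented names, and the counters are not thread-safe as written.

```c
#include <assert.h>
#include <stddef.h>

/* Driver-side shadow accounting (hypothetical helper, single-threaded). */
struct tx_audit {
    long received;  /* packets the driver accepted for transmit */
    long released;  /* packets the driver handed back via tx_done */
};

enum tx_verdict {
    TX_AUDIT_IDLE,          /* every accepted packet has been released */
    TX_AUDIT_HOLDING,       /* some packets still held: a leak if it only grows */
    TX_AUDIT_OVER_RELEASED  /* more releases than receipts: a double release */
};

/* Call where the driver accepts a packet for transmit. */
void audit_on_rx_down(struct tx_audit *a)
{
    assert(a != NULL);
    a->received++;
}

/* Call right next to every tx_done() the driver issues. */
void audit_on_tx_done(struct tx_audit *a)
{
    assert(a != NULL);
    a->released++;
}

/* Signed difference, so an extra release shows up as a negative number
 * instead of wrapping. */
long audit_outstanding(const struct tx_audit *a)
{
    assert(a != NULL);
    return a->received - a->released;
}

enum tx_verdict audit_verdict(const struct tx_audit *a)
{
    long n = audit_outstanding(a);
    if (n < 0)
        return TX_AUDIT_OVER_RELEASED;
    return n > 0 ? TX_AUDIT_HOLDING : TX_AUDIT_IDLE;
}
```

Exposing `audit_outstanding()` through a debug devctl would give a number to watch while the problem is reproduced: a value that climbs and never falls points at a leak, and a negative value points at a packet released twice.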

John_A_Murphy1 · February 28, 2003, 3:17pm

I’ll give that a try, although I’ve never been able to decipher the mysteries of pidin’s
memory reporting - I’ve compared the results from a freshly booted machine and from the
same machine after it claims to have insufficient memory to run any commands, and not been
able to detect any significant difference in reported memory usage. As I mentioned, being
able to watch things from io-net’s perspective would often be a huge help in debugging.

Murf

John_A_Murphy1 · February 28, 2003, 11:39pm

While chasing this problem, I’ve isolated a case where the ARP convertor sends the Ethernet
driver an npkt that has tot_iov set to 2, but has just one buffer on the buffer list; i.e.,
an IP packet with no Ethernet header. There are two npkt_done_t’s registered, and the
second one points to a buffer with an Ethernet header. And that buffer, the one containing
the Ethernet header, has its tqe_prev pointing at the npkt, as it should be, but its
tqe_next points off into outer space. In other words, it looks like the ARP convertor’s
add_header routine got interrupted in the middle of doing its TAILQ_INSERT_HEAD - or else
both the npkt and its buffer list got corrupted on the way from the ARP convertor to the
Ethernet driver - or else ???
Any ideas?

Murf

Sean_Boudreau1 · March 3, 2003, 2:54pm

This one doesn’t ring a bell. It sounds like undetermined memory
corruption. The arp module has (or is supposed to have) exclusive
access to the npkt itself as down-headed packets go to each module
in turn. Thus the lack of any locks when the actual TAILQ_INSERT_HEAD
is performed.

I’d look for something being freed too early / twice.

-seanb

John_A_Murphy1 · March 3, 2003, 3:51pm

Found it! It was a rather sneaky recursion path that resulted in returning an npkt twice, which
is why the npkt looked like the buffer with the Ethernet header had been properly inserted in the
list and then removed - that’s exactly what had happened. Once again, it sure would have been
helpful to be able to observe the io-net internals, and see that the counter had gone negative
instead of bumping into the limit.

Thanks for mentioning “too many frees” - that finally led me to the problem!

Murf
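
A guard of the following kind is one way to catch a packet being returned twice, including through a recursive path. It is a sketch under assumptions: `tracked_pkt`, `track_accept` and `track_release` are invented names, the packet is treated as an opaque pointer, and the `done` callback stands in for whatever the driver really calls to hand the packet back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-packet ownership record kept by the driver (hypothetical). */
struct tracked_pkt {
    void *pkt;    /* the real packet, opaque in this sketch */
    bool  owned;  /* true while the driver still owns the packet */
};

/* Number of release attempts made on packets no longer owned. */
unsigned long double_release_count;

/* Record that the driver has taken ownership of pkt. */
void track_accept(struct tracked_pkt *t, void *pkt)
{
    assert(t != NULL);
    t->pkt = pkt;
    t->owned = true;
}

/* Release the packet at most once. Ownership is cleared before done()
 * runs, so a re-entrant call made from inside done() is also rejected.
 * Returns true if done() was invoked, false on a repeated release. */
bool track_release(struct tracked_pkt *t, void (*done)(void *pkt))
{
    assert(t != NULL && done != NULL);
    if (!t->owned) {
        double_release_count++;
        return false;
    }
    t->owned = false;
    done(t->pkt);
    return true;
}

/* Stand-in release routine used only to try the guard out. */
int done_calls;

void count_done(void *pkt)
{
    (void)pkt;
    done_calls++;
}
```

Calling `track_release()` twice on the same record invokes the callback once and bumps `double_release_count`, which gives a place to set a breakpoint when hunting the second caller.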

Sean_Boudreau1 · March 4, 2003, 10:42am

John A. Murphy <murf@perftech.com> wrote:

Once again, it sure would have been
helpful to be able to observe the io-net internals, and see that the counter had gone negative
instead of bumping into the limit.

OK I’ll bite. Since you appear to have the source, you could have
built your own io-net and debugged it accordingly.

-seanb
