tx_down failed ENOBUFS

Hi,

I wrote a down producer which worked fine for some days, and then it started
to act weird.
I didn’t change anything about the way I allocate the packets but, more and
more often, when I mount this producer into io-net, it tells me “tx_down
failed” for 4 or 5 seconds, then works OK for two or three minutes, starts
failing again, then stops… Sometimes it just works fine for ten minutes…
(I’m talking about the tx_down of the producer, not of the converter.)

errno is set to ENOBUFS (no buffer space available).

npkt is allocated with no problem.

I send a packet every ms.

Anybody have an idea?

Thanks :-)

Arnaud

Wow, I checked the news something like 3 hours ago and Shaun’s question
wasn’t there!!
I just saw it when I checked to see whether my question was posted! That’s a
funny coincidence!!
:-)
But I’m off to sleep now (3 am)…

The driver returns this when you’re getting ahead of it: sending
packets faster than it can tx.

-seanb

arno stoum <starn@yucom.be> wrote:
: I wrote a down producer which worked fine for some days, and then it started
: to act weird. […]
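
In code, the failing call in the producer looks roughly like this. A minimal
sketch only: it assumes the usual io-net setup where registration yields an
io_net_self_t pointer (ion) and a registrant handle (reg_hdl), assumes
tx_down takes the handle and the packet, and uses a hypothetical
build_packet() helper; check the exact signatures against <sys/io-net.h>.

```c
#include <errno.h>
#include <sys/io-net.h>     /* npkt_t, io_net_self_t, TX_DOWN_* (assumed) */

extern io_net_self_t *ion;      /* from the producer's io-net registration */
extern int            reg_hdl;
extern npkt_t        *build_packet(void);   /* hypothetical: alloc + fill */

void send_one(void)
{
    npkt_t *npkt = build_packet();

    if (ion->tx_down(reg_hdl, npkt) == TX_DOWN_FAILED && errno == ENOBUFS) {
        /* The packet was refused because the tx queue is full.  As
         * described later in the thread, the refused npkt is handed back
         * through the producer's tx_done path, so don't release it again
         * here; just back off and retry on a later tick. */
    }
}
```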

How is it possible to determine how fast the driver can tx?
I guess it depends on the processor speed.

I send a pulse every ms to wake up the thread that transmits the packets, and
I noticed that, mostly just after mounting the down producer, pulses are
delivered far too fast for the first ten packets (I use a sniffer on the
other side, on Windows, and I thought it was maybe Windows’s fault), and
that’s exactly when the “tx_down failed” happens.

So this might be the problem? Pulses arriving faster than asked?

Thanks a lot :-)

Arnaud
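
For reference, a 1 ms pulse-driven tx thread is usually built from the
standard Neutrino channel and timer calls, along these lines (a sketch; the
pulse code and the loop body are placeholders). One relevant detail: pulses
queue on the channel, so if the thread is blocked for a few milliseconds
(for example, while the producer is being mounted), several queued pulses get
delivered back to back when it resumes, which would look exactly like the
burst of too-fast packets described above.

```c
#include <sys/neutrino.h>
#include <sys/netmgr.h>
#include <signal.h>
#include <time.h>

#define PULSE_CODE_TX (_PULSE_CODE_MINAVAIL + 0)

void tx_pulse_loop(void)
{
    int chid = ChannelCreate(0);
    int coid = ConnectAttach(ND_LOCAL_NODE, 0, chid, _NTO_SIDE_CHANNEL, 0);

    /* Arrange for a pulse on this channel every millisecond. */
    struct sigevent ev;
    SIGEV_PULSE_INIT(&ev, coid, SIGEV_PULSE_PRIO_INHERIT, PULSE_CODE_TX, 0);

    timer_t t;
    timer_create(CLOCK_REALTIME, &ev, &t);

    struct itimerspec its = {
        .it_value    = { .tv_sec = 0, .tv_nsec = 1000000 },
        .it_interval = { .tv_sec = 0, .tv_nsec = 1000000 },
    };
    timer_settime(t, 0, &its, NULL);

    struct _pulse pulse;
    for (;;) {
        if (MsgReceivePulse(chid, &pulse, sizeof pulse, NULL) == -1)
            continue;
        if (pulse.code == PULSE_CODE_TX) {
            /* transmit one packet here */
        }
    }
}
```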

I think this is the number of transmit buffer descriptors that the card
has. You can bump up this value using the options in the driver and get
better performance.

Sreekanth

“arno stoum” <starn@yucom.be> wrote in message news:aegc3a$n70$1@inn.qnx.com…

How is it possible to determine how fast the driver can tx? […]

I can’t find any mention of this in the network DDK. The result from rx_down
is either TX_DOWN_OK or TX_DOWN_FAILED, with no mention of setting errno;
does io-net set errno to ENOBUFS when the driver returns TX_DOWN_FAILED?
Also, all the drivers in the DDK seem to queue tx buffers without limit,
although several of them will indicate an error if the link is down. So where
does the ENOBUFS error actually come from?

Murf

Sean Boudreau wrote:

The driver returns this when you’re getting ahead of it: sending
packets faster than it can tx. […]

The driver can either manage its queue length itself, or it
can use the DCMD_IO_NET_MAX_QUEUE devctl, which tells io-net to
enforce the limit. And yes, errno is set.

-seanb

John A. Murphy <murf@perftech.com> wrote:

I can’t find any mention of this in the network DDK. […] So where does the
ENOBUFS error actually come from?
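
A driver that wants io-net to enforce the limit would issue that devctl once
at initialization, roughly as follows. A sketch under assumptions: this
assumes io_net_self_t exposes a devctl member taking the registrant handle,
the command, a data pointer, its size, and an optional return pointer; verify
the member and its exact signature against your DDK’s <sys/io-net.h>.

```c
#include <sys/io-net.h>   /* DCMD_IO_NET_MAX_QUEUE, io_net_self_t (assumed) */

/* ion and reg_hdl come from the driver's io-net registration.
 * Assumed form: ion->devctl(reg_hdl, dcmd, data, size, ret). */
int set_tx_queue_limit(io_net_self_t *ion, int reg_hdl, int max_queue)
{
    return ion->devctl(reg_hdl, DCMD_IO_NET_MAX_QUEUE,
                       &max_queue, sizeof max_queue, NULL);
}
```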

Aha, I’d missed that one! Thanks! It might be a good idea to mention that in
the DDK docs…

Murf

Sean Boudreau wrote:

The driver can either manage its queue length itself, or it can use the
DCMD_IO_NET_MAX_QUEUE devctl, which tells io-net to enforce the limit. And
yes, errno is set. […]

John A. Murphy <murf@perftech.com> wrote:
: Aha, I’d missed that one! Thanks! It might be a good idea to make some mention of
: that in the DDK docs…

I’ve added it to the queue. Thanks.


Steve Reid <stever@qnx.com>
TechPubs (Technical Publications)
QNX Software Systems

I noticed that in pcnet_transmit_packets() (the rx_down function) the driver
never actually refuses a packet, so io-net must be returning ENOBUFS to me,
as you said. My problem is that once I receive a single
TX_DOWN_FAILED/ENOBUFS, I can no longer send any packets: any attempt to
send a packet results in this error. So what mechanism does io-net use to
enforce this limit? How does it calculate the number of outstanding packets?
I think somehow I’m not playing by the rules, and io-net believes that some
number of packets that have actually been sent are unaccounted for.

Thanks,
Shaun

Sean Boudreau <seanb@qnx.com> wrote in message
news:aekqmk$6pa$1@nntp.qnx.com

The driver can either manage its queue length itself, or it
can use the DCMD_IO_NET_MAX_QUEUE devctl, which tells io-net to
enforce the limit. And yes, errno is set.

-seanb

If io-net is enforcing the limit, it increments a
counter when the driver gets a packet, and decrements
it when the driver calls ion->tx_done() on the packet.
If the number of packets in the driver is at the limit,
the driver won’t see any more until some outstanding
packets are released: in the interim, io-net calls
tx_done() on the packet and returns the error.


-seanb

Shaun Jackman <sjackman@nospam.vortek.com> wrote:

My problem is that once I receive a single TX_DOWN_FAILED/ENOBUFS, I can no
longer send any packets. Any attempt to send a packet results in this
error. […]
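
Spelled out, the bookkeeping amounts to something like the sketch below. This
is only an illustration of the logic as Sean describes it, not io-net source;
but it shows why releasing a packet twice is so poisonous: the unsigned count
underflows to a huge value, and from then on every tx_down() is refused with
ENOBUFS, which is exactly the “one failure, then nothing ever sends again”
symptom reported above.

```c
#include <errno.h>

/* Sketch of io-net's per-driver queue accounting as described above
 * (illustrative only; not the actual io-net source). */
typedef struct {
    unsigned num_queued;   /* packets handed to the driver, not yet tx_done'd */
    unsigned max_queue;    /* default 64K; set via DCMD_IO_NET_MAX_QUEUE */
} tx_account_t;

/* On the producer's tx_down() path, before the driver's rx_down(). */
int account_tx_down(tx_account_t *a)
{
    if (a->num_queued >= a->max_queue) {
        /* Limit reached: io-net releases the packet back to the producer
         * (calls its tx_done) and fails the tx_down with ENOBUFS. */
        errno = ENOBUFS;
        return -1;
    }
    a->num_queued++;
    return 0;
}

/* When the driver releases a packet with ion->tx_done(). */
void account_tx_done(tx_account_t *a)
{
    /* If the driver releases a packet twice, this underflows and wraps
     * to a huge value, so every later tx_down() fails with ENOBUFS. */
    a->num_queued--;
}
```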

I’ve noticed this same problem (“once I receive a single
TX_DOWN_FAILED/ENOBUFS, I can no longer send any packets; any attempt to send
a packet results in this error”) in several different drivers. Anybody got
any idea what’s going on?

Murf

Shaun Jackman wrote:

I noticed in pcnet_transmit_packets() (the rx_down function) the driver
never actually refuses a packet. […]

I think the algorithm has been described in this thread. If you
always get ENOBUFS, the driver isn’t releasing packets. Or a
packet has been released twice and a counter has wrapped around…

Which versions of which drivers are you seeing this with?

-seanb

John A. Murphy <murf@perftech.com> wrote:

I’ve noticed this same problem in several different drivers. Anybody got any
idea what’s going on? […]

At the moment I’m seeing it with one of my own drivers, which is why I’m
fairly sure that the driver has released all its packets (I have a devctl
that tells me whether there are outstanding xmit packets). Is there any way
for a driver to find the address of _ion, so that I can watch the num_queued
count myself? I may well have a bug in this driver that’s causing all the
problems, but it would sure help the debugging effort to know what io-net
thinks is happening.

Murf

Sean Boudreau wrote:

I think the algorithm has been described in this thread. If you always get
ENOBUFS, the driver isn’t releasing packets. Or a packet has been released
twice and a counter has wrapped around… […]

If your driver isn’t seeing the packet to tx, the error is being raised by
io-net because it thinks the driver has reached its limit on how many packets
it’s allowed to queue up for tx. The default limit is 64K and can be changed
with the DCMD_IO_NET_MAX_QUEUE devctl. Every time you receive a packet to tx,
io-net increments the count. Every time you call ion->tx_done() on a packet,
io-net decrements the count.

I’d suggest making the limit a large value and monitoring memory usage
with ‘pidin me’. If io-net grows until ENOBUFS, you’re probably leaking
packets. If it doesn’t, you may be releasing a packet twice?

-seanb

John A. Murphy <murf@perftech.com> wrote:

At the moment I’m seeing it with one of my own drivers, which is why I’m
fairly sure that the driver has released all its packets. […] It would sure
help the debugging effort to know what io-net thinks is happening.

I’ll give that a try, although I’ve never been able to decipher the mysteries
of pidin’s memory reporting: I’ve compared the results from a freshly booted
machine and from the same machine after it claims to have insufficient memory
to run any commands, and haven’t been able to detect any significant
difference in reported memory usage. As I mentioned, being able to watch
things from io-net’s perspective would often be a huge help in debugging.

Murf

Sean Boudreau wrote:

I’d suggest making the limit a large value and monitoring memory usage with
‘pidin me’. If io-net grows until ENOBUFS, you’re probably leaking packets.
If it doesn’t, you may be releasing a packet twice? […]

While chasing this problem, I’ve isolated a case where the ARP converter
sends the Ethernet driver an npkt that has tot_iov set to 2 but just one
buffer on the buffer list; i.e., an IP packet with no Ethernet header. There
are two npkt_done_t’s registered, and the second one points to a buffer with
an Ethernet header. And that buffer, the one containing the Ethernet header,
has its tqe_prev pointing at the npkt, as it should be, but its tqe_next
points off into outer space. In other words, it looks like the ARP
converter’s add_header routine got interrupted in the middle of doing its
TAILQ_INSERT_HEAD, or else both the npkt and its buffer list got corrupted on
the way from the ARP converter to the Ethernet driver, or else ???
Any ideas?

Murf

Sean Boudreau wrote:

If your driver isn’t seeing the packet to tx, the error is being raised by
io-net because it thinks the driver has reached its limit. […]
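
When chasing this kind of corruption, a small consistency check at the top of
the driver’s rx_down can catch a bad npkt at the door. A sketch only: the
field names (buffers as a TAILQ of net_buf_t, the ptrs list entry, the
per-buffer niov, and the packet’s tot_iov) are assumptions based on the
structures discussed above, and should be checked against <sys/io-net.h>.

```c
#include <sys/queue.h>
#include <sys/io-net.h>

/* Debug-only sanity check: verify that the buffers actually chained on the
 * npkt account for tot_iov iovs.  Field names are assumptions; check them
 * against <sys/io-net.h>. */
int npkt_looks_sane(npkt_t *npkt)
{
    int niov = 0;
    net_buf_t *buf;

    TAILQ_FOREACH(buf, &npkt->buffers, ptrs)
        niov += buf->niov;

    /* In the corrupted case above, tot_iov was 2 but only one buffer
     * (the IP payload, no Ethernet header) was still on the list. */
    return niov == npkt->tot_iov;
}
```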

This one doesn’t ring a bell. It sounds like undetermined memory
corruption. The arp module has (or is supposed to have) exclusive
access to the npkt itself, as down-headed packets go to each module
in turn; thus the lack of any locks when the actual TAILQ_INSERT_HEAD
is performed.

I’d look for something being freed too early / twice.
-seanb

John A. Murphy <murf@perftech.com> wrote:

While chasing this problem, I’ve isolated a case where the ARP converter
sends the Ethernet driver an npkt that has tot_iov set to 2 but just one
buffer on the buffer list. […] Any ideas?

Found it! It was a rather sneaky recursion path that resulted in returning an
npkt twice, which is why the npkt looked like the buffer with the Ethernet
header had been properly inserted in the list and then removed: that’s
exactly what had happened. Once again, it sure would have been helpful to be
able to observe the io-net internals and see that the counter had gone
negative instead of bumping into the limit.

Thanks for mentioning “too many frees”; that finally led me to the problem!

Murf

Sean Boudreau wrote:

This one doesn’t ring a bell. It sounds like undetermined memory corruption.
[…] I’d look for something being freed too early / twice.

John A. Murphy <murf@perftech.com> wrote:

Once again, it sure would have been
helpful to be able to observe the io-net internals and see that the counter
had gone negative instead of bumping into the limit.

OK, I’ll bite. Since you appear to have the source, you could have
built your own io-net and debugged it accordingly.

-seanb