netinfo description?

Hi,

A colleague of mine was at a customer's site and found a network
problem caused by a bad 10Base2 connector. After the network was running
again, he looked at the output of netinfo -l.
And now the questions (I think I asked a couple of years back, but I
lost the emails on an old system):

  • What is the meaning of the error counts displayed by netinfo -l?
  • At what error count do I have a ‘bad’ network?
  • Am I able to locate the faulty part of the net from the different
    entries?
    As I remember, there were answers like: if you have error counts and
    network problems, then you have a network problem. If you have error
    counts and no problem, all is fine.
    I would appreciate a clear statement on that. Also, I think the
    description should be added to the docs.

Friedhelm Schuetz
H.Kleinknecht & Co. GmbH

Ping on that.

Friedhelm Schuetz <Friedhelm.Schuetz@kleinknecht.de> wrote:

Ping on that.

Ok, here’s a repeat of my answer.

From: David Gibbs <dagibbs@qnx.com>
Newsgroups: qdn.public.qnx4
Subject: Re: netinfo description?
Date: 5 Apr 2001 16:47:06 GMT
Message-ID: <9ai7ia$b26$1@nntp.qnx.com>
References: <3ACC7E91.1C1D7426@kleinknecht.de>

Friedhelm Schuetz <Friedhelm.Schuetz@kleinknecht.de> wrote:

Hi,

And now the questions (I think I asked a couple of years back, but I
lost the emails on an old system):

  • What is the meaning of the error counts displayed by netinfo -l?

They show how many errors of each type have been reported by the
ethernet card.

  • At what error count do I have a ‘bad’ network?

There is no hard and fast rule on this. In general, look at the
ratio of errors to successful packets – if it is less than 1/10000 you
are probably ok, but it also depends on the error category.
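
As a rough illustration of that ratio rule, here is a minimal Python sketch. The function names are made up for this example and are not part of any QNX tool; the 1/10000 default is just the rule of thumb above, and the numbers are the Tx and Rx counters from the sample netinfo -l output further down in this thread.

```python
def error_ratio(errors: int, ok: int) -> float:
    """Return errors per successfully handled packet (0.0 if no traffic yet)."""
    if ok <= 0:
        return 0.0
    return errors / ok


def probably_ok(errors: int, ok: int, limit: float = 1 / 10_000) -> bool:
    """Rule of thumb only: fewer than 1 error per 10,000 good packets."""
    return error_ratio(errors, ok) < limit


# Counters from the sample output in this thread.
print(probably_ok(5, 212246))       # Tx: 5 bad vs. 212246 OK
print(probably_ok(195, 20507666))   # Rx: 195 errors vs. 20507666 OK
```

Both sample ratios come out well below 1/10000, but as noted, how much a given count matters still depends on the error category.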

  • Am I able to locate the faulty part of the net from the different
    entries?

Sometimes.

As I remember, there were answers like: if you have error counts and
network problems, then you have a network problem. If you have error
counts and no problem, all is fine.
I would appreciate a clear statement on that. Also, I think the
description should be added to the docs.

This doesn’t really belong in our docs – this is ethernet operations
and network debugging. All our “netinfo -l” reports (generally) is
the number of errors that the ethernet NIC has reported when we checked
the error status register(s). That is, this stuff is not a feature or
behaviour of QNX4, it is a feature/behaviour of ethernet networks, and
that is where you should look for information on the meaning and
severity of these types of errors.

Still, here is the feeling I’ve gotten over the last few years for the
different error categories and how nasty they are:

:Total Number Of Net Driver Slots: 2
:
:Driver Slot 0: Driver Pid 46 Logical Net 2 Network Card: Ethernet/
:SMC9000 Ethernet Controller
: Vendor ID … 0x10b8
: Device ID … 0x9000
: Revision … 0x9
: Physical Node ID … 00800F 00012C
: Media Rate … 10Mb/s
: Mtu … 1514
: I/O Port Range … 0x340 → 0x34F
: Hardware Interrupt … 11

Everything above this section is informational and should be static
while running.

: Total Packets Txd OK … 212246
: Total Packets Txd Bad … 5

Summary values for the Tx counters below.

: Tx Collision Errors … 2324

These are normal behaviour and should be expected. Ethernet allocates
bandwidth using collisions. If you get more than 10% collision errors,
though, your network is probably too busy and you will be getting a
noticeable degradation in performance – look at partitioning your network
using a bridge (which terminates the collision domain).
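
A small Python sketch of that 10% check. One assumption to flag: the counter is compared against "Total Packets Txd OK", which is my guess at the intended baseline, and the function name is invented for this example.

```python
def collision_percent(collisions: int, tx_ok: int) -> float:
    """Collisions as a percentage of packets transmitted OK (assumed baseline)."""
    if tx_ok <= 0:
        return 0.0
    return 100.0 * collisions / tx_ok


# Counters from the sample output: 2324 collisions, 212246 packets sent OK.
pct = collision_percent(2324, 212246)
print(f"{pct:.2f}% collisions")
print("consider partitioning" if pct > 10.0 else "collision rate looks normal")
```

For the sample counters this works out to roughly 1%, well under the 10% mark.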

: Tx Collision Errors (aborted) … 5

The card had too many collisions and gave up on transmitting a packet.
Again, not a critical failure – but if it gets above about 1 per 10,000
then look at partitioning.

: Carrier Sense Lost on Tx … 0

When the card went to tx a packet, it couldn’t find an ethernet there.
Usually a local problem – the card or the wire to your machine. If it
happens only occasionally, it may be a problem at the hub/bridge. If it
is less than 1 per 10,000 you probably don’t need to worry about it.

: FIFO Underruns During Tx … 0

Not as familiar with this one… I THINK it means that the driver on
your machine couldn’t keep the TX fifo full, so a transmit failed. If
this is occurring with any regularity you probably have either not
enough CPU on this node, or you are having problems with interrupt
latency – something is running a long time in an irq handler or with
interrupts disabled.

: Tx deferred … 5756

Again, not entirely familiar with this one – but not, I think,
an error. I think it means a back-off in tx.

: Out of Window Collisions … 0

These are bad ones. They almost always flag an improperly configured
ethernet – either too long cable runs, bad NIC somewhere (likely not
this NIC), too many hubs before a bridge, or something similar. They
shouldn’t happen on a properly configured and running ethernet. If these
are happening, short packets could be getting destroyed by collisions
without the TXing NIC being able to detect it. Very bad. It is hard
to determine what is causing this – sometimes you can narrow it down by
partitioning, but that doesn’t always work. Again, if you’re getting less
than 1 per 1,000,000 (1 million) of these, you may be ok to ignore it.
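
The per-category Tx limits mentioned so far can be collected into a small table. This is only a sketch of the rules of thumb above: the dictionary, the function name, and the choice of "Total Packets Txd OK" as the baseline are all assumptions made for illustration.

```python
# Rule-of-thumb limits from the descriptions above, as errors per packet sent OK.
TX_LIMITS = {
    "Tx Collision Errors (aborted)": 1 / 10_000,
    "Carrier Sense Lost on Tx": 1 / 10_000,
    "Out of Window Collisions": 1 / 1_000_000,
}


def tx_counters_over_limit(counters: dict, tx_ok: int) -> list:
    """Return the names of Tx counters whose rate exceeds its rule-of-thumb limit."""
    if tx_ok <= 0:
        return []
    return [name for name, limit in TX_LIMITS.items()
            if counters.get(name, 0) / tx_ok > limit]


# Counters from the sample output in this thread.
sample = {
    "Tx Collision Errors (aborted)": 5,
    "Carrier Sense Lost on Tx": 0,
    "Out of Window Collisions": 0,
}
print(tx_counters_over_limit(sample, 212246))
```

With the sample counters nothing is flagged; a single out of window collision in the same amount of traffic would already be over the 1 per million mark.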

: Total Packets Rxd OK … 20507666
: Total Rx Errors … 195

Summary.

: FIFO Overruns During Rx … 195

We didn’t get the packet out of the hardware and into memory fast
enough. Local, rather than network issue.

: Alignment errors … 0

Packet was misaligned. Could be local hardware, or txing hardware.

: CRC errors … 0

Packet failed its CRC. Could be corrupted in transmission across
the network (noisy environment?) or a bad card on the TX side.
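
On the question of locating the fault, the hints given for the individual counters can be kept as a lookup table. The sketch below only restates those hints; the table and function names are invented for this example, and the hints are impressions, not a diagnosis.

```python
# Where to look first, per counter, as described in this message.
LIKELY_LOCATION = {
    "Carrier Sense Lost on Tx": "usually local: card or wire to this machine; sometimes the hub/bridge",
    "FIFO Underruns During Tx": "local: not enough CPU or interrupt latency on this node",
    "Out of Window Collisions": "network: cable runs too long, too many hubs, or a bad NIC elsewhere",
    "FIFO Overruns During Rx": "local: packet not moved from hardware to memory fast enough",
    "Alignment errors": "local hardware or the transmitting hardware",
    "CRC errors": "corruption on the network or a bad card on the TX side",
}


def locate(counters: dict) -> dict:
    """Map each non-zero counter that has a known hint to that hint."""
    return {name: LIKELY_LOCATION[name]
            for name, count in counters.items()
            if count > 0 and name in LIKELY_LOCATION}


# Rx counters from the sample output in this thread.
for name, hint in locate({"FIFO Overruns During Rx": 195, "CRC errors": 0}).items():
    print(f"{name}: {hint}")
```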

:
:Driver Slot 1: Unused
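
To run checks like the ones above against saved output, the counters first have to be pulled out of the text. A minimal Python sketch follows; it assumes lines of the form "name, run of dots or an ellipsis, integer" as they appear in this post, and the real netinfo -l layout may differ. It keeps only integer-valued fields and does not separate multiple driver slots, so later slots would overwrite earlier ones.

```python
import re

# Assumed line shape: optional leading ':', a name, dots or an ellipsis, an integer.
LINE = re.compile(r"^\s*:?\s*(?P<name>.+?)\s*(?:\.{2,}|…)\s*(?P<value>\d+)\s*$")


def parse_counters(text: str) -> dict:
    """Collect integer-valued 'name ... value' lines into a dict."""
    counters = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            counters[match.group("name")] = int(match.group("value"))
    return counters


sample = """\
: Vendor ID … 0x10b8
: Total Packets Txd OK … 212246
: Total Packets Txd Bad … 5
: Tx Collision Errors (aborted) … 5
"""
print(parse_counters(sample))
```

Non-integer fields such as the vendor ID are skipped rather than guessed at.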

Hope this helps a bit.

-David

QNX Training Services
dagibbs@qnx.com

Thanks to you, David.
I’ll store and also print your answer, so I hope I’ll find it again in two years.
If the errors shown come from the NIC, are they NIC specific?
Do the manufacturers of the NIC or the ethernet cards list that information
somewhere?
Is it better to find ethernet problems with external lan-meters?
Experiences and suggestions appreciated.

Friedhelm Schuetz
H.Kleinknecht & Co. GmbH

Friedhelm Schuetz <Friedhelm.Schuetz@kleinknecht.de> wrote:

Thanks to you, David.
I’ll store and also print your answer, so I hope I’ll find it again in two years.
If the errors shown come from the NIC, are they NIC specific?

Usually not – they’re pretty standard.

Do the manufacturers of the NIC or the ethernet cards list that
information somewhere?

Usually not.

Is it better to find ethernet problems with external lan-meters?

For small or simple lans, they often aren’t needed. For anything larger
or more complex, they are often a very helpful tool. (caveat: I’m not a lan
administrator – people who are, might have a different opinion. QNX does
have more tools for debugging lan problems than some other desktops,
including the ability to turn a QNX node, effectively, into a network
sniffer – see the netsniff utility for such an application – but this
may not be as useful as a custom built sniffer.)

-David

QNX Training Services
dagibbs@qnx.com