Alignment Problems with 3.3.1 Compiler

We’ve found an interesting difference between the 2.95.3 and 3.3.1
compilers, and are curious to know if anyone can explain what’s happening.

The following trivial program illustrates the situation:

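#include <stdio.h>
#include <hw/nicinfo.h>

int main(void)
{
    /* sizeof yields a size_t; cast it so the format matches on both compilers */
    printf("%lu\n", (unsigned long) sizeof(nic_stats_t));
    return 0;
}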

When compiled with the 3.3.1 compiler, the printed result is 1212, but
when compiled with the 2.95.3 compiler, the result is 1216. Since the
value of DCMD_IO_NET_GET_STATS depends on the size of nic_stats_t, this
difference is critical! The difference is caused by the 64-bit
alignment (or lack thereof) of the octets_txed_ok field. Version 2.95.3
apparently defaults to putting 64-bit variables on 64-bit boundaries,
while 3.3.1 will put them on 32-bit boundaries. Sounds like a problem
with default configurations. Does anybody know the details, including how
to fix it?

Murf

I have been there :(. A PR has been filed, and the fix is going to be in the
next 6.3.0 patch.

-xtang


Hmm… fix. Again. Which way will it be ‘fixed’ now?

6.2.0: compiler aligns to 8, malloc aligns to 4 (breaking Postgres).
6.2.1: compiler and malloc both align to 8 (apparently breaking x86 ABI).
6.3.0: compiler aligns to 4, malloc aligns to 8 (apparently depending on the
compiler, LOL)

What NOW? Logically speaking, you only have one combination left to try -
align to 4 in all cases (which is what every other OS does on x86,
apparently). But knowing QNX one may wonder …

– igor

“Wojtek Lerch” <Wojtek_L@yahoo.ca> wrote in message
news:cimr2m$ams$1@inn.qnx.com

What does malloc() have to do with this?

I thought you had agreed a while ago that 8 is a multiple of 4, and that
a malloc() that always returns a multiple of 8 always returns a multiple
of 4? And that code that breaks unless malloc sometimes returns an
odd multiple of 4 is, well, broken?

Ah Wojtek, very sharp observation. But then one can state that both the compiler
and malloc have aligned to 4 in all the above-mentioned cases, right? After
all, 8 is a multiple of 4… One may wonder why all the trouble… and what
have you changed with every release?

Nevertheless, I’d still like to know what the ‘next patch’ will do.

– igor

“Wojtek Lerch” <Wojtek_L@yahoo.ca> wrote in message
news:cin5be$ams$2@inn.qnx.com

No, of course not, unless you’re talking about how the compiler aligns
complete variables. But this thread was originally about how the
compiler aligns structure members. If

offsetof( struct { char c; long long ll; }, ll )

is eight, the alignment is eight, not four, even though eight is a
multiple of four.

Which was my point back then…

This time I was not arguing about malloc. I was just refreshing the context
and asking what exactly the ‘fix’ will do this time, considering the history
of the subject. Still haven’t got an answer to that.

– igor

“Wojtek Lerch” <Wojtek_L@yahoo.ca> wrote in message
news:cipcrs$a08$1@inn.qnx.com

Of course not. I apologize for suggesting that you were.

Would you mind if I asked again why you mentioned malloc() in your first
post in this thread?

Igor Kovalenko wrote:

Just to give the question some context. The compiler and malloc should play this
game together, this much we have established, right? So if the compiler is
‘broken’ again and is supposed to be ‘fixed’ in the next patch, I wanted to
know how it will be fixed. Given the history, there could be any arbitrary
change in behavior, and it could be inconsistent with malloc again.

“Wojtek Lerch” <Wojtek_L@yahoo.ca> wrote in message
news:cipo5j$eel$1@inn.qnx.com

No, we haven’t. Sorry if I wasn’t clear about that. Let me try again.

This thread was originally about how much padding the compiler puts
between structure members and at the end of structures. Since this kind
of padding affects the size and layout of structures, changing it breaks
binary compatibility. If we change compiler defaults in a way that
changes the layout of some structures and breaks some APIs, it’s very
reasonable to complain about that.

The alignment of memory returned by malloc() is a completely different
story. Unlike the padding between structure members, the amount of
unused memory that malloc() keeps between allocated memory blocks is not
something that a reasonable programmer should rely on. You can, of
course, expect that malloc() gives you memory that is aligned
appropriately for any type you might want to use it as, but it doesn’t
make much sense to complain that it’s “too aligned”. If malloc()
returns an address that’s a multiple of 1024, that’s perfectly fine.
If, on the other hand, the compiler decides to put 1023 bytes of padding
between a char and an int in a structure, that’s definitely wrong.
Especially if the previous version of the compiler only put 3 bytes
there. The context you’re insisting on putting the question in is
completely irrelevant to the question.

The addresses returned by malloc() are arbitrary. Even if we changed
our malloc() to always return a multiple of 16, there would be nothing
wrong about that (except you could suspect that it wasted more memory).
Even the existing malloc() sometimes returns a multiple of 16, doesn’t
it? Do you think that’s “inconsistent with the compiler” and wrong?

Igor Kovalenko wrote:

Alas, it must be that Postgres was broken. After all, why should QNX behave
like other systems do…

It’s certainly possible. While porting various things to “alternative”
operating systems (QNX 6, Neutrino, BeOS) I’ve found numerous code bugs
that “just worked” on, say, Linux, but which died running on a different
implementation of libc. Sometimes, just compiling something with a
different compiler is enough to make these problems appear.

If software depends on internal implementation details that aren’t
covered by a spec, you’ve found software that isn’t portable and needs
to be fixed or ported.


Chris Herborth (cherborth@qnx.com)
Never send a monster to do the work of an evil scientist.

This summarizes QNX’s attitude toward portability quite precisely. There
have been so many cases where you’re different just for the sake of being
different; it’s frustrating. And even if it is not much of a problem to mimic
the ‘de facto’ behavior of more mainstream systems, QNX would rather be
stubborn and let people deal with portability issues.

– igor