SIGBUS error.

Maschoen,

In one of his earlier posts he said he already did this step: each line he commented out pushed the crash down to the next uncommented line.

I’m curious about the C++ idioms that we don’t see here such as:

  1. The function table pointer that belongs to all C++ classes.
  2. The fact that the constructor is assigning (not initializing) values to variables inside the constructor body, after the compiler has already done initialization via the initialization list (which isn’t in this code). I.e.
clsMatrix::clsMatrix() :
  pM(NULL),
  m(0),
  n(0),
  m_bAutoDelete(false)
{
}

is how I’d normally write such a constructor. After all, if you don’t use the initializer list the compiler silently does so for you anyway, and you basically repeat the work when you do the manual assignments inside the constructor.

One or both of those could be the problem.

Tim

Eric,

They are aligned if you mean divisible by 2. But 13d1f6 is definitely not 4-byte aligned. Does your processor require 2-byte or 4-byte alignment? If it requires 4-byte alignment you have a problem.
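
For anyone following along, the divisibility check can be written down as a small helper. This is only a sketch; the names is_aligned and is_ptr_aligned are made up for illustration and are not part of Eric’s code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when addr is a multiple of alignment.
// alignment must be a non-zero power of two (2, 4, 8, ...).
inline bool is_aligned(std::uintptr_t addr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

// Same check for a real pointer, e.g. is_ptr_aligned(&someObject, 4).
inline bool is_ptr_aligned(const void* p, std::size_t alignment) {
    return is_aligned(reinterpret_cast<std::uintptr_t>(p), alignment);
}
```

With the address from the post, is_aligned(0x13d1f6, 2) is true and is_aligned(0x13d1f6, 4) is false, because the low two bits of 0x...f6 are 10.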

Tim

Each variable is 4 bytes apart, which looks good, but the whole object is NOT aligned on a 4-byte boundary. Are the objects instantiated on the stack or via new? My guess is it’s on the stack. I have seen that happen: it’s the stack that is not aligned. Just a guess, but try compiling with -mstack-align.

Hi, mario. I have tried to add the compile option in the build properties, but “-mstack-align” is not recognized. My target is an ARM Cortex-A8 (OMAP3530).

clsMatrix::clsMatrix() :
  pM(NULL),
  m(0),
  n(0),
  m_bAutoDelete(false)
{
}
#ifndef BOOL
#define BOOL	int
#define TRUE	1
#define FALSE	0
#endif

I used this constructor, and now the crash line has moved to the last initializer, m_bAutoDelete. So is the self-defined type BOOL causing the error?

Thanks,
Eric

Eric,

I typo’d my example in that I used false (the C++ false) instead of FALSE (your false).

Not sure that makes a difference, but you might as well use your defined FALSE for completeness’ sake.

Also, how do you know where the crash line is? The debugger is likely just showing the last line of the constructor definition. You’d need to examine at the assembly level to determine exactly which variable is the problem.

Just for fun, try doing this in your class definition:

class clsMatrix {
protected:
   char foo;
   char bar;
   double *pM;
   int m;
   int n;
   // ... remaining members and methods unchanged
};

clsMatrix::clsMatrix() :
  foo(0),
  bar(0),
  pM(NULL),
  m(0),
  n(0),
  m_bAutoDelete(FALSE)
{
}
 

The 2 extra characters should space the integers on a 4 byte boundary at which point the SIGBUS should stop.

Tim

P.S. Exactly which compiler are you using if the stack align Mario suggested isn’t working?

Thanks, but that does not work for me. The assembly for this version is as follows:


clsMatrix::clsMatrix() :
0x0012187c <clsMatrix>:                     mov    r12, sp
0x00121880 <clsMatrix+4>:                   push   {r11, r12, lr, pc}
0x00121884 <clsMatrix+8>:                   sub    r11, r12, #4	; 0x4
0x00121888 <clsMatrix+12>:                  sub    sp, sp, #4	; 0x4
0x0012188c <clsMatrix+16>:                  str    r0, [r11, #-16]
0x001218dc <clsMatrix>:                     mov    r12, sp
0x001218e0 <clsMatrix+4>:                   push   {r11, r12, lr, pc}
0x001218e4 <clsMatrix+8>:                   sub    r11, r12, #4	; 0x4
0x001218e8 <clsMatrix+12>:                  sub    sp, sp, #4	; 0x4
0x001218ec <clsMatrix+16>:                  str    r0, [r11, #-16]
	m_bAutoDelete(FALSE)
0x00121890 <clsMatrix+20>:                  ldr    r3, [r11, #-16]
0x00121894 <clsMatrix+24>:                  mov    r2, #0	; 0x0
0x00121898 <clsMatrix+28>:                  strb   r2, [r3]
0x0012189c <clsMatrix+32>:                  ldr    r3, [r11, #-16]
0x001218a0 <clsMatrix+36>:                  mov    r2, #0	; 0x0
0x001218a4 <clsMatrix+40>:                  strb   r2, [r3, #1]
0x001218a8 <clsMatrix+44>:                  ldr    r3, [r11, #-16]
0x001218ac <clsMatrix+48>:                  mov    r2, #0	; 0x0
0x001218b0 <clsMatrix+52>:                  str    r2, [r3, #4]
0x001218b4 <clsMatrix+56>:                  ldr    r3, [r11, #-16]
0x001218b8 <clsMatrix+60>:                  mov    r2, #0	; 0x0
0x001218bc <clsMatrix+64>:                  str    r2, [r3, #8]
0x001218c0 <clsMatrix+68>:                  ldr    r3, [r11, #-16]
0x001218c4 <clsMatrix+72>:                  mov    r2, #0	; 0x0
0x001218c8 <clsMatrix+76>:                  str    r2, [r3, #12]
0x001218cc <clsMatrix+80>:                  ldr    r3, [r11, #-16]
0x001218d0 <clsMatrix+84>:                  mov    r2, #0	; 0x0
0x001218d4 <clsMatrix+88>:                  str    r2, [r3, #16]
0x001218f0 <clsMatrix+20>:                  ldr    r3, [r11, #-16]
0x001218f4 <clsMatrix+24>:                  mov    r2, #0	; 0x0
0x001218f8 <clsMatrix+28>:                  strb   r2, [r3]
0x001218fc <clsMatrix+32>:                  ldr    r3, [r11, #-16]
0x00121900 <clsMatrix+36>:                  mov    r2, #0	; 0x0
0x00121904 <clsMatrix+40>:                  strb   r2, [r3, #1]
0x00121908 <clsMatrix+44>:                  ldr    r3, [r11, #-16]
0x0012190c <clsMatrix+48>:                  mov    r2, #0	; 0x0
0x00121910 <clsMatrix+52>:                  str    r2, [r3, #4]
0x00121914 <clsMatrix+56>:                  ldr    r3, [r11, #-16]
0x00121918 <clsMatrix+60>:                  mov    r2, #0	; 0x0
0x0012191c <clsMatrix+64>:                  str    r2, [r3, #8]
0x00121920 <clsMatrix+68>:                  ldr    r3, [r11, #-16]
0x00121924 <clsMatrix+72>:                  mov    r2, #0	; 0x0
0x00121928 <clsMatrix+76>:                  str    r2, [r3, #12]
0x0012192c <clsMatrix+80>:                  ldr    r3, [r11, #-16]
0x00121930 <clsMatrix+84>:                  mov    r2, #0	; 0x0
0x00121934 <clsMatrix+88>:                  str    r2, [r3, #16]
{}
0x001218d8 <clsMatrix+92>:                  ldm    sp, {r3, r11, sp, pc}
0x00121938 <clsMatrix+92>:                  ldm    sp, {r3, r11, sp, pc}
clsMatrix::clsMatrix(int i, int j, double *p, BOOL bAssign)
0x0012193c <clsMatrix>:                     mov    r12, sp
0x00121940 <clsMatrix+4>:                   push   {r11, r12, lr, pc}
0x00121944 <clsMatrix+8>:                   sub    r11, r12, #4	; 0x4
0x00121948 <clsMatrix+12>:                  sub    sp, sp, #20	; 0x14
0x0012194c <clsMatrix+16>:                  str    r0, [r11, #-16]
0x00121950 <clsMatrix+20>:                  str    r1, [r11, #-20]
0x00121954 <clsMatrix+24>:                  str    r2, [r11, #-24]
0x00121958 <clsMatrix+28>:                  str    r3, [r11, #-28]
	m = n = 0; pM = NULL; m_bAutoDelete = FALSE;
0x0012195c <clsMatrix+32>:                  ldr    r3, [r11, #-16]
0x00121960 <clsMatrix+36>:                  mov    r2, #0	; 0x0
0x00121964 <clsMatrix+40>:                  str    r2, [r3, #12]
0x00121968 <clsMatrix+44>:                  ldr    r3, [r11, #-16]
0x0012196c <clsMatrix+48>:                  ldr    r2, [r3, #12]
0x00121970 <clsMatrix+52>:                  ldr    r3, [r11, #-16]
0x00121974 <clsMatrix+56>:                  str    r2, [r3, #8]
0x00121978 <clsMatrix+60>:                  ldr    r3, [r11, #-16]
0x0012197c <clsMatrix+64>:                  mov    r2, #0	; 0x0
0x00121980 <clsMatrix+68>:                  str    r2, [r3, #4]
0x00121984 <clsMatrix+72>:                  ldr    r3, [r11, #-16]
0x00121988 <clsMatrix+76>:                  mov    r2, #0	; 0x0
0x0012198c <clsMatrix+80>:                  str    r2, [r3, #16]

Seems all right…
btw, where can i set the options for stack alignment in the IDE?

Regards,
Eric

Hmm, the stack-align option is not supported on ARM!

I don’t think the issue is in the code of the class. The addresses that were printed showed that for some objects the alignment was on a 4-byte boundary and for others it was not. So it seems to me the stack is getting “misaligned” by something else outside the class.

I suspect that on ARM (which I know nothing about) it should be impossible to have a misaligned stack, since everything is 32 bits. When you printed the addresses, the alignment was not the same from one object to the next. Hence something is causing the stack to become misaligned. I would move the instantiation of the object around to figure out where this stack issue appears.

You didn’t say whether the objects are allocated on the heap or on the stack?

Well, I think it is better to put the source code here. Many thanks for helping to find the bug. Please remember to specify the math library when compiling it.
And the objects, as you can see, are allocated on the stack.

Regards,
Eric

Eric,

memAlignTest.cc doesn’t come close to compiling. There are missing header files that cause all kinds of compiler errors.

I can compile matrix.cpp on its own. But the main() is in memAlignTest.cc.

Also when I looked in memAlignTest.cc I see your matrix class is itself wrapped inside lots of other classes. It’s so entangled that I can’t easily see how deep it’s nested.

When Mario/Masochen mentioned doing a test they were expecting a simple test like:

int main()
{
    clsMatrix m;
    return 0;
}

and that’s it. Just literally declare an instance of your class, print out the addresses of the internal variables. Maybe add a few more lines and assign values to those internal variables.

If that works, then your problem lies somewhere inside all the class nesting going on in memAlignTest.cc

Tim

Eric,

Looking a bit inside the memAlignTest.h file I see the following:

#pragma pack(1)

preceding several structures. These structures are included in your clsCTL class, which contains the clsMatrix class (the nesting issue that makes everything so confusing).

That #pragma pack(1) tells the compiler to align on 1-byte boundaries instead of the default boundaries for your processor. Lord only knows what it’s done to your alignment. I highly suspect those packs are causing everything to get misaligned, especially since one of the 3 instances of clsMatrix was correctly aligned (I’d bet my life it’s the one in the clsPath class, which does NOT contain any of the pragma-packed structures).

As a quick and dirty test, move the clsMatrix declarations in clsCTL so that they are the first defined variables in that class before those packed structures. See if that fixes your problem. If it does you know the packing is a problem you’ll have to solve.
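
To illustrate the mechanism, here is a minimal sketch. The structs Inner, NaturalOuter and PackedOuter are hypothetical stand-ins, not the ones in memAlignTest.h, and it assumes the pack(1) setting is still in effect when the enclosing class is defined (which is what happens if the pragma is never reset or popped).

```cpp
#include <cstddef>

// Stand-in for a small class such as clsMatrix: its int members
// want 4-byte alignment on a typical 32-bit target.
struct Inner {
    int m;
    int n;
};

// Default layout: the compiler inserts padding after 'flag'
// so that 'inner' starts on an aligned offset.
struct NaturalOuter {
    char  flag;
    Inner inner;
};

// Packed layout: no padding at all, so 'inner' starts at offset 1
// and its ints end up on odd addresses.
#pragma pack(push, 1)
struct PackedOuter {
    char  flag;
    Inner inner;
};
#pragma pack(pop)
```

On the compilers I know of that support this pragma, offsetof(NaturalOuter, inner) equals alignof(Inner), while offsetof(PackedOuter, inner) is 1. A plain str to a member of PackedOuter::inner is then a misaligned store.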

Tim

pragma pack(1), that is probably it.

Indeed, many thanks, Tim and mario! I just deleted all the “pragma pack” directives and ran the program; it executes normally now.

Eric

Sorry, another SIGBUS error occurs when I tried adding some stuff as follows:

struct TELEGRAPH {
	int nSize;
	char szTele[512];
};

void clsCMM::MakeTelegraph(TELEGRAPH *pTele, short code, double time, void *pData, int nDataSize)
{
	char *pBuffer = pTele->szTele;

//	(double &)pBuffer[6] = time;
	pBuffer[0] = 0x50 + _HELICOPTER;	//_nHelicopter
	pBuffer[1] = 0x55;				//ground station

	(short &)pBuffer[2] = nDataSize+10;
	(short &)pBuffer[4] = code;
//	(double &)pBuffer[6] = time;
	(float &)pBuffer[6] = (float)time;
	::memcpy(pBuffer+14, pData, nDataSize);

	(unsigned short &)pBuffer[14+nDataSize] = CheckSum(pBuffer+4, 10+nDataSize);

	pTele->nSize = nDataSize + 16;
}

The cursor stopped at “(float &)pBuffer[6] = (float)time;”. The assembly output is below; I look forward to your suggestions. Thanks.

(short &)pBuffer[4] = code;

0x0010e9f8 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+128>: ldr r3, [r11, #-20]
0x0010e9fc <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+132>: add r3, r3, #4 ; 0x4
0x0010ea00 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+136>: ldrh r2, [r11, #-28]
0x0010ea04 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+140>: strh r2, [r3]
(float &)pBuffer[6] = (float)time;
0x0010ea08 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+144>: ldr r3, [r11, #-20]
0x0010ea0c <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+148>: add r4, r3, #6 ; 0x6
0x0010ea10 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+152>: sub r1, r11, #36 ; 0x24
0x0010ea14 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+156>: ldm r1, {r0, r1}
0x0010ea18 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+160>: bl 0x103748 <__truncdfsf2>
0x0010ea1c <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+164>: mov r3, r0
0x0010ea20 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+168>: str r3, [r4]
::memcpy(pBuffer+14, pData, nDataSize);
0x0010ea24 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+172>: ldr r3, [r11, #-20]
0x0010ea28 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+176>: add r2, r3, #14 ; 0xe
0x0010ea2c <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+180>: ldr r3, [r11, #8]
0x0010ea30 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+184>: mov r0, r2
0x0010ea34 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+188>: ldr r1, [r11, #4]
0x0010ea38 <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+192>: mov r2, r3
0x0010ea3c <_ZN6clsCMM13MakeTelegraphEP9TELEGRAPHsdPvi+196>: bl 0x103358

Same story: the code is trying to write a float (which is 4 bytes) at an address that is only 2-byte aligned, and this ARM target does not allow that. You could align the data by moving it to pBuffer[8] instead, but it sounds like you do not have control over the format of the data. The best option seems to be memcpy, which makes no alignment assumptions about its arguments and is therefore unaffected by alignment issues.
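
A minimal sketch of the memcpy approach follows. The helper names put_float and get_float are made up for illustration; the bytes are stored in the host’s own byte order.

```cpp
#include <cstddef>
#include <cstring>

// Store a float at any byte offset in a buffer, aligned or not.
inline void put_float(char* buffer, std::size_t offset, float value) {
    std::memcpy(buffer + offset, &value, sizeof value);
}

// Read it back the same way, again without alignment requirements.
inline float get_float(const char* buffer, std::size_t offset) {
    float value;
    std::memcpy(&value, buffer + offset, sizeof value);
    return value;
}
```

In MakeTelegraph the failing line would then become something like put_float(pBuffer, 6, (float)time); and the (short &) casts could be handled the same way if they ever land on odd offsets.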

Thanks, mario. It works fine with the memcpy method. Another question: what if I want to decode the same structure on an x86 host machine? Can the structures defined on armle be correctly mapped on the x86 machine?

Thanks,
Eric

For example, i have a structure defined as:

struct PACK {
unsigned char type;

double  a;
double  b;
double  c;

double  p;
double  q;
double  r;

double  acx;
double  acy;
double  acz;

double  magx; 
double  magy;
double  magz; 

double  tmp; 

};

As far as I know the default alignment is the same. The C and C++ languages leave structure padding and alignment up to the implementation, so any program that relies on a particular data layout is NOT portable. Posting the structure definition like you just did tells us nothing, because the layout depends on compiler options, use of compiler extensions, etc. For this structure to work on ARM there must be 3 (or 7) bytes of padding between the variables type and a. However, that is not necessarily the case on x86.

You can limit and control this with compiler extensions (#pragma pack, __attribute__) and careful coding, but it is always going to be a pain. You can use sizeof() and offsetof() to check that everything matches. You can also use code that copies the data byte by byte: the program may use a structure called ABC, but when it’s time to transfer it to another machine with a different CPU type, it is moved into a structure ABCNetwork whose packing is well defined. When it’s received on the x86 side, the ABCNetwork data is copied into ABC again, which, because it’s on x86, may not have the same layout as ABC on ARM (that step may also include handling endianness).

That’s a little bit like how TCP/IP works and deals with endianness.
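
A rough sketch of that ABC / ABCNetwork idea, using a shortened hypothetical record called Sample rather than the real PACK structure. It assumes double is a 64-bit IEEE 754 value on both machines and that its byte order matches the integer byte order; the wire format here is chosen to be little-endian with no padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// In-memory record (the "ABC" role). Its padding is compiler-specific.
struct Sample {
    unsigned char type;
    double a;
    double b;
};

// Wire layout (the "ABCNetwork" role): 1 byte, then two 8-byte doubles.
const std::size_t kSampleWireSize = 1 + 8 + 8;

static_assert(sizeof(double) == 8, "sketch assumes 64-bit doubles");

// Store a double as 8 bytes, least significant byte first.
inline void put_f64_le(unsigned char* out, double v) {
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (8 * i));
}

// Rebuild a double from 8 little-endian bytes.
inline double get_f64_le(const unsigned char* in) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i)
        bits |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    double v;
    std::memcpy(&v, &bits, sizeof v);
    return v;
}

// Copy field by field, so struct padding never reaches the wire.
inline void encode(const Sample& s, unsigned char* out) {
    out[0] = s.type;
    put_f64_le(out + 1, s.a);
    put_f64_le(out + 9, s.b);
}

inline Sample decode(const unsigned char* in) {
    Sample s;
    s.type = in[0];
    s.a = get_f64_le(in + 1);
    s.b = get_f64_le(in + 9);
    return s;
}
```

The writer calls encode() into a buffer of kSampleWireSize bytes and logs or sends that buffer; the reader calls decode() on the same bytes. Neither side ever depends on how its own compiler pads Sample.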

Eric,

What do you mean by decode? As in passing data directly from one to the other (either via writing to a file that’s later read on the other processor or across a serial/Ethernet link etc)

Obviously as a minimum you’d better make sure you have the Endianness set to Little which is how X86 works. I believe you’ve already done this.

On the X86 side you’ll want to have the byte boundaries set the same as those of the Arm processor. You only need to do that for your structure PACK. You can use the ‘#pragma pack’ directive to set the byte boundaries appropriately.

#pragma pack(push, 4)   // Align on 4 byte boundary
struct PACK {
unsigned char type;

double a;
double b;
double c;

double p;
double q;
double r;

double acx;
double acy;
double acz;

double magx;
double magy;
double magz;

double tmp;
};
#pragma pack(pop)

I’ve used the above concept to align on 1 byte boundaries before when reading data from a serial device that had no concept of byte alignment.
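
To see what a given pack value actually does, the layout can be checked at compile time with offsetof and sizeof. The sketch below uses two shortened hypothetical structs, Pack4 and Pack1, instead of the full PACK, and assumes an 8-byte double and a compiler that honours #pragma pack.

```cpp
#include <cstddef>

// pack(4): members are aligned to at most 4 bytes,
// so 3 padding bytes follow 'type'.
#pragma pack(push, 4)
struct Pack4 {
    unsigned char type;
    double a;
    double b;
};
#pragma pack(pop)

// pack(1): no padding anywhere.
#pragma pack(push, 1)
struct Pack1 {
    unsigned char type;
    double a;
    double b;
};
#pragma pack(pop)

static_assert(offsetof(Pack4, a) == 4,  "3 bytes of padding after type");
static_assert(sizeof(Pack4)      == 20, "4 + 8 + 8");
static_assert(offsetof(Pack1, a) == 1,  "no padding after type");
static_assert(sizeof(Pack1)      == 17, "1 + 8 + 8");
```

Putting the same static_assert lines in the build for both targets is a cheap way to find out whether the two compilers agree on the layout before any data is exchanged.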

Tim

Many thanks for all your explanations!
Sorry for the late update. Basically, I collect data and write it into a log file on the armle target and then decode it on the x86 host. Looking at the log data in hex, I found that a zero byte is padded after the unsigned char type, which is not expected on x86 and leads to incorrect parsing on the x86 side with the same data structure PACK as shown above.

From Tim’s post, it seems that I can change the alignment on x86 by using #pragma pack? I have tried (push,4) and (push,2) but it did not work… I think I did not get the whole idea…

Regards,
Eric