I have a few questions about QNX 6.3SP2 (x86) memory management:
What’s the difference between the standard libc memory allocator and the one in libmalloc.so (non-debug version)?
Is it legal to call realloc() on a memory block allocated with memalign()?
I’m currently porting VLC to QNX 6.3SP2 and I’m experiencing heap corruption (a crash during free() in __flist_dequeue_bin()). The crash occurs after some time, but the source line and the moment of the crash vary randomly.
When I replace all calls to malloc(x) with calls to memalign(16, x), the application crashes earlier and always near the same line of code. It may be memory corruption by the application, or it may be a bug or a limitation in the memory allocator.
The same realloc() question applies to blocks reserved with valloc().
Using the debug version of libmalloc.so (all checks enabled), I get some strange warnings about “free block has been modified” when blocks are freed and reallocated at the same address without having been overwritten. In fact, all bytes of the block are still 0x02 (the free block tag).
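To make the warning concrete, here is a minimal sketch of how a fill-pattern check of this kind can work. It is not libmalloc's actual implementation: the helper names are hypothetical, and the 0x02 tag value is simply the one observed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill byte observed in the debug allocator output above (an assumption here). */
#define FREE_TAG 0x02

/* Hypothetical helper: stamp a block with the tag when it is released. */
static void tag_free_block(unsigned char *block, size_t size)
{
    assert(block != NULL);
    memset(block, FREE_TAG, size);
}

/* Hypothetical helper: return the offset of the first byte that no longer
 * holds the tag, or `size` if the whole block is still intact. */
static size_t first_modified_byte(const unsigned char *block, size_t size)
{
    size_t i;

    assert(block != NULL);
    for (i = 0; i < size; i++) {
        if (block[i] != FREE_TAG)
            return i;
    }
    return size;
}
```

With a check like this, a block whose bytes are all still 0x02 should pass, which is why a "modified" warning on such a block looks suspicious.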
If you look at the realloc() entry in the helpviewer docs you’ll see a host of warnings about possible memory leaks and, even worse, the fact that realloc() may be forced to allocate a new block, leaving every pointer to the old block pointing at free’d memory.
For example, consider the following:
char *p1 = malloc(100);
char *p2 = p1; // p2 points to memory malloc’d for p1
p1 = realloc(p1, 1000); // p1 points to a new block of memory
If realloc() needed a new block to satisfy the call, then p1 will point to the new memory block while p2 will continue to point to the old one, which is now free’d. If the program writes through p2, it could corrupt that memory space.
That is almost certainly the cause of your corruption.
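One common way to avoid the stale-pointer problem is to keep a single owning pointer, update it only when realloc() succeeds, and refer to positions inside the block by offset rather than by a second pointer. A minimal sketch, assuming a hypothetical helper named grow_buffer (nothing like it appears in VLC as far as I know):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: grow *buf to new_size bytes.
 * Returns 0 on success, -1 on failure. On failure *buf is left unchanged
 * and is still a valid allocation, so nothing leaks. */
static int grow_buffer(char **buf, size_t new_size)
{
    char *tmp;

    assert(buf != NULL);
    assert(new_size > 0);

    tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;      /* old block is untouched */

    *buf = tmp;         /* the only pointer that gets updated */
    return 0;
}
```

A caller would then keep something like `size_t off = 10;` instead of `char *p2 = p1 + 10;`, and recompute `p1 + off` after every call, so that no copy of the old address survives the resize.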
Incidentally, I don’t see why a memalign()’ed block can’t be realloc’ed as long as the new block is correctly aligned, but it may be risky.
I usually avoid reallocs as well.
But I’m porting an application (VLC Media Player) which makes extensive use of them, so I really don’t have a choice.
Also VLC seems to work quite well on Win32 and Linux…
For the memalign() thing, I don’t have any answer, just my experiments.
There seems to be no crash if I use plain malloc() instead of memalign().
In fact, the docs do not say that reallocating a block previously reserved by memalign() preserves its alignment.
They simply do not say whether calling realloc() on a memalign’ed block is valid…
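Since the docs are silent on this, one conservative workaround is to never hand an aligned block to realloc() at all: allocate a fresh aligned block, copy, and free the old one. The sketch below is only an illustration under some assumptions: aligned_resize and is_aligned are hypothetical names, posix_memalign() is assumed to be available (memalign() from <malloc.h> could be substituted), and the caller is assumed to track the old size itself.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true if ptr is a multiple of alignment. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Hypothetical helper: resize an aligned block without calling realloc().
 * old may be NULL (then old_size is ignored). alignment must be a power
 * of two and a multiple of sizeof(void *). Returns the new block, or NULL
 * on failure, in which case the old block is left untouched. */
static void *aligned_resize(void *old, size_t old_size,
                            size_t new_size, size_t alignment)
{
    void *fresh = NULL;

    if (posix_memalign(&fresh, alignment, new_size) != 0)
        return NULL;

    if (old != NULL) {
        /* Copy only what fits in both the old and the new block. */
        memcpy(fresh, old, old_size < new_size ? old_size : new_size);
        free(old);
    }
    return fresh;
}
```

This costs a copy on every resize, but both the alignment and the validity of the call are guaranteed by construction rather than by undocumented allocator behaviour.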
You mentioned you’re using the debug malloc library (libmalloc.so) and seeing some warnings. When your crash occurs, are you seeing only the warnings you mentioned, or are there others? It’s worrisome to see those warnings.
I don’t know anything about the VLC media player, so I wanted to ask a couple of questions. Is it a multi-threaded app and, if so, does it use any mutexes or semaphores? The fact that it works quite well under Linux/Windows might lead one to believe the problem is in your port. How much code have you had to change to get it to compile?
VLC is a multi-threaded app (3-5 threads for each video stream). It makes extensive use of mutexes.
No “real” code was changed during the port. Only includes were added to fix compile errors, as well as some #ifdefs.
I’ve patched ffmpeg (the video decoder used by VLC) to use malloc() instead of memalign(). With the standard libc allocator, the crash now seems to occur less frequently (I haven’t seen it since yesterday).
Using the debug version of libmalloc always gives me the following message as soon as I open a video stream:
got trace bt depth 10
 main interface error: no interface module matched “hotkeys,none”
 main interface error: no suitable interface module
 main private error: interface “hotkeys,none” initialization failed
 main playlist debug: adding playlist item `rtsp://0.0.0.0:0/dummy log’ ( rtsp://0.0.0.0:0/dummy log )
MALLOC Error from memcpy() (called from instruction preceding 0xB033CE1E pointer value 0x08398030):
data in free’d memory block has been modified
Memory fault (core dumped)
I’d be VERY surprised if it’s a bug in libmalloc. It’s almost certainly a bug in the VLC code.
This address (0xB033CE1E) shows where in the code libmalloc thinks the problem occurred, but I assume you already knew that. Assuming you built a debug version, you should be able to see the location where it happened.
Since the issue happens when you open a video stream I’d start looking at the code in there and removing pieces of it until you find out which piece of the code causes the problem that libmalloc reports.
On the other hand, if removing all the memalign() calls fixes the crashes, it may be entirely related to trying to realloc() memory allocated via memalign(). There are options to the gcc compiler to align on 4-byte boundaries. Maybe Windows and Linux are doing that and QNX isn’t by default, which could be causing the memalign/realloc problem.
I agree with Tim. I have seen many cases of latent bugs in Linux code exposed when ported to QNX. If this is the case, and you have a dual-core machine, it is very likely a race condition, as QNX exhibits greater concurrency than Linux. The simple way to confirm this would be to use the “on” utility to restrict the scheduling of the application to a single core. If this rectifies the problem, then a race condition is the most likely cause. The System Profiler will help you locate the source of the race condition.