Running out of memory

I have a QNX 4.25G system running our own application. After a few weeks,
the system runs out of memory. I have caught the system just before it ran
out completely and used ‘sin’ to try to examine what is happening.

‘sin info’ showed me 0M free from 128M installed memory

‘sin freemem’ showed about 250k memory free in 5-6 blocks

These results seem to agree OK (taking rounding errors into account). But
then I ran ‘sin mem’ and did some processing with the memory selectors (to
try to remove shared memory areas from counting more than once). This is
what I did:
1.) Find all the selectors with a unique starting address.
2.) Find all the selectors with a non-unique starting address.
    Group these by address.
    For each group, find the selector with the largest size.
3.) Sum all the sizes from steps 1 and 2.
I was expecting this to yield the amount of memory used, but it added up to
about 50M
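
The procedure above can be sketched roughly as follows. This is only an illustration of the steps as described, not the script that was actually used. It assumes each ‘sin mem’ entry has the form "selector address size flags", that a "k" or "M" suffix means binary units, and the names ENTRY, parse_entries and dedup_total are made up for this example.

```python
import re
from collections import defaultdict

# One 'sin mem' entry: selector, start address, size (optional k/M), flags.
ENTRY = re.compile(r"\b([0-9A-F]{4})\s+([0-9A-F]{8})\s+(\d+)([kM]?)\s+(\S+)")
UNITS = {"": 1, "k": 1024, "M": 1024 * 1024}  # assumption: binary units


def parse_entries(text):
    """Yield (selector, address, size_in_bytes) for every entry found."""
    for sel, addr, num, unit, _flags in ENTRY.findall(text):
        yield sel, int(addr, 16), int(num) * UNITS[unit]


def dedup_total(text):
    """Sum sizes, counting each start address once at its largest size."""
    largest = defaultdict(int)
    for _sel, addr, size in parse_entries(text):
        largest[addr] = max(largest[addr], size)
    return sum(largest.values())


# Two processes taken from the listing below; header lines are ignored.
sample = """\
//1/bin/tinit 226
0007 015AF000 16384 -B-3--------DC- 000F 015AF000 28672 -B-3-----------
//1/bin/tinit 227
0007 0176B000 16384 -B-3--------DC- 000F 0176B000 20480 -B-3-----------
"""
print(dedup_total(sample))
```

Note that this keys on the starting address alone, so two selectors with the same address are treated as one area whichever process or selector they belong to.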

Can anyone tell me what I am doing wrong? How do I find where the missing
memory has gone?

Thanks, Simon Flower.



In case you are interested, here is the output from ‘sin mem’:

/boot/sys/Proc32 1
0005 F0000000 118784 -B-3--------DC- 000D F0040000 268M -B-3--------D–
10A9 00038000 28672 -B–±-G-----C- 0009 FF801000 29056 —3—G-------
0011 FF800000 2048 -------G------- 0049 0000A000 4096 -B-----G-------
/boot/sys/Slib32 2
0005 0002A000 53248 -BS3---------C- 000D 00037000 4096 -B-3-----------
10A1 0002A000 53248 -BS3±-G-----C-
/bin/Fsys 4
0005 00041000 77824 -B-3---------C- 000D 0024D000 18804k -BS3-----------
0015 02693000 19165k -BS3±--------- 001D 0005A000 61440 -BS3±-------C-
0025 014D5000 20480 -BS3±--------- 002D 0024D000 18804k -BS3±---------
0035 014CB000 20480 -BS3±-------C- 003D 0024D000 18804k -BS3±---------
/bin/Fsys.eide 5
0005 0005A000 61440 -BS3---------C- 000D 02693000 19165k -BS3-----------
0015 00001000 8192 -B-3-------MD-- 001D 0007A000 65536 -B-3-------MD–
0025 0008A000 65536 -B-3-------MD-- 002D 0024D000 18804k -BS3±---------
idle 8
10F9 014E7000 40960 -B-3—G—M—
//1/bin/Dev32 16
0005 0000C000 32768 -BS3---------C- 000D 00226000 94208 -BS3-----------
0015 00014000 16384 -BS3±--------- 001D 00003000 16384 -BS3±-------C-
0025 00007000 8192 -BS3±-----M— 002D 00054000 20480 -BS3±---------
0035 014AB000 40960 -BS3±-------C- 003D 00054000 20480 -BS3±---------
0045 014AB000 40960 -BS3±-------C- 004D 00073000 16384 -BS3±-----M—
0055 00009000 4096 -BS3±-----M— 005D 0006E000 12288 -BS3±---------
0065 00021000 8192 -BS3±-------C- 006D 00247000 16384 -BS3±---------
0075 0009A000 12288 -BS3±-------C- 007D 00247000 16384 -BS3±---------
0085 0009A000 12288 -BS3±-------C- 008D 014C1000 20480 -BS3-------M—
0095 014C6000 20480 -BS3-------M— 009D 00023000 4096 -BS3-------M—
00A5 0023D000 40960 -BS3±--------- 00AD 00018000 36864 -BS3±-------C-
00B5 015F0000 299008 -BS3-------M—
//1/bin/Pipe 21
0005 019BF000 16384 -B-3--------DC- 000D 019BF000 49152 -B-3-----------
//1/bin/Dev32.ser 23
0005 00003000 16384 -BS3---------C- 000D 00014000 16384 -BS3-----------
0015 00226000 94208 -BS3±--------- 001D 0000C000 32768 -BS3±-------C-
0025 00007000 8192 -BS3-------M—
//1/digi/Dq.ser32 24
0005 00018000 36864 -BS3---------C- 000D 0023D000 40960 -BS3-----------
0015 00069000 20480 -B-3-------M— 001D 00226000 94208 -BS3±---------
0025 0000C000 32768 -BS3±-------C-
00B5 015F0000 299008 -BS3±-----M—
//1/bin/Dev32.ansi 25
0005 014AB000 40960 -BS3---------C- 000D 00054000 20480 -BS3-----------
0015 000A0000 65536 -----------MD-- 001D 00226000 94208 -BS3±---------
0025 0000C000 32768 -BS3±-------C- 002D 00226000 94208 -BS3±---------
0035 0000C000 32768 -BS3±-------C- 003D 014B5000 49152 -B-3-------M—
004D 00073000 16384 -BS3-------M— 0055 00009000 4096 -BS3-------M—
//1/bin/Dev32.par 29
0005 00021000 8192 -BS3---------C- 000D 0006E000 12288 -BS3-----------
0015 00226000 94208 -BS3±--------- 001D 0000C000 32768 -BS3±-------C-
009D 00023000 4096 -BS3±-----M—
//1/bin/Dev32.pty 30
0005 0009A000 12288 -BS3---------C- 000D 00247000 16384 -BS3-----------
0015 00226000 94208 -BS3±--------- 001D 0000C000 32768 -BS3±-------C-
0025 00226000 94208 -BS3±--------- 002D 0000C000 32768 -BS3±-------C-
008D 014C1000 20480 -BS3±-----M— 0095 014C6000 20480 -BS3±-----M—
//1/bin/Fsys.floppy 34
0005 014CB000 20480 -BS3---------C- 000D 014D5000 20480 -BS3-----------
003D 0024D000 18804k -BS3±---------
10F1 014D0000 20480 -B-3—G-±MD–
//1/bin/Iso9660fsys 35
0005 014FE000 28672 -B-3--------DC- 000D 014FE000 57344 -B-3-----------
//1/bin/Net 38
0005 014DA000 32768 -B-3---------C- 000D 0151C000 32768 -B-3-----------
0015 01524000 40960 -BS3±-------C- 001D 0152E000 176128 -BS3±---------
//1/bin/Net.ether82557 41
0005 01524000 40960 -BS3---------C- 000D 0152E000 176128 -BS3-----------
0015 0144A000 73728 -B-3-+±—MD-- 001D 0003F000 4096 -B-3-+±—MD–
0025 0145C000 49152 -B-3-+±—MD-- 002D 0009D000 8192 -B-3-+±—MD–
//1/bin/nameloc 44
0007 01586000 12288 -B-3--------DC- 000F 01586000 20480 -B-3-----------
//1/bin/nameloc 45
0007 0155E000 12288 -B-3--------DC- 000F 0155E000 16384 -B-3-----------
//1//usr/ucb/Socket 53
0005 09F6D000 225280 -B-3--------DC- 000D 09F6D000 1290k -B-3-----------
//1//usr/ucb/inetd 63
0007 01D15000 36864 -B-3--------DC- 000F 01D15000 32768 -B-3-----------
//1/bin/Mqueue 69
0005 01445000 20480 -B-3---------C- 000D 038DA000 4218k -B-3-----------
//1//v0.3/bin/errlog 164
0007 016C7000 45056 -B-3--------DC- 000F 016C7000 49152 -B-3-----------
//1//v0.3/bin/schedule 165
0007 018C9000 135168 -B-3--------DC- 000F 018C9000 196608 -B-3-----------
//1//v0.3/bin/stalta 176
0007 01C93000 151552 -B-3--------DC- 000F 01C93000 86016 -B-3-----------
//1//bin/sdas_writer 181
0007 01A03000 135168 -B-3--------DC- 000F 01A03000 65536 -B-3-----------
//1//bin/sdas_clock 183
0005 019DB000 77824 -B-3--------DC- 000D 019DB000 45056 -B-3-----------
//1//v0.3/bin/adc2 195
0005 01B92000 135168 -B-3--------DC- 000D 01B92000 335872 -B-3-----------
//1//v0.3/bin/buffer 197
0007 01A86000 36864 -B-3--------DC- 000F 01A86000 28672 -B-3-----------
//1//bin/earthdata 201
0007 01A3D000 131072 -B-3--------DC- 000F 01A3D000 49152 -B-3-----------
//1//v0.3/bin/import 206
0007 08BA8000 139264 -B-3--------DC- 000F 08BA8000 241664 -B-3-----------
//1//bin/sdas_webs 223
0007 017D8000 237568 -B-3--------DC- 000F 017D8000 155648 -B-3-----------
//1//v0.3/bin/sdas_ds 224
0007 01B42000 155648 -B-3--------DC- 000F 01B42000 86016 -B-3-----------
//1/bin/tinit 226
0007 015AF000 16384 -B-3--------DC- 000F 015AF000 28672 -B-3-----------
//1/bin/tinit 227
0007 0176B000 16384 -B-3--------DC- 000F 0176B000 20480 -B-3-----------
//1/bin/ksh 228
0007 022EB000 94208 -B-3--------DC- 000F 022EB000 69632 -B-3-----------
//1//photon/bin/Photon 366
0005 01CD6000 57344 -B-3--------DC- 000D 01CD6000 49152 -B-3-----------
//1//bin/phfontpfr 370
0005 02379000 126976 -B-3--------DC- 000D 02379000 389120 -B-3-----------
//1//drivers/Null.ms 374
0005 0175D000 12288 -B-3--------DC- 000D 0175D000 16384 -B-3-----------
//1//drivers/Pg.rage 376
0005 05257000 155648 -B-3--------DC- 000D 05257000 167936 -B-3-----------
0015 000C0000 65536 -----------MDC- 001D 00000000 4096 -----------MD–
0025 00000000 4096 -----------MDC-
//1/bin/Input 380
0005 01942000 65536 -B-3--------DC- 000D 01942000 28672 -B-3-----------
//1//photon/bin/pwm 385
0007 0445D000 94208 -B-3--------DC- 000F 0445D000 143360 -B-3-----------
//1/bin/Input 389
0005 01942000 65536 -B-3--------DC- 000D 01942000 28672 -B-3-----------
//1//photon/bin/pdm 400
0007 04D35000 143360 -B-3--------DC- 000F 04D35000 507904 -B-3-----------
//1//photon/bin/pterm 406
0007 03F5D000 65536 -B-3--------DC- 000F 03F5D000 110592 -B-3-----------
//1/bin/ksh 409
0007 02409000 94208 -B-3--------DC- 000F 02409000 45056 -B-3-----------
//1/bin/ksh 23632
0007 0AEB4000 94208 -B-3--------DC- 000F 0AEB4000 45056 -B-3-----------
//1/bin/sin 23654
0005 0AEF7000 45056 -B-3--------DC- 000D 0AEF7000 57344 -B-3-----------

Simon Flower <s.flower@bgs.ac.uk> wrote:

> I have a QNX 4.25G system running our own application. After a few weeks,
> the system runs out of memory. I have caught the system just before it ran
> out completely and used ‘sin’ to try to examine what is happening.
>
> ‘sin info’ showed me 0M free from 128M installed memory
>
> ‘sin freemem’ showed about 250k memory free in 5-6 blocks

I’ve never played with/followed the selectors stuff.

Do you tend to have a lot of process creation/termination happening?

Whether or not that is the case, it is probably still worth grabbing
a snapshot of the ‘sin’ output once the system has fully initialized.

Then, after it is running, or if you catch it when it is almost out
of memory, do that ‘sin’ snapshot again, and compare on a process by
process basis how much memory each owns. If one, or a few, have grown
significantly, that’s the first place to look.
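
A rough sketch of that comparison, in case it helps. It assumes the two snapshots were saved as text, that every process line has all eight columns shown in the ‘sin -P less’ example further down (SID PID PROGRAM PRI STATE BLK CODE DATA), and that "k" means 1024 bytes. The function names and the sample numbers are invented for illustration.

```python
UNITS = {"": 1, "k": 1024, "M": 1024 * 1024}  # assumption: binary units


def to_bytes(field):
    """Convert a size field such as '40k' or '4096' to bytes."""
    unit = field[-1] if field[-1] in "kM" else ""
    return int(field[: len(field) - len(unit)]) * UNITS[unit]


def parse_sin(text):
    """Map (pid, program) -> DATA size in bytes from saved 'sin' output."""
    sizes = {}
    for line in text.splitlines():
        cols = line.split()
        if len(cols) != 8 or not cols[0].isdigit():
            continue  # skip the header, blank lines, and short lines
        _sid, pid, program, _pri, _state, _blk, _code, data = cols
        sizes[(int(pid), program)] = to_bytes(data)
    return sizes


def growth(before, after):
    """Return (delta_bytes, pid, program) for grown processes, largest first."""
    a, b = parse_sin(before), parse_sin(after)
    grown = [(b[k] - a[k], k[0], k[1]) for k in b if k in a and b[k] > a[k]]
    return sorted(grown, reverse=True)


# Invented snapshots: example_b's data grows from 28k to 9000k.
before = """\
SID PID PROGRAM PRI STATE BLK CODE DATA
0 101 //1/bin/example_a 10o REPLY 32 65k 40k
0 102 //1/bin/example_b 10o REPLY 32 16k 28k
"""
after = """\
SID PID PROGRAM PRI STATE BLK CODE DATA
0 101 //1/bin/example_a 10o REPLY 32 65k 40k
0 102 //1/bin/example_b 10o REPLY 32 16k 9000k
"""
for delta, pid, program in growth(before, after):
    print(pid, program, "+%dk" % (delta // 1024))
```

Processes that were restarted between snapshots get a new PID and are skipped here, so it fits best on a system with little process creation and termination.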

Next place to look is /dev/shmem. After your system is fully initialized,
grab an “ls -l” of /dev/shmem, then when memory is full, do it again. Are
there new objects of a certain type accumulating? Are there unused objects
accumulating? Who creates/uses those objects?
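
The /dev/shmem check could be sketched the same way. This assumes the two “ls -l” listings were saved as text and that the object name is the last column with no spaces in it. The function names and the sample listings are made up.

```python
def object_names(ls_output):
    """Return the set of names (last column) from saved 'ls -l' text."""
    names = set()
    for line in ls_output.splitlines():
        cols = line.split()
        if len(cols) < 2 or cols[0] == "total":
            continue  # skip blank lines and the 'total' summary line
        names.add(cols[-1])
    return names


def new_objects(before, after):
    """Names present in the later listing but not in the earlier one."""
    return sorted(object_names(after) - object_names(before))


# Invented listings: one extra object appears in the second snapshot.
before = """\
total 1
-rw-rw-rw-  1 root root 4096 Jan 01 12:00 example_buf1
"""
after = """\
total 2
-rw-rw-rw-  1 root root 4096 Jan 01 12:00 example_buf1
-rw-rw-rw-  1 root root 8192 Jan 20 09:30 example_buf2
"""
print(new_objects(before, after))
```

Any names it reports are the candidates for the questions above about who creates those objects and whether they are still in use.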

> These results seem to agree OK (taking rounding errors into account). But
> then I ran ‘sin mem’ and did some processing with the memory selectors (to
> try to remove shared memory areas from counting more than once). This is
> what I did:
> 1.) Find all the selectors with a unique starting address
> 2.) Find all the selectors with a non-unique starting address
>     Group these by address.
>     For each group find the selector with the largest size
> 3.) Sum all the sizes from one and two
> I was expecting this to yield the amount of memory used, but it added up to
> about 50M
>
> Can anyone tell me what I am doing wrong? How do I find where the missing
> memory has gone?

I know that going from “sin mem” to anything else is... complex. As
an example, consider:

//1/bin/tinit 226
0007 015AF000 16384 -B-3--------DC- 000F 015AF000 28672 -B-3-----------
//1/bin/tinit 227
0007 0176B000 16384 -B-3--------DC- 000F 0176B000 20480 -B-3-----------

Two instances of tinit are running. Each has two selectors (code and data);
did you count that as four areas of memory? Even though the offset part
is the same for both selectors within each tinit, the segment (selector)
part is different, so for the first tinit 0007:015AF000 and 000F:015AF000
represent different areas of physical RAM.

But, QNX will do code-sharing for identical executables, so in fact the
first tinit’s 0007:015AF000 and the second tinit’s 0007:0176B000 are
probably the same physical RAM.

If you do a ‘sin’, does it report 16k code for each tinit, or 8k code
for each? ‘sin’ generally divides the total code size by the number of
instances when reporting the code size of a process. Compare:

# sin -P less
SID   PID PROGRAM        PRI STATE   BLK  CODE  DATA
 12  8246 //88/ram/less  10o REPLY    32   65k   40k

# less filename
# sin -P less
SID   PID PROGRAM        PRI STATE   BLK  CODE  DATA
 13  5885 //88/ram/less  10o REPLY    32   32k   40k
 12  8246 //88/ram/less  10o REPLY    32   32k   40k

-David

David Gibbs
QNX Training Services
dagibbs@qnx.com