Shm_open/mmap and malloc.

I am having a problem with a legacy application
that seems to have something to do with the interaction
between shared memory (for a PCI card access and inter-process
linked lists) and malloc. I was hoping someone could shed some
light on how this works so I can find the problem.

My program runs as three processes. The first is the driver, which
reads from a memory-mapped PCI card (there are two 256-byte sections
of memory mapped per card). This process sends any incoming packets
of data to the second task (the input manager) via a shared linked
list. Anything to be written to the PCI card is sent from the third
task (the output manager), which also uses a shared linked list to
send packets to the driver, which outputs them to the PCI card. All
three processes are compiled using the 'flat' memory model to get
access to the large data area, and all three have heaps set to 200k.

The PCI card is initialized using the shm_open and mmap functions as
below (I have left out the code which actually finds the card using
the CA_PCI… functions, as that detects the PCI card consistently).
Basically I open physical memory and then convert the two base address
registers into shared memory. This function typically works just fine,
though it seems to always allocate memory on a 0x1000-byte boundary
even though I am only mapping 256 bytes per mmap, which is no big deal.

Boolean Map_PCI_Card
    (XLON_PCI_PCLTA *dev, struct _pci_config_regs *cregs)
{
  // Shared memory file descriptor.
  int fd;

  // Open physical memory file descriptor.
  fd = shm_open ("Physical", O_RDWR, 0777);
  if (fd == -1)
    return 0;

  // Set lowest bits to zero, they do not contain information.
  dev->m_MemoryRange_P = cregs->Base_Address_Regs[0] &
    0xFFFFFFF0;

  // Map first base memory address (256 bytes).
  dev->Pport = mmap (0,
    256,
    PROT_READ | PROT_WRITE | PROT_NOCACHE,
    MAP_SHARED,
    fd,
    dev->m_MemoryRange_P);
  if (dev->Pport == (void *) -1)
  {
    close (fd);
    return 0;
  }

  // Set lowest bits to zero, they do not contain information.
  dev->m_MemoryRange_L = cregs->Base_Address_Regs[1] &
    0xFFFFFFF0;

  // Map second base memory address (256 bytes).
  dev->Lport = mmap (0,
    256,
    PROT_READ | PROT_WRITE | PROT_NOCACHE,
    MAP_SHARED,
    fd,
    dev->m_MemoryRange_L);
  if (dev->Lport == (void *) -1)
  {
    close (fd);
    return 0;
  }

  // The mappings remain valid after the descriptor is closed.
  close (fd);

  // Success!
  return 1;
}

The main code of the driver is as follows. Basically the driver calls
qnx_segment_arm() to allow other processes to access this process's
shared memory. Then I allocate a linked list for messages read from
the PCI card (Uplink_head/tail), which is sent to the input manager
process via a Send (omitted for clarity). Then I wait for the output
manager to send the driver process a pointer to a linked list, which
the driver uses to read messages to send to the PCI card (this has
also been omitted for clarity).

// MAIN DRIVER CODE…

typedef struct {
  unsigned segment;
  unsigned offset;
} shared_addr_t;

typedef struct {
  // Contents of packet.
  unsigned char Contents[254];

  // Next pointer.
  shared_addr_t next;
    
} element_t;

// Define a head and tail pointer for messages to send
// to input handler process. These are read from the PCI card.
element_t *Uplink_head;
element_t *Uplink_tail;

// Define where to read messages from the output manager.
// These are sent to the PCI card.
element_t __far *Downlink_queue;      

// Allow all other processes to access shared memory.
qnx_segment_arm (-1, -1, 0);

Uplink_head = Uplink_tail = malloc (sizeof(element_t));
memset(Uplink_head, 0, sizeof (*Uplink_head));

// This is a reference to the downlink queue which is sent to the
// process as a Shared_Addr. I left this out as it doesn't seem to
// be important for this problem.
{
  // The other task does a Send to this task of the downlink
  // pointer's shared_addr_t structure, which is converted into a
  // far pointer (I left this step out).
  unsigned this_seg = qnx_segment_get(pid, Down_addr.segment, 0);
  Downlink_queue = MK_FP(this_seg, Down_addr.offset);
}

If you made it this far (don't you have anything better to do?) then
this is where things get strange, at least to me. If I execute the
main driver code BEFORE the Map_PCI_Card function, I get an error
(ENOMEM) in Map_PCI_Card at the first call to mmap. If I run the main
driver code AFTER the Map_PCI_Card function, I can only allocate
24 (???) elements before malloc returns NULL to indicate that the
system is out of memory (see code segment below):

// Now just fill in the Uplink.
// This loop ends after allocating only 24 new elements.
for (i = 0; ; i++)
{
  element_t *new = malloc (sizeof(element_t));

  // Stop if no memory is available.
  if (new == NULL)
    break;
  memset (new, 0, sizeof(*new));

  // Fill in new->Contents...

  // Link at tail by replacing tail->next.
  Uplink_tail->next.segment = FP_SEG(new);
  Uplink_tail->next.offset = FP_OFF(new);
  Uplink_tail = new;
}

Well, this is my question. What is happening? Why is the malloc
failing? I have made sure that the output manager is not actually
sending anything to the PCI card during this time, and there is no
other memory allocation done from the driver's memory (i.e., no more
mallocs). Is this a problem with mixing the qnx_segment_* stuff with
shm_open/mmap (i.e., the Watcom documentation mentions that the former
is for 16-bit and the latter is for 32-bit)? Are there some other
compiler/linker or Proc32 options that are limiting my shared memory
mallocs?

You mention you are using the flat model, so I assume you are in 32-bit mode, right? In 32-bit mode, the flat model does not give access to any more memory than the small model.

You also mention "all three have heaps set to 200k". Setting the heap size through the linker will not do much; the heap will grow as needed.

Now, I haven't analysed your problem thoroughly (I do have other things to do ;-) ); one reason is I haven't used qnx_segment_* in ages.

Plus, because you use qnx_segment_arm(-1, -1, 0), basically you are giving any program a way to trash your process's memory.

What you could do is set up the uplink and downlink in their own private memory section that you would create with shm_open and then mmap. That's the way to go.
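A minimal sketch of that approach, as I'd picture it: create a named shared region, size it with ftruncate, map it, and carve fixed-size elements out of it with a simple free list instead of malloc. The region name, structure names, and the free-list scheme here are all illustrative, not from the original code:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define POOL_ELEMS 256

typedef struct pool_element {
    unsigned char contents[254];
    struct pool_element *next;   /* free-list / queue link */
} pool_element_t;

typedef struct {
    pool_element_t slots[POOL_ELEMS];
    pool_element_t *free_head;
} pool_t;

/* Create (or open) the shared region and chain its slots
 * into a free list. */
pool_t *pool_create(const char *name)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0666);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, sizeof(pool_t)) == -1) {
        close(fd);
        return NULL;
    }
    pool_t *p = mmap(NULL, sizeof(pool_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    close(fd);   /* the mapping survives closing the descriptor */
    if (p == MAP_FAILED)
        return NULL;

    /* Chain every slot onto the free list. */
    for (size_t i = 0; i + 1 < POOL_ELEMS; i++)
        p->slots[i].next = &p->slots[i + 1];
    p->slots[POOL_ELEMS - 1].next = NULL;
    p->free_head = &p->slots[0];
    return p;
}

/* Pop a zeroed element off the free list, or NULL if exhausted. */
pool_element_t *pool_alloc(pool_t *p)
{
    pool_element_t *e = p->free_head;
    if (e != NULL) {
        p->free_head = e->next;
        memset(e, 0, sizeof(*e));
    }
    return e;
}

/* Push an element back onto the free list. */
void pool_free(pool_t *p, pool_element_t *e)
{
    e->next = p->free_head;
    p->free_head = e;
}
```

One caveat: raw next pointers are only valid if every process maps the region at the same address. If they can map it at different addresses, store the links as offsets into the region instead, much like the segment/offset pairs in your shared_addr_t.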

Does that mean I can’t use malloc() to allocate my linked lists anymore? Do I have to create my own private memory section and then write functions to do allocation/deallocation from it?

Yes, I am running in 32-bit mode. The heaps had to be set so that shared memory allocation using malloc() would allow allocation of more memory, though this applied only to older versions of QNX (I am running 4.25).