OS Image Relocation

C11 · September 1, 2006, 12:54pm

Hi everyone!

Quick question: is it possible to dynamically relocate the OS image in RAM at runtime? I’m not 100% sure about that, but I suppose there must be some fixed pointers that point at an address inside the image. At least that’s startup_vaddr, if I’m not mistaken. Any chance one can alter all the affected pointers, or is this quest to relocate the image at runtime useless?

Suggestions? Comments? Facts?

Regards,
C++

mario · September 1, 2006, 2:05pm

You mean once it has already started? I doubt that’s possible. I can imagine it could be possible, but I’m sure it would be extremely challenging…

maschoen · September 1, 2006, 7:50pm

Well, in some sense it is easy, and in some sense impossible. For x86 at least, the actual RAM locations of memory are irrelevant. It’s all virtual. If you could stop everything, you could copy all the OS segments to other RAM locations, update all the required tables, and go. On the other hand, you would need a hook into the OS to do this, which probably is not there.

mario · September 1, 2006, 8:25pm

I see some possible problems with moving the kernel itself, which may have portions that aren’t using virtual address space. And who is to say that when you move that non-virtual section of code there isn’t some address on the kernel stack waiting to be popped to return to another function? You might also have to update interrupt vector addresses.

Without the source to the kernel I wouldn’t dare try it.

maschoen · September 1, 2006, 8:36pm

With respect to your last statement, yes, yes, yes, which is why this is all academic.

Now, I’m pretty darn sure that the vector table in protected mode is movable. It only sits down at real: 0000:0000 in real mode. The only data that I know of that has to stay in a fixed real location would be driver data that you are using DMA on. But with QNX, the drivers are outside the OS.

C11 · September 4, 2006, 7:19am

Hello everyone,

thanks for your replies.

To your first answers, Mario and Maschoen: yes, I’d like to move the entire thing once it has been started. And yes, it’s for an x86 target.

Mario, you said that it would be extremely challenging. So I guess it’s nothing one can do in a matter of days. You also mentioned the kernel source code. Is this available for download for free? I suppose not.

Maschoen, you don’t sound too upbeat on dynamic relocation either, when commenting on the issue as being academic. For moving the vector table I must have access to the kernel source, mustn’t I?

One last question to the two of you or anyone else who’d like to join in. When you mentioned that the vector table sits at real 0000:0000, you meant the physical RAM address, didn’t you?
I was wondering because, when dumping the memory, it’s divided into several sections, one of which is the first megabyte of RAM.
Question: The 640 to 1024k section is the reserved address space, where video RAM, I/O devices etc. are stored. But what about the 0 to 640k section? What does QNX Neutrino do with it and what lies in there? OK, the vector table you mentioned, but what else? Any chances I can map this section to another portion of RAM upon booting the system, or is this always at this fixed address?

maschoen · September 4, 2006, 7:06pm

Not for free, maybe not at any reasonable price either.

In this case most probably a synonym for impossible.

Well, what you need access to I believe is the page tables. To find them you would probably need access to the kernel source. You also would likely need Ring 0. The only knowledge I have about getting Ring 0, is that the kernel debugger, which gets loaded very early, needs Ring 0, and this is the only way to get it. Once things get started, you can’t.

I was referring to physical RAM, and the processor in its startup real mode. This is the mode that the 8088 and 8086 always ran in, and that most subsequent x86 processors start up in. Once you are in a protected mode, the location of the vector table becomes assignable by a loadable register.

From 0 to 640k is special in two ways. 1) It is easily accessible during startup. The IPL of the boot loader always goes into this section, although the boot loader could immediately move itself to higher ground.
2) It is non-contiguous with the rest of memory.

I believe that the QNX OS loader puts the OS up above the 1Meg boundary. Then once in protected mode, 0-640K becomes just another piece of ram. Even the non-contiguousness of the memory doesn’t matter much at this point, as page tables make this look contiguous.
The only thing you could use this for would be a piece of DMA memory > 640K. I know I’m handwaving here, but I don’t want to get into a discussion of the two levels of virtualization in 386+ processors.

mario · September 4, 2006, 9:54pm

If I had to do it and even with the kernel source code I would say at least a month, if not more. I’m sure there would be LOTS of details to take care of. If this is only for your application and not something generic maybe you wouldn’t have to cover all cases…

No, it’s not, and it’s very, very, very expensive, I hear.

Well, I guess it’s possible to figure out where these tables are from looking at processor registers, but then when you move them, the kernel must be aware of this. For example, the kernel may have “precached tables” for virtualisation that would probably need to be totally recalculated.

Even a user program may have calculated a physical pointer to some data relative to the kernel or any of the programs in the image. I know some people will use the physical address of programs to calculate a checksum at runtime. If you moved the image, the CRC program would break.

Out of curiosity, why would you need to move the image?

C11 · September 5, 2006, 8:06am

Hello and thanks again

So I get the point. “Hacking” the kernel is not only time expensive, but monetary expensive as well.

Maschoen, believe it or not, but I’m quite interested in the entire startup procedure of x86 CPUs. So if you have any useful internet pages where the startup procedure is explained in detail, I’d be happy if you could provide me with those URLs.

You also mentioned the IPL of the boot loader being in the first 640k of RAM. When browsing through the QNX 6.2 manuals, in the utilities.pdf I found on p.953 that the boot image contains a startup header. Searching through the source files, I also found the bit pattern, or signature, of that header. When I conduct a memory dump, the startup header signature is always found twice in the first 640k of RAM. Do you have any insight into why it is found twice? According to your previous explanations and my memory dump, the image seems to be cut in half and loaded into different sectors of the RAM.
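The kind of search described here can be sketched in a few lines of C. This is only an illustration: the function name `find_signature` is made up, and the signature bytes are passed in by the caller, because the real startup header signature is not quoted in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan a memory dump for every occurrence of a byte signature.
 * Stores up to max_hits offsets in hits[] and returns the total
 * number of occurrences found (which may exceed max_hits).
 * The signature is supplied by the caller; no real QNX value is
 * assumed here. */
size_t find_signature(const uint8_t *dump, size_t dump_len,
                      const uint8_t *sig, size_t sig_len,
                      size_t *hits, size_t max_hits)
{
    size_t count = 0;

    if (sig_len == 0 || dump_len < sig_len)
        return 0;

    for (size_t off = 0; off + sig_len <= dump_len; off++) {
        if (memcmp(dump + off, sig, sig_len) == 0) {
            if (count < max_hits)
                hits[count] = off;   /* remember where it was found */
            count++;
        }
    }
    return count;
}
```

Running something like this over a dump of the first 640k would report each offset at which the pattern appears, which makes it easier to tell a header copy from a comparison constant embedded in loader code.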

Mario, yes, you may ask why I’d like to move the image at runtime. It’s a long explanation, I hope you don’t mind.

Currently I’m working on a memory test. Problem is, that I must test the entire memory space, or at least as much as possible. So if I have an image located at address 0x400000 +1024k, I can’t test those 1024k, which is why I have to move it. Now I’ve come up with 2 strategies. Either move the image at runtime, or conduct a reboot and place the image at another address then. As I’ve learnt, I can most certainly forget the first choice, which is why I’m focussing on a double boot strategy right now. That was also the case (@mario), why I was eager to place the image at a particular address in the RAM in the other thread I opened in this forum.

The only problem I now have is to come to terms with the first 640k of RAM, because they seem to be kind of special. So if any of you has specific information what QNX puts in there exactly, I’d be very happy. Therefore the question, after the image is loaded into RAM, does anything get extracted and moved to other memory locations, or does QNX operate solely within the image? But to be honest, I don’t believe this, because processes that get started right after booting the system most certainly need some kind of memory to run.

If anyone of the two of you can confirm to me, that QNX Neutrino tests the memory it occupies itself before it starts up, that would be good news as well.

Now you might ask why I don’t simply use Memtest86 for this endeavour. Good question. The problem is that the customer would like to have a logfile written to hard disk or some other permanent storage device. Unfortunately Memtest86 doesn’t allow this. It can test the entire RAM, except reserved RAM space, like the 640 to 1024k area. It can also relocate itself at runtime, which is why I thought QNX could be taught to do the same. But it runs completely without operating system support, which means no file system, no system calls, no nothing. Writing a logfile under these circumstances is a bit hard I suppose, but it’s a second option I’m currently following. If you have any alternatives I haven’t considered so far, please let me know. Because right now it seems that QNX is giving me too much, Memtest86 too little.

So, having gotten the information you provided to me, what I’d like to know in order to continue is whether the QNX operating system tests the RAM it occupies itself. What does the first 640k of RAM look like in QNX specifically? And is there an easy way to do an automated reboot, selecting the second image to be started?

What I have in mind is the following. 3 images:
1st: Placed at address 0x400000 (testing RAM, minimal image)
2nd: Placed at address 0x600000 (testing RAM, minimal image)
3rd: Placed at default RAM address (normal image, applications, etc.)

Correct me if I’m wrong, but under Linux the kernel image lies under /proc/boot/. Would it be sufficient, after the first image is done, to swap the 1st and the 2nd image under /proc/boot? Or isn’t it that easy?

Regards,
C++

mario · September 5, 2006, 12:48pm

And you may have to do it all over for every kernel release…

Intel publishes manuals for the CPU; that is probably a good start. Then you need to check out the BIOS documentation.

The image is NOT in the first 640k, it is the IPL which is usually very small.
As to why the signature of the .boot image is found twice, I would guesstimate that one copy must be part of the IPL (since it needs to search for it, it must have a copy of the signature to compare against); the other is probably loaded into memory while the IPL loads and decompresses the image.

Sure you can. I see two possibilities, there are probably more ;-) Write another IPL that will perform the memory test BEFORE the image is loaded. Since in the IPL you have access to the BIOS, you can write to disk.

While the memory test is performed you can disable interrupts, which means the kernel code will NEVER be accessed. However, I would still be worried about writing into the GDT and LDT at run time (the IDT should be OK). But you can figure out where these tables are by reading CPU registers.

It’s not the image, but a file system representing the files stored in the image. Can’t play with that.

C11 · September 5, 2006, 2:49pm

Mario! My friend! Don’t stop talking

CASE 1:
So you suggest activating the RAM test before the IPL has a chance to load the image into RAM. That would make sense, I didn’t think of that. So I’m able to test the area in RAM that is later occupied by the image as well in one go. Great proposition, thanks.

The only question I have is … how do I do that?

When I stop right before the image gets loaded, that has to happen in the IPL. At that stage the operating system has not been loaded, has it?
So, how am I supposed to allocate memory then? I can’t use the mmap_device_memory() methods or any other system calls. So I have to go the way the Memtest86 project used, mustn’t I?
Sorry, I’m a bit confused right now. I’ve gathered lots of information over the past couple of weeks, which I have to come to terms with.

And finally, there’s the question about the hard disk access. What exactly do you have in mind to compensate for the loss of operating system functionality? How can I write to the hard disk? You mentioned the BIOS. Are you insinuating a RAW write? Or is there a way to have file system / operating system support?
By the way, if this piece of information is of any importance, it’s a SCSI hard drive. I guess that complicates things a bit, doesn’t it?

CASE 2:
OK, disabling the interrupts. Noted. No interrupts, no timer to kick me from the CPU, exclusive access, what do I need more?
When clearing the interrupts, do I really have the possibility to eliminate all interrupts? In the operating system lecture I once attended, and if I remember correctly, isn’t there at least one IRQ I cannot “kill”? But I don’t think that will be a problem, since the IRQ that’s responsible for scheduling is enough to disable.

In theory, if I’m the only one active in the system, I can also perform destructive memory writes and reads, can’t I? Something like this:

FOR ALL memory cells MC[i] DO {
    tmp = MC[i]
    MC[i] = 0xAA
    if (MC[i] != 0xAA) {error}
    MC[i] = 0x55
    if (MC[i] != 0x55) {error}
    MC[i] = tmp
}

If I’m not accessing any system calls, which due to the disabled IRQs wouldn’t initiate a context switch anyway, this should be OK. Could you please double check me on that, Mario? Currently I see nothing wrong with that.
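For reference, the loop above could look roughly like this in C. It is a minimal sketch that runs over an ordinary buffer; the name `pattern_test` is made up, and nothing here deals with interrupts, mapping physical memory, or CPU caches.

```c
#include <stddef.h>
#include <stdint.h>

/* Test each byte in [base, base + len) with the 0xAA and 0x55
 * patterns and restore the original value afterwards.
 * Returns the number of failed read-back comparisons.
 * 'volatile' keeps the compiler from optimising the write/read-back
 * pairs away. */
size_t pattern_test(volatile uint8_t *base, size_t len)
{
    static const uint8_t patterns[] = { 0xAA, 0x55 };
    size_t errors = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t saved = base[i];              /* tmp = MC[i] */

        for (size_t p = 0; p < sizeof patterns; p++) {
            base[i] = patterns[p];
            if (base[i] != patterns[p])
                errors++;
        }
        base[i] = saved;                      /* MC[i] = tmp */
    }
    return errors;
}
```

One caveat worth keeping in mind: with caching enabled, the read-back may be served from the CPU cache rather than from the DRAM cell, so a real memory test needs uncached access or much larger strides than this sketch shows.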

Lastly, just give me a short explanation on GDT, LDT and IDT. What’s that? Following the algorithm above, it shouldn’t be a problem writing there as well. The only part I cannot write to is the 640 to 1024k area, because that’s where other devices are mapped into my memory space.

Mario, you really made my day
Thank you

mario · September 5, 2006, 6:08pm

C++:

Mario! My friend! Don’t stop talking
mario:
LOL!

C++:

CASE 1:
So you suggest activating the RAM test before the IPL has a chance to load the image into RAM. That would make sense, I didn’t think of that. So I’m able to test the area in RAM that is later occupied by the image as well in one go. Great proposition, thanks.

The only question I have is … how do I do that?

When I stop right before the image gets loaded, that has to happen in the IPL. At that stage the operating system has not been loaded, has it?

It has not.

C++:

So, how am I supposed to allocate memory then?

There is no OS, so there is no need to allocate memory. You have all the memory to yourself.

C++:

I can’t use the mmap_device_memory() methods or any other system calls. So I have to go the way the Memtest86 project used, mustn’t I?

In a nutshell, yes. Although I suspect Memtest86 doesn’t stay in real mode and switches to protected mode.

C++:

Sorry, I’m a bit confused right now. I’ve gathered lots of information over the past couple of weeks, which I have to come to terms with.

It’s been a while since I looked at the IPL code, but it’s the IPL that uncompresses the image in memory, so it looks in the image header to figure out where the image wants to be loaded. The header also contains the size. That’s all the data you need!

C++:

And finally, there’s the question about the hard disk access. What exactly do you have in mind to compensate for the loss of operating system functionality? How can I write to the hard disk? You mentioned the BIOS. Are you insinuating a RAW write? Or is there a way to have file system / operating system support?
By the way, if this piece of information is of any importance, it’s a SCSI hard drive. I guess that complicates things a bit, doesn’t it?

SCSI is not an issue, everything goes through the BIOS. There are BIOS calls available to access the HD. Yes, I would use raw writes, since supporting the file system would be quite tricky (but doable). Personally I would create a very small partition, as small as possible, and write the results of the memory test to that partition in your own format. Then it would be a piece of cake to read that from QNX.
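To make "your own format" a bit more concrete, here is a minimal sketch of a one-sector result record in C. Everything in it is hypothetical: the magic value, the field names and the XOR check word are invented for illustration. The same pack/unpack pair could be used by the loader-side test and by the QNX program that reads the sector back later.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u
#define LOG_MAGIC   0x4D454D54u   /* hypothetical marker, spells "MEMT" */

/* Hypothetical test result: how many errors, and where the first was. */
struct memtest_log {
    uint32_t error_count;
    uint32_t first_bad_addr;
};

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill one 512-byte sector: magic, two fields, then a check word. */
void log_pack(const struct memtest_log *log, uint8_t *sector)
{
    memset(sector, 0, SECTOR_SIZE);
    put_le32(sector + 0, LOG_MAGIC);
    put_le32(sector + 4, log->error_count);
    put_le32(sector + 8, log->first_bad_addr);
    put_le32(sector + 12, LOG_MAGIC ^ log->error_count ^ log->first_bad_addr);
}

/* Returns 0 and fills *log if the sector holds a valid record,
 * -1 if the magic or the check word does not match. */
int log_unpack(const uint8_t *sector, struct memtest_log *log)
{
    uint32_t magic  = get_le32(sector + 0);
    uint32_t errors = get_le32(sector + 4);
    uint32_t addr   = get_le32(sector + 8);
    uint32_t check  = get_le32(sector + 12);

    if (magic != LOG_MAGIC || check != (magic ^ errors ^ addr))
        return -1;

    log->error_count = errors;
    log->first_bad_addr = addr;
    return 0;
}
```

Storing the fields byte by byte in a fixed order keeps the on-disk layout independent of compiler padding, and the magic plus check word let the reader tell a real record from an empty or stale sector.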

C++:

CASE 2:
OK, disabling the interrupts. Noted. No interrupts, no timer to kick me from the CPU, exclusive access, what do I need more?

If applicable, you may have to take care of SMP issues (disabling interrupts will not prevent code from running on the other CPU).

C++:

When clearing the interrupts, do I really have the possibility to eliminate all interrupts?

Yes all except NMI and the SMI, but that’s not important.

C++:

In the operating system lecture I once attended, and if I remember correctly, isn’t there at least one IRQ I cannot “kill”?

NMI means Non-Maskable Interrupt. It’s not used by QNX and is usually triggered by a memory parity error, if supported by the motherboard/memory type.

C++:

But I don’t think that will be a problem, since the IRQ that’s responsible for scheduling is enough to disable.

No, it’s not enough. If you trip on an interrupt vector while an interrupt from the network comes in, the PC will go bye-bye. ALL interrupts must be disabled (remember that if interrupts are disabled for too long, certain things may fail once everything is restarted).

C++:

In theory, if I’m the only one active in the system, I can also perform destructive memory writes and reads, can’t I? Something like that
FOR ALL memory cells MC[i] DO {
    tmp = MC[i]
    MC[i] = 0xAA
    if (MC[i] != 0xAA) {error}
    MC[i] = 0x55
    if (MC[i] != 0x55) {error}
    MC[i] = tmp
}

Yep!

C++:

If I’m not accessing any system calls, which due to the disabled IRQs wouldn’t initiate a context switch anyway, this should be OK.

If you do a kernel call (that includes printf(), write(), etc.), interrupts will be re-enabled. So you must avoid them. Otherwise you are OK.

C++:

Lastly, just give me a short explanation on GDT, LDT and IDT. What’s that? Following the algorithm above, it shouldn’t be a problem writing there as well. The only part I cannot write to is the 640 to 1024k area, because that’s where other devices are mapped into my memory space.

I’m not 100% familiar with these, but basically they are special registers that point to tables holding information about virtual addressing. (I’m assuming you know about virtual memory.) Hence if you modify the content of these tables you may affect your OWN process’s virtual table, which would most likely result in a crash.
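As a rough illustration of what such a register holds: as far as the Intel manuals describe it, in 32-bit protected mode the SGDT and SIDT instructions store a 6-byte value, a 16-bit limit followed by a 32-bit linear base address, and the table occupies limit + 1 bytes with 8 bytes per descriptor. The sketch below only decodes such a 6-byte image; the struct and function names are made up, and it does not execute the instructions themselves. Page tables are located separately (via CR3) and are not covered here.

```c
#include <stdint.h>

/* Decoded contents of a descriptor-table register (GDTR or IDTR)
 * as stored by SGDT/SIDT in 32-bit protected mode. */
struct dtr {
    uint16_t limit;   /* offset of the last valid byte in the table */
    uint32_t base;    /* linear address of the first table byte */
};

/* Decode the 6-byte little-endian image: limit first, then base. */
struct dtr dtr_decode(const uint8_t raw[6])
{
    struct dtr d;

    d.limit = (uint16_t)(raw[0] | ((uint16_t)raw[1] << 8));
    d.base  = (uint32_t)raw[2] | ((uint32_t)raw[3] << 8) |
              ((uint32_t)raw[4] << 16) | ((uint32_t)raw[5] << 24);
    return d;
}

/* Table size in bytes: the limit is inclusive, hence the +1. */
uint32_t dtr_size_bytes(const struct dtr *d)
{
    return (uint32_t)d->limit + 1u;
}

/* Number of 8-byte descriptors that fit in the table. */
uint32_t dtr_entry_count(const struct dtr *d)
{
    return dtr_size_bytes(d) / 8u;
}
```

With base and size known, a memory test could at least compute which address range it must leave alone.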

C++:

Mario, you really made my day

My pleasure. Given your alias is C++ I may fire a C++ question at you one day, lol!

C11 · September 6, 2006, 8:06am

Hi Mario

CASE 1:

Well, I cannot confirm that, because I haven’t read through the source code completely. But I suppose that’s what Memtest86 does. Otherwise it wouldn’t be able to address RAM beyond 1 megabyte.

Any good references? Well, I’ll try asking Google later on, but maybe you have a good internet site at hand that has an overview and a tutorial on BIOS calls.

Yes, I was thinking about such a thing as well, having 2 hard disk partitions. Can I access such an unformatted hard disk under QNX? You said it would be a piece of cake. Can I use the same BIOS calls under QNX? The open() and fopen() functions expect a pathname, which I can’t provide. So addressing the hard disk in CHS (cylinder, head, sector) using the BIOS would be handy.

CASE 2:

Right. This could be a problem, but right now I’m on a single processor machine. However, I thought I read something in the QNX manuals about the ability to disable IRQs on all processors. I’ll have to do some research on that.

Correct. Didn’t think of that.

Why is that, what kind of “damage” will occur and what do you understand under the term “too long”? I know that under normal circumstances one should disable the IRQs for as long as necessary, but as short as possible. However, testing the RAM might take a few minutes, I suppose. Would that be too long and cause damage to the system?

mario:

C++:
In theory, if I’m the only one active in the system, I can also perform destructive memory writes and reads, can’t I? Something like that
FOR ALL memory cells MC[i] DO {
    tmp = MC[i]
    MC[i] = 0xAA
    if (MC[i] != 0xAA) {error}
    MC[i] = 0x55
    if (MC[i] != 0x55) {error}
    MC[i] = tmp
}
Yep!

Quick followup that came to my mind yesterday when I was at home. The process/thread which tests the RAM is also located there upon execution. I cannot destroy that part of RAM, can I? Any chance I can find out where in RAM a certain process is located? Because what I’m trying to do is a mmap_device_memory() system call to get all the memory mapped into my process’s address space. mmap() and malloc() will only give me free memory back. But overwriting the RAM test code in RAM while executing it isn’t the real thing. That’s the only problem I see right now.

Thanks for pointing that out. So I cannot use any printfs() while testing the RAM. OK, then I have to store the results in a structure on the heap or on the stack, watching out for any destructive writes and after test completion print the results to screen and write them down to disk.

I see and I can imagine what you’re getting at. Each thread has its own virtual memory space, which maps physical memory to process local one. If I kill that mapping, I’m dead as well. What’s the register’s denomination, that is responsible for that mapping? If I can access it, I know the memory location of my tables. The only thing that’s missing is the length. Any chance I can find that out as well? Or are there other ways to achieve that goal?

No problem. You can also ask me something about C and Java if you like. As always, thanks again for your support, Mario.

mario · September 6, 2006, 12:20pm

It’s possible, since the IPL does it. I don’t remember how it’s done, though.

I don’t keep one; they’re so easy to find on the web.

fopen( “/dev/hd0t??”…);
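A minimal sketch of reading one raw sector this way, using only standard C I/O. The function name `read_sector` is made up, and the device path is left to the caller, since the partition name above is only given as a pattern.

```c
#include <stdio.h>

#define SECTOR_SIZE 512

/* Read one 512-byte sector from a raw device (or any file).
 * 'path' is whatever block device entry holds the results;
 * it is supplied by the caller and not assumed here.
 * Returns 0 on success, -1 on any failure. */
int read_sector(const char *path, long sector_index,
                unsigned char *out)
{
    FILE *f = fopen(path, "rb");
    int ok;

    if (f == NULL)
        return -1;

    /* Seek to the start of the requested sector, then read it whole. */
    ok = fseek(f, sector_index * (long)SECTOR_SIZE, SEEK_SET) == 0 &&
         fread(out, 1, SECTOR_SIZE, f) == SECTOR_SIZE;

    fclose(f);
    return ok ? 0 : -1;
}
```

No file system is involved: the partition is treated as a flat array of bytes, so the reader only needs to know which sector the loader wrote to.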

I think all interrupts are routed to processor 0. But it’s not only the interrupts you need to disable but you have to make sure the other processors aren’t executing any code that you have no control over.

Programs may time out on whatever operation they were doing (the HD driver may think a block was not written to disk, for example). You don’t have to run the memory test all at once. You can test 500k at a time, for example, let’s say every 10ms. Or you can set the memory test program to run at the lowest priority, but run the memory test for only 10ms every time and then yield the CPU.
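Testing in slices could be structured as a small resumable state machine, sketched below in C. The names `chunked_test` and `chunked_test_step` are invented, the test runs over an ordinary buffer, and the interrupt handling and yielding between slices are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Progress of a memory test that is run a slice at a time. */
struct chunked_test {
    volatile uint8_t *base;  /* start of the region under test */
    size_t len;              /* total bytes to test */
    size_t next;             /* offset of the next untested byte */
    size_t errors;           /* failed read-back comparisons so far */
};

/* Test at most 'chunk' more bytes, restoring each byte afterwards.
 * Returns 1 if work remains, 0 once the whole region is done. */
int chunked_test_step(struct chunked_test *t, size_t chunk)
{
    size_t remaining = t->len - t->next;
    size_t end;

    if (chunk > remaining)
        chunk = remaining;
    end = t->next + chunk;

    for (; t->next < end; t->next++) {
        uint8_t saved = t->base[t->next];

        t->base[t->next] = 0xAA;
        if (t->base[t->next] != 0xAA)
            t->errors++;
        t->base[t->next] = 0x55;
        if (t->base[t->next] != 0x55)
            t->errors++;

        t->base[t->next] = saved;
    }
    return t->next < t->len;
}
```

A caller would loop on `chunked_test_step()`, disabling interrupts only around each call and giving up the CPU in between (for example with `sched_yield()` or a short sleep), so that no single slice keeps interrupts off for long.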

Personally I would do it in the IPL. SO much simpler. FAR fewer unknowns.

mem_offset() will give you the physical address. You could make a copy of yourself, or just make two functions that test the RAM in your process and alternate between them when you detect you are stumbling upon yourself.
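The "don’t stumble upon yourself" part comes down to splitting the range under test around the region that must be left alone. A minimal sketch in C follows; the names `range` and `exclude_range` are made up, and the physical addresses would have to come from something like mem_offset(), which this sketch does not call.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Split 'total' around 'keep' (the region holding the running test
 * code). Writes the testable sub-ranges to out[] and returns how many
 * there are: 0, 1 or 2. */
size_t exclude_range(struct range total, struct range keep,
                     struct range out[2])
{
    size_t n = 0;

    if (total.start >= total.end)
        return 0;                       /* nothing to test */

    /* No overlap (or an empty 'keep'): the whole range is testable. */
    if (keep.start >= keep.end ||
        keep.end <= total.start || keep.start >= total.end) {
        out[n++] = total;
        return n;
    }

    if (keep.start > total.start) {     /* part below the kept region */
        out[n].start = total.start;
        out[n].end = keep.start;
        n++;
    }
    if (keep.end < total.end) {         /* part above the kept region */
        out[n].start = keep.end;
        out[n].end = total.end;
        n++;
    }
    return n;
}
```

Testing the kept region itself then needs a second pass from code that lives somewhere else, which is what the copy-of-yourself idea provides.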

Can’t call malloc() either, because it may result in a kernel call as well. See why I’m saying you should do it in the IPL? Since you don’t seem to have deep knowledge of the OS, I think it would be very dangerous for you. Imagine the nasty bugs that would have happened if you hadn’t realized that making kernel calls re-enables interrupts. You could actually have made other programs crash; they could even crash minutes after the RAM test is finished…

You could probably figure it out from reading the processor manual (that is where I got the info). Given your current level of knowledge and the type of question I strongly recommend you stick to the IPL solution.

maschoen · September 6, 2006, 7:11pm

At the risk of repeating someone else, I’ll fill in what I know.

The work I referred to was done before there were any useful web pages. I worked on an 8088 BIOS. The hardware initialization is probably not relevant to current Pentium processors. Things that needed doing included starting DMA refresh. I think this is built into modern DRAMs, but I really don’t know. There probably are useful web pages on this, but you can find them as easily as I can.

This does sound familiar, and normal. The BIOS typically reads the first sector of the boot hard disk into a relatively low memory place. Historically this was in the first 16K, or maybe 48K, of memory. I know this because QNX .43 ran just fine in 48K. For early loaders, this was a problem because low memory was where it wanted to load the OS loader. A surprisingly small number of instructions (remember, less than 512 bytes of code are loaded) were needed to move itself. Once 286s came out, and the “1 meg barrier” was broken, the OS loader usually loaded itself up above 1 meg.

I’m almost certain that the answer is no. I’ve never known any OS to attempt this. It’s almost always part of the bios/rom boot sequence.

That said, I’d like to suggest a different approach to your problem, one that has some hope of success, and probably will be a great deal easier.

Build your own custom OS loader. Have it perform the following operations:

1) Test memory.
2) Using the BIOS, write the results of this test to a known sector on your hard drive. I thought about hiding it in memory, but I suspect that QNX zeros out all memory at startup.
3) Now load QNX.

After startup, you can read the sector and find out what happened. Of course it is possible that you will never get that far if you have bad memory.

Writing such a loader will not be easy, as you will probably have to hand code most of it in assembler. You could learn a lot by disassembling an existing one.

mario · September 6, 2006, 7:54pm

See, C++, Mitchell came to the same conclusion. Go for the IPL solution.

C11 · September 7, 2006, 8:25am

Hi Mario!

Wow! Didn’t know it was that easy. Now you’ve astounded me. I always thought I needed a filesystem for those calls to work. Well, one never stops learning, I suppose.

How about a sync then? But I suppose that’s not the only pitfall one can stumble in. However, I was also thinking about doing the memory test in parts.

I get your point.

Wasn’t planning on using malloc(). I thought more of this procedure. Well, kind of. It’s RAW pseudo code and possibly incomplete. It should just outline the procedure I was planning.

addr = mmap_device_memory(0x00, MEM_MAX);
lock_IRQ();
FOR ALL addr
    tmp = *addr;
    *addr = 0xFF;
    if (*addr != 0xFF) {error;}
    *addr = 0xAA;
    if (*addr != 0xAA) {error;}
    *addr = tmp;
    addr++;
unlock_IRQ();
unmap_device

Hey! I’ve got about … 4 weeks of experience on QNX. Isn’t that something? No, but seriously, as I pointed out, I’m just a beginner. So I guess you’re right when you advise me to do it the IPL way.

Yes, as you’ve told me, there are lots of pitfalls one can stumble into. Currently I’m weighing the options though. An IRQ based approach would allow me to write the logs more comfortably, because I have operating system support for file I/O, however there are other processes competing for the CPU/RAM and that stuff, and there are pitfalls to stumble into.
An IPL-based approach, on the other hand, would allow me to test the RAM more easily, since no one has allocated RAM yet, but file I/O is harder. Well, well. I guess it would be best if I stuck to your proposition with the IPL.

Hi Maschoen,

no, no. I’m glad you joined back into the discussion. Repetition is no problem.

I see. Well, I thought by coincidence you might have some references handy.

Thanks for your insights. I just did another memory dump, the signature is found at 4233 bytes and at 5120 bytes if you’d like to know.

OK, so much about cheering things up.

Well, that’s also what Mario suggested. So that makes it 2-0.

I’ll come back to the two of you, if I run in any problems

One last question. I’m using an x86 target and in the buildings.pdf on p.54 it says: “Systems that boot from disk or over the network typically come with a BIOS or ROM monitor, which already contains a large part of the IPL within it.”

That’s what I have. Can that cause any problems? I mean obviously the IPL seems to be implemented partially on those platforms. Or did I read that part wrong?

Thanks,
C++

C11 · September 7, 2006, 8:28am

Well, it’s a 2-0 majority then. Can’t win against that, can I?

mario · September 7, 2006, 12:21pm

Another solution I see (but I still think the IPL is best) is to write a custom startup-bios. That’s part of the image, and is run after the IPL but before the kernel starts.

This is described in the Customizing Image Startup Program chapter in the doc.

C11 · September 7, 2006, 3:37pm

What’s the disadvantage of that approach, compared to the IPL one?
…
No, give me a moment to reconsider that, and please correct me if I’m wrong. If I remember correctly, the IPL loads the image and then, as its last action, jumps to startup_vaddr, which is the entry point to the image. Doing it the way you outlined in your latest proposition, I would already have the image in RAM, therefore I cannot perform destructive writes. Well, more or less.

Apart from that, are there any other pitfalls to watch out for, except the usual ones? Or why do you prefer the IPL approach over this one?

Regards,
C++