Black Art of 3D Game Programming, Chapter 8: The Art of Possession
From Dpfileswiki
Video game programming is like no other programming on Earth (or any other planet). Video games are so complex and use so much of the PC’s resources that they must control every aspect of the computer. In this chapter, we’re going to sharpen our skills as Cybersorcerers by learning about some of the low-level aspects of game programming, such as multitasking, the interrupt controller, the internal timer, and general real-time programming techniques.
Multitasking Fundamentals
A video game is a perfect example of a multitasking single-processor program. The word multitasking means to process many tasks at the same time. This definition is a bit loose, but the fact is that a single-processor computer can only process a single task at a single time. However, by virtue of the sheer speed of the PC, the PC can execute tasks sequentially so quickly that it seems as if they are all being executed at once. Figure 8-1 shows a representation of many tasks being executed in a sequential manner on the PC. A true multitasking computer would have multiple processors and hence be called a multiprocessor computer. In fact, multiprocessor computers do exist and are common in the workstation arena of computing. As an extreme example, the Connection Machine (Thinking Machines Inc.) has up to 64,000 processing elements!
A multiprocessor computer can execute more than one process at a time since a single processor can be allocated to each task, as shown in Figure 8-2. This is the ultimate video game machine. Alas, since the price tag is upwards of $100,000, we’ll have to stick with the single-processor PC. But this isn’t too bad, when we consider that a single 486 or Pentium processor is equivalent to a few dozen 8086s! Therefore, by time-multiplexing the resources of these advanced processors, we can achieve the same performance as a slow multiprocessor computer. Anyway, that’s all academic. We need to stick to reality–well, at least Virtual Reality.
The PC under DOS has no means to accomplish multitasking. There is a form of crude multitasking that takes place when an interrupt occurs, but this doesn’t really allow us to perform general multitasking. However, we don’t care that DOS doesn’t do multitasking, since we’ll write our games in such a way that they are themselves multitasking on a functional level. Take a look at Figure 8-3. Here we see a diagram of a typical game event loop calling many subfunctions. These subfunctions are the elements that make up the game. There are calls to draw the objects, move them, process the input devices, and so forth. Moreover, since each of these tasks is repeated every game cycle, the game itself is acting like a small multitasking kernel.
This is the first concept to learn about game programming: the multitasking or real-time aspect of the game is synthesized by means of an event loop that repeats a sequence of tasks every cycle. Of course, the PC’s ability to be interrupted by some event and momentarily switch to another task is very important in video game design, and we’ll discuss this in detail. But for the most part, a video game is a single program that executes a sequence of tasks in a cyclic manner at such a speed that, to the player, these tasks seem be occurring simultaneously. And this is the illusion we must endeavor to create.
Even though we’ll process most of the game logic within the main event loop via calls to subfunctions, there are nevertheless times when some aspects of a game must be triggered by events either physical or temporal. For this reason, we must understand the interrupt structure of the PC and how to interface with it. By mastering this incantation, we can wield the full power of the PC and use it as a weapon against the competition!
PC Interrupts
The PC is equipped to handle interrupts as part of its normal processing stream. Interrupts can be generated either by a hardware event, such as a serial communications signal, or a software event, such as a call to INT 21h. In either case, the sequence of events is similar, as shown in Figure 8-4. An interrupt begins with the CPU pushing the flags register along with the address of the next instruction onto the stack. Then the CPU indexes into the interrupt table and vectors to the address of the appropriate interrupt service routine (ISR).
The ISR then processes the interrupt and performs its work. On completion, an IRET instruction is used to return to the interrupted program. The IRET instruction differs from RET in only one way–IRET restores the flags register from the stack, whereas RET doesn’t. In reality, the only reason the flags register is saved by an interrupt is that this is the bare minimum that is needed to make a “context switch.” However, it is up to your ISR to save any registers on the stack that may be altered by the ISR itself. This is necessary because the interrupted program has no idea that an interrupt occurred; and if the program was performing a calculation with the AX register and the contents of the register are modified by the ISR, then the resulting calculation will be in error. Thus, ISRs must save any CPU registers that they use and restore them on exit.
Now that we have a general idea of what interrupts are and how they are generated, let’s focus on the two origins of interrupts–software and hardware.
Software Interrupts
Software interrupts are generated by program calls on the PC. The interrupts themselves are initiated by a call with the INT XXh instruction. When the CPU comes across an INT XXh instruction, the XXh is used as an index into the interrupt vector table. This table holds the address of all the interrupt service routines within the PC. Even though software interrupts aren’t very useful for timing purposes or physical event tracking, they’re a good way of creating a layer between the application programs and the operating system. This extra layer allows future versions of the operating system to add functions onto these interrupts without changing previous software, since the interface can stay the same while the underlying code within each interrupt routine can change radically.
For example, INT 10h is the video BIOS software interrupt, and regardless of whether you wrote a program five years ago, yesterday, or today, the interface to the interrupt is the same and the general functionality is the same. However, options and features can be added to the ISR by the BIOS engineers without our being too concerned, since in most cases the routines are kept downward compatible. Basically, software interrupts are a good way to add functionality to the operating system at a system level rather than an application level such as the run-time library for C or the DLLs for Visual Basic. All programs have the ability to use the standard software interrupts.
Hardware Interrupts
Hardware interrupts are much more interesting than software interrupts, but amazingly enough, they work in the same way (from a programmer’s point of view). The major difference is that hardware interrupts are generated by some physical event, such as another character in the communications buffer, a keyboard press, a DMA transfer request, and so forth. Hardware interrupts allow the CPU and system software to respond to real-world events. The implementation of hardware interrupts is rather interesting and worth a bit of explanation, so let’s take a look.
The Programmable Interrupt Controller (PIC)
The PC is equipped with a chip called a programmable interrupt controller or PIC. The numerical designation of this chip is 8259A, which probably is completely arbitrary. Anyway, the PIC has a series of external connections. These connections are physically connected to devices on the PC, as shown in Figure 8-5. When one of the devices “pulls” on one of the PIC interrupt lines, the PIC sends a hardware interrupt request, along with the index number of the interrupt, to the CPU. The CPU uses this information to index into the interrupt table, and the appropriate ISR is called to service the hardware processed.
If you recall, we have already used a hardware interrupt to control the keyboard. The keyboard has a processor in it named the 8048, and when the keyboard senses a keypress (or release), this causes a scan code to be sent to the PC. When the PC receives the scan code, an interrupt is generated by the PIC, which causes the CPU to vector to the proper ISR (the address of which is located at location 09h of the interrupt table). The PIC is rather complex, but there are a couple aspects of it we need to understand. First, whenever an interrupt occurs, the PIC is inhibited so that it won’t generate another interrupt. This state can be cleared by writing to one of the registers in the PIC called the interrupt control register or ICR. This register can be found at I/O port 20h. To reset the PIC and enable interrupts again, the ISR should write a 20h to the ICR. The value 20h is the end of interrupt (EOI) command that instructs the PIC to reenable interrupt processing.
The second register of interest on the PIC is the interrupt mask register or IMR. This register is located at I/O port 21h and has the bit assignments shown in Table 8-1.
Granted, the PC hasn’t nearly the number of hardware interrupts that it has software interrupts, but these have sufficed for years and probably will suffice in the future. Although there are other hardware interrupts that are not listed in Table 8-1, called non-maskable interrupts or NMIs, we aren’t going to cover them since they are for more advanced operating system-related functions.
When the PC starts, the BIOS programs the IMR with the proper pattern depending on the PC’s configuration. To enable an interrupt, the desired bit is cleared, and to disable an interrupt, the desired bit is set. This is a bit backwards, but that’s life. For example, if we wanted to turn the keyboard off, we could write this:
unsigned char data; // used to read and write the data // read the IMR register data = _inp(0x21); // clear the keyboard bit d1 data = data & 0xFD; // write the new mask data back to IMR _outp(0x21,data);
If you were to run this program, the keyboard wouldn’t function anymore and you would have to reboot!
In general, the only thing we need to worry about with the PIC is enabling interrupts and sending the EOI command to the PIC when an ISR is complete. We have been referring to this phantom interrupt vector table for a while, so let’s see where it is and how it works.
The Interrupt Vector Table
The interrupt vector table is located at 0000:0000 to 0000:0400 of the PC’s memory map. The table consists of 256 entries (each entry is 4 bytes long) that are addresses to each ISR, 0 to 255. Each entry is in the form offset:segment; thus, each 4-byte entry is simply a pointer to an ISR. Figure 8-6 shows a graphic of the interrupt vector table with a few of the more important ISRs labeled. This table is the key to everything, and by manipulating it, we can totally control the PC. We can literally throw the operating system out and take complete control of every aspect of the PC. Although we don’t need to do this, it’s nice to know we can! Each of the 256 interrupt vectors has a function, and they are listed in Table 8-2.
We only need to take control of a few of the PC’s interrupts, such as the keyboard, internal timer, and the serial ports. However, I have highlighted a few others that might come in handy. So how does the table work? Whenever a software or hardware interrupt occurs, the CPU looks up the appropriate ISR vector in the table and vectors to it. For example, if a divide by zero occurs, the CPU saves the flags register on the stack along with the next instruction to be executed (the return address), and a jump is made to the address located in ISR vector 0000:0000- 0000:0003. Hopefully, the address in these bytes points to a valid ISR routine that processes the divide by zero interrupt. And the ISR had better end with an IRET or a call to another ISR that does.
What we want to do as Cybersorcerers is change the entries in the table to point to our own ISRs rather than those that belong to DOS. We do this by accessing the interrupt vector table with a pointer and changing the ISR vectors. However, performing this seemingly simple operation can be dangerous since an interrupt could occur as we are making the change. The result would be a half-updated interrupt vector that would then be used, causing a jump to a random location in memory, resulting in an error or crash. Hence, a cleaner method must be used to perform these updates using operating system calls, but before we study them, let’s talk about how an ISR is written in C.
Writing an Interrupt Service Routine in C
Luckily for us, we won’t need to resort to pure assembly language to write interrupt service routines. Although in many cases we will write portions of a particular ISR using the inline assembler, there is support for ISRs within C itself. An ISR is no more than a function that performs a particular task and then returns. However, ISRs have a few special properties that must be observed:
- The ISR cocde must execute quickly and release control back to the foreground process as soon as possible.
- The ISR code must be re-entrant, meaning that the interrupt itself must be able to be interrupted by itself (which may occur in special cases) without crashing.
- The ISR must restore any data structures that it modifies.
- The ISR must issue an end of interrupt command 20h to the PIC at the exit of the ISR (in most cases--but, better safe than sorry).
In general, ISRs should be short and to the point. The whole idea of an ISR is to perform its task without the main foreground process being affected by its execution. The C language has a new keyword (_interrupt) which, when placed in a function declaration, takes care of the prolog and epilog code that an ISR must have (such as saving the CPU registers and using an IRET instead of a standard RET to terminate the ISR function). The shell for an ISR in the C language is shown in Listing 8-1.
Listing 8-1 Minimum C implementation of an ISR
void far _interrupt ISR_Function(void)
{
_asm sti ; re-enable interrupts
// ISR code goes here
// end of ISR code
// re-enable hardware interrupts by sending an EOI command to the PIC
_outp(PIC_ICR,PIC_EOI);
} // end ISR_Function
You will notice the function begins with a STI instruction. This tells the CPU to reenable interrupts. However, if we wish, we can inhibit interrupts while processing the ISR. This can be dangerous if an important interrupt occurs and isn’t serviced, but is sometimes necessary in time-critical code that can’t be interrupted. The disabling of interrupts can be accomplished through the use of the assembly language instruction CLI. Interrupts can then be reenabled with the STI instruction. Adding this to our ISR shell, the results are shown in Listing 8-2.
Listing 8-2 The C ISR with interrupt disabling
void far _interrupt ISR_Function(void)
{
_asm cli ; clear interrupt flag
// ISR code goes here
// end of ISR code
_asm sti ; set interrupt flag
// re-enable hardware interrupts by sending an EOI command to the PIC
_outp(PIC_ICR,PIC_EOI);
} // end ISR_Function
Using one of the two shells shown in Listings 8-1 and 8-2, most interrupts can be written as if they were normal C functions. For example, the keyboard handler we created in Chapter 5 uses a shell similar to the first. By the way, the defines used in the shells are declared in BLACK8.H as:
#define PIC_ICR 0x20 // the interrupt control register #define PIC_IMR 0x21 // the interrupt mask register #define PIC_EOI 0x20 // end of interrupt command
All right, now we know how to write an ISR function. How can we install it into the interrupt vector table?
Installing an Interrupt Service Routine
Let’s refresh our memories a little about the problems with manually changing interrupt vectors ourselves. The interrupt vector table is nothing more than a collection of 4-byte entries that spans 0000:0000 to 0000:0400 in the PC’s memory. Thus, we can alias a pointer to this region and manipulate any of the interrupt vectors that we wish. However, there is a problem with this. If the PC is interrupted while the update is taking place and the interrupt happens to be the interrupt that we are updating, a system crash can occur. Hence, the update to the table must occur in a single instruction. There is no single instruction to do this, but there is a C function (based on a DOS call) that can install an interrupt service routine without danger of being interrupted by the PC.
The DOS-based C function accomplishes this by inhibiting all interrupts and using a clever method to update the interrupt vector. In any case, we will use the C function to install our ISRs. The name of the function is _dos_setvect( ) and has the following syntax:
_dos_setvect(unsigned interrupt_number, void (_interrupt _far
*ISR_Function)());
If the syntax is a bit confusing, it basically reads, “send the interrupt number along with a pointer to the ISR function.” For example, to install one of our ISR shells at interrupt 1Ch (the timekeeper), we can write:
_dos_setvect(0x1C, ISR_Function);
Once the ISR is installed, the old ISR vector will be lost, as shown in Figure 8-7. This isn’t acceptable; we must find a way to save the old ISR. This can be done with the C function _dos_getvect( ). It works in the opposite way that _dos_setvect( ) does. That is, instead of setting an interrupt vector, _dos_getvect( ) retrieves one. The syntax is,
void (_interrupt _far *_dos_getvect(unsigned interrupt_number))();
which simply returns a FAR pointer to the current ISR installed at location interrupt_number. For example, continuing with our previous example of redirecting the timekeeper interrupt, we can first save the old timekeeper ISR with the following code fragment:
void (_interrupt _far *Old_Time_ISR)(); // we'll place the old isr in here // save old ISR Old_Time_ISR = _dos_getvect(0x1C); // install new ISR _dos_setvect(0x1C, ISR_Function);
Finally, when a game or application exits back to DOS, we’ll need to replace all of the old interrupt vectors that we changed. Let’s see how.
Restoring an Interrupt Service Routine
As long as we save any ISRs that we update with new ones, we can restore the old ISR by using _dos_setvect() once again. However, this time instead of installing our own ISR, we reinstall the old DOS ISR that was originally there. Figure 8-8 shows this process graphically. First, the DOS ISR is saved in a variable. Then our new ISR is installed. Then when the game terminates, the old DOS ISR is reinstalled and the system is back to normal.
Again continuing with the ongoing example, we can reinstall the saved DOS timekeeper ISR with the following code:
_dos_setvect(0x1C,Old_Time_ISR);
That’s it! As a solid example of creating an ISR and installing it into the interrupt table, the following program, JELLY.C (Listing 8-3), installs a new timekeeper ISR at interrupt vector 1Ch. This interrupt will be called every 1/18 second. The ISR itself simply updates a pair of global variables that are used in the main( ) function of the program to move a creature on the screen. The executable of the code is called JELLY.EXE.
Listing 8-3 Program that installs a new timekeeper ISR
// JELLY.C - A demo of latching onto the timer interrupt 0x1C
// I N C L U D E S ///////////////////////////////////////////////////////////
#include <io.h>
#include <conio.h>
#include <stdio.h>
#include <stdlib.h>
#include <dos.h>
#include <bios.h>
#include <fcntl.h>
#include <memory.h>
#include <malloc.h>
#include <math.h>
#include <string.h>
#include "black3.h"
#include "black4.h"
#include "black8.h"
// G L O B A L S ////////////////////////////////////////////////////////////
pcx_picture image_pcx; // general PCX image used to load background and imagery
sprite jelly; // the jelly fish
int ISR_jelly_x = -20, // global variables that track the position of the
ISR_jelly_y = 100; // jelly fish and are updated by the ISR
// this holds the old timer keeper isr
void (_interrupt _far *Old_Timer_ISR)();
////////////////////////////////////////////////////////////////////////////
void _interrupt _far Jelly_ISR(void)
{
// this function will update the global jelly fish position variables and
// test if the jelly fish has moved off the screen, this function will be
// called once every timer tick
_asm sti ; re-enable interrupts
//////////////////////////////////////////////////
// move the jelly fish
if (++ISR_jelly_x > 320)
ISR_jelly_x = -20;
ISR_jelly_y = ISR_jelly_y - 1 + rand()%3;
// test y bounds
if (ISR_jelly_y > 200)
ISR_jelly_y=-20;
else
if (ISR_jelly_y < -20)
ISR_jelly_y=200;
/////////////////////////////////////////////////
// re-enable pic
_outp(PIC_ICR,PIC_EOI);
} // end Jelly_ISR
// M A I N //////////////////////////////////////////////////////////////////
void main(int argc, char **argv)
{
int index; // loop variable
// set the graphics mode to mode 13h
Set_Graphics_Mode(GRAPHICS_MODE13);
// create the double buffer
Create_Double_Buffer(200);
// load the imagery for the jelly fish
PCX_Init((pcx_picture_ptr)&image_pcx);
PCX_Load("jelly.pcx", (pcx_picture_ptr)&image_pcx,1);
// intialize the jelly fish
Sprite_Init((sprite_ptr)&jelly,-20,100,16,16,0,0,0,0,0,0);
// extract the bitmaps for the speeder, there are 4 animation cells
for (index=0; index<4; index++)
PCX_Get_Sprite((pcx_picture_ptr)&image_pcx,(sprite_ptr)&jelly,index,index,0);
// done with this PCX file so delete memory associated with it
PCX_Delete((pcx_picture_ptr)&image_pcx);
// install the jelly fish motion interrupt
Old_Timer_ISR = _dos_getvect(TIME_KEEPER_INT);
_dos_setvect(TIME_KEEPER_INT,Jelly_ISR);
// scan background before entering event loop
Sprite_Under_Clip((sprite_ptr)&jelly,double_buffer);
// put up exit instructions
Print_String_DB(80,2,9,"Hit any key to exit",1);
// main event loop, process until keyboard hit
while(!kbhit())
{
// do animation cycle
// erase the jelly fish
Sprite_Erase_Clip((sprite_ptr)&jelly,double_buffer);
// move the jelly fish, note that we only copy global variables here
// the variables themselves are modified by the ISR
jelly.x = ISR_jelly_x;
jelly.y = ISR_jelly_y;
// animate the jelly fish
if (++jelly.curr_frame == 4)
jelly.curr_frame = 0;
// ready to draw speeder, but first scan background under it
Sprite_Under_Clip((sprite_ptr)&jelly,double_buffer);
Sprite_Draw_Clip((sprite_ptr)&jelly,double_buffer,1);
// display double buffer
Display_Double_Buffer(double_buffer,0);
// lock onto 18 frames per second max
Time_Delay(1);
} // end while
// exit in a very cool way
Screen_Transition(SCREEN_DARKNESS);
// free up all resources
Sprite_Delete((sprite_ptr)&jelly);
Delete_Double_Buffer();
// restore the old isr
_dos_setvect(TIME_KEEPER_INT,Old_Timer_ISR);
Set_Graphics_Mode(TEXT_MODE);
} // end main
The program is rather simple, but we can do a lot with it. The beginning sets up the graphics and loads the imagery. Then the old timekeeper interrupt is saved, and the new ISR named Jelly_ISR( ) is installed in its place. Then the program falls into the main event loop. The event loop does nothing except draw the creature on the screen and animate it. However, the creature moves even though the position variables (jelly.x,jelly.y) are not explicitly updated by main( ). This is because Jelly_ISR( ) is being called by the timekeeper interrupt, which itself is updating the global variables ISR_jelly_x and ISR_jelly_y. Then these variables are copied into the sprite’s position variables at the correct time. Hence, we have created a crude form of multitasking: the main event loop is responsible for rendering the creature, while the ISR is responsible for moving it. An important detail to be learned from this demo that may not be apparent is that the precise moment of the interrupt isn’t known relative to the program execution; thus, we can’t guarantee that the interrupt will occur when the sprite is being erased, drawn, or animated. Consequently, we must buffer the position in a pair of global variables and, when the time is right, move them into the position variables of the creature’s sprite structure. Figure 8-9 shows this process graphically. Notice that no matter when the global variables are updated, the “effect” of the update isn’t propagated to the logic until the logic makes use of them.
The timer interrupt is one of the most important and useful interrupts in video games. In many cases, it’s the only way to perform some task at a specific time with any precision. The timekeeper interrupt is used for such things as music systems, color palette updates, I/O device polling, and other similar operations that must be done on a regular basis. The only drawback to the timer interrupt is that it occurs once every 1/18.2 seconds or 55ms, which may not be fast enough. However, since the timer interrupt is generated by a timer chip (the 8255), this rate can be changed by software. But we’ll cover that a bit later.
The only problem with installing a new ISR is that many times the new ISR doesn’t perform the tasks that the original ISR was meant to do. There are two solutions to this. The first solution is to make sure that any new ISR that is installed has the functionality of the old ISR in addition to performing any special tasks. The second and more common solution is to install the new ISR and chain it to the old one, so the menial tasks are performed by the old ISR. This is a common practice for the keyboard and timekeeper ISRs.
Chaining Interrupt Service Routines
The timer and keyboard are two of the most common interrupts that are latched onto when the game needs tight control of these resources. However, the way we’ve learned to latch onto these resources is quite brutal. In fact, we completely took them over without considering other programs or the responsibilities that the system interrupt handlers may have had. In the case of the keyboard ISR that we installed to create the keyboard driver in Chapter 5, completely taking over the keyboard didn’t have too much effect since that was the main goal. However, if the user had installed a TSR that installed its own ISR into the keyboard interrupt vector, then it wouldn’t work anymore. Granted, we probably don’t want or need TSRs we don’t know about lurking around, but on the other hand, we should learn how to play nice with the other children.
The solution to the problem is to chain ISRs when applicable. Take a look at Figure 8-10, where we see two possible scenarios of installing an ISR. In one case, the old ISR is suspended in hyperspace while the newly installed one takes over. However, the second case depicted installs the new ISR on top of the old ISR. This allows the old ISR to be called after the newly installed ISR completes its tasks, performing any menial tasks that the new ISR doesn’t. As Cybersorcerers, we aren’t always that nice; we’ll usually take over all the interrupts of interest and let the old system ISRs hang out to dry. But if we need to chain ISRs, it’s nice to know how.
Chaining ISRs is very simple. Nothing more than a call to the previously saved ISR’s address is needed at the end of the newly installed ISR. For example, let’s see how we can create a chained timer ISR. First, we need to save the old timer ISR address, like this:
void (_interrupt _far *Old_ISR)(); // pointer to old ISR // extract vector of timer interrupt 0x1C Old_ISR = _dos_getvect(TIME_KEEPER_INT);
Also, assume that the new timer interrupt is called Timer_ISR( ). To install it we would do the following:
_dos_setvect(TIME_KEEPER_INT, Timer_ISR());
That’s it! You’re probably saying, “What’s the difference between the chained ISR and the standard one?” The difference is in the ISR itself. The definition of Timer_ISR( ) would look something like this:
void _interrupt _far Timer_ISR(void)
{
_asm sti ; re-enable interrupts
// do work
// end of work
// re-enable PIC
_outp(PIC_ICR,PIC_EOI);
// now call old interrupt and chain!
Old_ISR();
} // end Timer_ISR
In essence, all we need to do is call the old ISR by creating a function invocation with the saved function pointer. The first ISR is called by the interrupt itself, and then at the tail end of the newly installed ISR, the old ISR is called. There is one interesting detail to note though. After the old ISR executes, it returns to the newly installed ISR, and the new ISR performs the final return to the operating system. If you do not wish this to occur, you can use the C function _chain_intr( ), which will modify the stack such that the old ISR will directly return to the operating system when complete. Hence, if the ISR shell above were to take this into consideration, it would look like this:
void _interrupt _far Timer_ISR(void)
{
_asm sti ; re-enable interrupts
// do work
// end of work
// re-enable PIC
_outp(PIC_ICR,PIC_EOI);
// now call old interrupt and chain!
_chain_intr(Old_ISR);
} // end Timer_ISR
Basically, _chain_intr( ) changes the stack so that the second ISR won’t return to its calling point in the new ISR, but to the place where the original interrupt was issued in the game application. The actual stack manipulation is too low level, so we won’t concern ourselves with it. Pretend it’s magick! Reflecting upon this new information about interrupt chaining, we’ll only need it in special cases. Most of the time, we’ll want total control of the PC.
The Issues of Re-Entrancy and DOS Calls
We’ve mentioned that interrupt service routines must be re-entrant. This isn’t totally true, especially if the newly installed ISR inhibits all interrupts while performing its task. However, for an ISR to be nonabusive to the system, it should be re-entrant or, in other words, be able to be interrupted by itself without a problem. Writing reentrant functions can be easy and it can be hard. Fortunately for us, the problem of an interrupt interrupting itself seldom occurs (if at all). So we don’t need to concern ourselves too much with re-entrant programming techniques, but as a caution, DOS calls should be avoided if at all possible within ISRs.
DOS calls are very slow and in general are not re-entrant. The combination of these two factors increases the probability of the same interrupt occurring while it’s being processed. For example, say we wrote an ISR that took over the timekeeper interrupt. We basically have a maximum of 1/18 second to do something and exit. But imagine that we make a few DOS calls that are slow. They may take more than 1/18 second, and it’s possible that the timekeeper interrupt itself could be reissued (see Figure 8-11).
Referring to the figure, we see that during the processing of the timekeeper interrupt on the first generated interrupt, it was again interrupted in the middle of a nonre-entrant DOS call. Therefore, not only did the interrupt routine take much too long, a DOS call has been reentered and chaos is bound to occur. The moral of the story is: stay away from DOS calls in ISRs, and write ISR code that is as fast and efficient as possible.
The Vertical Blank Interrupt
Even though the keyboard and timer interrupts are the most popular interrupts to latch onto, there is another that is even better (when it works). This interrupt is called the vertical blank interrupt or Vblank. The problem is that it isn’t supported on all VGA cards and assuming that it exists is a gamble. However, the concept of it is so important that we are going to discuss it and see an implementation of it.
The vertical blanking period or interval occurs when the raster is retracing after drawing the entire screen (see Figure 8-12). The time it takes the raster to perform the retrace is about 5 to 10 percent of the time it takes to draw the screen. Thus, this is an opportune time to update the video buffer, since the VGA hardware isn’t accessing the video RAM. Moreover, an interrupt that is connected to the vertical blank will occur precisely every 1/70 second (if the VGA is in mode 13h), which is a great time base for games.
So which interrupt in the interrupt vector table is the vertical blank interrupt? It’s number 0x0A, which is reserved for IRQ 2. This means that any device attached to IRQ 2 will generate an interrupt, and the ISR installed at 0x0A will be called. Of course, the PIC’s interrupt mask register would have to be enabled, but that’s as easy as clearing a bit. The problem is forcing the VGA to generate the interrupt in the first place! To instruct the VGA to generate a vertical blank interrupt takes a bit of trickery but can be done in a few lines of code. Unfortunately, some VGAs work one way and others work another way. The ramifications of this fact are that some bits must be on or off depending on the manufacturer. Alas, there is a bit of hardware dependence. But don’t be too intimidated; most of the time the Vblank will work the way we’re going to learn it does. Only in a few cases do the code and bit settings need tweaking. All right, here we go. To enable a vertical blank interrupt, we must do three things:
- 1. Install a vertical blank interrupt handler at location 0x0A in the interrupt
vector table.
- 2. Enable the interrupt on the PIC by clearing bit d2 (the third bit from the
right) of the PIC interrupt mask register (IMR).
- 3. Clear the d4 (vertical blank interrupt latch) and d5 (vertical blank interrupt
enable) bits of the VGAs Vertical Retrace End Register (in the CRT controller). The bit designations are shown in Table 8-3.
When the vertical blank interrupt occurs, the system calls the ISR pointed to in the interrupt vector table. The ISR has to perform two tasks. First, it should make sure that the interrupt that caused it to be called was in fact the vertical blank interrupt. This is a safety precaution, since other devices may also be connected to IRQ 2, causing phantom interrupts. This test can be performed by looking at bit d7 of the VGA’s input status register 0 at I/O port 0x3C2. If you remember, the Wait_For_Vertical_Retrace( ) function uses this very register. The second part of processing a vertical blank interrupt is resetting the vertical interrupt latch at bit d4 of the Vertical Retrace End Register in the CRT controller. This is done by writing a 0 to the bit.
Before we delve into writing a vertical blank interrupt program, there is one detail that we must adhere to. This detail is that during the manipulations of the bits in the various registers, we only modify the bits that we are interested in while leaving the others alone. This is of great importance, especially when manipulating the PIC’s interrupt mask register. One false move with those bits and the whole computer can be locked up! For that reason, we will be using the SET_BITS( ) and RESET_BITS( ) macros in the program so that only the bits we wish altered are in fact changed. In general, this is a good programming practice. If you aren’t sure what will happen if you inadvertently change bits in a register, then don’t do it!
The program in Listing 8-4 demonstrates the vertical blank interrupt. The name of the program is VBLANK.EXE and the source code is called VBLANK.C. The program does nothing more than install a vertical blank interrupt handler that increments a counter and prints the number of times the vertical blank has been called. The program only works on one of my two VGA cards, so it might not work on yours at all. If it doesn’t, make sure to at least understand the process of installing the interrupt. To create your own executable, include the library modules BLACK3.C and BLACK8.C. The source code is shown here for review.
Listing 8-4 Demo of the vertical blank interrupt
// VBLANK.C - demo of the vertical blank interrupt supported by a few VGA cards
// I N C L U D E S ///////////////////////////////////////////////////////////
#include <io.h>
#include <conio.h>
#include <stdio.h>
#include <stdlib.h>
#include <dos.h>
#include <bios.h>
#include <fcntl.h>
#include <memory.h>
#include <malloc.h>
#include <math.h>
#include <string.h>
#include "black3.h"
#include "black8.h"
// D E F I N E S /////////////////////////////////////////////////////////////
#define VERTICAL_BLANK_INTERRUPT 0x0A // index of vblank in vector table
#define CRT_VERTICAL_END 0x11 // register with vblank enable bits
// G L O B A L S ////////////////////////////////////////////////////////////
void (_interrupt far *Old_Vertical)();
// F U N C T I O N S /////////////////////////////////////////////////////////
void _interrupt _far Vertical_Blank(void)
{
// this function is the vertical blank handler
static int count = 0; // used to count the vertical blanks
char buffer[64]; // used to print a string
unsigned char data; // used to read data
_asm sti ; // re-enable interrupts
// reset latch on VGA
_outp(CRT_CONTROLLER,CRT_VERTICAL_END);
data = _inp(CRT_CONTROLLER+1);
data = RESET_BITS(data,0x10);
_outp(CRT_CONTROLLER+1,data);
// do whatever you want here, but make it quick!
sprintf(buffer,"Number of vertical blanks is %d",count++);
Print_String(0,0,10,buffer,1);
// end process vertical blank
// re-enable interrupts on PIC, send end of interrupt command EOI, 20h
_outp(PIC_ICR, PIC_EOI);
} // end Vertical_Blank
// M A I N ///////////////////////////////////////////////////////////////////
void main(void)
{
unsigned char data;
// set the graphics mode to mode 13h
Set_Graphics_Mode(GRAPHICS_MODE13);
// save old vertical blank interrupt
Old_Vertical = _dos_getvect(VERTICAL_BLANK_INTERRUPT);
// set new interrupt
_dos_setvect(VERTICAL_BLANK_INTERRUPT,Vertical_Blank);
// enable interrupt on VGA
_outp(CRT_CONTROLLER,CRT_VERTICAL_END);
data = (unsigned char)_inp(CRT_CONTROLLER+1);
data = RESET_BITS(data,0x20); // for your VGA you may need to SET this bit
_outp(CRT_CONTROLLER+1,data);
data = RESET_BITS(data,0x10);
_outp(CRT_CONTROLLER+1,data);
// enable interrupt on PIC, i.e. enable IRQ 2
data = (unsigned char)_inp(PIC_IMR);
data = RESET_BITS(data,0x40);
_outp(PIC_IMR,data);
while(!kbhit());
// disable interrupts from VGA
_outp(CRT_CONTROLLER,CRT_VERTICAL_END);
data =(unsigned char)_inp(CRT_CONTROLLER+1);
data = SET_BITS(data,0x20); // for your VGA you may need to reset this bit
_outp(CRT_CONTROLLER+1,data);
// disable vertical blank interrupt on PIC
data =(unsigned char)_inp(PIC_IMR);
data = SET_BITS(data,0x04);
_outp(PIC_IMR,data);
// restore old vector
_dos_setvect(VERTICAL_BLANK_INTERRUPT,Old_Vertical);
Set_Graphics_Mode(TEXT_MODE);
} // end main
The program begins by setting the graphics mode to 13h and then saving the old VBI vector at 0x0A (which is probably NULL). At this point the vertical blank interrupt is enabled on the VGA, and the proper bit is cleared in the PIC’s interrupt mask register to allow the hardware interrupt IRQ 2 to occur. Then the program sits in the tight loop,
while(!kbhit());
until a key is pressed. However, if your VGA supports vertical blanks, you will see a green line of text displaying a real-time count of the number of vertical blanks that have occurred. Thus, the PC is again performing a crude form of multitasking. Finally, when a key is hit, the program resets both the VGA and the PIC and reinstalls the old ISR that was originally in the interrupt vector table.
It is truly a shame that the vertical blank interrupt doesn’t work on all VGA cards. I admit I haven’t tried it on new cards, but I hope it works, since this interrupt can be a vital help in the video updating and synchronization aspects of a game.
The Internal Timer
The PC has come a long way, but much of its design still looks like something you would find in an old issue of BYTE magazine! One of its flaws is the lack of onboard timers. Video games are extremely time-critical applications. We must time a great many events in a game–something the PC doesn’t lend itself to easily. For example, we have been placing the function call,
Time_Delay(1);
at the end of most of the demos so far. This has the effect of locking the demo at a frame rate of no higher than 18 FPS. We do this because the timer function is based on the PC’s internal timer, which is itself running at 18.2 clicks per second or 18.2 Hz. The only problem with this is that 18.2 clicks per second is an eternity in computer time. We need something with a higher resolution, at least 30 or 60 Hz. The question is, how do we manipulate the PC’s internal timer? And are there extra timers so we can have multiple time bases? The answer to the second question is yes and no. Yes, there are other timers, but we can’t modify them because they are being used for system control. The answer to the first question takes a bit of explaining, so let’s start from the beginning.
The PC has a chip in it called the 8253 (or at least there is a chunk of silicon that emulates this chip in new board designs). The 8253 has, among other things, three counters numbered 0, 1, and 2 and a control register. The designations and I/O ports of these counters are shown in Table 8-4.
With a paltry three counters at our disposal, there’s not a lot we can do. First, counters 1 and 2 are already in use (especially counter 1). The only counter we can really use is counter 0, which is already being used to generate the system timer interrupt. Currently, the counter is counting at a rate of 18.2 Hz. Changing this rate is a matter of understanding how the 8253 is programmed to generate this rate. It turns out that with a bit of work and an Intel chip guide, we can figure out how to program the chip’s counters for any rate. Moreover, the counters not only count but can be used to generate interrupts and other interesting timing structures.
We won’t be interested in all the neat things that can be done with the counters. We only want to know how to reprogram the rate of counter 0. To accomplish this, take a look at Table 8-5, which describes the bit designations of the control register.
Most of the information in Table 8-5 will only make sense to an electrical engineer, so don’t worry about things like strobes and one-shots. We only need to understand how to program counter 0 for a different rate. This is done in the following sequence of steps. First the new rate is computed as a 16-bit value (a pair of bytes, high and low). These bytes are derived by taking the desired frequency and dividing into 1.19318 MHz. This is the frequency of the oscillator input into the 8253. Figure 8-13 shows the frequency division graphically. The value in the counter register is used to divide down the input frequency of 1.19318 MHz. The result of this division is used as the output of the 8253. Hence, if we wanted to find the proper value to program the 8253 so it would generate an output of 100 Hz, we would compute the following quotient (notice that we truncate the decimal portion):
1.19318x106
16 bit counter value = —————— = 11931 = 2E9Bh
100 Hz
Using the above technique, we can come up with a set of 16-bit values that can be used to program the counters at different rates. If you look in BLACK8.H, you will find the following values already defined:
#define TIMER_120HZ 0x26D7 // 120 hz #define TIMER_100HZ 0x2E9B // 100 hz #define TIMER_60HZ 0x4DAE // 60 hz #define TIMER_50HZ 0x5D37 // 50 hz #define TIMER_40HZ 0x7486 // 40 hz #define TIMER_30HZ 0x965C // 30 hz #define TIMER_20HZ 0xE90B // 20 hz #define TIMER_18HZ 0xFFFF // 18.2 hz (slowest possible)
So how do we program these values into the 8253? It is a two-step process. The 8253 can only accept one byte at a time; therefore, the 16-bit values are sent as a pair of bytes: first the low byte then the high byte. If you take a look at bits d5 and d4 of the 8253’s control register, they are used to select the exact way that bytes will be written to the counters. We are going to use the method of “least then most significant byte” sequence or mode 11. The next aspect of programming a timer is setting the rest of the bits, and to make a long story short, the bits should be set for binary counting/rate generator. Before we move on to writing the function that reprograms the counters, let’s see the remaining defines that are declared in BLACK8.H, so we are on the same page!
#define TIMER_CONTROL 0x43 // the 8253's control register
#define TIMER_SET_BITS 0x3C // the bit pattern that will place the timer into
// binary counting with counter load sequence of
// low byte then high byte
#define TIMER_COUNTER_0 0x40 // counter 0
#define TIMER_COUNTER_1 0x41 // counter 1
#define TIMER_COUNTER_2 0x42 // counter 2
As you see, the defines are simply the I/O ports and some constants to make the cryptic work of bit manipulating easier to see. We’re now ready to write a function to reprogram the timer chip. To be general, we’ll write it so that any counter can be programmed, even though we won’t program any counter but 0 (right?). Listing 8-5 contains the function.
Listing 8-5 Function to reprogram the 8253’s counters
void Timer_Program(int timer,unsigned int rate)
{
// this function re-programs the internal timer
// first program the timer to mode 2 - binary and data loading sequence of
// low byte then high byte
_outp(TIMER_CONTROL, TIMER_SET_BITS);
// write least significant byte of the new rate to the counter register
_outp(timer,LOW_BYTE(rate));
// and now the most significant byte
_outp(timer,HI_BYTE(rate));
} // end Timer_Program
The function takes two parameters. The first is the counter to be programmed and the second is the new rate. For example, to reprogram counter 0 to 100 Hz, we can write:
Timer_Program(TIMER_COUNTER_0,TIMER_100HZ);
And instantly time will start moving very quickly. In fact, if you use a utility to watch the time on the PC, you will notice that it has been sped up about five times or, to be exact:
- 100/18.2 = 5.49 times
This is because the newly programmed rate of 100 Hz is causing the timer interrupt to occur more frequently, and thus the PC’s time is moving faster than normal. This is the side effect of reprogramming the internal timer at counter 0. The system time will get out of whack. The solution to this is to keep track of time ourselves when we reprogram the timer and restore the proper time when our application exits.
Programming the Timer
Most of the time (pun intended) we’ll reprogram the internal timer to run at a rate that is more conducive to some time-critical element of the video game, such as the frame rate or some event that occurs periodically. For example, MIDPAK reprograms the timer to a rate of 120h or 288 Hz (which in my opinion is a bit much). The reason 288 Hz is necessary for music is to keep up with the subtle changes in the MIDI music that is generating the sound. The only problem with such a high rate is that the PC is being interrupted 288 times a second, and this can be quite a strain on the application. The Creative Labs sound driver SBFMDRV.COM only runs at 60h or 96 Hz, which is much easier on the system.
As an example of programming the internal timer, the following program, TIMER.EXE, allows you to select one of the defined counter rates and reprogram the internal timer. The source code is named TIMER.C and is shown in Listing 8-6.
Lisiting 8-6 Program that allows you to change the timer rate
// TIMER.C - A demo of reprogramming the PC's internal timer
// I N C L U D E S ///////////////////////////////////////////////////////////
#include <io.h>
#include <conio.h>
#include <stdio.h>
#include <stdlib.h>
#include <dos.h>
#include <bios.h>
#include <fcntl.h>
#include <memory.h>
#include <malloc.h>
#include <math.h>
#include <string.h>
#include "black8.h"
// M A I N ////////////////////////////////////////////////////////////////////
void main(void)
{
int done=0, // exit flag
selection; // user input variable
// main event loop
while(!done)
{
// display menu
printf("\nPC Timer Re-Programming Utility\n");
printf("\n1 - Program timer to 120HZ.");
printf("\n2 - Program timer to 100HZ.");
printf("\n3 - Program timer to 60HZ.");
printf("\n4 - Program timer to 50HZ.");
printf("\n5 - Program timer to 40HZ.");
printf("\n6 - Program timer to 30HZ.");
printf("\n7 - Program timer to 20HZ.");
printf("\n8 - Program timer to 18.2HZ (default).");
printf("\n9 - Exit program.");
printf("\n\nSelect one?");
// get input
scanf("%d",&selection);
// what rate did user select?
switch(selection)
{
case 1: // set timer to 120hz
{
Timer_Program(TIMER_COUNTER_0,TIMER_120HZ);
} break;
case 2: // set timer to 100hz
{
Timer_Program(TIMER_COUNTER_0,TIMER_100HZ);
} break;
case 3: // set timer to 60hz
{
Timer_Program(TIMER_COUNTER_0,TIMER_60HZ);
} break;
case 4: // set timer to 50hz
{
Timer_Program(TIMER_COUNTER_0,TIMER_50HZ);
} break;
case 5: // set timer to 40hz
{
Timer_Program(TIMER_COUNTER_0,TIMER_40HZ);
} break;
case 6: // set timer to 30hz
{
Timer_Program(TIMER_COUNTER_0,TIMER_30HZ);
} break;
case 7: // set timer to 20hz
{
Timer_Program(TIMER_COUNTER_0,TIMER_20HZ);
} break;
case 8: // set timer to 18hz
{
Timer_Program(TIMER_COUNTER_0,TIMER_18HZ);
} break;
case 9: // exit program
{
done=1;
} break;
default:
{
printf("\nInvalid Selection!\n");
} break;
} // end switch
} // end while
} // end main
The program is so simple it’s not worth reviewing. About the only thing the program does is obtain user input and use it to select the rate at which to program counter 0. Just one caution: when this program exits, the timer will run at the last rate you programmed; hence, if you want the system to run at its standard rate of 18.2 Hz, select command 8 before exiting.
Time Tracking
We have mentioned event timing from time to time, but we haven’t written any specific software to implement it. In many situations we may need to compute the amount of time a process has taken or allow some process to execute for a specific amount of time. For example, we use the function Time_Delay( ) at the end of most of our demos to lock the demos at a frame rate of no more than 18 FPS. However, the downfall to this approach is that if the game processing takes 50 milliseconds (ms), and then at the end of the game loop another 55 milliseconds is taken up during the call to Time_Delay( ) with a parameter of 1, then the maximum frame rate will be:
_____1______ = 9 FPS 50 ms+55 ms
Again, this occurs because the game logic and rendering takes some amount of time, and this time must be added onto the time delay at the end of the game. A better solution is to lock a game at a specific frame rate by timing the events from the start of the event loop to the end of it, and if the elapsed time is greater than or equal to the desired delay time (which results in a specific frame rate), the next cycle of the event loop is processed. However, if less than the required time has elapsed, a while( ) or similar structure waits until the specified amount of time has passed. As an example, take a look at the following fragment:
// main event loop
while(!done)
{
// obtain starting time
start_time = Timer_Query();
// do the standard operations in a game loop
Erase_Objects();
Move_Objects();
Draw_Objects();
Draw_Screen();
// end main portion of game loop
// now determine if a clock tick has passed
while ((Timer_Query() - start_time)<WAIT_TIME);
} // end main event loop
The above fragment functions by first saving the current time with the function Timer_Query( ), and then the main game logic is executed. Now the timing aspect of the program comes into play. Instead of simply making a call to Time_Delay( ) (with the desired number of ticks), the starting time is continually subtracted from the current time until the desired number of ticks has passed. Thus, if during the execution of the logic portion of the event loop the desired execution time of the program was already used up, the waiting loop would evaluate as true and the event loop would continue.
On the other hand, if the logic portion of the code only took up a few milliseconds and we had set WAIT_TIME to 3 clicks (3*55 ms), which is roughly 165 ms, then the while( ) loop would wait until the time expired. If you are getting a bit confused by the use of the constant 55 ms, remember that it is the amount of time it takes for one clock tick with the default setting of the system timer–that is, 1/18.2 = 55 ms when rounded up.
Of course, if we had programmed the PC’s timer with a rate of 100 Hz using the Timer_Program( ) function, the resolution of the timer becomes 10 ms; thus, a delay of 3 system clicks would equal 30 ms, and if this were used to control the frame rate at the end of an event loop, the frame rate would be roughly 33 FPS.
Another use of time tracking is “timing out.” As noted, there are many processes that we may want to time in a video game. For example, we may want to give a player only a certain amount of time to solve a puzzle, or maybe we want to allot only a certain amount of time to make a MODEM connection. It’s up to you, but time tracking and the software to implement it is a necessity in a video game. Luckily, the software to track time is nothing more than a single variable lookup. The function Time_Delay( ) uses a memory location in the PC at 0000:046Ch, which contains a 32-bit value of the number of ticks from 1970 (don’t ask me why). We don’t care about the absolute meaning of the data, just the relative meaning. We know, for instance, if the value increments by 10, then 10 clock ticks must have passed since the last time we read it. Furthermore, since we know a clock tick is roughly 55 ms, then we can deduce that 550 ms have passed since the last reading.
With this information, all we really need to do is dereference a pointer aliased to 0000:046Ch whenever we want to know the current time in clicks. Although this works, a bit cleaner interface would be to use a function (in case that variable location changes some day). The Timer_Query( ) function is shown in Listing 8-7.
Listing 8-7 Function to query the current time of the PC in clicks
long Timer_Query(void)
{
// this function is used to record the current time
long far *clock = (long far *)0x0000046CL; // address of timer
return(*clock);
} // end Timer_Query
The function takes no parameters and simply returns the current value in the system time variable. There is one important detail that must not be overlooked. The system timer variable will change at a rate that is equal to the value programmed into the timer chip at counter 0. Hence, if we change the counter, we had better take note of it when using the Timer_Query( ) or Time_Delay( ) functions in our code, since they will be operating in a different time scale than the default of 18.2 clicks per second.
Background Processing
One of the main goals of this chapter is to learn the fundamental techniques to gain temporal control of the PC. And I think that we are starting to learn why interrupts and reprogramming the timer are important, but let’s discuss briefly a few applications of all this sorcery.
We learned that a video game can be thought of as an entire environment unto itself. This environment is responsible for one thing, and that’s running a game. The game consists of a set of modules that are each executed for a period of time in a cyclic manner, as shown in Figure 8-14. Thus, the game is a form of a multitasking kernel, and the tasks that are processed make up the game itself. However, the single drawback to this setup is that the game logic doesn’t have a good grasp on time. For example, if a light is supposed to blink every 20 seconds, it’s hard for the program to make this happen since the game logic may be processing an explosion at the time the light is supposed to blink.
Of course, the solution to the problem is to have an interrupt with enough resolution to process all of the events that must occur at regular intervals regardless of the state of the game. For example, the music system that we’re using, MIDPAK, uses the timer interrupt to process the MIDI music. If MIDPAK didn’t do this, the game program would have to do the processing. And if the game started slowing down because of complex graphics or lots of explosions, then so would the sound. However, if the sound system is attached to an interrupt, the speed of the game is irrelevant. The interrupt will occur like clockwork no matter what the CPU is doing. This is because the interrupt by definition will interrupt anything that is occurring. That’s a brutal statement!
As another example, imagine we have a game with a lot of color palette animation. This animation can and should be done by an interrupt (as long as the animation is passive). Figure 8-15 shows two possible situations the same game can get into at different times. Due to the current state of the game, in one case it has slowed down to 8 FPS, while in the other case it has slowed to 15 FPS. In both cases if the palette animation were done by the main game logic itself, these updates would also occur at 8 or 15 FPS, which may be unacceptable and visually bothersome. However, if we have reprogrammed the timer for 60 Hz and latched the color animation ISR onto it, at least the color animation will occur at 60 Hz no matter what is happening to the game. Of course, we must realize that we don’t want to make interrupt functions too complex or long. The idea of background tasks using ISRs is to keep them relatively simple and short.
In conclusion, background processes that are based on interrupts should be used for tasks that we never want to see slow down or for tasks that must occur at some specific interval no matter how much the game logic is loading the system and CPU. I use background tasks in my games for animations that aren’t part of the game logic, such as shooting stars, birds, color changes, screen shakes, and so forth. Try to keep your background tasks at the same level of complexity. For example, don’t try to render a 3D view in an ISR!
We have almost completed our discussion of interrupts and timing; now we are going to change gears a little and talk about some techniques that are necessary evils in game programming. The first is writing self-contained or autonomous functions.
Autonomous Functions and Programming Techniques
You’re probably getting pretty excited about writing your first game, even though we haven’t begun the 3D portion of our studies. However, before we bury ourselves in a full game (which we will do in the next chapter), let’s take a look at some programming techniques that will make life easier during the creation of game code.
In general, writing games isn’t like writing other pieces of software. As Cybersorcerers we must break a lot of rules and make up some as we go. This is because we must squeeze every particle of power from the PC. The result: we may use some questionable programming techniques. But their performance won’t be! I’m going to give you a list of dos and don’ts when writing game software and functions.
- Don’t add error detection logic unless an error is going to occur. You
don’t have time to test for errors in a game. The only error checking should be done during initialization and user input if possible.
- Use global variables freely. Passing parameters to a function that simply
plots a pixel will probably take more time than plotting the pixel!
- Use 32-bit fixed point math instead of floating point where possible.
Even on 486 and Pentium processors, 32-bit fixed point is faster than floating point in most cases. We will learn about the mechanics of fixed point math in later chapters.
- Try to use look-up tables with precomputed values where possible,
especially for mathematically intensive computations such as trigonometric tables and square root functions.
- Use inline assembly language for time-critical operations such as graphics
and I/O. However, shy away from assembly language for AI and general game logic.
- Try to use the smallest memory model possible. The MEDIUM memory
model is the best choice in most cases, but if you can squeeze your code into 64K, by all means use the SMALL model.
- Most of the time in a game is spent in rendering graphics. Take the time
to optimize these sections. Don’t waste effort optimizing AI algorithms and general game logic; these won’t give you large enough returns for the time investment.
- Finally, stay away from complex data structures such as binary trees
and linked lists. Use them only when necessary. Arrays and simple variables are much faster.
If you pay attention to these dos and don’ts, you’ll avoid many headaches, but game programming takes a lot of effort and experience. There aren’t any magickal techniques, but there are a few standard methodologies for writing some of the more common functions in a game. One of these methodologies I refer to as autonomous functions. Really this is nothing more than a function that can take a command and has one or more local static variables so the function can track its own state. This feature relieves the main game logic from having to know everything.
In essence, autonomous functions are like game objects. They take care of themselves and do their own thing. If there is one mistake that a game programmer can make, it’s writing all of a game’s control logic into the main( ) function of the program. Instead, we must think of all the subsystems of the game–the explosions, the rendering engine, the I/O system, and so forth–as independent autonomous entities that we direct to do work. But we don’t have to stand over them like the MCP to make sure they do the work. To start your thinking along these lines, take a look at the function in Listing 8-8.
Listing 8-8 Shell for an autonomous function
int Auto_Function(int command)
{
static int times_executed = 0; // this tracks the number of times the
// function is called
// test if this is the first time the function has been called
if (!times_executed)
{
//..initialize any data structures and perform any start up tasks
} // end if first time
// we have executed this function once more
times_executed++;
// what command has caller issued
switch(command)
{
case COMMAND_1:
{
// process command
} // break;
// other commands
default:break;
} // end switch
} // end Auto_Function
This function is simple in appearance, but it has much power to offer. Notice
that the function is aware of the number of times it’s called; moreover, during its first call, the function will execute a section of code that initializes any data structures. Thus, we have seen that using a static variable, a function can remember its own local state. Improvements can of course be made on the function shell (based on what the function is doing). For example, we could add more local static variables and some parameters.
The model of computation for a game must have some air of object orientation within it, meaning that functions should be thought of as objects composed of functions and their data, as shown in Figure 8-16. Using this technique consistently (when applicable), a video game becomes very object-oriented, and each function or module can be updated, debugged, and so forth, without knowing how the rest of the system works.
Now that we have mastered the art of writing efficient functions for our game objects (hopefully), the next logical step in our evolution is to learn how to replicate our work to create many of the same objects.
Processing Multiple Data Objects
In the previous chapter, we saw how many objects can be created from one template by using multiple data sets with the same code, but we didn’t know all that we know now to really appreciate the technique. And since being able to replicate or make multiple copies of objects in a game is so important, we are going to see another example of it here.
The example is based on a system that is commonly modeled in games called particle systems, which is a system of relatively simple game objects that follow a specific set of rules. We’re going to simulate hot chunks of molten rock (cinders) spewing out of the mouth of a volcano on an alien world. Let’s begin with the data structure of the particles. The structure needs to track the position, velocity, and color, and have a few timing variables to track the particle’s progress. Here is such a data structure:
typedef struct cinder_typ
{
int x,y; // position of cinder
int xv,yv; // velocity of cinder
int color; // color of cinder
int lifetime; // total number of frames cinder will live for
int counter_1; // tracks when cinder color will change
int threshold_1;
int counter_2; // tracks when gravity will be applied
int threshold_2;
int col; // the current color of the cinder
int under; // this holds the pixel under the cinder
} cinder, *cinder_ptr;
About the only fields that aren’t obvious from their names are the counter fields. They are used to “granulate” time. To avoid floating point calculations, we use integers, but mathematical operations are only performed every few frames. The result of this technique (which is important in itself) is that objects move as if they had decimal accuracy. For example, say that we wanted to move a pixel .25 pixel per frame. This would be impossible with standard integers, but if we use a counter that counts frames, then it can be done. When the counter reaches 4, the pixel is moved 1.0, which is equivalent to 4*.25. Anyway, this technique is used to simulate the gravity and the rate at which the volcanic cinders lose their red-hot glow.
So how do we write a single set of functions that can process a number of cinders? It’s actually very simple–the cinders can be located in an array (a linked list could be used alternatively). This array would hold the data for each cinder. Something like this would work,
cinder cinders[MAX_CINDERS]; // this holds all the cinders
where MAX_CINDERS is the maximum number of cinders in the simulation.
With the array in hand, all that is needed are the functions that erase, move, draw, scan under, and initialize all the cinders. The trick is understanding that each function really works on one cinder at a time, but the functions are written so that a for loop processes all the cinders in the array. Abstractly put, if a function operates on a game object named game_object, then an algorithm to process a collection of game_objects would look like this:
for (index=0; index<number_of_objects; index++)
{
Process(game_object[index]);
} // end for
This technique is all that is needed to transform any function into a function that works on an array of base types. Using this technique, we can write all the functions we need. I don’t want to list every function of the volcano program since the complete program follows in a couple of pages, but let’s take a look at one of the volcano functions and see what’s in it. Listing 8-9 contains the Erase_Cinders( ) function for analysis.
Listing 8-9 The cinder erasure function
void Erase_Cinders(void)
{
// this function replaces the pixel that was under a cinder
int index; // looping variable
for (index=0; index<active_cinders; index++)
if (cinders[index].y >=0)
Write_Pixel_DB(cinders[index].x,cinders[index].y,cinders[index].under);
} // end Erase_Cinders
The function is rather simple, but it is derived using the techniques just discussed. The cinders are contained in the array cinders[ ]. The function retrieves the pixel that was under the current cinder within its own data structure and places it at the proper (x, y) coordinates. This is done for each cinder in the array. If our program only had one cinder, the Erase_Cinders( ) function would be called Erase_Cinder( ) and it might look like Listing 8-10.
Listing 8-10 A single cinder eraser
void Erase_Cinder(void)
{
if (cinders.y >=0)
Write_Pixel_DB(cinders.x,cinders.y,cinders.under);
} // end Erase_Cinder
The only difference between the two functions is that the Erase_Cinder( ) logic lacks the array indexing! Therefore, 90 percent of the functions we write to control a single object can be transformed with a text editor in minutes to handle multiple objects. The only functions that we have to write more carefully are collision detection and the initialization functions that set up the data structures, since we may want to have some data dependencies between each data object. Anyway, the other functions, Draw_Cinders( ), Move_Cinders( ), and Scan_Cinders( ) are all written similarly to Erase_Cinders( ). The function Init_Cinders( ) generates all the random positions and velocities of the cinders during startup. It works in much the same way (each cinder is processed in a loop), except that it takes a pair of parameters that indicate which subset of cinders to initialize. This is a general practice. You should write your functions so that a subset of the objects can be called upon for work.
At this point I think the best thing to do is look at the short program and see it in action. The name of the program is VOLCANO.EXE and the source is VOLCANO.C. When you run the program, you will see an alien landscape with a volcano in the middle (well, it’s supposed to look like a volcano). Anyway, the volcano spews out cinders at different velocities and temperatures (indicated by color), and the cinders fall back to the surface and die out. To build this program, you’ll need the library modules BLACK3, BLACK4, and BLACK8. Listing 8-11 contains the code.
Listing 8-11 Program that replicates a single cinder to simulate an active volcano
// VOLCANO.C - A demo of multiple data single logic programming
// I N C L U D E S ///////////////////////////////////////////////////////////
#include <io.h>
#include <conio.h>
#include <stdio.h>
#include <stdlib.h>
#include <dos.h>
#include <bios.h>
#include <fcntl.h>
#include <memory.h>
#include <malloc.h>
#include <math.h>
#include <string.h>
#include "black3.h"
#include "black4.h"
#include "black8.h"
// D E F I N E S ////////////////////////////////////////////////////////////
#define MAX_CINDERS 100 // maximum number of cinders in the simulation
#define VOLCANO_MOUTH_X 145 // position where cinders will be emitted
#define VOLCANO_MOUTH_Y 102
#define CINDER_START_COLOR 48 // starting cinder color
#define CINDER_END_COLOR 48+15 // ending cinder color
// S T R U C T U R E S //////////////////////////////////////////////////////
typedef struct cinder_typ
{
int x,y; // position of cinder
int xv,yv; // velocity of cinder
int color; // color of cinder
int lifetime; // total number of frames cinder will live for
int counter_1; // tracks when cinder color will change
int threshold_1; //
int counter_2; // tracks when gravity will be applied
int threshold_2;
int col; // the current color of the cinder
int under; // this holds the pixel under the cinder
} cinder, *cinder_ptr;
// G L O B A L S ////////////////////////////////////////////////////////////
pcx_picture image_pcx; // general PCX image used to load background and imagery
cinder cinders[MAX_CINDERS]; // this holds all the cinders
int active_cinders; // nunmbr of active cinders in the world
// F U N C T I O N S /////////////////////////////////////////////////////////
void Init_Cinders(int starting_cinder, int ending_cinder)
{
// this function initializes cinders, it can initialize a sequence of cinders,
// or a single cinder
int index; // looping variable
// intialize each cinder or restart a single cinder
for (index=starting_cinder; index<=ending_cinder; index++)
{
// fill in position, velocity and color of cinder
cinders[index].x = VOLCANO_MOUTH_X - 5 + rand() % 11;
cinders[index].y = VOLCANO_MOUTH_Y - rand() % 3;
cinders[index].xv = - 2 + rand() % 5;
cinders[index].yv = - 5 - rand() % 5;
cinders[index].col = CINDER_START_COLOR;
// scan under cinder
cinders[index].under = Read_Pixel_DB(cinders[index].x,cinders[index].y);
// set timing fields
cinders[index].lifetime = 20 + rand()%100;
cinders[index].counter_1 = 0;
cinders[index].counter_2 = 0;
// set how long it will take for cinder to cool
cinders[index].threshold_1 = 2 + rand()%6;
// set how long it will take for gravity to take effect
cinders[index].threshold_2 = 1+rand()%3;
} // end for index
} // end Init_Cinders
/////////////////////////////////////////////////////////////////////////////
void Erase_Cinders(void)
{
// this function replaces the pixel that was under a cinder
int index; // looping variable
for (index=0; index<active_cinders; index++)
if (cinders[index].y >=0)
Write_Pixel_DB(cinders[index].x,cinders[index].y,cinders[index].under);
} // end Erase_Cinders
/////////////////////////////////////////////////////////////////////////////
void Scan_Cinders(void)
{
// this function scan under the cinders
int index; // looping variable
for (index=0; index<active_cinders; index++)
if (cinders[index].y >=0)
cinders[index].under = Read_Pixel_DB(cinders[index].x,cinders[index].y);
} // end Scan_Cinders
/////////////////////////////////////////////////////////////////////////////
void Draw_Cinders(void)
{
// this function draws the cinders
int index; // looping variable
for (index=0; index<active_cinders; index++)
if (cinders[index].y >=0)
Write_Pixel_DB(cinders[index].x,cinders[index].y,cinders[index].col);
} // end Draw_Cinders
/////////////////////////////////////////////////////////////////////////////
void Move_Cinders(void)
{
// this function moves and updates all the timing fields of the cinder
// it also applies gravity to the cinders
int index, // looping variable
pixel; // used to read the pixels under the cinder
// process each cinder
for (index=0; index<active_cinders; index++)
{
// move the cinder
cinders[index].x+=cinders[index].xv;
cinders[index].y+=cinders[index].yv;
// apply gravity
if (++cinders[index].counter_2 >= cinders[index].threshold_2)
{
// apply a downward velocity of 1
cinders[index].yv++;
// reset gravity counter
cinders[index].counter_2 = 0;
} // end if time to update gravity
// test if it's time to update cinder color
if (++cinders[index].counter_1 >=cinders[index].threshold_1)
{
// reset counter
cinders[index].counter_1 = 0;
// test if cinder is already out
if (cinders[index].col < CINDER_END_COLOR)
cinders[index].col++;
} // end if time for color change
// test if cinder is dead, off screen, lifetime over, or hit moutain
pixel = Read_Pixel_DB(cinders[index].x,cinders[index].y);
// test if the pixel is part of the mountain
if (pixel!=0 && (pixel < CINDER_START_COLOR || pixel > CINDER_END_COLOR))
{
// restart this cinder
Init_Cinders(index,index);
} // end if cinder hit mountain
else
if (--cinders[index].lifetime <= 0)
{
// restart this cinder
Init_Cinders(index,index);
} // end if cinders lifetime expired
else
if (cinders[index].x > 320 || cinders[index].x < 0)
{
// restart this cinder
Init_Cinders(index,index);
} // end if cinder is off screen
} // end for index
} // end Move_Cinders
// M A I N //////////////////////////////////////////////////////////////////
void main(int argc, char **argv)
{
// set the graphics mode to mode 13h
Set_Graphics_Mode(GRAPHICS_MODE13);
// create the double buffer
Create_Double_Buffer(200);
// now load the background image
PCX_Init((pcx_picture_ptr)&image_pcx);
PCX_Load("volcano.pcx",(pcx_picture_ptr)&image_pcx,1);
// copy PCX image to double buffer
PCX_Copy_To_Buffer((pcx_picture_ptr)&image_pcx,double_buffer);
PCX_Delete((pcx_picture_ptr)&image_pcx);
// draw instructions
Print_String_DB(80,2,9,"Hit any key to exit",1);
// intialize all the cinders
Init_Cinders(0,19);
active_cinders = 20;
// main event loop, process until keyboard hit
while(!kbhit())
{
// erase all the cinders coming out of the volcano
Erase_Cinders();
// move all the cinders
Move_Cinders();
// scan under and draw the cinders
Scan_Cinders();
Draw_Cinders();
// display double buffer
Display_Double_Buffer(double_buffer,0);
// lock onto 18 frames per second max
Time_Delay(1);
} // end while
// exit in a very cool way
Screen_Transition(SCREEN_DARKNESS);
// free up all resources
Delete_Double_Buffer();
Set_Graphics_Mode(TEXT_MODE);
} // end main
The main( ) function of the program is surprisingly simple, but this is an illusion
since most of the work is done in the cinder functions. In fact, the most complex of
the functions is Move_Cinders( ), which updates all the positions of the cinders and
updates the colors of the cinders. In addition to moving and transforming the cinders,
Move_Cinders( ) tests them to see if their life cycle is complete. This occurs in
a few cases: the cinder has burnt out, the cinder has hit the surface, the cinder has
gone off a screen edge. In any of these cases, the cinder is recycled and restarted
from the mouth of the volcano with another call to Init_Cinders( ), but with a pair of
parameters that are the index of the cinder that just died.
For a small exercise, I want you to add to the program the ability to change the gravity and the number of active cinders by keyboard input.
Summary
This chapter is probably one of the most important of the whole book. We learned the lowest forms of alchemy that are needed to take control of the PC. We learned about multitasking, the timer, autonomous functions, and standard game programming techniques used to optimize game code. But the most important thing of all to gain from this chapter is that a video game is a little world, a self-contained universe that is generated within the PC. Making this world seem alive and functioning in real-time is one of the Black Arts we so much desire.
Now it’s time to learn how to pull more than yourself into the Cyberspace abyss, with multiplayer game techniques.
Continue
- Black Art of 3D Game Programming: Writing Your Own High-Speed 3D Polygon Video Games in C Table of Contents
- Black Art of 3D Game Programming, Chapter 9: Multiplayer Game Techniques
| Copyright 2006 Andre LaMothe |
|
