A Few Details about the Intel Architecture

Last Update: 2005 October 5
Started 2004 June 6

If you have not programmed in Assembly, I am going to assume that you are not familiar with the idiosyncrasies of the Intel 8086 through Pentium processors. If you already know this material, I would appreciate your thoughts and suggestions for improvement, or you may just want to skip over most of it.

A brief introduction to the 8086 registers.

The registers in the Intel architecture are specialized. They are not "general registers" that can be used interchangeably, as most of the registers in the PDP-11 or the IBM-360 processors can.

Because the registers have special uses, you will find that much of any program simply moves data from one register to another. You may ask: "Why did Intel do it that way?" Having never worked for Intel, I can only guess and make logical deductions. We must remember that when the 8086 was designed, memory was still expensive. Not many years earlier, each bit of memory had been a tiny magnetic core with insulated wires threaded through it by hand. With specialized registers, most instructions do not have to specify the registers to use, as they must in a "general register" architecture. It is quite amazing how short the most frequently used instructions are, often just one byte. At one time I tried to evaluate the instruction sets of different machines by comparing how many bits of memory it took to program a given algorithm. Surprisingly, the 8086 and the PDP-8 always looked very good by this measure.

The four most general registers are AX, BX, CX, and DX. Each is 16 bits wide and can also be addressed as two 8-bit halves: the high byte (AH, BH, CH, DH) and the low byte (AL, BL, CL, DL).

The "A" registers (AX, AL, and to a lesser extent AH) are the main working accumulators, and are implied by the string instructions as well as by multiply and divide. They are not used as index registers.
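To make the relationship between AX, AH, and AL concrete, here is a small Python model (not assembler). It assumes only that AH is the high byte and AL the low byte of the 16-bit AX; the function names are made up for this illustration.

```python
def split_ax(ax):
    """Return (AH, AL): the high and low bytes of a 16-bit AX value."""
    ax &= 0xFFFF                      # keep only 16 bits, as the register would
    return (ax >> 8) & 0xFF, ax & 0xFF


def join_ax(ah, al):
    """Rebuild AX from its two 8-bit halves."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)


ah, al = split_ax(0x1234)
print(hex(ah), hex(al))               # 0x12 0x34
```

Changing AL or AH changes the matching half of AX, because they are the same storage viewed at two widths. The same holds for BX, CX, and DX.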

The "D" registers (DX specifically) augment AX for multiply and divide. (Multiplying two 16-bit values produces a 32-bit result, and dividing a 32-bit value by a 16-bit value produces a 16-bit quotient and a 16-bit remainder.) The D registers are not used with the string instructions or as index registers; hence they are not as frequently used as the A registers.
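The way DX and AX pair up for 16-bit multiply and divide can be sketched in Python. This is a model of the unsigned case only, with hypothetical function names. It assumes the usual arrangement in which DX holds the high word and AX the low word of the 32-bit value, and in which the divide leaves the quotient in AX and the remainder in DX.

```python
def mul16(ax, operand):
    """Unsigned 16-bit multiply: returns (DX, AX), the high and low words."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    return (product >> 16) & 0xFFFF, product & 0xFFFF


def div16(dx, ax, operand):
    """Unsigned divide of the 32-bit value DX:AX by a 16-bit operand.

    Returns (AX, DX) = (quotient, remainder).
    """
    operand &= 0xFFFF
    if operand == 0:
        raise ZeroDivisionError("divide by zero")
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    quotient, remainder = divmod(dividend, operand)
    if quotient > 0xFFFF:
        # The quotient must fit in AX; the processor reports a divide error.
        raise OverflowError("quotient does not fit in 16 bits")
    return quotient, remainder


print([hex(v) for v in mul16(0x1234, 0x5678)])   # ['0x626', '0x60']
print(div16(0, 7, 2))                            # (3, 1)
```

Note the overflow case: if DX is not smaller than the divisor, the quotient cannot fit in 16 bits. That is why programs usually clear DX before dividing a plain 16-bit value.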

The "B" register (BX specifically) is used with the "Translate" (XLAT) instruction to point to the translate table. It can also be used as an index register, and it is the only index register whose two halves (BH and BL) can be addressed separately.
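As a rough illustration of the Translate instruction, the Python sketch below models it as a table lookup: BX points at the table, AL selects the entry, and the byte found there replaces AL. The memory array, the table address 200 hex, and the function name are assumptions made for the example, and segments are ignored.

```python
TABLE_ADDR = 0x200                       # hypothetical location of the table
memory = bytearray(0x10000)              # one 64K segment, all zeros
memory[TABLE_ADDR:TABLE_ADDR + 16] = b"0123456789ABCDEF"


def xlat(mem, bx, al):
    """Model of Translate: return the new AL, the byte at address BX + AL."""
    return mem[(bx + (al & 0xFF)) & 0xFFFF]


# Convert the value 11 to its hexadecimal digit character.
print(chr(xlat(memory, TABLE_ADDR, 0x0B)))   # B
```

A table like this one turns a value from 0 to 15 into its hexadecimal digit with a single instruction, which is a typical use.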

The "C" register (CX specifically) holds the count for the "loop" instruction and for the "rep" (repeat) prefix used with string instructions.
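The counting behavior of the loop instruction is easy to get wrong, so here is a small Python model of it. It assumes the usual arrangement where the loop instruction sits at the bottom of the loop body; the function name is made up.

```python
def loop_passes(cx):
    """Count how many times the body runs for a starting value of CX."""
    passes = 0
    while True:
        passes += 1                  # the loop body would run here
        cx = (cx - 1) & 0xFFFF       # loop decrements CX, wrapping at 16 bits
        if cx == 0:                  # and stops jumping back once CX is zero
            return passes


print(loop_passes(5))                # 5
print(loop_passes(0))                # 65536
```

Because the decrement comes before the test, starting with CX equal to zero runs the body 65536 times rather than not at all. The rep prefix behaves differently in this one case: it tests CX first, so a count of zero repeats nothing.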

There are four other 16-bit only registers you should know about: SI, DI, BP, and SP.
SI and DI are the Source and Destination Index registers used with the string instructions; either can be used as a simple index register.
SP is the Stack Pointer used with the push, pop, and subroutine call and return instructions. Rarely do you need to explicitly move something to or from SP.
BP has a special property: addresses formed from BP refer to the stack segment by default, which makes it useful for reaching parameters that have been pushed onto the stack. Compilers rely on this when subroutines are called, particularly when the stack is in a separate segment from the code and data. We haven't mentioned the segment registers yet. In .COM programs that fit in 64K, the operating system sets the segment registers before it starts the program, so we won't have to worry about them, at least for now.
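To show how BP reaches pushed parameters, the Python sketch below models a 16-bit stack. It assumes a common compiler convention that is not described above: the caller pushes the parameters, a near call pushes a 16-bit return address, and the subroutine begins by pushing the old BP and copying SP into BP. The class, the function, and all the values are hypothetical.

```python
class Stack16:
    """A tiny model of a 16-bit stack that grows downward in memory."""

    def __init__(self, sp=0xFFFE):
        self.mem = bytearray(0x10000)
        self.sp = sp

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF           # SP moves down by one word
        self.mem[self.sp] = value & 0xFF           # low byte first
        self.mem[self.sp + 1] = (value >> 8) & 0xFF

    def word_at(self, addr):
        addr &= 0xFFFF
        return self.mem[addr] | (self.mem[addr + 1] << 8)


def build_frame():
    stack = Stack16()
    old_bp = 0x1111            # whatever BP the caller was using
    stack.push(0x000A)         # caller pushes the first parameter
    stack.push(0x0014)         # caller pushes the second parameter
    stack.push(0x0123)         # the near call pushes the return address
    stack.push(old_bp)         # subroutine saves the caller's BP
    bp = stack.sp              # subroutine copies SP into BP
    return stack, bp


stack, bp = build_frame()
print(hex(stack.word_at(bp + 4)))    # 0x14, the last parameter pushed
print(hex(stack.word_at(bp + 6)))    # 0xa, the parameter pushed before it
```

Under these assumptions the parameters sit at fixed offsets from BP no matter how much the subroutine pushes afterward, which is what makes BP more convenient than SP for this job.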

The above are hardware facts. In addition, there are a few operating system conventions that you will have to know a little about. For example, in most of the routines that read or write files, your program specifies where the data starts with the address in DX, while CX specifies how many bytes to read or write.

.COM programs start at hexadecimal location 100. Eric's A86 assembler knows about this, and automatically starts at 100. You have to explicitly specify this in most other assemblers. All addresses and most internal data are in hexadecimal. If you are not familiar with hexadecimal, you will have to learn it. There are plenty of books that have a chapter or so on hexadecimal.
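If hexadecimal is new to you, a few lines of Python make a handy converter while you learn. This is only a convenience sketch, and the constant name is made up.

```python
COM_START = 0x100                 # where a .COM program begins

print(COM_START)                  # 256
print(hex(256))                   # 0x100
print(int("7F", 16))              # 127
print(format(255, "02X"))         # FF
```

Each hexadecimal digit stands for four bits, so two digits describe one byte and four digits describe a 16-bit register or address.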

The area from 0 to 7F holds information the operating system needs about where the program is located and how much memory it uses. Hexadecimal 80 to FF holds whatever followed the program name on the command line. You use this area to process command-line arguments. At least one of my sample programs uses data from the command line.
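The Python sketch below models how a program might pick up its command-line text from that area. It assumes the usual DOS layout, which is not spelled out above: the byte at 80 holds the number of characters, the text itself starts at 81, and a carriage return (hex 0D) follows the last character. Both function names are hypothetical, and make_psp exists only to build test data.

```python
def make_psp(tail):
    """Build a fake 256-byte image of locations 0 to FF (demo helper only)."""
    data = tail.encode("ascii")
    if len(data) > 126:
        raise ValueError("command tail too long for the 80-FF area")
    psp = bytearray(0x100)
    psp[0x80] = len(data)                     # count byte at 80
    psp[0x81:0x81 + len(data)] = data         # text starts at 81
    psp[0x81 + len(data)] = 0x0D              # carriage return ends the text
    return psp


def command_tail(psp):
    """Return the command-line text stored in locations 80 to FF."""
    length = min(psp[0x80], 0x7E)             # never read past location FF
    return bytes(psp[0x81:0x81 + length]).decode("ascii", errors="replace")


psp = make_psp(" file.txt /v")
print(repr(command_tail(psp)))                # ' file.txt /v'
```

The text normally begins with the space that separated the program name from its arguments, so a real program usually skips leading blanks before looking at the first argument.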

Now let's look at the Hello details.

