Memory model of x51 compatible micros (8051, 8052, ...)

Microcontrollers of the MCS-51 series use separate address spaces for program and for data. Both address spaces can be further divided into internal and external memory areas. The area of Special Function Registers can also be considered an address space of its own. The individual address spaces do not overlap and are accessed by different sets of instructions.

Program address space

The program address space is the simplest one. The MCS-51 family can directly address up to 64 kBytes of code without any special features. The GNU assembler MCS51-AS (used by the Web51 development environment) uses the standard .text segment for program memory. Normally, program code is generated into this segment. To store constants in this segment, the following directives can be used, depending on the required format of the constant:
Example 1.1
   1                 	.text
   3                 	.equ	symbol1, 0x4A
   5 0000 4A          	.byte	74
   6 0001 4A          	.byte	0b01001010
   7 0002 4A          	.byte	0112
   8 0003 4A          	.byte	0x4A
   9 0004 4A          	.byte	0X4a
  10 0005 4A          	.byte	'J
  11 0006 4A          	.byte	'J'
  12 0007 4A          	.byte	'\J
  13 0008 4A          	.byte	symbol1
  14 0009 00          	.byte	unknownsymbol
  15 000a 12 34       	.word	4660
  16 000c 12 34       	.word	0x1234
  17 000e 12 34 56 78 	.int	02215053170
  18 0012 12 34 56 78 	.int	305419896
  19 0016 12 34 56 78 	.int	0x12345678
  20 001a 12 34 56 78 	.long	0x12345678
  21 001e 12 34 56 78 	.quad	0x123456789ABCDEF0
  21      9A BC DE F0 
  22 0026 12 34 56 78 	.octa	0x123456789ABCDEF0123456789abcdef0
  22      9A BC DE F0 
  22      12 34 56 78 
  22      9A BC DE F0 

         example.asm:3      *ABS*:0000004a symbol1


The example shows that the assembler can work with 8/16/32/64/128-bit constants, written using C-like conventions: decimal (the number starts with a digit 1 through 9), octal (starts with 0), hexadecimal (starts with 0x), binary (starts with 0b), or an ASCII character. Multi-byte constants are stored most significant byte first. A symbolic name can be given in place of a constant; it is replaced by its value during assembly or linking.
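The byte patterns in Example 1.1 can be reproduced with a few lines of Python. This is only a sketch of the encoding arithmetic, not of the assembler itself; the helper name encode_constant is hypothetical, and Python spells octal as 0o112 where the assembler uses a bare leading zero (0112).

```python
def encode_constant(value: int, size: int) -> bytes:
    """Return `value` as `size` bytes, most significant byte first,
    matching the byte order seen in the listing of Example 1.1."""
    return value.to_bytes(size, "big")


# The same 8-bit value in the notations used by Example 1.1.
same_byte = [74, 0b01001010, 0o112, 0x4A, ord("J")]

print([encode_constant(v, 1).hex() for v in same_byte])
print(encode_constant(4660, 2).hex())           # corresponds to .word 4660
print(encode_constant(0o2215053170, 4).hex())   # corresponds to .int 02215053170
print(encode_constant(305419896, 4).hex())      # corresponds to .int 305419896
```

All three numeric spellings of the 32-bit constant yield the same four bytes, which is why lines 17 to 19 of the listing are identical.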

The assembler can also embed string constants, such as error messages, into the program address space. Such constants can be zero-terminated automatically (using the .string or .asciz directive).

Example 1.2
  24 0036 4E 65 77 6C 	.ascii	"Newlinehere\12"
  24      69 6E 65 68            
  24      65 72 65 0A 
  25 0042 4E 65 77 6C 	.ascii	"Newlinehere\012"
  25      69 6E 65 68 
  25      65 72 65 0A 
  26 004e 4E 65 77 6C 	.ascii	"Newlinehere\x0A"
  26      69 6E 65 68 
  26      65 72 65 0A 
  27 005a 4E 65 77 6C 	.ascii	"Newlinehere\n"
  27      69 6E 65 68 
  27      65 72 65 0A 
  28 0066 54 72 61 69 	.string	"Trailingzero"
  28      6C 69 6E 67 
  28      7A 65 72 6F 
  28      00 
  29 0073 54 72 61 69 	.asciz	"Trailingzero"
  29      6C 69 6E 67 
  29      7A 65 72 6F 
  29      00 
  30 0080 08 0C 0A 0D 	.ascii	"\b\f\n\r\t\\\""
  30      09 5C 22

Text strings may contain special characters as in the "C" language, for example \n for newline. Other special characters can be specified using the backslash \ followed by the octal code of the character, or the \x sequence followed by its hexadecimal code. To insert a backslash into the string, enter it twice. To enter the double quote character, precede it with a backslash, as shown above.
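Python string literals happen to follow the same C escape conventions, so the bytes of Example 1.2 can be cross-checked with a short sketch. The helper names ascii_directive and asciz_directive are hypothetical stand-ins for the directives; they only model the resulting bytes.

```python
def ascii_directive(text: str) -> bytes:
    """Bytes emitted for .ascii: the characters only, no terminator."""
    return text.encode("ascii")


def asciz_directive(text: str) -> bytes:
    """Bytes emitted for .asciz / .string: the characters plus a trailing zero."""
    return text.encode("ascii") + b"\x00"


# Four spellings of the same newline character, as in Example 1.2.
variants = ["Newlinehere\12", "Newlinehere\012", "Newlinehere\x0A", "Newlinehere\n"]
encoded = {ascii_directive(v) for v in variants}

print(len(encoded))                                  # number of distinct encodings
print(asciz_directive("Trailingzero").hex(" "))
print(ascii_directive("\b\f\n\r\t\\\"").hex(" "))
```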

Floating point constants are just as easy to enter into the code. MCS51-AS uses the IEEE floating point format, in single precision (.float, .single) or double precision (.double).

Example 1.3
  31 0071 40 49 0F D8 	.float	3.141592
  32 0075 40 49 0F D8 	.float	3141592E-6
  33 0079 40 49 0F D8 	.single	3141592E-6
  34 007d 40 09 21 FA 	.double	3141592E-6
  34      FC 8B 00 7A 
  35 0085 40 49 0F D8 	.float	0f3141592E-6
  36 0089 40 49 0F D8 	.float	0d3141592E-6
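The encodings in Example 1.3 can be checked against Python's struct module, which also produces IEEE 754 values. This is a sketch for verification only; the big-endian byte order (">") is chosen to match the listing, and the function names are hypothetical.

```python
import struct


def float_bytes(value: float) -> bytes:
    """IEEE 754 single precision, most significant byte first."""
    return struct.pack(">f", value)


def double_bytes(value: float) -> bytes:
    """IEEE 754 double precision, most significant byte first."""
    return struct.pack(">d", value)


print(float_bytes(3.141592).hex(" "))
print(double_bytes(3141592e-6).hex(" "))

# Decoding the eight bytes shown for .double in Example 1.3.
listed_double = bytes.fromhex("400921fafc8b007a")
print(struct.unpack(">d", listed_double)[0])
```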

Data address space

Data storage is considerably more complicated. MCS-51 series microcontrollers have two data areas. The internal data memory ranges from 128 bytes in the smallest micros (8031, 8051, ...) to 256 bytes in the larger ones (8052, ...). The external data memory can occupy up to 64 kBytes. The internal and external memory areas are accessed by different instructions (MOV and MOVX, respectively) and do not overlap. However, a simple hardware trick makes it possible to overlap the external data memory with the external program memory.

Internal data memory

The internal data memory is partially overlapped by the Special Function Registers (SFRs). The first micros of the MCS-51 series had only 128 bytes of internal data memory, so the SFRs were placed in the remaining 128 addresses (0x80 to 0xFF) reachable with an 8-bit address. For example, the data register of port 0 is located at address 0x80. Unlike the internal data memory, SFRs cannot be addressed indirectly. That is, they cannot be accessed through an address held in register R0 (mov a,@r0)(mov @r0,a) or R1 (mov a,@r1)(mov @r1,a); the only allowed method is direct access (e.g. mov a,0x80)(mov a,p0)(anl p0,#0b10101010). Conversely, on micros with 256 bytes of internal data memory, the upper half (0x80 to 0xFF) is accessible by indirect addressing only, because a direct address in that range always selects an SFR. Only indirect addressing therefore reaches the entire internal memory area. Hunting for errors caused by direct addressing of the upper range of the internal memory is a common nightmare for programmers. Unfortunately, the assembler can offer few warnings here.
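The rule for the overlapping 0x80 to 0xFF range can be summarized as a small decision function. The following Python sketch is a model of the addressing rules described above, not of any particular device; the function name, the mode strings, and the choice to raise an error for missing memory are assumptions made for illustration.

```python
def resolve(address: int, mode: str, iram_size: int = 256) -> str:
    """Say which memory an 8-bit internal address selects.

    mode is "direct" (e.g. mov a,0x80) or "indirect" (e.g. mov a,@r0).
    iram_size is 128 for 8051-class parts and 256 for 8052-class parts.
    """
    if not 0 <= address <= 0xFF:
        raise ValueError("internal addresses are 8 bits wide")
    if mode == "direct":
        # Direct addresses from 0x80 upwards always select an SFR.
        return "IRAM" if address < 0x80 else "SFR"
    if mode == "indirect":
        # Indirect addressing never reaches the SFRs.
        if address >= iram_size:
            raise ValueError("no internal data memory at this address")
        return "IRAM"
    raise ValueError("mode must be 'direct' or 'indirect'")


print(resolve(0x80, "direct"))    # the port 0 register
print(resolve(0x80, "indirect"))  # a data byte on 256-byte parts
```

The two calls use the same address and still reach different storage, which is exactly the mistake that is hard to spot in source code.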

To complicate things even more, there are two more special areas in the internal data memory. The first is the 0x00 to 0x1F range, which contains four register banks. The second is the 0x20 to 0x2F range, which is bit-addressable.

MCS-51 compatible micros have eight general-purpose registers named R0 to R7. To avoid, for example, pushing them on the stack in interrupt handlers, the registers are banked. There are 4 register banks, selected by two bits (RS0, RS1) in the status register PSW. The individual banks begin at 0x00, 0x08, 0x10, and 0x18. Since some instructions (such as PUSH, POP, ...) cannot take a register as their operand but only a direct address, almost all assemblers help the programmer with "absolute pseudoregisters" AR0 to AR7. However, these pseudoregisters do not follow the state of the RS0 and RS1 bits; the assembler infers the starting address of the bank from the pseudoinstruction .using 0 to 3. The .using directive also affects allocation of space for the registers by the linker. The effect of these pseudoinstructions on the generated code is shown below.

Example 1.4.
  38                 	.using	0
  39 008d C0 00       	push	ar0
  40 008f C0 07       	push	ar7
  41                 	.using	1
  42 0091 C0 08       	push	ar0
  43 0093 C0 0F       	push	ar7
  44                 	.using	2
  45 0095 C0 10       	push	ar0
  46 0097 C0 17       	push	ar7
  47                 	.using	3
  48 0099 C0 18       	push	ar0
  49 009b C0 1F       	push	ar7
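The addresses in Example 1.4 follow a simple rule: bank number times 8, plus the register number. The Python sketch below reproduces that arithmetic and contrasts it with the bank the hardware actually selects from PSW (RS0 is PSW bit 3, RS1 is PSW bit 4). The function names are hypothetical; the opcode 0xC0 for push is taken from the listing.

```python
PUSH_DIRECT = 0xC0  # opcode of "push direct", as seen in Example 1.4


def bank_from_psw(psw: int) -> int:
    """Bank the hardware uses for R0..R7: RS1:RS0 are PSW bits 4 and 3."""
    return (psw >> 3) & 0b11


def ar_address(using_bank: int, reg: int) -> int:
    """Direct address the assembler substitutes for ARn after '.using bank'."""
    if not 0 <= using_bank <= 3:
        raise ValueError("bank must be 0..3")
    if not 0 <= reg <= 7:
        raise ValueError("register must be 0..7")
    return using_bank * 8 + reg


def push_ar(using_bank: int, reg: int) -> bytes:
    """Machine code generated for 'push ARn' under the given .using bank."""
    return bytes([PUSH_DIRECT, ar_address(using_bank, reg)])


for bank in range(4):
    print(bank, push_ar(bank, 0).hex(" "), push_ar(bank, 7).hex(" "))
```

Because ar_address depends only on the .using value and never on PSW, the code is correct only while the bank named in .using is the one actually selected at run time.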

The register banks are followed by the bit-addressable portion of the internal memory, which ranges from 0x20 to 0x2F. Individual bits can be addressed only directly; the bit address is part of the instruction. As with the data memory, only the first 128 bit addresses (0x00 to 0x7F) refer to internal memory; the remaining 128 bit addresses access individual bits in the SFRs. When this memory area has to be accessed by bits and by bytes at the same time, it is useful to have a function that calculates the address of an individual bit from the address of a byte. MCS51-AS provides the built-in function B2B(byte, bit_offset_in_byte) for this purpose. The bit offset is not limited to 0..7, so B2B can also address multibyte variables: it is possible to define a 4-byte variable and access all of its bits using B2B.

Example 1.5
  51                 	    .equ	test1,0x20	;start of bit-addressable data area
  52                 	    .equ	test2,0x24	;4 bytes in bit-addressable area
  54 009d A2 00       	    mov C,B2B(test1, 0)         ;move bit 0 of variable test1 to Carry
  55 009f A2 0D       	    mov C,B2B(test1, 13)
  56 00a1 A2 1F       	    mov C,B2B(test1, 31)
  57 00a3 A2 20       	    mov C,B2B(test2, 0)
  58 00a5 A2 2D       	    mov C,B2B(test2, 13)
  59 00a7 A2 3F       	    mov C,B2B(test2, 31)

Absolute locations in the bit-addressable space were used for simplicity only. Usually, the memory is allocated by the linker.[1][2]
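The bit addresses generated in Example 1.5 are consistent with the formula (byte address - 0x20) * 8 + bit offset. The Python sketch below implements that formula; it is inferred from the listing and is not the assembler's actual implementation. The range checks are an added assumption.

```python
BIT_AREA_START = 0x20  # first byte of the bit-addressable area
BIT_AREA_END = 0x2F    # last byte of the bit-addressable area


def b2b(byte_address: int, bit_offset: int) -> int:
    """Bit address of bit `bit_offset`, counted from the byte at `byte_address`."""
    if not BIT_AREA_START <= byte_address <= BIT_AREA_END:
        raise ValueError("byte is outside the bit-addressable area")
    bit_address = (byte_address - BIT_AREA_START) * 8 + bit_offset
    if bit_offset < 0 or bit_address > 0x7F:
        raise ValueError("bit is outside the bit-addressable area")
    return bit_address


test1, test2 = 0x20, 0x24
for base in (test1, test2):
    print([hex(b2b(base, offset)) for offset in (0, 13, 31)])
```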

It would be very impractical to specify absolute addresses of variables using .equ. The assembler and linker provide convenient memory allocation instead. Memory for variables can be allocated in two ways. The first is to switch into the appropriate data segment with one of the .data pseudoinstructions and then reserve the space with one of the .ds pseudoinstructions. The second is to use a single pseudoinstruction that specifies the data segment, the variable name, and the size at the same time; this is one of the .comm or .common pseudoinstructions. The individual data segments, or rather portions of the internal data space, are distinguished by predefined segment names and by the corresponding variants of the .comm pseudoinstruction. The location of the individual segments in the internal data memory is controlled by the linker scripts; the default layout is shown in Table 1.1.

Table 1.1

From address | To address | Segment | Common directives | Note
0x08 (.using 0), 0x10 (.using 1), 0x18 (.using 2), 0x20 (.using 3) | max. 0xFF, usually max. 0x1F | .rdata | .rcomm/.rcommon | Unused part of registers, directly addressable
after .rdata, min. 0x20 | max. 0xFF, usually max. 0x2F | .bdata | .bcomm/.bcommon | Bit-addressable area (0x20...0x2F), addressed bytewise
after (.bdata - 0x20)*8 bitwise | max. 0x7F bitwise, max. 0x2F bytewise | .bitdata | .bitcomm/.bitcommon | Bit-addressable area (0x00...0x7F), addressed bitwise
after (.bdata + sizeof .bitdata in bytes), that is min. 0x20 | max. 0xFF, usually 0x7F | .data | .comm/.common | Directly addressable area (<0x80)
after .data, that is min. 0x20 | max. 0xFF | .idata | .icomm/.icommon | Indirectly addressable area

Example 1.6
  61                 	    .rdata
  62 0000 00          	var1:	.ds.b	1
  63                 	    .bdata
  64 0000 00          	var2:	.ds.b	1
  65                 	    .data
  66 0000 00          	var3:	.ds.b	1	;1-byte variable
  67 0001 00 00       	var4:	.ds.w	1	;2-byte variable
  68 0003 00 00 00 00 	var5:	.ds.s	1	;4-byte variable
  69 0007 00 00 00 00 	var6:	.ds.d	1	;8-byte variable
  69      00 00 00 00 
  70 000f 00 00       	var7:	.ds	1	;2-byte variable
  71                 	    .idata
  72 0000 00          	var8:	.ds.b	1
  74                 	.rcomm	var9, 1		;1-byte variable
  75                 	.bcomm	var10, 1
  76                 	.comm	var11, 1
  77                 	.icomm	var12, 1

         example.asm:62     .rdata:00000000 var1
         example.asm:64     .bdata:00000000 var2
         example.asm:66     .data:00000000 var3
         example.asm:67     .data:00000001 var4
         example.asm:68     .data:00000003 var5
         example.asm:69     .data:00000007 var6
         example.asm:70     .data:0000000f var7
         example.asm:72     .idata:00000000 var8
                            .rbss:00000001 var9
                            .bbss:00000001 var10
                            *COM*:00000001 var11
                            .ibss:00000001 var12

External data memory

Nowadays, external data memory is often part of the microcontroller itself, implemented as RAM or EEPROM. Individual portions of such built-in memory become accessible when enabled by special configuration bits, and then overlap the normal external data memory. This has to be taken into account when writing interrupt handlers. The MCS51-AS assembler and the linker allow the use of customized, user-defined memory segments; however, even the basic built-in support for external memory is aware of multiple memory spaces. There is the standard .xdata memory area and the new .edata and .eeprom areas. Variables can be declared independently in all of these areas just as in the .data section. Besides allocation using .ds, common variables can also be allocated using the directives .xcomm, .xcommon, .ecomm, and .ecommon. No equivalent of the .common directive is implemented for the .eeprom segment; because of the limited number and speed of writes, this segment is intended primarily for storing constants. Constants can be stored in all data segments (using the directives mentioned for the .text section); both the assembler and the linker can work with them.

The .common directives are appealing: variables can be defined at the point where they are actually used, unlike the usual method involving a special section at the beginning or end of the program. However, as the name "common" implies, these variables are shared. If a variable of the same name is defined in two parts of the program, the linker considers them to be a single variable. If a variable is instead declared by .ds in the appropriate section, no such silent merging occurs; a duplicate global name is reported as an error, as demonstrated by the following example.

Example 1.7
file www8051.lst
  12                 	    .data
  13 0000 00 00       	test1:	.ds	1
  14                 	.global test1

file testLED.lst
  11                 	    .data
  12 0000 00 00       	test1:	.ds	1
  13                 	.global test1

testLED.obj: In function `test1':
testLED.obj(.data+0x0): multiple definition of `test1'
www8051.obj(.data+0x0): first defined here
make: *** [www8051.o2] Error 1
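The difference between the two allocation styles can be modelled with a toy symbol table. The Python sketch below is not the real linker: the function link and its input format are hypothetical, and merging common symbols to the largest requested size (with a real definition taking precedence) is assumed here, following the conventional behaviour of GNU-style linkers.

```python
def link(objects):
    """Resolve data symbols from several object files.

    objects: list of (file_name, [(kind, symbol, size), ...]) where kind is
    "ds" (a real definition) or "comm" (a common symbol).
    Returns {symbol: size}; raises ValueError on a duplicate definition.
    """
    defined = {}  # symbol -> (file that defined it, size)
    common = {}   # symbol -> largest size requested so far
    for file_name, symbols in objects:
        for kind, symbol, size in symbols:
            if kind == "ds":
                if symbol in defined:
                    first = defined[symbol][0]
                    raise ValueError(
                        f"{file_name}: multiple definition of `{symbol}'; "
                        f"first defined in {first}"
                    )
                defined[symbol] = (file_name, size)
            elif kind == "comm":
                common[symbol] = max(common.get(symbol, 0), size)
            else:
                raise ValueError(f"unknown symbol kind: {kind}")
    result = dict(common)
    result.update({name: size for name, (_, size) in defined.items()})
    return result


# Two files declaring the same common variable share one allocation.
print(link([("www8051.obj", [("comm", "test1", 2)]),
            ("testLED.obj", [("comm", "test1", 2)])]))
```

Replacing both "comm" entries with "ds" makes link raise an error, mirroring the "multiple definition" message in Example 1.7.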

[1] However, neither the assembler nor the linker checks whether a bit variable is used in place of a byte variable. For example, mov a,B2B(test1, 0) generates no error although the usage is wrong: it tries to move a bit variable into a byte variable (an 8-bit register). What actually happens is that the accumulator is loaded with the contents of the byte variable whose address equals the address of the bit variable.

[2] Conversion of bytes to bits follows the MCS-51 convention. That is, if we list the B2B bit offsets of the 4-byte variable in the example in order of increasing byte address, with the most significant bit of each byte first, the result is the following sequence: 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24. Note that MCS-51 stores multi-byte numbers MSB first (at the lower address).

[3] For .comm variables, names derived from .bss are used as segment names. The linker first places the respective .data segment followed by the .bss into the appropriate memory area.

Sponsored by LPhard Ltd. Graphics by GIMP. Created by EasyPad.

(c)Copyright 2000 - 2002, HW server & Radek Benedikt,