Memory model of x51 compatible micros (8051, 8052, ...)
The program address space is the simplest one. The MCS-51 family can address directly, without special features, up to 64 kBytes of code. The GNU assembler MCS51-AS (used by the Web51 development environment) uses the standard .text segment for program memory. Normally, program code is generated into this segment. To store constants in this segment, the following commands can be used, depending on the required format of the constant:
Example 1.1
   1                   .text
   2
   3                   .equ symbol1, 0x4A
   4
   5 0000 4A           .byte 74
   6 0001 4A           .byte 0b01001010
   7 0002 4A           .byte 0112
   8 0003 4A           .byte 0x4A
   9 0004 4A           .byte 0X4a
  10 0005 4A           .byte 'J
  11 0006 4A           .byte 'J'
  12 0007 4A           .byte '\J
  13 0008 4A           .byte symbol1
  14 0009 00           .byte unknownsymbol
  15 000a 12 34        .word 4660
  16 000c 12 34        .word 0x1234
  17 000e 12 34 56 78  .int 02215053170
  18 0012 12 34 56 78  .int 305419896
  19 0016 12 34 56 78  .int 0x12345678
  20 001a 12 34 56 78  .long 0x12345678
  21 001e 12 34 56 78  .quad 0x123456789ABCDEF0
  21      9A BC DE F0
  22 0026 12 34 56 78  .octa 0x123456789ABCDEF0123456789abcdef0
  22      9A BC DE F0
  22      12 34 56 78
  22      9A BC DE F0

DEFINED SYMBOLS
  example.asm:3   *ABS*:0000004a symbol1

UNDEFINED SYMBOLS
  unknownsymbol
The example shows that the assembler can work with 8/16/32/64/128-bit constants written in a C-like convention: decimal (the number starts with a digit 1 through 9), octal (starts with 0), hexadecimal (starts with 0x), binary (starts with 0b), or ASCII character format. A symbolic name can be specified in place of a constant; it is replaced by its value during assembly or linking.
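The equivalences in Example 1.1 can be cross-checked outside the assembler. The following Python sketch is only an illustration and is not part of MCS51-AS; it assumes the C-like prefixes described above and the MSB-first byte order visible in the listing. The helper names parse_literal and big_endian are made up for this sketch.

```python
# Each spelling from Example 1.1, paired with the base implied by its prefix.
BYTE_LITERALS = {
    "74": 10,          # decimal: starts with a digit 1..9
    "0b01001010": 2,   # binary: starts with 0b
    "0112": 8,         # octal: starts with 0
    "0x4A": 16,        # hexadecimal: starts with 0x
    "0X4a": 16,        # prefix and digits are case-insensitive
}


def parse_literal(text, base):
    """Parse one literal; int() accepts the 0b/0x prefix when the base matches."""
    return int(text, base)


def big_endian(value, size):
    """Return the bytes of value, most significant byte first, as in the listing."""
    return value.to_bytes(size, "big")


values = [parse_literal(text, base) for text, base in BYTE_LITERALS.items()]
values.append(ord("J"))                       # the ASCII form 'J

word = big_endian(4660, 2)                    # .word 4660
dword = big_endian(int("02215053170", 8), 4)  # .int 02215053170
quad = big_endian(0x123456789ABCDEF0, 8)      # .quad 0x123456789ABCDEF0

print([hex(v) for v in values])
print(word.hex(" "), "|", dword.hex(" "), "|", quad.hex(" "))
```

Every entry in values comes out as 0x4A, and the multi-byte constants are laid out most significant byte first, matching the byte columns of the listing.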
The assembler can also embed string constants, for example error messages, into the program address space. Such constants can be automatically zero-terminated (using the .string or .asciz directive).
Example 1.2
  24 0036 4E 65 77 6C  .ascii "Newlinehere\12"
  24      69 6E 65 68
  24      65 72 65 0A
  25 0042 4E 65 77 6C  .ascii "Newlinehere\012"
  25      69 6E 65 68
  25      65 72 65 0A
  26 004e 4E 65 77 6C  .ascii "Newlinehere\x0A"
  26      69 6E 65 68
  26      65 72 65 0A
  27 005a 4E 65 77 6C  .ascii "Newlinehere\n"
  27      69 6E 65 68
  27      65 72 65 0A
  28 0066 54 72 61 69  .string "Trailingzero"
  28      6C 69 6E 67
  28      7A 65 72 6F
  28      00
  29 0073 54 72 61 69  .asciz "Trailingzero"
  29      6C 69 6E 67
  29      7A 65 72 6F
  29      00
  30 0080 08 0C 0A 0D  .ascii "\b\f\n\r\t\\\""
  30      09 5C 22
Text strings may contain special characters as in the "C" language, for example \n for newline. Other special characters can be specified using the backslash \ followed by the octal representation of the character, or the \x sequence followed by the hexadecimal value of the character. To insert a backslash into the string, enter it twice. To enter the double quote character, precede it with a backslash, as shown above.
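For the escape sequences used in Example 1.2, Python byte strings happen to follow the same C-like rules, so the listing can be reproduced with a short sketch. This is only a cross-check under that assumption, not a description of how MCS51-AS parses strings; zero_terminated and hexdump are hypothetical helper names.

```python
def zero_terminated(data):
    """Mimic .string/.asciz: the text followed by a single zero byte."""
    return data + b"\x00"


def hexdump(data):
    """Format bytes like the listing column, e.g. '4E 65 77'."""
    return " ".join(f"{byte:02X}" for byte in data)


# Four spellings of the same newline character (octal, octal, hex, symbolic).
newline_forms = [
    b"Newlinehere\12",
    b"Newlinehere\012",
    b"Newlinehere\x0A",
    b"Newlinehere\n",
]

escapes = b"\b\f\n\r\t\\\""          # same text as listing line 30
terminated = zero_terminated(b"Trailingzero")

print(hexdump(newline_forms[0]))
print(hexdump(terminated))
print(hexdump(escapes))
```

All four newline spellings produce identical bytes ending in 0A, and the zero-terminated string is one byte longer than its text.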
Floating-point constants are just as easy to enter into the code. MCS51-AS uses the IEEE floating point format.
Example 1.3
  31 0071 40 49 0F D8  .float 3.141592
  32 0075 40 49 0F D8  .float 3141592E-6
  33 0079 40 49 0F D8  .single 3141592E-6
  34 007d 40 09 21 FA  .double 3141592E-6
  34      FC 8B 00 7A
  35 0085 40 49 0F D8  .float 0f3141592E-6
  36 0089 40 49 0F D8  .float 0d3141592E-6
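The byte patterns in Example 1.3 are ordinary IEEE 754 single and double precision encodings stored most significant byte first. A minimal Python sketch using the standard struct module reproduces them; it is a cross-check only, and the function names are made up for this illustration.

```python
import struct


def ieee_single(value):
    """Big-endian IEEE 754 single precision, as emitted by .float/.single."""
    return struct.pack(">f", value)


def ieee_double(value):
    """Big-endian IEEE 754 double precision, as emitted by .double."""
    return struct.pack(">d", value)


plain = ieee_single(3.141592)                    # .float 3.141592
exponent = ieee_single(float("3141592E-6"))      # .float 3141592E-6
double = ieee_double(float("3141592E-6"))        # .double 3141592E-6

print(plain.hex(" "))
print(exponent.hex(" "))
print(double.hex(" "))
```

Both spellings of the constant encode to the same four bytes, which is why lines 31 to 33 of the listing are identical.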
Data storage is much more complicated. MCS-51 series microcontrollers have two data areas. The first is the internal data memory, ranging from 128 bytes in the smallest devices (8031, 8051, ...) to 256 bytes in the largest ones (8052, ...). The second is the external data memory, which can occupy up to 64 kBytes. The internal and external memory areas are accessed by separate instructions and do not overlap. However, it is possible to make the external data memory and the external program memory overlap using a simple hardware trick.
Internal data memory
The internal data memory is partially overlapped by the Special Function Registers (SFRs). The first micros of the MCS-51 series had only 128 bytes of internal data memory, so the SFRs were placed into the remaining 128 locations reachable with an 8-bit address. For example, the data register of port 0 is located at address 0x80. Unlike the internal data memory, SFRs cannot be addressed indirectly. That is, they cannot be accessed with the address held in register R0 (mov a,@r0 or mov @r0,a) or R1 (mov a,@r1 or mov @r1,a); the only allowed method is direct access (e.g. mov a,0x80, mov a,p0, or anl p0,#0b10101010). Conversely, the upper half of the internal data memory (0x80 to 0xFF) in the larger micros can be reached only indirectly, so indirect addressing is the only method that covers the entire internal memory area. Hunting for errors caused by direct addressing of the upper range of the internal memory is a common nightmare of many programmers. Unfortunately, the assembler can provide few warnings here.
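The overlap between the upper internal RAM and the SFRs is easier to see in a toy model. The Python sketch below is an illustration only: it assumes an 8052-style part with 256 bytes of internal RAM, and the class and method names are invented for the example.

```python
class InternalMemory:
    """Toy model of the internal data space: 256 bytes of RAM plus the SFRs."""

    def __init__(self):
        self.ram = bytearray(256)   # 0x00..0xFF; upper half is indirect-only
        self.sfr = bytearray(128)   # SFRs, visible at direct addresses 0x80..0xFF

    def write_direct(self, addr, value):
        """Models e.g. mov 0x80,a: addresses from 0x80 up select an SFR."""
        if addr < 0x80:
            self.ram[addr] = value
        else:
            self.sfr[addr - 0x80] = value

    def read_direct(self, addr):
        """Models e.g. mov a,0x80."""
        return self.ram[addr] if addr < 0x80 else self.sfr[addr - 0x80]

    def write_indirect(self, addr, value):
        """Models mov @r0,a with r0 = addr: always internal RAM."""
        self.ram[addr] = value

    def read_indirect(self, addr):
        """Models mov a,@r0 with r0 = addr."""
        return self.ram[addr]


mem = InternalMemory()
mem.write_direct(0x80, 0xAA)     # lands in the port 0 data register
mem.write_indirect(0x80, 0x55)   # lands in RAM byte 0x80, a different cell
mem.write_direct(0x30, 0x11)     # below 0x80 both modes see the same byte

print(hex(mem.read_direct(0x80)), hex(mem.read_indirect(0x80)))
print(hex(mem.read_indirect(0x30)))
```

The same numeric address 0x80 yields two different values depending on the addressing mode, which is exactly the kind of mistake that is hard to spot in source code.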
To complicate things even more, the internal data memory contains two more special areas. The first is the 0x00 to 0x1F range, which holds four register banks. The second is the 0x20 to 0x2F range, which is bit-addressable.
The MCS-51 compatible micros use eight general-purpose registers named R0 to R7. To avoid, for example, pushing them on the stack in interrupt handlers, the MCS-51 micros use banking. There are 4 register banks, selected by two bits (RS0, RS1) in the status register PSW. The individual banks begin at 0x00, 0x08, 0x10, and 0x18. Since some instructions (such as PUSH, POP, ...) cannot take a register as an operand but only an absolute address, almost all assemblers try to help the programmer with "absolute pseudoregisters" AR0 to AR7. However, these pseudoregisters do not reflect the actual state of the RS0 and RS1 bits; their bank starting address is taken from the pseudoinstruction .using 0 to .using 3. The .using directive also affects allocation of space for the registers by the linker. The effect of these pseudoinstructions on the generated code is shown below.
Example 1.4
  38                   .using 0
  39 008d C0 00        push ar0
  40 008f C0 07        push ar7
  41                   .using 1
  42 0091 C0 08        push ar0
  43 0093 C0 0F        push ar7
  44                   .using 2
  45 0095 C0 10        push ar0
  46 0097 C0 17        push ar7
  47                   .using 3
  48 0099 C0 18        push ar0
  49 009b C0 1F        push ar7
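The addresses in Example 1.4 follow a simple rule: the bank number times eight, plus the register number. The Python sketch below reproduces the listing and also shows how the bank selected at run time is read from PSW (RS0 is PSW bit 3, RS1 is PSW bit 4). It is an illustration with invented function names, not the assembler's implementation.

```python
PUSH_DIRECT = 0xC0   # opcode of "push direct", as seen in the listing


def ar_address(using, reg):
    """Absolute address of ARn for the bank declared with .using."""
    if not 0 <= using <= 3 or not 0 <= reg <= 7:
        raise ValueError("bank must be 0..3 and register 0..7")
    return using * 8 + reg


def encode_push_ar(using, reg):
    """Bytes generated for 'push arN' under a given .using value."""
    return bytes([PUSH_DIRECT, ar_address(using, reg)])


def active_bank(psw):
    """Bank actually selected at run time by RS1:RS0 (PSW bits 4 and 3)."""
    return (psw >> 3) & 0b11


for bank in range(4):
    print(bank, encode_push_ar(bank, 0).hex(" "), encode_push_ar(bank, 7).hex(" "))

# The assembler cannot see PSW: with .using 0 in the source but bank 1 active,
# 'push ar0' still pushes address 0x00, not the live R0 at 0x08.
declared, psw = 0, 0x08
print(ar_address(declared, 0), ar_address(active_bank(psw), 0))
```

The last two lines illustrate why .using has to be kept in step with the code that actually changes RS0 and RS1.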
Registers are followed by the bit-addressable portion of the internal memory. Individual bits can be addressed only directly; the bit address is a part of the instruction. Similarly to the data memory, bit addressing uses only the first 128 bit addresses for memory; the remaining 128 bit addresses are used to access individual bits in the SFRs. The bit-addressable memory ranges from 0x20 to 0x2F. If this memory area needs to be accessed by bits and by bytes at the same time, it is useful to have a function that calculates the address of an individual bit from a given byte address. MCS51-AS provides the built-in function B2B(byte, bit_offset_in_byte) for this purpose. The bit offset is not limited to 0..7, so B2B can address multibyte variables. It is possible to define a 4-byte variable and access all of its bits using B2B.

Example 1.5
  51                   .equ test1,0x20 ;start of bit-addressable data area
  52                   .equ test2,0x24 ;4 bytes in bit-addressable area
  53
  54 009d A2 00        mov C,B2B(test1, 0) ;move bit 0 of variable test1 to Carry
  55 009f A2 0D        mov C,B2B(test1, 13)
  56 00a1 A2 1F        mov C,B2B(test1, 31)
  57 00a3 A2 20        mov C,B2B(test2, 0)
  58 00a5 A2 2D        mov C,B2B(test2, 13)
  59 00a7 A2 3F        mov C,B2B(test2, 31)
Absolute locations in the bit-addressable space were used for simplicity only. Usually, the memory is allocated by the linker.
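The bit addresses generated in Example 1.5 are consistent with a linear mapping from the byte address 0x20 upward. The Python sketch below models that mapping; the range check is an assumption added for the illustration (the source does not say whether the real B2B checks its arguments), and b2b and bit_location are names invented here.

```python
BIT_AREA_START = 0x20    # first byte of the bit-addressable area
LAST_MEMORY_BIT = 0x7F   # bit addresses above this belong to the SFRs


def b2b(byte_addr, bit_offset):
    """Model of B2B(byte, bit_offset): bit address counted from byte_addr."""
    bit_addr = (byte_addr - BIT_AREA_START) * 8 + bit_offset
    if byte_addr < BIT_AREA_START or not 0 <= bit_addr <= LAST_MEMORY_BIT:
        raise ValueError("bit lies outside the bit-addressable area")
    return bit_addr


def bit_location(bit_addr):
    """Inverse mapping: (byte address, bit number within that byte)."""
    return BIT_AREA_START + bit_addr // 8, bit_addr % 8


test1, test2 = 0x20, 0x24
for base in (test1, test2):
    print([f"{b2b(base, offset):02X}" for offset in (0, 13, 31)])

print(bit_location(b2b(test2, 13)))
```

Offset 13 from test2 lands in the byte at 0x25, bit 5, which shows how an offset above 7 simply continues into the following bytes.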
It would be very impractical to specify absolute addresses of variables using .equ. The assembler and linker allow comfortable memory allocation. Two ways can be used to allocate memory for variables. The first one involves switching into the appropriate data segment with one of the .data pseudoinstructions and then declaring the space with one of the .ds pseudoinstructions. The other method is to use a single pseudoinstruction that specifies the data segment, variable name and size at the same time. This would be one of the .comm or .common pseudoinstructions. Individual data segments, or rather portions of the internal data space, are distinguished by the predefined segment names and the corresponding variants of the .comm pseudoinstruction. The location of individual segments in the internal data memory is controlled by the linker scripts; the standard locations are listed in the following table.
| From address | To address | Segment | Note |
| 0x08 (.using 0), 0x10 (.using 1), 0x18 (.using 2), 0x20 (.using 3) | max. 0xFF, usually max. 0x1F | .rdata .rcomm/.rcommon | Unused part of registers, directly addressable |
| after .rdata, min. 0x20 | max. 0xFF, usually max. 0x2F | .bdata .bcomm/.bcommon | Bit-addressable area (0x20...0x2F), addressed bytewise |
| after (.bdata - 0x20)*8 bitwise | max. 0x7F bitwise, max. 0x2F bytewise | .bitdata .bitcomm/.bitcommon | Bit-addressable area (0x00...0x7F), addressed bitwise |
| after (.bdata + sizeof .bitdata in bytes), that is min. 0x20 | max. 0xFF, usually 0x7F | .data .comm/.common | Directly addressable area (<0x80) |
| after .data, that is min. 0x20 | max. 0xFF | .idata .icomm/.icommon | Indirectly addressable area |
Example 1.6
  61                   .rdata
  62 0000 00           var1: .ds.b 1
  63                   .bdata
  64 0000 00           var2: .ds.b 1
  65                   .data
  66 0000 00           var3: .ds.b 1 ;1-byte variable
  67 0001 00 00        var4: .ds.w 1 ;2-byte variable
  68 0003 00 00 00 00  var5: .ds.s 1 ;4-byte variable
  69 0007 00 00 00 00  var6: .ds.d 1 ;8-byte variable
  69      00 00 00 00
  70 000f 00 00        var7: .ds 1 ;2-byte variable
  71                   .idata
  72 0000 00           var8: .ds.b 1
  73
  74                   .rcomm var9, 1 ;1-byte variable
  75                   .bcomm var10, 1
  76                   .comm var11, 1
  77                   .icomm var12, 1

DEFINED SYMBOLS
  example.asm:62  .rdata:00000000 var1
  example.asm:64  .bdata:00000000 var2
  example.asm:66  .data:00000000 var3
  example.asm:67  .data:00000001 var4
  example.asm:68  .data:00000003 var5
  example.asm:69  .data:00000007 var6
  example.asm:70  .data:0000000f var7
  example.asm:72  .idata:00000000 var8
                  .rbss:00000001 var9
                  .bbss:00000001 var10
                  *COM*:00000001 var11
                  .ibss:00000001 var12
External data memory is nowadays often part of the microcontroller, implemented as RAM or EEPROM. Individual portions of such a built-in memory become accessible when enabled by special configuration bits, and then overlap the normal external data memory. This has to be taken into account when writing interrupt handlers. The MCS51-AS assembler and the linker allow the use of customized, user-defined memory segments; however, even the basic built-in support for external memory is aware of multiple memory spaces. There is the standard .xdata memory area and the new .edata and .eeprom areas. Variables can be declared independently in all of these areas just as in the .data section. Besides allocation using .ds, common variables can also be allocated using the commands .xcomm, .xcommon, .ecomm, .ecommon. In the .eeprom segment, no equivalent of the .common command is implemented; due to the limited number and speed of writes to this segment, it is intended primarily for storing constants. Constants can be stored in all data segments (using the commands mentioned for the .text section); both the assembler and the linker can work with them.
Use of the .common commands is appealing: variables can be defined at the point where they are actually used, unlike the usual method involving a special section at the beginning or end of the program. However, as the name "common" implies, these variables are shared. If a variable of the same name is defined in two parts of the program, the linker considers them to be a single variable. If a variable is declared by .ds in the appropriate section, no such overlap occurs; the linker reports an error instead, as demonstrated by the following example.
Example 1.7
file www8051.lst
  12                   .data
  13 0000 00 00        test1: .ds 1
  14                   .global test1

file testLED.lst
  11                   .data
  12 0000 00 00        test1: .ds 1
  13                   .global test1

testLED.obj: In function `test1':
testLED.obj(.data+0x0): multiple definition of `test1'
www8051.obj(.data+0x0): first defined here
make: *** [www8051.o2] Error 1
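The difference between common variables and .ds definitions can be sketched with a toy symbol resolver. The Python code below is a simplified model, not the actual linker: the data layout and names are invented, and keeping the larger size when common symbols merge is an assumption borrowed from the usual handling of common symbols, which the text above does not confirm for this linker.

```python
class MultipleDefinition(Exception):
    """Raised when two objects both define the same global symbol."""


def link(objects):
    """Resolve symbols from (object_name, definitions, commons) triples.

    definitions and commons map symbol names to sizes in bytes.
    """
    defined = {}   # symbol -> name of the object that defined it
    common = {}    # symbol -> size of the single shared variable
    for obj_name, definitions, commons in objects:
        for symbol in definitions:
            if symbol in defined:
                raise MultipleDefinition(
                    f"{obj_name}: multiple definition of `{symbol}'; "
                    f"first defined in {defined[symbol]}"
                )
            defined[symbol] = obj_name
        for symbol, size in commons.items():
            # Same name in several objects: one shared variable.
            common[symbol] = max(common.get(symbol, 0), size)
    return defined, common


# Two files declaring test1 with .comm end up sharing one variable.
_, shared = link([
    ("www8051.obj", {}, {"test1": 2}),
    ("testLED.obj", {}, {"test1": 2}),
])
print(shared)

# Two files defining test1 with .ds collide, as in Example 1.7.
try:
    link([
        ("www8051.obj", {"test1": 2}, {}),
        ("testLED.obj", {"test1": 2}, {}),
    ])
    collision = None
except MultipleDefinition as error:
    collision = str(error)
print(collision)
```

The first call silently merges the two declarations, which is convenient but also means that an accidental name clash between unrelated variables goes unnoticed.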
However, neither the assembler nor the linker checks whether a bit variable is used in place of a byte variable. For example, mov a,B2B(test1, 0) generates no errors although the usage is wrong: it tries to move a bit variable into a byte variable (an 8-bit register). What actually happens is that the contents of the byte variable located at the address equal to the bit address are moved to the accumulator.
Conversion of bytes to bits corresponds to the MCS-51 convention. That is, if we number the bits of the 4-byte variable in the example, the result is the following sequence: 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24. MCS-51 stores multi-byte numbers MSB first (at the lower address), so bit offsets 0 to 7 address the most significant byte of a multi-byte number.
For .comm variables, names derived from .bss are used as segment names. The linker first places the respective .data segment, followed by the corresponding .bss segment, into the appropriate memory area.
(c) Copyright 2000 - 2002, HW server & Radek Benedikt