6502 EX (Extended) [architecture]


The 6502EX is a standard Von Neumann architecture able to address 4Gbyte of memory in both 8bit and 32bit systems. In 32bit systems, 8, 16 and 32bit access modes are allowed.

From a logical point of view, the 6502EX is designed as an extension of a standard 8bit core. A new set of registers and a new set of instructions add 32bit RISC-like capabilities that improve the performance of the original 8bit core. The architecture supports seamless integration between the legacy registers and the new registers, as well as between legacy instructions and new instructions.

Instruction Set Architecture

The resulting instruction set can be seen as the merging of the legacy ISA (hereafter referred to as ISA8) and the new ISA (hereafter referred to as ISA32).

Fig1: Extended Instruction Set Architecture

The small overlap between ISA8 and ISA32 represents the few legacy instructions that are implemented differently in order to support the 6502EX architecture. These modified legacy instructions do not change their functional meaning or usage for the programmer.

The new instructions (ISA32) fall into four groups:

1. Load/Store: five basic addressing modes are introduced. Some of them allow data to be accessed in 8, 16 and 32bit mode:

  • register
  • register with offset
  • register post-decremented
  • absolute
  • immediate (load only)

2. Logic & arithmetic: two basic addressing modes are introduced:

  • register-register
  • register-immediate

In addition to the standard logic and arithmetic instructions, this category includes signed and unsigned 32bit multiplications and multiplications with accumulation. All the instructions can be used in a vectored form thanks to a new 32bit ALU that can be seen as four 8bit elementary units, two 16bit elementary units or a single 32bit unit, allowing parallelism in algorithm computation.

3. Jumps: this category includes new instructions that perform program jumps (and segment switches) by specifying either the whole 32bit address in a register, as if memory were a linear 4Gbyte space, or a relative offset inside the segment:

  • Jump indirect
  • Jump indirect with link
  • Conditional relative jump (+/- 1Kbyte)
  • Unconditional relative jump (+/- 16Kbyte)

4. Miscellaneous: this category collects special instructions of different types:

  • Move data from register bank to system register bank and vice versa
  • Copy 32bit registers into the Legacy Registers and vice versa
  • Manage vectored flags
  • Coprocessor instruction
  • System registers bit clear and reset

Memory organization

The 6502EX can address up to 4Gbyte of memory, physically implemented using an 8bit or 32bit memory system.

From a logical point of view, the memory can be considered “segmented”, since a jump instruction is needed to move from one segment to another. With the new jump modes, the segment and the offset into the segment (that is, the full 32bit address) can be specified in a single instruction. With legacy jump instructions, segment switching is achieved by pre-charging a dedicated register with the new segment and then performing a legacy jump into that segment.

Each segment is 64Kbyte large, and there are 64K segments in the whole memory. The zero page and the stack can be remapped anywhere in the 64Kbyte segment. Every segment can have its own zero page and its own stack page, so the programmer can write applications with multiple zero pages and multiple stacks, enabling context-dependent mechanisms or the virtualization of multiple 6502 machines.
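As a rough illustration of the segment:offset view described above, the following Python sketch builds and splits a 32bit address. The function and constant names are hypothetical and are not part of the 6502EX documentation.

```python
SEGMENT_SIZE = 1 << 16   # 64Kbyte per segment
SEGMENT_COUNT = 1 << 16  # 64K segments, 4Gbyte in total


def linear_address(segment: int, offset: int) -> int:
    """Combine a 16bit segment and a 16bit offset into a 32bit address."""
    if not 0 <= segment < SEGMENT_COUNT:
        raise ValueError("segment out of range")
    if not 0 <= offset < SEGMENT_SIZE:
        raise ValueError("offset out of range")
    return (segment << 16) | offset


def split_address(address: int) -> tuple:
    """Return (segment, offset) for a 32bit address."""
    return (address >> 16) & 0xFFFF, address & 0xFFFF
```

For example, offset 0200 in segment 0001 would correspond to the linear address 00010200.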

Fig.2: Segmented Memory


Interrupts

The 6502EX supports non-maskable and maskable interrupts with the same timing as the original 6502.

Upon an interrupt, the 6502EX jumps to the memory location indicated by the interrupt table below, which is located in segment zero (the first segment of the whole memory space) at the following addresses:

Tab1: Interrupt Vector

The 6502EX executes the handler whose address is stored in the proper interrupt vector (see Tab1), inside the segment indicated by a specific system register (see the System Registers section below for details). This way, programmers can adopt context-specific handlers for their applications.

The 6502EX supports two new interrupt vectors, located at FFF6, FFF7 and at FFF8, FFF9:

  • the first is used when the COP instruction (coprocessor instruction) is executed to send a command to an external coprocessor, but the addressed coprocessor is not connected to the 6502EX. In this case coprocessor emulation is possible by storing the vector of the emulation routine in locations FFF6 and FFF7.
  • the second is used to serve the interrupt generated by the coprocessor when it finishes executing a task or an instruction.
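A minimal sketch of how a handler address could be formed from the rules above: the 16bit vector is read from segment zero and combined with the segment held in the higher 16bits of the system register that selects the handler segment. The dict-based memory model, the names and the low-byte-first order of the vector are assumptions made for illustration; the byte order is that of the original 6502 and is not confirmed here for the two new vectors.

```python
COP_MISSING_VECTOR = 0xFFF6  # coprocessor not connected (emulation entry)
COP_DONE_VECTOR = 0xFFF8     # coprocessor finished a task or instruction


def handler_address(memory: dict, vector: int, handler_segment_reg: int) -> int:
    """Return the 32bit address of the handler for the given vector."""
    # The table lives in segment zero, so the linear address is the vector itself.
    low = memory[vector]
    high = memory[vector + 1]
    offset = (high << 8) | low  # assumed little-endian, as on the 6502
    segment = (handler_segment_reg >> 16) & 0xFFFF
    return (segment << 16) | offset
```

With the bytes 00 and 80 stored at FFF6 and FFF7 and segment 0005 selected, the emulation routine would start at 00058000.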

System and User stacks

The 6502EX supports two different stack implementations: the 6502 legacy stack and a RISC-like software stack.

The 6502 legacy stack is always used by the 6502EX to manage interrupts and has a maximum size of 256byte (or 64 words). When an interrupt arises, the 6502EX saves the lower 16bits of the program counter (the legacy PC) and the Processor Status Register (the legacy PSR) onto the stack, as the original 6502 does. Once the return address and the processor status are saved, the handler can, if required, also save the return memory segment onto the stack using new dedicated instructions. The legacy stack is referred to as the System Stack.

The software stack, instead, has been designed to manage jumps to subroutine efficiently in 32bit mode. It can have a maximum size of 64Kbyte (or 16Kwords) and allows 32bit data to be saved and restored in a single access. In this case new load/store instructions are required, and a 32bit memory system is mandatory to support 32bit data accesses to the stack. The software stack is referred to as the User Stack.

In an 8bit system, the System Stack can also be used to manage return addresses for jumps to subroutine, replacing the User Stack. In this case the System Stack and the User Stack coincide.

Thanks to memory segmentation, an application can implement multiple System Stacks and multiple User Stacks.

Vectored ALU

The 6502EX implements a four-way ALU able to perform a single 32bit operation, two 16bit operations in parallel or four 8bit operations in parallel. This makes it easier to implement parallelism in a program.

The figure below represents a conceptual high level scheme of the vectored ALU.

Fig.3: vectored ALU

Each elementary 8bit part of the ALU has its own set of flags (N, C, Z, V). When four 8bit calculations are executed in parallel, four sets of flags are updated. Conditional branches can be taken upon the status of each set of flags.

These sets of flags are contained in sixteen bits of the Actual Status Register (ASR).

Thanks to the vectored ALU, reliable or fault-tolerant applications can be implemented easily by duplicating each data operand (eight or sixteen bits) within the same 32bit register and performing a vectored operation (e.g. an add). A specific bit in the ASR register flags whether all the results (on eight or sixteen bits) are equal. In case of a permanent or transient fault during the operation, the results will not be the same and the 6502EX, if properly enabled, will generate an interrupt, allowing an external supervisor to take action and recover from the fault.
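The behaviour described above can be modelled in software as follows. This is only a sketch: it assumes that lanes do not propagate carries into each other, that the add has no carry-in, and that the "all results equal" check is a plain comparison of the lanes. All names are hypothetical.

```python
def vector_add(a: int, b: int, lane_bits: int = 8):
    """Add two 32bit words lane by lane; return (result, flags per lane)."""
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane_bits must be 8, 16 or 32")
    mask = (1 << lane_bits) - 1
    sign = 1 << (lane_bits - 1)
    result = 0
    flags = []  # index 0 is the least significant lane
    for shift in range(0, 32, lane_bits):
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        total = x + y
        r = total & mask
        flags.append({
            "N": bool(r & sign),
            "C": total > mask,
            "Z": r == 0,
            # signed overflow: operands share a sign that the result lacks
            "V": bool(~(x ^ y) & (x ^ r) & sign),
        })
        result |= r << shift
    return result, flags


def replicate(value: int, lane_bits: int = 8) -> int:
    """Duplicate one operand into every lane of a 32bit word."""
    mask = (1 << lane_bits) - 1
    word = 0
    for shift in range(0, 32, lane_bits):
        word |= (value & mask) << shift
    return word


def lanes_agree(word: int, lane_bits: int = 8) -> bool:
    """True when every lane of the word holds the same value."""
    mask = (1 << lane_bits) - 1
    lanes = [(word >> s) & mask for s in range(0, 32, lane_bits)]
    return all(lane == lanes[0] for lane in lanes)
```

In the fault-tolerant scheme, both operands would be passed through `replicate` before the add, and a result for which `lanes_agree` is false would correspond to the condition that raises the interrupt.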

Register Bank

The main register bank includes nineteen 32bit registers, classified as follows:

  • LR (Legacy Register)
  • R0-R15 (R0-R7: General Purpose, R8-R15: Special)
  • ASR, SSR (Actual and Saved Status Registers)

Fig.4 a) 6502EX full register set b) Legacy registers mapping

The first register is the Legacy Register (LR). It collects the full “register bank” of the original 6502 processor: the Accumulator (A) and the two Index registers (X and Y). This is represented in Fig.4b. The CPL (Copy Legacy) instruction copies the content of LR into the other 32bit registers and vice versa. For this reason LR should be considered the “bridge” between the legacy registers and the new registers. For example, the content of a 32bit register can be moved into LR with a single instruction; then, by means of legacy 6502 instructions acting on A, X and Y, the three bytes in LR can be pushed onto the System Stack or used for generic computation. Conversely, data originally stored in A, X and Y can be moved into a 32bit register with a single instruction. The content of that register can then be saved onto the 32bit User Stack or used for generic computation as well.

After the Legacy Register, sixteen 32bit registers named R0-R15 are available to the application. This is the real register bank of the 6502EX. This set can be divided into two subsets:
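The bridge role of LR can be pictured with a small packing sketch. The byte positions used here (A in bits 7:0, X in bits 15:8, Y in bits 23:16) are purely an assumption for illustration; the actual mapping is the one given in Fig.4b. The function names are hypothetical.

```python
def pack_lr(a: int, x: int, y: int) -> int:
    """Place A, X and Y into one 32bit word (assumed byte positions)."""
    return (a & 0xFF) | ((x & 0xFF) << 8) | ((y & 0xFF) << 16)


def unpack_lr(lr: int) -> tuple:
    """Recover (A, X, Y) from the word built by pack_lr."""
    return lr & 0xFF, (lr >> 8) & 0xFF, (lr >> 16) & 0xFF
```

Under this assumed layout, copying a 32bit register into LR makes three of its bytes visible to legacy code as A, X and Y.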

  • R0-R7: General Purpose Registers (to manipulate data)
  • R8-R15: Special Registers (to manipulate memory segmentation)

The new instructions can be applied to the full set of registers without distinction; the only difference is that 8bit and 16bit operations are allowed only when R0-R7 are used as destination registers. This way, 8bit or 16bit loads (stores) are possible using R0-R7 as destination (source) registers. Similarly, 8bit or 16bit vectored computations are possible using R0-R7 as destination registers. R8-R15 support only 32bit operations. Of course, 32bit operations are allowed on the full set of registers.

The following figure clarifies this concept when applied to load/store instructions.

Fig.5 a) R0-R7 as destination (load) or source (store) b) R8-R15 as destination (load) or source (store)
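The width rule above fits in a few lines. This is a sketch of the rule as stated in the text, with a hypothetical function name; registers are identified by their index 0-15.

```python
def access_allowed(register: int, width_bits: int) -> bool:
    """Check whether Rn may be the destination (load) or source (store)."""
    if not 0 <= register <= 15:
        raise ValueError("register must be R0-R15")
    if width_bits not in (8, 16, 32):
        raise ValueError("width must be 8, 16 or 32")
    # 32bit works everywhere; 8bit and 16bit only on R0-R7.
    return width_bits == 32 or register <= 7
```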

Looking at Fig.4 b, a special meaning is assigned to registers R10-R15. They are the principal registers used to manage memory segmentation.

Fig.6: Special Registers

Here is a quick overview of these registers:

  • CS (Code Segment): this register is typically used to enable segment switching with the legacy 6502 instructions JMP and JSR. When used for this purpose, only the higher 16bits are significant. CS shall be pre-charged with the memory segment the program shall jump to; then, upon JSR or JMP invocation, the higher 16bits of the 32bit program counter are loaded with the value of CS[31:16] and the program continues in the new segment. When the new jump instructions with segment switch are used, the new segment and its offset are specified directly in the instruction and pre-charging CS is not needed. If CS is not used for segment switching, it can be used as a general purpose register.
  • DS (Data Segment): this register contains the value of the segment where data are read and written by the processor. Only the higher 16bits are significant. This way, the programmer can separate the code from the data and map them in different segments. To change the data segment, just charge a new value into DS. The next data access will use the new segment. If the DS register is not used, it can be used as a general purpose register.
  • ZS (Zero Page Segment): this register contains the value of the segment, and of the page inside that segment, where the zero page shall be mapped. Only the higher 24bits are significant (16bits for the segment, 8bits for the page within the segment). To change the zero page address, just charge a new value into ZS. The next zero page access will use the new segment and/or the new page. If the ZS register is not used, it can be used as a general purpose register.
  • SS (User Stack Segment): this register contains the value of the segment where the User Stack is mapped. Only the higher 16bits are significant. To change the User Stack segment, just charge a new value into SS. The User Stack will then be located in the specified segment. If the SS register is not used, it can be used as a general purpose register.
  • LA (Link Address): this register is automatically updated upon invocation of the legacy JSR and of the new JL instruction (jump to subroutine), as well as during interrupt processing. In all these cases the 32bit return address is copied into the LA register. In order to manage re-entrant routines, the value of LA shall be saved onto the stack. If the System Stack is used as the destination stack, the LR register shall be used to copy the value of LA into the legacy registers.
  • GPC (Global Program Counter): this register contains the full 32bit memory address used by the 6502EX to fetch instructions. The higher 16bits represent the active segment, the lower 16bits represent the offset inside the segment where the next instruction is fetched.
  • ASR (Actual Status Register): this register contains the legacy Processor Status Register (PSR) in the rightmost eight bits (ASR[7:0]) and the four sets of vectored flags in the field ASR[23:8].

Fig.7: Actual Status Register

When the vectored ALU is used, the four sets of flags are updated accordingly. Each flag in the legacy PSR is the logic OR of the corresponding bits in the vectored flags field: e.g. the PSR carry bit is the logic OR of the other carry bits contained in ASR[23:16]. One, two or four carry bits are involved, according to the type of vectored operation (32, 16 or 8 bit). This way the programmer can detect whether at least one operation generated a carry and, if so, look for which one.

  • SSR (Saved Status Register): this register is loaded with the content of ASR when an interrupt is processed. The interrupt handler is responsible for saving this register onto the User Stack if the program requires it. The PSR is always saved automatically onto the System Stack.
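The OR rule that links the vectored flags to the legacy PSR flags can be sketched as follows. To avoid guessing the bit layout of ASR, each lane's flags are modelled as a dictionary; this representation and the function names are assumptions for illustration only.

```python
FLAG_NAMES = ("N", "C", "Z", "V")


def legacy_flags(lane_flags: list) -> dict:
    """OR each flag across the lanes taking part in the operation."""
    return {name: any(lane[name] for lane in lane_flags) for name in FLAG_NAMES}


def lanes_with_carry(lane_flags: list) -> list:
    """Indices of the lanes whose carry flag is set."""
    return [i for i, lane in enumerate(lane_flags) if lane["C"]]


# Four 8bit lanes; only lane 2 produced a carry.
example = [
    {"N": False, "C": False, "Z": True, "V": False},
    {"N": True, "C": False, "Z": False, "V": True},
    {"N": False, "C": True, "Z": True, "V": False},
    {"N": False, "C": False, "Z": False, "V": False},
]
```

A program would first test the single legacy carry flag and inspect the individual lanes only when it is set.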

System Registers

The 6502EX can support a system register space where up to 128 optional registers can be connected. These registers can be used as GPIO to program or control peripherals. Dedicated instructions move the content of registers into system registers and vice versa.

The first two system registers (SR0 and SR1) located at address x00 and x01 are mandatory and fundamental for the functionality of the processor.

Fig.8: SR0 and SR1

  • SR0 (System Register 0): this register contains the legacy stack pointer SP in the rightmost byte. The higher 16bits store the segment containing the System Stack, and the second byte from the right stores the page where the System Stack shall be mapped.
  • SR1 (System Register 1): this register contains the segment used by the processor to complete the 32bit address of the interrupt vector (see Tab1). The segment is stored in the higher 16bits. This way, context-dependent handlers are enabled by changing the value in the SR1 register.

Specific instructions such as MVS (Move System Register) allow moving data between the register bank and the system registers.
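The SR0 layout described above can be decoded as in the sketch below. With this layout, the whole 32bit value of SR0 reads directly as the linear address of the current System Stack location. The push model (store, then decrement SP with wrap-around inside the page) is the behaviour of the original 6502 and is assumed, not confirmed, for the 6502EX. Names are hypothetical.

```python
def decode_sr0(sr0: int) -> dict:
    """Split SR0 into the fields listed in the text."""
    return {
        "segment": (sr0 >> 16) & 0xFFFF,  # segment holding the System Stack
        "page": (sr0 >> 8) & 0xFF,        # page inside that segment
        "sp": sr0 & 0xFF,                 # legacy stack pointer
    }


def push_byte(memory: dict, sr0: int, value: int) -> int:
    """Store one byte at the stack location and return the updated SR0."""
    memory[sr0 & 0xFFFFFFFF] = value & 0xFF
    sp = (sr0 - 1) & 0xFF  # SP wraps inside its 256byte page
    return (sr0 & 0xFFFFFF00) | sp
```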

Coprocessor support

A coprocessor is a custom auxiliary processor jointly working with the 6502EX. The coprocessor shall be connected to the system registers bus. The maximum number of coprocessors supported by the 6502EX is limited by the number of available system registers.

There are two possible operative modes for the coprocessor: it can work in parallel with the 6502EX (co-execution) without stalling the processor pipeline, or in synchronization with the processor (sync-up), meaning that the next processor instruction is executed only when the current coprocessor operation has terminated. In this case the processor pipeline is synchronized with the coprocessor pipeline. The operative mode does not depend on the coprocessor; it can be negotiated on a per-instruction basis.

The following figure shows these two operative modes

Fig.9: two operative coprocessor modes

The COP instruction is used to send an instruction to the coprocessor. It implements a register indexed load from memory to the specific system register where the command register of the coprocessor is mapped. The register specified in the COP instruction acts as a pointer into memory to the instruction to be loaded into that system register. After each COP invocation, the index register is automatically incremented in order to point to the next instruction. Thanks to this register indexed addressing mode, the format and the semantics of the coprocessor instruction can be defined by the coprocessor designer with the maximum degree of flexibility.

In the following figure, a coprocessor whose command (instruction) register is connected to system register SR64 is shown as an example.
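The register indexed mechanism of COP can be modelled roughly as below. The sketch assumes that each coprocessor command is one 32bit word, so the pointer advances by four bytes; the text only says that the register is incremented to point to the next instruction. The memory and register containers and the function name are hypothetical.

```python
COMMAND_BYTES = 4  # assumed size of one coprocessor command


def cop(memory: dict, registers: list, system_registers: dict,
        pointer_reg: int, command_sr: int) -> None:
    """Load the command addressed by Rn into a system register, then advance Rn."""
    address = registers[pointer_reg]
    system_registers[command_sr] = memory[address]
    registers[pointer_reg] = (address + COMMAND_BYTES) & 0xFFFFFFFF
```

Calling `cop` repeatedly with the same pointer register would stream consecutive command words to the coprocessor command register, for example one mapped at system register 64.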

Fig.10: cop instructions are fetched using a register indexed mechanism

Fig.11: example of coprocessor custom instruction format

In addition to the two operative modes described (co-execution and sync-up), there are two hardware models that designers can adopt to realize their coprocessor: standard and advanced.

Standard model

  • the coprocessor command register and the coprocessor registers bank are mapped into the 6502EX System Register space.
  • the 6502EX performs data load/store from memory and stores/loads those values into the coprocessor register bank (i.e. System register space) using the proper MSR (move system register) instruction.
  • the 6502EX performs the fetch of the coprocessor command and sends it to the coprocessor by means of the COP instruction (fig.10).
  • the coprocessor implements command decoding and command execution.

Advanced model

  • only the coprocessor command register is mapped into the system register space. As a consequence the coprocessor register bank is not directly accessible from 6502EX instructions.
  • the coprocessor performs data load/store from its local memory into its registers bank after sending load/store commands through the COP instruction (fig.10).
  • the 6502EX performs the fetch of the coprocessor instruction and sends it to the coprocessor by means of the COP instruction (fig.10).
  • the coprocessor implements command decoding and command execution including data load/store.

If you are interested in knowing more about 6502EX availability, please send an email to 6502ex [at] gmail [dot] com . We will be pleased to answer your enquiry.

The information reported on this page is the property of the author of this site; all rights reserved.

This page contains proprietary information that may be only used by persons in accordance to authorization under and to the extent permitted by the author of this site.

Information reported on this page is subject to change without notice, according to continuous 6502EX development and improvements. All 6502EX details and 6502EX usage described on this page are given by the author in good faith.

The author shall not be liable for any loss or damage deriving from the incorrect use of information contained in this page, or any error or omission in such information, or any incorrect use of the 6502EX.

Trademarks and pictures reported in this page are the exclusive property of their respective owners.


architecture.txt · Last modified: 2011/09/07 00:15 by sysadmin