Computer Hardware User Manual

32-Bit RISC MICROPROCESSOR

TX39 FAMILY CORE ARCHITECTURE

USER'S MANUAL

Jul. 27, 1995

Summary of content (246 pages)

PAGE 3
R3000A is a Trademark of MIPS Technologies, Inc. The information contained herein is subject to change without notice. The information contained herein is presented only as a guide for the applications of our products. No responsibility is assumed by TOSHIBA for any infringements of patents or other rights of the third parties which may result from its use. No license is granted by implication or otherwise under any patent or patent rights of TOSHIBA or others.
PAGE 4
CONTENTS
Architecture
Chapter 1 Introduction ---------- 3
1.1 Features ---------- 3
PAGE 5
CONTENTS
Chapter 4 Pipeline Architecture ---------- 39
4.1 Overview ---------- 39
4.2 Delay Slot ---------- 40
PAGE 7
CONTENTS
TMPR3901F
Chapter 1 Introduction ---------- 201
1.1 Features ---------- 201
1.2 Internal Blocks ---------- 203
Chapter 2 Configuration ---------- 205
PAGE 8
CONTENTS
4.5 Bus Arbitration ---------- 227
4.5.1 Bus request and bus grant ---------- 227
4.5.2 Cache snoop ---------- 228
4.6 Reset ---------- 229
PAGE 9
Architecture
PAGE 11
Architecture Chapter 1 Introduction 1.1 Features The R3900 Processor Core is a high-performance 32-bit microprocessor core developed by Toshiba based on the R3000A RISC (Reduced Instruction Set Computer) microprocessor. The R3000A was developed by MIPS Technologies, Inc. Toshiba develops ASSPs (Application Specific Standard Products) using the R3900 Processor Core and provides the R3900 as a processor core in Embedded Array or Cell-based ICs.
PAGE 12
Architecture • Real-time performance − Cache Lock Function: Lock one set of the two-way set associative cache memory to keep data in cache memory • Debug support − Breakpoint − Single step execution • Real-time debug system interface 1.1.
PAGE 13
Architecture 1.2 Notation Used in This Manual Mathematical notation • Hexadecimal numbers are expressed as follows (example shown for decimal number 42): 0x2A • A K(kilo)byte is 2^10 = 1,024 bytes, a M(mega)byte is 2^20 = 1,024 x 1,024 = 1,048,576 bytes, and a G(giga)byte is 2^30 = 1,024 x 1,024 x 1,024 = 1,073,741,824 bytes.
PAGE 14
Architecture 2.
PAGE 15
Architecture Chapter 2 Architecture 2.1 Overview A block diagram of the R3900 Processor Core is shown in Figure 2-1. It includes the CPU core, an instruction cache and a data cache. You can select an optimum data and instruction cache configuration for your system from among a variety of possible configurations. The CPU Core comprises the following blocks: • CPU registers : General-purpose register, HI/LO register and program counter (PC).
PAGE 16
Architecture 2.2 Registers 2.2.1 CPU registers The R3900 Processor Core has the following 32-bit registers. • Thirty-two general-purpose registers • A program counter (PC) • HI/LO registers for storing the result of multiply and divide operations The configuration of the registers is shown in Figure 2-2. [Figure 2-2: general-purpose registers r0 to r31, multiply/divide registers HI and LO, and program counter PC, each shown with bits 31 to 0.]
PAGE 17
Architecture 2.2.2 System control coprocessor (CP0) registers The R3900 Processor Core can be connected to as many as three coprocessors, referred to as CP1, CP2 and CP3. The R3900 also has built-in system control coprocessor (CP0) functions for exception handling and for configuring the system. Figure 2-3 shows the functional breakdown of the CP0 registers.
PAGE 18
Architecture Table 2-1 lists the CP0 registers built into the R3900 Processor Core. Some of these registers are reserved for use by an external memory management unit. Table 2-1.
PAGE 19
Architecture 2.3 Instruction Set Overview All R3900 Processor Core instructions are 32 bits in length. There are three instruction formats: immediate (I-type), jump (J-type) and register (R-type), as shown in Figure 2-4. Having just three instruction formats simplifies instruction decoding. If more complex functions or addressing modes are required, they can be produced with the compiler using combinations of the instructions.
PAGE 20
Architecture The instruction set is classified as follows. (1) Load/store These instructions transfer data between memory and general registers. All instructions in this group are I-type. “Base register + 16 bit signed immediate offset” is the only supported addressing mode. (2) Computational These instructions perform arithmetic, logical and shift operations on register values.
PAGE 21
Architecture The instruction set supported by all MIPS R-Series processors is listed in Table 2-2. Table 2-3 shows extended instructions supported by the R3900 Processor Core, and Table 2-4 lists coprocessor 0 (CP0) instructions. Table 2-5 shows R3000A instructions not supported by the R3900 Processor Core. Table 2-2.
PAGE 22
Architecture Table 2-2(cont.).
PAGE 23
Architecture Table 2-3.
PAGE 24
Architecture 2.4 Data Formats and Addressing This section explains how data is organized in R3900 registers and memory. The R3900 uses the following data formats: 64-bit doubleword, 32-bit word, 16-bit halfword and 8-bit byte. The byte order can be set to either big endian or little endian. Figure 2-5 shows how bytes are ordered in words, and how words are ordered in multiple words, for both the big-endian and little-endian formats.
PAGE 26
Architecture In this document, bit 0 is always the rightmost bit. Byte addressing is used with the R3900 Processor Core, but there are alignment restrictions for halfword and word access. Halfword access is aligned on an even byte boundary (0, 2, 4...) and word access on a byte boundary divisible by 4 (0, 4, 8...). The address of multiple-byte data, as shown in Figure 2-5 above, begins at the most significant byte for the big endian format and at the least significant byte for the little endian format.
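The alignment and byte-ordering rules above can be sketched in Python. This is a minimal illustration, not part of the manual: the function names are hypothetical, and the byte-lane calculation simply restates the rule that the address of a word names its most significant byte in big endian and its least significant byte in little endian.

```python
def is_aligned(address, size):
    """Return True if an access of `size` bytes (1, 2 or 4) is aligned."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    return address % size == 0


def byte_lane(address, big_endian):
    """Return the bit position (within a 32-bit word) of the lowest bit
    of the byte at `address`."""
    offset = address & 0x3
    # Big endian: offset 0 is the most significant byte (bits 31-24).
    # Little endian: offset 0 is the least significant byte (bits 7-0).
    return (3 - offset) * 8 if big_endian else offset * 8
```

For example, a halfword access to address 6 is aligned, while a word access to the same address is not.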
PAGE 27
Architecture 2.5 Pipeline Processing Overview The R3900 Processor Core executes instructions in five pipeline stages (F: instruction fetch; D: decode; E: execute; M: memory access; W: register write-back). Each pipeline stage is executed in one clock cycle. When the pipeline is fully utilized, five instructions are executed at the same time resulting in an instruction execution rate of one instruction per cycle.
PAGE 28
Architecture 2.6 Memory Management Unit (MMU) 2.6.1 R3900 Processor Core operating modes The R3900 Processor Core has two operating modes, user mode and kernel mode. Normally the processor operates in user mode. It switches to kernel mode if an exception is detected. Once in kernel mode, it remains there until an RFE (Restore From Exception) instruction is executed. (1) User mode User mode makes available one of the two 2 Gbyte virtual address spaces (kuseg).
PAGE 29
Architecture 2.6.2 Direct segment mapping The R3900 Processor Core includes a direct segment mapping MMU. The following virtual address spaces are available depending on the processor mode (Figure 2-8 shows the address mapping). (1) User mode One 2 Gbyte virtual address space (kuseg) is available. Virtual addresses from 0x0000 0000 to 0x7FFF FFFF are translated to physical addresses 0x4000 0000 to 0xBFFF FFFF, respectively.
PAGE 30
Architecture [Figure 2-8: mapping of the virtual address space onto the physical address space. Virtual segments: kuseg (Kernel/User Cached, from 0x0000 0000, upper 16 MB reserved), kseg0 (Kernel Cached, from 0x8000 0000), kseg1 (Kernel Uncached, from 0xA000 0000) and kseg2 (Kernel Cached, from 0xC000 0000, upper 16 MB reserved).]
PAGE 31
Architecture 3.
PAGE 32
Architecture Chapter 3 Instruction Set Overview This chapter summarizes each of the R3900 Processor Core instruction types in table format and explains each instruction briefly. Details of individual instructions are given in Appendix A. 3.1 Instruction Formats Each of the R3900 Processor Core instructions is aligned on a word boundary and has a 32-bit (single-word) length. There are only three instruction formats, as shown in Figure 3-1. As a result, instruction decoding is simplified.
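As an illustration of why three fixed formats keep decoding simple, the sketch below extracts every field with a shift and a mask. Figure 3-1 is not reproduced in this summary, so the field positions are an assumption taken from the standard MIPS I encoding (op in bits 31-26, rs 25-21, rt 20-16, rd 15-11, sa 10-6, funct 5-0, 16-bit immediate, 26-bit target); the function name is hypothetical.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the fields of all three formats.

    Field positions are assumed from the standard MIPS I encoding.
    """
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,      # all formats
        "rs": (word >> 21) & 0x1F,      # I-type and R-type
        "rt": (word >> 16) & 0x1F,      # I-type and R-type
        "rd": (word >> 11) & 0x1F,      # R-type
        "sa": (word >> 6) & 0x1F,       # R-type shift amount
        "funct": word & 0x3F,           # R-type
        "immediate": word & 0xFFFF,     # I-type
        "target": word & 0x03FFFFFF,    # J-type
    }
```

Which fields are meaningful depends on the format selected by op (and funct for R-type instructions).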
PAGE 33
Architecture 3.3 Load and Store Instructions Load and Store instructions move data between memory and general registers and are all I-type instructions. The only directly supported addressing mode is base register plus 16-bit signed immediate offset. With the R3900 Processor Core, the result of a load instruction can be used by the immediately following instruction. Execution of the following instruction is delayed by hardware interlock until the load result becomes available.
PAGE 34
Architecture Table 3-2. Load/store instructions (1/2) Instruction Format and Description Load Byte LB rt, offset (base) Generate the address by sign-extending a 16-bit offset and adding it to the contents of register base. Sign-extend the contents of the addressed byte and load into register rt. LBU rt, offset (base) Generate the address by sign-extending a 16-bit offset and adding it to the contents of register base. Zero-extend the contents of the addressed byte and load into register rt.
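The difference between LB and LBU can be sketched as follows, using the base register plus 16-bit signed immediate offset addressing mode described in Section 3.3. This is a simplified model, not the processor's implementation: memory is a plain Python bytes object and the helper names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def effective_address(base, offset16):
    """Base register plus sign-extended 16-bit offset, wrapped to 32 bits."""
    return (base + sign_extend(offset16, 16)) & 0xFFFFFFFF


def load_byte(memory, base, offset16, signed):
    """Model LB (signed=True) or LBU (signed=False); returns the 32-bit rt value."""
    byte = memory[effective_address(base, offset16)]
    if signed:
        return sign_extend(byte, 8) & 0xFFFFFFFF
    return byte
```

With a byte value of 0x80 in memory, the LB model yields 0xFFFFFF80 while the LBU model yields 0x00000080.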
PAGE 35
Architecture Table 3-2. Load/store instructions (2/2) Instruction Format and Description Store Word SW rt, offset (base) Generate the address by sign-extending a 16-bit offset and adding it to the contents of register base. Store the contents of the least significant word of register rt at the addressed byte. SWL rt, offset (base) Generate the address by sign-extending a 16-bit offset and adding it to the contents of register base.
PAGE 36
Architecture 3.4 Computational Instructions Computational instructions perform arithmetic, logical or shift operations on values in registers. The instruction format can be R-type or I-type. With R-type instructions, the two operands and the result are register values. With I-type instructions, one of the operands is 16-bit immediate data. Computational instructions can be classified as follows.
PAGE 37
Architecture Table 3-5. Three-operand register-type instructions
Instruction format: op | rs | rt | rd | 0 | funct
Add : ADD rd, rs, rt Add the contents of registers rs and rt, and store the result in register rd. An exception is raised in the event of a two’s-complement overflow.
Add Unsigned : ADDU rd, rs, rt Add the contents of registers rs and rt, and store the result in register rd. No exception is raised on a two’s-complement overflow.
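The distinction between ADD and ADDU comes down to whether a two's-complement overflow is reported. A minimal sketch of the overflow test follows; the function name is hypothetical, and the returned flag stands in for the Overflow exception that ADD would raise.

```python
def add_with_overflow(rs, rt):
    """Return (32-bit sum, overflow flag) for two 32-bit register values.

    ADDU would use only the sum; ADD would raise an Overflow exception
    when the flag is True.
    """
    rs &= 0xFFFFFFFF
    rt &= 0xFFFFFFFF
    result = (rs + rt) & 0xFFFFFFFF
    # Overflow: operands have the same sign, result has the opposite sign.
    overflow = (~(rs ^ rt) & (rs ^ result) & 0x80000000) != 0
    return result, overflow
```

Adding 1 to 0x7FFFFFFF overflows, whereas adding 1 to 0xFFFFFFFF (that is, -1 + 1) does not.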
PAGE 38
Architecture Table 3-6. Shift instructions (a) SLL, SRL, SRA Instruction Format and Description Shift Left Logical SLL rd, rt, sa Left-shift the contents of register rt by the number of bits indicated in sa (shift amount), and zero-fill the low-order bits. Store the resulting 32 bits in register rd. SRL rd, rt, sa Right-shift the contents of register rt by sa bits, and zero-fill the high-order bits. Store the resulting 32 bits in register rd.
PAGE 39
Architecture Table 3-7. Multiply/Divide Instructions (a) MULT, MULTU, DIV, DIVU Instruction Format and Description Multiply MULT rs, rt Multiply the contents of registers rs and rt as two's complement integers, and store the doubleword (64-bit) result in multiply/divide registers HI and LO. MULTU rs, rt Multiply the contents of registers rs and rt as unsigned integers, and store the doubleword (64-bit) result in multiply/divide registers HI and LO.
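How MULT and MULTU split a 64-bit product across HI and LO can be sketched as below. The function names are hypothetical; the sketch models only the arithmetic result, not instruction timing.

```python
def to_signed32(value):
    """Interpret a 32-bit value as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def multiply(rs, rt, signed):
    """Model MULT (signed=True) or MULTU (signed=False); returns (HI, LO)."""
    a = to_signed32(rs) if signed else rs & 0xFFFFFFFF
    b = to_signed32(rt) if signed else rt & 0xFFFFFFFF
    product = (a * b) & 0xFFFFFFFFFFFFFFFF   # 64-bit doubleword result
    return product >> 32, product & 0xFFFFFFFF
```

The same operand bits give different HI values depending on the interpretation: 0xFFFFFFFF times 2 is -2 as a signed product and 0x1FFFFFFFE as an unsigned one.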
PAGE 40
Architecture Table 3-8. Multiply, multiply / add instructions (R3000A extended instruction set) MULT, MULTU, MADD, MADDU (ISA extended set) Instruction Format and Description Multiply MULT rd, rs, rt Multiply the contents of registers rs and rt as two’s complement integers, and store the doubleword (64-bit) result in multiply/divide registers HI and LO. Also, store the lower 32 bits in register rd.
PAGE 41
Architecture 3.5 Jump/Branch Instructions Jump/branch instructions change the program flow. A jump/branch instruction delays the pipeline by one instruction cycle; however, an instruction inserted into the delay slot (immediately following the jump/branch instruction) can be executed while the instruction at the target address is being fetched. Jump and Jump And Link instructions, typically used to call subroutines, have the J-type instruction format. The jump target address is generated as follows.
PAGE 42
Architecture instruction in the delay slot is executed during the jump). The following notes apply to Table 3-10. • The target address of a branch instruction is generated by adding the address of the instruction in the delay slot (the instruction to be executed during the branch) to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). Branch instructions are executed with a one-cycle delay.
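The branch target calculation described in the note above can be sketched as follows. The function names are hypothetical, and the delay slot address is assumed to be the branch instruction address plus 4, since all instructions are one word long.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(branch_pc, offset16):
    """Delay slot address plus the 16-bit offset, left-shifted two bits
    and sign-extended to 32 bits."""
    delay_slot_pc = (branch_pc + 4) & 0xFFFFFFFF
    displacement = sign_extend((offset16 & 0xFFFF) << 2, 18)
    return (delay_slot_pc + displacement) & 0xFFFFFFFF
```

An offset of 0xFFFF (that is, -1) therefore branches back to the branch instruction itself.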
PAGE 43
Architecture (d) BEQL, BNEL, BLEZL, BGTZL, BLTZL, BGEZL, BLTZALL, BGEZALL (ISA Extended Set)
Instruction : Format and Description
Branch on Equal Likely : BEQL rs, rt, offset Branch to the target if the contents of registers rs and rt are equal.
Branch on Not Equal Likely : BNEL rs, rt, offset Branch to the target if the contents of registers rs and rt are not equal.
Branch on Less Than or Equal Zero Likely : BLEZL rs, offset Branch to the target if register rs is 0 or less.
Branch on Greater Than Zero Likely
PAGE 44
Architecture 3.6 Special Instructions There are three special instructions used for software traps. The instruction format is R-type for all three. Table 3-11. Special instructions
(a) SYSCALL
System Call : SYSCALL code Raise a system call exception, passing control to an exception handler. Instruction format: op | code | funct
(b) BREAK
Breakpoint : BREAK code Raise a breakpoint exception, passing control to an exception handler.
PAGE 45
Architecture 3.7 Coprocessor Instructions Coprocessor instructions invoke coprocessor operations. The format of these instructions depends on which coprocessor is used. Table 3-12. Coprocessor instructions (a) MTCz, MFCz, CTCz, CFCz Instruction Format and Description Move To Coprocessor MTCz rt, rd Move the contents of CPU general register rt to coprocessor z’s coprocessor register rd. MFCz rt, rd Move the contents of coprocessor z’s coprocessor register rd to CPU general register rt.
PAGE 46
Architecture (d) BCzTL, BCzFL (ISA Extended Set) Instruction Format and Description Branch on Coprocessor z True Likely BCzTL offset Generate the branch target address by adding the address of the instruction in the delay slot (the instruction to be executed during the branch) and the 16-bit offset (after left-shifting two bits and sign-extending to 32 bits). If the coprocessor z condition line is true, branch to the target address after a one-cycle delay.
PAGE 47
Architecture 3.8 System Control Coprocessor (CP0) Instructions Coprocessor 0 instructions are used for operations involving the system control coprocessor (CP0) registers, processor memory management and exception handling. Note : Attempting to execute a CP0 instruction in user mode when the CU0 bit in the Status register is not set raises a Coprocessor Unusable exception. Table 3-13.
PAGE 48
Architecture Chapter 4 Pipeline Architecture 4.1 Overview The R3900 Processor Core executes instructions in five pipeline stages (F: instruction fetch; D: decode; E: execute; M: memory access; W: register write-back). The five stages have the following roles. F : An instruction is fetched from the instruction cache. D : The instruction is decoded. Contents of the general-purpose registers are read. If the instruction involves a branch or jump, the target address is generated.
PAGE 49
Architecture 4.2 Delay Slot Some R3900 Processor Core instructions are executed with a delay of one instruction cycle. The cycle in which an instruction is delayed is called a delay slot. A delay occurs with load instructions and branch/jump instructions. 4.2.1 Delayed load With load instructions, a one-cycle delay occurs while waiting for the data being loaded to become available for use by another instruction.
PAGE 50
Architecture • The R3900 Processor Core provides Branch Likely instructions in addition to the normal Branch instructions that allow the instruction at the target branch address to be placed in the delay slot. If the branch condition of the Branch Likely instruction is met, the instruction in the delay slot is executed and the branch is taken. If the branch is not taken, the instruction in the delay slot is treated as a NOP.
PAGE 51
Architecture 4.5 Divide Instruction (DIV, DIVU) The R3900 Processor Core performs division instructions in the division unit independently of the pipeline. Division starts from the pipeline E stage and takes 35 cycles. Figure 4-6 shows an example of a divide instruction. [Figure 4-6: pipeline timing for div r5,r1 followed by mflo r4; the division unit runs through E1 to E35 while the mflo instruction shows ES cycles before completing its E, M and W stages.]
PAGE 52
Architecture Chapter 5 Memory Management Unit (MMU) The R3900 Processor Core does not have a TLB. 5.1 R3900 Processor Core Operating Modes The R3900 Processor Core has two operating modes, user mode and kernel mode. Normally it operates in user mode, but when an exception is detected it goes to kernel mode. Once in kernel mode, it remains there until an RFE (Restore From Exception) instruction is executed. The available virtual address space differs with the mode, as shown in Figure 5-1.
PAGE 53
Architecture 5.2 Direct Segment Mapping The R3900 Processor Core has a direct segment mapping MMU. Figure 5-2 shows the virtual address space of the internal MMU. [Figure 5-2. Internal MMU virtual address space. Kernel mode: kuseg (2 GB) from 0x0000 0000 to 0x7FFF FFFF, kseg0 (0.5 GB) from 0x8000 0000, kseg1 (0.5 GB) from 0xA000 0000, kseg2 (1 GB) from 0xC000 0000 to 0xFFFF FFFF. User mode: kuseg (2 GB) from 0x0000 0000 to 0x7FFF FFFF.] (1) User mode One 2 Gbyte virtual address space (kuseg) is available in user mode.
PAGE 54
Architecture (a) kuseg This is the same virtual address space available in user mode. Virtual addresses 0x0000 0000 to 0x7FFF FFFF are translated to physical addresses 0x4000 0000 to 0xBFFF FFFF, respectively. The upper 16-Mbyte area of kuseg (0x7F00 0000 to 0x7FFF FFFF) is reserved for on-chip resources and is not cacheable. (b) kseg0 This is a 512 Mbyte segment spanning virtual addresses 0x8000 0000 to 0x9FFF FFFF.
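Direct segment mapping can be sketched as a simple range check followed by a fixed offset. The kuseg translation (add 0x4000 0000) is taken from the text above. The kseg0 and kseg1 translations are an assumption based on the usual MIPS convention that both map onto physical addresses starting at 0x0000 0000, because the corresponding text is cut off in this summary; kseg2 is left unhandled for the same reason. All names are hypothetical.

```python
class AddressError(Exception):
    """Stands in for the Address Error exception."""


def translate(virtual_address, user_mode):
    """Translate a 32-bit virtual address to a physical address."""
    va = virtual_address & 0xFFFFFFFF
    if va < 0x80000000:
        # kuseg: 0x0000 0000-0x7FFF FFFF -> 0x4000 0000-0xBFFF FFFF
        return va + 0x40000000
    if user_mode:
        # Kernel segments are not accessible in user mode.
        raise AddressError(hex(va))
    if va < 0xA0000000:
        return va - 0x80000000   # kseg0 (assumed mapping)
    if va < 0xC0000000:
        return va - 0xA0000000   # kseg1 (assumed mapping)
    raise NotImplementedError("kseg2 mapping is not covered by this sketch")
```

Because the translation is a fixed function of the segment, no table lookup is involved, which is consistent with the core having no TLB.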
PAGE 55
Architecture [Figure 5-3. Internal MMU address mapping: the virtual segments kuseg (Kernel/User Cached, from 0x0000 0000, upper 16 MB reserved), kseg0 (Kernel Cached, from 0x8000 0000), kseg1 (Kernel Uncached, from 0xA000 0000) and kseg2 (Kernel Cached, from 0xC000 0000, upper 16 MB reserved) mapped onto the physical address space.] Table 5-1.
PAGE 56
Architecture Chapter 6 Exception Processing This chapter explains how exceptions are handled by the R3900 Processor Core, and describes the registers of the system control coprocessor CP0 used during exception handling. 6.1 Overview When the R3900 Processor Core detects an exception, it suspends normal instruction execution. The processor goes from user mode to kernel mode so it can perform processing to handle the abnormal condition or asynchronous event.
PAGE 57
Architecture Table 6-1.
PAGE 58
Architecture Table 6-2 shows the vector address of each exception and the values in the exception code (ExcCode) field of the Cause register. Table 6-2.
PAGE 59
Architecture 6.2 Exception Processing Registers The system control coprocessor (CP0) has seven registers for exception processing, shown in Figure 6-1. Status Cause EPC BadVAddr PRId Config Cache Figure 6-1. Exception processing registers (a) Cause register Indicates the nature of the most recent exception.
PAGE 60
Architecture 6.2.1 Cause register (register no.13)
Bits    Mnemonic
31      BD
30      0
29-28   CE
27-16   0
15-10   IP
9-8     Sw
7       0
6-2     ExcCode
1-0     0
BD (Branch Delay): Set to 1 when the most recent exception was caused by an instruction in the branch delay slot (executed during a branch).
PAGE 61
Architecture Table 6-3. ExcCode field of Cause register
No.     Mnemonic
0       Int
1       Mod
2       TLBL
3       TLBS
4       AdEL
5       AdES
6       IBE
7       DBE
8       Sys
9       Bp
10      RI
11      CpU
12      Ov
13-31   -
6.2.
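An exception handler typically begins by decoding the Cause register. The sketch below uses the bit positions listed for the Cause register in Section 6.2.1 and the mnemonics of Table 6-3; the function name and the returned dictionary are hypothetical conveniences.

```python
EXC_MNEMONICS = ["Int", "Mod", "TLBL", "TLBS", "AdEL", "AdES", "IBE",
                 "DBE", "Sys", "Bp", "RI", "CpU", "Ov"]


def decode_cause(cause):
    """Split a 32-bit Cause register value into its fields."""
    code = (cause >> 2) & 0x1F            # ExcCode, bits 6-2
    return {
        "BD": (cause >> 31) & 0x1,        # bit 31
        "CE": (cause >> 28) & 0x3,        # bits 29-28
        "IP": (cause >> 10) & 0x3F,       # bits 15-10
        "Sw": (cause >> 8) & 0x3,         # bits 9-8
        "ExcCode": code,
        # Codes 13-31 have no mnemonic in Table 6-3.
        "mnemonic": EXC_MNEMONICS[code] if code < len(EXC_MNEMONICS) else "-",
    }
```

A handler could then dispatch on the mnemonic, for example treating "Sys" as a system call and "Int" as an interrupt.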
PAGE 62
Architecture 6.2.3 Status register (register no.12) This register holds the operating mode status (user mode or kernel mode), interrupt masking status, diagnosis status and similar information.
PAGE 63
Architecture Figure 6-4.
PAGE 64
Architecture Bits 27-26, 24-23, 19-16, 7-6 : Mnemonic 0. Description: Ignored on write; 0 when read. Value on Reset: 0. Read/Write: Read. Figure 6-4. Status register (2/2) (1) CU (Coprocessor Usability) The CU bits CU0 - CU3 control the usability of the four coprocessors CP0 through CP3. Setting a bit to 1 allows the corresponding coprocessor to be used, and clearing the bit to 0 disables that coprocessor.
PAGE 65
Architecture (5) NmI (Non-maskable Interrupt) This bit is set to 1 when a non-maskable interrupt is raised by the falling edge of the non-maskable interrupt signal. The bit is cleared to 0 by writing a 1 to it or when a Reset exception is raised. (6) IntMask (Interrupt Mask) The IntMask bits separately enable or mask each of six hardware and two software interrupts. Clearing a corresponding bit to 0 masks an interrupt, and setting it to 1 enables the interrupt.
PAGE 66
Architecture 6.2.4 Cache register (register no.7) This register controls the cache lock function.
PAGE 67
Architecture (1) DALc/DALp/DALo (Data Cache Auto-Lock: current/previous/old) The three bits DALc/DALp/DALo form a three-level stack, indicating the current, previous and old auto-lock status of the data cache. For each bit, 1 means the lock is in effect, and 0 means it is not. A Reset exception clears DALc, DALp and DALo to 0.
PAGE 68
Architecture 6.2.5 Status register and Cache register mode bit and exception processing When the R3900 Processor Core responds to an exception, it saves the values of the current operating mode bit (KUc) and current interrupt enabled mode bit (IEc) in the previous mode bits (KUp and IEp). It saves the values of the previous mode bits (KUp and IEp) in the old mode bits (KUo and IEo). The current mode bits (KUc and IEc) are cleared to 0, with the processor going to kernel mode and interrupts disabled.
PAGE 69
Architecture After an exception handler has executed to perform exception processing, it must issue an RFE (Restore From Exception) instruction to restore the system to its previous status. The RFE instruction returns control to processing that was in progress when the exception occurred. When a RFE instruction is executed, the previous interrupt enabled bit (IEp) and previous operating mode bit (KUp) in the Status register are copied to the corresponding current bits (IEc and KUc).
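The three-level mode stack can be sketched as a shift of the low six Status register bits. Figure 6-4 is not reproduced in this summary, so the bit positions are an assumption taken from the standard R3000A layout (IEc bit 0, KUc bit 1, IEp bit 2, KUp bit 3, IEo bit 4, KUo bit 5); the behaviour of the old bits on RFE (copied to previous, themselves unchanged) is likewise the standard R3000A behaviour rather than something stated in the visible text. Function names are hypothetical.

```python
MODE_STACK_MASK = 0x3F   # KUo, IEo, KUp, IEp, KUc, IEc (assumed bits 5-0)


def on_exception(status):
    """Push the mode stack: current -> previous, previous -> old,
    current bits cleared (kernel mode, interrupts disabled)."""
    stack = status & MODE_STACK_MASK
    pushed = (stack << 2) & MODE_STACK_MASK
    return (status & 0xFFFFFFFF & ~MODE_STACK_MASK) | pushed


def on_rfe(status):
    """Pop the mode stack: previous -> current, old -> previous,
    old bits left unchanged."""
    stack = status & MODE_STACK_MASK
    popped = (stack & 0x30) | (stack >> 2)
    return (status & 0xFFFFFFFF & ~MODE_STACK_MASK) | popped
```

An exception followed by RFE restores the original current bits, which is what lets a handler return to the interrupted mode.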
PAGE 70
Architecture 6.2.6 BadVAddr (Bad Virtual Address) register (register no.8) When an Address Error exception (AdEL or AdES) is raised, the virtual address that caused the error is saved in the BadVAddr register. When a TLB Refill, TLB Modified or UTLB Refill exception is raised, the virtual address for which address translation failed is saved in BadVAddr. BadVAddr is a read-only register. Note : A bus error is not the same as an Address Error and does not cause information to be saved in BadVAddr.
PAGE 71
Architecture 6.2.8 Config (Configuration) register (register no.3) This register designates the R3900 Processor Core configuration.
Bits    Mnemonic   Field name
21-19   ICS        Instruction Cache Size
18-16   DCS        Data Cache Size
11-10   RF         Reduced Frequency
9       Doze††     Doze
ICS: Indicates the instruction cache size.
PAGE 72
Architecture
Bits           Mnemonic
8              Halt††
7              Lock
6              DCBR
5              ICE
4              DCE
3-2            IRSize
1-0            DRSize
31-22, 15-12   0
Halt: Setting this bit to 1 puts the R3900 Processor Core in Halt mode. Halt mode is canceled by a Reset exception when a reset signal is received, or by a non-maskable interrupt signal or an interrupt signal, either of which clears the Halt bit to 0. The Halt bit is cleared even if interrupts are masked. Data cache snoops are not possible in Halt mode.
PAGE 73
Architecture 6.3 Exception Details 6.3.1 Memory location of exception vectors Exception vector addresses are stored in an area of kseg0 or kseg1. The vector address of the Reset and NmI exceptions is always in a non-cacheable area of kseg1. Vector addresses of the other exceptions depend on the Status register BEV bit. When BEV is 0 the other exceptions are vectored to a cacheable area of kseg0. When BEV is 1, all vector addresses are in a non-cacheable area of kseg1.
PAGE 74
Architecture 6.3.2 Address Error exception • Causes − Attempting to load, fetch or store a word not aligned on a word boundary. − Attempting to load or store a halfword not aligned on a halfword boundary. − Attempting to access kernel mode address space kseg while in user mode. • Exception mask The Address Error exception is not maskable. • Applicable instructions LB, LBU, LH, LHU, LW, LWL, LWR, SB, SH, SW, SWL, SWR. • Processing − The common exception vector (0x8000 0080) is used.
PAGE 75
Architecture 6.3.3 Breakpoint exception • Cause − Execution of a BREAK instruction. • Exception mask The Breakpoint exception is not maskable. • Applicable instructions BREAK • Processing − The common exception vector (0x8000 0080) is used. − Bp(9) is set for ExcCode in the Cause register. − The EPC register points to the address of the instruction causing the exception.
PAGE 76
Architecture 6.3.4 Bus Error exception • Causes − This exception is raised when a bus error signal is input to the R3900 Processor Core during a memory bus cycle. This occurs during execution of the instruction causing the bus error. The memory bus cycle ends upon notification of a bus error. When a bus error is raised during a burst refill, the following refill is not performed.
PAGE 77
Architecture − When a bus error occurs with a load instruction, the destination register value will be undefined. − In the following cases, a Bus Error exception may be raised even though the instruction causing the bus error did not actually execute. (1) When a bus error occurs during an instruction cache refill, but the instruction sequence is changed due to a jump/branch instruction in the instruction stream, the instruction at the address where the bus error occurred may not actually execute.
PAGE 78
Architecture 6.3.5 Coprocessor Unusable exception • Cause − Attempting to execute a coprocessor CPz instruction when its corresponding CUz bit in the Status register is cleared to 0 (coprocessor unusable). − In user mode, attempting to execute a CP0 instruction when the CU0 bit is cleared to 0. (In kernel mode, an exception is not raised when a CP0 instruction is issued, regardless of the CU0 bit setting.) • Exception mask The Coprocessor Unusable exception is not maskable.
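The usability rule above can be sketched as a single predicate. The Status register bit positions are an assumption taken from the standard R3000A layout (CU0 to CU3 in bits 28 to 31, KUc in bit 1 with 1 meaning user mode), since Figure 6-4 is not reproduced in this summary; the function name is hypothetical.

```python
def coprocessor_usable(status, z):
    """Return True if an instruction for coprocessor z may execute.

    A False result corresponds to a Coprocessor Unusable exception.
    """
    if not 0 <= z <= 3:
        raise ValueError("coprocessor number must be 0 to 3")
    user_mode = bool(status & (1 << 1))      # KUc (assumed bit 1)
    if z == 0 and not user_mode:
        return True                          # CP0 is always usable in kernel mode
    return bool(status & (1 << (28 + z)))    # CUz (assumed bits 31-28)
```

In kernel mode the CU0 bit is ignored, so only CP1 to CP3 depend on their CU bits there.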
PAGE 79
Architecture 6.3.6 Interrupts • Cause − An Interrupt exception is raised by any of eight interrupts (two software and six hardware). A hardware interrupt is raised when the interrupt signal goes active. A software interrupt is raised by setting the Sw1 or Sw0 bits in the Cause register. • Exception mask − Each of the eight interrupts can be masked individually by clearing its corresponding bit in the IntMask field of the Status register.
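Whether an Interrupt exception is taken can be sketched by combining the pending bits of the Cause register with the IntMask field. The Cause positions (IP in bits 15-10, Sw in bits 9-8) come from Section 6.2.1; the Status positions (IntMask in bits 15-8, IEc in bit 0) and the requirement that IEc be 1 are assumptions based on the standard R3000A layout. The function name is hypothetical.

```python
def interrupt_taken(status, cause):
    """Return True if any pending interrupt is enabled."""
    if not status & 0x1:                 # IEc (assumed bit 0): interrupts disabled
        return False
    pending = (cause >> 8) & 0xFF        # IP[5:0] and Sw[1:0], Cause bits 15-8
    enabled = (status >> 8) & 0xFF       # IntMask (assumed Status bits 15-8)
    return (pending & enabled) != 0
```

Clearing an IntMask bit hides the corresponding pending bit, so that interrupt alone no longer causes an exception.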
PAGE 80
Architecture 6.3.7 Overflow exception • Cause − A two's complement overflow results from the execution of an ADD, ADDI or SUB instruction. • Exception mask The Overflow exception is not maskable. • Applicable instructions ADD, ADDI, SUB • Processing − The common exception vector (0x8000 0080) is used. − Ov(12) is set for ExcCode in the Cause register. − The EPC register points to the address of the instruction causing the exception.
PAGE 81
Architecture 6.3.9 Reset exception • Cause − The reset signal in the R3900 Processor Core is asserted and then de-asserted. • Exception mask The Reset exception is not maskable. • Processing − A special interrupt vector (0xBFC0 0000) that resides in an uncached area is used. It is therefore not necessary for hardware to initialize cache memory in order to process this exception. − The contents of all registers in the R3900 Processor Core become undefined.
PAGE 82
Architecture 6.3.10 System Call exception • Cause − Execution of an R3900 Processor Core SYSCALL instruction. • Exception mask The System Call exception is not maskable. • Applicable instructions SYSCALL • Processing − The common exception vector (0x8000 0080) is used. − Sys(8) is set for ExcCode in the Cause register. − The EPC register points to the address of the instruction causing the exception.
PAGE 84
Architecture 6.4 Priority of Exceptions More than one exception may be raised for the same instruction, in which case only the exception with the highest priority is reported. The R3900 Processor Core instruction exception priority is shown in Table 6-5. See chapter 8 for the priority of debug exceptions. Table 6-5.
PAGE 85
Architecture 7.
PAGE 86
Architecture Chapter 7 Caches The R3900 Processor Core is equipped with separate on-chip caches for data and instructions. These caches can be configured in a variety of sizes as required by the user system. Note : Currently only the cache configuration described below is supported. It consists of a 4 Kbyte instruction cache and 1 Kbyte data cache. 7.1 Instruction Cache The instruction cache has the following specifications.
PAGE 87
Architecture 7.2 Data Cache The data cache has the following specifications.
− Cache size : 1 Kbyte (Config register DCS bits = 000)
− Two-way set-associative
− Replace algorithm : LRU (Least Recently Used)
− Block (line) size : 1 word (4 bytes)
− Write-through
− Physical cache
− Refill size : Choice of size 1/4/8/16/32 words (set in Config register)
− Byte-writable
− All valid bits and lock bits cleared by a Reset exception
− Lock function
Figure 7-3 shows the data cache configuration.
PAGE 88
Architecture Figure 7-4 shows the data cache address field. [Figure 7-4. Data cache address field: bits 31-9 physical tag (cache tag), bits 8-2 index, bits 1-0 byte select.] When a data store misses, the data is stored to main memory only, not to the cache (no write allocate). The data cache can be written in individual bytes. (When a byte or halfword store is used, there is no read-modify-write.) 7.2.1 Lock function The lock function can be used to route critical data to one data cache set.
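The data cache address field of Figure 7-4 can be sketched as three bit slices. The split used here (tag in bits 31-9, index in bits 8-2, byte select in bits 1-0) is a reading of the figure that also follows from the specifications in Section 7.2: 1 Kbyte divided by two ways and 4-byte lines gives 128 lines per way, hence a 7-bit index. The function name is hypothetical.

```python
def split_data_cache_address(physical_address):
    """Split a 32-bit physical address into data cache tag, index and byte select."""
    pa = physical_address & 0xFFFFFFFF
    return {
        "tag": pa >> 9,             # bits 31-9, compared against the stored tag
        "index": (pa >> 2) & 0x7F,  # bits 8-2, selects one of 128 lines per way
        "byte": pa & 0x3,           # bits 1-0, selects a byte within the word
    }
```

Two addresses that differ only in their tag map to the same index, and the two-way organization lets both be cached at once.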
PAGE 89
Architecture (3) Lock bit clearing [Figure 7-5. Auto-lock bits: Cache register bits 13 to 8 hold IALo, DALo, IALp, DALp, IALc and DALc, respectively; the figure shows how these bits shift when an exception is raised and when RFE is executed. IALo, IALp and IALc are reserved for the instruction cache.] The lock bit for an entry is cleared using the CACHE instruction IndexLockBitClear. Clearing the lock bit disables the lock function.
PAGE 90
Architecture 7.3 Cache Test Function (1) Cache disabling The Config register bits ICE (Instruction Cache Enable) and DCE (Data Cache Enable) are used to enable and disable the instruction cache and data cache, respectively. When a cache is disabled, all cache accesses are misses and there is no refill (nor is there any burst bus cycle; this is the same as accessing a non-cacheable area). The valid bit (V) for each entry cannot be modified.
PAGE 91
Architecture 7.4 Cache Refill A physical cache line in the R3900 Processor Core comprises 4 words for the instruction cache and 1 word for the data cache. The refill size can be designated independently of the line size. The refill size can be 4/8/16/32 words for the instruction cache, and 1/4/8/16/32 words for the data cache. In a burst read operation, data or instructions of the designated refill size are read.
PAGE 92
Architecture 7.5 Cache Snoop The R3900 Processor Core has a bus arbitration function that releases bus mastership to an external bus master. Consistency between cache memory and main memory could deteriorate when an external bus master has write access to main memory. The purpose of the cache snoop function is to maintain this data consistency. When the R3900 Processor Core releases the bus, the bus cycle is snooped by an external bus master.
PAGE 93
PAGE 94
Architecture Chapter 8 Debugging Functions The R3900 Processor Core has the following support functions for debugging that have been added to the R3000A instruction base. They are independent of the R3000A architecture, which makes them transparent to user programs. The real-time debugging system is supported by a third party.
PAGE 95
Architecture The CP0 registers are listed in Table 8-1. Table 8-1.
PAGE 96
Architecture (1) DEPC (Debug Exception Program Counter) register (register no.17) The DEPC register holds the address where processing is to resume after the debug exception has been taken care of. (Note : DEPC is a read/write register.) The address that goes in the DEPC register is the virtual address of the instruction that caused the debug exception.
PAGE 97
Architecture ■ NIS (Non-maskable Interrupt Status) : This bit is set to 1 when a Non-maskable interrupt occurs at the same time as a debug exception. In this case the Status, Cause, EPC and BadVAddr registers assume their usual status after the occurrence of a Non-maskable interrupt, but the address in DEPC is not the non-maskable interrupt exception vector address (0xBFC0 0000).
PAGE 98
Architecture ■ DSS (bit 0) : Set to 1 to indicate a Single Step exception.
DBp and DSS bits indicate the most recent debug exception. Each bit represents one of the two debug exceptions and is set to 1 accordingly when that exception occurs.
Note : DSS has a higher priority than DBp, since they occur in the pipeline E stage. For this reason DSS and DBp are not raised at the same time.
■ 0 : Ignored when written; returns 0 when read.
■ Reserved : Undefined value.
8.
PAGE 99
Architecture (2) Debug exception handling
i) Raising a debug exception
■ DEPC and Debug register updates
DEPC : The address where the exception was raised is put in this register.
DBD : Set to 1 when the exception was raised for an instruction in the branch delay slot.
DM : Set to 1.
DSS, DBp : Set to 1 if the corresponding exception was raised.
NIS : Set to 1 if a Non-maskable interrupt occurred at the same time as the debug exception.
PAGE 100
Architecture iii) Return from a debug exception handler
■ When a user program exception occurs at the same time as a Debug exception, change the DEPC value so that a return will be made to the exception handler. When NIS = 1, change DEPC to 0xBFC0 0000. When OES = 1, change DEPC to 0x8000 0080 (if BEV = 0) or 0xBFC0 0180 (if BEV = 1).
■ Executing a DERET instruction
PC : Contains the DEPC value.
Debug register DM : Cleared to 0.
Status register KUc, IEc : Set to 1, enabling interrupts.
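The DEPC adjustment made before DERET can be sketched as a small selection function. This is an illustration only: the function name depc_for_return is hypothetical, the second OES vector is taken to apply when BEV = 1, and checking NIS before OES is an assumption, since the text does not state which takes precedence.

```python
# Sketch of choosing the DEPC value before executing DERET.
NMI_VECTOR = 0xBFC00000          # used when NIS = 1
GENERAL_VECTOR_BEV0 = 0x80000080 # used when OES = 1 and BEV = 0
GENERAL_VECTOR_BEV1 = 0xBFC00180 # used when OES = 1 and BEV = 1 (assumed)

def depc_for_return(depc, nis, oes, bev):
    """Return the address a debug handler should place in DEPC."""
    if nis:   # assumption: a pending NMI is considered first
        return NMI_VECTOR
    if oes:
        return GENERAL_VECTOR_BEV1 if bev else GENERAL_VECTOR_BEV0
    return depc  # no simultaneous exception: resume where DEPC points
```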
PAGE 101
Architecture 8.3 Details of Debug Exceptions (1) Single Step exception • Cause − When the Debug register SSt bit is set, a Single Step exception is raised each time one instruction is executed. • Exception masking − The Single Step exception can be masked by the Debug register SSt bit. When SSt is cleared to 0, a Single Step exception cannot be raised. (Note : In the debug exception handler, a Single Step exception is masked regardless of the SSt bit value.
PAGE 102
Architecture (2) Debug Breakpoint exception • Cause − A Debug Breakpoint exception is raised when an SDBBP instruction is executed. • Exception masking − The Breakpoint exception cannot be masked. (Note : Its behavior during another debug exception is undefined.) • Instruction causing this exception SDBBP • Processing − When this exception is raised, processing jumps to a special debug exception handler at 0xBFC0 0200.
PAGE 103
PAGE 104
Architecture Appendix A Instruction Set Details This appendix presents each instruction in alphabetical order, explaining its operation in detail. Exceptions that might occur during the execution of each instruction are listed at the end of each explanation. The direct causes of exceptions and how they are handled are explained elsewhere in this manual, and are not described in detail in this Appendix.
PAGE 105
Architecture Instruction Classes The R3900 Processor Core has five classes of CPU instructions, as follows. • Load/store These instructions transfer data between memory and general-purpose registers. "Base register + 16-bit signed immediate offset" is the only supported addressing mode, so the format of all instructions in this class is I-type. • Computational These instructions perform arithmetic, logical and shift operations on register values.
PAGE 106
Architecture Instruction Formats Every instruction consists of a single word (32 bits) aligned on a word boundary. The main instruction formats are shown in Figure A-1.
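As a rough illustration of the fixed 32-bit formats, the sketch below splits an instruction word into the fields named in the encodings of this appendix. The function name decode_fields is hypothetical, and the field boundaries are taken from the individual instruction encodings (for example ADD, ADDI and J), because Figure A-1 is not reproduced here.

```python
# Sketch: extract every field an I-, J- or R-type instruction may use.
def decode_fields(word):
    """Split a 32-bit instruction word into its candidate fields."""
    word &= 0xFFFFFFFF
    return {
        "op":        (word >> 26) & 0x3F,   # bits 31..26
        "rs":        (word >> 21) & 0x1F,   # bits 25..21
        "rt":        (word >> 16) & 0x1F,   # bits 20..16
        "rd":        (word >> 11) & 0x1F,   # bits 15..11 (R-type)
        "shamt":     (word >> 6) & 0x1F,    # bits 10..6  (R-type)
        "funct":     word & 0x3F,           # bits 5..0   (R-type)
        "immediate": word & 0xFFFF,         # bits 15..0  (I-type)
        "target":    word & 0x03FFFFFF,     # bits 25..0  (J-type)
    }
```

Which fields are meaningful depends on the format selected by op, so a caller would normally look at op (and funct for SPECIAL) before using the rest.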
PAGE 107
Architecture Instruction Notation Conventions In this appendix all variable subfields in an instruction format are written in lower-case letters (rs, rt, immediate, etc.). For some instructions, an alias is used for subfield names, for the sake of clarity. For example, rs in a load/store instruction may be referred to as “base”. Such an alias refers to a subfield that can take a variable value and is therefore also written in lower-case letters.
PAGE 108
Architecture Table A-1. Symbols used in instruction operation notation
Symbol    Meaning
←         Assignment
||        Bit string concatenation
x^y       Replication of bit value x into a y-bit string. Note that x is always a single-bit value.
x_y..z    Selection of bits y through z of bit string x. Little endian bit notation is always used here. If y is less than z, this expression results in an empty (null length) bit string.
PAGE 109
Architecture Examples of Instruction Notation Two examples of the notation used in explaining instructions are given below.
Example 1: GPR[rt] ← immediate || 0^16
This means that 16 zero bits are concatenated with an immediate value (normally 16 bits), and the resulting 32-bit string (with the lower 16 bits cleared to 0) is assigned to general-purpose register (GPR) rt.
Example 2: (immediate_15)^16 || immediate_15..
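The notation can be made concrete with three small helpers. This is a sketch, not part of the manual: the helper names are hypothetical, bit strings are modelled as Python integers, and Example 2 is assumed to continue as a selection of bits 15..0 (the same pattern that appears in full in the load instruction operations), which makes it a sign extension.

```python
# Sketch of the operation notation: replication, selection, concatenation.
def replicate(bit, count):
    """Replication: a single-bit value copied into a count-bit string."""
    return (1 << count) - 1 if bit else 0

def select(value, high, low):
    """Selection of bits high..low; an empty selection is modelled as 0."""
    if high < low:
        return 0
    return (value >> low) & ((1 << (high - low + 1)) - 1)

def concat(left, right, right_width):
    """Concatenation: left || right, where right is right_width bits wide."""
    return (left << right_width) | right

def example1(immediate):
    # immediate || sixteen zero bits
    return concat(select(immediate, 15, 0), 0, 16)

def example2(immediate):
    # sixteen copies of bit 15 || bits 15..0 (assumed completion)
    sign = select(immediate, 15, 15)
    return concat(replicate(sign, 16), select(immediate, 15, 0), 16)
```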
PAGE 110
Architecture Load and Store Instructions With the R3900 Processor Core, the instruction immediately following a load instruction can use the loaded value. Hardware is interlocked for this purpose, causing a delay of one instruction cycle. Programming should be carried out with an awareness of the potential effects of the load delay slot. The descriptions of load/store operations make use of the functions listed in Table A-2 in describing the handling of virtual addresses and physical memory. Table A-2.
PAGE 111
Architecture Table A-3. Load/Store access type designations
Mnemonic     Value  Meaning
WORD         3      Word access (32 bits)
TRIPLEBYTE   2      Triplebyte access (24 bits)
HALFWORD     1      Halfword access (16 bits)
BYTE         0      Byte access (8 bits)
The individual bytes in an addressed word can be determined directly from the access type and the low-order two bits of the address, as shown in Table A-4.
PAGE 112
Architecture Jump and Branch Instructions All jump and branch instructions are executed with a delay of one instruction cycle. This means that the immediately following instruction (the instruction in the delay slot) is executed while the branch target instruction is being fetched. A jump or branch instruction should never be put in the delay slot; if this is done, it will not be detected as an error and the result will be undefined.
PAGE 113
Architecture ADD Add
Encoding : bits 31..26 SPECIAL (000000), bits 25..21 rs, bits 20..16 rt, bits 15..11 rd, bits 10..6 0 (00000), bits 5..0 ADD (100000)
Format : ADD rd, rs, rt
Description : Adds the contents of general-purpose registers rs and rt and puts the result in general-purpose register rd. If carry-out bits 31 and 30 differ, a two's complement overflow exception is raised and destination register rd is not modified.
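The overflow rule (the carries out of bit 30 and bit 31 differ) can be checked with a short model. This is a sketch only: the function names add32 and addu32 are hypothetical, registers are modelled as unsigned 32-bit integers, and Python's OverflowError stands in for the processor's overflow exception.

```python
# Sketch of ADD versus ADDU on 32-bit register values.
MASK32 = 0xFFFFFFFF

def add32(a, b):
    """Model of ADD: raise OverflowError when carries 31 and 30 differ."""
    a &= MASK32
    b &= MASK32
    carry30 = (((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF)) >> 31) & 1
    carry31 = ((a + b) >> 32) & 1
    if carry30 != carry31:
        # The destination register would be left unmodified here.
        raise OverflowError("two's complement overflow")
    return (a + b) & MASK32

def addu32(a, b):
    """Model of ADDU: same sum, never raises."""
    return ((a & MASK32) + (b & MASK32)) & MASK32
```

Adding 0xFFFFFFFF (that is, -1) and 1 produces carries out of both bit positions, so no overflow is reported, while 0x7FFFFFFF + 1 carries only out of bit 30 and is reported.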
PAGE 114
Architecture ADDI Add Immediate
Encoding : bits 31..26 ADDI (001000), bits 25..21 rs, bits 20..16 rt, bits 15..0 immediate
Format : ADDI rt, rs, immediate
Description : Sign-extends a 16-bit immediate value, adds it to the contents of general-purpose register rs and puts the result in general-purpose register rt. If carry-out bits 31 and 30 differ, a two's complement overflow exception is raised and destination register rt is not modified.
Operation : T: GPR[rt] ← GPR[rs] + (immediate_15)^16 || immediate_15..
PAGE 115
Architecture ADDIU Add Immediate Unsigned
Encoding : bits 31..26 ADDIU (001001), bits 25..21 rs, bits 20..16 rt, bits 15..0 immediate
Format : ADDIU rt, rs, immediate
Description : Sign-extends a 16-bit immediate value, adds it to the contents of general-purpose register rs and puts the result in general-purpose register rt. The only difference from ADDI is that ADDIU cannot cause an overflow exception.
Operation : T: GPR[rt] ← GPR[rs] + (immediate_15)^16 || immediate_15..
PAGE 116
Architecture ADDU Add Unsigned
Encoding : bits 31..26 SPECIAL (000000), bits 25..21 rs, bits 20..16 rt, bits 15..11 rd, bits 10..6 0 (00000), bits 5..0 ADDU (100001)
Format : ADDU rd, rs, rt
Description : Adds the contents of general-purpose registers rs and rt and puts the result in general-purpose register rd. The only difference from ADD is that ADDU cannot cause an overflow exception.
PAGE 117
Architecture AND And
Encoding : bits 31..26 SPECIAL (000000), bits 25..21 rs, bits 20..16 rt, bits 15..11 rd, bits 10..6 0 (00000), bits 5..0 AND (100100)
Format : AND rd, rs, rt
Description : Bitwise ANDs the contents of general-purpose registers rs and rt and puts the result in general-purpose register rd.
PAGE 118
Architecture ANDI And Immediate
Encoding : bits 31..26 ANDI (001100), bits 25..21 rs, bits 20..16 rt, bits 15..0 immediate
Format : ANDI rt, rs, immediate
Description : Zero-extends a 16-bit immediate value, bitwise logical ANDs it with the contents of general-purpose register rs and puts the result in general-purpose register rt.
Operation : T: GPR[rt] ← 0^16 || (immediate and GPR[rs]_15..
PAGE 119
Architecture BCzF Branch On Coprocessor z False
Encoding : bits 31..26 COPz (0100xx*), bits 25..21 BC (01000), bits 20..16 BCF (00000), bits 15..0 offset
Format : BCzF offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits).
PAGE 120
Architecture BCzF BCzF Branch On Coprocessor z False (cont.) Exceptions : Coprocessor Unusable exception Operation Code Bit Encoding : BCzF Bit No. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 BC0F 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 Bit No. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 BC1F 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 Bit No.
PAGE 121
Architecture BCzFL Branch On Coprocessor z False Likely
Encoding : bits 31..26 COPz (0100xx*), bits 25..21 BC (01000), bits 20..16 BCFL (00010), bits 15..0 offset
Format : BCzFL offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits).
PAGE 122
Architecture BCzFL Branch On Coprocessor z False Likely (cont.)
Operation :
T − 1: condition ← not COC[z]
T: target ← (offset_15)^14 || offset || 0^2
T + 1: if condition then PC ← PC + target else NullifyCurrentInstruction endif
Exceptions : Coprocessor Unusable exception
Operation Code Bit Encoding : BCzFL Bit No. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 BC0FL 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 Bit No.
PAGE 123
Architecture BCzT Branch On Coprocessor z True
Encoding : bits 31..26 COPz (0100xx*), bits 25..21 BC (01000), bits 20..16 BCT (00001), bits 15..0 offset
Format : BCzT offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits).
PAGE 124
Architecture BCzT BCzT Branch On Coprocessor z True (cont.) Exceptions : Coprocessor Unusable exception Operation Code Bit Encoding : BCzT Bit No. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 BC0T 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 Bit No. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 BC1T 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 Bit No.
PAGE 125
Architecture BCzTL Branch On Coprocessor z True Likely
Encoding : bits 31..26 COPz (0100xx*), bits 25..21 BC (01000), bits 20..16 BCTL (00011), bits 15..0 offset
Format : BCzTL offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits).
PAGE 126
Architecture BCzTL BCzTL Branch On Coprocessor z True Likely (cont.) Exceptions : Coprocessor Unusable exception Operation Code Bit Encoding : BCzTL Bit No. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 BC0TL 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 1 Bit No. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 BC1TL 0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 1 Bit No.
PAGE 127
Architecture BEQ Branch On Equal
Encoding : bits 31..26 BEQ (000100), bits 25..21 rs, bits 20..16 rt, bits 15..0 offset
Format : BEQ rs, rt, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). The contents of general registers rs and rt are compared and, if equal, the program branches to the target address after a one-cycle delay.
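The target calculation shared by the branch instructions can be sketched as follows. The function name branch_target is hypothetical, and the delay slot is taken to be the word immediately after the branch (branch address + 4), which follows from every instruction being one word long.

```python
# Sketch of the branch target address calculation.
MASK32 = 0xFFFFFFFF

def branch_target(branch_addr, offset16):
    """Delay-slot address plus the offset, shifted left 2 and sign-extended."""
    delay_slot = (branch_addr + 4) & MASK32
    displacement = (offset16 & 0xFFFF) << 2   # 18-bit value
    if displacement & 0x20000:                # sign bit of the 18-bit value
        displacement -= 0x40000
    return (delay_slot + displacement) & MASK32
```

An offset of 0xFFFF (that is, -1) therefore branches back to the branch instruction itself, and an offset of 0 names the instruction after the delay slot.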
PAGE 128
Architecture BEQL Branch On Equal Likely
Encoding : bits 31..26 BEQL (010100), bits 25..21 rs, bits 20..16 rt, bits 15..0 offset
Format : BEQL rs, rt, offset
Description : Generates the branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). It compares the contents of general registers rs and rt and, if equal, the program branches to the target address after a one-cycle delay.
PAGE 129
Architecture BGEZ Branch On Greater Than Or Equal To Zero
Encoding : bits 31..26 BCOND (000001), bits 25..21 rs, bits 20..16 BGEZ (00001), bits 15..0 offset
Format : BGEZ rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). If the sign bit of the value in general-purpose register rs is 0 (i.e.
PAGE 130
Architecture BGEZAL Branch On Greater Than Or Equal To Zero And Link
Encoding : bits 31..26 BCOND (000001), bits 25..21 rs, bits 20..16 BGEZAL (10001), bits 15..0 offset
Format : BGEZAL rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits).
PAGE 131
Architecture BGEZALL Branch On Greater Than Or Equal To Zero And Link Likely
Encoding : bits 31..26 BCOND (000001), bits 25..21 rs, bits 20..16 BGEZALL (10011), bits 15..0 offset
Format : BGEZALL rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits).
PAGE 132
Architecture BGEZL Branch On Greater Than Or Equal To Zero Likely
Encoding : bits 31..26 BCOND (000001), bits 25..21 rs, bits 20..16 BGEZL (00011), bits 15..0 offset
Format : BGEZL rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). If the sign bit of the value in general-purpose register rs is 0 (i.e.
PAGE 133
Architecture BGTZ Branch On Greater Than Zero
Encoding : bits 31..26 BGTZ (000111), bits 25..21 rs, bits 20..16 0 (00000), bits 15..0 offset
Format : BGTZ rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). If the value in general-purpose register rs is positive (i.e.
PAGE 134
Architecture BGTZL Branch On Greater Than Zero Likely
Encoding : bits 31..26 BGTZL (010111), bits 25..21 rs, bits 20..16 0 (00000), bits 15..0 offset
Format : BGTZL rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). If the value in general-purpose register rs is positive (i.e.
PAGE 135
Architecture BLEZ Branch On Less Than Or Equal To Zero
Encoding : bits 31..26 BLEZ (000110), bits 25..21 rs, bits 20..16 0 (00000), bits 15..0 offset
Format : BLEZ rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). If the value in general-purpose register rs is negative or 0 (i.e.
PAGE 136
Architecture BLEZL Branch On Less Than Or Equal To Zero Likely
Encoding : bits 31..26 BLEZL (010110), bits 25..21 rs, bits 20..16 0 (00000), bits 15..0 offset
Format : BLEZL rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). If the value in general-purpose register rs is negative or 0 (i.e.
PAGE 137
Architecture BLTZ Branch On Less Than Zero
Encoding : bits 31..26 BCOND (000001), bits 25..21 rs, bits 20..16 BLTZ (00000), bits 15..0 offset
Format : BLTZ rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). If the value in general-purpose register rs is negative (i.e., the sign bit of rs is 1), the program branches to the target address after a one-cycle delay.
PAGE 138
Architecture BLTZAL Branch On Less Than Zero And Link
Encoding : bits 31..26 BCOND (000001), bits 25..21 rs, bits 20..16 BLTZAL (10000), bits 15..0 offset
Format : BLTZAL rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits).
PAGE 139
Architecture BLTZALL Branch On Less Than Zero And Link Likely
Encoding : bits 31..26 BCOND (000001), bits 25..21 rs, bits 20..16 BLTZALL (10010), bits 15..0 offset
Format : BLTZALL rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits).
PAGE 140
Architecture BLTZL Branch On Less Than Zero Likely
Encoding : bits 31..26 BCOND (000001), bits 25..21 rs, bits 20..16 BLTZL (00010), bits 15..0 offset
Format : BLTZL rs, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). If the value in general-purpose register rs is negative (i.e.
PAGE 141
Architecture BNE Branch On Not Equal
Encoding : bits 31..26 BNE (000101), bits 25..21 rs, bits 20..16 rt, bits 15..0 offset
Format : BNE rs, rt, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). The contents of general registers rs and rt are compared and, if not equal, the program branches to the target address after a one-cycle delay.
PAGE 142
Architecture BNEL Branch On Not Equal Likely
Encoding : bits 31..26 BNEL (010101), bits 25..21 rs, bits 20..16 rt, bits 15..0 offset
Format : BNEL rs, rt, offset
Description : Generates a branch target address by adding the address of the instruction in the delay slot to the 16-bit offset (that has been left-shifted two bits and sign-extended to 32 bits). The contents of general registers rs and rt are compared and, if not equal, the program branches to the target address after a one-cycle delay.
PAGE 143
Architecture BREAK Breakpoint
Encoding : bits 31..26 SPECIAL (000000), bits 25..6 code (20 bits), bits 5..0 BREAK (001101)
Format : BREAK code
Description : Raises a Breakpoint exception, then immediately passes control to an exception handler. The code field can be used to pass software parameters, but the only way for the exception handler to retrieve the code field is to use the EPC register to load the contents of the memory word containing this instruction.
PAGE 144
Architecture CACHE CACHE Cache 31 26 25 CACHE 21 20 16 15 0 base op offset 5 5 16 101111 6 Format : CACHE op, offset(base) Description : Generates a virtual address by sign-extending the 16-bit offset and adding the result to the contents of register base. The virtual address is translated to a physical address, and a 5-bit sub-opcode designates the cache operation to be performed at that address.
PAGE 145
Architecture CACHE Cache (cont.) Bits 20..18 of the CACHE instruction select the operation to be performed as follows.
Bits 20..18   Cache ID   Operation Name     Description
0 0 0         I          IndexInvalidate    Sets the cache state of the cache block to Invalid. This instruction is valid only when the instruction cache is disabled (Config register ICE bit is 0).
0 0 1         D          IndexLRUBitClear   Clears the LRU bit of the cache at the designated index.
PAGE 146
Architecture CFCz CFCz Move Control From Coprocessor 31 26 25 21 20 16 15 11 10 0 COPz CF 0100xx* 00010 6 5 rt rd 0 000 0000 0000 5 5 11 Format : CFCz rt, rd Description : Loads the contents of coprocessor z's control register rd into general-purpose register rt. This instruction is not valid when issued for CP0. Operation : T: GPR[rt] ← CCR[z, rd] Exceptions : Coprocessor Unusable exception * Operation Code Bit Encoding : CFCz 21 0 0 Bit No.
PAGE 147
Architecture COPz Coprocessor Operation 31 26 25 24 COPz CO 0100xx* 1 6 1 COPz 0 cofun 25 Format : COPz cofun Description : Performs the operation designated by cofun in coprocessor z. This operation may involve selecting or accessing internal coprocessor registers or changing the status of the coprocessor condition signal (CPCOND), but will not modify internal states of the processor or cache/memory system.
PAGE 148
Architecture CTCz Move Control To Coprocessor
Encoding : bits 31..26 COPz (0100xx*), bits 25..21 CT (00110), bits 20..16 rt, bits 15..11 rd, bits 10..0 0 (000 0000 0000)
Format : CTCz rt, rd
Description : Loads the contents of general register rt into control register rd of coprocessor z. This instruction is not valid when issued for CP0.
Operation : T: CCR[z, rd] ← GPR[rt]
Exceptions : Coprocessor Unusable exception
* Refer to the section entitled "Bit Encoding of CPU Instruction Opcodes" at the end of this appendix.
PAGE 149
Architecture DERET DERET Debug Exception Return 31 26 25 24 65 0 COP0 CO 0 DERET 010000 1 000 0000 0000 0000 0000 011111 6 1 19 6 Format : DERET Description : Executes a return from a self-debug interrupt or exception. This instruction requires a branch delay slot like that of the branch or jump instructions, and executes with a delay of one instruction cycle. The DERET instruction itself cannot be put in the delay slot.
PAGE 150
Architecture DIV DIV Divide 31 26 25 SPECIAL 21 20 rs 16 15 65 rt 000000 6 5 5 0 0 DIV 00 0000 0000 011010 10 6 Format : DIV rs, rt Description : Divides the contents of general register rs by the contents of general register rt, treating both operands as two's complement integers. An overflow exception is never raised. If the divisor is zero, the result is undefined. Ordinarily, instructions are placed after this instruction to check for zero division and overflow.
PAGE 151
Architecture DIVU Divide Unsigned
Encoding : bits 31..26 SPECIAL (000000), bits 25..21 rs, bits 20..16 rt, bits 15..6 0 (00 0000 0000), bits 5..0 DIVU (011011)
Format : DIVU rs, rt
Description : This instruction divides the contents of general register rs by the contents of general register rt, treating both operands as unsigned integers. An integer overflow exception is never raised. If the divisor is zero, the result is undefined.
PAGE 152
Architecture J Jump 31 J 26 25 0 J target 000010 6 26 Format : J target Description : Generates a jump target address by left-shifting the 26-bit target by two bits and combining the result with the high-order 4 bits of the address of the instruction in the delay slot. The program jumps unconditionally to this address after a delay of one instruction cycle. Operation : T: T + 1: temp ← target PC ← PC31..
PAGE 153
Architecture JAL JAL Jump And Link 31 26 25 0 JAL target 000011 6 26 Format : JAL target Description : Generates a jump target address by left-shifting the 26-bit target by 2 bits and combining the result with the high-order 4 bits of the address of the instruction in the delay slot. The program jumps unconditionally to this address after a delay of one instruction cycle. The address of the instruction after the delay slot is placed in link register r31 as the return address from the jump.
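The jump target and link address described for J and JAL can be sketched as follows. The function name jal_addresses is hypothetical; the delay slot is taken to be the word at the jump address + 4, so the instruction after the delay slot is at the jump address + 8.

```python
# Sketch of the J/JAL target and the JAL return address.
MASK32 = 0xFFFFFFFF

def jal_addresses(jal_addr, target26):
    """Return (jump_target, link_address) for a JAL at jal_addr."""
    delay_slot = (jal_addr + 4) & MASK32
    # High-order 4 bits of the delay-slot address, then target << 2.
    jump_target = (delay_slot & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)
    link_address = (jal_addr + 8) & MASK32   # value placed in r31
    return jump_target, link_address
```

Because only the low 28 bits come from the instruction, a J or JAL can reach any word in the 256 Mbyte region that contains its delay slot, and nothing outside it.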
PAGE 154
Architecture JALR JALR Jump And Link Register 31 26 25 21 20 SPECIAL 16 15 0 rs 000000 11 10 rd 00000 6 5 5 5 65 0 0 JALR 00000 001001 5 6 Format : JALR rs JALR rd, rs Description : Causes the program to jump unconditionally to the address in general register rs after a delay of one instruction cycle. The address of the instruction following the delay slot is put in general register rd as the return address from the jump.
PAGE 155
Architecture JR JR Jump Register 31 26 25 SPECIAL 21 20 0 0 JR 000 0000 0000 0000 001000 15 6 rs 000000 6 65 5 Format : JR rs Description : Causes the program to jump unconditionally to the address in general register rs after a delay of one instruction cycle. Since instructions must be aligned on a word boundary, the two low-order bits of target register rs must be 00. If not, an Address Error exception will be raised when the target instruction is fetched.
PAGE 156
Architecture LB LB Load Byte 31 26 25 LB 21 20 16 15 base rt 5 5 0 offset 100000 6 16 Format : LB rt, offset(base) Description : Generates a 32-bit effective address by sign-extending the 16-bit offset and adding it to the contents of general-purpose register base. It then sign-extends the byte at the memory location pointed to by the effective address and loads the result into general-purpose register rt. Operation : T: vAddr ← ((offset15)16 || offset15..
PAGE 157
Architecture LBU LBU Load Byte Unsigned 31 26 25 LBU 21 20 base 16 15 0 rt offset 5 16 100100 6 5 Format : LBU rt, offset(base) Description : Generates a 32-bit effective address by sign-extending the 16-bit offset and adding it to the contents of general-purpose register base. It then zero-extends the byte at the memory location pointed to by the effective address and loads the result into general-purpose register rt. Operation : T: vAddr ← ((offset15)16 || offset15..
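The difference between LB and LBU lies only in how the loaded byte is widened, which a short model makes visible. The function names are hypothetical, and memory access and address translation are left out; only the effective address arithmetic and the extension step are modelled.

```python
# Sketch of effective address generation and byte extension for LB/LBU.
MASK32 = 0xFFFFFFFF

def effective_address(base, offset16):
    """Sign-extend the 16-bit offset and add it to the base register."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:
        offset -= 0x10000
    return (base + offset) & MASK32

def lb_extend(byte):
    """LB: sign-extend the loaded byte to 32 bits."""
    byte &= 0xFF
    return byte | 0xFFFFFF00 if byte & 0x80 else byte

def lbu_extend(byte):
    """LBU: zero-extend the loaded byte to 32 bits."""
    return byte & 0xFF
```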
PAGE 158
Architecture LH LH Load Halfword 31 26 25 LH 21 20 16 15 0 base rt offset 5 5 16 100001 6 Format : LH rt, offset(base) Description : Generates a 32-bit effective address by sign-extending the 16-bit offset and adding it to the contents of general-purpose register base. It then sign-extends the halfword at the memory location pointed to by the effective address and loads the result into general-purpose register rt. If the effective address is not aligned on a halfword boundary, i.e.
PAGE 159
Architecture LHU LHU Load Halfword Unsigned 31 26 25 LHU 21 20 16 15 0 base rt offset 5 5 16 100101 6 Format : LHU rt, offset(base) Description : Generates a 32-bit effective address by sign-extending the 16-bit offset and adding it to the contents of general-purpose register base. It then zero-extends the halfword at the memory location pointed to by the effective address and loads the result into general-purpose register rt.
PAGE 160
Architecture LUI Load Upper Immediate
Encoding : bits 31..26 LUI (001111), bits 25..21 0 (00000), bits 20..16 rt, bits 15..0 immediate
Format : LUI rt, immediate
Description : Left-shifts the 16-bit immediate by 16 bits, zero-fills the low-order 16 bits of the word, and puts the result in general register rt.
PAGE 161
Architecture LW LW Load Word 31 26 25 LW 21 20 16 15 0 base rt offset 5 5 16 100011 6 Format : LW rt, offset(base) Description : Generates a 32-bit effective address by sign-extending the 16-bit offset and adding it to the contents of general-purpose register base. It then loads the word at the memory location pointed to by the effective address into general-purpose register rt. If the effective address is not aligned on a word boundary, i.e.
PAGE 162
Architecture LWL LWL Load Word Left 31 26 25 LWL 21 20 16 15 0 base rt offset 5 5 16 100010 6 Format : LWL rt, offset(base) Description : Used together with LWR to load four consecutive bytes to a register when the bytes cross a word boundary. LWL loads the left part of the register from the appropriate part of the high-order word; LWR loads the right part of the register from the appropriate part of the low-order word.
PAGE 163
Architecture LWL Load Word Left (cont.)
A load instruction that uses the same rt as the LWL instruction may be placed immediately before LWL (or LWR). The contents of general-purpose register rt are bypassed internally in the processor, eliminating the need for a NOP between the two instructions. No Address Error exception is raised due to misalignment.
Operation :
T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
pAddr ← pAddr_31..
PAGE 164
Architecture LWR LWR Load Word Right 31 26 25 LWR 21 20 16 15 0 base rt offset 5 5 16 100110 6 Format : LWR rt, offset(base) Description : Used together with LWL to load four consecutive bytes to a register when the bytes cross a word boundary. LWR loads the right part of the register from the appropriate part of the low-order word; LWL loads the left part of the register from the appropriate part of the high-order word.
PAGE 165
Architecture LWR Load Word Right (cont.)
A load instruction that uses the same rt as the LWR instruction may be placed immediately before LWR. The contents of general-purpose register rt are bypassed internally in the processor, eliminating the need for a NOP between the two instructions. No Address Error exception is raised due to misalignment.
Operation :
T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
pAddr ← pAddr_31..
PAGE 166
Architecture MADD Multiply/Add 31 26 25 MADD / MADDU 21 20 rs 16 15 rt MADD 11 10 rd 011100 6 5 5 5 65 0 0 MADD 00000 000000 5 6 Format : MADD rs, rt MADD rd, rs, rt Description : Multiplies the contents of general registers rs and rt, treating both values as two's complement, and puts the double-word result in special registers HI and LO. An overflow exception is never raised.
PAGE 167
Architecture MADDU MADDU Multiply/Add Unsigned 31 26 25 MADD/MADDU 21 20 rs 16 15 rt 11 10 rd 011100 6 5 5 5 65 0 0 MADDU 00000 000001 5 6 Format : MADDU rs, rt MADDU rd, rs, rt Description : Multiplies the contents of general registers rs and rt, treating both values as unsigned , and puts the double-word result in special registers HI and LO. An overflow exception is never raised.
PAGE 168
Architecture MFC0 MFC0 Move From System Control Coprocessor 31 26 25 21 20 COP0 MF 010000 00000 6 5 16 15 rt 11 10 rd 0 0 000 0000 0000 5 5 11 Format : MFC0 rt, rd Description : Loads the contents of coprocessor CP0 register rd into general-purpose register rt.
PAGE 169
Architecture MFCz MFCz Move From Coprocessor 31 26 25 21 20 COPz MF 0100xx* 00000 6 5 16 15 rt 11 10 rd 0 0 000 0000 0000 5 5 11 Format : MFCz rt, rd Description : Loads the contents of coprocessor z register rd into general-purpose register rt.
PAGE 170
Architecture MFCz Move From Coprocessor (cont.) MFCz
*Operation Code Bit Encoding : MFCz
Bit No.  31 30 29 28 27 26 25 24 23 22 21
MFC0      0  1  0  0  0  0  0  0  0  0  0
MFC1      0  1  0  0  0  1  0  0  0  0  0
MFC2      0  1  0  0  1  0  0  0  0  0  0
MFC3      0  1  0  0  1  1  0  0  0  0  0
Field labels: opcode, coprocessor unit no., coprocessor sub-opcode
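The encoding rows above differ only in bits 27..26, which carry the coprocessor unit number. The following sketch builds an MFCz instruction word and extracts the unit number again. The function names are illustrative, and the rt and rd field positions (bits 20..16 and 15..11) are taken from the MFCz format shown on the preceding page.

```python
def encode_mfc(z: int, rt: int, rd: int) -> int:
    """Build an MFCz word: 0100 | z | MF sub-opcode 00000 | rt | rd | zeros."""
    return ((0b0100 << 28)
            | ((z & 0x3) << 26)      # coprocessor unit number
            | (0b00000 << 21)        # MF sub-opcode
            | ((rt & 0x1F) << 16)
            | ((rd & 0x1F) << 11))

def coprocessor_unit(word: int) -> int:
    """Return the coprocessor unit number held in bits 27..26."""
    return (word >> 26) & 0x3
```

For example, encode_mfc(0, 8, 12) yields the MFC0 word that moves CP0 register 12 into general-purpose register 8.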
PAGE 171
Architecture MFHI MFHI Move From HI 31 26 25 16 15 SPECIAL 0 000000 00 0000 0000 6 10 11 10 rd 5 65 0 0 MFHI 00000 010000 5 6 Format : MFHI rd Description : Loads the contents of special register HI into general-purpose register rd. To guarantee correct operation even if an interrupt occurs, neither of the two instructions following MFHI should be DIV or DIVU instructions which modify the HI register contents.
PAGE 172
Architecture MFLO MFLO Move From LO 31 26 25 16 15 SPECIAL 0 000000 00 0000 0000 6 10 11 10 rd 5 65 0 0 MFLO 00000 010010 5 6 Format : MFLO rd Description : Loads the contents of special register LO into general-purpose register rd. To guarantee correct operation even if an interrupt occurs, neither of the two instructions following MFLO should be DIV or DIVU instructions which modify the LO register contents.
PAGE 173
Architecture MTC0 MTC0 Move To System Control Coprocessor 31 26 25 21 20 COP0 MT 010000 00100 6 5 16 15 rt 11 10 rd 0 0 000 0000 0000 5 5 11 Format : MTC0 rt, rd Description : Loads the contents of general-purpose register rt into CP0 coprocessor register rd.
PAGE 174
Architecture MTCz MTCz Move To Coprocessor 31 26 25 21 20 COPz MT 0100xx* 00100 6 5 16 15 11 10 rt 0 0 rd 000 0000 0000 5 5 11 Format : MTCz rt, rd Description : Loads the contents of general-purpose register rt into coprocessor z register rd. Operation : T: CPR[z, rd] ← GPR[rt] Exceptions : Coprocessor Unusable exception * Operation Code Bit Encoding : MTCz Bit No. 31 30 29 28 27 26 25 24 23 22 21 COP0 0 1 0 0 0 0 0 0 1 0 0 Bit No.
PAGE 175
Architecture MTHI MTHI Move To HI 31 26 25 SPECIAL 21 20 0 0 MTHI 000 0000 0000 0000 010001 15 6 rs 000000 6 65 5 Format : MTHI rs Description : Loads the contents of general-purpose register rs into special register HI. If executed after a DIV or DIVU instruction or before a MFLO, MFHI, MTLO or MTHI instruction, the contents of special register LO will be undefined.
PAGE 176
Architecture MTLO MTLO Move To LO 31 26 25 SPECIAL 21 20 0 0 MTLO 000 0000 0000 0000 010011 rs 000000 6 65 5 15 6 Format : MTLO rs Description : Loads the contents of general-purpose register rs into special register LO. If executed after a DIV or DIVU instruction or before a MFLO, MFHI, MTLO or MTHI instruction, the contents of special register HI will be undefined.
PAGE 177
Architecture MULT MULT Multiply 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 MULT 00000 011000 5 6 Format : MULT rs, rt MULT rd, rs, rt Description : Multiplies the contents of general-purpose register rs by the contents of general register rt, treating both register values as 32-bit two's complement values. This instruction cannot raise an integer overflow exception.
PAGE 178
Architecture MULTU MULTU Multiply Unsigned 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 MULTU 00000 011001 5 6 Format : MULTU rs, rt MULTU rd, rs, rt Description : Multiplies the contents of general-purpose register rs by the contents of general register rt, treating both register values as 32-bit unsigned values. This instruction cannot raise an integer overflow exception.
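The difference between MULT and MULTU lies only in how the two 32-bit operands are interpreted before multiplication. The sketch below models both, returning the 64-bit product as a (HI, LO) pair. The split of the product across HI and LO follows the usual MIPS convention and is an assumption here, since the excerpt above is truncated before it states where the result goes; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mult(rs: int, rt: int) -> tuple[int, int]:
    """Signed multiply model: returns (HI, LO)."""
    product = (to_signed32(rs) * to_signed32(rt)) & MASK64
    return product >> 32, product & MASK32

def multu(rs: int, rt: int) -> tuple[int, int]:
    """Unsigned multiply model: returns (HI, LO)."""
    product = (rs & MASK32) * (rt & MASK32)
    return product >> 32, product & MASK32
```

With rs = 0xFFFFFFFF and rt = 2, the signed form treats rs as -1 and the unsigned form treats it as 4294967295, so only the HI half differs.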
PAGE 179
Architecture NOR NOR Nor 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 NOR 00000 100111 5 6 Format : NOR rd, rs, rt Description : Bitwise NORs the contents of general register rs with the contents of general register rt, and loads the result in general register rd.
PAGE 180
Architecture OR OR Or 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 OR 00000 100101 5 6 Format : OR rd, rs, rt Description : Bitwise ORs the contents of general-purpose register rs with the contents of general-purpose register rt, and loads the result in general-purpose register rd.
PAGE 181
Architecture ORI ORI Or Immediate 31 26 25 ORI 21 20 16 15 0 rs rt immediate 5 5 16 001101 6 Format : ORI rt, rs, immediate Description : Zero-extends the 16-bit immediate value, bitwise ORs the result with the contents of general-purpose register rs, and loads the result in general-purpose register rt. Operation : T: GPR[rt] ← GPR[rs]31..16 || (immediate or GPR[rs]15..
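Because the immediate is zero-extended, ORI can only affect the low-order 16 bits of the result; the high-order half of rs passes through unchanged. A minimal sketch of this behavior follows (the function name is illustrative).

```python
def ori(rs_value: int, immediate: int) -> int:
    """ORI model: zero-extend the 16-bit immediate and OR it with rs."""
    return (rs_value & 0xFFFFFFFF) | (immediate & 0xFFFF)
```

Note that an immediate of 0xFFFF sets only bits 15..0; it is not sign-extended to 0xFFFFFFFF.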
PAGE 182
Architecture RFE RFE Restore From Exception 31 26 25 24 65 0 COP0 CO 0 RFE 010000 1 000 0000 0000 0000 0000 010000 1 19 6 6 Format : RFE Description : Copies the Status register bits for previous interrupt mask mode and previous kernel/user mode (IEp and KUp) to the current mode bits (IEc and KUc), and copies the old mode bits (IEo and KUo) to the previous mode bits (IEp and KUp). The old mode bits remain unchanged.
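The effect of RFE on the Status register can be modeled as a shift of the mode-bit stack. The sketch below assumes the conventional R3000A layout of the six mode bits (IEc in bit 0, KUc in bit 1, IEp in bit 2, KUp in bit 3, IEo in bit 4, KUo in bit 5); that layout is not given in the excerpt above and should be checked against the Status register description.

```python
def rfe(status: int) -> int:
    """RFE model: previous -> current, old -> previous, old unchanged."""
    low6 = status & 0x3F
    old_bits = low6 & 0x30        # KUo/IEo remain where they are
    shifted = low6 >> 2           # KUo/IEo -> KUp/IEp, KUp/IEp -> KUc/IEc
    return (status & 0xFFFFFFC0) | old_bits | shifted
```

All Status bits above bit 5 are left unchanged by this model.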
PAGE 183
Architecture SB SB Store Byte 31 26 25 SB 21 20 16 15 0 base rt offset 5 5 16 101000 6 Format : SB rt, offset(base) Description : Generates a 32-bit effective address by sign-extending the 16-bit offset and adding it to the contents of general-purpose register base. It then stores the least significant byte of register rt at the resulting effective address. Operation : T: vAddr ← ((offset15)16 || offset15..
PAGE 184
Architecture SDBBP SDBBP Software Debug Breakpoint 31 26 25 65 SPECIAL code 000000 0 SDBBP 001110 6 20 6 Format : SDBBP code Description : Raises a Debug Breakpoint exception, passing control to an exception handler. The code field can be used for passing information to the exception handler, but the only way to have the code field retrieved by the exception handler is to load the contents of the memory word containing this instruction using the DEPC register.
PAGE 185
Architecture SH SH Store Halfword 31 26 25 SH 21 20 16 15 0 base rt offset 5 5 16 101001 6 Format : SH rt, offset(base) Description : Generates an unsigned 32-bit effective address by sign-extending the 16-bit offset and adding it to the contents of general-purpose register base. It then stores the least significant halfword of register rt at the resulting effective address.
PAGE 186
Architecture SLL SLL Shift Left Logical 31 26 25 21 20 SPECIAL 0 000000 00000 6 5 16 15 rt 11 10 rd 65 sa 0 SLL 000000 5 5 5 6 Format : SLL rd, rt, sa Description : Left-shifts the contents of general-purpose register rt by sa bits, zero-fills the low-order bits, and puts the result in register rd. Operation : T: GPR[rd] ← GPR[rt]31-sa..
PAGE 187
Architecture SLLV SLLV Shift Left Logical Variable 31 26 25 21 20 SPECIAL rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 SLLV 0 0000 000100 5 6 Format : SLLV rd, rt, rs Description : Left-shifts the contents of general-purpose register rt (by the number of bits designated in the low-order five bits of general-purpose register rs), zero-fills the low-order bits and puts the 32-bit result in register rd. Operation : T: s ← GPR[rs]4..0 GPR[rd] ← GPR[rt](31-s)..
PAGE 188
Architecture SLT SLT Set On Less Than 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 SLT 00000 101010 5 6 Format : SLT rd, rs, rt Description : Compares the contents of general-purpose registers rt and rs as 32-bit signed integers. A 1, if rs is less than rt, or a 0, otherwise, is placed in general-purpose register rd as the result of the comparison. No overflow exception is raised.
PAGE 189
Architecture SLTI SLTI Set On Less Than Immediate 31 26 25 SLTI 21 20 16 15 0 rs rt immediate 5 5 16 001010 6 Format : SLTI rt, rs, immediate Description : Sign-extends the 16-bit immediate value and compares the result with the contents of general-purpose register rs, treating both values as 32-bit signed integers. A 1, if rs is less than the sign-extended immediate value, or a 0, otherwise, is placed in general-purpose register rt as the result of the comparison.
PAGE 190
Architecture SLTIU SLTIU Set On Less Than Immediate Unsigned 31 26 25 SLTIU 21 20 16 15 0 rs rt immediate 5 5 16 001011 6 Format : SLTIU rt, rs, immediate Description : Sign-extends the 16-bit immediate value and compares the result with the contents of general-purpose register rs, treating both values as 32-bit unsigned integers. A 1, if rs is less than the sign-extended immediate value, or a 0, otherwise, is placed in general-purpose register rt as the result of the comparison.
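SLTI and SLTIU both sign-extend the immediate; they differ only in how the two 32-bit values are then compared. The following sketch makes that distinction explicit (function names are illustrative).

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm: int) -> int:
    """Sign-extend a 16-bit immediate to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def slti(rs: int, imm: int) -> int:
    """Signed compare of rs against the sign-extended immediate."""
    return 1 if to_signed32(rs) < sign_extend16(imm) else 0

def sltiu(rs: int, imm: int) -> int:
    """Unsigned compare of rs against the sign-extended immediate."""
    return 1 if (rs & MASK32) < (sign_extend16(imm) & MASK32) else 0
```

An immediate of 0xFFFF becomes 0xFFFFFFFF after sign extension, so sltiu(5, 0xFFFF) is 1 while slti(5, 0xFFFF) is 0.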
PAGE 191
Architecture SLTU SLTU Set On Less Than Unsigned 31 26 25 21 20 SPECIAL rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 SLTU 00000 101011 5 6 Format : SLTU rd, rs, rt Description : Compares the contents of general registers rt and rs as 32-bit unsigned integers. A 1, if rs is less than rt, or a 0, otherwise, is placed in general-purpose register rd as the result of the comparison. No overflow exception is raised.
PAGE 192
Architecture SRA SRA Shift Right Arithmetic 31 26 25 21 20 SPECIAL 0 000000 00000 6 5 16 15 rt 11 10 rd 65 sa 0 SRA 000011 5 5 5 6 Format : SRA rd, rt, sa Description : Right-shifts the contents of general-purpose register rt by sa bits, sign-extends the high-order bits, and puts the result in register rd. Operation : T: GPR[rd] ← (GPR[rt]31)sa || GPR[rt]31..
PAGE 193
Architecture SRAV SRAV Shift Right Arithmetic Variable 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 SRAV 00000 000111 5 6 Format : SRAV rd, rt, rs Description : Right-shifts the contents of general-purpose register rt (by the number of bits designated in the low-order five bits of general-purpose register rs), sign-extends the high-order bits, and puts the result in register rd. Operation : T: s ← GPR[rs]4..0 GPR[rd] ← (GPR[rt]31)s|| GPR[rt]31..
PAGE 194
Architecture SRL SRL Shift Right Logical 31 26 25 21 20 SPECIAL 0 000000 00000 6 5 16 15 rt 11 10 rd 65 sa 0 SRL 000010 5 5 5 6 Format : SRL rd, rt, sa Description : Right-shifts the contents of general-purpose register rt by sa bits, zero-fills the high-order bits, and puts the result in register rd. Operation : T: GPR[rd] ← 0sa || GPR[rt]31..
PAGE 195
Architecture SRLV SRLV Shift Right Logical Variable 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 SRLV 00000 000110 5 6 Format : SRLV rd, rt, rs Description : Right-shifts the contents of general register rt (by the number of bits designated in the low-order five bits of general register rs), zero-fills the high-order bits, and puts the result in register rd. Operation : T: s ← GPR[rs]4..0 GPR[rd] ← 0s || GPR[rt]31..
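The right-shift instructions differ in what fills the vacated high-order bits (zeros for SRL/SRLV, copies of bit 31 for SRA/SRAV) and in where the shift amount comes from (the sa field, or the low-order five bits of rs). A minimal model, with illustrative function names:

```python
MASK32 = 0xFFFFFFFF

def srl(rt: int, sa: int) -> int:
    """Logical right shift: zero-fill the high-order bits."""
    return (rt & MASK32) >> (sa & 0x1F)

def sra(rt: int, sa: int) -> int:
    """Arithmetic right shift: replicate bit 31 into the high-order bits."""
    value = rt & MASK32
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret as a negative number
    return (value >> (sa & 0x1F)) & MASK32

def srlv(rt: int, rs: int) -> int:
    """Variable form: only the low-order five bits of rs are used."""
    return srl(rt, rs & 0x1F)

def srav(rt: int, rs: int) -> int:
    """Variable form: only the low-order five bits of rs are used."""
    return sra(rt, rs & 0x1F)
```

Because only five bits of rs are used, a register value of 36 shifts by 4.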
PAGE 196
Architecture SUB SUB Subtract 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 SUB 00000 100010 5 6 Format : SUB rd, rs, rt Description : Subtracts the contents of general-purpose register rt from general-purpose register rs and puts the result in general-purpose register rd. If carry-out bits 31 and 30 differ, a two's complement overflow exception is raised and destination register rd is not modified.
PAGE 197
Architecture SUBU SUBU Subtract Unsigned 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 SUBU 00000 100011 5 6 Format : SUBU rd, rs, rt Description : Subtracts the contents of general-purpose register rt from general-purpose register rs and puts the result in general-purpose register rd. The only difference from SUB is that SUBU cannot cause an overflow exception.
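The contrast between SUB and SUBU can be sketched as follows. The model detects overflow with a signed range check on the exact difference, which is an equivalent substitute for the carry-out comparison described for SUB; the exception class and function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

class IntegerOverflow(Exception):
    """Stands in for the two's complement overflow exception."""

def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def sub(rs: int, rt: int) -> int:
    """SUB model: raise on signed overflow, so rd is never written."""
    diff = to_signed32(rs) - to_signed32(rt)
    if not -(1 << 31) <= diff < (1 << 31):
        raise IntegerOverflow("result does not fit in 32 signed bits")
    return diff & MASK32

def subu(rs: int, rt: int) -> int:
    """SUBU model: same subtraction, wraps silently."""
    return (rs - rt) & MASK32
```

Subtracting 1 from 0x80000000 overflows under sub but simply wraps to 0x7FFFFFFF under subu.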
PAGE 198
Architecture SW SW Store Word 31 26 25 SW 21 20 16 15 0 base rt offset 5 5 16 101011 6 Format : SW rt, offset(base) Description : Generates a 32-bit effective address by sign-extending the 16-bit offset value and adding it to the contents of general-purpose register base. It then stores the contents of register rt at the resulting effective address.
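The effective-address calculation shared by the load and store instructions can be written out as a short model: the 16-bit offset is sign-extended and added to the base register, with the sum wrapping to 32 bits. The function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(offset: int) -> int:
    """Sign-extend a 16-bit offset to a Python integer."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset

def effective_address(base_value: int, offset: int) -> int:
    """vAddr = sign_extend(offset) + GPR[base], truncated to 32 bits."""
    return (base_value + sign_extend16(offset)) & MASK32
```

An offset field of 0xFFFC therefore addresses the word 4 bytes below the base.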
PAGE 199
Architecture SWL SWL Store Word Left 31 26 25 SWL 21 20 16 15 0 base rt offset 5 5 16 101010 6 Format : SWL rt, offset(base) Description : Used together with SWR to store the contents of a register into four consecutive bytes of memory when the bytes cross a word boundary. SWL stores the left part of the register into the appropriate part of the high-order word in memory; SWR stores the right part of the register into the appropriate part of the low-order word in memory.
PAGE 200
Architecture SWL Store Word Left (cont.) Operation : T: vAddr ← ((offset15)16 || offset15..0) + GPR[base] (pAddr, uncached) ← AddressTranslation (vAddr, DATA) pAddr ← pAddr31..2 || (pAddr1..0 xor ReverseEndian2) If BigEndianMem = 0 then pAddr ← pAddr31..2 || 02 endif byte ← vAddr1..0 xor BigEndianCPU2 data ← 0 24 - 8*byte || GPR[rt]31..
PAGE 201
Architecture SWR SWR Store Word Right 31 26 25 SWR 21 20 16 15 0 base rt offset 5 5 16 101110 6 Format : SWR rt, offset(base) Description : Used together with SWL to store the contents of a register into four consecutive bytes of memory when the bytes cross a word boundary. SWR stores the right part of the register into the appropriate part of the low-order word in memory; SWL stores the left part of the register into the appropriate part of the high-order word in memory.
PAGE 202
Architecture SWR Store Word Right (cont.) Operation : T: vAddr ← ((offset15)16 || offset15..0) + GPR[base] (pAddr, uncached) ← AddressTranslation (vAddr, DATA) pAddr ← pAddr31..2 || (pAddr1..0 xor ReverseEndian2) If BigEndianMem = 0 then pAddr ← pAddr31..2 || 02 endif byte ← vAddr1..
PAGE 203
Architecture SYNC SYNC Synchronize 31 26 25 65 0 SPECIAL 0 SYNC 000000 0000 0000 0000 0000 0000 001111 6 20 6 Format : SYNC Description : Interlocks the pipeline until the load, store or data cache refill operation of the previous instruction is completed. The R3900 Processor Core can continue processing instructions following a load instruction even if a cache refill is caused by the load instruction or a load is made from a noncacheable area.
PAGE 204
Architecture SYSCALL SYSCALL System Call 31 26 25 65 SPECIAL code 000000 0 SYSCALL 001100 6 20 6 Format : SYSCALL code Description : Raises a System Call exception, then immediately passes control to an exception handler. The code field can be used to pass information to an exception handler, but the only way to have the code field retrieved by the exception handler is to use the EPC register to load the contents of the memory word containing this instruction.
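Once an exception handler has loaded the instruction word, recovering the code field is a matter of masking bits 25..6. The sketch below follows the SYSCALL format shown above (SPECIAL opcode 000000, 20-bit code, function 001100); the function names are illustrative, and fetching the word through EPC is outside the model.

```python
SYSCALL_FUNCT = 0b001100

def encode_syscall(code: int) -> int:
    """Build a SYSCALL word: SPECIAL (000000) | 20-bit code | 001100."""
    return ((code & 0xFFFFF) << 6) | SYSCALL_FUNCT

def decode_syscall_code(word: int) -> int:
    """Return the 20-bit code field of a SYSCALL instruction word."""
    word &= 0xFFFFFFFF
    if (word >> 26) != 0 or (word & 0x3F) != SYSCALL_FUNCT:
        raise ValueError("not a SYSCALL instruction")
    return (word >> 6) & 0xFFFFF
```

A handler model would call decode_syscall_code on the word it loaded and dispatch on the returned value.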
PAGE 205
Architecture XOR XOR Exclusive Or 31 26 25 SPECIAL 21 20 rs 16 15 rt 11 10 rd 000000 6 5 5 5 65 0 0 XOR 00000 100110 5 6 Format : XOR rd, rs, rt Description : Bitwise exclusive-ORs the contents of general-purpose register rs with the contents of general-purpose register rt and loads the result in general-purpose register rd.
PAGE 206
Architecture XORI XORI Exclusive Or Immediate 31 26 25 XORI 21 20 16 15 0 rs rt immediate 5 5 16 001110 6 Format : XORI rt, rs, immediate Description : Zero-extends the 16-bit immediate value, bitwise exclusive-ORs it with the contents of general-purpose register rs, then loads the result in general-purpose register rt.
PAGE 207
Architecture Bit Encoding of CPU Instruction Opcodes Figure A-2 shows the bit codes for all CPU instructions (ISA and extended ISA). OPcode 31..29 0 1 2 3 28..
PAGE 208
Architecture COPz rt 20..19 0 1 2 3 18..16 0 BCF γ γ γ 1 BCT γ γ γ 2 BCFLχ γ γ γ 3 BCTLχ γ γ γ 4 γ γ γ γ 5 γ γ γ γ 6 γ γ γ γ 7 γ γ γ γ 5 φ φ φ φ φ φ φ φ 6 (TLBWR) φ φ φ φ φ φ φ φ 7 φ φ φ DERETχ φ φ φ φ 5 γ γ γ γ γ γ γ γ 6 γ γ γ γ γ γ γ γ CP0 Function 2.0 5..3 0 1 2 3 4 5 6 7 0 φ (TLBP) φ RFE * φ φ φ φ 1 (TLBR) φ φ φ φ φ φ φ φ 2 (TLBWI) φ φ φ φ φ φ φ φ 3 φ φ φ φ φ φ φ φ 4 φ φ φ φ φ φ φ φ MADD/MADDU 5..3 0 1 2 3 4 5 6 7 2.
PAGE 209
Architecture Notation : * Reserved for future architecture implementations; use of this instruction with existing versions raises a Reserved Instruction exception. γ Invalid instruction, but does not raise a Reserved Instruction exception in the case of the R3900 Processor Core. δ Valid on the R3900 Processor Core but raises a Reserved Instruction exception on the R3000A. φ Reserved for memory management unit (MMU).
PAGE 210
TMPR3901F TMPR3901F 199
PAGE 211
TMPR3901F 200
PAGE 212
TMPR3901F Chapter 1 Introduction This document describes the specifications of the TMPR3901F microprocessor. The R3900 Processor Core is incorporated into the TMPR3901F. 1.1 Features The TMPR3901F is a general-purpose microprocessor incorporating on-chip the 32-bit R3900 Processor Core, developed by Toshiba. In addition to the processor core it includes a clock generator, bus interface unit, memory protection unit and debug support unit. The TMPR3901F features are as follows. (1) R3900 Processor Core.
PAGE 213
TMPR3901F (4) Low power consumption, optimal for portable applications • 3.
PAGE 214
TMPR3901F 1.2 Internal Blocks The TMPR3901F comprises the following blocks (Figure 1-1). Clock Generator R3900 Processor Core Debug Support Unit CPU core Interrupt Reset 4KB Instruction Cache Synchronizer 1KB Data Cache Real-time Debugger Interface Address Protection Unit Bus Controller / Write Buffer System Interface Figure 1-1 TMPR3901F block diagram (1) R3900 Processor Core (2) Clock generator A quadruple-frequency PLL is built in and operates from an external crystal generator.
PAGE 215
TMPR3901F 2.
PAGE 216
TMPR3901F Chapter 2 Configuration This chapter describes the configuration of the TMPR3901F. A block diagram of the TMPR3901F is shown in Figure 2-1. Clock Generator R3900 Processor Core Debug Support Unit CPU core Interrupt Reset Synchronizer 4KB Instruction Cache 1KB Data Cache Real-time Debugger Interface Address Protection Unit Bus Controller / Write Buffer System Interface Figure 2-1 TMPR3901F block diagram 2.
PAGE 217
TMPR3901F 2.1.2 Address mapping Address mapping in the TMPR3901F is performed by the direct segment mapping MMU in the R3900 Processor Core. The TMPR3901F uses the kseg2 reserved area (0xFF00 0000 - 0xFFFF FFFF) as follows. 0xFF00 0000 - 0xFF00 FFFF address protection unit 0xFF20 0000 - 0xFF3F FFFF debug support unit The TMPR3901F outputs bus operation signals even when it accesses the above area. The TMPR3901F ignores bus operation input signals (ACK*, BUSERR*, etc.) at that time. 2.
PAGE 218
TMPR3901F 2.3 Bus Interface Unit (Bus Controller / Write Buffer) The bus interface unit controls TMPR3901F bus operations. Bus operations are synchronous with the rising edge of SYSCLK. The bus interface unit has a four-deep write buffer. The R3900 Processor Core can complete write operations without pipeline stall. There may be conflicts between TMPR3901F write requests from the write buffer and read requests by the R3900 Processor Core. The priority is shown below.
PAGE 219
TMPR3901F 2.4 Address Protection Unit The TMPR3901F has an address protection unit that allows two virtual address breakpoints to be set. Figure 2-2 shows a block diagram of the address protection unit. BAddr0 Register Virtual Address (31 : 2) BMsk0 Register Compare Conditioning OR/ XOR TLB Exception BCnt0 Register IFch DtWr DtRd UsEn KnEn Channel 0 Channel 1 Minv MEn st (1) st (2) BSts Register Figure 2-2 Address protection unit 2.4.
PAGE 220
TMPR3901F (b) Break Mask register (BMsk0-1) The break mask register holds the bit mask used for address comparison. BMsk0 is for channel 0, and BMsk1 is for channel 1. 31 210 BMsk BMsk[31:2] 00 (Break Mask) This is the bit mask for address comparison. Only those bits in the BAddr register that have their corresponding bits set to 1 in the BMsk register are compared. 0 (c) Always 0. Ignored on write; 0 when read.
PAGE 221
TMPR3901F (d) Break Status register (BSts) The break status register is used to set conditions for exception requests. 31 10 9 8 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 St MEn MInv MInv [9] (Master Overlay Invert) If this bit is set to 1, exception requests are triggered by an XOR of the channel 0 and channel 1 address comparison results. This means that an exception request occurs if the address comparison is true (the address matches) for only one of the two channels.
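The masked address comparison and the XOR combination of the two channels can be sketched as below. Several points are assumptions: combining the channels with OR when MInv is 0 is inferred only from the "OR/XOR" label in Figure 2-2, and the model ignores the MEn bit and the BCnt condition bits (IFch, DtWr, DtRd, UsEn, KnEn). The function names are illustrative.

```python
def channel_match(vaddr: int, baddr: int, bmsk: int) -> bool:
    """Compare only the address bits whose BMsk bit is 1 (bits 31..2)."""
    mask = bmsk & 0xFFFFFFFC          # BMsk bits 1..0 always read as 0
    return (vaddr & mask) == (baddr & mask)

def exception_request(match0: bool, match1: bool, minv: int) -> bool:
    """Combine channel results: XOR when MInv = 1, OR (assumed) otherwise."""
    if minv:
        return match0 != match1       # true for exactly one channel
    return match0 or match1
```

With MInv set to 1, an address that matches both channels does not produce a request, which allows one channel to carve an excluded range out of the other.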
PAGE 222
TMPR3901F 2.4.3 Register address map Seven registers associated with the memory protection scheme are mapped into the kernel memory space. Table 2-1 shows the addresses of these registers.
Table 2-1. Address protection unit control register addresses
Register   Virtual address
BSts       0xFF00 0010
BAddr0     0xFF00 0020
BCnt0      0xFF00 0024
BMsk0      0xFF00 0028
BAddr1     0xFF00 0030
BCnt1      0xFF00 0034
BMsk1      0xFF00 0038
2.5 Debug Support Unit This unit supports an external real-time debug system.
PAGE 223
TMPR3901F (2) INT[5:0]* The INT[5:0]* signal is synchronized with the processor clock in phase with SYSCLK (Figure 2-4).
PAGE 224
TMPR3901F (3) NMI* The NMI* signal is synchronized with the processor clock in phase with SYSCLK (Figure 2-5).
PAGE 225
TMPR3901F (4) CPCOND[3:1] The CPCOND[3:1] signal is synchronized with the processor clock in phase with SYSCLK (Figure 2-6).
PAGE 226
TMPR3901F Chapter 3 Pins The following table summarizes the TMPR3901F pins. NAME I/O DESCRIPTION I/O Address bus. When TMPR3901F has bus mastership, outputs the address to be accessed. When TMPR3901F releases bus mastership, inputs the data cache snoop address. Byte-enable signal. At read and write, indicates which bytes of the data bus are accessed by TMPR3901F. The correspondence with the data bus is: BE [3]* : D [31:24] BE [2]* : D [23:16] BE [1]* : D [15:8] BE [0]* : D [7:0] Data bus. Read signal.
PAGE 227
TMPR3901F NAME BUSGNT* XIN XOUT PLLOFF* CLKEN I/O DESCRIPTION O Bus grant signal. Used by TMPR3901F to indicate it has released bus mastership in response to a request by an external bus master. Connect to crystal oscillator. Connect to crystal oscillator. Stops internal PLL oscillation. Enables internal PLL clock. System clock signal. TMPR3901F bus operation is based on SYSCLK. The frequency can be reduced by 1/2, 1/4 or 1/8 using reduced frequency mode. Free clock signal.
PAGE 228
TMPR3901F Chapter 4 Operations This chapter shows TMPR3901F bus operations and timing. All TMPR3901F bus operations are synchronized with the rising edge of SYSCLK. The bus operation pin states are as follows when no bus operations are being performed. A [31:2] undefined D [31:0] high impedance BE [3:0]* H RD*, WR* H LAST* H BSTART* H BURST* H BSTSZ [1:0] undefined 4.1 Clock The TMPR3901F can control the clock frequency to reduce power dissipation and to simplify system design.
PAGE 229
TMPR3901F The relationship among the clocks is shown in the table below.
PAGE 230
TMPR3901F 4.2 Read Operation The TMPR3901F supports two kinds of read operations: single read and burst read. 4.2.1 Single Read The single read operation reads four bytes of data or less. It is used in the following cases.
PAGE 231
TMPR3901F At the start of a single read, the BSTART* signal is asserted for one clock cycle only. At the same time the RD* and LAST* signals are asserted. Then the address A[31:2] and BE[3:0]* signals are valid. An external circuit drives the data onto the data bus and asserts an ACK* signal. The TMPR3901F samples the ACK* signal at the rising edge of SYSCLK, confirming that it has been asserted, and latches the data at the rising edge of the next clock.
PAGE 232
TMPR3901F 4.2.2 Burst Read Burst read operation is used to refill a multiword area in cache memory. Because the second and each succeeding data in a burst read operation can each be read in a single cycle, multiword data can be read in from memory very quickly in this mode. Burst read operation is issued whenever a cache miss occurs with either the instruction cache or data cache.
PAGE 233
TMPR3901F [Timing diagram; signals shown: SYSCLK, A[31:2], BE[3:0]*, RD*, BSTART*, LAST*, BURST*, BSTSZ[1:0] (00), ACK*, BUSERR*, D[31:0]] Figure 4-3 Burst read (4 words : 1 wait)
PAGE 234
TMPR3901F BUSERR* is valid until the clock cycle in which the last data is read. In the clock cycle in which the TMPR3901F recognizes the assertion of BUSERR*, the TMPR3901F ends the burst read cycle and raises a Bus Error exception (see Figure 4-4). When a bus error occurs in a burst read, only those cache lines for which complete reads were accomplished are refilled.
PAGE 235
TMPR3901F 4.3 Write Operation The TMPR3901F supports only single write operations for writes. Figure 4-5 shows the timing for a single-write operation. At the start of the operation, the BSTART* signal is asserted for one clock only. At the same time the WR* and LAST* signals are asserted. Then the address A[31:2] and BE[3:0]* signals are valid. Data is output to the data bus D[31:0] from the second clock after the start of the single-write cycle.
PAGE 236
TMPR3901F 4.4 Interrupts The TMPR3901F supports six hardware interrupts and two software interrupts. It also supports a nonmaskable interrupt. The INT[5:0]* signals can be used to raise interrupt exceptions. The NMI* signal is used to raise a non-maskable interrupt exception. All of the interrupt signals are low-active and should be synchronous with SYSCLK rising edge. 4.4.1 NMI* The TMPR3901F recognizes an NMI* signal on the SYSCLK rising edge (Figure 4-6).
PAGE 237
TMPR3901F 4.4.2 INT[5:0]* The INT[5:0]* signals are used to invoke interrupt exceptions. These interrupts can be masked with the IntMask field of the Status register. The TMPR3901F recognizes an INT[5:0]* signal on the SYSCLK rising edge (Figure 4-7). 1 2 SYSCLK INT[5:0]* Figure 4-7 Interrupt 1 Recognize INT[5:0]* high signal. 2 Recognize INT[5:0]* low signal, thus invoking interrupt exception. The TMPR3901F recognizes an INT[5:0]* low signal on the SYSCLK rising edge as shown in Figure 4-7.
PAGE 238
TMPR3901F 4.5 Bus Arbitration 4.5.1 Bus request and bus grant An external bus master can request that the TMPR3901F grant control of the bus. This is done by asserting the BUSREQ* signal. In response, the TMPR3901F will release the bus and assert a BUSGNT* signal. If BUSREQ* is asserted while the TMPR3901F is already engaged in a bus operation cycle, the TMPR3901F will not relinquish the bus until that cycle is completed.
PAGE 239
TMPR3901F The BUSREQ* signal is confirmed on the rising edge of SYSCLK. If no bus operation is currently in progress, the BUSGNT* signal is asserted in the next clock after the BUSREQ* assertion is confirmed. The TMPR3901F stops driving the bus in the next clock, thus releasing it. During the time the bus is released by the TMPR3901F, the pin states related to bus operation are as follows. 4.5.
PAGE 240
TMPR3901F 4.6 Reset The TMPR3901F can be reset with the RESET* signal. The RESET* signal must be asserted for a certain number of R3900 Processor Core clock cycles in order for the TMPR3901F reset to take effect. Since the RESET* signal is clock-synchronized within the TMPR3901F, it can be asserted asynchronously. TMPR3901F operations upon reset are as follows. • The pipeline stalls, and TMPR3901F internal states are initialized.
PAGE 241
TMPR3901F 4.7 Half-Speed Bus Mode To accommodate slower peripheral circuits, the TMPR3901F offers a half-speed bus mode in which bus operations are clocked at half the frequency of the R3900 Processor Core. This mode is selected by setting the HALF* signal to low. When HALF* is set to high, bus operations occur at the same frequency at which the R3900 Processor Core operates. This is called full-speed bus mode.
PAGE 242
TMPR3901F Chapter 5 Power-Down Mode The TMPR3901F has the following four power-down modes to enable lower power dissipation through control of the internal clock. • Halt mode • Standby mode • Doze mode • Reduced Frequency mode 5.1 Halt mode Figure 5-1 shows a state diagram of power down mode.
PAGE 243
TMPR3901F The TMPR3901F sets the HALT signal according to the status of the Halt bit in the Config register. Output signals of the memory interface during Halt mode are the same as when a bus operation is not in progress.
PAGE 244
TMPR3901F 5.2 Standby Mode Stopping the PLL clock in the TMPR3901F results in even less power dissipation than in Halt mode. This is referred to as standby mode. To transit from Active mode to Standby mode, first set the Halt bit in the Config register to 1. Then, follow the sequence below to empty the write buffer. Finally, set the Halt bit to 1 using the MTC0 instruction. SYNC NOP Loop : BC0F Loop NOP Figure 5-2 shows how to stop the PLL and go to Standby mode.
PAGE 245
TMPR3901F 5.3 Doze Mode In this mode, the TMPR3901F stops internal operations the same as in Halt mode to reduce power dissipation. However, in Doze mode bus arbitration and data cache snooping can continue. Setting the Config register Doze bit to 1 switches from Active mode to Doze mode. During Doze mode, the TMPR3901F will assert the DOZE signal and stall the pipeline in “holding current” status.
PAGE 246
TMPR3901F 5.4 Reduced Frequency Mode The TMPR3901F processor clock frequency can be controlled with the Config register RF field. A slower processor clock frequency enables lower power dissipation by the TMPR3901F. The relationship between the RF field and the processor clock is as follows.
RF[1:0]   processor clock/master clock
00        1/1
01        1/2
10        1/4
11        1/8
Note: The R3900 processor clock is limited to a minimum operating frequency of 5 MHz. Please keep this in mind when using reduced frequency mode.
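The RF field mapping and the 5 MHz lower limit can be combined into a quick check before selecting a reduced frequency. In the sketch below, the function names and the master clock values used in the example are hypothetical; only the divider table and the 5 MHz minimum come from the text above.

```python
# RF[1:0] value -> divider applied to the master clock
RF_DIVIDER = {0b00: 1, 0b01: 2, 0b10: 4, 0b11: 8}
MIN_PROCESSOR_CLOCK_HZ = 5_000_000

def processor_clock_hz(master_clock_hz: float, rf: int) -> float:
    """Processor clock that results from a given RF field setting."""
    return master_clock_hz / RF_DIVIDER[rf & 0b11]

def rf_is_allowed(master_clock_hz: float, rf: int) -> bool:
    """True if the resulting processor clock stays at or above 5 MHz."""
    return processor_clock_hz(master_clock_hz, rf) >= MIN_PROCESSOR_CLOCK_HZ
```

For a hypothetical 40 MHz master clock, RF = 11 gives exactly 5 MHz and is still acceptable, whereas the same setting with a 20 MHz master clock falls below the limit.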