11.1 RSP Overview

A program which runs on the RSP is called a task; the application is completely responsible for scheduling and invoking tasks on the RSP.

The interface between the application and the RSP task consists of a series of operating system calls and a structure called the task list (or task header), which is of type OSTask (defined in sptask.h). The task list contains all the information necessary to begin task execution, including pointers to the microcode to run. This structure is filled in by the application program.

A detailed description of how a task is invoked on the RSP is beyond the scope of this section (please see Section 4.7, "RCP Task Management"), but the essential procedure is straightforward:

The RSP is assumed to be halted (or the N64 CPU halts it).
The N64 CPU DMAs the boot microcode into the RSP IMEM.
The N64 CPU DMAs the "task header" into the RSP DMEM.
The N64 CPU sets the RSP PC to 0.
The N64 CPU clears the RSP halt status (allowing it to run).

From this point, the boot microcode takes over, loading the task microcode (and data) specified in the task list, and jumping to the beginning of the task.
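The start-up sequence above can be sketched as a small host-side model. This is only an illustration, not the operating system's implementation: ToyRsp, toy_dma() and toy_task_start() are hypothetical names, the DMA is modeled with memcpy(), and the task header is copied to DMEM offset 0 purely for simplicity (this section does not state where in DMEM it is placed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSP_MEM_SIZE 4096u  /* IMEM and DMEM are 4 KB each */

/* Hypothetical model of the RSP state that the CPU manipulates. */
typedef struct {
    uint8_t  imem[RSP_MEM_SIZE];
    uint8_t  dmem[RSP_MEM_SIZE];
    uint32_t pc;
    int      halted;
} ToyRsp;

/* Stand-in for a CPU-initiated DMA into RSP memory. */
static void toy_dma(uint8_t *dst, const void *src, size_t len)
{
    assert(len <= RSP_MEM_SIZE);
    memcpy(dst, src, len);
}

/* Mirrors the five steps listed in the text. */
static void toy_task_start(ToyRsp *rsp,
                           const void *boot_ucode, size_t boot_len,
                           const void *task_header, size_t header_len)
{
    rsp->halted = 1;                              /* RSP is (or gets) halted  */
    toy_dma(rsp->imem, boot_ucode, boot_len);     /* boot microcode -> IMEM   */
    toy_dma(rsp->dmem, task_header, header_len);  /* task header -> DMEM      */
    rsp->pc = 0;                                  /* PC = 0                   */
    rsp->halted = 0;                              /* clear halt: RSP may run  */
}
```

In this model, everything after the final step (loading the task microcode and jumping to it) would be the job of the boot microcode, which is not modeled here.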

One item in the task header is a pointer to the initial data to process (in the case of a graphics task, this is a display list pointer).

11.1.1 Display List Format

The display list which the gspFast3D, gspF3DNoN, or gspLine3D microcode running on the RCP interprets is defined as a stream of 64-bit commands.

Applications written in C will usually use the interface defined in the file gbi.h, which is included automatically when ultra64.h is included. Although display list construction looks like a familiar series of function calls, the "calls" are actually just bit-packing macros. These macros are described in detail in their individual man pages.

Each macro has two forms, e.g. gSPTexture() and gsSPTexture(). The difference between 'g' and 'gs' is that the 'g' form is an in-line form which requires an additional argument (a pointer to the display list being constructed). The display list pointer must be of the form "ptr++" in order for the macros to work properly.

The 'gs' form is for static declarations, and generates the appropriate C structure initialization sequence.

Throughout this document, only the 'gs' form is mentioned; however, the 'g' form also applies and can always be substituted.
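The relationship between the two forms can be sketched with a simplified stand-in. The names DemoGfx, DEMO_OPCODE, gsDemoCmd() and gDemoCmd() are hypothetical and are not part of gbi.h; the opcode value and field layout are invented for illustration. The sketch only shows the general idea: the 'gs' form expands to a structure initializer, while the 'g' form stores the same bits through a pointer argument.

```c
#include <stdint.h>

/* Simplified stand-in for one 64-bit display list command. */
typedef struct {
    uint32_t w0;
    uint32_t w1;
} DemoGfx;

#define DEMO_OPCODE 0x42u  /* hypothetical opcode, not a real GBI value */

/* Pack the opcode into the top 8 bits and a 24-bit argument below it. */
#define DEMO_W0(arg24) \
    (((uint32_t)DEMO_OPCODE << 24) | ((uint32_t)(arg24) & 0x00FFFFFFu))

/* 'gs' form: expands to a static structure initializer. */
#define gsDemoCmd(arg24, data) { DEMO_W0(arg24), (uint32_t)(data) }

/* 'g' form: writes the same bits through a pointer. The pointer
 * expression is evaluated exactly once, so passing "ptr++" is safe. */
#define gDemoCmd(pkt, arg24, data)      \
    do {                                \
        DemoGfx *g_ = (pkt);            \
        g_->w0 = DEMO_W0(arg24);        \
        g_->w1 = (uint32_t)(data);      \
    } while (0)

/* Static display list built with the 'gs' form. */
static const DemoGfx demo_static_dl[] = {
    gsDemoCmd(1, 0x1000),
    gsDemoCmd(2, 0x2000),
};

/* The same list built at run time with the 'g' form. */
static DemoGfx *demo_build_dynamic(DemoGfx *glistp)
{
    gDemoCmd(glistp++, 1, 0x1000);
    gDemoCmd(glistp++, 2, 0x2000);
    return glistp;  /* points just past the last command written */
}
```

Both lists contain identical bits; the choice between the forms depends only on whether the list is known at compile time or assembled per frame.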

All of the display list building macros also embed an 'SP' or a 'DP' to describe the functional unit of the RCP which will operate on the command. This is certainly confusing, especially to application programmers familiar with higher-level graphics APIs such as OpenGL. In order to achieve maximum performance, it is necessary to expose the RSP and RDP, the two major units of the RCP, to the application programmer. The primary reason for this is resource constraints; there is simply not enough RSP IMEM to build a display list processor that is rich enough to hide these details from the application programmer. The binary encoding of most of the display list commands is the lowest possible level: they are the bits that control the hardware.

Exposing the two functional units of the RCP also limits the amount of state shared between them. The major drawback of this design decision is that you must often tell the same thing to both the RSP and the RDP. For example, in order to "turn on texture mapping" you must turn it on in the RSP and turn it on in the RDP. This is a common source of display list bugs, but the parallel execution of the RSP and RDP, together with a highly efficient display list processing machine, makes this trade-off worthwhile.

11.1.2 Segmented Memory and the RSP Memory Map

All DRAM addresses in the display list are segmented addresses. The mapping of segments and their base addresses is provided using the gSPSegment() macro. It is the responsibility of the application to maintain this mapping and inform the RSP via the display list.

The RSP maintains an associative table of up to 16 segment IDs and their base addresses. Any DRAM address in the display list is 'physical-ized' using this table. The RDP only uses physical addresses, and one of the chores of the RSP is to do the address translation necessary for the RDP.

Note: By convention, segment table entry 0 is reserved for physical addressing.
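The translation can be sketched on the host as follows. This is a minimal model, not the microcode's implementation: the function names are hypothetical, and the address layout (segment ID in bits 24 to 27, offset in the low 24 bits) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SEGMENTS 16u

/* Model of the RSP segment table. Entry 0 is left at base 0, so
 * segment 0 addresses pass through as physical addresses. */
static uint32_t segment_base[NUM_SEGMENTS];

/* Models the effect of setting one segment's base address. */
static void set_segment(unsigned seg, uint32_t base)
{
    assert(seg < NUM_SEGMENTS);
    segment_base[seg] = base;
}

/* Build a segmented address (layout is an assumption of this sketch). */
static uint32_t make_segment_addr(unsigned seg, uint32_t offset)
{
    return ((uint32_t)(seg & 0xFu) << 24) | (offset & 0x00FFFFFFu);
}

/* 'Physical-ize' a segmented address: base of its segment plus offset. */
static uint32_t physicalize(uint32_t seg_addr)
{
    unsigned seg    = (seg_addr >> 24) & 0xFu;
    uint32_t offset = seg_addr & 0x00FFFFFFu;
    return segment_base[seg] + offset;
}
```

For example, after set_segment(2, 0x00200000), the segmented address make_segment_addr(2, 0x1234) resolves to physical address 0x00201234, while any address in segment 0 resolves to itself.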

The RSP software can only access DMEM. All data must first be transferred into DMEM using DMA operations, which must be 64-bit aligned (data smaller than 64 bits must be padded). Invocation of the DMA engine is handled by the RSP software, but the application programmer needs to be aware of the boundary requirements. Any data structure that is to be passed to the RSP must be aligned to a 64-bit boundary. The structures in gbi.h use C unions to guarantee this.
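The union technique and the padding rule can be sketched as follows. DemoColor, DemoColorAligned and pad_to_64bit() are hypothetical names and are not taken from gbi.h; the sketch assumes an ABI in which long long is 64 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* A 3-byte payload that on its own has no 64-bit size or alignment. */
typedef struct {
    uint8_t r, g, b;
} DemoColor;

/* Wrapping the payload in a union with a 64-bit member pads it to a
 * multiple of 8 bytes and, on ABIs that align long long to 8 bytes,
 * places every instance on a 64-bit boundary. The extra member is
 * never read or written. */
typedef union {
    DemoColor c;
    long long force_alignment;
} DemoColorAligned;

_Static_assert(sizeof(DemoColorAligned) % 8 == 0,
               "union must occupy a whole number of 64-bit words");

/* Round a byte count up to the next multiple of 8 (64 bits). */
static size_t pad_to_64bit(size_t nbytes)
{
    return (nbytes + 7u) & ~(size_t)7u;
}
```

With this helper, a 3-byte object would be transferred as 8 bytes and a 9-byte object as 16 bytes.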

Since the DMA engine is shared between the N64 CPU and the RSP, the application program should also avoid unnecessary DMA activity while the RSP is running.

11.1.3 Interaction Between the RSP and N64 CPU Memory Caching

The most prevalent example of communication between the CPU and the RSP is that of the CPU creating a display list in DRAM for eventual interpretation by the RSP. The display list data is read from DRAM via a DMA mechanism. Unfortunately, DRAM locations may be "stale" with respect to newer data being held in the N64 CPU's data cache. The N64 CPU cache mechanism implements a "write-back" caching policy which means individual stores to memory are not immediately written to memory. To update the memory contents with more recent cached data, the CPU must first write back cached data to the DRAM. Then, and only then, will the RSP be able to DMA the correct data for display list processing.

Conversely, the contents of memory may be more recent than cached data in some situations when the RSP modifies memory (an obvious example is updating the color frame buffer). In this case, the CPU's cache may contain stale data and the CPU should invalidate the cached data to force an access directly to DRAM and get the most recent data.
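Both staleness scenarios can be illustrated with a toy model of a single write-back cache line. This is a host-side sketch under simplifying assumptions (one line, no eviction); ToyMem and all function names are hypothetical stand-ins. On real hardware the write-back and invalidate roles are played by operating system calls such as osWritebackDCache() and osInvalDCache(), which this model does not call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_WORDS 4

typedef struct {
    uint32_t dram[LINE_WORDS];   /* what RSP DMA reads and writes       */
    uint32_t cache[LINE_WORDS];  /* what the CPU sees while line valid  */
    int valid;
    int dirty;
} ToyMem;

/* Load the line from DRAM into the cache if it is not already there. */
static void fill_line(ToyMem *m)
{
    if (!m->valid) {
        memcpy(m->cache, m->dram, sizeof m->cache);
        m->valid = 1;
        m->dirty = 0;
    }
}

/* Write-back policy: a CPU store changes only the cache. */
static void cpu_store(ToyMem *m, int i, uint32_t v)
{
    assert(i >= 0 && i < LINE_WORDS);
    fill_line(m);
    m->cache[i] = v;
    m->dirty = 1;
}

static uint32_t cpu_load(ToyMem *m, int i)
{
    assert(i >= 0 && i < LINE_WORDS);
    fill_line(m);
    return m->cache[i];
}

/* Copy dirty cached data out to DRAM so that DMA can see it. */
static void toy_writeback(ToyMem *m)
{
    if (m->valid && m->dirty) {
        memcpy(m->dram, m->cache, sizeof m->dram);
        m->dirty = 0;
    }
}

/* Discard the cached copy so the next load goes to DRAM. */
static void toy_invalidate(ToyMem *m)
{
    m->valid = 0;
    m->dirty = 0;
}

/* RSP DMA bypasses the CPU cache entirely. */
static uint32_t rsp_dma_read(const ToyMem *m, int i)  { return m->dram[i]; }
static void rsp_dma_write(ToyMem *m, int i, uint32_t v) { m->dram[i] = v; }
```

In this model, a value written with cpu_store() is invisible to rsp_dma_read() until toy_writeback() runs, and a value written with rsp_dma_write() is invisible to cpu_load() on an already cached line until toy_invalidate() runs.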

As a practical note, this second scenario only arises in advanced applications.