3. Microcode

The N64 JPEG microcode performs high-speed compression and decompression of JPEG files in Nintendo's original format (N64 JPEG).

There are two kinds of microcode:

njpgdspMain.o decompresses the compressed images

njpgespMain.o compresses images

As the application programmer, you can choose the microcode appropriate for your game application.

3.1 The Decompression Microcode

The decompression microcode njpgdspMain is explained here.

3.1.1 Input Data

The N64 JPEG format has two operation blocks for image compression, a sub-block (SB) and a macro block (MB).

A sub-block (SB) is an 8x8 dot square. When a JPEG file is compressed (or decompressed), the discrete cosine transformation (reverse-discrete cosine transformation), quantization (reverse-quantization), zigzag scanning, and Huffman encoding (decoding) are each performed per sub-block.

A macro block (MB) is a 16x16 dot square. It is a group of four sub-blocks. Like standard JPEG, N64 JPEG uses the YUV color space rather than the RGB format that is frequently used in N64 development. N64 JPEG culls the color differences in each macro block after converting the color space from RGB to YUV (4:1:1). As a result, the RSP output during compression (or the RSP input during decompression) holds the YUV data in macro block units. The final drawing during decompression is also done per macro block (please see Section 3.1.3, "Output Data," for more information).

The following diagram shows the relationship between the sub-blocks and macro blocks (Figure 3-1):

The following diagram shows the RSP output data format during compression (or the RSP input data format during decompression) (Figure 3-2):

3.1.2 Microcode Start-up

The procedure for starting the decompression microcode, njpgdspMain, is described below.

First, include the header-file "n64jpeg.h" in the source code for your game application, and include "njpgdspMain.o" in the spec file for makerom. Then establish the OSTask structure as follows:

type			M_NJPEGDTASK
flags			0x0
*ucode_boot 		(u64 *)rspbootTextStart
ucode_boot_size		SP_BOOT_UCODE_SIZE
*ucode			(u64 *)njpgdspMainTextStart
ucode_size		SP_UCODE_SIZE
*ucode_data		(u64 *)njpgdspMainDataStart
ucode_data_size		SP_UCODE_DATA_SIZE
*dram_stack		NULL
dram_stack_size		0
*output_buff		NULL
*output_buff_size	NULL
*data_ptr		Pointer to the parameter structure
data_size		NJPEGD_PARAM_SIZE
*yield_data_ptr		yieldBuffer pointer
yield_data_size		NJPEGD_YIELD_SIZE

The address of the yield buffer must conform to 16-byte alignment, and it must be 272 bytes in size.
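The alignment and size requirement above can be met with a statically allocated buffer. The following is a minimal sketch, assuming a C11 compiler; the names YIELD_BYTES, njpgd_yield_buffer, and is_aligned16 are hypothetical and not part of the library, and in a real application the size would normally come from NJPEGD_YIELD_SIZE in n64jpeg.h.

```c
#include <stdint.h>

/* Size taken from the text above. This local name is only a stand-in for
   NJPEGD_YIELD_SIZE, which the real header is expected to provide. */
#define YIELD_BYTES 272

/* C11 alignment specifier. An older toolchain would need a
   compiler-specific attribute instead (assumption). */
static _Alignas(16) unsigned char njpgd_yield_buffer[YIELD_BYTES];

/* Returns 1 when addr lies on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *addr)
{
    return ((uintptr_t)addr & 0xF) == 0;
}
```

A debug build could check is_aligned16(njpgd_yield_buffer) once before the task is started.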

The parameter structure is defined as follows:

typedef struct{
	u64 *buffer;	/* input/output data buffer address */
	u32 mbs;	/* number of all the macro blocks */
	u32 scale;	/* quantization scale (-2, -1, 0, 1, 2) */
	u32 dummy;
} NJPEGDParam;

For the buffer member (the start address of the input/output data buffer), enter the output address of the CPU decoding process that is performed by the njpgHuffDecode function. The microcode overwrites the input data in RAM with the decompressed data, so you do not have to specify a separate output address for the microcode.

The microcode processes and outputs the image data in macro blocks of 16x16 (length x width) dots (pixels). For the mbs argument (the total number of macro blocks), enter the number of macro blocks in the original image data. For example, if the size of the original image data is 240x320 (length x width) dots, the total number of macro blocks is 300, which is 15 (length) x 20 (width).
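The macro block count for the mbs member can be derived from the image dimensions. The helper below is a small sketch; the function name njpeg_count_mbs is hypothetical, and it assumes both dimensions are already multiples of 16 dots.

```c
/* Returns the total number of 16x16 macro blocks in an image.
   Assumes width and height are multiples of 16 (hypothetical helper,
   not part of the N64 JPEG library). */
static unsigned int njpeg_count_mbs(unsigned int width, unsigned int height)
{
    unsigned int mbs_wide = width / 16;   /* macro blocks per row    */
    unsigned int mbs_high = height / 16;  /* macro blocks per column */
    return mbs_wide * mbs_high;
}
```

For the 320 (width) x 240 (length) image in the example, njpeg_count_mbs(320, 240) yields 20 x 15 = 300.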

For the scale argument (the quantization scale value), enter the quantization scale value that was specified during compression. The larger this value is, the higher the compression rate and the shorter the decompression time, but the worse the image quality. Choose a value appropriate for your image.

You can specify any value in the range from -2 to 2 as the quantization scale value. Any other value may cause unexpected results. If you specify 0, the reverse-quantization is not done.

After the CPU decoding process ends and the OSTask structure is prepared, you can start to use the microcode. Call the osSpTaskStart function. It works in the same way as graphics or audio microcode.

(Example)

OSTask njpgdtlist;
	 .
	 .
	 .
/* OSTask structure njpgdtlist set */
	 .
	 .
/* Start RSP Processing */
osWritebackDCacheAll();
osSpTaskStart(&njpgdtlist);
	 .
	 .
/* Wait for the end of RSP processing */
osRecvMesg(&rspMessageQ, NULL, OS_MESG_BLOCK);

3.1.3 Output Data

The data output by the decompression microcode is formatted so that any graphics microcode can use it as a YUV texture. For more information about the 16-bit YUV texel format, please refer to Figure 13-18 in Chapter 13, "Texture Mapping," of the N64 Programming Manual.

The decompression microcode output data overwrites the input data of each macro block, so you don't have to prepare another microcode output buffer.

Figure 3-3 shows the format of the data output by the decompression microcode.

Draw the data output by the decompression microcode, in the format illustrated in Figure 3-3, as a YUV texture, one macro block at a time, at the appropriate position on the screen. There are various ways to draw it; the sample program that comes with the manual draws the output data using the sprite microcode.

3.2 The Compression Microcode

The compression microcode njpgespMain is explained here.

3.2.1 Input Data

This microcode can encode (compress) only image data in 16-bit RGBA format. No other format can be encoded. The only restriction on the size of the image data is that the length and width must each be a multiple of 16 dots.

The microcode divides the image data directly into 16x16 dot macro blocks (MBs) and encodes them, so it cannot cut out and process a piece of a larger image. For example, a 240x160 piece of a 320x240 dot image drawn in a frame buffer cannot be used directly as input data for the microcode while it remains in the frame buffer. However, if the application first copies the piece into another buffer, the copied data can be used as the input data for the microcode.
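Copying a piece of a larger image into its own buffer can be sketched as follows. This is an illustrative helper only: the name crop_rgba16 and its parameters are hypothetical, the pixels are treated as plain 16-bit values, and the destination buffer is assumed to hold at least w x h pixels.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies a w x h rectangle of 16-bit pixels, whose top-left corner is at
   (x, y) in a src_w x src_h source image, into a tightly packed dst.
   Returns 0 on success, or -1 if the rectangle is not a multiple of 16
   dots in each direction or does not fit inside the source image.
   (Hypothetical helper, not part of the N64 JPEG library.) */
static int crop_rgba16(const uint16_t *src, size_t src_w, size_t src_h,
                       size_t x, size_t y, size_t w, size_t h,
                       uint16_t *dst)
{
    size_t row;

    if (w == 0 || h == 0 || w % 16 != 0 || h % 16 != 0)
        return -1;
    if (x > src_w || w > src_w - x || y > src_h || h > src_h - y)
        return -1;

    /* Copy one row at a time, skipping the pixels outside the piece. */
    for (row = 0; row < h; row++) {
        memcpy(dst + row * w,
               src + (y + row) * src_w + x,
               w * sizeof(uint16_t));
    }
    return 0;
}
```

With a 320x240 frame buffer, crop_rgba16(fb, 320, 240, x, y, 240, 160, piece) would fill piece with a 240x160 image that satisfies the multiple-of-16 restriction.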

3.2.2 Microcode Start-up

How to launch the compression microcode njpgespMain is described below.

First, include the n64jpeg.h header file in the source code for your game application, and include the njpgespMain.o file in the spec file for makerom. Then set the OSTask structure as follows:

type			M_NJPEGTASK
flags			0x0
*ucode_boot		(u64 *)rspbootTextStart
ucode_boot_size		SP_BOOT_UCODE_SIZE
*ucode			(u64 *)njpgespMainTextStart
ucode_size		SP_UCODE_SIZE
*ucode_data		(u64 *)njpgespMainDataStart
ucode_data_size		SP_UCODE_DATA_SIZE
*dram_stack		NULL
dram_stack_size		0
*output_buff		NULL
*output_buff_size	NULL
*data_ptr		Pointer to the parameter structure
data_size		NJPEGE_PARAM_SIZE
*yield_data_ptr		Pointer to the yield buffer
yield_data_size		NJPEGE_YIELD_SIZE

The OSTask structure is used to transfer the data to the RCP. The address of the yield buffer must conform to 16-byte alignment, and it must be 288 bytes in size for encoding.

The parameter structure is defined as follows:

typedef struct{
	u64 *input;	/* input data buffer address */
	u64 *output;	/* output data buffer address */
	u32 mbs_x;	/* number of macro blocks in the x direction */
	u32 mbs_y;	/* number of macro blocks in the y direction - 1 */
	u32 scale;	/* quantization scale (-2, -1, 0, 1, 2) */
	u32 dummy1;
	u32 dummy2;
	u32 dummy3;
} NJPEGEParam;

The quantization scale value selects the coefficient by which each value in the quantization table is multiplied when quantizing. There are five possible scale values: -2, -1, 0, 1, and 2. They represent, respectively, a coefficient of 1/4, a coefficient of 1/2, no use of the quantization table, a coefficient of 1, and a coefficient of 2. When the quantization table is not used, the output of the discrete cosine transformation goes to zigzag scanning without being quantized.
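The mapping from scale value to coefficient can be written out as a lookup. The function below only illustrates the table described above; the name njpeg_scale_multiplier and its return convention are hypothetical, and the microcode applies the coefficient internally, so an application never needs to call anything like this.

```c
/* Maps a quantization scale value to the coefficient applied to the
   quantization table (hypothetical helper for illustration only).
   Returns  1 and stores the coefficient when the table is used,
            0 when scale is 0 (table not used; *multiplier untouched),
           -1 when scale is outside the range -2..2. */
static int njpeg_scale_multiplier(int scale, double *multiplier)
{
    switch (scale) {
    case -2: *multiplier = 0.25; return 1;
    case -1: *multiplier = 0.5;  return 1;
    case  0: return 0;
    case  1: *multiplier = 1.0;  return 1;
    case  2: *multiplier = 2.0;  return 1;
    default: return -1;
    }
}
```

The -1 return makes an out-of-range scale easy to catch before the parameter structure is filled in.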

After completing each setup item outlined above, call the osSpTaskStart function to activate the compression microcode in the same way you would activate graphics or audio microcode.

OSTask	njpgetlist;
	.
	. 
	.
/* OSTask structure njpgetlist set */
	.
	.
/* Start RSP Processing */
osWritebackDCacheAll();
osSpTaskStart(&njpgetlist);
	.
	.
/* Wait for the end of RSP processing */
osRecvMesg(&rspMessageQ, NULL, OS_MESG_BLOCK);

3.2.3 Output Data

In each macro block (MB), the microcode converts the data from RGB to YUV format, reduces (culls) the data, performs discrete cosine transformation, quantizes the data, and zigzag scans the data. You can use culling (in the ratio of 4:1:1) and the quantization table to make necessary modifications to the data. You can change the quantization table by changing the quantization scale factor.

The output data buffer requires 768 bytes for each macro block. For example, an image of 320x240 dots is divided into a total of 300 MBs, so a 230,400-byte region is required for the output buffer. The application must prepare this buffer.
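The buffer size calculation can be sketched as a small helper. The names NJPEG_MB_OUTPUT_BYTES and njpege_output_bytes are hypothetical; only the 768-byte figure comes from the text above.

```c
#include <stddef.h>

/* Output bytes per macro block, as stated in the text (hypothetical name). */
#define NJPEG_MB_OUTPUT_BYTES 768

/* Returns the output buffer size in bytes for an image that is
   mbs_wide macro blocks wide and mbs_high macro blocks high. */
static size_t njpege_output_bytes(size_t mbs_wide, size_t mbs_high)
{
    return mbs_wide * mbs_high * NJPEG_MB_OUTPUT_BYTES;
}
```

For a 320x240 dot image, which is 20 x 15 macro blocks, njpege_output_bytes(20, 15) gives 300 x 768 = 230,400 bytes.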