The Voice Recognition System is a peripheral device for the N64 that lets a game recognize words spoken by the user. To use the system, insert the plug on the Voice Recognition Unit into the N64 controller port, thereby connecting the special microphone on the Voice Recognition Unit. Characters in the game can then move and respond to the user's voice in addition to the conventional controller-only interface. This gives the game a sense of "realism": for instance, the player can give verbal commands that put secondary characters into action while moving the main character with the controller.
The main features of the Voice Recognition System are shown below.
Item | Function
Voice recognition format | Semisyllabic voice recognition system *1
Language recognized | Japanese words (delineates one word by a 0.4 second silence after pronunciation is completed)
Speakers recognized | Any speaker (speaker needs no specific prior training)
Maximum registered words | Maximum 255 words (about 80 words at 5 syllables per word) *2
Characters per word | Maximum 17 characters
Text registration method | Enter using shift-JIS code
Maximum pronunciation length | About 10 seconds (any sound in excess of the maximum pronunciation length is processed as noise)
Recognition result output method | Words closest to voice input are output ranked 1st ~ 5th
*1 Words are comprised of syllables. Syllables divided into two parts by the center of the vowels are called "semisyllables".
*2 Use osVoiceCountSyllables() function, described later, to obtain the number of semisyllables in a word.
The Voice Recognition System configuration is shown below. The system is used by inserting the plug on the Voice Recognition Unit into the N64 controller port, thereby connecting the special microphone on the Voice Recognition Unit. Since power is supplied by the N64, batteries are not needed.
Words to be recognized by the Voice Recognition System are placed in a registered word dictionary. The program can freely register words in this dictionary. From among the registered words, those determined to match the voice input most closely are output. The Voice Recognition structure is shown in the figure below. A description of each step follows.
(1) Registration of word data to dictionary
Input the words that are to be recognized using the SJIS code. The words which have been input are converted to the format necessary for voice recognition processing and registered in the dictionary.
(2) Voice input
The user's voice is input via the special microphone connected to the Voice Recognition System. The input voice is then converted into the format necessary for voice recognition processing.
(3) Comparison between input voice pattern and registered words
The voice input is compared with the patterns of the words registered in the dictionary and a distance value (a numeric value expressing how different the voice input is from the word to which it is being compared) is computed.
(4) Output of similar word ranking
The registered words with the smallest distance values are output, ranked in order from 1st to 5th place.
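Steps (3) and (4) can be pictured with a small host-side sketch. It is not the unit's internal algorithm, which this document does not describe; it only shows what "rank the five smallest distance values" means. The names rank_by_distance, RANK_COUNT, and NO_CANDIDATE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RANK_COUNT   5     /* the system reports 1st to 5th place */
#define NO_CANDIDATE (-1)  /* hypothetical marker for an unused rank slot */

/* Fill ranks[] with the indices of the RANK_COUNT smallest distances,
 * nearest first. Slots beyond the number of words get NO_CANDIDATE.
 * On equal distances the lower word index wins. */
void rank_by_distance(const unsigned *distance, size_t words,
                      int ranks[RANK_COUNT])
{
    size_t r, i, k;

    assert(distance != NULL || words == 0);
    for (r = 0; r < RANK_COUNT; r++) {
        int best = NO_CANDIDATE;
        for (i = 0; i < words; i++) {
            int taken = 0;
            for (k = 0; k < r; k++)        /* skip words already ranked */
                if (ranks[k] == (int)i)
                    taken = 1;
            if (taken)
                continue;
            if (best == NO_CANDIDATE || distance[i] < distance[best])
                best = (int)i;
        }
        ranks[r] = best;
    }
}
```

With distances {900, 120, 450, 2000, 300, 700, 50}, the ranking is word 6, then 1, 4, 2, and 5.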
Changes in status while voice recognition is running are explained below. There are 5 command statuses during voice recognition execution.
The processing flow is shown in the figure below. A description of each step follows.
When the status moves from VOICE_STATUS_END to VOICE_STATUS_READY or is VOICE_STATUS_END, the Get Recognition Results command can be executed. If the Get Recognition Results command is executed while the status is VOICE_STATUS_END, the status will switch to VOICE_STATUS_READY after completion of the Get Recognition Results command. Once the status has switched to VOICE_STATUS_READY, the next Start Recognition command can be executed.
The variable which indicates the current status is stored in the voice recognition system control structure. Please see Section 26.8.6.1 "Initialize Voice Recognition System" for details.
Following is a simple example of the flow of a program for performing voice recognition.
First, initialize the Voice Recognition System. Next, initialize the registered word dictionary and register the words to be recognized. Once word registration is completed, the program moves to voice recognition processing. By starting voice recognition, voice input from the microphone can be acquired as words. Execute the Get Voice Recognition Results function to acquire a word. The library functions for the Voice Recognition System which perform processing at each step of the flow are explained in Section 8.6. Detailed programming procedures, including error branching, etc., are explained in Section 8.7.
The library functions used when the Voice Recognition System is handled by an N64 program are explained below. There are a total of 10 Voice Recognition System-related functions.
Function
osVoiceInit
Initialize Voice Recognition System control structure and hardware
Syntax
#include <ultra64.h>
s32 osVoiceInit(OSMesgQueue *siMessageQ, OSVoiceHandle *hd, int channel);
Arguments
Description
The osVoiceInit() function initializes the Voice Recognition System. It initializes both the hardware and the Voice Recognition System control structure. Consequently, there is no need to initialize the hd structure on the application side. Call this function first when using the Voice Recognition System.
It is recommended that you check to see which device is connected to a particular port prior to initialization. Standard controllers and peripheral devices other than the Voice Recognition System may be inserted into the controller ports as well. This check can be accomplished with the osContStartQuery() function and the osContGetQuery() function. The Voice Recognition System is connected if the value of the member variable "errno" of the OSContStatus structure is 0 (zero), and if the AND (logical product) of the value for type and CONT_TYPE_MASK is CONT_TYPE_VOICE.
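The connection test described above can be sketched as a host-side helper. PortStatus is a hypothetical stand-in for the type and errno members of OSContStatus, and the numeric values given here for CONT_TYPE_MASK and CONT_TYPE_VOICE are assumptions for illustration; a real build takes them from <ultra64.h>.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 4  /* controller channels 0 to 3 */

/* Assumed values for illustration only; use the SDK header in a real build. */
#define CONT_TYPE_MASK  0x1f07
#define CONT_TYPE_VOICE 0x0100

/* Hypothetical stand-in for the two OSContStatus members that matter here. */
typedef struct {
    unsigned short type;  /* device type bits */
    unsigned char  err;   /* plays the role of the errno member */
} PortStatus;

/* Return the first channel (0 to 3) that reports a Voice Recognition
 * Unit with no error, or -1 if none does. */
int find_voice_port(const PortStatus status[MAX_PORTS])
{
    int ch;

    assert(status != NULL);
    for (ch = 0; ch < MAX_PORTS; ch++) {
        if (status[ch].err == 0 &&
            (status[ch].type & CONT_TYPE_MASK) == CONT_TYPE_VOICE)
            return ch;
    }
    return -1;
}
```

The returned channel is the value that would be passed as the channel argument of osVoiceInit().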
siMessageQ is the message queue initialized in connection with OS_EVENT_SI. Please refer to the osSetEventMesg() function regarding how to establish this connection. channel is the channel number of the controller port to which the Voice Recognition Unit is connected. It is a value 0~3.
The Voice Recognition System control structure OSVoiceHandle is configured as follows:
typedef struct {
    OSMesgQueue *__mq;       /* SI message queue */
    int          __channel;  /* Controller port No. */
    s32          __mode;     /* Used within the OS */
    u8           cmd_status; /* Command status */
} OSVoiceHandle;
Do not change the values of these various members in the application. In addition, the only member variable which is referred to and which has any meaning is cmd_status. The member variables other than cmd_status are used by the system and therefore do not need to be referred to by the application.
The member variable cmd_status indicates the voice recognition command status. Whenever the voice recognition library checks the command status, it stores the value in cmd_status. Specifically, the following function calls update the value.
osVoiceInit()
osVoiceClearDictionary()
osVoiceSetWord()
osVoiceMaskDictionary()
osVoiceStartReadData()
osVoiceStopReadData()
osVoiceGetReadData()
The following values can be handled by cmd_status. Please see Section 26.8.4.2 "Status When Voice Recognition is Running" for details on each status.
Definition Name | Value | Description
VOICE_STATUS_READY | 0 | Stop/End
VOICE_STATUS_START | 1 | Voice Undetected (no voice input)
VOICE_STATUS_CANCEL | 3 | Cancel (cancel extraneous noise)
VOICE_STATUS_BUSY | 5 | Detected/Detecting (voice being input, recognition processing under way)
VOICE_STATUS_END | 7 | End recognition processing (enable execution of Get Recognition Results command)
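As an illustration of how an application might read cmd_status, here is a minimal standalone sketch. The enum repeats the values from the table above so the example compiles without the SDK (remove it when building against <ultra64.h>). VoiceHandleView and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the status table above. */
enum {
    VOICE_STATUS_READY  = 0,
    VOICE_STATUS_START  = 1,
    VOICE_STATUS_CANCEL = 3,
    VOICE_STATUS_BUSY   = 5,
    VOICE_STATUS_END    = 7
};

/* Hypothetical view of the one OSVoiceHandle member an application reads. */
typedef struct {
    unsigned char cmd_status;
} VoiceHandleView;

/* The next Start Recognition command is allowed in the READY status. */
int voice_can_start(const VoiceHandleView *hd)
{
    assert(hd != NULL);
    return hd->cmd_status == VOICE_STATUS_READY;
}

/* Recognition results are waiting in the END status. */
int voice_results_ready(const VoiceHandleView *hd)
{
    assert(hd != NULL);
    return hd->cmd_status == VOICE_STATUS_END;
}

/* Short label for debug output. */
const char *voice_status_name(unsigned char cmd_status)
{
    switch (cmd_status) {
    case VOICE_STATUS_READY:  return "ready";
    case VOICE_STATUS_START:  return "start";
    case VOICE_STATUS_CANCEL: return "cancel";
    case VOICE_STATUS_BUSY:   return "busy";
    case VOICE_STATUS_END:    return "end";
    default:                  return "unknown";
    }
}
```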
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_NO_CONTROLLER
Nothing is connected to the controller port.
CONT_ERR_DEVICE
Something other than the Voice Recognition System is connected to the controller port.
CONT_ERR_VOICE_NO_RESPONSE
There was no response from the Voice Recognition System. There may be a problem with the hardware.
CONT_ERR_CONTRFAIL
There was a data transmission failure. There is a problem in the Voice Recognition System connection.
CONT_ERR_INVALID
There is an error in the function call method or in the argument. This error will not occur if the function is being used correctly. Write your program so that this error does not occur when development is completed.
Function
osVoiceClearDictionary
Initialize Registered Word Dictionary
Syntax
#include <ultra64.h>
s32 osVoiceClearDictionary(OSVoiceHandle *hd, u8 words);
Arguments
Description
The osVoiceClearDictionary() function initializes the registered word dictionary for the Voice Recognition System. The dictionary is initialized so that the specified number of words can be registered in it. Words cannot be registered with the osVoiceSetWord() function until the dictionary has been initialized with the osVoiceClearDictionary() function.
hd is the Voice Recognition System control structure. The Voice Recognition System must be initialized with the osVoiceInit() function before the osVoiceClearDictionary() function is called. The number of words to be registered is specified in words. 1~255 words can be registered in the dictionary.
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_NO_CONTROLLER
Nothing is connected to the controller port.
CONT_ERR_DEVICE
Something other than the Voice Recognition System is connected to the controller port.
CONT_ERR_VOICE_NO_RESPONSE
There was no response from the Voice Recognition System. There may be a problem with the hardware.
CONT_ERR_CONTRFAIL
There was a data transmission failure. There is a problem in the Voice Recognition System connection.
CONT_ERR_INVALID
There is an error in the function call method or in the argument. This error will not occur if the function is being used correctly. Write your program so that this error does not occur when development is completed.
Function
osVoiceSetWord
Register Words into Voice Recognition System Dictionary
Syntax
#include <ultra64.h>
s32 osVoiceSetWord(OSVoiceHandle *hd, u8 *word);
Arguments
Description
The osVoiceSetWord() function is for registering words in the Voice Recognition System dictionary. hd is the Voice Recognition System control structure. The Voice Recognition System must be initialized with the osVoiceInit() function before the osVoiceSetWord() function is called.
The word (SJIS) to be registered is specified in word. The word can be up to 17 characters long. Since calling the osVoiceSetWord() function once registers one word, execute osVoiceSetWord() multiple times to register multiple words. The number of words registered must match the number set by the osVoiceClearDictionary() function. Please note that an error will be generated when the osVoiceStartReadData() function is executed, if the number of words registered is greater than or less than the specified number of words.
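The 17-character limit can be checked on the application side before registration. The sketch below counts characters in a NUL-terminated Shift-JIS string using the standard lead-byte ranges (0x81 to 0x9F and 0xE0 to 0xFC). The helper names are hypothetical, and the check covers length only; whether a word's characters are acceptable is still decided by osVoiceCheckWord().

```c
#include <assert.h>
#include <stddef.h>

#define VOICE_WORD_MAX_CHARS 17  /* limit stated in the text */

/* 1 if b is the first byte of a two-byte Shift-JIS character. */
static int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9f) || (b >= 0xe0 && b <= 0xfc);
}

/* Count the characters in a NUL-terminated Shift-JIS string.
 * Returns -1 if a two-byte character is cut off by the terminator. */
int sjis_char_count(const unsigned char *word)
{
    int count = 0;

    assert(word != NULL);
    while (*word != 0) {
        if (sjis_is_lead(*word)) {
            if (word[1] == 0)
                return -1;   /* truncated two-byte character */
            word += 2;
        } else {
            word += 1;
        }
        count++;
    }
    return count;
}

/* 1 if the word has between 1 and VOICE_WORD_MAX_CHARS characters. */
int word_length_ok(const unsigned char *word)
{
    int n = sjis_char_count(word);
    return n >= 1 && n <= VOICE_WORD_MAX_CHARS;
}
```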
The maximum number of words which can be registered in the dictionary is about 80 words, assuming 5 syllables per word. Therefore, although the maximum number of registered words is set at 255, the dictionary memory may overflow before that number is reached if the words contain many syllables. In this case, voice recognition can be executed without an error being caused by the osVoiceStartReadData() function even if the number of registered words is less than the number set by the osVoiceClearDictionary() function.
The characters which can be registered and their codes are shown in the table below.
In addition, the following restrictions apply to character combinations when registering words. Use the osVoiceCheckWord() function to check whether or not the word that you are trying to register can be registered in the Voice Recognition System. Use this in the case of game applications in which registered words will be input during debugging or by the game player.
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_NO_CONTROLLER
Nothing is connected to the controller port.
CONT_ERR_DEVICE
Something other than the Voice Recognition System is connected to the controller port.
CONT_ERR_VOICE_NO_RESPONSE
There was no response from the Voice Recognition System. There may be a problem with the hardware.
CONT_ERR_CONTRFAIL
There was a data transmission failure. There is a problem in the Voice Recognition System connection.
CONT_ERR_INVALID
There is an error in the function call method or in the argument. This error will not occur if the function is being used correctly. Write your program so that this error does not occur when development is completed.
This error will also occur when you attempt to register more words than specified with the osVoiceClearDictionary() function.
CONT_ERR_VOICE_WORD
A word containing improper characters has been registered. The set word is invalidated and the word number is not incremented. Execute the osVoiceSetWord() function to register a proper word.
CONT_ERR_VOICE_MEMORY
This indicates a dictionary memory overflow. However, if the recognition command is executed in this condition, normal recognition processing can be performed even if the number of words which have been set is less than the number of words set by the osVoiceClearDictionary() function. When this error is generated, control the number of words actually set on the application side.
Function
osVoiceCheckWord
Check to see if the target word can be registered in the dictionary
Syntax
Arguments
Description
The osVoiceCheckWord() function is for checking whether or not a specified word can be registered in the Voice Recognition System. Use this when the words to be registered will be input during debugging or by the game player.
word specifies the word (SJIS) to be registered. An error will be returned if a word is specified which contains a character combination which does not satisfy the conditions listed in the table below.
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_VOICE_WORD
The word cannot be registered in the voice recognition dictionary.
Function
osVoiceCountSyllables
Count the number of semisyllables in word
Syntax
Arguments
Description
The osVoiceCountSyllables() function calculates how many semisyllables a specific word occupies when it is registered in the Voice Recognition System. With this count, you can determine how many more words can be registered in the dictionary. The function is convenient during debugging or when asking the game player to input registered words.
word specifies the word (SJIS) to be registered. The number of semisyllables resulting from the calculation is stored in *syllables.
The total number of semisyllables which can be registered in the Voice Recognition System dictionary is 880 (440 syllables). If more than this are registered with the osVoiceSetWord() function, a CONT_ERR_VOICE_MEMORY error will occur.
The number of semisyllables is calculated as follows. One semisyllable per word must be added as an offset value.
Type of Syllable | Number of Semisyllables | Conditions
Vowel only | 2 | Start of word
Vowel only | 1 | Anywhere but start of word
Consonant + vowel | 2 | Start of word, or anywhere but when start of word is Romanized by k, t, c, or p
Consonant + vowel | 3 | Anywhere but start of word, anywhere except when preceding character is a small "tsu", or when start of word is Romanized by k, t, c, or p
Consonant + diphthong | 2 | Small "ya" and the like. Start of word or when start of word is Romanized by k, t, c, or p
Consonant + diphthong | 3 | Small "ya" and the like. Anywhere but start of word, anywhere except when preceding character is a small "tsu", or when start of word is Romanized by k, t, c, or p
"n" sound | 1 | none
Long "-" sound | 1 | none
Assimilated "tsu" sound | 1 | none
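The 880-semisyllable capacity and the one-semisyllable offset per word suggest a simple budget check before registration. The sketch below assumes that each entry of counts[] is a word's semisyllable count without the per-word offset; whether the value written by osVoiceCountSyllables() already includes that offset is not stated here, so verify it before reusing the arithmetic. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DICT_SEMISYLLABLE_LIMIT 880  /* total capacity stated in the text */
#define WORD_OFFSET             1    /* one extra semisyllable per word */

/* Total dictionary cost of a word list. counts[i] is the semisyllable
 * count of word i WITHOUT the per-word offset (an assumption). */
unsigned dictionary_cost(const unsigned *counts, size_t words)
{
    unsigned total = 0;
    size_t i;

    assert(counts != NULL || words == 0);
    for (i = 0; i < words; i++)
        total += counts[i] + WORD_OFFSET;
    return total;
}

/* 1 if the whole list fits in the dictionary, 0 otherwise. */
int dictionary_fits(const unsigned *counts, size_t words)
{
    return dictionary_cost(counts, words) <= DICT_SEMISYLLABLE_LIMIT;
}
```

Taking the nominal two semisyllables per syllable, a 5-syllable word costs 10 + 1 = 11, and 80 such words cost exactly 880. This agrees with the "about 80 words at 5 syllables per word" figure.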
Function
osVoiceMaskDictionary
Switch between recognizing words registered in the dictionary and eliminating words from recognition
Syntax
Arguments
Description
The osVoiceMaskDictionary() function is for masking words registered in the Voice Recognition System. hd is the Voice Recognition System control structure. The Voice Recognition System must be initialized with the osVoiceInit() function before the osVoiceMaskDictionary() function is called.
Specify the word mask pattern in maskpattern. The mask data for all words are enumerated in maskpattern. The number of bytes in maskpattern is specified in size. In the mask data, one bit corresponds to one word. A zero (0) means mask (do not recognize the word) and a one (1) means do not mask (recognize the word). Word numbers (the numbers assigned to registered words in the order in which they were registered) map onto the mask data from LSB to MSB. In other words, bit 0 of the first byte corresponds to word No. 0, while bit 7 corresponds to word No. 7. If there are many words, prepare an array with the necessary number of bytes to hold the mask data. If the number of words is not a multiple of 8, put zeros (0) in the remaining most significant bits of the last byte of the mask data. If the osVoiceMaskDictionary() function has not been called, all of the words are unmasked.
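The bit layout described above can be built from a per-word flag array. This is a minimal sketch: mask_size and build_mask are hypothetical helpers, not library functions, and the resulting buffer and byte count are what would be supplied as maskpattern and size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of mask bytes needed for a dictionary of `words` entries. */
size_t mask_size(size_t words)
{
    return (words + 7) / 8;
}

/* Build mask bytes from per-word flags. enabled[i] != 0 means word i is
 * recognized (bit = 1); 0 means it is masked out (bit = 0). Word 0 maps
 * to bit 0 of mask[0], word 8 to bit 0 of mask[1], and so on. Unused
 * high bits of the last byte are left at zero. */
void build_mask(const unsigned char *enabled, size_t words,
                unsigned char *mask)
{
    size_t i;

    assert(enabled != NULL && mask != NULL);
    memset(mask, 0, mask_size(words));
    for (i = 0; i < words; i++) {
        if (enabled[i])
            mask[i / 8] |= (unsigned char)(1u << (i % 8));
    }
}
```

For a 10-word dictionary with only words 0, 7, and 9 enabled, the mask is two bytes: 0x81 followed by 0x02.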
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_NO_CONTROLLER
Nothing is connected to the controller port.
CONT_ERR_DEVICE
Something other than the Voice Recognition System is connected to the controller port.
CONT_ERR_VOICE_NO_RESPONSE
There was no response from the Voice Recognition System. There may be a problem with the hardware.
CONT_ERR_CONTRFAIL
There was a data transmission failure. There is a problem in the Voice Recognition System connection.
CONT_ERR_INVALID
There is an error in the function call method or in the argument. This error will not occur if the function is being used correctly. Write your program so that this error does not occur when development is completed.
Function
osVoiceStartReadData
Start voice recognition by the Voice Recognition System
Syntax
Arguments
Description
The osVoiceStartReadData() function is for starting recognition processing by the Voice Recognition System. Before starting voice recognition processing with the osVoiceStartReadData() function, the Voice Recognition System must be initialized with the osVoiceInit() function, the dictionary must be initialized with the osVoiceClearDictionary() function, and word registration must be performed with the osVoiceSetWord() function. Be absolutely sure to call the osVoiceStartReadData() function after calling these functions.
After calling the osVoiceStartReadData() function, recognition results can be obtained by calling the osVoiceGetReadData() function. In addition, once voice recognition has been started, call the osVoiceStopReadData() function to forcibly stop recognition.
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_NO_CONTROLLER
Nothing is connected to the controller port.
CONT_ERR_DEVICE
Something other than the Voice Recognition System is connected to the controller port.
CONT_ERR_VOICE_NO_RESPONSE
There was no response from the Voice Recognition System. There may be a problem with the hardware.
CONT_ERR_CONTRFAIL
There was a data transmission failure. There is a problem in the Voice Recognition System connection.
CONT_ERR_INVALID
The voice recognition process attempted to start up, however, words were not registered properly in the dictionary with osVoiceSetWord(). There is an error in the function call method or in the argument. This error will not occur if the function is being used correctly. Write your program so that this error does not occur when development is completed.
Function
osVoiceGetReadData
Get voice recognition result from the Voice Recognition System
Syntax
Arguments
Description
The osVoiceGetReadData() function is for getting the recognition result from the Voice Recognition System. hd is the Voice Recognition System control structure. The Voice Recognition System must be initialized with the osVoiceInit() function before the osVoiceGetReadData() function is called.
The recognition result is stored in result of the OSVoiceData structure. The contents of the OSVoiceData structure are as follows:
The warning member variable of the OSVoiceData structure is the warning which pertains to the recognition result. The following bits are flagged when there is any problem with the recognition result.
Warning Name | Value | Description | Conditions
VOICE_WARN_TOO_SMALL | 0x0400 | Voice level is too low | 100 < Voice Level < 150
VOICE_WARN_TOO_LARGE | 0x0800 | Voice level is too high | Voice Level > 3500
VOICE_WARN_NOT_FIT | 0x4000 | No words match recognition word | No. 1 Candidate Distance Value > 1600
VOICE_WARN_TOO_NOISY | 0x8000 | Too much ambient noise | Relative Voice Level <= 400
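Because several warning bits can be set at once, an application would typically test each flag separately. The sketch below turns a warning value into the descriptions from the table above. The flag values are copied from that table, while describe_warnings and warn_table are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values copied from the warning table above. */
#define VOICE_WARN_TOO_SMALL 0x0400
#define VOICE_WARN_TOO_LARGE 0x0800
#define VOICE_WARN_NOT_FIT   0x4000
#define VOICE_WARN_TOO_NOISY 0x8000

static const struct {
    unsigned    flag;
    const char *text;
} warn_table[] = {
    { VOICE_WARN_TOO_SMALL, "Voice level is too low" },
    { VOICE_WARN_TOO_LARGE, "Voice level is too high" },
    { VOICE_WARN_NOT_FIT,   "No words match recognition word" },
    { VOICE_WARN_TOO_NOISY, "Too much ambient noise" }
};

/* Store up to max message pointers for the bits set in warning.
 * Returns the number of messages stored. */
size_t describe_warnings(unsigned warning, const char *out[], size_t max)
{
    size_t i, n = 0;

    assert(out != NULL || max == 0);
    for (i = 0; i < sizeof warn_table / sizeof warn_table[0] && n < max; i++) {
        if (warning & warn_table[i].flag)
            out[n++] = warn_table[i].text;
    }
    return n;
}
```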
The answer_num member variable is the number of valid candidates, that is, the number of words that the Voice Recognition System judges to be valid candidates. It is a value from 0 to 5. If this is 0, there are no valid candidates.
The voice_level member variable is the level of the input voice. The greater the voice input, the larger this value is.
The voice_sn member variable is the relative level of the voice input to the noise input.
The voice_time member variable is the voice input time in ms units.
The answer[] member variable holds the word numbers of the 1st through 5th candidates. Word numbers are always output for all five positions, but only the first answer_num entries are those that the Voice Recognition System deems valid. Normally, each answer[] entry is a value from 0 to 0x00ff, but if there is no suitable word, its value is 0x7fff.
The distance[] member variable is the distance value of the word from the 1st candidate to the 5th candidate. The more similar the word, the smaller this value is.
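Picking a recognized word from these members can be sketched as follows. VoiceResult is a hypothetical stand-in for OSVoiceData: the member names follow the descriptions above, but the 16-bit unsigned types and the member order are assumptions. best_word and VOICE_NO_WORD are also hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define VOICE_MAX_CANDIDATES 5
#define VOICE_NO_WORD        0x7fff  /* answer[] value when no word is suitable */

/* Stand-in for OSVoiceData. Member names follow the text; the types
 * and ordering are assumptions made for this sketch. */
typedef struct {
    unsigned short warning;
    unsigned short answer_num;
    unsigned short voice_level;
    unsigned short voice_sn;
    unsigned short voice_time;
    unsigned short answer[VOICE_MAX_CANDIDATES];
    unsigned short distance[VOICE_MAX_CANDIDATES];
} VoiceResult;

/* Return the word number of the highest-ranked valid candidate,
 * or -1 if the result contains none. */
int best_word(const VoiceResult *r)
{
    unsigned short i, n;

    assert(r != NULL);
    n = r->answer_num;
    if (n > VOICE_MAX_CANDIDATES)
        n = VOICE_MAX_CANDIDATES;   /* guard against out-of-range counts */
    for (i = 0; i < n; i++) {
        if (r->answer[i] != VOICE_NO_WORD)
            return r->answer[i];
    }
    return -1;
}
```

A stricter policy could also reject results whose warning member has VOICE_WARN_NOT_FIT set. That is left to the application.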
Before calling the osVoiceGetReadData() function, voice recognition processing must be started with the osVoiceStartReadData() function.
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_NO_CONTROLLER
Nothing is connected to the controller port.
CONT_ERR_DEVICE
Something other than the Voice Recognition System is connected to the controller port.
CONT_ERR_NOT_READY
Either no voice has been input, or results cannot be acquired for some reason, such as that processing is still underway, etc. Wait for a moment then try calling this function again. This error will occur if the status following execution of the osVoiceStartReadData() function is VOICE_STATUS_START, VOICE_STATUS_CANCEL, or VOICE_STATUS_BUSY.
CONT_ERR_VOICE_NO_RESPONSE
There was no response from the Voice Recognition System. There may be a problem with the hardware.
CONT_ERR_CONTRFAIL
There was a data transmission failure. There is a problem with the Voice Recognition System connection.
CONT_ERR_INVALID
There is an error in the function call method or in the argument. This error will not occur if the function is being used correctly. Write your program so that this error does not occur when development is completed.
Function
osVoiceStopReadData
Forcibly stop voice recognition processing by the Voice Recognition System
Syntax
Arguments
Description
The osVoiceStopReadData() function is for forcibly stopping recognition processing once recognition by the Voice Recognition System has been started. hd is the Voice Recognition System control structure. The Voice Recognition System must be initialized with the osVoiceInit() function before the osVoiceStopReadData() function is called.
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_NO_CONTROLLER
Nothing is connected to the controller port.
CONT_ERR_DEVICE
Something other than the Voice Recognition System is connected to the controller port.
CONT_ERR_VOICE_NO_RESPONSE
There was no response from the Voice Recognition System. There may be a problem with the hardware.
CONT_ERR_CONTRFAIL
There was a data transmission failure. There is a problem in the Voice Recognition System connection.
CONT_ERR_INVALID
There is an error in the function call method or in the argument. This error will not occur if the function is being used correctly. Write your program so that this error does not occur when development is completed.
Function
osVoiceControlGain
Adjust the input gain of the Voice Recognition System
Syntax
Arguments
Description
The osVoiceControlGain() function is for adjusting the gain of the input voice in the Voice Recognition System. The strength of the input voice signal can be changed by adjusting the gain. If the input voice is too strong, try decreasing the gain to decrease the voice level (normally, there is no particular need to change the gain).
hd is the Voice Recognition System control structure. The Voice Recognition System must be initialized with the osVoiceInit() function before the osVoiceControlGain() function is called.
analog is the analog gain of the transmission system. The analog gain is for adjusting the strength of the voice signal which is input from the microphone. The following values are available.
analog | Transmission system analog gain
0 | 0 dB (default)
1 | -3 dB
digital is the digital gain of the transmission system. The digital gain is for adjusting the strength of the digital signal converted from the analog voice signal. The following values are available.
digital | Transmission system digital gain
0 | 0 dB (default)
1 | -0.4 dB
2 | -0.8 dB
3 | -1.2 dB
4 | -1.6 dB
5 | -2.0 dB
6 | -2.4 dB
7 | -2.8 dB
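Both tables are linear in the setting value (-3 dB per analog step, -0.4 dB per digital step), so the combined attenuation can be estimated with integer arithmetic. The sketch below assumes the analog and digital stages cascade so that their dB values add. The document does not state this, and the function names are hypothetical.

```c
#include <assert.h>

/* 1 if both settings are within the ranges listed in the tables above. */
int gain_args_valid(int analog, int digital)
{
    return analog >= 0 && analog <= 1 && digital >= 0 && digital <= 7;
}

/* Combined gain in tenths of a dB (for example, -34 means -3.4 dB).
 * Assumes the two stages simply add in dB. */
int gain_tenths_db(int analog, int digital)
{
    assert(gain_args_valid(analog, digital));
    return -30 * analog - 4 * digital;
}
```

Under that assumption the lowest available setting, analog 1 with digital 7, gives -5.8 dB.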
The returned value is an error code. A 0 (zero) is returned when processing ends normally. If an error occurs, this function has the following error codes.
CONT_ERR_INVALID
There is an error in the function call method or in the argument. This error will not occur if the function is being used correctly. Write your program so that this error does not occur when development is completed.
Typical methods of using the various Voice Recognition System functions are described below. Please see Figure 27-8-1 for illustrated procedure and flow chart.
Typically, there are 5 types of processing which are performed:
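(1) Initialize the Voice Recognition System using the osVoiceInit() function
(2) Initialize the registered word dictionary using the osVoiceClearDictionary() function
(3) Register words in the registered word dictionary using the osVoiceSetWord() function
(4) Start voice recognition using the osVoiceStartReadData() function
(5) Acquire recognition results using the osVoiceGetReadData() function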
Detailed descriptions of these 5 processes are given in Section 26.8.7.1 "Flow of Voice Recognition Processing". Processing in which errors are returned as the return values for the various functions is explained in Section 26.8.7.2 "Error Processing".
(1) Voice Recognition System Initialization Processing
The osVoiceInit() function is called in the processing here. The osVoiceInit() function initializes the Voice Recognition System.
It is recommended that you check what is connected to each port prior to initialization, because standard controllers and other devices besides the Voice Recognition System may be inserted into the controller ports as well. This check can be accomplished with the osContStartQuery() function and the osContGetQuery() function. The Voice Recognition System is connected if the value of the member variable errno of the OSContStatus structure is 0 (zero), and if the AND (logical product) of the value for type and CONT_TYPE_MASK is CONT_TYPE_VOICE.
(2) Initialize Registered Word Dictionary
The osVoiceClearDictionary() function is called in the processing here. The osVoiceClearDictionary() function initializes the registered word dictionary. Initialize the dictionary before registering words using the osVoiceSetWord() function.
(3) Register Words to Registered Word Dictionary
The osVoiceSetWord() function is called in the processing here. The osVoiceSetWord() function registers a word in the registered word dictionary. Since calling the osVoiceSetWord() function once registers one word, execute osVoiceSetWord() multiple times to register multiple words. The number of words registered must match the number set by the osVoiceClearDictionary() function. Please note that an error (CONT_ERR_INVALID) will be generated when the osVoiceStartReadData() function is executed if the number of words registered is greater than or less than the specified number of words.
(4) Start Voice Recognition
The osVoiceStartReadData() function is called in the processing here. The osVoiceStartReadData() function is for starting voice recognition processing. After osVoiceStartReadData() has been called, the recognition results can be acquired by calling osVoiceGetReadData(). In addition, call osVoiceStopReadData() to forcibly stop recognition after voice recognition has been started.
(5) Acquire Recognition Results
The osVoiceGetReadData() function is called in the processing here. The osVoiceGetReadData() function is for acquiring recognition results. The recognition results are stored in the OSVoiceData structure. Refer to the following example for the method of acquiring a recognized word from the data in the OSVoiceData structure. Please refer to Section 26.8.6.8 "Get Recognition Result" for details on the OSVoiceData structure.
To continue recognition processing again after the voice has been detected again, repeat processing from (4).
In order to rapidly respond to voice input from the user, you may call the osVoiceGetReadData() function every frame to check for voice input from the user.
Perform the processing shown below when an error is returned upon execution of the various functions.
If one of the five errors CONT_ERR_NO_CONTROLLER, CONT_ERR_DEVICE, CONT_ERR_CONTRFAIL, CONT_ERR_VOICE_NO_RESPONSE, or CONT_ERR_INVALID occurs when any of the various functions is executed, display a message and repeat processing starting from (1). Since the two errors CONT_ERR_VOICE_NO_RESPONSE and CONT_ERR_INVALID are errors which are due to software or hardware failures or bugs, they will not normally occur.
If the CONT_ERR_VOICE_WORD error occurs when executing the osVoiceSetWord() function, the word that was being registered at the time contains improper characters. Re-register the proper word.
If the CONT_ERR_VOICE_MEMORY error occurs when executing the osVoiceSetWord() function, the dictionary has overflowed memory and no more words can be registered. However, in this case recognition processing can still be performed from (4) on, even if the number of registered words is less than the number set by the osVoiceClearDictionary() function. Consequently, when this error occurs, store the number of words registered up to that point as the number of registered words and shift to the processing at (4). To repeat registration, redo processing from (2).
If the CONT_ERR_NOT_READY error is returned during execution of the osVoiceGetReadData() function, either no voice has been input or recognition processing is still underway. Wait a moment and retry the osVoiceGetReadData() function.
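The error handling rules above can be collected into one dispatch function. This is a sketch under stated assumptions: the CONT_ERR_* numbers in the enum are placeholders so the example compiles by itself (remove the enum and use <ultra64.h> in a real build), and VoiceAction, Registration, voice_action_for, and voice_after_set_word are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder error codes, NOT taken from the SDK header. */
enum {
    CONT_ERR_NO_CONTROLLER     = 1,
    CONT_ERR_CONTRFAIL         = 2,
    CONT_ERR_INVALID           = 3,
    CONT_ERR_DEVICE            = 4,
    CONT_ERR_NOT_READY         = 5,
    CONT_ERR_VOICE_MEMORY      = 6,
    CONT_ERR_VOICE_WORD        = 7,
    CONT_ERR_VOICE_NO_RESPONSE = 8
};

typedef enum {
    ACTION_CONTINUE,          /* 0 returned: carry on */
    ACTION_RESTART_INIT,      /* show a message, redo from step (1) */
    ACTION_FIX_WORD,          /* re-register a proper word */
    ACTION_START_WITH_FEWER,  /* dictionary full: go on to step (4) */
    ACTION_RETRY_LATER        /* call osVoiceGetReadData() again later */
} VoiceAction;

/* Map a return code to the recovery step described in the text. */
VoiceAction voice_action_for(int err)
{
    switch (err) {
    case 0:                     return ACTION_CONTINUE;
    case CONT_ERR_VOICE_WORD:   return ACTION_FIX_WORD;
    case CONT_ERR_VOICE_MEMORY: return ACTION_START_WITH_FEWER;
    case CONT_ERR_NOT_READY:    return ACTION_RETRY_LATER;
    default:                    return ACTION_RESTART_INIT;
    }
}

/* Application-side count of the words actually registered. */
typedef struct {
    int registered;       /* words accepted so far */
    int dictionary_full;  /* set once CONT_ERR_VOICE_MEMORY is seen */
} Registration;

/* Update the count after one osVoiceSetWord() call returned err. */
VoiceAction voice_after_set_word(Registration *reg, int err)
{
    VoiceAction action;

    assert(reg != NULL);
    action = voice_action_for(err);
    if (action == ACTION_CONTINUE)
        reg->registered++;          /* rejected words are not counted */
    else if (action == ACTION_START_WITH_FEWER)
        reg->dictionary_full = 1;
    return action;
}
```

Treating unknown codes as a restart from step (1) is a conservative design choice made for this sketch, not something the text prescribes.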
You may also refer to the sample program "voice" which uses Voice Recognition System functions. It is stored under the /usr/src/PR/demos/ directory.
(1) Recognition Accuracy
The Voice Recognition System performs pattern characteristic extraction in syllable units to recognize one word. Because of this, there are cases in which recognition accuracy may be slightly inferior to characteristic extraction in word units. Since instances may arise in which the input voice cannot be recognized and the user is prompted to re-input, be particularly careful when real-time responses are required, as during an action game. In these cases, take measures so as to avoid mis-recognition, such as keeping the number of registered words low, or registering only words whose pronunciations are completely different.
For example, keep the number of words registered in the dictionary at any one time low, or mask those registered words which are not needed, so that the user selects from a restricted vocabulary. Because recognition is then performed against only a small number of words, the recognition success rate becomes very high.
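As an illustration of restricting the vocabulary by masking, the sketch below builds a mask pattern that enables only a chosen subset of registered words. The bit layout is an assumption and is not stated in this section: one bit per registered word, most significant bit first, with 1 meaning the word is a recognition target. Confirm the actual layout in the osVoiceMaskDictionary() reference before relying on it. build_mask() is a hypothetical helper name.

```c
#include <string.h>

/* ASSUMPTION: one bit per registered word, most significant bit first,
   1 = word is a recognition target, 0 = word is masked out.
   Verify against the osVoiceMaskDictionary() reference. */
void build_mask(unsigned char *mask, int mask_bytes,
                const int *enabled_words, int enabled_count)
{
    int i;

    /* Start with every word masked out. */
    memset(mask, 0, (size_t)mask_bytes);

    for (i = 0; i < enabled_count; i++) {
        int w = enabled_words[i];

        /* Ignore word numbers that do not fit in the mask buffer. */
        if (w < 0 || w / 8 >= mask_bytes) {
            continue;
        }
        mask[w / 8] |= (unsigned char)(0x80u >> (w % 8));
    }
}
```

The resulting buffer and its size in bytes would then be passed as the maskpattern and size arguments of osVoiceMaskDictionary().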
(2) To Change to a New Recognized Word During Recognition Processing
To register a new recognized word after the osVoiceStartReadData() function has been called and while recognition processing is being executed, be sure to first interrupt recognition processing with the osVoiceStopReadData() function. Then redo processing starting from the osVoiceClearDictionary() function.
(3) Registration to Recognized Word Dictionary
Do not register words in the dictionary which contain invalid character combinations, that is, combinations for which the osVoiceCheckWord() function returns an error. There are also instances in which no error is returned, but operation of the software becomes unstable, if the specific character combinations shown below are entered in the dictionary.
(4) Precautions During Voice Input
Depending on the words registered in the dictionary, valid word candidates may be output simply by coughing or breathing into the microphone. Because of this, limit the acceptance of voice input to when a controller button is being pressed, or the like, so as to avoid erroneous recognition. There may also be cases in which the Voice Recognition System is unable to complete preparation to accept voice input when voice input is performed at the same time that the button is pressed. In this case, you may perform the following procedure.
(5) Voice Input Gain Adjustment
The voice detection threshold value is determined by the strength of the input signal at the time that recognition processing is started. If the voice level is high at that moment, the threshold value becomes high, making it difficult to detect voice input. If this occurs, try decreasing the gain to lower the voice level. Do not change the gain during the game except when it can be assumed that there will be unexpectedly high voice input levels at the start of recognition.
(6) Precautions Regarding Warnings
The warnings which are returned to the warning member variable of the OSVoiceData structure represent the reliability of the recognition results; unlike an error, they do not indicate a serious failure. For instance, even if valid candidates are returned to the answer[] member variable, VOICE_WARN_NOT_FIT (word is not among the recognized words) may be returned as a warning. This occurs when the distance[0] member variable, which expresses the distance value of the No. 1 candidate word, is 1600 or greater, even though the answer_num member variable, which expresses the number of valid candidates, is 1 or more. In this case, which of the two member variables takes priority depends on the application, but the warning essentially can be ignored. How the warning results are used is left to the discretion of the person creating the application.
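One possible policy, for an application that prefers to reject doubtful input rather than ignore the warning, is sketched below. It is only an illustration of the judgment described above: accept_top_candidate() and VOICE_DISTANCE_LIMIT are hypothetical names, and the structure is a local copy of OSVoiceData (with a stand-in u16 type) so that the sketch compiles without the SDK.

```c
#include <stdint.h>

typedef uint16_t u16;  /* stand-in for the SDK type */

/* Local copy of OSVoiceData for compiling without <ultra64.h>. */
typedef struct {
    u16 warning;
    u16 answer_num;
    u16 voice_level;
    u16 voice_sn;
    u16 voice_time;
    u16 answer[5];
    u16 distance[5];
} OSVoiceData;

/* Distance value quoted in the text: at 1600 or greater the
   No. 1 candidate is flagged with VOICE_WARN_NOT_FIT. */
#define VOICE_DISTANCE_LIMIT 1600

/* Returns 1 if the No. 1 candidate should be acted on, 0 otherwise.
   This is a strict policy chosen for the example; an application may
   equally decide to trust answer_num alone. */
int accept_top_candidate(const OSVoiceData *r)
{
    if (r->answer_num == 0) {
        return 0;  /* no valid candidate at all */
    }
    if (r->distance[0] >= VOICE_DISTANCE_LIMIT) {
        return 0;  /* best match is too far from any registered word */
    }
    return 1;
}
```

A stricter limit than 1600 could be substituted in action scenes where a wrong command is more costly than asking the player to repeat the word.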
Usage
Character
No limitation on use
Can be used only after specified characters
(Combinable characters)
Cannot be used at the beginning of a word
Cannot be used at the end of a word
Cannot be used in front of "-"
Cannot be used after small "tsu"
Combinations which cannot be used
26.8.6.4 Check Registerable Words
#include <ultra64.h>
s32 osVoiceCheckWord(u8 *word);
Usage
Character
No limitation on use
Can be used only after specified characters
(Combinable characters)
Cannot be used at the beginning of a word
Cannot be used at the end of a word
Cannot be used in front of "-"
Cannot be used after small "tsu"
Combinations which cannot be used
26.8.6.5 Count Semisyllables in Word
#include <ultra64.h>
void osVoiceCountSyllables(u8 *word, u32 *syllable);
26.8.6.6 Mask Registered Words
#include <ultra64.h>
s32 osVoiceMaskDictionary(OSVoiceHandle *hd, u8 *maskpattern, int size);
26.8.6.7 Start Voice Recognition
#include <ultra64.h>
s32 osVoiceStartReadData(OSVoiceHandle *hd);
26.8.6.8 Get Recognition Result
#include <ultra64.h>
s32 osVoiceGetReadData(OSVoiceHandle *hd, OSVoiceData *result);
typedef struct {
    u16 warning;      /* Warning */
    u16 answer_num;   /* Number of candidates (0~5) */
    u16 voice_level;  /* Voice input level */
    u16 voice_sn;     /* Relative voice level */
    u16 voice_time;   /* Voice input time */
    u16 answer[5];    /* Candidate word number */
    u16 distance[5];  /* Distance value */
} OSVoiceData;
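The following sketch shows one way to turn a ranked candidate in this structure back into the word that was registered. It assumes that answer[] holds the zero-based position at which the word was registered, which is an interpretation of "candidate word number" and should be checked against the function reference. candidate_word() is a hypothetical helper, and the structure is a local copy of OSVoiceData so that the sketch compiles without the SDK.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16;  /* stand-in for the SDK type */

/* Local copy of OSVoiceData for compiling without <ultra64.h>. */
typedef struct {
    u16 warning;
    u16 answer_num;
    u16 voice_level;
    u16 voice_sn;
    u16 voice_time;
    u16 answer[5];
    u16 distance[5];
} OSVoiceData;

/* Returns the registered word for candidate `rank` (0 = closest match),
   or NULL if that rank holds no valid candidate.
   ASSUMPTION: answer[] is a zero-based index in registration order. */
const char *candidate_word(const OSVoiceData *r, int rank,
                           const char *const *words, int word_count)
{
    if (rank < 0 || rank >= 5 || rank >= r->answer_num) {
        return NULL;  /* rank outside the valid candidates */
    }
    if (r->answer[rank] >= word_count) {
        return NULL;  /* word number outside the registered table */
    }
    return words[r->answer[rank]];
}
```

Checking rank against answer_num matters because entries of answer[] beyond the number of valid candidates should not be interpreted as word numbers.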
26.8.6.9 Forcibly Stop Recognition Processing
#include <ultra64.h>
s32 osVoiceStopReadData(OSVoiceHandle *hd);
26.8.6.10 Adjust Input Gain
#include <ultra64.h>
s32 osVoiceControlGain(OSVoiceHandle *hd, s32 analog, s32 digital);
26.8.7 Examples Using Voice Recognition System Functions
(2) Initialize the registered word dictionary using the osVoiceClearDictionary() function
(3) Register words to the registered word dictionary using the osVoiceSetWord() function
(4) Start voice recognition using the osVoiceStartReadData() function
(5) Acquire voice recognition results using the osVoiceGetReadData() function
26.8.7.1 Flow of Voice Recognition Processing
u8 *registration_word[] = {
    "Yakiniku",
    "Mario",
    /* ... */
    "Pikachu"
};
OSVoiceData result;
26.8.7.2 Error Processing
26.8.8 Precautions