Feature: Enhanced Speech Recognition
We are pleased to announce that our speech recognition system has been completely rebuilt to provide a better user experience. With this update, users can expect the following enhancements:
- Automatic speech detection: you no longer need to use push to talk to speak with digital humans.
- Enhanced speech recognition accuracy, especially for short utterances, with no decrease in accuracy even in high-latency environments.
- Your microphone audio will not be transmitted from your device until you begin speaking, ensuring privacy.
- You have the option to mute/unmute your microphone for privacy or to simulate push-to-talk functionality in noisy environments.
- Our voice activity detection system has been trained to detect speech rather than noise (background noise, coughing, music, etc.).
- You can interrupt the digital human, and the character will stop speaking. However, background noise alone will not interrupt the character; it requires an interim transcription result.
- Improved stability for a more reliable experience.
There are two ways to access these new features: through our Hosted Experience (a drop-in script that provides an instant user experience) or via our Web SDK (NPM package).
Migration Guide: Hosted Experience
The Hosted Experience provides full UI support for speech recognition mode:
- Buttons to mute/unmute the user's microphone.
- Indicators of the user's microphone status (muted, listening, active speech, blocked).
- (Coming soon: transcription of the user's speech displayed on screen.)
To switch to speech recognition mode, set voiceInputMode to "SPEECH_RECOGNITION" in your uneeqInteractionsOptions configuration.
Example:
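A minimal sketch of the configuration change, assuming your page defines uneeqInteractionsOptions as a global object before the Hosted Experience drop-in script loads; everything other than voiceInputMode stands in for your existing settings.

```typescript
// Minimal sketch: set voiceInputMode in the global uneeqInteractionsOptions
// object before the Hosted Experience drop-in script loads. Only the new
// option is shown; keep the rest of your existing configuration unchanged.
(window as any).uneeqInteractionsOptions = {
  // ...your existing Hosted Experience options...
  voiceInputMode: "SPEECH_RECOGNITION",
};
```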
Method Changes
If you are using Uneeq methods to programmatically control voice recording, be aware of the following changes:
uneeqStartRecording and uneeqStopRecording have no effect in speech recognition mode, because speech is detected automatically without push to talk. These methods are no longer required.
Message Changes
Previously, when using push to talk, you would receive the messages RecordingStarted and RecordingStopped to indicate when push to talk was engaged or disengaged. In speech recognition mode, you will no longer receive these messages.
There are new messages that will be sent when using speech recognition mode:
- UserStartedSpeaking: Voice activity detection has recognized that the user has started speaking.
- UserStoppedSpeaking: Voice activity detection has recognized that the user has stopped speaking.
- SpeechTranscription: A new interim or final transcription result is available. See here for details of the message contents.
Migration Guide: Web SDK (NPM Package)
If you've built your own experience and UI using our NPM package, you will need to set voiceInputMode to "SPEECH_RECOGNITION" from version 2.49.0 onwards.
voiceInputMode: VOICE_ACTIVITY will be merged with SPEECH_RECOGNITION in version 2.50.0. From that version onwards, you will get the SPEECH_RECOGNITION experience when using VOICE_ACTIVITY as your voiceInputMode.
Example:
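A minimal sketch of the same change for an SDK integration, assuming a constructor-style initialisation; the uneeq-js import name and options shape are assumptions, so adapt them to how your app already initialises the SDK.

```typescript
// Sketch only: the import name and constructor shape are assumptions;
// match them to your existing SDK setup.
import { Uneeq } from "uneeq-js";

const uneeq = new Uneeq({
  // ...your existing SDK options...
  voiceInputMode: "SPEECH_RECOGNITION", // from 2.49.0 onwards; VOICE_ACTIVITY maps to this experience from 2.50.0
});
```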
Method Changes
If you are using Uneeq methods to programmatically control voice recording, be aware of the following changes:
uneeqStartRecording and uneeqStopRecording have no effect in speech recognition mode, because speech is detected automatically without push to talk. These methods are no longer required.
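If your UI still wires a push-to-talk control to these methods, the handlers can simply be removed; the sketch below uses declarations as stand-ins for your actual button element and for however the recording methods are exposed in your integration.

```typescript
// Stand-ins for your real button and for the recording methods,
// however they are exposed in your integration.
declare const pttButton: HTMLButtonElement;
declare function uneeqStartRecording(): void;
declare function uneeqStopRecording(): void;

// Before: push-to-talk wiring that manually started and stopped recording.
pttButton.addEventListener("mousedown", () => uneeqStartRecording());
pttButton.addEventListener("mouseup", () => uneeqStopRecording());

// After: remove the handlers above. In speech recognition mode these calls
// have no effect, because listening starts and stops automatically.
```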
Message Changes
Previously, when using push to talk, you would receive the messages RecordingStarted and RecordingStopped to indicate when push to talk was engaged or disengaged. In speech recognition mode, you will no longer receive these messages.
There are new messages that will be sent when using speech recognition mode:
- UserStartedSpeaking: Voice activity detection has recognized that the user has started speaking.
- UserStoppedSpeaking: Voice activity detection has recognized that the user has stopped speaking.
- SpeechTranscription: A new interim or final transcription result is available. See here for details of the message contents.
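How these messages reach your code depends on your integration; the sketch below assumes a handler that receives message objects with a uneeqMessageType field, which is an assumption rather than the documented shape, so adapt it to wherever you already process SDK messages.

```typescript
// Sketch of handling the new speech recognition messages. The message shape
// (uneeqMessageType) is assumed; adapt to however your integration already
// receives messages from the SDK.
interface UneeqMessage {
  uneeqMessageType: string;
  [key: string]: unknown;
}

function handleUneeqMessage(msg: UneeqMessage): void {
  switch (msg.uneeqMessageType) {
    case "UserStartedSpeaking":
      // e.g. switch your microphone indicator to "active speech"
      break;
    case "UserStoppedSpeaking":
      // e.g. switch the indicator back to "listening"
      break;
    case "SpeechTranscription":
      // An interim or final transcription result is available; see the
      // message documentation for its exact contents.
      break;
  }
}
```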