Speech Recognition Game - Unity

Speech Recognition Game - Unity - c#

Good Afternoon everyone,
I am currently working on a uni project on accessibility in video games. My game uses eye tracking and speech recognition. It consists of 2 small level : a shooting game and a running level. The game i offline. The Eye tracking part works fine but I encountered an issue with the speech recognition. I am using the phrase recognizer from unity speech: https://learn.microsoft.com/en-us/windows/mixed-reality/develop/unity/voice-input-in-unity .
The problem is that there is a delay of a second to a second and a half from the moment I speak to the recognition. It happens before my onphrase recognizer is called (before my functions are called). The delay is still present when I take down wifi and cortana and I am wondering if there is any way to shorten it as it is pretty bad in a video game...
Here is the code in question:
//Speech recognition Initialization
private KeywordRecognizer keywordRecognizer;
private Dictionary<string, System.Action> actions = new Dictionary<string, System.Action>();
[...]
void Start()
{
//we add the jump function to the dictionnary
actions.Add("jump", () => Up(1.25f));
//we set the speech recognition function and start it
keywordRecognizer = new KeywordRecognizer(actions.Keys.ToArray(), ConfidenceLevel.Low);
keywordRecognizer.OnPhraseRecognized += RecognizedSpeech;
keywordRecognizer.Start();
}
private void RecognizedSpeech(PhraseRecognizedEventArgs speech)
{
Debug.LogWarning("jump");
actions[speech.text].Invoke();
}
public void EndListening()
{
actions.Clear();
//keywordRecognizer.Stop();
}
[...]"
Would anyone have a lead or an advice or is working/ has worked on something similar ?
Thank you for your time.

Related

Song doesn't play all the time in Monogame

Heey,
We created a game with Monogame but we got the following problem.
We got a themesong that plays when you have loaded the game now is the problem that the themesong sometimes plays but sometimes just doesn't. We convert it by the XNA pipeline to a wma and load it into our game with the .xnb together but just sometimes the music doesn't wanna start.
We just use the standard code for starting a song and all of this code does fire.
internal void PlayBackgroundMusic()
{
if (MediaPlayer.GameHasControl)
{
MediaPlayer.Volume = mainVolume;
MediaPlayer.Play(backgroundMusic);
MediaPlayer.IsRepeating = true;
}
}
We also use SoundEffects but these work 100% of the time it's only the song that won't play everytime you start. Windows RT runs it fine by the way.

Make sure that the debugger gets into the if statement through debugging (or remove the statement temporarily). Another possibility might be that the function is running before the game is fully initialized. You could try delaying the function until the game has been fully loaded.
PS: I can't comment on questions yet so here's an answer.
Edit:
Alright, after some messing around with the Song class and looking in the implementation in MonoGame I came to the conclusion that the SoundEffect class is easier to use and better implemented.
backgroundSound = Content.Load<SoundEffect>("alarm");
protected override void BeginRun()
{
// I created an instance here but you should keep track of this variable
// in order to stop it when you want.
var backSong = backgroundSound.CreateInstance();
backSong.IsLooped = true;
backSong.Play();
base.BeginRun();
}
I used this post: using BeginRun override to play the SoundEffect on startup.

If you want to play a song, use the Song class. But make sure you are using ".mp3" instead of ".wma" before converting them into ".xnb".
backgroundMusic = Content.Load<Song>("MainTheme");
MediaPlayer.Play(backgroundMusic);
MediaPlayer.IsRepeating = true;
See MSDN.

c# Kinect speech and gesture recognition not working together

I am writing a code that uses both speech and gesture recognition. I have used code from the Kinect Dev toolkit browser for speech and a blog (http://dotneteers.net/blogs/vbandi/archive/2013/03/25/kinect-interactions-with-wpf-part-i-getting-started.aspx) regarding the gesture control. The problem I am having is that I believe the initializations are interfering with each other.
private KinectSensor InitializeKinect()
{
CurrentSensor = KinectSensor.KinectSensors.FirstOrDefault();
speechRecognizer = CreateSpeechRecognizer();
CurrentSensor.Start();
Start();
return CurrentSensor;
}
That interferes with
private void OnLoaded(object sender, RoutedEventArgs routedEventArgs)
{
this.sensorChooser = new KinectSensorChooser();
this.sensorChooser.KinectChanged += SensorChooserOnKinectChanged;
this.sensorChooserUi.KinectSensorChooser = this.sensorChooser;
this.sensorChooser.Start();
somehow. I already edited the InitializeKinect function a little bit due to KinectStatus being an not comparable (== doesnt work).
If I comment out OnLoaded or InitalizeKinect in the MainWindow(), the other will work and if both are uncommented out, Speech only works.
Thanks for the help!

I know nothing about Kinect, but - InitializeKinect looks like it's finding a Kinect sensor and initializing the SR engine (most likely using some Kinect information). I would remove the InitializeKinect call and add
speechRecognizer = CreateSpeechRecognizer();
just before
this.sensorChooser.Start();

Kinect speech recognition and skeleton tracking don't work together

I'm writing an application that can take several different external inputs (keyboard presses, motion gestures, speech) and produce similar outputs (for instance, pressing "T" on the keyboard will do the same thing as saying the word "Travel" out loud). Because of that, I don't want any of the input managers to know about each other. Specifically, I don't want the Kinect manager (as much as possible) to know about the Speech manager and vice versa, even though I'm using the Kinect's built-in microphone (the Speech manager should work with ANY microphone). I'm using System.Speech in the Speech manager as opposed to Microsoft.Speech.
I'm having a problem where as soon as the Kinect motion recognition module is enabled, the speech module stops receiving input. I've tried a whole bunch of things like inverting the skeleton stream and audio stream, capturing the audio stream in different ways, etc. I finally narrowed down the problem: something about how I'm initializing my modules does not play nicely with how my application deals with events.
The application works great until motion capture starts. If I completely exclude the Kinect module, this is how my main method looks:
// Main.cs
public static void Main()
{
// Create input managers
KeyboardMouseManager keymanager = new KeyboardMouseManager();
SpeechManager speechmanager = new SpeechManager();
// Start listening for keyboard input
keymanager.start();
// Start listening for speech input
speechmanager.start()
try
{
Application.Run();
}
catch (Exception ex)
{
MessageBox.Show(ex.StackTrace);
}
}
I'm using Application.Run() because my GUI is handled by an outside program. This C# application's only job is to receive input events and run external scripts based on that input.
Both the keyboard and speech modules receive events sporadically. The Kinect, on the other hand, generates events constantly. If my gestures happened just as infrequently, a polling loop might be the answer with a wait time between each poll. However, I'm using the Kinect to control mouse movement... I can't afford to wait between skeleton event captures, because then the mouse would be very laggy; my skeleton capture loop needs to be as constant as possible. This presented a big problem, because now I can't have my Kinect manager on the same thread (or message pump? I'm a little hazy on the difference, hence why I think the problem lies here): from the way I understand it, being on the same thread would not allow keyboard or speech events to consistently get through. Instead, I kind of hacked together a solution where I made my Kinect manager inherit from System.Windows.Forms, so that it would work with Application.Run().
Now, my main method looks like this:
// Main.cs
public static void Main()
{
// Create input managers
KeyboardMouseManager keymanager = new KeyboardMouseManager();
KinectManager kinectManager = new KinectManager();
SpeechManager speechmanager = new SpeechManager();
// Start listening for keyboard input
keymanager.start();
// Attempt to launch the kinect sensor
bool kinectLoaded = kinectManager.start();
// Use the default microphone (if applicable) if kinect isn't hooked up
// Use the kinect microphone array if the kinect is working
if (kinectLoaded)
{
speechmanager.start(kinectManager);
}
else
{
speechmanager.start();
}
try
{
// THIS IS THE PLACE I THINK I'M DOING SOMETHING WRONG
Application.Run(kinectManager);
}
catch (Exception ex)
{
MessageBox.Show(ex.StackTrace);
}
For some reason, the Kinect microphone loses its "default-ness" as soon as the Kinect sensor is started (if this observation is incorrect, or there is a workaround, PLEASE let me know). Because of that, I was required to make a special start() method in the Speech manager, which looks like this:
// SpeechManager.cs
/** For use with the Kinect Microphone **/
public void start(KinectManager kinect)
{
// Get the speech recognizer information
RecognizerInfo recogInfo = SpeechRecognitionEngine.InstalledRecognizers().FirstOrDefault();
if (null == recogInfo)
{
Console.WriteLine("Error: No recognizer information found on Kinect");
return;
}
SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(recogInfo.Id);
// Loads all of the grammars into the recognizer engine
loadSpeechBindings(recognizer);
// Set speech event handler
recognizer.SpeechRecognized += speechRecognized;
using (var s = kinect.getAudioSource().Start() )
{
// Set the input to the Kinect audio stream
recognizer.SetInputToAudioStream(s, new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
// Recognize asycronous speech events
recognizer.RecognizeAsync(RecognizeMode.Multiple);
}
}
For reference, the start() method in the Kinect manager looks like this:
// KinectManager.cs
public bool start()
{
// Code from Microsoft Sample
kinect = (from sensorToCheck in KinectSensor.KinectSensors where sensorToCheck.Status == KinectStatus.Connected select sensorToCheck).FirstOrDefault();
// Fail elegantly if no kinect is detected
if (kinect == null)
{
connected = false;
Console.WriteLine("Couldn't find a Kinect");
return false;
}
// Start listening
kinect.Start();
// Enable listening for all skeletons
kinect.SkeletonStream.Enable();
// Obtain the KinectAudioSource to do audio capture
source = kinect.AudioSource;
source.EchoCancellationMode = EchoCancellationMode.None; // No AEC for this sample
source.AutomaticGainControlEnabled = false; // Important to turn this off for speech recognition
kinect.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(allFramesReady);
connected = true;
return true;
}
So when I disable motion capture (by having my main() look similar to the first code segment), speech recognition works fine. When I enable motion capture, motion works great but no speech gets recognized. In both cases, keyboard events always work. There are no errors, and through tracing I found out that all the data in the speech manager is initialized correctly... it seems like the speech recognition events just disappear. How can I reorganize this code so that the input modules can work independently? Do I use threading, or just Application.Run() in a different way?

The Microsoft Kinect SDK have several known issues, one of them being that audio is not processed if you begin tracking the skeleton after starting the audio processor. From the known issues:
Audio is not processed if skeleton stream is enabled after starting audio capture
Due to a bug, enabling or disabling the SkeletonStream will stop the AudioSource
stream returned by the Kinect sensor. The following sequence of instructions will
stop the audio stream:
kinectSensor.Start();
kinectSensor.AudioSource.Start(); // --> this will create an audio stream
kinectSensor.SkeletonStream.Enable(); // --> this will stop the audio stream as an undesired side effect
The workaround is to invert the order of the calls or to restart the AudioSource after changing SkeletonStream status.
Workaround #1 (start audio after skeleton):
kinectSensor.Start();
kinectSensor.SkeletonStream.Enable();
kinectSensor.AudioSource.Start();
Workaround #2 (restart audio after skeleton):
kinectSensor.Start();
kinectSensor.AudioSource.Start(); // --> this will create an audio stream
kinectSensor.SkeletonStream.Enable(); // --> this will stop the audio stream as an undesired side effect
kinectSensor.AudioSource.Start(); // --> this will create another audio stream
Resetting the SkeletonStream engine status is an expensive call. It should be made at application startup only, unless the app has specific needs that require turning Skeleton on and off.
I also hope that when you say you're using "version 1" of the SDK, you mean "version 1.6". If you are using anything but 1.5 or 1.6, you are only hurting yourself due to the many changes that were made in 1.5.

Playing MIDI-files and synchronizing time with visuals with WPF

I'm writing an application using C#/Windows Presentation Foundation.
It is visualizing the steps of a dance with foot shapes.
Currently I'm playing the music as WAV-file and timing the steps with a Timer.
Because of the irregularities of a Timer the music is not in sync with the steps.
I need some kind of synchronization, this is why I wanted to use MIDI-files.
To sync the steps I need an event for each time in the music and would then show the next step. In this case I wouldn't use the Timer anymore.
I already looked at NAudio. I found tutorials for playing MP3-files which don't help me. I created a MidiFile-object but I don't know how to play it. I know that a MIDI-file contains information on how to play the music (for synthesizers) but I don't want to implement my own player.
What is a simple way to play a MIDI-file with NAudio?
How can I receive Events in each time of the music?
Is there an alternative to NAudio that can probably help me better?
Is there an alternative to MIDI that can sync to my visualization?
I am thankful for every kind of help. I've been searching for a while and think that I am maybe looking in the wrong direction.

With DryWetMIDI (I'm the author) playing MIDI files along with firing played events is pretty simple:
namespace SimplePlaybackApp
{
class Program
{
private static Playback _playback;
static void Main(string[] args)
{
var midiFile = MidiFile.Read("The Greatest Song Ever.mid");
var outputDevice = OutputDevice.GetByName("Microsoft GS Wavetable Synth");
_playback = midiFile.GetPlayback(outputDevice);
_playback.EventPlayed += OnEventPlayed;
_playback.Start();
SpinWait.SpinUntil(() => !_playback.IsRunning);
Console.WriteLine("Playback stopped or finished.");
outputDevice.Dispose();
_playback.Dispose();
}
private static void OnEventPlayed(object sender, MidiEventPlayedEventArgs e)
{
// ... do something
}
}
}
More info in Playback article and Playback API reference.

If you want to get deeper into the midi internals this looked like a pretty cool library and source code to explore.
http://code.google.com/p/midi-dot-net/

XNA performance and setup on WP7

I've been playing around with IsFixedTimeStep and TargetElapsedTime but I'm unable to get an fps above the 30fps. This is in both the emulator and on my HTC HD7 phone.
I'm trying to get the World.Step() setting correct in Farseer too, but havent found a good setting for this.
If I want to get it running at 60fps what should the three settings (IsFixedTimeStep, TargetElapsedTime and World.Step) ideally be?
Thanks!

You can make your game run at 60fps as long as you are running a Mango Deployed App
the code below was lifted from: MSDN: Game at 60fps
game timer interval run at 60Hz
timer.UpdateInterval = TimeSpan.FromTicks(166667);
Create the event handler
public Game1()
{
graphics = new GraphicsDeviceManager(this);
graphics.PreparingDeviceSettings += new EventHandler<PreparingDeviceSettingsEventArgs>(graphics_PreparingDeviceSettings);
// Frame rate is 60 fps
TargetElapsedTime = TimeSpan.FromTicks(166667);
}
Then implement the handler
void graphics_PreparingDeviceSettings(object sender, PreparingDeviceSettingsEventArgs e)
{
e.GraphicsDeviceInformation.PresentationParameters.PresentationInterval = PresentInterval.One;
}

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Speech Recognition Game - Unity - c#

Related

Song doesn't play all the time in Monogame

c# Kinect speech and gesture recognition not working together

Kinect speech recognition and skeleton tracking don't work together

Playing MIDI-files and synchronizing time with visuals with WPF

XNA performance and setup on WP7

Categories

Resources