Using Cortana's voice - c#

I'm currently building a homemade assistant, and the default speech synthesizer (in C#) doesn't have a nice voice. I would like to know whether it is possible to use Cortana's voice and pronunciation, and if so, how?

You can see how to tune Cortana's responses via this link:
https://learn.microsoft.com/en-us/cortana/skills/speech-synthesis-markup-language
And you can output text-to-speech (TTS) this way:
https://learn.microsoft.com/en-us/azure/cognitive-services/speech/api-reference-rest/bingvoiceoutput
You can also change the Windows default voice to Cortana:
https://superuser.com/questions/1141840/how-to-enable-microsoft-eva-cortanas-voice-on-windows-10
Does this help?
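
For the voice-switching route, here is a minimal sketch, assuming a desktop .NET Framework app that references System.Speech. It lists the voices the synthesizer can see and selects one by name if it is present. The voice name "Microsoft Eva Mobile" is only a placeholder assumption; use whatever name the list prints on your machine once the voice has been enabled.

```csharp
using System;
using System.Linq;
using System.Speech.Synthesis;

class Program
{
    static void Main()
    {
        // Assumption: placeholder voice name, replace with one printed below.
        const string preferredVoice = "Microsoft Eva Mobile";

        using (var synth = new SpeechSynthesizer())
        {
            // Print every voice the desktop synthesizer can see.
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                VoiceInfo info = voice.VoiceInfo;
                Console.WriteLine($"{info.Name} ({info.Culture}, {info.Gender}) enabled={voice.Enabled}");
            }

            // Select the preferred voice only if it is installed and enabled;
            // otherwise the synthesizer keeps its default voice.
            bool available = synth.GetInstalledVoices()
                .Any(v => v.Enabled && v.VoiceInfo.Name == preferredVoice);
            if (available)
            {
                synth.SelectVoice(preferredVoice);
            }

            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("Hello, this is a test of the selected voice.");
        }
    }
}
```

If the voice you want does not show up in the list, System.Speech cannot select it, and the cloud TTS route above is the fallback.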

Related

API for Live Captions on Windows

I'm building a Windows Universal App that shows live captions for the lecture a student is currently watching or attending in person. I'm looking for a built-in, free solution for audio-to-text.
macOS has the Speech library (https://developer.apple.com/documentation/speech), which we're going to use, but I cannot find a similar one on Windows. I found docs on the Windows.Media package, but cannot figure out whether it actually has an audio-to-text API or only command recognition: https://learn.microsoft.com/en-us/uwp/api/windows.media.speechrecognition?view=winrt-22621
Does anyone have experience building this kind of capability on Windows?

Yes, you can use the Windows.Media.SpeechRecognition API for general speech-to-text (dictation), not only for command recognition.
You can run a simple test with the official speech recognition sample, SpeechRecognitionAndSynthesis. Just remember to enable online speech recognition (Settings -> Privacy -> Speech).
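
To give an idea of what the dictation path looks like, here is a rough sketch, assuming a UWP app with the Microphone capability declared in its manifest. The class name LiveCaptioner and the CaptionReady event are hypothetical names made up for this example; the Windows.Media.SpeechRecognition types and members are the ones the API exposes.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Media.SpeechRecognition;

// Hypothetical helper: raises CaptionReady for every recognized phrase.
public sealed class LiveCaptioner
{
    private SpeechRecognizer recognizer;

    // Note: raised on a background thread, so marshal to the UI thread
    // (for example via the page's Dispatcher) before touching controls.
    public event Action<string> CaptionReady;

    public async Task StartAsync()
    {
        recognizer = new SpeechRecognizer();

        // A dictation topic constraint enables free-form speech-to-text
        // instead of a fixed list of commands.
        recognizer.Constraints.Add(
            new SpeechRecognitionTopicConstraint(SpeechRecognitionScenario.Dictation, "dictation"));

        SpeechRecognitionCompilationResult compiled = await recognizer.CompileConstraintsAsync();
        if (compiled.Status != SpeechRecognitionResultStatus.Success)
        {
            throw new InvalidOperationException("Could not compile constraints: " + compiled.Status);
        }

        // Fires once per finished phrase while the session is running.
        recognizer.ContinuousRecognitionSession.ResultGenerated +=
            (session, args) => CaptionReady?.Invoke(args.Result.Text);

        await recognizer.ContinuousRecognitionSession.StartAsync();
    }

    public async Task StopAsync()
    {
        if (recognizer != null)
        {
            await recognizer.ContinuousRecognitionSession.StopAsync();
            recognizer.Dispose();
            recognizer = null;
        }
    }
}
```

For captions that update while the speaker is still mid-sentence, the recognizer's HypothesisGenerated event can be handled in the same way.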

Disable general speech commands in Hololens

I am making an application with Unity and MRTK for Hololens 2.
I would like to know if it is possible to disable general speech commands.
That is, I want to keep the voice commands I have created myself, but I don't want phrases like "Take a picture" or "Take a video" to be recognized.
I've searched the internet but haven't found anything about it.
Does anyone know if there is an option to do this?

Speech commands in a Unity UWP app are powered by the same engine that supports speech in the Windows 10 system. This means that if you disable the Speech feature in Settings > Privacy > Speech, speech recognition will no longer be available in your Unity app either. Apart from this, HoloLens does not currently provide any way to modify system commands. It is therefore recommended that you avoid reusing system commands in your app, or consider using the Unity plug-in for the Cognitive Speech Services SDK or another third-party speech engine.

You can go to the Settings page, then to Camera, find "Choose which apps can access your camera", and uncheck Mixed Reality Camera.

Microsoft Cognitive Services: Bing Speech Recognition XAML

I am trying to develop a humble AI to help with my family's daily routines.
Voice recognition is a must, and commands cannot be limited to a command library,
so command library mode is off the table.
I tried dictation mode, which already has terrible recognition with a headset and won't be able to understand anything with a room mic.
So I am trying to use Microsoft Cognitive Services: Bing Speech Recognition.
I downloaded the documents and the example, and I see everything is in XAML form. I don't understand why.
I am asking for guidance from those who are experienced in this: is it possible to do this in a console app or a Windows Forms app? (I am using .NET 4.6.)
If not, do you have any suggestions for solving my problem?
Thank you for your time and patience.

You can use the NuGet packages to achieve the same thing. There is a sample here: https://github.com/Azure-Samples/Cognitive-Speech-STT-Windows
Further details about the Bing Speech SDK: https://learn.microsoft.com/en-us/azure/cognitive-services/Speech/GetStarted/GetStartedCSharpDesktop
You can use the same SDK in a console app too; you just need to set up the microphone input (MicIn) yourself.
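
A console version could look roughly like the sketch below. It assumes the Bing Speech client library NuGet package used by the linked sample (namespace Microsoft.CognitiveServices.SpeechRecognition) is installed and that the service is still available to you; the subscription key is a placeholder, and "en-US" is just an example locale.

```csharp
using System;
using Microsoft.CognitiveServices.SpeechRecognition;

class Program
{
    static void Main()
    {
        // Placeholder: replace with your own subscription key.
        const string subscriptionKey = "YOUR_SUBSCRIPTION_KEY";

        // LongDictation is meant for free-form speech rather than short commands.
        using (var mic = SpeechRecognitionServiceFactory.CreateMicrophoneClient(
            SpeechRecognitionMode.LongDictation, "en-US", subscriptionKey))
        {
            // Intermediate guesses while the user is still speaking.
            mic.OnPartialResponseReceived += (sender, e) =>
                Console.WriteLine("partial: " + e.PartialResult);

            // Final results for a finished phrase.
            mic.OnResponseReceived += (sender, e) =>
            {
                foreach (var phrase in e.PhraseResponse.Results)
                {
                    Console.WriteLine("final: " + phrase.DisplayText);
                }
            };

            mic.OnConversationError += (sender, e) =>
                Console.WriteLine("error: " + e.SpeechErrorText);

            mic.StartMicAndRecognition();
            Console.WriteLine("Listening. Press Enter to stop.");
            Console.ReadLine();
            mic.EndMicAndRecognition();
        }
    }
}
```

Nothing here depends on XAML; the sample only uses it for its demo UI.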

Using the narrator in C# windows store apps

The Narrator API (System.Speech) is not available for Windows Store apps.
Is there an alternative API or method for using this?
I found that there is a text-to-speech function as part of the Microsoft Translator service. However, since the Narrator is already available, it would be silly to have to use that service, so I'd rather use the Narrator directly if possible.
I thought of maybe running a command through CMD.exe to speak some text, but Windows Store apps cannot launch external processes, and I haven't found a command-line interface for the Narrator anyway.
Does anybody know of any method of doing this?

Windows 8.1 has offline text-to-speech support. Here's an MSDN sample.

Are you trying to implement accessibility? If so, then this may be what you're looking for:
https://msdn.microsoft.com/library/windows/apps/windows.ui.xaml.automation.automationproperties
You can set the Name attached property to whatever you would like to have read.
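
For the text-to-speech route, a minimal sketch might look like this, assuming a Windows 8.1 or later Store/UWP app whose page contains a MediaElement. The Speaker class and its SpeakAsync method are hypothetical names for this example; the Windows.Media.SpeechSynthesis calls are the platform's own.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Media.SpeechSynthesis;
using Windows.UI.Xaml.Controls;

// Hypothetical helper: speaks text through a MediaElement on the page.
public static class Speaker
{
    public static async Task SpeakAsync(MediaElement player, string text)
    {
        using (var synth = new SpeechSynthesizer())
        {
            // Render the text to an in-memory audio stream (works offline).
            SpeechSynthesisStream stream = await synth.SynthesizeTextToStreamAsync(text);

            // Hand the audio to the MediaElement and start playback.
            player.SetSource(stream, stream.ContentType);
            player.Play();
        }
    }
}
```

From a page's code-behind this would be called as `await Speaker.SpeakAsync(myMediaElement, "Hello");`, where myMediaElement is whatever name the MediaElement has in your XAML.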

Speech to text on windows server 2008 with c# and supports multilanguage?

I'm looking for speech (wave files) to text on Windows Server 2008 (or Windows Server 2008 R2) using C# (or at least an API that I can call from C#) that supports multiple languages.
As far as I know, I can't use .NET speech (SAPI) because it works only on Vista / Windows 7.
I can't use the Microsoft Speech Platform because it does not support all the languages I need (as far as I checked, there is no Hebrew (he) support).
It can't be a web-based service (I need it on my server).
I'm looking for something that can be used in commercial software, and I'm also willing to pay for a third-party product.
Can you please help me with that?
Thanks

You have text-to-speech listed as a tag, but the description sounds like speech recognition. If I understand correctly, you want to take a WAV file with speech in it and convert it to text. This is not even normal speech recognition, because most speech recognition systems work on targeted speech input and use grammars to restrict the search space that the speech engine has to cover.
I think what you are describing is automatic transcription, akin to what Google Voice does with your voicemail messages when it sends you a text transcription in an email. This is a much more difficult problem, and the state of the art is not that advanced right now. Most of these solutions are offered as services, and the best ones still use human transcribers when the speech recognition confidence is low.
I think the leader in this area is Nuance, and I would check with them for a solution. I know they recently bought a company that provides this kind of automated transcription service, and perhaps they now offer it as a product. They are also a leader in automatically transcribing doctors' orders and findings to text with their product Dragon NaturallySpeaking.
