What is the difference between these two methods in C# using the speech API or SAPI?
using SpeechLib;
SpVoice speech = new SpVoice();
speech.Speak(text, SpeechVoiceSpeakFlags.SVSFlagsAsync);
returns the Acapela voices, and
SpeechSynthesizer ss = new SpeechSynthesizer();
ss.SpeakAsync ("Hello, world");
does not work with Acapela voices.
The first one returns all voices, but the second one returns only a few. Is this related to SAPI 5.1 versus SAPI 5.3?
The behavior is the same on Vista and XP: on both, SpVoice detects the Acapela voices, but SpeechSynthesizer does not.
I assume XP uses SAPI 5.1 and Vista uses SAPI 5.3, so why is the behavior the same on both operating systems but different between the two APIs?
Also, which API is more powerful, and what are the differences between the two?
SpeechLib is an Interop DLL that makes use of classic COM-based SAPI under the covers. System.Speech was developed by Microsoft to interact with Text-to-speech (and voice recognition) directly from within managed code.
In general, it's cleaner to stick with the managed library (System.Speech) when you're writing a managed application.
It's definitely not related to SAPI version--the most likely problem here is that a voice vendor (in this case Acapela) has to explicitly implement support for certain System.Speech features. It's possible that the Acapela voices that you have support everything that is required, but it's also possible that they don't. Your best bet would be to ask the Acapela Group directly.
Voices are registered in HKLM\SOFTWARE\Microsoft\Speech\Tokens, and you should see the Windows built-in voices, as well as the Acapela voices that you have added listed there. If you spot any obvious differences in how they're registered, you might be able to make the Acapela voices work by making their registration match that of, for example, MS-Anna.
But I'd say the most likely possibility is that the Acapela voices have not been updated to support all of the interfaces required by System.Speech.
SpeechLib is an interop DLL and so maps to whatever version of SpeechLib it was created for (you can check its properties).
System.Speech.* is the "official" support for speech in the .NET framework. SpeechSynthesizer chooses which speech library to use at runtime (much like the System.Web.Mail classes did).
I'm not sure why they return a different number of voices but it is likely to be related to the SAPI version being used.
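To see which voices each API actually exposes on a given machine, the two lists can be printed side by side. This is only a minimal sketch, assuming a console project that references both the SpeechLib interop assembly and System.Speech; the class name is made up for illustration, and the output depends entirely on which voices are installed.

```csharp
using System;
using System.Speech.Synthesis;
using SpeechLib;

class VoiceComparison // hypothetical name, for illustration only
{
    static void Main()
    {
        // Voices visible through the COM-based SAPI interop (SpeechLib).
        Console.WriteLine("SpeechLib (SpVoice):");
        SpVoice comVoice = new SpVoice();
        foreach (ISpeechObjectToken token in comVoice.GetVoices("", ""))
        {
            Console.WriteLine("  " + token.GetDescription(0));
        }

        // Voices visible through the managed System.Speech API.
        Console.WriteLine("System.Speech (SpeechSynthesizer):");
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                Console.WriteLine("  {0} (enabled: {1})",
                    voice.VoiceInfo.Name, voice.Enabled);
            }
        }
    }
}
```

A voice that appears only in the first list is one that SpVoice can find but SpeechSynthesizer does not report, which is the situation described in the question.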
I use speech synthesis in a simple program, and I was wondering whether languages other than English are supported.
I want the speech to be in the local language. Is that possible?
You can use SpeechSynthesizer.GetInstalledVoices to obtain a list of all available voices, together with their culture information. On my Windows 8.1 machine, a German and an English voice are installed. You can use this method to check whether a voice for your language is present.
Here is a list of the supported languages on the Microsoft Speech Platform SDK 11
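The check described above could look roughly like the following. It is a sketch rather than a definitive implementation: it assumes the "local language" is the current UI culture, and the class name and the spoken text are placeholders.

```csharp
using System;
using System.Globalization;
using System.Speech.Synthesis;

class LocalVoiceDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // Assumption: "local language" means the current UI culture.
        CultureInfo local = CultureInfo.CurrentUICulture;

        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            // List every installed voice with its culture.
            foreach (InstalledVoice v in synth.GetInstalledVoices())
            {
                Console.WriteLine("{0} [{1}]",
                    v.VoiceInfo.Name, v.VoiceInfo.Culture.Name);
            }

            // Use the first enabled voice that supports the local culture, if any.
            foreach (InstalledVoice v in synth.GetInstalledVoices(local))
            {
                if (!v.Enabled) continue;
                synth.SelectVoice(v.VoiceInfo.Name);
                synth.SetOutputToDefaultAudioDevice();
                synth.Speak("Placeholder text in the local language.");
                return;
            }

            Console.WriteLine("No enabled voice found for " + local.Name);
        }
    }
}
```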
I'm using Microsoft Speech Synthesis in C# and I want to know if there is a way to add echo and other sound effects so that the speech sounds as if it is happening in a live stadium, a room, etc. I also want to use voices other than Microsoft Anna on Win 7 64-bit, but all I found were ways to change voices using .cpl files, and I did not find any free voices. I did find http://www.cepstral.com/en/personal/download, which has free voice downloads, but these are for older SAPI versions. Will they cause problems with the current installation? Are there other sources of free voices that can be used from code, or ways to make other voices such as Sam (old Windows) or David (Win 8) usable?
I'm not sure about specific effects, but you can choose a voice with SelectVoice() or SelectVoiceByHints(gender, age, voiceAlternate, culture). You can also set the rate. So you can get quite a few effects using just the Windows speech synthesizer.
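As a rough illustration of the voice and rate settings mentioned above, here is a minimal sketch. The hint values, culture, rate, and volume are arbitrary choices for the example, not recommendations, and the voice that actually gets selected depends on what is installed.

```csharp
using System;
using System.Globalization;
using System.Speech.Synthesis;

class VoiceSelectionDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();

            // Ask for the closest match to these hints; the synthesizer
            // picks whatever installed voice fits best.
            synth.SelectVoiceByHints(
                VoiceGender.Female, VoiceAge.Adult, 0, new CultureInfo("en-US"));

            synth.Rate = -2;    // valid range is -10 (slowest) to 10 (fastest)
            synth.Volume = 80;  // valid range is 0 to 100

            Console.WriteLine("Using voice: " + synth.Voice.Name);
            synth.Speak("This is a test of the selected voice.");
        }
    }
}
```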
So, you've all probably seen Iron Man where Tony interacts with an AI system called Jarvis. Demo clip here (Sorry it's a commercial).
I'm very familiar with C#, C++ and Visual Basic, but I am unsure what options I have available for me to program something like this. Ideally, I'd like to have it assist me while working on some projects by automating a few things.
After doing a bit of research, I saw that a lot of people were using AppleScript. Well, I'm a Windows developer and I work on Windows, so that won't work.
Microsoft has a Speech SDK, but I hear that I can't program it to learn custom words, as in it just uses its standard library. Is this true? What are the other limitations of speech recognition with the SDK? Is there something else I could use instead?
Also, which language would be better to use for a project like this? C# or VB?
The .NET 3.0 System.Speech.Recognition namespace has very elegant .NET wrapper classes around the SAPI SDK, including the Grammar class for customizing the recognition. As usual, any .NET-enabled language can take advantage of it; the specific language doesn't matter.
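A custom vocabulary with the Grammar class might be set up along these lines. This is a sketch under assumptions: the command phrases and the class name are invented for the example, and it presumes a speech recognizer and a microphone are available on the machine.

```csharp
using System;
using System.Speech.Recognition;

class CommandRecognizer // hypothetical name, for illustration only
{
    static void Main()
    {
        using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine())
        {
            // Hypothetical command phrases; replace with your own vocabulary.
            Choices commands = new Choices("open project", "build solution", "run tests");

            GrammarBuilder builder = new GrammarBuilder(commands);
            // The grammar's culture has to match the recognizer's culture.
            builder.Culture = recognizer.RecognizerInfo.Culture;
            recognizer.LoadGrammar(new Grammar(builder));

            // Report each phrase the engine recognizes.
            recognizer.SpeechRecognized += (sender, e) =>
                Console.WriteLine("Heard: {0} (confidence {1:F2})",
                    e.Result.Text, e.Result.Confidence);

            recognizer.SetInputToDefaultAudioDevice();
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening. Press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```

Restricting the recognizer to a small set of phrases like this is what lets it handle words that are not part of ordinary dictation.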
I am making a Smart House Control System right now, and I have a little problem.
I was thinking of using Cosmos as a base system and adding the needed namespace libraries to it, but since the usual System.Speech.Recognition namespace depends too heavily on the Windows Speech API, I have to forget about using it.
So my question is: is there any (free, if possible) voice recognition and/or speech synthesizer library for C# that has the following?
support for multi-language speaking
extracting text content from speech sample
synthesizing speech with selectable (or user-written) speech pattern (voice)
A general-purpose library that does not depend on Windows would be best and, of course, ideally free.
Voxeo offers developer accounts which you could use to develop a speech powered home automation system. I've interfaced it to my own home automation system for a small subset of the commands my home understands and it works great. You'll need to learn some VoiceXML to use it.
SAPI works OK for voice synthesis; I use SAPI in my system for spoken prompts in the house, like a weather forecast that comes over the speakers in the morning when you walk into the bathroom. If Cosmos doesn't allow you to include all the DLLs you need, maybe you could create a separate service using SAPI and then use WCF (or something similar) to communicate between them?
For the related problem of understanding natural language in typed form, I've developed a C# NLP engine, which I hope to make available for non-commercial use at some point in the future.
Extracting text from speech without specifying any grammar up-front is a very hard problem and is going to be error prone. Even if you could solve that, you'd still have the problem of trying to understand what they said using NLP. Constructing a grammar that guides the recognizer to the kinds of sentences you want to recognize (like VoiceXML does) is likely to achieve much higher accuracy.
Check out this project: http://cmusphinx.sourceforge.net/
It's an open source speech recognition project. It can be trained for any language you want, and since it's open source you can modify it to suit your needs or extend it.
We have an application for which we were planning to use the Microsoft Speech API. We tested it on Windows XP using the Microsoft Sam voice and, frankly, it sounds terrible. It's almost impossible to understand what the voice is trying to say.
Are there other, better voices? Are there any updates or newer versions that are better? Are there other products, open source projects, etc. that could work as an alternative?
Just to clarify - It needs to have some sort of API so I actually can program against it.
On Windows about the best I have found was using the speech API and voices from AT&T Natural Voices: https://nextup.com/attnv.html
They are, however, VERY expensive, if available at all. I have run into projects where the usage/business model was so far from what AT&T had in mind that they wouldn't even sell a license.
There is a free software alternative, Festival: http://festvox.org/ . The quality, though, is horrible. It is about 10 years behind the current sound quality of commercial systems. It is, however, free.
A third alternative, which has worked well for me, was to shift the voice synthesis part of a few projects to OS X. OS X has a decent set of tools and speech APIs and a fairly decent set of stock voices. The downside, of course, is that programs written for these APIs run only under OS X, which runs only on Apple hardware.
The AT&T Natural Voices engine produces great speech, but it's not free.
There is also NeoSpeech, whose voices are good too, but they are not free either.
You don't describe your licensing needs, so I don't know if any of these will be suitable in that regard, but all of the following are sources of SAPI 5 compatible voices:
Ivona (http://www.ivona.com/) - I'm using their Kendra voice on a SAPI project.
AT&T Natural Voices (http://www2.research.att.com/~ttsweb/tts/)
Loquendo (http://www.loquendo.com/)
Acapela (http://www.acapela-group.com/products/products.asp)
Cepstral (http://www.cepstral.com/)
fonix (http://www.fonixspeech.com/tts.php) - only if you loved the original Speak & Spell.
Nuance RealSpeak (I'm not sure about this one...)
You can use the free and open source Festival. The default Festival voice sounds a little like Stephen Hawking, but you can use other, much better HTS voices. For example, try selecting the Peter HTS 2011 voice on this demo page: http://www.cstr.ed.ac.uk/projects/festival/morevoices.html. Most of the HTS voices for Festival that I've seen are not licensed for commercial use; however, this one seems to be free: http://homepages.inf.ed.ac.uk/jyamagis/software/page54/page54.html
You can check this youtube tutorial: http://www.youtube.com/watch?v=MmcLFJQpv2o