I'm trying to use Azure Cognitive Services Speech to Text and I'm hitting a roadblock in .NET Core.
I have native support for WAV files using AudioConfig.FromWavFileInput(), which is great.
However, I also need to support MP3s.
I have found the compressed audio support documentation:
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-codec-compressed-audio-input-streams?tabs=debian&pivots=programming-language-csharp
However, this references push audio streams.
This is where I'm getting lost.
I have found this example for streaming codec-compressed audio:
https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/cpp/linux/compressed-audio-input/compressed-audio-input.cpp
However, this is C++, not C# on .NET Core, and converting it is not really my strong suit.
So I'm at a bit of a loss.
Any assistance would be greatly appreciated (y)
This sample: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/speech_recognition_samples.cs has compressed audio specific methods here and here. The latter pull stream sample seems pretty straightforward, just plug in your key, region, and filepath.
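For the push stream route the question asks about, a minimal C# sketch might look like the following. It assumes the Microsoft.CognitiveServices.Speech NuGet package is referenced and that GStreamer is installed, as the compressed audio documentation linked in the question requires. The class name, method name, and the key, region, and path parameters are placeholders, not part of the linked sample.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

public static class Mp3RecognitionSketch
{
    // Hypothetical helper: recognizes the first utterance in an MP3 file.
    public static async Task<string> RecognizeMp3Async(string key, string region, string mp3Path)
    {
        var speechConfig = SpeechConfig.FromSubscription(key, region);

        // Tell the SDK that the bytes pushed into the stream are MP3, not raw PCM.
        var format = AudioStreamFormat.GetCompressedFormat(AudioStreamContainerFormat.MP3);

        using var pushStream = AudioInputStream.CreatePushStream(format);
        using var audioConfig = AudioConfig.FromStreamInput(pushStream);
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Push the file contents into the stream in small chunks.
        using (var file = File.OpenRead(mp3Path))
        {
            var buffer = new byte[4096];
            int read;
            while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                pushStream.Write(buffer, read);
            }
        }
        pushStream.Close(); // signals that no more audio will arrive

        // RecognizeOnceAsync stops after the first utterance; longer files
        // would need continuous recognition instead.
        var result = await recognizer.RecognizeOnceAsync();
        return result.Reason == ResultReason.RecognizedSpeech ? result.Text : string.Empty;
    }
}
```

The push stream is handy when the MP3 bytes come from somewhere other than a file (an upload, a blob); for a plain file the pull stream sample above does the same job.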
If you have files, especially if you have multiple of them, you can benefit from using batch transcription. It natively supports files in WAV, MP3 and OGG format.
The documentation links to the API reference, which also covers model customization. There you can select the region you are interested in and export a Swagger file, which you can use to generate a client in the programming language of your choice.
For your scenario you will only need 4 APIs, and you could use the standard HttpClient to execute the requests. You would want to:
1. Create a batch transcription.
2. Get your transcription to check its state. If it is complete, you get the URL you will need next. If it has failed, you get a message about the problem.
3. Get the results after the batch transcription has succeeded. The object with the kind TranscriptionReport contains a list of the files that got transcribed, whether each transcription was successful and, if not, why. The other objects contain the results of the successful transcriptions. (Here you need to iterate over the contentUrls to download the files.)
4. Delete the transcription(s) after you have the results.
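The four calls described above could be sketched with HttpClient roughly as follows. This is only a sketch under assumptions: the v3.0 endpoint path, the JSON property names (self, status, values, kind, links.contentUrl), and the status strings are what I remember of the REST API and should be checked against the Swagger file for your region and API version. The class and method names are made up for illustration.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class BatchTranscriptionSketch
{
    // Builds the "create transcription" request (endpoint path and version are assumptions).
    public static HttpRequestMessage BuildCreateRequest(string region, string key, string[] contentUrls, string locale)
    {
        var body = JsonSerializer.Serialize(new { contentUrls, locale, displayName = "sample batch" });
        var request = new HttpRequestMessage(HttpMethod.Post,
            $"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions");
        request.Headers.Add("Ocp-Apim-Subscription-Key", key);
        request.Content = new StringContent(body, Encoding.UTF8, "application/json");
        return request;
    }

    private static async Task<string> GetJsonAsync(HttpClient http, string key, string url)
    {
        using var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Add("Ocp-Apim-Subscription-Key", key);
        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }

    public static async Task RunAsync(HttpClient http, string region, string key, string[] contentUrls)
    {
        // 1. Create the batch transcription and remember its URL.
        using var createRequest = BuildCreateRequest(region, key, contentUrls, "en-US");
        using var createResponse = await http.SendAsync(createRequest);
        createResponse.EnsureSuccessStatusCode();
        using var created = JsonDocument.Parse(await createResponse.Content.ReadAsStringAsync());
        string selfUrl = created.RootElement.GetProperty("self").GetString();

        // 2. Poll the transcription until it leaves the NotStarted/Running states.
        string status;
        string lastJson;
        do
        {
            await Task.Delay(TimeSpan.FromSeconds(30));
            lastJson = await GetJsonAsync(http, key, selfUrl);
            using var current = JsonDocument.Parse(lastJson);
            status = current.RootElement.GetProperty("status").GetString();
        } while (status == "NotStarted" || status == "Running");

        if (status == "Succeeded")
        {
            // 3. List the result files and download each one via its contentUrl.
            using var files = JsonDocument.Parse(await GetJsonAsync(http, key, selfUrl + "/files"));
            foreach (var file in files.RootElement.GetProperty("values").EnumerateArray())
            {
                string kind = file.GetProperty("kind").GetString();
                string url = file.GetProperty("links").GetProperty("contentUrl").GetString();
                Console.WriteLine($"{kind}: {await http.GetStringAsync(url)}");
            }
        }
        else
        {
            Console.WriteLine("Transcription did not succeed: " + lastJson);
        }

        // 4. Delete the transcription once the results have been read.
        using var deleteRequest = new HttpRequestMessage(HttpMethod.Delete, selfUrl);
        deleteRequest.Headers.Add("Ocp-Apim-Subscription-Key", key);
        using var deleteResponse = await http.SendAsync(deleteRequest);
        deleteResponse.EnsureSuccessStatusCode();
    }
}
```

Note that the contentUrls passed to the create call have to be URLs the service can reach (for example blob storage links), not local file paths.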
I'm trying to code a very basic C# console app to read from a bucket in AWS. The file I'm trying to read is in the Avro format.
At this point I have a console program with the NuGet packages for AWSSDK.S3, AWSSDK.Core, and the Avro package from Apache.
I know how to get a list of files in the bucket, so I can connect to AWS. I guess what I need to do now is figure out how to deserialize the data.
The final goal is to load the data into a SQL Server table. The files I'm working with are not very large.
We are working with another company on this project and they are sending us this data in the Avro format.
I'm completely new to AWS programming and had never heard of Avro until about a week ago. Finding information on the internet has been kind of hard.
Any help would be great.
Thanks.
First, you need to read the object data from the Amazon S3 bucket using the AWS SDK for .NET. You can use the client.GetObjectAsync method to read the data. See this example on GitHub:
https://github.com/awsdocs/aws-doc-sdk-examples/blob/main/dotnetv3/S3/GetObjectExample/GetObjectExample/GetObject.cs
For your use case, you would need to use a lib like https://www.nuget.org/packages/Apache.Avro/ to handle the Avro requirements. As this is a very specific use case, I doubt you will find this specific example on the Internet.
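Putting the two pieces together, a rough sketch could look like this. It assumes the AWSSDK.S3 and Apache.Avro NuGet packages, default AWS credentials and region configured on the machine, and files small enough to buffer in memory (the question says they are). The class and method names are made up for illustration. As far as I know the Avro file reader wants a seekable stream, which is why the S3 response is copied into a MemoryStream first.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Amazon.S3;
using Avro.File;
using Avro.Generic;

public static class S3AvroSketch
{
    // Hypothetical helper: downloads one Avro file from S3 and prints every field of every record.
    public static async Task PrintRecordsAsync(string bucketName, string objectKey)
    {
        // Uses the default credential/region resolution of the AWS SDK.
        using var s3 = new AmazonS3Client();
        using var response = await s3.GetObjectAsync(bucketName, objectKey);

        // Buffer the object so the Avro reader gets a seekable stream.
        using var buffer = new MemoryStream();
        await response.ResponseStream.CopyToAsync(buffer);
        buffer.Position = 0;

        // GenericRecord lets you read the data without generating classes from the schema.
        using var reader = DataFileReader<GenericRecord>.OpenReader(buffer);
        foreach (GenericRecord record in reader.NextEntries)
        {
            foreach (var field in record.Schema.Fields)
            {
                Console.WriteLine($"{field.Name} = {record[field.Name]}");
            }
        }
    }
}
```

From there, each record's values can be mapped to columns and written to SQL Server, for example with SqlBulkCopy.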
Recently I started working on a project of my own that captures camera output using DirectShow .NET. There are a few problems that I don't know how to solve:
1) How can I encode the captured stream into H.264 format? I understand I should somehow add a filter to the filter graph, but I wasn't able to find where and how. I also could not find out whether there is a standard H.264 filter or whether I should download one from somewhere. If I need to download it, can it just be a DLL that I add a reference to, or does it need an installer?
2) Is there a way to save the captured output into a memory object, some kind of stream, or can it only be written to a file?
Best Regards,
Iordan
You can use commercial software from VisioForge or Viscomsoft.
AForge has a potential problem: no audio during capture, only video. Its output formats are also very limited. But it's free and open source, and if you have any DirectShow experience you can add audio support.
You will need to use something like FFMpeg or Handbrake. Check out http://vidcoder.codeplex.com/.
AForge also has some video editing abilities and you can also pass filters to it. There are also several FFMpeg C# wrappers you could use such as https://github.com/crazyender/FFMPEG.net
You should use AForge.NET. All of the hard work is done for you already. Use VideoFileWriter: http://www.aforgenet.com/framework/docs/html/4ee1742c-44d3-b250-d6aa-90cd2d606611.htm. It appears that the AForge framework uses FFmpeg under the hood as well (see the AForge.Video.FFMPEG namespace). You just create a writer and pass it the bitmap/frame: writer.WriteVideoFrame(bitmap);
You should fully investigate the video abstractions in AForge. You could save yourself considerable amounts of time.
Example: http://www.aforgenet.com/framework/samples/video.html
1) Yes, you should download an encoder filter library. Most decoders are available for free, but encoders are not. If you don't want to pay, you have to find an open source encoder.
To use the filter in GraphEdit, you need to register the DLL yourself, unless an installer does this for you. You also need to decide on a container type such as MP4 or MKV. In other words, you need a mux filter to save the stream to a file. I think someone else will link to available downloads. Sorry, I don't have the URLs right now.
2) What do you mean by capture? Video or still images?
For images, there are many sample projects and SampleGrabbers. You can save the result to either a file or memory.
For video, I'm sure your PC does not have enough memory to hold it without encoding. Create a memory stream and pass it to the capture manager.
You can also write your own filter to customize it to your needs. All of the information is included in the Windows SDK samples.
I am trying to make a tool for backup/restore of documents from a Google account.
Backup is easy and I have no problems with it. But I have two unsolved questions about restore:
1) Is it possible to upload a new version of an existing document? When I upload a document, it appears as a separate copy.
I found this was already discussed in "Upload and replace file in given folder on Google Docs using .net api", but it seems the suggestion there was just to remove the old version before uploading the new one, so the Id of the document will change. Is this correct?
2) Google Docs has a size limit for documents that can be converted into its internal format: http://docs.google.com/support/bin/answer.py?hl=en&answer=37603. So it is possible to create a large document, save it to a local computer, and then have Google Docs refuse to convert it because its size is over the limit. In that case it is possible to upload the document without conversion, but it becomes un-editable via the web site. Is there a workaround for this situation?
"Unable to upload large files to Google Docs" advises breaking the document into small pieces before uploading and linking them together afterwards. But maybe there are other ideas?
1. Is it possible to upload new version of existing document? When I upload document, it appears as separate copy.
Yes, this is possible. We call it "upload & replace" as you've noticed. No need to remove the existing version first. The following link describes how to do this in the protocol:
http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#UpdatingMetadataAndContent
From the .NET client library, what you need to do is attach an input stream to the Update() request. The method header for what you need is here:
http://code.google.com/p/google-gdata/source/browse/trunk/clients/cs/src/core/service.cs#554
Create a stream containing your new file content, and just pass that in. That should be it!
2. Google Docs have limit for size... Is there some workaround for this situation?
Unfortunately there is not a way currently to circumvent the size limitations of converted documents. They must be uploaded as unconverted files, and thus, are not editable in the Google Docs user interface.
I currently have an app written in C# that can take a file and encrypt
it using gpg.exe
What I'm trying to do is, instead of
1. Creating a file (from database queries usually)
2. encrypting the file
3. deleting the non-encrypted file
I want to
Gather info into memory (into a dictionary or a list or whatever)
stream the text/data into gpg.exe to end up with the encrypted file
outputted
I've looked into pipestream, redirecting standard input to the gpg
process, etc, but I haven't figured out a way to trick gpg.exe into
accepting streamed text/data instead of a file on the hard drive.
Initially figured if I could do it for gpg, I could also do it for Zip
as well, but I'm wondering if it's even possible.
Found some refs to popen which seems to be php related, but nothing
for c#.
Essentially, I'm looking to do the below programmatically with test.txt
being stuff in memory streamed to the app instead of an actual file on
the hard drive.
C:\Program Files\GNU\GnuPG>type C:\test.txt | zip > plubber.zip
C:\Program Files\GNU\GnuPG>type C:\test.txt | gpg -er "mycomp_operations " > Test.pgp
Thanks for any help you may be able to give :)
Tony!
You can use DotNetZip to create a zip file in-memory, but I don't know how that would interface with the gpg stuff. DotNetZip can do AES encryption, but that is obviously a different model from PGP or GPG.
Just a quick googly search turned up
this hint on GPG.
Looks like they run the gpg.exe in a separate process, sitting there waiting for input.
Please review the BouncyCastle C# implementation at:
http://www.bouncycastle.org/csharp/
This will allow GPG inprocess encryption and decryption without external files. I am currently using it to do the same thing for a BizTalk pipeline component.
Benton Stark has written a good wrapper for GnuPG which demonstrates (among other things) how to take data from a Stream, pipe it into the GPG executable and write the output back to a stream - all in C#.
Benton has answered another question with a link to his website. Benton writes:
You can try using my open source and free GnuPG wrapper for C# (and
VB.NET). All the code is licensed via MIT, non-GPL restrictions. You
can find the release with source code on Sourceforge.net.
http://sourceforge.net/projects/starksoftopenpg/
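For reference, the general technique of piping in-memory data through an external program can be sketched with System.Diagnostics.Process alone. This is not the API of the wrapper mentioned above, just a plain illustration of redirecting standard input and output; the class and method names are made up.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

public static class ProcessPipe
{
    // Starts a program, writes `input` to its stdin, and returns everything it writes to stdout.
    public static async Task<byte[]> RunAsync(string fileName, string arguments, byte[] input)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            UseShellExecute = false, // required for stream redirection
            CreateNoWindow = true
        };

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Could not start " + fileName);

        using var output = new MemoryStream();
        // Start draining stdout before writing, so neither pipe fills up and blocks.
        Task readTask = process.StandardOutput.BaseStream.CopyToAsync(output);

        await process.StandardInput.BaseStream.WriteAsync(input, 0, input.Length);
        process.StandardInput.Close(); // end-of-input, like the end of `type file |`

        await readTask;
        process.WaitForExit();
        if (process.ExitCode != 0)
        {
            throw new InvalidOperationException(fileName + " exited with code " + process.ExitCode);
        }
        return output.ToArray();
    }
}
```

With the command line from the question, a call might look like `await ProcessPipe.RunAsync("gpg.exe", "-er \"mycomp_operations\"", Encoding.UTF8.GetBytes(text))`, after which the returned bytes can be written to Test.pgp or handed on in memory. Whether gpg needs extra switches (for example to suppress prompts) depends on your setup.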
Well, named-pipes does most of what you are discussing, but to be honest it isn't worth it... in most cases, a temp file is a reasonable approach.
Using our SecureBlackbox components you can avoid calling external program for ZIP or PGP operations. The components operate with streams and don't need temporary files.
I'm working on an ASP.NET app that allows users to upload video files. After the user uploads, I need to determine some of the attributes of the media, namely its duration/length, resolution, and codec (if possible).
What's the simplest way to approach this? Should I use the WMP SDK - this seems to involve actually instantiating the media player on the server. Is there anything in the framework to do this, or do I need to rely on an external library?
I'm not concerned about displaying or streaming the video back to the user.
There is nothing in the framework, you will need some sort of library. The best I've seen (but it has been a year or so since I've looked) is taglib-sharp:
http://developer.novell.com/wiki/index.php/TagLib_Sharp
The site seems to be down right now, but I see that it's been ported to fink (for OSX) only a couple of months ago, so I assume that is temporary.
oops, just saw that you're not the first to ask a question along these lines and I'm not the first to suggest taglib-sharp:
View/edit ID3 data for MP3 files
(note: it supports audio and video files).
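A small sketch of reading those attributes with taglib-sharp, assuming the library is referenced (I believe the current NuGet package is called TagLibSharp) and the uploaded file has already been saved to disk. The class and method names are made up, and which container formats report video properties depends on the library version.

```csharp
using System;

public static class MediaInfoSketch
{
    // Hypothetical helper: prints duration, resolution, and codec descriptions for a media file.
    public static void Print(string path)
    {
        using (var file = TagLib.File.Create(path))
        {
            var props = file.Properties;
            Console.WriteLine("Duration:   " + props.Duration);
            Console.WriteLine("Resolution: " + props.VideoWidth + "x" + props.VideoHeight);

            // One entry per audio/video stream that the library could identify.
            foreach (var codec in props.Codecs)
            {
                if (codec != null)
                {
                    Console.WriteLine("Codec:      " + codec.Description);
                }
            }
        }
    }
}
```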
hth