Get subtitles from youtube video using video.google.com - text format - c#

I want to get subtitles from a YouTube video. When I enter the URL "http://video.google.com/timedtext?lang=en&v=Dceyy0cX6J4&fmt=srv3" in the browser, the text is as expected, but when I fetch it from C# the text contains character entities such as &#39; (for example).
The C# code is pretty simple:
using (HttpClient client = new HttpClient())
{
    var response = await client.GetStringAsync("http://video.google.com/timedtext?lang=en&v=Dceyy0cX6J4&fmt=srv3");
}
Is there any way to add a format header? How could I fix it?

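What you are seeing is HTML-encoded content, not URL encoding: &#39; is the character entity for an apostrophe.
You will need to decode this.
Luckily you can use HttpUtility.HtmlDecode(response) from System.Web (or WebUtility.HtmlDecode from System.Net if you would rather not reference System.Web), and this will give you a perfectly readable response.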
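As a minimal sketch of the decoding step, the snippet below runs WebUtility.HtmlDecode over a hard-coded string. The class name SubtitleDecoding and the sample text are made up for illustration; in the real case the input would be the response string from the request in the question.

```csharp
using System;
using System.Net;

static class SubtitleDecoding
{
    // Turns HTML character entities (&#39;, &quot;, &amp;, ...) back into plain characters.
    public static string Decode(string encoded) => WebUtility.HtmlDecode(encoded);

    static void Main()
    {
        // Hypothetical fragment of a subtitle response, used instead of a live request.
        string sample = "it&#39;s a &quot;test&quot; &amp; more";
        Console.WriteLine(Decode(sample)); // it's a "test" & more
    }
}
```

If the subtitles come back as XML, loading them with an XML parser would also resolve the entities in text nodes, so an explicit decode is mainly needed when the response is handled as a raw string.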

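Check out the UrlEncode method documentation for the System.Web.HttpUtility helpers (the decoding needed here is HtmlDecode rather than UrlDecode):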
https://learn.microsoft.com/en-us/dotnet/api/system.web.httputility.urlencode?view=netframework-4.7.2

Related

convert rtsp stream to http stream

In C#, is it possible to read an RTSP video stream using System.Net.HttpWebRequest? If not, please tell me an alternative.
// the URL to download the file from
string basepath = @"rtsp://ip.worldonetv.com:1935/live/ ";
// the path to write the file to
// string sFilePathToWriteFileTo = "d:\\Download";
// first, we need to get the exact size (in bytes) of the file we are downloading
Uri url = new Uri(basepath);
System.Net.HttpWebRequest request = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(url);
System.Net.HttpWebResponse response = (System.Net.HttpWebResponse)request.GetResponse();
response.Close();
You can formulate RtspRequests with my library.
You can then Base64-encode the RtspRequest and put that as the body of the HttpRequest.
Add the Content-Length header, which would be equal to the length of the Base64-encoded RTSP request in the body.
Add the header rtsp/x-tunneled to the HttpRequest and then send it along.
You should get back an HttpResponse with the body containing a Base64-encoded RtspResponse.
Base64-decode the body of the HttpResponse and then use the RtspResponse class in my library to parse it.
The library is at http://net7mma.codeplex.com/
And there is a CodeProject article at http://www.codeproject.com/Articles/507218/Managed-Media-Aggregation-using-Rtsp-and-Rtp
If you need anything else let me know!
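The Base64 steps in this answer can be sketched without the library. This is only an illustration of the encoding and of the Content-Length calculation: the class RtspTunnelSketch, its method names, and the rtsp://example.invalid URL are hypothetical, the RTSP request is written by hand rather than produced by the library's RtspRequest class, and nothing is actually sent over HTTP.

```csharp
using System;
using System.Text;

static class RtspTunnelSketch
{
    // Base64-encode the raw RTSP request text so it can travel as an HTTP body.
    public static string EncodeRequest(string rtspRequest) =>
        Convert.ToBase64String(Encoding.ASCII.GetBytes(rtspRequest));

    // Reverse step for the Base64 body of the HTTP response.
    public static string DecodeResponse(string base64Body) =>
        Encoding.ASCII.GetString(Convert.FromBase64String(base64Body));

    static void Main()
    {
        // Hand-written RTSP request; the URL is a placeholder.
        string request = "OPTIONS rtsp://example.invalid/live RTSP/1.0\r\nCSeq: 1\r\n\r\n";
        string body = EncodeRequest(request);

        // Value for the Content-Length header: the byte length of the encoded body.
        int contentLength = Encoding.ASCII.GetByteCount(body);
        Console.WriteLine("Content-Length: " + contentLength);

        // Round trip, as would be done with the server's Base64 response body.
        Console.WriteLine(DecodeResponse(body) == request);
    }
}
```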
There's no standard C# library to do this. You can't even do it with the various .NET DirectShow wrappers. I just had a coworker spend a month on this problem and he ended up writing his own C# wrapper on GStreamer. If you're planning to display the video, the easiest option is to embed the VLC ActiveX control.

Read a web page with all images in Base64-Embedded format

In my scenario I want to download the HTML of a page (any page on the Internet) programmatically, but I also want all of the images in the HTML to be in Base64-embedded format (not referenced).
In other words, instead of :
<img src='/images/delete.gif' />
I want the downloaded html to look like this:
<img src="..." />
This way I don't need to go through the process of storing all images in directories, etc, etc.
Does any of you have any idea how this can be done? Or any plugin to do this efficiently?
Well, you'd need to:
1. Download the original HTML
2. Find each img element in the HTML (for instance using the HTML Agility Pack) and for each one:
   - If it's already using a data URL, ignore it
   - Otherwise:
     - Download the image
     - Encode it in Base64 using Convert.ToBase64String
     - Replace the original img tag with one using the Base64 version (either in the original string, or via a DOM representation)
3. Save the final HTML to disk
Is any of these steps causing you a particular problem? You could potentially make it quicker by downloading the images in parallel, but I'd get a serial version working first.
Instead of using an HTML page with images as Base64-encoded strings in the src attribute, you might consider using the MHTML format. Most browsers support the format, and it embeds all external resources (including images).
var msg = new CDO.MessageClass();
msg.MimeFormatted = true;
msg.CreateMHTMLBody("http://www.google.com", CDO.CdoMHTMLFlags.cdoSuppressNone, "", "");
var stream = msg.GetStream();
var mhtml = stream.ReadText(stream.Size);
Use a regular expression (regex) to extract URLs from img tags, translate them to absolute URLs using the Uri class, then use WebClient to download the target images. After that it's just a case of using Convert.ToBase64String to produce the Base64.
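A minimal sketch of the regex approach, under a few assumptions: the class ImageInliner and its EmbedImages method are made-up names, the MIME type is guessed from the file extension, and the download step is passed in as a delegate so the logic can be tried without network access. A regular expression is a simplification here and will miss unusual markup that an HTML parser would handle.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class ImageInliner
{
    // Matches src="..." or src='...' inside an <img> tag (not a full HTML parser).
    static readonly Regex ImgSrc = new Regex(
        @"(<img\b[^>]*?\bsrc\s*=\s*)([""'])(.*?)\2",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string EmbedImages(string html, Uri baseUri, Func<Uri, byte[]> download)
    {
        return ImgSrc.Replace(html, m =>
        {
            string src = m.Groups[3].Value;
            if (src.StartsWith("data:", StringComparison.OrdinalIgnoreCase))
                return m.Value; // already a data URL, leave it alone

            Uri absolute = new Uri(baseUri, src); // resolve relative URLs
            byte[] bytes = download(absolute);

            // Assumption: the extension tells us the image type.
            string ext = Path.GetExtension(absolute.AbsolutePath).TrimStart('.').ToLowerInvariant();
            string mime = ext == "jpg" ? "image/jpeg" : "image/" + ext;

            string quote = m.Groups[2].Value;
            string dataUrl = "data:" + mime + ";base64," + Convert.ToBase64String(bytes);
            return m.Groups[1].Value + quote + dataUrl + quote;
        });
    }

    static void Main()
    {
        string html = "<p><img src='/images/delete.gif' /></p>";
        // Fake downloader returning three bytes; a real one could call WebClient.DownloadData.
        string result = EmbedImages(html, new Uri("http://example.com/page.html"),
                                    uri => new byte[] { 1, 2, 3 });
        Console.WriteLine(result);
    }
}
```

For real pages the delegate could be something like uri => new WebClient().DownloadData(uri), and the downloads could later be run in parallel once the serial version works.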

Google Translate Api and Special Characters

I've recently started using the Google Translate API inside a C# project. I am trying to translate some text from English to French. I am having issues with some special characters, though.
For example, the translation of Company comes through with garbled characters in place of the accented letters, instead of Société as it should. Is there some way in code to convert these to the correct special characters (i.e. to get é back)?
Thanks
If you need any more info, let me know.
I ran into this same exact issue. If you're using the WebClient class to download the json response from google, try setting the Encoding property to UTF8.
using (var webClient = new WebClient { Encoding = Encoding.UTF8 })
{
    string json = webClient.DownloadString(someUri);
    ...
}
I have reproduced your problem, and it looks like you are using the UTF7 encoding. UTF8 is the way you need to go.
I use Google's API by creating a WebRequest to get an HTTP response from the server, then I read the response stream with a StreamReader. StreamReader defaults to UTF8, but to reproduce your problem, I passed Encoding.UTF7 into the StreamReader's constructor.
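The effect of reading UTF-8 bytes with the wrong encoding can be reproduced locally. The sketch below is not the answerer's code: the class EncodingDemo and its Read helper are hypothetical, and it uses Latin-1 (ISO-8859-1) as the wrong encoding rather than UTF-7, since Encoding.UTF7 is marked obsolete in recent .NET versions. The accented letters are written as \u escapes so the example does not depend on the source file's own encoding.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingDemo
{
    // Reads all text from a byte array through a StreamReader with an explicit encoding.
    public static string Read(byte[] bytes, Encoding encoding)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), encoding))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        string expected = "Soci\u00e9t\u00e9";            // "Société"
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(expected); // stand-in for the bytes sent by the server

        string right = Read(utf8Bytes, Encoding.UTF8);
        string wrong = Read(utf8Bytes, Encoding.GetEncoding("ISO-8859-1"));

        Console.WriteLine(right == expected); // True
        Console.WriteLine(wrong == expected); // False: each é became two characters
        Console.WriteLine(wrong.Length);      // 9 instead of 7
    }
}
```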

Are there any multipart/form-data parser in C# - (NO ASP)

I am trying to write a multipart parser, but things are getting complicated, so I want to ask if anyone knows of a ready-made parser in C#.
Just to be clear, I am writing my own "tiny" HTTP server and need to parse multipart form-data too!
Thanks in advance,
Gohlool
I open-sourced a C# Http form parser here.
This is slightly more flexible than the other one mentioned which is on CodePlex, since you can use it for both Multipart and non-Multipart form-data, and also it gives you other form parameters formatted in a Dictionary object.
This can be used as follows:
non-multipart
public void Login(Stream stream)
{
    string username = null;
    string password = null;
    HttpContentParser parser = new HttpContentParser(stream);
    if (parser.Success)
    {
        username = HttpUtility.UrlDecode(parser.Parameters["username"]);
        password = HttpUtility.UrlDecode(parser.Parameters["password"]);
    }
}
multipart
public void Upload(Stream stream)
{
    HttpMultipartParser parser = new HttpMultipartParser(stream, "image");
    if (parser.Success)
    {
        string user = HttpUtility.UrlDecode(parser.Parameters["user"]);
        string title = HttpUtility.UrlDecode(parser.Parameters["title"]);
        // Save the file somewhere
        File.WriteAllBytes(FILE_PATH + title + FILE_EXT, parser.FileContents);
    }
}
I've had some issues with parsers that are based on string parsing, particularly with large files: I found they would run out of memory and fail to parse binary data.
To cope with these issues I've open-sourced my own attempt at a C# multipart/form-data parser here.
See my answer here for more information.
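For orientation, here is a rough sketch of what a string-based parser does with a multipart body. It is not any of the parsers linked in this thread: MultipartTextSketch and ParseTextFields are hypothetical names, the whole body is held in memory as a string, and only simple text fields are handled, which is exactly why this style breaks down for large or binary uploads as described above. The boundary is assumed to have already been taken from the Content-Type header.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MultipartTextSketch
{
    // Splits a multipart/form-data body on its boundary and collects name -> value pairs.
    public static Dictionary<string, string> ParseTextFields(string body, string boundary)
    {
        var fields = new Dictionary<string, string>();
        string delimiter = "--" + boundary;

        foreach (string part in body.Split(new[] { delimiter }, StringSplitOptions.RemoveEmptyEntries))
        {
            // The closing delimiter is "--boundary--", so its remainder starts with "--".
            if (part.StartsWith("--", StringComparison.Ordinal)) break;

            // Headers and content are separated by an empty line.
            int headerEnd = part.IndexOf("\r\n\r\n", StringComparison.Ordinal);
            if (headerEnd < 0) continue;

            string headers = part.Substring(0, headerEnd);
            Match name = Regex.Match(headers, "\\bname=\"([^\"]*)\"");
            if (!name.Success) continue;

            string value = part.Substring(headerEnd + 4);
            // Drop the line break that precedes the next delimiter.
            if (value.EndsWith("\r\n", StringComparison.Ordinal))
                value = value.Substring(0, value.Length - 2);

            fields[name.Groups[1].Value] = value;
        }
        return fields;
    }

    static void Main()
    {
        string body =
            "--XYZ\r\nContent-Disposition: form-data; name=\"user\"\r\n\r\nalice\r\n" +
            "--XYZ\r\nContent-Disposition: form-data; name=\"title\"\r\n\r\nHello\r\n" +
            "--XYZ--\r\n";
        foreach (var pair in ParseTextFields(body, "XYZ"))
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```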
Check out the new MultipartStreamProvider and its subclasses (e.g. MultipartFormDataStreamProvider). You can create your own implementation too if none of the built-in implementations are suitable for your use case.
With ASP.NET Core you now have access to an IFormCollection via HttpContext.Request.Form.
Example saving an image:
Microsoft.AspNetCore.Http.IFormCollection form;
form = ControllerContext.HttpContext.Request.Form;
using (var fileStream = System.IO.File.Create(strFile))
using (var uploadStream = form.Files[0].OpenReadStream())
{
    // Open the upload stream once and copy from that same instance
    uploadStream.CopyTo(fileStream);
}
I had a similar problem that I recently solved thanks to the multipart parser by Anthony over at http://antscode.blogspot.com/.
Uploading file from Flex to WCF REST Stream issues (how to decode multipart form post in REST WS)

How do I get a C# WebBrowser control to show jpeg files (raw)?

Does anyone know how, in .NET 2.0 to .NET 3.5, to load a JPEG into a System.Windows.Forms.WebBrowser control from a byte array, with the right MIME type set so that it displays?
Something like:
webBrowser1.DocumentStream = new MemoryStream(File.ReadAllBytes("mypic.jpg"));
webBrowser1.DocumentType = "application/jpeg";
The webBrowser1.DocumentType seems to be read only, so I do not know how to do this. In general I want to be able to load any kind of filesource with a mimetype defined into the browser to show it.
Solutions with writing temp files are not good ones. Currently I have solved it with having a little local webserver socket listener that delivers the jpeg I ask for with the right mimetype.
UPDATE: Since someone deleted an answer to my own question that contained info others could use, I will add it as an update instead. (To those who delete that way: please update the question with the important info.)
Sample solution in C# here that works perfectly: http://www.codeproject.com/KB/aspnet/AspxProtocol.aspx
You have to implement an async pluggable protocol, e.g. IClassFactory, IInternetProtocol... Then you use CoInternetGetSession to register your protocol. When IE calls your implementation, you can serve your image data from memory/provide mime type.
It's a bit tedious, but doable. Look at IInternetProtocol and pluggable protocols documentation on MSDN.
You cannot do it. You cannot stuff images into Microsoft's web-browser control.
The limitation comes from the IWebBrowser control itself, which .NET wraps up.
If you want a total hack, try having your stream be the HTML file that only shows your picture. You lose your image byte stream and will have to write the image to disk.
I do not know whether the WebBrowser .NET control supports this, but RFC2397 defines how to use inline images. Using this and an XHTML snippet created on the fly, you could possibly assign the image without the need to write it to a file.
// Requires: using System.Drawing; using System.IO; using System.Text;
Image someImage = Image.FromFile("mypic.jpg");
// Firstly, get the image as a base64 encoded string
ImageConverter imageConverter = new ImageConverter();
byte[] buffer = (byte[])imageConverter.ConvertTo(someImage, typeof(byte[]));
// No line breaks: the string goes straight into a data URL
string base64 = Convert.ToBase64String(buffer);
// Then, dynamically create some XHTML for this (as this is just a sample, minimalistic XHTML :D)
// The MIME type is hard-coded to match mypic.jpg
string html = "<img src=\"data:image/jpeg;base64," + base64 + "\" />";
// And put it into some stream; DocumentType is read-only, so it is not assigned here.
// The stream is left undisposed on the assumption that the control may read it after the assignment returns
webBrowser.DocumentStream = new MemoryStream(Encoding.UTF8.GetBytes(html));
No idea whether this solution is elegant, but I guess it is not. My excuse for not being sure is that it is late at night. :)
References:
RFC2397
Image to base64 encoded string
IE only supports up to 32 KB for inline images in Base64 encoding, so this is not a good solution.
Try the res: protocol.
I haven't tried it with a .net dll but this post says it should work. Even if it does require a C++ dll it's much simpler to use as far as coding goes.
I've created a post here that shows you how to create the resource script and use the res: protocol correctly.
