Specified value has invalid HTTP characters with WebClient download


I researched this problem beforehand and looked at this question, but it didn't help. I am building a program to help people use a website, and I need to get the reCAPTCHA v2 challenge images from Google's reCAPTCHA API. I keep getting the error
Specified value has invalid HTTP characters
when attempting to download the stream:
try
{
    WebClientEx wc = new WebClientEx(cookieJar);
    wc.Headers.Add("Referer", recaptchaframe_url);
    wc.Headers.Add("UserAgent:", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)");
    Stream responsestream = wc.OpenRead("http://www.google.com" + challengeimageurl);
}
catch (Exception ex)
{
    MessageBox.Show("This program was unable to download CAPTCHA image" + ex.Message);
}

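I was able to get past this error by replacing every manually typed header name on the WebClient with its .NET enum member (e.g. "UserAgent:" became HttpRequestHeader.UserAgent). The header name "UserAgent:" contains a colon, which is not a valid character in an HTTP header name, and the actual header is spelled User-Agent. Using the HttpRequestHeader members avoids typing header names by hand.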
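A minimal sketch of that fix, assuming a plain WebClient in place of the asker's WebClientEx subclass and placeholder values for the referer URL and user-agent string (all stand-ins, not from the original post). It sets the headers through the HttpRequestHeader enum and checks that a raw name containing a colon is rejected:

```csharp
using System;
using System.Net;

static class HeaderDemo
{
    // Returns true when WebHeaderCollection accepts the raw header name,
    // false when it rejects the name with an ArgumentException.
    public static bool IsAcceptedHeaderName(string name)
    {
        try
        {
            new WebHeaderCollection().Add(name, "test");
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }

    static void Main()
    {
        using (var wc = new WebClient())
        {
            // Enum members map to the correct header names, so nothing is typed by hand.
            wc.Headers.Add(HttpRequestHeader.Referer, "http://example.com/frame"); // placeholder URL
            wc.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (compatible; ExampleClient)"); // placeholder value
            Console.WriteLine(wc.Headers[HttpRequestHeader.UserAgent]);
        }

        // The name from the question ends in a colon, which is not allowed in a header name.
        Console.WriteLine(IsAcceptedHeaderName("UserAgent:"));
        Console.WriteLine(IsAcceptedHeaderName("User-Agent"));
    }
}
```

The helper IsAcceptedHeaderName is hypothetical and exists only to make the failure visible without a network call.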

Related

Downloading a pdf in .NET 3.5 vs .NET 4.6 using C#

I have a third-party server URL that opens a PDF file in the browser. The request must be authenticated before the PDF can be accessed, so the authentication information is embedded in the URL. The URL takes this form:
SERVER_IP:/runReport.jsp?&jrs.cmd=jrs.get_subnodes&jrs.authorization=YWRtaW46YWRtaW4=&jrs.report_sheet$Report=true&jrs.catalog=/cata/catafolder/cataname.cat&jrs.report=/cata/catafolder/container.cls&jrs.result_type=2&jrs.profile=myProfile&jrs.param$InputAD;1_#assignmentId=AD0000695585
jrs.authorization=YWRtaW46YWRtaW4=
is the authorization information attached to the URL. I developed two pieces of code.
The one in .NET 4.6 is:
HttpClient client = new HttpClient();
string param= "someparamvalue";
string url = UrlBuilder(param);
HttpResponseMessage message = client.GetAsync(url).Result;
HttpContent content = message.Content;
System.IO.Stream stream = content.ReadAsStreamAsync().Result;
FileStream fs = File.Create($"exported\\{param}.pdf");
stream.CopyTo(fs);
fs.Close();
stream.Close();
It successfully downloads the PDF to the "exported" folder.
We are restricted to .NET 3.5 (.NET 4+ cannot be installed on the machine right away), so I tried the following code in 3.5:
string param= "someparamvalue";
string url = UrlBuilder(param);
string fileName = $"exported\\{param}.pdf";
WebClient wc = new WebClient();
wc.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36");
wc.Headers.Add(HttpRequestHeader.Accept, "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8");
wc.DownloadFile(url, fileName);
but all it downloads is a PDF that 1) cannot be opened with a PDF reader (a message appears saying "File is either not supported or corrupted...") and 2) when opened in Notepad++, turns out to contain the HTML of the login web page. (So even though the authorization information is in the URL, the user was not authenticated.)
Am I missing something about how these two pieces of code handle the same request differently? What am I doing wrong in the 3.5 code?

Spellcheck in web browser control not working under IE11 emulation

I am trying to have spellcheck work in a Winforms web browser control.
This is my current C# code:
try
{
    String appname = Process.GetCurrentProcess().ProcessName + ".exe";
    RegistryKey key = Registry.LocalMachine.OpenSubKey("Software\\Microsoft\\Internet Explorer\\Main\\FeatureControl\\FEATURE_BROWSER_EMULATION", RegistryKeyPermissionCheck.ReadWriteSubTree);
    object ieVal = key.GetValue(appname, null);
    MessageBox.Show(ieVal.ToString());
    if (ieVal == null || (int)ieVal != 11001)
    {
        key.SetValue(appname, 11001, RegistryValueKind.DWord);
    }
    key.Close();
}
catch
{
    MessageBox.Show("Registry stuff didn't work");
}
MessageBox.Show(webBrowser1.Version.ToString());
webBrowser1.DocumentText = "<html><head><body><div spellcheck=\"true\" style=\"width:100%; height:100%;\" contenteditable=\"true\"></div>"
    + "<script>alert(navigator.userAgent);</script>"
    + "</body></html>";
So first I set the proper registry key so that the browser emulates IE11.
Then I add a div tag with the spellcheck attribute set to true.
The version that the MessageBox.Show(webBrowser1.Version.ToString()) shows is:
11.0.9600.18525
The navigator.userAgent that the Javascript displays is:
Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
So it seems like the web browser control is using IE11. But when I type, the spell check doesn't work.
Note: When I run that html code with the real IE everything works properly.
Also, the navigator.userAgent displayed on the actual browser is:
Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; rv:11.0) like Gecko
Note 2: When I run my application on a Windows 10 machine, the spellcheck works. But I need to make it work on Windows 7 machines.

I had very similar problems using a WebBrowser control on a form and found two solutions with different effects; they can also be combined.
1. Using spellcheck=true in the HTML
Adding the attribute spellcheck=true to the HTML, BODY, or TEXTAREA tag, depending on where you want it applied, enables spell checking in text input boxes (my tests were on Windows 10).
Note that any EXISTING text in the text boxes was not spell checked; you had to type NEW text in. This caught me out when running test EXEs with a pre-filled text box, which never got the little red underlines on its spelling mistakes.
2. Registry entry for FEATURE_SPELLCHECKING
This did something different. It allowed spell checking in TinyMCE to work in our embedded web browser control. On its own, it did not enable spell checking in text areas.
See the link below for references to a registry key close to the one you were setting. However, it requires a new key and then a value whose name matches the EXE you're running. In my case, this involved creating the FEATURE_SPELLCHECKING registry key and then a DWORD named TEST123.EXE with value 1.
https://msdn.microsoft.com/en-us/library/ee330735%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396#spellchecking
I found this through the page linked below, where someone reports it not working on Windows 7. Note that this person also tries it with a "local user" key, which does not work in my experience:
https://social.msdn.microsoft.com/Forums/ie/en-US/515fa4b1-2b85-46e4-a041-7dc27c4539c4/how-to-enable-spell-checker-in-web-browser-control-for-ie10?forum=ieextensiondevelopment
3. Use both of the above
We found that approach 2 met most of our needs, as we were using TinyMCE. However, in other applications, approaches 1 and 2 can be used together to provide the most functionality.
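The registry step could be sketched as follows. This is an illustration under assumptions, not the answerer's code: the full key path is assumed to sit next to the FEATURE_BROWSER_EMULATION key used in the question, and the class and method names are hypothetical. Writing under HKEY_LOCAL_MACHINE normally requires administrator rights, and a 32-bit process on 64-bit Windows is redirected to the Wow6432Node view of the registry.

```csharp
using System;
using System.Diagnostics;
using Microsoft.Win32;

static class SpellcheckFeature
{
    // Assumed location: a sibling of FEATURE_BROWSER_EMULATION under HKEY_LOCAL_MACHINE.
    public const string FeatureKeyPath =
        @"Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_SPELLCHECKING";

    // The value name must match the executable name, e.g. "TEST123.exe".
    public static string ValueNameFor(string processName)
    {
        return processName + ".exe";
    }

    // Creates the key if it is missing and sets a DWORD of 1 for the given executable.
    public static void Enable(string exeName)
    {
        using (RegistryKey key = Registry.LocalMachine.CreateSubKey(FeatureKeyPath))
        {
            key.SetValue(exeName, 1, RegistryValueKind.DWord);
        }
    }

    static void Main()
    {
        string exeName = ValueNameFor(Process.GetCurrentProcess().ProcessName);
        Enable(exeName);
        Console.WriteLine("Set " + exeName + " = 1 under HKLM\\" + FeatureKeyPath);
    }
}
```

As with the emulation key in the question, the value probably has to be in place before the WebBrowser control is created for it to take effect; that timing is an assumption rather than something the answer states.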

Web Client Slow Download Video?

I use WebClient to download from YouTube. I have a 100 Mb connection, but my MP4 download rate is only 100 KB/s :)
WebClient client = new WebClient();
client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 6.1; .NET CLR 1.0.3705;)");
client.Proxy = null;
client.DownloadFileAsync(new Uri(url.ToString()), directory + file.ToString());
How can I fix this problem, or what am I doing wrong?
I think YouTube is limiting my speed, but when I tried Internet Download Manager it downloaded this video very fast.
Thanks for help!
Sorry for my bad English.
Ertim Abon

There's nothing inherently wrong with your code - the "problem" is on the other end. YouTube throttles the connection so that the video is downloaded at about the same speed as it's played. This saves on bandwidth if (when!) people don't watch the entire video. Well-configured video streaming websites will give you a burst at the start and then stream the rest at a lower speed.
The only way around it would be to see if you could make multiple requests to different parts of the video to get the "burst" multiple times, for instance with Range headers. They might not like you doing that though.
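A rough sketch of the Range-header idea, assuming .NET 4 or later (for the long overload of HttpWebRequest.AddRange and for Stream.CopyTo) and a server that honors range requests. The class name, method names, and chunk size are hypothetical, and whether this actually gets around the throttling is untested:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

static class RangeDownload
{
    // Splits [0, totalLength) into inclusive byte ranges of at most chunkSize bytes.
    public static List<long[]> SplitRanges(long totalLength, long chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        var ranges = new List<long[]>();
        for (long start = 0; start < totalLength; start += chunkSize)
        {
            long end = Math.Min(start + chunkSize, totalLength) - 1;
            ranges.Add(new[] { start, end });
        }
        return ranges;
    }

    // Downloads one inclusive byte range of the resource into memory.
    public static byte[] DownloadRange(string url, long from, long to)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AddRange(from, to); // adds a Range header for bytes from..to
        using (var response = request.GetResponse())
        using (var body = response.GetResponseStream())
        using (var buffer = new MemoryStream())
        {
            body.CopyTo(buffer);
            return buffer.ToArray();
        }
    }

    static void Main()
    {
        // Show how a 2500-byte resource would be split into 1000-byte requests.
        foreach (long[] r in SplitRanges(2500, 1000))
            Console.WriteLine(r[0] + "-" + r[1]);
    }
}
```

To benefit from several initial bursts, the ranges would have to be requested concurrently and the pieces written back in order; the sketch only shows the splitting and a single ranged request.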

Fixing a youtube downloader class

A little while ago I opened a thread (link at the bottom).
I'm happy to say it has been fixed, at least partially.
It still uses the wrong YouTube links.
And since YouTube keeps updating, all the examples I could find were broken.
I think this has to do with the regular expressions.
Could someone enlighten me on that subject?
And now for the error at hand:
An unhandled exception of type 'System.Net.WebException' occurred in System.dll
Additional information: The remote server returned an error: (403) Forbidden.
At Line 22: wc.DownloadFile(kvp.Value, @"C:\Users\waralot\Downloads\youtube\"+kvp.Key);
The console during compilation is here: pastebin.com/BrgKkAmk
Original project at HackForums: http://www.hackforums.net/showthread.php?tid=2052105
My current version: http://pastebin.com/2iH2vQ2L
Again my first thread can be found here: Converting a Youtube downloader form VB to C#

It seems that YouTube is blocking you from accessing the link; this may be because you don't set a user-agent for your WebClient.
Try adding this before you download the video:
wc.Headers.Add ("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
The code for the URL cleanup also needs to be changed, like so:
//clean up residual tags and encoded strings
link = slink.Replace("url=", "");
link = link.Replace("\\u0026", "&");
link = HttpUtility.UrlDecode(link);

can asp.net run without .net framework

I am very new to ASP.NET. Can ASP.NET run without the .NET Framework? That is, can default.aspx run smoothly without it? I am asking because of the following existing code, which runs on a web hosting server and on a private server. I am not sure about the private server's details (I should know in 2-3 days). The code is:
try
{
    WebRequest req = WebRequest.Create(General.SiteUrl + "/pages/" + page + ".htm");
    WebResponse resp = req.GetResponse();
    Stream stream = resp.GetResponseStream();
    StreamReader reader = new StreamReader(stream);
    content = reader.ReadToEnd();
}
catch { content = "<html><head></head><body>Content not found.</body></html>"; }
The web hosting server runs the try block successfully, whereas the private one always shows "Content not found". Any ideas?

People that visit your website will not need the .NET Framework; all they'll need is a browser.
The server that runs your website will need the .NET Framework since ASP.NET is a part of it.
The .NET Framework is required on the server side for a few reasons (these are just some examples):
Your code is compiled into an intermediate language designed to be platform agnostic. A runtime (the .NET Framework) is required to convert this intermediate language into something the machine can understand. This is accomplished by the JIT compiler.
ASP.NET consists of several libraries, System.Web.dll for example. These are distributed as part of the .NET Framework.
The code is hosted inside a virtual machine (in the non-traditional sense). The virtual machine takes care of a lot of heavy lifting for you, such as security, garbage collection, etc. Again, this is all part of the .NET Framework.
EDIT:
I think you are asking the wrong question here. You are really wondering why your code goes into the catch block and returns "Content not found". The .NET Framework is properly installed, since the catch block is being executed; in fact, the code couldn't get nearly that far without the .NET Framework.
You need to figure out what exception is being thrown inside the try block that causes execution to enter the catch block. You can do this with a debugger, with logging, or by temporarily removing the catch block altogether so that the exception bubbles all the way up to the top. For example, if you change your code block to look like this:
WebRequest req = WebRequest.Create(General.SiteUrl + "/pages/" + page + ".htm");
WebResponse resp = req.GetResponse();
Stream stream = resp.GetResponseStream();
StreamReader reader = new StreamReader(stream);
content = reader.ReadToEnd();
The exception details will be displayed in the browser (provided you have debugging turned on). What error is displayed without the try / catch?
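The logging option mentioned above could look roughly like this. It is a sketch, not the asker's code: the PageFetcher class, the Describe helper, and writing to Console.Error are assumptions chosen for illustration, and a real site would write to its own log instead.

```csharp
using System;
using System.IO;
using System.Net;

static class PageFetcher
{
    // Builds a one-line summary of an exception; for WebException it includes the status.
    public static string Describe(Exception ex)
    {
        var web = ex as WebException;
        if (web != null)
            return "WebException (" + web.Status + "): " + web.Message;
        return ex.GetType().Name + ": " + ex.Message;
    }

    // Same fallback behavior as the original code, but the cause is recorded first.
    public static string Fetch(string url)
    {
        try
        {
            WebRequest req = WebRequest.Create(url);
            using (WebResponse resp = req.GetResponse())
            using (var reader = new StreamReader(resp.GetResponseStream()))
            {
                return reader.ReadToEnd();
            }
        }
        catch (Exception ex)
        {
            // Record the real cause instead of silently swallowing it.
            Console.Error.WriteLine(Describe(ex));
            return "<html><head></head><body>Content not found.</body></html>";
        }
    }

    static void Main()
    {
        // Demonstrates the log line format without making a network call.
        Console.WriteLine(Describe(new WebException("timed out", WebExceptionStatus.Timeout)));
    }
}
```

With the cause logged, the difference between the two servers (for example a name resolution failure or a refused connection on the private one) should show up in the message rather than being hidden behind "Content not found".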

No, .NET code will not run without the .NET Framework, because code written in a .NET language is compiled to IL (Intermediate Language) code, which needs the framework's runtime to execute.

The .NET Framework, or some variation of it such as Mono, is not required on the client side. It is a requirement of the server that serves the pages.
The server renders each page to HTML and sends it to the client over HTTP. So all the client needs is a browser capable of consuming HTML and running any scripts associated with that site.

The .NET Framework is the foundation that powers this code:
try
{
    WebRequest req = WebRequest.Create(General.SiteUrl + "/pages/" + page + ".htm");
    WebResponse resp = req.GetResponse();
    Stream stream = resp.GetResponseStream();
    StreamReader reader = new StreamReader(stream);
    content = reader.ReadToEnd();
}
catch
{
    content = "<html><head></head><body>Content not found.</body></html>";
}
So in short, no: you must have the .NET Framework installed on the server that hosts your website.
On the client side, however, your website visitors do NOT need the .NET Framework to view your website.
