I use the following code to download text (json):
var request = WebRequest.Create(url);
using (var response = request.GetResponse())
{
    string charset = null;
    var httpResponse = response as HttpWebResponse;
    if (httpResponse != null)
    {
        if (httpResponse.StatusCode != HttpStatusCode.OK)
        {
            throw new System.Net.WebException("Status code was: " + httpResponse.StatusCode);
        }
        charset = httpResponse.CharacterSet;
    }
    Encoding enc = charset != null ? Encoding.GetEncoding(charset) : null;
    using (var reader = new StreamReader(response.GetResponseStream(), enc, true))
    {
        return reader.ReadToEnd();
    }
}
On Windows (.NET) it works fine. On Linux (Mono runtime) it sometimes returns truncated data: the JSON parser crashes because it can't find the closing delimiter for strings, and similar errors. It is not a problem with the parser: I have tried two different ones. It does not seem to be a problem with encoding either, because it sometimes works and sometimes doesn't for the exact same downloaded data.
Why would mono behave this way and how can I avoid this problem?
Edit: I added a console print for debugging purposes. The string coming directly from the code above is definitely truncated.
Edit2: Here is how I use the result:
string json = DownloadTextFile(url);
dynamic obj = Json.Decode(json);//Decoding fails here, because string is truncated.
The problem occurs much less frequently when I let the program run on a server with a very good connection to the net (after a few thousand downloads instead of after a few hundred). That is good enough for my purposes.
Checking the content length does not help much, because it is -1 more often than not. It is sad that the network stack is implemented so poorly in Mono. (On .NET the same code works flawlessly even with a bad connection.)
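One defensive workaround, related to the content-length check above, is to read the raw stream in a loop and verify the byte count against Content-Length whenever the server does send one. This is only a sketch (the method name is mine and it assumes UTF-8 JSON), not a guaranteed fix for the Mono behaviour:
using System;
using System.IO;
using System.Net;
using System.Text;

// Sketch: read the body in a loop and, when the server sends a
// Content-Length, verify that many bytes were actually received.
static string DownloadTextFileChecked(string url)
{
    var request = WebRequest.Create(url);
    using (var response = request.GetResponse())
    using (var stream = response.GetResponseStream())
    using (var buffer = new MemoryStream())
    {
        var chunk = new byte[8192];
        int read;
        while ((read = stream.Read(chunk, 0, chunk.Length)) > 0)
        {
            buffer.Write(chunk, 0, read);
        }
        if (response.ContentLength >= 0 && buffer.Length != response.ContentLength)
        {
            throw new WebException("Body truncated: expected " + response.ContentLength
                + " bytes, got " + buffer.Length);
        }
        return Encoding.UTF8.GetString(buffer.ToArray());   // assumes UTF-8 JSON
    }
}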
Related
I'm trying to upload a large video (1 GB+) from my Xamarin app, and it keeps crashing once it reaches about 0.5 GB of the file. The only way I've found to post the videos to my WCF service while sending data along with them is the multipart logic, but I'm not sure if I'm running out of memory or what, because even in debug mode it simply crashes without any real error message.
I'm trying to run it on a native device (not a sim) and it's a Samsung Galaxy S9 with Android 9.
Here's the upload code that I'm using: (p.s. - as a test, I tried putting the WriteAsync into a for loop thinking that maybe trying to write the whole gig was the problem, but the result was the same. That's why you'll see the MAXFILESIZEPART constant in there which is just an int equal to 10000000.)
private async Task<byte[]> GetMultipartFormDataAsync(Dictionary<string, object> postParameters, string boundary)
{
    try
    {
        using (Stream formDataStream = new System.IO.MemoryStream())
        {
            bool needsCLRF = false;

            foreach (var param in postParameters)
            {
                // Thanks to feedback from commenters, add a CRLF to allow multiple parameters to be added.
                // Skip it on the first parameter, add it to subsequent parameters.
                if (needsCLRF)
                    await formDataStream.WriteAsync(Encoding.UTF8.GetBytes("\r\n"), 0, Encoding.UTF8.GetByteCount("\r\n"));

                needsCLRF = true;

                if (param.Value is FileParameter)
                {
                    FileParameter fileToUpload = (FileParameter)param.Value;

                    // Add just the first part of this param, since we will write the file data directly to the Stream
                    string header = string.Format("--{0}\r\nContent-Disposition: form-data; name=\"{1}\"; filename=\"{2}\"\r\nContent-Type: {3}\r\n\r\n",
                        boundary,
                        param.Key,
                        fileToUpload.FileName ?? param.Key,
                        fileToUpload.ContentType ?? "application/octet-stream");

                    await formDataStream.WriteAsync(Encoding.UTF8.GetBytes(header), 0, Encoding.UTF8.GetByteCount(header));

                    // Write the file data directly to the Stream, rather than serializing it to a string.
                    if (fileToUpload.File.Length > MAXFILESIZEPART)
                    {
                        for (var i = 0; i < fileToUpload.File.Length; i += MAXFILESIZEPART)
                        {
                            var len = i + MAXFILESIZEPART > fileToUpload.File.Length
                                ? fileToUpload.File.Length - i
                                : MAXFILESIZEPART;
                            await formDataStream.WriteAsync(fileToUpload.File, i, len);
                        }
                    }
                    else
                    {
                        await formDataStream.WriteAsync(fileToUpload.File, 0, fileToUpload.File.Length);
                    }
                }
                else
                {
                    string postData = string.Format("--{0}\r\nContent-Disposition: form-data; name=\"{1}\"\r\n\r\n{2}",
                        boundary,
                        param.Key,
                        param.Value);
                    await formDataStream.WriteAsync(Encoding.UTF8.GetBytes(postData), 0, Encoding.UTF8.GetByteCount(postData));
                }
            }

            // Add the end of the request. Start with a newline
            string footer = "\r\n--" + boundary + "--\r\n";
            await formDataStream.WriteAsync(Encoding.UTF8.GetBytes(footer), 0, Encoding.UTF8.GetByteCount(footer));

            // Dump the Stream into a byte[]
            formDataStream.Position = 0;
            byte[] formData = new byte[formDataStream.Length];
            formDataStream.Read(formData, 0, formData.Length);

            return formData;
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
        throw;
    }
}
And it's eventually failing on the following line
await formDataStream.WriteAsync(fileToUpload.File, i, len);
but only after a certain point (about 500 MB), so I'm assuming it's a memory issue even though it doesn't say so. Is there a better way to accomplish this task? I'm doing it this way so that it also records the progress as the upload happens. I'm trying to accomplish something similar to uploading large videos via the Facebook app, so that it uploads in the background while you continue working. It works great with smaller files (i.e. < 500 MB), but this is the first time I've tried a file that was almost a gig in size.
NOTE: This happens BEFORE it starts posting anything to the server so it's not IIS or WCF related. This code crashes just writing the bytes to the memory stream.
Any suggestions?
Thanks!
According to your description, the service stops at a certain point, and because the file you transfer is about 1 GB, it is likely a SendTimeout issue: no transfer completed within the specified time, causing the exception. SendTimeout specifies how long the write operation has to complete before timing out. The default value is 1 minute.
I set SendTimeout to 15 seconds in my configuration file. If the transfer takes more than 15 seconds, an exception occurs. You can set it to a higher value to avoid the timeout and the exception.
For information about SendTimeout, please refer to the following link:
https://learn.microsoft.com/en-us/dotnet/api/system.servicemodel.channels.binding.sendtimeout?view=dotnet-plat-ext-3.1
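As a rough illustration only (this assumes a WCF client binding created in code via System.ServiceModel rather than from config; the values are examples, not your settings):
// Hedged sketch: raise the WCF client timeouts in code. Tune the values to
// how long your uploads realistically take.
var binding = new BasicHttpBinding
{
    SendTimeout = TimeSpan.FromMinutes(30),       // time allowed for the whole send
    ReceiveTimeout = TimeSpan.FromMinutes(30),
    MaxReceivedMessageSize = 2147483647           // also raise the size limit for large bodies
};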
UPDATE
I think it might be an out-of-memory problem. A large file may exhaust memory if the whole thing has to be held in memory at once.
You can refer to the following link for a solution:
https://learn.microsoft.com/en-us/archive/blogs/johan/are-you-getting-outofmemoryexceptions-when-uploading-large-files
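The article above essentially boils down to not buffering the whole body in memory. A minimal sketch of that idea (this is not your code: the multipart framing is omitted, and url and filePath are placeholders):
using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;

// Hedged sketch: copy a large file into an HttpWebRequest body in small
// chunks so the whole payload is never held in memory at once.
static async Task UploadLargeFileAsync(string url, string filePath)
{
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "POST";
    request.AllowWriteStreamBuffering = false;   // do not buffer the body in memory
    request.SendChunked = true;                  // stream without knowing the length up front

    using (var fileStream = File.OpenRead(filePath))
    using (var requestStream = await request.GetRequestStreamAsync())
    {
        var buffer = new byte[81920];
        int read;
        while ((read = await fileStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await requestStream.WriteAsync(buffer, 0, read);
        }
    }

    using (var response = (HttpWebResponse)await request.GetResponseAsync())
    {
        Console.WriteLine(response.StatusCode);
    }
}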
I'm working with a Trust Commerce Tutorial on how to generate a payment token that will allow customers to use the TC Trustee Host payment form. I was given an example on how to retrieve this token by their dev team.
using System;
using System.Net;
using System.IO;
using System.Text;
using System.Collections;
using System.Web;

/** #class TCToken
 * An example class for generating a TrustCommerce Trustee Token
 */
public class TCToken
{
    public static void Main(string[] args)
    {
        string custid = "123456";
        string password = "XXXXXX";

        try {
            // Adapted from http://www.west-wind.com/presentations/dotnetWebRequest/dotnetWebRequest.htm
            string gateway_post_address = "https://vault.trustcommerce.com/trustee/token.php";
            HttpWebRequest req = (HttpWebRequest)WebRequest.Create(gateway_post_address);

            // A sixty second timeout.
            req.Timeout = 60000;

            string post_data = "custid=" + HttpUtility.UrlEncode(custid) +
                "&password=" + HttpUtility.UrlEncode(password);

            req.Method = "POST";
            byte[] buf = System.Text.Encoding.GetEncoding(1252).GetBytes(post_data);
            req.ContentLength = buf.Length;
            req.ContentType = "application/x-www-form-urlencoded";

            Stream s = req.GetRequestStream();
            s.Write(buf, 0, buf.Length);
            s.Close();

            HttpWebResponse rep = (HttpWebResponse)req.GetResponse();
            Encoding enc = System.Text.Encoding.GetEncoding(1252);
            StreamReader rs = new StreamReader(rep.GetResponseStream(), enc);
            string token = rs.ReadToEnd();
            Console.WriteLine(token);
            rep.Close();
            rs.Close();
        } catch (Exception e) {
            Console.WriteLine(e);
        }
    }
}
I made a new console application in visual studio, copied this code, and replaced the username and password with the correct credentials. When I try to run this, I get the following error in the console.
System.NotSupportedException: No data is available for encoding 1252. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at TCToken.Program.Main(String[] args) in C:\Users\xxxx\source\repos\TCToken\TCToken\Program.cs:line 29
I've tried to google this error and most of the responses are a little above my understanding. I'm certainly not a C# expert.
What ckuri said. Just to be clear, you need the following line of code before opening the stream (steps 2,3):
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
ExcelDataReader - Important note on .NET Core
By default, ExcelDataReader throws a NotSupportedException "No data is
available for encoding 1252." on .NET Core.
To fix, add a dependency to the package System.Text.Encoding.CodePages
and then add code to register the code page provider during
application initialization (f.ex in Startup.cs):
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
This is required to parse strings in binary BIFF2-5 Excel documents
encoded with DOS-era code pages. These encodings are registered by
default in the full .NET Framework, but not on .NET Core.
.NET Core supports only ASCII, ISO-8859-1 and Unicode encodings, whereas .NET Framework supports much more.
However, .NET Core can be extended to support additional encodings like Windows-1252, Shift-JIS, GB2312 by registering the CodePagesEncodingProvider from the System.Text.Encoding.CodePages NuGet package.
After the NuGet package is installed the following steps as described in the documentation for the CodePagesEncodingProvider class must be done to register the provider:
Add a reference to the System.Text.Encoding.CodePages.dll assembly to your project.
Retrieve a CodePagesEncodingProvider object from the static Instance property.
Pass the CodePagesEncodingProvider object to the Encoding.RegisterProvider method.
nuget:
Install-Package System.Text.Encoding.CodePages -Version 5.0.0
code:
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
I was experiencing a similar issue when trying to read and convert an xlsx file to a DataTable. I found out that encoding 1252 is not available by default in .NET Core, so I had to add a NuGet package for it separately.
Below is the method where I convert the data from memory stream.
private static DataTableCollection ExcelToDataTable(MemoryStream stream)
{
    System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);

    using (var reader = ExcelReaderFactory.CreateReader(stream))
    {
        var result = reader.AsDataSet(new ExcelDataSetConfiguration()
        {
            ConfigureDataTable = (data) => new ExcelDataTableConfiguration()
            {
                UseHeaderRow = true
            }
        });
        return result.Tables;
    }
}
I registered the encoding provider from the NuGet package at the start of the method and it worked fine for me. This answer is late, but it might help people who are reading data from streams.
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
Solution:
Add the System.Text.Encoding.CodePages Package to your project.
Write this code in your program:
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
I've installed the library System.Text.Encoding.CodePages like other posts said and this code worked for me:
System.Text.Encoding.RegisterProvider(
    System.Text.CodePagesEncodingProvider.Instance);

Encoding srcEncoding = Encoding.GetEncoding(1251);
using (var reader = new StreamReader(@"D:\someFile.csv", encoding: srcEncoding))
{
    List<string> listA = new List<string>();
    while (!reader.EndOfStream)
    {
        var line = reader.ReadLine();
        var values = line.Split(';');
        listA.Add(values[0]);
    }
}
This is a little bit tricky, but this is how it goes.
The page loads.
It executes some JavaScript which generates more HTML code, and that generated source is the one I need.
So I see I can't use an HTML parser, because there isn't actually a way to run the code.
Using plain HTTP I can get the initial source code, but the JavaScript isn't executed, so I never get the source I need.
What is the best way to retrieve that code generated afterwards?
Edit: I am trying to avoid using a hidden web browser. It would work, since the browser acts as a JavaScript interpreter here, but it is a very slow and very ugly way to do it.
Edit2: Added code
static private string _InetReadEx(string sUrl)
{
    string aRet;
    HttpWebRequest webReq = (HttpWebRequest)HttpWebRequest.Create(sUrl);
    try
    {
        webReq.CookieContainer = new CookieContainer();
        webReq.Method = "GET";
        using (WebResponse response = webReq.GetResponse())
        {
            using (Stream stream = response.GetResponseStream())
            {
                StreamReader reader = new StreamReader(stream);
                aRet = reader.ReadToEnd();
                return aRet;
            }
        }
    }
    catch (Exception ex)
    {
        return string.Empty;
    }
}
Unless you use a WebBrowser, which you mentioned you want to avoid, there is no other convenient way.
You could mimic the behavior of the JavaScript yourself, execute it, and then format the result the way the WebBrowser does, but that would not be dynamic and is thus much less desirable.
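If the hidden WebBrowser route turns out to be acceptable after all, a rough sketch of it looks like this (WinForms, so it needs System.Windows.Forms and an STA thread; the URL is a placeholder, and DocumentCompleted can still fire before slow scripts finish generating content):
using System;
using System.Windows.Forms;

// Sketch: let the WebBrowser control run the page's JavaScript, then read
// the live DOM once the document has loaded.
[STAThread]
static void Main()
{
    var browser = new WebBrowser { ScriptErrorsSuppressed = true };
    browser.DocumentCompleted += (s, e) =>
    {
        // Body.OuterHtml reflects the DOM after the page's scripts have run.
        string renderedHtml = browser.Document.Body.OuterHtml;
        Console.WriteLine(renderedHtml.Length);
        Application.ExitThread();
    };
    browser.Navigate("https://example.com");
    Application.Run();   // pump messages so the control can load and render the page
}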
I am trying to retrieve HTML code from a webpage using HttpWebRequest and HttpWebResponse.
response = (HttpWebResponse)request.GetResponse();
...
Stream stream = response.GetResponseStream();
The response object has a ContentLength value of 106142. When I look at the stream object, it has a length of 65536. When reading the stream with a StreamReader using ReadToEnd(), only the first 65536 characters are returned.
How can I get the whole code?
Edit:
Using the following code segment:
catch (WebException ex)
{
    errorMessage = errorMessage + ex.Message;
    if (ex.Response != null)
    {
        if (ex.Response.ContentLength > 0)
        {
            using (Stream stream = ex.Response.GetResponseStream())
            {
                using (StreamReader reader = new StreamReader(stream))
                {
                    string pageOutput = reader.ReadToEnd().Trim();
ex.Response.ContentLength = 106142
ex.Response.GetResponseStream().Length = 65536
stream.Length = 65536
pageOutput.Length = 65534 (because of the trim)
And yes, the code is actually truncated.
You can find an answer in this topic: System.Net.HttpWebResponse.GetResponseStream() returns truncated body in WebException.
You have to change the HttpWebRequest.DefaultMaximumErrorResponseLength property.
For example :
HttpWebRequest.DefaultMaximumErrorResponseLength = 1048576;
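For context, here is a small hedged example of how that setting is typically applied before making the request (the value is illustrative; as far as I know the property is expressed in kilobytes, with a default of 64, which matches the 65536-byte truncation):
// Raise the cap on how much of an error response body HttpWebRequest keeps,
// then read the body from the WebException. url is a placeholder.
HttpWebRequest.DefaultMaximumErrorResponseLength = 1024;   // in KB

try
{
    var request = (HttpWebRequest)WebRequest.Create(url);
    using (var response = (HttpWebResponse)request.GetResponse()) { /* ... */ }
}
catch (WebException ex) when (ex.Response != null)
{
    using (var reader = new StreamReader(ex.Response.GetResponseStream()))
    {
        string fullErrorBody = reader.ReadToEnd();   // no longer cut off at 64 KB
    }
}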
ReadToEnd does specifically just that, it reads to the end of the stream. I would check to make sure that you were actually being sent the entire expected response.
There seems to be a problem when calling the GetResponseStream() method on the HttpWebResponse returned by the exception. Everything works as expected when there is no exception.
I wanted to get the HTML code from the error returned by the server.
I guess I'll have to hope the error doesn't exceed 65536 characters...
I'm having trouble reading a "chunked" response when using a StreamReader to read the stream returned by GetResponseStream() of a HttpWebResponse:
// response is an HttpWebResponse
StreamReader reader = new StreamReader(response.GetResponseStream());
string output = reader.ReadToEnd(); // throws exception...
When the reader.ReadToEnd() method is called I'm getting the following System.IO.IOException: Unable to read data from the transport connection: The connection was closed.
The above code works just fine when server returns a "non-chunked" response.
The only way I've been able to get it to work is to use HTTP/1.0 for the initial request (instead of HTTP/1.1, the default) but this seems like a lame work-around.
Any ideas?
#Chuck
Your solution works pretty well. It still throws the same IOException on the last Read(). But after inspecting the contents of the StringBuilder, it looks like all the data has been received. So perhaps I just need to wrap the Read() in a try-catch and swallow the "error".
Haven't tried this with a "chunked" response, but would something like this work?
StringBuilder sb = new StringBuilder();
Byte[] buf = new byte[8192];
Stream resStream = response.GetResponseStream();
string tmpString = null;
int count = 0;

do
{
    count = resStream.Read(buf, 0, buf.Length);
    if (count != 0)
    {
        tmpString = Encoding.ASCII.GetString(buf, 0, count);
        sb.Append(tmpString);
    }
} while (count > 0);
I am working on a similar problem. The .NET HttpWebRequest and HttpWebResponse handle cookies and redirects automatically, but they do not handle chunked content on the response body automatically.
This is perhaps because chunked content may contain more than simple data (i.e. chunk names and trailing headers).
Simply reading the stream and ignoring the EOF exception will not work, as the stream contains more than the desired content. The stream will contain chunks, and each chunk begins by declaring its size. If the stream is simply read from beginning to end, the final data will contain the chunk meta-data (and in the case of gzipped content, it will fail the CRC check when decompressing).
To solve the problem it is necessary to manually parse the stream, removing the chunk size from each chunk (as well as the CR LF delimiters), detecting the final chunk and keeping only the chunk data; a sketch of that parsing follows the links below. There likely is a library out there somewhere that does this, but I have not found it yet.
Useful resources:
http://en.wikipedia.org/wiki/Chunked_transfer_encoding
https://www.rfc-editor.org/rfc/rfc2616#section-3.6.1
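For illustration only, manual de-chunking could look roughly like this. It assumes the stream really does still carry the chunk framing (a hex size line, the data, a CRLF, repeated until a zero-length chunk) and it ignores trailing headers; treat it as a sketch, not a tested library:
using System;
using System.IO;
using System.Text;

// Sketch: strip chunked transfer-encoding framing from a raw stream.
static byte[] DecodeChunked(Stream input)
{
    var output = new MemoryStream();
    while (true)
    {
        string sizeLine = ReadLine(input);
        // The size line may carry chunk extensions after ';' - ignore them.
        int semicolon = sizeLine.IndexOf(';');
        if (semicolon >= 0) sizeLine = sizeLine.Substring(0, semicolon);
        int chunkSize = Convert.ToInt32(sizeLine.Trim(), 16);
        if (chunkSize == 0) break;                        // final chunk

        var buffer = new byte[chunkSize];
        int read, total = 0;
        while (total < chunkSize &&
               (read = input.Read(buffer, total, chunkSize - total)) > 0)
        {
            total += read;
        }
        output.Write(buffer, 0, total);
        ReadLine(input);                                  // consume the CRLF after the data
    }
    return output.ToArray();
}

// Reads bytes up to and including CRLF and returns the line without it.
static string ReadLine(Stream input)
{
    var sb = new StringBuilder();
    int b;
    while ((b = input.ReadByte()) != -1)
    {
        if (b == '\n') break;
        if (b != '\r') sb.Append((char)b);
    }
    return sb.ToString();
}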
I've had the same problem (which is how I ended up here :-). Eventually tracked it down to the fact that the chunked stream wasn't valid - the final zero length chunk was missing. I came up with the following code which handles both valid and invalid chunked streams.
using (StreamReader sr = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
    StringBuilder sb = new StringBuilder();
    try
    {
        while (!sr.EndOfStream)
        {
            sb.Append((char)sr.Read());
        }
    }
    catch (System.IO.IOException)
    { }
    string content = sb.ToString();
}
After trying a lot of snippets from Stack Overflow and Google, I ultimately found this to work the best (assuming you know the data is a UTF-8 string; if not, you can just keep the byte array and process it appropriately):
byte[] data;
var responseStream = response.GetResponseStream();
var reader = new StreamReader(responseStream, Encoding.UTF8);
data = Encoding.UTF8.GetBytes(reader.ReadToEnd());
return Encoding.Default.GetString(data.ToArray());
I found other variations work most of the time, but occasionally truncate the data. I got this snippet from:
https://social.msdn.microsoft.com/Forums/en-US/4f28d99d-9794-434b-8b78-7f9245c099c4/problems-with-httpwebrequest-and-transferencoding-chunked?forum=ncl
Funny enough, while playing with the request headers I removed "Accept-Encoding: gzip,deflate", and the server in my use case then answered in plain ASCII and no longer with chunked, encoded snippets. Maybe you should give it a try and leave "Accept-Encoding: gzip,deflate" out. The idea came from the compression section of the Wikipedia article mentioned above.
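In code, one way to make sure the header is not sent at all (a hedged sketch; url is a placeholder, and this only helps if you were opting into compression in the first place):
var request = (HttpWebRequest)WebRequest.Create(url);
request.AutomaticDecompression = DecompressionMethods.None;   // no gzip/deflate negotiation, so no Accept-Encoding header is added automatically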