Download Json file from URL with progressbar - c#

I need to download large JSON files from a URL. The request uses the POST method, with parameters such as username and password.
Because the download takes a long time, I want to show a progress bar so users can see how far it has gone and how much is left.
Problem
I cannot retrieve the content length of the JSON file from the URL before the download. Please see the error described in the comments in the code below. Any suggestions?
public void downloadJson(string Url, string UserName, string Password, string FileDownload)
{
HttpWebRequest httpRequest;
HttpWebResponse httpResponse;
int Size;
var json = string.Format("{{\"user\":\"{0}\",\"pwd\":\"{1}\",\"DS\":\"KG\"}}", UserName, Password);
httpRequest = (HttpWebRequest)WebRequest.Create(Url);
httpRequest.Method = WebRequestMethods.Http.Post;
var data = Encoding.ASCII.GetBytes(json); // or UTF8
httpRequest.ContentLength = data.Length; // length of the encoded body in bytes, not characters
httpRequest.ContentType = "application/json";
httpRequest.Timeout = 600 * 60 * 1000;
using (var s = httpRequest.GetRequestStream())
{
s.Write(data, 0, data.Length);
s.Close();
}
httpResponse = (HttpWebResponse)httpRequest.GetResponse();
Size = (int)httpResponse.ContentLength;
Stream rs = httpResponse.GetResponseStream();
//**********************************************
//Here is the error
//the below progress bar would never work
//Because Size returned from above is always -1
//**********************************************
progressBar.Invoke((MethodInvoker)(() => progressBar.Maximum = Size));
using (FileStream fs = File.Create(FileDownload))
{
byte[] buffer = new byte[16 * 1024];
int read;
int position;
using (rs)
{
while ((read = rs.Read(buffer, 0, buffer.Length)) > 0)
{
fs.Write(buffer, 0, read);
position = (int)fs.Position;
progressBar.Invoke((MethodInvoker)(() => progressBar.Value = position));
Console.WriteLine ("Bytes Received: " + position.ToString());
}
}
fs.Close();
}
}

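The cast on this line is not the problem:
Size = (int)httpResponse.ContentLength;
ContentLength is a long, and it returns -1 when the server does not send a Content-Length header, which is common for dynamically generated responses sent with chunked transfer encoding.
Declaring
int Size;
with a different name or type will not change the value the server reports.
If the server cannot tell you the size up front, the progress bar cannot show a percentage. Show a marquee-style bar or the number of bytes received instead.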
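Since Size comes back as -1 in the question, the download loop has to cope with a total that is not known in advance. Below is a minimal sketch of that idea, assuming the server sends no Content-Length header. ProgressCopy, CopyWithProgress and the report callback are hypothetical names that do not appear in the original code, and a MemoryStream stands in for the response stream. In a WinForms application the callback would update the ProgressBar, or switch it to ProgressBarStyle.Marquee when the total is -1.

```csharp
using System;
using System.IO;

public static class ProgressCopy
{
    // Copies input to output and calls report(bytesSoFar, totalBytes) after every chunk.
    // totalBytes is passed through unchanged; -1 means "unknown".
    // Returns the number of bytes copied.
    public static long CopyWithProgress(Stream input, Stream output, long totalBytes,
                                        Action<long, long> report)
    {
        byte[] buffer = new byte[16 * 1024];
        long copied = 0;
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
            copied += read;
            report(copied, totalBytes);
        }
        return copied;
    }

    public static void Main()
    {
        // Stand-in for httpResponse.GetResponseStream(): 40,000 bytes of dummy data.
        using (var source = new MemoryStream(new byte[40000]))
        using (var target = new MemoryStream())
        {
            long total = -1; // what ContentLength reports when the header is missing
            CopyWithProgress(source, target, total, (done, all) =>
            {
                if (all > 0)
                    Console.WriteLine("{0:0.0}%", done * 100.0 / all);
                else
                    Console.WriteLine("{0} bytes received (total unknown)", done);
            });
        }
    }
}
```

Using long for the byte counts also avoids the overflow that the (int) casts in the question would cause for files larger than 2 GB.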

Related

Download FTP using FtpWebRequest in Windows Form .Net

I have tried to download an FTP file using C# and have had various problems. What I want to achieve is to be able to show download progress in a progressBar. It is important that I use Windows Form and .Net.
I have tried two codes;
My first code works perfectly, that is, I can download the FTP file without problems.
CODE 1
FtpWebRequest dirFtp = ((FtpWebRequest)FtpWebRequest.Create(ficFTP));
dirFtp.KeepAlive = true;
dirFtp.UsePassive = UsePassive;
dirFtp.UseBinary = UseBinary;
// User data (credentials)
NetworkCredential cr = new NetworkCredential(user, pass);
dirFtp.Credentials = cr;
FtpWebResponse response = (FtpWebResponse)dirFtp.GetResponse();
long size = (long)response.ContentLength;
Stream responseStream = response.GetResponseStream();
StreamReader reader = new StreamReader(responseStream);
using (FileStream writer = new FileStream(dirLocal, FileMode.Create))
{
int bufferSize = 2048;
int readCount;
byte[] buffer = new byte[2048];
readCount = responseStream.Read(buffer, 0, bufferSize);
while (readCount > 0)
{
writer.Write(buffer, 0, readCount);
readCount = responseStream.Read(buffer, 0, bufferSize);
}
}
lblDescarga.Text = "¡Downloaded!";
reader.Close();
response.Close();
Problem with this code
My problem with this code is that I can't get the size of the FTP file, which I need for the progressBar. In theory this line should give me the size of the file, but it always returns -1:
long size = (long)response.ContentLength;
As this did not work, I posted a question and people recommended the solution from "FtpWebRequest FTP download with ProgressBar":
CODE 2
try
{
const string url = "ftp://185.222.111.11:21/patch/archive.zip";
NetworkCredential credentials = new NetworkCredential("user", "pass");
// Query size of the file to be downloaded
WebRequest sizeRequest = WebRequest.Create(url);
sizeRequest.Credentials = credentials;
sizeRequest.Method = WebRequestMethods.Ftp.GetFileSize;
int size = (int)sizeRequest.GetResponse().ContentLength;
progressBar1.Invoke(
(MethodInvoker)(() => progressBar1.Maximum = size));
// Download the file
WebRequest request = WebRequest.Create(url);
request.Credentials = credentials;
request.Method = WebRequestMethods.Ftp.DownloadFile;
using (Stream ftpStream = request.GetResponse().GetResponseStream())
using (Stream fileStream = File.Create(@"C:\tmp\archive.zip"))
{
byte[] buffer = new byte[10240];
int read;
while ((read = ftpStream.Read(buffer, 0, buffer.Length)) > 0)
{
fileStream.Write(buffer, 0, read);
int position = (int)fileStream.Position;
progressBar1.Invoke(
(MethodInvoker)(() => progressBar1.Value = position));
}
}
}
catch (Exception e)
{
MessageBox.Show(e.Message);
}
Problem with this code
The problem with this code is that when it reaches this line:
int size = (int)sizeRequest.GetResponse().ContentLength;
it fails with: Remote server error: (550) File not available (e.g. file not found or no access).
It cannot be a permissions problem, because code 1 works fine with the same account, and I have the normal permissions on the FTP server. Could someone give me an idea, please?

Split an avro file and upload to REST

I have created some avro files. I can use the following commands to convert them to json, just to check whether the files are ok
java -jar avro-tools-1.8.2.jar tojson FileName.avro>outputfilename.json
Now, I have some big avro files, and the REST API I'm trying to upload to has size limitations, so I am trying to upload them in chunks using streams.
The following sample, which just reads from the original file in chunks and copies them to another avro file, creates the file perfectly:
using System;
using System.IO;
class Test
{
public static void Main()
{
// Specify a file to read from and to create.
string pathSource = @"D:\BDS\AVRO\filename.avro";
string pathNew = @"D:\BDS\AVRO\test\filenamenew.avro";
try
{
using (FileStream fsSource = new FileStream(pathSource,
FileMode.Open, FileAccess.Read))
{
byte[] buffer = new byte[(20 * 1024 * 1024) + 100];
long numBytesToRead = (int)fsSource.Length;
int numBytesRead = 0;
using (FileStream fsNew = new FileStream(pathNew,
FileMode.Append, FileAccess.Write))
{
// Read the source file into a byte array.
//byte[] bytes = new byte[fsSource.Length];
//int numBytesToRead = (int)fsSource.Length;
//int numBytesRead = 0;
while (numBytesToRead > 0)
{
int bytesRead = fsSource.Read(buffer, 0, buffer.Length);
byte[] actualbytes = new byte[bytesRead];
Array.Copy(buffer, actualbytes, bytesRead);
// Read may return anything from 0 to numBytesToRead.
// Break when the end of the file is reached.
if (bytesRead == 0)
break;
numBytesRead += bytesRead;
numBytesToRead -= bytesRead;
fsNew.Write(actualbytes, 0, actualbytes.Length);
}
}
}
// Write the byte array to the other FileStream.
}
catch (FileNotFoundException ioEx)
{
Console.WriteLine(ioEx.Message);
}
}
}
How do I know this creates a valid avro file? Because the earlier command to convert to json works again, i.e.
java -jar avro-tools-1.8.2.jar tojson filenamenew.avro>outputfilename.json
However, when I use the same code but call a REST API instead of copying to another file, the file gets uploaded, but when I download the same file from the server and run the command above to convert it to json, it says "Not a Data file".
So, obviously something is getting corrupted, and I am struggling to figure out what.
This is the snippet:
string filenamefullyqualified = path + filename;
Stream stream = System.IO.File.Open(filenamefullyqualified, FileMode.Open, FileAccess.Read, FileShare.None);
long? position = 0;
byte[] buffer = new byte[(20 * 1024 * 1024) + 100];
long numBytesToRead = stream.Length;
int numBytesRead = 0;
do
{
var content = new MultipartFormDataContent();
int bytesRead = stream.Read(buffer, 0, buffer.Length);
byte[] actualbytes = new byte[bytesRead];
Array.Copy(buffer, actualbytes, bytesRead);
if (bytesRead == 0)
break;
//Append Data
url = String.Format("https://{0}.dfs.core.windows.net/raw/datawarehouse/{1}/{2}/{3}/{4}/{5}?action=append&position={6}", datalakeName, filename.Substring(0, filename.IndexOf("_")), year, month, day, filename, position.ToString());
numBytesRead += bytesRead;
numBytesToRead -= bytesRead;
ByteArrayContent byteContent = new ByteArrayContent(actualbytes);
content.Add(byteContent);
method = new HttpMethod("PATCH");
request = new HttpRequestMessage(method, url)
{
Content = content
};
request.Headers.Add("Authorization", "Bearer " + accesstoken);
var response = await client.SendAsync(request);
response.EnsureSuccessStatusCode();
position = position + request.Content.Headers.ContentLength;
Array.Clear(buffer, 0, buffer.Length);
} while (numBytesToRead > 0);
stream.Close();
I have looked through the forum threads but haven't come across anything which deals with splitting of avro files.
I have a hunch that my "content" for the http request isn't right. What am I missing?
If you need more details, I will be happy to provide.
I have found the problem now. It was caused by MultipartFormDataContent. When an avro file is uploaded with it, extra text such as the content type is added to the body, and many lines are removed (I do not know why).
So the solution was to upload the contents as ByteArrayContent directly, instead of adding it to a MultipartFormDataContent as I was doing earlier.
Here is the snippet. It is almost identical to the one in the question, except that I no longer use MultipartFormDataContent:
string filenamefullyqualified = path + filename;
Stream stream = System.IO.File.Open(filenamefullyqualified, FileMode.Open, FileAccess.Read, FileShare.None);
//content.Add(CreateFileContent(fs, path, filename, "text/plain"));
long? position = 0;
byte[] buffer = new byte[(20 * 1024 * 1024) + 100];
long numBytesToRead = stream.Length;
int numBytesRead = 0;
//while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
//{
do
{
//var content = new MultipartFormDataContent();
int bytesRead = stream.Read(buffer, 0, buffer.Length);
byte[] actualbytes = new byte[bytesRead];
Array.Copy(buffer, actualbytes, bytesRead);
if (bytesRead == 0)
break;
//Append Data
url = String.Format("https://{0}.dfs.core.windows.net/raw/datawarehouse/{1}/{2}/{3}/{4}/{5}?action=append&position={6}", datalakeName, filename.Substring(0, filename.IndexOf("_")), year, month, day, filename, position.ToString());
numBytesRead += bytesRead;
numBytesToRead -= bytesRead;
ByteArrayContent byteContent = new ByteArrayContent(actualbytes);
//byteContent.Headers.ContentType= new MediaTypeHeaderValue("text/plain");
//content.Add(byteContent);
method = new HttpMethod("PATCH");
//request = new HttpRequestMessage(method, url)
//{
// Content = content
//};
request = new HttpRequestMessage(method, url)
{
Content = byteContent
};
request.Headers.Add("Authorization", "Bearer " + accesstoken);
var response = await client.SendAsync(request);
response.EnsureSuccessStatusCode();
position = position + request.Content.Headers.ContentLength;
Array.Clear(buffer, 0, buffer.Length);
} while (numBytesToRead > 0);
stream.Close();
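The extra text that MultipartFormDataContent adds can be made visible by serializing both content types locally and comparing the body sizes. This is a minimal sketch that sends nothing over the network. MultipartOverhead and its two methods are hypothetical names used only for illustration, and the 1024-byte array stands in for one chunk of the avro file.

```csharp
using System;
using System.Net.Http;

public static class MultipartOverhead
{
    // Size of the request body when the payload is sent directly.
    public static int RawBodyLength(byte[] payload)
    {
        using (var content = new ByteArrayContent(payload))
        {
            return content.ReadAsByteArrayAsync().GetAwaiter().GetResult().Length;
        }
    }

    // Size of the request body when the same payload is wrapped in multipart/form-data.
    // The wrapper adds boundary lines and part headers around the payload.
    public static int MultipartBodyLength(byte[] payload)
    {
        using (var content = new MultipartFormDataContent())
        {
            content.Add(new ByteArrayContent(payload));
            return content.ReadAsByteArrayAsync().GetAwaiter().GetResult().Length;
        }
    }

    public static void Main()
    {
        byte[] payload = new byte[1024]; // stand-in for one chunk of the avro file
        Console.WriteLine("ByteArrayContent body: {0} bytes", RawBodyLength(payload));
        Console.WriteLine("MultipartFormDataContent body: {0} bytes", MultipartBodyLength(payload));
    }
}
```

An append endpoint that stores the request body byte for byte would write those boundary lines and headers into the file, which would be consistent with the "Not a Data file" error described above.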
However, streaming record by record cannot handle the AVRO file as a whole in one transaction. We may end up with partial success if, for example, some records fail.
A small tool that can split AVRO files based on a threshold number of records would be great.
The Spark-based split-by-partition technique can split a data set into a predefined number of files, but it cannot split based on the number of records; for example, I do not want an AVRO file with more than 500 records.
So we have to devise batching logic based on the heap size the application can comfortably handle, along with a two-phase commit to handle transactions.

Tracking upload progress of multiple file uploads using multipart/body web request

I'm using HttpWebRequest to upload files to the server. The request sends two files to the server: a video file and an image file. I'm trying to track the progress of the entire upload, but the progress log runs separately for each file. I want the progress to be reported once across both uploads, but I can't quite figure out how to do it. Here's my client-side code:
Dictionary<string, string> fields = new Dictionary<string, string>();
fields.Add("username", username);
HttpWebRequest hr = WebRequest.Create(url) as HttpWebRequest;
hr.Timeout = 500000;
string bound = "----------------------------" + DateTime.Now.Ticks.ToString("x");
hr.ContentType = "multipart/form-data; boundary=" + bound;
hr.Method = "POST";
hr.KeepAlive = true;
hr.Credentials = CredentialCache.DefaultCredentials;
byte[] boundBytes = Encoding.ASCII.GetBytes("\r\n--" + bound + "\r\n");
string formDataTemplate = "\r\n--" + bound + "\r\nContent-Disposition: form-data; name=\"{0}\";\r\n\r\n{1}";
Stream s = hr.GetRequestStreamWithTimeout(1000000);
foreach (string key in fields.Keys)
{
byte[] formItemBytes = Encoding.UTF8.GetBytes(
string.Format(formDataTemplate, key, fields[key]));
s.Write(formItemBytes, 0, formItemBytes.Length);
}
s.Write(boundBytes, 0, boundBytes.Length);
string headerTemplate =
"Content-Disposition: form-data; name=\"{0}\"; filename=\"{1}\"\r\n Content-Type: application/octet-stream\r\n\r\n";
List<string> files = new List<string> { fileUrl, thumbUrl };
List<string> type = new List<string> { "video", "thumb" };
int count = 0;
foreach (string f in files)
{
var m = Path.GetFileName(f);
var t = type[count];
var j = string.Format(headerTemplate, t, m);
byte[] headerBytes = Encoding.UTF8.GetBytes(
string.Format(headerTemplate, type[count], Path.GetFileName(f)));
s.Write(headerBytes, 0, headerBytes.Length);
FileStream fs = new FileStream(f, FileMode.Open, FileAccess.Read);
int bytesRead = 0;
long bytesSoFar = 0;
byte[] buffer = new byte[1024];
while ((bytesRead = fs.Read(buffer, 0, buffer.Length)) != 0)
{
bytesSoFar += bytesRead;
s.Write(buffer, 0, buffer.Length);
Console.WriteLine(string.Format("sending file data {0:0.000}%", (bytesSoFar * 100.0f) / fs.Length));
}
s.Write(boundBytes, 0, boundBytes.Length);
fs.Close();
count += 1;
}
s.Close();
string respString = "";
hr.BeginGetResponse((IAsyncResult res) =>
{
WebResponse resp = ((HttpWebRequest)res.AsyncState).EndGetResponse(res);
StreamReader respReader = new StreamReader(resp.GetResponseStream());
respString = respReader.ReadToEnd();
resp.Close();
resp = null;
}, hr);
while (!hr.HaveResponse)
{
Console.Write("hiya bob!");
Thread.Sleep(150);
}
Console.Write(respString);
hr = null;
How do I combine the progress log for both uploads into a single log? Any help is appreciated.
One option is to calculate the total number of bytes you need to send before doing any work:
// Calculate the total size to upload before starting work
long totalToUpload = 0;
foreach (var f in files)
{
totalToUpload += (new FileInfo(f)).Length;
}
Then keep track of the total number of bytes sent in any file, and use that in your calculation of progress:
int count = 0;
long bytesSoFar = 0;
foreach (string f in files)
{
// ... Your existing work ...
while ((bytesRead = fs.Read(buffer, 0, buffer.Length)) != 0)
{
bytesSoFar += bytesRead;
// Make sure to only write the number of bytes read from the file
s.Write(buffer, 0, bytesRead);
// Console.WriteLine takes a string.Format() style string
Console.WriteLine("sending file data {0:0.000}%", (bytesSoFar * 100.0f) / totalToUpload);
}
// ... The rest of your existing per-file work ...
}

C# - Is there a limit to the size of an httpWebRequest stream?

I am trying to build an application that downloads a small binary file (20-25 KB) from a custom web server using HttpWebRequest.
This is the server-side code:
Stream UpdateRequest = context.Request.InputStream;
byte[] UpdateContent = new byte[context.Request.ContentLength64];
UpdateRequest.Read(UpdateContent, 0, UpdateContent.Length);
String remoteVersion = "";
for (int i = 0;i < UpdateContent.Length;i++) { //check if update is necessary
remoteVersion += (char)UpdateContent[i];
}
byte[] UpdateRequestResponse;
if (remoteVersion == remotePluginVersion) {
UpdateRequestResponse = new byte[1];
UpdateRequestResponse[0] = 0; //respond with a single byte set to 0 if no update is required
} else {
FileInfo info = new FileInfo(Path.Combine(Directory.GetCurrentDirectory(), "remote logs", "PointAwarder.dll"));
UpdateRequestResponse = File.ReadAllBytes(Path.Combine(Directory.GetCurrentDirectory(), "remote logs", "PointAwarder.dll"));
//respond with the updated file otherwise
}
//this byte is past the threshold and will not be the same in the version the client receives
Console.WriteLine("5000th byte: " + UpdateRequestResponse[5000]);
//send the response
context.Response.ContentLength64 = UpdateRequestResponse.Length;
context.Response.OutputStream.Write(UpdateRequestResponse, 0, UpdateRequestResponse.Length);
context.Response.Close();
After this the array UpdateRequestResponse contains the entire file and has been sent to the client.
The client runs this code:
//create the request
WebRequest request = WebRequest.Create(url + "pluginUpdate");
request.Method = "POST";
//create a byte array of the current version
byte[] requestContentTemp = version.ToByteArray();
int count = 0;
for (int i = 0; i < requestContentTemp.Length; i++) {
if (requestContentTemp[i] != 0) {
count++;
}
}
byte[] requestContent = new byte[count];
for (int i = 0, j = 0; i < requestContentTemp.Length; i++) {
if (requestContentTemp[i] != 0) {
requestContent[j] = requestContentTemp[i];
j++;
}
}
//send the current version
request.ContentLength = requestContent.Length;
Stream dataStream = request.GetRequestStream();
dataStream.Write(requestContent, 0, requestContent.Length);
dataStream.Close();
//get and read the response
WebResponse response = request.GetResponse();
Stream responseStream = response.GetResponseStream();
byte[] responseBytes = new byte[response.ContentLength];
responseStream.Read(responseBytes, 0, (int)response.ContentLength);
responseStream.Close();
response.Close();
//if the response contained a single 0 we are up-to-date, otherwise write the content of the response to file
if (responseBytes[0] != 0 || response.ContentLength > 1) {
BinaryWriter writer = new BinaryWriter(File.Open(Path.Combine(Directory.GetCurrentDirectory(), "ServerPlugins", "PointAwarder.dll"), FileMode.Create));
writer.BaseStream.Write(responseBytes, 0, responseBytes.Length);
writer.Close();
TShockAPI.Commands.HandleCommand(TSPlayer.Server, "/reload");
}
The byte array responseBytes on the client should be identical to the array UpdateRequestResponse on the server, but it isn't. After about 4000 bytes, every byte is set to 0 rather than what it should be (responseBytes[3985] is the last non-zero byte).
Does this happen because HttpWebRequest has a size limit? I can't see any bug in my code that could be causing it, and the same code works in other instances where I only have to pass around short sequences of data (less than 100 bytes).
The MSDN pages don't mention any size limit like this.
It's not that it has any artificial limit, this is a byproduct of the Streaming nature of what you're attempting to do. I have a feeling the following line is the offender:
responseStream.Read(responseBytes, 0, (int)response.ContentLength);
I've had this issue in the past (with TCP streams): Read doesn't fill the whole array, because the remaining bytes haven't all been sent over the wire yet. This is what I would try instead:
for (int i = 0; i < response.ContentLength; i++)
{
responseBytes[i] = (byte)responseStream.ReadByte(); // ReadByte returns an int, so a cast is needed
}
That way, it keeps reading until every expected byte has arrived.
EDIT
usr's BinaryReader based solution is much more efficient. Here is the relevant solution:
BinaryReader binReader = new BinaryReader(responseStream);
const int bufferSize = 4096;
byte[] responseBytes;
using (MemoryStream ms = new MemoryStream())
{
byte[] buffer = new byte[bufferSize];
int count;
while ((count = binReader.Read(buffer, 0, buffer.Length)) != 0)
ms.Write(buffer, 0, count);
responseBytes = ms.ToArray();
}
You are assuming that Read is reading as many bytes as you request. But the requested count is just an upper limit. You must tolerate reading small chunks.
You can use var bytes = new BinaryReader(myStream).ReadBytes(count); to read an exact number. Don't call ReadByte too often because that is very CPU intensive.
The best solution would be to step away from the fairly manual HttpWebRequest and use HttpClient or WebClient. All of this is automated for you and you get back a byte[].
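The point that Read may return fewer bytes than requested can be sketched as a small loop that keeps reading until the expected count has arrived. StreamHelpers.ReadFully and TrickleStream are hypothetical names introduced for this illustration. TrickleStream is only a test double that imitates a network stream by handing out at most 1000 bytes per call.

```csharp
using System;
using System.IO;

public static class StreamHelpers
{
    // Calls Read repeatedly until 'count' bytes have arrived.
    // Throws if the stream ends before that.
    public static byte[] ReadFully(Stream stream, int count)
    {
        byte[] result = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(result, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended after " + offset + " bytes.");
            offset += read;
        }
        return result;
    }
}

// Test double: returns at most 'maxChunk' bytes per Read call, like a slow network stream.
public sealed class TrickleStream : MemoryStream
{
    private readonly int maxChunk;

    public TrickleStream(byte[] data, int maxChunk) : base(data)
    {
        this.maxChunk = maxChunk;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return base.Read(buffer, offset, Math.Min(count, maxChunk));
    }
}

public static class Program
{
    public static void Main()
    {
        byte[] data = new byte[10000];
        for (int i = 0; i < data.Length; i++) data[i] = (byte)(i % 251);

        // A single Read call only delivers the first chunk.
        byte[] single = new byte[data.Length];
        int got = new TrickleStream(data, 1000).Read(single, 0, single.Length);
        Console.WriteLine("Single Read returned {0} bytes", got);

        // The loop collects everything.
        byte[] all = StreamHelpers.ReadFully(new TrickleStream(data, 1000), data.Length);
        Console.WriteLine("ReadFully returned {0} bytes", all.Length);
    }
}
```

The bytes that a single Read leaves unfilled stay at zero in the array, which matches the symptom described in the question.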

Can't download complete image file from skydrive using REST API

I'm working on a quick wrapper for the skydrive API in C#, but running into issues with downloading a file. For the first part of the file, everything comes through fine, but then there start to be differences in the file and shortly thereafter everything becomes null. I'm fairly sure that it's just me not reading the stream correctly.
This is the code I'm using to download the file:
public const string ApiVersion = "v5.0";
public const string BaseUrl = "https://apis.live.net/" + ApiVersion + "/";
public SkyDriveFile DownloadFile(SkyDriveFile file)
{
string uri = BaseUrl + file.ID + "/content";
byte[] contents = GetResponse(uri);
file.Contents = contents;
return file;
}
public byte[] GetResponse(string url)
{
checkToken();
Uri requestUri = new Uri(url + "?access_token=" + HttpUtility.UrlEncode(token.AccessToken));
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(requestUri);
request.Method = WebRequestMethods.Http.Get;
WebResponse response = request.GetResponse();
Stream responseStream = response.GetResponseStream();
byte[] contents = new byte[response.ContentLength];
responseStream.Read(contents, 0, (int)response.ContentLength);
return contents;
}
This is the image file I'm trying to download
And this is the image I am getting
These two images lead me to believe that I'm not waiting for the response to finish coming through, because the content-length is the same as the size of the image I'm expecting, but I'm not sure how to make my code wait for the entire response to come through or even really if that's the approach I need to take.
Here's my test code in case it's helpful
[TestMethod]
public void CanUploadAndDownloadFile()
{
var api = GetApi();
SkyDriveFolder folder = api.CreateFolder(null, "TestFolder", "Test Folder");
SkyDriveFile file = api.UploadFile(folder, TestImageFile, "TestImage.png");
file = api.DownloadFile(file);
api.DeleteFolder(folder);
byte[] contents = new byte[new FileInfo(TestImageFile).Length];
using (FileStream fstream = new FileStream(TestImageFile, FileMode.Open))
{
fstream.Read(contents, 0, contents.Length);
}
using (FileStream fstream = new FileStream(TestImageFile + "2", FileMode.CreateNew))
{
fstream.Write(file.Contents, 0, file.Contents.Length);
}
Assert.AreEqual(contents.Length, file.Contents.Length);
bool sameData = true;
for (int i = 0; i < contents.Length && sameData; i++)
{
sameData = contents[i] == file.Contents[i];
}
Assert.IsTrue(sameData);
}
It fails at Assert.IsTrue(sameData);
This is because you don't check the return value of responseStream.Read(contents, 0, (int)response.ContentLength);. Read doesn't ensure that it will read response.ContentLength bytes. Instead it returns the number of bytes read. You can use a loop or stream.CopyTo there.
Something like this:
WebResponse response = request.GetResponse();
MemoryStream m = new MemoryStream();
response.GetResponseStream().CopyTo(m);
byte[] contents = m.ToArray();
As LB already said, you need to continue to call Read() until you have read the entire stream.
Although Stream.CopyTo will copy the entire stream, it does not ensure that the expected number of bytes was read. The following method solves this and raises an IOException if it does not read the specified length:
public static void Copy(Stream input, Stream output, long length)
{
byte[] bytes = new byte[65536];
long bytesRead = 0;
int len = 0;
while (0 != (len = input.Read(bytes, 0, Math.Min(bytes.Length, (int)Math.Min(int.MaxValue, length - bytesRead)))))
{
output.Write(bytes, 0, len);
bytesRead = bytesRead + len;
}
output.Flush();
if (bytesRead != length)
throw new IOException();
}
