I am trying to retrieve HTML code from a webpage using HttpWebRequest and HttpWebResponse.
response = (HttpWebResponse)request.GetResponse();
...
Stream stream = response.GetResponseStream();
The response object has a ContentLength value of 106142. When I look at the stream object, it has a length of 65536. When reading the stream with a StreamReader using ReadToEnd(), only the first 65536 characters are returned.
How can I get the whole code?
Edit:
Using the following code segment:
catch (WebException ex)
{
errorMessage = errorMessage + ex.Message;
if (ex.Response != null)
{
if (ex.Response.ContentLength > 0)
{
using (Stream stream = ex.Response.GetResponseStream())
{
using (StreamReader reader = new StreamReader(stream))
{
string pageOutput = reader.ReadToEnd().Trim();
}
}
}
}
}
I get these values:
ex.Response.ContentLength = 106142
ex.Response.GetResponseStream().Length = 65536
stream.Length = 65536
pageOutput.Length = 65534 (because of the trim)
And yes, the code is actually truncated.
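You can find an answer in this topic: "System.Net.HttpWebResponse.GetResponseStream() returns truncated body in WebException".
By default the body of an error response is truncated at 64 KB, which matches the 65536 you are seeing. To raise the limit, change the static HttpWebRequest.DefaultMaximumErrorResponseLength property before sending the request. As far as I know the value is expressed in kilobytes, so the figure below is far more than you need, and -1 removes the limit altogether.
For example:
HttpWebRequest.DefaultMaximumErrorResponseLength = 1048576;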
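To show where the setting fits, here is a minimal sketch of raising the limit and then reading the error body from the WebException. The class and method names (ErrorBodyReader, ReadErrorBody, Fetch) are made up for illustration, and the value 1024 assumes the property is measured in kilobytes, which is worth checking against the documentation for your framework version.

```csharp
using System;
using System.IO;
using System.Net;

public static class ErrorBodyReader
{
    // Hypothetical helper: returns the body of the error response
    // attached to the exception, or null if there is no response.
    public static string ReadErrorBody(WebException ex)
    {
        if (ex.Response == null) return null;
        using (Stream stream = ex.Response.GetResponseStream())
        {
            if (stream == null) return null;
            using (StreamReader reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }
    }

    // Hypothetical helper: fetches a page and falls back to the
    // error body (or the exception message) when the request fails.
    public static string Fetch(string url)
    {
        // Assumption: the unit is kilobytes, so 1024 allows about 1 MB.
        // Set it before the request is created.
        HttpWebRequest.DefaultMaximumErrorResponseLength = 1024;

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        try
        {
            using (WebResponse response = request.GetResponse())
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException ex)
        {
            return ReadErrorBody(ex) ?? ex.Message;
        }
    }
}
```

Calling ErrorBodyReader.Fetch(url) would then return either the normal page or the error page the server sent, up to the configured limit.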
ReadToEnd does exactly that: it reads to the end of the stream. I would check that you are actually being sent the entire expected response.
There seems to be a problem when calling the GetResponseStream() method on the HttpWebResponse returned by the exception. Everything works as expected when there is no exception.
I wanted to get the HTML code from the error returned by the server.
I guess I'll have to hope the error doesn't exceed 65536 characters...
Related
I'm trying to get an image from a URL using a byte stream. But I get this error message:
This stream does not support seek operations.
This is my code:
byte[] b;
HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create(url);
WebResponse myResp = myReq.GetResponse();
Stream stream = myResp.GetResponseStream();
int i;
using (BinaryReader br = new BinaryReader(stream))
{
i = (int)(stream.Length);
b = br.ReadBytes(i); // (500000);
}
myResp.Close();
return b;
What am I doing wrong, guys?
You probably want something like this. Either checking the length fails, or the BinaryReader is doing seeks behind the scenes.
HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create(url);
byte[] b = null;
using( WebResponse myResp = myReq.GetResponse() )
using( Stream stream = myResp.GetResponseStream() )
using( MemoryStream ms = new MemoryStream() )
{
// Reuse a single buffer; Read returns 0 once the end of the stream is reached
byte[] buf = new byte[1024];
int count;
while ((count = stream.Read(buf, 0, buf.Length)) > 0)
{
// Write only the bytes that were actually read
ms.Write(buf, 0, count);
}
b = ms.ToArray();
}
edit:
I checked using Reflector, and it is the call to stream.Length that fails. GetResponseStream returns a ConnectStream, and the Length property on that class throws the exception that you saw. As other posters mentioned, you cannot reliably get the length of an HTTP response, so that makes sense.
Use a StreamReader instead:
HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create(url);
WebResponse myResp = myReq.GetResponse();
StreamReader reader = new StreamReader(myResp.GetResponseStream());
return reader.ReadToEnd();
(Note - the above returns a String instead of a byte array)
You can't reliably ask an HTTP connection for its length. It's possible to get the server to send you the length in advance, but (a) that header is often missing and (b) it's not guaranteed to be correct.
Instead you should:
Create a fixed-length byte[] that you pass to the Stream.Read method
Create a List<byte>
After each read, call List.AddRange to append the contents of your fixed-length buffer onto your byte list
Note that a call to Read can return fewer bytes than you asked for; this will almost always be true of the last call that returns data. Make sure you only append the number of bytes Read returned onto your List<byte> and not the whole byte[], or you'll get garbage in your list. Stop when Read returns 0.
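The steps above can be sketched roughly as follows. StreamBytes and ReadAllBytes are hypothetical names, and the 4096-byte default buffer is an arbitrary choice; ArraySegment<byte> is used so that only the bytes actually read are appended.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StreamBytes
{
    // Hypothetical helper: reads a stream of unknown length into a byte array
    // without ever touching stream.Length.
    public static byte[] ReadAllBytes(Stream stream, int bufferSize = 4096)
    {
        byte[] buffer = new byte[bufferSize];   // fixed-length read buffer
        List<byte> bytes = new List<byte>();    // grows as data arrives
        int count;

        // Read returns 0 only when the end of the stream has been reached.
        while ((count = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Append just the first 'count' bytes, not the whole buffer.
            bytes.AddRange(new ArraySegment<byte>(buffer, 0, count));
        }
        return bytes.ToArray();
    }
}
```

With a web response this would be called as StreamBytes.ReadAllBytes(myResp.GetResponseStream()). For large downloads a MemoryStream is likely to be cheaper than a List<byte>, but the list version follows the description more directly.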
If the server doesn't send a length specification in the HTTP header, the stream size is unknown, so you get the error when trying to use the Length property.
Read the stream in smaller chunks, until you reach the end of the stream.
With images, you don't need to read the number of bytes at all. Just do this:
Image img = null;
string path = "http://www.example.com/image.jpg";
WebRequest req = WebRequest.Create(path);
req.Credentials = CredentialCache.DefaultCredentials; // in case your URL has Windows auth
WebResponse resp = req.GetResponse();
using( Stream stream = resp.GetResponseStream() )
{
img = Image.FromStream(stream);
// then use the image
}
Perhaps you should use the System.Net.WebClient API. If you are already using client.OpenRead(url), use client.DownloadData(url) instead:
var client = new System.Net.WebClient();
byte[] buffer = client.DownloadData(url);
using (var stream = new MemoryStream(buffer))
{
// ... your code using the stream ...
}
Obviously this downloads everything before the Stream is created, so it may defeat the purpose of using a Stream. webClient.DownloadData("https://your.url") returns a byte array, which you can then wrap in a MemoryStream.
The length of a stream cannot be read from the stream, since the receiver does not know how many bytes the sender will send. Try putting a protocol on top of HTTP and sending, e.g., the length as the first item in the stream.
I have the code below, which reads an FTP response stream and writes the data to two different files (test1.html and test2.html). The second StreamReader throws a "stream was not readable" error. The response stream should be readable because it's not out of scope yet and Dispose shouldn't have been called. Can someone explain why?
static void Main(string[] args)
{
// Make sure it is ftp
if (Properties.Settings.Default.FtpEndpoint.Split(':')[0] != Uri.UriSchemeFtp) return;
// Initialize the object used to communicate with the ftp server
FtpWebRequest request = (FtpWebRequest)WebRequest.Create(Properties.Settings.Default.FtpEndpoint + "/test.html");
// Credentials
request.Credentials = new NetworkCredential(Properties.Settings.Default.FtpUser, Properties.Settings.Default.FtpPassword);
// Set command method to download
request.Method = WebRequestMethods.Ftp.DownloadFile;
// Get response
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
using (Stream output = File.OpenWrite(@"C:\Sandbox\vs_projects\FTP\FTP_Download\test1.html"))
using (Stream responseStream = response.GetResponseStream())
{
responseStream.CopyTo(output);
Console.WriteLine("Successfully wrote stream to test.html");
try
{
using (StreamReader reader = new StreamReader(responseStream))
{
string file = reader.ReadToEnd();
File.WriteAllText(@"C:\Sandbox\vs_projects\FTP\FTP_Download\test2.html", file);
Console.WriteLine("Successfully wrote stream to test2.html");
}
}
catch (Exception ex)
{
Console.WriteLine($"Exception: {ex}");
}
}
}
You can't read from the stream twice. After this call:
responseStream.CopyTo(output);
... you've already read all the data in the stream. There's nothing left to read, and you can't "rewind" the stream (e.g. seeking to the beginning) because it's a network stream. Admittedly I'd expect it to just be empty rather than throwing an error, but the details don't really matter much as it's not a useful thing to try to do.
If you want to make two copies of the same data, the best option is to copy it to disk as you're already doing, then read the file that you've just written.
(Alternatively, you could just read it into memory by copying to a MemoryStream, then you can rewind that stream and read from it repeatedly. But if you're already going to save it to disk, you might as well do that first.)
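A minimal sketch of the in-memory alternative, assuming the response is small enough to hold in memory. ReplayableStream, Buffer and ReadText are hypothetical names, and UTF-8 is an assumed encoding for the text.

```csharp
using System.IO;
using System.Text;

public static class ReplayableStream
{
    // Hypothetical helper: drains a forward-only stream (such as a network
    // response stream) into a MemoryStream that can be rewound.
    public static MemoryStream Buffer(Stream source)
    {
        MemoryStream buffered = new MemoryStream();
        source.CopyTo(buffered);
        buffered.Position = 0;   // rewind so the first reader starts at the beginning
        return buffered;
    }

    // Hypothetical helper: reads the whole buffer as text. leaveOpen keeps the
    // MemoryStream usable after the reader is disposed.
    public static string ReadText(MemoryStream buffered)
    {
        buffered.Position = 0;   // rewind before every pass
        using (StreamReader reader = new StreamReader(buffered, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the FTP example this would mean calling ReplayableStream.Buffer(responseStream) once, then copying the buffered stream to test1.html and reading it again with ReadText for test2.html.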
I have the following code:
System.Net.WebRequest req = System.Net.WebRequest.Create(url);
req.Credentials = new NetworkCredential("admin", "password");
System.Net.WebResponse resp = req.GetResponse();
System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
var result = sr.ReadToEnd().Trim();
When I run the code the result is just an empty string. However when I step through the code the result is a string with data in it, as I was expecting, when I put a breakpoint on this line:
System.Net.WebResponse resp = req.GetResponse();
So I think the problem lies with this or the subsequent line. Not sure how to proceed, help would be appreciated.
I came across a similar issue whilst using CopyToAsync() on a WebResponse. It turned out that the Stream's pointer was ending up at the end of the Stream (its position was equal to its length).
If this is the case, you can reset the pointer before reading the contents of the stream with the following...
var responseStream = resp.GetResponseStream();
responseStream.Seek(0, SeekOrigin.Begin);
var sr = new StreamReader(responseStream);
var result = sr.ReadToEnd().Trim();
Although, since you're reading the stream directly, and not copying it into a new MemoryStream, this may not apply to your case.
Maybe "req.GetResponse();" is taking more time. When you put a breakpoint there, it gets enough time to complete the task.
You need to check
resp.StatusDescription
before
System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
I'm having trouble reading a "chunked" response when using a StreamReader to read the stream returned by GetResponseStream() of a HttpWebResponse:
// response is an HttpWebResponse
StreamReader reader = new StreamReader(response.GetResponseStream());
string output = reader.ReadToEnd(); // throws exception...
When the reader.ReadToEnd() method is called I'm getting the following System.IO.IOException: Unable to read data from the transport connection: The connection was closed.
The above code works just fine when server returns a "non-chunked" response.
The only way I've been able to get it to work is to use HTTP/1.0 for the initial request (instead of HTTP/1.1, the default) but this seems like a lame work-around.
Any ideas?
@Chuck
Your solution works pretty well. It still throws the same IOException on the last Read(). But after inspecting the contents of the StringBuilder it looks like all the data has been received. So perhaps I just need to wrap the Read() in a try-catch and swallow the "error".
I haven't tried this with a "chunked" response, but would something like this work?
StringBuilder sb = new StringBuilder();
Byte[] buf = new byte[8192];
Stream resStream = response.GetResponseStream();
string tmpString = null;
int count = 0;
do
{
count = resStream.Read(buf, 0, buf.Length);
if(count != 0)
{
tmpString = Encoding.ASCII.GetString(buf, 0, count);
sb.Append(tmpString);
}
}while (count > 0);
I am working on a similar problem. The .NET HttpWebRequest and HttpWebResponse handle cookies and redirects automatically but they do not handle chunked content on the response body automatically.
This is perhaps because chunked content may contain more than simple data (i.e.: chunk names, trailing headers).
Simply reading the stream and ignoring the EOF exception will not work, as the stream contains more than the desired content. The stream will contain chunks, and each chunk begins by declaring its size. If the stream is simply read from beginning to end, the final data will contain the chunk metadata (and where the content is gzipped, it will fail the CRC check when decompressing).
To solve the problem it is necessary to manually parse the stream, removing the chunk size from each chunk (as well as the CR LF delimiters), detecting the final chunk and keeping only the chunk data. There is likely a library out there somewhere that does this; I have not found it yet.
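The parsing described above could look roughly like this, assuming the stream you hold really does contain the raw chunk framing (for example when reading directly from a socket). ChunkedDecoder is a hypothetical name; the sketch drops chunk extensions and ignores trailing headers rather than returning them.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class ChunkedDecoder
{
    // Hypothetical helper: strips the chunk framing from a raw
    // "Transfer-Encoding: chunked" body and returns only the payload bytes.
    public static byte[] Decode(Stream raw)
    {
        MemoryStream body = new MemoryStream();
        while (true)
        {
            // Each chunk starts with its size in hexadecimal, optionally
            // followed by ";extension" text, and ends with CR LF.
            string sizeLine = ReadLine(raw);
            int semicolon = sizeLine.IndexOf(';');
            if (semicolon >= 0) sizeLine = sizeLine.Substring(0, semicolon);
            int size = int.Parse(sizeLine.Trim(), NumberStyles.HexNumber, CultureInfo.InvariantCulture);

            if (size == 0) break;   // zero-length chunk marks the end; trailers are ignored

            // Read exactly 'size' bytes; a single Read may return fewer.
            byte[] chunk = new byte[size];
            int read = 0;
            while (read < size)
            {
                int n = raw.Read(chunk, read, size - read);
                if (n == 0) throw new EndOfStreamException("Stream ended inside a chunk.");
                read += n;
            }
            body.Write(chunk, 0, size);

            ReadLine(raw);          // consume the CR LF that follows the chunk data
        }
        return body.ToArray();
    }

    // Reads bytes up to the next LF and returns the line without CR LF.
    private static string ReadLine(Stream raw)
    {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = raw.ReadByte()) != -1)
        {
            if (b == '\n') return line.ToString();
            if (b != '\r') line.Append((char)b);
        }
        if (line.Length == 0) throw new EndOfStreamException("Stream ended before the final chunk.");
        return line.ToString();
    }
}
```

A stream that is missing its final zero-length chunk makes this sketch throw EndOfStreamException; a caller that wants to tolerate such servers would need to catch that and decide whether to keep the partial data.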
Useful resources:
http://en.wikipedia.org/wiki/Chunked_transfer_encoding
https://www.rfc-editor.org/rfc/rfc2616#section-3.6.1
I've had the same problem (which is how I ended up here :-). Eventually tracked it down to the fact that the chunked stream wasn't valid - the final zero length chunk was missing. I came up with the following code which handles both valid and invalid chunked streams.
using (StreamReader sr = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
StringBuilder sb = new StringBuilder();
try
{
while (!sr.EndOfStream)
{
sb.Append((char)sr.Read());
}
}
catch (System.IO.IOException)
{ }
string content = sb.ToString();
}
After trying a lot of snippets from StackOverflow and Google, ultimately I found this to work the best (assuming you know the data is a UTF-8 string; if not, you can just keep the byte array and process it appropriately):
byte[] data;
var responseStream = response.GetResponseStream();
var reader = new StreamReader(responseStream, Encoding.UTF8);
data = Encoding.UTF8.GetBytes(reader.ReadToEnd());
// Decode with the same encoding that produced the bytes; Encoding.Default may be a different code page
return Encoding.UTF8.GetString(data);
I found other variations work most of the time, but occasionally truncate the data. I got this snippet from:
https://social.msdn.microsoft.com/Forums/en-US/4f28d99d-9794-434b-8b78-7f9245c099c4/problems-with-httpwebrequest-and-transferencoding-chunked?forum=ncl
It is funny. While playing with the request headers, I removed "Accept-Encoding: gzip,deflate", and the server in my use case then answered in plain ASCII and no longer with chunked, encoded snippets. Maybe you should give it a try and leave "Accept-Encoding: gzip,deflate" out. The idea came while reading the compression section of the wiki article mentioned above.