Using dataSet.ReadXml when the server is continuously pushing xml - c#

I'm connecting (using HttpWebRequest and HttpWebResponse) to a server that is continuously pushing XML data. What I need is to read the data and store it in a DataSet as it arrives.
Currently this is what I do:
HttpWebRequest httpRequest;
string url = "something";
System.Data.DataSet dataSet = new System.Data.DataSet();
httpRequest = (HttpWebRequest)WebRequest.Create(url);
httpRequest.Credentials = new NetworkCredential("username", "pass");
ServicePointManager.ServerCertificateValidationCallback = ((sender1, certificate, chain, sslPolicyErrors) => true);
byte[] buf = new byte[1024];
HttpWebResponse response;
int count = -1;
string read = string.Empty;
response = (HttpWebResponse)httpRequest.GetResponse();
do
{
count = response.GetResponseStream().Read(buf, 0, buf.Length);
read += Encoding.UTF8.GetString(buf, 0, count);
dataSet.ReadXml(new MemoryStream(System.Text.Encoding.UTF8.GetBytes(read)));
} while (response.GetResponseStream().CanRead && count != 0);
But since only part of the XML data is received on each pass through the loop, ReadXml throws an exception.
What can I do to get this problem solved?

If by "continuously" you mean without ever ending, then you should not use a DataSet, since this is not designed to be used like this.
That being said, the problem you're having is that the network data you're reading can be arbitrarily cut to pieces, so that you don't even get valid XML fragments back. Instead, you should use an XmlReader on the response stream and read from that, since this will put the pieces back together.
However, directly passing this into ReadXml() will not do the trick because it would never stop. So what you'll probably end up with is some simple code reading the parsed fragments from the XmlReader, and insert these manually into whatever data storage you want.
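A minimal sketch of that idea, under some assumptions: the server sends repeated elements with a known name (the name "record" used in the usage note is hypothetical), either as a sequence of top-level elements or inside one never-ending root. XmlStreamSketch and ReadRecords are illustrative names, not part of any library. The reader blocks until an element has been closed, so each yielded XElement should be complete:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlStreamSketch
{
    // Yields each element called elementName once the parser has read it completely.
    public static IEnumerable<XElement> ReadRecords(Stream source, string elementName)
    {
        var settings = new XmlReaderSettings
        {
            // Assumption: tolerate a sequence of top-level elements as well as a single root.
            ConformanceLevel = ConformanceLevel.Fragment,
            IgnoreWhitespace = true
        };
        using (XmlReader reader = XmlReader.Create(source, settings))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

With the code from the question this would be called as foreach (var record in XmlStreamSketch.ReadRecords(response.GetResponseStream(), "record")) { ... }, and each record can then be copied into a DataTable row or any other storage by hand.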

Have you tried putting the line
dataSet.ReadXml(new MemoryStream(System.Text.Encoding.UTF8.GetBytes(read)));
outside the do-while loop, and then having the while-clause only evaluate count:
while (count > 0);
It could also be helpful to get the instance of the response stream outside the loop. There might be unforeseen consequences of calling GetResponseStream() on every iteration.

Related

How to remove last line space streamwriter?

I'm currently facing an issue while creating a file: I'm writing text content with the StreamWriter class, but I'm not getting the expected output.
My C# code looks like this:
public void ProcessRequest(HttpContext context)
{
// Create a connexion to the Remote Server to redirect all requests
RemoteServer server = new RemoteServer(context);
// Create a request with same data in navigator request
HttpWebRequest request = server.GetRequest();
// Send the request to the remote server and return the response
HttpWebResponse response = server.GetResponse(request);
context.Response.AddHeader("Content-Disposition", "attachment; filename=playlist.m3u8");
context.Response.ContentType = response.ContentType;
Stream receiveStream = response.GetResponseStream();
var buff = new byte[1024];
int bytes = 0;
string token = Guid.NewGuid().ToString();
while ((bytes = receiveStream.Read(buff, 0, 1024)) > 0)
{
//Write the stream directly to the client
context.Response.OutputStream.Write(buff, 0, bytes);
context.Response.Write("&token="+token);
}
//close streams
response.Close();
context.Response.End();
}
The output of the above code looks like this:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=20776,CODECS="avc1.66.41",RESOLUTION=320x240
chunk.m3u8?nimblesessionid=62
&token=42712adc-f932-43c7-b282-69cf349941da
But my expected output is:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=20776,CODECS="avc1.66.41",RESOLUTION=320x240
chunk.m3u8?nimblesessionid=62&token=42712adc-f932-43c7-b282-69cf349941da
I just want the token parameter on the same line instead of on a new line.
Thank you.
If you want to simply remove a newline at the end of the received bytes, change the code in your while loop like so:
while ((bytes = receiveStream.Read(buff, 0, 1024)) > 0)
{
if (buff[bytes-1] == 0x0a)
bytes -= 1;
//Write the stream directly to the client
context.Response.OutputStream.Write(buff, 0, bytes);
context.Response.Write("&token="+token);
}
Several caveats:
It will only work if 0x0a (newline byte, '\n' as a character) is at the end of the bytes you received. If for some reason the message sent by the server is received in several blocks, you will first have to make sure you received everything there is to receive before checking the last byte.
Please also note that this would result in multiple &token=... lines in your current code.
Depending on the server, it might use carriage return (0x0d or '\r') as its line ending byte, or even both. Check what the server sends and adapt the code accordingly.
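To sidestep the first two caveats, one option is to buffer the whole response first, trim the trailing line ending once, and only then append the token. The following is a rough sketch under the assumption that the playlist is small and that the token belongs on the last line only, as in the question; PlaylistSketch and AppendToken are made-up names:

```csharp
using System;
using System.IO;
using System.Text;

public static class PlaylistSketch
{
    // Reads the whole source, drops trailing '\r' / '\n' bytes and appends "&token=<token>".
    public static byte[] AppendToken(Stream source, string token)
    {
        using (var buffered = new MemoryStream())
        {
            source.CopyTo(buffered);               // collect every block before inspecting the end
            byte[] data = buffered.GetBuffer();
            int length = (int)buffered.Length;

            // Strip any mix of LF (0x0a) and CR (0x0d) at the very end.
            while (length > 0 && (data[length - 1] == 0x0a || data[length - 1] == 0x0d))
            {
                length--;
            }

            byte[] suffix = Encoding.ASCII.GetBytes("&token=" + token);
            var result = new byte[length + suffix.Length];
            Buffer.BlockCopy(data, 0, result, 0, length);
            Buffer.BlockCopy(suffix, 0, result, length, suffix.Length);
            return result;
        }
    }
}
```

In the handler, the returned array would be written once with context.Response.OutputStream.Write(result, 0, result.Length) in place of the while loop.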

Retrieving entire line from a socket in C#?

I have a simple client-server system sending plain text - though only commands that have been approved. The server is a Python system - and I've confirmed proper connections.
However, the client is C# - in Unity. Searching for examples, I stumbled across this bit of code. It does seem to do what I want, however, only partially:
public String readSocket()
{
if (!socketReady)
return "";
if (theStream.DataAvailable)
return theReader.ReadLine();
return "";
}
The strings I am sending end with \n, but I'm only getting half the message like this:
Message A:
claim_2
Message B:
_20_case
claim_1
I know this probably has to do with how I'm directly reading the line but I cannot find any better examples - strangely enough, everyone seems to point back at this snippet even when multiple people point out the problems.
Can anything be done to fix this bit of code properly?
In case it helps, I'm sending the information (from my Python server) out like this:
action = str(command) + "_" + str(x) + "_" + str(userid) + "_" + str(user)
cfg.GameSendConnection.sendall((action + "\n").encode("utf-8"))
When you do sockets programming, it is important to note that data might not be
available in one piece. In fact, this is exactly what you are seeing. Your
messages are being broken up.
So why does ReadLine not wait until there's a line to read?
Here's some simple sample code:
var stream = new MemoryStream();
var reader = new StreamReader(stream);
var writer = new StreamWriter(stream) { AutoFlush = true };
writer.Write("foo");
stream.Seek(0, SeekOrigin.Begin);
Console.WriteLine(reader.ReadLine());
Note that there is no newline at the end. Still, the output of this little
snippet is foo.
ReadLine returns the string up to the first line break, or up to the end of the
available data if there is no line break. The exception is a stream that has no
more data to read at all: in that case it returns null.
When a NetworkStream has its DataAvailable property return true, it has
data. But as mentioned before, there is no guarantee whatsoever about what that
data is. It might be a single byte. Or a part of a message. Or a full message
plus part of the next message. Note that depending on the encoding, it could
even be possible to receive only part of a character. Not all character
encodings have all characters be at most a single byte. This includes UTF-8, which cfg.GameSendConnection.sendall((action + "\n").encode("utf-8")) sends.
How to solve this? Read bytes, not lines. Put them in some buffer. After every
read, check if the buffer contains a newline. If it does, you now have a full
message to handle. Remove the message up to and including the newline from the
buffer and keep appending new data to it until the next newline is received. And
so on.
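A small sketch of such a buffer, working on bytes so that a UTF-8 character split across two reads is only decoded once its line is complete. LineBuffer is a made-up helper, and the sketch assumes '\n' is the only delimiter, as in the Python sender above:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class LineBuffer
{
    // Bytes received so far that do not yet end in a newline.
    private readonly List<byte> _pending = new List<byte>();

    // Appends the first 'count' bytes of 'data' and returns every line completed by them.
    public List<string> Append(byte[] data, int count)
    {
        var lines = new List<string>();
        for (int i = 0; i < count; i++)
        {
            if (data[i] == (byte)'\n')
            {
                // 0x0A never occurs inside a multi-byte UTF-8 sequence,
                // so splitting at the byte level is safe.
                lines.Add(Encoding.UTF8.GetString(_pending.ToArray()));
                _pending.Clear();
            }
            else
            {
                _pending.Add(data[i]);
            }
        }
        return lines;
    }
}
```

Each call to stream.Read(buffer, 0, buffer.Length) would be followed by Append(buffer, bytesRead); whatever comes back is a list of whole messages, and an unfinished tail simply waits for the next read.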
This is how I process entire lines in a similar application. The code is very simple and yours may differ, but it shows the idea.
private string incompleteRecord = "";
public void ReadSocket()
{
if (_networkStream.DataAvailable)
{
var buffer = new byte[8192];
var receivedString = new StringBuilder();
do
{
int numberOfBytesRead = _networkStream.Read(buffer, 0, buffer.Length);
receivedString.AppendFormat("{0}", Encoding.UTF8.GetString(buffer, 0, numberOfBytesRead));
} while (_networkStream.DataAvailable);
var bulkMsg = receivedString.ToString();
// When you receive data from the socket, you can receive any number of messages at a time
// with no guarantee that the last message you receive will be complete.
// You can receive only part of a complete message, with next part coming
// with the next call. So, we need to save any partial messages and add
// them to the beginning of the data next time.
bulkMsg = incompleteRecord + bulkMsg;
// clear incomplete record so it doesn't get processed next time too.
incompleteRecord = "";
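// loop through the data, breaking it apart into lines by delimiter ('\n')
while (bulkMsg.Length > 0)
{
var newLinePos = bulkMsg.IndexOf('\n');
// >= 0 so that a newline at position 0 (an empty line) is consumed instead of being kept as an incomplete record forever
if (newLinePos >= 0)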
{
var line = bulkMsg.Substring(0, newLinePos);
// Do whatever you want with your line here ...
// ProcessYourLine(line)
// Move to the next message.
bulkMsg = bulkMsg.Substring(line.Length + 1);
}
else
{
// there are no more newline delimiters
// so we save the rest of the message (if any) for processing with the next batch of received data.
incompleteRecord = bulkMsg;
bulkMsg = "";
}
}
}
}

C# : Dealing with HttpWebResponse timeout problems

I have a big problem dealing with data I try to download in my application over the internet via HttpWebResponse. My code looks like this:
myWebRequest.Timeout = 10000;
using (HttpWebResponse myWebResponse = (HttpWebResponse)myWebRequest.GetResponse())
{
using (Stream ReceiveStream = myWebResponse.GetResponseStream())
{
Encoding encode = Encoding.GetEncoding("utf-8");
StreamReader readStream = new StreamReader(ReceiveStream, encode);
// Read 1024 characters at a time.
Char[] read = new Char[1024];
int count = readStream.Read(read, 0, 1024);
int break_counter = 0;
while (count > 0 && break_counter < 10000)
{
String str = new String(read, 0, count);
buffer += str;
count = readStream.Read(read, 0, 1024);
break_counter++;
}
}
}
This code runs in a few instances in separated threads so it's a little bit hard to debug. The problem is this method got stuck and I blame it on the poor connection to the data.
As you can see I already set a timeout and was hoping the code would just terminate after the timeout time has expired. It does not! At least not all the time. Sometimes I get a WebException/Timeout but a few times it just got stuck.
What is a timeout exactly? When is it called?
Let's say the HttpWebResponse starts to receive data but gets stuck somewhere in the middle of the transmission. Do I get a timeout? It looks like I don't, because my application gets stuck too and no timeout exception is raised.
What can I do to patch this up or how can I get further information about what is going wrong here?
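Try setting the HttpWebRequest.ReadWriteTimeout property. Timeout only covers the GetResponse (and GetRequestStream) call; reads on the response stream are governed by ReadWriteTimeout. From the documentation:
The number of milliseconds before the writing or reading times out. The default value is 300,000 milliseconds (5 minutes).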
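As a sketch, both limits can be set together when the request is created. The 10-second values are taken from the question and are only an example; TimeoutSketch and CreateRequest are hypothetical names:

```csharp
using System;
using System.Net;

public static class TimeoutSketch
{
    // Creates a request whose connect/response phase and stream reads are both time-limited.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = 10000;          // limit for GetResponse() / GetRequestStream()
        request.ReadWriteTimeout = 10000; // limit for each Read/Write on the request and response streams
        return request;
    }
}
```

With this in place, a Read on the response stream that stalls for longer than the limit should fail with an exception instead of blocking for the default five minutes, so the read loop needs a try/catch around it.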

How do I download a large file (via HTTP) in .NET?

I need to download a large file (2 GB) over HTTP in a C# console application. Problem is, after about 1.2 GB, the application runs out of memory.
Here's the code I'm using:
WebClient request = new WebClient();
request.Credentials = new NetworkCredential(username, password);
byte[] fileData = request.DownloadData(baseURL + fName);
As you can see... I'm reading the file directly into memory. I'm pretty sure I could solve this if I were to read the data back from HTTP in chunks and write it to a file on disk.
How could I do this?
If you use WebClient.DownloadFile you could save it directly into a file.
The WebClient class is the one for simplified scenarios. Once you get past simple scenarios (and you have), you'll have to fall back a bit and use WebRequest.
With WebRequest, you'll have access to the response stream, and you'll be able to loop over it, reading a bit and writing a bit, until you're done.
From the Microsoft documentation:
We don't recommend that you use WebRequest or its derived classes for
new development. Instead, use the System.Net.Http.HttpClient class.
Source: learn.microsoft.com/WebRequest
Example:
public void MyDownloadFile(Uri url, string outputFilePath)
{
const int BUFFER_SIZE = 16 * 1024;
using (var outputFileStream = File.Create(outputFilePath, BUFFER_SIZE))
{
var req = WebRequest.Create(url);
using (var response = req.GetResponse())
{
using (var responseStream = response.GetResponseStream())
{
var buffer = new byte[BUFFER_SIZE];
int bytesRead;
do
{
bytesRead = responseStream.Read(buffer, 0, BUFFER_SIZE);
outputFileStream.Write(buffer, 0, bytesRead);
} while (bytesRead > 0);
}
}
}
}
Note that if WebClient.DownloadFile works, then I'd call it the best solution. I wrote the above before the "DownloadFile" answer was posted. I also wrote it way too early in the morning, so a grain of salt (and testing) may be required.
You need to get the response stream and then read in blocks, writing each block to a file to allow memory to be reused.
As you have written it, the whole response, all 2GB, needs to be in memory. Even on a 64bit system that will hit the 2GB limit for a single .NET object.
Update: an easier option is to let WebClient do the work for you. Its DownloadFile method puts the data directly into a file.
WebClient.OpenRead returns a Stream, just use Read to loop over the contents, so the data is not buffered in memory but can be written in blocks to a file.
I would use something like this:
The connection can be interrupted, so it is better to download the file in small chunks.
Akka Streams can help download a file in small chunks from a System.IO.Stream using multithreading. https://getakka.net/articles/intro/what-is-akka.html
The Download method appends the bytes to the file, starting at the offset fileStart (a long). If the file does not exist, fileStart must be 0.
using Akka.Actor;
using Akka.IO;
using Akka.Streams;
using Akka.Streams.Dsl;
using Akka.Streams.IO;
private static Sink<ByteString, Task<IOResult>> FileSink(string filename)
{
return Flow.Create<ByteString>()
.ToMaterialized(FileIO.ToFile(new FileInfo(filename), FileMode.Append), Keep.Right);
}
private async Task Download(string path, Uri uri, long fileStart)
{
using (var system = ActorSystem.Create("system"))
using (var materializer = system.Materializer())
{
HttpWebRequest request = WebRequest.Create(uri) as HttpWebRequest;
request.AddRange(fileStart);
using (WebResponse response = request.GetResponse())
{
Stream stream = response.GetResponseStream();
await StreamConverters.FromInputStream(() => stream, chunkSize: 1024)
.RunWith(FileSink(path), materializer);
}
}
}

Reading "chunked" response with HttpWebResponse

I'm having trouble reading a "chunked" response when using a StreamReader to read the stream returned by GetResponseStream() of a HttpWebResponse:
// response is an HttpWebResponse
StreamReader reader = new StreamReader(response.GetResponseStream());
string output = reader.ReadToEnd(); // throws exception...
When the reader.ReadToEnd() method is called I'm getting the following System.IO.IOException: Unable to read data from the transport connection: The connection was closed.
The above code works just fine when server returns a "non-chunked" response.
The only way I've been able to get it to work is to use HTTP/1.0 for the initial request (instead of HTTP/1.1, the default) but this seems like a lame work-around.
Any ideas?
@Chuck
Your solution works pretty well. It still throws the same IOException on the last Read(), but after inspecting the contents of the StringBuilder it looks like all the data has been received. So perhaps I just need to wrap the Read() in a try-catch and swallow the "error".
I haven't tried this with a "chunked" response, but would something like this work?
StringBuilder sb = new StringBuilder();
Byte[] buf = new byte[8192];
Stream resStream = response.GetResponseStream();
string tmpString = null;
int count = 0;
do
{
count = resStream.Read(buf, 0, buf.Length);
if(count != 0)
{
tmpString = Encoding.ASCII.GetString(buf, 0, count);
sb.Append(tmpString);
}
}while (count > 0);
I am working on a similar problem. The .NET HttpWebRequest and HttpWebResponse handle cookies and redirects automatically but they do not handle chunked content on the response body automatically.
This is perhaps because chunked content may contain more than simple data (i.e.: chunk names, trailing headers).
Simply reading the stream and ignoring the EOF exception will not work, as the stream contains more than the desired content. The stream will contain chunks, and each chunk begins by declaring its size. If the stream is simply read from beginning to end, the final data will contain the chunk metadata (and in the case of gzipped content it will fail the CRC check when decompressing).
To solve the problem it is necessary to parse the stream manually: remove the chunk size from each chunk (as well as the CR LF delimiters), detect the final chunk, and keep only the chunk data. There is likely a library out there somewhere that does this, but I have not found it yet.
Useful resources:
http://en.wikipedia.org/wiki/Chunked_transfer_encoding
https://www.rfc-editor.org/rfc/rfc2616#section-3.6.1
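The manual parsing described above could look roughly like the sketch below. It assumes you really are holding the raw, still-framed body (for example from a plain socket); if the stream you get has already been de-chunked by the HTTP stack, none of this is needed. ChunkedSketch and Decode are made-up names, chunk extensions are ignored, and trailing headers are not parsed:

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class ChunkedSketch
{
    // Strips chunk sizes and CR LF delimiters and returns only the chunk data.
    public static byte[] Decode(Stream raw)
    {
        using (var body = new MemoryStream())
        {
            while (true)
            {
                string sizeLine = ReadLine(raw);
                int semicolon = sizeLine.IndexOf(';');      // optional chunk extension
                if (semicolon >= 0) sizeLine = sizeLine.Substring(0, semicolon);
                int size = int.Parse(sizeLine.Trim(), NumberStyles.HexNumber, CultureInfo.InvariantCulture);
                if (size == 0) break;                       // final zero-length chunk

                var chunk = new byte[size];
                int read = 0;
                while (read < size)                         // Read may return fewer bytes than asked for
                {
                    int n = raw.Read(chunk, read, size - read);
                    if (n == 0) throw new EndOfStreamException("Stream ended inside a chunk.");
                    read += n;
                }
                body.Write(chunk, 0, size);
                ReadLine(raw);                              // consume the CR LF after the chunk data
            }
            return body.ToArray();
        }
    }

    // Reads ASCII characters up to LF, dropping CR; throws if the stream ends first.
    private static string ReadLine(Stream raw)
    {
        var line = new StringBuilder();
        int b;
        while ((b = raw.ReadByte()) != -1 && b != '\n')
        {
            if (b != '\r') line.Append((char)b);
        }
        if (b == -1) throw new EndOfStreamException("Stream ended before a line terminator.");
        return line.ToString();
    }
}
```

A stream that is missing its final zero-length chunk ends up in the EndOfStreamException path, so the caller can decide whether to keep what was decoded so far.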
I've had the same problem (which is how I ended up here :-). Eventually tracked it down to the fact that the chunked stream wasn't valid - the final zero length chunk was missing. I came up with the following code which handles both valid and invalid chunked streams.
using (StreamReader sr = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
StringBuilder sb = new StringBuilder();
try
{
while (!sr.EndOfStream)
{
sb.Append((char)sr.Read());
}
}
catch (System.IO.IOException)
{ }
string content = sb.ToString();
}
After trying a lot of snippets from StackOverflow and Google, I ultimately found this to work best (assuming you know the data is a UTF-8 string; if not, you can just keep the byte array and process it appropriately):
byte[] data;
var responseStream = response.GetResponseStream();
var reader = new StreamReader(responseStream, Encoding.UTF8);
data = Encoding.UTF8.GetBytes(reader.ReadToEnd());
return Encoding.Default.GetString(data.ToArray());
I found other variations work most of the time, but occasionally truncate the data. I got this snippet from:
https://social.msdn.microsoft.com/Forums/en-US/4f28d99d-9794-434b-8b78-7f9245c099c4/problems-with-httpwebrequest-and-transferencoding-chunked?forum=ncl
Funnily enough, while playing with the request headers I removed "Accept-Encoding: gzip,deflate", and the server in my use case then answered in plain ASCII and no longer with chunked, encoded snippets. Maybe you should give it a try and leave "Accept-Encoding: gzip,deflate" out. The idea came from the section on compression in the wiki article mentioned above.
