StreamReader and reading an XML file - c#

I get a response from a web server using StreamReader, and now I want to parse this response (it's an XML document) to get its values, but every time I try I get an error: Root element is missing.
If I read the same XML file directly, the file is well formatted and I can read it.
This is the stream:
WebResponse response = webRequest.GetResponse();
Stream responseStream = response.GetResponseStream();
StreamReader responseReader = new StreamReader(responseStream);
string responseString = responseReader.ReadToEnd();
And this is how I try to read the XML file:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(responseReader);
XmlNodeList address = xmlDoc.GetElementsByTagName("original");

You have called ReadToEnd(), hence consumed all the data (into a string). This means the reader has nothing more to give. Just: don't do that. Or, do that and use LoadXml(responseString).
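A minimal sketch of the two options described above. The class and method names are hypothetical, and in the test a StringReader stands in for the StreamReader wrapped around the web response. Either way, the reader is consumed exactly once.

```csharp
using System.IO;
using System.Xml;

public static class ResponseXml
{
    // Option 1: pass the still-unread reader straight to XmlDocument.Load.
    public static XmlDocument LoadFromReader(TextReader reader)
    {
        var doc = new XmlDocument();
        doc.Load(reader);
        return doc;
    }

    // Option 2: read everything into a string once, then parse that string.
    public static XmlDocument LoadFromString(TextReader reader)
    {
        string xml = reader.ReadToEnd();
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }
}
```

Calling ReadToEnd() on the reader first and then passing the same reader to LoadFromReader reproduces the "Root element is missing" error, because the parser sees an empty input.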

The Load method is capable of fetching XML documents from remote resources. So you could simplify your code like this:
var xmlDoc = new XmlDocument();
xmlDoc.Load("http://example.com/foo.xml");
var address = xmlDoc.GetElementsByTagName("original");
There is no need for any WebRequest, WebResponse, or StreamReader objects (which, by the way, you didn't dispose of properly). If this doesn't work, it's probably because the remote document is not well-formed XML.

If you do it with the exact code you pasted in your question, then the problem is that you first read the whole stream into a string, and then try to read the stream again when calling
xmlDoc.Load(responseReader)
If you have already read the whole stream into the string, use that string to create the XML document. Note that Load(string) treats its argument as a file name or URL, so the string contents must go through LoadXml:
xmlDoc.LoadXml(responseString)

Check the content of responseString: it probably contains some additional headers that make the XML parser unhappy.

The error you are getting means that the XML you receive lacks a single root element wrapping the whole content. Try wrapping the response you receive in an element, for example:
WebResponse response = webRequest.GetResponse();
Stream responseStream = response.GetResponseStream();
StreamReader responseReader = new StreamReader(responseStream);
string responseString = responseReader.ReadToEnd();
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml("<root>" + responseString + "</root>");
XmlNodeList address = xmlDoc.GetElementsByTagName("original");
Hope this helped

Related

.NET won't download full XML response from REST API

When downloading an XML response from a REST API, I cannot get .NET to download the full XML document on many requests. In each case, I'm missing the last several characters of the XML file which means I can't parse it. The requests work fine in a browser.
I have tried WebResponse.GetResponseStream() using a StreamReader. Within the StreamReader I have tried Read(...) with a buffer, ReadLine(), and ReadToEnd() to build a string for the response. Wondering if there was a bug in my code, I also tried WebClient.DownloadString(url) with the same result and XmlDocument.Load(url) which just throws an exception (unexpected end of file while parsing ____).
I know for a fact that this API has had some encoding issues in the past, so I've tried specifying multiple different encodings (e.g., UTF-8, iso-8859-1) for the StreamReader as well as letting .NET detect the encoding. Changing the encoding seems to result in a different number of characters that get left off the end.
Is there any way I can detect the proper encoding myself? How does a browser do it? Is there somewhere in any browser to see the actual encoding the response is using (not what the HTTP headers say it's returning)? Any other methods of getting a string response from a web site with an unknown encoding?
StreamReader sample code
StringBuilder sb = new StringBuilder();
using (resp = (HttpWebResponse)req.GetResponse())
{
    using (Stream stream = resp.GetResponseStream())
    {
        using (StreamReader sr = new StreamReader(stream))
        {
            int charsRead = 1;
            char[] buffer = new char[4096];
            while (charsRead > 0)
            {
                charsRead = sr.Read(buffer, 0, buffer.Length);
                sb.Append(buffer, 0, charsRead);
            }
        }
    }
}
WebClient sample code
WebClient wc = new WebClient();
string text = wc.DownloadString(url);
XmlDocument sample code
XmlDocument doc = new XmlDocument();
doc.Load(url);

Read files from Stream

I posted two files to my custom web service. Now I need to read this stream into separate files.
I've sent an XML file and a text file, and in the web service I read them as follows:
StreamReader stream = new StreamReader(HttpContext.Current.Request.InputStream);
string xmls = stream.ReadToEnd();
I get the stream as a string like this (I've also included the boundaries):
"\r\n------------------------------8cfd42d26566ff0\r\nContent-Disposition: form-data; name=\"uplTheFile\"; filename=\"E:\\AJ\\Demo\\EktronSite2\\XMLFiles\\XMLFile.xml\"\r\n Content-Type: application/octet-stream\r\n\r\n<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n<note xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" xmlns=\"http://tempuri.org/XMLFile.xsd\">\r\n <ID>101</ID>\r\n</note>\r\n------------------------------8cfd42d26566ff0\r\nContent-Disposition: form-data; name=\"uplTheFile\"; filename=\"C:\\TEMP\\log.txt\"\r\n Content-Type: application/octet-stream\r\n\r\nABCDEFGHIJKLMNOPQRSTUVWXYZ\r\n------------------------------8cfd42d26566ff0\r\n"
I need to read the same into different files. For example I need the XML read to XmlDocument type and the text to .docx.
Thanks in advance.
Have you tried this?
Stream xmlStream = System.Web.HttpContext.Current.Request.Files[0].InputStream;
Stream txtStream = System.Web.HttpContext.Current.Request.Files[1].InputStream;
I would suggest using the following function:
System.IO.File.WriteAllText(string path, string contents)
It basically dumps your string into a file.
You can use this code:
var memoryStream = new MemoryStream();
await HttpContext.Request.Body.CopyToAsync(memoryStream);

StreamReader returns empty string

I have the following code:
System.Net.WebRequest req = System.Net.WebRequest.Create(url);
req.Credentials = new NetworkCredential("admin", "password");
System.Net.WebResponse resp = req.GetResponse();
System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
var result = sr.ReadToEnd().Trim();
When I run the code the result is just an empty string. However when I step through the code the result is a string with data in it, as I was expecting, when I put a breakpoint on this line:
System.Net.WebResponse resp = req.GetResponse();
So I think the problem lies with this or the subsequent line. Not sure how to proceed, help would be appreciated.
I came across a similar issue whilst using CopyToAsync() on a WebResponse; it turned out that the stream's pointer was ending up at the end of the stream (its position was equal to its length).
If this is the case, you can reset the pointer before reading the contents of the stream with the following:
var responseStream = resp.GetResponseStream();
responseStream.Seek(0, SeekOrigin.Begin);
var sr = new StreamReader(responseStream);
var result = sr.ReadToEnd().Trim();
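Although, since you're reading the response stream directly and not copying it into a new MemoryStream, this may not apply to your case. Note also that Seek only works on seekable streams (CanSeek is true), which a network response stream typically is not, so the reset is mainly useful on a MemoryStream copy.
Maybe req.GetResponse() is taking more time. When you put a breakpoint there, it gets enough time to complete the task.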
You need to check
resp.StatusDescription
before
System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());

How can I save HtmlDocument to memory? Html Agility Pack

I am using HTML Agility Pack to parse an HTML document, make a change to a node, and then save the HTML document. I would like to save the document to memory so I can write the HTML out as a string later in the application. My current implementation always returns a string == "". I can see that the HtmlDocument object is not empty when debugging. Can someone provide some insight?
private string InitializeHtml(HtmlDocument htmlDocument)
{
    string currentUserName = User.Identity.Name;
    HtmlNode scriptTag = htmlDocument.DocumentNode.SelectSingleNode("//script[@id ='HwInitialize']");
    scriptTag.InnerHtml =
        string.Format("org.myorg.application = {{}}; org.myorg.application.init ={{uid:\"{0}\", application:\"testPortal\"}};",currentUserName);
    MemoryStream memoryStream = new MemoryStream();
    htmlDocument.Save(memoryStream);
    StreamReader streamReader = new StreamReader(memoryStream);
    return streamReader.ReadToEnd();
}
Try
memoryStream.Seek(0, System.IO.SeekOrigin.Begin);
before creating the StreamReader and calling ReadToEnd().
The stream pointer is likely getting left at the end of the stream by the Save method (it's best practise for a component to do this - in case you want to append more data to the stream) therefore when you call ReadToEnd, it's already at the end and nothing gets read.
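The rewind can be seen in isolation with a small sketch that does not depend on HTML Agility Pack. The class and method names are hypothetical, and a plain Write call stands in for htmlDocument.Save(memoryStream), on the assumption that Save likewise leaves the position at the end of the stream.

```csharp
using System.IO;
using System.Text;

public static class MemoryStreamRewind
{
    // Writes text into a MemoryStream, rewinds it, and reads the text back.
    public static string WriteThenRead(string text)
    {
        using (var memoryStream = new MemoryStream())
        {
            // Stand-in for htmlDocument.Save(memoryStream): after writing,
            // Position equals Length, so a read from here returns nothing.
            byte[] bytes = Encoding.UTF8.GetBytes(text);
            memoryStream.Write(bytes, 0, bytes.Length);

            // Rewind to the start before handing the stream to a reader.
            memoryStream.Position = 0;

            using (var reader = new StreamReader(memoryStream, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Setting Position = 0 is equivalent to Seek(0, SeekOrigin.Begin); removing that line makes the method return an empty string, which matches the behavior in the question.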

OutOfMemoryException loading xml document winmo 6.1

I am using C# on a Windows Mobile 6.1 device with Compact Framework 3.5.
I am getting an OutOfMemoryException when loading a large string of XML.
The handset has limited memory, but should be more than enough to handle the size of the xml string. The string of xml contains the base64 contents of a 2 MB file. The code will work when the xml string contains files of up to 1.8 MB.
I am completely puzzled as to what to do. Not sure how to change any memory settings.
I have included a condensed copy of the code below. Any help is appreciated.
try
{
    Stream newStream = myRequest.GetRequestStream();
    // Send the data.
    newStream.Write(data, 0, data.Length);
    // Close the write stream
    newStream.Close();
    // Get the response.
    HttpWebResponse response = (HttpWebResponse)myRequest.GetResponse();
    // Get the stream containing content returned by the server.
    Stream dataStream = response.GetResponseStream();
    // Process the return
    // Set the buffer
    byte[] server_response_buffer = new byte[8192];
    int response_count = -1;
    string tempString = null;
    StringBuilder response_sb = new StringBuilder();
    // Loop until the whole stream has been read
    do
    {
        // Read content into a buffer
        response_count = dataStream.Read(server_response_buffer, 0, server_response_buffer.Length);
        // Translate from bytes to ASCII text
        tempString = Encoding.ASCII.GetString(server_response_buffer, 0, response_count);
        // Append the decoded text to the StringBuilder
        response_sb.Append(tempString);
    }
    while (response_count != 0);
    responseFromServer = response_sb.ToString();
    // Clean up the stream and the response.
    dataStream.Close();
    response.Close();
}
catch
{
    MessageBox.Show("There was an error with the communication.");
    comm_error = true;
}
if (comm_error == false)
{
    // Load the XML string into an XmlDocument
    XmlDocument xdoc = new XmlDocument();
    xdoc.LoadXml(responseFromServer);
}
The error occurs on the xdoc.LoadXml line. I have tried writing the stream to a file and then loading the file directly into the XmlDocument, but it was no better.
Completely stumped at this point.
I would recommend that you use the XmlTextReader class instead of the XmlDocument class. I am not sure what your requirements are for reading of the xml, but XmlDocument is very memory intensive as it creates numerous objects and attempts to load the entire xml string. The XmlTextReader class on the other hand simply scans through the xml as you read from it.
Assuming you have the string, this means you would do something like the following
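String xml = "<someXml />";
using (StringReader textReader = new StringReader(xml)) {
    using (XmlTextReader xmlReader = new XmlTextReader(textReader)) {
        // MoveToContent() skips the prolog and positions the reader on the root element
        xmlReader.MoveToContent();
        // Read() advances to the next node (it returns false once the input is exhausted)
        xmlReader.Read();
        // keep calling Read() to walk the rest of the document node by node...
    }
}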
Have you tried loading from a stream instead of from a string? (This is different from writing to a stream, because in your example you are still trying to load it all at once into memory with the XmlDocument.)
There are other .NET components for XML files that work with the XML as a stream instead of loading it all at once. The problem is that LoadXml probably tries to process the entire document at once, loading it in memory. Not only that, but you've already loaded it into a string, so it exists in two different forms in memory, further increasing the chance that you do not have enough free contiguous memory available.
What you want is some way to read it piecemeal from a stream through an XmlReader, so that you can begin processing the XML document without loading the entire thing into memory. Of course there are limitations to this approach because an XmlReader is forward-only and read-only, and whether it will work depends on what you want to do with the XML once it is loaded.
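As a rough sketch of that forward-only approach, the helper below scans a stream with XmlReader and returns the text of the first element with a given name. The class, method, and parameter names are hypothetical, and it assumes XmlReader.Create(Stream) is available on the target framework.

```csharp
using System.IO;
using System.Xml;

public static class StreamingXml
{
    // Returns the text content of the first element named elementName,
    // or null if no such element exists. Only the current node is held
    // by the reader; no document tree is built.
    public static string FirstElementText(Stream input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    return reader.ReadElementContentAsString();
                }
            }
        }
        return null;
    }
}
```

In the scenario above, the input would be the stream from response.GetResponseStream(). For a very large base64 payload, ReadElementContentAsString still materializes the whole value as one string; XmlReader also offers ReadElementContentAsBase64, which decodes into a caller-supplied buffer in chunks and may suit that case better.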
I'm unsure why you are reading the xml in the way you are, but it could be very memory inefficient. If the garbage collector hasn't kicked in you could have 3+ copies of the document in memory: in the string builder, in the string and in the XmlDocument.
Much better to do something like:
XmlDocument xDoc = new XmlDocument();
// Initialize to null so the finally block compiles and can test it safely
Stream dataStream = null;
try {
    dataStream = response.GetResponseStream();
    xDoc.Load(dataStream);
} catch {
    MessageBox.Show("There was an error with the communication.");
} finally {
    if (dataStream != null)
        dataStream.Dispose();
}
