I am using HTML Agility Pack to parse an HTML document, make a change to a node, and then save the HTML document. I would like to save the document to memory so I can write the HTML out as a string later in the application. My current implementation always returns an empty string (""). I can see that the HtmlDocument object is not empty when debugging. Can someone provide some insight?
private string InitializeHtml(HtmlDocument htmlDocument)
{
string currentUserName = User.Identity.Name;
HtmlNode scriptTag = htmlDocument.DocumentNode.SelectSingleNode("//script[@id ='HwInitialize']");
scriptTag.InnerHtml =
string.Format("org.myorg.application = {{}}; org.myorg.application.init ={{uid:\"{0}\", application:\"testPortal\"}};",currentUserName);
MemoryStream memoryStream = new MemoryStream();
htmlDocument.Save(memoryStream);
StreamReader streamReader = new StreamReader(memoryStream);
return streamReader.ReadToEnd();
}
Try
memoryStream.Seek(0, System.IO.SeekOrigin.Begin);
before creating the StreamReader and calling ReadToEnd().
The stream pointer is likely getting left at the end of the stream by the Save method (it's best practise for a component to do this - in case you want to append more data to the stream) therefore when you call ReadToEnd, it's already at the end and nothing gets read.
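The effect can be reproduced without HTML Agility Pack. The following is a minimal sketch in which a plain StreamWriter stands in for htmlDocument.Save; that substitution and the names StreamRewindDemo and ReadBack are assumptions made for illustration only.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamRewindDemo
{
    // Writes text into a MemoryStream and reads it back with a StreamReader.
    // When rewind is false, the position is left at the end of the written data.
    public static string ReadBack(string text, bool rewind)
    {
        using (var memoryStream = new MemoryStream())
        {
            // Stand-in for htmlDocument.Save(memoryStream): after writing,
            // the stream position sits just past the last byte written.
            var writer = new StreamWriter(memoryStream, new UTF8Encoding(false));
            writer.Write(text);
            writer.Flush();

            if (rewind)
            {
                // Move the position back to the start so the reader sees the data.
                memoryStream.Seek(0, SeekOrigin.Begin);
            }

            using (var reader = new StreamReader(memoryStream, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Calling ReadBack(html, false) returns an empty string, while ReadBack(html, true) returns the text that was written.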
I'm working on a MVC project which creates a memory stream which is used for XSLT Transform. At the end, I want to display the results of the transformation on any web browser.
The memory stream is created as follows:
XslCompiledTransform xsl = new XslCompiledTransform();
xsl.Load(xsltpath);
MemoryStream stream = new MemoryStream();
XmlWriter xmlWriter = XmlWriter.Create(stream);
xsl.Transform(InputMessagePath, xmlWriter);
xmlWriter.Close();
// Pass or set the content of stream as a string or any other compatible type to the view to display
stream.Close();
Is it possible to display the contents of a memory stream in a web browser? If not, what would be the best way to do that? I'm thinking about creating a temporary file and passing its path to System.Diagnostics.Process.Start(path) as a parameter. But before that, it would be great to know whether the stream object can be used, instead of creating a file, to display the contents in a web browser.
Thank you.
You return a 'FileResult' and set content-type in the response:
You can find documentation here:
http://msdn.microsoft.com/en-us/library/system.web.mvc.controller.file%28v=vs.118%29.aspx
There's an overload that takes a stream.
return File(stream, "text/html; charset=utf-8");
Thanks for the replies. Just thought that it would be useful for someone, if I posted my implementation.
The following is the procedure I used to display the content of the MemoryStream. My requirement was to get the encoded HTML content of the XSLT transform and display it in a browser. This is not exactly what I implemented; I simplified it to demonstrate the above requirement.
------------- Model ----------------------------------------
public class MyModel
{
public string EncodedOutputMessage { get; set; }
public void Transform()
{
// Do something
SetTransformResult(xsl, intputMsgPath);
}
private void SetTransformResult(XslCompiledTransform xsl, string intputMsgPath)
{
MemoryStream stream = new MemoryStream();
XmlWriter xmlWriter = XmlWriter.Create(stream);
xsl.Transform(intputMsgPath, xmlWriter);
xmlWriter.Close();
XmlDocument root = new XmlDocument();
stream.Position = 0; // rewind the pointer to the beginning of the stream
root.Load(stream);
XmlNodeList nodes = root.GetElementsByTagName("BodyXhtml");
if (nodes.Count == 1)
{
EncodedOutputMessage = HttpUtility.HtmlEncode(nodes[0].InnerXml);
}
stream.Close();
}
}
-------------- Controller -----------------------------------
public class MyTestController : Controller
{
[HttpPost]
public ActionResult ShowResult()
{
MyModel model = new MyModel();
model.Transform();
ViewBag.DecodedOutputMessage = HttpUtility.HtmlDecode(model.EncodedOutputMessage);
return View("MyView");
}
}
----- MyView --------------------------------------------------------------
@model Tester.Models.MyModel
// Other fields to display
@Html.Raw(ViewBag.DecodedOutputMessage);
The action method performs the transform, gets the encoded string, decodes it, and sends it to the view. I have removed most of the methods to simplify this procedure.
If you don't have any other content to display in the browser, you can pass only the HTML string, something like this.
public string Result()
{
string htmlRes = "<html><body><font color=\"red\"> Testing color... </font></body></html>";
return htmlRes ;
}
I do this:
MemoryStream stream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(xml));
XPathDocument document = new XPathDocument(stream);
StringWriter writer = new StringWriter();
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(Server.MapPath("~/NFCe.xsl"));
transform.Transform(document, null, writer);
Response.Write(writer.ToString());
The XML result is sent to the client. In this case I have a variable 'xml' holding my original XML, and then I apply the transformation contained in the file NFCe.xsl. Response.Write writes a string to the output stream, which is sent to the client instead of the rendered aspx.
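The same StringWriter approach can be tried outside a web project. The following is a minimal sketch, assuming the stylesheet is supplied as a string rather than loaded with Server.MapPath; the class name XsltToString and the sample stylesheet are hypothetical and not taken from the answer above.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class XsltToString
{
    // Hypothetical stylesheet used only for illustration.
    public const string SampleStylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:output method='html'/>" +
        "<xsl:template match='/note'><p><xsl:value-of select='.'/></p></xsl:template>" +
        "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the result as a string,
    // without going through a MemoryStream or a temporary file.
    public static string Transform(string xml, string xslt)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(stylesheet);
        }

        var document = new XPathDocument(new StringReader(xml));
        using (var writer = new StringWriter())
        {
            transform.Transform(document, null, writer);
            return writer.ToString();
        }
    }
}
```

The returned string can then be handed to Response.Write, or to a view, in the same way as writer.ToString() above.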
I am attempting to modify the XML of a simple MS Word template. I realize there are SDKs available that could make this process easier, but what I am tasked with maintaining uses packages and I was told to do the same.
I have a basic test document with two placeholders mapped to the following XML:
<root>
<element>
Fubar
</element>
<second>
This is the second placeholder
</second>
</root>
What I am doing is creating a stream using the word doc, removing the existing XML, getting some hard coded test XML and trying to write that to the stream.
Here is the code I am using:
string strRelRoot = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
byte[] buffer = File.ReadAllBytes("dev.docx");
//stream with the template
MemoryStream stream = new MemoryStream(buffer, true);
//create a package using the stream
Package package = Package.Open(stream, FileMode.Open, FileAccess.ReadWrite);
PackageRelationshipCollection pkgrcOfficeDocument = package.GetRelationshipsByType(strRelRoot);
foreach (PackageRelationship pkgr in pkgrcOfficeDocument)
{
if (pkgr.SourceUri.OriginalString == "/")
{
Uri uriData = new Uri("/customXML/item1.xml", UriKind.Relative);
//remove the existing part
if (package.PartExists(uriData))
{
// Delete template "/customXML/item1.xml" part
package.DeletePart(uriData);
}
//create a new part
PackagePart pkgprtData = package.CreatePart(uriData, "application/xml");
//test data
string xml = @"<root>
<element>
Changed
</element>
<second>
The second placeholder changed
</second>
</root>";
//stream created from the xml string
MemoryStream fromStream = new MemoryStream();
UnicodeEncoding uniEncoding = new UnicodeEncoding();
byte[] fromBuffer = uniEncoding.GetBytes(xml);
fromStream.Write(fromBuffer, 0, fromBuffer.Length);
fromStream.Seek(0L, SeekOrigin.Begin);
Stream toStream = pkgprtData.GetStream();
//copy the xml to the part stream
fromStream.CopyTo(toStream);
//copy part stream to the byte stream
toStream.CopyTo(stream);
}
}
This is currently not modifying the document although I feel like I am close to a solution. Any advice would be very much appreciated. Thanks!
Edit: To clarify, the result I am getting is that the document is unchanged. I get no exceptions or the like, but the document's XML is not modified.
OK, so not quite the timely response I promised, but here goes!
There are several aspects to the problem. Sample code is from memory and documentation, not necessarily compiled and tested.
Read the template XML
Before you delete the package part containing the template XML, you need to open its stream and read the XML. How you get the XML if the part doesn't exist to begin with is up to you.
My example code uses classes from the LINQ to XML API, though you could use whichever set of XML APIs you prefer.
XElement templateXml = null;
using (Stream stream = package.GetPart(uriData).GetStream())
templateXml = XElement.Load(stream);
// Now you can delete the part.
At this point you have an in-memory representation of the template XML in templateXml.
Substitute values into the placeholders
templateXml.SetElementValue("element", "Replacement value of first placeholder");
templateXml.SetElementValue("second", "Replacement value of second placeholder");
Check out the methods on XElement if you need to do anything more advanced than this, e.g. read the original content in order to determine the replacement value.
Save the document
This is your original code, modified and annotated.
// The very first thing to do is create the Package in a using statement.
// This makes sure it's saved and closed when you're done.
using (Package package = Package.Open(...))
{
// XML reading, substituting etc. goes here.
// Eventually...
//create a new part
PackagePart pkgprtData = package.CreatePart(uriData, "application/xml");
// Don't need the test data anymore.
// Assuming you need UnicodeEncoding, set it up like this.
var writerSettings = new XmlWriterSettings
{
Encoding = Encoding.Unicode,
};
// Shouldn't need a MemoryStream at all; write straight to the part stream.
// Note using statements to ensure streams are flushed and closed.
using (Stream toStream = pkgprtData.GetStream())
using (XmlWriter writer = XmlWriter.Create(toStream, writerSettings))
templateXml.Save(writer);
// No other copying should be necessary.
// In particular, your toStream.CopyTo(stream) appeared
// to be appending the part's data to the package's stream
// (the physical file), which is a bug.
} // This closes the using statement for the package, which saves the file.
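The delete, create and write steps can also be exercised end to end in memory. The following is a minimal sketch, assuming a reference to System.IO.Packaging (WindowsBase on .NET Framework, or the System.IO.Packaging package elsewhere); the names PackagePartDemo, ReplacePart and ReadPart are hypothetical. It copies the bytes into an expandable MemoryStream, because a MemoryStream constructed over an existing byte array cannot grow.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Xml.Linq;

public static class PackagePartDemo
{
    static readonly Uri PartUri = new Uri("/customXML/item1.xml", UriKind.Relative);

    // Replaces (or creates) the custom XML part and returns the updated package bytes.
    // An empty input array means "start a new package".
    public static byte[] ReplacePart(byte[] packageBytes, XElement newXml)
    {
        using (var stream = new MemoryStream())
        {
            // Copy into an expandable stream so the package can change size.
            stream.Write(packageBytes, 0, packageBytes.Length);
            stream.Position = 0;

            FileMode mode = packageBytes.Length == 0 ? FileMode.Create : FileMode.Open;
            using (Package package = Package.Open(stream, mode, FileAccess.ReadWrite))
            {
                if (package.PartExists(PartUri))
                {
                    package.DeletePart(PartUri);
                }

                PackagePart part = package.CreatePart(PartUri, "application/xml");
                using (Stream partStream = part.GetStream(FileMode.Create, FileAccess.Write))
                {
                    newXml.Save(partStream);
                }
            } // Disposing the package flushes the changes into the stream.

            return stream.ToArray();
        }
    }

    // Reads the custom XML part back out of the package bytes.
    public static XElement ReadPart(byte[] packageBytes)
    {
        using (var stream = new MemoryStream(packageBytes))
        using (Package package = Package.Open(stream, FileMode.Open, FileAccess.Read))
        using (Stream partStream = package.GetPart(PartUri).GetStream(FileMode.Open, FileAccess.Read))
        {
            return XElement.Load(partStream);
        }
    }
}
```

The bytes returned by ReplacePart can be written back to disk with File.WriteAllBytes once the package has been disposed.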
I am saving an HTML document to a MemoryStream and then reading that stream (using StreamReader) out to a string object. HtmlDocument object is complete but when I inspect the string that is assigned from the streamReader.ReadToEnd() it appears that the end of the file has been truncated. I assume that my implementation of the MemoryStream or StreamReader is faulty. Can someone help me out?
HtmlDocument htmlDocument = GetDocument(htmlHref);
HtmlNode scriptTag = htmlDocument.DocumentNode.SelectSingleNode("//script[@id ='HwInitialize']");
scriptTag.InnerHtml =
string.Format("org.myorg.application.init ={0};", stateJson);
MemoryStream memoryStream = new MemoryStream();
htmlDocument.Save(memoryStream); //Save Document to memory
memoryStream.Seek(0, SeekOrigin.Begin);
StreamReader streamReader = new StreamReader(memoryStream);
return streamReader.ReadToEnd(); //return the stream contents to string
The htmlDocument.DocumentNode.OuterHtml property will serialize your htmlDocument, including any of your changes, into a html string.
I am using C# on a Windows Mobile 6.1 device with Compact Framework 3.5.
I am getting an OutOfMemoryException when loading a large string of XML.
The handset has limited memory, but should be more than enough to handle the size of the xml string. The string of xml contains the base64 contents of a 2 MB file. The code will work when the xml string contains files of up to 1.8 MB.
I am completely puzzled as to what to do. Not sure how to change any memory settings.
I have included a condensed copy of the code below. Any help is appreciated.
Stream newStream = myRequest.GetRequestStream();
// Send the data.
newStream.Write(data, 0, data.Length);
//close the write stream
newStream.Close();
// Get the response.
HttpWebResponse response = (HttpWebResponse)myRequest.GetResponse();
// Get the stream containing content returned by the server.
Stream dataStream = response.GetResponseStream();
//Process the return
//Set the buffer
byte[] server_response_buffer = new byte[8192];
int response_count = -1;
string tempString = null;
StringBuilder response_sb = new StringBuilder();
//Loop through the stream until it is all written
do
{
// Read content into a buffer
response_count = dataStream.Read(server_response_buffer, 0, server_response_buffer.Length);
// Translate from bytes to ASCII text
tempString = Encoding.ASCII.GetString(server_response_buffer, 0, response_count);
// Append the decoded text to the StringBuilder
response_sb.Append(tempString);
}
while (response_count != 0);
responseFromServer = response_sb.ToString();
// Cleanup the streams and the response.
dataStream.Close();
response.Close();
}
catch {
MessageBox.Show("There was an error with the communication.");
comm_error = true;
}
if(comm_error == false){
//Load the xml file into an XML object
XmlDocument xdoc = new XmlDocument();
xdoc.LoadXml(responseFromServer);
}
The error occurs on the xdoc.LoadXml line. I have tried writing the stream to a file and then loading the file directly into the XmlDocument, but it was no better.
Completely stumped at this point.
I would recommend that you use the XmlTextReader class instead of the XmlDocument class. I am not sure what your requirements are for reading of the xml, but XmlDocument is very memory intensive as it creates numerous objects and attempts to load the entire xml string. The XmlTextReader class on the other hand simply scans through the xml as you read from it.
Assuming you have the string, this means you would do something like the following
String xml = "<someXml />";
using(StringReader textReader = new StringReader(xml)) {
using(XmlTextReader xmlReader = new XmlTextReader(textReader)) {
xmlReader.MoveToContent();
xmlReader.Read();
// the reader is now pointed at the first element in the document...
}
}
Have you tried loading from a stream instead of from a string? (This is different from writing to a stream, because in your example you are still trying to load it all at once into memory with the XmlDocument.)
There are other .NET components for XML files that work with the XML as a stream instead of loading it all at once. The problem is that .LoadXML probably tries to process the entire document at once, loading it in memory. Not only that but you've already loaded it into a string, so it exists in two different forms in memory, further increasing the chance that you do not have enough free contiguous memory available.
What you want is some way to read it piecemeal from a stream through an XmlReader, so that you can begin processing the XML document piece by piece without loading the entire thing into memory. Of course there are limitations to this approach, because an XmlReader is forward-only and read-only, and whether it will work depends on what you want to do with the XML once it is loaded.
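For a payload like the one in the question (a large base64 file inside an element), a forward-only read could look like the following. This is a minimal sketch written against the full .NET Framework API; the class and method names are hypothetical, the element name is whatever the real document uses, and availability of these calls on Compact Framework 3.5 has not been checked here.

```csharp
using System;
using System.IO;
using System.Xml;

public static class StreamingXmlDemo
{
    // Scans forward to the first element with the given name and copies its
    // base64 content to the output stream in small chunks, so neither the XML
    // text nor the decoded file is ever held in memory as a whole.
    // Returns the number of decoded bytes written.
    public static long ExtractBase64(Stream xmlStream, string elementName, Stream output)
    {
        long total = 0;
        byte[] buffer = new byte[4096];

        using (XmlReader reader = XmlReader.Create(xmlStream))
        {
            if (!reader.ReadToFollowing(elementName))
            {
                return 0; // element not present
            }

            int read;
            while ((read = reader.ReadElementContentAsBase64(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
                total += read;
            }
        }

        return total;
    }
}
```

In the question's code, xmlStream would be the response stream and output could be a FileStream, which avoids building the StringBuilder, the string and the XmlDocument at all.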
I'm unsure why you are reading the xml in the way you are, but it could be very memory inefficient. If the garbage collector hasn't kicked in you could have 3+ copies of the document in memory: in the string builder, in the string and in the XmlDocument.
Much better to do something like:
XmlDocument xDoc = new XmlDocument();
Stream dataStream = null;
try {
dataStream = response.GetResponseStream();
xDoc.Load(dataStream);
} catch {
MessageBox.Show("There was an error with the communication.");
} finally {
if(dataStream != null)
dataStream.Dispose();
}
Does anyone have any idea why the following code sample fails with an XmlException "Data at the root level is invalid. Line 1, position 1."
var body = "<?xml version="1.0" encoding="utf-16"?><Report> ......"
XmlDocument bodyDoc = new XmlDocument();
bodyDoc.LoadXml(body);
Background
Although your question does have the encoding set as UTF-16, you don't have the string properly escaped so I wasn't sure if you did, in fact, accurately transpose the string into your question.
I ran into the same exception:
System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.
However, my code looked like this:
string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml);
The Problem
The problem is that strings are stored internally as UTF-16 in .NET however the encoding specified in the XML document header may be different. E.g.:
<?xml version="1.0" encoding="utf-8"?>
From the MSDN documentation for String here:
Each Unicode character in a string is defined by a Unicode scalar value, also called a Unicode code point or the ordinal (numeric) value of the Unicode character. Each code point is encoded using UTF-16 encoding, and the numeric value of each element of the encoding is represented by a Char object.
This means that when you pass XmlDocument.LoadXml() your string with an XML header, it must say the encoding is UTF-16. Otherwise, the actual underlying encoding won't match the encoding reported in the header and will result in an XmlException being thrown.
The Solution
The solution for this problem is to make sure the encoding of whatever you pass to the Load or LoadXml method matches what the XML header says it is. In my example above, either change the XML header to state UTF-16, or encode the input in UTF-8 and use one of the XmlDocument.Load methods.
Below is sample code demonstrating how to use a MemoryStream to build an XmlDocument from a string that declares a UTF-8 encoded XML document (but is, of course, stored as a UTF-16 .NET string).
string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";
// Encode the XML string in a UTF-8 byte array
byte[] encodedString = Encoding.UTF8.GetBytes(xml);
// Put the byte array into a stream and rewind it to the beginning
MemoryStream ms = new MemoryStream(encodedString);
ms.Flush();
ms.Position = 0;
// Build the XmlDocument from the MemoryStream of UTF-8 encoded bytes
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(ms);
Simple and effective solution: Instead of using the LoadXml() method use the Load() method
For example:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("sample.xml");
I figured it out. Read the MSDN documentation and it says to use .Load instead of LoadXml when reading from strings. Found out this works 100% of time. Oddly enough using StringReader causes problems. I think the main reason is that this is a Unicode encoded string and that could cause problems because StringReader is UTF-8 only.
MemoryStream stream = new MemoryStream();
byte[] data = body.PayloadEncoding.GetBytes(body.Payload);
stream.Write(data, 0, data.Length);
stream.Seek(0, SeekOrigin.Begin);
XmlTextReader reader = new XmlTextReader(stream);
// MSDN recommends we use Load instead of LoadXml when using in-memory XML payloads
bodyDoc.Load(reader);
Try this:
XmlDocument bodyDoc = new XmlDocument();
bodyDoc.XmlResolver = null;
bodyDoc.LoadXml(body);
This worked for me:
var xdoc = new XmlDocument { XmlResolver = null };
xdoc.LoadXml(xmlFragment);
This really saved my day.
I have written an extension method based on Zach's answer. I also extended it to take the encoding as a parameter, allowing encodings other than UTF-8 to be used, and I wrapped the MemoryStream in a 'using' statement.
public static class XmlHelperExtentions
{
/// <summary>
/// Loads a string through .Load() instead of .LoadXml()
/// This prevents character encoding problems.
/// </summary>
/// <param name="xmlDocument"></param>
/// <param name="xmlString"></param>
public static void LoadString(this XmlDocument xmlDocument, string xmlString, Encoding encoding = null) {
if (encoding == null) {
encoding = Encoding.UTF8;
}
// Encode the XML string in a byte array
byte[] encodedString = encoding.GetBytes(xmlString);
// Put the byte array into a stream and rewind it to the beginning
using (var ms = new MemoryStream(encodedString)) {
ms.Flush();
ms.Position = 0;
// Build the XmlDocument from the MemoryStream of encoded bytes
xmlDocument.Load(ms);
}
}
}
I had the same problem when switching from absolute to relative path for my xml file.
The following solves both loading and using relative source path issues.
Using an XmlDataProvider, which is defined in XAML (it should be possible in code too):
<Window.Resources>
<XmlDataProvider
x:Name="myDP"
x:Key="MyData"
Source=""
XPath="/RootElement/Element"
IsAsynchronous="False"
IsInitialLoadEnabled="True"
debug:PresentationTraceSources.TraceLevel="High" />
</Window.Resources>
The data provider automatically loads the document once the source is set. Here's the code :
m_DataProvider = this.FindResource("MyData") as XmlDataProvider;
FileInfo file = new FileInfo("MyXmlFile.xml");
m_DataProvider.Document = new XmlDocument();
m_DataProvider.Source = new Uri(file.FullName);
Simple line:
bodyDoc.Load(new MemoryStream(Encoding.Unicode.GetBytes(body)));
I had the same issue because the XML file I was uploading was encoded using UTF-8-BOM (UTF-8 byte-order mark).
Switched the encoding to UTF-8 in Notepad++ and was able to load the XML file in code.
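When re-saving the file is not an option, the BOM can also be handled in code. If the file content reaches LoadXml as a string, a decoded BOM shows up as a leading U+FEFF character, which is a likely cause of the "Line 1, position 1" error. The following is a minimal sketch that strips it before parsing; the names BomSafeXml and LoadXmlIgnoringBom are hypothetical.

```csharp
using System;
using System.Xml;

public static class BomSafeXml
{
    // Removes a leading byte-order mark character (U+FEFF), if present,
    // and then parses the remaining text with LoadXml.
    public static XmlDocument LoadXmlIgnoringBom(string xml)
    {
        string cleaned = xml.TrimStart('\uFEFF');

        var document = new XmlDocument();
        document.LoadXml(cleaned);
        return document;
    }
}
```

Strings without a BOM pass through unchanged, so the helper can be used for both kinds of input.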