I've been using this function to read XML from a string and apply an XSLT style sheet. It has been working very well for small portions of XML:
private static string TransformXML(String XML, String XSLT)
{
    string output = String.Empty;
    using (StringReader srt = new StringReader(XSLT))
    {
        using (StringReader sri = new StringReader(XML))
        {
            using (XmlReader xrt = XmlReader.Create(srt))
            using (XmlReader xri = XmlReader.Create(sri))
            {
                XslCompiledTransform xslt = new XslCompiledTransform();
                xslt.Load(xrt);
                using (StringWriter sw = new StringWriter())
                using (XmlWriter xwo = XmlWriter.Create(sw, xslt.OutputSettings)) // use OutputSettings of xsl, so it can be output as HTML
                {
                    xslt.Transform(xri, xwo);
                    output = sw.ToString();
                }
            }
        }
    }
    return output;
}
However, with large portions of XML, I'm getting errors, even though I know it is correctly formatted.
Here is an example error: Unexpected end of file while parsing Name has occurred. Line 1, position 30001.
I'm guessing there is a limit on the buffering, but I can't quite work it out - the code is within an SSIS package and different script tasks produce and translate the XML.
I appreciate any help!
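The reported position (30001) sits just past a round 30000 characters, which may mean the string is already cut off before it reaches TransformXML rather than hitting a limit inside XmlReader. A minimal sketch for checking that, assuming a hypothetical helper DescribeXmlProblem (not part of the code above) that parses the string on its own and reports its length alongside any parser error:

```csharp
using System;
using System.IO;
using System.Xml;

static class XmlDiagnostics
{
    // Hypothetical helper: returns null when the string parses as a complete
    // XML document, otherwise the string length plus the parser's message.
    public static string DescribeXmlProblem(string xml)
    {
        try
        {
            using (var sr = new StringReader(xml))
            using (var reader = XmlReader.Create(sr))
            {
                // Read to the end; a truncated document throws XmlException.
                while (reader.Read()) { }
            }
            return null;
        }
        catch (XmlException ex)
        {
            return string.Format("length={0}: {1}", xml.Length, ex.Message);
        }
    }
}
```

Calling this in the script task that receives the XML, before TransformXML, would show whether the length matches what the producing task wrote.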
Related
I'm trying to write a string (which contains nothing but XML nodes) into a new XML file using XmlWriter. Some of the strings are valid XML content while others aren't.
String Input:
1.
<Test>
<A a="Hello"></A>
<B b="Hello"></B>
</Test>
2.
Hello This is Sample String but not XML
Code :
using (XmlWriter writer = XmlWriter.Create(@"C:\Test.XML"))
{
    writer.WriteStartDocument();
    string scontent = "Hello This is Sample String but not XML";
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ConformanceLevel = ConformanceLevel.Fragment;
    try
    {
        using (StringReader stringReader = new StringReader(scontent))
        using (XmlReader xmlReader = XmlReader.Create(stringReader, settings))
        {
            writer.WriteStartElement("Test");
            writer.WriteNode(xmlReader, true);
            writer.WriteEndElement();
        }
    }
    catch (XmlException exception) { }
}
Expected Output:
The Test element must not be created if the exception occurs. If I call xmlReader.Read() (or similar) first, the reader's position moves to a node, so writer.WriteNode(xmlReader, true) won't write all of the nodes when there is more than one top-level node, for example <A a="Hello"></A><B b="Hello"></B>. In that case I have to write all nodes using WriteNode, for which the XmlReader must be in the Initial state (XmlReader.ReadState).
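One possible approach is to parse the string twice: once only to check it, and a second time with a fresh reader (still in the Initial state) for WriteNode. A minimal sketch follows. FragmentWriter, ContainsElementFragment and TryWriteFragment are hypothetical names, and it assumes that "valid XML content" means a well-formed fragment with at least one element. That extra check is there because, with ConformanceLevel.Fragment, plain top-level text such as the second input parses without throwing an XmlException.

```csharp
using System;
using System.IO;
using System.Xml;

static class FragmentWriter
{
    static XmlReaderSettings FragmentSettings()
    {
        return new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
    }

    // First pass: true only if the whole string is a well-formed fragment
    // and contains at least one element (assumed definition of "is XML").
    public static bool ContainsElementFragment(string content)
    {
        bool sawElement = false;
        try
        {
            using (var sr = new StringReader(content))
            using (var reader = XmlReader.Create(sr, FragmentSettings()))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element) sawElement = true;
                }
            }
        }
        catch (XmlException)
        {
            return false;
        }
        return sawElement;
    }

    // Second pass: a new reader in the Initial state, so WriteNode copies
    // every top-level node. Nothing is written when the check fails.
    public static bool TryWriteFragment(XmlWriter writer, string wrapperName, string content)
    {
        if (!ContainsElementFragment(content)) return false;
        using (var sr = new StringReader(content))
        using (var reader = XmlReader.Create(sr, FragmentSettings()))
        {
            writer.WriteStartElement(wrapperName);
            writer.WriteNode(reader, true);
            writer.WriteEndElement();
        }
        return true;
    }
}
```

Because the wrapper element is only started after the check has passed, no empty Test element is left behind for the non-XML strings. The cost is that each string is parsed twice.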
I want to read only the XML used for generating an equation, which I obtained by using Paragraph.Range.WordOpenXML. But the markup used for the equation is not MathML, even though I had read that Microsoft's equations are in MathML.
Do I need to use some special converter to get the desired XML, or are there any other methods?
You could use the OMML2MML.XSL file (located under %ProgramFiles%\Microsoft Office\Office15)
to transform the Office MathML (OMML) equations in a Word document into MathML.
The code below shows how to transform the equations in a Word document into MathML
using the following steps:
1. Open the Word document using the Open XML SDK (version 2.5).
2. Create an XslCompiledTransform and load the OMML2MML.XSL file.
3. Transform the Word document by calling the Transform() method on the created XslCompiledTransform instance.
4. Output the result of the transform (e.g. print it to the console or write it to a file).
I've tested the code below with a simple Word document containing two equations, text and pictures.
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;
using DocumentFormat.OpenXml.Packaging;
public string GetWordDocumentAsMathML(string docFilePath, string officeVersion = "14")
{
    string officeML = string.Empty;
    using (WordprocessingDocument doc = WordprocessingDocument.Open(docFilePath, false))
    {
        string wordDocXml = doc.MainDocumentPart.Document.OuterXml;
        XslCompiledTransform xslTransform = new XslCompiledTransform();
        // The OMML2MML.xsl file is located under
        // %ProgramFiles%\Microsoft Office\Office15\
        xslTransform.Load(@"c:\Program Files\Microsoft Office\Office" + officeVersion + @"\OMML2MML.XSL");
        using (TextReader tr = new StringReader(wordDocXml))
        {
            // Load the xml of your main document part.
            using (XmlReader reader = XmlReader.Create(tr))
            {
                using (MemoryStream ms = new MemoryStream())
                {
                    XmlWriterSettings settings = xslTransform.OutputSettings.Clone();
                    // Configure xml writer to omit xml declaration.
                    settings.ConformanceLevel = ConformanceLevel.Fragment;
                    settings.OmitXmlDeclaration = true;
                    // Dispose the writer before reading so that buffered output is flushed to the stream.
                    using (XmlWriter xw = XmlWriter.Create(ms, settings))
                    {
                        // Transform our OfficeMathML to MathML.
                        xslTransform.Transform(reader, xw);
                    }
                    ms.Seek(0, SeekOrigin.Begin);
                    using (StreamReader sr = new StreamReader(ms, Encoding.UTF8))
                    {
                        officeML = sr.ReadToEnd();
                        // Console.Out.WriteLine(officeML);
                    }
                }
            }
        }
    }
    return officeML;
}
To convert only a single equation (and not the whole Word document), query for the desired Office Math paragraph (m:oMathPara) and use the OuterXml property of this node.
The code below shows how to query for the first math paragraph (First() requires using System.Linq):
string mathParagraphXml =
    doc.MainDocumentPart.Document.Descendants<DocumentFormat.OpenXml.Math.Paragraph>().First().OuterXml;
Use the returned XML to feed the TextReader.
I am trying to use XslCompiledTransform from the .NET class library to transform an XML string into an HTML string. Please note that I want to use normal strings, not files.
How can I do this?
It seems that XslCompiledTransform only works with files...
Load() also accepts XmlReader, and Transform() accepts most combinations of XmlReader input, and XmlWriter, TextWriter and Stream as output.
So most typically, you might use a StringWriter for the output, and a XmlReader created from a StringReader for the input.
Full example, no files:
string xslt = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
    <xsl:output method=""html"" indent=""no""/>
    <xsl:template match=""*"">
        <p>some html</p>
    </xsl:template>
</xsl:stylesheet>", xml = @"<xml>boo</xml>";
var transform = new XslCompiledTransform();
using (var sr = new StringReader(xslt))
using (var xr = XmlReader.Create(sr))
{
    transform.Load(xr);
}
using (var sw = new StringWriter())
using (var sr = new StringReader(xml))
using (var xr = XmlReader.Create(sr))
{
    transform.Transform(xr, null, sw);
    string html = sw.ToString();
}
I have the XSLT and the XML as strings,
both generated on the same .aspx page.
I convert them to HTML using a StringWriter,
then bind the StringWriter's content to a Literal control to show the HTML.
string xslt="Add your code for xslt here";//look for any normal xslt file.
string xml="Add your code for xml here";//look for any normal xml file.
XslCompiledTransform transform = new XslCompiledTransform();
StringReader sr = new StringReader(xslt);
XmlReader xr = XmlReader.Create(sr);
transform.Load(xr);
StringReader srxml = new StringReader(xml);
XmlReader xrxml = XmlReader.Create(srxml);
StringWriter writer = new StringWriter();
transform.Transform(xrxml, null, writer);
Literal1.Text = writer.ToString();
writer.Close();
I was using this extension method to transform very large xml files with an xslt.
Unfortunately, I get an OutOfMemoryException on the source.ToString() line.
I realize there must be a better way, I'm just not sure what that would be?
public static XElement Transform(this XElement source, string xslPath, XsltArgumentList arguments)
{
    var doc = new XmlDocument();
    doc.LoadXml(source.ToString());
    var xsl = new XslCompiledTransform();
    xsl.Load(xslPath);
    using (var swDocument = new StringWriter(System.Globalization.CultureInfo.InvariantCulture))
    {
        using (var xtw = new XmlTextWriter(swDocument))
        {
            xsl.Transform((doc.CreateNavigator()), arguments, xtw);
            xtw.Flush();
            return XElement.Parse(swDocument.ToString());
        }
    }
}
Thoughts? Solutions? Etc.
UPDATE:
Now that this is solved, I have issues with validating the schema!
Validating large Xml files
Try this:
using System.Xml.Linq;
using System.Xml.XPath;
using System.Xml.Xsl;
static class Extensions
{
    public static XElement Transform(
        this XElement source, string xslPath, XsltArgumentList arguments)
    {
        var xsl = new XslCompiledTransform();
        xsl.Load(xslPath);
        var result = new XDocument();
        using (var writer = result.CreateWriter())
        {
            xsl.Transform(source.CreateNavigator(), arguments, writer);
        }
        return result.Root;
    }
}
BTW, new XmlTextWriter() has been discouraged since .NET 2.0; the recommended way to create a writer is XmlWriter.Create(). The same goes for new XmlTextReader() and XmlReader.Create().
For large XML files you can try to use XPathDocument, as suggested in a Microsoft Knowledge Base article.
XPathDocument srcDoc = new XPathDocument(srcFile);
XslCompiledTransform myXslTransform = new XslCompiledTransform();
myXslTransform.Load(xslFile);
using (XmlWriter destDoc = XmlWriter.Create(destFile))
{
    myXslTransform.Transform(srcDoc, destDoc);
}
I'm currently using Syntax Highlighter to show XML or SOAP messages on a page. That works fine for messages that are already formatted correctly (line breaks, indents, etc.). But if I had an XML string like:
string xml = "<doc><object><first>Joe</first><last>Smith</last></object></doc>";
I would write the string to the page and the javascript highlighter would correctly syntax highlight the string, but it would be all on a single line.
Is there a C# string formatter or some syntax highlighting library that has a "smart" indent feature that would insert line breaks, indents, etc... ?
Since this is a string, adding line breaks and indents would be changing the actual value of variable xml, which is not what you want your code formatter to do!
Note that you can format the XML in C# before writing to the page, like this:
using System;
using System.IO;
using System.Text;
using System.Xml;
namespace XmlIndent
{
    class Program
    {
        static void Main(string[] args)
        {
            string xml = "<doc><object><first>Joe</first><last>Smith</last></object></doc>";
            var xd = new XmlDocument();
            xd.LoadXml(xml);
            Console.WriteLine(FormatXml(xd));
            Console.ReadKey();
        }
        static string FormatXml(XmlDocument doc)
        {
            var sb = new StringBuilder();
            var sw = new StringWriter(sb);
            XmlTextWriter xtw = null;
            using(xtw = new XmlTextWriter(sw) { Formatting = Formatting.Indented })
            {
                doc.WriteTo(xtw);
            }
            return sb.ToString();
        }
    }
}
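The same formatting can be done without XmlTextWriter. A minimal sketch using XmlWriter.Create with XmlWriterSettings.Indent follows. The XmlFormatting class name is made up for illustration, and setting OmitXmlDeclaration is my own choice so that only the indented markup is returned:

```csharp
using System;
using System.Text;
using System.Xml;

static class XmlFormatting
{
    // Parses the string and writes it back out with line breaks and indentation.
    public static string FormatXml(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var settings = new XmlWriterSettings
        {
            Indent = true,             // one element per line, nested by depth
            OmitXmlDeclaration = true  // assumption: the caller wants markup only
        };

        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb, settings))
        {
            doc.Save(writer);
        }
        return sb.ToString();
    }
}
```

The original string variable is left untouched; the indented copy is what would be written to the page for the highlighter.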