Edit CustomXML with OpenXML C#

This is my first OpenXML project. I am trying to edit the CustomXML file of a docx file. I am trying to change this:
<?xml version="1.0" encoding="UTF-8"?>
<PERSON>
<NAMETAG>NAME</NAMETAG>
<DOBTAG>DOB</DOBTAG>
<SCORE1TAG>SCORE1</SCORE1TAG>
<SCORE2TAG>SCORE2</SCORE2TAG>
</PERSON>
To this:
<?xml version="1.0" encoding="UTF-8"?>
<PERSON>
<NAMETAG>John Doe</NAMETAG>
<DOBTAG>01/01/2020</DOBTAG>
<SCORE1TAG>90.5</SCORE1TAG>
<SCORE2TAG>100.0</SCORE2TAG>
</PERSON>
I would prefer to not use search and replace but instead navigate the WordprocessingDocument to find the correct properties to modify. I tried to do a whole delete/add but that corrupted the file and did not work. Here is that code:
static void Main(string[] args)
{
Console.WriteLine("Hello World!");
byte[] byteArray = File.ReadAllBytes(@"C:\Simple_Template.docx");
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
doc.MainDocumentPart.DeleteParts<CustomXmlPart>(doc.MainDocumentPart.CustomXmlParts);
string newcustomXML = @"<?xml version=""1.0\"" encoding=""UTF-8\""?><Person><NAMETAG>John Doe</NAMETAG><DOBTAG>DOB</DOBTAG><SCORE1TAG>90.5</SCORE1TAG><SCORE2TAG>100.0</SCORE2TAG></PERSON>";
CustomXmlPart xmlPart = doc.MainDocumentPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
byte[] byteArrayXML = Encoding.UTF8.GetBytes(newcustomXML);
using (MemoryStream xml_strm = new MemoryStream(byteArrayXML))
{
xmlPart.FeedData(xml_strm);
}
doc.MainDocumentPart.Document.Save();
doc.Close();
File.WriteAllBytes(@"C:\Simple_Template_Replace.docx", stream.ToArray());
}
}
I have also tried to navigate through the structure, but I am having a hard time figuring out which part of the WordprocessingDocument object holds the actual values that I need to modify. Ideally, I would like something like this pseudo-code:
doc.MainDocumentPart.CustomXMLPart.Select("NAMETAG") = "John Doe"
--------Follow On----------
The answer below worked well without a Namespace. Now I would like to add one. This is the new XML:
<?xml version="1.0"?><myxml xmlns="www.mydomain.com">
<PERSON>
<NAMETAG>NAME</NAMETAG>
<DOBTAG>DOB</DOBTAG>
<SCORE1TAG>SCORE1</SCORE1TAG>
<SCORE2TAG>SCORE2</SCORE2TAG>
</PERSON>
</myxml>
I have adjusted the code to the following but the SelectSingleNode call is returning NULL. Here is the updated code:
XmlNamespaceManager mgr = new XmlNamespaceManager(xmlDocument.NameTable);
mgr.AddNamespace("ns", "www.mydomain.com");
string name_tag = xmlDocument.SelectSingleNode("/ns:myxml/ns:PERSON/ns:NAMETAG", mgr).InnerText;
I was able to fix this myself. I did not realize that you need to include "ns:" with every element. I still thought that I would be able to pass in String.Empty into my AddNamespace and then I would not have to do it. But this will work for now.
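For anyone hitting the same NULL result, here is a minimal sketch of the namespace-aware lookup. It assumes the XML from the follow-on above; the class and method names (NamespaceDemo, ReadNameTag, UnprefixedPathMatches) are made up for illustration. XPath 1.0 always treats an unprefixed name as being in no namespace, which is why every step needs a prefix even though the document itself uses a default namespace.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Returns the text of NAMETAG, or null if the path does not match.
    public static string ReadNameTag(string xml)
    {
        var xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xml);

        // Map a prefix of our choosing to the document's default namespace.
        var mgr = new XmlNamespaceManager(xmlDocument.NameTable);
        mgr.AddNamespace("ns", "www.mydomain.com");

        // Every element in the path carries the prefix.
        XmlNode node = xmlDocument.SelectSingleNode("/ns:myxml/ns:PERSON/ns:NAMETAG", mgr);
        return node?.InnerText;
    }

    // Shows that the same path without prefixes finds nothing.
    public static bool UnprefixedPathMatches(string xml)
    {
        var xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xml);
        return xmlDocument.SelectSingleNode("/myxml/PERSON/NAMETAG") != null;
    }
}
```

The prefix "ns" only has to match between AddNamespace and the XPath expression; it does not need to appear in the XML file.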

The problem is with the newcustomXML value: the XML declaration contains two stray '\' characters, and the start tag of the root element is written as <Person> instead of <PERSON>. XML names are case-sensitive, so the start tag does not match the </PERSON> end tag.
So, try using the following instead:
string newcustomXML = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<PERSON>
<NAMETAG>John Doe</NAMETAG>
<DOBTAG>01/01/2020</DOBTAG>
<SCORE1TAG>90.5</SCORE1TAG>
<SCORE2TAG>100.0</SCORE2TAG>
</PERSON>";
Also regarding your navigation attempt, try using this:
static void Main(string[] args)
{
Console.WriteLine("Hello World!");
byte[] byteArray = File.ReadAllBytes(@"C:\Simple_Template.docx");
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
CustomXmlPart xmlPart = doc.MainDocumentPart.CustomXmlParts.First();
XmlDocument xmlDocument = new XmlDocument();
using (var inputStream = xmlPart.GetStream(FileMode.Open, FileAccess.Read))
using (var outputStream = new MemoryStream())
{
xmlDocument.Load(inputStream);
xmlDocument.SelectSingleNode("/PERSON/NAMETAG").InnerText = "John Doe";
xmlDocument.Save(outputStream);
outputStream.Seek(0, SeekOrigin.Begin);
xmlPart.FeedData(outputStream);
}
doc.MainDocumentPart.Document.Save();
doc.Close();
File.WriteAllBytes(@"C:\Simple_Template_Replace.docx", stream.ToArray());
}
}

Related

Why XDocument encoding type changed while use XDocument's WriteTo method

I have used XDocument to create a simple XML document. I created the document with XDocument and XDeclaration.
XDocument encodedDoc8 = new XDocument(new XDeclaration("1.0", "utf-8", "yes"),
new XElement("Root", "Content"));
If I save this document to a file, the encoding is not changed.
using (TextWriter sw = new StreamWriter(@"C:\sample.txt", false)){
encodedDoc8.Save(sw);
}
Output:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Root>Content</Root>
But if I use XDocument's WriteTo method to print the XML, the encoding is changed.
using (XmlWriter writ = XmlWriter.Create(Console.Out))
{
encodedDoc8.WriteTo(writ);
}
Output:
<?xml version="1.0" encoding="IBM437" standalone="yes"?><Root>Content</Root>
Why does this happen? Thanks in advance.
If you look at the reference source for XmlWriter.Create, the chain of calls would eventually lead to this constructor:
public XmlTextWriter(TextWriter w) : this() {
textWriter = w;
encoding = w.Encoding;
xmlEncoder = new XmlTextEncoder(w);
xmlEncoder.QuoteChar = this.quoteChar;
}
The assignment encoding = w.Encoding provides an explanation to what is happening in your case: the Encoding setting of the Console.Out text writer is copied to the encoding setting of the newly created XmlTextWriter, replacing the encoding that you supplied in the XDocument.
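To see the effect without a console, here is a small sketch that swaps Console.Out for a StringWriter. Utf8StringWriter, EncodingDemo and WriteWith are hypothetical names introduced for this example; overriding the Encoding property is a common workaround and not something the question's code does. A plain StringWriter reports UTF-16, so the declaration says utf-16; the subclass reports UTF-8 instead.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// A StringWriter that reports UTF-8 instead of the default UTF-16.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class EncodingDemo
{
    // Writes the sample document through an XmlWriter wrapped around 'target'
    // and returns whatever text ended up in the writer.
    public static string WriteWith(TextWriter target)
    {
        var doc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement("Root", "Content"));

        using (XmlWriter writer = XmlWriter.Create(target))
        {
            // The declaration's encoding comes from target.Encoding,
            // not from the XDeclaration above.
            doc.WriteTo(writer);
        }
        return target.ToString();
    }
}
```

Calling EncodingDemo.WriteWith(new StringWriter()) and EncodingDemo.WriteWith(new Utf8StringWriter()) side by side shows the declaration following the writer rather than the document.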

Specify encoding XmlSerializer

I have a correctly defined class, but after serializing it to XML the declaration contains no encoding.
How can I define encoding "ISO-8859-1"?
Here's a sample code
var xml = new XmlSerializer(typeof(Transacao));
var file = new FileStream(Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "transacao.xml"),FileMode.OpenOrCreate);
xml.Serialize(file, transacao);
file.Close();
Here is the beginning of the generated XML:
<?xml version="1.0"?>
<requisicao-transacao xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<dados-ec>
<numero>1048664497</numero>
The following should work:
var xml = new XmlSerializer(typeof(Transacao));
var fname = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "transacao.xml");
var appendMode = false;
var encoding = Encoding.GetEncoding("ISO-8859-1");
using(StreamWriter sw = new StreamWriter(fname, appendMode, encoding))
{
xml.Serialize(sw, transacao);
}
If you don't mind me asking, why do you need ISO-8859-1 encoding in particular? You could probably use UTF-8 or UTF-16 (they're more commonly recognizable) and get away with it.
Create a StreamWriter with the desired encoding:
System.Text.Encoding code = System.Text.Encoding.GetEncoding("ISO-8859-1"); // or whichever encoding you want
StreamWriter sw = new StreamWriter(file, code);

XML Canonicalization returns empty elements in the transformed output

I have a related post asking how to select nodes from an XmlDocument using an XPath statement.
The only way I could get the SelectNodes to work was to create a non default namespace "x" and then explicitly reference the nodes in the XPath statement.
Whilst this works and provides me with a node list, the canonicalization then fails to produce any content for my selected nodes in the output.
I've tried using XmlDsigExcC14NTransform and specifying the namespace but this produces the same output.
Below is an example of the xml output produced (using the XML in my related post):
<Applications xmlns="http://www.myApps.co.uk/">
<Application>
<ApplicantDetails>
<Title>
</Title>
<Forename>
</Forename>
<Middlenames>
<Middlename>
</Middlename>
</Middlenames>
<PresentSurname>
</PresentSurname>
<CurrentAddress>
<Address>
<AddressLine1>
</AddressLine1>
<AddressLine2>
</AddressLine2>
<AddressTown>
</AddressTown>
<AddressCounty>
</AddressCounty>
<Postcode>
</Postcode>
<CountryCode>
</CountryCode>
</Address>
<ResidentFromGyearMonth>
</ResidentFromGyearMonth>
</CurrentAddress>
</ApplicantDetails>
</Application>
<Application>
<ApplicantDetails>
<Title>
</Title>
<Forename>
</Forename>
<Middlenames>
<Middlename>
</Middlename>
</Middlenames>
<PresentSurname>
</PresentSurname>
<CurrentAddress>
<Address>
<AddressLine1>
</AddressLine1>
<AddressLine2>
</AddressLine2>
<AddressTown>
</AddressTown>
<AddressCounty>
</AddressCounty>
<Postcode>
</Postcode>
<CountryCode>
</CountryCode>
</Address>
<ResidentFromGyearMonth>
</ResidentFromGyearMonth>
</CurrentAddress>
</ApplicantDetails>
</Application>
</Applications>
Another StackOverflow user has had a similar problem here
Playing around with this new code, I found that the results differ depending upon how you pass the nodes into the LoadInput method. Implementing the code below worked.
I'm still curious as to why it works one way and not the other, but I will leave that for a rainy day.
static void Main(string[] args)
{
string path = @"..\..\TestFiles\Test_1.xml";
if (File.Exists(path) == true)
{
XmlDocument xDoc = new XmlDocument();
xDoc.PreserveWhitespace = true;
using (FileStream fs = new FileStream(path, FileMode.Open))
{
xDoc.Load(fs);
}
//Instantiate an XmlNamespaceManager object.
System.Xml.XmlNamespaceManager xmlnsManager = new System.Xml.XmlNamespaceManager(xDoc.NameTable);
//Add the namespaces used to the XmlNamespaceManager.
xmlnsManager.AddNamespace("x", "http://www.myApps.co.uk/");
// Create a list of nodes to have the Canonical treatment
XmlNodeList nodeList = xDoc.SelectNodes("/x:ApplicationsBatch/x:Applications|/x:ApplicationsBatch/x:Applications//*", xmlnsManager);
//Initialise the stream to read the node list
MemoryStream nodeStream = new MemoryStream();
XmlWriter xw = XmlWriter.Create(nodeStream);
nodeList[0].WriteTo(xw);
xw.Flush();
nodeStream.Position = 0;
// Perform the C14N transform on the nodes in the stream
XmlDsigC14NTransform transform = new XmlDsigC14NTransform();
transform.LoadInput(nodeStream);
// use a new memory stream for output of the transformed xml
// this could be done numerous ways if you don't wish to use a memory stream
MemoryStream outputStream = (MemoryStream)transform.GetOutput(typeof(Stream));
File.WriteAllBytes(@"..\..\TestFiles\CleanTest_1.xml", outputStream.ToArray());
}
}

Add a Text line in xml on a particular location?

How can I insert the following stylesheet information into my existing xml file which is created using C#?
<?xml-stylesheet type="text/xsl" href="_fileName.xsl"?>
Or.... Can I add this line at the time of creation of the new XML file?
Edit:
I tried to achieve the above using XmlSerializer (by trial and error), something like this:
// assumes 'XML' file exists.
XmlDocument doc = new XmlDocument();
XElement dataElements = XElement.Load("_fileName.xml");
XmlSerializer xs = new XmlSerializer(typeof(Parents));
var ms = new MemoryStream();
xs.Serialize(ms, parents);
ms.Seek(0, SeekOrigin.Begin); // rewind stream to beginning
doc.Load(ms);
XmlProcessingInstruction pi;
string data = "type=\"text/xsl\" href=\"_fileName.xsl\"";
pi = doc.CreateProcessingInstruction("xml-stylesheet", data);
doc.InsertBefore(pi, doc.DocumentElement); // insert before root
doc.DocumentElement.Attributes.RemoveAll(); // remove namespaces
But the output xml is getting corrupted:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="_fileName.xsl"?>
<parents />
Whereas the desired output is something like:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="_fileName.xsl"?>
<parents>
<parent>
<child1 />
<child2 />
</parent>
</parents>
Did this help to understand what's my problem???
You didn't answer the question "what lib do you use".
I would advise XDocument. If you use it, you could do something like:
XDocument document = new XDocument(new XDeclaration("1.0", "utf-8", "yes"));
document.Add(new XProcessingInstruction(
"xml-stylesheet", "type=\"text/xsl\" href=\"_fileName.xsl\""));
//and then your actual document...
document.Add(
new XElement("parent",
new XElement("child1"),
new XElement("child2")
)
);
EDIT:
OK, so you could do it like this:
XDocument document = XDocument.Load("file");
document.AddFirst(new XProcessingInstruction(
"xml-stylesheet", "type=\"text/xsl\" href=\"LogStyle.xsl\""));
Is this what you're looking for?

Deserialization error in XML document(1,1)

I have an XML file that I am trying to deserialize. The funny part is that the file was serialized using the following code:
var serializer = new XmlSerializer(typeof(CommonMessage));
var writer = new StreamWriter("OutPut.txt");
serializer.Serialize(writer, commonMessage);
writer.Close();
I am now trying to deserialize it again to check whether the output matches the input.
Here is my code to deserialize:
var serializer = new XmlSerializer(typeof(CommonMessage));
var reader = new StringReader(InputFileName);
CommonMessage commonMessage = (CommonMessage)serializer.Deserialize(reader);
Replace StringReader with StreamReader and it will work fine. StringReader reads value from the string (which is file name in your case).
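A minimal round-trip sketch of that fix follows. The CommonMessage class here is a hypothetical stand-in with a single Text property, since the real type is not shown in the question; RoundTrip, Save and Load are likewise names made up for this example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the CommonMessage type from the question.
public class CommonMessage
{
    public string Text { get; set; }
}

public static class RoundTrip
{
    // Serializes the message to the file at 'path'.
    public static void Save(string path, CommonMessage message)
    {
        var serializer = new XmlSerializer(typeof(CommonMessage));
        using (var writer = new StreamWriter(path))
        {
            serializer.Serialize(writer, message);
        }
    }

    // Reads the file back. StreamReader opens the file named by 'path';
    // a StringReader would instead try to parse the path text as XML.
    public static CommonMessage Load(string path)
    {
        var serializer = new XmlSerializer(typeof(CommonMessage));
        using (var reader = new StreamReader(path))
        {
            return (CommonMessage)serializer.Deserialize(reader);
        }
    }
}
```

With StringReader, the "document" handed to the serializer is the file name itself, so parsing fails on the very first character, which matches the (1,1) position in the error.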
I just had the same error message but different error source. In case someone has the same problem like me. I chopped off the very first char of my xml string by splitting strings. And the xml string got corrupted:
"?xml version="1.0" encoding="utf-16"?> ..." // my error
"<?xml version="1.0" encoding="utf-16"?> ..." // correct
(1,1) means basically first char of the first line is incorrect and the string can't be deserialized.
Add the XmlRoot attribute to your CommonMessage class, naming your root element, e.g. [XmlRoot("UIIVerificationResponse")]
You should disable the byte order mark (BOM) by passing this encoding to the StreamWriter constructor:
UTF8Encoding(false)
Full sample:
using (MemoryStream stream = new MemoryStream())
using (StreamWriter writer = new StreamWriter(stream, new UTF8Encoding(false)))
{
xmlSerializer.Serialize(writer, objectToSerialize, ns);
return Encoding.UTF8.GetString(stream.ToArray());
}
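To illustrate what the BOM does to the resulting string, here is a small sketch. BomDemo and WriteWith are hypothetical names, and the text written is a placeholder rather than real serializer output. When the BOM is enabled, the decoded string starts with the invisible character U+FEFF, which can then trip up a parser at line 1, position 1.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Writes 'text' through a StreamWriter using the given encoding,
    // then decodes the raw bytes back into a string.
    public static string WriteWith(Encoding encoding, string text)
    {
        var stream = new MemoryStream();
        using (var writer = new StreamWriter(stream, encoding))
        {
            writer.Write(text);
        } // disposing the writer flushes it; ToArray still works afterwards

        // GetString does not strip a leading BOM from the bytes.
        return Encoding.UTF8.GetString(stream.ToArray());
    }
}
```

Comparing BomDemo.WriteWith(new UTF8Encoding(true), xml) with BomDemo.WriteWith(new UTF8Encoding(false), xml) shows that only the first result carries the extra leading character.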
