I am trying to quickly and correctly serialize an XDocument object. I have tried several things, but this last one (found it here) seems simple and straightforward:
StringBuilder b = new StringBuilder();
XmlWriterSettings sett = new XmlWriterSettings();
sett.Encoding = Encoding.UTF8;
XmlWriter xw = XmlWriter.Create(b, sett);
doc.Save(xw);
String r = b.ToString();
However, at the end, r is just an empty string. Am I missing something? Why is it so hard to correctly serialize an XDocument object?
The frustrating thing is that if I call doc.ToString() I get a nice serialized XML string, without declaration. If I call doc.ToString(true) I get an empty string (doc.Declaration is set).
I figured it out. Still not convinced this is the "right" way to do it, but here goes:
MemoryStream s = new MemoryStream();
using (TextWriter b = new StreamWriter(s, Encoding.UTF8))
doc.Save(b);
String r = Encoding.UTF8.GetString(s.ToArray());
This results in a correctly encoded and correctly declared XML string.
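A possible alternative, sketched here rather than taken from the original posts: the first attempt most likely came back empty because the XmlWriter was never flushed or disposed before b.ToString() was read. The sketch below saves into a StringWriter whose Encoding property reports UTF-8, so the declaration says utf-8, and it avoids the byte order mark that a StreamWriter created with Encoding.UTF8 writes at the start of the stream. The names Utf8StringWriter and XDocumentText are hypothetical helpers, not framework types.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: a StringWriter that reports UTF-8 instead of UTF-16,
// so the XML declaration written by XDocument.Save says encoding="utf-8".
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class XDocumentText
{
    // Serializes the document, including its XML declaration, to a string.
    public static string Serialize(XDocument doc)
    {
        using (var writer = new Utf8StringWriter())
        {
            // Save flushes its internal XmlWriter before returning.
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Usage would be `string r = XDocumentText.Serialize(doc);`. The resulting string is still a .NET (UTF-16) string in memory; only the declared encoding changes, which matters if the text is later written out as UTF-8 bytes.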
I have a string that is displayed as XML, but it contains some characters that are invalid there, for example the string
s = <root> something here <XMLElement>hello</XMLElement> somethig here too </root>
where XMLElement is one of a list of names such as {"bold", "italic", ...}.
What I need is for the < and </ that are followed by any of the XMLElements (and the matching >) to be replaced by &lt; or &gt;, depending on the case.
The <root> element must be kept.
So far I have tried a regex:
strAux = Regex.Replace(strAux, "bold=\"[^\"]*\"",
match => match.Value.Replace("<", "&lt;").Replace(">", "&gt;"));
or
List<string> startsWith = new List<string> { "<", "</"};
foreach(var stw in startsWith)
{
int nextLt = 0;
while ((nextLt = strAux.IndexOf(stw, nextLt)) != -1)
{
bool isMatch = strAux.Substring(nextLt + 1).StartsWith(BoldElement); // needs to check all the XMLElements
//is element, leave it
if (isMatch)
{
//its not, replace
strAux = string.Format(@"{0}&lt;{1}", strAux.Substring(0, nextLt), strAux.Substring(nextLt +1, strAux.Length - (nextLt + 1)));
}
nextLt++;
}
}
Also tried
XmlDocument doc = new XmlDocument();
XmlElement element = doc.CreateElement("root");
element.InnerText = strAux;
Console.WriteLine(element.OuterXml);
strAux = element.OuterXml.Replace("<root>", "").Replace("</root>", "");
return strAux;
But it will escape the `<root>` too.
But nothing worked as I expected. Are there any other ideas? Thanks.
What you have is well-formed XML, so you can use the XML APIs to help you:
Using LINQ to XML (which is generally the better API):
var element = XElement.Parse(s);
element.Value = string.Concat(element.Nodes());
var result = element.ToString();
Or using the older XmlDocument API:
var doc = new XmlDocument();
doc.LoadXml(s);
var root = doc.DocumentElement;
root.InnerText = root.InnerXml;
var result = root.OuterXml;
The result for both is:
<root> something here &lt;XMLElement&gt;hello&lt;/XMLElement&gt; somethig here too </root>
See this fiddle for a demo.
You should be using the XmlWriter class.
Sample from the documentation:
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.ConformanceLevel = ConformanceLevel.Fragment;
settings.CloseOutput = false;
// Create the XmlWriter object and write some content.
MemoryStream strm = new MemoryStream();
XmlWriter writer = XmlWriter.Create(strm, settings);
writer.WriteElementString("someNode", "someValue");
writer.Flush();
writer.Close();
https://msdn.microsoft.com/en-us/library/system.xml.xmlwriter(v=vs.110).aspx
It sounds like your input is well-formed XML, but you want to escape some of the tags. The issue here is that there's no way for the code to know which tags are valid and which aren't.
One way to do this is to create a list of valid tags.
List<string> validTags = new List<string>() { "root", "..." };
Then use a regex to pick out all instances of <tag> or </tag> and replace them if they're not in the list.
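A minimal sketch of that regex approach, assuming the tags to escape are simple ones without attributes. TagEscaper and EscapeUnknownTags are hypothetical names introduced for illustration, and the pattern is an assumption about what a tag looks like in this input, not a general XML tokenizer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagEscaper
{
    // Matches simple opening or closing tags without attributes, e.g. <bold> or </bold>.
    // Group 1 captures the tag name.
    private static readonly Regex SimpleTag = new Regex(@"</?([A-Za-z_][\w.\-]*)\s*>");

    // Escapes every matched tag whose name is not in validTags; valid tags pass through unchanged.
    public static string EscapeUnknownTags(string xml, ISet<string> validTags)
    {
        return SimpleTag.Replace(xml, m =>
            validTags.Contains(m.Groups[1].Value)
                ? m.Value
                : m.Value.Replace("<", "&lt;").Replace(">", "&gt;"));
    }
}
```

With validTags containing only "root", the string from the question keeps its <root> element while <XMLElement> and </XMLElement> become escaped text. Tags with attributes or self-closing tags are not matched by this pattern and would need a wider expression.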
Another way which is faster and easier, but requires more information up front, is to create a list of tags which aren't valid.
List<string> invalidTags = new List<string>() { "XMLElement", "..." };
Simple string manipulation will do, now.
string s = GetYourXMLString();
invalidTags.ForEach(t => s = s.Replace($"</{t}>", $"&lt;/{t}&gt;")
.Replace($"<{t}>", $"&lt;{t}&gt;"));
The second way should really only be used if you know which foreign tags are making (or will ever make) an appearance. If not the first approach should be used. One clever possibility is to dynamically create the list of valid tags using reflection or a data contract so that changes to the XML spec will be automatically reflected in your code.
For example, if each element is a property of an object, you might get the list like this:
var validTags = typeof(MyObjectType).GetProperties()
.Select(p => p.Name)
.ToList();
Of course, the property names likely won't be the actual tag names, AND often you'll want to only include certain properties. So you make an attribute class to designate the desired properties (let's call it XMLTagName) and then you can do this:
var validTags = typeof(MyObjectType).GetProperties()
.Select(p => p.GetCustomAttribute<XMLTagName>()?.TagName)
.Where(tagName => tagName != null) //gets rid of properties that aren't tagged
.ToList();
Even with all that, you're still committing the crime of string manipulation on raw XML. The best real solution here is to figure out how to fix the incoming XML so that it actually contains the data you want. But if that's not a possibility, the above should do the job.
I feel like I am going mad. I have written a hundred deserializing routines, but this one is killing me!
Below is what I get returned from a service. A very simple array of strings...I think.
<ArrayOfstring xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<string>Action &amp; Adventure</string>
<string>Comedy</string>
<string>Drama</string>
<string>Family</string>
<string>Horror</string>
<string>Independent &amp; World</string>
<string>Romance</string>
<string>Sci-Fi/Fantasy</string>
<string>Thriller &amp; Crime</string>
</ArrayOfstring>
I am using out-of-the-box deserialization
var serializer = new XmlSerializer(typeof(List<string>));
var reader = new StringReader(xmlString);
var GenreList = (List<string>)serializer.Deserialize(reader);
but I get the following error on the Deserialize line:
<ArrayOfstring xmlns='http://schemas.microsoft.com/2003/10/Serialization/Arrays'> was not expected
I have tried including the namespace and creating all manner of exotic objects in an attempt to get this to work. Crazy amount of time. In the end I have requested it in JSON and deserialised that with Json.net.
However I am curious as to what I have been doing wrong!
Of course XmlSerializer can deserialize it. All you need is to create XmlSerializer as follows
var serializer = new XmlSerializer(typeof(List<string>),
new XmlRootAttribute() { ElementName = "ArrayOfstring", Namespace = "http://schemas.microsoft.com/2003/10/Serialization/Arrays" });
The XmlSerializer cannot deserialize a simple type or a list of simple types without additional specification, but the DataContractSerializer can:
string content = @"
<ArrayOfstring xmlns=""http://schemas.microsoft.com/2003/10/Serialization/Arrays"" xmlns:i=""http://www.w3.org/2001/XMLSchema-instance"">
<string>Action &amp; Adventure</string>
<string>Comedy</string>
<string>Drama</string>
<string>Family</string>
<string>Horror</string>
<string>Independent &amp; World</string>
<string>Romance</string>
<string>Sci-Fi/Fantasy</string>
<string>Thriller &amp; Crime</string>
</ArrayOfstring>";
var serializer = new DataContractSerializer(typeof(string[]));
var reader = new XmlTextReader(new StringReader(content));
var GenreList = new List<string>((string[])serializer.ReadObject(reader));
You can also use a simple class to achieve the same results. Note that I removed the namespaces from your XML file for brevity. You can implement reading of the namespaces in the serializer if you please.
public class ArrayOfstring
{
[XmlElement("string")]
public List<string> strings;
}
private void Deserialize(string xmlString)
{
var serializer = new XmlSerializer(typeof(ArrayOfstring));
var reader = new StringReader(xmlString);
var GenreList = ((ArrayOfstring) serializer.Deserialize(reader)).strings;
}
This will work:
DataContractSerializer xmlSer = new DataContractSerializer(typeof(string[]));
// ReadObject needs an XmlReader (or Stream), so wrap the XML string in a StringReader
XmlReader reader = XmlReader.Create(new StringReader(xmlString));
var stringArr = (string[])xmlSer.ReadObject(reader);
List<string> listStr = new List<string>();
foreach (var s in stringArr)
{
listStr.Add(s);
}
I realize this is an old question, but I recently ran into the same issue and wanted to share what worked for me. I tried all the approaches outlined as potential solutions, but couldn't get any of them to work. Even the XmlRootAttribute approach with the namespace specified threw the "was not expected" error reported in the original problem. I was getting the ArrayOfString as a response from an API, so I used an XDocument parse approach:
List<string> lstGenre = new List<string>();
var response = await client.PostAsync(url, content);
var responseString = await response.Content.ReadAsStringAsync();
XDocument xdoc = XDocument.Parse(responseString);
XNamespace ns = xdoc.Root.GetDefaultNamespace();
XElement root = xdoc.Element(XName.Get("ArrayOfString", ns.NamespaceName));
IEnumerable<XElement> list = root.Elements();
foreach (XElement element in list)
{
string item = element.Value; // <-- individual strings from the "ArrayOfString"
lstGenre.Add(item);
}
I basically want to know how to insert an XmlDocument inside another XmlDocument.
The first XmlDocument will have the basic header and footer tags.
The second XmlDocument will be the body/data tag which must be inserted into the first XmlDocument.
string tableData = null;
using(StringWriter sw = new StringWriter())
{
rightsTable.WriteXml(sw);
tableData = sw.ToString();
}
XmlDocument xmlTable = new XmlDocument();
xmlTable.LoadXml(tableData);
StringBuilder build = new StringBuilder();
using (XmlWriter writer = XmlWriter.Create(build, new XmlWriterSettings { OmitXmlDeclaration = true }))
{
writer.WriteStartElement("dataheader");
//need to insert the xmlTable here somehow
writer.WriteEndElement();
}
Is there an easier solution to this?
Use the ImportNode feature of your document parser (XmlDocument.ImportNode).
You can use this code based on CreateCDataSection method
// Create an XmlCDataSection from your document
var cdata = xmlTable.CreateCDataSection("<test></test>");
XmlElement root = xmlTable.DocumentElement;
// Append the cdata section to your node
root.AppendChild(cdata);
Link : http://msdn.microsoft.com/fr-fr/library/system.xml.xmldocument.createcdatasection.aspx
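For the ImportNode suggestion, a minimal sketch might look like the following. It assumes both documents are already loaded and that the inner document's root element should become the last child of the outer document's root. XmlMerge and AppendDocument are hypothetical names introduced for illustration.

```csharp
using System;
using System.Xml;

public static class XmlMerge
{
    // Copies the root element of 'inner' into 'outer' as the last child of outer's root element.
    public static void AppendDocument(XmlDocument outer, XmlDocument inner)
    {
        // A node belongs to the document that created it, so it must be imported
        // (deep copy: true) before it can be appended to another document.
        XmlNode imported = outer.ImportNode(inner.DocumentElement, true);
        outer.DocumentElement.AppendChild(imported);
    }
}
```

Unlike the CDATA approach, the inserted content stays real XML in the outer document, so it can still be queried with XPath or transformed afterwards.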
I am not sure what you are really looking for but this can show how to merge two xml documents (using Linq2xml)
string xml1 =
@"<xml1>
<header>header1</header>
<footer>footer</footer>
</xml1>";
string xml2 =
@"<xml2>
<body>body</body>
<data>footer</data>
</xml2>";
var xdoc1 = XElement.Parse(xml1);
var xdoc2 = XElement.Parse(xml2);
xdoc1.Descendants().First(d => d.Name == "header").AddAfterSelf(xdoc2.Elements());
var newxml = xdoc1.ToString();
OUTPUT
<xml1>
<header>header1</header>
<body>body</body>
<data>footer</data>
<footer>footer</footer>
</xml1>
You will need to write the inner XML files in CDATA sections.
Use writer.WriteCData for such nodes, passing in the inner XML as text.
writer.WriteCData(xmlTable.OuterXml);
Another option (thanks DJQuimby) is to encode the XML to some XML compatible format (say base64) - note that the encoding used must be XML compatible and that some encoding schemes will increase the size of the encoded document (base64 adds ~30%).
There is a nice example on the Microsoft website (even for .NET 4):
Dim xmlDoc As XmlDataDocument = New XmlDataDocument(dataSet)
Dim xslTran As XslTransform = New XslTransform
xslTran.Load("transform.xsl")
Dim writer As XmlTextWriter = New XmlTextWriter("xslt_output.html", System.Text.Encoding.UTF8)
xslTran.Transform(xmlDoc, Nothing, writer)
http://technet.microsoft.com/en-us/query/8fd7xytc
No, unfortunately XmlDataDocument is deprecated, and nobody seems to have a good answer on how to replace it in this situation.
You can use the following code.
Use DataSet.GetXml() to get the xml as string and then create an XDocument by parsing the string:
string xml = dataSet.GetXml();
XDocument document = XDocument.Parse(xml);
The setup of the transformation and its output is the same, except using XslCompiledTransform:
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load("transform.xsl");
XmlTextWriter writer = new XmlTextWriter("Output.xml", System.Text.Encoding.UTF8);
And then you can use the XslCompiledTransform.Transform() overload that takes a reader as the first argument, which you can get from calling XDocument.CreateReader():
transform.Transform(document.CreateReader(), writer);
I have an XSLT transform issue:
style="width:{Data/PercentSpaceUsed}%;"
And the value of Data/PercentSpaceUsed is integer 3.
And it outputs:
style="width:
3
%;"
instead of what I expected:
style="width:3%;"
Here's the code that does the transform. xslt_xslt is the transform XML; sw.ToString() contains the &#xD; and &#xA; character references, which I did not expect.
var xslTransObj = new XslCompiledTransform();
var reader = new XmlTextReader(new StringReader(xslt_xslt));
xslTransObj.Load(reader);
var sw = new StringWriter();
var writer = new XmlTextWriter(sw);
xslTransObj.Transform(new XmlTextReader(new StringReader(xslt_data)), writer);
ResultLiteral.Text = sw.ToString();
The &#xD; and &#xA; are carriage returns and line feeds, either within your XML or your XSLT. Make sure the XML is like
<Value>3</Value>
Rather than
<Value>
3
</Value>
I believe there is a way to stop whitespace being used within your transformation, although I don't know it off the top of my head.
You're getting whitespace from the source document. Use
style="width:{normalize-space(Data/PercentSpaceUsed)}%;"
to strip out the whitespace. The other option in your case would be to use
style="width:{number(Data/PercentSpaceUsed)}%;"
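A small sketch of the normalize-space() fix run through XslCompiledTransform, under the assumption that the source document looks like <Data><PercentSpaceUsed>...</PercentSpaceUsed></Data>. The stylesheet, the div element, and the WidthStyleDemo class are made up for illustration; only the attribute value template comes from the thread.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class WidthStyleDemo
{
    // Illustrative stylesheet: the attribute value template trims the
    // whitespace around the number before it reaches the style attribute.
    private const string Xslt = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""xml"" omit-xml-declaration=""yes""/>
  <xsl:template match=""/"">
    <div style=""width:{normalize-space(Data/PercentSpaceUsed)}%;""/>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to dataXml and returns the transformed text.
    public static string Render(string dataXml)
    {
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(Xslt)))
        {
            transform.Load(xsltReader);
        }

        var output = new StringWriter();
        using (var dataReader = XmlReader.Create(new StringReader(dataXml)))
        {
            transform.Transform(dataReader, null, output);
        }
        return output.ToString();
    }
}
```

Calling Render with a PercentSpaceUsed element whose text is the digit 3 surrounded by line breaks should give a style attribute of width:3%; with no character references in it.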
Try
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.Indent = true;
settings.IndentChars = "\t";
settings.NewLineHandling = NewLineHandling.None;
XmlWriter writer = XmlWriter.Create(xmlpath, settings);
so that input whitespace in attribute values is preserved on output.
Note: with the settings above, tabs are used for indentation.