I am trying to compare XML user input to a valid XML string. I remove the values from the user input and compare the result to the valid XML; my code is at the bottom. As the XML examples show, in the user input goodslines has two goodsline children, and the second one has fewer children than the valid XML. How can I alter my code so that this case also returns true when compared? Thanks in advance.
Valid XML
<?xml version="1.0" encoding="Windows-1252"?>
<goodslines>
<goodsline>
<unitamount></unitamount>
<unit_id matchmode="1"></unit_id>
<product_id matchmode="1"></product_id>
<weight></weight>
<loadingmeter></loadingmeter>
<volume></volume>
<length></length>
<width></width>
<height></height>
</goodsline>
</goodslines>
User input
<?xml version="1.0" encoding="Windows-1252"?>
<goodslines>
<goodsline>
<unitamount>5</unitamount>
<unit_id matchmode="1">colli</unit_id>
<product_id matchmode="1">1109</product_id>
<weight>50</weight>
<loadingmeter>0.2</loadingmeter>
<volume>0.036</volume>
<length>20</length>
<width>20</width>
<height>90</height>
</goodsline>
<goodsline>
<unitamount>12</unitamount>
<unit_id matchmode="1">drums</unit_id>
<product_id matchmode="1">1109</product_id>
<weight>345</weight>
</goodsline>
</goodslines>
Code
public static string Format(string xml)
{
try
{
var stringBuilder = new StringBuilder();
var element = XDocument.Parse(xml);
var settings = new XmlWriterSettings
{
OmitXmlDeclaration = true,
Indent = true,
IndentChars = new string(' ', 3),
NewLineChars = Environment.NewLine,
NewLineOnAttributes = false,
NewLineHandling = NewLineHandling.Replace
};
using (var xmlWriter = XmlWriter.Create(stringBuilder, settings))
element.Save(xmlWriter);
return stringBuilder.ToString();
}
catch(Exception ex)
{
return "Unable to format XML" + ex;
}
}
public static bool Compare(string xmlA, string xmlB)
{
if(xmlA == null || xmlB == null)
return false;
var xmlFormattedA = Format(xmlA);
var xmlFormattedB = Format(xmlB);
return xmlFormattedA.Equals(xmlFormattedB, StringComparison.InvariantCultureIgnoreCase);
}
public static string NoText(string request)
{
string pattern = @"<.*?>";
Regex rg = new Regex(pattern);
var noTextArr = rg.Matches(request)
.Cast<Match>()
.Select(m => m.Value)
.ToArray();
string noText = string.Join("", noTextArr);
return noText;
}
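One possible direction, sketched under assumptions: instead of comparing formatted strings, walk both trees and check that every element in the user input is allowed by the valid XML. The names XmlShapeComparer and MatchesTemplate are hypothetical, and the rules chosen here (element names and attribute names must match, values are ignored, repeated or missing children are accepted) are an interpretation of the question rather than a confirmed requirement. Unlike the Compare method above, this comparison is case-sensitive.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: checks the "shape" of user input against a template.
public static class XmlShapeComparer
{
    // Returns true when every element in inputXml is allowed by templateXml.
    // Element values are ignored; repeated and missing children are accepted.
    public static bool MatchesTemplate(string templateXml, string inputXml)
    {
        if (templateXml == null || inputXml == null)
            return false;
        try
        {
            XElement template = XDocument.Parse(templateXml).Root;
            XElement input = XDocument.Parse(inputXml).Root;
            return Matches(template, input);
        }
        catch (XmlException)
        {
            return false; // one of the strings is not well-formed XML
        }
    }

    private static bool Matches(XElement template, XElement input)
    {
        if (template.Name != input.Name)
            return false;

        // Attribute names must agree; attribute values are not compared here.
        var templateAttrs = template.Attributes().Select(a => a.Name).OrderBy(n => n.ToString());
        var inputAttrs = input.Attributes().Select(a => a.Name).OrderBy(n => n.ToString());
        if (!templateAttrs.SequenceEqual(inputAttrs))
            return false;

        // Each input child must correspond to a template child of the same name.
        foreach (XElement child in input.Elements())
        {
            XElement templateChild = template.Element(child.Name);
            if (templateChild == null || !Matches(templateChild, child))
                return false;
        }
        return true;
    }
}
```

With this approach the NoText and Format steps are not needed, because the values are never part of the comparison. If attribute values such as matchmode="1" must also match, the attribute check would have to compare values as well.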
Related
I have a string property which will contain text with newlines. This text has some of the properties of HTML text in that whitespace is disregarded.
If I serialize this type using XML serialization, the newlines are serialized properly, but the indentation is "wrong". I want the serialization process to indent the lines to keep the formatting of the XML, since those whitespace characters will be disregarded later anyway.
Here's an example LINQPad program:
void Main()
{
var d = new Dummy();
d.Text = @"Line 1
Line 2
Line 3";
var serializer = new XmlSerializer(typeof(Dummy));
var ns = new XmlSerializerNamespaces();
ns.Add("", "");
using (var writer = new StringWriter())
{
serializer.Serialize(writer, d, ns);
writer.ToString().Dump();
}
}
[XmlType("dummy")]
public class Dummy
{
[XmlElement("text")]
public string Text
{
get;
set;
}
}
Actual output:
<?xml version="1.0" encoding="utf-16"?>
<dummy>
<text>Line 1
Line 2
Line 3</text>
</dummy>
Desired output:
<?xml version="1.0" encoding="utf-16"?>
<dummy>
<text>
Line 1
Line 2
Line 3
</text>
</dummy>
Is this possible? If so, how? I'd rather not do the hackish way of just adding the whitespace in myself.
The reason for this is that this XML will be viewed and edited by people, so I'd like for the initial output to be better formatted for them out of the box.
I bumped into the same issue. In the end I came up with a custom writer:
public class IndentTextXmlWriter : XmlTextWriter
{
private int indentLevel;
private bool isInsideAttribute;
public IndentTextXmlWriter(TextWriter textWriter): base(textWriter)
{
}
public bool IndentText { get; set; }
public override void WriteStartAttribute(string prefix, string localName, string ns)
{
isInsideAttribute = true;
base.WriteStartAttribute(prefix, localName, ns);
}
public override void WriteEndAttribute()
{
isInsideAttribute = false;
base.WriteEndAttribute();
}
public override void WriteStartElement(string prefix, string localName, string ns)
{
indentLevel++;
base.WriteStartElement(prefix, localName, ns);
}
public override void WriteEndElement()
{
indentLevel--;
base.WriteEndElement();
}
public override void WriteString(string text)
{
if (String.IsNullOrEmpty(text) || isInsideAttribute || Formatting != Formatting.Indented || !IndentText || XmlSpace == XmlSpace.Preserve)
{
base.WriteString(text);
return;
}
string[] lines = text.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
string indent = new string(IndentChar, indentLevel * Indentation);
foreach (string line in lines)
{
WriteRaw(Environment.NewLine);
WriteRaw(indent);
WriteRaw(line.Trim());
}
WriteRaw(Environment.NewLine);
WriteRaw(new string(IndentChar, (indentLevel - 1) * Indentation));
}
}
You can use it like this:
[TestMethod]
public void WriteIndentedText()
{
var result = new StringBuilder();
using (var writer = new IndentTextXmlWriter(new StringWriter(result)){Formatting = Formatting.Indented, IndentText = true})
{
string text = @" Line 1
Line 2
Line 3 ";
// some root
writer.WriteStartDocument();
writer.WriteStartElement("root");
writer.WriteStartElement("child");
// test auto-indenting
writer.WriteStartElement("elementIndented");
writer.WriteString(text);
writer.WriteEndElement();
// test space preserving
writer.WriteStartElement("elementPreserved");
writer.WriteAttributeString("xml", "space", null, "preserve");
writer.WriteString(text);
writer.WriteEndDocument();
}
Debug.WriteLine(result.ToString());
}
And the output:
<?xml version="1.0" encoding="utf-16"?>
<root>
<child>
<elementIndented>
Line 1
Line 2
Line 3
</elementIndented>
<elementPreserved xml:space="preserve"> Line 1
Line 2
Line 3 </elementPreserved>
</child>
</root>
I have XML data in a string. I want to split it and display the result in a Label.
Here is my code:
string param = "<HCToolParameters><BatchId>12</BatchId><HCUser>Admin</HCUser></HCToolParameters>";
var a = param.Split(new string[] { "<HCToolParameters>" }, StringSplitOptions.RemoveEmptyEntries);
var b = param.Split(new string[] { "<BatchId>12</BatchId>" }, StringSplitOptions.RemoveEmptyEntries);
var c = param.Split(new string[] { "<HCUser>Admin</HCUser>" }, StringSplitOptions.RemoveEmptyEntries);
var d = param.Split(new string[] { "</HCToolParameters>" }, StringSplitOptions.RemoveEmptyEntries);
Example:
String value =
<HCToolParameters><BatchId>12</BatchId><HCUser>Admin</HCUser></HCToolParameters>
Expected Result:
<HCToolParameters>
<BatchId>12</BatchId>
<HCUser>Admin</HCUser>
</HCToolParameters>
From what I see, you have valid XML to begin with, so stop splitting it and use an XML parser!
string param = @"<HCToolParameters><BatchId>12</BatchId><HCUser>Admin</HCUser></HCToolParameters>";
XDocument doc = XDocument.Parse(param);
Console.WriteLine(doc.ToString());
You could do it easily like this:
value = value.Replace("><", ">" + Environment.NewLine + "<");
This works for your example and is simple. If you need the result as an array (I don't really know why you would do it this way):
var array = value.Replace("><", ">#<").Split('#');
You can use XmlTextWriter.Formatting = Formatting.Indented, because from what I see you want to format your XML string. This function might do the trick for you:
public static String FormatMyXML(String SomeXML)
{
String Result = "";
MemoryStream mStream = new MemoryStream();
XmlTextWriter wrtr = new XmlTextWriter(mStream, Encoding.Unicode);
XmlDocument document = new XmlDocument();
try
{
document.LoadXml(SomeXML);
wrtr.Formatting = Formatting.Indented;
document.WriteContentTo(wrtr);
wrtr.Flush();
mStream.Flush();
mStream.Position = 0;
StreamReader sReader = new StreamReader(mStream);
String FormattedXML = sReader.ReadToEnd();
Result = FormattedXML;
}
catch (XmlException)
{
}
mStream.Close();
wrtr.Close();
return Result;
}
I want to convert the string to XML.
I have a string like below. It contains the Programming language names.
string lang = "java,php,c#,asp.net,spring,hibernate";
I want to convert this string to XML format like below:
<Languages>
<lang Name="java"/>
<lang Name="php"/>
<lang Name="c#"/>
<lang Name="asp.net"/>
<lang Name="spring"/>
<lang Name="hibernate"/>
</Languages>
I want to store this XML data in a variable to store later in a database.
It can also be done using Linq-to-XML:
using System.Xml.Linq; // required namespace
XDocument xmlDoc = new XDocument();
XElement xElm = new XElement("Languages",
from l in lang.Split(',')
select new XElement("lang", new XAttribute("Name", l)
)
);
xmlDoc.Add(xElm);
string lang = "java,php,c#,asp.net,spring,hibernate";
string[] langs = lang.Split(',');
XmlDocument document = new XmlDocument();
XmlElement root = document.CreateElement("Languages");
document.AppendChild(root);
for (int i = 0; i < langs.Length; i++)
{
XmlElement langElement = document.CreateElement("lang");
XmlAttribute nameAttr = document.CreateAttribute("Name");
nameAttr.Value = langs[i];
langElement.Attributes.Append(nameAttr);
root.AppendChild(langElement);
}
document.WriteTo(new XmlTextWriter(Console.Out) {
Formatting = Formatting.Indented
});
A short version of what you have done, using Linq and the string manipulation functions:
var values = lang.Split(','); // Splits the CSV
var xmlBody = values.Select(v => string.Format("<lang Name=\"{0}\"/>", v));
var xml = string.Join(string.Empty, xmlBody); // Potentially add a new line as a separator
xml = string.Format("<Languages>{0}</Languages>", xml);
The other option is to convert your csv into a model that implements ISerialize and then use the xml serializer. That is more code and not necessarily bad. If you would like to see an example, feel free to ask and I will post an example.
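A minimal sketch of that serializer-based option, under assumptions: the class names Languages, Language and LanguagesXml are hypothetical, and the model here does not implement any interface, because XmlSerializer can work with plain public classes decorated with attributes such as XmlRoot, XmlElement and XmlAttribute.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical model for the <Languages> document.
[XmlRoot("Languages")]
public class Languages
{
    // Each array item becomes a <lang .../> child of the root.
    [XmlElement("lang")]
    public Language[] Items { get; set; }
}

public class Language
{
    // Written as the Name="..." attribute.
    [XmlAttribute("Name")]
    public string Name { get; set; }
}

public static class LanguagesXml
{
    // Converts "java,php,c#" into the XML string for a Languages model.
    public static string FromCsv(string csv)
    {
        var model = new Languages
        {
            Items = csv.Split(',')
                       .Select(name => new Language { Name = name.Trim() })
                       .ToArray()
        };

        var serializer = new XmlSerializer(typeof(Languages));
        var ns = new XmlSerializerNamespaces();
        ns.Add("", ""); // suppress the default xsi/xsd namespace declarations

        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, model, ns);
            return writer.ToString();
        }
    }
}
```

Calling LanguagesXml.FromCsv(lang) returns the XML as a string that can be stored in a variable. Because the serializer writes the attribute values, characters such as & or < in a name are escaped for you, which the string.Format version does not do.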
This is working,
class Program
{
static void Main(string[] args)
{
string lang = "java,php,c#,asp.net,spring,hibernate";
StringBuilder sb = new StringBuilder();
sb.AppendFormat("<Languages>");
foreach (string s in lang.Split(','))
{
sb.AppendFormat("<lang Name=\"{0}\"/>", s);
}
sb.AppendFormat("</Languages>");
Console.WriteLine(sb.ToString());
Console.ReadLine();
}
}
I have following string:
OK:<IDP RESULT="0" MESSAGE="some message" ID="oaisjd98asdh339wnf" MSGTYPE="Done"/>
I use this method to parse and get result:
public string MethodName(string capt)
{
var receivedData = capt.Split(' ').ToArray();
string _receivedReultValue = "";
foreach (string s in receivedData)
{
if (s.Contains('='))
{
string[] res = s.Split('=').ToArray();
if (res[0].ToUpper() == "RESULT")
{
string resValue = res[1];
resValue = resValue.Replace("\\", " ");
_receivedReultValue = resValue.Replace("\"", " ");
}
}
}
return _receivedReultValue.Trim();
}
Is there better way to parse string like this to extract data?
What you have isn't all that bad. But, because it's XML you could do this:
class Program
{
static void Main(string[] args)
{
var capt = "OK:<IDP RESULT=\"0\" MESSAGE=\"some message\" ID=\"oaisjd98asdh339wnf\" MSGTYPE=\"Done\"/>";
var stream = new MemoryStream(Encoding.Default.GetBytes(capt.Substring(capt.IndexOf("<"))));
var kvpList = XDocument.Load(XmlReader.Create(stream))
.Elements().First()
.Attributes()
.Select(a => new
{
Attr = a.Name.LocalName,
Val = a.Value
});
}
}
That would give you an IEnumerable of that anonymous type.
You can use XDocument. Assuming that you remove the "OK:" at the beginning, you can do it like this:
static void Main(string[] args)
{
var str = "<IDP RESULT=\"0\" MESSAGE=\"some message\" ID=\"oaisjd98asdh339wnf\" MSGTYPE=\"Done\"/>";
var doc = XDocument.Parse(str);
var element = doc.Element("IDP");
Console.WriteLine("RESULT: {0}", element.Attribute("RESULT").Value);
Console.WriteLine("MESSAGE: {0}", element.Attribute("MESSAGE").Value);
Console.WriteLine("ID: {0}", element.Attribute("ID").Value);
Console.WriteLine("MSGTYPE: {0}", element.Attribute("MSGTYPE").Value);
Console.ReadKey();
}
EDIT: I tested the code above on .NET 4.5. For 3.5 I had to change it a bit
static void Main(string[] args)
{
const string str = "<IDP RESULT=\"0\" MESSAGE=\"some message\" ID=\"oaisjd98asdh339wnf\" MSGTYPE=\"Done\"/>";
var ms = new MemoryStream(Encoding.ASCII.GetBytes(str));
var rdr = new XmlTextReader(ms);
var doc = XDocument.Load(rdr);
var element = doc.Element("IDP");
Console.WriteLine("RESULT: {0}", element.Attribute("RESULT").Value);
Console.WriteLine("MESSAGE: {0}", element.Attribute("MESSAGE").Value);
Console.WriteLine("ID: {0}", element.Attribute("ID").Value);
Console.WriteLine("MSGTYPE: {0}", element.Attribute("MSGTYPE").Value);
Console.ReadKey();
}
Sure. It looks like XML, so you can use normal XML methods for this.
If you remove "OK:" and add an XML declaration:
<?xml version="1.0" ?>
<IDP RESULT="0" MESSAGE="some message" ID="oaisjd98asdh339wnf" MSGTYPE="Done"/>
it can be parsed by any XML parser. Try xmllint to check it.
You can use a Regex to obtain all key/value pairs:
string str = @"OK:<IDP RESULT=""0"" MESSAGE=""some message"" ID=""oaisjd98asdh339wnf"" MSGTYPE=""Done""/>";
var matches = Regex.Matches(str, @"(?<Key>\w+)=""(?<Value>[^""]+)""");
Then you can access the RESULT attribute:
string result = null;
// The lambda parameter must not reuse the name of the local variable 'match'.
var match = matches.OfType<Match>()
.FirstOrDefault(m => m.Groups["Key"].Value == "RESULT");
if (match != null)
{
result = match.Groups["Value"].Value;
}
Try this, it is very simple:
string xml = @"OK:<IDP RESULT=""0"" MESSAGE=""some message"" ID=""oaisjd98asdh339wnf"" MSGTYPE=""Done""/>";
XElement xElement = XElement.Parse(new string(xml.Skip(3).ToArray()));
//for example message
var message = xElement.Attribute("MESSAGE").Value;
I am trying to parse the following Java resources file, which is XML.
I am parsing it with C# and XDocument, so this is not a Java question.
<?xml version="1.0" encoding="utf-8"?>
<resources>
<string name="problem">&#160;test&#160;</string>
<string name="no_problem"> test </string>
</resources>
The problem is that the XDocument.Load(string path) method loads this as an XDocument with 2 seemingly identical XElements.
I load the file.
string filePath = @"c:\res.xml"; // whatever
var xDocument = XDocument.Load(filePath);
When I parse the XDocument object, here is the problem.
foreach (var node in xDocument.Root.Nodes())
{
if (node.NodeType == XmlNodeType.Element)
{
var xElement = node as XElement;
if (xElement != null) // just to be sure
{
var elementText = xElement.Value;
Console.WriteLine("Text = '{0}', Length = {1}",
elementText, elementText.Length);
}
}
}
This produces the following 2 lines :
"Text = ' test ', Length = 6"
"Text = ' test ', Length = 6"
I want to get the following 2 lines :
"Text = '&#160;test&#160;', Length = 16"
"Text = ' test ', Length = 6"
Document encoding is UTF8, if this is relevant somehow.
string filePath = @"c:\res.xml"; // whatever
var xDocument = XDocument.Load(filePath);
String one = (xDocument.Root.Nodes().ElementAt(0) as XElement).Value;//< test >
String two = (xDocument.Root.Nodes().ElementAt(1) as XElement).Value;//< test >
Console.WriteLine(one == two); //false
Console.WriteLine(String.Format("{0} {1}", (int)one[0], (int)two[0]));//160 32
You have two different strings: the non-breaking space is there, but as the Unicode character U+00A0 (code 160) rather than an ordinary space (code 32).
One possible way to get things back is to manually replace the non-breaking space with " ":
String result = one.Replace(((char) 160).ToString(), " ");
Thanks to Dmitry, following his suggestion, I have made a function to make stuff work for a list of unicode codes.
private static readonly List<int> UnicodeCharCodesReplace =
new List<int>() { 160 }; // put integers here
public static string UnicodeUnescape(this string input)
{
var chars = input.ToCharArray();
var sb = new StringBuilder();
foreach (var c in chars)
{
if (UnicodeCharCodesReplace.Contains(c))
{
// Append &#code; instead of character
sb.Append("&#");
sb.Append(((int) c).ToString());
sb.Append(";");
}
else
{
// Append character itself
sb.Append(c);
}
}
return sb.ToString();
}
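To illustrate the behaviour discussed above, here is a small self-contained sketch. It assumes the resource file contains the character reference &#160; in one of the elements; the names NbspDemo, ReadValues and EscapeNbsp are hypothetical, and EscapeNbsp is a simplified stand-in for the extension method above that handles only code 160.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical demo of how XDocument treats the &#160; character reference.
public static class NbspDemo
{
    // Returns the text value of every child element of the root.
    public static string[] ReadValues(string xml)
    {
        return XDocument.Parse(xml).Root.Elements().Select(e => e.Value).ToArray();
    }

    // Turns each non-breaking space (U+00A0) back into the text "&#160;".
    public static string EscapeNbsp(string input)
    {
        return input.Replace("\u00A0", "&#160;");
    }

    public static void Main()
    {
        string xml =
            "<resources>" +
            "<string name=\"problem\">&#160;test&#160;</string>" +
            "<string name=\"no_problem\"> test </string>" +
            "</resources>";

        foreach (string value in ReadValues(xml))
        {
            // The parser has already decoded &#160; into a single character,
            // so the raw length is the same for both elements.
            string escaped = EscapeNbsp(value);
            Console.WriteLine("Raw length = {0}, escaped = '{1}', escaped length = {2}",
                value.Length, escaped, escaped.Length);
        }
    }
}
```

The parser decodes the reference while loading, so the original spelling can only be recovered by re-escaping the characters afterwards, as the function above does for a configurable list of codes.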