Validating string value has the correct XML format

I have a string and need to check whether it is well-formed XML, i.e. whether its start and end tags are consistent.
Sorry, I tried to format the string value nicely but could not :).
string parameter="<HostName>Arasanalu</HostName><AdminUserName>Administrator</AdminUserName><AdminPassword>A1234</AdminPassword><placeNumber>38</PlaceNumber>";
I tried the following check:
public bool IsValidXML(string value)
{
    try
    {
        // Check we actually have a value
        if (string.IsNullOrEmpty(value) == false)
        {
            // Try to load the value into a document
            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.LoadXml(value);
            // If we managed with no exception then this is valid XML!
            return true;
        }
        else
        {
            // A blank value is not valid xml
            return false;
        }
    }
    catch (System.Xml.XmlException)
    {
        return false;
    }
}
It throws an exception for correctly formatted input as well as for incorrectly formatted input.
Please let me know how I can proceed.
Regards,
Channa

The content of your string does not actually form a valid XML document.
It is missing a root element:
string parameter="<HostName>Arasanalu</HostName><AdminUserName>Administrator</AdminUserName><AdminPassword>A1234</AdminPassword><PlaceNumber>38</PlaceNumber>";
XmlDocument doc = new XmlDocument();
doc.LoadXml("<root>" + parameter + "</root>"); // wrapping the content in a root element makes it a well-formed document
Root Element
There is exactly one element, called the root, or document element, no
part of which appears in the content of any other element. For all
other elements, if the start-tag is in the content of another element,
the end-tag is in the content of the same element. More simply stated,
the elements, delimited by start- and end-tags, nest properly within
each other.
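As an alternative to concatenating a wrapper element, here is a minimal sketch that asks XmlReader to accept a fragment (content without a single root). The class and method names are hypothetical and not from the question. Note that under ConformanceLevel.Fragment plain top-level text also counts as well-formed, so this is slightly more permissive than the wrapping approach above.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlFragmentCheck
{
    // Hypothetical helper: returns true if 'value' is a well-formed XML fragment,
    // i.e. zero or more properly nested sibling elements without a single root.
    public static bool IsWellFormedFragment(string value)
    {
        if (string.IsNullOrEmpty(value))
            return false; // a blank value is treated as invalid, as in the question

        var settings = new XmlReaderSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment
        };
        try
        {
            using (var text = new StringReader(value))
            using (var reader = XmlReader.Create(text, settings))
            {
                // Reading to the end forces the parser to check every tag.
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            // Mismatched or unclosed tags end up here.
            return false;
        }
    }
}
```

Tag names in XML are case-sensitive, so a pair such as <placeNumber>...</PlaceNumber> is rejected by this check just as it is by XmlDocument.LoadXml.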

Always make sure the string is wrapped in a single root element: put a <Root> start tag before your content and a matching end tag after it. Try the code below.
try
{
    string unformattedXml = "<Root><HostName>Arasanalu</HostName><AdminUserName>Administrator</AdminUserName><AdminPassword>A1234</AdminPassword><PlaceNumber>38</PlaceNumber></Root>";
    // XElement.Parse throws if the string is not well-formed XML
    string formattedXml = XElement.Parse(unformattedXml).ToString();
    return true;
}
catch (Exception)
{
    return false;
}

Related

XmlReader behaves different with line breaks

If the data is on a single line, the calls
index=int.Parse(logDataReader.ReadElementContentAsString());
and
value=double.Parse(logDataReader.ReadElementContentAsString(),
cause the cursor to move forward. If I take those calls out, I see it loop 6 times in the debugger.
In the following, only 3 <data> elements are read from the first block (<logData id="Alpha">), and they are wrong because each value belongs to the next index. In the second block (<logData id="Bravo">) all <data> elements are read.
Editing the XML to put in line breaks is not an option, as the file is created dynamically (by XmlWriter). The NewLineChars setting is a line feed. The output from XmlWriter is actually just one line; I broke it up to figure out where it was breaking. In the browser it is displayed properly.
How can I fix this?
Here is my XML:
<?xml version="1.0" encoding="utf-8"?>
<log>
<logData id="Alpha">
<data><index>100</index><value>150</value></data>
<data><index>110</index><value>750</value></data>
<data><index>120</index><value>750</value></data>
<data><index>130</index><value>150</value></data>
<data><index>140</index><value>0</value></data>
<data><index>150</index><value>222</value></data>
</logData>
<logData id="Bravo">
<data>
<index>100</index>
<value>25</value>
</data>
<data>
<index>110</index>
<value>11</value>
</data>
<data>
<index>120</index>
<value>1</value>
</data>
<data>
<index>130</index>
<value>25</value></data>
<data>
<index>140</index>
<value>0</value>
</data>
<data>
<index>150</index>
<value>1</value>
</data>
</logData>
</log>
And my code:
static void Main(string[] args)
{
    List<LogData> logDatas = GetLogDatasFromFile("singleVersusMultLine.xml");
    Debug.WriteLine("Main");
    Debug.WriteLine("logData");
    foreach (LogData logData in logDatas)
    {
        Debug.WriteLine($" logData.ID {logData.ID}");
        foreach (LogPoint logPoint in logData.LogPoints)
        {
            Debug.WriteLine($" logData.Index {logPoint.Index} logData.Value {logPoint.Value}");
        }
    }
    Debug.WriteLine("end");
}
public static List<LogData> GetLogDatasFromFile(string xmlFile)
{
    List<LogData> logDatas = new List<LogData>();
    using (XmlReader reader = XmlReader.Create(xmlFile))
    {
        // move to next "logData"
        while (reader.ReadToFollowing("logData"))
        {
            var logData = new LogData(reader.GetAttribute("id"));
            using (var logDataReader = reader.ReadSubtree())
            {
                // inside "logData" subtree, move to next "data"
                while (logDataReader.ReadToFollowing("data"))
                {
                    // move to index
                    logDataReader.ReadToFollowing("index");
                    // read index
                    var index = int.Parse(logDataReader.ReadElementContentAsString());
                    // move to value
                    logDataReader.ReadToFollowing("value");
                    // read value
                    var value = double.Parse(logDataReader.ReadElementContentAsString(), CultureInfo.InvariantCulture);
                    logData.LogPoints.Add(new LogPoint(index, value));
                }
            }
            logDatas.Add(logData);
        }
    }
    return logDatas;
}
public class LogData
{
    public string ID { get; }
    public List<LogPoint> LogPoints { get; } = new List<LogPoint>();
    public LogData(string id)
    {
        ID = id;
    }
}
public class LogPoint
{
    public int Index { get; }
    public double Value { get; }
    public LogPoint(int index, double value)
    {
        Index = index;
        Value = value;
    }
}

Your problem is as follows. According to the documentation for XmlReader.ReadElementContentAsString():
This method reads the start tag, the contents of the element, and moves the reader past the end element tag.
And from the documentation for XmlReader.ReadToFollowing(String):
It advances the reader to the next following element that matches the specified name and returns true if a matching element is found.
Thus, after the call to ReadElementContentAsString(), since the reader has been advanced to the next node, it might already be positioned on the next <value> or <data> node. Then when you call ReadToFollowing() this element node is skipped because the method unconditionally moves on to the next node with the correct name. But if the XML is indented then the next node immediately after the call to ReadElementContentAsString() will be an XmlNodeType.Whitespace node, protecting against this bug.
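The positioning described above can be observed directly. The following is a small sketch (the class and method names are hypothetical) that reports which node the reader is on immediately after ReadElementContentAsString() has consumed an <index> element. It assumes default XmlReaderSettings, under which whitespace nodes are not ignored.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReaderPositionDemo
{
    // Hypothetical helper: returns "<NodeType>:<Name>" for the node the reader
    // is positioned on right after the <index> element has been read.
    public static string NodeAfterIndex(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("index");
            reader.ReadElementContentAsString(); // moves past </index>
            return reader.NodeType + ":" + reader.Name;
        }
    }
}
```

For the compact input "<data><index>100</index><value>150</value></data>" the reader is already on the <value> element, so a following ReadToFollowing("value") skips it. For an indented input the reader sits on the whitespace between the two elements instead, which is why the indented layout happens to work.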
The solution is to check whether the reader is already positioned correctly after the call to ReadElementContentAsString(). First, introduce the following extension method:
public static class XmlReaderExtensions
{
    public static bool ReadToFollowingOrCurrent(this XmlReader reader, string localName, string namespaceURI)
    {
        if (reader == null)
            throw new ArgumentNullException(nameof(reader));
        if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName && reader.NamespaceURI == namespaceURI)
            return true;
        return reader.ReadToFollowing(localName, namespaceURI);
    }
}
Then modify your code as follows:
public static List<LogData> GetLogDatasFromFile(string xmlFile)
{
    List<LogData> logDatas = new List<LogData>();
    using (XmlReader reader = XmlReader.Create(xmlFile))
    {
        // move to next "logData"
        while (reader.ReadToFollowing("logData", ""))
        {
            var logData = new LogData(reader.GetAttribute("id"));
            using (var logDataReader = reader.ReadSubtree())
            {
                // inside "logData" subtree, move to next "data"
                while (logDataReader.ReadToFollowing("data", ""))
                {
                    // move to index
                    logDataReader.ReadToFollowing("index", "");
                    // read index
                    var index = XmlConvert.ToInt32(logDataReader.ReadElementContentAsString());
                    // move to value
                    logDataReader.ReadToFollowingOrCurrent("value", "");
                    // read value
                    var value = XmlConvert.ToDouble(logDataReader.ReadElementContentAsString());
                    logData.LogPoints.Add(new LogPoint(index, value));
                }
            }
            logDatas.Add(logData);
        }
    }
    return logDatas;
}
Notes:
Always prefer to use XmlReader methods in which the local name and namespace are specified separately, such as XmlReader.ReadToFollowing(String, String). When you use a method such as XmlReader.ReadToFollowing(String) which accepts a single qualified name, you are implicitly hardcoding the choice of XML prefix, which is generally not a good idea. XML parsing should be independent of prefix choice.
While you correctly parsed your double using the CultureInfo.InvariantCulture locale, it's even easier to use the methods from the XmlConvert class to handle parsing and formatting correctly.
XmlReader.ReadSubtree() leaves the XmlReader positioned on the EndElement node of the element being read, so you shouldn't need to call ReadToFollowingOrCurrent() afterwards. (Nice use of ReadSubtree() to avoid reading too little or too much by the way; by using this method one can avoid several frequent mistakes with XmlReader.)
As you have found, code that manually reads XML using XmlReader should always be unit-tested with both formatted and unformatted XML, because certain bugs will only arise with one or the other. (See e.g. this answer, this one and this one also for other examples of such.)
Working sample .Net fiddle here.
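The advice about testing both layouts can be sketched as follows. This is an illustrative, self-contained check rather than the code from the fiddle: the class, the ParsePoints helper and the two sample strings are assumptions made for the example, and the parser is a simplified version of the fix above that reads from a string instead of a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class LayoutIndependenceCheck
{
    // Hypothetical helper: parses <data><index/><value/></data> pairs from an XML string.
    public static List<KeyValuePair<int, double>> ParsePoints(string xml)
    {
        var points = new List<KeyValuePair<int, double>>();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.ReadToFollowing("data", ""))
            {
                reader.ReadToFollowing("index", "");
                int index = XmlConvert.ToInt32(reader.ReadElementContentAsString());
                // Only advance if the reader is not already positioned on <value>.
                if (!(reader.NodeType == XmlNodeType.Element && reader.LocalName == "value"))
                    reader.ReadToFollowing("value", "");
                double value = XmlConvert.ToDouble(reader.ReadElementContentAsString());
                points.Add(new KeyValuePair<int, double>(index, value));
            }
        }
        return points;
    }

    // Parses the same data in compact and indented form and compares the results.
    public static bool SameForBothLayouts()
    {
        string compact = "<log><data><index>100</index><value>150</value></data>"
                       + "<data><index>110</index><value>750</value></data></log>";
        string indented = "<log>\n  <data>\n    <index>100</index>\n    <value>150</value>\n  </data>\n"
                        + "  <data>\n    <index>110</index>\n    <value>750</value>\n  </data>\n</log>";
        var a = ParsePoints(compact);
        var b = ParsePoints(indented);
        if (a.Count != b.Count)
            return false;
        for (int i = 0; i < a.Count; i++)
        {
            if (a[i].Key != b[i].Key || a[i].Value != b[i].Value)
                return false;
        }
        return true;
    }
}
```

A unit test would simply assert that SameForBothLayouts() returns true; replacing the conditional with an unconditional ReadToFollowing("value", "") should make the compact case diverge.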

Indeed, that code (which I provided in answer to another question of yours) is wrong. ReadToFollowing will read to the next element with the given name even if its cursor is already positioned on an element with that name. When there is whitespace, the cursor moves to that whitespace after you read index, and ReadToFollowing("value") works as you expect. However, if there is no whitespace, the cursor is already on the value node, so ReadToFollowing("value") reads to the next "value" in the subsequent "data" node.
I think the following would be a safer approach:
public static List<LogData> GetLogDatasFromFile(string xmlFile) {
    List<LogData> logDatas = new List<LogData>();
    using (XmlReader reader = XmlReader.Create(xmlFile)) {
        LogData currentData = null;
        while (reader.Read()) {
            if (reader.IsStartElement("logData")) {
                // we are positioned on start of logData
                if (currentData != null)
                    logDatas.Add(currentData);
                currentData = new LogData(reader.GetAttribute("id"));
            }
            else if (reader.IsStartElement("data")) {
                // we are on start of "data"
                // we always have "currentData" at this point
                Debug.Assert(currentData != null);
                reader.ReadToFollowing("index");
                var index = int.Parse(reader.ReadElementContentAsString());
                // check if we are not already on "value"
                if (!reader.IsStartElement("value"))
                    reader.ReadToFollowing("value");
                var value = double.Parse(reader.ReadElementContentAsString(), CultureInfo.InvariantCulture);
                currentData.LogPoints.Add(new LogPoint(index, value));
            }
        }
        if (currentData != null)
            logDatas.Add(currentData);
    }
    return logDatas;
}

I found a fix, but to me it is not an acceptable answer: XmlReader should not behave differently with line breaks.
With XmlWriter, these settings put line breaks in the output:
XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
xmlWriterSettings.NewLineOnAttributes = true;
xmlWriterSettings.Indent = true;
using (XmlWriter xmlWriter = XmlWriter.Create(fileNameXML, xmlWriterSettings))
{
I found this here.

XML Validation against XSD always returns true

I have a C# script that validates an XML document against an XSD document, as follows:
static bool IsValidXml(string xmlFilePath, string xsdFilePath)
{
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.Schemas.Add(null, xsdFilePath);
    settings.ValidationType = ValidationType.Schema;
    settings.Schemas.Compile();
    try
    {
        XmlReader xmlRead = XmlReader.Create(xmlFilePath, settings);
        while (xmlRead.Read())
        { };
        xmlRead.Close();
    }
    catch (Exception e)
    {
        return false;
    }
    return true;
}
I've compiled this after looking at a number of MSDN articles and questions here where this is the solution. It does correctly check that the XSD is well-formed (returns false if I mess with the file) and that the XML is well-formed (also returns false when messed with).
I've also tried the following, but it does the exact same thing:
static bool IsValidXml(string xmlFilePath, string xsdFilePath)
{
    XDocument xdoc = XDocument.Load(xmlFilePath);
    XmlSchemaSet schemas = new XmlSchemaSet();
    schemas.Add(null, xsdFilePath);
    try
    {
        xdoc.Validate(schemas, null);
    }
    catch (XmlSchemaValidationException e)
    {
        return false;
    }
    return true;
}
I've even pulled a completely random XSD off the internet and thrown it into both scripts, and it still validates on both. What am I missing here?
Using .NET 3.5 within an SSIS job.

In .NET you have to check yourself if the validator actually matches a schema component; if it doesn't, there is no exception thrown, and so your code will not work as you expect.
A match means one or both of the following:
there is one global element in your schema set with a qualified name that is the same as your XML document element's qualified name.
the document element has an xsi:type attribute, that is a qualified name pointing to a global type in your schema set.
In streaming mode, you can do this check easily. This pseudo-kind-of-code should give you an idea (error handling not shown, etc.):
using (XmlReader reader = XmlReader.Create(xmlfile, settings))
{
    reader.MoveToContent();
    var qn = new XmlQualifiedName(reader.LocalName, reader.NamespaceURI);
    // element test: schemas.GlobalElements.ContainsKey(qn);
    // check if there's an xsi:type attribute: reader["type", XmlSchema.InstanceNamespace] != null;
    // if exists, resolve the value of the xsi:type attribute to an XmlQualifiedName
    // type test: schemas.GlobalTypes.ContainsKey(qn);
    // if all good, keep reading; otherwise, break here after setting your error flag, etc.
}
You might also consider the XmlNode.SchemaInfo which represents the post schema validation infoset that has been assigned to a node as a result of schema validation. I would test different conditions and see how it works for your scenario. The first method is recommended to reduce the attack surface in DoS attacks, as it is the fastest way to detect completely bogus payloads.
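The element test from the pseudo-code above could be made concrete roughly as follows. This is a minimal sketch under several assumptions: it works on in-memory strings rather than file paths, the names SchemaMatchCheck and IsValid are hypothetical, it covers only the global-element case (the xsi:type branch is omitted), and it uses XmlSchemaObjectTable.Contains for the lookup. Malformed XML still surfaces as an XmlException, which is left to the caller.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaMatchCheck
{
    // Hypothetical helper: validates 'xml' against 'xsd' and additionally
    // rejects documents whose root element is not declared in the schema.
    public static bool IsValid(string xml, string xsd)
    {
        var schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        schemas.Compile();

        bool ok = true;
        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas
        };
        // Also report warnings, such as elements the schema knows nothing about.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += (sender, e) => ok = false;

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();
            var qn = new XmlQualifiedName(reader.LocalName, reader.NamespaceURI);
            // Element test: the root must match a global element declaration.
            if (!schemas.GlobalElements.Contains(qn))
                return false;
            // Keep reading so the rest of the document is validated.
            while (reader.Read()) { }
        }
        return ok;
    }
}
```

Returning early on an unmatched root keeps the check cheap for completely unrelated payloads, in line with the recommendation above.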

How to add two actions depending on 'type' attribute of element in XML using webdriver in c#

I have elements stored in a config.xml file as part of my project. Currently I have a 'setData' method which finds the element by its id and then sets its value to the user input (using a WebDriver instance called FireFoxBrowser).
I want to add a type attribute to the XML to differentiate between 'input' elements, which will use the current code, and 'button' elements, for which I will add code that clicks anything with this type. How can I write this code using WebDriver?
public void setData(string elementName, string elementValue)
{
    XmlDocument docXml = null;
    try
    {
        docXml = new XmlDocument();
        string xmlPath = new DirectoryInfo(Environment.CurrentDirectory).Parent.Parent.FullName + @"\config.xml";
        docXml.Load(xmlPath);
        XmlNode nd = docXml.SelectSingleNode(string.Format(@"//page[@url='{0}']", FireFoxBrowser.Url.ToString()));
        if (nd != null)
        {
            var id = nd.SelectSingleNode(string.Format(@"element[@name='{0}']", elementName)).Attributes["id"].Value;
            FireFoxBrowser.FindElement(By.Id(id)).Clear();
            FireFoxBrowser.FindElement(By.Id(id)).SendKeys(elementValue);
        }
    }
    finally
    {
        if (docXml != null)
            docXml = null;
    }
}

I was able to achieve this using the following line of code which differentiates between type attribute set:
var id = nd.SelectSingleNode(string.Format(@"element[@name='{0}']", elementName)).Attributes["id"].Value;
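The answer does not show how the type attribute is actually read, so here is one possible sketch of that lookup, kept free of WebDriver so it stays self-contained. It assumes a config layout of page elements with a url attribute containing element children with name, id and an optional type attribute; that layout is inferred from the XPath expressions in the question. The class and method names are hypothetical, and treating a missing type as "input" is an assumption.

```csharp
using System;
using System.Xml;

public static class ElementTypeLookup
{
    // Hypothetical helper: returns the 'type' attribute of the configured element,
    // "input" when the attribute is absent, or null when nothing matches.
    public static string GetElementType(XmlDocument config, string pageUrl, string elementName)
    {
        XmlNode page = config.SelectSingleNode(string.Format("//page[@url='{0}']", pageUrl));
        if (page == null)
            return null;

        var element = page.SelectSingleNode(string.Format("element[@name='{0}']", elementName)) as XmlElement;
        if (element == null)
            return null;

        // GetAttribute returns an empty string when the attribute is missing.
        string type = element.GetAttribute("type");
        return string.IsNullOrEmpty(type) ? "input" : type;
    }
}
```

A caller could then branch on the result: call Click() on the located element when the type is "button", and keep the existing Clear() and SendKeys() calls otherwise.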

XSD validation always passes

I am using the following code to validate XML against an XSD:
public static bool IsValidXmlOld(string xmlFilePath, string xsdFilePath)
{
    if (File.Exists(xmlFilePath) && File.Exists(xsdFilePath))
    {
        try
        {
            XDocument xdocXml = XDocument.Load(xmlFilePath);
            var schemas = new XmlSchemaSet();
            schemas.Add(null, xsdFilePath);
            Boolean result = true;
            xdocXml.Validate(schemas, (sender, e) =>
            {
                result = false;
            });
            return result;
        }
        catch (Exception ex)
        {
            // Logging logic + error handling logic
            throw new Exception(ex.Message);
        }
    }
    throw new Exception("Either the Schema or the XML file does not exist Please check");
}
For some reason, it always returns true even if the XML is not valid for the given XSD. I picked up this code from the following link:
Validate XML against XSD in a single method. It seems that result = false is never executed, even if the XML is completely invalid.
I have a pair of valid and invalid XML that goes against a particular XSD
Valid XML
XSD
Invalid XML
If I try to validate them on This web site, the valid XML passes validation against the XSD but the invalid XML fails. However, the code above passes both XML files every time.
At the same time it fails the validation when I use some basic XML like following:
XDocument doc2 = new XDocument(
new XElement("Root",
new XElement("Child1", "content1"),
new XElement("Child3", "content1")
)
);
with following error:
The 'Root' element is not declared.: {0}
Now, it clearly demonstrates that the code is not completely incapable of failing a validation. However, what is so special about the 3. Invalid XML that the code passes that particular XML when This Site clearly fails it?

Test for root node XML .NET

I'm currently using the code below to attempt to check for a certain root node (rss) and a certain namespace\prefix (itunes), but it seems to be saying that the feed is valid even when supplied with a random web page URL instead of one pointing to a feed.
FeedState state = FeedState.Invalid;
XmlDocument xDoc = new XmlDocument();
xDoc.Load(_url);
XmlNode root = xDoc.FirstChild;
if (root.Name.ToLower() == "rss" && root.GetNamespaceOfPrefix("itunes") == "http://www.itunes.com/dtds/podcast-1.0.dtd")
{
    state = FeedState.Valid;
}
return state;
Can anybody tell me why this might be?

Found the solution now. Putting xDoc.Load(_url); in a try .. catch block and returning FeedState.Invalid upon exception seems to have solved my problems.
FeedState state = FeedState.Invalid;
XmlDocument xDoc = new XmlDocument();
try
{
    xDoc.Load(_url);
}
catch
{
    return state;
}
XmlNode root = xDoc.FirstChild;
if (root.Name.ToLower() == "rss" && root.GetNamespaceOfPrefix("itunes") == "http://www.itunes.com/dtds/podcast-1.0.dtd")
{
    state = FeedState.Valid;
}
return state;
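One detail worth noting about the code above: XmlDocument.FirstChild is the first node of the document, which is the XML declaration when the feed starts with <?xml ...?>, not the rss element. A variation that reads XmlDocument.DocumentElement instead might look like the sketch below. It takes the XML as a string and returns a bool rather than the FeedState enum so that it is self-contained; the class and method names are hypothetical.

```csharp
using System;
using System.Xml;

public static class FeedRootCheck
{
    private const string ItunesNamespace = "http://www.itunes.com/dtds/podcast-1.0.dtd";

    // Hypothetical helper: true if the XML parses, its root element is <rss>,
    // and the "itunes" prefix is bound to the expected namespace on that root.
    public static bool LooksLikeItunesFeed(string xml)
    {
        var doc = new XmlDocument();
        try
        {
            doc.LoadXml(xml);
        }
        catch (XmlException)
        {
            // Not well-formed XML, e.g. an ordinary HTML page.
            return false;
        }

        XmlElement root = doc.DocumentElement; // skips the XML declaration and comments
        return root != null
            && string.Equals(root.Name, "rss", StringComparison.OrdinalIgnoreCase)
            && root.GetNamespaceOfPrefix("itunes") == ItunesNamespace;
    }
}
```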
