Looping through the elements of an XDocument and getting specific attributes - C#

I am currently rewriting my XmlDocument and XmlElement code to use XDocument and XElement, and I've run into a small problem. I used to simply select the single node ModuleRequests and then loop through it. I thought, simple, I'll just rewrite this:
var propertiesRequested = new XmlDocument();
propertiesRequested.LoadXml(propertiesConfiguration);
var requests = propertiesRequested.SelectSingleNode("ModuleRequests");
foreach (XmlNode request in requests)
{
var propertyValue = request.Attributes["Id"].Value;
if (systemProperties.ContainsKey(propertyValue) == false)
{
systemProperties.Add(propertyValue, request.Attributes["Request"].Value);
}
}
To this:
var propertiesRequested = XDocument.Parse(propertiesConfiguration);
var requests = propertiesRequested.Element("ModuleRequests");
foreach (XNode request in requests)
{
var propertyValue = request.Attributes["Id"].Value;
if (systemProperties.ContainsKey(propertyValue) == false)
{
systemProperties.Add(propertyValue, request.Attributes["Request"].Value);
}
}
Needless to say, it isn't that easy. Then I thought, fine, I'll make it:
foreach(XNode request in requests.Nodes())
but this gave me even more problems, since an XNode does not have attributes.
As you can probably tell, I'm a bit of a novice when it comes to reading XML. I'm hoping someone can help me out. What is the correct way to rewrite this from XmlDocument to XDocument?

You want to use XElement.Elements() to iterate through all child elements of your requests element, then use XElement.Attribute(XName name) to fetch the specified attribute by name.
You might also consider explicitly casting your XAttribute to a string rather than using the Value property, as the former will return null on a missing attribute rather than generating a null reference exception.
Thus:
var propertiesRequested = XDocument.Parse(propertiesConfiguration);
var requests = propertiesRequested.Element("ModuleRequests");
foreach (var request in requests.Elements())
{
var propertyValue = (string)request.Attribute("Id");
if (systemProperties.ContainsKey(propertyValue) == false)
{
systemProperties.Add(propertyValue, (string)request.Attribute("Request"));
}
}
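One caveat with the cast approach: if the Id attribute is missing, propertyValue is null, and Dictionary.ContainsKey throws an ArgumentNullException for a null key. Below is a minimal sketch that skips such elements. It assumes systemProperties is a Dictionary<string, string>; the class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper, not part of the original code.
public static class ModuleRequestReader
{
    public static Dictionary<string, string> Read(string propertiesConfiguration)
    {
        var systemProperties = new Dictionary<string, string>();
        var requests = XDocument.Parse(propertiesConfiguration).Element("ModuleRequests");
        if (requests == null)
        {
            // Root element has a different name: nothing to read.
            return systemProperties;
        }

        foreach (var request in requests.Elements())
        {
            // The cast yields null (instead of throwing) when the attribute is absent.
            var propertyValue = (string)request.Attribute("Id");
            if (propertyValue == null)
            {
                continue; // a null key would make ContainsKey throw
            }

            if (!systemProperties.ContainsKey(propertyValue))
            {
                systemProperties.Add(propertyValue, (string)request.Attribute("Request"));
            }
        }

        return systemProperties;
    }
}
```

The first occurrence of an Id wins, as in the original loop; a missing Request attribute is stored as null.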

Related

Roslyn get IdentifierName in ObjectCreationExpressionSyntax

I am currently working on a simple code analysis for C# with Roslyn. I need to parse all documents of all projects inside one solution and get the classes that are declared and used inside each document.
For example from:
class Program
{
static void Main(string[] args)
{
var foo = new Foo();
}
}
I want to get "Program uses Foo".
I already parse all documents and get the classes declared inside them.
// all projects in solution
foreach (var project in _solution.Projects)
{
// all documents inside project
foreach (var document in project.Documents)
{
var syntaxRoot = await document.GetSyntaxRootAsync();
var model = await document.GetSemanticModelAsync();
var classes = syntaxRoot.DescendantNodes().OfType<ClassDeclarationSyntax>();
// all classes inside document
foreach (var classDeclarationSyntax in classes)
{
var symbol = model.GetDeclaredSymbol(classDeclarationSyntax);
var objectCreationExpressionSyntaxs = classDeclarationSyntax.DescendantNodes().OfType<ObjectCreationExpressionSyntax>();
// all object creations inside document
foreach (var objectCreationExpressionSyntax in objectCreationExpressionSyntaxs)
{
// TODO: Get the identifier value
}
}
}
}
The problem is getting the IdentifierName Foo. Using the debugger, I can see that objectCreationExpressionSyntax.Type has an Identifier whose Text holds the value I need, but objectCreationExpressionSyntax.Type.Identifier seems to be private.
I could use the SymbolFinder to find all references to a class in the solution, but as I already need to parse all documents, it should work without it.
Maybe I am on the wrong path? How do I get the identifier value?
You'll need to handle the different types of TypeSyntaxes. See here: http://sourceroslyn.io/#Microsoft.CodeAnalysis.CSharp/Syntax/TypeSyntax.cs,29171ac4ad60a546,references
What you see in the debugger is a SimpleNameSyntax, which does have a public Identifier property.
Update
var ns = objectCreationExpressionSyntax.Type as SimpleNameSyntax; // Identifier is declared on SimpleNameSyntax, not NameSyntax
if (ns != null)
{
return ns.Identifier.ToString();
}
var pts = objectCreationExpressionSyntax.Type as PredefinedTypeSyntax;
if (pts != null)
{
return pts.Keyword.ToString();
}
...
All other subtypes would need to be handled as well. Note that ArrayTypeSyntax.ElementType is also a TypeSyntax, so you would most probably need to make this method recursive.
You can get the identifier from the syntax's Type property:
foreach (var objectCreationExpressionSyntax in objectCreationExpressionSyntaxs)
{
IdentifierNameSyntax ins = (IdentifierNameSyntax)objectCreationExpressionSyntax.Type;
var id = ins.Identifier;
Console.WriteLine(id.ValueText);
}
Strings can be misleading.
Let's say you have the expression new SomeClass(), and you get the string "SomeClass" out of it. How do you know if that refers to Namespace1.SomeClass or Namespace2.SomeClass ? What if there is a using SomeClass = Namespace3.SomeOtherType; declaration being used?
Fortunately, you don't have to do this analysis yourself. The compiler can bind the ObjectCreationExpressionSyntax to a symbol. You have your semantic model, use it.
foreach (var oce in objectCreationExpressionSyntaxs)
{
ITypeSymbol typeSymbol = model.GetTypeInfo(oce).Type;
// ...
}
You can compare this symbol with the symbols you get from model.GetDeclaredSymbol(classDeclarationSyntax), just make sure you use the Equals method, not the == operator.

How to check multiple XMLNode Attributes for Null Value?

I am trying to read multiple attributes from an XML file using XmlNode, but depending on the element, an attribute might not exist. If the attribute does not exist and I try to read it, a NullReferenceException is thrown. I found one way to test whether the attribute returns null:
var temp = xn.Attributes["name"].Value;
if (temp == null)
{ txtbxName.Text = ""; }
else
{ txtbxName.Text = temp; }
This seems like it will work for a single instance, but if I am checking 20 attributes that might not exist, I'm hoping there is a way to set up a method I can pass the value to in order to test whether it is null. From what I have read, you can't pass a var because it is locally initialized, but is there a way I could pass a potentially null value to be tested, then return the value if it is not null and return "" if it is null? Is that possible, or would I have to test each value individually as outlined above?
You can create a method like this:
public static string GetText(XmlNode xn, string attrName)
{
var attr = xn.Attributes[attrName];
if (attr == null) // Also check whether the attribute does not exist at all
return string.Empty;
var temp = attr.Value;
if (temp == null)
return string.Empty;
return temp;
}
And call it like this:
txtbxName.Text = GetText(xn, "name");
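On C# 6 or later, the same helper can be sketched more compactly with the null-conditional and null-coalescing operators. The extension class and method name below are hypothetical, not part of System.Xml.

```csharp
using System.Xml;

// Hypothetical extension method; the name is chosen for illustration only.
public static class XmlNodeAttributeExtensions
{
    public static string GetAttributeOrEmpty(this XmlNode node, string attrName)
    {
        // Each ?. short-circuits to null when the node, its attribute collection
        // (null for non-element nodes), or the attribute itself is missing.
        return node?.Attributes?[attrName]?.Value ?? string.Empty;
    }
}
```

Usage would then look like txtbxName.Text = xn.GetAttributeOrEmpty("name"); for each of the 20 attributes.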
If you use an XDocument you could just use Linq to find all the nodes you want.
var names = (from attr in doc.Document.Descendants().Attributes()
where attr.Name == "name"
select attr).ToList();
If you are using XmlDocument for some reason, you could select the nodes you want using XPath. (My XPath is rusty.)
var doc = new XmlDocument();
doc.Load("the file");
var names = doc.SelectNodes("//@name"); // every attribute called "name", anywhere in the document

How to read same child element types from an XML tree recursively?

I have a sample xml file in following format:-
Also, I have a Control class shown below:
class Control
{
private string id;
public string Id
{
get { return id; }
set { id = value; }
}
private string controlType;
public string ControlType
{
get { return controlType; }
set { controlType = value; }
}
private string searchProperties;
public string SearchProperties
{
get { return searchProperties; }
set { searchProperties = value; }
}
public List<Control> ChildrenControl = new List<Control>();
}
I need to read the XML file mentioned above and populate the code. I am not sure how to recursively do that. I was thinking to use Linq to XML, but not sure how to use it recursively in this case in which parent and child elements are of the same type. Can someone please help me with this problem?
Thanks,
Harit
Update:
Try the following. It uses Linq to XML and a recursive function to parse the controls out of the XML document. It assumes your XML data lives in a file called "Controls.xml", and obviously relies on your Control class. It's not the greatest code, but it should get you started.
private void ParseControlsData()
{
var doc = XDocument.Load("Controls.xml");
var controls = from control in doc.Element("controls").Elements("control")
select CreateFromXElement(control);
var controlsList = controls.ToList();
Console.ReadLine();
}
private Control CreateFromXElement(XElement element)
{
var control = new Control()
{
Id = (string)element.Attribute("id"),
ControlType = (string)element.Attribute("controlType"),
SearchProperties = (string)element.Attribute("searchProperties")
};
var childrenElements = element.Element("childControls");
if (childrenElements != null)
{
var children = from child in childrenElements.Elements("control")
select CreateFromXElement(child);
control.ChildrenControl = children.ToList();
}
return control;
}
Notes:
In the ParseControlsData function, it uses query expression syntax to select the first element in the document called "controls" (your root), and then selects all sub-elements named "control". A very similar expression occurs inside the CreateFromXElement function, except it needs to find an element called "childControls".
There's no real error checking. You'll definitely need some.
Update:
Don't do this, it doesn't work for your example because you have values stored in attributes, and by default the DataContractSerializer + DataContractAttributes combination does not support that (without a whole bunch of extra work). Other options are Linq to XML (as you suggested) and using the XmlSerializer (which is similar to the DataContractSerializer, but uses its own set of attributes). I'll look into it further.
Previous Answer:
One way to turn an XML document into an Object Graph is to mark the classes you want to create from the XML with DataContract attributes, and to use the DataContractSerializer. All you need to do is make sure that the DataContract(Name = "X") and DataMember(Name = "Y") match the names of the elements inside your XML.
Have a look at my answer on XML Element Selection, which performs the operation you want (taking existing XML and turning it into an Object Graph). You probably won't need to worry about the CDATA stuff that user ran into, so your solution will probably be a bit simpler.
Also, have a look at my answer on How to catch/send XML doc with various sub arrays?, which is performing the reverse operation (that user wanted to create XML from an Object Graph).
If you can't tell, I'm a fan of the DataContractSerializer :)
You can use recursion using Func<> delegate, but you have to declare it before specifying the actual delegate logic:
var xDoc = XDocument.Load("Input.xml");
Func<XElement, List<Control>> childControlsQuery = null;
childControlsQuery =
x => (from c in x.Elements("control")
select new Control
{
Id = (string)c.Attribute("id"),
ControlType = (string)c.Attribute("controltype"),
SearchProperties = (string)c.Attribute("searchproperties"),
ChildrenControl = childControlsQuery(c.Element("childControls") ?? new XElement("childControls"))
}).ToList();
var controls = childControlsQuery(xDoc.Root);
You can remove ?? new XElement("childControls") if you're sure there is always a childControls element, even when a given control does not have any children.
And if you're sure there is always only one main control, you can get it as:
var mainControl = controls.First();
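If the Func<> declaration feels awkward, an ordinary recursive static method does the same job. This is only a sketch: it assumes the layout both answers assume (control elements, each with an optional childControls wrapper), and ControlNode is a trimmed-down stand-in for your Control class.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Stand-in for the Control class from the question (illustration only).
public class ControlNode
{
    public string Id { get; set; }
    public List<ControlNode> Children { get; set; } = new List<ControlNode>();
}

public static class ControlParser
{
    // Returns the controls directly under 'parent', each with its children parsed recursively.
    public static List<ControlNode> Parse(XElement parent)
    {
        if (parent == null)
        {
            // No childControls wrapper: the control is a leaf.
            return new List<ControlNode>();
        }

        return parent.Elements("control")
            .Select(c => new ControlNode
            {
                Id = (string)c.Attribute("id"),
                Children = Parse(c.Element("childControls"))
            })
            .ToList();
    }
}
```

Calling ControlParser.Parse(xDoc.Root) gives the top-level list; the null check replaces the ?? new XElement("childControls") trick.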

Efficient Way to Parse XML

I find it puzzling to determine the best way to parse some XML. It seems there are so many possible ways, and none have really clicked with me.
My current attempt looks something like this:
XElement xelement = XElement.Parse(xmlText);
var name = xelement.Element("Employee").Attribute("name").Value;
So, this works. But it throws an exception if either the "Employee" element or the "name" attribute is missing. I don't want to throw an exception.
Exploring some examples available online, I see code like this:
XElement xelement = XElement.Load("..\\..\\Employees.xml");
IEnumerable<XElement> employees = xelement.Elements();
Console.WriteLine("List of all Employee Names :");
foreach (var employee in employees)
{
Console.WriteLine(employee.Element("Name").Value);
}
This would seem to suffer from the exact same issue. If the "Name" element does not exist, Element() returns null and there is an error calling the Value property.
I need a number of blocks like the first code snippet above. Is there a simple way to have it work and not throw an exception if some data is missing?
You can use the combination of the explicit string conversion from XAttribute to string (which will return null if the operand is null) and the FirstOrDefault method:
var name = xelement.Elements("Employee")
.Select(x => (string) x.Attribute("name"))
.FirstOrDefault();
That will be null if either there's no such element (because the sequence will be empty, and FirstOrDefault() will return null) or there's an element without the attribute (in which case you'll get a sequence with a null element, which FirstOrDefault will return).
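If you are on C# 6 or later, the null-conditional operator is another way to get the same "null instead of an exception" behaviour while keeping the shape of the original snippet. A rough sketch; the class and method names are invented for the example:

```csharp
using System.Xml.Linq;

// Hypothetical wrapper around the one-liner from the question.
public static class EmployeeReader
{
    public static string GetEmployeeName(string xmlText)
    {
        XElement xelement = XElement.Parse(xmlText);

        // ?. stops evaluating and yields null as soon as the element
        // or the attribute turns out to be missing.
        return xelement.Element("Employee")?.Attribute("name")?.Value;
    }
}
```

Note that XElement.Parse itself still throws if the text is not well-formed XML; only the missing-data cases are covered here.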
I often use extension methods in cases like this, as they work even if the reference is null. I use a slightly modified version of the extension methods from Anders Abel's very good blog post from early 2012, 'Null Handling with Extension Methods':
public static class XElementExtension
{
public static string GetValueOrDefault(this XAttribute attribute,
string defaultValue = null)
{
return attribute == null ? defaultValue : attribute.Value;
}
public static string GetAttributeValueOrDefault(this XElement element,
string attributeName,
string defaultValue = null)
{
return element == null ? defaultValue : element.Attribute(attributeName)
.GetValueOrDefault(defaultValue);
}
}
If you want to return 'null' if the element or attribute doesn't exist:
var name = xelement.Element("Employee")
.GetAttributeValueOrDefault("name" );
If you want to return a default value if the element or attribute doesn't exist:
var name = xelement.Element("Employee")
.GetAttributeValueOrDefault("name","this is the default value");
To use it in your foreach loop:
XElement xelement = XElement.Load("..\\..\\Employees.xml");
IEnumerable<XElement> employees = xelement.Elements();
Console.WriteLine("List of all Employee Names :");
foreach (var employee in employees)
{
Console.WriteLine(employee.GetAttributeValueOrDefault("Name"));
}
You could always use XPath:
string name = xelement.XPathEvaluate("string(Employee/@name)") as string;
This will be either the value of the attribute, or an empty string if either Employee or @name does not exist (the XPath string() function returns "" for an empty node set).
And for the iterative example:
foreach (XElement item in (IEnumerable)xelement.XPathEvaluate("Employee/Name"))
{
Console.WriteLine(item.Value);
}
XPathEvaluate() will only select valid nodes here, so you can be assured that item will always be non-null.
It all depends on what you want to do with the data once you've extracted it from the XML.
You would do well to look at languages that are designed for XML processing, such as XSLT and XQuery, rather than using languages like C#, which aren't (though Linq gives you something of a hybrid). Using C# or Java you're always going to have to do a lot of work to cope with the fact that XML is so flexible.
Use the native XmlReader. If your problem is reading large XML files, then instead of letting XElement build an object representation of the whole document, you can build something like a Java SAX parser that only streams the XML.
Ex:
http://www.codeguru.com/csharp/csharp/cs_data/xml/article.php/c4221/Writing-XML-SAX-Parsers-in-C.htm

HTML Agility Pack Null Reference

I've got some trouble with the HTML Agility Pack.
I get a null reference exception when I use this method on HTML not containing the specific node. It worked at first, but then it stopped working. This is only a snippet and there are about 10 more foreach loops that selects different nodes.
What am I doing wrong?
public string Export(string html)
{
var doc = new HtmlDocument();
doc.LoadHtml(html);
// exception gets thrown on below line
foreach (var repeater in doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']"))
{
if (repeater != null)
{
repeater.Name = "editor:repeater";
repeater.Attributes.RemoveAll();
}
}
var sw = new StringWriter();
doc.Save(sw);
sw.Flush();
return sw.ToString();
}
AFAIK, DocumentNode.SelectNodes returns null if no nodes are found.
This is default behaviour, see a discussion thread on codeplex: Why DocumentNode.SelectNodes returns null
So the workaround could be in rewriting the foreach block:
var repeaters = doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']");
if (repeaters != null)
{
foreach (var repeater in repeaters)
{
if (repeater != null)
{
repeater.Name = "editor:repeater";
repeater.Attributes.RemoveAll();
}
}
}
This has been updated, and you can now prevent SelectNodes from returning null by setting doc.OptionEmptyCollection = true, as detailed in this github issue.
This will make it return an empty collection instead of null if there are no nodes which match the query (I'm not sure why this wasn't the default behaviour to begin with, though)
As per Alex's answer, but I solved it like this:
public static class HtmlAgilityPackExtensions
{
public static HtmlAgilityPack.HtmlNodeCollection SafeSelectNodes(this HtmlAgilityPack.HtmlNode node, string selector)
{
return (node.SelectNodes(selector) ?? new HtmlAgilityPack.HtmlNodeCollection(node));
}
}
You can simply add a ? before every . as in the example below:
var titleTag = htdoc?.DocumentNode?.Descendants("title")?.FirstOrDefault()?.InnerText;
I've created universal extension which would work with any IEnumerable<T>
public static List<TSource> ToListOrEmpty<TSource>(this IEnumerable<TSource> source)
{
return source == null ? new List<TSource>() : source.ToList();
}
And usage is:
var opnodes = bodyNode.Descendants("o:p").ToListOrEmpty();
opnodes.ForEach(x => x.Remove());
