Parsing XSD complexType recursively - C#

private ElementDefinition ParseComplexType(XElement complexType, string nameValue = "")
{
var name = complexType.Attribute("name");
ElementDefinition element = new ElementDefinition()
{
Elements = new List<ElementDefinition>(),
ElementName = name != null ? name.Value : string.Empty
};
foreach (var el in complexType.Descendants().Where(k => k.Parent.Parent == complexType && k.Name.LocalName == "element"))
{
ElementDefinition tempElement = new ElementDefinition();
var tempName = el.Attribute("name");
var tempType = el.Attribute("type");
if (tempName != null)
{
tempElement.ElementName = tempName.Value;
}
if (tempType != null)
{
var tempTypeValue = tempType.Value.Substring(tempType.Value.IndexOf(":") + 1, tempType.Value.Length - tempType.Value.IndexOf(":") - 1);
if (tipovi.Contains(tempTypeValue))
{
tempElement.ElementType = tempTypeValue;
element.Elements.Add(tempElement);
}
else
{
complexType = GetComplexType(tempTypeValue);
element.Elements.Add(ParseComplexType(complexType, tempName.Value));
}
}
}
if (nameValue != "") element.ElementName = nameValue;
return element;
}
Hi, this is a function I use for parsing XSD complexTypes.
This is the XSD schema I use: xsd Schema.
I have a problem parsing the complexType element at line 14 of the schema.
It only parses the shipTo element; it skips billTo and parses items incorrectly.
The result is http://pokit.org/get/?b335243094f635f129a8bc74571c8bf2.jpg
Which fixes can I apply to this function to make it work properly?
PS. "tipovi" is a list of the types supported by XSD, e.g. string, positiveInteger, ...
EDITED:
private XElement GetComplexType(string typeName)
{
XNamespace ns = "http://www.w3.org/2001/XMLSchema";
string x = "";
foreach (XElement ele in xsdSchema.Descendants())
{
if (ele.Name.LocalName == "complexType" && ele.Attribute("name") != null)
{
x = ele.Attribute("name").Value;
if (x == typeName)
{
return ele;
}
}
}
return null;
}
GetComplexType finds the complexType definition for an element type. For example, for "PurchaseOrderType" (line 10) it returns the element at line 14.

NOTE: This is only a partial answer as it only explains the issue regarding the skipped "billTo" element. The code as presented in the question has many more issues.
The problem regarding skipping of the billTo element
The complexType variable is used in the predicate passed to the LINQ method Where in the foreach loop:
complexType.Descendants().Where(k => k.Parent.Parent == complexType && k.Name.LocalName == "element")
This lambda expression captures the variable complexType itself, not merely the value it had when the lambda was created. Because Where is evaluated lazily, the predicate runs again for each element as the loop advances.
By assigning another value to complexType deep down inside your foreach loop
complexType = GetComplexType(tempTypeValue);
you also change which elements the predicate of the Where method accepts for the rest of the foreach loop.
The Fix
The solution is rather simple: Do not change the complexType variable within the foreach loop. You could do the call of GetComplexType like this:
XElement complexTypeUsedByElement = GetComplexType(tempTypeValue);
element.Elements.Add(ParseComplexType(complexTypeUsedByElement, tempName.Value));
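The effect can be reproduced without any XML. The sketch below is a minimal illustration, not the asker's code: ClosureDemo, limit and numbers are hypothetical names chosen for the example. It filters a sequence with Where and reassigns the captured variable inside the loop, much as ParseComplexType reassigns complexType.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Returns the items that pass the filter when the captured variable
    // is reassigned part-way through the enumeration.
    public static List<int> FilterWhileReassigning()
    {
        int limit = 10;                         // captured by the lambda below
        var numbers = new[] { 1, 2, 3, 4, 5 };
        var seen = new List<int>();

        // Where is lazy: the predicate runs once per item, as the loop advances.
        foreach (int n in numbers.Where(x => x < limit))
        {
            seen.Add(n);
            limit = 0;                          // later items are tested against 0
        }

        return seen;
    }
}
```

ClosureDemo.FilterWhileReassigning() returns a list containing only 1, because every later item is tested against the reassigned value. Leaving the captured variable alone inside the loop, as in the fix above, avoids this.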

Related

Finding duplicate properties inside two lists

I have two lists, each list is of type "Node". So I have a StartNodeList and an EndNodeList.
Each Node consists of 3 properties of type Double... X, Y and Z.
The StartNodeList and EndNodeList currently contain Nodes with identical property values.
The output I need is a single list of type Node that contains only Nodes with unique property values (i.e. no duplicate Nodes).
I have tried all manner of foreach loops and comparison operators that I can think of with varying levels of success with nothing working perfectly, and several hours of researching the problem online hasn't helped.
Could someone please help me toward a solution?
while (selector.MoveNext())
{
Beam beam = selector.Current as Beam;
if (beam != null)
{
Node nodeEnd = new Node();
nodeEnd.x = beam.EndPoint.X;
nodeEnd.y = beam.EndPoint.Y;
nodeEnd.z = beam.EndPoint.Z;
Node nodeStart = new Node();
nodeStart.x = beam.StartPoint.X;
nodeStart.y = beam.StartPoint.Y;
nodeStart.z = beam.StartPoint.Z;
Member member = new Member() { member_start = nodeStart, member_end = nodeEnd, member_id = 1 };
memberList.Add(member);
nodeEndList.Add(nodeEnd);
nodeStartList.Add(nodeStart);
memberNumdber++;
}
}
Console.WriteLine(nodeStartList.Count());
Console.ReadLine();
int count = nodeStartList.Count();
foreach(Node i in nodeEndList)
{
nodeListSorted = EqualityComparer.Compare(i, nodeStartList);
}
public static class EqualityComparer
{
public static List<Node> Compare(Node node, List<Node> list)
{
List<Node> output = new List<Node>();
output.Add(node);
foreach(Node i in list)
{
if (node.x.Equals(i.x) && node.y.Equals(i.y) && node.z.Equals(i.z))
{
}
else
{
output.Add(i);
}
}
return output;
}
}
I would recommend using LINQ. You can use Union, which produces the set union of two sequences, and Any, which determines whether any element of a sequence satisfies the given condition.
var uniqueList = list1.Where(el => !list2.Any(l2 => l2.x == el.x && l2.y == el.y && l2.z == el.z)).Union(list2).ToList();
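An alternative that also removes duplicates occurring inside a single list is to hand Union a custom equality comparer. The following is a sketch under assumptions: Node is a stand-in for the asker's class (only the x, y and z members are taken from the question), NodeCoordinateComparer and NodeLists are hypothetical names, and coordinates are compared for exact equality, which may be too strict for computed floating-point values.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the asker's Node class.
public class Node
{
    public double x, y, z;
}

// Hypothetical comparer: two nodes are equal when all three coordinates match exactly.
public sealed class NodeCoordinateComparer : IEqualityComparer<Node>
{
    public bool Equals(Node a, Node b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.x == b.x && a.y == b.y && a.z == b.z;
    }

    public int GetHashCode(Node n)
    {
        // Equal nodes must produce equal hash codes, so hash the same three fields.
        return n is null ? 0 : (n.x, n.y, n.z).GetHashCode();
    }
}

public static class NodeLists
{
    // Union yields each distinct node once, in order of first appearance.
    public static List<Node> Merge(IEnumerable<Node> first, IEnumerable<Node> second)
    {
        return first.Union(second, new NodeCoordinateComparer()).ToList();
    }
}
```

With the lists from the question this would be called as NodeLists.Merge(nodeStartList, nodeEndList).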

Entity Framework 6 - Detached entity prevent saving duplicate navigation properties

The method below takes some detached Nodes as an input parameter. The goal is to load any existing Aliases from the database, insert missing Nodes into the database, and if the detached Node's Alias entity is already in the database, simply set it to the one of the database.
However, on SaveChanges(), it seems that the Alias that already exists in the database is inserted yet again. How do I get around this?
internal async Task InsertMissingNodesToDb(INWatchNode[] nodes)
{
if (nodes.Any(x => x == null)) {
Trace.TraceError("Some null element in nodes array in InsertMissingNodesToDb().");
nodes = nodes.Where(x => x != null).ToArray();
}
// De-dup nodes based on ID
nodes = nodes.GroupBy(x => x.Id).Select(y => y.FirstOrDefault()).ToArray();
List<string> aliasNames = new List<string>();
foreach (var node in nodes) {
foreach (var alias in node.Aliases) {
if (!aliasNames.Contains(alias)) {
aliasNames.Add(alias);
}
}
}
using (var dbContext = Application.GetDbContext()) {
dbContext.Aliases.Where(x => aliasNames.Contains(x.Alias)).Load();
foreach (var node in nodes) {
var entityNode = await dbContext.Nodes.FindAsync(node.Id);
if (entityNode == null) {
entityNode = node is NWatchNode ? (NWatchNode)node : new NWatchNode(node);
for (int i = 0; i < entityNode.AliasEntities.Count; i++) {
var currentElement = entityNode.AliasEntities.ElementAt(i);
var loadedAlias = dbContext.Aliases.Local.
FirstOrDefault(x => x.Alias == currentElement.Alias);
if (loadedAlias != null) {
currentElement.Id = loadedAlias.Id;
currentElement = loadedAlias;
dbContext.Entry(loadedAlias).State = EntityState.Unchanged;
}
}
dbContext.Nodes.Add(entityNode);
}
}
await dbContext.SaveChangesAsync();
}
}
I am suspicious of this section of code.
currentElement.Id = loadedAlias.Id;
currentElement = loadedAlias; // this seems pointless
dbContext.Entry(loadedAlias).State = EntityState.Unchanged;
If you want that line to do something, I think you need to replace the item in entityNode.AliasEntities with loadedAlias. Setting currentElement to loadedAlias only repoints the local variable, so it has no effect: it does not tell the entity that one of its children should be this instance. Something like the following might prevent the issue:
// The result of ElementAt(i) cannot be assigned to, so swap the item in the
// collection instead (this assumes AliasEntities is an ICollection<T>).
// ToList() takes a snapshot so the collection can be modified inside the loop.
foreach (var currentElement in entityNode.AliasEntities.ToList()) {
    var loadedAlias = dbContext.Aliases.Local
        .FirstOrDefault(x => x.Alias == currentElement.Alias);
    if (loadedAlias != null) {
        entityNode.AliasEntities.Remove(currentElement);
        entityNode.AliasEntities.Add(loadedAlias);
        dbContext.Entry(loadedAlias).State = EntityState.Unchanged;
    }
}
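Why the original assignment has no effect can be shown with a plain list. This is only an illustration with hypothetical names (LocalReassignDemo and the string values); it involves no Entity Framework types.

```csharp
using System.Collections.Generic;

public static class LocalReassignDemo
{
    // Reassigning the local only changes what the local refers to.
    public static List<string> ReassignLocal()
    {
        var items = new List<string> { "detached", "other" };
        var current = items[0];
        current = "loaded";          // items[0] is still "detached"
        return items;
    }

    // Writing through the collection replaces the stored item.
    public static List<string> ReplaceInCollection()
    {
        var items = new List<string> { "detached", "other" };
        items[0] = "loaded";         // the list now holds "loaded" first
        return items;
    }
}
```

ReassignLocal() returns the list unchanged, while ReplaceInCollection() returns a list whose first item is "loaded".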

Use the object returned by LINQ

I'm using LINQ to find an object from an XML file. After I find the object, I want to print its details, but I'm not really sure how I can use the object I found.
This is my code:
var apartmentExist =
from apartment1 in apartmentXml.Descendants("Apartment")
where (apartment1.Attribute("street_name").Value == newApartment.StreetName) &&
(apartment1.Element("Huose_Num").Value == newApartment.HouseNum.ToString())
select apartment1.Value;
if (apartmentExist.Any() == false)
{
Console.WriteLine("Sorry, Apartment at {0} or at num {1}", newApartment.StreetName,
newApartment.HouseNum);
}
else
{
//print the details of apartment1
}
My XML is:
<?xml version="1.0" encoding="utf-8"?>
<Apartments>
<Apartment street_name="sumsum">
<Huose_Num>13</Huose_Num>
<Num_Of_Rooms>4</Num_Of_Rooms>
<Price>10000</Price>
<Flags>
<Elevator>true</Elevator>
<Floor>1</Floor>
<parking_spot>true</parking_spot>
<balcony>true</balcony>
<penthouse>true</penthouse>
<status_sale>true</status_sale>
</Flags>
</Apartment>
</Apartments>
Your LINQ query returns an IEnumerable<string>, because it selects apartment1.Value. If you expect it to return more than one item, you can use a foreach loop to print them; if there is only one result, you can call the .Single() extension method to get the item itself rather than a collection.
Casting an XElement to string is safer than using the XElement.Value property, because the cast returns null instead of throwing a NullReferenceException when the element does not exist. You should also use the (int)XElement cast and compare numbers, instead of comparing XElement.Value to the string representation of a number.
You should not use the Descendants method here; use Elements instead. It will make your query faster, because only the direct children are searched rather than the whole tree.
You should call FirstOrDefault and check whether the result is null, instead of using Any followed by another First call. That prevents your query from executing twice.
Instead of returning apartment1.Value, which is a string, return apartment1 itself. It will be an XElement, and you'll be able to get at its content later when necessary.
var apartmentExist =
    from apartment1 in apartmentXml.Root.Elements("Apartment")
    where ((string)apartment1.Attribute("street_name") == newApartment.StreetName) &&
          ((int)apartment1.Element("Huose_Num") == newApartment.HouseNum)
    select apartment1;
var apartment = apartmentExist.FirstOrDefault();
if (apartment == null)
{
    Console.WriteLine("Sorry, Apartment at {0} or at num {1}", newApartment.StreetName, newApartment.HouseNum);
}
else
{
    // you can use the apartment variable here; it's an XElement
    var huoseNum = (string)apartment.Element("Huose_Num");
    // flags: iterate over the children of <Flags>, not over <Flags> itself
    foreach (var flag in apartment.Elements("Flags").Elements())
    {
        var name = flag.Name;
        var value = (string)flag;
    }
}
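The difference between the casts and .Value can be seen in isolation. A minimal sketch, assuming the element and attribute names from the question's XML; XCastDemo and its method names are hypothetical.

```csharp
using System.Xml.Linq;

public static class XCastDemo
{
    // The explicit string conversion returns null when the attribute is missing,
    // whereas apartment.Attribute("street_name").Value would throw.
    public static string StreetName(XElement apartment)
    {
        return (string)apartment.Attribute("street_name");
    }

    // The nullable conversion returns null when the element is missing.
    // A plain (int) cast would throw for a missing element.
    public static int? HouseNumber(XElement apartment)
    {
        return (int?)apartment.Element("Huose_Num");
    }
}
```

For an <Apartment> that has both pieces of data these return the street name and the house number; for an empty <Apartment /> both return null without throwing.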
You can do it with one LINQ query like this:
var apartment =
    (from a in apartmentXml.Descendants("Apartment")
     where (a.Attribute("street_name").Value == newApartment.StreetName) &&
           (a.Element("Huose_Num").Value == newApartment.HouseNum.ToString())
     // <Flags> is a single element, so bind it with let instead of querying over it
     let f = a.Element("Flags")
     select new {
         street_name = a.Attribute("street_name").Value,
         Huose_Num = a.Element("Huose_Num").Value,
         Num_Of_Rooms = a.Element("Num_Of_Rooms").Value,
         Price = a.Element("Price").Value,
         Flags = new {
             Elevator = f.Element("Elevator").Value,
             Floor = f.Element("Floor").Value,
             parking_spot = f.Element("parking_spot").Value,
             balcony = f.Element("balcony").Value,
             penthouse = f.Element("penthouse").Value,
             status_sale = f.Element("status_sale").Value
         }
     }).FirstOrDefault();
if (apartment == null)
{
    Console.WriteLine("Sorry, Apartment at {0} or at num {1}", newApartment.StreetName,
        newApartment.HouseNum);
}
else
{
    Console.WriteLine(apartment.street_name);
    Console.WriteLine(apartment.Huose_Num);
    Console.WriteLine(apartment.Num_Of_Rooms);
    Console.WriteLine(apartment.Price);
    Console.WriteLine(apartment.Flags.Elevator);
    Console.WriteLine(apartment.Flags.Floor);
    Console.WriteLine(apartment.Flags.parking_spot);
    Console.WriteLine(apartment.Flags.balcony);
    Console.WriteLine(apartment.Flags.penthouse);
    Console.WriteLine(apartment.Flags.status_sale);
}
Try this:
var xml = @"<?xml version=""1.0"" encoding=""utf-8""?>
<Apartments>
<Apartment street_name=""sumsum"">
<Huose_Num>13</Huose_Num>
<Num_Of_Rooms>4</Num_Of_Rooms>
<Price>10000</Price>
<Flags>
<Elevator>true</Elevator>
<Floor>1</Floor>
<parking_spot>true</parking_spot>
<balcony>true</balcony>
<penthouse>true</penthouse>
<status_sale>true</status_sale>
</Flags>
</Apartment>
</Apartments>
";
var apartmentXml = XElement.Parse( xml );
//apartmentXml.Dump(); // This is a linqpad feature
var new_street = "sumsum";
var new_house_num = "13";
var match_apartment = apartmentXml.Elements().Where (x => x.Attribute("street_name").Value == new_street && x.Element("Huose_Num").Value == new_house_num );
//match_apartment.Dump();
if (match_apartment.Count() < 1 )
{
Console.WriteLine("Sorry, Apartment at {0} or at num {1}", new_street,
new_house_num);
}
else
{
foreach( var x in match_apartment.Elements() )
{
Console.WriteLine("{0} | {1}", x.Name, x.Value );
}
}
apartmentExist is an IEnumerable, so to access an individual item convert it to a list and use indexing:
Console.WriteLine(apartmentExist.ToList()[0]);
This prints the first item found by the query above.

How to iterate over a JDOM Element that contains a list of nodes?

I am pretty new to XML parsing in Java using org.jdom.*, and I don't know C#.
I have to translate some methods from C# to Java, and I have the following problem.
In C# I have something like this:
System.Xml.XmlNodeList nodeList;
nodeList = _document.SelectNodes("//root/settings/process-filters/process");
if (nodeList == null || nodeList.Count == 0) {
return risultato;
}
Objects.MyItem o;
foreach (System.Xml.XmlNode n in nodeList){
o = new Objects.MyItem(n.ChildNodes[1].InnerText, n.ChildNodes[0].InnerText);
risultato.Add(o);
}
And I have translated it to Java in this way:
public List<ProcessiDaEscludere> getProcessiDaEscludere() {
Vector<ProcessiDaEscludere> risultato = new Vector<ProcessiDaEscludere>();
Element nodeList;
XPath xPath;
try {
// XPath query that accesses the root of the <process-filters> tag:
xPath = XPath.newInstance("//root/settings/process-filters/process");
nodeList = (Element) xPath.selectSingleNode(CONFIG_DOCUMENT);
if (nodeList == null || nodeList.getChildren().size() == 0)
return risultato;
ProcessiDaEscludere processoDaEscludere = new ProcessiDaEscludere();
}catch (JDOMException e){
}
return risultato;
}
My problem is that I have no idea how to iterate over the Element nodeList variable the way the C# code does in these lines:
foreach (System.Xml.XmlNode n in nodeList){
o = new Objects.MyItem(n.ChildNodes[1].InnerText, n.ChildNodes[0].InnerText);
risultato.Add(o);
}
Can someone help me?

Best way to combine nodes with Html Agility Pack

I've converted a large document from Word to HTML. It's close, but I have a bunch of "code" nodes that I'd like to merge into one "pre" node.
Here's the input:
<p>Here's a sample MVC Controller action:</p>
<code> public ActionResult Index()</code>
<code> {</code>
<code> return View();</code>
<code> }</code>
<p>We'll start by making the following changes...</p>
I want to turn it into this, instead:
<p>Here's a sample MVC Controller action:</p>
<pre class="brush: csharp"> public ActionResult Index()
{
return View();
}</pre>
<p>We'll start by making the following changes...</p>
I ended up writing a brute-force loop that iterates nodes looking for consecutive ones, but this seems ugly to me:
HtmlDocument doc = new HtmlDocument();
doc.Load(file);
var nodes = doc.DocumentNode.ChildNodes;
string contents = string.Empty;
foreach (HtmlNode node in nodes)
{
if (node.Name == "code")
{
contents += node.InnerText + Environment.NewLine;
if (node.NextSibling.Name != "code" &&
!(node.NextSibling.Name == "#text" && node.NextSibling.NextSibling.Name == "code")
)
{
node.Name = "pre";
node.Attributes.RemoveAll();
node.SetAttributeValue("class", "brush: csharp");
node.InnerHtml = contents;
contents = string.Empty;
}
}
}
nodes = doc.DocumentNode.SelectNodes(@"//code");
foreach (var node in nodes)
{
node.Remove();
}
Normally I'd remove the nodes in the first loop, but that doesn't work during iteration since you can't change the collection as you iterate over it.
Better ideas?
The first approach: select all the <code> nodes, group them, and create a <pre> node per group:
var idx = 0;
var nodes = doc.DocumentNode
    .SelectNodes("//code")
    .GroupBy(n => new {
        Parent = n.ParentNode,
        Index = n.NextSiblingIsCode() ? idx : idx++
    });
foreach (var group in nodes)
{
    var pre = HtmlNode.CreateNode("<pre class='brush: csharp'></pre>");
    pre.AppendChild(doc.CreateTextNode(
        string.Join(Environment.NewLine, group.Select(g => g.InnerText))
    ));
    group.Key.Parent.InsertBefore(pre, group.First());
    foreach (var code in group)
        code.Remove();
}
The grouping key here combines the parent node with a group index, which is incremented each time a run of <code> nodes ends.
I also used a NextSiblingIsCode extension method here:
public static bool NextSiblingIsCode(this HtmlNode node)
{
    return (node.NextSibling != null && node.NextSibling.Name == "code") ||
        (node.NextSibling is HtmlTextNode &&
         node.NextSibling.NextSibling != null &&
         node.NextSibling.NextSibling.Name == "code");
}
It is used to determine whether the next sibling is a <code> node, ignoring a single text node in between.
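The running-index trick used as the grouping key can be tried without Html Agility Pack. The sketch below is an assumption-laden stand-in: instead of HTML nodes it groups integer positions, treating adjacent positions as one run, and RunGroupingDemo and GroupConsecutive are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RunGroupingDemo
{
    // Splits a sorted list of positions into runs of adjacent values.
    public static List<List<int>> GroupConsecutive(IReadOnlyList<int> positions)
    {
        var idx = 0;
        return positions
            // Pair each position with a flag saying whether the run continues after it.
            .Select((pos, i) => new
            {
                pos,
                continues = i + 1 < positions.Count && positions[i + 1] == pos + 1
            })
            // Every item gets the current index as its key; the index is bumped
            // only after the last item of a run, so the next run gets a new key.
            .GroupBy(p => p.continues ? idx : idx++)
            .Select(g => g.Select(p => p.pos).ToList())
            .ToList();
    }
}
```

For the positions 1, 2, 3, 7, 8, 12 this produces three groups: 1 2 3, then 7 8, then 12. The key selector has a side effect, so it relies on GroupBy calling it once per item, in order.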
The second approach: select only the top <code> node of each group, then iterate through each of these nodes to find the next <code> node until the first non-<code> node. I used xpath here:
var nodes = doc.DocumentNode.SelectNodes(
    "//code[name(preceding-sibling::*[1])!='code']"
);
foreach (var node in nodes)
{
    var pre = HtmlNode.CreateNode("<pre class='brush: csharp'></pre>");
    node.ParentNode.InsertBefore(pre, node);
    var content = string.Empty;
    var next = node;
    do
    {
        content += next.InnerText + Environment.NewLine;
        var previous = next;
        next = next.SelectSingleNode("following-sibling::*[1][name()='code']");
        previous.Remove();
    } while (next != null);
    pre.AppendChild(doc.CreateTextNode(
        content.TrimEnd(Environment.NewLine.ToCharArray())
    ));
}