Reorder using Join in Linq - c#

I want my XElements in document order.
Can I reorder my xpathGroups using Join like in this example?
XDocument message_doc = XDocument.Load(message);
var xpathGroups =
    from e in contextErrors
    group e by SelectElement(message_doc, e.XPath) into g
    select new
    {
        Element = g.Key,
        ErrorItems = g,
    };
var documentOrderedGroups =
    from elem in message_doc.Root.DescendantsAndSelf()
    join e in xpathGroups on elem equals e.Element
    select e;
Message:
<root>
  <a>
    <aa>
      <aaa>9999</aaa>
    </aa>
    <aa>
      <aaa>8888</aaa>
    </aa>
  </a>
  <b>
    <bb>
      <bbb>7777</bbb>
    </bb>
  </b>
  <c>
    <cc>
      <ccc>6666</ccc>
    </cc>
  </c>
</root>
Input data:
/root/c[1]/cc[1]/ccc[1]
/root/a[1]/aa[2]/aaa[1]
/root/b[1]/bb[1]/bbb[1]
/root/a[1]/aa[1]/aaa[1]
Expected result:
/root/a[1]/aa[1]/aaa[1]
/root/a[1]/aa[2]/aaa[1]
/root/b[1]/bb[1]/bbb[1]
/root/c[1]/cc[1]/ccc[1]

Your original queries work, and the result is an object with the element and its relevant XPath query in document order. However, the result conflicts with the comment you made that you only want the elements in document order.
Elements and XPath: if you want both the element and its XPath then the join will remain as part of the query but I would replace the grouping with a projection into an anonymous type.
var xpathElements = contextErrors.Select(e => new
{
    Element = message_doc.XPathSelectElement(e.XPath),
    XPath = e.XPath
});
var ordered = from e in message_doc.Descendants()
              join o in xpathElements on e equals o.Element
              select o;
Elements only: if you only want the elements to be in document order, the following approach would work as well.
// ToList() so the XPath queries run once rather than once per descendant.
var xpathElements = contextErrors.Select(e => message_doc.XPathSelectElement(e.XPath)).ToList();
var ordered = message_doc.Descendants()
    .Where(e => xpathElements.Any(o => e == o));
In both examples I've used the XPathSelectElement method in place of your SelectElement method, which I gather has the same purpose.
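To check the ordering against the sample message from the question, here is a minimal sketch written as a C# top-level program. The anonymous objects standing in for contextErrors are an assumption (only the XPath property is needed), and the InDocumentOrder() call at the end is an alternative I'm adding for the elements-only case, not something from the original code. The join works because LINQ to Objects keeps the order of the outer sequence, and XElement instances are compared by reference.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// The sample message from the question, inlined instead of loaded from a file.
var message_doc = XDocument.Parse(
    "<root><a><aa><aaa>9999</aaa></aa><aa><aaa>8888</aaa></aa></a>" +
    "<b><bb><bbb>7777</bbb></bb></b><c><cc><ccc>6666</ccc></cc></c></root>");

// Hypothetical stand-in for contextErrors: only the XPath property matters here.
var contextErrors = new[]
{
    new { XPath = "/root/c[1]/cc[1]/ccc[1]" },
    new { XPath = "/root/a[1]/aa[2]/aaa[1]" },
    new { XPath = "/root/b[1]/bb[1]/bbb[1]" },
    new { XPath = "/root/a[1]/aa[1]/aaa[1]" },
};

// Resolve each XPath once and keep the element together with its XPath.
var xpathElements = contextErrors
    .Select(e => new { Element = message_doc.XPathSelectElement(e.XPath), e.XPath })
    .ToList();

// The outer sequence (Descendants) is in document order, and Join preserves it.
var ordered = (from e in message_doc.Descendants()
               join o in xpathElements on e equals o.Element
               select o.XPath).ToList();

foreach (var xpath in ordered)
    Console.WriteLine(xpath);

// Elements only: InDocumentOrder() sorts nodes that belong to the same document.
var elementsOnly = xpathElements
    .Select(o => o.Element)
    .InDocumentOrder()
    .Select(e => e.Value)
    .ToList();

Console.WriteLine(string.Join(", ", elementsOnly)); // 9999, 8888, 7777, 6666
```

The first loop prints the four XPaths in the order listed under "Expected result" in the question.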

Related

Use LINQ to get only the most recent JOINed item for each element

I have a LINQ query:
Elements.Join(ElementStates,
element => element.ElementID,
elementState => elementState.ElementID,
(element , elementState ) => new { element, elementState })
Each Element has an ElementState associated with it. However, there can be multiple states for each element for historical purposes, distinguished by a DateModified column. In this query, I would like to return only the most recent ElementState for each Element.
Is such a thing possible, using LINQ?
EDIT:
Credit to Gilad Green for their helpful answer.
I have converted it to Method syntax for anyone else who would like to see this in the future:
Elements.GroupJoin(ElementStates,
    element => element.ElementID,
    elementState => elementState.ElementID,
    (element, elementState) => new
    {
        element,
        elementState = elementState.OrderByDescending(y => y.DateModified).FirstOrDefault()
    });
You can use GroupJoin instead of Join and then retrieve the first record after ordering the group by the DateModified:
var result = from e in Elements
             join es in ElementStates on e.ElementID equals es.ElementID into esj
             select new {
                 Element = e,
                 State = esj.OrderByDescending(i => i.DateModified).FirstOrDefault()
             };
The same can be implemented with method syntax instead of query syntax, but in my opinion the query syntax is more readable here.
For the difference between simply joining and group joining: Linq to Entities join vs groupjoin
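A small in-memory sketch of the group join, to show what it returns. The sample data, the Name and Status properties, and the dates are all made up for illustration; this runs as LINQ to Objects, so it says nothing about how a database provider would translate the query.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: element 1 has two states, element 3 has none.
var Elements = new[]
{
    new { ElementID = 1, Name = "A" },
    new { ElementID = 2, Name = "B" },
    new { ElementID = 3, Name = "C" },
};
var ElementStates = new[]
{
    new { ElementID = 1, Status = "draft", DateModified = new DateTime(2020, 1, 1) },
    new { ElementID = 1, Status = "final", DateModified = new DateTime(2021, 6, 1) },
    new { ElementID = 2, Status = "draft", DateModified = new DateTime(2019, 3, 1) },
};

// "into esj" turns the join into a group join: one result per element,
// with esj holding all of that element's states.
var result = (from e in Elements
              join es in ElementStates on e.ElementID equals es.ElementID into esj
              select new
              {
                  Element = e,
                  State = esj.OrderByDescending(i => i.DateModified).FirstOrDefault()
              }).ToList();

foreach (var r in result)
{
    var status = r.State?.Status ?? "(no state)";
    Console.WriteLine($"{r.Element.Name}: {status}");
}
// A: final
// B: draft
// C: (no state)
```

Note that, unlike a plain Join, an element without any state still appears in the result, with State set to null.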

Using a List in a where clause in Entity Framework

I am trying to retrieve document IDs over a one-to-many table. I want to use a List in the where clause to find all IDs that are connected with every element in the list.
List<int> docIds = (from d in doc
where _tags.Contains(d.Tags)
select d.id).ToList<int>();
I know that the Contains call must be incorrect, but I can't work it out. If I try a foreach, I can't work out how to check whether the document contains all Tags.
If you want every tag in d.Tags to be included in the _tags list, you can try:
List<int> docIds = (from d in doc
where d.Tags.All(t => _tags.Contains(t))
select d.id).ToList<int>();
If you want d.Tags to contain all the items from _tags, you need:
List<int> docIds = (from d in doc
where _tags.All(t => d.Tags.Contains(t))
select d.id).ToList<int>();
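But I don't know how EF translates this to SQL, so you may need to evaluate it on the client side.
Use a join (note that this matches documents containing any of the tags, and returns a document's id once per matching tag):
List<int> docIds = (from d in doc
                    from t in _tags
                    where d.Tags.Contains(t)
                    select d.id).ToList<int>();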
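The three answers above filter in different ways, which is easy to see on a tiny in-memory example. This is only a sketch: it assumes Tags is a collection of strings and uses made-up documents, and it runs as LINQ to Objects rather than through Entity Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var _tags = new List<string> { "a", "b" };

// Hypothetical documents; the shape (id, Tags) follows the question.
var doc = new[]
{
    new { id = 1, Tags = new List<string> { "a" } },
    new { id = 2, Tags = new List<string> { "a", "b", "c" } },
    new { id = 3, Tags = new List<string> { "c" } },
};

// Every tag of the document is in _tags (document tags are a subset of _tags).
var subset = doc.Where(d => d.Tags.All(t => _tags.Contains(t)))
                .Select(d => d.id).ToList();             // 1

// The document has every tag in _tags (document tags are a superset of _tags).
var superset = doc.Where(d => _tags.All(t => d.Tags.Contains(t)))
                  .Select(d => d.id).ToList();           // 2

// The "from ... from ... where" form: one row per (document, matching tag) pair.
var perMatch = (from d in doc
                from t in _tags
                where d.Tags.Contains(t)
                select d.id).ToList();                   // 1, 2, 2

// Same idea without duplicates: the document has at least one of the tags.
var anyTag = perMatch.Distinct().ToList();               // 1, 2

Console.WriteLine(string.Join(", ", superset));
```

For "connected with every element in the list", the superset form is the one that matches the question. Also keep in mind that All returns true for a document with no tags at all, so the subset form would include such documents.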

Linq query on XML to Select multiple elements of subnodes

I want to select all distinct values of child from the following XML:
<root>
<parent>
<child>value 1</child>
<child>value 2</child>
</parent>
<parent>
<child>value 1</child>
<child>value 4</child>
</parent>
</root>
I tried the following:
var vals = (from res in XmlResources.Elements("root").Elements("parent") select res)
    .SelectMany(r => r.Elements("child")).Distinct().ToList();
But I can't get the values from it: it gives me each value wrapped in its tag, and the results are not distinct.
Is it possible to show both ways to get it, query syntax and chaining (lambda) syntax?
Yes, it is possible both ways:
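// XDocument.Parse reads XML from a string; new XDocument("...") would throw.
var doc = XDocument.Parse("your xml string");
// query style
var values = (from c in doc.Root.Descendants("child") select c.Value).Distinct();
// chaining style (renamed so both declarations can live in the same scope)
var valuesChained = doc.Root.Descendants("child").Select(c => c.Value).Distinct();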
You're selecting the elements, and the elements are all distinct. You need to get the distinct values. For example:
var values = XmlResources.Element("root")
.Elements("parent")
.Elements("child")
.Select(x => x.Value)
.Distinct();
There's really no benefit in using a query expression here - it only adds cruft. I only use a query expression when the query has multiple aspects to it (e.g. a where and a meaningful select, or a join). For just a select or just a where it's pretty pointless. So yes, you can use:
var values = (from x in XmlResources.Element("root")
.Elements("parent")
.Elements("child")
select x.Value).Distinct();
... but why would you? It's a lot less clear IMO.
Note that if you don't care too much about the root/parent/child hierarchy, and are happy to just get all the child descendants, you can use:
var values = XmlResources.Descendants("child")
.Select(x => x.Value)
.Distinct();
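A quick way to see the difference between distinct elements and distinct values, using the XML from the question. This is a sketch; XmlResources is assumed here to be an XDocument parsed from that XML.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Assumption: XmlResources is an XDocument holding the XML from the question.
var XmlResources = XDocument.Parse(
    "<root>" +
    "<parent><child>value 1</child><child>value 2</child></parent>" +
    "<parent><child>value 1</child><child>value 4</child></parent>" +
    "</root>");

// Distinct() on elements compares XElement references, so nothing is removed.
var distinctElements = XmlResources.Descendants("child").Distinct().Count();

// Project to the string value first, then Distinct() compares the strings.
var values = XmlResources.Descendants("child")
                         .Select(x => x.Value)
                         .Distinct()
                         .ToList();

Console.WriteLine(distinctElements);          // 4
Console.WriteLine(string.Join(", ", values)); // value 1, value 2, value 4
```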

Dynamic XML Sorting using LINQ

How can I sort dynamic XML using LINQ, with the following order of precedence:
Sort by node-name
Sort by node-value
Sort by attribute-name
Sort by attribute-value
Sorting by Node Name:
var doc = XDocument.Parse("<data><carrot /><apple /><orange /></data>");
var sortedByNames = doc.Root.Elements().OrderBy(e => e.Name.ToString());
foreach (var e in sortedByNames)
    Console.WriteLine(e.Name);
Sorting by Node Value:
var doc = XDocument.Parse("<data><thing>carrot</thing><thing>apple</thing><thing>orange</thing></data>");
// e.Value is already a string, so no ToString() call is needed.
var sortedByValue = doc.Root.Elements().OrderBy(e => e.Value);
foreach (var e in sortedByValue)
    Console.WriteLine(e.Value);
It all follows the same pattern... You sort based on the criteria you define in the selector function passed into the OrderBy method.
var data = from item in xmldoc.Descendants("content")
           orderby (string)item.Element("title") // by node value
           //orderby (string)item.Attribute("something") // by attribute value (cast needed: XAttribute itself is not comparable)
           select new
           {
               Title = (string)item.Element("title"),
           };
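To combine the four criteria in the order of precedence from the question, OrderBy can be chained with ThenBy. The sketch below makes two assumptions that are not in the original: each element is compared by its first attribute only, and strings are compared ordinally. The sample XML is made up.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical input chosen so that every sort key decides at least one tie.
var doc = XDocument.Parse(
    "<data>" +
    "<b x='2'>beta</b>" +
    "<a y='1'>alpha</a>" +
    "<a x='9'>alpha</a>" +
    "<b x='1'>beta</b>" +
    "<a x='1'>aardvark</a>" +
    "</data>");

var sorted = doc.Root.Elements()
    .OrderBy(e => e.Name.ToString(), StringComparer.Ordinal)                      // 1. node name
    .ThenBy(e => e.Value, StringComparer.Ordinal)                                 // 2. node value
    .ThenBy(e => e.FirstAttribute?.Name.ToString() ?? "", StringComparer.Ordinal) // 3. attribute name
    .ThenBy(e => e.FirstAttribute?.Value ?? "", StringComparer.Ordinal)           // 4. attribute value
    .Select(e => $"{e.Name}/{e.Value}/{e.FirstAttribute?.Name}={e.FirstAttribute?.Value}")
    .ToList();

foreach (var line in sorted)
    Console.WriteLine(line);
// a/aardvark/x=1
// a/alpha/x=9
// a/alpha/y=1
// b/beta/x=1
// b/beta/x=2
```

If elements can carry several attributes, the two attribute keys would need a different definition, for example a string built from all attribute names in sorted order.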

Linq query convert to List<string>

I have this code
List<string> IDs = new List<string>();
XDocument doc = XDocument.Parse(xmlFile);
var query = from c in doc.Root.Elements("a").Elements("b")
select new { ID = c.Element("val").Value};
How can I convert query to a List<string> without a foreach loop?
The { ID = c.Element("val").Value } values are strings, of course.
EDIT: my XML file:
<?xml version="1.0" encoding="utf-8"?>
<aBase xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <a>
    <b>
      <val>other data</val>
    </b>
    <b>
      <val>other data</val>
    </b>
  </a>
</aBase>
IDs = query.Select(a => a.ID).ToList();
Or, if you'd like to do it in one statement:
List<string> IDs = (from c in doc.Root.Elements("a").Elements("b")
                    select c.Element("val").Value).ToList();
The anonymous type isn't really helping you since you only need a sequence of strings, not any sort of tuple. Try:
XDocument doc = XDocument.Parse(xmlFile);
var query = from c in doc.Root.Elements("a").Elements("b")
select c.Element("val").Value;
var IDs = query.ToList();
Personally, I would just use method-syntax all the way:
var IDs = doc.Root.Elements("a")
.Elements("b")
.Select(c => c.Element("val").Value)
.ToList();
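One thing to watch with c.Element("val").Value: it throws a NullReferenceException if some b element has no val child. A possible variation, sketched below with made-up data (the third b is deliberately empty), is to use the explicit string conversion on XElement, which returns null for a missing element, and filter the nulls out.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical XML with the same a/b/val shape as the question; the last <b> has no <val>.
var doc = XDocument.Parse(
    "<aBase><a>" +
    "<b><val>first</val></b>" +
    "<b><val>second</val></b>" +
    "<b />" +
    "</a></aBase>");

var IDs = doc.Root.Elements("a")
             .Elements("b")
             .Select(c => (string)c.Element("val")) // null when <val> is missing
             .Where(v => v != null)
             .ToList();

Console.WriteLine(string.Join(", ", IDs)); // first, second
```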
