Linq to XML: I am not able to compare the nested element


Thank you in advance, this is a great resource.
I believe the code explains itself, but in case I am being arrogant, I will explain.
My program lists movies in a TreeView according to the genre selected in a drop-down list. Each movie has several genres, hence the nested genre elements.
This is the XML:
<movie>
<title>2012</title>
<director>Roland Emmerich</director>
<writtenBy>
<writter>Roland Emmerich,</writter>
<writter>Harald Kloser</writter>
</writtenBy>
<releaseDate>12-Nov-2009</releaseDate>
<actors>
<actor>John Cusack,</actor>
<actor>Thandie Newton, </actor>
<actor>Chiwetel Ejiofor</actor>
</actors>
<filePath>H:\2012\2012.avi</filePath>
<picPath>~\image\2012.jpg</picPath>
<runningTime>158 min</runningTime>
<plot>Dr. Adrian Helmsley, part of a worldwide geophysical team investigating the effect on the earth of radiation from unprecedented solar storms, learns that the earth's core is heating up. He warns U.S. President Thomas Wilson that the crust of the earth is becoming unstable and that without proper preparations for saving a fraction of the world's population, the entire race is doomed. Meanwhile, writer Jackson Curtis stumbles on the same information. While the world's leaders race to build "arks" to escape the impending cataclysm, Curtis struggles to find a way to save his family. Meanwhile, volcanic eruptions and earthquakes of unprecedented strength wreak havoc around the world. </plot>
<trailer>http://2012-movie-trailer.blogspot.com/</trailer>
<genres>
<genre>Action</genre>
<genre>Adventure</genre>
<genre>Drama</genre>
</genres>
<rated>PG-13</rated>
</movie>
This is the code:
string selectedGenre = this.ddlGenre.SelectedItem.ToString();
XDocument xmldoc = XDocument.Load(Server.MapPath("~/App_Data/movie.xml"));
List<Movie> movies =
(from movie in xmldoc.Descendants("movie")
// The treeView doesn't exist
where movie.Elements("genres").Elements("genre").ToString() == selectedGenre
select new Movie
{
Title = movie.Element("title").Value
}).ToList();
foreach (var movie in movies)
{
TreeNode myNode = new TreeNode();
myNode.Text = movie.Title;
TreeView1.Nodes.Add(myNode);
}

Change your code to
List<Movie> movies =
(from movie in xmldoc.Descendants("movie")
where movie.Elements("genres").Elements("genre").Any(e => e.Value == selectedGenre)
select new Movie
{
Title = movie.Element("title").Value
}).ToList();
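This is needed because movie.Elements("genres").Elements("genre") returns a sequence of genre elements, and calling ToString() on that sequence does not produce any genre's text, so the comparison never matches. A movie can have more than one genre node, so you have to check whether any of them has a Value equal to the selected genre.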
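The difference is easy to reproduce in isolation. The following is only a minimal sketch: the inline XML, the GenreFilter class and the TitlesForGenre helper are made up for illustration and are not part of the original program. It shows that ToString() on the element sequence never equals a genre name, while Any tests each genre element's Value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class GenreFilter
{
    // Hypothetical sample data, shaped like the question's movie.xml.
    public const string SampleXml = @"<movies>
  <movie><title>2012</title><genres><genre>Action</genre><genre>Drama</genre></genres></movie>
  <movie><title>Up</title><genres><genre>Animation</genre></genres></movie>
</movies>";

    // Returns the titles of all movies that have at least one matching <genre>.
    public static List<string> TitlesForGenre(XDocument doc, string genre)
    {
        return doc.Descendants("movie")
                  .Where(m => m.Elements("genres").Elements("genre").Any(g => g.Value == genre))
                  .Select(m => (string)m.Element("title"))
                  .ToList();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(SampleXml);
        XElement first = doc.Descendants("movie").First();

        // ToString() on the sequence yields a type name, not a genre value.
        string broken = first.Elements("genres").Elements("genre").ToString();
        Console.WriteLine(broken == "Action"); // prints False

        Console.WriteLine(string.Join(", ", TitlesForGenre(doc, "Drama"))); // prints 2012
    }
}
```

In the original page code, the same Where clause would sit in the query that fills the TreeView.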

List<Movie> movies =
(from movie in xmldoc.Descendants("movie")
where movie.Elements("genres")
.Any(e => e.Elements("genre").Any(g => g.Value == selectedGenre))
select new Movie
{
Title = movie.Element("title").Value
}).ToList();

Related

Irrelevant search results with lucene.net

I have been developing a search engine for a business directory application using Lucene.net. However, when I search for "Sports shop", it returns other shops as well as sports shops, because the keyword "shop" matches them too. How can I prioritize the results that match the keyword "sport"?
If anyone has a solution for this, please share it here. Any helpful examples or links will be appreciated.

I would very much appreciate it if you could paste some code, so that I can give you a better example.
However, from reading your question I think that what you need is a phrase query that gives "Sports Shop" a higher boost.
My implementation of this query is this:
public List<PhraseQuery> QueryToPhraseQuery(string pQuery) {
QueryParsers.Classic.MultiFieldQueryParser oPhraseParser = new QueryParsers.Classic.MultiFieldQueryParser(Version, FieldArray, Analyzer, BoostDictionary);
List<PhraseQuery> lstPhraseQuery = new List<PhraseQuery>();
HashSet<Term> lstTerms = new HashSet<Term>();
oPhraseParser.Parse(pQuery).ExtractTerms(lstTerms);
foreach (var group in lstTerms.GroupBy(x => x.Field))
{
PhraseQuery oPhraseQuery = new PhraseQuery() { Boost = 10, Slop = 3 };
foreach (var oTerm in group.ToList())
{
oPhraseQuery.Add(oTerm);
if (oTerm.Field == Field.ImportantField)
oPhraseQuery.Boost = 30;
}
lstPhraseQuery.Add(oPhraseQuery);
}
return lstPhraseQuery;
}
This searches your index for something like the following, which matches the terms as a phrase (within a slop of 3) and returns more relevant results:
attributedescriptions:"something something"~3^10.0 attributemajor:"something something"~3^30.0 description:"something something"~3^10.0 edescription:"something something"~3^10.0
If you want me to give you an example using your code, just paste it and I can modify it to better fit your example.

Parse an xml document with "dynamic" nodes

I am parsing XML via an XDocument. How can I retrieve all languages, i.e. <en> or <de> or <CodeCountry>, and their child elements?
<en>
<descriptif>In the historic area, this 16th century Town House on 10,764 sq. ft. features 10 rooms and 3 shower-rooms. Period features include a spiral staircase. 2-room annex house with a vaulted cellar. Period orangery. Ref.: 2913.</descriptif>
<prox>NOGENT-LE-ROTROU.</prox>
<libelle>NOGENT-LE-ROTROU.</libelle>
</en>
<de>
<descriptif>In the historic area, this 16th century Town House on 10,764 sq. ft. features 10 rooms and 3 shower-rooms. Period features include a spiral staircase. 2-room annex house with a vaulted cellar. Period orangery. Ref.: 2913.</descriptif>
<prox>NOGENT-LE-ROTROU.</prox>
</de>
...
<lang>
<descriptif></descriptif>
<prox></prox>
<libelle></libelle>
</lang>

As your XML document is not well-formed, you should first add a root element.
You may do something like this:
var content = File.ReadAllText(@"<path to your xml>");
var test = XDocument.Parse("<Language>" + content + "</Language>");
Then, as you have "dynamic" top-level nodes, you may try to work with their children (which don't seem to be dynamic), assuming all nodes have at least a "descriptif" child. (If it's not "descriptif", it may be "prox" or "libelle".) **
//this will give you all parents, <en>, <de> etc. nodes
var parents = test.Descendants("descriptif").Select(m => m.Parent);
Then you can select the language and its children.
I used an anonymous type; you can of course project to a custom class.
var allNodes = parents.Select(m => new
{
name = m.Name.LocalName,
Descriptif = m.Element("descriptif") == null ? string.Empty : m.Element("descriptif").Value,
Prox = m.Element("prox") == null ? string.Empty : m.Element("prox").Value ,
Label = m.Element("libelle") == null ? string.Empty : m.Element("libelle").Value
});
This is of course not efficient code for a big file, but that's another problem.
** Worst case, you may do:
var parents = test.Descendants("descriptif").Select(m => m.Parent)
.Union(test.Descendants("prox").Select(m => m.Parent))
.Union(test.Descendants("libelle").Select(m => m.Parent));
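If every top-level element in the fragment is a language element (an assumption the question does not confirm), another option is to enumerate the children of the wrapper element directly instead of going through a known child name. The sketch below is illustrative only: the LanguageParser class, the Parse helper and the shortened sample input are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LanguageParser
{
    // Wraps the root-less fragment in a <Language> element and returns
    // one entry per top-level child (<en>, <de>, ...).
    public static List<(string Lang, string Descriptif, string Prox, string Libelle)> Parse(string fragment)
    {
        XElement root = XElement.Parse("<Language>" + fragment + "</Language>");
        return root.Elements()
                   .Select(e => (
                       Lang: e.Name.LocalName,
                       // Casting a missing (null) element to string gives null, hence the ?? fallback.
                       Descriptif: (string)e.Element("descriptif") ?? string.Empty,
                       Prox: (string)e.Element("prox") ?? string.Empty,
                       Libelle: (string)e.Element("libelle") ?? string.Empty))
                   .ToList();
    }

    public static void Main()
    {
        // Hypothetical, shortened sample input.
        string fragment =
            "<en><descriptif>Town house</descriptif><prox>NOGENT</prox><libelle>NOGENT</libelle></en>" +
            "<de><descriptif>Stadthaus</descriptif><prox>NOGENT</prox></de>";

        foreach (var row in Parse(fragment))
            Console.WriteLine($"{row.Lang}: {row.Descriptif} / {row.Prox} / {row.Libelle}");
    }
}
```

This avoids the Union over several Descendants calls, at the cost of treating any stray top-level element as a language.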

UmbracoExamine ParentID?

I'm currently using UmbracoExamine for all of my project's search needs, and I'm trying to figure out what exactly the query-parameter ".ParentId" does.
I was hoping I could use it to find all child nodes from a parentID, but I can't seem to get it working.
Basically, if the searchstring contains e.g. "C# Programming", it should find all that category's articles. This is just an example.
Thank you in advance!

When you say it should find all "that category's" articles, I assume you have a structure like the one below?
-- Programming
----Begin Java Programming
----Java Installation on Linux
----Basics of C# Programming
----What is SDLC
----Advanced C# Programming
-- Sports
----Baseball basics
If so, I also assume that you want all the articles under "Programming" to be listed, and not just those containing "C# Programming"?
What you need to do is loop through the SearchResults from your query and find the parent node from there:
IPublishedContent node = new UmbracoHelper(UmbracoContext.Current).TypedContent(item.Fields["id"].ToString());
IPublishedContent parentNode = node.Parent;
Once you have the parent node, you can get all its children, or only some of them, depending on the document type and what you want to do:
IEnumerable<IPublishedContent> allChildren = parentNode.Children;
IEnumerable<IPublishedContent> specificChildren = parentNode.Children.Where(x => x.DocumentTypeAlias.Equals("aliasOfSomeDocType"));
Example code below
//Fetching what eva searchterm some bloke is throwin' our way
string q = Request.QueryString["search"].Trim();
//Fetching our SearchProvider by giving it the name of our searchprovider
Examine.Providers.BaseSearchProvider Searcher = Examine.ExamineManager.Instance.SearchProviderCollection["SiteSearchSearcher"];
// control what fields are used for searching and the relevance
var searchCriteria = Searcher.CreateSearchCriteria(Examine.SearchCriteria.BooleanOperation.Or);
var query = searchCriteria.GroupedOr(new string[] { "nodeName", "introductionTitle", "paragraphOne", "leftContent", "..."}, q.Fuzzy()).Compile();
//Search and order the results by score; keep only results scoring above 0.05 (the scale goes up to 1)
IEnumerable<SearchResult> searchResults = Searcher.Search(query).OrderByDescending(x => x.Score).TakeWhile(x => x.Score > 0.05f);
//Printing the results
foreach (SearchResult item in searchResults)
{
//get the parent node
IPublishedContent node = new UmbracoHelper(UmbracoContext.Current).TypedContent(item.Fields["id"].ToString());
IPublishedContent parentNode = node.Parent;
//if you wish to check for a particular document type you can include this
if (item.Fields["nodeTypeAlias"] == "SubPage")
{
}
}

how to display data in correct format using HTML agility

I have an HTML document and want to fetch some information from it, so I am using the HTML Agility Pack.
With the following code I get all the necessary data.
var web = new HtmlWeb();
var doc = web.Load("http://www.talentsearchpeople.com/en/jobs/?page=joblisting&pubID=&formID=&start=0&count=8&module=&functionLevel1=&provinceNode=&countryNode=&keyword=");
var nodes = doc.DocumentNode.SelectNodes("//a[@class='grijs'][@title]");
foreach (var node in nodes)
{
HtmlAttribute att = node.Attributes["title"];
title = att.Value;
Response.Write("<br/>" + att.Value);
}
var Location = doc.DocumentNode.SelectNodes("//td[@width='80']");
foreach (var node in Location)
{
if (node.InnerHtml.Contains("Location:"))
{
locationname = HttpUtility.HtmlDecode(node.NextSibling.NextSibling.InnerText.Trim());
Response.Write("<br/>Location1=" + locationname);
}
}
With the above code I get the following output:
Lead Buyer South
Customer Service Order Management with native level of German
EMEA Customer Experience & Quality Internship
Service Desk Team Leader with Excellent Level of German and French
Sourcing & Procurement Consultant with native level of French
Jefe/a de ventas con alemán e inglés. Recien Titulados.
Jefe/a de ventas con alemán e inglés. Recien Titulados.
Jefe/a de ventas con alemán e inglés. Recien Titulados.
Location1=Almeria
Location1=Terrassa
Location1=United Kingdom, Manchester
Location1=Barcelona
Location1=Barcelona
Location1=A Coruña
Location1=Cataluña
Location1=Murcia
The above code fetches the data correctly. The problem is that I want to insert this data into a database and also display it in the correct format, meaning each job title followed by its location:
Lead Buyer South
Location1=Almeria
Customer Service Order Management with native level of German
Location1=Terrassa
EMEA Customer Experience & Quality Internship
Location1=United Kingdom, Manchester
Service Desk Team Leader with Excellent Level of German and French
Location1=Barcelona
Sourcing & Procurement Consultant with native level of French
Location1=Barcelona
Jefe/a de ventas con alemán e inglés. Recien Titulados.
Location1=A Coruña
Jefe/a de ventas con alemán e inglés. Recien Titulados.
Location1=Cataluña
Jefe/a de ventas con alemán e inglés. Recien Titulados.
Location1=Murcia
Alternative method: searching by the table tag
var web = new HtmlWeb();
var doc = web.Load("http://www.talentsearchpeople.com/en/jobs/?page=joblisting&pubID=&formID=&start=0&count=8&module=&functionLevel1=&provinceNode=&countryNode=&keyword=");
var mainNode = doc.DocumentNode.SelectNodes("//table[@class='border-jobs']/*");
foreach (var mainNodes in mainNode)
{
string pathdet = mainNodes.XPath;
var nodes = mainNodes.SelectSingleNode("//a[@class='grijs'][@title]");
if (nodes != null)
{
HtmlAttribute att = nodes.Attributes["title"];
title = att.Value;
Response.Write("<br/>" + att.Value);
}
var Description = doc.DocumentNode.SelectSingleNode("//td[@colspan='2']");
if (Description.InnerHtml.Contains("Description:"))
{
s = Description.InnerHtml;
s = s.Replace("Description:", "");
Response.Write("<br/>Description=" + s);
}
var Location = doc.DocumentNode.SelectSingleNode("//td[@width='80']");
if (Location.InnerHtml.Contains("Location:"))
{
locationname = HttpUtility.HtmlDecode(Location.NextSibling.NextSibling.InnerText.Trim());
Response.Write("<br/>Location1=" + locationname);
}
}
If I use the above code, I get the following output:
Assistant Call Centre Manager with fluent level of Spanish and English
Description= We are recruiting an Assistant Call Center Manager for a multinational company based in Lisboa, Portugal. This person will be responsible for the team management. Experience in team management, mainly in contact center, environment is required.
Location1=Lisboa, Portugal
I get the above output 8 times, as //table[@class='border-jobs']/* matches 8 nodes in the document.
How can I get the correct output?

I got the answer. :)
The problem was that an XPath starting with // searches the entire page, so it returns the first td[@colspan='2'] in the whole document, not the one in the current table.
Putting "." in front of the expression makes it start from the current node, so
var Description = mainNodes.SelectSingleNode(".//tr//td//table//tr//td[@colspan='2']");
will select only descendants of the mainNodes node.

At a glance, it looks like you could get away with storing both sets of results in arrays and then, when outputting, taking one item from each array.
A more robust and more correct approach is to refine your searches so that you find the HTML element that contains both pieces of information (e.g. search for tables with class "border-jobs"). This contains both the job title and the location, so you can get the two pieces of data from it at the same time.
This technique is better because it copes with cases such as no location being specified, and in general it better reflects what you are doing, so it will be easier to understand for the next person to come along.
Addition
To answer your additional issues, this line:
var Description = doc.DocumentNode.SelectSingleNode("//td[@colspan='2']");
will search the whole document. To make it search only the contents of the right node, you need:
var Description = mainNodes.SelectSingleNode(".//td[@colspan='2']");
Note the change of object (which you are already aware of from the comments) as well as the addition of the . in the XPath, which tells it to start at the current node.
Also, your title select will not find anything valid in that node, so you will need to update the XPath. Changing it to .//a will work since it is the first anchor tag, but this might be a bit brittle.
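Put together, the relative-XPath approach might look like the sketch below. It is only an illustration under stated assumptions: the inline HTML is invented (the real page's markup may differ, for example it may not use one table per job), the JobScraper class and Extract helper are hypothetical names, and the location cell is reached with a following-sibling step instead of NextSibling.NextSibling. It requires the HtmlAgilityPack NuGet package used in the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package

public static class JobScraper
{
    // Hypothetical markup: one "border-jobs" table per job posting.
    public const string SampleHtml =
        "<html><body>" +
        "<table class='border-jobs'>" +
        "<tr><td><a class='grijs' title='Lead Buyer South' href='#'>view</a></td></tr>" +
        "<tr><td width='80'>Location:</td><td>Almeria</td></tr></table>" +
        "<table class='border-jobs'>" +
        "<tr><td><a class='grijs' title='Service Desk Team Leader' href='#'>view</a></td></tr>" +
        "<tr><td width='80'>Location:</td><td>Terrassa</td></tr></table>" +
        "</body></html>";

    // Returns one (title, location) pair per job table.
    public static List<(string Title, string Location)> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var results = new List<(string Title, string Location)>();

        var tables = doc.DocumentNode.SelectNodes("//table[@class='border-jobs']");
        if (tables == null) return results; // SelectNodes returns null when nothing matches

        foreach (HtmlNode table in tables)
        {
            // The leading "." keeps each search inside the current table.
            HtmlNode link = table.SelectSingleNode(".//a[@class='grijs'][@title]");
            HtmlNode place = table.SelectSingleNode(
                ".//td[@width='80'][contains(., 'Location:')]/following-sibling::td");

            string title = link == null ? string.Empty : link.GetAttributeValue("title", string.Empty);
            string location = place == null ? string.Empty : HtmlEntity.DeEntitize(place.InnerText.Trim());
            results.Add((title, location));
        }
        return results;
    }

    public static void Main()
    {
        foreach (var job in Extract(SampleHtml))
            Console.WriteLine(job.Title + " | Location1=" + job.Location);
    }
}
```

Because title and location are read from the same table node, a job without a location simply gets an empty string instead of shifting every later pairing.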

C# - Combine multiple LINQ collections with same properties

Maybe it's late in the night, but I'm stumped here. I'm trying to combine multiple lists with the same properties into one. I thought that LINQ's .UNION would do the trick but I was mistaken. Here's an example of a few of my lists:
LIST1 (report names):
Date      Name    Title        Product
02/01/13  Steve   Hello World  Report
02/05/13  Greg    Howdy        Report
LIST2 (song names):
Date      Name    Title        Product
01/01/13  John    Time         Song
01/05/13  Bob     Sorry        Song
LIST3 (games names):
Date      Name    Title        Product
12/01/12  Google  Bike Race    Game
12/05/12  Apple   Temple Run   Game
My class is very simple. Here's what it looks like:
public class MyClass {
public DateTime Date { get; set; }
public string Name { get; set; }
public string Title { get; set; }
public string Product { get; set; }
}
In case you're wondering, I used this LINQ query to get one of the above lists:
var finalList = Games
.Select (s => new MyClass {
Date = (System.DateTime) s.Games.Creation_date,
Name = s.Games.Last_name,
Title = string.Format("{0} (Report)", s.Game.Headline),
Product="Report"
})
;
So far, it's pretty easy, but I want to combine all my lists into 1. So, my final list should look like:
Date      Name    Title        Product
02/01/13  Steve   Hello World  Report
02/05/13  Greg    Howdy        Report
01/01/13  John    Time         Song
01/05/13  Bob     Sorry        Song
12/01/12  Google  Bike Race    Game
12/05/12  Apple   Temple Run   Game
I thought that a UNION command would do it:
var newList = List1.Union(List2).Union(List3);
But I'm not getting the desired output.
Date      Name    Title        Product
02/01/13  Steve   Hello World  Report
02/05/13  Greg    Howdy        Report
01/01/13  Bob     Time         Game
01/05/13  John    Sorry        Song
12/01/12  Google  Bike Race    Song
12/05/12  Apple   Temple Run   Game
Any idea on what I'm doing wrong here?

Try:
list1.Concat(list2).Concat(list3);
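You don't want to use Union here anyway (whether it appears to work or not), as it performs a set union and therefore removes elements that the equality comparer considers duplicates.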
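To make the distinction concrete, here is a small sketch. MyClass is copied from the question; the ListCombiner class, the Combine helper and the sample rows are assumptions made up for the example. Concat simply appends the sequences in order, while Union additionally drops duplicates (shown here with strings, where equality is by value).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public DateTime Date { get; set; }
    public string Name { get; set; }
    public string Title { get; set; }
    public string Product { get; set; }
}

public static class ListCombiner
{
    // Appends the three lists in order; nothing is removed or reordered.
    public static List<MyClass> Combine(IEnumerable<MyClass> a, IEnumerable<MyClass> b, IEnumerable<MyClass> c)
    {
        return a.Concat(b).Concat(c).ToList();
    }

    public static void Main()
    {
        // Hypothetical sample rows modelled on the question's tables.
        var reports = new List<MyClass> {
            new MyClass { Date = new DateTime(2013, 2, 1), Name = "Steve", Title = "Hello World", Product = "Report" },
            new MyClass { Date = new DateTime(2013, 2, 5), Name = "Greg", Title = "Howdy", Product = "Report" } };
        var songs = new List<MyClass> {
            new MyClass { Date = new DateTime(2013, 1, 1), Name = "John", Title = "Time", Product = "Song" } };
        var games = new List<MyClass> {
            new MyClass { Date = new DateTime(2012, 12, 1), Name = "Google", Title = "Bike Race", Product = "Game" } };

        foreach (MyClass row in Combine(reports, songs, games))
            Console.WriteLine($"{row.Date:MM/dd/yy} {row.Name} {row.Title} {row.Product}");

        // Union is a set operation: the duplicate "b" appears only once.
        Console.WriteLine(new[] { "a", "b" }.Concat(new[] { "b", "c" }).Count()); // prints 4
        Console.WriteLine(new[] { "a", "b" }.Union(new[] { "b", "c" }).Count());  // prints 3
    }
}
```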

You could use AddRange. Note that it returns void, so the calls cannot be chained; it would look something like this:
var FullList = new List<MyClass>(list1);
FullList.AddRange(list2);
FullList.AddRange(list3);
or the fail-safe way would be
var FullList = list1.Concat(list2).Concat(list3).ToList(); //Personally I would use this
or you also have
var FullList = new[] { list1, list2, list3 }.SelectMany(a => a).ToList();
