I want to scrape data with Html Agility Pack.
I used this:
string url = @"https://mobile.bet365.gr/#type=Coupon;key=1-1-13-40-141-0-0-0-1-0-0-4100-0-0-1-0-0-0-0-0-0;ip=0;lng=5;anim=1";
var webGet = new HtmlWeb();
var document = webGet.Load(url);
var nodes = document.DocumentNode.SelectNodes("//*[@id='Coupon']/div[1]/div[2]/div[1]/div/div[1]/div[1]/span");
int i = 0;
foreach (var node in nodes)
{
dataGridView1.Rows.Add();
dataGridView1.Rows[i].Cells[0].Value = i + 1;
dataGridView1.Rows[i].Cells[1].Value = node.InnerHtml;
i++;
}
The XPath is taken from FireXPath but nothing appears.
The HTML snippet is this:
<div id="Coupon" class="C4 C4_1">
<div class="liveAlertKey enhancedPod cc_12_7" data-sportskey="1-1-13-40-141-0-0-0-1-0-0-4100-0-0-1-0-0-0-0-0-0" data-alertkey="NPower Champs">
<h1><em>Αγγλία - Τσάμπιονσιπ</em></h1>
<div class="podHeaderRow">
<div class="wideLeftColumn">Παρ 29 Σεπ</div>
<div class="priceColumn"><em>1</em></div>
<div class="priceColumn"><em>X</em></div>
<div class="priceColumn"><em>2</em></div>
</div>
<div data-fixtureid="67185688" data-plbtid="40" class="podEventRow cc_12_4 ippg-Market " data-nav="rw_spl_sc_1-1-8-67185688-3-0-0-0-1-0-0-0-0-0-1-0-0-0-0-0-0,MarketCount,1-1-8-67185688-3-0-0-0-1-0-0-0-0-0-1-0-0-0-0-0-0,False,1">
<div class="wideLeftColumn hasStatsIcon">
<div class="ippg-Market_GameDetail">
<div class="ippg-Market_GameItem ">
<div class="ippg-Market_CompetitorName">
<span class="ippg-Market_Truncator">ΚΠΡ</span>
</div>
<div class="ippg-Market_CompetitorScores">
<span class="ippg-PointNode"></span>
</div>
</div>
<div class="ippg-Market_GameItem ">
<div class="ippg-Market_CompetitorName">
<span class="ippg-Market_Truncator">Φούλαμ</span>
</div>
<div class="ippg-Market_CompetitorScores">
<span class="ippg-PointNode"></span>
</div>
</div>
<div class="ippg-Market_MetaContainer ">
<div class="ippg-Market_GameStartTime">20:45</div>
<div class="ippg-Market_GameInfo "></div>
<div class="ippg-Market_MarketCount">109</div>
<div id="FixtureIconsContainer">
<img src="/grfx/V6/Misc/pixel.gif" class="VideoIcon SSP-7">
</div>
<div id="StatsIconContainer">
<a class="icon-stats" target="_blank" data-nav="externalLink" href="http://www.stats.betradar.com/s4/?clientid=259&matchid=11868244&language=el"></a>
</div>
</div>
</div>
</div>
<div class="ippg-Market_Topic priceColumn" data-nav="pt=N#o=9/4#f=67185688#fp=1410316836#so=0#c=1#" data-inplaytopic="" data-pgfpid="1410316836" data-inplaymarkettopic="" data-inplayaltmarkettopic="">
<span class="ippg-Market_Odds">3.25</span>
</div>
<div class="ippg-Market_Topic priceColumn" data-nav="pt=N#o=13/5#f=67185688#fp=1410316839#so=0#c=1#" data-inplaytopic="" data-pgfpid="1410316839" data-inplaymarkettopic="" data-inplayaltmarkettopic="">
<span class="ippg-Market_Odds">3.60</span>
</div>
<div class="ippg-Market_Topic priceColumn" data-nav="pt=N#o=5/4#f=67185688#fp=1410316841#so=0#c=1#" data-inplaytopic="" data-pgfpid="1410316841" data-inplaymarkettopic="" data-inplayaltmarkettopic="">
<span class="ippg-Market_Odds">2.25</span>
</div>
</div>
</div>
</div>
Could anyone help me find the correct XPath? I have used this technique on other sites and got the results I wanted, but on this site I cannot find the correct XPath.
You can get your teams and odds from the HTML snippet like this:
HtmlDocument document = new HtmlDocument();
document.Load(Server.MapPath("xpath.html"));
// Teams
HtmlNodeCollection teamNodes = document.DocumentNode.SelectNodes("//div[@class='ippg-Market_CompetitorName']");
List<string> teams = new List<string>();
foreach (HtmlNode n in teamNodes)
{
HtmlNode nodeTeam = n.SelectSingleNode(".//span[@class='ippg-Market_Truncator']");
if (nodeTeam != null)
{
teams.Add(nodeTeam.InnerText);
}
}
// Odds
HtmlNodeCollection oddNodes = document.DocumentNode.SelectNodes("//span[@class='ippg-Market_Odds']");
List<string> odds = new List<string>();
foreach (HtmlNode o in oddNodes)
{
odds.Add(o.InnerText);
}
I am new to MVC and am trying to implement a contact page.
I created a partial view for the contact part and I add it to my index page with this line:
@Html.Partial("_Contact")
When my form is submitted my code executes and when the form is not valid I have to return something.
My question is what do I have to return to make it work.
I want to return my model to my contact partial view but I still have to show my index page because the contact part is shown there.
Or am I looking at it the wrong way?
This is the post code:
[ValidateAntiForgeryToken]
public IActionResult _Contact(EmailFormModel model)
{
if (ModelState.IsValid)
{
var path = _hostingEnvironment.WebRootPath;
var templatePath = path + _configuration.GetSection("Email.PathToHtmlTemplate").Value;
// Read the HTML template file, not the web root directory.
string html = System.IO.File.ReadAllText(templatePath);
var message = new MailMessage();
var smtpHost = _configuration.GetSection("Email.Smtp.Host").Value;
var smtpPort = Convert.ToInt32(_configuration.GetSection("Email.Smtp.Port").Value);
var smtpUser = _configuration.GetSection("Email.Smtp.UserName").Value;
var smtpPass = _configuration.GetSection("Email.Smtp.Password").Value;
var from = _configuration.GetSection("Email.From").Value;
var to = _configuration.GetSection("Email.To").Value;
var cc = _configuration.GetSection("Email.Cc").Value;
var bcc = _configuration.GetSection("Email.Bcc").Value;
var subject = _configuration.GetSection("Email.Subject").Value;
var dic = new Dictionary<string, string>()
{
{ "{{NAME}}", model.Name },
{ "{{TEL}}", model.Telephone },
{ "{{EMAIL}}", model.Email },
{ "{{MESSAGE}}", model.Message },
};
dic.ForEach(x => html = html.Replace(x.Key, x.Value));
foreach (var t in to.Split(';'))
{
message.To.Add(new MailAddress(t));
}
foreach (var t in cc.Split(';'))
{
message.CC.Add(new MailAddress(t));
}
foreach (var t in bcc.Split(';'))
{
message.Bcc.Add(new MailAddress(t));
}
message.From = new MailAddress(from);
message.Subject = subject;
message.Body = html;
message.IsBodyHtml = true;
using (var smtp = new SmtpClient())
{
var credential = new NetworkCredential
{
UserName = smtpUser,
Password = smtpPass
};
smtp.Credentials = credential;
smtp.Host = smtpHost;
smtp.Port = smtpPort;
smtp.Send(message);
return View("EmailSent");
}
}
return View(model);
}
The index view:
@{
ViewData["Title"] = "Home Page";
}
...
@Html.Partial("_Contact")
The contact view:
@model EmailFormModel;
@{
ViewData["Title"] = "Contact";
}
<form asp-controller="Home" asp-action="_Contact" method="post" class="form-horizontal" role="form">
<!-- Contact Page Section Start -->
<div id="contact" class="contact-page-sec pt-100 pb-100">
<div class="container">
<row>
<div class="contact-page-sec">
<div class="sec-title">
<h1>Contacteer ons</h1>
<div class="border-shape"></div>
</div>
</div>
</row>
<div class="row">
<div class="col-md-4">
<div class="contact-info">
<div class="contact-info-item">
<div class="contact-info-icon">
<img src="img/icon/phone.png" alt="" />
</div>
<div class="contact-info-text">
<h2>Telefoon</h2>
<span>
<a href='tel:+3245798689'>+(32) 496 56 53 13</a>
</span>
</div>
</div>
</div>
<div class="contact-info">
<div class="contact-info-item">
<div class="contact-info-icon">
<img src="img/icon/envelope.png" alt="" />
</div>
<div class="contact-info-text">
<h2>e-mail</h2>
<span>
<a href='mailto:info@vloerwerken-krekelbergh.be'>info@vloerwerken-krekelbergh.be</a>
</span>
</div>
</div>
</div>
<div class="contact-info">
<div class="contact-info-item">
<div class="contact-info-icon">
<img src="img/icon/map-marker.png" alt="" />
</div>
<div class="contact-info-text">
<h2>Adres</h2>
<span>Engelstraat 60</span>
<span>8211 Aartrijke, België</span>
</div>
</div>
</div>
<div class="social-profile">
<ul>
<li>
<a href="https://www.facebook.com/vloerwerken-krekelbergh-228041574489932/">
<i class="fa fa-facebook"></i>
</a>
</li>
</ul>
</div>
</div>
<div class="col-md-8">
<div class="contact-field">
<h2>Schrijf uw bericht</h2>
<div class="col-md-6 col-sm-6 col-xs-12">
<div class="single-input-field">
<input asp-for="Name" class="form-control" />
<span asp-validation-for="Name" class="text-danger"></span>
</div>
</div>
<div class="col-md-6 col-sm-6 col-xs-12">
<div class="single-input-field">
<input asp-for="Email" class="form-control" type="email"/>
<span asp-validation-for="Email" class="text-danger"></span>
</div>
</div>
<div class="col-md-6 col-sm-6 col-xs-12">
<div class="single-input-field">
<input asp-for="Telephone" class="form-control" />
<span asp-validation-for="Telephone" class="text-danger"></span>
</div>
</div>
<div class="col-md-12 message-input">
<div class="single-input-field">
<textarea asp-for="Message" class="form-control"></textarea>
<span asp-validation-for="Message" class="text-danger"></span>
</div>
</div>
<div class="single-input-fieldsbtn" style="clear: both;">
<input id="submitBtn" value="Verstuur" type="submit">
</div>
<div id="info"></div>
</div>
</div>
</div>
</div>
</div>
<!-- Contact Page Section End -->
</form>
Does anyone know how to print the text of every element in a list with Selenium in C#? The code below prints blank values. If I pass the element itself to WriteLine, something is displayed, but it is not the element's text. I would like to get the text value.
Code:
IList<IWebElement> attachmentList = driver.FindElements(By.ClassName("comment-box"));
foreach (IWebElement element in attachmentList)
{
Console.WriteLine(element.Text);
}
HTML:
<div class="comment-box">
<!-- Comment Image -->
<div class="col-xs-2">
<div id="attachmentImgSFHD-24" class="attachmentImg">
<img src="downloadAttachment?attachmenturl=/secure/thumbnail/10111/_thumb_10111.png" />
</div>
</div>
<!-- Attachment details -->
<div class="col-xs-10">
<div class="commentContent">
<div class="topRow">
<div class="username">ApplicationLink.png</div>
<div class="commentTimeStamp">31400 KB</div>
</div>
<div class="bottomRow">
<div class="commentDisplay">
Download
</div>
</div>
</div>
</div>
</div>
<div class="comment-box">
<!-- Comment Image -->
<div class="col-xs-2">
<div id="attachmentImgSFHD-24" class="attachmentImg">
<img src="downloadAttachment?attachmenturl=/secure/thumbnail/10313/_thumb_10313.png" />
</div>
</div>
<!-- Attachment details -->
<div class="col-xs-10">
<div class="commentContent">
<div class="topRow">
<div class="username">test.jpg</div>
<div class="commentTimeStamp">7423 KB</div>
</div>
<div class="bottomRow">
<div class="commentDisplay">
Download
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br/>
The elements don't have text in the html, so element.Text is empty. Use
Console.WriteLine(element.GetAttribute("value"));
You can use the XPath below to get the attachment details.
XPath: //div[@class='comment-box']//div[@class='commentContent']//div[@class='username']
Code:
IList<IWebElement> attachmentList = driver.FindElements(By.XPath("//div[@class='comment-box']//div[@class='commentContent']//div[@class='username']"));
foreach (IWebElement element in attachmentList)
{
Console.WriteLine(element.Text); // Prints each attachment name, e.g. 'ApplicationLink.png' and 'test.jpg'
}
IList<IWebElement> attachmentList = driver.FindElements(By.ClassName("comment-box"));
foreach (IWebElement element in attachmentList)
{
System.Threading.Thread.Sleep(2000);
Console.WriteLine(element.Text);
}
It works fine after adding the Thread.Sleep call.
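If the fixed sleep helps because the comment boxes are filled in after the page loads, an explicit wait may be a more robust alternative. The following is only a sketch under that assumption: the CommentBoxReader class and ReadCommentBoxes method are hypothetical names, and it assumes the Selenium.Support package is referenced so that WebDriverWait is available.

```csharp
using System;
using System.Collections.Generic;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

public static class CommentBoxReader
{
    // Hypothetical helper: waits until the first comment box has non-empty text,
    // then returns the text of every comment box on the page.
    public static List<string> ReadCommentBoxes(IWebDriver driver, TimeSpan timeout)
    {
        var wait = new WebDriverWait(driver, timeout);

        // Until re-evaluates the lambda until it returns true or the timeout expires.
        wait.Until(d =>
        {
            var boxes = d.FindElements(By.ClassName("comment-box"));
            return boxes.Count > 0 && !string.IsNullOrWhiteSpace(boxes[0].Text);
        });

        var texts = new List<string>();
        foreach (IWebElement element in driver.FindElements(By.ClassName("comment-box")))
        {
            texts.Add(element.Text);
        }
        return texts;
    }
}
```

It could be called as CommentBoxReader.ReadCommentBoxes(driver, TimeSpan.FromSeconds(10)). Unlike Thread.Sleep inside the loop, the wait returns as soon as the text is present rather than pausing once per element.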
I'm trying to render my products sorted under the right category. Each product now goes under the right category, but its name is not rendered; the object itself is rendered instead, so the output looks wrong.
This is my code:
int i = 0;
var categories = ViewBag.Category as List<Category>;
var products = ViewBag.Products as List<Product>;
foreach (var cateory in categories)
{
i++;
<div class="panel-group accordion" id="accordion1">
<div class="panel panel-default">
<div class="panel-heading">
<h4 class="panel-title">
<a class="accordion-toggle" data-toggle="collapse" data-parent="#accordion1" href="#collapse_@i">
@cateory.Name
</a>
</h4>
</div>
<div id="collapse_@i" class="panel-collapse collapse">
<div class="panel-body">
<div class="col-md-12">
@cateory.Products
</div>
</div>
</div>
</div>
</div>
}
And this is the output.
System.Collections.Generic.HashSet`1[ProductCatalog.Models.Product]
I have tried @cateory.Products.Select(n => n.Name).ToString(); but it did not help much.
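The output above is what ToString() returns for a collection or a LINQ query: the type name, not the contents. A minimal sketch of one way to get the names instead, assuming a Product class with a Name property as in the question (the ProductNames class and Describe method are hypothetical names used only for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for ProductCatalog.Models.Product.
public class Product
{
    public string Name { get; set; } = "";
}

public static class ProductNames
{
    // Joins the product names into one comma-separated string for display.
    public static string Describe(IEnumerable<Product> products)
    {
        return string.Join(", ", products.Select(p => p.Name));
    }

    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { Name = "Chair" },
            new Product { Name = "Table" }
        };

        // ToString() on the query yields the iterator's type name, not the names.
        Console.WriteLine(products.Select(p => p.Name).ToString());

        // string.Join enumerates the query and concatenates the values.
        Console.WriteLine(Describe(products));
    }
}
```

In the view, the same idea would presumably be written as @(string.Join(", ", cateory.Products.Select(p => p.Name))), or as a nested foreach over cateory.Products that emits one element per product.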
I'm creating PostHelper.cshtml in the App_Code folder of my Blog project, and I get an error on this line:
<div class="commentsTab">
@Post.comments.Count
</div>
and on this one:
@foreach (tag Tag in Post.tags)
When I delete "@Post.comments.Count" it works, yet a similar line causes no error:
<div class="postTitle">@Post.Title</div>
What is wrong with this? Here is the whole code:
@using Blog.Models;
@helper Render(post Post, System.Web.Mvc.HtmlHelper html, bool isAdmin, bool showComments)
{
<div class="postTitle">@Post.Title</div>
<div class="postContainer">
<div class="postTabs">
<div class="dateTab">
<div class="month">@Post.DateTime.ToString("MMM").ToUpper()</div>
<div class="day">@Post.DateTime.ToString("dd")</div>
</div>
<div class="commentsTab">
@Post.comments.Count
</div>
</div>
<div class="postContent">
<div class="postBody">@html.Raw(Post.Body)</div>
<div class="tagList">
@foreach (tag Tag in Post.tags)
{
<span class="tag">@Tag.Name</span>
}
</div>
<div class="linkList">
<div id="fb-root"></div>
<script>
(function (d, s, id) {
var js, fjs = d.getElementsByTagName(s)[0];
if (d.getElementById(id)) return;
js = d.createElement(s); js.id = id;
js.src = "//connect.facebook.net/pl_PL/sdk.js#xfbml=1&version=v2.0";
fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));
</script>
</div>
</div>
</div>
if (showComments)
{
<div id="commentContainer">
<a id="comments"></a>
@foreach (comment Comment in Post.comments.OrderBy(x => x.DateTime))
{
<div class="comment">
<div class="commentName">
@if (!string.IsNullOrWhiteSpace(Comment.Email))
{
@Comment.Name
}
else
{
@Comment.Name;
}
</div>
said:
<div class="commentBody">@html.Raw(html.Encode(Comment.Body).Replace("\n", "<br/>"))</div>
<div class="commentTime">at @Comment.DateTime.ToString("HH:mm") on @Comment.DateTime.ToString("yyyy/MM/dd")</div>
</div>
}
<div id="commentEditor">
<div id="commentPrompt">Leave a comment!</div>
<form action="@Href("~/Posts/Comment/" + Post.ID)" method="post">
<input type="text" id="commentNamePrompt" name="name" /> Name (required)<br />
<input type="text" id="commentEmailPrompt" name="email" /> Email (optional)<br />
<textarea id="commentBodyInput" name="body" rows="10" cols="60"></textarea><br />
<input type="submit" id="commentSubmitInput" name="submit" value="Submit!" />
</form>
</div>
</div>
}
}
My action:
public ActionResult Index(int? id)
{
int pageNumber = id ?? 0;
IEnumerable<post> posts =
(from Post in model.posts
where Post.DateTime < DateTime.Now
orderby Post.DateTime descending
select Post).Skip(pageNumber * PostsPerPage).Take(PostsPerPage + 1);
ViewBag.IsPreviousLinkVisible = pageNumber > 0;
ViewBag.IsNextLinkVisible = posts.Count() > PostsPerPage;
ViewBag.PageNumber = pageNumber;
ViewBag.IsAdmin = IsAdmin;
return View(posts.Take(PostsPerPage));
}
I presume your exception is caused by an already open connection to the DB that you are not closing. In your case, try adding .ToList() to the end of your initial select:
select Post).Skip(pageNumber * PostsPerPage).Take(PostsPerPage + 1).ToList();
This closes the reader and copies all results into memory. See if that makes any difference.
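To illustrate why materializing with ToList() changes the behavior, here is a small sketch that uses an in-memory iterator as a stand-in for the database query; it is LINQ to Objects, not the asker's data context, and the class and method names are hypothetical. Without ToList(), every Count() call re-reads the source; with it, the source is read once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static int Evaluations;

    // Stand-in for a database-backed source: counts how often items are read.
    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 5; i++)
        {
            Evaluations++;
            yield return i;
        }
    }

    // Enumerates the same query twice and reports how many source reads occurred.
    public static int CountReads(bool materialize)
    {
        Evaluations = 0;
        IEnumerable<int> query = Source().Where(n => n > 1);
        if (materialize)
        {
            // ToList runs the query once and keeps the results in memory.
            query = query.ToList();
        }
        query.Count();
        query.Count();
        return Evaluations;
    }

    public static void Main()
    {
        Console.WriteLine("deferred reads: " + CountReads(materialize: false));
        Console.WriteLine("materialized reads: " + CountReads(materialize: true));
    }
}
```

In the action above, posts.Count() and the later enumeration in the view would each run the query again in the same way unless the results are materialized first.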
You need MARS. Add: MultipleActiveResultSets=True; to your connection string. See: http://msdn.microsoft.com/en-us/library/ms131686.aspx
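As a sketch of where that setting goes, the snippet below adds it to an existing SQL Server connection string with SqlConnectionStringBuilder. The server and database names are hypothetical, and it assumes the System.Data.SqlClient provider; with an Entity Framework connection string, the setting would go into the inner provider connection string instead.

```csharp
using System;
using System.Data.SqlClient;

public static class MarsConnectionString
{
    // Returns the given connection string with MARS switched on.
    public static string Build(string baseConnectionString)
    {
        var builder = new SqlConnectionStringBuilder(baseConnectionString)
        {
            MultipleActiveResultSets = true
        };
        return builder.ConnectionString;
    }

    public static void Main()
    {
        // Hypothetical server and database names, for illustration only.
        Console.WriteLine(Build("Data Source=localhost;Initial Catalog=Blog;Integrated Security=True"));
    }
}
```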
I would like to read each node in the collection, but when I call SelectSingleNode while iterating I keep getting the same object; only node.Id changes.
What I am trying to do is read the web response of a given site and extract some information, such as values and links, from specific elements.
int offSet = 0;
string address = "http://www.testsite.de/ergebnisliste.html?offset=" + offSet;
HtmlWeb web = new HtmlWeb();
//web.OverrideEncoding = Encoding.UTF8;
HtmlDocument doc = web.Load(address);
HtmlNodeCollection collection = doc.DocumentNode.SelectNodes("//div[@itemtype='http://schema.org/Posting']");
foreach (HtmlNode node in collection) {
string id = HttpUtility.HtmlDecode(node.Id);
string cpname = HttpUtility.HtmlDecode(node.SelectSingleNode("//span[@itemprop='name']").InnerText);
string cptitle = HttpUtility.HtmlDecode(node.SelectSingleNode("//span[@itemprop='title']").InnerText);
string cpaddress = HttpUtility.HtmlDecode(node.SelectSingleNode("//span[@itemprop='addressLocality']").InnerText);
string date = HttpUtility.HtmlDecode(node.SelectSingleNode("//div[@itemprop='datePosted']").InnerText);
string link = "http://www.testsite.de" + HttpUtility.HtmlDecode(node.SelectSingleNode("//div[@class='h3 title']//a[@href]").GetAttributeValue("href", "default"));
}
This is, for example, the HTML for one iteration:
<div id="66666" itemtype="http://schema.org/Posting">
<div>
<a>
<img />
</a>
</div>
<div>
<div class="h3 title">
<a href="/test.html" title="Test">
<span itemprop="title">Test</span>
</a>
</div>
<div>
<span itemprop="name">TestName</span>
</div>
</div>
<div>
<div>
<div>
<div>
<span itemprop="address">Test</span>
</div>
<span>
<a>
<span><!-- --></span>
<span></span>
</a>
</span>
</div>
</div>
<div itemprop="date">
<time datetime="2013-03-01">01.03.13</time>
</div>
</div>
By writing
node.SelectSingleNode("//span[@itemprop='name']").InnerText
you are effectively writing
doc.DocumentNode.SelectSingleNode("//span[@itemprop='name']").InnerText
To do what you want, write it like this: node.SelectSingleNode(".//span[@itemprop='name']").InnerText.
The leading dot (period) makes the search start at the current node, which is node, instead of at the document root, doc.
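The difference can be seen in a small self-contained sketch. The markup is a hypothetical two-posting page, and the RelativeXPathDemo class and NamesPerPosting method are illustrative names only; the absolute form should return the first span in the document on every iteration, while the relative form should return the span inside each posting.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RelativeXPathDemo
{
    // Returns the name found for each posting, using either a relative (".//")
    // or an absolute ("//") XPath from the posting node.
    public static List<string> NamesPerPosting(string html, bool relative)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        string xpath = (relative ? "." : "") + "//span[@itemprop='name']";
        var names = new List<string>();
        foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@itemtype='http://schema.org/Posting']"))
        {
            names.Add(node.SelectSingleNode(xpath).InnerText);
        }
        return names;
    }

    public static void Main()
    {
        // Hypothetical markup: two postings, each with its own name span.
        const string html =
            "<div id='1' itemtype='http://schema.org/Posting'><span itemprop='name'>First</span></div>" +
            "<div id='2' itemtype='http://schema.org/Posting'><span itemprop='name'>Second</span></div>";

        Console.WriteLine("absolute: " + string.Join(", ", NamesPerPosting(html, relative: false)));
        Console.WriteLine("relative: " + string.Join(", ", NamesPerPosting(html, relative: true)));
    }
}
```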