How to extract data from a website by specifying search criteria? - c#

I have a new project in an area I am not familiar with. One task is to navigate some websites and collect data from them. One sample website is: https://www.hudhomestore.com/Home/Index.aspx
I have read and watched tutorials on "collecting" data from a web page, such as:
How to Scrape HTML Data with C#
Reading data from a website using C#
Pulling data from a webpage, parsing it for specific pieces, and displaying it
But my question is: how do we usually set search preferences on such a site, and then use the techniques from the links above to load the results in my code?
EDIT
This is correct for setting the search criteria based on my selection. However, the total count of the search (if I do it manually for the state MI) is 223, but if I execute the code below, tdNodeCollection contains only 121 nodes. Can you show me where I am going wrong?
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
string zipCode = "", city = "", county = "", street = "", sState = "MI", fromPrice = "0", toPrice = "0", fcaseNumber = "",
bed = "0", bath = "0", buyerType = "0", Status = "0", indoorAmenities = "", outdoorAmenities = "", housingType = "",
stories = "", parking = "", propertyAge = "", sLanguage = "ENGLISH";
var doc = await (Task.Factory.StartNew(() => web.Load("https://www.hudhomestore.com/Listing/PropertySearchResult.aspx?" +
"zipCode=" + zipCode + "&city=" + city + "&country=" + county + "&street=" + street + "&sState=" + sState +
"&fromPrice=" + fromPrice + "&toPrice=" + toPrice +
"&fcaseNumber=" + fcaseNumber + "&bed=" + bed + "&bath=" + bath +
"&buyerType=" + buyerType + "&Status=" + Status + "&indoorAmenities=" + indoorAmenities +
"&outdoorAmenities=" + outdoorAmenities + "&housingType=" + housingType + "&stories=" + stories +
"&parking=" + parking + "&propertyAge=" + propertyAge + "&sLanguage=" + sLanguage)));
HtmlNodeCollection tdNodeCollection = doc
.DocumentNode
.SelectNodes("//*[@id=\"dgPropertyList\"]//tr//td");

You can use HtmlAgilityPack for this purpose. I wrote a small piece of test code and ran it against the second page, the one you want to scrape, with the search criteria set in the URL.
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
HtmlWeb web = new HtmlWeb();
//string InitialUrl = "https://www.hudhomestore.com/Home/Index.aspx";
//Here you need to set the values of these variable to whatever user inputs
//after setting these values, add them to initial URL
string zipCode = "", city = "", county = "", street = "", sState = "AK", fromPrice = "0", toPrice = "0", fcaseNumber = "",
bed = "0", bath = "0", buyerType = "0", Status = "0", indoorAmenities = "", outdoorAmenities = "", housingType = "",
stories = "", parking = "", propertyAge = "", sLanguage = "ENGLISH";
HtmlAgilityPack.HtmlDocument document = web.Load("https://www.hudhomestore.com/Listing/PropertySearchResult.aspx?" +
"zipCode=" + zipCode + "&city=" + city + "&country=" + county + "&street=" + street + "&sState=" + sState +
"&fromPrice=" + fromPrice + "&toPrice=" + toPrice +
"&fcaseNumber=" + fcaseNumber + "&bed=" + bed + "&bath=" + bath +
"&buyerType=" + buyerType + "&Status=" + Status + "&indoorAmenities=" + indoorAmenities +
"&outdoorAmenities=" +outdoorAmenities + "&housingType=" + housingType + "&stories=" + stories +
"&parking=" + parking + "&propertyAge=" + propertyAge + "&sLanguage=" + sLanguage);
HtmlNodeCollection tdNodeCollection = document
.DocumentNode
.SelectNodes("//*[@id=\"dgPropertyList\"]//tr//td");
Count them again and look at your expression: there are exactly 121 td elements inside the rows of the element with id="dgPropertyList".
Next, inspect the td elements manually, work out which ones hold the data you need, and fetch that data.
foreach (HtmlAgilityPack.HtmlNode node in tdNodeCollection)
{
//To access child elements such as <h2> or <p> here, you can do:
HtmlNode h2Node = node.SelectSingleNode("./h2"); //Gets the first <h2> that is a direct child
HtmlNodeCollection allH2Nodes = node.SelectNodes(".//h2"); //Searches all descendants too
//You can also look at the children directly, without XPath (like walking a tree):
HtmlNode h2Node_ = node.ChildNodes["h2"];
}
I've tested the code; it works and parses the whole document to reach the required table. It gets you all the rows of that table inside the div, so you can dig further into these rows, find your td, and get what you need.
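The long string concatenation that builds the search URL can also be written as a small helper. This is only a sketch: the class name SearchUrl and the method Build are made up for illustration, and it assumes the site accepts ordinary percent-encoded query-string values. It shows one way to keep values such as city names with spaces from breaking the URL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class SearchUrl
{
    // Joins the criteria into "key=value&key=value", percent-encoding each part.
    public static string Build(string baseUrl, IEnumerable<KeyValuePair<string, string>> criteria)
    {
        var pairs = criteria.Select(kv =>
            Uri.EscapeDataString(kv.Key) + "=" + Uri.EscapeDataString(kv.Value ?? ""));
        return baseUrl + "?" + string.Join("&", pairs);
    }
}
```

Calling SearchUrl.Build with the PropertySearchResult.aspx address and a list of pairs such as ("sState", "MI") returns a string that can be passed to web.Load as before.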
Another option is to use Selenium WebDriver; see "Get your hands on Selenium".
If you don't want the browser to be visible but still want Selenium-like functionality, you can use PhantomJS.
Hope it helps.

Related

I'm looking for a way to delete the bot command after the bot has posted in Discord (code written with Discord.Net)

Below is the code. After the embed message is posted, I would like the bot to delete the command it was given to post the embed. Also, if anyone knows how to add a footer to this embed, that would be awesome.
if (raid == "gos")
{
if (day == "Sun")
{
var filename = "gos_Sun.png";
var embed = new EmbedBuilder()
{
Title = "Garden of Salvation",
Description = "```" + day + ", " + date + " @ " + time + " " + ampm + " " + "\n" + description + "```" + description2,
ImageUrl = $"attachment://{filename}",
}.Build();
SentEmbed = await Context.Channel.SendFileAsync(filename, embed: embed);
await SentEmbed.AddReactionsAsync(myReactions);
}
}
var footer = new EmbedFooterBuilder().WithText("React Below");
if (raid == "gos")
{
if (day == "Sun")
{
var filename = "gos_Sun.png";
var embed = new EmbedBuilder()
{
Title = "Garden of Salvation",
Description = "```" + day + ", " + date + " @ " + time + " " + ampm + " " + "\n" + description + "```" + description2,
ImageUrl = $"attachment://{filename}",
}.WithFooter(footer).Build();
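For the first part of the question (deleting the command message), here is a minimal sketch. It assumes a Discord.Net command module derived from ModuleBase<SocketCommandContext>, and that the bot has the Manage Messages permission in the channel. The module name RaidModule and the command name "raid" are placeholders, not taken from the question.

```csharp
using System.Threading.Tasks;
using Discord;
using Discord.Commands;

// Hypothetical module and command name, for illustration only.
public class RaidModule : ModuleBase<SocketCommandContext>
{
    [Command("raid")]
    public async Task PostRaidAsync()
    {
        var embed = new EmbedBuilder()
            .WithTitle("Garden of Salvation")
            .WithFooter("React Below")
            .Build();

        // Post the bot's reply first.
        await Context.Channel.SendMessageAsync(embed: embed);

        // Then delete the user's message that invoked the command.
        await Context.Message.DeleteAsync();
    }
}
```

Deleting after sending means the command stays visible if posting the embed fails, which makes problems easier to spot.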

How can I render string as html in asp.net?

How can I make an input-field string in a table cell render as an actual input field?
<%= HttpUtility.HtmlDecode((string)objRow["Post"]) %>
In the view:
<%= HttpUtility.HtmlDecode(GetUsersList())%>
Code in the .cs file:
foreach (DataRow dtRow in ds.Tables[0].Rows)
{
string userpk = Convert.ToString(dtRow["user_pk"]);
string usertypecd = Convert.ToString(dtRow["user_type_cd"]);
string firstname = Convert.ToString(dtRow["first_name"]);
string lastname = Convert.ToString(dtRow["last_name"]);
string active = Convert.ToString(dtRow["active_ind"]);
if (active == "true")
{
active = "< input type = 'checkbox' class='editor-active' disabled='disabled' checked='checked'>";
}
else {
active = "< input type = 'checkbox' class='editor-active' disabled='disabled'>";
}
string phoneno = Convert.ToString(dtRow["phone_no"]);
string phone_ext = Convert.ToString(dtRow["phone_ext"]);
string email = Convert.ToString(dtRow["email"]);
content += "<tr><td>"+ usertypecd+ "</td><td>" + firstname + "</td><td>" + lastname + "</td><td>" + active + "</td><td>" + phoneno + "</td><td>" + phone_ext + "</td><td>" + email + "</td><td></td></tr>";
}
return content;
If you mean ASP.NET Web Forms, you can create a System.Web.UI.WebControls.Label with the HTML text in its Text property. It will then render as HTML if you add the control to your page.
Alternatively, you can put the text into a protected or public property and use <%= YourPropertyNameHere %> to write the text into the page as it is.
Personally, though, I recommend building web controls instead of doing string operations. For example, create a System.Web.UI.WebControls.Panel and then add your controls to it as you need them.
var pnl = new Panel();
var cb = new CheckBox();
cb.Enabled = false; // the question's markup always renders the checkbox as disabled
cb.Checked = (active == "true"); // checked only for active users
cb.CssClass = "editor-active";
pnl.Controls.Add(cb);
SomePanelInFrontCode.Controls.Add(pnl);

Display a List in the Console

I am trying to display a List in the console.
My List code:
var order = new List<Orders>();
order.Add(new Orders { Date = "" + orders[0].date_created, Name = ""+ orders[0].billing.first_name , Adress = ""+ orders[0].shipping.address_1 + " " + orders[0].shipping.address_2 });
order.Add(new Orders { Date = "" + orders[1].date_created, Name = "" + orders[1].billing.first_name, Adress = "" + orders[1].shipping.address_1 + " " + orders[1].shipping.address_2 });
order.Add(new Orders { Date = "" + orders[2].date_created, Name = "" + orders[2].billing.first_name, Adress = "" + orders[2].shipping.address_1 + " " + orders[2].shipping.address_2 });
order.Add(new Orders { Date = "" + orders[3].date_created, Name = "" + orders[3].billing.first_name, Adress = "" + orders[3].shipping.address_1 + " " + orders[3].shipping.address_2 });
order.Add(new Orders { Date = "" + orders[4].date_created, Name = "" + orders[4].billing.first_name, Adress = "" + orders[4].shipping.address_1 + " " + orders[4].shipping.address_2 });
return order;
I have tried to display it like this:
Debug.WriteLine(order.ToString());
and like this:
order.ForEach(i => Debug.WriteLine(i.ToString()));
But it gives the warning:
Unreachable code
How can I display the list?
Your second attempt, using List<T>.ForEach, is close to actually printing the list; you just need to format the string properly instead of simply calling the ToString method:
order.ForEach(o => Debug.WriteLine("Date: " + o.Date + " Adress: " + o.Adress + " Name: " + o.Name));
I know it is not the point of the question, but I suggest you use a foreach loop to populate the list too, as it will make your code more flexible.
Try this one:
foreach (var item in order)
{
Debug.WriteLine(item.ToString());
}
Or, if you have multiple properties as mentioned above, you can try this:
foreach (var item in order)
{
Debug.WriteLine("Date : {0}, Name : {1}, Adress : {2}",item.Date.ToString(), item.Name.ToString(), item.Adress.ToString());
}
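Two things may be worth spelling out here. First, calling ToString on a class that does not override it prints the type name, not the property values. Second, any statement placed after return order; in the same block can never run, which is what the "Unreachable code" warning means; whether that is where the Debug.WriteLine calls were placed is an assumption, since the question does not show it. The sketch below assumes Orders is a simple class with three string properties (its definition is not in the question) and uses Console.WriteLine; OrderPrinter is a made-up name.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the Orders class; the question does not show it.
public class Orders
{
    public string Date { get; set; }
    public string Name { get; set; }
    public string Adress { get; set; } // spelling kept from the question

    // Without this override, ToString() returns the type name.
    public override string ToString()
    {
        return "Date: " + Date + ", Name: " + Name + ", Adress: " + Adress;
    }
}

// Hypothetical helper showing where the printing has to go.
public static class OrderPrinter
{
    public static List<Orders> PrintAndReturn(List<Orders> order)
    {
        foreach (var item in order)
        {
            Console.WriteLine(item); // uses the ToString override
        }
        return order; // return comes last; statements after it would be unreachable
    }
}
```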

Code for a Google Analytics product hit in C#

I'm new to Google Analytics. I have read some material about it and found that there is no direct method to register a hit from a Windows application in Google Analytics, but I found some solutions on Stack Overflow. I tried one, but it didn't work for me. Below is the code that I'm using.
private void analyticsmethod4(string trackingId, string pagename)
{
Random rnd = new Random();
long timestampFirstRun, timestampLastRun, timestampCurrentRun, numberOfRuns;
// Get the first run time
timestampFirstRun = DateTime.Now.Ticks;
timestampLastRun = DateTime.Now.Ticks - 5;
timestampCurrentRun = 45;
numberOfRuns = 2;
// Some values we need
string domainHash = "123456789"; // This can be calculated for your domain online
int uniqueVisitorId = rnd.Next(100000000, 999999999); // Random
string source = "Shop";
string medium = "medium123";
string sessionNumber = "1";
string campaignNumber = "1";
string culture = Thread.CurrentThread.CurrentCulture.Name;
string screenRes = Screen.PrimaryScreen.Bounds.Width + "x" + Screen.PrimaryScreen.Bounds.Height;
string statsRequest = "http://www.google-analytics.com/__utm.gif" +
"?utmwv=4.6.5" +
"&utmn=" + rnd.Next(100000000, 999999999) +
// "&utmhn=hostname.mydomain.com" +
"&utmcs=-" +
"&utmsr=" + screenRes +
"&utmsc=-" +
"&utmul=" + culture +
"&utmje=-" +
"&utmfl=-" +
"&utmdt=" + pagename + // Here i passed my profile name "MyWindowsApp"
"&utmhid=1943799692" +
"&utmr=0" +
"&utmp=" + pagename +
"&utmac=" + trackingId + //Tracking id : ie "UA-XXXXXXXX-X"
"&utmcc=" +
"__utma%3D" + domainHash + "." + uniqueVisitorId + "." + timestampFirstRun + "." + timestampLastRun + "." + timestampCurrentRun + "." + numberOfRuns +
"%3B%2B__utmz%3D" + domainHash + "." + timestampCurrentRun + "." + sessionNumber + "." + campaignNumber + ".utmcsr%3D" + source + "%7Cutmccn%3D(" + medium + ")%7Cutmcmd%3D" + medium + "%7Cutmcct%3D%2Fd31AaOM%3B";
try
{
using (var client = new WebClient())
{
//byte[] bt = client.DownloadData(statsRequest);
Stream data = client.OpenRead(statsRequest);
StreamReader reader = new StreamReader(data);
string s = reader.ReadToEnd();
MessageBox.Show(s);
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
}
This example also comes from this site. I don't know where the problem is. Please point me in the right direction. This is the output I'm getting: "GIF89a".
Thanks
Bobbin Paulose
So it's working. The Google Analytics call loads a tiny GIF image, and the querystring parameters provided in the request trigger all the Google Analytics goodness. If you're getting a response back, you have registered your event successfully with Google.

How to add a where clause inside select query

<PersVeh id="V0001" LocationRef="L0001" RatedDriverRef="D0001">
<Manufacturer>FORD</Manufacturer>
<Model>TAURUS SE</Model>
<ModelYear>2007</ModelYear>
<VehBodyTypeCd>PP</VehBodyTypeCd>
<POLKRestraintDeviceCd>E</POLKRestraintDeviceCd>
<EstimatedAnnualDistance>
<NumUnits>011200</NumUnits>
</EstimatedAnnualDistance>
<VehIdentificationNumber>1FAFP53U37A160207</VehIdentificationNumber>
<VehSymbolCd>12</VehSymbolCd>
<VehRateGroupInfo>
<RateGroup>16</RateGroup>
<CoverageCd>COMP</CoverageCd>
</VehRateGroupInfo>
<VehRateGroupInfo>
<RateGroup>21</RateGroup>
<CoverageCd>COLL</CoverageCd>
</VehRateGroupInfo>
I'm brand new to Linq and I'm hoping that someone can help me with what may or may not be a simple problem.
For the above xml sample I'm using the following code:
var result = from item in doc.Descendants(n + "PersVeh")
where item.Attribute("id").Value == "V0001"
select new
{
RatedDriverRef = (string)item.Attribute("RatedDriverRef"),
LocationRef = (string)item.Attribute("LocationRef"),
ModelYear = (string)item.Element(n + "ModelYear") ?? "9999",
VehBodyTypeCd = (string)item.Element(n + "VehBodyTypeCd") ?? "XX",
POLKRestraintDeviceCd = (string)item.Element(n + "POLKRestraintDeviceCd") ?? "0",
EstimatedAnnualDistance = (string)item.Element(n + "EstimatedAnnualDistance").Element(n + "NumUnits") ?? "999999",
VehIdentificationNumber = (string)item.Element(n + "VehIdentificationNumber") ?? "VIN not found",
VehSymbolCd = (string)item.Element(n + "VehSymbolCd") ?? "00"
};
The problem I'm having is with the VehRateGroupInfo nodes. I need to extract the RateGroup number based on the CoverageCd.
In other words, something like this:
CompSymbol = item.Element(n + "VehRateGroupInfo").Element(n + "RateGroup").Value
where item.Element(n + "VehRateGroupInfo").Element(n + "CoverageCd").Value == "COMP"
Is it possible to do this within the select or do I need a separate query?
Here's a solution with query syntax:
CompSymbol = (from vehRateGroup in item.Descendants(n + "VehRateGroupInfo")
where vehRateGroup.Element(n + "CoverageCd").Value == "COMP"
select vehRateGroup.Element(n + "RateGroup").Value).Single()
Here's a similar solution with method syntax:
CompSymbol = item.Descendants(n + "VehRateGroupInfo")
.Single(x => x.Element(n + "CoverageCd").Value == "COMP")
.Element(n + "RateGroup").Value
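One caveat with Single(): it throws if no VehRateGroupInfo matches, or if more than one does. If a missing coverage code should fall back to a default, in the same spirit as the ?? defaults in the question's query, something like the following sketch may work. RateGroupLookup and Find are made-up names, and the "00" fallback in the usage note is only an example value.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class RateGroupLookup
{
    // Returns the RateGroup of the first VehRateGroupInfo child whose
    // CoverageCd equals coverageCd, or the fallback when none matches.
    public static string Find(XElement vehicle, XNamespace n, string coverageCd, string fallback)
    {
        return vehicle.Elements(n + "VehRateGroupInfo")
            .Where(g => (string)g.Element(n + "CoverageCd") == coverageCd)
            .Select(g => (string)g.Element(n + "RateGroup"))
            .FirstOrDefault() ?? fallback;
    }
}
```

Inside the select it could then be used as CompSymbol = RateGroupLookup.Find(item, n, "COMP", "00"). Casting an XElement to string returns null when the element is missing, so this version does not throw on incomplete data.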
