My requirement is to extract specific content from a web page. The page has a section that is populated using AJAX. When I view the page source, it does not show the content loaded via AJAX. The section content changes based on the check box selected: if we select the 'India' check box, the section displays all the details for India. The page source shows only the default content, not the content loaded via AJAX. I checked the page source after selecting the check box, and it still shows only the default value. How can I get that section's content?
In C# you can use HtmlAgilityPack to scrape the data, but webBrowser.DocumentText does not include the AJAX-loaded content, so XPath queries against it will not find that content. Instead, wait until the WebBrowser control has loaded the page completely, then add the code below to the DocumentCompleted handler:
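// Requires "using mshtml;" (reference Microsoft.mshtml) and the HtmlAgilityPack package
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// DomDocument exposes the live DOM, including nodes added by script after the initial load
IHTMLDocument2 currentDoc = (IHTMLDocument2)this.webBrowser1.Document.DomDocument;
// Parse the current body markup rather than the originally downloaded source
doc.LoadHtml(currentDoc.body.outerHTML);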
Use Firebug in Firefox. Under the Net tab you will see the requests that load the extra content.
Related
I am using HtmlAgilityPack to get all href links, but the web page doesn't return all of them.
I tried it in a browser and saw that the page doesn't show all links until you scroll down the whole page. Then I resized (zoomed) the browser window so that all page contents could be seen without scrolling. At that moment all links appeared. Maybe JavaScript needs to be triggered?
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument Doc = web.Load("https://www.verkkokauppa.com/fi/catalog/438b/Televisiot/products?page=1");
foreach (HtmlNode item in Doc.DocumentNode.SelectNodes("//li[@class='product-list-grid__grid-item']/a"))
{
    Debug.WriteLine(item.GetAttributeValue("href", string.Empty));
}
One page has 24 product links, but I get only 15 of them.
Check the Network tab in Chrome on that page. There are AJAX requests to https://www.verkkokauppa.com/resp-api/product?pids=467610, so the products are loaded using JavaScript.
You can't just trigger JavaScript here. HtmlAgilityPack is an HTML parser; it does not execute scripts. If you want to work with dynamic content, you need a browser engine. I think you should look at Selenium and PhantomJS.
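Since the Network tab shows the products arriving from the resp-api endpoint, another option is to request that endpoint directly instead of driving a browser. The following is only a minimal sketch. It assumes the endpoint answers a plain GET with JSON and needs no cookies or special headers, none of which is confirmed here. The response shape is unknown, so the code only reports the kind of the top-level JSON value.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

class ProductApiSketch
{
    static async Task Main()
    {
        // URL as observed in the Network tab; the pid value is the one quoted above.
        const string url = "https://www.verkkokauppa.com/resp-api/product?pids=467610";

        using var client = new HttpClient();
        // Assumption: the server may reject requests that carry no User-Agent.
        client.DefaultRequestHeaders.UserAgent.ParseAdd("Mozilla/5.0");

        string body = await client.GetStringAsync(url);

        // Assumption: the body is JSON. Parse throws a JsonException otherwise.
        using JsonDocument json = JsonDocument.Parse(body);
        Console.WriteLine(json.RootElement.ValueKind);
    }
}
```

If the endpoint turns out to need session cookies or other headers, a browser engine such as Selenium remains the more dependable route.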
I need to read this page in a WCF service:
http://bvmf.bmfbovespa.com.br/cias-listadas/empresas-listadas/ResumoEmpresaPrincipal.aspx?codigoCvm=9512&idioma=pt-br
But I want to read the node with class="ficha responsive", which is generated dynamically by the server.
When I use a method like
HtmlDocument doc = web.Load("http://bvmf.bmfbovespa.com.br/cias-listadas/empresas-listadas/ResumoEmpresaPrincipal.aspx?codigoCvm=9512&idioma=pt-br");
I do not get the full page, because the page loads its content dynamically through this form:
<form name="aspnetForm" method="post" action="ResumoEmpresaPrincipal.aspx?codigoCvm=9512&idioma+=+pt+-+br&idioma=pt-br" id="aspnetForm">
How can I load the full page, or post data to this web form, in C#? Or is there another way to load the full HTML content?
ResumoEmpresaPrincipal.aspx?codigoCvm=9512
The solution for reading the full page content is in this post:
Scraping webpage generated by javascript with C#
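For the "post data to this web form" part of the question, one commonly used approach with ASP.NET pages is to GET the page first, copy its hidden inputs (such as the view state), and send them back in a POST. The sketch below illustrates only that idea. Whether this particular page returns the "ficha responsive" node after such a postback is an assumption that has not been checked, and the page may also expect other fields (for example an event target) that are not known here.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

class PostBackSketch
{
    static async Task Main()
    {
        const string url = "http://bvmf.bmfbovespa.com.br/cias-listadas/empresas-listadas/ResumoEmpresaPrincipal.aspx?codigoCvm=9512&idioma=pt-br";

        using var client = new HttpClient();

        // 1. GET the page so the hidden form fields can be read.
        var first = new HtmlDocument();
        first.LoadHtml(await client.GetStringAsync(url));

        // 2. Copy every named hidden input into the form data.
        var fields = new Dictionary<string, string>();
        HtmlNodeCollection hidden = first.DocumentNode.SelectNodes("//input[@type='hidden']");
        if (hidden != null) // SelectNodes returns null when nothing matches
        {
            foreach (HtmlNode input in hidden)
            {
                string name = input.GetAttributeValue("name", "");
                if (name.Length > 0)
                    fields[name] = HtmlEntity.DeEntitize(input.GetAttributeValue("value", ""));
            }
        }

        // 3. POST the fields back to the same URL and parse the response.
        HttpResponseMessage response = await client.PostAsync(url, new FormUrlEncodedContent(fields));
        var second = new HtmlDocument();
        second.LoadHtml(await response.Content.ReadAsStringAsync());

        // 4. Look for the node named in the question.
        HtmlNode ficha = second.DocumentNode.SelectSingleNode("//*[@class='ficha responsive']");
        Console.WriteLine(ficha != null ? ficha.OuterHtml : "node not found");
    }
}
```

If the node is filled in by client-side script rather than by the postback, this will print "node not found", and a browser-based approach like the one in the linked post is needed.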
I made a web form in HTML, and I have a website in C#.
I would like this form to show up every time the page is loaded.
What is the best way to integrate/include/call the form?
Which page do I have to modify: Default.aspx or Default.aspx.cs?
The purpose of this project is to show this form whenever the cookie is not set.
I guess I have to modify the aspx part so that it checks whether the cookie value is set and shows or hides the web form based on that value.
You could do this using a combination of JavaScript (to check whether cookies are enabled) and jQuery. If cookies aren't enabled, have a placeholder DIV that can hold the HTML content you want to show. Then use $.ajax (http://api.jquery.com/jquery.ajax/) to load the HTML content from the server and set the DIV's innerHTML property to the returned HTML.
Hope this works for you!!
A little progress.
Here is how I modified my pages for my needs. This way the HTML web form displays correctly.
In the head of default.master I added:
In default.aspx I added:
with the entire HTML of my HTML page (html tag included).
Now I need to modify this page so that the pop-up is shown only if a cookie value is not set.
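The cookie check can also be done on the server, in Default.aspx.cs. A minimal sketch follows, under stated assumptions: the cookie name "formSeen" and the panel ID "formPanel" are hypothetical, and the form markup is assumed to be wrapped in a server-side panel in Default.aspx.

```csharp
using System;
using System.Web;
using System.Web.UI;

public partial class _Default : Page
{
    // Hypothetical cookie name; use whatever name the site actually sets.
    private const string CookieName = "formSeen";

    protected void Page_Load(object sender, EventArgs e)
    {
        // Request.Cookies[...] returns null when the browser did not send the cookie.
        HttpCookie cookie = Request.Cookies[CookieName];
        bool cookieIsSet = cookie != null && !string.IsNullOrEmpty(cookie.Value);

        // Assumption: Default.aspx wraps the form markup in
        // <asp:Panel ID="formPanel" runat="server"> ... </asp:Panel>,
        // so the formPanel field is generated from the markup.
        formPanel.Visible = !cookieIsSet;
    }
}
```

A panel with Visible set to false is not rendered at all, so the form markup is not sent to browsers that already have the cookie.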
I am trying to access the web page http://www.pof.com with C# code.
After successfully logging in as a user, I found that the document I need is inside an iframe, and I am not familiar with how to access it.
All I want to do is get the HTML of the page that is loaded in the iframe and follow some of the links on that site.
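Use the following code:
document.getElementById('iframe1').contentWindow.document
or, to look up an element inside the iframe:
var iframe = document.getElementById('iframe1');
var iframeDocument = iframe.contentDocument || iframe.contentWindow.document;
var elemVal;
if (iframeDocument) {
    // 'elementId' is a placeholder for the id of the element you want inside the iframe
    elemVal = iframeDocument.getElementById('elementId');
}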
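The JavaScript above runs inside the page. From C#, something similar is possible if the page is loaded in a WinForms WebBrowser control, which is an assumption here because the question does not say how the page is fetched. The sketch reads the HTML of the first frame. As far as I know, a frame served from a different domain raises UnauthorizedAccessException, so that case is caught.

```csharp
using System;
using System.Windows.Forms;

static class FrameHtmlSketch
{
    // Returns the body HTML of the first frame, or null when there is
    // no frame or the frame cannot be accessed.
    public static string GetFirstFrameHtml(WebBrowser browser)
    {
        HtmlDocument doc = browser.Document;
        if (doc == null || doc.Window == null || doc.Window.Frames == null
            || doc.Window.Frames.Count == 0)
        {
            return null;
        }

        try
        {
            HtmlDocument frameDoc = doc.Window.Frames[0].Document;
            return frameDoc?.Body?.OuterHtml;
        }
        catch (UnauthorizedAccessException)
        {
            // Typically a cross-domain frame; its document is not reachable this way.
            return null;
        }
    }
}
```

Call it from the DocumentCompleted handler, for example `string html = FrameHtmlSketch.GetFirstFrameHtml(webBrowser1);`, once the frame has finished loading.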
I want to assign HTML content to an iframe control from the ASP.NET code-behind page.
This works fine:
myIframe.Attributes.Add("src", "pathtofilewith.html");
But I don't want to give the path of an HTML file to display; I just want to assign some HTML content, which comes from a database, to the iframe control.
I want something like this (Ashok) to be displayed in the iframe control.
I tried the ways below, but neither was successful:
myIframe.Attributes["innerHTML"] = "<h1>Thank You..</h1>";
myIframe.Attributes.Add("innerHTML", "<h1>Ashok</h1>");
One way to communicate between two different pages, where one is in an IFrame on the other, is to post data using jQuery. An example is given in this StackOverflow question.
It is also discussed in this other StackOverflow question.
On this page, you will also find a short and simple example of how you can put content in an IFrame without using a separate web page for it (note the missing src attribute!).
Hope some of this helps!
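A different technique for putting content into an iframe without a separate page is the HTML5 srcdoc attribute, which can be set from the code-behind in the same way the question sets src. This is only a sketch under assumptions: the markup is assumed to contain an iframe with runat="server" and the id myIframe, the class name is hypothetical, and the target browsers must support srcdoc (older Internet Explorer versions do not).

```csharp
using System;
using System.Web.UI;

public partial class IframeContentPage : Page
{
    // Assumption: the .aspx markup contains
    // <iframe id="myIframe" runat="server"></iframe>,
    // so the myIframe field is generated from the markup.
    protected void Page_Load(object sender, EventArgs e)
    {
        // In the real page this string would come from the database.
        string html = "<h1>Thank You..</h1>";

        // srcdoc carries the document inline, so src is not needed.
        myIframe.Attributes.Remove("src");
        myIframe.Attributes["srcdoc"] = html;
    }
}
```

It is worth checking the rendered page source to confirm that the attribute value was HTML-encoded on output. Content that comes from users should be sanitized before it is displayed this way.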
You can't. That's not how an IFRAME works - it is for use with the src attribute as you've already discovered.
Perhaps you want to create a DIV instead.
There is no way to insert HTML content into the iframe tag directly, but you can create a page that gets the content from the database and then display that page using the iframe,
or
you can create a page, for example getContent.aspx, which reads a value from the URL (e.g. getContent.aspx?content=<h1>Thank You..</h1>), displays it however you like, and is then loaded from the iframe.