Getting dynamic content of div in File.ReadAllText - c#

I need to export a web page to PDF. To do that, I am reading the entire content of the page file and writing that content into the PDF file.
Please see the dynamic content in the div tag below:
<div ng-app="criteriaApp">
<div ng-include src="'@Url.Content("~/template.html")'"></div>
</div>
The content of template.html is modified dynamically with jQuery and displayed in the view.
So whenever I inspect the web page with the developer tools, I can see the content,
like this:
<div ng-app="criteriaApp">
<div ng-include src="'@Url.Content("~/template.html")'">
<div>.......</div>
</div>
</div>
But getting the content through File.ReadAllText doesn't give the actual DOM elements. It gives only the page source (i.e. what you get from View Page Source when right-clicking the page).
string contents = File.ReadAllText(path);
How can I get the dynamic contents of the div with the code above?
Note: File.ReadAllText returns only the page source, not the DOM nodes generated dynamically.
The dynamic content is there in the DOM. How can I get the DOM markup for a particular div in C#?
How can I achieve this?
Thanks,
Siva

To get the dynamic contents of the div, you need an engine such as WebKit to execute the scripts and generate the DOM; reading the HTML file only gives you the static source. If you need to export your page as PDF, you should look at RazorPDF or Rotativa.
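As one possible illustration of "using an engine to generate the DOM", here is a minimal sketch that swaps in Selenium WebDriver with headless Chrome rather than WebKit itself. It assumes the Selenium.WebDriver and Selenium.Support NuGet packages and a local Chrome installation. The URL is hypothetical; the selectors are taken from the markup in the question.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

class RenderedDomExample
{
    static void Main()
    {
        // Hypothetical URL: the page has to be served over HTTP so that
        // Razor and AngularJS actually run; reading the file from disk is not enough.
        const string url = "http://localhost:5000/Home/Criteria";

        var options = new ChromeOptions();
        options.AddArgument("--headless");

        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl(url);

            // Wait until ng-include has inserted at least one child element.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until(d => d.FindElements(By.CssSelector("div[ng-include] > *")).Count > 0);

            // outerHTML reflects the live DOM, including script-generated nodes.
            IWebElement app = driver.FindElement(By.CssSelector("div[ng-app='criteriaApp']"));
            string renderedHtml = app.GetAttribute("outerHTML");
            Console.WriteLine(renderedHtml);
        }
    }
}
```

The string in renderedHtml could then be handed to whatever PDF step you already have, in place of the result of File.ReadAllText.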

Related

AngleSharp HTML parser doesn't seem to be parsing the document deep enough to access the desired element

I'm trying to scrape a website using AngleSharp and want to access a particular button that is nested deep in the page. I have logged the parsed document's HTML with document.DocumentElement.OuterHtml
but can only see so far into the document:
<div class="l-propertySearch-paginationAndSearchFooter" data-test="pagination">
<div data-bind="component: 'pagination'"></div>
</div>
</div>
However, when I inspect the page in the web browser, I can see the additional layers necessary to access the button:
As you can see, the div with the data-bind attribute value "component: 'pagination'" opens up further, but this isn't shown in the log. This, I suspect, is why I can't retrieve the element.
I've experimented with document.QuerySelectorAll("button") and get back a list of buttons, but not the one I'm after. It's as if the particular block I want doesn't exist. Any ideas what I'm doing wrong?
As far as I understand, the button you are looking for is created with JavaScript and does not exist in the original source code. That is why you can't access it with AngleSharp. Right-click the page and choose View page source (Ctrl + U in Chrome) and look for your button there. That is what AngleSharp sees, not the HTML shown in Inspect Element.
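The point above can be reproduced with a small, self-contained sketch. The HTML string is made up for illustration (it is not the real site); it contains a script that would add a button in a browser. The sketch assumes the AngleSharp NuGet package with its default configuration, which does not include a script engine.

```csharp
using System;
using System.Threading.Tasks;
using AngleSharp;

class StaticParseExample
{
    static async Task Main()
    {
        // Made-up page: the button exists only after the script has run.
        const string html = @"<html><body>
<div data-bind=""component: 'pagination'""></div>
<script>
  var b = document.createElement('button');
  document.querySelector('div').appendChild(b);
</script>
</body></html>";

        // Configuration.Default has no script engine, so the script is parsed but never executed.
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(html));

        // Counts the buttons AngleSharp can see in the parsed source.
        Console.WriteLine(document.QuerySelectorAll("button").Length);
    }
}
```

Because the script never runs, the script-created button should not be counted, which mirrors what happens with the pagination component on the real page.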

Read a full page with a dynamically loaded ASPX form in C#

I need to read this page in a WCF service:
http://bvmf.bmfbovespa.com.br/cias-listadas/empresas-listadas/ResumoEmpresaPrincipal.aspx?codigoCvm=9512&idioma=pt-br
But I want to read the node with class="ficha responsive", which is generated dynamically by the server.
When I use a method like
HtmlDocument doc = web.Load("http://bvmf.bmfbovespa.com.br/cias-listadas/empresas-listadas/ResumoEmpresaPrincipal.aspx?codigoCvm=9512&idioma=pt-br");
I don't get the full page, because the page loads this form dynamically:
form name="aspnetForm"
method="post"
action="ResumoEmpresaPrincipal.aspx?codigoCvm=9512&idioma+=+pt+-+br&idioma=pt-br"
id="aspnetForm"
How can I load the FULL page or post data to this web form in C#? Or load the full HTML content?
ResumoEmpresaPrincipal.aspx?codigoCvm=9512
The solution for reading the full page content is in this post:
Scraping webpage generated by javascript with C#

Any DOM parsers that do not modify the DOM?

I need to write a page (PHP or .NET is fine) that will display the unmodified HTML for an element of another page.
The other page may not have valid HTML, but we want it to be returned unmodified. We will not be selecting based on the invalid elements, but will select their parent element and need them returned unmodified.
An example HTML page that my page will be fetching:
<body>
<div>
<p>test1</p>
<br>
<p>test2
<p>test3</p>
</div>
</body>
So far everything I have tried attempts to fix the HTML: it makes the br in the example self-closing, and the second paragraph tag gets closed.
Is there anything out there that can do this?
Thanks!

Add a list of images to HTML dynamically using C#

I have a list of images stored in a SQL database.
I am trying to add them dynamically at run time.
I use "InnerHtml".
I created a div tag and want to add the image list inside it.
HTML:
<div runat=server class="ws_images" id="List_Slide">
C#
List_Slide.InnerHtml = "<li><img src=data1/images/31.jpg alt=31 title=31 id=wows1_0/></li>";
Can you help me?
It looks like you are using ASP.NET Web Forms.
What you can do is drop an asp:PlaceHolder control onto your page, then add LiteralControl instances to it that contain your image HTML code.
<asp:PlaceHolder id="ImagePlaceHolder" runat="server"/>
..
..
In code behind:
var literal = new LiteralControl("<li><img src='data1/images/31.jpg' alt='31' title='31' id='wows1_0' /></li>");
ImagePlaceHolder.Controls.Add(literal);
PlaceHolder does not render any HTML tags of its own, so if you want, you can put your div tag inside the LiteralControl.

How to Update the Div Content without generating all the content of a html File

I have an HTML file whose <body /> tag has this structure:
<div id='header'>content of header</div>
<div id='content'>content of content.</div>
<div id='footer'>content of footer</div>
The contents of header, content and footer change according to user interaction.
When the user selects content for the header section, that content is added to the header <div />.
For this I created three StringBuilder variables in C#, one for each of the three <div />s. Whenever any <div /> changes, the corresponding StringBuilder is updated, and I build a temporary HTML file with a <head /> section and a <body /> section, append all the StringBuilders to the <body /> tag, and save the file. If the user wants to download the file, it should contain all the updates.
How can I update the content of a particular <div /> from code-behind in C# without making a temp file?
I need the changes applied directly to the file, and I don't want to rebuild the temp page in code-behind.
Using an UpdatePanel can resolve your issue.
Just update the div in your code-behind.
http://ajax.net-tutorials.com/controls/updatepanel-control/
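You can use the following code to change the content of a specific div.
Try it in your code-behind file. This assumes the div is declared with runat="server" so that it is accessible from code-behind by its id.
For the header:
header.InnerText = "Changed content of header";
The same applies to the other divs.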
You could indeed use an UpdatePanel as mentioned above or, even better, make an asynchronous call in JavaScript to your server and update the element of your choice with the response.
You could use jQuery (see: jQuery.com) for the async call (e.g. with $.getJSON or $.ajax) and modify the contents of the div with jQuery, like this: $("#header").html(yourResult).
The call to the server could be handled by a handler (.ashx), a WCF service, or whatever works for you.
Hope this gets you further!
Cheers!
