Smart detection of html content changes - c#

I'm looking for an algorithm or library (preferably in C#) that can intelligently detect changes in the content of an HTML page.
For example, if the page were techcrunch.com, it would only report a match when there is a new post or a significant change to the page. It would ignore HTML comments, JavaScript, and minor updates such as the number of comments.
Can someone point me in the right direction?

You could use JavaScript to count how many elements are on the page, or on a specific portion of the page. There are many ways to use JS to detect changes.

I'm assuming that you request the page from your C# program.
There are hundreds of ways to do it.
I'll give you one.
The easiest, most naive algorithm is:
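DateTime lastSeen = DateTime.MinValue;
while (true)
{
    DateTime modified = checkModifyDate(); // returns the page's last-modified date
    if (modified > lastSeen)
    {
        lastSeen = modified;
        // do anything you want...
    }
    Thread.Sleep(TimeSpan.FromMinutes(10)); // System.Threading: check again in 10 minutes
}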
The checkModifyDate() function checks only the HTTP response headers (for example Last-Modified) for changes.
Then you can do whatever you need afterwards.
You can put it in a timer that fires every few minutes, or in a thread, so that the job runs automatically.
Hope this helps.
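A header check on its own may not be able to tell a new post from a changed comment count. One rough way to get the filtering the question asks for is to normalise the HTML before comparing it. The sketch below is only an illustration under stated assumptions: it is written in JavaScript (Node.js) for brevity rather than C#, the names normalizeHtml, fingerprint and hasMeaningfulChange are made up for this example, and the regex-based stripping is crude, so a real HTML parser would be more robust.

```javascript
const crypto = require("crypto");

// Reduce a page to the text that matters for change detection.
// Assumption: comments, scripts, styles, markup and numbers are "noise".
function normalizeHtml(html) {
  return html
    .replace(/<!--[\s\S]*?-->/g, " ")                      // HTML comments
    .replace(/<(script|style)\b[\s\S]*?<\/\1\s*>/gi, " ")  // scripts and styles
    .replace(/<[^>]+>/g, " ")                              // remaining tags
    .replace(/\d+/g, "#")                                  // volatile numbers, e.g. comment counts
    .replace(/\s+/g, " ")                                  // collapse whitespace
    .trim()
    .toLowerCase();
}

// Hash the normalised text so only a short value has to be stored per page.
function fingerprint(html) {
  return crypto.createHash("sha256").update(normalizeHtml(html)).digest("hex");
}

// True when the two snapshots differ after normalisation.
function hasMeaningfulChange(oldHtml, newHtml) {
  return fingerprint(oldHtml) !== fingerprint(newHtml);
}
```

Store the fingerprint from the previous poll and act only when the new one differs. Replacing every number with a placeholder is a blunt way to ignore counters, and it will also hide genuine numeric changes, so treat it as a starting point.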

Related

Disable postback of certain properties of Telerik RadEditor control

We have a page with a Telerik RadEditor on a tab strip. In some scenarios the RadEditor contains a lot of HTML, and when a postback is done to switch tabs, all of its contents are posted back to the server. This results in a huge performance loss (at times postbacks send tens of MiB of data).
Is it possible to tweak RadEditor so that it does not send its contents to the server on postbacks? Our code-behind does not rely on the RadEditor's Content property accessor (it does not read the content explicitly), only on its mutator (the content is set from within the control's code-behind).
Is it possible to do this with any Telerik control and, if so, how?
It's worth pointing out that we use a relatively old Telerik UI version (2013.2.611.35) and can't switch to a newer one at the moment.
Thank you in advance.
Consider using the ContentUrl of the PageViews. This will let you load separate pages in iframes, so they will postback independently of the main page. Thus, you can have a standalone page with the editor and standalone pages for your other tabs.
On the possibility to exclude something from the POST request - I don't know of a way to do this, as it is not supposed to happen. The whole point is to transfer the current page state to the server.
Another option you may consider is using AJAX and the PageRequestManager's beginRequest event to try to blank out the editor. I have not tried it and I do not know whether it will actually work, since so much data may simply be too much for the JS engine to process before the postback begins. Here is a bit of code that illustrates the idea:
var currContent = null;
function BeginRequestHandler(sender, args) {
    var editor = $find("<%=RadEditor1.ClientID%>");
    currContent = editor.get_html(true);
    editor.set_html("");
}
function EndRequestHandler(sender, args) {
    var editor = $find("<%=RadEditor1.ClientID%>");
    editor.set_html(currContent);
    currContent = null;
}
Sys.WebForms.PageRequestManager.getInstance().add_beginRequest(BeginRequestHandler);
Sys.WebForms.PageRequestManager.getInstance().add_endRequest(EndRequestHandler);

Get print page count without printing the document

This is somewhat similar to the question "Is there a better way to get the page count from a PrintDocument than this?"
In my case, however, I have a web-browser control with formatted HTML. At the moment I have an option that calls ShowPrintPreviewDialog() so the user can see how many pages are going to be printed.
Is there any way to get the number of pages that will be printed without launching the print preview?
I am trying to create a method that runs on OnTextChange and displays the print page count automatically.
I have used the PrintPage event:
private void PrintDocumentOnPrintPage(object sender, PrintPageEventArgs e)
{
    e.Graphics.DrawString(this.webBrowser1.DocumentText, this.webBrowser1.Font, Brushes.Black, 10, 25);
}
Bad news always travels slow at SO. You'll need to scratch the idea that this is practical.
Although unstated in the question, you should have already figured out by now that your PrintPage event handler doesn't work. It always produces a count of 1. That's because you never set the e.HasMorePages property to true, the property that causes more than one page to be generated.
To reliably set that property to true, you need to figure out exactly how the HTML gets rendered by the browser layout engine. And figure out exactly how to break it up into pages that don't cut, say, a line of text or an image in two. And figure out how to do this in the exact same way that the browser printing engine does. A feat that's been attempted by many a programmer, accomplished by none. The browser's automation object model just doesn't have the needed API.
The only reasonable way is the one you already know. You have to call ShowPrintPreviewDialog(), which readily displays the page count in the preview dialog, as seen in IE11.
In case you'd consider snooping that number off the dialog: no, that cannot work either. The dialog doesn't use any controls, it is one monolithic window.

Can't reach the text field with WatiN

I'm trying to automate a site with WatiN and I'm having trouble reaching a text field in the HTML body.
When I log into the site I manage to reach the search box, enter input there and even press "enter". When it moves to the second page, I can't reach the input form there.
This is my code. The first step works smoothly but the second doesn't.
browser.GoTo("mywebsiteaddress");
browser.TextField(Find.ByName("sysparm_search")).TypeText(ticketNumber.Text);
System.Windows.Forms.SendKeys.SendWait("{ENTER}");
//browser.TextField(Find.ByName("sys_display.sc_task.u_category")).TypeText(ticketNumber.Text);
browser.Element("sc_task.work_notes");
This is the page source when I inspect it in Google Chrome:
<textarea wrap="soft" onkeypress="" onkeyup="multiModified(this);fieldTyped(this);" onfocus="this.isFocused=true;" autocomplete="off" data-charlimit="false" rows="16" id="sc_task.work_notes" data-length="4000" style="; width:100%; overflow:auto; " name="sc_task.work_notes" onblur="this.isFocused=false;" onchange="this.isFocused=false;multiModified(this)" onkeydown="multiKeyDown(this);;"></textarea>
Thanks all!
It could be a number of things: most likely the element you're looking for is in a frame, or, less likely, you're looking for it before it has loaded. If it is in a frame, you'll need to specify which frame. If it has not loaded yet, the easy way to check is to put in a generic sleep() call, though that is not the best long-term solution.
When I deal with something I can't find, I make heavy use of the Flash() method. In your case, you'd probably want to start at the whole page level and work your way down to your object. Flash() will show you where you're looking at to make sure you're looking in the right spot on the page, ideally getting down to the parent element of what you're looking for and being able to correctly identify and flash that and then figure out what is amiss with trying to get the textarea you're really trying to get at.
Use Firefox and install the Firebug add-on.
Right-click whichever element you want to inspect and choose to inspect it with Firebug.
Once you know the id or name of that element, you can easily access it using the Find class.
Sometimes the element is in a different frame from the main frame. WatiN cannot access that element directly; it has to be accessed via the frame.

Is there a generally accepted way to pass data from one ASP.Net form to another after validation?

I have an ASP.Net form (Page1) where the user enters some data and then clicks the submit button.
As part of Page1, I have some Validators, including a CustomValidator which needs to do its validation back on the server.
When the user clicks the submit button a post is done to Page1 and the validation routine is run on the server and as long as I check Page.IsValid in the button click routine the form knows whether things have passed or not.
When the validation doesn't pass, everything properly goes back to Page1 and the error message is displayed.
When the form does pass validation, I want to pass the data that the user entered to a second form (Page2) so that Page2 can be rendered correctly based on the data the user entered on Page1.
Is there a generally accepted way, or best way, to pass the data to Page2? Here are some ways I know about:
Call Page2 with a query string: This won't work as I need the data to not be visible to the user in certain cases.
Use the PostBackUrl on the submit button to go to Page2: As far as I know, this won't work correctly because then the server side validation routines for Page1 won't be run.
Use Session Variables: I don't know of a particular reason why this would be bad.
Use Server.Transfer: I don't really have any experience with this.
I would think that this would be a pretty standard thing to do but I'm having a hard time finding any information on the correct way to do it.
If you don't have a form of secondary storage for this data, using either Session storage or Server.Transfer would work.
You might find Server.Transfer a little neater because it retains your POST values across the transfer. This can save you a lot of cumbersome session-state code which, depending on how complex your forms are, could open the way to all kinds of unusual behaviour that you'd have to predict and plan for in advance: what happens when a user clicks the "back" button or, if you're posting across multiple pages, what happens when a session expires (plus Servy's examples of having multiple tabs open on the same page(s), all sharing the same session). Working with session state can be messy.
Perform your validation on PostBack then, if Page.IsValid, do:
Server.Transfer("/FormPage2.aspx");
Server.Transfer preserves Request.QueryString and Request.Form, so you can pick up your POST values on FormPage2 and do whatever you need with them here - whether it be using them for conditional logic or rendering them out again as hidden fields to join them up with the values from the second page of the form (bear in mind that if you're doing this you'll have to revalidate the hidden inputs at this stage).
http://msdn.microsoft.com/en-us/library/y4k58xk7.aspx
I have used session state for handling complex forms in the past and found myself wishing I'd used Server.Transfer, which I plan to use for all similar endeavours in the future, unless I have a very good reason not to.
You might also consider using a multiview, but in my experience these can be very messy.
Hope this helps.
I think that the easiest solution would be to specify a PreviousPageType directive. It specifies a type that the page should expect to receive and you would do a normal POST to that page.
On the second page of your application, use the following directive:
<%@ PreviousPageType VirtualPath="~/FirstPage.aspx" %>
You will be able to access the properties exposed and check for validity by using something like this:
if (PreviousPage != null && PreviousPage.IsValid)
Using the Session object is a standard way to pass information across forms.
@Servy gives a good explanation (in the comments below) of how Server.Transfer can help you in this case.
The other options you stated all have problems, just like you mentioned...
If you want to use Session:
In the postback of Page1 you can set the values:
Session["myVar"] = <Data you want to pass to page2>;
In page2 in the OnLoad:
if (Session["myVar"] != null)
{
    myVar = Session["myVar"]; // cast to the stored type as needed
}
You can achieve this with Server.Transfer by adding a property to your page1. In your second page in page_load for example:
Page1 prev = Page.PreviousPage as Page1;
if (prev != null)
{
    // access your property here and set up the page
}
Server.Transfer can safely receive a query string without fear of the user seeing it.
Instead of Session use Context.Items.
Context.Items["validationProblems"] = "...";
Server.Transfer("FixProblems.aspx");
My other comment is that in my experience it's more "standard" to keep the validation UI contained in the same form that's collecting the information. This enables "real time" feedback. In practice I think it's better to tell users that they're doing something wrong as early as possible.
Note, that's just in my experience though.. it's a big world.
It may be more than you presently require, but one alternative is to save the data in a database:
http://msdn.microsoft.com/en-us/library/6tc47t75%28v=VS.100%29.aspx
http://www.asp.net/web-forms/videos/how-do-i/how-do-i-set-up-the-sql-membership-provider

Asp.net MVC with Ajax history (jquery address), how to load from the URL?

I am using asp.net mvc with ajax navigation. I use jquery address and I can change the address bar to be like "MYPage.Com/#/Url", but how can I invoke my route when the user enters that link?
This has probably been asked before but I could not find it, so please point me to it if you find it.
You need to use the hashchange event of the window object (window.onhashchange). It is best to use a JavaScript library such as jQuery BBQ to handle the hash change.
If you still want to do it without a library, then on page load you should call the function that handles the hashchange event.
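The on-load call described above can be sketched without a library as follows. This is a minimal illustration, not a drop-in solution: routeFromHash and attachHashRouting are names invented for this example, and navigateTo stands for whatever function of your own requests the MVC route via AJAX.

```javascript
// Turn a location hash such as "#/Url" into a route path such as "/Url".
function routeFromHash(hash) {
  const path = hash.replace(/^#/, "");
  return path === "" ? "/" : path;
}

// `win` is the browser's window object; `navigateTo` is a hypothetical
// callback that loads the content for a route.
function attachHashRouting(win, navigateTo) {
  const handle = () => navigateTo(routeFromHash(win.location.hash));
  win.addEventListener("hashchange", handle); // react to later hash changes
  handle();                                   // handle the URL the user arrived with
  return handle;
}
```

In a page this would be called once after load, for example attachHashRouting(window, loadRoute), where loadRoute is your own function. Passing the window in as a parameter also makes the routing easy to exercise with a stand-in object.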
There is no event for that (at least not the last time I've checked). You need to make a checker function in JS that will run once every 100ms for example (or more often).
var currentHash = "";
function CheckHash()
{
    if (currentHash != window.location.hash)
    {
        currentHash = window.location.hash;
        NavigateTo(currentHash); // or whatever code to execute when the address behind `#` changes
    }
}
CheckHash(); // initial run, for fast reaction on load
window.setInterval(CheckHash, 100); // schedules the function to run once every 100ms
