HTML page annotator - C#

I have to write a web page annotator. My requirements are the following:
1) Given a set of pages, I want to annotate them efficiently (in a browser, in an external application that knows how to render HTML, etc.).
2) I manually select (highlight, make active) a string of text, and a dropdown menu appears that lets me choose from a set of options.
3) After that, an iterator appears (like the search bar in a browser after pressing Ctrl+F), and I want to be able to navigate through the matches of the string selected in the previous step on the same page.
4) A comparison function on strings is given with the following interface: given two strings, it outputs either 1 or 0, depending on whether the strings match.
5) When I press the iterator button, I move to the next match for the selected string, and then a message box should appear (or anything else where I can confirm that it is a true match).
6) Once I have confirmed that it is a true match, the text of the page should be modified so that the current match is surrounded by a special tag (for instance <<< optionX >>> matched text <<< /optionX >>>), where optionX is determined by the value selected from the dropdown menu.
7) When all matches (as defined by the comparison function) have been found on a page, I would like to mark another string of text on the same page and repeat the process: find all the matches, confirm some of them, and modify the page source accordingly.
8) The modified page should then be stored on a local drive.
QUESTIONS:
Can you please suggest the right tool for this?
1) Is it OK to use JavaScript and work in a browser? If yes, what methods are required, and are there any useful libraries that do just that, or at least cover some of the functionality described above?
2) Or is it better to build a custom desktop app that renders a page in a special frame and has appropriate buttons to navigate, confirm, etc. (Python or C# are being considered)? Again, what classes and libraries can help?
[UPDATE]:
I know how to work with the content of a page, but I am curious how to make the tool comfortable for annotators to use and how to build the right dialog with the user: how to provide all the niceties, such as dropdown menus, an iterator that is visible to users, a confirmation dialog, etc.
The goal is to annotate a lot of pages this way, so the interface should be efficient. I am a researcher (this is not homework, as you might think; I just described what is needed in a formal way), and I have little experience writing user-oriented apps.
Thank you in advance!

This is definitely a web browser's job. You can use jQuery to search and modify an HTML page. This example finds every occurrence of homework and changes the text to << homework >>:
$("*").each(function () {
    // Only touch leaf elements, so that nested markup is not flattened.
    if ($(this).children().length == 0) {
        // A regex with the g flag replaces every occurrence, not just the first.
        $(this).text($(this).text().replace(/homework/g, '<< homework >>'));
    }
});
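The confirm-and-tag loop from the question can be sketched on plain text, independent of jQuery. This is only a minimal sketch under assumptions: annotate, isTrueMatch and sameIgnoringCase are hypothetical names, candidates are taken to be every substring with the same length as the selection, and the confirmation callback stands in for the message box.

```javascript
// Hypothetical helper: wraps every confirmed match of `selected` in
// <<< option >>> ... <<< /option >>> tags.
// `matches(a, b)` is the given comparison function and must return 1 or 0.
// `isTrueMatch(candidate, index)` stands in for the confirmation dialog.
function annotate(text, selected, option, matches, isTrueMatch) {
  const open = `<<< ${option} >>> `;
  const close = ` <<< /${option} >>>`;
  const width = selected.length;
  if (width === 0) return text; // nothing was selected
  let out = "";
  let i = 0;
  while (i < text.length) {
    const candidate = text.slice(i, i + width);
    if (candidate.length === width &&
        matches(selected, candidate) === 1 &&
        isTrueMatch(candidate, i)) {
      out += open + candidate + close;
      i += width; // skip past the match so matches never overlap
    } else {
      out += text[i];
      i += 1;
    }
  }
  return out;
}

// Example comparison function (an assumption): case-insensitive equality.
const sameIgnoringCase = (a, b) => (a.toLowerCase() === b.toLowerCase() ? 1 : 0);

const page = "Homework is due. No homework today.";
const annotated = annotate(page, "homework", "optionX", sameIgnoringCase, () => true);
console.log(annotated);
```

In a browser, the callback could call window.confirm and the result could be written back into the page; saving the modified page to disk is outside what this sketch covers.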

Related

Ranorex test automation issue: Unable to reliably click a button on silverlight web app

We have automated a few test cases using the Ranorex automation framework for a Silverlight web application. These test cases involve clicking buttons in order to invoke certain messages on the screen. To grab a button on the screen, we first create a Ranorex button object and then point it to the appropriate element using a RanoreXPath. Then we call RanorexButton.Click() to click the button. However, this call is unreliable: sometimes it works, and at other times the button is not clicked. When the button is not clicked, we have to run the test case again from the start. What are we doing wrong? If this is a known problem with Ranorex, please suggest workarounds.
I was facing the same problem, but I was able to resolve it by introducing a Validate.Exists(infoObject) call just before the click. Make sure that you pass the info object of your button (or other element) to Validate.Exists.
Example:
Validate.Exists(repo.MyApp.LoginBtnInfo);
var button = repo.MyApp.LoginBtn;
button.Click();
With regards,
Avinash Nigam
I haven't heard of such a problem with Ranorex yet; maybe this is just a timing issue.
You could add a Validate.Exists(yourButton) right before the click. This ensures that the click is performed only after the button has loaded successfully.
If it is a WebElement, you could also use the PerformClick() method instead of the normal Click() method.
There are also methods that ensure that the button is in the visible area and has focus, such as EnsureVisible() and Focus().
You will find the available methods of the adapter you are using in the online Ranorex API documentation.
If the button is not within the area you can see without scrolling, you can call EnsureVisible() before the click:
var button = repo.Buttons.button1;
button.EnsureVisible();
button.Click();
This forces the button to be brought into view before it is clicked.
It might also be an issue with the XPath and the element IDs.
If your element IDs change even when the page is navigated away from and back to (we have this issue with SAP-related components, for example), you might need to build a more robust path using regular expressions.
Try to find objects and parts of the path that do not change (e.g. "iFrame id="MainContent"" or "btn id="ID_XXXX_Search_Button""). Of course, this will only help if this is where the issue lies.
Ranorex Regular Expression info can be found here: http://www.ranorex.com/support/user-guide-20/ranorexpath.html#c3294
A quick example of what I'm talking about:
Let's say we have an input field that has a changing ID in its name:
US_EL.ID_F2B49DB60EE1B68131BD662A21B3432B:V_MAIN._046A-r
And I know that the part in the Id that doesn't change is:
:V_MAIN._046A-r
I can create a path that finds the element by the unchanging ending of its ID, using a regular expression:
/dom[@domain='test.example.com']//iframe[#'identifier']//iframe[#'identifier2']//input[@id~':V_MAIN\._046A-r$']
The ~ operator matches the id attribute against a regular expression, and the trailing $ anchors the match at the end, so the path finds an input element whose ID ends with ":V_MAIN._046A-r".
An issue that might arise from this is that, if several elements share part of their names, the same path may return multiple elements. So it's wise to add a few more fixed points to the path (e.g. "iframe[#'identifier2']") when this happens.
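To double-check the anchoring before putting an expression into a path, the pattern can be tried against the sample ID in any regex engine. The sketch below uses plain JavaScript rather than Ranorex, on the assumption that the ^ and $ anchors behave the same way in both; the variable names are made up for illustration.

```javascript
// The changing ID from the example above.
const id = "US_EL.ID_F2B49DB60EE1B68131BD662A21B3432B:V_MAIN._046A-r";

// "$" anchors the pattern at the end of the string: "ID ends with ...".
const endsWithStablePart = /:V_MAIN\._046A-r$/;
// "^" anchors the pattern at the start of the string: "ID starts with ...".
const startsWithStablePart = /^:V_MAIN\._046A-r/;

console.log(endsWithStablePart.test(id));   // true
console.log(startsWithStablePart.test(id)); // false
```

The dot is escaped so that it only matches a literal "." in the ID.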

Popup on mouse hover the word

I am building a website using ASP.NET and C#, and I would like to pop up a small window with information as soon as the mouse hovers over a particular word. I know that I have to use jQuery, but I don't know exactly how to do it.
Any suggestions, please?
There are many plugins out there that will help you achieve what you are looking for. However it is also very possible to implement this functionality yourself. I wouldn't be surprised either if some of the plugins you come across also use similar code.
The following is my attempt to demystify tooltip/popup plugin behaviour.
You could wrap the desired word in a <span> element and give it a .hover class.
<div>
This is some text with a <span class="hover">special</span>
word that has hovercraft capabilities.
</div>
Your jQuery (ver 1.7+) would look something like this :
$(".hover").on('mouseenter',function(){
// The popup must be shown here (mouse is over element).
}).on('mouseleave',function(){
// The popup must be hidden here (mouse has left element).
});
I should add here that I am using a great and yet sometimes forgotten capability of jQuery called "chaining". The on() function actually returns the object that it was called on, in this case $(".hover"), so if I want to call another function on that object I can just add it at the end. Another example of this would be:
$("#myElement").text("An error has occurred!").css("color","#FF0000");
That line of code sets the text of #myElement and also turns its colour red.
With regard to your actual popup, I would suggest one of two things:
Have an element at the bottom of your markup (written last, so it is highest in the stacking order, or manually give it the highest z-index).
Alternatively, have the popup in a hidden element right next to the element that is supposed to trigger it.
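If the word has to be wrapped in the span automatically rather than by hand, a small string helper is enough. The following is a rough sketch: wrapWord is a hypothetical name, and it assumes the input is plain text without existing markup.

```javascript
// Hypothetical helper: wraps whole-word occurrences of `word` in
// <span class="hover">...</span>. Assumes `text` contains no markup.
function wrapWord(text, word, className = "hover") {
  // Escape regex metacharacters so the word is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`\\b${escaped}\\b`, "g");
  // A replacer function avoids special "$" sequences in the replacement.
  return text.replace(pattern, (match) => `<span class="${className}">${match}</span>`);
}

const wrapped = wrapWord("This is some text with a special word.", "special");
console.log(wrapped);
```

The resulting string can then be placed into the page, after which the mouseenter and mouseleave handlers shown earlier apply to the new span.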
What you're after sounds like a 'tool tip'.
The solutions using jQuery are somewhat involved - so I'll just direct you to external resources.
Possible solutions:
ToolTip Plugin for jQuery
Build a Better Tooltip with jQuery Awesomeness

asp.net dynamically add usercontrols and position them

My issue is that I have a designer who will create a custom aspx page but without any .NET controls. I need a way of adding the controls dynamically. So far the only types of controls will be textboxes and a button, but there are 30 variations of what the textboxes can be (name, phone #, email, etc.). Also, the textboxes may or may not need to be required. Once the textboxes are added, the form will be submitted to a database.
My first thought was to have the designer place something like [name] and then replace that with a user control that has a name textbox and a required field validator. To determine whether the validator should be enabled, I was thinking that the placeholder could look like this: [name;val] or [name;noval]. I could either replace the placeholders dynamically in code or set up a tool where the user pastes their HTML into a textbox and clicks a button, which then spits out the code needed to create the aspx page.
I'm sure there must be a better way to do this, but it's a fairly unique problem, so I haven't been able to find any alternatives. Does anyone have any ideas?
Thanks,
Kirk
If your designer gives you HTML pages, just create a new website and copy and paste all the HTML pages, together with the image folders and everything else, into your project. Then, for every HTML page, create an aspx page with the same name. Copy the markup between the HTML page's <head> tags into the aspx page's <head>, and copy the markup between its <body> tags into the body of the aspx page.
Now you have your aspx page, exactly the same as html page.
Sounds like an attempt to over-engineer a solution to what should be a non-issue.
As @Alessandro mentioned in a comment above, why can't the designer provide you with pages that have the control markup? As it stands right now, the designer isn't providing you with "a custom aspx" so much as "a custom html page." If the designer is promising ASPX but delivering only HTML, that's a misinterpretation somewhere in the business requirements.
However, even if the designer is rightfully providing only HTML, there shouldn't be a problem with that. At worst, you can set each element you need on the server to runat="server" to access them on the server-side. Or, probably better, would be to simply replace them with the ASPX control markup for the relevant controls.
Write a simple parser that recognizes the [...] tags and replaces them with the corresponding controls. It's pretty easy to do, and I've often done this. The tag I usually use is $$(..); but that doesn't matter as long as your parser knows your tags.
Such a parser consists of a simple state machine that can be in one of two states: text mode or tag mode. Loop through the whole page text, character by character. As long as you're in text mode, you keep appending each character to a temporary buffer. As soon as you get into tag mode, you create a LiteralControl with the content of the temporary buffer, add it to the bottom of your control tree, and empty the buffer.
You then keep adding each character to the buffer, but when you hit text mode again, you analyze the content of the buffer and create the correct control; this could be a simple switch statement. Add the control to the bottom of your control tree and keep looping through the rest of the characters until you reach the end, switching back and forth between text mode and tag mode and adding LiteralControls and concrete controls as you go.
Simple example of such a parser... written in notepad in 4 minutes, but you should get the idea.
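// Assumes buffer is a StringBuilder, and Text and Tag are the two parser modes.
foreach (var c in text)
{
    if (c == '[' && mode == Text)
    {
        // Entering a tag: flush the plain text collected so far.
        mode = Tag;
        Controls.Add(new LiteralControl(buffer.ToString()));
        buffer.Clear();
    }
    buffer.Append(c);
    if (c == ']' && mode == Tag)
    {
        // Leaving a tag: the buffer now holds the whole tag, e.g. "[name]".
        mode = Text;
        switch (buffer.ToString())
        {
            case "[name]": Controls.Add(new NameControl()); break;
            // ... the rest of possible tags
        }
        buffer.Clear();
    }
}
// Flush whatever text follows the last tag.
Controls.Add(new LiteralControl(buffer.ToString()));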
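The same two-state idea can be tried out quickly in JavaScript before wiring it into the control tree. This is only a sketch under assumptions: tokenize is a hypothetical name, the placeholder format is the [name;val] / [name;noval] form from the question, and a missing flag is treated as noval.

```javascript
// Splits page text into literal chunks and placeholder tags such as [name;val].
// Two states only: inside a tag (inTag === true) or in plain text.
function tokenize(text) {
  const tokens = [];
  let buffer = "";
  let inTag = false;
  for (const c of text) {
    if (c === "[" && !inTag) {
      // Entering tag mode: flush the literal text collected so far.
      if (buffer) tokens.push({ type: "text", value: buffer });
      buffer = "";
      inTag = true;
    } else if (c === "]" && inTag) {
      // Leaving tag mode: the buffer holds e.g. "name;val".
      const [field, flag = "noval"] = buffer.split(";");
      tokens.push({ type: "tag", field, required: flag === "val" });
      buffer = "";
      inTag = false;
    } else {
      buffer += c;
    }
  }
  // An unclosed "[" at the end is treated as ordinary text.
  if (inTag) buffer = "[" + buffer;
  if (buffer) tokens.push({ type: "text", value: buffer });
  return tokens;
}

console.log(tokenize("<p>Name: [name;val]</p><p>Phone: [phone;noval]</p>"));
```

Each text token would become a LiteralControl and each tag token a concrete control, with required deciding whether the validator is enabled.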

Click Gmail Refresh button

I am writing a simple personal app that has a browser control, and I want it to automatically "Refresh" Gmail to check it more often than it does by default. There are monkey scripts that do this, but I'm trying to add my personal style to it.
Anyhow, I've looked around and found everything except how to do it in C# using the browser control.
I found this:
// Link the ID from the web form to the Button
var theButton = webBrowser_Gmail.Document.GetElementById("Refresh");
// Now do the actual click.
theButton.InvokeMember("click");
But it comes back with null in 'theButton' so it doesn't invoke anything.
Anyone have any suggestions?
It's been a while since I've used JavaScript, but given the other answers and comments saying that there is no real ID associated with the element, could you do something like the following:
Search all divs with a Role attribute equal to 'Button' and an InnerHtml equal to 'Refresh'.
Once the correct InnerHtml is found, get the Element.
Invoke the click on the found Element.
Again, this may be blowing smoke, but thought I'd throw it out there.
edit: Just realized you are doing this with C# and a browser control; however, the concept would still be the same.
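In JavaScript terms, the search described above might look roughly like this. It is a sketch only: findButtonByLabel is a hypothetical name, and the assumption that Gmail marks the button with role="button" and the inner HTML "Refresh" comes from this answer, not from inspecting the page. The stand-in objects exist only so the snippet runs outside a browser.

```javascript
// Returns the first element-like object whose role attribute is "button" and
// whose trimmed innerHTML equals `label`, or null if there is none.
function findButtonByLabel(elements, label) {
  for (const el of elements) {
    if (el.getAttribute("role") === "button" && el.innerHTML.trim() === label) {
      return el;
    }
  }
  return null;
}

// In a browser this might be called as:
//   const btn = findButtonByLabel(document.querySelectorAll("div"), "Refresh");
//   if (btn) btn.click();

// Stand-in objects (not real DOM nodes) for trying the function out.
const fakeDiv = (role, html) => ({
  innerHTML: html,
  getAttribute: (name) => (name === "role" ? role : null),
  click() { this.clicked = true; },
});
const divs = [fakeDiv(null, "Inbox"), fakeDiv("button", " Refresh "), fakeDiv("button", "Compose")];
const found = findButtonByLabel(divs, "Refresh");
if (found) found.click();
console.log(found !== null && found.clicked === true);
```

With the C# browser control, the equivalent would be to loop over the div elements of the document and call InvokeMember("click") on the one that matches.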
The best suggestion I could give you at this point involves an existing API that is used for .NET web browser based automation:
http://watin.org/
Since the div tag with the desired button really only seems to identify itself by its class name, you could use the Find.BySelector("") code included with the most recent version of WatiN.

How to handle Html inputs in the TextBox

I have a requirement that the user can enter HTML tags in an ASP.NET TextBox. The value of the textbox is saved in the database, and then we need to show what the user entered on some other page. To allow this, I set ValidateRequest="false" in the Page directive.
Now the problem is that when the user enters something like:
<script> window.location = 'http://www.xyz.com'; </script>
the value is saved in the database, but when I show it on some other page, I get redirected to "http://www.xyz.com", which is to be expected, since the browser executes the script. But I need to find a solution, because I need to show exactly what the user entered.
I am thinking of Server.HtmlEncode. Can you point me in the right direction for my requirement?
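Always, always, always encode the input from the user, and only then persist it in your database. You can achieve this easily with
Server.HtmlEncode(userinput)
Now, when it comes time to display the content, write the encoded value to the page as it is, so that the browser shows the tags as text. Decode the stored value only where you need the raw text back (for example, to refill the edit box); writing the decoded value straight into the page would run the script again:
Server.HtmlDecode(userinput)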
You need to encode all of the input before you output it back to the user and you could consider implementing a whitelist based approach to what kind of HTML you allow a user to submit.
I suggest a whitelist approach because it's much easier to write rules to allow p,br,em,strong,a (for example) rather than to try and identify every kind of malicious input and blacklist them.
Possibly consider using something like MarkDown (as used on StackOverflow) instead of allowing plain HTML?
You need to escape some characters when generating the HTML: '<' -> &lt;, '>' -> &gt;, '&' -> &amp;. This way, exactly what the user entered is displayed; otherwise the HTML parser might recognize HTML tags and execute them.
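The escaping itself is only a handful of replacements. Here is a minimal JavaScript sketch of the idea (escapeHtml is a hypothetical helper; on the server, Server.HtmlEncode does this job). It also escapes quotes, which matters when the value ends up inside an attribute.

```javascript
// Replaces the characters that have a special meaning in HTML with entities.
// "&" must be handled first, or the entities produced below would be
// escaped a second time.
function escapeHtml(input) {
  return input
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const entered = "<script> window.location = 'http://www.xyz.com'; </script>";
console.log(escapeHtml(entered));
```

Written into a page, the escaped string is displayed exactly as the user typed it and is not executed.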
Have you tried using HTMLEncode on all of your inputs? I personally use the Telerik RadEditor that escapes the characters before submitting them... that way the system doesn't barf on exceptions.
Here's an SO question along the same lines.
You should have a look at the HTML tags you do not want to support because of vulnerabilities as the one you described, such as
script
img
iframe
applet
object
embed
form, button, input
and replace the leading "<" with "&lt;".
Also replace </body> and </html>.
HTML editors such as CKEditor allow you to require well-formed XHTML, and define tags to be excluded from input.
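The replacement of the leading "<" for the tags listed above could be sketched as follows. The names BLOCKED_TAGS and neutralizeTags are made up for illustration, and a blacklist like this is easy to get around (event-handler attributes on allowed tags are not covered, for example), so treat it as a demonstration of the idea rather than a complete filter.

```javascript
// Tags whose leading "<" should be turned into "&lt;" (from the list above).
const BLOCKED_TAGS = ["script", "img", "iframe", "applet", "object", "embed",
                      "form", "button", "input", "body", "html"];

// Hypothetical helper: neutralizes opening and closing tags of blocked elements.
function neutralizeTags(html, blocked = BLOCKED_TAGS) {
  // Matches a "<" that is followed by an optional "/" and a blocked tag name
  // as a whole word; only the "<" itself is replaced.
  const pattern = new RegExp(`<(?=\\s*/?\\s*(?:${blocked.join("|")})\\b)`, "gi");
  return html.replace(pattern, "&lt;");
}

console.log(neutralizeTags("<p>Hi</p><script>alert(1)</script>"));
```

Allowed tags such as p pass through unchanged, while the blocked ones are shown as text instead of being interpreted by the browser.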
