I have a website that lists product part numbers. Is it possible to search for one of these part numbers through another website's search box? That is, can I insert the string into the other site's search box and run the search entirely from code-behind?
Assuming you're not the owner of the other website and you don't know how it works:
When you click the "Search" button on the other website, it sends a request to the web server: typically a GET or a POST, possibly issued through an Ajax call.
It depends on the target website, but the best approach is probably to capture the details of that GET or POST (very easy with the Network tab in Firefox or Google Chrome) and to issue the same request yourself from your own application's code, using an HttpWebRequest for example.
This approach is easier and more robust than trying to mimic the "fill the textbox, click the button and read the response" sequence, although that is possible too (with web testing frameworks such as WatiN or Selenium).
Concretely :
Go to the website you want to dig in
Open the Network tab of your browser's developer tools (Google Chrome, Firefox, or whatever you use) and make sure recording is enabled
Fill the search textbox
Click the search button
Look at the request and response that appear in the Network tab: they show you how to send a "search" query and the format of the answer that comes back.
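As a rough illustration of replaying the captured request from code, here is a minimal sketch. It is written in Python for brevity; the same idea applies to HttpWebRequest in C#. The URL https://www.example.com/search and the parameter name q are placeholders I made up, so substitute whatever the Network tab shows for the real site. It also assumes the search is a plain GET.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical values: replace them with what the Network tab actually shows.
SEARCH_URL = "https://www.example.com/search"
QUERY_PARAM = "q"


def build_search_request(part_number: str) -> Request:
    """Build the GET request the browser would send when Search is clicked."""
    query = urlencode({QUERY_PARAM: part_number})
    # Some sites reject requests that lack a browser-like User-Agent header.
    return Request(f"{SEARCH_URL}?{query}", headers={"User-Agent": "Mozilla/5.0"})


def search(part_number: str) -> str:
    """Send the request and return the response body as text."""
    with urlopen(build_search_request(part_number), timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

If the site uses a POST instead, the same parameters would go into the request body rather than the query string.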
This is general advice rather than a definitive answer, because the question doesn't give enough detail to be more specific, but I hope it helps!
Related
I have a scenario where I would like to automate programmatically the following process:
Currently, I have to manually
Navigate to a webpage
Enter some text (an email) in a certain field on the webpage
Press the 'Search' button, which generates a new page containing a Table with the results on it.
Manually scroll through the generated results table and extract 4 pieces of information.
Is there a way for me to do this from a Desktop WPF App using C# ?
I am aware there is a WebClient type that can download a string, presumably of the content of the webpage, but I don't see how that would help me.
My knowledge of web based stuff is pretty non-existent so I am quite lost how to go about this, or even if this is possible.
I think a web driver is what you're looking for. I would suggest Selenium: it lets you navigate to sites and send input or clicks to specific elements on the page.
Well, I'll write the algorithm for you, but you also need to do some homework.
Use WebClient to get the HTML page containing the form you want to auto-fill and submit
Use a regex to extract the action attribute of the form you want to auto-submit. That gives you the URL to send your next request to.
Since you know the fields in that form, create a class corresponding to those fields; let's call the class AutoClass
Create a new instance of AutoClass and assign the values you want to auto-fill
Use WebClient to send your new request to the URL you extracted from the form, attaching the object you want to send to the server, either through serialization or any other method.
Send the request, wait for the response, then take further action
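The steps above can be sketched roughly as follows. This is a Python illustration, not the C# WebClient code itself, and it swaps the regex for the standard library's html.parser, which copes better with attribute order and quoting. FormParser and build_form_request are names I made up; the sketch only looks at the first form on the page and only at its input elements.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class FormParser(HTMLParser):
    """Collect the action, method and input defaults of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = "get"
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Keep defaults such as hidden tokens; they often must be sent back.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_form_request(page_url, html, values):
    """Return a Request that submits the page's first form with `values`."""
    parser = FormParser()
    parser.feed(html)
    if parser.action is None:
        raise ValueError("no <form> found in the page")
    fields = {**parser.fields, **values}       # our values override defaults
    target = urljoin(page_url, parser.action)  # resolve a relative action URL
    encoded = urlencode(fields)
    if parser.method == "post":
        return Request(target, data=encoded.encode("ascii"))
    return Request(f"{target}?{encoded}")
```

The returned Request can then be passed to urllib.request.urlopen to actually submit the form.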
Either use a web driver like Puppeteer (Selenium is kind of dead, in my opinion) or make plain HTTP requests yourself (if you don't get stopped by bot checks). I suspect you're looking for the latter, because there is no reason to use a web driver here when a lighter method like HTTP requests will do.
You can use RestSharp or the built in libraries if you want. Here is a popular thread of the ways to send requests with the libraries built in to C#.
To figure out what you need to send, use a tool like Fiddler or Chrome DevTools (specifically the Network tab) to watch the requests the browser makes when you perform the search by hand, then reproduce them in code.
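Once the HTTP response is in hand, the remaining step from the question is pulling values out of the results table. Here is a minimal sketch of that step, in Python for illustration (in C# an HTML parsing library would play the same role). TableParser and extract_rows are made-up names, and the sketch assumes reasonably well-formed markup where every cell has a closing tag.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every cell of every table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in `html` as lists of cell strings."""
    parser = TableParser()
    parser.feed(html)
    return parser.rows
```

The four pieces of information can then be picked from the returned rows by column index.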
I have read this documentation, but it doesn't solve my case:
https://learn.microsoft.com/en-us/previous-versions/dotnet/articles/ms972974(v=msdn.10)
For example, I want to hide this URL
http://yoursite.com/info/dispEmployeeInfo.aspx?EmpID=459-099&type=summary
to be only
http://yoursite.com
Note: I've read this question How to hide asp url in url bar? but unfortunately the answers are not satisfying enough
There is no simple way to hide the address shown in the address bar. The browser always shows the current address of the top-most document.
If you don't want that address to show, you have a few options:
Use a client-side rendering framework (React, Angular, etc) that doesn't rely on URLs to decide what's currently being shown. This will use a lot of REST-like calls to view/update data.
Request the page via a form post. You can send the ID with the POST data, rather than it being in the URL.
Embed the page whose URL you want to hide using an <iframe> element. The URL requested from the server will still contain the query string parameters; it just won't be immediately visible to the user.
You can use the JavaScript window.open function to open the page in a full pop-up window, and use the option location=0 to hide the location bar. Note that some modern browsers will ignore this and will display the location anyway.
Using the URL Rewriting module, you can't hide that identifier. You can only build it into the URL in a way that's prettier to look at, e.g. https://yoursite.com/employee/459-099/summary.
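To make the last point concrete: a rewrite rule only maps the prettier URL back to the real one, so the identifier is still in the address. The sketch below shows that mapping in Python purely as an illustration; it is not the configuration syntax of the ASP.NET URL rewriting module. The /employee/... pattern follows the example above, and the rewrite function is a made-up name.

```python
import re
from urllib.parse import urlencode

# Pretty form: /employee/<EmpID>/<type>, e.g. /employee/459-099/summary
PRETTY = re.compile(r"^/employee/(?P<emp_id>[\w-]+)/(?P<view>\w+)/?$")


def rewrite(path: str) -> str:
    """Map a pretty path to the real .aspx URL; leave other paths untouched."""
    match = PRETTY.match(path)
    if match is None:
        return path
    query = urlencode({"EmpID": match["emp_id"], "type": match["view"]})
    return f"/info/dispEmployeeInfo.aspx?{query}"
```

The employee ID is visible in both forms, which is why rewriting alone cannot hide it.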
Context:
I am trying to write a "bot" to run queries against this site. One of the problems is that the results are shown in a "frame" that does not appear in the HTML; it looks to me like a Java applet running in the "Java Virtual Machine", or something similar.
Problem:
The main problem (leaving captcha recognition aside, since we've done that before) is that I can't find a way to retrieve/extract the result data of the query from the website.
I've gone through the HTML manually trying to find at least a clue about what's going on, but it seems I don't have enough expertise to figure out how the results are being displayed.
Fiddler shows that the requests made by the virtual machine return some sort of "encrypted" information that I don't know how to "decrypt".
Fiddler Request (jar) : GET http://www.brasiltelecom.com.br/portal/pf/102online/Applet102PConv.jar
Fiddler Request with encrypted return : GET http://www.brasiltelecom.com.br/portal/Consultar102OnlineServlet?nome=9E10EB3AEF707099&endr=B48A41A90FCA933A&bair=B48A41A90FCA933A&locl=4A8DEF5F7E4C714B&tipo=1&secure=334265
Parameters of the second request translated:
nome = name
endr = Address
bair = neighborhood
locl = location
tipo = always 1
secure = captcha
Just for a better understanding of the context.
Tools:
Currently, I am using Visual Studio 2010 (IDE), Fiddler (web debugger), and internal libraries (DLLs) to make the process easier.
Questions:
How can I extract the information displayed on the screen programmatically, using a C# application?
Is there any way I can "decrypt" the information returned by the web request, or at least find the "key" or "method" used by the service, as a first step towards decrypting it?
Thanks in advance. I hope I've made myself clear; let me know if there is anything I can do to improve the question.
Peace !
I have to automate a file download from a website (similar to, let's say, yahoomail.com). To reach the page that has the download link, I have to log in, move from page to page supplying parameters such as dates, and finally click the download link.
I am thinking of three approaches:
Using WatiN and developing a Windows service that periodically runs some WatiN code to traverse the pages and download the file.
Using AutoIt (I don't know much about it)
Using a simple HTML parsing technique (this raises several questions, e.g. how do I maintain a session after logging in, and how do I log out afterwards?)
I use Scrapy (scrapy.org), a Python library. It's quite good, actually. It's easy to write spiders, and it's very extensive in its functionality. Scraping sites after login is supported by the package.
Here is an example of a spider that would crawl a site after authentication.
import scrapy
from scrapy.http import FormRequest


class LoginSpider(scrapy.Spider):
    name = 'example.com'
    start_urls = ['http://www.example.com/users/login.php']

    def parse(self, response):
        # fill in and submit the login form found on the page
        return [FormRequest.from_response(response,
                    formdata={'username': 'john', 'password': 'secret'},
                    callback=self.after_login)]

    def after_login(self, response):
        # check that login succeeded before going on
        if "authentication failed" in response.text:
            self.logger.error("Login failed")
            return
        # continue scraping with authenticated session...
I used mechanize for Python with success for a few things. It's easy to use and supports HTTP authentication, form handling, cookies, automatic HTTP redirection (30X), ... Basically the only thing missing is JavaScript, but if you need to rely on JS you're pretty much screwed anyway.
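On the question of how a session is maintained after login: it comes down to sending back the cookies the server set. The sketch below shows the idea with Python's standard library instead of mechanize, which handles the same thing for you. The field names username and password are assumptions, as are the helper names make_session, login_request and download.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener


def make_session():
    """Return an opener that stores cookies and resends them on every request."""
    jar = CookieJar()
    return build_opener(HTTPCookieProcessor(jar)), jar


def login_request(login_url, username, password):
    """Build the POST that submits the login form (field names are assumed)."""
    form = urlencode({"username": username, "password": password})
    return Request(login_url, data=form.encode("ascii"))


def download(opener, url, target_path):
    """Fetch `url` with the session's cookies and save the body to a file."""
    with opener.open(url, timeout=30) as response, open(target_path, "wb") as out:
        out.write(response.read())


# Typical use (not run here, since it needs a real site):
#   opener, jar = make_session()
#   opener.open(login_request("http://www.example.com/users/login.php",
#                             "john", "secret"))
#   download(opener, "http://www.example.com/files/report.csv", "report.csv")
```

Logging out is then just one more request through the same opener, to whatever URL the site's logout link points at.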
Try a Selenium script, automated with Selenium Remote Control.
Free Download Manager is great for crawling, and you could use wget.
On my website users can post comments on a document. Now I want to send an RSS feed to the webmasters when a comment is posted. I want the webmaster to be notified by a small pop-up in the right corner of the page. So this is what's happening:
User adds comment
System checks whether the webmaster is logged in
If the webmaster is logged in, show a pop-up in the right corner with the title of the comment in it.
How can I accomplish this?
Set up a JavaScript timer to call a web service periodically (every 5 seconds?) if the user is a webmaster. The web service determines whether a new comment has been added since the last check. It returns nothing if there is no new comment, or some information about the comment if there is one.
If the webservice returns a comment, put that information into a div tag that you have created on your page and make it visible. If you are sure the webmaster is using a modern browser, you can use position:fixed to put this div tag in the upper right corner. If not, you will have to use some javascript to accomplish this.
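The server side of that polling call can be very small. Here is a minimal sketch of the "anything new since last time?" check, written in Python for illustration. The Comment class, the latest_comment_since function and the idea that the client sends the ID of the last comment it has seen are all assumptions on my part; a timestamp would work just as well.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Comment:
    id: int      # assumed to increase with every new comment
    title: str


def latest_comment_since(comments: List[Comment], last_seen_id: int) -> Optional[dict]:
    """Return info about the newest unseen comment, or None if there is none."""
    newer = [c for c in comments if c.id > last_seen_id]
    if not newer:
        return None  # the timer callback has nothing to display
    newest = max(newer, key=lambda c: c.id)
    return {"id": newest.id, "title": newest.title, "unread": len(newer)}
```

The JavaScript timer would remember the returned id and send it back on the next poll, so each comment triggers the pop-up only once.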
Unless you're using a Comet-style service to push notifications to the webmaster's browser, you're going to need a page that polls for new notifications at a predefined interval. You can then make an AJAX call to the service and render the response on a web page that only the webmaster has access to.
If you're interested in comet (services that can push data to the connected client), you can get a start at Wikipedia:
Comet (programming)