I've been trying to see if I could get timetable data of a school website, and make a little application of it. At the moment this is what I have :
string userInput = "/*My username will be here*/";
string passInput = "/*My password will be here */";
string formUrl = "https://portal.gc.ac.nz/student/index.php/process-login";
string formParams = string.Format("username={0}&password={1}", userInput, passInput);
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
string pageSource;
string getUrl = "https://portal.gc.ac.nz/student/index.php/timetable";
WebRequest getRequest = WebRequest.Create(getUrl);
getRequest.Headers.Add("Cookie", cookieHeader);
WebResponse getResponse = getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
pageSource = sr.ReadToEnd();
}
I couldn't find a way to check if above code works, however my question is:
How can you access the data(texts) you want from the page? I want to get the subject names. Part of the html looks like this :
There are a few ways to do this: one would be regexp matching and taking the contents of the tags and another would be to just use HtmlAgilityPack library.
If you don't need to do it in C# I would strongly recommend a different language like Python or Perl. It seems to me that you are trying to scrape the data and in this case I strongly recommend to use the Scrapy framework from Python if possible. It's the best tool I encountered for scraping and you can use XPath to get your data easily. Here is the link to Scrapy's website.
Related
I'm trying to do a WebRequest to a site from a Windows Phone Application.
But is vry important for me to also get the response from the server.
Here is my code:
Uri requestUri = new Uri(string.Format("http://localhost:8099/hello/{0}", metodo));
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(requestUri);
httpWebRequest.ContentType = "application/xml; charset=utf-8";
httpWebRequest.Method = "POST";
using (var stream = await Task.Factory.FromAsync<Stream>(httpWebRequest.BeginGetRequestStream,
httpWebRequest.EndGetRequestStream, null))
{
string xml = "<string xmlns=\"http://schemas.microsoft.com/2003/10/Serialization/\">Ahri</string>";
byte[] xmlAsBytes = Encoding.UTF8.GetBytes(xml);
await stream.WriteAsync(xmlAsBytes, 0, xmlAsBytes.Length);
}
Unfortunatelly, I have no idea of how I could get the response from the server.
Does anyone have an idea?
Thanks in advance.
Thanks to #max I found the solution and wanted to share it above.
Here is how my code looks like:
string xml = "<string xmlns=\"http://schemas.microsoft.com/2003/10/Serialization/\">Claor</string>";
Uri requestUri = new Uri(string.Format("http://localhost:8099/hello/{0}", metodo));
string responseFromServer = "no response";
HttpWebRequest httpWebRequest = HttpWebRequest.Create(requestUri) as HttpWebRequest;
httpWebRequest.ContentType = "application/xml; charset=utf-8";
httpWebRequest.Method = "POST";
using (Stream requestStream = await httpWebRequest.GetRequestStreamAsync())
{
byte[] xmlAsBytes = Encoding.UTF8.GetBytes(xml);
await requestStream.WriteAsync(xmlAsBytes, 0, xmlAsBytes.Length);
}
WebResponse webResponse = await httpWebRequest.GetResponseAsync();
using (var reader = new StreamReader(webResponse.GetResponseStream()))
{
responseFromServer = reader.ReadToEnd();
}
I hope it will help someone in the future.
This is very common question for people new in windows phone app
development. There are several sites which gives tutorials for the
same but I would want to give small answer here.
In windows phone 8 xaml/runtime you can do it by using HttpWebRequest or a WebClient.
Basically WebClient is a wraper around HttpWebRequest.
If you have a small request to make then user HttpWebRequest. It goes like this
HttpWebRequest request = HttpWebRequest.Create(requestURI) as HttpWebRequest;
WebResponse response = await request.GetResponseAsync();
using (var reader = new StreamReader(response.GetResponseStream()))
{
string responseContent = reader.ReadToEnd();
// Do anything with you content. Convert it to xml, json or anything.
}
Although this is a get request and i see that you want to do a post request, you have to modify a few steps to achieve that.
Visit this place for post request.
If you want windows phone tutorials, you can go here. He writes awesome tuts.
I'm working on a project where I have to send product information via HTTP POST in XML string to a web server. It turns out that certain product names may have a % sign in their name, such as ".05% Topical Cream". Whenever I attempt to send XML data that contained a % sign in the product name, I get an error apparently because when encoding the XML string data the percent sign caused the data to become malformed.
How can I encode and send the XML string data with % sign in product name safely?
XML Data:
<node>
<product>
<BrandName>amlodipine besylate (bulk) 100 % Powder</BrandName>
</product>
</node>
Web request code:
public string MakeWebServerRequest(string url, string data)
{
var parms = System.Web.HttpUtility.UrlEncode(data);
byte[] bytes = Encoding.UTF8.GetBytes("xml=" + parms);
string webResponse = String.Empty;
try
{
System.Web.HttpUtility.UrlEncode(data);
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
req.ContentLength = bytes.Length;
using (Stream reqStream = req.GetRequestStream())
{
reqStream.WriteTimeout = 3000;
reqStream.Write(bytes, 0, bytes.Length);
reqStream.Close();
}
using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
{
using (StreamReader rdr = new StreamReader(response.GetResponseStream()))
{
webResponse = rdr.ReadToEnd();
rdr.Close();
}
response.Close();
}
}
Should I be creating the web request differently? What can I do to resolve, while maintaining the product name?
Corrected - and working now. Thanks
Thanks
You need to construct request propely. application/x-www-form-urlencoded means that each parameter is Url-encoded. In your case xml parameter must have value correctly encoded, not just blindly concatenated. Below is sample that should give you stared... hopefully you'll be able to avoid string concateneation to construct XML (and insane way of constructing string constant with queotes you have in original code):
var parameterValue = System.Web.HttpUtility.UrlEncode("<xml>" + data);
byte[] bytes = Encoding.UTF8.GetBytes("xml=" + parameterValue);
There are also plenty of samples how to correctly construct requests of this kind. I.e. C# web request with POST encoding question
I'm trying to read the reponse from a webserver using httpwebrequests in C#.
I use the following code:
UriBuilder urib = new UriBuilder();
urib.Host = "wikipedia.com";
HttpWebRequest req = WebRequest.CreateHttp(urib.Uri);
req.KeepAlive = false;
req.Host = "wikipedia.com/";
req.Method = "GET";
HttpWebResponse response = (HttpWebResponse) req.GetResponse();
byte[] buffer = new byte[response.ContentLength];
System.IO.Stream stream = response.GetResponseStream();
stream.Read(buffer, 0, buffer.Length);
Console.WriteLine(System.Text.Encoding.ASCII.GetString(buffer, 0, buffer.Length));
The code does indeed retrieve the correct amount of data (I compared the contentlength used to create the buffer, with the length of the console output, they're the same.
My problem is that the last 80% or so of the response is blank chars. They're all 0x00.
I tested this with several pages, including wikipedia.com and it just cuts off mid-file for some reason.
Have I misunderstood/misused the way to use webrequests or can anyone spot an error here?
Try to use this method:
public static String GetResponseString(Uri url, CookieContainer cc)
{
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
request.Method = WebRequestMethods.Http.Get;
request.CookieContainer = cc;
request.AutomaticDecompression = DecompressionMethods.GZip;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
String responseString = reader.ReadToEnd();
response.Close();
return responseString;
}
There are a couple of issues with your code:
Your trying to read the entire response in one go using Stream.Read - that's not what it was designed for. This should be used for more optimal reading e.g. 4KB chunks.
Your reading a HTML response as ASCII encoding - are you sure the page doesn't contain any Unicode characters? I would stick to UTF-8 encoding to be on the safe side (or alternatively read the Content-Type header in the response).
When reading characters from a byte stream (which is what your response is essentially) the recommended approach is to use StreamReader. More specifically, if you want to read the entire stream in one go then use StreamReader.ReadToEnd.
Your code could be shortened to:
HttpWebRequest req = WebRequest.CreateHttp(new Uri("http://wikipedia.org"));
req.Method = WebRequestMethods.Http.Get;
using (var response = (HttpWebResponse)req.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
Console.WriteLine(reader.ReadToEnd());
}
I'm having issues to send POST data that includes characteres like "+" in the password field,
string postData = String.Format("username={0}&password={1}", "anyname", "+13Gt2");
I'm using HttpWebRequest and a webbrowser to see the results, and when I try to log in from my C# WinForms using HttpWebRequest to POST data to the website, it tells me that password is incorrect. (in the source code[richTexbox1] and the webBrowser1). Trying it with another account of mine, that does not contain '+' character, it lets me log in correctly (using array of bytes and writing it to the stream)
byte[] byteArray = Encoding.ASCII.GetBytes(postData); //get the data
request.Method = "POST";
request.Accept = "text/html";
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = byteArray.Length;
Stream newStream = request.GetRequestStream(); //open connection
newStream.Write(byteArray, 0, byteArray.Length); // Send the data.
newStream.Close(); //this works well if user does not includes symbols
From this Question I found that HttpUtility.UrlEncode() is the solution to escape illegal characters, but I can't find out how to use it correctly, my question is, after url-encoding my POST data with urlEncode() how do I send the data to my request correctly?
This is how I've been trying for HOURS to make it work, but no luck,
First method
string urlEncoded = HttpUtility.UrlEncode(postData, ASCIIEncoding.ASCII);
//request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = urlEncoded.Length;
StreamWriter wr = new StreamWriter(request.GetRequestStream(),ASCIIEncoding.ASCII);
wr.Write(urlEncoded); //server returns for wrong password.
wr.Close();
Second method
byte[] urlEncodedArray = HttpUtility.UrlEncodeToBytes(postData,ASCIIEncoding.ASCII);
Stream newStream = request.GetRequestStream(); //open connection
newStream.Write(urlEncodedArray, 0, urlEncodedArray.Length); // Send the data.
newStream.Close(); //The server tells me the same thing..
I think I'm doing wrong on how the url-encoded must be sent to the request, I really ask for some help please, I searched through google and couldn't find more info about how to send encoded url to an HttpWebRequest.. I appreciate your time and attention, hope you can help me. Thank you.
I found my answer using Uri.EscapeDataString it exactly solved my problem, but somehow I couldn't do it with HttpUtility.UrlEncode. Stackoverflowing around, I found this question that is about urlEncode method, in msdn it documentation tells that:
Encodes a URL string.
But I know now that is wrong if used to encode POST data (#Marvin, #Polity, thanks for the correction). After discarding it, I tried the following:
string postData = String.Format("username={0}&password={1}", "anyname", Uri.EscapeDataString("+13Gt2"));
The POST data is converted into:
// **Output
string username = anyname;
string password = %2B13Gt2;
Uri.EscapeDataString in msdn says the following:
Converts a string to its escaped representation.
I think this what I was looking for, when I tried the above, I could POST correctly whenever there's data including the '+' characters in the formdata, but somehow there's much to learn about them.
This link is really helpful.
http://blogs.msdn.com/b/yangxind/archive/2006/11/09/don-t-use-net-system-uri-unescapedatastring-in-url-decoding.aspx
Thanks a lot for the answers and your time, I appreciate it very much. Regards mates.
I'd recommend you a WebClient. Will shorten your code and take care of encoding and stuff:
using (var client = new WebClient())
{
var values = new NameValueCollection
{
{ "username", "anyname" },
{ "password", "+13Gt2" },
};
var url = "http://foo.com";
var result = client.UploadValues(url, values);
}
the postdata you are sending should NOT be URL encoded! it's formdata, not the URL
string url = #"http://localhost/WebApp/AddService.asmx/Add";
string postData = "x=6&y=8";
WebRequest req = WebRequest.Create(url);
HttpWebRequest httpReq = (HttpWebRequest)req;
httpReq.Method = WebRequestMethods.Http.Post;
httpReq.ContentType = "application/x-www-form-urlencoded";
Stream s = httpReq.GetRequestStream();
StreamWriter sw = new StreamWriter(s,Encoding.ASCII);
sw.Write(postData);
sw.Close();
HttpWebResponse httpResp =
(HttpWebResponse)httpReq.GetResponse();
s = httpResp.GetResponseStream();
StreamReader sr = new StreamReader(s, Encoding.ASCII);
Console.WriteLine(sr.ReadToEnd());
This uses the System.Text.Encoding.ASCII to encode the postdata.
Hope this helps,
I've got a problem with creating an HTTP post request in .NET. When I do this request in ruby it does work.
When doing the request in .NET I get following error:
<h1>FOXISAPI call failed</h1><p><b>Progid is:</b> carejobs.carejobs
<p><b>Method is:</b> importvacature/
<p><b>Parameters are:</b>
<p><b> parameters are:</b> vacature.deelnemernr=478
</b><p><b>GetIDsOfNames failed with err code 80020006: Unknown name.
</b>
Does anyone knows how to fix this?
Ruby:
require 'net/http'
url = URI.parse('http://www.carejobs.be/scripts/foxisapi.dll/carejobs.carejobs.importvacature')
post_args = {
'vacature.deelnemernr' => '478',
}
resp, data = Net::HTTP.post_form(url, post_args)
print resp
print data
C#:
Uri address = new Uri(url);
// Create the web request
HttpWebRequest request = WebRequest.Create(address) as HttpWebRequest;
// Set type to POST
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
// Create the data we want to send
StringBuilder data = new StringBuilder();
data.Append("vacature.deelnemernr=" + HttpUtility.UrlEncode("478"));
// Create a byte array of the data we want to send
byte[] byteData = UTF8Encoding.UTF8.GetBytes(data.ToString());
// Set the content length in the request headers
request.ContentLength = byteData.Length;
// Write data
using (Stream postStream = request.GetRequestStream())
{
postStream.Write(byteData, 0, byteData.Length);
}
// Get response
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
// Get the response stream
StreamReader reader = new StreamReader(response.GetResponseStream());
// Console application output
result = reader.ReadToEnd();
}
return result;
Don't you need the ? after the URL in order to do a post with parameters? I think that Ruby hides this behind the scenes.
I found the problem! The url variable in the C# code was "http://www.carejobs.be/scripts/foxisapi.dll/carejobs.carejobs.importvacature/"
It had to be "http://www.carejobs.be/scripts/foxisapi.dll/carejobs.carejobs.importvacature" without the backslash.