What's wrong with my url encoding?

What's wrong with my url encoding? - c#

In my asp.net mvc application I created the following link:
http://localhost:2689/en/Formula.mvc/351702++LYS+GRONN+5G+9%252f2++fds
I get error 400 (bad request).
I think it blocks at the %25 (forward slash).
What am I doing wrong?
--EDIT 3--
I tried not encoding anything at all but rather rely on the default encoding of Url.RouteUrl().
It seems that this doesn't encode the "/" for some reason.
If I encode it myself first, I end up with the doubel encoded %252f. This gives me a bad request for some reason..
Why?!
--EDIT 2--
I generated the last part of the URI as follows:
Take the id.toString
Take the HttpUtility.UrlEncode(name)
Take the HttpUtility.UrlEncode(code)
String.Format("{0}--{1}--{2}") with the values from the previous parts
Add it as a parameter to Url.RouteUrl()
After that my action gets this parameter again, splits it at -- and HttpUtility.Decode() the values back.
I do it this way because the two last parameters are optional, but functional parameters. IF they are defined in a previous step, they have to be carried along to the other pages.
Less abstract: A color can have multiple names, but if a user selected it by a particular name, it should be kept throughout all the other pages.
--EDIT 1--
It also looks like HttpUtility.UrlEncode() and Url.Encode() return different results :S
If I don't encode the "/", it acts as a separator=>no luck there.
If I encode it with Url.Encode() I end up with %2F => Code 400
If I encode it with HttpUtility.UrlEncode() I end up with %25 => code 400
Because 400 doesn't even let it through to asp.net-mvc, the route debugger is of no use :(

I was there a couple of days ago. If you can accept unreadable route-values in the URL try this:
URL-encoded slash in URL

%25 is actually encoded "%", so %252f is encoded "%2f".
%2f (encoded "/") is not allowed in URL unless you explicitly allow it in webserver's configuration.

Have you run the Routing debugger: http://haacked.com/archive/2008/03/13/url-routing-debugger.aspx

I haven't looked too much at the encoding - but note that if this is to be stored somewhere (or acted upon in some way), then a POST would be more appropriate. If the text on the right is actually representative of the data with id 351702 (a vanity url, much like /665354/whats-wrong-with-my-url-encoding), then you should humanize the text. Much as the spaces have been removed from the above. It is also common to have this as a separate level in the route that is simply discarded.
Generally, MVC urls should be comprehensible.

W3Schools works fine: http://www.w3schools.com/TAGS/html_form_submit.asp?text=hello/world
Here's the URL encoding reference: http://www.w3schools.com/TAGS/ref_urlencode.asp

You can't use a forward slash as a value in the URL. Here is a nice post about creating browser and SEO friendly URLS => http://www.dominicpettifer.co.uk/displayBlog.aspx?id=34
[Edit]
Whenever you create a route you associate it with a URL pattern (The default pattern is {controller}/{action}/{id}). And in this url pattern you are supposed to use the forward slash to separate different tokens. Hope that helps

Related

How can I modify or handle URL special characters in MVC?

I have an MVC web application. The URL for a particular area is coming in as:
http://localhost/General/Bpa%3fapplication%3dTrf%23/GeneralInputs
This causes a "The resource cannot be found." error. However, if I change the URL to
http://localhost/General/Bpa?application=Trf#/GeneralInputs
then everything works. I can see from using some route debugging tricks that the controller in the first case is: "Bpa?application=Trf#", whereas the second one is: "Bpa", which is correct. How can I account for this or substitute for the encoded characters?

The encoding of the first URL is wrong. If you look at RFC 3986 you will find in 2.4 the paragraph
When a URI is dereferenced, the components and subcomponents
significant to the scheme-specific dereferencing process (if any)
must be parsed and separated before the percent-encoded octets within
those components can be safely decoded, as otherwise the data may be
mistaken for component delimiters.
That means the URL is decomposed by unencoded characters (in this case the ? matters). If the encoded string #3f is used, then the framework would have to look for a controller named "Bpa?application=Trf#" and not "Bpa". Thus a 404 / resource not found is returned.
You should not fix it on the server side; you will have to change the place where the wrong url http://localhost/General/Bpa%3fapplication%3dTrf%23/GeneralInputs is generated.

You're going to want to use this on your url:
string fixedUrl = System.Uri.UnescapeDataString(yourUrlHere);
Hope that works out for you!

C# HttpClient URI is not escaping properly

I'm trying to get my URL to escape but it's not working properly. Ironically, on my MacBook when I execute this part of code
Uri url = new Uri("http://www.example.com/?i=123%34", true);
// it returns http://www.example.com/?i=123%34 which is exactly what I want.
The problem is that my IDE says it's obsolete and it does not work on my Windows machine. It's the exact same project, and IDE. So I tried to find a solution, which someone suggested
Uri uri = new Uri(Uri.EscapeUriString("http://www.example.com/?i=123%34"));
// this returns http://www.example.com/?i=123%2534 which is what I DONT want.
So how do I approach this issue? I looked all over the web and I can't find any solutions. I need to know how to properly escape this URL. The second method posted above does not work like the first method above.
I verified the GET requests via Fiddler, so everything is indeed happening.
Update:
Again, I need the server to receive the URL exactly how the string is declared. I want the server to handle the conversion. I cannot substitute %25 for the % symbol. It MUST be received exactly how I the string is declared. Additionally, "http://www.example.com/?i=1234" is NOT what I want either.

The problem is with the configuration of your web server on Windows, that allows double escaping. Your original URL is http://www.example.com/?i=123%34, which when unescaped, becomes http://www.example.com/?i=1234.
Your web server on Windows, on the other hand, escapes the % character again instead of unescaping %34. Thus, it turns into http://www.example.com/?i=123%2534.
This is why you should not use characters like % in the URL before it gets escaped.
Edit -
I typed the following two URLs in Firefox to see how the parameters are received on the server.
The value of i in http://www.example.com/?i=123%34 is 1234.
The value of i in http://www.example.com/?i=123%2534 is 123%34
If the server must receive the % character, it must be escaped in order for it to be dispatched over HTTP. There's literally no other way to send it over the wire. If you don't escape the % character, it will be treated as an escape sequence along with 34 and automatically turn into 4 on the server.
If your network inspector shows you unescaped text in the request, it's because it's prettifying the URL before displaying it to you.

If you are okay with the string reading ht tp://www.example.com/?i=1234, you can try
Uri url = new Uri(Uri.UnescapeDataString("http://www.example.com/?i=123%34"));

Which HttpUtility decode method to use?

this may be a silly question, but it trips me up every time.
HttpUtility has the methods HtmlDecode and UrlDecode. Do these two methods decode anything (Html/Http related) I might find? When do I have to use them, and which one am I supposed to use?
Just now I hit an error. This is my error log:
Payment receiver was not payment#mysite.com. (it was payment%40mysite.com).
But, I wrapped the email address here in HttpUtility.HtmlDecode before using it. It turns out I have to use .UrlDecode instead, but this email address didn't come from a URL so this wasn't obvious to me.
Can someone clarify this?

See What is meant by htmlencode and urlencode?
It's the reverse of your case, but essentially you need to use UrlEncode/Decode anytime you are using an address of sorts (urls and yes, email addresses). HtmlEncode/Decode is for code that typically a browser would render (html/xml tags).

This same encoding is also used in Form POST requests as well.
My guess is something read it 'naked' without decoding it.
Html Encoding/Decoding is only used to escape strings that contain characters that would otherwise be interpreted as html control characters. The process turns the characters into html entities and back again.
Url Encoding is to get around the fact that many characters are not allowed in Uris; or because they too could be misinterpreted. Thus the percent encoding is used.
Percent encoding is also used in the body of http requests.
In both cases, of course, it's also a way of expressing a specific character code in a request/response independent of character sets; but equally, interpreting what is meant by a particular code can also be dependent on knowing a particular character set. Generally you don't worry about that - but it can be important (especially in the HTML case).

URLEncode converts characters that aren't allowed in a URL into character equivalents which are parsable as a URL. In your example # became %40. URLDecode reverses this.
HTMLEncode is similar to URLEncode, but the target environment is text NESTED inside of HTML. This helps the browser from interpereting your content as HTML, but when rendered it should look like the decoded version. HTMLDecode reverses this.

When you see %xx this means percent encoding has occured - this is a URL encoding scheme, so you need to use UrlEncode / UrlDecode.
The HtmlEncode and HtmlDecode methods are for encoding and decoding elements for HTML display - so things like & get encoded to & and > to >.

How do I use a pattern Url to extract a segment from an actual Url?

If I have a series of "pattern" Urls of the form:
http://{username}.sitename.com/
http://{username}.othersite.net/
http://mysite.com/{username}
and I have an actual Url of the form:
http://joesmith.sitename.com/
Is there any way that I can match a pattern Url and in turn use it to extract the username portion out the actual Url? I've thought of nasty ways to do it, but it just seems like there should be a more intuitive way to accomplish this.
ASP.NET MVC uses a similar approach to extract the various segments of the URL when it is building its routes. Given the example:
{controller}/{action}
So given the Url of the form, Home/Index, it knows that it is the Home controller calling the Index action method.

Not sure I understand this question correctly but you can just use a regular expression to match anything between 'http://' and the first dot.

A very simple regex will do:
':https?://([a-z0-9\.-]*[a-z0-9])\.sitename\.com'
This will allow any subdomain that only contains valid subdomain characters. Example of allowed subdomains:
joesmith.sitename.com
joe.smith.sitename.com
joe-smith.sitename.com
a-very-long-subdomain.sitename.com
As you can see, you might want to complicate the regex slightly. For instance, you could limit it to only allow a certain amount of characters in the subdomain.

It seems the the quickest and easiest solution is going off of Machine's answer.
var givenUri = "http://joesmith.sitename.com/";
var patternUri = "http://{username}.sitename.com/";
patternUri = patternUri.Replace("{username}", #"([a-z0-9\.-]*[a-z0-9]");
var result = Regex.Match(givenUri, patternUri, RegexOptions.IgnoreCase).Groups;
if(!String.IsNullOrEmpty(result[1].Value))
return result[1].Value;
Seems to work great.

Well, this "pattern URL" is a format you've made up, right? You basically you'll just need to process it.
If the format of it is:
anything inside "{ }" is a thing to capture, everything else must be as is
Then you'd just find the start/end index of those brackets, and match everything else. Then when you get to a place where one is, make sure you only look for chars such that they don't match whatever 'token' comes after the next ending '}'.

There are definitely different ways - ultimately though your server must be configured to handle (and possibly route) these different subdomain requests.
What I would do would be to answer all subdomain requests (except maybe some reserved words, like 'www', 'mail', etc.) on sitename.com with a single handler or page (I'm assuming ASP.NET here based on your C# tag).
I'd use the request path, which is easy enough to get, with some simple string parsing/regex routines (remove the 'http://', grab the first token up until '.' or '/' or '\', etc.) and then use that in a session, making sure to observe URL changes.
Alternately, you could map certain virtual paths to request urls ('joesmith.sitename.com' => 'sitename.com/index.aspx?username=joesmith') via IIS but that's kind of nasty too.
Hope this helps!

Space in routing gives 404

I currently have this route defined (among others):
"{controller}/{action}/{id}/{designation}" being:
"id" my primary key
"designation" only used for SEO and not taken into account.
now my problem is:
"http://server/Home/Index/1/teste" works but "http://server/Home/Index/1/teste " with a space in the end doesn't.
IIS is giving me a 404 and mvc is not even starting for this request.
Anyone experienced this behavior? Anything I need to change?
With best regards

Space cannot be used as a plain text character in a url. You have to encode it as:
%20
E.g.
http://www.testDomain.com/test%20page

Space is an invalid character in URL's. The browser should not even send it.
If you're calling this in code, try using HttpUtility.UrlEncode( path ) before sending / redirecting.

Look at this post:
"The resource cannot be found." error when there is a "dot" at the end of the url
it talks about similar problem with '.' (dot) character at the end of a url. Think it's the same problem as yours.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

What's wrong with my url encoding? - c#

I was there a couple of days ago. If you can accept unreadable route-values in the URL try this: URL-encoded slash in URL

%25 is actually encoded "%", so %252f is encoded "%2f". %2f (encoded "/") is not allowed in URL unless you explicitly allow it in webserver's configuration.

Have you run the Routing debugger: http://haacked.com/archive/2008/03/13/url-routing-debugger.aspx

W3Schools works fine: http://www.w3schools.com/TAGS/html_form_submit.asp?text=hello/world Here's the URL encoding reference: http://www.w3schools.com/TAGS/ref_urlencode.asp

Related

How can I modify or handle URL special characters in MVC?

C# HttpClient URI is not escaping properly

Which HttpUtility decode method to use?

How do I use a pattern Url to extract a segment from an actual Url?

Space in routing gives 404

Categories

Resources