URL Decoder c# asp.net - c#

In a QueryString I have a part that looks like this
...u4w51EEcg8%2bj04e7C....
When I am using HttpUtility.UrlDecode the part "%2b" which represents a "+" just turns into a white space.
I'm using HttpUtility.UrlEncode in the first place to encode the string.
Does anyone have any clue to what is going on?

Are you decoding twice? For convenience, a + sign in a URL decodes to a space (0x20). So while %2b should decode to +, decoding that result again will give you a space.
EDIT: Just saw your self-answer, and yeah, always check whether your getter functions / properties automatically decode for you. Double-decoding usually doesn't produce the desired result, and can even lead to security risks.

It looks like "Request.QueryString[]" already decodes whatever it gets, so what happened was that I decoded the QueryString twice, which turns the "+" into a whitespace.
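To make the double-decoding effect concrete, here is a minimal sketch. The class name and the sample fragment "cg8%2bj04" are made up for illustration; only HttpUtility.UrlDecode comes from the thread.

```csharp
using System;
using System.Web; // HttpUtility: System.Web.dll on .NET Framework, System.Web.HttpUtility on .NET Core and later

public static class DoubleDecodeDemo
{
    // One decode pass, which is what Request.QueryString[] has already applied.
    public static string DecodeOnce(string raw)
    {
        return HttpUtility.UrlDecode(raw);
    }

    // Two decode passes, reproducing the bug described in the question.
    public static string DecodeTwice(string raw)
    {
        return HttpUtility.UrlDecode(HttpUtility.UrlDecode(raw));
    }

    public static void Main()
    {
        string raw = "cg8%2bj04"; // hypothetical fragment of an encoded query value

        Console.WriteLine(DecodeOnce(raw));  // cg8+j04
        Console.WriteLine(DecodeTwice(raw)); // cg8 j04 (the + was decoded again, into a space)
    }
}
```

So if the value already came out of Request.QueryString[], it should probably be used as-is rather than passed through UrlDecode again.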

Related

Which HttpUtility decode method to use?

This may be a silly question, but it trips me up every time.
HttpUtility has the methods HtmlDecode and UrlDecode. Do these two methods decode anything (Html/Http related) I might find? When do I have to use them, and which one am I supposed to use?
Just now I hit an error. This is my error log:
Payment receiver was not payment@mysite.com. (it was payment%40mysite.com).
But, I wrapped the email address here in HttpUtility.HtmlDecode before using it. It turns out I have to use .UrlDecode instead, but this email address didn't come from a URL so this wasn't obvious to me.
Can someone clarify this?
See What is meant by htmlencode and urlencode?
It's the reverse of your case, but essentially you need to use UrlEncode/Decode anytime you are using an address of sorts (urls and yes, email addresses). HtmlEncode/Decode is for code that typically a browser would render (html/xml tags).
This same encoding is also used in Form POST requests as well.
My guess is something read it 'naked' without decoding it.
Html Encoding/Decoding is only used to escape strings that contain characters that would otherwise be interpreted as html control characters. The process turns the characters into html entities and back again.
Url Encoding is to get around the fact that many characters are not allowed in Uris; or because they too could be misinterpreted. Thus the percent encoding is used.
Percent encoding is also used in the body of http requests.
In both cases, of course, it's also a way of expressing a specific character code in a request/response independent of character sets; but equally, interpreting what is meant by a particular code can also be dependent on knowing a particular character set. Generally you don't worry about that - but it can be important (especially in the HTML case).
URLEncode converts characters that aren't allowed in a URL into character equivalents which are parsable as a URL. In your example @ became %40. URLDecode reverses this.
HTMLEncode is similar to URLEncode, but the target environment is text NESTED inside of HTML. This keeps the browser from interpreting your content as HTML, but when rendered it should look like the decoded version. HTMLDecode reverses this.
When you see %xx this means percent encoding has occurred - this is a URL encoding scheme, so you need to use UrlEncode / UrlDecode.
The HtmlEncode and HtmlDecode methods are for encoding and decoding elements for HTML display - so things like & get encoded to &amp; and > to &gt;.
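As a rough illustration of the difference, the sketch below runs both decoders over the value from the error log. The class name and the "a & b > c" sample are made up; the HttpUtility methods are the ones discussed above.

```csharp
using System;
using System.Web; // HttpUtility: System.Web.dll on .NET Framework, System.Web.HttpUtility on .NET Core and later

public static class DecodeChoiceDemo
{
    public static void Main()
    {
        string logged = "payment%40mysite.com"; // value taken from the error log in the question

        // HtmlDecode only understands entities such as &amp;, so %40 is left untouched.
        Console.WriteLine(HttpUtility.HtmlDecode(logged)); // payment%40mysite.com

        // UrlDecode understands percent encoding, so %40 becomes @.
        Console.WriteLine(HttpUtility.UrlDecode(logged));  // payment@mysite.com

        // HtmlEncode turns markup characters into entities; HtmlDecode reverses it.
        string encoded = HttpUtility.HtmlEncode("a & b > c"); // made-up sample text
        Console.WriteLine(encoded);                           // a &amp; b &gt; c
        Console.WriteLine(HttpUtility.HtmlDecode(encoded));   // a & b > c
    }
}
```

The rule of thumb this suggests: pick the decoder by what the data looks like (%xx versus &name;), not by where you think it came from.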

HttpUtility.UrlEncode - Minus instead of plus?

I am in a situation where I have some path. This path could be something like "jadajada.com/My Site.html".
I use HttpUtility.UrlEncode to encode the urls, which is great. However, I have the issue that whenever I have a space, it replaces this with a "+" sign. I need a "-" sign instead.
Can this method perform this task? And if so, with what kind of encoding, etc.?
(And yes, I know you can use string.Replace, but please avoid that solution for now ;-)
Replacing spaces with "-" is not really encoding, since there is no standard decoder for that; the "+" is correct.
However, if this is for display only, and as long as your code doesn't rely on this value (for example, to do an exact slug match expecting the space) you could simply do a .Replace(" ","-") before you encode. In that lossy scenario you might also want to replace a few others, truncate overly long strings, etc.
Encoding it once it has a - should be a no-op (i.e. it won't change).
A space can be URL encoded either as a + or as %20. That is the way that a space is encoded, so there is no built in method for encoding it into any other arbitrary character.
If you want to replace spaces with - instead that is not encoding, it's replacing, so the Replace method would be appropriate to use.
UrlEncoding will never replace a space with - on its own, since that is not a representation of a space inside a URL. It will either use + or %20.
So if you actually want to do this, I think that string.Replace is your best option here, but if you do not want spaces inside the resulting URL, you should probably remove the spaces from the URL before you encode it in the first place.
One reason that you'd want to change it from + to - is that URL Rewriting doesn't work when the URL contains + (unless you entirely disable double escaping). It's easier to change the + to -
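A minimal sketch of the replace-then-encode approach suggested in the answers. ToDisplaySlug is a hypothetical helper name, and it assumes you pass in a single path segment such as "My Site.html" rather than the whole URL, since UrlEncode would also escape any "/".

```csharp
using System;
using System.Web; // HttpUtility: System.Web.dll on .NET Framework, System.Web.HttpUtility on .NET Core and later

public static class SlugDemo
{
    // Hypothetical helper: swaps spaces for hyphens first, then URL-encodes what is left.
    // This is lossy: "My-Site.html" can no longer be mapped back to "My Site.html".
    public static string ToDisplaySlug(string segment)
    {
        string hyphenated = segment.Replace(" ", "-");
        return HttpUtility.UrlEncode(hyphenated);
    }

    public static void Main()
    {
        Console.WriteLine(HttpUtility.UrlEncode("My Site.html")); // My+Site.html
        Console.WriteLine(ToDisplaySlug("My Site.html"));         // My-Site.html
    }
}
```

Because "-" is left alone by UrlEncode, encoding after the replace does not alter the hyphens.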

Server.UrlEncode(string s)... doesn't

Server.UrlEncode("My File.doc") returns "My+File.doc", whereas the JavaScript escape("My File.doc") returns "My%20File.doc". As far as I understand it, the JavaScript is correctly URL encoding the string whereas the .NET method is not. It certainly seems to work that way in practice: putting http://somesite/My+File.doc will not fetch "My File.doc" in any case I could test using Firefox/IE and IIS, whereas http://somesite/My%20File.doc works fine. Am I missing something, or does Server.UrlEncode simply not work properly?
Use JavaScript's encodeURIComponent()/decodeURIComponent() for "round-trip" encoding with .NET's UrlEncode/UrlDecode.
EDIT
As far as I know, historically the "+" has been used in URL encoding as a special substitution for the space char (ASCII 0x20). If an implementation does not take the space into consideration as a special character with the '+' substitution, then it still has to escape it using its ASCII code (hence '%20').
There is a really good discussion of the situation at http://bytes.com/topic/php/answers/5624-urlencode-vs-rawurlencode. It's inconclusive, by the way. RFC 2396 lumps the space with other characters without an unreserved representation, which sides with the '%20' crowd.
RFC 1630 sides with the '+' crowd (via forum discussion):
"Within the query string, the plus sign is reserved as shorthand notation for a space. Therefore, real plus signs must be encoded. This method was used to make query URIs easier to pass in systems which did not allow spaces."
Also, the core RFCs are...
RFC 1630 - Universal Resource Identifiers in WWW
RFC 1738 - Uniform Resource Locators (URL)
RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax
"As far as I understand it, the JavaScript is correctly URL encoding the string whereas the .NET method is not."
Actually they're both wrong!
JavaScript escape() should never be used. As well as failing to encode the + character to %2B, it encodes all non-ASCII characters as a non-standard %uNNNN sequence.
Meanwhile Server.UrlEncode is not exactly URL-encoding as such, but encoding to application/x-www-form-urlencoded, which should only normally be used for query parameters. Using + to represent a space outside of a form name=value construct, such as in a path part, is wrong.
This is rather unfortunate. You might want to try doing a string replace of the + character with %20 after encoding with UrlEncode() when you are encoding into a path part rather than a parameter. In a parameter, + and %20 are equally good.
A + instead of a space is correct URL encoding, as would escaping it to %20. See this article (CGI Programming in Perl - URL Encoding).
The + is not something that JavaScript can parse, so javascript will escape the space or + to %20.
Using System.Uri.EscapeDataString() serverside and decodeURIComponent clientside works.
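To compare the options mentioned in these answers side by side, here is a small sketch. The class and method names are made up; HttpUtility.UrlEncode and Uri.EscapeDataString are the real calls being compared.

```csharp
using System;
using System.Web; // HttpUtility: System.Web.dll on .NET Framework, System.Web.HttpUtility on .NET Core and later

public static class PathVersusQueryDemo
{
    // Form-style encoding (application/x-www-form-urlencoded): space becomes +.
    public static string ForQueryValue(string value)
    {
        return HttpUtility.UrlEncode(value);
    }

    // Percent encoding suitable for a path segment: space becomes %20.
    public static string ForPathSegment(string value)
    {
        return Uri.EscapeDataString(value);
    }

    public static void Main()
    {
        string name = "My File.doc";

        Console.WriteLine(ForQueryValue(name));  // My+File.doc
        Console.WriteLine(ForPathSegment(name)); // My%20File.doc

        // The string-replace workaround from the answer above. It is safe because
        // UrlEncode has already turned any literal + in the input into %2b.
        Console.WriteLine(ForQueryValue(name).Replace("+", "%20")); // My%20File.doc
    }
}
```

On the client, decodeURIComponent("My%20File.doc") gives back "My File.doc", which is why the EscapeDataString pairing round-trips.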

What kind of encoding is this?

What kind of encoding do you use to encode http:// as http%253A%252F%252F?
HttpUtility.UrlEncode gives http%3a%2f%2f
What you're looking at is text that has been passed through UrlEncode twice.
The second time changes the % symbols to %25.
It's unusual to pass an entire URL through UrlEncode anyway, unless you are passing it as a parameter in another URL (for redirection, for instance).
It looks like UrlEncode was called twice, encoding the literal % as %25 (which is the correct result, by the way).
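A quick sketch of how the second pass produces the %25 sequences. The class name is made up; the hex digits are shown in lowercase to match the HttpUtility.UrlEncode output quoted in the question.

```csharp
using System;
using System.Web; // HttpUtility: System.Web.dll on .NET Framework, System.Web.HttpUtility on .NET Core and later

public static class DoubleEncodeDemo
{
    public static void Main()
    {
        string once = HttpUtility.UrlEncode("http://");
        string twice = HttpUtility.UrlEncode(once); // every % from the first pass becomes %25

        Console.WriteLine(once);  // http%3a%2f%2f
        Console.WriteLine(twice); // http%253a%252f%252f

        // Undoing it takes exactly as many decode passes as there were encode passes.
        Console.WriteLine(HttpUtility.UrlDecode(twice));                        // http%3a%2f%2f
        Console.WriteLine(HttpUtility.UrlDecode(HttpUtility.UrlDecode(twice))); // http://
    }
}
```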

What's wrong with my url encoding?

In my asp.net mvc application I created the following link:
http://localhost:2689/en/Formula.mvc/351702++LYS+GRONN+5G+9%252f2++fds
I get error 400 (bad request).
I think it blocks at the %25 (forward slash).
What am I doing wrong?
--EDIT 3--
I tried not encoding anything at all but rather rely on the default encoding of Url.RouteUrl().
It seems that this doesn't encode the "/" for some reason.
If I encode it myself first, I end up with the double-encoded %252f. This gives me a bad request for some reason.
Why?!
--EDIT 2--
I generated the last part of the URI as follows:
Take the id.toString
Take the HttpUtility.UrlEncode(name)
Take the HttpUtility.UrlEncode(code)
String.Format("{0}--{1}--{2}") with the values from the previous parts
Add it as a parameter to Url.RouteUrl()
After that my action gets this parameter again, splits it at -- and calls HttpUtility.UrlDecode() on the values to get them back.
I do it this way because the two last parameters are optional, but functional parameters. If they are defined in a previous step, they have to be carried along to the other pages.
Less abstract: A color can have multiple names, but if a user selected it by a particular name, it should be kept throughout all the other pages.
--EDIT 1--
It also looks like HttpUtility.UrlEncode() and Url.Encode() return different results :S
If I don't encode the "/", it acts as a separator=>no luck there.
If I encode it with Url.Encode() I end up with %2F => Code 400
If I encode it with HttpUtility.UrlEncode() I end up with %25 => code 400
Because 400 doesn't even let it through to asp.net-mvc, the route debugger is of no use :(
I was there a couple of days ago. If you can accept unreadable route-values in the URL try this:
URL-encoded slash in URL
%25 is actually encoded "%", so %252f is encoded "%2f".
%2f (encoded "/") is not allowed in URL unless you explicitly allow it in webserver's configuration.
Have you run the Routing debugger: http://haacked.com/archive/2008/03/13/url-routing-debugger.aspx
I haven't looked too much at the encoding - but note that if this is to be stored somewhere (or acted upon in some way), then a POST would be more appropriate. If the text on the right is actually representative of the data with id 351702 (a vanity url, much like /665354/whats-wrong-with-my-url-encoding), then you should humanize the text. Much as the spaces have been removed from the above. It is also common to have this as a separate level in the route that is simply discarded.
Generally, MVC urls should be comprehensible.
W3Schools works fine: http://www.w3schools.com/TAGS/html_form_submit.asp?text=hello/world
Here's the URL encoding reference: http://www.w3schools.com/TAGS/ref_urlencode.asp
You can't use a forward slash as a value in the URL. Here is a nice post about creating browser and SEO friendly URLS => http://www.dominicpettifer.co.uk/displayBlog.aspx?id=34
[Edit]
Whenever you create a route you associate it with a URL pattern (The default pattern is {controller}/{action}/{id}). And in this url pattern you are supposed to use the forward slash to separate different tokens. Hope that helps
