HttpUtility.UrlEncode - Minus instead of plus? - c#

I am in a situation where I have some path. This path could be something like "jadajada.com/My Site.html".
I use HttpUtility.UrlEncode to encode the urls, which is great. However, I have the issue that whenever I have a space, it replaces this with a "+" sign. I need a "-" sign instead.
Can this method perform this task? And if so, with what kind of encoding, etc.?
(And yes, I know you can use string.Replace, but please avoid that solution for now ;-)

Replacing spaces with "-" is not really encoding, since there is no standard decoder for that; the "+" is correct.
However, if this is for display only, and as long as your code doesn't rely on this value (for example, to do an exact slug match expecting the space) you could simply do a .Replace(" ","-") before you encode. In that lossy scenario you might also want to replace a few others, truncate overly long strings, etc.
Encoding it once it has a - should be a no-op (i.e. it won't change).
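A minimal sketch of that replace-then-encode idea, assuming a hypothetical helper named ToDashedUrl and assuming the result is for display only (the space-to-dash swap is lossy and cannot be decoded back):

```csharp
using System;
using System.Web; // HttpUtility

// Hypothetical helper (not part of any library): swap spaces for dashes,
// then URL-encode whatever is left.
static string ToDashedUrl(string segment)
{
    // Lossy step: afterwards a real "-" and a former space look the same.
    string dashed = segment.Replace(" ", "-");
    // "-" is left untouched by UrlEncode, so the dash survives encoding.
    return HttpUtility.UrlEncode(dashed);
}

Console.WriteLine(ToDashedUrl("My Site.html")); // My-Site.html
```

The helper is meant for a single segment such as a file name; passing a whole path would also encode the "/" separators.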

A space can be URL encoded either as a + or as %20. That is the way that a space is encoded, so there is no built in method for encoding it into any other arbitrary character.
If you want to replace spaces with - instead that is not encoding, it's replacing, so the Replace method would be appropriate to use.

UrlEncoding will never replace a space with - on its own, since that is not a representation of a space inside a URL. It will either use + or %20.
So if you actually want to do this, I think that string.Replace is your best option here, but if you do not want spaces inside the resulting URL, you should probably remove the spaces from the URL before you encode it in the first place.

One reason that you'd want to change it from + to - is that URL Rewriting doesn't work when the URL contains + (unless you entirely disable double escaping). It's easier to change the + to -.

Related

URL Decoder c# asp.net

In a QueryString I have a part that looks like this
...u4w51EEcg8%2bj04e7C....
When I am using HttpUtility.UrlDecode the part "%2b" which represents a "+" just turns into a white space.
I'm using HttpUtility.UrlEncode in the first place to encode the string.
Does anyone have any clue to what is going on?
Are you decoding twice? For convenience, a + sign in a URL decodes to ' ' (space, 0x20). While %2b should decode to +, decoding that result again will give you ' '.
EDIT: Just saw your self-answer, and yeah, always check whether your getter functions / properties automatically decode for you. Double-decoding usually doesn't produce the desired result, and can even lead to security risks.
It looks like "Request.QueryString[]" decodes whatever it gets, so what happened was that I decoded the QueryString twice, which turns "+" into a whitespace.
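The double-decoding effect described above can be reproduced in isolation. This is only a sketch; the sample value "a%2bb" is made up for illustration and is not the query string from the question:

```csharp
using System;
using System.Web; // HttpUtility

string encoded = "a%2bb";                      // what UrlEncode produces for "a+b"
string once = HttpUtility.UrlDecode(encoded);  // first decode: %2b becomes "+"
string twice = HttpUtility.UrlDecode(once);    // second decode: "+" becomes a space

Console.WriteLine(once);  // a+b
Console.WriteLine(twice); // a b
```

If a framework property has already decoded the value, calling UrlDecode on it again is the second decode shown here.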

a cleaner way of representing double quote?

really simple question... just want to represent double quote " without needing to do "" or \"
cases that I'm aware of:
var s=@"123 "" 456 """;
var s="123 \" 456 \"";
It'd make a reasonable difference if I could remove this noise somehow. The reason is that the escape sequence \ and the double quote have meaning in a domain specific language (DSL) that we're using. Sometimes it's convenient to throw some syntax inline into a C# string.
What I'd like is a way to tell .net not to touch it. Perhaps some kind of catch all via the DLR?
Within a C# literal, there's nothing you can do - don't forget this is all done at compile-time.
If you don't use single quotes, you could always do:
var s = "123 ' 456 '".Replace("'", "\"");
(Or choose some other character you don't use much, and replace that afterwards instead.)
Other than that, avoiding storing lots of data in your source code helps a lot with this sort of thing - for test data, I often use an embedded resource and load that in at execution time.
I don't suppose you could just read them in from a file or database?
Yeah, there's definitely a way to do that, and I use it all the time for exactly that reason.
You create a string resource collection (open Project Properties, Resources, make sure it's on Strings) and put your literal strings in there. Then, when you need one of those strings, use the Properties.Resources.{insert string resource name} reference to collect it in a pure and unadulterated form!
For completeness, I'll mention that you can use hex in a C# string, so in this case, \x0022. Note that you can omit the leading 0's if the character immediately following isn't hex.
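A small sketch of why the leading zeros matter with the \x escape. The sample strings are invented for illustration; the point is that \x reads up to four hex digits, so it can swallow following characters that happen to be hex digits:

```csharp
using System;

string padded = "\x0022abc"; // four hex digits are consumed, then "abc" follows the quote
string spaced = "\x22 456";  // the space is not hex, so the escape ends after "22"
string greedy = "\x22abc";   // reads \x22ab (U+22AB) and then "c": no quote at all

Console.WriteLine(padded == "\"abc");   // True
Console.WriteLine(spaced == "\" 456");  // True
Console.WriteLine(greedy.Length);       // 2
```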

How to decode a string like "string\x27s" in c#?

I have a c# app that sometimes has to work with strings like:
"example\x27s string"
How do I decode that? I know 27 is the ASCII code (in hex) for a single quote ', but UrlDecode() won't work on that string.
Should I replace the \x value with % and then use System.Web.HttpUtility.UrlDecode() or is there another way to do it?
\x27 is not an HTML encoded value. It is a string escape sequence. The truth behind it, though, is that the actual string probably contains a physical \ character, so what you're dealing with is:
"\\x27"
Or
@"\x27"
I am unsure if .NET has a way to re-evaluate a string, but the \ codes for strings are handled at the compiler level, if I remember correctly.
You could use regular expressions to do a replacement if you want, since you know what it represents.
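A minimal sketch of the regular-expression approach, assuming the input contains a literal backslash followed by x and exactly two hex digits (as in the question's example). DecodeHexEscapes is a hypothetical helper name, not a framework method:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: turn literal \xNN sequences into the characters they name.
static string DecodeHexEscapes(string input)
{
    return Regex.Replace(
        input,
        @"\\x([0-9A-Fa-f]{2})", // a backslash, "x", then two captured hex digits
        m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
}

string raw = @"example\x27s string"; // verbatim literal, so the backslash is real
Console.WriteLine(DecodeHexEscapes(raw)); // example's string
```

Fixing the width at two digits is a simplification; the C# compiler itself accepts one to four hex digits after \x, so data produced by other tools may need a different pattern.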

Url encoding quotes and spaces

I have some query text that is being encoded with JavaScript, but I've encountered a use case where I might have to encode the same text on the server side, and the encoding that's happening is not the same. I need it to be the same. Here's an example.
I enter "I like food" into the search box and hit the search button. JavaScript encodes this as %22I%20like%20food%22
Let's say I get the same value as a string on a request object on the server side. It will look like this: "\"I like food\""
When I use HttpUtility.UrlEncode(value), the result is "%22I+like+food%22". If I use HttpUtility.UrlPathEncode(value), the result is "\"I%20like%20food\""
So UrlEncode is encoding my quotes but is using the + character for spaces. UrlPathEncode is encoding my spaces but is not encoding my escaped quotes.
I really need it to do both, otherwise the Search code completely borks on me (and I have no control over the search code).
Tips?
UrlPathEncode doesn't escape " because they don't need to be escaped in path components.
Uri.EscapeDataString should do what you want.
There are a few options available to you; the fastest might be to use UrlEncode and then do a string.Replace to swap the + characters with %20.
Something like
HttpUtility.UrlEncode(input).Replace("+", "%20");
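WebUtility.UrlEncode(str)
Will encode all characters that need to be encoded using the %XX format. Note that it still writes a space as +, just like HttpUtility.UrlEncode, so if you need %20 for spaces use Uri.EscapeDataString or replace the + afterwards.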
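As a quick check of the Uri.EscapeDataString suggestion, here is a sketch using the search text from the question; only the variable names are invented:

```csharp
using System;

string query = "\"I like food\"";             // the value as it arrives on the server
string escaped = Uri.EscapeDataString(query); // quotes become %22, spaces become %20

Console.WriteLine(escaped); // %22I%20like%20food%22
```

That is the same string the question reports the JavaScript side producing.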

Server.UrlEncode(string s)... doesn't

Server.UrlEncode("My File.doc") returns "My+File.doc", whereas the JavaScript escape("My File.doc") returns "My%20File.doc". As far as I understand it, the JavaScript is correctly URL encoding the string whereas the .NET method is not. It certainly seems to work that way in practice: in every case I could test using Firefox/IE and IIS, http://somesite/My+File.doc will not fetch "My File.doc", whereas http://somesite/My%20File.doc works fine. Am I missing something, or does Server.UrlEncode simply not work properly?
Use JavaScript's encodeURIComponent()/decodeURIComponent() for "round-trip" encoding with .NET's UrlEncode/UrlDecode.
EDIT
As far as I know, historically the "+" has been used in URL encoding as a special substitution for the space character (ASCII 0x20). If an implementation does not treat the space as a special character with the '+' substitution, then it still has to escape it using its ASCII code (hence '%20').
There is a really good discussion of the situation at http://bytes.com/topic/php/answers/5624-urlencode-vs-rawurlencode. It's inconclusive, by the way. RFC 2396 lumps the space with other characters without an unreserved representation, which sides with the '%20' crowd.
RFC 1630 sides with the '+' crowd (via forum discussion):
"Within the query string, the plus sign is reserved as shorthand notation for a space. Therefore, real plus signs must be encoded. This method was used to make query URIs easier to pass in systems which did not allow spaces."
Also, the core RFCs are...
RFC 1630 - Universal Resource Identifiers in WWW
RFC 1738 - Uniform Resource Locators (URL)
RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax
"As far as I understand it, the JavaScript is correctly URL encoding the string whereas the .NET method is not"
Actually they're both wrong!
JavaScript escape() should never be used. As well as failing to encode the + character to %2B, it encodes all non-ASCII characters as a non-standard %uNNNN sequence.
Meanwhile Server.UrlEncode is not exactly URL-encoding as such, but encoding to application/x-www-form-urlencoded, which should only normally be used for query parameters. Using + to represent a space outside of a form name=value construct, such as in a path part, is wrong.
This is rather unfortunate. You might want to try doing a string replace of the + character with %20 after encoding with UrlEncode() when you are encoding into a path part rather than a parameter. In a parameter, + and %20 are equally good.
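A sketch of that post-encoding replace for a single path segment. EncodePathSegment is a hypothetical helper name, and the second file name is made up to show what happens to literal plus signs:

```csharp
using System;
using System.Web; // HttpUtility

// Hypothetical helper for one path segment (not a whole URL).
static string EncodePathSegment(string segment)
{
    // UrlEncode has already percent-escaped any literal "+", so every "+"
    // left in its output stands for a space and can be swapped safely.
    return HttpUtility.UrlEncode(segment).Replace("+", "%20");
}

Console.WriteLine(EncodePathSegment("My File.doc"));   // My%20File.doc
Console.WriteLine(EncodePathSegment("C++ Notes.doc")); // plus signs stay percent-escaped
```

In the second call the two literal plus signs come out as percent escapes and only the space becomes %20, which is why the replace does not corrupt file names that contain a real "+".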
A + instead of a space is correct URL encoding, as would escaping it to %20. See this article (CGI Programming in Perl - URL Encoding).
The + is not something that JavaScript can parse, so JavaScript will escape the space or + to %20.
Using System.Uri.EscapeDataString() serverside and decodeURIComponent clientside works.
