Is it safe to use an @ symbol as part of a user's URL? For example, a possible URL would be http://example.com/@dave.
The idea is that, nowadays, users are commonly called "@user", so why not make the user page "@username"?
Percent-encoded …
You can use the @ character in HTTP URI paths if you percent-encode it as %40.
Many browsers would still display it as @, but when you copy-and-paste the URI into a text document, for example, it will be %40.
… but also directly
Instead of percent-encoding it, you may use @ directly in the HTTP URI path.
See the syntax for the path of a URI. Various unrelated clauses aside, the path may consist of characters in the segment, segment-nz, or segment-nz-nc set. segment and segment-nz consist of characters from the pchar set, which is defined as:
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
As you can see, the @ is listed explicitly.
The segment-nz-nc set also lists the @ character explicitly:
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
So, an HTTP URI like this is totally valid:
http://example.com/@dave
Example
Here is an example Wikipedia page:
link
copy-and-paste: http://en.wikipedia.org/wiki/%22@%22_%28album%29
As you can see, the ", (, and ) characters are percent-encoded, but the @ and the _ are used directly.
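To make the two options concrete, here is a small sketch using Python's standard urllib.parse (my choice for illustration; the answer above doesn't name a language, and the username is just a placeholder). It shows the %40 form, the literal form, and that a parser treats an @ after the first slash as part of the path rather than the authority.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical username, used only for illustration.
username = "@dave"

# With no "safe" characters, "@" is percent-encoded as %40.
encoded = quote(username, safe="")

# Declaring "@" safe leaves it as-is, which the pchar rule permits in a path.
literal = quote(username, safe="@")

# The encoded form decodes back to the same text.
decoded = unquote(encoded)

# A literal "@" after the first "/" belongs to the path, not the authority.
parts = urlsplit("http://example.com/@dave")

print(encoded, literal, decoded, parts.hostname, parts.path)
```

Either form identifies the same path segment once decoded; whether a given server or framework routes both identically is something to check for your own stack.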
Can you use the @-symbol in a URL? - Yes, you can!
Note that the @ character, hexadecimal value 40, decimal value 64, is a reserved character in URIs. Its usage is for things like email addresses in mailto: URIs, for example mailto:username@somewhere.foo, and for passing username and password information in a URI (which is a bad idea, but possible): http://username:password@somewhere.foo
If you want a URL that has an @-symbol in a path, you need to encode it with so-called "URL-encoding". For example, like this: http://somewhere.foo/profile/username%40somewhere.foo
All modern browsers will display this as http://somewhere.foo/profile/username@somewhere.foo, and will convert any typed-in @-sign to %40, so it's easy to use.
Many web frameworks will also help you, either automatically or with helper functions, to convert to and from URL-encoded URLs.
So, in summary: Yes, you can use the @-symbol in a URL, but you have to make sure it's encoded, as you can't use the bare @-character.
In the RFC the following characters:
* ' ( ) ; : @ & = + $ , / ? % # [ ]
are reserved and:
The purpose of reserved characters is to provide a set of delimiting
characters that are distinguishable from other data within a URI.
So it is not recommended to use these characters without encoding.
Basically no.
@ is a reserved character and should only be used for its intended purpose.
See: http://perishablepress.com/stop-using-unsafe-characters-in-urls/ and http://www.ietf.org/rfc/rfc3986.txt
It can be used encoded, but I don't think that is what you were asking.
Apparently modern browsers will handle this. However you asked if this was safe and according to the spec of the RFC you should not be using it (unencoded) unless it is for its intended purpose.
I found this question when I tried to search for site:typescriptlang.org @ts-ignore in Chrome and got the result "This page isn't working, ts-ignore is currently unable to handle this request". I saw that the URL had become "http://site:typescriptlang.org%20@ts-ignore/". I was confused, so I searched for what the @ symbol does in a URL, and I found my answer on Wikipedia.
The full format of a URL is scheme://userInfo@host:port/path?query#fragment. So when we search for site:typescriptlang.org @ts-ignore, the browser thinks you want to visit "http://site:typescriptlang.org%20@ts-ignore/". In this URL, http is the scheme, site:typescriptlang.org%20 is the userInfo ("%20" is the percent-encoded space character), and "ts-ignore" is the host. Of course, we can't visit a host named "ts-ignore" without a domain.
So, the @ symbol can be a separator between userInfo and host.
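You can see the same split with any URL parser. A quick sketch with Python's urllib.parse (my choice of tool, not what the browser uses internally), fed the exact URL from above:

```python
from urllib.parse import urlsplit

# The URL the browser built from the search text.
parts = urlsplit("http://site:typescriptlang.org%20@ts-ignore/")

# Everything before the last "@" in the authority is userinfo;
# everything after it is the host (and optional port).
userinfo, _, hostport = parts.netloc.rpartition("@")

print(userinfo)        # user:password part, still percent-encoded
print(parts.username)  # text before the first ":" in userinfo
print(parts.password)  # text after the first ":" in userinfo
print(parts.hostname)  # the host the client would try to reach
```

Note that the parser does not decode %20 in the password for you; it only separates the components.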
Related
I have an MVC web application. The URL for a particular area is coming in as:
http://localhost/General/Bpa%3fapplication%3dTrf%23/GeneralInputs
This causes a "The resource cannot be found." error. However, if I change the URL to
http://localhost/General/Bpa?application=Trf#/GeneralInputs
then everything works. I can see from using some route debugging tricks that the controller in the first case is: "Bpa?application=Trf#", whereas the second one is: "Bpa", which is correct. How can I account for this or substitute for the encoded characters?
The encoding of the first URL is wrong. If you look at RFC 3986 you will find in 2.4 the paragraph
When a URI is dereferenced, the components and subcomponents
significant to the scheme-specific dereferencing process (if any)
must be parsed and separated before the percent-encoded octets within
those components can be safely decoded, as otherwise the data may be
mistaken for component delimiters.
That means the URL is decomposed by its unencoded delimiter characters (in this case the ? matters). If the encoded sequence %3f is used, then the framework has to look for a controller named "Bpa?application=Trf#" and not "Bpa". Thus a 404 / resource not found is returned.
You should not fix it on the server side; you will have to change the place where the wrong url http://localhost/General/Bpa%3fapplication%3dTrf%23/GeneralInputs is generated.
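Here is a rough illustration of "split first, decode afterwards", sketched in Python's urllib.parse rather than in the ASP.NET routing code (describe is a helper name I made up for this example). It reproduces the two controller names seen in the route debugger.

```python
from urllib.parse import unquote, urlsplit


def describe(url):
    """Split on unencoded delimiters first, then decode each path segment."""
    parts = urlsplit(url)
    segments = [unquote(s) for s in parts.path.split("/") if s]
    return segments, parts.query, parts.fragment


# Encoded "?" and "#" are data, so they stay inside the path segment.
wrong = describe("http://localhost/General/Bpa%3fapplication%3dTrf%23/GeneralInputs")

# Unencoded "?" and "#" are delimiters, so the path ends at "Bpa".
right = describe("http://localhost/General/Bpa?application=Trf#/GeneralInputs")

print(wrong)
print(right)
```

In the first URL the second segment decodes to the whole string "Bpa?application=Trf#", which is why no controller matches.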
You're going to want to use this on your url:
string fixedUrl = System.Uri.UnescapeDataString(yourUrlHere);
Hope that works out for you!
I'm trying to get my URL to escape but it's not working properly. Ironically, on my MacBook when I execute this part of code
Uri url = new Uri("http://www.example.com/?i=123%34", true);
// it returns http://www.example.com/?i=123%34 which is exactly what I want.
The problem is that my IDE says it's obsolete and it does not work on my Windows machine. It's the exact same project, and IDE. So I tried to find a solution, which someone suggested
Uri uri = new Uri(Uri.EscapeUriString("http://www.example.com/?i=123%34"));
// this returns http://www.example.com/?i=123%2534 which is what I DONT want.
So how do I approach this issue? I looked all over the web and I can't find any solutions. I need to know how to properly escape this URL. The second method posted above does not work like the first method above.
I verified the GET requests via Fiddler, so everything is indeed happening.
Update:
Again, I need the server to receive the URL exactly how the string is declared. I want the server to handle the conversion. I cannot substitute %25 for the % symbol. It MUST be received exactly how the string is declared. Additionally, "http://www.example.com/?i=1234" is NOT what I want either.
The problem is with the configuration of your web server on Windows, that allows double escaping. Your original URL is http://www.example.com/?i=123%34, which when unescaped, becomes http://www.example.com/?i=1234.
Your web server on Windows, on the other hand, escapes the % character again instead of unescaping %34. Thus, it turns into http://www.example.com/?i=123%2534.
This is why you should not use characters like % in the URL before it gets escaped.
Edit -
I typed the following two URLs in Firefox to see how the parameters are received on the server.
The value of i in http://www.example.com/?i=123%34 is 1234.
The value of i in http://www.example.com/?i=123%2534 is 123%34
If the server must receive the % character, it must be escaped in order for it to be dispatched over HTTP. There's literally no other way to send it over the wire. If you don't escape the % character, it will be treated as an escape sequence along with 34 and automatically turn into 4 on the server.
If your network inspector shows you unescaped text in the request, it's because it's prettifying the URL before displaying it to you.
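The two Firefox results above can be reproduced offline. This is a sketch using Python's urllib.parse as a stand-in for whatever the server does when it decodes the query string, so treat it as an illustration of the rule rather than of any particular server.

```python
from urllib.parse import parse_qs, quote, unquote

raw = "123%34"

# "%34" is the escape sequence for the character "4".
decoded_once = unquote(raw)

# Escaping the raw text turns "%" into "%25", giving the double-escaped form.
escaped = quote(raw, safe="")

# Decoding the double-escaped form once gives back the literal "123%34".
round_trip = unquote(escaped)

# What a query-string parser hands to the application in each case.
received_plain = parse_qs("i=123%34")["i"][0]
received_escaped = parse_qs("i=123%2534")["i"][0]

print(decoded_once, escaped, round_trip, received_plain, received_escaped)
```

So if the application must end up with the literal text 123%34, the value on the wire has to be 123%2534.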
If you are okay with the string reading http://www.example.com/?i=1234, you can try
Uri url = new Uri(Uri.UnescapeDataString("http://www.example.com/?i=123%34"));
As part of a custom login page, I'm trying to get the query string part of a URL string that may represent an absolute or a relative URL. If it's an absolute URL, I can use the Uri.Query property, but this is not supported for relative URLs.
Is it as simple as getting the substring starting at the first instance of a '?' or is it possible for a URL to contain a question mark before the query string? Or can any other text come after the query string?
returnUrl.Substring(returnUrl.IndexOf('?'))
Where returnUrl may be absolute: "http://www.example.com/anydir/any-page1?param=1" or relative: "/anydir/any-page?param=1"
Only the first question mark in the URL has significance to indicate the start of the query string. Any after it are treated as a literal question mark.
See RFC 3986 3.4 and 3.3
It does note however:
The characters slash ("/") and question mark ("?") are allowed to
represent data within the fragment identifier. Beware that some
older, erroneous implementations may not handle this data correctly
when it is used as the base URI for relative references
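One caveat with a plain substring from the first '?': a URL with no query can still have a '?' inside its fragment, and a query may be followed by a fragment. A sketch of the rule in Python's urllib.parse (not the .NET API from the question; query_of is a made-up helper name) that handles absolute and relative URLs alike:

```python
from urllib.parse import urlsplit


def query_of(return_url):
    """Return the query string (without the leading '?') of an absolute or relative URL."""
    return urlsplit(return_url).query


absolute = query_of("http://www.example.com/anydir/any-page1?param=1")
relative = query_of("/anydir/any-page?param=1")

# Only the first "?" starts the query; later ones are literal data.
second_mark = query_of("/anydir/any-page?a=1?b=2")

# The fragment is not part of the query, even when it contains "?".
with_fragment = query_of("/anydir/any-page?param=1#top")
only_fragment = query_of("/anydir/any-page#section?x")

print(absolute, relative, second_mark, with_fragment, repr(only_fragment))
```

If you do stay with Substring/IndexOf, the same two cases (a trailing #fragment, and a '?' that only appears after '#') are the ones to guard against.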
Server.UrlEncode("My File.doc") returns "My+File.doc", whereas the JavaScript escape("My File.doc") returns "My%20File.doc". As far as I understand it, the JavaScript is correctly URL-encoding the string whereas the .NET method is not. It certainly seems to work that way in practice: putting http://somesite/My+File.doc will not fetch "My File.doc" in any case I could test using Firefox/IE and IIS, whereas http://somesite/My%20File.doc works fine. Am I missing something, or does Server.UrlEncode simply not work properly?
Use JavaScript's encodeURIComponent()/decodeURIComponent() for "round-trip" encoding with .NET's UrlEncode/UrlDecode.
EDIT
As far as I know, historically the "+" has been used in URL encoding as a special substitution for the space character (ASCII hex 20, decimal 32). If an implementation does not treat the space as a special character with the '+' substitution, then it still has to escape it using its ASCII code (hence '%20').
There is a really good discussion of the situation at http://bytes.com/topic/php/answers/5624-urlencode-vs-rawurlencode. It's inconclusive, by the way. RFC 2396 lumps the space with other characters without an unreserved representation, which sides with the '%20' crowd.
RFC 1630 sides with the '+' crowd (via forum discussion)...
Within the query string, the plus sign
is reserved as shorthand notation for
a space. Therefore, real plus signs
must be encoded. This method was used
to make query URIs easier to pass in
systems which did not allow spaces.
Also, the core RFCs are...
RFC 1630 - Universal Resource Identifiers in WWW
RFC 1738 - Uniform Resource Locators (URL)
RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax
As far as I understand it, the JavaScript is correctly URL-encoding the string whereas the .NET method is not
Actually they're both wrong!
JavaScript escape() should never be used. As well as failing to encode the + character to %2B, it encodes all non-ASCII characters as a non-standard %uNNNN sequence.
Meanwhile Server.UrlEncode is not exactly URL-encoding as such, but encoding to application/x-www-form-urlencoded, which should only normally be used for query parameters. Using + to represent a space outside of a form name=value construct, such as in a path part, is wrong.
This is rather unfortunate. You might want to try doing a string replace of the + character with %20 after encoding with UrlEncode() when you are encoding into a path part rather than a parameter. In a parameter, + and %20 are equally good.
A + instead of a space is correct URL encoding, as would escaping it to %20. See this article (CGI Programming in Perl - URL Encoding).
The + is not something that JavaScript can parse, so javascript will escape the space or + to %20.
Using System.Uri.EscapeDataString() serverside and decodeURIComponent clientside works.
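The path-versus-form distinction described above exists in most libraries under different names. As an illustration only, Python's urllib.parse has both flavours side by side: quote behaves like percent-encoding for a path part, and quote_plus like application/x-www-form-urlencoded.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

name = "My File.doc"

# Path-style percent-encoding: space becomes %20.
path_style = quote(name)

# Form-style encoding (application/x-www-form-urlencoded): space becomes "+".
form_style = quote_plus(name)

# A path-style decoder leaves "+" alone, so the form-style value is not restored.
decoded_as_path = unquote(form_style)

# A form-style decoder turns "+" back into a space.
decoded_as_form = unquote_plus(form_style)

# A real plus sign has to be escaped in the form style.
real_plus = quote_plus("1+1")

print(path_style, form_style, decoded_as_path, decoded_as_form, real_plus)
```

This mirrors the observation in the question: a "+" in the path is fetched as a literal plus, while "%20" is decoded to a space.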
In my asp.net mvc application I created the following link:
http://localhost:2689/en/Formula.mvc/351702++LYS+GRONN+5G+9%252f2++fds
I get error 400 (bad request).
I think it blocks at the %25 (forward slash).
What am I doing wrong?
--EDIT 3--
I tried not encoding anything at all but rather rely on the default encoding of Url.RouteUrl().
It seems that this doesn't encode the "/" for some reason.
If I encode it myself first, I end up with the double-encoded %252f. This gives me a bad request for some reason.
Why?!
--EDIT 2--
I generated the last part of the URI as follows:
Take the id.toString
Take the HttpUtility.UrlEncode(name)
Take the HttpUtility.UrlEncode(code)
String.Format("{0}--{1}--{2}") with the values from the previous parts
Add it as a parameter to Url.RouteUrl()
After that my action gets this parameter again, splits it at -- and HttpUtility.Decode() the values back.
I do it this way because the two last parameters are optional, but functional parameters. IF they are defined in a previous step, they have to be carried along to the other pages.
Less abstract: A color can have multiple names, but if a user selected it by a particular name, it should be kept throughout all the other pages.
--EDIT 1--
It also looks like HttpUtility.UrlEncode() and Url.Encode() return different results :S
If I don't encode the "/", it acts as a separator=>no luck there.
If I encode it with Url.Encode() I end up with %2F => Code 400
If I encode it with HttpUtility.UrlEncode() I end up with %25 => code 400
Because 400 doesn't even let it through to asp.net-mvc, the route debugger is of no use :(
I was there a couple of days ago. If you can accept unreadable route-values in the URL try this:
URL-encoded slash in URL
%25 is actually encoded "%", so %252f is encoded "%2f".
%2f (encoded "/") is not allowed in URL unless you explicitly allow it in webserver's configuration.
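To see how %252f comes about, here is a short sketch of encoding the same value twice, written with Python's urllib.parse instead of HttpUtility/Url.RouteUrl (the value "9/2" is taken from the URL in the question):

```python
from urllib.parse import quote, unquote

value = "9/2"

# First pass: "/" becomes %2F (hex digits are case-insensitive, so %2f is the same).
once = quote(value, safe="")

# Second pass: the "%" of %2F is itself encoded as %25, giving %252F.
twice = quote(once, safe="")

# Each decoding pass undoes exactly one level.
back_once = unquote(twice)
back_twice = unquote(back_once)

print(once, twice, back_once, back_twice)
```

So a URL containing %252f has been through two encoding passes, and one decoding pass leaves an encoded slash (%2f) in the path.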
Have you run the Routing debugger: http://haacked.com/archive/2008/03/13/url-routing-debugger.aspx
I haven't looked too much at the encoding - but note that if this is to be stored somewhere (or acted upon in some way), then a POST would be more appropriate. If the text on the right is actually representative of the data with id 351702 (a vanity url, much like /665354/whats-wrong-with-my-url-encoding), then you should humanize the text. Much as the spaces have been removed from the above. It is also common to have this as a separate level in the route that is simply discarded.
Generally, MVC urls should be comprehensible.
W3Schools works fine: http://www.w3schools.com/TAGS/html_form_submit.asp?text=hello/world
Here's the URL encoding reference: http://www.w3schools.com/TAGS/ref_urlencode.asp
You can't use a forward slash as a value in the URL. Here is a nice post about creating browser and SEO friendly URLS => http://www.dominicpettifer.co.uk/displayBlog.aspx?id=34
[Edit]
Whenever you create a route you associate it with a URL pattern (The default pattern is {controller}/{action}/{id}). And in this url pattern you are supposed to use the forward slash to separate different tokens. Hope that helps