Alternatives to .NET provided apis regarding uris and urls


I've recently come to the realization that the .NET APIs for working with URLs and URIs frequently come up short in achieving even basic functionality (at least easily), including things such as generating an FQDN URL from a relative path, forcing https or switching back to http, getting the root of the site, combining relative URLs properly, and so forth.
Are there any alternative libraries out there that package all of this functionality in a simple and reliable project?

I've certainly found myself writing much the same URI-manipulation code more than once in .NET, but I don't see your cases as places where it falls short.
Full URI from a relative URI:
new Uri(baseUri, relative) // works whether relative is a string or a Uri
Obtaining the actual FQDN:
string host = uri.Host;
string fqdn = host.EndsWith(".") ? host : host + ".";
Forcing https or back to http:
UriBuilder toHttp = new UriBuilder(someUri);
toHttp.Scheme = "http";
toHttp.Port = 80;
return toHttp.Uri;
UriBuilder toHttps = new UriBuilder(someUri);
toHttps.Scheme = "https";
toHttps.Port = 443;
return toHttps.Uri;
Getting the root of the site:
new Uri(startingUri, "/");
Combining relative URLs properly:
new Uri(baseUri, relUri); // We had this one already.
Only two of these take more than a single method call, and of those, obtaining the FQDN is pretty obscure (unless, rather than the dot-terminated FQDN, you just wanted the absolute URI, in which case we're back to a single method call).
There is a single-method version of the HTTPS/HTTP switch, though it's actually more cumbersome since it reads several properties of the Uri object. I can live with it taking a few lines to do this switch.
Still, to provide a new API one need only supply:
public static Uri SetHttpPrivacy(this Uri uri, bool privacy)
{
    UriBuilder ub = new UriBuilder(uri);
    if (privacy)
    {
        ub.Scheme = "https";
        ub.Port = 443;
    }
    else
    {
        ub.Scheme = "http";
        ub.Port = 80;
    }
    return ub.Uri;
}
I really can't see how an API could possibly be any more concise in the other cases.
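One variation on the extension method above, offered only as a sketch: instead of hard-coding 80 and 443, it sets UriBuilder.Port to -1 when the source URI was on its scheme's default port, which tells UriBuilder to use the default port of the new scheme, and it keeps an explicit non-default port such as 8443. The class and method names (UriSchemeExtensions, WithScheme) are made up for illustration.

```csharp
using System;

public static class UriSchemeExtensions
{
    // Hypothetical helper: returns a copy of uri with a different scheme.
    public static Uri WithScheme(this Uri uri, string scheme)
    {
        var builder = new UriBuilder(uri)
        {
            Scheme = scheme,
            // -1 means "default port for the scheme"; keep custom ports as they are.
            Port = uri.IsDefaultPort ? -1 : uri.Port
        };
        return builder.Uri;
    }
}
```

Usage would look like someUri.WithScheme("https") or someUri.WithScheme(Uri.UriSchemeHttp).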

XUri is a nice class that is part of the open source project from MindTouch
http://developer.mindtouch.com/en/ref/dream/MindTouch.Dream/XUri?highlight=XUri
This article includes a quick sample on how to use it.
http://blog.developer.mindtouch.com/2009/05/18/consuming-rest-services-and-tdd-with-plug/
I am a fan of it. A little overkill assembly wise if you are going to just use the XUri portion, but there are other really nice things in the library too.

I use a combination of extensions with 'System.IO.Path' object as well.
These are just blurbs for example.
public static Uri SecureIfRemote(this Uri uri)
{
    if (!System.Web.HttpContext.Current.Request.IsSecureConnection &&
        !System.Web.HttpContext.Current.Request.IsLocal)
    {
        return new Uri......(build secure uri here)
    }
    return uri;
}
public static NameValueCollection ParseQueryString(Uri uri)
{
    return uri.Query.ParseQueryString();
}
public static NameValueCollection ParseQueryString(this string s)
{
    return HttpUtility.ParseQueryString(s);
}

Related

Sub domain as query string

Is there any way in ASP.net C# to treat sub-domain as query string?
I mean if the user typed london.example.com then I can read that he is after london data and run a query based on that. example.com does not currently have any sub-domains.

This is a DNS problem more than a C#/ASP.Net/IIS problem. In theory, you could use a wildcard DNS record. In practice, you run into this problem from the link:
The exact rules for when a wild card will match are specified in RFC 1034, but the rules are neither intuitive nor clearly specified. This has resulted in incompatible implementations and unexpected results when they are used.
So you can try it, but it's not likely to end well. Moreover, you can fiddle with things until it works in your testing environment, but that won't be able to guarantee things go well for the general public. You'll likely do much better choosing a good DNS provider with an API, and writing code to use the API to keep individual DNS entries in sync. You can also set up your own public DNS server, though I strongly recommend using a well-known and reputable commercial DNS host.
An additional problem you can run into is the TLS/SSL certificate (because of course you're gonna use HTTPS. Right? RIGHT!?) You can try a wild card certificate and probably be okay, but depending on what else you do you may find it's not adequate; suddenly you're needing to provision a separate SSL certificate for every city entry in your database, and that can be a real pain, even via the Let's Encrypt service.
If you do try it, IIS is easily capable of mapping the requests to your ASP.Net app based on a wildcard host name, and ASP.Net itself is easily capable of reading and parsing the host name out of the request and returning different results based on that. IIS URL re-writing should be able to help with this, though I'm not sure whether you can do stock MVC routing in C#/ASP.Net based on this attribute.

I'd add to the previous answers that, once you have fixed the DNS and translated the subdomain into parameters, you can use RewritePath to pass those parameters to your pages.
For example, say a function PathTranslate() translates london.example.com to example.com/default.aspx?Town=1.
You then use RewritePath to keep the subdomain and at the same time send the parameters to your page.
string sThePathToReWrite = PathTranslate();
if (sThePathToReWrite != null)
{
    HttpContext.Current.RewritePath(sThePathToReWrite, false);
}
string PathTranslate()
{
    string sCurrentPath = HttpContext.Current.Request.Path;
    string sCurrentHost = HttpContext.Current.Request.Url.Host;
    //... lot of code ...
    return strTranslatedUrl;
}
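As a rough illustration of what the elided part of a function like PathTranslate() might do, here is a minimal sketch that works on plain strings so it can be tried outside ASP.NET. Everything in it is an assumption made for illustration: the names SubdomainRewrite and TranslateHost, the rootDomain parameter, and the choice to pass the town name rather than a numeric ID such as Town=1 (which would need a database lookup).

```csharp
using System;

public static class SubdomainRewrite
{
    // Hypothetical helper: returns the path to rewrite to, or null when the
    // host has no usable subdomain under rootDomain.
    public static string TranslateHost(string host, string rootDomain)
    {
        if (string.IsNullOrEmpty(host) || string.IsNullOrEmpty(rootDomain))
            return null;

        string suffix = "." + rootDomain;
        if (!host.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
            return null; // bare root domain or a different site

        string town = host.Substring(0, host.Length - suffix.Length);
        if (town.Length == 0 || town.Equals("www", StringComparison.OrdinalIgnoreCase))
            return null;

        // Escape the value so it is safe inside a query string.
        return "/default.aspx?Town=" + Uri.EscapeDataString(town.ToLowerInvariant());
    }
}
```

In the answer's setup, the host argument would come from HttpContext.Current.Request.Url.Host and the result would be handed to RewritePath.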

A low tech solution can be like this: (reference: https://www.pavey.me/2016/03/aspnet-c-extracting-parts-of-url.html)
public static List<string> SubDomains(this HttpRequest Request)
{
    // variables
    string[] requestArray = Request.Host().Split(".".ToCharArray());
    var subDomains = new List<string>();
    // make sure this is not an ip address
    if (Request.IsIPAddress())
    {
        return subDomains;
    }
    // make sure we have all the parts necessary
    if (requestArray == null)
    {
        return subDomains;
    }
    // last part is the tld (e.g. .com)
    // second to last part is the domain (e.g. mydomain)
    // the remaining parts are the sub-domain(s)
    if (requestArray.Length > 2)
    {
        for (int i = 0; i <= requestArray.Length - 3; i++)
        {
            subDomains.Add(requestArray[i]);
        }
    }
    // return
    return subDomains;
}
// e.g. www
public static string SubDomain(this HttpRequest Request)
{
    if (Request.SubDomains().Count > 0)
    {
        // handle cases where multiple sub-domains (e.g. dev.www)
        return Request.SubDomains().Last();
    }
    else
    {
        // handle cases where no sub-domains
        return string.Empty;
    }
}
// e.g. azurewebsites.net
public static string Domain(this HttpRequest Request)
{
    // variables
    string[] requestArray = Request.Host().Split(".".ToCharArray());
    // make sure this is not an ip address
    if (Request.IsIPAddress())
    {
        return string.Empty;
    }
    // special case for localhost
    if (Request.IsLocalHost())
    {
        return Request.Host().ToLower();
    }
    // make sure we have all the parts necessary
    if (requestArray == null)
    {
        return string.Empty;
    }
    // make sure we have all the parts necessary
    if (requestArray.Length > 1)
    {
        return $"{requestArray[requestArray.Length - 2]}.{requestArray[requestArray.Length - 1]}";
    }
    // return empty string
    return string.Empty;
}
Following question is similar to yours:
Using the subdomain as a parameter

Get specific subdomain from URL in foo.bar.car.com

Given a URL as follows:
foo.bar.car.com.au
I need to extract foo.bar.
I came across the following code :
private static string GetSubDomain(Uri url)
{
    if (url.HostNameType == UriHostNameType.Dns)
    {
        string host = url.Host;
        if (host.Split('.').Length > 2)
        {
            int lastIndex = host.LastIndexOf(".");
            int index = host.LastIndexOf(".", lastIndex - 1);
            return host.Substring(0, index);
        }
    }
    return null;
}
This gives me foo.bar.car, but I want foo.bar. Should I just use Split and take elements 0 and 1?
But then there is a possible leading www.
Is there an easy way to do this?

Given your requirement (you want the 1st two levels, not including 'www.') I'd approach it something like this:
private static string GetSubDomain(Uri url)
{
    if (url.HostNameType == UriHostNameType.Dns)
    {
        string host = url.Host;
        var nodes = host.Split('.');
        int startNode = 0;
        if (nodes[0] == "www") startNode = 1;
        return string.Format("{0}.{1}", nodes[startNode], nodes[startNode + 1]);
    }
    return null;
}

I faced a similar problem and, based on the preceding answers, wrote this extension method. Most importantly, it takes a parameter that defines the "root" domain, i.e. whatever the consumer of the method considers to be the root. In the OP's case, the call would be
var uri = new Uri("http://foo.bar.car.com.au");
uri.DnsSafeHost.GetSubdomain("car.com.au"); // returns foo.bar
uri.DnsSafeHost.GetSubdomain(); // returns foo.bar.car
Here's the extension method:
/// <summary>Gets the subdomain portion of a host name, given a known "root" domain</summary>
public static string GetSubdomain(this string url, string domain = null)
{
    var subdomain = url;
    if (subdomain != null)
    {
        if (domain == null)
        {
            // Since we were not provided with a known domain, assume that second-to-last period divides the subdomain from the domain.
            var nodes = url.Split('.');
            var lastNodeIndex = nodes.Length - 1;
            if (lastNodeIndex > 0)
                domain = nodes[lastNodeIndex - 1] + "." + nodes[lastNodeIndex];
            else
                return null; // A single label (e.g. localhost) has no subdomain.
        }
        // Verify that what we think is the domain is truly the ending of the hostname... otherwise we're hooped.
        if (!subdomain.EndsWith(domain, StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("Site was not loaded from the expected domain");
        // Cut off the domain portion at the end only, which should leave us with the subdomain and a trailing dot IF there is a subdomain.
        subdomain = subdomain.Substring(0, subdomain.Length - domain.Length);
        // Check if we have anything left. If we don't, there was no subdomain, the request was directly to the root domain:
        if (string.IsNullOrWhiteSpace(subdomain))
            return null;
        // Quash any trailing periods
        subdomain = subdomain.TrimEnd('.');
    }
    return subdomain;
}

You can use the NuGet package Nager.PublicSuffix. It uses the Public Suffix List from Mozilla to split the domain.
PM> Install-Package Nager.PublicSuffix
Example
var domainParser = new DomainParser();
var data = await domainParser.LoadDataAsync();
var tldRules = domainParser.ParseRules(data);
domainParser.AddRules(tldRules);
var domainName = domainParser.Get("sub.test.co.uk");
//domainName.Domain = "test";
//domainName.Hostname = "sub.test.co.uk";
//domainName.RegistrableDomain = "test.co.uk";
//domainName.SubDomain = "sub";
//domainName.TLD = "co.uk";

private static string GetSubDomain(Uri url)
{
    if (url.HostNameType == UriHostNameType.Dns)
    {
        string host = url.Host;
        String[] subDomains = host.Split('.');
        return subDomains[0] + "." + subDomains[1];
    }
    return null;
}

OK, first. Are you specifically looking in 'com.au', or are these general Internet domain names? Because if it's the latter, there is simply no automatic way to determine how much of the domain is a "site" or "zone" or whatever and how much is an individual "host" or other record within that zone.
If you need to be able to figure that out from an arbitrary domain name, you will want to grab the list of TLDs from the Mozilla Public Suffix project (http://publicsuffix.org) and use their algorithm to find the TLD in your domain name. Then you can assume that the portion you want ends with the last label immediately before the TLD.
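To make the suffix-matching idea concrete, here is a minimal sketch. It is not the full Public Suffix algorithm: the real list has thousands of entries plus wildcard and exception rules, while this uses a tiny hard-coded sample set (an assumption made purely for illustration). The names PublicSuffixSketch, SampleSuffixes and GetSubdomain are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class PublicSuffixSketch
{
    // Tiny illustrative sample; a real implementation loads the full list.
    private static readonly HashSet<string> SampleSuffixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "com", "au", "com.au", "uk", "co.uk"
        };

    // Returns the labels to the left of the registrable domain, or null if there are none.
    public static string GetSubdomain(string host)
    {
        string[] labels = host.TrimEnd('.').Split('.');

        // Scan from the left so the first hit is the longest matching suffix.
        for (int i = 0; i < labels.Length; i++)
        {
            string candidate = string.Join(".", labels, i, labels.Length - i);
            if (SampleSuffixes.Contains(candidate))
            {
                // labels[i - 1] is the registrable label; anything before it is the subdomain.
                if (i < 2) return null;
                return string.Join(".", labels, 0, i - 1);
            }
        }
        return null; // no known suffix matched
    }
}
```

With this sample set, foo.bar.car.com.au splits into the suffix com.au, the registrable domain car.com.au, and the subdomain foo.bar.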

I would recommend using Regular Expression. The following code snippet should extract what you are looking for...
string input = "foo.bar.car.com.au";
var match = Regex.Match(input, @"^\w*\.\w*\.\w*");
var output = match.Value;

In addition to the NuGet Nager.PublicSuffix package specified in this answer, there is also the NuGet Louw.PublicSuffix package, which according to its GitHub project page is a .NET Core library that parses the Public Suffix List and is based on the Nager.PublicSuffix project, with the following changes:
Ported to .NET Core Library.
Fixed library so it passes ALL the comprehensive tests.
Refactored classes to split functionality into smaller focused classes.
Made classes immutable. Thus DomainParser can be used as singleton and is thread safe.
Added WebTldRuleProvider and FileTldRuleProvider.
Added functionality to know if Rule was a ICANN or Private domain rule.
Use async programming model.
The page also states that many of the above changes were submitted back to the original Nager.PublicSuffix project.

Strange behavior in Uri-class (.NET)

Why does the Uri class URL-decode the URL that I pass to its constructor, and how can I prevent this?
Example (look at the querystring value "options"):
string url = "http://www.example.com/default.aspx?id=1&name=andreas&options=one%3d1%26two%3d2%26three%3d3";
Uri uri = new Uri(url); // http://www.example.com/default.aspx?id=1&name=andreas&options=one=1&two=2&three=3
Update:
// ?id=1&name=andreas&options=one%3d1%26two%3d2%26three%3d3
Request.QueryString["options"] = one=1&two=2&three=3
// ?id=1&name=andreas&options=one=1&two=2&three=3
Request.QueryString["options"] = one=1
This is my problem :)

Why exactly?
You can get to the encoded version using uri.AbsoluteUri
EDIT
Console.WriteLine("1) " + uri.AbsoluteUri);
Console.WriteLine("2) " + uri.Query);
OUT:
1) http://www.example.com/default.aspx?id=1&name=andreas&options=one%3d1%26two%3d2%26three%3d3
2) ?id=1&name=andreas&options=one%3d1%26two%3d2%26three%3d3
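Since Uri.Query keeps the percent-encoding, one way to read the nested value is to parse that string rather than a decoded copy of the whole URL. The sketch below assumes System.Web.HttpUtility is available (it lives in System.Web on .NET Framework and ships with the framework on modern .NET); the wrapper names QueryValues and Get are made up for illustration.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class QueryValues
{
    // Hypothetical helper: reads one decoded value from the still-encoded query.
    public static string Get(Uri uri, string key)
    {
        // uri.Query is still escaped, so an encoded '&' inside a value
        // is not mistaken for a parameter separator.
        NameValueCollection values = HttpUtility.ParseQueryString(uri.Query);
        return values[key];
    }
}
```

Each value is decoded exactly once, after the query has been split on the real separators, so the embedded one=1&two=2&three=3 comes back whole.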

I would expect that from a Uri class. I am quite sure that it still gets you in a good place if you use it with e.g. WebClient class (i.e. WebClient.OpenRead (Uri uri)). What's the problem in your case?

This is how .NET behaves internally. Earlier versions offered another Uri constructor that accepted a boolean saying whether or not to escape, but it has been deprecated.
The only way around it is hackish: calling a private method directly by means of reflection:
string url = "http://www.example.com/default.aspx?id=1&name=andreas&options=one%3d1%26two%3d2%26three%3d3";
Uri uri = new Uri(url);
MethodInfo mi = uri.GetType().GetMethod("CreateThis", BindingFlags.NonPublic | BindingFlags.Instance);
if (mi != null)
    mi.Invoke(uri, new object[] { url, true, UriKind.RelativeOrAbsolute });
This worked for me in a quick test, but it is not ideal because you "hack" into .NET internal code.

HttpWebRequest Url escaping

I know, the title sounds like this question has been addressed many times. But I am struggling with a specific case and I am very confused over it. Hopefully a seasoned C#'er could point me in the correct direction.
I have the code:
string serviceURL = "https://www.domain.com/service/tables/bucketname%2Ftables%2Ftesttable/imports";
HttpWebRequest dataRequest = (HttpWebRequest)WebRequest.Create(serviceURL);
Now when I quickwatch dataRequest, I see that:
RequestUri: {https://www.domain.com/service/tables/bucketname/tables/testtable/imports}
And it looks like the HttpWebRequest has changed both the %2F to /. However, the server needs the requested Uri to be exactly as serviceURL is written, containing the %2F.
Is there any way to get the HttpWebRequest class to call the Url:
https://www.domain.com/service/tables/bucketname%2Ftables%2Ftesttable/imports
Many thanks! I am at a complete loss here...
-Brett

Kyle posted the answer in a comment, so to make it official:
GETting a URL with an url-encoded slash
It's a weird work around, but nevertheless gets the job done.

As long as the problem lies in %2F being unescaped to "/", there are solutions out there: one involving a hack and, for newer versions of .NET, an app.config setting. Check here: How to make System.Uri not to unescape %2f (slash) in path?
However, I have yet to figure out how to prevent it from unescaping some specifically escaped characters, like '(' and ')' (%28 and %29). I have tried all the settings and hacks I could find to prevent the Uri class from delivering a partially unescaped path to the WebRequest. Those solutions will happily prevent %2F from being unescaped, but not %28 and %29, and possibly not most of the other specifically escaped characters.
It seems that the WebRequest asks the Uri object for exactly one value to create the "GET /path HTTP/1.1" line: Uri.PathAndQuery, which in turn calls its UriParser.GetComponents.
If you want to download from mediafire and the URL contains %28 and %29, you will get into an infinite redirect loop, as .NET keeps changing %28 and %29 to '(' and ')' and following the redirect (exception: "Too many automatic redirections were attempted").
So this is a solution for those who are stuck and have not been able to find a way to prevent the unescaping of some characters.
The only way I have found to override this (currently using .NET 4.6) and deliver my own PathAndQuery has been a combination of inheriting from UriParser and hacking its use.
public sealed class MyUriParser : System.UriParser
{
    private UriParser _originalParser;
    private MethodInfo _getComponentsMethod;

    public MyUriParser(UriParser originalParser) : base()
    {
        _originalParser = originalParser;
        _getComponentsMethod = typeof(UriParser).GetMethod("GetComponents", BindingFlags.NonPublic | BindingFlags.Instance);
        if (_getComponentsMethod == null)
        {
            throw new MissingMethodException("UriParser", "GetComponents");
        }
    }

    private static Regex rx = new Regex(@"^(?<Scheme>[^:]+):(?://((?<User>[^@/]+)@)?(?<Host>[^@:/?#]+)(:(?<Port>\d+))?)?(?<Path>([^?#]*)?)?(\?(?<Query>[^#]*))?(#(?<Fragment>.*))?$", RegexOptions.Compiled | RegexOptions.ExplicitCapture | RegexOptions.Singleline);

    protected override string GetComponents(Uri uri, UriComponents components, UriFormat format)
    {
        var original = (string)_getComponentsMethod.Invoke(_originalParser, BindingFlags.InvokeMethod, null, new object[] { uri, components, format }, null);
        if (components == UriComponents.PathAndQuery)
        {
            // Rebuild path and query from the original string so that escapes such as %28 survive.
            var reg = rx.Match(uri.OriginalString);
            var path = reg.Groups["Path"].Value;
            var query = reg.Groups["Query"];
            // An unmatched group is never null, so test Success instead of comparing to null.
            return query.Success ? $"{path}?{query.Value}" : path;
        }
        return original;
    }
}
And then hacking it into the Uri instance by replacing its UriParser with this one.
public static Uri CreateUri(string url)
{
    var uri = new Uri(url);
    if (url.Contains("%28") || url.Contains("%29"))
    {
        var originalParser = ReflectionHelper.GetValueByReflection(uri, "m_Syntax") as UriParser;
        var parser = new MyUriParser(originalParser);
        ReflectionHelper.SetValueByReflection(parser, "m_Scheme", "http");
        ReflectionHelper.SetValueByReflection(parser, "m_Port", 80);
        ReflectionHelper.SetValueByReflection(uri, "m_Syntax", parser);
    }
    return uri;
}
Due to the way UriParser works, it normally has to be registered to have its port and scheme name set, so these two values have to be set by reflection because we are not registering it the correct way. I have not found a way to register "http", as it already exists. The ReflectionHelper is just a class I have; it can be quickly replaced with normal reflection code.
Then call it like this:
HttpWebRequest dataRequest = (HttpWebRequest)WebRequest.Create(CreateUri(serviceURL));

string serviceURL = Uri.EscapeUriString("https://www.domain.com/service/tables/bucketname%2Ftables%2Ftesttable/imports");

Alternative to .NET's Uri implementation?

I have a problem with .NET's Uri implementation. It seems that if the scheme is "ftp", the query part is not parsed as a query, but as part of the path instead.
Take the following code for example:
Uri testuri = new Uri("ftp://user:pass@localhost/?passive=true");
Console.WriteLine(testuri.Query); // Outputs an empty string
Console.WriteLine(testuri.AbsolutePath); // Outputs "/%3Fpassive=true"
It seems to me that the Uri class wrongly parses the query part as part of the path. However, when the scheme is changed to http, the result is as expected:
Uri testuri = new Uri("http://user:pass@localhost/?passive=true");
Console.WriteLine(testuri.Query); // Outputs "?passive=true"
Console.WriteLine(testuri.AbsolutePath); // Outputs "/"
Does anyone have a solution to this, or know of an alternative Uri class that works as expected?

Well, the problem is not that I am unable to create an FTP connection, but that URIs are not parsed according to RFC 2396.
What I actually intended to do was to create a factory that provides implementations of a generic file transfer interface (containing get and put methods), based on a given connection URI. The URI defines the protocol, user info, host and path, and any properties that need to be passed should go through the query part of the URI (such as the passive mode option for the FTP connection).
However, this proved difficult using the .NET Uri implementation, because it seems to parse the query part of URIs differently depending on the scheme.
So I was hoping that someone knew a workaround for this, or an alternative to the seemingly broken .NET Uri implementation. It would be nice to know before spending hours implementing my own.
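One low-tech way around the scheme-dependent parsing, sketched here under stated assumptions, is to ignore Uri.Query and cut the query out of Uri.OriginalString instead. The names FtpUriQuery and GetRawQuery are hypothetical, and the sketch assumes that the first '?' before any '#' starts the query, which would not hold if the user-info part contained a literal '?'.

```csharp
using System;

public static class FtpUriQuery
{
    // Hypothetical helper: returns the query without the leading '?',
    // or an empty string when the URI has none.
    public static string GetRawQuery(Uri uri)
    {
        string text = uri.OriginalString;

        // Drop the fragment first so a '?' inside it is not picked up.
        int fragment = text.IndexOf('#');
        if (fragment >= 0) text = text.Substring(0, fragment);

        int question = text.IndexOf('?');
        return question < 0 ? string.Empty : text.Substring(question + 1);
    }
}
```

The returned string can then be split into key/value pairs by whatever query parser the factory already uses.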

I have been struggling with the same issue for a while. Attempting to replace the existing UriParser for the "ftp" scheme using UriParser.Register throws an InvalidOperationException because the scheme is already registered.
The solution I have come up with involves using reflection to modify the existing ftp parser so that it allows the query string. This is based on a workaround to another UriParser bug.
MethodInfo getSyntax = typeof(UriParser).GetMethod("GetSyntax", System.Reflection.BindingFlags.Static
    | System.Reflection.BindingFlags.NonPublic);
FieldInfo flagsField = typeof(UriParser).GetField("m_Flags", System.Reflection.BindingFlags.Instance
    | System.Reflection.BindingFlags.NonPublic);
if (getSyntax != null && flagsField != null)
{
    UriParser parser = (UriParser)getSyntax.Invoke(null, new object[] { "ftp" });
    if (parser != null)
    {
        int flagsValue = (int)flagsField.GetValue(parser);
        // Set the MayHaveQuery attribute
        int MayHaveQuery = 0x20;
        if ((flagsValue & MayHaveQuery) == 0) flagsField.SetValue(parser, flagsValue | MayHaveQuery);
    }
}
Run that somewhere in your initialization, and your ftp Uris will have the query string go into the Query property, as you would expect, instead of the path.

You should use the FtpWebRequest and FtpWebResponse classes unless you have a specific reason not to.
FtpWebRequest fwr = (FtpWebRequest)WebRequest.Create(new Uri("ftp://uri"));
fwr.Method = WebRequestMethods.Ftp.UploadFile;
fwr.Credentials = new NetworkCredential("user", "pass");
FileInfo ff = new FileInfo("localpath");
byte[] fileContents = new byte[ff.Length];
using (FileStream fr = ff.OpenRead())
{
    fr.Read(fileContents, 0, Convert.ToInt32(ff.Length));
}
using (Stream writer = fwr.GetRequestStream())
{
    writer.Write(fileContents, 0, fileContents.Length);
}
FtpWebResponse frp = (FtpWebResponse)fwr.GetResponse();
Response.Write(frp.StatusDescription);

You have to use a class specific to the FTP protocol, such as FtpWebRequest, which exposes a Uri through its RequestUri property.
I think you should look in those classes for a Uri parser.
