Create http url string - c#

I need a function that returns the correct URL from URL parts, the way a browser resolves a link against the current page.
string GetUrl(string actual,string path) {
return newurl;
}
For example:
GetUrl('http://example.com/a/b/c/a.php','z/x/c/i.php') -> http://example.com/a/b/c/z/x/c/i.php
GetUrl('http://example.com/a/b/c/a.php','/z/x/c/i.php') -> http://example.com/z/x/c/i.php
GetUrl('http://example.com/a/b/c/a.php','i.php') -> http://example.com/a/b/c/i.php
GetUrl('http://example.com/a/b/c/a.php','/o/d.php?b=1') -> http://example.com/o/d.php?b=1
GetUrl('http://example.com/a/a.php','./o/d.php?b=1') -> http://example.com/a/o/d.php?b=1
Any suggestions?

What you need is the System.UriBuilder class: http://msdn.microsoft.com/en-us/library/system.uribuilder.aspx
There is also a lightweight solution at CodeProject that doesn't depend on System.Web: http://www.codeproject.com/KB/aspnet/UrlBuilder.aspx
There is also a Query String Builder (I haven't tried it, though): http://weblogs.asp.net/bradvincent/archive/2008/10/27/helper-class-querystring-builder-chainable.aspx
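For the relative-resolution cases listed in the question, the Uri(Uri, string) constructor in System may already be enough, since it applies standard relative-reference resolution. The sketch below is one possible shape for GetUrl; the UrlResolver class name is only illustrative and does not come from any of the answers.

```csharp
using System;

public static class UrlResolver
{
    // Resolves 'path' against 'actual' the way a browser resolves a link on a page.
    // UrlResolver is a hypothetical wrapper; only the Uri constructor is framework API.
    public static string GetUrl(string actual, string path)
    {
        var baseUri = new Uri(actual, UriKind.Absolute);
        // Uri(Uri, string) handles "i.php", "/z/x/c/i.php" and "./o/d.php?b=1" alike.
        return new Uri(baseUri, path).AbsoluteUri;
    }
}
```

Calling UrlResolver.GetUrl("http://example.com/a/b/c/a.php", "i.php") should give http://example.com/a/b/c/i.php, matching the third example in the question.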

public string ConvertLink(string input)
{
//Add http:// to link url
Regex urlRx = new Regex(@"(?<url>(http(s?):[/][/]|www.)([a-z]|[A-Z]|[0-9]|[/.]|[~])*)", RegexOptions.IgnoreCase);
MatchCollection matches = urlRx.Matches(input);
foreach (Match match in matches)
{
string url = match.Groups["url"].Value;
Uri uri = new UriBuilder(url).Uri;
input = input.Replace(url, uri.AbsoluteUri);
}
return input;
}
The code locates every link inside the string with a regex, then uses UriBuilder to add a scheme to any link that lacks one. Since "http://" is the default, it is added whenever no scheme is present.

This link shows an example of how to extract the domain from a URL; once you have it, you can append the second part to build the full URL string:
http://www.jonasjohn.de/snippets/csharp/extract-domain-name-from-url.htm
This is the best way to do it, I think.
See you.

What about:
string GetUrl(string actual, string path)
{
return actual.Substring(0, actual.Length - 4).ToString() + "/" + path;
}

Related

How do I use UriBuilder for relative Uri so that the output is exactly like the input but with the path changed?

I need a method that does a redirection on URLs. The input can be absolute or relative URLs like these:
Inp: test/something.js?v=1#a
Out: testcontent/something.js?v=1#a
Inp: /test/something.js?v=1#a
Out: /testcontent/something.js?v=1#a
Inp: ://example.com/test/something.js?v=1#a
Out: ://example.com/testcontent/something.js?v=1#a
Inp: https://example.com/test/something.js?v=1#a
Out: https://example.com/testcontent/something.js?v=1#a
I know I can "fake" an absolute Uri by assigning a host so I can access Segments property and use UriBuilder. The problem is I cannot find a good way to get the output to be in the same format as whatever parts available in the input. The two methods below are my attempt: I can get the segments, modify them and join them back, and build an absolute Uri. The problem is, MakeRelativeUri changes the form of the output.
static readonly Uri BaseUri = new("https://0.0.0.0/index.html/");
static string[] GetUriSegments(Uri uri)
{
return uri.IsAbsoluteUri ?
uri.Segments :
new Uri(BaseUri, uri.OriginalString).Segments;
}
static Uri CreateUri(Uri uri, string[] segments)
{
var isAbs = uri.IsAbsoluteUri;
var absUri = isAbs ? uri : new Uri(BaseUri, uri.OriginalString);
var builder = new UriBuilder(absUri)
{
Path = string.Join("", segments)
};
var result = builder.Uri;
if (!isAbs)
{
result = BaseUri.MakeRelativeUri(result);
}
return result;
}
How should I solve this problem?
This covers all the examples provided:
Regex rg = new Regex(@"(^|/|//[^/]*/)test/");
var u2 = rg.Replace(u, "$1testcontent/");
It's unclear if you wish to change only the first test/ or all, but with this code
https://example.com/test/test/something.js?v=1#a
becomes
https://example.com/testcontent/test/something.js?v=1#a
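If only the first test/ segment should ever change, the Regex.Replace overload that takes a count can make that explicit. The following is a small sketch built on the same pattern; the PathRewriter class and Redirect method names are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathRewriter
{
    // Matches "test/" at the start of the input, after a single "/", or after "//host/".
    private static readonly Regex FirstTestSegment = new Regex(@"(^|/|//[^/]*/)test/");

    // Rewrites at most one match; everything else in the input is left untouched,
    // so relative, rooted and absolute inputs keep their original form.
    public static string Redirect(string url)
    {
        return FirstTestSegment.Replace(url, "$1testcontent/", 1);
    }
}
```

Because the replacement works on the raw string, the query string and fragment pass through unchanged, which sidesteps the MakeRelativeUri reformatting described in the question.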

Get Last two folder's name from URL using C#

I have a URL from which I need to get the folder names after "bussiness" and before the page name (i.e. "paradise-villas-little.aspx"):
http://test.com/anc/bussiness/accommo/resort/paradise-villas-little.aspx
I can't work out how to get these. I have tried RawUrl, but it returns the full URL. Please help me with how to do this.
UPDATE: This is only one example of the URL format; I need to handle it dynamically.
You can create a little helper and parse the URL from its Uri segments:
public static class Helper
{
public static IEnumerable<String> ExtractSegments(this Uri uri, String exclusiveStart)
{
bool startFound = false;
foreach (var seg in uri.Segments.Select(i => i.Replace(@"/","")))
{
if (startFound == false)
{
if (seg == exclusiveStart)
startFound = true;
}
else
{
if (!seg.Contains("."))
yield return seg;
}
}
}
}
And call it like this :
Uri uri = new Uri(@"http://test.com/anc/bussiness/accommo/resort/paradise-villas-little.aspx");
var found = uri.ExtractSegments("bussiness").ToList();
Then found contains "accommo" and "resort", and this method is extensible to any URL length, with or without file name at the end.
Nothing sophisticated in this implementation, just regular string operations:
string url = "http://test.com/anc/bussiness/accommo/resort/paradise-villas-little.aspx";
string startAfter = "bussiness"; // must match the spelling used in the URL, otherwise IndexOf returns -1
string pageName = "paradise-villas-little.aspx";
char delimiter = '/'; //not platform specific
var from = url.IndexOf(startAfter) + startAfter.Length + 1;
var to = url.Length - from - pageName.Length - 1;
var strings = url.Substring(from, to).Split(delimiter);
You may want to add validations though.
You have to use built-in string methods. The best is to use String Split.
String url = "http://test.com/anc/bussiness/accommo/resort/paradise-villas-little.aspx";
String[] url_parts = url.Split('/'); //Now you have all the parts of the URL all folders and page. Access the folder names from string array.
Hope this helps
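If the goal is literally "the last two folders", regardless of what precedes them, a position-based variant of the Uri.Segments idea might look like the sketch below. The FolderNames class and LastTwo method are hypothetical names; the code relies on folder segments ending with "/" while the page segment does not.

```csharp
using System;
using System.Linq;

public static class FolderNames
{
    // Returns the last two folder names of the URL path, ignoring the page name.
    public static string[] LastTwo(Uri uri)
    {
        // Folder segments end with "/"; the root segment is just "/" and is skipped.
        var folders = uri.Segments
            .Where(s => s.Length > 1 && s.EndsWith("/"))
            .Select(s => s.TrimEnd('/'))
            .ToArray();

        // Keep at most the final two entries.
        return folders.Skip(Math.Max(0, folders.Length - 2)).ToArray();
    }
}
```

For the URL in the question this should yield "accommo" and "resort"; a URL with fewer than two folders simply returns what is there.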

Get specific subdomain from URL in foo.bar.car.com

Given a URL as follows:
foo.bar.car.com.au
I need to extract foo.bar.
I came across the following code :
private static string GetSubDomain(Uri url)
{
if (url.HostNameType == UriHostNameType.Dns)
{
string host = url.Host;
if (host.Split('.').Length > 2)
{
int lastIndex = host.LastIndexOf(".");
int index = host.LastIndexOf(".", lastIndex - 1);
return host.Substring(0, index);
}
}
return null;
}
This gives me foo.bar.car, but I want foo.bar. Should I just use Split and take elements 0 and 1?
But then there may be a leading www.
Is there an easy way to do this?
Given your requirement (you want the 1st two levels, not including 'www.') I'd approach it something like this:
private static string GetSubDomain(Uri url)
{
if (url.HostNameType == UriHostNameType.Dns)
{
string host = url.Host;
var nodes = host.Split('.');
int startNode = 0;
if(nodes[0] == "www") startNode = 1;
return string.Format("{0}.{1}", nodes[startNode], nodes[startNode + 1]);
}
return null;
}
I faced a similar problem and, based on the preceding answers, wrote this extension method. Most importantly, it takes a parameter that defines the "root" domain, i.e. whatever the consumer of the method considers to be the root. In the OP's case, the call would be
Uri uri = new Uri("http://foo.bar.car.com.au"); // Uri has no implicit conversion from string
uri.DnsSafeHost.GetSubdomain("car.com.au"); // returns foo.bar
uri.DnsSafeHost.GetSubdomain(); // returns foo.bar.car
Here's the extension method:
/// <summary>Gets the subdomain portion of a url, given a known "root" domain</summary>
public static string GetSubdomain(this string url, string domain = null)
{
var subdomain = url;
if(subdomain != null)
{
if(domain == null)
{
// Since we were not provided with a known domain, assume that second-to-last period divides the subdomain from the domain.
var nodes = url.Split('.');
var lastNodeIndex = nodes.Length - 1;
if(lastNodeIndex > 0)
domain = nodes[lastNodeIndex-1] + "." + nodes[lastNodeIndex];
}
// Verify that what we think is the domain is truly the ending of the hostname... otherwise we're hooped.
if (!subdomain.EndsWith(domain))
throw new ArgumentException("Site was not loaded from the expected domain");
// Strip the domain portion from the end only (Replace would also remove earlier occurrences), which should leave us with the subdomain and a trailing dot IF there is a subdomain.
subdomain = subdomain.Substring(0, subdomain.Length - domain.Length);
// Check if we have anything left. If we don't, there was no subdomain, the request was directly to the root domain:
if (string.IsNullOrWhiteSpace(subdomain))
return null;
// Quash any trailing periods
subdomain = subdomain.TrimEnd(new[] {'.'});
}
return subdomain;
}
You can use the following nuget package Nager.PublicSuffix. It uses the PUBLIC SUFFIX LIST from Mozilla to split the domain.
PM> Install-Package Nager.PublicSuffix
Example
var domainParser = new DomainParser();
var data = await domainParser.LoadDataAsync();
var tldRules = domainParser.ParseRules(data);
domainParser.AddRules(tldRules);
var domainName = domainParser.Get("sub.test.co.uk");
//domainName.Domain = "test";
//domainName.Hostname = "sub.test.co.uk";
//domainName.RegistrableDomain = "test.co.uk";
//domainName.SubDomain = "sub";
//domainName.TLD = "co.uk";
private static string GetSubDomain(Uri url)
{
if (url.HostNameType == UriHostNameType.Dns)
{
string host = url.Host;
String[] subDomains = host.Split('.');
return subDomains[0] + "." + subDomains[1];
}
return null;
}
OK, first. Are you specifically looking in 'com.au', or are these general Internet domain names? Because if it's the latter, there is simply no automatic way to determine how much of the domain is a "site" or "zone" or whatever and how much is an individual "host" or other record within that zone.
If you need to be able to figure that out from an arbitrary domain name, you will want to grab the list of TLDs from the Mozilla Public Suffix project (http://publicsuffix.org) and use their algorithm to find the TLD in your domain name. Then you can assume that the portion you want ends with the last label immediately before the TLD.
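The suffix-based approach described above can be sketched as follows. Note the loud assumption: KnownSuffixes is a tiny hard-coded stand-in for the real Public Suffix List, and the SubdomainSketch class is hypothetical; production code would load the full list or use a library that does.

```csharp
using System;
using System.Linq;

public static class SubdomainSketch
{
    // Hypothetical stand-in for the Public Suffix List; NOT a complete list.
    private static readonly string[] KnownSuffixes = { "com.au", "co.uk", "com" };

    // Returns the labels to the left of the registrable domain, or null if there are none.
    public static string GetSubdomain(string host)
    {
        // Try the longest suffix first so "com.au" wins over "com".
        var suffix = KnownSuffixes
            .OrderByDescending(s => s.Length)
            .FirstOrDefault(s => host.EndsWith("." + s, StringComparison.OrdinalIgnoreCase));
        if (suffix == null) return null;

        // Everything before the suffix, e.g. "foo.bar.car" for "foo.bar.car.com.au".
        var left = host.Substring(0, host.Length - suffix.Length - 1);

        // The last remaining label ("car") belongs to the registrable domain.
        var lastDot = left.LastIndexOf('.');
        return lastDot < 0 ? null : left.Substring(0, lastDot);
    }
}
```

With this sketch, "foo.bar.car.com.au" should produce "foo.bar", while "car.com.au" has no subdomain and returns null. A leading "www" is not treated specially here.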
I would recommend using a regular expression. The following snippet matches the first three labels (foo.bar.car for this input); drop the final \.\w* from the pattern if you only want foo.bar.
string input = "foo.bar.car.com.au";
var match = Regex.Match(input, @"^\w*\.\w*\.\w*");
var output = match.Value;
In addition to the NuGet package Nager.PublicSuffix mentioned in another answer, there is also the NuGet package Louw.PublicSuffix, which according to its GitHub project page is a .NET Core library that parses the Public Suffix List. It is based on the Nager.PublicSuffix project, with the following changes:
Ported to .NET Core Library.
Fixed library so it passes ALL the comprehensive tests.
Refactored classes to split functionality into smaller focused classes.
Made classes immutable. Thus DomainParser can be used as singleton and is thread safe.
Added WebTldRuleProvider and FileTldRuleProvider.
Added functionality to know if Rule was a ICANN or Private domain rule.
Use async programming model.
The page also states that many of the above changes were submitted back to the original Nager.PublicSuffix project.

Get file extension or "HasExtension" type bool from Uri object C#

Quick question:
Can anyone think of a better way than RegEx or general text searching to work out whether a Uri object (not URL string) has a file extension?
A Uri object generated from http://example.com/contact DOES NOT
A Uri object generated from http://example.com/images/logo.png DOES
Any thoughts welcome. Apologies if I've missed something in the .NET framework / Uri class that already does this.
Slightly more complex cases:
A Uri object generated from http://example.com/contact.is.sortof.valid DOES NOT
A Uri object generated from http://example.com/images/logo.is.sort.of.valid.png DOES
I've accepted craigtp's answer; however, for what I need the solution is thus.
var hasExtension = Path.HasExtension(requestUri.AbsolutePath);
Thanks to all who had a go at this. For a full and comprehensive answer, you would obviously need a MIME types dictionary to do a further check. For example, http://example/this.is.sort.of.valid.but.not.a.mime.type would return "true" from Path.HasExtension; however, for what I need, I would never have this type of path coming in.
You can use the HasExtension method of the System.IO.Path class to determine if a Uri's string has an extension.
By using the AbsoluteUri property of the Uri object, you can retrieve the complete string that represents the Uri. Passing this string to the Path class's HasExtension method will correctly return a boolean indicating whether the Uri contains a file extension.
Copy and paste the following code into a simple console application to test this out. Only myURI3 and myURI4 return True, which also demonstrates that the HasExtension method can deal with additional characters (i.e. query strings) after the filename (and extension).
using System;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
Uri myURI1 = new Uri(@"http://www.somesite.com/");
Uri myURI2 = new Uri(@"http://www.somesite.com/filenoext");
Uri myURI3 = new Uri(@"http://www.somesite.com/filewithext.jpg");
Uri myURI4 = new Uri(@"http://www.somesite.com/filewithext.jpg?q=randomquerystring");
Console.WriteLine("Does myURI1 have an extension: " + Path.HasExtension(myURI1.AbsoluteUri));
Console.WriteLine("Does myURI2 have an extension: " + Path.HasExtension(myURI2.AbsoluteUri));
Console.WriteLine("Does myURI3 have an extension: " + Path.HasExtension(myURI3.AbsoluteUri));
Console.WriteLine("Does myURI4 have an extension: " + Path.HasExtension(myURI4.AbsoluteUri));
Console.ReadLine();
}
}
}
EDIT:
Based upon the question asker's edit regarding determining if the extension is a valid extension, I've whipped up some new code below (copy & paste into a console app):
using System;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
Uri myUri1 = new Uri("http://www.somesite.com/folder/file.jpg?q=randomquery.string");
string path1 = String.Format("{0}{1}{2}{3}", myUri1.Scheme, Uri.SchemeDelimiter, myUri1.Authority, myUri1.AbsolutePath);
string extension1 = Path.GetExtension(path1);
Console.WriteLine("Extension of myUri1: " + extension1);
Uri myUri2 = new Uri("http://www.somesite.com/folder/?q=randomquerystring");
string path2 = String.Format("{0}{1}{2}{3}", myUri2.Scheme, Uri.SchemeDelimiter, myUri2.Authority, myUri2.AbsolutePath);
string extension2 = Path.GetExtension(path2);
Console.WriteLine("Extension of myUri2: " + extension2);
Console.ReadLine();
}
}
}
This new code now de-constructs all of the component parts of a Uri object (i.e. Scheme - the http part etc.) and specifically removes any querystring part of the Uri. This gets around the potential problem as noted by Adriano in a comment on this answer that the querystring could contain a dot character (thereby potentially messing up the HasExtension method).
Once the Uri is de-constructed, we can now properly determine both if the Uri string has an extension and also what that extension is.
From here, it's merely a case of matching this extension against a list of known valid extensions. This part is something that the .NET framework will never give you, as any file extension is potentially valid (any application can make up its own file extension if it so desires!).
The Uri.IsFile property suggested by others does not work.
From the docs
The IsFile property is true when the Scheme property equals UriSchemeFile.
file://server/filename.ext
http://msdn.microsoft.com/en-us/library/system.uri.isfile.aspx
What you can do is get the AbsolutePath of the URI (which corresponds to /contact or /images/logo.png for example) and then use the FileInfo class to check/get the extension.
var uris = new List<Uri>()
{
new Uri("http://mysite.com/contact"),
new Uri("http://mysite.com/images/logo.png"),
new Uri("http://mysite.com/images/logo.png?query=value"),
};
foreach (var u in uris)
{
var fi = new FileInfo(u.AbsolutePath);
var ext = fi.Extension;
if (!string.IsNullOrWhiteSpace(ext))
{
Console.WriteLine(ext);
}
}
You probably need to check against a list of supported extensions to handle the more complicated cases (contact.is.sortof.valid and contact.is.sortof.valid.png)
Tests:
"http://mysite.com/contact" //no ext
"http://mysite.com/contact?query=value" //no ext
"http://mysite.com/contact?query=value.value" //no ext
"http://mysite.com/contact/" //no ext
"http://mysite.com/images/logo.png" //.png
"http://mysite.com/images/logo.png?query=value" //.png
"http://mysite.com/images/logo.png?query=value.value" //.png
"http://mysite.com/contact.is.sortof.valid" //.valid
"http://mysite:123/contact.is.sortof.valid" //.valid
Take a look at the UriBuilder Class. Not only can you retrieve certain parts of the url, but you can also swap them out at will.
public bool HasExtension(Uri myUri)
{
var validExtensions = new List<string>() { ".png", ".jpg" };
var builder = new UriBuilder(myUri);
foreach (var extension in validExtensions)
{
// Compare the end of the path (not the whole path) against each known extension.
if (builder.Path.EndsWith(extension, StringComparison.InvariantCultureIgnoreCase))
return true;
}
return false;
}
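The whitelist idea can also be combined with Path.GetExtension on AbsolutePath, which leaves the query string out of the comparison. This is only a sketch: the UriExtensionCheck class, the HasKnownExtension method and the two whitelisted extensions are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class UriExtensionCheck
{
    // Hypothetical whitelist; adjust to whatever the application actually serves.
    private static readonly HashSet<string> Known =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".png", ".jpg" };

    public static bool HasKnownExtension(Uri uri)
    {
        // AbsolutePath excludes the query string and fragment, so dots there are ignored.
        var ext = Path.GetExtension(uri.AbsolutePath);
        // GetExtension returns an empty string when there is no extension.
        return Known.Contains(ext);
    }
}
```

Under these assumptions, http://mysite.com/contact.is.sortof.valid is rejected because ".valid" is not in the set, while logo.png with any query string is accepted.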
here is my solution to make it right ;)
var inputString = ("http://ask.com/pic.JPG http://aSk.com/pIc.JPG "
+ "http://ask.com/pic.jpg "
+ "http://yoursite.com/contact "
+ "http://yoursite.com/contact?query=value "
+ "http://yoursite.com/contact?query=value.value "
+ "http://yoursite.com/contact/ "
+ "http://yoursite.com/images/Logo.pnG "
+ "http://yoursite.com/images/lOgo.pNg?query=value "
+ "http://yoursite.com/images/logo.png?query=value.value "
+ "http://yoursite.com/contact.is.sortof.valid "
+ "http://mysite:123/contact.is.sortof.valid").Split(' ');
var resultString = "";
foreach (var is1 in inputString)
{
resultString += (!string.IsNullOrEmpty(resultString) ? " " : "") +
(Path.HasExtension(is1) ? Path.ChangeExtension(is1, Path.GetExtension(is1).ToLower()) : is1);
}

Getting URL from file path in IHttpHandler (Generic handler)

In my IHttpHandler class (for an .ashx page), I want to search a directory for certain files, and return relative urls. I can get the files, no problem:
string dirPath = context.Server.MapPath("~/mydirectory");
string[] files = Directory.GetFiles(dirPath, "*foo*.txt");
IEnumerable<string> relativeUrls = files.Select(f => WHAT GOES HERE? );
What is the easiest way to convert file paths to relative urls? If I were in an aspx page, I could say this.ResolveUrl(). I know I could do some string parsing and string replacement to get the relative url, but is there some built-in method that will take care of all of that for me?
Edit: To clarify, without doing my own string parsing, how do I go from this:
"E:\Webs\WebApp1\WebRoot\mydirectory\foo.txt"
to this:
"/mydirectory/foo.txt"
I'm looking for an existing method like:
public string GetRelativeUrl(string filePath) { }
I can imagine a lot of people having this question... My solution is:
public static string ResolveRelative(string toresolve)
{
// A static method has no handler context of its own; this assumes a request is in progress.
var context = HttpContext.Current;
var requestUrl = context.Request.Url;
string baseUrl = string.Format("{0}://{1}{2}{3}",
requestUrl.Scheme, requestUrl.Host,
(requestUrl.IsDefaultPort ? "" : ":" + requestUrl.Port),
context.Request.ApplicationPath);
if (toresolve.StartsWith("~"))
{
return baseUrl + toresolve.Substring(1);
}
else
{
return new Uri(new Uri(baseUrl), toresolve).ToString();
}
}
update
Or from filename to virtual path (haven't tested it; you might need some code similar to ResolveRelative above as well... let me know if it works):
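public static string GetUrl(string filename)
{
// Assumes a request is in progress; HttpContext.Current is null otherwise.
var request = HttpContext.Current.Request;
if (filename.StartsWith(request.PhysicalApplicationPath))
{
// Join with exactly one "/" and turn backslashes into URL-style slashes.
return request.ApplicationPath.TrimEnd('/') + "/" +
filename.Substring(request.PhysicalApplicationPath.Length).Replace('\\', '/');
}
else
{
throw new ArgumentException("Incorrect physical path");
}
}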
Try the System.Web.Hosting.HostingEnvironment.MapPath method; it's static and can be accessed anywhere in a web application.
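For the specific conversion asked about ("E:\Webs\WebApp1\WebRoot\mydirectory\foo.txt" to "/mydirectory/foo.txt"), a plain string-based helper is one option when no built-in method fits. The sketch below assumes Windows-style paths and an application mounted at the site root; PathToUrl and ToRelativeUrl are hypothetical names, and appRoot would typically be obtained from context.Server.MapPath("~/").

```csharp
using System;

public static class PathToUrl
{
    // Converts a physical file path under 'appRoot' into a root-relative URL path.
    // Assumes backslash-separated paths and an application served from "/".
    public static string ToRelativeUrl(string appRoot, string filePath)
    {
        var root = appRoot.TrimEnd('\\', '/');
        if (!filePath.StartsWith(root + "\\", StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("File is not under the application root.", nameof(filePath));

        // Keep the leading separator, then switch to URL-style slashes.
        return filePath.Substring(root.Length).Replace('\\', '/');
    }
}
```

In the handler from the question this could be used as files.Select(f => PathToUrl.ToRelativeUrl(context.Server.MapPath("~/"), f)). File names containing characters that need URL escaping are not handled here.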
