Get Last two folder's name from URL using C#

I have a URL, and I need to get the folder names that come after "bussiness" and before the page name ("paradise-villas-little.aspx") in the URL below.
http://test.com/anc/bussiness/accommo/resort/paradise-villas-little.aspx
I can't work out how to do this. I tried RawUrl, but it returns the full URL. How can I do this?
UPDATE: This is just one example of the URL format; I need to handle such URLs dynamically.

You can create a small helper that parses the URL from its Uri segments:
public static class Helper
{
public static IEnumerable<String> ExtractSegments(this Uri uri, String exclusiveStart)
{
bool startFound = false;
foreach (var seg in uri.Segments.Select(i => i.Replace(@"/","")))
{
if (startFound == false)
{
if (seg == exclusiveStart)
startFound = true;
}
else
{
if (!seg.Contains("."))
yield return seg;
}
}
}
}
And call it like this :
Uri uri = new Uri(@"http://test.com/anc/bussiness/accommo/resort/paradise-villas-little.aspx");
var found = uri.ExtractSegments("bussiness").ToList();
Then found contains "accommo" and "resort". The method works for a URL of any length, with or without a file name at the end.

Nothing sophisticated in this implementation, just regular string operations:
string url = "http://test.com/anc/bussiness/accommo/resort/paradise-villas-little.aspx";
string startAfter = "bussiness"; // must match the spelling in the URL, otherwise IndexOf returns -1
string pageName = "paradise-villas-little.aspx";
char delimiter = '/'; //not platform specific
var from = url.IndexOf(startAfter) + startAfter.Length + 1;
var to = url.Length - from - pageName.Length - 1;
var strings = url.Substring(from, to).Split(delimiter);
You may want to add validations though.
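As one possible shape for those validations, here is a minimal sketch. The class name UrlSlice and the method FoldersAfter are hypothetical, and the sketch assumes the URL has no query string containing slashes. Unlike the snippet above, it does not need the page name up front: it treats everything after the last slash as the page.

```csharp
using System;

public static class UrlSlice
{
    // Hypothetical helper: folder names between the marker folder and the final path segment.
    public static string[] FoldersAfter(string url, string startAfter)
    {
        if (url == null) throw new ArgumentNullException(nameof(url));
        if (string.IsNullOrEmpty(startAfter))
            throw new ArgumentException("A marker folder is required.", nameof(startAfter));

        // Match the marker as a whole folder name, not as a substring of another folder.
        string marker = "/" + startAfter + "/";
        int markerIndex = url.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
        if (markerIndex < 0) return new string[0];   // marker folder not present

        int from = markerIndex + marker.Length;
        int lastSlash = url.LastIndexOf('/');
        if (lastSlash < from) return new string[0];  // nothing between the marker and the page

        return url.Substring(from, lastSlash - from).Split('/');
    }
}
```

Returning an empty array instead of throwing when the marker is missing is a design choice made for this sketch; throwing may suit callers that treat a missing marker as an error.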

You can use the built-in string methods. The simplest is String.Split.
String url = "http://test.com/anc/bussiness/accommo/resort/paradise-villas-little.aspx";
// url_parts now holds every part of the URL (all folders and the page); read the folder names from the array.
String[] url_parts = url.Split('/');
Hope this helps
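If the goal is literally the last two folders before the page, whatever they are called, a sketch along the following lines may be enough. UrlFolders and LastFolders are hypothetical names, and the code assumes an absolute URL whose folder segments end with a slash, which is how Uri.Segments reports them.

```csharp
using System;
using System.Linq;

public static class UrlFolders
{
    // Hypothetical helper: the last `count` folder names that precede the page name.
    public static string[] LastFolders(string url, int count)
    {
        var uri = new Uri(url, UriKind.Absolute);

        // Segments look like "/", "anc/", "bussiness/", ..., "page.aspx";
        // folders end with '/', and the leading "/" is the root, not a folder.
        string[] folders = uri.Segments
            .Where(s => s.EndsWith("/") && s != "/")
            .Select(s => s.TrimEnd('/'))
            .ToArray();

        return folders.Skip(Math.Max(0, folders.Length - count)).ToArray();
    }
}
```

Calling UrlFolders.LastFolders(url, 2) on the URL from the question yields "accommo" and "resort". If the URL has fewer folders than requested, the sketch returns the ones that exist.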

Related

How to back-calculate a Url / Uri from two partial Uris and an assumption

I have a relative path stored in a database field. It looks like this:
/images/2012-6/file.jpg
I can access the absolute Url via:
http://www.blah.com/images/2012-6/file.jpg
The thumbnail relative path isn't stored in the database table. However I know that it can be found on:
http://www.blah.com/images/thumbnails/2012-6/file.jpg
Given I know http://www.blah.com/ and /images/2012-6/file.jpg, what is the best algorithm to determine:
http://www.blah.com/images/thumbnails/2012-6/file.jpg
Note: The naming scheme could be multiple folders deep on both the relative path and the thumbnail folder root path, e.g. /images/whatever/something/thumbnails/.
The one and only assumption that you can make is that both will start with the same first segment (folder). i.e. /images/.
This is in a console app (no System.Web if I can help it). This is the best I have come up with so far:
public static void Main()
{
var test = GetThumbnail("http://www.blah.com/", "/images/folder/subfolder/test/again/filename.jpg", "/images/extra/thumbnails/");
Console.WriteLine(test);
}
private static string GetThumbnail(string baseUriString, string relativePath, string thumbsPath)
{
var root = new Uri(baseUriString, UriKind.Absolute);
var absolute = new Uri(root, relativePath);
var segments = absolute.Segments;
var thumbnailUri = new Uri(root, thumbsPath);
var thumbSegments = thumbnailUri.Segments;
var matchedRoot = Array.IndexOf(segments, thumbSegments[1]);
var builder = new UriBuilder(root);
for (var j = 1; j < segments.Length; j++)
{
if (j == matchedRoot)
{
for (var k = 1; k < thumbSegments.Length; k++)
{
builder.Path += thumbSegments[k];
}
}
else
{
builder.Path += segments[j];
}
}
return builder.Uri.ToString();
}
It seems convoluted. Any better way of doing this?

If I've understood your requirements correctly, you fundamentally want to:
Break the paths into segments
Find the overlapping first segment
Insert all the thumbs segments into that spot in the Uri.
Your code is already doing that, but I believe the following code makes those steps more clear:
private static string GetThumbnail(string baseUriString, string relativePath, string thumbsPath)
{
// 1. Break the paths into segments
// the trim here is important to prevent empty list items or "//" when we rebuild later.
List<string> relativeSegments = relativePath.Trim('/').Split('/').ToList();
List<string> thumbsSegments = thumbsPath.Trim('/').Split('/').ToList();
// 2. Find the first segment of thumbs in the relative segments
int sharedSegmentIndex = relativeSegments.IndexOf(thumbsSegments[0]);
// 3. Insert all the thumbs segments into that spot in the Uri.
// remove the overlapping segment (because we're adding it again below)
relativeSegments.RemoveAt(sharedSegmentIndex);
// insert the thumbs segment
relativeSegments.InsertRange(sharedSegmentIndex, thumbsSegments);
// build up the full path.
var root = new Uri(baseUriString, UriKind.Absolute);
var builder = new UriBuilder(root);
builder.Path += string.Join("/", relativeSegments);
return builder.Uri.ToString();
}
Also, here is my quick and dirty test suite, so you can check that it produces the desired outputs, and for anyone else wanting to tinker.
public static void Main()
{
RunTest("http://www.blah.com/", "/images/2012-6/file.jpg", "/images/thumbnails/", "http://www.blah.com/images/thumbnails/2012-6/file.jpg");
RunTest("http://www.blah.com/", "/images/folder/subfolder/test/again/filename.jpg", "/images/extra/thumbnails/", "http://www.blah.com/images/extra/thumbnails/folder/subfolder/test/again/filename.jpg");
RunTest("http://www.blah.com/", "/myapp/images/folder/subfolder/test/again/filename.jpg", "/images/extra/thumbnails/", "http://www.blah.com/myapp/images/extra/thumbnails/folder/subfolder/test/again/filename.jpg");
}
public static void RunTest(string baseUriString, string relativePath, string thumbsPath, string expected)
{
var result = GetThumbnail(baseUriString, relativePath, thumbsPath);
Debug.Assert(result == expected);
}

Get specific subdomain from URL in foo.bar.car.com

Given a URL as follows:
foo.bar.car.com.au
I need to extract foo.bar.
I came across the following code :
private static string GetSubDomain(Uri url)
{
if (url.HostNameType == UriHostNameType.Dns)
{
string host = url.Host;
if (host.Split('.').Length > 2)
{
int lastIndex = host.LastIndexOf(".");
int index = host.LastIndexOf(".", lastIndex - 1);
return host.Substring(0, index);
}
}
return null;
}
This gives me foo.bar.car. I want foo.bar. Should I just use Split and take elements 0 and 1?
But then there is a possible www.
Is there an easy way to do this?

Given your requirement (you want the 1st two levels, not including 'www.') I'd approach it something like this:
private static string GetSubDomain(Uri url)
{
if (url.HostNameType == UriHostNameType.Dns)
{
string host = url.Host;
var nodes = host.Split('.');
int startNode = 0;
if(nodes[0] == "www") startNode = 1;
return string.Format("{0}.{1}", nodes[startNode], nodes[startNode + 1]);
}
return null;
}

I faced a similar problem and, based on the preceding answers, wrote this extension method. Most importantly, it takes a parameter that defines the "root" domain, i.e. whatever the consumer of the method considers to be the root. In the OP's case, the call would be
Uri uri = new Uri("http://foo.bar.car.com.au"); // a string cannot be assigned to a Uri directly
uri.DnsSafeHost.GetSubdomain("car.com.au"); // returns foo.bar
uri.DnsSafeHost.GetSubdomain(); // returns foo.bar.car
Here's the extension method:
/// <summary>Gets the subdomain portion of a url, given a known "root" domain</summary>
public static string GetSubdomain(this string url, string domain = null)
{
var subdomain = url;
if(subdomain != null)
{
if(domain == null)
{
// Since we were not provided with a known domain, assume that second-to-last period divides the subdomain from the domain.
var nodes = url.Split('.');
var lastNodeIndex = nodes.Length - 1;
if(lastNodeIndex > 0)
domain = nodes[lastNodeIndex-1] + "." + nodes[lastNodeIndex];
}
// Verify that what we think is the domain is truly the ending of the hostname... otherwise we're hooped.
if (!subdomain.EndsWith(domain))
throw new ArgumentException("Site was not loaded from the expected domain");
// Quash the domain portion, which should leave us with the subdomain and a trailing dot IF there is a subdomain.
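subdomain = subdomain.Substring(0, subdomain.Length - domain.Length); // strip only the trailing domain; Replace would also remove earlier occurrences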
// Check if we have anything left. If we don't, there was no subdomain, the request was directly to the root domain:
if (string.IsNullOrWhiteSpace(subdomain))
return null;
// Quash any trailing periods
subdomain = subdomain.TrimEnd(new[] {'.'});
}
return subdomain;
}

You can use the NuGet package Nager.PublicSuffix. It uses Mozilla's Public Suffix List to split the domain.
PM> Install-Package Nager.PublicSuffix
Example
var domainParser = new DomainParser();
var data = await domainParser.LoadDataAsync();
var tldRules = domainParser.ParseRules(data);
domainParser.AddRules(tldRules);
var domainName = domainParser.Get("sub.test.co.uk");
//domainName.Domain = "test";
//domainName.Hostname = "sub.test.co.uk";
//domainName.RegistrableDomain = "test.co.uk";
//domainName.SubDomain = "sub";
//domainName.TLD = "co.uk";

private static string GetSubDomain(Uri url)
{
if (url.HostNameType == UriHostNameType.Dns)
{
string host = url.Host;
String[] subDomains = host.Split('.');
return subDomains[0] + "." + subDomains[1];
}
return null;
}

OK, first. Are you specifically looking in 'com.au', or are these general Internet domain names? Because if it's the latter, there is simply no automatic way to determine how much of the domain is a "site" or "zone" or whatever and how much is an individual "host" or other record within that zone.
If you need to be able to figure that out from an arbitrary domain name, you will want to grab the list of TLDs from the Mozilla Public Suffix project (http://publicsuffix.org) and use their algorithm to find the TLD in your domain name. Then you can assume that the portion you want ends with the last label immediately before the TLD.
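To make the suffix-matching idea concrete, here is a minimal sketch. It uses a tiny hard-coded sample of suffixes, which is an assumption made for illustration; a real implementation would load the full list from publicsuffix.org and also handle its wildcard and exception rules, which this sketch ignores. SuffixSketch and GetSubdomain are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SuffixSketch
{
    // Illustrative sample only; the real Public Suffix List is far larger.
    private static readonly HashSet<string> Suffixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "com", "au", "com.au", "uk", "co.uk" };

    // Returns the labels to the left of the registrable domain, or null if there are none.
    public static string GetSubdomain(string host)
    {
        string[] labels = host.Split('.');

        // Scan from the left so the first hit is the longest matching suffix.
        for (int i = 0; i < labels.Length; i++)
        {
            string candidate = string.Join(".", labels, i, labels.Length - i);
            if (Suffixes.Contains(candidate))
            {
                // labels[i - 1] is the registrable label; anything before it is the subdomain.
                if (i < 2) return null;
                return string.Join(".", labels, 0, i - 1);
            }
        }
        return null; // no known suffix matched
    }
}
```

With this sample list, "foo.bar.car.com.au" splits at "com.au", so "car" is the registrable label and "foo.bar" is returned.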

I would recommend using Regular Expression. The following code snippet should extract what you are looking for...
string input = "foo.bar.car.com.au";
var match = Regex.Match(input, @"^\w+\.\w+"); // first two labels only, i.e. "foo.bar"
var output = match.Value;

In addition to the Nager.PublicSuffix NuGet package mentioned in another answer, there is also the Louw.PublicSuffix NuGet package, which according to its GitHub project page is a .NET Core library that parses the Public Suffix List and is based on the Nager.PublicSuffix project, with the following changes:
Ported to .NET Core Library.
Fixed library so it passes ALL the comprehensive tests.
Refactored classes to split functionality into smaller focused classes.
Made classes immutable. Thus DomainParser can be used as singleton and is thread safe.
Added WebTldRuleProvider and FileTldRuleProvider.
Added functionality to know if Rule was a ICANN or Private domain rule.
Use async programming model
The page also states that many of above changes were submitted back to original Nager.PublicSuffix project.

Issue in Parsing Json image in C#

I am using .NET with C#. I am trying to parse JSON from a web service. I have done it for the text fields, but I am having a problem parsing the image. Here is the URL I am getting the JSON from:
http://collectionking.com/rest/view/items_in_collection.json?args=122
And this is my code to parse it:
using (var wc = new WebClient()) {
JavaScriptSerializer js = new JavaScriptSerializer();
var result = js.Deserialize<ck[]>(wc.DownloadString("http://collectionking.com/rest/view/items_in_collection.json args=122"));
foreach (var i in result) {
lblTitle.Text = i.node_title;
imgCk.ImageUrl = i.["main image"];
lblNid.Text = i.nid;
Any help would be great.
Thanks in advance.
PS: It returns the Title and Nid but not the Image.
My class is as follows:
public class ck
{
public string node_title;
public string main_image;
public string nid; }

Your problem is that you are setting ImageUrl to something like <img typeof="foaf:Image" src="http://... rather than an actual URL. You will need to parse main image further and extract the URL to display it correctly.
Edit
This was a tough nut to crack because of the whitespace in the property name. The only solution I could find was to remove the whitespace before parsing the string. It's not a very nice solution, but I couldn't find any other way using the built-in classes. You might be able to solve it properly using JSON.Net or some other library though.
I also added a regular expression to extract the URL for you, though there is no error checking whatsoever here, so you'll need to add that yourself.
using (var wc = new WebClient()) {
JavaScriptSerializer js = new JavaScriptSerializer();
var result = js.Deserialize<ck[]>(wc.DownloadString("http://collectionking.com/rest/view/items_in_collection.json?args=122").Replace("\"main image\":", "\"main_image\":")); // Replace the name "main image" with "main_image" to deserialize it properly, also fixed missing ? in url
foreach (var i in result) {
lblTitle.Text = i.node_title;
string realImageUrl = Regex.Match(i.main_image, @"src=""(.*?)""").Groups[1].Value; // Extract the value of the src attribute to get the actual URL; yields an empty string if there isn't a src attribute
imgCk.ImageUrl = realImageUrl;
lblNid.Text = i.nid;
}
}
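As a sketch of the "solve it properly with a library" route, the following maps the property name that contains a space through an attribute instead of rewriting the JSON text. It assumes a runtime where System.Text.Json is available, which is not the JavaScriptSerializer setup used above. CollectionItem, ImageJson, Parse and ExtractSrc are hypothetical names, and the sketch assumes nid arrives as a JSON string.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Text.RegularExpressions;

public class CollectionItem
{
    // Map JSON names explicitly, including the one that contains a space.
    [JsonPropertyName("node_title")] public string NodeTitle { get; set; }
    [JsonPropertyName("main image")] public string MainImage { get; set; }
    [JsonPropertyName("nid")] public string Nid { get; set; }
}

public static class ImageJson
{
    // Deserializes a JSON array of items.
    public static CollectionItem[] Parse(string json)
    {
        return JsonSerializer.Deserialize<CollectionItem[]>(json);
    }

    // Returns the value of the src attribute, or null when the markup has none.
    public static string ExtractSrc(string imgTag)
    {
        if (string.IsNullOrEmpty(imgTag)) return null;
        Match m = Regex.Match(imgTag, "src=\"(.*?)\"");
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

A caller would then assign ImageJson.ExtractSrc(item.MainImage) to the image control, after checking the result for null.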

Try This
private static string ExtractImageFromTag(string tag)
{
// "src=\"" is 5 characters long, so the attribute value starts at start + 5.
int start = tag.IndexOf("src=\""),
end = tag.IndexOf("\"", start + 5);
return tag.Substring(start + 5, end - start - 5);
}
private static string ExtractTitleFromTag(string tag)
{
int start = tag.IndexOf(">"),
end = tag.IndexOf("<", start + 1);
return tag.Substring(start + 1, end - start - 1);
}
It may help

Getting URL from file path in IHttpHandler (Generic handler)

In my IHttpHandler class (for an .ashx page), I want to search a directory for certain files, and return relative urls. I can get the files, no problem:
string dirPath = context.Server.MapPath("~/mydirectory");
string[] files = Directory.GetFiles(dirPath, "*foo*.txt");
IEnumerable<string> relativeUrls = files.Select(f => WHAT GOES HERE? );
What is the easiest way to convert file paths to relative urls? If I were in an aspx page, I could say this.ResolveUrl(). I know I could do some string parsing and string replacement to get the relative url, but is there some built-in method that will take care of all of that for me?
Edit: To clarify, without doing my own string parsing, how do I go from this:
"E:\Webs\WebApp1\WebRoot\mydirectory\foo.txt"
to this:
"/mydirectory/foo.txt"
I'm looking for an existing method like:
public string GetRelativeUrl(string filePath) { }

I can imagine a lot of people having this question... My solution is:
public static string ResolveRelative(string toresolve)
{
// A static method has no handler context of its own, so take the current request's (requires System.Web).
var context = HttpContext.Current;
var requestUrl = context.Request.Url;
string baseUrl = string.Format("{0}://{1}{2}{3}",
requestUrl.Scheme, requestUrl.Host,
(requestUrl.IsDefaultPort ? "" : ":" + requestUrl.Port),
context.Request.ApplicationPath);
if (toresolve.StartsWith("~"))
{
return baseUrl + toresolve.Substring(1);
}
else
{
return new Uri(new Uri(baseUrl), toresolve).ToString();
}
}
update
Or, to go from a file name to a virtual path (I haven't tested it; you might also need some code similar to ResolveRelative above... let me know if it works):
public static string GetUrl(string filename)
{
var context = HttpContext.Current; // static method: take the context from the current request
if (filename.StartsWith(context.Request.PhysicalApplicationPath))
{
return context.Request.ApplicationPath +
filename.Substring(context.Request.PhysicalApplicationPath.Length);
}
else
{
throw new ArgumentException("Incorrect physical path");
}
}
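The same idea can be sketched without any dependency on the request, which also makes it easy to test. The physical and virtual roots are passed in as parameters here; in a handler they would presumably come from Request.PhysicalApplicationPath and Request.ApplicationPath. PathToUrl and ToRelativeUrl are hypothetical names. The sketch also converts backslashes to forward slashes.

```csharp
using System;

public static class PathToUrl
{
    // Hypothetical helper: maps a physical file path under physicalRoot to a URL under virtualRoot.
    public static string ToRelativeUrl(string physicalRoot, string virtualRoot, string filePath)
    {
        if (!filePath.StartsWith(physicalRoot, StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("File is not under the application root.", nameof(filePath));

        // Keep the part below the root and switch to URL separators.
        string tail = filePath.Substring(physicalRoot.Length)
                              .Replace('\\', '/')
                              .TrimStart('/');

        // Trim the virtual root so that "/" and "/app" both join with exactly one slash.
        return virtualRoot.TrimEnd('/') + "/" + tail;
    }
}
```

For the paths in the question, ToRelativeUrl(@"E:\Webs\WebApp1\WebRoot\", "/", @"E:\Webs\WebApp1\WebRoot\mydirectory\foo.txt") gives "/mydirectory/foo.txt". File names with characters that need URL encoding are not handled by this sketch.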

Try the System.Web.Hosting.HostingEnvironment.MapPath method; it is static and can be accessed anywhere in a web application.

Create http url string

I need a function that returns the correct URL from URL parts (as browsers do):
string GetUrl(string actual,string path) {
return newurl;
}
For example:
GetUrl("http://example.com/a/b/c/a.php", "z/x/c/i.php") -> http://example.com/a/b/c/z/x/c/i.php
GetUrl("http://example.com/a/b/c/a.php", "/z/x/c/i.php") -> http://example.com/z/x/c/i.php
GetUrl("http://example.com/a/b/c/a.php", "i.php") -> http://example.com/a/b/c/i.php
GetUrl("http://example.com/a/b/c/a.php", "/o/d.php?b=1") -> http://example.com/o/d.php?b=1
GetUrl("http://example.com/a/a.php", "./o/d.php?b=1") -> http://example.com/a/o/d.php?b=1
Any suggestions?

What you need is the System.UriBuilder class: http://msdn.microsoft.com/en-us/library/system.uribuilder.aspx
There is also a lightweight solution at CodeProject that doesn't depend on System.Web: http://www.codeproject.com/KB/aspnet/UrlBuilder.aspx
There is also a Query String Builder (but I haven't tried it): http://weblogs.asp.net/bradvincent/archive/2008/10/27/helper-class-querystring-builder-chainable.aspx
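For browser-style resolution specifically, the System.Uri constructor that takes a base Uri and a relative string may already be enough. The sketch below assumes that actual is always an absolute URL; UrlResolver is a hypothetical class name.

```csharp
using System;

public static class UrlResolver
{
    // Resolves `path` against `actual` the way a browser resolves a link on a page.
    public static string GetUrl(string actual, string path)
    {
        var baseUri = new Uri(actual, UriKind.Absolute);
        return new Uri(baseUri, path).AbsoluteUri;
    }
}
```

With this, GetUrl("http://example.com/a/b/c/a.php", "z/x/c/i.php") resolves against the folder of a.php, and a path starting with "/" resolves against the site root, matching the examples in the question.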

public string ConvertLink(string input)
{
//Add http:// to link url
Regex urlRx = new Regex(@"(?<url>(http(s?):[/][/]|www.)([a-z]|[A-Z]|[0-9]|[/.]|[~])*)", RegexOptions.IgnoreCase);
MatchCollection matches = urlRx.Matches(input);
foreach (Match match in matches)
{
string url = match.Groups["url"].Value;
Uri uri = new UriBuilder(url).Uri;
input = input.Replace(url, uri.AbsoluteUri);
}
return input;
}
The code locates every link inside the string with a regex, then uses UriBuilder to add a protocol to the link if it doesn't have one. Since "http://" is the default, it is added when no protocol exists.

At this link you can find an example of how to get the domain from a URL; with that, you can append the second part to the URL string:
http://www.jonasjohn.de/snippets/csharp/extract-domain-name-from-url.htm
This is the best way to do it, I think.
See you.

What about:
string GetUrl(string actual, string path)
{
return actual.Substring(0, actual.Length - 4).ToString() + "/" + path;
}
