how to detect specific part of the URL and modify it?

I am developing a website using ASP.NET, and I mainly use the URL to pass parameters.
My URL structure looks like this:
http://localhost:51247/yyy/zzz/hrforum/ (on my local PC)
http://test.com/yyy/zzz/hrforum/
I need to detect the zzz part and replace it with another word. I tried many things, including Regex patterns, but it seems I am doing it the wrong way. Please help me detect it, modify it, and rebuild the URL.
Code I tried:
Regex myRegex = new Regex(@"/([\w\s]+?\;){2}/");
var match = myRegex.Match(fullUrl);
var firstName = match.Groups[0].Value;
But this is not working.

The easiest method of doing this would be to use the Uri.Segments property. For example:
Uri uriAddress1 = new Uri("http://test.com/yyy/zzz/hrforum/");
Uri uriAddress2 = new Uri("http://localhost:51247/yyy/zzz/hrforum/");
Console.WriteLine(uriAddress1.Segments[2] == uriAddress2.Segments[2]);
Console.WriteLine("Segment 2 of Address 1: {0} Segment 2 of Address 2: {1}", uriAddress1.Segments[2].Trim('/'),uriAddress2.Segments[2].Trim('/'));
Output:
True
Segment 2 of Address 1: zzz Segment 2 of Address 2: zzz

I'm not sure what you want to achieve but to answer this question:
How to detect specific part of the URL and modify it?
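I think you can use the Uri class instead of Regex.
var uri = new Uri("http://test.com/yyy/zzz/hrforum/");
var pathName = uri.PathAndQuery;
// RemoveEmptyEntries drops the empty strings produced by the leading and trailing '/'
foreach (var item in pathName.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries))
{
    Console.WriteLine(item);
}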
// output:
// yyy
// zzz
// hrforum
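Neither answer shows the "modify it and rebuild the URL" step, so here is a minimal sketch of one way to do it with Uri.Segments. The class name UrlSegments and the method ReplaceSegment are made up for illustration, and it assumes the part to replace always sits at a known segment index (2 for the zzz in the question). Any fragment (#...) on the original URL is dropped.

```csharp
using System;

public static class UrlSegments
{
    // Hypothetical helper: replaces the path segment at the given index
    // and rebuilds the URL. Index 0 is always the root "/".
    public static Uri ReplaceSegment(Uri source, int index, string replacement)
    {
        string[] segments = source.Segments; // e.g. "/", "yyy/", "zzz/", "hrforum/"
        if (index < 1 || index >= segments.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        // Keep the trailing slash if the original segment had one.
        bool hadSlash = segments[index].EndsWith("/");
        segments[index] = Uri.EscapeDataString(replacement) + (hadSlash ? "/" : "");

        // Scheme, host and port stay as they were; the query string is re-attached.
        string rebuilt = source.GetLeftPart(UriPartial.Authority)
                         + string.Concat(segments)
                         + source.Query;
        return new Uri(rebuilt);
    }
}
```

For example, UrlSegments.ReplaceSegment(new Uri("http://test.com/yyy/zzz/hrforum/"), 2, "abc") should give http://test.com/yyy/abc/hrforum/, and the same call works for the localhost URL because the port is part of the authority.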

Related

Regex for getting domain and subdomain in C#

I have a requirement to correctly get the domain and subdomain from the current URL. This is needed to fetch the right data from the database and then call a Web API with the correct parameters.
In particular, I am facing issues with local and production URLs. For example:
In local, I have
http://sample.local.example.com
http://test.dev.example.com
In production, I have
http://client.example.com
http://program.live.example.com
I need
Subdomain as: sample / test / client / program
Domain as: example
So far I have tried the following C# code. It works fine on my local machine, but I am sure it will cause an issue in production at some point. Basically, for the subdomain it takes the first part, and for the domain it takes the last part before ".com".
var host = Request.Url.Host;
var domains = host.Split('.');
var subDomain = domains[0];
string mainDomain = string.Empty;
#if DEBUG
mainDomain = domains[2];
#else
mainDomain = domains[1];
#endif
return Tuple.Create(mainDomain, subDomain);

Instead of a regex, I think LINQ should help you here (this needs using System.Linq;). Try:
public static (string, string) GetDomains(Uri url)
{
var domains = url.Host.Substring(0, url.Host.LastIndexOf(".")).Split('.');
var subDomain = string.Join("/", domains.Take(domains.Length - 1));
var mainDomain = domains.Last();
return (mainDomain, subDomain);
}
output for "http://program.live.example.com"
example
program/live

This regex should work for you:
Match match = Regex.Match(temp, @"http://(\w+)\.?.*\.(\w+)\.com$");
string subdomain = match.Groups[1].Value;
string domain = match.Groups[2].Value;
http://(\w+) matches 1 or more word characters as group 1 after http://
\.? makes the following dot optional, to catch the case where there is nothing between group 1 and group 2, as in http://client.example.com
.* matches zero or more occurrences of any character
\.(\w+)\.com matches 1 or more word characters as group 2 before .com and after a dot
$ specifies the end of the string

You are doing it right, and you can get the domain name as the second-to-last value in the array.
var host = Request.Url.Host;
var domains = host.Split('.');
// Host contains no scheme or slashes, so the first label is already the subdomain
string subDomain = domains[0];
string mainDomain = domains[domains.Length - 2];
return Tuple.Create(mainDomain, subDomain);
If you want all the subdomains, you can loop over the remaining labels.
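To make the "all the subdomains" idea concrete, here is a rough sketch. The names HostParts and Split are invented for this example, and it assumes a single-label TLD such as .com; a host like example.co.uk would need a public suffix list, which this does not attempt.

```csharp
using System;
using System.Linq;

public static class HostParts
{
    // Hypothetical helper: the domain is the label just before the TLD,
    // and every label before that counts as a subdomain.
    public static (string Domain, string[] Subdomains) Split(string host)
    {
        string[] labels = host.Split('.');
        if (labels.Length < 2)
            return (host, new string[0]); // e.g. "localhost"

        string domain = labels[labels.Length - 2];
        string[] subdomains = labels.Take(labels.Length - 2).ToArray();
        return (domain, subdomains);
    }
}
```

With Request.Url.Host equal to program.live.example.com this should give the domain example and the subdomains program and live; the first element of Subdomains is the value the question asks for.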

Replace the matched expression in a string along with the value that follows it? (POST operation)

I have a string, for example:
http://www.whitelabelhosting.co.uk/flight-search.php?dept=any&journey=R&DepTime=0900
What I am doing in C# is:
string linkmain = link.Replace("&DepTime=", "&DepTime=" + journey);
but the time ends up as 09000900,
and in the case of
string linkmain = link.Replace("Journey=", "Journey=" + journey);
the journey ends up as RR.
So I need to get the values that come right after journey= and DepTime= and replace them together with the parameter name. They are not the same every time, so how do I match them during the replace?
This is a POST operation, so the parameters vary, for example
journey :
R
M
O
and time :
0900 , 1200 , 0400

Use HttpUtility.ParseQueryString(url), which will return a NameValueCollection. You can then loop over this collection to manipulate the data how you want and build the new URL from that.
https://msdn.microsoft.com/en-us/library/ms150046(v=vs.110).aspx
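As a rough illustration of that approach: the sketch below parses the query, overwrites one value, and puts the URL back together. QueryEditor and SetQueryValue are names invented for this example. It relies on the collection returned by ParseQueryString producing a URL-encoded query string from ToString(), and it needs a reference to System.Web on .NET Framework.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class QueryEditor
{
    // Hypothetical helper: sets (or adds) one query parameter and rebuilds the URL.
    public static string SetQueryValue(string url, string key, string value)
    {
        var uri = new Uri(url);

        // Parse "?dept=any&journey=R&DepTime=0900" into name/value pairs.
        NameValueCollection query = HttpUtility.ParseQueryString(uri.Query);

        // Overwrites the old value instead of appending to it.
        query[key] = value;

        // Everything up to the end of the path, then the re-encoded query.
        return uri.GetLeftPart(UriPartial.Path) + "?" + query.ToString();
    }
}
```

Calling QueryEditor.SetQueryValue(link, "DepTime", "1200") on the URL from the question should turn DepTime=0900 into DepTime=1200 rather than 09001200, because the old value is replaced as a whole.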

You may want to try out the following regex:
((?:\?.*?&|\?)journey=)[^&]*
The following code replaces the value of journey with replacement:
string url = "http://www.whitelabelhosting.co.uk/flight-search.php?dept=any&journey=R&DepTime=0900";
// ${1} rather than $1, so a replacement that starts with a digit (e.g. 0900) is not read as part of the group number
string newUrl = Regex.Replace(url, @"((?:\?.*?&|\?)journey=)[^&]*", "${1}" + "replacement");
Remember to add the following to your file:
using System.Text.RegularExpressions;
You can do the same for DepTime using the following regex:
((?:\?.*?&|\?)DepTime=)[^&]*

Parse Line and Break it into Variables

I have a text file that contains only the full version number of an application. I need to extract it and then parse it into separate variables.
For example, let's say version.cs contains 19.1.354.6
The code I'm using does not seem to be working:
char[] delimiter = { '.' };
string currentVersion = System.IO.File.ReadAllText(@"C:\Applicaion\version.cs");
string[] partsVersion;
partsVersion = currentVersion.Split(delimiter);
string majorVersion = partsVersion[0];
string minorVersion = partsVersion[1];
string buildVersion = partsVersion[2];
string revisVersion = partsVersion[3];

Your problem is most likely with the file, which probably contains text other than the version. That said, why not use the Version class, which exists for exactly this kind of task?
var version = new Version("19.1.354.6");
var major = version.Major; // etc..
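Since the suspicion is that the file holds more than a bare version number, a slightly more defensive sketch may help. VersionReader and ParseVersion are names made up here; the idea is simply to trim whitespace and line breaks and use Version.TryParse so that unexpected content gives null instead of an exception.

```csharp
using System;

public static class VersionReader
{
    // Hypothetical helper: returns null when the text is not a valid version.
    public static Version ParseVersion(string fileContent)
    {
        Version parsed;
        // Trim removes the trailing newline that ReadAllText usually returns.
        return Version.TryParse(fileContent.Trim(), out parsed) ? parsed : null;
    }
}
```

You would pass it the result of File.ReadAllText and then read Major, Minor, Build and Revision from the returned object, after checking it for null.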

What you have works fine with the correct input, so I would suggest making sure there is nothing else in the file you're reading.
In the future, please provide error information, since we can't usually tell exactly what you expect to happen, only what we know should happen.
In light of that, I would also suggest looking into using Regex for parsing in the future. In my opinion, it provides a much more flexible solution for your needs. Here's an example of regex to use:
var regex = new Regex(@"([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)");
var match = regex.Match("19.1.354.6");
if (match.Success)
{
Console.WriteLine("Match[1]: "+match.Groups[1].Value);
Console.WriteLine("Match[2]: "+match.Groups[2].Value);
Console.WriteLine("Match[3]: "+match.Groups[3].Value);
Console.WriteLine("Match[4]: "+match.Groups[4].Value);
}
else
{
Console.WriteLine("No match found");
}
which outputs the following:
// Match[1]: 19
// Match[2]: 1
// Match[3]: 354
// Match[4]: 6

How to retrieve the locale(country) code from URL?

I have a URL, which is like http://example.com/UK/Deal.aspx?id=322
My target is to remove the locale(country) part, to make it like http://example.com/Deal.aspx?id=322
Since the URL may have other similar formats like: https://ssl.example.com/JP/Deal.aspx?id=735, using "substring" function is not a good idea.
What I can think about is to use the following method for separating them, and map them back later.
HttpContext.Current.Request.Url.Scheme
HttpContext.Current.Request.Url.Host
HttpContext.Current.Request.Url.AbsolutePath
HttpContext.Current.Request.Url.Query
And, suppose HttpContext.Current.Request.Url.AbsolutePath will be:
/UK/Deal.aspx?id=322
I am not sure how to deal with this since my boss asked me not to use "regular expression"(he thinks it will impact performance...)
Except "Regular Expression", is there any other way to remove UK from it?
p.s.: the UK part may be JP, DE, or other country code.
By the way, for USA, there is no country code, and the url will be http://example.com/Deal.aspx?id=322
Please also take this situation into consideration.
Thank you.

Assuming that you'll have a two-letter ISO country code in the URL, you can use the UriBuilder class to remove it from the path without using Regex for the replacement.
E.g.
var sourceUri = new Uri("http://example.com/UK/Deal.aspx?id=322");
if (IsLocaleEnabled(sourceUri))
{
    var builder = new UriBuilder(sourceUri);
    // Remove the "UK/" segment from the path
    builder.Path = builder.Path.Replace(sourceUri.Segments[1], string.Empty);
    // Construct the Uri with the new path
    Uri newUri = builder.Uri;
}
Update:
// Cache the instance for performance benefits.
static readonly Regex regex = new Regex(@"^[a-zA-Z]{2}/$", RegexOptions.Compiled);
/// <summary>
/// Checks whether the first URL segment after the root
/// is a two-letter ISO code.
/// </summary>
private bool IsLocaleEnabled(Uri sourceUri)
{
    // A compiled regex is much faster than a non-compiled one.
    // The length check covers URLs that have no path segment after the root.
    return sourceUri.Segments.Length > 1 && regex.IsMatch(sourceUri.Segments[1]);
}
For performance you should cache the regex (keep it in a static readonly field). There is no need to parse a predefined regex on every request, and this way you get all the performance benefit that is available.
Result - http://example.com/Deal.aspx?id=322

It all depends on whether the country code always has the same position. If it doesn't, then more details on the possible formats are required. Maybe you could check whether the first segment has two characters, to be sure it really is a country code (not sure if this is reliable though). Or you could start from the filename, if it's always in the format /[optionalCountryCode]/deal.aspx?...
How about these two approaches (on string level):
public string RemoveCountryCode()
{
Uri originalUri = new Uri("http://example.com/UK/Deal.aspx?id=322");
string hostAndPort = originalUri.GetLeftPart(UriPartial.Authority);
// v1: if country code is always there, always has same position and always
// has format 'XX' this is definitely the easiest and fastest
string trimmedPathAndQuery = originalUri.PathAndQuery.Substring("/XX/".Length);
// v2: if country code is always there, always has same position but might
// not have a fixed format (e.g. XXX)
trimmedPathAndQuery = string.Join("/", originalUri.PathAndQuery.Split('/').Skip(2));
// in both cases you need to join it with the authority again
return string.Format("{0}/{1}", hostAndPort, trimmedPathAndQuery);
}

If the AbsolutePath will always have the format /XX/...pagename.aspx?id=### where XX is the two letter country code, then you can just strip off the first 3 characters.
Example that removes the first 3 characters:
var targetURL = HttpContext.Current.Request.Url.AbsolutePath.Substring(3);
If the country code could be different lengths, then you could find the index of the second / character and start the substring from there.
var sourceURL = HttpContext.Current.Request.Url.AbsolutePath;
var firstOccurrence = sourceURL.IndexOf('/');
// Start searching one character after the first '/', otherwise the same index is found again
var secondOccurrence = sourceURL.IndexOf('/', firstOccurrence + 1);
var targetURL = sourceURL.Substring(secondOccurrence);

The easy way would be to treat it as a string, split it on the "/" separator, remove the fourth element, and then join the parts back together with "/":
string myURL = "https://ssl.example.com/JP/Deal.aspx?id=735";
List<string> myURLsplit = myURL.Split('/').ToList();
// RemoveAt returns void, so it cannot be chained onto ToList()
myURLsplit.RemoveAt(3);
myURL = string.Join("/", myURLsplit);
RESULT: https://ssl.example.com/Deal.aspx?id=735
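None of the string-based answers above handles the USA case from the question, where there is no country code at all. The following is only a sketch of one regex-free way to cover it. LocaleStripper and its methods are invented names, and it assumes the country code is always exactly two ASCII letters in the first path segment. A site with other two-letter folders (say /js/) would need a check against a list of known codes instead.

```csharp
using System;
using System.Linq;

public static class LocaleStripper
{
    // Hypothetical helper: drops a leading two-letter segment such as "UK/"
    // and returns the URL unchanged when no such segment is present.
    public static Uri RemoveCountryCode(Uri source)
    {
        string[] segments = source.Segments; // e.g. "/", "UK/", "Deal.aspx"
        if (segments.Length < 2 || !IsCountrySegment(segments[1]))
            return source;

        // Rebuild the path from everything after the country segment.
        string path = "/" + string.Concat(segments.Skip(2));
        return new Uri(source.GetLeftPart(UriPartial.Authority) + path + source.Query);
    }

    // True for exactly two ASCII letters followed by '/'.
    private static bool IsCountrySegment(string segment)
    {
        return segment.Length == 3
            && segment[2] == '/'
            && IsAsciiLetter(segment[0])
            && IsAsciiLetter(segment[1]);
    }

    private static bool IsAsciiLetter(char c)
    {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }
}
```

In a page you would call it with HttpContext.Current.Request.Url. Scheme, host and query string are carried over, so both the http://example.com and https://ssl.example.com forms are handled by the same code.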

Using Regexp to get information in a KeyValuePair

Help me to parse this message:
text=&direction=re&orfo=rus&files_id=&message=48l16qL2&old_charset=utf-8&template_id=&HTMLMessage=1&draft_msg=&re_msg=&fwd_msg=&RealName=0&To=john+%3Cjohn11%40gmail.com%3E&CC=&BCC=&Subject=TestSubject&Body=%3Cp%3EHello+%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82+%D1%82%D0%B5%D0%BA%D1%81%D1%82%3Cbr%3E%3Cbr%3E%3C%2Fp%3E&secur
I would like to get information in an KeyValuePair:
Key - Value
text -
direction - re
and so on.
And how do I convert this: Hello+%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82+%D1%82%D0%B5%D0%BA%D1%81%...
It contains Cyrillic characters.
Thanks.

If you want to use a Regex, you can do it like this:
// I only added the first 3 keys, but the others are basically the same
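// [^&]* stops each value at the next '&', so the last group does not swallow the rest of the message
Regex r = new Regex(@"text=(?<text>[^&]*)&direction=(?<direction>[^&]*)&orfo=(?<orfo>[^&]*)");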
Match m = r.Match(inputText);
if(m.Success)
{
var text = m.Groups["text"].Value; // result is ""
var direction = m.Groups["direction"].Value; // re
var orfo = m.Groups["orfo"].Value;
}
However, the method suggested by BoltClock is much better:
System.Collections.Specialized.NameValueCollection collection =
System.Web.HttpUtility.ParseQueryString(inputString);
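To show how that collection maps onto the key/value pairs the question asks for, here is a small sketch. FormBody and ToPairs are hypothetical names. ParseQueryString URL-decodes values using UTF-8 by default, which also takes care of the + signs and the percent-encoded Cyrillic text.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Web;

public static class FormBody
{
    // Hypothetical helper: turns "a=1&b=2" into a list of decoded key/value pairs.
    public static List<KeyValuePair<string, string>> ToPairs(string body)
    {
        NameValueCollection parsed = HttpUtility.ParseQueryString(body);
        var pairs = new List<KeyValuePair<string, string>>();

        foreach (string key in parsed.AllKeys)
        {
            // A token without '=' (such as a truncated last field) is stored
            // under a null key; this sketch simply skips it.
            if (key == null) continue;
            pairs.Add(new KeyValuePair<string, string>(key, parsed[key]));
        }
        return pairs;
    }
}
```

For the message in the question, the first pairs should be text with an empty value and direction with the value re, and the Body value should begin with the decoded text "<p>Hello Привет".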

It looks like you are dealing with a URI, better to use the proper class than try and figure out the detailed processing.
http://msdn.microsoft.com/en-us/library/system.uri.aspx
