Extract Common Name from Distinguished Name - c#

Is there a call in .NET that parses the CN from an RFC 2253-encoded distinguished name? I know there are some third-party libraries that do this, but I would prefer to use native .NET libraries if possible.
Examples of string-encoded DNs:
CN=L. Eagle,O=Sue\, Grabbit and Runn,C=GB
CN=Jeff Smith,OU=Sales,DC=Fabrikam,DC=COM

If you are working with an X509Certificate2, there is a native method that you can use to extract the Simple Name. The Simple Name is equivalent to the Common Name RDN within the Subject field of the main certificate:
x5092Cert.GetNameInfo(X509NameType.SimpleName, false);
Alternatively, X509NameType.DnsName can be used to retrieve the Subject Alternative Name, if present; otherwise, it will default to the Common Name:
x5092Cert.GetNameInfo(X509NameType.DnsName, false);
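To try this without a certificate file at hand, here is a minimal sketch that creates a throwaway self-signed certificate and reads its simple name. It assumes a runtime where CertificateRequest is available (.NET Core 2.0+ or .NET Framework 4.7.2+) and is written as a C# 9+ top-level program; the subject string is made up for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

// Throwaway key pair and self-signed certificate; the subject is an illustrative assumption.
using var rsa = RSA.Create(2048);
var request = new CertificateRequest(
    "CN=Jeff Smith,OU=Sales,O=Fabrikam",
    rsa,
    HashAlgorithmName.SHA256,
    RSASignaturePadding.Pkcs1);
using var cert = request.CreateSelfSigned(
    DateTimeOffset.UtcNow.AddDays(-1),
    DateTimeOffset.UtcNow.AddDays(1));

// SimpleName is taken from the subject's CN when one is present.
string simpleName = cert.GetNameInfo(X509NameType.SimpleName, false);
Console.WriteLine(simpleName);
```

Passing false as the second argument reads from the subject; passing true would read from the issuer instead.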

After digging around in the .NET source code it looks like there is an internal utility class that can parse Distinguished Names into their different components. Unfortunately the utility class is not made public, but you can access it using reflection:
string dn = "CN=TestGroup,OU=Groups,OU=UT-SLC,OU=US,DC=Company,DC=com";
Assembly dirsvc = Assembly.Load("System.DirectoryServices");
Type asmType = dirsvc.GetType("System.DirectoryServices.ActiveDirectory.Utils");
MethodInfo mi = asmType.GetMethod("GetDNComponents", BindingFlags.NonPublic | BindingFlags.Static);
string[] parameters = { dn };
var test = mi.Invoke(null, parameters);
//test.Dump("test1");//shows details when using Linqpad
//Convert Distinguished Name (DN) to Relative Distinguished Names (RDN)
MethodInfo mi2 = asmType.GetMethod("GetRdnFromDN", BindingFlags.NonPublic | BindingFlags.Static);
var test2 = mi2.Invoke(null, parameters);
//test2.Dump("test2");//shows details when using Linqpad
The results would look like this:
//test1 is array of internal "Component" struct that has name/values as strings
Name Value
CN TestGroup
OU Groups
OU UT-SLC
OU US
DC Company
DC com
//test2 is a string with CN=RDN
CN=TestGroup
Please note that this is an internal utility class and could change in a future release.

I had the same question, myself, when I found yours. Didn't find anything in the BCL; however, I did stumble across this CodeProject article that hit the nail squarely on the head.
I hope it helps you out, too.
http://www.codeproject.com/Articles/9788/An-RFC-2253-Compliant-Distinguished-Name-Parser

Do Win32 functions count? You can use PInvoke with DsGetRdnW. For code, see my answer to another question: https://stackoverflow.com/a/11091804/628981.

You can extract the common name from an ASN.1-encoded distinguished name using the AsnEncodedData class:
var distinguishedName= new X500DistinguishedName("CN=TestGroup,OU=Groups,OU=UT-SLC,OU=US,DC=Company,DC=com");
var commonNameData = new AsnEncodedData("CN", distinguishedName.RawData);
var commonName = commonNameData.Format(false);
A downside of this approach is that if you specify an unrecognized OID, or if the field identified by the OID is missing from the distinguished name, the Format method returns a hex string containing the encoded value of the full distinguished name, so you may want to verify the result.
Also, the documentation does not seem to specify whether the rawData parameter of the AsnEncodedData constructor may contain OIDs other than the one specified as the first argument, so this may break on a non-Windows OS or in a future version of .NET Framework.

If you are on Windows, @MaxKiselev's answer works perfectly. On non-Windows platforms, it returns the ASN.1 dumps of each attribute.
.NET 5+ includes an ASN.1 parser, so you can access the RDNs in a cross-platform manner by using AsnReader.
Helper class:
public static class X509DistinguishedNameExtensions
{
public static IEnumerable<KeyValuePair<string, string>> GetRelativeNames(this X500DistinguishedName dn)
{
var reader = new AsnReader(dn.RawData, AsnEncodingRules.BER);
var snSeq = reader.ReadSequence();
if (!snSeq.HasData)
{
throw new InvalidOperationException();
}
// Many types are allowable. We're only going to support the string-like ones
// (This excludes IPAddress, X400 address, and other weird stuff)
// https://www.rfc-editor.org/rfc/rfc5280#page-37
// https://www.rfc-editor.org/rfc/rfc5280#page-112
var allowedRdnTags = new[]
{
UniversalTagNumber.TeletexString, UniversalTagNumber.PrintableString,
UniversalTagNumber.UniversalString, UniversalTagNumber.UTF8String,
UniversalTagNumber.BMPString, UniversalTagNumber.IA5String,
UniversalTagNumber.NumericString, UniversalTagNumber.VisibleString,
UniversalTagNumber.T61String
};
while (snSeq.HasData)
{
var rdnSeq = snSeq.ReadSetOf().ReadSequence();
var attrOid = rdnSeq.ReadObjectIdentifier();
var attrValueTagNo = (UniversalTagNumber)rdnSeq.PeekTag().TagValue;
if (!allowedRdnTags.Contains(attrValueTagNo))
{
throw new NotSupportedException($"Unknown tag type {attrValueTagNo} for attr {attrOid}");
}
var attrValue = rdnSeq.ReadCharacterString(attrValueTagNo);
var friendlyName = new Oid(attrOid).FriendlyName;
yield return new KeyValuePair<string, string>(friendlyName ?? attrOid, attrValue);
}
}
}
Example usage:
// Subject: CN=Example, O=Organization
var cert = new X509Certificate2("foo.cer");
var names = cert.SubjectName.GetRelativeNames().ToArray();
// names has [ { "CN": "Example" }, { "O": "Organization" } ]
Since this does not involve any string parsing, no escapes or injections can be mishandled. It doesn't support decoding DNs that contain non-string elements, but those seem exceedingly rare.

How about this one:
string cnPattern = @"^CN=(?<cn>.+?)(?<!\\),";
string dn = @"CN=Doe\, John,OU=My OU,DC=domain,DC=com";
Regex re = new Regex(cnPattern);
Match m = re.Match(dn);
if (m.Success)
{
// Item with index 1 returns the first group match.
string cn = m.Groups[1].Value;
}
Adapted from Powershell Regular Expression for Extracting Parts of an Active Directory Distiniguished Name.

Just adding my two cents here. This implementation works "best" if you first learn what business rules are in place that will ultimately dictate how much of the RFC will ever be implemented at your company.
private static string ExtractCN(string distinguishedName)
{
// CN=...,OU=...,OU=...,DC=...,DC=...
string[] parts;
parts = distinguishedName.Split(new[] { ",DC=" }, StringSplitOptions.None);
var dc = parts.Skip(1);
parts = parts[0].Split(new[] { ",OU=" }, StringSplitOptions.None);
var ou = parts.Skip(1);
parts = parts[0].Split(new[] { ",CN=" }, StringSplitOptions.None);
var cnMulti = parts.Skip(1);
var cn = parts[0];
if (!Regex.IsMatch(cn, "^CN="))
throw new CustomException(string.Format("Unable to parse distinguishedName for commonName ({0})", distinguishedName));
return Regex.Replace(cn, "^CN=", string.Empty);
}

You could use regular expressions to do this. Here's a regex pattern that can parse the whole DN; then you can just take the parts you are interested in:
(?:^|,\s?)(?:(?<name>[A-Z]+)=(?<val>"(?:[^"]|"")+"|(?:\\,|[^,])+))+
Here it is formatted a bit nicer, and with some comments:
(?:^|,\s?) <-- Start or a comma
(?:
(?<name>[A-Z]+)
=
(?<val>
"(?:[^"]|"")+" <-- Quoted strings
|
(?:\\,|[^,])+ <-- Unquoted strings
)
)+
This regex will give you name and val capture groups for each match.
DN values can optionally be quoted (e.g. "Hello"), which allows them to contain unescaped commas. Alternatively, if not quoted, commas must be escaped with a backslash (e.g. Hello\, there!). This regex handles both quoted and unquoted values.
Here's a link so you can see it in action: https://regex101.com/r/7vhdDz/1
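For use from C#, a rough sketch of the same pattern as a verbatim string (double quotes have to be doubled) could look like the following. The sample DN is an assumption, and note that the captured value still contains the escape backslash, so you may want to unescape it afterwards. Written as a C# 9+ top-level program.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from above as a C# verbatim string; each " is written as "".
var dnPattern = new Regex(
    @"(?:^|,\s?)(?:(?<name>[A-Z]+)=(?<val>""(?:[^""]|"""")+""|(?:\\,|[^,])+))+");

// Sample input, made up for illustration.
string dn = @"CN=Doe\, John,OU=Sales,DC=example,DC=com";

string commonName = "";
foreach (Match m in dnPattern.Matches(dn))
{
    string name = m.Groups["name"].Value;
    string val = m.Groups["val"].Value;
    Console.WriteLine($"{name} = {val}");

    // Keep the first CN only; the value is still escaped (Doe\, John).
    if (commonName.Length == 0 && name == "CN")
    {
        commonName = val;
    }
}
```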

If the order is uncertain, I do this:
private static string ExtractCN(string dn)
{
string[] parts = dn.Split(new char[] { ',' });
for (int i = 0; i < parts.Length; i++)
{
var p = parts[i];
var elems = p.Split(new char[] { '=' });
var t = elems[0].Trim().ToUpper();
var v = elems[1].Trim();
if (t == "CN")
{
return v;
}
}
return null;
}

This is my almost RFC-compliant fail-safe DN parser derived from https://www.codeproject.com/Articles/9788/An-RFC-2253-Compliant-Distinguished-Name-Parser and an example of its usage (extract subject name as CN and O, both optional, concatenated with comma):
private static string GetCertificateString(X509Certificate2 certificate)
{
var subjectComponents = certificate.Subject.ParseDistinguishedName();
var subjectName = string.Join(", ", subjectComponents
.Where(m => (m.Item1 == "CN") || (m.Item1 == "O"))
.Select(n => n.Item2)
.Distinct());
return $"{certificate.SerialNumber} {certificate.NotBefore:yyyy.MM.dd}-{certificate.NotAfter:yyyy.MM.dd} {subjectName}";
}
private enum DistinguishedNameParserState
{
Component,
QuotedString,
EscapedCharacter,
};
public static IEnumerable<Tuple<string, string>> ParseDistinguishedName(this string value)
{
var previousState = DistinguishedNameParserState.Component;
var currentState = DistinguishedNameParserState.Component;
var currentComponent = new StringBuilder();
var previousChar = char.MinValue;
var position = 0;
Func<StringBuilder, Tuple<string, string>> parseComponent = sb =>
{
var s = sb.ToString();
sb.Clear();
var index = s.IndexOf('=');
if (index == -1)
{
return null;
}
var item1 = s.Substring(0, index).Trim().ToUpper();
var item2 = s.Substring(index + 1).Trim();
return Tuple.Create(item1, item2);
};
while (position < value.Length)
{
var currentChar = value[position];
switch (currentState)
{
case DistinguishedNameParserState.Component:
switch (currentChar)
{
case ',':
case ';':
// Separator found, yield parsed component
var component = parseComponent(currentComponent);
if (component != null)
{
yield return component;
}
break;
case '\\':
// Escape character found
previousState = currentState;
currentState = DistinguishedNameParserState.EscapedCharacter;
break;
case '"':
// Quotation mark found
if (previousChar == currentChar)
{
// Double quotes inside quoted string produce single quote
currentComponent.Append(currentChar);
}
currentState = DistinguishedNameParserState.QuotedString;
break;
default:
currentComponent.Append(currentChar);
break;
}
break;
case DistinguishedNameParserState.QuotedString:
switch (currentChar)
{
case '\\':
// Escape character found
previousState = currentState;
currentState = DistinguishedNameParserState.EscapedCharacter;
break;
case '"':
// Quotation mark found
currentState = DistinguishedNameParserState.Component;
break;
default:
currentComponent.Append(currentChar);
break;
}
break;
case DistinguishedNameParserState.EscapedCharacter:
currentComponent.Append(currentChar);
currentState = previousState;
currentChar = char.MinValue;
break;
}
previousChar = currentChar;
position++;
}
// Yield last parsed component, if any
if (currentComponent.Length > 0)
{
var component = parseComponent(currentComponent);
if (component != null)
{
yield return component;
}
}
}

Sorry for being a bit late to the party, but I was able to read the Name property directly in C# from a
UserPrincipal p
and then I was able to call
p.Name
and that gave me the full name (Common Name)
Sample code:
string name;
foreach(UserPrincipal p in PSR)
{
//PSR refers to PrincipalSearchResult
name = p.Name;
Console.WriteLine(name);
}
Obviously, you will have to fill in the blanks. But this should be easier than parsing regex.

Could you not just retrieve the CN attribute values?
As you correctly note, use someone else's class: there are lots of fun edge cases (escaped commas, other escaped characters) that make parsing a DN look easy when it is actually reasonably tricky.
I usually use a Java class that comes with the Novell (now NetIQ) Identity Manager, so that is not helpful here.

using System.Linq;
var dn = "CN=Jeff Smith,OU=Sales,DC=Fabrikam,DC=COM";
var cn = dn.Split(',').Where(i => i.Contains("CN=")).Select(i => i.Replace("CN=", "")).FirstOrDefault();

Well, here I am, another person late to the party. Here is my solution:
var dn = new X500DistinguishedName("CN=TestGroup,OU=Groups,OU=UT-SLC,OU=US,DC=\"Company, inc\",DC=com");
foreach(var part in dn.Format(true).Split("\r\n"))
{
if(part == "") continue;
var parts = part.Split('=', 2);
var key = parts[0];
var value = parts[1];
// use your key and value as you see fit here.
}
Basically it's leveraging the X500DistinguishedName.Format method to put each component on its own line, then splitting by lines, then splitting each line into key and value.

Related

format string template with named parameters to literal c#

I have an application that creates string templates with named variables. This is done in accordance with the logging guide for ASP.NET Core.
Now I find myself wanting to deliver these strings through the API itself as well, but this time with all the parameters filled in.
Basically I'd want to use:
var template = "ID {ID} not found";
var para = new object[] {"value"};
String.Format(template, para);
However this gives an invalid input string.
Of course, I also cannot guarantee that somebody didn't make a string template the 'classic' way with indexes.
var template2 = "ID {0} not found";
Is there a new way of formatting strings that I'm missing or are we supposed to work around this ?
I do not want to rework the existing code base to use numbers or use the $"...{para}" syntax. As this would lose information when it is being logged.
I'm guessing I could do a regex search and see if there's a '{0}' or a named parameter, and replace the named with indexes before formatting. But I wanted to know if there are some easier/cleaner ways of doing this.
Update - regex solution:
Below is the current workaround I've made using regex:
public static class StringUtils
{
public static string Format(string template, params object[] para)
{
var match = Regex.Match(template, @"\{@?\w+}");
if (!match.Success) return template;
if (int.TryParse(match.Value.Substring(1, match.Value.Length - 2), out int n))
return string.Format(template, para);
else
{
var list = new List<string>();
var nextStartIndex = 0;
var i = 0;
while (match.Success)
{
if (match.Index > nextStartIndex)
list.Add(template.Substring(nextStartIndex , match.Index - nextStartIndex) + $"{{{i}}}");
else
list.Add($"{{{i}}}");
nextStartIndex = match.Index + match.Value.Length;
match = match.NextMatch();
i++;
}
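// Keep any literal text that follows the last placeholder
list.Add(template.Substring(nextStartIndex));
return string.Format(string.Join("", list.ToArray()), para);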
}
}
}
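A more compact variant of the same idea, as a hedged sketch: instead of rewriting the template and calling string.Format, substitute each placeholder directly with Regex.Replace. The helper name Fill is hypothetical. It ignores alignment and format specifiers such as {0:N2}, and it does not treat {{ and }} as escaped braces, so it only fits simple templates like the ones above. Written as a C# 9+ top-level program.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: fills "{Name}", "{@Name}" and "{0}" style placeholders.
static string Fill(string template, params object[] args)
{
    int next = 0; // position of the next named placeholder
    return Regex.Replace(template, @"\{@?(\w+)\}", m =>
    {
        // Numeric placeholders index directly; named ones are taken in order of appearance.
        int index = int.TryParse(m.Groups[1].Value, out int n) ? n : next++;
        // Leave the placeholder untouched when no argument is available for it.
        return index < args.Length ? Convert.ToString(args[index]) : m.Value;
    });
}

Console.WriteLine(Fill("ID {ID} not found", "value")); // ID value not found
Console.WriteLine(Fill("ID {0} not found", "value"));  // ID value not found
```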

How can I find the first strong directionality character of a string in C#?

Assuming I get a string that can have mixed left-to-right and right-to-left content, I want to find the first strong directionality character in it, as defined here.
I think I found a good starting point in this question, but I still can't figure out how the BiDi category is related to the strong directionality characteristic. Is it possible to figure this out in C#?
Instead of relying on the internal implementation I took a slightly different approach that is open for optimizations but gives enough of a basis to answer your question.
I simply download the UnicodeData.txt file that is part of the official release of a Unicode version. For each Unicode character, that file contains its code point and a number of semicolon-delimited fields. A typical line looks like this:
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
The fifth field contains the Bidirectional Class Value.
Armed with this knowledge the naive parser that reads the data and then inspects a demo string with it looks like this:
// hold chars with their Bidi Class Value
var udb = new Dictionary<char, string>();
// download UnicodeData txt file
var cli = new WebClient();
var data = cli.DownloadData("http://www.unicode.org/Public/UNIDATA/UnicodeData.txt");
// parse
using (var ms = new MemoryStream(data))
{
var sr = new StreamReader(ms, Encoding.UTF8);
var line = sr.ReadLine();
while (line != null)
{
var fields = line.Split(';');
int uc = int.Parse(fields[0], NumberStyles.HexNumber);
// above 0xffff we're lost
if (uc > 0xffff) break;
var ch = (char) uc;
var bca = fields[4];
udb.Add(ch, bca);
line = sr.ReadLine();
}
}
// test string
var s = "123A\xfb1d\x0620";
Console.WriteLine(s);
var pos = 0;
foreach(var c in s)
{
var bcv = udb[c]; // for a char get the Bidi Class Value
if (bcv == "L" || bcv == "R" || bcv == "AL")
{
Console.WriteLine(
"{0} - {1} : {2} [{3}]",
c,
pos,
CharUnicodeInfo.GetUnicodeCategory(c),
bcv);
}
pos++;
}
When run, you'll see the characters that are of the Strong Type and at which position they were found.

Converting HTML entities to Unicode Characters in C#

I found similar questions and answers for Python and Javascript, but not for C# or any other WinRT compatible language.
The reason I think I need it is that I'm displaying text I get from websites in a Windows 8 store app. E.g. &eacute; should become é.
Or is there a better way? I'm not displaying websites or rss feeds, but just a list of websites and their titles.
I recommend using System.Net.WebUtility.HtmlDecode and NOT HttpUtility.HtmlDecode.
This is due to the fact that the System.Web reference does not exist in Winforms/WPF/Console applications and you can get the exact same result using this class (which is already added as a reference in all those projects).
Usage:
string s = System.Net.WebUtility.HtmlDecode("&eacute;"); // Returns é
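As a slightly fuller sketch of the same call, the snippet below decodes a string that mixes named and numeric entities. The input string is made up for illustration; it is written as a C# 9+ top-level program.

```csharp
using System;
using System.Net;

// Named (&eacute;, &amp;) and numeric (&#232;) entities are decoded in one pass.
string decoded = WebUtility.HtmlDecode("caf&eacute; &amp; cr&#232;me");
Console.WriteLine(decoded); // café & crème
```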
Use HttpUtility.HtmlDecode(). Read about it on MSDN here.
decodedString = HttpUtility.HtmlDecode(myEncodedString);
This might be useful, replaces all (for as far as my requirements go) entities with their unicode equivalent.
public string EntityToUnicode(string html) {
var replacements = new Dictionary<string, string>();
var regex = new Regex("(&[a-z]{2,5};)");
foreach (Match match in regex.Matches(html)) {
if (!replacements.ContainsKey(match.Value)) {
var unicode = HttpUtility.HtmlDecode(match.Value);
if (unicode.Length == 1) {
replacements.Add(match.Value, string.Concat("&#", Convert.ToInt32(unicode[0]), ";"));
}
}
}
foreach (var replacement in replacements) {
html = html.Replace(replacement.Key, replacement.Value);
}
return html;
}
Different coding/encoding of HTML entities and HTML numbers in Metro App and WP8 App.
With Windows Runtime Metro App
{
string inStr = "ó";
string auxStr = System.Net.WebUtility.HtmlEncode(inStr);
// auxStr == ó
string outStr = System.Net.WebUtility.HtmlDecode(auxStr);
// outStr == ó
string outStr2 = System.Net.WebUtility.HtmlDecode("&oacute;");
// outStr2 == ó
}
With Windows Phone 8.0
{
string inStr = "ó";
string auxStr = System.Net.WebUtility.HtmlEncode(inStr);
// auxStr == ó
string outStr = System.Net.WebUtility.HtmlDecode(auxStr);
// outStr == ó
string outStr2 = System.Net.WebUtility.HtmlDecode("&oacute;");
// outStr2 == ó
}
To solve this, in WP8, I have implemented the table in HTML ISO-8859-1 Reference before calling System.Net.WebUtility.HtmlDecode().
This worked for me, replaces both common and unicode entities.
private static readonly Regex HtmlEntityRegex = new Regex("&(#)?([a-zA-Z0-9]*);");
public static string HtmlDecode(this string html)
{
if (html.IsNullOrEmpty()) return html;
return HtmlEntityRegex.Replace(html, x => x.Groups[1].Value == "#"
? ((char)int.Parse(x.Groups[2].Value)).ToString()
: HttpUtility.HtmlDecode(x.Groups[0].Value));
}
[Test]
[TestCase(null, null)]
[TestCase("", "")]
[TestCase("&#39;fark&#39;", "'fark'")]
[TestCase("&quot;fark&quot;", "\"fark\"")]
public void should_remove_html_entities(string html, string expected)
{
html.HtmlDecode().ShouldEqual(expected);
}
Improved version of Zumey's method (I can't comment there). The longest entity name is &exclamation; (11 characters). Upper-case letters in entity names are also possible, e.g. &Agrave; (source: wiki).
public string EntityToUnicode(string html) {
var replacements = new Dictionary<string, string>();
var regex = new Regex("(&[a-zA-Z]{2,11};)");
foreach (Match match in regex.Matches(html)) {
if (!replacements.ContainsKey(match.Value)) {
var unicode = HttpUtility.HtmlDecode(match.Value);
if (unicode.Length == 1) {
replacements.Add(match.Value, string.Concat("&#", Convert.ToInt32(unicode[0]), ";"));
}
}
}
foreach (var replacement in replacements) {
html = html.Replace(replacement.Key, replacement.Value);
}
return html;
}

RegEx - Please help in forming this RegEx

Can you please help me to write regular expression for this.
Name = "Windows Product for .Net"
Type = "Software Programming"
Quantity = "Pack of 3"
I want to do a match like this in c# for which I need RegEx.
if (Name.Contains(".Net") && (Type.Contains("Programming") || Type.Contains("Hardware")))
{
// output will be a Match.
}
else
{
// no match.
}
The approach I want to take here is to specify a regular expression for each condition and then apply the logical operator &&, logical grouping parentheses, and then the logical operator ||.
I have come up with all regular expressions for these. How can I provide logical operands for each of them to execute in appropriate order?
string Name = "Windows Product for .Net";
string Type = "Software Programming";
string patternForName = ".*Net";
Regex rgxName = new Regex(patternForName);
Match matchName = rgxName.Match(Name);
string patternForType = ".*Programming";
Regex rgxType = new Regex(patternForType);
Match matchType = rgxType.Match(Type);
string patternForType1 = ".*Hardware";
Regex rgxType1 = new Regex(patternForType1);
Match matchType1 = rgxType1.Match(Type);
Please note - We are making it dynamic, in the sense the patterns , operands and regEx are coming from xml file. So that's why I do not want to write one big regEx for above.
First of all you don't need a leading .* in your expression unless you want the whole match (i.e. when working with matches). Just for a simple "is it there" you won't need it as the pattern might match any position.
Just use one regular expression for each field (i.e. one for Name, one for Type, one for Quantity):
string patternForName = "\\.Net"; // escaping the dot so it will match real dots only
string patternForType = "Programming|Hardware"; // | will result in "left side or right side"
string patternForQuantity = ".?"; // will match any string, even empty ones
To check everything:
bool match = rgxName.IsMatch(Name) && rgxType.IsMatch(Type) && rgxQuantity.IsMatch(Quantity);
You can make them dynamic without using regex. Using regex won't really save you any time or effort, since the code's going to be about the same size either way. Following your pattern above, you can do something like this:
var names = new[] { "Net", "Programming" };
var types = new[] { "Hardware" };
bool matchFound = true;
foreach (string n in names)
matchFound &= Name.Contains(n);
foreach (string t in types)
matchFound |= Type.Contains(t);
The above code assumes you want to match all of "names" and any of "types", but you can substitute any logic you want.
The real crux of your problem is these boolean combinations; regex won't help you with the logic for those, so you're better off using string.Contains unless the patterns you're looking for become much more variable. Regex is distracting you from your real goal here, in my opinion.
It sounds like you're asking how you should handle the logical part of the problem. If you're pulling it from an xml file, you could structure your file in the way you want to structure your logic.
for example, have And and Or groups:
<And>
<Name Match=".Net"/>
<Or>
<Type Match="Programming"/>
<Type Match="Hardware"/>
</Or>
</And>
Create classes for each of these types. For brevity, I didn't define the classes with properties or create constructors, but you can fill them out however you want:
class YourType
{
public string Name;
public string Type;
public string Quantity;
}
abstract class Test
{
public abstract bool RunTest(YourType o);
}
class AndTest : Test
{
public List<Test> Children;
public override bool RunTest(YourType o)
{
foreach (var test in Children)
{
if (!test.RunTest(o)) return false;
}
return true;
}
}
class OrTest : Test
{
public List<Test> Children;
public override bool RunTest(YourType o)
{
foreach (var test in Children)
{
if (test.RunTest(o)) return true;
}
return false;
}
}
class NameTest : Test
{
public string Match;
public override bool RunTest(YourType o)
{
return o.Name.Contains(Match);
}
}
class TypeTest : Test
{
public string Match;
public override bool RunTest(YourType o)
{
return o.Type.Contains(Match);
}
}
Build the class structure from the xml file and just call RunTest from the top level Test. This way you can do any type of logic you'd like. I just used Contains instead of a Regex for ease of the example, but you can easily replace the string match with a regex match.
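To make the XML-driven idea concrete, here is a minimal sketch that takes a shortcut: rather than building Test objects first, it walks the XML with LINQ to XML and evaluates each node directly. The function name Evaluate and the sample values are assumptions, it only knows the And, Or, Name and Type elements from the example, and it expects every Name/Type element to carry a Match attribute. Written as a C# 9+ top-level program.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical evaluator: And = all children match, Or = any child matches.
static bool Evaluate(XElement node, string name, string type)
{
    switch (node.Name.LocalName)
    {
        case "And": return node.Elements().All(child => Evaluate(child, name, type));
        case "Or": return node.Elements().Any(child => Evaluate(child, name, type));
        case "Name": return name.Contains((string)node.Attribute("Match"));
        case "Type": return type.Contains((string)node.Attribute("Match"));
        default: throw new NotSupportedException("Unknown rule element: " + node.Name);
    }
}

// Same rule layout as the XML above, inlined here instead of read from a file.
var rules = XElement.Parse(@"
<And>
  <Name Match="".Net""/>
  <Or>
    <Type Match=""Programming""/>
    <Type Match=""Hardware""/>
  </Or>
</And>");

bool isMatch = Evaluate(rules, "Windows Product for .Net", "Software Programming");
Console.WriteLine(isMatch); // True
```

Swapping Contains for Regex.IsMatch in the Name and Type cases would give the regex-based version of the same structure.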
if (rgxName.IsMatch(Name) && (rgxType.IsMatch(Type) || rgxType1.IsMatch(Type)))
{
...
}
In .NET, Regex.Match matches anywhere in the string, so you don't need the any-characters (.*) prefix on your pattern. So, to check for ".NET", it would simply be:
Regex regexName = new Regex(@"\.NET", RegexOptions.IgnoreCase);
// IsMatch returns true/false, Match returns a Match object
bool nameMatches = regexName.IsMatch(name);
Your patterns for Programming and Hardware would just be
new Regex(@"Programming", RegexOptions.IgnoreCase) // Or leave out IgnoreCase if you're case-sensitive
new Regex(@"Hardware")
If you have a list of Name patterns and a list of type patterns, you could do something similar to this:
bool nameIsMatch = false;
bool typeIsMatch = false;
foreach (string namePattern in namePatterns)
{
nameIsMatch = nameIsMatch || Regex.IsMatch(nameString, namePattern);
}
foreach (string typePattern in typePatterns)
{
typeIsMatch = typeIsMatch || Regex.IsMatch(typeString, typePattern);
}
if (nameIsMatch && typeIsMatch)
{
// Whatever you want to do
}
patternForName = ".Net"
patternForType = "Programming"
patternForType1 = "Hardware"
You might find The Regex Coach to be useful.

What is the best way of performing a string concatenation in making a dynamic URL?

I am constructing a URL at runtime. So far I have done like
public string BuildURLAndNavigate(CodeType codeType)
{
string vURL = string.Empty;
string mstrNavServer = "http://some.com/nav";
vURL = ConcatenateString(mstrNavServer , "/somepage.asp?app=myapp");
//Build Code Type
switch (codeType)
{
case CodeType.Series:
vURL = ConcatenateString(vURL , "&tools=ser");
break;
case CodeType.DataType:
vURL = ConcatenateString(vURL , "&tools=dt");
break;
}
//build version
string VER_NUM = "5.0";
vURL = ConcatenateString(vURL , ConcatenateString("&vsn=" , VER_NUM));
return vURL;
}
private string ConcatenateString(string expression1, string expression2)
{
return string.Concat(expression1 + expression2);
}
But I am not happy with the one I am doing.
I am sure that there is definitely a best practice / better approach than this.
Kindly help me out in guiding for the same.
Thanks
You could use a StringBuilder:
public string BuildURLAndNavigate(CodeType codeType)
{
StringBuilder vURL = new StringBuilder();
vURL.Append("http://some.com/nav");
vURL.Append("/somepage.asp?app=myapp");
//Build Code Type
switch (codeType)
{
case CodeType.Series:
vURL.Append("&tools=ser");
break;
case CodeType.DataType:
vURL.Append("&tools=dt");
break;
}
//build version
string VER_NUM = "5.0";
vURL.AppendFormat("&vsn={0}", VER_NUM);
return vURL.ToString();
}
Never build urls using strings, string builders, string concatenations.
You could start by defining a custom collection which will take care of properly URL encoding any value being added to it:
public class HttpNameValueCollection : NameValueCollection
{
public override void Add(string name, string value)
{
base.Add(name, HttpUtility.UrlEncode(value));
}
public override string ToString()
{
return string.Join("&", Keys.Cast<string>().Select(
key => string.Format("{0}={1}", key, this[key])));
}
}
And then simply:
public string BuildURLAndNavigate(CodeType codeType)
{
var uriBuilder = new UriBuilder("http://some.com/nav/somepage.asp");
var values = new HttpNameValueCollection();
values.Add("app", "myapp");
switch (codeType)
{
case CodeType.Series:
values.Add("tools", "ser");
break;
case CodeType.DataType:
values.Add("tools", "dt");
break;
}
// You could even do things like this without breaking your urls
values.Add("p", "http://www.example.com?p1=v1&p2=v2");
string VER_NUM = "5.0";
values.Add("vsn", VER_NUM);
uriBuilder.Query = values.ToString();
return uriBuilder.ToString();
}
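If you would rather not reference System.Web, a rough equivalent is to encode each key and value with Uri.EscapeDataString and hand the result to UriBuilder. This is a sketch under assumptions: the parameter list mirrors the question, the nested URL is illustrative, and it is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Parameter names and values mirror the question; the nested URL is illustrative.
var parameters = new List<KeyValuePair<string, string>>
{
    new KeyValuePair<string, string>("app", "myapp"),
    new KeyValuePair<string, string>("tools", "ser"),
    new KeyValuePair<string, string>("vsn", "5.0"),
    new KeyValuePair<string, string>("p", "http://www.example.com?p1=v1&p2=v2"),
};

// Encode keys and values separately so '&', '=' and '?' inside a value cannot break the query.
string query = string.Join("&", parameters.Select(
    p => Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));

var builder = new UriBuilder("http://some.com/nav/somepage.asp") { Query = query };
string url = builder.Uri.AbsoluteUri;
Console.WriteLine(url);
```

Reading builder.Uri.AbsoluteUri rather than builder.ToString() sidesteps differences in how the default port is printed across framework versions.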
Like Saxon Druce said: You could use a StringBuilder, but, depending on CodeType values, you could eliminate the switch too:
public string BuildURLAndNavigate(CodeType codeType)
{
StringBuilder vURL = new StringBuilder();
vURL.Append("http://some.com/nav");
vURL.Append("/somepage.asp?app=myapp");
//Build Code Type
vURL.Append(codeType == CodeType.Series ? "&tools=ser" : "&tools=dt");
//build version
string VER_NUM = "5.0";
vURL.AppendFormat("&vsn={0}", VER_NUM);
return vURL.ToString();
}
Do
return string.Concat(expression1, expression2);
not
return string.Concat(expression1 + expression2);
Wouldn't the right way to do that be to use the Uri class or the UriBuilder class?
For example, the Uri constructor overload Uri(Uri, string):
public Uri(
Uri baseUri,
string relativeUri
);
Uri baseUri = new Uri("http://www.contoso.com");
Uri myUri = new Uri(baseUri, "catalog/shownew.htm");
Console.WriteLine(myUri.ToString());
http://msdn.microsoft.com/en-us/library/aa332624(v=VS.71).aspx
What you are doing is fine - it is simple and understandable. Anyone who reads the code can understand what you are doing.
In terms of performance - you are not doing much string manipulation, so unless you are building huge strings or doing this operation thousands of times a minute, you will not gain much by using StringBuilder. Before optimizing this code, test its performance. You will probably find that there are other bigger bottlenecks to work on first.
The only real comment I have is that your ConcatenateString function seems superfluous. It is not really adding anything to the code, and all the calls to it can simply be replaced by string.Concat. As mentioned in the answer from @abatishchev, you should be using (str1, str2), not (str1 + str2), as that defeats the reason for the call.
Yes, StringBuilder is the best solution here. You can find more information on MSDN page: http://msdn.microsoft.com/en-us/library/system.text.stringbuilder.aspx
StringBuilder contains very useful methods:
StringBuilder.Append
Appends information to the end of the current StringBuilder.
StringBuilder.AppendFormat
Replaces a format specifier passed in a string with formatted text.
StringBuilder.Insert
Inserts a string or object into the specified index of the current StringBuilder.
StringBuilder.Remove
Removes a specified number of characters from the current StringBuilder.
StringBuilder.Replace
Replaces all occurrences of a specified character or string in the current StringBuilder.
I'd personally be inclined to just use string.Format for something like that:
public string BuildURLAndNavigate(CodeType codeType)
{
string vURL = "http://some.com/nav/somepage.asp?app=myapp&tools={0}&vsn={1}";
string codevalue = "";
//Build Code Type
switch (codeType)
{
case CodeType.Series:
codevalue = "ser";
break;
case CodeType.DataType:
codevalue = "dt";
break;
}
//build version
string version = "5.0";
return string.Format(vURL, codevalue, version);
}
Apologies if there are any mistakes in that, I wasn't in VS when writing it so there might be a few typos I didn't notice - you should get the idea though.
The reason I like this method is because you can immediately see what your url's overall form is which can make it a bit easier to understand roughly what the url being returned is.
In order to keep all variables at one place, could we use following solution
public string BuildURLAndNavigate(CodeType codeType)
{
//code here - switch statement to calculate the codeValue
//Anonymous type - To keep all variables at one place
var urlComponents = new {
server = "http://some.com/nav",
pageName="/somepage.asp?app=myapp",
codevalue = "", //Replace with the actual value calculated in switch statement
versionPart="&vsn=",
version = "5.0"
};
StringBuilder vURL = new StringBuilder();
vURL.Append(urlComponents.server);
vURL.Append(urlComponents.pageName);
vURL.Append(urlComponents.codevalue);
vURL.Append(urlComponents.versionPart);
vURL.Append(urlComponents.version);
return vURL.ToString();
}
