Format a string template with named parameters to a literal in C#

I have an application that creates string templates with named variables. This is done in accordance with the logging guide for ASP.NET Core.
Now I find myself wanting to deliver these strings through the API itself as well, but this time with all the parameters filled in.
Basically, I'd want to use:
var template = "ID {ID} not found";
var para = new object[] {"value"};
String.Format(template, para);
However, this throws a FormatException (input string was not in a correct format).
Of course, I also cannot guarantee that somebody didn't make a string template the 'classic' way with indexes:
var template2 = "ID {0} not found";
Is there a new way of formatting strings that I'm missing, or are we supposed to work around this?
I do not want to rework the existing code base to use numbers or the $"...{para}" syntax, as this would lose information when it is being logged.
I'm guessing I could do a regex search and see if there's a '{0}' or a named parameter, and replace the named with indexes before formatting. But I wanted to know if there are some easier/cleaner ways of doing this.
Update - regex solution:
Below is the current workaround I've made using regex:
public static class StringUtils
{
public static string Format(string template, params object[] para)
{
var match = Regex.Match(template, @"\{@?\w+}");
if (!match.Success) return template;
if (int.TryParse(match.Value.Substring(1, match.Value.Length - 2), out int n))
return string.Format(template, para);
else
{
var list = new List<string>();
var nextStartIndex = 0;
var i = 0;
while (match.Success)
{
if (match.Index > nextStartIndex)
list.Add(template.Substring(nextStartIndex , match.Index - nextStartIndex) + $"{{{i}}}");
else
list.Add($"{{{i}}}");
nextStartIndex = match.Index + match.Value.Length;
match = match.NextMatch();
i++;
}
return string.Format(string.Join("",list.ToArray()), para);
}
}
}
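A more compact variant of the same workaround could look like the sketch below. It assumes placeholders are bound to arguments purely by position (as the logging templates do), so a repeated name consumes a new argument each time, and it leaves numeric placeholders such as {0} untouched. The class name NamedTemplate and the pattern are illustrative assumptions, not part of any library; templates that mix named and numeric placeholders are not handled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedTemplate
{
    // Matches {Name}, {@Name} and {0}; any ",alignment" or ":format" suffix is captured
    // so it can be carried over. Escaped braces ({{ and }}) are skipped by the lookarounds.
    private static readonly Regex Placeholder =
        new Regex(@"(?<!\{)\{@?(?<name>\w+)(?<suffix>[^{}]*)\}(?!\})");

    public static string Format(string template, params object[] args)
    {
        var next = 0;
        var indexed = Placeholder.Replace(template, m =>
            int.TryParse(m.Groups["name"].Value, out _)
                ? m.Value                                            // already an index, keep as-is
                : "{" + (next++) + m.Groups["suffix"].Value + "}");  // named, so assign the next index
        return string.Format(indexed, args);
    }
}
```

With this sketch, NamedTemplate.Format("ID {ID} not found", "value") and NamedTemplate.Format("ID {0} not found", "value") both go through string.Format with an indexed template.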

Related

How to parse a Ternary Statement in C#

I'm accepting an input string that I want to be a ternary statement that works on strings. So my method signature would look like this:
public string Parse(string value, string ternaryStatement)
and these parameters would give these results:
Parse(null, "=?It's Null:It's Not Null") == "It's Null" // Empty string would
Parse("", "=?It's Null:It's Not Null") == "It's Null" // be same as null
This example is fairly simple, Split the string first by '?' then by ':'
But of course I need a method to handle escape characters, "\\", "\?" and "\:", where "\\" is valid anywhere, "\?" would only be valid before the first unescaped "?" and "\:" would only be valid after that same "?".
Parse(@"\?\", @"=\\\?\\?\:Match\::\:No Match\:") == ":Match:"
Parse(@"\?\", @"!=\\\?\\?\:No Match\::\:Match\:") == ":Match:"
But this is really complicated. I believe I can perform it using regular expressions, but that just creates another problem since this is well beyond my limited understanding of regular expressions. What's the best way to tackle this problem?
Edit 1
Some of the background: I'm storing a format for a URL in a database config table (It's actually Dynamics 365 for Customer Engagement, but that doesn't matter at this point). The format is stored as strings, and the parameters that are required are defined in code. So generally it looks like this:
Format: "https://something.com?Foo={0}&Bar={1}"
Description: "0 - Foo, 1 - Bar"
where the description is used both for the person that is formatting the url, and the developer that needs to know how to structure the format statement.
The problem I'm running into right now is that I have a url that requires at least one of two different parameters. If one of the values is null or empty, it will error if included in the url. So I need a way of saying, if Foo is null or Bar is null, don't include the name or &. Ideally I'd like to implement this like this:
"https://something.com?{0:=?:Foo={{0}}}&}{1:=?:Bar={{1}}}}"
So if Foo is null and Bar is "Bar" the output would be
"https://something.com?Bar=Bar"
I could also see this being used if we need to switch between a 0/1 for a boolean to true/false without having to change code:
"https://something.com?{0:=0?false:true}"
The two regexes should be:
Regex rx = new Regex(@"(?<=(?:^|[^\\])(?:\\\\)*)\?");
Regex rx2 = new Regex(@"(?<=(?:^|[^\\])(?:\\\\)*):");
Use them like:
var m = rx.Match(str);
if (m.Success)
{
int ix = m.Index;
}
The main point of the two rx is that the searched string (\? or :) must be preceded by
(?<=(?:^|[^\\])(?:\\\\)*)
that is the beginning of the string ^ or not a \ ([^\\]) plus zero or an even number of \\ that is (?:\\\\)*.
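The same "preceded by an even number of backslashes" rule can also be expressed without a regex. The following is a minimal sketch, assuming a backslash always escapes exactly the next character; the class and method names (EscapeScanner, IndexOfUnescaped) are hypothetical.

```csharp
using System;

public static class EscapeScanner
{
    // Returns the index of the first occurrence of target that is not escaped
    // by a preceding backslash, or -1 if there is none.
    public static int IndexOfUnescaped(string s, char target, int start = 0)
    {
        for (var i = start; i < s.Length; i++)
        {
            if (s[i] == '\\')
            {
                i++;        // skip the character that the backslash escapes
                continue;
            }
            if (s[i] == target) return i;
        }
        return -1;
    }
}
```

Calling it once for '?' and then for ':' starting just after the returned index gives the two split points of the ternary statement.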
An all-in-one regex is:
Regex rx = new Regex(@"^(?<operator>=|!=|<=|>=|<|>)(?<cmp>(?:(?:\\.)|[^?:])*)\?(?<true>(?:(?:\\.)|[^?:])*):(?<false>(?:(?:\\.)|[^?:])*)$");
var m = rx.Match(str);
if (m.Success)
{
string op = m.Groups["operator"].Value;
string cmp = m.Groups["cmp"].Value;
string true1 = m.Groups["true"].Value;
string false1 = m.Groups["false"].Value;
}
In op you'll get the comparison operator used, in cmp the comparand, and in true1 and false1 the true and false strings. If !m.Success, the string isn't correctly formatted. Comprehending the regex is left as a simple exercise for the reader (unless you comprehend a regex, you shouldn't ever use it, because sooner or later you'll have to modify, fix, or debug it).
Solution to returning different values based on input string
Why do you need to pass in a ternary statement / wouldn't this make more sense?
string Parse(string value, string returnIfNull, string returnIfNotNull)
{
return string.IsNullOrEmpty(value) ? returnIfNull: returnIfNotNull;
}
Console.WriteLine(Parse("", "treat as null", "not expected"));
Console.WriteLine(Parse("hello", "not expected", "this value's not null"));
Parsing a ternary string for values
However, if you really need to do this, you could use something like the below:
private static readonly Regex TernaryParserRegex = new Regex(
@"^=\?(?<ifNull>(?:\\(\\\\)*:|[^:])*)(?<!\\):(?<ifNotNull>(?:\\(\\\\)*:|[^:])*)$"
/* , RegexOptions.Compiled //include this line for performance if appropriate */
);
private string[] ParseTernaryString (string ternaryStatement)
{
var results = TernaryParserRegex.Match(ternaryStatement);
if (results.Success)
{
string[] returnVal = {
results.Groups["ifNull"].Value
,
results.Groups["ifNotNull"].Value
};
return returnVal;
}
else
{
throw new Exception("Invalid Ternary Statement"); //use an appropriate exception type here; or have the function return `{null,null}` / some other default as appropriate
}
}
public string Parse(string value, string ternaryStatement)
{
var returnValues = ParseTernaryString(ternaryStatement);
return string.IsNullOrEmpty(value) ? returnValues[0]: returnValues[1];
}
//Example Usage:
Console.WriteLine(Parse("", "=?treat as null:not expected"));
Console.WriteLine(Parse("hello", "=?not expected:this value's not null"));
An explanation of the regex & additional examples are available here:
https://regex101.com/r/YJ9qd3/1
Appending non-null/blank values to a URL's Query String
public void Main()
{
var url = "https://example.com?something=keepMe&foo=FooWasHere&bar=swapMeQuick";
var dict = new System.Collections.Generic.Dictionary<string, string>();
dict.Add("foo", null);
dict.Add("bar", "hello");
dict.Add("boo", "new");
Console.WriteLine(CreateUri(url, dict).ToString());
}
Uri CreateUri(string uri, IDictionary<string, string> parameters)
{
return CreateUri(new Uri(uri), parameters);
}
Uri CreateUri(Uri uri, IDictionary<string, string> parameters)
{
var query = HttpUtility.ParseQueryString(uri.Query); //https://msdn.microsoft.com/en-us/library/ms150046(v=vs.110).aspx; though returns HttpValueCollection
foreach (string key in parameters.Keys)
{
if (string.IsNullOrEmpty(parameters[key]))
{ //parameter is null or empty; if such a parameter already exists on our URL, remove it
query.Remove(key); //if this parameter does not already exist, has no effect (but no exception is thrown)
}
else
{ //parameter has a value; add or update the query string with this value
query.Set(key, parameters[key]);
}
}
var builder = new UriBuilder(uri);
builder.Query = query.ToString();
return builder.Uri;
}

Custom Uppercase on String

Hi, I was trying to make a program that changes marked words in a string to uppercase.
The words to uppercase are wrapped in a tag like this:
the <upcase>weather</upcase> is very <upcase>hot</upcase>
The result:
the WEATHER is very HOT
My code looks like this:
string upKey = "<upcase>";
string lowKey = "</upcase>";
string quote = "the lazy <upcase>fox jump over</upcase> the dog <upcase> something here </upcase>";
int index = quote.IndexOf(upKey);
int indexEnd = quote.IndexOf(lowKey);
while(index!=-1)
{
for (int a = 0; a < index; a++)
{
Console.Write(quote[a]);
}
string upperQuote = "";
for (int b = index + 8; b < indexEnd; b++)
{
upperQuote += quote[b];
}
upperQuote = upperQuote.ToUpper().ToString();
Console.Write(upperQuote);
for (int c = indexEnd+9;c<quote.Length;c++)
{
if (quote[c]=='<')
{
break;
}
Console.Write(quote[c]);
}
index = quote.IndexOf(upKey, index + 1);
indexEnd = quote.IndexOf(lowKey, index + 1);
}
Console.WriteLine();
}
I have been trying this code with a while loop (while (indexEnd != -1)):
index = quote.IndexOf(upKey, index + 1);
indexEnd = quote.IndexOf(lowKey, index + 1);
but that does not work; the program runs into an infinite loop. By the way, I'm a beginner, so please give an answer that I can understand :)
You can use a regular expression for this:
string input = "the <upcase>weather</upcase> is very <upcase>hot</upcase>";
var regex = new Regex("<upcase>(?<theMatch>.*?)</upcase>");
var result = regex.Replace(input, match => match.Groups["theMatch"].Value.ToUpper());
// result will be: "the WEATHER is very HOT"
Here's an explanation of the regular expression used above:
<upcase> matches the characters <upcase> literally (case sensitive)
(?<theMatch>.*?) Named capturing group theMatch
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
< matches the characters < literally
/ matches the character / literally
upcase> matches the characters upcase> literally (case sensitive)
The following will work as long as there are only matching tags and none of them are nested.
public static string Upper(string str)
{
const string start = "<upcase>";
const string end = "</upcase>";
var builder = new StringBuilder();
// Find the first start tag
int startIndex = str.IndexOf(start);
// If no start tag found then return the original
if (startIndex == -1)
return str;
// Append the part before the first tag as is
builder.Append(str.Substring(0, startIndex));
// Continue as long as we find another start tag.
while (startIndex != -1)
{
// Find the end tag for the current start tag
var endIndex = str.IndexOf(end, startIndex);
// Append the text between the start and end as upper case.
builder.Append(
str.Substring(
startIndex + start.Length,
endIndex - startIndex - start.Length).ToUpper());
// Find the next start tag.
startIndex = str.IndexOf(start, endIndex);
// Append the part after the end tag, but before the next start as is
builder.Append(
str.Substring(
endIndex + end.Length,
(startIndex == -1 ? str.Length : startIndex) - endIndex - end.Length));
}
return builder.ToString();
}
I'm not rewriting your code. Just answering your (main) question:
You need to keep a variable of the index you're at, and check for IndexOf from there only (See MSDN). Something like this:
int index = 0;
while (quote.IndexOf(upKey, index) != -1)
{
//Your code, including updating the value of index.
}
(I didn't check this on Visual Studio. This is just to point you in the direction that I think you're looking for.)
The reason for the infinite loop is that you're always testing IndexOf of the same index. Perhaps you mean to have quote.IndexOf(upKey, index += 1); which would change the value of index?
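One way to fill in that skeleton is sketched below. It assumes tags are well formed and not nested, and it stops at the first unmatched opening tag; the class and method names (UpcaseTags, Apply) are made up for illustration. The key point is that index always moves past the closing tag that was just handled, so the loop cannot revisit the same position.

```csharp
using System;
using System.Text;

public static class UpcaseTags
{
    public static string Apply(string quote)
    {
        const string upKey = "<upcase>";
        const string lowKey = "</upcase>";
        var result = new StringBuilder();
        var index = 0; // position from which the next search starts

        while (true)
        {
            var open = quote.IndexOf(upKey, index, StringComparison.Ordinal);
            if (open == -1) break;
            var close = quote.IndexOf(lowKey, open + upKey.Length, StringComparison.Ordinal);
            if (close == -1) break; // unmatched opening tag: leave the rest untouched

            // Text before the tag is copied as-is, text inside the tag is upper-cased.
            result.Append(quote, index, open - index);
            result.Append(quote.Substring(open + upKey.Length, close - open - upKey.Length).ToUpperInvariant());

            index = close + lowKey.Length; // continue after the closing tag
        }

        result.Append(quote, index, quote.Length - index); // remaining tail
        return result.ToString();
    }
}
```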
The way to go here is probably to use Regex, but these easy parsing exercises are always fun to do manually. This can be easily solved using a very simple state machine.
What states can we have when dealing with strings of this nature? I can think of 4:
We are either parsing normal text
Or we are parsing an opening format tag '<...>'
Or we are parsing a closing format tag '</...>'
Or we are parsing text to be formatted between tags
I can't think of any other states. Now we need to think about the normal flow / transition between states. What should happen when we parse a string with the correct format?
Parser starts up expecting normal text. That is easy to understand.
If expecting normal text we encounter a '<' then the parser should switch to parsing opening format tag state. There is no other valid state transition.
If in parsing opening format tag state we encounter a '>' then the parser should switch to parsing text to be formatted. There is no other valid state transition.
If in parsing text to be formatted we encounter a '<' then the parser should switch to parsing closing tag. Again, there is no other valid state transition.
If in parsing closing tag we encounter a '>' then the parser should switch to normal text. Once more, there is no other valid transition. Note that we are disallowing nested tags.
Ok, so that seems pretty easy to understand. What do we need to implement this?
First we'll need something to represent the parsing states. A good old enum will do:
private enum ParsingState
{
UnformattedText,
OpenTag,
CloseTag,
FormattedText,
}
Now we need some string buffers to keep track of the final formatted string, the current format tag we are parsing and finally the substring we need to format. We will use several StringBuilder's for these as we don't know how long these buffers are and how many concatenations will be performed:
var formattedStringBuffer = new StringBuilder();
var formatBuffer = new StringBuilder();
var tagBuffer = new StringBuilder();
We will also need to keep track of the parser's state and the current active tag if any (so we can make sure that the parsed closing tag matches the current active tag):
var state = ParsingState.UnformattedText;
var activeFormatTag = string.Empty;
And now we are good to go, but before we do, can we generalize this so it works with any format tag?
Yes we can, we just need to tell the parser what to do for each supported tag. We can do this easily by passing along a Dictionary that ties each tag to the action it should perform. We do this the following way:
var formatter = new Dictionary<string, Func<string, string>>();
formatter.Add("upcase", s => s.ToUpperInvariant());
formatter.Add("lcase", s => s.ToLowerInvariant());
Great! Now our implementation could be the following:
public static string Parse(this string str, Dictionary<string, Func<string,string>> formatter)
{
var formattedStringBuffer = new StringBuilder();
var formatBuffer = new StringBuilder();
var tagBuffer = new StringBuilder();
var state = ParsingState.UnformattedText;
var activeFormatTag = string.Empty;
foreach (var c in str)
{
switch (state)
{
case ParsingState.UnformattedText:
{
if (c != '<')
{
formattedStringBuffer.Append(c);
}
else
{
state = ParsingState.OpenTag;
}
break;
}
case ParsingState.OpenTag:
{
if (c != '>')
{
tagBuffer.Append(c);
}
else
{
state = ParsingState.FormattedText;
activeFormatTag = tagBuffer.ToString();
tagBuffer.Clear();
}
break;
}
case ParsingState.FormattedText:
{
if (c != '<')
{
formatBuffer.Append(c);
}
else
{
state = ParsingState.CloseTag;
}
break;
}
case ParsingState.CloseTag:
{
if (c!='>')
{
tagBuffer.Append(c);
}
else
{
var expectedTag = $"/{activeFormatTag}";
var tag = tagBuffer.ToString();
if (tag != expectedTag)
throw new FormatException($"Expected closing tag not found: <{expectedTag}>.");
if (formatter.ContainsKey(activeFormatTag))
{
var formatted = formatter[activeFormatTag](formatBuffer.ToString());
formattedStringBuffer.Append(formatted);
tagBuffer.Clear();
formatBuffer.Clear();
state = ParsingState.UnformattedText;
}
else
throw new FormatException($"Format tag <{activeFormatTag}> not recognized.");
}
break;
}
}
}
if (state != ParsingState.UnformattedText)
throw new FormatException($"Bad format in specified string '{str}'");
return formattedStringBuffer.ToString();
}
Is it the most elegant solution? No, Regex will do a much better job, but since you are a beginner I would not recommend that you start solving these kinds of problems that way; you'll learn a whole lot more solving them manually. You'll have plenty of time to learn Regex later on.

How to implement "Find, Replace, Next" in a String on C#?

I'm searching for a solution to this case:
I have a method inside a DLL that receives a string containing some words that act as placeholders/parameters, which will be replaced by the result of another specific method (also inside the DLL).
To simplify: it's a query string received as an argument by a method inside a DLL, where any word that matches a specific pattern will be replaced.
My method receives a string that could look like this:
(on .exe app)
string str = "INSERT INTO mydb.mytable (id_field, description, complex_number) VALUES ('#GEN_COMPLEX_ID#','A complex solution', '#GEN_COMPLEX_ID#');"
MyDLLClass.MyMethod(str);
So, the problem is: if I replace #GEN_COMPLEX_ID# in this string, wanting a different value for each match, it will not happen, because the replacement executes the function once for the whole string (not step by step). So I'd like help implementing a step-by-step replace of any text (find a word, replace it, then move to the next, replace it, and so on).
Could you help me?
Thanks!
This works pretty well for me:
string yourOriginalString = "ab cd ab cd ab cd";
string pattern = "ab";
string yourNewDescription = "123";
int startingPositionOffset = 0;
int yourOriginalStringLength = yourOriginalString.Length;
MatchCollection match = Regex.Matches(yourOriginalString, pattern, RegexOptions.IgnoreCase | RegexOptions.Multiline);
foreach (Match m in match)
{
yourOriginalString = yourOriginalString.Substring(0, m.Index+startingPositionOffset) + yourNewDescription + yourOriginalString.Substring(m.Index + startingPositionOffset+ m.Length);
startingPositionOffset = yourOriginalString.Length - yourOriginalStringLength;
}
If what you're asking is how to replace each placeholder with a different value, you can do it using the Regex.Replace overload which accepts a MatchEvaluator delegate, and executes it for each match:
// conceptually, something like this (note that it's not checking if there are
// enough values in the replacementValues array)
static string ReplaceMultiple(
string input, string placeholder, IEnumerable<string> replacementValues)
{
var enumerator = replacementValues.GetEnumerator();
return Regex.Replace(input, placeholder,
m => { enumerator.MoveNext(); return enumerator.Current; });
}
This is, of course, presuming that all placeholders look the same.
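Applied to the question's case, the same overload can call a generator for every match, so each placeholder receives a fresh value. This is a minimal sketch: the names PlaceholderReplacer and ReplaceEach are hypothetical, and the generate delegate stands in for whatever method in the DLL produces the ID. Regex.Escape is used so the placeholder is matched literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderReplacer
{
    // Replaces every occurrence of placeholder with a new value obtained from generate().
    public static string ReplaceEach(string input, string placeholder, Func<string> generate)
    {
        // The evaluator runs once per match, in order from left to right.
        return Regex.Replace(input, Regex.Escape(placeholder), _ => generate());
    }
}
```

For example, passing a delegate that increments a counter yields a different number for each '#GEN_COMPLEX_ID#' in the query string.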
Pseudo-code
var split = source.Split(placeholder); // create array of items without placeholders
var result = split[0]; // copy first item
for(int i = 1; i < split.Length; i++) // iterate over the split items, not the result string
{
bool replace = ... // ask user
result += replace ? replacement : placeholder; // to put replacement or not to put
result += split[i]; // copy next item
}
You should use the Split method like this:
string [] placeholder = {"#Placeholder#"} ;
string[] request = cd.Split(placeholder, StringSplitOptions.RemoveEmptyEntries);
StringBuilder requetBuilding = new StringBuilder();
requetBuilding.Append(request[0]);
int index = 1;
requetBuilding.Append("Your place holder replacement");
requetBuilding.Append(request[index]);
index++; //next replacement
// requetBuilding.Append("Your next place holder replacement");
// requetBuilding.Append(request[index]);

Extract Common Name from Distinguished Name

Is there a call in .NET that parses the CN from a rfc-2253 encoded distinguished name? I know there are some third-party libraries that do this, but I would prefer to use native .NET libraries if possible.
Examples of a string encoded DN
CN=L. Eagle,O=Sue\, Grabbit and Runn,C=GB
CN=Jeff Smith,OU=Sales,DC=Fabrikam,DC=COM
If you are working with an X509Certificate2, there is a native method that you can use to extract the Simple Name. The Simple Name is equivalent to the Common Name RDN within the Subject field of the main certificate:
x5092Cert.GetNameInfo(X509NameType.SimpleName, false);
Alternatively, X509NameType.DnsName can be used to retrieve the Subject Alternative Name, if present; otherwise, it will default to the Common Name:
x5092Cert.GetNameInfo(X509NameType.DnsName, false);
After digging around in the .NET source code it looks like there is an internal utility class that can parse Distinguished Names into their different components. Unfortunately the utility class is not made public, but you can access it using reflection:
string dn = "CN=TestGroup,OU=Groups,OU=UT-SLC,OU=US,DC=Company,DC=com";
Assembly dirsvc = Assembly.Load("System.DirectoryServices");
Type asmType = dirsvc.GetType("System.DirectoryServices.ActiveDirectory.Utils");
MethodInfo mi = asmType.GetMethod("GetDNComponents", BindingFlags.NonPublic | BindingFlags.Static);
string[] parameters = { dn };
var test = mi.Invoke(null, parameters);
//test.Dump("test1");//shows details when using Linqpad
//Convert Distinguished Name (DN) to Relative Distinguished Names (RDN)
MethodInfo mi2 = asmType.GetMethod("GetRdnFromDN", BindingFlags.NonPublic | BindingFlags.Static);
var test2 = mi2.Invoke(null, parameters);
//test2.Dump("test2");//shows details when using Linqpad
The results would look like this:
//test1 is array of internal "Component" struct that has name/values as strings
Name Value
CN TestGroup
OU Groups
OU UT-SLC
OU US
DC company
DC com
//test2 is a string with CN=RDN
CN=TestGroup
Please note that this is an internal utility class and could change in a future release.
I had the same question, myself, when I found yours. Didn't find anything in the BCL; however, I did stumble across this CodeProject article that hit the nail squarely on the head.
I hope it helps you out, too.
http://www.codeproject.com/Articles/9788/An-RFC-2253-Compliant-Distinguished-Name-Parser
Do Win32 functions count? You can use PInvoke with DsGetRdnW. For code, see my answer to another question: https://stackoverflow.com/a/11091804/628981.
You can extract the common name from an ASN.1-encoded distinguished name using AsnEncodedData class:
var distinguishedName= new X500DistinguishedName("CN=TestGroup,OU=Groups,OU=UT-SLC,OU=US,DC=Company,DC=com");
var commonNameData = new AsnEncodedData("CN", distinguishedName.RawData);
var commonName = commonNameData.Format(false);
A downside of this approach is that if you specify an unrecognized OID or the field identified with the OID is missing in the distinguished name, Format method will return a hex string with the encoded value of full distinguished name so you may want to verify the result.
Also the documentation does not seem to specify if the rawData parameter of the AsnEncodedData constructor is allowed to contain other OIDs besides the one specified as the first argument so it may break on non-Windows OS or in a future version of .NET Framework.
If you are on Windows, @MaxKiselev's answer works perfectly. On non-Windows platforms, it returns the ASN.1 dumps of each attribute.
.NET 5+ includes an ASN.1 parser, so you can access the RDNs in a cross-platform manner by using AsnReader.
Helper class:
public static class X509DistinguishedNameExtensions
{
public static IEnumerable<KeyValuePair<string, string>> GetRelativeNames(this X500DistinguishedName dn)
{
var reader = new AsnReader(dn.RawData, AsnEncodingRules.BER);
var snSeq = reader.ReadSequence();
if (!snSeq.HasData)
{
throw new InvalidOperationException();
}
// Many types are allowable. We're only going to support the string-like ones
// (This excludes IPAddress, X400 address, and other weird stuff)
// https://www.rfc-editor.org/rfc/rfc5280#page-37
// https://www.rfc-editor.org/rfc/rfc5280#page-112
var allowedRdnTags = new[]
{
UniversalTagNumber.TeletexString, UniversalTagNumber.PrintableString,
UniversalTagNumber.UniversalString, UniversalTagNumber.UTF8String,
UniversalTagNumber.BMPString, UniversalTagNumber.IA5String,
UniversalTagNumber.NumericString, UniversalTagNumber.VisibleString,
UniversalTagNumber.T61String
};
while (snSeq.HasData)
{
var rdnSeq = snSeq.ReadSetOf().ReadSequence();
var attrOid = rdnSeq.ReadObjectIdentifier();
var attrValueTagNo = (UniversalTagNumber)rdnSeq.PeekTag().TagValue;
if (!allowedRdnTags.Contains(attrValueTagNo))
{
throw new NotSupportedException($"Unknown tag type {attrValueTagNo} for attr {attrOid}");
}
var attrValue = rdnSeq.ReadCharacterString(attrValueTagNo);
var friendlyName = new Oid(attrOid).FriendlyName;
yield return new KeyValuePair<string, string>(friendlyName ?? attrOid, attrValue);
}
}
}
Example usage:
// Subject: CN=Example, O=Organization
var cert = new X509Certificate2("foo.cer");
var names = cert.SubjectName.GetRelativeNames().ToArray();
// names has [ { "CN": "Example" }, { "O": "Organization" } ]
Since this does not involve any string parsing, no escape or injections can be mishandled. It doesn't support decoding DN's that contain non-string elements, but those seem exceedingly rare.
How about this one:
string cnPattern = @"^CN=(?<cn>.+?)(?<!\\),";
string dn = @"CN=Doe\, John,OU=My OU,DC=domain,DC=com";
Regex re = new Regex(cnPattern);
Match m = re.Match(dn);
if (m.Success)
{
// Item with index 1 returns the first group match.
string cn = m.Groups[1].Value;
}
Adapted from Powershell Regular Expression for Extracting Parts of an Active Directory Distinguished Name.
Just adding my two cents here. This implementation works "best" if you first learn what business rules are in place that will ultimately dictate how much of the RFC will ever be implemented at your company.
private static string ExtractCN(string distinguishedName)
{
// CN=...,OU=...,OU=...,DC=...,DC=...
string[] parts;
parts = distinguishedName.Split(new[] { ",DC=" }, StringSplitOptions.None);
var dc = parts.Skip(1);
parts = parts[0].Split(new[] { ",OU=" }, StringSplitOptions.None);
var ou = parts.Skip(1);
parts = parts[0].Split(new[] { ",CN=" }, StringSplitOptions.None);
var cnMulti = parts.Skip(1);
var cn = parts[0];
if (!Regex.IsMatch(cn, "^CN="))
throw new CustomException(string.Format("Unable to parse distinguishedName for commonName ({0})", distinguishedName));
return Regex.Replace(cn, "^CN=", string.Empty);
}
You could use regular expressions to do this. Here's a regex pattern that can parse the whole DN, then you can just take the parts you are interested in:
(?:^|,\s?)(?:(?<name>[A-Z]+)=(?<val>"(?:[^"]|"")+"|(?:\\,|[^,])+))+
Here it is formatted a bit nicer, and with some comments:
(?:^|,\s?) <-- Start or a comma
(?:
(?<name>[A-Z]+)
=
(?<val>
"(?:[^"]|"")+" <-- Quoted strings
|
(?:\\,|[^,])+ <-- Unquoted strings
)
)+
This regex will give you name and val capture groups for each match.
DN strings can optionally be quoted (e.g. "Hello"), which allows them to contain unescaped commas. Alternatively, if not quoted, commas must be escaped with a backslash (e.g. Hello\, there!). This regex handles both quoted and unquoted strings.
Here's a link so you can see it in action: https://regex101.com/r/7vhdDz/1
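In C#, the pattern above could be wrapped roughly as follows. This is a sketch only: the class and method names (DnRegex, FirstValue) are assumptions, and the returned value is the raw match, so escape sequences such as \, and surrounding quotes are not removed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DnRegex
{
    // Same pattern as above; double quotes are doubled because this is a verbatim string.
    private static readonly Regex Rdn = new Regex(
        @"(?:^|,\s?)(?:(?<name>[A-Z]+)=(?<val>""(?:[^""]|"""")+""|(?:\\,|[^,])+))+");

    // Returns the raw value of the first component with the given name, or null if absent.
    public static string FirstValue(string dn, string name)
    {
        foreach (Match m in Rdn.Matches(dn))
        {
            if (m.Groups["name"].Value == name)
                return m.Groups["val"].Value;
        }
        return null;
    }
}
```

Calling DnRegex.FirstValue(dn, "CN") then picks out the common name, while the other matches remain available for OU, DC and so on.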
If the order is uncertain, I do this:
private static string ExtractCN(string dn)
{
string[] parts = dn.Split(new char[] { ',' });
for (int i = 0; i < parts.Length; i++)
{
var p = parts[i];
var elems = p.Split(new char[] { '=' });
var t = elems[0].Trim().ToUpper();
var v = elems[1].Trim();
if (t == "CN")
{
return v;
}
}
return null;
}
This is my almost RFC-compliant fail-safe DN parser derived from https://www.codeproject.com/Articles/9788/An-RFC-2253-Compliant-Distinguished-Name-Parser and an example of its usage (extract subject name as CN and O, both optional, concatenated with comma):
private static string GetCertificateString(X509Certificate2 certificate)
{
var subjectComponents = certificate.Subject.ParseDistinguishedName();
var subjectName = string.Join(", ", subjectComponents
.Where(m => (m.Item1 == "CN") || (m.Item1 == "O"))
.Select(n => n.Item2)
.Distinct());
return $"{certificate.SerialNumber} {certificate.NotBefore:yyyy.MM.dd}-{certificate.NotAfter:yyyy.MM.dd} {subjectName}";
}
private enum DistinguishedNameParserState
{
Component,
QuotedString,
EscapedCharacter,
};
public static IEnumerable<Tuple<string, string>> ParseDistinguishedName(this string value)
{
var previousState = DistinguishedNameParserState.Component;
var currentState = DistinguishedNameParserState.Component;
var currentComponent = new StringBuilder();
var previousChar = char.MinValue;
var position = 0;
Func<StringBuilder, Tuple<string, string>> parseComponent = sb =>
{
var s = sb.ToString();
sb.Clear();
var index = s.IndexOf('=');
if (index == -1)
{
return null;
}
var item1 = s.Substring(0, index).Trim().ToUpper();
var item2 = s.Substring(index + 1).Trim();
return Tuple.Create(item1, item2);
};
while (position < value.Length)
{
var currentChar = value[position];
switch (currentState)
{
case DistinguishedNameParserState.Component:
switch (currentChar)
{
case ',':
case ';':
// Separator found, yield parsed component
var component = parseComponent(currentComponent);
if (component != null)
{
yield return component;
}
break;
case '\\':
// Escape character found
previousState = currentState;
currentState = DistinguishedNameParserState.EscapedCharacter;
break;
case '"':
// Quotation mark found
if (previousChar == currentChar)
{
// Double quotes inside quoted string produce single quote
currentComponent.Append(currentChar);
}
currentState = DistinguishedNameParserState.QuotedString;
break;
default:
currentComponent.Append(currentChar);
break;
}
break;
case DistinguishedNameParserState.QuotedString:
switch (currentChar)
{
case '\\':
// Escape character found
previousState = currentState;
currentState = DistinguishedNameParserState.EscapedCharacter;
break;
case '"':
// Quotation mark found
currentState = DistinguishedNameParserState.Component;
break;
default:
currentComponent.Append(currentChar);
break;
}
break;
case DistinguishedNameParserState.EscapedCharacter:
currentComponent.Append(currentChar);
currentState = previousState;
currentChar = char.MinValue;
break;
}
previousChar = currentChar;
position++;
}
// Yield last parsed component, if any
if (currentComponent.Length > 0)
{
var component = parseComponent(currentComponent);
if (component != null)
{
yield return component;
}
}
}
Sorry for being a bit late to the party, but I was able to call the Name attribute directly from c#
UserPrincipal p
and then I was able to call
p.Name
and that gave me the full name (Common Name)
Sample code:
string name;
foreach(UserPrincipal p in PSR)
{
//PSR refers to PrincipalSearchResult
name = p.Name;
Console.WriteLine(name);
}
Obviously, you will have to fill in the blanks. But this should be easier than parsing regex.
Could you not just retrieve the CN attribute values?
As you correctly note, use someone else's class, as there are lots of fun edge cases (escaped commas, other escaped characters) that make parsing a DN look easy when it is actually reasonably tricky.
I usually use a Java class that comes with the Novell (now NetIQ) Identity Manager, so that is not helpful.
using System.Linq;
var dn = "CN=Jeff Smith,OU=Sales,DC=Fabrikam,DC=COM";
var cn = dn.Split(',').Where(i => i.Contains("CN=")).Select(i => i.Replace("CN=", "")).FirstOrDefault();
Well, here I am, another person late to the party. Here is my solution:
var dn = new X500DistinguishedName("CN=TestGroup,OU=Groups,OU=UT-SLC,OU=US,DC=\"Company, inc\",DC=com");
foreach(var part in dn.Format(true).Split("\r\n"))
{
if(part == "") continue;
var parts = part.Split('=', 2);
var key = parts[0];
var value = parts[1];
// use your key and value as you see fit here.
}
Basically, it's leveraging the X500DistinguishedName.Format method to put each component on its own line. Then split by lines, then split each line into key and value.

C# Template parsing and matching with text file

Need some ideas on how to solve this problem.
I have a template file that describes the line in the text file. For example:
Template
[%f1%]|[%f2%]|[%f3%]"[%f4%]"[%f5%]"[%f6%]
Text file
1234|1234567|123"12345"12"123456
Now I need to read the fields from the text file. In the template file, fields are written as [%some name%]. The template also defines the field separators; in this example they are | and ". The length of the fields can vary between files, but the separators stay the same. What would be the best way to read the template and then use it to read the text file?
EDIT: Text file has multiple rows, like this:
1234|1234567|123"12345"12"123456"\r\n
1234|field|123"12345"12"asdasd"\r\n
123sd|1234567|123"asdsadf"12"123456"\r\n
45gg|somedata|123"12345"12"somefield"\r\n
EDIT2: OK, let's make it even harder. Some fields can contain binary data, and I know the start and end positions of each binary field. I should be able to mark those fields in the template so that the parser knows the field is binary. How can I solve this?
I would create a regex based on the template and then parse the text file using that:
class Parser
{
private static readonly Regex TemplateRegex =
new Regex(@"\[%(?<field>[^]]+)%\](?<delim>[^[]+)?");
readonly List<string> m_fields = new List<string>();
private readonly Regex m_textRegex;
public Parser(string template)
{
var textRegexString = '^' + TemplateRegex.Replace(template, Evaluator) + '$';
m_textRegex = new Regex(textRegexString);
}
string Evaluator(Match match)
{
// add field name to collection and create regex for the field
var fieldName = match.Groups["field"].Value;
m_fields.Add(fieldName);
string result = "(.*?)";
// add delimiter to the regex, if it exists
// TODO: check, that only last field doesn't have delimiter
var delimGroup = match.Groups["delim"];
if (delimGroup.Success)
{
string delim = delimGroup.Value;
result += Regex.Escape(delim);
}
return result;
}
public IDictionary<string, string> Parse(string text)
{
var match = m_textRegex.Match(text);
var groups = match.Groups;
var result = new Dictionary<string, string>(m_fields.Count);
for (int i = 0; i < m_fields.Count; i++)
result.Add(m_fields[i], groups[i + 1].Value);
return result;
}
}
You can parse the template using regular expressions. An expression like this will match each field definition and separator:
Match m = Regex.Match(template, @"^(\[%(?<name>.+?)%\](?<separator>.)?)+$");
The match will contain two named groups (name and separator), each of which holds one capture for every time it matched in the input string. In your example, the separator group would have one less capture than the name group.
You can then iterate over the captures, and use the results to extract the fields from the input string and store the values, like this:
if( m.Success )
{
Group name = m.Groups["name"];
Group separator = m.Groups["separator"];
int index = 0;
Dictionary<string, string> fields = new Dictionary<string, string>();
for( int x = 0; x < name.Captures.Count; ++x )
{
int separatorIndex = input.Length;
if( x < separator.Captures.Count )
separatorIndex = input.IndexOf(separator.Captures[x].Value, index);
fields.Add(name.Captures[x].Value, input.Substring(index, separatorIndex - index));
index = separatorIndex + 1;
}
// Do something with results.
}
Obviously in a real program you'd have to account for invalid input and such, which I didn't do here.
I would do this with a few lines of code. Loop through your template row, taking the text between "[%" and "%]" as the variable name and everything up to the next "[%" as the terminator. Then, for each variable, read the input text up to its terminator, assign it to the variable name, and repeat.
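The loop described above can be sketched roughly as follows. This is only one possible reading of that description: the names TemplateReader, ParseTemplate and ReadLine are hypothetical, any literal text before the first field marker is ignored, and the binary fields from EDIT2 are not handled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class TemplateReader
{
    // Splits the template into (field name, following terminator) pairs.
    public static List<KeyValuePair<string, string>> ParseTemplate(string template)
    {
        var tokens = new List<KeyValuePair<string, string>>();
        int pos = 0;
        while (pos < template.Length)
        {
            int open = template.IndexOf("[%", pos, StringComparison.Ordinal);
            if (open < 0) break;
            int close = template.IndexOf("%]", open + 2, StringComparison.Ordinal);
            if (close < 0) throw new FormatException("Unclosed field marker in template.");
            string name = template.Substring(open + 2, close - open - 2);

            // Everything up to the next field marker is this field's terminator.
            int next = template.IndexOf("[%", close + 2, StringComparison.Ordinal);
            int terminatorEnd = next < 0 ? template.Length : next;
            string terminator = template.Substring(close + 2, terminatorEnd - close - 2);

            tokens.Add(new KeyValuePair<string, string>(name, terminator));
            pos = terminatorEnd;
        }
        return tokens;
    }

    // Reads one line of text according to the template.
    public static Dictionary<string, string> ReadLine(string template, string line)
    {
        var fields = new Dictionary<string, string>();
        int pos = 0;
        foreach (var token in ParseTemplate(template))
        {
            // A field without a terminator (normally the last one) runs to the end of the line.
            int end = token.Value.Length == 0
                ? line.Length
                : line.IndexOf(token.Value, pos, StringComparison.Ordinal);
            if (end < 0) throw new FormatException("Terminator for field '" + token.Key + "' not found.");
            fields[token.Key] = line.Substring(pos, end - pos);
            pos = Math.Min(line.Length, end + token.Value.Length);
        }
        return fields;
    }
}
```

Calling TemplateReader.ReadLine with the template and the sample line from the question gives f1 = "1234", f2 = "1234567" and so on up to f6 = "123456". Because terminators are kept as whole strings, separators longer than one character would also work with this sketch.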
1- Use an sscanf-style API for that: sscanf(line, format, __arglist)
2- Use string split, like:
public IEnumerable<int> GetDataFromLines(string[] lines)
{
//handle the output data
List<int> data = new List<int>();
foreach (string line in lines)
{
string[] separators = new string[] { "|", "\"" };
string[] results = line.Split(separators, StringSplitOptions.RemoveEmptyEntries);
foreach (string result in results)
{
data.Add(int.Parse(result));
}
}
return data;
}
Test it with line:
string line = "1234|1234567|123\"12345\"12\"123456";
string[] lines = new string[] { line };
GetDataFromLines(lines);
//output list items are:
1234
1234567
123
12345
12
123456
