Insert spaces between words on a camel-cased token [duplicate] - c#

This question already has answers here:
.NET - How can you split a "caps" delimited string into an array?
(19 answers)
Closed 10 years ago.
Is there a nice function to turn something like
FirstName
to this:
First Name?

See: .NET - How can you split a "caps" delimited string into an array?
Especially:
Regex.Replace("ThisIsMyCapsDelimitedString", "(\\B[A-Z])", " $1")

Here's an extension method that I have used extensively for this kind of thing:
public static string SplitCamelCase( this string str )
{
    return Regex.Replace(
        Regex.Replace(
            str,
            @"(\P{Ll})(\P{Ll}\p{Ll})",
            "$1 $2"
        ),
        @"(\p{Ll})(\P{Ll})",
        "$1 $2"
    );
}
It also handles strings like IBMMakeStuffAndSellIt, converting it to IBM Make Stuff And Sell It (IIRC).
Syntax explanation (credit):
{Ll} is the Unicode character category "Letter, lowercase" (as opposed to {Lu}, "Letter, uppercase"). \P is a negative match, while \p is a positive match, so \P{Ll} is literally "not lowercase" and \p{Ll} is "lowercase".
So this regex splits on two patterns: 1. "not lowercase, not lowercase, lowercase" (which matches the MMa in IBMMake and results in IBM Make), and 2. "lowercase, not lowercase" (which matches the eS in MakeStuff). That covers all camel-case breakpoints.
TIP: Replace space with hyphen and call ToLower to produce HTML5 data attribute names.
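As a rough sketch of that tip, the two-pass split can be wrapped together with the hyphen/lower-case step. The class name CamelCaseDemo and the helper ToHyphenatedName are hypothetical names chosen for illustration; only Regex.Replace, String.Replace and ToLowerInvariant are real framework calls.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CamelCaseDemo
{
    // Same two-pass split as the extension method above:
    // pass 1 breaks "not lowercase, not lowercase, lowercase" (IBMMake -> IBM Make),
    // pass 2 breaks "lowercase, not lowercase" (MakeStuff -> Make Stuff).
    public static string SplitCamelCase(string str)
    {
        string pass1 = Regex.Replace(str, @"(\P{Ll})(\P{Ll}\p{Ll})", "$1 $2");
        return Regex.Replace(pass1, @"(\p{Ll})(\P{Ll})", "$1 $2");
    }

    // Hypothetical helper: split, then swap spaces for hyphens and lower-case the result.
    public static string ToHyphenatedName(string str)
    {
        return SplitCamelCase(str).Replace(' ', '-').ToLowerInvariant();
    }
}
```

Under these assumptions, CamelCaseDemo.ToHyphenatedName("FirstName") yields first-name, which can then be prefixed with data- to form the attribute name.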

Simplest Way:
var res = Regex.Replace("FirstName", "([A-Z])", " $1").Trim();

You can use a regular expression:
Match ([^^])([A-Z])
Replace $1 $2
In code:
String output = System.Text.RegularExpressions.Regex.Replace(
input,
"([^^])([A-Z])",
"$1 $2"
);

/// <summary>
/// Parse the input string by placing a space between character case changes in the string
/// </summary>
/// <param name="strInput">The string to parse</param>
/// <returns>The altered string</returns>
public static string ParseByCase(string strInput)
{
// The altered string (with spaces between the case changes)
string strOutput = "";
// The index of the current character in the input string
int intCurrentCharPos = 0;
// The index of the last character in the input string
int intLastCharPos = strInput.Length - 1;
// for every character in the input string
for (intCurrentCharPos = 0; intCurrentCharPos <= intLastCharPos; intCurrentCharPos++)
{
// Get the current character from the input string
char chrCurrentInputChar = strInput[intCurrentCharPos];
// At first, set previous character to the current character in the input string
char chrPreviousInputChar = chrCurrentInputChar;
// If this is not the first character in the input string
if (intCurrentCharPos > 0)
{
// Get the previous character from the input string
chrPreviousInputChar = strInput[intCurrentCharPos - 1];
} // end if
// Put a space before each upper case character if the previous character is lower case
if (char.IsUpper(chrCurrentInputChar) == true && char.IsLower(chrPreviousInputChar) == true)
{
// Add a space to the output string
strOutput += " ";
} // end if
// Add the character from the input string to the output string
strOutput += chrCurrentInputChar;
} // next
// Return the altered string
return strOutput;
} // end method

Regex:
http://weblogs.asp.net/jgalloway/archive/2005/09/27/426087.aspx
http://stackoverflow.com/questions/773303/splitting-camelcase
(probably the best - see the second answer)
http://bytes.com/topic/c-sharp/answers/277768-regex-convert-camelcase-into-title-case
To convert from UpperCamelCase to Title Case, use this line:
Regex.Replace("UpperCamelCase",@"(\B[A-Z])",@" $1");
To convert from both lowerCamelCase and UpperCamelCase to Title Case, use a MatchEvaluator:
public string toTitleCase(Match m) { char c=m.Captures[0].Value[0]; return ((c>='a')&&(c<='z'))?Char.ToUpper(c).ToString():" "+c; }
and change your regex a little with this line:
Regex.Replace("UpperCamelCase or lowerCamelCase",@"(\b[a-z]|\B[A-Z])",new MatchEvaluator(toTitleCase));

Related

Capture substring within delimiters and excluding characters using regex

How could a regex pattern look like to capture a substring between 2 delimiters, but excluding some characters (if any) after first delimiter and before last delimiter (if any)?
The input string looks for instance like this:
var input = @"Not relevant {
#AddInfoStart Comment:String:=""This is a comment"";
AdditionalInfo:String:=""This is some additional info"" ;
# } also not relevant";
The capture should contain the substring between "{" and "}", but excluding any spaces, newlines and "#AddInfoStart" string after start delimiter "{" (just if any of them present), and also excluding any spaces, newlines and ";" and "#" characters before end delimiter "}" (also if any of them present).
The captured string should look like this
Comment:String:=""This is a comment"";
AdditionalInfo:String:=""This is some additional info""
It is possible that there are blanks before or after the ":" and ":=" internal delimiters, and also that the value after ":=" is not always marked as a string, for instance something like:
{ Val1 : Real := 1.7 }
For arrays is used the following syntax:
arr1 : ARRAY [1..5] OF INT := [2,5,44,555,11];
arr2 : ARRAY [1..3] OF REAL
This is my solution:
Remove the content outside the brackets
Use a regular expression to get the values inside the brackets
Code:
var input = @"Not relevant {
#AddInfoStart Comment:String:=""This is a comment"";
Val1 : Real := 1.7
AdditionalInfo:String:=""This is some additional info"" ;
# } also not relevant";
// remove content outside brackets
input = Regex.Replace(input, @".*\{", string.Empty);
input = Regex.Replace(input, @"\}.*", string.Empty);
string property = @"(\w+)";
string separator = @"\s*:\s*"; // ":" with or without whitespace
string type = @"(\w+)";
string equals = @"\s*:=\s*"; // ":=" with or without whitespace
string text = @"""?(.*?)"""; // value between ""
string number = @"(\d+(\.\d+)?)"; // number like 123 or with a . separator such as 1.45
string value = $"({text}|{number})"; // value can be a string or number
string pattern = $"{property}{separator}{type}{equals}{value}";
var result = Regex.Matches(input, pattern)
.Cast<Match>()
.Select(match => new
{
FullMatch = match.Groups[0].Value, // full match is always the 1st group
Property = match.Groups[1].Value,
Type = match.Groups[2].Value,
Value = match.Groups[3].Value
})
.ToList();

Modifying string value

I have a string which is
string a = @"\server\MainDirectory\SubDirectoryA\SubDirectoryB\SubdirectoryC\MyFile.pdf";
SubDirectoryB will always start with the prefix RN followed by 6 unique numbers. Now I'm trying to replace the SubDirectoryB part of the string with a new value, let's say RN012345.
So the new string should look like
string b = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
To achieve this I'm making use of the following helper method
public static string ReplaceAt(this string path, int index, int length, string replace)
{
return path.Remove(index, Math.Min(length, path.Length - index)).Insert(index, replace);
}
Which works great for now.
However, the original path will be changing in the near future, so it will be something like #\MainDirectory\RN012345\AnotherDirectory\MyFile.pdf. So I was wondering if there is a regex or another feature I can use to change just that value in the path, rather than providing an index that will change in the future.
Assuming you need to only replace those \RNxxxxxx\ where each x is a unique digit, you need to capture the 6 digits and analyze the substring inside a match evaluator.
var a = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
var res = Regex.Replace(a, @"\\RN([0-9]{6})\\", m =>
m.Groups[1].Value.Distinct().Count() == m.Groups[1].Value.Length ?
"\\RN0123456\\" : m.Value);
// res => \server\MainDirectory\SubDirectoryA\RN0123456\SubdirectoryC\MyFile.pdf
See the C# demo
The regex is
\\RN([0-9]{6})\\
It matches a \ with \\, then matches RN, then matches and captures six digits into Group 1 (with ([0-9]{6})), and then matches another \. In the replacement part, m.Groups[1].Value.Distinct().Count() == m.Groups[1].Value.Length checks whether the number of distinct digits equals the length of the captured substring. If it does, the digits are unique and the replacement occurs; otherwise the whole match is put back into the result unchanged.
Use String.Replace
string oldSubdirectoryB = "RN012345";
string newSubdirectoryB = "RN147258";
string fileNameWithPath = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
fileNameWithPath = fileNameWithPath.Replace(oldSubdirectoryB, newSubdirectoryB);
You can use Regex.Replace to replace the SubDirectoryB with your required value
string a = @"\server\MainDirectory\SubDirectoryA\RN123456\SubdirectoryC\MyFile.pdf";
a = Regex.Replace(a, "RN[0-9]{6,6}","Mairaj");
Here I have replaced the substring consisting of RN followed by 6 digits with Mairaj.

Regex to remove text between two chars in c#

I have the following string that I will need to remove everything between =select and the following } char
ex.
Enter Type:=select top 10 type from cable}
The end result is the string variable to just show Enter Type:
I was looking for a way to do this with Regex, but I'm open to other methods as well. Thanks in advance for the help.
string input = "Enter Type:=select top 10 type from cable}";
// The colon is inside the capture group so that it is kept in the output.
System.Text.RegularExpressions.Regex regExPattern = new System.Text.RegularExpressions.Regex("(.*:)=select.*}");
System.Text.RegularExpressions.Match match = regExPattern.Match(input);
string output = String.Empty;
if( match.Success)
{
    output = match.Groups[1].Value;
}
Console.WriteLine("Output = " + output);
The value of the 'output' variable will be the text found before the "=select" segment of the input string, here "Enter Type:". If you need to pull out additional information from the input string, surround it with parentheses and the captured text will be added to the match.Groups collection. By the way, match.Groups[0].Value is the entire match, which in this case is the whole input string.
var rx = new Regex("=select[^}]*}");
Console.WriteLine(rx.Replace ("Enter Type:=select top 10 type from cable}", ""));
The Replace(string input, string replacement) method replaces every substring that matches the regex with the replacement string. The first line defines a regex that matches everything from =select up to and including the next }.

Get partial string from string

I have the following string:
This isMyTest testing
I want to get isMyTest as a result. I only have two first characters available("is"). The rest of the word can vary.
Basically, I need to select the first word, delimited by spaces, which starts with chk.
I've started with the following:
if (text.Contains(" is"))
{
text.LastIndexOf(" is"); // Should give me the index.
}
now I cannot find the right bound of the word since I need to match on something like
You can use regular expressions:
string pattern = @"\bis\w*"; // "is" at a word boundary, followed by the rest of the word
string input = "This isMyTest testing";
return Regex.Matches(input, pattern);
You can use IndexOf to get the index of the next space:
int startPosition = text.LastIndexOf(" is");
if (startPosition != -1)
{
    startPosition++; // Skip the leading space
    int endPosition = text.IndexOf(' ', startPosition); // Find the next space
    if (endPosition == -1)
        endPosition = text.Length; // The word runs to the end of the string
    string word = text.Substring(startPosition, endPosition - startPosition);
}
What about using a regex match? Generally, if you are searching for a pattern in a string (i.e. a space followed by some other characters), regexes are well suited to this. Regexes really only fall apart in context-sensitive areas (such as HTML) but are perfect for a plain string search.
// First we define the input string.
string input = "This isMyTest testing";
// Here we call Regex.Match. The parentheses capture the word without the leading space.
Match match = Regex.Match(input, @"[ ](is[A-Za-z0-9]*)", RegexOptions.IgnoreCase);
// Here we check the Match instance.
if (match.Success)
{
    // Finally, we get the Group value and display it.
    string key = match.Groups[1].Value;
    Console.WriteLine(key);
}

Unquote string in C#

I have a data file in INI file like format that needs to be read by both some C code and some C# code. The C code expects string values to be surrounded in quotes. The C# equivalent code is using some underlying class or something I have no control over, but basically it includes the quotes as part of the output string. I.e. data file contents of
MY_VAL="Hello World!"
gives me
"Hello World!"
in my C# string, when I really need it to contain
Hello World!
How do I conditionally (on the first and last characters being a ") remove the quotes and get the string contents that I want?
On your string use Trim with the " as char:
.Trim('"')
I usually call String.Trim() for that purpose:
string source = "\"Hello World!\"";
string unquoted = source.Trim('"');
My implementation checks that the quotes are present on both sides:
public string UnquoteString(string str)
{
if (String.IsNullOrEmpty(str))
return str;
int length = str.Length;
if (length > 1 && str[0] == '\"' && str[length - 1] == '\"')
str = str.Substring(1, length - 2);
return str;
}
Just take the returned string and do a Trim('"');
Being obsessive here (that's me; no comment about you), you may want to consider
.Trim(' ').Trim('"').Trim(' ')
so that any bounding spaces outside the quoted string are trimmed, then the quotation marks are stripped and, finally, any bounding spaces inside the quotes are removed.
If you want to retain bounding white space inside the quotes, omit the final .Trim(' ').
Embedded spaces and quotation marks are preserved. Chances are they are wanted and should not be deleted.
Check what a no-argument Trim() does to characters such as form feeds and tabs, both bounding and embedded. It could be that one or both of the Trim(' ') calls should be just Trim().
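To make the difference concrete, here is a small sketch comparing the two trimming styles. The class TrimDemo and both method names are hypothetical; the point is only that Trim(' ') removes space characters alone, while the no-argument Trim() removes all leading and trailing white space, including tabs and line breaks.

```csharp
using System;

public static class TrimDemo
{
    // The chain from the answer above: only the space character is trimmed.
    public static string UnquoteSpacesOnly(string value)
    {
        return value.Trim(' ').Trim('"').Trim(' ');
    }

    // Variant using the no-argument Trim(), which strips any bounding
    // white space (spaces, tabs, carriage returns, line feeds, ...).
    public static string UnquoteAnyWhitespace(string value)
    {
        return value.Trim().Trim('"').Trim();
    }
}
```

For an input such as "\t \"Hello World!\" \r\n", UnquoteSpacesOnly returns the string unchanged, because the outermost characters are a tab and a line feed rather than spaces, whereas UnquoteAnyWhitespace returns Hello World!.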
If you know there will always be " at the end and beginning, this would be the fastest way.
s = s.Substring(1, s.Length - 2);
Use string replace function or trim function.
If you just want to remove first and last quotes use substring function.
string myworld = "\"Hello World!\"";
string start = myworld.Substring(1, (myworld.Length - 2));
I would suggest using the replace() method.
string str = "\"HelloWorld\"";
string result = str.Replace("\"", string.Empty);
What you are trying to do is often called "stripping" or "unquoting". Usually, a quoted value is not only surrounded by quotation characters (like " in this case); it may also contain escape sequences that allow the quotation character itself to appear inside the quoted text.
In short, you should consider using something like:
string s = @"""Hey """"Mikey""""!""";
s = s.Trim('"').Replace(@"""""", @"""");
Or, when using the apostrophe as the quotation mark:
string s = @"'Hey ''Mikey''!'";
s = s.Trim('\'').Replace("''", @"'");
Also, values that don't need quoting at all (i.e. that contain no whitespace) are sometimes left unquoted. That's why checking for the quotation characters before trimming is reasonable.
Consider creating a helper function that will do this job in a preferable way as in the example below.
public static string StripQuotes(string text, char quote, string unescape)
{
    string with = quote.ToString();
    if (quote != '\0')
    {
        // check that the text both starts and ends with the quote character
        if (text.Length >= 2 && text.StartsWith(with) && text.EndsWith(with))
        {
            // remove exactly one quote from each end; Trim(quote) would also
            // eat an escaped quote sitting right next to the delimiter
            text = text.Substring(1, text.Length - 2);
        }
    }
    if (!string.IsNullOrEmpty(unescape))
    {
        text = text.Replace(unescape, with);
    }
    return text;
}
using System;
public class Program
{
public static void Main()
{
string text = @"""Hello World!""";
Console.WriteLine(text);
// That will do the job
// Output: Hello World!
string strippedText = text.Trim('"');
Console.WriteLine(strippedText);
string escapedText = @"""My name is \""Bond\"".""";
Console.WriteLine(escapedText);
// That will *NOT* do the job properly
// Output: My name is \"Bond\".
string strippedEscapedText = escapedText.Trim('"');
Console.WriteLine(strippedEscapedText);
// Allow \" to be used inside quoted text
// Output: My name is "Bond".
string strippedEscapedText2 = escapedText.Trim('"').Replace(@"\""", @"""");
Console.WriteLine(strippedEscapedText2);
// Create a function that will check texts for having or not
// having citation marks and unescapes text if needed.
string t1 = @"""My name is \""Bond\"".""";
// Output: "My name is \"Bond\"."
Console.WriteLine(t1);
// Output: My name is "Bond".
Console.WriteLine(StripQuotes(t1, '"', @"\"""));
string t2 = @"""My name is """"Bond"""".""";
// Output: "My name is ""Bond""."
Console.WriteLine(t2);
// Output: My name is "Bond".
Console.WriteLine(StripQuotes(t2, '"', @""""""));
}
}
https://dotnetfiddle.net/TMLWHO
Here's my solution as extension method:
public static class StringExtensions
{
public static string UnquoteString(this string inputString) => inputString.TrimStart('"').TrimEnd('"');
}
It just trims at the start and the end...
