Regex to find first capital letter occurrence in a string - c#

I want to find the index of first capital letter occurrence in a string.
E.g. -
String x = "soHaM";
Index should return 2 for this string. The regex should ignore all other capital letters after the first one is found. If there are no capital letters found then it should return 0. Please help.

I'm pretty sure all you need is the regex \p{Lu}:
public static class Find
{
// Apparently the regex below works for non-ASCII uppercase
// characters (so, better than A-Z).
static readonly Regex CapitalLetter = new Regex(@"\p{Lu}");
public static int FirstCapitalLetter(string input)
{
Match match = CapitalLetter.Match(input);
// I would go with -1 here, personally.
return match.Success ? match.Index : 0;
}
}
Did you try this?

Just for fun, a LINQ solution:
string x = "soHaM";
var index = from ch in x.ToArray()
where Char.IsUpper(ch)
select x.IndexOf(ch);
This returns IEnumerable<Int32>. If you want the index of the first upper case character, simply call index.First() or retrieve only the first instance in the LINQ:
string x = "soHaM";
var index = (from ch in x.ToArray()
where Char.IsUpper(ch)
select x.IndexOf(ch)).First();
EDIT
As suggested in the comments, here is another LINQ method (possibly more performant than my initial suggestion):
string x = "soHaM";
int firstUpper = x.Select((c, index) => new { Char = c, Index = index }).First(c => Char.IsUpper(c.Char)).Index;
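Note that First() throws an InvalidOperationException when the string contains no upper-case character. A minimal sketch of a variant that returns -1 instead; the class and method names are made up for illustration and are not from the answers here:

```csharp
using System;
using System.Linq;

public static class FirstUpperLinq
{
    // Hypothetical helper: index of the first upper-case character, or -1 if there is none.
    public static int IndexOfFirstUpper(string s)
    {
        return s.Select((c, i) => new { Char = c, Index = i })
                .Where(p => char.IsUpper(p.Char))
                .Select(p => p.Index)
                .DefaultIfEmpty(-1)   // supplies -1 when nothing matched
                .First();
    }
}
```

Because LINQ evaluates lazily, First() stops enumerating at the first upper-case character, so the rest of the string is not scanned.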

No need for Regex:
int firstUpper = -1;
for(int i = 0; i < x.Length; i++)
{
if(Char.IsUpper(x[i]))
{
firstUpper = i;
break;
}
}
http://msdn.microsoft.com/en-us/library/system.char.isupper.aspx
For the sake of completeness, here's my LINQ approach (although it's not the right tool here, even if the OP could use it):
int firstUpperCharIndex = -1;
var upperChars = x.Select((c, index) => new { Char = c, Index = index })
.Where(c => Char.IsUpper(c.Char));
if(upperChars.Any())
firstUpperCharIndex = upperChars.First().Index;

First, your logic fails: if the method returns 0, that would mean the first char in the string is upper case. So I would recommend that -1 means not found, or that you throw an exception.
Also, using regular expressions just because you can is not always the best choice; they are often slower and harder to read than a simple loop, which makes your code harder to work with.
Anyway, here is my contribution:
public static int FindFirstUpper(string text)
{
for (int i = 0; i < text.Length; i++)
if (Char.IsUpper(text[i]))
return i;
return -1;
}

Using Linq:
using System.Linq;
string word = "soHaMH";
var capChars = word.Where(c => char.IsUpper(c)).Select(c => c);
char capChar = capChars.FirstOrDefault();
int index = word.IndexOf(capChar);
Using Regex:
using System.Text.RegularExpressions;
string word = "soHaMH";
Match match = Regex.Match(word, "[A-Z]");
int index = match.Success ? match.Index : -1;

Using a loop:
int i = 0;
for(i = 0; i < mystring.Length; i++)
{
if(Char.IsUpper(mystring, i))
break;
}
After the loop, i is the index you are looking for. If no upper-case character was found, i equals mystring.Length.

Related

count the number of points at the end of a string

I need to count the number of points at the END of a string.
Points in the middle of the string are not relevant and should not be counted.
How can this be done?
string sample = "This.is.a.sample.string.....";
For the example above, the correct answer would be 5, because there are 5 points at the end of the string.
For performance reasons I would prefer a fast solution. I don't know whether a regular expression such as \.*$ should be used in such a case.
Start from the end of the string and go back char by char until its not a dot:
string sample = "This.is.a.sample.string.....";
int count = 0;
for (int i = sample.Length - 1; i >= 0; i--)
{
if (sample[i] != '.') break;
count++;
}
Using Linq:
var test = "this.is.a.test........";
var count = test.ToCharArray().Reverse().TakeWhile(q => q == '.').Count();
Convert string to array, reverse, then take while character = '.'. Count result.
A simple solution using an extension method.
var test = "this.is.a.test........";
Console.WriteLine(test.CountTrailingDots());
public static int CountTrailingDots(this string value)
{
return value.Length - value.TrimEnd('.').Length;
}
Using Regex:
int points = Regex.Match("This.is.a.sample.string....", @"^[\w\W]*?([.]*)$").Groups[1].Value.Length;
Description:
*? = matches as few characters as possible.
* = matches as many characters as possible.
It can be something like..
string sample = "This.is.a.sample.string.....";
int count = 0;
if(sample.EndsWith("."))
count = sample.Substring(sample.TrimEnd('.').Length).Length;

Upper Case Everything Before the nth character in .NET

I need to capitalize everything before the second - from the beginning of the string in .NET. What is the best way to do this? The string before the second dash can be anything. I need a new single string once this is complete.
Before
Tt-Fga - Louisville - Kentucky
After
TT-FGA - Louisville - Kentucky
This should get the job done for your specific case:
public static string ToUpperUntilSecondHyphen(string text)
{
int index = text.IndexOf('-', text.IndexOf('-') + 1);
return text.Substring(0, index).ToUpper() + text.Substring(index);
}
A more generalized method could look something like this:
public static string ToUpperUntilNthOccurrenceOfChar(string text, char c, int occurrences)
{
if (occurrences > text.Count(x => x == c))
{
return text.ToUpper();
}
int index = 0;
for (int i = 0; i < occurrences; i++, index++)
{
index = text.IndexOf(c, index);
}
return text.Substring(0, index).ToUpper() + text.Substring(index);
}
Identify the location of the hyphen with IndexOf. You'll have to use this function twice so that you can find the first hyphen, and then the second one.
Construct the substring that only contains the characters up to that with Substring. Construct the substring that contains all the remaining characters as well.
Upper case the first string with ToUpper.
Concatenate with the + operator.
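The four steps above can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and returning the text unchanged when it has fewer than two hyphens is an assumption, since the question does not say what should happen in that case.

```csharp
using System;

public static class HyphenUpper
{
    // Hypothetical helper following the steps above.
    // Assumption: text with fewer than two hyphens is returned unchanged.
    public static string UpperBeforeSecondHyphen(string text)
    {
        int first = text.IndexOf('-');                 // locate the first hyphen
        if (first < 0) return text;
        int second = text.IndexOf('-', first + 1);     // locate the second hyphen
        if (second < 0) return text;

        string head = text.Substring(0, second);       // everything before the second hyphen
        string tail = text.Substring(second);          // the second hyphen and the rest
        return head.ToUpper() + tail;                  // upper-case the head, then concatenate
    }
}
```

The guards matter because Substring throws an ArgumentOutOfRangeException when it is handed the -1 that IndexOf returns for a missing hyphen.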
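^(.*?-.*?)(?=-)
You can use a replace here: match this pattern and replace the match with its upper-cased value. C# has no $1.upper() replacement token, so pass a MatchEvaluator to Regex.Replace. Both quantifiers are lazy so that the match stops before the second hyphen rather than the last one.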
See
http://regex101.com/r/yR3mM3/50
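In C#, the upper-casing has to happen in code, because replacement strings cannot change case. A minimal sketch using Regex.Replace with a MatchEvaluator lambda; it assumes an anchored pattern with lazy quantifiers, so that the match ends before the second hyphen, and the class and method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexUpper
{
    // Assumption: anchored at the start, lazy on both sides of the first hyphen,
    // with a lookahead that requires (but does not consume) the second hyphen.
    private static readonly Regex BeforeSecondHyphen = new Regex(@"^.*?-.*?(?=-)");

    public static string UpperBeforeSecondHyphen(string text)
    {
        // The lambda is called for the match and returns its replacement text.
        return BeforeSecondHyphen.Replace(text, m => m.Value.ToUpper());
    }
}
```

If the text has fewer than two hyphens the pattern does not match, and Replace returns the input unchanged.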
I went ahead and did this. If there is a better answer let me know.
var parts = @event.EventParent.Name.Split(new[] {'-'}, StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < parts.Length; i++)
{
    if (i >= 2)
        break;
    parts[i] = parts[i].ToUpper();
}
@event.EventParent.Name = string.Join("-", parts);

Find the first character in a string that is a letter

I am trying to figure out how to look through a string, find the first character that is a letter and then delete from that index point and on.
For example,
string test = "5604495Alpha";
I need to go through this string, find "A" and delete from that point on.
There are several ways to do this. Two examples:
string s = "12345Alpha";
s = new string(s.TakeWhile(Char.IsDigit).ToArray());
Or, more correctly, as Baldrick pointed out in his comment, find the first letter:
s = new string(s.TakeWhile(c => !Char.IsLetter(c)).ToArray());
Or, you can write a loop:
int pos = 0;
while (pos < s.Length && !Char.IsLetter(s[pos]))
{
++pos;
}
s = s.Substring(0, pos);
A little method to do it:
int getIndexOfFirstLetter(string input) {
var index = 0;
foreach (var c in input)
if (char.IsLetter(c))
return index;
else
index++;
return input.Length;
}
Usage:
var test = "5604495Alpha";
var result = test.Substring(0, getIndexOfFirstLetter(test));
// Returns 5604495
You should break this out to make sure there is a match, and to make sure there is a value at index 0... but it does work for this example case for demonstration purposes.
string test = "5604495Alpha";
var test2 = test.Remove(test.IndexOf(
System.Text.RegularExpressions.Regex.Match(test, "[A-Za-z]").Value));
// test2 = "5604495"

Char/String comparison

I'm trying to add a suggestion feature to the search function in my program, e.g. I type janw doe in the search section and it outputs NO MATCH - did you mean jane doe? I'm not sure what the problem is; maybe it has something to do with char/string comparison. I've tried comparing both variables as type char, e.g. char temp --> temp.Contains ... etc., but an error appears (char does not contain a definition for Contains). Would love any help on this! 8)
if (found == false)
{
Console.WriteLine("\n\nMATCH NOT FOUND");
int charMatch = 0, charCount = 0;
string[] checkArray = new string[26];
//construction site /////////////////////////////////////////////////////////////////////////////////////////////////////////////
for (int controlLoop = 0; controlLoop < contPeople.Length; controlLoop++)
{
foreach (char i in userContChange)
{
charCount = charCount + 1;
}
for (int i = 0; i < userContChange.Length; )
{
string temp = contPeople[controlLoop].name;
string check=Convert.ToString(userContChange[i]);
if (temp.Contains(check))
{
charMatch = charMatch + 1;
}
}
int half = charCount / 2;
if (charMatch >= half)
{
checkArray[controlLoop] = contPeople[controlLoop].name;
}
}
///////////////////////////////////////////////////////////////////////////////////////////////////////////
Console.WriteLine("Did you mean: ");
for (int a = 0; a < checkArray.Length; a++)
{
Console.WriteLine(checkArray[a]);
}
///////////////////////////////////////////////////////////////////////////////////////////////////
A string is made up of many characters. A character is a primitive value, so it doesn't "contain" any other items. A string is basically an array of characters.
For comparing string and characters:
char a = 'A';
String alan = "Alan";
Debug.Assert(alan[0] == a);
Or, if you have a single-character string, I suppose:
char a = 'A';
String alan = "A";
Debug.Assert(alan == a.ToString());
All of these asserts are true.
But, the main reason I wanted to comment on your question, is to suggest an alternative approach for suggesting "Did you mean?". There's an algorithm called Levenshtein Distance which calculates the "number of single character edits" required to convert one string to another. It can be used as a measure of how close two strings are. You may want to look into how this algorithm works because it could help you.
Here's an applet that I found which demonstrates it: Approximate String Matching with k-differences.
See also the Wikipedia article: Levenshtein distance.
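To make the idea concrete, here is a minimal sketch of the textbook dynamic-programming form of Levenshtein distance, plus a small "did you mean" helper. The class name, the Suggest method, and its maxDistance threshold are assumptions made for illustration; they are not part of the question's code.

```csharp
using System;

public static class EditDistance
{
    // d[i, j] = number of single-character edits needed to turn
    // the first i characters of a into the first j characters of b.
    public static int Levenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;   // delete i characters
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;   // insert j characters

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;   // substitution cost
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    // Hypothetical helper: the candidate closest to input,
    // or null if none is within maxDistance edits.
    public static string Suggest(string input, string[] candidates, int maxDistance)
    {
        string best = null;
        int bestDistance = maxDistance + 1;
        foreach (string candidate in candidates)
        {
            int distance = Levenshtein(input, candidate);
            if (distance < bestDistance)
            {
                best = candidate;
                bestDistance = distance;
            }
        }
        return best;
    }
}
```

With this sketch, "janw doe" and "jane doe" are one substitution apart, so a small threshold such as 2 would be enough to suggest "jane doe" while rejecting unrelated names.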
The char type cannot have .Contains() because it is a value type holding a single character.
In your case (if I understand correctly), maybe you need to use .Equals() or the == operator.
Note: in C#, both .Equals() and the == operator compare string contents, because String overloads ==. Only when an operand is typed as object does == fall back to reference comparison.
Hope this helps!
The char type doesn't have a Contains() method, but you can use it like this: 'a'.ToString().Contains(...)
If performance is not a concern, here is another simple way:
var input = "janw doe";
var people = new string[] { "abc", "123", "jane", "jane doe" };
var found = Array.IndexOf(people, input); // -1 if absent; or use FirstOrDefault(), FindIndex, a search engine...
if (found < 0)//not found
{
var i = input.ToArray();
var target = "";
//most similar
//target = people.OrderByDescending(p => p.ToArray().Intersect(i).Count()).FirstOrDefault();
//as in your code:
foreach (var p in people)
{
var count = p.ToArray().Intersect(i).Count();
if (count > input.Length / 2)
{
target = p;
break;
}
}
if (!string.IsNullOrWhiteSpace(target))
{
Console.WriteLine(target);
}
}

Find the exact occurrence of a string in an HTML file

I would like to count the exact matches of a string.
Suppose the string is 'My Computer'. I want to find its occurrences in the string
This is My computer,this is a good Computer,this is my Computer,this is my Computers
So in the end I should get a count of 2.
I have tried the following formula, with 'mykeyWord' as the string to be found.
int strength = (innerDocument.DocumentNode.InnerText.Length - innerDocument.DocumentNode.InnerText.ToLower().Replace(mykeyWord.ToLower(), "").Length) / mykeyWord.Length;
But it also counts strings like 'my Computers', which is wrong.
This is a perfect place to use regular expressions, just as you tagged your post:
Regex re = new Regex("\\b" + Regex.Escape(mykeyWord) + "\\b", RegexOptions.IgnoreCase);
int count = re.Matches(innerDocument.DocumentNode.InnerText).Count;
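You could use the regular expression [^A-Za-z](my computer)[^A-Za-z]. This matches 'my computer' unless a letter comes directly before or after it. Note that it requires a character on both sides, so it misses a match at the very start or end of the text. To make the regex search case-insensitive, use RegexOptions.IgnoreCase.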
Edit
minitech's answer using word boundaries is better.
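// Counts non-overlapping, case-sensitive occurrences of keyword in input.
// Note: this counts substrings, so "Computers" still counts for "Computer".
int FindCount(string keyword, string input)
{
    if (string.IsNullOrEmpty(keyword))
        return 0;
    int count = 0;
    int i = 0;
    // IndexOf resumes after each hit, so a failed partial match cannot hide a later one.
    while ((i = input.IndexOf(keyword, i, StringComparison.Ordinal)) >= 0)
    {
        count++;
        i += keyword.Length;
    }
    return count;
}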
