Capturing a number within curly brackets inside a string c# - c#

I have posted this again as my previous post was admittedly rather ambiguous..sorry!!
I have a string and I want to capture the number inside it and then add one to it!.
For example I have an email subject header saying "Re: Hello (1)"
I want to capture that 1 and then raise it by 2, then 3,then 4,etc. The difficulty I am having is taking into consideration the growing numbers, once it becomes say 10 or 100, that extra digit kills my current Regex expression.
Any help would be praised as always!
int replyno;
string Subject = "Re: Hey :) (1)";
if (Subject.Contains("Re:"))
{
try
{
replyno = int.Parse(Regex.Match(Subject, #"\(\d+\)").Value);
replyno++;
Subject = Subject.Remove(Subject.Length - 3);
TextBoxSubject.Text = Subject + "("+replyno+")";
}
catch
{
TextBoxSubject.Text = Subject + " (1)";
}
}
else
{
TextBoxSubject.Text = "Re: " + Subject;
}
Current output from this code fails from the Int.TryParse

Try substituting this code:
var m = Regex.Match(Subject, #"\((\d+)\)");
replyno = int.Parse(m.Groups[1].Value);
The changes are:
capture just the digits in the regex
parse just the captured digits
I'd also recommend that you check m.Success instead of just catching the resulting exception.

The problem is with the way you remove and replace the reply no.
Change your code this way
int replyno;
string Subject = "Re: Hey :) (1)";
if (Subject.Contains("Re:"))
{
try
{
replyno = int.Parse(Regex.Match(Subject, #"(\d+)").Value);
replyno++;
Subject = Regex.Replace(Subject,#"(\d+)", replyno.ToString());
TextBoxSubject.Text = Subject ;
}
catch
{
TextBoxSubject.Text = Subject + " (1)";
}
}
else
{
TextBoxSubject.Text = "Re: " + Subject;
}

I don't normally deal with regex so here's how i'd do it.
string subject = "Hello (1)";
string newSubject = string.Empty;
for (int j = 0; j < subject.Length; j++)
if (char.IsNumber(subject[j]))
newSubject += subject[j];
int number = 0;
int.TryParse(newSubject, out number);
subject = subject.Replace(number.ToString(), (++number).ToString());

You don't necessarily need regex for this, but you can adjust yours to \((?<number>\d+)\)$ to fix the problem.
For a regex solution, you can access the match using a group:
for (int i = 0; i < 10; i++)
{
int currentLevel = 0;
var regex = new System.Text.RegularExpressions.Regex(#"\((?<number>\d+)\)$");
var m = regex.Match(inputText);
string strLeft = inputText + " (", strRight = ")";
if (m.Success)
{
var levelText = m.Groups["number"];
if (int.TryParse(levelText.Value, out currentLevel))
{
var numCap = levelText.Captures[0];
strLeft = inputText.Substring(0, numCap.Index);
strRight = inputText.Substring(numCap.Index + numCap.Length);
}
}
inputText = strLeft + (++currentLevel).ToString() + strRight;
output.AppendLine(inputText);
}
Instead, consider just using IndexOf and Substring:
// Example
var inputText = "Subject Line";
for (int i = 0; i < 10; i++)
{
int currentLevel = 0;
int trimStart = inputText.Length;
// find the current level from the string
{
int parenStart = 0;
if (inputText.EndsWith(")")
&& (parenStart = inputText.LastIndexOf('(')) > 0)
{
int numStrLen = inputText.Length - parenStart - 2;
if (numStrLen > 0)
{
var numberText = inputText.Substring(parenStart + 1, numStrLen);
if (int.TryParse(numberText, out currentLevel))
{
// we found a number, remove it
trimStart = parenStart;
}
}
}
}
// add new number
{
// remove existing
inputText = inputText.Substring(0, trimStart);
// increment and add new
inputText = string.Format("{0} ({1})", inputText, ++currentLevel);
}
Console.WriteLine(inputText);
}
Produces
Subject Line
Subject Line (1)
Subject Line (2)
Subject Line (3)
Subject Line (4)
Subject Line (5)
Subject Line (6)
Subject Line (7)
Subject Line (8)
Subject Line (9)
Subject Line (10)

Related

Remove text between quotes

I have a program, in which you can input a string. But I want text between quotes " " to be removed.
Example:
in: Today is a very "nice" and hot day.
out: Today is a very "" and hot day.
Console.WriteLine("Enter text: ");
text = Console.ReadLine();
int letter;
string s = null;
string s2 = null;
for (s = 0; s < text.Length; letter++)
{
if (text[letter] != '"')
{
s = s + text[letter];
}
else if (text[letter] == '"')
{
s2 = s2 + letter;
letter++;
(text[letter] != '"')
{
s2 = s2 + letter;
letter++;
}
}
}
I don't know how to write the string without text between quotes to the console.
I am not allowed to use a complex method like regex.
This should do the trick. It checks every character in the string for quotes.
If it finds quotes then sets a quotesOpened flag as true, so it will ignore any subsequent character.
When it encounters another quotes, it sets the flag to false, so it will resume copying the characters.
Console.WriteLine("Enter text: ");
text = Console.ReadLine();
int letterIndex;
string s2 = "";
bool quotesOpened = false;
for (letterIndex= 0; letterIndex< text.Length; letterIndex++)
{
if (text[letterIndex] == '"')
{
quotesOpened = !quotesOpened;
s2 = s2 + text[letterIndex];
}
else
{
if (!quotesOpened)
s2 = s2 + text[letterIndex];
}
}
Hope this helps!
A take without regular expressions, which I like better, but okay:
string input = "abc\"def\"ghi";
string output = input;
int firstQuoteIndex = input.IndexOf("\"");
if (firstQuoteIndex >= 0)
{
int secondQuoteIndex = input.IndexOf("\"", firstQuoteIndex + 1);
if (secondQuoteIndex >= 0)
{
output = input.Substring(0, firstQuoteIndex + 1) + input.Substring(secondQuoteIndex);
}
}
Console.WriteLine(output);
What it does:
It searches for the first occurrence of "
Then it searches for the second occurrence of "
Then it takes the first part, including the first " and the second part, including the second "
You could improve this yourself by searching until the end of the string and replace all occurrences. You have to remember the new 'first index' you have to search on.
string text = #" Today is a very ""nice"" and hot day. Second sentense with ""text"" test";
Regex r = new Regex("\"([^\"]*)\"");
var a = r.Replace(text,string.Empty);
Please try this.
First we need to split string and then remove odd items:
private static String Remove(String s)
{
var rs = s.Split(new[] { '"' }).ToList();
return String.Join("\"\"", rs.Where(_ => rs.IndexOf(_) % 2 == 0));
}
static void Main(string[] args)
{
var test = Remove("hello\"world\"\"yeah\" test \"fhfh\"");
return;
}
This would be a possible solution:
String cmd = "This is a \"Test\".";
// This is a "".
String newCmd = cmd.Split('\"')[0] + "\"\"" + cmd.Split('\"')[2];
Console.WriteLine(newCmd);
Console.Read();
You simply split the text at " and then add both parts together and add the old ". Not a very nice solution, but it works anyway.
€dit:
cmd[0] = "This is a "
cmd[1] = "Test"
cmd[2] = "."
You can do it like this:
Console.WriteLine("Enter text: ");
var text = Console.ReadLine();
var skipping = false;
var result = string.Empty;
foreach (var c in text)
{
if (!skipping || c == '"') result += c;
if (c == '"') skipping = !skipping;
}
Console.WriteLine(result);
Console.ReadLine();
The result string is created by adding characters from the original string as long we are not between quotes (using the skipping variable).
Take all indexes of quotes remove the text between quotes using substring.
static void Main(string[] args)
{
string text = #" Today is a very ""nice"" and hot day. Second sentense with ""text"" test";
var foundIndexes = new List<int>();
foundIndexes.Add(0);
for (int i = 0; i < text.Length; i++)
{
if (text[i] == '"')
foundIndexes.Add(i);
}
string result = "";
for(int i =0; i<foundIndexes.Count; i+=2)
{
int length = 0;
if(i == foundIndexes.Count - 1)
{
length = text.Length - foundIndexes[i];
}
else
{
length = foundIndexes[i + 1] - foundIndexes[i]+1;
}
result += text.Substring(foundIndexes[i], length);
}
Console.WriteLine(result);
Console.ReadKey();
}
Output: Today is a very "" and hot day. Second sentense with "" test";
Here dotNetFiddle

C# Regex for masking E-Mails

Is there a simple way for masking E-Mail addresses using Regular Expressions in C#?
My E-Mail:
myawesomeuser#there.com
My goal:
**awesome****#there.com (when 'awesome' was part of the pattern)
So it's more like an inverted replacement where evertyhing that does not actually match will be replaced with *.
Note: The domain should never be replaced!
From a performance side of view, would it make more sense to split by the # and only check the first part then put it back together afterwards?
Note: I don't want to check if the E-Mail is valid or not. It's just a simple inverted replacement and only for my current needs, the string is an E-Mail but for sure it can be any other string as well.
Solution
After reading the comments I ended up with an extension-method for strings which perfectly matches my needs.
public static string MaskEmail(this string eMail, string pattern)
{
var ix1 = eMail.IndexOf(pattern, StringComparison.Ordinal);
var ix2 = eMail.IndexOf('#');
// Corner case no-#
if (ix2 == -1)
{
ix2 = eMail.Length;
}
string result;
if (ix1 != -1 && ix1 < ix2)
{
result = new string('*', ix1) + pattern + new string('*', ix2 - ix1 - pattern.Length) + eMail.Substring(ix2);
}
else
{
// corner case no str found, all the pre-# is replaced
result = new string('*', ix2) + eMail.Substring(ix2);
}
return result;
}
which then can be called
string eMail = myawesomeuser#there.com;
string maskedMail = eMail.MaskEmail("awesome"); // **awesome****#there.com
string email = "myawesomeuser#there.com";
string str = "awesome";
string rx = "^((?!" + Regex.Escape(str) + "|#).)*|(?<!#.*)(?<=" + Regex.Escape(str) + ")((?!#).)*";
string email2 = Regex.Replace(email, rx, x => {
return new string('*', x.Length);
});
There are two sub-regular expressions here:
^((?!" + Regex.Escape(str) + "|#).)*
and
(?<!#.*)(?<=" + Regex.Escape(str) + ")((?!#).)*
They are in | (or)
The first one means: from the start of the string, any character but stop when you find str (escaped) or #
The second one means: there mustn't be a # before the start of this matching and, starting from str (escaped), replace any character stopping at the #
Probably faster/easier to read:
string email = "myawesomeuser#there.com";
string str = "awesome";
int ix1 = email.IndexOf(str);
int ix2 = email.IndexOf('#');
// Corner case no-#
if (ix2 == -1) {
ix2 = email.Length;
}
string email3;
if (ix1 != -1 && ix1 < ix2) {
email3 = new string('*', ix1) + str + new string('*', ix2 - ix1 - str.Length) + email.Substring(ix2);
} else {
// corner case no str found, all the pre-# is replaced
email3 = new string('*', ix2) + email.Substring(ix2);
}
This second version is better because it handle corner cases like: string not found and no domain in the email.
(awesome)|.(?=.*#)
Try this.Replace by *$1.But there will be an extra * at the start.So remove a * from the masked email from the start.See demo.
https://regex101.com/r/wU7sQ0/29
Non RE;
string name = "awesome";
int pat = email.IndexOf('#');
int pname = email.IndexOf(name);
if (pname < pat)
email = new String('*', pat - name.Length).Insert(pname, name) + email.Substring(pat);

Html page parsing using Html Agility Pack

I'm, trying to parse the IMDb page with Regex (I know HAP is better), but my RegEx is wrong, so may be you can advice me how to use HAP correctly.
This is the part of page I'm trying to parse. I need to take 2 numbers from here:
5 out of 5 people (so these two five's i need, two numbers)
<small>5 out of 5 people found the following review useful:</small>
<br>
<a href="/user/ur1174211/">
<h2>Interesting, Particularly in Comparison With "La Sortie des usines Lumière"</h2>
<b>Author:</b>
Snow Leopard
<small>from Ohio</small>
<br>
<small>10 March 2005</small>
and this is my code on c#
Regex reg1 = new Regex("([0-9]+(out of)+[0-9])");
for (int i = 0; i < number; i++)
{
Console.WriteLine("the heading of the movie is {0}", header[i].InnerHtml);
Match m = reg1.Match(header[i].InnerHtml);
if (!m.Success)
{
return;
}
else
{
string str1 = m.Value.Split(' ')[0];
string str2 = m.Value.Split(' ')[3];
if (!Int32.TryParse(str1, out index1))
{
return;
}
if (!Int32.TryParse(str2, out index2))
{
return;
}
Console.WriteLine("index1 = {0}", index1);
Console.WriteLine("index2 = {0}", index2);
}
}
Big thanks to everybody who read this.
Try this. This way you will take numbers not only digits.
Regex reg1 = new Regex(#"(\d* (out of) \d*)");
for (int i = 0; i < number; i++)
{
Console.WriteLine("the heading of the movie is {0}", header[i].InnerHtml);
Match m = reg1.Match(header[i].InnerHtml);
if (!m.Success)
{
return;
}
else
{
Regex reg2 = new Regex(#"\d+");
m = reg2.Match(m.Value);
string str1 = m.Value;
string str2 = m.NextMatch().Value;
if (!Int32.TryParse(str1, out index1))
{
return;
}
if (!Int32.TryParse(str2, out index2))
{
return;
}
Console.WriteLine("index1 = {0}", index1);
Console.WriteLine("index2 = {0}", index2);
}
}
if you have the InnerHtml of the small tag then this can also be done to get numbers
var title = "5 out of 5 people found the following review useful:";
var titleNumbers = title.ToCharArray().Where(x => Char.IsNumber(x));
EDIT
as #PulseLab suggests, i have an alternate method
var sd = s.Split(' ').Where((data) =>
{
var datum = 0;
int.TryParse(data, out datum);
return datum > 0;
}).ToArray();

Split string after certain character count

I need some help. I'm writing an error log using text file with exception details. With that I want my stack trace details to be written like the below and not in straight line to avoid the user from scrolling the scroll bar of the note pad or let's say on the 100th character the strings will be written to the next line. I don't know how to achieve that. Thanks in advance.
SAMPLE(THIS IS MY CURRENT OUTPUT ALL IN STRAIGHT LINE)
STACKTRACE:
at stacktraceabcdefghijklmnopqrstuvwxyztacktraceabcdefghijklmnopqrswxyztacktraceabcdefghijk
**MY DESIRED OUTPUT (the string will write to the next line after certain character count)
STACKTRACE:
at stacktraceabcdefghijklmno
pqrstuvwxyztacktraceabcdefgh
ijklmnopqrswxyztacktraceabcd
efghijk
MY CODE
builder.Append(String.Format("STACKTRACE:"));
builder.AppendLine();
builder.Append(logDetails.StackTrace);
Following example splits 10 characters per line, you can change as you like {N} where N can be any number.
var input = "stacktraceabcdefghijklmnopqrstuvwxyztacktraceabcdefghijklmnopqrswxyztacktraceabcdefghijk";
var regex = new Regex(#".{10}");
string result = regex.Replace(input, "$&" + Environment.NewLine);
Console.WriteLine(result);
Here is the Demo
you can use the following code:
string yourstring;
StringBuilder sb = new StringBuilder();
for(int i=0;i<yourstring.length;++i){
if(i%100==0){
sb.AppendLine();
}
sb.Append(yourstring[i]);
}
you may create a function for this
string splitat(string line, int charcount)
{
string toren = "";
if (charcount>=line.Length)
{
return line;
}
int totalchars = line.Length;
int loopcnt = totalchars / charcount;
int appended = 0;
for (int i = 0; i < loopcnt; i++)
{
toren += line.Substring(appended, charcount) + Environment.NewLine;
appended += charcount;
int left = totalchars - appended;
if (left>0)
{
if (left>charcount)
{
continue;
}
else
{
toren += line.Substring(appended, left) + Environment.NewLine;
}
}
}
return toren;
}
Best , Easiest and Generic Answer :). Just set the value of splitAt to the that number of character count after that u want it to break.
string originalString = "1111222233334444";
List<string> test = new List<string>();
int splitAt = 4; // change 4 with the size of strings you want.
for (int i = 0; i < originalString.Length; i = i + splitAt)
{
if (originalString.Length - i >= splitAt)
test.Add(originalString.Substring(i, splitAt));
else
test.Add(originalString.Substring(i,((originalString.Length - i))));
}

finding an string expression in an sentence

I have like a three word expression: "Shut The Door" and I want to find it in a sentence. Since They are kind of seperated by space what would be the best solution for it.
If you have the string:
string sample = "If you know what's good for you, you'll shut the door!";
And you want to find where it is in a sentence, you can use the IndexOf method.
int index = sample.IndexOf("shut the door");
// index will be 42
A non -1 answer means the string has been located. -1 means it does not exist in the string. Please note that the search string ("shut the door") is case sensitive.
Use build in Regex.Match Method for matching strings.
string text = "One car red car blue car";
string pat = #"(\w+)\s+(car)";
// Compile the regular expression.
Regex r = new Regex(pat, RegexOptions.IgnoreCase);
// Match the regular expression pattern against a text string.
Match m = r.Match(text);
int matchCount = 0;
while (m.Success)
{
Console.WriteLine("Match"+ (++matchCount));
for (int i = 1; i <= 2; i++)
{
Group g = m.Groups[i];
Console.WriteLine("Group"+i+"='" + g + "'");
CaptureCollection cc = g.Captures;
for (int j = 0; j < cc.Count; j++)
{
Capture c = cc[j];
System.Console.WriteLine("Capture"+j+"='" + c + "', Position="+c.Index);
}
}
m = m.NextMatch();
}
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.match(v=vs.71).aspx
http://support.microsoft.com/kb/308252
if (string1.indexOf(string2) >= 0)
...
The spaces are nothing special, they are just characters, so you can find a string like this like yuo would find any other string in your sentence, for example using "indexOf" if you need the position, or just "Contains" if you need to know if it exists or not.
E.g.
string sentence = "foo bar baz";
string phrase = "bar baz";
Console.WriteLine(sentence.Contains(phrase)); // True
Here is some C# code to find a substrings using a start string and end string point but you can use as a base and modify (i.e. remove need for end string) to just find your string...
2 versions, one to just find the first instance of a substring, other returns a dictionary of all starting positions of the substring and the actual string.
public Dictionary<int, string> GetSubstringDic(string start, string end, string source, bool includeStartEnd, bool caseInsensitive)
{
int startIndex = -1;
int endIndex = -1;
int length = -1;
int sourceLength = source.Length;
Dictionary<int, string> result = new Dictionary<int, string>();
try
{
//if just want to find string, case insensitive
if (caseInsensitive)
{
source = source.ToLower();
start = start.ToLower();
end = end.ToLower();
}
//does start string exist
startIndex = source.IndexOf(start);
if (startIndex != -1)
{
//start to check for each instance of matches for the length of the source string
while (startIndex < sourceLength && startIndex > -1)
{
//does end string exist?
endIndex = source.IndexOf(end, startIndex + 1);
if (endIndex != -1)
{
//if we want to get length of string including the start and end strings
if (includeStartEnd)
{
//make sure to include the end string
length = (endIndex + end.Length) - startIndex;
}
else
{
//change start index to not include the start string
startIndex = startIndex + start.Length;
length = endIndex - startIndex;
}
//add to dictionary
result.Add(startIndex, source.Substring(startIndex, length));
//move start position up
startIndex = source.IndexOf(start, endIndex + 1);
}
else
{
//no end so break out of while;
break;
}
}
}
}
catch (Exception ex)
{
//Notify of Error
result = new Dictionary<int, string>();
StringBuilder g_Error = new StringBuilder();
g_Error.AppendLine("GetSubstringDic: " + ex.Message.ToString());
g_Error.AppendLine(ex.StackTrace.ToString());
}
return result;
}
public string GetSubstring(string start, string end, string source, bool includeStartEnd, bool caseInsensitive)
{
int startIndex = -1;
int endIndex = -1;
int length = -1;
int sourceLength = source.Length;
string result = string.Empty;
try
{
if (caseInsensitive)
{
source = source.ToLower();
start = start.ToLower();
end = end.ToLower();
}
startIndex = source.IndexOf(start);
if (startIndex != -1)
{
endIndex = source.IndexOf(end, startIndex + 1);
if (endIndex != -1)
{
if (includeStartEnd)
{
length = (endIndex + end.Length) - startIndex;
}
else
{
startIndex = startIndex + start.Length;
length = endIndex - startIndex;
}
result = source.Substring(startIndex, length);
}
}
}
catch (Exception ex)
{
//Notify of Error
result = string.Empty;
StringBuilder g_Error = new StringBuilder();
g_Error.AppendLine("GetSubstring: " + ex.Message.ToString());
g_Error.AppendLine(ex.StackTrace.ToString());
}
return result;
}
You may want to make sure the check ignores the case of both phrases.
string theSentence = "I really want you to shut the door.";
string thePhrase = "Shut The Door";
bool phraseIsPresent = theSentence.ToUpper().Contains(thePhrase.ToUpper());
int phraseStartsAt = theSentence.IndexOf(
thePhrase,
StringComparison.InvariantCultureIgnoreCase);
Console.WriteLine("Is the phrase present? " + phraseIsPresent);
Console.WriteLine("The phrase starts at character: " + phraseStartsAt);
This outputs:
Is the phrase present? True
The phrase starts at character: 21

Categories

Resources