how can I retrieve both string between STRING & END in this sentence
"This is STRING a222 END, and this is STRING b2838 END."
strings that I want to get:
a222
b2838
Following is my code, and i only manage to get first string which is a222
string myString = "This is STRING a222 END, and this is STRING b2838 END.";
int first = myString.IndexOf("STRING") + "STRING".Length;
int second= myString.LastIndexOf("END");
string result = St.Substring(first, second - first);
.
Here is the solution using Regular Expressions. Working Code here
var reg = new Regex("(?<=STRING ).*?(?= END)");
var matched = reg.Matches("This is STRING a222 END, and this is STRING b2838 END.");
foreach(var m in matched)
{
Console.WriteLine(m.ToString());
}
You can pass a value for startIndex to string.IndexOf(), you can use this while looping:
IEnumerable<string> Find(string input, string startDelimiter, string endDelimiter)
{
int first = 0, second;
do
{
// Find start delimiter
first = input.IndexOf(startDelimiter, startIndex: first) + startDelimiter.Length;
if (first == -1)
yield break;
// Find end delimiter
second = input.IndexOf(endDelimiter, startIndex: first);
if (second == -1)
yield break;
yield return input.Substring(first, second - first).Trim();
first = second + endDelimiter.Length + 1;
}
while (first < input.Length);
}
You can iterate over indexes,
string myString = "This is STRING a222 END, and this is STRING b2838 END.";
//Jump to starting index of each `STRING`
for(int i = myString.IndexOf("STRING");i > 0; i = myString.IndexOf("STRING", i+1))
{
//Get Index of each END
var endIndex = myString.Substring(i + "STARTING".Length).IndexOf("END");
//PRINT substring between STRING and END of each occurance
Console.WriteLine(myString.Substring(i + "STARTING".Length-1, endIndex));
}
.NET FIDDLE
In your case, STRING..END occurs multiple times, but you were getting index of only first STRING and last index of END which will return substring, starts with first STRING to last END.
i.e.
a222 END, and this is STRING b2838
You've already got some good answers but I'll add another that uses ReadOnlyMemory from .NET core. That provides a solution that doesn't allocate new strings which can be nice. C# iterators are a common way to transform one sequence, of chars in this case, into another. This method would be used to transform the input string into sequence of ReadOnlyMemory each containing the tokens your after.
public static IEnumerable<ReadOnlyMemory<char>> Tokenize(string source, string beginPattern, string endPattern)
{
if (string.IsNullOrEmpty(source) ||
string.IsNullOrEmpty(beginPattern) ||
string.IsNullOrEmpty(endPattern))
yield break;
var sourceText = source.AsMemory();
int start = 0;
while (start < source.Length)
{
start = source.IndexOf(beginPattern, start);
if (-1 != start)
{
int end = source.IndexOf(endPattern, start);
if (-1 != end)
{
start += beginPattern.Length;
yield return sourceText.Slice(start, (end - start));
}
else
break;
start = end + endPattern.Length;
}
else
{
break;
}
}
}
Then you'd just call it like so to iterate over the tokens...
static void Main(string[] args)
{
const string Source = "This is STRING a222 END, and this is STRING b2838 END.";
foreach (var token in Tokenize(Source, "STRING", "END"))
{
Console.WriteLine(token);
}
}
string myString = "This is STRING a222 END, and this is STRING b2838 END.";
// Fix the issue based on #PaulF's comment.
if (myString.StartsWith("STRING"))
myString = $"DUMP {myString}";
var arr = myString.Split(new string[] { "STRING", "END" }, StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < arr.Length; i++)
{
if(i%2 > 0)
{
// This is your string
Console.WriteLine(arr[i].Trim());
}
}
Related
Problem: I want to write a method that takes a message/index pair like this:
("Hello, I am *Name1, how are you doing *Name2?", 2)
The index refers to the asterisk delimited name in the message. So if the index is 1, it should refer to *Name1, if it's 2 it should refer to *Name2.
The method should return just the name with the asterisk (*Name2).
I have attempted to play around with substrings, taking the first delimited * and ending when we reach a character that isn't a letter, number, underscore or hyphen, but the logic just isn't setting in.
I know this is similar to a few problems on SO but I can't find anything this specific. Any help is appreciated.
This is what's left of my very vague attempt so far. Based on this thread:
public string GetIndexedNames(string message, int index)
{
int strStart = message.IndexOf("#") + "#".Length;
int strEnd = message.LastIndexOf(" ");
String result = message.Substring(strStart, strEnd - strStart);
}
If you want to do it the old school way, then something like:
public static void Main(string[] args)
{
string message = "Hello, I am *Name1, how are you doing *Name2?";
string name1 = GetIndexedNames(message, "*", 1);
string name2 = GetIndexedNames(message, "*", 2);
Console.WriteLine(message);
Console.WriteLine(name1);
Console.WriteLine(name2);
Console.ReadLine();
}
public static string GetIndexedNames(string message, string singleCharDelimiter, int index)
{
string valid = "abcdefghijlmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-";
string[] parts = message.Split(singleCharDelimiter.ToArray());
if (parts.Length >= index)
{
StringBuilder sb = new StringBuilder();
for(int i = 0; i < parts[index].Length; i++)
{
string character = parts[index].Substring(i, 1);
if (valid.Contains(character))
{
sb.Append(character);
}
else
{
return sb.ToString();
}
}
return sb.ToString();
}
return "";
}
You can try using regular expressions to match the names. Assuming that name is a sequence of word characters (letters or digits):
using System.Linq;
using System.Text.RegularExpressions;
...
// Either name with asterisk *Name or null
// index is 1-based
private static ObtainName(string source, int index) => Regex
.Matches(source, #"\*\w+")
.Cast<Match>()
.Select(match => match.Value)
.Distinct() // in case the same name repeats several times
.ElementAtOrDefault(index - 1);
Demo:
string name = ObtainName(
"Hello, I am *Name1, how are you doing *Name2?", 2);
Console.Write(name);
Outcome:
*Name2
Perhaps not the most elegant solution, but if you want to use IndexOf, use a loop:
public static string GetIndexedNames(string message, int index, char marker='*')
{
int lastFound = 0;
for (int i = 0; i < index; i++) {
lastFound = message.IndexOf(marker, lastFound+1);
if (lastFound == -1) return null;
}
var space = message.IndexOf(' ', lastFound);
return space == -1 ? message.Substring(lastFound) : message.Substring(lastFound, space - lastFound);
}
Here is an example of what I want to implement: for instance, given the string Something, if you were to replace all occurrences of so with DDD, the result would be DDDmething.
Here is how I am implementing it; my code finds a char by its specific position and changes it, but in fact I want to implement what I stated above.
static void Main(string[] args)
{
string str = "The Haunting of Hill House!";
Console.WriteLine("String: " + str);
// replacing character at position 7
int pos = 7;
char rep = 'p';
string res = str.Substring(0, pos) + rep + str.Substring(pos + 1);
Console.WriteLine("String after replacing a character: " + result);
Console.ReadLine();
}
Alternative might be to split the string by "so" and join the resulting array by "DDD" :
string result = string.Join("DDD", "something".Split(new[] { "so" }, StringSplitOptions.None));
This should do what you want. The idea is to use IndexOf to find the index of the substring to replace then append the substring before it followed by the replacement, then start the search over from the end of the found substring. Then after all substrings are found and replaced append the rest of the original string to the end if there is any.
Note this doesn't do any checks on the input and you really should use the string.Replace as I'm sure it's more performant.
public string Replace(string input, string find, string replace)
{
// The current index in the string where we are searching from
int currIndex = 0;
// The index of the next substring to replace
int index = input.IndexOf(find);
// A string builder used to build the new string
var builder = new StringBuilder();
// Continue until the substring is not found
while(index != -1)
{
// If the current index is not equal to the substring location
// when we need to append everything from the current position
// to where we found the substring
if(index != currIndex )
{
builder.Append(input.Substring(currIndex , index - currIndex));
}
// Now append the replacement
builder.Append(replace);
// Move the current position past the found substring
currIndex = index + find.Length;
// Search for the next substring.
index = input.IndexOf(find, currIndex );
}
// If the current position is not the end of the string we need
// to append the remainder of the string.
if(currIndex < input.Length)
{
builder.Append(input.Substring(currIndex));
}
return builder.ToString();
}
You could do it like this:
var someString = "This is some sort of string.";
var resultIndex = 0;
var searchKey ="So";
var replacementString = "DDD";
while ((resultIndex = someString.IndexOf(searchKey, resultIndex, StringComparison.OrdinalIgnoreCase)) != -1)
{
var prefix = someString.Substring(0, Math.Max(0, resultIndex - 1));
var suffix = someString.Substring(resultIndex + searchKey.Length);
someString = prefix + replacementString + suffix;
resultIndex += searchKey.Length;
}
Expected to produce "This is DDDme DDDrt of string.".
I have a very big string with a lot of usernames. I want to extract the names from the string. That means I have one big string with lot of names in it. At the end I want every username in a string array.
An example of the string:
blablablabla#User;\u0004User\username,blablablablablablablabla#User;\u0004User\anotherusername,#Viewblablablablablablablabla
Search for: u0004User\
Save all charractes until , is found in the string array
Pseudocode:
string [] array = new string []{};
int i = 0;
foreach (var c in bigdata)
{
if(c == "u0004User\")
{
array[i] = c.AllCharactersUntil(',');
i++;
//AllCharactersUntil is a pseudo function
}
}
You can use string.IndexOf to find the index of "u0004User\" then again to find the following comma. Then use string.Substring to get the name. Keeping track of the current index and using it to tell IndexOf where to start searching from.
string bigdata =
#"blablablabla#User;\u0004User\username,blablablablablablablabla#User;\u0004User\anotherusername,#Viewblablablablablablablabla";
string searchValue = #"u0004User\";
int index = 0;
List<string> names = new List<string>();
while (index < bigdata.Length)
{
index = bigdata.IndexOf(searchValue, index);
if (index == -1) break;
int start = index + searchValue.Length;
int end = bigdata.IndexOf(',', start);
if (end == -1) break;
names.Add(bigdata.Substring(start, end - start));
index = end + 1;
}
Console.WriteLine(string.Join(", ", names));
That will give you the following output
username, anotherusername
NOTE
I've assumed that the "\u0004" values are those 6 characters and not a single unicode character. If it is a unicode character then you need the following change
string searchValue = "\u0004User\\";
Here's a simple result:
string input = "blablablabla#User;\\u0004User\username,blablablablablablablabla#User;\\u0004User\anotherusername,#Viewblablablablablablablabla";
List<string> userNames = new List<string>();
foreach (Match match in Regex.Matches(input, #"(u0004User\\)(.*?),", RegexOptions.IgnoreCase))
{
string currentUserName = match.Groups[2].ToString();
userNames.Add(currentUserName); // Add UserName to List
}
I have a txt file as a string, and I need to find words between two characters and Ltrim/Rtrim everything else. It may have to be conditional because the two characters may change depending on the string.
Example:
car= (data between here I want) ;
car = (data between here I want) </value>
Code:
int pos = st.LastIndexOf("car=", StringComparison.OrdinalIgnoreCase);
if (pos >= 0)
{
server = st.Substring(0, pos);..............
}
This is a simple extension method I use:
public static string Between(this string src, string findfrom, string findto)
{
int start = src.IndexOf(findfrom);
int to = src.IndexOf(findto, start + findfrom.Length);
if (start < 0 || to < 0) return "";
string s = src.Substring(
start + findfrom.Length,
to - start - findfrom.Length);
return s;
}
With this you can use
string valueToFind = sourceString.Between("car=", "</value>")
You can also try this:
public static string Between(this string src, string findfrom,
params string[] findto)
{
int start = src.IndexOf(findfrom);
if (start < 0) return "";
foreach (string sto in findto)
{
int to = src.IndexOf(sto, start + findfrom.Length);
if (to >= 0) return
src.Substring(
start + findfrom.Length,
to - start - findfrom.Length);
}
return "";
}
With this you can give multiple ending tokens (their order is important)
string valueToFind = sourceString.Between("car=", ";", "</value>")
You could use regex
var input = "car= (data between here I want) ;";
var pattern = #"car=\s*(.*?)\s*;"; // where car= is the first delimiter and ; is the second one
var result = Regex.Match(input, pattern).Groups[1].Value;
I have like a three word expression: "Shut The Door" and I want to find it in a sentence. Since They are kind of seperated by space what would be the best solution for it.
If you have the string:
string sample = "If you know what's good for you, you'll shut the door!";
And you want to find where it is in a sentence, you can use the IndexOf method.
int index = sample.IndexOf("shut the door");
// index will be 42
A non -1 answer means the string has been located. -1 means it does not exist in the string. Please note that the search string ("shut the door") is case sensitive.
Use build in Regex.Match Method for matching strings.
string text = "One car red car blue car";
string pat = #"(\w+)\s+(car)";
// Compile the regular expression.
Regex r = new Regex(pat, RegexOptions.IgnoreCase);
// Match the regular expression pattern against a text string.
Match m = r.Match(text);
int matchCount = 0;
while (m.Success)
{
Console.WriteLine("Match"+ (++matchCount));
for (int i = 1; i <= 2; i++)
{
Group g = m.Groups[i];
Console.WriteLine("Group"+i+"='" + g + "'");
CaptureCollection cc = g.Captures;
for (int j = 0; j < cc.Count; j++)
{
Capture c = cc[j];
System.Console.WriteLine("Capture"+j+"='" + c + "', Position="+c.Index);
}
}
m = m.NextMatch();
}
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.match(v=vs.71).aspx
http://support.microsoft.com/kb/308252
if (string1.indexOf(string2) >= 0)
...
The spaces are nothing special, they are just characters, so you can find a string like this like yuo would find any other string in your sentence, for example using "indexOf" if you need the position, or just "Contains" if you need to know if it exists or not.
E.g.
string sentence = "foo bar baz";
string phrase = "bar baz";
Console.WriteLine(sentence.Contains(phrase)); // True
Here is some C# code to find a substrings using a start string and end string point but you can use as a base and modify (i.e. remove need for end string) to just find your string...
2 versions, one to just find the first instance of a substring, other returns a dictionary of all starting positions of the substring and the actual string.
public Dictionary<int, string> GetSubstringDic(string start, string end, string source, bool includeStartEnd, bool caseInsensitive)
{
int startIndex = -1;
int endIndex = -1;
int length = -1;
int sourceLength = source.Length;
Dictionary<int, string> result = new Dictionary<int, string>();
try
{
//if just want to find string, case insensitive
if (caseInsensitive)
{
source = source.ToLower();
start = start.ToLower();
end = end.ToLower();
}
//does start string exist
startIndex = source.IndexOf(start);
if (startIndex != -1)
{
//start to check for each instance of matches for the length of the source string
while (startIndex < sourceLength && startIndex > -1)
{
//does end string exist?
endIndex = source.IndexOf(end, startIndex + 1);
if (endIndex != -1)
{
//if we want to get length of string including the start and end strings
if (includeStartEnd)
{
//make sure to include the end string
length = (endIndex + end.Length) - startIndex;
}
else
{
//change start index to not include the start string
startIndex = startIndex + start.Length;
length = endIndex - startIndex;
}
//add to dictionary
result.Add(startIndex, source.Substring(startIndex, length));
//move start position up
startIndex = source.IndexOf(start, endIndex + 1);
}
else
{
//no end so break out of while;
break;
}
}
}
}
catch (Exception ex)
{
//Notify of Error
result = new Dictionary<int, string>();
StringBuilder g_Error = new StringBuilder();
g_Error.AppendLine("GetSubstringDic: " + ex.Message.ToString());
g_Error.AppendLine(ex.StackTrace.ToString());
}
return result;
}
public string GetSubstring(string start, string end, string source, bool includeStartEnd, bool caseInsensitive)
{
int startIndex = -1;
int endIndex = -1;
int length = -1;
int sourceLength = source.Length;
string result = string.Empty;
try
{
if (caseInsensitive)
{
source = source.ToLower();
start = start.ToLower();
end = end.ToLower();
}
startIndex = source.IndexOf(start);
if (startIndex != -1)
{
endIndex = source.IndexOf(end, startIndex + 1);
if (endIndex != -1)
{
if (includeStartEnd)
{
length = (endIndex + end.Length) - startIndex;
}
else
{
startIndex = startIndex + start.Length;
length = endIndex - startIndex;
}
result = source.Substring(startIndex, length);
}
}
}
catch (Exception ex)
{
//Notify of Error
result = string.Empty;
StringBuilder g_Error = new StringBuilder();
g_Error.AppendLine("GetSubstring: " + ex.Message.ToString());
g_Error.AppendLine(ex.StackTrace.ToString());
}
return result;
}
You may want to make sure the check ignores the case of both phrases.
string theSentence = "I really want you to shut the door.";
string thePhrase = "Shut The Door";
bool phraseIsPresent = theSentence.ToUpper().Contains(thePhrase.ToUpper());
int phraseStartsAt = theSentence.IndexOf(
thePhrase,
StringComparison.InvariantCultureIgnoreCase);
Console.WriteLine("Is the phrase present? " + phraseIsPresent);
Console.WriteLine("The phrase starts at character: " + phraseStartsAt);
This outputs:
Is the phrase present? True
The phrase starts at character: 21