Find delimiters in a string and mark that position - c#

I'm trying to figure out how to record the position of a delimiter in a string of text entered by the user.
So if the user entered text:
orange red green yellow?
* * * *
I would want to mark the space after each word along with the question mark. (Those stars should be lining up with the delimiters.)
I know how to search the string for a certain character or set of characters, but not how I would mark it to receive a star on the next line.

string input = "orange red green yellow?";
List<int> indexes = Regex.Matches(input, @"[^\w]+").Cast<Match>()
    .Select(m => m.Index)
    .ToList();
Or, if you want to replace the delimiters with *:
var output = Regex.Replace(input, @"[^\w]+", "*");
EDIT
var output = String.Join("",input.Select(c => char.IsLetter(c)?" ":"*"));
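To make the "star on the next line" idea concrete, here is a minimal sketch that builds the marker line directly. The names DelimiterMarker and Mark are hypothetical, and it assumes that anything that is not a letter or digit counts as a delimiter (the one-liner above uses char.IsLetter, which would also mark digits).

```csharp
using System;
using System.Linq;

public static class DelimiterMarker
{
    // Hypothetical helper: returns a line of the same length as the input,
    // with '*' under every character that is not a letter or digit and a
    // space under every other character.
    public static string Mark(string input) =>
        new string(input.Select(c => char.IsLetterOrDigit(c) ? ' ' : '*').ToArray());

    public static void Main()
    {
        string input = "orange red green yellow?";
        Console.WriteLine(input);
        // Printed in a fixed-width font, the stars sit under the delimiters.
        Console.WriteLine(Mark(input));
    }
}
```

Because the marker line has exactly one character per input character, the stars only line up when both lines are shown in a monospaced font.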

text = text.Replace(" ", "? ");

Related

Regex Replace exclude first and nth character

I am trying to mask a string name with * (asterisks) and exclude both the first and nth (5th) characters.
Example:
UserFirstName -> U***F********
I managed to exclude the first character with (?!^).:
var regex = new Regex("(?!^).");
var result = regex.Replace(stringUserName, "*");
output:
UserFirstName -> U************
How can I also exclude the character in the 5th position?
You may use
(?!^)(?<!^.{4}).
See the regex demo
Pattern details
(?!^) - a negative lookahead that fails the match at the start of the string (the equivalent lookbehind (?<!^) may be used instead)
(?<!^.{4}) - a negative lookbehind that fails the match if exactly four characters (other than newline chars) lie between the start of the string and the current position
. - any single char other than a newline char.
C# demo:
string text = "UserFirstName";
int SkipIndex = 5;
string pattern = $@"(?!^)(?<!^.{{{SkipIndex-1}}}).";
Console.WriteLine(Regex.Replace(text, pattern, "*"));
Output: U***F********
Without Regex, extra explanation not required ;)
var text = "UserFirstName";
var skip = new[] { 0, 4 }.ToHashSet();
var masked = text.Select((c, index) => skip.Contains(index) ? c : '*').ToArray();
var output = new String(masked);
Console.WriteLine (output); // U***F********
c# Demo

Split text into two sentences in C#

I want to divide a text into two sentences. The text contains whitespace characters.
For example:
Original sentence: 100 10 20 13
The result:
first sentence: 100 10 20
second sentence: 13
I tried split but the result was :
first:100
second:10
third:20
fourth:13
How can I do that?
You want all before the last space and the rest? You can use String.LastIndexOf and Substring:
string text = "100 10 20 13";
string firstPart = text;
string lastPart = ""; // stays empty when the text contains no space
int lastSpaceIndex = text.LastIndexOf(' ');
if (lastSpaceIndex >= 0)
{
    firstPart = text.Substring(0, lastSpaceIndex);
    lastPart = text.Substring(lastSpaceIndex).TrimStart();
}
You can use LINQ for this:
// This splits on spaces
var split = original.Split(' ');
// This takes all the split parts except the last one and rejoins them
var first = String.Join(" ", split.Take(split.Length - 1));
// This gets the last one
var last = split.Last();
Note: This is assuming that you want the first result to be every word except for the last and the second result to be only the last... If you have different requirements please clarify your question
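As a sketch of how the LastIndexOf approach could be wrapped up for reuse, the helper below returns both parts as a tuple. The names LastWordSplitter and SplitAtLastSpace are hypothetical, and it assumes the separator is a single space character; when there is no space, the whole text is treated as the first part.

```csharp
using System;

public static class LastWordSplitter
{
    // Hypothetical helper: splits the text at its last space.
    // If there is no space, the whole text is returned as First and
    // Last is empty.
    public static (string First, string Last) SplitAtLastSpace(string text)
    {
        int lastSpaceIndex = text.LastIndexOf(' ');
        if (lastSpaceIndex < 0)
        {
            return (text, string.Empty);
        }

        return (text.Substring(0, lastSpaceIndex),
                text.Substring(lastSpaceIndex + 1));
    }

    public static void Main()
    {
        var (first, last) = SplitAtLastSpace("100 10 20 13");
        Console.WriteLine($"first sentence: {first}");
        Console.WriteLine($"second sentence: {last}");
    }
}
```

Unlike the Split-based version, this does not allocate an array of all the words, which may matter for long inputs.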

Add space around each / using Humanizer or Regex

I have a string like the following:
var text = @"Some text/othertext/ yet more text /last of the text";
I want to normalize the spaces around each slash so it matches the following:
var text = @"Some text / othertext / yet more text / last of the text";
That is, one space before each slash and one space after. How can I do this using Humanizer or, barring that, with a single regex? Humanizer is the preferred solution.
I'm able to do this with the following pair of regexes:
var regexLeft = new Regex(@"\S/"); // \S matches non-whitespace
var regexRight = new Regex(@"/\S");
var newVal = regexLeft.Replace(text, m => m.Value[0] + " /");
newVal = regexRight.Replace(newVal, m => "/ " + m.Value[1]);
Are you looking for this?
var text = @"Some text/othertext/ yet more text /last of the text";
// Some text / othertext / yet more text / last of the text
string result = Regex.Replace(text, @"\s*/\s*", " / ");
Each slash, together with any whitespace around it (zero or more characters), is replaced by a slash surrounded by exactly one space on each side.

Smarter string replace based on a pattern

I have a string that looks like this:
string1 + \t\t\t\t\t\t\t + string2
string1 can be anything and string2 can be one of the following: Display, Search, Fee. For the escaped characters, sometimes I get 10, sometimes I get 5, sometimes I get some amount N... I am only expecting one \t character between string1 and string2.
What I have so far:
string newLine0 = line.Replace("\t\t\t\t\t\t\t\t\t\t\t\t\t\tDisplay", "\tDisplay");
string newline1 = newLine0.Replace("\t\tFee", "\tFee");
string newLine2 = newline1.Replace("\t\tSearch", "\tSearch");
string newLine3 = newLine2.Replace("\t\t\t\t\t\t\t\t\t\t\t\tDisplay", "\tDisplay");
string newLine4 = newLine3.Replace("\t\tDisplay", "\tDisplay");
Is there a better way to do this with cleaner code and fewer variables?
It seems like you could simply replace instances of more than one \t with a single \t:
string newLine = Regex.Replace(line, @"\t{2,}", "\t");
If you only want to remove extra tabs if one of the words Display, Fee or Search follows, use
string newLine = Regex.Replace(line, @"\t{2,}(?=Display|Fee|Search)", "\t");
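A small sketch of the lookahead variant in use may help show what it does and does not touch. The names TabCollapser and Collapse are hypothetical, and the sample lines are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TabCollapser
{
    // Hypothetical helper: collapses a run of two or more tabs into one tab,
    // but only when the run is directly followed by Display, Fee or Search.
    public static string Collapse(string line) =>
        Regex.Replace(line, @"\t{2,}(?=Display|Fee|Search)", "\t");

    public static void Main()
    {
        // Tabs are shown as the two characters \t so the output is readable.
        Console.WriteLine(Collapse("item\t\t\t\tDisplay").Replace("\t", "\\t"));
        Console.WriteLine(Collapse("a\t\tb\t\t\tFee").Replace("\t", "\\t"));
    }
}
```

In the second sample line the double tab between a and b is left alone, because it is not followed by one of the three keywords.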
If N tabs precede a word, make N be 1:
string newLine = Regex.Replace(line, @"(\t+)(\t\w)", "$+");
(\t+) - match one or more tabs (first group)
(\t\w) - followed by exactly one tab and a word character (second group)
$+ - substitute the whole match (\t+\t\w) with only the last captured group (\t\w).

If substring is in a word, get the index value before the word

I'm going to truncate a description text at a certain length with .Substring(0, 100). I don't want to break in the middle of a word; if the cut falls inside a word, I would like to cut at the first whitespace before that word.
I figure that if the next character is not a " ", the cut is in the middle of a word.
My biggest issue is how to step backwards to get the index of the " " (whitespace).
This is what I have so far:
string description = "a long string";
description = Regex.Replace(description, @"(?></?\w+)(?>(?:[^>'""]+|'[^']*'|""[^""]*"")*)>", String.Empty);
var newString = (description.Length > 101) ? description.Substring(0, 101) : description;
//i tried something like this
var whatIsNext = newString.IndexOf(" ", 100, -20);
I think you're looking for String.LastIndexOf:
Reports the zero-based index position of the last occurrence of a specified Unicode character within this instance. The search starts at a specified character position and proceeds backward toward the beginning of the string.
You want something like this:
int index = s.LastIndexOf(' ', 100);
Use LastIndexOf and apply it to your substring
// The string we are searching.
string value = "Dot Net Perls";
//
// Find the last occurrence of ' '.
int index1 = value.LastIndexOf(' ');
var whatIsNext = newString.Substring(0, newString.LastIndexOf(' '));
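Putting the pieces together, a minimal sketch of truncating at a word boundary might look like the following. The names WordTruncator and TruncateAtWord are hypothetical. It assumes words are separated by single space characters and falls back to a hard cut when the first maxLength characters contain no usable space, which also avoids the exception Substring would throw for a LastIndexOf result of -1.

```csharp
using System;

public static class WordTruncator
{
    // Hypothetical helper: returns at most maxLength characters of text,
    // cutting at the last space at or before position maxLength so that
    // no word is split. Falls back to a hard cut if no such space exists.
    public static string TruncateAtWord(string text, int maxLength)
    {
        if (text.Length <= maxLength)
        {
            return text;
        }

        // Searches backward starting at index maxLength (a valid index,
        // because text.Length > maxLength at this point).
        int cut = text.LastIndexOf(' ', maxLength);
        return cut > 0 ? text.Substring(0, cut) : text.Substring(0, maxLength);
    }

    public static void Main()
    {
        Console.WriteLine(TruncateAtWord("The quick brown fox", 12));
    }
}
```

Starting the backward search at index maxLength rather than maxLength - 1 means that a text whose first maxLength characters end exactly at a word boundary is kept in full.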
