C# how to find and replace specific text in string? - c#

I have a string which represents a byte array. Inside it there are several groups of numbers (usually 5), encoded as 0x30..0x39 (the codes for the digits 0..9). Before and after each number there is a space (code 0x20).
Examples:
"E5-20-32-36-20-E0" // "32-36" encodes number "26", notice spaces: "20"
"E5-20-37-20-E9" // "37" encodes number "7"
"E5-20-38-20-E7-E4-20-37-35-20-E9" // two numbers: "8" (from "38") and "75" (from "37-35")
I want to find out all these groups and reverse digits in the encoded numbers:
8 -> 8
75 -> 57
123 -> 321
Desired outcome:
"E5-20-32-36-20-E0" -> "E5-20-36-32-20-E0"
"E5-20-37-20-E9" -> "E5-20-37-20-E9"
"E5-20-37-38-39-20-E9" -> "E5-20-39-38-37-20-E9"
"E5-20-38-39-20-E7-E4-20-37-35-20-E9" -> "E5-20-39-38-20-E7-E4-20-35-37-20-E9"
I have the data as a List / String / Byte[], so maybe there is a way to do it?
Thanks,

It's unclear (from the original question) what you want to do with the digits, so let's extract a custom method for you to implement. As an example, I've implemented reverse:
32 -> 32
32-36 -> 36-32
36-32-37 -> 37-32-36
36-37-38-39 -> 39-38-37-36
Code:
// items: array of digits codes, e.g. {"36", "32", "37"}
//TODO: put desired transformation here
private static IEnumerable<string> Transform(string[] items) {
// Either terse Linq:
// return items.Reverse();
// Or good old for loop:
string[] result = new string[items.Length];
for (int i = 0; i < items.Length; ++i)
result[i] = items[items.Length - i - 1];
return result;
}
Now we can use regular expressions (Regex) to extract all the digit sequences and replace them with the transformed ones:
using System.Text.RegularExpressions;
...
string input = "E5-20-36-32-37-20-E0";
string result = Regex
.Replace(input,
@"(?<=20\-)3[0-9](\-3[0-9])*(?=\-20)",
match => string.Join("-", Transform(match.Value.Split('-'))));
Console.Write($"Before: {input}{Environment.NewLine}After: {result}");
Outcome:
Before: E5-20-36-32-37-20-E0
After: E5-20-37-32-36-20-E0
Edit: In case reverse is the only desired transformation, the code can be simplified by dropping Transform and adding Linq:
using System.Linq;
using System.Text.RegularExpressions;
...
string input = "E5-20-36-32-37-20-E0";
string result = Regex
.Replace(input,
@"(?<=20\-)3[0-9](\-3[0-9])*(?=\-20)",
match => string.Join("-", match.Value.Split('-').Reverse()));
More tests:
private static string MySolution(string input) {
return Regex
.Replace(input,
@"(?<=20\-)3[0-9](\-3[0-9])*(?=\-20)",
match => string.Join("-", Transform(match.Value.Split('-'))));
}
...
string[] tests = new string[] {
"E5-20-32-36-20-E0",
"E5-20-37-20-E9",
"E5-20-37-38-39-20-E9",
"E5-20-38-39-20-E7-E4-20-37-35-20-E9",
};
string report = string.Join(Environment.NewLine, tests
.Select(test => $"{test,-37} -> {MySolution(test)}"));
Console.Write(report);
Outcome:
E5-20-32-36-20-E0 -> E5-20-36-32-20-E0
E5-20-37-20-E9 -> E5-20-37-20-E9
E5-20-37-38-39-20-E9 -> E5-20-39-38-37-20-E9
E5-20-38-39-20-E7-E4-20-37-35-20-E9 -> E5-20-39-38-20-E7-E4-20-35-37-20-E9
Edit 2: Regex explanation (see https://www.regular-expressions.info/lookaround.html for details):
(?<=20\-) - must appear before the match: "20-" ("-" escaped with "\")
3[0-9](\-3[0-9])* - match itself (what we are replacing in Regex.Replace)
(?=\-20) - must appear after the match "-20" ("-" escaped with "\")
Let's have a look at match part 3[0-9](\-3[0-9])*:
3 - just "3"
[0-9] - character (digit) within 0-9 range
(\-3[0-9])* - followed by zero or more - "*" - groups of "-3[0-9]"
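The question mentions that the data is also available as a Byte[]. In that case the same reversal can be done without any string handling. The following is only a minimal sketch under that assumption; the class and method names (DigitRunReverser, ReverseDigitRuns) are made up for illustration. It scans for runs of digit codes (0x30..0x39) and reverses a run in place only when it has a space (0x20) directly before and after it, mirroring the lookbehind and lookahead of the regex above.

```csharp
using System;

public static class DigitRunReverser
{
    // Hypothetical helper: reverses, in place, every run of ASCII digit codes
    // (0x30..0x39) that has a space (0x20) directly before and after it.
    // All other bytes are left untouched.
    public static void ReverseDigitRuns(byte[] data)
    {
        int i = 0;
        while (i < data.Length)
        {
            if (!IsDigit(data[i]))
            {
                i++;
                continue;
            }

            // Find the end of the current run of digits (exclusive index).
            int start = i;
            while (i < data.Length && IsDigit(data[i]))
                i++;
            int end = i;

            bool spaceBefore = start > 0 && data[start - 1] == 0x20;
            bool spaceAfter = end < data.Length && data[end] == 0x20;

            if (spaceBefore && spaceAfter)
                Array.Reverse(data, start, end - start);
        }
    }

    private static bool IsDigit(byte b) => b >= 0x30 && b <= 0x39;
}
```

Calling ReverseDigitRuns on the bytes E5 20 38 39 20 E7 E4 20 37 35 20 E9 and then formatting them with BitConverter.ToString should give the same hyphen-separated form as in the question, with "38-39" turned into "39-38" and "37-35" into "35-37".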

I'm not sure, but I guess the length can vary and you just want to reverse the order of the numbers. A possible way is:
Put the string into two arrays (so they are identical)
Iterate through one of them to locate the beginning and the end of the number area
Go from the end of the area to its beginning in one array and write into the other from beginning to end
Edit: not really tested, I just wrote this quickly:
string input = "E5-20-36-32-37-20-E0";
string[] array1 = input.Split('-');
string[] array2 = input.Split('-');
int startIndex = -1;
int endIndex = -1;
for (int i= 0; i < array1.Length; ++i)
{
if (array1[i] == "20")
{
if (startIndex < 0)
{
startIndex = i + 1;
}
else
{
endIndex = i - 1;
}
}
}
int pos1 = startIndex;
int pos2 = endIndex;
for (int j=0; j < (endIndex- startIndex + 1); ++j)
{
array1[pos1] = array2[pos2];
pos1++;
pos2--;
}

If you would be clear about how you want to process the numbers, it would be easier to provide a solution.
Do you want to swap them randomly?
Do you want to reverse order?
Do you want to swap every second number with the number before?
Do you want to swap ...
you can try the following (for reversing the numbers)
string hex = "E5-20-36-32-20-E0"; // this is your input string
// split the numbers by '-' and generate list out of it
List<string> hexNumbers = new List<string>();
hexNumbers.AddRange(hex.Split('-'));
// find start and end of the numbers that should be swapped
int startIndex = hexNumbers.IndexOf("20");
int endIndex = hexNumbers.LastIndexOf("20");
string newHex = "";
// add the part in front of the numbers that should be reversed
for (int i = 0; i <= startIndex; i++) newHex += hexNumbers[i] + "-";
// reverse the numbers
for (int i = endIndex-1; i > startIndex; i--) newHex += hexNumbers[i] + "-";
// add the part behind the numbers that should be reversed
for (int i = endIndex; i < hexNumbers.Count-1; i++) newHex += hexNumbers[i] + "-";
newHex += hexNumbers.Last();
If the start and the end is always the same, this can be fairly simplified into 4 lines of code:
string[] hexNumbers = hex.Split('-');
string newHex = "E5-20-";
for (int i = hexNumbers.Count() - 3; i > 1; i--) newHex += hexNumbers[i] + "-";
newHex += "20-E0";
Results:
"E5-20-36-32-20-E0" -> "E5-20-32-36-20-E0"
"E5-20-36-32-37-20-E0" -> "E5-20-37-32-36-20-E0"
"E5-20-36-12-18-32-20-E0" -> "E5-20-32-18-12-36-20-E0"

Related

How can I add a space between every 3 characters counting from the right to the left in a string in C#?

I want to add space between every 3 characters in a string in C#, but count from right to left.
For example :
11222333 -> 11 222 333
Answer by @Jimi from comments (will delete if they post their own)
var YourString = "11222333";
var sb = new StringBuilder(YourString);
for (int i = sb.Length - 3; i > 0; i -= 3) // i > 0 avoids a leading space when the length is a multiple of 3
sb.Insert(i, ' ');
return sb.ToString();
The benefit of this algorithm appears to be that you are working backwards through the string and therefore only moving a certain amount on each run, rather than the whole string.
If you are trying to format a string as a number according to some locale conventions, you can use the NumberFormatInfo class to set how a number should be formatted as a string.
So for example
string input = "11222333";
NumberFormatInfo currentFormat = new NumberFormatInfo();
currentFormat.NumberGroupSeparator = " ";
if(Int32.TryParse(input, NumberStyles.None, currentFormat, out int result))
{
string output = result.ToString("N0", currentFormat);
Console.WriteLine(output); // 11 222 333
}
The following recursive function would do the job:
string space3(string s)
{
int len3 = s.Length - 3;
return (len3 <= 0) ? s
: (space3(s.Substring(0, len3)) + " " + s.Substring(len3));
}
C# 8.0 introduced indices and ranges, which also work on strings. Ranges allow for a more compact form:
string space3(string s)
{
return (s.Length <= 3) ? s
: (space3(s[..^3]) + " " + s[^3..]);
}
Using Regex.Replace:
string input = "11222333";
string result = Regex.Replace( input, @"\d{3}", @" $0", RegexOptions.RightToLeft );
Demo and detailed explanation of RegEx pattern at regex101.
tl;dr: Match groups of 3 digits from right to left and replace them by space + the 3 digits.
The most efficient algorithm I can come up with is the following:
var sb = new StringBuilder(YourString.Length + YourString.Length / 3 + 1);
if (YourString.Length % 3 > 0)
{
sb.Append(YourString, 0, YourString.Length % 3);
sb.Append(' ');
}
for (var i = YourString.Length % 3; i < YourString.Length; i += 3)
{
sb.Append(YourString, i, 3);
// only separate groups, so the result has no trailing space
if (i + 3 < YourString.Length)
sb.Append(' ');
}
return sb.ToString();
We first assign a StringBuilder of the correct size.
Then we check to see if we need to append the first one or two characters. Then we loop the rest.
dotnetfiddle

How do you do a string split with 2 chars counts in C#?

I know how to do a string split if there's a letter, number, that I want to replace.
But how could I do a string.Split() by 2 char counts without replacing any existing letters, number, etc...?
Example:
string MAC = "00122345";
I want that string to output: 00:12:23:45
You could create a LINQ extension method to give you an IEnumerable<string> of parts:
public static class Extensions
{
public static IEnumerable<string> SplitNthParts(this string source, int partSize)
{
if (string.IsNullOrEmpty(source))
{
throw new ArgumentException("String cannot be null or empty.", nameof(source));
}
if (partSize < 1)
{
throw new ArgumentException("Part size has to be greater than zero.", nameof(partSize));
}
return Enumerable
.Range(0, (source.Length + partSize - 1) / partSize)
.Select(pos => source
.Substring(pos * partSize,
Math.Min(partSize, source.Length - pos * partSize)));
}
}
Usage:
var strings = new string[] {
"00122345",
"001223453"
};
foreach (var str in strings)
{
Console.WriteLine(string.Join(":", str.SplitNthParts(2)));
}
// 00:12:23:45
// 00:12:23:45:3
Explanation:
Use Enumerable.Range to get the number of positions at which to slice the string. In this case it is (length of the string + chunk size - 1) / chunk size, using integer division, since the range must be big enough to also fit a leftover chunk.
Use Enumerable.Select on each position and take the slice with String.Substring, using the position multiplied by the chunk size (2 here) as the startIndex to move down the string every 2 characters. Math.Min is needed to calculate the size of the leftover chunk when the string doesn't have enough characters to fit another full chunk: it is the length of the string - current position * chunk size.
String.Join the final result with ":".
You could also replace the LINQ query with yield here to increase performance for larger strings since all the substrings won't be stored in memory at once:
for (var pos = 0; pos < source.Length; pos += partSize)
{
yield return source.Substring(pos, Math.Min(partSize, source.Length - pos));
}
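The loop above is only the body of the method. A complete version could look like the sketch below. The names ChunkExtensions and SplitNthPartsLazy are made up here to avoid clashing with the LINQ version. The argument checks live in a non-iterator wrapper, because code inside a method that uses yield return does not run until the result is enumerated.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkExtensions
{
    // Hypothetical name. Validates eagerly, then hands over to the iterator.
    public static IEnumerable<string> SplitNthPartsLazy(this string source, int partSize)
    {
        if (string.IsNullOrEmpty(source))
        {
            throw new ArgumentException("String cannot be null or empty.", nameof(source));
        }
        if (partSize < 1)
        {
            throw new ArgumentException("Part size has to be greater than zero.", nameof(partSize));
        }
        return Iterate(source, partSize);
    }

    // Yields one chunk at a time; the last chunk may be shorter than partSize.
    private static IEnumerable<string> Iterate(string source, int partSize)
    {
        for (var pos = 0; pos < source.Length; pos += partSize)
        {
            yield return source.Substring(pos, Math.Min(partSize, source.Length - pos));
        }
    }
}
```

Usage is the same as before, e.g. string.Join(":", "001223453".SplitNthPartsLazy(2)).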
You can use something like this:
string newStr= System.Text.RegularExpressions.Regex.Replace(MAC, ".{2}", "$0:");
To trim the last colon, you can use something like this.
newStr = newStr.TrimEnd(':'); // strings are immutable, so the result must be assigned
Microsoft Document
Try this way.
string MAC = "00122345";
MAC = System.Text.RegularExpressions.Regex.Replace(MAC,".{2}", "$0:");
MAC = MAC.TrimEnd(':'); // removes only a trailing colon, so odd-length input keeps its last character
Console.WriteLine(MAC);
A quite fast solution, 8-10x faster than the current accepted answer (regex solution) and 3-4x faster than the LINQ solution
public static string Format(this string s, string separator, int length)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i < s.Length; i += length)
{
sb.Append(s.Substring(i, Math.Min(s.Length - i, length)));
if (i < s.Length - length)
{
sb.Append(separator);
}
}
return sb.ToString();
}
Usage:
string result = "12345678".Format(":", 2);
Here is a one (1) line alternative using LINQ Enumerable.Aggregate.
string result = MAC.Aggregate("", (acc, c) => acc.Length % 3 == 0 ? acc += c : acc += c + ":").TrimEnd(':');
An easy to understand and simple solution.
This is a simple fast modified answer in which you can easily change the split char.
This answer also checks whether the length of the input is even or odd, so that a trailing unpaired character is handled correctly.
input : 00122345
output : 00:12:23:45
input : 0012234
output : 00:12:23:4
//The List that keeps the pairs
List<string> MACList = new List<string>();
//Split the even number into pairs
for (int i = 1; i <= MAC.Length; i++)
{
if (i % 2 == 0)
{
MACList.Add(MAC.Substring(i - 2, 2));
}
}
//Make the preferable output
string output = "";
for (int j = 0; j < MACList.Count; j++)
{
output = output + MACList[j] + ":";
}
//Checks if the input string is even number or odd number
if (MAC.Length % 2 == 0)
{
output = output.Trim(output.Last());
}
else
{
output += MAC.Last();
}
//input : 00122345
//output : 00:12:23:45
//input : 0012234
//output : 00:12:23:4

How to find text between two tabs

I have a file that looks similar like the following:
Tomas | Nordstrom | Sweden | Europe | World
(the character "|" in the above line represents a tab, new column)
Now I want a string containing only the text in the 4th column.
I have succeeded in finding characters at a certain position in the line. But that position changes according to the number of characters in each column.
I could really need some nice input on this.
Thanks in advance.
/Tomas
This can be done using the Split method like this:
string s = "Tomas|Nordstrom|Sweden|Europe|World";
string[] stringArray = s.Split( new string[] { "|" }, StringSplitOptions.None );
Console.WriteLine( stringArray[3] );
This will print out "Europe", because that is located at index 3 in stringArray.
Edit:
The same can be achieved using Regex like this:
string[] stringRegex = Regex.Split( s, @"\|+" );
Basic algorithm would be iterating characters, until n-1 tabs found, then take chars up to the next tab or the end of string.
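A minimal sketch of that idea, using String.IndexOf to jump from tab to tab instead of comparing characters one by one. The class and method names (TabColumns, GetColumn) and the choice to return null for a missing column are assumptions made for this example.

```csharp
using System;

public static class TabColumns
{
    // Hypothetical helper: returns the column at the zero-based position
    // `index` of a tab-separated line, or null if the line has fewer columns.
    public static string GetColumn(string line, int index)
    {
        int start = 0;

        // Skip `index` tabs; `start` ends up just after the last skipped tab.
        for (int skipped = 0; skipped < index; skipped++)
        {
            int tab = line.IndexOf('\t', start);
            if (tab < 0)
                return null;
            start = tab + 1;
        }

        // Take everything up to the next tab, or to the end of the line.
        int end = line.IndexOf('\t', start);
        return end < 0 ? line.Substring(start) : line.Substring(start, end - start);
    }
}
```

For the line from the question, GetColumn(line, 3) would return the 4th column, "Europe".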
Depending on requirements, if performance is critical, you might need to implement a scanning algorithm manually.
You might be surprised how slow string splitting is. Well, it's not slow by itself, but the overall approach requires:
Scanning to the end of the string
Creation of all of the split parts on heap
Collecting garbage
Consider following benchmark of the two approaches:
void Main()
{
string source = "Tomas\tNordstrom\tSweden\tEurope\tWorld";
var sw = Stopwatch.StartNew();
string result = null;
var n = 100000000;
for (var i = 0; i < n; i++)
{
result = FindBySplitting(source);
}
sw.Stop();
var splittingNsop = (double)sw.ElapsedMilliseconds / n * 1000000.0;
Console.WriteLine("Splitting. {0} ns/op",splittingNsop);
Console.WriteLine(result);
sw.Restart();
for (var i = 0; i < n; i++)
{
result = FindByScanning(source);
}
sw.Stop();
var scanningNsop = (double)sw.ElapsedMilliseconds / n * 1000000.0;
Console.WriteLine("Scanning. {0} ns/op",
scanningNsop);
Console.WriteLine(result);
Console.WriteLine("Scanning over splitting: {0}", splittingNsop / scanningNsop);
}
string FindBySplitting(string s)
{
return s.Split('\t')[3];
}
string FindByScanning(string s)
{
int l = s.Length, p = 0, q = 0, c = 0;
while (c++ < 4 - 1)
while (p < l && s[p++] != '\t')
;
for (q = p; q < l && s[q] != '\t'; q++)
;
return s.Substring(p, q - p);
}
Scanning algorithm implemented in pure C# outperforms the splitting one implemented on the low level by a factor of 4.6 on my laptop:
Splitting. 174.81 ns/op
Europe
Scanning. 37.58 ns/op
Europe
Scanning over splitting: 4.65167642362959

Replace Only Multiples Of three in c#

How can I replace only multiples of 3 in C#? Say for example I had the string "000100000", and I wanted "000" to be replaced with "+" but only every group of three characters. Additional condition: the groups should be changed starting from the end:, e.g. for "000100000" it should output "+100+".
You can just use a regular expression for this.
(0{3}(?!0+))
This uses a negative lookahead to make sure there aren't any other zeros after a group of three 0s - in other words, for a sequence of an arbitrary number of 0s, it'll only match the last 3.
You can modify this if you want to do something subtly different by using lookaheads and lookbehinds.
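A short sketch of how that pattern could be plugged into Regex.Replace. The wrapper names (TrailingZeros, ReplaceLastThreeZeros) are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class TrailingZeros
{
    // Hypothetical wrapper: replaces the last three zeros of every run of
    // zeros with "+". Shorter runs and the leading zeros of longer runs stay.
    public static string ReplaceLastThreeZeros(string source)
    {
        // 0{3}    - exactly three zeros
        // (?!0+)  - not followed by another zero
        return Regex.Replace(source, @"(0{3}(?!0+))", "+");
    }
}
```

For "000100000" this gives "+100+". Since only the last three zeros of a run match, a run of six zeros is not replaced twice: "0001000000" becomes "+1000+".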
I suggest using regular expressions, e.g.:
string source = "000100000";
// "+100+"
string result = Regex.Replace(
source,
"0{3,}",
match => new string('0', match.Length % 3) + new string('+', match.Length / 3));
Tests:
001 -> 001
0001 -> +1
000100 -> +100
0001000 -> +1+
00010000 -> +10+
000100000 -> +100+
0001000000 -> +1++
You can do this with Substring:
string strReplace = "000100000";
//Store your string on StringBuilder to edit the string
StringBuilder sb = new StringBuilder();
sb.Append("+");
sb.Append(strReplace.Substring(0, 3)); //Use substring, 0 is the start of index and 3 is the length as your requirement
sb.Append("+");
sb.Append(strReplace.Substring(3, 3));
sb.Append("+");
sb.Append(strReplace.Substring(6, 3));
sb.Append("+");
strReplace = sb.ToString(); //Finally replace your string instance with your result
Or by for loop but this time instead of using substring, we use Char array to get every char in your string:
string strReplace = "000100000";
char[] chReplace = strReplace.ToCharArray();
StringBuilder sb = new StringBuilder();
for (int x = 0; x <= 8; x++)
{
if (x == 0 || x == 3 || x == 6 || x == 9)
{
sb.Append("+");
sb.Append(chReplace[x]);
}
else
{
sb.Append(chReplace[x]);
}
}
sb.Append("+");
strReplace = sb.ToString();
Okay, a bunch of these answers are addressing detecting groups of 3 '0's. Here's an answer that deals with groups of 3 anythings (reading the string in groups of three characters):
string GroupsOfThree(string str)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i + 2 < str.Length; i += 3)
{
string sub = str.Substring(i, 3);
if (sub.All(c => c == sub[0]))
sb.Append("+");
else
sb.Append(sub);
}
return sb.ToString();
}
You can use a replace regular expression.
"[0]{3}|[1]{3}"
The above Regular Expression can be use like below in C#:
string k = "000100000";
Regex pattern = new Regex("[0]{3}|[1]{3}");
k = pattern.Replace(k, "+"); // Replace returns a new string, so assign the result
Reversing the string before you replace, and again afterwards, solves your problem.
Something like:
string ReplaceThreeZeros(string text)
{
var reversed = new string(text.Reverse().ToArray());
var replaced = reversed.Replace("000","+");
return new string(replaced.Reverse().ToArray());
}

What is the most efficient way to detect if a string contains a number of consecutive duplicate characters in C#?

For example, a user entered "I love this post!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
the consecutive duplicate exclamation mark "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" should be detected.
The following regular expression would detect repeating chars. You could up the number or limit this to specific characters to make it more robust.
int threshold = 3;
string stringToMatch = "thisstringrepeatsss";
string pattern = @"(.)\1{" + (threshold - 1) + ",}"; // any character followed by itself at least threshold - 1 times
Regex r = new Regex(pattern);
Match m = r.Match(stringToMatch);
while(m.Success)
{
Console.WriteLine("character passes threshold " + m.ToString());
m = m.NextMatch();
}
Here's an example of a function that searches for a sequence of consecutive chars of a specified length and also ignores white space characters:
public static bool HasConsecutiveChars(string source, int sequenceLength)
{
if (string.IsNullOrEmpty(source))
return false;
if (source.Length == 1)
return false;
int charCount = 1;
for (int i = 0; i < source.Length - 1; i++)
{
char c = source[i];
if (Char.IsWhiteSpace(c))
continue;
if (c == source[i+1])
{
charCount++;
if (charCount >= sequenceLength)
return true;
}
else
charCount = 1;
}
return false;
}
Edit fixed range bug :/
Can be done in O(n) easily: for each character, if the previous character is the same as the current, increment a temporary count. If it's different, reset your temporary count. At each step, update your global if needed.
For abbccc you get:
a => temp = 1, global = 1
b => temp = 1, global = 1
b => temp = 2, global = 2
c => temp = 1, global = 2
c => temp = 2, global = 2
c => temp = 3, global = 3
=> c appears three times. Extend it to get the position, then you should be able to print the "ccc" substring.
You can extend this to give you the starting position fairly easily, I'll leave that to you.
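One possible way to write the trace above as code, already extended to remember the starting position so the repeated substring can be returned. This is only a sketch; the names Runs and LongestRun are made up for this example.

```csharp
using System;

public static class Runs
{
    // Hypothetical helper: returns the longest run of identical consecutive
    // characters, e.g. "ccc" for "abbccc". Returns "" for null or empty input.
    // If several runs share the maximum length, the first one is returned.
    public static string LongestRun(string s)
    {
        if (string.IsNullOrEmpty(s))
            return string.Empty;

        int bestStart = 0, bestLength = 1; // "global" in the trace
        int runStart = 0;                  // where the current run begins

        for (int i = 1; i < s.Length; i++)
        {
            if (s[i] != s[i - 1])
                runStart = i;              // different character: reset the run

            int runLength = i - runStart + 1; // "temp" in the trace
            if (runLength > bestLength)
            {
                bestLength = runLength;
                bestStart = runStart;
            }
        }

        return s.Substring(bestStart, bestLength);
    }
}
```

The loop visits every character once, so it stays O(n). Comparing the length of the result against a threshold tells you whether the input contains "too many" consecutive duplicates.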
Here is a quick solution I crafted with some extra duplicates thrown in for good measure. As others pointed out in the comments, some duplicates are going to be completely legitimate, so you may want to narrow your criteria to punctuation instead of mere characters.
string input = "I loove this post!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!aa";
int index = -1;
int count =1;
List<string> dupes = new List<string>();
for (int i = 0; i < input.Length-1; i++)
{
if (input[i] == input[i + 1])
{
if (index == -1)
index = i;
count++;
}
else if (index > -1)
{
dupes.Add(input.Substring(index, count));
index = -1;
count = 1;
}
}
if (index > -1)
{
dupes.Add(input.Substring(index, count));
}
The better way, in my opinion, is to create an array in which each element is responsible for one pair of identical adjacent characters in the string, e.g. aa, bb, cc, dd. Every element of this array starts at 0.
Solving the problem is then a single for loop over the string that updates the array values.
Afterwards you can analyze the array for whatever you need.
Example: for the string bbaaaccccdab, the resulting array would be { 2, 1, 3 }, because 'aa' is found 2 times, 'bb' is found once (at the start of the string), and 'cc' is found three times.
Why is 'cc' found three times? Because the matches overlap: 'cc'cc, c'cc'c and cc'cc'.
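A rough sketch of this counting idea. Instead of a fixed array indexed by character, it uses a Dictionary&lt;char, int&gt;, which is a substitution made here so that any character can be counted. The names PairCounter and CountAdjacentPairs are made up for illustration.

```csharp
using System.Collections.Generic;

public static class PairCounter
{
    // Hypothetical helper: counts, per character, how many times it is
    // immediately followed by itself. Overlapping pairs are counted,
    // so "cccc" contributes 3 to the count for 'c'.
    public static Dictionary<char, int> CountAdjacentPairs(string s)
    {
        var counts = new Dictionary<char, int>();

        for (int i = 1; i < s.Length; i++)
        {
            if (s[i] != s[i - 1])
                continue;

            // TryGetValue leaves n at 0 when the key is not present yet.
            counts.TryGetValue(s[i], out int n);
            counts[s[i]] = n + 1;
        }

        return counts;
    }
}
```

For "bbaaaccccdab" the dictionary would hold a = 2, b = 1 and c = 3, matching the example above.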
Use LINQ! (For everything, not just this)
string test = "aabb";
return test.Where((item, index) => index > 0 && item.Equals(test.ElementAt(index - 1)));
// returns "ab": each of these characters equals the character before it
OR
string test = "aabb";
return test.Where((item, index) => index > 0 && item.Equals(test.ElementAt(index - 1))).Any();
// returns true
