Using regex replace to transform an expression

Is it possible to transform an expression like
{op1 == op2, #} && {op3 > op4, 1, 2} && op5 == op6
to
op1 ==_# op2 && op3 >_1_2 op4 && op5 == op6
So, every value after a comma should be appended to the operator (==, >, <, <=, etc.), each one prefixed by an underscore. opX can be any alphanumeric value.

After Qtax's comment, I wrote this solution:
var st = "{op1 == op2, #} && {op3 > op4, 1, 2} && op5 == op6";
var regex = "{.*?}";
for (var match = Regex.Match(st, regex); match.Success; match = Regex.Match(st, regex))
{
    var oldString = match.Value; // {op1 == op2, #}
    var op = oldString.Split(' ').ToList()[1].Trim(); // ==
    var csv = oldString.Split(',').Select(x => x.Trim()).ToList(); // [0] = "{op1 == op2" [1] = "#}"
    var expression = csv[0].Remove(0, 1); // op1 == op2
    csv.RemoveAt(0);
    var extension = "_" + String.Join("_", csv);
    extension = extension.Remove(extension.Length - 1); // _#
    var newString = expression.Replace(op, op + extension);
    st = st.Replace(oldString, newString);
}
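The same transformation can probably also be done in a single Regex.Replace call with a match evaluator, which is closer to what the title asks for. This is only a sketch, not the asker's method: the names ExpressionRewriter and Rewrite, the fixed operator list (==, !=, <=, >=, <, >) and the assumption that operands match \w+ are all made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class ExpressionRewriter
{
    // Assumed shape: {left op right, ext1, ext2, ...}
    // Group 1 = left operand, 2 = operator, 3 = right operand, 4 = ", ext1, ext2".
    private static readonly Regex Braced = new Regex(
        @"\{\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(\w+)\s*((?:,\s*[^,{}]+)+)\}");

    public static string Rewrite(string input)
    {
        return Braced.Replace(input, m =>
        {
            // Turn ", 1, 2" into "_1_2".
            var suffix = string.Concat(
                m.Groups[4].Value
                    .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                    .Select(part => "_" + part.Trim()));
            return m.Groups[1].Value + " " + m.Groups[2].Value + suffix + " " + m.Groups[3].Value;
        });
    }
}
```

Parts of the input that are not wrapped in braces, such as op5 == op6, are left untouched because the pattern only matches braced groups.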

Related

Get the substring of the non conditional part

I have this string for example:
2X+4+(2+2X+4X) +4
The position of the parentheses can vary. I want to find out how I can extract the part outside the parentheses. For example, I want 2X+4+4. Any suggestions?
I am using C#.

Try simple string Index and Substring operations as follows:
string s = "2X+4+(2+2X+4X)+4";
int beginIndex = s.IndexOf("(");
int endIndex = s.IndexOf(")");
string firstPart = s.Substring(0,beginIndex-1);
string secondPart = s.Substring(endIndex+1,s.Length-endIndex-1);
var result = firstPart + secondPart;
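Explanation:
Get the index of the first (
Get the index of the first )
Create two substrings: the first one ends one character before beginIndex, which also drops the operator (such as +) in front of the parenthesis
The second one starts right after endIndex and runs to the end of the string
Concatenate the two substrings to get the final result
Note that this assumes a single parenthesized group that is not at the very start of the string.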

Try the regex approach:
var str = "(1x+2)-2X+4+(2+2X+4X)+4+(3X+3)";
var regex = new Regex(@"\(\S+?\)\W?"); // matches '(1x+2)-', '(2+2X+4X)+', '(3X+3)'
var result = regex.Replace(str, ""); // replaces the parts above with empty strings: '2X+4+4+'
result = new Regex(@"\W$").Replace(result, ""); // removes the trailing operator, if any: '2X+4+4'

Try this one:
var str = "(7X+2)+2X+4+(2+2X+(3X+3)+4X)+4+(3X+3)";
var result =
    str
        .Aggregate(
            new { Result = "", depth = 0 },
            (a, x) =>
                new
                {
                    Result = a.depth == 0 && x != '(' ? a.Result + x : a.Result,
                    depth = a.depth + (x == '(' ? 1 : (x == ')' ? -1 : 0))
                })
        .Result
        .Trim('+')
        .Replace("++", "+");
// result == "2X+4+4"
This handles nested, leading, and trailing parentheses.

update specific lines in text file C#

I'm trying to update specific lines in a text file using this condition:
if a line contains a word to search for, remove only the next space
using the code below:
using (System.IO.TextReader tr = File.OpenText(@"d:\\My File3.log"))
{
    string line;
    while ((line = tr.ReadLine()) != null)
    {
        string[] items = line.Trim().Split(' ');
        foreach (var s in items)
        {
            if (s == "a" || s == "b")
                s = s.Replace(" ", "");
            using (StreamWriter tw = new StreamWriter(@"d:\\My File3.log"))
                tw.WriteLine(s);
        }
    }
}
My file looks like:
k l m
x y z a c
b d a w
The updated file should look like:
k l m
x y z ac
bd aw

I think you can do it by:
...
if (s == "a" || s == "b")
{
    if (s == "a")
        s = s.Replace("a ", "a");
    if (s == "b")
        s = s.Replace("b ", "b");
    using (StreamWriter tw = new StreamWriter(@"d:\\My File3.log"))
        tw.WriteLine(s);
}
...
SAMPLE:
string test = "a c";
test = test.Replace("a ", "a");
Console.WriteLine(test);
OUTPUT:
ac

Try this:
....
var lines = new List<string>();
while ((line = tr.ReadLine()) != null)
{
    lines.Add(line.Replace("a ", "a").Replace("b ", "b")); // just add additional .Replace() calls here
}
....
// once the reader has been closed, write everything back in one go
File.WriteAllLines(@"d:\\My File3.log", lines);

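Your problem, I think, is here:
if (s == "a" || s == "b")
    s = s.Replace(" ", "");
To satisfy the if condition, s must be exactly "a" or "b", so it cannot contain any spaces and the Replace call does nothing. (Assigning to the foreach variable s does not compile either.) Instead, work on the array by index and merge the matched token into the one that follows it:
for (int i = 0; i < items.Length - 1; i++)
{
    if (items[i] == "a" || items[i] == "b")
    {
        items[i + 1] = items[i] + items[i + 1]; // drops the space between the two tokens
        items[i] = null;                        // marks the merged token as consumed
    }
}
string updated = string.Join(" ", items.Where(t => t != null));
This removes only the space directly after the matched word, not every space that follows it.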

Are you looking for String.Replace?
string path = @"d:\My File3.log";
var data = File
    .ReadLines(path)
    .Select(line => line
        .Replace("a ", "a")
        .Replace("b ", "b"))
    .ToList(); // Materialization, since we have to write back to the same file
File.WriteAllLines(path, data);
In the general case, the condition
if line contain Word-to-search
means that a and b should be whole words (the b within abc is not the word we are looking for):
"abc a b c a" -> "abc abc a"
In that case, try using regular expressions:
string[] words = new string[] { "a", "b" };
string pattern =
    @"\b(" + string.Join("|",
        words.Select(item => Regex.Escape(item))) +
    @")\s";
var data = File
    .ReadLines(path)
    .Select(line => Regex.Replace(line, pattern, m => m.Groups[1].Value))
    .ToList();
File.WriteAllLines(path, data);
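To try the whole-word pattern without touching a file, it can be wrapped in a small helper and run against the sample lines from the question. This is just a sketch; the names WordSpaceRemover and RemoveSpaceAfter are hypothetical and not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class WordSpaceRemover
{
    // Removes the single whitespace character that directly follows any of the given whole words.
    public static string RemoveSpaceAfter(string line, params string[] words)
    {
        string pattern = @"\b(" + string.Join("|", words.Select(Regex.Escape)) + @")\s";
        return Regex.Replace(line, pattern, m => m.Groups[1].Value);
    }
}
```

For example, WordSpaceRemover.RemoveSpaceAfter("x y z a c", "a", "b") yields "x y z ac", and "b d a w" becomes "bd aw", which matches the expected file content in the question.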

You should consider a temporary variable just before the foreach loop:
int temp = 0;
foreach (var s in items)
{
    if (temp == 0)
    {
        if (s == "a" || s == "b")
        {
            temp = 1;
        }
    }
    else
    {
        s = s.Replace(" ", "");
        using (StreamWriter tw = new StreamWriter(@"d:\\My File3.log"))
            tw.WriteLine(s);
        temp = 0;
    }
}

You cannot read from and write to the same file in the same iteration.
Here is a solution using StringBuilder (with it you can manipulate the characters in the string):
using (StreamWriter tw = new StreamWriter(@"file1.txt"))
{
    using (System.IO.TextReader tr = File.OpenText(@"file.txt"))
    {
        string line;
        StringBuilder items = new StringBuilder();
        while ((line = tr.ReadLine()) != null)
        {
            items.Append(line);
            items.Replace("a ", "a");
            items.Replace("b ", "b");
            tw.WriteLine(items);
            items.Clear();
        }
    }
}

Turn boolean-expression string into the .NET code

I have logic where the customer specifies a string and my app tells the customer whether that string is present in the text, something like this:
internal const string GlobalText = "blablabla";
bool PresentInTheText(string searchString)
{
    return GlobalText.IndexOf(searchString, StringComparison.OrdinalIgnoreCase) >= 0;
}
Basically, if the text contains the passed string, return true; otherwise return false.
Now I want to make it more complex. Let's say the customer passes the string "foo && bar", and I need to return true if the text contains both the "foo" and "bar" substrings. The straightforward approach:
bool result = false; // stays false when the expression has no " && " part
if (!string.IsNullOrEmpty(passedExpression) &&
    passedExpression.Contains(" && "))
{
    var tokens = passedExpression.Split(new[] { " && " }, StringSplitOptions.RemoveEmptyEntries);
    result = true;
    foreach (var token in tokens)
    {
        if (GlobalText.IndexOf(token, StringComparison.OrdinalIgnoreCase) < 0)
        {
            result = false;
        }
    }
}
return result;
It works for expressions like A && B && C, but I want to generalize the solution to support all boolean operators.
Let's say: ("foo" && "bar") || "baz". What would be the solution?
I would take the passed string and use a regex to wrap every operand in an IndexOf(..., StringComparison.OrdinalIgnoreCase) >= 0 check, so it would look like this:
(GlobalText.IndexOf("foo", StringComparison.OrdinalIgnoreCase) >= 0 &&
GlobalText.IndexOf("bar", StringComparison.OrdinalIgnoreCase) >= 0) ||
GlobalText.IndexOf("baz", StringComparison.OrdinalIgnoreCase) >= 0
and then turn this string into a function and execute it using reflection. What would be the best solution?
ETA:
Test cases:
bool Contains(string text, string expressionString);
string text = "Customers: David, Danny, Mike, Luke. Car: BMW";
string str0 = "Luke";
string str1 = "(Danny || Jennifer) && (BMW)";
string str2 = "(Mike && BMW) || Volvo";
string str3 = "(Mike || David) && Ford";
string str4 = "David && !BMW";
Contains(text, str0); // True - the text contains "Luke"
Contains(text, str1); // True - Danny and BMW are in the text
Contains(text, str2); // True - Mike and BMW are in the text
Contains(text, str3); // False - no Ford in the text
Contains(text, str4); // False - BMW is in the text

You can solve this universally in the same way that a calculator, or a compiler, evaluates an expression:
Tokenize the string and identify each token as an operator (OP) or an operand (A, B, C, etc).
Convert the token sequence from infix (A OP B) to postfix (A B OP).
Evaluate the postfix token sequence.
Each of these steps can be done with a well-known stack-based algorithm, in linear time and space. Plus, if you use this method, it automatically extends to any binary operators you'd like to add later (addition, subtraction, fuzzy string match, etc.).
To convert from infix to postfix: http://scriptasylum.com/tutorials/infix_postfix/algorithms/infix-postfix/
To evaluate the postfix:
http://scriptasylum.com/tutorials/infix_postfix/algorithms/postfix-evaluation/
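The three steps above could be sketched roughly as follows. This is a minimal illustration under some assumptions, not a definitive implementation: the class name BooleanSearch is made up, operands are assumed to be bare words without spaces or quotes, only !, && and || are supported, and an operand evaluates to true when it occurs in the text (case-insensitively).

```csharp
using System;
using System.Collections.Generic;

static class BooleanSearch
{
    // Step 1: split the expression into "(", ")", "!", "&&", "||" and bare-word operands.
    static List<string> Tokenize(string expr)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < expr.Length)
        {
            char c = expr[i];
            if (char.IsWhiteSpace(c)) { i++; }
            else if (c == '(' || c == ')' || c == '!') { tokens.Add(c.ToString()); i++; }
            else if (c == '&' || c == '|')
            {
                if (i + 1 >= expr.Length || expr[i + 1] != c)
                    throw new FormatException("Single '" + c + "' at position " + i);
                tokens.Add(new string(c, 2));
                i += 2;
            }
            else
            {
                int start = i;
                while (i < expr.Length && !char.IsWhiteSpace(expr[i]) && "()!&|".IndexOf(expr[i]) < 0) i++;
                tokens.Add(expr.Substring(start, i - start));
            }
        }
        return tokens;
    }

    static int Precedence(string op)
    {
        return op == "!" ? 3 : op == "&&" ? 2 : op == "||" ? 1 : 0; // "(" gets 0
    }

    // Step 2: infix to postfix with an operator stack (shunting-yard).
    static List<string> ToPostfix(List<string> tokens)
    {
        var output = new List<string>();
        var ops = new Stack<string>();
        foreach (var t in tokens)
        {
            if (t == "(" || t == "!") ops.Push(t); // prefix operator: nothing to pop yet
            else if (t == ")")
            {
                while (ops.Count > 0 && ops.Peek() != "(") output.Add(ops.Pop());
                if (ops.Count == 0) throw new FormatException("Unbalanced ')'");
                ops.Pop(); // discard the matching "("
            }
            else if (t == "&&" || t == "||")
            {
                while (ops.Count > 0 && Precedence(ops.Peek()) >= Precedence(t)) output.Add(ops.Pop());
                ops.Push(t);
            }
            else output.Add(t); // operand
        }
        while (ops.Count > 0)
        {
            var op = ops.Pop();
            if (op == "(") throw new FormatException("Unbalanced '('");
            output.Add(op);
        }
        return output;
    }

    // Step 3: evaluate the postfix sequence with a stack of bools.
    public static bool Contains(string text, string expression)
    {
        var stack = new Stack<bool>();
        foreach (var t in ToPostfix(Tokenize(expression)))
        {
            if (t == "!")
            {
                if (stack.Count < 1) throw new FormatException("Missing operand for '!'");
                stack.Push(!stack.Pop());
            }
            else if (t == "&&" || t == "||")
            {
                if (stack.Count < 2) throw new FormatException("Missing operand for '" + t + "'");
                bool right = stack.Pop(), left = stack.Pop();
                stack.Push(t == "&&" ? left && right : left || right);
            }
            else stack.Push(text.IndexOf(t, StringComparison.OrdinalIgnoreCase) >= 0);
        }
        if (stack.Count != 1) throw new FormatException("Malformed expression");
        return stack.Pop();
    }
}
```

With the test cases from the question, BooleanSearch.Contains(text, "(Danny || Jennifer) && (BMW)") returns true and BooleanSearch.Contains(text, "David && !BMW") returns false. Adding another binary operator would mean extending the tokenizer, the precedence table and the evaluation switch.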

The easiest way to do this would be to parse the input text and build a collection of the values that are present (and therefore evaluate to true), so you end up with something like this:
//Dictionary<string,List<string>> members;
members["Car"].Contains("BMW") // evals to True;
Alternatively, if there's no functional difference between any of the input entries (i.e. the variable evaluates to true as long as the word shows up in the input text), you can probably just build a list of strings rather than having to worry about using their classification as the dictionary key.
Then you parse the equation strings and check whether each value is present in that collection. If it is, you replace it in the original equation string with a 1; if it is not, you replace it with a 0.
You end up with something that looks like this:
string str0 = "Luke" // "1"
string str1 = "(Danny || Jennifer) && (BMW)" // "(1 || 0) && (1)"
string str2 = "(Mike && BMW) || Volvo" // "(1 && 1) || 0"
string str3 = "(Mike || David) && Ford" // "(1 || 1) && 0"
string str4 = "David && !BMW" // "1 && !0"
Now it's just a simple iterative string replace. You loop on the string until the only thing remaining is a 1 or a 0.
while (str.Length > 1)
{
    // string.Replace returns a new string, so the result has to be assigned back
    if (str.Contains("(1 || 1)"))
        str = str.Replace("(1 || 1)", "1");
    if (str.Contains("(1 || 0)"))
        str = str.Replace("(1 || 0)", "1");
    // and so on
}
Alternatively, if you can find a C# "eval" method, you can evaluate the expression directly (and you can also use True/False instead of 0/1).
Edit:
Found a simple tokenizer that will probably work for parsing the test equations:
using System;
using System.Text.RegularExpressions;
public static string[] Tokenize(string equation)
{
    Regex RE = new Regex(@"([\(\)\! ])");
    return (RE.Split(equation));
}
// from here: https://www.safaribooksonline.com/library/view/c-cookbook/0596003390/ch08s07.html
Edit 2:
Just wrote a sample project that does it.
// this parses the input string; it does not use the classifications
List<string> members = new List<string>();
string input = "Customers: David, Danny, Mike, Luke. Car: BMW";
string[] t1 = input.Split(new string[] { ". " }, StringSplitOptions.RemoveEmptyEntries);
foreach (String t in t1)
{
    string[] t2 = t.Split(new string[] { ": " }, StringSplitOptions.RemoveEmptyEntries);
    string[] t3 = t2[1].Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries);
    foreach (String s in t3)
    {
        members.Add(s.Trim());
    }
}
This tokenizes the equation and replaces the operands with 1 and 0.
string eq = "(Danny || Jennifer) && (!BMW)";
Regex RE = new Regex(@"([\(\)\! ])");
string[] tokens = RE.Split(eq);
string eqOutput = String.Empty;
string[] operators = new string[] { "&&", "||", "!", ")", "(" };
foreach (string tok in tokens)
{
    if (tok.Trim() == String.Empty)
        continue;
    if (operators.Contains(tok))
    {
        eqOutput += tok;
    }
    else if (members.Contains(tok))
    {
        eqOutput += "1";
    }
    else
    {
        eqOutput += "0";
    }
}
At this point, the equation "(Danny || Jennifer) && (!BMW)" looks like "(1||0)&&(!1)".
Now reduce the equation to a 1 or 0.
while (eqOutput.Length > 1)
{
    if (eqOutput.Contains("!1"))
        eqOutput = eqOutput.Replace("!1", "0");
    else if (eqOutput.Contains("!0"))
        eqOutput = eqOutput.Replace("!0", "1");
    else if (eqOutput.Contains("1&&1"))
        eqOutput = eqOutput.Replace("1&&1", "1");
    else if (eqOutput.Contains("1&&0"))
        eqOutput = eqOutput.Replace("1&&0", "0");
    else if (eqOutput.Contains("0&&1"))
        eqOutput = eqOutput.Replace("0&&1", "0");
    else if (eqOutput.Contains("0&&0"))
        eqOutput = eqOutput.Replace("0&&0", "0");
    else if (eqOutput.Contains("1||1"))
        eqOutput = eqOutput.Replace("1||1", "1");
    else if (eqOutput.Contains("1||0"))
        eqOutput = eqOutput.Replace("1||0", "1");
    else if (eqOutput.Contains("0||1"))
        eqOutput = eqOutput.Replace("0||1", "1");
    else if (eqOutput.Contains("0||0"))
        eqOutput = eqOutput.Replace("0||0", "0");
    else if (eqOutput.Contains("(1)"))
        eqOutput = eqOutput.Replace("(1)", "1");
    else if (eqOutput.Contains("(0)"))
        eqOutput = eqOutput.Replace("(0)", "0");
    else
        throw new FormatException("Cannot reduce: " + eqOutput); // malformed input would otherwise loop forever
}
Now you should have a string that contains only a 1 or a 0 indicating true or false, respectively.

With the help of DynamicExpresso you can easily do this in 10 lines. Let's say the text and the user input are like this:
var text = "Bob and Tom are in the same class.";
var input = "(Bob || Alice) && Tom";
You can treat "Bob", "Alice" and "Tom" as variables of type bool in C#, so the user input string becomes a valid C# expression. Evaluate it using DynamicExpresso to get a bool result.
var variables = input.Split(new[] { "(", "||", "&&", ")", " " },
    StringSplitOptions.RemoveEmptyEntries);
var interpreter = new Interpreter();
foreach (var variable in variables)
{
    interpreter.SetVariable(variable, text.Contains(variable));
}
var result = (bool)interpreter.Parse(input).Invoke();

Make group from list string

I have a list as below:
var paths = new List<string> {
    @"rootuploaded\samplefolder\1232_234234_1.jpg",
    @"rootuploaded\samplefolder\1232_2342.jpg",
    @"rootuploaded\samplefolder\subfolder\1232_234234_1.jpg",
    @"rootuploaded\samplefolder\subfolder\1232_2342.jpg",
    @"rootuploaded\file-5.txt",
    @"rootuploaded\file-67.txt",
    @"rootuploaded\file-a.txt",
    @"rootuploaded\file1.txt",
    @"rootuploaded\file5.txt",
    @"rootuploaded\filea.txt",
    @"rootuploaded\text.txt",
    @"rootuploaded\file_sample_a.txt",
    @"rootuploaded\file2.txt",
    @"rootuploaded\file_sample.txt",
    @"rootuploaded\samplefolder\1232_234234_2.bmp",
};
How can I print output like this:
○ Group 1
rootuploaded\samplefolder\1232_234234_1.jpg
rootuploaded\samplefolder\1232_234234_2.bmp
○ Group 2
rootuploaded\file1.txt
rootuploaded\file2.txt
rootuploaded\file5.txt
○ Group 3
rootuploaded\file-5.txt
rootuploaded\file-67.txt
○ Group 4
rootuploaded\file_sample.txt
rootuploaded\file_sample_a.txt
○ Cannot be grouped
rootuploaded\samplefolder\1232_2342.jpg
rootuploaded\file-a.txt
rootuploaded\filea.txt
rootuploaded\text.txt
Files are grouped based on 6 naming conventions (with top-down priority):
1. FileName.ext, FileName_anything.ext, FileName_anythingelse.ext, ...
2. FileName.ext, FileName-anything.ext, FileName-anythingelse.ext, ...
3. FileName_1.ext, FileName_2.ext, ..., FileName_N.ext (maybe not continuous)
4. FileName-1.ext, FileName-2.ext, ..., FileName-N.ext (maybe not continuous)
5. FileName 1.ext, FileName 2.ext, ..., FileName N.ext (maybe not continuous)
6. FileName1.ext, FileName2.ext, ..., FileNameN.ext (maybe not continuous)
I used LINQ to separate them:
var groups1 = paths.GroupBy(GetFileName, (key, g) => new
{
    key = key,
    count = g.Count(),
    path = g.ToList()
}).Where(x => x.count < 5 && x.count >= 2).ToList();
public string GetFileName(string fileName)
{
    var index = 0;
    if (fileName.Contains("_"))
        index = fileName.IndexOf("_", StringComparison.Ordinal);
    else if (fileName.Contains("-"))
        index = fileName.IndexOf("-", StringComparison.Ordinal);
    var result = fileName.Substring(0, index);
    return result;
}

Try doing this:
var groups = new []
{
    new { regex = @"rootuploaded\\samplefolder\\1232_234234_\d\..{3}", grp = 1 },
    new { regex = @"rootuploaded\\file\d\.txt", grp = 2 },
    new { regex = @"rootuploaded\\file-\d+\.txt", grp = 3 },
    new { regex = @"rootuploaded\\file_sample.*\.txt", grp = 4 },
};
var results =
    from path in paths
    group path by
        groups
            .Where(x => Regex.IsMatch(path, x.regex))
            .Select(x => x.grp)
            .DefaultIfEmpty(99)
            .First()
    into gpaths
    orderby gpaths.Key
    select new
    {
        Group = gpaths.Key,
        Files = gpaths.ToArray(),
    };
That gives you the four groups, with every path that matches none of the patterns collected under key 99. You would just have to adjust the regexes until you get exactly what you want.

Sadly, conventions 1 and 2 make this difficult, because both contain 'FileName.ext', so the whole list has to be checked together :(
I tried to separate the grouping into conventions 1-2 and 3-6:
First step:
Find and remove the group 1 and 2 candidates.
Order the list by file path:
var orderedFilenames = paths.Distinct().OrderBy(p => p).ToList();
Then find the group 1 and 2 candidates:
var groupped = orderedFilenames.GroupBy(s => GetStarterFileName(s, orderedFilenames));
private static string GetStarterFileName(string fileNameMatcher, List<string> orderedFilenames)
{
    string fileNameMatcherWOExt = Path.GetFileNameWithoutExtension(fileNameMatcher);
    return orderedFilenames.FirstOrDefault(p =>
    {
        if (p == fileNameMatcher) return true;
        string p_directory = Path.GetDirectoryName(p);
        string directory = Path.GetDirectoryName(fileNameMatcher);
        if (p_directory != directory) return false;
        string pure = Path.GetFileNameWithoutExtension(p);
        if (!fileNameMatcherWOExt.StartsWith(pure)) return false;
        if (fileNameMatcherWOExt.Length <= pure.Length) return false;
        char separator = fileNameMatcherWOExt[pure.Length];
        if (separator != '_' && separator != '-') return false;
        return true;
    });
}
Step two:
After the first step you have the group 1 and 2 candidates, but all the other paths are left in separate single-item groups.
Collect the remaining paths and separate out groups 1 and 2:
var mergedGroupps = groupped.Where(grp => grp.Count() == 1).SelectMany(grp => grp);
var starterFileNameGroups = groupped.Where(grp => grp.Count() > 1);
Step three:
Now you can find conventions 3-6 based on regex matching:
var endWithNumbersGroups = mergedGroupps.GroupBy(s => GetEndWithNumber(s));
private static string GetEndWithNumber(string fileNameMatcher)
{
    string fileNameWithoutExtension = Path.Combine(Path.GetDirectoryName(fileNameMatcher), Path.GetFileNameWithoutExtension(fileNameMatcher));
    string filename = null;
    filename = CheckWithRegex(@"_(\d+)$", fileNameWithoutExtension, 1);
    if (filename != null) return filename;
    filename = CheckWithRegex(@"-(\d+)$", fileNameWithoutExtension, 1);
    if (filename != null) return filename;
    filename = CheckWithRegex(@" (\d+)$", fileNameWithoutExtension, 1);
    if (filename != null) return filename;
    filename = CheckWithRegex(@"(\d+)$", fileNameWithoutExtension);
    if (filename != null) return filename;
    return fileNameWithoutExtension;
}
private static string CheckWithRegex(string p, string filename, int additionalCharLength = 0)
{
    Regex regex = new Regex(p, RegexOptions.Compiled | RegexOptions.CultureInvariant);
    Match match = regex.Match(filename);
    if (match.Success)
        // strip the trailing number; additionalCharLength keeps the separator in the group key
        return filename.Substring(0, filename.Length - (match.Groups[0].Length - additionalCharLength));
    return null;
}
Final step:
Collect the non-grouped items and merge the group 1-2 and 3-6 candidates:
var nonGroupped = endWithNumbersGroups.Where(grp => grp.Count() == 1).SelectMany(grp => grp);
endWithNumbersGroups = endWithNumbersGroups.Where(grp => grp.Count() > 1);
var result = starterFileNameGroups.Concat(endWithNumbersGroups);
You could try to solve both steps in one shot, but as you can see the grouping mechanisms are different. My solution is not so beautiful, but I think it's clear... maybe :)

String manipulation in alternating order

I have a string
string value = "123456789";
Now I need to rearrange the string in the following way:
123456789
1 right
12 left
312 right
3124 left
53124 right
...
975312468 result
Is there a fancy LINQ one-liner to solve this?
My current (working but not so good-looking) solution:
string items = "abcdefgij";
string result = string.Empty;
for (int i = 0; i < items.Length; i++)
{
    if (i % 2 != 0)
    {
        result = result + items[i];
    }
    else
    {
        result = items[i] + result;
    }
}

string value = "123456789";
bool b = true;
string result = value.Aggregate(string.Empty, (s, c) =>
{
    b = !b;
    return b ? (s + c) : (c + s);
});
I actually don't like local variables inside LINQ statements, but in this case b helps alternate the direction. (@klappvisor showed how to live without b.)

You can use the length of res to decide which side to append to:
items.Aggregate(string.Empty, (res, c) => res.Length % 2 == 0 ? c + res : res + c);
An alternative solution would be zipping with a range:
items.Zip(Enumerable.Range(0, items.Length), (c, i) => new {C = c, I = i})
    .Aggregate(string.Empty, (res, x) => x.I % 2 == 0 ? x.C + res : res + x.C);
EDIT: ToCharArray is not really needed...

The resulting string is the characters at odd positions (1st, 3rd, ...) in reverse order, followed by the characters at even positions (2nd, 4th, ...):
string value = "123456789";
var evens = value.Where((c, i) => i % 2 == 1);
var odds = value.Where((c, i) => i % 2 == 0).Reverse();
var chars = odds.Concat(evens).ToArray();
var result = new string(chars);
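To check that the variants in this thread agree, they can be put side by side as methods and compared on the sample input. This is only a sketch; the class name Alternating and the method names are hypothetical.

```csharp
using System;
using System.Linq;

static class Alternating
{
    // Loop version from the question: even indexes are prepended, odd indexes appended.
    public static string WithLoop(string items)
    {
        string result = string.Empty;
        for (int i = 0; i < items.Length; i++)
            result = i % 2 != 0 ? result + items[i] : items[i] + result;
        return result;
    }

    // Aggregate version: the current length of the accumulator decides the side.
    public static string WithAggregate(string items)
    {
        return items.Aggregate(string.Empty, (res, c) => res.Length % 2 == 0 ? c + res : res + c);
    }

    // Where version: index 0, 2, 4, ... reversed, then index 1, 3, 5, ...
    public static string WithWhere(string items)
    {
        return new string(items.Where((c, i) => i % 2 == 0).Reverse()
                               .Concat(items.Where((c, i) => i % 2 == 1))
                               .ToArray());
    }
}
```

For "123456789" all three return "975312468". The loop and Aggregate versions build a new string per character, so for long inputs the Where version (or a StringBuilder) is likely the cheaper choice.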
