I am using Regex to replace all the tokens in a template. Everything works until the value I want to insert is $0.00: the $0 part is not inserted as literal text. The output I am getting is "Project Cost: [[ProjectCost]].00". Any idea why?
Here is an example of the code with some simplified variables.
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using Newtonsoft.Json.Linq;
using System;
using System.Collections.Generic;
using System.Security;
using System.Text.RegularExpressions;
namespace Export.Services
{
public class CommonExportService
{
private Dictionary<string, string> _formTokens;
public CommonExportService() {
_formTokens = new Dictionary<string, string> { { "EstimatedOneTimeProjectCost", "$0.00" } };
}
private string GetReplacementText(string replacementText)
{
replacementText = "Project Cost: [[EstimatedOneTimeProjectCost]]";
//replacement text = "Project Cost: [[ProjectCost]]"
foreach (var token in _formTokens)
{
var val = token.Value;
var key = token.Key;
//work around for now
//if (val.Equals("$0.00")) {
// val = "0.00";
//}
var reg = new Regex(Regex.Escape("[[" + key + "]]"));
if (reg.IsMatch(replacementText))
replacementText = reg.Replace(replacementText, SecurityElement.Escape(val ?? string.Empty));
else {
}
}
return replacementText;
//$0.00 does not replace, something is happening with the $0 before the decimal
//the output becomes Project Cost: [[EstimatedOneTimeProjectCost]].00
//The output is correct for these
//0.00 replaces correctly
//$.00 replaces correctly
//0 replaces correctly
//00 replaces correctly
//$ replaces correctly
}
}
}
Since your replacement string is built dynamically, you need to take care of the $ char in it. When $ is followed by 0, the resulting $0 is a backreference to the whole match, so the whole match is inserted in its place. That is why the token itself reappears in the output, followed by the remaining .00.
You just need to dollar-escape the $ in the value before it is passed to Replace as the replacement pattern:
replacementText = reg.Replace(replacementText, SecurityElement.Escape(val ?? string.Empty).Replace("$", "$$"));
Then your replacement pattern will contain $$0, and that will "translate" into a literal $0 string.
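To make the difference concrete, here is a minimal sketch that reproduces the problem and the fix side by side. The class and method names (DollarEscapeDemo, ReplaceToken, ReplaceTokenNaive) are made up for illustration and are not part of the code in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DollarEscapeDemo
{
    // Replaces every literal "[[key]]" in the template, treating value as plain text.
    public static string ReplaceToken(string template, string key, string value)
    {
        var pattern = Regex.Escape("[[" + key + "]]");
        // Double each '$' so "$0", "$1" or "${name}" are not read as substitutions.
        var literalValue = (value ?? string.Empty).Replace("$", "$$");
        return Regex.Replace(template, pattern, literalValue);
    }

    // Unescaped variant, kept only to reproduce the behaviour from the question.
    public static string ReplaceTokenNaive(string template, string key, string value)
    {
        var pattern = Regex.Escape("[[" + key + "]]");
        return Regex.Replace(template, pattern, value ?? string.Empty);
    }
}
```

Calling ReplaceTokenNaive("Project Cost: [[ProjectCost]]", "ProjectCost", "$0.00") gives "Project Cost: [[ProjectCost]].00", while ReplaceToken with the same arguments gives "Project Cost: $0.00". Since the search pattern here is a fully escaped literal, a plain string.Replace on the template would also sidestep the issue, because it has no substitution syntax at all.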
I have many filenames such as:
libgcc1-5.2.0-r0.70413e92.rbt.xar
python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar
u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar
I need to reliably extract the name, version and "rbt" or "norbt" from this. What is the best way? I am trying regex, something like:
(?<fileName>.*?)-(?<version>.+).(rbt|norbt).xar
The issue is that the file name and the version can both contain multiple hyphens. I am not sure there is a clean answer, but I have two questions:
What is the best strategy to extract values such as these?
How would I be able to figure out which version is greater?
Expected output is:
libgcc1, 5.2.0-r0.70413e92, rbt
python3-sqlite3, 3.4.3-r1.0.f25d9e76, rbt
u-boot-signed-pad.bin, v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57, rbt
This will give you what you want without using Regex:
// Needs: using System; using System.Collections.Generic; using System.Linq;
var fileNames = new List<string>()
{
    "libgcc1-5.2.0-r0.70413e92.rbt.xar",
    "python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar",
    "u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar"
};
foreach (var file in fileNames)
{
    // The last two '-'-separated parts hold the version plus the ".rbt.xar" / ".norbt.xar" suffix.
    var spl = file.Split('-');
    string name = string.Join("-", spl.Take(spl.Length - 2));
    string versionRbt = string.Join("-", spl.Skip(spl.Length - 2));
    string rbtNorbt = versionRbt.IndexOf("norbt") > 0 ? "norbt" : "rbt";
    string version = versionRbt.Replace($".{rbtNorbt}.xar", "");
    Console.WriteLine($"name={name};version={version};rbt={rbtNorbt}");
}
Output:
name=libgcc1;version=5.2.0-r0.70413e92;rbt=rbt
name=python3-sqlite3;version=3.4.3-r1.0.f25d9e76;rbt=rbt
name=u-boot-signed-pad.bin;version=v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57;rbt=rbt
Edit:
Or using Regex:
var m = Regex.Match(file, @"^(?<fileName>.*)-(?<version>.+-.+)\.(rbt|norbt)\.xar$");
string name = m.Groups["fileName"].Value;
string version = m.Groups["version"].Value;
string rbtNorbt = m.Groups[1].Value;
The output will be the same. Both approaches assume that "version" contains exactly one hyphen.
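As a sketch of how the regex variant could be wrapped for reuse, the helper below adds a check for names that do not match at all. The names PackageFileName, TryParse and the "kind" group are assumptions made for this example; the pattern itself is the one above with the last group given a name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PackageFileName
{
    // Same pattern as above; the rbt/norbt group is named "kind" here for readability.
    private static readonly Regex Pattern = new Regex(
        @"^(?<fileName>.*)-(?<version>.+-.+)\.(?<kind>rbt|norbt)\.xar$");

    // Returns false (and empty strings) when the file name does not fit the pattern.
    public static bool TryParse(string file, out string name, out string version, out string kind)
    {
        var m = Pattern.Match(file ?? string.Empty);
        name = m.Success ? m.Groups["fileName"].Value : string.Empty;
        version = m.Success ? m.Groups["version"].Value : string.Empty;
        kind = m.Success ? m.Groups["kind"].Value : string.Empty;
        return m.Success;
    }
}
```

The greedy (?<fileName>.*) takes as much as it can, so the version ends up being the last two hyphen-separated parts, which is the same "exactly one hyphen in the version" assumption stated above.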
I tested the following code and it works with Regex. I used the RightToLeft option.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication107
{
class Program
{
static void Main(string[] args)
{
string[] inputs = {
"libgcc1-5.2.0-r0.70413e92.rbt.xar",
"python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar",
"u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar"
};
string pattern = @"(?'prefix'.+)-(?'middle'[^-][\w+\.]+-[\w+\.]+)\.(?'extension'[^\.]+)\.xar";
foreach (string input in inputs)
{
Match match = Regex.Match(input, pattern, RegexOptions.RightToLeft);
Console.WriteLine("prefix : '{0}', middle : '{1}', extension : '{2}'",
match.Groups["prefix"].Value,
match.Groups["middle"].Value,
match.Groups["extension"].Value
);
}
Console.ReadLine();
}
}
}
I can't be the first person to have this issue but hours of searching Stack revealed nothing close to an answer. I have an SSIS script that works over a directory of csv files. This script folds, bends and mutilates these files; performs queries, data cleansing, persists some data and finally outputs a small set to csv file that is ingested by another system.
One of the files has a free text field that contains the value: "20,000 BONUS POINTS". This one field, in a file of 10k rows, one of dozens of similar files, is the problem that I can't seem to solve.
Be advised: I'm weak on both C# and Regex.
Sample csv set:
4121,6383,0,,,TRUE
4122,6384,0,"20,000 BONUS POINTS",,TRUE
4123,6385,,,,
4124,6386,0,,,TRUE
4125,6387,0,,,TRUE
4126,6388,0,,,TRUE
4127,6389,0,,,TRUE
4128,6390,0,,,TRUE
I found plenty of information on how to parse this using a variety of Regex patterns but what I've noticed is the StreamReader.ReadLine() method wraps the complete line with double quotes:
"4121,6383,0,,,TRUE"
such that the output of the regex Replace method:
s = Regex.Replace(line, @"[^\""]([^\""])*[^\""]",
m => m.Value.Replace(",", ""));
looks like this:
412163830TRUE
and the target line that actually contains a double quote delimited string ends up looking like:
"412263840\"20000 BONUS POINTS\"TRUE"
My entire method (for your reading pleasure) is this:
string fileDirectory = "C:\\tmp\\Unzip\\";
string fullPath = "C:\\tmp\\Unzip\\test.csv";
string line = "";
//int count=0;
List<string> list = new List<string>();
try
{
//MessageBox.Show("inside Try Block");
string s = null;
StreamReader infile = new StreamReader(fullPath);
StreamWriter outfile = new StreamWriter(Path.Combine(fileDirectory, "output.csv"));
while ((line = infile.ReadLine()) != null)
{
//line.Substring(0,1).Substring(line.Length-1, 1);
System.Console.WriteLine(line);
Console.WriteLine(line);
line =
s = Regex.Replace(line, @"[^\""]([^\""])*[^\""]",
m => m.Value.Replace(",", ""));
System.Console.WriteLine(s);
list.Add(s);
}
foreach (string item in list)
{
outfile.WriteLine(item);
};
infile.Close();
outfile.Close();
//System.Console.WriteLine("There were {0} lines.", count);
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
//another addition for TFS consumption
}
Thanks for reading and if you have a useful answer, bless you and your progeny for generations to come!
mfc
EDIT: The requirement is a valid csv file output. In the case of the test data, it would look like this:
4121,6383,0,,,TRUE
4122,6384,0,"20000 BONUS POINTS",,TRUE
4123,6385,,,,
4124,6386,0,,,TRUE
4125,6387,0,,,TRUE
4126,6388,0,,,TRUE
4127,6389,0,,,TRUE
4128,6390,0,,,TRUE
I recommend using a CSV reader library, as others have suggested.
Install-Package LumenWorksCsvReader
https://github.com/phatcher/CsvReader#getting-started
However, if you just want to try something quick and dirty, give this a try.
If I understand correctly, you need to remove the commas that appear between double quotes within each line of a CSV file. This should do that.
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string pattern = @"([""'])(?:(?=(\\?))\2.)*?\1";
List<string> lines = new List<string>();
lines.Add("4121,6383,0,,,TRUE");
lines.Add("4122,6384,0,\"20,000 BONUS POINTS\",,TRUE");
lines.Add("4123,6385,,,,");
lines.Add("4124,6386,0,,,TRUE");
lines.Add("4125,6387,0,,,TRUE");
lines.Add("4126,6388,0,,,TRUE");
lines.Add("4127,6389,0,,,TRUE");
lines.Add("4128,6390,0,,,TRUE");
StringBuilder sb = new StringBuilder();
foreach (var line in lines)
{
sb.Append(Regex.Replace(line, pattern, m => m.Value.Replace(",", ""))+"\n");
}
Console.WriteLine(sb.ToString());
}
}
OUTPUT
4121,6383,0,,,TRUE
4122,6384,0,"20000 BONUS POINTS",,TRUE
4123,6385,,,,
4124,6386,0,,,TRUE
4125,6387,0,,,TRUE
4126,6388,0,,,TRUE
4127,6389,0,,,TRUE
4128,6390,0,,,TRUE
https://dotnetfiddle.net/flmWG3
I haven't tried it with many lines, but this would be my first approach:
using System;
using System.Text.RegularExpressions;
namespace ConsoleTestApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            var before = "4122,6384,0,\"20,000 BONUS POINTS\",,TRUE";
            // A double quote, any run of non-quote characters, then the closing double quote.
            var pattern = @"""[^""]*""";
            var after = Regex.Replace(before, pattern, match => match.Value.Replace(",", ""));
            Console.WriteLine(after);
        }
    }
}
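For completeness, here is a hedged sketch of the same idea as a reusable per-line helper. It additionally assumes that quotes inside a quoted field are escaped by doubling them (""), which the sample data does not show, and that no field spans more than one line. The names CsvCommaStripper and StripCommasInQuotedFields are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CsvCommaStripper
{
    // Opening quote, then any run of non-quote characters or doubled quotes (""),
    // then the closing quote. Unquoted fields are never matched.
    private static readonly Regex QuotedField = new Regex(@"""(?:[^""]|"""")*""");

    // Removes commas inside quoted fields only; field separators stay untouched.
    public static string StripCommasInQuotedFields(string line)
    {
        return QuotedField.Replace(line, m => m.Value.Replace(",", ""));
    }
}
```

In the question's loop this would replace the Regex.Replace call, e.g. list.Add(CsvCommaStripper.StripCommasInQuotedFields(line)). If fields can contain line breaks, a CSV library remains the safer choice.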
Ok, so I know that questions LIKE this have been asked a lot on here, but I can't seem to make solutions work.
I am trying to take a string from a file and find the longest word in that string.
Simples.
I think the issue is down to whether I am calling my methods on a string[] or char[], currently stringOfWords returns a char[].
I am trying to then order by descending length and get the first value but am getting an ArgumentNullException on the OrderByDescending method.
Any input much appreciated.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading.Tasks;
namespace TextExercises
{
class Program
{
static void Main(string[] args)
{
var fileText = File.ReadAllText(@"C:\Users\RichardsPC\Documents\TestText.txt");
var stringOfWords = fileText.ToArray();
Console.WriteLine("Text in file: " + fileText);
Console.WriteLine("Words in text: " + fileText.Split(' ').Length);
// This is where I am trying to solve the problem
var finalValue = stringOfWords.OrderByDescending(n => n.length).First();
Console.WriteLine("Largest word is: " + finalValue);
}
}
}
Don't split the string, use a Regex
If you care about performance, you don't want to split the string. To do the split, the method has to traverse the entire string, create a new string for every item it finds and put them all into an array, which costs O(n) time plus the extra allocations. Doing an order by on top of that adds at least another O(n log n) steps.
You can use a Regex for this, which will be more efficient because it only iterates over the string once:
// \w+ matches one whole word; there is no trailing \s, so the last word in the file is not skipped.
var regex = new Regex(@"\w+", RegexOptions.Compiled);
var match = regex.Match(fileText);
var currentLargestString = "";
while (match.Success)
{
    if (match.Value.Length > currentLargestString.Length)
    {
        currentLargestString = match.Value;
    }
    match = match.NextMatch();
}
The nice thing about this is that you don't need to break the string up all at once to do the analysis. If you need to load the file incrementally, it is a fairly easy change: persist the longest word in an object and run the same loop against each string you read.
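A rough sketch of that incremental variant could look like the following. LongestWordTracker and Feed are hypothetical names chosen for this example, and it assumes the text is fed in pieces that never cut a word in half (whole lines, for instance).

```csharp
using System;
using System.Text.RegularExpressions;

public sealed class LongestWordTracker
{
    private static readonly Regex Word = new Regex(@"\w+", RegexOptions.Compiled);

    // Longest word seen so far across all chunks; empty until a word is found.
    public string Longest { get; private set; } = string.Empty;

    // Scans one chunk of text and keeps the longest word found so far.
    // On ties the earlier word wins because the comparison is strict.
    public void Feed(string chunk)
    {
        for (var m = Word.Match(chunk); m.Success; m = m.NextMatch())
        {
            if (m.Length > Longest.Length)
            {
                Longest = m.Value;
            }
        }
    }
}
```

With a file, one could create a tracker and call Feed for each line returned by File.ReadLines(path), then read tracker.Longest at the end, so the whole file never has to be in memory at once.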
If you're set on using an array, don't order it, just iterate over it
You don't need an order by if you're just looking for the largest item. The computational complexity of an order by is in most cases O(n log n), whereas iterating over the list once has a complexity of O(n):
var largest = "";
foreach(var item in strArr)
{
if(item.Length>largest.Length)
largest = item;
}
Method ToArray() in this case returns char[] which is an array of individual characters. But instead you need an array of individual words. You can get it like this:
string[] stringOfWords = fileText.Split(' ');
And you have a typo in your lambda expression (uppercase L):
n => n.Length
Try this:
var fileText = File.ReadAllText(@"C:\Users\RichardsPC\Documents\TestText.txt");
var words = fileText.Split(' ');
// Order the words (not the characters of fileText) by length and take the longest.
var finalValue = words.OrderByDescending(n => n.Length).First();
Console.WriteLine("Longest word: " + finalValue);
As suggested in the other answer, you need to split your string.
string[] stringOfWords = fileText.Split(new char[] { ',', ' ' });
// all is well, now let's loop over it and see which is the biggest
int biggest = 0;
int biggestIndex = 0;
for (int i = 0; i < stringOfWords.Length; i++)
{
    if (biggest < stringOfWords[i].Length)
    {
        biggest = stringOfWords[i].Length;
        biggestIndex = i;
    }
}
// i is out of scope here; the longest word sits at biggestIndex
return stringOfWords[biggestIndex];
What we're doing here is splitting the string on spaces (' ') or commas. You can add as many delimiters as you like to that array. Each word then gets its own slot in the resulting array.
From there, we iterate over the array. Whenever we encounter a word that is longer than the current longest word, we record its length and index.
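Put together as a self-contained method, the split-and-scan approach might look like this sketch. LongestWordFinder and Find are illustrative names, and the extra separators (tab and newline characters) plus the RemoveEmptyEntries option are additions on my part so that text read from a file with line breaks or repeated spaces behaves sensibly.

```csharp
using System;

public static class LongestWordFinder
{
    // Returns the longest word in text, or an empty string when there are no words.
    public static string Find(string text)
    {
        // Assumed delimiter set; extend it as needed for other punctuation.
        var separators = new[] { ' ', ',', '\t', '\r', '\n' };
        var words = (text ?? string.Empty)
            .Split(separators, StringSplitOptions.RemoveEmptyEntries);

        var longest = string.Empty;
        foreach (var word in words)
        {
            // Strict comparison: the first of several equally long words is kept.
            if (word.Length > longest.Length)
            {
                longest = word;
            }
        }
        return longest;
    }
}
```

For example, LongestWordFinder.Find(File.ReadAllText(path)) would return the longest word of the file in a single pass over the split result, with no sorting involved.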