C# split text when delimiter may be in values [duplicate] - c#
Given this line:
2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,"Corvallis, OR",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34
How can I use C# to split the line above into the following strings?
2
1016
7/31/2008 14:22
Geoff Dalgas
6/5/2011 22:21
http://stackoverflow.com
Corvallis, OR
7679
351
81
b437f461b3fd27387c5d8ab47a293d35
34
As you can see, one of the columns contains a comma ("Corvallis, OR").
Based on "C# Regex Split - commas outside quotes":
string[] result = Regex.Split(samplestring, ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
Use the Microsoft.VisualBasic.FileIO.TextFieldParser class. This will handle parsing a delimited file, TextReader or Stream where some fields are enclosed in quotes and some are not.
For example:
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;
string csv = "2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,\"Corvallis, OR\",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34";
// TextFieldParser is disposable, so it is wrapped in a using block.
// You can also read from a file:
// using (TextFieldParser parser = new TextFieldParser("mycsvfile.csv"))
using (TextFieldParser parser = new TextFieldParser(new StringReader(csv)))
{
    parser.HasFieldsEnclosedInQuotes = true;
    parser.SetDelimiters(",");
    while (!parser.EndOfData)
    {
        string[] fields = parser.ReadFields();
        foreach (string field in fields)
        {
            Console.WriteLine(field);
        }
    }
}
This should result in the following output:
2
1016
7/31/2008 14:22
Geoff Dalgas
6/5/2011 22:21
http://stackoverflow.com
Corvallis, OR
7679
351
81
b437f461b3fd27387c5d8ab47a293d35
34
See Microsoft.VisualBasic.FileIO.TextFieldParser for more information.
You need to add a reference to Microsoft.VisualBasic on the .NET tab of the Add Reference dialog.
This is a late answer, but it may be helpful for someone. We can use a Regex as shown below.
Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
String[] Fields = CSVParser.Split(Test);
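A minimal sketch of how this pattern might be used end to end. The class name RegexCsvSplit and the shortened sample line are assumptions made for illustration. Note that Regex.Split leaves the enclosing quotes on quoted fields, so the sketch trims them afterwards; doubled quotes inside a field are not unescaped.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class RegexCsvSplit
{
    // Matches a comma only when an even number of double quotes follows it,
    // i.e. when the comma sits outside any quoted field.
    private static readonly Regex CommaOutsideQuotes =
        new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");

    public static string[] Split(string line)
    {
        return CommaOutsideQuotes.Split(line)
            .Select(field => field.Trim().Trim('"')) // drop whitespace and enclosing quotes
            .ToArray();
    }

    public static void Main()
    {
        string test = "2,1016,Geoff Dalgas,\"Corvallis, OR\",7679";
        foreach (string field in Split(test))
        {
            Console.WriteLine(field);
        }
    }
}
```

Because the lookahead rescans the rest of the line for every comma, this approach is probably best kept for short lines.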
I see that if you paste csv delimited text in Excel and do a "Text to Columns", it asks you for a "text qualifier". It's defaulted to a double quote so that it treats text within double quotes as literal. I imagine that Excel implements this by going one character at a time, if it encounters a "text qualifier", it keeps going to the next "qualifier". You can probably implement this yourself with a for loop and a boolean to denote if you're inside literal text.
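public string[] CsvParser(string csvText)
{
    List<string> tokens = new List<string>();
    int last = -1;        // index of the previous delimiter
    int current = 0;
    bool inText = false;  // true while inside a quoted value
    while (current < csvText.Length)
    {
        switch (csvText[current])
        {
            case '"':
                inText = !inText;
                break;
            case ',':
                if (!inText)
                {
                    // Take the text between the two delimiters (without the comma itself),
                    // then strip surrounding whitespace and the enclosing quotes.
                    tokens.Add(csvText.Substring(last + 1, current - last - 1).Trim().Trim('"'));
                    last = current;
                }
                break;
            default:
                break;
        }
        current++;
    }
    // Add the final value, which has no trailing delimiter.
    if (last != csvText.Length - 1)
    {
        tokens.Add(csvText.Substring(last + 1).Trim().Trim('"'));
    }
    return tokens.ToArray();
}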
You could split on all commas that have an even number of quotes following them.
You may also want to look at the spec for the CSV format to see how commas are handled.
Useful link: C# Regex Split - commas outside quotes
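The even-quote rule can also be applied without a regex. The following is only a sketch under the assumption that quotes in the line are balanced; the class name EvenQuoteSplit is made up for illustration. It counts the quotes once, then walks the line and treats a comma as a delimiter only when the number of quotes still ahead of it is even.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not from the original answer.
public static class EvenQuoteSplit
{
    public static List<string> Split(string line)
    {
        var fields = new List<string>();
        // Number of quotes that have not been passed yet.
        int quotesRemaining = line.Count(c => c == '"');
        int start = 0;

        for (int i = 0; i < line.Length; i++)
        {
            if (line[i] == '"')
            {
                quotesRemaining--;
            }
            else if (line[i] == ',' && quotesRemaining % 2 == 0)
            {
                // An even number of quotes follows, so this comma is a delimiter.
                fields.Add(line.Substring(start, i - start));
                start = i + 1;
            }
        }

        fields.Add(line.Substring(start)); // the last field has no trailing comma
        return fields;
    }

    public static void Main()
    {
        foreach (string field in Split("a,\"b,c\",d"))
        {
            Console.WriteLine(field);
        }
    }
}
```

The fields keep their enclosing quotes, so a caller would still need to strip them and unescape doubled quotes.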
Use a library like LumenWorks to do your CSV reading. It'll handle fields with quotes in them and will likely overall be more robust than your custom solution by virtue of having been around for a long time.
It is a tricky matter to parse .csv files when the .csv file could be either comma separated strings, comma separated quoted strings, or a chaotic combination of the two. The solution I came up with allows for any of the three possibilities.
I created a method, ParseCsvRow() which returns an array from a csv string. I first deal with double quotes in the string by splitting the string on double quotes into an array called quotesArray. Quoted string .csv files are only valid if there is an even number of double quotes. Double quotes in a column value should be replaced with a pair of double quotes (This is Excel's approach). As long as the .csv file meets these requirements, you can expect the delimiter commas to appear only outside of pairs of double quotes. Commas inside of pairs of double quotes are part of the column value and should be ignored when splitting the .csv into an array.
My method will test for commas outside of double quote pairs by looking only at even indexes of the quotesArray. It also removes double quotes from the start and end of column values.
public static string[] ParseCsvRow(string csvrow)
{
    const string obscureCharacter = "ᖳ";
    if (csvrow.Contains(obscureCharacter)) throw new Exception("Error: csv row may not contain the " + obscureCharacter + " character");
    var unicodeSeparatedString = "";
    var quotesArray = csvrow.Split('"'); // Split string on double quote character
    if (quotesArray.Length > 1)
    {
        for (var i = 0; i < quotesArray.Length; i++)
        {
            // CSV must use double quotes to represent a quote inside a quoted cell
            // Quotes must be paired up
            // Even indexes hold the text outside of quote pairs, so every comma found there
            // is a delimiter. Replace those commas with the obscure unicode character.
            if (i % 2 == 0)
            {
                quotesArray[i] = quotesArray[i].Replace(",", obscureCharacter);
            }
            // Build string and replace quotes where quotes were expected.
            unicodeSeparatedString += (i > 0 ? "\"" : "") + quotesArray[i].Trim();
        }
    }
    else
    {
        // String does not have any pairs of double quotes. It should be safe to just replace the commas with the obscure character
        unicodeSeparatedString = csvrow.Replace(",", obscureCharacter);
    }
    var csvRowArray = unicodeSeparatedString.Split(obscureCharacter[0]);
    for (var i = 0; i < csvRowArray.Length; i++)
    {
        var s = csvRowArray[i].Trim();
        if (s.StartsWith("\"") && s.EndsWith("\""))
        {
            csvRowArray[i] = s.Length > 2 ? s.Substring(1, s.Length - 2) : ""; // Remove start and end quotes.
        }
    }
    return csvRowArray;
}
One downside of my approach is the way I temporarily replace delimiter commas with an obscure unicode character. This character needs to be so obscure, it would never show up in your .csv file. You may want to put more handling around this.
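One possible way to add that handling, sketched here as an assumption rather than as part of the original answer: instead of hard-coding the placeholder, pick a character from the Unicode Private Use Area that does not occur in the row. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of the original answer.
public static class PlaceholderPicker
{
    // Returns the first character in the Private Use Area (U+E000..U+F8FF)
    // that does not occur in the given row.
    public static char PickUnusedCharacter(string row)
    {
        for (char c = '\uE000'; c <= '\uF8FF'; c++)
        {
            if (row.IndexOf(c) < 0)
            {
                return c;
            }
        }
        throw new InvalidOperationException("No unused placeholder character is available.");
    }

    public static void Main()
    {
        char placeholder = PickUnusedCharacter("2,1016,\"Corvallis, OR\"");
        Console.WriteLine("U+" + ((int)placeholder).ToString("X4"));
    }
}
```

The chosen character could then stand in for the fixed obscure character when the delimiter commas are replaced, which would avoid rejecting rows that happen to contain it.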
This question and its duplicates have a lot of answers. I tried this one that looked promising, but found some bugs in it. I heavily modified it so that it would pass all of my tests.
/// <summary>
/// Returns a collection of strings that are derived by splitting the given source string at
/// characters given by the 'delimiter' parameter. However, a substring may be enclosed between
/// pairs of the 'qualifier' character so that instances of the delimiter can be taken as literal
/// parts of the substring. The method was originally developed to split comma-separated text
/// where quotes could be used to qualify text that contains commas that are to be taken as literal
/// parts of the substring. For example, the following source:
/// A, B, "C, D", E, "F, G"
/// would be split into 5 substrings:
/// A
/// B
/// C, D
/// E
/// F, G
/// When enclosed inside of qualifiers, the literal for the qualifier character may be represented
/// by two consecutive qualifiers. The two consecutive qualifiers are distinguished from a closing
/// qualifier character. For example, the following source:
/// A, "B, ""C"""
/// would be split into 2 substrings:
/// A
/// B, "C"
/// </summary>
/// <remarks>Originally based on: https://stackoverflow.com/a/43284485/2998072</remarks>
/// <param name="source">The string that is to be split</param>
/// <param name="delimiter">The character that separates the substrings</param>
/// <param name="qualifier">The character that is used (in pairs) to enclose a substring</param>
/// <param name="toTrim">If true, then whitespace is removed from the beginning and end of each
/// substring. If false, then whitespace is preserved at the beginning and end of each substring.
/// </param>
public static List<String> SplitQualified(this String source, Char delimiter, Char qualifier,
    Boolean toTrim)
{
    // Avoid throwing exception if the source is null
    if (String.IsNullOrEmpty(source))
        return new List<String> { "" };
    var results = new List<String>();
    var result = new StringBuilder();
    Boolean inQualifier = false;
    // The algorithm is designed to expect a delimiter at the end of each substring, but the
    // expectation of the caller is that the final substring is not terminated by delimiter.
    // Therefore, we add an artificial delimiter at the end before looping through the source string.
    String sourceX = source + delimiter;
    // Loop through each character of the source
    for (var idx = 0; idx < sourceX.Length; idx++)
    {
        // If current character is a delimiter
        // (except if we're inside of qualifiers, we ignore the delimiter)
        if (sourceX[idx] == delimiter && inQualifier == false)
        {
            // Terminate the current substring by adding it to the collection
            // (trim if specified by the method parameter)
            results.Add(toTrim ? result.ToString().Trim() : result.ToString());
            result.Clear();
        }
        // If current character is a qualifier
        else if (sourceX[idx] == qualifier)
        {
            // ...and we're already inside of qualifier
            if (inQualifier)
            {
                // check for double-qualifiers, which is escape code for a single
                // literal qualifier character.
                if (idx + 1 < sourceX.Length && sourceX[idx + 1] == qualifier)
                {
                    idx++;
                    result.Append(sourceX[idx]);
                    continue;
                }
                // Since we found only a single qualifier, that means that we've
                // found the end of the enclosing qualifiers.
                inQualifier = false;
                continue;
            }
            else
                // ...we found an opening qualifier
                inQualifier = true;
        }
        // If current character is neither qualifier nor delimiter
        else
            result.Append(sourceX[idx]);
    }
    return results;
}
Here are the test methods to prove that it works:
[TestMethod()]
public void SplitQualified_00()
{
// Example with no substrings
String s = "";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "" }, substrings);
}
[TestMethod()]
public void SplitQualified_00A()
{
// just a single delimiter
String s = ",";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "", "" }, substrings);
}
[TestMethod()]
public void SplitQualified_01()
{
// Example with no whitespace or qualifiers
String s = "1,2,3,1,2,3";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_02()
{
// Example with whitespace and no qualifiers
String s = " 1, 2 ,3, 1 ,2\t, 3 ";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_03()
{
// Example with whitespace and no qualifiers
String s = " 1, 2 ,3, 1 ,2\t, 3 ";
// whitespace should be preserved
var substrings = s.SplitQualified(',', '"', false);
CollectionAssert.AreEquivalent(
new List<String> { " 1", " 2 ", "3", " 1 ", "2\t", " 3 " },
substrings);
}
[TestMethod()]
public void SplitQualified_04()
{
// Example with no whitespace and trivial qualifiers.
String s = "1,\"2\",3,1,2,\"3\"";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
s = "\"1\",\"2\",3,1,\"2\",3";
substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_05()
{
// Example with no whitespace and qualifiers that enclose delimiters
String s = "1,\"2,2a\",3,1,2,\"3,3a\"";
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2,2a", "3", "1", "2", "3,3a" },
substrings);
s = "\"1,1a\",\"2,2b\",3,1,\"2,2c\",3";
substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1,1a", "2,2b", "3", "1", "2,2c", "3" },
substrings);
}
[TestMethod()]
public void SplitQualified_06()
{
// Example with qualifiers enclosing whitespace but no delimiter
String s = "\" 1 \",\"2 \",3,1,2,\"\t3\t\"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" },
substrings);
}
[TestMethod()]
public void SplitQualified_07()
{
// Example with qualifiers enclosing whitespace but no delimiter
String s = "\" 1 \",\"2 \",3,1,2,\"\t3\t\"";
// whitespace should be preserved
var substrings = s.SplitQualified(',', '"', false);
CollectionAssert.AreEquivalent(new List<String> { " 1 ", "2 ", "3", "1", "2", "\t3\t" },
substrings);
}
[TestMethod()]
public void SplitQualified_08()
{
// Example with qualifiers enclosing whitespace but no delimiter; also whitespace btwn delimiters
String s = "\" 1 \", \"2 \" , 3,1, 2 ,\" 3 \"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" },
substrings);
}
[TestMethod()]
public void SplitQualified_09()
{
// Example with qualifiers enclosing whitespace but no delimiter; also whitespace btwn delimiters
String s = "\" 1 \", \"2 \" , 3,1, 2 ,\" 3 \"";
// whitespace should be preserved
var substrings = s.SplitQualified(',', '"', false);
CollectionAssert.AreEquivalent(new List<String> { " 1 ", " 2 ", " 3", "1", " 2 ", " 3 " },
substrings);
}
[TestMethod()]
public void SplitQualified_10()
{
// Example with qualifiers enclosing whitespace and delimiter
String s = "\" 1 \",\"2 , 2b \",3,1,2,\" 3,3c \"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2 , 2b", "3", "1", "2", "3,3c" },
substrings);
}
[TestMethod()]
public void SplitQualified_11()
{
// Example with qualifiers enclosing whitespace and delimiter; also whitespace btwn delimiters
String s = "\" 1 \", \"2 , 2b \" , 3,1, 2 ,\" 3,3c \"";
// whitespace should be preserved
var substrings = s.SplitQualified(',', '"', false);
CollectionAssert.AreEquivalent(new List<String> { " 1 ", " 2 , 2b ", " 3", "1", " 2 ", " 3,3c " },
substrings);
}
[TestMethod()]
public void SplitQualified_12()
{
// Example with tab characters between delimiters
String s = "\t1,\t2\t,3,1,\t2\t,\t3\t";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_13()
{
// Example with newline characters between delimiters
String s = "\n1,\n2\n,3,1,\n2\n,\n3\n";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_14()
{
// Example with qualifiers enclosing whitespace and delimiter, plus escaped qualifier
String s = "\" 1 \",\"\"\"2 , 2b \"\"\",3,1,2,\" \"\"3,3c \"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "\"2 , 2b \"", "3", "1", "2", "\"3,3c" },
substrings);
}
[TestMethod()]
public void SplitQualified_14A()
{
// Example with qualifiers enclosing whitespace and delimiter, plus escaped qualifier
String s = "\"\"\"1\"\"\"";
// whitespace should be removed
var substrings = s.SplitQualified(',', '"', true);
CollectionAssert.AreEquivalent(new List<String> { "\"1\"" },
substrings);
}
[TestMethod()]
public void SplitQualified_15()
{
// Instead of comma-delimited and quote-qualified, use pipe and hash
// Example with no whitespace or qualifiers
String s = "1|2|3|1|2,2f|3";
var substrings = s.SplitQualified('|', '#', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2", "3", "1", "2,2f", "3" }, substrings);
}
[TestMethod()]
public void SplitQualified_16()
{
// Instead of comma-delimited and quote-qualified, use pipe and hash
// Example with qualifiers enclosing whitespace and delimiter
String s = "# 1 #|#2 | 2b #|3|1|2|# 3|3c #";
// whitespace should be removed
var substrings = s.SplitQualified('|', '#', true);
CollectionAssert.AreEquivalent(new List<String> { "1", "2 | 2b", "3", "1", "2", "3|3c" },
substrings);
}
[TestMethod()]
public void SplitQualified_17()
{
// Instead of comma-delimited and quote-qualified, use pipe and hash
// Example with qualifiers enclosing whitespace and delimiter; also whitespace btwn delimiters
String s = "# 1 #| #2 | 2b # | 3|1| 2 |# 3|3c #";
// whitespace should be preserved
var substrings = s.SplitQualified('|', '#', false);
CollectionAssert.AreEquivalent(new List<String> { " 1 ", " 2 | 2b ", " 3", "1", " 2 ", " 3|3c " },
substrings);
}
I had a problem with a CSV that contains fields with a quote character in them, so using the TextFieldParser, I came up with the following:
private static string[] parseCSVLine(string csvLine)
{
    using (TextFieldParser TFP = new TextFieldParser(new MemoryStream(Encoding.UTF8.GetBytes(csvLine))))
    {
        TFP.HasFieldsEnclosedInQuotes = true;
        TFP.SetDelimiters(",");
        try
        {
            return TFP.ReadFields();
        }
        catch (MalformedLineException)
        {
            // Double every quote that is not next to a delimiter, then try again.
            string errorLine = TFP.ErrorLine;
            StringBuilder m_sbLine = new StringBuilder();
            for (int i = 0; i < errorLine.Length; i++)
            {
                // The bounds checks keep i - 1 and i + 1 inside the string.
                if (i > 0 && i < errorLine.Length - 1 && errorLine[i] == '"' && errorLine[i + 1] != ',' && errorLine[i - 1] != ',')
                    m_sbLine.Append("\"\"");
                else
                    m_sbLine.Append(errorLine[i]);
            }
            // If the repair changed nothing, retrying would recurse forever, so rethrow.
            if (m_sbLine.ToString() == errorLine)
                throw;
            return parseCSVLine(m_sbLine.ToString());
        }
    }
}
A StreamReader is still used to read the CSV line by line, as follows:
using (StreamReader SR = new StreamReader(FileName))
{
    while (SR.Peek() > -1)
        myStringArray = parseCSVLine(SR.ReadLine());
}
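One thing to keep in mind with line-by-line reading: ReadLine splits a quoted field that itself contains a line break. A possible alternative, sketched here as an assumption rather than as the answer's method, is to hand the whole reader to TextFieldParser, which, as far as I know, keeps reading until the closing quote is found. The class name WholeFileParser and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper class, not from the original answer.
public static class WholeFileParser
{
    public static List<string[]> ReadAll(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.HasFieldsEnclosedInQuotes = true;
            parser.SetDelimiters(",");
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    public static void Main()
    {
        // The second record has a quoted field that spans two lines.
        string csv = "1,\"Corvallis, OR\"\n2,\"line one\nline two\"\n";
        foreach (string[] row in ReadAll(new StringReader(csv)))
        {
            Console.WriteLine(row.Length + " fields: " + string.Join(" | ", row));
        }
    }
}
```

For a file on disk, a StreamReader over the file can be passed in place of the StringReader.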
With Cinchoo ETL, an open source library, column values that contain separators are handled automatically.
string csv = @"2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,""Corvallis, OR"",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34";
using (var p = ChoCSVReader.LoadText(csv))
{
    Console.WriteLine(p.Dump());
}
Output:
Key: Column1 [Type: String]
Value: 2
Key: Column2 [Type: String]
Value: 1016
Key: Column3 [Type: String]
Value: 7/31/2008 14:22
Key: Column4 [Type: String]
Value: Geoff Dalgas
Key: Column5 [Type: String]
Value: 6/5/2011 22:21
Key: Column6 [Type: String]
Value: http://stackoverflow.com
Key: Column7 [Type: String]
Value: Corvallis, OR
Key: Column8 [Type: String]
Value: 7679
Key: Column9 [Type: String]
Value: 351
Key: Column10 [Type: String]
Value: 81
Key: Column11 [Type: String]
Value: b437f461b3fd27387c5d8ab47a293d35
Key: Column12 [Type: String]
Value: 34
For more information, see the CodeProject article.
Hope it helps.
Related
Splitting String of Numbers into String[]
Splitting a string of numbers separated by spaces or commas into a string array doesn't seem to work. Using the code
string numStr = "12 13 2 7 8 105 6 5 29 0";
char[] delimiterChars = { ' ', ',' };
string[] numArray = numStr.Split(delimiterChars);
I expect numArray to contain "12", "13", "2", "7", "8", "105", "6", "5", "29", "0", but instead numArray[0] = "12 13 2 7 8 105 6 5 29 0".
I've also tried using the following code, but it doesn't work either.
string[] keywords = Regex.Split(numStr, @"(?<=\d) |, ");
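For reference, a minimal sketch of the split described above. With the literal string shown in the question, String.Split on a space and a comma returns ten entries, so the problem may lie in the actual input (for example, a separator that is not a plain space; this is an assumption, since the real data is not shown). The class name NumberSplit is made up, and StringSplitOptions.RemoveEmptyEntries is added so that ", " does not produce empty entries.

```csharp
using System;

// Hypothetical helper class for illustration.
public static class NumberSplit
{
    public static string[] SplitNumbers(string numStr)
    {
        char[] delimiterChars = { ' ', ',' };
        // RemoveEmptyEntries drops the empty strings produced by adjacent delimiters such as ", ".
        return numStr.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string[] numArray = SplitNumbers("12 13 2 7 8 105 6 5 29 0");
        Console.WriteLine(numArray.Length);
        Console.WriteLine(string.Join("|", numArray));
    }
}
```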
How to submit a string in a Request using C# for Eggplant Performance
Greeting for 2022. I am currently evaluating a performance tool called Eggplant performance using a C#. The application I am testing against has a user set password that requires me to insert certain random characters of my password. Example: Password = password1 Then the application will make me insert 3characters of my password randomly. Each required field gets an ID that is between 7-9 characters long, but always longer than 6 and shorter than 10. The payload down the wire then looks like this: B806b8220=s&e210cdd9=s&cd5d5105=d&landingpage=express etc. I have been able to do the work correlating those fields and and getting the logic around that. What I am struggling with, and it is because I do not have a dev background is submitting this back down the wire. The correlated build up value looks like below: Passphrase is set as string And where i need to submit it is: Submit the Request I receive a bunch of errors all over the place. Any guidance into the right way will be much appreciated. (Using Visual Studio 2015) ////Additional Information: The code where the extraction happens. I extract 9 ID's (as the password1 is 9 characters). I then Say if the ID extracted is bigger than 6, use that plus add the correlated password piece. 
This works 100% ExtractionCursor extractionCursor41 = new ExtractionCursor(); if (response41.Find(extractionCursor41, "Enter only the required characters of", ActionType.ACT_EXIT_VU, true, SearchFlags.SEARCH_IN_BODY)) { Set("c_surephraseIDs_41", response41.ExtractList(extractionCursor41, "name=\"", "\" id=\"", "viLabel=\"Password 9\" />", true, 9, ActionType.ACT_EXIT_VU, SearchFlags.SEARCH_IN_BODY)); } if (extractionCursor41.Succeeded) { WriteMessage("Items extracted to list variable: c_surephraseIDs_41..."); List<string> valuesList = Get<List<string>>("c_surephraseIDs_41"); foreach (string listItem in valuesList) { WriteMessage(String.Format("Item: {0}", listItem)); } List<string> Array2 = Get<List<string>>("c_surephraseIDs_41"); List<string> Array1 = new List<string> { "p", "a", "s", "s", "w", "o", "r", "d", "1" }; List<string> Array3 = new List<string> { "", "", "", "", "", "", "", "", "" }; int j = 0; for (int i = 0; i < 9; i++) { if (Array2[i].Length > 6) { Array3[j] = Array2[i] + "=" + Array1[i]; WriteMessage(Array3[j]); j++; } } string Passphrase = Array3[0] + "&" + Array3[1] + "&" + Array3[2]; } // Rule: Verify that the result code matches what was recorded response41.VerifyResult(HttpStatus.OK, ActionType.ACT_WARNING); } Then in the request where this is parsed back to the website: using (Request request44 = WebBrowser.CreateRequest(HttpMethod.POST, url44, 44)) { request44.SetReferer(new Url(protocol1, onlinebankinguat3, "/absa-online/login.jsp")); request44.SetHeader("Origin", "https://onlinebankinguat3.absa.co.za"); request44.SetHeader("Content-Type", "application/x-www-form-urlencoded"); Form postData44 = new Form(); postData44.CharEncoding = Encoding.GetEncoding("ISO-8859-1"); //The below 3 are originally as per the recording. 
//postData44.AddElement(new InputElement("B3391b84d", "a", Encoding.GetEncoding("ISO-8859-1"))); //postData44.AddElement(new InputElement("B54824db9", "r", Encoding.GetEncoding("ISO-8859-1"))); //postData44.AddElement(new InputElement("fa2ebc87", "d", Encoding.GetEncoding("ISO-8859-1"))); //The below is the one I am trying to send over the wire postData44.AddElement(new InputElement("",GetString("Passphrase"), Encoding.GetEncoding("ISO-8859-1"))); // postData44.AddElement(new InputElement("landingpage", "express", Encoding.GetEncoding("ISO-8859-1"))); postData44.AddElement(new InputElement("dsp", "false", Encoding.GetEncoding("ISO-8859-1"))); postData44.AddElement(new InputElement("dspid", "0", Encoding.GetEncoding("ISO-8859-1"))); postData44.AddElement(new InputElement("dspreferer", "null", Encoding.GetEncoding("ISO-8859-1"))); postData44.AddElement(new InputElement("goto", "", Encoding.GetEncoding("ISO-8859-1"))); postData44.AddElement(new InputElement("", "", Encoding.GetEncoding("ISO-8859-1"))); postData44.AddElement(new InputElement("", "", Encoding.GetEncoding("ISO-8859-1"))); postData44.AddElement(new InputElement("nonce", "0", Encoding.GetEncoding("ISO-8859-1"))); postData44.AddElement(new InputElement("uniq", GetMillisecondsSinceEpoch(-5) /* Replaced timestamp 1641064306006 (2022-01-01T21:11:46.006000+02:00) */ , Encoding.GetEncoding("ISO-8859-1"))); request44.SetMessageBody(postData44); When it comes to building the string Passphrase I can see that it works correctly: The string builds correctly And when it comes to the sending of the request this is the response: The errors that is logged once I try send the request
I did this:
for (int i = 0; i < 9; i++)
{
    if (Array2[i].Length > 6)
    {
        Array3[j] = Array2[i] + "=" + Array1[i];
        if (j == 0)
        {
            Set<string>("PPleft1", Array2[i]);
            Set<string>("PPright1", Array1[i]);
        }
        if (j == 1)
        {
            Set<string>("PPleft2", Array2[i]);
            Set<string>("PPright2", Array1[i]);
        }
        if (j == 2)
        {
            Set<string>("PPleft3", Array2[i]);
            Set<string>("PPright3", Array1[i]);
        }
    }
}
Skipping Lines with Dashes in a text file with regex in c#
I have a text file with SQL commands. I've written some code to "ignore" the comments and blank spaces in order to get just the commands (the code, a sample of the text file and the output are below). That works fine, but the text file also has lines such as "-----------------------------------" that I need to ignore. I've written code to ignore them, but I can't figure out why it doesn't work properly.
Code:
public string[] Parser(string caminho)
{
    string text = File.ReadAllText(caminho);
    var Linha = Regex.Replace(text, @"\/\**?\*\/", " ");
    var Commands = Linha.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries)
        .Where(line => !string.IsNullOrWhiteSpace(line))
        .Where(line => !Regex.IsMatch(line, @"^[\s\-]+$"))
        .ToArray();
}
This is the .Where I added to "ignore" the dashed lines:
.Where(line => !Regex.IsMatch(line, @"^[\s-]+$"))
Sample of text with the dashes:
/
---------------------------------------------------------------------
UPDATE CDPREPORTSQL SET COMANDOSQL_FROM = 'SELECT DESCONTO,EMPCOD,EMPDSC,LINVER,NOMESISTEMA,OBS,ORCCOD,ORCVER,PEDCOD,PEDDSC, ROUND(PRCUNIT*#CAMBIO#,5) PRCUNIT, ROUND(PRCUNITSEMDESC*#CAMBIO#,5) PRCUNITSEMDESC, PROPCHECK,QTDGLOB,QTDPROP,REFCOD,REFDSC,EMPCODVER, COEFGERAL_PLT FROM #OWNER#.VW_PROPOSTAS', COMANDOSQL_WHERE = 'WHERE ORCCOD=#ORCCOD# AND ORCVER=#ORCVER# AND NOMESISTEMA=#NOMESISTEMA# AND PEDCOD=#MYCOD#' WHERE REPID = 'CDP0000057'
/
---------------------------------------------------------------------
Sample of the output:
---------------------------------------------------------------------
UPDATE CDPREPORTSQL SET COMANDOSQL_FROM = 'SELECT DESCONTO,EMPCOD,EMPDSC,LINVER,NOMESISTEMA,OBS,ORCCOD,ORCVER,PEDCOD,PEDDSC, ROUND(PRCUNIT*#CAMBIO#,5) PRCUNIT, ROUND(PRCUNITSEMDESC*#CAMBIO#,5) PRCUNITSEMDESC, PROPCHECK,QTDGLOB,QTDPROP,REFCOD,REFDSC,EMPCODVER, COEFGERAL_PLT FROM #OWNER#.VW_PROPOSTAS', COMANDOSQL_WHERE = 'WHERE ORCCOD=#ORCCOD# AND ORCVER=#ORCVER# AND NOMESISTEMA=#NOMESISTEMA# AND PEDCOD=#MYCOD#' WHERE REPID = 'CDP0000057'
---------------------------------------------------------------------
These are examples of statements that can occur and that I need to process:
/* */
UPDATE Orc /*UPDATE comando */ set MercadoInt = 'N', Coef_KrMo = 1, Coef_KrMt = 1, Coef_KrEq = 1, Coef_KrSb = 1, Coef_KrGb = 1, Coef_MDEmp = 1, Coef_MDLoc = 1, Abrv_MDLoc = '', Dsc_MDLoc = '', Arred_MDLoc = 'N', Arred_NDecs = 0 WHERE MercadoInt IS NULL
/
Another one:
/* */
---- comment
UPDATE Orc set MercadoInt = 'N', Coef_KrMo = -1, Coef_KrMt = 1, Coef_KrEq = 1, Coef_KrSb = 1, Coef_KrGb = 1, Coef_MDEmp = 1, Coef_MDLoc = 1, Abrv_MDLoc = '', Dsc_MDLoc = '', Arred_MDLoc = 'N', Arred_NDecs = 0 WHERE MercadoInt IS NULL
/
And another one:
/* */
UPDATE Orc set MercadoInt = 'N', Coef_KrMo = 1, Coef_KrMt = 1, Coef_KrEq = 1, Coef_KrSb = 1, Coef_KrGb = 1, Coef_MDEmp = 1, Coef_MDLoc = 1, Abrv_MDLoc = '', Dsc_MDLoc = '', Arred_MDLoc = 'N', Arred_NDecs = 0 WHERE MercadoInt IS NULL
/
Note that I need to process them even if there are commented sections in the middle of the statement.
Note that everything else is working fine (it "ignores" the comments and blank spaces).
The '/' is just there to divide the commands in the text file.
All this seems rather complex and slow. If you just want to find/reject lines of dashes, why not use:
if (line.StartsWith("----"))
(Assuming that 4 dashes is sufficient to detect such lines unambiguously)
If there may be whitespace at the start of the line, then:
if (line.Trim().StartsWith("----"))
Not only is this approach far more readable than regex, it will most probably be much faster.
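A minimal sketch of that check applied to a whole text, assuming the input uses "\n" line endings and that four leading dashes identify a separator line; the class and method names are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper class for illustration.
public static class DashLineFilter
{
    public static string[] RemoveDashLines(string text)
    {
        return text.Split('\n')
            // Trim first so that indented dash lines are rejected as well.
            .Where(line => !line.Trim().StartsWith("----", StringComparison.Ordinal))
            .ToArray();
    }

    public static void Main()
    {
        string text = "UPDATE Orc SET X = 1\n  ---------------\nWHERE Y IS NULL";
        foreach (string line in RemoveDashLines(text))
        {
            Console.WriteLine(line);
        }
    }
}
```

Note that this rejects any line that starts with four dashes, including a "---- comment" line, which may or may not be what is wanted.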
The code below works on the examples you gave.
private const string DashComment = @"(^|\s+)--.*(\n|$)";
private const string SlashStarComment = @"\/\*.*?\*\/";
private string[] CommandSplitter(string text)
{
    // strip /* ... */ comments
    var strip1 = Regex.Replace(text, SlashStarComment, " ", RegexOptions.Multiline);
    // strip -- ... comments
    var strip2 = Regex.Replace(strip1, DashComment, "\n", RegexOptions.Multiline);
    // split into individual commands separated by '/'
    var commands = strip2.Split(new[] {'/'}, StringSplitOptions.RemoveEmptyEntries);
    return commands.Where(line => !String.IsNullOrWhiteSpace(line))
        .ToArray();
}
I took the three examples you posted in your question and put them in a single string. It looks like this (yeah, it's ugly):
private const string Test1 = @"/* */
UPDATE Orc /*UPDATE comando */ set MercadoInt = 'N', Coef_KrMo = 1, Coef_KrMt = 1, Coef_KrEq = 1, Coef_KrSb = 1, Coef_KrGb = 1, Coef_MDEmp = 1, Coef_MDLoc = 1, Abrv_MDLoc = '', Dsc_MDLoc = '', Arred_MDLoc = 'N', Arred_NDecs = 0 WHERE MercadoInt IS NULL
/
/* */
---- comment
UPDATE Orc set MercadoInt = 'N', Coef_KrMo = -1, Coef_KrMt = 1, Coef_KrEq = 1, Coef_KrSb = 1, Coef_KrGb = 1, Coef_MDEmp = 1, Coef_MDLoc = 1, Abrv_MDLoc = '', Dsc_MDLoc = '', Arred_MDLoc = 'N', Arred_NDecs = 0 WHERE MercadoInt IS NULL
/
/* */
UPDATE Orc set MercadoInt = 'N', Coef_KrMo = 1, Coef_KrMt = 1, Coef_KrEq = 1, Coef_KrSb = 1, Coef_KrGb = 1, Coef_MDEmp = 1, Coef_MDLoc = 1, Abrv_MDLoc = '', Dsc_MDLoc = '', Arred_MDLoc = 'N', Arred_NDecs = 0 WHERE MercadoInt IS NULL
/";
Then, I called the CommandSplitter:
var result = CommandSplitter(Test1);
And output the results:
foreach (var t in result)
{
    Console.WriteLine(t);
    Console.WriteLine("////////////////////////");
}
That removed the /* ... */ comments and the -- ... comments.
It also worked on this example:
private const string Test2 =
    "Update Orc set /* this is a comment */ MercadoInt = 'N' -- this is another comment\n" +
    "Where MercadoInt is NULL --another comment";
The output:
Update Orc set MercadoInt = 'N'
Where MercadoInt is NULL
Update
The code above returns an array of commands. Each command consists of multiple lines. If you want to remove extraneous spaces at the beginning of lines and eliminate blank lines, then you have to process each individual command separately. So you'd want to extend the CommandSplitter like this:
private string[] CommandSplitter(string text)
{
    // strip /* ... */ comments
    var strip1 = Regex.Replace(text, SlashStarComment, " ", RegexOptions.Multiline);
    // strip -- ... comments
    var strip2 = Regex.Replace(strip1, DashComment, "\n", RegexOptions.Multiline);
    // split into individual commands separated by '/'
    var commands = strip2.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
    // trim every line of every command and drop the blank lines
    return commands.Select(cmd => cmd.Split(new[] {'\n'})
            .Select(l => l.Trim()))
        .Select(lines => string.Join("\n", lines.Where(l => !string.IsNullOrWhiteSpace(l))))
        .ToArray();
}
From what I understand, you have a text file with multiple SQL commands, separated by:
/ ---------------------------------------------------------------------
and you only want the text in between these separators. If so, why not split the text with Regex.Split and then process each element? This regex seems to work:
\/\n\n-+
Based on the Regex.Split documentation, the code would be:
string input = File.ReadAllText(caminho);
// Verbatim string, so the backslashes reach the regex engine unchanged
// ("\/" is not a valid escape sequence in a regular C# string).
string pattern = @"\/\n\n-+";
string[] substrings = Regex.Split(input, pattern);
foreach (string match in substrings)
{
    // do cool stuff with your cool query
}
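A small self-contained sketch of this split, run against made-up sample text. It assumes each separator is a '/', a line break, an empty line and then a run of dashes, and it additionally allows "\r\n" line endings (an assumption, since the real file layout is not shown). The class name SeparatorSplit is hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class for illustration.
public static class SeparatorSplit
{
    public static string[] SplitCommands(string input)
    {
        // '/', then a line break, an empty line, and one or more dashes.
        const string pattern = @"/\r?\n\r?\n-+";
        return Regex.Split(input, pattern)
            .Select(part => part.Trim())
            .Where(part => part.Length > 0) // drop the empty piece after the last separator
            .ToArray();
    }

    public static void Main()
    {
        string input = "UPDATE A SET X = 1\n/\n\n-----\nUPDATE B SET Y = 2\n/\n\n-----\n";
        foreach (string command in SplitCommands(input))
        {
            Console.WriteLine("[" + command + "]");
        }
    }
}
```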
If you don't want to use regex, you could also use !line.TrimStart().StartsWith("-"). It should give the same result, and I think it is faster.
I've written the code like this; so far it is working well.
public string[] Parser(string caminho)
{
    List<string> Commands2 = new List<string>();
    string text = File.ReadAllText(caminho);
    var Linha = Regex.Replace(text, @"\/\**?\*\/", " ");
    var Commands = Linha.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries)
        .Where(line => !string.IsNullOrWhiteSpace(line))
        .Where(line => !Regex.IsMatch(line, @"^[\s\-]+$"))
        .ToArray();
    Commands2 = Commands.ToList();
    for (int idx = 0; idx < Commands2.Count; idx++)
    {
        if (Commands2[idx].TrimStart().StartsWith("-"))
        {
            string linha = Commands2[idx];
            string linha2 = linha.Remove(linha.IndexOf('-'), linha.LastIndexOf('-') - 1);
            Commands2[idx] = linha2;
        }
    }
    // test the output to a .txt file
    StreamWriter Comandos = new StreamWriter(Directory.GetParent(caminho).ToString() + "Out.txt", false);
    foreach (string linha in Commands2)
    {
        Comandos.Write(linha);
    }
    Comandos.Close();
    return Commands2.ToArray();
}
After they analyzed my code, they said that I can't use this (as mentioned above) because it won't work for some cases, such as comments in the middle of statements. I will now try doing it with TSql120Parser.
Split strings that have strange pattern
I need help splitting a collection of strings that follow a rather strange pattern.
Example data:
List<string> input = new List<string>();
input.Add("Blue Code \n 03 ID \n 05 Example \n Sky is blue");
input.Add("Green Code\n 01 ID\n 15");
input.Add("Test TestCode \n 99 \n Testing is fun");
Expected output:
For input[0]:
string part1 = "Blue"
string part2 = "Code \n 03"
string part3 = "ID \n 05"
string part4 = "Example \n Sky is blue"
For input[1]:
string part1 = "Green"
string part2 = "Code\n 01"
string part3 = "ID\n 15"
For input[2]:
string part1 = "Test"
string part2 = "TestCode \n 99"
string part3 = "\n Testing is fun"
Edited with one more example:
"038 038\n 0004 049.0\n 0006"
Expected output:
"038"
"038\n 0004"
"049.0\n 0006"
In short, I don't even know how to describe the pattern... It seems that the first string (acting as a key) right before the "\n" has to be part of the new string, but input[2] has a slightly different pattern from the other two. Also, please take note of the spaces; they are extremely inconsistent.
I know this is a long shot, but please let me know if anyone can figure out how to deal with this data.
Updated:
I think I can forget about solving this... When I actually looked at the database in detail, I found out that there is NOT only \n; it can be... anything, including |a |b |c (from a-z, A-Z) and \a \b \c (from a-z, A-Z). Manually re-entering the data could be much easier...
I would say the pattern is:
List<string> input = new List<string>();
input.Add("Blue Code \n 03 ID \n 05 Example \n Sky is blue");
input.Add("Green Code\n 01 ID\n 15");
input.Add("Test TestCode \n 99 \n Testing is fun");
foreach (string text in input)
{
    var parts = new List<string>();
    //1 Take first word
    string part1 = text.Split(' ')[0];
    parts.Add(part1);
    string rest = text.Substring(part1.Length).TrimStart(' ');
    //while rest contains (\n number)
    while (rest.Contains("\n"))
    {
        //Take until \n (inclusive)
        int index = rest.IndexOf('\n');
        string partNa = rest.Substring(0, index + 1);
        //Look at the first token after \n
        string temp = rest.Substring(index + 1);
        string token = temp.TrimStart(' ').Split(' ')[0];
        int n;
        if (!int.TryParse(token, out n))
        {
            //not a number: everything left belongs to the last part
            break;
        }
        //keep the spaces between \n and the number
        string partNb = temp.Substring(0, temp.IndexOf(token) + token.Length);
        string partN = partNa + partNb;
        parts.Add(partN);
        rest = rest.Substring(partN.Length).TrimStart(' ');
    }
    //Take rest
    if (rest.Length > 0)
    {
        parts.Add(rest);
    }
}
It could probably be written a bit more optimised, but you get the idea.
OK, I have this little code snippet that generates the output you are looking for. The pattern seems to be:
Word [Key \n Value] [Key \n Value] [Key \n Value (With Spaces)]
where the Key can be empty. Is that right?
var input = new List<string>
{
    "Blue Code \n 03 ID \n 05 Example \n Sky is blue",
    "Green Code\n 01 ID\n 15",
    "038 038\n 0004 049.0\n 0006",
    "Test TestCode \n 99 \n Testing is fun"
};
var output = new List<List<string>>();
foreach (var item in input)
{
    var items = new List<string> {item.Split(' ')[0]};
    const string strRegex = @"(?<group>[a-zA-Z0-9\.]*\s*\n\s*[a-zA-Z0-9\.]*)";
    var myRegex = new Regex(strRegex, RegexOptions.None);
    var matchCollection = myRegex.Matches(item.Remove(0, item.Split(' ')[0].Length));
    for (var i = 0; i < 2; i++)
    {
        if (matchCollection[i].Success)
        {
            items.Add(matchCollection[i].Value);
        }
    }
    var index = item.IndexOf(items.Last()) + items.Last().Length;
    var final = item.Substring(index);
    if (final.Contains("\n"))
    {
        items.Add(final);
    }
    else
    {
        items[items.Count - 1] = items[items.Count - 1] + final;
    }
    output.Add(items);
}
how do I replace numbers using regex?
I am using the C# regex library to do some text find and replace. I would like to change the following:
1 -> one
11 -> one one
123 -> one two three
For example, here's my code to replace an ampersand:
string pattern = "[&]";
string replacement = " and ";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(text, replacement);
Edit
I found some great examples of .NET RegEx on MSDN: http://msdn.microsoft.com/en-us/library/kweb790z.aspx
Since you're specifically asking for a regex, you could do something like this:
var digits = new Dictionary<string, string>
{
    { "0", "zero" },
    { "1", "one" },
    { "2", "two" },
    { "3", "three" },
    { "4", "four" },
    { "5", "five" },
    { "6", "six" },
    { "7", "seven" },
    { "8", "eight" },
    { "9", "nine" }
};
var text = "this is a text with some numbers like 123 and 456";
text = Regex.Replace(text, @"\d", x => digits[x.Value]);
which will give you
this is a text with some numbers like onetwothree and fourfivesix
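The replacement above runs the words together, while the question asks for "one two three". A small variation, offered as a sketch rather than the answerer's method, is to match each whole run of digits and join the words with spaces. The helper name SpellDigits is made up for illustration. It uses [0-9] instead of \d because \d in .NET also matches digits from other scripts, which would not be found in the dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Digit-to-word table, keyed by character so each matched digit can be looked up directly.
var words = new Dictionary<char, string>
{
    { '0', "zero" }, { '1', "one" }, { '2', "two" }, { '3', "three" }, { '4', "four" },
    { '5', "five" }, { '6', "six" }, { '7', "seven" }, { '8', "eight" }, { '9', "nine" }
};

// Hypothetical helper: replace every run of ASCII digits with its
// digits spelled out, separated by single spaces.
string SpellDigits(string text) =>
    Regex.Replace(text, "[0-9]+", m => string.Join(" ", m.Value.Select(c => words[c])));

Console.WriteLine(SpellDigits("1, 11 and 123"));
// prints: one, one one and one two three
```

Matching the whole run in one go keeps the spaces between the words only, so the text around the number is left exactly as it was.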