Objects to be comma separated and with double quote - c#

I have an object array:
object[] keys
I need to transform this array into a comma-separated string, which I did like this:
var newKeys = string.Join(",", keys);
My problem is that I want these values to be double quoted.
For example:
"value1","value2","value3"

var newKeys = "\"" + string.Join("\",\"", keys) + "\"";
To include a double quote in a string, you escape it with a backslash character. Thus "\"" is a string consisting of a single double quote character, and "\",\"" is a string containing a double quote, a comma, and another double quote.
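One thing the one-liner does not cover is an empty array, where it yields a lone pair of quotes. A minimal sketch that guards against this is shown here; the class and method names (QuoteJoinDemo, QuoteAndJoin) are made up for illustration, and returning an empty string for empty input is an assumption about what you want:

```csharp
using System;

public static class QuoteJoinDemo
{
    // Wraps every element in double quotes and separates the elements with commas.
    public static string QuoteAndJoin(object[] keys)
    {
        // Assumption: an empty or missing array should give an empty string.
        // Without this check the expression below would return "" (two quote characters).
        if (keys == null || keys.Length == 0)
        {
            return string.Empty;
        }

        // "\",\"" closes one quoted value, adds the comma, and opens the next value.
        return "\"" + string.Join("\",\"", keys) + "\"";
    }
}
```

Calling QuoteJoinDemo.QuoteAndJoin(new object[] { "value1", "value2", "value3" }) should give "value1","value2","value3".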

Please give this a try.
var keys = new object[] { "test1", "hello", "world", null, "", "oops"};
var csv = string.Join(",", keys.Select(k => string.Format("\"{0}\"", k)));
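Because you have an object[] array, string.Format can deal with null as well as with types other than string. This solution also works in .NET 3.5 if you append .ToArray() to the Select call, because the string.Join overload that accepts an IEnumerable<string> was only added in .NET 4.
When the object[] array is empty, an empty string is returned.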

If performance is the key, you can always use a StringBuilder to concatenate everything.
Here's a fiddle to see it in action, but the main part can be summarized as:
// these look like snails, but they are actually pretty fast
using @_____ = System.Collections.Generic.IEnumerable<object>;
using @______ = System.Func<object, object>;
using @_______ = System.Text.StringBuilder;
public static string GetCsv(object[] input)
{
// use a string builder to make things faster
var @__ = new StringBuilder();
// the rest should be self-explanatory
Func<@_____, @______, @_____>
@____ = (_6,
_2) => _6.Select(_2);
Func<@_____, object> @_3 = _6
=> _6.FirstOrDefault();
Func<@_____, @_____> @_4 = _8
=> _8.Skip(input.Length - 1);
Action<@_______, object> @_ = (_9,
_2) => _9.Append(_2);
Action<@_______>
@___ = _7 =>
{ if (_7.Length > 0) @_(
@__, ",");
}; var @snail =
@____(input, (@_0 =>
{ @___(@__); @_(@__, @"""");
@_(@__, @_0); @_(@__, @"""");
return @__; }));
var @linq = @_4(@snail);
var @void = @_3(@linq);
// get the result
return #__.ToString();
}

Related

Unexpected regex result with a single space

Can somebody please tell me why a space comes up with 2 matches for the below pattern?
((?<key>(?:((?!\d)\w+(?:\.(?!\d)\w+)*)\.)?((?!\d)\w+)):(?<value>([^ "]+)|("[^"]*?")+))*
Trying to match the following cases:
var body = "Key:Hello";
var body = "Key:\"Hello\"";
var body = "Key1:Hello Key2:\"Goodbye\"";
This may provide more context:
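pattern = @"((?<key>" + StringExtensions.REGEX_IDENTIFIER_MIDSTRING + "):(?<value>([^ \"]+)|(\"[^\"]*?\")+))*";
My goal is to pull the keys and values out of a command-line-like string in the form of [key]:[value] with optional repeats. Values can either be a word with no spaces or be in quotes with spaces.
Probably right there in front of me but I'm not seeing it.
Probably because of the “.”: a period in a regex matches every character except line breaks. Note, though, that every period in this pattern is escaped (\.); the more likely cause is the outer (...)* quantifier, which lets the whole pattern match an empty string, once before the space and once after it.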
I took a different approach:
public static Dictionary<string, string> GetCommandLineKeyValues(this string commandLine)
{
var keyValues = new Dictionary<string, string>();
var pattern = @"(?<command>(" + StringExtensions.REGEX_IDENTIFIER + " )?)(?<args>.*)";
var args = commandLine.RegexGet(pattern, "args");
Match match;
if (args.Length > 0)
{
string key;
string value;
pattern = @" ?(?<key>" + StringExtensions.REGEX_IDENTIFIER_MIDSTRING + ")*?:(?<value>([^ \"]+)|(\"[^\"]*?\")+)";
do
{
match = args.RegexGetMatch(pattern);
if (match == null)
{
break;
}
key = match.Groups["key"].Value;
value = match.Groups["value"].Value;
keyValues.Add(key, value);
args = match.Replace(args, string.Empty);
}
while (args.RegexIsMatch(pattern));
}
return keyValues;
}
I took what I call the "pac-man" approach to Regex: match, eat (hence the Match.Replace), and continue matching.
For convenience:
public const string REGEX_IDENTIFIER = @"^(?:((?!\d)\w+(?:\.(?!\d)\w+)*)\.)?((?!\d)\w+)$";
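If you would rather not eat the string piece by piece, Regex.Matches can walk over all key:value pairs in one pass. The following is only a sketch under a few assumptions: the key pattern is a simplified stand-in (not the REGEX_IDENTIFIER_MIDSTRING constant, which is not shown here), the names KeyValueDemo and Parse are made up, and surrounding quotes are stripped from values, which the code above does not do.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeyValueDemo
{
    // Simplified, hypothetical pattern: a key made of word characters and dots,
    // a colon, then either a quoted value or a run of characters without spaces or quotes.
    // There is no outer (...)* here, so the pattern can never match an empty string.
    private static readonly Regex Pair = new Regex(
        @"(?<key>[A-Za-z_][\w.]*):(?<value>""[^""]*""|[^\s""]+)");

    public static Dictionary<string, string> Parse(string commandLine)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Pair.Matches(commandLine))
        {
            // Trim('"') removes the surrounding quotes of a quoted value.
            // Using the indexer means a repeated key overwrites instead of throwing.
            result[m.Groups["key"].Value] = m.Groups["value"].Value.Trim('"');
        }
        return result;
    }
}
```

Because the pattern always has to consume at least a key, a colon and a value, an input consisting of a single space produces no matches at all.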

Replacing commas in a string with brackets and commas if they don't exist

I am trying to manipulate and clean up a string of database columns as follows.
Example Source string(s):
[foo],[bar],baz
[foo],bar,[baz]
[foo],[bar,[baz]
[foo],bar],[baz]
foo,bar,baz
(and so on)
Expected output:
[foo],[bar],[baz]
I have tried to run the following regex substitutions over the string:
string columnString = "[foo],[bar],baz";
if (!Regex.IsMatch(columnString, @"^\[.*"))
{
columnString = string.Concat("[", columnString);
}
if (!Regex.IsMatch(columnString, @"^.*\]$"))
{
columnString = string.Concat(columnString,"]");
}
while (!Regex.IsMatch(columnString, @"^.*\],.*$"))
{
columnString = Regex.Replace(columnString, @",", @"],");
}
while (!Regex.IsMatch(columnString, @"^.*,\[.*$"))
{
columnString = Regex.Replace(columnString, @"\],", @"],[");
}
While this fixes up the leading and trailing brackets, it (obviously) doesn't deal with the commas where there is already an existing match in the string.
Can anyone suggest a method that would clean this up? (It doesn't have to be regex.)
Cheers
I suggest a splitting and string rebuilding solution:
var result = string.Join(
",",
s.Split(',') // split with commas
.Select(x => !x.StartsWith("[") && !x.EndsWith("]") ? $"[{x}]" : x ) // add [ ] to items not starting and ending with [ ]
);
See C# demo:
var strs = new List<string> { "[foo],[bar],baz", "[foo],bar,[baz]", "foo,bar,baz" };
foreach (var s in strs)
{
var result = string.Join(",", s.Split(',').Select(x => !x.StartsWith("[") && !x.EndsWith("]") ? $"[{x}]" : x ));
Console.WriteLine(result);
}
Output:
[foo],[bar],[baz]
[foo],[bar],[baz]
[foo],[bar],[baz]
Updated
As there may be items with either [ at the start or a ] at the end you may use
var result = string.Join(
",",
s.Split(',')
.Select(x => !x.StartsWith("[") || !x.EndsWith("]") ?
$"[{Regex.Replace(x, @"^\[|]$", "")}]" : x
)
);
See this C# demo. Result:
[foo],[bar],[baz],[test]
[foo],[bar],[baz],[test]
[foo],[bar],[baz]
Note that Regex.Replace(x, @"^\[|]$", "") removes a [ at the start and ] at the end of the string.
string str = "[foo],[bar],baz";
str = "[" + str.Replace("[", "").Replace("]", "").Replace(",", "],[") + "]";
Use StringBuilder if possible. I just gave you an idea using String class.
If you want to use regular expression, here is the answer:
var input = "[foo],bar,[baz]";
var regex = new Regex("((\\[?)((foo)|(bar)|(baz))(\\]?))");
var result = regex.Replace(input, "[$3]");
Please see: https://dotnetfiddle.net/Afnn3m
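The split-and-rebuild idea can also be written without any regex by trimming stray brackets from each item before wrapping it again. This is just a sketch: ColumnDemo and BracketColumns are illustrative names, and it assumes column names never contain commas or brackets themselves and that surrounding whitespace should be dropped.

```csharp
using System.Linq;

public static class ColumnDemo
{
    public static string BracketColumns(string columns)
    {
        // Split on commas, strip whitespace and any leading '[' or trailing ']',
        // then wrap every item in exactly one pair of brackets.
        return string.Join(",",
            columns.Split(',')
                   .Select(c => "[" + c.Trim().TrimStart('[').TrimEnd(']') + "]"));
    }
}
```

This handles the half-bracketed inputs from the question as well, so "[foo],[bar,[baz]" and "[foo],bar],[baz]" both come out as [foo],[bar],[baz].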

Check a pattern in a string then convert it to upper case

I was not clear with my previous question
I have a list: new List<string> { "lts", "mts", "cwts", "rotc" };
Now I want to check for a pattern in a string that starts or ends with a forward slash, like this: "cTws/Rotc/lTs" or "SomethingcTws cWtS/Rotc rOtc",
and convert to upper case only the strings that start or end with a forward slash, based on the list that I have.
So the output should be: "CWTS/ROTC/LTS", "SomethingcTws CWTS/ROTC rOtc"
I modified Sachin's answer:
List<string> replacementValues = new List<string>
{
"cwts",
"mts",
"rotc",
"lts"
};
string pattern = string.Format(@"\G({0})/?", string.Join("|", replacementValues.Select(x => Regex.Escape(x))));
Regex regExp = new Regex(pattern, RegexOptions.IgnoreCase);
string value = "Cwts/Rotc Somethingcwts1 Cwts/Rotc/lTs";
string result = regExp.Replace(value, s => s.Value.ToUpper());
Result: CWTS/ROTC Somethingcwts1 Cwts/Rotc/lTs
The desired output should be: CWTS/ROTC Somethingcwts1 CWTS/ROTC/LTS
So instead of using Regex, which I'm not really good with, I'm doing split by space then split by "/" then rejoin the strings
string val = "Somethingrotc1 cWts/rOtC/lTs Cwts";
List<string> replacementValues = new List<string>
{
"lts", "mts",
"cwts", "rotc"
};
string[] tokens = val.Split(new char[] { ' ' }, StringSplitOptions.None);
string result = string.Join(" ", tokens.Select(t =>
{
// Now split by "/"
string[] ts = t.Split(new char[] { '/' }, StringSplitOptions.None);
if (ts.Length > 1)
{
t = string.Join("/", ts.Select(x => replacementValues.Contains(x.ToLower()) ? x.ToUpper() : x));
}
return t;
}));
Output: Somethingrotc1 CWTS/ROTC/LTS Cwts
You want to change the specific words in the string to Upper case. Then you can use Regex to achieve it.
string value = "Somethingg1 Cwts/Rotc/Lts Cwts";
var replacementValues = new Dictionary<string, string>()
{
{"Cwts","CWTS"},
{"Rotc","ROTC"},
{"Lts","LTS"}
};
var regExpression = new Regex(String.Join("|", replacementValues.Keys.Select(x => Regex.Escape(x))));
var outputString = regExpression.Replace(value, s => replacementValues[s.Value]);
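The two ideas can be combined: let a regex find only the whitespace-delimited tokens that contain a slash, and upper-case the listed words inside those tokens. A rough sketch follows, assuming tokens are separated by whitespace and matching should ignore case; UpperDemo and UpperSlashed are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class UpperDemo
{
    // Case-insensitive lookup of the words that may be upper-cased.
    private static readonly HashSet<string> Words = new HashSet<string>(
        new[] { "lts", "mts", "cwts", "rotc" }, StringComparer.OrdinalIgnoreCase);

    public static string UpperSlashed(string value)
    {
        // \S*/\S* matches each whitespace-delimited token that contains at least one slash.
        // Tokens without a slash are never matched, so they are left untouched.
        return Regex.Replace(value, @"\S*/\S*", token =>
            string.Join("/", token.Value.Split('/')
                .Select(part => Words.Contains(part) ? part.ToUpperInvariant() : part)));
    }
}
```

With the input "Somethingrotc1 cWts/rOtC/lTs Cwts" this should return "Somethingrotc1 CWTS/ROTC/LTS Cwts".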

Regex without escaping Characters - Problems

I found some solutions for my problem, which is quite simple:
I have a string, which is looking like this:
"\r\nContent-Disposition: form-data; name=\"ctl00$cphMainContent$grid$ctl03$ucPicture$ctl00\""
My goal is to break it down, so I have a Dictionary of values, like:
Key = "name", Value = "ctl..."
My approach was: Split it by "\r\n" and then by the equal or the colon sign.
This worked fine, but then some funny tester uploaded a file with all allowed characters, which made the string look like this:
"\r\nContent-Disposition: form-data; name=\"ctl00_cphMainContent_grid_ctl03_ucPicture_btnUpload$fileUpload\"; filename=\"C:\\Users\\matthias.mueller\\Desktop\\- ie+![]{}_-´;,.$¨##ç %&()=~^`'.jpg\"\r\nContent-Type: image/jpeg"
Of course, the simple splitting doesn't work anymore, since it now splits the filename.
I corrected this by reading out "filename=" and escaping the signs I'm looking to split, and then creating a regex.
Now comes my problem: I found two Regex samples which could do the work for the equal sign, the semicolon and the colon. One is:
[^\\]=
The other one I found was:
(?<!\\\\)=
The problem is that the first one doesn't just split on the equal sign; it also removes the character before it, which means my key in the Dictionary is "nam" instead of "name".
The second one works fine on this matter, but it still splits the escaped equal sign in the filename.
Is my approach for this problem even working? Would there be a better solution for this? And why is the first Regex cutting a character?
Edit: To avoid confusion, my escaped String looks like this:
"Content-Disposition: form-data; name=\"ctl00_cphMainContent_grid_ctl03_ucPicture_btnUpload$fileUpload\"; filename=\"C\:\Users\matthias.mueller\Desktop\- ie+![]{}_-´\;,.$¨##ç %&()\=~^`'.jpg\""
So I want basically: Split by equal Sign EXCEPT the escaped ones. By the way: The string here shows only one \, but there are 2.
Edit 2: OK seems like I have a working solution, but it's so ugly:
Dictionary<string, string> ParseHeader(byte[] bytes, int pos)
{
Dictionary<string, string> items;
string header;
string[] headerLines;
int start;
int end;
string input = _encoding.GetString(bytes, pos, bytes.Length - pos);
start = input.IndexOf("\r\n", 0);
if (start < 0) return null;
end = input.IndexOf("\r\n\r\n", start);
if (end < 0) return null;
WriteBytes(false, bytes, pos, end + 4 - 0); // Write the header to the form content
header = input.Substring(start, end - start);
items = new Dictionary<string, string>();
headerLines = Regex.Split(header, "\r\n");
Regex regLineParts = new Regex(@"(?<!\\\\);");
Regex regColon = new Regex(@"(?<!\\\\):");
Regex regEqualSign = new Regex(@"(?<!\\\\)=");
foreach (string hl in headerLines)
{
string workString = hl;
//Escape the Semicolon in filename
if (hl.Contains("filename"))
{
String orig = hl.Substring(hl.IndexOf("filename=\"") + 10);
orig = orig.Substring(0, orig.IndexOf('"'));
string toReplace = orig;
toReplace = toReplace.Replace(toReplace, toReplace.Replace(";", @"\\;"));
toReplace = toReplace.Replace(toReplace, toReplace.Replace(":", @"\\:"));
toReplace = toReplace.Replace(toReplace, toReplace.Replace("=", @"\\="));
workString = hl.Replace(orig, toReplace);
}
string[] lineParts = regLineParts.Split(workString);
for (int i = 0; i < lineParts.Length; i++)
{
string[] p;
if (i == 0)
p = regColon.Split(lineParts[i]);
else
p = regEqualSign.Split(lineParts[i]);
if (p.Length == 2)
{
string orig = p[0];
orig = orig.Replace(@"\\;", ";");
orig = orig.Replace(@"\\:", ":");
orig = orig.Replace(@"\\=", "=");
p[0] = orig;
orig = p[1];
orig = orig.Replace(@"\\;", ";");
orig = orig.Replace(@"\\:", ":");
orig = orig.Replace(@"\\=", "=");
p[1] = orig;
items.Add(p[0].Trim(), p[1].Trim());
}
}
}
return items;
}
Needs some further testing.
I had a go at writing a parser for you. It handles literal strings, like "here is a string", as the values in name-value pairs. I've also written a few tests, and the last shows an '=' character inside a literal string. It also handles escaping quotes (") inside literal strings by escaping as \" -- I'm not sure if this is right, but you could change it.
A quick explanation. I first find anything that looks like a literal string and replace it with a value like PLACEHOLDER8230498234098230498. This means the whole thing is now literal name-value pairs; eg
key="value"
becomes
key=PLACEHOLDER8230498234098230498
The original string value is stored off in the literalStrings dictionary for later.
So now we split on semicolons (to get key=value strings) and then on equals, to get the proper key/value pairs.
Then I substitute the placeholder values back in before returning the result.
public class HttpHeaderParser
{
public NameValueCollection Parse(string header)
{
var result = new NameValueCollection();
// 'register' any string values;
var stringLiteralRx = new Regex(@"""(?<content>(\\""|[^\""])+?)""", RegexOptions.IgnorePatternWhitespace);
var equalsRx = new Regex("=", RegexOptions.IgnorePatternWhitespace);
var semiRx = new Regex(";", RegexOptions.IgnorePatternWhitespace);
Dictionary<string, string> literalStrings = new Dictionary<string, string>();
var cleanedHeader = stringLiteralRx.Replace(header, m =>
{
var replacement = "PLACEHOLDER" + Guid.NewGuid().ToString("N");
var stringLiteral = m.Groups["content"].Value.Replace("\\\"", "\"");
literalStrings.Add(replacement, stringLiteral);
return replacement;
});
// now it's safe to split on semicolons to get name-value pairs
var nameValuePairs = semiRx.Split(cleanedHeader);
foreach(var nameValuePair in nameValuePairs)
{
var nameAndValuePieces = equalsRx.Split(nameValuePair);
var name = nameAndValuePieces[0].Trim();
var value = nameAndValuePieces[1];
string replacementValue;
if (literalStrings.TryGetValue(value, out replacementValue))
{
value = replacementValue;
}
result.Add(name, value);
}
return result;
}
}
There's every chance there are some proper bugs in it.
Here's some unit tests you should incorporate, too;
[TestMethod]
public void TestMethod1()
{
var tests = new[] {
new { input=@"foo=bar; baz=quux", expected = @"foo|bar^baz|quux"},
new { input=@"foo=bar;baz=""quux""", expected = @"foo|bar^baz|quux"},
new { input=@"foo=""bar"";baz=""quux""", expected = @"foo|bar^baz|quux"},
new { input=@"foo=""b,a,r"";baz=""quux""", expected = @"foo|b,a,r^baz|quux"},
new { input=@"foo=""b;r"";baz=""quux""", expected = @"foo|b;r^baz|quux"},
new { input=@"foo=""b\""r"";baz=""quux""", expected = @"foo|b""r^baz|quux"},
new { input=@"foo=""b=r"";baz=""quux""", expected = @"foo|b=r^baz|quux"},
};
var parser = new HttpHeaderParser();
foreach(var test in tests)
{
var actual = parser.Parse(test.input);
var actualAsString = String.Join("^", actual.Keys.Cast<string>().Select(k => string.Format("{0}|{1}", k, actual[k])));
Assert.AreEqual(test.expected, actualAsString);
}
}
Looks to me like you'll need a bit more of a solid parser for this than a regex split. According to this page the name/value pairs can either be 'raw';
x=1
or quoted;
x="foo bar baz"
So you'll need to look for a solution that not only splits on the equals, but ignores any equals inside;
x="y=z"
It might be that there is a better or more managed way for you to access this info. If you are using a classic ASP.NET WebForms FileUpload control, you can access the filename using the properties of the control, like
FileUpload1.HasFile
FileUpload1.FileName
If you're using MVC, you can use the HttpPostedFileBase class as a parameter to the action method. See this answer
[HttpPost]
public ActionResult Index(HttpPostedFileBase file)
{
// Verify that the user selected a file
if (file != null && file.ContentLength > 0)
{
// extract only the filename
var fileName = Path.GetFileName(file.FileName);
// store the file inside ~/App_Data/uploads folder
var path = Path.Combine(Server.MapPath("~/App_Data/uploads"), fileName);
file.SaveAs(path);
}
// redirect back to the index action to show the form once again
return RedirectToAction("Index");
}
This:
(?<!\\\\)=
matches = not preceded by \\.
It should be:
(?<!\\)=
(Make sure you use @ (verbatim) strings for the regex, to avoid confusion)
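To see why the first pattern cuts a character, compare the two side by side. [^\\]= matches two characters (any non-backslash plus the equal sign), and Regex.Split throws away everything it matched, while the lookbehind only inspects the preceding character without consuming it. A small sketch, with SplitDemo and its method names invented for illustration:

```csharp
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // [^\\]= consumes the character in front of '=', so Split removes it together with '='.
    public static string[] SplitConsuming(string s)
    {
        return Regex.Split(s, @"[^\\]=");
    }

    // (?<!\\)= matches only '=' and merely checks that no single backslash precedes it.
    public static string[] SplitLookbehind(string s)
    {
        return Regex.Split(s, @"(?<!\\)=");
    }
}
```

For "name=value" the first method returns "nam" and "value", the second returns "name" and "value". For the input a\=b=c the lookbehind version splits only at the second, unescaped equal sign.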

How to remove " [ ] \ from string

I have a string
"[\"1,1\",\"2,2\"]"
and I want to turn this string into this
1,1,2,2
I am using Replace function for that like
obj.str.Replace("[","").Replace("]","").Replace("\\","");
But it does not return the expected result.
Please help.
You haven't removed the double quotes. Use the following:
obj.str = obj.str.Replace("[","").Replace("]","").Replace("\\","").Replace("\"", "");
Here is an optimized approach in case the string or the list of exclude-characters is long:
public static class StringExtensions
{
public static String RemoveAll(this string input, params Char[] charactersToRemove)
{
if(string.IsNullOrEmpty(input) || (charactersToRemove==null || charactersToRemove.Length==0))
return input;
var exclude = new HashSet<Char>(charactersToRemove); // removes duplicates and has constant lookup time
var sb = new StringBuilder(input.Length);
foreach (Char c in input)
{
if (!exclude.Contains(c))
sb.Append(c);
}
return sb.ToString();
}
}
Use it in this way:
str = str.RemoveAll('"', '[', ']', '\\');
// or use a string as "remove-array":
string removeChars = "\"{[]\\";
str = str.RemoveAll(removeChars.ToCharArray());
You should do following:
obj.str = obj.str.Replace("[","").Replace("]","").Replace("\"","");
The string.Replace method does not replace string content in place. This means that if you have
string test = "12345" and do
test.Replace("2", "1");
the test string will still be "12345". Replace doesn't change the string itself, but creates a new string with the replaced content. So you need to assign this new string to a new variable or to the same one:
changedTest = test.Replace("2", "1");
Now, changedTest will contain "11345".
Another note on your code is that you don't actually have a \ character in your string. It's only displayed in order to escape the quote character. If you want to know more about this, please read the MSDN article on string literals.
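Both points can be checked with a few lines. This is only a sketch; ReplaceDemo and Clean are names picked for the example:

```csharp
public static class ReplaceDemo
{
    public static string Clean(string s)
    {
        // Each Replace call returns a new string, so the calls are chained
        // and the final result is returned instead of being discarded.
        return s.Replace("[", "").Replace("]", "").Replace("\"", "");
    }
}
```

ReplaceDemo.Clean("[\"1,1\",\"2,2\"]") should give 1,1,2,2, and "[\"1,1\",\"2,2\"]".IndexOf('\\') returns -1, which confirms that the literal contains no backslash at runtime.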
How about
var exclusions = new HashSet<char>(new[] { '"', '[', ']', '\\' });
return new string(obj.str.Where(c => !exclusions.Contains(c)).ToArray());
To do it all in one sweep.
As Tim Schmelter writes, if you wanted to do it often, especially with large exclusion sets over long strings, you could make an extension like this.
public static string Strip(
this string source,
params char[] exclusions)
{
if (!exclusions.Any())
{
return source;
}
var mask = new HashSet<char>(exclusions);
var result = new StringBuilder(source.Length);
foreach (var c in source.Where(c => !mask.Contains(c)))
{
result.Append(c);
}
return result.ToString();
}
so you could do,
var result = "[\"1,1\",\"2,2\"]".Strip('"', '[', ']', '\\');
Capture the numbers only with this regular expression [0-9]+ and then concatenate the matches:
var input = "[\"1,1\",\"2,2\"]";
var regex = new Regex("[0-9]+");
var matches = regex.Matches(input).Cast<Match>().Select(m => m.Value);
var result = string.Join(",", matches);
