This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Reading csv file
I have a comma delimited file:
"Some Text, More Text", 1, 2, 3,4,5,6
"Random Text, text text", 2,4,5,6,7,8
var content = reader.ReadLine();
var stringArray = content.Split(',');
The problem is the text ends up being split into two parts. I want to keep it as one unit. So what are my options?
EDIT: To clarify, the result currently looks like this:
Some Text
More Text
1
2
3
4
5
6
I want it to look like this:
Some Text,More Text
1
2
3
4
5
6
How about finding all the matches of this regex, rather than splitting on the comma:
"[^"]*"|[^,\s]+
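A minimal sketch of the match-based approach, assuming unquoted fields contain neither commas nor whitespace. It uses `[^,\s]+` for the unquoted alternative so that a run such as 3,4,5,6 is not captured as a single token; the class and method names are hypothetical.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedFieldMatcher
{
    // Either a double-quoted run, or a run of characters that are
    // neither commas nor whitespace.
    private const string FieldPattern = "\"[^\"]*\"|[^,\\s]+";

    public static string[] SplitLine(string line)
    {
        return Regex.Matches(line, FieldPattern)
                    .Cast<Match>()
                    .Select(m => m.Value.Trim('"')) // drop the surrounding quotes
                    .ToArray();
    }
}
```

Calling `QuotedFieldMatcher.SplitLine(reader.ReadLine())` on the first sample line should keep Some Text, More Text together as one element.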
I usually use the Microsoft.VisualBasic.FileIO.TextFieldParser object, see:
http://msdn.microsoft.com/en-us/library/f68t4563.aspx
and example of implementation at:
http://www.siccolo.com/Articles/CodeProject/Open_DataSet_From_TextFile/open_dataset_from_text_csv_file.html
This allows me to handle CSV files without worrying about how to cope with whether fields are enclosed in quotes, contain commas, escaped quotes etc.
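A rough sketch of how TextFieldParser might be used for this, assuming a reference to the Microsoft.VisualBasic assembly is available. The `CsvFields` class and `Parse` method are hypothetical names for illustration; the parser members used are the standard ones.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvFields
{
    // Parses CSV text into rows of fields, honouring double-quoted fields.
    public static List<string[]> Parse(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

To read straight from disk, the constructor also accepts a file path instead of a TextReader.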
You need to use Regex.Split with a pattern that skips commas inside quotes.
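One way this could look, as a sketch: split on a comma only when it is followed by an even number of double quotes up to the end of the line, which means the comma sits outside any quoted field. This assumes quotes are balanced and never escaped; the class name is hypothetical.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class QuoteAwareSplit
{
    // A comma followed by an even number of quotes is outside any quoted field.
    private static readonly Regex CommaOutsideQuotes =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    public static string[] Split(string line)
    {
        return CommaOutsideQuotes.Split(line)
            .Select(field => field.Trim().Trim('"')) // strip padding, then quotes
            .ToArray();
    }
}
```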
This question already has answers here:
How to split string while ignoring portion in parentheses?
(5 answers)
Closed 1 year ago.
I have a nested comma string such as
a(x,y,z),b,c(n,o,p),d,e,f(t,w)
I want to split this string in C# into
a(x),a(y),a(z),b,c(n),c(o),c(p),d,e,f(t),f(w)
I tried splitting using a combination of String.Split and String.Substring. Please let me know if you have any solution for this problem.
Many problems get easier if you split them into smaller problems. This is one of them.
Step 1: Split on , while ignoring separators inside parentheses (see this related question for a regex-based solution: How to split string while ignoring portion in parentheses?)
This will yield a(x,y,z), b, c(n,o,p), ...
Step 2: For each part, separate the text before the parentheses from the text inside them (using a regular expression or just String.Split), split the inside text on ,, then loop through the pieces and prepend the text before the parentheses to each.
This will transform a(x,y,z) into a(x), a(y), ...
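The two steps can be sketched as follows. Note that this sketch uses a simple depth counter for step 1 in place of the regex-based solution from the related question, and it assumes parentheses are balanced and not nested more than one level. All names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text;

public static class NestedSplitter
{
    // Step 1: split on commas that are not inside parentheses.
    public static List<string> SplitTopLevel(string input)
    {
        var parts = new List<string>();
        var current = new StringBuilder();
        int depth = 0;

        foreach (char c in input)
        {
            if (c == '(') depth++;
            else if (c == ')') depth--;

            if (c == ',' && depth == 0)
            {
                parts.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        parts.Add(current.ToString());
        return parts;
    }

    // Step 2: turn "a(x,y,z)" into "a(x)", "a(y)", "a(z)"; leave "b" as is.
    public static List<string> Expand(string input)
    {
        var result = new List<string>();
        foreach (string part in SplitTopLevel(input))
        {
            int open = part.IndexOf('(');
            if (open < 0 || !part.EndsWith(")"))
            {
                result.Add(part);
                continue;
            }

            string prefix = part.Substring(0, open);
            string inner = part.Substring(open + 1, part.Length - open - 2);
            foreach (string item in inner.Split(','))
            {
                result.Add(prefix + "(" + item + ")");
            }
        }
        return result;
    }
}
```

`string.Join(",", NestedSplitter.Expand(input))` puts the pieces back into a single comma-separated string.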
This question already has answers here:
How can you strip non-ASCII characters from a string? (in C#)
(15 answers)
C# regex to remove non - printable characters, and control characters, in a text that has a mix of many different languages, unicode letters
(4 answers)
Closed 4 years ago.
I'm reading data from a file, and sometimes the file contains funky stuff, like:
"䉌Āᜊ»ç‰ç•‡ï¼ƒè¸²æœ€ä²’Bíœë¨¿ä„€å•²ï²ä‹¾é¥˜BéŒé“‡ä„€â²ä‹¾â¢"
I need to strip/replace these characters as JSON has no idea what to do with them.
They aren't control characters (I think), so my current regex,
Regex.Replace(value, @"\p{C}+", string.Empty);
isn't catching them.
A lot of the strings being read in are going to be long, upwards of 256 characters, so I'd rather not loop through each char checking it.
Is there a simple solution to this? I'm thinking regular expressions would solve it, but I'm not sure.
If all you want is ASCII, then you could do:
Regex.Replace(value, @"[^\x00-\x7F]+", string.Empty);
and if all you want are the "normal" (printable) ASCII characters, from space through tilde, you could do:
Regex.Replace(value, @"[^\x20-\x7E]+", string.Empty);
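To show the difference between the two ranges, here is a small sketch that wraps each pattern in a helper; the class and method names are hypothetical. The first keeps tabs and other ASCII control characters, the second removes them as well.

```csharp
using System.Text.RegularExpressions;

public static class AsciiFilter
{
    // Keeps every ASCII character, including control characters such as tab.
    public static string KeepAscii(string value)
    {
        return Regex.Replace(value, @"[^\x00-\x7F]+", string.Empty);
    }

    // Keeps only printable ASCII: space (0x20) through tilde (0x7E).
    public static string KeepPrintableAscii(string value)
    {
        return Regex.Replace(value, @"[^\x20-\x7E]+", string.Empty);
    }
}
```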
This question already has answers here:
Are there any CSV readers/writer libraries in C#? [closed]
(5 answers)
Closed 6 years ago.
I've been trying my luck with Regex but my understanding doesn't seem to be the best.
Problem
I have a .csv file given to me by a 3rd party. I cannot edit it but need to read the data into my application.
There are always 12 columns in the file. However, sometimes it will go like this:
text, text ,text,"text with comma,"
text, text, text, text....
text, text, text,"text with comma,","text with comma again", text...
What I need to do is replace all the commas between the "" with a -.
Any help would be appreciated!
This might do the trick for you:
// Find every double-quoted field and replace the commas inside it with dashes.
foreach (Match match in Regex.Matches(YourCSV, "\"([^\"]*)\""))
    if (match.Value.Contains(","))
        YourCSV = YourCSV.Replace(match.Value, match.Value.Replace(",", "-"));
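An alternative sketch of the same idea in a single pass, using the Regex.Replace overload that takes a MatchEvaluator, so each quoted field is rewritten where it is found. It assumes quotes are balanced and not escaped; the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class QuotedCommaReplacer
{
    // Rewrites each double-quoted run, swapping its commas for dashes.
    public static string ReplaceCommasInQuotes(string csv)
    {
        return Regex.Replace(csv, "\"[^\"]*\"", m => m.Value.Replace(',', '-'));
    }
}
```

After this, every remaining comma is a field separator, so a plain Split(',') on each line should give the expected 12 columns.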
This question already has answers here:
Parse subtitle file using regex C#
(5 answers)
Closed 8 years ago.
I need to split an .srt file text like
1
00:02:10,437 --> 00:02:11,598
Day one, Greenie.
2
00:02:11,757 --> 00:02:12,838
Rise and shine.
3
00:02:14,357 --> 00:02:16,041
He looks like
a slopper to me.
I need to split it into an array of multi-line strings, where each string has at least three lines:
one for the number, one for the time, and one or more for the subtitle text.
Can you help?
\n{2,}
Split on this pattern (two or more consecutive newlines, i.e. the blank lines between entries) and you have your result.
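A minimal sketch of that split, assuming entries are separated by at least one blank line. The pattern here is a slight variant, `(?:\r?\n){2,}`, so that files with Windows line endings are handled too; the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class SrtSplitter
{
    // Splits subtitle text into one multi-line string per entry.
    public static string[] SplitBlocks(string srt)
    {
        // Trim first so leading or trailing blank lines do not yield empty entries.
        return Regex.Split(srt.Trim(), @"(?:\r?\n){2,}");
    }
}
```

Each element can then be split on single newlines to get the number, the time range, and the text lines.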
This question already has answers here:
How to remove illegal characters from path and filenames?
(30 answers)
Closed 9 years ago.
I'm creating a bunch of folders using a C# console application. An XML file is parsed for different nodes, and the folders are created with names based on the node values.
One of the XML nodes had the following value with some unknown special character in it (ASCII code 127)
There is a special character after Foldername. I tried using String.Trim() to trim the value but had no luck. I also tried to compare the character with the list of
System.IO.Path.GetInvalidFileNameChars()
and remove it, but still no luck. How can I eliminate these characters before I create a folder name? The folder name will always be alphanumeric in my case.
If the folder name will "always be alpha numeric", then you can simply remove all non-alphanumeric characters:
var regex = new Regex("[^a-zA-Z0-9]");
fileName = regex.Replace(fileName, string.Empty);
You could remove the unwanted characters using a regular expression (this one also keeps spaces and underscores):
string validFolderName = Regex.Replace(folderName,"[^A-Za-z0-9 _]","");
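A short sketch comparing the two patterns above on a name that ends in the DEL character (ASCII 127, written "\u007F" in C#). The class and method names and the sample values are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class FolderNameCleaner
{
    // Strict: keeps only ASCII letters and digits.
    public static string AlphanumericOnly(string name)
    {
        return Regex.Replace(name, "[^a-zA-Z0-9]", string.Empty);
    }

    // Lenient: also keeps spaces and underscores.
    public static string AllowSpaceAndUnderscore(string name)
    {
        return Regex.Replace(name, "[^A-Za-z0-9 _]", string.Empty);
    }
}
```

Both remove the trailing DEL; they differ only in whether a name such as "My_Folder 1" keeps its underscore and space.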