I'm working on a program that will parse chunks of data from a CSV file and populate them into the attributes of an XML document. The data entry I'm working with looks like this: e11*70/157*1999/101*1090*04. I want to break that up, using the asterisks as the delimiter, into e11, 70/157, 1999/101, etc., so I can insert those values into the attributes of the XML. Would this be a situation appropriate for RegEx? Or would I be better off using Substring with the index of *?
Thanks so much for the assistance. I'm new to the programming world, and have found sites such as these to be an extremely valuable resource.
You can use String.Split()
string[] words = @"e11*70/157*1999/101*1090*04".Split('*');
I think this should solve your problem:
string content = @"11*70/157*1999/101*1090*04";
string[] split = content.Split('*');
You could use the Split method to create a string array like so:
string txt = "e11*70/157*1999/101*1090*04";
foreach (string s in txt.Split('*')){
DoSomething(s);
}
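To connect the split back to the XML part of the question, here is a minimal sketch that assumes LINQ to XML (System.Xml.Linq) is acceptable. The element name "entry" and the attribute names "part0", "part1", ... are placeholders invented for illustration; the real names would come from your schema.

```csharp
using System.Xml.Linq;

public static class EntryToXml
{
    // Splits an asterisk-delimited entry and stores each piece in its own attribute.
    // "entry" and "partN" are hypothetical names, not taken from the question.
    public static XElement ToElement(string entry)
    {
        string[] parts = entry.Split('*');
        var element = new XElement("entry");
        for (int i = 0; i < parts.Length; i++)
        {
            element.SetAttributeValue("part" + i, parts[i]);
        }
        return element;
    }
}
```

Calling EntryToXml.ToElement("e11*70/157*1999/101*1090*04") should give an element with five attributes, one per piece. No regex is needed because the delimiter is a single fixed character.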
I want to replace ALL HTML special entities, like &gt; and &lt;, with a custom string.
Let's say I have the following string:
string str = "<div>&gt;hello&lt;</div>";
and method:
Method(string str, string replaceStr)
After calling Method(str, ":)") result should be
<div>:)hello:)</div>
The problem is that there are too many special characters, and I'm wondering what the most efficient way to accomplish this is.
EDIT:
String.Replace will not do the job, and using Regex for parsing HTML is not really a good approach.
Judging by the downvotes on this question, there probably isn't a clean solution, so I decided to go with the following algorithm:
create a txt file with the valid HTML special characters (like &para;)
parse the file into an array of strings
use HtmlAgilityPack to parse the HTML, get the raw text, and replace all entities.
I know this is not really efficient for big HTML strings, but it should do the work for now.
You can try:
string str = "<div>&gt;hello&lt;</div>";
string output = Regex.Replace(str, "&gt;|&lt;", ":)");
You can also use HtmlDecode
string str = "<div>&gt;hello&lt;</div>";
string output = WebUtility.HtmlDecode(str);
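Note that HtmlDecode turns entities back into their characters rather than into a custom string. If the goal is to replace every entity-shaped token without listing them all, one possible sketch is a single pattern covering named, decimal and hexadecimal references. The class and method names here are made up for illustration, and the pattern does not check that a named reference is a real HTML entity, so anything shaped like &word; is replaced.

```csharp
using System.Text.RegularExpressions;

public static class EntityReplacer
{
    // Matches named (&gt;), decimal (&#62;) and hexadecimal (&#x3E;) references.
    // Assumption: any token of this shape counts as an entity.
    private static readonly Regex EntityPattern =
        new Regex(@"&(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);");

    public static string ReplaceEntities(string html, string replacement)
    {
        // A MatchEvaluator is used so that '$' in the replacement is taken literally.
        return EntityPattern.Replace(html, m => replacement);
    }
}
```

With the string from the question, EntityReplacer.ReplaceEntities(str, ":)") leaves the tags alone and touches only the entities. This matches text tokens rather than parsing HTML structure, which is why a regex is tolerable here.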
I am trying to read a specific word from a text file. I know it's easy and I have already done it, but I need to read it from within a line, i.e. if the file contains
WC|110916|F-12003||ZET5.4|27019570 then I need to pick out "27019570". I did it with Substring(26, 8) and it works, but not every line has the same length, so a fixed-position substring is not a proper solution.
In short, I need to know how to find the | character and its position on every line in the text file.
Thanks in advance :)
You can split each line on the '|' character. Split returns an array, from which you can select the desired index.
var textFromFile = "WC|110916|F-12003||ZET5.4|27019570";
var goalText = textFromFile.Split('|')[5];
If you're using .NET 3.5 or higher, it's easy using LINQ with File.ReadAllLines:
string fullFilePath = @"C:\ed\cc\filename.txt";
List<string> items = File.ReadAllLines(fullFilePath).Select(line => line.Split('|').Last()).ToList();
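Since the question also asks how to find the position of the | character, here is a small sketch using IndexOf and LastIndexOf instead of Split. The class and method names are invented for illustration, and it assumes the wanted value is always the last field on the line.

```csharp
using System.Collections.Generic;

public static class PipeFields
{
    // Returns the zero-based index of every '|' in the line.
    public static List<int> PipePositions(string line)
    {
        var positions = new List<int>();
        int pos = line.IndexOf('|');
        while (pos >= 0)
        {
            positions.Add(pos);
            pos = line.IndexOf('|', pos + 1);
        }
        return positions;
    }

    // Returns the text after the last '|', or the whole line if there is none.
    public static string LastField(string line)
    {
        int pos = line.LastIndexOf('|'); // -1 when the line has no pipe
        return line.Substring(pos + 1);
    }
}
```

For the sample line, the last pipe is at index 25, so the value starts at index 26 regardless of how long it is. That is the same starting position the fixed Substring(26, 8) call relied on, but computed per line.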
I am just curious: is there a way to break up multiple strings in a GridView cell and store or display them one by one?
Earlier, when I called MessageBox.Show, it displayed the whole list of names or numbers, like
abdullah ali ashonie; adefitri; candry. What I want is to display them one by one: abdullah ali ashonie, then adefitri, then candry. I would also like to know how to store them.
Sorry for my bad English; I'm not quite sure you will understand what I want.
The simple way is String.Split():
var parts = GridView1.Rows[0].Cells[0].Text.Split(";".ToCharArray());
Just be warned: String.Split() has all kinds of pitfalls and gotchas. If you can't put meaningful constraints on the possible values — be absolutely certain you won't find things like new-lines or other semi-colon(;) characters as part of individual names, have quoted text, etc — you should really look into a dedicated delimited text parser. There are three (at least) built into the .Net Framework (see TextFieldParser as one option), and a plethora more on NuGet.
Look at String.Split
Returns a string array that contains the substrings in this instance that are delimited by elements of a specified string or Unicode character array.
For example:
string text = "abdullah ali ashonie; adefitri; candry";
string[] names = text.Split(';');
foreach (string name in names)
{
System.Console.WriteLine(name);
}
Outputs:
abdullah ali ashonie
adefitri
candry
There is some more information here too
I'm not 100% sure I completely understand what you're trying to do, but this is a basic string split example:
string input = "abdullah ali ashonie; adefitri; candry";
string[] pieces = input.Split(';');
foreach (var s in pieces) {
Console.WriteLine(s.Trim());
}
Fiddle here.
I want to import data from a CSV file, but some cells contain commas in their string values. How can I recognize which commas are separators and which are part of the cell content?
Use TextFieldParser. Usage:
using Microsoft.VisualBasic.FileIO; // Microsoft.VisualBasic.dll
...
// reader: a Stream or TextReader over the CSV data
using (var csvReader = new TextFieldParser(reader)) {
    csvReader.SetDelimiters(new string[] { "," });
    csvReader.HasFieldsEnclosedInQuotes = true;
    string[] fields = csvReader.ReadFields(); // fields of the next row
}
In general, do not bother writing the import yourself.
I have good experiences with the FileHelpers lib.
http://www.filehelpers.com/
And indeed, I hope your fields are quoted. Filehelpers supports this out of the box.
Otherwise there is not much you can do.
Unless you have quotes around the strings you are pretty much hosed, hence the "quote and comma" delimiter style. If you have control of the export facility, then you must select "enclose strings in quotes" or change the delimiter to something like a tilde or caret symbol.
If not, then you have to write some code. If you detect "a..z", then start counting commas and keep working through the string until you detect [0..9], and even then this is going to be problematic, since people can put a [0..9] in their text. At best this is going to be a best-effort process. You're going to have to know when you are in chars and when you are not. I doubt even regex will help you much on this.
The only other thing I can think of is to run through the data and look for commas, then look at what comes before and after each comma. If it is surrounded by chars, replace the comma with an alternate char like the caret "^" or the tilde "~". Then process the file as normal, and afterwards go back and replace the alternate char with a comma.
Good luck.
Using FileHelpers is definitely the way to go. They have done a great job building all the logic for you. I had the same issue, where I had to parse a CSV file with commas as part of a field, and this utility did the job very well. All you have to do is put the following attribute on the field:
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
For details http://www.filehelpers.com/forums/viewtopic.php?f=12&t=391
We can also use RegEx, as below.
Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
String[] Fields = CSVParser.Split(Test);
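To see what that pattern does in practice, here is a small self-contained sketch wrapping the same regex. The class and method names are made up, and the Trim call that strips the surrounding quotes is an addition of mine, not part of the answer above. The lookahead only allows a split at a comma that is followed by an even number of double quotes, i.e. a comma outside any quoted field.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedCsvSplitter
{
    // Same pattern as above: a comma followed by an even number of quotes.
    private static readonly Regex CommaOutsideQuotes =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

    public static string[] SplitLine(string line)
    {
        return CommaOutsideQuotes.Split(line)
            .Select(field => field.Trim('"')) // drop the enclosing quotes (my addition)
            .ToArray();
    }
}
```

For a line such as 1,"Smith, John",42 this should yield three fields, with the comma inside the quoted name preserved. It does not unescape doubled quotes ("") inside a field and it works on one line at a time, so quoted fields that span lines still need a real CSV parser.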
Let's say I have this string:
"param1,r:1234,p:myparameters=1,2,3"
...and I would like to split it into:
param1
r:1234
p:myparameters=1,2,3
I've used the split function and of course it splits it at every comma. Is there a way to do this using regex or will I have to write my own split function?
Personally, I would try something like this:
,(?=[^,]+:.*?)
Basically, use a positive look-ahead to find a comma followed by a "key-value" pair, defined here as a key, a colon, and more information (the data, which may include other commas). This should disqualify the commas between the numbers, too.
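A quick way to check this idea is to feed the pattern to Regex.Split. The sketch below uses the pattern exactly as written above; the class and method names are hypothetical. It assumes that only keys are followed by a colon, so a value that itself contains "something:" after a comma would be split incorrectly.

```csharp
using System.Text.RegularExpressions;

public static class KeyValueSplitter
{
    // Split only at commas that are followed by "non-commas then a colon".
    private static readonly Regex CommaBeforeKey = new Regex(",(?=[^,]+:.*?)");

    public static string[] SplitPairs(string input)
    {
        return CommaBeforeKey.Split(input);
    }
}
```

On the sample string, the commas before r: and p: qualify, while the commas inside 1,2,3 do not, because no colon follows them before the next comma or the end of the string.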
You can use ; to separate the values, which makes them easy to work with.
Since you use , both as the separator and inside values, it is difficult to split.
You have
string str = "param1,r:1234,p:myparameters=1,2,3";
Recommended to use
string str = "param1;r:1234;p:myparameters=1,2,3";
which can be split as
var strArray = str.Split(';');
strArray[0]; // contains param1
strArray[1]; // r:1234
strArray[2]; // p:myparameters=1,2,3
I'm not sure how you would write a split that knew which commas to split on there, honestly.
Unless it's a fixed number each time, in which case just use the String.Split overload that takes an int specifying the maximum number of substrings to return.
If you're going to have comma-delimited data that's not always a fixed number of items and it could have literal commas in the data itself, they really should be quoted. If you can control the input in any way, you should encourage that, and use an actual CSV parser instead of String.Split
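As a sketch of the fixed-count case, String.Split has an overload taking a char array and a maximum count; once the limit is reached, the remainder of the string stays in the last element, commas included. The wrapper name below is made up for illustration, and the approach only works if the number of leading items really is fixed.

```csharp
public static class LimitedSplit
{
    // Splits on ',' but returns at most maxParts substrings;
    // everything after the (maxParts - 1)th comma is kept in the last one.
    public static string[] SplitIntoAtMost(string input, int maxParts)
    {
        return input.Split(new[] { ',' }, maxParts);
    }
}
```

With the sample string and a limit of 3, the third element should come back as p:myparameters=1,2,3 in one piece.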
That depends. You can't parse it with regex (or anything else) unless you can identify a consistent rule separating one group from another. Based on your sample, I can't clearly identify such a rule (though I have some guesses). How does the system know that p:myparameters=1,2,3 is a single item? For example, if there were another item after it, what would be the difference between that and the 1,2,3? Figure that out and you'll be pretty close to a solution.
If you're able to change the format of the input string, why not decide on a consistent delimiter between your groups? ; would be a good choice. Use an input like param1;r:1234;p:myparameters=1,2,3 and there will be no ambiguity where the groups are, plus you can just split on ; and you won't need regex.
The simplest approach would be changing your delimiter from "," to something like "|". Then you can split on "|" with no problem. However, if you can't change the delimiting character, then maybe you could encode the sections in a fashion similar to CSV.
CSV files have the same issue... the standard there is to put double quotes "" around columns.
For example, your string would be "param1","r:1234","p:myparameters=1,2,3".
Then you could use Microsoft.VisualBasic.FileIO.TextFieldParser to split/parse. You can use it from C# even though it's in the VisualBasic namespace.
TextFieldParser
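A minimal sketch of that approach, assuming the input has already been rewritten with quotes around each group and that the Microsoft.VisualBasic assembly is referenced. The class and method names are invented for illustration.

```csharp
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedFieldReader
{
    // Parses a single comma-delimited line whose fields may be enclosed in quotes.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields(); // null if there is no line to read
        }
    }
}
```

Given "param1","r:1234","p:myparameters=1,2,3" as input, this should return three fields with the quotes removed and the commas inside the last field intact.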
Do you mean this: string[] str = System.Text.RegularExpressions.Regex.Split("param1,r:1234,p:myparameters=1,2,3", @"\,");