C#Regex Expression Getting certain words from a file path - c#

Hey guys im writing a c# application and i cant seem to split up data using regex as id like to.
C:\Users\e014425c\Downloads\StoreData\ABE1102013.csv
Thats getting stored as a string within the loop. I want:-
ABE1
10
2013
From there i would store the individual strings into 3 arrays.
Problem is, within the loop the data will change so you cant use Contains("ABE1")
Help would be much appreciated!
foreach (string word in file)
{
string testString = word;
string firstWords = Regex.Match(testString, #"^(\w+\b.*?){6}").ToString();
Console.Write(firstWords);
}
Thats how i´ve been testing the data

First get the filename only from the path:
string filename = File.FilenameWithoutExtension(path);
Then the characters you want with SubString():
string a1 = filename.SubString(0, 4);
string a2 = filename.SubString(4, 2);
string a3 = filename.SubString(6, 4);
EDIT
With a general pattern "ABC1_xy_YYYY" it is better to use the underscores as delimiters:
string[] field = filename.Split('_');
a1 = field[0];
...

Related

Sorting a given string - What is wrong with my IF Contains block?

Super new to C# apologize upfront. My goal is to sort a given string. Each word in the string will contain a single number. This number is the position the word should have in the result. Numbers can be from 1 to 9. So 1 will be the first word (not 0).
My plan of attack is to split the string, having one variable of int data-type (int lookingForNum) and the other variable turning that into a String data-type(string stringLookingForNum), then for each loop over the array looking to see if any elements contain string stringLookingForNum, if they do I add it to an emptry string variable, lastly add 1 to int variable lookingForNum. My issue seems to be with the if statement with the Contains method. It will not trigger the way I currently have it written. Hard coding in if (word.Contains("1")) will trigger that code block but running it as written below will not trigger the if statement.Please can anyone tell my WHY!?!? I console.log stringLookingForNum and it is for sure a string data type "1"
This noobie would appreciate any help. Thanks!
string testA = "is2 Thi1s T4est 3a"; //--> "Thi1s is2 3a T4est"
string[] arrayTestA = testA.Split(' ');
string finalString = string.Empty;
int lookingForNum = 1; //Int32
foreach (string word in arrayTestA){
string stringLookingForNum = lookingForNum.ToString();
//Don't understand why Contains is not working as expected here)
if (word.Contains(stringLookingForNum)){
finalString = finalString + $"{word} ";
}
lookingForNum++;
}
you need this - look for the string with 1, the look for the string with 2 etc. Thats not what you are doing
you look at the first string and see if it contains one
then look at the second one and see if it contains 2
....
int lookingForNum = 1;
while(true){ // till the end
string stringLookingForNum = lookingForNum.ToString();
bool found = false;
foreach (string word in arrayTestA){
if (word.Contains(stringLookingForNum)){
finalString = finalString + $"{word} ";
found = true;
break;
}
}
if(!found) break;
lookingForNum++;
}
To sort you should simply use OrderBy, and since you need to sort by number inside a word - just Find and extract a number from a string
string testA = "is2 Thi1s T4est 3a";
var result = testA.Split().OrderBy(word =>
Int32.Parse(Regex.Match(word, #"\d+").Value));
Console.WriteLine(string.Join(" ", result));

How can i analise millions of strings that merge into each other?

I have millions of strings, around 8GB worth of HEX; each string is 3.2kb in length.
Each of these strings contains multiple parts of data I need to extract.
This is an example of one such string:
GPGGA,104644.091,,,,,0,0,,,M,,M,,*43$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ$GPGGA,104645.091,,,,,0,0,,,M,,M,,*42$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ ÿÿ!ÿÿ"ÿÿ#ÿÿ$ÿÿ%ÿÿ&ÿÿ'ÿÿ(ÿÿ)ÿÿ*ÿÿ+ÿÿ,ÿÿ-ÿÿ.ÿÿ/ÿÿ0ÿÿ1ÿÿ$GPGGA,104646.091,,,,,0,0,,,M,,M,,*41$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test2ÿÿ3ÿÿ4ÿÿ5ÿÿ6ÿÿ7ÿÿ8ÿÿ9ÿÿ:ÿÿ;ÿÿ<ÿÿ=ÿÿ>ÿÿ?ÿÿ#ÿÿAÿÿBÿÿCÿÿDÿÿEÿÿFÿÿGÿÿHÿÿIÿÿJÿÿ$GPGGA,104647.091,,,,,0,0,,,M,,M,,*40$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header TestKÿÿLÿÿMÿÿNÿÿOÿÿPÿÿQÿÿRÿÿSÿÿTÿÿUÿÿVÿÿWÿÿXÿÿYÿÿZÿÿ[ÿÿ\ÿÿ]ÿÿ^ÿÿ_ÿÿ`ÿÿaÿÿbÿÿcÿÿ$GPGGA,104648.091,,,,,0,0,,,M,,M,,*4F$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Testdÿÿeÿÿfÿÿgÿÿhÿÿiÿÿjÿÿkÿÿlÿÿmÿÿnÿÿoÿÿpÿÿqÿÿrÿÿsÿÿtÿÿuÿÿvÿÿwÿÿxÿÿyÿÿzÿÿ{ÿÿ|ÿÿ$GPGGA,104649.091,,,,,0,0,,,M,,M,,*4E$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test}ÿÿ~ÿÿ.ÿÿ€ÿÿ.ÿÿ‚ÿÿƒÿÿ„ÿÿ…ÿÿ†ÿÿ‡ÿÿˆÿÿ‰ÿÿŠÿÿ‹ÿÿŒÿÿ.ÿÿŽÿÿ.ÿÿ.ÿÿ‘ÿÿ’ÿÿ“ÿÿ”ÿÿ•ÿÿ$GPGGA,104650.091,,,,,0,0,,,M,,M,,*46$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Head
as you can see it is pretty much this repeated:
GPGGA,104644.091,,,,,0,0,,,M,,M,,*43$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ$GPGGA,104645.091,,,,,0,0,,,M,,M,,*42$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ ÿÿ!ÿÿ"ÿÿ#ÿÿ$ÿÿ%ÿÿ&ÿÿ'ÿÿ(ÿÿ)ÿÿ*ÿÿ+ÿÿ,ÿÿ-ÿÿ.ÿÿ/ÿÿ0ÿÿ1ÿÿ
I want to separate this string into two lists like this:
_GPSList
$GPGGA,104644.091,,,,,0,0,,,M,,M,,*43
$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*
$GPVTG,0.00,T,,M,0.00,N,0.00,K,N
_WavList
32HeaderTest.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ
32HeaderTest.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ ÿÿ!ÿÿ"ÿÿ#ÿÿ$ÿÿ%ÿÿ&ÿÿ'ÿÿ(ÿÿ)ÿÿ*ÿÿ+ÿÿ,ÿÿ-ÿÿ.ÿÿ/ÿÿ0ÿÿ1ÿÿ
Issue 1:
This repetition isn't containing within a single string, it overflows into the next string. so if some data crosses the end and start of two strings how to I deal with that?
Issue 2: How do I analyse the string and extract only the parts I need?
The solution I'm providing is not a complete answer but more like an idea which might help you get what you want.
Everything else which I present is an assumption on my behalf.
//Assuming your data is stored in a file "yourdatafile"
//Splitting all the text on "$" assuming this will separate GPSData
string[] splittedstring = File.ReadAllText("yourdatafile").Split('$');
//I found an extra string lingering in the sample you provided
//because I splitted on "$", so you gotta take that into account
var GPSList = new List<string>();
var WAVList = new List<string>();
foreach (var str in splittedstring)
{
//So if the string contains "Header" we would want to separate it from GPS data
if (str.Contains("Header"))
{
string temp = str.Remove(str.IndexOf("Header"));
int indexOfAsterisk = temp.LastIndexOf("*");
string stringBeforeAsterisk = str.Substring(0, indexOfAsterisk + 1);
string stringAfterAsterisk = str.Replace(stringBeforeAsterisk, "");
WAVList.Add(stringAfterAsterisk);
GPSList.Add("$" + stringBeforeAsterisk);
}
else
GPSList.Add("$" + str);
}
This provides the exact output as you need, only exception is with that extra string. Also some non-standard characters might look like black blocks.

Unable to split a string using split() method

I have a string
String data="CE|2014-2015|ClassA"
I need output like
string Batch="2014-2015"
string Class="ClassA"
How can I achieve it?? I tried a lot string,Split() function. But I did not get expected output.Please help me out
I tried,
string s = "CE|2014-2015|Class1";
string[] words = s.Split('|| ');
This should work for you
string[] splitted = data.Split('|');
string Batch = splitted[1];
string Class = splitted[2];
Your solution is wrong because: '|| ' is not a valid char and it won't even compile. You should split on | and take second and third value from splitted values
You can do the following
string data = "CE|2014-2015|ClassA";
string[] split = data.Split('|');
string Batch=split[1];
string Class = split[2];
Hope it works for you.

Getting part of the filename C#

I have a file name dayhappy_02_02345.csv
How do I get the 02 part out to be used in a variable and also how do I get the 02345 part so that I can pass these 2 values into a variable for a function.
Using c#.
I have looked at GetFileName but this gets either the filename, the extention or the full file name only.
Thanks
Ste
For that specific file name,
string sData = "dayhappy_02_02345.csv";
string[] sArr = sData.split('_');
string sPart1 = sArr[1];
string sPart2 = sArr[2];
Will do, but that's a special case, will work only on file names of this type
Get the file name as you've already figured out, then use String.Split() to get the individual pieces.
You have to use Regex:
var match = new Regex(#".*_(\d+)_(\d+)").Match(Path.GetFileNameWithoutExtension(fileNAme));
var v02 = match.Groups[0].Value;
var v02345 = match.Groups[1].Value;

Loop Problem: Assign data to different strings when in a loop

I have a string which consists of different fields. So what I want to do is get the different text and assign each of them into a field.
ex: Hello Allan IBM
so what I want to do is:
put these three words in different strings like
string Greeting = "Hello"
string Name = "Allan"
string Company = "IBM"
//all of it happening in a loop.
string data = "Hello Allan IBM"
string s = data[i].ToString();
string[] words = s.Split(',');
foreach (string word in words) {
Console.WriteLine(word);
}
any suggestions?
thanks hope to hear from you soon
If I understand correctly you have a string with place-holders and you want to put different string in those place-holders:
var format="{0}, {1} {2}. How are you?";
//string Greeting = "Hello"
//string Name = "Allan"
//string Company = "IBM"
//all of it happening in a loop.
string data = ...; //I think you have an array of strings separated by ,
foreach( va s in data){
{
//string s = data[i];//.ToString(); - it is already a string array
string[] words = data[i].Split(',');
Console.WriteLine(format, words[0], words[1], words[2]);
}
To me it sound not like a problem that can be solved with a loop. The essential problem is that the loop can only work if you do exactly the same operation on the items within the loop. If your problem doesn't fit, you end up with a dozen of lines of code within the loop to handle special cases, what could have been written in a shorter way without a loop.
If there are only two or three strings you have to set (what should be the case if you have named variables), assign them from the indexes of the split string. An alternative would be using regular expressions to match some patterns to make it more robust, if one of the expected strings is missing.
Another possibility would be to set attributes on members or properties like:
[MyParseAttribute(/*position*/ /*regex*/)]
string Greeting {get;set;}
And use reflexion to populate them. Here you could create a loop on all properties having that attribute, as it sounds to me that you are eager to create a loop :-)

Categories

Resources