I have millions of strings, around 8 GB worth of hex data; each string is 3.2 KB long.
Each of these strings contains multiple parts of data I need to extract.
This is an example of one such string:
GPGGA,104644.091,,,,,0,0,,,M,,M,,*43$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ$GPGGA,104645.091,,,,,0,0,,,M,,M,,*42$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ ÿÿ!ÿÿ"ÿÿ#ÿÿ$ÿÿ%ÿÿ&ÿÿ'ÿÿ(ÿÿ)ÿÿ*ÿÿ+ÿÿ,ÿÿ-ÿÿ.ÿÿ/ÿÿ0ÿÿ1ÿÿ$GPGGA,104646.091,,,,,0,0,,,M,,M,,*41$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test2ÿÿ3ÿÿ4ÿÿ5ÿÿ6ÿÿ7ÿÿ8ÿÿ9ÿÿ:ÿÿ;ÿÿ<ÿÿ=ÿÿ>ÿÿ?ÿÿ#ÿÿAÿÿBÿÿCÿÿDÿÿEÿÿFÿÿGÿÿHÿÿIÿÿJÿÿ$GPGGA,104647.091,,,,,0,0,,,M,,M,,*40$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header TestKÿÿLÿÿMÿÿNÿÿOÿÿPÿÿQÿÿRÿÿSÿÿTÿÿUÿÿVÿÿWÿÿXÿÿYÿÿZÿÿ[ÿÿ\ÿÿ]ÿÿ^ÿÿ_ÿÿ`ÿÿaÿÿbÿÿcÿÿ$GPGGA,104648.091,,,,,0,0,,,M,,M,,*4F$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Testdÿÿeÿÿfÿÿgÿÿhÿÿiÿÿjÿÿkÿÿlÿÿmÿÿnÿÿoÿÿpÿÿqÿÿrÿÿsÿÿtÿÿuÿÿvÿÿwÿÿxÿÿyÿÿzÿÿ{ÿÿ|ÿÿ$GPGGA,104649.091,,,,,0,0,,,M,,M,,*4E$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test}ÿÿ~ÿÿ.ÿÿ€ÿÿ.ÿÿ‚ÿÿƒÿÿ„ÿÿ…ÿÿ†ÿÿ‡ÿÿˆÿÿ‰ÿÿŠÿÿ‹ÿÿŒÿÿ.ÿÿŽÿÿ.ÿÿ.ÿÿ‘ÿÿ’ÿÿ“ÿÿ”ÿÿ•ÿÿ$GPGGA,104650.091,,,,,0,0,,,M,,M,,*46$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Head
As you can see, it is pretty much this, repeated:
GPGGA,104644.091,,,,,0,0,,,M,,M,,*43$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ$GPGGA,104645.091,,,,,0,0,,,M,,M,,*42$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*32Header Test.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ ÿÿ!ÿÿ"ÿÿ#ÿÿ$ÿÿ%ÿÿ&ÿÿ'ÿÿ(ÿÿ)ÿÿ*ÿÿ+ÿÿ,ÿÿ-ÿÿ.ÿÿ/ÿÿ0ÿÿ1ÿÿ
I want to separate this string into two lists like this:
_GPSList
$GPGGA,104644.091,,,,,0,0,,,M,,M,,*43
$GPVTG,0.00,T,,M,0.00,N,0.00,K,N*
$GPVTG,0.00,T,,M,0.00,N,0.00,K,N
_WavList
32HeaderTest.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ
32HeaderTest.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ.ÿÿ ÿÿ!ÿÿ"ÿÿ#ÿÿ$ÿÿ%ÿÿ&ÿÿ'ÿÿ(ÿÿ)ÿÿ*ÿÿ+ÿÿ,ÿÿ-ÿÿ.ÿÿ/ÿÿ0ÿÿ1ÿÿ
Issue 1:
This repetition isn't contained within a single string; it overflows into the next string. So if some data crosses the end of one string and the start of the next, how do I deal with that?
Issue 2: How do I analyse the string and extract only the parts I need?
The solution I'm providing is not a complete answer but more like an idea which might help you get what you want.
Everything else which I present is an assumption on my behalf.
// Assuming your data is stored in a file "yourdatafile".
// Split all the text on "$", assuming this separates the GPS data.
string[] splitStrings = File.ReadAllText("yourdatafile").Split('$');
// I found an extra string lingering in the sample you provided
// because I split on "$", so you have to take that into account.
var GPSList = new List<string>();
var WAVList = new List<string>();
foreach (var str in splitStrings)
{
    // If the string contains "Header", separate the WAV part from the GPS data.
    if (str.Contains("Header"))
    {
        // Find the last asterisk that comes before "Header".
        string temp = str.Remove(str.IndexOf("Header"));
        int indexOfAsterisk = temp.LastIndexOf("*");
        string stringBeforeAsterisk = str.Substring(0, indexOfAsterisk + 1);
        // Take the remainder by position; Replace would also strip any
        // repeated occurrence and throws if the prefix is empty.
        string stringAfterAsterisk = str.Substring(indexOfAsterisk + 1);
        WAVList.Add(stringAfterAsterisk);
        GPSList.Add("$" + stringBeforeAsterisk);
    }
    else
    {
        GPSList.Add("$" + str);
    }
}
This produces exactly the output you asked for; the only exception is that extra string. Also, some non-standard characters might show up as black blocks.
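For Issue 1 (data that crosses the end of one string and the start of the next), one possible approach is to hold back the last, possibly incomplete, piece of each string and prepend it to the next one. The following is only a rough sketch under assumptions: ChunkSplitter and SplitRecords are hypothetical names, the chunks are assumed to arrive in order, and "$" is assumed to appear only at the start of a GPS sentence. Your sample shows "$" inside the WAV bytes as well, so a real implementation would need a stricter marker such as "$GP".

```csharp
using System;
using System.Collections.Generic;

static class ChunkSplitter
{
    // Hypothetical helper: splits consecutive chunks on '$' and carries the
    // trailing (possibly incomplete) piece of each chunk over to the next one.
    public static List<string> SplitRecords(IEnumerable<string> chunks)
    {
        var records = new List<string>();
        string carry = "";
        foreach (string chunk in chunks)
        {
            string[] pieces = (carry + chunk).Split('$');
            // Every piece except the last was followed by a '$', so it is complete.
            for (int i = 0; i < pieces.Length - 1; i++)
            {
                if (pieces[i].Length > 0)
                {
                    records.Add(pieces[i]);
                }
            }
            // The last piece may continue in the next chunk, so hold it back.
            carry = pieces[pieces.Length - 1];
        }
        // Whatever is left after the final chunk is the last record.
        if (carry.Length > 0)
        {
            records.Add(carry);
        }
        return records;
    }

    static void Main()
    {
        // Shortened, made-up chunks: "Header Test" is cut in half by the boundary.
        var chunks = new[] { "GPGGA,1*43$GPVTG,0.00*32Head", "er TestAB$GPGGA,2*42" };
        foreach (string record in SplitRecords(chunks))
        {
            Console.WriteLine(record);
        }
    }
}
```

Each complete record can then go through the same "Header" / asterisk logic as above. Because only one chunk plus the carried-over remainder is in memory at a time, this should also avoid loading all 8 GB with File.ReadAllText.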
I have an MVC app in which I need to store information in the database. I get a string value, e.g.:
string a = "a,b,c";
I then split the string on the commas:
string[] b = a.Split(',');
Before I save to the database I have to add the commas back in, and this is where I'm stuck. I can add the commas, but one also gets added to the end of the string, which I don't want. If I do TrimEnd(',') it removes every comma. Can someone tell me where I'm going wrong, please? I'm adding the commas back like this:
foreach (var items in b)
{
    Console.WriteLine(string.Format("{0},", items));
}
Please note that I have to split on the comma first because some validation needs to be carried out before saving to the DB.
The expected result should be for example
a,b,c
Instead I get
a,b,c,
Update - Below is the code I'm using in my MVC app after Bruno Garcia's answer:
string[] checkBoxValues = Request.Form["location"].Split(',');
foreach (var items in checkBoxValues)
{
    if (!items.Contains("false"))
    {
        UsersDto.Location += string.Join(",", items);
    }
}
Try:
string.Join(",", b);
This will add a ',' in between each item of your array
Based on the code you posted this is what I think you need
UsersDto.Location = string.Join(
    ",",
    Request.Form["location"]
        .Split(',')
        .Where(item => !item.Contains("false")));
That will split the values in Request.Form["location"] on comma. Then filter out items that contain "false" as a substring, and finally join them back together with a comma.
So a string like "abc,def,blahfalseblah,xyz" would become "abc,def,xyz".
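To try the same split, filter and join steps outside of an MVC request, here is a small console sketch. JoinDemo and KeepNonFalse are hypothetical names used only for illustration, and a plain string stands in for Request.Form["location"].

```csharp
using System;
using System.Linq;

static class JoinDemo
{
    // Hypothetical helper: the same split / filter / join steps as above,
    // applied to a plain string instead of Request.Form["location"].
    public static string KeepNonFalse(string raw)
    {
        return string.Join(",", raw.Split(',').Where(item => !item.Contains("false")));
    }

    static void Main()
    {
        // string.Join only puts the separator between items, so no trailing comma.
        Console.WriteLine(KeepNonFalse("abc,def,blahfalseblah,xyz")); // prints abc,def,xyz
        Console.WriteLine(KeepNonFalse("a,b,c"));                     // prints a,b,c
    }
}
```

If every item is filtered out, the result is simply an empty string, so there is nothing to trim afterwards.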
You can just use String.Join then?
var result = String.Join(",", b); // a,b,c
Full document: https://msdn.microsoft.com/en-us/library/57a79xd0(v=vs.110).aspx
You can also do it like this:
string[] checkBoxValues = Request.Form["location"].Split(',');
string s = "";
foreach (var items in checkBoxValues)
{
    if (!items.Contains("false"))
    {
        s = s + string.Format("{0},", items);
    }
}
UsersDto.Location = s.TrimEnd(',');
Having a bit of trouble with converting a comma separated text file to a generic list. I have a class (called "Customers") defined with the following attributes:
Name (string)
City (string)
Balance (double)
CardNumber (int)
The values will be stored in a text file in this format: Name,City,Balance,CardNumber, e.g. John,Memphis,10,200789. There will be multiple lines of this. What I want to do is have each line placed in a list item when the user clicks a button.
I've worked out I can break each line up using the .Split() method, but have no idea how to make the correct value go into the correct attribute of the list. (Please note: I know how to use get/set properties, and I am not allowed to use LINQ to solve the problem).
Any help appreciated, as I am only learning and have been working on this for a while with no luck. Thanks
EDIT:
Sorry, it appears I'm not making myself clear. I know how to use .Add.
If I have two lines in the text file:
A,B,1,2 and
C,D,3,4
What I don't know how to do is make the name "field" in the list item in position 0 equal "A", and the name "field" in the item in position 1 equal "C" and so on.
Sorry for the poor use of terminology, I'm only learning. Hope you understand what I'm asking (I'm sure it's really easy to do once you know)
The result of string.Split will give you an array of strings:
string[] lineValues = line.Split(',');
You can access values in an array by index:
string name = lineValues[0];
string city = lineValues[1];
You can convert strings to double or int using their respective Parse methods:
double balance = double.Parse(lineValues[2]);
int cardNumber = int.Parse(lineValues[3]);
You can instantiate the class and assign to it very simply:
Customer customerForCurrentLine = new Customer()
{
    Name = name,
    City = city,
    Balance = balance,
    CardNumber = cardNumber,
};
Simply loop over the lines, instantiate a Customer for that line, and add it to a variable you've created of the type List<Customer>
If you want your program to be bulletproof, you're going to have to do a lot of checking to skip over lines that don't have enough values, or that would fail to parse to the correct number type. For example, check lineValues.Length == 4 and use int.TryParse(...) and double.TryParse(...).
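Putting those steps together, a defensive version might look roughly like the sketch below. It uses no LINQ. The Customer class is assumed to have exactly the four properties from the question, CustomerParser and Parse are hypothetical names, and skipping bad lines silently (rather than reporting them) is just one possible choice.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

class Customer
{
    public string Name { get; set; }
    public string City { get; set; }
    public double Balance { get; set; }
    public int CardNumber { get; set; }
}

static class CustomerParser
{
    // Hypothetical helper: turns "Name,City,Balance,CardNumber" lines into
    // Customer objects and skips any line that does not fit that shape.
    public static List<Customer> Parse(IEnumerable<string> lines)
    {
        var customers = new List<Customer>();
        foreach (string line in lines)
        {
            string[] lineValues = line.Split(',');
            if (lineValues.Length != 4)
            {
                continue; // wrong number of fields
            }

            double balance;
            int cardNumber;
            // Invariant culture is an assumption: balances are written like "10.5".
            if (!double.TryParse(lineValues[2].Trim(), NumberStyles.Float,
                                 CultureInfo.InvariantCulture, out balance))
            {
                continue; // balance is not a number
            }
            if (!int.TryParse(lineValues[3].Trim(), out cardNumber))
            {
                continue; // card number is not an integer
            }

            customers.Add(new Customer
            {
                Name = lineValues[0].Trim(),
                City = lineValues[1].Trim(),
                Balance = balance,
                CardNumber = cardNumber,
            });
        }
        return customers;
    }

    static void Main()
    {
        // In the real program the lines would come from the text file.
        var lines = new[] { "John,Memphis,10,200789", "not enough fields", "A,B,x,2" };
        foreach (Customer c in Parse(lines))
        {
            Console.WriteLine(c.Name + " " + c.CardNumber);
        }
    }
}
```

With the sample lines in Main, only the first line survives; the other two are skipped because of the field count and the unparseable balance.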
Read the file and split its text on the newline character. Then loop over the lines: split each one on the comma, create a new object, assign the values to its properties, and add that object to a list.
Like this:
List<Customers> lst = new List<Customers>();
string[] str = System.IO.File.ReadAllText(@"C:\CutomersFile.txt")
    .Split(new string[] { Environment.NewLine },
           StringSplitOptions.None);
for (int i = 0; i < str.Length; i++)
{
    string[] s = str[i].Split(',');
    Customers c = new Customers();
    c.Name = s[0];
    c.City = s[1];
    c.Balance = Convert.ToDouble(s[2]);
    c.CardNumber = Convert.ToInt32(s[3]);
    lst.Add(c);
}
BTW class name should be Customer and not Customers
Split() generates an array of strings in the order they appeared in the source string. Thus, if your name field is the first column in the CSV file, it will always be the first index in the array.
someCustomer.Name = splitResult[0];
And so on. You'll also need to look into int.TryParse and double.TryParse for your class's numerically typed properties.
PHP developer here working with c#.
I'm using a technique to remove a block of text from a large string by exploding the string into an array and then shifting the first element out of the array and turning what remains back into a string.
With PHP (an awesome & easy language) it was just
$array = explode('somestring',$string);
array_shift($array);
$newstring = implode(' ', $array);
and I'm done.
I get so mad at c# for not allowing me to create dynamic arrays and for not offering me default functions that can do the same thing as PHP regarding arrays. Instead of dynamic arrays I have to create lists and predefine key structures etc. But I'm new and I'm sure there are still equally graceful ways to do the same with c#.
Will someone show me a clean way to accomplish this goal with c#?
Rephrase of question: How can I remove the first element from an array using c# code?
Here is how far I've gotten, but RemoveAt throws an error while debugging so I don't believe it works:
//scoop-out feed header information
if (entry_start != "")
{
    string[] parts = Regex.Split(this_string, @entry_start);
    parts.RemoveAt(0);
    this_string = String.Join(" ", parts);
}
"I get so mad at c# for not allowing me to create dynamic arrays"
You may take a look at the List<T> class. Its RemoveAt might be worth checking.
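As a rough illustration of the List<T> route, the PHP explode / array_shift / implode sequence could be mirrored like this. ListDemo and DropFirst are hypothetical names, and a plain string separator is assumed instead of a regular expression.

```csharp
using System;
using System.Collections.Generic;

static class ListDemo
{
    // Hypothetical helper: split on a separator, drop the first piece,
    // and join what remains with spaces.
    public static string DropFirst(string input, string separator)
    {
        // A List<T> can grow and shrink, unlike a fixed-size array.
        var parts = new List<string>(
            input.Split(new[] { separator }, StringSplitOptions.None));
        parts.RemoveAt(0); // Split always returns at least one element
        return string.Join(" ", parts);
    }

    static void Main()
    {
        Console.WriteLine(DropFirst("header--a--b", "--")); // prints a b
    }
}
```

The array returned by Split has a fixed size, which is why it has no RemoveAt; copying it into a List<string> gives you one.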
But for your particular scenario you could simply use LINQ and the Skip extension method (don't forget to add using System.Linq; to your file in order to bring it into scope):
if (entry_start != "")
{
    string[] parts = Regex.Split(this_string, @entry_start).Skip(1).ToArray();
    this_string = String.Join(" ", parts);
}
C# is not designed to be quick and dirty, nor does it particularly specialize in text manipulation. Furthermore, the technique you use for removing a portion from the beginning of a string is crazy imho.
Why don't you just use String.Substring(int start, int length) coupled with String.IndexOf("your delimiter")?
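A minimal sketch of that idea, assuming the delimiter is a plain string rather than a regex pattern (SubstringDemo and AfterFirst are made-up names for illustration):

```csharp
using System;

static class SubstringDemo
{
    // Hypothetical helper: returns everything after the first occurrence of
    // the delimiter, or the input unchanged if the delimiter is not found.
    public static string AfterFirst(string input, string delimiter)
    {
        int index = input.IndexOf(delimiter, StringComparison.Ordinal);
        if (index < 0)
        {
            return input;
        }
        // The one-argument Substring overload takes everything from that
        // position to the end, so no length needs to be computed.
        return input.Substring(index + delimiter.Length);
    }

    static void Main()
    {
        Console.WriteLine(AfterFirst("feed header<entry>one<entry>two", "<entry>"));
        // prints one<entry>two
    }
}
```

Note that, unlike the split-and-join version, this leaves later occurrences of the delimiter in place instead of replacing them with spaces.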
Here is the corresponding C# code:
string input = "a,b,c,d,e";
string[] splitvals = input.Split(',');
string output = String.Join(",", splitvals, 1, splitvals.Length-1);
MessageBox.Show(output);
You can use LINQ for this:
if (entry_start != "")
    this_string = String.Join(" ", Regex.Split(this_string, @entry_start).Skip(1).ToArray());
string split = ",";
string str = "asd1,asd2,asd3,asd4,asd5";
string[] ary = str.Split(new string[] { split }, StringSplitOptions.RemoveEmptyEntries);
string newstr = string.Join(split, ary, 1, ary.Length - 1);
This splits at ",", removes the first record, then joins the rest back together with ",".
As stated above, you can use LINQ. Skip(int) will return an IEnumerable<string> that you can then convert back to an array with ToArray().
string[] myArray = new string[]{"this", "is", "an", "array"};
myArray = myArray.Skip(1).ToArray();
You might be more comfortable with generic lists, which work more like PHP arrays, than with arrays.
List<T>
But if your goal is "to remove a block of text from a large string" then the easier way would be:
string Example = "somestring";
string BlockRemoved = Example.Substring(1);
// BlockRemoved = "omestring"
Edit
I misunderstood the question, thinking you were just removing the first element from the array where the array consisted of the characters that make up the string.
To split a string by a delimiter, look at the String.Split method instead. Some good examples are given here.