Sort a variable based on header text - c#

I am looking for guidance. As I tried to convey in the title, I receive data that sometimes looks like this:
entry[0] = "SHAPE", "X", "Y"
entry[1] = "Circle", "2", "3"
and sometimes may look like this:
entry[0] = "X", "Y", "SHAPE"
entry[1] = "2", "3", "Circle"
As you can see, they are ordered based on the first row values, which I will call "headerValues" below.
I am now trying to map my variables (for example "shape") so that each one is read from the column its header actually refers to. I want to do this so I don't end up with an X number in my "Shape" variable because the input order is different than I planned for.
I am also well aware that I may want to remove the first row before I add the rows to my shapes, but that is an issue I want to try to figure out on my own in order to learn. I am only here because I have been stuck on this problem for a while now, and I would really appreciate any help I can get from a more seasoned programmer than me.
Below you will find the code:
var csvRows = csvData.Split(';');
var headerValues = csvRows[0].Split(',');
List<Shapes> shapes = new List<Shapes>();
if (csvRows.Count() > 0)
foreach (var row in csvRows)
{
var csvColumn = row.Split(',').Select(csvData => csvData.Replace(" ", "")).Where(csvData => !string.IsNullOrEmpty(csvData)).Distinct().ToList();
if (csvColumn.Count() == 5)
{
shapes.Add(new()
{
shape = csvColumn[0], // want this index to be the position where headerValues contains "SHAPE"
});
}
else
{
Console.WriteLine(row + " does not have 5 inputs and cannot be added!");
}
}
Thank you in advance!

You can determine the index of each column from the header row (ToList is LINQ, FindIndex is a List<T> method):
var colShape = headerValues.ToList().FindIndex(e => e.Equals("SHAPE"));
and then use that index to set the property in the object:
shapes.Add(new()
{
shape = csvColumn[colShape], // colShape is the position of "SHAPE" in headerValues
});
In the long run you would be better off using a csv parsing library.
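The same lookup can be done once for every column by building a name-to-index map from the header row. This is only a minimal sketch, assuming the semicolon-separated rows and comma-separated cells from the question; `HeaderMapping` and `BuildColumnMap` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HeaderMapping
{
    // Hypothetical helper: maps each trimmed header name to its column index.
    // The comparer makes the lookup case-insensitive ("SHAPE" == "shape").
    public static Dictionary<string, int> BuildColumnMap(string headerRow)
    {
        return headerRow
            .Split(',')
            .Select((name, index) => (Name: name.Trim(), Index: index))
            .ToDictionary(c => c.Name, c => c.Index, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // Assumed input shape: rows separated by ';', cells separated by ','.
        var csvData = "X, Y, SHAPE;2, 3, Circle";
        var rows = csvData.Split(';');
        var columns = BuildColumnMap(rows[0]);

        // Read one data row by header name instead of by fixed position.
        var cells = rows[1].Split(',').Select(c => c.Trim()).ToArray();
        var shape = cells[columns["shape"]];
        var x = cells[columns["X"]];
        Console.WriteLine($"shape={shape}, x={x}");
    }
}
```

With a map like this the column order in the input no longer matters, as long as the header names themselves stay the same.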

Since your data is in CSV format, you don't need to reinvent the wheel. Just use a helper library like CsvHelper:
using var reader = new StringReader(csvData);
using var csvReader = new CsvReader(reader, CultureInfo.InvariantCulture);
var shapes = csvReader.GetRecords<Shapes>().ToList();
You may need to annotate the Shapes.shape field or property if its casing differs from the header in the data. Use the NameAttribute provided by CsvHelper for that.


Select Row of CSV File [closed]

I have a CSV File with these headers:
date;clock;value
My aim is to select the CSV line with a specific date to get the corresponding value.
For example:
I want to select date 20.08.22 and the result should be 130
15.08.22;07:05;100
20.08.22;08:04;130
21.08.22;10:04;150
With this code snippet I read the lines of the csv file:
private void Werte_aus_CSV_auslesen()
{
var path = @"E:\werte.csv";
using (TextFieldParser csvParser = new TextFieldParser(path))
{
csvParser.CommentTokens = new string[] { "#" };
csvParser.SetDelimiters(new string[] { ";" });
csvParser.HasFieldsEnclosedInQuotes = true;
// Skip the row with the column names
csvParser.ReadLine();
while (!csvParser.EndOfData)
{
// Read current line fields, pointer moves to the next line.
fields = csvParser.ReadFields();
datum.Add(fields[0]);
uhrzeit.Add(fields[1]);
wert.Add(double.Parse(fields[2], CultureInfo.InvariantCulture));
}
}
}
The approach you are using has to scan the entire CSV every time you look up a value. This might be a performance problem if the method is called multiple times. It would be better to build a dictionary that maps each date to its value; it can be built once and reused for each subsequent lookup.
I maintain a couple libraries that make this pretty easy: Sylvan.Data and Sylvan.Data.Csv. Here is a complete C# 10 console app that demonstrates how to accomplish this:
using Sylvan.Data.Csv;
using Sylvan.Data;
// data: would normally use CsvDataReader.Create(csvFileName, opts);
var data =
new StringReader(
@"date;clock;value
15.08.22;07:05;100
20.08.22;08:04;130
21.08.22;10:04;150
");
// parameter:
var selectDate = new DateTime(2022, 8, 20);
// configure settings so the csv reader understands your data
var opts = new CsvDataReaderOptions
{
DateTimeFormat = "dd'.'MM'.'yy",
// ignore clock, as it isn't used
Schema = new CsvSchema(Schema.Parse("date:date,clock,value:int"))
};
var csvReader = CsvDataReader.Create(data, opts);
// create a dictionary to cache the CSV data for quick lookups
// creating the dictionary scans the whole dataset, but subsequent lookups will
// be blazing fast.
{
var dict =
csvReader
.GetRecords<Record>() // bind the CSV data to the Record class
.ToDictionary(r => r.Date, r => r.Value);
Console.WriteLine(dict.TryGetValue(selectDate, out var value) ? value.ToString() : "Value not found");
}
class Record
{
public DateTime Date { get; set; }
public int Value { get; set; }
}
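The same caching idea can also be sketched with only the base class library. This is a rough illustration rather than a replacement for a real CSV parser: it assumes the three-column `date;clock;value` layout from the question, no quoted fields, and at most one row per date. `DateLookup` and `BuildIndex` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class DateLookup
{
    // Hypothetical helper: parses "date;clock;value" lines (header first)
    // into a date -> value dictionary. ToDictionary throws if a date repeats.
    public static Dictionary<DateTime, int> BuildIndex(IEnumerable<string> lines)
    {
        return lines
            .Skip(1) // skip the header row
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(';'))
            .ToDictionary(
                f => DateTime.ParseExact(f[0], "dd.MM.yy", CultureInfo.InvariantCulture),
                f => int.Parse(f[2], CultureInfo.InvariantCulture));
    }

    public static void Main()
    {
        var lines = new[]
        {
            "date;clock;value",
            "15.08.22;07:05;100",
            "20.08.22;08:04;130",
            "21.08.22;10:04;150",
        };

        // Build once, then reuse for every lookup.
        var index = BuildIndex(lines);
        Console.WriteLine(index.TryGetValue(new DateTime(2022, 8, 20), out var value)
            ? value.ToString()
            : "Value not found");
    }
}
```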
Matched arrays/lists like datum, uhrzeit, and wert, where values in separate collections are related only by index, are an anti-pattern: something to avoid. It is much better to create a class with a property for each of the values, and then have one collection that holds instances of the class.
public class MyData
{
public DateTime date {get;set;}
public int value {get;set;}
}
(Of course, give it a better name than "MyData")
Newer code might also use a record instead of a class.
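As a rough illustration of the record option (this needs C# 9 or later, and `Measurement` is just a placeholder name):

```csharp
using System;

// Positional record: the compiler generates the constructor, the
// init-only properties Date and Value, and value-based equality.
public record Measurement(DateTime Date, int Value);

public static class RecordDemo
{
    public static void Main()
    {
        var a = new Measurement(new DateTime(2022, 8, 20), 130);
        var b = new Measurement(new DateTime(2022, 8, 20), 130);

        // Records with equal members compare equal, unlike plain classes.
        Console.WriteLine(a == b);
        Console.WriteLine(a.Value);
    }
}
```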
We can further improve this by separating the code to read the csv data from the code that composes the objects. Start with something like this:
private IEnumerable<string[]> Werte_aus_CSV_auslesen(string path)
{
using (TextFieldParser csvParser = new TextFieldParser(path))
{
csvParser.CommentTokens = new string[] { "#" };
csvParser.SetDelimiters(new string[] { ";" });
csvParser.HasFieldsEnclosedInQuotes = true;
// Skip the row with the column names
csvParser.ReadLine();
while (!csvParser.EndOfData)
{
// Read current line fields, pointer moves to the next line.
yield return csvParser.ReadFields();
}
}
}
Notice how it accepts an input and returns an object (the enumerable with the data). Also notice how it avoids anything to do with processing the individual rows. It is only concerned with parsing the CSV/SSV inputs. It doesn't care what fields you expect to find, and can handle any file input with a header line, hash comments, and semi-colon field separators.
Since this gives us string[] values, we also add a method to transform a string[] into a class instance. I like to start out with this as a static method of the class itself, but as a project grows to have many of these methods they may eventually be moved to their own static type:
public class MyData
{
public DateTime date {get;set;}
public int value {get;set;}
public static MyData FromCSVRow(string[] input)
{
return new MyData() {
date = DateTime.ParseExact($"{input[0]} {input[1]}", "dd.MM.yy HH:mm", null),
value = int.Parse(input[2])
};
}
}
And now with all that out of the way, we can finally put it all together to get your answer:
var targetDate = new DateTime(2022, 8, 20);
var csv = Werte_aus_CSV_auslesen(@"E:\werte.csv");
var rows = csv.Select(MyData.FromCSVRow);
var result = rows.Where(r => r.date.Date == targetDate);
If we really wanted to, we could even treat all that as a single line of code (it's probably better to keep it separate, for readability/maintainability):
var result = Werte_aus_CSV_auslesen(@"E:\werte.csv").
Select(MyData.FromCSVRow).
Where(r => r.date.Date == new DateTime(2022, 8, 20));
Note result is still an IEnumerable<MyData>, because there might be more than one row matching the criteria. If you are really sure there will only be one matching record, you can use this:
var result = rows.Where(r => r.date.Date == targetDate).FirstOrDefault();
or this:
var result = rows.Where(r => r.date.Date == targetDate).First();
depending on what you want to happen if no match is found.
One of the nice features here is that the rows are evaluated lazily: each record is checked as it is read from the file. When the query ends with First() or FirstOrDefault(), reading stops as soon as a match is found, which is potentially a very nice performance win.
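The early stop can be observed with a small stand-in for the file reader. This is only a sketch: `LazyDemo`, `ReadRows`, and the `RowsRead` counter are hypothetical, and an in-memory array takes the place of the CSV file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many rows the iterator has produced so far.
    public static int RowsRead;

    // Stand-in for the CSV reader: yields one split row at a time.
    public static IEnumerable<string[]> ReadRows()
    {
        var lines = new[] { "15.08.22;07:05;100", "20.08.22;08:04;130", "21.08.22;10:04;150" };
        foreach (var line in lines)
        {
            RowsRead++;
            yield return line.Split(';');
        }
    }

    public static void Main()
    {
        // First stops pulling rows from the iterator once the predicate matches.
        var match = ReadRows().First(f => f[0] == "20.08.22");
        Console.WriteLine($"value={match[2]}, rows read={RowsRead}");
    }
}
```

The match is the second row, so the third row is never produced. Calling ToList() on the query instead would read all three rows.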

take one value from multiple loops

I have a list of arrays, from which I want to take one value from each array and build up a JSON structure. Currently, for every managedstrategy the currency is always the last value in the loop. How can I take the 1st, then the 2nd value, etc. while looping over the names?
List<managedstrategy> Records = new List<managedstrategy>();
int idcnt = 0;
foreach (var name in results[0])
{
managedstrategy ms = new managedstrategy();
ms.Id = idcnt++;
ms.Name = name.ToString();
foreach (var currency in results[1]) {
ms.Currency = currency.ToString();
}
Records.Add(ms);
}
var Items = new
{
total = results.Count(),
Records
};
return Json(Items, JsonRequestBehavior.AllowGet);
JSON structure is {Records:[{name: blah, currency: gbp}]}
Assuming that I understand the problem correctly, you may want to look into the Zip method provided by Linq. It's used to "zip" together two different lists, similar to how a zipper works.
A related question can be found here.
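A minimal sketch of the Zip approach, assuming the names and currencies are two parallel sequences as in the question. The `ManagedStrategy` class here is a simplified stand-in for the asker's type, and `ZipDemo.Pair` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the asker's managedstrategy class.
public class ManagedStrategy
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Currency { get; set; }
}

public static class ZipDemo
{
    // Pairs the i-th name with the i-th currency.
    // Zip stops at the end of the shorter sequence.
    public static List<ManagedStrategy> Pair(IEnumerable<object> names, IEnumerable<object> currencies)
    {
        return names
            .Zip(currencies, (name, currency) => (Name: name, Currency: currency))
            .Select((pair, i) => new ManagedStrategy
            {
                Id = i,
                Name = pair.Name.ToString(),
                Currency = pair.Currency.ToString()
            })
            .ToList();
    }

    public static void Main()
    {
        var results = new List<object[]>
        {
            new object[] { "Alpha", "Beta" },
            new object[] { "GBP", "USD" },
        };

        foreach (var ms in Pair(results[0], results[1]))
            Console.WriteLine($"{ms.Id}: {ms.Name} ({ms.Currency})");
    }
}
```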
Currently, the second loop is nested inside the first, so every record ends up with the last currency. Use a single for loop that indexes both arrays instead:
for (int i = 0; i < someNumber; i++) // someNumber: the number of entries in results[0]
{
var ms = new managedstrategy();
ms.Id = i;
ms.Name = results[0][i].ToString();
ms.Currency = results[1][i].ToString();
Records.Add(ms);
}

Flat file normalization with a dynamic number of columns

I have a flat file with an unfortunately dynamic column structure. There is a value that is in a hierarchy of values, and each tier in the hierarchy gets its own column. For example, my flat file might resemble this:
StatisticID|FileId|Tier0ObjectId|Tier1ObjectId|Tier2ObjectId|Tier3ObjectId|Status
1234|7890|abcd|efgh|ijkl|mnop|Pending
...
The same feed the next day may resemble this:
StatisticID|FileId|Tier0ObjectId|Tier1ObjectId|Tier2ObjectId|Status
1234|7890|abcd|efgh|ijkl|Complete
...
The thing is, I don't care much about all the tiers; I only care about the id of the last (bottom) tier, and all the other row data that is not a part of the tier columns. I need to normalize the feed to something resembling this so I can load it into a relational database:
StatisticID|FileId|ObjectId|Status
1234|7890|ijkl|Complete
...
What would be an efficient, easy-to-read mechanism for determining the last tier object id, and organizing the data as described? Every attempt I've made feels kludgy to me.
Some things I've done:
I have tried to examine the column names for regular expression patterns, identify the columns that are tiered, order them by name descending, and select the first record... but I lose the ordinal column number this way, so that didn't look good.
I have placed the columns I want into an IDictionary<string, int> object to reference, but again reliably collecting the ordinal of the dynamic columns is an issue, and it seems this would be rather non-performant.
I ran into a similar problem a few years ago. I used a Dictionary to map the columns. It was not pretty, but it worked.
First make a Dictionary:
private Dictionary<int, int> GetColumnDictionary(string headerLine)
{
Dictionary<int, int> columnDictionary = new Dictionary<int, int>();
List<string> columnNames = headerLine.Split('|').ToList();
string maxTierObjectColumnName = GetMaxTierObjectColumnName(columnNames);
for (int index = 0; index < columnNames.Count; index++)
{
if (columnNames[index] == "StatisticID")
{
columnDictionary.Add(0, index);
}
if (columnNames[index] == "FileId")
{
columnDictionary.Add(1, index);
}
if (columnNames[index] == maxTierObjectColumnName)
{
columnDictionary.Add(2, index);
}
if (columnNames[index] == "Status")
{
columnDictionary.Add(3, index);
}
}
return columnDictionary;
}
private string GetMaxTierObjectColumnName(List<string> columnNames)
{
// Edit this function if the tier number can be greater than 9 (string ordering puts "Tier10" before "Tier9")
var maxTierObjectColumnName = columnNames.Where(c => c.Contains("Tier") && c.Contains("Object")).OrderBy(c => c).Last();
return maxTierObjectColumnName;
}
After that, it's simply a matter of running through the file:
private List<DataObject> ParseFile(string fileName)
{
StreamReader streamReader = new StreamReader(fileName);
string headerLine = streamReader.ReadLine();
Dictionary<int, int> columnDictionary = this.GetColumnDictionary(headerLine);
string line;
List<DataObject> dataObjects = new List<DataObject>();
while ((line = streamReader.ReadLine()) != null)
{
var lineValues = line.Split('|');
string statId = lineValues[columnDictionary[0]];
dataObjects.Add(
new DataObject()
{
StatisticId = lineValues[columnDictionary[0]],
FileId = lineValues[columnDictionary[1]],
ObjectId = lineValues[columnDictionary[2]],
Status = lineValues[columnDictionary[3]]
}
);
}
return dataObjects;
}
I hope this helps (even a little bit).
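If the tier number can exceed 9, one option is to compare the tier numbers numerically instead of ordering the names as strings. The sketch below assumes the columns are named exactly `Tier<N>ObjectId`, as in the sample feed; `TierColumns` and `FindLastTierIndex` are hypothetical names. It returns the ordinal directly, so no second lookup by name is needed.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TierColumns
{
    // Assumed naming scheme: Tier0ObjectId, Tier1ObjectId, ...
    private static readonly Regex TierPattern = new Regex(@"^Tier(\d+)ObjectId$");

    // Returns the index of the tier column with the largest number,
    // or -1 if the header has no tier columns.
    public static int FindLastTierIndex(string[] columnNames)
    {
        return columnNames
            .Select((name, index) => (Hit: TierPattern.Match(name), Index: index))
            .Where(c => c.Hit.Success)
            .OrderByDescending(c => int.Parse(c.Hit.Groups[1].Value))
            .Select(c => c.Index)
            .DefaultIfEmpty(-1)
            .First();
    }

    public static void Main()
    {
        var header = "StatisticID|FileId|Tier2ObjectId|Tier10ObjectId|Status".Split('|');
        // Numeric comparison picks Tier10; string ordering would pick Tier2.
        Console.WriteLine(FindLastTierIndex(header));
    }
}
```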
Personally I would not try to reformat your file. I think the easiest approach would be to parse each row from the front and the back. For example:
itemArray = getMyItems();
statisticId = itemArray[0];
fileId = itemArray[1];
//and so on for the rest of your pre-tier columns
//Then get the second to last column which will be the last tier
lastTierId = itemArray[itemArray.Length - 2];
Since you know the last tier will always be second from the end you can just start at the end and work your way forwards. This seems like it would be much easier than trying to reformat the datafile.
If you really want to create a new file, you could use this approach to get the data you want to write out.
I don't know C# syntax, but something along these lines:
split line in parts with | as separator
get parts [0], [1], [length - 2] and [length - 1]
pass the parts to the database handling code
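In C#, those steps could look roughly like the sketch below. It assumes that StatisticID and FileId are always the first two columns and that Status is always last, as in the sample feed; `RowNormalizer` and `Normalize` are hypothetical names.

```csharp
using System;

public static class RowNormalizer
{
    // Keeps the first two columns, the last tier column (second from
    // the end), and the trailing Status column.
    public static string Normalize(string line)
    {
        var parts = line.Split('|');
        if (parts.Length < 4)
            throw new FormatException($"Expected at least 4 columns but found {parts.Length}.");

        return string.Join("|", parts[0], parts[1], parts[parts.Length - 2], parts[parts.Length - 1]);
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("1234|7890|abcd|efgh|ijkl|mnop|Pending"));
        Console.WriteLine(Normalize("1234|7890|abcd|efgh|ijkl|Complete"));
    }
}
```

The same method can be applied to the header line, although the tier column name it keeps would still need to be renamed to ObjectId.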

C# - sorting by a property

I am trying to sort a collection of objects in C# by a custom property.
(For context, I am working with the Twitter API using the Twitterizer library, sorting Direct Messages into conversation view)
Say a custom class has a property named label, where label is a string that is assigned in the class constructor.
I have a Collection (or a List, it doesn't matter) of said classes, and I want to sort them all into separate Lists (or Collections) based on the value of label, and group them together.
At the moment I've been doing this by using a foreach loop and checking the values that way - a horrible waste of CPU time and awful programming, I know. I'm ashamed of it.
Basically I know that all of the data I have is there given to me, and I also know that it should be really easy to sort. It's easy enough for a human to do it with bits of paper, but I just don't know how to do it in C#.
Does anyone have the solution to this? If you need more information and/or context just ask.
Have you tried Linq's OrderBy?
var mySortedList = myCollection.OrderBy(x => x.PropertyName).ToList();
This is still going to loop through the values to sort - there's no way around that. This will at least clean up your code.
You say sorting but it sounds like you're trying to divide up a list of things based on a common value. For that you want GroupBy.
You'll also want ToDictionary to switch from an IGrouping as you'll presumably be wanting key based lookup.
I assume that the elements within each of the output sets will need to be sorted, so check out OrderBy. Since you'll undoubtedly be accessing each list multiple times, you'll want to materialize it as a list or an array (you mentioned list), so I used ToList.
//Make some test data
var labels = new[] {"A", "B", "C", "D"};
var rawMessages = new List<Message>();
for (var i = 0; i < 15; ++i)
{
rawMessages.Add(new Message
{
Label = labels[i % labels.Length],
Text = "Hi" + i,
Timestamp = DateTime.Now.AddMinutes(i * Math.Pow(-1, i))
});
}
//Group the data up by label
var groupedMessages = rawMessages.GroupBy(message => message.Label);
//Convert to a dictionary for by-label lookup (this gives us a Dictionary<string, List<Message>>)
var messageLookup = groupedMessages.ToDictionary(
//Make the dictionary key the label of the conversation (set of messages)
grouping => grouping.Key,
//Sort the messages in each conversation by their timestamps and convert to a list
messages => messages.OrderBy(message => message.Timestamp).ToList());
//Use the data...
var messagesInConversationA = messageLookup["A"];
var messagesInConversationB = messageLookup["B"];
var messagesInConversationC = messageLookup["C"];
var messagesInConversationD = messageLookup["D"];
It sounds to me like mlorbetske was correct in his interpretation of your question: you want grouping rather than sorting. I just went at the answer a bit differently:
var originalList = new[] { new { Name = "Andy", Label = "Junk" }, new { Name = "Frank", Label = "Junk" }, new { Name = "Lisa", Label = "Trash" } }.ToList();
var myLists = new Dictionary<string, List<Object>>();
originalList.ForEach(x =>
{
if (!myLists.ContainsKey(x.Label))
myLists.Add(x.Label,new List<object>());
myLists[x.Label].Add(x);
});
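For comparison, LINQ's ToLookup builds a similar label-to-items structure in a single call. This is only a sketch of that alternative, using tuples in place of the anonymous objects; `LookupDemo` and `GroupNames` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Groups the names by label. Unlike a Dictionary, indexing a lookup
    // with a missing key returns an empty sequence instead of throwing.
    public static ILookup<string, string> GroupNames(IEnumerable<(string Name, string Label)> items)
    {
        return items.ToLookup(item => item.Label, item => item.Name);
    }

    public static void Main()
    {
        var items = new[]
        {
            (Name: "Andy", Label: "Junk"),
            (Name: "Frank", Label: "Junk"),
            (Name: "Lisa", Label: "Trash"),
        };

        var byLabel = GroupNames(items);
        Console.WriteLine(string.Join(", ", byLabel["Junk"]));
        Console.WriteLine(byLabel["Missing"].Count());
    }
}
```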

Is there a way to access the columns in a Dapper FastExpando via string or index?

I am pulling in a Dapper FastExpando object and want to be able to reference the column names dynamically at run time rather than at design/compile time. Given a query such as:
var testdata = conn.Query("select * from Ride Where RiderNum = 21457");
I want to be able to do the following:
foreach( var row in testdata) {
var Value = row["PropertyA"];
}
I understand that I can do:
var Value = row.PropertyA;
but I can't do that since the name of the property I'm going to need won't be known until runtime.
The answer from this SO Question doesn't work. I still get the same Target Invocation exception. So...
Is there any way to do what I want to do with a Dapper FastExpando?
Sure, it is actually way easier than that:
var sql = "select 1 A, 'two' B";
var row = (IDictionary<string, object>)connection.Query(sql).First();
row["A"].IsEqualTo(1);
row["B"].IsEqualTo("two");
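The cast can be tried out without a database. The sketch below uses an ExpandoObject as a stand-in for a Dapper row, on the assumption (taken from the answer above) that the row object can be cast to IDictionary<string, object>; `DynamicRowDemo` and `GetByName` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class DynamicRowDemo
{
    // Reads a column whose name is only known at run time.
    public static object GetByName(object row, string columnName)
    {
        var fields = (IDictionary<string, object>)row;
        return fields[columnName];
    }

    public static void Main()
    {
        // Stand-in for a row returned by conn.Query(...); not a real Dapper row.
        IDictionary<string, object> row = new ExpandoObject();
        row["PropertyA"] = 1;
        row["PropertyB"] = "two";

        string wanted = "PropertyB"; // would come from configuration or user input
        Console.WriteLine(GetByName(row, wanted));
    }
}
```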
Regarding the portion of the title "or index?" - I needed to access results by index since the column names being returned changed sometimes, so you can use a variation of Sam Saffron's answer like this:
var sql = "select 1, 'two'";
var row = (IDictionary<string, object>)connection.Query(sql).First();
row.Values.ElementAt(0).IsEqualTo(1);
row.Values.ElementAt(1).IsEqualTo("two");
There is a simple way to access the fields directly; see the sample below:
string strConexao = WebConfigurationManager.ConnectionStrings["connection"].ConnectionString;
conexaoBD = new SqlConnection(strConexao);
conexaoBD.Open();
var result = conexaoBD.Query("Select Field1,Field2 from Table").First();
//access field value result.Field1
//access field value result.Field2
if (result.Field1 == "abc") { /* do something */ }
