I have a list of objects. Each object contains a list of categories as a comma-delimited string.
I want to know how many objects I have for each category. For this I think I need to group by the categories and then count the entries; however, I can't wrap my head around grouping by a list.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
class MyDto
{
public string Name { get; set; }
public List<string> Categories => CategoriesString
.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries).ToList();
public string CategoriesString { get; set; }
public override string ToString()
{
return Name + ": " + CategoriesString;
}
}
public class Program
{
public static void Main()
{
var dtos = new MyDto[]
{
new MyDto() { Name = "Dto 1", CategoriesString = "DELIVERY"},
new MyDto() { Name = "Dto 2", CategoriesString = "DELIVERY , DAMAGE"},
new MyDto() { Name = "Dto 3", CategoriesString = "DAMAGE"},
new MyDto() { Name = "Dto 4", CategoriesString = "DAMAGE , DELIVERY"},
new MyDto() { Name = "Dto 5", CategoriesString = "DELIVERY"},
};
var res = dtos.GroupBy(c => c.Categories).Select(c => new { c.Key, amt = c.Count() });
foreach (var c in res)
{
Console.WriteLine(c.Key + " - " + c.amt);
}
// Should return:
// DELIVERY - 4
// DAMAGE - 3
}
}
https://dotnetfiddle.net/bowYb4
The sample above is just to demonstrate the issue and does not actually give the desired result. I'm using data objects coming from a database with EF Core. I'm aware that what I'm trying to do won't translate to SQL; I'm doing this client-side and that is fine.
One option is to use SelectMany to flatten the categories and transform the dtos into key-value pairs (I use value tuples to store them):
var res = dtos
.SelectMany(dto => dto.Categories.Select(c => (Cat: c, dto.Name)))
.GroupBy(c => c.Cat)
.Select(c => new { c.Key, amt = c.Count() });
An alternative solution could be the following:
var lookup = dtos
.Select(c => c.Categories) //retrieve the already split values
.SelectMany(c => c) //flatten the IEnumerable<List<string>> to IEnumerable<string>
.ToLookup(c => c, c => c); //group the same values
foreach (var item in lookup)
{
Console.WriteLine($"{item.Key} - {item.Count()}");
}
The difference between GroupBy and ToLookup is that the former is executed in a deferred way, while the latter is executed immediately.
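To make the deferred/immediate distinction concrete, here is a minimal sketch. The class name DeferredDemo and the sample values are made up for illustration; the point is only that a change to the source list made after both calls is visible to the GroupBy query but not to the lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the number of groups each approach reports after the
    // source list has been modified.
    public static (int Grouped, int LookedUp) Run()
    {
        var categories = new List<string> { "DELIVERY", "DAMAGE", "DELIVERY" };

        // GroupBy only builds a query; nothing is enumerated yet.
        var grouped = categories.GroupBy(c => c);

        // ToLookup enumerates the source right away and stores the result.
        var lookup = categories.ToLookup(c => c);

        // Modify the source after both calls.
        categories.Add("RETURN");

        // The GroupBy query runs here and sees "RETURN"; the lookup does not.
        return (grouped.Count(), lookup.Count);
    }
}
```

If the source can change between building and reading the result, ToLookup (or GroupBy followed by ToList) gives a stable snapshot.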
Could I please get some help with querying from a JSON file? Populating a datagrid view works just fine but what I am trying to do now is filter the data using LINQ which I'm really struggling with.
This works just fine, populating the datagridview with all of my jsonfile data
//dataGridView1.DataSource = (from p in movie2
// select p).ToArray();
Below is what I have been playing around with. When I group by employee ID into g, I can no longer use my p references to fields.
using (StreamReader file = File.OpenText(@"C:\temp\GRMReportingJSONfiles\Assigned_FTE\" + myFile))
{
JsonSerializer serializer = new JsonSerializer();
IEnumerable<AssgnData> movie2 = (IEnumerable<AssgnData>)serializer.Deserialize(file, typeof(IEnumerable<AssgnData>));
dataGridView1.DataSource = (from p in movie2
group p by p.EMPLID[0] into g
select new {
EMPLID = p.EMPLID,
(decimal?)decimal.Parse(p.MNTH1) ?? 0).Sum(),
};
);
//dataGridView1.DataSource = (from p in movie2
// select Int32.Parse(p.MNTH1).Sum();
dataGridView1.DataSource = (from p in movie2
group p by p.EMPLID[0] into g
select (decimal?)decimal.Parse(p.MNTH1) ?? 0).Sum(); //dataGridView1.DataSource = (from p in movie2
// select p).ToArray();
//where p.Resource_BU == "7000776"
//chart1.DataBindCrossTable(movie2, "MNTH1", "1", "PROJECT_ID", "Label = FTE");
//chart1.Refresh();
}
Here is part of the array layout, removed other fields for now as I was just trying to focus on these two, dataset has 100k rows and 50 columns
public class AssgnData
{
public string EMPLID { get; set; }
public string MNTH1 { get; set; }
}
In my opinion, using Fluent Syntax usually makes it a bit easier to understand what is going wrong here.
As soon as you group your data you are no longer working on the individual objects, but on a 'group', which is the key and an enumerable of objects.
Getting the sum per employee should then be grouping by the full employee id and then parsing the MNTH1 fields of your objects and summing them.
dataGridView1.DataSource = movie2
.GroupBy(p => p.EMPLID) // create a group of data per employee
.Select(g => new
{
EMPLID = g.Key, // the employee id is the group key
Sum = g.Sum(data => decimal.Parse(data.MNTH1)) // parse and sum
})
.ToArray();
Edit: you are right, you need the ToArray to evaluate the query. I just verified on my computer and it works.
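The question's `?? 0` suggests that missing or unparsable MNTH1 values should count as zero, and decimal.Parse would throw on those. The following is a sketch under that assumption, not part of the answer above: ParseOrZero and MonthTotals are hypothetical names, and the invariant culture is assumed for the number format. AssgnData is repeated only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class AssgnData
{
    public string EMPLID { get; set; }
    public string MNTH1 { get; set; }
}

public static class MonthTotals
{
    // Hypothetical helper: null, empty, or unparsable text counts as 0.
    private static decimal ParseOrZero(string text) =>
        decimal.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out var value)
            ? value
            : 0m;

    // Groups by the full employee id and sums MNTH1 per employee.
    // Assumes EMPLID is never null (a null key would make ToDictionary throw).
    public static Dictionary<string, decimal> SumByEmployee(IEnumerable<AssgnData> rows) =>
        rows.GroupBy(r => r.EMPLID)
            .ToDictionary(g => g.Key, g => g.Sum(r => ParseOrZero(r.MNTH1)));
}
```

For the grid, the same grouping can be projected to an anonymous type and materialized with ToArray, as in the answer.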
Try the following:
class Program
{
static void Main(string[] args)
{
IEnumerable<AssgnData> movie2 = null;
dataGridView1.DataSource = movie2.GroupBy(x => new {id = x.EMPLID, month = x.MNTH1})
.Select(x => new {
EMPLYID = x.Key.id,
MONTH = x.Key.month,
SUM = x.Sum(y => y.value)
});
}
}
public class AssgnData
{
public string EMPLID { get; set; }
public string MNTH1 { get; set; }
public int value { get;set;}
}
I am trying to use the CsvHelper library to parse my CSV, but I am having an issue: it says that the Itemcode does not exist even though it is there in the file.
// Adding stock item code
Sage.Accounting.Stock.StockItem stockItem = new Sage.Accounting.Stock.StockItem();
string line = null;
public void ImportCsv(string filename)
{
TextReader reader = File.OpenText(filename);
var csv = new CsvReader(reader);
csv.Configuration.HasHeaderRecord = true;
csv.Read();
// Dynamic
// Using anonymous type for the class definition
var anonymousTypeDefinition = new
{
Itemcode = string.Empty,
Barcode = string.Empty
};
var records = csv.GetRecords(anonymousTypeDefinition);
}
This is the csv structure
"Itemcode","Barcode","description"
"P4S100001","303300054486","Test Product"
This is my first time using CsvHelper, as shown here at https://joshclose.github.io/CsvHelper/
You are better off creating a strongly typed model to hold the data if one does not already exist
public class Item {
public string Itemcode { get; set; }
public string Barcode { get; set; }
public string description { get; set; }
}
and using GetRecords<T>() to read the records by type
TextReader reader = File.OpenText(filename);
var csv = new CsvReader(reader);
var records = csv.GetRecords<Item>();
Your GetRecords function needs a type specifier like so:
var records = csv.GetRecords<type>();
Also you may want to put csv.Read() in a while loop depending on your need.
Since all your values have quotes, you need to specify that in the config. Working with quotes in CsvHelper is frustrating. If not all of the values have quotes, there are ways to handle that as well, but not as nicely as this:
var csv = new CsvReader(reader,new CsvHelper.Configuration.Configuration
{
HasHeaderRecord = true,
QuoteAllFields = true
});
var anonymousTypeDefinition = new
{
Itemcode = string.Empty,
Barcode = string.Empty
};
var records = csv.GetRecords(anonymousTypeDefinition);
I'm writing a program to read in CSV files and validate the data. The csv file is comma delimited.
The csv file contains a sales order that is retrieved online so we can't actually edit the CSV file itself. I need to read in the file and split it into the cells. However, the product description will contain further commas which is affecting how I access the data.
My code for pulling the values out is below.
private void csvParse()
{
List<string> products = new List<string>();
List<string> quantities = new List<string>();
List<string> price = new List<string>();
using (var reader = new StreamReader(txt_filePath.Text.ToString()))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
products.Add(values[0]);
quantities.Add(values[2]);
values[3] = values[3].Substring(4);
price.Add(values[3]);
}
}
if (validateData(products, quantities, price) != "")
{
MessageBox.Show(validateData(products, quantities, price));
}
}
Is there any way to ignore the commas inside a given cell, or can the columns be distinguished by another delimiter?
A row of the raw CSV data is below:
TO12345,"E45 Dermatological Moisturising Lotion, 500 ml",765,GBP 1.75
You can use LinqToCSV from NuGet. For example:
void Main()
{
List<MyData> sample = new List<MyData> {
new MyData {Id=1, Name="Hammer", Description="Everything looks like a nail to a hammer, doesn't it?"},
new MyData {Id=2, Name="C#", Description="A computer language."},
new MyData {Id=3, Name="Go", Description="Yet another language, from Google, cross compiles natively."},
new MyData {Id=3, Name="BlahBlah"},
};
string fileName = @"c:\temp\MyCSV.csv";
File.WriteAllText(fileName,"Id,My Product Name,Ignore1,Ignore2,Description\n");
File.AppendAllLines(fileName, sample.Select(s => $@"{s.Id},""{s.Name}"",""ignore this"",""skip this too"",""{s.Description}"""));
CsvContext cc = new CsvContext();
CsvFileDescription inputFileDescription = new CsvFileDescription
{
SeparatorChar = ',',
FirstLineHasColumnNames = true,
IgnoreUnknownColumns=true
};
IEnumerable<MyData> fromCSV = cc.Read<MyData>(fileName, inputFileDescription);
foreach (var d in fromCSV)
{
Console.WriteLine($@"ID:{d.Id},Name:""{d.Name}"",Description:""{d.Description}""");
}
}
public class MyData
{
[CsvColumn(FieldIndex = 1, Name="Id", CanBeNull = false)]
public int Id { get; set; }
[CsvColumn(FieldIndex = 2, Name="My Product Name",CanBeNull = false, OutputFormat = "C")]
public string Name { get; set; }
[CsvColumn(FieldIndex = 5, Name="Description",CanBeNull = true, OutputFormat = "C")]
public string Description { get; set; }
}
It should work..:)
var csvSplit = new Regex("(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)", RegexOptions.Compiled);
string[] csvlines = File.ReadAllLines(txt_filePath.Text.ToString());
var query = csvlines.Select(csvline => new
{
data = csvSplit.Matches(csvline)
}).Select(t => t.data);
var row = query.Select(matchCollection =>
(from Match m in matchCollection select (m.Value.Contains(',')) ? m.Value.Replace(",", "") : m.Value)
.ToList()).ToList();
You can also use the Microsoft.VisualBasic.FileIO.TextFieldParser class. More detailed answer here: TextFieldParser
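A minimal sketch of the TextFieldParser approach follows. The QuotedCsv class and its Parse method are hypothetical names for illustration; it assumes a reference to the Microsoft.VisualBasic assembly and that the file uses double quotes around fields that contain commas, as in the sample row from the question.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedCsv
{
    // Reads every row from the reader and returns its fields.
    // Commas inside double-quoted fields are kept as part of the field.
    public static List<string[]> Parse(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

With the row from the question, Parse(new StringReader(line)) should give four fields, with the product description (including its comma) in the second one. To read from disk, pass a StreamReader opened on the file path instead.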
I'm working in C# (.Net 4) and I am trying to do several things:
I have 2 files ("Offline.csv","online.csv"), and I'm having those files make one "master" file (called "Attendance.csv")
Both offline.csv and online.csv contain similar data---
My Offline.csv file has:
(ID),(TimeInMin),(DateWithoutSlashes yyyymmdd)
01,10,20151201
01,05,20151202
02,11,20151201
03,11,20151202
My Online.csv file has
(ID),(TimeInMin),(DateWithoutSlashes yyyymmdd)
01,70,20151201
02,20,20151202
03,22,20151202
After my program is run, the Attendance.csv should look something like:
(Same headers)
01,80,20151201
01,05,20151202 (notice the date from offline.csv, which doesn't exist in the online.csv)
02,31,20151201
03,33,20151202
So what I'm trying to do is:
Compare the data from both the offline.csv and online.csv files. If data matches on the "ID" and "Date" columns, add the minutes together (column 2) and put them as a row in the Attendance.csv file
However, IF the offline.csv contains rows that the online.csv doesn't have, then put all those other records into the Attendance.csv on their own. Perform the same action with the online.csv, being mindful to not duplicate minutes that were already merged together from step #1
I don't know if that all makes sense, but I hope it does :X
I have been beating my head against the wall all day with this, and I don't know what else to look at.
With all that said, here is what I have so far:
I have created my own class, called "aoitime", it looks as follows:
public class aoitime
{
public string ID { get; set; }
public string online { get; set; }
public string offline { get; set; }
public string dtonline { get; set; }
public string dtoffline { get; set; }
public string date { get; set; }
}
I then use IEnumerable in a different function, looks similar to ...
IEnumerable<aoitime> together =
from online in onlinefile
let onlineFields = online.Split(',')
from id in offlinefile
let offlineFields = id.Split(',')
where (onlineFields[0] == offlineFields[0] && onlineFields[2] == offlineFields[2]) || (!offlineFields[1].Contains(""))
orderby onlineFields[0]
select new aoitime
{
ID = onlineFields[0],
online = onlineFields[1],
offline = offlineFields[1],
dtonline = onlineFields[2],
dtoffline = offlineFields[2],
date = onlineFields[2]
};
StreamWriter Attendance = new StreamWriter(destination);
Attendance.Write("SIS_NUMBER,MINUTES,DATE" + Environment.NewLine);
foreach (aoitime att in together)
{
int date = int.Parse(att.date);
int dateonline = int.Parse(att.dtonline);
int dateoffline = int.Parse(att.dtoffline);
int online = int.Parse(att.online);
int offline = int.Parse(att.offline);
int total = (online + offline);
Console.WriteLine("Writing total time now: "+online);
Attendance.Write(att.ID + "," + total + "," + date + Environment.NewLine);
}
I then tried creating another IEnumerable class spawn that looks similar to the one above, but instead using "where offlineFields[2] != onlineFields[2]" but I get unpredictable results. I just don't know where else to look or what else to do.
Please be gentle, I'm very much new to programming in general (I promise this isn't for a classroom assignment :-)
thanks so much for any advice and reading this book!
You are almost there. I wrote this code, so hopefully you will be able to learn something from it.
First you only need one entity class for this. Note the ToString method. You will see how it's used later.
public class Attendance
{
public int Id { get; set; }
public int TimeInMinutes { get; set; }
public string Date { get; set; }
public override string ToString()
{
return string.Format("{0},{1},{2}", Id, TimeInMinutes, Date);
}
}
Now the code to parse your files and create the new file. Read my comments in the code.
var onlineEntries = File.ReadAllLines(@"c:\online.txt");//read online file
var validOnlineEntries = onlineEntries.Where(l => !l.Contains("(")); //remove first line
var onlineRecords = validOnlineEntries.Select(r => new Attendance()
{
Id = int.Parse(r.Split(new[] {","}, StringSplitOptions.None)[0]),
TimeInMinutes = int.Parse(r.Split(new[] {","}, StringSplitOptions.None)[1]),
Date = r.Split(new[] {","}, StringSplitOptions.None)[2],
}).ToList();//populate Attendance class
var offlineEntries = File.ReadAllLines(@"c:\offline.txt"); //read offline file
var validOfflineEntries = offlineEntries.Where(l => !l.Contains("(")); //remove first line
var offlineRecords = validOfflineEntries.Select(r => new Attendance()
{
Id = int.Parse(r.Split(new[] { "," }, StringSplitOptions.None)[0]),
TimeInMinutes = int.Parse(r.Split(new[] { "," }, StringSplitOptions.None)[1]),
Date = r.Split(new[] { "," }, StringSplitOptions.None)[2],
}).ToList();//populate Attendance class
var commonRecords = (from n in onlineRecords
join f in offlineRecords on new {n.Date, n.Id } equals new {f.Date, f.Id} //if Date and Id are equal
select new { n.Id, TimeInMinutes = (n.TimeInMinutes + f.TimeInMinutes), n.Date }).OrderBy(x => x.Id).Distinct().ToList(); //add Online and Off line time
var newRecords = commonRecords.Select(r => new Attendance()
{
Id = r.Id,
TimeInMinutes = r.TimeInMinutes,
Date = r.Date,
}).ToList(); //Populate Attendance again so we can call the ToString method; ToList is needed for AddRange below
onlineRecords.AddRange(offlineRecords); //merge online and offline
var recs = onlineRecords.Distinct().Where(r => !newRecords.Any(o => o.Date == r.Date && o.Id == r.Id)).ToList(); //remove already added items from merged online and offline collection
newRecords.AddRange(recs);//add filtered merged collection to new records
newRecords = newRecords.OrderBy(r => r.Id).ToList();//order new records by id
File.WriteAllLines(@"C:\newFile.txt", newRecords.Select(l => l.ToString()).ToList()); //write new file.
Just to add this as an answer: I am selecting @Kosala-w's suggestion as the answer. My code now looks nearly identical to what he posted, except I changed the ID to a string because the integers used for the IDs are pretty lengthy.
I thank both people who answered this question, and I appreciate the SO community! Have a good day :-)
public class Attendance
{
public string Id { get; set; }
public int TimeInMinutes { get; set; }
public int Code { get; set; }
public string Date { get; set; }
public override string ToString()
{
return string.Format("{0},{1},{2}", Id, TimeInMinutes, Date);
}
}
I also have more rows that I have to handle in the Attendance sheet than I stated in my original question (I didn't worry about those because I wasn't concerned that I'd have a hard time getting what I needed.)
Anyway, the code below is what I used, again, thanks Kosala.
private void createAttendance()
{
try
{
txtStatus.ResetText();
txtStatus.Text += "Creating Attendance file. Please wait.";
string destination = (@"C:\asdf\Attendance.csv");
barStatus.Caption = "Processing Attendance file. Please wait.";
if (File.Exists(destination))
File.Delete(destination);
var validOnlineEntries = File.ReadAllLines(@"C:\asdf\online.csv");//read online file
//var validOnlineEntries = onlineEntries.Where(l => !l.Contains("(")); //remove first line
var onlineRecords = validOnlineEntries.Select(r => new Attendance()
{
Id = (r.Split(new[] { "," }, StringSplitOptions.None)[0] + ",202" + "," + txtYear.Text),
TimeInMinutes = int.Parse(r.Split(new[] { "," }, StringSplitOptions.None)[1]),
Date = r.Split(new[] { "," }, StringSplitOptions.None)[2],
}).ToList();//populate Attendance class
var validOfflineEntries = File.ReadAllLines(@"C:\asdf\offline.csv"); //read offline file
//var validOfflineEntries = offlineEntries.Where(l => !l.Contains("(")); //remove first line
var offlineRecords = validOfflineEntries.Select(r => new Attendance()
{
Id = (r.Split(new[] { "," }, StringSplitOptions.None)[0] + ",202" + "," + txtYear.Text),
TimeInMinutes = int.Parse(r.Split(new[] { "," }, StringSplitOptions.None)[1]),
Date = r.Split(new[] { "," }, StringSplitOptions.None)[2],
}).ToList();//populate Attendance class
var commonRecords = (from n in onlineRecords
join f in offlineRecords on new { n.Date, n.Id } equals new { f.Date, f.Id } //if Date and Id are equal
select new { n.Id, TimeInMinutes = (n.TimeInMinutes + f.TimeInMinutes), n.Date }).OrderBy(x => x.Id).Distinct().ToList(); //add Online and Off line time
var newRecords = commonRecords.Select(r => new Attendance()
{
Id = r.Id,
TimeInMinutes = r.TimeInMinutes,
Date = r.Date,
}).ToList(); //Populate attendance again. So we can call toString method
onlineRecords.AddRange(offlineRecords); //merge online and offline
var recs = onlineRecords.Distinct().Where(r => !newRecords.Any(o => o.Date == r.Date && o.Id == r.Id)).ToList(); //remove already added items from merged online and offline collection
newRecords.AddRange(recs);//add filtered merged collection to new records
newRecords = newRecords.OrderBy(r => r.Id).ToList();//order new records by id
StreamWriter Attendance = new StreamWriter(destination);
//Attendance.Write("SIS_NUMBER,SCHOOL_CODE,SCHOOL_YEAR,ABSENCE_DATE,ABSENCE_REASON1,ABSENCE_REASON2,MINUTES_ATTEND,NOTE,ABS_FTE1,ABS_FTE2" + Environment.NewLine);
Attendance.Write("SIS_NUMBER,SCHOOL_CODE,SCHOOL_YEAR,MINUTES_ATTEND,ABSENCE_DATE,ABSENCE_REASON2,ABSENCE_REASON1,NOTE,ABS_FTE1,ABS_FTE2" + Environment.NewLine);
Attendance.Dispose();
File.AppendAllLines(destination, newRecords.Select(l => l.ToString()).ToList()); //write new file.
Convert_CSV_To_Excel();
}
catch(Exception ex)
{
barStatus.Caption = ("ERROR: "+ex.Message.ToString());
}
}
I plan to do some more fine tuning, but this sure got me in the right direction!
The first thing that I'd do is define a simpler class to hold your aoitimes. For example:
public class aoitime
{
public string ID { get; set; }
public int TimeInMinutes { get; set; }
public string DateWithoutSlashes { get; set; }
}
Then, you'll want to parse the string from the csv file into that class. I figure that that's an implementation detail that you can probably figure out on your own. If not, leave a comment and I can post more details.
Next, the tricky part is that you want not only a join, but you want the exceptions as well. The join logic is fairly simple:
var matches = from offline in offlineItems
join online in onlineItems
on
new {offline.ID, offline.DateWithoutSlashes} equals
new {online.ID, online.DateWithoutSlashes}
select new aoitime
{
ID = offline.ID,
TimeInMinutes = offline.TimeInMinutes + online.TimeInMinutes,
DateWithoutSlashes = offline.DateWithoutSlashes
};
(Notice there that you're using anonymous objects in the "ON" join condition). But the hard part is how to get the exceptions. LINQ is set up to do inner joins or equijoins, but I'm not sure about outer joins. At least I haven't seen it.
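For what it's worth, LINQ can express a left outer join by adding `into` to the join (a group join) and calling DefaultIfEmpty on the group. The sketch below assumes the simplified aoitime class from this answer (repeated to keep the example self-contained); AttendanceJoin and LeftJoin are hypothetical names. It keeps every offline row and adds the online minutes when a match exists. Online rows with no offline match are not included and would need a second pass.

```csharp
using System.Collections.Generic;
using System.Linq;

public class aoitime
{
    public string ID { get; set; }
    public int TimeInMinutes { get; set; }
    public string DateWithoutSlashes { get; set; }
}

public static class AttendanceJoin
{
    // Left outer join: one result per offline row (per matching online row,
    // if several match), with online minutes added when a match exists.
    public static List<aoitime> LeftJoin(
        IEnumerable<aoitime> offlineItems,
        IEnumerable<aoitime> onlineItems)
    {
        var query =
            from offline in offlineItems
            join online in onlineItems
                on new { offline.ID, offline.DateWithoutSlashes }
                equals new { online.ID, online.DateWithoutSlashes }
                into matches
            // DefaultIfEmpty yields a single null when there is no match.
            from match in matches.DefaultIfEmpty()
            select new aoitime
            {
                ID = offline.ID,
                TimeInMinutes = offline.TimeInMinutes + (match?.TimeInMinutes ?? 0),
                DateWithoutSlashes = offline.DateWithoutSlashes
            };

        return query.ToList();
    }
}
```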
So one solution might be to use the LINQ join to get the matches and then another LINQ query to get those that don't match and then combine those two collections and write them out to a file.
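A different route to the same combined result, not the two-query approach just described, is to concatenate both lists and group by ID and date. Matching rows end up in the same group and get summed, and rows that exist in only one file form a group of one. This is a sketch assuming the simplified aoitime class from this answer (repeated to keep the example self-contained); AttendanceMerge and Merge are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class aoitime
{
    public string ID { get; set; }
    public int TimeInMinutes { get; set; }
    public string DateWithoutSlashes { get; set; }
}

public static class AttendanceMerge
{
    // Sums minutes per (ID, date) across both sources, so rows that appear
    // in only one file are kept as they are.
    public static List<aoitime> Merge(
        IEnumerable<aoitime> offlineItems,
        IEnumerable<aoitime> onlineItems)
    {
        return offlineItems
            .Concat(onlineItems)
            .GroupBy(x => new { x.ID, x.DateWithoutSlashes })
            .OrderBy(g => g.Key.ID, StringComparer.Ordinal)
            .ThenBy(g => g.Key.DateWithoutSlashes, StringComparer.Ordinal)
            .Select(g => new aoitime
            {
                ID = g.Key.ID,
                TimeInMinutes = g.Sum(x => x.TimeInMinutes),
                DateWithoutSlashes = g.Key.DateWithoutSlashes
            })
            .ToList();
    }
}
```

Note that this also sums duplicate rows within a single file, which may or may not be what you want.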
Another solution might be to go back to basics and do the iteration logic yourself. LINQ is just elegant iteration logic and if it doesn't do what you need it to do, you might need to do it yourself.
For example, let's say that you have your collection of online and offline items and you want to iterate through them and do the comparison:
List<aoitime> offlineItems = <some method that produces this list>
List<aoitime> onlineItems = <some method that produces this list>
List<aoitime> attendanceItems = new List<aoitime>();
//For simplicity, assuming that you have the same number of elements in each list
for (int i = 0; i < offlineItems.Count; i++)
{
aoitime offline = offlineItems[i];
aoitime online = onlineItems[i];
if(offline.ID == online.ID && offline.DateWithoutSlashes == online.DateWithoutSlashes)
{
//Create your new object and add it to the attendance items collection.
}
else
{
//Process the exceptions and add them individually to the attendance items collection.
}
}
So you do the iteration and processing yourself and have control over the whole process. Does that make sense? If not, let me know in a comment and I can add more.