LINQ Query for Unsanitized Data - c#

I have a List<MyData> where MyData contains a Location field.
This field is a string, and is normally in "City, State" format but sometimes will come in as "Unknown".
There is another field, DateField.
I need to return a list of MyData objects grouped by the year in DateField, grouped by the state portion of Location, if that exists. If it comes in as "Unknown" then I need to ignore that.
My thought is to use RemoveAll() on the List<> where (r => r.Location.Split(",").Length == 0), or where the location doesn't include a comma at all.
Then I will have sanitized data.
That leaves me with two questions:
Is this the correct approach, or can I just handle it all with one LINQ query?
What should this LINQ query look like? I am looking for state totals for a specific year, which is passed into the API as an int.
I hope that comes across clearly. Thanks.

If I understood you correctly, you can try the following. First filter out the records without a known Location, then group by two keys, the year from DateField and the state, and finally project each group into a result:
var data = new List<MyData>();
var result = data.Where(l => l.Location != "Unknown")
// Trim() drops the space that follows the comma in "City, State"
.GroupBy(d => new { d.DateField.Year, State = d.Location.Split(',').Last().Trim() })
.Select(g => new
{
g.Key.Year,
g.Key.State,
Total = g.Count()
});

This query drops the records whose location is "Unknown" and groups the rest by the year of DateField and the state part of Location:
var mydata = new List<MyData>();
var result = mydata.Where(x => x.Location != "Unknown")
.GroupBy(x => new { x.DateField.Year, State =
x.Location.Split(',').Last().Trim() })
.Select(x => new {
Year = x.Key.Year,
State = x.Key.State,
Count = x.Count()
});
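Since the question asks for state totals for one specific year, the filter on "Unknown" and the grouping can be combined into a single query. The following is only a sketch under assumptions: the shape of MyData is inferred from the question, and the StateOf helper and the sample data are hypothetical. It treats any location without a comma (not only "Unknown") as unusable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of MyData, inferred from the question.
class MyData
{
    public string Location { get; set; }
    public DateTime DateField { get; set; }
}

static class Program
{
    // Hypothetical helper: returns the state part of "City, State",
    // or null when the location has no comma (for example "Unknown").
    static string StateOf(string location)
    {
        if (string.IsNullOrWhiteSpace(location)) return null;
        int comma = location.LastIndexOf(',');
        if (comma < 0) return null;
        string state = location.Substring(comma + 1).Trim();
        return state.Length == 0 ? null : state;
    }

    static void Main()
    {
        // Made-up sample data for illustration only.
        var data = new List<MyData>
        {
            new MyData { Location = "Austin, TX",  DateField = new DateTime(2019, 3, 1) },
            new MyData { Location = "Dallas, TX",  DateField = new DateTime(2019, 7, 9) },
            new MyData { Location = "Unknown",     DateField = new DateTime(2019, 5, 5) },
            new MyData { Location = "Seattle, WA", DateField = new DateTime(2019, 1, 2) },
            new MyData { Location = "Tacoma, WA",  DateField = new DateTime(2018, 1, 2) },
        };
        int year = 2019; // stands in for the int passed into the API

        var totals = data
            .Where(d => d.DateField.Year == year)      // keep only the requested year
            .Select(d => StateOf(d.Location))          // extract the state, null if absent
            .Where(state => state != null)             // ignore "Unknown" and similar
            .GroupBy(state => state)
            .Select(g => new { State = g.Key, Total = g.Count() })
            .OrderBy(t => t.State);

        foreach (var t in totals)
            Console.WriteLine($"{t.State}: {t.Total}");
        // Prints:
        // TX: 2
        // WA: 1
    }
}
```

Because the filtering happens inside the query, the original list is left untouched and no separate RemoveAll() pass is needed.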

Related

Group By struct list on multiple columns in C#

I have a struct:
public struct structMailJob
{
public string ID;
public string MailID;
public int ResendCount;
public int PageCount;
}
and a list:
List<structMailJob> myStructList = new List<structMailJob>();
I have loaded data into myStructList from the database and want that data in a new list after grouping by MailID and ResendCount.
I am trying:
List<structMailJob> newStructList = new List<structMailJob>();
newStructList = myStructList.GroupBy(u => u.MailID, u=>u.ResendCount)
.Select(grp => new { myStructList = grp.ToList() })
.ToList();
but I am unable to do that; I get an error message saying it can't implicitly convert the generic list to structMailJob.
I think what you are looking for is the following:
var newStructList = myStructList.GroupBy(smj => new { smj.MailID, smj.ResendCount })
.Select(grp => new
{
MailID = grp.Key.MailID,
ResendCount = grp.Key.ResendCount,
MailJobs = grp.Select(x=>new
{
x.ID,
x.PageCount
}).ToList()
})
.ToList();
Note that we changed the GroupBy clause to the following one:
GroupBy(smj => new { smj.MailID, smj.ResendCount })
Doing so, the key on which the groups are created consists of both MailID and ResendCount. By the way, the original GroupBy call isn't correct: its second argument is an element selector, not a second key.
Then, having done the grouping, we project each group to an object with three properties: MailID and ResendCount, which are the components of the key, and MailJobs, a list of anonymous-type objects with two properties, ID and PageCount.
Last but not least you will notice that I didn't mention the following
List<structMailJob> newStructList = new List<structMailJob>();
I just used var to declare newStructList. I don't think what you stated in your post makes sense: how do we expect to get a list of the same objects after grouping them? So I assumed that what you want is the above.
However, you might instead want something like this, which sorts the list without grouping it:
myStructList = myStructList.OrderBy(smj => smj.MailID)
.ThenBy(smj => smj.ResendCount)
.ToList();
The LINQ query is incorrect. The important points are:
myStructList.GroupBy(u => u.MailID, u=>u.ResendCount) // Incorrect: groups by MailID only and uses ResendCount as the element projection
myStructList.GroupBy(u => new {u.MailID, u.ResendCount }) // Correct: groups by the two columns MailID and ResendCount
Now the result is of type IEnumerable<IGrouping<AnonymousType,structMailJob>>, so a Select like the following produces an IEnumerable<List<structMailJob>> (I removed the assignment to myStructList inside the Select, as that was not correct):
.Select(grp => grp.ToList())
Correct code would require you to flatten using SelectMany as follows:
newStructList = myStructList.GroupBy(u => new {u.MailID, u.ResendCount})
.SelectMany(grp => grp.ToList()).ToList();
Assign it to newStructList. This code is of little use, though, since after flattening newStructList holds the same elements as myStructList (only reordered so that the members of each group are adjacent). Ideally you would use the groups themselves to get a subset and thus the correct result, but that depends on your business logic.
I don't know if I got your question right, but it seems to me you misread the GroupBy signature.
List<structMailJob> myStructList = new List<structMailJob>();
List<structMailJob> newStructList = new List<structMailJob>();
newStructList = myStructList
// .GroupBy(/*Key Selector */u => u.MailID, /*Element Selector*/u=>u.ResendCount)
.GroupBy(u => new { u.MailID, u.ResendCount }) // group by MailID, ResendCount
// Note: no element selector, so the struct itself is the group element
.Select(grp => {
// Note: Key == anonymous { MailID, ResendCount }
return grp;
})
// Flatten; otherwise you get an IEnumerable<IEnumerable<T>> instead of an IEnumerable<T>, because you grouped it
.SelectMany(x=>x)
.ToList();
If Mrinal Kamboj's answer is what you are looking for, then you could use the following as an alternative:
var orderedList = myStructList.OrderBy(x => x.MailID).ThenBy(x => x.ResendCount);
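If the goal is a strongly typed list of groups rather than an anonymous type, one option is to project each group into a small named class. The sketch below is illustrative only: MailJobGroup and the sample rows are hypothetical and do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct structMailJob
{
    public string ID;
    public string MailID;
    public int ResendCount;
    public int PageCount;
}

// Hypothetical result type: one instance per (MailID, ResendCount) pair.
public class MailJobGroup
{
    public string MailID;
    public int ResendCount;
    public List<structMailJob> Jobs;
}

static class Program
{
    static void Main()
    {
        // Made-up sample rows standing in for the data loaded from the database.
        var myStructList = new List<structMailJob>
        {
            new structMailJob { ID = "1", MailID = "A", ResendCount = 0, PageCount = 2 },
            new structMailJob { ID = "2", MailID = "A", ResendCount = 0, PageCount = 5 },
            new structMailJob { ID = "3", MailID = "A", ResendCount = 1, PageCount = 1 },
            new structMailJob { ID = "4", MailID = "B", ResendCount = 0, PageCount = 4 },
        };

        List<MailJobGroup> groups = myStructList
            .GroupBy(j => new { j.MailID, j.ResendCount })   // composite key
            .Select(g => new MailJobGroup
            {
                MailID = g.Key.MailID,
                ResendCount = g.Key.ResendCount,
                Jobs = g.ToList()                            // the structs in this group
            })
            .ToList();

        foreach (var g in groups)
            Console.WriteLine($"{g.MailID}/{g.ResendCount}: {g.Jobs.Count} job(s), {g.Jobs.Sum(j => j.PageCount)} page(s)");
        // Prints:
        // A/0: 2 job(s), 7 page(s)
        // A/1: 1 job(s), 1 page(s)
        // B/0: 1 job(s), 4 page(s)
    }
}
```

The declared type of the result is List<MailJobGroup>, not List<structMailJob>, which is why the assignment in the question could not compile.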

C# How to filter a list and remove duplicates?

I have a List of Type X. This contains fields and I need to return only unique records from the list. I need to use one of the field/property (OIndex) that contains a timestamp and filter it using that property. List is like this:
> 2c55-Checked-branchDeb-20160501121315-05
> 2c60-Checked-branchDeb-20160506121315-06
> 2c55-Checked-branchDeb-20160601121315-07
> 2c55-Checked-branchDeb-20160601141315-07
> 2c60-Checked-branchDeb-20160720121315-08
In the example above the last field is the recordId, so we have a duplicate record of "07". The timestamp is field four. So I want to get all the records except the 3rd, which is a duplicate. The latest version of record "07" is the fourth line.
I started writing the code but am struggling. So far:
List<X> originalRecords = GetSomeMethod(); //this method returns our list above
var duplicateKeys = originalRecords.GroupBy(x => x.Record) //x.Record is the record as shown above "05", "06" etc
.Where(g => g.Count() > 1)
.Select(y => y.Key);
What do I do now? Now that I have the duplicate keys.
I think I need to go through the OriginalRecords list again and see if it contains the duplicate key.
And then use substring on the datetime. Store this somewhere and then remove the record which is not the latest. And save the original records with the filter. Thanks
You don't need to find duplicate keys explicitly, you could simply select first from each group:
var res = originalRecords
.GroupBy(x => x.RecordId)
.Select(g => g.OrderByDescending(x => x.DateTimeField).First());
There is no field for datetimefield as in your code. I simply have a string field which contains the datetime together with other data. The record however has a Record Id field.
You can split your records on a dash, grab the date-time portion, and sort on it. Your date/time is in a format that lets you sort lexicographically, so you can skip parsing the date.
Assuming that the individual fields contain no dashes and that all strings are formatted the same way, the expression x.TextString.Split('-')[3] will give you the timestamp portion of your record:
var res = originalRecords
.GroupBy(x => x.RecordId)
.Select(g => g.OrderByDescending(x => x.TextString.Split('-')[3]).First());
This should solve your problem:
List<X> originalRecords = GetSomeMethod();
Dictionary<int, X> records = new Dictionary<int, X>();
foreach (X record in originalRecords) {
// TryGetValue avoids the KeyNotFoundException that the indexer throws for a missing key
X existing;
if (!records.TryGetValue(record.recordId, out existing) || existing.stamp < record.stamp) {
// first time we see this id, or this record is newer than the stored one
records[record.recordId] = record;
}
}
Your answer is records.Values.
Hope it helps
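To make the "keep only the latest record per id" idea concrete, here is a small self-contained sketch that works directly on the sample strings from the question. It assumes the timestamp field is always in yyyyMMddHHmmss form and that no field contains a dash; the anonymous type and variable names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

static class Program
{
    static void Main()
    {
        // The sample records from the question.
        var lines = new List<string>
        {
            "2c55-Checked-branchDeb-20160501121315-05",
            "2c60-Checked-branchDeb-20160506121315-06",
            "2c55-Checked-branchDeb-20160601121315-07",
            "2c55-Checked-branchDeb-20160601141315-07",
            "2c60-Checked-branchDeb-20160720121315-08",
        };

        var latest = lines
            .Select(line =>
            {
                string[] parts = line.Split('-');
                return new
                {
                    Line = line,
                    // Field four (index 3) is the timestamp; assumed format yyyyMMddHHmmss.
                    Stamp = DateTime.ParseExact(parts[3], "yyyyMMddHHmmss", CultureInfo.InvariantCulture),
                    RecordId = parts[4]   // the last field is the record id
                };
            })
            .GroupBy(r => r.RecordId)
            .Select(g => g.OrderByDescending(r => r.Stamp).First().Line);

        foreach (string line in latest)
            Console.WriteLine(line);
        // Prints:
        // 2c55-Checked-branchDeb-20160501121315-05
        // 2c60-Checked-branchDeb-20160506121315-06
        // 2c55-Checked-branchDeb-20160601141315-07
        // 2c60-Checked-branchDeb-20160720121315-08
    }
}
```

Parsing the timestamp is optional here, since this fixed-width format also sorts correctly as a plain string, but ParseExact has the side benefit of failing loudly on a malformed record.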

Order a list of string by quantity

I have a list of strings that I want to order by quantity. The list contains order CreationDate entries, which are DateTime values. I'm converting these values to strings as I will need that later.
My current output is a list of CreationDate that looks like this.
2014-04-05
2014-04-05
2014-04-05
2014-04-05
2014-04-06
2014-04-06
2014-04-06
...
I get a list of dates as expected, but I want to group them and count the entries per date. This means I need another variable holding the total number of orders. I've tried creating a new variable, using a for loop and LINQ queries, but I am not getting the results I want.
How can I get the number of orders per CreationDate? I need to count the total number of orders for each CreationDate, but I can't find a way to do this.
The expected output would be:
2014-04-05 4 - representing 4 orders that date
2014-04-06 3 - representing 3 orders that date.
This what my code looks like:
List<string> TotalOrdersPaid = new List<string>();
foreach (var order in orders)
{
if (order.PaymentStatus == PaymentStatus.Paid)
{
string Created = order.CreatedOnUtc.Date.ToString("yyyy-MM-dd");
order.CreatedOnUtc = DateTime.ParseExact(Created, "yyyy-MM-dd", CultureInfo.InvariantCulture);
TotalOrdersPaid.Add(Created);
}
}
E.g. TotalOrdersPaid should contain a list with the number of orders per CreationDate.
What is a good way to achieve this?
Thanks
Basically, you just need a group by and an ordering.
var result = TotalOrdersPaid // the list of date strings; add any Where clause needed
.GroupBy(m => m)
.Select(m => new {
cnt = m.Count(),
date = m.Key
})
.OrderByDescending(m => m.cnt);
Of course, you can add any DateTime.Parse / ParseExact in the Select, and / or project to a corresponding class.
To group the orders by date, use the following LINQ lambda expression:
var grouped = orders.Where(o => o.PaymentStatus == PaymentStatus.Paid)
.GroupBy(g => g.CreatedOnUtc.Date); // .Date drops the time of day so that orders on the same day share a key
Now, all paid orders are grouped by date. To count the orders per date, select the key which is the date, and the Count() will count all orders for that date.
var counted = grouped.Select(s => new { Date = s.Key, Count = s.Count() });
Edit:
In one statement:
var result = orders.Where(o => o.PaymentStatus == PaymentStatus.Paid)
.GroupBy(g => g.CreatedOnUtc.Date)
.Select(s => new { Date = s.Key, Count = s.Count() });
Based on your list of dates, the output will look like:
Date Count
5/04/2014 4
6/04/2014 3
Update:
If you want to put more properties in the anonymous type that is returned from the Select() method, simply add them. If, for example, you want the date, the count and the list of orders for that date, use the following code:
var result = orders.Where(o => o.PaymentStatus == PaymentStatus.Paid)
.GroupBy(g => g.CreatedOnUtc.Date)
.Select(s => new
{
Date = s.Key,
Count = s.Count(),
Items = s.ToList()
});
Now you can do following:
foreach(var orderGroup in result)
{
Console.WriteLine(String.Format("Number of orders for {0}: {1}", orderGroup.Date, orderGroup.Count));
foreach(var order in orderGroup.Items)
{
Console.WriteLine(order.Name);
}
}
This will loop over all grouped items and display a sentence like:
Number of orders for 5/04/2014: 4
And then it will display the name of each order for that date (if your order has a property Name). Hope this clears everything up.
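For reference, here is a compact, runnable sketch of the whole idea that produces the "date count" lines shown in the question. The Order class and PaymentStatus enum are hypothetical minimal stand-ins for the real types, and the sample orders are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical minimal versions of the types used in the question.
enum PaymentStatus { Pending, Paid }

class Order
{
    public DateTime CreatedOnUtc { get; set; }
    public PaymentStatus PaymentStatus { get; set; }
}

static class Program
{
    static void Main()
    {
        // Invented sample data: four paid orders on 5 April, three on 6 April, one unpaid.
        var orders = new List<Order>
        {
            new Order { CreatedOnUtc = new DateTime(2014, 4, 5, 9, 0, 0),   PaymentStatus = PaymentStatus.Paid },
            new Order { CreatedOnUtc = new DateTime(2014, 4, 5, 10, 30, 0), PaymentStatus = PaymentStatus.Paid },
            new Order { CreatedOnUtc = new DateTime(2014, 4, 5, 12, 0, 0),  PaymentStatus = PaymentStatus.Paid },
            new Order { CreatedOnUtc = new DateTime(2014, 4, 5, 18, 45, 0), PaymentStatus = PaymentStatus.Paid },
            new Order { CreatedOnUtc = new DateTime(2014, 4, 5, 19, 0, 0),  PaymentStatus = PaymentStatus.Pending },
            new Order { CreatedOnUtc = new DateTime(2014, 4, 6, 8, 0, 0),   PaymentStatus = PaymentStatus.Paid },
            new Order { CreatedOnUtc = new DateTime(2014, 4, 6, 8, 5, 0),   PaymentStatus = PaymentStatus.Paid },
            new Order { CreatedOnUtc = new DateTime(2014, 4, 6, 23, 59, 0), PaymentStatus = PaymentStatus.Paid },
        };

        var perDay = orders
            .Where(o => o.PaymentStatus == PaymentStatus.Paid)
            .GroupBy(o => o.CreatedOnUtc.Date)               // group by calendar day, ignoring the time
            .OrderBy(g => g.Key)
            .Select(g => new
            {
                Date = g.Key.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture),
                Count = g.Count()
            });

        foreach (var day in perDay)
            Console.WriteLine($"{day.Date} {day.Count}");
        // Prints:
        // 2014-04-05 4
        // 2014-04-06 3
    }
}
```

Grouping on the DateTime and formatting only at the end keeps the sort chronological and avoids converting to a string and back inside the loop.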

LINQ projection combining date and list

I have a list of objects containing the value and date of a transaction.
DateTime Date { get ; set; }
double Value { get; set; }
I want to get a new grouped object which contains the date of the transaction and the list of values for that particular day.
I can retrieve both the list and the date, but I don't know how to use projection to combine them into a new object.
var res1 = ExpenseList.Where(p => p.Date == Convert.ToDateTime("01-12-2013"))
.Select(p => p.Value)
.ToList();
DateTime res2 = ExpenseList.Where(p => p.Date == Convert.ToDateTime("01-12-2013"))
.Select(p => p.Date)
.FirstOrDefault();
There might have been some confusion, so to clarify: ExpenseList holds a set of date and value pairs, and I want one date together with a collection of its values.
To group all of the values for a particular day you'll want to use GroupBy
var groups = ExpenseList.GroupBy(expense => expense.Date,
expense => expense.Value);
ExpenseList.Where(p => p.Date == Convert.ToDateTime("01-12-2013"))
.Select(p => new MyObject {Date = p.Date, Value = p.Value})
.ToList();
Replace MyObject with the actual type you have defined.
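Combining the two answers, a GroupBy with an element selector can be projected straight into an object holding one date and its list of values. This is a sketch only: the Expense and DailyExpenses classes are hypothetical names, and the sample data is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type matching the two properties shown in the question.
class Expense
{
    public DateTime Date { get; set; }
    public double Value { get; set; }
}

// Hypothetical result type: one date plus all values recorded on that day.
class DailyExpenses
{
    public DateTime Date { get; set; }
    public List<double> Values { get; set; }
}

static class Program
{
    static void Main()
    {
        // Invented sample data.
        var ExpenseList = new List<Expense>
        {
            new Expense { Date = new DateTime(2013, 12, 1), Value = 10 },
            new Expense { Date = new DateTime(2013, 12, 1), Value = 25 },
            new Expense { Date = new DateTime(2013, 12, 2), Value = 7 },
        };

        List<DailyExpenses> perDay = ExpenseList
            .GroupBy(e => e.Date.Date,   // key: the calendar day
                     e => e.Value)       // element: only the value
            .Select(g => new DailyExpenses { Date = g.Key, Values = g.ToList() })
            .ToList();

        foreach (DailyExpenses day in perDay)
            Console.WriteLine($"{day.Date:yyyy-MM-dd}: {string.Join(", ", day.Values)}");
    }
}
```

To get a single day, as in the question, a Where on the day can be added before the GroupBy, or the matching entry can be picked from perDay afterwards.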

Order By on the Basis of Integer present in string

I've a problem in my C# application... I've some school classes in database for example 8-B, 9-A, 10-C, 11-C and so on .... when I use order by clause to sort them, the string comparison gives results as
10-C
11-C
8-B
9-A
but I want integer sorting on the basis of first integer present in string...
i.e.
8-B
9-A
10-C
11-C
hope you'll understand...
I've tried this, but it throws an exception:
var query = cx.Classes.Select(x=>x.Name)
.OrderBy( x=> new string(x.TakeWhile(char.IsDigit).ToArray()));
Please help me... want ordering on the basis of classes ....
Maybe Split will do?
.OrderBy(x => Convert.ToInt32(x.Split('-')[0]))
.ThenBy(x => x.Split('-')[1])
If the input is well-formed enough, this would do:
var maxLen = cx.Classes.Max(x => x.Name.Length);
var query = cx.Classes.Select(x => x.Name).OrderBy(x => x.PadLeft(maxLen));
You can also left-pad with '0' to a fixed length that suits your data, for example 6:
.OrderBy(x => x.PadLeft(6, '0'))
This is fundamentally the same approach as Andrius's answer, written out more explicitly:
var names = new[] { "10-C", "8-B", "9-A", "11-C" };
var sortedNames =
(from name in names
let parts = name.Split('-')
select new {
fullName = name,
number = Convert.ToInt32(parts[0]),
letter = parts[1]
})
.OrderBy(x => x.number)
.ThenBy(x => x.letter)
.Select(x => x.fullName);
It's my naive assumption that this would be more efficient because the Split is only processed once in the initial select rather than in both OrderBy and ThenBy, but for all I know the extra "layers" of LINQ may outweigh any gains from that.
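Another way to sort on the leading integer is to extract it with a small helper once the names are in memory. The sketch below assumes the names have already been materialized into a list (for example with ToList() on the query), since a database LINQ provider may not be able to translate such a helper. LeadingNumber is a hypothetical helper, not a framework method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    // Hypothetical helper: parses the run of ASCII digits at the start of the string.
    // Names without a leading number sort last.
    static int LeadingNumber(string s)
    {
        string digits = new string(s.TakeWhile(c => c >= '0' && c <= '9').ToArray());
        return digits.Length == 0 ? int.MaxValue : int.Parse(digits);
    }

    static void Main()
    {
        // Stands in for the class names loaded from the database.
        var names = new List<string> { "10-C", "11-C", "8-B", "9-A" };

        var sorted = names
            .OrderBy(LeadingNumber)                      // numeric order on the leading integer
            .ThenBy(n => n, StringComparer.Ordinal);     // then by the rest of the name

        foreach (string name in sorted)
            Console.WriteLine(name);
        // Prints:
        // 8-B
        // 9-A
        // 10-C
        // 11-C
    }
}
```

Unlike the PadLeft approaches, this does not depend on knowing a maximum length in advance, at the cost of doing the sort on the client side.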
