C# - sorting by a property

I am trying to sort a collection of objects in C# by a custom property.
(For context, I am working with the Twitter API using the Twitterizer library, sorting Direct Messages into conversation view)
Say a custom class has a property named label, where label is a string that is assigned in the class constructor.
I have a Collection (or a List, it doesn't matter) of said classes, and I want to sort them all into separate Lists (or Collections) based on the value of label, and group them together.
At the moment I've been doing this by using a foreach loop and checking the values that way - a horrible waste of CPU time and awful programming, I know. I'm ashamed of it.
Basically I know that all of the data I have is there given to me, and I also know that it should be really easy to sort. It's easy enough for a human to do it with bits of paper, but I just don't know how to do it in C#.
Does anyone have the solution to this? If you need more information and/or context just ask.

Have you tried Linq's OrderBy?
var mySortedList = myCollection.OrderBy(x => x.PropertyName).ToList();
This is still going to loop through the values to sort - there's no way around that. This will at least clean up your code.

You say sorting but it sounds like you're trying to divide up a list of things based on a common value. For that you want GroupBy.
You'll also want ToDictionary to switch from an IGrouping as you'll presumably be wanting key based lookup.
I assume that the elements within each of the output sets will need to be sorted, so check out OrderBy. Since you'll undoubtedly be accessing each list multiple times, you'll want to collapse it to a list or an array (you mentioned list), so I used ToList:
//Make some test data
var labels = new[] {"A", "B", "C", "D"};
var rawMessages = new List<Message>();
for (var i = 0; i < 15; ++i)
{
    rawMessages.Add(new Message
    {
        Label = labels[i % labels.Length],
        Text = "Hi" + i,
        Timestamp = DateTime.Now.AddMinutes(i * Math.Pow(-1, i))
    });
}
//Group the data up by label
var groupedMessages = rawMessages.GroupBy(message => message.Label);
//Convert to a dictionary for by-label lookup (this gives us a Dictionary<string, List<Message>>)
var messageLookup = groupedMessages.ToDictionary(
    //Make the dictionary key the label of the conversation (set of messages)
    grouping => grouping.Key,
    //Sort the messages in each conversation by their timestamps and convert to a list
    messages => messages.OrderBy(message => message.Timestamp).ToList());
//Use the data...
var messagesInConversationA = messageLookup["A"];
var messagesInConversationB = messageLookup["B"];
var messagesInConversationC = messageLookup["C"];
var messagesInConversationD = messageLookup["D"];
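As a possible alternative to GroupBy followed by ToDictionary, LINQ's ToLookup builds a keyed structure in one step. The sketch below is only an illustration: it assumes a tuple as a stand-in for the Message class, and the sample values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Message class: a tuple with a label and some text.
var rawMessages = new List<(string Label, string Text)>
{
    ("A", "Hi0"), ("B", "Hi1"), ("A", "Hi2"), ("C", "Hi3"), ("B", "Hi4"),
};

// ToLookup groups the items immediately and supports lookup by key.
ILookup<string, (string Label, string Text)> byLabel =
    rawMessages.ToLookup(message => message.Label);

// Indexing a key that exists returns the grouped items in source order.
List<string> textsInA = byLabel["A"].Select(message => message.Text).ToList();

// Indexing a missing key yields an empty sequence rather than throwing.
int countInZ = byLabel["Z"].Count();

Console.WriteLine(string.Join(", ", textsInA)); // Hi0, Hi2
Console.WriteLine(countInZ);                    // 0
```

Unlike a Dictionary, a lookup does not throw on an unknown label, which may or may not be what you want.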

It sounds to me like mlorbetske was correct in his interpretation of your question. It sounds like you want to do grouping rather than sorting. I just went at the answer a bit differently:
var originalList = new[] { new { Name = "Andy", Label = "Junk" }, new { Name = "Frank", Label = "Junk" }, new { Name = "Lisa", Label = "Trash" } }.ToList();
var myLists = new Dictionary<string, List<object>>();
originalList.ForEach(x =>
{
    if (!myLists.ContainsKey(x.Label))
        myLists.Add(x.Label, new List<object>());
    myLists[x.Label].Add(x);
});

Related

List of objects, add properties together based on another property

I have a List that contains 2 properties per object. The properties are as follows:
string Project;
double Value;
So in any given case we might have a List of 5 objects, where 3 of them have a Project property called "Test" and the other 2 objects have a Project Property called "Others", but none of the 5 objects have the same "Value".
{
Project = "Test" Value = 1,
Project = "Test" Value = 5,
Project = "Test" Value = 25,
Project = "Others" Value = 89,
Project = "Others" Value = 151
}
Okay, I get a lot of data from a Database (I "Query" it out into a List of objects), then I take the specific properties I need from that List and add to my own List as follows.
public class Data
{
public string Project {get; set;}
public double Value {get; set;}
}
public List<Data> dataList = new List<Data>();
foreach (var item in DatabaseList)
{
    Data newData = new Data();
    newData.Project = item.Project;
    newData.Value = item.Value;
    dataList.Add(newData);
}
This gives me my list of data that I somehow need to combine based on the property in "Project"
But I have a hard time figuring out how to separate them from one another. My first thought was to find the unique "Projects" and add them to a new List called "counter", and then loop through that list based on the "Project" property, so something like this:
List<Data> counter = dataList.GroupBy(x => x.Project).Select(g => g.First()).ToList();
foreach(var item in counter)
{
Data finalItem = new Data();
foreach (var item2 in dataList)
{
if(item.Project == item2.Project)
{
finalItem.Project = item2.Project;
finalItem.Value += item2.Value;
finalList.Add(finalItem);
}
}
}
So I already know that the above is so messy it's crazy, and it's also not going to work, but this was the angle I was trying to take. I was also thinking about whether I could make use of a Dictionary, but I feel like there is probably a super simple solution to something like this.
I think your initial thoughts regarding making use of a dictionary are good. Your use of .GroupBy() is a first step to create that dictionary, where the project name is the dictionary Key and the sum of values for that project is the dictionary Value.
You already seem to be familiar with the System.Linq namespace. The extension method .ToDictionary() exists in the same namespace, and can be used to define the Key and Value selector for each KeyValuePair (KVP) in the dictionary as follows:
.ToDictionary(
<selector for Key>,
<selector for Value>
);
The dictionary may be created by utilizing .ToDictionary() directly after .GroupBy(), as follows:
Dictionary<string, double> dataDictionary = dataList
.GroupBy(item => item.Project)
.ToDictionary(
itemsByProject => itemsByProject.Key,
itemsByProject => itemsByProject.Sum(item => item.Value));
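To make the result concrete, here is a minimal runnable sketch of the same GroupBy and ToDictionary chain applied to the five sample rows from the question. It assumes tuples in place of the Data class purely to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample rows from the question, modelled as tuples instead of the Data class.
var dataList = new List<(string Project, double Value)>
{
    ("Test", 1.0), ("Test", 5.0), ("Test", 25.0), ("Others", 89.0), ("Others", 151.0),
};

// One dictionary entry per project, holding the sum of that project's values.
Dictionary<string, double> totals = dataList
    .GroupBy(item => item.Project)
    .ToDictionary(g => g.Key, g => g.Sum(item => item.Value));

Console.WriteLine(totals["Test"]);   // 31
Console.WriteLine(totals["Others"]); // 240
```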
You can use the following code to compute the total Value for objects with Project="Test" :
double TestsValue = my_list.Where(o=>o.Project=="Test").Sum(t=>t.Value);
and do the same for "Others".
Assuming you're happy to return an IEnumerable of Data, you can do this:
var projects = dataList.GroupBy(p => p.Project)
.Select(grp =>
new Data
{
Project = grp.First().Project,
Value = grp.Sum(pr => pr.Value)
});

Sort a variable based on header text

I am looking for guidance. As I tried to convey with my title, I have an issue where I receive data that sometimes looks like this, for example:
entry[0] = "SHAPE", "X", "Y"
entry[1] = "Circle", "2", "3"
and sometimes may look like this:
entry[0] = "X", "Y", "SHAPE"
entry[1] = "2", "3", "Circle"
As you can see, they are ordered based on the first row values, which I will call "headerValues" below.
I am now trying to map my variables (for example "shape") so each one is read from the position where the entry actually correlates to the shape value. I want to do this so I don't end up with an X number in my "Shape" variable due to a different input order than I planned for.
I am also well aware that I may want to remove the first row before I add them into my shapes, but that is an issue I want to try and figure out on my own in order to learn. I am only here because I have been stuck on this problem for a while now, and I would therefore really appreciate any help I can get from a more seasoned programmer than me.
Below you will find the code:
var csvRows = csvData.Split(';');
var headerValues = csvRows[0].Split(',');
List<Shapes> shapes = new List<Shapes>();
if (csvRows.Count() > 0)
    foreach (var row in csvRows)
    {
        var csvColumn = row.Split(',').Select(value => value.Replace(" ", "")).Where(value => !string.IsNullOrEmpty(value)).Distinct().ToList();
        if (csvColumn.Count() == 5)
        {
            shapes.Add(new()
            {
                shape = csvColumn[0], //want to have same index placement as where headerValues contains "SHAPE"
            });
        }
        else
        {
            Console.WriteLine(row + " does not have 5 inputs and cannot be added!");
        }
    }
Thank you in advance!
You can determine the index of your column(s) from the header row:
var colShape = headerValues.ToList().FindIndex(e => e.Equals("SHAPE"));
and then use that to set the property in the object:
shapes.Add(new()
{
    shape = csvColumn[colShape], //read the value from the SHAPE column
});
In the long run you would be better off using a csv parsing library.
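If you do stay with hand-rolled parsing, one way to generalize the index lookup is to map every header name to its position once and read all values through that map. This is only a sketch: the sample input, the trimming, and the case-insensitive comparison are assumptions, and it collects plain strings rather than Shapes objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample input in the question's format: rows separated by ';', columns by ','.
string csvData = "X, Y, SHAPE;2, 3, Circle;5, 8, Square";

string[][] rows = csvData
    .Split(';')
    .Select(row => row.Split(',').Select(cell => cell.Trim()).ToArray())
    .ToArray();

// Map each header name to its column index, ignoring case.
Dictionary<string, int> columnIndex = rows[0]
    .Select((name, index) => (Name: name, Index: index))
    .ToDictionary(pair => pair.Name, pair => pair.Index, StringComparer.OrdinalIgnoreCase);

// Skip the header row, then read each value through the header's index.
List<string> shapes = rows
    .Skip(1)
    .Select(row => row[columnIndex["shape"]])
    .ToList();

Console.WriteLine(string.Join(", ", shapes)); // Circle, Square
```

The same code keeps working if the columns arrive as "SHAPE, X, Y", because only the header row decides where each value is read from.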
Since your data is in the CSV format, you don't need to reinvent the wheel. Just use a helper library like CsvHelper:
using var reader = new StringReader(csvData);
using var csvReader = new CsvReader(reader, CultureInfo.InvariantCulture);
var shapes = csvReader.GetRecords<Shapes>().ToList();
You may need to annotate the Shapes.shape field or property if its casing differs from the header in the data. For that, use the NameAttribute provided by CsvHelper.

IEnumerable Where filtering occurring without actually being called

I'm using HtmlAgilityPack to parse a page of HTML and retrieve a number of option elements from a select list.
GvsaDivisions is a method that returns raw HTML from the result of a POST; it is irrelevant in the context of the question.
public IEnumerable<SelectListItem> Divisions(string season, string gender, string ageGroup)
{
    var document = new HtmlDocument();
    var html = GvsaDivisions(season);
    document.LoadHtml(html);
    var options = document.DocumentNode.SelectNodes("//select//option").Select(x => new SelectListItem() { Value = x.GetAttributeValue("value", ""), Text = x.NextSibling.InnerText });
    var divisions = options.Where(x => x.Text.Contains(string.Format("{0} {1}", ageGroup, gender)));
    if (ageGroup == "U15/U16")
    {
        ageGroup = "U15/16";
    }
    if (ageGroup == "U17/U19")
    {
        ageGroup = "U17/19";
    }
    return divisions;
}
What I'm observing is this: once the options.Where() is executed, divisions contains a single result. After the test of ageGroup == "U15/U16" and the assignment of ageGroup = "U15/16", divisions now contains 3 results (the original 1, with the addition of 2 new ones matching the criteria of the new value of ageGroup).
Can anybody explain this anomaly? I expected to make a call to Union the result of a new Where query to the original results, but it seems it's happening automagically. While the results are what I desire, I have no way to explain how it's happening (or the certainty that it'll continue to act this way)
LINQ queries use deferred execution, which means they are run whenever you enumerate the result.
When you change a variable that is being used in your query, you actually are changing the result of the next run of the query, which is the next time you iterate the result.
This is actually by design, and in many situations it is very useful, and sometimes necessary. But if you need immediate evaluation, you can call the ToList() method at the end of your query, which materializes your query and gives you a normal List<T> object.
The divisions variable holds an unevaluated query, not a list of results. It calls the code x.Text.Contains(string.Format("{0} {1}", ageGroup, gender)) on each element in the list of nodes only when it is enumerated. Since you change ageGroup before the query is enumerated, it uses that new value instead of the old value.
For example, the following code outputs a single line with the text "pear":
List<string> strings = new List<string> { "apple", "orange", "pear", "watermelon" };
string matchString = "orange";
var queryOne = strings.Where(x => x == matchString);
matchString = "pear";
foreach (var item in queryOne)
{
Console.WriteLine(" " + item);
}
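For contrast, the following sketch runs the same filter twice: once left deferred and once materialized with ToList() before the variable changes. The variable names deferred and snapshot are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var strings = new List<string> { "apple", "orange", "pear", "watermelon" };
string matchString = "orange";

// Deferred: the lambda reads matchString each time the query is enumerated.
IEnumerable<string> deferred = strings.Where(x => x == matchString);

// Materialized: ToList runs the query now, while matchString is still "orange".
List<string> snapshot = strings.Where(x => x == matchString).ToList();

matchString = "pear";

Console.WriteLine(string.Join(", ", deferred)); // pear
Console.WriteLine(string.Join(", ", snapshot)); // orange
```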
I'm thinking along the same lines as Travis, the delayed execution of linq.
I'm not sure if this will avoid the issue, but I generally put my results into a concrete collection right away, like this. Calling ToList() runs the query immediately, so the resulting list is no longer subject to deferred execution.
List<SelectListItem> options = document.DocumentNode.SelectNodes("//select//option").Select(x => new SelectListItem() { Value = x.GetAttributeValue("value", ""), Text = x.NextSibling.InnerText }).Where(x => x.Text.Contains(string.Format("{0} {1}", ageGroup, gender))).ToList<SelectListItem>();

Remove duplicate items from a List<String[]> in C#

I have an issue that is a bit complex and that I have been trying to resolve for several days. I'm using the PetaPoco ORM and didn't find any other way to do a complex query like this:
var data = new List<string[]>();
var db = new Database(connectionString);
var memberCapabilities = db.Fetch<dynamic>(Sql.Builder
    .Select(@"c.column_name
        ,CASE WHEN c.is_only_view = 1
        THEN c.is_only_view
        ELSE mc.is_only_view end as is_only_view")
    .From("capabilities c")
    .Append("JOIN members_capabilities mc ON c.capability_id = mc.capability_id")
    .Where("mc.member_id = @0", memberID)
    .Where("c.table_id = @0", tableID));
var roleCapabilities = db.Fetch<dynamic>(Sql.Builder
    .Select(@"c.column_name
        ,CASE WHEN c.is_only_view = 1
        THEN c.is_only_view
        ELSE rc.is_only_view end as is_only_view")
    .From("capabilities c")
    .Append("JOIN roles_capabilities rc ON c.capability_id = rc.capability_id")
    .Append("JOIN members_roles mr ON rc.role_id = mr.role_id")
    .Where("mr.member_id = @0", memberID)
    .Where("c.table_id = @0", tableID));
I'm trying to get the user's capabilities, but my system actually has two ways to assign a capability to a user: either directly to that user, or by attaching the user to a role. I wanted to get this merged list using a stored procedure, but I needed cursors and thought it might be easier and faster to do this in the web application. So I get those two dynamic result sets, and since the member capabilities take priority over the role capabilities, I need to check for that using loops. I did it like this:
for (int i = 0; i < roleCapabilities.Count; i++)
{
    bool added = false;
    for (int j = 0; j < memberCapabilities.Count; j++)
        if (roleCapabilities[i].column_name == memberCapabilities[j].column_name)
        {
            data.Add(new string[2] { memberCapabilities[j].column_name, Convert.ToString(memberCapabilities[j].is_only_view) });
            added = true;
            break;
        }
    if (!added)
        data.Add(new string[2] { roleCapabilities[i].column_name, Convert.ToString(roleCapabilities[i].is_only_view) });
}
So now the plan is to delete the duplicate entries. I have tried the following method with no results:
data = data.Distinct();
Any help? Thanks
Make sure that your object either implements System.IEquatable<T> or overrides Object.Equals and Object.GetHashCode. In this case, it looks like you're storing the data as string[2]; arrays are compared by reference, so Distinct() won't give you the desired behavior. Create a custom object to hold the data, and do one of the 2 options listed above.
If I understand your question correctly you want to get a distinct set of arrays of strings, so if the same array exists twice, you only want one of them? The following code will return arrays one and three while two is removed as it is the same as one.
var one = new[] {"One", "Two"};
var two = new[] {"One", "Two"};
var three = new[] {"One", "Three"};
List<string[]> list = new List<string[]>(){one, two, three};
var i = list.Select(l => new {Key = String.Join("|", l), Values = l})
.GroupBy(l => l.Key)
.Select(l => l.First().Values)
.ToArray();
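You might have to use ToList() after Distinct(), so that the result can be assigned back to a List<string[]>. Note that Distinct() still compares arrays by reference, so this alone does not remove arrays that merely have equal contents: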
List<string[]> distinct = data.Distinct().ToList();
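Since each entry here is a two-element array, one possible workaround that avoids writing a custom class is to project each array into a value tuple, which compares element by element, and call Distinct() on the tuples. The sketch below assumes every array has exactly two elements; the sample values and the tuple element names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var data = new List<string[]>
{
    new[] { "name", "1" },
    new[] { "name", "1" },   // same contents, different array instance
    new[] { "email", "0" },
};

// Distinct on arrays compares references, so nothing is removed here.
int byReference = data.Distinct().Count();

// Value tuples compare element by element, so equal pairs collapse.
List<string[]> distinct = data
    .Select(pair => (Column: pair[0], IsOnlyView: pair[1]))
    .Distinct()
    .Select(t => new[] { t.Column, t.IsOnlyView })
    .ToList();

Console.WriteLine(byReference);    // 3
Console.WriteLine(distinct.Count); // 2
```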

Querying Distinct Events from Google Calendar

Right now I have a very simple query that pulls up entries that have a string and a specific date range.
EventQuery eQuery = new EventQuery(calendarInfo.Uri.ToString());
eQuery.Query = "Tennis";
eQuery.StartDate = startDate;
eQuery.EndDate = endDate;
EventFeed myResultsFeed = _service.Query(eQuery);
After querying, myResultsFeed will contain an atomEntryCollection. Each atomEntry has a Title. The way I have it set up, there could be multiple entries with the same title.
I would like my Query to be able to select UNIQUE titles. Is this possible?
Link to the API Docs
I hypothesized that I could use a WHERE object
Where x = new Where();
x.yadayada();
but it can't be passed to _service.Query()
I'm also exploring the .extraparameters object. is it possible to do something like this?
eQuery.ExtraParameters = "distinct";
Looking into the "Partial Response" feature..
http://code.google.com/apis/gdata/docs/2.0/reference.html#PartialResponse
it looks pretty promising..
I don't think what you're trying to do is possible using the Google Data API.
However, extending upon @Fueled's answer, you could do something like this if you need a collection of AtomEntry objects.
// Custom comparer for the AtomEntry class
class AtomEntryComparer : IEqualityComparer<AtomEntry>
{
    // Entries are equal if their titles are equal.
    public bool Equals(AtomEntry x, AtomEntry y)
    {
        // adjust as needed
        return x.Title.Text.Equals(y.Title.Text);
    }
    public int GetHashCode(AtomEntry entry)
    {
        // adjust as needed
        return entry.Title.Text.GetHashCode();
    }
}
EventFeed eventFeed = service.Query(query);
var entries = eventFeed.Entries.Distinct(new AtomEntryComparer());
It's probably not the solution you were looking for, but since you have in hand an AtomEntryCollection (which down the line implements IEnumerable<T>), you could use LINQ to retrieve the distinct titles, like so:
EventFeed feed = service.Query(query);
var uniqueEntries =
(from e in feed.Entries
select e.Title.Text).Distinct();
And then loop over them with a simple foreach:
foreach (var item in uniqueEntries)
{
Console.WriteLine(item);
}
But then you have only a collection of strings representing the event titles, and not a collection of AtomEntry objects. I guess you could link them together in a Dictionary.
Not optimal, but should work.
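Linking titles to entries in a Dictionary could look roughly like the sketch below. It does not use the Google Data API types: a tuple with a title and a start date stands in for AtomEntry, and the sample events are invented, so treat it as an outline of the grouping step only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AtomEntry: only a title and a start time.
var entries = new List<(string Title, DateTime Start)>
{
    ("Tennis", new DateTime(2011, 5, 1)),
    ("Tennis", new DateTime(2011, 5, 8)),
    ("Tennis doubles", new DateTime(2011, 5, 3)),
};

// One dictionary key per distinct title, holding every entry with that title.
Dictionary<string, List<(string Title, DateTime Start)>> entriesByTitle = entries
    .GroupBy(entry => entry.Title)
    .ToDictionary(g => g.Key, g => g.ToList());

Console.WriteLine(entriesByTitle.Count);           // 2
Console.WriteLine(entriesByTitle["Tennis"].Count); // 2
```

The dictionary keys give the unique titles, and each value keeps the full entries so nothing is lost by deduplicating.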
