Firstly apologies for the poor title. Absolutely no idea how to describe this question!
I have a "Relationship" entity that defines a relationship between 2 users.
public class Relationship{
User User1{get;set;}
User User2{get;set;}
DateTime StateChangeDate {get;set;}
//RelationshipState is an Enum with int values
RelationshipState State{get;set;}
}
Relationship state example.
public enum RelationshipState{
state1 = 1,
state2 = 2,
state3 = 3,
state4 = 4
}
A Relationship entity is created each time the RelationshipState changes, so for any pair of users there will be many Relationship objects, with the most recent one being current.
I'm trying to query for any Relationship object that represents a REDUCTION in RelationshipState for a particular pair of users.
That is, across all users, every Relationship object that has a later date than another Relationship for the same pair with a higher RelationshipState.
I'm finding it very hard to figure out how to accomplish this without iterating over the entire Relationship table.
First, write a query that groups the rows by user pair, so that each group holds all the status changes for that pair. For more information, search for LINQ GroupBy.
Then filter the groups, keeping only those where the last two status changes show that the state has gone down.
Here's an example, tested in LinqPad as a C# Program:
public enum RelationshipState {
state1 = 1,
state2 = 2,
state3 = 3,
state4 = 4
}
public class User {
public int id {get;set;}
}
public class Relationship{
public User User1{get;set;}
public User User2{get;set;}
public DateTime StateChangeDate {get;set;}
//RelationshipState is an Enum with int values
public RelationshipState State {get;set;}
}
void Main()
{
var rs=new List<Relationship>() {
new Relationship{ User1=new User{id=1},User2=new User{id=2},StateChangeDate=DateTime.Parse("1/1/2013"),State=RelationshipState.state2},
new Relationship{ User1=new User{id=1},User2=new User{id=2},StateChangeDate=DateTime.Parse("1/2/2013"),State=RelationshipState.state3},
new Relationship{ User1=new User{id=1},User2=new User{id=3},StateChangeDate=DateTime.Parse("1/1/2013"),State=RelationshipState.state2},
new Relationship{ User1=new User{id=1},User2=new User{id=3},StateChangeDate=DateTime.Parse("1/2/2013"),State=RelationshipState.state1},
new Relationship{ User1=new User{id=2},User2=new User{id=3},StateChangeDate=DateTime.Parse("1/2/2013"),State=RelationshipState.state1}
};
var result=rs.GroupBy(cm=>new {id1=cm.User1.id,id2=cm.User2.id},(key,group)=>new {Key1=key,Group1=group.OrderByDescending(g=>g.StateChangeDate)})
.Where(r=>r.Group1.Count()>1) // Remove Entries with only 1 status
//.ToList() // This might be needed for Linq-to-Entities
.Where(r=>r.Group1.First().State<r.Group1.Skip(1).First().State) // Only keep relationships where the state has gone down
.Select(r=>r.Group1.First()) //Turn this back into Relationship objects
;
// Use this instead if you want to know if state ever had a higher state than it is currently
// var result=rs.GroupBy(cm=>new {id1=cm.User1.id,id2=cm.User2.id},(key,group)=>new {Key1=key,Group1=group.OrderByDescending(g=>g.StateChangeDate)})
// .Where(r=>r.Group1.First().State<r.Group1.Max(g=>g.State))
// .Select(r=>r.Group1.First())
// ;
result.Dump();
}
Create a stored procedure in the database that can use a cursor to iterate the items and pair them off with the item before them (and then filter to decreasing state.)
Barring that, you can perform an inner query that finds the previous value for each item:
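from item in table
let previous =
(from innerItem in table
// earlier rows for the same pair of users
where innerItem.User1 == item.User1 && innerItem.User2 == item.User2
&& innerItem.StateChangeDate < item.StateChangeDate
orderby innerItem.StateChangeDate descending
select innerItem)
.FirstOrDefault() // the row immediately before item, or null if there is none
where previous != null && previous.State > item.State
select item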
As inefficient as that seems, it might be worth a try. With the proper indexes, a good query optimizer, and a sufficiently small set of data, it may not be that bad. If it's unacceptably slow, then a stored procedure with a cursor is most likely going to be the best option.
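For comparison, here is a minimal in-memory sketch of the same "pair each item with the one before it" idea, using LINQ to Objects rather than a database query. It assumes the User class has an int id (as in the LinqPad example above), and ReductionFinder and FindReductions are hypothetical names chosen for illustration. Unlike the check on the last two status changes, it reports every drop in a pair's history.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum RelationshipState { state1 = 1, state2 = 2, state3 = 3, state4 = 4 }

public class User { public int id { get; set; } }

public class Relationship
{
    public User User1 { get; set; }
    public User User2 { get; set; }
    public DateTime StateChangeDate { get; set; }
    public RelationshipState State { get; set; }
}

// Hypothetical helper, not part of the original question or answers.
public static class ReductionFinder
{
    // Returns every Relationship whose state is lower than the state
    // recorded immediately before it for the same pair of users.
    public static List<Relationship> FindReductions(IEnumerable<Relationship> all)
    {
        return all
            .GroupBy(r => new { id1 = r.User1.id, id2 = r.User2.id })
            .SelectMany(g =>
            {
                // Oldest first, so each element directly follows its predecessor.
                var ordered = g.OrderBy(r => r.StateChangeDate).ToList();
                // Pair each item with the one after it and keep only the drops.
                return ordered
                    .Zip(ordered.Skip(1), (previous, current) => new { previous, current })
                    .Where(p => p.current.State < p.previous.State)
                    .Select(p => p.current);
            })
            .ToList();
    }
}
```

Calling ReductionFinder.FindReductions(rs) on the sample list from the LinqPad example should return only the later (1, 3) entry, since that is the only pair whose state went down. This still reads every row once, so it suits data that is already in memory.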
I am new to OOP and C# so have what is potentially a noob question.
Looking at classes all the code I see requires the object names to be hardcoded. Say for example we don't know how many Customers a user will enter and what their names are. So how does that work with creating instances of a class?
Is it the case that you would only create one class of customer as a data type (with methods as required) and store them in say a list? Or can a class hold multiple instances of an object but create them and name them dynamically in relation to user input?
I've not quite got my head around how a class works with holding records in memory etc.
If you need to iterate through the objects at some point, I'd use a List.
As mentioned by @wazz, you should store them in a list, like so:
List<Customer> customers = new List<Customer>();
Then you can start working with the list, adding new instances, removing etc.
You can check more information at the docs.
Note that a class is a type. You can think of it as template used to create objects (also called instances of this class). So, you can have many different objects of the same class.
You can differentiate between different objects of the same class by assigning them to different variables:
Customer a = new Customer { Name = "customer a" };
Customer b = new Customer { Name = "customer b" };
This has its limits, of course, as you do not want to declare one thousand variables when you have one thousand customers!
Here, collections come into play.
Store the customers in a collection. There are many different collection types. Probably the 2 most popular are List<T> and Dictionary<TKey, TValue>.
List<T> where T is a type (Customer in your case):
List<Customer> customers = new List<Customer>();
Customer c = new Customer { Name = "My first customer" };
customers.Add(c);
// You can re-use the same variable for another customer.
c = new Customer { Name = "My second customer" };
customers.Add(c);
// Or skip the temporary variable altogether.
customers.Add(new Customer { Name = "My third customer" });
The customers in the list can be accessed through a zero-based index:
Customer c = customers[5]; // Get the 6th customer
Lists can also be enumerated:
foreach (Customer cust in customers) {
Console.WriteLine(cust.Name);
}
Dictionary<TKey, TValue>: dictionaries allow you to lookup a customer by a key, e.g. a customer number.
Dictionary<string, Customer> customers = new Dictionary<string, Customer>();
Customer c = new Customer { Id = "C1", Name = "My first customer" };
customers.Add(c.Id, c); // The Id is used as key, the customer object as value.
Then you can get it back with
Customer c = customers["C1"];
This throws an exception if the supplied key does not exist. To avoid the exception you can write:
string id = "XX";
if (customers.TryGetValue(id, out Customer cust)) {
Console.WriteLine(cust.Name);
} else {
Console.WriteLine($"Customer '{id}' not found");
}
To help you understand a class first.
A class is a blueprint of what the data will look like. If you were to build a house, for example, you would first make the blueprints. Those prints could be used to make many houses that look the same. The houses differ in areas such as color, whether the exterior is wood, brick, or vinyl, and whether they have carpet or not. This is just an example, but let's make this house into a class (blueprint), then make several of those houses into actual individual objects, and finally reference all of them in the same location (using a list).
First let's decide what our house options are, the things that set the houses apart: color, exterior, and with or without carpet. Here we will make the color a string (text) for simplicity, the exterior an enum (so the options are hard-coded), and the carpet a bool (true or false).
First let's make the enum, since it is separate from the class but used within it.
public enum Exterior { Brick, Vinyl, Wood }
Now let's make the class (the blueprint).
public class House
{
public string Color { get; set; }
public Exterior Exterior { get; set; }
public bool HasCarpet { get; set; }
}
Now that we have the blueprint, let's actually make some houses. When we make them we need a way to locate them in memory, so we assign them to variables. Let's pretend we are in a console app and these are the first few lines of the Main method.
var house1 = new House();
var house2 = new House();
var house3 = new House();
Now we have 3 individual houses in memory and can get to any of them by referencing house1, house2, or house3. However, these houses are still identical: they have no color (since the string Color defaults to null), default to Brick (since Brick is the first enum value), and have no carpet (since HasCarpet defaults to false).
We can fix this by referencing each house object reference and assigning these values like so...
house1.Color = "Red";
house1.Exterior = Exterior.Wood;
We could have given the class a constructor that requires these values as parameters, or we can do it a simpler way inline with an object initializer (thanks to the power of C# syntax).
var house1 = new House()
{
Color = "Red",
Exterior = Exterior.Wood
};
We could also give it carpet but since this house isn't going to have any and it defaults to false I've left it out.
Ok, so let's say we have all 3 houses built via our blueprint. Let's now store them together in a List. First we need to make the List into an object and reference that also.
var houses = new List<House>();
Now let's add the houses to the list.
houses.Add(house1);
houses.Add(house2);
houses.Add(house3);
Now houses is a reference to all of our house objects. If we want to get to a house in the list we could use an index (starting at 0) and get that location in the list. So let's say that house2 needs to have carpet but we want to use the list now to reference it. Note: There are quite a few ways to reference the items in a list, this one is elementary.
houses[1].HasCarpet = true;
I did this on my phone, hopefully there are no errors. My intentions are to clearly answer your question and educate and help you better understand classes.
Most likely, you would use an IEnumerable<Customer>. In real life, a Customer class is usually backed by a database.
When dealing with databases, LINQ comes to mind, which would use an IQueryable<Customer> (which derives from IEnumerable<Customer>). As the need arises you would also use other collection types like List<Customer>, Dictionary<...>, etc.
Sorry if my terminology is not great, I'm not a professional programmer.
I have a List< Something >, whereby the 'Something' is a struct. This struct contains objects, which each have their own public properties/fields in the classes. I want to sort the list in order - but by the values found in these nested properties/fields. Then I want to return a list of these values, not the structs.
I know this is very confusing but I've had trouble trying to do it. At the moment I just get a list returned with a count of 20 (which is the full data set I'm using), but I want the 3 values only with the smallest value.
For context and further explanation, here is some code I'm using:
// Returns 3 nearest stations to the location specified
public static List<TrainStation> nearbyStations(GeoCoordinate location)
{
List<StationWalk> stations = new List<StationWalk>();
foreach (TrainStation s in All)
{
stations.Add(new StationWalk(s, new Walk(location, s.location)));
}
// return 3 TrainStation objects that have the lowest StationWalk.Walk.duration values corresponding with them in the StationWalk struct
stations.OrderBy(walks => walks.walk.duration).Take(3);
List<TrainStation> returnList = new List<TrainStation>();
foreach (StationWalk s in stations)
{
returnList.Add(s.station);
}
return returnList;
}
private struct StationWalk
{
public StationWalk(TrainStation station, Walk walk)
{
this.station = station;
this.walk = walk;
}
public TrainStation station;
public Walk walk;
}
'Walk' is a class that contains a 'duration' field. This represents the time it takes to walk. More specifically, my overall goal here is to figure out which 3 walks are the fastest walks out of all 20 in the list. But the 'walks' are properties of the StationWalk struct, and the 'duration' is a property of the Walk.
How would I go about doing this? Really sorry if this isn't well explained; it's confusing to me despite having written it myself, let alone trying to explain it to others. Appreciate any help.
OrderBy and Take each return a new sequence; they do not modify the existing collection. You therefore need to store the result they return, like this:
stations = stations.OrderBy(walks => walks.walk.duration).Take(3).ToList();
If you want to keep a reference to the original list for use further down in your code, store the result in a separate local variable instead:
var lowestThreeStations = stations.OrderBy(walks => walks.walk.duration).Take(3).ToList();
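Since the method is meant to return TrainStation objects rather than StationWalk structs, a Select projection can be chained onto the same query. The sketch below is self-contained, so it uses simplified stand-ins for TrainStation and Walk; the name field, the numeric duration, and the StationFinder.Nearest helper are assumptions made for illustration, not the question's real types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the question's types; the real classes have more members.
public class TrainStation
{
    public string name; // hypothetical field, used only to tell stations apart
    public TrainStation(string name) { this.name = name; }
}

public class Walk
{
    public double duration; // assumed to be numeric, e.g. minutes
    public Walk(double duration) { this.duration = duration; }
}

public struct StationWalk
{
    public StationWalk(TrainStation station, Walk walk)
    {
        this.station = station;
        this.walk = walk;
    }
    public TrainStation station;
    public Walk walk;
}

public static class StationFinder
{
    // Returns the stations belonging to the `count` shortest walks.
    public static List<TrainStation> Nearest(IEnumerable<StationWalk> stations, int count)
    {
        return stations
            .OrderBy(sw => sw.walk.duration) // shortest walk first
            .Take(count)                     // keep only the first `count` entries
            .Select(sw => sw.station)        // project each struct to its station
            .ToList();                       // run the query once and store the result
    }
}
```

With this shape, the second foreach loop and the returnList variable in the question's method would no longer be needed; the method could end with return StationFinder.Nearest(stations, 3);.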
I'm pretty sure the title sounds kind of weird but I hope this is a valid question :)
I have a class, let's call it Employee:
class Employee
{
int employeeid { get; set; }
String employeename { get; set; }
String comment { get; set; }
}
I will fill a List from a database. An employeeid can have X number of comments, thus leaving the ratio 1:X. And there can of course be Y number of employeeid as well.
I want to create a List out of all the employee-objects which has for example employeeid = 1. And another list out of employeeid = 2.
I can sort the original List by employeeid, loop through the list and create a new list each time I hit a new employeeid. However I feel that the performance could be better.
Is there a way to split the original List into X number of lists depending on X number of distinct employeeids?
It's as simple as:
var query = data.GroupBy(employee => employee.employeeid);
Note the performance is much better than the algorithm you described. It will use a hash based data structure for the IDs, meaning that the entire operation is effectively a single pass performing a constant-time operation on each item.
Sure, LINQ's GroupBy should make this a breeze. Try something like this:
var answer = myEmployeeList.GroupBy( emp=>emp.employeeid );
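GroupBy returns a sequence of groups rather than separate lists. If actual List objects per employeeid are wanted, one possible follow-up is to turn the groups into a dictionary of lists. This is a sketch under some assumptions: the properties are made public so they can be read from outside the class, and EmployeeSplitter and SplitById are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    // Made public here so that code outside the class can read them.
    public int employeeid { get; set; }
    public string employeename { get; set; }
    public string comment { get; set; }
}

// Hypothetical helper, not part of the original answers.
public static class EmployeeSplitter
{
    // Builds one list per distinct employeeid, keyed by that id.
    public static Dictionary<int, List<Employee>> SplitById(IEnumerable<Employee> data)
    {
        return data
            .GroupBy(e => e.employeeid)                 // one group per distinct id
            .ToDictionary(g => g.Key, g => g.ToList()); // materialize each group as a list
    }
}
```

The list for a given id can then be read with something like lists[1], or with TryGetValue when the id might be absent.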
public class User
{
public int Id { get; set; }
public int Age { get; set; }
public string Name { get; set; }
}
I have 100k users.
Query: Get Users Whose Name is "Rafael" AND whose age is between 40 and 50
With LINQ to Objects: users.Where(p=>p.Name=="Rafael" && p.Age>=40 && p.Age<=50).ToArray();
Is there an alternative implementation with better performance? (Read-only, thread-safe)
(MultiIndexed User Array)
I've tested its performance. For 1000k users it takes 30-50 ms. That may not seem important, but it is,
because I can get 50 requests in a second.
With dharnitski's solution it takes 0 ms. :)
But is there a framework that does this transparently?
public class FastArray<T>
You cannot get the result you want without a full scan of the dataset if your data is not prepared.
Prepare the data in advance, when time is not critical, and work with the prepared data when you need a short response time.
There is an analogy for this in the database world.
Suppose there is a table with 100K records and somebody runs a SELECT query whose WHERE clause filters on a column that is not the primary key. The execution plan will always contain a slow "table scan" operation unless an index is created.
Sample code that implements indexing using ILookup<TKey, TElement>:
//not sorted array of users - raw data
User[] originalUsers;
//Prepare data in advance (create one index).
//Field with the best distribution should be used as key
ILookup<string, User> preparedUsers = originalUsers.ToLookup(u => u.Name, u => u);
//run this code when you need subset
//search by key is optimized by .NET class
//"where" clause works with small set of data
preparedUsers["Rafael"].Where(p=> p.Age>=40 && p.Age<=50).ToArray();
This code is not as powerful as database indexes (for example it does not support substrings) but it shows the idea.
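The same idea can be taken one step further by also preparing the age ordering, so that the age range is found with a binary search instead of a scan of the name bucket. The following is only a sketch: UserIndex, Find and LowerBound are hypothetical names, names are assumed to be non-null, and it is not the FastArray<T> class mentioned in the question. The index is never modified after construction, and Dictionary<TKey, TValue> is documented as supporting multiple concurrent readers as long as it is not modified, so read-only use from several threads should be fine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public int Id { get; set; }
    public int Age { get; set; }
    public string Name { get; set; }
}

// Hypothetical read-only index: name -> users sorted by age.
public sealed class UserIndex
{
    private readonly Dictionary<string, User[]> byName;

    public UserIndex(IEnumerable<User> users)
    {
        // Prepared once, in advance; assumes every Name is non-null.
        byName = users
            .GroupBy(u => u.Name)
            .ToDictionary(g => g.Key, g => g.OrderBy(u => u.Age).ToArray());
    }

    // Users with the given name whose age lies in [minAge, maxAge].
    public User[] Find(string name, int minAge, int maxAge)
    {
        if (!byName.TryGetValue(name, out User[] bucket)) return new User[0];
        int start = LowerBound(bucket, minAge);
        int end = LowerBound(bucket, maxAge + 1);
        if (end < start) end = start; // empty range when minAge > maxAge
        var result = new User[end - start];
        Array.Copy(bucket, start, result, 0, result.Length);
        return result;
    }

    // Index of the first element whose Age is >= age (binary search).
    private static int LowerBound(User[] sorted, int age)
    {
        int lo = 0, hi = sorted.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid].Age < age) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

Usage would look like var index = new UserIndex(originalUsers); followed by index.Find("Rafael", 40, 50) for each request. Whether this beats the ILookup version in practice depends on how large the per-name buckets are, so it is worth measuring.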
I wanted to generate a unique identifier for the results of a LINQ query I ran on some data.
Initially I thought of using Guid for that, but after stumbling upon this problem I had to improvise.
However, I'd like to see if anyone has a solution using Guid, so here we go.
Imagine we have:
class Query
{
public class Entry
{
public string Id { get; set; }
public int Value { get; set; }
}
public static IEnumerable<Entry> GetEntries( IEnumerable<int> list)
{
var result =
from i in list
select new Entry
{
Id = System.Guid.NewGuid().ToString("N"),
Value = i
};
return result;
}
}
Now we want Id to be unique for each entry, but we need this value to be the same for each traversal of the IEnumerable we get from GetEntries. This means that we want the following code:
List<int> list = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<Query.Entry> entries = Query.GetEntries(list);
Console.WriteLine("first pass");
foreach (var e in entries) { Console.WriteLine("{0} {1}", e.Value, e.Id); }
Console.WriteLine("second pass");
foreach (var e in entries) { Console.WriteLine("{0} {1}", e.Value, e.Id); }
to give us something like:
first pass
1 47f4a21a037c4ac98a336903ca9df15b
2 f339409bde22487e921e9063e016b717
3 8f41e0da06d84a58a61226a05e12e519
4 013cddf287da46cc919bab224eae9ee0
5 6df157da4e404b3a8309a55de8a95740
second pass
1 47f4a21a037c4ac98a336903ca9df15b
2 f339409bde22487e921e9063e016b717
3 8f41e0da06d84a58a61226a05e12e519
4 013cddf287da46cc919bab224eae9ee0
5 6df157da4e404b3a8309a55de8a95740
However we get:
first pass
1 47f4a21a037c4ac98a336903ca9df15b
2 f339409bde22487e921e9063e016b717
3 8f41e0da06d84a58a61226a05e12e519
4 013cddf287da46cc919bab224eae9ee0
5 6df157da4e404b3a8309a55de8a95740
second pass
1 a9433568e75f4f209c688962ee4da577
2 2d643f4b58b946ba9d02b7ba81064274
3 2ffbcca569fb450b9a8a38872a9fce5f
4 04000e5dfad340c1887ede0119faa16b
5 73a11e06e087408fbe1909f509f08d03
Now taking a second look at my code above I realized where my error was:
The assignment of Guid.NewGuid().ToString("N") to Id is executed every time we traverse the collection, and thus the value is different every time.
So what should I do then?
Is there a way I can ensure that I work with only one copy of the collection every time?
Is there a way to be sure that I won't be getting new instances of the query's results?
Thank you for your time in advance :)
This is inherent to all LINQ queries. Being repeatable is coincidental, not guaranteed.
You can solve it with a .ToList(), like:
IEnumerable<Query.Entry> entries = Query.GetEntries(list).ToList();
Or better, move the .ToList() inside GetEntries().
Perhaps you need to produce the list of entries once, and return the same list each time in GetEntries.
Edit:
Ah no, you get a different list each time! Well, then it depends on what you want to get. If you want the same Id for each specific Value, possibly across different lists, you need to cache the Ids: you should have a Dictionary<int, Guid> in which you store the GUIDs that have already been allocated. If you want your GUIDs to be unique for each source list, you would perhaps need to cache the input lists together with the returned IEnumerables, and always check whether a result was already returned for this input list.
Edit:
If you don't want to share the same GUIDs for different runs of GetEntries, you should just "materialize" the query (replacing return result; with return result.ToList();, for example), as it was suggested in the comment to your question.
Otherwise the query will run each time you traverse your list. This is what is called lazy evaluation. The lazy evaluation is usually not a problem, but in your case it leads to recalculating the GUID each query run (i.e., each loop over the result sequence).
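The difference between the lazy and the materialized query can be sketched in a few lines. LazyDemo, LazyIds and StableIds are hypothetical names used only for this illustration; the sketch returns the Id strings directly instead of Entry objects to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Deferred: the Select body runs again on every enumeration,
    // so each pass over the result produces fresh GUIDs.
    public static IEnumerable<string> LazyIds(IEnumerable<int> list)
    {
        return list.Select(i => Guid.NewGuid().ToString("N"));
    }

    // Materialized: ToList() runs the query once and stores the results,
    // so every later pass sees the same GUIDs.
    public static IEnumerable<string> StableIds(IEnumerable<int> list)
    {
        return list.Select(i => Guid.NewGuid().ToString("N")).ToList();
    }
}
```

Enumerating the result of LazyIds twice yields two different sets of Ids, while enumerating the result of StableIds twice yields the same set both times.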
Any reason you have to use LINQ? The following seems to work for me:
public static IEnumerable<Entry> GetEntries(IEnumerable<int> list)
{
List<Entry> results = new List<Entry>();
foreach (int i in list)
{
results.Add(new Entry() { Id = Guid.NewGuid().ToString("N"), Value = i });
}
return results;
}
That's because of the way LINQ works. When you return just the LINQ query, it is executed every time you enumerate over it. Therefore, for each list item, Guid.NewGuid will be executed as many times as you enumerate over the query.
Try adding an item to the list after you have iterated over the query once, and you will see that when iterating a second time, the newly added item is also in the result set. That's because the LINQ query holds a reference to your list and not an independent copy.
To always get the same result, return an array or list instead of the LINQ query; that is, change the return line of the GetEntries method to something like this:
return result.ToArray();
This forces immediate execution, which also happens only once.
Best Regards,
Oliver Hanappi
You might consider not using Guid, at least not with "new".
Using GetHashCode() returns values that don't change when you traverse the list multiple times, although hash codes are not guaranteed to be unique.
The problem is that your list is an IEnumerable<int>, so the hash code of each item coincides with its value.
You should re-evaluate your approach and use a different strategy. One thing that comes to mind is to use a pseudo-random number generator initialized with the hash code of the collection. It will always return the same numbers as long as it's initialized with the same value. But, again, forget Guid.
One suggestion (I don't know whether it applies to your case, though):
If you want to save the entries in a database, try assigning a Guid to your entry's primary key at the database level. This way, each entry will have a unique and persisted Guid as its primary key. Check out this link for more info.