Database queries in Entity Framework model - variable equals zero - c#

I have a problem with using the database in my Model. I suspect it's not a good idea to use database queries in the Model, but I don't know how to do it better.
Code:
Let's assume I have an application to analyze football scores. I have an EF model that stores info about a footballer:
public class Player
{
[...]
public virtual ICollection<Goal> HisGoals { get; set; }
public float Efficiency
{
get
{
using(var db = new ScoresContext())
{
var allGoalsInSeason = db.Goals.Count();
return HisGoals.Count / allGoalsInSeason;
}
}
}
}
Problem:
So the case is: I want a property in my model called "Efficiency" that returns the quotient of two values. One of them is fetched in real time from the database.
For now this code doesn't work: "Efficiency" is always 0. I used the debugger and all the data is correct, so it should return a different value.
What's wrong? Why does it always return zero?
My suspicions:
Maybe I'm wrong, and I'm not good at C#, but I think the reason Efficiency is always zero is that I use the database in it and the call is somehow asynchronous. When I read this property, it returns zero first and only then queries the database.

I think your problem lies in dividing an integer by an integer. In order to get a float you have to cast the first one to float, like this:
public float Efficiency
{
get
{
using(var db = new ScoresContext())
{
var allGoalsInSeason = db.Goals.Count();
return (float)HisGoals.Count / allGoalsInSeason;
}
}
}
Dividing an int by an int always results in an int, which in your case is 0 (4/11, as you said in the comment).
A second thing to note is that Entity Framework may cache values, so test this behavior before shipping to production.
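To see the difference in isolation (the 4 and 11 are just the numbers from the comment):

```csharp
using System;

int hisGoals = 4;
int allGoalsInSeason = 11;

// Integer division truncates toward zero, so this prints 0.
Console.WriteLine(hisGoals / allGoalsInSeason);

// Casting one operand promotes the whole expression to float: prints roughly 0.3636364.
Console.WriteLine((float)hisGoals / allGoalsInSeason);
```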

Related

C# store and read large array of objects

I have a Winforms C# application that performs calculations on a raster. The calculation results are stored as objects in an array whose total length depends on the project, currently around 1 million entries (but I want to make it larger, even 2 or 3 million). The goal of the application is to run queries on the data: the user (de)selects some properties, then the app iterates over the array and summarizes the values of the objects for each array entry. The results are shown as a picture (each pixel is an array entry).
Currently I'm storing the data as a compressed JSON string on disk, so I'm loading all the data into memory. The advantage of doing this is that the queries are performed very fast (max 2 seconds). The disadvantage is that it takes a lot of memory, and it will throw an out-of-memory exception if the array becomes larger (I'm already building the app as 64-bit).
Question: is there a way of storing my array on disk, without loading the entire array into memory, while still performing the queries very fast? I've done some tests with LiteDB, but executing the queries is not fast enough (though I have no experience with LiteDB, so maybe I'm doing something wrong). Is a database like LiteDB a good solution? Or is loading all the data into memory the only option?
Update: each entry in my array is a List of class CellResultPart, with around 1 to 10 objects in the list. Class definition as follows:
public struct CellResultPart
{
public CellResultPart(double designElevation, double existingElevation)
{
DesignElevation = designElevation;
ExistingElevation = existingElevation;
MaterialName = "<None>";
Location = "<None>";
EnvironmentalClass = "<None>";
ElevationTop = double.NaN;
ElevationBottom = double.NaN;
ElevationLayerTop = double.NaN;
ElevationLayerBottom = double.NaN;
DepthLayerTop = double.NaN;
DepthLayerBottom = double.NaN;
}
public double DesignElevation;
public double ExistingElevation;
public double Depth
{
get
{
if (IsExcavation)
{
return -Math.Round(Math.Abs(DepthBottom - DepthTop),3);
}
else
{
return Math.Round(Math.Abs(DepthBottom - DepthTop),3);
}
}
}
public double ElevationTop;
public double ElevationBottom;
public double ElevationLayerTop;
public double ElevationLayerBottom;
public double DepthTop
{
get
{
if (IsExcavation)
{
return -Math.Round(Math.Abs(ExistingElevation - ElevationTop),3);
}
else
{
return Math.Round(Math.Abs(DesignElevation - ElevationTop),3);
}
}
}
public double DepthBottom
{
get
{
if (IsExcavation)
{
return -Math.Round(Math.Abs(ExistingElevation - ElevationBottom),3);
}
else
{
return Math.Round(Math.Abs(DesignElevation - ElevationBottom),3);
}
}
}
public double DepthLayerTop;
public double DepthLayerBottom;
public string EnvironmentalClass;
public string Location;
public string MaterialName;
public bool IsExcavation
{
get { return DesignElevation <= ExistingElevation; }
}
}
Let's make some rough calculations. You have 10 doubles and 3 strings. Let's assume the strings are on average 20 characters. That should give you about 200 bytes per entry, or 200-600 MB overall. That should be feasible to keep in memory, even on a 32-bit system.
Using JSON will probably not help, since it makes the data much larger. I would consider some binary format that gets closer to the theoretical required size. I have used protobuf-net with good results. It also supports SerializeWithLengthPrefix, which allows you to serialize each object independently of the others into a single stream, avoiding the need to keep everything in memory at the same time.
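A minimal sketch of that length-prefixed approach, assuming the protobuf-net NuGet package and a cut-down version of the entry type (the type and method names here are illustrative):

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf; // protobuf-net NuGet package (assumed dependency)

[ProtoContract]
public class CellResult
{
    [ProtoMember(1)] public double DesignElevation;
    [ProtoMember(2)] public double ExistingElevation;
    [ProtoMember(3)] public string MaterialName;
}

public static class ResultStore
{
    // Write each entry with a length prefix so the file can be read back one item at a time.
    public static void Save(string path, IEnumerable<CellResult> results)
    {
        using (var stream = File.Create(path))
        {
            foreach (var r in results)
                Serializer.SerializeWithLengthPrefix(stream, r, PrefixStyle.Base128, fieldNumber: 1);
        }
    }

    // Stream entries back lazily; only one item is materialized at a time.
    public static IEnumerable<CellResult> Load(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            foreach (var r in Serializer.DeserializeItems<CellResult>(stream, PrefixStyle.Base128, fieldNumber: 1))
                yield return r;
        }
    }
}
```

Because Load is an iterator, a query can stream over the file with a plain foreach without holding the whole array in memory; whether that meets the 2-second target would need measuring.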
The other option would be to use some kind of database. Such a solution would most likely scale better as the size increases. Database performance is mostly a matter of using appropriate indices; I assume that is the reason your attempt went poorly. Creating good indices may be difficult if you have no idea what queries will be run, but I would still expect a database to perform better than a linear search.

Should the value of a property be updated with a method or determined in the getter?

Just a quick question: is it better practice to update the value of a property with a method that gets called every time the value needs to change, or to compute it in the getter?
For instance between this:
public double Balance { get; private set; }
private void UpdateBalance()
{
if (Payments.IsNullOrEmpty())
{
Balance = 0;
}
else
{
double amountSum = 0;
foreach (Payment payment in Payments)
{
amountSum += payment.Amount;
}
Balance = amountSum;
}
}
And this :
public double OtherBalance
{
get
{
if (Payments.IsNullOrEmpty())
return 0;
double amountSum = 0;
foreach (Payment payment in Payments)
{
amountSum += payment.Amount;
}
return amountSum;
}
}
The only difference I can think of is one of performance, since in the second option the getter runs through the whole list every time we read the property. On the other hand, with the first option you have to remember to call the update method every time you make a change that might impact the value of the property. Is that difference really significant? Beyond that, is there any reason to choose one option over the other?
Thanks in advance
Well, in the first method you have to remember to update the value before reading the balance, and in the second method there is no stored balance at all. Personally I prefer the second method: it generates the value at call time, so you can be sure it is always up to date, and you don't have to call a function and then read the value, which makes it cleaner and more maintainable.
To add to Kiani's answer, if you don't mind using LINQ, you can turn your code into a one-liner:
private double Balance => (!Payments.Any()) ? 0 : Payments.Sum(t => t.Amount);

Passing in and returning custom data - are interfaces the right approach?

I'm writing code in a C# library to do clustering on a (two-dimensional) dataset - essentially breaking the data up into groups or clusters. To be useful, the library needs to take in "generic" or "custom" data, cluster it, and return the clustered data.
To do this, I need to assume that each datum in the dataset being passed in has a 2D vector associated with it (in my case Lat, Lng - I'm working with co-ordinates).
My first thought was to use generic types, and pass in two lists, one list of the generic data (i.e. List<T>) and another of the same length specifying the 2D vectors (i.e. List<Coordinate>, where Coordinate is my class for specifying a lat, lng pair), where the lists correspond to each other by index. But this is quite tedious because it means that in the algorithm I have to keep track of these indices somehow.
My next thought was to use interfaces, where I define an interface
public interface IPoint
{
double Lat { get; }
double Lng { get; }
}
and ensure that the data that I pass in implements this interface (i.e. I can assume that each datum passed in has a Lat and a Lng).
But this isn't really working out for me either. I'm using my C# library to cluster stops in a transit network (in a different project). The class is called Stop, and this class is also from an external library, so I can't implement the interface for that class.
What I did then was inherit from Stop, creating a class called ClusterableStop, which looks like this:
public class ClusterableStop : GTFS.Entities.Stop, IPoint
{
public ClusterableStop(Stop stop)
{
Id = stop.Id;
Code = stop.Code;
Name = stop.Name;
Description = stop.Description;
Latitude = stop.Latitude;
Longitude = stop.Longitude;
Zone = stop.Zone;
Url = stop.Url;
LocationType = stop.LocationType;
ParentStation = stop.ParentStation;
Timezone = stop.Timezone;
WheelchairBoarding = stop.WheelchairBoarding;
}
public double Lat
{
get
{
return this.Latitude;
}
}
public double Lng
{
get
{
return this.Longitude;
}
}
}
which as you can see implements the IPoint interface. Now I use the constructor for ClusterableStop to first convert all Stops in the dataset to ClusterableStops, then run the algorithm and get the result as ClusterableStops.
This isn't really what I want, because I want to do things to the Stops based on which cluster they fall in. I can't do that, because I've actually instantiated new stops, namely ClusterableStops!
I can still achieve what I want, because e.g. I can retrieve the original objects by Id. But surely there is a more elegant way to accomplish all of this? Is this the right way to use interfaces? It seemed like such a simple idea - passing in and getting back custom data - but it turned out to be so complicated.
Since all you need is to associate a (latitude, longitude) pair with each element of the dataset, you could make a method that takes a delegate which produces the associated position for each datum, like this:
ClusterList Cluster<T>(IList<T> data, Func<int, Coordinate> getCoordinate) {
    for (int i = 0; i != data.Count; i++) {
        T item = data[i];
        Coordinate coord = getCoordinate(i);
        ...
    }
}
It is now up to the caller to decide how Coordinate is paired with each element of data.
Note that the association by list position is not the only option available to you. Another option is to pass a delegate that takes the item, and returns its coordinate:
ClusterList Cluster<T>(IEnumerable<T> data, Func<T, Coordinate> getCoordinate) {
    foreach (var item in data) {
        Coordinate coord = getCoordinate(item);
        ...
    }
}
Although this approach is better than the index-based one, in cases when the coordinates are not available on the object itself it requires the caller to keep some sort of associative container keyed on T, which must either play well with hash-based containers or be IComparable<T>. The first approach places no such restrictions on T.
In your case, the second approach is preferable:
var clustered = Cluster(
myListOfStops
, stop => new Coordinate(stop.Latitude, stop.Longitude)
);
Have you considered using tuples to do the work? Sometimes this is a useful way of associating two classes without creating a whole new class. You can create a list of tuples:
List<Tuple<Point, Stop>>
where Point is the thing you cluster on.
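For instance, assuming a Point type holding the clustering coordinates and the Stop class from the question, the pairing could look like this (variable names are illustrative):

```csharp
// Pair each stop with the point it is clustered on - no Stop subclass required.
List<Tuple<Point, Stop>> pairs = stops
    .Select(s => Tuple.Create(new Point(s.Latitude, s.Longitude), s))
    .ToList();

// After clustering, the original Stop is still available as Item2 of each tuple.
```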

If - return is a huge bottleneck in my application

This is a snippet of code from my C# application:
public Player GetSquareCache(int x, int y)
{
if (squaresCacheValid)
return (Player)SquaresCache[x,y];
else
//generate square cache and retry...
}
squaresCacheValid is a private bool and SquaresCache is a private uint[,].
The problem is that the application runs extremely slowly, and every optimization I tried just made it slower, so I ran a tracing session.
I found that GetSquareCache() gets 94.41% own time, and that the if and return statements split that value mostly evenly (46% for the if and 44.82% for the return). The method is hit about 15,000 times in 30 seconds, in some tests going up to 20,000.
Before I added the methods that call GetSquareCache(), the program performed pretty well, but it was using a random value instead of the actual GetSquareCache() calls.
My questions are: is it possible that these if/return statements use up so much CPU time? How is it possible that the if statements GetSquareCache() is called from (which in total are hit the same number of times) have minimal own time? And is it possible to speed up an operation as fundamental as an if?
Edit: Player is defined as
public enum Player
{
None = 0,
PL1 = 1,
PL2 = 2,
Both = 3
}
I would suggest a different approach. Under the assumption that most squares hold no player and that the board is very large, remember only the locations where there are players.
It should look something like this:
public class PlayerLocation
{
    Dictionary<Point, Player> _playerLocation = new Dictionary<Point, Player>();

    public void SetPlayer(int x, int y, Player p)
    {
        _playerLocation[new Point(x, y)] = p;
    }

    public Player GetSquareCache(int x, int y)
    {
        if (squaresCacheValid)
        {
            Player value;
            Point p = new Point(x, y);
            if (_playerLocation.TryGetValue(p, out value))
            {
                return value;
            }
            return Player.None;
        }
        else
        {
            // generate square cache and retry...
        }
    }
}
The problem is just that the method is called way too many times. And indeed, at 34,637 ms over 34,122 hits in the last trace, it gets a little over 1 ms per hit. In the decompiled CIL code there are also some assignments to local variables, not present in the source, in both if branches, because the method needs a single ret statement. The algorithm itself is what needs to be modified, and such modifications were planned anyway.
- Replace the return type of this method with int and remove the cast to Player.
- If the cache is set only once, remove the if from this method so the check doesn't run on every call.
- Replace the 2D array with a single-dimension array and access it via unsafe/fixed code.
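For the last point, a flattened cache might look like this sketch (Width and Height are assumed board dimensions; the unsafe/fixed part is omitted):

```csharp
// One flat array instead of uint[,]; index arithmetic replaces the [x, y] lookup,
// which avoids the extra overhead of multidimensional array access.
private uint[] squaresCache = new uint[Width * Height];

public Player GetSquareCache(int x, int y)
{
    return (Player)squaresCache[y * Width + x];
}
```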

Is this more suited for key value storage or a tree?

I'm trying to figure out the best way to represent some data. It basically follows the form Manufacturer.Product.Attribute = Value. Something like:
Acme.*.MinimumPrice = 100
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
So the minimum price across all Acme products is 100, except for products A and B. I want to store this data in C# and have some function where GetValue("Acme.ProductC.MinimumPrice") returns 100 but GetValue("Acme.ProductA.MinimumPrice") returns 50.
I'm not sure how to best represent the data. Is there a clean way to code this in C#?
Edit: I may not have been clear. This is configuration data that needs to be stored in a text file, then parsed and stored in memory in some way so that it can be retrieved as in the examples I gave.
Write the text file exactly like this:
Acme.*.MinimumPrice = 100
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
Parse it into a path/value pair sequence:
foreach (var pair in File.ReadAllLines(configFileName)
    .Select(l => l.Split('='))
    .Select(a => new { Path = a[0].Trim(), Value = a[1].Trim() }))
{
// do something with each pair.Path and pair.Value
}
Now, there are two possible interpretations of what you want to do. The string Acme.*.MinimumPrice could mean that for any lookup where there is no specific override, such as Acme.Toadstool.MinimumPrice, we return 100 - even though there is nothing referring to Toadstool anywhere in the file. Or it could mean that it should only return 100 if there are other specific mentions of Toadstool in the file.
If it's the former, you could store the whole lot in a flat dictionary, and at look up time keep trying different variants of the key until you find something that matches.
If it's the latter, you need to build a data structure of all the names that actually occur in the path structure, to avoid returning values for ones that don't actually exist. This seems more reliable to me.
So going with the latter option, Acme.*.MinimumPrice is really saying "add this MinimumPrice value to any product that doesn't have its own specifically defined value". This means that you can basically process the pairs at parse time to eliminate all the asterisks, expanding it out into the equivalent of a completed version of the config file:
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
Acme.ProductC.MinimumPrice = 100
The nice thing about this is that you only need a flat dictionary as the final representation, and you can just use TryGetValue or [] to look things up. The result may be a lot bigger, but it all depends on how big your config file is.
You could store the information more minimally, but I'd go with something simple that works to start with, and give it a very simple API so that you can re-implement it later if it really turns out to be necessary. You may find (depending on the application) that making the look-up process more complicated is worse over all.
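A sketch of that parse-time expansion, building on the pair loop above (the two-pass structure and variable names are just one way to do it):

```csharp
var defaults = new Dictionary<string, string>();  // "Acme|MinimumPrice" -> "100"
var values   = new Dictionary<string, string>();  // full path -> value
var products = new HashSet<string>();             // "Acme.ProductA", "Acme.ProductB", ...

// First pass: separate wildcard defaults from specific entries.
foreach (var pair in pairs)
{
    var parts = pair.Path.Split('.');             // manufacturer, product, attribute
    if (parts[1] == "*")
        defaults[parts[0] + "|" + parts[2]] = pair.Value;
    else
    {
        values[pair.Path] = pair.Value;
        products.Add(parts[0] + "." + parts[1]);
    }
}

// Second pass: copy each default onto every known product that lacks its own value.
foreach (var def in defaults)
{
    var defParts = def.Key.Split('|');            // manufacturer, attribute
    foreach (var product in products)
    {
        if (!product.StartsWith(defParts[0] + ".")) continue;
        var key = product + "." + defParts[1];
        if (!values.ContainsKey(key))
            values[key] = def.Value;
    }
}
// values now matches the expanded config; look things up with values.TryGetValue(...).
```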
I'm not entirely sure what you're asking, but it sounds like you're saying one of two things.
I need a function that will return a fixed value, 100, for every product ID except for two cases: ProductA and ProductB
In that case you don't even need a data structure. A simple comparison function will do
int GetValue(string key) {
if ( key == "Acme.ProductA.MinimumPrice" ) { return 50; }
else if (key == "Acme.ProductB.MinimumPrice") { return 60; }
else { return 100; }
}
Or you could have been asking
I need a function that will return a value if already defined or 100 if it's not
In that case I would use a Dictionary<string,int>. For example
class DataBucket {
private Dictionary<string,int> _priceMap = new Dictionary<string,int>();
public DataBucket() {
_priceMap["Acme.ProductA.MinimumPrice"] = 50;
_priceMap["Acme.ProductB.MinimumPrice"] = 60;
}
public int GetValue(string key) {
int price = 0;
if ( !_priceMap.TryGetValue(key, out price)) {
price = 100;
}
return price;
}
}
One way is to create a nested dictionary: Dictionary<string, Dictionary<string, Dictionary<string, object>>>. In your code you would split "Acme.ProductA.MinimumPrice" by dots and get or set a value in the nested dictionaries corresponding to the split chunks.
Another way is to use LINQ to XML: you can create an XDocument with Acme as the root node, products as children of the root, and the attributes stored either as attributes on the products or as child nodes. I prefer the second solution, but it would be slower if you have thousands of products.
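A sketch of the nested-dictionary lookup (the store variable and helper are illustrative, and missing keys would throw a KeyNotFoundException):

```csharp
using System.Collections.Generic;

var store = new Dictionary<string, Dictionary<string, Dictionary<string, object>>>();

object GetValue(string path)
{
    // Split "Acme.ProductA.MinimumPrice" into manufacturer, product, attribute
    // and walk one dictionary level per chunk.
    var parts = path.Split('.');
    return store[parts[0]][parts[1]][parts[2]];
}
```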
I would take an OOP approach to this. The way you explain it, all your products are represented by objects, which is good. This seems like a good use of polymorphism.
I would have all products inherit from a ProductBase which has a virtual property with the default:
public virtual int MinimumPrice { get { return 100; } }
And then your specific products, such as ProductA, override it:
public override int MinimumPrice { get { return 50; } }
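Put together, and assuming int prices as in the question, the hierarchy might look like:

```csharp
public class ProductBase
{
    // Default minimum price for any product that doesn't override it.
    public virtual int MinimumPrice { get { return 100; } }
}

public class ProductA : ProductBase
{
    public override int MinimumPrice { get { return 50; } }
}

public class ProductB : ProductBase
{
    public override int MinimumPrice { get { return 60; } }
}
```

With this, new ProductA().MinimumPrice yields 50, while any product without an override falls back to 100.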
