UnitsNet - What's the best way to do runtime unit conversion - C#

We are writing a data conversion application.
We have tens of thousands of equations that need to be extracted into a template, aka a 'model'.
The model will take the most common unit of measure for a given set of equations.
Then each equation's value must be adjusted to ensure that its value corresponds to the unit of measure on the model.
So, I am looking to use UnitsNet to convert values of variables, given a source variable and its unit, to a target unit.
The issue I have is that at compile time we know neither the source unit nor the target unit.
All we have is the source and target unit string abbreviations in the equation which comes in at runtime (and some of these will be custom units).
Simple example:
Target Unit: mA (Milliamperes)
Source Equation: Is=8A (Amperes)
We need to be able to recognize the unit of measure from its abbreviation, compare it to the target unit, and then adjust the value accordingly:
e.g. in the case above we would multiply 8 Amperes by 1000 to get 8000 Milliamperes.
I cannot see a succinct way to do this with UnitsNet.
Something crude like this is all I have so far (this is a unit test written in xUnit):
[Theory]
[InlineData("A", "mA", "8")]
public void DoConversion(string sourceUnit, string targetUnit, string variableValue)
{
    double result = 0;
    ElectricCurrent sourceCurrent;
    if (ElectricCurrent.TryParse($"{variableValue}{sourceUnit}", out sourceCurrent))
    {
        ElectricCurrent targetCurrent;
        if (ElectricCurrent.TryParse($"1{targetUnit}", out targetCurrent))
        {
            var electricCurrentUnit = GetElectricCurrentUnitFromAbbreviation(targetUnit);
            if (electricCurrentUnit == ElectricCurrentUnit.Ampere)
            {
                result = sourceCurrent.Amperes;
            }
            if (electricCurrentUnit == ElectricCurrentUnit.Milliampere)
            {
                result = sourceCurrent.Milliamperes;
            }
        }
    }
    result.Should().Be(8000);
    // TODO: Add every other combination of all possible units and their scales - OMG!!!
}
private ElectricCurrentUnit GetElectricCurrentUnitFromAbbreviation(string abbreviation)
{
    // Is there a better way to determine WHICH ElectricCurrentUnit the target is?
    if (abbreviation == "A")
        return ElectricCurrentUnit.Ampere;
    if (abbreviation == "mA")
        return ElectricCurrentUnit.Milliampere;
    return ElectricCurrentUnit.Undefined;
}
But the list of possible units we have to cater for is large, so I don’t want to have to write it this way.
It seems like there’s got to be a better way.
Would really appreciate your expert insight into this.

This was answered on github: https://github.com/angularsen/UnitsNet/issues/220
Proposed solution
This gives you some tools to more easily work with dynamic conversions, using string representations of quantities and units.
New class UnitConverter
New property Units on quantities, ex: LengthUnit[] Units { get; } on Length
Renamed (obsoleted) UnitClass enum to QuantityType for new naming convention
This allows the following scenario:
// Get quantities for populating quantity UI selector
QuantityType[] quantityTypes = Enum.GetValues(typeof(QuantityType)).Cast<QuantityType>().ToArray();
// If Length is selected, get length units for populating from/to UI selectors
LengthUnit[] lengthUnits = Length.Units;
// Perform conversion by using .ToString() on selected units
double centimeters = UnitConverter.ConvertByName(5, "Length", "Meter", "Centimeter");
double centimeters2 = UnitConverter.ConvertByAbbreviation(5, "Length", "m", "cm");
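Applied to the original question's example, this collapses the whole combinatorial TODO into a single call. A sketch only, assuming the ConvertByAbbreviation signature shown above and that "ElectricCurrent" is the quantity name used for the Ampere example:

```csharp
// Sketch: depends on the UnitConverter API proposed in this issue.
[Theory]
[InlineData(8, "ElectricCurrent", "A", "mA", 8000)]
public void DoConversion(double value, string quantityName, string fromAbbrev, string toAbbrev, double expected)
{
    double result = UnitConverter.ConvertByAbbreviation(value, quantityName, fromAbbrev, toAbbrev);
    result.Should().Be(expected);
}
```

Custom units would still need to be registered or mapped separately; the proposal above only covers units UnitsNet knows about.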

Related

Passing in and returning custom data - are interfaces the right approach?

I'm writing code in a C# library to do clustering on a (two-dimensional) dataset - essentially breaking the data up into groups or clusters. To be useful, the library needs to take in "generic" or "custom" data, cluster it, and return the clustered data.
To do this, I need to assume that each datum in the dataset being passed in has a 2D vector associated with it (in my case Lat, Lng - I'm working with co-ordinates).
My first thought was to use generic types, and pass in two lists, one list of the generic data (i.e. List<T>) and another of the same length specifying the 2D vectors (i.e. List<Coordinate>, where Coordinate is my class for specifying a lat, lng pair), where the lists correspond to each other by index. But this is quite tedious because it means that in the algorithm I have to keep track of these indices somehow.
My next thought was to use interfaces, where I define an interface
public interface IPoint
{
    double Lat { get; set; }
    double Lng { get; set; }
}
and ensure that the data that I pass in implements this interface (i.e. I can assume that each datum passed in has a Lat and a Lng).
But this isn't really working out for me either. I'm using my C# library to cluster stops in a transit network (in a different project). The class is called Stop, and this class is also from an external library, so I can't implement the interface for that class.
What I did then was inherit from Stop, creating a class called ClusterableStop, which looks like this:
public class ClusterableStop : GTFS.Entities.Stop, IPoint
{
    public ClusterableStop(Stop stop)
    {
        Id = stop.Id;
        Code = stop.Code;
        Name = stop.Name;
        Description = stop.Description;
        Latitude = stop.Latitude;
        Longitude = stop.Longitude;
        Zone = stop.Zone;
        Url = stop.Url;
        LocationType = stop.LocationType;
        ParentStation = stop.ParentStation;
        Timezone = stop.Timezone;
        WheelchairBoarding = stop.WheelchairBoarding;
    }
    // IPoint declares setters as well, so these must be implemented to compile.
    public double Lat
    {
        get { return this.Latitude; }
        set { this.Latitude = value; }
    }
    public double Lng
    {
        get { return this.Longitude; }
        set { this.Longitude = value; }
    }
}
which as you can see implements the IPoint interface. Now I use the constructor for ClusterableStop to first convert all Stops in the dataset to ClusterableStops, then run the algorithm and get the result as ClusterableStops.
This isn't really what I want, because I want to do things to the Stops based on what cluster they fall in. I can't do that, because I've actually instantiated new stops, namely ClusterableStops!
I can still achieve what I want, because e.g. I can retrieve the original objects by Id. But surely there is a much more elegant way to accomplish all of this? Is this the right way to be using interfaces? It seemed like such a simple idea - passing in and getting back custom data - but turned out to be so complicated.
Since all you need is to associate a (latitude, longitude) pair to each element of 2D array, you could make a method that takes a delegate, which produces an associated position for each datum, like this:
ClusterList Cluster<T>(IList<T> data, Func<int, Coordinate> getCoordinate) {
    for (int i = 0; i != data.Count; i++) {
        T item = data[i];
        Coordinate coord = getCoordinate(i);
        ...
    }
}
It is now up to the caller to decide how Coordinate is paired with each element of data.
Note that the association by list position is not the only option available to you. Another option is to pass a delegate that takes the item, and returns its coordinate:
ClusterList Cluster<T>(IEnumerable<T> data, Func<T, Coordinate> getCoordinate) {
    foreach (var item in data) {
        Coordinate coord = getCoordinate(item);
        ...
    }
}
Although this approach is better than the index-based one, in cases where the coordinates are not available on the object itself it requires the caller to keep some sort of associative container keyed on T, which must then either play well with hash-based containers or implement IComparable<T>. The first approach places no such restrictions on T.
In your case, the second approach is preferable:
var clustered = Cluster(
    myListOfStops,
    stop => new Coordinate(stop.Latitude, stop.Longitude)
);
Have you considered using tuples to do the work? Sometimes this is a useful way of associating two classes without creating a whole new class. You can create a list of tuples:
List<Tuple<Point, Stop>>
where Point is the thing you cluster on.
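A small self-contained sketch of that tuple pairing (Point and Stop here are minimal stand-ins for the poster's types; the clustering call itself is omitted):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Point is the clustering key; Stop stands in for the external library entity.
class Point { public double Lat; public double Lng; }
class Stop { public string Id; public double Latitude; public double Longitude; }

class Demo
{
    static void Main()
    {
        var stops = new List<Stop>
        {
            new Stop { Id = "a", Latitude = 1, Longitude = 2 },
            new Stop { Id = "b", Latitude = 3, Longitude = 4 },
        };

        // Pair each Stop with the Point to cluster on; the original Stop
        // travels alongside, so no lookup by Id is needed afterwards.
        List<Tuple<Point, Stop>> pairs = stops
            .Select(s => Tuple.Create(new Point { Lat = s.Latitude, Lng = s.Longitude }, s))
            .ToList();

        Console.WriteLine(pairs[1].Item2.Id); // b
    }
}
```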

best data structure for storing large number of numeric fields

I am working with a class, say Widget, that has a large number of numeric real-world attributes (e.g. height, length, weight, cost, etc.). There are different types of widgets (sprockets, cogs, etc.), but each widget type shares the exact same set of attributes (the values differ by widget, of course, but they all have a height, a weight, etc.). I have 1,000s of each type of widget (1,000 cogs, 1,000 sprockets, etc.).
I need to perform a lot of calculations on these attributes (say calculating the weighted average of the attributes for 1000s of different widgets). For the weighted averages, I have different weights for each widget type (ie, I may care more about length for sprockets than for cogs).
Right now, I am storing all the attributes in a Dictionary<string, double> within each widget (the widgets have an enum that specifies their type: cog, sprocket, etc.). I then have some calculator classes that store weights for each attribute as a Dictionary<WidgetType, Dictionary<string, double>>. To calculate the weighted average for each widget, I simply iterate through its attribute dictionary keys like:
double weightedAvg = 0.0;
foreach (string attributeName in widget.Attributes.Keys)
{
    double attributeValue = widget.Attributes[attributeName];
    double attributeWeight = calculator.Weights[widget.Type][attributeName];
    weightedAvg += (attributeValue * attributeWeight);
}
So this works fine and is pretty readable and easy to maintain, but is very slow for 1000s of widgets based on some profiling. My universe of attribute names is known and will not change during the life of the application, so I am wondering what some better options are. The few I can think of:
1) Store attribute values and weights in double[]s. I think this is probably the most efficient option, but then I need to make sure the arrays are always stored in the same order between widgets and calculators. This also decouples the data from the metadata, so I will need to store an array (?) somewhere that maps between the attribute names and the index into the double[] of attribute values and weights.
2) Store attribute values and weights in immutable structs. I like this option because I don't have to worry about the ordering and the data is "self documenting". But is there an easy way to loop over these attributes in code? I have almost 100 attributes, so I don't want to hardcode all those in the code. I can use reflection, but I worry that this will cause even a larger penalty hit since I am looping over so many widgets and will have to use reflection on each one.
Any other alternatives?
Three possibilities come immediately to mind. The first, which I think you rejected too readily, is to have individual fields in your class. That is, individual double values named height, length, weight, cost, etc. You're right that it would be more code to do the calculations, but you wouldn't have the indirection of dictionary lookup.
Second is to ditch the dictionary in favor of an array. So rather than a Dictionary<string, double>, you'd just have a double[]. Again, I think you rejected this too quickly. You can easily replace the string dictionary keys with an enumeration. So you'd have:
enum WidgetProperty
{
    First = 0,
    Height = 0,
    Length = 1,
    Weight = 2,
    Cost = 3,
    ...
    Last = 100
}
Given that and an array of double, you can easily go through all of the values for each instance:
for (int i = (int)WidgetProperty.First; i < (int)WidgetProperty.Last; ++i)
{
    double attributeValue = widget.Attributes[i];
    double attributeWeight = calculator.Weights[widget.Type][i];
    weightedAvg += (attributeValue * attributeWeight);
}
Direct array access is going to be significantly faster than accessing a dictionary by string.
Finally, you can optimize your dictionary access a little bit. Rather than doing a foreach on the keys and then doing a dictionary lookup, do a foreach on the dictionary itself:
foreach (KeyValuePair<string, double> kvp in widget.Attributes)
{
    double attributeValue = kvp.Value;
    double attributeWeight = calculator.Weights[widget.Type][kvp.Key];
    weightedAvg += (attributeValue * attributeWeight);
}
To calculate weighted averages without looping or reflection, one way would be to calculate the weighted average of the individual attributes and accumulate it somewhere. This should happen while you are creating the instance of the widget. The following sample code will need to be adapted to your needs.
Also, for further processing of the widgets themselves, you can use data parallelism; see my other response in this thread.
public enum WidgetType { }
public class Calculator
{
    // Stub: assumed to expose the per-type attribute weights.
    public static Dictionary<WidgetType, Dictionary<string, double>> Weights { get; set; }
}
public class WeightStore
{
    static Dictionary<int, double> widgetWeightedAvg = new Dictionary<int, double>();
    public static void AttWeightedAvgAvailable(double attWeightedAvg, int widgetId)
    {
        if (widgetWeightedAvg.ContainsKey(widgetId))
            widgetWeightedAvg[widgetId] += attWeightedAvg;
        else
            widgetWeightedAvg[widgetId] = attWeightedAvg;
    }
}
public class WidgetAttribute
{
    public string Name { get; }
    public double Value { get; }
    public WidgetAttribute(string name, double value, WidgetType type, int widgetId)
    {
        Name = name;
        Value = value;
        double attWeight = Calculator.Weights[type][name];
        WeightStore.AttWeightedAvgAvailable(Value * attWeight, widgetId);
    }
}
public class CogWidget
{
    public int Id { get; set; }
    public WidgetAttribute Height { get; set; }
    public WidgetAttribute Weight { get; set; }
}
public class Client
{
    public void BuildCogWidgets()
    {
        CogWidget widget = new CogWidget();
        widget.Id = 1;
        widget.Height = new WidgetAttribute("height", 12.22, default(WidgetType), widget.Id);
    }
}
As is always the case with data normalization, the normalization level you choose determines a good part of the performance. It looks like you would have to move from your current model to another model, or a mix of the two.
Better performance for your scenario is possible when you do not process this on the C# side but in the database instead. You then get the benefit of indexes, no data transfer except the wanted result, plus the 100,000s of man-hours already spent on performance optimization.
Use data parallelism, supported by .NET 4 and above.
https://msdn.microsoft.com/en-us/library/dd537608(v=vs.110).aspx
An excerpt from the above link
When a parallel loop runs, the TPL partitions the data source so that the loop can operate on multiple parts concurrently. Behind the scenes, the Task Scheduler partitions the task based on system resources and workload. When possible, the scheduler redistributes work among multiple threads and processors if the workload becomes unbalanced.
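A minimal self-contained sketch of what that looks like for the weighted-average scenario (Widget and the weights table here are stand-ins for the poster's types; the inner attribute loop is unchanged from the question):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Widget
{
    public Dictionary<string, double> Attributes = new Dictionary<string, double>();
}

class Demo
{
    static void Main()
    {
        var weights = new Dictionary<string, double> { ["height"] = 0.7, ["weight"] = 0.3 };
        var widgets = Enumerable.Range(0, 1000)
            .Select(i => new Widget { Attributes = { ["height"] = i, ["weight"] = 2.0 * i } })
            .ToList();

        var results = new ConcurrentDictionary<Widget, double>();

        // Each widget is processed by exactly one task, so only the shared
        // results container needs to be thread-safe.
        Parallel.ForEach(widgets, widget =>
        {
            double weightedAvg = 0.0;
            foreach (var kvp in widget.Attributes)
                weightedAvg += kvp.Value * weights[kvp.Key];
            results[widget] = weightedAvg;
        });

        Console.WriteLine(results.Count); // 1000
    }
}
```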

Database queries in Entity Framework model - variable equals zero

I have some problems with using the database in my Model. I suspect that it's not a good idea to use database queries in the Model, but I don't know how to do it better.
Code:
Let's assume that I have an application to analyze football scores. I have an EF model that stores info about a footballer:
public class Player
{
    [...]
    public virtual ICollection<Goal> HisGoals { get; set; }
    public float Efficiency
    {
        get
        {
            using (var db = new ScoresContext())
            {
                var allGoalsInSeason = db.Goals.Count();
                return HisGoals.Count / allGoalsInSeason;
            }
        }
    }
}
Problem:
So the case is: I want to have a variable in my model called "Efficiency" that will return quotient of two variables. One of them contains data got in real-time from database.
For now this code doesn't work: "Efficiency" always equals 0. I tried the debugger; all the data is correct, and it should return a different value.
What's wrong? Why does it always return zero?
My suspicions:
Maybe I'm wrong (I'm not good at C#), but I think the reason Efficiency is always zero is that I use the database in it and it is somehow asynchronous: when I read this property, it returns zero first and only then calls the database.
I think that your problem lies in dividing an integer by an integer. In order to get a float, you have to cast the first one to float, like this:
public float Efficiency
{
    get
    {
        using (var db = new ScoresContext())
        {
            var allGoalsInSeason = db.Goals.Count();
            return (float)HisGoals.Count / allGoalsInSeason;
        }
    }
}
Dividing int by int always results in an int, which in your case is 0 (if the values are, as you said in a comment, 4/11).
A second thing: Entity Framework will cache values, so test that before shipping to production.

How to approach building an expression/condition evaluator GUI?

I have a winforms application that is connected to a database which contains a huge amount of measurement data of different datapoints. The data gets updated every few seconds, old values go to an archive table etc. I'm using EF6 for data access.
Now I need to build a GUI that offers some functionality as follows:
The user should be able to define conditions and/or expressions at runtime that then trigger some actions when true. An example in pseudo-code:
if ([Value of Datapoint 203] >= 45) and ([TimeStamp of Datapoint 203] < "07:00am")
   and (([Value of Datapoint 525] == 1) or ([Value of Datapoint 22] < 0))
then set [Value of Datapoint 1234] to ([Value of Datapoint 203] / 4) // or call a method alternatively
or an even simpler example in natural language (differs from the above):
if it is cold and raining, turn on machine XY
where cold and raining are values of certain datapoints and turn on machine is a method with a given parameter XY.
These expressions need to be saved and then evaluated at regular intervals of some minutes or hours. I have not faced such a requirement before and hardly know where to start. What would be best practice? Is there perhaps some sample code you know of, or are there even controls or libraries for this?
Update: Breaking it down to something more specific:
Suppose I have a class like this:
class Datapoint
{
    public int Id { get; set; }
    public DateTime TimeStamp { get; set; }
    public int Value { get; set; }
}
During runtime I have two objects of this type, DatapointA and DatapointB. I want to enter the following into a textbox and then click a button:
DatapointA.Value>5 && ( DatapointB.Value==2 || DatapointB.Value==7 )
Depending on the actual values of these objects, I want to evaluate this expression string and get a true or false. Is this possible?
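One way to sketch this with only the base class library is to substitute the live property values into the string, translate the C# operators into DataTable expression syntax (AND/OR/=), and let DataTable.Compute evaluate the result. This is only an illustration of feasibility, not a robust evaluator; a production version would want a real parser or a scripting engine:

```csharp
using System;
using System.Data;

class Datapoint
{
    public int Id { get; set; }
    public DateTime TimeStamp { get; set; }
    public int Value { get; set; }
}

class Demo
{
    static void Main()
    {
        var datapointA = new Datapoint { Value = 6 };
        var datapointB = new Datapoint { Value = 7 };

        string expr = "DatapointA.Value>5 && ( DatapointB.Value==2 || DatapointB.Value==7 )";

        // Naive substitution and operator translation; real code would parse properly.
        expr = expr.Replace("DatapointA.Value", datapointA.Value.ToString())
                   .Replace("DatapointB.Value", datapointB.Value.ToString())
                   .Replace("==", "=")
                   .Replace("&&", " AND ")
                   .Replace("||", " OR ");

        // DataTable.Compute evaluates the boolean expression "6>5 AND ( 7=2 OR 7=7 )".
        bool result = (bool)new DataTable().Compute(expr, null);
        Console.WriteLine(result); // True
    }
}
```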

Is this more suited for key value storage or a tree?

I'm trying to figure out the best way to represent some data. It basically follows the form Manufacturer.Product.Attribute = Value. Something like:
Acme.*.MinimumPrice = 100
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
So the minimum price across all Acme products is 100 except in the case of product A and B. I want to store this data in C# and have some function where GetValue("Acme.ProductC.MinimumPrice") returns 100 but GetValue("Acme.ProductA.MinimumPrice") return 50.
I'm not sure how to best represent the data. Is there a clean way to code this in C#?
Edit: I may not have been clear. This is configuration data that needs to be stored in a text file then parsed and stored in memory in some way so that it can be retrieved like the examples I gave.
Write the text file exactly like this:
Acme.*.MinimumPrice = 100
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
Parse it into a path/value pair sequence:
foreach (var pair in File.ReadAllLines(configFileName)
    .Select(l => l.Split('='))
    .Select(a => new { Path = a[0], Value = a[1] }))
{
    // do something with each pair.Path and pair.Value
}
Now, there are two possible interpretations of what you want to do. The string Acme.*.MinimumPrice could mean that for any lookup where there is no specific override, such as Acme.Toadstool.MinimumPrice, we return 100, even though nothing refers to Toadstool anywhere in the file. Or it could mean that it should only return 100 if there are other specific mentions of Toadstool in the file.
If it's the former, you could store the whole lot in a flat dictionary, and at look up time keep trying different variants of the key until you find something that matches.
If it's the latter, you need to build a data structure of all the names that actually occur in the path structure, to avoid returning values for ones that don't actually exist. This seems more reliable to me.
So going with the latter option, Acme.*.MinimumPrice is really saying "add this MinimumPrice value to any product that doesn't have its own specifically defined value". This means that you can basically process the pairs at parse time to eliminate all the asterisks, expanding it out into the equivalent of a completed version of the config file:
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
Acme.ProductC.MinimumPrice = 100
The nice thing about this is that you only need a flat dictionary as the final representation and you can just use TryGetValue or [] to look things up. The result may be a lot bigger, but it all depends how big your config file is.
You could store the information more minimally, but I'd go with something simple that works to start with, and give it a very simple API so that you can re-implement it later if it really turns out to be necessary. You may find (depending on the application) that making the look-up process more complicated is worse over all.
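A self-contained sketch of that parse-time expansion (the inline string array stands in for File.ReadAllLines on the config file):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static void Main()
    {
        string[] lines =
        {
            "Acme.*.MinimumPrice = 100",
            "Acme.ProductA.MinimumPrice = 50",
            "Acme.ProductB.MinimumPrice = 60",
            "Acme.ProductC.DefaultColor = Blue",
        };

        var pairs = lines
            .Select(l => l.Split('='))
            .Select(a => new { Path = a[0].Trim().Split('.'), Value = a[1].Trim() })
            .ToList();

        // Only products that actually occur in the file get wildcard values.
        var products = pairs.Select(p => p.Path[1]).Where(n => n != "*").Distinct().ToList();
        var config = new Dictionary<string, string>();

        // Specific entries win; wildcard entries then fill in the gaps.
        foreach (var p in pairs.Where(p => p.Path[1] != "*"))
            config[string.Join(".", p.Path)] = p.Value;

        foreach (var p in pairs.Where(p => p.Path[1] == "*"))
            foreach (var product in products)
            {
                string key = p.Path[0] + "." + product + "." + p.Path[2];
                if (!config.ContainsKey(key))
                    config[key] = p.Value;
            }

        Console.WriteLine(config["Acme.ProductC.MinimumPrice"]); // 100
        Console.WriteLine(config["Acme.ProductA.MinimumPrice"]); // 50
    }
}
```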
I'm not entirely sure what you're asking, but it sounds like you're saying one of two things. Either:
I need a function that will return a fixed value, 100, for every product ID except for two cases: ProductA and ProductB
In that case you don't even need a data structure. A simple comparison function will do
int GetValue(string key) {
    if (key == "Acme.ProductA.MinimumPrice") { return 50; }
    else if (key == "Acme.ProductB.MinimumPrice") { return 60; }
    else { return 100; }
}
Or you could have been asking
I need a function that will return a value if already defined or 100 if it's not
In that case I would use a Dictionary<string,int>. For example
class DataBucket {
    private Dictionary<string, int> _priceMap = new Dictionary<string, int>();
    public DataBucket() {
        _priceMap["Acme.ProductA.MinimumPrice"] = 50;
        _priceMap["Acme.ProductB.MinimumPrice"] = 60;
    }
    public int GetValue(string key) {
        int price = 0;
        if (!_priceMap.TryGetValue(key, out price)) {
            price = 100;
        }
        return price;
    }
}
One way is to create a nested dictionary: Dictionary<string, Dictionary<string, Dictionary<string, object>>>. In your code you would split "Acme.ProductA.MinimumPrice" by dots and get or set a value in the dictionary corresponding to the split chunks.
Another way is using Linq2Xml: you can create an XDocument with Acme as the root node and products as children of the root; the attributes you can store either as attributes on the products or as child nodes. I prefer the second solution, but it would be slower if you have thousands of products.
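A minimal sketch of the nested-dictionary variant (the three-level type and the dot-splitting are as described above; the helper name is my own):

```csharp
using System;
using System.Collections.Generic;

class Demo
{
    static Dictionary<string, Dictionary<string, Dictionary<string, object>>> root =
        new Dictionary<string, Dictionary<string, Dictionary<string, object>>>();

    // Split the key on '.' and walk (or create) one dictionary level per segment.
    static void SetValue(string key, object value)
    {
        string[] parts = key.Split('.');   // e.g. "Acme.ProductA.MinimumPrice"
        if (!root.TryGetValue(parts[0], out var products))
            root[parts[0]] = products = new Dictionary<string, Dictionary<string, object>>();
        if (!products.TryGetValue(parts[1], out var attributes))
            products[parts[1]] = attributes = new Dictionary<string, object>();
        attributes[parts[2]] = value;
    }

    static void Main()
    {
        SetValue("Acme.ProductA.MinimumPrice", 50);
        Console.WriteLine(root["Acme"]["ProductA"]["MinimumPrice"]); // 50
    }
}
```

The wildcard fallback from the question would still have to be layered on top, e.g. by checking the product-specific entry first and then the "*" entry.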
I would take an OOP approach to this. The way you explain it, all your products are represented by objects, which is good. This seems like a good use of polymorphism.
I would have all products derive from a ProductBase, which has a virtual property with a default value:
public virtual int MinimumPrice { get { return 100; } }
And then your specific products, such as ProductA, override that:
public override int MinimumPrice { get { return 50; } }
