Converting a two-dimensional array to an object in C#

This all originates from querying Google Analytics data. For a basic query the main factors that change are the dimensions and the metrics. The object that is returned is of a type called GaData, and the actual results that you need reside in GaData.Rows.
The format of GaData.Rows looks like this:
There is first a row for each dimension value; in this example there is one row for "New Visitor" and a second row for "Returning Visitor". Each of those rows in turn contains the dimension value, followed by each metric that you specify (I've only asked for one metric).
So far the class setup I have is as follows:
public class Results
{
    public List<Dimension> Dimensions { get; set; }
}
public class Dimension
{
    public string Value { get; set; }
    public List<Metric> Metrics { get; set; }
}
public class Metric
{
    public int Value { get; set; }
}
Finally, maybe it's just late and my brain isn't functioning well, but I'm having a little difficulty converting this data into the Results type, I think because of the multiple layers. Any help?
Edit
I added an answer below for how I ended up accomplishing it, if anyone has a more condensed example let me know!

Well, I don't know what Rows is inside Ga, but maybe this will point you in the right direction.
// The member names used here (Rows, Value, Metric) are guesses at the shape of the data
var results = GaData.Rows.Select(x => x.Rows.Select(y =>
    new Dimension { Value = y.Value, Metrics = new List<Metric> { y.Metric } }));

I ended up creating an extension method for GaData called ToDimensionResults(). I'm not sure if I would have been able to accomplish this using LINQ as I needed to know the index of some of the rows (like the Dimension Value). So I opted to just loop through both dimensions and metrics and create the class manually. NOTE: if you do not include a dimension in your query, the results do not contain the dimension value, only a list of metrics, so this accommodates that possibility.
public static Results ToDimensionResults(this GaData ga)
{
    var results = new Results();
    var dimensions = new List<Dimension>();
    List<Metric> metrics;
    var value = "";
    var metricStartIndex = 1;
    for (var i = 0; i < ga.Rows.Count; i++)
    {
        // Accommodate data without dimensions
        if (!string.IsNullOrEmpty(ga.Query.Dimensions))
        {
            value = ga.Rows[i][0].ToString();
        }
        else
        {
            value = "";
            metricStartIndex = 0;
        }
        metrics = new List<Metric>();
        for (var x = metricStartIndex; x < ga.Rows[i].Count; x++)
        {
            metrics.Add(new Metric
            {
                Value = Convert.ToInt32(ga.Rows[i][x])
            });
        }
        dimensions.Add(new Dimension
        {
            Value = value,
            Metrics = metrics
        });
    }
    results.Dimensions = dimensions;
    return results;
}
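For comparison, the same conversion can probably be written with LINQ, because Skip removes the need to track the index by hand. The following is only a sketch: it takes the rows as a plain IList<IList<string>> instead of a GaData object, and the RowConverter class and the hasDimension flag (standing in for the ga.Query.Dimensions check) are hypothetical names introduced for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Metric { public int Value { get; set; } }

public class Dimension
{
    public string Value { get; set; }
    public List<Metric> Metrics { get; set; }
}

public class Results { public List<Dimension> Dimensions { get; set; } }

// Hypothetical helper, not part of the Google Analytics client library.
public static class RowConverter
{
    // rows: one inner list per result row.
    // hasDimension: true when cell 0 of each row holds the dimension value.
    public static Results ToResults(IList<IList<string>> rows, bool hasDimension)
    {
        // Metrics start at index 1 when a dimension value occupies index 0.
        int metricStart = hasDimension ? 1 : 0;
        return new Results
        {
            Dimensions = rows.Select(row => new Dimension
            {
                Value = hasDimension ? row[0] : "",
                Metrics = row.Skip(metricStart)
                             .Select(cell => new Metric { Value = Convert.ToInt32(cell) })
                             .ToList()
            }).ToList()
        };
    }
}
```

Calling RowConverter.ToResults(rows, true) on rows such as ["New Visitor", "10"] and ["Returning Visitor", "20"] yields two Dimension objects with one Metric each.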

Related

ML.net Load From Enumerable Trouble

I have been struggling to create the proper data structure for ML.net and get it to load into my application. Essentially, I have an application where the training data will be dynamic and the type and/or size will not be known prior to runtime. In addition, I have to convert the training data from non-standard primitive types (i.e. App_Bool or App_Number rather than simply bool or double, etc.). This has been proving to be a problem as I try to convert my training data into a generic data type which can then be loaded from memory using the LoadFromEnumerable function.
I have four basic data type classes:
public class MLIntData
{
    public MLIntData(string label, List<object> l)
    {
        Label = label;
        foreach (App_Integer element in l)
            Features.Add((int)element.Value);
    }
    public List<int> Features { get; set; } = new List<int>();
    public string Label { get; set; } = "";
}
public class MLNumberData
{
    public MLNumberData(string label, List<object> l)
    {
        Label = label;
        foreach (App_Number element in l)
            Features.Add((double)element.Value);
    }
    public List<double> Features { get; set; } = new List<double>();
    public string Label { get; set; } = "";
}
public class MLBoolData
{
    public MLBoolData(string label, List<object> l)
    {
        Label = label;
        foreach (App_Boolean element in l)
            Features.Add((bool)element.Value);
    }
    public List<bool> Features { get; set; } = new List<bool>();
    public string Label { get; set; } = "";
}
public class MLTextData
{
    public MLTextData(string label, List<object> l)
    {
        Label = label;
        foreach (App_String element in l)
            Features.Add(element.Value.ToString());
    }
    public List<string> Features { get; set; } = new List<string>();
    public string Label { get; set; } = "";
}
So, each base class will contain a label for the data and then a list of features which will either be of type bool, double, int, or string.
Now, in my ML.net code I'm trying to load in the training data and then create an IDataView object of the data. First I loop through the input data (which is originally of the generic type object) then create the new classes of data.
List<object> data = new List<object>();
for (int i = 0; i < input.Count; i++)
{
    MLCodifiedData codifiedData = input[i].Value as MLCodifiedData;
    Type dataType = codifiedData.Features[0].GetType();
    if (dataType == typeof(App_Boolean))
    {
        data.Add(new MLBoolData(codifiedData.Label, codifiedData.Features));
    }
    else if (dataType == typeof(App_Number))
    {
        data.Add(new MLNumberData(codifiedData.Label, codifiedData.Features));
    }
    else if (dataType == typeof(App_Integer))
    {
        data.Add(new MLIntData(codifiedData.Label, codifiedData.Features));
    }
    else if (dataType == typeof(App_String))
    {
        data.Add(new MLTextData(codifiedData.Label, codifiedData.Features));
    }
}
IDataView TrainingData = mlContext.Data.LoadFromEnumerable<object>(data);
I have tried creating a schema definition (which can be passed in as the second parameter of the LoadFromEnumerable method), but I can't seem to get that to work. I've also tried building a schema with the schema builder, but that doesn't seem to work either. Right now, I'm using one of the datasets included in the sample files. And to preempt questions: yes, I know I could simply load the data from a file and read it in that way. However, in my app I need to first read the CSV into memory and then create the data structure, so I can't really use the many examples geared toward reading a CSV file with the LoadFromTextFile method. Can anyone advise how I could set up a dynamic in-memory collection and get it converted into an IDataView object?

Add custom column to IDataView in ML.NET

I'd like to add a custom column after loading my IDataView from file.
In each row, the column value should be the sum of previous 2 values. A sort of Fibonacci series.
I was wondering to create a custom transformer but I wasn't able to find something that could help me to understand how to proceed.
I also tried to clone ML.Net Git repository in order to see how other transformers were implemented but I saw many classes are marked as internal so I cannot re-use them in my project.
There is a way to create a custom transform with CustomMapping.
Here's an example I used for this answer.
The input and output classes:
class InputData
{
    public int Age { get; set; }
}
class CustomMappingOutput
{
    public string AgeName { get; set; }
}
class TransformedData
{
    public int Age { get; set; }
    public string AgeName { get; set; }
}
Then, in the ML.NET program:
MLContext mlContext = new MLContext();
var samples = new List<InputData>
{
    new InputData { Age = 16 },
    new InputData { Age = 35 },
    new InputData { Age = 60 },
    new InputData { Age = 28 },
};
var data = mlContext.Data.LoadFromEnumerable(samples);
Action<InputData, CustomMappingOutput> mapping =
    (input, output) =>
    {
        if (input.Age < 18)
        {
            output.AgeName = "Child";
        }
        else if (input.Age < 55)
        {
            output.AgeName = "Man";
        }
        else
        {
            output.AgeName = "Grandpa";
        }
    };
var pipeline = mlContext.Transforms.CustomMapping(mapping, contractName: null);
var transformer = pipeline.Fit(data);
var transformedData = transformer.Transform(data);
var dataEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(transformedData, reuseRowObject: true);
foreach (var row in dataEnumerable)
{
    Console.WriteLine($"{row.Age}\t {row.AgeName}");
}
This is easy, assuming you know how to use pipelines.
This is a part of my project, where I merge two columns together:
IEstimator<ITransformer> pipeline = mlContext.Transforms.CustomMapping(mapping, contractName: null)
    .Append(mlContext.Transforms.Text.FeaturizeText(inputColumnName: "question1", outputColumnName: "question1Featurized"))
    .Append(mlContext.Transforms.Text.FeaturizeText(inputColumnName: "question2", outputColumnName: "question2Featurized"))
    .Append(mlContext.Transforms.Concatenate("Features", "question1Featurized", "question2Featurized"))
    //.Append(mlContext.Transforms.NormalizeMinMax("Features"))
    //.AppendCacheCheckpoint(mlContext)
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: nameof(customTransform.Label), featureColumnName: "Features"));
As you can see the two columns question1Featurized and question2Featurized are combined into Features which will be created and can be used as any other column of IDataView. The Features column does not need to be declared in a separate class.
So in your case you should first transform the columns according to their data type: for strings you can do what I did, and for numeric values use a custom transformer (CustomMapping).
The documentation of the Concatenate function might help as well!

How to get one result from a list in another method?

Hi all I'm new to C#.
I am trying to return a result "totalAmount" from my method called "GetAllKschl". That method returns a list with "KSCHL, KSCHLData, price, pieces and totalPrice".
So in my new method I need the total amount of all "totalPrice" together.
first method:
public List<Result> GetAllKschl(string fileNameResult, string fileNameData)
{
    List<Result> listResult = new List<Result>();
    docResult.Load(fileNameResult);
    docData.Load(fileNameData);
    var resultList = docResult.SelectNodes("//root/CalculationLogCompact/CalculationLogRowCompact");
    foreach (XmlNode nextText in resultList)
    {
        XmlNode KSCHL = nextText.SelectSingleNode("KSCHL");
        string nextKschl = KSCHL.InnerText;
        // ... and so on...
        if (pieces > 0 && totalPrice > 0)
        {
            listResult.Add(new Result(nextKschl, nextKSCHLData, nextEinzelpreis, pieces, totalPrice));
        }
    }
    return listResult;
}
second method: (don't know exactly what to do)
public decimal GetTotalAmount(string amount, string totalAmount)
{
    string total = GetAllKschl(amount, totalAmount); // ??
    return total;
}
So here I want just the TotalAmount (the sum of every totalPrice from GetAllKschl) and not the whole list from GetAllKschl. How do I do this?
Here is my Result class:
public class Result
{
    public string KSCHL { get; set; }
    public string Info { get; set; }
    public int IndividualPrice { get; set; }
    public int Pieces { get; set; }
    public int TotalCosts { get; set; }
    public Result(string kschl, string info, int individualPrice, int pieces, int totalCosts)
    {
        KSCHL = kschl;
        Info = info;
        IndividualPrice = individualPrice;
        Pieces = pieces;
        TotalCosts = totalCosts;
    }
}
You can use LINQ extension method Sum to do so:
decimal total = GetAllKschl( amount, totalAmount ).Sum( result => result.Gesamtpreis );
I assume that Gesamtpreis is the name of the total price property in the Result class (in the class shown in the question it is called TotalCosts).
The Sum extension method iterates over all items in the returned collection and sums up the prices.
You could rewrite this without LINQ like this:
var list = GetAllKschl( amount, totalAmount );
decimal total = 0;
foreach ( var item in list )
{
    total += item.Gesamtpreis;
}
As a suggestion, I would recommend clearer variable naming conventions and not mixing variable names from different languages (English and German).
Also, it is unusual that you use decimal for the total price while the Result class uses int. Maybe Result should use decimal as well? It seems fitting for a price property.
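To make the int versus decimal point concrete, here is a minimal sketch. It uses a cut-down Result class with the TotalCosts property from the question, and a hypothetical Totals.GetTotalAmount that takes the list directly instead of the two file names, so that it runs without the XML files.

```csharp
using System.Collections.Generic;
using System.Linq;

// Cut-down version of the question's Result class (only the fields needed here).
public class Result
{
    public string KSCHL { get; set; }
    public int TotalCosts { get; set; }
}

// Hypothetical helper name; the question's method takes file names instead.
public static class Totals
{
    public static decimal GetTotalAmount(IEnumerable<Result> results)
    {
        // Sum over an int selector returns int; the int converts implicitly
        // to decimal on return. An empty sequence sums to 0.
        return results.Sum(r => r.TotalCosts);
    }
}
```

If the property were changed to decimal, the same call would pick the decimal overload of Sum and no conversion would be involved.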
The best approach is probably LINQ:
public decimal GetTotalAmount(string amount, string totalAmount)
{
    var total = GetAllKschl(amount, totalAmount).Sum(result => result.Gesamtpreis);
    return total;
}
I'm assuming that the value you want is in a property called Gesamtpreis and that it is of some numeric type.
EDIT:
Based on the comments, I decided to add a bit more description of LINQ extension methods and lambda expressions. LINQ methods allow you to use a query language similar to SQL. They work on a collection of elements (e.g. a list of Result objects, List<Result>). You call these methods on the collection and they give you some kind of result: sometimes just a number (aggregate functions like Min, Max, Sum, ...), sometimes an object or another collection (First, Last, ToList, ToDictionary).
In our case we will have a list of objects:
public class Product
{
    public string Name { get; set; }
    public int Price { get; set; }
}
List<Product> productList = new List<Product>();
productList.Add(new Product() { Name = "Car", Price = 140000 });
productList.Add(new Product() { Name = "SSD Disc", Price = 2000 });
productList.Add(new Product() { Name = "Banana", Price = 7 });
With those, a plain sum would look like this:
int result = 0;
foreach (var nProduct in productList)
    result += nProduct.Price;
Console.WriteLine(result);
This is fairly short code, but it can be simplified further, without the variable for intermediate results and without the foreach loop. (A loop still runs internally, but we don't need to write it.) LINQ example:
var result = productList.Sum(nProduct => nProduct.Price);
Now this code is much shorter, but we have to split it into several parts to understand what actually happens:
// Save the result to a variable (as before).
// Note that I changed "int" to "var", which simplifies the code,
// as you don't have to care what type the result will be;
// this usage is very common with LINQ.
var result =
    // Call the Sum() method on productList.
    // Sum takes each object in the collection and passes it in as the
    // parameter called "nProduct".
    // The "=>" is lambda syntax: it takes the parameters on the left side
    // and uses them in the code on the right side.
    // The left side is an instance of the Product class named "nProduct";
    // the right side, "nProduct.Price", is the selector that says
    // "sum up the Price property".
    productList.Sum(nProduct => nProduct.Price);
// The other aggregate functions work in a similar manner
var max = productList.Max(prod => prod.Price);
var min = productList.Min(prod => prod.Price);
var avg = productList.Average(prod => prod.Price);
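One way to see that the lambda is just a function value is to store it in a delegate variable first and compare the three forms side by side. This is only an illustrative sketch; LambdaDemo and SumThreeWays are hypothetical names, and the Product class is repeated so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Name { get; set; }
    public int Price { get; set; }
}

// Hypothetical demo class, introduced only for this example.
public static class LambdaDemo
{
    // Sums the prices three equivalent ways and returns all three totals.
    public static int[] SumThreeWays(List<Product> products)
    {
        // 1. Explicit loop with an intermediate variable
        int loopTotal = 0;
        foreach (var product in products)
            loopTotal += product.Price;

        // 2. The lambda stored in a delegate variable, then passed to Sum
        Func<Product, int> selector = p => p.Price;
        int delegateTotal = products.Sum(selector);

        // 3. The lambda passed inline, as in the answer above
        int inlineTotal = products.Sum(p => p.Price);

        return new[] { loopTotal, delegateTotal, inlineTotal };
    }
}
```

For the three products listed earlier (140000, 2000 and 7), all three totals are 142007.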
// In this method you are returning a List
public List<Result> GetAllKschl(string fileNameResult, string fileNameData)
{
    List<Result> listResult = new List<Result>();
    docResult.Load(fileNameResult);
    docData.Load(fileNameData);
    var resultList = docResult.SelectNodes("//root/CalculationLogCompact/CalculationLogRowCompact");
    foreach (XmlNode nextText in resultList)
    {
        XmlNode KSCHL = nextText.SelectSingleNode("KSCHL");
        string nextKschl = KSCHL.InnerText;
        // ... and so on...
        if (pieces > 0 && totalPrice > 0)
        {
            listResult.Add(new Result(nextKschl, nextKSCHLData, nextEinzelpreis, pieces, totalPrice));
        }
    }
    return listResult;
}
// The second method is declared to return a decimal, but it assigns the
// returned list to a string and then returns that string
public decimal GetTotalAmount(string amount, string totalAmount)
{
    string total = GetAllKschl(amount, totalAmount); // ??
    return total;
}
It would be best to change the body of the second method to:
decimal total = GetAllKschl(amount, totalAmount).Sum(result => result.Gesamtpreis);
That is, use LINQ to compute the value you return.

C# List - Sorting Data

I am trying to add elements to a list, order them and then output them. There are a number of "columns", if you like, per list item:
List<Info> infoList = new List<Info>();
while (dr.Read())
{
    meeting_id = dr.GetValue(0).ToString();
    try
    {
        Appointment appointment = Appointment.Bind(service, new ItemId(meeting_id));
        Info data = new Info();
        data.Start = appointment.Start;
        data.Fruit = Convert.ToInt32(dr.GetValue(1));
        data.Nuts = Convert.ToInt32(dr.GetValue(2));
        infoList.Add(data);
    }
    catch (Exception)
    {
        // Error handling was not shown in the original snippet
    }
}
Then, to output it, I want to order it by Start and display all associated columns:
for (int i = 0; i < infoList.Count; i++)
{
    meet = meet + infoList[i];
}
First question: is the way I am inputting the data right?
Second question: how do I output all the associated columns? Is this possible? Is there a better practice?
Thanks
EDIT:
The class if you are interested:
public class Info
{
    public DateTime Start { get; set; }
    public int Fruit { get; set; }
    public int Nuts { get; set; }
}
You can use the Enumerable.OrderBy extension method to enumerate your collection in a particular order (e.g. ordered by the Start property value):
foreach (var info in infoList.OrderBy(i => i.Start))
{
    // use info object here
    // info.Fruit
    // info.Nuts
}
BTW, consider sorting on the database side; that will be more efficient.
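For the second part of the question (displaying every column), one possible approach is to format each item explicitly rather than concatenating the object itself, which would only produce the type name unless ToString is overridden. The sketch below repeats the Info class so it compiles on its own; InfoReport, its Format method, the tab-separated layout and the date format are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public class Info
{
    public DateTime Start { get; set; }
    public int Fruit { get; set; }
    public int Nuts { get; set; }
}

// Hypothetical helper, introduced only for this example.
public static class InfoReport
{
    // Builds one tab-separated line per item, ordered by Start.
    public static string Format(IEnumerable<Info> items)
    {
        var sb = new StringBuilder();
        foreach (var info in items.OrderBy(i => i.Start))
        {
            // Invariant culture keeps the date layout stable across machines.
            var start = info.Start.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture);
            sb.AppendLine($"{start}\t{info.Fruit}\t{info.Nuts}");
        }
        return sb.ToString();
    }
}
```

With this in place, the loop in the question could be replaced by something like meet = InfoReport.Format(infoList).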

Inserting into a C# List of Objects that uses a Property of the Object to Sort

There's a list of objects, each object representing a record from a database. To sort the records there is a property called SortOrder. Here's a sample object:
public class GroupInfo
{
    public int Id { get; set; }
    public string Text { get; set; }
    public int SortOrder { get; set; }
    public GroupInfo()
    {
        Id = 0;
        Text = string.Empty;
        SortOrder = 1;
    }
}
A list object would look like this:
var list = new List<GroupInfo>();
I need to be able to change the SortOrder and update the SortOrder on the other objects in the list. I figured out how to sort up or down by one. I need to know how to change it by more than one and adjust the SortOrder on the other records. Any ideas?
This could be done by first getting the original SortOrder and the updated SortOrder. You would then iterate through your collection and adjust the SortOrder of any other GroupInfo objects that fall inside the range between the original and updated values. You could put all of this in a "SetSortOrder" function that takes in the containing collection.
public static void SetSortOrder(List<GroupInfo> groupInfos, GroupInfo target, int newSortOrder)
{
    if (newSortOrder == target.SortOrder)
    {
        return; // No change
    }
    // If newSortOrder > SortOrder, shift all GroupInfos in that range down
    // Otherwise, shift them up
    int sortOrderAdjustment = (newSortOrder > target.SortOrder ? -1 : 1);
    // Get the range of SortOrders that must be updated
    int bottom = Math.Min(newSortOrder, target.SortOrder);
    int top = Math.Max(newSortOrder, target.SortOrder);
    // Get the GroupInfos that fall within our range
    var groupInfosToUpdate = from g in groupInfos
                             where g.Id != target.Id
                                 && g.SortOrder >= bottom
                                 && g.SortOrder <= top
                             select g;
    // Do the updates
    foreach (GroupInfo g in groupInfosToUpdate)
    {
        g.SortOrder += sortOrderAdjustment;
    }
    target.SortOrder = newSortOrder;
    // Uncomment this if you want the list to resort every time you update
    // one of its members (not a good idea if you're doing bulk changes)
    //groupInfos.Sort((info1, info2) => info1.SortOrder.CompareTo(info2.SortOrder));
}
Update: As suggested, I moved the logic into a static helper function.
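A small worked example may help show what the shift does. The sketch below moves item "A" from SortOrder 1 to SortOrder 3 in a four-item list. It contains a condensed copy of the helper so that it compiles on its own; the SortOrderDemo class and its Run method are hypothetical names, and the GroupInfo class is shortened.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class GroupInfo
{
    public int Id { get; set; }
    public string Text { get; set; } = string.Empty;
    public int SortOrder { get; set; } = 1;
}

// Hypothetical demo class, introduced only for this example.
public static class SortOrderDemo
{
    // Condensed copy of the SetSortOrder helper from the answer.
    public static void SetSortOrder(List<GroupInfo> groupInfos, GroupInfo target, int newSortOrder)
    {
        if (newSortOrder == target.SortOrder) return;
        int adjustment = newSortOrder > target.SortOrder ? -1 : 1;
        int bottom = Math.Min(newSortOrder, target.SortOrder);
        int top = Math.Max(newSortOrder, target.SortOrder);
        var toUpdate = groupInfos.Where(other => other.Id != target.Id
                                              && other.SortOrder >= bottom
                                              && other.SortOrder <= top);
        foreach (var item in toUpdate)
        {
            item.SortOrder += adjustment;
        }
        target.SortOrder = newSortOrder;
    }

    // Moves "A" from position 1 to position 3 and returns the texts in
    // their new order, joined by commas.
    public static string Run()
    {
        var list = new List<GroupInfo>
        {
            new GroupInfo { Id = 1, Text = "A", SortOrder = 1 },
            new GroupInfo { Id = 2, Text = "B", SortOrder = 2 },
            new GroupInfo { Id = 3, Text = "C", SortOrder = 3 },
            new GroupInfo { Id = 4, Text = "D", SortOrder = 4 }
        };
        SetSortOrder(list, list[0], 3);
        // B and C each move down by one, A takes 3, D is untouched.
        return string.Join(",", list.OrderBy(g => g.SortOrder).Select(g => g.Text));
    }
}
```

Here B and C fall inside the range 1 to 3, so they shift to 1 and 2, A becomes 3 and D stays at 4, giving the order B, C, A, D.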
var sortedList = list.OrderBy(item => item.SortOrder);
Edit: Sorry, I misunderstood. You will need to write yourself a method outside of GroupInfo to handle the updating of that property.
