Alternative Data Structure to DataTable - c#

I need a data structure of some sort to do the following:
One "set" is composed of many types, such as string, integer, datetime and double.
Many sets are added dynamically.
The sets are retrieved dynamically, and information is pulled from them.
Now the obvious solution is to use a DataTable. Define the datatable structure, and add a new row each time you need to add a new set. Pull data from the datatable when you need to.
Actually I have implemented it already using a datatable, but the problem is it is extremely slow for some reason. Since this is done thousands to millions of times performance can be problematic.
Is there an alternative DataTable-like data structure with better performance that I can use, or should I build my own class using List<>?

Depending on your use case, I would recommend using List<object[]> (since you mentioned a dynamic schema) as the central data structure, but you will need to maintain the schema info yourself if you need it later on.
If you need to bind the UI to the data, this approach will add a lot of extra manual work; it's better suited for background processing of large amounts of data.
We have used this approach in the past and were able to save 2/3 of memory and 80% of execution time when bulk handling data compared to data tables.
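As a rough sketch of what "maintaining the schema yourself" could look like: the RowStore class below is a hypothetical wrapper (not from the original answer) that keeps a column-name-to-position map next to the List<object[]>. The column names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: rows live in a List<object[]>, and the schema
// (column name -> array position) is tracked by hand.
public sealed class RowStore
{
    private readonly Dictionary<string, int> _ordinals = new Dictionary<string, int>();
    private readonly List<object[]> _rows = new List<object[]>();

    public RowStore(params string[] columnNames)
    {
        for (int i = 0; i < columnNames.Length; i++)
            _ordinals.Add(columnNames[i], i);
    }

    public int Count
    {
        get { return _rows.Count; }
    }

    // Adds one "set"; the values must follow the column order given to the constructor.
    public void Add(params object[] values)
    {
        if (values.Length != _ordinals.Count)
            throw new ArgumentException("Value count does not match the schema.");
        _rows.Add(values);
    }

    // Reads a single cell; the caller states the expected type.
    public T Get<T>(int rowIndex, string columnName)
    {
        return (T)_rows[rowIndex][_ordinals[columnName]];
    }
}
```

Usage would be something like new RowStore("Name", "Count", "When", "Price") followed by store.Add("widget", 3, DateTime.Now, 9.5). Note that value types are boxed in the object[] and that Get<T> throws an InvalidCastException if T does not match the stored type exactly (for example, asking for double when an int was stored).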

One alternative way of approaching problems like this: use a sqlite database in memory.
Sounds like a weird thing to do at first, but you can put quite complex structures into tables, and you get the whole power of SQL to work on your data. SQLite is a tiny lib, so it won't bloat up your code. Integrating the DB into your code might be kinda strange at first, but performance should hold up on huge data sets (since that's what DBs are made for). And if you ever need to save that data to disk, you are already done.
Depending on the details of your problem, it might even be a good idea to move to a bigger db back end (e.g. postgres), but that is hard to tell from here. Just don't dismiss this idea too easily.

There are several similar questions on stackoverflow, but none provides a good answer. A generic alternative should not be List<YourObject>, because YourObject is not generic. The beauty of DataTable is that it does not have a data model.
A DataTable is a collection of rows, while each row is a collection of cells. A cell could be a string or a number. So we can define a Cell as:
public class Cell
{
    public double Value { get; set; }
    public string Text { get; set; }
}
Then a row would be Dictionary<string, Cell>, where string is the column name. And then a DataTable alternative is simply a List<Dictionary<string, Cell>>.
Let's say you define Rows as public List<Dictionary<string, Cell>> Rows;.
Now you can easily query the Rows like:
var MaleHeight = Rows.Where(row => row["sex"].Text == "Male").Select(row => row["Height"].Value);
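For completeness, here is a small self-contained sketch of the same idea, including how the rows might be filled. The sample values and the CellTableDemo class are made up for illustration; the column keys follow the query above ("sex" and "Height"). Keep in mind that Dictionary<string, Cell> keys are case-sensitive by default, so the names used when adding cells have to match the names used in the query exactly.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Cell
{
    public double Value { get; set; }
    public string Text { get; set; }
}

public static class CellTableDemo
{
    // Builds three sample rows; the values are invented for this example.
    public static List<Dictionary<string, Cell>> BuildRows()
    {
        return new List<Dictionary<string, Cell>>
        {
            MakeRow("Male", 180.0),
            MakeRow("Female", 165.0),
            MakeRow("Male", 175.5)
        };
    }

    private static Dictionary<string, Cell> MakeRow(string sex, double height)
    {
        var row = new Dictionary<string, Cell>();
        row.Add("sex", new Cell { Text = sex });
        row.Add("Height", new Cell { Value = height });
        return row;
    }

    // Same query as above, materialized into a list.
    public static List<double> MaleHeights(List<Dictionary<string, Cell>> rows)
    {
        return rows.Where(row => row["sex"].Text == "Male")
                   .Select(row => row["Height"].Value)
                   .ToList();
    }
}
```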

Related

Convert a DataRow to a Domain class object

I have a DataRow object from a dealers DataTable which has columns d_id, d_name, d_contactInfo and a domain class object Dealer that has properties id, name, contactInfo. I am looking for a way to convert this dataRow to the domain class object by using a conversion such as
DbDataSet.dealers.FindByd_id(id) as Dealer;
Is there some way I can achieve this? If so, the code would look MUCH cleaner than having to specify the attribute mapping one by one. Thanks.
You have a couple of options to solve this. One is to create your own data layer object that maps the fields in the DataRow to an object. If you want this more automagic, you can either create a helper routine, or if you can get the data out as XML or JSON, you can use serialization to match the items. This is the harder way to accomplish this.
If you can refactor the code, you can use Entity Framework to match the items. I am not as fond of it in Enterprise scale code, but it works fine with others. There are other OR/M products, many open source, that can do it as well. Choosing the correct one depends on your requirements (some are faster, some have more features, etc.)
There are also products that can map from one shape to another and save you some time, as is mentioned in the comments.
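The "helper routine" option could look roughly like the sketch below. It assumes the convention from the question, where every column is the property name with a d_ prefix; the DataRowMapper class and its ToObject method are hypothetical names, and the property types on Dealer are guesses (the question does not state them).

```csharp
using System;
using System.Data;
using System.Reflection;

// Domain class from the question; the property types are assumed.
public class Dealer
{
    public int id { get; set; }
    public string name { get; set; }
    public string contactInfo { get; set; }
}

public static class DataRowMapper
{
    // Copies each column named prefix + propertyName into the matching property.
    public static T ToObject<T>(DataRow row, string prefix) where T : class, new()
    {
        var result = new T();
        foreach (PropertyInfo property in typeof(T).GetProperties())
        {
            string columnName = prefix + property.Name;
            if (!property.CanWrite || !row.Table.Columns.Contains(columnName))
                continue;   // no matching column: leave the property alone

            object value = row[columnName];
            if (value == DBNull.Value)
                continue;   // database null: leave the property at its default

            property.SetValue(result, value, null);
        }
        return result;
    }
}
```

With that in place, the call site becomes DataRowMapper.ToObject<Dealer>(row, "d_"). This only works when the column type and the property type match; SetValue throws an ArgumentException otherwise, so a real version would probably need some conversion logic.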

Dynamically create a class and then create a list<dynamicClass> based on it in C#

I want to create a class and its properties at run time. The properties will be like Year2001, Year2002, Year2003, Year2004, Year2005... I get these property names at run time, in a list. Later I need to use this class to create a list which I need to show in the Kendo grid. I searched a lot and thought of using ExpandoObject, but was unsuccessful.
If all properties will be of the form YearX and contain some information about or related to that year, then I would strongly recommend you (if at all possible) to go with something along the lines of an IList<YearInfo> where YearInfo is some object containing the info you need for every year, including an integer property indicating what year the object corresponds to. If you require these objects to be unique you could use an IDictionary<int, YearInfo> or ISet<YearInfo> instead.
Reflection can be powerful, but it comes at the price of complexity and loss of type safety/compile-time checks. Avoid when possible.
Sounds to me like what you really want is a grid with grouping support. Your idea of having the system create a CLASS at runtime is not going to fly. Even if it were possible, which I doubt it is, it is absolutely the wrong approach.
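A minimal sketch of the YearInfo idea, assuming the incoming names always look like "Year" followed by digits. The Amount property is a placeholder for whatever per-year data you actually have, and YearInfoDemo is a made-up name.

```csharp
using System.Collections.Generic;

public class YearInfo
{
    public int Year { get; set; }
    public decimal Amount { get; set; }   // placeholder payload, not from the question
}

public static class YearInfoDemo
{
    // Turns run-time names such as "Year2001" into objects keyed by the year number.
    public static Dictionary<int, YearInfo> FromPropertyNames(IEnumerable<string> names)
    {
        var byYear = new Dictionary<int, YearInfo>();
        foreach (string name in names)
        {
            // "Year2001" -> 2001; throws FormatException if the suffix is not a number.
            int year = int.Parse(name.Substring("Year".Length));
            byYear[year] = new YearInfo { Year = year };
        }
        return byYear;
    }
}
```

The year is then ordinary data (byYear[2003].Amount) instead of a property name, so no class has to be generated at run time.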
Like I say - have a read about Grouping / Hierarchy on Grid Controls (Kendo grid example here), and maybe have a look at OLAP cubes as well...
Although you have had some answers I would also like to suggest an alternative way of doing this which is using DataTables. This is the approach I take when I have any "Dynamic" data sets that I want to present to the grid.
This is also the approach that Telerik themselves take with one of their code samples.
Here are a couple of links showing how they bind to DataTables and to dynamic objects:
Grid Binding to Data Table
Grid Binding to Dynamic Objects
Personally I find the binding to Tables easier to deal with as I am used to dealing with Data Tables.
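Building the table itself from run-time column names is short. The sketch below only covers the DataTable part (the grid binding is specific to the Kendo/Telerik wrapper and is not shown); YearTableBuilder is a made-up name and decimal is an assumed column type.

```csharp
using System.Collections.Generic;
using System.Data;

public static class YearTableBuilder
{
    // Creates one decimal column per name received at run time, e.g. "Year2001".
    public static DataTable Build(IEnumerable<string> columnNames)
    {
        var table = new DataTable("Years");
        foreach (string name in columnNames)
            table.Columns.Add(name, typeof(decimal));
        return table;
    }
}
```

Rows are then added in column order, for example table.Rows.Add(1.5m, 2.5m), and read back by name with table.Rows[0]["Year2002"].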

How to create the ability to apply a generic data source class

This is maybe something I know how to do or have already done it in the past. For some reason I am drawing a blank on how to wrap my head around it. This is more for learning as well as trying to implement something in my app.
I am using a set of third party controls. These controls offer a lot of functionality which is great. However, I want to be able to create a custom object that handles the logic/properties for the datasource of this control.
For example, there is a spreadsheet like object that I am using. You supply the spreadsheet like object some data and it pulls in your data. The problem here is that you need to set the columns, their data types, and other formatting/events as well as some logic to spit the data back to the user.
List<CustomClassWithProperties> dataSource
The custom class has some properties that will be translated to the columns. Like ProductName, Price, SalesDepartment, DatePurchased etc. This can be done by supplying the spreadsheet the columns and their data types each time. I want to be able to create a helper class that you just supply a list, a visible column list, and an editable column list and the data will fill in without any other issues.
Using the above list, I would imagine something similar to this:
DataHelperClass dtHlpr = new DataHelperClass(List<CustomClassWithProperties> data, List<string> visibleColumns, List<string> editableColumns)
This data helper class will take the data input list as the spreadsheet data source. It would then take the visibleColumns list and use that to set the visible columns, and do the same with editableColumns for the editable ones.
Where I am running into a mental block (long week) is when I want to be able to reuse this. Let's say I have a List that has completely different properties. I would want my constructor for the data helper to be able to handle any List I send to it. Looking at whatever code I can get to for the third party controls, it appears that their data source is of type object.
Could someone point me in the right direction? I am thinking it has to do with generics and some interface implementation. I just honestly cannot think of where to start.
You can make the class itself generic:
public class DataHelperClass<T>
{
    public DataHelperClass(List<T> data, ...) { ... }
}
var dtHlpr = new DataHelperClass<CustomClassWithProperties>(data, visibleColumns, editableColumns);
You'd then perform your reflection against typeof(T).
I'd also be tempted to use IEnumerable<T> rather than List<T> if possible, but that's a matter of preference, more or less.
This is similar to using a simple List<object>, except that it enforces that all objects in the list inherit from the same type (which might well be object), so you get some more type-checking than you otherwise would.
You mentioned interfaces; I don't see any reason to include one here (from what you've told us, at least), but you can certainly make a generic interface via the same syntax.
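To make the "reflection against typeof(T)" part concrete, here is one possible shape for the helper. It is only a sketch: the members IsEditable, VisibleColumnNames and GetRowValues are invented for illustration, the sample class has just a few of the properties mentioned in the question, and wiring the result into the third-party control is left out since that API is unknown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Sample data class; property types are assumed.
public class CustomClassWithProperties
{
    public string ProductName { get; set; }
    public decimal Price { get; set; }
    public string SalesDepartment { get; set; }
}

public class DataHelperClass<T>
{
    private readonly List<T> _data;
    private readonly List<PropertyInfo> _visible = new List<PropertyInfo>();
    private readonly HashSet<string> _editable;

    public DataHelperClass(IEnumerable<T> data,
                           IEnumerable<string> visibleColumns,
                           IEnumerable<string> editableColumns)
    {
        _data = data.ToList();
        _editable = new HashSet<string>(editableColumns);

        // Resolve each visible column name to a public property of T once, up front.
        foreach (string name in visibleColumns)
        {
            PropertyInfo property = typeof(T).GetProperty(name);
            if (property == null)
                throw new ArgumentException("No public property '" + name + "' on " + typeof(T).Name);
            _visible.Add(property);
        }
    }

    public IEnumerable<string> VisibleColumnNames
    {
        get { return _visible.Select(p => p.Name); }
    }

    public bool IsEditable(string columnName)
    {
        return _editable.Contains(columnName);
    }

    // Values of the visible columns for one item, in visible-column order.
    public object[] GetRowValues(int rowIndex)
    {
        T item = _data[rowIndex];
        return _visible.Select(p => p.GetValue(item, null)).ToArray();
    }
}
```

Because the property lookup happens in the constructor, a misspelled column name fails immediately rather than when the grid is first drawn, and the same helper works for any other class without changes.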

What should we use when initializing some values from DataTables

Which is more appropriate to use when you want to initialize class properties from a DataTable?
i.e.
name=dt.Rows[0][0] or name=dt.Rows[0]["Name"]
Which approach is more scalable and easier to handle?
Currently I use the 2nd approach, but I feel that if I used indexes rather than names, I would only need to change the stored procedures rather than the UI as well.
But that compromises the readability of the code. So what should I go for?
One option is to have something in between:
private const int NameColumn = 0;
...
name = dt.Rows[0][NameColumn];
That gives you one place to change if the column ordering/definitions change, but also gives readable code at the point of access. I'm not sure it addresses the issue of having to change both the UI code and the stored procedures at the same time: if your SPs are effectively changing their public interface, you should expect to change the UI code. However, this approach can reduce the pain of that change, as well as not littering your code with magic values.
(You might also want to consider strongly typed datasets... or moving to a non-DataTable data solution such as the Entity Framework. It may not be appropriate in your situation, but it's worth considering.)
Column names, as it makes it all more readable. If you're going to change stored procedures, using column indexes won't help much because you might be changing the number or the order of columns.
Using a numeric index is faster than repeatedly using the string index. You can compute the numeric index at runtime once before you loop through table rows like so:
int indexName = dt.Columns.IndexOf("Name");
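Put together, the lookup-once-then-loop pattern might look like this. ColumnIndexDemo and ReadNames are made-up names, and the sketch assumes the Name column holds non-null strings.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ColumnIndexDemo
{
    public static List<string> ReadNames(DataTable dt)
    {
        // Resolve the column name to its numeric index once, outside the loop.
        int indexName = dt.Columns.IndexOf("Name");
        if (indexName < 0)
            throw new ArgumentException("Table has no column called Name.");

        var names = new List<string>(dt.Rows.Count);
        foreach (DataRow row in dt.Rows)
        {
            // Index by number inside the loop; the cast fails on DBNull values.
            names.Add((string)row[indexName]);
        }
        return names;
    }
}
```

This keeps the column name in the code for readability while still doing the string lookup only once per table.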

DataTable vs. Collection in .Net

I am writing a program that needs to read a set of records that describe the register map of a device I need to communicate with. Each record will have a handful of fields that describe the properties of each register.
I don't really need to edit or modify the data in my VB or C# program, though I would like to be able to display the data on a grid. I would like to store the data in a CSV file, or perhaps an XML file. I need to enable users to edit the data off-line, preferably in excel.
I am considering using a DataTable or a Collection of "Register" objects (which I would define).
I prototyped a DataTable, and found I can read/write XML easily using the built-in methods and I can easily bind to a DataGridView. I was not able to find a way to retrieve info on a single register without using a query that returns a collection of rows, even though I defined a unique primary key column. The syntax to get a value from a column is also complex, though I could be missing something on both counts.
I'm tempted to use a collection of "Register" objects that I can access via a unique key. It would be a little more coding up front, but seems like a cleaner solution overall. I should still be able to use LINQ to query subsets of registers when I need them, but would also be able to grab a single field using the key value, something like this: Registers(keyValue).fieldName.
Which would be a cleaner approach to the problem?
Is there a way to read/write XML into a Collection without needing custom code?
Could this be accomplished using String for a key?
UPDATE: Sounds like the consensus is towards the Collection of Register objects. Makes sense to me. I was leaning that way, and since nobody pointed out any DataTable features that would simplify accessing a single row, it looks like the Collection is clearly the way to go. Thanks to those who weighed in.
I would be inclined not to use data sets. It would be better to work with objects and collections. Your code will be more maintainable/readable, composable, testable & reusable.
Given that you can do queries on the data set to return a particular row, you might find that a LINQ query to turn the rows into objects may be all the custom code that you need.
Using a Dictionary<string, Register> for look ups is a good idea if you have a large number of items (say greater than 1000). Otherwise a simple LINQ query should be fine.
It depends on how you define 'clean'.
A generic collection is potentially MUCH more lightweight than a DataTable. But on the other hand that doesn't seem to be too much of an issue for you. And unless you go into heavy reflection you'll have to write some code to read/write xml.
If you use a key I'd also recommend (in the case of the collection) to use a Dictionary. That way you have a Collection of the raw data and still can identify each entry through the key in the Dictionary.
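A rough sketch of what the collection approach could look like, with a string key and an XML round trip via XmlSerializer. The Register fields (Name, Address, Description) and the RegisterStore class are invented for illustration; the real register map will have its own fields.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Must be public with a parameterless constructor for XmlSerializer.
public class Register
{
    public string Name { get; set; }
    public int Address { get; set; }
    public string Description { get; set; }
}

public static class RegisterStore
{
    // Writes the whole list as one XML document.
    public static string ToXml(List<Register> registers)
    {
        var serializer = new XmlSerializer(typeof(List<Register>));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, registers);
            return writer.ToString();
        }
    }

    // Reads the list back from XML produced by ToXml.
    public static List<Register> FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(List<Register>));
        using (var reader = new StringReader(xml))
        {
            return (List<Register>)serializer.Deserialize(reader);
        }
    }

    // Builds the string-keyed lookup; throws if two registers share a name.
    public static Dictionary<string, Register> IndexByName(IEnumerable<Register> registers)
    {
        return registers.ToDictionary(r => r.Name);
    }
}
```

After loading, a single field is one lookup away (byName["CTRL"].Address), and the original list can still be bound to a grid or filtered with LINQ. Swapping StringWriter/StringReader for file streams would give you the on-disk version.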
I usually use DataTables if it's something quick and unlikely to be used in any other way. If it's something I can see evolving into an object that has its own use within the app (like the Register object you mentioned), I go with a collection of objects.
It might be a little extra code up front - but it saves converting from a datatable to the collection in the future if you come up with something you would like to do based on an individual row, or if you want/need to add some sort of extra functionality to that element down the road.
I would go with the collection of objects so you can swap out the data access later if you need to.
You can serialize classes with an XML serializer by defining a Serialize attribute or something like that (it has been a while since I've done that, sorry for the vagueness). A DataSet or DataTable works great with XML.
Both DS and DT have ReadXml and WriteXml methods. The XML must be in a predefined format, but it works seamlessly.
Otherwise, I personally like collections or dictionaries; DS/DT are OK, but I like custom objects, and LINQ adds in some power.
HTH.
