Entity Attribute Value (EAV) frameworks?

I'd seen Entity Attribute Value in lots of contexts before I actually learnt what its name was. It's the technique that often crops up when, instead of storing data in database columns, you 'flip it' and have a table with Entity, Attribute and Value columns, so that each piece of data becomes a row in that table. Sometimes it's also known as 'Open-Schema'.
It's good for some things, bad for other things. This Wikipedia article has a good discussion of the theory behind it.
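To make the 'flip' concrete, here is a minimal sketch using Python's built-in sqlite3 module (chosen only so the example is runnable; the same idea applies to SQL Server). The table name eav, the entity keys and the helper functions are all hypothetical, made up for illustration.

```python
import sqlite3

# In-memory database; one row per (entity, attribute) pair instead of one column per attribute.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eav ("
    " entity TEXT NOT NULL,"
    " attribute TEXT NOT NULL,"
    " value TEXT,"
    " PRIMARY KEY (entity, attribute))"
)
conn.executemany(
    "INSERT INTO eav (entity, attribute, value) VALUES (?, ?, ?)",
    [
        ("product:1", "name", "Kettle"),
        ("product:1", "colour", "red"),
        ("product:2", "name", "Toaster"),
        ("product:2", "slots", "4"),  # every value is stored as text
    ],
)


def load_entity(conn, entity):
    """Reassemble one entity from its attribute rows."""
    cur = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,)
    )
    return dict(cur.fetchall())


def find_entities(conn, attribute, value):
    """Find entities that have the given attribute set to the given value."""
    cur = conn.execute(
        "SELECT entity FROM eav WHERE attribute = ? AND value = ? ORDER BY entity",
        (attribute, value),
    )
    return [row[0] for row in cur]
```

Note that product:2 has a slots attribute that product:1 lacks, with no schema change needed. The price is that all values are text and every multi-attribute query needs a self-join or a pivot.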
It seems like the sort of oft-used technique that should have Frameworks or Engines or NoSQL Databases or general software tools to build and support it.
So, do you know of any? I'm particularly interested in the Microsoft stack (.Net, SQL Server, etc), but also in other technology stacks.
For example, here's a project to build an ASP.NET EAV engine that is exactly what I'm looking for, but apparently never got started.

If you can live with the drawbacks of NoSQL databases, the best way to approach the EAV pattern is with a NoSQL alternative like CouchDB or MongoDB. These databases offer a "schemaless" design which allows each record (document) to have its own schema. Doing EAV with a traditional RDBMS is asking for trouble, as querying becomes very difficult and performance suffers as the dataset grows.
An alternative that I have used successfully in the past is to combine an RDBMS with a NoSQL variant (MySQL and MongoDB). I use MySQL to store the EAV data (gaining the transactional integrity), and I use MongoDB as a reporting store to get around the querying issues with the EAV model.
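As a rough sketch of the reporting-store idea, the step that matters is folding the EAV rows into one document per entity. The function below is hypothetical and does no database I/O: it assumes rows arrive as (entity, attribute, value) tuples, and the resulting dictionaries are the kind of thing you would then hand to a document store.

```python
import json
from collections import defaultdict


def eav_to_documents(rows):
    """Fold (entity, attribute, value) rows into one dict per entity.

    The "_id" key follows the convention document stores such as MongoDB use
    for the primary key; an attribute that is itself named "_id" would
    overwrite it, which this sketch does not guard against.
    """
    grouped = defaultdict(dict)
    for entity, attribute, value in rows:
        grouped[entity][attribute] = value
    return [{"_id": entity, **attrs} for entity, attrs in sorted(grouped.items())]


rows = [
    ("order:1", "status", "paid"),
    ("order:1", "total", "19.99"),
    ("order:2", "status", "open"),
]
documents = eav_to_documents(rows)
print(json.dumps(documents, indent=2))
```

Once the data is in this shape, a query like "all orders with status paid" is a plain field match rather than a self-join on the EAV table.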

You could store it as XML in the database (SQL-XML). I don't know of a library for it, though, but you could do the de/serialization in .NET and then run LINQ to XML against it.
Performance will obviously be a tremendous issue as well.
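A small sketch of the XML idea, written in Python with xml.etree.ElementTree as a stand-in for the .NET serialization and LINQ to XML steps. The element and attribute names (entity, attr, name) are assumptions made for this example, not an established format.

```python
import xml.etree.ElementTree as ET


def to_xml(entity_id, attributes):
    """Serialize an entity's open-ended attributes into one XML string."""
    root = ET.Element("entity", {"id": entity_id})
    for name, value in attributes.items():
        ET.SubElement(root, "attr", {"name": name}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Deserialize back into (entity_id, attributes); all values come back as strings."""
    root = ET.fromstring(text)
    return root.get("id"), {a.get("name"): a.text for a in root.findall("attr")}


def get_attr(text, name):
    """Look up a single attribute with an XPath-style predicate."""
    node = ET.fromstring(text).find(f"attr[@name='{name}']")
    return None if node is None else node.text


xml_text = to_xml("product:1", {"name": "Kettle", "colour": "red"})
```

The XML string would live in a single column per entity, so every attribute lookup means parsing the whole document, which is where the performance concern comes from.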

I'll get the ball rolling with one I found via this blog post:
An early Beta for a SQL Server EAV framework:
http://eav.codeplex.com/
"A sample EAV pattern for SQL Server with: Tables and indexes, Partial referential integrity, Partial data typing, Updatable views (like normal SQL table)"
Provides some SQL scripts to download, here.

Related

Vendor-agnostic way of retrieving the schema of a database table

I'm currently building an application at work where users get to define various ways in which pieces of data are routed to various storage technologies. Those include traditional relational database systems.
We'd like to give feedback to users if the way they've configured this does not work with the defined database schema, i.e. if the column types don't match.
I've been looking for a solid vendor-agnostic way of retrieving the datatypes of a database table, ideally including the CLR types they map to.
So far I've been struggling to find anything even remotely decent. Many of the solutions I stumbled upon are not vendor-agnostic, and much of the tooling for database technologies included in .NET (Core) is specific to SQL Server.
The most popular way seems to be via the GetSchema method on an IDbConnection object, but that one is also riddled with implementation specific details, and does not give a very pleasant to use result. I've been able to retrieve textual representations for each of the types, and for Postgres for example, the closest I've come is actual human-readable descriptions of the types. VARCHAR was displayed as "Varying length character string", which is hard to parse.
Most database interaction libraries for .NET (Core) abstract away the primitives like DataSet, DataTable, DataReader etc, and usually directly map to objects, thereby removing any use I could have had for them.
What is the easiest way to get an overview of a table schema?
For clarity's sake, we're looking to support the following database technologies for now:
SQL Server
PostgreSQL
MySQL / MariaDB
SQLite
Oracle RDBMS
Thanks!

This does sound like something that you have to pay for, because it is such a narrow use-case, if it even exists. I have a hard time believing this would be a maintained open-source project.
That said, maybe you can get around it by querying the database directly using something like this:
select *
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME='tableName'
Taken from https://stackoverflow.com/a/18298685/1387545
I checked and it seems to work for at least the first two databases. I think finding some kind of SQL query is your best bet for a generic solution, since SQL is the technology that they share.
But then again, I think you will obtain a better result by building your own specific parser for the database tables for each database. It of course all depends on time and budget.
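A sketch of what the per-database approach might look like. INFORMATION_SCHEMA.COLUMNS exists in SQL Server, PostgreSQL and MySQL/MariaDB, but Oracle exposes column metadata through views such as ALL_TAB_COLUMNS and SQLite through PRAGMA table_info, so some per-vendor dispatch is unavoidable. The example is in Python with the built-in sqlite3 module so it can run as-is; only the SQLite branch is implemented, and the vendor keys, the logical type names and the type map are all assumptions for illustration.

```python
import sqlite3

# Where each vendor exposes column metadata (for reference; only SQLite is queried below).
CATALOG_SOURCES = {
    "sqlserver": "INFORMATION_SCHEMA.COLUMNS",
    "postgresql": "information_schema.columns",
    "mysql": "information_schema.columns",
    "oracle": "ALL_TAB_COLUMNS",
    "sqlite": "PRAGMA table_info",
}

# Hypothetical mapping from vendor type names to a small vendor-neutral vocabulary.
LOGICAL_TYPES = {
    "int": "integer", "integer": "integer", "bigint": "integer",
    "varchar": "string", "nvarchar": "string", "text": "string",
    "character varying": "string",
    "real": "float", "float": "float", "double precision": "float",
}


def normalize_type(declared):
    """Strip any length/precision suffix, e.g. 'VARCHAR(50)' -> 'string'."""
    base = declared.split("(")[0].strip().lower()
    return LOGICAL_TYPES.get(base, "unknown")


def sqlite_columns(conn, table):
    """Return {column name: logical type} for one SQLite table."""
    # PRAGMA arguments cannot be bound as parameters, so quote the name by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: normalize_type(row[2]) for row in rows}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR(50), score REAL)")
print(sqlite_columns(conn, "users"))
```

The user-configured routing could then be validated against the logical types rather than against each vendor's own type names.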

Entity Framework - Interact with Oracle and SQL Server

I am working on a .NET Web API service (with OData support) to support a mobile client. The service should support both Oracle and SQL Server databases, but only one database type will be used at a time, according to whichever database technology the client is using.
How do I create a database-agnostic data access layer? I don't want to write code twice, once for SQL Server and once for Oracle.
Also, it seems that in order to support Oracle in EF, third-party Oracle drivers are required, either from Devart or Oracle's ODP.NET.
I am debating whether I should use old-style ADO.NET or EF for building the data access layer.
I will appreciate any help on this.
Thanks!

Your question seems to revolve around multiple concerns; I'll give answers based on my views on them:
1. How can you create a database (DB engine) agnostic DAL?
A: One approach is to follow the Repository pattern and/or use interfaces to decouple the code that manipulates the data from the code that retrieves and inserts it. The implementation behind those interfaces can also be tailored to be DB engine agnostic: if you're going to use ADO.NET, check out the Enterprise Library for some very useful DB engine agnostic code. Entity Framework is also compatible with different DB engines but, as you mentioned, it can only interact with one DB at a time, so whenever you generate the model you tie it to the specifics of the DB engine that hosts your DB. This is related to another concern in your question:
2. Should you use plain old ADO.NET or EF?
A: This is a very good question, which I'm sure has been asked many times before. Given that both approaches give you the same practical result of being able to retrieve and manipulate data, the real question is: what is your personal preference for coding, and what are the time/resource constraints of the project?
IMO, Entity Framework is best suited to Code-First projects and to cases where your business logic doesn't require complex logging, transactions and other security or performance constraints on the DB side. That is not because EF is incapable of meeting these requirements, but because doing so becomes rather convoluted and impractical, and I personally believe that defeats the purpose of EF: to provide you with a tool that allows for rapid development.
So, if the people involved in the project are not very comfortable writing stored procedures in SQL, and the data manipulation will revolve mostly around your service without the need for very complex operations on the DB side, then EF is a suitable approach, and you can leverage the Repository pattern as well as interfaces to implement "DbContext" objects that will allow you to create a DB agnostic DAL.
However, if you are required to implement transactions, security, extensive logging, and are more comfortable writing SQL stored procedures, Entity Framework will often prove to be a burden for you simply because it is not yet suited for advanced tasks, for example:
Imagine you have a User table with multiple fields (address, phone, etc.) that are not always necessary for all user-related operations (such as authentication). Trying to map an entity to the results of a stored procedure that does not return every field the entity contains will result in an error, so you will either need to create different models with more or fewer members, or return additional columns from the SP that you might not need for a particular operation, increasing bandwidth consumption unnecessarily.
Another situation is taking advantage of features such as Table Valued Parameters in SQL Server to optimize sending multiple records at once to the DB. Entity Framework does not include anything that will automatically optimize operations with multiple records, so in order to use TVPs you will need to define that operation manually, much like you would if you had gone the ADO.NET route.
Eventually, you will have to weigh the considerations of your project against what your alternatives provide. ADO.NET gives you the best performance and customization for your DB operations; it is highly scalable and allows optimizations, but it takes more time to code. EF is very straightforward and practical for object manipulation, and though it is constantly evolving and improving, its performance and capabilities are not quite on par with ADO.NET yet.
And regarding the drivers issue, it shouldn't weigh too much in the matter since even Oracle encourages you to use their driver instead of the default one provided by Microsoft.
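To illustrate the Repository idea from point 1, here is a minimal sketch. It is written in Python rather than C# purely so it can run standalone, and every name in it (UserRepository, the two implementations, register_and_greet) is hypothetical. The point is only that the calling code depends on the interface, so the storage engine behind it can be swapped.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class UserRepository(ABC):
    """The interface the rest of the application codes against."""

    @abstractmethod
    def add(self, name: str) -> int: ...

    @abstractmethod
    def get(self, user_id: int) -> Optional[str]: ...


class InMemoryUserRepository(UserRepository):
    """Stand-in for one engine (also handy in unit tests)."""

    def __init__(self):
        self._users = {}
        self._next_id = 1

    def add(self, name):
        user_id = self._next_id
        self._users[user_id] = name
        self._next_id += 1
        return user_id

    def get(self, user_id):
        return self._users.get(user_id)


class SqliteUserRepository(UserRepository):
    """Stand-in for another engine; all SQL stays inside this class."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name):
        cur = self._conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        return cur.lastrowid

    def get(self, user_id):
        row = self._conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return None if row is None else row[0]


def register_and_greet(repo: UserRepository, name: str) -> str:
    """Business logic that never mentions a specific database."""
    user_id = repo.add(name)
    return f"Hello, {repo.get(user_id)}"
```

In the scenario from the question, one implementation would wrap the SQL Server access code and another the Oracle access code, with the choice made once at startup from configuration.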

Legacy MySQL database mapping to a good .NET ORM for system migration

This is quite a long one, but I'd very much appreciate your thoughts and suggestions.
We are busy rebuilding a legacy system which was written in PHP and MySQL and replacing its components with ASP.NET MVC in C# and SQL Server. The legacy architecture leaves much to be desired and there is a serious issue with spaghetti code, no referential integrity in the DB, unused code and database fields and just generally bad coding.
As much as I'd love to, we can't just rip out all of the old code and replace it. The company needs to stay functional during the development process, so we will need to build new functionality while using the old databases to ensure that their data is accurate at all times. The level of data accuracy isn't real-time, but if we had 2 systems, they would have to be in sync 100% of the time. The old system uses 6 different MySQL databases, all on the same server, running Linux. We will be running Windows 2008 R2 on the new server for the new system and we are planning to use the latest version of SQL Server.
The problem I'm having to solve is: I need to somehow map all of these databases into a consolidated model that we can use through C# to develop the new system on. Once we have moved all the functionality over to C#, we need to port the data into a DB that matches our code model. This DB will be running on SQL Server. I'm not too worried about the migration just yet; my current issue is finding an ORM tool that will allow me to map these 6 MySQL databases into a single, well planned out and designed model that we can use for the new development.
The new model might have additional fields that we would have to store in a new MySQL database until we port the data across at some stage, so the ORM should support easily building entities that span multiple tables and databases.
Is what I'm trying to do possible? Is it viable in terms of effort? Is there an ORM that can do all of this? and what other way is there to maintain operational capacity of the company whilst developing on the system actively?
I have looked at these ORM options:
SubSonic (great, but I think too lightweight for what we are trying)
Entity Framework (looks like I might be able to use this if I use very dirty models with tons of stored procedures for inserts, updates and deletes)
NHibernate (the client does not want us to use this due to bad experiences in the past)
LLBLGen (seems like it can do what we need it to, but long term support could be a concern with the client)
Anything else I should look at? Is there a different approach I could try?

ORMs aren't designed to solve the problem you have. That said, a quality ORM will get you some percentage of the way toward a solution.
NHibernate is the easy choice. LLBLGen would be my second choice. I wouldn't even bother with EF or SubSonic as they are very feature poor compared to the other two and you need decent feature support in your scenario.
You'll likely have to invest a lot of time in writing custom code around your migration requirements. Your use case is not a standard, well traveled path.

For Entity Framework: if you're prepared to maintain one complete set of stored procedures with a static interface (i.e. same signature) you could implement them all in Transact-SQL on the SQL Server box, with linked servers (to the MySQL farm).
When the time comes, you could migrate the data into SQL Server and update your stored procedures.
Basically, design a nice model with nice stored procedures, and as a temporary solution implement any ugliness inside the stored procedures. Once MySQL is out of the way, you can replace the stored procedures with better ones.
SQL Server has a tendency to retrieve the entire remote table when you're running queries against a linked server, so if performance is a concern it might eventuate that all your stored procedures are wrappers around OPENROWSET (see Example A for running a query on a remote server).

Database Design In SQL Server or C#?

Should a database be designed on SQL Server or C#?
I always thought it was more appropriate to design it on SQL Server, but recently I started reading a book (Pro ASP.NET MVC Framework) which, to my understanding, basically says that it's probably a better idea to write it in C# since you will be accessing the model through C#, which does make sense.
I was wondering what everyone else's opinion on this matter was...
I mean, for example, do you consider it "correct" to have a table that specifies constants, like an AccessLevel table that is always supposed to contain the following?
1 Everyone
2 Developers
3 Administrators
4 Supervisors
5 Restricted
Wouldn't it be more robust and streamlined to just have an enum for that same purpose?
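For what it's worth, the two options are not mutually exclusive. A rough sketch of one possible middle ground: define the constants once as an enum in code and seed the lookup table from it, so the database can still enforce foreign keys against those values. This is Python with sqlite3 standing in for C# and SQL Server, and the table name access_level and the function seed_access_levels are made up for the example.

```python
import sqlite3
from enum import IntEnum


class AccessLevel(IntEnum):
    EVERYONE = 1
    DEVELOPERS = 2
    ADMINISTRATORS = 3
    SUPERVISORS = 4
    RESTRICTED = 5


def seed_access_levels(conn):
    """Create the lookup table if needed and make its rows match the enum."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS access_level ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL UNIQUE)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO access_level (id, name) VALUES (?, ?)",
        [(level.value, level.name.title()) for level in AccessLevel],
    )


conn = sqlite3.connect(":memory:")
seed_access_levels(conn)
seed_access_levels(conn)  # safe to run again, e.g. on every deployment
levels = conn.execute("SELECT id, name FROM access_level ORDER BY id").fetchall()
```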

A database schema should be designed on paper or with an ERD tool.
It should be implemented in the database.
Are you thinking about ORMs like Entity Framework that let you use code to generate the database?
Personally, I would rather think through my design on paper before committing it to a DB myself. I would be happy to use an ORM or class generator from this DB later on.

Before VS.NET 2010 I was using SQL Server Management Studio to design my databases; now I am using the EF 4.0 designer, which for me is the best way to go.

If your problem domain is complex, or its complexity grows as the system evolves, you'll soon discover you need some metadata to make life easier. C# can be a good choice of host language for this, as you can use its type system to enforce some invariants (such as char-column lengths, null/not null restrictions or check constraints, which you can declare as consts, enums, etc.). Unfortunately I don't know of utilities that can do this out of the box (sqlmetal.exe can export some metadata, but only as XML), although some CASE tools can probably be customized. I'd go for a custom-made generator that produces the DB schema from C# (just a few hours' work compared to learning, for example, the customization options offered by Sybase PowerDesigner).
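A toy version of such a generator, to show how little is needed for a first cut. It is sketched in Python with dataclasses rather than C# so that it runs standalone; the Customer class, the "sql" metadata key and create_table_sql are all invented for the example, and a real generator would also need to handle foreign keys, indexes and migrations.

```python
import sqlite3
from dataclasses import dataclass, field, fields


@dataclass
class Customer:
    # The column definition travels with the field as metadata.
    id: int = field(metadata={"sql": "INTEGER PRIMARY KEY"})
    name: str = field(metadata={"sql": "VARCHAR(100) NOT NULL"})
    email: str = field(metadata={"sql": "VARCHAR(255)"})


def create_table_sql(cls):
    """Build a CREATE TABLE statement from a dataclass's field metadata."""
    columns = ", ".join(f"{f.name} {f.metadata['sql']}" for f in fields(cls))
    return f"CREATE TABLE {cls.__name__.lower()} ({columns})"


ddl = create_table_sql(Customer)
print(ddl)

# Sanity check: the generated statement is accepted by a real engine.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
```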

ORMs have their place; that place is NOT database design. There are many considerations in designing a database that need to be thought through, not automatically generated, no matter how appealing the idea of not thinking about design might be. Many of them have nothing to do with the application: data integrity, reporting, audit tables and data imports. Using an ORM to create a database that looks like an object model may not give the best design for performance and may not give you the things you really need in terms of data integrity. Remember, even if you think nothing except the application will ever touch the database, this is not true. At some point someone will need to do a major data revision (to fix a problem) directly on the database, not through the application. At some point you are going to need to import a million records from some other company you just bought, and you are going to need an ETL process outside the application. Putting all your hopes and dreams for the database (as well as your data integrity rules) in the application is short-sighted.

pluggable data store architectures

I have a pluggable system management tool. The architecture of this kind of thing is well understood (interfaces, publish/subscribe, ...). What about the data store, though? What do people do?
I need plugins to be able to add new entities, extend existing entities, establish new relationships, etc.
My thoughts (SQL), not necessarily well thought out
each plugin simply extends the schema when they are installed. In the old days changing the schema was a big no-no; now databases are very relaxed about this
plugins have their own tables. If 2 of them have an entity (say) person, then there are 2 tables p1_person and p2_person
plugins have their own database
invent some sort of flexible scheme where the tables are softly typed. Maybe many attributes packed into a single attribute. The ultimate is to have one big table called data, with key of table name & column name and a single data value.
Not SQL
object DB. I have no experience with these. Anybody care to pass on experience? db4o, for example. Can I change the 'schema' of objects as the app evolves?
NO-SQL
this is 'where it's at' at the moment. Most of these seem to be aimed slightly differently than my needs. Anybody want to pass on experience with these?
Apologies for the open ended question

My suggestion is to go and read about the Entity Framework.
A lot of the situations you are describing can be solved (very elegantly) using table inheritance.
Your idea of one big table called data makes the hamsters in my computer cry ;)
The general trend is away from weakly typed schemas because they cannot be checked at compile time. What you get from something like Entity Framework is a strongly typed, extensible schema that you can code against using LINQ.
Object databases:
Like you, I haven't played with them massively. However, the time when I was considering them was a time when there was no good ORM for .NET and writing ADO.NET code was slowly killing me.
As for NO-SQL, these are databases that meet a performance need. SQL performs badly in situations where there are lots of small writes occurring. I say badly tongue in cheek: it performs very well, but when you scale to millions of concurrent users everything changes. My understanding of NoSQL is that it is a non-normalised format designed for lots of small, fast writes and reads. The sites that use these are usually very large.
OK - in response
I am currently lucky enough to be on a greenfield project, so I am using EF to generate my schema.
On non-greenfield projects I use SQL scripts to update my table structures. As for implementing table inheritance in SQL, it's very easy once you know the concept: it's essentially a one-to-many relationship with a constraint that it will only ever be 0-1.
I wouldn't write .NET code that updates the database structure ... that sounds like a disaster waiting to happen to me.
Beginning to think I have misunderstood what you are looking for. I find databases to be second nature as I have spent so long with them.
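A minimal sketch of that "one-to-many constrained to 0-1" idea, using Python's sqlite3 module so it can run as-is. The person and employee tables are hypothetical: the subtype table's primary key doubles as the foreign key to the base table, so each person can have at most one employee row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " person_id INTEGER PRIMARY KEY REFERENCES person(id),"  # PK + FK gives the 0-1 constraint
    " salary INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO person (id, name) VALUES (?, ?)", [(1, "Ada"), (2, "Bob")])
conn.execute("INSERT INTO employee (person_id, salary) VALUES (1, 5000)")


def try_insert_employee(conn, person_id, salary):
    """Return True if the subtype row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO employee (person_id, salary) VALUES (?, ?)",
            (person_id, salary),
        )
        return True
    except sqlite3.IntegrityError:
        return False


def people_with_salary(conn):
    """Base rows joined to their optional subtype row."""
    return conn.execute(
        "SELECT p.name, e.salary FROM person p"
        " LEFT JOIN employee e ON e.person_id = p.id ORDER BY p.id"
    ).fetchall()
```

A plugin that wants to extend an existing entity would add its own subtype table in this style rather than altering the base table.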
I haven't found a replacement for being meticulous about script management.
