Vendor-agnostic way of retrieving the schema of a database table

I'm currently building an application at work where users get to define various ways in which pieces of data are routed to various storage technologies. Those include traditional relational database systems.
We'd like to give feedback to users if the way they've configured this does not work with the defined database schema, i.e. if the column types don't match.
I've been looking for a solid vendor-agnostic way of retrieving the datatypes of a database table, ideally including the CLR types they map to.
So far I've been struggling to find anything even remotely decent. Many of the solutions I stumbled upon are not vendor-agnostic, and much of the database tooling included in .NET (Core) is specific to SQL Server.
The most popular way seems to be via the GetSchema method on a DbConnection object (it is not part of the IDbConnection interface), but that one is also riddled with implementation-specific details and does not give a very pleasant-to-use result. I've been able to retrieve textual representations for each of the types, and for Postgres, for example, the closest I've come is actual human-readable descriptions of the types. VARCHAR was displayed as "Varying length character string", which is hard to parse.
Most database interaction libraries for .NET (Core) abstract away the primitives like DataSet, DataTable, DataReader etc, and usually directly map to objects, thereby removing any use I could have had for them.
What is the easiest way to get an overview of a table schema?
For clarity's sake, we're looking to support the following database technologies for now:
SQL Server
PostgreSQL
MySQL / MariaDB
SQLite
Oracle RDBMS
Thanks!

This does sound like something that you have to pay for, because it is such a narrow use-case, if it even exists. I have a hard time believing this would be a maintained open-source project.
That said, maybe you can work around it by querying the database directly using something like this:
select *
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME='tableName'
Taken from https://stackoverflow.com/a/18298685/1387545
I checked and it seems to work for at least the first two databases. I think finding some kind of SQL query is your best bet for a generic solution, since SQL is the technology they all share.
But then again, I think you will obtain a better result by building your own specific parser for the database tables for each database. Of course, it all depends on time and budget.
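
A minimal sketch of the "one query per database" idea, assuming a hypothetical DbVendor enum and a hypothetical ColumnMetadataSql helper (neither comes from any library). SQL Server, PostgreSQL and MySQL / MariaDB expose INFORMATION_SCHEMA.COLUMNS; SQLite and Oracle do not, so the sketch falls back to pragma_table_info (SQLite 3.16 or later) and ALL_TAB_COLUMNS. The parameter prefixes are assumptions too and depend on the ADO.NET provider in use.

```csharp
using System;

// Hypothetical enum naming the engines listed in the question.
public enum DbVendor { SqlServer, PostgreSql, MySql, Sqlite, Oracle }

public static class ColumnMetadataSql
{
    // Returns a parameterized query that lists column name, declared type
    // and nullability for one table. The caller binds the table name to the
    // "tableName" parameter; the prefix (@ or :) is provider-dependent.
    public static string For(DbVendor vendor)
    {
        switch (vendor)
        {
            case DbVendor.SqlServer:
            case DbVendor.PostgreSql:
            case DbVendor.MySql:
                // In practice you would usually also filter on TABLE_SCHEMA.
                return "SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE " +
                       "FROM INFORMATION_SCHEMA.COLUMNS " +
                       "WHERE TABLE_NAME = @tableName";
            case DbVendor.Sqlite:
                // "notnull" is quoted because NOTNULL is an SQLite keyword.
                return "SELECT name, type, \"notnull\" " +
                       "FROM pragma_table_info(@tableName)";
            case DbVendor.Oracle:
                return "SELECT COLUMN_NAME, DATA_TYPE, NULLABLE " +
                       "FROM ALL_TAB_COLUMNS " +
                       "WHERE TABLE_NAME = :tableName";
            default:
                throw new ArgumentOutOfRangeException(nameof(vendor));
        }
    }
}
```

The type names returned are still vendor-specific strings, so a per-vendor mapping to CLR types would be needed on top of this.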

Related

Correct solution for persistent table/grid in C# that does not require a full database solution?

My WinForms C#/.NET application requires a table/grid control to display records to the end user. The records will be simple, containing only two fields, a string and a date/time field. I need to persist the data and I am wondering what the most efficient control and storage back-end is to use. The data is non-critical (i.e. - not health or financial records, or anything sensitive requiring extensive safety or any encryption).
One solution I have found so far is the DataGrid control in conjunction with SQL Server Compact Edition. I learned about this solution from this tutorial:
http://www.dotnetperls.com/datagridview-tutorial
It seems though that this may be overkill for my application. In addition, I am worried about the complexities of installing SQL Server CE, especially when it comes to admin vs. user account privilege issues during installation:
http://msdn.microsoft.com/en-us/library/aa983326(v=vs.80).aspx
Is there a table or grid control with built-in file load/save capabilities that uses a simple disk file as the storage method, perhaps a comma-delimited ASCII file? I'd like something that I can still use SQL (via LINQ) to interface with. Also, I am hoping that this can be done transparently. That is, if I want to upgrade to an SQL database engine solution later, the code on my end that interfaces with the data would not change (except perhaps for the database open/create code, of course).
Or am I better off simply biting the bullet and going with SQL Server CE or perhaps SQLite:
Good embedded database solution (like SQLite) for .Net
If you have any caveats or anecdotes regarding installation issues and ease of use, they would be appreciated.

In my projects, we use object data sources. Grids can be bound to collections of objects just as easily as they can to DataTables. You can store/restore the data using a simple serialization engine (XmlSerializer is rather easy to implement). Make a basic object, use List or BindingList as the dataset, and serialize/deserialize it in the back end when you need it.
List and BindingList both support Linq queries.
Adding database save later is as simple as writing the code that saves the object to the database, in place of the serialization code, no change to the front end at all.
As far as a "Correct" solution is concerned...there are so many different ways to do it that it boils down to personal preference, and possibly actual requirements and expected future development. I find it easier to code using objects because the data manipulation is easier, but if you are going for straight record entry, no data manipulation required, going direct to a database is easier. It just depends on the data and what you plan on doing with it.
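
The object-datasource approach described above could be sketched roughly as follows. The Entry class and the EntryStore helper are hypothetical names made up for illustration; only XmlSerializer, List and LINQ come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical record type: one string field and one date/time field.
public class Entry
{
    public string Name { get; set; } = "";
    public DateTime Timestamp { get; set; }
}

// Hypothetical helper that persists the whole list to a single XML file.
public static class EntryStore
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(List<Entry>));

    public static void Save(string path, List<Entry> entries)
    {
        using (var stream = File.Create(path))
        {
            Serializer.Serialize(stream, entries);
        }
    }

    public static List<Entry> Load(string path)
    {
        // A missing file is treated as an empty data set.
        if (!File.Exists(path)) return new List<Entry>();
        using (var stream = File.OpenRead(path))
        {
            return Serializer.Deserialize(stream) as List<Entry>
                   ?? new List<Entry>();
        }
    }

    // Plain LINQ to Objects query over the loaded list.
    public static List<Entry> Since(IEnumerable<Entry> entries, DateTime from)
    {
        return entries.Where(e => e.Timestamp >= from)
                      .OrderBy(e => e.Timestamp)
                      .ToList();
    }
}
```

For grid binding, the loaded list can be wrapped in a BindingList<Entry> and assigned as the grid's data source. Swapping in a database later would mean replacing only Save and Load.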

I strongly recommend that you use an embedded database, because it will make it easier to move to a full database in the near future. SQL Server CE is a good option, and if you want to go big you can simply move to a full SQL Server database with minimal changes to your code. The only downside of SQL Server CE is that you need to install it and it requires .NET Framework 4. Aside from that, I don't see a big problem with it.

Entity Attribute Value (EAV) frameworks?

I'd seen Entity Attribute Value in lots of contexts before I actually learnt what its name was. It's that technique that often crops up when, instead of storing data in database columns, you 'flip it' and have a table with Entity, Attribute, Value columns, and each piece of data becomes a row in that table. Sometimes it's also known as 'Open-Schema'.
It's good for some things, bad for other things. This Wikipedia article has a good discussion of the theory behind it.
It seems like the sort of oft-used technique that should have Frameworks or Engines or NoSQL Databases or general software tools to build and support it.
So, do you know of any? I'm particularly interested in the Microsoft stack (.Net, SQL Server, etc), but also in other technology stacks.
For example, here's a project to build an ASP.NET EAV engine that is exactly what I'm looking for, but apparently never got started.

If you can live with the drawbacks of NoSQL databases, the best way to approach the EAV pattern is with a NoSQL alternative like CouchDB or MongoDB. These databases offer a "schemaless" design which allows each row to have its own schema. Doing EAV with a traditional RDBMS is asking for trouble, as the querying becomes very difficult and performance suffers as the dataset grows.
An alternative that I have used successfully in the past is to combine an RDBMS with a NoSQL variant (MySQL and MongoDB). I use MySQL to store the EAV data (gaining the transactional integrity), and I use MongoDB as a reporting store to get around the querying issues with the EAV model.
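
To make the querying difficulty concrete, here is a small in-memory sketch of EAV rows. EavRow and Eav are hypothetical names, and all values are stored as strings, which is an assumption of this sketch (and one of the usual weak points of the pattern). Reading one entity back requires pivoting many rows, and filtering on an attribute means scanning rows rather than a typed column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical EAV row: one attribute value of one entity.
public class EavRow
{
    public int EntityId { get; set; }
    public string AttributeName { get; set; } = "";
    public string Value { get; set; } = "";
}

public static class Eav
{
    // Pivots rows back into one attribute dictionary per entity.
    public static Dictionary<int, Dictionary<string, string>> Pivot(
        IEnumerable<EavRow> rows)
    {
        return rows
            .GroupBy(r => r.EntityId)
            .ToDictionary(
                g => g.Key,
                g => g.ToDictionary(r => r.AttributeName, r => r.Value));
    }

    // Finds the ids of entities whose given attribute has the given value.
    public static List<int> FindEntities(
        IEnumerable<EavRow> rows, string attributeName, string value)
    {
        return rows
            .Where(r => r.AttributeName == attributeName && r.Value == value)
            .Select(r => r.EntityId)
            .Distinct()
            .ToList();
    }
}
```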

You could store it in SQL-XML. I don't know of a library for that, though, but you could do the de/serialization in .NET and then use LINQ to XML against it.
Performance will also be a tremendous issue obviously.

I'll get the ball rolling with one I found via this blog post:
An early Beta for a SQL Server EAV framework:
http://eav.codeplex.com/
"A sample EAV pattern for SQL Server with: Tables and indexes, Partial referential integrity, Partial data typing, Updatable views (like normal SQL table)"
Provides some SQL scripts to download, here.

Database Design In SQL Server or C#?

Should a database be designed on SQL Server or C#?
I always thought it was more appropriate to design it on SQL Server, but recently I started reading a book (Pro ASP.NET MVC Framework) which, to my understanding, basically says that it's probably a better idea to write it in C# since you will be accessing the model through C#, which does make sense.
I was wondering what everyone else's opinion on this matter was...
I mean, for example, do you consider it "correct" to have a table that specifies constants, like an AccessLevel table that is always supposed to contain the following?
1 Everyone
2 Developers
3 Administrators
4 Supervisors
5 Restricted
Wouldn't it be more robust and streamlined to just have an enum for that same purpose?
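
For reference, the enum alternative the question has in mind might look like the sketch below. The explicit numeric values are an assumption chosen to mirror the table rows above, and AccessLevelSeed is a hypothetical helper showing how the same enum could produce the rows for a lookup table if both are kept.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Values are pinned explicitly so they match the ids in the table.
public enum AccessLevel
{
    Everyone = 1,
    Developers = 2,
    Administrators = 3,
    Supervisors = 4,
    Restricted = 5
}

public static class AccessLevelSeed
{
    // Produces (id, name) pairs that could be inserted into an
    // AccessLevel lookup table so the database and code stay in sync.
    public static List<KeyValuePair<int, string>> Rows()
    {
        return Enum.GetValues(typeof(AccessLevel))
            .Cast<AccessLevel>()
            .Select(l => new KeyValuePair<int, string>((int)l, l.ToString()))
            .ToList();
    }
}
```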

A database schema should be designed on paper or with an ERD tool.
It should be implemented in the database.
Are you thinking about ORMs like Entity Framework that let you use code to generate the database?
Personally, I would rather think through my design on paper before committing it to a DB myself. I would be happy to use an ORM or class generator from this DB later on.

Before VS.NET 2010 I was using SQL Server Management Studio to design my databases. Now I am using the EF 4.0 designer; for me it's the best way to go.

If your problem domain is complex, or its complexity grows as the system evolves, you'll soon discover you need some metadata to make life easier. C# can be a good choice as a host language for such things, as you can use its type system to enforce some invariants (like char column lengths, null/not null restrictions or check constraints; you can declare them as consts, enums, etc.). Unfortunately, I don't know of any utilities that can do it out of the box (sqlmetal.exe can export some metadata, but only as XML), although some CASE tools can probably be customized. I'd go for a custom-made generator to produce the DB schema from C# (just a few hours of work compared to learning, for example, the customization options offered by Sybase PowerDesigner).
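
A rough sketch of what such a custom-made generator could look like, using reflection. Everything here is hypothetical: the ColumnLengthAttribute, the Customer class, the SchemaGenerator helper, and the small CLR-to-SQL type mapping (which uses SQL Server style type names and treats strings as NOT NULL purely for brevity).

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical attribute carrying the maximum length of a character column.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnLengthAttribute : Attribute
{
    public int Length { get; }
    public ColumnLengthAttribute(int length) { Length = length; }
}

// Hypothetical entity used as the source of the schema.
public class Customer
{
    public int Id { get; set; }
    [ColumnLength(100)] public string Name { get; set; } = "";
    public DateTime? LastOrder { get; set; }
}

public static class SchemaGenerator
{
    // Builds a CREATE TABLE statement from the public instance properties.
    public static string CreateTable(Type entity)
    {
        var columns = entity
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Select(ToColumn);
        var body = string.Join(",\n  ", columns);
        return "CREATE TABLE " + entity.Name + " (\n  " + body + "\n);";
    }

    private static string ToColumn(PropertyInfo property)
    {
        // Nullable<T> value types become NULL columns; everything else NOT NULL.
        var underlying = Nullable.GetUnderlyingType(property.PropertyType);
        var clrType = underlying ?? property.PropertyType;
        var nullability = underlying != null ? "NULL" : "NOT NULL";

        string sqlType;
        if (clrType == typeof(int)) sqlType = "INT";
        else if (clrType == typeof(DateTime)) sqlType = "DATETIME2";
        else if (clrType == typeof(string))
        {
            var length = property.GetCustomAttribute<ColumnLengthAttribute>();
            sqlType = "NVARCHAR(" + (length != null ? length.Length : 255) + ")";
        }
        else throw new NotSupportedException(
            "No SQL mapping for " + clrType.Name);

        return property.Name + " " + sqlType + " " + nullability;
    }
}
```

Calling SchemaGenerator.CreateTable(typeof(Customer)) yields one column definition per property. A real generator would also need keys, indexes and per-vendor type names.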

ORMs have their place; that place is NOT database design. There are many considerations in designing a database that need to be thought through, not automatically generated, no matter how appealing the idea of not thinking about design might be. There are often many things to consider that have nothing to do with the application, such as data integrity, reporting, audit tables and data imports. Using an ORM to create a database that looks like an object model may not give the best design for performance and may not include the things you really need in terms of data integrity. Remember, even if you think nothing except the application will ever touch the database, this is not true. At some point someone will need to do a major data revision (to fix a problem) directly on the database, not through the application. At some point you are going to need to import a million records from some other company you just bought, and you are going to need an ETL process outside the application. Putting all your hopes and dreams for the database (as well as your data integrity rules) in the application is short-sighted.

pluggable data store architectures

I have a pluggable system management tool. The architecture of this kind of thing is well understood (interfaces, publish/subscribe, ...). How about the data store, though? What do people do?
I need plugins to be able to add new entities, extend existing entities, establish new relationships, etc.
My thoughts (SQL), not necessarily well thought out
each plugin simply extends the schema when they are installed. In the old days changing the schema was a big no-no; now databases are very relaxed about this
plugins have their own tables. If 2 of them have an entity (say) person, then there are 2 tables p1_person and p2_person
plugins have their own database
invent some sort of flexible scheme where the tables are softly typed. Maybe many attributes packed into a single attribute. The ultimate is to have one big table called data, with key of table name & column name and a single data value.
Not SQL
object DB. I have no experience with these. Does anybody care to pass on their experience? db4o, for example. Can I change the 'schema' of objects as the app evolves?
NoSQL
this is 'where it's at' at the moment. Most of these seem to be aimed at slightly different needs than mine. Does anybody want to pass on their experience with these?
Apologies for the open ended question

My suggestion is to go and read about the Entity Framework.
A lot of the situations you are describing can be solved (very elegantly) using table inheritance.
Your idea of one big table called data makes the hamsters in my computer cry ;)
The general trend is away from weakly typed schemas because they cannot be checked at compile time. What you get from something like Entity Framework is a strongly typed, extensible schema that you can code against using LINQ.
Object databases:
Like you, I haven't played with them massively. However, the time when I was considering them was a time when there was no good ORM for .NET and writing ADO.NET code was slowly killing me.
As for NoSQL, these are databases that meet a performance need. SQL performs badly in situations where there are lots of small writes occurring. I say badly tongue in cheek: it performs very well, but when you scale to millions of concurrent users everything changes. My understanding of NoSQL is that it is a non-rationalised format designed for lots of small, fast writes and reads. The scale of sites that use these is usually very large.
OK - in response
I am currently lucky enough to be on a greenfield project, so I am using EF to generate my schema.
On non-greenfield projects I use SQL scripts to update my table structures. As for implementing table inheritance in SQL, it's very easy once you know the concept: it's essentially a one-to-many relationship with a constraint that it will only ever be 0-1.
I wouldn't write .NET code that updates the database structure... that sounds like a disaster waiting to happen to me.
I'm beginning to think I have misunderstood what you are looking for. I find databases to be second nature, as I have spent so long with them.
I haven't found a replacement for being meticulous about script management.

Methods for storing searchable data in C#

In a desktop application, I need to store a 'database' of patient names with simple information, which can later be searched through. I'd expect on average around 1,000 patients total. Each patient will have to be linked to test results as well, although these can/will be stored separately from the patients themselves.
Is a database the best solution for this, or overkill? In general, we'll only be searching based on a patient's first/last name, or ID numbers. All data will be stored with the application, and not shared outside of it.
Any suggestions on the best method for keeping all such data organized? The method for storing the separate test data is what seems to stump me when not using databases, while keeping it linked to the patient.
Off the top of my head, given a List<Patient>, I can imagine several LINQ commands to make searching a breeze, although with a list of 1,000 - 10,000 patients, I'm unsure if there are any performance concerns.
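
A minimal sketch of the in-memory variant mentioned above, assuming hypothetical Patient and TestResult classes (the property names are made up for illustration). Test results hang off each patient as a list, name searches are a LINQ scan, and ID lookups go through a dictionary index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a single test result.
public class TestResult
{
    public string TestName { get; set; } = "";
    public DateTime TakenOn { get; set; }
    public string Value { get; set; } = "";
}

// Hypothetical patient record; results are linked by containment.
public class Patient
{
    public int Id { get; set; }
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public List<TestResult> Results { get; } = new List<TestResult>();
}

public static class PatientSearch
{
    // Case-insensitive substring match on first or last name (linear scan).
    public static List<Patient> ByName(IEnumerable<Patient> patients, string term)
    {
        return patients
            .Where(p =>
                p.FirstName.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0 ||
                p.LastName.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }

    // Builds an index for direct lookups by patient id.
    public static Dictionary<int, Patient> IndexById(IEnumerable<Patient> patients)
    {
        return patients.ToDictionary(p => p.Id);
    }
}
```

This only covers searching. Persisting the list and keeping it consistent across crashes is the part a database would otherwise handle.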

Use a database, mainly because what you expect and what you get (especially over the long term) tend to be two totally different things.

This is completely unrelated to your question on a technical level, but are you doing this for a company in the United States? What kind of patient data are you storing?
Have you looked into HIPAA requirements and checked to see if you're a covered entity? Be sure that you're complying with all legal regulations and requirements!

I think 1000 is too much to try to store in XML. I'd go with a simple DB type, like Access or SQLite. Yes, as a matter of fact, I'd probably use SQLite. SQL Server Express is probably overkill for it. http://sqlite.phxsoftware.com/ is the .NET provider.

I would recommend a database. You can use SQL Server Express for something like that. Trying to use XML or something similar would probably get out of hand with that many rows.
For smaller databases/apps like this I've yet to notice any performance hits from using LINQ to SQL or Entity Framework.

I would use SQL Server Express because it has the best tool support (IDE integration) from Microsoft. I don't see any reason to consider it overkill.
Here's an article on how to embed it directly in your application (no separate installation needed).

If you had read-only files provided by another party in some kind of standard format which were meant to be used by the application, then I would consider simply indexing them according to your use cases and running your searches and UI against that. But that's still some customized work.
Relational databases are great for storing data in tables, and for representing the relationships between tables. Typically there are also good tools for getting the data in and out.
There are other systems you could use to store your data, but none which would so quickly be mapped to your input (you didn't mention how your data would get into this system) and then be queryable against with least effort.
Now, which database to choose...

Use a database... but maybe just SQLite, instead of a fully fledged database like MS SQL (Express).
