Why is creating tables at run time (code-behind) bad? - c#

People say that creating database tables dynamically (that is, at run time) should be avoided because it is bad practice and hard to maintain.
I don't see why, and I don't see the difference between creating a table and any other SQL statement such as SELECT or INSERT. I have written apps that create, delete and modify databases and tables at run time, and so far I have not seen any performance issues.
Can anyone explain the cons of creating databases and tables at run time?

Tables are much more complex entities than rows, and managing table creation is much more complex than an insert, which has to abide by an existing model: the table. True, a CREATE TABLE statement is a standard SQL operation, but depending on creating tables dynamically smacks of a bad design decision.
Now, if you just create one or two and that's it, or an entire database dynamically, or from a script once, that might be ok. But if you depend on having to create more and more tables to handle your data you will also need to join more and more and query more and more. One very serious issue I encountered with an app that made use of dynamic table creation is that a single SQL Server query can only involve 255 tables. It's a built-in constraint. (And that's SQL Server, not CE.) It only took a few weeks in production for this limit to be reached resulting in a nonfunctioning application.
And if you get into editing the tables, e.g. adding/dropping columns, then your maintenance headache gets even worse. There's also the matter of binding your db data to your app's logic. Another issue is upgrading production databases. This would really be a challenge if a db had been growing with objects dynamically and you suddenly needed to update the model.
When you need to store data in such a dynamic manner the standard practice is to make use of EAV models. You have fixed tables and your data is added dynamically as rows so your schema does not have to change. There are drawbacks of course but it's generally thought of as better practice.
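To make the EAV idea concrete, here is a minimal in-memory sketch using a DataTable as a stand-in for the fixed database table. The class name EavSketch, the column names and the helper methods are assumptions made up for illustration; a real schema would normally add keys, typed value columns and indexes.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class EavSketch
{
    // One fixed table: each row is a single (entity, attribute, value) triple.
    public static DataTable CreateStore()
    {
        var table = new DataTable("EntityAttributeValue");
        table.Columns.Add("EntityId", typeof(int));
        table.Columns.Add("Attribute", typeof(string));
        table.Columns.Add("Value", typeof(string));
        return table;
    }

    // A new kind of attribute is just another row; the schema does not change.
    public static void Set(DataTable store, int entityId, string attribute, string value)
    {
        store.Rows.Add(entityId, attribute, value);
    }

    // Reassemble one entity by collecting its rows into a dictionary.
    // If an attribute was set more than once, the last row wins.
    public static Dictionary<string, string> Load(DataTable store, int entityId)
    {
        var result = new Dictionary<string, string>();
        foreach (var row in store.Rows.Cast<DataRow>()
                                 .Where(r => (int)r["EntityId"] == entityId))
        {
            result[(string)row["Attribute"]] = (string)row["Value"];
        }
        return result;
    }
}
```

Calling Set(store, 1, "Color", "Red") and then Set(store, 1, "Weight", "10") gives entity 1 two attributes without any CREATE or ALTER TABLE. The drawback mentioned above shows up in Load: every read has to pivot rows back into an object, and all values are stored as strings.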

KMC,
Keep the following points in mind:
If you want to add or remove a column, you may need to change the code and compile it again.
What if the database location changes?
Developers who are not very good with databases can make changes; if you create the schema at the back end, DBAs can take care of it.
If you run into performance issues, they may be tough to debug.

You will need to be a little clearer about what you mean by "creating tables".
One reason to not allow the application to control table creation and deletion is that this is a task that should be handled only by an administrator. You don't want normal users to have the ability to delete whole tables.
Temporary tables are a different story, and you may need to create temporary tables as part of your queries, but your basic database structure should be managed only by someone with the rights to do so.

Sometimes, creating tables dynamically is not the best option security-wise (Google "SQL injection"). It would be better to use stored procedures and have your insert or update operations occur at the database level by executing the stored procedures from code.

Related

Adding Entity Objects to LINQ-to-SQL Data Context at Runtime - SQL, C#(WPF)

I've hit a wall when it comes to adding a new entity object (a regular SQL table) to the Data Context using LINQ-to-SQL. This isn't regarding the drag-and-drop method that is cited regularly across many other threads. This method has worked repeatedly without issue.
The end goal is relatively simple. I need to find a way to add a table that gets created during runtime via stored procedure to the current Data Context of the LINQ-to-SQL dbml file. I'll then need to be able to use the regular LINQ query methods/extension methods (InsertOnSubmit(), DeleteOnSubmit(), Where(), Contains(), FirstOrDefault(), etc...) on this new table object through the existing Data Context. Essentially, I need to find a way to procedurally create the code that would otherwise be automatically generated when you do use the drag-and-drop method during development (when the application isn't running), but have it generate this same code while the application is running via command and/or event trigger.
More Detail
There's one table that gets used a lot and, over the course of an entire year, collects many thousands of rows. Each row contains a timestamp and this table needs to be divided into multiple tables based on the year that the row was added.
Current Solution (using one table)
Single table with tens of thousands of rows which are constantly queried against.
Table is added to Data Context during development using drag-and-drop, so there are no additional coding issues
Significant performance decrease over time
Goals (using multiple tables)
(Complete) While the application is running, use C# code to check if a table for the current year already exists. If it does, no action is taken. If not, a new table gets created using a stored procedure with the current year as a prefix on the table name (2017_TableName, 2018_TableName, 2019_TableName, and so on...).
(Incomplete) While the application is still running, add the newly created table to the active LINQ-to-SQL Data Context (the same code that would otherwise be added using drag-and-drop during development).
(Incomplete) Run regular LINQ queries against the newly added table.
Final Thoughts
Other than the above, my only other concern is how to write the C# code that references a table that may or may not already exist. Is it possible to use a variable in place of the standard 'DB_DataContext.2019_TableName' methodology in order to actually get the table's data into a UI control? Is there a way to simply create an Enumerable of all the tables where the name is prefixed with a year and then select the most current table?
From what I've read so far, the most likely solution seems to involve the use of a SQL add-on like SQLMetal or Huagati which (based solely from what I've read) will generate the code I need during runtime and update the corresponding dbml file. I have no experience using these types of add-ons, so any additional insight into these would be appreciated.
Lastly, I've seen some references to LINQ-to-Entities and/or LINQ-to-Objects. Would these be the components I'm looking for?
Thanks for reading through a rather lengthy first post. Any comments/criticisms are welcome.
The simplest way to achieve what you want is to redirect in SQL Server and leave your client code alone. At design time, create your L2S Data Context or EF DbContext referencing a database with only a single table. Then at run time, substitute a view or synonym for that table that points to the "current year" table.
HOWEVER, this should not be necessary in the first place. SQL Server supports partitioning, so you can store all the data in physically separate data structures but have a single logical table. And SQL Server supports columnstore tables, which can compress and store many millions of rows with excellent performance.
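As a rough sketch of the synonym redirect, the helper below picks the newest year-prefixed table from a list of names and builds the T-SQL that repoints a synonym at it. The class and method names are hypothetical, the dbo schema is assumed, the list of table names is assumed to come from somewhere like INFORMATION_SCHEMA.TABLES, and DROP SYNONYM IF EXISTS needs SQL Server 2016 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class YearTableSketch
{
    // Matches names such as "2019_TableName": a four-digit year, an underscore, a base name.
    private static readonly Regex YearPrefix = new Regex(@"^([0-9]{4})_(\w+)$");
    private static readonly Regex SafeName = new Regex(@"^\w+$");

    // Returns the table with the highest year prefix for the base name, or null if none match.
    public static string LatestTable(IEnumerable<string> tableNames, string baseName)
    {
        return tableNames
            .Select(name => YearPrefix.Match(name))
            .Where(m => m.Success && m.Groups[2].Value == baseName)
            .OrderByDescending(m => int.Parse(m.Groups[1].Value))
            .Select(m => m.Value)
            .FirstOrDefault();
    }

    // Builds the T-SQL that drops and recreates the synonym.
    // Identifiers cannot be passed as SQL parameters, so both names are
    // restricted to word characters before being placed in the statement.
    public static string BuildSynonymSql(string synonymName, string targetTable)
    {
        if (!SafeName.IsMatch(synonymName) || !SafeName.IsMatch(targetTable))
            throw new ArgumentException("Names may contain only letters, digits and underscores.");

        return "DROP SYNONYM IF EXISTS dbo.[" + synonymName + "];\n" +
               "CREATE SYNONYM dbo.[" + synonymName + "] FOR dbo.[" + targetTable + "];";
    }
}
```

The client code keeps querying the one name it was built against at design time, and only the synonym target changes when a new year's table appears.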

c# update single db field or whole object?

This might seem like an odd question, but it's been bugging me for a while now. Given that I'm not a hugely experienced programmer, and I'm the sole application/C# developer in the company, I felt the need to sanity check this with you guys.
We have created an application that handles shipping information internally within our company, this application works with a central DB at our IT office.
We've recently switched DB from MySQL to MSSQL, and during the transition we decided to forgo the web services previously used and connect directly to the DB using an Application Role. For added security we only allow access to stored procedures, and all CRUD operations are handled via these.
However, we currently have a stored procedure for updating every field in one of our objects. That is quite a few stored procedures, and as such quite a bit of work on the client for the DataRepository (needing separate code to call each procedure and pass the right params).
So I'm thinking: would it be better to simply update the entire object (in this case, an object represents a table, for example shipments), given that a lot of that data would be changed one field at a time after the initial insert, and that we are trying to keep the network usage down, as some of the clients will run with limited internet?
What's the standard practice for this kind of thing? Or is there a method that I've overlooked?
I would say that updating all the columns for the entire row is a much more common practice.
If you have a proc for each field, and you change multiple fields in one update, you will have to wrap all the stored procedure calls into a single transaction to avoid the database getting into an inconsistent state. You also have to detect which field changed (which means you need to compare the old row to the new row).
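To show what "detect which field changed" involves, here is a small sketch that compares two copies of an object property by property using reflection. The Shipment class and its properties are hypothetical stand-ins for the real table; with one procedure per field, something like this list would drive which procedures to call inside the transaction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical object representing one row of a shipments table.
public class Shipment
{
    public int Id { get; set; }
    public string Status { get; set; }
    public string Carrier { get; set; }
    public decimal Weight { get; set; }
}

public static class ChangeDetectionSketch
{
    // Returns the names of the public properties whose values differ
    // between the original row and the edited row.
    public static List<string> ChangedProperties<T>(T original, T updated)
    {
        return typeof(T).GetProperties()
            .Where(p => !object.Equals(p.GetValue(original), p.GetValue(updated)))
            .Select(p => p.Name)
            .ToList();
    }
}
```

Updating the whole row with a single procedure avoids both this comparison and the need for a transaction around several calls, at the cost of sending a few more column values per update.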
Look into using an Object Relational Mapper (ORM) like Entity Framework for these kinds of operations. You will find that there is no general consensus on whether ORMs are a great solution for all data access needs, but it's hard to argue that they don't solve the problem of CRUD pretty comprehensively.
Connecting directly to the DB over the internet isn't something I'd switch to in a hurry.
"we decided to forgo the webservices previously used and connect directly to the DB"
What made you decide this?
If you are intent on this model, then a single SPROC to update an entire row would be advantageous over one per column. I have a similar application which uses SPROCs in this way, however the data from the client comes in via XML, then a middleware application on our server end deals with updating the DB.
The standard practice is not to connect to DB over the internet.
Even for small app, this should be the overall model:
Client app -> over internet -> server-side app (WCF WebService) -> LAN/localhost -> SQL DB
Benefits:
your client app would not even know that you have switched DB implementations.
It would not know anything about DB security, etc.
you, as a programmer, would not be thinking in terms of "rows" and "columns" on client side. Those would be objects and fields.
you would be able to use different protocols: send only single field updates between client app and server app, but update entire rows between server app and DB.
Now, given your situation, updating entire row (the entire object) is definitely more of a standard practice than updating a single column.
It's better to only update what you change if you know what you change (if using an ORM like Entity Framework, for example), but if you're going down the stored proc route then yes, definitely update everything in a row at once; that's granular enough.
However, if you're already in the middle of a big change, you should take the switch as an opportunity to move over to LINQ to Entities and ditch stored procedures in the process wherever possible.

Is it a good approach to query the database only through stored procedures?

When I am developing an ASP.NET website I do really like to use Entity Framework with both database-first or code-first models (+ asp.net mvc controllers scaffolding).
For an application requiring to access an existing database, I naturally thought to create a database model and to use asp.net mvc scaffolding to get all the basic CRUD operations done in a few minutes with nearly no development costs.
But I discussed with a friend who told me that accessing data stored in the database only through stored procedures is the best approach to take.
My question is thus, what do you think of this sentence? Is it better to create stored procedures for any required operations on a table in the database (e.g. create and read on this table, update and delete only on another one, ...)? And what are the advantages/disadvantages of doing so instead of using a database-first model created from the tables in the database?
What I thought at first is that it double costs of development to do everything through stored procedures as you have to write these stored procedures where Entity Framework could have provided DbContext in a few clicks, allowing me to use LINQ over Entities, ... But then I've read a few stuff about Ownership Chains that might improve security by setting only permissions to execute stored procedures and no permissions for any operations (select, insert, update, delete) on the tables.
Thank you for your answers.
It's a cost-benefit analysis. Being a DB-focused guy, I would agree with that statement: it is best. It also makes your code easier to read (no crazy SQL statements uglifying it), increases performance through cached execution plans, makes it easy to modify a query without recompiling the code, etc.
Many of the people I work with are not all that familiar with writing SPROCs, so it tends to be a constant fight to get them to use them. Personally I don't see any reason to ever bury SQL statements in your code. They tend to shy away from SPROCs because it is more work for them up front.
Yes, it's a good approach.
Whether it's the best approach or not, that depends on a lot of factors, some of them which you don't even know yet.
One important factor is how much further development there will be, and how much maintenance. If the initial development is a big part of the total job, then you should rather use a method that gets you there as fast and easily as possible.
If you will be working with and maintaining the system for a long time, you should focus less on the initial development time, and more on how easy it is to make changes to the system once it's up and running. Using stored procedures is one way to make the code less dependent on the exact data layout, and allows you to make changes without a lot of down time.
Note that it's not necessarily a choice between stored procedures and Entity Framework. You can also use stored procedures with Entity Framework.
This is primarily an opinion-based question and the answer may depend on the situation. Using stored procedures is definitely one of the best ways to query the database, but since its emergence Entity Framework has also become widely used. The advantage of Entity Framework is that it provides a higher level of abstraction.
Entity Framework applications provide the following benefits:
Applications can work in terms of a more application-centric conceptual model, including types with inheritance, complex members, and relationships.
Applications are freed from hard-coded dependencies on a particular data engine or storage schema.
Mappings between the conceptual model and the storage-specific schema can change without changing the application code.
Developers can work with a consistent application object model that can be mapped to various storage schemas, possibly implemented in different database management systems.
Multiple conceptual models can be mapped to a single storage schema.
Language-integrated query (LINQ) support provides compile-time syntax validation for queries against a conceptual model.
You may also check this related question Best practice to query data from MS SQL Server in C Sharp?
The following are some stored procedure advantages:
Encapsulate multiple statements as a single transaction using a stored procedure
Implement business logic using temp tables
Better error handling by having tables for capturing/logging errors
Parameter validations / domain validations can be done at database level
Control the query plan by forcing the choice of index
Use sp_getapplock to enforce single execution of a procedure at any time
In addition, Entity Framework adds overhead to each request you make, as it uses reflection for each query. By implementing stored procedures you save time, since a procedure is compiled rather than interpreted each time like a normal Entity Framework query.
The link below gives some reasons why you should use Entity Framework:
http://kamelbrahim.blogspot.com/2013/10/why-you-should-use-entity-framework.html
Hope this can enlighten you a bit
So I'm gonna give you a suggestion, and it will be something I've done, but not many would say "I do that".
So, yes, I used stored procedures when using ADO.NET.
I also (at times) use ORM's, like NHibernate and EntityFramework.
When I use ADO.NET, I use stored procedures.
When you get data from the database, you have to turn it into something on the DotNet side.
The quickest thing is to put data into a DataTable or DataSet.
I no longer favor this method. While it may make for RAPID development ("just stuff the data into a datatable")......it does not work well for MAINTENANCE, even if that maintenance is only 2-3 months down the road.
So what do I put the data into?
I create DTO/POCO objects and hydrate the data from the database into these objects.
For example.
The NorthWind database has
Customer(s)
Order(s)
and OrderDetail(s)
So I create C# classes called Customer.cs, Order.cs and OrderDetail.cs.
These ONLY contain properties of the entity. Most of the time, the properties simply reflect the columns in the database for that entity. (Order.cs has properties that mirror a Select * from dbo.Order where OrderID = 123, for example.)
Then I create a child-collection object
public class OrderCollection : List<Order> { }
and then the parent object gets a property.
public class Customer
{
    /* a bunch of scalar properties */
    public OrderCollection Orders { get; set; }
}
So now you have a stored procedure. And it gets data.
When that data comes back, one way to get it is with an IDataReader. (.ExecuteReader).
When this IDataReader comes back, I loop over it, and populate the Customer(.cs), the Orders, and the OrderDetails.
This is basic, poor man's ORM (object relation mapping).
Back to how I code my stored procedures, I would write a procedure that returns 3 resultsets, (one db hit) and return the info about the Customer, the Order(s) (if any) and the OrderDetails(s) (if any exist).
Note that I do NOT do a lot of JOINING.
When you do a "Select * from dbo.Customer c join dbo.Orders o on c.CustomerID = o.CustomerId", you'll note you get redundant data in the first columns. This is what I do not like.
I prefer multiple resultsets OVER joining and bringing back a single resultset with redundant data.
Now for the little special trick.
Whenever I select from a table, I always select all columns on that table.
So whenever I write a stored procedure that needs customer data, I do a
Select A,B,C,D,E,F,G from dbo.Customer where (......)
Now, alot of people will argue that. "Why do you bring back more info than you need?"
Well, real ORM's do this anyway. So I am poor-man reflecting this.
And, my code for taking the resultset(s) from the stored procedure to turn that into instances of objects........stays consistent.
Because if you write 3 stored procedures, and each one selects data from the Customer table, BUT you select different columns and/or in a different order, your "object mapper" code needs to have a method for each stored procedure.
This method of ADO.NET has served me well.
Once, my team swapped out ADO.NET for a real ORM, and that transition was nearly painless because of the way we had done the ADO.NET from the get-go.
Quick rules of thumb:
1. If using ADO.NET, use stored procedures.
2. Get multiple result-sets, instead of redundant data via joins.
3. Make your columns consistent from any table you select from.
4. Take the results of your stored procedure call, and write a "hydrater" to take that info and put into your domain-model as soon as you can. (the .cs classes)
That has served me well for many years.
Good luck.
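The "hydrater" described in the answer above could be sketched roughly as follows. This is only an illustration under assumptions: the class names, the column names and the FakeProcedureResult helper are made up, only two of the three result sets are shown, and a plain List<Order> is used in place of OrderCollection. An in-memory DataSet stands in for the stored procedure so the sketch can run without a database; with real ADO.NET the reader would come from ExecuteReader on a stored procedure command.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class Order
{
    public int OrderId { get; set; }
    public string CustomerId { get; set; }
}

public class Customer
{
    public string CustomerId { get; set; }
    public string CompanyName { get; set; }
    public List<Order> Orders { get; } = new List<Order>();
}

public static class HydraterSketch
{
    // Expects two result sets in a fixed order: customers first, then orders.
    public static List<Customer> Hydrate(IDataReader reader)
    {
        var customers = new List<Customer>();
        var byId = new Dictionary<string, Customer>();

        while (reader.Read())
        {
            var customer = new Customer
            {
                CustomerId = reader.GetString(reader.GetOrdinal("CustomerID")),
                CompanyName = reader.GetString(reader.GetOrdinal("CompanyName"))
            };
            customers.Add(customer);
            byId[customer.CustomerId] = customer;
        }

        // Move to the second result set and attach each order to its parent.
        if (reader.NextResult())
        {
            while (reader.Read())
            {
                var order = new Order
                {
                    OrderId = reader.GetInt32(reader.GetOrdinal("OrderID")),
                    CustomerId = reader.GetString(reader.GetOrdinal("CustomerID"))
                };
                Customer owner;
                if (byId.TryGetValue(order.CustomerId, out owner))
                    owner.Orders.Add(order);
            }
        }
        return customers;
    }

    // In-memory stand-in for a procedure that returns two result sets (sample values only).
    public static IDataReader FakeProcedureResult()
    {
        var ds = new DataSet();
        var customers = ds.Tables.Add("Customer");
        customers.Columns.Add("CustomerID", typeof(string));
        customers.Columns.Add("CompanyName", typeof(string));
        customers.Rows.Add("C1", "Example Co");
        customers.Rows.Add("C2", "Sample Ltd");

        var orders = ds.Tables.Add("Orders");
        orders.Columns.Add("OrderID", typeof(int));
        orders.Columns.Add("CustomerID", typeof(string));
        orders.Rows.Add(1001, "C1");
        orders.Rows.Add(1002, "C1");
        orders.Rows.Add(1003, "C2");

        // One result set per table, in the order the tables were added.
        return ds.CreateDataReader();
    }
}
```

Because every procedure that returns customers selects the same columns, the one Hydrate method can be reused for all of them, which is the point of rule 3 above.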
In my opinion :
Stored Procedures are written in big iron database "languages" like PL/SQL or T-SQL
Stored Procedures typically cannot be debugged in the same IDE you write your UI in.
Stored Procedures don't provide much feedback when things go wrong.
Stored Procedures can't pass objects.
Stored Procedures hide business logic.
Source :
http://www.codinghorror.com/blog/2004/10/who-needs-stored-procedures-anyways.html

What's the purpose of Datasets?

I want to understand the purpose of datasets when we can directly communicate with the database using simple SQL statements.
Also, which way is better? Updating the data in dataset and then transfering them to the database at once or updating the database directly?
I want to understand the purpose of datasets when we can directly communicate with the database using simple SQL statements.
Why do you have food in your fridge, when you can just go directly to the grocery store every time you want to eat something? Because going to the grocery store every time you want a snack is extremely inconvenient.
The purpose of DataSets is to avoid directly communicating with the database using simple SQL statements. The purpose of a DataSet is to act as a cheap local copy of the data you care about so that you do not have to keep on making expensive high-latency calls to the database. They let you drive to the data store once, pick up everything you're going to need for the next week, and stuff it in the fridge in the kitchen so that it's there when you need it.
Also, which way is better? Updating the data in dataset and then transfering them to the database at once or updating the database directly?
You order a dozen different products from a web site. Which way is better: delivering the items one at a time as soon as they become available from their manufacturers, or waiting until they are all available and shipping them all at once? The first way, you get each item as soon as possible; the second way has lower delivery costs. Which way is better? How the heck should we know? That's up to you to decide!
The data update strategy that is better is the one that does the thing in a way that better meets your customer's wants and needs. You haven't told us what your customer's metric for "better" is, so the question cannot be answered. What does your customer want -- the latest stuff as soon as it is available, or a low delivery fee?
DataSets support a disconnected architecture. You can add local data, delete from it, and then commit everything to the database using a SqlDataAdapter. You can even load an XML file directly into a DataSet, and you can set up in-memory relations between the tables in a DataSet. It really depends on what your requirements are.
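A small sketch of the disconnected model, with no database involved: the table names, column names and helper methods below are assumptions for illustration. The DataSet tracks edits locally through each row's RowState, and in a real application a data adapter's Update call would later push those pending rows to the server.

```csharp
using System;
using System.Data;

public static class DataSetSketch
{
    // Builds a local copy with two tables and an in-memory parent/child relation.
    public static DataSet BuildLocalCopy()
    {
        var ds = new DataSet("Shop");

        var customers = ds.Tables.Add("Customer");
        var customerKey = customers.Columns.Add("CustomerId", typeof(int));
        customers.Columns.Add("Name", typeof(string));

        var orders = ds.Tables.Add("Order");
        orders.Columns.Add("OrderId", typeof(int));
        var orderCustomer = orders.Columns.Add("CustomerId", typeof(int));

        ds.Relations.Add("CustomerOrders", customerKey, orderCustomer);

        customers.Rows.Add(1, "Alice");
        orders.Rows.Add(100, 1);
        orders.Rows.Add(101, 1);

        // Mark everything as unchanged, as if it had just been loaded from the database.
        ds.AcceptChanges();
        return ds;
    }

    // Counts rows that were added, modified or deleted since the last AcceptChanges.
    public static int CountPendingChanges(DataSet ds)
    {
        int pending = 0;
        foreach (DataTable table in ds.Tables)
            foreach (DataRow row in table.Rows)
                if (row.RowState != DataRowState.Unchanged)
                    pending++;
        return pending;
    }
}
```

After BuildLocalCopy, changing a customer's Name and adding one order leaves two pending rows, and GetChildRows("CustomerOrders") on the customer row returns its orders without touching the database.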
And by the way, embedding direct SQL queries in your application is a really poor way of designing an application. Your application will be prone to SQL injection. Secondly, if you embed queries like that, SQL Server has to work out an execution plan every time, whereas stored procedures are compiled and their execution plan is already decided at that point. SQL Server can also change the plan as the data grows. You will get a performance improvement from this. At least use stored procedures and validate junk input in them. They are inherently resistant to SQL injection.
Stored Procedures and Dataset are the way to go.
See this diagram:
Edit: If you are into .Net framework 3.5, 4.0 you can use number of ORMs like Entity Framework, NHibernate, Subsonic. ORMs represent your business model more realistically. You can always use stored procedures with ORMs if some of the features are not supported into ORMs.
For example, if you are writing a recursive CTE (Common Table Expression), stored procedures are very helpful. You will run into too many problems if you use Entity Framework for that.
This page explains in detail in which cases you should use a Dataset and in which cases you use direct access to the databases
My usual practice is that, if I need to perform a bunch of analytical processes on a large set of data, I fill a dataset (or a datatable, depending on the structure). That way it is a disconnected model from the database.
But for DML queries I prefer the quick hits directly to the database (preferably through stored procs). I have found this is the most efficient, and with well tuned queries it is not bad at all on the db.

When to use Stored Procedures instead of using any ORM with programming logic?

Hi all I wanted to know when I should prefer writing stored procedures over writing programming logic and pulling data using a ORM or something else.
Stored procedures are executed on server side.
This means that processing large amounts of data does not require passing these data over the network connection.
Also, with stored procedures, you can build consistent complicated business logic.
Say, you need to update the account balance each time you insert a transaction, and you need to insert many transactions at once.
Instead of doing this with triggers (which are implemented using inefficient record-by-record approach in many systems), you can pass a table variable or temporary table with the inputs and issue a set-based SQL statement inside the procedure. This will be much more efficient.
I prefer SPs over programming logic mainly for two reasons
Performance: anything that reduces the result set or can be done more effectively on the server, e.g.:
paging
filtering
ordering (on indexed columns)
Security: if someone has gained the application's access to the database and wants to wipe out all your records, having to execute Row_Delete for every single one of them instead of DELETE FROM Rows already sounds good.
Never unless you identify a performance issue. (largely opinion)
(a Jeff blog post!)
http://www.codinghorror.com/blog/2004/10/who-needs-stored-procedures-anyways.html
If you see stored procs as optimizations:
http://en.wikipedia.org/wiki/Program_optimization#When_to_optimize
When appropriate.
complex data validation/checking logic
avoid several round trips to do one action in the DB
several clients
anything that should be set based
You can't say "never" or "always".
There is also the case where the database engine will outlive your client code. I bet there are more DAL or ORM upgrades/refactorings than DB engine upgrades/refactorings going on.
Finally, why can't I encapsulate code in a stored proc? Isn't that a good thing?
As ever, much of your decision as to which to use will depend on your application and its environment.
There are a couple of schools of thought here, and this debate always arouses strong sentiments on both sides.
The advantages of Stored Procedures (as well as the large data moving that Quassnoi has mentioned) are that the logic is tied down in the database, and therefore potentially more secure. It is also only ever in one place.
However, there will be others who believe that the place for application logic should be in the application, especially if you are planning to access other types of databases (for which you will often have to write different SPs).
Another consideration may be the skills of the resources you have to implement your application.
The point at which stored procedures become preferable to an ORM is that point at which you have multiple applications talking to the same database. At this point, you want your query logic embedded in one place, rather than once per application. And even here, you might want to prefer a service layer (which can scale horizontally) instead of the database (which only scales vertically).
