Where to parse data - C#

In the following scenario:
A form-based Windows application written in C#
A GUI that contains only interface-related code but instantiates a class for the functional logic
A SQL class that stores data to and loads data from SQL
Data classes that store data in a format compatible with the database
I want to load data via the sql class and store it in the data classes. What would be the best way to do it?
Option 1: The functional class instantiates a SQL class, calls the query function, and passes the returned values to a new data class instance via a constructor.
Option 2: The SQL class contains a function that converts the query result to the data class, and thus returns the data in the correct format to the functional class.
Option 3: The data class contains a method that parses the query result directly, so the functional logic only has to call the parse.
Option 1 seems the most fitting in this structure. My experience writing larger applications is rather limited, so I'd like to know how someone with more experience thinks about this.
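For concreteness, here is a minimal sketch of what Option 1 looks like; the class and column names are invented. The SQL class would hand back something raw (here a DataTable) and the functional class does the mapping:

```csharp
using System.Collections.Generic;
using System.Data;

// Data class: stores a row in a database-compatible shape.
public class Customer
{
    public int Id { get; }
    public string Name { get; }

    public Customer(int id, string name)
    {
        Id = id;
        Name = name;
    }
}

// Functional class: asks the SQL class for raw rows, then builds data objects.
public class CustomerLogic
{
    public List<Customer> LoadCustomers(DataTable rawRows)
    {
        // Option 1: the functional layer knows the column layout and
        // passes the raw values into the data class constructor.
        var customers = new List<Customer>();
        foreach (DataRow row in rawRows.Rows)
            customers.Add(new Customer((int)row["Id"], (string)row["Name"]));
        return customers;
    }
}
```

Note that the column names ("Id", "Name") leak into the functional layer here, which is what the answer means by the database "bleeding into the application".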

Someone with more experience uses an ORM (object relational mapper) to do this for them.
Or potentially writes a data access layer that encapsulates the database access and passes data transfer objects to the application code.
Option 1 feels like the database is bleeding into the application. Separation of concerns is very important in large applications.
Option 2 sounds like the start of writing your own Data Access Layer.
Option 3 will pollute the data transfer objects with knowledge of the relational database structure.
Think very hard about picking up an ORM.
Entity Framework, NHibernate, SubSonic, Massive, Dapper: there are plenty to choose from. The main thing is that this is a problem that has already been solved many times by many developers with a lot of experience. You can leverage their learning and speed up your development.
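To give a taste of how small the mapping code becomes: with a micro-ORM such as Dapper, the entire "run the query and map it to the data class" step collapses to a few lines. This is only a sketch; the connection string, table, and class names are invented, and it assumes the Dapper NuGet package:

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;
using Dapper;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CustomerStore
{
    public static IEnumerable<Customer> LoadAll(string connectionString)
    {
        using (var cn = new SqlConnection(connectionString))
        {
            // Dapper's Query<T> runs the SQL and maps each row to a Customer
            // by matching column names to property names. The result is
            // buffered by default, so it survives disposing the connection.
            return cn.Query<Customer>("SELECT Id, Name FROM Customers");
        }
    }
}
```

None of the three options in the question has to be hand-written once a tool like this does the mapping for you.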

Related

How to build dynamic integration adapter?

We have a scenario where we have multiple sources of data coming in from various external systems through API calls, SQL tables and physical files, and we now have to map them against a number of transaction templates. I want to build an integration adapter and UI where I can choose any entity data class and map its fields to a class or action that will be used to create a transaction in our financial system.
I want to have an object type or class that can be modified dynamically, set up links between these objects, and possibly create a set of rules that defines the interaction between these objects. I have seen some versions of this type of software that use a drag-and-drop UI to do the mappings, so that would be the ideal end goal.
I'm coming from a C#/.NET background, so I need some advice or tips on where to start and what to look at.
I am currently doing something similar. I wrote some code to turn data from our legacy system into JSON objects written out to flat files (1 file per data record in a table), and then wrote some code to cleanup that data and push it into Acumatica via the REST API.
I like flat-file JSON objects because they can easily be hashed, and the hashes used to compare them to new data that comes in. Only data where the hash has changed needs to be merged or overwritten and then pushed into the target system. The file names are usually primary key values from whatever table you're working with. Our legacy system has a hierarchical (non-tabular, non-SQL-like) data structure, so your mileage may vary when doing this with a well-normalized SQL database.
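The hash-and-compare step can be as simple as the following sketch. It assumes the record has already been serialized deterministically (stable key order), since any difference in serialization would show up as a false "change":

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ChangeDetector
{
    // Hash the serialized record; store this alongside the flat file.
    public static string Hash(string serializedRecord)
    {
        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(serializedRecord));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    // Only records whose hash differs from the stored one need to be
    // merged and pushed into the target system.
    public static bool NeedsSync(string incomingRecord, string storedHash)
        => !Hash(incomingRecord).Equals(storedHash, StringComparison.OrdinalIgnoreCase);
}
```

Unchanged records are skipped entirely, which is what keeps the per-run cost proportional to the amount of actual change.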
There are also products like Alteryx that are built for doing data pipelines the way you have proposed.
I would caution you to be practical when building these types of things. For us, for example, there is a limited set of data that needs to come over, so we don't need the perfect abstraction for every data type. We inevitably had to do cleanup on legacy/3rd-party data as well, and those problems aren't always so easy to abstract. I had previously built a system using closures for function passing in order to write custom cleanup routines for any abstract data problem I might encounter (which sort of sounds like what you're talking about), but realized in the end that just writing simpler code that deals with specific data problems was cleaner and easier to maintain. In the end, there is probably only a limited amount of data that needs to be synced.

Should I call a single sql statement to retrieve data for a service or different distinct calls

I have a requirement where I need to return some data from a web service (in .NET). The response of the service would be a class having different attributes. Some of the attributes are complex, and fetching the data for those types needs a separate call.
Ex:
class GetServiceData
{
    public int SimpleProperty1 { get; set; }
    public int SimpleProperty2 { get; set; }
    public IEnumerable<ComplexProperty> ComplexProperties { get; set; }
    ...
}
For getting the data for such a class, I need to call the Oracle database. My question is whether I should bring all the data for the service back in a single SQL statement (please note the structure involves some hierarchy) or call separate statements for different attributes of the class.
thanks in advance
Hierarchical queries done by the client probably involve many round trips between the application code and the server (at least if you do not use PL/SQL).
Since Oracle is perfectly capable of handling hierarchical queries in one statement, I recommend using that feature. It is much more performant, IMO does not put more load on the server than multiple regular queries, and produces the same results.
The downsides I see are:
If you use some mapping library, you might need to use native statements and therefore lose the support of the framework.
The query might be complex and not easily understandable for people who later have to maintain your code.
The code is not easily portable to other databases if you decide to switch. Since Oracle's hierarchical (CONNECT BY PRIOR) feature came to market quite early, it has little in common with the recursive CTE queries that other DBMSs support.
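For illustration, here is the shape of such a single-statement hierarchical query; the table and column names are invented. On the client side it is executed as one command through your Oracle provider, so the whole tree comes back in a single round trip:

```csharp
// Shown as query text only, since the point is the single-statement shape;
// it would be executed through the Oracle provider (e.g. an OracleCommand).
public static class HierarchyQueries
{
    // Walks the employee tree from the root (no manager) downward.
    // LEVEL is Oracle's pseudo-column giving the depth of each row.
    public const string EmployeeTree =
        @"SELECT employee_id, manager_id, LEVEL
            FROM employees
           START WITH manager_id IS NULL
         CONNECT BY PRIOR employee_id = manager_id";
}
```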
If performance is the biggest criterion, then I would suggest you pull all the data in a single query.
Having Oracle return the data directly is easier than using a wrapper that returns the data in parts.
However, be careful if you are using a stored procedure with a lot of processing in it, as this can become difficult to debug and tends to become a black box during testing (it hides the processing from the code).

Designing Data Access Layer using C#, SQL Server

OK, as the title suggests I am designing a data access layer for a survey framework I am currently working on.
We are all familiar with the layered-architecture notion: we try to achieve separation between the layers so that the presentation layer can be hooked to any business layer, and in the same way the business layer can be wired to any data access layer, regardless of its implementation, as long as it maintains the same interface (the same methods).
Now, after building the database using SQL Server, I am building the DAL using a DataSet (*.xsd) file and in this file I create the methods for each table adapter and the corresponding stored procedures in the database.
After working for a little while with the DataSet visual designer in Visual Studio, I have noticed that I am aiming at providing a very flexible API that exposes every possible query to the user in the form of methods. For example, I want to provide methods that perform retrieval operations on the tables using any possible filter (or no filter at all); I also want the user to be able to delete rows using any column as a filter, and to update all or individual fields using any column as a filter.
The way I have accomplished this so far is by creating a method for every possible query, whether it is DDL or DML. Now, when I think that I might have made a mistake in a certain method, or want to check the methods to make sure I did not miss anything while typing quickly, it seems like a pain, because I have ended up with a ton of methods.
So, my question is: Is there another way for designing the data access layer so that it can be easy to refactor the methods and create them?
I hope I did not elaborate too much, but I wanted to put you in the picture so I can get the correct answer. Thanks in advance.
Well, you could use an ORM tool to provide a good data access layer. With an ORM tool you will have support for most of the popular databases, such as SQL Server, Oracle, MySQL, PostgreSQL, etc.
Depending on which ORM tool you use, you do not have to write SQL statements, which means you will be less prone to errors in your queries.
I recommend you check out a tool called NHibernate. With this ORM you can write queries using LINQ, plus another API (specific to NHibernate) called QueryOver. You will have a lot of flexibility to write dynamic queries.
With an ORM tool you could implement the Repository pattern and create methods and queries to get data access working.
When you use something like this, you get the benefits of Visual Studio, like refactoring, because LINQ and QueryOver are strongly typed. But you also have HQL, which is like a SQL statement.
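To make the Repository pattern concrete, here is a sketch of the shape such a layer tends to take (the names are illustrative). A real implementation would wrap an NHibernate ISession rather than a dictionary, and would typically take Expression<Func<T, bool>> so the predicate can be translated into SQL, but the interface is what the rest of the application sees:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The business layer programs against this interface only.
public interface IRepository<T>
{
    T GetById(int id);
    IEnumerable<T> Find(Func<T, bool> predicate);
    void Add(T entity);
    void Remove(T entity);
}

// In-memory stand-in so the sketch is self-contained; a real
// implementation would delegate each call to the ORM session.
public class InMemoryRepository<T> : IRepository<T>
{
    private readonly Dictionary<int, T> _items = new Dictionary<int, T>();
    private readonly Func<T, int> _idSelector;

    public InMemoryRepository(Func<T, int> idSelector) { _idSelector = idSelector; }

    public T GetById(int id) => _items[id];
    public IEnumerable<T> Find(Func<T, bool> predicate) => _items.Values.Where(predicate);
    public void Add(T entity) => _items[_idSelector(entity)] = entity;
    public void Remove(T entity) => _items.Remove(_idSelector(entity));
}
```

One Find method with a predicate replaces the "one method per possible filter" explosion described in the question.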
Check this article: Why I don't use DataSets

Database handling in applications

This is a bit of difficult question to ask but any feedback at all is welcome.
I'll start with the background: I am a university student studying software engineering. Last year we covered C#, and I got myself a job at a software house coding prototype software in C# (their main language is C++ using Qt). After producing the prototype, it was given to some clients, who have all passed back positive feedback.
Now I am looking at the app and thinking I could use this as a showcase with my CV, especially as the clients who used the software have said that they will sign something to reference it.
So if I am going to do that, then I had better get it right and do it to the best of my ability. I have started to look at where I can improve it, and one of the areas I think I can improve is the way it handles the database connections and the data along with them.
The app itself runs alongside a MySQL server, and there are 6 different schemas it gets its data from.
I have written a class (I called it DatabaseHandler) which holds the MySqlConnection (one question I had was whether the connection should remain open the whole time the app is running, or be opened, used to fire a query, then closed, etc.). Inside this class I have written a method which takes some arguments and creates its query string, then does the whole mysqlDataReader = cmd.ExecuteReader() bit and returns the reader back to wherever it was called from.
After speaking to a friend, he mentioned it might be nicer if the method returned the raw data and not the reader, thereby keeping all the database "stuff" away from the main app.
After playing around, I managed to find a couple of tutorials on putting the reader data into arrays and ArrayLists and passing those back; I also had a go at passing back an ArrayList of Hashtables. These approaches obviously mean that the developer must know the column names in order to find the correct data.
Then I stumbled across a page about creating a class with properties for the column names and building a list from which you could pull your data:
http://zensoftware.org/archives/248 is the link
So this made me think: in order to use this method, would I need to create 6 classes with properties for the columns of my tables (a couple of the tables have up to 10-15 columns)? Or is there a better way for me to handle my data?
I am not really clued up on these things but if pointed in the right direction I am a very fast learner :)
Again, I thank you for any input whatsoever.
Vade
You have a lot of ideas that are very close; these are pretty common problems, and it's good that you are actively thinking about how to handle them!
On the question of leaving the connection open for the whole program versus only having it open during the actual query: the common (and proper) way is to have the connection open only for as long as you need it, so
using (MySqlConnection cn = new MySqlConnection(yourConnectionString))
{
    cn.Open();
    // Execute your queries here
}   // Dispose() closes the connection, even if an exception is thrown
This is better since you don't risk leaving connections open, or having transaction issues tying up the database and its resources.
On returning just the data and not the actual data reader: this is a good idea, but by returning the data as an ArrayList or similar you lose some of the structure of the data.
A good way to do this would be either to have your class take the data reader and populate its own fields, OR to have the data layer return an instance of your class after reading the data.
I believe that it would be an excellent approach if your data access class returned a custom class populated with data from the database. That would be object-oriented. Instead of, say, returning a DataSet or an array containing customer information, you would create a Customer class with properties. Then, when you retrieve the data from the database, you populate an instance of the Customer class with the data, and return it to the calling code.
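A sketch of that second variant, where the data layer returns populated instances; the Customer class and the column names are invented for illustration:

```csharp
using System.Collections.Generic;
using System.Data;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CustomerMapper
{
    // The only place in the app that knows the column names.
    public static List<Customer> ReadAll(IDataReader reader)
    {
        var result = new List<Customer>();
        while (reader.Read())
        {
            result.Add(new Customer
            {
                Id = reader.GetInt32(reader.GetOrdinal("Id")),
                Name = reader.GetString(reader.GetOrdinal("Name"))
            });
        }
        return result;
    }
}
```

The calling code just gets a List<Customer> back and never touches the reader, the connection, or the column names.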
A lot of the newer Microsoft technologies are focusing on making this task easier. Quite often, there are many more than 6 classes needed, and writing all that code can seem like drudgery. I would suggest that, if you are interested in learning about those newer approaches, and possibly adapting them to your own code, you can check out Linq to SQL and Entity Framework.
one question was about if the connection should remain open the whole
time the app is running, or open it fire a query then close it etc
You want to keep the connection open as little as possible. Therefore you should open on each data request and close it as soon as you are done. You should also dispose it but if your database stuff is inside a C# using statement that happens automatically.
As far as the larger question on how to return the data to your application you are on the right track. You typically want to hide the raw database from the rest of your application and mapping the raw data to other intermediate classes is the correct thing to do.
Now, how you do this mapping is a very large topic. Ideally you don't want to create classes that map one-to-one to your tables/columns, but rather provide your app a more app-friendly representation of the data (e.g., business objects rather than database tables). For example, if your employee data is split into two or three tables for normalization purposes, you can hide this complexity and present the information as a single Employee class that binds the data from those tables together.
Abstracting away your data access code using objects is known as Object/Relational mapping. It's actually a much more complex task than it appears at first sight. There are several libraries, even in the framework itself, that already do very well what you're trying to implement.
If your needs are very simple, look into typed DataSets. They let you create the table classes in a designer and also generate objects that will do the loading and saving for you (given certain limitations)
If your needs are less simple, but still pretty simple, I recommend you take a look at Linq To SQL to see if it covers your needs, as it does table-class mapping in a very straightforward way and uses a more modern usage pattern than DataSets.
There are also more complex ORMs that allow you to define more complex mappings, like Entity Framework or nHibernate, but very often their complexity is not necessary.
Details like connection lifetime will then depend on your specific needs. Sometimes it's best to keep the connection open, if you have a lot of queries caused by user interaction, like is usually the case with a desktop app. Other times it's best to keep them as short as possible to avoid congestion, like the case of web apps.
Whichever technology you choose will likely end up guiding you onto a good set of practices for it, and the best you can do is try things out and see which works best for you.

Generating Clean Business Object Classes from a horrible data source

I'm starting with a blank slate for a layer of business/entity classes, but with an existing data source. Normally this would be cake: fire up Entity Framework, point it at the DB, and call it a day. But for the time being, I need to get the actual data from a third-party vendor data source that...
Can only be accessed from a generic ODBC provider (not SQL)
Has very bad syntax, naming, and in some cases, duplicate data everywhere
Has well over 100 tables, which, when combined, would be in the neighborhood of 1,000 columns/properties of data
I only need read-only from this DB, so I don't need to support updates / inserts / deletes, just need to extract the "dirty" data into clean entities once per application run.
I'd like to isolate the database as much as possible from my nice, clean, well-named entity classes.
Is there a good way to:
Generate the initial entity classes from the DB tables? (That way I'm just mostly renaming and fixing class properties to sanitize it instead of starting from scratch.)
Map the data from the DB into my nice, clean classes without writing 1000 property sets?
Edit: The main goal here is not as much to come up with a pseudo-ORM as it is to generate as much of the existing code as possible based on what's already there, and then tweak as needed, eliminating a lot of the manual labor-intensive class writing tasks.
I like to avoid auto-generating classes from database schemas, just to have more control over the structure of the entities; besides, what makes sense for a database structure doesn't always make sense for a class structure. For 3rd-party or legacy systems, we use an adapter pattern for our business objects, converting the old or poorly structured format (be it in a database, flat files, other components, etc.) into something more suitable.
That being said, you could create views or stored procedures to present the data in a manner more suitable to your needs than the database's current structure. This assumes that you are allowed to touch the database.
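For the "map without writing 1000 property sets" part, a name-based reflection copy gets you most of the way. Here is a minimal sketch: matching is by property name and assignable type, so renamed or restructured columns still need an explicit mapping, which is where the cleanup logic lives anyway:

```csharp
using System.Reflection;

public static class EntityMapper
{
    // Copies source properties onto a new TClean wherever names and types
    // line up; unmatched target properties are left at their defaults.
    public static TClean MapByName<TClean>(object dirtySource) where TClean : new()
    {
        var clean = new TClean();
        foreach (PropertyInfo target in typeof(TClean).GetProperties())
        {
            PropertyInfo source = dirtySource.GetType().GetProperty(target.Name);
            if (source != null && target.CanWrite &&
                target.PropertyType.IsAssignableFrom(source.PropertyType))
            {
                target.SetValue(clean, source.GetValue(dirtySource));
            }
        }
        return clean;
    }
}
```

Since the load is read-only and happens once per run, the cost of reflection here is usually negligible compared to the ODBC round trips.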
Dump the database. I mean, redesign the schema, migrate your data and you should be good.
