I have an application which queries a local store of data (currently backed by an XML file), using Linq to Objects. Periodically, another thread in the application will query a remote server for updated data, and if it exists, will download all of the remote data, deserialise it and replace the local objects with newly deserialised ones before saving the new XML to disk.
I have decided to replace the XML file with a SQLite database, and I intend to use Entity Framework to interact with it. This has prompted me to take another look at the way external changes are applied, and I've decided that an entity will only be updated when the remote entity's updated_at property is newer than the local entity's (rather than the current approach of replacing the whole data set).
So I must write a method to download the external changes and update or insert the relevant entities into the SQLite database.
What I don't understand is where, in architectural terms, this method should sit. My (potentially naive) thinking is that a generic UpdateFromRemoteObjects<T>(List<T> updatedItems) method could sit in the DbContext class, and would accept a list of entities and update the appropriate DbSet. But this feels like it may be too closely coupled to the DbContext. Should I use a repository to provide a layer to implement this? Or is another application architecture more appropriate?
Many people start with CRC when designing components: Classes have Responsibilities and Collaborators.
First consider the single responsibility principle: a class with two or more responsibilities is probably doing too much. This is your reason for not putting the method on the DbContext: this updating stuff is a new distinct responsibility, so create a class for it.
I can see this class doing two things: QueryRemoteServerForChanges and UpdateLocalObjects.
Now consider its Collaborators. It seems to need two: an instance of DbContext for the local changes, and an instance of whatever gives access to the remote data.
So: not a repository, and not a layer, but definitely a class with a single responsibility.
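A minimal sketch of what such a class might look like (IRemoteDataClient and IHasUpdatedAt are invented names, and LocalContext stands in for your DbContext subclass; none of these come from the question):

    using System;
    using System.Collections.Generic;

    public interface IHasUpdatedAt
    {
        int Id { get; }
        DateTime UpdatedAt { get; }
    }

    public interface IRemoteDataClient
    {
        List<T> QueryChanges<T>() where T : class;
    }

    public class RemoteUpdater<T> where T : class, IHasUpdatedAt
    {
        private readonly LocalContext _context;      // collaborator 1: the DbContext
        private readonly IRemoteDataClient _remote;  // collaborator 2: remote data access

        public RemoteUpdater(LocalContext context, IRemoteDataClient remote)
        {
            _context = context;
            _remote = remote;
        }

        // Responsibility 1: QueryRemoteServerForChanges
        public void Synchronise()
        {
            List<T> updatedItems = _remote.QueryChanges<T>();
            UpdateLocalObjects(updatedItems);
        }

        // Responsibility 2: UpdateLocalObjects, an upsert keeping the newer version
        private void UpdateLocalObjects(IEnumerable<T> updatedItems)
        {
            foreach (var item in updatedItems)
            {
                var local = _context.Set<T>().Find(item.Id);
                if (local == null)
                    _context.Set<T>().Add(item);                          // insert
                else if (item.UpdatedAt > local.UpdatedAt)
                    _context.Entry(local).CurrentValues.SetValues(item);  // update
            }
            _context.SaveChanges();
        }
    }

The class is also trivially testable: hand it a fake IRemoteDataClient and a test context, and neither collaborator needs to know the other exists.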
Related
What is the "database model"? Is this an appropriate class to contain methods for reading and writing methods to the database?
After studying MVC a bit, I'm confident in saying that the Model portion is where I should be communicating with the database. I currently have my entity classes (such as the classic "Person") and a class called DatabaseModel which has public methods for executing SQL queries on the database.
Then, in other classes in my controller, I create a DatabaseModel object, and execute the public methods within that class to retrieve SQL query results.
Am I approaching this correctly? Also, on a side note, I have a feeling this DatabaseModel class is going to become very large. Is there a good strategy for breaking it up (possibly by related queries)? I thought of dividing it into partial classes in C#, but that's my best guess right now.
The database model is the mapping structure for your database schema; in this case, that is the entity class. With that said, to interact with or query your database there are two common patterns you can follow. The first is the DAO pattern: simply put, a class containing the main CRUD query operations (Save, Update, and Delete) for each entity. That means if you have a PersonEntity, you should have a PersonDAO.
The second is the Repository pattern, which you will find in most mature frameworks. It supplies ready-made CRUD methods you can use directly, so unlike the DAO pattern there is no need to write the definition code for each operation yourself.
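To make the contrast concrete, here is a rough sketch of the DAO flavour (the Person table and its columns are invented for the example):

    using System.Data.SqlClient;

    // Hand-rolled DAO: one class per entity, one method per operation.
    public class PersonDao
    {
        private readonly string _connectionString;

        public PersonDao(string connectionString)
        {
            _connectionString = connectionString;
        }

        public void Save(PersonEntity person)
        {
            using (var connection = new SqlConnection(_connectionString))
            using (var command = new SqlCommand(
                "INSERT INTO Person (Name, Age) VALUES (@name, @age)", connection))
            {
                command.Parameters.AddWithValue("@name", person.Name);
                command.Parameters.AddWithValue("@age", person.Age);
                connection.Open();
                command.ExecuteNonQuery();
            }
        }

        // Update(person) and Delete(id) follow the same hand-written pattern;
        // this per-operation boilerplate is exactly what a framework's
        // Repository implementation saves you from writing.
    }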
As for the amount of code in each class: it depends, but you should follow clean-code guidelines. No single class should try to do everything; if one grows past, say, 400 lines, that is usually a sign it needs splitting, unless you genuinely need it that large.
After studying MVC a bit, I'm confident in saying that the Model portion is where I should be communicating with the database. I currently have my entity classes (such as the classic "Person") and a class called DatabaseModel which has public methods for executing SQL queries on the database.
Actually, I disagree; this is the misconception I had when I started. To me, M is the mapping to the database: it is the database itself, using either Entity Framework or some other framework of your choice, so you can create a database using a code-first approach without ever touching SQL.
Then, in other classes in my controller, I create a DatabaseModel object, and execute the public methods within that class to retrieve SQL query results.
This should be a job for your repository: that is where you retrieve SQL query results. Please keep in mind that the repository is not the actual controller.
Am I approaching this correctly? Also, on a side note, I have a feeling this DatabaseModel class is going to become very large. Is there a good strategy for breaking it up (possibly by related queries)? I thought of dividing it into partial classes in C#, but that's my best guess right now.
You are almost there. What you have right now is the Model (the database mapping) and the Repository (where you execute queries and retrieve data). But you are missing the Controller, where you obtain the repository via dependency injection and do the actual work, for example mapping the data to your ViewModel. You are also missing the ViewModel, where you shape the data so the view only sends and receives exactly what it needs. This might seem pointless at first (why not just send the data straight to the view?), but it comes with many benefits: you deal only with what you actually need, and you can verify user input (client-side validation) before mapping it back to data. And then, finally, there is the View: your display.
So to me, MVC is just a convention, and it doesn't mean you only have the Model, View and Controller. I never liked the acronym; it causes too much confusion at first. To me it should be M (Model, the database mapping), C (Controller), VM (ViewModel), V (View), with the repository sitting between the Model and the Controller, but let's leave that out since it's a personal preference. Most of the time people are simply confused about the difference between the Model and the ViewModel.
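As a rough sketch of that whole chain (ASP.NET MVC assumed; IPersonRepository and the property names are invented):

    using System.Web.Mvc;

    public class Person                    // Model: the database mapping
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Ssn { get; set; }    // sensitive, must never reach the view
    }

    public class PersonViewModel           // ViewModel: only what the view needs
    {
        public string Name { get; set; }
    }

    public class PersonController : Controller
    {
        private readonly IPersonRepository _repository;  // injected

        public PersonController(IPersonRepository repository)
        {
            _repository = repository;
        }

        public ActionResult Details(int id)
        {
            var person = _repository.GetById(id);                 // Repository: data access
            var vm = new PersonViewModel { Name = person.Name };  // Controller: map to ViewModel
            return View(vm);                                      // View: renders the ViewModel
        }
    }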
Some time ago, at work, we had to change our main system to be "cross-RDBMS". I'm not sure if this is the correct term, but basically the system worked only with MS SQL Server, and in order to accommodate a new client we had to make it work with both MS SQL Server and Oracle.
We don't use an ORM, for reasons. Instead, we use a custom ADO-based data access layer.
Before this change, we relied heavily on stored procedures, database functions, triggers, and so on. A substantial amount of business logic lived in the database itself.
We decided to get rid of all the stored procedures, triggers and the rest, basically reducing the database to a mere storage layer.
To handle migrations, we created a .json file which contains a representation of our database schema: tables, columns, indexes, constraints, etc. A simple application was created to edit this file; using it, we're able to edit existing tables and columns and add new ones.
This .json file is versioned in our repository. When the application runs, a routine reads the file and constructs a representation of the database in memory. It then reads the metadata from the actual database, compares it to the in-memory representation, and generates scripts based on the differences found.
Finally, the scripts are executed, updating the database schema.
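In outline, the diff routine described above looks something like this (SchemaTable, SchemaColumn and BuildCreateTableScript are invented names, reading the live metadata is elided, and only the add-column case is shown):

    using System.Collections.Generic;
    using System.Linq;

    public IEnumerable<string> GenerateUpgradeScripts(
        List<SchemaTable> desired,   // parsed from the versioned .json file
        List<SchemaTable> actual)    // read from the database's own metadata
    {
        foreach (var table in desired)
        {
            var existing = actual.FirstOrDefault(t => t.Name == table.Name);
            if (existing == null)
            {
                yield return BuildCreateTableScript(table);  // whole table is new
                continue;
            }
            foreach (var column in table.Columns)
            {
                if (!existing.Columns.Any(c => c.Name == column.Name))
                    yield return string.Format(
                        "ALTER TABLE {0} ADD {1} {2};",
                        table.Name, column.Name, column.SqlType);
            }
        }
    }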
So, now comes my real problem. When a new column is added, the developer needs to:
- add a new property to the POCO class that represents a row in that table;
- edit the method which maps the table columns to the class properties, adding the new column/property mapping;
- edit the class which handles database commands, creating a new parameter referencing the new column.
When this approach was first implemented, I thought about auto-generating and updating the POCO classes based on changes in the .json file. That would keep the classes in sync with the database schema, and we wouldn't have to worry about developers forgetting to update the classes after creating new columns.
This feature wasn't implemented, though, and now I'm having serious doubts about it, mostly because I've been studying Clean Architecture/Onion Architecture and Domain-Driven Design.
From a DDD perspective, everything should be about the Domain, which in turn should be totally ignorant of its persistence.
So, my question is basically: how can I maintain my domain model and my database schema in sync, without violating DRY and without using a "database-centric" approach?
DDD puts the focus on the domain language and its representation in domain classes. DB issues are not the primary concern of DDD.
Therefore, generating domain classes from the database schema is the wrong direction if the intention is to apply DDD.
This question is really about finding a decent way to manage DB upgrades, which has little to do with DDD. Unit/integration tests for basic read/write DB operations may go a long way toward helping developers remember to edit the required files when DB columns are altered.
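For example, a round-trip test along these lines (NUnit style; OrderRepository, TestData and TestDb are invented names) flags a forgotten mapping edit as soon as it happens:

    [Test]
    public void Order_survives_a_database_round_trip()
    {
        // Write a fully populated POCO, read it back, and compare.
        // A property added to the POCO but missed in the column mapping
        // or command parameters comes back lost or defaulted and fails here.
        var original = TestData.FullyPopulatedOrder();
        var repository = new OrderRepository(TestDb.ConnectionString);

        repository.Insert(original);
        var reloaded = repository.GetById(original.Id);

        Assert.That(reloaded, Is.EqualTo(original));  // needs value-based Equals on Order
    }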
How reusable are the results of using EF?
Currently, I use stored procedures for 100% of my data access. The main reason I'm looking to do things differently for my newest project is maintainability: adding an attribute to a table means manually altering dozens of stored procedures. If I understand EF correctly, I should be able to add an attribute to an entity in my EF model and then ask EF to update my CRUD methods for me. Awesome.
However, there is one thing holding me back: reusability. I like that I can write the SPs for a given database once and be done with them; I can build 12 applications that all use that database, and getting the data is as easy as calling the correct SP.
Will the same be true if I switch to a more EF-centric approach?
Can I import an existing EF data model and have it work without too much trouble?
Can I make it so that I alter the Data Model once, and that change is felt across all applications?
Ad1. You can easily reuse complex EF queries if you follow the Repository pattern. Repositories are where you encapsulate your data access, and yes, repositories are easily reused between different modules/applications.
(Not that you can't reuse code without repositories! Repositories are just a common way of doing it for the data access layer.)
Ad2. I am not sure what you mean by "import an existing EF model" (import it where?), but EF models are usually straightforward and can be reused.
Ad3. Just have your model in a separate assembly.
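A sketch of Ad1 and Ad3 combined (ICustomerRepository, AppDbContext and the query itself are invented names):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public interface ICustomerRepository
    {
        Customer GetById(int id);
        IReadOnlyList<Customer> GetRecentlyActive(int days);
    }

    // Lives in a shared assembly; every application reuses these queries.
    public class CustomerRepository : ICustomerRepository
    {
        private readonly AppDbContext _context;

        public CustomerRepository(AppDbContext context)
        {
            _context = context;
        }

        public Customer GetById(int id)
        {
            return _context.Customers.Find(id);
        }

        // A complex query written once here benefits every consumer.
        public IReadOnlyList<Customer> GetRecentlyActive(int days)
        {
            var cutoff = DateTime.UtcNow.AddDays(-days);
            return _context.Customers
                .Where(c => c.Orders.Any(o => o.PlacedAt >= cutoff))
                .ToList();
        }
    }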
A real benefit to using EF is getting away from stored procedures.
The problem with using stored procedures for all your data access is that you are forced to put business logic into your data layer.
Check out Should the data access layer contain business logic? While it's not true in every case, keeping your business logic in your business layer generally gives you better separation of concerns.
I have an EF project that I use as the data layer for several applications. This allows me to change it once and have all the other projects get the benefits. Granted, sometimes supporting code needs to be changed in these other projects, but you'd have that same problem in a stored procedure model as well.
Any n-tier design solves this problem: create a class (in a class library) that understands how to access the data using Entity Framework. Any application that wants the data then references the class library, configures a connection string in its app.config or web.config, and you're done.
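In code, the shared piece can be as small as this (ProductContext and the "ProductsDb" connection-string name are invented):

    using System.Collections.Generic;
    using System.Linq;

    // Lives in the shared class library; consuming apps only reference this
    // assembly and define a "ProductsDb" connection string in their own
    // app.config or web.config.
    public class ProductDataAccess
    {
        public List<Product> GetActiveProducts()
        {
            using (var context = new ProductContext("name=ProductsDb"))
            {
                return context.Products.Where(p => p.IsActive).ToList();
            }
        }
    }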
I have a graph of objects:
School-->Classes-->Students.
and I want to set it up in a way that I can send the School entity back to the client, which can then access its classes and students in a lazy-loading way.
Is that possible?
In brief: no.
You can either:
send back all the data needed (including classes and students with your school entity) in a single call ("eager loading")
or:
have separate methods on your WCF service to retrieve detail data in a separate call (something like List<Class> GetClassesForSchool(int schoolId) and List<Student> GetStudentsForClass(int classId))
Lazy loading per se only works as long as your Entity Framework object context is still around to be queried for more data - which is certainly not the case when you send entities across the wire using WCF.
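A sketch of the two options (EF6 syntax; SchoolContext and the property names are assumptions):

    using System.Collections.Generic;
    using System.Data.Entity;  // for the lambda overload of Include()
    using System.Linq;

    // Option 1: eager-load the whole graph while the context is still alive.
    public School GetSchool(int schoolId)
    {
        using (var context = new SchoolContext())
        {
            return context.Schools
                .Include(s => s.Classes.Select(c => c.Students))
                .Single(s => s.Id == schoolId);
        }
    }

    // Option 2: separate service operations; the client makes follow-up calls.
    public List<Class> GetClassesForSchool(int schoolId)
    {
        using (var context = new SchoolContext())
        {
            return context.Classes.Where(c => c.SchoolId == schoolId).ToList();
        }
    }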
I don't think so: your entity travels across different tiers, and the tier that holds the database connection cannot be reached from the other tiers without your intervention.
You'll need to tailor your own solution to do that, or just use data transfer objects, which carry exactly the information each view needs and nothing that would be useless to it.
Update:
Read this article if you want to learn more about the DTO pattern:
http://aspalliance.com/1215_Implementing_a_Generic_Data_Transfer_Object_in_C.2
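The heart of the pattern is small. A sketch (the Student/StudentDto shapes are invented; the attributes are standard WCF data-contract serialisation):

    using System.Runtime.Serialization;

    [DataContract]
    public class StudentDto        // flat, serialisable, detached from EF
    {
        [DataMember] public int Id { get; set; }
        [DataMember] public string Name { get; set; }
        [DataMember] public string ClassName { get; set; }  // flattened from Student.Class.Name
    }

    public static class StudentMapper
    {
        // Mapping happens on the service side, while the object context is
        // still alive, so nothing tries to lazy-load after the data has
        // crossed the wire.
        public static StudentDto ToDto(Student student)
        {
            return new StudentDto
            {
                Id = student.Id,
                Name = student.Name,
                ClassName = student.Class.Name
            };
        }
    }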
I'm starting with a blank slate for a layer of business/entity classes, but with an existing data source. Normally this would be cake: fire up Entity Framework, point it at the DB, and call it a day. But for the time being, I need to get the actual data from a third-party vendor data source that...
Can only be accessed from a generic ODBC provider (not SQL)
Has very bad syntax, naming, and in some cases, duplicate data everywhere
Has well over 100 tables, which, when combined, would be in the neighborhood of 1,000 columns/properties of data
I only need read-only access to this DB, so I don't need to support updates/inserts/deletes; I just need to extract the "dirty" data into clean entities once per application run.
I'd like to isolate the database as much as possible from my nice, clean, well-named entity classes.
Is there a good way to:
Generate the initial entity classes from the DB tables? (That way I'm just mostly renaming and fixing class properties to sanitize it instead of starting from scratch.)
Map the data from the DB into my nice, clean classes without writing 1000 property sets?
Edit: The main goal here is not so much to build a pseudo-ORM as to generate as much of the code as possible from what's already there and then tweak as needed, eliminating a lot of the manual, labor-intensive class-writing.
I like to avoid auto-generating classes from database schemas, just to have more control over the structure of the entities; that, and what makes sense for a database structure doesn't always make sense for a class structure. For third-party or legacy systems, we use an adapter pattern for our business objects, converting the old or poorly structured format (be it a database, flat files, other components, etc.) into something more suitable.
That being said, you could create views or stored procedures that present the data in a manner more suitable to your needs than the database's current structure. This assumes you are allowed to touch the database.
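A sketch of the adapter idea (the vendor column names are invented stand-ins for the real, badly named schema):

    // Raw shape of the vendor table, warts and all.
    public class CustMstrRow
    {
        public string CST_NM { get; set; }
        public string CST_NM2 { get; set; }     // duplicate/overflow name column
        public decimal? CR_LMT_AMT { get; set; }
    }

    // The clean, well-named entity the rest of the application sees.
    public class Customer
    {
        public string Name { get; set; }
        public decimal CreditLimit { get; set; }
    }

    public static class CustomerAdapter
    {
        // All knowledge of the vendor's mess is confined to this one method.
        public static Customer FromVendorRow(CustMstrRow row)
        {
            return new Customer
            {
                Name = string.IsNullOrWhiteSpace(row.CST_NM) ? row.CST_NM2 : row.CST_NM,
                CreditLimit = row.CR_LMT_AMT ?? 0m
            };
        }
    }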
Dump the database. I mean, redesign the schema, migrate your data and you should be good.