Love the site--it has been very informative throughout my studies. I just finished a quarter of intro C#, and one of the projects was to design a financial "Account Manager" app that keeps a balance and updates it when withdrawals and deposits are made. The project was fairly simple and I didn't have any problems. Unfortunately, my next quarter doesn't include any programming classes :(, so I'm using the time to expand my knowledge by beefing up my Account Manager app.
First thing I wanted to do, was to enable multiple users. So far, I've included a CreateNewUser class that prohibits duplicate user names, checks new passwords for specific formatting requirements, salts and hashes it, and saves it to an "Accounts" table with the username (email address) and an auto-incremented user id. Simple enough.
So now I'm stuck: not sure what would be best practice. I don't think that the user should be using the same table as other users, so I'm thinking that each user should have their own table. Am I being "too paranoid", or is my thinking along the lines of common programming security practices? The truth is that nobody will probably ever use this app, but I'm trying to learn what I can apply in the real world when I grow up.
Using the same table only requires loading the DataSet with a query of matching userID's, so that wouldn't be a big deal. If I should use separate tables, then I would need to create a new table dynamically when the new user is created, and I was going to just name the table with the user id, which would simulate the account number in the real world, I'm assuming.
Anyway, I couldn't find another question that covered this, so I thought I'd ask y'all for your thoughts.
Thanks,
Deadeddie
Think of it this way: if you were keeping physical copies of these tables, say in notebooks, would you rather have a lot of small notebooks or one big notebook that you can refer to?
As long as your code is written to pull only the correct data (in this case, rows matching the userID), there isn't a big security concern, because your code is what mediates access to the data - provided your database and code have the correct permissions set on them as well.
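As a rough sketch of what that filtering can look like in code (the table and column names here are placeholders, not from your app), every read is parameterized and scoped to the authenticated user's id:

// Sketch: every query is filtered by the authenticated user's id and uses
// parameters, never string concatenation. "Transactions", "UserId" etc. are
// illustrative names only.
using System.Data;
using System.Data.SqlClient;

public static class AccountData
{
    public static DataTable LoadTransactionsForUser(string connectionString, int userId)
    {
        var table = new DataTable();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT TransactionId, Amount, PostedOn FROM Transactions WHERE UserId = @userId",
            connection))
        {
            command.Parameters.Add("@userId", SqlDbType.Int).Value = userId;
            connection.Open();
            using (var adapter = new SqlDataAdapter(command))
            {
                adapter.Fill(table);
            }
        }
        return table;
    }
}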
So far, I've included a CreateNewUser class that prohibits duplicate user names, checks new passwords for specific formatting requirements, salts and hashes it, and saves it to an "Accounts" table with the username (email address) and an auto-incremented user id. Simple enough.
Already bad. It should be a Users table - "Account", in an application dealing with financial information, has a very specific meaning, and you may want to have multiple accounts per user and/or an account shared by users.
Also, unless you are writing PowerShell cmdlets (where one class per command is the pattern), a CreateNewUser class is as bad as going out and burning cars. User is a class, some sort of repository is OK, but CreateNewUser is, if anything, a FUNCTION on one of those classes. It is definitely not a complete class - you totally botch the concept of object orientation if you turn every method into a class.
I don't think that the user should be using the same table as other users,
Again, a total beginner mistake. Why not? Put in proper fields referencing the account and/or user as appropriate and you'll be fine.
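To make "proper fields referencing the account and/or user" concrete, here is a minimal sketch of one shared set of tables expressed as plain C# entity classes (all names are illustrative, not the OP's actual schema):

// One Users table, one Accounts table, one Transactions table - shared by
// everybody and linked by foreign keys. Names and shapes are assumptions.
using System;

public class User
{
    public int UserId { get; set; }            // auto-incremented key
    public string Email { get; set; }          // the login name
    public string PasswordHash { get; set; }
    public string PasswordSalt { get; set; }
}

public class Account
{
    public int AccountId { get; set; }
    public int UserId { get; set; }            // FK to User; a user can own several accounts
    public decimal Balance { get; set; }
}

public class AccountTransaction
{
    public int TransactionId { get; set; }
    public int AccountId { get; set; }         // FK to Account
    public decimal Amount { get; set; }        // positive = deposit, negative = withdrawal
    public DateTime PostedOn { get; set; }
}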
then I would need to create a new table dynamically when the new user is created,
Did you ever think about what you are doing here? Maintenance-wise, every schema change means writing a program that finds out which user tables exist and then modifies each of them - tooling support goes out of the window. I once saw an application written like that - invoice management. It had one invoice details table PER INVOICE (and an invoice table per invoice, named by invoice number) because the programmer never understood what databases are for.
Am I being "too paranoid", or is my thinking along the lines of common programming security practices?
They are along the lines of "you are fired; go learn how databases work".
Using the same table only requires loading the DataSet with a query of matching userID's
;) So DataSets are still around? Is there a reason you are doing programming archaeology, following the worst practices of the last 30 years at Microsoft, instead of using an ORM, which Microsoft has provided for some time now (Linq2SQL, Entity Framework) and which would make your application a lot - ah - more - ah - object oriented?
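For what it's worth, a hedged sketch of the same "load this user's rows" idea through Entity Framework (EF 6 code-first style, reusing the illustrative classes above; the context name is invented):

// Instead of filling a DataSet, the ORM maps rows to objects and the filter
// stays a plain LINQ Where clause. Linq2SQL looks similar.
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

public class BankContext : DbContext
{
    public DbSet<Account> Accounts { get; set; }
    public DbSet<AccountTransaction> Transactions { get; set; }
}

public static class AccountQueries
{
    public static List<Account> AccountsForUser(BankContext db, int userId)
    {
        // One shared table, scoped to the owning user - no per-user tables.
        return db.Accounts.Where(a => a.UserId == userId).ToList();
    }
}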
May I suggest reading a decent book? Look up "Building Object Applications That Work" by Scott Ambler. And no, it is not written for C# - interestingly enough, the concepts of good architecture are 99% language agnostic.
Related
I am working on an inventory app using C# and the Entity Framework code-first approach.
One of the design requirements is that user should be able to create multiple companies and each company should have a full set of inventory master tables.
For example each company should have its own stock journal and list of items. There would also be a way to combine these companies in future to form like a 'group' company, essentially merging the data.
Using one of the file-based RDBMSs like SQLite, it's very simple: I would just need to create a separate SQLite database for each company and then a master database to tie it all together. However, how should I go about doing it in a single database file, not multiple file databases?
I do not want to have a 'company' column on every table!
The idea that I had, given my limited knowledge of DBs, is to separate using different schemas: one schema for each company with the same set of tables in each schema, and a separate schema holding the common tables and the tables that tie the other schemas together. Is that a good approach? Because I am having a hard time finding a way to 'dynamically' create schemas using EF and code first.
Edit #1
To get an idea of the number of companies: one enterprise has about 4-5 companies, and each financial year the old companies are closed off and a fresh set of companies is created. It would be good to maintain data for multiple years in the same file, but it is not required, as long as I can provide a separate module to load data for several years from several of the DB files to facilitate year-on-year analysis.
As far as the size of an individual company's data goes, it can hit the GB mark per company.
The schema changes quite frequently, at least at the table level, as it will be completely customizable by the user.
I guess one aspect that drives my question is the implementation of this design. If it is an app with a discrete desktop interface and I have my own RDBMS server like SQL Server, the number of databases does not matter that much. However, for a web-based UI hosted by a third party and using their database server, the number of databases available will be limited. The only solution to that would be to use a serverless database like SQLite.
But as far as general advice goes, SQLite is not advised for large enterprise class databases.
You've provided viable solutions, and even some design requirements, but it's difficult to advise "what's best" without knowing the base requirements like:
How many companies now - and can be reasonably expected in the future
How many tables per instance
How many records per 'large' table, per company
How likely things are to change frequently, schema-wise
With that in mind, on to some general opinion on your solutions. First off, considering the design requirements, it would make sense to consider using separate databases per company. This would separate your data and allow, for example, roles and security to be defined quite easily at the database level. Considering you explicitly mention you could "make it simple" using this approach, you could just create a database (file) per company. With your data access layer going through Entity Framework you could also easily switch connection strings between databases, and even merge data from A=>B using this. I see no particular reason, besides a possible risk in maintaining and updating different instances, why this shouldn't be a solution to consider.
On the other hand, the one-big-database-for-all approach isn't bad by definition either. The domain of maintenance becomes more compact and easily approachable. One way to separate data is to use different database schemas, as you suggest yourself. However, database schemas are primarily intended to separate accessibility on a role-based level. For example, a back-office employee (i.e. a user role) should only communicate with the "financial" schema, whilst the dbo can talk to pretty much anything. You could extend this approach to a per-company basis, treating a company as a "user", but think of the number of tables you would get as you create more and more companies. This would make your database huge. Therefore, in my opinion, it is not the best approach.
Finally, I'm intrigued by your statement "I do not want to have a 'company' column on every table". In my opinion, you should consider this as well. A discriminator property like a companyId column on several tables is pretty easy to abstract using Entity Framework (or any ORM, for that matter); this is what the concept of foreign keys is all about. It would also give you the advantage of indexing this column for performance. Your only consideration in this approach would be to make sure you provide this 'company discriminator' on all relevant tables.
The latter would be quite simple to enforce using EF Code First if you use a contract for each separate data class to implement:
interface IMyTableName
{
    int CompanyId { get; set; }
}
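For example (a sketch under assumptions - the entity and context names are invented, and depending on your EF version you may prefer a common base class over the interface so queries translate cleanly), each company-owned entity carries the discriminator and every query is scoped by it:

// Each tenant-owned entity implements the contract, and all reads/writes
// are scoped by CompanyId. Names are illustrative only.
public class StockJournalEntry : IMyTableName
{
    public int Id { get; set; }
    public int CompanyId { get; set; }     // the discriminator / FK to the Company table
    public string ItemCode { get; set; }
    public decimal Quantity { get; set; }
}

// Usage sketch, assuming a DbContext that exposes StockJournalEntries:
// var entries = db.StockJournalEntries
//                 .Where(e => e.CompanyId == currentCompanyId)
//                 .ToList();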
Just my quick thoughts, though.
I agree with Moriarty for the most part. Our company chose the one database per company approach, and we're paying for it every time we want to do a schema change. Since our deployments are automated, they should all be the same, but there are small differences each time. Moreover, these databases are really independent, so it's hard to keep our backups in sync as well.
It has been painful working with all these databases. The only plus side is that we can spread them out over multiple servers to increase performance. So I'm going to cast my vote for the one big database design.
My question is about the best, tried and tested (and new?) methods out there for a fairly common requirement in most companies.
Every company has customers. And let's say company A has about 10 different systems for its business needs. Customer is critical to all systems.
Customers can be maintained in any of the systems independently, but if they fall out of sync then it's not good. I know it's ideal to keep one big master place/system for the customer record and have all other systems take that information from that single location/system.
How do you build something like this? SOA? ETLs? Web services? Any other ideas out there that are new... and not to forget the old methods.
We are an MS / .NET shop. This is mostly for my knowledge and learning... please point me in the right direction; I want to be aware of all my options.
Ideally all your different systems would share the same database, in which case that database would be the master. However that's almost never the case.
So the most common method I've seen is to have yet another system (let's call it a data warehouse) that takes feeds from your 10 different systems, aggregates them together, and forms a "master" view of a customer.
I have not done anything like this, but playing with the idea here are my thoughts. Perhaps something will be helpful.
This is a difficult question, and I'd say it mainly depends on what development ability and interfaces you have available in each of the 10 systems. You may need a data-warehouse-manager piece of software that works as the next paragraph describes, with various plugins for all the different types of interfaces in the 10 systems involved.
Thinking from the data warehouse idea: ideally each Customer in each system would have a LastModified field, although that is probably unlikely. So you'd almost need to serialize the Customer record from each source and store it in your data warehouse database along with the time the program last updated that record. This would let you know exactly which record is the newest any time anything changes in any of the 10 systems, and update fields based on that. This is about the best you could do if you're not developing some of the systems and are only able to read from some fashion of interface.
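A rough sketch of that idea, with invented names (not a tested design): keep the last-seen copy of each customer per source system along with a change timestamp, and let the newest copy win when building the master view:

// Sketch only: one row per (customer, source system); whichever copy changed
// most recently is treated as the authoritative one for the master view.
using System;
using System.Collections.Generic;
using System.Linq;

public class CustomerSnapshot
{
    public string CustomerKey { get; set; }      // the cross-system identifier
    public string SourceSystem { get; set; }     // which of the 10 systems it came from
    public DateTime LastSeenModified { get; set; }
    public string SerializedRecord { get; set; } // e.g. the raw customer row as XML
}

public static class MasterCustomerView
{
    public static CustomerSnapshot Newest(IEnumerable<CustomerSnapshot> copiesOfOneCustomer)
    {
        return copiesOfOneCustomer
            .OrderByDescending(c => c.LastSeenModified)
            .First();
    }
}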
If you are developing all the systems, then I'd imagine WCF interfaces (I mention WCF because it has more connection options than web services in general) to propagate updates to all the other systems (probably via a master hub application) might be the simplest option: passing in the new values and the date they were updated, either from an event on the save button, or by checking a LastModified field every hour/day.
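If you went the WCF route, the hub's contract could be as small as this (a hedged sketch with made-up names; bindings, error handling and conflict resolution are left out):

// Minimal WCF-style contract a master hub could expose; each source system
// pushes a CustomerUpdate whenever a customer is saved, or on a schedule.
using System;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class CustomerUpdate
{
    [DataMember] public string CustomerKey { get; set; }
    [DataMember] public string SourceSystem { get; set; }
    [DataMember] public DateTime LastModified { get; set; } // lets the hub decide which copy wins
    [DataMember] public string Name { get; set; }
    [DataMember] public string Email { get; set; }
}

[ServiceContract]
public interface ICustomerHub
{
    [OperationContract]
    void PublishUpdate(CustomerUpdate update);
}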
Another difficulty is what happens if one Customer object has an Address field and another does not - will the updates between those two overwrite each other in some cases? Or if one has a CustomerName and another has CustomerFirstname and CustomerLastname?
NoSQL ideas of variable data structure and the ability to mark cached values as dirty also somewhat come to mind, though I'm not sure how much benefit those concepts would really add.
OK guys, my next question seems to be very widely asked and generic. For instance, I have an accounts table in my DB. On the client (a desktop WinForms app) I have the appropriate functionality to add a new account; in the UI it's a couple of textboxes and one button.
Another requirement is account uniqueness, so I can't add two identical accounts. My question is: should I check for the account's existence on the client (making a query and looking at the result), or make a stored procedure for adding a new account and check for the account's existence there? As for me, it seems better to just make a stored proc, where I can make any needed checks and, once all checks pass, add the new account. But there are pros and cons to that approach. For example, it will be very difficult to manage the language of the messages that the stored proc should produce.
POST EDIT
I already have the database constraints, etc. The issue is how to handle the situation where a user tries to add an account that already exists.
POST EDIT 2
The account uniqueness is exposed just as a tiny, simple example of business logic. My question is more about handling complicated business logic in that accounts domain.
So, how can I resolve this dilemma?
I believe that my question is basic and has a proven solution. My tools are C# and .NET Framework 2.0. Thanks in advance, guys!
If the application is to be multi-user (i.e. not just a single desktop app with a single user, but a centralised DB with the app acting as a client, possibly on many workstations), then it is not safe to rely on the client (app) to check for things such as uniqueness, existence, free numbers, etc., as there is a distinct possibility of change happening between calls (unless read locking is used, but this often becomes more of an issue than a help!).
There is, of course, the option to pre-check and then re-check (pre-check at the app level, re-check at the DB), but this would generate extra DB traffic, so it depends on whether that is a problem for you.
When I write SPROCs that will return to an app, I always use the same framework: I include parameters for a return code and a message and always populate them. Then I can use standard routines to call them and even add in the parameters automatically. I can then either display the message directly on failure, or use the return code to localize it as required (or automate a response). I know some DBs (like SQL Server) will return Return_Code parameters, but I implement my own so I can leave the inbuilt ones for serious system-based errors and unexpected failures. It also allows me to have my own numbering system for return codes (i.e. grouping them to match enums in the code and/or grouping by severity).
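Calling such a SPROC from C# then follows one standard pattern (a sketch only; the procedure and parameter names are invented for the example):

// Every SPROC exposes @ReturnCode / @ReturnMessage output parameters, so one
// helper can call any of them and hand a code plus message back to the app.
using System.Data;
using System.Data.SqlClient;

public static class AccountProcs
{
    public static int AddAccount(string connectionString, string accountName, out string message)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand("dbo.usp_AddAccount", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.Add("@AccountName", SqlDbType.NVarChar, 100).Value = accountName;

            SqlParameter code = command.Parameters.Add("@ReturnCode", SqlDbType.Int);
            code.Direction = ParameterDirection.Output;
            SqlParameter msg = command.Parameters.Add("@ReturnMessage", SqlDbType.NVarChar, 400);
            msg.Direction = ParameterDirection.Output;

            connection.Open();
            command.ExecuteNonQuery();

            message = (string)msg.Value;
            return (int)code.Value;   // e.g. 0 = OK, other ranges grouped by severity
        }
    }
}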
On web apps I have also used a different concept at times. For example, sometimes a request is made for a new account but multiple pages are required (a profile, for example). Here I often use a header table that generates a hidden user ID against the requested unique username, a timestamp, and some way of recognising the user (IP address, etc.). If after x hours it is not used, the header table deletes the row, freeing up the number (depending on the DB, the number may never become usable again - this doesn't really matter as it is just used to keep the user data unique until the application is submitted) and the username. If completed correctly, the records are simply copied across to the proper active tables.
//Edit - To Add:
Good point. But account uniqueness is just a very tiny, simple sample. What about more complex requirements for accounts in the business logic? For example, if I implement it just in client code (in the WinForms app) I will be OK, but if I want another kind of app (say a console version or a website) to work with these accounts, I would have to write all this logic again in the new app! So, I'm looking for some method to keep the data right from both sides (server DB side and client side). – kseen yesterday
If the requirement is ever for multi-use, then it is best to separate it. Putting it into a separate Class Library project allows the DLL to be used by your WinForm, console program, service, etc. Although I would still prefer rock-face validation (at the DB level) as it is the closest point in time to any action and the least likely to be gazumped.
The usual way is to separate into three projects: a Display Layer [DL] (your WinForms project/console/service/etc.), a Business Application Layer [BAL] (which holds all the business rules and the calls to the DAL - it knows nothing about the display medium nor about the database technology), and finally the Data Access Layer [DAL] (this has all the database calls - it can be very basic, with a method for insert/update/select/delete at SQL and SPROC level, and maybe some classes for passing data back and forth). The DL references only the BAL, which references the DAL. The DAL can be swapped for each technology (say a change from SQL Server to MySQL) without affecting the rest of the application, and business rules can be changed and set in the BAL with no effect on the DAL (the DL may be affected if new methods are added, or if display requirements change due to data changes, etc.). This framework can then be used again and again across all your apps and makes it easy to apply quite drastic changes (like DB topology).
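A very small sketch of that shape (all names invented; each piece would live in its own project in practice):

// DL (WinForms/console/site) -> BAL (rules) -> DAL (database calls).
// The BAL never mentions SQL Server; the DAL never mentions business rules.

// DAL contract - a SQL Server, MySQL, or in-memory implementation can sit behind it.
public interface IAccountDal
{
    bool AccountNameExists(string name);
    void InsertAccount(string name);
}

// BAL - holds the business rules and is the only thing the DL talks to.
public class AccountService
{
    private readonly IAccountDal _dal;

    public AccountService(IAccountDal dal)
    {
        _dal = dal;
    }

    // Returns null on success, or a message the DL can display.
    public string AddAccount(string name)
    {
        if (string.IsNullOrEmpty(name)) return "Account name is required.";
        if (_dal.AccountNameExists(name)) return "An account with this name already exists.";
        _dal.InsertAccount(name);
        return null;
    }
}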
This type of logic is usually kept in code for easier maintenance (which includes testing). However, if this is just a personal throwaway application, do what is simplest for you. If it's something that is going to grow, it's better to put good practices in place now, to ease maintenance/change later.
I'd have an AccountsRepository class (for example) with an AddAccount method that does the insert/calls the stored procedure. Using database constraints (as HaLaBi mentioned), it would fail when trying to insert a duplicate. You would then determine in code how to handle this issue (passing a message back to the UI that it couldn't add). This would allow you to put tests around all of it. The only change you make in the DB is to add the constraint.
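Roughly like this (a sketch under assumptions - the table and column names are invented, and 2627/2601 are SQL Server's duplicate key/unique index error numbers; other engines differ):

// The unique constraint in the database is the final arbiter; the repository
// just translates the violation into a message the UI can show.
using System.Data;
using System.Data.SqlClient;

public class AccountsRepository
{
    private readonly string _connectionString;

    public AccountsRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public bool TryAddAccount(string accountName, out string error)
    {
        error = null;
        try
        {
            using (SqlConnection connection = new SqlConnection(_connectionString))
            using (SqlCommand command = new SqlCommand(
                "INSERT INTO Accounts (AccountName) VALUES (@name)", connection))
            {
                command.Parameters.Add("@name", SqlDbType.NVarChar, 100).Value = accountName;
                connection.Open();
                command.ExecuteNonQuery();
                return true;
            }
        }
        catch (SqlException ex)
        {
            if (ex.Number == 2627 || ex.Number == 2601) // unique constraint / unique index violation
            {
                error = "An account with this name already exists.";
                return false;
            }
            throw;
        }
    }
}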
Just my 2 cents on a Thursday morning (before my cup of green tea). :)
I think the answer - like many - is "it depends".
For sure it is a good thing to push logic as deeply as possible towards the database. This prevents bad data no matter how the user tries to get it in there.
This, in simple terms, results in applications that TRY - FAIL - RECOVER when attempting an invalid transaction. You need to check each call (stored proc, triggered insert, etc.) and IF something bad happens, recover from that condition - usually something like telling the user an issue occurred, resetting the form, and letting them try again.
I think, at a minimum, this needs to happen.
But, in addition, to make a really nice experience for the user, the app should also preemptively check certain data conditions ahead of time, and simply prevent the user from making bad inserts in the first place.
This is of course harder, and sometimes means double coding of business rules (once in the app, and once in the DB constraints), but it can make for a dramatically better user experience.
The solution is more of being methodical than technical:
Implement - "Defensive Programming" & "Design by Contract"
If the chances of a business-rule being changed over time is very less, then apply the constraint at database-level
Create a "validation or rules & aggregation layer (or class)" that will manage such conditions/constraints for entity and/or specific property
A much smarter way to do this would be to make a user-control for the entity and/or specific property (in your case the "Account-Code"), which would internally use the "validation or rules & aggregation layer (or class)"
This will allow you to ensure a "systematic-way-of-development" or a more "scalable & maintainable" application-architecture
If your application is a website then along with placing the validation on the client-side it is always better to have validation even in the business-layer or C# code as well
When ever a validation would fail you could implement & use a "custom-error-message" library, to ensure message-content is standard across the application
If the errors are raised from database itself (i.e., from stored-procedures), you could use the same "custom-error-message" class for converting the SQL Exception to the fixed or standardized message format
I know that this is all a bit too much, but is will always good for future.
Hope this helps.
As you should not depend on a specific storage provider (DB [MySQL, MSSQL, ...], flat file, XML, binary, cloud, ...) in a professional project, all constraints should be checked in the business logic (model).
The model shouldn't have to know anything about the storage provider.
Uncle Bob said something about architecture and databases: http://blog.8thlight.com/uncle-bob/2011/11/22/Clean-Architecture.html
I can't decide whether to keep the help desk application in the same database as the rest of the corporate applications or completely separate it.
The help desk application can log support requests from a phone call, email, or the website.
We can get questions sent to us from registered customers and non-registered customers.
The only reason to keep the help desk application in the same database is so that we can share the user base. But then again we can have the user create a new account for support or sync the user accounts with the help desk application.
If we separate the help desk application, our database backup will be smaller. Or we can just keep the help desk application in the same database, which makes development/integration a lot easier overall, with only one database to back up (maybe larger, but still one database with everything).
What to do?
I think this is a subjective answer, but I would keep the help desk system as a separate entity, unless there is a good business reason to use the same user base.
This is mostly based on what I've seen in professional helpdesk call logging/ticketing software, but I do have another compelling reason - security. The logic is as follows:
Generally, a helpdesk ticketing system needs less sensitive information than other business systems (accounting, shopping, CRM, etc.). Your technicians will likely need to know how to contact a customer, but probably won't need to store full addresses, birth dates, etc. All of the following is based on an assumption: that your existing customer data contains sensitive or personally identifiable data that would not be needed by your ticketing system.
Principle 1: Reducing the attack surface area by limiting the stored data. Generally, I subscribe to the principle that you should ONLY collect the data you absolutely need. Having less sensitive information available means less that an attacker can steal.
Principle 2: Reducing the surface area by minimizing avenues of attack into existing sensitive data. Assuming you already have a large user base, and assuming that you're already storing potentially useful data about your customers, adding another application with hooks into that data is just adding further avenues of attack into the existing customer base. This leads me to...
Principle 3: Least privilege. The user you set up for the helpdesk software database should have access ONLY to the data absolutely needed by your helpdesk analysts. Accomplishing this is easier if you design your database with a specific set of needs in mind. It's a lot more difficult from a maintenance standpoint to have to set up views and stored procedures over sensitive data in order to only allow access to the non-sensitive data than it is to have a database designed to have only the data that you need.
Of course, I may be over-thinking it. And there are other compelling reasons for going either route. I'm just trying to give you something to think about.
This will definitely be a subjective answer based upon your environment. You have to weigh the benefits/drawbacks of one choice with the benefits/drawbacks of the other choice. However, my opinion would be that the best benefits will be found in separating the two databases. I really don't like to have one database with two purposes. Instead look to create a database with one purpose only. Here are the benefits I see to doing this:
Portability - if you decide to move the helpdesk to a different server, you can do so without issue. The same is true if you want to move the corporate database somewhere else
Separation of concerns - each database is designed for its own purpose. The security of one won't interfere with the security of the other.
Backup policies - Currently, you can only have one backup policy for both systems since they are in the same database. If you split them, you could back up one more often than the other (and the backup would be smaller/faster).
The drawbacks I see (not being able to access the corporate data as easily) actually come out as a positive in my mind. Accessing the data from the corporate database sounds good but it can be a security issue (also a maintainability issue). Instead, this way you can limit how much access (and what type of access) is granted to the helpdesk system. Databases can access each other fairly easily so it won't be that inconvenient and it will allow you to add a nice security barrier between your corporate data and your helpdesk data.
We've developed a simple CRM application in ASP.NET MVC. It's for a single organization with a few user accounts.
I'm looking for an easy way to make it work with many organizations. May I use the ApplicationId from the Membership provider for this? Every organization would have its own ApplicationId.
But this means that every row in the database would have to have ApplicationId too, right?
Please give your suggestions. Maybe there is a better way?
Unfortunately, for the "easy way" you already missed the bus. Since the easy way would have been to make this possible already by design. It would have not been that much of a burden to include the OwnerId to the data already in the first phase and make your business logic to work according to that.
Currently the "easy way" is to refactor all your data and business logic to include the OwnerId. And while doing it, look ahead. Think of the situations "what if we need to support this and that in the future" and leave some room there for the future by design. You don't need to fully implement everything right now but you'll find it out how easy it is to make your application scale if it was designed to scale.
As for the ApplicationId, that's an internal ID for your membership provider to scope your membership data per application. I would stay away from bleeding that logic into the whole of your application. Keep in mind that authenticating your web users, assigning them to roles and giving them rights through roles is a totally different process from ownership of data.
In ASP.NET MVC you would use the [Authorize] attribute to make sure that certain actions can or cannot be performed by certain users or groups, while determining which data is whose should be implemented in the data itself. Even if you ran two or more instances of your application, it would still be the same application, so ApplicationId doesn't work here for scoping your data.
But, let's say your CRM turns out not to be so small after all, and it becomes apparent that either your initial organization or one of the later ones would like to allow their customers to log on and check their data. Then you would need to build another application for the customers to log on to. That site would use a different ApplicationId than your CRM. Your client organization could then map the user accounts to their CRM records so that their customers could review them.
So, since your CRM is (still) small, the easiest way is to design a good schema for your clients to be stored in and then mark all your CRM data with an OwnerId. And that OwnerId cannot come from the users table, or the membership table, or anywhere near there. It has to come from the table that lists the legal owners of the data, whether you want to call them Organizations, Companies, Clients or whatever. It cannot be a userId, roleId, applicationId, etc., since users might leave an owning organization, roles are shared between the organizations (at least the ones that are used to determine access to certain MVC actions), and ApplicationIds are meant for scoping membership and roles between different kinds of client applications.
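A bare-bones sketch of what that looks like (names invented; the point is only that the owner is its own table and every CRM record carries its key):

// The owning organizations get their own table; every CRM record references
// it through OwnerId, and every query in the business logic is scoped by it.
public class Organization
{
    public int OrganizationId { get; set; }
    public string Name { get; set; }
}

public class CrmContact
{
    public int CrmContactId { get; set; }
    public int OwnerId { get; set; }   // FK to Organization - not a userId, roleId or applicationId
    public string FullName { get; set; }
    public string Email { get; set; }
}

// Usage sketch: db.CrmContacts.Where(c => c.OwnerId == currentOrganizationId)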
So what you are missing here are the tables describing the owners of the CRM records and the mapping of all the data to its owner. And for that there's no easy way. You already went ahead developing the CRM thinking "this is just a simple one-organization CRM, so let's make things easy". Now you have a "simple multi-organization CRM" and are asking for an easy way to recover from that initial lack of design. The next step would be asking how to make your "not so simple multi-organization CRM" easily do something you didn't take into account in the first place.
The easy solution is to design your application to be scalable, doing "just a little" extra to support future growth. It'll be much easier in the long run than spending a lot of extra time rewriting your application twice a year. Also, keep in mind: it's a CRM after all, and you can't go ahead and tell whoever is using it in their business to take a day off because you're fixing some stuff in the CRM.
I'm not patronizing you here. I'm answering anyone who might be reading this: stop searching for easy solutions to recover from inadequate planning. There aren't any. And seeking one is making the same mistake twice.
Instead, grab some pen and paper, plan a workable design, and make it work. Put some extra effort into the early stages of software design and development and you'll find that work saving you countless hours later in the process. That way, whoever is using your CRM will stay happy using it. It'll become easier to talk to your users about future changes, since you won't have to think "I don't want to do that because it'd break the application again". Instead, you can enjoy brainstorming the next cool step together. Some of the ideas will be left for later, but some room for their implementation will already be designed in at this stage, so that the actual implementation a year later will come in smoothly and will be enjoyable for all parties involved.
That's my easy solution. I have 15 years of development behind me, and the fact that I'm still enjoying it backs the above up. And it's mainly because I take every challenge (well, most of them anyway) as an opportunity to design the code better instead of trying to dodge the inevitable process. We have an old saying in Finland: "Either you'll do it or you'll cry doing it." It fits the bill here perfectly. It's up to you if you like crying so much that you take "the easy way" now.