I'm a novice programmer. I am full of theoretical knowledge, but I'm behind with the practice. OK. I am trying to make a program for adding categories and descriptions to files. The language is C#, it should run on Windows 7...
1. The categories can contain sub-categories.
I don't want to call them "tags", because these are different. A category can be, e.g., "favorites". But it can also be "favorites->music->2013". You can create sub-categories, and I will use a TreeView on a WinForm for all the operations a user can perform on them.
QUESTION: Should I use an XML file for the categories?
2. Every file CAN have a description and one or many categories. However:
Even if the file is deleted, I want to keep its description, so that it can be available for later usage.
Folders themselves will be omitted: a folder can have neither categories nor a description, but the files it contains can.
I made a very simple SQL Server database containing one table: http://img832.imageshack.us/img832/3931/finalprojectdb.png
QUESTION: Is this a good idea? Would the categories column be better as type XML?
Any advice on the best approach in this situation is welcome. Thanks in advance!
SQL is not great at fetching nested data in one go. You can store things in XML, which gives you a lot of flexibility, but you also have to write a parser or deserializer for it. Nowadays people also just write a small class and use something like Newtonsoft's Json.NET to deserialize JSON into it automatically.
If you want a DB solution, you can use something like SQLite embedded in your application if you don't want to install a database separately.
XML is a great design for an app that needs to communicate cross-platform (say C# to Java), or across the internet or a network. But as a way to store a subset of the data inside a table column, not really.
A normalized database is a terrific tool. It can be indexed (XML embedded in a column is far harder to index), which allows rapid querying of data. If you de-normalize your data by embedding XML in a column, querying it will be slow, and updating and maintaining it will be a pain.
I personally prefer foreign key tables.
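A foreign-key layout for the files-and-categories question could look roughly like the sketch below. It is written in Python against an in-memory SQLite database only so that it is self-contained; the table and column names (files, categories, file_categories, is_deleted) are made up for illustration, and the same three tables would work in SQL Server from C#. The parent_id column gives the sub-category tree, and a recursive query rebuilds a path such as "favorites->music->2013".

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative.
SCHEMA = """
CREATE TABLE files (
    id          INTEGER PRIMARY KEY,
    path        TEXT NOT NULL UNIQUE,
    description TEXT,
    is_deleted  INTEGER NOT NULL DEFAULT 0  -- row and description survive the file
);
CREATE TABLE categories (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES categories(id)  -- NULL for a top-level category
);
CREATE TABLE file_categories (
    file_id     INTEGER NOT NULL REFERENCES files(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (file_id, category_id)
);
"""


def open_db():
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_category(conn, name, parent_id=None):
    """Insert a category and return its id."""
    cur = conn.execute(
        "INSERT INTO categories (name, parent_id) VALUES (?, ?)",
        (name, parent_id),
    )
    return cur.lastrowid


def category_path(conn, category_id):
    """Walk the parent links upward and join the names, root first."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, parent_id, depth) AS (
            SELECT id, name, parent_id, 0 FROM categories WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, c.parent_id, chain.depth + 1
            FROM categories c JOIN chain ON c.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
        """,
        (category_id,),
    ).fetchall()
    return "->".join(name for (name,) in rows)
```

A TreeView can then be filled by loading the categories table once and attaching each row under its parent_id, while the link table answers "which files are in this category" with a plain join.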
Related
I am currently using an XML file as a database for a project in development.
The XML file is going to be modified by multiple users over the network (not on a server per se, but on my computer, which they can access over the network).
I kind of know it is a bad idea to use XML for this, but I find the structure of XML much cleaner and I like it.
I am wondering what my options are. Could I continue with the XML file and some custom background connection that verifies all the necessary details, so I can read and write the XML without issues?
Or am I stuck with some SQL type of database? If so, is there a database that is somewhat similar to XML?
EDIT: Reason for liking XML.
It is easily grouped for the eyes:
<SomeDocument name="Something">
  <URL>bbbb</URL>
  <Something>2342</Something>
  <Something_That_would_of_been_in_another_database>derp</...>
</SomeDocument>
rather than linking 3-4 tables together...
There are some examples of XML-based databases that support multi-user environments. One is the OneNote Revision File Format used by Microsoft OneNote. Although there is very detailed documentation on it, supporting multiple users editing a single file is tremendously complicated. Basically, one could argue that XML-based storage is not a viable option when you need multi-user support.
If you are stuck with the XML file you could look into the OneNote file format, but it isn't a traditional XML format, since it also uses a "binary wrapper": the actual content is defined as XML data within the binary file, but transactions, revisions and free chunks are represented in binary. This is necessary because you have to allocate specific portions of the file for users to write to while the file is open.
If you don't want to use dedicated server software, you could use one of the file-based databases, such as SQL CE or SQLite.
You would need to deal with concurrency issues if you used a file that several users had access to. Guarantees need to be made for one user not overwriting another user's changes made around the same time.
My suggestion is to use a proper database (e.g. SQL Server) that will handle these issues for you.
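To make the "one user overwriting another's changes" problem concrete, here is a small sketch of optimistic concurrency, which is one common pattern on top of a database's transactions (it is not something the database does by itself). It uses Python and in-memory SQLite purely so it can run anywhere; the documents table and its version column are assumptions made up for the example.

```python
import sqlite3


def open_db():
    """In-memory database with one hypothetical 'documents' row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT NOT NULL,"
        " version INTEGER NOT NULL DEFAULT 1)"
    )
    conn.execute("INSERT INTO documents (id, url) VALUES (1, 'bbbb')")
    conn.commit()
    return conn


def read_document(conn, doc_id):
    """Return (url, version); the caller keeps the version for a later save."""
    return conn.execute(
        "SELECT url, version FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()


def update_url(conn, doc_id, new_url, expected_version):
    """Save only if nobody changed the row since it was read.

    Returns True when the update was applied, False when the version
    no longer matches (someone else saved first).
    """
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE documents SET url = ?, version = version + 1"
            " WHERE id = ? AND version = ?",
            (new_url, doc_id, expected_version),
        )
    return cur.rowcount == 1
```

Two users who both read version 1 cannot both save: the second update matches no row, so the application can reload and ask that user what to do. Doing the same against a shared XML file would need hand-written locking.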
I am not familiar with the C# solutions, but for our Java application we use eXist-db and query it with XQuery. I'm not too familiar with it, but some use MarkLogic. Still more use Berkeley DB.
Whether to use a native XML database, an XML-enabled database, a so-called NoSQL database, or any of the more traditional methods depends on multiple factors. Just to mention two:
Most importantly, do you have your data in XML, and do you want to keep it that way? If so, use an XML-enabled solution.
Do you need scalability or performance? If so, you will need a solution that can deal with that. There are lots of NoSQL and XML databases that are well capable of handling that.
As for concurrency: any database should deal with that natively.
A number of databases have been mentioned already. To single out a few: MarkLogic Server ( www.marklogic.com ) is built to scale and perform up to terabyte scale (and beyond), and has connectors for Java and .NET, among others. The solution from 28msec ( www.28msec.com based on Zorba) runs in the cloud, and should scale too.
But most interesting to mention here is that these databases are often used through HTTP / REST interfaces. That allows easy integration from any programming language, and makes interchanging easier too.
I'm new to Windows apps, and I would like to know the best way to save a small amount of data, like 1 value a day.
I'm leaning toward a text file because it's easy, but I know I could use MS Access.
Do you have other options? Something faster or better?
Since you are already considering using a MS Access database, I would recommend using SQLite. Here's a quote from their site (SQLite Home Page):
SQLite is a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine.
It is really very easy to use - no installations required, you simply need to reference a DLL.
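As a rough idea of how little is involved for "one value a day", here is a sketch using Python's built-in sqlite3 module (chosen only because it needs no extra install; from C# you would reference a SQLite library instead). The daily_values table and the function names are invented for the example.

```python
import sqlite3
from datetime import date


def open_store(path=":memory:"):
    """Open (or create) the store; pass a file name to keep data on disk."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_values ("
        " day TEXT PRIMARY KEY,"  # ISO date, one row per day
        " value REAL NOT NULL)"
    )
    return conn


def save_value(conn, day, value):
    """Store the value for a day, replacing any earlier value for that day."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO daily_values (day, value) VALUES (?, ?)",
            (day.isoformat(), value),
        )


def load_values(conn):
    """Return all (day, value) pairs in date order."""
    return conn.execute(
        "SELECT day, value FROM daily_values ORDER BY day"
    ).fetchall()
```

Usage would be along the lines of save_value(conn, date.today(), 42.0) once a day, then load_values(conn) whenever the history is needed.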
If you need to read it yourself, use a plain text file.
If you need to read the values back into the application, serialize to an XML or binary file by making your user data serializable, possibly by having a List of values in your object.
How do you want to use the data? Do you just want to look at it once in a while? Do you plan to analyze it in a spreadsheet? Etc. Based upon what you say so far, I would just use a text file, one value per line. Even if later you wanted to do more with it, it's easy to import into spreadsheets, etc. If the daily data is a little more complicated (maybe a couple of different values for things each day), you might consider something like YAML.
Why stray from the path? XML gives you the ability to expand on it later without having to rethink everything.
It's mainly dependent upon the complexity of the data that you want to store. If it's just a DateTime or some other simple built-in type, you would be able to recreate that object as a strongly typed one easily. But if it's more complicated, I would suggest you create a serializable class (a link on how to create such a class is here) and then use either binary or SOAP serialization, depending on size, security and other such needs. I am suggesting this because it is better to recreate objects as strongly typed ones from a flat file than to parse whatever is in the flat file by hand.
Please let me know in case you need more clarity.
Thanks,
Sai Pavan
I would like to handle XML data in an ActiveRecord way: one class for each XML structure (I will obviously need an XSD) and the ability to do operations like Users.FindAll(), as Castle ActiveRecord does.
The problem, obviously, is that these are XML files, not relational databases.
Is there any library to achieve this? An MS library rather than a third-party one would obviously be better.
To understand why I would like to achieve this, I'll explain the program I'm building so you can give me some suggestions if a different approach is better:
The program "output" will be something like a long MS Word (or PDF) document which will contain information about how a company handles the privacy of its customers, following the local legislation.
So I will have a "global" XML file which contains something like Jobs (as defined in law, but the law can change, so they should be editable by the user) that each employee can have in their company (there will be other data too; this is a generic example).
Then I will have an XML file for each company the user would like to use this program for. This XML file will have a list of employees, where each employee has a reference to a Job (chosen from the global XML file).
Obviously the program will have much more data, but this explains how it works.
I'm still not sure whether I must use a relational database. What really frightens me about using one is that I will have trouble allowing the user to export/import data if they install the program on a new computer. I would also like to avoid forcing the user to install a database on their computer (an SQLite-like database could be OK because it lives in a file).
Any suggestions about this?
Thanks to everyone
Although LINQ to XML is pretty easy to use, there is much more to do when it comes to reading and storing related data the way an RDBMS does. An RDBMS is all about referential integrity, ACID transactions, concurrent users and performance enhancements, to name a few elements that spring to mind. Given how daunting that is, I think doing it all by yourself is scarier than deploying a database file.
There are some XML-based databases, but I don't know how mature and user friendly they are. I even remember having read of database systems based on plain text files.
I would go for the paved roads and use a relational database, possibly a local database, as you already suggested. Lots of support and tooling available.
How about using LINQ to XML?
I am writing a small program for our local high school (pro bono). The program has an interface that allows the user to enter school holidays. This is a simple stand-alone Windows app.
What format should I use to store the data? A big relational database is obviously overkill.
My initial plan was to store the data in an XML file. Co-workers have been suggesting that I use JSON files, Access databases, SQLite, and SQL Server Express. There was even a suggestion of old-school INI files.
Projects like this have a habit of getting bigger, quickly, and if they do your XML file will become complex and a burden to manage.
I would not recommend storing the data in an xml file or json - they are just text files by a different name, all suffering from the same problem - you don't have any control over who edits them.
Use some kind of database, starting with the small ones first (Access, SQLite).
Edit
Based on your latest comments, roll forward to a point where the users have been using the app for two years.
How much data do you expect to have stored by then?
Will the user(s) need to look back through historic data to see, for example, what they did this time last year?
And more so, right now:
What is Teacher A doing on Thursday afternoon?
Will Teacher B be free to attend an event on 15th May 2010?
Can Student C attend event D?
All of these questions/problems are a lot easier/more efficient to handle with SQL. Plus your resulting codebase will make a lot more sense. Traversing XML isn't the prettiest of things to do.
Plus, if your user base is already familiar with Excel, linking Excel to a SQL database (and producing custom results) is a lot easier than doing the same with XML.
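To illustrate the point about questions being easier to answer with SQL, here is a minimal sketch of the holiday data as a single table, with a "what is on this date?" query. It uses Python's built-in sqlite3 module so it runs as-is; the holidays table, its columns and the sample entry in the usage note are assumptions for the example, not anything from the actual app.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical holidays table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE holidays (
            id        INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            start_day TEXT NOT NULL,  -- ISO dates compare correctly as text
            end_day   TEXT NOT NULL
        );
        """
    )
    return conn


def add_holiday(conn, name, start_day, end_day):
    """Store a holiday spanning start_day..end_day (inclusive ISO dates)."""
    with conn:
        conn.execute(
            "INSERT INTO holidays (name, start_day, end_day) VALUES (?, ?, ?)",
            (name, start_day, end_day),
        )


def holidays_on(conn, day):
    """Return the names of all holidays that cover the given ISO date."""
    rows = conn.execute(
        "SELECT name FROM holidays"
        " WHERE ? BETWEEN start_day AND end_day"
        " ORDER BY start_day",
        (day,),
    )
    return [name for (name,) in rows]
```

With a made-up entry such as add_holiday(conn, "Spring break", "2010-05-10", "2010-05-16"), a call to holidays_on(conn, "2010-05-15") finds it with one WHERE clause; the XML equivalent means loading the file and walking every element.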
Have you considered using SQLite? It'll result in a small .s3db file. SQLite is used by all kinds of desktop applications for local storage.
There's a SQLite .NET library that'll allow you to use ADO.NET to CRUD your data.
Check out Mike Duncan's article on how to get started with SQLite in .NET.
I would have to second the JSON answer and SQLite.
Another option would be to use ESENT, the built-in database that has shipped with every version of Windows since Windows 2000. There is a CodePlex project that makes it easy to work with: http://managedesent.codeplex.com/
Hm, obviously SQL Express is a full-blown database; OTOH it may make sense. Why NOT use a database if they already have one? ;)
Otherwise I would possibly go with an XML file.
I would recommend an XML file and a typed dataset.
You will need to figure out where to put the XML file.
Note that if you ever want to allow multiple users to use it, you need to use a database, such as Access or SQL Server.
I'd go for
SQL Server Express (free and full blown relational database)
XML/JSON if you don't want a database (LINQ to XML might be a big help here).
XML files seem to be a good option. Using LINQ to XML, it should be quite easy to read and write the files from and to objects. Check out the XDocument and XElement classes.
Access databases, SQLite, and SQL Server Express seem like overkill. Do you really need a database to store simple calendar data? Is your application going to grow?
From "simple stand alone Windows app" it sounds to me like it's not critical. I'd go with whatever is easiest, or what you are most comfortable with. XML is usefully human readable, but a sensibly formatted flat file might be just as sensible.
I'd use serialized classes.
One thing to consider if you use a database instead of a text file of some sort: you're no longer a "simple stand alone Windows app". Now you've (most likely) got an installation program to write.
I am creating an RSS reader as a hobby project, and I am at the point where the user adds their own URLs.
I was thinking of two things.
A plaintext file where each URL is a single line
SQLite, where I can have unique IDs and descriptions alongside the URL
Is the SQLite idea too much overhead, or is there a better way to do things like this?
What about an OPML file? It's XML, so if you need to store more data than the OPML specification supplies, you can always add your own namespace.
Additionally, importing and exporting from other RSS readers is all done via OPML. Often there is library support for it. If you're interested in having users switch, then you have to support OPML. Thanks to jamesh for bringing that point up.
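For a feel of how small an OPML subscription list is, here is a sketch that writes and reads one with Python's standard xml.etree.ElementTree. The function names are invented for the example; the element and attribute names (opml, head, body, outline, text, type, xmlUrl) are the ones OPML 2.0 uses for feed subscriptions.

```python
import xml.etree.ElementTree as ET


def feeds_to_opml(feeds, title="Subscriptions"):
    """Build an OPML 2.0 document (as a string) from (title, url) pairs."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for feed_title, url in feeds:
        # One <outline> per feed; 'text' is the display name.
        ET.SubElement(body, "outline", type="rss", text=feed_title, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")


def opml_to_feeds(opml_text):
    """Return (title, url) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    return [
        (outline.get("text"), outline.get("xmlUrl"))
        for outline in root.iter("outline")
        if outline.get("xmlUrl")
    ]
```

Because opml_to_feeds walks all nested outline elements, it also picks up feeds that another reader exported inside category folders.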
Why not XML?
If you're dealing with RSS anyway, you may as well :)
Do you plan just to store URLs? Or do you plan to add data like last_fetch_time?
If it's just a simple URL list that your program will read line-by-line and download data, store it in a file or even better in some serialized object written to a file.
If you plan to extend it, add comments/time of last fetch, etc, I'd go for SQLite, it's not that much overhead.
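A sketch of what the SQLite route might look like once extra fields are wanted, using Python's built-in sqlite3 module. The feeds table, its columns and the helper names are assumptions for illustration; last_fetch_time is the field mentioned above.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the feed store; pass a file name to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feeds ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT NOT NULL UNIQUE,"  # the database rejects duplicate URLs
        " description TEXT,"
        " last_fetch_time TEXT)"      # NULL until the first fetch
    )
    return conn


def add_feed(conn, url, description=None):
    """Insert a feed; return False if the URL is already stored."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO feeds (url, description) VALUES (?, ?)",
                (url, description),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def mark_fetched(conn, url, when):
    """Record the time (an ISO timestamp string) of the latest fetch."""
    with conn:
        conn.execute(
            "UPDATE feeds SET last_fetch_time = ? WHERE url = ?", (when, url)
        )


def never_fetched(conn):
    """Return URLs that have not been fetched yet, oldest entry first."""
    rows = conn.execute(
        "SELECT url FROM feeds WHERE last_fetch_time IS NULL ORDER BY id"
    )
    return [url for (url,) in rows]
```

The UNIQUE constraint and the NULL check are the kind of thing a one-URL-per-line text file would need custom code for.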
If it's a single user application that only has one instance, SQLite might be overkill.
You've got a few options as I see it:
SQLite / Database layer. Increases the dependencies your code needs to run. But allows concurrent access
Roll your own text parser. Complexity increases as you want to save more data and you're re-inventing the wheel. Less dependency and initially, while your data is simple, it's trivial for a novice user of your application to edit.
Use XML. It's well formed & defined and text editable. Could be overkill for storing just a URL though.
Use something like pickle to serialize your objects and save them to disk. Changes to your data structure means "upgrading" the pickle files. Not very intuitive to edit for a novice user, but extremely easy to implement.
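The pickle option from the list above, as a minimal Python sketch. The file layout (a list of dicts with url and description keys) and the function names are assumptions for the example. Plain built-in types are used here on purpose: pickling your own classes ties the file to the class definition, which is where the "upgrading" problem comes from. Also note that pickle should only be used to load files your own program wrote, since loading an untrusted pickle can run arbitrary code.

```python
import pickle


def save_feeds(feeds, path):
    """Serialize the feed list to disk in pickle's binary format."""
    with open(path, "wb") as f:
        pickle.dump(feeds, f)


def load_feeds(path):
    """Read the feed list back; only call this on files you wrote yourself."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Usage would be something like save_feeds([{"url": "http://example.com/a.xml", "description": "A"}], "feeds.pickle") on exit and load_feeds("feeds.pickle") on startup; the resulting file is binary, so it is not something a user can edit by hand.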
I'd go with the XML text file option. You can use the XSD tool built into Visual Studio to create a DataTable out of the XML data, and it easily serializes back into the file when needed.
The other caveat is that I'm sure you're going to want the end user to be able to categorize their RSS feeds and be able to potentially search/sort them, and having that kind of datatable style will help with this.
You'll get easy file storage and access, the benefit of a "database" structure, but not quite the overhead of SQLite.