I am working on an application that reads and makes edits to an XML file using the XDocument class Load() and Save() methods. If another application, or another instance of my application, is running, then one program could potentially overwrite changes the other has made if its in-memory XDocument is not continually reloading. The simultaneously running programs would never edit the same section of the XML file at the same time. What is the best way to solve this problem? Should I just do a Load right before I make every change, or is there a more elegant approach?
The best solution would be not to use XML.
Use a (small) database.
Using any kind of text file in a multi-user situation is difficult enough, and the fact that the edits go to different sections can only be exploited profitably if you have fixed-length and (therefore) fixed-position records (lines). XML does not deliver this.
Doing a load-before-edit will only make the problems appear less often. There is always a chance that changes will be lost; you will have race conditions at the filesystem level. To make it work you would have to use a scheme with lock files, which requires an extra file.
You are talking about multiple processes being able to modify the file. If you want to keep the file instead of putting the data into a common store (MS SQL), you will need to implement a mutex to make sure only one application can access the file at any moment in time.
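To sketch what that could look like: a named, system-wide mutex serializes the load-modify-save cycle across processes. This is only a rough illustration; the mutex name, file path and the 10-second timeout are placeholders, not anything from the original question.

```csharp
using System;
using System.Threading;
using System.Xml.Linq;

class SharedXmlEditor
{
    // Any process that uses the same mutex name will serialize its access.
    // The mutex name and file path below are placeholder values.
    private const string MutexName = @"Global\MyAppXmlLock";
    private const string FilePath = "data.xml";

    public static void Edit(Action<XDocument> change)
    {
        using (var mutex = new Mutex(false, MutexName))
        {
            bool acquired = false;
            try
            {
                // Wait up to 10 seconds for the other process to finish its edit.
                acquired = mutex.WaitOne(TimeSpan.FromSeconds(10));
                if (!acquired)
                    throw new TimeoutException("Could not acquire the XML file lock.");

                // Load the latest copy only after the lock is held,
                // apply the change, and save before releasing.
                var doc = XDocument.Load(FilePath);
                change(doc);
                doc.Save(FilePath);
            }
            finally
            {
                if (acquired)
                    mutex.ReleaseMutex();
            }
        }
    }
}
```

Because the Load happens after the lock is acquired, every edit starts from the latest copy on disk, e.g. `SharedXmlEditor.Edit(doc => doc.Root.SetElementValue("LastRun", DateTime.UtcNow));`.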
Related
I am creating a very simple database in C# which I use to store playlists and an overview of all my music. Because I want to make this C-compatible in the future, I plan to make it completely text-based. The idea is that every text file is a table, and the contents are JSON, where every line of text is a record.
I don't want to have loose files for each database, so I was thinking about something like a zip file. I don't want to extract and compress every time I access a file. Is there some way I can use a stream reader/writer in C# on different files where Windows only sees one file?
I'm not completely convinced that this is the way to go. So I'm open to suggestions.
Update:
I'm currently messing around with the "Local Database" item in C#. I never paid any attention to it before. It could very well be the solution.
Update 2:
SQLite seems to be very simple. I have some experience with MySQL from some PHP projects in the past, so that will give me a head start.
You want to use a file as a container holding different files? If so, there are a lot of ways to accomplish this. These are techniques I have used in the past:
Zip:
A compressed file format such as Zip can be used as a container for this. It is capable of storing virtual files, and they can vary in size up to at least 1 gigabyte (tested; I currently don't know whether there are implementation-specific size limits).
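A rough sketch of what that could look like with System.IO.Compression (the container name, entry name and JSON-lines format are just assumptions to match the question):

```csharp
using System;
using System.IO;
using System.IO.Compression;   // on .NET Framework this needs a reference to System.IO.Compression.FileSystem

class ZipContainer
{
    // Append one JSON line to a "table" stored as an entry inside a single zip file.
    // "container.zip" and the entry names are placeholders.
    public static void AppendRecord(string containerPath, string entryName, string jsonLine)
    {
        using (ZipArchive zip = ZipFile.Open(containerPath, ZipArchiveMode.Update))
        {
            ZipArchiveEntry entry = zip.GetEntry(entryName) ?? zip.CreateEntry(entryName);
            using (Stream stream = entry.Open())
            {
                stream.Seek(0, SeekOrigin.End);            // keep existing records, append at the end
                using (var writer = new StreamWriter(stream))
                    writer.WriteLine(jsonLine);
            }
        }
    }

    public static string[] ReadRecords(string containerPath, string entryName)
    {
        using (ZipArchive zip = ZipFile.Open(containerPath, ZipArchiveMode.Read))
        using (var reader = new StreamReader(zip.GetEntry(entryName).Open()))
        {
            return reader.ReadToEnd()
                         .Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        }
    }
}
```

Keep in mind that opening the archive in Update mode rewrites the zip when it is disposed, which is exactly the reorganization cost another answer below points out.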
SQLite:
SQLite sounds old-school, but it stores everything database-related in one physical file. Creating a database with a table for each virtual file should do the trick. This approach is useful if you know your virtual files won't grow very large or hit the limits of SQLite's field datatypes. Since your virtual files are going to be lines of text, you may be able to turn them into attributes and tuples; that way you can even use SQL statements to query and filter your data however you wish.
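For illustration, a minimal sketch assuming the Microsoft.Data.Sqlite package (the database file, table and column names are made up):

```csharp
using Microsoft.Data.Sqlite;   // assumed NuGet package; System.Data.SQLite would look very similar

class SqliteStore
{
    // Append one JSON-line "record" into a table that lives, together with
    // everything else, in the single music.db file. Names are only examples.
    public static void SaveRecord(string jsonLine)
    {
        using (var connection = new SqliteConnection("Data Source=music.db"))
        {
            connection.Open();

            var create = connection.CreateCommand();
            create.CommandText =
                "CREATE TABLE IF NOT EXISTS playlists (id INTEGER PRIMARY KEY, record TEXT)";
            create.ExecuteNonQuery();

            var insert = connection.CreateCommand();
            insert.CommandText = "INSERT INTO playlists (record) VALUES ($record)";
            insert.Parameters.AddWithValue("$record", jsonLine);
            insert.ExecuteNonQuery();
        }
    }
}
```

Everything ends up in the single music.db file, and you get SQL querying over the records for free.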
There are still more ways to implement that kind of container format on your own, but you would probably invest more time and work in it than you would get out of it. Stay tuned for better ideas and maybe ready-to-use implementations :-)
Will you ever need to search through your data? Then use a real database manager; in C# the built-in local database file is the simplest choice (if you are familiar with SQL).
The zip file is a good choice for data space and compactness (a single file instead of many files), but it is very slow: for each database operation the whole zip file has to be reorganized. Even a tar file (without compression) needs continuous reallocation when the content changes, and a zip file needs extra computation and relocation on top of that.
If you want something that is compressed and still standard, you can use an office format (OpenXML xlsx or OpenDocument ods, it doesn't matter) to store your data, but the save operation will be slow and get slower as your database grows.
I guess I'll just flat-out explain my situation. I have a desktop application that reads and writes to an XML file. This same XML file is read and written by an ASP site. Both sides need to be notified when a value changes. This is relatively trivial on the desktop app side as I just re-read the XML and apply the values, however it gets more complicated on the web side.
The website needs to immediately get the updated information from the XML. The problem is I can't figure out a proper way to store these values and in turn handle notification of updated/changed/new/deleted values. Sending the entire XML file is out of the question.
Getting the data to the page isn't the question, I have that all wired up. The question is how should I be storing this data in order to be able to handle incremental updates and also be notified of changed values?
I have an extremely clunky solution and I absolutely hate it. I was hoping someone could maybe point me in the right direction as to what type of container to store this data in; I'm relatively inexperienced in this area of C#/ASP.
Thanks for taking the time to read my novel.
What do you mean by the website getting the update "immediately"? Code would normally only execute there on the next request (whenever that comes). So on every request you could (but currently do not) just read the latest copy of the file.
In general architecture terms, if you need 2 components / applications co-ordinating like this - a message queue is the natural abstraction. Right now you're treating the XML file like shared memory for doing interprocess communication.
If it's going to remain a kludge-like solution, I suggest replacing the XML with a DB table. It's easier to poll, coordinate, and receive updates.
Well, I kinda found a solution.
I keep a copy of the "current" XML deserialized with an XmlSerializer into a proper type. When I read the new XML, I deserialize it the same way, and to compare the two I just serialize each to JSON and do a standard string compare.
This works for me because my XML structure never changes, just the values inside, so I'm able to check which "sections" have changed by comparing their serialized JSON.
Kinda weird but it works for me.
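For what it's worth, a minimal sketch of that compare-by-JSON approach (the Config type is a stand-in for whatever the XML actually deserializes into, and Newtonsoft.Json is only one possible JSON serializer):

```csharp
using System.IO;
using System.Xml.Serialization;
using Newtonsoft.Json;   // assumed; any JSON serializer would do for the compare

// Placeholder type: shape it to match the real XML structure.
public class Config
{
    public string Section1 { get; set; }
    public string Section2 { get; set; }
}

public static class XmlChangeDetector
{
    static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Config));

    public static Config LoadXml(string path)
    {
        using (var reader = new StreamReader(path))
            return (Config)Serializer.Deserialize(reader);
    }

    // Compare the "current" copy with a freshly loaded one by serializing both to JSON.
    public static bool HasChanged(Config current, Config latest)
    {
        return JsonConvert.SerializeObject(current) != JsonConvert.SerializeObject(latest);
    }
}
```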
We have a web application that must allow users to upload files with zip codes; these files are .csv files. Any user will be able to upload the file from their computer. The issue is that the file may contain thousands of records. Right now I am getting the file and making sure it has the right headers, but I am pushing the records one by one into the database.
I am using C# ASP.NET. Is there a better, more efficient way to do this in code? We can't use any external importers or tools like SQL Server business intelligence. How can I do this? I was reading something about loading it into memory and then pushing it to the database. Any URLs, examples or suggestions would be much appreciated.
Regards
Firstly, I'm pretty sure that what you are asking is actually "How do you process a large file and insert the processed data into the database?".
Now assuming I am correct I would say the question is akin to 'how long is a piece of string?'. The reality is that an implementation for processing large files into a database is highly specific to your requirements.
However, at the simplest end of the spectrum you could simply upload the file straight into a table (or a folder) and create a Windows service that runs every x minutes, traverses the table, picks up each file, and processes your data using bulk inserts and the prepare method (which may give you some performance benefits).
Alternatively you could look at something like MSMQ (Microsoft Message Queuing) and save any uploaded files directly to a queue, which is then completely independent of your application, can be processed at any point in time, and can easily be scaled out.
At the end of the day, though, I honestly don't think anyone here can give you a 'correct' answer, because there really isn't one; you'll only find improvements to your implementation through experimentation.
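As a rough illustration of the bulk-insert suggestion above, SqlBulkCopy can push a whole CSV in batches instead of row-by-row inserts. The connection string, table name and two-column layout are placeholders:

```csharp
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Linq;

public static class ZipCodeImporter
{
    // Read the whole CSV into a DataTable, then push it in batches with SqlBulkCopy.
    public static void Import(string csvPath, string connectionString)
    {
        var table = new DataTable();
        table.Columns.Add("ZipCode", typeof(string));
        table.Columns.Add("City", typeof(string));

        foreach (var line in File.ReadLines(csvPath).Skip(1))   // skip the header row
        {
            var parts = line.Split(',');
            table.Rows.Add(parts[0], parts[1]);
        }

        using (var connection = new SqlConnection(connectionString))
        using (var bulk = new SqlBulkCopy(connection))
        {
            connection.Open();
            bulk.DestinationTableName = "dbo.ZipCodes";
            bulk.BatchSize = 5000;
            bulk.WriteToServer(table);
        }
    }
}
```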
If this contains up to a million records, the best approach is to create a service to manage inserting the records into the database, to avoid timeouts and keep the stress off the web server (IIS).
If you make it a Windows service, you can notify the service to process the zip files in the directory where they were uploaded.
Also, I would suggest using bulk inserts for faster database transactions.
If there is validation, you can stage the data in a separate database, validate it there, and then push it to the final database.
Since these records go into the same table and are not related to each other, Parallel.ForEach may be a valid answer here. Assuming you have a static method (it may not necessarily need to be static) that inserts an individual record into the db, you can run a Parallel.ForEach loop over an array where each index of the array represents a line of the CSV.
This assumes that uploading the large file to the server isn't the initial issue. If that is also part of the issue, I would recommend zipping the file and then using something like SharpZipLib to unzip it once it is uploaded. Since text compresses very well, this may be the biggest boon to performance from the user's perspective.
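A minimal sketch of that Parallel.ForEach idea, where InsertRecord stands in for whatever single-row insert method you already have (each iteration may run on a different thread, so it must use its own connection):

```csharp
using System.IO;
using System.Threading.Tasks;

public static class ParallelImporter
{
    public static void Import(string csvPath)
    {
        // One CSV line per record; iterations may run on different threads.
        string[] lines = File.ReadAllLines(csvPath);

        Parallel.ForEach(lines, line => InsertRecord(line));
    }

    // Placeholder for the existing single-row insert mentioned above.
    static void InsertRecord(string csvLine)
    {
        // ... parse the line and insert it using its own SqlConnection ...
    }
}
```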
I need to tweak some variables (only in a development setting) without having to restart IIS or anything (so I assume Web.Config is the wrong place to put them). Where is the easiest place to put about 500 config settings that have to be read for every request and written to, like I said, while IIS is running?
EDIT: Like I said, this is only for some Q&D development, so I don't care about performance in any way. A database is a bit of overkill (and probably more work than I want to deal with). I want something fast (like Settings) that I don't have to worry about parsing and that I can read from and write to. If I do XML, where do I write the file so I don't have to spend time messing around with permissions?
In a database?
500 config settings to be read on every request? I'd put them in a database so they can be indexed and cached. A separate XML or data file would also most likely be cached in memory by the web server, but it still wouldn't provide the performance an indexed database table could. It depends, though, on how you are accessing the settings.
You can just make your own "config" file; just don't name it .config. Then you can read it like a text file and set all your properties from it. You just have to implement your own file-monitoring class (or something similar) so you know when the file has changed and can update your values; a sketch of that follows below.
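Something along these lines, as a rough sketch only; the settings.txt name and the key=value format are assumptions:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;

public static class DevSettings
{
    // Simple key=value file that is reloaded whenever it changes on disk.
    // "settings.txt" in the application's base directory is an assumed location.
    static readonly string SettingsPath =
        Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "settings.txt");
    static readonly ConcurrentDictionary<string, string> Values =
        new ConcurrentDictionary<string, string>();
    static readonly FileSystemWatcher Watcher;

    static DevSettings()
    {
        Load();
        Watcher = new FileSystemWatcher(
            Path.GetDirectoryName(SettingsPath), Path.GetFileName(SettingsPath));
        Watcher.Changed += (sender, args) => Load();
        Watcher.EnableRaisingEvents = true;
    }

    static void Load()
    {
        if (!File.Exists(SettingsPath))
            return;

        foreach (var parts in File.ReadAllLines(SettingsPath)
                                  .Select(line => line.Split(new[] { '=' }, 2)))
        {
            if (parts.Length == 2)
                Values[parts[0].Trim()] = parts[1].Trim();
        }
    }

    public static string Get(string key)
    {
        string value;
        return Values.TryGetValue(key, out value) ? value : null;
    }
}
```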
With that many configuration options, a database system with some well-thought-out caching is most likely going to be the best idea overall.
You also have to consider the impact of loading/storing them on every request; even with small values that can add up to a lot of overhead, so caching is going to be very important.
I know you said you don't want a database, but with 500 settings, it just seems like the best solution.
That said, if you really don't want a database, you could always dump them into an XML file stored locally and just read/write it when needed.
I am creating an RSS reader as a hobby project, and I'm at the point where the user is adding their own URLs.
I was thinking of two things.
A plaintext file where each URL is on a single line
SQLite, where I can have unique IDs and descriptions along with the URL
Is the SQLite idea too much overhead, or is there a better way to do things like this?
What about an OPML file? It's XML, so if you need to store more data than the OPML specification supplies, you can always add your own namespace.
Additionally, importing and exporting from other RSS readers is all done via OPML, and there is often library support for it. If you're interested in having users switch, then you have to support OPML. Thanks to jamesh for bringing that point up.
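Writing OPML by hand is not much work either; here is a rough sketch with XDocument (the title and the feed dictionary are made-up examples):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class OpmlWriter
{
    // Write a minimal OPML 2.0 document; feeds maps a display title to its feed URL.
    public static void Save(string path, IDictionary<string, string> feeds)
    {
        var doc = new XDocument(
            new XElement("opml", new XAttribute("version", "2.0"),
                new XElement("head",
                    new XElement("title", "My feeds")),
                new XElement("body",
                    feeds.Select(feed =>
                        new XElement("outline",
                            new XAttribute("type", "rss"),
                            new XAttribute("text", feed.Key),
                            new XAttribute("xmlUrl", feed.Value))))));
        doc.Save(path);
    }
}
```

Usage would be something like `OpmlWriter.Save("feeds.opml", new Dictionary<string, string> { { "Example feed", "http://example.com/rss" } });`.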
Why not XML?
If you're dealing with RSS anyway, you may as well :)
Do you plan to store just URLs? Or do you plan to add data like last_fetch_time and so on?
If it's just a simple URL list that your program will read line by line and download data from, store it in a file, or even better as a serialized object written to a file.
If you plan to extend it, adding comments, time of last fetch, etc., I'd go for SQLite; it's not that much overhead.
If it's a single user application that only has one instance, SQLite might be overkill.
You've got a few options as I see it:
SQLite / database layer. Increases the dependencies your code needs to run, but allows concurrent access.
Roll your own text parser. Complexity increases as you want to save more data, and you're re-inventing the wheel. Fewer dependencies, and initially, while your data is simple, it's trivial for a novice user of your application to edit.
Use XML. It's well formed and defined, and text-editable. Could be overkill for storing just a URL, though.
Use something like pickle to serialize your objects and save them to disk. Changes to your data structure mean "upgrading" the pickle files. Not very intuitive for a novice user to edit, but extremely easy to implement.
I'd go with the XML text file option. You can use the XSD tool built into Visual Studio to create a DataTable out of the XML data, and it easily serializes back into the file when needed.
The other consideration is that I'm sure you're going to want the end user to be able to categorize their RSS feeds and potentially search/sort them, and having that kind of DataTable structure will help with this.
You'll get easy file storage and access, the benefit of a "database" structure, but not quite the overhead of SQLite.
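A minimal sketch of that DataTable-backed storage, using the DataTable's own XML serialization (the column names are just examples, not anything from the XSD approach above):

```csharp
using System.Data;
using System.IO;

public static class FeedStore
{
    // Load the feed list from an XML file, or start a new table on first run.
    public static DataTable Load(string path)
    {
        var table = new DataTable("Feeds");
        if (File.Exists(path))
        {
            // The schema is stored inline in the file, so ReadXml restores the columns too.
            table.ReadXml(path);
        }
        else
        {
            // First run: column names here are only an example.
            table.Columns.Add("Url", typeof(string));
            table.Columns.Add("Category", typeof(string));
        }
        return table;
    }

    public static void Save(DataTable table, string path)
    {
        // WriteSchema keeps the column definitions in the file for the next ReadXml.
        table.WriteXml(path, XmlWriteMode.WriteSchema);
    }
}
```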