I have .txt files that are overwritten with data by some software every 5-10 seconds, and I have a WPF application that reads and displays this data every second. Here are my issues:
Currently the text files are stored on a server and there are a bunch of users running this application to view this "live" data.
HOWEVER, due to an I/O bug in Windows, the files "lock up" periodically and cause all of the applications to lock up (they can't even be closed in Task Manager).
Therefore I decided to have the data copied from the text files to SQL; however, from my understanding there's no way to overwrite the data in a SQL table: one must drop the table and create a new one. This causes a 10+ second delay in updating the data, which cannot happen.
My question is, there HAS to be a way to rapidly read and write data from somewhere, be it a database, etc. I am not sure where else to turn.
My constraints:
I'm stuck with Server 2008, I have to use these text files, and I have to display the data in my WPF application. Does anyone have any suggestions for a method that can handle this type of I/O?
All help is greatly appreciated; I'm at a complete loss.
It seems like you may not have extensive experience with database technology, so let me propose something different:
string text = System.IO.File.ReadAllText(path);
Then perhaps you can take the text and do what you want with it, or dump it in a queue for action in another part of the application.
ReadAllText can throw a number of exceptions:
https://msdn.microsoft.com/en-us/library/ms143368(v=vs.110).aspx
I'd be on the lookout for UnauthorizedAccessException since, as you said, the file seems to lock up when multiple users are accessing it.
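If the hangs come from sharing violations while the other software is overwriting the file, a variant worth trying is opening the file with an explicit share mode instead of ReadAllText. A minimal sketch under my own assumptions (the helper name is made up, and FileShare.ReadWrite only helps if the writing process also opens the file with a compatible share mode):

using System.IO;

// Hypothetical helper: read the whole file while letting the writer keep its
// own handle open. Treat this as best-effort; it does not fix a true OS-level
// lock, it only avoids the most common sharing violations.
static string ReadSharedText(string path)
{
    using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    using (var reader = new StreamReader(stream))
    {
        return reader.ReadToEnd();
    }
}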
The requirement is to achieve 'chat'-like communication between two console apps on the same Windows box. I implemented this using named pipes, by implementing both sender and receiver functionality in each app.
I want to try the same functionality but using memory-mapped files (though I think they are not ideal for 'chat'-type communication).
For simplicity's sake, say chat messages are just short strings.
Here is what I have in mind:
One app will take care of creating a mutex and the memory-mapped file. Call it the master.

In each app, we maintain two threads: one responsible for taking user input and writing to the file, the other responsible for periodically checking whether there is something to read. So four threads in all, each governed by a mutex for access to the file.

Within the file, I think both apps should have their own 'section': say the first half of the file is for the master app and the other half for the second app.

So when the user inputs a line of text in the master app, the thread accesses its half of the file and tries to append the new text after the last newline.

When an app reads its section of the file and finds text, it reads it and then blanks out its section.
Is this approach correct? Another approach would be to somehow mark each message with a source id, so that the reader knows to ignore messages it wrote itself, but I feel that is unnecessary string parsing.
Also, other than each reader thread periodically trying to read its section of the file to see if there is new data, can you suggest any kind of notification mechanism, some sort of event handling? The reader thread would only go looking for new messages when it gets such a notification.
Any thoughts?
I agree with Hans for the most part; memory-mapped files would not necessarily be ideal here. If you go down this road, though, consider using a named event (see http://msdn.microsoft.com/en-us/library/windows/desktop/ms682396(v=vs.85).aspx) instead of polling.
You may need to P/Invoke to get at this functionality from C#.
For the rest, give each app its own region of the file, with a control section managed by the master to coordinate who gets what.
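For what it's worth, recent .NET versions expose named events through the EventWaitHandle constructor that takes a name, so a P/Invoke may not be needed. Here is a rough sketch of the writer side under my own assumptions: the map name "ChatMappedFile", the event name "ChatDataReady", the 4096-byte capacity, and the per-app halves are all made up for illustration.

using System.IO.MemoryMappedFiles;
using System.Threading;

// Writer side (sketch): copy a message into this app's region of the shared
// file, then signal the named event so the other app doesn't have to poll.
using (var mmf = MemoryMappedFile.CreateOrOpen("ChatMappedFile", 4096))
using (var dataReady = new EventWaitHandle(false, EventResetMode.AutoReset, "ChatDataReady"))
using (var accessor = mmf.CreateViewAccessor(0, 2048))   // this app's half of the file
{
    byte[] message = System.Text.Encoding.UTF8.GetBytes("hello");
    accessor.Write(0, message.Length);                    // length prefix
    accessor.WriteArray(4, message, 0, message.Length);   // message body
    dataReady.Set();                                      // wake the reader
}

// Reader side (sketch): block on the same named event instead of polling.
// dataReady.WaitOne();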
I wrote a custom control for output file name selection with the typical features: a text box for the filename, a "Browse" button, and some other functionality specific to my application.
The text box changes color depending on the filename. If the file location cannot be written to, it turns red. If the file already exists, it turns yellow. Otherwise, it remains the system-assigned color.
To see if a file exists, I use IO.File.Exists; simple enough.
I implemented the "if the file can be written to" as a simple try-catch block where a file is actually opened, something written in it, closed, then deleted. If at any point an exception is thrown, I know the user can't use that filename and I turn the text box red.
This is a catch-all; since I'm doing the actual operation I intend to do, it is fool-proof. However, it seems irresponsible to have software creating and deleting files like crazy just to see if it can.
So my question is, how do I replicate this functionality without creating files? I can see I have to:
Check the path for legality (e.g., 'z:' is not a valid filename). This entails parsing the path and making sure all directories exist.
If the location exists, I have to check for write permissions. (Several answered questions exist to this end.)
Is there anything else?
EDIT
Within minutes I see people are already voting up an answer that criticizes the fact that I'm checking at all whether the file is accessible before the actual write occurs. While I appreciate experts "standing back" from my question to see whether there is a completely different way to achieve it, telling me I shouldn't be doing it is not an answer to my question.
So let me elaborate on my application (I am not expecting hundreds of users at the same time).
I use this file chooser control in data acquisition applications. In many situations the test that you are about to run is "expensive" in one way or another. Therefore it is critical to set things up very carefully. Overwriting data can be very expensive (and for the fearful user I have a checkbox that will append the date and time down to the millisecond to the filename).
So the purpose of my indicator colors is not to give the software a surefire guarantee that the file can be written to (that check is still done at the instant the write actually happens); it's to show the user that he has at least set up the file name correctly, so that if he goes forward he is guaranteed not to overwrite old data, and he can be reasonably sure that a last-minute I/O error (a filename typo) won't let the experiment run unrecorded.
I suggest this: don't check anything before the user commits the action. With your current approach, even if you verified the file is okay, it may be locked 5 seconds later when the user actually commits the write. Doing preliminary checks may only give the user a false impression of likely success. Especially consider this point on a terminal server with 100+ simultaneous users.
There is nothing wrong with showing a Retry/Cancel prompt if there is no access, and letting the user decide.
EDIT:
No offense, but there are standards for how such collisions are handled; the Windows standard is to show a prompt to the user. Also consider this: if you suddenly find write access denied on a folder where you expect to have it, you probably need to hire another system/network administrator.
If the operation is costly, make sure that person is paid well. C'mon, what if your network goes down during the write? Your hard drive? A router? There are many reasons why writing to a file can be interrupted, and you should be prepared for that. If you cannot afford a failure, make sure you have invested in good infrastructure and good people to support it.
Back down to earth, you can increase the chances of acquiring a successful lock on the file:
Pick a unique file name, using a datetime-based hash as a suffix/prefix.
Write to the user's home directory (%UserProfile%); it is very likely that you will succeed. (See the sketch after this list.)
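As a rough illustration of those two points (the "MyApp" subfolder and the file-name pattern are placeholders, not anything prescribed):

using System;
using System.IO;

// Sketch: a timestamp-based file name under the user's profile is very
// unlikely to collide and very likely to be writable.
string folder = Path.Combine(Environment.GetEnvironmentVariable("USERPROFILE"), "MyApp");
Directory.CreateDirectory(folder);   // no-op if the folder already exists
string fileName = string.Format("result_{0:yyyyMMdd_HHmmss_fff}.dat", DateTime.Now);
string fullPath = Path.Combine(folder, fileName);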
I can understand your not wanting to risk losing "expensive" data because the file couldn't be written, and a responsible program will do its best to avoid that situation.
I would do this by caching the results. Before the test is run, write a mock result to a file somewhere in the user's data space, then leave the file open and write the real result to it. After this is done, write the result to the user-specified file. Provide a recovery option that will read the cache file and write it out to the user's file.
Your approach could fail because the fact that the file was writable at the start doesn't mean it's still writable. The network could have gone down. Someone could have removed the flash drive. Someone else could be doing a large data transfer through a buggy router. (Real-world case: it took me a long time to prove it was a network problem and not my program. They finally accepted it was their fault when I showed that running dir *.* /s on multiple machines at once would almost certainly cause one or more to fail.)
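A bare-bones sketch of that caching idea; the paths, resultText, and userChosenPath are placeholders for whatever the application really uses:

using System;
using System.IO;

// Record the result somewhere local first, so the data is safe even if the
// user-chosen destination fails.
string cachePath = Path.Combine(
    Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
    "MyApp", "last-result.cache");
Directory.CreateDirectory(Path.GetDirectoryName(cachePath));
File.WriteAllText(cachePath, resultText);      // data is now safe locally

try
{
    File.Copy(cachePath, userChosenPath);      // then copy to the real target
}
catch (IOException)
{
    // Network down, drive removed, destination locked... keep the cache and
    // offer a "recover last result" option that copies it out later.
}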
I'm planning to develop an application that will read a log file and display statistics.
The first question, I guess, is to know if I need a database or not?
Will it be quicker to run queries against the database, or to read the file each time a user wants to see the statistics?
If I choose the database method, I will have to read the log file and update the database on a regular basis (between 1 and 10 minutes).
Do you think this article is still relevant (it's from 2005): http://www.codeproject.com/KB/aspnet/ASPNETService.aspx
Or is it better to develop a Windows service? In that case, can I add the Windows service to my ASP.NET project in Visual Studio, or does it need to be a separate project?
You mentioned ASP.NET, so I believe this is a web application. In that case I would suggest using a database; it is a more robust, flexible, and distributed solution.
In any case, consider using log4net: then you can easily switch file/DB output on at any time by simply adding another appender section to the configuration file.
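As a quick sketch of what that buys you (the class name and message are just illustrative): the code only ever talks to ILog, and whether the output goes to a file, a database, or both is decided by the appender sections in the config.

using log4net;
using log4net.Config;

public class LogImportJob
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(LogImportJob));

    public void Run()
    {
        // Reads the <log4net> appender sections from the application config.
        XmlConfigurator.Configure();
        Log.Info("Started parsing the log file");
    }
}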
If I choose the database method, I will have to read the log file and
update the database on a regular basis (between 1 and 10 minutes)
Exactly, and you're going to have to do it anyway. The database basically just becomes another bottleneck at that point. For this type of app, there's no need to do anything other than read the file when the user requests to see it, and display the results on the fly.
No need for a Windows service either. I don't know all your details, but I'm assuming the log file is in a directory on your machine, so just access it, open it, parse it, and display it to the user when they choose to see it on the front end.
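Something as simple as this sketch would do for the "parse on demand" approach; logPath and the "ERROR" marker are assumptions about your log format:

using System.IO;
using System.Linq;

// Read the log when the page is requested and compute the statistics in
// memory; no database or background service involved.
var lines = File.ReadLines(logPath).ToList();
int totalEntries = lines.Count;
int errorCount = lines.Count(l => l.Contains("ERROR"));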
If the only data you are going to work with is log files, you don't need any database.
But I assume that your application will parse the log files, compute some statistics, and STORE them somewhere, so that users can come back and see statistics for some period of time. It is not great if you have to re-calculate those statistics every time (furthermore, you might lose the original log files by then).
Even though you could also store the statistics in files, I do not recommend that at all. Don't be afraid of using a database, and don't be concerned about application performance at such an early stage. Do whatever helps you most to solve the problem, and in my view using a database will solve yours.
I need to write an application that polls a directory which contains images on a file server and display 4 at a time.
This application will be run up to 50 times across the network at the same time.
I'm trying to think of the best architecture to complete this requirement.
I was working on the idea of opening a file with read/write access and no file sharing allowed, so that if another PC came in to read it, it would get an error and would have to move on to the next one. The problem is that I need to access all 4 images in sequence on the same PC while ensuring other PCs don't try to open them. So, for example, if PC1 opens 1.jpg it needs to be able to open 1,2,3,4.jpg; if another PC comes in at the same time to read them, I need a way for it to open 5,6,7,8.jpg instead, and so on.
It seems a simple requirement but a nightmare to try and build successfully.
You're basically dealing with a race condition here, and I don't see a way to handle it from separate instances of your application running on separate machines unless you can guarantee your file naming will always follow a standard naming convention that would allow you to work with the sequence of 4 files using only the name of the first.
The best way to handle this would be to use a centralized resource to manage access to your files: either a database, as was suggested in a comment, or a service (such as WCF) that would "hand out" each set of 4 files.
What about creating a 1.jpg.lock file? The presence of the lock file indicates that the set of images is claimed, and any other instance of the application should skip that set. For example:
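A sketch of how an instance could try to claim a set atomically; the helper name is made up, and the key point is that FileMode.CreateNew fails if the file already exists, which is what makes the claim atomic:

using System.IO;

// Whichever instance manages to create the .lock file first "owns" that set
// of four images; everyone else gets an IOException and moves on.
static bool TryClaimSet(string firstImagePath)
{
    string lockPath = firstImagePath + ".lock";    // e.g. 1.jpg.lock
    try
    {
        using (new FileStream(lockPath, FileMode.CreateNew, FileAccess.Write, FileShare.None))
        {
            return true;     // this PC displays images 1-4
        }
    }
    catch (IOException)
    {
        return false;        // another PC got there first; try the next set
    }
}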
I am working on an app that will keep a running index of work accomplished.
I could write once at the end of a work session, but I don't want to risk losing data if something blows up. Therefore, I rewrite to disk (XML) every time a new entry or a correction is made by the user.
// Requires System.Xml and System.Text.
private void WriteIndexFile()
{
    XmlDocument IndexDoc = new XmlDocument();
    // Build document here
    using (XmlTextWriter tw = new XmlTextWriter(_filePath, Encoding.UTF8))
    {
        tw.Formatting = Formatting.Indented;
        IndexDoc.Save(tw);
    }
}
It is possible for the writes to be triggered in rapid succession. If this happens, it tries to open the file for writing before the prior write is complete. (While it would not be normal, I suppose it is possible that the file gets opened for use by another program.)
How can I check if the file can be re-written?
Edit for clarification: This is part of an automated lab data collection system. The users will click a button to capture data (saved in separate files) and identify the sub-task the data package is for. Typically, it will be 3-10 minutes between clicks.
If they make an error, they need to be able to go back and correct it, so it's not an append-only usage.
Finally, the files will be read by other automated tools and manually by humans. (XML/XSLT)
The size will be limited as each work session (worker shift or less) will have a new index file generated.
Further question: As the overwhelming consensus is to not use XML and write in an append-only mode, how would I solve the requirement of going back and correcting earlier entries?
I am considering having a "dirty" flag, saving a few minutes after the flag is set and again when the work session is closed. If multiple edits happen in that window, only one write will occur (no more rapid-fire writes from the user), and I'd also show a retry/cancel dialog if the save fails. Thoughts?
XML is a poor choice in your case because new content has to be inserted before the closing tag. Use plain text instead: simply open the file for append and write the new content at the end of the file, see How to: Open and Append to a Log File.
You can also look into a simple logging framework like log4net and use that instead of handling the low-level file stuff yourself.
If all you want is a simple log of all operations, XML may be the wrong choice here as it is difficult to append to an XML document without rewriting the whole file, which will become slower and slower as the file grows.
I'd suggest File.AppendText instead, or even better: keeping the file open for the duration of the application's lifetime and using WriteLine.
(Oh, and as others have pointed out, you need to lock to ensure that only one thread writes to the file at a time. This is still true even with this solution.)
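A minimal sketch of the append-only approach with such a lock, reusing _filePath from the question; the timestamp format is just an example:

using System;
using System.IO;

private static readonly object LogLock = new object();

private void AppendEntry(string entry)
{
    // Serialize writers, then append a single line; no rewriting of the file.
    lock (LogLock)
    {
        using (StreamWriter writer = File.AppendText(_filePath))
        {
            writer.WriteLine("{0:O}\t{1}", DateTime.Now, entry);
        }
    }
}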
There are also logging frameworks that already solve this problem, such as log4net. Have you considered using an existing logging framework instead of rolling your own?
I have a logger that uses System.Collections.Queue. Basically it waits until something is queued, then tries to write it. While items are being written, which could be slow, more items can be added to the queue.
This also helps by grouping messages rather than trying to keep up. It runs on a separate thread.
// Requires System.Collections and System.Threading.
// Queue.Synchronized gives a thread-safe wrapper, since LogMessage can be
// called from any thread while the worker thread dequeues.
private readonly Queue logQueue = Queue.Synchronized(new Queue());
private readonly AutoResetEvent ResetEvent = new AutoResetEvent(false);
private volatile bool Running = true;

public void LogMessage(string fullMessage)
{
    this.logQueue.Enqueue(fullMessage);
    // Trigger the reset event to wake the processing thread.
    this.ResetEvent.Set();
}

private void ProcessQueueMessages()
{
    while (this.Running)
    {
        // This will process all the items currently in the queue.
        while (this.logQueue.Count > 0)
        {
            // This method logs (and on success dequeues) the item at the
            // front of the queue.
            this.LogQueueItem();
        }
        // Once the queue is empty, wait for another message to be queued
        // before running again. Rather than sleeping and polling the queue,
        // this saves doing a System.Threading.Thread.Sleep(1000) loop.
        this.ResetEvent.WaitOne();
    }
}
I handle write failures by not dequeueing an item until it has been written to the file with no errors; I just keep retrying until the write finally succeeds. This has saved me, because somebody once removed permissions from one of our apps while it was running. The permission was given back without shutting down our app, and we didn't lose a single log statement.
Consider using a flat text file. I have a process that I wrote that uses an XML log... it was a poor choice. You can't just write out the state as you run without constantly rewriting the file to keep the tags correct. If flat entries were written to a file, you would have an automatic timeline giving you details of what happened, without trying to figure out whether it was the XML writer/tag set that blew up, and you wouldn't have to worry about your logs bloating as much.
I agree with others suggesting you avoid XML. Also, I would suggest you have one component (a "monitor") that is responsible for all access to the file. That component will have the job of handling multiple simultaneous requests and making the disk writes happen one after another.