Let me preface this question by saying I'm absolutely not a pro C# programmer and have pretty much brute forced my way through most of my small programs so far.
I'm working on a small WinForms application to SSH into a few devices, tail -f a log file on each, and display the real-time output in TextBoxes while also saving to log files. Right now, it works, but hogs nearly 30% of my CPU during logging and I'm sure I'm doing something wrong.
After creating the SshClient and connecting, I run the tail command like so (these variables are part of a logger class which exists for each connection):
command = client.CreateCommand("tail -f /tmp/messages");
result = command.BeginExecute();
stream = command.OutputStream;
I then have a log reading/writing function:
public async Task logOutput(IAsyncResult result, Stream stream, TextBox textBox, string logPath)
{
// Clear textbox ( thread-safe :) )
textBox.Invoke((MethodInvoker)(() => textBox.Clear()));
// Create reader for stream and writer for text file
StreamReader reader = new StreamReader(stream, Encoding.UTF8, true, 1024, true);
StreamWriter sw = File.AppendText(logPath);
// Start reading from SSH stream
while (!result.IsCompleted || !reader.EndOfStream)
{
string line = await reader.ReadLineAsync();
if (line != null)
{
// append to textbox
textBox.Invoke((Action)(() => textBox.AppendText(line + Environment.NewLine)));
// append to file
sw.WriteLine(line);
}
}
}
Which I call the following way, per device connection:
Task.Run(() => logOutput(logger.result, logger.stream, textBox, fileName), logger.token);
Everything works fine; it's just the CPU usage that's the issue. I'm guessing I'm creating way more than one thread per logging process, but I don't know why, or how to fix it.
Does anything stand out as a simple fix to the above code? Or even better - is there a way to set up a callback that only prints the new data when the result object gets new text?
All help is greatly appreciated!
EDIT 3/4/2021
I tried a simple test using CopyToAsync by changing the code inside logOutput() to the following:
public async Task logOutput(IAsyncResult result, Stream stream, string logPath)
{
using (Stream fileStream = File.Open(logPath, FileMode.OpenOrCreate))
{
// While the result is running, copy everything from the command stream to a file
while (!result.IsCompleted)
{
await stream.CopyToAsync(fileStream);
}
}
}
However this results in the text files never getting data written to them, and CPU usage is actually slightly worse.
2ND EDIT 3/4/2021
Doing some more debugging, it appears the high CPU usage occurs only when there's no new data coming in. As far as I can tell, this is because the ReadLineAsync() method is constantly firing regardless of whether or not there's actually new data from the SSH command that's running, and it's running as fast as possible hogging all the CPU cycles it can. I'm not entirely sure why that is though, and could really use some help here. I would've assumed that ReadLineAsync() would simply wait until a new line was available from the SSH command to continue.
The solution ended up being much simpler than I would've thought.
There's a known bug in SSH.NET where the command's OutputStream will continually spit out null data when there's no actual new data received. This causes the while loop in my code to run as fast as possible, consuming a bunch of CPU in the process.
The solution is simply to add a short asynchronous delay in the loop. I included the delay only when the received data is null, so that reading isn't interrupted when actual valid data is coming through.
while (!result.IsCompleted && !token.IsCancellationRequested)
{
string line = await reader.ReadLineAsync();
// Back off briefly when there's no new data
if (string.IsNullOrEmpty(line))
{
await Task.Delay(10); // prevents high CPU usage
continue;
}
// Append line to textbox
textBox.Invoke((Action)(() => textBox.AppendText(line + Environment.NewLine)));
// Append line to file
writer.WriteLine(line);
}
On a Ryzen 5 3600, this brought my CPU usage from ~30-40% while the program was running to less than 1% even when data is flowing. Much better.
Related
We are seeing some periodic +200 ms overhead when reading the request InputStream through a StreamReader while there is load on the system. I am wondering whether anyone else has seen this, and if so, whether they have done anything to fix it?
The following is the code:
string requestBody;
var streamReaderTime = Stopwatch.StartNew();
using (var streamReader = new StreamReader(context.Request.InputStream, context.Request.ContentEncoding))
{
var allLines = streamReader.ReadLines();
var request = new StringBuilder();
foreach (var line in allLines)
    request.Append(line);
requestBody = request.ToString();
}
streamReaderTime.Stop();
The ReadLines extension method is just as follows:
public static IEnumerable<string> ReadLines(this StreamReader reader)
{
while (!reader.EndOfStream)
{
yield return reader.ReadLine();
}
}
Note: Using ReadLines() or ReadToEnd() makes very little difference if any.
We run performance tests overnight and we are seeing the following behavior just from graphing streamReaderTime.
A single request takes between 45 ms and 70 ms to execute, but the graph shows a fixed amount being added on, and sometimes an even bigger spike; I've previously seen it reach around 1.5 seconds.
If anyone has any solutions/suggestions it would be greatly appreciated.
Edit: I did have ReadToEnd() instead of ReadLines(), which got rid of the StringBuilder, but the overhead was still the same. Is there an alternative to StreamReader, just to test with? It does seem like GC cost, since one request every ten seconds does not trigger it, but the exact same request once per second will cause this overhead to happen. Also, I am not able to reproduce it locally; it only happens in the virtual environment.
This issue is not with the above code at all. The issue is from the caller: the calling service is using a library that is cutting the connections too early, and that overhead is the connection being re-established again.
I am working with this code in order to communicate over TCP with another program, which acts as a server. My app is also a Windows Store app, and I just added the three methods to my code. The server is not made by me; I can't modify it in any way. Connecting and sending messages works fine. After I give a specific command to the server, it sends back a continuous stream composed of strings that end in "\r\n" (so you can see where a message ends), something like this: "string1\r\nstring2\r\n" and so on, for as long as there is a connection with it. Note that sending the command works, because I get a visual response from the server.
I cannot find a way to display the individual strings in my app's UI; I think my problem lies in the read() method, because the stream never "consumes":
public async Task<String> read()
{
DataReader reader;
StringBuilder strBuilder;
using (reader = new DataReader(socket.InputStream))
{
strBuilder = new StringBuilder();
// Set the DataReader to only wait for available data (so that we don't have to know the data size)
reader.InputStreamOptions = Windows.Storage.Streams.InputStreamOptions.Partial;
// The encoding and byte order need to match the settings of the writer we previously used.
reader.UnicodeEncoding = Windows.Storage.Streams.UnicodeEncoding.Utf8;
reader.ByteOrder = Windows.Storage.Streams.ByteOrder.LittleEndian;
// Send the contents of the writer to the backing stream.
// Get the size of the buffer that has not been read.
await reader.LoadAsync(256);
// Keep reading until we consume the complete stream.
while (reader.UnconsumedBufferLength > 0)
{
strBuilder.Append(reader.ReadString(reader.UnconsumedBufferLength));
await reader.LoadAsync(256);
}
reader.DetachStream();
return strBuilder.ToString();
}
}
I have an event on a button that calls send() with the string command I wish to send as a parameter. At first, I simply tried textBox.Text = await read(); after calling the send() method, but nothing appeared in the textBox. Next, I tried making the read() method not return anything and putting textBox.Text = strBuilder.ToString(); in different places inside read(). Finally, I discovered that if I put it inside while (reader.UnconsumedBufferLength > 0), right after strBuilder.Append(reader.ReadString(reader.UnconsumedBufferLength)), the textBox gets updated (although I'm not sure the strings really appear correctly), but my UI becomes unresponsive, probably because it gets stuck in the while loop. I searched the internet for multiple examples, including how to do it on a separate thread, but unfortunately my experience is entry-level and this is the best I could do. I hope I have been explicit enough. Also, I don't mind if you show me a different, better way of updating the UI.
I have a program that continuously writes its log to a text file.
I don't have the source code of it, so I can not modify it in any way and it is also protected with Themida.
I need to read the log file and execute some scripts depending on the content of the file.
I can not delete the file because the program that is continuously writing to it has locked the file.
So what will be the better way to read the file and only read the new lines of the file?
Saving the last line position? Or is there something that will be useful for solving it in C#?
Perhaps use the FileSystemWatcher along with opening the file with FileShare (as it is being used by another process). Hans Passant has provided a nice answer for this part here:
var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
using (var sr = new StreamReader(fs)) {
// etc...
}
Have a look at this question and the accepted answer which may also help.
using (var fs = new FileStream("test.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite | FileShare.Delete))
using (var reader = new StreamReader(fs))
{
while (true)
{
var line = reader.ReadLine();
if (!String.IsNullOrWhiteSpace(line))
    Console.WriteLine("Line read: " + line);
else
    Thread.Sleep(100); // back off briefly so the loop doesn't spin at 100% CPU
}
}
I tested the above code and it works if you are trying to read one line at a time. The only issue is that if the line is flushed to the file before it is finished being written then you will read the line in multiple parts. As long as the logging system is writing each line all at once it should be okay.
If not then you may want to read into a buffer instead of using ReadLine, so you can parse the buffer yourself by detecting each Environment.NewLine substring.
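A buffer-based reader along those lines might look like the following sketch (the `LineBuffer` helper and its names are mine, not from the original answer). It accumulates raw chunks and emits only complete lines, carrying a trailing partial line over until a later chunk completes it:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class LineBuffer
{
    // Appends a raw chunk to the carry-over buffer and returns only the
    // complete lines found so far; a trailing partial line stays in
    // 'carry' until a later chunk finishes it.
    public static List<string> AppendChunk(StringBuilder carry, string chunk)
    {
        carry.Append(chunk);
        string pending = carry.ToString();
        var lines = new List<string>();
        int nl;
        while ((nl = pending.IndexOf(Environment.NewLine, StringComparison.Ordinal)) >= 0)
        {
            lines.Add(pending.Substring(0, nl));
            pending = pending.Substring(nl + Environment.NewLine.Length);
        }
        carry.Clear();
        carry.Append(pending); // keep the incomplete remainder
        return lines;
    }
}
```

Each time the FileStream hands you new bytes, decode them and pass the string to AppendChunk; whatever comes back is safe to log, and half-written lines are held back automatically.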
You can just keep calling ReadToEnd() in a tight loop. Even after it reaches the end of the file it'll just return an empty string "". If some more data is written to the file it will pick it up on a subsequent call.
while (true)
{
string moreData = streamReader.ReadToEnd();
if (moreData.Length > 0)
    Console.Write(moreData); // handle whatever new data has arrived
Thread.Sleep(100);
}
Bear in mind you might read partial lines this way. Also if you are dealing with very large files you will probably need another approach.
Use the FileSystemWatcher to detect changes, then get the new lines by seeking to the last read position in the file.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek.aspx
The log file is being "continuously" updated so you really shouldn't use FileSystemWatcher to raise an event each time the file changes. This would be triggering continuously, and you already know it will be very frequently changing.
I'd suggest using a timer event to periodically process the file. Read this SO answer for a good pattern using System.Threading.Timer¹. Keep a file stream open for reading, or reopen each time and Seek to the end position of your last successful read. By "last successful read" I mean that you should encapsulate the reading and validating of a complete log line. Once you've successfully read and validated a log line, then you have a new position for the next Seek.
¹ Note that System.Threading.Timer will execute on a system-supplied thread that is kept in business by the ThreadPool. For short tasks this is more desirable than a dedicated thread.
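Sketched out, that pattern could look like this (class and method names are mine; it reads raw bytes rather than going through StreamReader, so the saved position isn't thrown off by StreamReader's read-ahead buffering, and it advances only past the last complete line):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

class LogPoller
{
    private readonly string path;
    private long lastPosition; // byte offset just past the last complete line read

    public LogPoller(string path) { this.path = path; }

    // Returns the complete lines appended since the last call; a partial
    // trailing line is left in place and re-read on the next poll.
    public List<string> PollNewLines()
    {
        var lines = new List<string>();
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            fs.Seek(lastPosition, SeekOrigin.Begin);
            var buf = new byte[fs.Length - lastPosition];
            int read = fs.Read(buf, 0, buf.Length);
            string text = Encoding.UTF8.GetString(buf, 0, read);
            int lastNewline = text.LastIndexOf('\n');
            if (lastNewline < 0) return lines; // no complete new line yet
            foreach (var line in text.Substring(0, lastNewline).Split('\n'))
                lines.Add(line.TrimEnd('\r'));
            // Advance only past the last complete line ("last successful read")
            lastPosition += Encoding.UTF8.GetByteCount(text.Substring(0, lastNewline + 1));
        }
        return lines;
    }
}
```

You'd then wire PollNewLines up to the timer, e.g. `new System.Threading.Timer(_ => { foreach (var line in poller.PollNewLines()) Append(line); }, null, 0, 1000);`, where Append is whatever processes a log line.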
Use this answer on another post c# continuously read file.
This one is quite efficient, and it checks once per second if the file size has changed. So the file is usually not read-locked as a result.
The other answers are quite valid and simple. A couple of them will read-lock the file continuously, but that's probably not a problem for most.
As I've stated in a few other questions, I've been using a new SSH .NET library to connect to a Unix server and run various scripts and commands. Well, I've finally attempted to use it to run a Unix tail -f on a live log file and display the tail in a WinForms RichTextBox.
Since the library is not fully fleshed out, the only kinda-sorta solution I've come up with seems lacking... like the feeling you get when you know there has to be a better way. I have the connection/tailing code in a separate thread to avoid UI-thread lock-ups. This thread supports cancellation requests (which allow the connection to exit gracefully, the only way to ensure the Unix-side process is killed). Here's my code thus far (which, for the record, seems to work; I would just like some thoughts on whether this is the right way to go about it):
PasswordConnectionInfo connectionInfo = new PasswordConnectionInfo(lineIP, userName, password);
string command = "cd /logs; tail -f " + BuildFileName() + " \r\n";
using (var ssh = new SshClient(connectionInfo))
{
ssh.Connect();
var output = new MemoryStream();
var shell = ssh.CreateShell(Encoding.ASCII, command, output, output);
shell.Start();
long positionLastWrite = 0;
while (!TestBackgroundWorker.CancellationPending) //checks for cancel request
{
output.Position = positionLastWrite;
var result = new StreamReader(output, Encoding.ASCII).ReadToEnd();
positionLastWrite = output.Position;
UpdateTextBox(result);
Thread.Sleep(1000);
}
shell.Stop();
e.Cancel = true;
}
The UpdateTextBox() function is a thread-safe way of updating the RichTextBox used to display the tail from a different thread. The positionLastWrite stuff is an attempt to make sure I don't lose any data in between the Thread.Sleep(1000) calls.
Now I'm not sure about two things: first, I have the feeling I might be missing out on some data each time with the whole changing-the-MemoryStream-position thing (due to my lack of experience with MemoryStreams), and second, the whole sleep-for-a-second-then-update-again approach seems pretty archaic and inefficient... any thoughts?
Hm, I just realized that you are not the creator of the SSH library (although it's on CodePlex, so you could submit patches). Anyway: you might want to wrap your loop in a try {} finally {} and call shell.Stop() in the finally block to make sure it is always cleaned up.
Depending on the available interfaces, polling might be the only way to go, and it is not inherently bad. Whether or not you lose data depends on what the shell object does for buffering: does it buffer all output in memory, or does it throw away some output after a certain time?
My original points still stand:
One thing which comes to mind is that it looks like the shell object is buffering the whole output in memory all the time, which poses a potential resource problem (out of memory). One option for changing the interface is to use something like a blocking queue in the shell object: the shell enqueues the output from the remote host there, and in your client you can just sit and dequeue, which will block if there is nothing to read.
Also: I would consider making the shell object (whatever type CreateShell returns) IDisposable. From your description, it sounds like shell.Stop() is required for cleanup, and that won't happen if an exception is thrown in the while loop.
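The blocking-queue idea can be sketched with the framework's BlockingCollection (this is an illustration of the pattern, not anything the SSH library actually exposes; the class and method names are made up):

```csharp
using System;
using System.Collections.Concurrent;

class ShellOutputQueue
{
    private readonly BlockingCollection<string> lines = new BlockingCollection<string>();

    // Producer side: the shell's reader thread enqueues each line as it arrives.
    public void OnLineReceived(string line) { lines.Add(line); }

    // Called when the connection closes, so the consumer loop can finish.
    public void Complete() { lines.CompleteAdding(); }

    // Consumer side: blocks while the queue is empty instead of sleeping
    // and polling, and wakes as soon as a line is added.
    public void Consume(Action<string> handle)
    {
        foreach (var line in lines.GetConsumingEnumerable())
            handle(line);
    }
}
```

The UI side would run Consume on a background thread and marshal each line to the RichTextBox with Invoke, replacing the Thread.Sleep(1000) polling loop entirely.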
I have a binary log file with streaming data from a sensor (Int16).
Every 6 seconds, 6000 samples of type Int16 are added, until the sensor is disconnected.
I need to poll this file on regular intervals, continuing from last position read.
Is it better to
a) keep a filestream and binary reader open and instantiated between readings
b) instantiate filestream and binary reader each time I need to read (and keep an external variable to track the last position read)
c) something better?
EDIT: Some great suggestions so far, need to add that the "server" app is supplied by an outside source vendor and cannot be modified.
If it's always adding the same amount of data, it may make sense to reopen it. You might want to find out the length before you open it, and then round down to the whole number of "sample sets" available, just in case you catch it while it's still writing the data. That may mean you read less than you could read (if the write finishes between you checking the length and starting the read) but you'll catch up next time.
You'll need to make sure you use appropriate sharing options so that the writer can still write while you're reading though. (The writer will probably have to have been written with this in mind too.)
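With the 6000-sample Int16 records from the question, the round-down step might look like this sketch (the helper name is mine; BinaryReader reads only the bytes requested, so the stream position stays accurate):

```csharp
using System;
using System.IO;

static class SampleSetReader
{
    public const int SamplesPerSet = 6000;
    public const int SetSizeBytes = SamplesPerSet * sizeof(short); // 12000 bytes

    // Reads only whole sample sets written since lastPosition; a set the
    // sensor is still writing is left for the next poll to pick up.
    public static short[] ReadNewSets(string path, ref long lastPosition)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new BinaryReader(fs))
        {
            long wholeSets = (fs.Length - lastPosition) / SetSizeBytes; // round down
            var data = new short[(int)(wholeSets * SamplesPerSet)];
            fs.Seek(lastPosition, SeekOrigin.Begin);
            for (int i = 0; i < data.Length; i++)
                data[i] = reader.ReadInt16();
            lastPosition = fs.Position;
            return data;
        }
    }
}
```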
Can you use MemoryMappedFiles?
If you can, mapping the file in memory and sharing it between processes you will be able to read the data by simply incrementing the offset for your pointer each time.
If you combine it with an event you can signal your reader when he can go in an read the information. There will be no need to block anything as the reader will always read "old" data which has already been written.
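A rough sketch of the reader side, combining the mapping with a signalling event as described (the "SensorData" and "SensorDataReady" names are made up; both processes would have to agree on them, and the writer would need to be changed to signal the event, which the question's edit says isn't possible with the vendor-supplied app):

```csharp
using System;
using System.IO.MemoryMappedFiles;
using System.Threading;

class MappedReader
{
    static void Main()
    {
        // Open a named mapping and event created by the writer process.
        using (var mmf = MemoryMappedFile.OpenExisting("SensorData"))
        using (var accessor = mmf.CreateViewAccessor())
        using (var dataReady = EventWaitHandle.OpenExisting("SensorDataReady"))
        {
            long offset = 0;
            while (true)
            {
                dataReady.WaitOne(); // block until the writer signals a new batch
                var batch = new short[6000];
                accessor.ReadArray(offset, batch, 0, batch.Length);
                offset += batch.Length * sizeof(short);
                // ... process the batch; no locking needed, since the reader
                // only ever touches data that has already been written ...
            }
        }
    }
}
```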
I would recommend using pipes, they act just like files, except stream data directly between applications, even if the apps run on different PCs (though this is really only an option if you are able to change both applications). Check it out under the "System.IO.Pipes" namespace.
P.S. You would use a "named" pipe for this (pipes are supported in 'c' as well, so basically any half decent programming language should be able to implement them)
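A minimal named-pipe pair might look like this sketch (again, only an option if both applications can be changed; "sensorlog" is an arbitrary pipe name, and each Run method would be the Main of its own application):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Pipes;

class PipeServer
{
    // Writer application: create the named pipe and stream lines into it.
    public static void Run()
    {
        using (var server = new NamedPipeServerStream("sensorlog", PipeDirection.Out))
        {
            server.WaitForConnection(); // blocks until a reader attaches
            using (var writer = new StreamWriter(server) { AutoFlush = true })
            {
                writer.WriteLine("sample 1");
                writer.WriteLine("sample 2");
            }
        }
    }
}

class PipeClient
{
    // Reader application: connect and block on ReadLine until data arrives.
    public static List<string> Run()
    {
        var received = new List<string>();
        using (var client = new NamedPipeClientStream(".", "sensorlog", PipeDirection.In))
        {
            client.Connect();
            using (var reader = new StreamReader(client))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                    received.Add(line);
            }
        }
        return received;
    }
}
```

Unlike the shared-file approach, the reader blocks inside ReadLine until the writer sends something, so there is no polling and no lock contention.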
I think that (a) is the best because:
The current position advances as you read, so you don't need to worry about storing it somewhere;
You don't need to reopen the file and seek to the required position each time you poll (reopening shouldn't be much slower, but keeping the file open gives the OS some hints for optimization, I believe);
The other solutions I can think of require P/Invoke calls to system interprocess synchronization primitives, and they won't be faster than the file operations already in the framework.
You just need to set proper FileShare flags:
Just for example:
Server:
using (var writer = new BinaryWriter(new FileStream(@"D:\testlog.log", FileMode.Append, FileAccess.Write, FileShare.Read)))
{
int n;
while(Int32.TryParse(Console.ReadLine(), out n))
{
writer.Write(n);
writer.Flush(); // write cached bytes to file
}
}
Client:
using (var reader = new BinaryReader(new FileStream(@"D:\testlog.log", FileMode.Open, FileAccess.Read, FileShare.ReadWrite)))
{
string s;
while (Console.ReadLine() != "exit")
{
// allocate buffer for new ints
Int32[] buffer = new Int32[(reader.BaseStream.Length - reader.BaseStream.Position) / sizeof(Int32)];
Console.WriteLine("Stream length: {0}", reader.BaseStream.Length);
Console.Write("Ints read: ");
for (int i = 0; i < buffer.Length; i++)
{
buffer[i] = reader.ReadInt32();
Console.Write((i == 0 ? "" : ", ") + buffer[i].ToString());
}
Console.WriteLine();
}
}
You could also stream the data into a database rather than a file, as another alternative; then you wouldn't have to worry about file locking.
But if you're stuck with the file method, you may want to close the file each time you read data from it. It depends a lot on how complicated the process writing to the file is going to be, and whether it can detect a file-locking operation and respond appropriately without crashing horribly.