How to read very long input from console in C#?

I need to read a veeeery long line from the console in C#, up to 65000 chars. Console.ReadLine itself has a limit of 254 chars (+2 for escape sequences), but I can use this:
static string ReadLine()
{
Stream inputStream = Console.OpenStandardInput(READLINE_BUFFER_SIZE);
byte[] bytes = new byte[READLINE_BUFFER_SIZE];
int outputLength = inputStream.Read(bytes, 0, READLINE_BUFFER_SIZE);
Console.WriteLine(outputLength);
char[] chars = Encoding.UTF7.GetChars(bytes, 0, outputLength);
return new string(chars);
}
...to overcome that limit, for up to 8190 chars (+2 for escape sequences). Unfortunately I need to enter a WAY bigger line, and when READLINE_BUFFER_SIZE is set to anything bigger than 8192, the error "Not enough storage is available to process this command" shows up in VS. The buffer should be set to 65536. I've tried a couple of solutions, yet I'm still learning and none got past either 1022 or 8190 chars. How can I increase that limit to 65536? Thanks in advance.

You have to add the following lines of code to your Main() method:
byte[] inputBuffer = new byte[4096];
Stream inputStream = Console.OpenStandardInput(inputBuffer.Length);
Console.SetIn(new StreamReader(inputStream, Console.InputEncoding, false, inputBuffer.Length));
Then you can use Console.ReadLine(); to read long user input.

Try Console.Read with a StringBuilder:
StringBuilder sb = new StringBuilder();
while (true) {
int next = Console.Read();
if (next == -1) {
break; // end of input: Convert.ToChar(-1) would throw
}
char ch = (char)next;
sb.Append(ch);
if (ch=='\n') {
break;
}
}

I agree with Manmay, that seems to work for me, and I also attempt to keep the default stdin so I can restore it afterwards:
if (dbModelStrPathname == @"con" ||
dbModelStrPathname == @"con:")
{
var stdin = Console.In;
var inputBuffer = new byte[262144];
var inputStream = Console.OpenStandardInput(inputBuffer.Length);
Console.SetIn(new StreamReader(inputStream, Console.InputEncoding, false, inputBuffer.Length));
dbModelStr = Console.In.ReadLine();
Console.SetIn(stdin);
}
else
{
dbModelStr = File.ReadAllText(dbModelStrPathname);
}

Related

C# Read from SslStream continuously (long connection, last for up to days) and Efficiently without infinite loop

I am completely new to C# and need to encrypt the data sent and received between client and server. After googling for two days, I learnt that the best way is to use SslStream. Some answers I found give good examples, but they all assume we just need to read one message and then close the connection, which is not my case: I have to read whenever a user triggers his device to send a message through the persistent connection.
one example from Microsoft documentation:
static string ReadMessage(SslStream sslStream)
{
// Read the message sent by the client.
// The client signals the end of the message using the
// "<EOF>" marker.
byte [] buffer = new byte[2048];
StringBuilder messageData = new StringBuilder();
int bytes = -1;
do
{
// Read the client's test message.
bytes = sslStream.Read(buffer, 0, buffer.Length);
// Use Decoder class to convert from bytes to UTF8
// in case a character spans two buffers.
Decoder decoder = Encoding.UTF8.GetDecoder();
char[] chars = new char[decoder.GetCharCount(buffer,0,bytes)];
decoder.GetChars(buffer, 0, bytes, chars,0);
messageData.Append (chars);
// Check for EOF or an empty message. <------ In my case,I don't have EOF
if (messageData.ToString().IndexOf("<EOF>") != -1)
{
break;
}
} while (bytes !=0);
return messageData.ToString();
}
Other answers do show how to read continuously from an SslStream, but they use an infinite loop to do it. On the server side there could be thousands of clients connected, so the possible poor performance concerns me, like this one:
Read SslStream continuously in C# Web MVC 5 project
So I want to know if there is a better way to continuously read from a persistent SslStream connection.
I know with bare socket I can use SocketAsyncEventArgs to know when there is new data ready, I hope I could do this with SslStream, probably I misunderstand something, any ideas would be appreciated, thanks in advance.
Here's my shot at it. Instead of looping forever, I chose recursion. This method will return immediately but will fire an event when EOF is hit and continue to keep reading:
public static void ReadFromSSLStreamAsync(
SslStream sslStream,
Action<string> result,
Action<Exception> error,
StringBuilder stringBuilder = null)
{
const string EOFToken = "<EOF>";
stringBuilder = stringBuilder ?? new StringBuilder();
var buffer = new byte[4096];
try
{
sslStream.BeginRead(buffer, 0, buffer.Length, asyncResult =>
{
// Read all bytes available from stream and then
// add them to string builder
{
int bytesRead;
try
{
bytesRead = sslStream.EndRead(asyncResult);
}
catch (Exception ex)
{
error?.Invoke(ex);
return;
}
// Use Decoder class to convert from bytes to
// UTF8 in case a character spans two buffers.
var decoder = Encoding.UTF8.GetDecoder();
var buf = new char[decoder.GetCharCount(buffer, 0, bytesRead)];
decoder.GetChars(buffer, 0, bytesRead, buf, 0);
stringBuilder.Append(buf);
}
// Find the EOFToken, if found copy all data before the token
// and send it to event, then remove it from string builder
{
int tokenIndex;
while((tokenIndex = stringBuilder.ToString().IndexOf(EOFToken)) != -1)
{
var buf = new char[tokenIndex];
stringBuilder.CopyTo(0, buf, 0, tokenIndex);
result?.Invoke(new string(buf));
stringBuilder.Remove(0, tokenIndex + EOFToken.Length);
}
}
// Continue reading...
ReadFromSSLStreamAsync(sslStream, result, error, stringBuilder);
}, null);
}
catch(Exception ex)
{
error?.Invoke(ex);
}
}
You could call it as so:
ReadFromSSLStreamAsync(sslStream, sslData =>
{
Console.WriteLine($"Finished: {sslData}");
}, error =>
{
Console.WriteLine($"Errored: {error}");
});
It's not Task-based, so there is nothing to await on. But it is asynchronous, so your thread can go on to do other things.
Consider checking out the following answer. SslStream derives from the Stream class, so the ReadAsync method can be used. The code below reads until the <EOF> delimiter and then returns the received message as a string.
internal static readonly byte[] EOF = Encoding.UTF8.GetBytes("<EOF>");
internal static async Task<string> ReadToEOFAsync(Stream stream)
{
byte[] buffer = new byte[8192];
using (MemoryStream memoryStream = new MemoryStream())
{
long eofLength = EOF.LongLength;
byte[] messageTail = new byte[eofLength];
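while (!messageTail.SequenceEqual(EOF))
{
int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);
if (bytesRead == 0)
{
// The peer closed the connection; without this check the loop never ends.
throw new EndOfStreamException("Stream closed before <EOF> was received.");
}
await memoryStream.WriteAsync(buffer, 0, bytesRead);
// Only inspect the tail once enough bytes have arrived to hold the delimiter.
if (memoryStream.Length >= eofLength)
{
Array.Copy(memoryStream.GetBuffer(), memoryStream.Length - eofLength, messageTail, 0, eofLength);
}
}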
// Truncate the EOF tail from the data stream
byte[] result = new byte[memoryStream.Length - eofLength];
Array.Copy(memoryStream.GetBuffer(), 0, result, 0, result.LongLength);
return Encoding.UTF8.GetString(result);
}
}
The received data is appended to the memoryStream. The first Array.Copy copies the message tail from the memoryStream's buffer. If the message tail equals <EOF>, reading stops. The second copy strips the delimiter from the message.
Note: There is a more sophisticated way of slicing using Span introduced in .NET Core 2.1.
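For a connection that stays open and carries many messages, the same ReadAsync idea can be put in a loop that hands each message to a callback. The following is only a sketch under assumptions: the class name DelimitedReader, the method ReadMessagesAsync and the callback parameter are made up for illustration, and it assumes the peer still terminates each message with <EOF>. It keeps one Decoder for the whole connection so that a character split across two reads is reassembled. Because SslStream derives from Stream, it can be passed in directly.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class DelimitedReader
{
    private const string EofToken = "<EOF>";

    // Hypothetical helper: reads until the stream ends and invokes onMessage
    // once for every "<EOF>"-terminated message that has fully arrived.
    public static async Task ReadMessagesAsync(Stream stream, Action<string> onMessage)
    {
        var buffer = new byte[4096];
        var chars = new char[Encoding.UTF8.GetMaxCharCount(buffer.Length)];
        // One decoder per connection: it remembers incomplete byte sequences
        // between calls, so multi-byte characters may span two reads.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var pending = new StringBuilder();

        int bytesRead;
        // While no data is available the await does not block a thread.
        while ((bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            int charCount = decoder.GetChars(buffer, 0, bytesRead, chars, 0);
            pending.Append(chars, 0, charCount);

            // Emit every complete message currently buffered.
            string text = pending.ToString();
            int start = 0;
            int index;
            while ((index = text.IndexOf(EofToken, start, StringComparison.Ordinal)) != -1)
            {
                onMessage(text.Substring(start, index - start));
                start = index + EofToken.Length;
            }
            // Keep only the unfinished remainder for the next read.
            pending.Remove(0, start);
        }
    }
}
```

A caller would typically start one such task per client, for example ReadMessagesAsync(sslStream, msg => Console.WriteLine(msg)), and observe the returned Task for exceptions.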

How do I read exactly one char from a Stream?

I have a Stream with some text data (can be ASCII, UTF-8, Unicode; encoding is known). I need to read exactly one char from the stream, without advancing stream position any longer. StreamReader is inappropriate, as it aggressively prefetches data from the stream.
Ideas?
If you want to read and decode the text one byte at a time, the most convenient approach I know of is to use the System.Text.Decoder class.
Here's a simple example:
class Program
{
static void Main(string[] args)
{
Console.OutputEncoding = Encoding.Unicode;
string originalText = "Hello world! ブ䥺ぎょズィ穃 槞こ廤樊稧 ひゃご禺 壪";
byte[] rgb = Encoding.UTF8.GetBytes(originalText);
MemoryStream dataStream = new MemoryStream(rgb);
string result = DecodeOneByteAtATimeFromStream(dataStream);
Console.WriteLine("Result string: \"" + result + "\"");
if (originalText == result)
{
Console.WriteLine("Original and result strings are equal");
}
}
static string DecodeOneByteAtATimeFromStream(MemoryStream dataStream)
{
Decoder decoder = Encoding.UTF8.GetDecoder();
StringBuilder sb = new StringBuilder();
int inputByteCount;
byte[] inputBuffer = new byte[1];
while ((inputByteCount = dataStream.Read(inputBuffer, 0, 1)) > 0)
{
int charCount = decoder.GetCharCount(inputBuffer, 0, 1);
char[] rgch = new char[charCount];
decoder.GetChars(inputBuffer, 0, 1, rgch, 0);
sb.Append(rgch);
}
return sb.ToString();
}
}
Presumably you are already aware of the drawbacks of processing data of any sort just one byte at a time. :) Suffice to say, this is not a very efficient way to do things.
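To answer the original question more directly, the same Decoder technique can be wrapped so that it stops as soon as one character has been produced. This is a sketch, not a definitive implementation: SingleCharReader and ReadNext are hypothetical names, and the caller is assumed to keep the same Decoder instance for the lifetime of the stream. The result is returned as a string because a character outside the Basic Multilingual Plane decodes to a surrogate pair.

```csharp
using System;
using System.IO;
using System.Text;

public static class SingleCharReader
{
    // Hypothetical helper: returns the next decoded character as a string
    // (two UTF-16 units for a surrogate pair), or null at end of stream.
    // Only the bytes that make up that character are consumed.
    public static string ReadNext(Stream stream, Decoder decoder)
    {
        var oneByte = new byte[1];
        var chars = new char[4]; // spare room for a surrogate pair or replacement characters
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            oneByte[0] = (byte)b;
            // The decoder buffers incomplete sequences and returns 0 until
            // the final byte of the character has been supplied.
            int produced = decoder.GetChars(oneByte, 0, 1, chars, 0);
            if (produced > 0)
            {
                return new string(chars, 0, produced);
            }
        }
        return null;
    }
}
```

After each call the stream position sits immediately after the bytes of the character that was returned, which is what StreamReader's prefetching prevents.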

Sending and receiving compressed data over a TCP socket

Need help with sending and receiving compressed data over TCP socket.
The code works perfectly fine if I don't use compression, but something very strange happens when I do use compression. Basically, the problem is that the stream.Read() operation gets skipped and I don't know why.
My code:
using (var client = new TcpClient())
{
client.Connect("xxx.xxx.xx.xx", 6100);
using (var stream = client.GetStream())
{
// SEND REQUEST
byte[] bytesSent = Encoding.UTF8.GetBytes(xml);
// send compressed bytes (if this is used, then stream.Read() below doesn't work.
//var compressedBytes = bytesSent.ToStream().GZipCompress();
//stream.Write(compressedBytes, 0, compressedBytes.Length);
// send normal bytes (uncompressed)
stream.Write(bytesSent, 0, bytesSent.Length);
// GET RESPONSE
byte[] bytesReceived = new byte[client.ReceiveBufferSize];
// PROBLEM HERE: when using compression, this line just gets skipped over very quickly
stream.Read(bytesReceived, 0, client.ReceiveBufferSize);
//var decompressedBytes = bytesReceived.ToStream().GZipDecompress();
//string response = Encoding.UTF8.GetString(decompressedBytes);
string response = Encoding.UTF8.GetString(bytesReceived);
Console.WriteLine(response);
}
}
You will notice some extension methods above. Here is the code in case you are wondering if something is wrong there.
public static MemoryStream ToStream(this byte[] bytes)
{
return new MemoryStream(bytes);
}
public static byte[] GZipCompress(this Stream stream)
{
using (var memoryStream = new MemoryStream())
{
using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress))
{
stream.CopyTo(gZipStream);
}
return memoryStream.ToArray();
}
}
public static byte[] GZipDecompress(this Stream stream)
{
using (var memoryStream = new MemoryStream())
{
using (var gZipStream = new GZipStream(stream, CompressionMode.Decompress))
{
gZipStream.CopyTo(memoryStream);
}
return memoryStream.ToArray();
}
}
The extensions work quite well in the following, so I'm sure they're not the problem:
string original = "the quick brown fox jumped over the lazy dog";
byte[] compressedBytes = Encoding.UTF8.GetBytes(original).ToStream().GZipCompress();
byte[] decompressedBytes = compressedBytes.ToStream().GZipDecompress();
string result = Encoding.UTF8.GetString(decompressedBytes);
Console.WriteLine(result);
Does anyone have any idea why the Read() operation is being skipped when the bytes being sent are compressed?
EDIT
I received a message from the API provider after showing them the above sample code. They had this to say:
at a first glance I guess the header is missing. The input must start
with a 'c' followed by the length of the input
(sprintf(cLength,"c%09d",hres) in our example). We need this because
we can't read until we find a binary 0 to recognize the end.
They previously provided some sample code in C, which I don't fully understand 100%, as follows:
example in C:
#include <zlib.h>
uLongf hres;
char cLength[COMPRESS_HEADER_LEN + 1] = {'\0'};
n = read(socket,buffer,10);
// check if input is compressed
if(msg[0]=='c') {
compressed = 1;
}
n = atoi(msg+1);
read.....
hres = 64000;
res = uncompress((Bytef *)msg, &hres, (const Bytef*)
buffer/*compressed*/, n);
if(res == Z_OK && hres > 0 ){
msg[hres]=0; //original
}
else // errorhandling
hres = 64000;
if (compressed){
res = compress((Bytef *)buffer, &hres, (const Bytef *)msg, strlen(msg));
if(res == Z_OK && hres > 0 ) {
sprintf(cLength,"c%09d",hres);
write(socket,cLength,10);
write(socket, buffer, hres);
}
else // errorhandling
makefile: add "-lz" to the libs
They're using zlib. I don't suspect that to make any difference, but I did try using zlib.net and I still get no response anyway.
Can someone give me an example of how exactly I'm supposed to send this input length in C#?
EDIT 2
In response to @quantdev, here is what I am trying now for the length prefix:
using (var client = new TcpClient())
{
client.Connect("xxx.xxx.xx.xx", 6100);
using (var stream = client.GetStream())
{
// SEND REQUEST
byte[] bytes = Encoding.UTF8.GetBytes(xml);
byte[] compressedBytes = ZLibCompressor.Compress(bytes);
byte[] prefix = Encoding.UTF8.GetBytes("c" + compressedBytes.Length);
byte[] bytesToSend = new byte[prefix.Length + compressedBytes.Length];
Array.Copy(prefix, bytesToSend, prefix.Length);
Array.Copy(compressedBytes, 0, bytesToSend, prefix.Length, compressedBytes.Length);
stream.Write(bytesToSend, 0, bytesToSend.Length);
// WAIT
while (client.Available == 0)
{
Thread.Sleep(1000);
}
// GET RESPONSE
byte[] bytesReceived = new byte[client.ReceiveBufferSize];
stream.Read(bytesReceived, 0, client.ReceiveBufferSize);
byte[] decompressedBytes = ZLibCompressor.DeCompress(bytesReceived);
string response = Encoding.UTF8.GetString(decompressedBytes);
Console.WriteLine(response);
}
}
You need to check the return value of the Read() calls you are making on the TCP stream: it is the number of bytes actually read.
MSDN says :
Return Value
The total number of bytes read into the buffer. This can be less than the number of bytes requested if that many bytes are not
currently available, or zero (0) if the end of the stream has been
reached.
If the socket is closed, the call returns 0 immediately (which is what might be happening here).
If it is not 0, you must check how many bytes you actually received; if it is fewer than client.ReceiveBufferSize, you will need additional calls to Read to retrieve the remaining bytes.
Prior to your call to Read, check that some data is actually available on the socket:
while(client.Available == 0)
// wait ...
http://msdn.microsoft.com/en-us/library/system.net.sockets.tcpclient.available%28v=vs.110%29.aspx
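The "additional calls to Read" can be sketched as a small loop. This is a minimal illustration, assuming you already know how many bytes to expect; StreamHelpers and ReadFully are hypothetical names, not part of the framework (recent .NET versions also ship a built-in Stream.ReadExactly).

```csharp
using System;
using System.IO;

public static class StreamHelpers
{
    // Hypothetical helper: reads exactly count bytes, looping because a single
    // Read call may return fewer bytes than requested.
    public static byte[] ReadFully(Stream stream, int count)
    {
        var result = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(result, total, count - total);
            if (read == 0)
            {
                // Zero means the other side closed the connection.
                throw new EndOfStreamException(
                    $"Stream ended after {total} of {count} bytes.");
            }
            total += read;
        }
        return result;
    }
}
```

Blocking in Read this way also makes the client.Available polling loop unnecessary, since Read itself waits until at least one byte arrives.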
I think you may be at the end of the stream. Can you try setting the stream position before reading the stream? (Note that this only works on seekable streams; a NetworkStream does not support seeking.)
stream.Position = 0;
http://msdn.microsoft.com/en-us/library/vstudio/system.io.stream.read
Encoding.UTF8.GetString shouldn't be used on an arbitrary byte array.
For example, compressed bytes are arbitrary binary data and will usually contain sequences that are not valid UTF-8, so decoding them as text corrupts them.
If you want to print the received bytes for debugging, maybe you should just print them as integers.
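As for how to send the input length: the provider's sprintf(cLength,"c%09d",hres) produces the letter c followed by the compressed length as nine zero-padded decimal digits, ten bytes in total. Note that "c" + compressedBytes.Length in EDIT 2 is not zero-padded, so the header has a variable size. Below is a sketch of the padded header in C#; CompressedFrame, Build and ParseLength are hypothetical names, and the assumption that responses carry the same ten-byte header is inferred only from the C sample.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class CompressedFrame
{
    private const int HeaderLength = 10; // 'c' + 9 digits, as in "c%09d"

    // Hypothetical helper: prepends "c" + zero-padded length to the payload.
    public static byte[] Build(byte[] compressedPayload)
    {
        string headerText = "c" + compressedPayload.Length.ToString("D9", CultureInfo.InvariantCulture);
        byte[] header = Encoding.ASCII.GetBytes(headerText);
        var frame = new byte[header.Length + compressedPayload.Length];
        Buffer.BlockCopy(header, 0, frame, 0, header.Length);
        Buffer.BlockCopy(compressedPayload, 0, frame, header.Length, compressedPayload.Length);
        return frame;
    }

    // Hypothetical helper: reads the payload length out of a 10-byte header.
    // Assumes the server answers with the same header layout.
    public static int ParseLength(byte[] header)
    {
        if (header.Length < HeaderLength || header[0] != (byte)'c')
        {
            throw new FormatException("Header does not start with 'c'.");
        }
        string digits = Encoding.ASCII.GetString(header, 1, HeaderLength - 1);
        return int.Parse(digits, NumberStyles.None, CultureInfo.InvariantCulture);
    }
}
```

With this, the request would be written as stream.Write(frame, 0, frame.Length), and a response would be read in two steps: first the ten header bytes, then exactly as many payload bytes as the header announces.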

Unexpected output when reading and writing to a text file

I am a bit new to files in C# and am having a problem. When reading from a file and copying to another, the last chunk of text is not being written. Below is my code:
StringBuilder sb = new StringBuilder(8192);
string fileName = "C:...rest of path...inputFile.txt";
string outputFile = "C:...rest of path...outputFile.txt";
using (StreamReader reader = File.OpenText(fileName))
{
char[] buffer = new char[8192];
while ((reader.ReadBlock(buffer, 0, buffer.Length)) != 0)
{
foreach (char c in buffer)
{
//do some function on char c...
sb.Append(c);
}
using (StreamWriter writer = File.CreateText(outputFile))
{
writer.Write(sb.ToString());
}
}
}
My aim was to read and write to a textfile in a buffered manner. Something that in Java I would achieve in the following manner:
public void encrypt(File inputFile, File outputFile) throws IOException
{
BufferedReader infromfile = null;
BufferedWriter outtofile = null;
try
{
String key = getKeyfromFile(keyFile);
if (key != null)
{
infromfile = new BufferedReader(new FileReader(inputFile));
outtofile = new BufferedWriter(new FileWriter(outputFile));
char[] buffer = new char[8192];
while ((infromfile.read(buffer, 0, buffer.length)) != -1)
{
String temptext = String.valueOf(buffer);
//some changes to temptext are done
outtofile.write(temptext);
}
}
}
catch (FileNotFoundException exc)
{
} // and all other possible exceptions
}
Could you help me identify the source of my problem?
If you think that there is possibly a better approach to achieve buffered i/o with text files, I would truly appreciate your suggestion.
There are a couple of "gotchas":
c can't be changed (it's the foreach iteration variable), you'll need to copy it in order to process before writing
you have to keep track of how many characters were actually read: ReadBlock returns that count, and the rest of the buffer holds leftover characters that would make your output dirty
Changing your code like this looks like it works:
//extracted from your code
foreach (char c in buffer)
{
if (c == (char)0) break; //GOTCHA #2: maybe you don't want NULL (ascii 0) characters in your output
char d = c; //GOTCHA #1: you can't change 'c'
// d = SomeProcessingHere();
sb.Append(d);
}
Try this:
string fileName = @"";
string outputfile = @"";
StreamReader reader = File.OpenText(fileName);
string texto = reader.ReadToEnd();
StreamWriter writer = new StreamWriter(outputfile);
writer.Write(texto);
writer.Flush();
writer.Close();
Does this work for you?
using (StreamReader reader = File.OpenText(fileName))
using (StreamWriter writer = File.CreateText(outputFile)) // create the output once, outside the loop
{
char[] buffer = new char[8192];
int numChars;
// ReadBlock returns the number of characters actually read; 0 means end of file
while ((numChars = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
{
writer.Write(buffer, 0, numChars);
}
}
You still have to take care of character encoding though!
If you don't care about carriage returns, you could use File.ReadAllText
This method opens a file, reads each line of the file, and then adds each line as an element of a string. It then closes the file. A line is defined as a sequence of characters followed by a carriage return ('\r'), a line feed ('\n'), or a carriage return immediately followed by a line feed. The resulting string does not contain the terminating carriage return and/or line feed.
StringBuilder sb = new StringBuilder(8192);
string fileName = "C:...rest of path...inputFile.txt";
string outputFile = "C:...rest of path...outputFile.txt";
// Open the file to read from.
string readText = File.ReadAllText(fileName );
foreach (char c in readText)
{
char new_c = c; // do something to c here
sb.Append(new_c);
}
// This text is added only once to the file, overwrite it if it exists
File.WriteAllText(outputFile, sb.ToString());
Unless I'm missing something, it appears that your issue is that you're overwriting the existing contents of your output file on each blockread iteration.
You call:
using (StreamWriter writer = File.CreateText(outputFile))
{
writer.Write(sb.ToString());
}
for every ReadBlock iteration. The output of the file would only be the last chunk of data that was read.
From MSDN documentation on File.CreateText:
If the file specified by path does not exist, it is created. If the
file does exist, its contents are overwritten.
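Putting the points from the answers together (open the writer once, and only process the characters that ReadBlock actually returned), a buffered copy with per-character processing could look like the sketch below. BufferedTextCopy, Copy and the transform parameter are names chosen for illustration; the transform stands in for whatever "do some function on char c" is in the question.

```csharp
using System;
using System.IO;

public static class BufferedTextCopy
{
    // Hypothetical helper: copies a text file block by block, applying
    // transform to every character on the way.
    public static void Copy(string inputPath, string outputPath, Func<char, char> transform)
    {
        using (StreamReader reader = File.OpenText(inputPath))
        using (StreamWriter writer = File.CreateText(outputPath)) // opened once, before the loop
        {
            var buffer = new char[8192];
            int charsRead;
            while ((charsRead = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
            {
                // Only the first charsRead entries belong to this block;
                // anything after them is left over from an earlier read.
                for (int i = 0; i < charsRead; i++)
                {
                    buffer[i] = transform(buffer[i]);
                }
                writer.Write(buffer, 0, charsRead);
            }
        }
    }
}
```

Writing each block straight to the StreamWriter also removes the need for the StringBuilder, so memory use stays constant regardless of file size.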

Read textfile from specific position till specific length

Due to me receiving a very bad datafile, I have to come up with code to read from a non-delimited textfile from a specific starting position and a specific length to build up a workable dataset. The textfile is not delimited in any way, but I do have the starting and ending position of each string that I need to read. I've come up with this code, but I'm getting an error and can't figure out why, because if I replace the 395 with a 0 it works.
e.g. Invoice number starting position = 395, ending position = 414, length = 20
using (StreamReader sr = new StreamReader(@"\\t.txt"))
{
char[] c = null;
while (sr.Peek() >= 0)
{
c = new char[20];//Invoice number string
sr.Read(c, 395, c.Length); //THIS IS GIVING ME AN ERROR
Debug.WriteLine(""+c[0] + c[1] + c[2] + c[3] + c[4]..c[20]);
}
}
Here is the error that I get:
System.ArgumentException: Offset and length were out of bounds for the array
or count is greater than the number of elements from
index to the end of the source collection. at
System.IO.StreamReader.Read(Char[] b
Please Note
Seek() is too low level for what the OP wants. See this answer instead for line-by-line parsing.
Also, as Jordan mentioned, Seek() has the issue of character encodings and varying character sizes (e.g. for non-ASCII and non-ANSI files, like UTF, which is probably not applicable to this question). Thanks for pointing that out.
Original Answer
Seek() is only available on a stream, so try using sr.BaseStream.Seek(..), or use a different stream like such:
using (Stream s = new FileStream(path, FileMode.Open))
{
s.Seek(offset, SeekOrigin.Begin);
s.Read(buffer, 0, length);
}
Here is my suggestion for you:
using (StreamReader sr = new StreamReader(@"\\t.txt"))
{
char[] c = new char[20]; // Invoice number string
sr.BaseStream.Position = 395;
sr.Read(c, 0, c.Length);
}
(new answer based on comments)
You are parsing invoice data, with each entry on a new line, and the required data is at a fixed offset for every line. Stream.Seek() is too low level for what you want to do, because you will need several seeks, one for every line. Rather use the following:
int offset = 395;
int length = 20;
using (StreamReader sr = new StreamReader(@"\\t.txt"))
{
while (!sr.EndOfStream)
{
string line = sr.ReadLine();
string myData = line.Substring(offset, length);
}
}
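One thing to watch with Substring is that it throws ArgumentOutOfRangeException when a line is shorter than offset + length, which is easy to hit with a bad data file. A small guard could look like the sketch below; FixedWidth and Field are hypothetical names, and trimming the padding is an assumption about how the fixed-width fields are filled.

```csharp
using System;

public static class FixedWidth
{
    // Hypothetical helper: returns the field at [offset, offset + length),
    // or null when the line is too short to contain it.
    public static string Field(string line, int offset, int length)
    {
        if (line == null || offset < 0 || length < 0 || line.Length < offset + length)
        {
            return null;
        }
        // Assumption: fields are space-padded, so the padding is trimmed.
        return line.Substring(offset, length).Trim();
    }
}
```

Inside the ReadLine loop this would be used as string myData = FixedWidth.Field(line, offset, length);, with a null result signalling a malformed line that can be logged or skipped.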
Solved this ages ago, just wanted to post the solution that was suggested
using (StreamReader sr = new StreamReader(path2))
{
string line;
while ((line = sr.ReadLine()) != null)
{
dsnonhb.Tables[0].Columns.Add("InvoiceNum" );
dsnonhb.Tables[0].Columns.Add("Odo" );
dsnonhb.Tables[0].Columns.Add("PumpVal" );
dsnonhb.Tables[0].Columns.Add("Quantity" );
DataRow myrow;
myrow = dsnonhb.Tables[0].NewRow();
myrow["No"] = rowcounter.ToString();
myrow["InvoiceNum"] = line.Substring(741, 6);
myrow["Odo"] = line.Substring(499, 6);
myrow["PumpVal"] = line.Substring(609, 7);
myrow["Quantity"] = line.Substring(660, 6);
I've created a class called AdvancedStreamReader in my Helpers project on GitHub:
https://github.com/jsmunroe/Helpers/blob/master/Helpers/IO/AdvancedStreamReader.cs
It is fairly robust. It is a subclass of StreamReader and keeps all of that functionality intact. There are a few caveats: a) it resets the position of the stream when it is constructed; b) you should not seek the BaseStream while you are using the reader; c) you need to specify the newline character type if it differs from the environment and the file can only use one type. Here are some unit tests to demonstrate how it is used.
[TestMethod]
public void ReadLineWithNewLineOnly()
{
// Setup
var text = $"ƒun ‼Æ¢ with åò☺ encoding!\nƒun ‼Æ¢ with åò☺ encoding!\nƒun ‼Æ¢ with åò☺ encoding!\nHa!";
var bytes = Encoding.UTF8.GetBytes(text);
var stream = new MemoryStream(bytes);
var reader = new AdvancedStreamReader(stream, NewLineType.Nl);
reader.ReadLine();
// Execute
var result = reader.ReadLine();
// Assert
Assert.AreEqual("ƒun ‼Æ¢ with åò☺ encoding!", result);
Assert.AreEqual(54, reader.CharacterPosition);
}
[TestMethod]
public void SeekCharacterWithUtf8()
{
// Setup
var text = $"ƒun ‼Æ¢ with åò☺ encoding!{NL}ƒun ‼Æ¢ with åò☺ encoding!{NL}ƒun ‼Æ¢ with åò☺ encoding!{NL}Ha!";
var bytes = Encoding.UTF8.GetBytes(text);
var stream = new MemoryStream(bytes);
var reader = new AdvancedStreamReader(stream);
// Pre-condition assert
Assert.IsTrue(bytes.Length > text.Length); // More bytes than characters in sample text.
// Execute
reader.SeekCharacter(84);
// Assert
Assert.AreEqual(84, reader.CharacterPosition);
Assert.AreEqual($"Ha!", reader.ReadToEnd());
}
I wrote this for my own use, but I hope it will help other people.
395 is the index in the c array at which Read starts writing, not a position in the file. The array has only 20 elements, so the largest valid index is 19.
I would suggest something like this.
StreamReader r;
...
string allFile = r.ReadToEnd();
int offset = 395;
int length = 20;
And then use
allFile.Substring(offset, length)
