ToArray() function limitation - c#

I am using the .ToArray() method to convert my string to a char array, which I declared as char[] buffer = new char[1000000];. But when I use the following code:
using (StreamReader streamReader = new StreamReader(path1))
{
buffer = streamReader.ReadToEnd().ToCharArray();
}
// buffer = result.ToArray();
threadfunc(data_path1);
The size of the buffer ends up fixed at 8190, and the whole file is not read, whether I use .ToCharArray() or .ToArray().
What is the reason for this? Do .ToCharArray() or .ToArray() have size limitations? If I do not call these functions I can read the whole file as a string, but when I convert it to a char array with them I hit a size limit.

My guess is the problem is that read to end should finish before you call the ToCharArray(). This might help you. You don't need to define buffer since ToCharArray() creates a new instance of char[] itself.
string content;
using (StreamReader streamReader = new StreamReader(path1))
{
content = streamReader.ReadToEnd();
}
var buffer = content.ToCharArray();

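ToCharArray() returns a new array instance, so your buffer variable will refer to that new instance, whose size is the length of the data returned by ReadToEnd.
If you want to keep the buffer at the same size, copy the new array into the existing one (this assumes the file holds no more characters than the buffer; otherwise CopyTo throws):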
char[] buffer = new char[1000000];
using (StreamReader streamReader = new StreamReader(path1))
{
var tempArray = streamReader.ReadToEnd().ToCharArray();
tempArray.CopyTo(buffer, 0);
}
If you just want to use the resulting array, you don't need to predict its size; use the array that is returned:
public char[] GetArrayFromFile(string pathToFile)
{
    string data; // declared outside the using block so it is still in scope at the return
    using (StreamReader streamReader = new StreamReader(pathToFile))
    {
        data = streamReader.ReadToEnd();
    }
    return data.ToCharArray();
}
var arrayFromFile = GetArrayFromFile(@"..\path.file");
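A quick way to check that ToCharArray() itself imposes no size cap is to convert a string much longer than 8190 characters. This is only a minimal sketch; the class name, the helper MakeChars and the length 100000 are arbitrary choices for illustration:

```csharp
using System;

static class ToCharArrayDemo
{
    // Builds a string of the given length and converts it to a char array.
    public static char[] MakeChars(int length)
    {
        string text = new string('x', length);
        return text.ToCharArray();
    }

    static void Main()
    {
        char[] buffer = new char[1000000]; // pre-allocated array
        buffer = MakeChars(100000);        // the variable now refers to a new array
        Console.WriteLine(buffer.Length);  // prints 100000
    }
}
```

If the array still comes out short, the limit is in what was read from the file, not in the conversion.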

You are probably using incorrect encoding. By default StreamReader(String) uses UTF8 encoding:
The complete file path is specified by the path parameter. This
constructor initializes the encoding to UTF8Encoding and the buffer
size to 1024 bytes.
Don't pre-allocate the buffer size, unless you have a specific need.
If your file is in ASCII format, you need to update your StreamReader constructor:
char[] buffer = null;
using (StreamReader streamReader = new StreamReader(path1, Encoding.ASCII))
{
buffer = streamReader.ReadToEnd().ToCharArray();
}
// buffer = result.ToArray();
threadfunc(data_path1);

Does your file contain binary data? If it contains EOF character and the stream is opened in text mode (which StreamReader does), that character will signal end of file, even if it is not actually the end of the file.
I can reproduce this by reading random .exe files in text mode.

Related

Read n first Characters of a big Text File - C#

I have a very big text file, for example about 1 GB. I need to just read 100 first characters and nothing more.
I searched StackOverflow and other forums but all of them have some solutions which first read the whole file and then will return some n characters of the file.
I do not want to read and load the whole file into memory etc. just need the first characters.
You can use StreamReader.ReadBlock() to read a specified number of characters from a file:
public static char[] ReadChars(string filename, int count)
{
using (var stream = File.OpenRead(filename))
using (var reader = new StreamReader(stream, Encoding.UTF8))
{
char[] buffer = new char[count];
int n = reader.ReadBlock(buffer, 0, count);
char[] result = new char[n];
Array.Copy(buffer, result, n);
return result;
}
}
Note that this assumes that your file has UTF8 encoding. If it doesn't, you'll need to specify the correct encoding (in which case you could add an encoding parameter to ReadChars() rather than hard-coding it).
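The encoding parameter mentioned above could look roughly like this. It is a sketch rather than a definitive version; the class name FileHead and the temporary test file are assumptions made for the example:

```csharp
using System;
using System.IO;
using System.Text;

static class FileHead
{
    // Reads at most 'count' characters from the start of the file,
    // decoding the bytes with the supplied encoding.
    public static char[] ReadChars(string filename, int count, Encoding encoding)
    {
        using (var stream = File.OpenRead(filename))
        using (var reader = new StreamReader(stream, encoding))
        {
            char[] buffer = new char[count];
            int n = reader.ReadBlock(buffer, 0, count);
            if (n < count)
                Array.Resize(ref buffer, n); // file was shorter than requested
            return buffer;
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "hello world", Encoding.ASCII);
            Console.WriteLine(new string(ReadChars(path, 5, Encoding.ASCII))); // prints hello
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```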
The advantage of using ReadBlock() rather than Read() is that it blocks until either all the characters have been read, or the end of the file has been reached. However, for a FileStream this is of no consequence; just be aware that Read() can return fewer characters than asked for in the general case, even if the end of the stream has not been reached.
If you want an async version you can just call ReadBlockAsync() like so:
public static async Task<char[]> ReadCharsAsync(string filename, int count)
{
using (var stream = File.OpenRead(filename))
using (var reader = new StreamReader(stream, Encoding.UTF8))
{
char[] buffer = new char[count];
int n = await reader.ReadBlockAsync(buffer, 0, count);
char[] result = new char[n];
Array.Copy(buffer, result, n);
return result;
}
}
Which you might call like so:
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;
namespace Demo
{
static class Program
{
static async Task Main()
{
string filename = "Your filename here";
Console.WriteLine(await ReadCharsAsync(filename, 100));
}
}
}
Let's read with StreamReader:
char[] buffer = new char[100];
using (StreamReader reader = new StreamReader(@"c:\MyFile.txt")) {
// Technically, StreamReader can read less than buffer.Length characters
// if the file is too short;
// in this case reader.Read returns the number of actually read chars
reader.Read(buffer, 0, buffer.Length);
}
fs.Read() does not necessarily read all the bytes at once; it reads some number of bytes and returns the number of bytes read. MSDN has a good example of how to use it.
http://msdn.microsoft.com/en-us/library/system.io.filestream.read.aspx
Reading the entire 1 GB of data into memory is really going to put a drain on your client's system -- the preferred option would be to optimize it so that you don't need the whole file all at once.
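Because Read() may return fewer bytes than requested, the usual pattern is to call it in a loop until the wanted count has arrived or the stream ends. A minimal sketch follows; the helper name ReadFully is made up for this example, and a MemoryStream stands in for the file:

```csharp
using System;
using System.IO;

static class StreamHelpers
{
    // Calls Read repeatedly until 'count' bytes have arrived or the stream ends.
    // Returns the number of bytes actually placed in 'buffer'.
    public static int ReadFully(Stream stream, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = stream.Read(buffer, total, count - total);
            if (n == 0)
                break; // end of stream reached
            total += n;
        }
        return total;
    }

    static void Main()
    {
        using (var ms = new MemoryStream(new byte[] { 1, 2, 3 }))
        {
            var buffer = new byte[10];
            Console.WriteLine(ReadFully(ms, buffer, buffer.Length)); // prints 3
        }
    }
}
```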

Byte array read from a file and byte array converted from string read from same file differs

If I read a byte array from a file and write it back using the code below
byte[] bytes = File.ReadAllBytes(filePath);
File.WriteAllBytes(filePath, bytes);
it works perfectly fine. I can open and view the written file properly.
But if I read the file contents into a string and then convert it to a byte array using the code below
string s = File.ReadAllText(filePath);
var byteArr = System.Text.Encoding.UTF8.GetBytes(s);
the byte array is larger than the array read directly from the file and the values also differ, so if I write the file from this array, it cannot be read when opened.
Note: the file is UTF-8 encoded.
I found that out using the code below:
using (StreamReader reader = new StreamReader(filePath, Encoding.UTF8, true))
{
reader.Peek(); // you need this!
var encoding = reader.CurrentEncoding;
}
I am unable to understand why the two arrays differ.
I was using the attached image for converting and then writing.
With
using (StreamReader reader = new StreamReader(filePath, Encoding.UTF8, true))
{
reader.Peek(); // you need this!
var encoding = reader.CurrentEncoding;
}
your var encoding will just echo the Encoding.UTF8 parameter. You are deceiving yourself there.
A binary file just has no text encoding.
Need to save a file; it may be anything, an image or text.
Then just use ReadAllBytes/WriteAllBytes. A text file is always also a byte[], but not all file types are text. You would need Base64 encoding first and that just adds to the size.
The safest way to convert byte arrays to strings is indeed encoding it in something like base64.
Like:
string s = Convert.ToBase64String(bytes);
byte[] decoded = Convert.FromBase64String(s);
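To see why the two arrays differ, it may help to round-trip a few non-text bytes both ways. The sketch below assumes the file starts with bytes that are not valid UTF-8 (the four bytes used are just sample values, and the class and method names are invented for the example). Decoding replaces invalid sequences with the replacement character U+FFFD, so the original values are lost, whereas Base64 preserves them:

```csharp
using System;
using System.Linq;
using System.Text;

static class RoundTripDemo
{
    // bytes -> UTF-8 string -> bytes (lossy for data that is not valid UTF-8)
    public static byte[] ViaUtf8(byte[] data)
    {
        return Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(data));
    }

    // bytes -> Base64 string -> bytes (always lossless)
    public static byte[] ViaBase64(byte[] data)
    {
        return Convert.FromBase64String(Convert.ToBase64String(data));
    }

    static void Main()
    {
        byte[] data = { 0xFF, 0xD8, 0xFF, 0xE0 }; // sample bytes, not valid UTF-8
        Console.WriteLine(data.SequenceEqual(ViaUtf8(data)));   // prints False
        Console.WriteLine(data.SequenceEqual(ViaBase64(data))); // prints True
    }
}
```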

Convert List of bytes to memorystream without using ToArray()

How can I save my List<byte> to a MemoryStream without using ToArray() or creating a new array?
This is my current method:
public Packet(List<byte> data)
{
// Create new stream from data buffer
using (Stream stream = new MemoryStream(data.ToArray()))
{
using (BinaryReader reader = new BinaryReader(stream))
{
Length = reader.ReadInt16();
pID = reader.ReadByte();
Result = reader.ReadByte();
Message = reader.ReadString();
ID = reader.ReadInt32();
}
}
}
The ToArray solution is the most efficient solution possible using documented APIs. MemoryStream will not copy the array. It will just store it. So the only copy is in List<T>.ToArray().
If you want to avoid that copy you need to pry List<T> open using reflection and access the backing array. I advise against that.
Instead, use a collection that allows you to obtain the backing array using legal means. Write your own, or use a MemoryStream in the first place.
A List<T> is not the most efficient way to move around bytes anyway. Storing them is fine, moving them usually has more overhead. For example, adding items bytewise will be far slower than a memcpy.
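Using a MemoryStream in the first place could look something like the sketch below. It assumes the incoming data arrives as byte[] chunks (how the list is actually populated is not shown in the question), and the class and method names are invented for the example. The chunks are written straight into the stream, which is then rewound and read, so no List<byte> and no ToArray() copy are involved:

```csharp
using System;
using System.IO;

static class PacketBufferDemo
{
    // Collects the chunks in a MemoryStream, rewinds it, and reads the
    // leading 16-bit length field (BinaryReader reads little-endian).
    public static short ReadLength(params byte[][] chunks)
    {
        using (var stream = new MemoryStream())
        {
            foreach (byte[] chunk in chunks)
                stream.Write(chunk, 0, chunk.Length); // block copy, not byte by byte

            stream.Position = 0; // rewind before reading
            using (var reader = new BinaryReader(stream))
            {
                return reader.ReadInt16();
            }
        }
    }

    static void Main()
    {
        // Two chunks whose first two bytes form the value 7
        Console.WriteLine(ReadLength(new byte[] { 0x07 }, new byte[] { 0x00, 0x01 })); // prints 7
    }
}
```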
What about something like:
public Packet(List<byte> data)
{
using (Stream stream = new MemoryStream())
{
// Loop list and write out bytes
foreach(byte b in data)
stream.WriteByte(b);
// Reset stream position ready for read
stream.Seek(0, SeekOrigin.Begin);
using (BinaryReader reader = new BinaryReader(stream))
{
Length = reader.ReadInt16();
pID = reader.ReadByte();
Result = reader.ReadByte();
Message = reader.ReadString();
ID = reader.ReadInt32();
}
}
}
But why do you have a list in the first place? Can't you pass it into the method as a byte[] to start with? It'd be interesting to see how you populate that list.

How to insert List<> values into SQL Binary field using C#

I'm not entirely new to programming but I still see myself as a novice. I'm currently creating an Invoicing system with a max of 5 line items, this being said, I'm creating a String<> item, serializing it to store and then de-serializing it to display.
So far I've managed the serializing, and de-serializing, and from the de-serialized value I've managed to display the relevant information in the correct fields.
My question comes to: HOW do I add the list of items in the String<> object to either a Binary or XML field in my SQL table?
I know it should be similar to adding an Image object to binary but there's a catch there. usually:
byte[] convertToByte(string sourcePath)
{
//get the byte file size of image
FileInfo fInfo = new FileInfo(sourcePath);
long byteSize = fInfo.Length;
//read the file using file stream
FileStream fStream = new FileStream(sourcePath, FileMode.Open, FileAccess.Read);
//read again as byte using binary reader
BinaryReader binRead = new BinaryReader(fStream);
//convert image to byte (already)
byte[] data = binRead.ReadBytes((int)byteSize);
return data;
}
this kind of thing is done for an image however the whole "long" thing does not apply to the List<> object.
Any assistance would be helpful
If you simply want to store your data as "readable" text, you can use the varchar(MAX) or nvarchar(MAX) (depending on whether you need extended character support). That translates directly into a string in ADO.NET or EntityFramework.
If all you need are bytes from a string, the Encoding class will do that:
System.Text.Encoding.Default.GetBytes(yourstring);
See: http://msdn.microsoft.com/en-us/library/ds4kkd55%28v=vs.110%29.aspx
A way of saving a binary file in a string is to convert the image to a Base64 string. This can be done with the Convert.ToBase64String (Byte[]) method:
Convert.ToBase64String msdn
string convertImageToBase64(string sourcePath)
{
//get the byte file size of image
FileInfo fInfo = new FileInfo(sourcePath);
long byteSize = fInfo.Length;
//read the file using file stream
FileStream fStream = new FileStream(sourcePath, FileMode.Open, FileAccess.Read);
//read again as byte using binary reader
BinaryReader binRead = new BinaryReader(fStream);
//convert image to byte (already)
byte[] data = binRead.ReadBytes((int)byteSize);
return Convert.ToBase64String (data);
}
Now you will be able to save the Base64 string in a string field in your database.

How do you remove and add bytes from a byte array in C#

I have a configuration file (.cfg) that I am using to create a command line application to add users to a SFTP server application.
The cfg file needs to have a certain number of reserved bytes for each entry in the cfg file. I am currently just appending a new user to the end of the file by creating a byte array and converting it to a string, then copying it to the file, but i've hit a snag. The config file requires 4 bytes at the end of the file.
The process I need to accomplish is to remove these trailing bytes from the file, append the new user, then append the bytes to the end.
So, now that you have some context behind my problem.
Here is the question:
How do you remove and add bytes from a byte array?
Here is the code I've got so far, it reads the user from one file and appends it to another.
static void Main(string[] args)
{
System.Text.ASCIIEncoding code = new System.Text.ASCIIEncoding(); //Encoding in ascii to pick up mad characters
StreamReader reader = new StreamReader("one_user.cfg", code, false, 1072);
string input = "";
input = reader.ReadToEnd();
//convert input string to bytes
byte[] byteArray = Encoding.ASCII.GetBytes(input);
MemoryStream stream = new MemoryStream(byteArray);
//Convert Stream to string
StreamReader byteReader = new StreamReader(stream);
String output = byteReader.ReadToEnd();
int len = System.Text.Encoding.ASCII.GetByteCount(output);
using (StreamWriter writer = new StreamWriter("freeFTPdservice.cfg", true, Encoding.ASCII, 5504))
{
writer.Write(output, true);
writer.Close();
}
Console.WriteLine("Appended: " + len);
Console.ReadLine();
reader.Close();
byteReader.Close();
}
To try and illustrate this point, here is a "diagram".
1) Add first user
File(appended text)Bytes at end (zeros)
2) Add second user
File(appended text)(appended text)bytes at end (zeros)
and so on.
To explicitly answer your question: How do you remove and add bytes from a byte array?
You can only do this by creating a new array and copying the bytes into it.
Fortunately, this is simplified by using Array.Resize():
byte[] array = new byte[10];
Console.WriteLine(array.Length); // Prints 10
Array.Resize(ref array, 20); // Copies contents of old array to new.
Console.WriteLine(array.Length); // Prints 20
If you need to remove bytes from the beginning, Array.Copy the bytes first and then resize (or copy to a new array if you don't like ref):
// remove 42 bytes from beginning of the array, add size checks as needed
Array.Copy(array, 42, array, 0, array.Length-42);
Array.Resize(ref array, array.Length-42);
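Applied to the question's case (strip the trailing bytes, append the new user, put the trailing bytes back), the copy-into-a-new-array approach might be sketched as follows. The helper name and the sample values are hypothetical:

```csharp
using System;

static class ByteArrayEdit
{
    // Returns a new array: 'source' without its last 'trailerLength' bytes,
    // followed by 'insert', followed by the original trailer.
    public static byte[] InsertBeforeTrailer(byte[] source, byte[] insert, int trailerLength)
    {
        if (trailerLength < 0 || trailerLength > source.Length)
            throw new ArgumentOutOfRangeException(nameof(trailerLength));

        int bodyLength = source.Length - trailerLength;
        byte[] result = new byte[source.Length + insert.Length];
        Array.Copy(source, 0, result, 0, bodyLength);
        Array.Copy(insert, 0, result, bodyLength, insert.Length);
        Array.Copy(source, bodyLength, result, bodyLength + insert.Length, trailerLength);
        return result;
    }

    static void Main()
    {
        byte[] file = { 1, 2, 0, 0, 0, 0 }; // two data bytes plus four reserved bytes
        byte[] user = { 9, 9 };
        Console.WriteLine(string.Join(",", InsertBeforeTrailer(file, user, 4)));
        // prints 1,2,9,9,0,0,0,0
    }
}
```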
You don't. You can copy to a new array of the desired size. Or you can work with a List<byte> and then create an array from that.
But in your case, I would suggest looking into the file streams themselves. They let you read and write individual bytes or byte arrays, and they also offer Seek, which lets you move around to arbitrary locations in the file. So, for the use case you described, you would:
open the file (for read/write access)
move to the end of the file
move back four bytes (do you know which ones they are? if not, this would be a good time to stash them)
write the new user
write the four bytes
close the file
Something like this:
using (var fs = new FileStream(PATH, FileMode.Open, FileAccess.ReadWrite))
{
fs.Seek(-4, SeekOrigin.End);
fs.Write(userBytes, 0, userBytes.Length);
fs.Write(fourBytesAtEnd, 0, fourBytesAtEnd.Length);
}
This also has the advantage of not having to slurp in the whole file and write it back out.
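A fuller sketch of the same idea, which also stashes the trailing bytes before overwriting them, is given below. It assumes the trailer length is known in advance (4 in the question); the class name, method name and temporary test file are made up for the example:

```csharp
using System;
using System.IO;

static class TrailerAppend
{
    // Inserts 'userBytes' just before the last 'trailerLength' bytes of the file.
    public static void InsertBeforeTrailer(string path, byte[] userBytes, int trailerLength)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            if (fs.Length < trailerLength)
                throw new InvalidDataException("File is shorter than the trailer.");

            // Stash the trailer so it can be written back afterwards
            fs.Seek(-trailerLength, SeekOrigin.End);
            byte[] trailer = new byte[trailerLength];
            int read = 0;
            while (read < trailerLength)
            {
                int n = fs.Read(trailer, read, trailerLength - read);
                if (n == 0)
                    throw new EndOfStreamException();
                read += n;
            }

            // Overwrite the trailer with the new entry, then re-append the trailer
            fs.Seek(-trailerLength, SeekOrigin.End);
            fs.Write(userBytes, 0, userBytes.Length);
            fs.Write(trailer, 0, trailer.Length);
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[] { 1, 2, 0, 0, 0, 0 });
            InsertBeforeTrailer(path, new byte[] { 9, 9 }, 4);
            Console.WriteLine(string.Join(",", File.ReadAllBytes(path)));
            // prints 1,2,9,9,0,0,0,0
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```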
