Unexpected output when reading and writing to a text file - c#

I am a bit new to files in C# and am having a problem. When reading from a file and copying to another, the last chunk of text is not being written. Below is my code:
StringBuilder sb = new StringBuilder(8192);
string fileName = "C:...rest of path...inputFile.txt";
string outputFile = "C:...rest of path...outputFile.txt";
using (StreamReader reader = File.OpenText(fileName))
{
char[] buffer = new char[8192];
while ((reader.ReadBlock(buffer, 0, buffer.Length)) != 0)
{
foreach (char c in buffer)
{
//do some function on char c...
sb.Append(c);
}
using (StreamWriter writer = File.CreateText(outputFile))
{
writer.Write(sb.ToString());
}
}
}
My aim was to read and write to a textfile in a buffered manner. Something that in Java I would achieve in the following manner:
public void encrypt(File inputFile, File outputFile) throws IOException
{
BufferedReader infromfile = null;
BufferedWriter outtofile = null;
try
{
String key = getKeyfromFile(keyFile);
if (key != null)
{
infromfile = new BufferedReader(new FileReader(inputFile));
outtofile = new BufferedWriter(new FileWriter(outputFile));
char[] buffer = new char[8192];
while ((infromfile.read(buffer, 0, buffer.length)) != -1)
{
String temptext = String.valueOf(buffer);
//some changes to temptext are done
outtofile.write(temptext);
}
}
}
catch (FileNotFoundException exc)
{
} // and all other possible exceptions
}
Could you help me identify the source of my problem?
If you think that there is possibly a better approach to achieve buffered i/o with text files, I would truly appreciate your suggestion.

There are a couple of "gotchas":
c can't be changed (it's the foreach iteration variable), so you'll need to copy it into another variable in order to process it before writing.
You have to keep track of how many characters were actually read. ReadBlock returns that count, and anything in the buffer beyond it is left over from an earlier read (or is still '\0'), which would make your output dirty.
Changing your code like this should work:
//adapted from your code
int charsRead;
while ((charsRead = reader.ReadBlock(buffer, 0, buffer.Length)) != 0)
{
    for (int i = 0; i < charsRead; i++) //GOTCHA #2: only the first charsRead characters are valid
    {
        char d = buffer[i]; //GOTCHA #1: work on a copy, a foreach variable can't be changed
        // d = SomeProcessingHere(d);
        sb.Append(d);
    }
}

Try this:
string fileName = @"";
string outputfile = @"";
string texto;
using (StreamReader reader = File.OpenText(fileName))
{
    texto = reader.ReadToEnd();
}
using (StreamWriter writer = new StreamWriter(outputfile))
{
    writer.Write(texto); // the writer is flushed and closed when the using block ends
}

Does this work for you?
using (StreamReader reader = File.OpenText(fileName))
// Create the writer once, outside the loop, so earlier chunks are not overwritten.
using (StreamWriter writer = File.CreateText(outputFile))
{
    char[] buffer = new char[8192];
    int numChars;
    // ReadBlock returns the number of characters read, 0 at end of file.
    while ((numChars = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
    {
        writer.Write(buffer, 0, numChars);
    }
}
You still have to take care of character encoding though!

If you don't care about carriage returns, you could use File.ReadAllText
This method opens a file, reads each line of the file, and then adds each line as an element of a string. It then closes the file. A line is defined as a sequence of characters followed by a carriage return ('\r'), a line feed ('\n'), or a carriage return immediately followed by a line feed. The resulting string does not contain the terminating carriage return and/or line feed.
Note that, despite this description, the string returned by File.ReadAllText keeps the line terminators; File.ReadAllLines is the method that returns the lines without them.
StringBuilder sb = new StringBuilder(8192);
string fileName = "C:...rest of path...inputFile.txt";
string outputFile = "C:...rest of path...outputFile.txt";
// Read the whole file into memory.
string readText = File.ReadAllText(fileName);
foreach (char c in readText)
{
    char new_c = c; // do something to c and store the result in new_c
    sb.Append(new_c);
}
// Write the text in one call; the file is overwritten if it already exists.
File.WriteAllText(outputFile, sb.ToString());

Unless I'm missing something, it appears that your issue is that you're overwriting the existing contents of your output file on each ReadBlock iteration.
You call:
using (StreamWriter writer = File.CreateText(outputFile))
{
writer.Write(sb.ToString());
}
for every ReadBlock iteration, so the file is recreated each time and only the result of the last write survives.
From MSDN documentation on File.CreateText:
If the file specified by path does not exist, it is created. If the
file does exist, its contents are overwritten.
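
Putting the points from these answers together, a minimal sketch might look like the following. It creates the writer once, before the loop, and only touches the characters that ReadBlock reports as read. The class name ChunkedCopy, the method CopyTransformed and the transform delegate are hypothetical names introduced for illustration; they are not part of the original code.

```csharp
using System;
using System.IO;

public static class ChunkedCopy
{
    // Copies inputPath to outputPath in 8192-char blocks, applying
    // 'transform' (a hypothetical per-character function) to each char.
    public static void CopyTransformed(string inputPath, string outputPath, Func<char, char> transform)
    {
        using (StreamReader reader = File.OpenText(inputPath))
        // The writer is created once, so the output file is truncated only once.
        using (StreamWriter writer = File.CreateText(outputPath))
        {
            char[] buffer = new char[8192];
            int charsRead;
            while ((charsRead = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
            {
                // Only the first charsRead entries are valid; the rest of the
                // buffer may still hold characters from the previous block.
                for (int i = 0; i < charsRead; i++)
                {
                    buffer[i] = transform(buffer[i]);
                }
                writer.Write(buffer, 0, charsRead);
            }
        }
    }
}
```

It could be called as ChunkedCopy.CopyTransformed(fileName, outputFile, char.ToUpperInvariant); where any char-to-char function can stand in for the real processing.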

Related

Splitting of text file not working properly in c#

I have a requirement to write data to a text file.
If the file size exceeds 700 MB, a new file should be created and the remaining data written to it.
I am currently writing “|”-delimited data from a database to a file, then checking the file size and splitting the file into multiple files, but the file gets split in the middle of a line.
Each file should end at the end of a line, with the next line starting in the new file.
I also need to write the column names as the first line of each newly split file.
I am new to C#; could you please suggest a solution with sample code?
Below is the code I use to split the file:
private static void ReadWriteToFile(string fileNames)
{
string sourceFileName = fileNames;
string destFileLocation = Path.GetDirectoryName(fileNames);
int index = 0;
long maxFileSize = 700 * 1024 * 1024;
byte[] buffer = new byte[65536];
using (Stream source = File.OpenRead(sourceFileName))
{
while (source.Position < source.Length)
{
index++;
string newFileName = Path.Combine(destFileLocation, Path.GetFileNameWithoutExtension(sourceFileName));
newFileName += index.ToString() + Path.GetExtension(sourceFileName);
using (Stream destination = File.OpenWrite(newFileName))
{
while (destination.Position < maxFileSize)
{
int bytes = source.Read(buffer, 0, (int)Math.Min(maxFileSize, buffer.Length));
destination.Write(buffer, 0, bytes);
if (bytes < Math.Min(maxFileSize, buffer.Length))
{
break;
}
}
}
}
}
}
Thanks in advance.
Could you please let me know if there is a better alternative way to do this?

Try this; it is a rewrite of a line-based file splitter I wrote back when I was starting out with C#.
(You only have to add the column header as a string at the beginning of each new file.)
private static void SplitAfterMBytes(int splitAfterMBytes, string filename)
{
// Variable for max. file size.
long maxFileSize = splitAfterMBytes * 1048576L; // long arithmetic avoids int overflow for large sizes
int fileCount = 0;
long byteCount = 0;
StreamWriter writer = null;
try
{
var inputFile = new FileInfo(filename);
var index = filename.LastIndexOf('.');
//get only the name of the file.
var fileStart = filename.Substring(0, index);
// get the file extension
var fileExtension = inputFile.Extension;
// generate a new file name.
var outputFile = fileStart + '_' + fileCount++ + fileExtension;
// file format is like: QS_201101_321.txt.
writer = new StreamWriter(outputFile);
using (var reader = new StreamReader(filename))
{
for (string str; (str = reader.ReadLine()) != null;)
{
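// count bytes in the writer's own encoding (UTF-8 by default), including the line break
byteCount += writer.Encoding.GetByteCount(str) + writer.Encoding.GetByteCount(writer.NewLine);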
if (byteCount >= maxFileSize)
{
// max number of bytes reached
// write into the old file, without Newline,
// so that no extra line is written.
writer.Write(str);
// 1. close the actual file.
writer.Close();
// 2. open a new file with number incresed by 1.
outputFile = fileStart + '_' + fileCount++ + fileExtension;
writer = new StreamWriter(outputFile);
byteCount = 0; //reset the counter.
}
else
{
// Write into the old file.
// Use a Linefeed, because Write works without LF.
// like Java ;)
writer.Write(str);
writer.Write(writer.NewLine);
}
}
}
}
catch (Exception ex)
{
// do something useful, like: Console.WriteLine(ex.Message);
}
finally
{
if (writer != null) writer.Dispose(); // writer is still null if opening the first file failed
}
}
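
For the header requirement in the question, a minimal sketch along the same lines might look like this. It assumes the first line of the source file holds the column names and that the files are UTF-8; the class LineSplitter, the method Split and the part naming scheme (name_1.ext, name_2.ext, ...) are hypothetical choices made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineSplitter
{
    // Splits sourcePath into numbered part files of roughly maxBytes each.
    // A part always holds at least one data line, even if that line alone
    // exceeds the limit. The first line of the source is treated as the
    // header and repeated at the top of every part.
    public static List<string> Split(string sourcePath, long maxBytes)
    {
        var parts = new List<string>();
        var encoding = new UTF8Encoding(false); // assumption: UTF-8 without BOM
        string dir = Path.GetDirectoryName(Path.GetFullPath(sourcePath));
        string stem = Path.GetFileNameWithoutExtension(sourcePath);
        string ext = Path.GetExtension(sourcePath);
        int newLineBytes = encoding.GetByteCount(Environment.NewLine);

        using (var reader = new StreamReader(sourcePath, encoding))
        {
            string header = reader.ReadLine();
            if (header == null) return parts; // empty source file

            long headerBytes = encoding.GetByteCount(header) + newLineBytes;
            StreamWriter writer = null;
            long written = 0;
            try
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    long lineBytes = encoding.GetByteCount(line) + newLineBytes;
                    // Start a new part before the line that would exceed the
                    // limit, so a line is never cut in half.
                    if (writer == null || (written + lineBytes > maxBytes && written > headerBytes))
                    {
                        if (writer != null) writer.Dispose();
                        string partPath = Path.Combine(dir, stem + "_" + (parts.Count + 1) + ext);
                        writer = new StreamWriter(partPath, false, encoding);
                        writer.WriteLine(header);
                        written = headerBytes;
                        parts.Add(partPath);
                    }
                    writer.WriteLine(line);
                    written += lineBytes;
                }
            }
            finally
            {
                if (writer != null) writer.Dispose();
            }
        }
        return parts;
    }
}
```

For the 700 MB limit from the question, the call would be LineSplitter.Split(fileNames, 700L * 1024 * 1024). Checking the size before writing a line, rather than after, is what keeps each part under the limit.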

Why does FileStream sometimes ignore invisible characters?

I have two blocks of code that I've tried using for reading data out of a file-stream in C#. My overall goal here is to try and read each line of text into a list of strings, but they are all being read into a single string (when opened with read+write access together)...
I am noticing that the first block of code correctly reads in all of my carriage returns and line-feeds, and the other ignores them. I am not sure what is really going on here. I open up the streams in two different ways, but that shouldn't really matter right? Well, in any case here is the first block of code (that correctly reads-in my white-space characters):
StreamReader sr = null;
StreamWriter sw = null;
FileStream fs = null;
List<string> content = new List<string>();
List<string> actual = new List<string>();
string line = string.Empty;
// first, open up the file for reading
fs = File.OpenRead(path);
sr = new StreamReader(fs);
// read-in the entire file line-by-line
while(!string.IsNullOrEmpty((line = sr.ReadLine())))
{
content.Add(line);
}
sr.Close();
Now, here is the block of code that ignores all of the white-space characters (i.e. line-feed, carriage-return) and reads my entire file in one line.
StreamReader sr = null;
StreamWriter sw = null;
FileStream fs = null;
List<string> content = new List<string>();
List<string> actual = new List<string>();
string line = string.Empty;
// first, open up the file for reading/writing
fs = File.Open(path, FileMode.Open);
sr = new StreamReader(fs);
// read-in the entire file line-by-line
while(!string.IsNullOrEmpty((line = sr.ReadLine())))
{
content.Add(line);
}
sr.Close();
Why does Open cause all data to be read as a single line, and OpenRead works correctly (reads data as multiple lines)?
UPDATE 1
I have been asked to provide the text of the file that reproduces the problem. So here it is below (make sure that CR+LF is at the end of each line!! I am not sure if that will get pasted here!)
;$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
;$$$$$$$$$ $$$$$$$
;$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
;
;
;
UPDATE 2
An exact block of code that reproduces the problem (using the text above for the file). In this case I am actually seeing the problem WITHOUT trying Open and only using OpenRead.
StreamReader sr = null;
StreamWriter sw = null;
FileStream fs = null;
List<string> content = new List<string>();
List<string> actual = new List<string>();
string line = string.Empty;
try
{
// first, open up the file for reading/writing
fs = File.OpenRead(path);
sr = new StreamReader(fs);
// read-in the entire file line-by-line
while(!string.IsNullOrEmpty((line = sr.ReadLine())))
{
content.Add(line);
}
sr.Close();
// now, erase the contents of the file
File.WriteAllText(path, string.Empty);
// make sure that the contents of the file have been erased
fs = File.OpenRead(path);
sr = new StreamReader(fs);
if (!string.IsNullOrEmpty(line = sr.ReadLine()))
{
Trace.WriteLine("Failed: Could not erase the contents of the file.");
Assert.Fail();
}
else
{
Trace.WriteLine("Passed: Successfully erased the contents of the file.");
}
// now, attempt to over-write the contents of the file
fs.Close();
fs = File.OpenWrite(path);
sw = new StreamWriter(fs);
foreach(var l in content)
{
sw.Write(l);
}
// read back the over-written contents of the file
fs.Close();
fs = File.OpenRead(path);
sr = new StreamReader(fs);
while (!string.IsNullOrEmpty((line = sr.ReadLine())))
{
actual.Add(line);
}
// make sure the contents of the file are correct
if(content.SequenceEqual(actual))
{
Trace.WriteLine("Passed: The contents that were over-written are correct!");
}
else
{
Trace.WriteLine("Failed: The contents that were over-written are not correct!");
}
}
finally
{
// close out all the streams
fs.Close();
// finish-up with a message
Trace.WriteLine("Finished running the overwrite-file test.");
}
Your new file generated by
foreach(var l in content)
{
sw.Write(l);
}
does not contain end-of-line characters because end-of-line characters are not included in content.
As @DaveKidder points out in this thread over here, the spec for StreamReader.ReadLine specifically says that the resulting line does not include the end-of-line characters.
When you do
while(!string.IsNullOrEmpty((line = sr.ReadLine())))
{
content.Add(line);
}
sr.Close();
You are losing the end-of-line characters. Writing each line back with sw.WriteLine(l) instead of sw.Write(l) restores them.
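
As a minimal sketch of that round trip (the class LineRoundTrip and its two methods are hypothetical names, not from the question): read with a null check, so that an empty line does not end the loop the way the IsNullOrEmpty test in the question does, and write with WriteLine so each line gets its terminator back.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineRoundTrip
{
    // Reads every line, including empty ones. ReadLine returns null only at
    // end of input, so the loop tests for null rather than IsNullOrEmpty.
    public static List<string> ReadLines(TextReader reader)
    {
        var lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line); // the line terminator is not part of 'line'
        }
        return lines;
    }

    // Writes the lines back, re-adding a terminator after each one.
    public static void WriteLines(TextWriter writer, IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            writer.WriteLine(line);
        }
    }
}
```

Both methods take TextReader/TextWriter, so they work with the StreamReader and StreamWriter from the question as well as with StringReader and StringWriter.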

read file and copy modified data to other file

I have some numbers, one per line, like this:
71004
71006
71008
71026
71028
They are in a text file. I want to read the text file and then modify it, so it becomes:
71004|71006|71008|71026|71028|
I tried it like this:
class Program
{
static void Main(string[] args)
{
string file1 = @"D:\Docs\ImportDataInNAV\ImportVendorNumbers.txt";
using (StreamReader stream = File.OpenText(file1))
{
string s = String.Empty;
while ((s = stream.ReadLine()) != null)
{
foreach (var line in s)
{
Console.WriteLine(line);
}
}
}
}
}
Thank you.
I also tried it like this:
using (FileStream stream = File.OpenRead(@"D:\Docs\ImportDataInNAV\ImportVendorNumbers.txt"))
using (FileStream writeStream = File.OpenWrite("D:\\file2.txt"))
{
var output = string.Join("|", File.ReadLines(writeStream));
BinaryReader reader = new BinaryReader(stream);
BinaryWriter writer = new BinaryWriter(writeStream);
// create a buffer to hold the bytes
byte[] buffer = new Byte[1024];
int bytesRead;
// while the read method returns bytes
// keep writing them to the output stream
while ((bytesRead = stream.Read(buffer, 0, 1024)) > 0)
{
//writeStream.Write(buffer, 0, bytesRead) ;
File.WriteAllText(filepath, output);
}
}
I want to read the textfile and then modify the textfile, so it
becomes:
71004|71006|71008|71026|71028|
Use string.Join to form a |-delimited string.
var output = string.Join("|", File.ReadLines(filepath));
File.WriteAllText(filepath, output);
I'm not sure whether the | at the end is intentional; if it is part of the desired output, append | to output before writing.
How about running your program like this, so that its console output is redirected into a file:
yourprograme.exe > outputfile.txt
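
If the trailing | is wanted, one way to get it is to append the delimiter to every value instead of joining. This is only a sketch; the class PipeJoiner, the method JoinLines and the decision to skip blank lines are assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

public static class PipeJoiner
{
    // Reads inputPath line by line, skips blank lines, and writes every value
    // followed by '|' to outputPath. Returns the text that was written.
    public static string JoinLines(string inputPath, string outputPath)
    {
        var values = File.ReadLines(inputPath)
                         .Select(l => l.Trim())
                         .Where(l => l.Length > 0);
        // Appending '|' to each value yields the trailing delimiter as well.
        string output = string.Concat(values.Select(v => v + "|"));
        File.WriteAllText(outputPath, output);
        return output;
    }
}
```

For the five numbers in the question this produces 71004|71006|71008|71026|71028| in the output file.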

Is there a better way to write read and modify text lines and write them into an output stream?

I'm currently trying to read a file, modify a few placeholders within it, and then write the file into an output stream. As it's the output stream for a page response in ASP.NET, I'm using the OutputStream.Write method there (the file is an attachment in the end).
Originally I had:
using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
while (readBytes < fs.Length)
{
tmpReadBytes = fs.Read(bytes, 0, bytes.Length);
if (tmpReadBytes > 0)
{
readBytes += tmpReadBytes;
page.Response.OutputStream.Write(bytes, 0, tmpReadBytes);
}
}
}
After thinking things over I came up with the following:
foreach(string line in File.ReadLines(filename))
{
string modifiedLine = line.Replace("#PlaceHolder#", "NewValue");
byte[] modifiedByteArray = System.Text.Encoding.UTF8.GetBytes(modifiedLine);
page.Response.OutputStream.Write(modifiedByteArray, 0, modifiedByteArray.Length);
}
But it looks inefficient, especially with the conversions. So my question is: is there a better way of doing this?
As a note, the file itself is not very big; it's a text file of about 3-4 KB.
You don't need to handle the bytes yourself.
If you know the file is and always will be small,
this.Response.Write(File.ReadAllText("path").Replace("old", "new"));
otherwise
using (var stream = new FileStream("path", FileMode.Open))
{
using (var streamReader = new StreamReader(stream))
{
string line;
while ((line = streamReader.ReadLine()) != null)
{
// ReadLine strips the line break, so add it back
this.Response.Write(line.Replace("old", "new") + Environment.NewLine);
}
}
}
To get the lines in a string array:
string[] lines = File.ReadAllLines(file);
To alter the lines, use a loop.
for (int i = 0; i < lines.Length; i++)
{
lines[i] = lines[i].Replace("#PlaceHolder#", "NewValue");
}
And to save the new text, first join the lines back into a single string.
string output = string.Join(Environment.NewLine, lines);
And then save the string to the file.
File.WriteAllText(file, output);
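
If the result should go straight into an output stream, as in the question, a reader/writer pair over the two streams avoids the manual byte conversion. The following is a sketch under the assumption that the file is UTF-8; the class PlaceholderWriter and the method Copy are hypothetical names. A placeholder that spans a line break would not be replaced, and the output always ends with a line break.

```csharp
using System;
using System.IO;
using System.Text;

public static class PlaceholderWriter
{
    // Copies text from 'input' to 'output', replacing 'placeholder' with
    // 'value' line by line. Both streams are left open for the caller.
    public static void Copy(Stream input, Stream output, string placeholder, string value)
    {
        var utf8 = new UTF8Encoding(false); // assumption: UTF-8 without BOM
        using (var reader = new StreamReader(input, utf8, true, 4096, leaveOpen: true))
        using (var writer = new StreamWriter(output, utf8, 4096, leaveOpen: true))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // WriteLine re-adds the terminator that ReadLine removed.
                writer.WriteLine(line.Replace(placeholder, value));
            }
        }
    }
}
```

With the names from the question, the call would be PlaceholderWriter.Copy(fs, page.Response.OutputStream, "#PlaceHolder#", "NewValue"); inside the using block that opens fs.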

Not able to read special character "£" using Streamreader in c#

I am trying to read a character (£) from a text file, using the following code.
public static List<string> ReadAllLines(string path, bool discardEmptyLines, bool doTrim)
{
var retVal = new List<string>();
if (string.IsNullOrEmpty(path) || !File.Exists(path)) {
Comm.File.Log.LogError("ReadAllLines", string.Format("Could not load file: {0}", path));
return retVal;
}
//StreamReader sr = null;
StreamReader sr = new StreamReader(path, Encoding.Default);
try {
sr = File.OpenText(path);
while (sr.Peek() >= 0) {
var line = sr.ReadLine();
if (discardEmptyLines && (line == null || string.IsNullOrEmpty(line.Trim()))) {
continue;
}
if (line != null) {
retVal.Add(doTrim ? line.Trim() : line);
}
}
}
catch (Exception ex) {
Comm.File.Log.LogGeneralException("ReadAllLines", ex);
}
finally {
if (sr != null) {
sr.Close();
}
}
return retVal;
}
But my code is not reading £ correctly; it reads the character as �. Please guide me on what needs to be done to read this special character.
Thanks in advance.
The file you are reading is not encoded the same way as Encoding.Default. It is likely UTF-8. Try using UTF-8 for this particular file. For more general usage, you should see "Determining the Encoding of a text file".
Try replacing Encoding.Default with Encoding.GetEncoding(437).
This looks like an encoding problem. Try creating your StreamReader with a UTF-8 (or Unicode) encoding instead of the default.
StreamReader sr = new StreamReader(path, Encoding.UTF8);
Encoding information can be provided to the StreamReader in two ways:
1) Save your file with the "Save As" option and select the appropriate encoding from the drop-down in Windows.
2) If your files are dynamic in nature, use Encoding.GetEncoding() with the StreamReader.
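
To see why the character turns into �, it may help to decode the same bytes with different encodings. The sketch below uses a hypothetical helper, PoundSignDemo.Decode, that runs a byte array through a StreamReader. It is also worth noting that the code in the question reassigns sr with File.OpenText(path), which reads as UTF-8, so the reader created with Encoding.Default is never actually used.

```csharp
using System;
using System.IO;
using System.Text;

public static class PoundSignDemo
{
    // Decodes 'bytes' through a StreamReader that uses the given encoding.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In ISO-8859-1 (and Windows-1252) the pound sign is the single byte 0xA3, while UTF-8 stores it as the two bytes 0xC2 0xA3. So Decode(new byte[] { 0xA3 }, Encoding.GetEncoding("iso-8859-1")) and Decode(new byte[] { 0xC2, 0xA3 }, Encoding.UTF8) both give "£", but Decode(new byte[] { 0xA3 }, Encoding.UTF8) gives the replacement character U+FFFD, because a lone 0xA3 is not valid UTF-8.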
