Reading from file and writing array to file - C#

Is there any way to write a string array to a file without using a loop? I tried doing that with a StreamWriter (writer) after splitting the string into array cells. I am looking for a way to avoid the for loop and the Close() method.
Thanks!
if (data[0] != null)
{
for (int i = 0; i < data.Length; i++)
{
this.writer.Write(this.data[i] + "+++", true);
}
}
this.writer.Write("\n", true);
writer.Close();

I don't know why you want to write like this, but you can try:
var stringToSave = string.Join("+++", data);
File.WriteAllText(path, stringToSave); // path: the file to write (not named in the question)
But of course there is a loop somewhere in string.Join() here. I would also like to ask you, are you sure you don't want to use:
File.WriteAllLines(path, data); // path: the file to write (not named in the question)
?
This will write every string of your collection to a separate line...
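The two options above can be sketched as follows. This is a minimal sketch, not the asker's code: the class name `ArrayFileDemo`, the method names, and the `path` parameter are assumptions made for illustration. Note that `string.Join` places "+++" between the items only, whereas the original loop also wrote it after the last item.

```csharp
using System;
using System.IO;

static class ArrayFileDemo
{
    // Writes all items on one line, separated by "+++", followed by a newline.
    // File.WriteAllText opens, writes, and closes the file in one call.
    public static void WriteJoined(string path, string[] data)
    {
        File.WriteAllText(path, string.Join("+++", data) + "\n");
    }

    // Writes each item on its own line.
    public static void WritePerLine(string path, string[] data)
    {
        File.WriteAllLines(path, data);
    }
}
```

Neither method needs an explicit Close() call, because the File helpers dispose of the underlying stream themselves.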

Related

Replace character at specific index in List<string>, but indexer is read only [duplicate]

This question already has answers here:
Is there an easy way to change a char in a string in C#?
(8 answers)
Closed 5 years ago.
This is kind of a basic question, but I learned programming in C++ and am just transitioning to C#, so my ignorance of the C# methods is getting in my way.
A client has given me a few fixed length files and they want the 484th character of every odd numbered record, skipping the first one (3, 5, 7, etc...) changed from a space to a 0. In my mind, I should be able to do something like the below:
static void Main(string[] args)
{
List<string> allLines = System.IO.File.ReadAllLines(@"C:\...").ToList();
foreach(string line in allLines)
{
//odd numbered logic here
line[483] = '0';
}
...
//write to new file
}
However, the property or indexer cannot be assigned to because it is read only. All my reading says that I have not set a setter for the variable, and I have tried what was shown at this SO article, but I am doing something wrong every time. Should what is shown in that article work? Should I do something else?
You cannot modify C# strings directly, because they are immutable. You can convert strings to char[], modify it, then make a string again, and write it to file:
File.WriteAllLines(
    @"c:\newfile.txt"
,   File.ReadAllLines(@"C:\...").Select((s, index) => {
        if (index % 2 == 0) {
            return s; // Lines at even (zero-based) indexes do not change
        }
        var chars = s.ToCharArray();
        chars[483] = '0';
        return new string(chars);
    })
);
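If "every odd numbered record, skipping the first" is read as 1-based record numbers 3, 5, 7, ..., those records sit at zero-based indexes 2, 4, 6, .... A minimal sketch under that reading follows; `RecordPatcher`, `ReplaceAt`, and `PatchOddRecords` are hypothetical names, and the length check is an added safeguard that the question does not mention.

```csharp
using System;
using System.Linq;

static class RecordPatcher
{
    // Returns a copy of `line` with the character at `index` replaced,
    // or the line unchanged if it is too short to contain that index.
    public static string ReplaceAt(string line, int index, char value)
    {
        if (index < 0 || index >= line.Length) return line;
        char[] chars = line.ToCharArray();
        chars[index] = value;
        return new string(chars);
    }

    // Patches records 3, 5, 7, ... (1-based), i.e. zero-based indexes 2, 4, 6, ...
    public static string[] PatchOddRecords(string[] lines, int index, char value)
    {
        return lines
            .Select((s, i) => (i >= 2 && i % 2 == 0) ? ReplaceAt(s, index, value) : s)
            .ToArray();
    }
}
```

For the files in the question, the call would be `PatchOddRecords(lines, 483, '0')`, since the 484th character is at index 483.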
Because strings are immutable, you can't assign to a character at a specific index the way you could with a char[]. However, you can "modify" the string by building a new one and assigning it back.
We can use the Substring() method to return any part of the original string. Combining this with some concatenation, we can take the first part of the string (up to the character you want to replace), add the new character, and then add the rest of the original string.
Also, since we can't directly modify the items in a collection being iterated over in a foreach loop, we can switch your loop to a for loop instead. Now we can access each line by index, and can modify them on the fly:
for (int i = 0; i < allLines.Count; i++) // allLines is a List<string>, so use Count
{
if (allLines[i].Length > 483)
{
allLines[i] = allLines[i].Substring(0, 483) + "0" + allLines[i].Substring(484);
}
}
Depending on how many lines you process and how many concatenations you end up doing, a StringBuilder may perform better than concatenation. Here is an alternative that uses a StringBuilder. I'll leave the perf measuring to you...
var sb = new StringBuilder();
for (int i = 0; i < allLines.Count; i++) // allLines is a List<string>, so use Count
{
if (allLines[i].Length > 483)
{
sb.Clear();
sb.Append(allLines[i].Substring(0, 483));
sb.Append("0");
sb.Append(allLines[i].Substring(484));
allLines[i] = sb.ToString();
}
}
The variable declared in the foreach (string line in this case) is a read-only iteration variable, so you can't assign a new value to it inside the loop. Try using a regular for loop instead.
foreach is meant for iterating over a container, and its iteration variable is read-only. Use a regular for loop instead; it will work.
static void Main(string[] args)
{
    List<string> allLines = System.IO.File.ReadAllLines(@"C:\...").ToList();
    for (int i = 0; i < allLines.Count; ++i)
    {
        if (allLines[i].Length > 483)
        {
            // Keep the text after position 483 as well as the text before it
            allLines[i] = allLines[i].Substring(0, 483) + "0" + allLines[i].Substring(484);
        }
    }
    ...
    //write to new file
}

Extremely Large Single-Line File Parse

I am downloading data from a site and the site gives the data to me in very large blocks. Within the very large block, there are "chunks" that I need to parse individually. These "chunks" begin with "<ClinicalData>" and end with "</ClinicalData>". Therefore, an example string would look something like:
<ClinicalData><ID="1"></ClinicalData><ClinicalData><ID="2"></ClinicalData><ClinicalData><ID="3"></ClinicalData><ClinicalData><ID="4"></ClinicalData><ClinicalData><ID="5"></ClinicalData>
Under "ideal" circumstances, the block is meant to be one single line of data; however, sometimes there are erroneous newline characters. Since I want to parse the <ClinicalData> chunks within the block, I want to make my data parseable line by line. Therefore, I take the text file, read it all into a StringBuilder, remove newlines (just in case), and then insert my own newlines so that I can read line by line.
StringBuilder dataToWrite = new StringBuilder(File.ReadAllText(filepath), Int32.MaxValue);
// Need to clear newline characters just in case they exist.
dataToWrite.Replace("\n", "");
// set my own newline characters so the data becomes parse-able by line
dataToWrite.Replace("<ClinicalData", "\n<ClinicalData");
// set the data back into a file, which is then used in a StreamReader to parse by lines.
File.WriteAllText(filepath, dataToWrite.ToString());
This has been working out great (maybe not efficiently, but at least it is friendly to me :)), until I encountered a chunk of data that was given to me as a 280MB file.
Now I am getting a System.OutOfMemoryException with this block and I just cannot figure out a way around it. I believe the issue is that StringBuilder cannot handle 280MB of straight text? Well, I have tried string splits, Regex.Match splits, and various other ways to break it into guaranteed "<ClinicalData>" chunks, but I continue to get the memory exception. I have also had no luck in attempting to read pre-defined chunks (e.g. using .ReadBytes).
Any suggestions on how to handle a 280MB file that is potentially, but not necessarily, a single line of text would be great!
That's an extremely inefficient way to read a text file, let alone a large one. If you only need one pass, replacing or adding individual characters, you should use a StreamReader. If you only need one character of lookahead you only need to maintain a single intermediate state, something like:
enum ReadState
{
Start,
SawOpen
}
using (var sr = new StreamReader(@"path\to\clinic.txt"))
using (var sw = new StreamWriter(@"path\to\output.txt"))
{
var rs = ReadState.Start;
while (true)
{
var r = sr.Read();
if (r < 0)
{
if (rs == ReadState.SawOpen)
sw.Write('<');
break;
}
char c = (char) r;
if ((c == '\r') || (c == '\n'))
continue;
if (rs == ReadState.SawOpen)
{
if (c == 'C')
sw.WriteLine();
sw.Write('<');
rs = ReadState.Start;
}
if (c == '<')
{
rs = ReadState.SawOpen;
continue;
}
sw.Write(c);
}
}
First off, I don't think you need to put all the text in a StringBuilder, since you aren't even concatenating parts to it. You could just try the following:
string text = File.ReadAllText(filepath).Replace("\n", "").Replace("<ClinicalData", "\n<ClinicalData");
Why not try a StreamReader for this task? You can pick a "chunk" size to read by and then split those chunks into the <ClinicalData>data</ClinicalData> parts. Here is some detailed code on how to do this:
char[] buffer = new char[1024];
string remainder = string.Empty;
List<ClientData> list = new List<ClientData>();
using (StreamReader reader = File.OpenText(@"source.txt"))
{
    int read;
    // Read returns the number of chars actually placed in the buffer
    while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Use only the chars read this time, not stale data from a previous pass
        remainder = Parse(remainder + new string(buffer, 0, read), list);
    }
}
with the following method:
string Parse(string value, List<ClientData> list)
{
string[] parts = value.Split(new string[1] { "</ClinicalData>" }, StringSplitOptions.None);
for (int i = 0; i < parts.Length - 1; i++)
list.Add(new ClientData(parts[i]));
return parts[parts.Length - 1];
}
and the ClientData class however you have it implemented:
class ClientData
{
public ClientData(string value)
{
// fill in however you are already parsing out ID, and other info
}
}
There are many ways to implement something like this, but hopefully this can help get you started.
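A lazy variant of the same idea is sketched below: instead of collecting parsed objects in a list, it yields one chunk at a time so that only the current chunk is held in memory. This is an illustrative sketch, not the answerer's code; `ChunkReader` and `ReadChunks` are hypothetical names, the terminator is assumed to be non-empty, and line breaks are dropped as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ChunkReader
{
    // Lazily yields each piece of text up to and including `terminator`.
    // '\r' and '\n' are skipped; trailing text without a terminator is yielded last.
    public static IEnumerable<string> ReadChunks(TextReader reader, string terminator)
    {
        var current = new StringBuilder();
        var buffer = new char[4096];
        char last = terminator[terminator.Length - 1];
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                char c = buffer[i];
                if (c == '\r' || c == '\n') continue;
                current.Append(c);
                // Only compare the tail when the last terminator char was just appended
                if (c == last && EndsWith(current, terminator))
                {
                    yield return current.ToString();
                    current.Clear();
                }
            }
        }
        if (current.Length > 0) yield return current.ToString();
    }

    private static bool EndsWith(StringBuilder sb, string suffix)
    {
        if (sb.Length < suffix.Length) return false;
        int offset = sb.Length - suffix.Length;
        for (int i = 0; i < suffix.Length; i++)
            if (sb[offset + i] != suffix[i]) return false;
        return true;
    }
}
```

A caller could loop with `foreach (string chunk in ChunkReader.ReadChunks(File.OpenText(path), "</ClinicalData>"))`, where `path` stands for the downloaded file.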
StreamReader's ReadLine() method is only one of the many ways you can read the text from the file. You can read into a buffer with a specified length, and then parse out the ClinicalData tags. I can provide an example if you'd like.
http://msdn.microsoft.com/en-us/library/9kstw824%28v=vs.110%29.aspx
Alternately, if you are reading an XML file, XmlReader is another option.
http://msdn.microsoft.com/en-us/library/system.xml.xmlreader%28v=vs.110%29.aspx

Can't find string in input file

I have a text file, which I am trying to insert a line of code into. Using my linked-lists I believe I can avoid having to take all the data out, sort it, and then make it into a new text file.
What I did was come up with the code below. I set my bools, but still it is not working. I went through debugger and what it seems to be going on is that it is going through the entire list (which is about 10,000 lines) and it is not finding anything to be true, so it does not insert my code.
Why or what is wrong with this code?
List<string> lines = new List<string>(File.ReadAllLines("Students.txt"));
using (StreamReader inFile = new StreamReader("Students.txt", true))
{
string newLastName = "'Constant";
string newRecord = "(LIST (LIST 'Constant 'Malachi 'D ) '1234567890 'mdcant@mail.usi.edu 4.000000 )";
string line;
string lastName;
bool insertionPointFound = false;
for (int i = 0; i < lines.Count && !insertionPointFound; i++)
{
line = lines[i];
if (line.StartsWith("(LIST (LIST "))
{
values = line.Split(" ".ToCharArray());
lastName = values[2];
if (newLastName.CompareTo(lastName) < 0)
{
lines.Insert(i, newRecord);
insertionPointFound = true;
}
}
}
if (!insertionPointFound)
{
lines.Add(newRecord);
}
You're just reading the file into memory and not committing it anywhere.
I'm afraid that you're going to have to load and completely rewrite the entire file. Files support appending, but they don't support insertions.
You can write to a file the same way that you read from it:
string[] lines;
/// instantiate and build `lines`
File.WriteAllLines("path", lines);
WriteAllLines also takes an IEnumerable<string>, so you can pass a List<string> in there if you want.
One more issue: it appears as though you're reading your file twice, once with ReadAllLines and again with your StreamReader.
There are at least four possible errors.
The opening of the StreamReader is not required; you have already read all the lines. (Well, not really an error, but...)
The check for StartsWith can be fooled if your lines start with blank space, and you will miss the insertion point. (Adding a Trim will remove any problem here.)
In the CompareTo line you check for < 0, but you should check for == 0. CompareTo returns 0 if the strings are equivalent, however...
To check whether two strings are equal, you should avoid CompareTo, as the MSDN documentation for it explains, and use string.Equals instead.
List<string> lines = new List<string>(File.ReadAllLines("Students.txt"));
string newLastName = "'Constant";
string newRecord = "(LIST (LIST 'Constant 'Malachi 'D ) '1234567890 'mdcant@mail.usi.edu 4.000000 )";
string line;
string lastName;
bool insertionPointFound = false;
for (int i = 0; i < lines.Count && !insertionPointFound; i++)
{
line = lines[i].Trim();
if (line.StartsWith("(LIST (LIST "))
{
string[] values = line.Split(" ".ToCharArray());
lastName = values[2];
if (newLastName.Equals(lastName))
{
lines.Insert(i, newRecord);
insertionPointFound = true;
}
}
}
if (!insertionPointFound)
lines.Add(newRecord);
I don't list the missing write back to the file as an error. I hope you have just omitted that part of the code; otherwise it is a very simple problem.
(However, I think that the way CompareTo is used is probably the main reason for your problem.)
EDIT: Looking at your comment below, it seems that the answer from Sam I Am is the right one for you. Of course you need to write back the modified list of lines. All the changes are made to an in-memory list of lines, and nothing is written to a file unless you have code that writes it. However, you don't need a new file:
File.WriteAllLines("Students.txt", lines);
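If the goal is to keep the records sorted by last name, as the `< 0` test in the question suggests, the insertion can be sketched like this. It is a sketch under assumptions: `RosterInsert` and `InsertSorted` are hypothetical names, the file is assumed to be sorted already, and ordinal comparison is an arbitrary choice that may not match how the file was originally ordered.

```csharp
using System;
using System.Collections.Generic;

static class RosterInsert
{
    const string Prefix = "(LIST (LIST ";

    // Inserts newRecord before the first record whose last name sorts after
    // newLastName (ordinal comparison); appends it if no such record exists.
    public static void InsertSorted(List<string> lines, string newRecord, string newLastName)
    {
        for (int i = 0; i < lines.Count; i++)
        {
            string line = lines[i].Trim();
            if (!line.StartsWith(Prefix, StringComparison.Ordinal)) continue;

            // Splitting on spaces gives: "(LIST", "(LIST", last name, first name, ...
            string[] values = line.Split(' ');
            if (values.Length > 2 && string.CompareOrdinal(newLastName, values[2]) < 0)
            {
                lines.Insert(i, newRecord);
                return;
            }
        }
        lines.Add(newRecord);
    }
}
```

After the call, the list still has to be written back, for example with File.WriteAllLines("Students.txt", lines).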

Reading a line in c# without trimming the line delimiter character

I've got a string that I want to read line-by-line, but I also need to have the line delimiter character, which StringReader.ReadLine unfortunately trims (unlike in ruby where it is kept). What is the fastest and most robust way to accomplish this?
Alternatives I've been thinking about:
Reading the input character-by-character and checking for the line delimiter each time
Using Regex.Split with a positive lookahead
Alternatively, I only care about the line delimiter because I need to know the actual position in the string, and the delimiter can be either one or two characters long. Therefore, getting back the actual position of the cursor within the string would also be good, but StringReader doesn't have this feature.
EDIT: here is my current implementation. End-of-file is designated by returning an empty string.
StringBuilder line = new StringBuilder();
int r = _input.Read();
while (r >= 0)
{
char c = Convert.ToChar(r);
line.Append(c);
if (c == '\n') break;
if (c == '\r')
{
int peek = _input.Peek();
if (peek == -1) break;
if (Convert.ToChar(peek) != '\n') break;
}
r = _input.Read();
}
return line.ToString();
Are you concerned about inconsistencies between files (i.e. coming from Unix/Mac vs. Windows), or within files?
One very easy optimization if you know that individual files are consistent with themselves would be to only read the first line character-by-character and figure out what the delimiter is. Then determining the exact position of any other line would be simple math.
Failing that, I think I would go the character-by-character route. A regex seems too "clever." This sounds like a complex function and I think the most important thing would be to make it easy to write, read, understand, and most importantly debug.
There's another way to do this, which would be more efficient if your data source was a stream. Unfortunately it's not, as referenced in your comment, so you would have to create one first; however, I'll include the solution anyway, it might give you some inspiration:
public IEnumerable<int> GetLineStartIndices(string s)
{
    yield return 0;
    byte[] bytes = Encoding.UTF8.GetBytes(s);
    using (MemoryStream stream = new MemoryStream(bytes))
    {
        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
        {
            while (reader.ReadLine() != null)
            {
                // Position is a long byte offset into the stream. StreamReader reads
                // ahead in buffer-sized blocks, so it advances per block, not per line.
                yield return (int)stream.Position;
            }
        }
    }
}
This will give you back the start position of each new line. Obviously you can tweak this to do whatever else you need, i.e. do something else with the actual lines you read.
Just note that this has to make a copy of the string to create the byte array, so it's really not suitable for very large strings. It's a bit nicer than the char-by-char approach though, less bug-prone, so perhaps worth considering if the strings are not megabytes-long.
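For comparison, the line start positions can also be found by scanning the string directly, which avoids the byte-array copy and reports char indexes rather than byte offsets. This is a minimal sketch of the character-by-character route, not the code from the answer above; `LineIndex` is a hypothetical class name, and treating a lone "\r" as a line break is an assumption.

```csharp
using System;
using System.Collections.Generic;

static class LineIndex
{
    // Yields the char index at which each line of `s` begins.
    // "\r\n", "\n" and a lone "\r" each count as one line break.
    public static IEnumerable<int> GetLineStartIndices(string s)
    {
        yield return 0;
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            // Step over the '\n' of a "\r\n" pair so it counts as a single break
            if (c == '\r' && i + 1 < s.Length && s[i + 1] == '\n') i++;
            if (c == '\r' || c == '\n') yield return i + 1;
        }
    }
}
```

If the string ends with a line break, the last value yielded equals s.Length, marking an empty final line.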
If you only care about the position: ReadLine() moves you to the next line. If you store the .Position of the stream underneath you can compare it to the .Position after the following ReadLine(). That's the length of the string you just read plus the delimiter.
Length of the delimiter is currentPosition - previousPosition - line.Length.
That way you could easily find out if it was 1 or 2 bytes (without knowing the details, but you said you care only about the positions anyway).
File.ReadAllText will get you all of the file contents. Yup. All. So you better check that file size before using it.
EDIT:
Read it all in, then create an iterator that yields line by line.
foreach (string line in Read("some.file"))
{ ... }
private IEnumerable<string> Read(string file)
{
    string buffer = File.ReadAllText(file);
    for (int index = 0; index < buffer.Length; index++)
    {
        string line = ... // logic to build a "line" here
        yield return line;
    }
}
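One way to fill in the "build a line" step so that each line keeps its delimiter is sketched below. It is illustrative only: `LineSplitter` and `LinesWithTerminators` are hypothetical names, and the method takes the already-loaded text rather than a file name.

```csharp
using System;
using System.Collections.Generic;

static class LineSplitter
{
    // Yields each line of `text` including its terminator ("\r\n", "\n" or "\r").
    // A final line without a terminator is yielded as-is.
    public static IEnumerable<string> LinesWithTerminators(string text)
    {
        int start = 0;
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c != '\r' && c != '\n') continue;
            // Keep the two characters of a "\r\n" pair together
            if (c == '\r' && i + 1 < text.Length && text[i + 1] == '\n') i++;
            yield return text.Substring(start, i + 1 - start);
            start = i + 1;
        }
        if (start < text.Length) yield return text.Substring(start);
    }
}
```

Because every yielded string includes its delimiter, summing the lengths of the lines seen so far gives the exact position in the original string.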
FileStream fs = new FileStream("E:\\hh.txt", FileMode.Open, FileAccess.Read);
BinaryReader read = new BinaryReader(fs);
byte[] ch = read.ReadBytes((int)fs.Length);
byte[] che = new byte[ch.Length];
int j = 0;
// Copy every byte except '|' into che
for (int i = 0; i < ch.Length; i++)
{
    if (ch[i] != '|')
    {
        che[j] = ch[i];
        j++;
    }
}
// Decode only the j bytes that were copied, not the unused tail of che
richTextBox1.Text = Encoding.ASCII.GetString(che, 0, j);
read.Close();
fs.Close();

c# how do I count lines in a textfile

any problems with doing this?
int i = new StreamReader("file.txt").ReadToEnd().Split(new char[] {'\n'}).Length
The method you posted isn't particularly good. Let's break this apart:
// new StreamReader("file.txt").ReadToEnd().Split(new char[] {'\n'}).Length
// becomes this:
var file = new StreamReader("file.txt").ReadToEnd(); // big string
var lines = file.Split(new char[] {'\n'}); // big array
var count = lines.Length;
You're actually holding this file in memory twice: once to read all the lines, once to split it into an array. The garbage collector hates that.
If you like one liners, you can write System.IO.File.ReadAllLines(filePath).Length, but that still retrieves the entire file in an array. There's no point doing that if you aren't going to hold onto the array.
A faster solution would be:
int TotalLines(string filePath)
{
using (StreamReader r = new StreamReader(filePath))
{
int i = 0;
while (r.ReadLine() != null) { i++; }
return i;
}
}
The code above holds (at most) one line of text in memory at any given time. It's going to be efficient as long as the lines are relatively short.
Well, the problem with doing this is that you allocate a lot of memory when doing this on large files.
I would rather read the file line by line and manually increment a counter. This may not be a one-liner but it's much more memory-efficient.
Alternatively, you may load the data in even-sized chunks and count the line breaks in these. This is probably the fastest way.
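The chunked approach could look roughly like this. It is a sketch, not a measured implementation: `LineCounter`, `CountLineBreaks`, and the 64K buffer size are assumptions, and whether it really is the fastest option would need to be benchmarked.

```csharp
using System;
using System.IO;

static class LineCounter
{
    // Counts '\n' characters by reading fixed-size blocks, so memory use
    // stays constant regardless of file size or line length.
    public static int CountLineBreaks(TextReader reader)
    {
        var buffer = new char[64 * 1024];
        int count = 0;
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == '\n') count++;
            }
        }
        return count;
    }
}
```

This counts line breaks rather than lines, so a final line with no trailing newline is not included in the result.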
If you're looking for a short solution, I can give you a one-liner that at least saves you from having to split the result:
int i = File.ReadAllLines("file.txt").Length;
But that has the same problems of reading a large file into memory as your original. You should really use a streamreader and count the line breaks as you read them until you reach the end of the file.
Sure - it reads the entire stream into memory. It's terse, but I can create a file today that will fail this hard.
Read a character at a time and increment your count on newline.
EDIT - after some quick research
If you want terse and want that shiny new generic feel, consider this:
public class StreamEnumerator : IEnumerable<char>
{
StreamReader _reader;
public StreamEnumerator(Stream stm)
{
if (stm == null)
throw new ArgumentNullException("stm");
if (!stm.CanSeek)
throw new ArgumentException("stream must be seekable", "stm");
if (!stm.CanRead)
throw new ArgumentException("stream must be readable", "stm");
_reader = new StreamReader(stm);
}
public IEnumerator<char> GetEnumerator()
{
int c = 0;
while ((c = _reader.Read()) >= 0)
{
yield return (char)c;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
which defines a new class which allows you to enumerate over streams, then your counting code can look like this:
StreamEnumerator chars = new StreamEnumerator(stm);
int lines = chars.Count(c => c == '\n');
which gives you a nice terse lambda expression to do (more or less) what you want.
I still prefer the Old Skool:
public static int CountLines(Stream stm)
{
StreamReader _reader = new StreamReader(stm);
int c = 0, count = 0;
while ((c = _reader.Read()) != -1)
{
if (c == '\n')
{
count++;
}
}
return count;
}
NB: Environment.NewLine version left as an exercise for the reader
That should do the trick:
using System.Linq;
....
int i = File.ReadLines(file).Count();
Maybe this?
string file = new StreamReader("YourFile.txt").ReadToEnd();
string[] lines = file.Split('\n');
int countOfLines = lines.GetLength(0);
Assuming the file exists and you can open it, that will work.
It's not very readable or safe...
