Can you read multiple variables on one line with StreamReader()? - C#

I'm working in C# and wondering if it's possible to read back multiple variables from one line and load them into an array using StreamReader.ReadLine()?
Here's an example:
I have an array of objects with fields of different types being written using StreamWriter:
foreach (Stuff stf in StuffArray)
{
    sw.WriteLine(" " + stf.car + " " + stf.carOwned + " " + stf.carLocation);
}
sw.Close();
It writes out a text line that looks like:
Magnum True Alabama
When I go to read it back, the only option I have is to read the entire line with StreamReader.
I want to load it back like:
for (int i = 0; i < stfArray.Length; i++)
{
    stfArray[i] = new Stuff(ReadLine spot 1, ReadLine spot 2, ReadLine spot 3);
}
So I can put the stuff back into the array or a new array in the exact same way it was when I extracted it.
Thanks!

Read the line, parse it (say with string.Split), and then take the tokens you've parsed and rehydrate your variables.
You probably want a more sophisticated format than word, space, word, etc. (think of cars whose model name contains a space, or places like New York). Pick a separator that will not occur in your strings.
Or better still, pick a well known serialization format like XML or JSON.
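For instance, a minimal round-trip with System.Text.Json might look like this (a sketch: the Stuff class and property names are adapted from the fields in the question, and stuff.json is an illustrative file name):
using System;
using System.IO;
using System.Text.Json;

var stuffArray = new[]
{
    new Stuff { Car = "Magnum", CarOwned = true, CarLocation = "New York" }
};

// Serialize the whole array in one call...
File.WriteAllText("stuff.json", JsonSerializer.Serialize(stuffArray));

// ...and read it back with its structure intact, spaces in the values and all.
Stuff[] roundTripped = JsonSerializer.Deserialize<Stuff[]>(File.ReadAllText("stuff.json"));
Console.WriteLine(roundTripped[0].CarLocation); // prints "New York"

public class Stuff
{
    public string Car { get; set; }
    public bool CarOwned { get; set; }
    public string CarLocation { get; set; }
}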

Since you are creating the string representation yourself, you are 100% sure about its format. In any other situation, you should consider something more robust, like serializing to XML and then writing that to a file.
In your situation I would recommend creating a new collection of Stuff and adding the object that lies on every line of your file, like:
List<Stuff> stuffItems = new List<Stuff>();
while (!streamReader.EndOfStream)
{
    // Trim first: the writing code above prepends a space to every line
    string[] line = streamReader.ReadLine().Trim().Split(' ');
    stuffItems.Add(new Stuff(line[0], line[1], line[2]));
}
And now you can use this list however you want, for example overwrite the old array by calling ToArray on the list, etc.
But, again, I warn you about this approach: the moment you change your format, for example the delimiting character, you'll get exceptions :)

Related

Parallel.ForEach not storing data in variable accurately

First off, I apologize for the title, I am unsure of how to word this question.
I currently have a Parallel.ForEach looping through a directory filled with .txt files.
What I am doing inside of this is grabbing specific data that varies from file-to-file.
Whenever all of the needed data is grabbed, it is then outputted to a file.
Everything is accurate except for the ID numbers from inside each text file: they are not lining up with their corresponding files, and they come out in an arbitrary order. The names and dates line up fine, but not the actual ID numbers from inside the text files. I am unsure how to proceed, as the code looks fine to me. The only thing I can think of is that AppendAllText is constantly opening and closing the final text file, which in turn prevents accurate data from being output. Below is my code.
Parallel.ForEach(directoryInfo.GetFiles("*.txt"), (file, state) =>
{
    using StreamReader sr = File.OpenText(file.FullName);
    string user = sr.ReadToEnd();
    if (user.Contains("ID:"))
    {
        ID = IDRegex.Match(user).Value.Replace("ID:", string.Empty);
    }
    else if (user.Contains("ID="))
    {
        ID = IDDRegex.Match(user).Value.Replace("ID=", string.Empty);
    }
    this.Dispatcher.Invoke(() =>
    {
        //count++;
        //Current.Content = file.Name;
        if (user.Contains(users.Text))
        {
            File.AppendAllText(Idir + "IReport-" + IUser.Text + ".txt",
                String.Format("{0,-16} {1,27}", file.Name.Replace(".txt", ""), (file.LastWriteTime.Date).ToString()).Replace("12:00:00 AM", "") +
                String.Format("{0, 18}", ID) + Environment.NewLine + Environment.NewLine);
        }
    });
});
Your problem is that because you've used Parallel.ForEach(), the code you've posted is being run multiple times simultaneously by different threads. This is great for doing the work faster, but it can catch you out in a few ways.
Your variable ID is not declared in the code you've posted, which means it must be coming from somewhere else. This means that all the threads created by your use of Parallel.ForEach() are sharing the same variable, overwriting each other's values, which is what you're seeing: the ID being written to the file isn't the one for this file, it's one from whichever thread happens to have touched that value most recently.
Declare a fresh variable inside your call to Parallel.ForEach() to use as the ID, and all threads will have their own value that the others can't mess with. That should fix your problem.
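For example (a sketch of just the fix, reusing the identifiers from your snippet rather than a self-contained program):
Parallel.ForEach(directoryInfo.GetFiles("*.txt"), (file, state) =>
{
    using StreamReader sr = File.OpenText(file.FullName);
    string user = sr.ReadToEnd();

    string id = string.Empty; // local: every file/thread gets its own copy
    if (user.Contains("ID:"))
        id = IDRegex.Match(user).Value.Replace("ID:", string.Empty);
    else if (user.Contains("ID="))
        id = IDDRegex.Match(user).Value.Replace("ID=", string.Empty);

    // ...the Dispatcher.Invoke block stays as it was, but formats the local `id`
});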

Convert text of a C# project into 1 text file

So I'm doing Google Code Jam, and for their new format I have to upload my code as a single text file.
I like writing my code as properly constructed classes across multiple files even under time pressure (I find I save more time in clarity and debugging speed than I lose in overhead), and I want to re-use the common code.
Once I've got my code finished I have to convert from a series of classes in multiple files, to a single file.
Currently I'm just manually copying and pasting all the files' text into a single file, and then manually massaging the usings and namespaces to make it all work.
Is there a better option?
Ideally a tool that will JustDoIt for me?
Alternatively, if there were some predictable algorithm that I could implement that wouldn't require any manual tweaks?
Write your classes so that all "using"s are inside "namespace"
Write a script which collects all *.cs files and concatenates them
This is probably not the most optimal way to do it, but here is an algorithm which can do what you need (a sketch follows the list):
loop through every file and grab every line starting with "using" -> write them to a temp file/buffer
check for duplicates and remove them
get the position of the first '{' after the charsequence "namespace"
get the position of the last '}' in the file
append the text in between these two positions onto a temp file/buffer
append the second file/buffer to the first one
write out the merged buffer
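A minimal line-based sketch of those steps (it assumes every file keeps its using directives at the top and that everything from the first namespace keyword onward is the body; the src folder and merged.cs name are illustrative):
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

var usings = new HashSet<string>();  // steps 1+2: collect usings, the set removes duplicates
var bodies = new List<string>();     // steps 3-5: everything from "namespace" onwards

foreach (var path in Directory.GetFiles("src", "*.cs", SearchOption.AllDirectories))
{
    var lines = File.ReadAllLines(path);
    usings.UnionWith(lines.Where(l => l.TrimStart().StartsWith("using ")));
    bodies.Add(string.Join(Environment.NewLine,
        lines.SkipWhile(l => !l.TrimStart().StartsWith("namespace"))));
}

// steps 6+7: write the merged buffer, usings first, then every namespace block
File.WriteAllText("merged.cs",
    string.Join(Environment.NewLine, usings) + Environment.NewLine + Environment.NewLine +
    string.Join(Environment.NewLine + Environment.NewLine, bodies));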
It is very subjective. I see the algorithm as the following, in pseudo code:
usingsLines = new HashSet<string>();
newFile = new StringBuilder();
foreach (file in listOfFiles)
{
    var textFromFile = file.ReadToEnd();
    usingsLines.AddRange(textFromFile.GetUsings()); // the HashSet drops duplicate usings
    newFile += textFromFile.GetBody();
}
newFile = usingsLines.ToString() + newFile;
// As a result we will have something like this
// using usingsfromFirstFile;
// using usingsfromSecondFile;
//
// namespace FirstFileNamespace
// {
// ...
// }
//
// namespace SecondFileNamespace
// {
// ...
// }
But keep in mind this approach can lead to conflicts if two different namespaces contain classes with the same names, etc. To solve that, you either fix it manually or get rid of the using directives and use fully qualified type names instead.
Also, these links may be useful: Merge files, Merge file in Java

Searching an array string with a binary search sub string

I have a file.txt containing about 200,000 records.
The format of each record is 123456-99-Text. The 123456 is a unique account number, the 99 is a location code that I need (it ranges from 01 to 99), and the text is irrelevant. The records are sorted by account number, one per line (111111, 111112, 111113, etc.).
I made a Visual Studio textbox and search button so someone can search for an account number. The account number is actually 11 digits long, but only the first 6 matter. I wrote this as string actnum = textBox1.Text.Substring(0, 6);
I wrote a foreach (string x in File.ReadLines("file.txt")) with an if (x.Contains(actnum)) then string code = x.Substring(8, 2); statement.
The program works well, but because there are so many records, if someone searches for an account number that doesn't exist, or for a number near the bottom of the list, the program locks up for a good 10 seconds before reaching the "number not found" else branch, or takes forever to find that last record.
My Question:
Reading about binary searches, I have attempted one without much success. I cannot seem to get the array or file to act like a legitimate binary search. Is there a way to take the 6 digit actnum from textbox1, compare it to a substring holding the 6 digit account number, then grab the substring 99 code from that specific line?
A binary search would help greatly! I could take 555-555 and compare it to the top or bottom half of the record file, then keep searching until I find the line I need, grab the entire line, then substring the 99 out. The problem I have is that I can't get a proper integer conversion of the file because it contains both numbers AND text, so I can't properly use the <, >, = signs.
Any help on this would be greatly appreciated. The program I currently have actually works but is incredibly slow at times.
As one possible solution (not necessarily the best), you can add your record IDs to a Dictionary<string, int> (or even a Dictionary<long, int> if all record IDs are numeric) where each key is the ID of one line and each value is the line index. When you need to look up a particular record, just look in the dictionary (it'll do an efficient lookup for you) and it gives you the line number. If the item is not there (a non-existent ID), you won't find it in the dictionary.
At this point, if the record ID exists in the file, you have a line number - you can either load the entire file into memory (if it's not too big) or just seek to the right line and read in the line with the data.
For this to work, you have to go through the file at least once and collect all the record IDs from all lines and add them to the dictionary. You won't have to implement the binary search - the dictionary will internally perform the lookup for you.
Edit:
If you don't need all the data from a particular line, just one bit (like the location code you mentioned), you don't even need to store the line number (since you won't need to go back to the line in the file) - just store the location data as the value in the dictionary.
I personally would still store the line index because, in my experience, such projects start out small but end up collecting features and there'll be a point where you'll have to have everything from the file. If you expect this to be the case over time, just parse data from each line into a data structure and store that in the dictionary - it'll make your future life simpler. If you're very sure you'll never need more data than the one bit of information, you can just stash the data itself in the dictionary.
Here's a simple example (assuming that your record IDs can be parsed into a long):
public class LineData
{
    public int LineIndex { get; set; }
    public string LocationCode { get; set; }
    // other data from the line that you need
}

// ...
// declare your map
private Dictionary<long, LineData> _dataMap = new Dictionary<long, LineData>();

// ...
// Read file, parse lines into LineData objects and put them in dictionary
// ...
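For instance, the loading loop might look like this (a sketch: it assumes the 123456-99-Text layout from the question, with the account number in the first six characters, and the file name is illustrative):
int lineIndex = 0;
foreach (string line in File.ReadLines("file.txt"))
{
    _dataMap[long.Parse(line.Substring(0, 6))] = new LineData
    {
        LineIndex = lineIndex++,
        LocationCode = line.Substring(7, 2) // the "99" part, if the dash sits at index 6
    };
}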
To see if a record ID exists, you just call TryGetValue():
LineData lineData;
if (_dataMap.TryGetValue(recordID, out lineData))
{
    // record ID was found
}
This approach essentially keeps the entire file in memory but all data is parsed only once (at the beginning, during building the dictionary). If this approach uses too much memory, just store the line index in the dictionary and then go back to the file if you find a record and parse the line on the fly.
You cannot really do a binary search against file.ReadLine because you have to be able to access the lines in a different order. Instead you should read the whole file into memory (File.ReadAllLines would be an option).
Assuming your file is sorted by the substring, you can create a new class that implements IComparer
public class SubstringComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        return x.Substring(0, 6).CompareTo(y.Substring(0, 6));
    }
}
and then your binary search would look like:
int returnedValue = foundStrings.BinarySearch(searchValue, new SubstringComparer());
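When nothing matches, BinarySearch returns a negative number (the bitwise complement of the would-be insertion point), so the lookup could go along these lines (a sketch; foundStrings is the sorted List<string> read from the file and actnum is the six-character search string from the question):
int index = foundStrings.BinarySearch(actnum, new SubstringComparer());
if (index >= 0)
{
    // Hit: pull the location code out of the matched line
    string code = foundStrings[index].Substring(7, 2); // assuming the 123456-99-Text layout
}
else
{
    // Miss: no record starts with that account number
}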
Assuming the file doesn't change often, then you can simply load the entire file into memory using a structure that handles the searching in faster time. If the file can change then you will need to decide on a mechanism for reloading the file, be it restarting the program or a more complex process.
It looks like you are looking for exact matches (searching for 123456 yields only one record which is labelled 123456). If that is the case then you can use a Dictionary. Note that to use a Dictionary you need to define key and value types. It looks like in your case they would both be string.
While I did not find a way to do a better type of search, I did manage to learn about embedded resources which considerably sped up the program. Scanning the entire file takes a fraction of a second now, instead of 5-10 seconds. Posting the following code:
string searchfor = textBox1.Text;
Assembly assm = Assembly.GetExecutingAssembly();
using (Stream datastream = assm.GetManifestResourceStream("WindowsFormsApplication2.Resources.file1.txt"))
using (StreamReader reader = new StreamReader(datastream))
{
    string lines;
    while ((lines = reader.ReadLine()) != null)
    {
        if (lines.StartsWith(searchfor))
        {
            label1.Text = "Found";
            break;
        }
        else
        {
            label1.Text = "Not found";
        }
    }
}

Read multiple lines with StreamReader using StreamReader.Peek

Let's say I have the following file format (Key value pairs):
Object1Key: Object1Value
Object2Key: Object2Value
Object3Key: Object3Value
Object4Key: Object4Value1
Object4Value2
Object4Value3
Object5Key: Object5Value
Object6Key: Object6Value
I'm reading this line by line with StreamReader. For objects 1, 2, 3, 5 and 6 it wouldn't be a problem, because the whole object is on one line, so it's possible to process the object.
But for object 4 I need to process multiple lines. Can I use Peek for this? (MSDN for Peek: Returns the next available character but does not consume it.) Is there a method like Peek which returns the next line rather than the next character?
If I can use Peek, then my question is: can I use Peek two times so I can read the next two lines (or 3), until I know there is a new object (object 5) to be processed?
I would strongly recommend that you separate the IO from the line handling entirely.
Instead of making your processing code use a StreamReader, pass it either an IList<string> or an IEnumerable<string>... if you use IList<string> that will make it really easy to just access the lines by index (so you can easily keep track of "the key I'm processing started at line 5" or whatever), but it would mean either doing something clever or reading the whole file in one go.
If it's not a big file, then just using File.ReadAllLines is going to be the very simplest way of reading a file as a list of lines.
If it is a big file, use File.ReadLines to obtain an IEnumerable<string>, and then your processing code needs to be a bit smarter... for example, it might want to create a List<string> for each key that it processes, containing all the lines for that key - and let that list be garbage collected when you read the next key.
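As a sketch of that shape (assuming a line containing a colon starts a new object, as in the sample above; the names are illustrative):
using System.Collections.Generic;
using System.IO;

var objects = new Dictionary<string, List<string>>();
List<string> currentValues = null;

foreach (string line in File.ReadLines("input.txt"))
{
    int colon = line.IndexOf(':');
    if (colon >= 0) // "Key: first value" starts a new object
    {
        currentValues = new List<string> { line.Substring(colon + 1).Trim() };
        objects[line.Substring(0, colon)] = currentValues;
    }
    else if (currentValues != null) // continuation line belongs to the previous key
    {
        currentValues.Add(line.Trim());
    }
}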
There is no way to use Peek multiple times the way you are thinking, because it always returns only the "top" character in the stream. It reads the character but does not advance the stream's position; the pointer into the stream stays in the same place after Peek.
If you are using, for example, a FileStream, you can use Seek to go back, but you didn't specify what type of stream you are using.
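If you really want a line-level Peek, there is nothing built in, but a small wrapper that buffers one line ahead is easy to write (a sketch; for two or three lines of lookahead you would buffer a queue of lines instead of a single string):
using System.IO;

public class LineBufferedReader
{
    private readonly StreamReader _reader;
    private string _buffered;

    public LineBufferedReader(StreamReader reader) { _reader = reader; }

    // Returns the next line without consuming it (null at end of stream).
    public string PeekLine()
    {
        if (_buffered == null)
            _buffered = _reader.ReadLine();
        return _buffered;
    }

    // Returns the buffered line if PeekLine was called, otherwise reads a new one.
    public string ReadLine()
    {
        if (_buffered != null)
        {
            string line = _buffered;
            _buffered = null;
            return line;
        }
        return _reader.ReadLine();
    }
}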
You could do something like this:
List<MyObject> objects = new List<MyObject>();
using (StreamReader sr = new StreamReader(aPath))
{
    MyObject curObj = null; // initialized so the compiler knows it's assigned
    while (!sr.EndOfStream)
    {
        string line = sr.ReadLine();
        if (line.IndexOf(':') >= 0) // or whatever identifies the beginning of a new object
        {
            curObj = new MyObject(line);
            objects.Add(curObj);
        }
        else
        {
            curObj.AddAttribute(line); // a continuation line: attach it to the current object
        }
    }
}

C# Saving "X" times into one .txt file without overwriting last string

Well, now I have a new problem.
I'm writing code in C#.
I want to save from textBoxName into a group.txt file each time I enter a string into the textbox and click the save button. It should save in this order (if it's possible to sort it A-Z, that would be great):
1. Petar Milutinovic
2. Ljiljana Milutinovic
3. Stefan Milutinovic
4. ... etc
I can't get it to work; I tried to use techniques from my first question, and no solution yet :(
This is an easy one I guess, but I'm still a beginner and I need this badly...
Try to tackle this from a top-down approach. Write out what should happen, because it's not obvious from your question.
Example:
User enters a value in a (single-line?) textbox
User clicks Save
One new line is appended to the end of a file, with the contents of the textbox in step 1
Note: each line is prefixed with a line number, in the form "X. Sample" where X is the line number and Sample is the text from the textbox.
Is the above accurate?
(If you just want to add a line to a text file, see http://msdn.microsoft.com/en-us/library/ms143356.aspx - File.AppendAllText(filename, myTextBox.Text + Environment.NewLine); may be what you want)
Here's a simple little routine you can use to read, sort, and write the file. There are loads of ways this can be done, mine probably isn't even the best. Even now I'm thinking "I could have written that using a FileStream and done the iteration for counting then", but they're micro-optimizations that can be done later if you have performance issues with multi-megabyte files.
public static void AddUserToGroup(string userName)
{
    // Read the users from the file
    List<string> users = File.ReadAllLines("group.txt").ToList();
    // Strip out the index number
    users = users.Select(u => u.Substring(u.IndexOf(". ") + 2)).ToList();
    users.Add(userName); // Add the new user
    users.Sort((x, y) => x.CompareTo(y)); // Sort
    // Reallocate the number
    for (int i = 0; i < users.Count; i++)
    {
        users[i] = (i + 1).ToString() + ". " + users[i];
    }
    // Write to the file again
    File.WriteAllLines("group.txt", users);
}
If you need the file to be sorted every time a new line is added, you'll either have to load the file into a list, add the line, and sort it, or use some sort of search (I'd recommend a binary search) to determine where the new line belongs and insert it accordingly. The second approach doesn't have many advantages, though, as you basically have to rewrite the entire file in order to insert a line - it only saves you time in the best case scenario, which occurs when the line to be inserted falls at the end of the file. Additionally, the second method is a bit lighter on the processor, as you aren't attempting to sort every line - for small files however, the difference will be unnoticeable.
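If you do go the search route, note that List<T>.BinarySearch already hands you the insertion point when the item is missing, as the bitwise complement of the return value. A sketch, assuming the file holds plain names without the "1. " prefixes (strip and re-add them as in the routine above), with newName as the entry to add:
List<string> users = File.ReadAllLines("group.txt").ToList();

int index = users.BinarySearch(newName);
if (index < 0)
    index = ~index; // complement of the return value = where the name belongs

users.Insert(index, newName);
File.WriteAllLines("group.txt", users);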
