Reading char by char is best solution? - c#

I have to read some data from a file and save it to a database.
The file starts with some configuration lines, followed by the text "$START", which marks where my operations begin.
I was wondering how I can get all the information BEFORE this marker ("$START") and then proceed further.
I can't search line by line, because after the "$START" marker there is just one long line of about 16k characters.
I haven't worked with files for some time, so my question is: what is the best solution for this?

You could use string.Split().
string data = "...your text data";
string[] splitted = data.Split(new string[] { "$START" }, StringSplitOptions.None);
Then you have both sections of the data separated out and you can access them like this:
string configuration = splitted[0];
string payload = splitted[1]; // renamed: "data" is already declared above

What you consider best is the question, but this would certainly be easy to maintain and understandable. This sample uses regex:
string yourTextFileString = File.ReadAllText(@"filename.txt");
string textAfterStart = Regex.Replace(yourTextFileString, @"(.*)\$START(.*)", "$2", RegexOptions.Singleline);

ok, if I get it right you have a file like this:
IMPORTANT_CODE_$START_UNIMPORTANT_CODE
you can do
String yourTextFile = File.ReadAllText("filename.txt");
string importantText = yourTextFile.Substring(0, yourTextFile.IndexOf("$START"));
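But if the file is too big to load at once, read it in chunks instead:
string operation = String.Empty;
using (Stream s = new FileStream("filename.txt", FileMode.Open))
{
    byte[] buffer = new byte[1024];
    string text = String.Empty;
    int read;
    while ((read = s.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Decode only the bytes actually read; the rest of the buffer may hold stale data.
        // Note: a multi-byte UTF-8 character split across two reads will still decode incorrectly.
        text += Encoding.UTF8.GetString(buffer, 0, read);
        int markerPos = text.IndexOf("$START");
        if (markerPos >= 0)
        {
            operation = text.Substring(0, markerPos);
            break;
        }
    }
}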
Console.WriteLine("Your operation: {0}", operation);
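To address the title question directly: reading char by char through a StreamReader is not necessarily slow, because the reader buffers the underlying file internally. Below is a minimal sketch of that approach, assuming the marker appears at most once in the part you care about. The names MarkerReader and ReadUntilMarker are hypothetical, not from any library. A side benefit is that the reader is left positioned immediately after the marker, so the long data line can be read next.

```csharp
using System;
using System.IO;
using System.Text;

public static class MarkerReader
{
    // Hypothetical helper: returns everything before 'marker', or null if the
    // marker never appears. On success the reader is positioned just after the marker.
    public static string ReadUntilMarker(TextReader reader, string marker)
    {
        var header = new StringBuilder();
        int c;
        while ((c = reader.Read()) != -1)
        {
            header.Append((char)c);
            if (EndsWith(header, marker))
            {
                // Drop the marker itself from the collected text
                header.Length -= marker.Length;
                return header.ToString();
            }
        }
        return null; // reached end of input without seeing the marker
    }

    // True when the last marker.Length characters of 'sb' equal 'marker'
    private static bool EndsWith(StringBuilder sb, string marker)
    {
        if (sb.Length < marker.Length) return false;
        int offset = sb.Length - marker.Length;
        for (int i = 0; i < marker.Length; i++)
        {
            if (sb[offset + i] != marker[i]) return false;
        }
        return true;
    }
}
```

With a file, this would be called as MarkerReader.ReadUntilMarker(reader, "$START") inside a using (var reader = new StreamReader("filename.txt")) block, followed by reader.ReadToEnd() or further chunked reads for the data after the marker.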

Related

Using StreamReader split method to create a txt file from a CSV file

I want to extract each string between the first "" for each row and create a text file with it.
sample CSV:
number,season,episode,airdate,title,tvmaze link
1,1,1,13 Sep 05,"Pilot","https://www.tvmaze.com/episodes/991/supernatural-1x01-pilot"
2,1,2,20 Sep 05,"Wendigo","https://www.tvmaze.com/episodes/992/supernatural-1x02-wendigo"
3,1,3,27 Sep 05,"Dead in the Water","https://www.tvmaze.com/episodes/993/supernatural-1x03-dead-in-the-water"
4,1,4,04 Oct 05,"Phantom Traveler","https://www.tvmaze.com/episodes/994/supernatural-1x04-phantom-traveler"
5,1,5,11 Oct 05,"Bloody Mary","https://www.tvmaze.com/episodes/995/supernatural-1x05-bloody-mary"
Final result .txt file:
Pilot
Wendigo
Dead in the Water
Phantom Traveler
Bloody Mary
my function:
private void GetEpisodeNamesFromCSV()
{
using (StreamReader sr = new StreamReader(AppDir + "\\list.csv"))
{
string strResult = sr.ReadToEnd();
string[] result = strResult.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
File.WriteAllLines(AppDir + "\\list_generated_" + ShowTitel + ".txt", result);
}
}
I can't figure out how to properly Split the stream reader object, to only get the names on each Line. I'm very new to programming, and this site helped me immensely! But this problem is specific, and I couldn't find the answer myself. I appreciate any help.
EDIT:
I went with the CsvHelper solution suggested by @Jesús López:
// Create a List
List<string> episodeNames = new List<string>();
// Make sure there are no empty lines in the csv file
var lines = File.ReadAllLines(AppDir + "\\list.csv").Where(arg => !string.IsNullOrWhiteSpace(arg));
File.WriteAllLines(AppDir + "\\list.csv", lines);
// Open the file stream
var streamReader = File.OpenText(AppDir + "\\list.csv");
var csv = new CsvReader(streamReader, CultureInfo.InvariantCulture);
// Read the File
csv.Read();
// Read the Header
csv.ReadHeader();
// Create a string array with Header
string[] header = csv.Context.Reader.HeaderRecord;
// Select the column and get the Index
var columnExtracted = "title";
int extractedIndex = Array.IndexOf(header, columnExtracted);
// Read the file and fill the List
while (csv.Read())
{
string[] row = csv.Context.Reader.Parser.Record;
string column = row[extractedIndex];
episodeNames.Add(column);
}
// Convert the List to a string array
string[] result = episodeNames.ToArray();
//write the array to a text file
File.WriteAllLines(AppDir + "\\list.txt", result);
This is not so much help on StreamReader as it is on strings
If you are confident of the file layout and format as shown (and that it will be consistent), try this quick-and-dirty approach in a Console app:
string line;
while ((line = sr.ReadLine()) != null)
{
    // Skip blank lines; the next line is read in the loop condition, so this cannot loop forever
    if (line.Trim() == string.Empty) continue;
    var lineEntries = line.Split(',', StringSplitOptions.RemoveEmptyEntries);
    Console.WriteLine(lineEntries[4].Trim('"'));
}
Note that I offer this because of your statement "I am very new to programming" to show off methods string.Split() .Trim() (and check out .Join()) and how easy they make the basic logic of what you want to achieve.
Using a proper CSV reader is the best idea for a robust solution (plus data-integrity checking, exception handling etc), but there is a reciprocal danger of over-engineering, so if this code displays what you want/expect for a once-off learning experience, then go ahead and implement;-)
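One weakness of splitting on ',' is that a title containing a comma would be cut in two. Since the question asks for the text between the first pair of quotes on each row, a small sketch based on IndexOf avoids that. The names CsvTitles and FirstQuotedField are hypothetical, and the sketch assumes titles never contain escaped quotes (""); a real CSV parser handles those.

```csharp
using System;

public static class CsvTitles
{
    // Hypothetical helper: returns the text between the first pair of double
    // quotes on the line, or null if the line has no complete quoted field.
    public static string FirstQuotedField(string line)
    {
        int open = line.IndexOf('"');
        if (open < 0) return null;

        int close = line.IndexOf('"', open + 1);
        if (close < 0) return null;

        return line.Substring(open + 1, close - open - 1);
    }
}
```

The header row has no quotes, so it yields null and can be skipped before writing the results with File.WriteAllLines.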

Trimming degree symbol on C#

Can anyone tell me why this is not working:
string txt = "+0°1,0'";
string degree = txt.TrimEnd('°');
I am trying to separate the degrees on this string, but after this, what remains on degree is the same content of txt.
I am using C# in Visual Studio.
string.TrimEnd only removes the given characters from the end of the string. In your example, '°' isn't at the end.
For example:
string txt = "+0°°°°";
string degree = txt.TrimEnd('°');
// degree => "+0"
If you want to remove '°' and all the characters after it, you can do this:
string txt = "+0°1,0'";
string degree = txt.Remove(txt.IndexOf('°'));
// degree => "+0"
string txt = "+0°1,0'";
if(txt.IndexOf('°') >= 0) // Checking if character '°' exists in the string (IndexOf returns -1 when it is absent)
{
string withoutdegree = txt.Remove(txt.IndexOf('°'),1);
}
Another safe way of handling the same is using the String.Split method. You will not have to bother to verify the presence of the character in this case.
string txt = "+0°1,0'";
var str = txt.Split('°')[0]; // "+0"
string txt = "+01,0'";
var str = txt.Split('°')[0]; // "+01,0'"
You can use this to remove all the '°' symbols present in your string using String.Replace
string txt = "+0°1,0'°°";
var text = txt.Replace("°", ""); // +01,0'
Edit: Added a safe way to handle the OP's exact query.
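If the goal is to separate the degrees from the rest of the value, the IndexOf approach above can be wrapped so that both parts come back at once. This is only a sketch; DegreeText and TrySplitAtDegree are hypothetical names, and '\u00B0' is the degree sign written as an escape so the source file encoding does not matter.

```csharp
using System;

public static class DegreeText
{
    // Hypothetical helper: splits 'text' at the first degree sign.
    // Returns false (and leaves 'degrees' equal to the whole text) when
    // there is no degree sign in the string.
    public static bool TrySplitAtDegree(string text, out string degrees, out string rest)
    {
        int idx = text.IndexOf('\u00B0');
        if (idx < 0)
        {
            degrees = text;
            rest = string.Empty;
            return false;
        }

        degrees = text.Substring(0, idx);
        rest = text.Substring(idx + 1);
        return true;
    }
}
```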

How can I tell if there is an environment.newline at the end of StreamReader.Readline()

I am trying to read a text file line by line and create one line from multiple lines until the line read in has \r\n at the end. My data looks like this:
BusID|Comment1|Text\r\n
1010|"Cuautla, Inc. d/b/a 3 Margaritas VIII\n
State Lic. #40428210000 City Lic.#4042821P\n
9/26/14 9/14/14 - 9/13/15 $175.00\n
9/20/00 9/14/00 - 9/13/01 $575.00 New License"\r\n
1020|"7-Eleven Inc., dba 7-Eleven Store #20638\n
State Lic. #24111110126; City Lic. #2411111126P\n
SEND ISSUED LICENSES TO DALLAS, TX\r\n
I want the data to look like this:
BusID|Comment1|Text\r\n
1010|"Cuautla, Inc. d/b/a 3 Margaritas VIII State Lic. #40428210000 City Lic.#4042821P 9/26/14 9/14/14 - 9/13/15 $175.00 9/20/00 9/14/00 - 9/13/01 $575.00 New License"\r\n
1020|"7-Eleven Inc., dba 7-Eleven Store #20638 State Lic. #24111110126; City Lic. #2411111126P SEND ISSUED LICENSES TO DALLAS, TX\r\n
My code is like this:
FileStream fsFileStream = new FileStream(strInputFileName, FileMode.Open,
FileAccess.Read, FileShare.ReadWrite);
using (StreamReader srStreamRdr = new StreamReader(fsFileStream))
{
while ((strDataLine = srStreamRdr.ReadLine()) != null && !blnEndOfFile)
{
//code evaluation here
}
}
I have tried:
if (strDataLine.EndsWith(Environment.NewLine))
{
blnEndOfLine = true;
}
and
if (strDataLine.Contains(Environment.NewLine))
{
blnEndOfLine = true;
}
These do not see anything at the end of the string variable. Is there a way for me to tell the true end of line so I can combine these rows into one row? Should I be reading the file differently?
You cannot use the ReadLine method of the StreamReader for this, because it strips every kind of line terminator. Both \r\n and \n are removed from the input before the line is returned, so you can never know whether the characters removed were \r\n or just \n.
If the file is not really big, you can load everything into memory and split it into lines yourself:
// Load everything in memory
string fileData = File.ReadAllText(@"D:\temp\myData.txt");
// Split on the \r\n (I don't use Environment.NewLine because it
// respects the OS conventions and this could be wrong in this context)
string[] lines = fileData.Split(new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
// Now replace the remaining \n with a space
lines = lines.Select(x => x.Replace("\n", " ")).ToArray();
foreach(string s in lines)
Console.WriteLine(s);
EDIT
If your file is really big (like you say 3.5GB) then you cannot load everything in memory but you need to process it in blocks. Fortunately the StreamReader provides a method called ReadBlock that allows us to implement code like this
// Where we store the lines loaded from file
List<string> lines = new List<string>();
// Read a block of 10MB
char[] buffer = new char[1024 * 1024 * 10];
bool lastBlock = false;
string leftOver = string.Empty;
// Start the streamreader
using (StreamReader reader = new StreamReader(@"D:\temp\localtext.txt"))
{
// We exit when the last block is reached
while (!lastBlock)
{
// Read 10MB
int loaded = reader.ReadBlock(buffer, 0, buffer.Length);
// Exit if we have no more blocks to read (EOF)
if(loaded == 0) break;
// If we get fewer chars than the block size then
// we are on the last block
lastBlock = (loaded != buffer.Length);
// Create the string from the buffer
string temp = new string(buffer, 0, loaded);
// Prepare the working string by adding the remainder from the
// previous loop
string current = leftOver + temp;
// Search the last \r\n in the combined text, so that a \r\n split
// across two blocks is still found
int lastNewLinePos = current.LastIndexOf("\r\n");
if (lastNewLinePos > -1)
{
    // Save the incomplete part for the next loop
    leftOver = current.Substring(lastNewLinePos + 2);
    // Keep only the complete lines in the working string
    current = current.Substring(0, lastNewLinePos + 2);
}
// Process the lines
AddLines(current, lines);
}
}
void AddLines(string current, List<string> lines)
{
var splitted = current.Split(new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
lines.AddRange(splitted.Select(x => x.Replace("\n", " ")).ToList());
}
This code assumes that your file always ends with a \r\n and that you always get a \r\n inside a block of 10MB of text. More tests are needed with your actual data.
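A streaming alternative that avoids both assumptions is to read one character at a time and decide what each terminator means as it arrives. The sketch below is illustrative only: CrLfRecords and ReadRecords are hypothetical names. It treats \r\n as the end of a record and replaces a bare \n with a space, as the question asks. A lone \r that is not followed by \n is kept as ordinary text.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CrLfRecords
{
    // Hypothetical helper: yields one record per \r\n terminator and
    // turns every bare \n inside a record into a single space.
    public static IEnumerable<string> ReadRecords(TextReader reader)
    {
        var record = new StringBuilder();
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\r' && reader.Peek() == '\n')
            {
                reader.Read(); // consume the \n of the \r\n pair
                yield return record.ToString();
                record.Clear();
            }
            else if (c == '\n')
            {
                record.Append(' '); // bare \n: join with the next physical line
            }
            else
            {
                record.Append((char)c);
            }
        }

        // Emit a final record that was not terminated by \r\n
        if (record.Length > 0)
        {
            yield return record.ToString();
        }
    }
}
```

Because the records are yielded lazily, a large file can be processed with foreach over CrLfRecords.ReadRecords(new StreamReader(path)) and written straight to an output file without holding everything in memory.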
If what you have posted is exactly what's in the file, meaning the \r\n sequences are literally written out as text, you can use the following to unescape them:
strDataLine = strDataLine.Replace("\\r", "\r").Replace("\\n", "\n");
this will ensure you can now use Environment.NewLine in order to do your comparison as in:
if (strDataLine.Replace("\\r", "\r").Replace("\\n", "\n").EndsWith(Environment.NewLine))
{
blnEndOfLine = true;
}
You can just read all text by calling File.ReadAllText(path) and parse it in following way :
string input = File.ReadAllText(your_file_path);
string output = string.Empty;
input.Split(new[] { Environment.NewLine } , StringSplitOptions.RemoveEmptyEntries).
Skip(1).ToList().
ForEach(x =>
{
output += x.EndsWith("\\r\\n") ? x + Environment.NewLine
: x.Replace("\\n"," ");
});

Searching strings in txt file

I have a .txt file with a list of 174 different strings. Each string has an unique identifier.
For example:
123|this data is variable|
456|this data is variable|
789|so is this|
etc..
I wish to write a program in C# that will read the .txt file and display only one of the 174 strings when I specify the ID of the string I want. All the data in the file is variable, so only the ID can be used to pull the string. So instead of ending up with the example above, I get just one line.
eg just
123|this data is variable|
I seem to be able to write a program that pulls just the ID from the .txt file and not the entire string, or a program that merely reads the whole file and displays it. But I have yet to write one that does exactly what I need. HELP!
Well, the actual strings I get from the txt file have no '|'; they were just in the example. An example of a real string would be: 0111111(0010101), where the data in the brackets is variable. The brackets don't exist in the real string either.
namespace String_reader
{
class Program
{
static void Main(string[] args)
{
String filepath = @"C:\my file name here";
string line;
if(File.Exists(filepath))
{
StreamReader file = null;
try
{
file = new StreamReader(filepath);
while ((line = file.ReadLine()) !=null)
{
string regMatch = "ID number here"; //this is where it all falls apart.
Regex.IsMatch (line, regMatch);
Console.WriteLine (line);// When program is run it just displays the whole .txt file
}
}
}
finally{
if (file !=null)
file.Close();
}
}
Console.ReadLine();
}
}
}
Use a Regex. Something along the lines of Regex.Match("|"+inputString+"|",@"\|[ ]*\d+\|(.+?)\|").Groups[1].Value
Oh, I almost forgot; you'll need to replace the \d+ with the actual ID you want. Right now, that'll just get you the first one.
The "|" before and after the input string makes sure both the index and the value are enclosed in a | for all elements, including the first and last. There's ways of doing a Regex without it, but IMHO they just make your regex more complicated, and less readable.
Assuming you have path and id.
Console.WriteLine(File.ReadAllLines(path).Where(l => l.StartsWith(id + "|")).FirstOrDefault());
Use File.ReadAllLines to get a string array of lines, then split each line on the '|'.
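As a rough sketch of that suggestion, assuming the '|'-delimited layout from the example in the question (the names LineLookup and FindLineById are hypothetical):

```csharp
using System;
using System.Collections.Generic;

public static class LineLookup
{
    // Hypothetical helper: returns the first line whose first '|'-separated
    // field equals 'id', or null when no line matches.
    public static string FindLineById(IEnumerable<string> lines, string id)
    {
        foreach (string line in lines)
        {
            string[] fields = line.Split('|');
            if (fields[0].Trim() == id)
            {
                return line;
            }
        }
        return null;
    }
}
```

It could be called as LineLookup.FindLineById(File.ReadAllLines(path), "123"). Comparing the whole first field, rather than using StartsWith on the line, avoids an ID such as 12 matching a line that begins with 123.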
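You could use the Split method:
FileInfo info = new FileInfo("filename.txt");
string[] lines;
using (StreamReader reader = info.OpenText())
{
    // Split the file content into lines, ignoring empty ones
    lines = reader.ReadToEnd().Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
}
foreach (string line in lines)
{
    string[] fields = line.Split('|');
    int id = Convert.ToInt32(fields[0]);
    string text = fields[1];
}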
Read the data into a string
Split the string on "|"
Read the items 2 by 2: key:value,key:value,...
Add them to a dictionary
Now you can easily find your string with dictionary[key].
First load the whole file into a string.
then try this:
string s = "123|this data is variable| 456|this data is also variable| 789|so is this|";
int index = s.IndexOf("123", 0);
string temp = s.Substring(index,s.Length-index);
string[] splitStr = temp.Split('|');
Console.WriteLine(splitStr[1]);
hope this is what you are looking for.
private static IEnumerable<string> ReadLines(string fspec)
{
using (var reader = new StreamReader(new FileStream(fspec, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
}
var dict = ReadLines("input.txt")
.Select(s =>
{
var split = s.Split("|".ToArray(), 2);
return new {Id = Int32.Parse(split[0]), Text = split[1]};
})
.ToDictionary(kv => kv.Id, kv => kv.Text);
Please note that with .NET 4.0 you don't need this ReadLines function, because File.ReadLines is built in.
You can now work with that as any dictionary:
Console.WriteLine(dict[12]);
Console.WriteLine(dict[999]);
No error handling here, please add your own
You can use the Split method to divide the entire text into parts separated by '|'. The even-indexed elements will then be the IDs and the odd-indexed elements the strings.
StreamReader sr = new StreamReader(filename);
string text = sr.ReadToEnd();
string[] data = text.Split('|');
Then convert the data elements into parallel lists of IDs and strings. Find the index of the given ID with IDs.IndexOf(ID); the corresponding string is Strs[idx]:
List<int> IDs = new List<int>();
List<string> Strs = new List<string>();
for (int i = 0; i < data.Length - 1; i += 2)
{
    IDs.Add(int.Parse(data[i]));
    Strs.Add(data[i + 1]);
}
int idx = IDs.IndexOf(ID); // ID comes from the user's input; idx is -1 if it is not found
string answer = idx >= 0 ? Strs[idx] : null;

c# split line with .txt

Let's say I have 5 lines in a txt file called users.txt, and each line has the following information:
username:password
How would I go about splitting each line in the txt file and storing the username as one string and the password as another?
I have code to grab a random line, shown below. This code is also used in another part of my project, so I don't want it to be altered. I was thinking of calling another function after the line has been grabbed, but I have no idea how to split it on the ':'.
private static string GetRandomLine(string file)
{
List<string> lines = new List<string>();
Random rnd = new Random();
int i = 0;
try
{
if (File.Exists(file))
{
//StreamReader to read our file
StreamReader reader = new StreamReader(file);
//Now we loop through each line of our text file
//adding each line to our list
while (!(reader.Peek() == -1))
lines.Add(reader.ReadLine());
//Now we need a random number
i = rnd.Next(lines.Count);
//Close our StreamReader
reader.Close();
//Dispose of the instance
reader.Dispose();
//Now write out the random line to the TextBox
return lines[i].Trim();
}
else
{
//file doesn't exist so return nothing
return string.Empty;
}
}
catch (IOException ex)
{
MessageBox.Show("Error: " + ex.Message);
return string.Empty;
}
}
You should be able to use string.Split:
string line = GetRandomLine(file);
string[] parts = line.Split(':');
string user = parts[0];
string pass = parts[1];
That being said, you may also want to add error checking (i.e., make sure parts has 2 elements, etc.).
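A minimal sketch of that error checking, assuming the username itself never contains ':' (UserLine and TryParse are hypothetical names). Passing a count of 2 to Split keeps any further ':' characters inside the password.

```csharp
using System;

public static class UserLine
{
    // Hypothetical helper: splits "username:password" at the first ':'.
    // Returns false when the line is empty, has no ':' or has an empty username.
    public static bool TryParse(string line, out string user, out string pass)
    {
        user = null;
        pass = null;
        if (string.IsNullOrWhiteSpace(line)) return false;

        // At most two parts: everything after the first ':' stays in parts[1]
        string[] parts = line.Split(new[] { ':' }, 2);
        if (parts.Length != 2 || parts[0].Length == 0) return false;

        user = parts[0];
        pass = parts[1];
        return true;
    }
}
```

GetRandomLine returns string.Empty when the file is missing or unreadable, so checking the return value of TryParse also covers that case.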
This is much cleaner, and it handles cases where the password might contain ':' characters.
Of course, I would expect you to ensure that passwords are hashed rather than stored as plain text, and that the hashed passwords don't contain any ':' characters. But just in case they do, this is what would work.
A plain Split(':') would break such a password into more than two parts.
bool GetUsernamePassword(string line, ref string uname, ref string pwd)
{
int idx = line.IndexOf(':') ;
if (idx == -1)
return false;
uname = line.Substring(0, idx);
pwd = line.Substring(idx + 1);
return true;
}
string username_password = "username:password";
string uname = String.Empty;
string pwd = String.Empty;
if (!GetUsernamePassword(username_password, ref uname, ref pwd))
{
// Handle error: incorrect format
}
Console.WriteLine("{0} {1} {2}", username_password, uname, pwd);
By the way, having said the above, this won't work (like all the other solutions before this one) if the username contains ':'. But it will handle the case where the password contains ':'.
To split the string is simple:
string[] components = myUserAndPass.Split(':');
string userName = components[0];
string passWord = components[1];
Try to read the following stackoverflow pages:
C# Tokenizer - keeping the separators
Does C# have a String Tokenizer like Java's?
Use the Split() method.
For example, in this case
string[] info = lines[i].Split(':');
info[0] will have the username and info[1] will have the password.
Try something like this...
string []split = line.Split(':');
string username = split[0];
string pwd = split[1];
Reed Corpsey gave a nice answer already, so instead of giving another solution, I'd just like to make one comment about your code. You can use the using statement to handle calling the StreamReader's Close and Dispose methods for you. This way, if an error happens, you don't have to worry that the stream is left open.
Changing your code slightly would make it look like:
//StreamReader to read our file
using(StreamReader reader = new StreamReader(file))
{
//Now we loop through each line of our text file
//adding each line to our list
while (!(reader.Peek() == -1))
lines.Add(reader.ReadLine());
//Now we need a random number
i = rnd.Next(lines.Count);
}
//Now write out the random line to the TextBox
return lines[i].Trim();
