Using C# how can I split a text file into multiple files - c#

How can I split a text file that contains the ASCII control characters SOH and ETX into multiple files?
For example, the text file I have, named 001234.txt, contains the following content:
SOH{ABCDXZY}ETX
SOH{ABCDXZY}ETX
SOH{ABCDXZY}ETX
I would like to split the single text file into multiple text files, one for each block that starts with SOH and ends with ETX.
The output files should be named 101234.txt, 111234.txt, etc., and each should contain a single block that starts with SOH and ends with ETX.
I appreciate any help.
using System.IO;
using System.Linq;
namespace ASCII_Split
{
class Program
{
static void Main(string[] args)
{
var txt = "";
const char soh = (char)1;
const char eox = (char)3;
var count = 1;
var pathToFile = @"C:\Temp\00599060.txt";
using (var sr = new StreamReader(pathToFile))
txt = sr.ReadToEnd();
while (txt.Contains(soh))
{
var outfil = Path.Combine(Path.GetDirectoryName(pathToFile), count.ToString("000"), "_fix.txt");
var eInd = txt.IndexOf(eox);
using (var sw = new StreamWriter(outfil, false))
{
sw.Write(txt.Substring(1, eInd - 1));
}
txt = txt.Substring(eInd + 1);
count++;
}
}
}
}

This should more or less do the trick:
//Read all text from file into a string
var fileContent = File.ReadAllText("001234.txt");
//match each block from SOH to ETX; this assumes the literal text "SOH" and "ETX"
//(for the actual control characters, use @"\x01.*?\x03" instead)
var pattern = @"SOH.*?ETX";
var blocks = Regex.Matches(fileContent, pattern, RegexOptions.Singleline);
//counter for file names
var counter = 10;
foreach(Match block in blocks)
{
//create file and use stream to write to it
using (var stream = File.Create($"{counter++}1234.txt"))
{
var contentAsBytes = new UTF8Encoding(true).GetBytes(block.Value);
stream.Write(contentAsBytes, 0, contentAsBytes.Length);
}
}

Provided by SOH and ETX you mean the respective control characters, this here should get you on your way:
var txt = "";
const char soh = (char) 1;
const char eox = (char) 3;
var count = 1;
var pathToFile = @"C:\00_Projects_temp\test.txt";
using (var sr = new StreamReader(pathToFile))
txt = sr.ReadToEnd();
while (txt.Contains(soh))
{
var outfil = Path.Combine(Path.GetDirectoryName(pathToFile), count.ToString("000"), "_test.txt");
var eInd = txt.IndexOf(eox);
using (var sw = new StreamWriter(outfil, false))
{
sw.Write(txt.Substring(1, eInd - 1));
}
txt = txt.Substring(eInd + 1);
count++;
}

Thank you LocEngineer, the program works. I made a small change to concatenate the file name with the counter using "+" instead of ",".
using System.IO;
using System.Linq;
namespace ASCII_Split
{
class Program
{
static void Main(string[] args)
{
var txt = "";
const char soh = (char)1;
const char eox = (char)3;
var count = 1;
var pathToFile = @"C:\Temp\00599060.txt";
using (var sr = new StreamReader (pathToFile))
txt = sr.ReadToEnd();
if (txt.IndexOf(soh) != txt.LastIndexOf(soh))
{
while (txt.Contains(soh))
{
var outfil = Path.Combine(Path.GetDirectoryName(pathToFile), count.ToString("00") + Path.GetFileName(pathToFile));
var eInd = txt.IndexOf(eox);
using (var sw = new StreamWriter(outfil, false))
{
sw.Write(txt.Substring(1, eInd - 1));
}
txt = txt.Substring(eInd + 1);
count++;
}
File.Move((pathToFile), (pathToFile) + ".org");
}
}
}
}
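The loops above rely on Substring(1, ...), which assumes that every block starts at the very first character of the remaining text. If the blocks are separated by line breaks, the second and later outputs would pick up the line break and the SOH character. The following is only a sketch of an alternative that searches for each SOH/ETX pair explicitly. The class and method names (SohEtxSplitter, ExtractRecords, SplitFile) are made up for illustration, and the output naming follows the counter-plus-file-name scheme from the last snippet.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class SohEtxSplitter
{
    const char Soh = (char)1; // start of heading
    const char Etx = (char)3; // end of text

    // Returns the text found between each SOH/ETX pair, in order.
    // Anything between an ETX and the next SOH (e.g. line breaks) is ignored.
    public static List<string> ExtractRecords(string text)
    {
        var records = new List<string>();
        int pos = 0;
        while (true)
        {
            int start = text.IndexOf(Soh, pos);
            if (start < 0) break;
            int end = text.IndexOf(Etx, start + 1);
            if (end < 0) break; // unterminated block: stop instead of throwing
            records.Add(text.Substring(start + 1, end - start - 1));
            pos = end + 1;
        }
        return records;
    }

    // Writes each record to "<NN><original file name>" next to the source file
    // (assumed naming scheme, e.g. 0100599060.txt, 0200599060.txt, ...).
    public static void SplitFile(string pathToFile)
    {
        var dir = Path.GetDirectoryName(Path.GetFullPath(pathToFile));
        var name = Path.GetFileName(pathToFile);
        var records = ExtractRecords(File.ReadAllText(pathToFile));
        for (int i = 0; i < records.Count; i++)
        {
            var outFile = Path.Combine(dir, (i + 1).ToString("00") + name);
            File.WriteAllText(outFile, records[i]);
        }
    }
}
```

Calling SohEtxSplitter.SplitFile(@"C:\Temp\00599060.txt") would then write one file per block. Keeping the parsing in ExtractRecords makes it possible to check the splitting logic on a plain string without touching the disk.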

Related

StreamWriter: Starting and ending on a specific line number

I would like to ask for some tips and help with the reading/writing part of my C# program.
Situation:
I have to read a CSV file; - OK
If the CSV file name starts with "Load_", I want to write the data from line 2 to the last line to another CSV;
If the CSV file name starts with "RO_", I want to write to 2 different CSVs, one with lines 1 to 4 and the other with line 4 to the last line;
What I have so far is:
public static void ProcessFile(string[] ProcessFile)
{
// Keeps track of your current position within a record
int wCurrLine = 0;
// Number of rows in the file that constitute a record
const int LINES_PER_ROW = 1;
int ctr = 0;
foreach (string filename in ProcessFile)
{
var sbText = new System.Text.StringBuilder(100000);
int stop_line = 0;
int start_line = 0;
// Used for the output name of the file
var dir = Path.GetDirectoryName(filename);
var fileName = Path.GetFileNameWithoutExtension(filename);
var ext = Path.GetExtension(filename);
var folderbefore = Path.GetFullPath(Path.Combine(dir, @"..\"));
var lineCount = File.ReadAllLines(@filename).Length;
string outputname = folderbefore + "output\\" + fileName;
using (StreamReader Reader = new StreamReader(@filename))
{
if (filename.Contains("RO_"))
{
start_line = 1;
stop_line = 5;
}
else
{
start_line = 2;
stop_line = lineCount;
}
ctr = 0;
while (!Reader.EndOfStream && ctr < stop_line)
{
// Add the text
sbText.Append(Reader.ReadLine());
// Increment our current record row counter
wCurrLine++;
// If we have read all of the rows for this record
if (wCurrLine == LINES_PER_ROW)
{
// Add a line to our buffer
sbText.AppendLine();
// And reset our record row count
wCurrLine = 0;
}
ctr++;
} // end of the while
}
int total_lenght = sbText.Length;
// When all of the data has been loaded, write it to the output file in one fell swoop
using (StreamWriter Writer = new StreamWriter(dir + "\\" + "output\\" + fileName + "_out" + ext))
{
Writer.Write(sbText.ToString());
}
} // end of the foreach
} // end of ProcessFile
I was thinking about putting the IF/ELSE around the "using (StreamWriter Writer = new StreamWriter(dir + "\\" + "output\\" + fileName + "_out" + ext))" part. However, I am not sure how to tell StreamWriter to write only from/to a specific line number.
Any help is welcome! If I am missing some information, please let me know (I am pretty new on Stack Overflow).
Thank you.
The code is way too complicated. Try this:
using System.Collections.ObjectModel;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication57
{
class Program
{
static void Main(string[] args)
{
}
public static void ProcessFile(string[] ProcessFile)
{
foreach (string filename in ProcessFile)
{
// Used for the output name of the file
var dir = Path.GetDirectoryName(filename);
var fileName = Path.GetFileNameWithoutExtension(filename);
var ext = Path.GetExtension(filename);
var folderbefore = Path.GetFullPath(Path.Combine(dir, @"..\"));
var lineCount = File.ReadAllLines(@filename).Length;
string outputname = folderbefore + "output\\" + fileName;
using (StreamWriter Writer = new StreamWriter(dir + "\\" + "output\\" + fileName + "_out" + ext))
{
int rowCount = 0;
using (StreamReader Reader = new StreamReader(@filename))
{
string inputLine = "";
while ((inputLine = Reader.ReadLine()) != null)
{
rowCount++; // count each line as it is read
if (filename.Contains("RO_"))
{
if (rowCount <= 4)
{
Writer.WriteLine(inputLine);
}
if (rowCount == 4) break;
}
else
{
if (rowCount >= 2)
{
Writer.WriteLine(inputLine);
}
}
} // end of the while
Writer.Flush();
}
}
} // end of the foreach
} // end of ProcessFile
}
}
You can use LINQ to Take and Skip lines.
public abstract class CsvProcessor
{
private readonly IEnumerable<string> processFiles;
public CsvProcessor(IEnumerable<string> processFiles)
{
this.processFiles = processFiles;
}
protected virtual IEnumerable<string> GetAllLinesFromFile(string fileName)
{
using(var stream = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
using(var reader = new StreamReader(stream))
{
var line = String.Empty;
while((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
protected virtual void ProcessFiles()
{
var sb1 = new StringBuilder();
var sb2 = new StringBuilder();
foreach(var file in this.processFiles)
{
var fileName = Path.GetFileNameWithoutExtension(file);
var lines = GetAllLinesFromFile(file);
if(fileName.StartsWith("RO_", StringComparison.InvariantCultureIgnoreCase))
{
sb1.AppendLine(String.Join(Environment.NewLine, lines.Take(4))); //take only the first four lines
sb2.AppendLine(String.Join(Environment.NewLine, lines.Skip(4).TakeWhile(s => !String.IsNullOrEmpty(s)))); //skip the first four lines, take everything up to the first empty line
}
else if(fileName.StartsWith("Load_", StringComparison.InvariantCultureIgnoreCase))
{
sb2.AppendLine(String.Join(Environment.NewLine, lines.Skip(1).TakeWhile(s => !String.IsNullOrEmpty(s))));
}
}
// now write your StringBuilder objects to file...
}
protected virtual void WriteFile(StringBuilder sb1, StringBuilder sb2)
{
// ... etc..
}
}
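As a rough illustration of the Skip/Take idea, here is a sketch that writes the selected line ranges straight to output files. The names CsvLineSplitter, LineRange and Process, and the "_head", "_rest" and "_out" suffixes, are invented for this example. It also assumes that the "RO_" case means lines 1 to 4 in one file and line 5 onward in the other; if line 4 should appear in both files, start the second range at 4.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class CsvLineSplitter
{
    // Returns lines firstLine..lastLine (1-based, inclusive).
    public static IEnumerable<string> LineRange(IEnumerable<string> lines, int firstLine, int lastLine)
    {
        return lines.Skip(firstLine - 1).Take(lastLine - firstLine + 1);
    }

    // Splits one CSV file according to its name prefix and writes the
    // result(s) into outputDir (hypothetical parameter).
    public static void Process(string inputPath, string outputDir)
    {
        var name = Path.GetFileNameWithoutExtension(inputPath);
        var ext = Path.GetExtension(inputPath);
        var lines = File.ReadAllLines(inputPath);
        Directory.CreateDirectory(outputDir); // no-op if it already exists

        if (name.StartsWith("RO_", StringComparison.OrdinalIgnoreCase))
        {
            // Assumption: lines 1-4 go to one file, line 5 onward to the other.
            File.WriteAllLines(Path.Combine(outputDir, name + "_head" + ext), LineRange(lines, 1, 4));
            File.WriteAllLines(Path.Combine(outputDir, name + "_rest" + ext), LineRange(lines, 5, lines.Length));
        }
        else if (name.StartsWith("Load_", StringComparison.OrdinalIgnoreCase))
        {
            // Everything except the first line.
            File.WriteAllLines(Path.Combine(outputDir, name + "_out" + ext), LineRange(lines, 2, lines.Length));
        }
    }
}
```

File.ReadAllLines loads the whole file into memory, which keeps the sketch short. For very large files, File.ReadLines could be used instead to stream the lines.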

Remove Non-ASCII characters from XML file C#

I am trying to write a program that opens an XML file containing non-ASCII characters, replaces those characters with spaces, and saves and closes the file.
That's basically it: just open the file, remove all the non-ASCII characters, and save/close the file.
Here is my code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.IO;
using System.Text.RegularExpressions;
namespace RemoveSpecial
{
class Program
{
static void Main(string[] args)
{
string pth_input = string.Empty;
string pth_output = string.Empty;
for (int i = 1; i < args.Length; i++)
{
//input one
string p_input = args[0];
pth_input = p_input;
pth_input = pth_input.Replace(@"\", @"\\");
//output
string p_output = args[2];
pth_output = p_output;
pth_output = pth_output.Replace(@"\", @"\\");
}
//s = Regex.Replace(s, @"[^\u0000-\u007F]+", string.Empty);
string lx;
using (StreamReader sr = new StreamReader(pth_input))
{
using (StreamWriter x = new StreamWriter(pth_output))
{
while ((lx = sr.ReadLine()) != null)
{
string text = sr.ReadToEnd();
Regex.Replace(text, @"[^\u0000-\u007F]+", "", RegexOptions.Compiled);
x.Write(text);
} sr.Close();
}
}
}
}
}
Thanks in advance guys.
According to documentation, the first string is an input parameter (and not passed by reference, so it could not change anyway). The result of the replacement is in the return value, like so:
var result = Regex.Replace(text, @"[^\u0000-\u007F]+", "", RegexOptions.Compiled);
x.Write(result);
Note that RegexOptions.Compiled might decrease performance here. It makes sense only if you reuse the same regular expression instance on multiple strings. You can still do that if you create the Regex instance outside of the loop:
var regex = new Regex(@"[^\u0000-\u007F]+", RegexOptions.Compiled);
using (var sr = new StreamReader(pth_input))
{
using (var x = new StreamWriter(pth_output))
{
while ((lx = sr.ReadLine()) != null)
{
// process the line that was just read; calling ReadToEnd here would skip it
var result = regex.Replace(lx, String.Empty);
x.WriteLine(result);
}
}
}
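Since the question asks for the characters to be replaced with spaces rather than removed, a variant along these lines might be closer to the goal. This is a sketch only, and the names AsciiCleaner, ReplaceNonAscii and CleanFile are hypothetical. It replaces each non-ASCII UTF-16 code unit with one space, so a character stored as a surrogate pair becomes two spaces.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class AsciiCleaner
{
    // Created once and reused, so RegexOptions.Compiled can pay off.
    static readonly Regex NonAscii = new Regex(@"[^\u0000-\u007F]", RegexOptions.Compiled);

    // Returns a copy of text with every non-ASCII code unit replaced by a space.
    public static string ReplaceNonAscii(string text)
    {
        return NonAscii.Replace(text, " ");
    }

    // Reads the whole input file, cleans it, and writes the result.
    public static void CleanFile(string inputPath, string outputPath)
    {
        File.WriteAllText(outputPath, ReplaceNonAscii(File.ReadAllText(inputPath)));
    }
}
```

Note that File.ReadAllText detects the encoding from a byte order mark and otherwise assumes UTF-8. If the XML file uses a different encoding, it would have to be passed explicitly.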

Csharp substring text and add it to list

I have a file.txt like:
EDIT: I didn't mention this, but I guess it is important: file.txt can contain other lines too!
folder=c:\user;c:\test;c:\something;
I need to add each path as one list item (List<string> Folders).
So my list should look like:
Folders[0] = c:\user
Folders[1] = c:\test
etc. (without the text "folder=", which starts the line in file.txt, and without ";", which marks the end of a path).
The file can contain many more paths.
I did something like this:
using (FileStream fss = new FileStream(path, FileMode.Open))
{
using (StreamReader sr = new StreamReader(fss))
{
while (sr.EndOfStream == false)
{
string line = sr.ReadLine();
if(line.StartsWith("folders"))
{
int index = line.IndexOf("=");
int index1 = line.IndexOf(";");
string folder = line.Substring(index + 1, index1 - (index + 1));
Folders.Add(folder);
Now I have the first path in the list Folders, but what now? I can't get any further :(
using(var sr = new StreamReader(path))
{
var folders = sr.ReadToEnd()
.Split(new char[]{';','\n','\r'}, StringSplitOptions.RemoveEmptyEntries)
.Select(o => o.Replace("folder=",""))
.ToArray();
Folders.AddRange(folders);
}
You can try the following code, using File.ReadAllText:
string Filepath = @"c:\abc.txt";
string filecontent = File.ReadAllText(Filepath);
string startingString = "=";
var startIndex = filecontent.IndexOf(startingString);
filecontent = filecontent.Substring(startIndex + 1, filecontent.Length - startIndex - 2);
List<String> folders = filecontent.Split(';').ToList();
Here's a simple example:
List<String> Folders = new List<string>();
private void button1_Click(object sender, EventArgs e)
{
string path = @"C:\Users\mikes\Documents\SomeFile.txt";
string folderTag = "folder=";
using (FileStream fss = new FileStream(path, FileMode.Open))
{
using (StreamReader sr = new StreamReader(fss))
{
while (!sr.EndOfStream)
{
string line = sr.ReadLine();
if (line.StartsWith(folderTag))
{
line = line.Substring(folderTag.Length); // remove the folderTag from the beginning
Folders.AddRange(line.Split(";".ToCharArray(), StringSplitOptions.RemoveEmptyEntries));
}
}
}
}
foreach(string folder in Folders)
{
Console.WriteLine(folder);
}
}
I'd use this approach if you're going to read line by line, and do something else based on what each line starts with. In that case you could add different else if(...) blocks:
if (line.StartsWith(folderTag))
{
line = line.Substring(folderTag.Length); // remove the folderTag from the beginning
Folders.AddRange(line.Split(";".ToCharArray(), StringSplitOptions.RemoveEmptyEntries));
}
else if(line.StartsWith("parameters="))
{
// do something different with a line starting with "parameters="
}
else if (line.StartsWith("unicorns="))
{
// do something else different with a line starting with "unicorns="
}

Load items from file and split them into array

I have to write a C# Windows Forms program that loads an ID number and a DNA sequence made of 20 letters from a file. The data looks something like the sample below.
//Edit: I'll try to explain it better. It's a C# Forms program that has to load 20 people from a town (a file) with their DNA, ID number and name. After that I have to load a single DNA sequence from a file, without a name or ID number (this is the murderer; the program is a CSI game: you have a town with 20 people, someone commits a murder and I have to find him). Then I have to COMPARE the single DNA sequence with all 20 DNA sequences, compute a match percentage, and find the murderer.
1;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A
2;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A
3;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A
...
The file has 20 lines.
I've tried this so far, but it doesn't work:
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.IO;
namespace CSI_Marconi_FORM
{
public partial class DNAabitanti : Form
{
public DNAabitanti()
{
InitializeComponent();
}
private void DNAabitanti_Load(object sender, EventArgs e)
{
StreamReader reader = new StreamReader(@"\\Repo\Studenti$\Informatica\SezCi\4Ci\Corneliu.Cotet\Documenti\Visual Studio 2012\Projects\CSI Marconi FORM\CSI Marconi FORM\bin\Debug\DNAabitanti.txt");
reader = File.OpenText(@"\\Repo\Studenti$\Informatica\SezCi\4Ci\Corneliu.Cotet\Documenti\Visual Studio 2012\Projects\CSI Marconi FORM\CSI Marconi FORM\bin\Debug\DNAabitanti.txt");
FormPrincipale.utenti = File.ReadAllLines(@"\\Repo\Studenti$\Informatica\SezCi\4Ci\Corneliu.Cotet\Documenti\Visual Studio 2012\Projects\CSI Marconi FORM\CSI Marconi FORM\bin\Debug\DNAabitanti.txt").Length;
string abitanti = reader.ReadToEnd();
richTextBox1.Text = abitanti;
reader.Close();
FormPrincipale.database = new FormPrincipale.Persona[FormPrincipale.utenti];
FormPrincipale.corrispondenze = new int [FormPrincipale.utenti];
for (int i = 0; i < FormPrincipale.utenti; i++)
{
string letto = "";
letto = reader.ReadToEnd();
string[] aus = letto.Split(new char[] { ';' });
FormPrincipale.database[i].dna = new string[20];
for (int j = 0; j < 22; j++)
{
if (j < 20)
{
FormPrincipale.database[i].dna[j] = aus[j];
}
if (j == 20)
{
FormPrincipale.database[i].nome = aus[j];
}
if (j == 21)
{
FormPrincipale.database[i].cognome = aus[j];
}
}
}
}
}
}
Try this:
You first have to replace all ';'s with spaces and then change the character after the first digit to ':'. That puts the whole string into the correct format.
string line;
System.IO.StreamReader file = new System.IO.StreamReader(@"d:\textFile.txt");
while ((line = file.ReadLine()) != null)
{
string output = "";
//replacing all ';' with space
output = line.Replace(";", " ");
StringBuilder sb = new StringBuilder(output);
//replacing character after number with ':'
sb[1] = ':';
output = sb.ToString();
MessageBox.Show(output);
}
file.Close();
Without seeing your code, try something like:
var myArray = myFileContents.Split(new char [] { '\n' });
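For the comparison step described in the question, a sketch along the following lines may help. It assumes each line has the form shown in the sample (an ID followed by the DNA letters, separated by ';') and defines the match percentage as the share of positions where the letters are equal. That definition, and the names Person, DnaMatcher, ParseLine and MatchPercent, are assumptions made for this example and are not taken from the original program.

```csharp
using System;
using System.Linq;

// One inhabitant: the ID from the first field and the DNA letters after it.
class Person
{
    public string Id;
    public string[] Dna;
}

static class DnaMatcher
{
    // Parses a line such as "1;A;C;G;...": first field is the ID, the rest is DNA.
    public static Person ParseLine(string line)
    {
        var fields = line.Split(';');
        return new Person { Id = fields[0], Dna = fields.Skip(1).ToArray() };
    }

    // Percentage of positions at which both sequences hold the same letter.
    // Only the overlapping part is compared if the lengths differ.
    public static double MatchPercent(string[] a, string[] b)
    {
        int length = Math.Min(a.Length, b.Length);
        if (length == 0) return 0;
        int same = 0;
        for (int i = 0; i < length; i++)
        {
            if (a[i] == b[i]) same++;
        }
        return 100.0 * same / length;
    }
}
```

With this in place, the people could be loaded with File.ReadAllLines(path).Select(DnaMatcher.ParseLine), and the suspect would be the person whose DNA has the highest MatchPercent against the sample.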

Counting number of words in a text file

I'm trying to count the number of words from a text file, namely this, to start.
This is a test of the word count program. This is only a test. If your
program works successfully, you should calculate that there are 30
words in this file.
I am using StreamReader to put everything from the file into a string, and then use the .Split method to get the number of individual words, but I keep getting the wrong value when I compile and run the program.
using System;
using System.IO;
class WordCounter
{
static void Main()
{
string inFileName = null;
Console.WriteLine("Enter the name of the file to process:");
inFileName = Console.ReadLine();
StreamReader sr = new StreamReader(inFileName);
int counter = 0;
string delim = " ,.";
string[] fields = null;
string line = null;
while(!sr.EndOfStream)
{
line = sr.ReadLine();
}
fields = line.Split(delim.ToCharArray());
for(int i = 0; i < fields.Length; i++)
{
counter++;
}
sr.Close();
Console.WriteLine("The word count is {0}", counter);
}
}
Try to use a regular expression, e.g.:
int count = Regex.Matches(input, @"\b\w+\b").Count;
This should work for you:
using System;
using System.IO;
class WordCounter
{
static void Main()
{
string inFileName = null;
Console.WriteLine("Enter the name of the file to process:");
inFileName = Console.ReadLine();
StreamReader sr = new StreamReader(inFileName);
int counter = 0;
string delim = " ,."; //maybe some more delimiters like ?! and so on
string[] fields = null;
string line = null;
while(!sr.EndOfStream)
{
line = sr.ReadLine();//each time you read a line you should split it into the words
line = line.Trim(); //Trim returns a new string, so the result has to be assigned
fields = line.Split(delim.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
counter+=fields.Length; //and just add how many of them there is
}
sr.Close();
Console.WriteLine("The word count is {0}", counter);
}
}
A couple hints.
What if you just have the sentence "hi" what would be your output?
Your counter calculation is: from 0 through fields.Length, increment counter. How are fields.Length and your counter related?
You're probably getting an off-by-one error; try something like this:
counter = 0;
while(!sr.EndOfStream)
{
line = sr.ReadLine();
fields = line.Split(delim.ToCharArray());
counter += fields.Length;
}
There is no need to iterate over the array to count the elements when you can get the number directly.
using System.IO;
using System;
namespace solution
{
class Program
{
static void Main(string[] args)
{
var readFile = File.ReadAllText(@"C:\test\my.txt");
var str = readFile.Split(new char[] { ' ', '\n'}, StringSplitOptions.RemoveEmptyEntries);
System.Console.WriteLine("Number of words: " + str.Length);
}
}
}
//Easy method using Linq to Count number of words in a text file
/// www.techhowdy.com
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace FP_WK13
{
static class Util
{
public static IEnumerable<string> GetLines(string yourtextfile)
{
TextReader reader = new StreamReader(yourtextfile);
string result = string.Empty;
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
reader.Close();
}
// Word Count
public static int GetWordCount(string str)
{
int words = 0;
string s = string.Empty;
var lines = GetLines(str);
foreach (var item in lines)
{
s = item.ToString();
words = words + s.Split(' ').Length;
}
return words;
}
}
}
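Several of the snippets above split on a single space without removing empty entries, so consecutive spaces or line breaks can inflate the count. The following sketch splits on all common whitespace characters and drops empty entries. The names WordCount, CountWords and CountWordsInFile are made up for this example, and a "word" is taken to be any run of non-whitespace characters, so "program." with its trailing period counts as one word.

```csharp
using System;
using System.IO;

static class WordCount
{
    static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    // Counts runs of non-whitespace characters in the given text.
    public static int CountWords(string text)
    {
        return text.Split(Whitespace, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    // Convenience wrapper that reads the whole file first.
    public static int CountWordsInFile(string path)
    {
        return CountWords(File.ReadAllText(path));
    }
}
```

Applied to the three-line sample text from the question, CountWords returns 30.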
