C# Comparing two files and exporting matching lines based on delimiter - c#

Here’s the scenario.
I have a text file (alpha) with a single column containing a bunch of items.
My second file is a CSV (delta) with 4 columns.
I need to compare alpha against delta and create a new file (omega): wherever an alpha item matches delta, only the first two columns of that delta line should be exported to a new .txt file.
Example:
(Alpha)
BeginID
(delta):
BeginID,Muchmore,Info,Exists
(Omega):
BeginID,Muchmore
This document will probably have 10k lines in it. Thanks for the help!

Here's a rough cut way of doing the task you need:
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string alphaFilePath = @"C:\Documents and Settings\Jason\My Documents\Visual Studio 2008\Projects\Compte Two Files\Compte Two Files\ExternalFiles\Alpha.txt";
List<string> alphaFileContent = new List<string>();
using (FileStream fs = new FileStream(alphaFilePath, FileMode.Open))
using(StreamReader rdr = new StreamReader(fs))
{
while(!rdr.EndOfStream)
{
alphaFileContent.Add(rdr.ReadLine());
}
}
string betaFilePath = @"C:\Beta.csv";
StringBuilder sb = new StringBuilder();
using (FileStream fs = new FileStream(betaFilePath, FileMode.Open))
using (StreamReader rdr = new StreamReader(fs))
{
while(! rdr.EndOfStream)
{
string[] betaFileLine = rdr.ReadLine().Split(',');
if (alphaFileContent.Contains(betaFileLine[0]))
{
// no space after the comma, so the output matches the expected "BeginID,Muchmore"
sb.AppendLine(String.Format("{0},{1}", betaFileLine[0], betaFileLine[1]));
}
}
}
using (FileStream fs = new FileStream(@"C:\Omega.txt", FileMode.Create))
using (StreamWriter writer = new StreamWriter(fs))
{
writer.Write(sb.ToString());
}
Console.WriteLine(sb.ToString());
}
}
}
Basically, it reads the txt file and puts its contents in a list. Then it reads the CSV file (assuming it has no header row) and matches the first column against the list, collecting the matching lines in a StringBuilder. In your code, you can replace the StringBuilder with writing directly to the new txt file.
EDIT: If you wish to have the code run in a button click, then put it in the button click handler (or a new routine and call that):
public void ButtonClick (Object sender, EventArgs e)
{
string alphaFilePath = @"C:\Documents and Settings\Jason\My Documents\Visual Studio 2008\Projects\Compte Two Files\Compte Two Files\ExternalFiles\Alpha.txt";
List<string> alphaFileContent = new List<string>();
using (FileStream fs = new FileStream(alphaFilePath, FileMode.Open))
using(StreamReader rdr = new StreamReader(fs))
{
while(!rdr.EndOfStream)
{
alphaFileContent.Add(rdr.ReadLine());
}
}
string betaFilePath = @"C:\Beta.csv";
StringBuilder sb = new StringBuilder();
using (FileStream fs = new FileStream(betaFilePath, FileMode.Open))
using (StreamReader rdr = new StreamReader(fs))
{
while(! rdr.EndOfStream)
{
string[] betaFileLine = rdr.ReadLine().Split(',');
if (alphaFileContent.Contains(betaFileLine[0]))
{
// no space after the comma, so the output matches the expected "BeginID,Muchmore"
sb.AppendLine(String.Format("{0},{1}", betaFileLine[0], betaFileLine[1]));
}
}
}
using (FileStream fs = new FileStream(@"C:\Omega.txt", FileMode.Create))
using (StreamWriter writer = new StreamWriter(fs))
{
writer.Write(sb.ToString());
}
}

I'd probably load alpha into a collection then open delta for read, while not EOF readline into a string, split, if collection.contains column 0 then write to omega.
Done...
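A minimal sketch of that approach, assuming the CSV has no quoted fields containing commas and that matching is exact and case-sensitive. The names OmegaExporter and Export are made up for illustration. A HashSet is used instead of a List so that each lookup stays cheap with 10k lines:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class OmegaExporter
{
    // Hypothetical helper: writes "col0,col1" for every delta line whose
    // first column appears in alpha. Returns the number of lines written.
    public static int Export(string alphaPath, string deltaPath, string omegaPath)
    {
        // Load the alpha items into a set for constant-time lookups.
        var keys = new HashSet<string>(File.ReadLines(alphaPath), StringComparer.Ordinal);
        int written = 0;

        using (var writer = new StreamWriter(omegaPath))
        {
            // Stream delta line by line instead of loading it all at once.
            foreach (string line in File.ReadLines(deltaPath))
            {
                string[] columns = line.Split(',');

                // Skip short lines so columns[1] cannot throw.
                if (columns.Length >= 2 && keys.Contains(columns[0]))
                {
                    writer.WriteLine(columns[0] + "," + columns[1]);
                    written++;
                }
            }
        }

        return written;
    }
}
```

Call it as OmegaExporter.Export(alphaPath, deltaPath, omegaPath) with your own three paths. If the alpha items may carry stray whitespace or differ in case, trim them or pass a different StringComparer.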

Related

CSV appears to be corrupt on Double quotes in Headers - C#

I was trying to read a CSV file in C#.
I tried File.ReadAllLines(path).Select(a => a.Split(';')), but it does not work when a cell contains a line break (\n).
So I tried the following:
using LumenWorks.Framework.IO.Csv;
var csvTable = new DataTable();
using (TextReader fileReader = File.OpenText(path))
using (var csvReader = new CsvReader(fileReader, false))
{
csvTable.Load(csvReader);
}
for (int i = 0; i < csvTable.Rows.Count; i++)
{
if (!(csvTable.Rows[i][0] is DBNull))
{
var row1= csvTable.Rows[i][0];
}
if (!(csvTable.Rows[i][1] is DBNull))
{
var row2= csvTable.Rows[i][1];
}
}
The issue is that the above code throws this exception:
The CSV appears to be corrupt near record '0' field '5 at position '63'
This is because the CSV header contains doubled double quotes, as below:
"Header1",""Header2""
Is there a way to ignore the double quotes and process the CSV?
update
I have tried with TextFieldParser as below
public static void GetCSVData()
{
using (var parser = new TextFieldParser(path))
{
parser.HasFieldsEnclosedInQuotes = false;
parser.Delimiters = new[] { "," };
while (parser.PeekChars(1) != null)
{
string[] fields = parser.ReadFields();
foreach (var field in fields)
{
Console.Write(field + " ");
}
Console.WriteLine(Environment.NewLine);
}
}
}
The output:
Sample CSV data I have used:
Any help is appreciated.
Hope this works!
Replace each pair of doubled double quotes in the CSV with a single one, as below:
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
{
StreamReader sr = new StreamReader(fs);
string contents = sr.ReadToEnd();
// replace "" with "
contents = contents.Replace("\"\"", "\"");
// go back to the beginning of the stream
fs.Seek(0, SeekOrigin.Begin);
// truncate the file first so that no stale bytes from the
// original (longer) contents are left behind after the rewrite
fs.SetLength(0);
StreamWriter sw = new StreamWriter(fs);
sw.Write(contents);
sw.Close();
}
Then use the same CSV helper
using LumenWorks.Framework.IO.Csv;
var csvTable = new DataTable();
using (TextReader fileReader = File.OpenText(path))
using (var csvReader = new CsvReader(fileReader, false))
{
csvTable.Load(csvReader);
}
Thanks.
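If you would rather not modify the file on disk, the same replacement can be done in memory and handed to the parser as a TextReader. This is only a sketch: CsvQuoteFixer, CollapseDoubledQuotes and OpenCleaned are hypothetical names, and the whole file is assumed to fit in memory.

```csharp
using System;
using System.IO;

public static class CsvQuoteFixer
{
    // Collapses every doubled double quote ("") into a single one (").
    public static string CollapseDoubledQuotes(string csvText)
    {
        return csvText.Replace("\"\"", "\"");
    }

    // Reads the file, fixes the quotes in memory, and returns a reader
    // over the cleaned text. The file on disk is left untouched.
    public static TextReader OpenCleaned(string path)
    {
        return new StringReader(CollapseDoubledQuotes(File.ReadAllText(path)));
    }
}
```

The returned reader can go wherever the code above uses File.OpenText(path), for example new CsvReader(CsvQuoteFixer.OpenCleaned(path), false). Be aware that a blanket replace also changes legitimately escaped quotes inside a field and turns an empty quoted field ("") into a lone quote.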

How to read binary files until EOF in C#

I have a function to write some data into a binary file
private void writeToBinFile (List<MyClass> myObjList, string filePath)
{
FileStream fs = new FileStream(filePath, FileMode.Create);
BinaryWriter bw = new BinaryWriter(fs);
foreach (MyClass myObj in myObjList)
{
bw.Write(JsonConvert.SerializeObject(myObj));
}
bw.Close();
fs.Close();
}
I am looking for something like
FileStream fs = new FileStream(filePath, FileMode.Create);
BinaryReader bw = new BinaryReader(fs);
while (!filePath.EOF)
{
List<MyClass> myObjList = br.Read(myFile);
}
Can anyone help with this?
Thanks in advance.
JSON can be saved with no formatting (no new lines), so you can save 1 record per row of a file. Thus, my suggested solution is to ignore binary files and instead use a regular StreamWriter:
private void WriteToFile(List<MyClass> myObjList, string filePath)
{
using (StreamWriter sw = File.CreateText(filePath))
{
foreach (MyClass myObj in myObjList)
{
// WriteLine, not Write: ReadFromFile expects one record per line
sw.WriteLine(JsonConvert.SerializeObject(myObj, Newtonsoft.Json.Formatting.None));
}
}
}
private List<MyClass> ReadFromFile(string filePath)
{
List<MyClass> myObjList = new List<MyClass>();
using (StreamReader sr = File.OpenText(filePath))
{
string line = null;
while ((line = sr.ReadLine()) != null)
{
myObjList.Add(JsonConvert.DeserializeObject<MyClass>(line));
}
}
return myObjList;
}
If you really want to use the binary writer to save JSON, you could change it to be like this:
private void WriteToBinFile(List<MyClass> myObjList, string filePath)
{
using (FileStream fs = new FileStream(filePath, FileMode.Create))
using (BinaryWriter bw = new BinaryWriter(fs))
{
foreach (MyClass myObj in myObjList)
{
bw.Write(JsonConvert.SerializeObject(myObj));
}
}
}
private List<MyClass> ReadFromBinFile(string filePath)
{
List<MyClass> myObjList = new List<MyClass>();
using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
using (BinaryReader br = new BinaryReader(fs))
{
while (fs.Length != fs.Position) // This will throw an exception for non-seekable streams (stream.CanSeek == false), but filestreams are seekable so it's OK here
{
myObjList.Add(JsonConvert.DeserializeObject<MyClass>(br.ReadString()));
}
}
return myObjList;
}
Notes:
I've added using around your stream instantiations so that the files are closed and disposed as soon as each block exits, even if an exception is thrown.
To check whether the stream is at the end, compare Length to Position.
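The reason br.ReadString() can find the record boundaries is that BinaryWriter.Write(string) stores a length prefix in front of each string. Here is a small round trip with plain strings (no JSON library needed) to show the idea. StringRecordFile and its methods are hypothetical names used only for this sketch:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StringRecordFile
{
    public static void WriteAll(string path, IEnumerable<string> records)
    {
        using (FileStream fs = new FileStream(path, FileMode.Create))
        using (BinaryWriter bw = new BinaryWriter(fs))
        {
            foreach (string record in records)
            {
                // Writes a length prefix followed by the encoded characters.
                bw.Write(record);
            }
        }
    }

    public static List<string> ReadAll(string path)
    {
        var records = new List<string>();
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (BinaryReader br = new BinaryReader(fs))
        {
            // FileStream is seekable, so Position and Length are available.
            while (fs.Position < fs.Length)
            {
                records.Add(br.ReadString());
            }
        }
        return records;
    }
}
```

Because each record carries its own length, the strings may contain line breaks, which the one-record-per-line text approach cannot handle.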

Move position in FileStream (C#)

I have a txt file like this
#header1
#header2
#header3
....
#headerN
ID Value Pvalue
a 0.1 0.002
b 0.2 0.002
...
My code will try to parse
FileStream fs = new FileStream(file, FileMode.Open, FileAccess.Read);
......
Table t = Table.Load(fs);
What I want is to move the start position of the stream to right before "ID", so I can feed the stream to the code and build a new table. But I am not sure what the correct way to do it is.
Thanks in advance
Ideally, you should convert Table.Load to take an IEnumerable<string> or at least a StreamReader, not a raw Stream.
If this is not an option, you can read the whole file into memory, skip its header, and write the result into MemoryStream:
MemoryStream stream = new MemoryStream();
// leaveOpen: true, otherwise disposing the writer would also close the MemoryStream
using (var writer = new StreamWriter(stream, Encoding.UTF8, 1024, true))
{
foreach (var line in File.ReadLines(fileName).SkipWhile(s => s.StartsWith("#")))
{
writer.WriteLine(line);
}
}
stream.Position = 0;
Table t = Table.Load(stream);
Try this code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication57
{
class Program
{
const string file = "";
static void Main(string[] args)
{
FileStream fs = new FileStream(file, FileMode.Open, FileAccess.Read);
StreamReader reader = new StreamReader(fs);
string inputline = "";
State state = State.FIND_HEADER;
while((inputline = reader.ReadLine()) != null)
{
switch (state)
{
case State.FIND_HEADER:
if (inputline.StartsWith("#header"))
{
state = State.READ_TABLE;
}
break;
case State.READ_TABLE:
Table t = Table.Load(fs);
break;
}
}
}
enum State
{
FIND_HEADER,
READ_TABLE
}
}
}

Append throwing an exception

I was trying to create a fixed-length (left-aligned) batch file with the code below.
When I use Append, it fails with the error "is a method but used like a type".
string batFilePath = @"c:\mockforbat.bat";
if (!File.Exists(batFilePath))
{
using (FileStream fs = File.Create(batFilePath))
{
fs.Close();
}
}
//write
using (StreamWriter sw = new File.AppendText(batFilePath))
{
string a = String.Format("{0,-24}{1,-5}{2,5}", "CostCenter", "CostObject", "ActivityType");
sw.WriteLine(@a);
}
Process process = Process.Start(batFilePath);
process.WaitForExit();
Could someone please tell me what I did wrong here?
Drop the new operator from this line
using (StreamWriter sw = new File.AppendText(batFilePath))
It should read
using (StreamWriter sw = File.AppendText(batFilePath))
string batFilePath = @"c:\mockforbat.bat";
// FileMode.Append creates the file if needed and writes at the end instead of overwriting from the start
using(var fs = new FileStream(batFilePath, FileMode.Append, FileAccess.Write))
{
using(var sw = new StreamWriter(fs))
{
string a = String.Format("{0,-24}{1,-5}{2,5}", "CostCenter", "CostObject", "ActivityType");
sw.WriteLine(a);
}
}

Create File If File Does Not Exist

I need my code to create the file if it doesn't exist, and otherwise append to it. Right now it does the opposite: if the file does exist, it creates it and appends. Here is the code:
if (File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
Would I do this?
if (! File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
Edit:
string path = txtFilePath.Text;
if (!File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
foreach (var line in employeeList.Items)
{
sw.WriteLine(((Employee)line).FirstName);
sw.WriteLine(((Employee)line).LastName);
sw.WriteLine(((Employee)line).JobTitle);
}
}
}
else
{
StreamWriter sw = File.AppendText(path);
foreach (var line in employeeList.Items)
{
sw.WriteLine(((Employee)line).FirstName);
sw.WriteLine(((Employee)line).LastName);
sw.WriteLine(((Employee)line).JobTitle);
}
sw.Close();
}
}
You can simply call
using (StreamWriter w = File.AppendText("log.txt"))
It will create the file if it doesn't exist and open the file for appending.
Edit:
This is sufficient:
string path = txtFilePath.Text;
using(StreamWriter sw = File.AppendText(path))
{
foreach (var line in employeeList.Items)
{
Employee e = (Employee)line; // cast once
sw.WriteLine(e.FirstName);
sw.WriteLine(e.LastName);
sw.WriteLine(e.JobTitle);
}
}
But if you insist on checking first, you can do something like this, but I don't see the point.
string path = txtFilePath.Text;
using (StreamWriter sw = (File.Exists(path)) ? File.AppendText(path) : File.CreateText(path))
{
foreach (var line in employeeList.Items)
{
sw.WriteLine(((Employee)line).FirstName);
sw.WriteLine(((Employee)line).LastName);
sw.WriteLine(((Employee)line).JobTitle);
}
}
Also, one thing to point out with your code is that you're doing a lot of unnecessary casting. If you have to use a plain (non-generic) collection like ArrayList, then cast the object once and use the reference.
However, I prefer to use List<> for my collections:
public class EmployeeList : List<Employee>
or:
using FileStream fileStream = File.Open(path, FileMode.Append);
using StreamWriter file = new StreamWriter(fileStream);
// ...
You don't even need to do the check manually, File.Open does it for you. Try:
using (StreamWriter sw = new StreamWriter(File.Open(path, System.IO.FileMode.Append)))
{
Ref: http://msdn.microsoft.com/en-us/library/system.io.filemode.aspx
2021
Just use File.AppendAllText, which creates the file if it does not exist:
File.AppendAllText("myFile.txt", "some text");
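To illustrate the create-or-append behaviour, here is a tiny sketch. LogAppender and AppendLine are hypothetical names chosen for this example; the only framework call it relies on is File.AppendAllText:

```csharp
using System;
using System.IO;

public static class LogAppender
{
    // Appends one line to the file, creating the file first if it is missing.
    public static void AppendLine(string path, string line)
    {
        File.AppendAllText(path, line + Environment.NewLine);
    }
}
```

Calling LogAppender.AppendLine twice with a path that does not exist yet leaves a file with two lines: the first call creates it and the second appends. The directory itself must already exist.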
Yes, you need to negate File.Exists(path) if you want to check if the file doesn't exist.
This works as well for me
string path = TextFile + ".txt";
if (!File.Exists(HttpContext.Current.Server.MapPath(path)))
{
File.Create(HttpContext.Current.Server.MapPath(path)).Close();
}
using (StreamWriter w = File.AppendText(HttpContext.Current.Server.MapPath(path)))
{
w.WriteLine("{0}", "Hello World");
w.Flush();
w.Close();
}
This will enable appending to file using StreamWriter
using (StreamWriter stream = new StreamWriter("YourFilePath", true)) {...}
This is the default mode: it does not append, but overwrites the file if it exists or creates a new one.
using (StreamWriter stream = new StreamWriter("YourFilePath", false)){...}
or
using (StreamWriter stream = new StreamWriter("YourFilePath")){...}
Anyhow, if you want to check whether the file exists and then do other things, you can use
using (StreamWriter sw = (File.Exists(path)) ? File.AppendText(path) : File.CreateText(path))
{...}
For Example
string rootPath = Path.GetPathRoot(Environment.GetFolderPath(Environment.SpecialFolder.System));
rootPath += "MTN";
if (!(File.Exists(rootPath)))
{
// dispose the returned StreamWriter right away so the new file is not left locked
File.CreateText(rootPath).Dispose();
}
private List<Url> AddURLToFile(Urls urls, Url url)
{
string filePath = @"D:\test\file.json";
urls.UrlList.Add(url);
//if (!System.IO.File.Exists(filePath))
// using (System.IO.File.Delete(filePath));
System.IO.File.WriteAllText(filePath, JsonConvert.SerializeObject(urls.UrlList));
//using (StreamWriter sw = (System.IO.File.Exists(filePath)) ? System.IO.File.AppendText(filePath) : System.IO.File.CreateText(filePath))
//{
// sw.WriteLine(JsonConvert.SerializeObject(urls.UrlList));
//}
return urls.UrlList;
}
private List<Url> ReadURLToFile()
{
// string filePath = Path.Combine(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), @"App_Data\file.json");
string filePath = @"D:\test\file.json";
List<Url> result = new List<Url>();
if (!System.IO.File.Exists(filePath))
using (System.IO.File.CreateText(filePath)) ;
using (StreamReader file = new StreamReader(filePath))
{
result = JsonConvert.DeserializeObject<List<Url>>(file.ReadToEnd());
file.Close();
}
if (result == null)
result = new List<Url>();
return result;
}
