I am using the following C# code to filter a directory containing multiple files:
files = Directory.GetFiles(SourceDatafiles, @"2022*.txt", SearchOption.TopDirectoryOnly);
The directory contains multiple files, for instance:
2022-07-21-14.txt
2017-2-2-0.txt
The result of the filter is wrong: it also returns the second file as a match, even though its name does not contain "2022".
Any idea what's wrong?
Perhaps you could share more information about your environment: .NET version, OS, and so on.
When I run the code below using .NET 6 on Windows, I get the expected result: it prints only the file 2022-07-14.txt
string SourceDatafiles = @"C:\Temp\Test";
var files = Directory.GetFiles(SourceDatafiles, @"2022*.txt", SearchOption.TopDirectoryOnly);
foreach (var file in files)
{
Console.WriteLine(file);
}
See also:
https://stackoverflow.com/a/39753445/19595774
int count1 = 0;
int count2 = 0;
string pattern = "2022";
foreach (var file in Directory.EnumerateFiles(dir))
{
if (file.Contains("2022"))
Console.WriteLine("Index:" + count1++ + "Name:" + file);
/* OR */
if (Regex.IsMatch(file, pattern) )
Console.WriteLine("Index:" + count2++ + "Name:" + file);
}
Enumeration works fine. I think it is a better approach than Directory.GetFiles(SourceDatafiles, @"2022*.txt", SearchOption.TopDirectoryOnly).
Still wondering why the wild-card filter did not work as expected.
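One possible explanation, which I cannot confirm for this particular case: the .NET Framework documentation for Directory.GetFiles notes that on Windows the search pattern is checked against both the long file name and the 8.3 short file name, so a wildcard can return names that do not visibly match. A minimal sketch of a filter that only ever looks at the real file name is below. The scratch directory and its contents are assumptions made up for the demonstration, and it uses top-level statements (C# 9 or later).

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical scratch directory holding the two file names from the question.
string dir = Path.Combine(Path.GetTempPath(), "wildcard-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(dir);
File.WriteAllText(Path.Combine(dir, "2022-07-21-14.txt"), "");
File.WriteAllText(Path.Combine(dir, "2017-2-2-0.txt"), "");

// Enumerate everything, then test the long file name ourselves,
// so any short-name matching done by the wildcard cannot interfere.
string[] matches = Directory.EnumerateFiles(dir, "*", SearchOption.TopDirectoryOnly)
    .Where(path =>
    {
        string name = Path.GetFileName(path);
        return name.StartsWith("2022", StringComparison.Ordinal)
            && name.EndsWith(".txt", StringComparison.OrdinalIgnoreCase);
    })
    .ToArray();

foreach (string match in matches)
    Console.WriteLine(Path.GetFileName(match));   // 2022-07-21-14.txt

Directory.Delete(dir, recursive: true);
```

Checking Path.GetFileName rather than the full path also avoids false positives when a parent folder happens to contain "2022".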
I have an Excel list of file names that I want to move from one folder to another. I cannot just copy and paste the files from one folder to the other, since there are a lot of files that do not match the Excel list.
private static void CopyPaste()
{
var pstFileFolder = "C:/Users/chnikos/Desktop/Test/";
//var searchPattern = "HelloWorld.docx"+"Test.docx";
string[] test = { "HelloWorld.docx", "Test.docx" };
var soruceFolder = "C:/Users/chnikos/Desktop/CopyTest/";
// Searches the directory for *.pst
foreach (var file in Directory.GetFiles(pstFileFolder, test.ToString()))
{
// Exposes file information like Name
var theFileInfo = new FileInfo(file);
var destination = soruceFolder + theFileInfo.Name;
File.Move(file, destination);
}
}
}
I have tried several things, but I still think that using an array would be the easiest way to do it (correct me if I am wrong).
The issue that I face right now is that it cannot find any files, even though there are files with these names.
You can enumerate the files in the directory by using Directory.EnumerateFiles and use a LINQ expression to check whether each file name is contained in your string array.
Directory.EnumerateFiles(pstFileFolder).Where (d => test.Contains(Path.GetFileName(d)));
So your foreach would look like this:
foreach (var file in Directory.EnumerateFiles(pstFileFolder).Where (d => test.Contains(Path.GetFileName(d)))
{
// Exposes file information like Name
var theFileInfo = new FileInfo(file);
var destination = soruceFolder + theFileInfo.Name;
File.Move(file, destination);
}
Actually no, this will not search the directory for pst files. Either build the path yourself using Path.Combine and then iterate over the string-array, or use your approach. With the code above, you need to update the filter, because it will not find any file when given a string[].ToString (). This should do:
Directory.GetFiles (pstFileFolder, "*.pst")
Alternatively, you can iterate over all files without a filter and compare the filenames to your string-array. For this, a List<string> would be a better way. Just iterate over the files like you're doing and then check if the List contains the file via List.Contains.
foreach (var file in Directory.GetFiles (pstFileFolder))
{
// Exposes file information like Name
var theFileInfo = new FileInfo(file);
// Here, either iterate over the string array or use a List
if (!nameList.Contains (theFileInfo.Name)) continue;
var destination = soruceFolder + theFileInfo.Name;
File.Move(file, destination);
}
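As a sketch of the List.Contains idea above, a HashSet<string> with a case-insensitive comparer may be a reasonable alternative when the Excel list is long or its casing does not match the files on disk. The folder layout and file names below are made up for the demonstration, and the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical scratch folders so the sketch can run anywhere.
string root = Path.Combine(Path.GetTempPath(), "move-demo-" + Guid.NewGuid().ToString("N"));
string fromFolder = Path.Combine(root, "Test");
string toFolder = Path.Combine(root, "CopyTest");
Directory.CreateDirectory(fromFolder);
Directory.CreateDirectory(toFolder);
foreach (string seed in new[] { "HelloWorld.docx", "Test.docx", "Other.docx" })
    File.WriteAllText(Path.Combine(fromFolder, seed), "");

// Names as they might come from the Excel list; note the different casing.
var wanted = new HashSet<string>(
    new[] { "helloworld.docx", "Test.docx" },
    StringComparer.OrdinalIgnoreCase);

// ToList() takes a snapshot, so files are not moved while the folder is still being enumerated.
foreach (string path in Directory.EnumerateFiles(fromFolder).ToList())
{
    string name = Path.GetFileName(path);
    if (!wanted.Contains(name)) continue;   // not on the list: leave it where it is
    File.Move(path, Path.Combine(toFolder, name));
}

// Collect what ended up in the target folder, sorted for stable output.
string[] moved = Directory.GetFiles(toFolder)
    .Select(p => Path.GetFileName(p))
    .OrderBy(n => n, StringComparer.Ordinal)
    .ToArray();
Console.WriteLine(string.Join(", ", moved));   // HelloWorld.docx, Test.docx

Directory.Delete(root, recursive: true);
```

Path.Combine is used instead of string concatenation so that a missing trailing slash on a folder path does not produce a wrong destination.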
I think you need this
var pstFileFolder = "C:/Users/chnikos/Desktop/Test/";
//var searchPattern = "HelloWorld.docx"+"Test.docx";
string[] test = { "HelloWorld.docx", "Test.docx" };
var soruceFolder = "C:/Users/chnikos/Desktop/CopyTest/";
// Loop over the wanted file names instead of searching the directory
foreach (var file in test)
{
    // Full path of the file in the folder it is moved from
    var source = Path.Combine(pstFileFolder, file);
    // Despite its name, soruceFolder is the folder the files are moved to
    var destination = Path.Combine(soruceFolder, file);
    if (File.Exists(source))
        File.Move(source, destination);
}
I have a list of zipped files, where each item contains a ZipArchive and the zipped file name as a String. I also have a final list of file names that I need to check my list against: any item whose file name is not in the final list should be removed from my zipped file list.
I understand that this may not be worded very well, so let me try to explain with my code/pseudocode.
Here is my list:
List<ZipContents> importList = new List<ZipContents>();
Each item has two properties:
ZipArchive, which is called ZipFile
String, which is called FileName
filenames is the final list of file names that I am trying to check my ZipContents list against.
Here is the start of what I am trying to do:
foreach (var import in importList)
{
var fn = import.FileName;
// do some kind of lookup to see if fn would be List<String> filenames
// If not in list dump from ZipContents
}
The commented out section is what I am unsure about doing. Would someone be able to help get me on the right track? Thanks!
EDIT 1
I know I did not say this originally, but I think that LINQ would be a much cleaner route to take. I am just not sure how. I am assuming that .RemoveAll(..) would be the way to go?
Loop through importList in reverse and remove items when not found in filenames. Assuming you don't have too many items performance should be fine:
for (int i = importList.Count - 1; i >= 0; i--)
{
if (!filenames.Contains(importList[i].FileName))
{
importList.RemoveAt(i);
}
}
You can't remove items from a list inside a foreach over that same list, because modifying the collection invalidates the enumerator, but you can do it with the construct in my example.
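You could do something like this inside your loop (iterate over a copy such as importList.ToList(), because removing from the list that is being enumerated throws an InvalidOperationException):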
if (!filenames.Contains(fn)) {
importList.Remove(import);
}
Alternatively, I believe you could use Linq to simplify this logic into just one line.
Edit:
Yes, you can just create a new list of just the ones you want, like this:
var newImportList = importList.Where(il => filenames.Contains(il.FileName)).ToList();
You can do this in one line. Just use LINQ to re-establish your list:
var filenames = new List<string> {"file1", "file2"};
var zipcontents = new List<ZipContents>
{
new ZipContents {FileName = "file1"},
new ZipContents {FileName = "file2"},
new ZipContents {FileName = "file3"}
};
zipcontents = zipcontents.Where(z => filenames.Contains(z.FileName)).ToList();
//zipcontents contains only files that were in 'filenames'
Honestly, this is what LINQ was made for: querying data.
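For completeness, the RemoveAll route raised in the question's edit can be sketched as follows. To keep the sketch short, a named tuple stands in for the ZipContents class (an assumption for illustration only; just the FileName member matters), and filenames is held in a HashSet<string> so each lookup stays cheap. It uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;

// A named tuple stands in for ZipContents; the second member is a placeholder
// for the ZipArchive and is not used by the filter.
var importList = new List<(string FileName, string ZipFile)>
{
    ("file1", "archive-1"),
    ("file2", "archive-2"),
    ("file3", "archive-3"),
};
var filenames = new HashSet<string> { "file1", "file2" };

// RemoveAll removes every item for which the predicate returns true
// and returns the number of items removed. It edits the list in place.
int removed = importList.RemoveAll(import => !filenames.Contains(import.FileName));

Console.WriteLine(removed);            // 1
Console.WriteLine(importList.Count);   // 2
```

Unlike the Where(...).ToList() version, this changes the existing list instead of creating a new one, which matters if other code holds a reference to importList. Since ZipArchive implements IDisposable, you may also want to dispose the archives of the items you drop.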
I have 2 csv files, file1.csv and file2.csv. Some lines in each file will be identical. I wish to create a 3rd csv file, based upon file2.csv but with any lines that are present in file1.csv removed from it. Effectively I wish to subtract file1.csv from file2.csv ignoring any lines present in file1 that are not in file2.
I know that I could use a StreamReader to read each line in file2.csv and search for it in file1.csv. If it does not exist in file1.csv, I can write it to file3.csv. However, the files are very large (over 30000 lines) and I believe this would take a lot of processing time.
I suspect there may be a better method: loading each CSV into an array and then performing a simple subtraction on them to obtain the desired result. I would appreciate either some help with the code or advice on how I should approach this problem.
Example content of files:
file1.csv
dt97861.jpg,149954,c1714ee1,\folder1\folderA\,
dt97862.jpg,149955,c1714ee0,\folder1\folderA\,
dt97863.jpg,59368,cd23f223,\folder2\folderA\,
dt97864.jpg,57881,0835be4a,\folder2\folderB\,
dt97865.jpg,57882,0835be4b,\folder2\folderB\,
file2.csv
dt97862.jpg,149955,c1714ee0,\folder1\folderA\,
dt97863.jpg,59368,cd23f223,\folder2\folderA\,
dt97864.jpg,57881,0835be4a,\folder2\folderB\,
dt97865.jpg,57882,0835be4b,\folder2\folderB\,
dt97866.jpg,57883,0835be4c,\folder2\folderB\,
dt97867.jpg,57884,0835be4d,\folder3\folderA\,
dt97868.jpg,57885,0835be4e,\folder3\folderA\,
The result I require is:
file3.csv
dt97866.jpg,57883,0835be4c,\folder2\folderB\,
dt97867.jpg,57884,0835be4d,\folder3\folderA\,
dt97868.jpg,57885,0835be4e,\folder3\folderA\,
EDIT:
With the help below I came to the following solution which I believe to be nice and elegant:
public static IEnumerable<string> ReadFile(string path)
{
string line;
using (var reader = File.OpenText(path))
while ((line = reader.ReadLine()) != null)
yield return line;
}
then:
var file2 = ReadFile(file2FilePath);
var file1 = ReadFile(file1FilePath);
var file3 = file2.Except(file1);
File.WriteAllLines(file3FilePath, file3);
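Assuming matching lines are exactly identical, you can read both files into two IEnumerable<string> sequences and subtract one from the other with Enumerable.Except. This produces the same set of lines regardless of how the files are ordered. Note that Except is a set operation, so duplicate lines in file2 appear only once in the result.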
Example :
var file1 = new List<string>{
    @"dt97861.jpg,149954,c1714ee1,\folder1\folderA\,",
    @"dt97862.jpg,149955,c1714ee0,\folder1\folderA\,",
    @"dt97863.jpg,59368,cd23f223,\folder2\folderA\,",
    @"dt97864.jpg,57881,0835be4a,\folder2\folderB\,",
    @"dt97865.jpg,57882,0835be4b,\folder2\folderB\,",
};
var file2 = new List<string>{
    @"dt97862.jpg,149955,c1714ee0,\folder1\folderA\,",
    @"dt97863.jpg,59368,cd23f223,\folder2\folderA\,",
    @"dt97864.jpg,57881,0835be4a,\folder2\folderB\,",
    @"dt97865.jpg,57882,0835be4b,\folder2\folderB\,",
    @"dt97866.jpg,57883,0835be4c,\folder2\folderB\,",
    @"dt97867.jpg,57884,0835be4d,\folder3\folderA\,",
    @"dt97868.jpg,57885,0835be4e,\folder3\folderA\,",
};
file2.Except(file1).Dump(); // Dump() is LINQPad's output helper
Output :
dt97866.jpg,57883,0835be4c,\folder2\folderB\,
dt97867.jpg,57884,0835be4d,\folder3\folderA\,
dt97868.jpg,57885,0835be4e,\folder3\folderA\,
Here is a function to load any file into an IEnumerable<string>. Just don't forget to add using System.IO;.
public static IEnumerable<string> ReadFile(string path)
{
string line;
using(var reader = File.OpenText(path))
while((line = reader.ReadLine()) != null)
yield return line;
}
To write the result to a file :
//using System.IO; is required
File.WriteAllLines("file3.csv", file2.Except(file1));
Remarks : File.WriteAllLines will create or overwrite the file.
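One caveat worth illustrating: Except behaves like a set difference, so repeated lines in file2 come out only once. If duplicates must be preserved, a HashSet<string> built from file1 combined with Where may be an option. The sample lines below are invented for the demonstration, and the sketch uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Invented sample data; "c,3" appears twice in file2 on purpose.
string[] file1Lines = { "a,1", "b,2" };
string[] file2Lines = { "b,2", "c,3", "c,3", "d,4" };

// Set difference: each surviving line appears once.
List<string> distinct = file2Lines.Except(file1Lines).ToList();

// Filter against a hash set: order and duplicates of file2 are kept.
var exclude = new HashSet<string>(file1Lines, StringComparer.Ordinal);
List<string> kept = file2Lines.Where(line => !exclude.Contains(line)).ToList();

Console.WriteLine(string.Join(" | ", distinct));   // c,3 | d,4
Console.WriteLine(string.Join(" | ", kept));       // c,3 | c,3 | d,4
```

Both variants do one hash lookup per line of file2, so a file with 30000 lines should not be a problem for either.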
While this may not be the best approach, it's the one I've used in the past. It's a bit of a dirty hack, but...
1. Import both CSV files into DataTables, giving you two DataTables. (I personally prefer ClosedXML if you plan to use an Excel-type format; otherwise just use a normal file read/write. My example uses regular read/write.)
2. Move the data from each DataTable into a list (my example assumes comma-separated values, one record per line).
3. Find the unique values between the lists and merge them.
4. Export the merged list to a CSV file.
*[Edited steps after actually working on the code]
Per request from Bit, I've added an example using sample data from Some Random Website - This was written in VS2008 against .NET 3.5, but it should work on 3.5+. I copied us-500 into 2 versions, the original and modified 1 row to create a unique value to test. This project is targeting x86 platform. I've used a new windows form for testing
using System.Data;
using System.Data.OleDb;
using System.IO;
using System.Linq;
using System.Windows.Forms;
namespace TestSandbox
{
public partial class Form1 : Form
{
public Form1()
{
var file1 = new DataTable();
var file2 = new DataTable();
InitializeComponent();
//Gets data from csv file, select allows for filtering
using (var conn = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\;Extended Properties=""text;HDR=Yes;FMT=Delimited"";"))
{
conn.Open();
using (var adapter = new OleDbDataAdapter(@"select * from [us-500.csv]", conn))
{
adapter.Fill(file1);
}
}
using (var conn = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\;Extended Properties=""text;HDR=Yes;FMT=Delimited"";"))
{
conn.Open();
using (var adapter = new OleDbDataAdapter(@"select * from [us-500-2.csv]", conn))
{
adapter.Fill(file2);
}
}
//Moves datatable information to lists for comparison
var file1List = (from DataRow row in file1.Rows select row.ItemArray.Select(field => field.ToString()).ToArray() into fields select string.Join(",", fields)).ToList();
var file2List = (from DataRow row in file2.Rows select row.ItemArray.Select(field => field.ToString()).ToArray() into fields select string.Join(",", fields)).ToList();
//Adds all data from file2 into file1 list, except for data that already exists in file1
file1List.AddRange(file2List.Except(file1List));
//Exports all results to c:\results.csv
File.WriteAllLines(@"C:\Results.csv", file1List.ToArray());
}
}
}
*Note: After looking at the code, importing straight to a list looks like it would be more efficient, but I'll leave this as is for now since it's not overly complicated.
Step 1. Using System.IO, we'll read two files using FileStream and create a third file using StreamWriter.
Step 2. Use FileStream to read file #1. e.g.
using (var FS = new System.IO.FileStream(file1, System.IO.FileMode.Open, System.IO.FileAccess.Read)) { ...<insert next steps in here>...}
Step 3. Nest another FileStream to read file #2. This stream will be read multiple times, so it's best if you can put the smaller file in this part of the nest. You can do this by checking the size of the file prior to jumping into these loops.
Step 4. Read in a single line from our biggest file, File#1, then we compare it against ALL lines from File#2 sequentially. If a match is found, set a boolean to TRUE indicating that there is a matching line found in File #2.
Step 5. Once we're at the end of File #2, check the boolean. If it's false, save the string we read from File #1 into File #3. This is your output file.
Step 6. Reset the stream pointer for File #2 to the beginning of the file e.g. FS.Seek(0, System.IO.SeekOrigin.Begin)
Step 7. Repeat from Step 4 until we've reached the end of File #1. File #3's contents should represent only unique entries from File #1 that are not members of File #2
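The steps above can be sketched roughly as follows. The file names and sample lines are placeholders invented for the demonstration, and StreamReader is layered on top of the FileStream to read whole lines (the steps only mention FileStream). After seeking the underlying stream back to the start, the reader's buffer has to be discarded as well, otherwise it keeps serving stale data. The sketch uses top-level statements (C# 9 or later).

```csharp
using System;
using System.IO;

// Placeholder files created in a scratch directory so the sketch is self-contained.
string dir = Path.Combine(Path.GetTempPath(), "nested-scan-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(dir);
string outerPath = Path.Combine(dir, "outer.csv");    // "File #1" in the steps
string innerPath = Path.Combine(dir, "inner.csv");    // "File #2" in the steps
string resultPath = Path.Combine(dir, "result.csv");  // "File #3" in the steps
File.WriteAllLines(outerPath, new[] { "b,2", "c,3", "d,4" });
File.WriteAllLines(innerPath, new[] { "a,1", "b,2" });

using (var outerStream = new FileStream(outerPath, FileMode.Open, FileAccess.Read))
using (var innerStream = new FileStream(innerPath, FileMode.Open, FileAccess.Read))
using (var outerReader = new StreamReader(outerStream))
using (var innerReader = new StreamReader(innerStream))
using (var writer = new StreamWriter(resultPath))
{
    string outerLine;
    while ((outerLine = outerReader.ReadLine()) != null)          // Step 4: one line from File #1
    {
        bool found = false;
        string innerLine;
        while ((innerLine = innerReader.ReadLine()) != null)      // compare against File #2
        {
            if (innerLine == outerLine) { found = true; break; }  // stop early on a match
        }

        if (!found) writer.WriteLine(outerLine);                  // Step 5: keep unmatched lines

        innerStream.Seek(0, SeekOrigin.Begin);                    // Step 6: rewind File #2
        innerReader.DiscardBufferedData();                        // and drop the reader's buffer
    }
}

string[] result = File.ReadAllLines(resultPath);
Console.WriteLine(string.Join(" | ", result));   // c,3 | d,4

Directory.Delete(dir, recursive: true);
```

This re-reads the inner file once per outer line, so the work grows with the product of the two line counts. It uses very little memory, but for files of around 30000 lines a hash-based lookup is likely to be much faster.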
I have this code for parsing a CSV file.
var query = from line in File.ReadAllLines("E:/test/sales/" + filename)
let customerRecord = line.Split(',')
select new FTPSalesDetails
{
retailerName = "Example",
};
foreach (var item in query)
{
//sales details table
ItemSale ts = new ItemSale
{
RetailerID = GetRetailerID(item.retailerName)
};
}
Obviously there will be more data in the above code, I am just awaiting the test information file details/structure.
In the meantime I thought I'd ask whether this could be modified to parse TSV files?
All help is appreciated,
thanks :)
Assuming TSV means tab-separated values, you can use
line.Split('\t')
If you are using .NET 4.0 or later, I would recommend File.ReadLines for large files: it streams the lines lazily, so you can still use LINQ without loading every line into memory at once.
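Putting the two suggestions together, a minimal sketch might look like this. The column layout (retailer name, then quantity) and the sample file are assumptions, since the real file structure is not known yet, and an anonymous type stands in for FTPSalesDetails. It uses top-level statements (C# 9 or later).

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical sample file: retailer name, then quantity, separated by tabs.
string path = Path.Combine(Path.GetTempPath(), "sales-" + Guid.NewGuid().ToString("N") + ".tsv");
File.WriteAllLines(path, new[] { "Example\t3", "Other Shop\t1" });

// File.ReadLines yields one line at a time; Split('\t') breaks it into fields.
// ToList() forces the read to finish before the file is deleted below.
var records = File.ReadLines(path)
    .Select(line => line.Split('\t'))
    .Select(fields => new { RetailerName = fields[0], Quantity = int.Parse(fields[1]) })
    .ToList();

foreach (var record in records)
    Console.WriteLine(record.RetailerName + ": " + record.Quantity);
// Example: 3
// Other Shop: 1

File.Delete(path);
```

A plain Split does not handle quoted fields that contain tabs or line breaks. If the real files can contain those, a dedicated delimited-text parser is probably the safer choice.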