Removing specified text from CSV file

This is my first attempt at this, and I have no idea whether I'm on the right track.
Basically, I want to remove every line that contains a specific keyword from a CSV file, but I can't figure out how to remove the line.
static void Main(string[] args)
{
var searchItem = "running";
var lines = File.ReadLines("C://Users//Pete//Desktop//testdata.csv");
foreach (string line in lines)
{
if (line.Contains(searchItem))
{
//Remove line here?
}
}
}

Try this to remove one or more words.
static void sd(string[] args)
{
string path = "C://Users//Pete//Desktop//testdata.csv";
string contents = File.ReadAllText(path);
// Chain one Replace call per word you want to strip
string output = contents.Replace("running", string.Empty).Replace("replaceThis", string.Empty).Replace("replaceThisToo", string.Empty);
//string output = contents.Replace("a", "b").Replace("b", "c").Replace("c", "d");
// Write the result back; without this the file on disk is unchanged
File.WriteAllText(path, output);
}
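To remove multiple strings, loop over them and keep replacing in the same result:
static void Main(string[] args)
{
string[] removeTheseWords = { "aaa", "bbb", "ccc" };
string path = "C://Users//Pete//Desktop//testdata.csv";
// Start from the full contents so each pass builds on the previous one
string output = File.ReadAllText(path);
foreach (string value in removeTheseWords)
{
output = output.Replace(value, string.Empty);
}
// Write the cleaned text back to the file
File.WriteAllText(path, output);
}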
More info: https://learn.microsoft.com/en-us/dotnet/api/system.string.replace

The simple way if you'd like to remove a whole line:
var searchItem = "running";
var pathToYourFile = @"C://Users//Pete//Desktop//testdata.csv";
var lines = File.ReadAllLines(pathToYourFile);
lines = lines.Where(line => !line.Contains(searchItem)).ToArray();
File.WriteAllLines(pathToYourFile, lines);
For multiple search items:
var searchItems = "running;walking;waiting;any";
var pathToYourFile = @"..\..\items.csv";
var lines = File.ReadAllLines(pathToYourFile);
// Split on your separator; here it is the ';' character
foreach(var searchItem in searchItems.Split(';'))
lines = lines.Where(line => !line.Contains(searchItem)).ToArray();
File.WriteAllLines(pathToYourFile, lines);
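As a rough sketch of the multi-keyword case above, the filtering can be separated from the file I/O so that it is easy to test. The names LineFilter and RemoveLinesContaining are made up for this illustration, and the matching is assumed to be case-sensitive, as in the answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LineFilter
{
    // Hypothetical helper: returns only the lines that contain none of the
    // search terms. Matching is case-sensitive, like string.Contains.
    public static List<string> RemoveLinesContaining(
        IEnumerable<string> lines, IEnumerable<string> searchTerms)
    {
        // Ignore empty terms; an empty string is "contained" in every line.
        var terms = searchTerms.Where(t => !string.IsNullOrEmpty(t)).ToList();

        return lines
            .Where(line => !terms.Any(term => line.Contains(term)))
            .ToList();
    }
}
```

It could be called as LineFilter.RemoveLinesContaining(File.ReadAllLines(path), "running;walking".Split(';')), with the result passed to File.WriteAllLines.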

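If you remove items from lines inside a foreach loop, an InvalidOperationException ("Collection was modified") is thrown, so iterate backwards with a for loop instead. This assumes lines is a List<string>, for example File.ReadAllLines(path).ToList():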
for(int i=lines.Count - 1; i > -1; i--)
{
if (lines[i].Contains(searchItem))
{
lines.RemoveAt(i);
}
}
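For a List<string>, the built-in List<T>.RemoveAll does the same job as the reverse loop in a single call. A minimal sketch, where the class and method names are invented for illustration:

```csharp
using System.Collections.Generic;

public static class ListCleanup
{
    // Hypothetical helper: removes every line containing searchItem,
    // modifying the list in place, and returns how many lines were removed.
    public static int RemoveMatching(List<string> lines, string searchItem)
    {
        return lines.RemoveAll(line => line.Contains(searchItem));
    }
}
```

Because RemoveAll handles the index bookkeeping itself, there is no risk of skipping an element after a removal.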

You don't need to remove the line; just skip the lines that contain your search term:
foreach (string line in lines)
{
if (!line.Contains(searchItem)) //<= Notice here I added exclamation mark (!)
{
//Do your work when line does not contains search term
}
else
{
//Do something if line contains search term
}
}
Alternatively, filter out the lines that contain your search term before the loop:
lines = lines.Where(line => !line.Contains(searchItem));
foreach (string line in lines)
{
//Here are those line that does not contain search term
}
If your search term contains multiple words separated by commas, you can skip the matching lines with:
lines = lines.Where(line => searchItem.Split(',').All(term => !line.Contains(term)));

Related

Insert multiple lines into file from position without overwriting the previous inserted line

I want to insert multiple "items" into a list using a foreach loop (looping over a list). Each line should be inserted as a <td> element. But because I specify the index at which to insert the line, the previous one gets overwritten. How can I add a line at a position and then add the rest after it without overwriting the previously added line?
private void Create_Driver_Report(string npcName)
{
var fileName = Get_Path("Driver_Reports.html");
var endTag = npcName;
var lineToAdd = "<!--New Line Here-->";
var htmlContent = File.ReadAllLines(fileName).ToList();
var index = htmlContent.FindIndex(x => x.Contains(lineToAdd));
htmlContent.Insert(index + 1, endTag);
File.WriteAllLines("drivers.html", htmlContent);
}
How I want to do it in theory
foreach (Drivers item in drivers)
{
Create_Driver_Report($"<td>{item.Driver_ID}</td>");
Create_Driver_Report($"<td>{item.Driver_Name}</td>");
Create_Driver_Report($"<td>{item.Vehicle_ID}</td>");
Create_Driver_Report($"<td>{item.Company_ID}</td>");
Create_Driver_Report($"<td>{item.Company_Name}</td>");
}

Override the ToString() method or create a new one.
If you always want to insert the same properties, it seems unnecessary to invoke Create_Driver_Report over and over again.
public override String ToString() {
return
$"<td>{this.Driver_ID}</td>\n" +
$"<td>{this.Driver_Name}</td>\n" +
$"<td>{this.Vehicle_ID}</td>\n" +
$"<td>{this.Company_ID}</td>\n" +
$"<td>{this.Company_Name}</td>";
}
and you can invoke it like:
foreach (Drivers item in drivers) {
Create_Driver_Report(item.ToString());
}
Edit:
Option 1:
Use LINQ Select() and List.InsertRange()
public String ToHtmlRow() {
return
$"<tr><td>{this.Driver_ID}</td><td>{this.Driver_Name}</td><td>{this.Vehicle_ID}</td><td>{this.Company_ID}</td><td>{this.Company_Name}</td></tr>";
}
{
// Build one HTML row per driver, then insert them all in a single call
IEnumerable<string> lines = drivers.Select(driver => driver.ToHtmlRow());
Create_Driver_Report(lines);
}
static void Create_Driver_Report(IEnumerable<string> lines) {
var fileName = Get_Path("Driver_Reports.html");
var lineToAdd = "<!--New Line Here-->";
var htmlContent = File.ReadAllLines(fileName).ToList();
var index = htmlContent.FindIndex(x => x.Contains(lineToAdd));
htmlContent.InsertRange(index + 1, lines);
File.WriteAllLines("drivers.html", htmlContent);
}
Option 2:
Collect all of the rows you want to add and call Create_Driver_Report only once.
List<String> toAdd = new List<String>();
foreach (Drivers item in drivers) {
toAdd.Add(item.ToHtmlRow());
}
Create_Driver_Report(String.Join("\n", toAdd));
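The insertion step used in both options can be sketched without touching the file system. One detail worth noting: FindIndex returns -1 when the marker line is missing, so index + 1 would silently insert at the top of the file. The sketch below throws instead; the names HtmlInserter and InsertAfterMarker are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HtmlInserter
{
    // Hypothetical helper: returns a copy of content with newLines inserted
    // directly after the first line that contains marker.
    public static List<string> InsertAfterMarker(
        IEnumerable<string> content, string marker, IEnumerable<string> newLines)
    {
        var result = content.ToList();
        int index = result.FindIndex(line => line.Contains(marker));

        // FindIndex returns -1 when nothing matches; fail loudly rather than
        // inserting at position 0.
        if (index < 0)
            throw new InvalidOperationException("Marker line not found: " + marker);

        // InsertRange shifts the following lines down; nothing is overwritten.
        result.InsertRange(index + 1, newLines);
        return result;
    }
}
```

Since the marker stays in place and all rows go in with one InsertRange call, the rows keep the order in which they were supplied.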

Delete rows in a csv file

I have two files, Example1.csv and Example2.csv. Note that they are not comma-separated, but are saved with the 'csv' extension.
Example 1 has one column, which contains only email addresses.
Example 2 has many columns, including the email column from Example 1.
Example1.csv file
emails
abc@gmail.com
jhg@yahoo.com
...
...
Example2.csv file
Column1 column2 Column3 column4 emails
1 45 456 123 abc@gmail.com
2 89 898 254 jhg@yahoo.com
3 85 365 789 ...
Now I need to delete the rows in Example2.csv whose email matches an entry in Example1.csv. For example, rows 1 and 2 should be removed because both emails match.
string[] lines = File.ReadAllLines(@"C:\example2.csv");
var emails = File.ReadAllLines(@"C:\example1.csv");
List<string> linesToWrite = new List<string>();
foreach (string s in lines)
{
String[] split = s.Split(' ');
if (s.Contains(emails))
linesToWrite.Remove(s);
}
File.WriteAllLines("file3.csv", linesToWrite);

This should work:
var emails = new HashSet<string>(File.ReadAllLines(@"C:\example1.csv").Skip(1));
// The question's sample data is space-separated, so split on ' '; use ',' for a real CSV
File.WriteAllLines("file3.csv", File.ReadAllLines(@"C:\example2.csv").Where(line => !emails.Contains(line.Split(' ')[4])));
It reads all of file one (skipping the header row), puts the emails into a set where lookup is cheap, then goes through the lines of the second file and writes to disk only those whose 5th column does not match any of those emails. You may want to expand on many parts; for example, there is little to no error handling. It also compares emails case-sensitively, although email addresses are normally treated as case-insensitive.
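The case-sensitivity caveat above can be addressed by giving the HashSet a case-insensitive comparer. The following is only a sketch under stated assumptions: columns are separated by spaces or tabs, rows with too few columns are kept, and the names EmailRowFilter and RemoveRows are invented here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmailRowFilter
{
    // Hypothetical helper: keeps the rows whose value in emailColumn
    // (zero-based) is not in emailsToRemove, ignoring case.
    public static List<string> RemoveRows(
        IEnumerable<string> rows, IEnumerable<string> emailsToRemove, int emailColumn)
    {
        var blocked = new HashSet<string>(
            emailsToRemove.Select(e => e.Trim()),
            StringComparer.OrdinalIgnoreCase);

        var separators = new[] { ' ', '\t' };

        return rows.Where(row =>
        {
            var cells = row.Split(separators, StringSplitOptions.RemoveEmptyEntries);
            // Rows that are too short to have the column are kept unchanged.
            return cells.Length <= emailColumn || !blocked.Contains(cells[emailColumn]);
        }).ToList();
    }
}
```

With the question's layout, the email is the fifth column, so emailColumn would be 4.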

The variable line is not a string but a string array, just like lines; both are read the same way.
Also, this line
if (s.Contains(line))
is not correct. You are trying to check whether a string contains an array. If you need to check whether a line contains an email from the list, this is better:
if (split.Intersect(line).Any())
Note that linesToWrite starts out empty, so calling Remove on it does nothing; instead, add the lines that do not match. Here is the final code.
var lines = File.ReadAllLines(@"C:\example2.csv");
var line = File.ReadAllLines(@"C:\example1.csv");
var linesToWrite = new List<string>();
foreach (var s in lines)
{
var split = s.Split(','); // use your file's actual delimiter
// Keep only the rows that share no cell with the email list
if (!split.Intersect(line).Any())
{
linesToWrite.Add(s);
}
}
File.WriteAllLines("file3.csv", linesToWrite);

static void Main(string[] args)
{
var Example1CsvPath = @"C:\Inetpub\Poligon\Poligon\Resources\Example1.csv";
var Example2CsvPath = @"C:\Inetpub\Poligon\Poligon\Resources\Example2.csv";
var Example3CsvPath = @"C:\Inetpub\Poligon\Poligon\Resources\Example3.csv";
var EmailsToDelete = new List<string>();
var Result = new List<string>();
foreach(var Line in System.IO.File.ReadAllLines(Example1CsvPath))
{
if (!string.IsNullOrWhiteSpace(Line) && Line.IndexOf('@') > -1)
{
EmailsToDelete.Add(Line.Trim());
}
}
foreach (var Line in System.IO.File.ReadAllLines(Example2CsvPath))
{
if (!string.IsNullOrWhiteSpace(Line))
{
var Values = Line.Split(' ');
if (!EmailsToDelete.Contains(Values[4]))
{
Result.Add(Line);
}
}
}
System.IO.File.WriteAllLines(Example3CsvPath, Result);
}

I know this question is four years old, but I got some ideas from it and would like to share my solution.
This code is meant for a simple CSV with at most about 20 lines, so I decided to keep it basic and not use a database.
My solution rescans the CSV, saving every row except the one I want to delete into a list. After the scan, it writes the list back to the CSV, which drops the row matching the values I passed in (textBox1 or textBox2).
List<string> _ = new();
try {
using (var reader = new StreamReader($"{Main.directory}\\bin\\ip.csv")) {
while (!reader.EndOfStream) {
var line = reader.ReadLine();
var values = line.Split(',');
if (values[0] == textBox1.Text || values[1] == textBox2.Text)
continue;
_.Add($"{values[0]},{values[1]},{values[2]},");
}
}
File.WriteAllLines($"{Main.directory}\\bin\\ip.csv", _);
} catch (Exception f) {
MessageBox.Show(f.Message);
}

multiple foreach loops inside while loop

Is it possible to include multiple "foreach" statements inside a looping construct such as while or for? I want to open the .wav files from two different directories simultaneously so that I can compare the files from both.
Here is what I am trying to do, but it is certainly wrong. Any help is appreciated.
string[] fileEntries1 = Directory.GetFiles(folder1, "*.wav");
string[] fileEntries2 = Directory.GetFiles(folder11, "*.wav");
while ( foreach(string fileName1 in fileEntries1) && foreach(string fileName2 in fileEntries2))

Grammatically speaking, no. A foreach construct is a statement, whereas the condition of a while statement must be an expression.
Your best bet is to nest the foreach blocks:
foreach(string fileName1 in fileEntries1)
{
foreach(string fileName2 in fileEntries2)
{ /* compare fileName1 and fileName2 here */ }
}

I like to write this kind of thing as a single statement. So even though most of the answers here are correct, here is mine.
string[] fileEntries1 = Directory.GetFiles(folder1, "*.wav");
string[] fileEntries2 = Directory.GetFiles(folder11, "*.wav");
// GetFiles returns paths that include the directory, so compare Path.GetFileName values if the folders differ
foreach (var fileExistsInBoth in fileEntries1.Where(fe1 => fileEntries2.Contains(fe1)))
{
/// here you will have the records which exists in both of the lists
}

Something like this since you only need to validate same file names:
IEnumerable<string> fileEntries1 = Directory.GetFiles(folder1, "*.wav").Select(x => Path.GetFileName(x));
IEnumerable<string> fileEntries2 = Directory.GetFiles(folder2, "*.wav").Select(x => Path.GetFileName(x));
// Use the same test for both picks so equal counts still yield two different lists
IEnumerable<string> filesToIterate = (fileEntries1.Count() > fileEntries2.Count()) ? fileEntries1 : fileEntries2;
IEnumerable<string> filesToValidate = (fileEntries1.Count() > fileEntries2.Count()) ? fileEntries2 : fileEntries1;
// Iterate the bigger collection
foreach (string fileName in filesToIterate)
{
// Find the files in smaller collection
if (filesToValidate.Contains(fileName))
{
// Get actual file and compare
}
else
{
// File does not exist in another list. Handle appropriately
}
}
.NET 2.0 based solution:
List<string> fileEntries1 = new List<string>(Directory.GetFiles(folder1, "*.wav"));
List<string> fileEntries2 = new List<string>(Directory.GetFiles(folder2, "*.wav"));
List<string> filesToIterate = (fileEntries1.Count > fileEntries2.Count) ? fileEntries1 : fileEntries2;
// filesToValidate is the static field declared with FindFile below; same test so a tie still picks the other list
filesToValidate = (fileEntries1.Count > fileEntries2.Count) ? fileEntries2 : fileEntries1;
string iteratorFileName;
string validatorFilePath;
// Iterate the bigger collection
foreach (string fileName in filesToIterate)
{
iteratorFileName = Path.GetFileName(fileName);
// Find the files in smaller collection
if ((validatorFilePath = FindFile(iteratorFileName)) != null)
{
// Compare fileName and validatorFilePath files here
}
else
{
// File does not exist in another list. Handle appropriately
}
}
FindFile method:
static List<string> filesToValidate;
private static string FindFile(string fileToFind)
{
string returnValue = null;
foreach (string filePath in filesToValidate)
{
if (string.Compare(Path.GetFileName(filePath), fileToFind, true) == 0)
{
// Found the file
returnValue = filePath;
break;
}
}
if (returnValue != null)
{
// File was found in smaller list. Remove this file from the list since we do not need to look for it again
filesToValidate.Remove(returnValue);
}
return returnValue;
}
You may or may not choose to make fields and methods static based on your needs.
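On newer framework versions, the lookup that FindFile performs with a linear scan could instead use a dictionary keyed by file name. This is a sketch only: FileMatcher and MatchByName are hypothetical names, and file names are assumed to be unique within each folder and compared ignoring case.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FileMatcher
{
    // Hypothetical helper: pairs each path in paths1 with the path in paths2
    // that has the same file name (directory ignored, case ignored).
    public static List<KeyValuePair<string, string>> MatchByName(
        IEnumerable<string> paths1, IEnumerable<string> paths2)
    {
        // Index the second list by file name for constant-time lookup.
        var byName = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var path in paths2)
            byName[Path.GetFileName(path)] = path;

        var pairs = new List<KeyValuePair<string, string>>();
        foreach (var path in paths1)
        {
            string match;
            if (byName.TryGetValue(Path.GetFileName(path), out match))
                pairs.Add(new KeyValuePair<string, string>(path, match));
        }
        return pairs;
    }
}
```

Each returned pair holds the two full paths, ready for the actual file comparison.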

If you want to iterate all pairs of files in both paths respectively, you can do it as follows.
string[] fileEntries1 = Directory.GetFiles(folder1, "*.wav");
string[] fileEntries2 = Directory.GetFiles(folder11, "*.wav");
foreach(string fileName1 in fileEntries1)
{
foreach(string fileName2 in fileEntries2)
{
// do the actual comparison
}
}

This is what I would suggest, using linq
using System.Linq;
var fileEntries1 = Directory.GetFiles(folder1, "*.wav");
var fileEntries2 = Directory.GetFiles(folder11, "*.wav");
foreach (var entry1 in fileEntries1)
{
var entries = fileEntries2.Where(x => Equals(entry1, x));
if (entries.Any())
{
//We have matches
//entries is a list of matches in fileentries2 for entry1
}
}

If you want to enumerate both collections "in parallel", use their enumerators like this:
var fileEntriesIterator1 = Directory.EnumerateFiles(folder1, "*.wav").GetEnumerator();
var fileEntriesIterator2 = Directory.EnumerateFiles(folder11, "*.wav").GetEnumerator();
while(fileEntriesIterator1.MoveNext() && fileEntriesIterator2.MoveNext())
{
var file1 = fileEntriesIterator1.Current;
var file2 = fileEntriesIterator2.Current;
}
If one collection is shorter than the other, this loop will end when the shorter collection has no more elements.
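On .NET 4.0 or later, Enumerable.Zip expresses the same lock-step iteration and also stops at the shorter sequence. A small sketch with made-up names (PairedFiles, Pair); since the order in which the file system returns names is not guaranteed, sorting both sequences first is probably wise.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PairedFiles
{
    // Hypothetical helper: pairs the i-th element of first with the i-th
    // element of second. Extra elements in the longer sequence are ignored.
    public static List<KeyValuePair<string, string>> Pair(
        IEnumerable<string> first, IEnumerable<string> second)
    {
        return first
            .Zip(second, (a, b) => new KeyValuePair<string, string>(a, b))
            .ToList();
    }
}
```

Unlike the manual enumerator loop, Zip disposes both enumerators when iteration finishes.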

get all lines from a huge textfile after a string

I have a lot of huge text files, and I need to retrieve all lines after a certain string using C#.
FYI, the string appears within the last few lines, but I don't know exactly how many lines from the end.
Sample text:
someline
someline
someline
someline
etc
etc
"uniqueString"
line 1
line 2
line 3
I need to get lines
line 1
line 2
line 3

bool found=false;
List<String> lines = new List<String>();
foreach(var line in File.ReadLines(@"C:\MyFile.txt"))
{
if(found)
{
lines.Add(line);
}
// Contains is case-sensitive, so the casing must match the text in the file
if(!found && line.Contains("uniqueString"))
{
found=true;
}
}

Try this code
public string[] GetLines()
{
List<string> lines = new List<string>();
bool startRead = false;
string uniqueString = "uniqueString";
using (StreamReader st = new StreamReader("File.txt"))
{
while (!st.EndOfStream)
{
// Read each line exactly once per iteration
string line = st.ReadLine();
if (startRead)
lines.Add(line);
else if (line.Equals(uniqueString))
startRead = true;
}
}
return lines.ToArray();
}
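The same "everything after the marker" logic can also be written with LINQ's SkipWhile and Skip. A minimal sketch, assuming the marker is matched with a case-sensitive Contains; TailReader and LinesAfter are names invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TailReader
{
    // Hypothetical helper: returns the lines that follow the first line
    // containing marker, or an empty sequence if the marker never appears.
    public static IEnumerable<string> LinesAfter(IEnumerable<string> lines, string marker)
    {
        return lines
            .SkipWhile(line => !line.Contains(marker)) // drop everything before the marker
            .Skip(1);                                  // drop the marker line itself
    }
}
```

Passing File.ReadLines(path) as the input keeps the evaluation lazy, so the whole file is not held in memory, although it is still read from the beginning.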

Searching the first few characters of every word within a string in C#

I am new to programming languages. I have a requirement where I have to return a record based on a search string.
For example, take the following three records and a search string of "Cal":
University of California
Pascal Institute
California University
I've tried String.Contains, but all three are returned. If I use String.StartsWith, I get only record #3. My requirement is to return #1 and #3 in the result.
Thank you for your help.

If you're using .NET 3.5 or higher, I'd recommend using the LINQ extension methods. Check out String.Split and Enumerable.Any. Something like:
string myString = "University of California";
bool included = myString.Split(' ').Any(w => w.StartsWith("Cal"));
Split divides myString at the space characters and returns an array of strings. Any works on the array, returning true if any of the strings starts with "Cal".
If you don't want to or can't use Any, then you'll have to manually loop through the words.
string myString = "University of California";
bool included = false;
foreach (string word in myString.Split(' '))
{
if (word.StartsWith("Cal"))
{
included = true;
break;
}
}

I like this for simplicity:
if(str.StartsWith("Cal") || str.Contains(" Cal")){
//do something
}

You can try:
foreach(var str in stringInQuestion.Split(' '))
{
if(str.StartsWith("Cal"))
{
//do something
}
}

You can use regular expressions to find the matches. Here is an example:
//array of strings to check
String[] strs = {"University of California", "Pascal Institute", "California University"};
//create the regular expression to look for; \b anchors the match to the start of a word
Regex regex = new Regex(@"\bCal\w*");
//create a list to hold the matches
List<String> myMatches = new List<String>();
//loop through the strings
foreach (String s in strs)
{ //check for a match
if (regex.Match(s).Success)
{ //add to the list
myMatches.Add(s);
}
}
//loop through the list and present the matches one at a time in a message box
foreach (String matchItem in myMatches)
{
MessageBox.Show(matchItem + " was a match");
}

string univOfCal = "University of California";
string pascalInst = "Pascal Institute";
string calUniv = "California University";
string[] arrayofStrings = new string[]
{
univOfCal, pascalInst, calUniv
};
string wordToMatch = "Cal";
foreach (string i in arrayofStrings)
{
if (i.Contains(wordToMatch)){
Console.Write(i + "\n");
}
}
Console.ReadLine();

var strings = new List<string> { "University of California", "Pascal Institute", "California University" };
var matches = strings.Where(s => s.Split(' ').Any(x => x.StartsWith("Cal")));
foreach (var match in matches)
{
Console.WriteLine(match);
}
Output:
University of California
California University

This is actually a good use case for regular expressions.
string[] words =
{
"University of California",
"Pascal Institute",
"California University"
};
var expr = @"\bcal";
var opts = RegexOptions.IgnoreCase;
var matches = words.Where(x =>
Regex.IsMatch(x, expr, opts)).ToArray();
The "\b" matches at any word boundary (for example next to punctuation or a space, or at the start of the string).
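To reuse the word-boundary approach with a search string supplied at run time, the prefix should be passed through Regex.Escape so that characters such as '.' or '+' are treated literally. A hedged sketch; WordPrefixSearch and Find are hypothetical names, and the prefix is assumed to start with a letter or digit so that \b behaves as described.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordPrefixSearch
{
    // Hypothetical helper: returns the records in which at least one word
    // starts with prefix, ignoring case.
    public static List<string> Find(IEnumerable<string> records, string prefix)
    {
        // \b anchors the match to the start of a word; Regex.Escape keeps
        // any regex metacharacters in the prefix literal.
        string pattern = @"\b" + Regex.Escape(prefix);

        return records
            .Where(record => Regex.IsMatch(record, pattern, RegexOptions.IgnoreCase))
            .ToList();
    }
}
```

For the three sample records and the search string "Cal", this keeps the first and third records and drops "Pascal Institute", because the "cal" in "Pascal" does not begin a word.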
