Select file lines based on a string array - C#

I have an array of names:
string[] arr = { "John", "karl", "Ralf", "Florian" };
And I have an array with the lines of a file:
var lines = File.ReadAllLines("test.txt");
Each line contains a name, in no particular order, at a fixed position in the line, for example columns 10 to 20.
Right now I use a nested loop that, for each name in the first array, searches the file line by line:
for (int i = 0; i < arr.Length; i++)
    for (int j = 0; j < lines.Length; j++)
        if (lines[j].Substring(10, 10).Trim() == arr[i].Trim())
            list.Add(lines[j]);
That is a very poor way, I know :(
Is it possible to do this using LINQ, and if so, how?

I think this should do the job:
list = lines.Where(x => arr.Contains(x.Substring(10, 10).Trim())).ToList();
It creates a list of the lines whose 10-character substring starting at index 10 is in arr.

Try this:
var result = lines
.Where(line => arr.Contains(line.Substring(10, 10).Trim()))
.ToList();
Trim on arr elements is not needed - they're already trimmed. ToList can be skipped to get a lazy IEnumerable<string> or replaced with ToArray if you prefer.

LINQ, in query comprehension syntax:
var result = from line in lines
let trimmed = line.Substring(10,10).Trim()
where arr.Contains(trimmed)
select line;
NOTE: result will contain an enumeration over non-trimmed lines, which is what the OP's code originally does.
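The answers above all call Substring(10, 10), which throws if a line has fewer than 20 characters. The following is a minimal sketch of the same filter with a length guard and a HashSet lookup. The names LineFilter and FilterByName and the sample lines are made up for illustration, and skipping short lines (rather than failing) is an assumption about what the OP wants.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineFilter
{
    // Returns the lines whose field at [start, start + length) matches one of the names.
    // Lines too short to contain the whole field are skipped instead of throwing.
    public static List<string> FilterByName(
        IEnumerable<string> lines, IEnumerable<string> names, int start, int length)
    {
        var lookup = new HashSet<string>(names); // constant-time Contains
        return lines
            .Where(line => line.Length >= start + length)
            .Where(line => lookup.Contains(line.Substring(start, length).Trim()))
            .ToList();
    }

    public static void Main()
    {
        string[] arr = { "John", "karl", "Ralf", "Florian" };
        // Hypothetical sample data: 10 filler characters, then a 10-character name field.
        string[] lines =
        {
            "0000000001John      rest",
            "0000000002Peter     rest",
            "short"
        };
        foreach (var line in FilterByName(lines, arr, 10, 10))
            Console.WriteLine(line); // only the "John" line is printed
    }
}
```

For a handful of names the HashSet makes little difference; it starts to matter when arr is large, because arr.Contains scans the whole array for every line.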

Related

C# text file to string array and how to remove specific strings?

I need to read a text file (10 MB) and convert it to .csv. See the portion of code below:
string DirPathForm = System.IO.Path.GetDirectoryName(System.Reflection.Assembly.GetEntryAssembly().Location);
string[] lines = File.ReadAllLines(DirPathForm + @"\file.txt");
Some portions of the text file follow a pattern, so I used the following:
string[] lines1 = lines.Select(x => x.Replace("abc[", "ab,")).ToArray();
Array.Clear(lines, 0, lines.Length);
lines = lines1.Select(x => x.Replace("] CDE ", ",")).ToArray();
Some portions do not have a pattern that Replace can handle directly. The question is how to remove the characters, numbers and whitespace in such a portion. For example, given:
string[] lines = {
"a] 773 b",
"e] 1597 t",
"z] 0 c"
};
to get the result below:
string[] result = {
"a,b",
"e,t",
"z,c"
};
Note: the removed items need to be replaced by ",".
First of all, avoid ReadAllLines for a large file: it loads the entire file into memory at once. Instead, read the lines one by one in a loop, for example with File.ReadLines.
Secondly, you can use a regex to transform the lines from the first form into the second.
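The regex suggestion can be sketched as follows. The pattern is an assumption based only on the three sample lines in the question (a "]", optional whitespace, digits, optional whitespace); the names CsvCleanup and ReplaceBracketNumber are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CsvCleanup
{
    // "]" followed by optional whitespace, one or more digits, and optional whitespace.
    private static readonly Regex BracketNumber = new Regex(@"\]\s*\d+\s*");

    // Replaces every "] <number> " run in the line with a single comma.
    public static string ReplaceBracketNumber(string line)
    {
        return BracketNumber.Replace(line, ",");
    }

    public static void Main()
    {
        string[] lines = { "a] 773 b", "e] 1597 t", "z] 0 c" };
        string[] result = lines.Select(line => ReplaceBracketNumber(line)).ToArray();
        Console.WriteLine(string.Join(" | ", result)); // a,b | e,t | z,c
    }
}
```

If the real file has other shapes in that position (negative numbers, decimals, missing spaces), the pattern would need to be adjusted.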

Need to refer to second to the last element of array of partial filenames

I need to find the distinct values of a part of each filename in an array of filenames. I'd like to do it in one line.
So, I have filenames like these:
string[] filenames = {"aaa_ab12345.txt", "bbb_ab12345.txt", "aaa_ac12345.txt", "bbb_ac12345"};
and I need to find the distinct values of the ab12345 part.
I currently have something like this:
string[] filenames_partial_distinct = Array.ConvertAll(
filenames,
file => System.IO.Path.GetFileNameWithoutExtension(file)
.Split(new[] {"_","."}, StringSplitOptions.RemoveEmptyEntries)[1]
)
.Distinct()
.ToArray();
Now I'm getting filenames of the form aaa_bbb_ab12345.txt. So, instead of referring to the second part of the filename, I need to refer to the second to last.
So, how do I refer to an arbitrary element based on the length of the array in one line, when the array is the result of the Split method? Something along the lines of:
Array.ConvertAll(filenames, file=>file.Split(separator)[this.Length-2]).Distinct().ToArray();
In other words, if a string method returns an array of strings, how do I immediately select an element based on the length of the array:
String.Split()[third from end, fifth from end, etc.];
If you use GetFileNameWithoutExtension, there will be no extension, so splitting by '_' is enough. Then you can take the last part with .Last().
string[] filenames_partial_distinct = Array.ConvertAll(
filenames,
file => Path.GetFileNameWithoutExtension(file).Split('_').Last()
)
.Distinct()
.ToArray();
With the input
string[] filenames = { "aaa_ab12345.txt", "bbb_ab12345.txt",
"aaa_ac12345.txt", "bbb_ac12345", "aaa_bbb_ab12345.txt" };
You get the result
{ "ab12345", "ac12345" }
The StringSplitOptions.RemoveEmptyEntries is only required if there are filenames ending with _ (before the extension).
Seems you're looking for something like this:
string[] arr = filenames.Select(n => n.Substring(n.LastIndexOf('_') + 1, 7)).Distinct().ToArray();
I usually defer problems like this to regular expressions. They are very powerful, and this approach also gives you the opportunity to detect unexpected cases and handle them appropriately.
Here is a crude example, assuming I understood your requirements:
using System;
using System.Linq;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        string MyMatcher(string filename)
        {
            // this pattern may need work depending on what you need - it says
            // extract that pattern between the "()" which is 2 characters and
            // 4 digits, exactly; and can be found in `Groups[1]`.
            Regex r = new Regex(@".*_(\w{2}\d{4}).*", RegexOptions.IgnoreCase);
            Match m = r.Match(filename);
            return m.Success
                ? m.Groups[1].ToString()
                : null; // what should happen here?
        }
        string[] filenames =
        {
            "aaa_ab12345.txt",
            "bbb_ab12345.txt",
            "aaa_ac12345.txt",
            "bbb_ac12345",
            "aaa_bbb_ab12345.txt",
            "ae12345.txt" // MyMatcher() returns null for this - what should you do if this happens?
        };
        var results = filenames
            .Select(MyMatcher)
            .Distinct();
        foreach (var result in results)
        {
            Console.WriteLine(result);
        }
    }
}
Gives:
ab1234
ac1234
This can be refined further, such as pre-compiled regex patterns, encapsulation in a class, etc.
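The general question in this thread (picking an element counted from the end of a Split result) can also be answered with plain index arithmetic. This is a minimal sketch; PartPicker and PartFromEnd are hypothetical names, and returning null for names with too few parts is an assumption about the desired behavior. On C# 8 or later, with a runtime that provides System.Index, the expression parts[^2] is a built-in way to write parts[parts.Length - 2].

```csharp
using System;
using System.IO;
using System.Linq;

public static class PartPicker
{
    // Returns the part that is `fromEnd` positions from the end
    // (1 = last, 2 = second to last), or null when there are too few parts.
    public static string PartFromEnd(string fileName, int fromEnd)
    {
        string[] parts = Path.GetFileNameWithoutExtension(fileName).Split('_');
        return fromEnd >= 1 && fromEnd <= parts.Length
            ? parts[parts.Length - fromEnd]
            : null;
    }

    public static void Main()
    {
        string[] filenames = { "aaa_ab12345.txt", "aaa_bbb_ab12345.txt", "bbb_ac12345.txt" };
        string[] distinct = filenames.Select(f => PartFromEnd(f, 1)).Distinct().ToArray();
        Console.WriteLine(string.Join(", ", distinct)); // ab12345, ac12345
    }
}
```

Because the extension is stripped first, the wanted part is the last one (fromEnd = 1); with the extension still attached and "." used as a separator, it would be the second to last.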

Use everything before a specific character

So, I've been learning C# and I need to remove everything after the ":" character.
I've used a StreamReader to read the text file, but then I couldn't use the Split function. Then I tried importing it with an int function, but again I couldn't use Split.
What I want is to import a text file that's written like this:
name:lastname
name2:lastname2
and show only name and name2.
I've been searching this for a couple of days but I can't seem to figure it out!
I don't know what I'm doing wrong and how to import the text file without using StreamReader or anything else.
Edit:
I'm trying to post something to a website with a URL like:
example.com/q=(name without ":")
Edit 2:
StreamReader list = new StreamReader(@"list.txt");
string reader = list.ReadToEnd();
string[] split = reader.Split(":".ToCharArray());
Console.WriteLine(split);
gives this output:
System.String[]
You've got a few issues here. First, use File.ReadLines() instead of a StreamReader; it's much simpler:
IEnumerable<string> lines = File.ReadLines("path/to/file");
Next, your lines variable needs to be iterated so you can get to each line of the collection:
foreach (string line in lines)
{
    //TODO: write split logic here
}
Then you have to split each line on the ':' character:
string[] split = line.Split(':');
Your split variable is an array of strings (i.e. string[]), which means you have to access a specific index of the array to see its value. This is your second issue: if you pass split to Console.WriteLine(), under the hood it just calls .ToString() on the object you passed, and for a string[] that returns the type name rather than the values. You have to write the values out yourself.
So if your line variable was "name:Steve", the split variable would have two elements and look like this:
//split[0] = "name"
//split[1] = "Steve"
If your file is small and each name:lastname pair is on its own line, use:
var lines = File.ReadAllLines("filePath");
foreach (var line in lines)
{
    var array = line.Split(':');
    if (array.Length > 0)
    {
        var name = array[0];
    }
}
If the name:lastname pairs are not on separate lines, tell me how they are separated.
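Putting the answers above together, a minimal sketch of "keep everything before the colon" could look like this. NameExtractor and BeforeColon are hypothetical names, the sample lines stand in for the contents of list.txt, and returning the whole line when no ':' is present is an assumption.

```csharp
using System;

public static class NameExtractor
{
    // Returns the text before the first ':' or the whole line if there is no ':'.
    public static string BeforeColon(string line)
    {
        int index = line.IndexOf(':');
        return index < 0 ? line : line.Substring(0, index);
    }

    public static void Main()
    {
        // In the real program these lines would come from File.ReadLines("list.txt").
        string[] lines = { "name:lastname", "name2:lastname2" };
        foreach (string line in lines)
            Console.WriteLine(BeforeColon(line)); // prints "name", then "name2"
    }
}
```

Using IndexOf and Substring avoids allocating an array per line, but line.Split(':')[0] gives the same result and is fine for small files.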

Remove or don't write last ';' symbol in the row

I have a script that exports to CSV.
I export a list of structs, writing with a StringWriter. In a foreach loop over the struct's fields I iterate through all the properties and put ';' after every property. At the end of each line I call WriteLine().
So as output I have:
value1;value2;value3;
And I want:
value1;value2;value3
The question is: how do I get what I want from what I have, building on what I've already made?
I have 3 ideas right now:
The last 2 symbols in a line should be something like ";\r(\n)", so replace this combination with nothing.
Check whether the property is the last one.
Trim the last symbol (before the newline) in each row.
Use String.Join to form each line for your output. It prevents you from having to check which term is last.
http://msdn.microsoft.com/en-us/library/57a79xd0.aspx
string[] values = { "value1", "value2", "value3" };
string line = string.Join(";", values);
line will be
"value1;value2;value3"
Try this code:
str = str.TrimEnd(';');
Usually I just delete the last character after the line is formed.
I use a StringBuilder and just do:
var builder = new StringBuilder();
// ...
// add the text
// ...
builder.Length--;
This way I can avoid the string copy.
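As a rough illustration of the String.Join suggestion applied to a whole export, here is a minimal sketch that writes several rows through a StringWriter. CsvExport and ToCsv are hypothetical names, and representing each row as a string[] is an assumption; the OP's struct properties would first have to be converted to strings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvExport
{
    // Writes one line per row; string.Join puts ';' between values only,
    // so no trailing separator has to be removed afterwards.
    public static string ToCsv(IEnumerable<string[]> rows)
    {
        var writer = new StringWriter();
        foreach (string[] row in rows)
            writer.WriteLine(string.Join(";", row));
        return writer.ToString();
    }

    public static void Main()
    {
        var rows = new List<string[]>
        {
            new[] { "value1", "value2", "value3" },
            new[] { "a", "b", "c" }
        };
        Console.Write(ToCsv(rows));
    }
}
```

Note that this does not quote values that themselves contain ';' or line breaks; a real CSV export would need to handle that.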

Cutting from a string in C#

My strings look like this: aaa/b/cc/dd/ee. I want to cut off the first part, without the /. How can I do it? I have many strings and they don't all have the same length. I tried to use Substring(), but what about the /?
I want to add 'aaa' to the first TreeNode, 'b' to the second, etc. I know how to add something to a TreeView, but I don't know how to get these parts.
Maybe the Split() method is what you're after?
string value = "aaa/b/cc/dd/ee";
string[] collection = value.Split('/');
Identifies the substrings in this instance that are delimited by one or more characters specified in an array, then places the substrings into a String array.
Based on your updates related to a TreeView (ASP.Net? WinForms?) you can do this:
foreach (string text in collection)
{
    TreeNode node = new TreeNode(text);
    myTreeView.Nodes.Add(node);
}
Use Substring and IndexOf to find the location of the first /.
To get the first part:
// from memory, need to test :)
string output = inputString.Substring(0, inputString.IndexOf("/"));
To just cut the first part:
// from memory, need to test :)
string output = inputString.Substring(inputString.IndexOf("/"));
You would probably want to do:
string[] parts = "aaa/b/cc/dd/ee".Split(new char[] { '/' });
Sounds like this is a job for... Regular Expressions!
One way to do it is by using string.Split to split your string into an array, and then string.Join to make whatever parts of the array you want into a new string.
For example:
var parts = input.Split('/');
var processedInput = string.Join("/", parts.Skip(1));
This is a general approach. If you only need to do very specific processing, you can be more efficient with string.IndexOf, for example:
var processedInput = input.Substring(input.IndexOf('/') + 1);
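The IndexOf-based snippets above assume the string always contains a '/'. A minimal sketch that also handles the missing-separator case might look like this; PathCutter, FirstPart and Rest are hypothetical names, and the choice to return the whole string or an empty string when there is no '/' is an assumption.

```csharp
using System;

public static class PathCutter
{
    // Text before the first '/', or the whole string when there is no '/'.
    public static string FirstPart(string input)
    {
        int index = input.IndexOf('/');
        return index < 0 ? input : input.Substring(0, index);
    }

    // Text after the first '/', or an empty string when there is no '/'.
    public static string Rest(string input)
    {
        int index = input.IndexOf('/');
        return index < 0 ? string.Empty : input.Substring(index + 1);
    }

    public static void Main()
    {
        Console.WriteLine(FirstPart("aaa/b/cc/dd/ee")); // aaa
        Console.WriteLine(Rest("aaa/b/cc/dd/ee"));      // b/cc/dd/ee
    }
}
```

For filling a TreeView with every part, Split('/') as shown earlier remains the simpler choice; this variant is only useful when just the first part, or everything after it, is needed.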
