I have a problem using LINQ on a text file. The file has the following structure:
================ 09.01.2017 [8:51:11] created by VBScript ================
....some text
============================= END =============================
================ 16.01.2017 [9:49:09] created by VBScript ================
....some text
============================= END =============================
================ 18.01.2017 [8:43:50] created by VBScript ================
....some text
============================= END =============================
etc
So I want to select all lines from that file that start and end with "=" and get their indexes (positions) in the file.
First step: I opened the file and converted it to a List (because a list is easier to work with):
string filekvitErrorGroupsResource = Utils.ReadTextResource(resourceName, Assembly.GetExecutingAssembly());
string[] stringSeparators = {"\r\n"};
string[] lines = filekvitErrorGroupsResource.Split(stringSeparators, StringSplitOptions.None);
return new List<string>(lines);
Second step: I tried a simple lambda query on the list with this condition:
var myQuery = lines.Where(l => l.StartsWith("=") && l.EndsWith("="))
.Select(l => new {idx = lines.IndexOf(l), body = l});
PROBLEM: I expected to receive a list of strings with unique indexes (idx), but in the result every "END" line has the same idx.
So the index of the line with "END" isn't unique. Why?
a.IndexOf(b) returns the index of the first occurrence of b within a, so the index of === END === is always the same.
Instead, you can use an overload of Select which takes Func<TSource, int, TResult> as a parameter so that you can get an index of the element.
var myQuery = lines
.Select((l, i) => new {idx = i, body = l})
.Where(l => l.body.StartsWith("=") && l.body.EndsWith("="));
You can get a distinct index for each line by calling Select first and Where afterwards.
var myQuery = lines.Select((l,idx) => new {idx = idx, body = l}).Where(m => m.body.StartsWith("=") && m.body.EndsWith("="));
Here is the fiddle: https://dotnetfiddle.net/JW7S1s
Edit: answer updated as per the comment.
Your issue is that all of the END lines are identical strings, so every call to lines.IndexOf(..) returns the index of the first matching instance. You'll need to introduce a new method (maybe name it NextIndexOf) that takes the list and keeps track of the last index it returned.
Each subsequent call to NextIndexOf would then start searching from where the previous call left off.
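A minimal sketch of that idea follows. NextIndexOf is a hypothetical helper (it is not part of the framework), the sample lines are shortened stand-ins for the real file, and the sketch assumes the lines are visited in their original order:

```csharp
using System;
using System.Collections.Generic;

var lines = new List<string>
{
    "=== 09.01.2017 ===",
    "....some text",
    "=== END ===",
    "=== 16.01.2017 ===",
    "....some text",
    "=== END ===",
};

// Index returned by the previous call; -1 means "nothing found yet".
int lastIndex = -1;

// Hypothetical helper: searches only the part of the list after the last hit.
int NextIndexOf(List<string> source, string value)
{
    // List<T>.IndexOf(item, index) starts searching at the given index.
    lastIndex = source.IndexOf(value, lastIndex + 1);
    return lastIndex;
}

var found = new List<int>();
foreach (var line in lines)
{
    if (line.StartsWith("=") && line.EndsWith("="))
    {
        found.Add(NextIndexOf(lines, line));
    }
}

// Prints 0, 2, 3, 5
Console.WriteLine(string.Join(", ", found));
```

Because the helper keeps state between calls, it is easy to misuse; the Select overload with an index is usually the simpler choice.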
Related
I would like to know how to do this with C#:
I have a CSV file with multiple columns as follows:
I would like to concatenate the result of all the lines of the first column to have:
Name = NDECINT, NDEC, NFAC, ORIGIN .....
You asked for plain C#. This was written against .NET 5.
var yourData = File.ReadAllLines("yourFile.csv")
.Skip(1)
.Select(x => x.Split(','))
.Select(x => new
{
Name = x[0] //only working with Name column
,Type = int.Parse(x[1]) //Only added for reference for handling more columns
});
string namesJoined = string.Join(',', yourData.Select(x => x.Name));
This is really basic code and does not handle the crazy things that can be inside a csv like a comma in the name for example.
This solution is for SSIS.
Add a variable called concat set equal to ""
Read the file using SSIS.
Add a script component
Pass in Row A and add variable
Set variable to concat += RowA + ","
When you are done, you will have an extra "," on the variable that needs to be removed.
Use an expression.
concat = left(concat, len(concat)-1)
What I'm currently doing is reading the text from two .txt files into arrays, comparing the two, and outputting the lines that are in B but not in A:
string[] linesA = File.ReadAllLines(@"path\file.txt");
string[] linesB = File.ReadAllLines(@"path\file2.txt");
IEnumerable<String> onlyB = linesB.Except(linesA);
string[] newstr = new HashSet<string>(onlyB).ToArray();
File.WriteAllLines(@"C:\path\", newstr);
Let's say the files contain the following:
file a:
code(324332): 65dfsdf4fth
code(32342): hdfgvsdfsdgh
code(323462): h29dfs8dh
file b:
code(324332): 65dfsdf4fth
code(32342): hdfgvsdfsdgh
code(323462): h29dfs8dh
code(453453): 8gbhfhk,jv
code(343435): gigdbioyvgi
code(3435343): guidfyvfhs
How would I go about getting the text after the ":" and removing duplicates, so that in the end the output would be:
8gbhfhk,jv
gigdbioyvgi
guidfyvfhs
Kind regards,
Phil
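You can go through the files line by line and add each entry to a Dictionary<TKey, TValue> with Dictionary.Add(TKey, TValue), so that you end up with only the unique values. Note that Add throws an ArgumentException if the key already exists, so check ContainsKey first.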
https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.dictionary-2.add?view=netframework-4.8
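A rough sketch of the dictionary approach, assuming the text after the first ":" is used as the key. The sample arrays stand in for the two files and the AddLines helper is hypothetical. This keeps one copy of every value; on its own it does not tell you which values appear only in file B:

```csharp
using System;
using System.Collections.Generic;

// Stand-ins for File.ReadAllLines(...) on the two files.
string[] linesA = { "code(324332): 65dfsdf4fth", "code(32342): hdfgvsdfsdgh" };
string[] linesB = { "code(324332): 65dfsdf4fth", "code(453453): 8gbhfhk,jv" };

// Key: text after the first ':'; value: the first line that carried it.
var unique = new Dictionary<string, string>();

// Hypothetical helper: adds each line unless its key is already known.
void AddLines(IEnumerable<string> lines)
{
    foreach (var line in lines)
    {
        string key = line.Substring(line.IndexOf(':') + 1).TrimStart();

        // Dictionary.Add throws on a duplicate key, so check first.
        if (!unique.ContainsKey(key))
        {
            unique.Add(key, line);
        }
    }
}

AddLines(linesA);
AddLines(linesB);

foreach (var key in unique.Keys)
{
    Console.WriteLine(key);
}
```

If only the keys matter, a HashSet<string> would do the same job with less ceremony.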
To get the text after : you can use the Substring and IndexOf methods, then remove the whitespace at the beginning of your new string with TrimStart. At the end use Concat to combine the two lists and GroupBy to filter out the values that have duplicates:
string[] linesA = File.ReadAllLines(@"C:\file.txt");
string[] linesB = File.ReadAllLines(@"C:\file2.txt");
IEnumerable<string> linesA2 = linesA.Select(x => x.Substring(x.IndexOf(":") + 1).TrimStart());
IEnumerable<string> linesB2 = linesB.Select(x => x.Substring(x.IndexOf(":") + 1).TrimStart());
string[] result = linesA2.Concat(linesB2).GroupBy(x => x)
.Where(g => g.Count() == 1)
.Select(y => y.Key)
.ToArray();
result:
result[0] ="8gbhfhk,jv"
result[1] ="gigdbioyvgi"
result[2] ="guidfyvfhs"
Write the result array to new text file:
File.WriteAllLines(@"C:\file3.txt", result);
I'm creating an application which will read in a large data file and return a specific selection of text from each line in a .dat file. Please see the example data below.
22/06/2016 22:18:21.209 Type6 -92.31435 2.06424 0.07686
22/06/2016 22:18:21.210 Type34 -91.4085 1.84464 -0.09333
I need the first 3 fields, which are the date, time and type. The values after the type go on for a while, and I have a large number of rows to collect from. I have thought about just splitting each line and taking the first 3 fields. Would this work, or is there an easier way to do this?
Thanks
You are on the right track (extracting just three fields); I suggest using LINQ here, e.g.
var source = File
.ReadLines(@"C:\MyData.dat")
.Select(line => line.Split(new char[] { ' ' }, 4))
.Where(items => items.Length >= 3) // it seems that you have empty lines or something
.Select(items => new {
// Let's combine date and time into DateTime
date = DateTime.ParseExact(items[0] + " " + items[1],
@"dd/MM/yyyy H:m:s.fff",
CultureInfo.InvariantCulture),
kind = items[2] });
// .ToArray(); // you may want add materialization (i.e. read once and put into array)
Having got this Linq query you can easily filter out and represent the data you want, e.g.
var test = source
.Where(item => item.date > DateTime.Now.AddDays(-3)) // let's have fresh records only
.OrderByDescending(item => item.date)
.Select(item => $"{item.date} {item.kind}");
Console.Write(string.Join(Environment.NewLine, test));
You could write something that reads only the first characters of each line, but the length of a line is not stored anywhere, so you still have to read through all the data.
You should use File.ReadLines(path) because it loads the data lazily: only one line is loaded per iteration. For each line, extract the data you need and store it in whatever structure you like:
var relevantData = new List<T>();
foreach(var line in File.ReadLines(path))
{
// parse the data you need.
relevantData.Add( new T { Date = whatever, ..... });
}
If you need to parse it multiple times, you could create an index file that contains the start index of each line.
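One way such an index could look is sketched below. It keeps the offsets in memory rather than in a separate file, and it assumes lines end in "\n" with an ASCII or UTF-8 encoding. BuildLineIndex and ReadLineAt are hypothetical helpers, and the temp file only stands in for the real .dat file:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: records the byte offset at which each line starts.
List<long> BuildLineIndex(string path)
{
    var offsets = new List<long>();
    using (var stream = new BufferedStream(File.OpenRead(path)))
    {
        long position = 0;
        bool atLineStart = true;
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (atLineStart)
            {
                offsets.Add(position);
                atLineStart = false;
            }
            if (b == '\n')
            {
                atLineStart = true;
            }
            position++;
        }
    }
    return offsets;
}

// Hypothetical helper: jumps to a recorded offset and reads a single line.
string ReadLineAt(string path, long offset)
{
    using (var stream = File.OpenRead(path))
    {
        stream.Seek(offset, SeekOrigin.Begin);
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadLine() ?? string.Empty;
        }
    }
}

// Usage with a small temporary file.
string samplePath = Path.GetTempFileName();
File.WriteAllText(samplePath, "first\nsecond\nthird\n");
List<long> index = BuildLineIndex(samplePath);
Console.WriteLine(ReadLineAt(samplePath, index[1])); // prints "second"
File.Delete(samplePath);
```

Building the index still costs one full pass over the file; the gain only shows up on later lookups.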
I have a SQL database and I want to use a specific column of it. The code below returns the value in the third column; I just want to know what exactly ((string[])result[0])[2] does in the code.
Note: the "SingleSelectWhere" function selects the records whose "word" column matches "bag".
db.OpenDB("English.db");
ArrayList result = db.SingleSelectWhere("petdef", "*", "word", "=", "'bag'");
if(result.Count > 0)
{
description = ((string[])result[0])[2];
}
db.CloseDB();
If you don't know what code does, just try to split it into some more "readable" code. If we take this line: description = ((string[])result[0])[2]; we can do:
var result1 = result;
var result2 = result[0];
var result3 = (string[])result2;
var description = result3[2];
As a tip: set a breakpoint on the first line, start debugging, and look at what each step does and what each variable contains.
The answer: it takes the list named result and returns its first element. Then it casts that element to a string array and finally selects the third element (the index is zero-based!). Hope this helps.
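The same steps can be tried in isolation. In this sketch the ArrayList is filled by hand with a made-up row, as a stand-in for whatever db.SingleSelectWhere actually returns:

```csharp
using System;
using System.Collections;

// Stand-in for the query result: each element is one row stored as string[].
// The row content here is invented for illustration.
var result = new ArrayList
{
    new string[] { "1", "bag", "a flexible container" },
};

// ArrayList hands elements back as object, so the row must be cast.
object firstRow = result[0]!;
string[] columns = (string[])firstRow;

// Index 2 is the third column because indexes start at 0.
string description = columns[2];

Console.WriteLine(description); // prints "a flexible container"
```

If an element is not actually a string[], the cast throws an InvalidCastException at run time.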
I have a list that I filter according to the text entered in a TextBox in XAML. The code below filters the list stored in the results variable: it checks whether the textbox input, i.e. queryString, matches the Name of an item in the results list, so it only brings back items whose Name matches the string EXACTLY.
var filteredItems = results.Where(
p => string.Equals(p.Name, queryString, StringComparison.OrdinalIgnoreCase));
How do I change this so that it returns the items in the list whose Name is similar to the queryString?
To describe what I mean by Similar:
An item in the list has Name = Smirnoff Vodka. I want the item Smirnoff Vodka to be returned if "vodka" or "smirnoff" is entered in the textbox.
As it is with the code above, to get Smirnoff Vodka returned as a result, the exact Name "Smirnoff Vodka" would have to be entered in the textbox.
It really depends on what you mean, by saying "similar"
Options:
1) var filteredItems = results.Where(p => p.Name != null && p.Name.ToUpper().Contains(queryString.ToUpper()));
2) There is also a well-known algorithm called "Levenshtein distance":
http://en.wikipedia.org/wiki/Levenshtein_distance
http://www.codeproject.com/Articles/13525/Fast-memory-efficient-Levenshtein-algorithm
The last link contains the source code in C#. With it you can determine "how close" the query string is to a string in your list.
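For reference, here is a compact sketch of the textbook two-row dynamic-programming version of Levenshtein distance. It is not the code from the linked article, and the function name is only illustrative:

```csharp
using System;

// Number of single-character insertions, deletions and substitutions
// needed to turn a into b.
int Levenshtein(string a, string b)
{
    var previous = new int[b.Length + 1];
    var current = new int[b.Length + 1];

    // Distance from the empty prefix of a to each prefix of b.
    for (int j = 0; j <= b.Length; j++)
    {
        previous[j] = j;
    }

    for (int i = 1; i <= a.Length; i++)
    {
        current[0] = i;
        for (int j = 1; j <= b.Length; j++)
        {
            int cost = a[i - 1] == b[j - 1] ? 0 : 1;
            current[j] = Math.Min(
                Math.Min(current[j - 1] + 1, previous[j] + 1),
                previous[j - 1] + cost);
        }

        // Reuse the two rows instead of allocating a full matrix.
        var swap = previous;
        previous = current;
        current = swap;
    }

    return previous[b.Length];
}

Console.WriteLine(Levenshtein("kitten", "sitting")); // prints 3
```

The comparison is case-sensitive; to ignore case, lower-case both strings before calling it. A small distance relative to the string length can then be treated as "similar", with the threshold being a judgment call.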
Try this:
fileList.Where(item => filterList.Contains(item))
Try this:
var query = "Smirnoff Vodka";
var queryList = query.Split(new [] {" "}, StringSplitOptions.RemoveEmptyEntries);
var fileList = new List<string>{"smirnoff soup", "absolut vodka", "beer"};
var result = from file in fileList
from item in queryList
where file.ToLower().Contains(item.ToLower())
select file;
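One thing to watch with the nested from clauses: an item is returned once per matching word, so a name containing both words appears twice. A possible variation using Any returns each item at most once. The list contents here are sample data, and IndexOf with StringComparison.OrdinalIgnoreCase replaces the ToLower calls:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var query = "Smirnoff Vodka";
var words = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
var names = new List<string> { "Smirnoff Vodka", "smirnoff soup", "absolut vodka", "beer" };

// Keep a name if it contains at least one of the query words, ignoring case.
var matches = names
    .Where(name => words.Any(
        word => name.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0))
    .ToList();

// Prints Smirnoff Vodka, smirnoff soup and absolut vodka, each once.
foreach (var match in matches)
{
    Console.WriteLine(match);
}
```

Swapping Any for All would instead require every query word to appear in the name.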