Searching a list of strings in C#

So I want to use one of these LINQ functions with this List<string> I have.
Here's the setup:
List<string> all = FillList();
string temp = "something";
string found;
int index;
I want to find the string in all that matches temp when both are lower cased with ToLower(). Then I'll use the found string to find its index and remove it from the list.
How can I do this with LINQ?

I get the feeling that you don't care so much about comparing the lowercase versions as you do about just performing a case-insensitive match. If so:
var listEntry = all.Where(entry =>
string.Equals(entry, temp, StringComparison.CurrentCultureIgnoreCase))
.FirstOrDefault();
if (listEntry != null) all.Remove(listEntry);

OK, I see my imperative solution is not getting any love, so here is a LINQ solution that is probably less efficient, but still avoids searching through the list two times (which is a problem in the accepted answer):
var all = new List<string>(new [] { "aaa", "BBB", "Something", "ccc" });
const string temp = "something";
var found = all
.Select((element, index) => new {element, index})
.FirstOrDefault(pair => StringComparer.InvariantCultureIgnoreCase.Equals(temp, pair.element));
if (found != null)
all.RemoveAt(found.index);
You could also do this (which is probably more performant than the above, since it does not create a new object for each element):
var index = all
.TakeWhile(element => !StringComparer.InvariantCultureIgnoreCase.Equals(temp, element))
.Count();
if (index < all.Count)
all.RemoveAt(index);
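The index-based removal above can also be sketched with List<T>.FindIndex, which returns -1 when nothing matches. This is only an illustration: the class and method names (FirstMatch, RemoveFirstMatch) are made up for the example, and StringComparison.OrdinalIgnoreCase is an assumed choice of comparison that you may want to swap for a culture-aware one.

```csharp
using System;
using System.Collections.Generic;

public static class FirstMatch
{
    // Hypothetical helper: removes the first element that equals 'target'
    // ignoring case and returns it, or returns null when there is no match.
    public static string RemoveFirstMatch(List<string> list, string target)
    {
        // FindIndex walks the list once and stops at the first match.
        int index = list.FindIndex(
            element => string.Equals(element, target, StringComparison.OrdinalIgnoreCase));
        if (index < 0)
            return null;

        string found = list[index];
        list.RemoveAt(index);
        return found;
    }

    public static void Main()
    {
        var all = new List<string> { "aaa", "BBB", "Something", "ccc" };
        Console.WriteLine(RemoveFirstMatch(all, "something")); // Something
        Console.WriteLine(string.Join(",", all));              // aaa,BBB,ccc
    }
}
```

Like the TakeWhile version, this searches the list only once and creates no per-element objects.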

I want to add to the previous answers: why not just do it like this?
string temp = "something";
List<string> all = FillList().Where(x => x.ToLower() != temp.ToLower()).ToList();
Then you have the list without those items in the first place.

all.Remove(all.FirstOrDefault(
s => s.Equals(temp,StringComparison.InvariantCultureIgnoreCase)));

Use the tool best suited for the job. In this case a simple piece of procedural code seems more appropriate than LINQ:
var all = new List<string>(new [] { "aaa", "BBB", "Something", "ccc" });
const string temp = "something";
var cmp = StringComparer.InvariantCultureIgnoreCase; // Or another comparer of your choosing.
for (int index = 0; index < all.Count; ++index) {
string found = all[index];
if (cmp.Equals(temp, found)) {
all.RemoveAt(index);
// Do whatever it is you want to do with 'found'.
break;
}
}
This is probably as fast as you can get, because:
Comparison is done in place - there is no creation of temporary uppercase (or lowercase) strings just for comparison purposes.
Element is searched only once (O(index)).
Element is removed in place without constructing a new list (O(all.Count-index)).
No delegates are used.
A plain for loop tends to be faster than foreach.
It can also be adapted fairly easily should you want to handle duplicates.
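As a rough sketch of the duplicate-handling case mentioned above, List<T>.RemoveAll removes every matching element in one pass and returns the number removed. The names CaseInsensitiveRemoval and RemoveAllMatches are invented for this example, and note that it does use a delegate, unlike the loop above.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveRemoval
{
    // Hypothetical helper: removes every element equal to 'target' under
    // the given comparer and returns how many elements were removed.
    public static int RemoveAllMatches(List<string> list, string target, StringComparer cmp)
    {
        return list.RemoveAll(element => cmp.Equals(target, element));
    }

    public static void Main()
    {
        var all = new List<string> { "aaa", "BBB", "Something", "ccc", "SOMETHING" };
        int removed = RemoveAllMatches(all, "something", StringComparer.InvariantCultureIgnoreCase);
        Console.WriteLine(removed);               // 2
        Console.WriteLine(string.Join(",", all)); // aaa,BBB,ccc
    }
}
```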

Related

Remove names that contain another in a list

I have a file with "Name|Number" in each line and I wish to remove the lines with names that contain another name in the list.
For example, if there is "PEDRO|3" , "PEDROFILHO|5" , "PEDROPHELIS|1" in the file, I wish to remove the lines "PEDROFILHO|5" , "PEDROPHELIS|1".
The list has 1.8 million lines. I made it like this, but it's too slow:
List<string> names = File.ReadAllLines("firstNames.txt").ToList();
List<string> result = File.ReadAllLines("firstNames.txt").ToList();
foreach (string name in names)
{
string tempName = name.Split('|')[0];
List<string> temp = names.Where(t => t.Contains(tempName)).ToList();
foreach (string str in temp)
{
if (str.Equals(name))
{
continue;
}
result.Remove(str);
}
}
File.WriteAllLines("result.txt",result);
Does anyone know a faster way? Or how to improve the speed?
Since you are looking for matches everywhere in the word, you will end up with an O(n²) algorithm. You can improve the implementation a bit by avoiding string deletion inside a list, which is an O(n) operation in itself:
var toDelete = new HashSet<string>();
var names = File.ReadAllLines("firstNames.txt");
foreach (string name in names) {
var tempName = name.Split('|')[0];
toDelete.UnionWith(
// Length constraint removes self-matches
names.Where(t => t.Length > name.Length && t.Contains(tempName))
);
}
File.WriteAllLines("result.txt", names.Where(name => !toDelete.Contains(name)));
This works, but I don't know if it's quicker. I haven't tested it on millions of lines. Remove the ToLower calls if the names are all in the same case.
List<string> names = File.ReadAllLines(@"C:\Users\Rob\Desktop\File.txt").ToList();
var result = names.Where(w => !names.Any(a=> w.Split('|')[0].Length> a.Split('|')[0].Length && w.Split('|')[0].ToLower().Contains(a.Split('|')[0].ToLower())));
File.WriteAllLines(@"C:\Users\Rob\Desktop\result.txt", result);
test file had
Rob|1
Robbie|2
Bert|3
Robert|4
Jan|5
John|6
Janice|7
Carol|8
Carolyne|9
Geoff|10
Geoffrey|11
Result had
Rob|1
Bert|3
Jan|5
John|6
Carol|8
Geoff|10
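A variation on the answer above, offered only as a sketch: it splits each line once up front instead of calling Split repeatedly inside the nested query, and compares names case-insensitively without allocating lowercase copies. It is still O(n²) in the number of lines, so it reduces the constant factor rather than the complexity. NameFilter and RemoveContaining are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NameFilter
{
    // Hypothetical helper: keeps only the lines whose name does not
    // contain another, strictly shorter name from the same input.
    public static List<string> RemoveContaining(IEnumerable<string> lines)
    {
        // Split every line exactly once and keep the name next to its line.
        var entries = lines
            .Select(line => new { Line = line, Name = line.Split('|')[0] })
            .ToList();

        return entries
            .Where(e => !entries.Any(other =>
                other.Name.Length > 0 &&               // ignore empty names
                other.Name.Length < e.Name.Length &&   // skips self-matches
                e.Name.IndexOf(other.Name, StringComparison.OrdinalIgnoreCase) >= 0))
            .Select(e => e.Line)
            .ToList();
    }

    public static void Main()
    {
        var lines = new[] { "PEDRO|3", "PEDROFILHO|5", "PEDROPHELIS|1" };
        Console.WriteLine(string.Join(",", RemoveContaining(lines))); // PEDRO|3
    }
}
```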

How to quickly cut up and reassemble a C# List?

The setup:
I have a session variable that carries a list of IDs, pipe-delimited. The IDs are related to views in my site, and related to a breadcrumb builder.
Session["breadCrumb"] = "1001|1002|1003|1004";
If I'm on the view that corresponds to 1002, I'd like to cut everything AFTER that id out of the session variable.
I'd thought to use something like:
var curView = "1002";
if (Session["breadCrumb"] != null) {
var crumb = Session["breadCrumb"].ToString().Split('|').ToList();
var viewExists = crumb.Any(c => c == curView);
if (viewExists) {
//remove everything after that item in the array.
}
}
But I'm wide open to methodologies.
You could use TakeWhile to get back only the items from the split list that precede the current view.
var curView = "1002";
if (Session["breadCrumb"] != null)
{
var crumb = Session["breadCrumb"].ToString().Split('|').ToList();
var viewExists = crumb.TakeWhile(c => c != curView).ToList();
viewExists.Add(curView);
string result = string.Join("|",viewExists);
}
While this approach works, I think the previous answer (now wrongly deleted by Mr. Andrew Whitaker) was also correct. Using IndexOf should be faster, with less splitting, looping, and joining of strings. I suggest that Mr. Whitaker undelete his answer.
EDIT
This is from Mr. Whitaker's deleted answer.
I am reposting it here because I think his approach is simpler and should perform better, so future readers can see this option too.
var crumb = Session["breadCrumb"].ToString();
int index = crumb.IndexOf(curView);
if (index >= 0)
{
Session["breadCrumb"] = crumb.Substring(0, index + curView.Length);
}
If Andrew decides to undelete his answer, I will be glad to remove this part. Just let me know.
You could just store a List<string> in the Session directly. This saves you from having to split/concat the string manually. I know this does not answer the question directly, but I believe it is a superior solution to that.
var curView = "1002";
var crumb = Session["breadCrumb"] as List<string>;
if (crumb != null) {
var viewExists = crumb.Any(c => c == curView);
if (viewExists) {
// remove everything after that item in the array.
}
}
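The step left as a comment above ("remove everything after that item") could be filled in along these lines once the breadcrumb is a List<string>. This is a minimal sketch, not part of the original answer: BreadCrumbs and TruncateAfter are hypothetical names, and the session lookup is left out so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;

public static class BreadCrumbs
{
    // Hypothetical helper: drops every entry after 'curView'.
    // Returns false (and leaves the list untouched) if 'curView' is absent.
    public static bool TruncateAfter(List<string> crumb, string curView)
    {
        int index = crumb.IndexOf(curView);
        if (index < 0)
            return false;

        // Remove the tail that follows the current view; the count is 0
        // when the current view is already the last entry.
        crumb.RemoveRange(index + 1, crumb.Count - index - 1);
        return true;
    }

    public static void Main()
    {
        var crumb = new List<string> { "1001", "1002", "1003", "1004" };
        Console.WriteLine(TruncateAfter(crumb, "1002")); // True
        Console.WriteLine(string.Join("|", crumb));      // 1001|1002
    }
}
```

Because the list is compared element by element, an ID such as "1002" cannot accidentally match inside a longer ID, which is a risk with substring searches on the joined string.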
I almost regret this, but frankly I'd just go for a regular expression:
var result = Regex.Replace(input, "(?<=(\\||^)" + current + ")(?=\\||$).*", "");
This does not directly tell you whether the current view existed in the input. That is also possible with the regex, but in this particular instance another, dead simple test exists:
var viewExists = result.Length != current.Length;

Comparing two strings with different orders

I have a dictionary with a list of strings that each look something like:
"beginning|middle|middle2|end"
Now what I wanted was to do this:
List<string> stringsWithPipes = new List<string>();
stringsWithPipes.Add("beginning|middle|middle2|end");
...
if (stringsWithPipes.Contains("beginning|middle|middle2|end"))
{
return true;
}
The problem is that the string I'm comparing it against is built slightly differently, so it ends up being more like:
if (stringsWithPipes.Contains("beginning|middle2|middle||end"))
{
return true;
}
and obviously this ends up being false. However, I want to consider it true, since it's only the order that is different.
What can I do?
You can split your string on | and then split the string to be compared, and then use Enumerable.Except along with Enumerable.Any like
List<string> stringsWithPipes = new List<string>();
stringsWithPipes.Add("beginning|middle|middle2|end");
stringsWithPipes.Add("beginning|middle|middle3|end");
stringsWithPipes.Add("beginning|middle2|middle|end");
var array = stringsWithPipes.Select(r => r.Split('|')).ToArray();
string str = "beginning|middle2|middle|end";
var compareArray = str.Split('|');
foreach (var subArray in array)
{
if (!subArray.Except(compareArray).Any())
{
//Exists
Console.WriteLine("Item exists");
break;
}
}
This can surely be optimized, but the above is one way to do it.
Try this instead:
if (stringsWithPipes.Any(p => p.Split('|')
.All(k => "beginning|middle2|middle|end".Split('|')
.Contains(k))))
Hope this helps!
You need to split on a delimiter:
var searchString = "beginning|middle|middle2|end";
var searchList = searchString.Split('|');
var stringsWithPipes = new List<string>();
stringsWithPipes.Add("beginning|middle|middle2|end");
...
return stringsWithPipes.Select(x => x.Split('|')).Any(x => Match(searchList,x));
Then you can implement Match in multiple ways.
First option: the candidate must contain all the search phrases but may include others.
bool Match(string[] search, string[] match) {
return search.All(x => match.Contains(x));
}
Second option: the candidate must contain all the search phrases and cannot include others.
bool Match(string[] search, string[] match) {
return search.All(x => match.Contains(x)) && search.Length == match.Length;
}
That should work.
List<string> stringsWithPipes = new List<string>();
stringsWithPipes.Add("beginning|middle|middle2|end");
string[] stringToVerifyWith = "beginning|middle2|middle||end".Split(new[] { '|' },
StringSplitOptions.RemoveEmptyEntries);
if (stringsWithPipes.Any(s => !s.Split('|').Except(stringToVerifyWith).Any()))
{
return true;
}
The Split removes any empty entries created by the doubled |. You then check what is left after removing every common element with the Except method. If nothing is left (the ! [...] .Any() check; .Count() == 0 would be valid too), every element of the list entry also appears in the string being verified.
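The Except check above only runs in one direction (it asks whether the list entry has any element missing from the other string). If both strings must contain exactly the same parts, one possible sketch uses HashSet<T>.SetEquals. PipeCompare and SameParts are names invented for this example, and treating repeated parts as a single part is an assumption that may not suit every case.

```csharp
using System;
using System.Collections.Generic;

public static class PipeCompare
{
    private static readonly char[] Pipe = { '|' };

    // Hypothetical helper: true when both pipe-delimited strings contain
    // the same set of non-empty parts, regardless of order or repetition.
    public static bool SameParts(string a, string b)
    {
        var partsA = new HashSet<string>(a.Split(Pipe, StringSplitOptions.RemoveEmptyEntries));
        return partsA.SetEquals(b.Split(Pipe, StringSplitOptions.RemoveEmptyEntries));
    }

    public static void Main()
    {
        Console.WriteLine(SameParts("beginning|middle|middle2|end",
                                    "beginning|middle2|middle||end")); // True
        Console.WriteLine(SameParts("beginning|middle|end",
                                    "beginning|middle|middle2|end"));  // False
    }
}
```

With this helper, the list check becomes stringsWithPipes.Any(s => PipeCompare.SameParts(s, candidate)), where candidate is the string being verified.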

Remove duplicate items from a List<String[]> in C#

I have a somewhat complex issue that I have been trying to resolve for some days. I'm using the PetaPoco ORM and didn't find any other way to do a complex query like this:
var data = new List<string[]>();
var db = new Database(connectionString);
var memberCapabilities = db.Fetch<dynamic>(Sql.Builder
.Select(@"c.column_name
,CASE WHEN c.is_only_view = 1
THEN c.is_only_view
ELSE mc.is_only_view end as is_only_view")
.From("capabilities c")
.Append("JOIN members_capabilities mc ON c.capability_id = mc.capability_id")
.Where("mc.member_id = @0", memberID)
.Where("c.table_id = @0", tableID));
var roleCapabilities = db.Fetch<dynamic>(Sql.Builder
.Select(@"c.column_name
,CASE WHEN c.is_only_view = 1
THEN c.is_only_view
ELSE rc.is_only_view end as is_only_view")
.From("capabilities c")
.Append("JOIN roles_capabilities rc ON c.capability_id = rc.capability_id")
.Append("JOIN members_roles mr ON rc.role_id = mr.role_id")
.Where("mr.member_id = @0", memberID)
.Where("c.table_id = @0", tableID));
I'm trying to get the user's capabilities, but my system actually has two ways to assign a capability to a user: directly to that user, or by attaching the user to a role. I wanted to get this merged list using a stored procedure, but I needed cursors and thought it might be easier and faster to do this in the web application. So I get those two dynamic lists, and the member capabilities take priority over the role capabilities, so I need to check that using loops. I did it like this:
for (int i = 0; i < roleCapabilities.Count; i++)
{
bool added = false;
for (int j = 0; j < memberCapabilities.Count; j++)
if (roleCapabilities[i].column_name == memberCapabilities[j].column_name)
{
data.Add(new string[2] { memberCapabilities[j].column_name, Convert.ToString(memberCapabilities[j].is_only_view) });
added = true;
break;
}
if (!added)
data.Add(new string[2] { roleCapabilities[i].column_name, Convert.ToString(roleCapabilities[i].is_only_view) });
}
So now the plan is to delete the duplicate entries. I have tried the following with no results:
data = data.Distinct();
Any help? Thanks
Make sure that your object either implements System.IEquatable or overrides Object.Equals and Object.GetHashCode. In this case, it looks like you're storing the data as string[2], which won't give you the desired behavior. Create a custom object to hold the data, and do one of the 2 options listed above.
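If changing the element type is not convenient, a related option is to keep the string[] rows and hand Distinct an IEqualityComparer<string[]> that compares array contents instead of references. The following is a sketch under that assumption; StringArrayComparer is a name made up for the example, not something provided by PetaPoco or the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two arrays are equal when they have the same
// strings in the same order.
public sealed class StringArrayComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] x, string[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(string[] obj)
    {
        if (obj == null) return 0;
        unchecked
        {
            // Combine the element hashes so equal arrays hash equally.
            int hash = 17;
            foreach (var s in obj)
                hash = hash * 31 + (s == null ? 0 : s.GetHashCode());
            return hash;
        }
    }
}

public static class DistinctDemo
{
    public static void Main()
    {
        var data = new List<string[]>
        {
            new[] { "One", "Two" },
            new[] { "One", "Two" },
            new[] { "One", "Three" }
        };
        List<string[]> distinct = data.Distinct(new StringArrayComparer()).ToList();
        Console.WriteLine(distinct.Count); // 2
    }
}
```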
If I understand your question correctly you want to get a distinct set of arrays of strings, so if the same array exists twice, you only want one of them? The following code will return arrays one and three while two is removed as it is the same as one.
var one = new[] {"One", "Two"};
var two = new[] {"One", "Two"};
var three = new[] {"One", "Three"};
List<string[]> list = new List<string[]>(){one, two, three};
var i = list.Select(l => new {Key = String.Join("|", l), Values = l})
.GroupBy(l => l.Key)
.Select(l => l.First().Values)
.ToArray();
You might have to use ToList() after Distinct():
List<string[]> distinct = data.Distinct().ToList();

SQL Like search in LIST and LINQ in C#

I have a List which contains the list of words that needs to be excluded.
My approach is to have a List which contains these words and use Linq to search.
List<string> lstExcludeLibs = new List<string>() { "CONFIG", "BOARDSUPPORTPACKAGE", "COMMONINTERFACE", ".H", };
string strSearch = "BoardSupportPackageSamsung";
bool has = lstExcludeLibs.Any(cus => lstExcludeLibs.Contains(strSearch.ToUpper()));
I want to find out whether part of the string strSearch is present in lstExcludeLibs.
It turns out that .Any looks only for an exact match. Is there any possibility of using LIKE or a wildcard search?
Is this possible in linq?
I could have achieved it using a foreach and contains but I wanted to use LINQ to make it simpler.
Edit: I tried List.Contains but it also doesn't seem to work
You've got it the wrong way round, it should be:-
List<string> lstExcludeLibs = new List<string>() { "CONFIG", "BOARDSUPPORTPACKAGE", "COMMONINTERFACE", ".H", };
string strSearch = "BoardSupportPackageSamsung";
bool has = lstExcludeLibs.Any(cus => strSearch.ToUpper().Contains(cus));
Btw, this is just an observation, but IMHO your variable name prefixes 'lst' and 'str' should be omitted. This is a misinterpretation of Hungarian notation and is redundant.
I think the line should be:
bool has = lstExcludeLibs.Any(cus => cus.Contains(strSearch.ToUpper()));
Is this useful to you ?
bool has = lstExcludeLibs.Any(cus => strSearch.ToUpper().Contains(cus));
OR
bool has = lstExcludeLibs.Where(cus => strSearch.ToUpper().IndexOf(cus) > -1).Count() > 0;
OR
bool has = lstExcludeLibs.Count(cus => strSearch.ToUpper().IndexOf(cus) > -1) > 0;
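The variants above all upper-case the search string and so rely on the list entries already being upper case. A possible alternative, sketched here with made-up names (LikeSearch, ContainsAny), passes StringComparison.OrdinalIgnoreCase to IndexOf so that neither side needs to be converted; whether ordinal comparison is appropriate depends on your data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LikeSearch
{
    // Hypothetical helper: true when 'text' contains at least one of the
    // fragments, ignoring case.
    public static bool ContainsAny(string text, IEnumerable<string> fragments)
    {
        return fragments.Any(
            fragment => text.IndexOf(fragment, StringComparison.OrdinalIgnoreCase) >= 0);
    }

    public static void Main()
    {
        var excludeLibs = new List<string> { "CONFIG", "BOARDSUPPORTPACKAGE", "COMMONINTERFACE", ".H" };
        Console.WriteLine(ContainsAny("BoardSupportPackageSamsung", excludeLibs)); // True
        Console.WriteLine(ContainsAny("Kernel", excludeLibs));                     // False
    }
}
```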
