Compare two Lists using Linq for partial matches

I tried looking through some of the other questions, but couldn't find any that did a partial match.
I have two List<string>
They have codes in them. One is a list of selected codes, one is a list of required codes. The entire code list is a tree though, so they have sub codes. An example would be
Code B
Code B.1
Code B.11
So let's say the required code is B, but anything under its tree will meet that requirement. If the selected codes are A and C the match would fail, but if one of the selected codes was B.1 it would count as a partial match.
I just need to know if any of the selected codes partially match any of the required codes. Here is my current attempt at this.
//Required is List<string> and Selected is a List<string>
int count = (from c in Selected where c.Contains(Required.Any()) select c).Count();
The error I get is on the Required.Any() and it's cannot convert from bool to string.
Sorry if this is confusing, let me know if adding any additional information would help.

I think you need something like this:
using System;
using System.Collections.Generic;
using System.Linq;
static class Program {
static void Main(string[] args) {
List<string> selected = new List<string> { "A", "B", "B.1", "B.11", "C" };
List<string> required = new List<string> { "B", "C" };
var matching = from s in selected where required.Any(r => s.StartsWith(r)) select s;
foreach (string m in matching) {
Console.WriteLine(m);
}
}
}
Applying the Any condition on required in this way should give you the elements that match - I'm not sure if you should use StartsWith or Contains, that depends on your requirements.
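One thing to watch with a plain StartsWith is that "B" is also a prefix of an unrelated code such as "BA". The following is a minimal sketch of a boundary-aware check, assuming that '.' separates the levels of a code (so B.1 and B.11 are both children of B). CodeMatcher, IsUnder and AnyPartialMatch are hypothetical names, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the '.' level separator is an assumption about the code format.
public static class CodeMatcher
{
    // True when 'selected' is the required code itself or sits below it in the tree.
    public static bool IsUnder(string selected, string required)
    {
        return selected.Equals(required, StringComparison.Ordinal)
            || selected.StartsWith(required + ".", StringComparison.Ordinal);
    }

    // True when at least one selected code satisfies at least one required code.
    public static bool AnyPartialMatch(IEnumerable<string> selected, IEnumerable<string> required)
    {
        return selected.Any(s => required.Any(r => IsUnder(s, r)));
    }
}
```

With this sketch, selected codes A and C against required code B give false, while A and B.1 give true. This also answers the original "do any of them match" question directly, without counting.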

If selected and required lists are large enough the following is faster than the accepted answer:
static void Main(string[] args)
{
List<string> selected = new List<string> { "A", "B", "B.1", "B.11", "C" };
List<string> required = new List<string> { "B", "C" };
required.Sort();
var matching = selected.Where(s =>
{
int index = required.BinarySearch(s);
if (index >= 0) return true; //exact match
index = ~index;
if (index == 0) return false;
return s.StartsWith(required[index - 1]);
});
foreach (string m in matching)
{
Console.WriteLine(m);
}
}
Given n = required.Count and m = selected.Count, the accepted answer's algorithm complexity is O(n*m). What I propose has better complexity, O((n+m)*log(n)): sorting required costs O(n*log(n)), and each of the m lookups is a binary search.
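A caveat on the sorted-list approach: it only tests the entry immediately before the insertion point, so it appears to miss a match when required holds nested codes. For example, with required = { "B", "B.1" } and the selected code "B.2", the predecessor is "B.1", which is not a prefix, even though "B" is. A possible alternative, sketched below under the assumption that '.' separates code levels, is to walk up the ancestors of each selected code and look each one up in a HashSet. AncestorLookup is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; assumes '.' separates the levels of a code.
public static class AncestorLookup
{
    // Returns the selected codes that equal a required code or have one as an ancestor.
    public static List<string> Matching(IEnumerable<string> selected, IEnumerable<string> required)
    {
        var requiredSet = new HashSet<string>(required, StringComparer.Ordinal);
        return selected.Where(s => HasRequiredAncestor(s, requiredSet)).ToList();
    }

    private static bool HasRequiredAncestor(string code, HashSet<string> requiredSet)
    {
        string current = code;
        while (true)
        {
            // Check the code itself first, then each shorter ancestor in turn.
            if (requiredSet.Contains(current)) return true;
            int lastDot = current.LastIndexOf('.');
            if (lastDot < 0) return false;           // no more ancestors to try
            current = current.Substring(0, lastDot); // drop the last level
        }
    }
}
```

Each selected code costs one hash lookup per level of its depth, so the total work no longer depends on the size of required beyond building the set once.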

This query finds any match that exists in two lists. If a value exists in both lists, it returns true, otherwise false.
List<string> listString1 = new List<string>();
List<string> listString2 = new List<string>();
listString1.Add("A");
listString1.Add("B");
listString1.Add("C");
listString1.Add("D");
listString1.Add("E");
listString2.Add("C");
listString2.Add("X");
listString2.Add("Y");
listString2.Add("Z");
bool isItemExist = listString1.Any(x => listString2.Contains(x));
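For larger lists, the same exact-match check can be done with a HashSet, which avoids rescanning the second list for every item of the first. This is a small sketch; OverlapCheck and ShareAnyItem are hypothetical names.

```csharp
using System.Collections.Generic;

// Hypothetical helper showing the HashSet-based variant of the exact-match check.
public static class OverlapCheck
{
    // True when the two sequences have at least one item in common.
    public static bool ShareAnyItem(IEnumerable<string> first, IEnumerable<string> second)
    {
        var lookup = new HashSet<string>(second);
        return lookup.Overlaps(first);
    }
}
```

Note that this, like the query above, only finds exact matches; it does not handle the partial (prefix) matches the question asks about.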

Related

C# iterate List for specific values then Add them to another List in a certain order

Each instance of some Lists in my program are in an arbitrarily different order (as a result of an unfixed bug in Umbraco CMS Forms), and I need to rearrange them to the correct order. I can't use indices and OrderBy as each time the order is different.
I have been trying to iterate the existing List, then, when it finds the correct String that should be in position [0], using .Add to add it to another, empty List. Then continue through adding each value to the correct index.
I can't figure out a way to do this. I need the logic to basically say "look in this list, if the string equals this value, add it to this other list at position 0, then look for the next string to add at position 1, and so on", so at the end I will have the new List in the correct order.
// List to populate from record in wrong order
var extractedFields = new List<KeyValuePair<string, string>>();
// new list to copy values across in correct order
var newOrderFields = new List<KeyValuePair<string, string>>();
// separate list containing data field captions, used to iterate later
var extractedCaptions = new List<string>();
foreach (var field in record.RecordFields)
{
var extractValue = field.Value.ValuesAsString().NullSafeToString();
var extractType = CGHelper.CleanString(field.Value.Field.FieldType.Name).ToLower();
var extractCaption = field.Value.Field.Caption;
extractedFields.Add(new KeyValuePair<string, string>(extractType,
extractValue));
extractedCaptions.Add(extractCaption);
}
var count = 0;
foreach (var cap in extractedCaptions.ToList())
{
if (cap == "Opportunity ID")
{
extractedCaptions.Remove(cap);
extractedCaptions.Insert(0, cap);
var key = extractedFields[count].Key;
var value = extractedFields[count].Value;
newOrderFields.Add(new KeyValuePair<string, string>(key, value));
}
else if (cap == "Name")
{
// etc. identify string to be put into correct order
So to try and explain further: a user submits a form with the fields in a certain order. When we load that form and pull the record through (from the Umbraco Form), it is in a totally different and arbitrary order (and is in a different order for every single form). Therefore I need to iterate the fields and put them back into the order they were in the original form.

I don't know if I understand the situation correctly. But you can utilize the IComparer<T> interface for custom sorting logic.
using System;
using System.Collections.Generic;
namespace Comparer
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Hello World!");
var list = new[]{"Foobar", "Baz", "Foo", "Foobar", "Bar", };
Array.Sort(list, new CustomComparer(new[]{"Foo", "Bar", "Baz", "Foobar"}));
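// Dump() only exists in LINQPad, so print with Console.WriteLine instead
Console.WriteLine(string.Join(",", list));
// prints: Foo,Bar,Baz,Foobar,Foobar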
}
}
public class CustomComparer : IComparer<string>
{
private readonly string[] priorityList;
public CustomComparer(string[] priorityList)
{
this.priorityList = priorityList;
}
public int Compare(string x, string y)
{
var xIx = Array.IndexOf(priorityList, x);
var yIx = Array.IndexOf(priorityList, y);
return xIx.CompareTo(yIx);
}
}
}
This will sort the array according to the order of the priority list provided in the constructor.
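The same idea can be applied to the field list from the question without writing a comparer. The sketch below assumes the fields are available as pairs keyed by caption and that the original form order is known as a list of captions; both are assumptions, since the question keeps captions in a separate list. CaptionOrdering and Reorder are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; assumes each pair's Key is the field caption.
public static class CaptionOrdering
{
    // Reorders the pairs to follow 'captionOrder'. Captions that are not listed
    // go to the end and keep their relative order, because OrderBy is stable.
    public static List<KeyValuePair<string, string>> Reorder(
        IEnumerable<KeyValuePair<string, string>> fields,
        IList<string> captionOrder)
    {
        // Map each caption to its desired position once, for O(1) lookups.
        var rank = new Dictionary<string, int>();
        for (int i = 0; i < captionOrder.Count; i++)
        {
            rank[captionOrder[i]] = i;
        }

        return fields
            .OrderBy(f => rank.TryGetValue(f.Key, out int r) ? r : int.MaxValue)
            .ToList();
    }
}
```

Because the order is derived from the caption rather than from the position in the incoming list, it does not matter that the record arrives in a different order each time.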

I wrote a small example.
The idea is to iterate over the ordered list and look up each of its values in the other list.
Code:
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
List<string> ordered = new List<string>(){ "a", "b", "c", "d"};
List<string> nonOrderedAndMissing = new List<string>(){ "c", "d", "a"};
// here is the join
var newList = ordered.Select(a => nonOrderedAndMissing.Contains(a) ? a : null).Where(d => d != null).ToList();
// checking here
foreach(var a in newList){
Console.WriteLine(a);
}
}
}

C#: LINQ query with split and parsing

I have an object with a String field containing a comma separated list of integers in it. I'm trying to use LINQ to retrieve the ones that have a specific number in the list.
Here's my approach
from p in source
where (p.Keywords.Split(',').something.Contains(val))
select p;
Where p.Keywords is the field to split.
I've seen the following on the net, but it just doesn't compile:
from p in source
where (p.Keywords.Split(',').Select(x=>x.Trim()).Contains(val))
select p;
I'm a LINQ newbie, but had success with simpler queries.
Update:
Looks like I was missing some details:
source is a List containing the object with the field Keywords with strings like 1,2,4,7
Error I get is about x not being defined.

Here's an example of selecting numbers that are greater than 3:
string str = "1,2,3,4,5,6,7,8";
var numbers = str.Split(',').Select(int.Parse).Where(num => num > 3); // 4,5,6,7,8
If you have a list then change the Where clause:
string str = "1,2,3,4,5,6,7,8";
List<int> relevantNums = new List<int>{5,6,7};
var numbers = str.Split(',').Select(int.Parse).Where(num => relevantNums.Contains(num)); // 5,6,7
If you are not looking for number but for strings then:
string str = "1,2,3,4,5,6,7,8";
List<string> relevantNumsStr = new List<string>{"5","6","7"};
var numbers = str.Split(',').Where(numStr => relevantNumsStr.Contains(numStr)); // 5,6,7

Here is an example of how you can achieve this. For simplicity I called ToString() on the number to check for, but you get the point.
// class to mimic your structure
public class MyObj
{
public string MyStr{get;set;}
}
//method
void Method()
{
var myObj = new List <MyObj>
{
new MyObj{ MyStr="1,2,3,4,5"},
new MyObj{ MyStr="9,2,3,4,5"}
};
var num =9;
var searchResults = from obj in myObj
where !string.IsNullOrEmpty(obj.MyStr) &&
obj.MyStr.Split(new []{','})
.Contains(num.ToString())
select obj;
foreach(var item in searchResults)
Console.WriteLine(item.MyStr);
}

Thanks for all the answers, although not in the right language they led me to the answer:
from p in source where (p.Keywords.Split(',').Contains(val.ToString())) select p;
Where val is the number I'm looking for.
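If the stored strings may contain spaces after the commas (for example "1, 2, 4"), an exact comparison on the split pieces will miss them. A minimal sketch that trims each piece and guards against empty strings follows; KeywordFilter and HasKeyword are hypothetical names.

```csharp
using System;
using System.Linq;

// Hypothetical helper for testing one comma-separated Keywords string.
public static class KeywordFilter
{
    // True when 'keywords' contains 'val' as a whole entry, ignoring surrounding spaces.
    public static bool HasKeyword(string keywords, int val)
    {
        if (string.IsNullOrWhiteSpace(keywords)) return false;
        string wanted = val.ToString();
        // Compare whole entries so that 4 does not match inside 14.
        return keywords.Split(',').Any(k => k.Trim() == wanted);
    }
}
```

It could then be used in the query as: from p in source where KeywordFilter.HasKeyword(p.Keywords, val) select p;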

How to remove Duplicates from two List except few elements which may be duplicate also?

I have two lists with a few elements in common. I want to remove the duplicate elements, except for a few as described below. The order of the strings must stay the same, and the two lists may not contain the same number of elements.
List A    List B
ASCB      ASCB
test1     test1
test2     test5
test3     test3
test4     test6
Arinc     Arinc
testA     testC
testB     testB
testC
tesctD
Now I want to remove all the common elements in the two lists except the elements ASCB and ARINC. How can I do that? Can anyone help me with this?

I would just store the special values (ASCB, ARINC, etc.) in their own list so I can use Except to get the difference between the two sets. You can add the special values in afterwards.
List<string> except = listA.Except(listB).Concat(listB.Except(listA)).Concat(SpecialValues).ToList();
You have to call except twice because first we get items in A that are not in B. Then we add items that are in B but not in A. Finally we add the special values (I'm assuming SpecialValues is a collection with the strings you don't want removed).

You'd have to test performance as I suspect it's not the most efficient.
List<string> wordstoKeep = new List<string>() { "ASCB", "Arinc" };
foreach (string str in listB)
{
int index = listA.FindIndex(x => x.Equals(str, StringComparison.OrdinalIgnoreCase));
if (index >= 0)
{
if (!wordstoKeep.Any(x => x.Equals(str, StringComparison.OrdinalIgnoreCase)))
listA.RemoveAt(index);
}
else
listA.Add(str);
}

var listA = new List<string>{"ASCB","test1","test2"};
var listB = new List<string>{"ASCB","test1","test2"};
var combinedList = listA.Where(a => a.Contains("test"))
.Concat(listB.Where(b => b.Contains("test")))
.Distinct().Dump();
This outputs 'test1', 'test2' (Dump() is a LINQPad helper).
Your filter conditions are contained in your Where clause.
Where can be whatever condition you want to filter by:
Where(a => a != "ASCB" or whatever...
Concat joins the two lists. Then call Distinct() to get unique entries.

Going off the requirement that order must be the same
if(B.Count != A.Count)
return;
List<string> reserved = new List<string>{ "ASCB", "ARINC" };
for (int i = A.Count -1; i >= 0; i--)
{
if (!reserved.Contains(A[i].ToUpper()) && A[i] == B[i])
{
A.RemoveAt(i);
B.RemoveAt(i);
}
}

This works:
var listA = new List<string>()
{
"ASCB",
"test1",
"test2",
"test3",
"test4",
"Arinc",
"testA",
"testB"
};
var listB = new List<string>()
{
"ASCB",
"test1",
"test5",
"test3",
"test6",
"Arinc",
"testC",
"testB"
};
var dontRemoveThese = new List<string>(){"ASCB", "Arinc"};
var listToRemove = new List<string>();
foreach (var str in listA)
if (listB.Contains(str))
listToRemove.Add(str);
foreach (var str in listToRemove){
if (dontRemoveThese.Contains(str))
continue;
listA.Remove(str);
listB.Remove(str);
}
I like this solution because you can see what happens. I'd rather have 10 lines of code where it's obvious what happens than 1-3 lines of obscure magic.
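For comparison, the same steps can be written with HashSet operations while still keeping the order of both lists. This is a sketch only; CommonRemover and RemoveCommon are hypothetical names, and the case-insensitive comparison is an assumption made because the question writes both "Arinc" and "ARINC".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; compares case-insensitively (an assumption).
public static class CommonRemover
{
    // Removes from both lists every item they share, except the items in 'keep'.
    // The remaining items stay in their original order.
    public static void RemoveCommon(List<string> listA, List<string> listB, IEnumerable<string> keep)
    {
        var common = new HashSet<string>(listA, StringComparer.OrdinalIgnoreCase);
        common.IntersectWith(listB); // items present in both lists
        common.ExceptWith(keep);     // never remove the protected items

        listA.RemoveAll(s => common.Contains(s));
        listB.RemoveAll(s => common.Contains(s));
    }
}
```

With the lists from the question and keep = { "ASCB", "ARINC" }, list A would be left with ASCB, test2, test4, Arinc, testA, tesctD and list B with ASCB, test5, test6, Arinc.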

compare two identical lists of strings

Let's say I have following code:
List<string> numbers = new List<string> { "1", "2" };
List<string> numbers2 = new List<string> { "1", "2"};
if (numbers.Equals(numbers2))
{
}
As you can see, I have two lists with identical items. Is there a way to check if these two lists are equal by using one method?
SOLUTION:
Use SequenceEqual()
Thanks

Use Enumerable.SequenceEqual, but Sort the lists first.

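// one-directional check: true when every item of numbers also appears in numbers2
// (ignores order and duplicates, and does not detect extra items in numbers2)
bool isSubset = !numbers.Except(numbers2).Any();
// set equality: true when both lists hold the same distinct items (ignores order and duplicates)
var set = new HashSet<string>(numbers);
set.SymmetricExceptWith(numbers2);
bool sameSet = set.Count == 0;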
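To make the difference between the answers concrete, here is a small sketch of the two usual meanings of "equal" for lists. ListComparison and its method names are hypothetical; the ordinal sort is an assumption to keep the comparison culture-independent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper contrasting order-sensitive and order-insensitive equality.
public static class ListComparison
{
    // Same items in the same positions.
    public static bool SameOrder(IEnumerable<string> a, IEnumerable<string> b)
    {
        return a.SequenceEqual(b);
    }

    // Same items with the same number of occurrences, in any order.
    // Sorts copies, so the input lists are not modified.
    public static bool SameItemsAnyOrder(IEnumerable<string> a, IEnumerable<string> b)
    {
        return a.OrderBy(x => x, StringComparer.Ordinal)
                .SequenceEqual(b.OrderBy(x => x, StringComparer.Ordinal));
    }
}
```

Unlike the HashSet check, the sorted comparison treats { "1", "1", "2" } and { "1", "2", "2" } as different, because duplicates are counted.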

C# dedupe List based on split

I'm having a hard time deduping a list based on a specific delimiter.
For example I have 4 strings like below:
apple|pear|fruit|basket
orange|mango|fruit|turtle
purple|red|black|green
hero|thor|ironman|hulk
In this example I should want my list to only have unique values in column 3, so it would result in an List that looks like this,
apple|pear|fruit|basket
purple|red|black|green
hero|thor|ironman|hulk
In the above example I would have gotten rid of line 2 because line 1 had the same result in column 3. Any help would be awesome, deduping is tough in C#.
How I'm testing this:
static void Main(string[] args)
{
BeginListSet = new List<string>();
startHashSet();
}
public static List<string> BeginListSet { get; set; }
public static void startHashSet()
{
string[] BeginFileLine = File.ReadAllLines(@"C:\testit.txt");
foreach (string begLine in BeginFileLine)
{
BeginListSet.Add(begLine);
}
}
public static IEnumerable<string> Dedupe(IEnumerable<string> list, char seperator, int keyIndex)
{
var hashset = new HashSet<string>();
foreach (string item in list)
{
var array = item.Split(seperator);
if (hashset.Add(array[keyIndex]))
yield return item;
}
}

Something like this should work for you
static IEnumerable<string> Dedupe(this IEnumerable<string> input, char seperator, int keyIndex)
{
var hashset = new HashSet<string>();
foreach (string item in input)
{
var array = item.Split(seperator);
if (hashset.Add(array[keyIndex]))
yield return item;
}
}
...
var list = new string[]
{
"apple|pear|fruit|basket",
"orange|mango|fruit|turtle",
"purple|red|black|green",
"hero|thor|ironman|hulk"
};
foreach (string item in list.Dedupe('|', 2))
Console.WriteLine(item);
Edit: In the linked question Distinct() with Lambda, Jon Skeet presents the idea in a much better fashion, in the form of a DistinctBy custom method. While similar, his is far more reusable than the idea presented here.
Using his method, you could write
var deduped = list.DistinctBy(item => item.Split('|')[2]);
And you could later reuse the same method to "dedupe" another list of objects of a different type by a key of possibly yet another type.
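For readers without the linked answer at hand, a generic extension of this kind might look roughly like the sketch below. This is not Jon Skeet's code, only an illustration of the idea; it is named DistinctByKey (a hypothetical name) so that it does not clash with the Enumerable.DistinctBy method built into .NET 6 and later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical extension illustrating the "distinct by key" idea.
public static class DedupeExtensions
{
    // Yields the first item seen for each key, in the original order.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (TSource item in source)
        {
            // HashSet.Add returns false when the key was already present.
            if (seen.Add(keySelector(item)))
                yield return item;
        }
    }
}
```

With the four sample lines from the question, lines.DistinctByKey(l => l.Split('|')[2]) would keep the apple, purple and hero lines and drop the orange one.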

Try this:
var list = new string[]
{
"apple|pear|fruit|basket",
"orange|mango|fruit|turtle",
"purple|red|black|green",
"hero|thor|ironman|hulk "
};
var dedup = new List<string>();
var filtered = new List<string>();
foreach (var s in list)
{
var filter = s.Split('|')[2];
if (dedup.Contains(filter)) continue;
filtered.Add(s);
dedup.Add(filter);
}
// Console.WriteLine(filtered);

Can you use a HashSet instead? That will eliminate dupes automatically for you as they are added.

Maybe you can split each line on the | delimiter, sort the words alphabetically, and store them in a grid (columns). Then, when you try to insert a line, just check whether the column already has a word starting with the same character.

If LINQ is an option, you can do something like this:
// assume strings is a collection of strings
List<string> list = strings.Select(a => a.Split('|')) // split each line by '|'
.GroupBy(a => a[2]) // group by third column
.Select(a => a.First()) // select first line from each group
.Select(a => string.Join("|", a))
.ToList(); // convert to list of strings
Edit (per Jeff Mercado's comment), this can be simplified further:
List<string> list =
strings.GroupBy(a => a.Split('|')[2]) // group by third column
.Select(a => a.First()) // select first line from each group
.ToList(); // convert to list of strings
