C# Linq GroupBy - c#

Given a list of strings like this
"Val.1.ValueA"
"Val.1.ValueB"
"Val.1.ValueC"
"Val.2.ValueA"
"Val.2.ValueB"
"Val.2.ValueC"
"Val.3.ValueA"
"Val.3.ValueB"
"Val.3.ValueC"
How can I write a LINQ GroupBy statement to group by the first part of the string, including the number? In other words, in the above case I want a list of 3 groups: Val.1, Val.2, Val.3.

Use String.Split() to define your group key:
var groups = myList.GroupBy(x =>
{
    var parts = x.Split('.');
    // Re-insert the dot so the key reads "Val.1" rather than "Val1".
    return parts[0] + "." + parts[1];
});
This works regardless of the length of either part of the key (before and after the dot).
Edit in response to comment:
It sounds like you want to group by a number within the string, but you do not know in advance which part contains the number. In that case this should work:
var groups = myList.GroupBy(x =>
{
    var parts = x.Split('.');
    int num = 0;
    // Key is the first part plus the first all-digit part that parses as an int.
    return parts[0] + "." + parts.Where(p => p.All(char.IsDigit))
                                 .First(p => int.TryParse(p, out num));
});

Without more information about the formatting, the simplest is:
var groups = list.GroupBy(s => s.Substring(0, 5));
If these are in fact not fixed length:
var groups = list.GroupBy(s =>
{
    var fields = s.Split('.');
    return String.Format("{0}.{1}", fields[0], fields[1]);
});
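To see the Split-based key in action, here is a minimal, self-contained sketch using the sample strings from the question. It assumes every string has at least two dot-separated parts; the variable names are only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample input from the question.
var myList = new List<string>
{
    "Val.1.ValueA", "Val.1.ValueB", "Val.1.ValueC",
    "Val.2.ValueA", "Val.2.ValueB", "Val.2.ValueC",
    "Val.3.ValueA", "Val.3.ValueB", "Val.3.ValueC",
};

// Key = the first two dot-separated parts, e.g. "Val.1".
// Assumption: every string contains at least two parts.
var groups = myList
    .GroupBy(s =>
    {
        var parts = s.Split('.');
        return parts[0] + "." + parts[1];
    })
    .ToList();

// Print each key followed by the members of that group.
foreach (var g in groups)
{
    Console.WriteLine($"{g.Key}: {string.Join(", ", g)}");
}
```

Materializing with ToList() is optional; it is done here only so the groups can be counted and indexed afterwards.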

Related

Getting a list of strings with only the first and last character from another list LINQ

From a given list of strings I need to use LINQ to generate a new sequence of strings, where each string consists of the first and last characters of the corresponding string in the original list.
Example:
stringList: new[] { "ehgrtthrehrehrehre", "fjjgoerugrjgrehg", "jgnjirgbrnigeheruwqqeughweirjewew" },
expected: new[] { "ee", "fg", "jw" });
list2 = stringList.Select(e => {e = "" + e[0] + e[e.Length - 1]; return e; }).ToList();
This is what I've tried. It works, but I need to use LINQ to solve the problem and I'm not sure how to adapt my solution.
Just for the sake of completeness, here is a version using Zip:
var stringList = new string [] { "ehgrtthrehrehrehre", "fjjgoerugrjgrehg", "jgnjirgbrnigeheruwqqeughweirjewew" };
var result = stringList.Zip(stringList, (first, last) => $"{first.First()}{last.Last()}");
As mentioned in the comments, Select is already part of LINQ, so you can use this code:
var output = arr.Select(x => new string(new char[] { x.First(), x.Last() })).ToList();
Here you go:
var newList = stringList.Select(e => $"{e[0]}{e[e.Length - 1]}").ToList();
Approach with LINQ and String.Remove():
string[] input = new[] { "ehgrtthrehrehrehre", "fjjgoerugrjgrehg", "jgnjirgbrnigeheruwqqeughweirjewew" };
string[] result = input.Select(x => x.Remove(1, x.Length - 2)).ToArray();
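Most of the answers above assume each string has at least one character (the Remove variant needs at least two), and will throw otherwise. A hedged sketch that skips null or empty entries first, assuming that skipping is the behaviour you want (the extra "" and "x" entries are added here purely for illustration):

```csharp
using System;
using System.Linq;

// The question's sample strings plus two edge cases added for illustration.
var stringList = new[]
{
    "ehgrtthrehrehrehre", "fjjgoerugrjgrehg",
    "jgnjirgbrnigeheruwqqeughweirjewew", "", "x"
};

var firstAndLast = stringList
    .Where(s => !string.IsNullOrEmpty(s))      // skip entries with no characters
    .Select(s => $"{s[0]}{s[s.Length - 1]}")   // a one-character string yields that character twice
    .ToList();

Console.WriteLine(string.Join(", ", firstAndLast));
```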

How to split this string to obtain just the name value

Hi, I am looking for a simple way to get just the name after the CN= value:
CN=Andrew Adams,OU=Services,OU=Users,OU=GIE,OU=CSP,OU=STAFF,DC=example,DC=net
Is there an easy way to do this? I am currently doing this:
ResultPropertyValueCollection manager = result.Properties["manager"];
string managerUserName = manager[0].ToString();
string[] managerNameParts = managerUserName.Split(',');
string managerName = managerNameParts[0].Substring(4);
Console.WriteLine("Manager Name:" + managerName);
but it feels kind of bad.
This is a great place to use regular expressions. Try this:
var text = "CN=Andrew Adams,OU=Services,OU=Users,OU=GIE,OU=CSP,OU=STAFF,DC=example,DC=net";
var match = Regex.Match(text, @"CN=([^,]+)");
if (match.Success) return match.Groups[1].Value;
The expression CN=([^,]+) looks for the text CN= followed by one or more non-comma characters, and captures that part into Groups[1]. (Groups[0] is the whole match, including the CN= prefix.)
You can do this:
var name = "CN=Andrew Adams,OU=Services,OU=Users,OU=GIE,OU=CSP,OU=STAFF,DC=example,DC=net"
.Split(',')[0].Split('=')[1];
It splits on , and takes the first element, then splits that on = and takes the second element.
If the CN part is not guaranteed to come first, you can use a regex instead:
Regex.Match(name, @"(?<=CN=)[^,]+").Value;
Another option, using LINQ.
If the name/value pair exists anywhere in the string, you'll get it; if not, managerName will be null.
var managerName = input.Split(',')
.Where(x => x.StartsWith("CN="))
.Select(x => x.Split('=')[1])
.SingleOrDefault();
I find doing it like this fairly easy to read:
var input = @"CN=Andrew Adams,OU=Services,OU=Users,OU=GIE,OU=CSP,OU=STAFF,DC=example,DC=net";
var items = input.Split(',');
var keyValues = items.Select(x =>
{
var split = x.Split('=');
return new { Key = split[0], Value = split[1] };
});
var managerName = keyValues.Single(x => x.Key == "CN").Value;
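The Split-based answers above index into split[1] directly, which throws if a component has no = sign and truncates a value that itself contains one. A slightly more defensive sketch, assuming you only need the first CN component and that the string contains no escaped commas (real distinguished names may escape commas with a backslash, which this does not handle):

```csharp
using System;
using System.Linq;

var input = "CN=Andrew Adams,OU=Services,OU=Users,OU=GIE,OU=CSP,OU=STAFF,DC=example,DC=net";

var managerName = input
    .Split(',')
    // Split each component into at most two pieces so a value containing '=' stays intact.
    .Select(part => part.Split(new[] { '=' }, 2))
    // Keep only well-formed key/value pairs whose key is CN (case-insensitive).
    .Where(kv => kv.Length == 2
                 && kv[0].Trim().Equals("CN", StringComparison.OrdinalIgnoreCase))
    .Select(kv => kv[1].Trim())
    // null when no CN component is present.
    .FirstOrDefault();

Console.WriteLine("Manager Name: " + managerName);
```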

Sorting a generic list by an external sort order

I have a generic list
Simplified example
var list = new List<string>()
{
"lorem1.doc",
"lorem2.docx",
"lorem3.ppt",
"lorem4.pptx",
"lorem5.doc",
"lorem6.doc",
};
What I would like to do is sort these items based on the ordering of an external list.
For example:
var sortList = new[] { "pptx", "ppt", "docx", "doc" };
// Or
var sortList = new List<string>() { "pptx", "ppt", "docx", "doc" };
Is there anything built-in to linq that could help me achieve this or do I have to go the foreach way?
With the list you can use IndexOf for Enumerable.OrderBy:
var sorted = list.OrderBy(s => sortList.IndexOf(Path.GetExtension(s)));
So the index of the extension in the sortList determines the priority in the other list. Unknown extensions have highest priority since their index is -1.
But you need to add a dot to the extension to get it working:
var sortList = new List<string>() { ".pptx", ".ppt", ".docx", ".doc" };
If that's not an option you have to fiddle around with Substring or Remove, for example:
var sorted = list.OrderBy(s => sortList.IndexOf(Path.GetExtension(s).Remove(0,1)));
This solution will work even if some file names do not have extensions:
var sortList = new List<string>() { "pptx", "ppt", "docx", "doc" };
var list = new List<string>()
{
"lorem1.doc",
"lorem2.docx",
"lorem3.ppt",
"lorem4.pptx",
"lorem5.doc",
"lorem6.doc",
};
var result =
list.OrderBy(f => sortList.IndexOf(Path.GetExtension(f).Replace(".","")));
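You could try using the Array.IndexOf() method when sortList is an array (the entries in sortList need the leading dot, because Path.GetExtension returns it):
var sortedList = list.OrderBy(i => Array.IndexOf(sortList, System.IO.Path.GetExtension(i))).ToList();
A sort dictionary would be more efficient: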
var sortDictionary = new Dictionary<string, int> {
    { ".pptx", 0 },
    { ".ppt" , 1 },
    { ".docx", 2 },
    { ".doc" , 3 } };
var sortedList = list.OrderBy(i => {
    var s = Path.GetExtension(i);
    int rank;
    if (sortDictionary.TryGetValue(s, out rank))
        return rank;
    return int.MaxValue; // for unknown at end, or -1 for at start
});
This way the lookup is O(1) rather than O(# of extensions).
Also, if you have a large number of filenames and a small number of extensions, it might actually be faster to do
var sortedList = list
    .GroupBy(p => Path.GetExtension(p))
    .OrderBy(g => {
        int rank;
        if (sortDictionary.TryGetValue(g.Key, out rank))
            return rank;
        return int.MaxValue; // for unknown at end, or -1 for at start
    })
    .SelectMany(g => g);
This means the sort scales by the number of distinct extensions in the input, rather than the number of items in the input.
This also allows you to give two extensions the same priority.
Here's another way that does not use OrderBy:
var res =
sortList.SelectMany(x => list.Where(f => Path.GetExtension(f).EndsWith(x)));
Note that the complexity of this approach is O(n * m), with n = sortList.Count and m = list.Count.
The OrderBy approach computes each key only once, so its worst case is O(n * m) for the IndexOf lookups plus O(m * log m) for the sort itself. With small n and m you won't notice any difference.
For big lists the fastest way (complexity O(n + m)) could be to construct a temporary lookup, i.e.:
var lookup = list.ToLookup(x => Path.GetExtension(x).Remove(0,1));
var res = sortList.Where(x => lookup.Contains(x)).SelectMany(x => lookup[x]);
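If you would rather not maintain a hand-written dictionary, the ranks can be derived from sortList itself. This is only a sketch: it assumes sortList holds extensions without the leading dot and with no duplicates, that unknown or missing extensions should sort last, and the extra "notes" entry is added purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

var sortList = new[] { "pptx", "ppt", "docx", "doc" };
var list = new List<string>
{
    "lorem1.doc", "lorem2.docx", "lorem3.ppt",
    "lorem4.pptx", "lorem5.doc", "lorem6.doc",
    "notes" // no extension, added for illustration
};

// Map each extension to its position in sortList (case-insensitive keys).
var rank = sortList
    .Select((ext, index) => new { ext, index })
    .ToDictionary(x => x.ext, x => x.index, StringComparer.OrdinalIgnoreCase);

var sorted = list
    .OrderBy(f =>
    {
        // GetExtension returns ".doc" (or "" when there is none); drop the dot.
        var ext = Path.GetExtension(f).TrimStart('.');
        return rank.TryGetValue(ext, out var r) ? r : int.MaxValue; // unknown last
    })
    .ToList();

Console.WriteLine(string.Join(", ", sorted));
```

OrderBy is a stable sort, so files sharing an extension keep their original relative order.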

Check if the Array has something like my Text

I have an array of strings. I need to check whether the array contains something like "abcd". How can I achieve this in C#? I tried using the following:
var pathBits = new[] {"abcde ","abcd &"};
var item ="abcd";
var results = Array.FindAll(pathBits, s => s.Equals(item ));
Maybe something like this:
var result = pathBits.Any(y => y.Contains(item));
That will give you true if the array contains an item that has a value like item. If you want to select all those values you should use:
var result = pathBits.Where(y => y.Contains(item));
which will give you an IEnumerable of the items from the list that contain the value item.
When you say 'something like "abcd"' do you mean "Starts with" or "Contains"?
The current code will only find strings in pathBits which are exactly equal to item ("abcd" ?)
The general shape is fine but to find non-exact matches you need to change the predicate
e.g.
string[] src = new[] { "abcde", "abcd &" };
var results = Array.FindAll<string>(src, name => name.Contains("abcd"));
This can also be implemented using the Linq IEnumerable<> extensions
e.g.
string[] src = new[] { "abcde", "abcd &" };
var results = src.Where(name => name.Contains("abcd"));
hth,
Alan.
This might be of some use:
string[] pathBits = { "abcde ", "abcd &" };
var item = "abcd";
if (pathBits.Contains(item)) // true only when an element equals item exactly
{
}
You cannot use var with an array initializer like this:
var pathbits = { "abcde ", "abcd &" };
Please let me know if you have any problems.
Is this what you're looking for?
string[] pathBits = { "abcde ", "abcd &", "222" };
var item = "abcd";
var results = Array.FindAll<string>(pathBits, s => s.Contains(item));
results will have 2 items.
I'm not sure exactly what you want, but this would get all array elements that contain the string "abcd":
String[] pathBits = {"abcde ","abcd &"};
var item ="abcd";
var results = pathBits.Where(s => s.IndexOf("abcd") > -1);
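All of the Contains and IndexOf variants above are case-sensitive. If "something like" should also ignore case, one possible sketch is below; the "ABCD &" and "222" sample values are assumptions chosen to show the difference.

```csharp
using System;
using System.Linq;

string[] pathBits = { "abcde ", "ABCD &", "222" };
var item = "abcd";

// IndexOf with a StringComparison performs a case-insensitive substring search.
var matches = pathBits
    .Where(s => s.IndexOf(item, StringComparison.OrdinalIgnoreCase) >= 0)
    .ToArray();

// True when at least one element contains the text.
bool anyMatch = matches.Length > 0;

Console.WriteLine($"{matches.Length} match(es), any = {anyMatch}");
```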

Removing duplicates from Array with C#.NET 4.0 LINQ?

I have this C# code that builds a string of comma-separated matches for a service:
for (m = r.Match(site); m.Success; m = m.NextMatch())
{
found = found + "," + m.Value.Replace(",", "");
}
return found;
Output looks like: aaa,bbb,ccc,aaa,111,111,ccc
Now that the code is on .NET 4.0, how can I use C# LINQ to remove duplicates?
Also, is there any way to remove duplicates without changing the order?
I found this sample code in another post, but not sure exactly how to apply it:
int[] s = { 1, 2, 3, 3, 4};
int[] q = s.Distinct().ToArray();
Thanks.
string[] s = found.Split(',').Distinct().ToArray();
Rewrite the code that builds the result to output it directly.
ie. rewrite this:
for (m = r.Match(site); m.Success; m = m.NextMatch())
{
found = found + "," + m.Value.Replace(",", "");
}
return found;
To this:
return (from Match m in r.Matches(site)
select m.Value.Replace(",", "")).Distinct().ToArray();
This will return an array. If you still want it back as a string:
return string.Join(", ", (from Match m in r.Matches(site)
select m.Value.Replace(",", "")).Distinct().ToArray());
You may or may not be able to remove the last .ToArray() from the last code there, depending on the .NET runtime version. In .NET 4.0, string.Join(...) can take an IEnumerable<string>, whereas previous versions require an array.
This will return a string of comma-separated values without duplicates:
var result = string.Join(",",
r.Matches(site)
.Cast<Match>()
.Select(m => m.Value.Replace(",", string.Empty))
.Distinct()
);
This could be one possible solution:
var data = new List<string>();
for (m = r.Match(site); m.Success; m = m.NextMatch())
data.Add(m.Value.Replace(",", ""));
return String.Join(",", data.Distinct().ToArray());
You can achieve this in a single LINQ query
string strSentence = "aaa,bbb,ccc,aaa,111,111,ccc";
List<string> results = (from w in strSentence.Split(',') select w).Distinct().ToList();
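On the question of keeping the original order: Distinct() in practice yields items in first-occurrence order, but as far as I know the documentation does not promise it. If you want the ordering to be explicit, a sketch using a HashSet is shown below. The leading comma in the sample string is an assumption, mirroring what the question's loop would produce if found starts out empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed sample: the loop in the question prepends a comma to every match.
var found = ",aaa,bbb,ccc,aaa,111,111,ccc";

var seen = new HashSet<string>();
var unique = found
    .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries) // drops the empty leading entry
    .Where(s => seen.Add(s)) // Add returns false for values already seen
    .ToList();

var result = string.Join(",", unique);
Console.WriteLine(result);
```

Because the filter has a side effect on seen, the query should be enumerated only once, which is why it is materialized with ToList() straight away.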
