ReSharper constantly complains about this: "Possible multiple enumeration of IEnumerable". For example:
private int ParseLoanNumber(IEnumerable<string> lines)
{
var loanNumber = 0;
var item = lines.FirstOrDefault(l => l.StartsWith(" LN# 00"));
if (item != null)
{
loanNumber = item.ParseInt(8, 10).GetValueOrDefault();
}
else
{
item = lines.FirstOrDefault(l => l.StartsWith(" LOAN-NO (CONT'D) 00"));
if (item != null)
{
loanNumber = item.ParseInt(19, 10).GetValueOrDefault();
}
}
// Yada yada...
}
The recommended solution is to convert the enumerable to a list or array, and iterate over that.
This baffles me. You will still be enumerating something, and both types (arrays and lists) implement IEnumerable. So how does this solve anything, or improve performance in any way?
Because you can write this:
public IEnumerable<int> GetNumbersSlowly()
{
for (var i = 0; i < 100; i++)
{
Thread.Sleep(10000); //Or retrieve from a website, etc
yield return i;
}
}
If you use it like this:
var numbers = GetNumbersSlowly();
foreach(var number in numbers) {
//Do something
}
foreach(var number in numbers) {
//Do something
}
It means the work done (the sleep) is done twice for each number. Evaluating the enumerable once and storing it in an array or list means you're sure there's no extra processing being done to return the items.
Since you're taking an IEnumerable<string>, you really don't know that the caller hasn't done the above.
If you think my example might be rare or an edge case, it also applies to things like this:
var someSource = new List<int> { 1, 2, 3, 4, 5 };
var numbers = someSource.Select(s => s * 100000);
Now every time you iterate numbers, you're also re-doing the calculation. In this case it's not much work, but why do it more often than you need to? (And it's not uncommon for the work to be non-trivial.)
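To make the cost visible, here is a minimal sketch that counts how often a lazy sequence does its work. The class name, the `Evaluations` counter and `ExpensiveNumbers` are made up for illustration; the counter simply stands in for the slow step (a sleep, a web request, a calculation).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Hypothetical counter: incremented every time an element is produced.
    public static int Evaluations;

    public static IEnumerable<int> ExpensiveNumbers()
    {
        for (var i = 0; i < 5; i++)
        {
            Evaluations++;              // stands in for slow work
            yield return i * 100000;
        }
    }

    // Enumerates the lazy sequence twice; the iterator body runs twice.
    public static int CountLazy()
    {
        Evaluations = 0;
        var numbers = ExpensiveNumbers();
        foreach (var n in numbers) { }
        foreach (var n in numbers) { }
        return Evaluations;
    }

    // Materializes once with ToList, then enumerates the list twice.
    public static int CountMaterialized()
    {
        Evaluations = 0;
        var numbers = ExpensiveNumbers().ToList();
        foreach (var n in numbers) { }
        foreach (var n in numbers) { }
        return Evaluations;
    }

    public static void Main()
    {
        Console.WriteLine(CountLazy());          // 10
        Console.WriteLine(CountMaterialized());  // 5
    }
}
```

Iterating a `List<T>` twice still enumerates twice, but it only walks memory that is already populated; the producing code is not run again.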
Is there a way to achieve this?
I tried:
string str = "{34.10,0,0.00}"; //the string as I get it from Postgres DB
decimal[] won;
won = (decimal[])(str); //Cannot convert type 'string' to 'decimal[]'
What I would ideally want is to get into won:
won[0] = 34.10
won[1] = 0
won[2] = 0.00
Surely, I can go and split by commas, and put it in the array but I'm wondering if there's a better way.
You have to Split
won = str.Trim('{', '}').Split(',').Select(decimal.Parse).ToArray();
Edit: This part is just for fun
There is no way to cast string to a decimal[] array directly, but if you want you can add a decimal wrapper class and define implicit conversions:
class MyDecimal
{
private decimal[] _values;
public MyDecimal(int size)
{
_values = new decimal[size];
}
public decimal this[int index]
{
get { return _values[index]; }
set { _values[index] = value; }
}
public static implicit operator MyDecimal(string str)
{
var numbers = str.Trim('{', '}').Split(',');
MyDecimal d = new MyDecimal(numbers.Length);
d._values = numbers
.Select(x => decimal.Parse(x,CultureInfo.InvariantCulture))
.ToArray();
return d;
}
public static implicit operator string(MyDecimal md)
{
return string.Join(",", md._values);
}
}
Then you can do:
string str = "{34.10,0,0.00}"; //the string as I get it from Postgres DB
MyDecimal won = str;
I first misread your question. The real answer is: I know of no other way than splitting and converting in loops or using LINQ (for a LINQ sample see Selman22's answer). There's no way to cast a string to an array in one go.
While it is essentially what you suggest, you could try this:
// Remove leading and trailing brackets
string s = str.Trim('{', '}');
// Split numbers
string[] parts = s.Split(',');
decimal[] nums = new decimal[parts.Length];
// Convert
for (int i = 0; i < parts.Length; i++)
nums[i] = Convert.ToDecimal(parts[i]);
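The split-and-convert idea can be wrapped in a small helper. This is only a sketch: `PgArrayParser` and `ParseDecimalArray` are hypothetical names, and it assumes a flat array of plain numbers as in the question (no NULL elements, no quoting, no nested arrays). It parses with the invariant culture so that `34.10` is read the same way on machines whose culture uses a decimal comma.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class PgArrayParser
{
    // Parses text such as "{34.10,0,0.00}" into a decimal[].
    public static decimal[] ParseDecimalArray(string text)
    {
        var inner = text.Trim().Trim('{', '}');
        if (inner.Length == 0)
        {
            return new decimal[0];   // "{}" is treated as an empty array
        }

        return inner
            .Split(',')
            .Select(part => decimal.Parse(part.Trim(), NumberStyles.Number, CultureInfo.InvariantCulture))
            .ToArray();
    }

    public static void Main()
    {
        decimal[] won = ParseDecimalArray("{34.10,0,0.00}");
        Console.WriteLine(won.Length);   // 3
    }
}
```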
Just to play devil's advocate to those who say you have no option but to split:
var result = new JavaScriptSerializer()
.Deserialize<decimal[]>(str.Replace('{', '[').Replace('}', ']'))
Here is another way, though probably not a better one, using a regex:
string str = "{34.10,0,0.00}";
string pattern = @"([\d]+[\.]|[\d]?)[\d]+";
decimal[] result = Regex.Matches(str, pattern, RegexOptions.None)
.Cast<Match>()
.Select(x => decimal.Parse(x.Value))
.ToArray();
but remember Jamie Zawinski:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Another way would be using a StringReader and managing the split
There is no better way. At least until C# is backed up by an AI which will just guess what you are trying to do by casting one datatype into another by a custom logic.
Any programmer would guess what you want. Until now though the C# compiler is no wizard.
What is the quickest (and least resource-intensive) way to compare two massive lists (>50,000 items) and, as a result, have two lists like the ones below:
items that show up in the first list but not in the second
items that show up in the second list but not in the first
Currently I'm working with List or IReadOnlyCollection and solve this with a LINQ query:
var onlyInList = list.Where(i => !list2.Contains(i)).ToList();
var onlyInList2 = list2.Where(i => !list.Contains(i)).ToList();
But this doesn't perform as well as I would like.
Any ideas for making this quicker and less resource-intensive, as I need to process a lot of lists?
Use Except:
var firstNotSecond = list1.Except(list2).ToList();
var secondNotFirst = list2.Except(list1).ToList();
I suspect there are approaches which would actually be marginally faster than this, but even this will be vastly faster than your O(N * M) approach.
If you want to combine these, you could create a method with the above and then a return statement:
return !firstNotSecond.Any() && !secondNotFirst.Any();
One point to note is that there is a difference in results between the original code in the question and the solution here: any duplicate elements which are only in one list will only be reported once with my code, whereas they'd be reported as many times as they occur in the original code.
For example, with lists of [1, 2, 2, 2, 3] and [1], the "elements in list1 but not list2" result in the original code would be [2, 2, 2, 3]. With my code it would just be [2, 3]. In many cases that won't be an issue, but it's worth being aware of.
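If the duplicates matter, one option is to keep the original Where/Contains shape but do the lookups against a HashSet, which avoids the O(N * M) scan. The sketch below uses the lists from the example; `DuplicateAwareDiff` and `ItemsNotIn` are names invented here, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateAwareDiff
{
    // Same result as source.Where(i => !other.Contains(i)), but each lookup
    // is a hash probe instead of a linear scan. Duplicates in source survive.
    public static List<T> ItemsNotIn<T>(IEnumerable<T> source, IEnumerable<T> other)
    {
        var lookup = new HashSet<T>(other);
        return source.Where(item => !lookup.Contains(item)).ToList();
    }

    public static void Main()
    {
        var list1 = new List<int> { 1, 2, 2, 2, 3 };
        var list2 = new List<int> { 1 };

        Console.WriteLine(string.Join(", ", list1.Except(list2)));       // 2, 3
        Console.WriteLine(string.Join(", ", ItemsNotIn(list1, list2)));  // 2, 2, 2, 3
    }
}
```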
Enumerable.SequenceEqual Method
Determines whether two sequences are equal according to an equality comparer.
MS.Docs
Enumerable.SequenceEqual(list1, list2);
This works for all primitive data types. If you need to use it on custom objects, you need to implement IEqualityComparer.
IEqualityComparer Interface
Defines methods to support the comparison of objects for equality.
MS.Docs for IEqualityComparer
More efficient would be using Enumerable.Except:
var inListButNotInList2 = list.Except(list2);
var inList2ButNotInList = list2.Except(list);
This method is implemented by using deferred execution. That means you could write for example:
var first10 = inListButNotInList2.Take(10);
It is also efficient since it internally uses a Set<T> to compare the objects. It works by first collecting all distinct values from the second sequence, and then streaming the results of the first, checking that they haven't been seen before.
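The behaviour described above can be approximated in a few lines. This is a sketch of the idea, not the framework's source: `ExceptSketch` is a made-up name and it uses a public `HashSet<T>` where the real implementation uses its own internal set type.

```csharp
using System;
using System.Collections.Generic;

public static class ExceptIllustration
{
    public static IEnumerable<T> ExceptSketch<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // Step 1: collect the distinct values of the second sequence.
        var seen = new HashSet<T>(second);

        // Step 2: stream the first sequence. Add returns false when the value
        // is already in the set, so values from 'second' and repeated values
        // from 'first' are both filtered out.
        foreach (var item in first)
        {
            if (seen.Add(item))
            {
                yield return item;
            }
        }
    }

    public static void Main()
    {
        var result = ExceptSketch(new[] { 1, 2, 2, 3, 4 }, new[] { 1, 4 });
        Console.WriteLine(string.Join(", ", result));   // 2, 3
    }
}
```

Because the method is an iterator, nothing runs until the result is enumerated, which is why something like Take(10) can stop early without scanning the whole first sequence.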
If you want the results to be case insensitive, the following will work:
List<string> list1 = new List<string> { "a.dll", "b1.dll" };
List<string> list2 = new List<string> { "A.dll", "b2.dll" };
var firstNotSecond = list1.Except(list2, StringComparer.OrdinalIgnoreCase).ToList();
var secondNotFirst = list2.Except(list1, StringComparer.OrdinalIgnoreCase).ToList();
firstNotSecond would contain b1.dll
secondNotFirst would contain b2.dll
using System;
using System.Collections.Generic;
using System.Linq;
namespace YourProject.Extensions
{
public static class ListExtensions
{
public static bool SetwiseEquivalentTo<T>(this List<T> list, List<T> other)
where T: IEquatable<T>
{
if (list.Except(other).Any())
return false;
if (other.Except(list).Any())
return false;
return true;
}
}
}
Sometimes you only need to know if two lists are different, and not what those differences are. In that case, consider adding this extension method to your project. Note that your listed objects should implement IEquatable!
Usage:
public sealed class Car : IEquatable<Car>
{
public Price Price { get; }
public List<Component> Components { get; }
...
public override bool Equals(object obj)
=> obj is Car other && Equals(other);
public bool Equals(Car other)
=> Price == other.Price
&& Components.SetwiseEquivalentTo(other.Components);
public override int GetHashCode()
=> Components.Aggregate(
Price.GetHashCode(),
(code, next) => code ^ next.GetHashCode()); // Bitwise XOR
}
Whatever the Component class is, the methods shown here for Car should be implemented almost identically.
It's very important to note how we've written GetHashCode. In order to properly implement IEquatable, Equals and GetHashCode must operate on the instance's properties in a logically compatible way.
Two lists with the same contents are still different objects, and will produce different hash codes. Since we want these two lists to be treated as equal, we must let GetHashCode produce the same value for each of them. We can accomplish this by delegating the hashcode to every element in the list, and using the standard bitwise XOR to combine them all. XOR is order-agnostic, so it doesn't matter if the lists are sorted differently. It only matters that they contain nothing but equivalent members.
Note: the strange name is to imply the fact that the method does not consider the order of the elements in the list. If you do care about the order of the elements in the list, this method is not for you!
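One caveat that may be worth checking: an Except-based comparison ignores repeated elements, so it would treat [5, 5] and [5] as equivalent, whereas XOR over every element makes the two 5s cancel out and produces a different hash code. A possible way to keep the two methods consistent is to XOR over the distinct elements only. The sketch below assumes that approach; `SetwiseHash` is a hypothetical helper, not something from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetwiseHashing
{
    // XOR over distinct elements: independent of order and of repetition.
    public static int SetwiseHash<T>(IEnumerable<T> items)
    {
        return items
            .Distinct()
            .Aggregate(0, (code, next) => code ^ (next == null ? 0 : next.GetHashCode()));
    }

    public static void Main()
    {
        // Same elements in a different order hash identically.
        Console.WriteLine(SetwiseHash(new[] { 1, 2, 3 }) == SetwiseHash(new[] { 3, 1, 2 }));  // True
        // A repeated element no longer changes the hash.
        Console.WriteLine(SetwiseHash(new[] { 5, 5 }) == SetwiseHash(new[] { 5 }));           // True
    }
}
```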
try this way:
var difList = list1.Where(a => !list2.Any(a1 => a1.id == a.id))
.Union(list2.Where(a => !list1.Any(a1 => a1.id == a.id)));
Not for this problem exactly, but here's some code to compare lists whose objects are equal but not identical:
public class EquatableList<T> : List<T>, IEquatable<EquatableList<T>> where T : IEquatable<T>
{
    /// <summary>
    /// True, if this contains element with equal property-values
    /// </summary>
    /// <param name="element">element of Type T</param>
    /// <returns>True, if this contains element</returns>
    public new Boolean Contains(T element)
    {
        return this.Any(t => t.Equals(element));
    }
    /// <summary>
    /// True, if list is equal to this
    /// </summary>
    /// <param name="list">list</param>
    /// <returns>True, if instance equals list</returns>
    public Boolean Equals(EquatableList<T> list)
    {
        if (list == null) return false;
        return this.All(list.Contains) && list.All(this.Contains);
    }
}
If only the combined result is needed, this will work too:
var set1 = new HashSet<T>(list1);
var set2 = new HashSet<T>(list2);
var areEqual = set1.SetEquals(set2);
where T is the type of the lists' elements.
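If, instead of a yes/no answer, you want the combined set of differing elements, HashSet<T> also offers SymmetricExceptWith. A minimal sketch follows; the wrapper name `SymmetricDifference` is invented here, and like SetEquals it ignores duplicates and order.

```csharp
using System;
using System.Collections.Generic;

public static class SymmetricDiffDemo
{
    // Returns the elements that appear in exactly one of the two inputs.
    public static HashSet<T> SymmetricDifference<T>(IEnumerable<T> list1, IEnumerable<T> list2)
    {
        var result = new HashSet<T>(list1);
        result.SymmetricExceptWith(list2);
        return result;
    }

    public static void Main()
    {
        var diff = SymmetricDifference(new[] { 1, 2, 3 }, new[] { 1, 2, 3, 4 });
        Console.WriteLine(string.Join(", ", diff));   // 4
    }
}
```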
While Jon Skeet's answer is excellent advice for everyday practice with a small to moderate number of elements (up to a few million), it is nevertheless not the fastest approach and not very resource efficient. An obvious drawback is the fact that getting the full difference requires two passes over the data (even three if the elements that are equal are of interest as well). Clearly, this can be avoided by a customized reimplementation of the Except method, but it remains that the creation of a hash set requires a lot of memory and the computation of hashes requires time.
For very large data sets (in the billions of elements) it usually pays off to consider the particular circumstances. Here are a few ideas that might provide some inspiration:
If the elements can be compared (which is almost always the case in practice), then sorting the lists and applying the following zip approach is worth consideration:
/// <returns>The elements of the specified (ascendingly) sorted enumerations that are
/// contained only in one of them, together with an indicator,
/// whether the element is contained in the reference enumeration (-1)
/// or in the difference enumeration (+1).</returns>
public static IEnumerable<Tuple<T, int>> FindDifferences<T>(IEnumerable<T> sortedReferenceObjects,
    IEnumerable<T> sortedDifferenceObjects, IComparer<T> comparer)
{
    using (var refs = sortedReferenceObjects.GetEnumerator())
    using (var diffs = sortedDifferenceObjects.GetEnumerator())
    {
        bool hasRef = refs.MoveNext();
        bool hasDiff = diffs.MoveNext();
        while (hasRef && hasDiff)
        {
            int comparison = comparer.Compare(refs.Current, diffs.Current);
            if (comparison == 0)
            {
                // insert code that emits the current element if equal elements should be kept
                hasRef = refs.MoveNext();
                hasDiff = diffs.MoveNext();
            }
            else if (comparison < 0)
            {
                yield return Tuple.Create(refs.Current, -1);
                hasRef = refs.MoveNext();
            }
            else
            {
                yield return Tuple.Create(diffs.Current, 1);
                hasDiff = diffs.MoveNext();
            }
        }
        // One sequence is exhausted: everything left in the other one is a difference.
        while (hasRef)
        {
            yield return Tuple.Create(refs.Current, -1);
            hasRef = refs.MoveNext();
        }
        while (hasDiff)
        {
            yield return Tuple.Create(diffs.Current, 1);
            hasDiff = diffs.MoveNext();
        }
    }
}
This can e.g. be used in the following way:
const int N = <Large number>;
const int omit1 = 231567;
const int omit2 = 589932;
IEnumerable<int> numberSequence1 = Enumerable.Range(0, N).Select(i => i < omit1 ? i : i + 1);
IEnumerable<int> numberSequence2 = Enumerable.Range(0, N).Select(i => i < omit2 ? i : i + 1);
var numberDiffs = FindDifferences(numberSequence1, numberSequence2, Comparer<int>.Default);
Benchmarking on my computer gave the following result for N = 1M:
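| Method   | Mean      | Error    | StdDev   | Ratio | Gen 0     | Gen 1     | Gen 2     | Allocated  |
|----------|-----------|----------|----------|-------|-----------|-----------|-----------|------------|
| DiffLinq | 115.19 ms | 0.656 ms | 0.582 ms | 1.00  | 2800.0000 | 2800.0000 | 2800.0000 | 67110744 B |
| DiffZip  | 23.48 ms  | 0.018 ms | 0.015 ms | 0.20  | -         | -         | -         | 720 B      |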
And for N = 100M:
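| Method   | Mean     | Error    | StdDev   | Ratio | Gen 0      | Gen 1      | Gen 2      | Allocated    |
|----------|----------|----------|----------|-------|------------|------------|------------|--------------|
| DiffLinq | 12.146 s | 0.0427 s | 0.0379 s | 1.00  | 13000.0000 | 13000.0000 | 13000.0000 | 8589937032 B |
| DiffZip  | 2.324 s  | 0.0019 s | 0.0018 s | 0.19  | -          | -          | -          | 720 B        |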
Note that this example of course benefits from the fact that the lists are already sorted and integers can be very efficiently compared. But this is exactly the point: If you do have favourable circumstances, make sure that you exploit them.
A few further comments: The speed of the comparison function is clearly relevant for the overall performance, so it may be beneficial to optimize it. The flexibility to do so is a benefit of the zipping approach. Furthermore, parallelization seems more feasible to me, although by no means easy and maybe not worth the effort and the overhead. Nevertheless, a simple way to speed up the process by roughly a factor of 2 is to split each list into two halves (if that can be done efficiently) and compare the parts in parallel, one processing from front to back and the other in reverse order.
I have used this code to compare two lists that have millions of records.
This method does not take much time.
//Method to compare two lists of strings
private List<string> Contains(List<string> list1, List<string> list2)
{
List<string> result = new List<string>();
result.AddRange(list1.Except(list2, StringComparer.OrdinalIgnoreCase));
result.AddRange(list2.Except(list1, StringComparer.OrdinalIgnoreCase));
return result;
}
I compared 3 different methods for comparing different data sets. Tests below create a string collection of all the numbers from 0 to length - 1, then another collection with the same range, but with even numbers. I then pick out the odd numbers from the first collection.
Using Linq Except
public void TestExcept()
{
WriteLine($"Except {DateTime.Now}");
int length = 20000000;
var dateTime = DateTime.Now;
var array = new string[length];
for (int i = 0; i < length; i++)
{
array[i] = i.ToString();
}
Write("Populate set processing time: ");
WriteLine(DateTime.Now - dateTime);
var newArray = new string[length/2];
int j = 0;
for (int i = 0; i < length; i+=2)
{
newArray[j++] = i.ToString();
}
dateTime = DateTime.Now;
Write("Count of items: ");
WriteLine(array.Except(newArray).Count());
Write("Count processing time: ");
WriteLine(DateTime.Now - dateTime);
}
Output
Except 2021-08-14 11:43:03 AM
Populate set processing time: 00:00:03.7230479
2021-08-14 11:43:09 AM
Count of items: 10000000
Count processing time: 00:00:02.9720879
Using HashSet.Add
public void TestHashSet()
{
WriteLine($"HashSet {DateTime.Now}");
int length = 20000000;
var dateTime = DateTime.Now;
var hashSet = new HashSet<string>();
for (int i = 0; i < length; i++)
{
hashSet.Add(i.ToString());
}
Write("Populate set processing time: ");
WriteLine(DateTime.Now - dateTime);
var newHashSet = new HashSet<string>();
for (int i = 0; i < length; i+=2)
{
newHashSet.Add(i.ToString());
}
dateTime = DateTime.Now;
Write("Count of items: ");
// HashSet Add returns true if item is added successfully (not previously existing)
WriteLine(hashSet.Where(s => newHashSet.Add(s)).Count());
Write("Count processing time: ");
WriteLine(DateTime.Now - dateTime);
}
Output
HashSet 2021-08-14 11:42:43 AM
Populate set processing time: 00:00:05.6000625
Count of items: 10000000
Count processing time: 00:00:01.7703057
Special HashSet test:
public void TestLoadingHashSet()
{
int length = 20000000;
var array = new string[length];
for (int i = 0; i < length; i++)
{
array[i] = i.ToString();
}
var dateTime = DateTime.Now;
var hashSet = new HashSet<string>(array);
Write("Time to load hashset: ");
WriteLine(DateTime.Now - dateTime);
}
> TestLoadingHashSet()
Time to load hashset: 00:00:01.1918160
Using .Contains
public void TestContains()
{
WriteLine($"Contains {DateTime.Now}");
int length = 20000000;
var dateTime = DateTime.Now;
var array = new string[length];
for (int i = 0; i < length; i++)
{
array[i] = i.ToString();
}
Write("Populate set processing time: ");
WriteLine(DateTime.Now - dateTime);
var newArray = new string[length/2];
int j = 0;
for (int i = 0; i < length; i+=2)
{
newArray[j++] = i.ToString();
}
dateTime = DateTime.Now;
WriteLine(dateTime);
Write("Count of items: ");
WriteLine(array.Where(a => !newArray.Contains(a)).Count());
Write("Count processing time: ");
WriteLine(DateTime.Now - dateTime);
}
Output
Contains 2021-08-14 11:19:44 AM
Populate set processing time: 00:00:03.1046998
2021-08-14 11:19:49 AM
Count of items: Hosting process exited with exit code 1.
(Didn't complete. Killed it after 14 minutes.)
Conclusion:
Linq Except ran approximately 1 second slower on my device than using HashSets (n=20,000,000).
Using Where and Contains ran for a very long time
Closing remarks on HashSets:
Unique data
Make sure to override GetHashCode (correctly) for class types
May need up to 2x the memory if you make a copy of the data set, depending on implementation
HashSet is optimized for cloning other HashSets using the IEnumerable constructor, but it is slower to convert other collections to HashSets (see special test above)
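As an illustration of the GetHashCode remark above, here is a sketch of a class type that behaves sensibly inside a HashSet. `FileEntry` is a hypothetical type invented for this example; the important part is that Equals and GetHashCode use the same rule (here, a case-insensitive name).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type; Name is assumed to be non-null.
public sealed class FileEntry : IEquatable<FileEntry>
{
    public string Name { get; }

    public FileEntry(string name)
    {
        Name = name;
    }

    public bool Equals(FileEntry other)
    {
        return other != null
            && string.Equals(Name, other.Name, StringComparison.OrdinalIgnoreCase);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as FileEntry);
    }

    // Must agree with Equals: equal entries have to produce the same hash code.
    public override int GetHashCode()
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(Name);
    }
}

public static class FileEntryDemo
{
    public static void Main()
    {
        var set = new HashSet<FileEntry>
        {
            new FileEntry("a.dll"),
            new FileEntry("A.DLL"),   // equal to the first entry, so not added
            new FileEntry("b.dll")
        };
        Console.WriteLine(set.Count);   // 2
    }
}
```

Without the GetHashCode override, the two "a.dll" entries would normally land in different buckets and the set would keep both.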
First approach:
if (list1 != null && list2 != null && list1.Select(x => list2.SingleOrDefault(y => y.propertyToCompare == x.propertyToCompare && y.anotherPropertyToCompare == x.anotherPropertyToCompare) != null).All(x => x))
return true;
Second approach if you are ok with duplicate values:
if (list1 != null && list2 != null && list1.Select(x => list2.Any(y => y.propertyToCompare == x.propertyToCompare && y.anotherPropertyToCompare == x.anotherPropertyToCompare)).All(x => x))
return true;
Both Jon Skeet's and miguelmpn's answers are good. It depends on whether the order of the list elements is important or not:
// take order into account
bool areEqual1 = Enumerable.SequenceEqual(list1, list2);
// ignore order
bool areEqual2 = !list1.Except(list2).Any() && !list2.Except(list1).Any();
One line:
var list1 = new List<int> { 1, 2, 3 };
var list2 = new List<int> { 1, 2, 3, 4 };
if (list1.Except(list2).Count() + list2.Except(list1).Count() == 0)
Console.WriteLine("same sets");
I wrote a generic function for comparing two lists.
public static class ListTools
{
public enum RecordUpdateStatus
{
Added = 1,
Updated = 2,
Deleted = 3
}
public class UpdateStatu<T>
{
public T CurrentValue { get; set; }
public RecordUpdateStatus UpdateStatus { get; set; }
}
public static List<UpdateStatu<T>> CompareList<T>(List<T> currentList, List<T> inList, string uniqPropertyName)
{
var res = new List<UpdateStatu<T>>();
res.AddRange(inList.Where(a => !currentList.Any(x => x.GetType().GetProperty(uniqPropertyName).GetValue(x)?.ToString().ToLower() == a.GetType().GetProperty(uniqPropertyName).GetValue(a)?.ToString().ToLower()))
.Select(a => new UpdateStatu<T>
{
CurrentValue = a,
UpdateStatus = RecordUpdateStatus.Added,
}));
res.AddRange(currentList.Where(a => !inList.Any(x => x.GetType().GetProperty(uniqPropertyName).GetValue(x)?.ToString().ToLower() == a.GetType().GetProperty(uniqPropertyName).GetValue(a)?.ToString().ToLower()))
.Select(a => new UpdateStatu<T>
{
CurrentValue = a,
UpdateStatus = RecordUpdateStatus.Deleted,
}));
res.AddRange(currentList.Where(a => inList.Any(x => x.GetType().GetProperty(uniqPropertyName).GetValue(x)?.ToString().ToLower() == a.GetType().GetProperty(uniqPropertyName).GetValue(a)?.ToString().ToLower()))
.Select(a => new UpdateStatu<T>
{
CurrentValue = a,
UpdateStatus = RecordUpdateStatus.Updated,
}));
return res;
}
}
I think this is a simple and easy way to compare two lists element by element (Python):
x = [1, 2, 3, 5, 4, 8, 7, 11, 12, 45, 96, 25]
y = [2, 4, 5, 6, 8, 7, 88, 9, 6, 55, 44, 23]
tmp = []
# Only iterate over indices that exist in both lists
for i in range(min(len(x), len(y))):
    if x[i] > y[i]:
        tmp.append(1)
    else:
        tmp.append(0)
print(tmp)
Maybe it's funny, but this works for me:
string.Join("",List1) != string.Join("", List2)
This is the best solution you'll find:
var list3 = list1.Where(l => list2.ToList().Contains(l));
Context: C# 3.0, .Net 3.5
Suppose I have a method that generates random numbers (forever):
private static IEnumerable<int> RandomNumberGenerator() {
while (true) yield return GenerateRandomNumber(0, 100);
}
I need to group those numbers in groups of 10, so I would like something like:
foreach (IEnumerable<int> group in RandomNumberGenerator().Slice(10)) {
Assert.That(group.Count() == 10);
}
I have defined Slice method, but I feel there should be one already defined. Here is my Slice method, just for reference:
private static IEnumerable<T[]> Slice<T>(IEnumerable<T> enumerable, int size) {
var result = new List<T>(size);
foreach (var item in enumerable) {
result.Add(item);
if (result.Count == size) {
yield return result.ToArray();
result.Clear();
}
}
}
Question: is there an easier way to accomplish what I'm trying to do? Perhaps Linq?
Note: above example is a simplification, in my program I have an Iterator that scans given matrix in a non-linear fashion.
EDIT: Why Skip+Take is no good.
Effectively what I want is:
var group1 = RandomNumberGenerator().Skip(0).Take(10);
var group2 = RandomNumberGenerator().Skip(10).Take(10);
var group3 = RandomNumberGenerator().Skip(20).Take(10);
var group4 = RandomNumberGenerator().Skip(30).Take(10);
without the overhead of regenerating numbers (10+20+30+40) times. I need a solution that will generate exactly 40 numbers and break them into 4 groups of 10.
Are Skip and Take of any use to you?
Use a combination of the two in a loop to get what you want.
So,
list.Skip(10).Take(10);
Skips the first 10 records and then takes the next 10.
I have done something similar. But I would like it to be simpler:
//Remove "this" if you don't want it to be a extension method
public static IEnumerable<IList<T>> Chunks<T>(this IEnumerable<T> xs, int size)
{
var curr = new List<T>(size);
foreach (var x in xs)
{
curr.Add(x);
if (curr.Count == size)
{
yield return curr;
curr = new List<T>(size);
}
}
}
I think yours are flawed. You return the same array for all your chunks/slices so only the last chunk/slice you take would have the correct data.
Addition: Array version:
public static IEnumerable<T[]> Chunks<T>(this IEnumerable<T> xs, int size)
{
var curr = new T[size];
int i = 0;
foreach (var x in xs)
{
curr[i % size] = x;
if (++i % size == 0)
{
yield return curr;
curr = new T[size];
}
}
}
Addition: Linq version (not C# 2.0). As pointed out, it will not work on infinite sequences and will be a great deal slower than the alternatives:
public static IEnumerable<T[]> Chunks<T>(this IEnumerable<T> xs, int size)
{
return xs.Select((x, i) => new { x, i })
.GroupBy(xi => xi.i / size, xi => xi.x)
.Select(g => g.ToArray());
}
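One thing to be aware of: the two iterator-based versions above only yield full chunks, so when a finite source's length is not a multiple of size, the trailing items are silently dropped (the GroupBy version keeps them). A sketch of a variant that also yields the remainder follows; `ChunksWithRemainder` is a name made up here. On .NET 6 and later, the built-in Enumerable.Chunk may make a hand-written version unnecessary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkExtensions
{
    public static IEnumerable<IList<T>> ChunksWithRemainder<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var current = new List<T>(size);
        foreach (var item in source)
        {
            current.Add(item);
            if (current.Count == size)
            {
                yield return current;
                current = new List<T>(size);   // fresh list, so earlier chunks stay intact
            }
        }

        // Source ended mid-chunk: hand out what has been collected so far.
        if (current.Count > 0)
        {
            yield return current;
        }
    }
}

public static class ChunkDemo
{
    public static void Main()
    {
        foreach (var chunk in Enumerable.Range(1, 7).ChunksWithRemainder(3))
        {
            Console.WriteLine(string.Join(", ", chunk));
        }
        // Prints three lines: "1, 2, 3", "4, 5, 6" and "7"
    }
}
```

It still works on an infinite source, because each chunk is yielded as soon as it is full.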
Using Skip and Take would be a very bad idea. Calling Skip on an indexed collection may be fine, but calling it on any arbitrary IEnumerable<T> is liable to result in enumeration over the number of elements skipped, which means that if you're calling it repeatedly you're enumerating over the sequence an order of magnitude more times than you need to be.
Complain of "premature optimization" all you want; but that is just ridiculous.
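To put a number on it, the sketch below counts how many elements a generator has to produce for four groups of ten, once with repeated Skip/Take and once with a single enumerator. `Produced` and `Numbers` are hypothetical stand-ins for the real generator. With Skip/Take the count should come out around 10+20+30+40 = 100 (the exact figure depends on the LINQ implementation), against exactly 40 for the single pass.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipTakeCost
{
    // Hypothetical counter: how many elements the generator has produced.
    public static int Produced;

    public static IEnumerable<int> Numbers()
    {
        var i = 0;
        while (true)
        {
            Produced++;
            yield return i++;
        }
    }

    // Every group restarts the generator and walks past all earlier groups.
    public static int ProducedBySkipTake(int groups, int size)
    {
        Produced = 0;
        for (var g = 0; g < groups; g++)
        {
            Numbers().Skip(g * size).Take(size).ToList();
        }
        return Produced;
    }

    // One enumerator is advanced group by group; nothing is produced twice.
    public static int ProducedBySinglePass(int groups, int size)
    {
        Produced = 0;
        using (var e = Numbers().GetEnumerator())
        {
            for (var g = 0; g < groups; g++)
            {
                var chunk = new List<int>(size);
                while (chunk.Count < size && e.MoveNext())
                {
                    chunk.Add(e.Current);
                }
            }
        }
        return Produced;
    }

    public static void Main()
    {
        Console.WriteLine(ProducedBySkipTake(4, 10));
        Console.WriteLine(ProducedBySinglePass(4, 10));   // 40
    }
}
```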
I think your Slice method is about as good as it gets. I was going to suggest a different approach that would provide deferred execution and obviate the intermediate array allocation, but that is a dangerous game to play (i.e., if you try something like ToList on such a resulting IEnumerable<T> implementation, without enumerating over the inner collections, you'll end up in an endless loop).
(I've removed what was originally here, as the OP's improvements since posting the question have since rendered my suggestions here redundant.)
Let's see if you even need the complexity of Slice. If your random number generator is stateless, I would assume each call to it would generate unique random numbers, so perhaps this would be sufficient:
var group1 = RandomNumberGenerator().Take(10);
var group2 = RandomNumberGenerator().Take(10);
var group3 = RandomNumberGenerator().Take(10);
var group4 = RandomNumberGenerator().Take(10);
Each call to Take returns a new group of 10 numbers.
Now, if your random number generator re-seeds itself with a specific value each time it's iterated, this won't work. You'll simply get the same 10 values for each group. So instead, you would use:
var generator = RandomNumberGenerator();
var group1 = generator.Take(10);
var group2 = generator.Take(10);
var group3 = generator.Take(10);
var group4 = generator.Take(10);
This maintains an instance of the generator so that you can continue retrieving values without re-seeding the generator.
You could use the Skip and Take methods with any Enumerable object.
For your edit :
How about a function that takes a slice number and a slice size as a parameter?
private static IEnumerable<T> Slice<T>(IEnumerable<T> enumerable, int sliceSize, int sliceNumber) {
return enumerable.Skip(sliceSize * sliceNumber).Take(sliceSize);
}
It seems like we'd prefer for an IEnumerable<T> to have a fixed position counter so that we can do
var group1 = items.Take(10);
var group2 = items.Take(10);
var group3 = items.Take(10);
var group4 = items.Take(10);
and get successive slices rather than getting the first 10 items each time. We can do that with a new implementation of IEnumerable<T> which keeps one instance of its Enumerator and returns it on every call of GetEnumerator:
public class StickyEnumerable<T> : IEnumerable<T>, IDisposable
{
private IEnumerator<T> innerEnumerator;
public StickyEnumerable( IEnumerable<T> items )
{
innerEnumerator = items.GetEnumerator();
}
public IEnumerator<T> GetEnumerator()
{
return innerEnumerator;
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return innerEnumerator;
}
public void Dispose()
{
if (innerEnumerator != null)
{
innerEnumerator.Dispose();
}
}
}
Given that class, we could implement Slice with
public static IEnumerable<IEnumerable<T>> Slices<T>(this IEnumerable<T> items, int size)
{
using (StickyEnumerable<T> sticky = new StickyEnumerable<T>(items))
{
IEnumerable<T> slice;
do
{
slice = sticky.Take(size).ToList();
yield return slice;
} while (slice.Count() == size);
}
yield break;
}
That works in this case, but StickyEnumerable<T> is generally a dangerous class to have around if the consuming code isn't expecting it. For example,
using (var sticky = new StickyEnumerable<int>(Enumerable.Range(1, 10)))
{
var first = sticky.Take(2);
var second = sticky.Take(2);
foreach (int i in second)
{
Console.WriteLine(i);
}
foreach (int i in first)
{
Console.WriteLine(i);
}
}
prints
1
2
3
4
rather than
3
4
1
2
Take a look at Take(), TakeWhile() and Skip()
I think the use of Slice() would be a bit misleading. I think of that as a means to give me a chunk of an array as a new array without causing side effects. In this scenario you would actually move the enumerable forward 10.
A possible better approach is to just use the Linq extension Take(). I don't think you would need to use Skip() with a generator.
Edit: Dang, I have been trying to test this behavior with the following code
Note: this wasn't really correct; I leave it here so others don't fall into the same mistake.
var numbers = RandomNumberGenerator();
var slice = numbers.Take(10);
public static IEnumerable<int> RandomNumberGenerator()
{
yield return random.Next();
}
but the Count() for slice is always 1. I also tried running it through a foreach loop, since I know that the Linq extensions are generally lazily evaluated, and it only looped once. I eventually wrote the code below instead of using Take(), and it works:
public static IEnumerable<int> Slice(this IEnumerable<int> enumerable, int size)
{
var list = new List<int>();
foreach (var count in Enumerable.Range(0, size)) list.Add(enumerable.First());
return list;
}
If you notice I am adding the First() to the list each time, but since the enumerable that is being passed in is the generator from RandomNumberGenerator() the result is different every time.
So again with a generator using Skip() is not needed since the result will be different. Looping over an IEnumerable is not always side effect free.
Edit: I'll leave the last edit just so no one falls into the same mistake, but it worked fine for me just doing this:
var numbers = RandomNumberGenerator();
var slice1 = numbers.Take(10);
var slice2 = numbers.Take(10);
The two slices were different.
I had made some mistakes in my original answer but some of the points still stand. Skip() and Take() are not going to work the same with a generator as it would a list. Looping over an IEnumerable is not always side effect free. Anyway here is my take on getting a list of slices.
public static IEnumerable<int> RandomNumberGenerator()
{
while(true) yield return random.Next();
}
public static IEnumerable<IEnumerable<int>> Slice(this IEnumerable<int> enumerable, int size, int count)
{
var slices = new List<List<int>>();
foreach (var iteration in Enumerable.Range(0, count)){
var list = new List<int>();
list.AddRange(enumerable.Take(size));
slices.Add(list);
}
return slices;
}
I got this solution for the same problem:
int[] ints = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
IEnumerable<IEnumerable<int>> chunks = Chunk(ints, 2, t => t.Dump());
//won't enumerate, so won't do anything unless you force it:
chunks.ToList();
IEnumerable<T> Chunk<T, R>(IEnumerable<R> src, int n, Func<IEnumerable<R>, T> action){
IEnumerable<R> head;
IEnumerable<R> tail = src;
while (tail.Any())
{
head = tail.Take(n);
tail = tail.Skip(n);
yield return action(head);
}
}
If you just want the chunks returned without doing anything to them, use chunks = Chunk(ints, 2, t => t). What I would really like is to have t => t as the default action, but I haven't found out how to do that yet.
In C#, I have an array of ints, containing digits only. I want to convert this array to string.
Array example:
int[] arr = {0,1,2,3,0,1};
How can I convert this to a string formatted as: "012301"?
In .NET 3.5, use:
String.Join("", new List<int>(array).ConvertAll(i => i.ToString()).ToArray());
In .NET 4.0 or above, use (see @Jan Remunda's answer):
string result = string.Join("", array);
You can simply use String.Join function, and as separator use string.Empty because it uses StringBuilder internally.
string result = string.Join(string.Empty, new []{0,1,2,3,0,1});
E.g.: If you use semicolon as separator, the result would be 0;1;2;3;0;1.
It actually works with null separator, and second parameter can be enumerable of any objects, like:
string result = string.Join(null, new object[]{0,1,2,3,0,"A",DateTime.Now});
I realize my opinion is probably not the popular one, but I guess I have a hard time jumping on the Linq-y band wagon. It's nifty. It's condensed. I get that and I'm not opposed to using it where it's appropriate. Maybe it's just me, but I feel like people have stopped thinking about creating utility functions to accomplish what they want and instead prefer to litter their code with (sometimes) excessively long lines of Linq code for the sake of creating a dense 1-liner.
I'm not saying that any of the Linq answers that people have provided here are bad, but I guess I feel like there is the potential that these single lines of code can start to grow longer and more obscure as you need to handle various situations. What if your array is null? What if you want a delimited string instead of just purely concatenated? What if some of the integers in your array are double-digit and you want to pad each value with leading zeros so that the string for each element is the same length as the rest?
Taking one of the provided answers as an example:
result = arr.Aggregate(string.Empty, (s, i) => s + i.ToString());
If I need to worry about the array being null, now it becomes this:
result = (arr == null) ? null : arr.Aggregate(string.Empty, (s, i) => s + i.ToString());
If I want a comma-delimited string, now it becomes this:
result = (arr == null) ? null : arr.Skip(1).Aggregate(arr[0].ToString(), (s, i) => s + "," + i.ToString());
This is still not too bad, but I think it's not obvious at a glance what this line of code is doing.
Of course, there's nothing stopping you from throwing this line of code into your own utility function so that you don't have that long mess mixed in with your application logic, especially if you're doing it in multiple places:
public static string ToStringLinqy<T>(this T[] array, string delimiter)
{
// edit: let's replace this with a "better" version using a StringBuilder
//return (array == null) ? null : (array.Length == 0) ? string.Empty : array.Skip(1).Aggregate(array[0].ToString(), (s, i) => s + "," + i.ToString());
return (array == null) ? null : (array.Length == 0) ? string.Empty : array.Skip(1).Aggregate(new StringBuilder(array[0].ToString()), (s, i) => s.Append(delimiter).Append(i), s => s.ToString());
}
But if you're going to put it into a utility function anyway, do you really need it to be condensed down into a 1-liner? In that case why not throw in a few extra lines for clarity and take advantage of a StringBuilder so that you're not doing repeated concatenation operations:
public static string ToStringNonLinqy<T>(this T[] array, string delimiter)
{
if (array != null)
{
// edit: replaced my previous implementation to use StringBuilder
if (array.Length > 0)
{
StringBuilder builder = new StringBuilder();
builder.Append(array[0]);
for (int i = 1; i < array.Length; i++)
{
builder.Append(delimiter);
builder.Append(array[i]);
}
return builder.ToString();
}
else
{
return string.Empty;
}
}
else
{
return null;
}
}
And if you're really so concerned about performance, you could even turn it into a hybrid function that decides whether to do string.Join or to use a StringBuilder depending on how many elements are in the array (this is a micro-optimization, not worth doing in my opinion and possibly more harmful than beneficial, but I'm using it as an example for this problem):
public static string ToString<T>(this T[] array, string delimiter)
{
if (array != null)
{
// determine if the length of the array is greater than the performance threshold for using a stringbuilder
// 10 is just an arbitrary threshold value I've chosen
if (array.Length < 10)
{
// assumption is that for arrays of less than 10 elements
// this code would be more efficient than a StringBuilder.
// Note: this is a crazy/pointless micro-optimization. Don't do this.
string[] values = new string[array.Length];
for (int i = 0; i < values.Length; i++)
values[i] = array[i].ToString();
return string.Join(delimiter, values);
}
else
{
// for arrays of length 10 or longer, use a StringBuilder
StringBuilder sb = new StringBuilder();
sb.Append(array[0]);
for (int i = 1; i < array.Length; i++)
{
sb.Append(delimiter);
sb.Append(array[i]);
}
return sb.ToString();
}
}
else
{
return null;
}
}
For this example, the performance impact is probably not worth caring about, but the point is that if you are in a situation where you actually do need to be concerned with the performance of your operations, whatever they are, then it will most likely be easier and more readable to handle that within a utility function than using a complex Linq expression.
That utility function still looks kind of clunky. Now let's ditch the hybrid stuff and do this:
// convert an enumeration of one type into an enumeration of another type
public static IEnumerable<TOut> Convert<TIn, TOut>(this IEnumerable<TIn> input, Func<TIn, TOut> conversion)
{
foreach (TIn value in input)
{
yield return conversion(value);
}
}
// concatenate the strings in an enumeration separated by the specified delimiter
public static string Delimit<T>(this IEnumerable<T> input, string delimiter)
{
// dispose the enumerator once we are done with it
using (IEnumerator<T> enumerator = input.GetEnumerator())
{
if (enumerator.MoveNext())
{
StringBuilder builder = new StringBuilder();
// start off with the first element
builder.Append(enumerator.Current);
// append the remaining elements separated by the delimiter
while (enumerator.MoveNext())
{
builder.Append(delimiter);
builder.Append(enumerator.Current);
}
return builder.ToString();
}
else
{
return string.Empty;
}
}
}
// concatenate all elements
public static string ToString<T>(this IEnumerable<T> input)
{
return ToString(input, string.Empty);
}
// concatenate all elements separated by a delimiter
public static string ToString<T>(this IEnumerable<T> input, string delimiter)
{
return input.Delimit(delimiter);
}
// concatenate all elements, each one left-padded to a minimum length
public static string ToString<T>(this IEnumerable<T> input, int minLength, char paddingChar)
{
return input.Convert(i => i.ToString().PadLeft(minLength, paddingChar)).Delimit(string.Empty);
}
Now we have separate and fairly compact utility functions, each of which is arguably useful on its own.
Ultimately, my point is not that you shouldn't use Linq, but rather just to say don't forget about the benefits of creating your own utility functions, even if they are small and perhaps only contain a single line that returns the result from a line of Linq code. If nothing else, you'll be able to keep your application code even more condensed than you could achieve with a line of Linq code, and if you are using it in multiple places, then using a utility function makes it easier to adjust your output in case you need to change it later.
For this problem, I'd rather just write something like this in my application code:
int[] arr = { 0, 1, 2, 3, 0, 1 };
// 012301
result = arr.ToString<int>();
// comma-separated values
// 0,1,2,3,0,1
result = arr.ToString(",");
// left-padded to 2 digits
// 000102030001
result = arr.ToString(2, '0');
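One caveat worth spelling out: the explicit <int> in arr.ToString<int>() is what makes the compiler pick the extension method. A plain arr.ToString() binds to Object.ToString() and returns the type name instead. A minimal sketch of a way around that, assuming you are free to pick a different name (JoinToString and its containing class are hypothetical, not part of the code above):

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableStringExtensions
{
    // Hypothetical name chosen so it cannot be shadowed by Object.ToString().
    // Returns null for a null input, matching the earlier utility functions.
    public static string JoinToString<T>(this IEnumerable<T> input, string delimiter = "")
    {
        if (input == null)
        {
            return null;
        }
        return string.Join(delimiter, input);
    }
}
```

With this, arr.JoinToString() and arr.JoinToString(",") both resolve to the extension method without any explicit type argument.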
To avoid the creation of an extra array you could do the following.
var builder = new StringBuilder();
Array.ForEach(arr, x => builder.Append(x));
var res = builder.ToString();
string result = arr.Aggregate("", (s, i) => s + i.ToString());
(Disclaimer: If you have a lot of digits (hundreds, at least) and you care about performance, I suggest eschewing this method and using a StringBuilder, as in JaredPar's answer.)
You can do:
int[] arr = {0,1,2,3,0,1};
string results = string.Join("",arr.Select(i => i.ToString()).ToArray());
That gives you your results.
I like using StringBuilder with Aggregate(). The "trick" is that Append() returns the StringBuilder instance itself:
var sb = arr.Aggregate( new StringBuilder(), ( s, i ) => s.Append( i ) );
var result = sb.ToString();
string.Join("", (from i in arr select i.ToString()).ToArray())
In .NET 4.0, string.Join can take an IEnumerable<string> directly:
string.Join("", from i in arr select i.ToString())
I've left this here for posterity but don't recommend its use as it's not terribly readable. This is especially true now that I've come back to see it after a period of some time and have wondered what I was thinking when I wrote it (I was probably thinking 'crap, must get this written before someone else posts an answer'.)
string s = string.Concat(arr.Cast<object>().ToArray());
The most efficient way is not to convert each int into a string, but rather create one string out of an array of chars. Then the garbage collector only has one new temp object to worry about.
int[] arr = {0,1,2,3,0,1};
string result = new string(Array.ConvertAll<int,char>(arr, x => Convert.ToChar(x + '0')));
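Note that the one-liner above assumes every element is a single digit; a value such as 12 would silently map to some other character. A hedged sketch of the same char-array idea with a range check added (the class and method names are made up for illustration):

```csharp
using System;

public static class DigitArrayConverter
{
    // Hypothetical helper: builds the string from one char[] buffer and
    // rejects any element that is not a single decimal digit (0 to 9).
    public static string DigitsToString(int[] digits)
    {
        if (digits == null) throw new ArgumentNullException(nameof(digits));

        char[] chars = new char[digits.Length];
        for (int i = 0; i < digits.Length; i++)
        {
            int d = digits[i];
            if (d < 0 || d > 9)
            {
                throw new ArgumentOutOfRangeException(
                    nameof(digits), "Element at index " + i + " is not a single digit.");
            }
            chars[i] = (char)('0' + d);
        }
        return new string(chars);
    }
}
```

Whether the check is worth its cost depends on how much you trust the input; for data you already know is digits only, the original line is enough.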
This is a roundabout way to go about it, but it's not much code and it's easy for beginners to understand:
int[] arr = {0,1,2,3,0,1};
string joined = "";
foreach(int i in arr){
joined += i.ToString();
}
int number = int.Parse(joined);
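If this is a long array, you could use:
// the third argument must be a lambda that turns the StringBuilder into the final string
string result = arr.Aggregate(new StringBuilder(), (s, i) => s.Append(i), s => s.ToString());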
// This is the original array
int[] nums = {1, 2, 3};
// This is an empty string we will end up with
string numbers = "";
// iterate over every number in the array
foreach (var item in nums)
{
// append the number, converted to a string
numbers += Convert.ToString(item);
}
// Write the string in the console
Console.WriteLine(numbers);