LinkedList<T> (2.0): removing items iteratively


I need to iterate through a LinkedList<T> (in .NET 2.0) and remove all the items that match a given criterion.
This was easy in Java, since I could do the following:
Iterator<E> i = list.iterator();
while (i.hasNext()) {
    E e = i.next();
    if (e == x) {
        // Found, so move it to the front,
        i.remove();
        list.addFirst(x);
        // Return it
        return x;
    }
}
Unfortunately, the .NET IEnumerator<T> (the equivalent of Iterator<E>) has no remove method for removing the current element from the collection.
Also, in the LinkedList<T> there's no way to access an element at a given index, to accomplish the task by iterating back from the last to the first.
Have you got any idea on how to do it? Thank you very much!

This will remove all nodes that match a criterion, in one loop through the linked list.
LinkedListNode<E> node = list.First;
while (node != null)
{
    // Capture the successor first: Remove detaches the node and clears its Next.
    LinkedListNode<E> next = node.Next;
    if (node.Value == x) {
        list.Remove(node);
    }
    node = next;
}
I believe that's what you're attempting. Note that your Java code also adds the node back at the beginning of the list, so it doesn't remove all of the matching nodes; it moves the first match to the front. That would be easy to do with this approach as well.

It's actually a lot easier in C#.
T PlaceAtHead<T>(LinkedList<T> list, T x)
{
    list.Remove(x);   // removes the first occurrence of x, if there is one
    list.AddFirst(x);
    return x;
}

One ugly option is to iterate through your list, find all the items that apply and store them in a list. Then iterate through your second list and call remove on your LinkedList...
I'm hoping someone else has a more elegant solution :)
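The two-pass idea described above can be sketched as follows. This is only an illustration: the class and method names (TwoPassRemoval, RemoveMatching) are made up for the example, and it collects the nodes rather than the values, an assumption made so that each removal in the second pass does not have to search the list again.

```csharp
using System;
using System.Collections.Generic;

public static class TwoPassRemoval
{
    // Hypothetical helper: removes every element for which 'shouldRemove'
    // returns true and reports how many elements were removed.
    public static int RemoveMatching<T>(LinkedList<T> list, Predicate<T> shouldRemove)
    {
        // Pass 1: collect the matching nodes. The list is not modified here.
        List<LinkedListNode<T>> doomed = new List<LinkedListNode<T>>();
        for (LinkedListNode<T> node = list.First; node != null; node = node.Next)
        {
            if (shouldRemove(node.Value))
            {
                doomed.Add(node);
            }
        }

        // Pass 2: iterate the temporary list and remove from the LinkedList.
        foreach (LinkedListNode<T> node in doomed)
        {
            list.Remove(node);
        }
        return doomed.Count;
    }
}
```

Calling TwoPassRemoval.RemoveMatching(list, v => v == 2) on a list holding 1, 2, 3, 2, 4 should leave 1, 3, 4 and return 2. The cost is the extra temporary list, which the single-loop answers avoid.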

Just a little addition to Reed Copsey's answer with a predicate:
public static T MoveAheadAndReturn<T>(LinkedList<T> ll, Predicate<T> pred)
{
    if (ll == null)
        throw new ArgumentNullException("ll");
    if (pred == null)
        throw new ArgumentNullException("pred");
    LinkedListNode<T> node = ll.First;
    while (node != null)
    {
        if (pred(node.Value))
        {
            // Move the first matching node to the front and return its value.
            ll.Remove(node);
            ll.AddFirst(node);
            return node.Value;
        }
        node = node.Next;
    }
    // No match: return the default instead of the last value visited.
    return default(T);
}

Related

Is there any benefit to using LINQ to get the first (and only) element in IEnumerable<T>?

Is there any significant difference between these two lines?
var o = xmlFile.Descendants("SomeElement").ElementAt(0).Value;
And:
var o = xmlFile.Descendants("SomeElement").First().Value;
xmlFile is an XDocument object, and Descendants(XName name) returns IEnumerable<XElement>.
I know First() will throw an exception if the collection is empty and that you might want to use FirstOrDefault() instead, but that's fine in this case; I already validate my XDocument object against an XmlSchemaSet, so I know the element exists. I suppose accessing Value directly would throw an exception either way if the collection were empty, as ElementAt(0) wouldn't return anything either.
That said, I don't like adding using directives if I don't need to. Is there any reason one might want to use LINQ in this case? I can't imagine there's any real performance difference in either case.
I ask because the user is able to upload a zip file containing any number of XML files that need to be processed. 1 "record" per XML file.
EDIT: My original question was going to be "How do you get the first element from IEnumerable without adding using System.Linq;" and then I found ElementAt, not realizing that both methods are part of LINQ.
So I guess really what I want to know is, would there be a difference between either snippet above and this:
var descendants = xmlFile.Descendants("SomeElement");
var enumerator = descendants.GetEnumerator();
var node = (enumerator.MoveNext()) ? enumerator.Current : null;
I'd definitely say LINQ is much more readable, and for that alone is probably worth using. But again, the user can upload I think up to a 10 MB zip file and each of these XML files ranges from about 2 kilobytes to 10 kilobytes, depending on which schema it is. So that's a good number of files.

Check the source. Both ElementAt and First are extension methods defined on System.Linq.Enumerable (as noted by Lee in the question comments).
Update
I included the implementation of Single as well, since the discussion suggested it would be a better option for this specific problem. Fundamentally this comes down to readability and to which exceptions are thrown, as all three access the first element in the same way.
public static TSource First<TSource>(this IEnumerable<TSource> source) {
if (source == null) throw Error.ArgumentNull("source");
IList<TSource> list = source as IList<TSource>;
if (list != null) {
if (list.Count > 0) return list[0];
}
else {
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (e.MoveNext()) return e.Current;
}
}
throw Error.NoElements();
}
public static TSource ElementAt<TSource>(this IEnumerable<TSource> source, int index) {
if (source == null) throw Error.ArgumentNull("source");
IList<TSource> list = source as IList<TSource>;
if(list != null) return list[index];
if (index < 0) throw Error.ArgumentOutOfRange("index");
using (IEnumerator<TSource> e = source.GetEnumerator()) {
while (true) {
if (!e.MoveNext()) throw Error.ArgumentOutOfRange("index");
if (index == 0) return e.Current;
index--;
}
}
}
public static TSource Single<TSource>(this IEnumerable<TSource> source) {
if (source == null) throw Error.ArgumentNull("source");
IList<TSource> list = source as IList<TSource>;
if (list != null) {
switch (list.Count) {
case 0: throw Error.NoElements();
case 1: return list[0];
}
}
else {
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (!e.MoveNext()) throw Error.NoElements();
TSource result = e.Current;
if (!e.MoveNext()) return result;
}
}
throw Error.MoreThanOneElement();
}

The only real difference is the name, but it's important anyway. If you only want the first item, use Enumerable.First/FirstOrDefault; if you want the first but maybe later also the second, third, etc., then use ElementAt/ElementAtOrDefault.
The intention should be self explanatory. Readability is the key factor here.
You can find source code here, for example:
Enumerable.ElementAt and Enumerable.First
You can see that both methods are optimized for collections that support access via index.

The other answers here point out that both options you've presented actually use LINQ. But your updated question asks if this is equivalent to the original LINQ call:
var descendants = xmlFile.Descendants("SomeElement");
var enumerator = descendants.GetEnumerator();
var node = (enumerator.MoveNext()) ? enumerator.Current : null;
Well, no, not quite. Firstly, note that IEnumerator<T> implements IDisposable, but your code is never going to call Dispose (although I doubt that would actually have any effect in this case). Secondly, your code handles empty data sets differently from either of those LINQ methods (your implementation is more like FirstOrDefault). A more equivalent version would be:
XElement node;
using (var enumerator = xmlFile.Descendants("SomeElement").GetEnumerator())
{
if (!enumerator.MoveNext())
{
throw new Exception(...);
}
node = enumerator.Current;
}
Or without the using:
XElement node;
var enumerator = xmlFile.Descendants("SomeElement").GetEnumerator();
try {
if (!enumerator.MoveNext()) { throw new Exception(...); }
node = enumerator.Current;
} finally {
enumerator.Dispose();
}
But in truth, we don't need the Enumerator at all. We can get rid of the call to Descendants like this:
var n = xmlFile.FirstNode;
var node = n as XElement;
while (node == null && n != null)
{
node = (n = n.NextNode) as XElement;
}
while (node != null && node.Name != "SomeElement")
{
node = (n = node.FirstNode ?? node.NextNode ?? node.Parent?.NextNode) as XElement;
while (node == null && n != null)
{
node = (n = n.NextNode) as XElement;
}
}
if (node == null)
{
throw new Exception("");
}
Now, if you profile this, you'll find a marginal performance boost with the more complex solutions. Here are the results of a fairly basic benchmark I put together (first column is without compiler optimizations, second column is with compiler optimizations):
Method        Mean (/o-)   Mean (/o+)
First()       0.1468333    0.1414340
ElementAt()   0.1452045    0.1419018
No Linq       0.1334992    0.1259622
While Loop    0.0895821    0.0693819
However, saving a few processor cycles usually isn't your biggest concern in enterprise-level applications. Given the typical costs for maintaining code, you should generally try to optimize for readability, and in my opinion, this is a lot easier to read:
var node = xmlFile.Descendants("SomeElement").First();

They can be used interchangeably since they both are defined in System.Linq.Enumerable.
But there are some minor differences:
1) If the sequence is empty, .First() will throw an InvalidOperationException.
2) .ElementAt(0) will throw an ArgumentOutOfRangeException if the index is out of bounds.
Both of these exceptions can be avoided by using FirstOrDefault() or ElementAtOrDefault(0).
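To make the difference concrete, here is a small sketch that records which exception type each call raises on an empty sequence. The helper names (EmptySequenceDemo, ThrownBy, Run) are invented for this illustration, and an empty List<string> is assumed as the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptySequenceDemo
{
    // Returns the name of the exception type thrown by 'action', or "none".
    public static string ThrownBy(Action action)
    {
        try
        {
            action();
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    // Runs the four calls discussed above against an empty list.
    public static string[] Run()
    {
        IEnumerable<string> empty = new List<string>();
        return new[]
        {
            ThrownBy(() => empty.First()),              // InvalidOperationException
            ThrownBy(() => empty.ElementAt(0)),         // ArgumentOutOfRangeException
            ThrownBy(() => empty.FirstOrDefault()),     // none (returns null)
            ThrownBy(() => empty.ElementAtOrDefault(0)) // none (returns null)
        };
    }
}
```

So the choice mostly affects what a caller has to catch, and how the intent reads, rather than what is returned when the element exists.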

Check if IEnumerable has ANY rows without enumerating over the entire list

I have the following method which returns an IEnumerable of type T. The implementation of the method is not important, apart from the yield return to lazy load the IEnumerable. This is necessary as the result could have millions of items.
public IEnumerable<T> Parse()
{
foreach(...)
{
yield return parsedObject;
}
}
Problem:
I have the following property which can be used to determine if the IEnumerable will have any items:
public bool HasItems
{
get
{
return Parse().Take(1).SingleOrDefault() != null;
}
}
Is there perhaps a better way to do this?

Enumerable.Any() (the LINQ extension method on IEnumerable<T>) returns true if the sequence contains any elements and false if it is empty. It does not iterate the entire sequence: it tries to advance to the first element and returns true as soon as that succeeds, so at most one element is enumerated.
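Besides being shorter, Any() avoids a pitfall in the Take(1).SingleOrDefault() != null check from the question: that check compares against the default value, so it can misreport for value types and for sequences whose first element is null. A minimal sketch, with hypothetical helper names (HasItemsDemo, ViaSingleOrDefault, ViaAny) standing in for the HasItems property:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HasItemsDemo
{
    // The question's approach, generalized to any element type.
    public static bool ViaSingleOrDefault<T>(IEnumerable<T> source)
    {
        // For a non-nullable value type T, this comparison is always true.
        return source.Take(1).SingleOrDefault() != null;
    }

    // The suggested approach: asks only whether a first element exists.
    public static bool ViaAny<T>(IEnumerable<T> source)
    {
        return source.Any();
    }
}
```

With an empty int sequence, ViaSingleOrDefault reports true (the default 0 is never null) while ViaAny reports false. With a string sequence whose only element is null, ViaSingleOrDefault reports false while ViaAny reports true.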

Similar to Howto: Count the items from a IEnumerable<T> without iterating? an Enumerable is meant to be a lazy, read-forward "list", and like quantum mechanics the act of investigating it alters its state.
See confirmation: https://dotnetfiddle.net/GPMVXH
var sideeffect = 0;
var enumerable = Enumerable.Range(1, 10).Select(i => {
// show how many times it happens
sideeffect++;
return i;
});
// will 'enumerate' one item!
if(enumerable.Any()) Console.WriteLine("There are items in the list; sideeffect={0}", sideeffect);
enumerable.Any() is the cleanest way to check if there are any items in the list. You could try casting to something not lazy, like if(null != (list = enumerable as ICollection<T>) && list.Any()) return true.
Or, your scenario may permit using an Enumerator and making a preliminary check before enumerating:
var e = enumerable.GetEnumerator();
// check first
if(!e.MoveNext()) return;
// do some stuff, then enumerate the list
do {
actOn(e.Current); // do stuff with the current item
} while(e.MoveNext()); // stop when we don't have anything else

The best way to answer this question, and to clear all doubts, is to see what the 'Any' function does.
public static bool Any<TSource>(this IEnumerable<TSource> source) {
if (source == null) throw Error.ArgumentNull("source");
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (e.MoveNext()) return true;
}
return false;
}
https://github.com/microsoft/referencesource/blob/master/System.Core/System/Linq/Enumerable.cs

Removing from linked list C#

I'm trying to delete a node if x matches an int stored in my linked list.
I tried this, but once it removes the node, the foreach loop throws an error on the next iteration:
public void DeleteNode(int x, LinkedList<name> myLinkedList) {
foreach (name item in myLinkedList) {
if (item.num.Equals(x)) myLinkedList.Remove(item);
}
}
Hope that makes sense.

Yes, you can't iterate over a collection and modify it at the same time. However, LinkedList<T> lets you do the iteration explicitly pretty easily:
public void DeleteNode(int x, LinkedList<name> myLinkedList) {
var node = myLinkedList.First;
while (node != null) {
var nextNode = node.Next;
if (node.Value.num == x) {
myLinkedList.Remove(node);
}
node = nextNode;
}
}
Note that you can't get away with just using node = node.Next; as the last line: once a node has been removed from the list it is invalidated, and its Next property returns null.
This approach allows a single traversal of the list in O(n), and is likely to be the most efficient approach you'll find. It doesn't require any copying, or working with a collection (say List<T>) with less efficient removal complexity.
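The same traversal can be wrapped in a reusable helper that takes a predicate. The sketch below is one possible shape, not a framework API: LinkedList<T> has no built-in RemoveAll, so the extension method name and the class name LinkedListExtensions are assumptions for the example.

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListExtensions
{
    // Hypothetical extension: removes every node whose value satisfies 'match'
    // in a single pass and returns the number of nodes removed.
    public static int RemoveAll<T>(this LinkedList<T> list, Predicate<T> match)
    {
        if (list == null) throw new ArgumentNullException("list");
        if (match == null) throw new ArgumentNullException("match");

        int removed = 0;
        LinkedListNode<T> node = list.First;
        while (node != null)
        {
            // Read Next before removing: a removed node no longer has one.
            LinkedListNode<T> next = node.Next;
            if (match(node.Value))
            {
                list.Remove(node);
                removed++;
            }
            node = next;
        }
        return removed;
    }
}
```

With this in place, the DeleteNode body above would reduce to something like myLinkedList.RemoveAll(item => item.num == x).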

If you call remove during a foreach it will invalidate the enumerator, so this is not allowed.
Change your foreach to a simple for loop.

In this situation, I usually create a temporary collection and add each item that needs to be deleted to it. Then I loop through that collection, removing each item from the original.

The way I write that, without invalidating the iterator, is:
foreach(var item in list.Where(w=>w.num.Equals(x)).ToArray())
list.Remove(item);

I remove Items from list in the following way:
for (int j = lst.Count - 1; j >= 0; j--)
{
var elem= lst[j];
lst.Remove(elem);
}
It looks very close to a regular "foreach var elem in lst", which is the reason I like it.
I go from the end to the beginning because otherwise you'll lose your indexing and will need to track the number of removed items.

info is a class.
This will search through the linked list and delete the first item whose no property value is "1".
LinkedList<info> infolist = new LinkedList<info>();
string todelete = "1";
info tmpitem = null;
foreach (var item in infolist)
{
    if (item.no == todelete)
    {
        tmpitem = item;
        break; // stop at the first match
    }
}
// Remove after the loop, so the enumerator is never invalidated.
if (tmpitem != null)
    infolist.Remove(tmpitem);

public ListNode RemoveElements(ListNode head, int val)
{
if (head == null) return null;
head.next = RemoveElements(head.next, val);
return head.val == val ? head.next : head;
}

Why can't I do currentNode = currentNode.Next.Next?

I have made my own single chained/linked list.
Now, if I want to delete/remove a node/item from my list, I'd have to do something like this:
public void Delete(PARAMETERS)
{
Node previousNode = null,
currentNode = f;
while (currentNode != null)
{
if (SOMECONDITION)
{
if (previousNode == null)
{
f = currentNode.Next;
}
else
{
previousNode.Next = currentNode.Next;
}
}
else
{
previousNode = currentNode;
}
currentNode = currentNode.Next;
}
}
If SOMECONDITION is true, you simply skip the currentNode, thereby effectively "deleting" the node, as nothing points to it anymore.
But, I am really wondering, why can I not do something like this:
(...)
while ()
{
if (SOMECONDITION)
{
currentNode = currentNode.Next;
}
currentNode = currentNode.Next;
}
(...)
OR perhaps:
(...)
while ()
{
if (SOMECONDITION)
{
currentNode = currentNode.Next.Next;
}
else
{
currentNode = currentNode.Next;
}
}
(...)
What fundamental understanding do I lack?

Doing:
currentNode = currentNode.Next.Next;
Is a prime candidate for a NullReferenceException
EDIT:
Here's a list implementation with some pictures that may help you understand.
http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=176

There is nothing to say you can't do Next.Next.
The only issue is what if currentNode.Next is null? Then you would get an error.
PreviousNode works because you are doing a NULL check before using it.

currentNode is just a temporary pointer variable (reference) that ceases to exist at the end of the scope (that is by the next closing brace). When you change what that reference points to, you don't change any other references; changing currentNode doesn't magically change what the previous node's Next reference points to.
currentNode = currentNode.Next; // only changes the temporary reference
You have to actually reach into the linked list and change a reference inside the list, which is what you do when you change previousNode.Next - you change what node the previous node considers its next node. You basically tell it "This is your new Next node, forget about the old one".
Also, as the others have stated, you should check for null references throughout. If currentNode is the last node in the list, its Next will point at nothing, and currentNode.Next.Next will throw a NullReferenceException.

Perhaps if you re-write the original a bit you would see better what you are really doing to the list.
public void Delete(PARAMETERS)
{
var previous = FindPreviousNode(PARAMETERS);
if( previous == null && Matches(f, PARAMETERS)) {
f = f.Next;
} else if(previous != null ) {
previous.Next = previous.Next.Next;
} // you could add "else { throw new NodeNotFound() }" if that's appropriate
}
private Node FindPreviousNode(PARAMETERS) {
Node currentNode = f;
while (currentNode != null) {
if (Matches(currentNode.Next, PARAMETERS)) {
return currentNode;
}
currentNode = currentNode.Next;
}
return null;
}
You have asked around in the comments to understand more what's up with the list and the Next's properties, so here it goes:
Let's say the list is: 1|3|5|7, first points to 1, 1's Next property points to 3, 3's Next points to 5, 5's Next points to 7, and 7's Next points to null. That's all you keep track of to store the list. If you set 5's Next property to null, you are deleting the 7. If instead you set 3's Next property to 7, you are deleting the 5 from the list. If you set first to 3, you are deleting the 1.
It's all about the first and the Next properties. That's what makes the list.

The assignments to currentNode and previousNode do not alter the structure of the linked list. They're merely used to step through the structure.
The assignment to previousNode.Next is what changes the structure. Doing currentNode = currentNode.Next.Next will skip over the next node (if currentNode.Next is not null) but it won't alter the structure of the list.
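The difference between stepping and restructuring can be seen in a small sketch. The Node class and the helper names below are assumptions made for the illustration (the question does not show its own Node type); both methods start from the list 1|3|5|7 and try to get rid of the 5.

```csharp
using System.Collections.Generic;

public sealed class Node
{
    public int Value;
    public Node Next;
    public Node(int value, Node next) { Value = value; Next = next; }
}

public static class SkipVersusUnlink
{
    private static Node Build()
    {
        return new Node(1, new Node(3, new Node(5, new Node(7, null))));
    }

    private static string Render(Node first)
    {
        List<int> values = new List<int>();
        for (Node n = first; n != null; n = n.Next) values.Add(n.Value);
        return string.Join("|", values);
    }

    // Only the local variable jumps over the 5; no node is modified.
    public static string SkipOnly()
    {
        Node first = Build();
        Node current = first;
        while (current != null)
        {
            if (current.Next != null && current.Next.Value == 5)
                current = current.Next.Next; // moves the temporary reference
            else
                current = current.Next;
        }
        return Render(first); // "1|3|5|7": the list is unchanged
    }

    // Assigning to a node's Next field changes the list itself.
    public static string Unlink()
    {
        Node first = Build();
        for (Node current = first; current != null; current = current.Next)
        {
            if (current.Next != null && current.Next.Value == 5)
                current.Next = current.Next.Next; // rewires 3 to point at 7
        }
        return Render(first); // "1|3|7"
    }
}
```

Both loops guard current.Next against null before dereferencing it, which is the check the other answers ask for.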

You should really sketch a picture of the linked list if you're wondering about problems like this. It's far easier to see what needs to be done to accomplish some linked list mutation, than it is to reason it out.

Honestly, I don't follow the posted code at all.
If this is a standard linked list (each node has a Next, but that's it), follow these steps to run a deletion of a single item:
Step 1: Find the target node you want to delete, but keep track of the previous node visited.
Step 2: prevNode.Next = targetNode.Next
Note: special checks for deleting the head of the list need to be done.
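The steps above can be sketched as a tiny singly linked list of ints. Everything here (IntList, Item, AddFirst, DeleteFirst) is a made-up minimal type for illustration, not the asker's class.

```csharp
using System.Text;

public sealed class IntList
{
    private sealed class Item
    {
        public int Value;
        public Item Next;
    }

    private Item head;

    public void AddFirst(int value)
    {
        Item item = new Item();
        item.Value = value;
        item.Next = head;
        head = item;
    }

    // Deletes the first node holding 'value'; returns false if none exists.
    public bool DeleteFirst(int value)
    {
        // Step 1: find the target while remembering the previous node.
        Item prev = null;
        Item target = head;
        while (target != null && target.Value != value)
        {
            prev = target;
            target = target.Next;
        }
        if (target == null) return false;

        // Step 2: bypass the target. The head has no previous node,
        // so deleting it means moving the head reference instead.
        if (prev == null) head = target.Next;
        else prev.Next = target.Next;
        return true;
    }

    public override string ToString()
    {
        StringBuilder sb = new StringBuilder();
        for (Item n = head; n != null; n = n.Next)
        {
            if (sb.Length > 0) sb.Append('|');
            sb.Append(n.Value);
        }
        return sb.ToString();
    }
}
```

Starting from 1|3|5|7, DeleteFirst(5) leaves 1|3|7, and a following DeleteFirst(1) exercises the head case and leaves 3|7.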

How do you know in both cases that currentNode.Next is not null and thus that you can apply .Next on it? You are only checking for the != null in the loop condition.

Is the LinkedList in .NET a circular linked list?

I need a circular linked list, so I am wondering if LinkedList is a circular linked list?

A quick solution to using it in a circular fashion, whenever you want to move the "next" piece in the list:
current = current.Next ?? current.List.First;
Where current is LinkedListNode<T>.
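As a rough sketch of how that one-liner behaves in a loop, the helper below walks a LinkedList<T> for a fixed number of steps and wraps from the last node back to the first. The names CircularWalk, NextOrFirst and Take are hypothetical and exist only for this example.

```csharp
using System.Collections.Generic;

public static class CircularWalk
{
    // Returns the following node, or the first node when 'current' is the last.
    public static LinkedListNode<T> NextOrFirst<T>(LinkedListNode<T> current)
    {
        return current.Next ?? current.List.First;
    }

    // Collects 'steps' values starting at the first node, wrapping as needed.
    public static List<T> Take<T>(LinkedList<T> list, int steps)
    {
        List<T> result = new List<T>();
        LinkedListNode<T> node = list.First; // null when the list is empty
        for (int i = 0; i < steps && node != null; i++)
        {
            result.Add(node.Value);
            node = NextOrFirst(node);
        }
        return result;
    }
}
```

For a list holding a, b, c, taking 7 steps yields a, b, c, a, b, c, a. The node must still belong to a list: after a node is removed its List property is null, so NextOrFirst would fail on it.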

No. It is a doubly linked list, but not a circular linked list. See MSDN for details on this.
LinkedList<T> makes a good foundation for your own circular linked list, however. But it does have a definite First and Last property, and will not enumerate around these, which a proper circular linked list will.

While the public API of the LinkedList is not circular, internally it actually is. Consulting the reference source, you can see how it's implemented:
// This LinkedList is a doubly-Linked circular list.
internal LinkedListNode<T> head;
Of course, to hide the fact that it's circular, properties and methods that traverse the list make checks to prevent wrapping back to the head.
LinkedListNode:
public LinkedListNode<T> Next {
get { return next == null || next == list.head? null: next;}
}
public LinkedListNode<T> Previous {
get { return prev == null || this == list.head? null: prev;}
}
LinkedList.Enumerator:
public bool MoveNext() {
if (version != list.version) {
throw new InvalidOperationException(SR.GetString(SR.InvalidOperation_EnumFailedVersion));
}
if (node == null) {
index = list.Count + 1;
return false;
}
++index;
current = node.item;
node = node.next;
if (node == list.head) {
node = null;
}
return true;
}

If you need a circular data structure, have a look at the C5 generic collections library. They have any collection that's imaginably useful in there, including a circular queue (which might help you).

No, it's not. See MSDN
