Recursive method to parse a text file - C#

In C# I get a dictionary dict after reading in a text file that looks kind of like this
#33=CLOSED_SHELL('',(#34,#35,#36,#37,#38,#39));
#34=ADVANCED_FACE('',(#46),#40,.F.);
#35=ADVANCED_FACE('',(#47),#41,.F.);
#36=ADVANCED_FACE('',(#48),#42,.F.);
#37=ADVANCED_FACE('',(#49),#43,.F.);
#38=ADVANCED_FACE('',(#50),#44,.T.);
#39=ADVANCED_FACE('',(#51),#45,.F.);
#40=PLANE('',#127);
#41=PLANE('',#128);
#42=PLANE('',#129);
#43=PLANE('',#130);
#44=PLANE('',#131);
#45=PLANE('',#132);
#46=FACE_OUTER_BOUND('',#52,.T.);
#47=FACE_OUTER_BOUND('',#53,.T.);
#48=FACE_OUTER_BOUND('',#54,.T.);
#49=FACE_OUTER_BOUND('',#55,.T.);
#50=FACE_OUTER_BOUND('',#56,.T.);
#51=FACE_OUTER_BOUND('',#57,.T.);
#52=EDGE_LOOP('',(#58,#59,#60,#61));
#53=EDGE_LOOP('',(#62,#63,#64,#65));
#54=EDGE_LOOP('',(#66,#67,#68,#69));
#55=EDGE_LOOP('',(#70,#71,#72,#73));
#56=EDGE_LOOP('',(#74,#75,#76,#77));
#57=EDGE_LOOP('',(#78,#79,#80,#81));
//... this goes on for a couple of other elements
As you can see, each line contains multiple references to other lines. The # reference at the beginning of each line is unique, so these are the keys in dict.
So I'm using this method to resolve each category step by step:
private void RecursiveMethod(Dictionary<string, string> dict, Step stepObj, List<List<string>> getList, Action<List<string>> setList)
{
    foreach (var item in getList.ToList())
    {
        for (int valuesIndex = 1; valuesIndex < item.Count - 1; valuesIndex++)
        {
            var key = item[valuesIndex];
            string values;
            if (dict.TryGetValue(key, out values))
            {
                setList(SplitValues(values));
            }
        }
    }
}
So my thought is to integrate a switch/case statement on each entity name, like this:
case "FACE_OUTER_BOUND":
// set list for FACE_OUTER_BOUND...
// ...then call RecursiveMethod(...) again which selects case "EDGE_LOOP" and so on
break;
Does that make sense? Or would I be better off using separate methods for each case instead of calling the same method again?

It really depends on how similar the work will be each time. If as you navigate down the tree of references you need to perform the same navigation on each branch, recursion's your answer.
If after the first navigation the way you handle branches radically changes, use a separate method.
Very generally speaking, write it to be readable and easy to work with first, then change it if performance becomes an issue or you need to split off the functionality drastically.
Remember also that having a method such as ProbeLinesRecursive(int lineNumber) says something to the programmer who reads it, that is "I'll start with the line you give me, then keep going until I run out of lines to probe". This kind of descriptive programming becomes hugely helpful the bigger a project becomes over time.
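For concreteness, here is a minimal sketch of the switch-inside-recursion idea from the question. GetEntityName is a hypothetical helper that would pull e.g. "ADVANCED_FACE" out of an entry; SplitValues is the question's own method:

private void Resolve(Dictionary<string, string> dict, string key)
{
    string entry;
    if (!dict.TryGetValue(key, out entry))
        return; // dangling reference: nothing to resolve

    // GetEntityName("ADVANCED_FACE('',(#46),#40,.F.);") would return "ADVANCED_FACE"
    switch (GetEntityName(entry))
    {
        case "ADVANCED_FACE":
        case "FACE_OUTER_BOUND":
            // container entities: record them, then descend into each reference
            foreach (var childKey in SplitValues(entry))
                Resolve(dict, childKey);
            break;
        case "EDGE_LOOP":
            // leaf-level handling goes here
            break;
    }
}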

Related

Converting a for loop into Task.Parallel.For

I have a method bool IsExistImage(int i). The task of the method is to detect an image and return a bool indicating whether it exists or not.
I have a PDF of 100+ pages which I split, and I send only the file name through the method. The file names are actually the page numbers of the main PDF file, like 1, 2, 3, ..., 125, ...
After detecting the image, my method correctly saves the list of pages. For that I used this code:
ArrayList array1 = new ArrayList();
for (int i = 1; i < pdf.Length; i++)
{
    if (isExistImage(i))
    {
        array1.Add(i);
    }
}
This process runs for more than an hour (obviously because of the internal work in the isExistImage() method). I can assure you that no object/variable is global outside the method scope.
So, to shorten the time, I used a Task Parallel For loop. Here is what I did:
System.Threading.Tasks.Parallel.For(1, pdf.Length, i =>
{
    if (isExistImage(i))
        array1.Add(i);
});
But this is not working properly. Sometimes the image detection is right, but most of the time it's wrong. When I use the non-parallel for loop, it's always right.
I don't understand what the problem is here. What should I apply here? Is there a technique I am missing?
Your problem is that ArrayList (and most other .Net collections) is not thread-safe.
There are several ways to fix this, but I think that in this case, the best option is to use PLINQ:
List<int> pagesWithImages = ParallelEnumerable.Range(1, pdf.Length)
    .Where(i => isExistImage(i))
    .ToList();
This will use multiple threads to call the (weirdly named) isExistImage method, which is exactly what you want, and then return a List<int> containing the indexes that matched the condition.
The returned list won't be sorted. If you want that, add AsOrdered() before the Where().
BTW, you really shouldn't be using ArrayList. If you want a list of integers, use List<int>.
ArrayList isn't thread-safe; look into the concurrent collections in System.Collections.Concurrent.
Is isExistImage thread-safe? I.e., are you locking before updating any member variables?
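Following up on those comments: if you want to keep the Parallel.For shape rather than switch to PLINQ, a thread-safe collection from System.Collections.Concurrent fixes the race. A minimal sketch, assuming isExistImage itself is thread-safe:

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var bag = new ConcurrentBag<int>();          // thread-safe, but unordered
Parallel.For(1, pdf.Length, i =>
{
    if (isExistImage(i))
        bag.Add(i);                          // safe to call from multiple threads
});
List<int> pages = bag.OrderBy(p => p).ToList(); // restore page order afterwards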

Is a switch statement ok for 30 or so conditions?

I am in the final stages of creating an MP4 tag parser in .NET. Those who have experience with tagging music will be aware that there are on average 30 or so tags. I've tested out different types of loops, and it seems that a switch statement with const values is the way to go with regard to catching the tags in the binary.
The switch allows me to search the binary without needing to know which order the tags are stored in, or whether some are not present, but I wonder if anyone would be against using a switch statement for so many conditionals.
Any insight is much appreciated.
EDIT: One thing I should add now that we're discussing this: the function is recursive. Should I pull out this conditional and pass the data off to a method I can kill?
It'll probably work fine with the switch, but I think your function will become very long.
One way you could solve this is to create a handler class for each tag type and then register each handler with the corresponding tag in a dictionary. When you need to parse a tag, you can look up in the dictionary which handler should be used.
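A minimal sketch of that registry idea, with a hypothetical ITagHandler interface ("©nam", the iTunes title atom, is just an example tag name):

// Hypothetical handler interface: one small class per tag type.
interface ITagHandler
{
    void Handle(byte[] tagData);
}

class TitleHandler : ITagHandler
{
    public void Handle(byte[] tagData) { /* parse the title tag */ }
}

// Register each handler against its tag name once...
var handlers = new Dictionary<string, ITagHandler>
{
    { "©nam", new TitleHandler() },   // one entry per known tag
};

// ...then dispatch by lookup instead of a 30-case switch:
ITagHandler handler;
if (handlers.TryGetValue(tagName, out handler))
    handler.Handle(tagData);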
Personally, if you must, I would go this way. A switch statement is much easier to read than If/Else statements (and at your size will be optimized for you).
Here is a related question. Note that the accepted answer is the incorrect one.
Is there any significant difference between using if/else and switch-case in C#?
Another option (Python inspired) is a dictionary that maps a tag to a lambda function, or an event, or something like that. It would require some re-architecture.
For something low level like this I don't see a problem. Just make sure you place each case in a separate method. You will thank yourself later.
To me, having so many conditions in a switch statement gives me pause for thought. It might be better to refactor the code and rely on virtual methods, an association between tags and methods, or some other mechanism to avoid spaghetti code.
If you have only one place that has that particular structure of switch and case statements, then it's a matter of style. If you have more than one place that has the same structure, you might want to rethink how you do it to minimize maintenance headaches.
It's hard to tell without seeing your code, but if you're just trying to catch each tag, you could define an array of acceptable tags, then loop through the file, checking to see if each tag is in the array.
ID3Sharp on SourceForge (http://sourceforge.net/projects/id3sharp/) uses a more object-oriented approach, with a FrameRegistry that hands out derived classes for each frame type.
It's fast, it works well, and it's easy to maintain. The 'overhead' of creating a small class object in C# is negligible compared to opening an MP4 file to read the header.
One design that might be useful in some cases (though from what I've seen it would be overkill here):
abstract class DoStuff
{
    public void Do(type it, Context context)
    {
        switch (it)
        {
            case case1: doCase1(context); break;
            case case2: doCase2(context); break;
            //...
        }
    }

    protected abstract void doCase1(Context context);
    protected abstract void doCase2(Context context);
    //...
}

class DoRealStuff : DoStuff
{
    protected override void doCase1(Context context) { ... }
    protected override void doCase2(Context context) { ... }
    //...
}
I'm not familiar with the MP4 technology, but I would explore the possibility of using some interfaces here. Pass in an object and try to cast it to the interface.
public void SomeMethod(object obj)
{
    ITag it = obj as ITag;
    if (it != null)
    {
        it.SomeProperty = "SomeValue";
        it.DoSomethingWithTag();
    }
}
I wanted to add my own answer just to bounce it off people...
Create an object that holds the binary tag name, the data, and the property name.
Create a list of these, one per known tag, filling in the tag name and property name.
When parsing, use LINQ to match the found name against object.BinaryTagName and attach the data.
Then reflect into the property and set the data... (a sketch follows).
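A hedged sketch of that idea; TagMapping, TrackInfo, track, data and foundName are all hypothetical names:

// Hypothetical mapping object: binary tag name -> property on the result type.
class TagMapping
{
    public string BinaryTagName { get; set; }
    public string PropertyName { get; set; }
}

var mappings = new List<TagMapping>
{
    new TagMapping { BinaryTagName = "©nam", PropertyName = "Title" },
    // ... one entry per known tag
};

// Match the tag found in the binary, then set the property via reflection.
var mapping = mappings.FirstOrDefault(m => m.BinaryTagName == foundName);
if (mapping != null)
{
    var prop = typeof(TrackInfo).GetProperty(mapping.PropertyName);
    prop.SetValue(track, data, null); // 'track' is the TrackInfo being populated
}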
What about a good old for loop? I think you can design it that way. Isn't a switch-case just a transformed if-else anyway? I always try to write code using a loop if the number of case statements becomes higher than acceptable, and 30 cases in a switch is too high for me.
You almost certainly have a Chain of Responsibility in your problem. Refactor.

How to replace for-loops with a functional statement in C#?

A colleague once said that God is killing a kitten every time I write a for-loop.
When asked how to avoid for-loops, his answer was to use a functional language. However, if you are stuck with a non-functional language, say C#, what techniques are there to avoid for-loops or to get rid of them by refactoring? With lambda expressions and LINQ perhaps? If so, how?
Questions
So the question boils down to:
Why are for-loops bad? Or, in what contexts should for-loops be avoided, and why?
Can you provide C# code examples of how it looks before, i.e. with a loop, and afterwards without a loop?
Functional constructs often express your intent more clearly than for-loops in cases where you operate on some data set and want to transform, filter or aggregate the elements.
Loops are very appropriate when you want to repeatedly execute some action.
For example
int x = array.Sum();
much more clearly expresses your intent than
int x = 0;
for (int i = 0; i < array.Length; i++)
{
    x += array[i];
}
Why are for-loops bad? Or, in what contexts should for-loops be avoided, and why?
If your colleague has a functional programming background, then he's probably already familiar with the basic reasons for avoiding for loops:
Fold / Map / Filter cover most use cases of list traversal, and lend themselves well to function composition. For-loops aren't a good pattern because they aren't composable.
Most of the time, you traverse through a list to fold (aggregate), map, or filter values in a list. These higher order functions already exist in every mainstream functional language, so you rarely see the for-loop idiom used in functional code.
Higher order functions are the bread and butter of function composition, meaning you can easily combine simple function into something more complex.
To give a non-trivial example, consider the following in an imperative language:
let x = someList;
y = []
for x' in x
y.Add(f x')
z = []
for y' in y
z.Add(g y')
In a functional language, we'd write map g (map f x), or we can eliminate the intermediate list using map (g . f) x. Now we can, in principle, eliminate the intermediate list from the imperative version too, and that would help a little -- but not much.
The main problem with the imperative version is simply that the for-loops are implementation details. If you want to change the function, you change its implementation -- and you end up modifying a lot of code.
Case in point: how would you write map g (filter f x) imperatively? Well, since you can't reuse your original code (which maps and maps), you need to write a new function which filters and then maps instead. And if you have 50 ways to map and 50 ways to filter, you now need 50 x 50 functions, or you need to simulate the ability to pass functions as first-class parameters using the command pattern (if you've ever tried functional programming in Java, you understand what a nightmare this can be).
Back in the functional universe, you can generalize map g (map f x) in a way that lets you swap out the map with filter or fold as needed:
let apply2 a g b f x = a g (b f x)
And call it using apply2 map g filter f or apply2 map g map f or apply2 filter g filter f or whatever you need. Now you'd probably never write code like that in the real world, you'd probably simplify it using:
let mapmap g f = apply2 map g map f
let mapfilter g f = apply2 map g filter f
Higher-order functions and function composition give you a level of abstraction that you cannot get with the imperative code.
Abstracting out the implementation details of loops lets you seamlessly swap one loop for another.
Remember, for-loops are an implementation detail. If you need to change the implementation, you need to change every for-loop.
Map / fold / filter abstract away the loop. So if you want to change the implementation of your loops, you change it in those functions.
Now you might wonder why you'd want to abstract away a loop. Consider the task of mapping items from one type to another: usually, items are mapped one at a time, sequentially, and independently from all other items. Most of the time, maps like this are prime candidates for parallelization.
Unfortunately, the implementation details of sequential maps and parallel maps aren't interchangeable. If you have a ton of sequential maps all over your code and you want to swap them out for parallel maps, you have two choices: copy/paste the same parallel mapping code all over your code base, or abstract the mapping logic away into two functions, map and pmap. Once you go the second route, you're already knee-deep in functional programming territory.
If you understand the purpose of function composition and abstracting away implementation details (even details as trivial as looping), you can start to appreciate just how and why functional programming is so powerful in the first place.
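In C# terms, that map/pmap swap is roughly the difference between LINQ's Select and PLINQ's Select. Because the loop is abstracted away, the call site barely changes (images and CreateThumbnail are hypothetical):

// Sequential map: items are transformed one at a time.
var thumbs = images.Select(img => CreateThumbnail(img)).ToList();

// Parallel map: same call shape, different traversal underneath (PLINQ).
var thumbsParallel = images.AsParallel().Select(img => CreateThumbnail(img)).ToList();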
For loops are not bad. There are many very valid reasons to keep a for loop.
You can often "avoid" a for loop by reworking it using LINQ in C#, which provides a more declarative syntax. This can be good or bad depending on the situation:
Compare the following:
var collection = GetMyCollection();
for (int i = 0; i < collection.Count; ++i)
{
    if (collection[i].MyValue == someValue)
        return collection[i];
}
vs foreach:
var collection = GetMyCollection();
foreach (var item in collection)
{
    if (item.MyValue == someValue)
        return item;
}
vs. LINQ:
var collection = GetMyCollection();
return collection.FirstOrDefault(item => item.MyValue == someValue);
Personally, all three options have their place, and I use them all. It's a matter of using the most appropriate option for your scenario.
There's nothing wrong with for loops, but here are some of the reasons people might prefer functional/declarative approaches like LINQ, where you declare what you want rather than how you get it:
Functional approaches are potentially easier to parallelize either manually using PLINQ or by the compiler. As CPUs move to even more cores this may become more important.
Functional approaches make it easier to achieve lazy evaluation in multi-step processes, because you can pass each intermediate result to the next step as a simple variable that hasn't been fully evaluated yet, rather than evaluating the first step entirely and then passing a collection to the next step (or writing a separate method with a yield statement to achieve the same thing procedurally); see the sketch after this list.
Functional approaches are often shorter and easier to read.
Functional approaches often eliminate complex conditional bodies within for loops (e.g. if statements and 'continue' statements) because you can break the for loop down into logical steps - selecting all the elements that match, doing an operation on them, ...
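A small sketch of the lazy-evaluation point above (lines is a hypothetical IEnumerable<string>): each intermediate variable is an unevaluated query, and items only flow through the pipeline when ToList() finally pulls them.

IEnumerable<string> step1 = lines.Where(l => l.StartsWith("#"));      // nothing evaluated yet
IEnumerable<string> step2 = step1.Select(l => l.ToUpperInvariant());  // still nothing evaluated
List<string> result = step2.Take(10).ToList(); // now at most 10 matching lines are pulled through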
For loops don't kill people (or kittens, or puppies, or tribbles). People kill people.
For loops, in and of themselves, are not bad. However, like anything else, it's how you use them that can be bad.
Sometime you don't kill just one kitten.
for (int i = 0; i < kittens.Length; i++)
{
    kittens[i].Kill();
}
Sometimes you kill them all.
You can refactor your code well enough that you won't see them often. A good function name is definitely more readable than a for loop.
Taking the example from AndyC :
Loop
// mystrings is a string array
List<string> myList = new List<string>();
foreach (string s in mystrings)
{
    if (s.Length > 5)
    {
        myList.Add(s);
    }
}
Linq
// mystrings is a string array
List<string> myList = mystrings.Where(t => t.Length > 5).ToList();
Whether you use the first or the second version inside your function, it's easier to read:
var filteredList = myList.GetStringLongerThan(5);
Now that's an overly simple example, but you get my point.
Your colleague is not right. For loops are not bad per se. They are clean, readable and not particularly error prone.
Your colleague is wrong about for loops being bad in all cases, but correct that they can be rewritten functionally.
Say you have an extension method that looks like this:
// (extension methods must be declared in a static class)
public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
{
    foreach (T item in collection)
    {
        action(item);
    }
}
Then you can write a loop like this:
mycollection.ForEach(x => x.DoStuff());
This may not seem very useful now. But if you then replace the implementation of your ForEach extension method with a multi-threaded approach, you gain the advantages of parallelism.
This obviously isn't always going to work; this implementation is only correct if the loop iterations are completely independent of each other, but it can be useful. A sketch of the parallel version follows.
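A sketch of that swap, under the same assumption that iterations are independent; the signature stays identical and only the body changes:

// Same signature as the sequential version above.
// Requires using System.Threading.Tasks.
public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
{
    Parallel.ForEach(collection, action); // runs iterations concurrently
}

// The call site is unchanged:
// mycollection.ForEach(x => x.DoStuff());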
Also: always be wary of people who say some programming construct is always wrong.
A simple (and pointless really) example:
Loop
// mystrings is a string array
List<string> myList = new List<string>();
foreach (string s in mystrings)
{
    if (s.Length > 5)
    {
        myList.Add(s);
    }
}
Linq
// mystrings is a string array
List<string> myList = mystrings.Where(t => t.Length > 5).ToList();
In my book, the second one looks a lot tidier and simpler, though there's nothing wrong with the first one.
Sometimes a for-loop is bad, if there is a more efficient alternative. For example, in searching, it might be more efficient to sort the list once and then use binary search. Or when you are iterating over items in a database: it is usually much more efficient to use set-based operations instead of iterating over the items.
Otherwise, if the for-loop, especially a for-each, makes the most sense and is readable, then I would go with that rather than refactor it into something that isn't as intuitive. I personally don't believe in these religious-sounding rules of "always do it this way, because that is the only way". Rather, it is better to have guidelines and understand in what scenarios it is appropriate to apply those guidelines. It is good that you ask the whys!
A for loop is, let's say, "bad" in that it implies branch prediction in the CPU, and possibly a performance decrease when the branch prediction misses.
But the CPU (with a branch prediction accuracy of around 97%) and the compiler, with techniques like loop unrolling, make the performance cost of a loop negligible.
If you abstract the for loop directly you get:
void For<T>(T initial, Func<T, bool> whilePredicate, Func<T, T> step, Action<T> action)
{
    for (T t = initial; whilePredicate(t); t = step(t)) // note: t = step(t), or the loop never advances
    {
        action(t);
    }
}
The problem I have with this from a functional programming perspective is the void return type. It essentially means that for loops do not compose nicely with anything. So the goal is not to have a 1-1 conversion from for loop to some function, it is to think functionally and avoid doing things that do not compose. Instead of thinking of looping and acting think of the whole problem and what you are mapping from and to.
A for loop can always be replaced by a recursive function that doesn't involve the use of a loop. A recursive function is a more functional style of programming.
But if you blindly replace for loops with recursive functions, then kittens and puppies will both die by the millions, and you will be done in by a velociraptor.
OK, here's an example. But please keep in mind that I do not advocate making this change!
The for loop
for (int index = 0; index < args.Length; ++index)
    Console.WriteLine(args[index]);
Can be changed to this recursive function call
WriteValuesToTheConsole(args, 0);
static void WriteValuesToTheConsole<T>(T[] values, int startingIndex)
{
    if (startingIndex < values.Length)
    {
        Console.WriteLine(values[startingIndex]);
        WriteValuesToTheConsole<T>(values, startingIndex + 1);
    }
}
This will work just the same for most values, but it is far less clear, less efficient, and could exhaust the stack if the array is too large.
Your colleague may be suggesting under certain circumstances where database data is involved that it is better to use an aggregate SQL function such as Average() or Sum() at query time as opposed to processing the data on the C# side within an ADO .NET application.
Otherwise, for loops are highly effective when used properly, but realize that if you find yourself nesting them three or more levels deep, you might need a better algorithm, such as one that involves recursion, subroutines, or both. For example, bubble sort has an O(n^2) runtime in its worst-case (reverse order) scenario, whereas a recursive sort algorithm such as merge sort is only O(n log n), which is much better.
Hopefully this helps.
Jim
Any construct in any language is there for a reason. It's a tool to be used to accomplish a task; a means to an end. In every case, there are ways to use it appropriately, that is, in a clear and concise way and within the spirit of the language, and ways to abuse it. This applies to the much-maligned goto statement as well as to your for loop conundrum, as well as to while, do-while, switch/case, if-then-else, etc. If the for loop is the right tool for what you're doing, USE IT, and your colleague will need to come to terms with your design decision.
It depends upon what is in the loop, but he/she may be referring to a recursive function:
//this is the recursive function
public static void getDirsFiles(DirectoryInfo d)
{
    //get all files for the current directory
    FileInfo[] files = d.GetFiles("*.*");

    //iterate through the directory and print the files
    foreach (FileInfo file in files)
    {
        //get details of each file using the file object
        String fileName = file.FullName;
        String fileSize = file.Length.ToString();
        String fileExtension = file.Extension;
        String fileModified = file.LastWriteTime.ToString();
        Console.WriteLine(fileName + " " + fileSize +
            " " + fileExtension + " " + fileModified);
    }

    //get sub-folders for the current directory
    DirectoryInfo[] dirs = d.GetDirectories("*.*");

    //This is the code that calls getDirsFiles (calls itself recursively).
    //This is also the stopping point (end condition) for this recursive
    //function: it descends until it reaches the deepest child folder and then stops.
    foreach (DirectoryInfo dir in dirs)
    {
        Console.WriteLine("--------->> {0} ", dir.Name);
        getDirsFiles(dir);
    }
}
The question is if the loop will be mutating state or causing side effects. If so, use a foreach loop. If not, consider using LINQ or other functional constructs.
See "foreach" vs "ForEach" on Eric Lippert's Blog.

Logic is now Polymorphism instead of Switch, but what about constructing?

This question is specifically regarding C#, but I am also interested in answers for C++ and Java (or even other languages if they've got something cool).
I am replacing switch statements with polymorphism in a "C using C# syntax" code I've inherited. I've been puzzling over the best way to create these objects. I have two fall-back methods I tend to use. I would like to know if there are other, viable alternatives that I should be considering or just a sanity check that I'm actually going about this in a reasonable way.
The techniques I normally use:
Use an all-knowing method/class. This class will either populate a data structure (most likely a Map) or construct on-the-fly using a switch statement.
Use a blind-and-dumb class that uses a config file and reflection to create a map of instances/delegates/factories/etc. Then use map in a manner similar to above.
???
Is there a #3, #4... etc that I should strongly consider?
Some details... please note, the original design is not mine and my time is limited as far as rewriting/refactoring the entire thing.
Previous pseudo-code:
public string[] HandleMessage(object input) {
    object parser = null;
    string command = null;
    if (input is XmlMessage) {
        parser = new XmlMessageParser();
        ((XmlMessageParser)parser).setInput(input);
        command = ((XmlMessageParser)parser).getCommand();
    } else if (input is NameValuePairMessage) {
        parser = new NameValuePairMessageParser();
        ((NameValuePairMessageParser)parser).setInput(input);
        command = ((NameValuePairMessageParser)parser).getCommand();
    } else if (...) {
        //blah blah blah
    }
    string[] result = new string[3];
    switch (command) {
        case "Add":
            result = Utility.AddData(parser);
            break;
        case "Modify":
            result = Utility.ModifyData(parser);
            break;
        case ... //blah blah
            break;
    }
    return result;
}
What I plan to replace that with (after much refactoring of the other objects) is something like:
public ResultStruct HandleMessage(IParserInput input) {
    IParser parser = this.GetParser(input.Type); //either Type or a property
    Map<string,string> parameters = parser.Parse(input);
    ICommand command = this.GetCommand(parameters); //in future, may need multiple params
    return command.Execute(parameters); //to figure out which object to return
}
The question is what should the implementation of GetParser and GetCommand be?
Putting a switch statement there (or an invocation of a factory that consists of switch statements) doesn't seem like it really fixes the problem; I'm just moving the switch somewhere else... which maybe is fine, as it's no longer in the middle of my primary logic.
You may want to put your parser instantiators on the objects themselves, e.g.,
public interface IParserInput
{
    ...
    IParser GetParser();
    ICommand GetCommand();
}
Any parameters that GetParser needs should, theoretically, be supplied by your object.
What will happen is that the object itself will return those, and your code becomes:
public ResultStruct HandleMessage(IParserInput input)
{
    IParser parser = input.GetParser();
    Map<string,string> parameters = parser.Parse(input);
    ICommand command = input.GetCommand();
    return command.Execute(parameters);
}
Now this solution is not perfect. If you do not have access to the IParserInput objects, it might not work. But at least the responsibility of providing information on the proper handler now falls with the parsee, not the handler, which seems to be more correct at this point.
You can have a:
public interface IParser<SomeType> : IParser { }
and set up StructureMap to look up a parser for SomeType.
It seems that the Commands are related to the parser in the existing code; if you find that clean for your scenario, you might want to leave it as is and just ask the Parser for the Command.
Update 1: I re-read the original code. I think for your scenario the smallest change will probably be to define an IParser<T> as above, with the appropriate GetCommand and SetInput.
The command/input piece would look something along these lines:
public string[] HandleMessage<MessageType>(MessageType input) {
    var parser = StructureMap.GetInstance<IParser<MessageType>>();
    parser.SetInput(input);
    var command = parser.GetCommand();
    //do something about the rest
}
P.S. Actually, your implementation makes me feel that the old code, even without the ifs and the switch, had issues. Can you provide more info on what is supposed to happen in GetCommand in your implementation? Does the command actually vary with the parameters? I am unsure what to suggest for it because of that.
I don't see any problems with a message handler like you have it. I certainly wouldn't go with the config file approach - why create a config file outside the debugger when you can have everything available at compile time?
The third alternative would be to discover the possible commands at runtime in a decentralized way.
For example, Spring can do this in Java using so-called "classpath scanning", reflection and annotations: Spring parses all classes in the package(s) you specify, picks the ones annotated with @Controller, @Resource etc., and registers them as beans.
Classpath scanning in Java relies on directory entries being added to JAR archives (so that Spring can enumerate the contents of various classpath directories).
I don't know about C#, but there should be a similar technique there: you can probably enumerate the classes in your assembly and pick some of them based on some criteria (naming convention, attribute, whatever); a sketch follows.
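In C#, the analogous technique is straightforward with reflection. A minimal sketch that scans the executing assembly for types carrying a hypothetical [Command] marker attribute:

using System;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute, defined in your own code.
[AttributeUsage(AttributeTargets.Class)]
class CommandAttribute : Attribute { }

// Enumerate all types in the assembly and pick out the annotated ones.
var commandTypes = Assembly.GetExecutingAssembly()
    .GetTypes()
    .Where(t => t.GetCustomAttributes(typeof(CommandAttribute), false).Length > 0);

foreach (var type in commandTypes)
{
    object instance = Activator.CreateInstance(type); // register it somewhere, e.g. in a dictionary
}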
Now, this is just a third option to have in mind for the sake of having a third option in mind. I doubt it should actually be used in practice. Your first alternative (just write a piece of code that knows about all the classes) should be the default choice unless you have a compelling reason to do otherwise.
In the decentralised world of OOP, where each class is a little piece of the puzzle, there has to be some “integration code” that knows how to put these pieces together. There's nothing wrong about having such “all-knowing” classes (as long as you limit them to application-level and subsystem-level integration code only).
Whichever way you choose (hard-code the possible choices in a class, read a config file or use reflection to discover the choices), it's all the same story, does not really matter, and can easily be changed at any time.
Have fun!

Probably BAD coding style ... please comment

I am checking whether the new name already exists or not.
Code 1
if(cmbxExistingGroups.Properties.Items.Cast<string>().ToList().Exists(txt => txt==txtNewGroup.Text.Trim())) {
MessageBox.Show("already exists.", "Add new group");
}
Otherwise I could have written:
Code 2
foreach (var str in cmbxExistingGroups.Properties.Items)
{
    if (str == txtNewGroup.Text)
    {
        MessageBox.Show("already exists.", "Add new group");
        break;
    }
}
I wrote these two and thought I was exploiting language features in code 1.
...and yes: both of them work for me ... I am wondering about the performance :-/
I appreciate the cleverness of the first sample (assuming it works), but the second one is a lot easier for the next person who has to maintain the code to figure out.
Sometimes just a little indentation makes a world of difference:
if (cmbxExistingGroups.Properties.Items
        .Cast<string>().ToList()
        .Exists
        (
            txt => txt == txtNewGroup.Text.Trim()
        ))
{
    MessageBox.Show("already exists.", "Add new group");
}
Since you're using a List<string>, you might as well just drop the Exists predicate and use Contains; use Exists when comparing complex objects by unique values.
I've quoted it before but I'll do it again:
Write your code as if the person maintaining it is a homicidal maniac
who knows where you live.
would
cmbxExistingGroups.Properties.Items.Contains(text)
not work instead?
There are a few things wrong here:
1) The two bits of code don't do the same thing: the first looks for the trimmed version of txtNewGroup, the second just looks for txtNewGroup.
2) There's no point in calling ToList(): that just makes things less efficient.
3) Using Exists with a predicate is overkill: Contains is all you need here.
So, the first could easily come down to:
if (cmbxExistingGroups.Properties.Items.Cast<string>().Contains(txtNewGroup.Text))
{
    // Stuff
}
I'd probably introduce a variable to give "cmbxExistingGroups.Properties.Items.Cast<string>()" a meaningful, simple name, but even then I'd say it's easier to understand than the explicit foreach loop.
The first code bit is fine, except that instead of calling Enumerable.ToList() and List<T>.Exists(), you should just call Enumerable.Any(). It does lazy evaluation, so it never allocates the memory for the List<T>, and it stops enumerating cmbxExistingGroups.Properties.Items (and casting the items to string) as soon as a match is found. Also, calling Trim inside that predicate means it happens for every item examined; it would be better to move it out to the outer scope:
string match = txtNewGroup.Text.Trim();
if (cmbxExistingGroups.Properties.Items.Cast<string>().Any(txt => txt == match))
{
    MessageBox.Show("already exists.", "Add new group");
}
Verbosity in coding is not always bad at all. I prefer the second code snippet a lot over the first one. Just imagine you would have to maintain (or even change the functionality of) the first example... um.
Well, if it were me, it would be a variation on 2. Always prefer readability over one-liners. Additionally, always extract a method to make it clearer.
your calling code becomes
if (cmbxExistingGroups.ContainsKey(txtNewGroup.Text))
{
    MessageBox.Show("Already Exists");
}
If you define an extension method for Combo Boxes
public static class ComboBoxExtensions
{
    public static bool ContainsKey(this ComboBox comboBox, string key)
    {
        foreach (string existing in comboBox.Items)
        {
            if (string.Equals(key, existing))
            {
                return true;
            }
        }
        return false;
    }
}
First, they're not equivalent. The first sample does a check against txtNewGroup.Text.Trim(); the second omits the Trim. Also, the first casts everything to a string, whereas the second uses whatever comes out of the iterator. I assume that's an object, or you wouldn't have needed the cast in the first place.
So, to be fair, the closest equivalent to the 2nd sample in the LINQ style would be:
if (cmbxExistingGroups.Properties.Items.Cast<string>().Contains(txtNewGroup.Text)) {
    ...
}
which isn't too bad. But, since you seem to be working with the old-style IEnumerable instead of the new-fangled IEnumerable<T>, why don't we give you another extension method:
public static bool Contains<T>(this IEnumerable e, T value) {
    return e.Cast<T>().Contains(value);
}
And now we have:
if (cmbxExistingGroups.Properties.Items.Contains(txtNewGroup.Text)) {
    ...
}
which is pretty readable IMO.
I would agree: go with the second one, because it will be easier to maintain for anybody else who works on it, and when you come back to it in 6-12 months, it will be easier to remember what you were doing.
"both of them work for me ... I am wondering about the performance"
I see no one read the question :) I think I see what you're doing (I don't use this language). The first tries to generate the list and test it in one shot. The second does an explicit iteration and can "short-circuit" (exit early) if it finds the duplicate early on. The question is whether the "all at once" approach is more efficient due to the language implementation.
The second of the two would perform better, and it would perform the same as other people's samples that use Contains.
The reason: the first one does an extra Trim per item, plus a conversion to a list, so it iterates once for the conversion and then starts again for the Exists check (though that check will exit early if the item is found). The second iterates only once, has no Trim, and will exit early if the item is found.
So, in short, the answer to your question is that the second performs much better.
From a performance point of view:
txtNewGroup.Text.Trim()
Do your control interaction/string manipulation outside of the loop - one time, instead of n times.
I imagine that on the WTFs-per-minute scale, the first would be off the chart. Count the dots: any more than two per line is a potential problem.
