Consider the following situation:
public class Employee
{
public string Name { get; set; }
public string Email { get; set; }
}
public class EmployeeGroup
{
//List of employees in marketting
public IList<Employee> MarkettingEmployees{ get; }
//List of employees in sales
public IList<Employee> SalesEmployees{ get; }
}
private EmployeeGroup GroupA;
int MarkettingCount;
string MarkettingNames;
MarkettingCount = GroupA.MarkettingEmployees.Count; //assigns MarkettingCount=5, this will always be 5-10 employees
MarkettingNames = <**how can i join all the GroupA.MarkettingEmployees.Name into a comma separated string?** >
//I tried a loop:
foreach(Employee MktEmployee in GroupA.MarkettingEmployees)
{
MarkettingNames += MktEmployee.Name + ", ";
}
The loop works, but I want to know:
Is looping the most efficient/elegant way of doing this? If not, what are the better alternatives? I tried string.Join but couldn't get it working.
I want to avoid LINQ.
You need a little bit of LINQ whether you like it or not ;)
MarkettingNames = string.Join(", ", GroupA.MarkettingEmployees.Select(e => e.Name));
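If you really want to keep LINQ out of it, string.Join also works with a manually filled array - a sketch using the question's own names:
// Build the array of names by hand, then let string.Join insert the separators
// (no trailing ", " to trim afterwards).
string[] names = new string[GroupA.MarkettingEmployees.Count];
for (int i = 0; i < names.Length; i++)
{
    names[i] = GroupA.MarkettingEmployees[i].Name;
}
MarkettingNames = string.Join(", ", names);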
From a practicality standpoint, there's no reasonable argument for avoiding a loop. Iteration is at the heart of every general-purpose programming language.
Using LINQ is elegant in simple cases. Again, there's no sound reason to avoid it per se.
In case you are looking for a rather obscure, academic solution, there's always tail recursion. However, your data structure would have to be adapted for it, and even if you used it, a smart compiler would detect it and optimize it into a loop. The odds are against you!
As an alternative, you could use a StringBuilder with Append instead of creating a new string at each iteration.
This would be much more efficient (see caveat below):
var stringBuilder = new StringBuilder();
foreach (Employee MktEmployee in GroupA.MarkettingEmployees)
{
stringBuilder.Append(MktEmployee.Name).Append(", ");
}
than this:
foreach(Employee MktEmployee in GroupA.MarkettingEmployees)
{
MarkettingNames += MktEmployee.Name + ", ";
}
Edit: If you had a large number of employees, this would be much more efficient. However, for a trivial loop of 5-10 items, StringBuilder is actually slightly less efficient than plain concatenation.
In small cases this isn't going to be much of a hit on performance, but in large cases the payoff will be significant.
Also, if you use the explicit loop approach, it's probably best to trim off that last ", " by using something like:
myString = myString.Trim().TrimEnd(',');
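With the StringBuilder version above, another option (a sketch of an alternative, not part of the original answer) is to drop the trailing separator by shortening the builder before converting it:
// Remove the final ", " by trimming two characters off the StringBuilder
// (assumes the loop appended at least one name).
if (stringBuilder.Length >= 2)
{
    stringBuilder.Length -= 2;
}
MarkettingNames = stringBuilder.ToString();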
The article below explains when you should use StringBuilder to concatenate strings.
In short, with the approach you are using, the concatenation creates a new string each time, which eats up a lot of memory: all of the data in the existing MarkettingNames string has to be copied into a new string before yet another MktEmployee.Name + ", " is appended.
Thank you, Jon Skeet: http://www.yoda.arachsys.com/csharp/stringbuilder.html
Is there any way to make Search and addToSearch faster?
I am trying to make it faster. I am not sure whether the regex in AddToSearch could be a problem; it is really small. I am out of ideas on how to optimize it further, and now I am just trying to meet the word count. I also wonder if there is a way to concatenate the non-empty parts of a name more effectively than I currently do.
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System;
namespace AutoComplete
{
public struct FullName
{
public string Name;
public string Surname;
public string Patronymic;
}
public class AutoCompleter
{
private List<string> listOfNames = new List<string>();
private static readonly Regex sWhitespace = new Regex(@"\s+");
public void AddToSearch(List<FullName> fullNames)
{
foreach (FullName i in fullNames)
{
string nameToAdd = "";
if (!string.IsNullOrWhiteSpace(i.Surname))
{
nameToAdd += sWhitespace.Replace(i.Surname, "") + " ";
}
if (!string.IsNullOrWhiteSpace(i.Name))
{
nameToAdd += sWhitespace.Replace(i.Name, "") + " ";
}
if (!string.IsNullOrWhiteSpace(i.Patronymic))
{
nameToAdd += sWhitespace.Replace(i.Patronymic, "") + " ";
}
listOfNames.Add(nameToAdd.Substring(0, nameToAdd.Length - 1));
}
}
public List<string> Search(string prefix)
{
if (prefix.Length > 100 || string.IsNullOrWhiteSpace(prefix))
{
throw new System.Exception();
}
List<string> namesWithPrefix = new List<string>();
foreach (string name in listOfNames)
{
if (IsPrefix(prefix, name))
{
namesWithPrefix.Add(name);
}
}
return namesWithPrefix;
}
private bool IsPrefix(string prefix, string stringToSearch)
{
if (stringToSearch.Length < prefix.Length)
{
return false;
}
for (int i = 0; i < prefix.Length; i++)
{
if (prefix[i] != stringToSearch[i])
{
return false;
}
}
return true;
}
}
}
Regular expressions (regex) are great because of their ease of use and flexibility, but most regex engines are actually quite slow, and the C# one is no exception. Moreover, strings can contain Unicode characters, and "\s" has to consider all the (fancy) space characters included in the Unicode character set, which makes regex search/replace much slower. If you know your input does not contain such characters (e.g. it is ASCII only), then you can write a much faster implementation. Alternatively, you can play with RegexOptions like Compiled and CultureInvariant to reduce the run time a bit.
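A sketch of both suggestions (it assumes the usings from the question's snippet plus using System.Text; for StringBuilder, and the ASCII-only input is an assumption about your data):
// Option 1: a compiled, culture-invariant regex as a drop-in replacement.
private static readonly Regex sWhitespaceCompiled =
    new Regex(@"\s+", RegexOptions.Compiled | RegexOptions.CultureInvariant);

// Option 2: skip the regex engine entirely when the input is known to be ASCII.
private static string StripAsciiWhitespace(string input)
{
    var buffer = new StringBuilder(input.Length);
    foreach (char c in input)
    {
        if (c != ' ' && c != '\t' && c != '\r' && c != '\n')
            buffer.Append(c);
    }
    return buffer.ToString();
}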
AddToSearch also performs many hidden allocations. Indeed, += creates a new string every time (C# strings are immutable and not designed to be resized repeatedly), and the Replace calls allocate new strings too. You can speed up the computation by writing the stripped parts directly into a preallocated buffer and copying out the result once, much as you currently do with Substring.
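A minimal sketch of that idea (a hypothetical rewrite, reusing the StripAsciiWhitespace helper sketched above and the class's existing listOfNames field):
public void AddToSearch(List<FullName> fullNames)
{
    var buffer = new StringBuilder();
    foreach (FullName i in fullNames)
    {
        buffer.Clear();
        AppendStripped(buffer, i.Surname);
        AppendStripped(buffer, i.Name);
        AppendStripped(buffer, i.Patronymic);
        if (buffer.Length > 0)
            buffer.Length--;              // drop the trailing space
        listOfNames.Add(buffer.ToString());
    }
}

private static void AppendStripped(StringBuilder buffer, string part)
{
    if (!string.IsNullOrWhiteSpace(part))
        buffer.Append(StripAsciiWhitespace(part)).Append(' ');
}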
Search is fine and is not easy to optimize. That being said, if listOfNames is big, you can use multiple threads to significantly speed up the computation. Be careful though, because List.Add is not thread-safe. Parallel LINQ may help you do that easily (I have never tested it, though).
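A sketch of the Parallel LINQ idea (my assumption of how it would look, not something that has been tested here; it needs using System.Linq; and relies on Search only reading listOfNames):
public List<string> Search(string prefix)
{
    if (prefix.Length > 100 || string.IsNullOrWhiteSpace(prefix))
    {
        throw new System.Exception();
    }
    // AsParallel spreads the prefix checks over several threads;
    // AsOrdered keeps the results in the original list order.
    return listOfNames
        .AsParallel()
        .AsOrdered()
        .Where(name => IsPrefix(prefix, name))
        .ToList();
}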
Another way to speed up Search a bit is to start the loop in IsPrefix from prefix.Length - 1. Indeed, if most strings share the beginning of the prefix, then a significant portion of the time will be spent comparing nearly equal characters: the probability that prefix[prefix.Length-1] != stringToSearch[prefix.Length-1] is higher than the probability that prefix[0] != stringToSearch[0]. Additionally, partial loop unrolling may help speed up the function a bit if the JIT is not able to do that itself.
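A sketch of that reversed comparison order (same signature as the original IsPrefix):
private bool IsPrefix(string prefix, string stringToSearch)
{
    if (stringToSearch.Length < prefix.Length)
    {
        return false;
    }
    // Walk the prefix from the end: when many names share the same first
    // characters, a mismatch tends to be found sooner this way.
    for (int i = prefix.Length - 1; i >= 0; i--)
    {
        if (prefix[i] != stringToSearch[i])
        {
            return false;
        }
    }
    return true;
}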
Others have already pointed out that the use of regex can be problematic. I would personally consider using str.Replace(" ", string.Empty) - if I understood the regex correctly; I normally try to avoid regex as I have a hard time reading code that uses it. Note that string.Empty does not allocate a new string.
That said, I think performance could improve if you did not just store the names in a List but kept that list sorted alphabetically. Then you would not need to iterate over all elements of the list; you could, for example, use binary search to find the elements matching a given prefix as a range within the list of names you already have.
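A sketch of that idea (hypothetical method name; it assumes listOfNames is kept sorted with StringComparer.Ordinal whenever names are added):
// With an ordinally sorted list, every name sharing a prefix is contiguous,
// so one binary search plus a short forward scan replaces the full iteration.
public List<string> SearchSorted(string prefix)
{
    var results = new List<string>();
    int index = listOfNames.BinarySearch(prefix, StringComparer.Ordinal);
    if (index < 0)
        index = ~index;   // first element >= prefix when prefix itself is absent
    // (exact duplicates of the prefix itself would need a short backward scan too)
    for (int i = index; i < listOfNames.Count; i++)
    {
        if (!listOfNames[i].StartsWith(prefix, StringComparison.Ordinal))
            break;
        results.Add(listOfNames[i]);
    }
    return results;
}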
string c = tmpArr[0].Aggregate(string.Empty, (current, m) => current + (m.Name + " "));
StringBuilder sb = new StringBuilder();
foreach (Mobile m in tmpArr[0])
sb.Append(m.Name + " ");
string c = sb.ToString();
Which of those two is faster? Aggregate certainly is cleaner, but is it fast, or is it the same as doing
foreach(Mobile m in tmpArr[0])
c += m.Name + " ";
What I really would like to do is something like string.Join(",", tmpArr[0]), but I don't want it to concatenate their ToString values, just their Names. How would I best do that?
My problem with not using string.Join is that I would actually have to do something like this:
string separator = "";
StringBuilder sb = new StringBuilder();
foreach (Mobile m in tmpArr[0])
{
sb.Append(separator).Append(m.Name);
separator = ", ";
}
If you append strings in a loop (c += m.Name + " ";) you are causing lots of intermediate strings to be created; this causes "telescopic" memory usage, and puts extra load on GC. Aggregate, mixed with the fluent-API of StringBuilder can help here - but as would looping with StringBuilder. It isn't Aggregate that is important: it is not creating lots of intermediate strings.
For example, I would use:
foreach (Mobile m in tmpArr[0])
sb.Append(m.Name).Append(" ");
even fewer intermediate strings ;p
And for a similar example using StringBuilder in Aggregate:
string c = tmpArr[0].Aggregate(new StringBuilder(),
(current, m) => current.Append(m.Name).Append(" ")).ToString();
I don't want it to concat their ToString values, just their Names, how would I do that best?
string.Join(",",tmpArr[0].Select(t => t.Name).ToArray())
But most of the time It. Just. Doesn't. Matter!
As string is immutable, every concatenation has a performance cost. This is what StringBuilder is mainly designed for: it acts like a "mutable" string. I haven't done much benchmarking of the speed, but for memory usage StringBuilder is definitely better.
Aggregate runs an anonymous method against each item in the IEnumerable. The method is passed in as a System-defined Func<> delegate, and its return value becomes the accumulator for the next call.
It's basically like running a function that does the appending that many times.
So the allocation/deallocation on the stack for all those method calls certainly has more overhead than running a simple for/foreach loop.
So, in my opinion, the second method would be faster.
Aggregate itself is not the problem. The problem is that you are concatenating strings in a loop. When you concatenate two strings with the + operator, new memory must be allocated and the two strings copied into it. So if you use + five times, you actually create five new strings. That's why you should use StringBuilder or Join, which avoid this.
If you want to use Join along with LINQ for better readability, you still can; just don't use Aggregate, but something like Select and ToArray.
Something like this?
string joined = string.Join(",", myItems.Select(x => x.Name).ToArray());
Possible Duplicates:
c# array vs generic list
Array versus List<T>: When to use which?
I understand that there are several benefits of using List<>. However, I was wondering what benefits might still exist for using arrays.
Thanks.
You'll have a simple static structure to hold items, without the overhead associated with a List (dynamic sizing, insertion logic, etc.).
In most cases though, those benefits are outweighed by the flexibility and adaptability of a List.
One thing arrays have over lists is covariance.
class Person { /* ... */}
class Employee : Person {/* ... */}
void DoStuff(List<Person> people) {/* ... */}
void DoStuff(Person[] people) {/* ... */}
void Blarg()
{
List<Employee> employeeList = new List<Employee>();
// ...
DoStuff(employeeList); // this does not compile
int employeeCount = 10;
Employee[] employeeArray = new Employee[employeeCount];
// ...
DoStuff(employeeArray); // this compiles
}
An array is simpler than a List, so there is less overhead. If you only need the capabilities of an array, there is no reason to use a List instead.
The array is the simplest form of collection, and most other collections use one in some form. A List actually uses an array internally to hold its items.
Whenever a language construct needs a lightweight throw-away collection, it uses an array. For example, this code:
string s = "Age = " + age.ToString() + ", sex = " + sex + ", location = " + location;
actually becomes this code behind the scene:
string s = String.Concat(new string[] {
"Age = ",
age.ToString(),
", sex = ",
sex,
", location = ",
location
});
I would only use an array if your collection is immutable.
Edit: immutable in the sense that your collection will not grow or shrink.
Speed would be the main benefit, unless you start writing your own code to insert/remove items etc.
A List uses an array internally and just manages all of the bookkeeping a list needs for you.
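A small illustration of that point (the exact growth numbers are an implementation detail of current .NET versions, so treat them as indicative only):
using System;
using System.Collections.Generic;

class CapacityDemo
{
    static void Main()
    {
        var list = new List<int>();
        Console.WriteLine("Count=" + list.Count + ", Capacity=" + list.Capacity); // 0, 0
        for (int i = 0; i < 5; i++)
        {
            list.Add(i);
        }
        // The backing array grows in steps (typically 4, then 8, ...),
        // so Capacity ends up larger than Count after the fifth Add.
        Console.WriteLine("Count=" + list.Count + ", Capacity=" + list.Capacity); // 5, 8
    }
}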
According to the requirement, we have to return a collection either in reverse order or as it is. We, as beginning-level programmers, designed the collection as follows (a sample is given):
using System.Collections.Generic;
namespace Linqfying
{
class linqy
{
static void Main()
{
InvestigationReport rpt=new InvestigationReport();
// rpt.GetDocuments(true) refers
// to return the collection in reverse order
foreach( EnquiryDocument doc in rpt.GetDocuments(true) )
{
// printing document title and author name
}
}
}
class EnquiryDocument
{
string _docTitle;
string _docAuthor;
// properties to get and set doc title and author name goes below
public EnquiryDocument(string title,string author)
{
_docAuthor = author;
_docTitle = title;
}
public EnquiryDocument(){}
}
class InvestigationReport
{
EnquiryDocument[] docs=new EnquiryDocument[3];
public IEnumerable<EnquiryDocument> GetDocuments(bool IsReverseOrder)
{
/* some business logic to retrieve the document
docs[0]=new EnquiryDocument("FundAbuse","Margon");
docs[1]=new EnquiryDocument("Sexual Harassment","Philliphe");
docs[2]=new EnquiryDocument("Missing Resource","Goel");
*/
//if reverse order is preferred
if(IsReverseOrder)
{
for (int i = docs.Length; i != 0; i--)
yield return docs[i-1];
}
else
{
foreach (EnquiryDocument doc in docs)
{
yield return doc;
}
}
}
}
}
Questions:
Can we use another collection type to improve efficiency?
Would mixing the collection with LINQ reduce the code? (We are not familiar with LINQ.)
Looks fine to me. Yes, you could use the Reverse extension method... but that won't be as efficient as what you've got.
How much do you care about the efficiency though? I'd go with the most readable solution (namely Reverse) until you know that efficiency is a problem. Unless the collection is large, it's unlikely to be an issue.
If you've got the "raw data" as an array, then your use of an iterator block will be more efficient than calling Reverse. The Reverse method will buffer up all the data before yielding it one item at a time - just like your own code does, really. However, simply calling Reverse would be a lot simpler...
Aside from anything else, I'd say it's well worth you learning LINQ - at least LINQ to Objects. It can make processing data much, much cleaner than before.
Two questions:
Does the code you currently have work?
Have you identified this piece of code as being your performance bottleneck?
If the answer to either of those questions is no, don't worry about it. Just make it work and move on. There's nothing grossly wrong about the code, so no need to fret! Spend your time building new functionality instead. Save LINQ for a new problem you haven't already solved.
Actually, this task seems pretty straightforward: I'd just use the Reverse method on a generic List.
This should already be well-optimized.
Your GetDocuments method has a return type of IEnumerable, so there is no need to even loop over your array when IsReverseOrder is false; you can just return it as is, since the Array type is IEnumerable...
As for when IsReverseOrder is true, you can use either Array.Reverse or the LINQ Reverse() extension method to reduce the amount of code.
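A minimal sketch of that simplified GetDocuments, using the LINQ Reverse() extension (it needs using System.Linq; and assumes the docs array has already been populated by the business logic):
public IEnumerable<EnquiryDocument> GetDocuments(bool isReverseOrder)
{
    if (isReverseOrder)
        return docs.Reverse();   // lazily yields the items back to front
    return docs;                 // an array is already IEnumerable<EnquiryDocument>
}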
I'm looking for the shortest code to create methods to perform common operations on items in an IEnumerable.
For example:
public interface IPupil
{
string Name { get; set; }
int Age { get; set; }
}
Summing a property - e.g. IPupil.Age in IEnumerable<IPupil>
Averaging a property - e.g. IPupil.Age in IEnumerable<IPupil>
Building a CSV string - e.g. IPupil.Name in IEnumerable<IPupil>
I'm interested in the various approaches to solve these examples: foreach (long hand), delegates, LINQ, anonymous methods, etc...
Sorry for the poor wording, I'm having trouble describing exactly what I'm after!
Summing and averaging: easy with LINQ:
var sum = pupils.Sum(pupil => pupil.Age);
var average = pupils.Average(pupil => pupil.Age);
Building a CSV string - there are various options here, including writing your own extension methods. This will work though:
var csv = string.Join(",", pupils.Select(pupil => pupil.Name).ToArray());
Note that it's tricky to compute multiple things (e.g. average and sum) in one pass over the data with normal LINQ. If you're interested in that, have a look at the Push LINQ project which Marc Gravell and I have written. It's a pretty specialized requirement though.
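For comparison, here is a single-pass "long hand" foreach that produces all three results at once (a sketch; IPupil and a pupils sequence are assumed from the question, and StringBuilder needs using System.Text;):
int sum = 0;
int count = 0;
var csvBuilder = new StringBuilder();
foreach (IPupil pupil in pupils)
{
    sum += pupil.Age;
    count++;
    if (csvBuilder.Length > 0)
    {
        csvBuilder.Append(',');
    }
    csvBuilder.Append(pupil.Name);
}
double average = count == 0 ? 0 : (double)sum / count;
string csv = csvBuilder.ToString();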