Good way to concatenate string representations of objects? - c#

Ok,
We have a lot of where clauses in our code. We have just as many ways to generate a string to represent the in condition. I am trying to come up with a clean way as follows:
public static string Join<T>(this IEnumerable<T> items, string separator)
{
var strings = from item in items select item.ToString();
return string.Join(separator, strings.ToArray());
}
it can be used as follows:
var values = new []{1, 2, 3, 4, 5, 6};
values.Join(",");
// result should be:
// "1,2,3,4,5,6"
So this is a nice extension method that does a very basic job. I know that simple code does not always turn into fast or efficient execution, but I am just curious as to what I could have missed with this simple code. Other members of our team are arguing that:
it is not flexible enough (no control of the string representation)
may not be memory efficient
may not be fast
Any expert to chime in?
Regards,
Eric.

Regarding the first issue, you could add another 'formatter' parameter to control the conversion of each item into a string:
public static string Join<T>(this IEnumerable<T> items, string separator)
{
return items.Join(separator, i => i.ToString());
}
public static string Join<T>(this IEnumerable<T> items, string separator, Func<T, string> formatter)
{
return String.Join(separator, items.Select(i => formatter(i)).ToArray());
}
Regarding the second two issues, I wouldn't worry about it unless you later run into performance issues and find it to be a problem. It's unlikely to be much of a bottleneck, however.
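To make the formatter idea concrete, here is a minimal sketch of how such an overload might be used to build an IN list. The names JoinWith and ToInList are hypothetical (renamed so they do not clash with the methods above), and the invariant-culture formatting is an assumption about what the IN clause needs, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class JoinSketch
{
    // Same shape as the formatter overload above; the name is hypothetical.
    public static string JoinWith<T>(this IEnumerable<T> items, string separator, Func<T, string> formatter)
    {
        return string.Join(separator, items.Select(formatter).ToArray());
    }

    // Hypothetical helper: renders numbers independently of the current culture,
    // so a decimal comma never ends up inside the generated list.
    public static string ToInList(IEnumerable<decimal> ids)
    {
        return ids.JoinWith(",", d => d.ToString(CultureInfo.InvariantCulture));
    }
}
```

The caller decides the string representation per call, for example quoting each string with s => "'" + s + "'", which addresses the "not flexible enough" objection without touching the simple overload.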

For some reason, I thought that String.Join is implemented in terms of a StringBuilder class. But if it isn't, then the following is likely to perform better for large inputs since it doesn't recreate a String object for each join in the iteration.
public static string Join<T>(this IEnumerable<T> items, string separator)
{
// TODO: check for null arguments.
StringBuilder builder = new StringBuilder();
foreach(T t in items)
{
builder.Append(t.ToString()).Append(separator);
}
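// Only trim the trailing separator if something was appended;
// an empty sequence would otherwise make Length negative and throw.
if (builder.Length > 0) builder.Length -= separator.Length;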
return builder.ToString();
}
EDIT: Here is an analysis of when it is appropriate to use StringBuilder and String.Join.

Why don't you use StringBuilder and iterate through the collection yourself, appending as you go?
Otherwise you are creating an array of strings (var strings) and then doing the Join.

You are missing null checks for the sequence and the items of the sequence. And yes, it is not the fastest and most memory efficient way. One would probably just enumerate the sequence and render the string representations of the items into a StringBuilder. But does this really matter? Are you experiencing performance problems? Do you need to optimize?
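A minimal sketch of what those null checks could look like, assuming a null sequence should throw and a null item should simply contribute an empty string. Both choices are assumptions, as is the name JoinSafe.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class NullSafeJoinSketch
{
    // Hypothetical name; joins the string representations of the items.
    public static string JoinSafe<T>(this IEnumerable<T> items, string separator)
    {
        if (items == null) throw new ArgumentNullException("items");

        var builder = new StringBuilder();
        bool first = true;
        foreach (T item in items)
        {
            if (!first) builder.Append(separator); // a null separator appends nothing
            first = false;

            // A null item is skipped instead of causing a NullReferenceException.
            if (item != null) builder.Append(item.ToString());
        }
        return builder.ToString();
    }
}
```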

this would work also:
public static string Test<T>(IEnumerable<T> items, string separator)
{
    var builder = new StringBuilder();
    bool appendSeparator = false;
    if (null != items)
    {
        foreach (var item in items)
        {
            // Prepend the separator to every item except the first.
            if (appendSeparator)
            {
                builder.Append(separator);
            }
            builder.Append(item.ToString());
            appendSeparator = true;
        }
    }
    return builder.ToString();
}

Related

best way to convert collection to string

I need to convert a collection of <string,string> to a single string containing all the values in the collection like KeyValueKeyValue... But how do I do this effectively?
I have done it this way at the moment:
parameters = string.Join("", requestParameters.Select(x => string.Concat(x.Key, x.Value)));
But I'm not sure it is the best way to do it. Would a StringBuilder be better? I guess the collection will contain a max of 10 pairs.
string.Join used to not really be the best option since it only accepted string[] or object[] parameters, requiring that any select-style queries needed to be completely evaluated and put into an array first.
.NET 4.0 brought with it an overload that accepts IEnumerable<string> -- which is what you are using -- and even an overload that accepts any IEnumerable<T>. These are definitely your best bet as they are now part of the BCL.
Incidentally, cracking open the source for the first overload in Reflector shows code that follows pretty closely to what davisoa suggested:
public static string Join(string separator, IEnumerable<string> values)
{
if (values == null)
{
throw new ArgumentNullException("values");
}
if (separator == null)
{
separator = Empty;
}
using (IEnumerator<string> enumerator = values.GetEnumerator())
{
if (!enumerator.MoveNext())
{
return Empty;
}
StringBuilder builder = new StringBuilder();
if (enumerator.Current != null)
{
builder.Append(enumerator.Current);
}
while (enumerator.MoveNext())
{
builder.Append(separator);
if (enumerator.Current != null)
{
builder.Append(enumerator.Current);
}
}
return builder.ToString();
}
}
So in other words, if you were to change this code to use a StringBuilder, you'd just be rewriting what MS already wrote for you.
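For reference, the two .NET 4.0 overloads mentioned above can be called directly, with no ToArray() in between. This is only a sketch; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInJoinSketch
{
    // Binds to string.Join<T>(string, IEnumerable<T>), which calls ToString on each element.
    public static string JoinNumbers(IEnumerable<int> values)
    {
        return string.Join(",", values);
    }

    // Binds to string.Join(string, IEnumerable<string>); the Select is consumed lazily.
    public static string Flatten(IEnumerable<KeyValuePair<string, string>> pairs)
    {
        return string.Join("", pairs.Select(p => p.Key + p.Value));
    }
}
```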
With such a small collection, there isn't much of a performance concern, but I would probably just use a StringBuilder to Append all of the values.
Like this:
var sb = new StringBuilder();
foreach (var item in requestParameters)
{
sb.AppendFormat("{0}{1}", item.Key, item.Value);
}
var parameters = sb.ToString();
A StringBuilder would be fine. Use Append to add each string to the builder.
Basically, the only reason Concat, Replace, Join, string + string, etc. are considered less than ideal is that strings are immutable, so each such call allocates a new string rather than modifying the existing one.
So when you are adding strings up to 10-12 times, it really means you will create a new string that many times.

StringBuilder related question

I have written a program for a stack. (https://stackoverflow.com/questions/2617367?tab=votes#tab-top)
For this I needed a StringBuilder to be able to show what was in the stack; otherwise I would get the class name instead of the actual values inside.
My question: is there any other way besides a StringBuilder to solve this kind of problem?
Also, in what other kinds of cases does this kind of problem happen?
Also, the way I have written the StringBuilder code felt very awkward when I needed several things on one line.
public override string ToString()
{
StringBuilder builder = new StringBuilder();
foreach (int value in tabel)
{
builder.Append(value);
builder.Append(" ");
}
if (tabel.Length == tabel.Length) // this is a bit messy, since I couldn't append after the rest above
{
builder.Append("(top:");
builder.Append(top);
builder.Append(")");
}
return builder.ToString();
}/*ToString*/
You could use Array.ConvertAll and String.Join instead of iterating the list yourself.
Also, when you talk about multiple things on one line... you don't have any linebreaks anywhere.
Or, if you keep using StringBuilder, the Append method returns the StringBuilder so you can chain calls together:
sb.Append("(top: ").Append(top).Append(")").AppendLine();
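A minimal sketch of the Array.ConvertAll plus String.Join suggestion above, applied to a ToString override. The class IntStackSketch and its field values are hypothetical stand-ins for the question's stack, which is not shown in full.

```csharp
using System;

public class IntStackSketch
{
    // Hypothetical stand-ins for the question's 'tabel' array and 'top' index.
    private readonly int[] tabel = { 10, 20, 30 };
    private readonly int top = 2;

    public override string ToString()
    {
        // Convert every element to its string form, then join with spaces.
        string[] parts = Array.ConvertAll(tabel, value => value.ToString());
        return string.Join(" ", parts) + " (top:" + top + ")";
    }
}
```

With the values assumed here, ToString returns "10 20 30 (top:2)", and no trailing space needs to be trimmed because Join only places the separator between elements.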
You could use an extension method like this to summarize enumerable collections
/// <summary>
/// A better ToString for Enumerable objects (mostly for logging)
/// </summary>
public static string ToStringList(this IEnumerable<string> collection, int limit)
{
return string.Join(", ", collection.Take(limit));
}
Usage
string result = tabel.Select(x => x.ToString()).ToStringList(50);
PS If you are using .NET prior to version 4 you might need a .ToArray() in there to satisfy string.Join()
Or, better yet, using the overload: string Join<T>(string separator, IEnumerable<T> values); you can simplify to:-
/// <summary>
/// A better ToString for Enumerable objects (mostly for logging)
/// </summary>
public static string ToStringList<T>(this IEnumerable<T> collection, int limit)
{
return string.Join(", ", collection.Take(limit));
}
Usage
string result = tabel.ToStringList(50);
This is the correct use of a StringBuilder (although your code looks buggy).
Note you can use AppendLine if you want a line break instead of using spaces.
You can also use AppendFormat, which is the equivalent of string.Format, e.g.
builder.AppendFormat("(top:{0})", value);
ToString() overrides like this for a collection class rarely work out well in practice. They don't behave well when you've got thousands of elements in the collection. A decent visualization is to display the top element and the number of elements. For example:
public override string ToString() {
if (this.Count == 0) return "Empty";
else return string.Format("Top:{0}, Count:{1}", top, Count);
}

Can all 'for' loops be replaced with a LINQ statement?

Is it possible to write the following 'foreach' as a LINQ statement? And, I guess, the more general question: can any for loop be replaced by a LINQ statement?
I'm not interested in any potential performance cost just the potential of using declarative approaches in what is traditionally imperative code.
private static string SomeMethod()
{
if (ListOfResources.Count == 0)
return string.Empty;
var sb = new StringBuilder();
foreach (var resource in ListOfResources)
{
if (sb.Length != 0)
sb.Append(", ");
sb.Append(resource.Id);
}
return sb.ToString();
}
Cheers
AWC
Sure. Heck, you can replace arithmetic with LINQ queries:
http://blogs.msdn.com/ericlippert/archive/2009/12/07/query-transformations-are-syntactic.aspx
But you shouldn't.
The purpose of a query expression is to represent a query operation. The purpose of a "for" loop is to iterate over a particular statement so as to have its side-effects executed multiple times. Those are frequently very different. I encourage replacing loops whose purpose is merely to query data with higher-level constructs that more clearly query the data. I strongly discourage replacing side-effect-generating code with query comprehensions, though doing so is possible.
In general yes, but there are specific cases that are extremely difficult. For instance, the following code in the general case does not port to a LINQ expression without a good deal of hacking.
var list = new List<Func<int>>();
foreach ( var cur in (new int[] {1,2,3})) {
list.Add(() => cur);
}
The reason why is that with a for loop, it's possible to see the side effects of how the iteration variable is captured in a closure. LINQ expressions hide the lifetime semantics of the iteration variable and prevent you from seeing side effects of capturing its value.
Note. The above code is not equivalent to the following LINQ expression.
var list = Enumerable.Range(1,3).Select(x => () => x).ToList();
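The foreach sample produces a list of Func<int> objects which all return 3 (this holds for C# 4 and earlier, where foreach reused a single iteration variable; since C# 5 each iteration gets a fresh variable and the closures return 1, 2 and 3). The LINQ version produces a list of Func<int> which return 1, 2 and 3 respectively. This is what makes this style of capture difficult to port.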
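The shared-variable capture described above can be reproduced with a plain for loop, whose loop variable is declared once for the whole loop. A minimal sketch (class and method names are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureSketch
{
    // Every lambda captures the same variable i, so all of them
    // observe its final value once the loop has finished.
    public static List<int> CaptureLoopVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 1; i <= 3; i++)
        {
            funcs.Add(() => i);
        }
        return funcs.Select(f => f()).ToList();
    }

    // Copying into a variable declared inside the body gives each
    // lambda its own storage, which is what the Select version gets for free.
    public static List<int> CaptureCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 1; i <= 3; i++)
        {
            int copy = i;
            funcs.Add(() => copy);
        }
        return funcs.Select(f => f()).ToList();
    }
}
```

CaptureLoopVariable yields 4, 4, 4 (the value of i after the loop exits), while CaptureCopy yields 1, 2, 3.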
In fact, your code does something which is fundamentally very functional, namely it reduces a list of strings to a single string by concatenating the list items. The only imperative thing about the code is the use of a StringBuilder.
The functional code makes this much easier, actually, because it doesn’t require a special case like your code does. Better still, .NET already has this particular operation implemented, and probably more efficiently than your code (see note 1 below):
return String.Join(", ", ListOfResources.Select(s => s.Id.ToString()).ToArray());
(Yes, the call to ToArray() is annoying but Join is a very old method and predates LINQ.)
Of course, a “better” version of Join could be used like this:
return ListOfResources.Select(s => s.Id).Join(", ");
The implementation is rather straightforward – but once again, using the StringBuilder (for performance) makes it imperative.
public static String Join<T>(this IEnumerable<T> items, String delimiter) {
if (items == null)
throw new ArgumentNullException("items");
if (delimiter == null)
throw new ArgumentNullException("delimiter");
var strings = items.Select(item => item.ToString()).ToList();
if (strings.Count == 0)
return string.Empty;
int length = strings.Sum(str => str.Length) +
delimiter.Length * (strings.Count - 1);
var result = new StringBuilder(length);
bool first = true;
foreach (string str in strings) {
if (first)
first = false;
else
result.Append(delimiter);
result.Append(str);
}
return result.ToString();
}
1) Without having looked at the implementation in the reflector, I’d guess that String.Join makes a first pass over the strings to determine the overall length. This can be used to initialize the StringBuilder accordingly, thus saving expensive copy operations later on.
EDIT by SLaks: Here is the reference source for the relevant part of String.Join from .Net 3.5:
string jointString = FastAllocateString( jointLength );
fixed (char * pointerToJointString = &jointString.m_firstChar) {
UnSafeCharBuffer charBuffer = new UnSafeCharBuffer( pointerToJointString, jointLength);
// Append the first string first and then append each following string prefixed by the separator.
charBuffer.AppendString( value[startIndex] );
for (int stringToJoinIndex = startIndex + 1; stringToJoinIndex <= endIndex; stringToJoinIndex++) {
charBuffer.AppendString( separator );
charBuffer.AppendString( value[stringToJoinIndex] );
}
BCLDebug.Assert(*(pointerToJointString + charBuffer.Length) == '\0', "String must be null-terminated!");
}
The specific loop in your question can be done declaratively like this:
var result = ListOfResources
.Select<Resource, string>(r => r.Id.ToString())
.Aggregate<string, StringBuilder>(new StringBuilder(), (sb, s) => sb.Append(sb.Length > 0 ? ", " : String.Empty).Append(s))
.ToString();
As to performance, you can expect a performance drop but this is acceptable for most applications.
I think what's most important here is that to avoid semantic confusion, your code should only be superficially functional when it is actually functional. In other words, please don't use side effects in LINQ expressions.
Technically, yes.
Any foreach loop can be converted to LINQ by using a ForEach extension method, such as the one in MoreLinq.
If you only want to use "pure" LINQ (only the built-in extension methods), you can abuse the Aggregate extension method, like this:
foreach (type item in collection) { statements }
// can be rewritten as:
type item;
collection.Aggregate(true, (j, itemTemp) => {
    item = itemTemp;
    statements
    return true;
});
This will correctly handle any foreach loop, even JaredPar's answer. EDIT: Unless it uses ref / out parameters, unsafe code, or yield return.
Don't you dare use this trick in real code.
In your specific case, you should use a string Join extension method, such as this one:
///<summary>Appends a list of strings to a StringBuilder, separated by a separator string.</summary>
///<param name="builder">The StringBuilder to append to.</param>
///<param name="strings">The strings to append.</param>
///<param name="separator">A string to append between the strings.</param>
public static StringBuilder AppendJoin(this StringBuilder builder, IEnumerable<string> strings, string separator) {
if (builder == null) throw new ArgumentNullException("builder");
if (strings == null) throw new ArgumentNullException("strings");
if (separator == null) throw new ArgumentNullException("separator");
bool first = true;
foreach (var str in strings) {
if (first)
first = false;
else
builder.Append(separator);
builder.Append(str);
}
return builder;
}
///<summary>Combines a collection of strings into a single string.</summary>
public static string Join<T>(this IEnumerable<T> strings, string separator, Func<T, string> selector) { return strings.Select(selector).Join(separator); }
///<summary>Combines a collection of strings into a single string.</summary>
public static string Join(this IEnumerable<string> strings, string separator) { return new StringBuilder().AppendJoin(strings, separator).ToString(); }
In general, you can write a lambda expression using a delegate which represents the body of a foreach cycle, in your case something like :
resource => { if (sb.Length != 0) sb.Append(", "); sb.Append(resource.Id); }
and then simply use within a ForEach extension method. Whether this is a good idea depends on the complexity of the body, in case it's too big and complex you probably don't gain anything from it except for possible confusion ;)

Can I rewrite this more elegantly using LINQ?

I have a double[][] that I want to convert to a CSV string format (i.e. each row in a line, and row elements separated by commas). I wrote it like this:
public static string ToCSV(double[][] array)
{
return String.Join(Environment.NewLine,
Array.ConvertAll(array,
row => String.Join(",",
Array.ConvertAll(row, x => x.ToString()))));
}
Is there a more elegant way to write this using LINQ?
(I know, one could use temporary variables to make this look better, but this code format better conveys what I am looking for.)
You can, but I wouldn't personally do all the lines at once - I'd use an iterator block:
public static IEnumerable<string> ToCSV(IEnumerable<double[]> source)
{
return source.Select(row => string.Join(",",
Array.ConvertAll(row, x=>x.ToString())));
}
This returns each line (the caller can then WriteLine etc efficiently, without buffering everything). It is also now callable from any source of double[] rows (including but not limited to a jagged array).
Also - with a local variable you could use StringBuilder to make each line slightly cheaper.
To return the entire string at once, I'd optimize it to use a single StringBuilder for all the string work; a bit more long-winded, but much more efficient (far fewer intermediate strings):
public static string ToCSV(IEnumerable<double[]> source) {
StringBuilder sb = new StringBuilder();
foreach(var row in source) {
if (row.Length > 0) {
sb.Append(row[0]);
for (int i = 1; i < row.Length; i++) {
sb.Append(',').Append(row[i]);
}
}
sb.AppendLine(); // end the row so that rows land on separate lines
}
return sb.ToString();
}
You could also use Aggregate
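public static string ToCSV(double[][] array)
{
    // Unseeded Aggregate starts from the first element, so no leading comma
    // or newline is produced; DefaultIfEmpty covers empty rows and empty input.
    return array
        .Select(row => row.Select(dbl => dbl.ToString())
            .DefaultIfEmpty(string.Empty)
            .Aggregate((str, next) => str + "," + next))
        .DefaultIfEmpty(string.Empty)
        .Aggregate((multiLineStr, line) => multiLineStr + Environment.NewLine + line);
}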
This is compatible with any nested sequences of double. It also defers the ToString implementation to the caller, allowing formatting while avoiding messy IFormatProvider overloads:
public static string Join(this IEnumerable<string> source, string separator)
{
return String.Join(separator, source.ToArray());
}
public static string ToCsv<TRow>(this IEnumerable<TRow> rows, Func<double, string> valueToString)
where TRow : IEnumerable<double>
{
return rows
.Select(row => row.Select(valueToString).Join(", "))
.Join(Environment.NewLine);
}
You can do it with LINQ, but I'm not sure if you like this one better than yours. I'm afraid you don't. :)
var q = String.Join(Environment.NewLine, (from a in d
select String.Join(", ", (from b in a
select b.ToString()).ToArray())).ToArray());
Cheers,
Matthias

Fastest way to do a contains with string[]

I am getting back a "string[]" from a 3rd party library. I want to do a contains on it. What is the most efficient way of doing this?
Array.IndexOf:
bool contains = Array.IndexOf(arr, value) >= 0;
Or just use LINQ:
bool contains = arr.Contains(value);
LINQ should be "fast enough" for most purposes.
If you are only checking a single time, use Array.IndexOf or the LINQ Contains method like Marc proposed. If you are checking several times, it might be faster to first convert the string array into a HashSet<string>.
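A minimal sketch of the repeated-lookup case described above, assuming the array does not change between lookups. The class name LookupSketch and the choice of ordinal comparison are assumptions.

```csharp
using System;
using System.Collections.Generic;

public class LookupSketch
{
    private readonly HashSet<string> set;

    public LookupSketch(string[] values)
    {
        // One pass over the array up front; ordinal comparison gives the
        // same equality that == uses for strings.
        set = new HashSet<string>(values, StringComparer.Ordinal);
    }

    // Each lookup is a hash probe instead of a scan over the whole array.
    public bool Contains(string value)
    {
        return set.Contains(value);
    }
}
```

Building the set costs about as much as one linear search, so this only pays off when the same array is queried more than a handful of times.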
Unless you know the string array is sorted in a particular order, the most efficient thing you can do is a linear search (i.e. compare each string in the array until you find a match or reach the end of the array).
If the array is sorted a binary search is much faster.
Another way to optimize the algorithm (although the complexity is not reduced) is to vectorize the string comparisons.
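The binary search mentioned above might look like the following sketch. It assumes the array is already sorted with the same comparer that is passed to the search (ordinal here); if the sort order differs, the result is unreliable. The names are hypothetical.

```csharp
using System;

public static class SortedLookupSketch
{
    // Assumes 'sorted' is ordered according to StringComparer.Ordinal.
    public static bool ContainsSorted(string[] sorted, string value)
    {
        // Array.BinarySearch returns a negative number when the value is absent.
        return Array.BinarySearch(sorted, value, StringComparer.Ordinal) >= 0;
    }
}
```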
I'm fairly certain that a for loop is faster, if absolute speed is your concern. I.e.,
for (int i = 0; i < arr.Length; ++i)
if (arr[i] == value) return true;
return false;
If you're searching once or twice, use a linear search or IndexOf.
If you're searching a few times, put the strings into a HashSet.
If you're searching zillions of times in a time-critical fashion, use a HashSet and manage its bucket count yourself.
You can use the IEnumerable.Foreach Custom Extension
public static class CollectionExtensions
{
public static void ForEach<T>(this IEnumerable list, Action<T> action)
{
foreach (T item in list)
{
action(item);
}
}
}
class Program
{
static void Main(string[] args)
{
String[] list = new String[] { "Word1", "Word2", "Word3" };
list.ForEach<String>(p => Console.WriteLine(p));
list.ForEach(delegate(String p) { Console.WriteLine(p); });
}
}
Hope this helps.
