How Bad is this Code?

How Bad is this Code? - c#

Ok, I am an amateur programmer and just wrote this. It gets the job done, but I am wondering how bad it is and what kind of improvements could be made.
[Please note that this is a Chalk extension for Graffiti CMS.]
public string PostsAsSlides(PostCollection posts, int PostsPerSlide)
{
StringBuilder sb = new StringBuilder();
decimal slides = Math.Round((decimal)posts.Count / (decimal)PostsPerSlide, 3);
int NumberOfSlides = Convert.ToInt32(Math.Ceiling(slides));
for (int i = 0; i < NumberOfSlides; i++ )
{
int PostCount = 0;
sb.Append("<div class=\"slide\">\n");
foreach (Post post in posts.Skip<Post>(i * PostsPerSlide))
{
PostCount += 1;
string CssClass = "slide-block";
if (PostCount == 1)
CssClass += " first";
else if (PostCount == PostsPerSlide)
CssClass += " last";
sb.Append(string.Format("<div class=\"{0}\">\n", CssClass));
sb.Append(string.Format("<img src=\"{2}\" alt=\"{3}\" />\n", post.Custom("Large Image"), post.MetaDescription, post.ImageUrl, post.Title));
sb.Append(string.Format("<a class=\"button-launch-website\" href=\"{0}\" target=\"_blank\">Launch Website</a>\n", post.Custom("Website Url")));
sb.Append("</div><!--.slide-block-->\n");
if (PostCount == PostsPerSlide)
break;
}
sb.Append("</div><!--.slide-->\n");
}
return sb.ToString();
}

Use an HtmlTextWriter instead of a StringBuilder to write HTML:
StringBuilder sb = new StringBuilder();
using(HtmlTextWriter writer = new HtmlTextWriter(new StringWriter(sb)))
{
writer.WriteBeginTag("div");
writer.WriteAttribute("class", "slide");
//...
}
return sb.ToString();
We don't want to use an unstructured writer to write structured data.
Break the body of your inner loop into a separate routine:
foreach(...)
{
WritePost(post, writer);
}
//snip
private void WritePost(Post post, HtmlTextWriter writer)
{
//write single post
}
This makes it testable and more easily modifiable.
Use a data structure for managing things like CSS classes.
Instead of appending extra class names with a space and hoping everything lines up right at the end, keep a List<string> of class names to add and remove as necessary, and then call:
List<string> cssClasses = new List<string>();
//snip
string.Join(" ", cssClasses.ToArray());

Instead of this,
foreach (Post post in posts.Skip<Post>(i * PostsPerSlide))
{
// ...
if (PostCount == PostsPerSlide)
break;
}
I'd move the exit condition into the loop like so: (and eliminate unnecessary generic parameter while you're at it)
foreach (Post post in posts.Skip(i * PostsPerSlide).Take(PostsPerSlide))
{
// ...
}
As a matter of fact, your two nested loops can probably be handled in one single loop.
Also, I'd prefer to use single quotes for the html attributes, since those are legal and don't require escaping.

It's not great but I've seen much much worse.
Assuming this is ASP.Net, is there a reason you're using this approach instead of a repeater?
EDIT:
If a repeater isn't possible in this instance I would recommend using the .Net HTML classes to generate the HTML instead of using text. It's a little easier to read/adjust later on.

I've seen much worse, but you could improve it quite a bit.
Instead of StringBuilder.Append() with a string.Format() inside, use StringBuilder.AppendFormat()
Add some unit tests around it to ensure that it's producing the output you want, then refactor the code inside to be better. The tests will ensure that you haven't broken anything.

My first thought is that this would be very difficult to unit test.
I would suggest having the second for loop be a separate function, so you would have something like:
for (int i = 0; i < NumberOfSlides; i++ )
{
int PostCount = 0;
sb.Append("<div class=\"slide\">\n");
LoopPosts(posts, i);
...
and LoopPosts:
private void LoopPosts(PostCollection posts, int i) {
...
}
If you get into a habit of doing loops like this then when you need to do a unit test it will be easier to test the inner loop separate from the outer loop.

I'd say that it looks good enough, there is no serious problems with the code, and it would work well. It doesn't mean that it can't be improved, though. :)
Here are a few tips:
For general floating point operations you should use double rather than Decimal. However, in this case you don't need any floating point arithmetic at all, you can do this with just integers:
int numberOfSlides = (posts.Count + PostsPerSlide - 1) / PostsPerSlide;
But, if you just use a single loop over all items instead of a loop in a loop, you don't need the slide count at all.
The convention for local variables is camel case (postCoint) rather than pascal case (PostCount). Try to make variable names meaningdful rather than obscure abbreviations like "sb". If the scope of a variable is so small that you don't really need a meaningful name for it, just go for the simplest possible and just a single letter instead of an abbreviation.
Instead of concatenating the strings "slide block" and " first" you can assign literal strings directly. That way you replace a call to String.Concat with a simple assignment. (This borders on premature optimisation, but on the other hand the string concatenation takes about 50 times longer.)
You can use .AppendFormat(...) instead of .Append(String.Format(...)) on a StringBuilder. I would just stick to Append in this case, as there isn't really anything that needs formatting, you are just concatenating strings.
So, I would write the method like this:
public string PostsAsSlides(PostCollection posts, int postsPerSlide) {
StringBuilder builder = new StringBuilder();
int postCount = 0;
foreach (Post post in posts) {
postCount++;
string cssClass;
if (postCount == 1) {
builder.Append("<div class=\"slide\">\n");
cssClass = "slide-block first";
} else if (postCount == postsPerSlide) {
cssClass = "slide-block last";
postCount = 0;
} else {
cssClass = "slide-block";
}
builder.Append("<div class=\"").Append(cssClass).Append("\">\n")
.Append("<a href=\"").Append(post.Custom("Large Image"))
.Append("\" rel=\"prettyPhoto[gallery]\" title=\"")
.Append(post.MetaDescription).Append("\"><img src=\"")
.Append(post.ImageUrl).Append("\" alt=\"").Append(post.Title)
.Append("\" /></a>\n")
.Append("<a class=\"button-launch-website\" href=\"")
.Append(post.Custom("Website Url"))
.Append("\" target=\"_blank\">Launch Website</a>\n")
.Append("</div><!--.slide-block-->\n");
if (postCount == 0) {
builder.Append("</div><!--.slide-->\n");
}
}
return builder.ToString();
}

It would help if the definition for posts existed, but in general, I would say that generating HTML in code behind is not the way to go in Asp.Net.
Since this is tagged as .Net specifically...
For displaying a list of items in a collection, you'd be better off looking at one of the data controls (Repeater, Data List, etc) and learning how to use those properly.
http://quickstarts.asp.net/QuickStartv20/aspnet/doc/ctrlref/data/default.aspx

Another piece of tightening you could do: replace the sb.Append(string.Format(...)) calls with sb.AppendFormat(...).

Related

Is there any way to make Search and addToSearch faster?

Is there any way to make Search and addToSearch faster?
I am trying to make it faster. I am not sure if regex in addtosearch can be a problem, it is really small. I am out ofideas how to optimize it further. Now i am just trying to meet word count. I wonder if there is a way to concatenate parts of name that are not empty more effectivly than i do.
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System;
namespace AutoComplete
{
public struct FullName
{
public string Name;
public string Surname;
public string Patronymic;
}
public class AutoCompleter
{
private List<string> listOfNames = new List<string>();
private static readonly Regex sWhitespace = new Regex(#"\s+");
public void AddToSearch(List<FullName> fullNames)
{
foreach (FullName i in fullNames)
{
string nameToAdd = "";
if (!string.IsNullOrWhiteSpace(i.Surname))
{
nameToAdd += sWhitespace.Replace(i.Surname, "") + " ";
}
if (!string.IsNullOrWhiteSpace(i.Name))
{
nameToAdd += sWhitespace.Replace(i.Name, "") + " ";
}
if (!string.IsNullOrWhiteSpace(i.Patronymic))
{
nameToAdd += sWhitespace.Replace(i.Patronymic, "") + " ";
}
listOfNames.Add(nameToAdd.Substring(0, nameToAdd.Length - 1));
}
}
public List<string> Search(string prefix)
{
if (prefix.Length > 100 || string.IsNullOrWhiteSpace(prefix))
{
throw new System.Exception();
}
List<string> namesWithPrefix = new List<string>();
foreach (string name in listOfNames)
{
if (IsPrefix(prefix, name))
{
namesWithPrefix.Add(name);
}
}
return namesWithPrefix;
}
private bool IsPrefix(string prefix, string stringToSearch)
{
if (stringToSearch.Length < prefix.Length)
{
return false;
}
for (int i = 0; i < prefix.Length; i++)
{
if (prefix[i] != stringToSearch[i])
{
return false
}
}
return true
}
}
}

Regular expression (Regexp) are great because of their ease-of use and flexibility but most Regexp engines are actually quite slow. This is the case for the one of C#. Moreover, strings can contain Unicode character and "\s" needs to consider all the (fancy) spaces characters included in the Unicode character set. This make Regexp search/replace much slower. If you know your input does not contain such characters (eg. ASCII), then you can write a much faster implementation. Alternatively, you can play with RegexpOptions like Compiled and CultureInvariant so to reduce a bit the run time.
The AddToSearch performs many hidden allocations. Indeed, += create a new string (because C# string are immutable and not designed to be often resized) and Replace calls does allocate new strings too. You can speed up the computation by directly replace and write the result in a preallocated buffer and simply copy the result with a Substring like you currently do.
Search is fine and it is not easy to optimize it. That being said, if listOfNames is big, then you can use multiple threads so to significantly speed up the computation. Be careful though because Add is not thread-safe. Parallel linkq may help you to do that easily (I never tested it though).
Another solution to speed up a bit the computation of Search is to start the loop of IsPrefix from prefix.Length-1. Indeed, if most string contains the beginning of the prefix, then a significant portion of the time will be spend comparing nearly equal characters. The probability that prefix[prefix.Length-1] != stringToSearch[prefix.Length-1] is higher than prefix[0] != stringToSearch[0]. Additionally, partial loop unrolling may help a bit to speed up the function if the JIT is not able to do that.

Others have already pointed out that the use of regex can be problematic. I would personally consider using str.Replace(" ", String.Empty) - if I understood the regex correctly; I normally try to avoid regex as I have a hard time reading code using regex. Note that String.Empty does not allocate a new string.
That said, I think performance could boost if you would not store the names in a List but at least order the list alpabetically. Thus you do not need to iterate all elemnts of the list but e.g. use binary search to find all elements matching a given prefix - as range within the list of names you already have.

Removing punctuation from an extremely long string

I'm working on a book encryption program for one of my courses and I've run into a problem. Our professor gave us the example of using say Pride and Prejudice as the book used to encrypt, so I chose that one to test my program. The current function I'm using to remove the punctuation from the string is taking so long that the program is being forced into break mode. This function works for smaller strings even pages long, but when I fed it Pride and Prejudice it takes way to long.
public void removePunctuation(ref string s) {
string result = "";
for (int i = 0; i < s.Length; i++) {
if (Char.IsWhiteSpace(s[i])) {
result += ' ';
} else if (!Char.IsLetter(s[i]) && !Char.IsNumber(s[i])) {
// do nothing
} else {
result += s[i];
}
}
s = result;
}
So I think I need a faster way to remove punctuation from this string if anyone has any suggestions? I know looping through every character is horrible, but I'm stumped and I was never taught Regex in depth.
Edit: I was asked how I was storing the string in the dictionary class! This is the constructor for another class that actually uses the formatted string.
public CodeBook(string book)
{
BookMap = new Dictionary<string, List<int>>();
Key = book.Split(null).ToList(); // split string into words
foreach(string s in Key)
{
if (!BookMap.Keys.Contains(s))
{
BookMap.Add(s, Enumerable.Range(0, Key.Count).Where(i => Key[i] == s).ToList());
// add word and add list of occurrances of word
}
}
}

This is slow because you construct string by concatenations in a loop. You have several approaches that are more performant:
Use StringBuilder - unlike string concatenation which constructs a new object each time you add a character, this approach expands the string under construction by larger chunks, preventing excessive garbage creation.
Use LINQ's filtering with Where - this approach constructs an array of chars in a single shot, then constructs a single string from it.
Use regular expression's Replace - this method is optimized to deal with strings of virtually unlimited sizes.
Roll your own algorithm - create an array of chars that corresponds to the length of the original string. Walk through the string, and add the characters that you wish to keep to the array. Use string's constructor that takes the array, the initial index, and the length to construct the string at once.

Looping through every character once is not that bad. You're doing it all in one pass, that's not trivial to avoid.
The problem lies in the fact that the framework will need to allocate a new copy of the (partial) string whenever you do something like
result += s[i];
You can avoid that by introducing a StringBuilder documented here to append non-punctuation characters as you go.
public string removePunctuation(string s)
{
var result = new StringBuilder();
for (int i = 0; i < s.Length; i++) {
if (Char.IsWhiteSpace(s[i])) {
result.Append(" ");
} else if (!Char.IsLetter(s[i]) && !Char.IsNumber(s[i])) {
// do nothing
} else {
result.Append(s[i]);
}
}
return result.ToString();
}
You could further reduce the number of necessary Append calls with a refined algorithm, for example look ahead to the next punctuation and append larger portions at once, or use an existing string manipulation library like RegEx. But the introduction of StringBuilder above should give you a noticable performance gain already.
I was never taught Regex in depth
Use the search provider of your choice, you may end up with a tested solution which you can just study and use: https://stackoverflow.com/a/5871826/1132334

You can use Regex to remove punctuations as below.
public string removePunctuation(string s)
{
string result = Regex.Replace(s, #"[^\w\s]", "");
return result;
}
^ Means: not these characters (letters, numbers).
\w Means: word characters.
\s Means: space characters.

Replace character at specific index in List<string>, but indexer is read only [duplicate]

This question already has answers here:
Is there an easy way to change a char in a string in C#?
(8 answers)
Closed 5 years ago.
This is kind of a basic question, but I learned programming in C++ and am just transitioning to C#, so my ignorance of the C# methods are getting in my way.
A client has given me a few fixed length files and they want the 484th character of every odd numbered record, skipping the first one (3, 5, 7, etc...) changed from a space to a 0. In my mind, I should be able to do something like the below:
static void Main(string[] args)
{
List<string> allLines = System.IO.File.ReadAllLines(#"C:\...").ToList();
foreach(string line in allLines)
{
//odd numbered logic here
line[483] = '0';
}
...
//write to new file
}
However, the property or indexer cannot be assigned to because it is read only. All my reading says that I have not set a setter for the variable, and I have tried what was shown at this SO article, but I am doing something wrong every time. Should what is shown in that article work? Should I do something else?

You cannot modify C# strings directly, because they are immutable. You can convert strings to char[], modify it, then make a string again, and write it to file:
File.WriteAllLines(
#"c:\newfile.txt"
, File.ReadAllLines(#"C:\...").Select((s, index) => {
if (index % 2 = 0) {
return s; // Even strings do not change
}
var chars = s.ToCharArray();
chars[483] = '0';
return new string(chars);
})
);

Since strings are immutable, you can't modify a single character by treating it as a char[] and then modify a character at a specific index. However, you can "modify" it by assigning it to a new string.
We can use the Substring() method to return any part of the original string. Combining this with some concatenation, we can take the first part of the string (up to the character you want to replace), add the new character, and then add the rest of the original string.
Also, since we can't directly modify the items in a collection being iterated over in a foreach loop, we can switch your loop to a for loop instead. Now we can access each line by index, and can modify them on the fly:
for(int i = 0; i < allLines.Length; i++)
{
if (allLines[i].Length > 483)
{
allLines[i] = allLines[i].Substring(0, 483) + "0" + allLines[i].Substring(484);
}
}
It's possible that, depending on how many lines you're processing and how many in-line concatenations you end up doing, there is some chance that using a StringBuilder instead of concatenation will perform better. Here is an alternate way to do this using a StringBuilder. I'll leave the perf measuring to you...
var sb = new StringBuilder();
for (int i = 0; i < allLines.Length; i++)
{
if (allLines[i].Length > 483)
{
sb.Clear();
sb.Append(allLines[i].Substring(0, 483));
sb.Append("0");
sb.Append(allLines[i].Substring(484));
allLines[i] = sb.ToString();
}
}

The first item after the foreach (string line in this case) is a local variable that has no scope outside the loop - that’s why you can’t assign a value to it. Try using a regular for loop instead.

Purpose of for each is meant to iterate over a container. It's read only in nature. You should use regular for loop. It will work.
static void Main(string[] args)
{
List<string> allLines = System.IO.File.ReadAllLines(#"C:\...").ToList();
for (int i=0;i<=allLines.Length;++i)
{
if (allLines[i].Length > 483)
{
allLines[i] = allLines[i].Substring(0, 483) + "0";
}
}
...
//write to new file
}

How do I get the number of listitems that meet certain criteria?

I need for example the number of list-items, that are NOT "".
ATM, I solve it like this:
public int getRealCount()
{
List<string> all = new List<string>(originList);
int maxall = all.Count;
try
{
for (int i = 0; i < maxall; i++)
{
all.Remove("");
}
}
catch { }
return all.Count;
}
No question, performance is pretty bad. I'm lucky it's just a 10-items-list, but on a phone you should avoid such code.
So my question is, how can I improve this code?
One idea was: there could already be a method for that. The econd method would be: that all could be filled with only the items that are not "".
How should I solve this?
Thanks

Sounds like you want:
return originList.Count(x => x != "");
There's no need to create a copy of the collection at all. Note that you'll need using System.Linq; in your using directives at the start of your source code.
(Note that you should not have empty catch blocks like that - it's a terrible idea to suppress exceptions in that way. Only catch exceptions when you either want to really handle them or when you want to rethrow them wrapped as another type. If you must ignore an exception, you should at least log it somewhere.)

If performance is your concern, then you should keep a collection that is only for these items.
If performance is not a big deal, I would suggest you use a Linq query on your collection. The cool thing about Linq is that the search is delayed until you need it.
int nonEmptyItemCount = originList.Count(str => !string.IsNullOrEmpty(str));
You could also do
int nonEmptyItemCount = originList.Count(str => str != "");

You should use LINQ. Install ReSharper, it'll generate it for you.
Also, don't create an int maxall = all.Count and then use it in your for loop.
For mobile apps you shouldn't use unnecessary memory so just use all.Count in the for loop.

You're calling all.remove("") for every item in the list all. Why not just call it once? You're not using i at all in your code...
Why not:
public int getRealCount()
{
List<string> all = new List<string>(originList);
int erased =all.RemoveAll(delegate(string s)
{
return s == "";
});
return all.Count - erased;
}
Update:
Fixed the issue I had. This is without lambda's.

Is goto ok for breaking out of nested loops?

JavaScript supports a goto like syntax for breaking out of nested loops. It's not a great idea in general, but it's considered acceptable practice. C# does not directly support the break labelName syntax...but it does support the infamous goto.
I believe the equivalent can be achieved in C#:
int i = 0;
while(i <= 10)
{
Debug.WriteLine(i);
i++;
for(int j = 0; j < 3; j++)
if (i > 5)
{
goto Break;//break out of all loops
}
}
Break:
By the same logic of JavaScript, is nested loop scenario an acceptable usage of goto? Otherwise, the only way I am aware to achieve this functionality is by setting a bool with appropriate scope.

My opinion: complex code flows with nested loops are hard to reason about; branching around, whether it is with goto or break, just makes it harder. Rather than writing the goto, I would first think really hard about whether there is a way to eliminate the nested loops.
A couple of useful techniques:
First technique: Refactor the inner loop to a method. Have the method return whether or not to break out of the outer loop. So:
for(outer blah blah blah)
{
for(inner blah blah blah)
{
if (whatever)
{
goto leaveloop;
}
}
}
leaveloop:
...
becomes
for(outer blah blah blah)
{
if (Inner(blah blah blah))
break;
}
...
bool Inner(blah blah blah)
{
for(inner blah blah blah)
{
if (whatever)
{
return true;
}
}
return false;
}
Second technique: if the loops do not have side effects, use LINQ.
// fulfill the first unfulfilled order over $100
foreach(var customer in customers)
{
foreach(var order in customer.Orders)
{
if (!order.Filled && order.Total >= 100.00m)
{
Fill(order);
goto leaveloop;
}
}
}
leaveloop:
instead, write:
var orders = from customer in customers
from order in customer.Orders;
where !order.Filled
where order.Total >= 100.00m
select order;
var orderToFill = orders.FirstOrDefault();
if (orderToFill != null) Fill(orderToFill);
No loops, so no breaking out required.
Alternatively, as configurator points out in a comment, you could write the code in this form:
var orderToFill = customers
.SelectMany(customer=>customer.Orders)
.Where(order=>!order.Filled)
.Where(order=>order.Total >= 100.00m)
.FirstOrDefault();
if (orderToFill != null) Fill(orderToFill);
The moral of the story: loops emphasize control flow at the expense of business logic. Rather than trying to pile more and more complex control flow on top of each other, try refactoring the code so that the business logic is clear.

I would personally try to avoid using goto here by simply putting the loop into a different method - while you can't easily break out of a particular level of loop, you can easily return from a method at any point.
In my experience this approach has usually led to simpler and more readable code with shorter methods (doing one particular job) in general.

Let's get one thing straight: there is nothing fundamentally wrong with using the goto statement, it isn't evil - it is just one more tool in the toolbox. It is how you use it that really matters, and it is easily misused.
Breaking out of a nested loop of some description can be a valid use of the statement, although you should first look to see if it can be redesigned. Can your loop exit expressions be rewritten? Are you using the appropriate type of loop? Can you filter the list of data you may be iterating over so that you don't need to exit early? Should you refactor some loop code into a separate function?

IMO it is acceptable in languages that do not support break n; where n specifies the number of loops it should break out.
At least it's much more readable than setting a variable that is then checked in the outer loop.

I believe the 'goto' is acceptable in this situation. C# does not support any nifty ways to break out of nested loops unfortunately.

It's a bit of a unacceptable practice in C#. If there's no way your design can avoid it, well, gotta use it. But do exhaust all other alternatives first. It will make for better readability and maintainability. For your example, I've crafted one such potential refactoring:
void Original()
{
int i = 0;
while(i <= 10)
{
Debug.WriteLine(i);
i++;
if (Process(i))
{
break;
}
}
}
bool Process(int i)
{
for(int j = 0; j < 3; j++)
if (i > 5)
{
return true;
}
return false;
}

I recommend using continue if you want to skip that one item, and break if you want to exit the loop. For deeper nested put it in a method and use return. I personally would rather use a status bool than a goto. Rather use goto as a last resort.

anonymous functions
You can almost always bust out the inner loop to an anonymous function or lambda. Here you can see where the function used to be an inner loop, where I would have had to use GoTo.
private void CopyFormPropertiesAndValues()
{
MergeOperationsContext context = new MergeOperationsContext() { GroupRoot = _groupRoot, FormMerged = MergedItem };
// set up filter functions caller
var CheckFilters = (string key, string value) =>
{
foreach (var FieldFilter in MergeOperationsFieldFilters)
{
if (!FieldFilter(key, value, context))
return false;
}
return true;
};
// Copy values from form to FormMerged
foreach (var key in _form.ValueList.Keys)
{
var MyValue = _form.ValueList(key);
if (CheckFilters(key, MyValue))
MergedItem.ValueList(key) = MyValue;
}
}
This often occurs when searching for multiple items in a dataset manually, as well. Sad to say the proper use of goto is better than Booleans/flags, from a clarity standpoint, but this is more clear than either and avoids the taunts of your co-workers.
For high-performance situations, a goto would be fitting, however, but only by 1%, let's be honest here...

int i = 0;
while(i <= 10)
{
Debug.WriteLine(i);
i++;
for(int j = 0; j < 3 && i <= 5; j++)
{
//Whatever you want to do
}
}

Unacceptable in C#.
Just wrap the loop in a function and use return.
EDIT: On SO, downvoting is used to on incorrect answers, and not on answers you disagree with. As the OP explicitly asked "is it acceptable?", answering "unacceptable" is not incorrect (although you might disagree).

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.