.NET tool for simple lookups? - c#

So I've found myself writing code along these lines lately.
Dictionary<string, byte> dict = new Dictionary<string, byte>();
foreach(string str in arbitraryStringCollection)
{
if(!dict.ContainsKey(str))
{
ProcessString(str);
dict[str] = 0;
}
}
The example is overly generic, but the common goal I find myself shooting for is "Have I done this one already?".
I like using Dictionary for the fast key lookup, but since I never care about the value field, I can't help but feel it's slightly excessive, even if it's just a byte per entry.
Is there a better .NET tool out there that accomplishes this, something with the key lookup speed of a Dictionary but without the arbitrary and unnecessary values?

You should use HashSet<T>
HashSet<string> hashSet= new HashSet<string>();
foreach(string str in arbitraryStringCollection)
{
if(!hashSet.Contains(str))
{
ProcessString(str);
hashSet.Add(str);
}
}
To make it shorter:
foreach(string str in arbitraryStringCollection)
{
if(hashSet.Add(str)) ProcessString(str);
}
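A minimal sketch of the same idea wrapped in a reusable helper, in case it is useful. `OnceOnly`, `ProcessOnce` and the optional comparer parameter are names made up for illustration, not framework APIs; the only framework behaviour relied on is that `HashSet<T>.Add` returns false when the item is already present.

```csharp
using System;
using System.Collections.Generic;

public static class OnceOnly
{
    // Hypothetical helper: runs 'process' once per distinct item, in first-seen order.
    // Returns the number of items that were actually processed.
    public static int ProcessOnce(
        IEnumerable<string> items,
        Action<string> process,
        IEqualityComparer<string> comparer = null)
    {
        // A null comparer makes HashSet fall back to the default string equality.
        var seen = new HashSet<string>(comparer);
        int processed = 0;
        foreach (string item in items)
        {
            // Add returns true only the first time a given item is inserted.
            if (seen.Add(item))
            {
                process(item);
                processed++;
            }
        }
        return processed;
    }
}
```

Passing `StringComparer.OrdinalIgnoreCase` as the comparer would treat "abc" and "ABC" as the same entry, which may or may not be what you want.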

There isn't a tool or library for that, however you can refactor this code to be less verbose. For example, the code as is could be simplified using the Distinct method.
foreach (var str in arbitraryStringCollection.Distinct())
{
ProcessString(str);
}
You could further refactor it using some sort of ForEach extension method, or refactor the entire thing into an extension method.
Alternatively, if your requirements are slightly different (e.g. you want to keep dict for the lifetime of the application), then this could be refactored in a slightly different way, e.g.
HashSet<string> dict = new HashSet<string>();
foreach(string str in arbitraryStringCollection)
{
dict.DoOnce(str, ProcessString);
}
// Re-usable extension method
public static class ExtensionMethods
{
public static void DoOnce<T>(this ISet<T> set, T value, Action<T> action)
{
if (!set.Contains(value))
{
action(value);
set.Add(value);
}
}
}

Related

C# getting array with a string name?

So here's a hypothetical. From someone fairly new to the whole C# and Unity thing:
Suppose for a moment that I have a series of string[] arrays. All of which have similar naming convention. For example:
public string[] UndeadEntities =
{
// stuff
};
public string[] DemonEntities =
{
// stuff
};
Now suppose I want to call one of them at random, I have another list that contains the names of all of those arrays and I return it at random.
My problem is that I grab the name from the array and it's a string, not something I can use. So my question is this:
is there any way for me to use this string and use it to call the above mentioned arrays.
Something like this is what I'm up to but unsure where to go from here and I really would like to avoid making a massive series of If Else statements just for that.
public string[] EnemiesType = { /* list of all the other arrays */ };
public string enemiesTypeGeneratedArrayName = "";
public void GenerateEncounterGroup()
{
enemiesTypeGeneratedArrayName = EnemiesType[Random.Range(0, 12)];
}
Can I nest arrays inside of other arrays? Is there another alternative?
I'm not sure if it is possible at all but if it is, I'll take any pointers as to where to go from there. Thanks.
There are several solutions to your specific problem. An easy one is to use a Dictionary:
A Dictionary is a data structure where you have a key (usually a string) and a value (whatever type you may want to store).
What you can do is, at start, initialize a Dictionary where each key is your enemy type and the value it stores is your array, something like:
Dictionary<string, string[]> enemyArrays= new Dictionary<string, string[]>();
.
void Start()
{
enemyArrays["typeA"] = myArrayA;
enemyArrays["typeB"] = myArrayB;
}
Then when you need to get that array, just:
enemiesTypeGeneratedArrayName = EnemiesType[Random.Range(0, 12)];
string[] myRandomArray =enemyArrays[enemiesTypeGeneratedArrayName];
string randomEnemy = myRandomArray[index];
Here you can read more about Dictionary class if you want.
There are other ways to do it, but I think this one is pretty easy to implement in the code you already wrote, and Dictionaries are cool haha.
I hope that's clear :)
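A rough, self-contained sketch of the dictionary approach, picking the type and then the enemy at random. It uses `System.Random` instead of Unity's `Random.Range` so that it runs outside Unity; the class name, the keys and the enemy names are placeholders invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class EncounterTables
{
    // Placeholder data; the real arrays would hold the entity names from the question.
    private static readonly Dictionary<string, string[]> EnemyArrays =
        new Dictionary<string, string[]>
        {
            { "Undead", new[] { "Skeleton", "Zombie" } },
            { "Demon", new[] { "Imp", "Hellhound" } },
        };

    public static string PickRandomEnemy(Random rng)
    {
        // Copy the keys into a list so a random index can select one of them.
        var typeNames = new List<string>(EnemyArrays.Keys);
        string typeName = typeNames[rng.Next(typeNames.Count)];
        string[] enemies = EnemyArrays[typeName];
        // Next(n) returns a value in [0, n), so the index is always in range.
        return enemies[rng.Next(enemies.Length)];
    }
}
```

Because the random index is derived from the number of keys, there is no hard-coded upper bound like the 12 in `Random.Range(0, 12)` to keep in sync when arrays are added.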

Replace a Specific Text From a Text

I'm writing a chat helper tool for a game with a custom library.
I want to change specific variables when player sends the message.
This is my code
static List<string> asciis = new List<string> { "shrug", "omg" };
static List<string> converteds = new List<string> { @"¯\_(ツ)_/¯", @"◕_◕"};
private static void Game_OnInput(GameInputEventArgs args)
{
newtext = args.Input;
foreach (var ascii in asciis)
{
foreach (var converted in converteds)
{
if (args.Input.Contains(ascii))
{
newtext = args.Input.Replace(ascii, converted);
Game.Say(newtext);
}
}
}
}
As you can see I'm trying to get the texts from "asciis" and convert them to "converteds" (in order).
Whenever I type something that is not in the "asciis" list it works perfectly. But whenever I type shrug it prints ¯\_(ツ)_/¯ + ◕_◕ + ◕_◕ (it prints omg 2 times). The same happens with omg.
You can probably tell that I'm a real beginner. I really don't understand what is wrong with this code...
It seems that your two lists have the same length (in terms of elements contained) and each element in one list has its replacement in the same position in the other list.
Then you could treat the two lists as two arrays and use a different way to search for the input term and replace it with the substitution text
private static void Game_OnInput(GameInputEventArgs args)
{
newtext = args.Input;
for(int x = 0; x < asciis.Count; x++)
if (args.Input.Contains(asciis[x]))
{
newtext = args.Input.Replace(asciis[x], converteds[x]);
Game.Say(newtext);
}
}
While I don't think there is a big improvement, you could also implement the same with a dictionary
static Dictionary<string, string> converter = new Dictionary<string, string>()
{
{"shrug", @"¯\_(ツ)_/¯"},
{"omg", @"◕_◕"}
};
private static void Game_OnInput(GameInputEventArgs args)
{
newtext = args.Input;
foreach(KeyValuePair<string, string> kvp in converter)
if (args.Input.Contains(kvp.Key))
{
newtext = args.Input.Replace(kvp.Key, kvp.Value);
Game.Say(newtext);
}
}
Well, it is probably a bit more readable, but we still need to traverse the dictionary keys one by one.
As Daniel pointed out in his comment, this is a good use case for dictionaries.
Have a dictionary that maps the text you want replaced to the stuff you want to be replaced with:
Dictionary<string, string> dict = new Dictionary<string, string>
{
{"shrug", @"¯\_(ツ)_/¯" },
{"omg", "◕_◕" }
}; // etc
Then find all occurrences of the keys from the dictionary and replace them with the corresponding values.
Also why are you using static methods and fields? I may be wrong, but I expect most, if not all of your other methods and fields are static as well. I strongly recommend avoiding getting used to them. Try learning more about OOP instead.
Your main problem is that you are always replacing on args.Input, but storing the results in newtext each time, overwriting your previous replacements. Your next problem is that you are outputting the result after each replacement attempt so that's why you are getting multiple weird output results.
I also suggest a dictionary since by definition, it is a mapping of one thing to another. Also, note my changes below, I have moved the Game.Say call outside of the loops and changed "args.Input.Replace" to "newtext.Replace"
static Dictionary<string, string> dictionary = new Dictionary<string, string>
{
{"shrug", @"¯\_(ツ)_/¯" },
{"omg", "◕_◕" }
};
private static void Game_OnInput(GameInputEventArgs args)
{
string newtext = args.Input;
foreach(string key in dictionary.Keys){
newtext = newtext.Replace(key,dictionary[key]);
}
Game.Say(newtext);
}
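For reference, here is a small sketch of the fix that can be run without the game library. `ChatReplacer` and `Apply` are names invented for the example; the game-specific `Game.Say` call is left out so that the replacement logic can be checked on its own.

```csharp
using System;
using System.Collections.Generic;

public static class ChatReplacer
{
    private static readonly Dictionary<string, string> Replacements =
        new Dictionary<string, string>
        {
            { "shrug", @"¯\_(ツ)_/¯" },
            { "omg", "◕_◕" },
        };

    public static string Apply(string input)
    {
        string result = input;
        foreach (KeyValuePair<string, string> pair in Replacements)
        {
            // Replace on the running result so earlier substitutions are kept.
            result = result.Replace(pair.Key, pair.Value);
        }
        return result;
    }
}
```

The event handler would then call `Game.Say(ChatReplacer.Apply(args.Input))` exactly once per message.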

When to write code to anticipate looping and when to specifically do something 'X' times

If I know that a method must perform an action a certain number of times, such as retrieving data, should I write code to specifically do this the number of times that is required, or should my code be able to anticipate later changes?
For instance, say I was told to write a method that retrieves 2 values from a dictionary (I'll call it Settings here) and returns them using known keys that are provided.
public Dictionary<string, string> GetSettings()
{
const string keyA = "address"; //I understand 'magic strings' are bad, bear with me
const string keyB = "time";
Dictionary<string, string> retrievedSettings = new Dictionary<string,string>();
//should I add the keys to a list and then iterate through the list?
List<string> listOfKeys = new List<string>(){keyA, keyB};
foreach( string key in listOfKeys)
{
if(Settings.ContainsKey(key))
{
string value = Settings[key];
retrievedSettings.Add(key, value);
}
}
//or should I just get the two values directly from the dictionary like so
if(Settings.ContainsKey(keyA))
{
retrievedSettings.Add(keyA , Settings[keyA]);
}
if(Settings.ContainsKey(keyB))
{
retrievedSettings.Add(keyB , Settings[keyB]);
}
return retrievedSettings;
}
The reason I ask is that code repetition is always a bad thing (i.e. DRY), but at the same time, more experienced programmers have told me that there is no need to write the logic to anticipate larger looping if the action only needs to be performed a known number of times.
I would extract a method that takes the keys as parameter:
private Dictionary<string, string> GetSettings(params string[] keys)
{
var retrievedSettings = new Dictionary<string, string>();
foreach(string key in keys)
{
if(Settings.ContainsKey(key))
retrievedSettings.Add(key, Settings[key]);
}
return retrievedSettings;
}
You can now use this method like this:
public Dictionary<string, string> GetSettings()
{
return GetSettings(keyA, keyB);
}
I would choose this approach because it makes your main method trivial to understand: "Aha, it gets the settings for keyA and for keyB".
I would use this approach even when I am sure that I will never need to get more than these two keys. In other words, this approach has been chosen, not because it anticipates later changes but because it better communicates intent.
However, with LINQ, you wouldn't really need that extracted method. You could simply use this:
public Dictionary<string, string> GetSettings()
{
return new [] { keyA, keyB }.Where(x => Settings.ContainsKey(x))
.ToDictionary(x => x, x => Settings[x]);
}
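As a variation on the extracted method, here is a sketch that takes the source dictionary as a parameter and uses `TryGetValue`, which checks for the key and fetches the value in a single lookup. `SettingsReader` and `Pick` are hypothetical names, and the `settings` parameter stands in for the `Settings` dictionary from the question.

```csharp
using System.Collections.Generic;

public static class SettingsReader
{
    // Hypothetical helper: copies the requested keys that exist in 'settings'.
    public static Dictionary<string, string> Pick(
        IDictionary<string, string> settings, params string[] keys)
    {
        var picked = new Dictionary<string, string>();
        foreach (string key in keys)
        {
            // TryGetValue avoids the separate ContainsKey + indexer lookups.
            if (settings.TryGetValue(key, out string value))
            {
                // The indexer (rather than Add) tolerates a key being requested twice.
                picked[key] = value;
            }
        }
        return picked;
    }
}
```

A call such as `SettingsReader.Pick(Settings, keyA, keyB)` would then read much like the one-line `GetSettings(keyA, keyB)` above.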
The DRY principle does not necessarily mean that every line of code in your program should be unique. It simply means that you should not have large regions of code spread out throughout your program that do the same thing.
Option number 1 works well when you have a large number of items to search for, but has the downside of making the code slightly less trivial to read.
Option number 2 works well when you have a small number of options. It is more straightforward and avoids the overhead of building and iterating a list.
Since you only have two settings, I would definitely go with option number 2. Making decisions such as these in expectation of future changes is a waste of effort. I have found this article to be quite helpful in illustrating the perils of being too concerned with non-existent requirements.

Design pattern for aggregating lazy lists

I'm writing a program as follows:
Find all files with the correct extension in a given directory
Foreach, find all occurrences of a given string in those files
Print each line
I'd like to write this in a functional way, as a series of generator functions (things that call yield return and only return one item at a time lazily-loaded), so my code would read like this:
IEnumerable<string> allFiles = GetAllFiles();
IEnumerable<string> matchingFiles = GetMatches( "*.txt", allFiles );
IEnumerable<string> contents = GetFileContents( matchingFiles );
IEnumerable<string> matchingLines = GetMatchingLines( contents );
foreach( var lineText in matchingLines )
Console.WriteLine( "Found: " + lineText );
This is all fine, but what I'd also like to do is print some statistics at the end. Something like this:
Found 233 matches in 150 matching files. Scanned 3,297 total files in 5.72s
The problem is, writing the code in a 'pure functional' style like above, each item is lazily loaded.
You don't know how many files match in total until the final foreach loop completes, and because only one item is ever yielded at a time, the code doesn't have any place to keep track of how many things it's found previously. If you invoke LINQ's matchingLines.Count() method, it will re-enumerate the collection!
I can think of many ways to solve this problem, but all of them seem to be somewhat ugly. It strikes me as something that people are bound to have done before, and I'm sure there'll be a nice design pattern which shows a best practice way of doing this.
Any ideas? Cheers
In a similar vein to other answers, but taking a slightly more generic approach ...
... why not create a Decorator class that can wrap an existing IEnumerable implementation and calculate the statistic as it passes other items through.
Here's a Counter class I just threw together - but you could create variations for other kinds of aggregation too.
public class Counter<T> : IEnumerable<T>
{
public int Count { get; private set; }
public Counter(IEnumerable<T> source)
{
mSource = source;
Count = 0;
}
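public IEnumerator<T> GetEnumerator()
{
foreach (var item in mSource)
{
// Count each item as it passes through to the consumer.
Count++;
yield return item;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
// Delegate to the generic enumerator so the counting logic lives in one place.
return GetEnumerator();
}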
private IEnumerable<T> mSource;
}
You could create three instances of Counter:
One to wrap GetAllFiles() counting the total number of files;
One to wrap GetMatches() counting the number of matching files; and
One to wrap GetMatchingLines() counting the number of matching lines.
The key with this approach is that you're not layering multiple responsibilities onto your existing classes/methods - the GetMatchingLines() method only handles the matching, you're not asking it to track stats as well.
Clarification in response to a comment by Mitcham:
The final code would look something like this:
var files = new Counter<string>( GetAllFiles());
var matchingFiles = new Counter<string>(GetMatches( "*.txt", files ));
var contents = GetFileContents( matchingFiles );
var linesFound = new Counter<string>(GetMatchingLines( contents ));
foreach( var lineText in linesFound )
Console.WriteLine( "Found: " + lineText );
string message
= String.Format(
"Found {0} matches in {1} matching files. Scanned {2} files",
linesFound.Count,
matchingFiles.Count,
files.Count);
Console.WriteLine(message);
Note that this is still a functional approach - the variables used are immutable (more like bindings than variables), and the overall function has no side-effects.
I would say that you need to encapsulate the process into a 'Matcher' class in which your methods capture statistics as they progress.
public class Matcher
{
private int totalFileCount;
private int matchedCount;
private DateTime start;
private int lineCount;
private DateTime stop;
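public IEnumerable<File> Match(string pattern)
{
start = DateTime.Now;
foreach (File file in GetMatchedFiles(pattern))
{
yield return file;
}
stop = DateTime.Now;
// Runs only after the caller has enumerated every match.
System.Console.WriteLine(string.Format(
"Found {0} matches in {1} matching files." +
" {2} total files scanned in {3}.",
lineCount, matchedCount,
totalFileCount, (stop-start).ToString()));
}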
private IEnumerable<File> GetMatchedFiles(string pattern)
{
foreach(File file in SomeFileRetrievalMethod())
{
totalFileCount++;
if (MatchPattern(pattern,file.FileName))
{
matchedCount++;
yield return file;
}
}
}
}
I'll stop there since I'm supposed to be coding work stuff, but the general idea is there. The entire point of 'pure' functional programming is to not have side effects, and this type of statistics calculation is a side effect.
I can think of two ideas
Pass in a context object and return (string + context) from your enumerators - the purely functional solution
Use thread-local storage for your statistics (CallContext); you can be fancy and support a stack of contexts. You would then have code like this:
using (var stats = DirStats.Create())
{
IEnumerable<string> allFiles = GetAllFiles();
IEnumerable<string> matchingFiles = GetMatches( "*.txt", allFiles );
IEnumerable<string> contents = GetFileContents( matchingFiles );
stats.Print();
IEnumerable<string> matchingLines = GetMatchingLines( contents );
stats.Print();
}
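The first idea (passing a context object through the pipeline) might look roughly like the sketch below. Everything here is hypothetical: `ScanStats`, `Scanner.FindLines` and the in-memory dictionary that stands in for the file system are invented for the example, and the extension filter and line matching are collapsed into a single generator to keep it short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical context object that the pipeline updates as items flow through.
public sealed class ScanStats
{
    public int TotalFiles;
    public int MatchingFiles;
    public int MatchingLines;
}

public static class Scanner
{
    // 'files' maps a file name to its lines; in-memory data stands in for the disk.
    public static IEnumerable<string> FindLines(
        IDictionary<string, string[]> files, string extension, string text, ScanStats stats)
    {
        foreach (KeyValuePair<string, string[]> file in files)
        {
            stats.TotalFiles++;
            if (!file.Key.EndsWith(extension, StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }
            stats.MatchingFiles++;
            foreach (string line in file.Value)
            {
                if (line.Contains(text))
                {
                    stats.MatchingLines++;
                    // Still lazy: the counters only advance as the caller enumerates.
                    yield return line;
                }
            }
        }
    }
}
```

The counters are only complete once the caller's foreach has finished, so the summary line has to be printed after the loop, not before it.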
If you're happy to turn your code upside down, you might be interested in Push LINQ. The basic idea is to reverse the "pull" model of IEnumerable<T> and turn it into a "push" model with observers - each part of the pipeline effectively pushes its data past any number of observers (using event handlers) which typically form new parts of the pipeline. This gives a really easy way to hook up multiple aggregates to the same data.
See this blog entry for some more details. I gave a talk on it in London a while ago - my page of talks has a few links for sample code, the slide deck, video etc.
It's a fun little project, but it does take a bit of getting your head around.
I took Bevan's code and refactored it around until I was content. Fun stuff.
public class Counter
{
public int Count { get; set; }
}
public static class CounterExtensions
{
public static IEnumerable<T> ObserveCount<T>
(this IEnumerable<T> source, Counter count)
{
foreach (T t in source)
{
count.Count++;
yield return t;
}
}
public static IEnumerable<T> ObserveCount<T>
(this IEnumerable<T> source, IList<Counter> counters)
{
Counter c = new Counter();
counters.Add(c);
return source.ObserveCount(c);
}
}
public static class CounterTest
{
public static void Test1()
{
IList<Counter> counters = new List<Counter>();
//
IEnumerable<int> step1 =
Enumerable.Range(0, 100).ObserveCount(counters);
//
IEnumerable<int> step2 =
step1.Where(i => i % 10 == 0).ObserveCount(counters);
//
IEnumerable<int> step3 =
step2.Take(3).ObserveCount(counters);
//
step3.ToList();
foreach (Counter c in counters)
{
Console.WriteLine(c.Count);
}
}
}
Output as expected: 21, 3, 3
Assuming those functions are your own, the only thing I can think of is the Visitor pattern, passing in an abstract visitor function that calls you back when each thing happens. For example: pass an ILineVisitor into GetFileContents (which I'm assuming breaks up the file into lines). ILineVisitor would have a method like OnVisitLine(String line), you could then implement the ILineVisitor and make it keep the appropriate stats. Rinse and repeat with a ILineMatchVisitor, IFileVisitor etc. Or you could use a single IVisitor with an OnVisit() method which has a different semantic in each case.
Your functions would each need to take a Visitor, and call its OnVisit() at the appropriate time, which may seem annoying, but at least the visitor could be used to do lots of interesting things, other than just what you're doing here. In fact you could actually avoid writing GetMatchingLines by passing a visitor that checks for the match in OnVisitLine(String line) into GetFileContents.
Is this one of the ugly things you'd already considered?
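To make that concrete, a minimal sketch of the line visitor could look like this. `ILineVisitor` and `OnVisitLine` are the names suggested above; `LineCountingVisitor`, `FileReader` and the exact shape of `GetFileContents` (taking lines directly, not file names) are assumptions made so the example is self-contained.

```csharp
using System.Collections.Generic;

public interface ILineVisitor
{
    void OnVisitLine(string line);
}

// Example visitor that only keeps a running count of the lines it is shown.
public sealed class LineCountingVisitor : ILineVisitor
{
    public int Count { get; private set; }

    public void OnVisitLine(string line)
    {
        Count++;
    }
}

public static class FileReader
{
    // Sketch of GetFileContents: yields each line and notifies the visitor as it does so.
    public static IEnumerable<string> GetFileContents(
        IEnumerable<string> lines, ILineVisitor visitor)
    {
        foreach (string line in lines)
        {
            visitor.OnVisitLine(line);
            yield return line;
        }
    }
}
```

Since the generator is lazy, the visitor is only called for lines the consumer actually pulls, so the count reflects work done, not the size of the input.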

Good way to concatenate string representations of objects?

Ok,
We have a lot of where clauses in our code. We have just as many ways to generate a string to represent the IN condition. I am trying to come up with a clean way as follows:
public static string Join<T>(this IEnumerable<T> items, string separator)
{
var strings = from item in items select item.ToString();
return string.Join(separator, strings.ToArray());
}
it can be used as follows:
var values = new []{1, 2, 3, 4, 5, 6};
values.Join(",");
// result should be:
// "1,2,3,4,5,6"
So this is a nice extension method that does a very basic job. I know that simple code does not always turn into fast or efficient execution, but I am just curious as to what could I have missed with this simple code. Other members of our team are arguing that:
it is not flexible enough (no control of the string representation)
may not be memory efficient
may not be fast
Any expert to chime in?
Regards,
Eric.
Regarding the first issue, you could add another 'formatter' parameter to control the conversion of each item into a string:
public static string Join<T>(this IEnumerable<T> items, string separator)
{
return items.Join(separator, i => i.ToString());
}
public static string Join<T>(this IEnumerable<T> items, string separator, Func<T, string> formatter)
{
return String.Join(separator, items.Select(i => formatter(i)).ToArray());
}
Regarding the other two issues, I wouldn't worry about them unless you later run into performance issues and find this to be a problem. It's unlikely to be much of a bottleneck, however...
For some reason, I thought that String.Join is implemented in terms of a StringBuilder class. But if it isn't, then the following is likely to perform better for large inputs since it doesn't recreate a String object for each join in the iteration.
public static string Join<T>(this IEnumerable<T> items, string separator)
{
// TODO: check for null arguments.
StringBuilder builder = new StringBuilder();
foreach(T t in items)
{
builder.Append(t.ToString()).Append(separator);
}
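// Drop the trailing separator; skipped when the sequence was empty.
if (builder.Length >= separator.Length)
builder.Length -= separator.Length;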
return builder.ToString();
}
EDIT: Here is an analysis of when it is appropriate to use StringBuilder and String.Join.
Why don't you use StringBuilder and iterate through the collection yourself, appending as you go?
Otherwise you are creating an array of strings (var strings) and then doing the Join.
You are missing null checks for the sequence and the items of the sequence. And yes, it is not the fastest and most memory efficient way. One would probably just enumerate the sequence and render the string representations of the items into a StringBuilder. But does this really matter? Are you experiencing performance problems? Do you need to optimize?
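A sketch of what those null checks could look like, assuming .NET 4 or later, where `string.Join` has an overload that accepts `IEnumerable<string>` directly (so no intermediate array is needed). `JoinStrings` is a made-up name, and turning null items into empty strings is just one possible policy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinExtensions
{
    // Hypothetical name; the optional formatter controls each item's string form.
    public static string JoinStrings<T>(
        this IEnumerable<T> items, string separator, Func<T, string> formatter = null)
    {
        if (items == null) throw new ArgumentNullException(nameof(items));
        // By default, null items become empty strings instead of throwing on ToString().
        Func<T, string> format = formatter ?? (item => item == null ? "" : item.ToString());
        // This string.Join overload takes IEnumerable<string>, so ToArray() is not needed.
        return string.Join(separator, items.Select(format));
    }
}
```

Whether this is measurably faster than the original is something only a profiler can tell you for your data sizes.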
This would also work:
public static string Test<T>(IEnumerable<T> items, string separator)
{
var builder = new StringBuilder();
bool appendSeperator = false;
if(null != items)
{
foreach(var item in items)
{
if(appendSeperator)
{
builder.Append(separator);
}
builder.Append(item.ToString());
appendSeperator = true;
}
}
return builder.ToString();
}
