Trouble with CSV formatting - c#

I'm having problems formatting a CSV file created from C# code. In Notepad, all the output runs down a single column: the labels (the string values from the dictionaries keyed by the enums below) come first, one per row, and the numbers then appear directly below them, when each number should be on the same row as its label. In Excel it's a similar story: the labels are where they should be, but the numbers appear below them, one column to the right, instead of directly beside their corresponding labels. The code I'm using is below.
Here are the enums used as keys for the dictionaries I'm working with.
public enum Genders
{
Male,
Female,
Other,
UnknownorDeclined,
}
public enum Ages
{
Upto15Years,
Between16to17Years,
Between18to24Years,
Between25to34Years,
Between35to44Years,
Between45to54Years,
Between55to64Years,
Between65to74Years,
Between75to84Years,
EightyFiveandOver,
UnavailableorDeclined,
}
The method that writes the CSV file, using a StreamWriter and a StringBuilder:
public void CSVProfileCreate<T>(Dictionary<T, string> columns, Dictionary<T, int> data)
{
StreamWriter write = new StreamWriter("c:/temp/testoutputprofile.csv");
StringBuilder output = new StringBuilder();
foreach (var pair in columns)
{
//output.Append(pair.Key);
//output.Append(",");
output.Append(pair.Value);
output.Append(",");
output.Append(Environment.NewLine);
}
foreach (var d in data)
{
//output.Append(pair.Key);
output.Append(",");
output.Append(d.Value);
output.Append(Environment.NewLine);
}
write.Write(output);
write.Dispose();
}
And finally the method to feed the dictionaries into the csv creator.
public void RunReport()
{
CSVProfileCreate(genderKeys, genderValues);
CSVProfileCreate(ageKeys, ageValues);
}
Any ideas?
UPDATE
I fixed it by doing this:
public void CSVProfileCreate<T>(Dictionary<T, string> columns, Dictionary<T, int> data)
{
StreamWriter write = new StreamWriter("c:/temp/testoutputprofile.csv");
StringBuilder output = new StringBuilder();
IEnumerable<string> col = columns.Values.AsEnumerable();
IEnumerable<int> dat = data.Values.AsEnumerable();
for (int i = 0; i < col.Count(); i++)
{
output.Append(col.ElementAt(i));
output.Append(",");
output.Append(dat.ElementAt(i));
output.Append(",");
output.Append(Environment.NewLine);
}
write.Write(output);
write.Dispose();
}
}

You write Environment.NewLine after every single value that you output.
Rather than having two loops, you should have just one loop that outputs
A "pair"
A value
Environment.NewLine
for each iteration.
Assuming columns and data have the same keys, that could look something like
foreach (T key in columns.Keys)
{
// columns[key] is the label (a string) and data[key] is its number (an int)
string column = columns[key];
int value = data[key];
output.Append(column);
output.Append(",");
output.Append(value);
output.Append(Environment.NewLine);
}
Note two complications:
If a cell's text (pair.Value or d.Value in your code) contains a comma, you need to surround the output of that cell with double quotes.
If a cell's text contains a double-quote, you have to surround the cell with double quotes and also double up each double-quote inside it to escape it.
Examples:
Smith, Jr
would have to be output
"Smith, Jr"
and
"Smitty" Smith, Jr
would have to be output
"""Smitty"" Smith, Jr"
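The two quoting rules above can be sketched as a small helper. This is only a minimal sketch: the class name CsvText and the method name Escape are made up for illustration, and treating line breaks as a reason to quote is an assumption added here, since CSV readers generally expect that as well.

```csharp
using System;

public static class CsvText
{
    // Hypothetical helper: quotes a cell only when it contains a comma,
    // a double quote, or a line break (the line-break case is an assumption).
    public static string Escape(string cell)
    {
        if (cell == null)
        {
            return "";
        }
        bool needsQuotes = cell.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        if (!needsQuotes)
        {
            return cell;
        }
        // Double every embedded double quote, then wrap the whole cell in quotes.
        return "\"" + cell.Replace("\"", "\"\"") + "\"";
    }
}
```

In the loop you would then call something like output.Append(CsvText.Escape(column)) instead of appending the raw string.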
UPDATE
Based on your comment about the keys...
For purposes of enumeration, each item in the dictionary is treated as a KeyValuePair structure representing a value and its key. The order in which the items are returned is undefined.
http://msdn.microsoft.com/en-us/library/xfhwa508.aspx
If you cannot use the key to associate the right pair with the right data, how do you make that association?
If you are iterating the dictionary and they happen to be in the order you hope, that is truly undefined behavior that could change with the next .NET service pack.
You need something reliable to relate the pair with the correct data.
About the var keyword
var is not a type, but rather a shortcut that frees you from writing out the entire type. You can use var if you wish, but the actual type is KeyValuePair<T, string> and KeyValuePair<T, int> respectively. You can see that if you write var and hover over that keyword with your mouse in Visual Studio.
About disposing resources
Your line
write.Dispose();
is risky. If any of your code throws an Exception prior to reaching that line, it will never run and write will not be disposed. It is strongly preferable to make use of the using keyword like this:
using (StreamWriter write = new StreamWriter("c:/temp/testoutputprofile.csv"))
{
// Your code here
}
When the scope of using ends (after the associated }), write.Dispose() will be automatically called whether or not an Exception was thrown. This is the same as, but shorter than,
StreamWriter write = new StreamWriter("c:/temp/testoutputprofile.csv");
try
{
// Your code here
}
finally
{
write.Dispose();
}

Related

how to merge two csv files with different columns and rows in c#

I'm trying to merge two CSV files which have different headers and different numbers of rows.
I'm using the following code, but it doesn't give the correct output. It only works when both files have the same number of rows.
var first = File.ReadAllLines("firstfile.csv");
var second = File.ReadAllLines("secondfile.csv");
var result = first.Zip(second, (f, s) => string.Join(",", f, s));
File.WriteAllLines("combined.csv", result);
For example, the first file is
col1,colb,colc
a,b,c
a,v,f
and the second file is
colx,coly
x,y
cc,aa
bb,vv
m,n
The output I get is
col1,colb,colc,colx,coly
a,b,c,x,y
a,v,f,cc,aa
The remaining rows of the second file are missing.
My expected output is
col1,colb,colc,colx,coly
a,b,c,x,y
a,v,f,cc,aa
,,,bb,vv
,,,m,n

There is no inbuilt method that allows you to merge two lists of unequal length. Zip only merges down to the shortest length. However, you can achieve what you want by modifying Marc Gravell's excellent answer here, in order to allow a default value. Create yourself an extensions class, something like this:
public static class Extensions
{
public static IEnumerable<T> Merge<T>(this IEnumerable<T> first,
IEnumerable<T> second, T defaultValue, Func<T, T, T> operation)
{
using (var iter1 = first.GetEnumerator())
using (var iter2 = second.GetEnumerator())
{
while (iter1.MoveNext())
{
if (iter2.MoveNext())
{
yield return operation(iter1.Current, iter2.Current);
}
else
{
yield return operation(iter1.Current, defaultValue);
}
}
while (iter2.MoveNext())
{
yield return operation(defaultValue, iter2.Current);
}
}
}
}
You can now call it with code like this:
char separator = ',';
var first = File.ReadAllLines("firstfile.csv").AsEnumerable();
var second = File.ReadAllLines("secondfile.csv").AsEnumerable();
string defaultValue = "";
int cnt = 0;
if (first.Count() < second.Count())
{
cnt = first.FirstOrDefault().Split(separator).Length;
}
else
{
cnt = second.FirstOrDefault().Split(separator).Length;
}
defaultValue = defaultValue.PadLeft(cnt - 1, separator);
var result = first.Merge(second, defaultValue, (f, s) => string.Join(separator.ToString(), f, s));
File.WriteAllLines("combined.csv", result);
Note I have added a char separator and changed the result of ReadAllLines to give an IEnumerable<string> rather than string[] to make the code more generic. Also, the above code assumes that both files have an internally consistent number of columns.

First you need to find out which of the two lists is the larger one so you can loop over that one and once you're past the length of the smaller list you can fill up the missing cells with empty values.
Next you need to know how many columns you have in the smaller list as you want to fill these columns with empty values. That means you have to take the header line of the smaller list, split it by comma and count the columns.
Then generate a string containing your empty cells (eg. if your smaller list has 3 columns, you need a string ",," - String Padding may be of help here).
So then you only have to loop over the larger list and get the two corresponding rows (or use the empty one you generated earlier) and concatenate them with a comma and put them in a list.
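The steps described in this answer could be sketched roughly as follows. The names CsvMergeSketch and MergeRows are invented for illustration, and the sketch assumes each file has a header line and a consistent number of columns; it pads whichever side runs out first.

```csharp
using System;
using System.Collections.Generic;

public static class CsvMergeSketch
{
    // Joins two CSV files row by row; the side that runs out is padded with empty cells.
    public static List<string> MergeRows(string[] first, string[] second, char separator)
    {
        // Column counts come from each header line (0 if a file is empty).
        int firstColumns = first.Length > 0 ? first[0].Split(separator).Length : 0;
        int secondColumns = second.Length > 0 ? second[0].Split(separator).Length : 0;

        // A row of N empty cells is written as N - 1 separators.
        string emptyFirst = new string(separator, Math.Max(firstColumns - 1, 0));
        string emptySecond = new string(separator, Math.Max(secondColumns - 1, 0));

        int rowCount = Math.Max(first.Length, second.Length);
        var result = new List<string>(rowCount);
        for (int i = 0; i < rowCount; i++)
        {
            string left = i < first.Length ? first[i] : emptyFirst;
            string right = i < second.Length ? second[i] : emptySecond;
            result.Add(left + separator + right);
        }
        return result;
    }
}
```

A call such as File.WriteAllLines("combined.csv", CsvMergeSketch.MergeRows(File.ReadAllLines("firstfile.csv"), File.ReadAllLines("secondfile.csv"), ',')) would then write the merged file.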

C# using dictionaries

I'm sorry in advance if it's bad to ask for this sort of help... but I don't know who else to ask.
I have an assignment to read two text files and find the 10 longest words in the first file (and the number of times each is repeated) which don't exist in the second file.
I currently read both of the files with File.ReadAllLines then split them into arrays, where every element is a single word (punctuation marks removed as well) and removed empty entries.
The idea I had to pick out the words fitting the requirements was: to make a dictionary containing a string Word and an int Count. Then make a loop repeating for the first file's length.... firstly comparing the element with the entire dictionary - if it finds a match, increase the Count by 1. Then if it doesn't match with any of the dictionary elements - compare the given element with every element in the 2nd file through another loop, if it finds a match - just go on to the next element of the first file, if it doesn't find any matches - add the word to the dictionary, and set Count to 1.
So my first question is: Is this actually the most efficient way to do this? (Don't forget I've only recently started studying c# and am not allowed to use linq)
Second question: How do I work with the dictionary, because most of the results I could find were very confusing, and we have not yet met them at university.
My code so far:
// Reading and making all the words lowercase for comparisons
string punctuation = " ,.?!;:\"\r\n";
string Read1 = File.ReadAllText("#\\..\\Book1.txt");
Read1 = Read1.ToLower();
string Read2 = File.ReadAllText("#\\..\\Book2.txt");
Read2 = Read2.ToLower();
//Working with the 1st file
string[] FirstFileWords = Read1.Split(punctuation.ToCharArray());
var temp1 = new List<string>();
foreach (var word in FirstFileWords)
{
if (!string.IsNullOrEmpty(word))
temp1.Add(word);
}
FirstFileWords = temp1.ToArray();
Array.Sort(FirstFileWords, (x, y) => y.Length.CompareTo(x.Length));
//Working with the 2nd file
string[] SecondFileWords = Read2.Split(punctuation.ToCharArray());
var temp2 = new List<string>();
foreach (var word in SecondFileWords)
{
if (!string.IsNullOrEmpty(word))
temp2.Add(word);
}
SecondFileWords = temp2.ToArray();

Well I think you've done very well so far. Not being able to use Linq here is torture ;)
As for performance, you should consider making your SecondFileWords a HashSet<string>, as this would tremendously speed up checking whether a word exists in the 2nd file, without much effort. I wouldn't go much further in terms of performance optimization for an exercise like this unless performance is a key requirement.
A HashSet stores each value only once (Add simply returns false for a duplicate), so the Contains check below is optional but makes the intent explicit. Change your current implementation to something like:
HashSet<string> temp2 = new HashSet<string>();
foreach (var word in SecondFileWords)
{
if (!string.IsNullOrEmpty(word) && !temp2.Contains(word))
{
temp2.Add(word);
}
}
Don't convert this back to an Array again, this is not necessary.
This brings me back to your FirstFileWords which would contain duplicates too. This will cause issues later on when the top words might contain the same word multiple times. So let's get rid of them. Here it's more complicated as you need to retain the information how often a word appeared in your first list.
So let's bring a Dictionary<string, int> into play here now. A Dictionary stores a lookup key, like the HashSet, but in addition also a value. We will use the key for the word, and the value for the number of times the word appeared in the first list.
Dictionary<string, int> temp1 = new Dictionary<string, int>();
foreach (var word in FirstFileWords)
{
if (string.IsNullOrEmpty(word))
{
continue;
}
if (temp1.ContainsKey(word))
{
temp1[word]++;
}
else
{
temp1.Add(word, 1);
}
}
Now a dictionary cannot be sorted, which complicates things at this point as you still need to get your sorting by word length done. So let's get back to your Array.Sort method which I think is a good choice when you are not allowed to use Linq:
KeyValuePair<string, int>[] firstFileWordsWithCount = temp1.ToArray();
Array.Sort(firstFileWordsWithCount, (x, y) => y.Key.Length.CompareTo(x.Key.Length));
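Note: You are using .ToArray() in your example (there it is the List<T>.ToArray() method), so I think it's OK to use it. But strictly speaking, on a Dictionary it is the Linq extension method Enumerable.ToArray(), so this would also fall under using Linq IMHO.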
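If even that .ToArray() call has to be avoided, one possible Linq-free alternative is to copy the entries into a List and sort that. This is only a sketch; the names WordCountSketch and SortByLengthDescending are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class WordCountSketch
{
    // Copies the dictionary entries into a list and sorts them by word length, longest first.
    public static List<KeyValuePair<string, int>> SortByLengthDescending(Dictionary<string, int> counts)
    {
        // List<T> has a constructor that accepts any IEnumerable<T>, so no Linq is needed.
        var entries = new List<KeyValuePair<string, int>>(counts);
        entries.Sort((x, y) => y.Key.Length.CompareTo(x.Key.Length));
        return entries;
    }
}
```

The resulting list can be walked with the same foreach loop as the array version.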
Now all that's left is working through your firstFileWordsWithCount array until you got 10 words that do not exist in the HashSet temp2. Something like:
int foundWords = 0;
foreach(KeyValuePair<string, int> candidate in firstFileWordsWithCount)
{
if (!temp2.Contains(candidate.Key))
{
Console.WriteLine($"{candidate.Key}: {candidate.Value}");
foundWords++;
}
if (foundWords >= 10)
{
break;
}
}
If anything is unclear, just ask.

This is what you'll get when using dictionaries:
string File1 = "AMD Intel Skylake Processors Graphics Cards Nvidia Architecture Microprocessor Skylake SandyBridge KabyLake";
string File2 = "Graphics Nvidia";
Dictionary<string, int> Dic = new Dictionary<string, int>();
string[] File1Array = File1.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
Array.Sort(File1Array, (s1, s2) => s2.Length.CompareTo(s1.Length));
foreach (string s in File1Array)
{
if (Dic.ContainsKey(s))
{
Dic[s]++;
}
else
{
Dic.Add(s, 1);
}
}
string[] File2Array = File2.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
foreach (string s in File2Array)
{
if (Dic.ContainsKey(s))
{
Dic.Remove(s);
}
}
int i = 0;
foreach (KeyValuePair<string, int> kvp in Dic)
{
i++;
Console.WriteLine(kvp.Key + " " + kvp.Value);
if (i == 10) // stop after 10 words have been printed
{
break;
}
}
My earlier attempt used LINQ, which is not allowed here; I had missed that.
string[] Results = File1.Split(" ".ToCharArray()).Except(File2.Split(" ".ToCharArray())).OrderByDescending(s => s.Length).Take(10).ToArray();
for (int i = 0; i < Results.Length; i++)
{
Console.WriteLine(Results[i] + " " + Regex.Matches(File1, Results[i]).Count);
}

Replace a Specific Text From a Text

I'm writing a chat helper tool for a game with a custom library.
I want to change specific variables when player sends the message.
This is my code
static List<string> asciis = new List<string> { "shrug", "omg" };
static List<string> converteds = new List<string> { @"¯\_(ツ)_/¯", @"◕_◕"};
private static void Game_OnInput(GameInputEventArgs args)
{
newtext = args.Input;
foreach (var ascii in asciis)
{
foreach (var converted in converteds)
{
if (args.Input.Contains(ascii))
{
newtext = args.Input.Replace(ascii, converted);
Game.Say(newtext);
}
}
}
}
As you can see, I'm trying to take the texts from "asciis" and convert them to the matching entries in "converteds" (in order).
Whenever I type something that is not in the "asciis" list it works perfectly. But whenever I type shrug it prints ¯\_(ツ)_/¯ + ◕_◕ + ◕_◕ (it prints the omg text 2 times). The same happens with omg.
You can probably tell that I'm a real beginner. I really don't understand what is wrong with this code...

It seems that your two lists have the same length (in terms of elements contained) and each element in one list has its replacement in the same position in the other list.
Then you could treat the two lists as two arrays and use a different way to search for the input term and replace it with the substitution text
private static void Game_OnInput(GameInputEventArgs args)
{
newtext = args.Input;
for (int x = 0; x < asciis.Count; x++)
if (args.Input.Contains(asciis[x]))
{
newtext = args.Input.Replace(asciis[x], converteds[x]);
Game.Say(newtext);
}
}
While I don't think there is a big improvement, you could also implement the same with a dictionary
static Dictionary<string, string> converter = new Dictionary<string, string>()
{
{"shrug", @"¯\_(ツ)_/¯"},
{"omg", @"◕_◕"}
};
private static void Game_OnInput(GameInputEventArgs args)
{
newtext = args.Input;
foreach(KeyValuePair<string, string> kvp in converter)
if (args.Input.Contains(kvp.Key))
{
newtext = args.Input.Replace(kvp.Key, kvp.Value);
Game.Say(newtext);
}
}
Well, it is probably a bit more readable, but we still need to traverse the dictionary keys one by one.

As Daniel pointed out in his comment, this is a good use case for dictionaries.
Have a dictionary that maps the text you want replaced to the stuff you want to be replaced with:
Dictionary<string, string> dict = new Dictionary<string, string>
{
{"shrug", @"¯\_(ツ)_/¯" },
{"omg", "◕_◕" }
}; // etc
Then find all occurrences of the keys from the dictionary and replace them with the corresponding values.
Also why are you using static methods and fields? I may be wrong, but I expect most, if not all of your other methods and fields are static as well. I strongly recommend avoiding getting used to them. Try learning more about OOP instead.

Your main problem is that you are always replacing on args.Input, but storing the results in newtext each time, overwriting your previous replacements. Your next problem is that you are outputting the result after each replacement attempt so that's why you are getting multiple weird output results.
I also suggest a dictionary since by definition, it is a mapping of one thing to another. Also, note my changes below, I have moved the Game.Say call outside of the loops and changed "args.Input.Replace" to "newtext.Replace"
static Dictionary<string, string> dictionary = new Dictionary<string, string>
{
{"shrug", @"¯\_(ツ)_/¯" },
{"omg", "◕_◕" }
};
private static void Game_OnInput(GameInputEventArgs args)
{
string newtext = args.Input;
foreach(string key in dictionary.Keys){
newtext = newtext.Replace(key,dictionary[key]);
}
Game.Say(newtext);
}

Manipulating Values in Dictionary

So I have a dictionary whose index is an int, and whose value is a class that contains a list of doubles, the class is built like this:
public class MyClass
{
public List<double> MyList = new List<double>();
}
and the dictionary is built like this:
public static Dictionary<int, MyClass> MyDictionary = new Dictionary<int, MyClass>();
I populate the dictionary by reading a file in line by line, and adding the pieces of the file into a splitstring, of which there is a known number of parts (100), then adding the pieces of the string into the list, and finally into the dictionary. Here's what that looks like:
public void DictionaryFiller()
{
string LineFromFile;
string[] splitstring;
int LineNumber = 0;
StreamReader sr = sr.ReadLine();
while (!sr.EndOfStream)
{
LineFromFile = sr.ReadLine();
splitstring = LineFromFile.Split(',');
MyClass newClass = new MyClass();
for (int i = 1; i < 100; i++)
{
newClass.MyList.Add(Convert.ToDouble(splitstring[i]));
}
MyDictionary.Add(LineNumber, MyClass);
LineNumber++;
}
}
My question is this: if I were to then read another file and run the DictionaryFiller method again, could I add its terms to each item in the list for each value in the dictionary? What I mean by that is, say the first file's 1st line started with 10,23,15,... Now, when I read in a second file, let's say its first line begins with 10,13,18,... What I'm looking to have happen is for the first 3 doubles in the dictionary's value-list (for key 0) to then become 20,36,33,...
I'd like to be able to add terms for any number of files read in, and ultimately take their average by going through the dictionary again (in a separate method) and dividing each term in the value-list by the number of files read in. Is this possible to do? Thanks for any advice you have; I'm a novice programmer and any help is appreciated.

Just replace
newClass.MyList.Add(Convert.ToDouble(splitstring[i]))
with
newClass.MyList.Add(Convert.ToDouble(splitstring[i]) + previous)
where previous is the value already stored for that line and position (0 if there is none yet), and then replace
MyDictionary.Add(LineNumber, MyClass)
with
MyDictionary[LineNumber] = newClass
Just make sure that an entry for LineNumber exists before reading from it :) Indexing a missing key throws a KeyNotFoundException rather than returning null, so use TryGetValue.
Something like this would work inside the for loop, before the Add (your loop starts at i = 1, so the list position is i - 1):
double previous = 0;
MyClass existing;
if (MyDictionary.TryGetValue(LineNumber, out existing) && i - 1 < existing.MyList.Count)
{
previous = existing.MyList[i - 1];
}
My solution does not care about list size, and it is done at reading time rather than afterward, which should be more efficient than traversing your dictionary twice.

var current = MyDictionary[key];
for(int i = 0; i < current.MyList.Count; i++)
{
current.MyList[i] = current.MyList[i] + newData[i];
}
This assumes both lists have the same length and type of data.
You can get the custom object by key of the dictionary and then use its list to do any operation. You need to keep track of how many files are read separately.
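Putting the pieces together, the running sums and the final average could look roughly like the sketch below. It is a simplification under stated assumptions: it stores a plain List<double> per line instead of the question's MyClass, takes the lines of a file as a parameter instead of using a StreamReader, parses every column starting at index 0, and the names RunningSums, AddFile and DivideAll are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class RunningSums
{
    // Adds one file's values, line by line, onto the sums collected so far.
    public static void AddFile(Dictionary<int, List<double>> sums, IEnumerable<string> lines)
    {
        int lineNumber = 0;
        foreach (string line in lines)
        {
            string[] parts = line.Split(',');
            List<double> row;
            if (!sums.TryGetValue(lineNumber, out row))
            {
                // First file that has this line: start an empty row of sums.
                row = new List<double>();
                sums.Add(lineNumber, row);
            }
            for (int i = 0; i < parts.Length; i++)
            {
                double value = double.Parse(parts[i], CultureInfo.InvariantCulture);
                if (i < row.Count)
                {
                    row[i] += value;   // position already exists: accumulate
                }
                else
                {
                    row.Add(value);    // first time this position is seen
                }
            }
            lineNumber++;
        }
    }

    // Divides every sum by the number of files read, turning sums into averages.
    public static void DivideAll(Dictionary<int, List<double>> sums, int fileCount)
    {
        foreach (List<double> row in sums.Values)
        {
            for (int i = 0; i < row.Count; i++)
            {
                row[i] /= fileCount;
            }
        }
    }
}
```

The caller keeps the file count itself: call AddFile once per file, increment a counter each time, and call DivideAll once at the end.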

Printing out Hashtable without using a loop in c#

I need to print a Hashtable out to a file.
Is it possible to do it without a for/foreach loop? Something like:
Hashtable myHash;
Logging.traceMessage(Datetime.now, myhash) // I want to have just a single datatime entry in my traceMessage.
My expected output is something like:
'hashtable'
[KEY] - [VALUE]
pizzas - one
costumer - three
'TraceFile'
Datetime: pizzas - one , costumer - three
If I use a foreach loop for printing out the hash I will get a datetime in every key and value pair.

I would recommend the following. It should be fairly optimized for your needs.
Hashtable mhash = new Hashtable();
var sb = new StringBuilder();
foreach (DictionaryEntry entry in mhash)
{
sb.AppendLine(entry.Key + " - " + entry.Value); // Note: format each entry however you want here
}
Logging.traceMessage(DateTime.Now, sb.ToString());

You need to iterate over the entries at some point, but for your purpose the extension method below may help.
Define your extension method in a static class in a library (let's say in a namespace myExtensions):
public static string EntriesCSV(this Hashtable ht)
{
string ls_return = "";
foreach (DictionaryEntry pair in ht)
ls_return += String.Format("{0}={1};\r\n", pair.Key, pair.Value); // a very simple format; adapt it to fit your needs
return ls_return;
}
Reference your extensions library:
using myExtensions;
Enjoy the simpler code :)
Hashtable myHash;
Logging.traceMessage(DateTime.Now, myHash.EntriesCSV());
Hope this helps
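If the goal is only to avoid writing the loop yourself, a Linq-based one-liner is another option; the iteration still happens, just inside string.Join. This is a sketch only: the names HashtableText and Describe are made up, and the " - " and " , " separators simply follow the expected output shown in the question. Note that a Hashtable does not guarantee any particular order of its entries.

```csharp
using System;
using System.Collections;
using System.Linq;

public static class HashtableText
{
    // Builds "key - value , key - value" so the caller can log it in a single call.
    public static string Describe(Hashtable table)
    {
        // Hashtable is non-generic, so Cast<DictionaryEntry>() is needed before Select.
        return string.Join(" , ",
            table.Cast<DictionaryEntry>().Select(e => e.Key + " - " + e.Value));
    }
}
```

It would be used as Logging.traceMessage(DateTime.Now, HashtableText.Describe(myHash)), which produces a single trace entry with one timestamp.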
