I am new to C# and I ran into the following problem (I have looked for a solution here and on google but was not successful):
Given an array of strings (some columns can possibly be doubles or integers "in string format") I would like to convert this array to an integer array.
The question only concerns the columns with actual string values (say a list of countries).
Now I believe a Dictionary can help me to identify the unique values in a given column and associate an integer number to every country that appears.
Then to create my new array which should be of type int (or double) I could loop through the whole array and define the new array via the dictionary. This I would need to do for every column which has string values.
This seems inefficient, is there a better way?
In the end I would like to do multiple linear regression (or even fit a generalized linear model, meaning I want to get a design matrix eventually) with the data.
EDIT:
1) Sorry for being unclear, I will try to clarify:
Given:
MAKE;VALUE;GENDER
AUDI;40912.2;m
VW;3332;f
AUDI;1234.99;m
DACIA;0;m
AUDI;12354.2;m
AUDI;123;m
VW;21321.2;f
I want to get a "numerical" matrix with identifiers for the string-valued columns:
MAKE;VALUE;GENDER
1;40912.2;0
2;3332;1
1;1234.99;0
3;0;0
1;12354.2;0
1;123;0
2;21321.2;1
2) I think this is actually not what I need to solve my problem. Still it does seem like an interesting question.
3) Thank you for the responses so far.
This takes every string that represents an integer and puts it in a List.
You can do the same with strings which represent a double.
Is this what you mean?
List<int> myIntList = new List<int>();
foreach (string value in stringArray)
{
int myInt;
// TryParse returns false instead of throwing when value is not an integer
if (int.TryParse(value, out myInt))
{
myIntList.Add(myInt);
}
}
Dictionary is good if you want to map each string to a key like this:
var myDictionary = new Dictionary<int,string>();
myDictionary.Add(1,"CountryOne");
myDictionary.Add(2,"CountryTwo");
myDictionary.Add(3,"CountryThree");
Then you can get your values like:
string myCountry = myDictionary[2];
But I'm still not sure whether this helps. Do you have some code that shows what you mean?
I'm not sure if this is what you are looking for but it does output the result you are looking for, from which you can create an appropriate data structure to use. I use a list of string but you can use something else to hold the processed data. I can expand further, if needed.
It assumes that the number of "columns" (split on the semicolon character) is the same in every row, and it handles any number of columns. It's kind of ugly, but it should get you what you want.
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApplication3
{
class StringColIndex
{
public int ColIndex { get; set; }
public List<string> StringValues {get;set;}
}
class Program
{
static void Main(string[] args)
{
var StringRepresentationAsInt = new List<StringColIndex>();
List<string> rawDataList = new List<string>();
List<string> rawDataWithStringsAsIdsList = new List<string>();
rawDataList.Add("AUDI;40912.2;m");rawDataList.Add("VW;3332;f ");
rawDataList.Add("AUDI;1234.99;m");rawDataList.Add("DACIA;0;m");
rawDataList.Add("AUDI;12354.2;m");rawDataList.Add("AUDI;123;m");
rawDataList.Add("VW;21321.2;f ");
foreach(var rawData in rawDataList)
{
var split = rawData.Split(';');
var line = string.Empty;
for(int i= 0; i < split.Length; i++)
{
double outValue;
var isNumberic = Double.TryParse(split[i], out outValue);
var txt = split[i];
if (!isNumberic)
{
if(StringRepresentationAsInt
.Where(x => x.ColIndex == i).Count() == 0)
{
StringRepresentationAsInt.Add(
new StringColIndex { ColIndex = i,
StringValues = new List<string> { txt } });
}
var obj = StringRepresentationAsInt
.First(x => x.ColIndex == i);
if (!obj.StringValues.Contains(txt)){
obj.StringValues.Add(txt);
}
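// Prefix a separator for every column except the first.
line += (string.IsNullOrEmpty(line) ? string.Empty : ";")
+ (obj.StringValues.IndexOf(txt) + 1).ToString();
}
else
{
line += (string.IsNullOrEmpty(line) ? string.Empty : ";") + split[i];
}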
}
rawDataWithStringsAsIdsList.Add(line);
}
rawDataWithStringsAsIdsList.ForEach(x => Console.WriteLine(x));
Console.ReadLine();
/*
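Desired output from the question (this code numbers every string column from 1, so GENDER is printed as 1/2 instead of 0/1):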
1;40912.2;0
2;3332;1
1;1234.99;0
3;0;0
1;12354.2;0
1;123;0
2;21321.2;1
*/
}
}
}
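The dictionary idea from the question can also be sketched directly: keep one Dictionary<string, int> per column and hand out the next unused code the first time a value is seen, so each row is processed once. This is only a sketch under assumptions: the names ColumnEncoder and Encode are made up for illustration, codes start at 0 in every column (the question's example starts MAKE at 1), and numbers are parsed with the invariant culture so that "40912.2" does not depend on the machine's locale.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of any library.
static class ColumnEncoder
{
    // Converts separator-delimited rows into a numeric matrix.
    // Numeric cells are parsed as doubles; every other cell gets an integer
    // code (0, 1, 2, ...) in order of first appearance within its column.
    public static double[][] Encode(IList<string> rows, char separator = ';')
    {
        // One lookup table per column index, created on first use.
        var codesByColumn = new Dictionary<int, Dictionary<string, int>>();
        var result = new double[rows.Count][];

        for (int r = 0; r < rows.Count; r++)
        {
            string[] cells = rows[r].Split(separator);
            result[r] = new double[cells.Length];

            for (int c = 0; c < cells.Length; c++)
            {
                string cell = cells[c].Trim();
                double number;
                if (double.TryParse(cell, NumberStyles.Float,
                                    CultureInfo.InvariantCulture, out number))
                {
                    result[r][c] = number;
                    continue;
                }

                Dictionary<string, int> codes;
                if (!codesByColumn.TryGetValue(c, out codes))
                {
                    codes = new Dictionary<string, int>();
                    codesByColumn[c] = codes;
                }

                int code;
                if (!codes.TryGetValue(cell, out code))
                {
                    code = codes.Count; // next unused code for this column
                    codes[cell] = code;
                }
                result[r][c] = code;
            }
        }
        return result;
    }
}

class Program
{
    static void Main()
    {
        var rows = new List<string>
        {
            "AUDI;40912.2;m", "VW;3332;f", "AUDI;1234.99;m", "DACIA;0;m"
        };
        foreach (double[] row in ColumnEncoder.Encode(rows))
        {
            Console.WriteLine(string.Join(";", row));
        }
    }
}
```

Keep in mind that integer codes imply an ordering between categories, so for a regression design matrix the coded columns usually still need to be expanded into indicator (dummy) columns.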
I am quite new to C# and I have googled for an answer. The closest answer I found was this one, but it doesn't help me.
I am trying to write a function that finds the biggest number in a string using loops and splicing only. For some reason, when the condition is met, the local variable big won't mutate in the if statements. I have tried to debug it by setting big = 34 when I hit a space, but even then it won't mutate the local variable.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace parser
{
class Sub_parser
{
// function to find the greatest number in a string
public int Greatest(string uinput)
{
int len = uinput.Length;
string con1 = "";
int big = 0;
int m = 0;
// looping through the string
for (int i = 0; i < len; i++)
{
// find all the numbers
if (char.IsDigit(uinput[i]))
{
con1 = con1 + uinput[i];
}
// if we hit a space then we check the number value
else if (uinput[i].Equals(" "))
{
if (con1 != "")
{
m = int.Parse(con1);
Console.WriteLine(m);
if (m > big)
{
big = m;
}
}
con1 = "";
}
}
return big;
}
public static void Main(string[] args)
{
while (true)
{
string u_input = Console.ReadLine();
Sub_parser sp = new Sub_parser();
Console.WriteLine(sp.Greatest(u_input));
}
}
}
}
The problem comes from your check in this statement :
else if (uinput[i].Equals(" "))
uinput[i] is a char, while " " is a string, so Equals always returns false (see this example).
If you replace the double quotes with single quotes, it works fine:
else if (uinput[i].Equals(' '))
And, as stated in the comments, the last number will never be checked unless your input string ends with a space. This leaves you with two options:
Check the value of con1 once more after the loop (which is not very pretty).
Rewrite your method, because you are overdoing things a bit; don't reinvent the wheel. You can do something like this (using System.Linq):
public int BiggestNumberInString(string input)
{
return input.Split(null).Max(x => int.Parse(x));
}
Only do this if you are sure of your input: int.Parse throws on anything that is not a number.
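If the input cannot be trusted, a variant of the same split-based idea can skip anything that does not parse instead of throwing. The following is a minimal sketch, not the answer's original code: NumberFinder is a made-up name, and returning null when no number is found is just one possible convention.

```csharp
using System;

// Hypothetical helper for illustration.
static class NumberFinder
{
    // Returns the largest integer among the whitespace-separated tokens of
    // input, or null when no token parses as an integer.
    public static int? Greatest(string input)
    {
        int? best = null;
        char[] whitespace = null; // a null separator array means "split on whitespace"
        foreach (string token in input.Split(whitespace, StringSplitOptions.RemoveEmptyEntries))
        {
            int value;
            if (int.TryParse(token, out value) && (best == null || value > best.Value))
            {
                best = value;
            }
        }
        return best;
    }
}

class Program
{
    static void Main()
    {
        // Non-numeric tokens and repeated spaces are ignored.
        Console.WriteLine(NumberFinder.Greatest("12 abc 7  45")); // prints 45
    }
}
```

Because the last token is handled like every other token, the "last number is never checked" problem from the original loop does not arise here.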
If you type a single number and press Enter, the input contains the number and no trailing space.
So you have uinput = "34".
Inside the loop, m > big is only checked when uinput[i].Equals(" "), which never happens.
In general, if you read a line of numbers separated by spaces, the last number is ignored.
One solution would be to append a " " to uinput, but I recommend splitting instead:
string[] numbers = uinput.Split(null);
Then iterate over the array.
Also, as said in another answer, compare with uinput[i].Equals(' '): " " is a string, and you were comparing a char with a string.
As Martin Verjans mentioned, to make your code work you have to change it as in his example.
There is still a problem if you input a single number, though: the output would then be 0.
I would go with this method (it needs using System.Collections.Generic and System.Linq):
public static int Greatest(string uinput)
{
List<int> numbers = new List<int>();
foreach(string str in uinput.Split(' '))
{
numbers.Add(int.Parse(str));
}
return numbers.Max();
}
I have a c# class that looks like this:
public class MemberData
{
public int meme_ck;
public string meme_name;
public bool meme_active;
public MemberData(int ck2, string name2, bool active2)
{
meme_ck = ck2;
meme_name = name2;
meme_active = active2;
}
}
I have made two arrays out of that class:
private MemberData[] memarray1 = new MemberData[10000];
private MemberData[] memarray2 = new MemberData[10000];
Over the course of my application I do a bunch of stuff with these two arrays and values change, etc. A member's name or active status may change, which results in the arrays becoming different.
Eventually I need to compare them in order to do things to the other one based on what results are kicked out in the first one.
For example, member is de-activated in the first array based on something application does, I need to update array 2 to de-activate that same member.
I am trying to use some database design philosophy with the int CK (contrived key) to be able to rapidly look up the entry in the other array based on the CK.
Since I can't figure it out I've had to resort to using nested for loops like this, which sucks:
foreach (MemberData md in memarray1)
{
foreach (MemberData md2 in memarray2)
{
if (md.meme_ck == md2.meme_ck)
{
//de-activate member
}
}
}
Is there a better way to do this? I just want to find the index in the second array based on CK when I have the CK value from the first array.
Any other tips or advice you have about structure would be appreciated as well. Should I be using something other than arrays? How would I accomplish this same thing with Lists?
Thanks!
Should I be using something other than arrays?
Yes. Don't use arrays; they are seldom the right data structure to use.
How would I accomplish this same thing with Lists?
Lists are only marginally better. They don't support an efficient lookup-by-key operation which is what you need.
It sounds like what you want is instead of two arrays, two Dictionary<int, MemberData> where the key is the ck.
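As a rough illustration of that suggestion, here is a sketch that keeps both collections as Dictionary<int, MemberData> keyed by meme_ck and copies the active flag across with a single lookup per member instead of a nested loop. MemberSync and SyncActiveFlags are names invented for this example; the MemberData class is the one from the question.

```csharp
using System;
using System.Collections.Generic;

public class MemberData
{
    public int meme_ck;
    public string meme_name;
    public bool meme_active;

    public MemberData(int ck2, string name2, bool active2)
    {
        meme_ck = ck2;
        meme_name = name2;
        meme_active = active2;
    }
}

// Hypothetical helper for illustration.
static class MemberSync
{
    // Copies the active flag of every member in source to the member with
    // the same key in target. Keys that are missing from target are skipped.
    public static void SyncActiveFlags(
        Dictionary<int, MemberData> source,
        Dictionary<int, MemberData> target)
    {
        foreach (KeyValuePair<int, MemberData> entry in source)
        {
            MemberData match;
            if (target.TryGetValue(entry.Key, out match)) // hash lookup, no inner loop
            {
                match.meme_active = entry.Value.meme_active;
            }
        }
    }
}

class Program
{
    static void Main()
    {
        var first = new Dictionary<int, MemberData>();
        var second = new Dictionary<int, MemberData>();
        for (int ck = 1; ck <= 3; ck++)
        {
            first[ck] = new MemberData(ck, "Member" + ck, true);
            second[ck] = new MemberData(ck, "Member" + ck, true);
        }

        first[2].meme_active = false;             // something deactivates member 2
        MemberSync.SyncActiveFlags(first, second);
        Console.WriteLine(second[2].meme_active); // prints False
    }
}
```

If you only have the CK of one changed member, a single second.TryGetValue(ck, out member) call is enough; the loop above is for bringing the whole second collection up to date.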
I totally agree with Eric Lippert's answer above. It is better not to use an array.
The same thing can be achieved using List<MemberData>, and you can use LINQ to query your data structure.
The following is one way to achieve your result using arrays (it needs using System and System.Linq):
class Program
{
static MemberData[] memarray1 = new MemberData[10000];
static MemberData[] memarray2 = new MemberData[10000];
static void Main(string[] args)
{
for (int i = 0; i < memarray1.Length; i++)
{
// Parentheses are needed: "MemName" + i + 1 would append i and then the digit 1.
memarray1[i] = new MemberData(i + 1, "MemName" + (i + 1), true);
memarray2[i] = new MemberData(i + 1, "MemName" + (i + 1), true);
}
// SIMULATING YOUR APP OPERATION OF CHANGING A RANDOM ARRAY VALUE IN memarray1
int tempIndex = new Random().Next(0, memarray1.Length); // upper bound is exclusive
memarray1[tempIndex].meme_name = "ChangedName";
memarray1[tempIndex].meme_active = false;
// FOR CLARITY, COPY meme_ck INTO AN INTEGER VARIABLE
int ck_in_mem1 = memarray1[tempIndex].meme_ck;
//FINDING ITEM IN ARRAY2
MemberData tempData = memarray2.Where(val => val.meme_ck == ck_in_mem1).FirstOrDefault();
// THIS IS YOUR ITEM.
Console.ReadLine();
}
}
I have a text file full of strings, one on each line. Some of these strings will contain an unknown number of "#" characters. Each "#" can represent the numbers 1, 2, 3, or 4. I want to generate all possible combinations (permutations?) of strings for each of those "#"s. If there were a set number of "#"s per string, I'd just use nested for loops (quick and dirty). I need help finding a more elegant way to do it with an unknown number of "#"s.
Example 1: Input string is a#bc
Output strings would be:
a1bc
a2bc
a3bc
a4bc
Example 2: Input string is a#bc#d
Output strings would be:
a1bc1d
a1bc2d
a1bc3d
a1bc4d
a2bc1d
a2bc2d
a2bc3d
...
a4bc3d
a4bc4d
Can anyone help with this one? I'm using C#.
This is actually a fairly good place for a recursive function. I don't write C#, but I would create a function List<String> expand(String str) which accepts a string and returns a list containing the expanded strings.
expand can search the string for the first # and create a list containing the part of the string before it plus each possible replacement. Then it can call expand on the rest of the string and append each element of that expansion to each element of the first list.
Example implementation using Java ArrayLists:
ArrayList<String> expand(String str) {
/* Find the first "#" */
int i = str.indexOf("#");
ArrayList<String> expansion = new ArrayList<String>(4);
/* If the string doesn't have any "#" */
if(i < 0) {
expansion.add(str);
return expansion;
}
/* New list to hold the result */
ArrayList<String> result = new ArrayList<String>();
/* Expand the "#" */
for(int j = 1; j <= 4; j++)
expansion.add(str.substring(0, i) + j); /* substring's end index is exclusive */
/* Combine every expansion with every suffix expansion */
ArrayList<String> suffixes = expand(str.substring(i+1));
for(String b : expansion)
for(String a : suffixes)
result.add(b + a);
return result;
}
I offer a minimalist approach to the problem at hand.
Yes, as others have said, recursion is the way to go here.
Recursion is a perfect fit because we can solve the problem for a small part of the input, start over with the rest until we are done, and merge the results.
Every recursion must have a stop condition, meaning no more recursion is needed.
Here my stop condition is that there are no more "#" in the string.
I'm using a string as my set of values ("1234") since it is an IEnumerable<char>.
All the other solutions here are great; I just wanted to show you a short approach.
internal static IEnumerable<string> GetStrings(string input)
{
var values = "1234";
var permutations = new List<string>();
var index = input.IndexOf('#');
if (index == -1) return new []{ input };
for (int i = 0; i < values.Length; i++)
{
var newInput = input.Substring(0, index) + values[i] + input.Substring(index + 1);
permutations.AddRange(GetStrings(newInput));
}
return permutations;
}
An even shorter and cleaner approach with LINQ:
internal static IEnumerable<string> GetStrings(string input)
{
var values = "1234";
var index = input.IndexOf('#');
if (index == -1) return new []{ input };
return
values
.Select(ReplaceFirstWildCardWithValue)
.SelectMany(GetStrings);
string ReplaceFirstWildCardWithValue(char value) => input.Substring(0, index) + value + input.Substring(index + 1);
}
This is shouting out loud for a recursive solution.
First, let's make a method that generates all combinations of a certain length from a given set of values. Because we are only interested in generating strings, let's take advantage of the fact that string is immutable (see P.D.2); this makes recursive functions so much easier to implement and reason about:
static IEnumerable<string> GetAllCombinations<T>(
IEnumerable<T> set, int length)
{
IEnumerable<string> getCombinations(string current)
{
if (current.Length == length)
{
yield return current;
}
else
{
foreach (var s in set)
{
foreach (var c in getCombinations(current + s))
{
yield return c;
}
}
}
}
return getCombinations(string.Empty);
}
Study carefully how this method works. Work through it by hand for small examples to understand it.
Now, once we know how to generate all possible combinations, building the strings is easy:
Figure out the number of wildcards in the specified string: this will be our combination length.
For every combination, insert in order each character into the string where we encounter a wildcard.
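OK, let's do just that (GenerateCombinations is an extension method, so it must live in a static class, with using System.Linq and System.Text):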
public static IEnumerable<string> GenerateCombinations<T>(
this string s,
IEnumerable<T> set,
char wildcard)
{
var length = s.Count(c => c == wildcard);
var combinations = GetAllCombinations(set, length);
var builder = new StringBuilder();
foreach (var combination in combinations)
{
var index = 0;
foreach (var c in s)
{
if (c == wildcard)
{
builder.Append(combination[index]);
index += 1;
}
else
{
builder.Append(c);
}
}
yield return builder.ToString();
builder.Clear();
}
}
And we're done. Usage would be:
var set = new HashSet<int>(new[] { 1, 2, 3, 4 });
Console.WriteLine(
string.Join("; ", "a#bc#d".GenerateCombinations(set, '#')));
And sure enough, the output is:
a1bc1d; a1bc2d; a1bc3d; a1bc4d; a2bc1d; a2bc2d; a2bc3d;
a2bc4d; a3bc1d; a3bc2d; a3bc3d; a3bc4d; a4bc1d; a4bc2d;
a4bc3d; a4bc4d
Is this the most performant or efficient implementation? Probably not, but it's readable and maintainable. Unless you have a specific performance goal you are not meeting, write code that works and is easy to understand.
P.D. I’ve omitted all error handling and argument validation.
P.D.2: if the length of the combinations is big, concatenating strings inside GetAllCombinations might not be a good idea. In that case I’d have GetAllCombinations return an IEnumerable<IEnumerable<T>>, implement a trivial ImmutableStack<T>, and use that as the combination buffer instead of string.
Context: I am designing a leaderboard and want the data displayed from the highest score to the lowest. The problem is that the data contains strings and integers, which are imported from a text file. Currently I sort the data with the OrderByDescending function, but it does not sort numerically. E.g. 11,4,5,8,23,65 when ordered = 8,65,5,4,23,11 (sorted alphanumerically).
The list contains the data: name(string), difficulty(string) and score(int) and I wish to sort the data in a descending order: E.g 1st = 10, 2nd = 9 etc.
List<string> leaderboardList = new List<string>();
StreamReader srUserData = new StreamReader(@"User Leaderboard.txt");
while ((userDataLine = srUserData.ReadLine()) != null)
{
leaderboardList.Add(userDataLine);
}
leaderboardList = leaderboardList.OrderByDescending(x => Regex.Match(x, @"\d+").Value).ToList();
The Regex.Match finds the number in the string.
Basically the final line is the line that needs amending.
All help welcome, thanks.
Edit: The data should be outputted in the form Name, difficulty, score and sort the data in a descending order with the highest score.
Just change:
leaderboardList = leaderboardList.OrderByDescending(x =>
Regex.Match(x, @"\d+").Value).ToList();
To:
leaderboardList = leaderboardList.OrderByDescending(x =>
Int32.Parse(Regex.Match(x, @"\d+").Value)).ToList();
All you really need to add is the Int32.Parse. A caveat though: if you pass a string that is not a number, Int32.Parse() throws an exception. If that is a possibility you need to handle, use Int32.TryParse instead.
Example:
int testValue = 0; // only used as the out argument for TryParse()
leaderboardList = leaderboardList.OrderByDescending(x =>
Int32.TryParse(Regex.Match(x, @"\d+").Value, out testValue)
? testValue : 0).ToList();
Tested using the following example list:
List<string> leaderboardList = new List<string>();
leaderboardList.Add("Brandon|Easy|9");
leaderboardList.Add("Yoda|Impossible|9001");
leaderboardList.Add("Barney|Easy|-1");
leaderboardList.Add("John|Normal|500");
This code works correctly, and the TryParse code was not needed.
Output:
Yoda|Impossible|9001
John|Normal|500
Brandon|Easy|9
Barney|Easy|-1
I suggest a completely different approach from the others. Load your string from disk, then parse it into a class you have defined. For example:
public class LeaderboardRow
{
public string Name { get; set; }
public string Difficulty { get; set; }
public int Score { get; set; }
}
Then your code would look more like this:
List<LeaderboardRow> leaderboardList = new List<LeaderboardRow>();
StreamReader srUserData = new StreamReader(@"User Leaderboard.txt");
while ((userDataLine = srUserData.ReadLine()) != null)
{
//Put logic here that parses your string row into 3 distinct values
leaderboardList.Add(new LeaderboardRow()
{
Score = 0, //put real value here
Name = string.Empty, //put real value here
Difficulty = string.Empty //put real value here
});
}
Then any ordering you need to do is a simple LINQ statement:
leaderboardList = leaderboardList.OrderByDescending(x => x.Score).ToList();
Depending on your scenario and requirements, you could store this data as JSON instead, which could speed up and simplify your application.
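The "parse your string row into 3 distinct values" step could look roughly like the sketch below. The question never states the file's delimiter, so the '|' separator here is an assumption (borrowed from the sample data in another answer), and LeaderboardParser and TryParseRow are names invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class LeaderboardRow
{
    public string Name { get; set; }
    public string Difficulty { get; set; }
    public int Score { get; set; }
}

// Hypothetical helper for illustration.
static class LeaderboardParser
{
    // Tries to turn "Name|Difficulty|Score" into a LeaderboardRow.
    // Returns false for lines with the wrong field count or a non-integer score.
    public static bool TryParseRow(string line, out LeaderboardRow row)
    {
        row = null;
        string[] parts = line.Split('|'); // assumed delimiter
        if (parts.Length != 3)
        {
            return false;
        }

        int score;
        if (!int.TryParse(parts[2].Trim(), out score))
        {
            return false;
        }

        row = new LeaderboardRow
        {
            Name = parts[0].Trim(),
            Difficulty = parts[1].Trim(),
            Score = score
        };
        return true;
    }
}

class Program
{
    static void Main()
    {
        string[] lines =
        {
            "Brandon|Easy|9", "Yoda|Impossible|9001", "not a valid row", "John|Normal|500"
        };

        var rows = new List<LeaderboardRow>();
        foreach (string line in lines)
        {
            LeaderboardRow row;
            if (LeaderboardParser.TryParseRow(line, out row))
            {
                rows.Add(row); // malformed lines are silently skipped
            }
        }

        foreach (LeaderboardRow row in rows.OrderByDescending(r => r.Score))
        {
            Console.WriteLine("{0}, {1}, {2}", row.Name, row.Difficulty, row.Score);
        }
    }
}
```

Parsing the score field directly also handles negative scores, which a \d+ pattern would read without the minus sign.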
You could go old school and use Int32.TryParse or a similar function. Once you have done that, you can add the parsed values to a list of integers.
int number;
bool result = Int32.TryParse(value, out number);
if (result) {
Console.WriteLine("Converted '{0}' to {1}.", value, number);
}
You can use this for ordering list of numbers in string or list of string that contains numbers.
The Alphanum Algorithm
I hope this works for you.
You missed one thing: you are not converting the Regex value to an int. Please check the code below; it works for me.
CODE:
List<string> leaderboardList = new List<string>();
leaderboardList.Add("A,A,9");
leaderboardList.Add("B,B,8");
leaderboardList.Add("G,C,65");
leaderboardList.Add("S,B,10");
leaderboardList = leaderboardList.OrderByDescending(x =>Convert.ToInt32((string.IsNullOrEmpty(Regex.Match(x, @"\d+").Value)?"0":Regex.Match(x, @"\d+").Value))).ToList();
foreach (var item in leaderboardList)
{
Console.WriteLine(item);
}
Online output: .Net Fiddle Output
So I have a dictionary whose index is an int, and whose value is a class that contains a list of doubles, the class is built like this:
public class MyClass
{
public List<double> MyList = new List<double>();
}
and the dictionary is built like this:
public static Dictionary<int, MyClass> MyDictionary = new Dictionary<int, MyClass>();
I populate the dictionary by reading a file in line by line, and adding the pieces of the file into a splitstring, of which there is a known number of parts (100), then adding the pieces of the string into the list, and finally into the dictionary. Here's what that looks like:
public void DictionaryFiller()
{
string LineFromFile;
string[] splitstring;
int LineNumber = 0;
StreamReader sr = sr.ReadLine();
while (!sr.EndOfStream)
{
LineFromFile = sr.ReadLine();
splitstring = LineFromFile.Split(',');
MyClass newClass = new MyClass();
for (int i = 1; i < 100; i++)
{
newClass.MyList.Add(Convert.ToDouble(splitstring[i]));
}
MyDictionary.Add(LineNumber, newClass);
LineNumber++;
}
}
My question is this: if I were to then read another file and run the DictionaryFiller method again, could I add terms to each item in the list for each value in the dictionary? What I mean by that is, say the first file's 1st line started with 10,23,15,... Now, when I read in a second file, let's say its first line begins with 10,13,18,... What I'm looking to have happen is for the first 3 doubles in the dictionary's value-list (indexed at 0) to then become 20,36,33,...
I'd like to be able to add terms for any number of files read in, and ultimately take their average by going through the dictionary again (in a separate method) and dividing each term in the value-list by the number of files read in. Is this possible to do? Thanks for any advice you have; I'm a novice programmer and any help is appreciated.
Just replace
newClass.MyList.Add(Convert.ToDouble(splitstring[i]))
with
newClass.MyList.Add(Convert.ToDouble(splitstring[i]) + previous)
where previous is the value already stored for that line and position (the loop starts at i = 1, so the list index is i - 1), and then replace the MyDictionary.Add(...) call
with
MyDictionary[LineNumber] = newClass
Just make sure that MyDictionary actually contains LineNumber before you read from it :)
Something like this would work:
double previous = 0;
MyClass existing;
if (MyDictionary.TryGetValue(LineNumber, out existing) && i - 1 < existing.MyList.Count)
{
previous = existing.MyList[i - 1];
}
My solution does not care about list size, and it is done at reading time rather than afterward, which should be more efficient than traversing your Dictionary twice.
var current = MyDictionary[key];
for(int i = 0; i < current.MyList.Count; i++)
{
current.MyList[i] = current.MyList[i] + newData[i];
}
This assumes both lists have the same length and the same type of data.
You can get the custom object by key of the dictionary and then use its list to do any operation. You need to keep track of how many files are read separately.
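Putting these pieces together, the accumulate-then-average idea might be sketched as follows. To keep the example short it stores a plain List<double> per line number instead of the question's MyClass wrapper, takes already-read lines instead of a StreamReader, and uses the made-up names RunningTotals, AddLines and DivideAll; all of these are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration.
static class RunningTotals
{
    // Adds the comma-separated values of each line to the running totals
    // stored under that line's number, creating or growing lists as needed.
    public static void AddLines(Dictionary<int, List<double>> totals, IEnumerable<string> lines)
    {
        int lineNumber = 0;
        foreach (string line in lines)
        {
            List<double> sums;
            if (!totals.TryGetValue(lineNumber, out sums))
            {
                sums = new List<double>();
                totals[lineNumber] = sums;
            }

            string[] parts = line.Split(',');
            for (int i = 0; i < parts.Length; i++)
            {
                double value = double.Parse(parts[i], CultureInfo.InvariantCulture);
                if (i < sums.Count)
                {
                    sums[i] += value; // position already seen in an earlier file
                }
                else
                {
                    sums.Add(value);  // first file, or a longer line than before
                }
            }
            lineNumber++;
        }
    }

    // Divides every accumulated sum by the number of files read.
    public static void DivideAll(Dictionary<int, List<double>> totals, int fileCount)
    {
        foreach (List<double> sums in totals.Values)
        {
            for (int i = 0; i < sums.Count; i++)
            {
                sums[i] /= fileCount;
            }
        }
    }
}

class Program
{
    static void Main()
    {
        var totals = new Dictionary<int, List<double>>();
        RunningTotals.AddLines(totals, new[] { "10,23,15" }); // first "file"
        RunningTotals.AddLines(totals, new[] { "10,13,18" }); // second "file"
        // totals[0] now holds the sums 20, 36, 33 from the question's example.

        RunningTotals.DivideAll(totals, 2);                   // two files were read
        Console.WriteLine(string.Join(" ", totals[0]));
    }
}
```

The file count is passed in explicitly because, as noted above, the dictionary itself does not know how many files contributed to each sum.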