OK, so the title isn't really the best. But here's my problem:
I've made a little program that writes a few names and an integer to a .txt document when a certain event occurs in an external program.
The thing is that a name can show up on several lines in the document, so I want to sum the integers for each specific person, so that I get the total number of points for him/her, and then sort the result.
For example:
The original lines:
Aaaa Aaa 5
Bbbb Bbb 7
Cccc Ccc 2
Aaaa Aaa 4
Cccc Ccc 4
Bbbb Bbb 1
Dddd Ddd 1
The output I want:
1. Aaaa Aaa 9
2. Bbbb Bbb 8
3. Cccc Ccc 6
4. Dddd Ddd 1
Is there any way to do this in C#?
I've tried to read in every single line in the file and search for the name of a person. But that doesn't really help and I don't know how to solve this.
Any advice?
Obviously, the text acts as a key for summing up the numbers, so why not use a Dictionary<string, int> to sum up first and write later?
Example:
Dictionary<string, int> sums = new Dictionary<string, int>();
...
if (sums.ContainsKey(theNewString))
sums[theNewString] += theNewNumber;
else
sums[theNewString] = theNewNumber;
And when you know you're done, write the file. You can also re-write the file after every update of the dictionary, but please remember that the dictionary will grow and grow if you don't purge it.
Also: This won't work if the program is restarted, unless you create a new file every time the program starts. Otherwise you'd have to read an existing file into the dictionary when the program starts to continue summing up.
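A minimal sketch of how the dictionary approach above could be combined with the sorting step the question asks for. It assumes the integer is the last token on each line and everything before it is the name; ScoreSummary and Summarize are hypothetical names, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScoreSummary
{
    // Sums the trailing integer of each line per name, then ranks by total (highest first).
    public static List<string> Summarize(IEnumerable<string> lines)
    {
        var sums = new Dictionary<string, int>();
        foreach (var rawLine in lines)
        {
            string line = rawLine.Trim();
            int lastSpace = line.LastIndexOf(' ');
            if (lastSpace <= 0) continue;                 // skip blank or malformed lines

            string name = line.Substring(0, lastSpace).Trim();
            if (!int.TryParse(line.Substring(lastSpace + 1), out int points)) continue;

            sums.TryGetValue(name, out int current);      // current is 0 for a name not seen yet
            sums[name] = current + points;
        }

        return sums
            .OrderByDescending(pair => pair.Value)        // highest total first
            .ThenBy(pair => pair.Key)                     // ties broken by name
            .Select((pair, index) => $"{index + 1}. {pair.Key} {pair.Value}")
            .ToList();
    }
}
```

It could be called as ScoreSummary.Summarize(File.ReadLines(path)), which re-reads the whole file each time and so also covers the restart case mentioned above.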
This Linq query returns the desired result as IEnumerable<string>:
IEnumerable<string> lineGroups = File.ReadLines(path)
.Select((l, i) => new { Line = l, Parts = l.Split() })
.Select(x => new
{
Number = x.Parts.ElementAtOrDefault(2).TryGetInt() ?? 1,
Col1 = x.Parts.ElementAtOrDefault(0),
Col2 = x.Parts.ElementAtOrDefault(1),
x.Line,
x.Parts
})
.GroupBy(x =>new { x.Col1, x.Col2 })
.Select((g, groupIndex) =>
string.Format("{0}. {1} {2} {3}",
groupIndex + 1, g.Key.Col1, g.Key.Col2, g.Sum(x => x.Number)));
output:
foreach (var grp in lineGroups)
Console.WriteLine(grp);
This is the output:
1. Aaaa Aaa 9
2. Bbbb Bbb 8
3. Cccc Ccc 6
4. Dddd Ddd 1
These are the extension methods that I use in LINQ queries to try-parse a string to a common value type such as int (as above). They return a nullable type, which is null if the string could not be parsed:
public static class NumericExtensions
{
public static bool IsBetween(this int value, int fromInclusive, int toInclusive)
{
return value >= fromInclusive && value <= toInclusive;
}
public static Decimal? TryGetDecimal(this string item)
{
Decimal d;
bool success = Decimal.TryParse(item, out d);
return success ? (Decimal?)d : (Decimal?)null;
}
public static int? TryGetInt(this string item)
{
int i;
bool success = int.TryParse(item, out i);
return success ? (int?)i : (int?)null;
}
public static bool TryGetBool(this string item)
{
bool b = false;
Boolean.TryParse(item, out b);
return b;
}
public static Version TryGetVersion(this string item)
{
Version v;
Version.TryParse(item, out v); // v is null when parsing fails
return v;
}
}
Create a dictionary with the keys as the names. As value of each item in the dictionary, use the integer and add it to (the value of) an already existing key (or not).
lines.GroupBy(line => string.Join(" ", line.Split().Take(2)))
.Select((g, index) =>
string.Format("{0}. {1} {2}",
index + 1,
g.Key,
g.Sum(line => int.Parse(line.Split().Last()))));
var results = File.ReadAllLines("filename.txt")
.Select(x => x.Split())
.GroupBy(y => new { y1 = y[0], y2 = y[1] })
.Select(g => new { g.Key.y1, g.Key.y2, Sum = g.Sum(v => int.Parse(v[2])) })
.OrderByDescending(p => p.Sum)
.Select(m => m.y1 + " " + m.y2 + " " + m.Sum).ToList();
Something like this (not so elegant, but not dependent on the number of spaces in the name):
var lines = File.ReadAllLines(@"<pathToFile>");
var result = lines
.Select(m => m.Split(' '))
.Select(x => new {
text = string.Join(" ", x.Take(x.Length - 1)), // everything except the last token
val = Convert.ToInt32(x.Last())
})
.GroupBy(x => x.text)
.Select(g => new {
text =g.Key,
sum = g.Sum(z => z.val)
}).ToList();
Just another solution with LINQ and regular expressions (for verifying line format and getting names and values from it):
Regex regex = new Regex(@"^(?<name>.*)\s(?<value>\d+)$");
var query = from line in File.ReadLines(file_name)
let match = regex.Match(line)
where match.Success
select new {
Name = match.Groups["name"].Value,
Value = match.Groups["value"].Value
} into item
group item by item.Name into g
orderby g.Key
select new {
Name = g.Key,
Total = g.Sum(x => Int32.Parse(x.Value))
};
Value overflow is not checked here. If some values may be bigger than Int32.MaxValue, Int32.Parse will throw; to skip such values instead, change the sum calculation to
g.Sum(x => { int value; return Int32.TryParse(x.Value, out value) ? value : 0; })
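If large values should be counted rather than skipped, one option is to parse and accumulate as long. The following is only a sketch under that assumption; LongTotals and Total are hypothetical names, and the pattern is the one from the answer above.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LongTotals
{
    // Same line format as above: a name, one whitespace character, then digits.
    private static readonly Regex LinePattern =
        new Regex(@"^(?<name>.*)\s(?<value>\d+)$");

    // Returns the total per name as a long; values that do not even fit a long count as 0.
    public static Dictionary<string, long> Total(IEnumerable<string> lines)
    {
        return lines
            .Select(line => LinePattern.Match(line))
            .Where(match => match.Success)
            .Select(match => new
            {
                Name = match.Groups["name"].Value,
                Text = match.Groups["value"].Value
            })
            .GroupBy(item => item.Name)
            .ToDictionary(
                g => g.Key,
                g => g.Sum(item => long.TryParse(item.Text, out long value) ? value : 0L));
    }
}
```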
I am trying to figure out a regex to use to split a string into 2 character substring.
Let's say we have the following string:
string str = "Idno1";
string pattern = @"\w{2}";
Using the pattern above will get me "Id" and "no", but it will skip the "1" since it doesn't match the pattern. I would like the following results:
string str = "Idno1"; // ==> "Id" "no" "1 "
string str2 = "Id n o 2"; // ==> "Id", " n", " o", " 2"
LINQ can simplify the code. The Fiddle version works.
The idea: with chunkSize = 2, as you require, take one element for every chunk start (indices 0, 2, 4, 6, ...), then for each of them skip to that position in the string, take chunkSize chars, and join them into a string.
public static IEnumerable<string> ProperFormat(string s)
{
var chunkSize = 2;
return s.Where((x,i) => i % chunkSize == 0)
.Select((x,i) => s.Skip(i * chunkSize).Take(chunkSize))
.Select(x=> string.Join("", x));
}
With the input, I have the output
Idno1 -->
Id
no
1
Id n o 2 -->
Id
n
o
2
LINQ really is the better fit in this case. You can use this method, which splits a string into chunks of arbitrary size:
public static IEnumerable<string> SplitInChunks(string s, int size = 2)
{
return s.Select((c, i) => new {c, id = i / size})
.GroupBy(x => x.id, x => x.c)
.Select(g => new string(g.ToArray()));
}
But if you are bound to regex, use this code:
public static IEnumerable<string> SplitInChunksWithRegex(string s, int size = 2)
{
var regex = new Regex($".{{1,{size}}}");
return regex.Matches(s).Cast<Match>().Select(m => m.Value);
}
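For comparison, the same chunking can be sketched with a plain loop over Substring, without LINQ or regex. Chunker and SplitInChunksBySubstring are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class Chunker
{
    // Yields consecutive chunks of at most 'size' characters; the last chunk may be shorter.
    // Because this is an iterator, the argument check runs when enumeration starts.
    public static IEnumerable<string> SplitInChunksBySubstring(string s, int size = 2)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        for (int start = 0; start < s.Length; start += size)
        {
            int length = Math.Min(size, s.Length - start);
            yield return s.Substring(start, length);
        }
    }
}
```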
We have a program that shows you how many times a letter is repeated in a text
string txt = input.text.ToLower();
txt = Regex.Replace(txt, @"\s+", "").Replace(")","").Replace("(","").Replace(".","").Replace(",","").Replace("!","").Replace("?","") ;
var letterCount = txt.Where(char.IsLetter).GroupBy(c => c).Select(v => new { Letter = v.Key, count = v.Count() });
foreach (var c in letterCount)
{
Debug.Log(string.Format("Character: {0} appears {1} times", c.Letter.ToString(), c.count));
}
How do I give the most frequently repeated letter the value 26, the next most frequent one the value 25, and so on, with the letters that occur only once getting the remaining values in alphabetical order?
For example, the text "we are all happy"
Letter A is repeated three times and has the value of 26
For letter L 25
For P 24 and others in alphabetical order
And, finally, get their sum?
Sorry for my English!!!
You can use this LINQ approach:
string input = "we are all happy";
var allCharValues = input.ToLookup(c => c)
.Where(g => g.Key != ' ') // or you want spaces?
.OrderByDescending(g => g.Count())
.ThenBy(g => g.Key) // you mentioned alphabetical ordering if two have same count
.Select((x, index) => new { Char = x.Key, Value = 26 - index, Count = x.Count() });
foreach (var x in allCharValues)
Console.WriteLine($"Char:{x.Char} Value:{x.Value} Count:{x.Count}");
int sum = allCharValues.Select(x => x.Value).Sum();
In relation to your question about removing unwanted characters:
I think you'd be better off just keeping all characters between a and z. You could write an extension method to do this, and convert to lowercase at the same time:
public static class StringExt
{
public static string AlphabeticChars(this string self)
{
var alphabeticChars = self.Select(char.ToLower).Where(c => 'a' <= c && c <= 'z');
return new string(alphabeticChars.ToArray());
}
}
Then you can use an approach as follows. This is similar to Tim's approach, but this uses GroupBy() to count the occurrences; it also uses the new Tuple syntax from C#7 to simplify things. Note that this ALSO names the tuple properties, so they are not using the default Item1 and Item2.
string txt = "we, (are?) all! happy.";
var r = txt
.AlphabeticChars()
.GroupBy(c => c)
.Select(g => (Count: g.Count(), Char: g.Key))
.OrderByDescending(x => x.Count)
.ThenBy(x => x.Char)
.Select((v, i) => (Occurance: v, Index: 26-i));
int sum = r.Sum(c => c.Occurance.Count * c.Index);
Console.WriteLine(sum);
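The two answers above compute different totals: the first adds up the assigned values, the second multiplies each value by the letter's count. A small sketch that returns the ranking itself makes it easy to compute either one; LetterRanking and Rank are hypothetical names, and the ordering (count descending, then alphabetical) is the one both answers use.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LetterRanking
{
    // Ranks the letters a-z by frequency (ties alphabetical); the top letter gets 26, the next 25, ...
    public static List<(char Letter, int Count, int Value)> Rank(string text)
    {
        return text
            .ToLowerInvariant()
            .Where(c => c >= 'a' && c <= 'z')
            .GroupBy(c => c)
            .OrderByDescending(g => g.Count())
            .ThenBy(g => g.Key)
            .Select((g, index) => (Letter: g.Key, Count: g.Count(), Value: 26 - index))
            .ToList();
    }
}
```

For "we are all happy" this ordering gives a = 26, e = 25, l = 24, p = 23 and then h, r, w, y = 22 down to 19, because e also occurs twice. Summing the values gives 180; summing Count * Value gives 304.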
IdGroup Quantity
-----------------------
1 22
2 1
3 4
I want to sum up all the quantities - so the answer should be 27
The class is:
public class lines
{
public int IdGroup { get; set; }
public string Quantity { get; set; }
}
Given a list of lines:
List<lines> lines_array = ...
Quantity is a string, for some internal reasons.
I don't know how I can use Sum here because of this.
I tried with:
int total_quantity =
lines_array
.GroupBy(grp => grp.IdGroup)
.Select(grp => new { mysumatory = grp.Sum(o => o.Quantity) });
But this doesn't work.
Can you help me?
Something like this should work:
int total_quantity = lines_array.Sum(l=>int.Parse(l.Quantity));
I don't understand why you group the entries if you want to calculate the total number of quantities.
Use Int32.Parse method:
int total_quantity = lines_array
    .GroupBy(grp => grp.IdGroup)
    .Select(grp => new { mysumatory = grp.Sum(o => int.Parse(o.Quantity)) })
    .Sum(grp => grp.mysumatory); // add up the per-group sums to get a single int
You need to parse the string:
.Select(grp => new { mysumatory = grp.Sum(o => int.Parse(o.Quantity)) });
Alternative with TryParse, just in case there's invalid data in Quantity that cannot be parsed to int:
.Select(grp => new { mysumatory = grp.Sum(s => {
    int outvalue;
    if (int.TryParse(s.Quantity, out outvalue))
        return outvalue;
    else
        return 0;
}) });
You will have to either use Int32.Parse(o.Quantity), or perform a conversion of quantity on your linq query. As it is only used for sum I will go with the former.
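Put together, a self-contained sketch of the parsing approach might look like this. The class is renamed to Line here only to follow C# naming conventions, and QuantityTotals, Total and TotalPerGroup are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Line
{
    public int IdGroup { get; set; }
    public string Quantity { get; set; }   // stored as a string, as in the question
}

public static class QuantityTotals
{
    // Grand total: parse each Quantity and sum; int.Parse throws on invalid input.
    public static int Total(IEnumerable<Line> lines) =>
        lines.Sum(l => int.Parse(l.Quantity));

    // Per-group totals, for the case where grouping by IdGroup is actually wanted.
    public static Dictionary<int, int> TotalPerGroup(IEnumerable<Line> lines) =>
        lines.GroupBy(l => l.IdGroup)
             .ToDictionary(g => g.Key, g => g.Sum(l => int.Parse(l.Quantity)));
}
```

With the three rows from the question (22, 1 and 4), Total returns 27.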
I would like to remove the duplicate elements from a List. Some elements of the list look like this:
Book 23
Book 22
Book 19
Notebook 22
Notebook 19
Pen 23
Pen 22
Pen 19
To get rid of duplicate elements I've done this:
List<String> nodup = dup.Distinct().ToList();
I would like to keep in the list just
Book 23
Notebook 22
Pen 23
How can I do that?
You can do something like
string firstElement = dup.Distinct().ToList().First();
and add it to another list if you want.
It's not 100% clear what you want here - however...
If you want to keep the "largest" number in the list, you could do:
List<string> noDup = dup.Select(s => s.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries))
    .Select(p => new { Name=p[0], Val=int.Parse(p[1]) })
    .GroupBy(p => p.Name)
    .Select(g => string.Join(" ", g.Key, g.Max(p => p.Val).ToString()))
    .ToList();
This would transform the List<string> by parsing the numeric portion into a number, taking the max per item, and creating the output string as you have specified.
You can use LINQ in combination with some String operations to group all your items by name and MAX(Number):
var q = from str in list
let Parts = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
let item = Parts[ 0 ]
let num = int.Parse(Parts[ 1 ])
group new { Name = item, Number = num } by item into Grp
select new {
Name = Grp.Key,
Value = Grp.Max(i => i.Number).ToString()
};
var highestGroups = q.Select(g =>
String.Format("{0} {1}", g.Name, g.Value)).ToList();
(Same as Reed's approach, but in query syntax, which is more readable in my opinion.)
Edit: I cannot reproduce your comment that it does not work, here is sample data:
List<String> list = new List<String>();
list.Add("Book 23");
list.Add("Book 22");
list.Add("Book 19");
list.Add("Notebook 23");
list.Add("Notebook 22");
list.Add("Notebook 19");
list.Add("Pen 23");
list.Add("Pen 22");
list.Add("Pen 19");
list.Add("sheet 3");
var q = from str in list
let Parts = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
let item = Parts[ 0 ]
let num = int.Parse(Parts[ 1 ])
group new { Name = item, Number = num } by item into Grp
select new {
Name = Grp.Key,
Value = Grp.Max(i => i.Number).ToString()
};
var highestGroups = q.Select(g => String.Format("{0} {1}", g.Name, g.Value));
MessageBox.Show(String.Join(Environment.NewLine, highestGroups));
The result:
Book 23
Notebook 23
Pen 23
sheet 3
You may want to add a custom comparer as a parameter, as you can see in the example on MSDN.
In this example I assumed Foo is a class with two members.
class Program
{
static void Main(string[] args)
{
var list = new List<Foo>()
{
new Foo("Book", 23),
new Foo("Book", 22),
new Foo("Book", 19)
};
foreach(var element in list.Distinct(new Comparer()))
{
Console.WriteLine(element.Type + " " + element.Value);
}
}
}
public class Foo
{
public Foo(string type, int value)
{
this.Type = type;
this.Value = value;
}
public string Type { get; private set; }
public int Value { get; private set; }
}
public class Comparer : IEqualityComparer<Foo>
{
public bool Equals(Foo x, Foo y)
{
if(x == null || y == null)
return x == y;
else
return x.Type == y.Type;
}
public int GetHashCode(Foo obj)
{
return obj.Type.GetHashCode();
}
}
This works on an IList, assuming that we want the first item each, not the one with the highest number. Be careful with different collection types (like ICollection or IEnumerable), as they do not guarantee you any order. Therefore any of the Foos may remain after the Distinct.
You could also override both Equals and GetHashCode of Foo instead of using a custom IEqualityComparer. However, I would not actually recommend this for a local distinct. Consumers of your class may not recognize that two instances with same value for Type are always equal, regardless of their Value.
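If the intent really is "keep the first item of each type", a GroupBy makes that explicit without a comparer class. GroupBy is documented to yield groups in the order their keys first appear and to keep the elements of each group in source order. This is only a sketch: Entry and FirstPerType are hypothetical names standing in for Foo above.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Entry
{
    public Entry(string type, int value)
    {
        Type = type;
        Value = value;
    }

    public string Type { get; }
    public int Value { get; }
}

public static class FirstPerType
{
    // Keeps the first entry seen for each Type, in order of first appearance.
    public static List<Entry> Keep(IEnumerable<Entry> source) =>
        source.GroupBy(e => e.Type)
              .Select(g => g.First())
              .ToList();
}
```

Replacing g.First() with an ordering such as g.OrderByDescending(e => e.Value).First() would keep the entry with the highest number instead.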
A bit old-fashioned, but it should work, if I understand correctly:
Dictionary<string,int> dict = new Dictionary<string,int>();
// Split accepts a single character; assume each line contains a key/value pair separated by a space, with no whitespace inside the key
input = input.Replace("\r\n", "\n");
string[] lines = input.Split('\n');
// Break into categories and find the largest number in each
foreach (string line in lines)
{
    string[] parts = line.Split(' ');
    string key = parts[0].Trim();
    int value = Convert.ToInt32(parts[1].Trim());
    if (!dict.ContainsKey(key))
    {
        dict.Add(key, value);   // first time this key is seen
    }
    else
    {
        if (dict[key] < value)
        {
            dict[key] = value;  // keep the larger number
        }
    }
}
// Do something with dict
I have a troublesome query to write. I'm currently writing some nasty for loops to solve it, but I'm curious to know if Linq can do it for me.
I have:
struct TheStruct
{
public DateTime date {get; set;} //(time portion will always be 12 am)
public decimal A {get; set;}
public decimal B {get; set;}
}
and a list that contains these structs. Let's say it's ordered this way:
List<TheStruct> orderedList = unorderedList.OrderBy(x => x.date).ToList();
If you put the dates of the orderedList structs in a set, they will always be contiguous with respect to the day. That is, if the latest date in the list was 2011/01/31 and the earliest date in the list was 2011/01/01, then you'd find that the list would contain 31 items, one for each date in January.
Ok, so what I want to do is group the list items such that:
Each item in a group must contain the same Decimal A value and the same Decimal B value
The date values in a group must form a set of contiguous dates, if the date values were in order
If you summed up the number of items in each group, the total would equal the number of items in the original list (or, put differently, a struct with a particular date can't belong to more than one group)
Any Linq masters know how to do this one?
Thanks!
You can group adjacent items in a sequence using the GroupAdjacent Extension Method (see below):
var result = unorderedList
.OrderBy(x => x.date)
.GroupAdjacent((g, x) => x.A == g.Last().A &&
x.B == g.Last().B &&
x.date == g.Last().date.AddDays(1))
.ToList();
Example:
(1,1) 2011-01-01 \
(1,1) 2011-01-02 > Group 1
(1,1) 2011-01-03 __/
(2,1) 2011-01-04 \
(2,1) 2011-01-05 > Group 2
(2,1) 2011-01-06 __/
(1,1) 2011-01-07 \
(1,1) 2011-01-08 > Group 3
(1,1) 2011-01-09 __/
(1,1) 2011-02-01 \
(1,1) 2011-02-02 > Group 4
(1,1) 2011-02-03 __/
Extension Method:
static IEnumerable<IEnumerable<T>> GroupAdjacent<T>(
this IEnumerable<T> source, Func<IEnumerable<T>, T, bool> adjacent)
{
var g = new List<T>();
foreach (var x in source)
{
if (g.Count != 0 && !adjacent(g, x))
{
yield return g;
g = new List<T>();
}
g.Add(x);
}
yield return g;
}
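A self-contained sketch of how this could be exercised on data shaped like the example above. Day, AdjacentGrouping and GroupRuns are hypothetical names; the extension method is a slight variation that passes the current group as a List<T> and yields nothing for an empty source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct Day
{
    public DateTime Date { get; set; }
    public decimal A { get; set; }
    public decimal B { get; set; }
}

public static class AdjacentGrouping
{
    // Starts a new group whenever 'adjacent' says the next item does not belong to the current one.
    public static IEnumerable<List<T>> GroupAdjacent<T>(
        this IEnumerable<T> source, Func<List<T>, T, bool> adjacent)
    {
        var g = new List<T>();
        foreach (var x in source)
        {
            if (g.Count != 0 && !adjacent(g, x))
            {
                yield return g;
                g = new List<T>();
            }
            g.Add(x);
        }
        if (g.Count != 0) yield return g;   // do not emit an empty group for an empty source
    }

    // Groups days that share A and B and follow each other without a gap in the dates.
    public static List<List<Day>> GroupRuns(IEnumerable<Day> days) =>
        days.OrderBy(d => d.Date)
            .GroupAdjacent((g, d) =>
            {
                var last = g[g.Count - 1];
                return d.A == last.A && d.B == last.B && d.Date == last.Date.AddDays(1);
            })
            .ToList();
}
```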
Here is an entry for "Most Convoluted way to do this":
public static class StructOrganizer
{
public static IEnumerable<Tuple<Decimal, Decimal, IEnumerable<MyStruct>>> OrganizeWithoutGaps(this IEnumerable<MyStruct> someStructs)
{
var someStructsAsList = someStructs.ToList();
var lastValuesSeen = new Tuple<Decimal, Decimal>(someStructsAsList[0].A, someStructsAsList[0].B);
var currentList = new List<MyStruct>();
return Enumerable
.Range(0, someStructsAsList.Count)
.ToList()
.Select(i =>
{
var current = someStructsAsList[i];
if (lastValuesSeen.Equals(new Tuple<Decimal, Decimal>(current.A, current.B)))
currentList.Add(current);
else
{
lastValuesSeen = new Tuple<decimal, decimal>(current.A, current.B);
var oldList = currentList;
currentList = new List<MyStruct>(new [] { current });
return new Tuple<decimal, decimal, IEnumerable<MyStruct>>(lastValuesSeen.Item1, lastValuesSeen.Item2, oldList);
}
return null;
})
.Where(i => i != null);
}
// To Test:
public static void Test()
{
var r = new Random();
var sampleData = Enumerable.Range(1, 31).Select(i => new MyStruct {A = r.Next(0, 2), B = r.Next(0, 2), date = new DateTime(2011, 12, i)}).OrderBy(s => s.date).ToList();
var sortedData = sampleData.OrganizeWithoutGaps();
Console.Out.WriteLine("Sample Data:");
sampleData.ForEach(s => Console.Out.WriteLine("{0} = ({1}, {2})", s.date, s.A, s.B));
Console.Out.WriteLine("Output:");
sortedData.ToList().ForEach(s => Console.Out.WriteLine("({0}, {1}) = {2}", s.Item1, s.Item2, String.Join(", ", s.Item3.Select(st => st.date))));
}
}
If I understood you well, a simple Group By would do the trick:
var orderedList = unorderedList.OrderBy(o => o.date).GroupBy(s => new {s.A, s.B});
Just that. To print the results:
foreach (var o in orderedList) {
Console.WriteLine("Dates of group {0},{1}:", o.Key.A, o.Key.B);
foreach(var s in o){
Console.WriteLine("\t{0}", s.date);
}
}
The output would be like:
Dates of group 2,3:
02/12/2011
03/12/2011
Dates of group 4,3:
03/12/2011
Dates of group 1,2:
04/12/2011
05/12/2011
06/12/2011
Hope this helps.
Cheers