Find duplicate entries in a file for each day - c#

I want to write C# code that reads my file, which is in the format given below, and prints all the duplicate entries for each date along with the number of occurrences.
Example.txt :
March 03 2014 abcd March 03 2014 def March 03 2014 abcd March 04 2014 xyz March 04 2014 xyz
Output :
March 03 2014 abcd 2
March 04 2014 xyz 2
Can someone help me with this?
I was thinking about using a dictionary where the event would be my key and, for each duplicate event, I would increment the value. But I am not sure how to group the results for each day.
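The dictionary idea described above can be sketched roughly as follows. This is only an illustration, assuming every record consists of exactly four whitespace-separated tokens (month, day, year, event); the class and method names (DuplicateCounter, FindDuplicates) are made up for this example. Using the date plus the event as the key means the counts are grouped per day without any extra step.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCounter
{
    // Reads the input as records of four tokens (month, day, year, event)
    // and returns "record count" for every record seen more than once.
    // Assumption: each record has exactly four whitespace-separated tokens.
    public static List<string> FindDuplicates(string input)
    {
        var tokens = input.Split(new[] { ' ', '\r', '\n' },
                                 StringSplitOptions.RemoveEmptyEntries);
        var counts = new Dictionary<string, int>();
        var order = new List<string>(); // first-seen order, for stable output

        for (int i = 0; i + 3 < tokens.Length; i += 4)
        {
            // Key = date + event, so the same event on another day is a different key.
            string key = string.Join(" ", tokens, i, 4);
            if (counts.ContainsKey(key))
            {
                counts[key]++;
            }
            else
            {
                counts[key] = 1;
                order.Add(key);
            }
        }

        return order.Where(k => counts[k] > 1)
                    .Select(k => k + " " + counts[k])
                    .ToList();
    }
}
```

Calling DuplicateCounter.FindDuplicates(File.ReadAllText(path)) on the sample file should give the two lines shown under "Output" in the question.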

This might be a good case for LINQ:
var input = "March 03 2014 abcd March 03 2014 def March 03 2014 abcd March 04 2014 xyz March 04 2014 xyz";
var format = "MMMM dd yyyy";
var results = input.Split(' ')
.Select((v, i) => new { v, i })
.GroupBy(x => x.i / 4, x => x.v, (k, g) => g.ToList())
.Select(g => new
{
Date = DateTime.ParseExact(String.Join(" ", g.Take(3)), format, CultureInfo.InvariantCulture),
Event = g[3]
})
.GroupBy(x => x)
.Where(g => g.Count() > 1)
.Select(g => new
{
Item = g.Key,
Count = g.Count()
});
foreach (var i in results)
Console.WriteLine("{0} {1} {2}", i.Item.Date.ToString(format), i.Item.Event, i.Count.ToString());
Prints exactly what you need.

Going by your original description of the problem and sample data, this code will probably work with some tweaks. You could probably do it using some of the LINQ libraries as well.
List<String> outputStringList = new List<string>();
IEnumerable<String> stringEnumerable = System.IO.File.ReadLines(@"c:\tmp\test.txt");
System.Collections.Generic.HashSet<String> uniqueHashSet = new System.Collections.Generic.HashSet<String>();
foreach (String line in stringEnumerable) { uniqueHashSet.Add(line); }
foreach (String output in uniqueHashSet)
{
Int32 count = stringEnumerable.Count(element => element == output);
if (count > 1) { outputStringList.Add(output + " " + count); }
//if (count > 1) { System.Diagnostics.Debug.WriteLine(output + " " + count); }
}
I see that you changed the formatting of your data while I was writing up my answer. Please disregard as this solution will no longer work.

You can split your text by using a Regular Expression.
public IEnumerable<KeyValuePair<String, Int32>> SearchDuplicates(string path){
var file = File.ReadLines(path);
var pattern = new Regex("[A-Za-z]* [0-9]{2} [0-9]{4} [A-Za-z]*");
var results = new Dictionary<string, int>();
foreach(var line in file) {
foreach(Match match in pattern.Matches(line)) {
if(!results.ContainsKey(match.Value))
results.Add(match.Value, 0);
results[match.Value]++;
}
}
return results.Where(v => v.Value > 1);
}

Note: I've written this to be simple to read, with comments explaining the process.
If you are also the one writing this file, I suggest separating each "file" with a record separator, which has the value 30 in the ASCII table. If this is not the case, and you HAVE to use the file format given in the OP, let me know and I can add a case for that.
// Reads the entire file into one string variable (filePath is the path to your file).
string allTheText = File.ReadAllText(filePath);
// Splits each "file" into a string of its own.
string[] files = allTheText.Split((char)30);
// Use this instead if you have a newline between each "file" rather than a record separator.
// string[] files = File.ReadAllLines(filePath);
// Make a Dictionary<string, string> to hold all these (you could use DateTime but I opted not to).
Dictionary<string, string> entries = new Dictionary<string, string>();
foreach(string file in files)
{
// Now let's get the date of this "file".
// We need the index of the 3rd space
var offset = file.IndexOf(' ');
offset = file.IndexOf(' ', offset+1);
offset = file.IndexOf(' ', offset+1);
// Now split up the string at this offset: the date is everything before the 3rd space
string date = file.Substring(0, offset);
string filecont = file.Substring(offset+1);
// Only add if it isn't already in there
if(!entries.ContainsKey(date))
entries.Add(date, filecont);
}
// Print them out
foreach(string key in entries.Keys)
{
Console.WriteLine(key + " " + entries[key]);
}

You can tokenize it based on the month delimiter if you want
public static void Main (string[] args)
{
var str = "March 03 2014 abcd March 03 2014 def March 03 2014 abcd March 04 2014 xyz March 04 2014 xyz";
var rawResults = tokenize (str).GroupBy(i => i);
foreach (var item in rawResults) {
Console.WriteLine ("Item {0} happened {1} times", item.Key, item.Count());
}
}
static List<String> tokenize (string str)
{
var months = new[]{ "March", "April", "May" }; //etc
var strTokens = str.Split (new []{ ' ' }, StringSplitOptions.RemoveEmptyEntries);
var results = new List<string> ();
var current = "";
foreach (var token in strTokens) {
if (months.Contains(token)) {
if (current != null && current != "") {
results.Add (current);
}
current = token + " ";
} else {
current += token + " ";
}
}
results.Add (current);
return results;
}
Better yet, use a parser combinator to do it

A simple solution using regex
string input = "March 03 2014 abcd March 03 2014 def March 03 2014 abcd March 04 2014 xyz March 04 2014 xyz";
List<string> dates = new List<string>();
string[] splitted = input.Split(' ');
for (int i = 0; i < splitted.Length; i = i + 4)
{
string strDate = splitted[i] + " " + splitted[i + 1] + " " + splitted[i + 2] + " " + splitted[i + 3];
if (!dates.Contains(strDate))
{
dates.Add(strDate);
if (Regex.Matches(input, strDate).Count > 1)
Console.WriteLine(strDate + " " + Regex.Matches(input, strDate).Count);
}
}

Related

Join keys in key-value pair list with close value c#

I have a list of key-value pairs of <string, int>. I want to merge the keys that have close values (within ±3 of each other) into a new string and add each new string to a list.
Here are the keys and values of my list:
Luger: 9
Burger: 9
Le: 21
Pigeon: 21
Burger: 21
Hamburger: 25
Double: 30
Animal: 31
Style: 31
The: 43
Original: 43
Burger: 44
Here's the output that i want to achieve:
Luger Burger
Le Pigeon Burger
Hamburger
Double Animal Style
The Original Burger
To achieve this, I first created a list containing these key-value pairs. Then I iterated through each item, tried to find close values, assigned them to new key-value pairs and deleted that index. But that doesn't work properly. This is the code so far:
for (int i = 0; i < wordslist.Count; i++)
{
for (int j = 0; j < wordslist.Count; j++)
{
if (wordslist[i].Value <= wordslist[j].Value + 3 && wordslist[i].Value >= wordslist[j].Value - 3)
{
wordslist.Add(
new KeyValuePair<string, int>(wordslist[i].Key + " " + wordslist[j].Key, wordslist[i].Value)
);
wordslist.RemoveAt(j);
}
}
wordslist.RemoveAt(i);
}
This doesn't work and produces repetitive results, as shown below:
Pigeon: 21
Style: 30
Burger: 30
Double Double Animal: 30
Burger Burger: 31
Original Original The The Original Burger Original Burger: 42
Is there an algorithm that can iterate through these items, construct a string by merging the keys that have close values, and add each resulting string to a list?
You can simplify this logic:
public IEnumerable<string> GetPlusOrMinus3(Dictionary<string, int> fullList, int checkNumber)
{
return fullList.Where(w => checkNumber <= w.Value + 3
&& checkNumber >= w.Value - 3)
.Select(s => $"{s.Key}: {s.Value}" );
}
The string format isn't perfect for you, but the logic should hold.
And in use you could do something like:
var forOne = GetPlusOrMinus3(values, 1);
var resultString = String.Join(", ", forOne);
Console.WriteLine(resultString);
Which would write out:
one: 1, two: 2, four: 4
And to loop through everything:
foreach(var entryValue in values.Values)
{
Console.WriteLine(String.Join(", ", GetPlusOrMinus3(values, entryValue)));
}
Or to loop through everything without reusing any results:
var matchedNumbers = new List<int>();
foreach(var entryValue in values.Values)
{
var matchResults = values.Where(w => entryValue <= w.Value + 3 && entryValue >= w.Value - 3
&& !matchedNumbers.Contains(w.Value)).ToDictionary(x => x.Key, x => x.Value);
if (matchResults.Any())
{
matchedNumbers.AddRange(matchResults.Select(s => s.Value).ToList());
Console.WriteLine(String.Join(", ",
GetPlusOrMinus3(matchResults, entryValue)));
}
}
Logs:
one: 1, two: 2, four: 4
twelve: 12, 10: 10, eleven: 11
six: 6
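For the grouped output the question asks for, a different approach may be simpler: sort by value and start a new group whenever a value is more than 3 away from the first value of the current group. The sketch below is not taken from the answer above; the names (CloseValueGrouper, MergeCloseKeys, tolerance) are hypothetical, and "close" is assumed to mean "within the tolerance of the group's first value". It takes a list of pairs rather than a Dictionary because the question's data repeats the key "Burger".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CloseValueGrouper
{
    // Joins the keys of pairs whose values lie within `tolerance`
    // of the first (smallest) value in their group.
    public static List<string> MergeCloseKeys(
        IEnumerable<KeyValuePair<string, int>> pairs, int tolerance)
    {
        var result = new List<string>();
        var current = new List<string>();
        int groupStart = 0;

        // OrderBy is a stable sort, so equal values keep their input order.
        foreach (var pair in pairs.OrderBy(p => p.Value))
        {
            // Too far from the start of the current group: close it.
            if (current.Count > 0 && pair.Value - groupStart > tolerance)
            {
                result.Add(string.Join(" ", current));
                current.Clear();
            }
            if (current.Count == 0)
            {
                groupStart = pair.Value;
            }
            current.Add(pair.Key);
        }

        if (current.Count > 0)
        {
            result.Add(string.Join(" ", current));
        }
        return result;
    }
}
```

With the question's list and a tolerance of 3, this should produce the five strings listed as the desired output, because 25 is 4 away from 21 and 30 is 5 away from 25.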

Find out if a list of strings contains permutations of words from another string (counter for each combination)

I didn't know exactly how to ask this question better so I will try to explain it as best as I can.
Let's say I have one list of 20 strings myList1<string> and I have another string string ToCompare. Now each of the strings in the list as well as the string ToCompare have 8 words divided by empty spaces. I want to know how many times combination of any three words from string ToCompare in any possible order is to be found in the strings of myList1<string>. For an example:
This is the list (short version - example):
string1 = "AA BB CC DD EE FF GG HH";
string2 = "BB DD EE AA HH II JJ MM";
.......
string20 = "NN OO AA RR EE BB FF KK";
string ToCompare = "BB GG AA FF CC MM RR II";
Now I want to know how many times any combination of 3 words from the ToCompare string is found in myList1<string>. To clarify further: the three words "BB AA CC" from ToCompare are found in string1 of the list, so the counter for these 3 words would be 1. Another 3 words from ToCompare, "BB AA II", are found in string2 of myList1<string>, but the counter here would also be 1 because it's not the same combination of words (I have "AA" and "BB" but also "II"; they are not equal). The order of these 3 words doesn't matter, which means "AA BB CC" = "BB AA CC" = "CC BB AA". I want to know how many combinations of any 3 words from ToCompare are found in myList1<string>. I hope it's clear what I mean.
Any help would be appreciated, I don't have a clue how to solve this. Thanks.
Example from Vanest:
List<string> source = new List<string>();
source.Add("2 4 6 8 10 12 14 99");
source.Add("16 18 20 22 24 26 28 102");
source.Add("33 6 97 38 50 34 87 88");
string ToCompare = "2 4 6 15 20 22 28 44";
The rest of the code is exactly the same, and the result is:
Key = 2 4 6, Value = 2
Key = 2 4 20, Value = 1
Key = 2 4 22, Value = 1
Key = 2 4 28, Value = 1
Key = 2 6 20, Value = 1
Key = 2 6 22, Value = 1
Key = 2 6 28, Value = 1
Key = 2 20 22, Value = 1
Key = 2 20 28, Value = 1
Key = 2 22 28, Value = 1
Key = 4 6 20, Value = 1
Key = 4 6 22, Value = 1
Key = 4 6 28, Value = 1
Key = 4 20 22, Value = 1
Key = 4 20 28, Value = 1
Key = 4 22 28, Value = 1
Key = 6 20 22, Value = 1
Key = 6 20 28, Value = 1
Key = 6 22 28, Value = 1
Key = 20 22 28, Value = 1
As you can see, there are combinations which do not exist in the strings, and the value of the first combination is 2 although it occurs only once, in the first string.
I think this should meet your needs:
List<string> source = new List<string>();
source.Add("AA BB CC DD EE FF GG HH");
source.Add("BB DD EE AA HH II JJ MM");
source.Add("NN OO AA RR EE BB FF KK");
string ToCompare = "BB GG AA FF CC MM RR II";
string word1, word2, word3, existingKey;
string[] compareList = ToCompare.Split(new string[] { " " }, StringSplitOptions.None);
Dictionary<string, int> ResultDictionary = new Dictionary<string, int>();
for (int i = 0; i < compareList.Length - 2; i++)
{
word1 = compareList[i];
for (int j = i + 1; j < compareList.Length - 1; j++)
{
word2 = compareList[j];
for (int z = j + 1; z < compareList.Length; z++)
{
word3 = compareList[z];
source.ForEach(x =>
{
if (x.Contains(word1) && x.Contains(word2) && x.Contains(word3))
{
existingKey = ResultDictionary.Keys.FirstOrDefault(y => y.Contains(word1) && y.Contains(word2) && y.Contains(word3));
if (string.IsNullOrEmpty(existingKey))
{
ResultDictionary.Add(word1 + " " + word2 + " " + word3, 1);
}
else
{
ResultDictionary[existingKey]++;
}
}
});
}
}
}
ResultDictionary will have the 3 word combinations that occur in myList1<string> with their count of occurrences. To get the total count, retrieve and add all the value fields from ResultDictionary.
EDIT:
Below snippet produces correct result with the given input,
List<string> source = new List<string>();
source.Add("2 4 6 8 10 12 14 99");
source.Add("16 18 20 22 24 26 28 102");
source.Add("33 6 97 38 50 34 87 88");
string ToCompare = "2 4 6 15 20 22 28 44";
string word1, word2, word3, existingKey;
string[] compareList = ToCompare.Split(new string[] { " " }, StringSplitOptions.None);
string[] sourceList, keywordList;
Dictionary<string, int> ResultDictionary = new Dictionary<string, int>();
source.ForEach(x =>
{
sourceList = x.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < compareList.Length - 2; i++)
{
word1 = compareList[i];
for (int j = i + 1; j < compareList.Length - 1; j++)
{
word2 = compareList[j];
for (int z = j + 1; z < compareList.Length; z++)
{
word3 = compareList[z];
if (sourceList.Contains(word1) && sourceList.Contains(word2) && sourceList.Contains(word3))
{
existingKey = ResultDictionary.Keys.FirstOrDefault(y =>
{
keywordList = y.Split(new string[] { " " }, StringSplitOptions.None);
return keywordList.Contains(word1) && keywordList.Contains(word2) && keywordList.Contains(word3);
});
if (string.IsNullOrEmpty(existingKey))
{
ResultDictionary.Add(word1 + " " + word2 + " " + word3, 1);
}
else
{
ResultDictionary[existingKey]++;
}
}
}
}
}
});
Hope this helps...
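The same counting can be written more compactly by turning every source string into a set of words once and then testing each 3-word combination against those sets. This is a sketch under my own assumptions, not the answerer's code: the names TripleCounter and CountTriples are hypothetical, words are separated by single spaces, and repeated words in ToCompare are treated as one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TripleCounter
{
    // For every 3-word combination from toCompare, counts how many
    // source strings contain all three words (order does not matter).
    // Only combinations found at least once are returned.
    public static Dictionary<string, int> CountTriples(
        IEnumerable<string> source, string toCompare)
    {
        var separators = new[] { ' ' };
        string[] words = toCompare
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Distinct()
            .ToArray();

        // Whole-word lookup; avoids "2" matching inside "22" as string.Contains would.
        List<HashSet<string>> sets = source
            .Select(s => new HashSet<string>(
                s.Split(separators, StringSplitOptions.RemoveEmptyEntries)))
            .ToList();

        var counts = new Dictionary<string, int>();
        for (int i = 0; i < words.Length - 2; i++)
        for (int j = i + 1; j < words.Length - 1; j++)
        for (int k = j + 1; k < words.Length; k++)
        {
            string a = words[i], b = words[j], c = words[k];
            int hits = sets.Count(set =>
                set.Contains(a) && set.Contains(b) && set.Contains(c));
            if (hits > 0)
            {
                counts[a + " " + b + " " + c] = hits;
            }
        }
        return counts;
    }
}
```

For the numeric example above, this should return only "2 4 6" and "20 22 28", each with a count of 1. Summing the dictionary's values gives the total number of matches.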
I think this will do what you're asking for:
void Main()
{
var list =
new List<String>
{
"AA BB CC DD EE FF GG HH",
"BB DD EE AA HH II JJ MM",
"NN OO AA RR EE BB FF KK"
};
var toCompare = "BB GG AA FF CC MM RR II";
var permutations = CountPermutations(list, toCompare);
}
public Int32 CountPermutations(List<String> list, String compare)
{
var words = compare.Split(' ');
return list
.Select(l => l.Split(' '))
.Select(l => new { String = String.Join(" ", l), Count = l.Join(words, li => li, wi => wi, (li, wi) => li).Count()})
.Sum(x => x.Count - 3);
}
[edit: 2/20/2019]
You can use the following to get all the matches to each list item with the total number of unique combinations
void Main()
{
var list =
new List<String>
{
"AA BB CC DD EE FF GG HH",
"BB DD EE AA HH II JJ MM",
"NN OO AA RR EE BB FF KK",
"AA AA CC DD EE FF GG HH"
};
list.Select((l, i) => new { Index = i, Item = l }).ToList().ForEach(x => Console.WriteLine($"List Item{x.Index + 1}: {x.Item}"));
var toCompare = "BB GG AA FF CC MM RR II";
Console.WriteLine($"To Compare: {toCompare}");
Func<Int32, Int32> Factorial = x => x < 0 ? -1 : x == 0 || x == 1 ? 1 : Enumerable.Range(1, x).Aggregate((c, v) => c * v);
var words = toCompare.Split(' ');
var matches = list
// Get a list of the list items with all their parts
.Select(l => new { Parts = l.Split(' '), Original = l })
// Join each part from the to-compare item to each part of the list item
.Select(l => new { String = String.Join(" ", l), Matches = l.Parts.Join(words, li => li, wi => wi, (li, wi) => li), l.Original })
// Only consider items with at least 3 matches
.Where(l => l.Matches.Count() >= 3)
// Get the each item including how many parts matched and how many unique parts there are of each part
.Select(l => new { l.Original, Matches = String.Join(" ", l.Matches), Count = l.Matches.Count(), Groups = l.Matches.GroupBy(m => m).Select(m => m.Count()) })
// To calculate the unique combinations for each match use the following mathematical equation: match_count! / (frequency_part_1! * frequency_part_2! * ... * frequency_part_n!)
.Select(l => new { l.Original, l.Matches, Combinations = Factorial(l.Count) / l.Groups.Aggregate(1, (c, v) => c * Factorial(v)) }) // seed of 1 so the first frequency is also factorialized
.ToList();
matches.ForEach(m => Console.WriteLine($"Original: {m.Original}, Matches: {m.Matches}, Combinations: {m.Combinations}"));
var totalUniqueCombinations = matches.Sum(x => x.Combinations);
Console.WriteLine($"Total Unique Combinations: {totalUniqueCombinations}");
}

Assign string into multidimensional array

Assumed that I have below string
[["Fri, 28 Mar 2014 01:00:00 +0000",0.402053266764,"1 sold"],["Thu, 03 Apr 2014 01:00:00 +0000",6.5,"1 sold"]];
How can i assign this set of string into an array?
Expected result:
string[,] items = {
{ "Fri, 28 Mar 2014 01:00:00 +0000", "0.402053266764", "1 sold"},
{ "Thu, 03 Apr 2014 01:00:00 +0000", "6.5", "1 sold"}
}
A brute-force approach; better solutions are probably available:
string input ="[[\"Fri, 28 Mar 2014 01:00:00 +0000\",0.402053266764,\"1 sold\"],[\"Thu, 03 Apr 2014 01:00:00 +0000\",6.5,\"1 sold\"]]";
string temp = input.Replace("[", "");
string[] records = temp.Split(new char[] {']'}, StringSplitOptions.RemoveEmptyEntries);
string[,] output = new string[records.Length, 3];
int recno = 0;
foreach(string record in records)
{
Console.WriteLine(record);
string[] fields = record.Split(new char[] {','}, StringSplitOptions.RemoveEmptyEntries);
output[recno,0] = string.Join(",", fields[0], fields[1]);
output[recno,1] = fields[2];
output[recno,2] = fields[3];
recno++;
}
for(int x = 0; x <= output.GetUpperBound(0); x++)
{
for(int y = 0; y <= output.GetUpperBound(1); y++)
Console.Write("INDEX[" +x + "," + y +"]=" + output[x, y] + ";");
Console.WriteLine();
}
string inputString; //Your original string
inputString=inputString.Replace('[',' '); //Removes left bracket
inputString=inputString.Substring(0,inputString.Count()-2); //Removes last two right brackets
var arrayOfStrings=inputString.Split(']'); //Split on right bracket
for(int i=1; i < arrayOfStrings.Count() -1; i++){
arrayOfStrings[i]=arrayOfStrings[i].Substring(1); //Removes the "," at the start of the 2nd to the n-1th elements
}
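Since the input string is valid JSON apart from the trailing semicolon, a JSON parser is another option. The following is a sketch that assumes System.Text.Json is available (it ships with .NET Core 3.0 and later) and that every inner array has the same length as the first one; JsonGrid and ToGrid are names invented for this example.

```csharp
using System;
using System.Text.Json;

public static class JsonGrid
{
    // Parses a JSON array of arrays into a rectangular string[,].
    // String cells are unquoted; numbers keep their original text.
    // Assumption: all rows are as long as the first row.
    public static string[,] ToGrid(string input)
    {
        // The question's string ends with ';', which is not part of the JSON.
        string json = input.Trim().TrimEnd(';');

        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement root = doc.RootElement;
            int rows = root.GetArrayLength();
            int cols = rows == 0 ? 0 : root[0].GetArrayLength();
            var grid = new string[rows, cols];

            int r = 0;
            foreach (JsonElement row in root.EnumerateArray())
            {
                int c = 0;
                foreach (JsonElement cell in row.EnumerateArray())
                {
                    if (c < cols)
                    {
                        grid[r, c] = cell.ValueKind == JsonValueKind.String
                            ? cell.GetString()
                            : cell.GetRawText();
                    }
                    c++;
                }
                r++;
            }
            return grid;
        }
    }
}
```

Unlike splitting on commas, this does not need special handling for the comma inside the date string.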

Distinguishing string being parsed using String Split

I need to parse a line that is in a similar format as following:
s = "Jun 21 09:47:50 ez-x5 user.debug if_comm: [TX] 02 30 20 0f 30 31 39 24 64 31 30 31 03 54 ";
I am splitting the line with [TX] or [RX]. Here's what I do with the parsed string:
s = "Jun 21 09:47:50 ez-x5 user.debug if_comm: [TX] 02 30 20 0f 30 31 39 24 64 31 30 31 03 54 ";
string[] stringSeparators = new string[] { "[TX] " + start_key };
string transfer = s.Split(stringSeparators, 2, StringSplitOptions.None)[1];
//At this point, transfer[] = 02 30 20 0f 30 31 39 24 64 31 30 31 03 54
if (!string.IsNullOrEmpty(transfer))
{
string key = "";
string[] split = transfer.Split(' ');
if (split[0] == start_key)
{
for (int i = 0; i < key_length; i++)
{
key += split[i + Convert.ToInt32(key_index)];
}
TX_Handle(key);
}
}
stringSeparators = new string[] { "[RX]" + start_key };
transfer = s.Split(stringSeparators, 2, StringSplitOptions.None)[1];
if (!string.IsNullOrEmpty(transfer))
{
string key = "";
string[] split = transfer.Split(' ');
if (split[0] == start_key)
{
for (int i = 0; i < key_length; i++)
{
key += split[i + Convert.ToInt32(key_index)];
}
RX_Handle(key);
}
}
Basically, because I have no realistic way of comparing whether the given token is [TX] or [RX], I am forced to use the above approach to separate the string, which requires me to write essentially the same code twice.
What is a way I can get around this problem and know which token is being parsed so that I don't have to duplicate my code?
The best way to do this is look at what is common. What is common in your code? Splitting based on 2 different tokens and a function call based on 2 different tokens. This can be broken into a conditional, so, why not move the common element into a conditional?
const string receiveToken = "[RX] ";
const string transmitToken = "[TX] ";
string token = s.IndexOf(receiveToken) > -1 ? receiveToken : transmitToken;
..now you have your token, so you can remove most of the duplication.
stringSeparators = new string[] { token + start_key };
transfer = s.Split(stringSeparators, 2, StringSplitOptions.None)[1];
if (!string.IsNullOrEmpty(transfer))
{
string key = "";
string[] split = transfer.Split(' ');
if (split[0] == start_key)
{
for (int i = 0; i < key_length; i++)
{
key += split[i + Convert.ToInt32(key_index)];
}
RX_TX_Handle(key, token);
}
}
..then you can have a common handler, eg:
void RX_TX_Handle(string key, string token)
{
// A conditional expression cannot be used as a statement in C#, so use if/else.
if (token == receiveToken) RX_Handle(key);
else TX_Handle(key);
}
How about a different approach: use a regular expression. Mix in a little bit of LINQ and you have some pretty easy-to-follow code.
static void ParseLine(
string line,
int keyIndex,
int keyLength,
Action<List<byte>> txHandler,
Action<List<byte>> rxHandler)
{
var re = new Regex(@"\[(TX|RX)\](?: ([0-9a-f]{2}))+");
var match = re.Match(line);
if (match.Success)
{
var mode = match.Groups[1].Value; // either TX or RX
var values = match.Groups[2]
.Captures.Cast<Capture>()
.Skip(keyIndex)
.Take(keyLength)
.Select(c => Convert.ToByte(c.Value, 16))
.ToList();
if (mode == "TX") txHandler(values);
else if (mode == "RX") rxHandler(values);
}
}
Or without regular expressions:
static void ParseLine(
string line,
int keyIndex,
int keyLength,
Action<List<byte>> txHandler,
Action<List<byte>> rxHandler)
{
var start = line.IndexOf('[');
var end = line.IndexOf(']', start);
var mode = line.Substring(start + 1, end - start - 1);
var values = line.Substring(end + 1)
.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
.Skip(keyIndex)
.Take(keyLength)
.Select(s => Convert.ToByte(s, 16))
.ToList();
if (mode == "TX") txHandler(values);
else if (mode == "RX") rxHandler(values);
}
I am not 100% sure if this answers your questions but I would create a TokenParser class that is responsible for parsing a token. You'll find it much easier to unit test.
public enum TokenType
{
Unknown = 0,
Tx = 1,
Rx = 2
}
public class Token
{
public TokenType TokenType { get; set; }
public IEnumerable<string> Values { get; set; }
}
public class TokenParser
{
public Token ParseToken(string input)
{
if (string.IsNullOrWhiteSpace(input)) throw new ArgumentNullException("input");
var token = new Token { TokenType = TokenType.Unknown };
input = input.ToUpperInvariant();
if (input.Contains("[TX]"))
{
token.TokenType = TokenType.Tx;
}
if (input.Contains("[RX]"))
{
token.TokenType = TokenType.Rx;
}
input = input.Substring(input.LastIndexOf("]", System.StringComparison.Ordinal) + 1);
token.Values = input.Trim().Split(Convert.ToChar(" "));
return token;
}
}
The example could be easily extended to allow multiple token parsers if the logic for parsing each token is vastly different.

How to chop a continuous date range list into a list of financial year in C#?

Example: given a continuous list of date range
List[0] = from 2001 Jan 01 to 2001 Aug 14
List[1] = from 2001 Aug 15 to 2002 Jul 10
Let’s assume that a financial year is from 1st of July to 30th of June (of next year) so the output should be
AnotherList[0] = from 2000 Jul 01 to 2001 Jun 30
period: 2001 Jan 01 to 2001 Jun 30
AnotherList[1] = from 2001 July 01 to 2002 Jun 30
period: 2001 Jul 01 to 2001 Aug 14
period: 2001 Aug 15 to 2002 Jun 30
AnotherList[2] = from 2002 July 01 to 2003 Jun 30
period: 2002 Jul 01 to 2002 Jul 10
Again it's very easy to work out by hand but my method contains close to 100 lines of code with the combination of if else, for each and while loops which I think it's ugly. I am trying to simplify the algorithm so that it's easier to maintain and debug. Thanks in advance.
You can be clever with GroupBy
// Beginning of earliest financial year
var start = new DateTime(2000,7,1);
var range = Enumerable.Range(0,365*2);
// Some random test data
var dates1 = range.Select(i => new DateTime(2001,1,1).AddDays(i) );
var dates2 = range.Select(i => new DateTime(2003,1,1).AddDays(i) );
// Group by distance in years from beginning of earliest financial year
var finYears =
dates1
.Concat(dates2)
.GroupBy(d => d.Subtract(start).Days / 365 );
This gives an IEnumerable<IGrouping<int, DateTime>> with each outer enumerable containing all the dates in the 2 lists in a single financial year.
EDIT: Changed to include clearer requirements.
Given a list that contains contiguous date ranges, the code doesn't have to be hard at all. In fact, you don't even have to write an actual loop:
public const int FYBeginMonth = 7, FYBeginDay = 1;
public static int FiscalYearFromDate(DateTime date)
{
return date.Month > FYBeginMonth ||
date.Month == FYBeginMonth && date.Day >= FYBeginDay ?
date.Year : date.Year - 1;
}
public static IEnumerable<DateRangeWithPeriods>
FiscalYears(IEnumerable<DateRange> continuousDates)
{
int startYear = FiscalYearFromDate(continuousDates.First().Begin),
endYear = FiscalYearFromDate(continuousDates.Last().End);
return from year in Enumerable.Range(startYear, endYear - startYear + 1)
select new DateRangeWithPeriods {
Range = new DateRange { Begin = FiscalYearBegin(year),
End = FiscalYearEnd(year) },
// start with the periods that began the previous FY and end in this FY
Periods = (from range in continuousDates
where FiscalYearFromDate(range.Begin) < year
&& FiscalYearFromDate(range.End) == year
select new DateRange { Begin = FiscalYearBegin(year),
End = range.End })
// add the periods that begin this FY
.Concat(from range in continuousDates
where FiscalYearFromDate(range.Begin) == year
select new DateRange { Begin = range.Begin,
End = Min(range.End, FiscalYearEnd(year)) })
// add the periods that completely span this FY
.Concat(from range in continuousDates
where FiscalYearFromDate(range.Begin) < year
&& FiscalYearFromDate(range.End) > year
select new DateRange { Begin = FiscalYearBegin(year),
End = FiscalYearEnd(year) })
};
}
This assumes some DateRange structures and helper functions, like this:
public struct DateRange
{
public DateTime Begin { get; set; }
public DateTime End { get; set; }
}
public class DateRangeWithPeriods
{
public DateRange Range { get; set; }
public IEnumerable<DateRange> Periods { get; set; }
}
private static DateTime Min(DateTime a, DateTime b)
{
return a < b ? a : b;
}
public static DateTime FiscalYearBegin(int year)
{
return new DateTime(year, FYBeginMonth, FYBeginDay);
}
public static DateTime FiscalYearEnd(int year)
{
return new DateTime(year + 1, FYBeginMonth, FYBeginDay).AddDays(-1);
}
This test code:
static void Main()
{
foreach (var x in FiscalYears(new DateRange[] {
new DateRange { Begin = new DateTime(2001, 1, 1),
End = new DateTime(2001, 8, 14) },
new DateRange { Begin = new DateTime(2001, 8, 15),
End = new DateTime(2002, 7, 10) } }))
{
Console.WriteLine("from {0:yyyy MMM dd} to {1:yyyy MMM dd}",
x.Range.Begin, x.Range.End);
foreach (var p in x.Periods)
Console.WriteLine(
" period: {0:yyyy MMM dd} to {1:yyyy MMM dd}", p.Begin, p.End);
}
}
outputs:
from 2000 Jul 01 to 2001 Jun 30
period: 2001 Jan 01 to 2001 Jun 30
from 2001 Jul 01 to 2002 Jun 30
period: 2001 Jul 01 to 2001 Aug 14
period: 2001 Aug 15 to 2002 Jun 30
from 2002 Jul 01 to 2003 Jun 30
period: 2002 Jul 01 to 2002 Jul 10
for each range in list
// determine end of this fiscal year
cut = new Date(range.start.year, 06, 30)
if cut < range.start
cut += year
end
if (range.end <= cut)
// one fiscal year
result.add range
continue
end
result.add new Range(range.start, cut)
// chop off whole fiscal years
start = cut + day
while (start + year <= range.end)
result.add new Range(start, start + year - day)
start += year
end
result.add new Range(start, range.end)
end
Sorry for mix of ruby and java :)
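The pseudocode above can be condensed into a single loop in C#. This is a sketch that assumes a 1 July to 30 June financial year, as stated in the question, and date-only values (no time of day); FiscalYearSplitter and Split are hypothetical names. Each pass cuts the remaining range at the end of the financial year containing its start.

```csharp
using System;
using System.Collections.Generic;

public static class FiscalYearSplitter
{
    // Splits [start, end] into pieces that each lie within one
    // financial year (1 July to 30 June).
    // Assumption: start and end carry no time-of-day component.
    public static List<Tuple<DateTime, DateTime>> Split(DateTime start, DateTime end)
    {
        var pieces = new List<Tuple<DateTime, DateTime>>();
        DateTime pieceStart = start;

        while (pieceStart <= end)
        {
            // 30 June that closes the financial year containing pieceStart.
            int endYear = pieceStart.Month >= 7 ? pieceStart.Year + 1 : pieceStart.Year;
            DateTime fiscalEnd = new DateTime(endYear, 6, 30);

            DateTime pieceEnd = end < fiscalEnd ? end : fiscalEnd;
            pieces.Add(Tuple.Create(pieceStart, pieceEnd));

            // The next piece starts on the following day (1 July if we hit the cut).
            pieceStart = pieceEnd.AddDays(1);
        }
        return pieces;
    }
}
```

Calling Split once per range in the input list and grouping the resulting pieces by the financial year of their start date should reproduce the periods listed in the question.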
This is my simplest code for generating a list of the months in a financial year:
public void financialYearList()
{
List<Dictionary<string, DateTime>> diclist = new List<Dictionary<string, DateTime>>();
//financial year start from july and end june
int year = DateTime.Now.Month >= 7 ? DateTime.Now.Year + 1 : DateTime.Now.Year;
for (int i = 7; i <= 12; i++)
{
Dictionary<string, DateTime> dic = new Dictionary<string, DateTime>();
var first = new DateTime(year-1, i,1);
var last = first.AddMonths(1).AddDays(-1);
dic.Add("first", first);
dic.Add("lst", last);
diclist.Add(dic);
}
for (int i = 1; i <= 6; i++)
{
Dictionary<string, DateTime> dic = new Dictionary<string, DateTime>();
var first = new DateTime(year, i, 1);
var last = first.AddMonths(1).AddDays(-1);
dic.Add("first", first);
dic.Add("lst", last);
diclist.Add(dic);
}
}
