Text file parsing using helper class - c#

I have a text file containing a list of movie names and their parts, as below:
xxx, Author1, v6
the net, author1, v7
xxx, author3, v10
DDLJ, author3, v11
the fire, author5, v6
the health, author1, v8
the health, author7, v2
the hero, author9, v11
the hero, author8, v3
I would like to get the movie names that have the most recent version. In this case it should return "DDLJ" and "the hero".
This is what I have tried:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
namespace ProgramNamespace
{
public class Program
{
public static List<String> processData(IEnumerable<string> lines)
{
Dictionary<string, int> keyValuePairs = new Dictionary<string, int>();
foreach (var item in lines)
{
string[] readsplitted = item.Split(',');
keyValuePairs.Add(readsplitted[0], Convert.ToInt32(
Regex.Replace(readsplitted[2], "[^0-9]+", string.Empty)));
}
//List<String> retVal = new List<String>();
return retVal;
}
static void Main(string[] args)
{
try
{
List<String> retVal = processData(File.ReadAllLines(@"D:\input.txt"));
File.WriteAllLines(@"D:\output.txt", retVal);
}
catch (IOException ex)
{
Console.WriteLine(ex.Message);
}
}
}
}
Note that, if required I would like to add a helper class.

EDIT: version for duplicated keys
I rewrote the first solution I gave to take duplicated data into account. The trick is adding a progressive number before the key and separating it with an underscore: this way every key will be unique.
E.g. you will have your Dictionary filled like this:
"1_xxx", 6
"2_the net", 7
"3_xxx", 10
"4_DDLJ", 11
...
Then I remove the number (and the underscore) before providing a result.
public static List<String> processData(IEnumerable<string> lines)
{
var keyValuePairs = new Dictionary<string, int>();
int Position = 0;
foreach (var item in lines)
{
Position++;
string[] readsplitted = item.Split(',');
keyValuePairs.Add(Position.ToString() +"_" + readsplitted[0], Convert.ToInt32(Regex.Replace(readsplitted[2], "[^0-9]+", string.Empty)));
}
var MaxVersion = keyValuePairs.Values.OrderByDescending(f => f).First();
return keyValuePairs.Where(f => f.Value == MaxVersion).Select(f => string.Join("_", f.Key.Split('_').Skip(1))).ToList();
}
More in detail:
keyValuePairs.Values will return just the version numbers
.OrderByDescending(f => f).First() will sort the version numbers in descending order and pick the first, i.e. the highest
keyValuePairs.Where(f => f.Value == MaxVersion) will select the key-value pairs corresponding to the highest version above
.Select(f => string.Join("_", f.Key.Split('_').Skip(1))) will strip the numeric prefix from each key of your Dictionary and give you the original titles
This way you will also keep your Dictionary; if you are doing this one time and you don't need to expand your code or reuse your models, you won't have to create other classes or make it more complicated than necessary.
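As an aside, the numeric prefix is only needed because a Dictionary rejects duplicate titles. A minimal sketch that sidesteps this by keeping (title, version) pairs in a list; the names LatestVersions and TitlesWithMaxVersion are made up for illustration, and it assumes every line has three comma-separated fields with a version such as v11:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LatestVersions
{
    // Hypothetical helper: returns the titles whose version equals the highest version seen.
    public static List<string> TitlesWithMaxVersion(IEnumerable<string> lines)
    {
        // A list of tuples allows duplicate titles, unlike Dictionary keys.
        var entries = lines
            .Select(line => line.Split(','))
            .Select(parts => (
                Title: parts[0].Trim(),
                Version: int.Parse(Regex.Replace(parts[2], "[^0-9]+", string.Empty))))
            .ToList();

        // Max() throws on an empty sequence, so handle that case first.
        if (entries.Count == 0) return new List<string>();

        int maxVersion = entries.Max(e => e.Version);
        return entries.Where(e => e.Version == maxVersion)
                      .Select(e => e.Title)
                      .ToList();
    }
}
```

Unlike the Dictionary version, this trims the title, so no key surgery is needed before returning the result.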

For these kinds of tasks I usually prefer to create a class that represents the data we're collecting, and give it a TryParse method that will create an instance of the class based on a line of data:
public class MovieInfo
{
public string Name { get; set; }
public string Author { get; set; }
public int Version { get; set; }
public static bool TryParse(string input, out MovieInfo result)
{
result = null;
if (input == null) return false;
var parts = input.Split(',');
int version;
if (parts.Length == 3 &&
int.TryParse(parts[2].Trim().TrimStart('v'), out version))
{
result = new MovieInfo
{
Name = parts[0].Trim(),
Author = parts[1].Trim(),
Version = version
};
}
return result != null;
}
public override string ToString()
{
return $"{Name} (v{Version}) - {Author}";
}
}
Then it's just a matter of reading the file, creating a list of these classes, and getting all that have the highest number:
public static List<MovieInfo> processData(IEnumerable<string> lines)
{
if (lines == null) return null;
var results = new List<MovieInfo>();
foreach (var line in lines)
{
MovieInfo temp;
if (MovieInfo.TryParse(line, out temp))
{
results.Add(temp);
}
}
var maxVersion = results.Max(result => result.Version);
return results.Where(result => result.Version == maxVersion).ToList();
}
For example:
private static void Main()
{
var lines = new List<string>
{
"xxx, Author1, v6",
"the net, author1, v7",
"xxx, author3, v10",
"DDLJ, author3, v11",
"the fire, author5, v6",
"the health, author1, v8",
"the health, author7, v2",
"the hero, author9, v11",
"the hero, author8, v3",
};
var processed = processData(lines);
foreach (var movie in processed)
{
// Note: this uses the overridden ToString method. You could just do 'movie.Name'
Console.WriteLine(movie);
}
Console.WriteLine("\nDone!! Press any key to exit...");
Console.ReadKey();
}

This is how I would do it. This returns all the movie names that share the maximum version.
public static List<String> processData(string fileName)
{
var lines = File.ReadAllLines(fileName);
var values = lines.Select(x =>
{
var readsplitted = x.Split(',');
return new { Name = readsplitted[0], Version = int.Parse(readsplitted[2].Replace("v", string.Empty))};
});
var maxValue = values.Max(x => x.Version);
return values.Where(v => v.Version == maxValue)
.Select(v => v.Name)
.ToList();
}
static void Main(string[] args)
{
try
{
List<String> retVal = processData(@"D:\input.txt");
}
catch (IOException ex)
{
Console.WriteLine(ex.Message);
}
}

Create a Movie class in order to initialize an object for each row that represents a movie.
Split the whole string passed to processData() first by new line, then by ','.
Extract the version number of each movie (separate it from "v"); see the extractNumberFromString() method.
Find the max version number and get (using a LINQ query) all the movies that share the maximum version number.
public static List<Movie> processData(string s)
{
// list to store all movies
List<Movie> allmovies = new List<Movie>();
// first split by new line
var splitbynewline = s.Split('\n');
// split by ',' and initialize object
foreach (var line in splitbynewline)
{
var moviestring = line.Split(',');
// create new movie object
Movie obj = new Movie { Name = moviestring[0], Author = moviestring[1], Version = moviestring[2] };
obj.VersionNumber = extractNumberFromString(moviestring[2]);
allmovies.Add(obj);
}
// get the max version number
double maxver = allmovies.Max(x => x.VersionNumber);
// build and return the list that contains all movies with the max version
List<Movie> result = allmovies.Where(x => x.VersionNumber == maxver).ToList();
return result;
}
/// <summary>
/// Extracts the digits that appear in a string and converts them to a number;
/// for example, "sdfdf43gn" returns 43.
/// </summary>
/// <param name="value">A string that contains digits</param>
/// <returns>The extracted number as a double</returns>
public static double extractNumberFromString(string value)
{
string returnVal = string.Empty;
System.Text.RegularExpressions.MatchCollection collection = System.Text.RegularExpressions.Regex.Matches(value, "\\d+");
foreach (System.Text.RegularExpressions.Match m in collection)
{
returnVal += m.ToString();
}
return Convert.ToDouble(returnVal);
}
public class Movie
{
public string Name;
public String Author;
public string Version;
public double VersionNumber;
}

Related

How to properly access object's List<> value in C#?

I am trying to get the object's value but I don't know how to do it. I'm new to C# and it's giving me a syntax error. I want to print each value separately via the method "PrintSample". How can I concatenate or append the whatData variable to the property access? Thank you.
PrintSample(getData, "name");
PrintSample(getData, "phone");
PrintSample(getData, "address");
//Reading the CSV file and put it in the object
string[] lines = File.ReadAllLines("sampleData.csv");
var list = new List<Sample>();
foreach (var line in lines)
{
var values = line.Split(',');
var sampleData = new Sample()
{
name = values[0],
phone = values[1],
address = values[2]
};
list.Add(sampleData);
}
public class Sample
{
public string name { get; set; }
public string phone { get; set; }
public string address { get; set; }
}
//Method to call to print the Data
private static void PrintSample(Sample getData, string whatData)
{
//THis is where I'm having error, how can I just append the whatData to the x.?
Console.WriteLine( $"{getData. + whatData}");
}
In C# it's not possible to dynamically evaluate an expression like
$"{getData. + whatData}"
as you can in languages like JavaScript.
I'd suggest using a switch expression or a Dictionary<string, string> instead:
public void PrintData(Sample sample, string whatData)
{
var data = whatData switch
{
"name" => sample.name,
"phone" => sample.phone,
"address" => sample.address,
_ => throw new ArgumentOutOfRangeException(nameof(whatData)),
};
Console.WriteLine(data);
}
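The dictionary alternative is only mentioned, not shown. A minimal sketch of one way it could look, using a Dictionary<string, Func<Sample, string>> (a variation on the Dictionary<string, string> idea, so that the lookup works for any instance). The Sample class is repeated from the question with address spelled consistently; SamplePrinter and GetData are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public class Sample
{
    public string name { get; set; }
    public string phone { get; set; }
    public string address { get; set; }
}

public static class SamplePrinter
{
    // Maps a field name to a function that reads that field from a Sample.
    private static readonly Dictionary<string, Func<Sample, string>> Getters =
        new Dictionary<string, Func<Sample, string>>
        {
            ["name"] = s => s.name,
            ["phone"] = s => s.phone,
            ["address"] = s => s.address,
        };

    // Hypothetical helper: returns the requested field, or throws for an unknown name.
    public static string GetData(Sample sample, string whatData)
    {
        if (!Getters.TryGetValue(whatData, out var getter))
            throw new ArgumentOutOfRangeException(nameof(whatData));
        return getter(sample);
    }

    public static void PrintData(Sample sample, string whatData)
    {
        Console.WriteLine(GetData(sample, whatData));
    }
}
```

Compared with the switch expression, the set of supported names lives in data rather than code, which can be convenient if it needs to be enumerated or extended.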
I'm not sure what you are trying to achieve. Perhaps this will help you:
private static void PrintSample(Sample getData, string whatData)
{
var property = getData.GetType().GetProperty(whatData);
string value = (string)property?.GetValue(getData) ?? "";
Console.WriteLine($"{value}");
}
What the OP really needs is
private static void PrintSamples(List<Sample> samples)
{
foreach (var sample in samples)
Console.WriteLine($"name : {sample.name} phone: {sample.phone} address: {sample.address} ");
}
and code
var list = new List<Sample>();
foreach (var line in lines)
{
......
}
PrintSamples(list);
It is ridiculous to use
PrintSample(getData, "name");
instead of just
PrintSample(getData.name)
You can do this using reflection. However, it's known to be relatively slow.
public static void PrintSample(object getData, string whatData)
{
Console.WriteLine( $"{getData.GetType().GetProperty(whatData).GetValue(getData, null)}");
}

How to read large text files and keep tracking of the information of previous lines using C#?

(This problem is an adaptation of a real-life scenario. I reduced the problem so it is easy to understand; otherwise this question would be 10000 lines long.)
I have a pipe delimited text file that looks like this (the header is not in the file):
Id|TotalAmount|Reference
1|10000
2|50000
3|5000|1
4|5000|1
5|10000|2
6|10000|2
7|500|9
8|500|9
9|1000
The reference is optional and is the Id of another entry in this text file. The entries that have a reference are considered "children" of that reference, and the reference is their parent. I need to validate each parent in the file, and the validation is that the sum of TotalAmount of its children should be equal to the parent's total amount. The parents can come either before or after their children in the file, like the entry with Id 9, which comes after its children.
In the provided file, the entry with Id 1 is valid, because the sum of the total amount of its children (Ids 3 and 4) is 10000, and the entry with Id 2 is invalid, because the sum of its children (Ids 5 and 6) is 20000.
For a small file like this, I could just parse everything to objects like this (pseudo code, I don't have a way to run it now):
class Entry
{
public int Id { get; set; }
public int TotalAmout { get; set; }
public int Reference { get; set; }
}
class Validator
{
public void Validate()
{
List<Entry> entries = GetEntriesFromFile(@"C:\entries.txt");
foreach (var entry in entries)
{
var children = entries.Where(e => e.Reference == entry.Id).ToList();
if (children.Count > 0)
{
var sum = children.Sum(e => e.TotalAmout);
if (sum == entry.TotalAmout)
{
Console.WriteLine("Entry with Id {0} is valid", entry.Id);
}
else
{
Console.WriteLine("Entry with Id {0} is INVALID", entry.Id);
}
}
else
{
Console.WriteLine("Entry with Id {0} is valid", entry.Id);
}
}
}
public List<Entry> GetEntriesFromFile(string file)
{
var entries = new List<Entry>();
using (var r = new StreamReader(file))
{
while (!r.EndOfStream)
{
var line = r.ReadLine();
var splited = line.Split('|');
var entry = new Entry();
entry.Id = int.Parse(splited[0]);
entry.TotalAmout = int.Parse(splited[1]);
if (splited.Length == 3)
{
entry.Reference = int.Parse(splited[2]);
}
entries.Add(entry);
}
}
return entries;
}
}
The problem is that I am dealing with large files (10 GB), and that would load way too many objects in memory.
Performance itself is NOT a concern here. I know that I could use dictionaries instead of the Where() method, for example. My only problem now is performing the validation without loading everything into memory, and I don't have any idea how to do it, because an entry at the bottom of the file may have a reference to an entry at the top, so I need to keep track of everything.
So my question is: is it possible to keep track of each line in a text file without loading its information into memory?
Since performance is not an issue here, I would approach this in the following way:
First, I would sort the file so all the parents go right before their children. There are classical methods for sorting huge external data, see https://en.wikipedia.org/wiki/External_sorting
After that, the task becomes pretty trivial: read a parent data, remember it, read and sum children data one by one, compare, repeat.
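A minimal sketch of that second step, assuming the external sort has already produced a sequence in which every parent is immediately followed by all of its children. The SortedValidator class and the tuple layout are hypothetical and not part of the question's code:

```csharp
using System;
using System.Collections.Generic;

public static class SortedValidator
{
    // Assumes each parent row is immediately followed by its child rows.
    // Yields (parent id, isValid); only one parent's state is held at a time.
    public static IEnumerable<(int Id, bool IsValid)> Validate(
        IEnumerable<(int Id, int Amount, int? Reference)> sortedRows)
    {
        int? parentId = null;
        int expected = 0, childSum = 0, childCount = 0;

        foreach (var row in sortedRows)
        {
            if (row.Reference == null)
            {
                // A new parent starts: report the previous one first.
                if (parentId != null)
                    yield return (parentId.Value, childCount == 0 || childSum == expected);
                parentId = row.Id;
                expected = row.Amount;
                childSum = 0;
                childCount = 0;
            }
            else
            {
                // A child of the current parent: add it to the running sum.
                childSum += row.Amount;
                childCount++;
            }
        }

        // Report the last parent once the input is exhausted.
        if (parentId != null)
            yield return (parentId.Value, childCount == 0 || childSum == expected);
    }
}
```

As in the question's code, a parent without children counts as valid. The rows could be produced lazily from File.ReadLines so that the file is never fully loaded.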
All you really need to keep in memory is the expected total for each non-child entity, and the running sum of the child totals for each parent entity. Everything else you can throw out, and if you use the File.ReadLines API, you can stream over the file and 'forget' each line once you've processed it. Since the lines are read on demand, you don't have to keep the entire file in memory.
public class Entry
{
public int Id { get; set; }
public int TotalAmount { get; set; }
public int? Reference { get; set; }
}
public static class EntryValidator
{
public static void Validate(string file)
{
var entries = GetEntriesFromFile(file);
var childAmounts = new Dictionary<int, int>();
var nonChildAmounts = new Dictionary<int, int>();
foreach (var e in entries)
{
if (e.Reference is int p)
childAmounts.AddOrUpdate(p, e.TotalAmount, (_, n) => n + e.TotalAmount);
else
nonChildAmounts[e.Id] = e.TotalAmount;
}
foreach (var id in nonChildAmounts.Keys)
{
var expectedTotal = nonChildAmounts[id];
if (childAmounts.TryGetValue(id, out var childTotal) &&
childTotal != expectedTotal)
{
Console.WriteLine($"Entry with Id {id} is INVALID");
}
else
{
Console.WriteLine($"Entry with Id {id} is valid");
}
}
}
private static IEnumerable<Entry> GetEntriesFromFile(string file)
{
foreach (var line in File.ReadLines(file))
yield return GetEntryFromLine(line);
}
private static Entry GetEntryFromLine(string line)
{
var parts = line.Split('|');
var entry = new Entry
{
Id = int.Parse(parts[0]),
TotalAmount = int.Parse(parts[1])
};
if (parts.Length == 3)
entry.Reference = int.Parse(parts[2]);
return entry;
}
}
This uses a nifty extension method for IDictionary<K, V>:
public static class DictionaryExtensions
{
public static TValue AddOrUpdate<TKey, TValue>(
this IDictionary<TKey, TValue> dictionary,
TKey key,
TValue addValue,
Func<TKey, TValue, TValue> updateCallback)
{
if (dictionary == null)
throw new ArgumentNullException(nameof(dictionary));
if (updateCallback == null)
throw new ArgumentNullException(nameof(updateCallback));
if (dictionary.TryGetValue(key, out var value))
value = updateCallback(key, value);
else
value = addValue;
dictionary[key] = value;
return value;
}
}

Parse a text file into instances of a class based on keywords c#

I have a text file that contains a list of points:
POINT:
TYPE 5,
OBJECT ID 2,
DEVICE TYPE CAT,
TAG 'ADDRESS-1',
DESCRIPTION 'kitty',
UNITS 'Lb',
POINT:
TYPE 5,
OBJECT ID 2,
DEVICE TYPE CAT,
TAG 'ADDRESS-2',
DESCRIPTION 'orange kitty',
UNITS 'Lb',
POINT:
TYPE 2,
OBJECT ID 3,
DEVICE TYPE DOG,
TAG 'ADDRESS-5',
DESCRIPTION 'brown dog',
UNITS 'Lb',
From this, I want to create instances (in this case, 2) of my class 'Cat' that contain the tag and description from this text file (and then put them in a list of Cats). I only want to take the description and tag from points of Type 5 (those are the cats).
I'm not sure what the best approach is to get the strings I want. I need to search the entire file for all points of type 5, then for each of those points, take the description and tag and add it to a new Cat.
public static void Main()
{
string line;
List<Cat> catList = new List<Cat>();
StreamReader file = new StreamReader(@"C:\Config\pets.txt");
while((line = file.ReadLine()) != null)
{
string[] words = line.Split(',');
catList.Add(new Cat cat1)
}}
I ended up doing it this way:
public static List<List<string>> Parse()
{
string filePath = @"C:\Config\pets.txt";
string readText = File.ReadAllText(filePath);
string[] stringSeparators = new string[] { "POINT:" }; //POINT is the keyword the text will be split on
string[] result;
result = readText.Split(stringSeparators, StringSplitOptions.None);
List<List<string>> catData = new List<List<string>>();
//split the text into a list of pieces
List<string> tags = new List<string>(); //tags go here
List<string> descriptions = new List<string>(); //descriptions go here
foreach (string s in result)
{
if (s.Contains("TYPE 5")) //TYPE 5 = CAT
{
string[] parts = s.Split(','); //split the cat by commas
string chop = "'"; //once tags and descriptions have been found, only want to keep what is inside single quotes ie 'orange kitty'
foreach (string part in parts)
{
if (part.Contains("TAG"))
{
int startIndex = part.IndexOf(chop);
int endIndex = part.LastIndexOf(chop);
int length = endIndex - startIndex + 1;
string tag = part.Substring(startIndex, length);
tag = tag.Replace(chop, string.Empty);
tags.Add(tag);
//need to create instance of Cat with this tag
}
if (part.Contains("DESCRIPTION"))
{
int startIndex = part.IndexOf(chop);
int endIndex = part.LastIndexOf(chop);
int length = endIndex - startIndex + 1;
string description = part.Substring(startIndex, length);
description = description.Replace(chop, string.Empty);
descriptions.Add(description);
//need to add description to Cat instance that matches associated tag
}
}
}
}
catData.Add(tags);
catData.Add(descriptions);
return catData;
}
What I would do is create a class that represents the fields you want to capture. For this example, I'm capturing all the fields, but you can customize it how you want. I called this class "Animal", since it seems the "points" represent an animal.
Then I would add a static Parse method to the class that will return an instance of an Animal based on some input string. This method will parse the input string and attempt to set the relevant properties of the Animal object based on the values in the string.
I also added a ToString() override on the class so we have some way of displaying the output:
class Animal
{
public int Type { get; set; }
public int ObjectId { get; set; }
public string DeviceType { get; set; }
public string Tag { get; set; }
public string Description { get; set; }
public string Units { get; set; }
/// <summary>
/// Parses an input string and returns an Animal based
/// on any property values found in the string
/// </summary>
/// <param name="input">The string to parse for property values</param>
/// <returns>An animal instance with specified properties</returns>
public static Animal Parse(string input)
{
var result = new Animal();
if (string.IsNullOrWhiteSpace(input)) return result;
// Parse input string and set fields accordingly
var keyValueParts = input
.Split(new [] {','}, StringSplitOptions.RemoveEmptyEntries)
.Select(kvp => kvp.Trim());
foreach (var keyValuePart in keyValueParts)
{
if (keyValuePart.StartsWith("Type",
StringComparison.OrdinalIgnoreCase))
{
int type;
var value = keyValuePart.Substring("Type".Length).Trim();
if (int.TryParse(value, out type))
{
result.Type = type;
}
}
else if (keyValuePart.StartsWith("Object Id",
StringComparison.OrdinalIgnoreCase))
{
int objectId;
var value = keyValuePart.Substring("Object Id".Length).Trim();
if (int.TryParse(value, out objectId))
{
result.ObjectId = objectId;
}
}
else if (keyValuePart.StartsWith("Device Type",
StringComparison.OrdinalIgnoreCase))
{
var value = keyValuePart.Substring("Device Type".Length).Trim();
result.DeviceType = value;
}
else if (keyValuePart.StartsWith("Tag",
StringComparison.OrdinalIgnoreCase))
{
var value = keyValuePart.Substring("Tag".Length).Trim();
result.Tag = value;
}
else if (keyValuePart.StartsWith("Description",
StringComparison.OrdinalIgnoreCase))
{
var value = keyValuePart.Substring("Description".Length).Trim();
result.Description = value;
}
else if (keyValuePart.StartsWith("Units",
StringComparison.OrdinalIgnoreCase))
{
var value = keyValuePart.Substring("Units".Length).Trim();
result.Units = value;
}
}
return result;
}
public override string ToString()
{
// Return a string that describes this animal
var animalProperties = new StringBuilder();
animalProperties.Append($"Type = {Type}, Object Id = {ObjectId}, ");
animalProperties.Append($"Device Type = {DeviceType}, Tag = {Tag}, ");
animalProperties.Append($"Description = {Description}, Units = {Units}");
return animalProperties.ToString();
}
}
Now that we have an object that can create itself from a string, we just need to read in the file contents, split it on the "Point:" keyword, and then pass each string to the Animal class to get our instances.
I added a System.Linq clause to filter on animals that have Type = 5 (which you said are all cats), since you said that was the only animal you were interested in. Of course you could remove this to get all animals, or filter on a different Type to get the dogs, etc.:
private static void Main()
{
var filePath = @"f:\public\temp\temp.txt";
// Read all file contents and split it on the word "Point:"
var fileContents = Regex
.Split(File.ReadAllText(filePath), "Point:", RegexOptions.IgnoreCase)
.Where(point => !string.IsNullOrWhiteSpace(point))
.Select(point => point.Trim());
// Get all animals that are cats from the results
var catList = fileContents
.Select(Animal.Parse)
.Where(animal => animal.Type == 5)
.ToList();
// Output results
catList.ForEach(Console.WriteLine);
// Wait for input before closing
Console.WriteLine("\nDone!\nPress any key to exit...");
Console.ReadKey();
}
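One detail worth noting: Animal.Parse as written keeps the surrounding single quotes in Tag, Description and Units (for example 'orange kitty'). A small sketch of a helper that strips them; QuoteHelper and Unquote are hypothetical names, not part of the answer's code:

```csharp
using System;

public static class QuoteHelper
{
    // Removes surrounding whitespace and single quotes:
    // " 'orange kitty' " becomes "orange kitty".
    public static string Unquote(string value)
    {
        if (value == null) return null;
        return value.Trim().Trim('\'');
    }
}
```

It could be applied inside Parse where the string values are assigned, for instance to the result of keyValuePart.Substring("Tag".Length).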

C# LINQ statement

I have two lists; List<int> numList has identifier number as its elements, and List<string> filePaths has path to file that needs to be analyzed as its elements. I want to filter filePaths based on the numList; that is, I only want to select the filePaths whose file names have the identifier number that is present in the numList.
For example, filePaths has
C:/test/1.test.xlsx
C:/test/2.test.xlsx
C:/test/3.test.xlsx
C:/test/4.test.xlsx
and, numList has
1
2
In this case, I want to construct LINQ statement to only get
C:/test/1.test.xlsx
C:/test/2.test.xlsx
I tried
for(int i = 0; i < numList.Count; i++)
{
filePaths = filePaths.Where(f => Convert.ToInt32(GetNumberFromString(Path.GetFileName(f))) == numList[i]).ToList();
}
And this is GetNumberFromString Helper Method
// Find number in the string
private string GetNumberFromString(string value)
{
int number;
string resultString = Regex.Match(value, @"\d+").Value;
if (Int32.TryParse(resultString, out number))
{
return resultString;
}
else
{
throw new Exception(String.Format("No number present in the file {0}", value));
}
}
I think this will work, but is there more elegant/efficient way of achieving this?
You can do it with a one-liner:
var filteredFilePaths = filePaths.Where(x => numList.Contains(Convert.ToInt32(GetNumberFromString(Path.GetFileName(x)))));
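For larger inputs, List<int>.Contains scans the list for every path. A sketch of the same filter using a HashSet<int> for the lookups; PathFilter and FilterByNumber are illustrative names, and paths whose file name contains no digits are skipped rather than throwing, which is a deliberate departure from the question's helper:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class PathFilter
{
    // Hypothetical helper: keeps the paths whose file name contains a wanted number.
    public static List<string> FilterByNumber(IEnumerable<string> filePaths, IEnumerable<int> numList)
    {
        var wanted = new HashSet<int>(numList); // constant-time lookups on average

        return filePaths
            .Where(path =>
            {
                // First run of digits in the file name, e.g. "1" in "1.test.xlsx".
                var match = Regex.Match(Path.GetFileName(path), @"\d+");
                return match.Success
                    && int.TryParse(match.Value, out var number)
                    && wanted.Contains(number);
            })
            .ToList();
    }
}
```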
I'd do it like this. The test method assumes that all the files in directory have appropriately formatted names. If that's not a reasonable assumption, it's easy enough to fix.
This is overkill, however, if you only ever care about the "file number" in one place.
public class TestClass
{
public static void TestMethod(String directory)
{
var files = System.IO.Directory.GetFiles(directory).Select(f => new FileInfo(f)).ToList();
var numList = new[] { 1, 2 };
var oneAndTwo = files.Where(fi => numList.Contains(fi.FileNumber)).ToList();
}
}
public class FileInfo
{
public FileInfo()
{
}
public FileInfo(String path)
{
Path = path;
}
public int FileNumber { get; private set; }
private string _path;
public String Path
{
get { return _path; }
set
{
_path = value;
FileNumber = GetNumberFromFileName(_path);
}
}
public static int GetNumberFromFileName(string path)
{
int number;
var fileName = System.IO.Path.GetFileName(path);
string resultString = Regex.Match(fileName, @"\d+").Value;
if (Int32.TryParse(resultString, out number))
{
return number;
}
else
{
throw new Exception(String.Format("No number present in the file {0}", path ?? "(null)"));
}
}
}
A stand-alone One-liner using a Join :
var result = filePaths.Select(x => new { Filename = Path.GetFileName(x), x })
.Join(numList, x => Regex.Match(x.Filename, "^([0-9]+)").Value,
y => y.ToString(),
(x, y) => x.x);

How to combine 2 LINQ dictionaries into 1?

I have 2 excel files that I have converted into lists. The 1st file has a complete list of all items that I need. However, the 2nd list has a small list of items that need to be changed in the 1st list.
Here's how my 1st list is constructed:
IEnumerable<ExcelRow> queryListA = from d in datapullList
select new ExcelRow
{
Company = d.GetString(0),
Location = d.GetString(1),
ItemPrice = d.GetString(4),
SQL_Ticker = d.GetString(15)
};
The 2nd list is constructed in a very similar way:
IEnumerable<ExcelRow> queryListB = from dupes in dupespullList
select new ExcelRow
{
Company = dupes.GetString(0),
Location = dupes.GetString(1),
NewCompany = dupes.GetString(4)
};
So, if there is a company from a particular location in 1st list that matches 2nd list, then the company gets changed to the newcompany name.
Then, my final list should have everything in 1st list but with the changes specified from 2nd list.
I've been struggling with this for a few days now. Let me know if you need more details.
[Update:] I'm pretty new to LINQ and C#. I've found this code on the web regarding Excel reader for Office 2003. How can I create the 1 list (stated above) from all the following classes?
My ExcelRow class:
class ExcelRow
{
List<object> columns;
public ExcelRow()
{
columns = new List<object>();
}
internal void AddColumn(object value)
{
columns.Add(value);
}
public object this[int index]
{
get { return columns[index]; }
}
public string GetString(int index)
{
if (columns[index] is DBNull)
{
return null;
}
return columns[index].ToString();
}
public int Count
{
get { return this.columns.Count; }
}
}
My ExcelProvider class:
class ExcelProvider : IEnumerable<ExcelRow>
{
private string sheetName;
private string filePath;
private string columnName1;
private string columnName2;
private List<ExcelRow> rows;
public ExcelProvider()
{
rows = new List<ExcelRow>();
}
public static ExcelProvider Create(string filePath, string sheetName, string columnName1, string columnName2)
{
ExcelProvider provider = new ExcelProvider();
provider.sheetName = sheetName;
provider.filePath = filePath;
provider.columnName1 = columnName1;
provider.columnName2 = columnName2;
return provider;
}
private void Load()
{
string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties= ""Excel 8.0;HDR=YES;IMEX=1""";
connectionString = string.Format(connectionString, filePath);
rows.Clear();
using (OleDbConnection conn = new OleDbConnection(connectionString))
{
try
{
conn.Open();
using (OleDbCommand cmd = conn.CreateCommand())
{
cmd.CommandText = string.Format("SELECT * FROM [{0}$] WHERE {1} IS NOT NULL AND {2} <> \"{3}\"", sheetName, columnName1, columnName2, null);
using (OleDbDataReader reader = cmd.ExecuteReader())
{
while (reader.Read())
{
ExcelRow newRow = new ExcelRow();
for (int count = 0; count < reader.FieldCount; count++)
{
newRow.AddColumn(reader[count]);
}
rows.Add(newRow);
}
}
}
}
catch (Exception ex)
{ throw ex; }
finally
{
if (conn.State == System.Data.ConnectionState.Open)
conn.Close();
}
}
}
public IEnumerator<ExcelRow> GetEnumerator()
{
Load();
return rows.GetEnumerator();
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
Load();
return rows.GetEnumerator();
}
}
So, using all this logic, how can I solve my problem?
// First create a dictionary of companies whose names have been changed
var dict = queryListB.ToDictionary(x => x.Company, y => y.NewCompany);
// Materialize the first list (IEnumerable<T> has no ForEach), then apply the changes to it
var listA = queryListA.ToList();
listA.ForEach(x =>
{
if (dict.ContainsKey(x.Company))
x.Company = dict[x.Company];
});
Loop through queryListA and see if there is a matching company in queryListB. If so, then update the Company property.
Here's the code:
foreach (var companyA in queryListA)
{
var companyBMatch = queryListB.FirstOrDefault(x => x.Company == companyA.Company && x.Location == companyA.Location);
if (companyBMatch != null)
companyA.Company = companyBMatch.NewCompany;
}
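The loop above rescans queryListB for every row of queryListA. A sketch of the same match done with a dictionary keyed on the (Company, Location) pair; the Row class is a simplified stand-in for the question's ExcelRow, and CompanyRenamer and ApplyRenames are made-up names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's ExcelRow (an assumption for this sketch).
public class Row
{
    public string Company { get; set; }
    public string Location { get; set; }
    public string NewCompany { get; set; }
}

public static class CompanyRenamer
{
    public static List<Row> ApplyRenames(IEnumerable<Row> listA, IEnumerable<Row> listB)
    {
        // Value tuples compare by value, so they work as composite keys.
        var renames = new Dictionary<(string, string), string>();
        foreach (var b in listB)
            renames[(b.Company, b.Location)] = b.NewCompany; // last entry wins on duplicates

        // Materialize once so the updates apply to the objects that are returned.
        var result = listA.ToList();
        foreach (var a in result)
        {
            if (renames.TryGetValue((a.Company, a.Location), out var newName))
                a.Company = newName;
        }
        return result;
    }
}
```

Materializing with ToList() matters here because the question's lists are lazy LINQ queries: enumerating them again would create fresh, unmodified rows.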
I'm sure you can write simpler code to achieve the same goal but I've gone for a way that reduces the number of times you have to iterate through the first and second lists. If performance isn't an issue a simpler method that just searches the dupespullList for each element in datapullList might be appropriate.
var excelRowCreator = new ExcelRowCreator(dupespullList);
var finalRows = excelRowCreator.CreateExcelRows(datapullList);
// ...
public class ExcelRowCreator
{
/// <summary>
/// First key is company name, second is location
/// and final value is the replacement name.
/// </summary>
private readonly IDictionary<string, IDictionary<string, string>> nameReplacements;
/// <summary>
/// I don't know what type of objects your initial
/// lists contain so replace T with the correct type.
/// </summary>
public ExcelRowCreator(IEnumerable<T> replacementRows)
{
nameReplacements = CreateReplacementDictionary(replacementRows);
}
/// <summary>
/// Creates ExcelRows by replacing company name where appropriate.
/// </summary>
public IEnumerable<ExcelRow> CreateExcelRows(IEnumerable<T> inputRows)
{
// ToList is here so that if you iterate over the collection
// multiple times it doesn't create new excel rows each time
return inputRows.Select(CreateExcelRow).ToList();
}
/// <summary>
/// Creates an excel row from the input data replacing
/// the company name if required.
/// </summary>
private ExcelRow CreateExcelRow(T data)
{
var name = data.GetString(0);
var location = data.GetString(1);
IDictionary<string, string> replacementDictionary;
if (nameReplacements.TryGetValue(name, out replacementDictionary))
{
string replacementName;
if (replacementDictionary.TryGetValue(location, out replacementName))
{
name = replacementName;
}
}
return new ExcelRow
{
Company = name,
Location = location,
ItemPrice = data.GetString(4),
SQL_Ticker = data.GetString(15)
};
}
/// <summary>
/// A helper method to create the replacement dictionary.
/// </summary>
private static IDictionary<string, IDictionary<string, string>> CreateReplacementDictionary(IEnumerable<T> replacementRows)
{
var replacementDictionary = new Dictionary<string, IDictionary<string, string>>();
foreach (var dupe in replacementRows)
{
var name = dupe.GetString(0);
IDictionary<string, string> locationReplacements;
if (!replacementDictionary.TryGetValue(name, out locationReplacements))
{
locationReplacements = new Dictionary<string, string>();
replacementDictionary[name] = locationReplacements;
}
locationReplacements[dupe.GetString(1)] = dupe.GetString(4);
}
return replacementDictionary;
}
}
UPDATE: Packaged as a class and written in Visual Studio, so there shouldn't be any syntax errors.
