Get the values from CSV based on headers - c#

I have a csv file with this data
Price,volume,"Local, Zones 1 & 2",Zone 3,Zone 4,Zone 5,Zone 6,Zone 7,Zone 8,Zone 9
1,0.1,4.58,4.65,4.74,4.99,5.23,5.47,5.82,6.98
2,0.2,4.99,5.12,5.28,5.5,5.7,5.91,6.3,7.56
3,0.3,5.22,5.61,6.12,7.64,8.39,9.09,9.96,11.94
4,0.4,5.4,6.31,7.24,9.13,10.73,11.77,13.26,15.91
5,0.5,6.18,7.21,8.35,11.5,13.41,14.82,16.97,20.36
Now I want to retrieve a value based on volume and zone. If the volume is 0.1 and the zone is 1, I should get 4.58. Similarly, if the volume is 0.5 and the zone is 2, the value should be 6.18.
How can I do this in C#?

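A naive approach, assuming the file format is not going to change and the CSV file is valid. Zones 1 and 2 share the column at index 2, and from zone 3 onward the zone number equals the column index, hence Math.Max(zone, 2). Splitting on ',' would break the quoted header, but the header row is skipped: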
int zone = 1;
double value = 0.1;
int zoneColumnIndex = Math.Max(zone, 2);
string valueString = value.ToString(CultureInfo.InvariantCulture);
string result = File.ReadLines("sample.txt")
.Skip(1)
.Select(s => s.Split(','))
.Where(t => t[1] == valueString)
.Select(t => t[zoneColumnIndex])
.FirstOrDefault();
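Since the question asks for a lookup based on headers, here is a minimal sketch that finds the column by its header text instead of a fixed index. The class name ZoneLookup and its methods are hypothetical, and the mapping of zones 1 and 2 to the "Local, Zones 1 & 2" column is an assumption taken from the sample data. The small splitter only handles commas and doubled quotes inside quoted fields, not quoted line breaks.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public static class ZoneLookup
{
    // Splits one CSV line on commas, treating commas inside double quotes as data.
    public static List<string> SplitCsvLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                // A doubled quote inside a quoted field is an escaped quote.
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"');
                    i++;
                }
                else
                {
                    inQuotes = !inQuotes;
                }
            }
            else if (c == ',' && !inQuotes)
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());
        return fields;
    }

    // Returns the cell for the given volume and zone, or null when nothing matches.
    public static string Find(IEnumerable<string> lines, double volume, int zone)
    {
        List<List<string>> rows = lines.Select(SplitCsvLine).ToList();
        if (rows.Count == 0) return null;
        List<string> header = rows[0];

        // Assumption: zones 1 and 2 share one column; other zones use "Zone N".
        string wanted = zone <= 2 ? "Local, Zones 1 & 2" : "Zone " + zone;
        int zoneColumn = header.FindIndex(
            h => h.Trim().Equals(wanted, StringComparison.OrdinalIgnoreCase));
        int volumeColumn = header.FindIndex(
            h => h.Trim().Equals("volume", StringComparison.OrdinalIgnoreCase));
        if (zoneColumn < 0 || volumeColumn < 0) return null;

        foreach (List<string> row in rows.Skip(1))
        {
            // Skip short or malformed rows.
            if (row.Count <= Math.Max(zoneColumn, volumeColumn)) continue;
            if (double.TryParse(row[volumeColumn], NumberStyles.Float,
                    CultureInfo.InvariantCulture, out double v)
                && Math.Abs(v - volume) < 1e-9)
            {
                return row[zoneColumn];
            }
        }
        return null;
    }
}
```

With the sample file, ZoneLookup.Find(File.ReadLines("sample.txt"), 0.1, 1) would return "4.58" and a volume of 0.5 with zone 2 would return "6.18". For files with more complex quoting, a dedicated CSV library is likely the safer choice.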

I would look at "LinqToExcel" if I were you. Then you could do this:
var csv = new LinqToExcel.ExcelQueryFactory(csvFile);
var query =
from row in csv.Worksheet()
let Volume = row["volume"].Cast<double>()
where Volume == 0.1
select new
{
Price = row["Price"].Cast<int>(),
Volume,
Local = row["Local, Zones 1 & 2"].Cast<decimal>(),
Zone3 = row["Zone 3"].Cast<decimal>(),
//etc
};

To get the values into an array or a list:
List<string> listA = new List<string>();
using (var reader = new StreamReader(File.OpenRead(@"C:\test.csv")))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
// Collect every field of the line (a plain Split does not handle quoted commas).
listA.AddRange(values);
}
}
Now use your list to iterate through the values:
foreach (string s in listA)
{
try
{
int j = int.Parse(s);
/* do stuff */
}
catch (FormatException)
{/* if your CSV contains non-numeric strings, it will fall in here */}
}

Related

C# convert dictionary into CSV like string

I have a dictionary, Dictionary<string, List<string>>. I want to order it alphabetically by its keys and convert it into a string that can be written to a CSV file, with the keys as column headers and the values as the values for each column.
My unordered dictionary looks like:
{
"Name" : ["John", "Ciara", "Moses"],
"Age" : ["23", "16", "37"],
"State" : ["Alabama", "Florida", "New York"]
}
The end result will look like:
Age,Name,State
23,John,Alabama
16,Ciara,Florida
37,Moses,New York
How can I achieve this in C#?
For clarity, here is a link to what the task entails.
Below is my approach to solving it. I converted the string into a dictionary with the column headings as keys. My problem now is converting the dictionary back to the string format.
public static string SortCsvColumns( string csv_data )
{
var data = csv_data.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
var values = data.Skip(1).ToArray();
var splittedValues = new List<List<string>>();
var dataSet = data[0].Split(new string[] {","}, StringSplitOptions.None).ToDictionary(x => x, x => new List<string>());
for(int i = 0; i < values.Length; i++)
{
splittedValues.Add(values[i].Split(new string[] { "," }, StringSplitOptions.None).ToList());
}
for(int i = 0; i < splittedValues.Count(); i++) {
var splittedValue = splittedValues[i];
for(int j = 0; j < splittedValue.Count(); j++) {
dataSet.Values.ElementAt(i).Add(splittedValue[j]);
}
}
dataSet = dataSet.OrderBy(key => key.Key);
}
Can someone suggest the best approach to do this please.
First, create a map to re-order the columns in sorted order by column header (this code splits on ';'; use ',' if your data is comma-separated, as in the sample above):
var map = new StringReader(csv_data).ReadLine() // get header line
.Split(';') // split into array of headers
.Select((h, n) => new { header = h, OrigPos = n }) // remember original position
.OrderBy(hn => hn.header, StringComparer.CurrentCultureIgnoreCase) // sort into new position
.Select(hn => hn.OrigPos) // return just the old positions, in the new order
.ToList();
Then remap each CSV line into the new order and recombine into a string:
using var sr = new StringReader(csv_data);
var ans = String.Join("\n",
sr.ReadLines()
.Select(line => line.Split(';'))
.Select(columns => String.Join(";", map.Select(pos => columns[pos]))));
This requires an extension method on TextReader to enumerate the lines of a TextReader:
public static class StringReaderExt {
public static IEnumerable<string> ReadLines(this TextReader sr) {
string line;
while ((line = sr.ReadLine()) != null)
yield return line;
}
}
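If you already have the Dictionary<string, List<string>> from the question, the conversion back to text can be sketched directly. The class name DictionaryCsv and the method ToCsv are hypothetical, the ordinal sort order and the comma separator are assumptions, and the sketch does not quote fields that themselves contain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryCsv
{
    // Builds CSV text with the dictionary keys, sorted, as the header row.
    public static string ToCsv(Dictionary<string, List<string>> columns)
    {
        // Assumption: ordinal ordering is acceptable for the headers.
        List<string> headers = columns.Keys.OrderBy(k => k, StringComparer.Ordinal).ToList();

        // Use the longest column so that no value is dropped.
        int rowCount = headers.Count == 0 ? 0 : headers.Max(h => columns[h].Count);

        var lines = new List<string> { string.Join(",", headers) };
        for (int row = 0; row < rowCount; row++)
        {
            // Shorter columns contribute an empty cell for the missing rows.
            lines.Add(string.Join(",",
                headers.Select(h => row < columns[h].Count ? columns[h][row] : "")));
        }
        return string.Join(Environment.NewLine, lines);
    }
}
```

For the dictionary shown in the question, this should produce the header Age,Name,State followed by the three rows in the expected output.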

Looking for next match (integer) in list of strings

I have a problem finding the next integer match in a list of strings, there are some other aspects to consider:
each string contains irrelevant leading and trailing characters
numbers are formatted as "D6", for example 000042
there are gaps in the numbers
the list is not sorted, but it could be if there is a fast way to ignore the leading chars
Example:
abc-000001.file
aaac-000002.file
ab-002010.file
abbc-000003.file
abbbc-000004.file
abcd-000008.file
abc-000010.file
x-902010.file
The user input is 7 => next matching string would be abcd-000008.file
My attempt is :
int userInput = 0;
int counter = 0;
string found = String.Empty;
bool run = true;
while (run)
{
for (int i = 0; i < strList.Count; i++)
{
if(strList[i].Contains((userInput + counter).ToString("D6")))
{
found = strList[i];
run = false;
break;
}
}
counter++;
}
It's bad because it's slow and it can turn into an infinite loop. But I really don't know how to do this (fast).
You can parse the numbers from the strings with a Regex and create a sorted collection, which you can then search with a Where clause:
var strings = new[] { "abc-000001.file", "x-000004.file"};
var regP = "\\d{6}"; // simplest option in my example, maybe something more complicated will be needed
var reg = new Regex(regP);
var collection = strings
.Select(s =>
{
var num = reg.Match(s).Captures.First().Value;
return new { num = int.Parse(num), str = s};
})
.OrderBy(arg => arg.num)
.ToList();
var userInput = 2;
var res = collection
.Where(arg => arg.num >= userInput)
.FirstOrDefault()?.str; // x-000004.file
P.S.
How should 9002010, 0000010 and 0002010 be treated, given that they have 7 characters? Is it [9002010, 10, 2010] or [900201, 1, 201]?
If you don't want regex, you can do something like this:
List<string> strings = new List<string>
{
"abc-000001.file",
"aaac-000002.file",
"ab-0002010.file",
"abbc-000003.file",
"abbbc-000004.file",
"abcd-000008.file"
};
int input = 7;
var converted = strings.Select(s => new { value = Int32.Parse(s.Split('-', '.')[1]), str = s })
.OrderBy(c => c.value);
string result = converted.FirstOrDefault(v => v.value >= input)?.str;
Console.WriteLine(result);
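Both answers scan the sorted collection linearly on every lookup. If many lookups run against the same list, a rough sketch is to sort once and then binary-search. The class NumberedNameIndex is hypothetical, and the pattern assumes the number sits between a '-' and a '.' and fits into an int; strings that do not match are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public sealed class NumberedNameIndex
{
    // Assumption: the number is the digit run between '-' and '.'.
    private static readonly Regex Digits = new Regex(@"-(\d+)\.", RegexOptions.Compiled);

    private readonly int[] numbers; // sorted ascending
    private readonly string[] names; // names[i] belongs to numbers[i]

    public NumberedNameIndex(IEnumerable<string> source)
    {
        var parsed = source
            .Select(s => new { Match = Digits.Match(s), Name = s })
            .Where(x => x.Match.Success) // ignore strings without a number
            .Select(x => new { Number = int.Parse(x.Match.Groups[1].Value), x.Name })
            .OrderBy(x => x.Number)
            .ToArray();
        numbers = parsed.Select(x => x.Number).ToArray();
        names = parsed.Select(x => x.Name).ToArray();
    }

    // Returns a string whose number is the smallest one >= input, or null if none exists.
    public string FindNext(int input)
    {
        int index = Array.BinarySearch(numbers, input);
        // A negative result is the bitwise complement of the first larger element's index.
        if (index < 0) index = ~index;
        return index < names.Length ? names[index] : null;
    }
}
```

Building the index costs one sort; each FindNext call afterwards is a binary search, so there is no loop that could run forever when no match exists.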

c# Order By on List of Objects

I have a list/collection of objects with multiple fields. One of them being filename.
I am sorting based on filename but not getting the correct results.
List:
"552939_VVIDEO9.mp4"
"552939_VVIDEO8.mp4"
"552939_VVIDEO13.mp4"
"552939_VVIDEO12.mp4"
"552939_VVIDEO7.mp4"
"552939_VVIDEO6.mp4"
"552939_VVIDEO2.mp4"
"552939_VVIDEO16.mp4"
"552939_VVIDEO10.mp4"
"552939_VVIDEO3.mp4"
"552939_VVIDEO11.mp4"
"552939_VVIDEO4.mp4"
"552939_VVIDEO1.mp4"
"552939_VVIDEO15.mp4"
"552939_VVIDEO14.mp4"
"552939_VVIDEO17.mp4"
List<WfVideo> orderVideo = ProductData.Videos.OrderBy(o => o.Filename, StringComparer.InvariantCultureIgnoreCase).ToList();
Result I am getting:
VOD1
VOD2
VVIDEO1
VVIDEO10
VVIDEO11
VVIDEO12
VVIDEO13
VVIDEO14
VVIDEO15
VVIDEO16
VVIDEO17
VVIDEO2
VVIDEO3
VVIDEO4
VVIDEO5
VVIDEO6
Is the sorting incorrect?
If you want to sort these files by the number only, you can pass a comparison to Sort that implements the rules you want. This sorts the filenames according to their number:
List<string> files = new List<string>
{
"552939_VVIDEO9.mp4",
"552939_VVIDEO8.mp4",
"552939_VVIDEO13.mp4",
"552939_VVIDEO12.mp4",
"VOD1.mp4",
"552939_VVIDEO6.mp4",
"VOD2.mp4",
"552939_VVIDEO2.mp4",
"552939_VVIDEO16.mp4",
"552939_VVIDEO10.mp4",
"552939_VVIDEO3.mp4",
"552939_VVIDEO11.mp4",
"552939_VVIDEO4.mp4",
"552939_VVIDEO1.mp4",
"552939_VVIDEO15.mp4",
"552939_VVIDEO14.mp4",
"552939_VVIDEO17.mp4"
};
// Build the regex once rather than on every comparison; the dot is escaped so it only matches a literal '.'.
var regex = new Regex(@"([0-9]+)\.mp4$", RegexOptions.IgnoreCase);
files.Sort((a, b) => {
// Group 1 holds the digits immediately before ".mp4".
// int.Parse throws if a filename does not match the pattern.
int an = int.Parse(regex.Match(a).Groups[1].Value);
int bn = int.Parse(regex.Match(b).Groups[1].Value);
return an.CompareTo(bn);
});
foreach (var file in files)
{
Console.WriteLine(file);
}
Console.ReadKey();
Output:
VOD1.mp4
552939_VVIDEO1.mp4
VOD2.mp4
552939_VVIDEO2.mp4
552939_VVIDEO3.mp4
552939_VVIDEO4.mp4
552939_VVIDEO6.mp4
552939_VVIDEO8.mp4
552939_VVIDEO9.mp4
552939_VVIDEO10.mp4
552939_VVIDEO11.mp4
552939_VVIDEO12.mp4
552939_VVIDEO13.mp4
552939_VVIDEO14.mp4
552939_VVIDEO15.mp4
552939_VVIDEO16.mp4
552939_VVIDEO17.mp4
Note that some additional error checking may be needed. You can, of course, extend this comparison function to implement whatever rules you wish.
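The sorting in the question is not incorrect: it is a plain string comparison, in which "VVIDEO10" comes before "VVIDEO2". If the prefix should still count and only the digit runs should compare numerically, a natural-order comparer is one option. The class NaturalComparer below is a hypothetical sketch, not a framework type; it treats only ASCII digits as numbers and compares other characters case-insensitively.

```csharp
using System;
using System.Collections.Generic;

public sealed class NaturalComparer : IComparer<string>
{
    private static bool IsDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    public int Compare(string a, string b)
    {
        if (ReferenceEquals(a, b)) return 0;
        if (a == null) return -1;
        if (b == null) return 1;

        int i = 0, j = 0;
        while (i < a.Length && j < b.Length)
        {
            if (IsDigit(a[i]) && IsDigit(b[j]))
            {
                // Read the whole digit run on both sides.
                int startA = i, startB = j;
                while (i < a.Length && IsDigit(a[i])) i++;
                while (j < b.Length && IsDigit(b[j])) j++;
                string runA = a.Substring(startA, i - startA).TrimStart('0');
                string runB = b.Substring(startB, j - startB).TrimStart('0');

                // Without leading zeros, the longer run is the larger number.
                if (runA.Length != runB.Length) return runA.Length.CompareTo(runB.Length);
                int byDigits = string.CompareOrdinal(runA, runB);
                if (byDigits != 0) return byDigits;
            }
            else
            {
                int byChar = char.ToUpperInvariant(a[i]).CompareTo(char.ToUpperInvariant(b[j]));
                if (byChar != 0) return byChar;
                i++;
                j++;
            }
        }
        // The string with characters left over sorts last.
        return (a.Length - i).CompareTo(b.Length - j);
    }
}
```

Applied to the question's query, this would read ProductData.Videos.OrderBy(o => o.Filename, new NaturalComparer()).ToList(), which keeps the VOD and VVIDEO groups apart while ordering VVIDEO2 before VVIDEO10.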

Loading CSV in C#

I want to find the penultimate row in first.csv, whose date is 1975-01-03 and whose Lemon value is 17.0. Then I look up the same date in second.csv, where Lemon is 19.0.
After obtaining both values, I compute the difference: 17.0 - 19.0 = -2.0
The next step is to add the difference (-2.0) to all Lemon values in second.csv from the date 1975-01-03 to the end, 1975-01-09.
The final step is to write third.csv, which contains the rows of first.csv up to the date 1975-01-02, followed by the adjusted rows of second.csv from 1975-01-03 to the end, 1975-01-09.
first.csv
Date,Lemon
1974-12-31,19.0
1975-01-02,18.0
1975-01-03,17.0
1975-01-06,16.0
second.csv
Date,Lemon
1975-01-02,18.0
1975-01-03,19.0
1975-01-06,19.5
1975-01-07,19.5
1975-01-08,18.0
1975-01-09,17.0
third.csv
Date,Lemon
1974-12-31,19.0
1975-01-02,18.0
1975-01-03,17.0
1975-01-06,17.5
1975-01-07,17.5
1975-01-08,16.0
1975-01-09,15.0
All in all, reading from the CSV is not as important as obtaining the third result in an Array, DataTable, Dictionary or whatever. Thanks
Start with a handy struct to make the coding easier:
public struct Line
{
public DateTime Timestamp;
public decimal Lemon;
}
Then you can write a simple function to load your CSV files:
Func<string, Line[]> readCsv =
fn =>
File
.ReadLines(fn)
.Skip(1)
.Select(x => x.Split(','))
.Select(y => new Line()
{
Timestamp = DateTime.Parse(y[0]),
Lemon = decimal.Parse(y[1])
})
.ToArray();
Now the rest is just reading the files and a couple of LINQ queries before writing out the results:
Line[] first = readCsv(@"C:\_temp\first.csv");
Line[] second = readCsv(@"C:\_temp\second.csv");
Line difference =
(
from pen in first.Skip(first.Length - 2).Take(1)
from mtch in second
where mtch.Timestamp == pen.Timestamp
select new Line()
{
Timestamp = pen.Timestamp,
Lemon = pen.Lemon - mtch.Lemon
}
).First();
IEnumerable<string> result =
new [] { "Date,Lemon" }
.Concat(
first
.Where(x => x.Timestamp < difference.Timestamp)
.Concat(
second
.Where(x => x.Timestamp >= difference.Timestamp)
.Select(x => new Line()
{
Timestamp = x.Timestamp,
Lemon = x.Lemon + difference.Lemon
}))
.Select(x => String.Format(
"{0},{1}",
x.Timestamp.ToString("yyyy-MM-dd"),
x.Lemon)));
File.WriteAllLines(@"C:\_temp\third.csv", result);
The result I get is:
Date,Lemon
1974-12-31,19.0
1975-01-02,18.0
1975-01-03,17.0
1975-01-06,17.5
1975-01-07,17.5
1975-01-08,16.0
1975-01-09,15.0
This looks like homework. I strongly advise you to do this exercise yourself by learning about LINQ (just google it). If you are stuck or can't find the solution, here is a way to do it:
class LemonAtDate
{
public DateTime Date { get; set; }
public double Value { get; set; }
public LemonAtDate(DateTime Date, double Value)
{
this.Date = Date;
this.Value = Value;
}
public static List<LemonAtDate> LoadFromFile(string filepath)
{
IEnumerable<String[]> lines = System.IO.File.ReadLines(filepath).Select(a => a.Split(','));
List<LemonAtDate> result = new List<LemonAtDate>();
int index = 0;
foreach (String[] line in lines)
{
index++;
if (index == 1) continue; //skip header
DateTime date = DateTime.ParseExact(line[0], "yyyy-MM-dd", System.Globalization.CultureInfo.InvariantCulture);
double value = Double.Parse(line[1], System.Globalization.CultureInfo.InvariantCulture);
result.Add(new LemonAtDate(date, value));
}
return result;
}
public static void WriteToFile(IEnumerable<LemonAtDate> lemons, string filename)
{
//Write to file
using (var sw = new System.IO.StreamWriter(filename))
{
sw.WriteLine("Date,Lemon");//Header, written once before the rows
foreach (LemonAtDate lemon in lemons)
{
string date = lemon.Date.ToString("yyyy-MM-dd");
string value = lemon.Value.ToString(System.Globalization.CultureInfo.InvariantCulture);
string line = string.Format("{0},{1}", date, value);
sw.WriteLine(line);
}
}
}
}
static void Main(string[] args)
{
//Load first file
List<LemonAtDate> firstCsv = LemonAtDate.LoadFromFile("first.csv");
//Load second file
List<LemonAtDate> secondCsv = LemonAtDate.LoadFromFile("second.csv");
//We need at least two rows
if (firstCsv.Count >= 2)
{
//Penultimate row in first file
LemonAtDate lemonSecondLast = firstCsv[firstCsv.Count - 2];
//Find the value 19 in the second file
LemonAtDate lemonValue19 = secondCsv.Where(x => x.Value == 19).FirstOrDefault();
//Value found
if (lemonValue19 != null)
{
double delta = lemonSecondLast.Value - lemonValue19.Value;
//Get the items between the dates and add the delta
DateTime dateStart = new DateTime(1975, 1, 3);
DateTime dateEnd = new DateTime(1975, 1, 9);
IEnumerable<LemonAtDate> secondFileSelection = secondCsv.Where(x => x.Date >= dateStart && x.Date <= dateEnd)
.Select(x => { x.Value += delta; return x; });
//Create third CSV
List<LemonAtDate> thirdCsv = new List<LemonAtDate>();
//Add the rows from the first file until 1975-01-02
DateTime threshold = new DateTime(1975, 1, 2);
thirdCsv.AddRange(firstCsv.Where(x => x.Date <= threshold));
//Add the rows from the second file
thirdCsv.AddRange(secondFileSelection);
//Write to file
LemonAtDate.WriteToFile(thirdCsv, "third.csv");
}
}
}
There are better ways of doing this; I took a quick and dirty procedural approach instead of an OO one. I also took a peek at the other answer and saw that it parses the dates into DateTime values. I decided not to, since you weren't doing any math on the dates themselves. However, that answer is more flexible, because DateTime values allow more operations in the future.
// Skip(1) drops the "Date,Lemon" header row, which Double.Parse cannot handle.
List<string> csvfile1Text = System.IO.File.ReadAllLines("file1.csv").Skip(1).ToList();
List<string> csvfile2Text = System.IO.File.ReadAllLines("file2.csv").Skip(1).ToList();
Dictionary<string, double> csv1Formatted = new Dictionary<string, double>();
Dictionary<string, double> csv2Formatted = new Dictionary<string, double>();
Dictionary<string, double> csv3Formatted = new Dictionary<string, double>();
foreach (string line in csvfile1Text)
{
var temp= line.Split(',');
csv1Formatted.Add(temp[0], Double.Parse(temp[1]));
}
foreach (string line in csvfile2Text)
{
var temp = line.Split(',');
csv2Formatted.Add(temp[0], Double.Parse(temp[1]));
}
//operation 1
var penultimate = csv1Formatted["1975-01-03"];
var corrsponding = csv2Formatted["1975-01-03"];
var difference = penultimate - corrsponding;
//operation 2
var start = csv2Formatted["1975-01-03"];
var end = csv2Formatted["1975-01-09"];
var intermediate = csv2Formatted.Keys.SkipWhile((element => element != "1975-01-03")).ToList();
Dictionary<string, double> newCSV2 = new Dictionary<string, double>();
foreach (string element in intermediate)
{
var found = csv2Formatted[element];
found = found + difference;
newCSV2.Add(element, found);
}
//operation 3
intermediate = csv1Formatted.Keys.TakeWhile((element => element != "1975-01-03")).ToList();
foreach (string element in intermediate)
{
var found = csv1Formatted[element];
csv3Formatted.Add(element, found);
}
foreach (KeyValuePair<string,double> kvp in newCSV2)
{
csv3Formatted.Add(kvp.Key,kvp.Value);
}
//writing CSV3
StringBuilder sb = new StringBuilder();
sb.AppendLine("Date,Lemon"); // header row, as in the expected third.csv
foreach (KeyValuePair<string,double> kvp in csv3Formatted)
{
sb.AppendLine(kvp.Key + "," + kvp.Value);
}
System.IO.File.WriteAllText("C:\\csv3.csv", sb.ToString());
This is my favorite to use with CSV
https://github.com/kentcb/KBCsv
and if you want to work with csv entries as model for each row:
http://www.filehelpers.net/quickstart/
I hope you find this helpful.
Good luck :) Enjoy coding

How to find the number of each elements in the row and store the mean of each row in another array using C#?

I am using the code below to read data from a text file row by row. I would like to assign each row to an array. I must be able to find the number of rows/arrays and the number of elements in each of them.
I would also like to do some manipulations on some or all rows and return their values.
I get the number of rows, but is there a way to loop, something like:
for (i = 1 to number of rows)
do
mean[i] <- row[i]
done
return mean
var data = System.IO.File.ReadAllText("Data.txt");
var arrays = new List<float[]>();
var lines = data.Split(new[] {'\r', '\n'}, StringSplitOptions.RemoveEmptyEntries);
foreach (var line in lines)
{
var lineArray = new List<float>();
foreach (var s in line.Split(new[] {','}, StringSplitOptions.RemoveEmptyEntries))
{
lineArray.Add(Convert.ToSingle(s));
}
arrays.Add(lineArray.ToArray());
}
var numberOfRows = lines.Count();
var numberOfValues = arrays.Sum(s => s.Length);
var arrays = new List<float[]>();
//....your filling the arrays
var averages = arrays.Select(floats => floats.Average()).ToArray(); //float[]
var counts = arrays.Select(floats => floats.Count()).ToArray(); //int[]
Not sure I understood the question. Do you mean something like
foreach (string line in File.ReadAllLines("fileName.txt"))
{
...
}
Is it ok for you to use Linq? You might need to add using System.Linq; at the top.
float floatTester = 0;
List<float[]> result = File.ReadLines(@"Data.txt")
.Where(l => !string.IsNullOrWhiteSpace(l))
.Select(l => new {Line = l, Fields = l.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries) })
.Select(x => x.Fields
.Where(f => Single.TryParse(f, out floatTester))
.Select(f => floatTester).ToArray())
.ToList();
// now get your totals
int numberOfLinesWithData = result.Count;
int numberOfAllFloats = result.Sum(fa => fa.Length);
Explanation:
File.ReadLines reads the lines of a file (not all at once, but streaming)
Where returns only the elements for which the given predicate is true (e.g. the line must contain more than empty text)
new { creates an anonymous type with the given properties (e.g. the fields separated by commas)
Then I try to parse each field to a float
Everything that can be parsed is added to a float[] with ToArray()
All of these arrays are added to a List<float[]> with ToList()
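The steps above can also be sketched without the shared floatTester variable, with the per-row mean the question asks for added on top. The class RowStats and its method names are hypothetical; parsing with the invariant culture and returning NaN for a row without numeric values are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class RowStats
{
    // Parses each non-empty line into the floats it contains, skipping unparsable fields.
    public static List<float[]> ParseRows(IEnumerable<string> lines)
    {
        return lines
            .Where(l => !string.IsNullOrWhiteSpace(l))
            .Select(l => l.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(f => float.TryParse(f, NumberStyles.Float,
                        CultureInfo.InvariantCulture, out float v)
                    ? (float?)v
                    : null)
                .Where(v => v.HasValue)
                .Select(v => v.Value)
                .ToArray())
            .ToList();
    }

    // One mean per row; a row without any numeric values yields NaN (an assumption).
    public static float[] Means(List<float[]> rows)
    {
        return rows.Select(r => r.Length == 0 ? float.NaN : r.Average()).ToArray();
    }

    // Number of parsed values in each row.
    public static int[] Counts(List<float[]> rows)
    {
        return rows.Select(r => r.Length).ToArray();
    }
}
```

Typical use would be var rows = RowStats.ParseRows(File.ReadLines("Data.txt")); followed by RowStats.Means(rows) and RowStats.Counts(rows), where rows.Count is the number of rows with data.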
Found an efficient way to do this. Thanks for your input everybody!
private void ReadFile()
{
var lines = File.ReadLines("Data.csv");
var numbers = new List<List<double>>();
var separators = new[] { ',', ' ' };
// Parallel is in System.Threading.Tasks. Rows are added in completion order, so the file's row order is not preserved.
Parallel.ForEach(lines, line =>
{
var list = new List<double>();
foreach (var s in line.Split(separators, StringSplitOptions.RemoveEmptyEntries))
{
double i;
if (double.TryParse(s, out i))
{
list.Add(i);
}
}
lock (numbers)
{
numbers.Add(list);
}
});
var rowTotal = new double[numbers.Count];
var rowMean = new double[numbers.Count];
var totalInRow = new int[numbers.Count()];
for (var row = 0; row < numbers.Count; row++)
{
var values = numbers[row].ToArray();
rowTotal[row] = values.Sum();
rowMean[row] = rowTotal[row] / values.Length;
totalInRow[row] += values.Length;
}
}
