Validate content of a CSV file in C#

I have a requirement where a user will upload a CSV file in the format below, containing around 1.8 to 2 million records:
SITE_ID,HOUSE,STREET,CITY,STATE,ZIP,APARTMENT
44,545395,PORT ROYAL,CORPUS CHRISTI,TX,78418,2
44,608646,TEXAS AVE,ODESSA,TX,79762,
44,487460,EVERHART RD,CORPUS CHRISTI,TX,78413,
44,275543,EDWARD GARY,SAN MARCOS,TX,78666,4
44,136811,MAGNOLIA AVE,SAN ANTONIO,TX,78212
What I have to do is first validate the file and then save it to the database only if it validates successfully and has no errors. The validations I have to apply are different for each column. For example:
SITE_ID: it can only be an integer and it is required.
HOUSE: integer, required
STREET: alphanumeric, required
CITY: alphabets only, required
STATE: 2 alphabets only, required
ZIP: 5 digits only, required
APARTMENT: integer only, optional
I need a generic way of applying these validations to the respective columns. What I have tried so far: I converted the CSV file to a DataTable and plan to validate each cell through regex, but this doesn't seem like a generic or good solution to me. Can anyone help me in this regard and point me in the right direction?

Here is one efficient method:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Data.OleDb;
using System.Text.RegularExpressions;
using System.IO;

namespace ConsoleApplication23
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.csv";

        static void Main(string[] args)
        {
            CSVReader csvReader = new CSVReader();
            DataSet ds = csvReader.ReadCSVFile(FILENAME, true);
            RegexCompare compare = new RegexCompare();
            DataTable errors = compare.Get_Error_Rows(ds.Tables[0]);
        }
    }

    class RegexCompare
    {
        // Per-column rules: when positiveNegative is true the pattern must match
        // the whole value; when false, any match of the pattern marks an error.
        public static Dictionary<string, RegexCompare> dict = new Dictionary<string, RegexCompare>() {
            { "SITE_ID", new RegexCompare() { columnName = "SITE_ID", pattern = @"[^\d]+", positiveNegative = false, required = true }},
            { "HOUSE", new RegexCompare() { columnName = "HOUSE", pattern = @"[^\d]+", positiveNegative = false, required = true }},
            { "STREET", new RegexCompare() { columnName = "STREET", pattern = @"[A-Za-z0-9 ]+", positiveNegative = true, required = true }},
            { "CITY", new RegexCompare() { columnName = "CITY", pattern = @"[A-Za-z ]+", positiveNegative = true, required = true }},
            { "STATE", new RegexCompare() { columnName = "STATE", pattern = @"[A-Za-z]{2}", positiveNegative = true, required = true }},
            { "ZIP", new RegexCompare() { columnName = "ZIP", pattern = @"\d{5}", positiveNegative = true, required = true }},
            { "APARTMENT", new RegexCompare() { columnName = "APARTMENT", pattern = @"\d*", positiveNegative = true, required = false }},
        };

        string columnName { get; set; }
        string pattern { get; set; }
        Boolean positiveNegative { get; set; }
        Boolean required { get; set; }

        public DataTable Get_Error_Rows(DataTable dt)
        {
            DataTable dtError = null;
            foreach (DataRow row in dt.AsEnumerable())
            {
                Boolean error = false;
                foreach (DataColumn col in dt.Columns)
                {
                    RegexCompare regexCompare = dict[col.ColumnName];
                    object colValue = row.Field<object>(col.ColumnName);
                    if (regexCompare.required)
                    {
                        if (colValue == null)
                        {
                            error = true;
                            break;
                        }
                    }
                    else
                    {
                        if (colValue == null)
                            continue;
                    }
                    string colValueStr = colValue.ToString();
                    Match match = Regex.Match(colValueStr, regexCompare.pattern);
                    if (regexCompare.positiveNegative)
                    {
                        if (!match.Success)
                        {
                            error = true;
                            break;
                        }
                        // The pattern must cover the entire value.
                        if (colValueStr.Length != match.Value.Length)
                        {
                            error = true;
                            break;
                        }
                    }
                    else
                    {
                        if (match.Success)
                        {
                            error = true;
                            break;
                        }
                    }
                }
                if (error)
                {
                    if (dtError == null) dtError = dt.Clone();
                    dtError.Rows.Add(row.ItemArray);
                }
            }
            return dtError;
        }
    }

    public class CSVReader
    {
        public DataSet ReadCSVFile(string fullPath, bool headerRow)
        {
            string path = fullPath.Substring(0, fullPath.LastIndexOf("\\") + 1);
            string filename = fullPath.Substring(fullPath.LastIndexOf("\\") + 1);
            DataSet ds = new DataSet();
            try
            {
                if (File.Exists(fullPath))
                {
                    string ConStr = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}" + ";Extended Properties=\"Text;HDR={1};FMT=Delimited\"", path, headerRow ? "Yes" : "No");
                    string SQL = string.Format("SELECT * FROM {0}", filename);
                    OleDbDataAdapter adapter = new OleDbDataAdapter(SQL, ConStr);
                    adapter.Fill(ds, "TextFile");
                    ds.Tables[0].TableName = "Table1";
                }
                foreach (DataColumn col in ds.Tables["Table1"].Columns)
                {
                    col.ColumnName = col.ColumnName.Replace(" ", "_");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            return ds;
        }
    }
}

Here's a rather overengineered but really fun generic method, where you give attributes to your class to match them to CSV column headers:
First step is to parse your CSV. There are a variety of methods out there, but my favourite is the TextFieldParser that can be found in the Microsoft.VisualBasic.FileIO namespace. The advantage of using this is that it's 100% native; all you need to do is add Microsoft.VisualBasic to the references.
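The SplitFile helper used in the final usage snippet of this answer isn't listed; a minimal sketch of what it might look like, assuming the first argument is a file path and skipping the automatic encoding detection mentioned at the end, is:
// Minimal sketch of a SplitFile helper built on TextFieldParser.
// Requires a reference to Microsoft.VisualBasic.
using System.Collections.Generic;
using System.Text;
using Microsoft.VisualBasic.FileIO;

public static List<string[]> SplitFile(string path, Encoding encoding, char separator)
{
    var rows = new List<string[]>();
    using (var parser = new TextFieldParser(path, encoding))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(separator.ToString());
        parser.HasFieldsEnclosedInQuotes = true;
        while (!parser.EndOfData)
            rows.Add(parser.ReadFields()); // one string[] per CSV line
    }
    return rows;
}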
Having done that, you have the data as List<String[]>. Now, things get interesting. See, now we can create a custom attribute and add it to our class properties:
The attribute class:
[AttributeUsage(AttributeTargets.Property)]
public sealed class CsvColumnAttribute : System.Attribute
{
    public String Name { get; private set; }
    public Regex ValidationRegex { get; private set; }

    public CsvColumnAttribute(String name) : this(name, null) { }

    public CsvColumnAttribute(String name, String validationRegex)
    {
        this.Name = name;
        this.ValidationRegex = new Regex(validationRegex ?? "^.*$");
    }
}
The data class:
public class AddressInfo
{
    [CsvColumnAttribute("SITE_ID", "^\\d+$")]
    public Int32 SiteId { get; set; }
    [CsvColumnAttribute("HOUSE", "^\\d+$")]
    public Int32 House { get; set; }
    [CsvColumnAttribute("STREET", "^[a-zA-Z0-9- ]+$")]
    public String Street { get; set; }
    [CsvColumnAttribute("CITY", "^[a-zA-Z0-9- ]+$")]
    public String City { get; set; }
    [CsvColumnAttribute("STATE", "^[a-zA-Z]{2}$")]
    public String State { get; set; }
    [CsvColumnAttribute("ZIP", "^\\d{1,5}$")]
    public Int32 Zip { get; set; }
    [CsvColumnAttribute("APARTMENT", "^\\d*$")]
    public Int32? Apartment { get; set; }
}
As you see, what I did here was link every property to a CSV column name, and give it a regex to validate the contents. On non-required stuff, you can still do regexes, but ones that allow empty values, as shown in the Apartment one.
Now, to actually match the columns to the CSV headers, we need to get the properties of the AddressInfo class, check for each property whether it has a CsvColumnAttribute, and if it does, match its name to the column headers of the CSV file data. Once we have that, we got a list of PropertyInfo objects, which can be used to dynamically fill in the properties of new objects created for all rows.
This method is completely generic, allows giving the columns in any order in the CSV file, and parsing will work for any class once you assign the CsvColumnAttribute to the properties you want to fill in. It will automatically validate the data, and you can handle failures however you want. In this code, all I do is skip invalid lines, though.
// Requires: System, System.Collections.Generic, System.ComponentModel,
// System.Linq, System.Reflection and System.Text.RegularExpressions.
public static List<T> ParseCsvInfo<T>(List<String[]> split) where T : new()
{
    // No template row, or only a template row but no data. Abort.
    if (split.Count < 2)
        return new List<T>();
    String[] templateRow = split[0];
    // Create a dictionary of column headers and their index in the file data.
    Dictionary<String, Int32> columnIndexing = new Dictionary<String, Int32>();
    for (Int32 i = 0; i < templateRow.Length; i++)
    {
        // ToUpperInvariant is optional, of course. You could have case sensitive headers.
        String colHeader = templateRow[i].Trim().ToUpperInvariant();
        if (!columnIndexing.ContainsKey(colHeader))
            columnIndexing.Add(colHeader, i);
    }
    // Prepare the arrays of property parse info. We set the length
    // so the highest found column index exists in it.
    Int32 numCols = columnIndexing.Values.Max() + 1;
    // Actual property to fill in
    PropertyInfo[] properties = new PropertyInfo[numCols];
    // Regex to validate the string before parsing
    Regex[] propValidators = new Regex[numCols];
    // Type converters for automatic parsing
    TypeConverter[] propconverters = new TypeConverter[numCols];
    // Go over the properties of the given type, see which ones have a
    // CsvColumnAttribute, and put these in the list at their CSV index.
    foreach (PropertyInfo p in typeof(T).GetProperties())
    {
        object[] attrs = p.GetCustomAttributes(true);
        foreach (Object attr in attrs)
        {
            CsvColumnAttribute csvAttr = attr as CsvColumnAttribute;
            if (csvAttr == null)
                continue;
            Int32 index;
            if (!columnIndexing.TryGetValue(csvAttr.Name.ToUpperInvariant(), out index))
            {
                // If no valid column is found, and the regex for this property
                // does not allow an empty value, then all lines are invalid.
                if (!csvAttr.ValidationRegex.IsMatch(String.Empty))
                    return new List<T>();
                // No valid column found: ignore this property.
                break;
            }
            properties[index] = p;
            propValidators[index] = csvAttr.ValidationRegex;
            // Automatic type converter. This function could be enhanced by giving a
            // list of custom converters as extra argument and checking those first.
            propconverters[index] = TypeDescriptor.GetConverter(p.PropertyType);
            break; // Only handle one CsvColumnAttribute per property.
        }
    }
    List<T> objList = new List<T>();
    // Start from 1 since the first line is the template with the column names.
    for (Int32 i = 1; i < split.Count; i++)
    {
        Boolean abortLine = false;
        String[] line = split[i];
        // Make new object of the given type.
        T obj = new T();
        for (Int32 col = 0; col < properties.Length; col++)
        {
            // It is possible a line is not long enough to contain all columns.
            String curVal = col < line.Length ? line[col] : String.Empty;
            PropertyInfo prop = properties[col];
            // This can be null if the column was not found but wasn't required.
            if (prop == null)
                continue;
            // Check validity. Abort buildup of this object if not valid.
            Boolean valid = propValidators[col].IsMatch(curVal);
            if (!valid)
            {
                // Add logging here? We have the line and column index.
                abortLine = true;
                break;
            }
            // Automated parsing. Always use nullable types for nullable properties.
            Object value = propconverters[col].ConvertFromString(curVal);
            prop.SetValue(obj, value, null);
        }
        if (!abortLine)
            objList.Add(obj);
    }
    return objList;
}
To use on your CSV file, simply do
// the function using VB's TextFieldParser
List<String[]> splitData = SplitFile(datafile, new UTF8Encoding(false), ',');
// The above function, applied to the AddressInfo class
List<AddressInfo> addresses = ParseCsvInfo<AddressInfo>(splitData);
And that's it. Automatic parsing and validation, all through some added attributes on the class properties.
Note, if splitting the data in advance would give too much of a performance hit for large data, that's not really a problem; the TextFieldParser works from a Stream wrapped in a TextReader, so instead of giving a List<String[]> you can just give a stream and do the csv parsing on the fly inside the ParseCsvInfo function, simply reading per CSV line directly from the TextFieldParser.
I didn't do that here because the original use case for csv reading for which I wrote the reader to List<String[]> included automatic encoding detection, which required reading the whole file anyway.
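For illustration, a rough sketch of that streaming variant (StreamCsvRows is an assumed name; the rows it yields would feed the same column-matching logic as ParseCsvInfo above):
// Rough sketch: yield rows one at a time from a TextReader instead of
// materializing a full List<string[]> first.
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static IEnumerable<string[]> StreamCsvRows(TextReader reader, string delimiter)
{
    using (var parser = new TextFieldParser(reader))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(delimiter);
        parser.HasFieldsEnclosedInQuotes = true;
        while (!parser.EndOfData)
            yield return parser.ReadFields();
    }
}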

I would suggest using a CSV library to read the file.
For example you can use LumenWorksCsvReader: https://www.nuget.org/packages/LumenWorksCsvReader
Your approach with regex validation is actually OK.
For example, you could create a "Validation Dictionary" and check every CSV Value against the regex-expression.
Then you can build a function that can validate a CSV-File with such a "Validation Dictionary".
See here:
string lsInput = @"SITE_ID,HOUSE,STREET,CITY,STATE,ZIP,APARTMENT
44,545395,PORT ROYAL,CORPUS CHRISTI,TX,78418,2
44,608646,TEXAS AVE,ODESSA,TX,79762,
44,487460,EVERHART RD,CORPUS CHRISTI,TX,78413,
44,275543,EDWARD GARY,SAN MARCOS,TX,78666,4
44,136811,MAGNOLIA AVE,SAN ANTONIO,TX,78212";

Dictionary<string, string> loValidations = new Dictionary<string, string>();
loValidations.Add("SITE_ID", @"^\d+$"); // it can only be an integer and it is required.
//....

bool lbValid = true;
using (CsvReader loCsvReader = new CsvReader(new StringReader(lsInput), true, ','))
{
    while (loCsvReader.ReadNextRecord())
    {
        foreach (var loValidationEntry in loValidations)
        {
            if (!Regex.IsMatch(loCsvReader[loValidationEntry.Key], loValidationEntry.Value))
            {
                lbValid = false;
                break;
            }
        }
        if (!lbValid)
            break;
    }
}
Console.WriteLine($"Valid: {lbValid}");

Here is another way to accomplish your needs using Cinchoo ETL - an open source file helper library.
First, define a POCO class with DataAnnotations validation attributes as below:
public class Site
{
    [Required(ErrorMessage = "SiteID can't be null")]
    public int SiteID { get; set; }
    [Required]
    public int House { get; set; }
    [Required]
    public string Street { get; set; }
    [Required]
    [RegularExpression("^[a-zA-Z][a-zA-Z ]*$")]
    public string City { get; set; }
    [Required(ErrorMessage = "State is required")]
    [RegularExpression("^[A-Z][A-Z]$", ErrorMessage = "Incorrect state code.")]
    public string State { get; set; }
    [Required]
    [RegularExpression("^[0-9][0-9]*$")]
    public string Zip { get; set; }
    public int Apartment { get; set; }
}
Then use this class with ChoCSVReader to load and check the validity of the file using the Validate()/IsValid() methods as below:
using (var p = new ChoCSVReader<Site>("*** YOUR CSV FILE PATH ***")
    .WithFirstLineHeader(true)
    )
{
    Exception ex;
    Console.WriteLine("IsValid: " + p.IsValid(out ex));
}
Hope it helps.
Disclaimer: I'm the author of this library.

Related

Problem in databinding Array data to DataGridView in c#

I have been binding short data to a DataGridView in C# WinForms. However, I now need to bind a long string array of size 75 to the DataGridView. My data list class consists of 6 individual variables with get and set, and an array of string for which I have defined get and set properties. The individual variables are displayed, but the array of strings is not displayed in the DataGridView. In debug, I checked the data source of the DataGridView and it seems OK. How can I display the bound array in the grid view?
Below is my source code to populate the DataGridView named Logview:
public void populateLogData(string path)
{
StreamReader sr = null;
BindingList<LogList> bindLogList;
BindingSource bLogsource = new BindingSource();
List<LogList> loglist = new List<LogList>();
try
{
Logview.DataSource = null;
Logview.Rows.Clear();
Logview.Columns.Clear();
Logview.AutoGenerateColumns = true;
if (File.Exists(path))
{
try
{
sr = new StreamReader(path);
StringBuilder readline = new StringBuilder(sr.ReadLine());
if (readline.ToString() != null && readline.ToString() != "")
{
readline = new StringBuilder(sr.ReadLine());
while (readline.ToString() != null && readline.ToString() != "")
{
string[] subdata = readline.ToString().Split(',');
LogList tloglist = new LogList(subdata[0], subdata[1], subdata[2], subdata[3], subdata[4], subdata[5], max_index);
for (int i = 6; i < subdata.Length; i++)
tloglist.setPartList(i-6, subdata[i]);
loglist.Add(new LogList(subdata, subdata.Length));
readline = new StringBuilder(sr.ReadLine());
}
}
bindLogList = new BindingList<LogList>(loglist);
bLogsource.DataSource = bindLogList;
Logview.AutoGenerateColumns = true;
Logview.DataSource = bindLogList;
Logview.Columns[0].Width = 140; // project name
Logview.Columns[1].Width = 140; // data/time
Logview.Columns[2].Width = 90;
Logview.Columns[3].Width = 90;
Logview.Columns[4].Width = 90;
Logview.Columns[5].Width = 90;
// max_index is set from another part of code
for(int i = 0; i <= max_index; i++)
{
int counter = 6 + i;
Logview.Columns.Add(headertext[i], headertext[i]);
Logview.Columns[counter].Width = 90;
Logview.Columns[counter].HeaderText = headertext[i];
}
}
catch (IOException io)
{
MessageBox.Show("Error: Cannot Open log file.");
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
finally
{
if (sr != null) sr.Close();
}
}
else
{
MessageBox.Show("Log file not found \n" + path);
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
finally
{
GC.Collect();
}
}
Below is LogList class
class LogList
{
const int max_size = 100;
private string[] holdList;
public string project { get; set; }
public string date_time { get; set; }
public string Qty { get; set; }
public string Pass { get; set; }
public string Fail { get; set; }
public string Result { get; set; }
public string[] partlist
{
get
{
return holdList;
}
set
{
holdList = value;
}
}
public LogList(string project, string date_time, string Qty, string Pass, string Fail, string Result, int partsize)
{
this.project = project;
this.date_time = date_time;
this.Qty = Qty;
this.Pass = Pass;
this.Fail = Fail;
this.Result = Result;
partlist = new string[partsize+1];
}
public void setPartList(int size, string getValue)
{
partlist[size] = getValue;
}
}
Project, date/time, Qty, Pass, Fail and Result are displayed, but the partlist array is not.
To supplement IVSoftware’s answer, below is an example using two grids in a master-detail scenario.
One issue I would have with your current approach is that it uses an Array for the “parts list.” Currently this is a string array, and that isn’t going to work if we want to display it in a grid. Fortunately, there are a few easy ways we can get the data to display as we want.
One simple solution is to create a “wrapper” Class for the string. I will call this Class Part. I added a simple int ID property and the string PartName property. You could easily leave out the ID and have a simple string wrapper. This simple Class may look something like…
public class Part {
public int ID { get; set; }
public string PartName { get; set; }
}
This should allow the data to display correctly in the grid using just about any construct like an array, list, etc. So, we “could” change your current code to use an array of Part objects like…
Part[] Parts = new Part[X];
And this would work; however, if we use an array and each LogItem may have a different number of parts in its PartsList, then we will have to manage the array sizes. So, a BindingList of Part objects will simplify this. The altered LogList (LogItem) Class is below…
public class LogItem {
public BindingList<Part> PartsList { get; set; }
public string Project { get; set; }
public string Date_Time { get; set; }
public string Qty { get; set; }
public string Pass { get; set; }
public string Fail { get; set; }
public string Result { get; set; }
public LogItem(string project, string date_Time, string qty, string pass, string fail, string result) {
Project = project;
Date_Time = date_Time;
Qty = qty;
Pass = pass;
Fail = fail;
Result = result;
PartsList = new BindingList<Part>();
}
}
So given the updated Classes, this should simplify things, and we will use the same DataSource for both grids. The DataSource for the “master” grid will be a BindingList of LogItem objects. In the “detail” grid, we simply need to point its DataMember property to the PartsList property of the currently selected LogItem. And this would look something like…
dgvLogs.DataSource = LogsBL;
if (LogsBL.Count > 0) {
dgvParts.DataMember = "PartsList";
dgvParts.DataSource = LogsBL;
}
Below is the code to test the Classes above in a master-detail scenario with two grids. Create a new winform solution and drop two (2) DataGridViews on the form. The grid on the left is dgvLogs and the grid on the right is dgvParts.
public void populateLogData(string path) {
BindingList<LogItem> LogsBL = new BindingList<LogItem>();
string currentLine;
if (File.Exists(path)) {
try {
using (StreamReader sr = new StreamReader(path)) {
LogItem tempLogItem;
currentLine = sr.ReadLine(); // <- header row - ignoring
currentLine = sr.ReadLine();
while (currentLine != null) {
if (!string.IsNullOrEmpty(currentLine)) {
string[] splitArray = currentLine.Split(',');
if (splitArray.Length >= 6) {
tempLogItem = new LogItem(splitArray[0], splitArray[1], splitArray[2], splitArray[3], splitArray[4], splitArray[5]);
for (int i = 6; i < splitArray.Length; i++) {
tempLogItem.PartsList.Add(new Part { ID = i, PartName = splitArray[i] });
}
LogsBL.Add(tempLogItem);
}
else {
Debug.WriteLine("DataRead Error: Not enough items to make a LogItem: " + currentLine);
}
}
else {
Debug.WriteLine("DataRead Empty row");
}
currentLine = sr.ReadLine();
}
}
dgvLogs.DataSource = LogsBL;
if (LogsBL.Count > 0) {
dgvParts.DataMember = "PartsList";
dgvParts.DataSource = LogsBL;
}
}
catch (IOException io) {
MessageBox.Show("Error: Cannot Open log file.");
}
catch (Exception ex) {
MessageBox.Show(ex.Message + " Stacktrace- " + ex.StackTrace);
}
}
else {
MessageBox.Show("Log file not found \n" + path);
}
}
And some test data…
H1,h2,h3,h4,h5,h6,h7,h8
Model: LMG600N_IF_2blablas,2022-9-6,112,61,51,Fail,p1,p3,p4,p5,p6
1,2022-9-6,2112,621,251,Pass,px4,px5,px6,px1,px2,px3
data1,2022-9-7,3456,789,123,Fail,z3,z3,z4
Model: LMG600N_IF_2blablas,2022-9-6,112,61,51,Fail
Model: LMG600N_IF_2blablas,2022-9-6,112,61,51,Fail,p1,p3,p4,p5,p6,p7,p8,p99
BadData Model: LMG600N_IF_2blablas,2022-9-6,112,61
Moxxxdel: LMG600N_IF_2blablas,2022-9-6,11x2,6x1,5x1,Fail
Hope this helps and makes sense.
Your data list class consists of 6 individual variables with get and set, and an array of string. Your question says the variables are displayed but the array of strings is not.
Here's what has worked for me (similar to the excellent suggestion by JohnG) for displaying the string array. What I'm doing here is taking a DataGridView and dropping it in my main form without changing any settings (other than to Dock it). Given the default settings, the LogList class (shown here in a minimal reproducible example of 1 variable and 1 array of strings) is defined with a public string property named PartList and with this basic implementation:
class LogList
{
public LogList(string product, string[] partList)
{
Product = product;
_partList = partList;
}
public string Product { get; set; }
private string[] _partList;
public string PartList => string.Join(",", _partList);
}
To autoconfigure the DataGridView with Product and PartList columns, here is an example initializer method that sets the DataSource and adds the first three items as a test:
// Set data source property once. Clear it, Add to it, but no reason to nullify it.
BindingList<LogList> DataSource { get; } = new BindingList<LogList>();
private void InitDataGridView()
{
dataGridView1.DataSource = DataSource;
// Auto config columns by adding at least one Record.
DataSource.Add(
new LogList(
product: "LMG450",
// Four parts
partList: new string[]
{
"PCT2000",
"WCT100",
"ZEL-0812LN",
"EN61000-3-3/-11",
}
));
DataSource.Add(
new LogList(
product: "LMG600N",
// Three parts
partList: new string[]
{
"LTC2280",
"BMS6815",
"ZEL-0812LN",
}
));
DataSource.Add(
new LogList(
product: "Long Array",
// 75 parts
partList: Enumerable.Range(1, 75).Select(x => $"{ x }").ToArray()
));
// Use string indexer to access columns for formatting purposes.
dataGridView1
.Columns[nameof(LogList.Product)]
.AutoSizeMode = DataGridViewAutoSizeColumnMode.AllCells;
dataGridView1
.Columns[nameof(LogList.PartList)]
.AutoSizeMode = DataGridViewAutoSizeColumnMode.Fill;
}
After running this code, the DGV shows the Product and PartList columns. With the mouse hovered over an item, all 75 "parts" can be viewed.
One last thing - I notice you have some methods to assign a new partList[] or perhaps change an individual part at a specified index. (I didn't show them in the minimal sample, but for sure you'll want things like that.) You probably know this, but make sure to call dataGridView1.Refresh after altering properties of an existing row/LogList object so that the view will reflect the changes.
I hope there's something here that offers a few ideas to achieve the outcome you want.

How to properly access object's List<> value in C#?

I am trying to get the object value but I don't know how to do it. I'm new to C# and it's giving me a syntax error. I want to print it separately via the method PrintSample. How can I just concatenate or append the whatData variable? Thank you.
PrintSample(getData, "name");
PrintSample(getData, "phone");
PrintSample(getData, "address");
//Reading the CSV file and put it in the object
string[] lines = File.ReadAllLines("sampleData.csv");
var list = new List<Sample>();
foreach (var line in lines)
{
var values = line.Split(',');
var sampleData = new Sample()
{
name = values[0],
phone = values[1],
address = values[2]
};
list.Add(sampleData);
}
public class Sample
{
public string name { get; set; }
public string phone { get; set; }
public string address { get; set; }
}
//Method to call to print the Data
private static void PrintSample(Sample getData, string whatData)
{
//This is where I'm having the error; how can I just append the whatData after "getData."?
Console.WriteLine( $"{getData. + whatData}");
}
In C# it's not possible to dynamically evaluate an expression like
$"{getData. + whatData}"
as you can in languages like JavaScript.
I'd suggest using a switch expression or a Dictionary instead:
public void PrintData(Sample sample, string whatData)
{
    var data = whatData switch
    {
        "name" => sample.name,
        "phone" => sample.phone,
        "address" => sample.address,
        _ => throw new ArgumentOutOfRangeException(nameof(whatData)),
    };
    Console.WriteLine(data);
}
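And a sketch of the Dictionary alternative; here I use Func<Sample, string> accessor delegates instead of plain strings so the values come from the instance (Accessors is an assumed name; requires System and System.Collections.Generic):
// Map each data name to an accessor delegate, then look it up at run time.
private static readonly Dictionary<string, Func<Sample, string>> Accessors =
    new Dictionary<string, Func<Sample, string>>
    {
        ["name"] = s => s.name,
        ["phone"] = s => s.phone,
        ["address"] = s => s.address,
    };

public static void PrintData(Sample sample, string whatData)
{
    if (!Accessors.TryGetValue(whatData, out var accessor))
        throw new ArgumentOutOfRangeException(nameof(whatData));
    Console.WriteLine(accessor(sample));
}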
I'm not sure what you are trying to achieve. Perhaps this will help you:
private static void PrintSample(Sample getData, string whatData)
{
    var property = getData.GetType().GetProperty(whatData);
    string value = (string)property?.GetValue(getData) ?? "";
    Console.WriteLine($"{value}");
}
What the OP really needs is
private static void PrintSamples(List<Sample> samples)
{
    foreach (var sample in samples)
        Console.WriteLine($"name: {sample.name} phone: {sample.phone} address: {sample.address}");
}
and code
var list = new List<Sample>();
foreach (var line in lines)
{
......
}
PrintSamples(list);
It is ridiculous to use
PrintSample(getData, "name");
instead of just
PrintSample(getData.name)
You can do this using reflection. However, it's known to be relatively slow.
public static void PrintSample(object getData, string whatData)
{
    Console.WriteLine($"{getData.GetType().GetProperty(whatData).GetValue(getData, null)}");
}

fetch details from csv file on basis of name search c#

Step 1: I have created a C# application called Student Details.
Step 2: Added four TextBoxes and named them as follows:
Studentname.Text
StudentSurname.Text
StudentCity.Text
StudentState.Text
DATA INSIDE CSV FILE
vikas,gadhi,mumbai,maharashtra
prem,yogi,kolkata,maha
roja,goal,orissa,oya
ram,kala,goa,barka
The issue is: how do I fetch all the data (surname, city, state) of user prem into the above textboxes studentsurname, studentcity, studentstate from the CSV file when I search for the name prem in textbox 1 (studentname.Text)?
Below is the code where I am stuck, at the return at the end of Connection_fetch_details, along with the code inside Load_Script_Click:
void Connection_fetch_details(String searchName)
{
var strLines = File.ReadLines(filePath);
foreach (var line in strLines)
{
if (line.Split(',')[0].Equals(searchName))
{
Connection_fetch_details cd = new Connection_fetch_details()
{
username = line.Split(',')[1]
};
}
}
return;
}
private void Load_Script_Click(object sender, EventArgs e)
{
// load script is button
String con_env = textenv.Text.ToString();
//Address Address = GetAddress("vikas");
//textsurname.text = Address.Surname
Connection_fetch_details cd = Connection_fetch_details(con_env);
textusername.Text = cd.username;
}
==============================================================
Class file name : Address.class
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace DDL_SCRIPT_GENERATOR
{
public class Connection_fetch_details
{
public string username { get; set; }
}
}
The main problem is that your method is void, which means it doesn't return any value. So even though you may be finding a match, and creating a Connection_fetch_details object, you aren't returning that result back to the calling method.
This will fix that problem:
Connection_fetch_details Connection_fetch_details(String searchName)
{
    var strLines = File.ReadLines(filePath);
    foreach (var line in strLines)
    {
        if (line.Split(',')[0].Equals(searchName))
        {
            Connection_fetch_details cd = new Connection_fetch_details()
            {
                username = line.Split(',')[1]
            };
            return cd; // return the object containing the matched username
        }
    }
    return null;
}
Now it will return a Connection_fetch_details object if there is a match, or null if there is no match.
Next, you asked about returning all the fields, not just one. For that you would need to
a) add more properties to your object
b) add more code to populate those properties from the CSV
c) add code to populate the textboxes with the results from the object.
I'm also going to rename "username" to something more relevant, since none of the field names you described in the question match that. I'm also going to rename your class to "Student", and rename your search method, for the same reason.
Here's an example:
Student searchStudent(String searchName)
{
    var strLines = File.ReadLines(filePath);
    foreach (var line in strLines)
    {
        var split = line.Split(',');
        if (split[0].Equals(searchName))
        {
            Student s = new Student()
            {
                firstname = searchName,
                surname = split[1],
                city = split[2],
                state = split[3]
            };
            return s; // return the object containing the matched name
        }
    }
    return null;
}

private void Load_Script_Click(object sender, EventArgs e)
{
    // load script is a button
    String con_env = textenv.Text.ToString();
    //Address Address = GetAddress("vikas");
    //textsurname.text = Address.Surname
    Student st = searchStudent(con_env);
    textsurname.Text = st.surname;
    txtcity.Text = st.city;
    txtstate.Text = st.state;
}

namespace DDL_SCRIPT_GENERATOR
{
    public class Student
    {
        public string firstname { get; set; }
        public string surname { get; set; }
        public string city { get; set; }
        public string state { get; set; }
    }
}
To accomplish your goal, you have to separate your problem into more granular steps and also distinguish between what you show in your UI and what information you hold in the background, and in which format.
Create a class with the desired properties
public class Student { public string Name { get; set; } ... }
Learn how to read a csv file into such an object by using an existing library like CsvHelper or CsvReader.
Once you have something like a List<Student> from this part, learn how you can visualize such a thing by using some binding (this also depends on the visualization you use: WinForms, WPF, etc.).
Depending on the visualization component, it may already support filtering, or you need to filter yourself by using e.g. LINQ to get the matching elements: students.Where(student => student.Name.StartsWith(search)).
So far that's a lot of smaller problems, which is simply too much to answer in a single one. Please try to break down your problems into smaller ones and search for their solutions. If you get stuck, ask a new question. That's all I can do for you now; a sketch of the first steps follows.
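Here is a hedged sketch of steps 1, 2 and 4 using CsvHelper; the Student properties and the headerless-file assumption come from the question, and the file path is illustrative:
// The question's file has no header row, so columns are mapped by index.
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;

public class Student
{
    [Index(0)] public string Name { get; set; }
    [Index(1)] public string Surname { get; set; }
    [Index(2)] public string City { get; set; }
    [Index(3)] public string State { get; set; }
}

// ... inside some method:
var config = new CsvConfiguration(CultureInfo.InvariantCulture) { HasHeaderRecord = false };
using (var reader = new StreamReader("students.csv")) // assumed path
using (var csv = new CsvReader(reader, config))
{
    var students = csv.GetRecords<Student>().ToList(); // step 2: read into objects
    var matches = students                             // step 4: filter with LINQ
        .Where(s => s.Name.StartsWith("prem"))
        .ToList();
}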

Working with CSV file

I've been working on and trying to solve this problem for maybe a whole week, and at this point I am wondering if I can solve it without diving even deeper into the C# language. I'm fairly new to C#, as well as to working with CSV files and sorting and organizing them, so I'm inexperienced in the whole spectrum of this.
I'm trying to sort a CSV file alphabetically, hide items that need to be hidden and have them have depth levels based on their parents, child and grandchild elements.
I've been successful with a couple of them, and written somewhat working code, but I don't know how to sort them alphabetically and give them the proper depth layer based on the parent and child they belong to.
Here's the mockup CSV that I've been trying to organize:
ID;MenuName;ParentID;isHidden;LinkURL
1;Company;NULL;False;/company
2;About Us;1;False;/company/aboutus
3;Mission;1;False;/company/mission
4;Team;2;False;/company/aboutus/team
5;Client 2;10;False;/references/client2
6;Client 1;10;False;/references/client1
7;Client 4;10;True;/references/client4
8;Client 5;10;True;/references/client5
10;References;NULL;False;/references
I've delimited the items by the semicolon and displayed the items that need to be shown, but I fail to sort them like I should.
The sorting should look like this:
Company
About Us
Team
Mission
References
Client 1
Client 2
I've tried to sort them or display them in that order by getting the index of the slash, but what the code reproduces is not how it should be displayed, and, it looks like this:
Company
About Us
Mission
Team
Client 2
Client 1
References
In the other try, where I recursively match their parent id with the id, the console display looks like this:
Company
About Us
Mission
Team
Client 2
Client 1
References
I've tried solving this with a friend, and, even he doesn't know how to approach this problem, since this code should work on a different file that uses different parent ids.
On top of all this, I am unable to index them to an array, because there's only index of 0 or the index is based on their letters or crashes the console if I enter the index position of 1.
Here's the code for the first part where I fail to sort them:
class Program
{
static void Main(string[] args)
{
StreamReader sr = new StreamReader(@"Navigation.csv");
string data = sr.ReadLine();
while (data != null)
{
string[] rows = data.Split(';');
int id;
int parentId;
bool ids = Int32.TryParse(rows[0], out id);
string name = rows[1];
bool pIds = Int32.TryParse(rows[2], out parentId);
string isHidden = rows[3];
string linkUrl = rows[4];
string[] splitted = linkUrl.Split('/');
if (isHidden == "False")
{
List<CsvParentChild> pIdCid = new List<CsvParentChild>()
{
new CsvParentChild(id, parentId, name, linkUrl)
};
}
data = sr.ReadLine();
}
}
}
class CsvParentChild
{
public int Id;
public int ParentId;
public string Name;
public string LinkUrl;
public List<CsvParentChild> Children = new List<CsvParentChild>();
public CsvParentChild(int id, int parentId, string name, string linkUrl)
{
Id = id;
ParentId = parentId;
Name = name;
LinkUrl = linkUrl;
string[] splitted = linkUrl.Split(new char[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
if (splitted.Length == 1)
{
Console.WriteLine($". { name }");
}
else if (splitted.Length == 2)
{
Console.WriteLine($".... { name }");
}
else if (splitted.Length == 3)
{
Console.WriteLine($"....... { name }");
}
}
}
And here's for the second part:
class Program
{
static void Main(string[] args)
{
// Get the path for the file
const string filePath = @"../../Navigation.csv";
// Read the file
StreamReader sr = new StreamReader(File.OpenRead(filePath));
string data = sr.ReadLine();
while (data != null)
{
string[] rows = data.Split(';');
ListItems lis = new ListItems();
int id;
int parentId;
// Get the rows/columns from the Csv file
bool ids = Int32.TryParse(rows[0], out id);
string name = rows[1];
bool parentIds = Int32.TryParse(rows[2], out parentId);
string isHidden = rows[3];
string linkUrl = rows[4];
// Split the linkUrl so that we get the position of the
// elements based on their slash
string [] splitted = linkUrl.Split(new char[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
// If item.isHidden == "False"
// then display the all items whose state is set to false.
// If the item.isHidden == "True", then display the item
// whose state is set to true.
if (isHidden == "False")
{
// Set the items
ListItems.data = new List<ListItems>()
{
new ListItems() { Id = id, Name = name, ParentId = parentId },
};
// Make a new instance of ListItems()
ListItems listItems = new ListItems();
// Loop through the CSV data
for (var i = 0; i < data.Count(); i++)
{
if (splitted.Length == 1)
{
listItems.ListThroughItems(i, i);
}
else if (splitted.Length == 2)
{
listItems.ListThroughItems(i, i);
}
else
{
listItems.ListThroughItems(i, i);
}
}
}
// Break out of infinite loop
data = sr.ReadLine();
}
}
public class ListItems
{
public int Id { get; set; }
public string Name { get; set; }
public int ParentId { get; set; }
public static List<ListItems> data = null;
public List<ListItems> Children = new List<ListItems>();
// http://stackoverflow.com/a/36250045/7826856
public void ListThroughItems(int id, int level)
{
Id = id;
// Match the parent id with the id
List<ListItems> children = data
.Where(p => p.ParentId == id)
.ToList();
foreach (ListItems child in children)
{
string depth = new string('.', level * 4);
Console.WriteLine($".{ depth } { child.Name }");
ListThroughItems(child.Id, level + 1);
}
}
}
}
For each item, you need to construct a kind of "sort array" consisting of ids. The sort array consists of the ids of the item's ancestors in order from most distant to least distant. For "Team", our sort array is [1, 2, 4].
Here are the sort arrays of each item:
[1]
[1, 2]
[1, 3]
[1, 2, 4]
[10, 5]
[10, 6]
[10, 7]
[10, 8]
[10]
Once you have this, sorting the items is simple. When comparing two "sort arrays", start with the numbers in order in each array. If they are different, sort according to the value of the first number and you're done. If they are the same, look at the second number. If there is no second number, then sort by the length of the arrays, i.e., nothing comes before something.
Applying this algorithm, we get:
[1]
[1, 2]
[1, 2, 4]
[1, 3]
[10]
[10, 5]
[10, 6]
[10, 7]
[10, 8]
After that, hide the items based on the flag. I leave that to you because it's so simple. Depth is easy: It's the length of the sort array.
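A hedged sketch of that comparison (Item, SortKey and CompareKeys are assumed names; byId maps each ID to its row):
// Build each item's ancestor-id chain ("sort array"), then compare chains
// lexicographically; "nothing comes before something".
using System;
using System.Collections.Generic;

class Item
{
    public int Id;
    public int? ParentId;
    public string Name;
}

static class SortArrayDemo
{
    // byId maps every ID to its item, so we can walk up the parent chain.
    public static List<int> SortKey(Item item, Dictionary<int, Item> byId)
    {
        var key = new List<int>();
        for (Item cur = item; cur != null;
             cur = cur.ParentId.HasValue ? byId[cur.ParentId.Value] : null)
            key.Insert(0, cur.Id); // prepend: most distant ancestor first
        return key;
    }

    public static int CompareKeys(List<int> a, List<int> b)
    {
        for (int i = 0; i < Math.Min(a.Count, b.Count); i++)
            if (a[i] != b[i]) return a[i].CompareTo(b[i]);
        return a.Count.CompareTo(b.Count); // shorter (the ancestor) sorts first
    }
}

// usage: items.Sort((x, y) => SortArrayDemo.CompareKeys(
//            SortArrayDemo.SortKey(x, byId), SortArrayDemo.SortKey(y, byId)));
// depth for indentation is SortKey(item, byId).Count - 1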
My application compiled and produced the following output with your data:
Company
About Us
Team
Mission
References
Client 1
Client 2
Client 4
Client 5
I would attempt to use object relations to create your tree-like structure.
The main difficulty with the question is that parents don't matter. Children do.
So at some point in your code, you will need to reverse the hierarchy; Parsing Children first but reading their Parents first to create the output.
The roots of our tree are the data entries without parents.
Parsing
This should be pretty self-explanatory: we have a nice class with a constructor that parses the input array and stores the data in its properties.
We store all the rows in a list. After we are done with this, we have pretty much converted the file into a list, but no sorting has happened at all.
public partial class csvRow
{
// Your Data
public int Id { get; private set; }
public string MenuName { get; private set; }
public int? ParentId { get; private set; }
public bool isHidden { get; private set; }
public string LinkURL { get; private set; }
public csvRow(string[] arr)
{
Id = Int32.Parse(arr[0]);
MenuName = arr[1];
//Parent Id can be null!
ParentId = ToNullableInt(arr[2]);
isHidden = bool.Parse(arr[3]);
LinkURL = arr[4];
}
private static int? ToNullableInt(string s)
{
int i;
if (int.TryParse(s, out i))
return i;
else
return null;
}
}
static void Main(string[] args)
{
List<csvRow> unsortedRows = new List<csvRow>();
// Read the file
const string filePath = @"Navigation.csv";
StreamReader sr = new StreamReader(File.OpenRead(filePath));
string data = sr.ReadLine();
//Read each line
while (data != null)
{
var dataSplit = data.Split(';');
//We need to avoid parsing the first line.
if (dataSplit[0] != "ID" )
{
csvRow lis = new csvRow(dataSplit);
unsortedRows.Add(lis);
}
// Break out of infinite loop
data = sr.ReadLine();
}
sr.Dispose();
//At this point we got our data in our List<csvRow> unsortedRows
//It's parsed nicely. But we still need to sort it.
//So let's get ourselves the root values. Those are the data entries that don't have a parent.
//Please Note that the main method continues afterwards.
Creating our Tree Structure and Sorting the Items
We start by defining children and a public ChildrenSorted property that returns them sorted. That's actually all the sorting we are doing; it's a lot easier to sort here than to work it into the recursion.
We also need a function that adds children. It will pretty much filter the input and find all the rows where row.ParentId == this.Id.
The last one is the function that defines our output and allows us to get something we can print to the console.
public partial class csvRow
{
private List<csvRow> children = new List<csvRow>();
public List<csvRow> ChildrenSorted
{
get
{
// This is a quite neat way of sorting, isn't it?
//Btw this is all the sorting we are doing, recursion for the win!
return children.OrderBy(row => row.MenuName).ToList();
}
}
public void addChildrenFrom(List<csvRow> unsortedRows)
{
// Adds only rows where this is the parent.
this.children.AddRange(unsortedRows.Where(
//Avoid running into null errors
row => row.ParentId.HasValue &&
//Find actual children
row.ParentId == this.Id &&
//Avoid adding a child twice. This shouldn't be a problem with your data,
//but why not be careful?
!this.children.Any(child => child.Id == row.Id)));
//And this is where the magic happens. We are doing this recursively.
foreach (csvRow child in this.children)
{
child.addChildrenFrom(unsortedRows);
}
}
//Depending on your use case this function should be replaced with something
//that actually makes sense for your business logic, it's an example on
//how to read from a recursive structure.
public List<string> FamilyTree
{
get
{
List<string> myFamily = new List<string>();
myFamily.Add(this.MenuName);
//Merges the Trees with itself as root.
foreach (csvRow child in this.ChildrenSorted)
{
foreach (string familyMember in child.FamilyTree)
{
//Adds a tab for all children, grandchildren etc.
myFamily.Add("\t" + familyMember);
}
}
return myFamily;
}
}
}
Adding Items to the Tree and accessing them
This is the second part of my main function, where we actually work with our data (Right after sr.Dispose();)
var roots = unsortedRows.Where(row => row.ParentId.HasValue == false).
OrderBy(root => root.MenuName).ToList();
foreach (csvRow root in roots)
{
root.addChildrenFrom(unsortedRows);
}
foreach (csvRow root in roots)
{
foreach (string FamilyMember in root.FamilyTree)
{
Console.WriteLine(FamilyMember);
}
}
Console.Read();
}
Entire Source Code (Visual Studio C# Console Application)
You can use this to test, play around and learn more about recursive structures.
Copyright 2017 Eldar Kersebaum
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
using System;
using System.IO;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApplication49
{
class Program
{
static void Main(string[] args)
{
List<csvRow> unsortedRows = new List<csvRow>();
const string filePath = @"Navigation.csv";
StreamReader sr = new StreamReader(File.OpenRead(filePath));
string data = sr.ReadLine();
while (data != null)
{
var dataSplit = data.Split(';');
//We need to avoid parsing the first line.
if (dataSplit[0] != "ID" )
{
csvRow lis = new csvRow(dataSplit);
unsortedRows.Add(lis);
}
// Break out of infinite loop
data = sr.ReadLine();
}
sr.Dispose();
var roots = unsortedRows.Where(row => row.ParentId.HasValue == false).
OrderBy(root => root.MenuName).ToList();
foreach (csvRow root in roots)
{
root.addChildrenFrom(unsortedRows);
}
foreach (csvRow root in roots)
{
foreach (string FamilyMember in root.FamilyTree)
{
Console.WriteLine(FamilyMember);
}
}
Console.Read();
}
}
public partial class csvRow
{
// Your Data
public int Id { get; private set; }
public string MenuName { get; private set; }
public int? ParentId { get; private set; }
public bool isHidden { get; private set; }
public string LinkURL { get; private set; }
public csvRow(string[] arr)
{
Id = Int32.Parse(arr[0]);
MenuName = arr[1];
ParentId = ToNullableInt(arr[2]);
isHidden = bool.Parse(arr[3]);
LinkURL = arr[4];
}
private static int? ToNullableInt(string s)
{
int i;
if (int.TryParse(s, out i))
return i;
else
return null;
}
private List<csvRow> children = new List<csvRow>();
public List<csvRow> ChildrenSorted
{
get
{
return children.OrderBy(row => row.MenuName).ToList();
}
}
public void addChildrenFrom(List<csvRow> unsortedRows)
{
this.children.AddRange(unsortedRows.Where(
row => row.ParentId.HasValue &&
row.ParentId == this.Id &&
!this.children.Any(child => child.Id == row.Id)));
foreach (csvRow child in this.children)
{
child.addChildrenFrom(unsortedRows);
}
}
public List<string> FamilyTree
{
get
{
List<string> myFamily = new List<string>();
myFamily.Add(this.MenuName);
foreach (csvRow child in this.ChildrenSorted)
{
foreach (string familyMember in child.FamilyTree)
{
myFamily.Add("\t" + familyMember);
}
}
return myFamily;
}
}
}
}

Importing an Excel Sheet and Validating the Imported Data in a Loosely Coupled Way

I am trying to develop a module which will read Excel sheets (possibly from other data sources too, so it should be loosely coupled) and convert them into entities to save.
The logic will be this:
The Excel sheet can be in different formats; for example, the column names can differ, so my system needs to be able to map different fields to my entities.
For now I will assume the format defined above stays the same and is hardcoded, instead of coming dynamically from the database after being set up on some configuration-mapping UI.
The data needs to be validated before it even gets mapped. So I should be able to validate it beforehand against something. We're not using XSD or anything similar, so I should validate it against the object structure I am using as a template for importing.
The problem is, I put some things together, but I can't say I liked what I did. My question is how I can improve the code below, make things more modular, and fix the validation issues.
The code below is a mock-up and is not expected to work, just to see some structure of the design.
This is the code I've come up with so far. I've realized that I need to improve my design pattern skills, but for now I need your help:
//The Controller, a placeholder
class UploadController
{
//Somewhere here we call appropriate class and methods in order to convert
//excel sheet to dataset
}
After we upload the file using an MVC controller, there could be different controllers specialized for certain import behaviors; in this example I will be uploading person-related tables.
interface IDataImporter
{
void Import(DataSet dataset);
}
//We can use many other importers besides PersonImporter
class PersonImporter : IDataImporter
{
//We divide dataset to approprate data tables and call all the IImportActions
//related to Person data importing
//We call inserting to database functions here of the DataContext since this way
//we can do less db roundtrip.
public string PersonTableName {get;set;}
public string DemographicsTableName {get;set;}
public void Import(DataSet dataset)
{
CreatePerson(dataset);
CreateDemographics(dataset);
}
//We put different things in different methods to clear the field. High cohesion.
private void CreatePerson(DataSet dataset)
{
var personDataTable = GetDataTable(dataset,PersonTableName);
IImportAction addOrUpdatePerson = new AddOrUpdatePerson();
addOrUpdatePerson.MapEntity(personDataTable);
}
private void CreateDemographics(DataSet dataset)
{
var demographicsDataTable = GetDataTable(dataset,DemographicsTableName);
IImportAction demoAction = new AddOrUpdateDemographic(demographicsDataTable);
demoAction.MapEntity();
}
private DataTable GetDataTable(DataSet dataset, string tableName)
{
return dataset.Tables[tableName];
}
}
I have IDataImporter and the specialized concrete class PersonImporter. However, I am not sure it looks good so far, since things should be SOLID, basically easy to extend later in the project cycle; this will be a foundation for future improvements. Let's keep going:
IImportActions are where the magic mostly happens. Instead of designing things table-based, I am developing them behavior-based, so one can call any of them to import things in a more modular model. For example, a table may have 2 different actions.
interface IImportAction
{
void MapEntity(DataTable table);
}
//A sample import action, AddOrUpdatePerson
class AddOrUpdatePerson : IImportAction
{
//Consider using default values as well?
public string FirstName {get;set;}
public string LastName {get;set;}
public string EmployeeId {get;set;}
public string Email {get;set;}
public void MapEntity(DataTable table)
{
//Each action is producing its own data context since they use
//different actions.
using(var dataContext = new DataContext())
{
foreach(DataRow row in table.Rows)
{
var emailValidation = ValidationFactory.EmailValidation.Value;
if(!emailValidation.Validate(row[Email]))
{
LoggingService.LogWarning(emailValidation.ValidationMessage);
}
var person = new Person(){
FirstName = row[FirstName],
LastName = row[LastName],
EmployeeId = row[EmployeeId],
Email = row[Email]
};
dataContext.SaveObject(person);
}
dataContext.SaveChangesToDatabase();
}
}
}
class AddOrUpdateDemographic: IImportAction
{
static string Name {get;set;}
static string EmployeeId {get;set;}
//So here for example, we will need to save dataContext first before passing it in
//to get the PersonId from Person (we're assuming that we need PersonId for Demograhics)
public void MapEntity(DataTable table)
{
using(var dataContext = new DataContext())
{
foreach(DataRow row in table.Rows)
{
var demographic = new Demographic(){
Name = row[Name],
PersonId = dataContext.People.First(t => t.EmployeeId == int.Parse(row["EmpId"]))
};
dataContext.SaveObject(demographic);
}
dataContext.SaveChangesToDatabase();
}
}
}
And the validation, which is mostly where I struggle, unfortunately. The validation needs to be easy to extend and loosely coupled, and I also need to be able to call this validation beforehand instead of adding everything.
public static class ValidationFactory
{
public static Lazy<IFieldValidation> PhoneValidation = new Lazy<IFieldValidation>(()=>new PhoneNumberValidation());
public static Lazy<IFieldValidation> EmailValidation = new Lazy<IFieldValidation>(()=>new EmailValidation());
//etc.
}
interface IFieldValidation
{
string ValidationMessage { get; set; }
bool Validate(object value);
}
class PhoneNumberValidation : IFieldValidation
{
public string ValidationMessage { get; set; }
public bool Validate(object value)
{
var validated = true; //lets say...
var innerValue = (string) value;
//validate innerValue using Regex or something
//if validation fails, then set the ValidationMessage property for logging.
return validated;
}
}
class EmailValidation : IFieldValidation
{
public string ValidationMessage { get; set; }
public bool Validate(object value)
{
var validated = true; //lets say...
var innerValue = (string) value;
//validate innerValue using Regex or something
//if validation fails, then set the ValidationMessage property for logging.
return validated;
}
}
I have done the same thing on a project. The difference is that I didn't have to import Excel sheets, but CSV files. I created a CSVValueProvider. And, therefore, the CSV data was bound to my IEnumerable model automatically.
As for validation, I figured that going through all rows and cells and validating them one by one is not very efficient, especially when the CSV file has thousands of records. So, what I did was create some validation methods that went through the CSV data column by column instead of row by row, did a LINQ query on each column, and returned the row numbers of the cells with invalid data. Then, I added the invalid row number/column names to ModelState.
UPDATE:
Here is what I have done...
CSVReader Class:
// A class that can read and parse the data in a CSV file.
public class CSVReader
{
// Regex expression that's used to parse the data in a line of a CSV file
private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";
// String array to hold the headers (column names)
private string[] _headers;
// List of string arrays to hold the data in the CSV file. Each string array in the list represents one line (row).
private List<string[]> _rows;
// The StreamReader class that's used to read the CSV file.
private StreamReader _reader;
public CSVReader(StreamReader reader)
{
_reader = reader;
Parse();
}
// Reads and parses the data from the CSV file
private void Parse()
{
_rows = new List<string[]>();
string[] row;
int rowNumber = 1;
var headerLine = "RowNumber," + _reader.ReadLine();
_headers = GetEscapedSVs(headerLine);
rowNumber++;
while (!_reader.EndOfStream)
{
var line = rowNumber + "," + _reader.ReadLine();
row = GetEscapedSVs(line);
_rows.Add(row);
rowNumber++;
}
_reader.Close();
}
private string[] GetEscapedSVs(string data)
{
if (!data.EndsWith(","))
data = data + ",";
return GetEscapedSVs(data, ",", "\"");
}
// Parses each row by using the given separator and escape characters
private string[] GetEscapedSVs(string data, string separator, string escape)
{
string[] result = null;
int priorMatchIndex = 0;
MatchCollection matches = Regex.Matches(data, string.Format(ESCAPE_SPLIT_REGEX, separator, escape));
// Skip empty rows...
if (matches.Count > 0)
{
result = new string[matches.Count];
for (int index = 0; index <= result.Length - 2; index++)
{
result[index] = data.Substring(priorMatchIndex, matches[index].Groups["Separator"].Index - priorMatchIndex);
priorMatchIndex = matches[index].Groups["Separator"].Index + separator.Length;
}
result[result.Length - 1] = data.Substring(priorMatchIndex, data.Length - priorMatchIndex - 1);
for (int index = 0; index <= result.Length - 1; index++)
{
if (Regex.IsMatch(result[index], string.Format("^{0}.*[^{0}]{0}$", escape)))
result[index] = result[index].Substring(1, result[index].Length - 2);
result[index] = result[index].Replace(escape + escape, escape);
if (result[index] == null || result[index] == escape)
result[index] = "";
}
}
return result;
}
// Returns the number of rows
public int RowCount
{
get
{
if (_rows == null)
return 0;
return _rows.Count;
}
}
// Returns the number of headers (columns)
public int HeaderCount
{
get
{
if (_headers == null)
return 0;
return _headers.Length;
}
}
// Returns the value in a given column name and row index
public object GetValue(string columnName, int rowIndex)
{
if (rowIndex >= _rows.Count)
{
return null;
}
var row = _rows[rowIndex];
int colIndex = GetColumnIndex(columnName);
if (colIndex == -1 || colIndex >= row.Length)
{
return null;
}
var value = row[colIndex];
return value;
}
// Returns the column index of the provided column name
public int GetColumnIndex(string columnName)
{
int index = -1;
for (int i = 0; i < _headers.Length; i++)
{
if (_headers[i].Replace(" ","").Equals(columnName, StringComparison.CurrentCultureIgnoreCase))
{
index = i;
return index;
}
}
return index;
}
}
CSVValueProviderFactory Class:
public class CSVValueProviderFactory : ValueProviderFactory
{
public override IValueProvider GetValueProvider(ControllerContext controllerContext)
{
var uploadedFiles = controllerContext.HttpContext.Request.Files;
if (uploadedFiles.Count > 0)
{
var file = uploadedFiles[0];
var extension = file.FileName.Split('.').Last();
if (extension.Equals("csv", StringComparison.CurrentCultureIgnoreCase))
{
if (file.ContentLength > 0)
{
var stream = file.InputStream;
var csvReader = new CSVReader(new StreamReader(stream, Encoding.Default, true));
return new CSVValueProvider(controllerContext, csvReader);
}
}
}
return null;
}
}
CSVValueProvider Class:
// Represents a value provider for the data in an uploaded CSV file.
public class CSVValueProvider : IValueProvider
{
private CSVReader _csvReader;
public CSVValueProvider(ControllerContext controllerContext, CSVReader csvReader)
{
if (controllerContext == null)
{
throw new ArgumentNullException("controllerContext");
}
if (csvReader == null)
{
throw new ArgumentNullException("csvReader");
}
_csvReader = csvReader;
}
public bool ContainsPrefix(string prefix)
{
if (prefix.Contains('[') && prefix.Contains(']'))
{
if (prefix.Contains('.'))
{
var header = prefix.Split('.').Last();
if (_csvReader.GetColumnIndex(header) == -1)
{
return false;
}
}
int index = int.Parse(prefix.Split('[').Last().Split(']').First());
if (index >= _csvReader.RowCount)
{
return false;
}
}
return true;
}
public ValueProviderResult GetValue(string key)
{
if (!key.Contains('[') || !key.Contains(']') || !key.Contains('.'))
{
return null;
}
object value = null;
var header = key.Split('.').Last();
int index = int.Parse(key.Split('[').Last().Split(']').First());
value = _csvReader.GetValue(header, index);
if (value == null)
{
return null;
}
return new ValueProviderResult(value, value.ToString(), CultureInfo.CurrentCulture);
}
}
For the validation, as I mentioned before, I figured that it would not be efficient to do it using DataAnnotation attributes. A row by row validation of the data would take a long time for CSV files with thousands of rows. So, I decided to validate the data in the Controller after the Model Binding is done. I should also mention that I needed to validate the data in the CSV file against some data in the database. If you just need to validate things like Email Address or Phone Number, you might as well just use DataAnnotation.
Here is a sample method for validating the Email Address column:
private void ValidateEmailAddress(IEnumerable<CSVViewModel> csvData)
{
var invalidRows = csvData.Where(d => ValidEmail(d.EmailAddress) == false).ToList();
foreach (var invalidRow in invalidRows)
{
var key = string.Format("csvData[{0}].{1}", invalidRow.RowNumber - 2, "EmailAddress");
ModelState.AddModelError(key, "Invalid Email Address");
}
}
private static bool ValidEmail(string email)
{
if(email == "")
return false;
else
return new System.Text.RegularExpressions.Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,6}$").IsMatch(email);
}
UPDATE 2:
For validation using DataAnnotaion, you just use DataAnnotation attributes in your CSVViewModel like below (the CSVViewModel is the class that your CSV data will be bound to in your Controller Action):
public class CSVViewModel
{
// Use proper names for your CSV columns; these are just examples...
[Required]
public int Column1 { get; set; }
[Required]
[StringLength(30)]
public string Column2 { get; set; }
}
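Outside of MVC model binding, the same attributes can also be checked by hand with the DataAnnotations Validator; a minimal sketch (the values below are illustrative):
// Checking the DataAnnotation attributes manually with Validator.TryValidateObject.
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

var model = new CSVViewModel { Column1 = 1, Column2 = "some value" };
var results = new List<ValidationResult>();
bool isValid = Validator.TryValidateObject(
    model, new ValidationContext(model), results, validateAllProperties: true);
foreach (var r in results)
    Console.WriteLine(r.ErrorMessage); // e.g. "The Column2 field is required."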
