XML attributes from SQL in C#

C# WinForms: I asked a similar question but didn't reach a solution, so I want to make it clearer. Suppose I have a string:
str = "<sample>
<sample1 name="val1">
<sample2 name="val2">
</sample2>
<sample2 name="val3">
<groupbox name="val4">
<field type="textArea" x="xxx" />
</groupbox>
</sample2>
</sample1>
<sample1 name="abc">
</sample1>
<sample1 name="xyz">
</sample1>
</sample>"
I want to get the attributes and their values from this string and display them in a GridView, or in any other control such as a rich text box. Note that this string is just an example; its contents could change.

I've given this solution before - this code will parse your one XML string and it will return the list of attributes and their values - so what else / what more do you need??
private static List<KeyValuePair<string, string>> ParseForAttributeNames(string xmlContent)
{
List<KeyValuePair<string, string>> attributeNamesAndValues = new List<KeyValuePair<string, string>>();
XDocument xmlDoc = XDocument.Parse(xmlContent);
var nodeAttrs = xmlDoc.Descendants().Select(x => x.Attributes());
foreach (var attrs in nodeAttrs)
{
foreach (var attr in attrs)
{
string attributeName = attr.Name.LocalName;
string attributeValue = attr.Value;
attributeNamesAndValues.Add(new KeyValuePair<string, string>(attributeName, attributeValue));
}
}
return attributeNamesAndValues;
}
If you can explain in good, comprehensible English what else you need, I might make the effort to answer you once again with even more info.... but you need to be clear and precise about what it is you need - otherwise me as an illiterate idiot won't be able to answer......
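If the goal is a grid rather than a list of pairs, the same LINQ to XML traversal can fill a DataTable directly. The following is only a sketch under assumptions: the class name XmlAttributeTable and the column names "Element", "Attribute" and "Value" are made up for illustration and do not come from the question.

```csharp
using System.Data;
using System.Linq;
using System.Xml.Linq;

public static class XmlAttributeTable
{
    // Builds a three-column table: owning element, attribute name, attribute value.
    // The column names are placeholders chosen for this sketch.
    public static DataTable FromXml(string xmlContent)
    {
        var table = new DataTable("Attributes");
        table.Columns.Add("Element", typeof(string));
        table.Columns.Add("Attribute", typeof(string));
        table.Columns.Add("Value", typeof(string));

        XDocument doc = XDocument.Parse(xmlContent);

        // Descendants() visits every element in document order;
        // SelectMany flattens each element's attributes into one sequence.
        foreach (XAttribute attr in doc.Descendants().SelectMany(e => e.Attributes()))
        {
            table.Rows.Add(attr.Parent.Name.LocalName, attr.Name.LocalName, attr.Value);
        }

        return table;
    }
}
```

In a WinForms form this could then be bound with something like dataGridView1.DataSource = XmlAttributeTable.FromXml(str); where dataGridView1 is a hypothetical control name.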

try this:
StringReader rdr = new StringReader(str);
DataSet ds = new DataSet();
// pass the reader, not the string: ReadXml(string) treats its argument as a file name
ds.ReadXml(rdr);
DataTable dt = ds.Tables[0];
datagridview.DataSource = dt;

For your other question that keeps coming up over and over again (grabbing XML attributes from SQL Server) - this is a complete sample that will
read all rows from a table and extract all the XML columns
parse all the XML contents into a list of attribute name/values
return a bindable list of attribute name/value pairs, which you can bind to a ListView, a GridView - whatever you like.
Please check it out - and if it doesn't meet your needs, please explain in comprehensible English what it is that's still missing.....
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Xml.Linq;
namespace GrabAndParseXml
{
internal class AttrNameAndValue
{
public string AttrName { get; set; }
public string AttrValue { get; set; }
}
class Program
{
static void Main(string[] args)
{
// grab *ALL* the "XmlContent" columns from your database table
List<string> xmlContent = GrabXmlStringsFromDatabase();
// parse *ALL* your xml strings into a list of attribute name/values
List<AttrNameAndValue> attributeNamesAndValues = ParseForAttributeNamesAndValues(xmlContent);
// you can now easily bind this list of attribute names and values to a ListView, a GridView - whatever - try it!
}
private static List<string> GrabXmlStringsFromDatabase()
{
List<string> results = new List<string>();
// connection string - adapt to **YOUR SETUP** !
string connection = "server=(local);database=test;integrated security=SSPI";
// Query to get the XmlContent columns - I would **ALWAYS** recommend to have a WHERE clause
// to limit the number of rows returned from the query - up to you....
string query = "SELECT XmlContent FROM dbo.TestXml WHERE 1=1";
// set up connection and command for data retrieval
using(SqlConnection _con = new SqlConnection(connection))
using (SqlCommand _cmd = new SqlCommand(query, _con))
{
_con.Open();
// use a SqlDataReader to loop over the results
using(SqlDataReader rdr = _cmd.ExecuteReader())
{
while(rdr.Read())
{
// stick all XML strings into resulting list
results.Add(rdr.GetString(0));
}
rdr.Close();
}
_con.Close();
}
return results;
}
private static List<AttrNameAndValue> ParseForAttributeNamesAndValues(List<string> xmlContents)
{
// create resulting list of "AttrNameAndValue" objects
List<AttrNameAndValue> attributeNamesAndValues = new List<AttrNameAndValue>();
// loop over all XML strings retrieved from the database
foreach (string xmlContent in xmlContents)
{
// parse into an XDocument (Linq-to-XML)
XDocument xmlDoc = XDocument.Parse(xmlContent);
// find **ALL** attribute nodes
var nodeAttrs = xmlDoc.Descendants().Select(x => x.Attributes());
// loop over **ALL** attributes in each attribute node
foreach (var attrs in nodeAttrs)
{
foreach (var attr in attrs)
{
// stick name and value into the resulting list
attributeNamesAndValues.Add(new AttrNameAndValue { AttrName = attr.Name.LocalName, AttrValue = attr.Value });
}
}
}
return attributeNamesAndValues;
}
}
}

C# Reading CSV to DataTable and Invoke Rows/Columns

I am currently working on a small project and I am stuck on a problem I cannot manage to solve.
I have multiple CSV files I want to read; they all have the same layout, just with different values.
Header1;Value1;Info1
Header2;Value2;Info2
Header3;Value3;Info3
While reading the first file I need to create the headers. The problem is that they are not split into columns but into rows (as you can see above, Header1 to Header3).
Then it needs to read Value1 to Value3 (listed in the second column), and on top of that I need to create another header, Header4, holding the data of "Info2", which is always in column 3, row 2 (the other values in column 3 can be ignored).
So the outcome after the first file should look like this:
Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Info2;
And after multiple files it should look like this:
Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Value4;
Value1b;Value2b;Value3b;Value4b;
Value1c;Value2c;Value3c;Value4c;
I tried it with OleDb, but I get the error "missing ISAM", which I can't manage to fix. The code I used is the following:
public DataTable ReadCsv(string fileName)
{
DataTable dt = new DataTable("Data");
/* using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" +
Path.GetDirectoryName(fileName) + "\";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
*/
using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
Path.GetDirectoryName(fileName) + ";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
{
using(OleDbCommand cmd = new OleDbCommand(string.Format("select *from [{0}]", new FileInfo(fileName).Name,cn)))
{
cn.Open();
using(OleDbDataAdapter adapter = new OleDbDataAdapter(cmd))
{
adapter.Fill(dt);
}
}
}
return dt;
}
Another attempt used a StreamReader, but the headers end up in the wrong place and I don't know how to change this and do it for every file. The code I tried is the following:
public static DataTable ReadCsvFilee(string path)
{
DataTable oDataTable = new DataTable();
var fileNames = Directory.GetFiles(path);
foreach (var fileName in fileNames)
{
//initialising a StreamReader type variable and will pass the file location
StreamReader oStreamReader = new StreamReader(fileName);
// CONTROLS WHETHER WE SKIP A ROW OR NOT
int RowCount = 0;
// CONTROLS WHETHER WE CREATE COLUMNS OR NOT
bool hasColumns = false;
string[] ColumnNames = null;
string[] oStreamDataValues = null;
//using while loop read the stream data till end
while (!oStreamReader.EndOfStream)
{
String oStreamRowData = oStreamReader.ReadLine().Trim();
if (oStreamRowData.Length > 0)
{
oStreamDataValues = oStreamRowData.Split(';');
//Bcoz the first row contains column names, we will poluate
//the column name by
//reading the first row and RowCount-0 will be true only once
// CHANGE TO CHECK FOR COLUMNS CREATED
if (!hasColumns)
{
ColumnNames = oStreamRowData.Split(';');
//using foreach looping through all the column names
foreach (string csvcolumn in ColumnNames)
{
DataColumn oDataColumn = new DataColumn(csvcolumn.ToUpper(), typeof(string));
//setting the default value of empty.string to newly created column
oDataColumn.DefaultValue = string.Empty;
//adding the newly created column to the table
oDataTable.Columns.Add(oDataColumn);
}
// SET COLUMNS CREATED
hasColumns = true;
// SET RowCount TO 0 SO WE KNOW TO SKIP COLUMNS LINE
RowCount = 0;
}
else
{
// IF RowCount IS 0 THEN SKIP COLUMN LINE
if (RowCount++ == 0) continue;
//creates a new DataRow with the same schema as of the oDataTable
DataRow oDataRow = oDataTable.NewRow();
//using foreach looping through all the column names
for (int i = 0; i < ColumnNames.Length; i++)
{
oDataRow[ColumnNames[i]] = oStreamDataValues[i] == null ? string.Empty : oStreamDataValues[i].ToString();
}
//adding the newly created row with data to the oDataTable
oDataTable.Rows.Add(oDataRow);
}
}
}
//close the oStreamReader object
oStreamReader.Close();
//release all the resources used by the oStreamReader object
oStreamReader.Dispose();
}
return oDataTable;
}
I am thankful for everyone who is willing to help. And Thanks for reading this far!
Sincerely yours
If I understood you right, there is a strict parsing there like this:
string OpenAndParse(string filename, bool firstFile=false)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
var header = $"{parsed[0][0]};{parsed[1][0]};{parsed[2][0]};{parsed[1][0]}\n";
var data = $"{parsed[0][1]};{parsed[1][1]};{parsed[2][1]};{parsed[1][2]}\n";
return firstFile
? $"{header}{data}"
: $"{data}";
}
Where it would return - if first file:
Header1;Header2;Header3;Header2
Value1;Value2;Value3;Value4
if not first file:
Value1;Value2;Value3;Value4
If I am correct, the rest is about running this against a list of files and joining the results in an output file.
EDIT: Against a directory:
void ProcessFiles(string folderName, string outputFileName)
{
bool firstFile = true;
foreach (var f in Directory.GetFiles(folderName))
{
File.AppendAllText(outputFileName, OpenAndParse(f, firstFile));
firstFile = false;
}
}
Note: I missed you want a DataTable and not an output file. Then you could simply create a list and put the results into that list making the list the datasource for your datatable (then why would you use semicolons in there? Probably all you need is to simply attach the array values to a list).
(Adding as another answer just to make it uncluttered)
void ProcessMyFiles(string folderName)
{
List<MyData> d = new List<MyData>();
var files = Directory.GetFiles(folderName);
foreach (var file in files)
{
OpenAndParse(file, d);
}
string[] headers = GetHeaders(files[0]);
DataGridView dgv = new DataGridView {Dock=DockStyle.Fill};
dgv.DataSource = d;
dgv.ColumnAdded += (sender, e) => {e.Column.HeaderText = headers[e.Column.Index];};
Form f = new Form();
f.Controls.Add(dgv);
f.Show();
}
string[] GetHeaders(string filename)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
return new string[] { parsed[0][0], parsed[1][0], parsed[2][0], parsed[1][0] };
}
void OpenAndParse(string filename, List<MyData> d)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
var data = new MyData
{
Col1 = parsed[0][1],
Col2 = parsed[1][1],
Col3 = parsed[2][1],
Col4 = parsed[1][2]
};
d.Add(data);
}
public class MyData
{
public string Col1 { get; set; }
public string Col2 { get; set; }
public string Col3 { get; set; }
public string Col4 { get; set; }
}
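If a DataTable is the target, the same positional parsing can append one row per file to a table instead of a list. This is a sketch only, assuming every file has at least two non-empty lines with at least two fields each and a third field on the second line; the class name TransposedCsv and the extra column name "Header4" are placeholders, not something taken from the question's files.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class TransposedCsv
{
    // Appends one row to the table from the lines of a single file.
    // Column names come from the first field of each line; the extra
    // column name "Header4" is a placeholder chosen for this sketch.
    public static void AppendFile(DataTable table, IEnumerable<string> lines)
    {
        string[][] parsed = lines
            .Where(l => l.Trim().Length > 0)
            .Select(l => l.Split(';'))
            .ToArray();

        // Create the columns only once, from the first file seen.
        if (table.Columns.Count == 0)
        {
            foreach (string[] fields in parsed)
            {
                table.Columns.Add(fields[0], typeof(string));
            }
            table.Columns.Add("Header4", typeof(string));
        }

        // Second field of every line, then the third field of the second line.
        List<string> values = parsed.Select(f => f[1]).ToList();
        values.Add(parsed[1][2]);
        table.Rows.Add(values.ToArray());
    }
}
```

Calling it for a folder could look like foreach (var f in Directory.GetFiles(folderName)) TransposedCsv.AppendFile(table, File.ReadLines(f)); after which the table can be assigned to a DataGridView's DataSource.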
I don't know if this is the best way to do this, but what I would do in your case is rewrite the CSVs the conventional way while reading all the files, then create a stream containing the newly created CSV.
It would look something like this:
var csv = new StringBuilder();
csv.AppendLine("Header1;Header2;Header3;Header4");
foreach (var item in file)
{
// use the same ';' delimiter as the header line above
var newLine = string.Format("{0};{1};{2};{3}", item.value1, item.value2, item.value3, item.value4);
csv.AppendLine(newLine);
}
//Create Stream
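// requires using System.Text; the stream has to wrap the CSV text, otherwise it is empty
MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(csv.ToString()));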
StreamReader reader = new StreamReader(stream);
//Fill your data table here with your values
Hope this will help.

c# bind DataGridView to IEnumerable<XElement> from LINQ to XML filtered results

My XML looks like this
<root>
<record>
<Object_Number> 1</Object_Number>
<Object_Level> 1</Object_Level>
<Object_Heading> Introduction</Object_Heading>
<Object_Text> </Object_Text>
<Milestones> </Milestones>
<Unique_ID> </Unique_ID>
<Field_type> Info</Field_type>
<SG_attribute> </SG_attribute>
<Object_Identifier>1</Object_Identifier>
<Object_URL>doors://D1DDBAPP04:36677/?version=2&prodID=0&view=0000001a&urn=urn:telelogic::1-432aa0956f684cff-O-1-00028f60</Object_URL>
</record>
...
records...
...
</root>
Is it possible to bind an IEnumerable result to a DataGridView and detect the columns automatically?
Initially, I've done this
ds = new DataSet();
ds.ReadXml(myXml);
Then, convert to a DataTable
dt = ds.Tables["record"]
And this can directly populate DGV
dgvDoors.DataSource = dt;
But now I realize that it's easier to manipulate the data directly in XML (with LINQ), and I need some way to display the (filtered) results in a DataGridView
IEnumerable<XElement> elements = xdoc.Element("root").Elements("record");
Now, is it possible to display 'elements' in a DataGridView and detect the columns as in the original XML?
Thank you,
PS.
var bs = new BindingSource { DataSource = elements};
dgvDoors.DataSource = bs;
This does not work correctly: instead of the records, the DGV displays other columns such as
FirstAttribute
HasAttributes
HasElements
...
To make it work properly, I would recommend converting your XML data to strongly typed view models.
public class RecordViewModel
{
public string Number { get; set; }
public string Level { get; set; }
public string Heading { get; set; }
public string Milestones { get; set; }
}
Below is the implementation; please let me know if it works as you expect:
var elements = xdoc.Element("root").Elements("record")
.Select(e => new RecordViewModel
{
Number = e.Element("Object_Number").Value,
Level = e.Element("Object_Level").Value,
Heading = e.Element("Object_Heading").Value,
Milestones = e.Element("Milestones").Value,
});
var bs = new BindingSource
{
DataSource = elements
};
dgvDoors.DataSource = bs;
The conversion from XML data to view models above does not check for nulls, so you may want to move it into a mapper, where the more complex conversion logic can live.
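A null-tolerant mapper could be sketched as follows. It relies on the explicit XElement-to-string conversion, which yields null for a missing element where .Value would throw. The names RecordMapper and Text are hypothetical, and trimming the values is an assumption based on the leading spaces in the sample XML.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class RecordViewModel
{
    public string Number { get; set; }
    public string Level { get; set; }
    public string Heading { get; set; }
    public string Milestones { get; set; }
}

public static class RecordMapper
{
    // (string)element is null when the element is missing,
    // so a missing child becomes an empty string instead of an exception.
    private static string Text(XElement parent, string name)
    {
        return ((string)parent.Element(name) ?? string.Empty).Trim();
    }

    // Materializes the records into a list so the grid binds to a stable collection.
    public static List<RecordViewModel> Map(XDocument xdoc)
    {
        XElement root = xdoc.Element("root");
        if (root == null)
        {
            return new List<RecordViewModel>();
        }

        return root.Elements("record")
            .Select(e => new RecordViewModel
            {
                Number = Text(e, "Object_Number"),
                Level = Text(e, "Object_Level"),
                Heading = Text(e, "Object_Heading"),
                Milestones = Text(e, "Milestones")
            })
            .ToList();
    }
}
```

The result can be handed to the BindingSource in place of the lazy query, for example new BindingSource { DataSource = RecordMapper.Map(xdoc) }.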
Try the following, which matches your request
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication49
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
XDocument doc = XDocument.Load(FILENAME);
string[] columnNames = doc.Descendants("record").FirstOrDefault().Elements().Select(x => x.Name.LocalName).ToArray();
DataTable dt = new DataTable();
foreach (string col in columnNames)
{
dt.Columns.Add(col, typeof(string));
}
foreach (XElement record in doc.Descendants("record"))
{
DataRow newRow = dt.Rows.Add();
foreach (string columnName in columnNames)
{
newRow[columnName] = (string)record.Element(columnName);
}
}
}
}
}

OleDb / DataTable to Model

I have to extract data from an .mdb file on the server. I can open and access the data. Now I have to map it to a model, and I have no idea how to receive the data as a model. My idea was to loop through the received DataTable data and assign it to values of the model type:
OleDbCommand command = new OleDbCommand(sqlcommand, DbConnection);
adapter = new OleDbDataAdapter(command);
builder = new OleDbCommandBuilder(adapter);
dt = new DataTable();
try
{
DbConnection.Open();
adapter.Fill(dt);
foreach (DataRow row in dt.Rows)
{
var examplemodel= new exampleModel(
Id = row.ItemArray[0],
...
);
}
}
catch (Exception ex)
{
}
finally
{
DbConnection.Close();
}
The problem here is that I cannot assign row.ItemArray[x] to a member of the model, since row.ItemArray[x] is of type object and I cannot convert it to an int, a string or whatever.
Also I thought that there is maybe a simpler and cleaner approach to this problem.
Any ideas or suggestion are much appreciated.
You can use this
var Entity=(from DataRow dataRow in data.Rows select YourEntity<exampleModel>(dataRow)).ToList();
The Helper Class
public static T YourEntity<T>(DataRow row) where T : new()
{
var entity = new T();
var properties = typeof(T).GetProperties();
foreach (var property in properties)
{
//Get the description attribute
var descriptionAttribute = (DescriptionAttribute)property.GetCustomAttributes(typeof(DescriptionAttribute), true).SingleOrDefault();
if (descriptionAttribute == null)
continue;
property.SetValue(entity, row[descriptionAttribute.Description]);
}
return entity;
}
and decorate the entity's properties with the matching DataTable column headers
public class exampleModel
{
....
[Description("Subentity_datatable_header_header")]
public string Subentity { get; set; }
....
}
I prefer simple Linq. Whatever approach it may be, it is very clean and safe if we use the column names instead of index.
You could do something like.
dt.AsEnumerable()
.Select(r=> new exampleModel()
{
Id = r.Field<int>("col1"),
Name = r.Field<string>("col2"),
...
});
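One detail worth sketching is what happens when a column contains DBNull: Field<T> with a nullable type argument returns null in that case, so a default can be supplied with ??. The model ExampleRow and the column names "Id" and "Name" below are assumptions for illustration, and the "Id" column is assumed to be typed as int in the DataTable (otherwise Field<int?> throws InvalidCastException).

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical model standing in for the question's exampleModel.
public class ExampleRow
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class RowMapper
{
    // Maps every DataRow to a model by column name.
    // Field<int?> and Field<string> return null for DBNull,
    // so ?? supplies a fallback value instead of throwing.
    public static List<ExampleRow> ToModels(DataTable dt)
    {
        return dt.AsEnumerable()
            .Select(r => new ExampleRow
            {
                Id = r.Field<int?>("Id") ?? 0,
                Name = r.Field<string>("Name") ?? string.Empty
            })
            .ToList();
    }
}
```

On .NET Framework, AsEnumerable and Field<T> need a reference to System.Data.DataSetExtensions.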

Read CSV file in DataGridView

I want to read a csv-file into a Datagridview. I would like to have a class and a function which reads the csv like this one:
class Import
{
public DataTable readCSV(string filePath)
{
DataTable dt = new DataTable();
using (StreamReader sr = new StreamReader(filePath))
{
string strLine = sr.ReadLine();
string[] strArray = strLine.Split(';');
foreach (string value in strArray)
{
dt.Columns.Add(value.Trim());
}
DataRow dr = dt.NewRow();
while (sr.Peek() >= 0)
{
strLine = sr.ReadLine();
strArray = strLine.Split(';');
dt.Rows.Add(strArray);
}
}
return dt;
}
}
and call it:
Import imp = new Import();
DataTable table = imp.readCSV(filePath);
foreach(DataRow row in table.Rows)
{
dataGridView.Rows.Add(row);
}
Result of this is-> rows are created but there is no data in the cells!!
First solution, using a little bit of LINQ
public DataTable readCSV(string filePath)
{
var dt = new DataTable();
// Creating the columns
File.ReadLines(filePath).Take(1)
.SelectMany(x => x.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
.ToList()
.ForEach(x => dt.Columns.Add(x.Trim()));
// Adding the rows
File.ReadLines(filePath).Skip(1)
.Select(x => x.Split(';'))
.ToList()
.ForEach(line => dt.Rows.Add(line));
return dt;
}
Below another version using foreach loop
public DataTable readCSV(string filePath)
{
var dt = new DataTable();
// Creating the columns
foreach(var headerLine in File.ReadLines(filePath).Take(1))
{
foreach(var headerItem in headerLine.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
{
dt.Columns.Add(headerItem.Trim());
}
}
// Adding the rows
foreach(var line in File.ReadLines(filePath).Skip(1))
{
dt.Rows.Add(line.Split(';'));
}
return dt;
}
First we use File.ReadLines, which returns an IEnumerable<string>, a lazily evaluated collection of lines. We use Take(1) to get just the first row, which should be the header, and then SelectMany, which flattens the string array returned by Split into a single sequence. We then call ToList so that we can use the ForEach method to add the columns to the DataTable.
To add the rows, we again use File.ReadLines, but now with Skip(1) to skip the header line. Select turns each line into a string array, giving an IEnumerable<string[]>; then we again call ToList, and finally ForEach to add each row to the DataTable. File.ReadLines is available from .NET 4.0.
Note: File.ReadLines doesn't read all lines up front. It returns an IEnumerable<string> whose lines are lazily evaluated, so only the first line is read twice.
See the MSDN remarks
The ReadLines and ReadAllLines methods differ as follows: When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.
You can use the ReadLines method to do the following:
Perform LINQ to Objects queries on a file to obtain a filtered set of its lines.
Write the returned collection of lines to a file with the File.WriteAllLines(String, IEnumerable<String>) method, or append them to an existing file with the File.AppendAllLines(String, IEnumerable<String>) method.
Create an immediately populated instance of a collection that takes an IEnumerable<T> collection of strings for its constructor, such as a IList<T> or a Queue<T>.
This method uses UTF8 for the encoding value.
If you still have any doubt look this answer: What is the difference between File.ReadLines() and File.ReadAllLines()?
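Because the lines are lazily evaluated, the header and the data rows can also be consumed in a single pass over one enumerator, so nothing is read twice. The following is a minimal sketch of that idea, not part of the answer above; the class name DelimitedText is hypothetical, and it assumes no data line has more fields than the header (DataRowCollection.Add throws ArgumentException otherwise).

```csharp
using System.Collections.Generic;
using System.Data;

public static class DelimitedText
{
    // Builds a DataTable from a sequence of lines in one pass:
    // the first line supplies the column names, the rest become rows.
    public static DataTable ToDataTable(IEnumerable<string> lines, char delimiter)
    {
        var dt = new DataTable();
        using (IEnumerator<string> e = lines.GetEnumerator())
        {
            // Empty input: return a table without columns or rows.
            if (!e.MoveNext())
            {
                return dt;
            }

            foreach (string name in e.Current.Split(delimiter))
            {
                dt.Columns.Add(name.Trim());
            }

            while (e.MoveNext())
            {
                // Skip blank lines instead of adding empty rows.
                if (e.Current.Length == 0)
                {
                    continue;
                }
                dt.Rows.Add(e.Current.Split(delimiter));
            }
        }
        return dt;
    }
}
```

A call such as DelimitedText.ToDataTable(File.ReadLines(filePath), ';') keeps the streaming behaviour of File.ReadLines while opening the file only once.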
Second solution using CsvHelper package
First, install this nuget package
PM> Install-Package CsvHelper
For a given CSV, we should create a class to represent it
CSV File
Name;Age;Birthdate;Working
Alberto Monteiro;25;01/01/1990;true
Other Person;5;01/01/2010;false
The class model is
public class Person
{
public string Name { get; set; }
public int Age { get; set; }
public DateTime Birthdate { get; set; }
public bool Working { get; set; }
}
Now lets use CsvReader to build the DataTable
public DataTable readCSV(string filePath)
{
var dt = new DataTable();
var csv = new CsvReader(new StreamReader(filePath));
// Creating the columns
typeof(Person).GetProperties().Select(p => p.Name).ToList().ForEach(x => dt.Columns.Add(x));
// Adding the rows
csv.GetRecords<Person>().ToList().ForEach(line => dt.Rows.Add(line.Name, line.Age, line.Birthdate, line.Working));
return dt;
}
To create the columns in the DataTable we use a bit of reflection, and then use the GetRecords method to add the rows to the DataTable.
I would suggest the following. It should have the advantage at least that ';' in a field will be correctly handled, and it is not constrained to a particular csv format.
using Microsoft.VisualBasic.FileIO;
public class CsvImport
{
public static DataTable NewDataTable(string fileName, string delimiters, bool firstRowContainsFieldNames = true)
{
DataTable result = new DataTable();
using (TextFieldParser tfp = new TextFieldParser(fileName))
{
tfp.SetDelimiters(delimiters);
// Get Some Column Names
if (!tfp.EndOfData)
{
string[] fields = tfp.ReadFields();
for (int i = 0; i < fields.Count(); i++)
{
if (firstRowContainsFieldNames)
result.Columns.Add(fields[i]);
else
result.Columns.Add("Col" + i);
}
// If first line is data then add it
if (!firstRowContainsFieldNames)
result.Rows.Add(fields);
}
// Get Remaining Rows
while (!tfp.EndOfData)
result.Rows.Add(tfp.ReadFields());
}
return result;
}
}
CsvHelper's author has since built this functionality into the library.
The code becomes simply:
using (var reader = new StreamReader("path\\to\\file.csv"))
using (var csv = new CsvReader(reader, CultureInfo.CurrentCulture))
{
// Do any configuration to `CsvReader` before creating CsvDataReader.
using (var dr = new CsvDataReader(csv))
{
var dt = new DataTable();
dt.Load(dr);
}
}
CultureInfo.CurrentCulture is used to determine the default delimiter, and is needed if you want to read CSV files saved by Excel.
I had the same problem, but I found a way to adapt @Alberto Monteiro's answer to my own needs.
My CSV file does not have a header row as its first line (I deliberately left it out), so this is a sample of the file:
1,john doe,j.doe,john.doe@company.net
2,jane doe,j.doe,jane.doe@company.net
So you get the idea, right?
Now I am going to add the columns manually to the DataTable, and I am going to use a Task to do it asynchronously, simply using a foreach loop to add the values to DataTable.Rows in the following function:
public Task<DataTable> ImportFromCSVFileAsync(string filePath)
{
return Task.Run(() =>
{
DataTable dt = new DataTable();
dt.Columns.Add("Index");
dt.Columns.Add("Full Name");
dt.Columns.Add("User Name");
dt.Columns.Add("Email Address");
// splitting the values using Split() command
foreach(var srLine in File.ReadAllLines(filePath))
{
dt.Rows.Add(srLine.Split(','));
}
return dt;
});
}
Now, to call the function, I simply use a button's Click handler:
private async void ImportToGrid_STRBTN_Click(object sender, EventArgs e)
{
// Handling UI objects
// Best idea for me was to put everything a Panel and Disable it while waiting
// and after the job is done Enabling it
// and using a toolstrip docked to bottom outside of the panel to show progress using a
// progressBar and setting its style to Marquee
panel1.Enabled = false;
progressbar1.Visible = true;
try
{
DataTable dt = await ImportFromCSVFileAsync(@"c:\myfile.txt");
if (dt.Rows.Count > 0)
{
Datagridview1.DataSource = null; // To clear the previous data before adding the new ones
Datagridview1.DataSource = dt;
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message, "Error");
}
progressbar1.Visible = false;
panel1.Enabled = true;
}

Regular Expression with Lambda Expression

I've got several text files which should be tab delimited, but actually are delimited by an arbitrary number of spaces. I want to parse the rows from the text file into a DataTable (the first row of the text file has headers for property names). This got me thinking about building an extensible, easy way to parse text files. Here's my current working solution:
string filePath = @"C:\path\lowbirthweight.txt";
//regex to remove multiple spaces
Regex regex = new Regex(@"[ ]{2,}", RegexOptions.Compiled);
DataTable table = new DataTable();
var reader = ReadTextFile(filePath);
//headers in first row
var headers = reader.First();
//skip headers for data
var data = reader.Skip(1).ToArray();
//remove arbitrary spacing between column headers and table data
headers = regex.Replace(headers, @" ");
for (int i = 0; i < data.Length; i++)
{
data[i] = regex.Replace(data[i], @" ");
}
//make ready the DataTable, split resultant space-delimited string into array for column names
foreach (string columnName in headers.Split(' '))
{
table.Columns.Add(new DataColumn() { ColumnName = columnName });
}
foreach (var record in data)
{
//split into array for row values
table.Rows.Add(record.Split(' '));
}
//test prints correctly to the console
Console.WriteLine(table.Rows[0][2]);
}
static IEnumerable<string> ReadTextFile(string fileName)
{
using (var reader = new StreamReader(fileName))
{
while (!reader.EndOfStream)
{
yield return reader.ReadLine();
}
}
}
In my project I've already received several large (gig +) text files that are not in the format in which they are purported to be. So I can see having to write methods such as these with some regularity, albeit with a different regular expression. Is there a way to do something like
data =data.SmartRegex(x => x.AllowOneSpace) where I can use a regular expression to iterate over the collection of strings?
Is something like the following on the right track?
public static class SmartRegex
{
public static Expression AllowOneSpace(this List<string> data)
{
//no idea how to return an expression from a method
}
}
I'm not too overly concerned with performance, just would like to see how something like this works
You should consult with your data source and find out why your data is bad.
As for the API design that you are trying to implement:
public class RegexCollection
{
private readonly Regex _allowOneSpace = new Regex(" ");
public Regex AllowOneSpace { get { return _allowOneSpace; } }
}
public static class RegexExtensions
{
public static IEnumerable<string[]> SmartRegex(
this IEnumerable<string> collection,
Func<RegexCollection, Regex> selector
)
{
var regexCollection = new RegexCollection();
var regex = selector(regexCollection);
return collection.Select(l => regex.Split(l));
}
}
Usage:
var items = new List<string> { "Hello world", "Goodbye world" };
var results = items.SmartRegex(x => x.AllowOneSpace);
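For the specific case in the question, where columns are separated by an arbitrary number of spaces, a pattern that matches a whole run of whitespace avoids the empty entries that splitting on a single space produces. This is a variation sketched under that assumption, not the API from the answer above; the names WhitespaceSplitExtensions and SplitOnWhitespaceRuns are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WhitespaceSplitExtensions
{
    // Matches one or more consecutive whitespace characters (spaces or tabs).
    private static readonly Regex Runs = new Regex(@"\s+", RegexOptions.Compiled);

    // Splits each line into fields, treating any run of whitespace as one delimiter.
    // Trim() prevents empty leading or trailing fields.
    public static IEnumerable<string[]> SplitOnWhitespaceRuns(this IEnumerable<string> lines)
    {
        return lines.Select(l => Runs.Split(l.Trim()));
    }
}
```

Each resulting string[] can be passed straight to table.Rows.Add, which replaces both the Replace loop and the Split(' ') calls in the question. Note that this only works when the field values themselves contain no spaces.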
