In my C# WinForms program I want to make an export function that will create a comma-separated text file (CSV). I am not sure about the best way to structure this. My exported file will look like this:
Family Name, First Name, Sex, Age
Dekker, Sean, Male, 23
Doe, John, Male, 40
So I want the first line to be the column names, and the rest to be treated as values. Is this format OK for later usage, or should I not include the column names?
It would be nice to hear your experiences about this!
Sean,
sorry, I don't have enough privilege points to comment directly on your post. I think you may be confusing CSV and Excel files here. A CSV is simply a text file where each value is separated by a comma; there is no special formatting etc. Excel will display CSV files since it knows how to open them, but you can just as easily open them in Notepad.
Excel .xlsx files are different and can contain all sorts of different formats, charts etc. To work with these files it's important to understand that .xlsx files are essentially zip archives. So the first place to start is to create an Excel file with some data, save it, and then rename the extension to .zip
Open the zip file you have just created and you will see a number of different folders and files; of these the most important for your purposes is the xl directory. In this folder you will see a sharedStrings.xml file and a worksheets folder.
Let's start by going into the worksheets folder and opening sheet1.xml. Look for the <sheetData> element: each row of the sheet appears inside it as a <row> made up of <c> (cell) elements.
If there is text in a cell, i.e. data that Excel should read as text, then you will have something like <c r="A1" t="s"><v>0</v></c>. This indicates that cell A1 is of type string (t="s") and that its value is the entry at index 0 in the sharedStrings.xml file.
If there is a number in the cell then you may have something like <c r="A1"><v>234</v></c>. In this case Excel knows to use the value 234 directly in that cell.
So in your case you will need to do the following:
1: Create the Excel document in C# - there are a number of libraries available for this
2: Open the Excel file as a zip
3: Modify the styles and worksheet XML files as needed
4: Save the document
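If you want to poke around in those XML parts programmatically rather than renaming the file by hand, something along the lines of the sketch below should work. It is only an illustration; the file name report.xlsx is an assumption, and System.IO.Compression does the unzipping.
using System;
using System.IO;
using System.IO.Compression;
class XlsxPeek
{
    static void Main()
    {
        // Open the workbook as a zip archive (needs a reference to System.IO.Compression.FileSystem)
        using (ZipArchive archive = ZipFile.OpenRead("report.xlsx"))
        {
            foreach (var name in new[] { "xl/worksheets/sheet1.xml", "xl/sharedStrings.xml" })
            {
                ZipArchiveEntry entry = archive.GetEntry(name);
                if (entry == null)
                    continue; // sharedStrings.xml is absent when the sheet contains no text cells
                using (var reader = new StreamReader(entry.Open()))
                {
                    Console.WriteLine("--- " + name + " ---");
                    Console.WriteLine(reader.ReadToEnd());
                }
            }
        }
    }
}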
That is absolutely fine to do (to state the obvious...). Excel has a little checkbox that allows the user importing the file to treat the first line as column headers if they select it.
I would also suggest that you leave out the space at the start of each piece of data; it isn't necessary.
In general it's best practice to include the column headers. The only reason not to would be if your data is accessed by an external program over which you have no control, which doesn't realise that the first row contains the column headers and which can't be changed.
To create the export function something like this should work:
using System;
using System.Collections.Generic;
using System.IO;

class Program
{
    private static List<Person> people = new List<Person>();

    static void Main(string[] args)
    {
        // add some people
        people.Add(
            new Person() { firstName = "John", familyName = "Smith", sex = Sex.Male, age = 12 }
        );
        people.Add(
            new Person() { firstName = "Mary", familyName = "Doe", sex = Sex.Female, age = 25 }
        );

        // write the data
        Write();
    }

    static void Write()
    {
        // false = overwrite the file rather than append to it
        using (TextWriter tw = new StreamWriter(@"c:\junk1\test.csv", false))
        {
            // write the header
            tw.WriteLine("Family Name, First Name, Sex, Age");

            // write the details
            foreach (Person person in people)
            {
                tw.WriteLine(String.Format("{0}, {1}, {2}, {3}", person.familyName, person.firstName, person.sex.ToString(), person.age.ToString()));
            }
        }
    }
}
/// <summary>
/// Applicable sexes
/// </summary>
public enum Sex
{
Male,
Female
}
/// <summary>
/// holds details about a person
/// </summary>
public class Person
{
public string familyName { get; set; }
public string firstName { get; set; }
public Sex sex { get; set; }
public int age { get; set; }
}
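One thing the simple Write() above does not handle is values that themselves contain commas or quotes, which would break the column layout. A small helper along these lines - a sketch, not a full RFC 4180 implementation - can be used to quote such values:
// Hypothetical helper, not part of the answer above: wraps a value in quotes when it
// contains a comma, a quote or a line break, doubling any embedded quotes.
static string CsvEscape(string value)
{
    if (value == null) return "";
    bool needsQuoting = value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
    return needsQuoting ? "\"" + value.Replace("\"", "\"\"") + "\"" : value;
}
// usage inside the foreach loop in Write():
// tw.WriteLine(string.Join(",", CsvEscape(person.familyName), CsvEscape(person.firstName), person.sex, person.age));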
You can use a DataSet to do this.
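That answer gives no code, but a minimal sketch of the DataTable route might look like the following; the table dt and the output path are assumptions, and values containing commas would still need quoting:
using System.Data;
using System.IO;
using System.Linq;

static void WriteTableToCsv(DataTable dt, string path)
{
    using (var writer = new StreamWriter(path))
    {
        // header row built from the column names
        writer.WriteLine(string.Join(",", dt.Columns.Cast<DataColumn>().Select(c => c.ColumnName)));

        // one line per data row
        foreach (DataRow row in dt.Rows)
        {
            writer.WriteLine(string.Join(",", row.ItemArray));
        }
    }
}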
// Why not save the lines to a List<string> object? First add your headers with
// string.Join(",", yourHeaderArray) - don't append "," yourself, the Join method
// handles the separators for you.
// Here is an example, assuming you retrieve the header values from a database
// using a SqlDataReader:
var reader = sqlcmdSomeQueryCommand.ExecuteReader();
var columns = new List<string>();

// get all the field names from the reader
for (int intCounter = 0; intCounter < reader.FieldCount; intCounter++)
{
    columns.Add(reader.GetName(intCounter));
}

string[] strarryTmpString = columns.ToArray();
string TmpFields = string.Join(", ", strarryTmpString);
columns.Clear();
columns.Add(TmpFields);

// you can keep adding the rest of your comma-delimited rows to the list, then
// write them line by line in a foreach loop or use List<string>.ForEach:
columns.ForEach(delegate(string delString)
{
    someStreamWriterObject.WriteLine(delString);
});
While reading a CSV file, how can I configure CsvHelper to enforce that each row has no extra columns that are not found in the header? I cannot find any obvious property under CsvConfiguration nor under CsvHelper.Configuration.Attributes.
Context: In our CSV file format, the last column is a string description, which our users (editing the files in plain-text editors) sometimes forget to quote when the description contains commas. Such "raw" commas cause that row to have extra columns, and the description read into the software is cut off at the first raw comma. I want to detect this and throw an exception that suggests to the user that they may have forgotten to quote the description cell.
It looks like CsvConfiguration.DetectColumnCountChanges might be related, but presently the 29.0.0 library lacks any Intellisense description of CsvConfiguration properties, so I have no idea how to use this.
Similar information for other CSV libraries:
With LINQtoCSV this was done by setting IgnoreUnknownColumns = false in CsvFileDescription.
Can Lumenworks CSV parser error when there are too many columns in a row?
You were on the right track with CsvConfiguration.DetectColumnCountChanges.
void Main()
{
    var config = new CsvConfiguration(CultureInfo.InvariantCulture)
    {
        DetectColumnCountChanges = true
    };

    using (var reader = new StringReader("Id,Name\n1,MyName\n2,YourName,ExtraColumn"))
    using (var csv = new CsvReader(reader, config))
    {
        try
        {
            var records = csv.GetRecords<Foo>().ToList();
        }
        catch (BadDataException ex)
        {
            if (ex.Message.StartsWith("An inconsistent number of columns has been detected."))
            {
                Console.WriteLine("There is an issue with an inconsistent number of columns on row {0}", ex.Context.Parser.RawRow);
                Console.WriteLine("Row data: \"{0}\"", ex.Context.Parser.RawRecord);
                Console.WriteLine("Please check for commas in a field that were not properly quoted.");
            }
        }
    }
}

public class Foo
{
    public int Id { get; set; }
    public string Name { get; set; }
}
I am trying to send the contents of a DataTable to a CSV file, with headers. There is a duplicate question, but the accepted answers only seem to work halfway. At this point I have mixed and matched the upvoted answers with no luck and need a point in the right direction.
I can write the columns to the file just fine, and I can write data just fine but not together. Also the data never comes out quoted, only comma delimited without quotes.
//This is how the FileHelpers class is built
[DelimitedRecord(",")] // record attribute required by FileHelpers
public class ScannedFileInfo
{
//this prefix will handle the double quotes issue
//delimiters must reside between double quotes
//Must specify FieldOrder too
[FieldQuoted('"', QuoteMode.AlwaysQuoted)]
[FieldOrder(1)]
public string DA_Id;
[FieldQuoted('"', QuoteMode.AlwaysQuoted)]
[FieldOrder(2)]
public string Name;
[FieldQuoted('"', QuoteMode.AlwaysQuoted)]
[FieldOrder(3)]
public string Extension;
[FieldQuoted('"', QuoteMode.AlwaysQuoted)]
[FieldOrder(4)]
public string Fullname;
[FieldQuoted('"', QuoteMode.AlwaysQuoted)]
[FieldOrder(5)]
public string PathLength;
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
[FieldOrder(6)]
public string Directory;
}
//this is how I send it to the file
public static void ImportDirectory(string result, SqlConnection myconn, SqlConnection destConn ,ListBox lbInfo, DataGridView dgView)
{
//create data table code here - works fine...
MessageBox.Show("Scan complete. Starting Import...");
//build file
var engine = new FileHelperEngine<ScannedFileInfo>();
var orders = new List<ScannedFileInfo>();
engine.HeaderText = engine.GetFileHeader();
engine.WriteFile(@"C:\DirectoryScan.csv", orders);
MessageBox.Show("You now have proper labeled columns in your file.");
//now the data import is successful but it overwrites the column info
CommonEngine.DataTableToCsv(dt, @"C:\DirectoryScan.csv", ',');
MessageBox.Show("You now have data in your file, but no columns");
}
The CommonEngine.DataTableToCsv() call does not support all of the features of FileHelperEngine<T>; in particular, the field quoting and the column headers are missing. CommonEngine is not as full-featured as FileHelperEngine<T>.
Instead of a DataTable, you should create a list of ScannedFileInfo and use FileHelperEngine<ScannedFileInfo>.WriteFile().
//create list of ScannedFileInfo here
var records = new List<ScannedFileInfo>();
records.Add(new ScannedFileInfo() { Name = ..., etc... });
// (or, if you prefer, loop through your datatable to build the list - see the sketch after this snippet)
MessageBox.Show("Scan complete. Starting Import...");
// build file
var engine = new FileHelperEngine<ScannedFileInfo>();
engine.HeaderText = engine.GetFileHeader();
engine.WriteFile(@"C:\DirectoryScan.csv", records);
MessageBox.Show("You now have data and properly labelled columns in your file.");
I cannot recommend CsvHelper enough. This NuGet package has saved me from pulling my hair out on more occasions than I can count.
If you decide to use it, you can use the configuration settings to quickly set a header record, the delimiter, whether to ignore quotes, etc.
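For completeness, a minimal sketch of writing a list of records with CsvHelper might look like this; the record type and output path are assumptions, and CsvWriter writes the header row from the property names by default:
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;
using CsvHelper.Configuration;

public class ScannedFile
{
    public string Name { get; set; }
    public string Extension { get; set; }
}

public static class CsvExport
{
    public static void Write(IEnumerable<ScannedFile> records, string path)
    {
        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            Delimiter = ",",           // the default, shown here for clarity
            HasHeaderRecord = true     // emit the property names as the first row
        };

        using (var writer = new StreamWriter(path))
        using (var csv = new CsvWriter(writer, config))
        {
            csv.WriteRecords(records);
        }
    }
}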
I have an Excel sheet in .xls format. I am using LinqToExcel to read it and then import it into my DB.
The sheet consists of about 3K rows and only 6 columns. I am using .AddMapping to map my class properties to the column names.
The problem I have is: the cells of the "web-code" column SOMETIMES come back as null although there is data in the cells.
Here is a sample of the data that comes back as null, and a sample where the data comes back correctly (watch-window screenshots in the original post).
I have tried applying ExcelColumn attribute for mapping, but no luck!
code:
var factory = new ExcelQueryFactory(_excelFilePath);
factory.AddMapping<ExcelPriceEntity>(x => x.WebCode, "WEB-CODE");
factory.AddMapping<ExcelPriceEntity>(x => x.Type, "TYPE");
factory.AddMapping<ExcelPriceEntity>(x => x.Style, "STYLE");
factory.AddMapping<ExcelPriceEntity>(x => x.Qty, "QTY");
factory.AddMapping<ExcelPriceEntity>(x => x.UnitPrice, "Unit Price");
factory.AddMapping<ExcelPriceEntity>(x => x.Bucket, "WEBCODE W/BUCKET");
factory.StrictMapping = StrictMappingType.ClassStrict;
factory.TrimSpaces = TrimSpacesType.Both;
factory.ReadOnly = true;
var prices = factory.Worksheet<ExcelPriceEntity>(_allPricesSheetName).ToList();
var priccerNP = prices.Where(p => p.Type.Contains("900 ARROW TAPE")).ToList();
My PriceEntity Class:
public class ExcelPriceEntity
{
//[ExcelColumn("TYPE")]
public string Type { get; set; }
public string WebCode { get; set; }
//[ExcelColumn("STYLE")]
public string Style { get; set; }
//[ExcelColumn("QTY")]
public string Qty { get; set; }
//[ExcelColumn("Unit Price")]
public string UnitPrice { get; set; }
//[ExcelColumn("WEBCODE W/BUCKET")]
public string Bucket { get; set; }
}
Alternate Solution:
I ended up saving the Excel sheet as a CSV file and importing it into a SQL table. Then I used LINQ to SQL to read the data.
Root Cause:
After researching, I found out the problem was that the first cell of this column (web-code) was an integer, and Excel tries to figure out the datatype of a column by looking at the first rows!
The following rows of the web-code column contained text data, so Excel couldn't parse them as integers and assigned them a null value!
What I could have done is put a text value in the first cell so Excel would guess the data type as string, but I didn't test that. For anyone reading this answer: try having a text value in your first row if you come across the same problem.
Here, Contains is not like string.Contains; it compares a list of cell values to the exact value you give inside the Contains method. Just try with the full text "900 AMMONIA STOCK OR CUST...)
Another alternative to @alsafoo's solution is to convert the column from "general" to "text".
These are the steps:
1. Right-click on any cell in the column.
2. Select Format Cells.
3. In the Number tab, select Text.
4. Press OK.
After that, the library will read all values as strings.
Is there a built-in field attribute in the FileHelper library which will add a header row in the final generated CSV?
I have Googled and didn't find much info on it. Currently I have this:
DelimitedFileEngine _engine = new DelimitedFileEngine(typeof(T));
_engine.WriteStream
(HttpContext.Current.Response.Output, dataSource, int.MaxValue);
It works, but without a header.
I'm thinking of having an attribute like FieldTitleAttribute and using this as a column header.
So, my question is at which point do I check the attribute and insert header columns? Has anyone done something similar before?
I would like to get the headers inserted and use custom text different from the actual field name just by having an attribute on each member of the object:
[FieldTitleAttribute("Custom Title")]
private string Name
and maybe an option to tell the engine to insert the header when it's generated.
So when WriteStream or WriteString is called, the header row will be inserted with custom titles.
I have found a couple of events on DelimitedFileEngine, but I'm not sure of the best way to detect whether the current record is the first row, or how to insert a row before it.
I know this is an old question, but here is an answer that works for v2.9.9
FileHelperEngine<Person> engine = new FileHelperEngine<Person>();
engine.HeaderText = engine.GetFileHeader();
Here's some code that'll do it: https://gist.github.com/1391429
To use it, you must decorate your fields with [FieldOrder] (a good FileHelpers practice anyway). Usage:
[DelimitedRecord(","), IgnoreFirst(1)]
public class Person
{
// Must specify FieldOrder too
[FieldOrder(1), FieldTitle("Name")]
string name;
[FieldOrder(2), FieldTitle("Age")]
int age;
}
...
var engine = new FileHelperEngine<Person>
{
HeaderText = typeof(Person).GetCsvHeader()
};
...
engine.WriteFile(@"C:\people.csv", people);
But support for this really needs to be added within FileHelpers itself. I can think of a few design questions off the top of my head that would need answering before it could be implemented:
What happens when reading a file? Afaik FileHelpers is currently all based on ordinal column position and ignores column names... but if we now have [FieldHeader] attributes everywhere then should we also try matching properties with column names in the file? Should you throw an exception if they don't match? What happens if the ordinal position doesn't agree with the column name?
When reading as a data table, should you use A) the field name (current design), or B) the source file column name, or C) the FieldTitle attribute?
I don't know if you still need this, but here is how FileHelpers works:
To include headers of columns, you need to define a string with headers delimited the same way as your file.
For example with '|' as delimiter :
public const string HeaderLine = #"COLUMN1|COLUMN2|COLUMN3|...";
Then, when creating your engine:
var _engine = new DelimitedFileEngine<T> { HeaderText = HeaderLine };
If you don't want to write the headers, just don't set the HeaderText property on the engine.
List<MyClass> myList = new List<MyClass>();
FileHelperEngine engine = new FileHelperEngine(typeof(MyClass));
String[] fieldNames = Array.ConvertAll<FieldInfo, String>(typeof(MyClass).GetFields(), delegate(FieldInfo fo) { return fo.Name; });
engine.HeaderText = String.Join(";", fieldNames);
engine.WriteFile(MapPath("MyClass.csv"), myList);
Just to include a more complete example, which would have saved me some time, for version 3.4.1 of the FileHelpers NuGet package....
Given
[DelimitedRecord(",")]
public class Person
{
[FieldCaption("First")]
public string FirstName { get; set; }
[FieldCaption("Last")]
public string LastName { get; set; }
public int Age { get; set; }
}
and this code to create it
static void Main(string[] args)
{
var people = new List<Person>();
people.Add(new Person() { FirstName = "James", LastName = "Bond", Age = 38 });
people.Add(new Person() { FirstName = "George", LastName = "Washington", Age = 43 });
people.Add(new Person() { FirstName = "Robert", LastName = "Redford", Age = 28 });
CreatePeopleFile(people);
}
private static void CreatePeopleFile(List<Person> people)
{
var engine = new FileHelperEngine<Person>();
using (var fs = File.Create(@"c:\temp\people.csv"))
using (var sw = new StreamWriter(fs))
{
engine.HeaderText = engine.GetFileHeader();
engine.WriteStream(sw, people);
sw.Flush();
}
}
You get this
First,Last,Age
James,Bond,38
George,Washington,43
Robert,Redford,28
I found that you can use the FileHelperAsyncEngine to accomplish this. Assuming your data is a list called "output" of type "outputData", then you can write code that looks like this:
FileHelperAsyncEngine outEngine = new FileHelperAsyncEngine(typeof(outputData));
outEngine.HeaderText = "Header1, Header2, Header3";
outEngine.BeginWriteFile(outputfile);
foreach (outputData line in output){
outEngine.WriteNext(line);
}
outEngine.Close();
You can simply use FileHelpers' GetFileHeader() function from the engine's base class:
var engine = new FileHelperEngine<ExportType>();
engine.HeaderText = engine.GetFileHeader();
engine.WriteFile(exportFile, exportData);
I have a text file that looks like this:
1,Smith, 249.24, 6/10/2010
2,Johnson, 1332.23, 6/11/2010
3,Woods, 2214.22, 6/11/2010
1,Smith, 219.24, 6/11/2010
I need to be able to find the balance for a client on a given date.
I'm wondering if I should:
A. Start from the end and read each line into an Array, one at a time.
Check the last name index to see if it is the client we're looking for.
Then, display the balance index of the first match.
or
B. Use RegEx to find a match and display it.
I don't have much experience with RegEx, but I'll learn it if it's a no brainer in a situation like this.
I would recommend using the FileHelpers opensource project:
http://www.filehelpers.net/
Piece of cake:
Define your class:
[DelimitedRecord(",")]
public class Customer
{
public int CustId;
public string Name;
public decimal Balance;
[FieldConverter(ConverterKind.Date, "M/d/yyyy")] // format matching the dates in the sample file
public DateTime AddedDate;
}
Use it:
var engine = new FileHelperAsyncEngine<Customer>();
// Read
using(engine.BeginReadFile("TestIn.txt"))
{
// The engine is IEnumerable
foreach(Customer cust in engine)
{
// your code here
Console.WriteLine(cust.Name);
// your condition >> add balance
}
}
This looks like a pretty standard CSV type layout, which is easy enough to process. You can actually do it with ADO.Net and the Jet provider, but I think it is probably easier in the long run to process it yourself.
So first off, you want to process the actual text data. It is reasonable to assume each record is separated by a newline character, so you can use the ReadLine method to easily get each record:
using (StreamReader reader = new StreamReader(@"C:\Path\To\file.txt"))
{
    while (true)
    {
        var line = reader.ReadLine();
        if (string.IsNullOrEmpty(line))
            break;

        // Process Line
    }
}
And then to process each line, you can split the string on comma, and store the values into a data structure. So if you use a data structure like this:
public class MyData
{
public int Id { get; set; }
public string Name { get; set; }
public decimal Balance { get; set; }
public DateTime Date { get; set; }
}
And you can process the line data with a method like this:
public MyData GetRecord(string line)
{
var fields = line.Split(',');
return new MyData()
{
Id = int.Parse(fields[0]),
Name = fields[1],
Balance = decimal.Parse(fields[2]),
Date = DateTime.Parse(fields[3])
};
}
Now, this is the simplest example and doesn't account for cases where a field may be empty. In that case you would either need to support null for those fields (using the nullable types int?, decimal? and DateTime?) or define some default value to assign to them.
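A hedged sketch of what a null-tolerant version of GetRecord could look like, assuming MyData were changed so that Balance and Date are nullable:
// Variant of GetRecord that tolerates empty fields by using nullable types.
// Assumes Balance is declared as decimal? and Date as DateTime? on MyData.
public MyData GetRecordTolerant(string line)
{
    var fields = line.Split(',');
    return new MyData()
    {
        Id = int.Parse(fields[0]),
        Name = fields[1].Trim(),
        Balance = decimal.TryParse(fields[2], out var balance) ? balance : (decimal?)null,
        Date = DateTime.TryParse(fields[3], out var date) ? date : (DateTime?)null
    };
}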
So once you have that you can store the collection of MyData objects in a list, and easily perform calculations based on that. So given your example of finding the balance on a given date you could do something like:
var data = customerDataList.First(d => d.Name == customerNameImLookingFor
&& d.Date == dateImLookingFor);
Where customerDataList is the collection of MyData objects read from the file, customerNameImLookingFor is a variable containing the customer's name, and dateImLookingFor is a variable containing the date.
I've used this technique to process data in text files in the past for files ranging from a couple records, to tens of thousands of records, and it works pretty well.
I think the cleanest way is to load the entire file into an array of custom objects and work with that. For 3 MB of data, this won't be a problem. If you wanted to do a completely different search later, you could reuse most of the code. I would do it this way:
class Record
{
public int Id { get; protected set; }
public string Name { get; protected set; }
public decimal Balance { get; protected set; }
public DateTime Date { get; protected set; }
public Record (int id, string name, decimal balance, DateTime date)
{
Id = id;
Name = name;
Balance = balance;
Date = date;
}
}
…
Record[] records = (from line in File.ReadAllLines(filename)
                    let fields = line.Split(',')
                    select new Record(
                        int.Parse(fields[0]),
                        fields[1],
                        decimal.Parse(fields[2]),
                        DateTime.Parse(fields[3])
                    )).ToArray();
Record wantedRecord = records.Single
    (r => r.Name == clientName && r.Date == givenDate);
Note that both of your options will scan the file. That is fine if you only want to search the file for one item.
If you need to search for multiple client/date combinations in the same file, you could parse the file into a Dictionary<string, Dictionary<DateTime, decimal>> first.
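A rough sketch of that pre-parsing step, under the assumption that each (name, date) pair appears at most once in the file:
// Build a name -> (date -> balance) index once, then answer many lookups cheaply.
// filename = path to the text file.
var balances = new Dictionary<string, Dictionary<DateTime, decimal>>();

foreach (var line in File.ReadLines(filename))
{
    var fields = line.Split(',');
    string name = fields[1].Trim();
    DateTime date = DateTime.Parse(fields[3].Trim());
    decimal balance = decimal.Parse(fields[2].Trim());

    if (!balances.TryGetValue(name, out var byDate))
    {
        byDate = new Dictionary<DateTime, decimal>();
        balances[name] = byDate;
    }
    byDate[date] = balance;
}

// lookup:
// decimal b = balances["Smith"][new DateTime(2010, 6, 11)];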
A direct answer: for a one-off, a RegEx will probably be faster.
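If you do go the regex route for a one-off lookup, a minimal sketch could look like this; the pattern assumes the simple id, name, balance, date layout shown in the question with no quoted fields, and the file path is an assumption:
using System;
using System.IO;
using System.Text.RegularExpressions;

class RegexLookup
{
    static void Main()
    {
        string clientName = "Smith";
        string date = "6/11/2010";

        // Capture the balance column of lines shaped like "<id>,<name>, <balance>, <date>".
        var pattern = new Regex(
            @"^\d+,\s*" + Regex.Escape(clientName) + @",\s*(?<balance>[\d.]+),\s*" + Regex.Escape(date) + @"\s*$",
            RegexOptions.Multiline);

        string text = File.ReadAllText(@"C:\Path\To\file.txt");
        Match match = pattern.Match(text);
        if (match.Success)
            Console.WriteLine("Balance: " + match.Groups["balance"].Value);
    }
}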
If you're just reading it, I'd consider reading the whole file into memory using StreamReader.ReadToEnd and then treating it as one long string to search through. When you find a record you want to look at, just look for the previous and next line break and you have the transaction row you want.
If it's on a server or the file can be refreshed all the time this might not be a good solution though.
If it's all well-formatted CSV like this then I'd use something like the Microsoft.VisualBasic.FileIO.TextFieldParser class or the Fast CSV class over on CodeProject to read it all in.
The data type is a little tricky because I imagine not every client has a record for every day. That means you can't just look up an exact date in a plain nested dictionary. Instead, you want to "index" by name first and then by date, keeping the dates in order so you can find the nearest record. I think I'd go for something like this as I read in each record:
Dictionary<string, SortedList<DateTime, double>>
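A hedged sketch of how that structure could be filled and queried follows; the on-or-before lookup is the reason a SortedList is handy here, and the file path is an assumption:
// name -> (date -> balance), with dates kept sorted so we can find the most recent
// balance on or before a requested date even when there is no record for that exact day.
var index = new Dictionary<string, SortedList<DateTime, double>>();

foreach (var line in File.ReadLines(@"C:\Path\To\file.txt"))
{
    var fields = line.Split(',');
    string name = fields[1].Trim();
    if (!index.TryGetValue(name, out var byDate))
        index[name] = byDate = new SortedList<DateTime, double>();
    byDate[DateTime.Parse(fields[3].Trim())] = double.Parse(fields[2].Trim());
}

// Most recent balance for "Smith" on or before 11 June 2010:
DateTime asOf = new DateTime(2010, 6, 11);
if (index.TryGetValue("Smith", out var smith))
{
    double? balance = null;
    foreach (var entry in smith)        // keys ascend, so the last match wins
    {
        if (entry.Key > asOf) break;
        balance = entry.Value;
    }
    Console.WriteLine(balance);
}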
Hey, hey, hey!!! Why not do it with this great project on CodeProject, LINQ to CSV? Way cool!
Rock solid.