FileHelpers: reading a CSV with FieldQuoted when a multi-line field contains a " character - C#

I am having trouble reading a CSV file with the FileHelpers library:
If the first line contains a " character and the second line also contains a " character, the second line ends up as the last column of the first record after reading.
If every record is on a single line and one of those lines contains a " character, that line is ignored.
I really need help!
Here is my class:
[DelimitedRecord(",")]
[IgnoreEmptyLines()]
[IgnoreFirst()]
public sealed class MyClass
{
[FieldQuoted('"', QuoteMode.OptionalForBoth, MultilineMode.AllowForBoth)]
[FieldTrim(TrimMode.Both)]
public String NAME;
[FieldQuoted('"', QuoteMode.OptionalForBoth, MultilineMode.AllowForBoth)]
[FieldTrim(TrimMode.Both)]
public String NOTES;
}
And my read file code:
OpenFileDialog ofd = new OpenFileDialog
{
Filter = "CSV files (*.csv)|*.csv",
FilterIndex = 0,
CheckFileExists = true,
RestoreDirectory = true
};
if (ofd.ShowDialog(this) == DialogResult.OK)
{
if (AppSetting.IsFileLocked(ofd.FileName))
{
// file is in use
MessageUtility.ShowNotify(LanguagesMessage.GetLanguagesMessage("USING"));
return;
}
else
{
FileInfo f = new FileInfo(ofd.FileName);
if (f.Extension != ".csv")
{
MessageUtility.ShowNotify(MsgFormatFile);
return;
}
}
var _curr_encoding = SimpleHelpers.FileEncoding.DetectFileEncoding(ofd.FileName);
if (_curr_encoding == null)
{
MessageUtility.ShowNotify(MsgFormatFile);
return;
}
if (_curr_encoding.CodePage == _encoding_export_import.CodePage)
{
_curr_encoding = _encoding_export_import;
}
else
{
_curr_encoding = Encoding.GetEncoding(_curr_encoding.CodePage);
}
var engine = new FileHelperEngine<MyClass>(_curr_encoding);
engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue;
engine.Encoding = _curr_encoding;
engine.AfterReadRecord += Engine_AfterReadRecord;
List<MyClass> lstDataImports = engine.ReadFile(ofd.FileName).ToList();
if (engine.ErrorManager.ErrorCount > 0)
{
MessageUtility.ShowNotify(MsgFormatFile);
engine.ErrorManager.SaveErrors("Errors.txt");
return;
}
else
{
if (lstDataImports.Count() < 1)
{
MessageUtility.ShowNotify(LanguagesMessage.GetLanguagesMessage_SM("MY_ERROR"));
return;
}
}
if (!ValidateHeader(engine.HeaderText))
{
return;
}
}

This is the expected behavior when MyClass is configured with QuoteMode.OptionalForBoth and MultilineMode.AllowForBoth.
Explanation: QuoteMode.OptionalForBoth allows those fields to be quoted, and MultilineMode.AllowForBoth allows a quoted field to continue on the next line.
In case it is not clear: quoting means putting a quotation mark before and after the field you wish to read. A " that opens a field is therefore expected to have a matching closing ", and with multiline mode enabled the reader keeps looking for it on the following lines.
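To illustrate why an unbalanced quote pulls the next line into the same record, here is a minimal sketch using only the standard library. It is not FileHelpers' actual implementation; the class name QuoteAwareRecords and its quote-counting rule are assumptions made for illustration. A record is treated as still open while it contains an odd number of quote characters (doubled quotes inside a field keep the count even).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuoteAwareRecords
{
    // Groups physical lines into logical records. A record stays open
    // while it contains an odd number of quote characters.
    public static List<string> Split(string csv, char quote = '"')
    {
        var records = new List<string>();
        string pending = null;
        foreach (var line in csv.Replace("\r\n", "\n").Split('\n'))
        {
            // Skip blank lines that are not inside an open quoted field.
            if (pending == null && line.Length == 0) continue;
            pending = pending == null ? line : pending + "\n" + line;
            if (pending.Count(c => c == quote) % 2 == 0)
            {
                records.Add(pending);
                pending = null;
            }
        }
        // An unterminated quote at the end of the input is kept as-is.
        if (pending != null) records.Add(pending);
        return records;
    }
}
```

With two lines that each contain a single ", the second line is absorbed into the first record, which matches the first symptom described in the question.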


Compare CSV Header to Map Class

I have a process whereby we have written a class to import a large (ish) CSV into our app using CsvHelper (https://joshclose.github.io/CsvHelper).
I would like to compare the header to the Map to ensure the header's integrity. We get the CSV file from a 3rd party and I want to ensure it doesn't change over time and thought the best way to do this would be to compare it against the map.
We have a class set up as so (trimmed):
public class VisitExport
{
public int? Count { get; set; }
public string CustomerName { get; set; }
public string CustomerAddress { get; set; }
}
And its corresponding map (also trimmed):
public class VisitMap : ClassMap<VisitExport>
{
public VisitMap()
{
Map(m => m.Count).Name("Count");
Map(m => m.CustomerName).Name("Customer Name");
Map(m => m.CustomerAddress).Name("Customer Address");
}
}
This is the code I have for reading the CSV file, and it works great. I have a try/catch in place for errors, but ideally, if it fails because of a header mismatch, I'd like to handle that case specifically.
private void fileLoadedLink_LinkClicked(object sender, LinkLabelLinkClickedEventArgs e)
{
try
{
var filePath = string.Empty;
data = new List<VisitExport>();
using (OpenFileDialog openFileDialog = new OpenFileDialog())
{
openFileDialog.InitialDirectory = new KnownFolder(KnownFolderType.Downloads).Path;
openFileDialog.Filter = "csv files (*.csv)|*.csv";
openFileDialog.FilterIndex = 2;
openFileDialog.RestoreDirectory = true;
if (openFileDialog.ShowDialog() == DialogResult.OK)
{
filePath = openFileDialog.FileName;
var fileStream = openFileDialog.OpenFile();
var culture = CultureInfo.GetCultureInfo("en-GB");
using (StreamReader reader = new StreamReader(fileStream))
using (var readCsv = new CsvReader(reader, culture))
{
var map = new VisitMap();
readCsv.Context.RegisterClassMap(map);
var fileContent = readCsv.GetRecords<VisitExport>();
data = fileContent.ToList();
fileLoadedLink.Text = filePath;
viewModel.IsFileLoaded = true;
}
}
}
}
catch (CsvHelperException ex)
{
Console.WriteLine(ex.InnerException != null ? ex.InnerException.Message : ex.Message);
fileLoadedLink.Text = "Error loading file.";
viewModel.IsFileLoaded = false;
}
}
Is there a way of comparing the Csv header vs my map?
There are two basic cases for CSV files with headers: missing CSV columns, and extra CSV columns. The first is already detected by CsvHelper while the detection of the second is not implemented out of the box and requires subclassing of CsvReader.
(As CsvHelper maps CSV columns to model properties by name, permuting the order of the columns in the CSV file would not be considered a breaking change.)
Note that this only applies to CSV files that actually contain headers. Since you are not setting CsvConfiguration.HasHeaderRecord = false I assume that this applies to your use case.
Details about each of the two cases follow.
Missing CSV columns.
Currently CsvHelper already throws an exception by default in such situations. When unmapped data model properties are found, CsvConfiguration.HeaderValidated is invoked. By default this is set to ConfigurationFunctions.HeaderValidated whose current behavior is to throw a HeaderValidationException if there are any unmapped model properties. You can replace or extend HeaderValidated with logic of your own if you prefer:
var culture = CultureInfo.GetCultureInfo("en-GB");
var config = new CsvConfiguration (culture)
{
HeaderValidated = (args) =>
{
// Add additional logic as required here
ConfigurationFunctions.HeaderValidated(args);
},
};
using (var readCsv = new CsvReader(reader, config))
{
// Remainder unchanged
Demo fiddle #1 here.
Extra CSV columns.
Currently CsvHelper does not inform the application when this happens. See Throw if csv contains unexpected columns #1032 which confirms that this is not implemented out of the box.
In a GitHub comment, user leopignataro suggests a workaround, which is to subclass CsvReader and add the necessary validation logic oneself. However, the version shown in the comment doesn't seem to handle duplicated column names or embedded references. The following subclass of CsvReader should do this correctly. It is based on the logic in CsvReader.ValidateHeader(ClassMap map, List<InvalidHeader> invalidHeaders). It recursively walks the incoming ClassMap, attempts to find a CSV header corresponding to each member or constructor parameter, and flags the index of each one that is mapped. Afterwards, if there are any unmapped headers, the supplied Action<CsvContext, List<string>> OnUnmappedCsvHeaders is invoked to notify the application of the problem and throw some exception if desired:
public class ValidatingCsvReader : CsvReader
{
public ValidatingCsvReader(TextReader reader, CultureInfo culture, bool leaveOpen = false) : this(new CsvParser(reader, culture, leaveOpen)) { }
public ValidatingCsvReader(TextReader reader, CsvConfiguration configuration) : this(new CsvParser(reader, configuration)) { }
public ValidatingCsvReader(IParser parser) : base(parser) { }
public Action<CsvContext, List<string>> OnUnmappedCsvHeaders { get; set; }
public override void ValidateHeader(Type type)
{
base.ValidateHeader(type);
var headerRecord = HeaderRecord;
var mapped = new BitArray(headerRecord.Length);
var map = Context.Maps[type];
FlagMappedHeaders(map, mapped);
var unmappedHeaders = Enumerable.Range(0, headerRecord.Length).Where(i => !mapped[i]).Select(i => headerRecord[i]).ToList();
if (unmappedHeaders.Count > 0)
{
OnUnmappedCsvHeaders?.Invoke(Context, unmappedHeaders);
}
}
protected virtual void FlagMappedHeaders(ClassMap map, BitArray mapped)
{
// Logic adapted from https://github.com/JoshClose/CsvHelper/blob/0d753ff09294b425e4bc5ab346145702eeeb1b6f/src/CsvHelper/CsvReader.cs#L157
// By https://github.com/JoshClose
foreach (var parameter in map.ParameterMaps)
{
if (parameter.Data.Ignore)
continue;
if (parameter.Data.IsConstantSet)
// If ConvertUsing and Constant don't require a header.
continue;
if (parameter.Data.IsIndexSet && !parameter.Data.IsNameSet)
// If there is only an index set, we don't want to validate the header name.
continue;
if (parameter.ConstructorTypeMap != null)
{
FlagMappedHeaders(parameter.ConstructorTypeMap, mapped);
}
else if (parameter.ReferenceMap != null)
{
FlagMappedHeaders(parameter.ReferenceMap.Data.Mapping, mapped);
}
else
{
var index = GetFieldIndex(parameter.Data.Names.ToArray(), parameter.Data.NameIndex, true);
if (index >= 0)
mapped.Set(index, true);
}
}
foreach (var memberMap in map.MemberMaps)
{
if (memberMap.Data.Ignore || !CanRead(memberMap))
continue;
if (memberMap.Data.ReadingConvertExpression != null || memberMap.Data.IsConstantSet)
// If ConvertUsing and Constant don't require a header.
continue;
if (memberMap.Data.IsIndexSet && !memberMap.Data.IsNameSet)
// If there is only an index set, we don't want to validate the header name.
continue;
var index = GetFieldIndex(memberMap.Data.Names.ToArray(), memberMap.Data.NameIndex, true);
if (index >= 0)
mapped.Set(index, true);
}
foreach (var referenceMap in map.ReferenceMaps)
{
if (!CanRead(referenceMap))
continue;
FlagMappedHeaders(referenceMap.Data.Mapping, mapped);
}
}
}
And then in your code, handle the OnUnmappedCsvHeaders callback however you would like, such as by throwing a CsvHelperException or some other custom exception:
using (var readCsv = new ValidatingCsvReader(reader, culture)
{
OnUnmappedCsvHeaders = (context, headers) => throw new CsvHelperException(context, string.Format("Unmapped CSV headers: \"{0}\"", string.Join(",", headers))),
})
Demo fiddles:
#2 (your model).
#3 (with external references).
#4 (duplicate names).
#5 (using the auto-generated map).
This could use additional testing, e.g. for data models with parameterized constructors and additional, mutable properties.
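The two cases described above (missing columns and extra columns) can also be illustrated without CsvHelper. The following is a minimal sketch, assuming you already have the expected header names and the header names read from the file as plain strings; HeaderDiff and Compare are hypothetical names, not part of CsvHelper, and the comparison is case-sensitive by assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HeaderDiff
{
    // Returns the expected names absent from the file (Missing) and the
    // file's names that nothing maps to (Extra). Column order is ignored.
    public static (List<string> Missing, List<string> Extra) Compare(
        IEnumerable<string> expected, IEnumerable<string> actual)
    {
        var expectedSet = new HashSet<string>(expected, StringComparer.Ordinal);
        var actualSet = new HashSet<string>(actual, StringComparer.Ordinal);
        var missing = expectedSet.Where(h => !actualSet.Contains(h)).ToList();
        var extra = actualSet.Where(h => !expectedSet.Contains(h)).ToList();
        return (missing, extra);
    }
}
```

For example, comparing the expected names "Count", "Customer Name", "Customer Address" against a file header of "Customer Name", "Count", "Region" reports "Customer Address" as missing and "Region" as extra, while the permuted order of the first two columns is not flagged.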
How about catching HeaderValidationException before catching CsvHelperException?
catch (HeaderValidationException ex)
{
var message = ex.Message.Split('\n')[0];
var currentHeader = ex.Context.Reader.HeaderRecord;
message += $"{Environment.NewLine}Header: \"{string.Join(",", currentHeader)}\"";
Console.WriteLine(message);
fileLoadedLink.Text = "Error loading file.";
viewModel.IsFileLoaded = false;
}
catch (CsvHelperException ex)
{
Console.WriteLine(ex.InnerException != null ? ex.InnerException.Message : ex.Message);
fileLoadedLink.Text = "Error loading file.";
viewModel.IsFileLoaded = false;
}

How To Go Back To Previous Line In .csv? [duplicate]

I'm trying to figure out how to either record which line I'm on (for example, line = 32, so that I could just use line-- in the Previous Record button event) or find a better alternative.
My form is currently set up and working: when I click the "Next Record" button, the reader advances to the next line and displays the cells correctly in their associated textboxes. But how do I create a button that goes to the previous line in the .csv file?
StreamReader csvFile;
public GP_Appointment_Manager()
{
InitializeComponent();
}
private void buttonOpenFile_Click(object sender, EventArgs e)
{
try
{
csvFile = new StreamReader("patients_100.csv");
// Read First line and do nothing
string line;
if (ReadPatientLineFromCSV(out line))
{
// Read second line, first patient line and populate form
ReadPatientLineFromCSV(out line);
PopulateForm(line);
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
}
private bool ReadPatientLineFromCSV(out string line)
{
bool result = false;
line = "";
if ((csvFile != null) && (!csvFile.EndOfStream))
{
line = csvFile.ReadLine();
result = true;
}
else
{
MessageBox.Show("File has not been opened. Please open file before reading.");
}
return result;
}
private void PopulateForm(string patientDetails)
{
string[] patient = patientDetails.Split(',');
//Populates ID
textBoxID.Text = patient[0];
//Populates Personal
comboBoxSex.SelectedIndex = (patient[1] == "M") ? 0 : 1;
dateTimePickerDOB.Value = DateTime.Parse(patient[2]);
textBoxFirstName.Text = patient[3];
textBoxLastName.Text = patient[4];
//Populates Address
textboxAddress.Text = patient[5];
textboxCity.Text = patient[6];
textboxCounty.Text = patient[7];
textboxTelephone.Text = patient[8];
//Populates Kin
textboxNextOfKin.Text = patient[9];
textboxKinTelephone.Text = patient[10];
}
Here's the code for the "Next Record" Button
private void buttonNextRecord_Click(object sender, EventArgs e)
{
string patientInfo;
if (ReadPatientLineFromCSV(out patientInfo))
{
PopulateForm(patientInfo);
}
}
Now, this is something of an exercise. This class wraps the standard StreamReader with a couple of modifications to implement simple move-forward/step-back functionality.
It also lets you associate an array/list of Controls with the data read from a CSV-like file format. Note that this is not a general-purpose CSV reader; it just splits a string into parts, using a separator that can be specified when calling its AssociateControls() method.
The class has 3 constructors:
(1) public LineReader(string filePath)
The source file has no header in the first line, and the text Encoding should be auto-detected.
(2) public LineReader(string filePath, bool hasHeader)
Same, but the first line of the file contains the header if hasHeader = true.
(3) public LineReader(string filePath, bool hasHeader, Encoding encoding)
Used to specify an Encoding if the automatic detection cannot identify it correctly.
The positions of the lines of text are stored in a Dictionary<long, long>, where the Key is the line number and Value is the starting position of the line.
This has some advantages: no strings are stored anywhere, the file is indexed while reading it but you could use a background task to complete the indexing (this feature is not implemented here, maybe later...).
The disadvantage is that the Dictionary takes space in memory. If the file is very large (just the number of lines counts, though), it may become a problem. To test.
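The idea of storing only line start positions can be sketched independently of the class below. This is a simplified illustration, not the LineReader implementation: it assumes a seekable stream, UTF-8 (or ASCII) text without a BOM, and indexes the whole stream up front. The name LineOffsetIndex is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public sealed class LineOffsetIndex
{
    private readonly Stream stream;
    // Byte offset at which each line starts; line 0 starts at offset 0.
    private readonly List<long> starts = new List<long> { 0 };

    public LineOffsetIndex(Stream seekableStream)
    {
        stream = seekableStream;
        stream.Position = 0;
        int b;
        // Every '\n' that is not the last byte marks the start of a new line.
        while ((b = stream.ReadByte()) != -1)
            if (b == '\n' && stream.Position < stream.Length) starts.Add(stream.Position);
    }

    public int LineCount => starts.Count;

    // Seeks to the stored offset and decodes just that one line.
    public string ReadLine(int lineNumber)
    {
        long start = starts[lineNumber];
        long end = lineNumber + 1 < starts.Count ? starts[lineNumber + 1] : stream.Length;
        var buffer = new byte[end - start];
        stream.Position = start;
        int read = 0;
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) break;
            read += n;
        }
        return Encoding.UTF8.GetString(buffer, 0, read).TrimEnd('\r', '\n');
    }
}
```

Going to the previous record then amounts to calling ReadLine with a smaller line number; the only state kept per line is one long.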
A note about the Encoding:
The text encoding auto-detection is reliable enough only if the Encoding is not set to the default one (UTF-8). The code here, if you don't specify an Encoding, sets it to Encoding.ASCII. When the first line is read, the automatic feature tries to determine the actual encoding. It usually gets it right.
In the default StreamReader implementation, if we specify Encoding.UTF8 (or none, which is the same) and the text encoding is ASCII, the encoder will use the default (Encoding.UTF8) encoding, since UTF-8 maps to ASCII gracefully.
However, when this is the case, [Encoding].GetPreamble() will return the UTF-8 BOM (3 bytes), compromising the calculation of the current position in the underlying stream.
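The preamble lengths this note relies on can be checked directly. A small sketch (PreambleDemo is a hypothetical helper name):

```csharp
using System;
using System.Text;

public static class PreambleDemo
{
    // Number of BOM bytes the given encoding reports via GetPreamble().
    public static int PreambleLength(Encoding encoding) => encoding.GetPreamble().Length;
}
```

PreambleLength(Encoding.UTF8) is 3 and PreambleLength(Encoding.ASCII) is 0, which is why assuming UTF-8 for a file that has no BOM shifts every computed position by three bytes.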
To associate controls with the data read, you just need to pass a collection of controls to the LineReader.AssociateControls() method.
This will map each control to the data field in the same position.
To skip a data field, specify null instead of a control reference.
The visual example is built using a CSV file with this structure:
(Note: this data is generated using an automated on-line tool)
seq;firstname;lastname;age;street;city;state;zip;deposit;color;date
---------------------------------------------------------------------------
1;Harriett;Gibbs;62;Segmi Center;Ebanavi;ID;57854;$4444.78;WHITE;05/15/1914
2;Oscar;McDaniel;49;Kulak Drive;Jetagoz;IL;57631;$5813.94;RED;02/11/1918
3;Winifred;Olson;29;Wahab Mill;Ucocivo;NC;46073;$2002.70;RED;08/11/2008
I skipped the seq and color fields, passing this array of Controls:
LineReader lineReader = null;
private void btnOpenFile_Click(object sender, EventArgs e)
{
string filePath = Path.Combine(Application.StartupPath, @"sample.csv");
lineReader = new LineReader(filePath, true);
string header = lineReader.HeaderLine;
Control[] controls = new[] {
null, textBox1, textBox2, textBox3, textBox4, textBox5,
textBox6, textBox9, textBox7, null, textBox8 };
lineReader.AssociateControls(controls, ";");
}
The null entries correspond to the data fields that are not considered.
Visual sample of the functionality:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Windows.Forms;
class LineReader : IDisposable
{
private StreamReader reader = null;
private Dictionary<long, long> positions;
private string m_filePath = string.Empty;
private Encoding m_encoding = null;
private IEnumerable<Control> m_controls = null;
private string m_separator = string.Empty;
private bool m_associate = false;
private long m_currentPosition = 0;
private bool m_hasHeader = false;
public LineReader(string filePath) : this(filePath, false) { }
public LineReader(string filePath, bool hasHeader) : this(filePath, hasHeader, Encoding.ASCII) { }
public LineReader(string filePath, bool hasHeader, Encoding encoding)
{
if (!File.Exists(filePath)) {
throw new FileNotFoundException($"The file specified: {filePath} was not found");
}
this.m_filePath = filePath;
m_hasHeader = hasHeader;
CurrentLineNumber = 0;
reader = new StreamReader(this.m_filePath, encoding, true);
CurrentLine = reader.ReadLine();
m_encoding = reader.CurrentEncoding;
m_currentPosition = m_encoding.GetPreamble().Length;
positions = new Dictionary<long, long>() { [0]= m_currentPosition };
// The first line read above is the header; then advance to the first data line.
if (hasHeader) { this.HeaderLine = CurrentLine; this.MoveNext(); }
}
public string HeaderLine { get; private set; }
public string CurrentLine { get; private set; }
public long CurrentLineNumber { get; private set; }
public string MoveNext()
{
string read = reader.ReadLine();
if (string.IsNullOrEmpty(read)) return this.CurrentLine;
CurrentLineNumber += 1;
if ((positions.Count - 1) < CurrentLineNumber) {
AdjustPositionToLineFeed();
positions.Add(CurrentLineNumber, m_currentPosition);
}
else {
m_currentPosition = positions[CurrentLineNumber];
}
this.CurrentLine = read;
if (m_associate) this.Associate();
return read;
}
public string MovePrevious()
{
if (CurrentLineNumber == 0 || (CurrentLineNumber == 1 && m_hasHeader)) return this.CurrentLine;
CurrentLineNumber -= 1;
m_currentPosition = positions[CurrentLineNumber];
reader.BaseStream.Position = m_currentPosition;
reader.DiscardBufferedData();
this.CurrentLine = reader.ReadLine();
if (m_associate) this.Associate();
return this.CurrentLine;
}
private void AdjustPositionToLineFeed()
{
long linePos = m_currentPosition + m_encoding.GetByteCount(this.CurrentLine);
long prevPos = reader.BaseStream.Position;
reader.BaseStream.Position = linePos;
byte[] buffer = new byte[4];
reader.BaseStream.Read(buffer, 0, buffer.Length);
char[] chars = m_encoding.GetChars(buffer).Where(c => c.Equals((char)10) || c.Equals((char)13)).ToArray();
m_currentPosition = linePos + m_encoding.GetByteCount(chars);
reader.BaseStream.Position = prevPos;
}
public void AssociateControls(IEnumerable<Control> controls, string separator)
{
m_controls = controls;
m_separator = separator;
m_associate = true;
if (!string.IsNullOrEmpty(this.CurrentLine)) Associate();
}
private void Associate()
{
string[] values = this.CurrentLine.Split(new[] { m_separator }, StringSplitOptions.None);
int associate = 0;
m_controls.ToList().ForEach(c => {
if (c != null) c.Text = values[associate];
associate += 1;
});
}
public override string ToString() =>
$"File Path: {m_filePath} Encoding: {m_encoding.BodyName} CodePage: {m_encoding.CodePage}";
public void Dispose()
{
this.Dispose(true);
GC.SuppressFinalize(this);
}
protected virtual void Dispose(bool disposing)
{
if (disposing) { reader?.Dispose(); }
}
}
General approach is the following:
Add a text file input.txt like this
line 1
line 2
line 3
and set Copy to Output Directory property to Copy if newer
Create extension methods for StreamReader
public static class StreamReaderExtensions
{
public static bool TryReadNextLine(this StreamReader reader, out string line)
{
var isAvailable = reader != null &&
!reader.EndOfStream;
line = isAvailable ? reader.ReadLine() : null;
return isAvailable;
}
public static bool TryReadPrevLine(this StreamReader reader, out string line)
{
// Check for null before touching the reader.
var isAvailable = reader != null &&
reader.BaseStream.Position > 0;
if(!isAvailable)
{
line = null;
return false;
}
var stream = reader.BaseStream;
var encoding = reader.CurrentEncoding;
var bom = GetBOM(encoding);
var buffer = new List<byte>();
var str = string.Empty;
stream.Position++;
while (!str.StartsWith(Environment.NewLine))
{
stream.Position -= 2;
buffer.Insert(0, (byte)stream.ReadByte());
var reachedBOM = buffer.Take(bom.Length).SequenceEqual(bom);
if (reachedBOM)
buffer = buffer.Skip(bom.Length).ToList();
str = encoding.GetString(buffer.ToArray());
if (reachedBOM)
break;
}
stream.Position--;
line = str.Trim(Environment.NewLine.ToArray());
return true;
}
private static byte[] GetBOM(Encoding encoding)
{
if (encoding.Equals(Encoding.UTF7))
return new byte[] { 0x2b, 0x2f, 0x76 };
if (encoding.Equals(Encoding.UTF8))
return new byte[] { 0xef, 0xbb, 0xbf };
if (encoding.Equals(Encoding.Unicode))
return new byte[] { 0xff, 0xfe };
if (encoding.Equals(Encoding.BigEndianUnicode))
return new byte[] { 0xfe, 0xff };
if (encoding.Equals(Encoding.UTF32))
// Encoding.UTF32 is little-endian, so its BOM is FF FE 00 00.
return new byte[] { 0xff, 0xfe, 0, 0 };
return new byte[0];
}
}
And use it like this:
using (var reader = new StreamReader("input.txt"))
{
string na = "N/A";
string line;
for (var i = 0; i < 4; i++)
{
var isAvailable = reader.TryReadNextLine(out line);
Console.WriteLine($"Next line available: {isAvailable}. Line: {(isAvailable ? line : na)}");
}
for (var i = 0; i < 4; i++)
{
var isAvailable = reader.TryReadPrevLine(out line);
Console.WriteLine($"Prev line available: {isAvailable}. Line: {(isAvailable ? line : na)}");
}
}
The result is:
Next line available: True. Line: line 1
Next line available: True. Line: line 2
Next line available: True. Line: line 3
Next line available: False. Line: N/A
Prev line available: True. Line: line 3
Prev line available: True. Line: line 2
Prev line available: True. Line: line 1
Prev line available: False. Line: N/A
GetBOM is based on this.

Add two lines from csv file to array(s)

I have a csv file with the following data:
500000,0.005,6000
690000,0.003,5200
I need to add each line as a separate array. So 500000, 0.005, 6000 would be array1. How would I do this?
Currently my code adds each column into one element.
For example data[0] is showing 500000
690000
static void ReadFromFile(string filePath)
{
try
{
// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
using (StreamReader sr = new StreamReader(filePath))
{
string line;
// Read and display lines from the file until the end of
// the file is reached.
while ((line = sr.ReadLine()) != null)
{
string[] data = line.Split(',');
Console.WriteLine(data[0] + " " + data[1]);
}
}
}
catch (Exception e)
{
// Let the user know what went wrong.
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
}
Using the limited data set you've provided...
const string test = @"500000,0.005,6000
690000,0.003,5200";
var result = test.Split('\n')
.Select(x=> x.Split(',')
.Select(y => Convert.ToDecimal(y))
.ToArray()
)
.ToArray();
foreach (var element in result)
{
Console.WriteLine($"{element[0]}, {element[1]}, {element[2]}");
}
Can it be done without LINQ? Yes, but it's messy...
const string test = @"500000,0.005,6000
690000,0.003,5200";
List<decimal[]> resultList = new List<decimal[]>();
string[] lines = test.Split('\n');
foreach (var line in lines)
{
List<decimal> decimalValueList = new List<decimal>();
string[] splitValuesByComma = line.Split(',');
foreach (string value in splitValuesByComma)
{
decimal convertedValue = Convert.ToDecimal(value);
decimalValueList.Add(convertedValue);
}
decimal[] decimalValueArray = decimalValueList.ToArray();
resultList.Add(decimalValueArray);
}
decimal[][] resultArray = resultList.ToArray();
That gives exactly the same result as the first example.
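One thing to watch with both versions: Convert.ToDecimal(string) parses using the current culture, so "0.005" may be read differently on a machine whose culture uses a comma as the decimal separator. A minimal sketch that pins the culture and also tolerates Windows line endings (CsvRows and Parse are hypothetical names):

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class CsvRows
{
    // Splits the text into lines, then each line into decimals,
    // always using '.' as the decimal separator.
    public static decimal[][] Parse(string text) =>
        text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Split(',')
                .Select(v => decimal.Parse(v.Trim(), CultureInfo.InvariantCulture))
                .ToArray())
            .ToArray();
}
```

Each element of the result is one row of the file, so Parse(text)[0] holds the three values of the first line.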
If you can use a List<string[]>, you do not have to worry about the array length.
In the following example, the variable lines will be a list of arrays, like:
["500000", "0.005", "6000"]
["690000", "0.003", "5200"]
static void ReadFromFile(string filePath)
{
try
{
// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
using (StreamReader sr = new StreamReader(filePath))
{
List<string[]> lines = new List<string[]>();
string line;
// Read and display lines from the file until the end of
// the file is reached.
while ((line = sr.ReadLine()) != null)
{
string[] splittedLine = line.Split(',');
lines.Add(splittedLine);
}
}
}
catch (Exception e)
{
// Let the user know what went wrong.
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
}
While the others use a split method, I will offer a more "scholarly", more explicit method.
You have some CSV values in a file. Find a name for the object stored in the CSV, name every column, and give each one a type.
Define the default value of those fields. Define what happens for a missing column or a malformed field. Is there a header?
Now that you know what you have, define what you want. Once again: object name -> property -> type.
Believe it or not, simply defining your input and output solves your issue.
Use CsvHelper to simplify your code.
CSV File Definition:
public class CsvItem_WithARealName
{
public int data1;
public decimal data2;
public int goodVariableNames;
}
public class CsvItemMapper : ClassMap<CsvItem_WithARealName>
{
public CsvItemMapper()
{ //mapping based on index. cause file has no header.
Map(m => m.data1).Index(0);
Map(m => m.data2).Index(1);
Map(m => m.goodVariableNames).Index(2);
}
}
A CSV reader method: point it at a document and it will give you the CSV items.
Here we have some configuration: no header, and InvariantCulture for decimal conversion.
private IEnumerable<CsvItem_WithARealName> GetCsvItems(string filePath)
{
using (var fileReader = File.OpenText(filePath))
using (var csvReader = new CsvHelper.CsvReader(fileReader))
{
csvReader.Configuration.CultureInfo = CultureInfo.InvariantCulture;
csvReader.Configuration.HasHeaderRecord = false;
csvReader.Configuration.RegisterClassMap<CsvItemMapper>();
while (csvReader.Read())
{
var record = csvReader.GetRecord<CsvItem_WithARealName>();
yield return record;
}
}
}
Usage :
var filename = "csvExemple.txt";
var items = GetCsvItems(filename);

C#: Reading the next file in a folder?

I have a folder containing .txt files which are numbered like so:
0.txt
1.txt
...
867.txt
...
What I am trying to do is this: each time readNextFile(); is called, I want it to return the contents of the next file in that folder, and return string.Empty; if there is no file after the last one it read. I want to have a button that, when pressed, will make the program read the next file and do stuff with its contents. The files might change between button presses. The way I did this before was this:
int lastFileNumber = 0;
string readNextFile()
{
string result = string.Empty;
//I know it is recommended to use as few of these as possible, this is just an example.
try
{
string file = Path.Combine(@"C:\Somewhere", lastFileNumber.ToString() + ".txt");
if (File.Exists(file))
{
result = File.ReadAllText(file);
lastFileNumber++;
}
}
catch
{
}
return result;
}
Problem is there might sometimes be this kind of situation:
0.txt
1.txt
5.txt
6.txt
...
It would obviously get stuck at 1.txt because 2.txt doesn't exist. I need it to skip to the next existing file and read that one. And clearly it is not possible to just sort the file names alphabetically in a string array, since the file names are not padded; doing that would result in 1000000000.txt being read right after 1.txt.
Any idea how I can achieve this?
You can use LINQ to find the next file based on the stored number. This is done after ordering the files by converting their names to integers:
int lastFileNumber = -1;
bool isFirst = true;
private void buttonNext_Click(object sender, EventArgs e)
{
int lastFileNumberLocal = isFirst ? -1 : lastFileNumber;
isFirst = false;
int dummy;
var currentFile = Directory.GetFiles(@"D:\", "*.txt", SearchOption.TopDirectoryOnly)
.Select(x => new { Path = x, NameOnly = Path.GetFileNameWithoutExtension(x) })
.Where(x => Int32.TryParse(x.NameOnly, out dummy))
.OrderBy(x => Int32.Parse(x.NameOnly))
.Where(x => Int32.Parse(x.NameOnly) > lastFileNumberLocal)
.FirstOrDefault();
if (currentFile != null)
{
lastFileNumber = Int32.Parse(currentFile.NameOnly);
string currentFileContent = File.ReadAllText(currentFile.Path);
}
else
{
// reached the end, do something or show message
}
}
I don't think you can find what file is last without getting the whole list of files first. The sorting can be simplified by sorting by the file name length and then by the file name.
int currentFileNumber = -1;
string currentFileName;
string currentFileText;
string[] allFileNames;
string readCurrentFile()
{
try
{
if (allFileNames == null) allFileNames = (
from f in Directory.EnumerateFiles(@".", "*.*")
orderby f.Length, f select f).ToArray();
currentFileNumber++;
if (currentFileNumber >= allFileNames.Length) return null; // no files left
currentFileName = allFileNames[currentFileNumber];
currentFileText = File.ReadAllText(currentFileName);
return currentFileText;
}
catch (Exception ex) {
MessageBox.Show(ex.Message);
return readCurrentFile(); // get next file if any Exception
}
}
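Both orderings used in the answers above can be tried on plain file names without touching the disk. This is a minimal sketch under two assumptions: all names share the same extension, and the numeric parts have no leading zeros (otherwise length-then-name ordering no longer matches numeric order). NumberedFiles, Order and NextAfter are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class NumberedFiles
{
    // Sorts by length first, then by text, so "10.txt" comes after "9.txt".
    public static List<string> Order(IEnumerable<string> names) =>
        names.OrderBy(n => n.Length).ThenBy(n => n, StringComparer.Ordinal).ToList();

    // Returns the name with the smallest number greater than last,
    // or null when no such file exists. Non-numeric names are ignored.
    public static string NextAfter(IEnumerable<string> names, int last) =>
        names.Select(n => new { Name = n, Stem = Path.GetFileNameWithoutExtension(n) })
             .Where(x => int.TryParse(x.Stem, out _))
             .Select(x => new { x.Name, Number = int.Parse(x.Stem) })
             .Where(x => x.Number > last)
             .OrderBy(x => x.Number)
             .Select(x => x.Name)
             .FirstOrDefault();
}
```

With the names 0.txt, 1.txt, 5.txt and 6.txt, NextAfter(names, 1) skips the gap and returns 5.txt.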

Reading a line from a streamreader without consuming?

Is there a way to read ahead one line to test if the next line contains specific tag data?
I'm dealing with a format that has a start tag but no end tag.
I would like to read a line, add it to a structure, and then test the line below to make sure it is not a new "node". If it isn't, I keep adding; if it is, I close off that struct and make a new one.
The only solution I can think of is to have two stream readers going at the same time, shuffling their way along in lock step, but that seems wasteful (if it will even work).
I need something like Peek, but PeekLine.
The problem is the underlying stream may not even be seekable. If you take a look at the stream reader implementation it uses a buffer so it can implement TextReader.Peek() even if the stream is not seekable.
You could write a simple adapter that reads the next line and buffers it internally, something like this:
public class PeekableStreamReaderAdapter
{
private StreamReader Underlying;
private Queue<string> BufferedLines;
public PeekableStreamReaderAdapter(StreamReader underlying)
{
Underlying = underlying;
BufferedLines = new Queue<string>();
}
public string PeekLine()
{
string line = Underlying.ReadLine();
if (line == null)
return null;
BufferedLines.Enqueue(line);
return line;
}
public string ReadLine()
{
if (BufferedLines.Count > 0)
return BufferedLines.Dequeue();
return Underlying.ReadLine();
}
}
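Note that in the adapter above each call to PeekLine reads one more line from the underlying reader, so two peeks in a row return two different lines. If you want repeated peeks to return the same line until it is consumed, a single-slot variant is enough. This is a sketch with a hypothetical class name, written against TextReader so it also works with StringReader:

```csharp
using System;
using System.IO;

public sealed class PeekingLineReader
{
    private readonly TextReader reader;
    private string peeked;   // the line read ahead, if any
    private bool hasPeeked;  // true while peeked holds an unconsumed line

    public PeekingLineReader(TextReader reader) { this.reader = reader; }

    // Returns the next line without consuming it; null at end of input.
    public string PeekLine()
    {
        if (!hasPeeked) { peeked = reader.ReadLine(); hasPeeked = true; }
        return peeked;
    }

    // Returns the peeked line if there is one, otherwise reads a new line.
    public string ReadLine()
    {
        if (!hasPeeked) return reader.ReadLine();
        hasPeeked = false;
        var line = peeked;
        peeked = null;
        return line;
    }
}
```

Because nothing is ever sought or re-read, this works on non-seekable streams as well.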
You could store the position by accessing StreamReader.BaseStream.Position, then read the next line, do your test, and then seek back to the position from before you read the line:
// Peek at the next line
long peekPos = reader.BaseStream.Position;
string line = reader.ReadLine();
if (line.StartsWith("<tag start>"))
{
// This is a new tag, so we reset the position
// and drop the reader's buffer so the next read starts there.
reader.BaseStream.Seek(peekPos, SeekOrigin.Begin);
reader.DiscardBufferedData();
}
else
{
// This is part of the same node.
}
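A caveat with this approach: StreamReader reads ahead into an internal buffer, so BaseStream.Position usually reports how far the buffer has been filled, not where the last returned line ended. The following sketch makes that visible on a small in-memory stream (BufferedPositionDemo is a hypothetical name):

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferedPositionDemo
{
    // Reads one line and reports where the underlying stream ended up.
    public static long PositionAfterFirstLine(string text)
    {
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(text)))
        using (var reader = new StreamReader(stream))
        {
            reader.ReadLine();
            return stream.Position;
        }
    }
}
```

For the 6-byte input "a\nb\nc\n" the first line ends after 2 bytes, yet the reported position is 6, because the whole stream was pulled into the buffer on the first read. A saved BaseStream.Position is therefore only a reliable line boundary if you track the consumed bytes yourself.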
This is a lot of seeking and re-reading the same lines. Using some logic, you may be able to avoid this altogether - for instance, when you see a new tag start, close out the existing structure and start a new one - here's a basic algorithm:
SomeStructure myStructure = null;
while (!reader.EndOfStream)
{
string currentLine = reader.ReadLine();
if (currentLine.StartsWith("<tag start>"))
{
// Close out the existing structure, if there is one.
if (myStructure != null)
{
// Finish and store the existing structure here.
}
// Create a new structure and add this line.
myStructure = new SomeStructure();
// Append to myStructure.
}
else
{
// Add to the existing structure.
if (myStructure != null)
{
// Append to existing myStructure
}
else
{
// This means the first line was not part of a structure.
// Either handle this case, or throw an exception.
}
}
}
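To make the outline above concrete, here is a minimal runnable sketch that uses a plain list of lines in place of SomeStructure. The names OnePassGrouping and Group are placeholders, and throwing on a line that precedes the first start tag is just one of the two options mentioned in the comments.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class OnePassGrouping
{
    // Splits the input into groups, each beginning with a line that starts with startTag.
    public static List<List<string>> Group(TextReader reader, string startTag)
    {
        var result = new List<List<string>>();
        List<string> current = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.StartsWith(startTag, StringComparison.Ordinal))
            {
                // Close out the existing group before starting a new one.
                if (current != null) result.Add(current);
                current = new List<string> { line };
            }
            else if (current != null)
            {
                // Still part of the current group.
                current.Add(line);
            }
            else
            {
                // The first line was not part of any group.
                throw new InvalidDataException("Line found before the first start tag: " + line);
            }
        }
        // Close out the final group, which has no following tag to trigger it.
        if (current != null) result.Add(current);
        return result;
    }
}
```

No lookahead is needed here: the start of the next node is what closes the previous one, and the last node is closed when the input ends.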
Why the difficulty? Return the next line, regardless. Check if it is a new node, if not, add it to the struct. If it is, create a new struct.
// Sketch: Struct, IsNode and readline stand in for your own type and helpers
List<Struct> structs = new List<Struct>();
Struct current = null;
string line;
while ((line = readline()) != null) {
if (IsNode(line)) {
if (current != null) structs.Add(current);
current = new Struct();
continue;
}
// Whatever processing you need to do
if (current != null) current.AddLine(line);
}
if (current != null) structs.Add(current); // Add the last one to the collection
// Use your structures here
foreach (Struct s in structs) {
}
Here is what I have so far. I went with splitting the text rather than reading it line by line with a StreamReader.
I'm sure there are a few places that are dying to be more elegant, but for right now it seems to be working.
Please let me know what you think.
struct INDI
{
public string ID;
public string Name;
public string Sex;
public string BirthDay;
public bool Dead;
}
struct FAM
{
public string FamID;
public string type;
public string IndiID;
}
List<INDI> Individuals = new List<INDI>();
List<FAM> Family = new List<FAM>();
private void button1_Click(object sender, EventArgs e)
{
string path = @"C:\mostrecent.ged";
ParseGedcom(path);
}
private void ParseGedcom(string path)
{
//Open path to GED file
StreamReader SR = new StreamReader(path);
//Read the entire file, then split on "0 @" for individuals and families (no other info is needed for this instance)
string[] Holder = SR.ReadToEnd().Replace("0 @", "\u0646").Split('\u0646');
//For each new cell in the holder array, look for individuals and families
foreach (string Node in Holder)
{
//Sub-split the string on the returns to get a true block of info
string[] SubNode = Node.Replace("\r\n", "\r").Split('\r');
//If an individual is found
if (SubNode[0].Contains("INDI"))
{
//Create new Structure
INDI I = new INDI();
//Add the ID number and remove extra formatting
I.ID = SubNode[0].Replace("@", "").Replace(" INDI", "").Trim();
//Find the name and remove extra formatting for the last name
I.Name = SubNode[FindIndexinArray(SubNode, "NAME")].Replace("1 NAME", "").Replace("/", "").Trim();
//Find sex and remove extra formatting
I.Sex = SubNode[FindIndexinArray(SubNode, "SEX")].Replace("1 SEX ", "").Trim();
//Determine if there is a birthday; -1 means no
if (FindIndexinArray(SubNode, "1 BIRT ") != -1)
{
// Add birthday to struct
I.BirthDay = SubNode[FindIndexinArray(SubNode, "1 BIRT ") + 1].Replace("2 DATE ", "").Trim();
}
// Determine if there is a death tag; returns -1 if not found
if (FindIndexinArray(SubNode, "1 DEAT ") != -1)
{
//Convert Y or N to true or false (defaults to false, so no need to change unless Y is found)
if (SubNode[FindIndexinArray(SubNode, "1 DEAT ")].Replace("1 DEAT ", "").Trim() == "Y")
{
//set death
I.Dead = true;
}
}
//add the Struct to the list for later use
Individuals.Add(I);
}
// Start Family section
else if (SubNode[0].Contains("FAM"))
{
//Grab the Fam ID from the node early on to keep from doing it over and over
string FamID = SubNode[0].Replace("@ FAM", "");
// Multiple children can exist for each family, so this section had to be a bit more dynamic
// Look at each line of node
foreach (string Line in SubNode)
{
// If node is HUSB
if (Line.Contains("1 HUSB "))
{
FAM F = new FAM();
F.FamID = FamID;
F.type = "PAR";
F.IndiID = Line.Replace("1 HUSB ", "").Replace("@", "").Trim();
Family.Add(F);
}
//If node for Wife
else if (Line.Contains("1 WIFE "))
{
FAM F = new FAM();
F.FamID = FamID;
F.type = "PAR";
F.IndiID = Line.Replace("1 WIFE ", "").Replace("@", "").Trim();
Family.Add(F);
}
//If node for multiple children
else if (Line.Contains("1 CHIL "))
{
FAM F = new FAM();
F.FamID = FamID;
F.type = "CHIL";
F.IndiID = Line.Replace("1 CHIL ", "").Replace("@", "").Trim();
Family.Add(F);
}
}
}
}
}
private int FindIndexinArray(string[] Arr, string search)
{
int Val = -1;
for (int i = 0; i < Arr.Length; i++)
{
if (Arr[i].Contains(search))
{
Val = i;
}
}
return Val;
}
