I have to parse (in C#) a .CSV file with a variable "width" and 2 lines of header information (the first one being the names and the second one being the units).
The data looks like:
Example1.CSV:
"timestamp","NAME_1","NAME_2","NAME_3","NAME_4"
"ms","unit_1","unit_2","unit_3","unit_4"
0.01,1.23,4.56,7.89,0.12
0.02,1.23,4.66,7.89,0.11
0.03,1.23,4.76,7.89,0.11
0.04,56.23,4.86,7.89,0.12
Example2.CSV:
"timestamp","NAME_1","NAME_2","NAME_3","NAME_4","NAME_5",...,"NAME_N"
"ms","unit_1","unit_2","unit_3","unit_4","unit_5",...,"unit_N"
0.01,1.23,4.56,7.89,0.12,0.13,...,0.27
0.02,1.23,4.66,7.89,0.12,0.13,...,0.22
0.03,1.23,4.76,7.89,0.11,0.13,...,0.24
0.04,56.23,4.86,7.89,0.12,0.13,...,0.29
With N being the "width" of the table (N can be 128 or larger). I'm planning to use FileHelpers.
I thought of using [FieldOptional()], but that gets very unwieldy, especially when the "width" is variable...
My current attempt looks like
[IgnoreFirst(2)]
[DelimitedRecord(",")]
public sealed class LogData
{
public Double ts;
public Double Field1;
[FieldNullValue(0.0)]
[FieldOptional()]
public Double Field2;
[FieldNullValue(0.0)]
[FieldOptional()]
public Double Field3;
// and so on
}
Any help on "how to solve the variable width"-Problem in a more elegant manner is appreciated - Thank you very much in advance!
Ben
If you are planning to convert the file into a DataTable, there is a better option.
Use the CsvEngine class of the FileHelpers library. See the code snippet below:
using (MemoryStream stream = new MemoryStream(_fileContent)) // _fileContent is the file's content as a byte array
{
TextReader reader = new StreamReader(stream);
string path = "C:\\Sample.csv";
CsvEngine csvEngine = new CsvEngine("Model", ',', path);
var dataTable = csvEngine.ReadStreamAsDT(reader);
//Do whatever with dataTable
}
Here the sample file can be a CSV file or a text file that contains the header of the CSV file you want to process. The columns of the DataTable will be named according to the header of the sample file.
Cheers
You can use an optional array field. I think you need to be using FileHelpers 2.9.9.
[IgnoreFirst(2)]
[DelimitedRecord(",")]
public class LogData
{
public Double TimeStamp;
[FieldNullValue(0.0)]
[FieldOptional, FieldArrayLength(0, 100)]
public Double[] ManyFields;
}
Here's a working example.
class Program
{
static String content =
@"""timestamp"",""NAME_1"",""NAME_2"",""NAME_3"",""NAME_4""
""ms"",""unit_1"",""unit_2"",""unit_3"",""unit_4""
0.01,1.23,4.56,7.89,0.12
0.02,1.23,4.66,7.89,0.11
0.03,1.23,4.76,7.89,0.11
0.04,56.23,4.86,7.89,0.12";
private static void Main()
{
var engine = new FileHelperEngine<LogData>();
var records = engine.ReadString(content);
Assert.AreEqual(0.01, records[0].TimeStamp);
Assert.AreEqual(1.23, records[0].ManyFields[0]);
Assert.AreEqual(4.56, records[0].ManyFields[1]);
Assert.AreEqual(7.89, records[0].ManyFields[2]);
Assert.AreEqual(0.12, records[0].ManyFields[3]);
}
}
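If pulling in FileHelpers is not an option, the same variable-width layout can also be read with plain .NET classes. The following is only a minimal sketch: LogTable and LogParser are hypothetical names (not part of FileHelpers), and it assumes the header cells are quoted but contain no embedded commas, as in the examples above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical container for one log file; not part of FileHelpers.
public sealed class LogTable
{
    public string[] Names;                              // first header line
    public string[] Units;                              // second header line
    public List<double[]> Rows = new List<double[]>();  // one array per data line, any width
}

public static class LogParser
{
    // Assumption: header cells are quoted and contain no embedded commas.
    static string[] SplitHeader(string line) =>
        line.Split(',').Select(cell => cell.Trim().Trim('"')).ToArray();

    public static LogTable Parse(TextReader reader)
    {
        var table = new LogTable
        {
            Names = SplitHeader(reader.ReadLine() ?? ""),
            Units = SplitHeader(reader.ReadLine() ?? "")
        };
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0) continue;      // skip blank lines
            // Invariant culture so that "1.23" parses the same on every machine.
            table.Rows.Add(line.Split(',')
                .Select(cell => double.Parse(cell, CultureInfo.InvariantCulture))
                .ToArray());
        }
        return table;
    }
}

public static class Program
{
    public static void Main()
    {
        var text = "\"timestamp\",\"NAME_1\",\"NAME_2\"\n" +
                   "\"ms\",\"unit_1\",\"unit_2\"\n" +
                   "0.01,1.23,4.56\n" +
                   "0.02,1.23,4.66";
        var table = LogParser.Parse(new StringReader(text));
        Console.WriteLine(string.Join(" | ", table.Names));
        Console.WriteLine(table.Rows.Count + " rows, " + table.Rows[0].Length + " columns");
    }
}
```

Because each row is its own array, the width does not have to be known in advance; the column count can be checked against Names.Length if the file is expected to be rectangular.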
EDIT: I have a problem which I can't solve. I have a large text file, and I want to select only specific data from it.
My code:
class Program
{
static string ticket = "";
static string openTime = "";
static string type = "";
static float size;
static string item = "";
static float price;
static string closeTime = "";
static float priceC;
static float commission;
static float swap;
static float trade;
static void Main(string[] args)
{
ReadFile.ReadAllFile("..\\..\\..\\File.txt");
}
}
public static class ReadFile
{
public static string ReadAllFile(string file)
{
string content = "";
using (StreamReader sr = new StreamReader(file))
{
content = sr.ReadToEnd();
}
Console.WriteLine(content);
return content;
}
}
One line from text file:
18388699 2021.09.03 14:40:14 buy 0.01 eurusd 1.18720 1.16211 1.17961 1.17201 0.00 -0.69 -12.96
The values in this text are separated by spaces. I want to split these values into my strings.
This code reads the whole text of the file, but I only need to pick specific values out of it into my variables. Is there some method or function which can select only the important parts of the text?
Sorry for the previous post, I don't want a whole script. I am searching for some tips and tricks on how to do that :D English is not my native language, so I can't describe my problem perfectly, but I tried my best :D
I would probably do something like this:
static void Main()
{
foreach (var line in File.ReadLines(@"..\..\..\File.txt"))
{
var fields = line.Split();
var ticket = fields[0];
var openTime = DateTime.Parse(fields[1] + " " + fields[2], CultureInfo.InvariantCulture);
var price = decimal.Parse(fields[6], CultureInfo.InvariantCulture);
Console.WriteLine($"{ticket}, {openTime}, {price}");
}
}
Read each line of the file using File.ReadLines().
Split on spaces using String.Split().
Index into the fields to get at the data you want.
ticket is just the first field.
For openTime you need to join the second and third field. I also chose to parse to a DateTime.
For things like prices and quantities I would use the decimal data type. Your code uses float, which I would avoid: floats have very low precision (about 7 significant digits), so you may get round-off errors. Doubles are better, but can still have the same problem. The recommended data type for calculations like these is decimal.
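The steps above can be made a little more defensive. This is a sketch, not a finished parser: TradeLine is a hypothetical helper name, the date format string is inferred from the single sample line in the question, and which column index holds which value (beyond ticket, open time and price) is an assumption you would need to check against your file.

```csharp
using System;
using System.Globalization;

public static class TradeLine
{
    // Passing null as the separator splits on any whitespace; RemoveEmptyEntries
    // means repeated spaces or tabs do not produce empty fields.
    public static string[] Fields(string line) =>
        line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

    // Assumption: date and time look like "2021.09.03 14:40:14", as in the sample line.
    public static DateTime OpenTime(string[] fields) =>
        DateTime.ParseExact(fields[1] + " " + fields[2],
                            "yyyy.MM.dd HH:mm:ss", CultureInfo.InvariantCulture);

    // decimal avoids the binary rounding of float/double for prices and amounts.
    public static decimal Number(string[] fields, int index) =>
        decimal.Parse(fields[index], CultureInfo.InvariantCulture);
}

public static class Program
{
    public static void Main()
    {
        var line = "18388699 2021.09.03 14:40:14 buy 0.01 eurusd 1.18720 1.16211 1.17961 1.17201 0.00 -0.69 -12.96";
        var fields = TradeLine.Fields(line);
        Console.WriteLine(fields[0]);                       // ticket
        Console.WriteLine(TradeLine.OpenTime(fields));      // open time
        Console.WriteLine(TradeLine.Number(fields, 6));     // price
    }
}
```

Using ParseExact with an explicit format makes the parse fail loudly on a malformed line instead of guessing at the date layout.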
What would be the most efficient way of searching for a specific string in a text then displaying only a portion of it?
Here is my situation: I am currently hosting a .txt file on my server. The function I want to create would access this .txt (maybe even download for efficiency?), search an ID (ex. 300000000) and then put the name in a string (ex. Island Andrew).
Here is an example of the .txt file hosted on my server:
ID: 300000000 NAME: Island Andrew
ID: 300000100 NAME: Island Bob
ID: 300000010 NAME: Island George
ID: 300000011 NAME: Library
ID: 300000012 NAME: Cellar
I already have complete code for a similar example; however, the formatting is different and it is not in C#.
Here it is;
If anyone can help me accomplish this in C#, it would be greatly appreciated.
Thanks.
Simplistic approach without proper error handling.
Main part to look at is regex stuff.
using System;
using System.Net;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Program
{
static void Main()
{
var map = new Map();
Console.WriteLine(map[300000011]);
}
}
public class Map: Dictionary<int, string>
{
public Map()
{
WebClient wc = new WebClient()
{
Proxy = null
};
string rawData = wc.DownloadString("<insert url with data in new format here>");
PopulateWith(rawData);
}
void PopulateWith(string rawText)
{
string pattern = @"ID: (?<id>\d*) NAME: (?<name>.*)";
foreach (Match match in Regex.Matches(rawText, pattern))
{
// TODO: add error handling here
int id = int.Parse( match.Groups["id"].Value );
string name = match.Groups["name"].Value;
this[id] = name;
}
}
}
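The TODO above can be filled in without changing the overall idea. The sketch below is one possible way, under the assumption that the file has one "ID: ... NAME: ..." entry per line; MapParser is a hypothetical name, and the download step is left out so the parsing can be tried on a plain string.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MapParser
{
    // Multiline makes ^ match at the start of every line.
    // [^\r\n]+ stops the name before a line break, so a trailing '\r'
    // from Windows line endings does not end up in the name.
    static readonly Regex Entry = new Regex(
        @"^ID:[ \t]*(?<id>\d+)[ \t]+NAME:[ \t]*(?<name>[^\r\n]+)",
        RegexOptions.Multiline);

    public static Dictionary<int, string> Parse(string rawText)
    {
        var map = new Dictionary<int, string>();
        foreach (Match match in Entry.Matches(rawText))
        {
            // TryParse skips ids that do not fit in an int instead of throwing.
            if (int.TryParse(match.Groups["id"].Value, out int id))
                map[id] = match.Groups["name"].Value.Trim();
        }
        return map;
    }
}

public static class Program
{
    public static void Main()
    {
        var raw = "ID: 300000000 NAME: Island Andrew\r\nID: 300000011 NAME: Library\r\n";
        var map = MapParser.Parse(raw);
        // TryGetValue avoids a KeyNotFoundException for unknown ids.
        Console.WriteLine(map.TryGetValue(300000011, out var name) ? name : "(not found)");
    }
}
```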
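You could try this to build a dictionary of names in C# (this assumes each entry has the form id:name):
Dictionary<int, string> mapDictionary = new Dictionary<int, string>();
string[] mapNames = rawData.Split(splitChar, StringSplitOptions.None);
foreach (string str in mapNames)
{
int colon = str.IndexOf(':');
if (colon < 0) continue; // skip entries without an id:name separator
// The id is the text before the colon, the name is everything after it.
string mapid = str.Substring(0, colon);
string mapname = str.Substring(colon + 1);
mapDictionary.Add(Convert.ToInt32(mapid), mapname);
}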
Remove all carets (^)
Convert all member access operators (->) to dots
Change gcnew to new
Convert array to string[]
Remove private and public modifiers from class, have them on methods explicitly (e.g. public void CacheMaps())
Change ref class to static class
Change nullptr to null
Change catch(...) to only catch
Move using namespace to the very top of the file, and replace scope resolution operator (::) to dots.
That should be about it.
The simplest way would be to replace the "ID:" and "NAME:" labels with a single separator token, so that a line such as ID: 30000 NAME: Andrew Island becomes
30000, Andrew Island
Then in your C# code you would create a custom class called
public class SomeDTO {
public long ID{get; set;}
public string Name {get; set;}
}
next you would create a new List of type SomeDTO as such:
var List = new List<SomeDTO>();
Then, as you parse the txt file, read it line by line with a file reader and split each line on the comma separator to get the two values.
Now you can simply add it to your new List
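// Indexing the string itself would return single characters, so split it first.
var parts = line.Split(',');
var tempId = long.Parse(parts[0]);
var tempName = parts[1].Trim();
List.Add(new SomeDTO { ID = tempId, Name = tempName });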
Now that you have the entire list in memory you can do a bunch of searching and what not and find all things you need plus reuse it because you've already built the list.
var first = List.Where(x => x.Name.Equals("Andrew Island")).FirstOrDefault();
string sample1 = "<SUCCESS><BUILDING>27</BUILDING></SUCCESS><CLEANED><LOCALITY>Value 1</LOCALITY></CLEANED>";
string sample2 = "<SUCCESS><BUILDING>14</BUILDING></SUCCESS> <SUCCESS><BUILDING>Value 2</BUILDING></SUCCESS>";
In both above string samples I want to get the first "SUCCESS" tag from right to left.
So in sample 1 I want returned = <SUCCESS><BUILDING>27</BUILDING></SUCCESS>
and in sample 2 I want returned = <SUCCESS><BUILDING>Value 2</BUILDING></SUCCESS>
I know I can use IndexOf to find the first occurrence, but I'm not sure how to find the last.
XDocument doc = XDocument.Parse("<xml>" + sample2 + "</xml>");
Text = doc.Root.Elements("SUCCESS").Last().ToString();
C# has a handy String method called LastIndexOf(String). It works exactly like IndexOf(String), except that it gives you the index of the last occurrence.
http://msdn.microsoft.com/en-us/library/1wdsy8fy(v=vs.110).aspx
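As a rough illustration of the LastIndexOf approach: find the last opening tag, then the first closing tag after it. TagFinder is a hypothetical helper, and the sketch assumes the tag has no attributes and is not nested inside itself; for anything more complex a real XML parser is the safer choice.

```csharp
using System;

public static class TagFinder
{
    // Returns the last <tag>...</tag> block in text, or null if there is none.
    // Assumptions: the tag carries no attributes and is not nested in itself.
    public static string LastElement(string text, string tag)
    {
        string open = "<" + tag + ">";
        string close = "</" + tag + ">";

        int start = text.LastIndexOf(open, StringComparison.Ordinal);
        if (start < 0) return null;

        // Search for the closing tag only from the last opening tag onwards.
        int end = text.IndexOf(close, start, StringComparison.Ordinal);
        if (end < 0) return null;

        return text.Substring(start, end - start + close.Length);
    }
}

public static class Program
{
    public static void Main()
    {
        string sample2 = "<SUCCESS><BUILDING>14</BUILDING></SUCCESS> <SUCCESS><BUILDING>Value 2</BUILDING></SUCCESS>";
        Console.WriteLine(TagFinder.LastElement(sample2, "SUCCESS"));
    }
}
```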
Hope this helps,
Cheers
If you're going to be parsing XML, you might be interested in using the XmlReader class. Read more about the XmlReader here.
Note that you need valid XML for the reader to work. In your example, you would need to wrap the partial XML in a unique root node (part of the XML spec). You might consider making some extension methods to help you:
public static class XMLStringExtensions
{
public static string LastTag(this string innerXml, string tag)
{
string previousTag = null;
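using (var reader = XmlReader.Create(new StringReader(innerXml.WrapInRoot())))
{
// ReadOuterXml leaves the reader on the node after the element, so only call
// Read() when the current node is not a match; otherwise a directly adjacent
// sibling with the same tag would be skipped.
while (!reader.EOF)
{
if (reader.NodeType == XmlNodeType.Element && reader.Name == tag) previousTag = reader.ReadOuterXml();
else reader.Read();
}
}
return previousTag;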
}
public static string WrapInRoot(this string partialXml)
{
return string.Format("<root>{0}</root>", partialXml);
}
}
Then you can invoke it like this:
sample1.LastTag("SUCCESS"); //<SUCCESS><BUILDING>27</BUILDING></SUCCESS>
sample2.LastTag("SUCCESS"); //<SUCCESS><BUILDING>Value 2</BUILDING></SUCCESS>
At the moment our ERP/PSA software produces an EFT (Electronic Fund Transfer) .txt file containing bank and employee bank information, which is then sent to the bank.
The problem is that the EFT file is currently produced in the US standard format, which does not meet Canadian bank standards. I do have the required Canadian bank format.
The format is defined by the number of columns in the file and the number of characters each column contains (if the data for a column is shorter than the column's width, it is padded with spaces).
So I.e.
1011234567Joe,Bloggs 1234567
And for example lets say I try transform to Canadian Standard
A101Joe,Bloggs 1234567 1234567
Where for example "A" needs to be added to first line in the record.
I'm just wondering how to go about a task like this in C#
I.e.
Read in text file.
Line by Line parse data in terms of start and end of characters
Assign values to variables
Rebuild new file with these variables with different ordering and additional data
I don't have my IDE open so my syntax might be a tad off, but I'll try to point you in the right direction. Anyways, what fun would it be to give you the solution outright?
First you're going to want to get a list of lines:
IEnumerable<string> lines = text.Split('\n');
You said that the columns don't have delimiters but rather are of fixed widths, but you didn't mention where the column sizes are defined. Generally, you're going to want to pull out the text of each column with
colText = line.Substring(startOfColumn, lengthOfColumn);
For each column you'll have to calculate startOfColumn and lengthOfColumn, depending on the positions and lengths of the columns.
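One way to keep the start and length values in a single place is a small table of columns. The sketch below is illustrative only: FixedWidth is a hypothetical helper, and the layout used in Main (3, 7, 20 and 7 characters) is made up for the example, not taken from any bank specification.

```csharp
using System;

public static class FixedWidth
{
    // Cuts one line into columns given (start, length) pairs.
    // The line is padded first so Substring cannot run past the end of a short line.
    public static string[] Slice(string line, (int Start, int Length)[] columns)
    {
        var result = new string[columns.Length];
        for (int i = 0; i < columns.Length; i++)
        {
            var (start, length) = columns[i];
            string padded = line.PadRight(start + length);
            result[i] = padded.Substring(start, length).Trim();
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical layout: id (3), routing number (7), name (20), account (7).
        var layout = new (int Start, int Length)[] { (0, 3), (3, 7), (10, 20), (30, 7) };
        string line = "101" + "1234567" + "Joe,Bloggs".PadRight(20) + "7654321";
        foreach (var value in FixedWidth.Slice(line, layout))
            Console.WriteLine("[" + value + "]");
    }
}
```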
Hopefully that's a good enough foundation for you to get started.
I think that your best bet is to create a class to hold the logical data that is present in the file and have methods in this class for parsing the data from a given format and saving it back to a given format.
For example, assume the following class:
public class EFTData
{
public string Name { get; set; }
public string RoutingNumber { get; set; }
public string AccountNumber { get; set; }
public string Id { get; set; }
public void FromUSFormat(string sLine)
{
this.Id = sLine.Substring(0, 3);
this.RoutingNumber = sLine.Substring(3, 7);
this.Name = sLine.Substring(10, 20);
this.AccountNumber = sLine.Substring(30, 7);
}
public string ToCanadianFormat()
{
var sbText = new System.Text.StringBuilder(100);
// Note that you can pad or trim fields as needed here
sbText.Append("A");
sbText.Append(this.Id);
sbText.Append(this.RoutingNumber);
sbText.Append(this.AccountNumber);
return sbText.ToString();
}
}
You can then read from a US file and write to a Canadian file as follows:
// Assume there is only a single line in the file
string sLineToProcess = System.IO.File.ReadAllText("usin.txt");
var oData = new EFTData();
// Parse the us data
oData.FromUSFormat(sLineToProcess);
// Write the canadian data
using (var oWriter = new StreamWriter("canout.txt"))
{
oWriter.Write(oData.ToCanadianFormat());
}
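For a file with more than one record, and to handle the padding mentioned in the comment above, something along these lines might work. It is a sketch under assumptions: EftConverter is a hypothetical name, the input offsets mirror the ones used in FromUSFormat, and the output column order and widths are invented for illustration and must be replaced by the real Canadian layout.

```csharp
using System;
using System.IO;
using System.Linq;

public static class EftConverter
{
    // Pads with spaces or truncates so the value is exactly 'width' characters.
    public static string Fit(string value, int width) =>
        value.Length >= width ? value.Substring(0, width) : value.PadRight(width);

    // Assumed input layout: id (3), routing (7), name (20), account (7).
    // Assumed output layout: "A", id, name, routing, account (illustrative only).
    public static string ConvertLine(string usLine)
    {
        string line = usLine.PadRight(37);          // guard against short lines
        string id = line.Substring(0, 3);
        string routing = line.Substring(3, 7);
        string name = line.Substring(10, 20).Trim();
        string account = line.Substring(30, 7);
        return "A" + Fit(id, 3) + Fit(name, 20) + Fit(routing, 7) + Fit(account, 7);
    }

    // Converts every line of the input file and writes the result line by line.
    public static void ConvertFile(string inputPath, string outputPath) =>
        File.WriteAllLines(outputPath,
            File.ReadLines(inputPath).Select(line => ConvertLine(line)));
}

public static class Program
{
    public static void Main()
    {
        string us = "101" + "1234567" + "Joe,Bloggs".PadRight(20) + "7654321";
        Console.WriteLine("[" + EftConverter.ConvertLine(us) + "]");
    }
}
```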
var lines = File.ReadAllLines(inputPath);
var results = new List<string>();
foreach (var line in lines)
{
results.Add(string.Format("A{0}", line));
}
File.WriteAllLines(outputPath, results.ToArray());
This may be a dumb question, but I'll give it a try.
A common task is importing data from ASCII files.
It's almost always the same apart from the structure of the file.
Comma separated, line separated, take 5 rows, take 12... whatever...
So it's always a different protocol/mapping but the same handling...
Is there a library for C# which helps with this day-to-day scenario?
This is awesome: FileHelpers Library
Example:
File:
1732,Juan Perez,435.00,11-05-2002
554,Pedro Gomez,12342.30,06-02-2004
112,Ramiro Politti,0.00,01-02-2000
924,Pablo Ramirez,3321.30,24-11-2002
Create a class that maps your data.
[DelimitedRecord(",")]
public class Customer
{
public int CustId;
public string Name;
public decimal Balance;
[FieldConverter(ConverterKind.Date, "dd-MM-yyyy")]
public DateTime AddedDate;
}
And then parse using:
FileHelperEngine engine = new FileHelperEngine(typeof(Customer));
// To Read Use:
Customer[] res = engine.ReadFile("FileIn.txt") as Customer[];
// To Write Use:
engine.WriteFile("FileOut.txt", res);
Enumerate:
foreach (Customer cust in res)
{
Console.WriteLine("Customer Info:");
Console.WriteLine(cust.Name + " - " +
cust.AddedDate.ToString("dd/MM/yy"));
}
You may want to take a look at the FileHelpers library.
So the only thing those tasks have in common is reading a text file?
If FileHelpers is overkill for you (simple text data, etc.), standard .NET classes should be all you need (String.Split Method, Regex Class, StreamReader Class).
They provide reading delimited by characters (String.Split) or lines (StreamReader).
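As a rough idea of what the standard-classes route looks like for the customer file shown earlier, here is a minimal sketch. CustomerRow and PlainCsv are hypothetical names, and it assumes that no field contains a comma or quotes; once that stops being true, a dedicated CSV library is the better tool.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical record type mirroring the columns of the sample file.
public sealed class CustomerRow
{
    public int CustId;
    public string Name;
    public decimal Balance;
    public DateTime AddedDate;
}

public static class PlainCsv
{
    // Assumption: exactly four comma-separated fields, none containing a comma.
    public static CustomerRow ParseLine(string line)
    {
        string[] fields = line.Split(',');
        return new CustomerRow
        {
            CustId = int.Parse(fields[0], CultureInfo.InvariantCulture),
            Name = fields[1].Trim(),
            Balance = decimal.Parse(fields[2], CultureInfo.InvariantCulture),
            AddedDate = DateTime.ParseExact(fields[3].Trim(), "dd-MM-yyyy",
                                            CultureInfo.InvariantCulture)
        };
    }

    // Reads line by line, so it works with StreamReader as well as StringReader.
    public static List<CustomerRow> Read(TextReader reader)
    {
        var rows = new List<CustomerRow>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length > 0) rows.Add(ParseLine(line));
        }
        return rows;
    }
}

public static class Program
{
    public static void Main()
    {
        var text = "1732,Juan Perez,435.00,11-05-2002\n554,Pedro Gomez,12342.30,06-02-2004";
        foreach (var row in PlainCsv.Read(new StringReader(text)))
            Console.WriteLine(row.Name + " - " + row.AddedDate.ToString("dd/MM/yy", CultureInfo.InvariantCulture));
    }
}
```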