I have two lists: List<int> numList holds identifier numbers, and List<string> filePaths holds the paths of files that need to be analyzed. I want to filter filePaths based on numList; that is, I only want to select the file paths whose file names contain an identifier number that is present in numList.
For example, filePaths has
C:/test/1.test.xlsx
C:/test/2.test.xlsx
C:/test/3.test.xlsx
C:/test/4.test.xlsx
and, numList has
1
2
In this case, I want to construct a LINQ statement that returns only
C:/test/1.test.xlsx
C:/test/2.test.xlsx
I tried
for(int i = 0; i < numList.Count; i++)
{
filePaths = filePaths.Where(f => Convert.ToInt32(GetNumberFromString(Path.GetFileName(f))) == numList[i]).ToList();
}
And this is the GetNumberFromString helper method:
// Find number in the string
private string GetNumberFromString(string value)
{
int number;
string resultString = Regex.Match(value, @"\d+").Value;
if (Int32.TryParse(resultString, out number))
{
return resultString;
}
else
{
throw new Exception(String.Format("No number present in the file {0}", value));
}
}
I think this will work, but is there a more elegant/efficient way of achieving this?
You can do it with a one-liner:
var filteredFilePaths = filePaths.Where(x => numList.Contains(Convert.ToInt32(GetNumberFromString(Path.GetFileName(x)))));
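As a self-contained variant of the one-liner, the sketch below puts the numbers in a HashSet<int> so each lookup is cheap, and skips files without a number instead of throwing. The names FilePathFilter, TryGetFileNumber and FilterByNumbers are hypothetical, chosen only for this illustration. Note also that the loop in the question re-filters the already filtered list on every pass, so with more than one number in numList it narrows toward an empty result.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original question or answers.
public static class FilePathFilter
{
    // Returns the first run of digits in the file name, or null if there is none
    // (or if the digits do not fit in an int).
    public static int? TryGetFileNumber(string path)
    {
        Match m = Regex.Match(Path.GetFileName(path), @"\d+");
        int number;
        return m.Success && int.TryParse(m.Value, out number) ? number : (int?)null;
    }

    // Keeps only the paths whose file number is contained in numList.
    public static List<string> FilterByNumbers(IEnumerable<string> filePaths, IEnumerable<int> numList)
    {
        var wanted = new HashSet<int>(numList);
        return filePaths
            .Where(p =>
            {
                int? n = TryGetFileNumber(p);
                return n.HasValue && wanted.Contains(n.Value);
            })
            .ToList();
    }
}
```

Calling FilePathFilter.FilterByNumbers(filePaths, numList) with the sample data from the question should keep only the 1.test.xlsx and 2.test.xlsx paths.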
I'd do it like this. The test method assumes that all the files in directory have appropriately formatted names. If that's not a reasonable assumption, it's easy enough to fix.
This is overkill, however, if you only ever care about the "file number" in one place.
public class TestClass
{
public static void TestMethod(String directory)
{
var files = System.IO.Directory.GetFiles(directory).Select(f => new FileInfo(f)).ToList();
var numList = new[] { 1, 2 };
var oneAndTwo = files.Where(fi => numList.Contains(fi.FileNumber)).ToList();
}
}
public class FileInfo
{
public FileInfo()
{
}
public FileInfo(String path)
{
Path = path;
}
public int FileNumber { get; private set; }
private string _path;
public String Path
{
get { return _path; }
set
{
_path = value;
FileNumber = GetNumberFromFileName(_path);
}
}
public static int GetNumberFromFileName(string path)
{
int number;
var fileName = System.IO.Path.GetFileName(path);
string resultString = Regex.Match(fileName, @"\d+").Value;
if (Int32.TryParse(resultString, out number))
{
return number;
}
else
{
throw new Exception(String.Format("No number present in the file {0}", path ?? "(null)"));
}
}
}
A stand-alone one-liner using a Join:
var result = filePaths.Select(x => new { Filename = Path.GetFileName(x), x })
.Join(numList, x => Regex.Match(x.Filename, "^([0-9]+)").Value,
y => y.ToString(),
(x, y) => x.x);
Related
I have a text file containing a list of movie names and their parts, as below:
xxx, Author1, v6
the net, author1, v7
xxx, author3, v10
DDLJ, author3, v11
the fire, author5, v6
the health, author1, v8
the health, author7, v2
the hero, author9, v11
the hero, author8, v3
I would like to get the movie names that have the most recent version. In this case it should return "DDLJ" and "the hero".
This is what I have tried:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
namespace ProgramNamespace
{
public class Program
{
public static List<String> processData(IEnumerable<string> lines)
{
Dictionary<string, int> keyValuePairs = new Dictionary<string, int>();
foreach (var item in lines)
{
string[] readsplitted = item.Split(',');
keyValuePairs.Add(readsplitted[0], Convert.ToInt32(
Regex.Replace(readsplitted[2], "[^0-9]+", string.Empty)));
}
//List<String> retVal = new List<String>();
return retVal;
}
static void Main(string[] args)
{
try
{
List<String> retVal = processData(File.ReadAllLines(@"D:\input.txt"));
File.WriteAllLines(@"D:\output.txt", retVal);
}
catch (IOException ex)
{
Console.WriteLine(ex.Message);
}
}
}
}
Note that I am willing to add a helper class if required.
EDIT: version for duplicated keys
I rewrote the first solution I gave to take duplicated data into account. The trick is adding a progressive number before the key and separating it with an underscore: this way every key will be unique.
E.g. you will have your Dictionary filled like this:
"1_xxx", 6
"2_the net", 7
"3_xxx", 10
"4_DDLJ", 11
...
Then I remove the number (and the underscore) before providing a result.
public static List<String> processData(IEnumerable<string> lines)
{
var keyValuePairs = new Dictionary<string, int>();
int Position = 0;
foreach (var item in lines)
{
Position++;
string[] readsplitted = item.Split(',');
keyValuePairs.Add(Position.ToString() +"_" + readsplitted[0], Convert.ToInt32(Regex.Replace(readsplitted[2], "[^0-9]+", string.Empty)));
}
var MaxVersion = keyValuePairs.Values.OrderByDescending(f => f).First();
return keyValuePairs.Where(f => f.Value == MaxVersion).Select(f => string.Join("_", f.Key.Split('_').Skip(1))).ToList();
}
More in detail:
keyValuePairs.Values will return just the version numbers
.OrderByDescending(f => f).First() will sort the version numbers in descending order and pick the first, i.e. the highest
keyValuePairs.Where(f => f.Value == MaxVersion) will select the key-value pairs corresponding to the highest version above
.Select(f => string.Join("_", f.Key.Split('_').Skip(1))) will strip the numeric prefix from each key and give you the titles
This way you will also keep your Dictionary; if you are doing this one time and you don't need to expand your code or reuse your models, you won't have to create other classes or make it more complicated than necessary.
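If keeping the Dictionary is not a requirement, a list of pairs allows duplicate titles without the key-prefix trick. The following is only a sketch under that assumption; MovieVersions and NamesWithMaxVersion are hypothetical names, malformed lines are skipped rather than reported, and repeated titles are collapsed with Distinct().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class for illustration only.
public static class MovieVersions
{
    public static List<string> NamesWithMaxVersion(IEnumerable<string> lines)
    {
        // A list of pairs (title, version) tolerates repeated titles.
        var entries = new List<KeyValuePair<string, int>>();
        foreach (var line in lines)
        {
            string[] parts = line.Split(',');
            if (parts.Length != 3) continue; // skip malformed lines

            int version;
            string digits = Regex.Replace(parts[2], "[^0-9]+", string.Empty);
            if (!int.TryParse(digits, out version)) continue; // no usable version number

            entries.Add(new KeyValuePair<string, int>(parts[0].Trim(), version));
        }

        // Max() throws on an empty sequence, so handle that case first.
        if (entries.Count == 0) return new List<string>();

        int max = entries.Max(e => e.Value);
        return entries.Where(e => e.Value == max)
                      .Select(e => e.Key)
                      .Distinct()
                      .ToList();
    }
}
```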
For these kinds of tasks I usually prefer to create a class that represents the data we're collecting, and give it a TryParse method that will create an instance of the class based on a line of data:
public class MovieInfo
{
public string Name { get; set; }
public string Author { get; set; }
public int Version { get; set; }
public static bool TryParse(string input, out MovieInfo result)
{
result = null;
if (input == null) return false;
var parts = input.Split(',');
int version;
if (parts.Length == 3 &&
int.TryParse(parts[2].Trim().TrimStart('v'), out version))
{
result = new MovieInfo
{
Name = parts[0],
Author = parts[1],
Version = version
};
}
return result != null;
}
public override string ToString()
{
return $"{Name} (v{Version}) - {Author}";
}
}
Then it's just a matter of reading the file, creating a list of instances of this class, and getting all that have the highest version number:
public static List<MovieInfo> processData(IEnumerable<string> lines)
{
if (lines == null) return null;
var results = new List<MovieInfo>();
foreach (var line in lines)
{
MovieInfo temp;
if (MovieInfo.TryParse(line, out temp))
{
results.Add(temp);
}
}
var maxVersion = results.Max(result => result.Version);
return results.Where(result => result.Version == maxVersion).ToList();
}
For example:
private static void Main()
{
var lines = new List<string>
{
"xxx, Author1, v6",
"the net, author1, v7",
"xxx, author3, v10",
"DDLJ, author3, v11",
"the fire, author5, v6",
"the health, author1, v8",
"the health, author7, v2",
"the hero, author9, v11",
"the hero, author8, v3",
};
var processed = processData(lines);
foreach (var movie in processed)
{
// Note: this uses the overridden ToString method. You could just do 'movie.Name'
Console.WriteLine(movie);
}
Console.WriteLine("\nDone!! Press any key to exit...");
Console.ReadKey();
}
This is how I would do it. This returns all the movie names that share the same maximum version.
public static List<String> processData(string fileName)
{
var lines = File.ReadAllLines(fileName);
var values = lines.Select(x =>
{
var readsplitted = x.Split(',');
return new { Name = readsplitted[0], Version = int.Parse(readsplitted[2].Replace("v", string.Empty))};
});
var maxValue = values.Max(x => x.Version);
return values.Where(v => v.Version == maxValue)
.Select(v => v.Name)
.ToList();
}
static void Main(string[] args)
{
try
{
List<String> retVal = processData(@"D:\output.txt");
}
catch (IOException ex)
{
Console.WriteLine(ex.Message);
}
}
Create a Movie class in order to initialize an object for each row that represents a movie.
Split the whole string passed to processData() first by newline, then by ','.
Extract the version number of each movie (separate it from the "v"); see the extractNumberFromString() method.
Find the max version number and get (using a LINQ query) all the movies that share the maximum version number.
public static List<Movie> processData(string s)
{
// list to store all movies
List<Movie> allmovies = new List<Movie>();
// first split by new line
var splitbynewline = s.Split('\n');
// split by ',' and initialize object
foreach (var line in splitbynewline)
{
var moviestring = line.Split(',');
// create new movie object
Movie obj = new Movie { Name = moviestring[0], Author = moviestring[1], Version = moviestring[2] };
obj.VersionNumber = extractNumberFromString(moviestring[2]);
allmovies.Add(obj);
}
// get the max version number
double maxver = allmovies.Max(x => x.VersionNumber);
// build and return a list that contains all movies with the max version
List<Movie> result = allmovies.Where(x => x.VersionNumber == maxver).ToList();
return result;
}
/// <summary>
/// Concatenates the digit runs found in a string and converts them to a number; for example, "sdfdf43gn" returns 43.
/// </summary>
/// <param name="value">String that contains digits</param>
/// <returns>The number as a double</returns>
public static double extractNumberFromString(string value)
{
string returnVal = string.Empty;
System.Text.RegularExpressions.MatchCollection collection = System.Text.RegularExpressions.Regex.Matches(value, "\\d+");
foreach (System.Text.RegularExpressions.Match m in collection)
{
returnVal += m.ToString();
}
return Convert.ToDouble(returnVal);
}
public class Movie
{
public string Name;
public String Author;
public string Version;
public double VersionNumber;
}
Goal:
Retrieve a string value that is "1_2_3" from the list myListAnimal. In the future, the values can be random.
I need to add a "_" between numbers.
Problem:
I don't know how to do this using LINQ.
public class Animal
{
private void int _number;
private void string _name;
private bool display;
public int Number
{
get { return _number;}
set { _number = value; }
}
public int Name
{
get { return _name;
set { _name = value; }
}
public bool Display
{
get { return display;
set { display = value; }
}
}
List<Animal> myListAnimal = new List<Animal>
Animal myAnimal = new List<Animal>
myAnimal.Number = 1;
myAnimal.Name = "Dog";
myAnimal.Display = True;
myAnimals.add(myAnimal )
Animal myAnimal2 = new List<Animal>
myAnimal2.Number = 2;
myAnimal2.Name = "Cat";
myAnimal2.Display = True;
myAnimals.add(myAnimal2)
Animal myAnimal3 = new List<Animal>
myAnimal3.Number = 3;
myAnimal3.Name = "Pig";
myAnimal3.Display = True;
myAnimals.add(myAnimal3)
Animal myAnimal4 = new List<Animal>
myAnimal4.Number = 4;
myAnimal4.Name = "Sheep";
myAnimal4.Display = false;
myAnimals.add(myAnimal4)
Note: Your code sample isn't valid C#. I assume that you can fix that (it's pretty simple basic changes that need to be made). That said:
Yes, you can use LINQ to concatenate strings, which is ultimately what you're doing.
var concat = myListAnimal
.Where(a => a.Display)
.Select(a => a.Number.ToString())
.Aggregate((current, next) => current + "_" + next);
Console.WriteLine(concat);
Would output with your data:
1_2_3
Where() keeps only the animals where Display is true
Select() projects the number values to a sequence of strings
and Aggregate() does the concatenation.
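For reference, here is a minimal compilable sketch of the same idea, assuming an Animal class with auto-properties (my cleaned-up guess at what the question intended) and a hypothetical AnimalFormatter helper. It uses string.Join instead of Aggregate; one practical difference is that Aggregate without a seed throws InvalidOperationException on an empty sequence, while string.Join simply returns an empty string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the Animal class, written as valid C#.
public class Animal
{
    public int Number { get; set; }
    public string Name { get; set; }
    public bool Display { get; set; }
}

// Hypothetical helper for illustration only.
public static class AnimalFormatter
{
    // Joins the numbers of all displayed animals with "_".
    public static string JoinDisplayedNumbers(IEnumerable<Animal> animals)
    {
        return string.Join("_", animals
            .Where(a => a.Display)
            .Select(a => a.Number.ToString()));
    }
}
```

With the four animals from the question (Sheep has Display = false), JoinDisplayedNumbers should give 1_2_3.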
Your code is not valid. First fix it, then try this:
var concat = string.Join("_", myListAnimal.Where(a => a.Display).Select(a => a.Number.ToString()).ToArray());
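Try using StringBuilder and the ForEach extension method, then trim the trailing underscore.
StringBuilder sb = new StringBuilder();
myAnimals.Where(x => x.Display).ToList().ForEach(x => sb.AppendFormat("{0}_", x.Number));
string result = sb.ToString().TrimEnd('_');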
I am taking my first steps towards creating a very basic structural analysis program using Visual C#.
I decided to make it console-based (no user interface). Therefore the only way to get user's input is through chars and strings.
Imagine the user wants to create a 2D bar element. She would need to specify an initial point, a final point and a name for that bar. I want the syntax to be like follows:
"CREATE bar NAMED (bar_name) FIRST (first_point) LAST (last_point)"
Where:
(bar_name) is the name of the bar, up to the user. Let (bar_name)="bar_A" (string type).
(first_point) would be the initial point of the bar. Since we are creating a 2D bar, (first_point) should be a 1x2 vector that the user enters in parentheses. For example, (first_point)=(0,0)
(last_point) would be the final point of the bar. Same type and syntax as (first_point).
I am just wondering if there is any easy way to achieve the string comparison task, something like comparing the user's input against a prefabricated command.
Of course without forgetting about user's input cleaning task.
I know there is a huge amount of possible solutions here. Maybe using LINQ. Maybe just using the String object. I just want to know the most efficient way, where efficient means:
the faster the user's query gets processed, the better;
the fewer lines of code, the better; and
thorough query sanitizing is performed.
This last point is really important, given user input like this:
"CREATE bar NAMED bar_a FISRT (0,0) LAST (0,1)"
Note that the user committed a typo (FISRT instead of FIRST), so the query shouldn't run.
Thanks
Okay, I created a simple parser that should work well for you and that you can easily expand if the need arises.
Start off by creating a new Console Application. Add a new class file called Tokenizer.cs. This file was auto generated by my TokenIcer project that I linked to you in the comments above. Make Tokenizer.cs look like this:
public class TokenParser
{
private readonly Dictionary<Tokens, string> _tokens;
private readonly Dictionary<Tokens, MatchCollection> _regExMatchCollection;
private string _inputString;
private int _index;
public enum Tokens
{
UNDEFINED = 0,
CREATE = 1,
FIRST = 2,
LAST = 3,
BAR = 4,
NAMED = 5,
BAR_NAME = 6,
WHITESPACE = 7,
LPAREN = 8,
RPAREN = 9,
COMMA = 10,
NUMBER = 11
}
public string InputString
{
set
{
_inputString = value;
PrepareRegex();
}
}
public TokenParser()
{
_tokens = new Dictionary<Tokens, string>();
_regExMatchCollection = new Dictionary<Tokens, MatchCollection>();
_index = 0;
_inputString = string.Empty;
_tokens.Add(Tokens.CREATE, "[Cc][Rr][Ee][Aa][Tt][Ee]");
_tokens.Add(Tokens.FIRST, "[Ff][Ii][Rr][Ss][Tt]");
_tokens.Add(Tokens.LAST, "[Ll][Aa][Ss][Tt]");
_tokens.Add(Tokens.BAR, "[Bb][Aa][Rr][ \\t]");
_tokens.Add(Tokens.NAMED, "[Nn][Aa][Mm][Ee][Dd]");
_tokens.Add(Tokens.BAR_NAME, "[A-Za-z_][a-zA-Z0-9_]*");
_tokens.Add(Tokens.WHITESPACE, "[ \\t]+");
_tokens.Add(Tokens.LPAREN, "\\(");
_tokens.Add(Tokens.RPAREN, "\\)");
_tokens.Add(Tokens.COMMA, "\\,");
_tokens.Add(Tokens.NUMBER, "[0-9]+");
}
private void PrepareRegex()
{
_regExMatchCollection.Clear();
foreach (KeyValuePair<Tokens, string> pair in _tokens)
{
_regExMatchCollection.Add(pair.Key, Regex.Matches(_inputString, pair.Value));
}
}
public void ResetParser()
{
_index = 0;
_inputString = string.Empty;
_regExMatchCollection.Clear();
}
public Token GetToken()
{
if (_index >= _inputString.Length)
return null;
foreach (KeyValuePair<Tokens, MatchCollection> pair in _regExMatchCollection)
{
foreach (Match match in pair.Value)
{
if (match.Index == _index)
{
_index += match.Length;
return new Token(pair.Key, match.Value);
}
if (match.Index > _index)
{
break;
}
}
}
_index++;
return new Token(Tokens.UNDEFINED, string.Empty);
}
public PeekToken Peek()
{
return Peek(new PeekToken(_index, new Token(Tokens.UNDEFINED, string.Empty)));
}
public PeekToken Peek(PeekToken peekToken)
{
int oldIndex = _index;
_index = peekToken.TokenIndex;
if (_index >= _inputString.Length)
{
_index = oldIndex;
return null;
}
foreach (KeyValuePair<Tokens, string> pair in _tokens)
{
var r = new Regex(pair.Value);
Match m = r.Match(_inputString, _index);
if (m.Success && m.Index == _index)
{
_index += m.Length;
var pt = new PeekToken(_index, new Token(pair.Key, m.Value));
_index = oldIndex;
return pt;
}
}
var pt2 = new PeekToken(_index + 1, new Token(Tokens.UNDEFINED, string.Empty));
_index = oldIndex;
return pt2;
}
}
public class PeekToken
{
public int TokenIndex { get; set; }
public Token TokenPeek { get; set; }
public PeekToken(int index, Token value)
{
TokenIndex = index;
TokenPeek = value;
}
}
public class Token
{
public TokenParser.Tokens TokenName { get; set; }
public string TokenValue { get; set; }
public Token(TokenParser.Tokens name, string value)
{
TokenName = name;
TokenValue = value;
}
}
In Program.cs, make it look like this:
class Program
{
private class Bar
{
public string Name { get; set; }
public int FirstX { get; set; }
public int FirstY { get; set; }
public int LastX { get; set; }
public int LastY { get; set; }
}
static void Main(string[] args)
{
const string commandCreateBar1 = "CREATE bar NAMED bar_a FIRST(5,10) LAST (15,20)";
const string commandCreateBar2 = "CREATE bar NAMED MyFooBar FIRST(25 , 31) LAST (153 ,210)";
const string commandCreateBar3 = "CREATE bar NAMED MySpaceyFooBar FIRST(0,0) LAST (12,39)";
Bar bar1 = ParseCreateBar(commandCreateBar1);
PrintBar(bar1);
Bar bar2 = ParseCreateBar(commandCreateBar2);
PrintBar(bar2);
Bar bar3 = ParseCreateBar(commandCreateBar3);
PrintBar(bar3);
}
private static void PrintBar(Bar bar)
{
Console.WriteLine("A new bar was Created! \"{0}\" ({1}, {2}) ({3}, {4})", bar.Name, bar.FirstX, bar.FirstY, bar.LastX, bar.LastY);
}
private static Bar ParseCreateBar(string commandLine)
{
var bar = new Bar();
var parser = new TokenParser { InputString = commandLine };
Expect(parser, TokenParser.Tokens.CREATE);
Expect(parser, TokenParser.Tokens.BAR);
Expect(parser, TokenParser.Tokens.NAMED);
Token token = Expect(parser, TokenParser.Tokens.BAR_NAME);
bar.Name = token.TokenValue;
Expect(parser, TokenParser.Tokens.FIRST);
Expect(parser, TokenParser.Tokens.LPAREN);
token = Expect(parser, TokenParser.Tokens.NUMBER);
bar.FirstX = int.Parse(token.TokenValue);
Expect(parser, TokenParser.Tokens.COMMA);
token = Expect(parser, TokenParser.Tokens.NUMBER);
bar.FirstY = int.Parse(token.TokenValue);
Expect(parser, TokenParser.Tokens.RPAREN);
Expect(parser, TokenParser.Tokens.LAST);
Expect(parser, TokenParser.Tokens.LPAREN);
token = Expect(parser, TokenParser.Tokens.NUMBER);
bar.LastX = int.Parse(token.TokenValue);
Expect(parser, TokenParser.Tokens.COMMA);
token = Expect(parser, TokenParser.Tokens.NUMBER);
bar.LastY = int.Parse(token.TokenValue);
Expect(parser, TokenParser.Tokens.RPAREN);
return bar;
}
private static Token Expect(TokenParser parser, TokenParser.Tokens expectedToken)
{
EatWhiteSpace(parser);
Token token = parser.GetToken();
if (token != null && token.TokenName != expectedToken)
{
Console.WriteLine("Expected Token " + expectedToken);
Environment.Exit(0);
}
if (token == null)
{
Console.WriteLine("Unexpected end of input!");
Environment.Exit(0);
}
return token;
}
private static void EatWhiteSpace(TokenParser parser)
{
while (parser.Peek() != null && parser.Peek().TokenPeek != null &&
parser.Peek().TokenPeek.TokenName == TokenParser.Tokens.WHITESPACE)
{
parser.GetToken();
}
}
}
As you can see, I created 3 test scenarios. Notice all white space is ignored. If you want to be strict about the white space, you can modify the EatWhiteSpace function to be strict.
If you want, I also have a simple expression parser, made a while back using TokenIcer, that I could throw into this code. That way you could have commands such as CREATE bar NAMED bar_a FIRST(3+2, 7*8 + 12) LAST (150-100, 12-3*2). It can parse any math expression and supports parentheses, add, subtract, multiply, and divide.
Tokenization is one way to go, but if you aren't planning on supporting very many commands and parameters, you should look at regexes.
Regex regex = new Regex(@"^CREATE bar NAMED (?<BarName>[A-Za-z0-9_-]*) FIRST (?<FirstPoint>\([0-9]+,[0-9]+\)) LAST (?<LastPoint>\([0-9]+,[0-9]+\))$", RegexOptions.IgnoreCase);
Match match = regex.Match("create bar named bar_a first (0,0) last (0,1)");
if (match.Success)
{
var name = match.Groups["BarName"].Value;
// and so on for other matches
}
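To fill in the "and so on" part, the sketch below captures each coordinate in its own named group and converts it to an int. CreateBarCommand, TryParse and the group names are hypothetical; the pattern is a looser variant I wrote for illustration (it tolerates extra spaces and limits numbers to nine digits so int.Parse cannot overflow), not the exact regex above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class CreateBarCommand
{
    // One named group per value; IgnoreCase makes the keywords case-insensitive.
    private static readonly Regex Pattern = new Regex(
        @"^\s*CREATE\s+bar\s+NAMED\s+(?<name>[A-Za-z_][A-Za-z0-9_]*)\s+" +
        @"FIRST\s*\(\s*(?<x1>[0-9]{1,9})\s*,\s*(?<y1>[0-9]{1,9})\s*\)\s*" +
        @"LAST\s*\(\s*(?<x2>[0-9]{1,9})\s*,\s*(?<y2>[0-9]{1,9})\s*\)\s*$",
        RegexOptions.IgnoreCase);

    // Returns false for any input that does not match the whole command,
    // including misspelled keywords. points holds x1, y1, x2, y2.
    public static bool TryParse(string input, out string name, out int[] points)
    {
        name = null;
        points = null;

        Match m = Pattern.Match(input ?? string.Empty);
        if (!m.Success) return false;

        name = m.Groups["name"].Value;
        points = new[]
        {
            int.Parse(m.Groups["x1"].Value),
            int.Parse(m.Groups["y1"].Value),
            int.Parse(m.Groups["x2"].Value),
            int.Parse(m.Groups["y2"].Value)
        };
        return true;
    }
}
```

Because the pattern is anchored at both ends, the FISRT typo from the question makes the whole match fail instead of being silently skipped.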
I have the following property Class:
public class Ctas
{
private string _CodAgrup;
public string CodAgrup
{
get { return _CodAgrup; }
set { _CodAgrup = value; }
}
private string _NumCta;
public string NumCta
{
get { return _NumCta; }
set { _NumCta = value; }
}
private string _Desc;
public string Desc
{
get { return _Desc; }
set { _Desc = value; }
}
private string _subctade;
public string SubCtaDe
{
get { return _subctade; }
set { _subctade = value; }
}
private string _Nivel;
public string Nivel
{
get { return _Nivel; }
set { _Nivel = value; }
}
private string _Natur;
public string Natur
{
get { return _Natur; }
set { _Natur = value; }
}
public override string ToString()
{
return "CodAgrup = " + CodAgrup + ", NumCta = " + NumCta + ", Desc = " + Desc + ", SubCtaDe = " + SubCtaDe + ", Nivel = " + Nivel + ", Natur = " + Natur;
}
#endregion
}
I have to create an XML file from these properties, so first I have to fill them. I wrote the method below to do that. My first question: is this the correct way to fill the properties?
Then I should retrieve the data and write it to an XML file, so I convert the property data into lists and write them as attributes. But when I debug, I find that the lists are empty. Why is that, and what would be the best way to do it?
//Insert n data on properties
static void cuenta(string codagroup, string numcta, string desc, string subctade, string nivel, string natur)
{
Ctas cuentas = new Ctas();
int x = 0;
while (cuentas.CodAgrup != null)
{
cuentas.CodAgrup.Insert(x, "codagroup");
cuentas.NumCta.Insert(x, "numcta");
cuentas.Desc.Insert(x, "desc");
cuentas.SubCtaDe.Insert(x,"subctade");
cuentas.Nivel.Insert(x, "nivel");
cuentas.Natur.Insert(x, "natur");
x = x + 1;
}
}
//Converting propierties data into list
List<string> coda = cuentas.CodAgrup.GetType().GetProperties().Select(p => p.Name).ToList();
List<string> ncta = cuentas.NumCta.GetType().GetProperties().Select(p => p.Name).ToList();
List<string> desc = cuentas.Desc.GetType().GetProperties().Select(p => p.Name).ToList();
List<string> subdes = cuentas.SubCtaDe.GetType().GetProperties().Select(p => p.Name).ToList();
List<string> nivel = cuentas.Nivel.GetType().GetProperties().Select(p => p.Name).ToList();
List<string> natur = cuentas.Natur.GetType().GetProperties().Select(p => p.Name).ToList();
//Create XML from data in list´s
for (int i = 0; i < coda.Count; i++)
{
xmlWriter.WriteAttributeString("CodAgrup", coda[i]);
xmlWriter.WriteAttributeString("NumCta", ncta[i]);
xmlWriter.WriteAttributeString("Desc", desc[i]);
//write the atribute when property data exists.
if (cuentas.SubCtaDe != null)
{
xmlWriter.WriteAttributeString("SubCtaDe", subdes[i]);
}
xmlWriter.WriteAttributeString("Nivel", nivel[i]);
xmlWriter.WriteAttributeString("Natur", natur[i]);
xmlWriter.WriteEndElement();
}
Your code is confusing, but if I understand it right, here is the first error I see:
Ctas cuentas = new Ctas();
int x = 0;
while (cuentas.CodAgrup != null) // cuentas.CodAgrup is null from the beginning!
{
cuentas.CodAgrup.Insert(x, "codagroup");
cuentas.NumCta.Insert(x, "numcta");
cuentas.Desc.Insert(x, "desc");
cuentas.SubCtaDe.Insert(x,"subctade");
cuentas.Nivel.Insert(x, "nivel");
cuentas.Natur.Insert(x, "natur");
x = x + 1;
}
Since you are looking at a brand-new Ctas object, and there is no code to initialize the CodAgrup property, it will have the default value of null, so the code never enters the while loop.
Even if it DID, I suspect it would be an endless loop, because you're Inserting a literal value into a string property, and there is no condition I see where cuentas.CodAgrup will ever be null.
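A related point worth illustrating: strings in .NET are immutable, so String.Insert returns a new string and leaves the original untouched. The calls in the loop discard that return value, so they would not change the properties even if the loop ran. A tiny sketch (StringInsertDemo is a hypothetical name):

```csharp
using System;

// Hypothetical demo class for illustration only.
public static class StringInsertDemo
{
    public static string[] Run()
    {
        string original = "cta";

        // The result is thrown away, so 'original' is still "cta" afterwards.
        original.Insert(0, "num");

        // Assigning the result is what actually produces a changed value.
        string updated = original.Insert(0, "num");

        return new[] { original, updated };
    }
}
```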
As for your XML generation, why not just use the built in XmlSerializer class? Even if you require a specific format, there are attributes that let you customize the XML that is generated.
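A minimal sketch of the XmlSerializer approach, assuming each account should become one element with its values as attributes. The root element name "Cuentas" and the CtasXml helper are my own placeholders, and the Ctas class is rewritten here with auto-properties for brevity. As far as I know, string attributes that are null are simply left out of the output, which would cover the "write SubCtaDe only when it exists" requirement, but verify that against your own data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Same properties as in the question, marked so they serialize as XML attributes.
public class Ctas
{
    [XmlAttribute] public string CodAgrup { get; set; }
    [XmlAttribute] public string NumCta { get; set; }
    [XmlAttribute] public string Desc { get; set; }
    [XmlAttribute] public string SubCtaDe { get; set; }
    [XmlAttribute] public string Nivel { get; set; }
    [XmlAttribute] public string Natur { get; set; }
}

// Hypothetical helper for illustration only.
public static class CtasXml
{
    // Serializes the whole list to an XML string with one <Ctas .../> element per item.
    public static string Serialize(List<Ctas> cuentas)
    {
        var serializer = new XmlSerializer(typeof(List<Ctas>), new XmlRootAttribute("Cuentas"));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, cuentas);
            return writer.ToString();
        }
    }
}
```

Filling the data then becomes a matter of adding one Ctas object per account to a List<Ctas> (for example with an object initializer) and passing that list to Serialize.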
The goal here is that after inputting a csv file, a magic tool would output a C# class with the fields from the csv. Let's look at an example.
Input myFile.csv:
Year,Make,Model
1997,Ford,E350
2000,Mercury,Cougar
Output myFile.cs
public class myFile
{
public string Year;
public string Make;
public string Model;
}
So, the only thing I would need to fix is the types of the properties. After that I would use this class with FileHelpers to read the csv file. Later it would be mapped to an EntityFramework class (using AutoMapper) and saved to the database.
Actually, https://csv2entity.codeplex.com/ looks like it does what I need, but it just doesn't work: I installed it and nothing changed in my Visual Studio, no new template appeared. The project is totally dead. I opened the source code and ... decided maybe I'll just ask this question on Stack Overflow :)
FileHelpers has only a simple wizard, which allows you to manually add fields. But I have 50 fields and this is not the last time I will need to do it, so an automated solution is preferred here.
I believe this problem has been solved many times before. Any help?
Thank you Bedford, I took your code and added three things:
It removes symbols invalid for property names. For example "Order No." will become "OrderNo" property.
Ability to add property and class attributes. In my case I need [DelimitedRecord(",")] and [FieldOptional()], because I'm using FileHelpers.
Some columns don't have names, so it generates names itself. Naming convention is Column10, Column11 and so on.
Final code:
public class CsvToClass
{
public static string CSharpClassCodeFromCsvFile(string filePath, string delimiter = ",",
string classAttribute = "", string propertyAttribute = "")
{
if (string.IsNullOrWhiteSpace(propertyAttribute) == false)
propertyAttribute += "\n\t";
if (string.IsNullOrWhiteSpace(classAttribute) == false)
classAttribute += "\n";
string[] lines = File.ReadAllLines(filePath);
string[] columnNames = lines.First().Split(',').Select(str => str.Trim()).ToArray();
string[] data = lines.Skip(1).ToArray();
string className = Path.GetFileNameWithoutExtension(filePath);
// consider using StringBuilder for better performance
string code = String.Format("{0}public class {1} {{ \n", classAttribute, className);
for (int columnIndex = 0; columnIndex < columnNames.Length; columnIndex++)
{
var columnName = Regex.Replace(columnNames[columnIndex], @"[\s\.]", string.Empty, RegexOptions.IgnoreCase);
if (string.IsNullOrEmpty(columnName))
columnName = "Column" + (columnIndex + 1);
code += "\t" + GetVariableDeclaration(data, columnIndex, columnName, propertyAttribute) + "\n\n";
}
code += "}\n";
return code;
}
public static string GetVariableDeclaration(string[] data, int columnIndex, string columnName, string attribute = null)
{
string[] columnValues = data.Select(line => line.Split(',')[columnIndex].Trim()).ToArray();
string typeAsString;
if (AllDateTimeValues(columnValues))
{
typeAsString = "DateTime";
}
else if (AllIntValues(columnValues))
{
typeAsString = "int";
}
else if (AllDoubleValues(columnValues))
{
typeAsString = "double";
}
else
{
typeAsString = "string";
}
string declaration = String.Format("{0}public {1} {2} {{ get; set; }}", attribute, typeAsString, columnName);
return declaration;
}
public static bool AllDoubleValues(string[] values)
{
double d;
return values.All(val => double.TryParse(val, out d));
}
public static bool AllIntValues(string[] values)
{
int d;
return values.All(val => int.TryParse(val, out d));
}
public static bool AllDateTimeValues(string[] values)
{
DateTime d;
return values.All(val => DateTime.TryParse(val, out d));
}
// add other types if you need...
}
Usage example:
class Program
{
static void Main(string[] args)
{
var cSharpClass = CsvToClass.CSharpClassCodeFromCsvFile(@"YourFilePath.csv", ",", "[DelimitedRecord(\",\")]", "[FieldOptional()]");
File.WriteAllText(@"OutPutPath.cs", cSharpClass);
}
}
The full code and a working example are available at https://github.com/povilaspanavas/CsvToCSharpClass
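The name clean-up above removes only whitespace and dots, so headers containing other symbols (for example "Price ($)") or starting with a digit would still produce invalid property names. A stricter variant might look like the sketch below; ColumnNames and ToPropertyName are hypothetical names, only ASCII letters, digits and underscores are kept, and C# keywords such as "class" are not handled.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class ColumnNames
{
    // Turns a CSV header into a usable property name.
    // columnIndex is zero-based and is used for the ColumnN fallback.
    public static string ToPropertyName(string header, int columnIndex)
    {
        // Drop everything that is not an ASCII letter, digit or underscore.
        string name = Regex.Replace(header ?? string.Empty, @"[^A-Za-z0-9_]", string.Empty);

        // Unnamed columns follow the Column10, Column11, ... convention.
        if (name.Length == 0) return "Column" + (columnIndex + 1);

        // Identifiers cannot start with a digit, so prefix an underscore.
        if (char.IsDigit(name[0])) name = "_" + name;

        return name;
    }
}
```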
You can generate the class code with a little C# app which checks all the values for each column. You can determine which is the narrowest type each one fits:
public static string CSharpClassCodeFromCsvFile(string filePath)
{
string[] lines = File.ReadAllLines(filePath);
string[] columnNames = lines.First().Split(',').Select(str => str.Trim()).ToArray();
string[] data = lines.Skip(1).ToArray();
string className = Path.GetFileNameWithoutExtension(filePath);
// consider using StringBuilder for better performance
string code = String.Format("public class {0} {{ \n", className);
for (int columnIndex = 0; columnIndex < columnNames.Length; columnIndex++)
{
code += "\t" + GetVariableDeclaration(data, columnIndex, columnNames[columnIndex]) + "\n";
}
code += "}\n";
return code;
}
public static string GetVariableDeclaration(string[] data, int columnIndex, string columnName)
{
string[] columnValues = data.Select(line => line.Split(',')[columnIndex].Trim()).ToArray();
string typeAsString;
if (AllDateTimeValues(columnValues))
{
typeAsString = "DateTime";
}
else if (AllIntValues(columnValues))
{
typeAsString = "int";
}
else if (AllDoubleValues(columnValues))
{
typeAsString = "double";
}
else
{
typeAsString = "string";
}
string declaration = String.Format("public {0} {1} {{ get; set; }}", typeAsString, columnName);
return declaration;
}
public static bool AllDoubleValues(string[] values)
{
double d;
return values.All(val => double.TryParse(val, out d));
}
public static bool AllIntValues(string[] values)
{
int d;
return values.All(val => int.TryParse(val, out d));
}
public static bool AllDateTimeValues(string[] values)
{
DateTime d;
return values.All(val => DateTime.TryParse(val, out d));
}
// add other types if you need...
You can create a command line application from this which can be used in an automated solution.
You can create the dynamic model class from CSV using dynamic in C#. Override TryGetMember of the custom DynamicObject class and use Indexers.
A useful link:
C# Linq to CSV Dynamic Object runtime column name
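A rough sketch of what that could look like, assuming one object per CSV row built from the header line and a data line. CsvRow is a hypothetical class name; member access through dynamic is resolved at run time by TryGetMember, and the indexer gives the same lookup for column names that are not valid identifiers.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical row type for illustration only.
public class CsvRow : DynamicObject
{
    private readonly Dictionary<string, string> _values =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Pairs each header with the field in the same position.
    public CsvRow(string[] headers, string[] fields)
    {
        for (int i = 0; i < headers.Length && i < fields.Length; i++)
        {
            _values[headers[i].Trim()] = fields[i].Trim();
        }
    }

    // Indexer for column names that cannot be written as members.
    public string this[string column]
    {
        get { return _values[column]; }
    }

    // Called for row.SomeColumn when the row is used through 'dynamic'.
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        string value;
        bool found = _values.TryGetValue(binder.Name, out value);
        result = value;
        return found;
    }
}
```

For example, after dynamic row = new CsvRow(headers, fields); the expression row.Make would return the value of the Make column. All values stay strings here, so this trades the generated, typed class for flexibility.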
csv2entity has moved to:
https://github.com/juwikuang/csv2entity
The installation guide is the readme.md file.