LinqToExcel Duplicate Column Names - c#

I have a machine-generated Excel file that has a few columns with the same name, e.g.:
A        B        C        D
Group 1           Group 2
Period | Name     Period | Name
And I have a DTO like this:
[ExcelColumn("Period")]
public string FirstPeriod { get; set; }
[ExcelColumn("Name")]
public string FirstName { get; set; }
[ExcelColumn("Period")]
public string SecondPeriod { get; set; }
[ExcelColumn("Name")]
public string SecondName { get; set; }
I use the following command to read the lines:
var excel = new ExcelQueryFactory(filePath);
excel.WorksheetRange<T>(beginCell, endColl + linesCount.ToString(), sheetIndex);
It reads the file just fine, but when I check the content of my DTO I see that all the 'Second' properties have the same values as the 'First' ones.
This post was the closest thing I found in my searches, and I think the problem could be solved with something like this:
excel.AddMapping<MyDto>(x => x.FirstPeriod, "A");
excel.AddMapping<MyDto>(x => x.FirstName, "B");
excel.AddMapping<MyDto>(x => x.SecondPeriod, "C");
excel.AddMapping<MyDto>(x => x.SecondName, "D");
But I don't know how to get the Excel column letters...
Note: there is some more code behind this, but I don't think it's relevant to the problem.

The problem you're having isn't solvable with LinqToExcel today, because it wraps the OleDb functions and maps properties based on column names, so you lose OleDb options like the "FN" syntax for addressing columns by position (e.g. "F1" for column "A").
There's an issue about this on the LinqToExcel GitHub repo: https://github.com/paulyoder/LinqToExcel/issues/85
I recommend renaming the columns so there are no duplicate names (e.g. Period1, Name1, Period2, Name2). If that's not possible because the file is machine generated, try changing the header names at runtime.
Another option is to run more than one query against the Excel file, with ranges that split your groups, and then merge the results later.
var excel = new ExcelQueryFactory(filePath);
var group1 = excel.WorksheetRange<T>("A1", "B" + rowCount);
var group2 = excel.WorksheetRange<T>("C1", "D" + rowCount);
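If both ranges map onto DTOs with the same shape, the two result sets can be stitched back together row by row. Here is a minimal sketch of the merge step, assuming positional correspondence between the two ranges (the GroupRow and MyDto shapes are illustrative, not part of LinqToExcel):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape shared by both worksheet ranges (one group each).
class GroupRow
{
    public string Period { get; set; }
    public string Name { get; set; }
}

class MyDto
{
    public string FirstPeriod { get; set; }
    public string FirstName { get; set; }
    public string SecondPeriod { get; set; }
    public string SecondName { get; set; }
}

static class Merge
{
    // Zip the two row sequences positionally into one DTO per spreadsheet row.
    public static List<MyDto> Combine(IEnumerable<GroupRow> group1, IEnumerable<GroupRow> group2)
    {
        return group1.Zip(group2, (g1, g2) => new MyDto
        {
            FirstPeriod = g1.Period,
            FirstName = g1.Name,
            SecondPeriod = g2.Period,
            SecondName = g2.Name
        }).ToList();
    }
}
```

This only works if both ranges return the same number of rows in the same order, which holds for a side-by-side layout like the one above.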
Edit: I'll work on a feature to try to solve this problem in an elegant manner, so maybe in the future you'll have a more flexible option for mapping columns to properties (if they accept my pull request).

Related

WPU SQLite.Net.SQLiteConnection.Query encoding issue

I have an encoding problem with the Query method of SQLite.Net. Everything works fine if I only use the column names in the SQL string, but if I write the SQL myself, special characters like ä, ü, ö, ß are not encoded correctly.
Here are two easy examples, one working, one not.
public class ass {
[PrimaryKey, AutoIncrement]
public int _id { get; set; }
[MaxLength(255)]
public string sortname { get; set; }
}
dbConn = new SQLiteConnection(new SQLitePlatformWinRT("testpasswort"),DB_PATH);
dbConn.CreateTable<ass>(SQLite.Net.Interop.CreateFlags.None);
//add a test entry with special chars
ass asss = new ass();
asss.sortname = "oe=öae=äszett=ß";
dbConn.Insert(asss);
//now select the test entry to an ass object
List<ass> getass = dbConn.Table<ass>().ToList<ass>();
//the list is filled and sortname = "oe=öae=äszett=ß"
//now fake an object with
List<ass> sqlass = dbConn.Query<ass>("SELECT 'oe=öae=äszett=ß' as sortname FROM ass").ToList<ass>();
//the List is filled and sortname = "oe=�ae=�szett=�"
I know the query is useless, and the following will work:
List<ass> sqlass = dbConn.Query<ass>("SELECT sortname FROM ass").ToList<ass>();
But the problem is that the .Query function has an encoding issue; this will NOT work:
List<ass> sqlass = dbConn.Query<ass>("SELECT sortname FROM ass WHERE sortname LIKE '%ä%'").ToList<ass>();
But this will work:
List<ass> sqlass = dbConn.Query<ass>("SELECT sortname FROM ass").ToList<ass>().Where(v => v.sortname.Contains("ä"));
Every time I have a special character in the SQL code it will not work. This is fatal for my needs, because I have a lot of replace(column, find, replace) statements, and all of them fail if the find or replace string contains any ü, ö, ä [...]
Does anyone know how to solve this?
A possible solution would be to use bound parameters instead of building the SQL string directly, and to use the UTF-8 encoding pragma, which you can also use to check your existing database's encoding. A helpful description of this issue can be found here.
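For example, the failing LIKE query above could be rewritten with a positional parameter; SQLite.Net's Query&lt;T&gt; accepts '?' placeholders followed by the argument values, so the binding layer handles the string encoding. A sketch against the classes from the question:

```csharp
// Sketch: bind the special characters as a parameter instead of embedding
// them in the SQL string. '?' is a positional placeholder; Query<T>
// already returns a List<T>, so no ToList is needed.
List<ass> sqlass = dbConn.Query<ass>(
    "SELECT sortname FROM ass WHERE sortname LIKE ?",
    "%ä%");
```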

LinqToExcel returns null

I have an Excel sheet in xls format; I am using LinqToExcel to read it and then import it into my DB.
The sheet consists of about 3K rows and only 6 columns. I am using AddMapping to map my class properties to the column names.
The problem I have is: the cells of the "web-code" column SOMETIMES come back as null although there is data in the cells.
Here is a sample of the data that comes back as null (screenshot: My Code Watch), and here is a sample where the data comes back correctly (screenshot: My Code Watch).
I have tried applying ExcelColumn attribute for mapping, but no luck!
code:
var factory = new ExcelQueryFactory(_excelFilePath);
factory.AddMapping<ExcelPriceEntity>(x => x.WebCode, "WEB-CODE");
factory.AddMapping<ExcelPriceEntity>(x => x.Type, "TYPE");
factory.AddMapping<ExcelPriceEntity>(x => x.Style, "STYLE");
factory.AddMapping<ExcelPriceEntity>(x => x.Qty, "QTY");
factory.AddMapping<ExcelPriceEntity>(x => x.UnitPrice, "Unit Price");
factory.AddMapping<ExcelPriceEntity>(x => x.Bucket, "WEBCODE W/BUCKET");
factory.StrictMapping = StrictMappingType.ClassStrict;
factory.TrimSpaces = TrimSpacesType.Both;
factory.ReadOnly = true;
var prices = factory.Worksheet<ExcelPriceEntity>(_allPricesSheetName).ToList();
var priccerNP = prices.Where(p => p.Type.Contains("900 ARROW TAPE")).ToList();
My PriceEntity Class:
public class ExcelPriceEntity
{
//[ExcelColumn("TYPE")]
public string Type { get; set; }
public string WebCode { get; set; }
//[ExcelColumn("STYLE")]
public string Style { get; set; }
//[ExcelColumn("QTY")]
public string Qty { get; set; }
//[ExcelColumn("Unit Price")]
public string UnitPrice { get; set; }
//[ExcelColumn("WEBCODE W/BUCKET")]
public string Bucket { get; set; }
}
Alternate Solution:
I ended up saving the Excel sheet as a CSV file, then importing it into a SQL table. Then I used LINQ to SQL to read the data.
Root Cause:
After researching, I found out the problem was that the first cell of this column (web-code) was an integer, and Excel tries to figure out the data type of a column by looking at the first rows!
The following rows of the web-code column contained text data, so it couldn't be parsed as an integer and a null value was assigned instead.
What I could have done is assign a text value to the first cell so Excel would guess the data type as string, but I didn't test that. For anyone reading this answer: try having a text value in your first row if you come across the same problem.
Here, Contains is not like string.Contains. It compares a list of cell values against the exact value you give inside the Contains method. Just try it with the full text "900 AMMONIA STOCK OR CUST...)
Another alternative to alsafoo's solution is to convert the column from "general" to "text".
These are the steps:
1. Right-click on any cell in the column.
2. Select Format Cells.
3. In the Number tab, select Text.
4. Press OK.
After that, the library will read all values as strings.

How can I query a table by its property's specific value when the property is a list?

I have a class named Quote and QuoteInfo. And the Quote class has a List of QuoteInfo. Like this:
public class Quote
{
...
public virtual List<QuoteInfo> QuoteInfo { get; set; }
...
}
Also my QuoteInfo class has a Language property. Like this:
public class QuoteInfo
{
...
public virtual Language Language { get; set; }
...
}
As you can see when I query my Quotes like this...
var quotes = dbContext.Quotes.ToList();
... all the QuoteInfos come with it (lazy loading enabled, of course). But I just want to get the QuoteInfos with a specific language. How can I do that in one query?
Thanks in advance.
Edit: For example, I have a Quote with 2 QuoteInfos. What I want is to get the Quote together with a QuoteInfo list containing only the one with the specific language, i.e. 1 Quote whose QuoteInfo list has a count of 1.
Try this:
var quotes = dbContext.Quotes
    .Select(x => new
    {
        Quote = x,
        QuoteInfos = x.QuoteInfo.Where(y => y.Language == myPreferedLanguage)
    });
It will select into a new list where you have the Quote and only the QuoteInfos that match myPreferedLanguage.
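Note that this projection returns anonymous objects rather than Quote entities. If you need real Quote instances whose info lists contain only the matching language, you can fix them up in memory after materializing. A self-contained sketch of that shape (the class and property names here are illustrative, simplified from the question; with LINQ to Entities the Where filter in the first Select translates to SQL, while the rebuild step runs in memory):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class QuoteInfo
{
    public string Language { get; set; }
    public string Text { get; set; }
}

class Quote
{
    public int Id { get; set; }
    public List<QuoteInfo> Infos { get; set; } = new List<QuoteInfo>();
}

static class QuoteFilter
{
    // Project quote + filtered infos, then rebuild Quote objects so each
    // one carries only the infos in the requested language.
    public static List<Quote> WithLanguage(IEnumerable<Quote> quotes, string language)
    {
        return quotes
            .Select(q => new
            {
                Quote = q,
                Infos = q.Infos.Where(i => i.Language == language).ToList()
            })
            .Select(x => new Quote { Id = x.Quote.Id, Infos = x.Infos })
            .ToList();
    }
}
```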
You need to query the QuoteInfo entities directly using the DbContext if you want to apply filtering.
int quoteId = ; // Id of the Quote you want to get the infos
var myLanguage = ; // lange of the infos you want to get
var infos = dbContext.QuoteInfos.Where(p=>p.QuoteId == quoteId && p.Language == myLanguage);
Using navigation properties (in your case QuoteInfo) will always load the whole set of related entities; any additional filtering using a Where statement is applied in memory.
Update:
If I understand your comment correctly, you want to get all Quotes where an info of the Quote matches a specific language. For this case you can use the following query:
var languageDependantQuote = from quote in dbContext.Quotes
                             from info in quote.QuoteInfos
                             where info.Language == myLanguage
                             select quote;

Best way to dynamically get column names from oracle tables

We are using an extractor application that will export data from the database to csv files. Based on some condition variable it extracts data from different tables, and for some conditions we have to use UNION ALL as the data has to be extracted from more than one table. So to satisfy the UNION ALL condition we are using nulls to match the number of columns.
Right now all the queries in the system are pre-built based on the condition variable. The problem is whenever there is change in the table projection (i.e new column added, existing column modified, column dropped) we have to manually change the code in the application.
Can you please give some suggestions how to extract the column names dynamically so that any changes in the table structure do not require change in the code?
My concern is the condition that decides which table to query. The variable condition is
like
if the condition is A, then load from TableX
if the condition is B then load from TableA and TableY.
We must know from which table we need to get data. Once we know the table it is straightforward to query the column names from the data dictionary. But there is one more condition, which is that some columns need to be excluded, and these columns are different for each table.
I am trying to solve the problem only for dynamically generating the list of columns. But my manager told me to design a solution at the conceptual level rather than just fixing this instance. This is a very big system with providers and consumers constantly loading and consuming data, so he wanted a solution that is general.
So what is the best way of storing the condition, table name, and excluded columns? One way is storing them in the database. Are there any other ways? If so, which is best? I have to present at least a couple of ideas before finalizing.
Thanks,
A simple query like this lets you list the column names of a table in Oracle:
SELECT column_name FROM user_tab_columns WHERE table_name = 'EMP'
Use it in your code :)
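Once the column names are loaded from the data dictionary, the extractor's SELECT list can be assembled in code from that list minus the per-table exclusions. A self-contained sketch of the string-building step (the table and column names are made up; the column list would come from the query above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class QueryBuilder
{
    // Build "SELECT col1, col2, ... FROM table" from the full column list
    // (as returned by user_tab_columns) minus the excluded columns.
    public static string BuildSelect(string table, IEnumerable<string> allColumns,
                                     ISet<string> excluded)
    {
        var cols = allColumns.Where(c => !excluded.Contains(c));
        return "SELECT " + string.Join(", ", cols) + " FROM " + table;
    }
}
```

A query regenerated this way picks up new or dropped columns automatically, as long as the exclusion list is kept up to date.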
Ok, MNC, try this for size (paste it into a new console app):
using System;
using System.Collections.Generic;
using System.Linq;
using Test.Api;
using Test.Api.Classes;
using Test.Api.Interfaces;
using Test.Api.Models;
namespace Test.Api.Interfaces
{
public interface ITable
{
int Id { get; set; }
string Name { get; set; }
}
}
namespace Test.Api.Models
{
public class MemberTable : ITable
{
public int Id { get; set; }
public string Name { get; set; }
}
public class TableWithRelations
{
public MemberTable Member { get; set; }
// list to contain partnered tables
public IList<ITable> Partner { get; set; }
public TableWithRelations()
{
Member = new MemberTable();
Partner = new List<ITable>();
}
}
}
namespace Test.Api.Classes
{
public class MyClass
{
private readonly IList<TableWithRelations> _tables;
public MyClass()
{
// tableA stuff
var tableA = new TableWithRelations { Member = { Id = 1, Name = "A" } };
var relatedclasses = new List<ITable>
{
new MemberTable
{
Id = 2,
Name = "B"
}
};
tableA.Partner = relatedclasses;
// tableB stuff
var tableB = new TableWithRelations { Member = { Id = 2, Name = "B" } };
relatedclasses = new List<ITable>
{
new MemberTable
{
Id = 3,
Name = "C"
}
};
tableB.Partner = relatedclasses;
// tableC stuff
var tableC = new TableWithRelations { Member = { Id = 3, Name = "C" } };
relatedclasses = new List<ITable>
{
new MemberTable
{
Id = 2,
Name = "D"
}
};
tableC.Partner = relatedclasses;
// tableD stuff
var tableD = new TableWithRelations { Member = { Id = 3, Name = "D" } };
relatedclasses = new List<ITable>
{
new MemberTable
{
Id = 1,
Name = "A"
},
new MemberTable
{
Id = 2,
Name = "B"
},
};
tableD.Partner = relatedclasses;
// add tables to the base tables collection
_tables = new List<TableWithRelations> { tableA, tableB, tableC, tableD };
}
public IList<ITable> Compare(int tableId, string tableName)
{
return _tables.Where(table => table.Member.Id == tableId
&& table.Member.Name == tableName)
.SelectMany(table => table.Partner).ToList();
}
}
}
namespace Test.Api
{
public class TestClass
{
private readonly MyClass _myclass;
private readonly IList<ITable> _relatedMembers;
public IList<ITable> RelatedMembers
{
get { return _relatedMembers; }
}
public TestClass(int id, string name)
{
this._myclass = new MyClass();
// the Compare method would take your two parameters and return
// a matching set of related tables
_relatedMembers = _myclass.Compare(id, name);
// now do something with the resulting list
}
}
}
class Program
{
static void Main(string[] args)
{
// change these values to suit, along with rules in MyClass
var id = 3;
var name = "D";
var testClass = new TestClass(id, name);
Console.Write(string.Format("For Table{0} on Id{1}\r\n", name, id));
Console.Write("----------------------\r\n");
foreach (var relatedTable in testClass.RelatedMembers)
{
Console.Write(string.Format("Related Table{0} on Id{1}\r\n",
relatedTable.Name, relatedTable.Id));
}
Console.Read();
}
}
I'll get back in a bit to see if it fits or not.
So what you are really after is designing a rule engine for building dynamic queries. This is no small undertaking. The requirements you have provided are:
Store rules (what you call a "condition variable")
Each rule selects from one or more tables
Additionally some rules specify columns to be excluded from a table
Rules which select from multiple tables are satisfied with the UNION ALL operator; tables whose projections do not match must be brought into alignment with null columns.
Some possible requirements you don't mention:
Format masking e.g. including or excluding the time element of DATE columns
Changing the order of columns in the query's projection
The previous requirement is particularly significant when it comes to the multi-table rules, because the projections of the tables need to match by datatype as well as number of columns.
Following on from that, the padding NULL columns may not necessarily be tacked on to the end of the projection e.g. a three column table may be mapped to a four column table as col1, col2, null, col3.
Some multi-table queries may need to be satisfied by joins rather than set operations.
Rules for adding WHERE clauses.
A mechanism for defining default sets of excluded columns (i.e. ones which are applied every time a table is queried).
I would store these rules in database tables. Because they are data and storing data is what databases are for. (Unless you already have a rules engine to hand.)
Taking the first set of requirements you need three tables:
RULES
-----
RuleID
Description
primary key (RuleID)
RULE_TABLES
-----------
RuleID
Table_Name
Table_Query_Order
All_Columns_YN
No_of_padding_cols
primary key (RuleID, Table_Name)
RULE_EXCLUDED_COLUMNS
---------------------
RuleID
Table_Name
Column_Name
primary key (RuleID, Table_Name, Column_Name)
I've used compound primary keys just because it's easier to work with them in this context e.g. running impact analyses; I wouldn't recommend it for regular applications.
I think all of these are self-explanatory except the additional columns on RULE_TABLES.
Table_Query_Order specifies the order in which the tables appear in UNION ALL queries; this matters only if you want to use the column_names of the leading table as headings in the CSV file.
All_Columns_YN indicates whether the query can be written as SELECT * or whether you need to query the column names from the data dictionary and the RULE_EXCLUDED_COLUMNS table.
No_of_padding_cols is a simplistic implementation for matching projections in those UNION ALL queries, by specifying how many NULLs to add to the end of the column list.
I'm not going to tackle those requirements you didn't specify because I don't know whether you care about them. The basic thing is, what your boss is asking for is an application in its own right. Remember that as well as an application for generating queries you're going to need an interface for maintaining the rules.
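To make the rule tables concrete, here is a sketch of the generator that a RULE_TABLES row set could drive, using the simplistic trailing-NULL padding described above (the RuleTable class is an in-memory stand-in for a row of that table, with exclusions already applied):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A row of the RULE_TABLES design above, reduced to what the generator needs.
class RuleTable
{
    public string TableName { get; set; }
    public List<string> Columns { get; set; }   // after exclusions are applied
    public int PaddingCols { get; set; }        // No_of_padding_cols
}

static class UnionBuilder
{
    // Emit one SELECT per table (in Table_Query_Order), padding each
    // projection with NULLs, and glue them together with UNION ALL.
    public static string Build(IEnumerable<RuleTable> tables)
    {
        var selects = tables.Select(t =>
            "SELECT " + string.Join(", ",
                t.Columns.Concat(Enumerable.Repeat("NULL", t.PaddingCols))) +
            " FROM " + t.TableName);
        return string.Join(" UNION ALL ", selects);
    }
}
```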
MNC,
How about creating, up front, a dictionary of all the known tables involved in the application process (irrespective of the combinations), keyed on table name? The members of this dictionary would be an IList<string> of the column names. This would allow you to compare two tables both on the number of columns present (dicTable[myVarTableName].Count) and by iterating over dicTable[myVarTableName] to pull out the column names.
At the end, you could use a little LINQ to determine the table with the greatest number of columns and create the structure with nulls accordingly.
Hope this gives food for thought..

Parsing a CSV formatted text file

I have a text file that looks like this:
1,Smith, 249.24, 6/10/2010
2,Johnson, 1332.23, 6/11/2010
3,Woods, 2214.22, 6/11/2010
1,Smith, 219.24, 6/11/2010
I need to be able to find the balance for a client on a given date.
I'm wondering if I should:
A. Start from the end and read each line into an Array, one at a time.
Check the last name index to see if it is the client we're looking for.
Then, display the balance index of the first match.
or
B. Use RegEx to find a match and display it.
I don't have much experience with RegEx, but I'll learn it if it's a no brainer in a situation like this.
I would recommend using the FileHelpers opensource project:
http://www.filehelpers.net/
Piece of cake:
Define your class:
[DelimitedRecord(",")]
public class Customer
{
public int CustId;
public string Name;
public decimal Balance;
[FieldConverter(ConverterKind.Date, "M/d/yyyy")]
public DateTime AddedDate;
}
Use it:
var engine = new FileHelperAsyncEngine<Customer>();
// Read
using(engine.BeginReadFile("TestIn.txt"))
{
// The engine is IEnumerable
foreach(Customer cust in engine)
{
// your code here
Console.WriteLine(cust.Name);
// your condition >> add balance
}
}
This looks like a pretty standard CSV-type layout, which is easy enough to process. You could actually do it with ADO.NET and the Jet provider, but I think it is probably easier in the long run to process it yourself.
So first off, you want to process the actual text data. Assuming each record is separated by some newline character, you can use the ReadLine method to easily get each record:
var reader = new StreamReader(@"C:\Path\To\file.txt");
while (true)
{
    var line = reader.ReadLine();
    if (string.IsNullOrEmpty(line))
        break;
    // Process Line
}
And then to process each line, you can split the string on comma, and store the values into a data structure. So if you use a data structure like this:
public class MyData
{
public int Id { get; set; }
public string Name { get; set; }
public decimal Balance { get; set; }
public DateTime Date { get; set; }
}
And you can process the line data with a method like this:
public MyData GetRecord(string line)
{
var fields = line.Split(',');
return new MyData()
{
Id = int.Parse(fields[0]),
Name = fields[1],
Balance = decimal.Parse(fields[2]),
Date = DateTime.Parse(fields[3])
};
}
Now, this is the simplest example, and doesn't account for cases where the fields may be empty, in which case you would either need to support NULL for those fields (using nullable types int?, decimal? and DateTime?), or define some default value that would be assigned to those values.
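A defensive variant of that parser along those lines might look like this (a sketch using nullable fields and TryParse; field order follows the sample file):

```csharp
using System;

public class MyDataSafe
{
    public int? Id { get; set; }
    public string Name { get; set; }
    public decimal? Balance { get; set; }
    public DateTime? Date { get; set; }
}

public static class SafeParser
{
    // Parse one CSV line, leaving a field null when it is empty or malformed
    // instead of throwing like int.Parse/decimal.Parse would.
    public static MyDataSafe GetRecord(string line)
    {
        var f = line.Split(',');
        var rec = new MyDataSafe { Name = f[1].Trim() };
        if (int.TryParse(f[0], out var id)) rec.Id = id;
        if (decimal.TryParse(f[2], out var bal)) rec.Balance = bal;
        if (DateTime.TryParse(f[3], out var date)) rec.Date = date;
        return rec;
    }
}
```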
So once you have that you can store the collection of MyData objects in a list, and easily perform calculations based on that. So given your example of finding the balance on a given date you could do something like:
var data = customerDataList.First(d => d.Name == customerNameImLookingFor
&& d.Date == dateImLookingFor);
Where customerDataList is the collection of MyData objects read from the file, customerNameImLookingFor is a variable containing the customer's name, and dateImLookingFor is a variable containing the date.
I've used this technique to process data in text files in the past for files ranging from a couple records, to tens of thousands of records, and it works pretty well.
I think the cleanest way is to load the entire file into an array of custom objects and work with that. For 3 MB of data, this won't be a problem. If you wanted to do completely different search later, you could reuse most of the code. I would do it this way:
class Record
{
public int Id { get; protected set; }
public string Name { get; protected set; }
public decimal Balance { get; protected set; }
public DateTime Date { get; protected set; }
public Record (int id, string name, decimal balance, DateTime date)
{
Id = id;
Name = name;
Balance = balance;
Date = date;
}
}
…
Record[] records = (from line in File.ReadAllLines(filename)
                    let fields = line.Split(',')
                    select new Record(
                        int.Parse(fields[0]),
                        fields[1],
                        decimal.Parse(fields[2]),
                        DateTime.Parse(fields[3])
                    )).ToArray();
Record wantedRecord = records.Single
    (r => r.Name == clientName && r.Date == givenDate);
Note that both your options will scan the file. That is fine if you only want to search in the file for 1 item.
If you need to search for multiple client/date combinations in the same file, you could parse the file into a Dictionary<string, Dictionary<DateTime, decimal>> first.
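That index can be built with a couple of lines of LINQ. A self-contained sketch (the Txn record shape is illustrative; this assumes at most one balance per client per date, since duplicate keys would make ToDictionary throw):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Txn
{
    public string Name { get; set; }
    public DateTime Date { get; set; }
    public decimal Balance { get; set; }
}

static class BalanceIndex
{
    // Group once by client, then by date, so repeated lookups are constant
    // time instead of rescanning the whole file.
    public static Dictionary<string, Dictionary<DateTime, decimal>> Build(IEnumerable<Txn> txns)
    {
        return txns
            .GroupBy(t => t.Name)
            .ToDictionary(
                g => g.Key,
                g => g.ToDictionary(t => t.Date, t => t.Balance));
    }
}
```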
A direct answer: for a one-off, a RegEx will probably be faster.
If you're just reading it, I'd consider reading the whole file into memory using StreamReader.ReadToEnd and treating it as one long string to search through; when you find a record you want to look at, just look for the previous and next line breaks and you have the transaction row you want.
If it's on a server, or the file is refreshed all the time, this might not be a good solution though.
If it's all well-formatted CSV like this, then I'd use something like the Microsoft.VisualBasic.FileIO.TextFieldParser class or the Fast CSV class over on CodeProject to read it all in.
The data type is a little tricky because I imagine not every client has a record for every day. That means you can't just look up an exact date in a plain nested dictionary. Instead, you want to "index" by name first and then by date, using a sorted structure for the dates. I think I'd go for something like this as I read in each record:
Dictionary<string, SortedList<DateTime, double>>
Hey, why not do it with this great project on CodeProject, LINQ to CSV? Way cool, and rock solid.
