I import fuel transactions from an Excel file into an SQL database.
One column from the Excel file is giving me trouble: "product synonym".
// Match to product
FuelProductTypeSynonym typeSyn = ctx.FuelProductTypeSynonym.SingleOrDefault(s => s.Name.ToUpper() == row.Product.ToUpper());
if (typeSyn == null)
{
    result.Errors.Add("[" + fileName + "] (row " + currentRow + "): product match not found.");
}
If the product synonym doesn't match what's in the database, the program stops. In my case, I have a product "FOOD/BEVERAGE", and it seems the "/" in the name isn't coming through from Excel correctly.
How do I get around this "/" so SQL understands that name?
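A quick way to narrow this down is to dump exactly what was read from the cell before the lookup; stray whitespace or an invisible character, rather than the "/" itself, may be breaking the match. A diagnostic sketch (a guess, not a confirmed fix):

// Diagnostic sketch: show every character read from the Excel cell.
string raw = row.Product ?? "";
Console.WriteLine("Raw value: [" + raw + "], length " + raw.Length);
foreach (char c in raw)
    Console.WriteLine("'" + c + "' = " + (int)c);

// Trimming both sides before comparing sidesteps whitespace issues:
FuelProductTypeSynonym typeSyn = ctx.FuelProductTypeSynonym
    .SingleOrDefault(s => s.Name.Trim().ToUpper() == raw.Trim().ToUpper());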
I need to create an Excel file in which the user can later adjust specific information. I'm using C# and EPPlus v6.0.4. For example, if the input is a list of products, I want each description joined with a ' taxes included' string (which the user can later change in Excel):
descriptionA -> descriptionA taxes included
descriptionB -> descriptionB taxes included
descriptionC -> descriptionC taxes included
I'm assuming two worksheets: ws1 (parameter) and ws2 (output list). As shown below, cell B1 is where the user will be able to change the "taxes included" string.
ws1.Cells["A1"].Value = "Additional information:";
excel.Workbook.Names.Add("auxData", ws1.Cells["B1"]);
ws1.Cells["B1"].Value = " taxes included 12%";
On the second worksheet (ws2) the data will be populated.
int excelLine = 1;
foreach (var product in productList)
{
    string productDescription = product.Description;
    ws2.Cells["A" + excelLine].Formula = .....; //need ideas on how to solve this
    excelLine++;
}
For the .Formula above I tried CONCAT and similar functions, but it's not working (the Excel file is generated with errors, or the formula is not accepted).
The expected output is a cell value ="product full description variable string" & auxData, so it picks up any changes the user makes to the B1 text (auxData is an Excel name pointing to =ws1!$B$1).
My own solution was:
int excelLine = 1;
foreach (var product in productList)
{
    string cellDescription = "=\"" + product.Description + " -- additional information \" & auxData & \"%\"";
    ws2.Cells["A" + excelLine].Formula = cellDescription;
    excelLine++;
}
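One caveat with building formulas this way: Excel escapes a literal double quote inside a formula string by doubling it, so if a product description can itself contain quotes, it is safer to sanitise it first. A small addition to the loop above:

// Double any embedded quotes so the generated formula stays valid.
string safeDescription = product.Description.Replace("\"", "\"\"");
string cellDescription = "=\"" + safeDescription + " -- additional information \" & auxData & \"%\"";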
I have a list of invoices that I transferred to an Excel spreadsheet.
All the columns are created into the spreadsheet except for the Job Date column. That is blank in the spreadsheet.
Here's the code:
string Directory = ConfigurationSettings.AppSettings["DownloadDestination"] + Company.Current.CompCode + "\\";
string FileName = DataUtils.CreateDefaultExcelFile(Company.Current.CompanyID, txtInvoiceID.Value, Directory);
FileInfo file = new FileInfo(FileName);
Response.Clear();
Response.ContentType = "application/x-download";
Response.AddHeader("Content-Length", file.Length.ToString());
Response.AddHeader("Content-Disposition", "attachment; filename=" + file.Name);
Response.CacheControl = "public";
Response.TransmitFile(file.FullName);
Response.Flush();
Context.ApplicationInstance.CompleteRequest();
public static string CreateDefaultExcelFile(int CompanyID, string InvoiceNo, string CreateDirectory)
{
    List<MySqlParameter> param = new List<MySqlParameter> {
        new MySqlParameter("CompanyID", CompanyID),
        new MySqlParameter("InvoiceNo", InvoiceNo)
    };
    DataTable result = BaseDisplaySet.CustomFill(BaseSQL, param);
    string FileName = CreateDirectory + "InvoiceFile_" + DateTime.Now.ToString("yyyyMMddhhmmssff") + ".xlsx";
    XLWorkbook workbook = new XLWorkbook();
    workbook.Worksheets.Add(result, "Bulk Invoices");
    workbook.SaveAs(FileName);
    return FileName;
}
private const string BaseSQL = " SELECT q.InvoiceNo AS InvoiceNumber, j.JobNo, j.JobDate AS JobDate, " +
" (SELECT Name FROM job_address WHERE AddressType = 6 AND JobID = j.ID LIMIT 0,1) AS DebtorName, " +
" (SELECT CONCAT(Name,CONCAT(',',Town)) FROM job_address WHERE AddressType = 3 AND JobID = j.ID LIMIT 0,1) AS CollectFrom, " +
" (SELECT CONCAT(Name,CONCAT(',',Town)) FROM job_address WHERE AddressType = 2 AND JobID = j.ID LIMIT 0,1) AS DeliverTo, " +
" deladd.Town AS DeliverToTown, deladd.County AS DeliveryToCounty, " +
" (SELECT DocketNo FROM job_dockets WHERE JobID = j.ID LIMIT 0,1) AS DocketNo, " +
" SUM(j.DelAmt) AS DelAmount, " +
" (SELECT CAST(group_concat(DISTINCT CONCAT(AdvisedQty,' ',PieceType) separator ',') AS CHAR(200)) FROM job_pieces WHERE JobID = j.ID GROUP BY JobID ) AS PieceBreakDown " +
" FROM Invoice q " +
" LEFT JOIN customer c ON q.accountcode = c.ID " +
" INNER JOIN job_new j ON q.JobID = j.ID " +
" LEFT JOIN job_address coladd ON coladd.JobID = j.ID AND coladd.AddressType = 3 " +
" LEFT JOIN job_address deladd ON deladd.JobID = j.ID AND deladd.AddressType = 2 " +
" WHERE q.IsActive = 1 AND q.Company_ID = ?CompanyID AND q.InvoiceNo = ?InvoiceNo " +
" group by j.id";
The SQL returns all the correct information, and the job date is present in the query results. But when I open the Excel file after it is created, the job date column is blank.
You should convert JobDate in BaseSQL to a string.
A sample is given below (SQL Server syntax). You can use it to get an idea of how to convert a datetime to varchar.
DECLARE @myDateTime DATETIME
SET @myDateTime = '2008-05-03'
--
-- Convert to string
--
SELECT LEFT(CONVERT(VARCHAR, @myDateTime, 120), 10)
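Since BaseSQL in the question is MySQL (note the ?-style parameters and LIMIT), the equivalent conversion there is DATE_FORMAT. A sketch of the relevant change, with the rest of the query left as it was:

private const string BaseSQL = " SELECT q.InvoiceNo AS InvoiceNumber, j.JobNo, " +
    " DATE_FORMAT(j.JobDate, '%Y-%m-%d') AS JobDate, " +
    // ... remainder of the original query unchanged ...
    " group by j.id";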
I don't know what framework you use to export data to Excel or how powerful it is, but I do know that Excel does not directly support dates (surprise!), at least not in XML-based (OpenXml) xlsx documents. It works only with strings and numbers (which are saved in the underlying document as string and number literals).
Considering that, you can use a simple workaround: convert your dates to strings via either cast/convert in SQL or ToString() in C#. You will lose Excel date functionality (like date filters and custom formats), obviously.
However, it is not the only way (cheers!). You can save your data the same way Excel stores it. If your framework does not support that, you will have to do it yourself: the recipe is the same as creating xlsx documents by hand with DocumentFormat.OpenXml.dll.
Actually, Excel uses the "OLE Automation Date" format as its internal representation for dates: a floating-point number whose integral component is the number of days before or after midnight, 30 December 1899, and whose fractional component represents the time on that day divided by 24. This representation is stored in the document as a number literal. Excel distinguishes dates from numbers by the number format of the corresponding cell. With that in mind, you can use a not-so-simple workaround:
First, convert your dates to numbers:
DateTime date = DateTime.Now;
double dateValue = date.ToOADate();
// or, for a time-of-day value:
TimeSpan time = DateTime.Now.TimeOfDay;
double timeValue = (DateTime.FromOADate(0) + time).ToOADate();
The resulting double should then be set as the cell value. For example, you can create a new column with a double datatype in the DataTable, fill it using this transformation, then drop the original DateTime column.
Second, apply a date format to the desired cells. Unfortunately, the required code will differ between frameworks, but the principle is the same:
Locate the corresponding cell range (either CellRange or Cells, maybe Columns)
Set the date format string (via something like range.NumberFormat.Format = "dd/mm/yyyy" or range.NumberFormatString = "dd/mm/yyyy")
If, however, the framework does not support simplified formatting (a very strange framework that would be), you will have to either set range.NumberFormatId = 22 for the standard date format or create a new number format. If you are rather unlucky and the framework is as bare as DocumentFormat.OpenXml, you will have to create a custom CellFormat with the corresponding NumberFormatId (22, or the id of a custom NumberFormat), add it to the stylesheet, and set the StyleIndex for the corresponding range.
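As a concrete example, with ClosedXML (which the question's CreateDefaultExcelFile already uses), the two steps might look like the sketch below; result and workbook are the variables from that method, and the column positions are assumptions based on BaseSQL:

// Step 1: replace the DateTime column with its OLE Automation Date equivalent.
result.Columns.Add("JobDateNum", typeof(double));
foreach (DataRow row in result.Rows)
{
    if (row["JobDate"] != DBNull.Value)
        row["JobDateNum"] = ((DateTime)row["JobDate"]).ToOADate();
}
result.Columns.Remove("JobDate");
result.Columns["JobDateNum"].SetOrdinal(2); // put it back where JobDate was

// Step 2: write the table, then apply a date number format to that column.
var ws = workbook.Worksheets.Add(result, "Bulk Invoices");
ws.Column(3).Style.NumberFormat.Format = "dd/mm/yyyy";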
I don't know if it's worth checking out, but when working with large datasets and DataTables in the past I have usually used ClosedXML to get this done. It's easy to just pass a DataTable and let it handle creating the XLSX. I have it running on my Windows Server 2008 R2 without issue, handling large requests with multiple sheets, so I know it works really well.
https://closedxml.codeplex.com/
I have a SQL data reader that reads 2 columns from a SQL db table.
Once it has done its bit, it then starts again, selecting another 2 columns.
I would pull the whole lot in one go but that presents a whole other set of challenges.
My problem is that the table contains a large amount of data (some 3 million rows or so) which makes working with the entire set a bit of a problem.
I'm trying to validate the field values, so I'm pulling the ID column plus one of the other columns and running each value in that column through a validation pipeline, with the results stored in another database.
My problem is that when the reader hits the end of handling one column, I need to force it to immediately clean up every little block of RAM used, as this process uses about 700MB and there are about 200 columns to go through.
Without a full garbage collect I will definitely run out of RAM.
Anyone got any ideas how I can do this?
I'm using lots of small reusable objects; my thought was that I could just call GC.Collect() at the end of each read cycle and that would flush everything out. Unfortunately, that isn't happening for some reason.
OK, I hope this fits, but here's the method in question...
void AnalyseTable(string ObjectName, string TableName)
{
    Console.WriteLine("Initialising analysis process for SF object \"" + ObjectName + "\"");
    Console.WriteLine(" The data being used is in table [" + TableName + "]");

    // get some helpful stuff from the databases
    SQLcols = Target.GetData("SELECT Column_Name, Is_Nullable, Data_Type, Character_Maximum_Length FROM information_schema.columns WHERE table_name = '" + TableName + "'");
    SFcols = SchemaSource.GetData("SELECT * FROM [" + ObjectName + "Fields]");
    PickLists = SchemaSource.GetData("SELECT * FROM [" + ObjectName + "PickLists]");

    // get the table definition
    DataTable resultBatch = new DataTable();
    resultBatch.TableName = TableName;
    int counter = 0;

    foreach (DataRow Column in SQLcols.Rows)
    {
        if (Column["Column_Name"].ToString().ToLower() != "id")
            resultBatch.Columns.Add(new DataColumn(Column["Column_Name"].ToString(), typeof(bool)));
        else
            resultBatch.Columns.Add(new DataColumn("ID", typeof(string)));
    }

    // create the validation results table
    //SchemaSource.CreateTable(resultBatch, "ValidationResults_");
    // cache the id's from the source table in the validation table
    //CacheIDColumn(TableName);

    // validate the source table
    // iterate through each sql column
    foreach (DataRow Column in SQLcols.Rows)
    {
        // we do this here to save making this call a lot more later
        string colName = Column["Column_Name"].ToString().ToLower();

        // id col is only used to identify records not in validation
        if (colName != "id")
        {
            // prepare to process
            counter = 0;
            resultBatch.Rows.Clear();
            resultBatch.Columns.Clear();
            resultBatch.Columns.Add(new DataColumn("ID", typeof(string)));
            resultBatch.Columns.Add(new DataColumn(colName, typeof(bool)));

            // identify matching SF col
            foreach (DataRow SFDefinition in SFcols.Rows)
            {
                // case insensitive compare on the col name to ensure we have a match ...
                if (SFDefinition["Name"].ToString().ToLower() == colName)
                {
                    // select the id column and the column data to validate (current column data)
                    using (SqlCommand com = new SqlCommand("SELECT ID, [" + colName + "] FROM [" + TableName + "]", new SqlConnection(ConfigurationManager.ConnectionStrings["AnalysisTarget"].ConnectionString)))
                    {
                        com.Connection.Open();
                        SqlDataReader reader = com.ExecuteReader();
                        Console.WriteLine(" Validating column \"" + colName + "\"");

                        // foreach row in the given object dataset
                        while (reader.Read())
                        {
                            // create a new validation result row
                            DataRow result = resultBatch.NewRow();
                            bool hasFailed = false;

                            // validate it
                            object vResult = ValidateFieldValue(SFDefinition, reader[Column["Column_Name"].ToString()]);

                            // if we have the relevant col definition lets decide how to validate this value ...
                            result[colName] = vResult;
                            if (vResult is bool)
                            {
                                // if it's deemed to have failed validation mark it as such
                                if (!(bool)vResult)
                                    hasFailed = true;
                            }

                            // no point in adding rows we can't trace
                            if (reader["id"] != DBNull.Value && reader["id"] != null)
                            {
                                // add the failed row to the result set
                                if (hasFailed)
                                {
                                    result["id"] = reader["id"];
                                    resultBatch.Rows.Add(result);
                                }
                            }

                            // submit to db in batches of 200
                            if (resultBatch.Rows.Count > 199)
                            {
                                counter += resultBatch.Rows.Count;
                                Console.Write(" Result batch completed,");
                                SchemaSource.Update(resultBatch, "ValidationResults_");
                                Console.WriteLine(" committed " + counter.ToString() + " fails to the database so far.");
                                Console.SetCursorPosition(0, Console.CursorTop - 1);
                                resultBatch.Rows.Clear();
                            }
                        }

                        // get rid of these likely very heavy objects
                        reader.Close();
                        reader.Dispose();
                        com.Connection.Close();
                        com.Dispose();

                        // ensure .Net does a full cleanup because we will need the resources.
                        GC.Collect();

                        if (resultBatch.Rows.Count > 0)
                        {
                            counter += resultBatch.Rows.Count;
                            Console.WriteLine(" All batches for column complete,");
                            SchemaSource.Update(resultBatch, "ValidationResults_");
                            Console.WriteLine(" committed " + counter.ToString() + " fails to the database.");
                        }
                    }
                }
            }
        }

        Console.WriteLine(" Completed processing column \"" + colName + "\"");
        Console.WriteLine("");
    }

    Console.WriteLine("Object processing complete.");
}
Could you post some code? .NET's data reader is supposed to be a 'fire-hose' that is stingy on RAM unless, as Freddy suggests, your column-data values are large. How long does this validation + DB write take?
In general, if a GC is needed and can be done, it will be done. I may sound like a broken record, but if you have to call GC.Collect(), something else is wrong.
Open the reader with SequentialAccess; it might give you the behavior you need. Also, assuming the large value is a blob, you might be better off reading it in chunks.
Provides a way for the DataReader to handle rows that contain columns with large binary values. Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. You can then use the GetBytes or GetChars method to specify a byte location to start the read operation, and a limited buffer size for the data being returned.
When you specify SequentialAccess, you are required to read from the columns in the order they are returned, although you are not required to read each column. Once you have read past a location in the returned stream of data, data at or before that location can no longer be read from the DataReader. When using the OleDbDataReader, you can reread the current column value until reading past it. When using the SqlDataReader, you can read a column value only once.
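In the method above, that would mean opening the reader roughly like this; the chunk size and column ordinals are illustrative, not prescriptive:

// Requires System.Data for CommandBehavior. With SequentialAccess,
// columns must be read in the order they were selected.
using (SqlDataReader reader = com.ExecuteReader(CommandBehavior.SequentialAccess))
{
    while (reader.Read())
    {
        string id = reader.GetString(0); // read column 0 before touching column 1

        // Stream the (potentially large) second column in chunks rather
        // than materialising the whole value in memory.
        byte[] buffer = new byte[8192];
        long offset = 0;
        long bytesRead;
        while ((bytesRead = reader.GetBytes(1, offset, buffer, 0, buffer.Length)) > 0)
        {
            offset += bytesRead;
            // validate/process buffer[0..bytesRead) here
        }
    }
}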
I am using the Oledb driver to load data from Excel for display in a tab control with datagrids.
I am using the following loop to load data from every sheet:
foreach (string str in sheets)
{
    string query = "SELECT * FROM [" + str + "]";
    adapter.SelectCommand.CommandText = query;
    adapter.Fill(ds, str);
}
It works well until a sheet name starts with a number. I have an Excel file whose sheet name is 26203 REV C (EVK). It throws the error: The Microsoft Jet database engine could not find the object ''26203 REV C [EVK]$'_'. Make sure the object exists and that you spell its name and the path name correctly. What can be the remedy to the problem? I do not have any control over the sheet names.
Try using backticks (`) instead of brackets:
SELECT * FROM `26203 REV C (EVK)$`
Remember that you also need to include the dollar symbol suffix when selecting.
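Applied to the loop above, assuming each name in sheets already carries the trailing $, that would look like:

// Backticks tolerate sheet names that start with digits or contain spaces.
string query = "SELECT * FROM `" + str + "`";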
Is there a way to add rows from a DataTable to an Excel spreadsheet without iterating through a SQL INSERT statement, as below? Are there any alternative methods?
foreach (DataRow drow in DT.Rows)
{
    cmd.CommandText = "INSERT INTO [LHOME$]([Date], [Owner], [MAKE], [BUY], [OVERAGE], [SUM]) " +
        "VALUES('" + drow.ItemArray[0].ToString() + "','" + drow.ItemArray[1].ToString() + "','" + drow.ItemArray[2].ToString() +
I'm trying to work around the data types going into the Excel worksheet, because all my numbers, which are saved as strings, are inserted as text with a conversion prompt on each cell when I write to the worksheet as is. The data types are all strings, but I'm looking for a way that doesn't require specifying the data type: ideally, a way to just push my changes from a DataTable and commit them to the spreadsheet.
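For what it's worth, the ClosedXML pattern shown earlier in this thread can push a whole DataTable in one call, with cell types inferred from the column types. A minimal sketch (the file and sheet names are placeholders):

// ClosedXML writes typed DataTable columns as native Excel types,
// so numeric columns come through as numbers rather than text.
using (XLWorkbook workbook = new XLWorkbook())
{
    workbook.Worksheets.Add(DT, "LHOME");
    workbook.SaveAs("LHOME.xlsx");
}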
Here's a shot in the dark, as I've never used IronPython and you may be asking for a C#-specific solution, but you could try IronPython with this Python library made for accessing Excel:
http://www.python-excel.org/
Here's some example code to get a feel for how easy it is to work with:
import xlwt

wb = xlwt.Workbook()
ws = wb.add_sheet('A Test Sheet')
ws.write(2, 0, 1)  # row 2, column 0, value 1
ws.write(2, 1, 1)  # row 2, column 1, value 1
wb.save('example.xls')
I'm not sure what problem you're trying to solve, but maybe this can help you out.