ClosedXML Copy and Paste Range of formulas as values - c#

I have a function ApplyFormulas() that applies formulas like so:
detailWs.Range(companyModel.RevenueFormulaRangeDollars).FormulaR1C1 = companyModel.RevenueFormulaDollars;
However, now I need to copy that range and paste it in the same spot so that the cells hold real values and not just formula references.
I am able to do this in VBA with Excel Interop, but I am using ClosedXML. Does anyone know of a way to do this? I tried CopyTo(), but there is no paste special option.
I also attempted:
detailWs.Range(companyModel.NoChargeFormulaRangePercent).Value = detailWs.Range(companyModel.NoChargeFormulaRangePercent).Value;
but I'm getting a "property or indexer cannot be used because it lacks a getter" error, even though, from what I can tell, both have a get; set; property.
I've tried a couple of other things and it is still not working:
var test = detailWs.Range(companyModel.NoChargeFormulaRangePercent).CellsUsed();
foreach(var c in test)
{
c.Value = c.Value.ToString();
}

Here's what I created a few months ago to copy all the formulas in one worksheet to another.
Note: I am having a problem where some formulas that use a Name are not copied correctly, because something treats the Name (e.g. =QF00) as a cell reference and changes it the way AutoFill would. I will update this when I figure it out.
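// Assumes the alias: using cx = ClosedXML.Excel;
static void CopyFormulas(cx.IXLRange src_range, cx.IXLWorksheet out_buff)
{
    // Select only the cells that contain a formula.
    cx.IXLCells tRows = src_range.CellsUsed(x => x.FormulaA1.Length > 0);
    foreach (cx.IXLCell v in tRows)
    {
        // Write the formula to the same address on the destination sheet.
        string cell_address = v.Address.ToString();
        out_buff.Range(cell_address).FormulaA1 = v.FormulaA1;
    }
}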
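For the original question (replacing formulas with their computed values in place), one approach that may work is to read each formula cell's Value and assign it straight back. This is a minimal sketch, not a confirmed ClosedXML recipe: it assumes that reading Value evaluates the formula and that assigning Value discards it, which is worth verifying against the ClosedXML version you use. The class name, sheet name and cell addresses are made up for illustration.

```csharp
using System;
using System.Linq;
using ClosedXML.Excel;

public static class FormulaFlattener
{
    // Replaces every formula in the range with its computed value.
    public static void ReplaceFormulasWithValues(IXLRange range)
    {
        // Materialize the list first so cells are not modified while enumerating.
        var formulaCells = range.CellsUsed(c => c.HasFormula).ToList();
        foreach (IXLCell cell in formulaCells)
        {
            var computed = cell.Value; // assumption: the getter evaluates the formula
            cell.Value = computed;     // assumption: the setter discards the formula
        }
    }

    public static void Main()
    {
        using (var workbook = new XLWorkbook())
        {
            // Hypothetical in-memory sheet, so no file is needed.
            IXLWorksheet ws = workbook.Worksheets.Add("Detail");
            ws.Cell("A1").Value = 2;
            ws.Cell("A2").Value = 3;
            ws.Cell("A3").FormulaA1 = "A1+A2";

            ReplaceFormulasWithValues(ws.Range("A1:A3"));

            // Shows whether A3 still carries a formula after the round trip.
            Console.WriteLine(ws.Cell("A3").HasFormula);
        }
    }
}
```

Unlike the c.Value = c.Value.ToString() attempt in the question, this keeps the computed value's type instead of converting everything to text.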

Related

Read excel null/blank values

I am trying to read Excel null/blank values.
I have looked into hundreds of solutions, and either I am implementing them wrong or they just do not work; the result is Microsoft.CSharp.RuntimeBinder.RuntimeBinderException: 'Cannot perform runtime binding on a null reference'.
This is one of the last pieces of code I tried (I was trying to put NA in all the null cells):
for (int i = 2; i <= rowCount; i++)
{
string natext = xlRange.Value2[rowCount, colCount];
if (natext == null)
{
natext = "NA";
}
Any ideas that can help me, with some examples?
If I click the details, it shows:
Microsoft.CSharp.RuntimeBinder.RuntimeBinderException
HResult=0x80131500 Message=Cannot perform runtime binding on a null
reference Source=
StackTrace:
First, the Excel object model is really weird. Value2 returns an object, and that object can be of all sorts of different types. If xlRange is a cell, then it returns the value of that cell, which could be a string or a double or something else. If xlRange is multiple cells then that object is an array of values. And then each of those values is an object. For each value you don't know if it's a string or a double or something else.
That's not fun to deal with. It's actually really, really bad. C# is a strongly-typed language, which means that you know what type everything is and you don't have to guess. Excel Interop takes that away from you and says, "Here's an object. It could be anything or lots of things that could each be anything. Figure it out. Good luck."
Instead of getting the Value2 property of the range and then looping through the array, it's much easier to deal with the cells in the range instead.
Given that excelRange is a Range of cells:
for (var row = 1; row <= excelRange.Rows.Count; row++)
{
for (var column = 1; column <= excelRange.Columns.Count; column++)
{
var cellText = excelRange[row, column].Text.ToString();
}
}
This does two things. First, you're looking at one cell at a time. Second, you're using the Text property. The Text property should always be a string so you could just do this and it would almost certainly work:
string cellText = excelRange.Cells[row, column].Text;
It's just that the object model returns dynamic, so even though it is a string, the possibility is left open that maybe it won't be.
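If you do read Value2 for a multi-cell range, the result can be treated as a two-dimensional object array in which any element may be null. The sketch below is one way to turn such an array into strings with "NA" for blanks; it does not touch Excel at all. The helper name ToText is hypothetical, and Main only simulates the 1-based array that Interop hands back.

```csharp
using System;
using System.Globalization;

public static class CellValues
{
    // Converts a block of cell values to text, substituting a placeholder for nulls.
    public static string[,] ToText(object[,] values, string placeholder)
    {
        // Interop arrays start at index 1, so read the bounds instead of assuming 0.
        int rowStart = values.GetLowerBound(0), rowEnd = values.GetUpperBound(0);
        int colStart = values.GetLowerBound(1), colEnd = values.GetUpperBound(1);

        var result = new string[rowEnd - rowStart + 1, colEnd - colStart + 1];
        for (int r = rowStart; r <= rowEnd; r++)
        {
            for (int c = colStart; c <= colEnd; c++)
            {
                object value = values[r, c];
                result[r - rowStart, c - colStart] = value == null
                    ? placeholder
                    : Convert.ToString(value, CultureInfo.InvariantCulture);
            }
        }
        return result;
    }

    public static void Main()
    {
        // Simulated Value2 result: a 2x2 array with lower bound 1 in both dimensions.
        var raw = (object[,])Array.CreateInstance(typeof(object), new[] { 2, 2 }, new[] { 1, 1 });
        raw[1, 1] = "abc";
        raw[1, 2] = 42.0;
        raw[2, 1] = null;
        raw[2, 2] = true;

        string[,] text = ToText(raw, "NA");
        Console.WriteLine(text[1, 0]);
    }
}
```

Typing the array as object[,] up front keeps the null check out of dynamic binding, which is where the RuntimeBinderException in the question comes from.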
My strong recommendation - and I think most developers would agree - is to abandon Excel Interop and run from it, and use a library like EPPlus instead. There are tons of examples.
Excel Interop works by actually starting an instance of Excel and giving you access to the clunky VBA object model. It's evil. Chances are that if you open your task manager right now you'll see several extra instances of Excel open that you didn't expect to see. Fixing that is a whole separate frustrating problem.
For some years Excel files have just been collections of XML documents, and EPPlus helps you to work with them as documents while providing all sorts of helper methods so that you can interact with sheets, ranges, cells, and so forth. Try it. Trust me, you'll never look back.
Here's an example after adding the EPPlus NuGet package:
var pathToYourExcelWorkbook = @"c:\somepath\document.xlsx";
using (var workbookPackage = new ExcelPackage(new FileInfo(pathToYourExcelWorkbook)))
{
var workbook = workbookPackage.Workbook;
var sheet = workbook.Worksheets[1]; // 1-based, or use the name.
for (var row = 1; row <= 10; row++)
{
for (var column = 1; column <= 10; column++)
{
var cellText = sheet.Cells[row, column].Text;
}
}
}
It's awesome. No starting or closing an application - you're just reading from a file. No weird COM objects. And the objects are all strongly-typed. The Text property returns a string.

LinqToExcel Not Parsing Date

I am working with a client to import a rather large Excel file (over 37K rows) into a custom system, using the excellent LinqToExcel library to do so. While reading all of the data in, I noticed it was breaking on records about 80% of the way in and dug a little further. Most of the records (with associated dates ranging from 2011 to 2015) are normal, e.g. 1/3/2015. Starting in 2016, however, the structure changes to look like this: '1/4/2016 (note the "tick" at the beginning of the date), and LinqToExcel starts returning a DBNull for that column.
Any ideas on why it would do that and ways around it? Note that this isn't a casting issue - I can use the Immediate Window to see all the values of the LinqToExcel.Row value and where that column index is, it's empty.
Edit
Here is the code I am using to read in the file:
var excel = new LinqToExcel.ExcelQueryFactory(Path.Combine(this.FilePath, this.CurrentFilename));
foreach (var row in excel.Worksheet(file.WorksheetName))
{
data.Add(this.FillEntity(row));
}
The problem I'm referring to is inside the row variable, which is a LinqToExcel.Row instance and contains the raw data from Excel. The values inside row all line up, with the exception of the column for the date which is empty.
Edit 2
I downloaded the LinqToExcel code from GitHub and connected it to my project and it looks like the issue is even deeper than this library. It uses an IDataReader to read in all of the values and the cells in question that aren't being read are empty from that level. Here is the block of code from the LinqToExcel.ExcelQueryExecutor class that is failing:
private IEnumerable<object> GetRowResults(IDataReader data, IEnumerable<string> columns)
{
var results = new List<object>();
var columnIndexMapping = new Dictionary<string, int>();
for (var i = 0; i < columns.Count(); i++)
columnIndexMapping[columns.ElementAt(i)] = i;
while (data.Read())
{
IList<Cell> cells = new List<Cell>();
for (var i = 0; i < columns.Count(); i++)
{
var value = data[i];
//I added this in, since the worksheet has over 37K rows and
//I needed to snag right before it hit the values I was looking for
//to see what the IDataReader was exposing. The row inside the
//IDataReader relevant to the column I'm referencing is null,
//even though the data definitely exists in the Excel file
if (value.GetType() == typeof(DateTime) && value.Cast<DateTime>() == new DateTime(2015, 12, 31))
{
}
value = TrimStringValue(value);
cells.Add(new Cell(value));
}
results.CallMethod("Add", new Row(cells, columnIndexMapping));
}
return results.AsEnumerable();
}
Since their class uses an OleDbDataReader to retrieve the results, I think that is what can't find the value of the cell in question. I don't even know where to go from there.
Found it! Once I traced down that it was the OleDbDataReader that was failing and not the LinqToExcel library itself, it sent me down a different path to look around. Apparently, when an Excel file is read by an OleDbDataReader (as virtually all utilities do under the covers), the first few records are scanned to determine the type of content associated with the column. In my scenario, over 20K records had "normal" dates, so it assumed everything was a date. Once it got to the "bad" records, the ' in front of the date meant it couldn't be parsed into a date, so the value was null.
To circumvent this, I load the file and tell it to ignore column headers. Since the header for this column is a string and most of the values are dates, it makes everything a string because of the mismatched types and the values I need are loaded properly. From there, I can parse accordingly and get it to work.
Source: What is IMEX in the OLEDB connection string?
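As a rough illustration of the settings involved, the sketch below builds an OLE DB connection string with HDR=NO (treat the first row as data rather than headers) and IMEX=1 (the import mode the linked question discusses). It is an assumption about what the workaround looks like at the connection-string level, not the exact code used above; the provider name is the commonly used ACE provider and the file path is hypothetical.

```csharp
using System;

public static class ExcelConnection
{
    // Builds a connection string that reads the header row as ordinary data (HDR=NO)
    // and asks the driver to use import mode for mixed-type columns (IMEX=1).
    public static string Build(string path)
    {
        return "Provider=Microsoft.ACE.OLEDB.12.0;" +
               "Data Source=" + path + ";" +
               "Extended Properties=\"Excel 12.0 Xml;HDR=NO;IMEX=1\";";
    }

    public static void Main()
    {
        // Hypothetical path, for illustration only.
        Console.WriteLine(Build(@"C:\data\import.xlsx"));
    }
}
```

With the header text landing in the same column as the dates, the column is read as text, so the '1/4/2016 values come through as strings that can be parsed afterwards.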

C# ExcelPackage (EPPlus) DeleteRow does not change sheet dimension?

I am trying to build a data import tool that accepts an Excel file from the user and parses the data from the file to import it into my application.
I am running into a strange issue with DeleteRow that I cannot find any information about online, although it seems like someone would have come across it before. If this is a duplicate question, I apologize; I could not find anything related to my issue after searching the web, except for this one, which still isn't solving my problem.
So the issue:
I use the following code to attempt to "remove" any row that has blank data through ExcelPackage.
for (int rowNum = 1; rowNum <= ws.Dimension.End.Row; rowNum++)
{
var rowCells = from cell in ws.Cells
where (cell.Start.Row == rowNum)
select cell;
if (rowCells.Any(cell => cell.Value != null))
{
nonEmptyRowsInFile += 1;
continue;
}
else ws.DeleteRow(rowNum);
//Update: ws.DeleteRow(rowNum, 1, true) also does not affect dimension
}
Stepping through that code, I can see that the DeleteRow is indeed getting called for the proper row numbers, but the issue is when I go to set the "total rows in file" count on the returned result object:
parseResult.RowsFoundInFile = (ws.Dimension.End.Row);
ws.Dimension.End.Row will still return the original row count even after the calls to DeleteRow.
My question is...do I have to "save" the worksheet or call something in order for the worksheet to realize that those rows have been removed? What is the point of calling "DeleteRow" if the row still "exists"? Any insight on this would be greatly appreciated...
Thanks
I think I figured out the problem. This is yet another closure issue in C#: the reference to "ws" is still the same reference from before the DeleteRow call.
In order to get the "updated" dimension, you have to fetch the worksheet again, for example:
ws = excelPackage.Workbook.Worksheets.First();
Once you get a new reference to the worksheet, it will have the updated dimensions, including any removed/added rows/columns.
Hopefully this helps someone.
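A separate pitfall in the loop from the question is worth noting: deleting a row shifts the rows below it up by one, so a loop that counts upward skips the row that moves into the deleted position. A common way around this is to iterate from the last row to the first. The following is a minimal sketch under that assumption, using the same EPPlus calls as the question; the class and method names are hypothetical.

```csharp
using System.Linq;
using OfficeOpenXml;

public static class WorksheetCleanup
{
    // Deletes every row that has no non-null cell and returns how many were removed.
    public static int DeleteEmptyRows(ExcelWorksheet ws)
    {
        // Dimension is null when the sheet has no cells at all.
        if (ws.Dimension == null)
        {
            return 0;
        }

        int lastColumn = ws.Dimension.End.Column;
        int deleted = 0;

        // Walk bottom-up so a deletion never shifts a row that is still to be checked.
        for (int rowNum = ws.Dimension.End.Row; rowNum >= 1; rowNum--)
        {
            bool hasValue = ws.Cells[rowNum, 1, rowNum, lastColumn]
                .Any(cell => cell.Value != null);
            if (!hasValue)
            {
                ws.DeleteRow(rowNum);
                deleted++;
            }
        }
        return deleted;
    }
}
```

Counting the deletions directly also gives the number of remaining rows without relying on ws.Dimension after the fact.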

GemBox.Spreadsheet last used row

I am trying to get the index of the last used row in a spreadsheet. I've found that in Excel it can be done like this:
int lastUsedRow = worksheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell,
Type.Missing).Row;
But this doesn't seem to work with GemBox. The idea is that I have a template excel file that I want to fill with more information and therefore need the last row, so that I can continue on the next one.
Hi, you can just use the worksheet's Rows.Count property.
Gets the number of currently allocated elements (dynamically changes when worksheet is modified)
Try the following:
int lastUsedRow = worksheet.Rows.Count - 1;
Regarding shahkalpesh's suggestion: yes, you can achieve your task with that approach as well. Here is how:
var usedRange = worksheet.GetUsedCellRange(true);
int lastUsedRow = usedRange.LastRowIndex;
Note: I haven't used Gembox. My answer is based on searching in the documentation.
GetUsedCellRange returns a CellRange, which has a property named LastRowIndex.
Does this work the same way as Excel?

c# VS Express 2012 excel xml read and list

I'm beginning to program in C#, and I can use Visual Studio Express 2012 for this purpose. I'm trying to create an application that imports data from a specific column of an XML Spreadsheet 2003 file (the number of entries in that column is not known in advance) and lists the text from each cell in that column.
I have read a few topics about it, like this one:
http://social.msdn.microsoft.com/Forums/windowsapps/en-US/4fce4765-2d05-4a2b-8d0a-6219e87f3307/reading-excel-file-using-c-in-winrt-platform?forum=winappswithcsharp
but most of the answers relate to Visual Studio 2012, not the Express version, so I'm limited in libraries and extensions. Most of these solutions don't work in my VS Express 2012 when I try them, because something is missing.
This program works for me and returns the value of one specific cell. How can I change it so that it reads every cell from that column and assigns every value to a table (or maybe a variable), so I can work with this content and maybe randomize the order later?
namespace UnitTest
{
public class TestCode
{
//ReadExcelCellTest
public static void Main()
{
XDocument document = XDocument.Load(@"C:\Projekt2\File1.xml");
XNamespace workbookNameSpace = @"urn:schemas-microsoft-com:office:spreadsheet";
// Get worksheet
var query = from w in document.Elements(workbookNameSpace + "Workbook").Elements(workbookNameSpace + "Worksheet")
where w.Attribute(workbookNameSpace + "Name").Value.Equals("Sheet1")
select w;
List<XElement> foundWoksheets = query.ToList<XElement>();
if (foundWoksheets.Count() <= 0) { throw new ApplicationException("Worksheet Settings could not be found"); }
XElement worksheet = query.ToList<XElement>()[0];
// Get the row for "Seat"
query = from d in worksheet.Elements(workbookNameSpace + "Table").Elements(workbookNameSpace + "Row").Elements(workbookNameSpace + "Cell").Elements(workbookNameSpace + "Data")
where d.Value.Equals("StateID")
select d;
List<XElement> foundData = query.ToList<XElement>();
if (foundData.Count() <= 0) { throw new ApplicationException("Row 'StateID' could not be found"); }
XElement row = query.ToList<XElement>()[0].Parent.Parent;
// Get value cell of Etl_SPIImportLocation_ImportPath setting
XElement cell = row.Elements().ToList<XElement>()[1];
// Get the value "Leon"
string cellValue = cell.Elements(workbookNameSpace + "Data").ToList<XElement>()[0].Value;
Console.WriteLine(cellValue);
}
}
}
I believe that any version of Visual Studio would allow you to use open-source libraries:
http://closedxml.codeplex.com/
Link to doc: http://closedxml.codeplex.com/wikipage?title=Finding%20and%20extracting%20the%20data&referringTitle=Documentation
Reading Excel cell value should be much easier then.
How can I change it, so it will read every cell from that column,
assign every value to a table (or maybe variable) so I can work with
this content and maybe randomize order later?
I hope that closedxml will help you achieve this task.
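If you would rather stay with the LINQ to XML approach from the question, which needs no extra library, reading a whole column could look roughly like this. It is a sketch under one simplifying assumption: cells are stored contiguously in each row, so the n-th Cell element is the n-th column (files that use ss:Index to skip empty cells would need extra handling). The class name, method name and inline sample workbook are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ColumnReader
{
    static readonly XNamespace ss = "urn:schemas-microsoft-com:office:spreadsheet";

    // Returns the text of the cell at columnIndex (0-based) from every row of the sheet.
    public static List<string> ReadColumn(XDocument document, string sheetName, int columnIndex)
    {
        XElement worksheet = document.Root
            .Elements(ss + "Worksheet")
            .First(w => (string)w.Attribute(ss + "Name") == sheetName);

        return worksheet
            .Elements(ss + "Table")
            .Elements(ss + "Row")
            .Select(row => row.Elements(ss + "Cell").ElementAtOrDefault(columnIndex))
            .Where(cell => cell != null)                                  // row is shorter than the column index
            .Select(cell => (string)cell.Element(ss + "Data") ?? string.Empty)
            .ToList();
    }

    public static void Main()
    {
        // Tiny inline workbook so the sketch runs without a file on disk.
        XDocument document = XDocument.Parse(
            "<Workbook xmlns='urn:schemas-microsoft-com:office:spreadsheet' " +
            "xmlns:ss='urn:schemas-microsoft-com:office:spreadsheet'>" +
            "<Worksheet ss:Name='Sheet1'><Table>" +
            "<Row><Cell><Data ss:Type='String'>StateID</Data></Cell></Row>" +
            "<Row><Cell><Data ss:Type='Number'>1</Data></Cell></Row>" +
            "<Row><Cell><Data ss:Type='Number'>2</Data></Cell></Row>" +
            "</Table></Worksheet></Workbook>");

        foreach (string value in ReadColumn(document, "Sheet1", 0))
        {
            Console.WriteLine(value);
        }
    }
}
```

The result is an ordinary List<string>, so it can be reordered or processed further like any other list.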
Handling Excel sheets is not as easy as one might expect. For example: cell content is often just a reference to the real value stored in a dictionary (that's what the List<Dictionary<string, string>> is for in the code in the forum topic you linked). Also, other whacky things can happen, like receiving unexpected NULL cells at the end of a row.
I don't know how low-level you want to keep your code. If there's a possibility that the functionality will evolve, you had better look for a library. Take a look at Microsoft's own Open XML SDK: Open XML SDK 2.5. It provides support for all XML-based Office formats. There are two downsides: it's a 12MB+ assembly, and it is still not as high-level as I expected. But you do get some concepts like rows and columns.
The other alternatives are some non Microsoft libraries. There are numerous ones on CodePlex and other open source repos. Watch out to select something which is active and updated. Take a look at the issue section of the project, you'll see that there are usually many. You can see how they are handled. Many projects focus on Excel only, which is probably what you need, and will be smaller than a general OpenXML solution.
You can also get a paid product. In the end I used SmartXLS, because it turned out our company had a license.
