How to read only part of an Excel file - C#

Hello, sorry for my English.
I have to select a row of an Excel file, write some new data into it and save the file.
Afterwards the Excel file is always larger than before, even though the amount of data has not increased; it looks as if blank columns are being created to the right.
I think this is because, when I execute the following statements,
var wb = openWorkBook(filename);
var ws = wb.Worksheet("CNF");
IXLRow row = ws.Row(device.Ordinal - 1 + FirstRow);
for (int j = 0; j < MAXCOLS; ++j)
{
IXLCell cell = row.Cell(j + FirstCol);
...}
the range goes from A1 to XFD1048576.
Even though I only take the row I am interested in and loop over 100 columns, when I call
wb.Save();
the file grows.
So I am asking whether there is a way to load only part of a file, for example to get just a limited number of columns right away, starting from the statement var ws = wb.Worksheet("CNF");.
Thank you

Related

Excel shows #VALUE! before "Enable Editing" when I write formulas using EPPlus

Using C# on .NET Core, I am updating an existing Excel template with data and formulas using the EPPlus library, version 4.5.3.3.
As the screenshots below show, all formula cells contain '#value!' even after calling the Calculate method in the C# code (for reference, the XML screenshot was taken right after downloading the Excel file, before opening it). Automatic calculation is also enabled in Excel.
One blog suggested checking the XML info.
My requirement is to upload this Excel file to a SharePoint site through code and read the formula cells for other operations, without opening the file manually.
Is there any other way to calculate the formula cells from code and update the cell values?
I also went through "Why won't this formula calculate unless i double click a cell?", but no luck.
using (ExcelPackage p = new ExcelPackage())
{
MemoryStream stream = new MemoryStream(byteArray);
p.Load(stream);
ExcelWorksheet worksheet = p.Workbook.Worksheets.FirstOrDefault(a => a.Name == "InputTemplate");
if (worksheet != null)
{
worksheet.Cells["A3"].Value = company.CompanyName;//// Company Name
worksheet.Cells["B3"].Value = product.Name;////product name
worksheet.Cells["C3"].Value = product.NetWeight;
worksheet.Cells["D3"].Value = product.ServingSize;
worksheet.Cells["E3"].Value = 0;
var produceAndIngredientDetailsForExcelList = await GetProduceAndIngredientDetails(companyId, productId);
////rowIndex will be 3
WriteProduceAndIngredientDetailsInExcel(worksheet, produceAndIngredientDetailsForExcelList);
///rowIndex will update based on no. of produce and then Agregates.
StageWiseAggregate(worksheet, produceAndIngredientDetailsForExcelList);
////Write Total Impacts Row
TotalImpactsFormulaSection(worksheet);
worksheet.Calculate();
}
Byte[] bin = p.GetAsByteArray();
return bin;
}
Formula Code
var columnIndex = 22;///"V" Column
for (; columnIndex <= 27; columnIndex++)
{
var columnName = GetExcelColumnName(columnIndex);
worksheet.Cells[currentRowIndex, columnIndex].Formula = $"=SUBTOTAL(109,{columnName}{firstRowIndex}:{columnName}{currentRowIndex - 1})";
}
I got the solution for this issue from my architect (kudos to him).
I was writing the formulas the wrong way by blindly following tutorials like
https://riptutorial.com/epplus/example/26433/add-formulas-to-a-cell
Note: don't follow the link shown above.
We should not put a leading "=" in the formula string. I just removed it and it worked like a charm:
var columnIndex = 22;///"V" Column
for (; columnIndex <= 27; columnIndex++)
{
var columnName = GetExcelColumnName(columnIndex);
worksheet.Cells[currentRowIndex, columnIndex].Formula = $"SUBTOTAL(109,{columnName}{firstRowIndex}:{columnName}{currentRowIndex - 1})";
}
Here is the official tutorial, which shows this correctly:
https://www.epplussoftware.com/en/Developers/ (check the second slide)
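The snippets above call GetExcelColumnName, which the post does not show. A minimal sketch of what such a helper could look like, assuming it takes a 1-based column index and returns the column letters (the class name ExcelColumns is hypothetical):

```csharp
using System;

public static class ExcelColumns
{
    // Converts a 1-based column index to its letter name: 1 -> A, 26 -> Z, 27 -> AA.
    // Assumption: this mirrors what the post's GetExcelColumnName does; the original is not shown.
    public static string GetExcelColumnName(int columnIndex)
    {
        if (columnIndex < 1)
            throw new ArgumentOutOfRangeException(nameof(columnIndex), "Column index is 1-based.");

        string name = string.Empty;
        while (columnIndex > 0)
        {
            int remainder = (columnIndex - 1) % 26;   // 0..25 maps to A..Z
            name = (char)('A' + remainder) + name;    // prepend the next letter
            columnIndex = (columnIndex - 1) / 26;     // move on to the next "digit"
        }
        return name;
    }
}
```

With this sketch, the loop over column indices 22 to 27 would produce the column names V through AA.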

How can I efficiently compare contiguous and sequential rows using C# in Excel?

I am developing a VSTO add-in for Excel in C# that needs to compare potentially large datasets (100 columns x ~10000 or more rows). It is being done in Excel so an end user can view some pictorial representation of the provided data on a row-by-row basis. This application must be done in Excel despite the potential pitfalls of using these large datasets.
Regardless, my question pertains to an efficient way to compare contiguous and sequential rows. My goal is to compare one row to the row directly after it; if any of the elements changes between row1 and row2, this counts as an "event" and row2 is output into a separate sheet. I'm sure you can see that with around 10000 rows this row-wise comparison takes a long time (in practice, about 150ms-200ms per row for the current code).
Currently, I have used the SequenceEqual() method to compare two lists of strings as follows:
private void FilterRawDataForEventReader(Excel.Application xlApp)
{
List<string> row1 = new List<string>();
List<string> row2 = new List<string>();
xlWsRaw = xlApp.Worksheets["Full Raw Data"];
xlWsEventRaw = xlApp.Worksheets["Event Data"];
Excel.Range xlRawRange = xlWsRaw.Range["A3"].Resize[xlWsRaw.UsedRange.Rows.Count-2, xlWsRaw.UsedRange.Columns.Count];
var array = xlRawRange.Value;
Excel.Range xlRange = (Excel.Range)xlWsEventRaw.Cells[xlWsEventRaw.UsedRange.Rows.Count, 1];
int lastRow = xlRange.get_End(Excel.XlDirection.xlUp).Row;
int newRow = lastRow + 2;
for (int i = 1; i < xlWsRaw.UsedRange.Rows.Count - 2; i++)
{
row1.Clear();
row2.Clear();
for (int j = 1; j <= xlWsRaw.UsedRange.Columns.Count-1; j++)
{
row1.Add(array[i, j].ToString());
row2.Add(array[i + 1, j].ToString());
}
if (!row1.SequenceEqual(row2))
{
row2.Add(array[i + 1, xlWsRaw.UsedRange.Columns.Count].ToString()); // Add timestamp to row2.
for (int j = 0; j < row2.Count; j++)
{
xlWsEventRaw.Cells[newRow, j + 1] = row2[j];
}
newRow++;
}
}
}
During testing, I placed timers at various points in this method to see how long certain operations take. For 100 columns, the first loop, which builds the string lists for row1 and row2, takes around 100ms per iteration, and the whole operation takes between 150ms and 200ms when an "event" has been found.
My intuition is that building the two List<string> is the problem but I do not know how else to approach this kind of problem in my experience. I should emphasize, the actual values of the data in the two List<string> don't matter; what matters is if the data are different at all. In that way, I feel that I am approaching this problem incorrectly but don't know how to "re-approach" so to say.
I am wondering if, instead of building arrays of strings through iteration and comparing them with the SequenceEqual() method, anyone can suggest a faster way to compare contiguous and sequential rows?
In case this solution is useful for someone else trying to use Excel in C# and do some comparisons:
This problem was largely an optimization exercise. I eliminated the multiple loops and let Excel generate the comparison lists instead:
for (int i = 3; i < xlWsRaw.UsedRange.Rows.Count - 2; i++)
{
rng1 = (Excel.Range)xlWsRaw.Range[xlWsRaw.Cells[i, 1], xlWsRaw.Cells[i, xlWsRaw.UsedRange.Columns.Count - 1]];
rng2 = (Excel.Range)xlWsRaw.Range[xlWsRaw.Cells[i+1, 1], xlWsRaw.Cells[i+1, xlWsRaw.UsedRange.Columns.Count - 1]];
rng3 = (Excel.Range)xlWsEventRaw.Range[xlWsEventRaw.Cells[newRow, 1], xlWsEventRaw.Cells[newRow, xlWsRaw.UsedRange.Columns.Count - 1]];
object[,] cellValues1 = (object[,])rng1.Value2;
object[,] cellValues2 = (object[,])rng2.Value2;
List<string> test1 = cellValues1.Cast<object>().ToList().ConvertAll(x => Convert.ToString(x));
List<string> test2 = cellValues2.Cast<object>().ToList().ConvertAll(x => Convert.ToString(x));
if (!test1.SequenceEqual(test2))
{
rng2.Copy(rng3);
xlWsEventRaw.Cells[newRow, xlWsRaw.UsedRange.Columns.Count].Value = xlWsRaw.Cells[i + 1, xlWsRaw.UsedRange.Columns.Count].Value; // Outputs the timestamp of the event to the events worksheet.
newRow++;
}
}
I believe this can be optimized further but in my case, the ranges contain multiple types including strings so I convert everything to List<string> for the purpose of comparison. The SequenceEqual() method, however it works behind the scenes, is nearly instantaneous and reduces the time to compare 120 columns to around 3ms.
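One possible further step, sketched here as an assumption rather than as the poster's method: read the whole block into an object[,] once (as the question's code already does with xlRawRange.Value) and compare adjacent rows cell by cell in memory, with no per-row range access and no string conversion. The names RowComparer and FindChangedRows are hypothetical. The sketch uses the array's own bounds, so it works with the 1-based arrays Excel returns as well as with ordinary zero-based ones.

```csharp
using System;
using System.Collections.Generic;

public static class RowComparer
{
    // Returns the indices of all rows that differ from the row directly before them.
    // compareColumns is the number of leading columns to compare
    // (for example, every column except a trailing timestamp).
    public static List<int> FindChangedRows(object[,] values, int compareColumns)
    {
        var changed = new List<int>();
        int firstRow = values.GetLowerBound(0);
        int lastRow = values.GetUpperBound(0);
        int firstCol = values.GetLowerBound(1);

        for (int r = firstRow + 1; r <= lastRow; r++)
        {
            for (int c = firstCol; c < firstCol + compareColumns; c++)
            {
                // object.Equals copes with null (empty cells) and with boxed values.
                if (!object.Equals(values[r - 1, c], values[r, c]))
                {
                    changed.Add(r);
                    break; // one difference is enough to count as an event
                }
            }
        }
        return changed;
    }
}
```

The returned indices could then drive the copy to the event sheet. Note that this compares typed values, so a cell holding the number 1 and a cell holding the text "1" count as different, unlike the string-based comparison above.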

Copy Excel cell comments in a specified range

What is the way to copy all cell comments (on right click - Insert Comments) in a specified range?
Range r1 = (Range)ws1.get_Range("A1", "C10");
Range r2 = (Range)ws2.get_Range("A1", "C10");
r2.Value = r1.Value; // copies cell values and ignores comments
I know that r1.Copy(r2); would copy values and comments, but it shows unnecessary Excel dialogs due to validation issues and therefore I cannot use it.
There's an AddComment method for Range. Unfortunately, it cannot be applied to a range of cells. I guess they assumed: why would you want the same comment written multiple times? So you'll have to loop over the cells and copy each comment's text individually:
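for (int r = 1; r <= r1.Rows.Count; r++)
{
    for (int c = 1; c <= r1.Columns.Count; c++)
    {
        // Read the comment of the individual source cell, not of the whole range.
        Range source = (Range)r1.Cells[r, c];
        if (source.Comment != null)
        {
            // AddComment expects the comment text; it fails if the target cell already has a comment.
            ((Range)r2.Cells[r, c]).AddComment(source.Comment.Text());
        }
    }
}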

C# Best way to parse flat file with dynamic number of fields per row

I have a flat file that is pipe delimited and looks something like this, for example:
ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3
The first two columns are set and will always be there.
* denotes a count of how many repeating fields follow it, so here Note1, Note2 and Note3.
** denotes a count of how many times a block of fields is repeated; there are always 3 fields in a block.
This is per row, so each row may have a different number of fields.
Hope that makes sense so far.
I'm trying to find the best way to parse this file, any suggestions would be great.
The goal in the end is to map all these fields into a few different files (data transformation). I'm actually doing all this within SSIS, but figured the default components won't be good enough, so I need to write my own code.
UPDATE: I'm essentially trying to read this like a source file, do some lookups and string manipulation on some of the fields in between, and spit out several different files, like in any normal file-to-file transformation SSIS package.
Using the above example, I may want to create a new file that ends up looking like this
"ColA","HardcodedString","Note1CRLFNote2CRLF","ColB"
And then another file
Row1: "ColA","A1","A2","A3"
Row2: "ColA","B1","B2","B3"
So I guess I'm after some ideas on how to parse this as well as storing the data in either Stacks or Lists or?? to play with and spit out later.
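One possibility would be to use a stack. First you split the line by the pipes. A Stack<string> built from a sequence returns the last item first, so reverse the fields (Reverse() needs using System.Linq) to make Pop() yield them in file order.
var stack = new Stack<string>(line.Split('|').Reverse());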
Then you pop the first two from the stack to get them out of the way.
stack.Pop();
stack.Pop();
Then you parse the next element: 3* . For that you pop the next 3 items on the stack. With 2** you pop the next 2 x 3 = 6 items from the stack, and so on. You can stop as soon as the stack is empty.
while (stack.Count > 0)
{
// Parse elements like 3*
}
Hope this is clear enough. I find this article very useful when it comes to String.Split().
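The consume-by-count idea can be sketched as follows. This is only an illustration under stated assumptions: the layout is always ColA, ColB, a note count, the notes, a block count, then blocks of 3 fields; a count may or may not carry trailing asterisks (3* or 3). It uses a Queue<string> instead of a stack because a queue hands the fields back in file order. ParsedRow and FlatFileParser are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for one parsed line.
public sealed class ParsedRow
{
    public string ColA;
    public string ColB;
    public List<string> Notes = new List<string>();
    public List<string[]> Blocks = new List<string[]>();
}

public static class FlatFileParser
{
    private const int FieldsPerBlock = 3; // per the question, a block always has 3 fields

    public static ParsedRow ParseLine(string line)
    {
        var fields = new Queue<string>(line.Split('|'));
        var row = new ParsedRow { ColA = fields.Dequeue(), ColB = fields.Dequeue() };

        // First count: number of single repeating fields (the notes).
        int noteCount = ParseCount(fields.Dequeue());
        for (int n = 0; n < noteCount; n++)
        {
            row.Notes.Add(fields.Dequeue());
        }

        // Second count: number of repeated blocks.
        int blockCount = ParseCount(fields.Dequeue());
        for (int b = 0; b < blockCount; b++)
        {
            var block = new string[FieldsPerBlock];
            for (int f = 0; f < FieldsPerBlock; f++)
            {
                block[f] = fields.Dequeue();
            }
            row.Blocks.Add(block);
        }
        return row;
    }

    // Accepts "3", "3*" or "2**" (assumption: asterisks are optional decoration).
    private static int ParseCount(string marker)
    {
        return int.Parse(marker.TrimEnd('*'));
    }
}
```

Each ParsedRow can then be written out to the different target files, for example one output row per entry in Blocks. A line with fewer fields than its counts announce makes Dequeue throw an InvalidOperationException, which may or may not be the error handling you want.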
Something similar to below should work (this is untested)
ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3
string[] columns = line.Split('|');
List<string> repeatingColumnNames = new List<string>();
List<List<string>> repeatingFieldValues = new List<List<string>>();
const int fieldsPerSet = 3; // a block always has 3 fields
if (columns.Length > 2)
{
    // columns[2] holds the count of repeating fields, e.g. "3*"
    int repeatingFieldCount = int.Parse(columns[2].TrimEnd('*'));
    int repeatingFieldStartIndex = 3;
    for (int i = 0; i < repeatingFieldCount; i++)
    {
        repeatingColumnNames.Add(columns[repeatingFieldStartIndex + i]);
    }
    // the column after the repeating fields holds the block count, e.g. "2**"
    int repeatingFieldSetCountIndex = repeatingFieldStartIndex + repeatingFieldCount;
    int repeatingFieldSetCount = int.Parse(columns[repeatingFieldSetCountIndex].TrimEnd('*'));
    int repeatingFieldSetStartIndex = repeatingFieldSetCountIndex + 1;
    for (int i = 0; i < repeatingFieldSetCount; i++)
    {
        string[] fieldSet = new string[fieldsPerSet];
        for (int j = 0; j < fieldsPerSet; j++)
        {
            fieldSet[j] = columns[repeatingFieldSetStartIndex + i * fieldsPerSet + j];
        }
        repeatingFieldValues.Add(new List<string>(fieldSet));
    }
}
System.IO.File.ReadAllLines("File.txt").Select(line => line.Split(new[] {'|'}))

Optimized way of adding multiple hyperlinks in excel file with C#

I wanted to ask if there is a practical way of adding multiple hyperlinks to an Excel worksheet with C#. I want to generate a list of websites and anchor hyperlinks to them, so the user can click such a hyperlink and get to that website.
So far I have come up with a simple nested for statement, which loops through every cell in a given Excel range and adds a hyperlink to that cell:
for (int i = 0; i < _range.Rows.Count; i++)
{
Microsoft.Office.Interop.Excel.Range row = _range.Rows[i];
for (int j = 0; j < row.Cells.Count; j++)
{
Microsoft.Office.Interop.Excel.Range cell = row.Cells[j];
cell.Hyperlinks.Add(cell, adresses[i, j], _optionalValue, _optionalValue, _optionalValue);
}
}
The code works as intended, but it is extremely slow because of the thousands of calls to the Hyperlinks.Add method.
One thing that intrigues me is that the set_Value method from Office.Interop.Excel can write thousands of strings with a single call, but there is no similar method for adding hyperlinks (Hyperlinks.Add can add just one hyperlink).
So my question is: is there some way to optimize adding hyperlinks to an Excel file in C# when you need to add a large number of them?
Any help would be appreciated.
I am using VS2010 and MS Excel 2010.
I had the very same problem (adding 300 hyperlinks via Range.Hyperlinks.Add takes approx. 2 min).
The runtime issue is caused by the many Range instances.
Solution:
Use a single Range instance and add the hyperlinks with the "=HYPERLINK(target, [friendlyName])" Excel formula.
Example:
List<string> urlsList = new List<string>();
urlsList.Add("http://www.gin.de");
// ^^ n times ...
// create a two-dimensional array (n rows, 1 column) with the formulas
object[,] content = new object[urlsList.Count, 1];
for (int i = 0; i < urlsList.Count; i++)
{
    // C# arrays are zero-based, so the single column has index 0
    content[i, 0] = string.Format("=HYPERLINK(\"{0}\")", urlsList[i]);
}
// get the range; Excel rows start at 1, so n URLs fill A1:An
string rangeDescription = string.Format("A1:A{0}", urlsList.Count);
Xl.Range xlRange = worksheet.Range[rangeDescription, XlTools.missing];
// finally, assign all formulas in a single call
xlRange.Value2 = content;
... takes just 1 sec ...
