I have an Excel sheet (ws) that has several pictures in it, and I want to remove all of them using EPPlus.
This is what I've done, and it works, but I don't want to remove them by picture title:
ws.Drawings.Remove("Picture 1");
ws.Drawings.Remove("Picture 2");
Is there a way to remove them all at once?
I don't know of any method that removes them all with a single line of code, but you can do it by looping through all the drawings in a worksheet.
using (var p = new OfficeOpenXml.ExcelPackage(new FileInfo(@"c:\FooFolder\Foo.xlsx")))
{
    ExcelWorkbook wb = p.Workbook;
    ExcelWorksheet ew = wb.Worksheets.First();
    //get the number of drawings in the worksheet to loop through.
    //Subtract 1 since the drawings use a 0 base index
    int drawingCount = ew.Drawings.Count - 1;
    //loop through the drawings starting at the highest index so the remaining indexes don't shift as you remove them
    for (int i = drawingCount; i >= 0; i--)
    {
        //remove the drawing at current index
        ew.Drawings.Remove(i);
    }
    p.Save();
}
Edit: After I posted this I found a much simpler method.
You can use ExcelWorksheet.Drawings.Clear(), which removes all drawings from the worksheet.
I have a WPF DataGrid which I fill with data imported from an Excel file (*.xlsx) through a class. The problem is that multiple blank lines are added to the end of the DataGrid, and I can't see how to delete them. My code is below.
<DataGrid Name="dgvMuros" Height="210" Margin="8" VerticalAlignment="Top" Padding="5,6" ColumnWidth="50" IsReadOnly="False"
AlternatingRowBackground="Azure" GridLinesVisibility="All" HeadersVisibility="Column"
Loaded="dgvMuros_Loaded" CellEditEnding="DataGrid_CellEditEnding" ItemsSource="{Binding Data}"
HorizontalGridLinesBrush="LightGray" VerticalGridLinesBrush="LightGray" >
</DataGrid>
With this method I import the data from the Excel file.
public void ImportarMuros()
{
ExcelData dataFronExcel = new ExcelData();
this.dgvMuros.DataContext = dataFronExcel;
txtTotMuros.Text = dataFronExcel.numMuros.ToString();
cmdAgregarMuros.IsEnabled = false;
cmdBorrarMuros.IsEnabled = false;
cmdImportar.IsEnabled = false;
}
public class ExcelData
{
public int numMuros { get; set; }
public DataView Data
{
get
{
Excel.Application excelApp = new Excel.Application();
Excel.Workbook workbook;
Excel.Worksheet worksheet;
Excel.Range range;
workbook = excelApp.Workbooks.Open(Environment.CurrentDirectory + "\\MurosEjemplo.xlsx");
worksheet = (Excel.Worksheet)workbook.Sheets["DatMuros"];
int column = 0;
int row = 0;
range = worksheet.UsedRange;
DataTable dt = new DataTable();
dt.Columns.Add("Muro");
dt.Columns.Add("Long");
dt.Columns.Add("Esp");
dt.Columns.Add("X(m)");
dt.Columns.Add("Y(m)");
dt.Columns.Add("Dir");
for (row = 2; row < range.Rows.Count; row++)
{
DataRow dr = dt.NewRow();
for (column = 1; column <= range.Columns.Count; column++)
{
dr[column - 1] = Convert.ToString((range.Cells[row, column] as Excel.Range).Value);
}
dt.Rows.Add(dr);
dt.AcceptChanges();
numMuros = dt.Rows.Count;
}
workbook.Close(true, Missing.Value, Missing.Value);
excelApp.Quit();
return dt.DefaultView;
}
}
}
Below, as commented, is an example of removing the extra “empty” rows from the DataTable.
There are a couple of ways to approach this. One is to clean the Excel file of the extra rows, since Excel's UsedRange property has a nasty habit of flagging rows that have no apparent data as NOT empty. This may be caused by formatting or other issues. I have a solution for that if you want to go down that rabbit hole; see "Fastest method to remove Empty rows and Columns From Excel Files using Interop".
However, this solution was heavily based on LARGE Excel files with many rows and columns. If the files are not large, then the solution below should work.
Even though your posted code is missing some much-needed range checking (more below), I was able to use it to read an Excel file that produced extra "empty" rows at the end. It is these rows we want to remove from the DataTable.
I am sure there are other ways to do this, however, a basic approach would be to simply loop through the DataTable rows, and check each cell… and, if ALL the cells on that row are “empty” then remove that row. This is the approach I used below.
To get this done quickly, the goal is a single loop through the table. In other words, we want to loop through the table and remove rows from that SAME table, which means extra care is needed. A foreach loop through the rows will not work, because the collection cannot be modified while it is being enumerated.
However, a simple for loop will work, as long as we start at the bottom and work up. We also need to make sure NOT to use dt.Rows.Count as the "ending" condition in the for loop through the rows, since that count changes as rows are removed. This is easily avoided by storing the row count in a variable and using it as the starting index. The code can then delete rows from the bottom up without getting the row and loop indexes mixed up.
A walkthrough of the code: first, a bool variable allEmpty is created to indicate whether ALL the cells in a row are "empty." For each row, we set this variable to true to assume that the row is empty. Then we loop through each cell of that row and check whether the cell is NOT empty. If at least one of the cells in that row is NOT empty, we set allEmpty to false and break out of the columns loop. After the columns loop exits, the code checks whether the row is empty and, if so, deletes it.
Note the else branch of the last if statement. Since we only want to delete the trailing "empty" rows, we are done as soon as the FIRST non-empty row is found (counting from the bottom), so the code breaks out of the rows loop.
If you comment out the else portion of the bottom if code, then, the code will remove ALL the empty rows.
bool allEmpty;
int rowCount = dt.Rows.Count - 1;
for (int dtRowIndex = rowCount; dtRowIndex >= 0; dtRowIndex--) {
allEmpty = true;
for (int dtColIndex = 0; dtColIndex < dt.Columns.Count; dtColIndex++) {
if (dt.Rows[dtRowIndex].ItemArray[dtColIndex].ToString() != "") {
allEmpty = false;
break;
}
}
if (allEmpty) {
dt.Rows.RemoveAt(dtRowIndex);
}
else {
break;
}
}
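The same logic can be wrapped in a reusable method. The following is only a sketch: the class name DataTableCleanup, the method RemoveEmptyRows, and the trailingOnly flag are hypothetical and not part of the posted code. It also treats whitespace-only cells as empty, which is slightly broader than the != "" check above.

```csharp
using System;
using System.Data;

public static class DataTableCleanup
{
    // Hypothetical helper (not from the posted code).
    // Removes rows whose cells are all empty and returns how many were removed.
    // trailingOnly = true stops at the first non-empty row found from the bottom.
    public static int RemoveEmptyRows(DataTable dt, bool trailingOnly)
    {
        int removed = 0;
        // Walk from the bottom up so removals never shift rows we still have to visit.
        for (int rowIndex = dt.Rows.Count - 1; rowIndex >= 0; rowIndex--)
        {
            bool allEmpty = true;
            foreach (object cell in dt.Rows[rowIndex].ItemArray)
            {
                // Convert.ToString yields "" for null and DBNull, so those count as empty.
                if (!string.IsNullOrWhiteSpace(Convert.ToString(cell)))
                {
                    allEmpty = false;
                    break;
                }
            }
            if (allEmpty)
            {
                dt.Rows.RemoveAt(rowIndex);
                removed++;
            }
            else if (trailingOnly)
            {
                break;
            }
        }
        return removed;
    }
}
```

Calling DataTableCleanup.RemoveEmptyRows(dt, true) just before returning dt.DefaultView would drop only the trailing blank rows; passing false would also drop blank rows in the middle of the table.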
Eye brow raiser for the posted code…
The posted code makes some dangerous assumptions about what is returned from UsedRange and about the dt column indexes. For example, the code starts by grabbing the worksheet's UsedRange.
range = worksheet.UsedRange;
We obviously NEED this info. However, at this point in the code, we have NO clue how many rows or columns have been returned. Then, in the second for loop through the columns, the code uses the column index as an index into the data row dr…
dr[column - 1] = …
Since the data table dt only has 6 columns, this is a risky assignment without checking the index range. UsedRange grabs all the used cells, so if a user added some text in column 7, 8 or ANY column beyond 6, this code will crash and burn. The code MUST check the number of columns returned from UsedRange to avoid an index out of range exception.
There are a couple of ways you could fix this. One would be to set the column loop ending condition to the number of columns in the data table. Unfortunately, you still have to check the number of columns returned by the used range, because it may return fewer columns than the data table has. In that case the code can fail on the same line, only this time on the right side of the "=" sign.
= Convert.ToString((range.Cells[row, column] as Excel.Range).Value);
In both cases it is clear your code needs to check these ranges BEFORE you start the looping through the used range.
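One way to make both checks explicit is to copy only the columns that exist on both sides. The sketch below assumes the used range values have already been read into an object[,] (for example from Value2 on a multi-cell range, which returns a 1-based array); the class RangeCopy, the method Fill, and the firstDataRow parameter are hypothetical names used for illustration.

```csharp
using System;
using System.Data;

public static class RangeCopy
{
    // Hypothetical helper (not from the posted code).
    // Copies a 2-D block of cell values into dt without indexing past
    // the columns that BOTH the block and the table actually have.
    // firstDataRow is given in the array's own index space
    // (2 skips a header row in a 1-based array, 1 in a 0-based one).
    public static void Fill(DataTable dt, object[,] values, int firstDataRow)
    {
        int rowLow = values.GetLowerBound(0);
        int rowHigh = values.GetUpperBound(0);
        int colLow = values.GetLowerBound(1);
        // The smaller of the two column counts is the only safe loop limit.
        int colCount = Math.Min(values.GetLength(1), dt.Columns.Count);

        for (int r = Math.Max(firstDataRow, rowLow); r <= rowHigh; r++)
        {
            DataRow dr = dt.NewRow();
            for (int c = 0; c < colCount; c++)
            {
                dr[c] = Convert.ToString(values[r, colLow + c]);
            }
            dt.Rows.Add(dr);
        }
    }
}
```

Extra columns in the sheet are ignored, and table columns with no matching sheet column are left as DBNull, so neither side of the assignment can go out of range.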
Lastly, if you must use Excel Interop, which is usually a last resort, then you need to minimize the possibility of leaking the COM objects (leaking resources), so that when something goes wrong your code still releases the COM objects it created. When using Interop, I suggest you wrap all the Excel code in a try/catch/finally statement. The try portion holds the code. The finally portion is where you close the Excel workbook, quit the Excel application and release the COM objects.
You will need to decide what to do in the catch portion. A simple message box may suffice to tell the user there was an error; the user clicks OK, and the code executes the finally block. The point is that you want to display something instead of simply swallowing the error.
This approach may look something like…
Microsoft.Office.Interop.Excel.Application ExcelApp = null;
Microsoft.Office.Interop.Excel.Workbook Workbook = null;
Microsoft.Office.Interop.Excel.Worksheet Worksheet = null;
try {
// code that works with excel interop
}
catch (Exception e) {
MessageBox.Show("Error Excel: " + e.Message);
}
finally {
if (Worksheet != null) {
Marshal.ReleaseComObject(Worksheet);
}
if (Workbook != null) {
//Workbook.Save();
Workbook.Close();
Marshal.ReleaseComObject(Workbook);
}
if (ExcelApp != null) {
ExcelApp.Quit();
Marshal.ReleaseComObject(ExcelApp);
}
}
I hope this makes sense and helps.
An Excel spreadsheet should be read from .NET. It is very efficient to read all values from the active range by using the Value property. This transfers all values into a two-dimensional array with a single call to Excel.
However, reading strings this way is not possible for a range that contains more than a single cell. Therefore we have to iterate over all cells and use the Text property, which performs very poorly for larger documents.
The reason for using strings rather than values is to obtain the displayed format (for instance for dates or the number of digits).
Here is a sample code written in C# to demonstrate the approach.
static void Main(string[] args)
{
    Excel.Application xlApp = (Excel.Application)System.Runtime.InteropServices.Marshal.GetActiveObject("Excel.Application");
    var worksheet = xlApp.ActiveSheet;
    var cells = worksheet.UsedRange;
    // read all values in array -> fast
    object[,] arrayValues = cells.Value;
    // create array for text with the same lengths and lower bounds
    object[,] arrayText = (object[,])Array.CreateInstance(typeof(object),
        new int[] { arrayValues.GetLength(0), arrayValues.GetLength(1) },
        new int[] { arrayValues.GetLowerBound(0), arrayValues.GetLowerBound(1) });
    // read text for each cell -> slow
    for (int row = arrayValues.GetLowerBound(0); row <= arrayValues.GetUpperBound(0); ++row)
    {
        for (int col = arrayValues.GetLowerBound(1); col <= arrayValues.GetUpperBound(1); ++col)
        {
            object obj = cells[row, col].Text;
            arrayText[row, col] = obj;
        }
    }
}
The question is whether there is a more efficient way to read the complete string content from an Excel document. One idea was to use cells.Copy to copy the content to the clipboard and read it from there. However, this has some restrictions and could of course interfere with users who are working with the clipboard at the same time. So I wonder if there are better approaches to solve this performance issue.
You can use code below:
using (MSExcel.Application app = MSExcel.Application.CreateApplication())
{
MSExcel.Workbook book1 = app.Workbooks.Open( this.txtOpen_FilePath.Text);
MSExcel.Worksheet sheet = (MSExcel.Worksheet)book1.Worksheets[1];
MSExcel.Range range = sheet.GetRange("A1", "F13");
object value = range.Value; //the value is boxed two-dimensional array
}
The code is provided from this post. It should be much more efficient than your code, but may not be the best.
I have a List with some values and a TextBox where I enter a number. I need to build an Excel file with several sheets filled from the List, where the number entered in the TextBox is the number of values per sheet. For example, if the List has 1000 values and I enter 100 in the TextBox, the result should be one Excel file with 10 sheets, each with 100 cells. How can I do this using Microsoft.Office.Interop.Excel?
For worksheets:
//get the first workbook in an application
Workbook WB = Application.Workbooks[1]; //Excel collections are 1-based; or use any other workbook you prefer
Now loop the following for each list of strings you have (one list per worksheet):
Worksheet WS = (Worksheet)WB.Worksheets.Add(); //this command adds worksheets
Range R = WS.Range["A1"]; //or any other cell you like
//now for cells
for (int i = 0; i < YourStringList.Count; i++) //I believe you can manage to separate the lists yourself
{
R.Offset[i, 0].Value = YourStringList[i];
}
End of the loop
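Separating the list into per-sheet chunks needs no Interop at all. The following is a minimal sketch of that step; ListChunker and Split are hypothetical names, and chunkSize stands for the number parsed from the TextBox.

```csharp
using System;
using System.Collections.Generic;

public static class ListChunker
{
    // Hypothetical helper: splits source into consecutive chunks of at most
    // chunkSize items. The last chunk is shorter when the count does not divide evenly.
    public static List<List<T>> Split<T>(IList<T> source, int chunkSize)
    {
        if (chunkSize <= 0)
        {
            throw new ArgumentOutOfRangeException("chunkSize");
        }

        var chunks = new List<List<T>>();
        for (int start = 0; start < source.Count; start += chunkSize)
        {
            int length = Math.Min(chunkSize, source.Count - start);
            var chunk = new List<T>(length);
            for (int i = 0; i < length; i++)
            {
                chunk.Add(source[start + i]);
            }
            chunks.Add(chunk);
        }
        return chunks;
    }
}
```

With 1000 values and a chunk size of 100 this produces 10 chunks of 100 items each; each chunk can then be written to its own worksheet with the loop shown above.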
I am trying to add three types of content (headers, tables, and pictures) to a Word document. This is how I am trying to do it now. However, each item replaces the last one, and images are always added at the beginning of the page. I have a loop that calls a function to create the headers and tables, and then adds images afterwards. I think the problem is the ranges. I use a starting range of object start = 0;
How can I add these one at a time, each on a new line in the document?
foreach (var category in observedColumns)
{
CreateHeadersAndTables();
createPictures();
}
Adding Headers:
object start = 0;
Word.Range rng = doc.Range(ref start , Missing.Value);
Word.Paragraph heading;
heading = doc.Content.Paragraphs.Add(Missing.Value);
heading.Range.Text = category;
heading.Range.InsertParagraphAfter();
Adding Tables:
Word.Table table;
table = doc.Content.Tables.Add(rng, 1, 5);
Adding Pictures:
doc.Application.Selection.InlineShapes.AddPicture(@path);
A simple approach is to use paragraphs to obtain the Range objects and insert new paragraphs one by one.
The API documentation shows that Paragraphs implements an Add method which:
Returns a Paragraph object that represents a new, blank paragraph added to a document. (...) If Range isn't specified, the new paragraph is added after the selection or range or at the end of the document.
Source: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.paragraphs.add(v=office.14).aspx
That makes it straightforward to append new content to the document.
For completeness I have included a sample that shows how a solution might work. The sample loops through a for loop, and for each iteration it inserts:
A new line of text
A table
A picture
The sample is implemented as a C# console application using:
.NET 4.5
Microsoft Office Object Library version 15.0, and
Microsoft Word Object Library version 15.0
... that is, the MS Word Interop API that ships with MS Office 2013.
using System;
using System.IO;
using Microsoft.Office.Interop.Word;
using Application = Microsoft.Office.Interop.Word.Application;
namespace StackOverflowWordInterop
{
class Program
{
static void Main()
{
// Open word and a docx file
var wordApplication = new Application() { Visible = true };
var document = wordApplication.Documents.Open(@"C:\Users\myUserName\Documents\document.docx", Visible: true);
// "10" is chosen by random - select a value that fits your purpose
for (var i = 0; i < 10; i++)
{
// Insert text
var pText = document.Paragraphs.Add();
pText.Format.SpaceAfter = 10f;
pText.Range.Text = String.Format("This is line #{0}", i);
pText.Range.InsertParagraphAfter();
// Insert table
var pTable = document.Paragraphs.Add();
pTable.Format.SpaceAfter = 10f;
var table = document.Tables.Add(pTable.Range, 2, 3, WdDefaultTableBehavior.wdWord9TableBehavior);
for (var r = 1; r <= table.Rows.Count; r++)
for (var c = 1; c <= table.Columns.Count; c++)
table.Cell(r, c).Range.Text = String.Format("This is cell {0} in table #{1}", String.Format("({0},{1})", r,c) , i);
// Insert picture
var pPicture = document.Paragraphs.Add();
pPicture.Format.SpaceAfter = 10f;
document.InlineShapes.AddPicture(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments), "img_1.png"), Range: pPicture.Range);
}
// Some console ascii-UI
Console.WriteLine("Press any key to save document and close word..");
Console.ReadLine();
// Save settings
document.Save();
// Close word
wordApplication.Quit();
}
}
}
Is there a practical way of adding multiple hyperlinks to an Excel worksheet with C#? I want to generate a list of websites and attach hyperlinks to them, so the user can click a hyperlink and go to that website.
So far I have come up with a simple nested for statement, which loops through every cell in a given Excel range and adds a hyperlink to that cell:
for (int i = 0; i < _range.Rows.Count; i++)
{
Microsoft.Office.Interop.Excel.Range row = _range.Rows[i];
for (int j = 0; j < row.Cells.Count; j++)
{
Microsoft.Office.Interop.Excel.Range cell = row.Cells[j];
cell.Hyperlinks.Add(cell, adresses[i, j], _optionalValue, _optionalValue, _optionalValue);
}
}
The code works as intended, but it is extremely slow because of the thousands of calls to the Hyperlinks.Add method.
One thing that intrigues me is that the set_Value method from Office.Interop.Excel can add thousands of strings with one simple call, but there is no similar method for adding hyperlinks (Hyperlinks.Add can add just one hyperlink).
So my question is: is there a way to optimize adding hyperlinks to an Excel file in C# when you need to add a large number of them?
Any help would be appreciated.
I am using VS2010 and MS Excel 2010.
I had the very same problem (adding 300 hyperlinks via Range.Hyperlinks.Add takes approx. 2 min).
The runtime issue is caused by the many Range instances.
Solution:
Use a single range instance and add Hyperlinks with the "=HYPERLINK(target, [friendlyName])" Excel-Formula.
Example:
List<string> urlsList = new List<string>();
urlsList.Add("http://www.gin.de");
// ^^ n times ...
// create shaped array with content (one column; .NET arrays are zero-based)
object[,] content = new object[urlsList.Count, 1];
for (int i = 0; i < urlsList.Count; i++)
{
    content[i, 0] = string.Format("=HYPERLINK(\"{0}\")", urlsList[i]);
}
// get Range
string rangeDescription = string.Format("A1:A{0}", urlsList.Count); // Excel rows start at 1, so n items fill A1:An
Xl.Range xlRange = worksheet.Range[rangeDescription, XlTools.missing];
// set value finally
xlRange.Value2 = content;
... takes just 1 sec ...
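If the cells should show a friendly name instead of the raw URL, the formula strings can be built with the optional second HYPERLINK argument. This is only a sketch: HyperlinkFormulas, Build, and BuildColumn are hypothetical names, and the comma separator assumes the formula is interpreted in English (US) notation. Double quotes inside either argument are doubled, which is how Excel escapes them in string literals.

```csharp
using System;
using System.Collections.Generic;

public static class HyperlinkFormulas
{
    // Excel string literals escape a double quote by doubling it.
    private static string Escape(string text)
    {
        return text.Replace("\"", "\"\"");
    }

    // Builds one =HYPERLINK("url","name") formula string.
    public static string Build(string url, string friendlyName)
    {
        return string.Format("=HYPERLINK(\"{0}\",\"{1}\")", Escape(url), Escape(friendlyName));
    }

    // Builds an n-by-1 array that can be assigned to a single-column range in one call.
    public static object[,] BuildColumn(IList<string> urls, IList<string> names)
    {
        if (urls.Count != names.Count)
        {
            throw new ArgumentException("urls and names must have the same length");
        }

        var content = new object[urls.Count, 1];
        for (int i = 0; i < urls.Count; i++)
        {
            content[i, 0] = Build(urls[i], names[i]);
        }
        return content;
    }
}
```

The array returned by BuildColumn would be assigned to the range in the same way as content in the example above.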