I have an Excel table full of hyperlinked text: each cell shows a name, and clicking it opens a URL in my default browser.
My program extracts text from this table, but for the hyperlinked cells I get the display string, when what I want is the URL that the string links to.
I can think of two ways to do this: either convert all the hyperlinked text in the Excel file to the corresponding URLs, or use C# to extract the URL from the cell instead of the text.
I don't know how to do either of these, so any help would be greatly appreciated.
C# code so far:
Excel.ApplicationClass excelApp = new Excel.ApplicationClass();
//excelApp.Visible = true;
Excel.Workbook excelWorkbook =
excelApp.Workbooks.Open("C:\\Users\\use\\Desktop\\list.xls",
0, false, 5, "", "",false, Excel.XlPlatform.xlWindows, "",
true, false, 0, true, false, false);
Excel.Sheets excelSheets = excelWorkbook.Worksheets;
string currentSheet = "Sheet1";
Excel.Worksheet xlws = (Excel.Worksheet)excelSheets.get_Item(currentSheet);
string myString = ((Excel.Range)xlws.Cells[2, 1]).Value.ToString();
As for the excel file, it's just one long row of names hyperlinked. For instance cell A2 would contain the text:
Yummy cookie recipe
And I want to extract the string:
http://allrecipes.com//Recipes/desserts/cookies/Main.aspx
You could use a VBA macro:
Hit Alt+F11 to open the VBA editor and paste in the following:
Function URL(rg As Range) As String
Dim Hyper As Hyperlink
Set Hyper = rg.Hyperlinks.Item(1)
URL = Hyper.Address
End Function
And then you can use it in your Worksheet, like this:
=URL(B4)
In your code just add
string myString = ((Excel.Range)xlws.Cells[2, 1]).Cells.Hyperlinks[1].Address;
I obviously recommend doing some checks before accessing the "Hyperlinks" property.
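The checks mentioned above could look something like the sketch below. It assumes a project reference to Microsoft.Office.Interop.Excel; the class and method names (HyperlinkReader, GetCellUrl) are made up for illustration. It only covers hyperlinks stored on the cell itself, so a cell that builds its link with the HYPERLINK() worksheet function would most likely come back as null here.

```csharp
using Excel = Microsoft.Office.Interop.Excel;

static class HyperlinkReader
{
    // Hypothetical helper: returns the URL behind a cell's hyperlink,
    // or null when the cell has no hyperlink attached.
    public static string GetCellUrl(Excel.Worksheet sheet, int row, int column)
    {
        var cell = (Excel.Range)sheet.Cells[row, column];
        Excel.Hyperlinks links = cell.Hyperlinks;

        // The Hyperlinks collection is 1-based; indexing into an empty
        // collection throws, so check Count first.
        if (links.Count == 0)
        {
            return null;
        }

        return links[1].Address;
    }
}
```

With the question's variables, the call would be `string url = HyperlinkReader.GetCellUrl(xlws, 2, 1);`, followed by a null check before using the result.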
VBA function:
Hit Alt+F11 (Opens Visual Basic Editor)
Click on Insert -> Module (adds a module to your excel file)
Paste the code below for the function of GETURL
Hit Alt+Q (Closes the Visual Basic Editor)
Now use the =GETURL(cell) to get the URL
Example: =GETURL(A1) will return the URL for the Hyperlink displayed in cell A1
Function GETURL(HyperlinkCell As Range)
GETURL = HyperlinkCell.Hyperlinks(1).Address
End Function
Use Visual Studio Tools for Office (VSTO) to open Excel workbook and extract all hyperlinks.
I put a hyperlink into A1 of Sheet1 in Book1.xlsx: text = "example.com", address = "http://www.example.com"
_Application app = null;
try
{
app = new Application();
string path = @"c:\temp\Book1.xlsx";
var workbook = app.Workbooks.Open(path, 0, true, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
var sheets = workbook.Worksheets;
var sheet = (Worksheet)sheets.get_Item("Sheet1");
var range = sheet.get_Range("A1", "A1");
var hyperlinks = range.Cells.Hyperlinks.OfType<Hyperlink>();
foreach (var h in hyperlinks)
{
Console.WriteLine("text: {0}, address: {1}", h.TextToDisplay, h.Address);
}
}
finally
{
if (app != null)
app.Quit();
}
Output:
text: example.com, address: http://www.example.com/
Why not use the Uri class to convert the string into a URL:
Uri uri = new Uri("http://myUrl/test.html");
You can use VBA code to achieve this.
Press Alt + F11 to open VB editor, Insert a Module and paste the code below:
Sub run()
On Error Resume Next
For Each hLink In Selection
Range(hLink.Address).Offset(0, 1) = hLink.Hyperlinks(1).Address
Next
End Sub
Save your Excel file (in Excel 2007 and above, save as macro-enabled...)
Try this:
Excel.Application appExcel = new Excel.Application();
Excel.Workbooks workBooks = appExcel.Workbooks;
Excel.Workbook excelSheet = workBooks.Open("......EditPath", false, ReadOnly: true);
foreach (Excel.Worksheet worksheet in excelSheet.Worksheets)
{
Excel.Hyperlinks hyperLinks = worksheet.Hyperlinks;
foreach (Excel.Hyperlink lin in hyperLinks)
{
System.Diagnostics.Debug.WriteLine("# LINK: adress:" + lin.Address);
}
}
I just ran into this issue, and this is what worked for me:
I used the FormulaR1C1 property of the range. So my code looked like this:
for (int r = 2; r <= sheetRange.Rows.Count; r++)
{
documentRecord = new List<string>();
for (int c = 1; c <= wkCol; c++)
{
documentRecord.Add(sheetRange.Cells[r, c].FormulaR1C1);
}
AllRecords.Add(documentRecord);
}
When the record is added to the list of records, the value of whatever the cell range was is formatted into a clickable-hyperlink.
I'm trying to write a C# script in SSIS that creates a new column in an Excel sheet.
I need to know the IndentLevel of a cell in Excel, so I have to create a new column holding these values.
This is what I'm trying (C# script):
Range values = sheet.get_Range("A13");
values.Value = sheet.Range["B13"].IndentLevel();
In VBA it works like this (a VBA script inside an Excel file):
Range("A16").Value = Range("B16").IndentLevel
How can I do that in C#? I've tried everything, but it doesn't work.
Complete script:
xlApp = new _Excel.Application();
xlApp.Visible = true;
oWB = (_Excel.Workbook)xlApp.Workbooks.Open(destFile);
_Excel.Worksheet sheet = (_Excel.Worksheet)xlApp.Worksheets[1];
sheet.Columns["B:N"].Delete();
Range values = sheet.get_Range("A13");
values.Value = sheet.Range["B13"].IndentLevel();
Getting rid of the parentheses seems to do the task correctly: IndentLevel is a property, not a method.
string destFile = @"E:\StackOverflow\Sample.xlsx";
var xlApp = new Microsoft.Office.Interop.Excel.Application();
xlApp.Visible = true;
var oWB = (Microsoft.Office.Interop.Excel.Workbook)xlApp.Workbooks.Open(destFile);
Microsoft.Office.Interop.Excel.Worksheet sheet = (Microsoft.Office.Interop.Excel.Worksheet)xlApp.Worksheets[1];
sheet.Range["A16"].Value = sheet.Range["B16"].IndentLevel;
The value in cell A16 is set to B16's indent level.
The only other note is to make sure that the file isn't open elsewhere, otherwise the code will open up a read-only copy.
I'm trying to open a CSV file with Excel using the Microsoft.Office.Interop.Excel library.
It works, but all the text ends up in one column, with the values still separated by the ";" delimiter.
Here is an example:
Id;Name;Zeit
1;Name1;21.05.2019 09:21:04
3;Name2;21.05.2019 09:21:04
This is the code I used to open the CSV in Excel:
object missing = Type.Missing;
Excel.Application ex = new Excel.Application();
Excel.Workbook wbs = ex.Workbooks.Open(@"c:\users\langenwa\desktop\File.csv", 0, false, Excel.XlFileFormat.xlCSV, "", "", false, Excel.XlPlatform.xlWindows, ";", true, false, 0, true, false, false);
Excel.Worksheet mSheet = (Excel.Worksheet)wbs.Worksheets[1];
ex.Visible = true;
Thanks for any help and sorry for my bad English.
The CSV file opens fine in my Excel, so your system likely has a different default separator. You can override this.
Try adding this line at the top of the CSV file:
sep=;
Note that this only works when the CSV is opened in Excel.
See these answers for more details: https://superuser.com/questions/606272/how-to-get-excel-to-interpret-the-comma-as-a-default-delimiter-in-csv-files
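If the CSV is generated elsewhere and you would rather not edit it by hand, the hint line can be added from code before handing the file to Excel. The following is a minimal sketch, not a definitive implementation: the class and method names (CsvSeparatorHint, AddSeparatorHint) and the idea of writing to a separate copy are assumptions for illustration, and it assumes the file is ASCII or UTF-8 text.

```csharp
using System;
using System.IO;

static class CsvSeparatorHint
{
    // Hypothetical helper: writes a copy of the CSV whose first line is
    // "sep=<separator>", leaving the original file untouched.
    public static void AddSeparatorHint(string sourcePath, string targetPath, char separator)
    {
        string hint = "sep=" + separator;
        string contents = File.ReadAllText(sourcePath);

        // Avoid adding the hint twice if the file already starts with it.
        if (!contents.StartsWith(hint, StringComparison.Ordinal))
        {
            contents = hint + Environment.NewLine + contents;
        }

        File.WriteAllText(targetPath, contents);
    }
}
```

The copy at targetPath is then the file to pass to ex.Workbooks.Open. Other programs reading that copy will see the hint as an ordinary first line, which is why the sketch does not modify the original.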
I am trying to export Excel files to PDF, and I am having success with this using the Microsoft.Office.Interop namespace. I am now trying to find out how to exclude tabs that are marked hidden, so that they do not appear in the PDF. Has anyone done this, or does anyone know how to do it? The code I am currently using is shown below.
string inFile = @"C:\Users\casey.pharr\Desktop\testPDF\3364850336.xls";
string outFile = @"C:\Users\casey.pharr\Desktop\testPDF\3364850336_noHidden_out.pdf";
string tempFile = @"C:\Users\casey.pharr\Desktop\testPDF\temp.xls";
try
{
//first copy original file to temp file to work with
File.Copy(inFile,tempFile, true);
Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
app.Visible = false;
app.DisplayAlerts = false;
Microsoft.Office.Interop.Excel.Workbook wkb = app.Workbooks.Open(tempFile);
for(int x = app.Sheets.Count-1; x-1 > 1; x--)
{
Excel._Worksheet sheet = (Excel._Worksheet)app.Sheets[x];
//now delete hidden worksheets from work book. This is why we are using tempFile
if (sheet.Visible == Microsoft.Office.Interop.Excel.XlSheetVisibility.xlSheetHidden || sheet.Visible == Microsoft.Office.Interop.Excel.XlSheetVisibility.xlSheetVeryHidden && sheet != null)
{
//is sheet hidden. If so remove it so not part of converted file
sheet.Delete();
}
}
wkb.ExportAsFixedFormat(Microsoft.Office.Interop.Excel.XlFixedFormatType.xlTypePDF, outFile);
wkb.Close(false);
app.Quit();
//return outputLocation;
The error that occurs on calling .Delete() is below:
Exception from HRESULT: 0x800A03EC
So we can convert to PDF fine, but we cannot remove or exclude the hidden worksheets. I tried deleting them first and then converting the entire file, but that is not working.
I have a C# app that deletes the first few rows from an Excel file and then converts the file to .csv. Now, however, I receive an .xlsm file instead of an .xlsx, and I can't find out how to work with it; I can't even load data from the columns. It's a report file from SAP, and I can't find any macro inside it. I tried something like this:
/* Load Excel File */
Excel.Application excelApp = new Excel.Application();
string workbookPath = @"file.xlsm";
Excel.Workbook excelWorkbook = excelApp.Workbooks.Open(workbookPath, 0, true, 5, "", "", true, Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
/* Load worksheets collection */
Excel.Sheets excelSheets = excelWorkbook.Worksheets;
/* Select first worksheet */
Excel.Worksheet excelWorksheet = (Excel.Worksheet)excelSheets[1];
/* Deleting first 87 Rows */
Excel.Range range = excelWorksheet.get_Range("1:87").EntireRow;
range.Delete(Excel.XlDeleteShiftDirection.xlShiftUp);
/* Save File */
excelWorkbook.SaveAs(@"out_file.xlsm");
excelWorkbook.Close(false);
excelApp.Application.Quit();
/* Release COM objects otherwise Excel remain running */
releaseObject(range);
releaseObject(excelWorkbook);
releaseObject(excelWorksheet);
releaseObject(excelApp);
This works with the .xlsx extension (it deletes the rows and saves the file under another name) but not with .xlsm (the program runs successfully but doesn't delete the data). Even if I manually save the Excel file as .xlsx and run the program on that file, it doesn't work; but if I manually copy and paste the data into another .xlsx and run the program on that file, it works. I don't get it. How can I rewrite this program to delete rows from .xlsm files? Please help, thank you.
Thanks to Christian Sauer, the EPPLUS.dll worked.
Step 1
Solution Explorer > Project Name > Add > Reference > Browse to EPPLUS.dll
Step 2
using OfficeOpenXml;
using OfficeOpenXml.Style;
using System.IO;
Step 3 (delete rows range)
using (var p = new ExcelPackage(new FileInfo(@"file.xlsm")))
{
var sheet = p.Workbook.Worksheets["Sheet1"];
sheet.DeleteRow(1, 87);
p.SaveAs(new FileInfo(@"output.xlsm"));
}
Step 4 (export .xlsm to .csv)
Insert Code between these lines
sheet.DeleteRow(1, 87);
====>[HERE]
p.SaveAs(new FileInfo(@"output.xlsm"));
/* Code placed to [HERE] placeholder */
using (var writer = File.CreateText(@"output.csv"))
{
var rowCount = sheet.Dimension.End.Row;
var columnCount = sheet.Dimension.End.Column;
for (var r = 1; r <= rowCount; r++)
{
for (var c = 1; c <= columnCount; c++)
{
writer.Write(sheet.Cells[r, c].Value);
writer.Write(";");
}
writer.WriteLine();
}
}
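One thing to watch with the export loop above: a cell value that itself contains ";" or a double quote is written as-is and will shift the columns in the resulting file. A possible way around that, sketched here as a plain helper with no EPPlus dependency, is to quote such values following the usual CSV convention. The class and method names (CsvField, Escape) are assumptions for illustration.

```csharp
using System;

static class CsvField
{
    // Hypothetical helper: wraps a value in double quotes when it contains
    // the separator, a double quote, or a line break, and doubles any
    // embedded quotes. Null values become an empty field.
    public static string Escape(object value, char separator = ';')
    {
        string text = Convert.ToString(value) ?? string.Empty;

        bool needsQuotes = text.IndexOf(separator) >= 0
                           || text.IndexOf('"') >= 0
                           || text.IndexOf('\n') >= 0
                           || text.IndexOf('\r') >= 0;

        return needsQuotes
            ? "\"" + text.Replace("\"", "\"\"") + "\""
            : text;
    }
}
```

In the loop, that would mean writing `writer.Write(CsvField.Escape(sheet.Cells[r, c].Value));` instead of writing the raw value. Numbers and dates are still formatted with the current culture, which may or may not be what the consumer of the CSV expects.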
I'm using VS2010 + Office Interop 2007 to attempt to get a few specific spreadsheet names from an Excel spreadsheet with 5-6 pages. All I am doing from there is saving those few spreadsheets I need in a tab delimited text file for further processing. So for the three spreadsheet names I get, each one will have its own tab delimited text file.
I can save a file as tab delimited just fine through Interop, but that's assuming I know what the given page name is. I have been informed that each page name will not follow a strict naming convention, but I can account for multiple names like "RCP", "rcp", "Recipient", etc when looking for a desired name.
My question is, can I get all spreadsheet page names in some sort of index so I may iterate through them and try to find the three names I need? That would be so much nicer than trying to grab "RCP", "rcp", "Recipient" pages via a bajillion try/catches.
I'm close, because I can get the COUNT of pages in an Excel spreadsheet via the following:
Excel.Application excelApp = new Excel.Application(); // Creates a new Excel Application
excelApp.Visible = true; // Makes Excel visible to the user.
// The following code opens an existing workbook
string workbookPath = path;
Excel.Workbook excelWorkbook = null;
try
{
excelWorkbook = excelApp.Workbooks.Open(workbookPath, 0,
false, 5, "", "", false, Excel.XlPlatform.xlWindows, "", true,
false, 0, true, false, false);
}
catch
{
//Create a new workbook if the existing workbook failed to open.
excelWorkbook = excelApp.Workbooks.Add();
}
// The following gets the Worksheets collection
Excel.Sheets excelSheets = excelWorkbook.Worksheets;
Console.WriteLine(excelSheets.Count.ToString()); //dat count
Thank you for your time.
foreach ( Worksheet worksheet in excelWorkbook.Worksheets )
{
MessageBox.Show( worksheet.Name );
}
You could use a dictionary:
Dictionary<string, Worksheet> dict = new Dictionary<string, Worksheet>();
foreach ( Worksheet worksheet in excelWorkbook.Worksheets )
{
dict.Add( worksheet.Name, worksheet );
}
// accessing the desired worksheet in the dictionary
MessageBox.Show( dict[ "Sheet1" ].Name );
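Since the page names vary in case ("RCP", "rcp", "Recipient"), the lookup itself can be made case-insensitive rather than listing every spelling. The sketch below works on plain strings so it can be tried without Excel; the class and method names (SheetNameMatcher, FindMatches) are assumptions for illustration, and trimming whitespace around the names is an extra guess about how untidy the real names might be.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SheetNameMatcher
{
    // Hypothetical helper: returns the sheet names that match any of the
    // accepted aliases, ignoring case and surrounding whitespace.
    public static List<string> FindMatches(IEnumerable<string> sheetNames,
                                           IEnumerable<string> aliases)
    {
        var wanted = new HashSet<string>(aliases, StringComparer.OrdinalIgnoreCase);

        return sheetNames
            .Where(name => wanted.Contains(name.Trim()))
            .ToList();
    }
}
```

The names can be collected in the same foreach loop shown above (adding worksheet.Name to a list) and then passed in together with something like `new[] { "RCP", "Recipient" }`. The same comparer also works for the dictionary: `new Dictionary<string, Worksheet>(StringComparer.OrdinalIgnoreCase)` makes `dict["rcp"]` and `dict["RCP"]` resolve to the same entry.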