I have an old Excel file that I want to modify.
I need to filter rows from the old file and write the selected rows to a new Excel file.
My plan is to read a single row, store it in a list, and pass that list to a function.
The function does some checking on the list; if the condition is satisfied, I write the entire row to the new Excel file, otherwise I go back and read the next row.
I can't find anything that reads a single row and saves it in a list.
I am able to read cell by cell and test the condition, but this is very slow.
Is there a better option?
Read the entire input file into a custom-made object in one pass, do your filtering on it in memory, then write all of the matching data into the final Excel file in a single pass.
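A minimal sketch of that approach using EPPlus, assuming the input and output paths and the filter condition (`KeepRow`, which here just keeps rows whose first cell is non-empty) are placeholders you would replace with your own:

```csharp
using System.Collections.Generic;
using System.IO;
using OfficeOpenXml; // EPPlus; version 5+ also requires setting ExcelPackage.LicenseContext first

class RowFilter
{
    static void Main()
    {
        var keptRows = new List<object[]>();

        using (var src = new ExcelPackage(new FileInfo(@"c:\data\old.xlsx")))
        {
            // Note: worksheet indexing is 1-based in EPPlus 4 and 0-based in EPPlus 5+
            var ws = src.Workbook.Worksheets[0];
            int cols = ws.Dimension.End.Column;

            for (int r = 1; r <= ws.Dimension.End.Row; r++)
            {
                // Read the whole row into an array in one go
                var row = new object[cols];
                for (int c = 1; c <= cols; c++)
                    row[c - 1] = ws.Cells[r, c].Value;

                if (KeepRow(row))
                    keptRows.Add(row);
            }
        }

        using (var dest = new ExcelPackage())
        {
            var ws = dest.Workbook.Worksheets.Add("Filtered");
            for (int r = 0; r < keptRows.Count; r++)
                for (int c = 0; c < keptRows[r].Length; c++)
                    ws.Cells[r + 1, c + 1].Value = keptRows[r][c];
            dest.SaveAs(new FileInfo(@"c:\data\new.xlsx"));
        }
    }

    // Hypothetical condition: keep rows whose first cell is non-empty
    static bool KeepRow(object[] row) =>
        row[0] != null && row[0].ToString().Length > 0;
}
```

Because the whole file is read once and written once, the per-cell COM round trips that make cell-by-cell Interop code slow are avoided entirely.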
I am trying to build a simple program that does my weekly job.
Every time I receive a CSV file, I update an Excel file I maintain.
My CSV looks like this:
key_code,eng_name,...so on
000001,some name,...so on
My Excel file looks like this:
Some text is written in A1-G4.
No column headers are written.
Data starts on the 5th row.
Each row has data in columns B-G (1st row B5-G5, 2nd row B6-G6).
If a key_code in the CSV does not exist in the Excel file, I add it.
If a key_code in the CSV does exist in the Excel file, I update the remaining columns.
If a key_code in the Excel file does not exist in the CSV, I delete the row.
Can anyone tell me an easy way or the steps to get this done?
I am very confused about which to use to update the Excel file: OleDb, Interop.Excel, EPPlus, Spire.XLS, etc.
And in which class should I store the CSV data and the Excel data in order to compare them?
For reading the CSV you can use the ChoETL reader; it is one of the best CSV readers I have ever used.
The tricky part is how to write the Excel file and choosing the right tool. Among the tools you have mentioned, EPPlus is the best, because:
Excel.Interop needs Excel (MS Office) to be installed on the production machine, which can create licensing issues.
OleDB requires some nitty-gritty knowledge to use well.
EPPlus provides an abstraction that makes it easy to manipulate Excel files.
using (var p = new ExcelPackage())
{
    //A workbook must have at least one cell, so let's add one...
    var ws = p.Workbook.Worksheets.Add("MySheet");
    //To set values in the spreadsheet use the Cells indexer.
    ws.Cells["A1"].Value = "This is cell A1";
    //Save the new workbook. We haven't specified the filename so use the SaveAs method.
    p.SaveAs(new FileInfo(@"c:\workbooks\myworkbook.xlsx"));
}
This is the very simple writing example given on the GitHub page; please try it and post any specific issues.
If a key_code in the CSV does not exist in the Excel file, I add it.
If a key_code in the CSV does exist in the Excel file, I update the remaining columns.
If a key_code in the Excel file does not exist in the CSV, I delete the row.
As I understand the rules above, you can simply delete the old Excel file and create a new file from the data in the CSV file.
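A rough C# sketch of that delete-and-rebuild approach with EPPlus, under some assumptions: the file names are placeholders, the naive `Split(',')` does not handle quoted commas (use a real CSV reader such as ChoETL for that), and the fixed text in A1-G4 is stubbed with a single cell:

```csharp
using System.IO;
using OfficeOpenXml; // EPPlus

class WeeklyRebuild
{
    static void Main()
    {
        var csvPath = "weekly.csv";      // assumed input path
        var xlsxPath = "tracking.xlsx";  // assumed output path

        // Every key_code absent from the CSV gets deleted anyway,
        // so the whole workbook can be rebuilt from the CSV each week.
        if (File.Exists(xlsxPath))
            File.Delete(xlsxPath);

        using (var p = new ExcelPackage())
        {
            var ws = p.Workbook.Worksheets.Add("Data");
            ws.Cells["A1"].Value = "Some fixed header text"; // stand-in for the A1-G4 block

            int row = 5;            // data starts on the 5th row
            bool header = true;
            foreach (var line in File.ReadLines(csvPath))
            {
                if (header) { header = false; continue; } // skip the CSV header line
                var fields = line.Split(',');             // naive: no quoted-comma handling
                for (int i = 0; i < fields.Length && i < 6; i++)
                    ws.Cells[row, 2 + i].Value = fields[i]; // columns B..G
                row++;
            }
            p.SaveAs(new FileInfo(xlsxPath));
        }
    }
}
```

Rebuilding from scratch also sidesteps the question of which class to hold both data sets in for comparison: no comparison is needed.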
You can use R to do this very easily:
# Install package 'writexl' if you haven't, with install.packages("writexl")
library(writexl)

# Excel file
fn <- "file.xlsx"

# Delete the file if it already exists
if (file.exists(fn)) {
  file.remove(fn)
}

# Read the csv file into a data frame
df <- read.csv("C:/newfile.csv")

# Write the data frame to the Excel file. Change col_names = TRUE if you want the headers.
write_xlsx(
  df,
  path = fn,
  col_names = FALSE
)
I am combining multiple large Excel files with different columns and numbers of columns.
Before starting to combine, I want to collect all the header rows so that I can build a data table that has every column in advance.
I know that there is a DataTable.Merge method in C# which allows adding missing columns while combining.
There are many big Excel files, and the maximum number of rows per sheet in Excel is about 1 million. So when I reach the limit, I must save part of the combined data to Excel, clear the contents, and keep combining after that. This means the parts saved early in the process won't have the same schema as the final one.
That is the reason I must collect all the headers in advance.
As far as I can tell, C# libraries like EPPlus or ExcelDataReader load the entire contents of the Excel file. This takes very long, and I don't need to load all the content at once.
Does anybody here know how to load only the Excel header row?
Thank you so much.
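ExcelDataReader can do this: its `IExcelDataReader` streams rows on demand via `Read()`, so stopping after the first row avoids materializing the whole sheet (it is loading everything into a `DataSet` with `AsDataSet()` that is slow). A sketch, assuming the folder path is a placeholder and the header is the first row of the first sheet:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using ExcelDataReader;

class HeaderCollector
{
    static void Main()
    {
        // On .NET Core, legacy .xls files additionally need:
        // System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
        var allHeaders = new HashSet<string>();

        foreach (var path in Directory.GetFiles(@"c:\data", "*.xlsx"))
        {
            using (var stream = File.Open(path, FileMode.Open, FileAccess.Read))
            using (var reader = ExcelReaderFactory.CreateReader(stream))
            {
                // Read() advances one row at a time; we stop after the header row
                if (reader.Read())
                    for (int i = 0; i < reader.FieldCount; i++)
                        allHeaders.Add(reader.GetValue(i)?.ToString() ?? "");
            }
        }

        Console.WriteLine(string.Join(", ", allHeaders));
    }
}
```

The union of header names collected this way gives you the full schema for the combined DataTable before any data rows are read.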
Using FastDBF I am creating DBF files on the fly (dynamically) with data fed through text files. The problem is that there is always one blank column at the very end of the file. This is always true. I want to be able to remove just that last column, and my code does this, but it seems to shift all of the data in the file around and turns it into a mess. In terms of code, I am simply using these lines before I close the file:
odbf.Header.Unlock();
odbf.Header.RemoveColumn(colCount - 1);
It requires you to unlock the header to make this edit, and colCount is just the number of columns the file has. It does successfully remove the last column, but as I said, it shifts all of the data around with it.
I need to take values from one sheet in one Excel workbook and insert them into another existing workbook.
The values I need to take are the first 6 columns of the first file:
And I want to insert them at the beginning of the other workbook, like so:
I've been using Spire.XLS to read values from the first sheet and I thought I could do the same here: parse the worksheet, read the values, and just paste them into the other sheet. But that doesn't work, because three of the columns I want to copy have the same header, "Descripcion", so my parser only takes the values from the first "Descripcion" column and skips the others.
Is there any way, using Excel.Interop or maybe Spire itself, to copy and paste entire columns between workbooks? Or alternatively, is there any way to get the values of all three "Descripcion" columns (without rewriting the column titles)?
VSTO might be helpful. I've done similar tasks in C#/VSTO.
Perhaps read through: Simple Example of VSTO Excel using a worksheet as a datasource
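Alternatively, Excel.Interop can copy whole columns by position rather than by header, which sidesteps the duplicate "Descripcion" problem entirely. A sketch under the usual Interop caveats (Excel must be installed; the file paths are placeholders):

```csharp
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

class ColumnCopier
{
    static void Main()
    {
        var app = new Excel.Application();
        try
        {
            var srcBook = app.Workbooks.Open(@"c:\data\source.xlsx");
            var dstBook = app.Workbooks.Open(@"c:\data\target.xlsx");
            var srcSheet = (Excel.Worksheet)srcBook.Worksheets[1];
            var dstSheet = (Excel.Worksheet)dstBook.Worksheets[1];

            // Copy whole columns A-F by position; duplicate headers
            // are irrelevant because nothing is looked up by name.
            srcSheet.Range["A:F"].Copy(dstSheet.Range["A1"]);

            dstBook.Save();
            srcBook.Close(false);
            dstBook.Close(false);
        }
        finally
        {
            app.Quit();
            Marshal.ReleaseComObject(app);
        }
    }
}
```

The same index-based idea works in Spire.XLS if you read cells by row/column position instead of mapping them through header names.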
I need to read the data in a particular range of an Excel file and upload it to a database.
The required data does not start at cell A1; it starts at A15, and A14 is the header row for the columns. There are seven columns with headers.
(I tried to read cells via the "get_Range" option.)
We need to read the data in each cell and do a row-by-row insert into the database.
There are thousands of files of the same type in a specific folder.
I am trying to do this as a C# console app because it is just a one-time job.
Here is the answer I found.
Step 1: Loop through each file in the source directory.
Step 2: Add an Excel Interop reference and create an Excel Application object, plus objects for the Workbook and Range (for the used range).
Step 3: Use the get_Range() function and read the rows. (Since this solution is specific to one problem, the start and end ranges of the rows and columns are well known.)
Step 4: Each row read can either be appended to a string until the end of the file, OR the insert can be done after reading each row.
Step 5: Get the connection string and create a SqlConnection object to perform the inserts. It is better to use a transaction and commit.
Done. Thanks to all.
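The steps above can be sketched as follows. The folder path, the table and column names, and the `G1000` end of the range are all assumptions to be replaced; reading the range as a single `object[,]` via `Value2` avoids one COM call per cell:

```csharp
using System;
using System.Data.SqlClient;
using System.IO;
using Excel = Microsoft.Office.Interop.Excel;

class BulkUploader
{
    static void Main()
    {
        var connStr = "...";  // your connection string
        var app = new Excel.Application { Visible = false };

        using (var conn = new SqlConnection(connStr))
        {
            conn.Open();
            // Step 1: loop through each file in the source directory
            foreach (var file in Directory.GetFiles(@"c:\input", "*.xlsx"))
            {
                var book = app.Workbooks.Open(file);
                var sheet = (Excel.Worksheet)book.Worksheets[1];
                // Steps 2-3: data block is fixed, header on row 14, data from A15
                object[,] values = (object[,])sheet.Range["A15:G1000"].Value2;

                // Steps 4-5: insert row by row inside one transaction per file
                using (var tx = conn.BeginTransaction())
                {
                    for (int r = 1; r <= values.GetLength(0); r++)
                    {
                        if (values[r, 1] == null) break; // stop at the first empty row
                        using (var cmd = new SqlCommand(
                            "INSERT INTO MyTable (C1, C2, C3, C4, C5, C6, C7) " +
                            "VALUES (@p1, @p2, @p3, @p4, @p5, @p6, @p7)", conn, tx))
                        {
                            for (int c = 1; c <= 7; c++)
                                cmd.Parameters.AddWithValue(
                                    "@p" + c, values[r, c] ?? (object)DBNull.Value);
                            cmd.ExecuteNonQuery();
                        }
                    }
                    tx.Commit();
                }
                book.Close(false);
            }
        }
        app.Quit();
    }
}
```

One transaction per file means a failed file rolls back cleanly without losing the files already committed.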