I'm using LinqToExcel along with C# to read data from a MS Excel spreadsheet and then update data records in my MS SQL database.
The Excel file has these headers: COURSE_ID, PROVIDER_COURSE_TITLE
My code is like this:
public class TestDataCourse
{
[ExcelColumn("PROVIDER_COURSE_TITLE")]
public string cTitle
{
get;
set;
}
}
///////////////////////////////
string pathToExcelFile = @"C:\O_COURSES.xlsx";
ConnexionExcel ConxObject = new ConnexionExcel(pathToExcelFile);
//read data from excel
var query1 = (from a in ConxObject.UrlConnexion.Worksheet<TestDataCourse>("O_COURSES")
select a).Take(2000).ToList();
//Get data from MS SQL database that need updated
var courses = _courseService.GetAllCoursesFromDB().Take(100).ToList();
int count = 0;
foreach (var course in courses)
{
//Iterate through the excel doc and assign the
TestDataCourse fakeData = query1.Skip(count).Take(1).FirstOrDefault();
course.CourseTitle = fakeData.cTitle;
count++;
}
_courseService.Save();
When I run this code, it updates some of the records in my database, but as execution continues Visual Studio opens a Source Not Available tab and I get the error "Cannot perform runtime binding on a null reference".
The null reference exception made me think there might be some null data in the Excel doc, so I put this line of code into my foreach loop:
course.CourseTitle = fakeData == null ? "Course Test" : fakeData.cTitle;
But I still get the same problem.
Could anyone please help?
Thanks.
How can I read Excel sheet data in ASP.NET without using OleDbConnection? I have already tried OleDbConnection, but I am facing issues with it.
Are there any other ways to do this?
You need EPPlus for this kind of work. It can read an Excel file without OleDbConnection.
You can use LinqToExcel to do this. The following code should help a bit:
public ActionResult fileread()
{
var excel = new ExcelQueryFactory(@"File Name");
excel.AddMapping<ABC>(x => x.S1, "code"); // ABC is your database table name... code is the column name of excel file
excel.AddMapping<ABC>(x => x.S2, "description");
// you can map as many columns you want
var e = (from c in excel.Worksheet<ABC>("MyExcelFile") select c); // MyExcelFile is the name of Excel File's Sheet name
// similarly you can do whatever you want with the data like.. save to DB
foreach (var y in e)
{
ABC a = new ABC();
a.S1 = y.S1;
a.S2 = y.S2;
db.ABC.Add(a);
}
db.SaveChanges();
}
I am using Lucene.Net version 2.0.5.
I want to open and display all the records in the index file in a grid (table) format in an ASP.NET web application, and also provide edit option for each cell in that grid.
But I don't know how to read each row from Index file.
I used the code below:
private List<String> GetIndexTerms(string indexFolder)
{
List<String> termlist = new List<string>();
IndexReader reader = IndexReader.Open(indexFolder, false);
TermEnum terms = reader.Terms();
while (terms.Next())
{
Term term = terms.Term();
String termText = term.Text();
int frequency = reader.DocFreq(term);
termlist.Add(termText);
}
reader.Close();
return termlist;
}
but it returns a list of individual terms, so I am unable to aggregate the data by row (record).
Let me know if there is a way to read the index row by row, or whether I need to update the version of Lucene that I am currently using.
Also, please provide any links to better documentation for Lucene.Net.
You can read all the records/rows (documents in Lucene terminology) directly from the index without searching:
var reader = IndexReader.Open(dir);
for (int i = 0; i < reader.MaxDoc(); i++)
{
if (reader.IsDeleted(i)) continue;
Document d = reader.Document(i);
var fieldValuePairs = d.GetFields()
.Select(f => new {
Name = f.Name(),
Value = f.StringValue() })
.ToArray();
}
PS: v2.0.5 is very old. Try the latest Lucene.Net.
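Since the goal is a grid, the per-document field/value pairs still have to be lined up under shared columns. Below is a minimal sketch of that step in plain C#, with no Lucene types involved. The `IndexGrid` class and `ToGrid` method are hypothetical names for illustration, and the sketch assumes each document has already been flattened into name/value pairs as in the loop above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexGrid
{
    // Hypothetical helper: turns per-document (field name, value) pairs into
    // rows that share one column list, so they can be bound to a grid.
    public static (List<string> Columns, List<string[]> Rows) ToGrid(
        IEnumerable<IEnumerable<KeyValuePair<string, string>>> documents)
    {
        var docs = documents.Select(d => d.ToList()).ToList();

        // One column per distinct field name found in any document.
        var columns = docs.SelectMany(d => d.Select(p => p.Key)).Distinct().ToList();

        var rows = new List<string[]>();
        foreach (var doc in docs)
        {
            // Cells stay null when a document does not have that field.
            var row = new string[columns.Count];
            foreach (var pair in doc)
            {
                int col = columns.IndexOf(pair.Key);
                // A field can occur more than once in a document; keep the first value.
                if (row[col] == null) row[col] = pair.Value;
            }
            rows.Add(row);
        }
        return (columns, rows);
    }

    public static void Main()
    {
        var docs = new[]
        {
            new[] { new KeyValuePair<string, string>("id", "1"), new KeyValuePair<string, string>("title", "First") },
            new[] { new KeyValuePair<string, string>("id", "2"), new KeyValuePair<string, string>("author", "Ann") },
        };

        var grid = ToGrid(docs);
        Console.WriteLine(string.Join(" | ", grid.Columns));
        foreach (var row in grid.Rows)
            Console.WriteLine(string.Join(" | ", row.Select(v => v ?? "")));
    }
}
```

Documents in an index do not have to share the same fields, which is why the column list is built from all documents first and missing cells are left empty.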
My overall problem is that I have a large Excel file (columns A-S, 85000 rows) that I want to convert to XML. The data in the cells is all text.
The process I'm using now is to manually save the excel file as csv, then parse that in my own c# program to turn it into XML. If you have better recommendations, please recommend. I've searched SO and the only fast methods I found for converting straight to XML require my data to be all numeric.
(Tried reading cell by cell, would have taken 3 days to process)
So, unless you can recommend a different way for me to approach the problem, I want to be able to programmatically remove all commas, <, >, ', and " from the excel sheet.
There are many options to read/edit/create Excel files:
MS provides the free OpenXML SDK V 2.0 - see http://msdn.microsoft.com/en-us/library/bb448854%28office.14%29.aspx (XLSX only)
This can read+write MS Office files (including Excel).
Another free option see http://www.codeproject.com/KB/office/OpenXML.aspx (XLSX only)
If you need more, such as handling older Excel versions (XLS, not only XLSX), rendering, creating PDFs, formulas etc., there are various free and commercial libraries: ClosedXML (free, XLSX only), EPPlus (free, XLSX only), Aspose.Cells, SpreadsheetGear, LibXL, Flexcel etc.
Another option is Interop, which requires Excel to be installed locally, BUT Interop is not supported in server scenarios by MS.
Any library-based approach to deal with the Excel-file directly is way faster than Interop in my experience...
I would use a combination of Microsoft.Office.Interop.Excel and XmlSerializer to get the job done.
This is in light of the fact that a) you're using a console application, and b) the interop assemblies are easy to integrate into the solution (just References->Add).
I'm assuming that you have a copy of Excel installed on the machine running the process (you mentioned you manually open the workbook currently, hence the assumption).
The code would look something like this:
The serializable layer:
public class TestClass
{
public List<TestLineItem> LineItems { get; set; }
public TestClass()
{
LineItems = new List<TestLineItem>();
}
}
public class TestLineItem
{
private string SanitizeText(string input)
{
return input.Replace(",", "")
.Replace(".", "")
.Replace("<", "")
.Replace(">", "")
.Replace("'", "")
.Replace("\"", "");
}
private string m_field1;
private string m_field2;
public string Field1
{
get { return m_field1; }
set { m_field1 = SanitizeText(value); }
}
public string Field2
{
get { return m_field2; }
set { m_field2 = SanitizeText(value); }
}
public decimal Field3 { get; set; }
public TestLineItem() { }
public TestLineItem(object field1, object field2, object field3)
{
m_field1 = (field1 ?? "").ToString();
m_field2 = (field2 ?? "").ToString();
if (field3 == null || field3.ToString() == "")
Field3 = 0m;
else
Field3 = Convert.ToDecimal(field3.ToString());
}
}
Then open the worksheet and load into a 2D array:
// using OEXcel = Microsoft.Office.Interop.Excel;
var app = new OEXcel.Application();
var wbPath = Path.Combine(
Environment.GetFolderPath(
Environment.SpecialFolder.MyDocuments), "Book1.xls");
var wb = app.Workbooks.Open(wbPath);
var ws = (OEXcel.Worksheet)wb.ActiveSheet;
// there are better ways to do this...
// this one's just off the top of my head
var rngTopLine = ws.get_Range("A1", "C1");
var rngEndLine = rngTopLine.get_End(OEXcel.XlDirection.xlDown);
var rngData = ws.get_Range(rngTopLine, rngEndLine);
var arrayData = (object[,])rngData.Value2;
var tc = new TestClass();
// since you're enumerating an array, the operation will run much faster
// than reading the worksheet line by line.
for (int i = arrayData.GetLowerBound(0); i <= arrayData.GetUpperBound(0); i++)
{
tc.LineItems.Add(
new TestLineItem(arrayData[i, 1], arrayData[i, 2], arrayData[i, 3]));
}
var xs = new XmlSerializer(typeof(TestClass));
// dispose the stream so the XML is flushed and the file handle is released
using (var fs = File.Create(Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments),
"Book1.xml")))
{
xs.Serialize(fs, tc);
}
wb.Close();
app.Quit();
The generated XML output will look something like this:
<TestClass>
<LineItems>
<TestLineItem>
<Field1>test1</Field1>
<Field2>some&lt;encoded&gt; stuff here</Field2>
<Field3>123456.789</Field3>
</TestLineItem>
<TestLineItem>
<Field1>test2</Field1>
<Field2>testing some commas, and periods.</Field2>
<Field3>23456789.12</Field3>
</TestLineItem>
<TestLineItem>
<Field1>test3</Field1>
<Field2>text in "quotes" and 'single quotes'</Field2>
<Field3>0</Field3>
</TestLineItem>
</LineItems>
</TestClass>
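As a side note on the characters the question wants removed: XmlSerializer escapes markup characters in element text instead of failing on them, so `<`, `>` and `&` do not necessarily have to be stripped for the XML to be well-formed. The sketch below illustrates this with a throwaway `Item` class and a hypothetical `ToXml` helper, neither of which is part of the answer above. It serializes to a string rather than a file to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Throwaway type for the demonstration only.
public class Item
{
    public string Text { get; set; }
}

public static class EscapeDemo
{
    // Serializes one Item to an XML string so the escaping can be inspected.
    public static string ToXml(string text)
    {
        var xs = new XmlSerializer(typeof(Item));
        using (var sw = new StringWriter())
        {
            xs.Serialize(sw, new Item { Text = text });
            return sw.ToString();
        }
    }

    public static void Main()
    {
        // The Text element comes out as: some&lt;encoded&gt; stuff &amp; more
        Console.WriteLine(ToXml("some<encoded> stuff & more"));
    }
}
```

Whether to strip or keep those characters then depends on what the consumer of the XML expects. Commas only matter if a CSV file is used as an intermediate step.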
I am using EPPlus to open and edit an existing Excel document.
The document contains 2 sheets - one with a pivot table (named Pivot) and one with the data (Data!$A$1:$L$9899).
I have a reference to the ExcelPivotTable with the code below, but can't find any properties that relate to the data source.
ExcelPackage package = new ExcelPackage(pivotSpreadsheet);
foreach (ExcelWorksheet worksheet in package.Workbook.Worksheets)
{
if (worksheet.PivotTables.Count > 0)
{
pivotWorkSheetName = worksheet.Name;
pivotTable = worksheet.PivotTables[0];
}
}
How do I get the name and range of the source data? Is there an obvious property that I'm missing or do I have to go hunting through some xml?
PivotTables use a data cache for the data store for performance & abstraction reasons. Remember, you can have a pivot that points to a web service call. The cache itself is what stores that reference. For pivots that refer to data elsewhere in a workbook, you can access it in EPPlus like this:
worksheet.PivotTables[0].CacheDefinition.SourceRange.FullAddress;
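The full address comes back as a single string, so the sheet name and the range still need to be separated. Here is a minimal sketch of that split, assuming the address has the form `Data!$A$1:$L$9899` as in the question. The `PivotSource.Split` helper is hypothetical and not part of EPPlus, and it does not handle addresses that carry a workbook prefix.

```csharp
using System;

public static class PivotSource
{
    // Hypothetical helper, not part of EPPlus: splits an address such as
    // "Data!$A$1:$L$9899" into the sheet name and the cell range.
    public static (string Sheet, string Range) Split(string fullAddress)
    {
        int bang = fullAddress.LastIndexOf('!');
        if (bang < 0) return ("", fullAddress); // no sheet prefix present

        string sheet = fullAddress.Substring(0, bang);

        // Sheet names containing spaces are wrapped in single quotes,
        // e.g. 'My Data'!$A$1, with embedded quotes doubled.
        if (sheet.Length >= 2 && sheet[0] == '\'' && sheet[sheet.Length - 1] == '\'')
            sheet = sheet.Substring(1, sheet.Length - 2).Replace("''", "'");

        return (sheet, fullAddress.Substring(bang + 1));
    }

    public static void Main()
    {
        var source = Split("Data!$A$1:$L$9899");
        Console.WriteLine(source.Sheet); // Data
        Console.WriteLine(source.Range); // $A$1:$L$9899
    }
}
```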
If anyone is interested in updating the data source with OpenXML SDK 2.5, here is the code I used.
using (var spreadsheet = SpreadsheetDocument.Open(filepath, true))
{
PivotTableCacheDefinitionPart ptp = spreadsheet.WorkbookPart.PivotTableCacheDefinitionParts.First();
ptp.PivotCacheDefinition.RefreshOnLoad = true;//refresh the pivot table on document load
ptp.PivotCacheDefinition.RecordCount = Convert.ToUInt32(ds.Tables[0].Rows.Count);
ptp.PivotCacheDefinition.CacheSource.WorksheetSource.Reference = "A1:" + IntToLetters(ds.Tables[0].Columns.Count) + (ds.Tables[0].Rows.Count + 1);//Cell Range as data source
ptp.PivotTableCacheRecordsPart.PivotCacheRecords.RemoveAllChildren();//it is rebuilt when pivot table is refreshed
ptp.PivotTableCacheRecordsPart.PivotCacheRecords.Count = 0;//it is rebuilt when pivot table is refreshed
}
public string IntToLetters(int value)//copied from another stackoverflow post
{
string result = string.Empty;
while (--value >= 0)
{
result = (char)('A' + value % 26) + result;
value /= 26;
}
return result;
}
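To make the reference string concrete, here is a small self-contained sketch of how the range is put together. It reuses the same conversion as `IntToLetters` and adds a hypothetical `Build` helper that assumes one header row followed by the data rows, starting at cell A1.

```csharp
using System;

public static class RangeReference
{
    // Same conversion as IntToLetters: 1 -> "A", 26 -> "Z", 27 -> "AA".
    public static string IntToLetters(int value)
    {
        string result = string.Empty;
        while (--value >= 0)
        {
            result = (char)('A' + value % 26) + result;
            value /= 26;
        }
        return result;
    }

    // Hypothetical helper: one header row plus dataRows data rows, starting at A1.
    public static string Build(int columnCount, int dataRows)
    {
        return "A1:" + IntToLetters(columnCount) + (dataRows + 1);
    }

    public static void Main()
    {
        // 12 columns and 9898 data rows give the range from the question.
        Console.WriteLine(Build(12, 9898)); // A1:L9899
    }
}
```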