'System.AccessViolationException' occurred - c#

UPDATED: added full block of code where error occurs
UPDATE 2: I found a weird anomaly. The code has now been continuously breaking on that line when the tabName variable equals "service line prior year". This morning, for grins, I changed the tab name to "test", so in turn the tabName variable equals "test", and it worked more often than not. I am really at a loss.
I have researched a ton and can't find anything that addresses what is happening in my code. It happens randomly though. Sometimes it doesn't happen, then other times it happens in the same spot, but all on this part of the code (on the line templateSheet = templateBook.Sheets[tabName];):
public void ExportToExcel(DataSet dataSet, string filePath, int i, int h, Excel.Application excelApp)
{
//create the excel definitions again.
//Excel.Application excelApp = new Excel.Application();
//excelApp.Visible = true;
FileInfo excelFileInfo = new FileInfo(filePath);
Boolean fileOpenTest = IsFileOpen(excelFileInfo);
Excel.Workbook templateBook;
Excel.Worksheet templateSheet;
//check to see if the template is already open; if it's not, open it,
//if it is, bind to the open instance to work with it
if (!fileOpenTest)
{ templateBook = excelApp.Workbooks.Open(filePath); }
else
{ templateBook = (Excel.Workbook)System.Runtime.InteropServices.Marshal.BindToMoniker(filePath); }
//this grabs the name of the tab to dump the data into from the "Query Dumps" Tab
string tabName = lstQueryDumpSheet.Items[i].ToString();
templateSheet = templateBook.Sheets[tabName];
excelApp.Calculation = Excel.XlCalculation.xlCalculationManual;
templateSheet = templateBook.Sheets[tabName];
// Copy DataTable
foreach (System.Data.DataTable dt in dataSet.Tables)
{
// Copy the DataTable to an object array
object[,] rawData = new object[dt.Rows.Count + 1, dt.Columns.Count];
// Copy the values to the object array
for (int col = 0; col < dt.Columns.Count; col++)
{
for (int row = 0; row < dt.Rows.Count; row++)
{ rawData[row, col] = dt.Rows[row].ItemArray[col]; }
}
// Calculate the final column letter
string finalColLetter = string.Empty;
string colCharset = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int colCharsetLen = colCharset.Length;
if (dt.Columns.Count > colCharsetLen)
{ finalColLetter = colCharset.Substring((dt.Columns.Count - 1) / colCharsetLen - 1, 1); }
finalColLetter += colCharset.Substring((dt.Columns.Count - 1) % colCharsetLen, 1);
//this grabs the cell address from the "Query Dump" sheet, splits it on the '=' and
//pulls out only the cell address (i.e., "address=a3" becomes "a3")
string dumpCellString = lstQueryDumpText.Items[i].ToString();
string dumpCell = dumpCellString.Split('=').Last();
//refers to the range into which we are dumping the DataSet. The upper left-hand cell is
//defined by the 'dumpCell' variable and the bottom right cell is defined by the
//final column letter and the count of rows.
string firstRef = "";
string baseRow = "";
if (char.IsLetter(dumpCell, 1))
{
char[] createCellRef = dumpCell.ToCharArray();
firstRef = createCellRef[0].ToString() + createCellRef[1].ToString();
for (int z = 2; z < createCellRef.Count(); z++)
{
baseRow = baseRow + createCellRef[z].ToString();
}
}
else
{
char[] createCellRef = dumpCell.ToCharArray();
firstRef = createCellRef[0].ToString();
for (int z = 1; z < createCellRef.Count(); z++)
{
baseRow = baseRow + createCellRef[z].ToString();
}
}
int baseRowInt = Convert.ToInt32(baseRow);
int startingCol = ColumnLetterToColumnIndex(firstRef);
int endingCol = ColumnLetterToColumnIndex(finalColLetter);
int finalCol = startingCol + endingCol;
string endCol = ColumnIndexToColumnLetter(finalCol - 1);
int endRow = (baseRowInt + (dt.Rows.Count - 1));
string cellCheck = endCol + endRow;
string excelRange;
if (dumpCell.ToUpper() == cellCheck.ToUpper())
{
excelRange = string.Format(dumpCell + ":" + dumpCell);
}
else
{
excelRange = string.Format(dumpCell + ":{0}{1}", endCol, endRow);
}
//this dumps the cells into the range on Excel as defined above
templateSheet.get_Range(excelRange, Type.Missing).Value2 = rawData;
//checks to see if all the SQL queries have been run from the "Query Dump" tab, if not, continue
//the loop, if it is the last one, then save the workbook and move on.
if (i == lstSqlAddress.Items.Count - 1)
{
excelApp.Calculation = Excel.XlCalculation.xlCalculationAutomatic;
/*Run through the value save sheet array then grab the address from the corresponding list
place in the address array. If the address reads "whole sheet" then save the whole page,
else set the addresses range and value save that.*/
//for (int y = 0; y < lstSaveSheet.Items.Count; y++)
//{
// MessageBox.Show("Save Sheet: " + lstSaveSheet.Items[y] + "\n" + "Save Address: " + lstSaveRange.Items[y]);
//}
//run the macro to hide the unused columns
excelApp.Run("ReportMakerExecute");
//save excel file as hospital name and move onto the next
SaveTemplateAs(templateBook, h);
//close the open Excel App before looping back
//Marshal.ReleaseComObject(templateSheet);
//Marshal.ReleaseComObject(templateBook);
//templateSheet = null;
//templateBook = null;
//GC.Collect();
//GC.WaitForPendingFinalizers();
}
//Close excel Applications
//excelApp.Quit();
//Marshal.ReleaseComObject(templateSheet);
//Marshal.FinalReleaseComObject(excelApp);
//excelApp = null;
//templateSheet = null;
// GC.Collect();
//GC.WaitForPendingFinalizers();
}
}
The try/catch block is of no use either. This is the error:
"An unhandled exception of type 'System.AccessViolationException' occurred in SQUiRE (Sql QUery REtriever) v1.exe. Additional information: Attempted to read or write protected memory. This is often an indication that other memory is corrupt."

System.AccessViolationException normally happens when native (non-.NET) code tries to access unallocated memory. .NET then translates the fault into the managed world as this exception.
Your code itself does not have any unsafe block, so the access violation must be happening inside Excel.
Given that it sometimes happens and sometimes does not, I would say it could be caused by parallel Excel usage (I think Excel COM is not thread-safe).
I would recommend putting all your code inside a lock block, to prevent Excel from being used in parallel. Something like this:
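private static readonly object ExcelLock = new object(); // shared by every caller
public void ExportToExcel(DataSet dataSet, string filePath, int i, int h, Excel.Application excelApp)
{
lock (ExcelLock) // a dedicated lock object used as a mutex; locking on a Type can collide with unrelated code
{
// Your original code here
}
}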

Long story short, after three more days of testing: it was caused by the Excel file that the program was trying to open and fill with SQL results. The buffer was filling up and causing an exception. It happened at the same point in every run because the load time of the Excel file determined whether a run worked or failed.
So after the load I added a delaying do...while loop that checks whether the file is accessible yet, and that stopped the failures. fileOpenTest was taken from here
do
{
// Task.Delay without await returns immediately, so block the thread instead
Thread.Sleep(2000);
// refresh the flag on every pass; otherwise the loop condition never changes
fileOpenTest = IsFileOpen(excelFileInfo);
}
while (!fileOpenTest);
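A wait loop like the one above has no upper bound, so it would spin forever if the workbook never becomes available. The following is a minimal sketch of a bounded variant, not the poster's actual code: the `Polling` class and `WaitUntil` method are hypothetical names, and the interval and timeout values are assumptions to tune for your own file sizes.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Polling
{
    // Calls 'condition' repeatedly, sleeping 'interval' between attempts,
    // until it returns true or 'timeout' has elapsed.
    // Returns true if the condition was met, false if the wait timed out.
    public static bool WaitUntil(Func<bool> condition, TimeSpan interval, TimeSpan timeout)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed >= timeout)
            {
                return false;
            }
            Thread.Sleep(interval);
        }
        return true;
    }
}
```

In the export method this could be called as `Polling.WaitUntil(() => IsFileOpen(excelFileInfo), TimeSpan.FromSeconds(2), TimeSpan.FromMinutes(1))`, with the caller deciding what to do when it returns false.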

Related

The source contains no DataRows. error when one iteration in for loop

I am making a program in Visual Studio where you can read in an excel file in a specific format and where my program converts the data from the excel file in a different format and stores it in a database table.
Below you can find a part of my code where something strange happens
//copy schema into new datatable
DataTable _longDataTable = _library.Clone();
foreach (DataRow drlibrary in _library.Rows)
{
//count number of variables in a row
string check = drlibrary["Check"].ToString();
int varCount = check.Length - check.Replace("{", "").Length;
int count_and = 0;
if (check.Contains("and") || check.Contains("or"))
{
count_and = Regex.Matches(check, "and").Count;
varCount = varCount - count_and;
}
//loop through number of counted variables in order to add rows to long datatable (one row per variable)
for (int i = 1; i <= varCount; i++)
{
var newRow = _longDataTable.NewRow();
newRow.ItemArray = drlibrary.ItemArray;
string j = i.ToString();
//fill variablename with variable number
if (i < 10)
{
newRow["VariableName"] = "Variable0" + j;
}
else
{
newRow["VariableName"] = "Variable" + j;
}
}
}
When varCount equals 1, I get the following error message when running the program after inserting an excel file
The source contains no DataRows.
I don't know why I can't run the for loop with just one iteration. Anyone who can help me?

Optimize performance of data processing method

I am using the following code to take some data (in XML like format - Not well formed) from a .txt file and then write it to an .xlsx using EPPlus after doing some processing. StreamElements is basically a modified XmlReader. My question is about performance, I have made a couple of changes but don't see what else I can do. I'm going to use this for large datasets so I'm trying to modify to make this as efficient and fast as possible. Any help will be appreciated!
I tried using p.SaveAs() to do the Excel writing, but I did not really see a performance difference. Are there better, faster ways to do the writing? Any suggestions are welcome.
using (ExcelPackage p = new ExcelPackage())
{
ExcelWorksheet ws = p.Workbook.Worksheets[1];
ws.Name = "data1";
int rowIndex = 1; int colIndex = 1;
foreach (var element in StreamElements(pa, "XML"))
{
var values = element.DescendantNodes().OfType<XText>()
.Select(v => Regex.Replace(v.Value, "\\s+", " "));
string[] data = string.Join(",", values).Split(',');
data[2] = toDateTime(data[2]);
for (int i = 0; i < data.Count(); i++)
{
if (rowIndex < 1000000)
{
var cell1 = ws.Cells[rowIndex, colIndex];
cell1.Value = data[i];
colIndex++;
}
}
rowIndex++;
}
}
ws.Cells[ws.Dimension.Address].AutoFitColumns();
Byte[] bin = p.GetAsByteArray();
using (FileStream fs = File.OpenWrite("C:\\test.xlsx"))
{
fs.Write(bin, 0, bin.Length);
}
}
}
Currently, doing the processing and then writing 1 million lines into an Excel worksheet takes about 30-35 minutes.
I've run into this issue before: there is a huge overhead when you modify worksheet cells individually, one by one.
The solution is to collect the data in memory first and then write it to the worksheet as a single range instead of cell by cell.
using (ExcelPackage p = new ExcelPackage())
{
//A new package has no worksheets yet, so add one rather than indexing Worksheets[1]
ExcelWorksheet ws = p.Workbook.Worksheets.Add("data1");
//Starting cell
int startRow = 1;
int startCol = 1;
//One object[] per worksheet row, collected in memory first
List<object[]> rows = new List<object[]>();
//Tried not to touch this part
foreach (var element in StreamElements(pa, "XML"))
{
var values = element.DescendantNodes().OfType<XText>()
.Select(v => Regex.Replace(v.Value, "\\s+", " "));
//Removed unnecessary split and join, use ToArray instead
string[] eData = values.ToArray();
eData[2] = toDateTime(eData[2]);
rows.Add(eData);
}
//Write every row in one call. This is EPPlus, so the Excel.Range interop types do not apply;
//LoadFromArrays fills the sheet row by row starting at the given cell
ws.Cells[startRow, startCol].LoadFromArrays(rows);
//Tried not to touch this stuff
ws.Cells[ws.Dimension.Address].AutoFitColumns();
Byte[] bin = p.GetAsByteArray();
using (FileStream fs = File.OpenWrite("C:\\test.xlsx"))
{
fs.Write(bin, 0, bin.Length);
}
}
I didn't try compiling this code, so double-check the indexing used and check for any small syntax errors.
A few extra pointers to consider for performance:
Try to parallelize building the in-memory data, since it is primarily index based (maybe keep a Dictionary<int, string[]> keyed by row index and look rows up there when populating). You would likely have to trade space for time.
See if you are able to hardcode the column and row counts, or work them out cheaply up front; counting the maximum rows and columns on the fly while reading is not something I would recommend as a permanent solution.
AutoFitColumns is very costly, especially if you're dealing with over a million rows.

I want to compare 2000 data cells with 3000 other data cells in excel, but this takes very long

I've got two columns of data that I want to compare with each other to find duplicates. When I run my program it takes hours to complete this task, while Excel does it in a few seconds. But I don't want to do it in Excel because I want to do it automatically. Column A is 2000 entries long and column B is 3000 entries long.
Here is what I did:
static void Main(string[] args)
{
excel_init("C:\\blablatest");
for (int j = 1; j < 2000; j++)
{
for (int k = 1; k < 2000; k++)
{
if (excel_getValue("A"+j) == excel_getValue("B"+k))
{
excel_setValue("D"+j,"1");
}
Console.WriteLine(j);
//**STILL LOOP TAKES HOURS**
}
}
excel_close();
Console.ReadKey();
}
private static Microsoft.Office.Interop.Excel.ApplicationClass appExcel;
private static Workbook newWorkbook = null;
private static _Worksheet objsheet = null;
//Method to initialize opening Excel
static void excel_init(String path)
{
appExcel = new Microsoft.Office.Interop.Excel.ApplicationClass();
if (System.IO.File.Exists(path))
{
// then go and load this into excel
newWorkbook = appExcel.Workbooks.Open(path, true, true);
objsheet = (_Worksheet)appExcel.ActiveWorkbook.ActiveSheet;
}
else
{
Console.WriteLine("Unable to open file!");
System.Runtime.InteropServices.Marshal.ReleaseComObject(appExcel);
appExcel = null;
}
}
static void excel_setValue(string cellname, string value)
{
objsheet.get_Range(cellname).set_Value(Type.Missing, value);
}
//Method to get value; cellname is A1,A2, or B1,B2 etc...in excel.
static string excel_getValue(string cellname)
{
string value = string.Empty;
try
{
value = objsheet.get_Range(cellname).get_Value().ToString();
}
catch
{
value = "";
}
return value;
}
//Method to close excel connection
static void excel_close()
{
if (appExcel != null)
{
try
{
newWorkbook.Close();
System.Runtime.InteropServices.Marshal.ReleaseComObject(appExcel);
appExcel = null;
objsheet = null;
}
catch (Exception ex)
{
appExcel = null;
Console.WriteLine("Unable to release the Object " + ex.ToString());
}
finally
{
GC.Collect();
}
}
}
}
(How) can I make this faster???
You are paying a huge overhead by doing the comparison inside Excel. What you should do is extract the data and compare it directly in your application.
The easiest way to do this is to convert Excel ranges to arrays:
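object[,] rowAArray = (object[,])objsheet.get_Range("A1", "A2000").Value2; //1-based object[,] array
object[,] rowBArray = (object[,])objsheet.get_Range("B1", "B3000").Value2; //1-based object[,] array
And now you just have to compare both arrays:
for (int j = 1; j <= 2000; j++)
{
for (int k = 1; k <= 3000; k++)
{
//object.Equals compares the boxed values; == on object would compare references
if (object.Equals(rowBArray[k, 1], rowAArray[j, 1]))
{
objsheet.get_Range("D" + j).Value2 = 1; //Set value in cell "D*"
break; //one match is enough for this row
}
}
}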
You will have to live with the boxing / unboxing penalty if you are handling numerical values but it will still be much faster than using Excel to perform the comparison.
Haven't tested the code but it should work.
Although your problem is well answered by InBetween, and removing that huge overhead is going to make it faster, I must add that you don't need to compare all 2000 * 3000 entries to find duplicated values once you have two sorted lists. Similar work can be found here.
Let's sort your two lists, A and B (the column letters), into columns E and G. What about F? Use it to store the original row number of each value from A that is now in E. For example, if the string "aabbb" was in A384 and is now in E1, store 384 in F1. Then compare the two sorted lists as in the link above; if, for example, you find a duplicate at E644, mark the cell "D" + (value of F644) with 1.
Originally you had O(AB) comparisons; with this approach you pay O(A log A + B log B) for sorting, and the comparison pass itself only takes O(max(A, B)).
Note: In my opinion, implementing this is not going to be that easy and bug-free. I recommend first try InBetween's answer. Think about applying my suggestion only if it's still slow.
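The sort-and-merge idea above can also be done in memory once both columns have been read into arrays, instead of in worksheet columns E, F and G. What follows is only a sketch under that assumption: `DuplicateFinder` and `RowsOfAFoundInB` are hypothetical names, values are compared as ordinal strings, and the index array plays the role of column F by remembering each value's original row.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateFinder
{
    // Returns the 1-based row numbers of entries in 'a' whose value also occurs in 'b'.
    public static List<int> RowsOfAFoundInB(string[] a, string[] b)
    {
        // Sort the indexes of A by value so the original row numbers survive the sort.
        int[] order = new int[a.Length];
        for (int i = 0; i < order.Length; i++)
        {
            order[i] = i;
        }
        Array.Sort(order, (x, y) => string.CompareOrdinal(a[x], a[y]));

        // Sort a copy of B so the caller's array is left untouched.
        string[] sortedB = (string[])b.Clone();
        Array.Sort(sortedB, (x, y) => string.CompareOrdinal(x, y));

        // Walk both sorted sequences once, advancing whichever side is smaller.
        List<int> rows = new List<int>();
        int ia = 0;
        int ib = 0;
        while (ia < order.Length && ib < sortedB.Length)
        {
            int cmp = string.CompareOrdinal(a[order[ia]], sortedB[ib]);
            if (cmp < 0)
            {
                ia++;
            }
            else if (cmp > 0)
            {
                ib++;
            }
            else
            {
                rows.Add(order[ia] + 1); // convert the 0-based index to a worksheet row
                ia++;                    // keep ib: the next A value may be equal too
            }
        }
        rows.Sort();
        return rows;
    }
}
```

Each returned row number is a row in column D to mark with 1. Writing those marks back in one range assignment rather than cell by cell keeps the Excel overhead low as well.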

Optimal Column Width OpenOffice Calc

I'm entering data from a CSV file into an OpenOffice spreadsheet.
This code gets a new sheet in a spreadsheet:
public XSpreadsheet getSpreadsheet(int sheetIndex, XComponent xComp)
{
XSpreadsheets xSheets = ((XSpreadsheetDocument)xComp).getSheets();
XIndexAccess xSheetsIA = (XIndexAccess)xSheets;
XSpreadsheet xSheet = (XSpreadsheet)xSheetsIA.getByIndex(sheetIndex).Value;
return xSheet;
}
I then have a method that enters a list into a cell range one cell at a time. I want to be able to set the column width for these cells automatically. The method is something like:
string finalDataCell;
XSpreadsheet newSheet = getSpreadsheet(sheetIndex, xComp);
int numberOfRecords = numberOfColumns * numberOfRows;
for (int cellNumber = 0; cellNumber < numberOfRecords; cellNumber++)
{
XCell tableData = newSheet.getCellByPosition(columnValue, rowValue);
((XText)tableData).setString(finalDataCell);
columnValue++;
//wrap to the start of the next row once the last column is filled
if (columnValue >= numberOfColumns)
{
rowValue++;
columnValue = 0;
}
}
After googling I found the property:
columns.OptimalWidth = True on http://forum.openoffice.org/en/forum/viewtopic.php?f=20&t=31292
but I'm unsure how to use it. Could anyone explain this further, or think of another way to have the cells autofit?
I understand the comments in the code are in Spanish I think, but the code is in English. I ran the comments through Google translate so now they are in English. I copied it from here:
//Auto Enlarge col width
private void largeurAuto(string NomCol)
{
XCellRange Range = null;
Range = Sheet.getCellRangeByName(NomCol + "1"); //Get the range: a single cell in the column
XColumnRowRange RCol = (XColumnRowRange)Range; //View the range as a column/row range
XTableColumns LCol = RCol.getColumns(); //Retrieve the list of columns
uno.Any Col = LCol.getByIndex(0); //Extract the first column
XPropertySet xPropSet = (XPropertySet)Col.Value;
xPropSet.setPropertyValue("OptimalWidth", new uno.Any((bool)true));
}
What this does is this: first it gets the range by name and then gets its first column. The real work, though, is done through XPropertySet, which is explained REALLY well here.
public void optimalWidth(XSpreadsheet newSheet)
{
// gets the used range of the sheet
XSheetCellCursor XCursor = newSheet.createCursor();
XUsedAreaCursor xUsedCursor = (XUsedAreaCursor)XCursor;
xUsedCursor.gotoStartOfUsedArea(true);
xUsedCursor.gotoEndOfUsedArea(true);
XCellRangeAddressable nomCol = (XCellRangeAddressable)xUsedCursor;
XColumnRowRange RCol = (XColumnRowRange)nomCol;
XTableColumns LCol = RCol.getColumns();
// loops round all of the columns
for (int i = 0; i < LCol.getCount(); i++) // getCount covers every column in the used range, including the last
{
XPropertySet xPropSet = (XPropertySet)LCol.getByIndex(i).Value;
xPropSet.setPropertyValue("OptimalWidth", new uno.Any(true));
}
}

unable to read a particular cell from excel using reader

I am importing an Excel sheet into a SQL Server database. The sheet has three columns:
id(number only)|data|passport
Before importing it I want to check for certain things, such as:
the passport should begin with a letter and the rest of the characters must be numbers
id must be numeric only
I am able to check the passport, but I am not able to check the id even though I am using the same code I used for checking the passport.
using (DbDataReader dr = command.ExecuteReader())
{
// SQL Server Connection String
string sqlConnectionString = "Data Source=DITSEC3;Initial Catalog=test;Integrated Security=True";
con.Open();
DataTable dt7 = new DataTable();
dt7.Load(dr);
DataRow[] ExcelRows = new DataRow[dt7.Rows.Count];
DataColumn[] ExcelColumn = new DataColumn[dt7.Columns.Count];
//=================================================
for (int i1 = 0; i1 < dt7.Rows.Count; i1++)
{
if (dt7.Rows[i1]["passport"] == null)
{
dt7.Rows[i1]["passport"] = 0;
}
if (dt7.Rows[i1]["id"] == null)
{
dt7.Rows[i1]["id"] = 0;
}
string a = Convert.ToString(dt7.Rows[i1]["passport"]);
string b = dt7.Rows[i1]["id"].ToString();
if (!string.IsNullOrEmpty(b))
{
int idlen = b.Length;
for (int j = 0; j < idlen; j++)
{
if (Char.IsDigit(b[j]))
{
//action
}
if(!Char.IsDigit(b[j]))
{
flag = flag + 1;
int errline = i1 + 2;
Label12.Text = "Error at line: " + errline.ToString();
//Label12.Visible = true;
}
}
if (!String.IsNullOrEmpty(a))
{
int len = a.Length;
for (int j = 1; j < len; j++)
{
if (Char.IsLetter(a[0]) && Char.IsDigit(a[j]) && !Char.IsSymbol(a[j]))
{
//action
}
else
{
flag = flag + 1;
int errline = i1 + 2;
Label12.Text = "Error at line: " + errline.ToString();
//Label12.Visible = true;
}
}
}
}
For some strange reason, when I use a breakpoint I can see the values of id as long as id is numeric in Excel. The moment the flow comes to a cell whose id is 25h547, the value of b turns to "". Any reason for this? I can give you the entire code if you require.
What seems to be happening is that when the data is imported into the holding DataTable, each column's type is inferred from its first records. If the first record in a column is alphanumeric, all records in the column are assumed to be alphanumeric; if the first one is numeric, all records are assumed to be numeric, so alphanumeric records that occur later in the column come back blank. I solved the problem myself by modifying the connection string: "Excel 8.0;IMEX=1;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text"
"IMEX=1;" tells the driver to always read "intermixed" (numbers, dates, strings, etc.) data columns as text.
Specify the IMEX mode in the connection string to handle mixed values.
See: Mixed values in excel rows
Missing values. The Excel driver reads a certain number of rows (by
default, 8 rows) in the specified source to guess at the data type of
each column. When a column appears to contain mixed data types,
especially numeric data mixed with text data, the driver decides in
favor of the majority data type, and returns null values for cells
that contain data of the other type. (In a tie, the numeric type
wins.) Most cell formatting options in the Excel worksheet do not seem
to affect this data type determination. You can modify this behavior
of the Excel driver by specifying Import Mode. To specify Import Mode,
add IMEX=1 to the value of Extended Properties in the connection
string of the Excel connection manager in the Properties window
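To make the IMEX advice concrete, here is a small sketch of how the Extended Properties value fits into a full OLE DB connection string. The source only shows the Extended Properties part, so the provider (Jet 4.0, which matches "Excel 8.0" .xls files), the `ExcelConnection.Build` helper name and the example path are assumptions. The registry-backed TypeGuessRows setting is left out on purpose.

```csharp
using System;

public static class ExcelConnection
{
    // Builds a Jet OLE DB connection string for an .xls workbook with Import Mode (IMEX=1) on.
    // With hasHeaderRow = false, the first worksheet row is returned as data instead of column names.
    public static string Build(string workbookPath, bool hasHeaderRow)
    {
        string extended = string.Format("Excel 8.0;HDR={0};IMEX=1", hasHeaderRow ? "YES" : "NO");
        // Extended Properties holds several semicolon-separated values,
        // so the whole value must be wrapped in double quotes.
        return string.Format(
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"{1}\";",
            workbookPath, extended);
    }
}
```

For example, `ExcelConnection.Build(@"C:\data\book.xls", false)` yields a string that can be passed to an OleDbConnection. Once every cell arrives as text, the id and passport checks from the question can run on plain strings.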
