How to iterate through a specific row in an Excel table via Interop?

I'm writing a program that reads table data and puts the cell values in a List. It works, but there is one problem: UsedRange takes all cells on the sheet, so there are more items than I need. Also, when I specify a range with ["A:A", Type.Missing], it throws an exception:
System.ArgumentException: "HRESULT: 0x80070057 (E_INVALIDARG))"
So my question is: how do I do this correctly?
The code is:
foreach (Excel.Range row in usedRange)
{
    for (int i = 0; i < lastCell.Row; i++)
    {
        if (row.Cells[4, i + 1].Value2 != null)
        {
            personlist.Add(Convert.ToString(row.Cells[4, i + 1].Value2));
        }
        else { i++; }
    }
    foreach (var person in personlist)
    {
        Console.WriteLine(person);
    }
}
UPD: I need the last used row, which is why I'm using UsedRange. If there are any alternatives, such as checking for != null, I will gladly try them.
I tried giving it a specific range, and I tried code like the examples in "C# - How do I iterate all the rows in Excel._Worksheet?"
and here:
https://overcoder.net/q/236542/программно-получить-последнюю-заполненную-строку-excel-с-помощью-c
but maybe I'm being dumb, because there is more than one article about this and none of them work for me.

The problem is that the used range can include empty cells (who knows how Excel decides that magic number; if you type a letter on some arbitrary row and then delete it, Excel can decide that the cell is still part of your used range). You want your own definition of what a used range is, which presumably is the range of non-blank rows. There are two straightforward ways of implementing this yourself, which also gives you control should you want to customize it.
You can filter the list after the fact, removing all blank entries. Or you can process the list in reverse, skipping rows until you find one matching your criteria:
bool startProcessing = false;
for (int i = lastCell.Row - 1; i >= 0; i--)
{
    object value = row.Cells[4, i + 1].Value2;
    // The flag is in case you want blank rows in the middle of the file;
    // otherwise just check for a valid row on every iteration.
    if (!startProcessing)
    {
        if (value == null) continue; // still inside the trailing blank region
        startProcessing = true;      // found the last non-blank cell
    }
    if (value != null)
    {
        personlist.Add(Convert.ToString(value));
    }
    // No else { i++; } here: that was a bug and caused a line skip.
}
// The list is filled in reverse order; call personlist.Reverse() if the order matters.
Also, as an aside: when you call i++; in the body of your for loop, the loop header then increments i again, so i advances by 2 and a row is skipped. Use continue; or just remove the else block altogether.
There's probably a way to get a cellRange matching your criteria, but imo doing it yourself can be better - you can ensure it does exactly what you want.
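The "define your own used range" idea can be sketched without any Excel dependency. The snippet below is only an illustration: it assumes the cell values of one row or column have already been copied into a list of objects (for example from Value2), and the names ColumnTrimmer and TrimTrailingBlanks are hypothetical. It drops trailing blanks but keeps blank cells in the middle as empty strings.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnTrimmer
{
    // Hypothetical helper: copies cell values up to the last non-blank one.
    // Blank cells in the middle are kept as empty strings; trailing blanks are dropped.
    public static List<string> TrimTrailingBlanks(IReadOnlyList<object> cells)
    {
        // Scan from the end until a cell with real content is found.
        int last = cells.Count - 1;
        while (last >= 0 && IsBlank(cells[last]))
        {
            last--;
        }

        var result = new List<string>();
        for (int i = 0; i <= last; i++)
        {
            result.Add(IsBlank(cells[i]) ? string.Empty : Convert.ToString(cells[i]));
        }
        return result;
    }

    // A cell counts as blank when it is null or contains only whitespace.
    private static bool IsBlank(object value)
    {
        return value == null || string.IsNullOrWhiteSpace(Convert.ToString(value));
    }
}
```

Unlike the reverse loop above, this keeps the original order, so no Reverse() call is needed afterwards.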

Related

Using AddRow() in Output Buffer when C# transform in SSIS needs synchronous?

First off, I'm quite new to both SSIS and C#, so apologies for any rookie mistakes. I am trying to muddle my way through splitting one column by a specific delimiter, from an input file that has a variable-length header and a footer.
For example, Input0Buffer has one column. The actual data is always preceded by a row starting with the phrase "STARTDATA" and followed by a row starting with "ENDDATA".
The one input column contains 5 fields separated by |. I don't care about two of these fields.
Basically the input file looks like this:
junkrow
headerstuff
morejunk
STARTDATA
ID1|rubbish|stuff|apple|cheese
ID2|badger|junk|pear|yoghurt
So far I have tried to get some row-by-row logic going in the C# transformer, which I think I am happy with - but I can't work out how to get it to output my split data. Code is below.
bool passedSOD;
bool passedEOD;
public void ProcessRow(Input0Buffer data)
{
    string Col1, Col2, Col3;
    if (data.Column0.StartsWith("ENDDATA"))
    {
        passedEOD = true;
    }
    if (passedSOD && !passedEOD)
    {
        var SplitData = data.Column0.Split('|');
        Col1 = SplitData[0];
        Col2 = SplitData[3];
        Col3 = SplitData[4];
        //error about Output0Buffer not existing in context
        Output0Buffer.Addrow();
        Output0Buffer.prodid = Col1;
        Output0Buffer.fruit = Col2;
        Output0Buffer.dairy = Col3;
    }
    if (data.Column0.StartsWith("STARTDATA"))
    {
        passedSOD = true;
    }
}
If I change the output to asynchronous it stops the error about Output0Buffer not existing in the current context, and it runs, but gives me 0 rows output - presumably because I need it to be synchronous to work through each row as I've set this up?
Any help much appreciated.

You can shorten your code by just checking whether the row contains a '|':
if (Row.Column0.Contains("|"))
{
    string[] cols = Row.Column0.Split('|');
    Output0Buffer.AddRow();
    Output0Buffer.prodid = cols[0];
    Output0Buffer.fruit = cols[3];
    Output0Buffer.dairy = cols[4];
}
Like Bill said, make sure this is a transformation component and not a destination. Your options are source, transformation, and destination.
You might also want this as a separate output. Otherwise, you will need to conditionally split out the "extra" rows.

Thanks both for answering. It is a transformation, and thank you for the shorter way; however, the header and footer are not well formatted and may also contain junk characters, so I daren't risk looking for | in rows. But I will definitely store that away for processing a better formatted file next time.
I got a reply outside this forum, so I thought I should answer my own question in case anyone else has a similar problem.
Note that:
it's a transform
the output is set to SynchronousInputID = None in the Inputs and Outputs section of the Script Transformation Editor
my input is just called Input, and contains one column called RawData
my output is called GenOutput, and has three columns
although the input file only really has 5 fields, there is a trailing | at the end of each row so this counts as 6
Setting SynchronousInputID to None means that the output buffer (GenOutputBuffer here) is now recognised in context.
The code that works for me is:
bool passedSOD;
bool passedEOD;
public override void Input_ProcessInputRow(InputBuffer Row)
{
    if (Row.RawData.Contains("ENDDATA"))
    {
        passedEOD = true;
        GenOutputBuffer.SetEndOfRowset();
    }
    //IF WE HAVE NOT PASSED THE END OF DATA, BUT HAVE PASSED THE START OF DATA, SPLIT THE ROW
    if (passedSOD && !passedEOD)
    {
        var SplitData = Row.RawData.Split('|');
        //ONLY PROCESS IF THE ROW CONTAINS THE RIGHT NUMBER OF ELEMENTS I.E. EXPECTED NUMBER OF DELIMITERS
        if (SplitData.Length == 6)
        {
            GenOutputBuffer.AddRow();
            GenOutputBuffer.prodid = SplitData[0];
            GenOutputBuffer.fruit = SplitData[3];
            GenOutputBuffer.dairy = SplitData[4];
        }
        //SILENTLY DROPPING ROWS THAT DO NOT HAVE RIGHT NUMBER OF ELEMENTS FOR NOW - COULD IMPROVE THIS LATER
    }
    if (Row.RawData.Contains("STARTDATA"))
    {
        passedSOD = true;
    }
}
Now I've just got to work out how to convert one of the other fields from string to a nullable decimal, so that it outputs a null if someone has dumped "N.A" in that field :D
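For anyone who wants to try the STARTDATA/ENDDATA logic outside SSIS, here is a minimal sketch of the same row-by-row state machine as a plain C# function. It is not the script component itself: DataSectionParser and Parse are hypothetical names, the input is assumed to be the raw lines of the file, and the trailing | described above is assumed to be present.

```csharp
using System;
using System.Collections.Generic;

public static class DataSectionParser
{
    // Hypothetical stand-in for the script component: returns (prodid, fruit, dairy)
    // for every well-formed row between the STARTDATA and ENDDATA marker rows.
    public static List<(string ProdId, string Fruit, string Dairy)> Parse(IEnumerable<string> lines)
    {
        var rows = new List<(string ProdId, string Fruit, string Dairy)>();
        bool passedStart = false;

        foreach (string line in lines)
        {
            if (line.Contains("ENDDATA"))
            {
                break; // nothing after the end marker is data
            }

            if (passedStart)
            {
                string[] parts = line.Split('|');
                // A trailing '|' makes five fields split into six elements.
                if (parts.Length == 6)
                {
                    rows.Add((parts[0], parts[3], parts[4]));
                }
            }
            else if (line.Contains("STARTDATA"))
            {
                passedStart = true; // data begins on the next line
            }
        }
        return rows;
    }
}
```

As in the script above, rows with the wrong number of elements are silently dropped, and junk in the header or footer is ignored because it falls outside the two markers.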

Calculate number of cells with a specific value

I'm creating a program for test management that has a stats function, which calculates the number of bugs that have been fixed, not fixed or N/A.
The test cases are all listed in a DataGridView, where the 1st column is for the test cases, the 2nd is for the results (the column I'd like to work with) and the 3rd is just for comments.
Here's a bit of my code to show what I'm talking about:
private int Passed() // This method is supposed to count how many test cases have passed
{
    int passed = 0;
    if (/*what condition should I put here?*/)
    {
        passed++;
    }
    return passed;
}
//Is this the best way to display the percentage in real time?
private void Refresh_Tick(object sender, EventArgs e)
{
    Display2.Text = Passed().ToString();
}
The "Results" column cells have each a combobox with its items being "FIXED", "N/A" and "NOT FIXED".
Please, I'd like to know how I can programmatically access those cells' value and then use them as a condition to count how many bugs have been fixed.

Iterating through all the rows in the gridview should get you the answer.
int countFixed = 0;
int countUnFixed = 0;
for (int i = 0; i < dgv.RowCount; i++)
{
    // The combobox items are upper case, so the comparison strings must match them exactly.
    // Try referring to cells by column name and not by index.
    string result = dgv.Rows[i].Cells[1].Value as string;
    if (result == "FIXED")
        countFixed++;
    else if (result == "NOT FIXED")
        countUnFixed++;
}
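If a percentage is wanted as well, the counting can be separated from the grid. The sketch below is an assumption-laden illustration rather than part of the answer: ResultStats, CountStatus and Percentage are hypothetical names, and the cell values are assumed to have been collected into a sequence of strings first. It compares case-insensitively so that "Fixed" and "FIXED" are treated alike.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ResultStats
{
    // Counts how many values equal the given status, ignoring case and surrounding spaces.
    public static int CountStatus(IEnumerable<string> results, string status)
    {
        return results.Count(r =>
            r != null && string.Equals(r.Trim(), status, StringComparison.OrdinalIgnoreCase));
    }

    // Percentage of entries with the given status; 0 when there are no entries.
    public static double Percentage(IEnumerable<string> results, string status)
    {
        var list = results.ToList();
        return list.Count == 0 ? 0.0 : 100.0 * CountStatus(list, status) / list.Count;
    }
}
```

With a DataGridView, the input could be built with something like dgv.Rows.Cast<DataGridViewRow>().Select(r => r.Cells[1].Value as string), assuming the results are in the second column as in the question.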

How to get format type of cell using c# in spreadsheetlight

I am using the SpreadsheetLight library to read Excel sheet (.xlsx) values in C#.
I can read the cell value using the following code:
for (int col = stats.StartColumnIndex; col <= stats.EndColumnIndex; col++)
{
    var value = sheet.GetCellValueAsString(stats.StartRowIndex, col); //where sheet is current sheet in excel file
}
I am getting the cell value, but how can I get the data type of the cell? I have checked the documentation but didn't find a solution.
Note: For .xls files I am using the ExcelLibrary.dll library, where I can easily get the data type of cells using the code below:
for (int i = 0; i <= cells.LastColIndex; i++)
{
    var type = cells[0, i].Format.FormatType;
}
but there is no similar method in SpreadsheetLight.

Here is the answer from the developer Vincent Tang after I asked him as I wasn't sure how to use DataType:
Yes use SLCell.DataType. It's an enumeration, but for most data, you'll be working with Number, SharedString and String.
Text data will be SharedString, and possibly String if the text is directly embedded in the worksheet. There's a GetSharedStrings() or something like that.
For numeric data, it will be Number.
For dates, it's a little tricky. The data type is also Number (ignore the Date enumeration because Microsoft Excel isn't using it). For dates, you also have to check the FormatCode, which is in the SLStyle for the SLCell. Use the GetStyles() to get a list. The SLCell.StyleIndex gives you the index to that list.
For example, if your SLCell has cell value "15" and data type SharedString, then look for index 15 in the list of shared strings. If it's "blah" with String data type, then that's it.
If it's 56789 with Number type, then that's it.
Unless the FormatCode is "mm-yyyy" (or some other date format code), then 56789 is actually the number of days since 1 Jan 1900.
He also recommended using GetCellList() in order to obtain the list of SLCell objects in the sheet. However, for some reason that function was not available in my version of SL, so I used GetCells() instead. That returns a dictionary of SLCell objects, with keys of type SLCellPoint.
So for example to get the DataType (which is a CellValues object) of cell A1 do this:
using (SLDocument slDoc = new SLDocument("Worksheet1.xlsx", "Sheet1"))
{
    SLCellPoint slCP = new SLCellPoint();
    slCP.ColumnIndex = SLConvert.ToColumnIndex("A"); // obviously 1, but a useful function to know
    slCP.RowIndex = 1;
    // GetCells() returns the dictionary described above; look the cell up by its SLCellPoint key
    CellValues slCV = slDoc.GetCells()[slCP].DataType;
}
By the way, I also had a problem with opening the chm help file. Try this:
Right Click on the chm file and select properties
Click on Unblock button at the bottom of the General Tab
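The developer's note about dates (a Number cell whose format code is a date format holds a day count) can be illustrated with the standard DateTime.FromOADate method. This is only a sketch: ExcelDates and FromSerial are hypothetical names, and it assumes the workbook uses Excel's default 1900 date system.

```csharp
using System;

public static class ExcelDates
{
    // Converts an Excel serial number (1900 date system) to a DateTime.
    // DateTime.FromOADate agrees with Excel for serials from 61 (1 March 1900) onward;
    // earlier serials differ by one day because Excel treats 1900 as a leap year.
    public static DateTime FromSerial(double serial)
    {
        return DateTime.FromOADate(serial);
    }
}
```

For example, ExcelDates.FromSerial(36526) gives 1 January 2000, and the 56789 from the quote above corresponds to 24 June 2055.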

To get the value of the cell try following code
var cellValue = (string)(excelWorksheet.Cells[10, 2] as Excel.Range).Value;
Use this link for more details

Check out the SLCell.DataType Property. Spreadsheetlight documentation mentions that this returns the Cell datatype, in class Spreadsheetlight.SLCell
public CellValues DataType { get; set; }
PS: On a side note, I figured out how to open the chm documentation. Try opening the chm file in Winzip, it opens without any issues.
Hope it helps. Thanks

Well, after a lot of trial and error I found a solution for this.
Based on the format code of a cell, we can decide the format type of the cell.
Using the GetCellStyle method we can get the format code of the cell, and from that format code we can decide the format type.
var FieldType = GetDataType(sheet.GetCellStyle(rowIndex, columnIndex).FormatCode);
private string GetDataType(string formatCode)
{
    if (formatCode.Contains("h:mm") || formatCode.Contains("mm:ss"))
    {
        return "Time";
    }
    else if (formatCode.Contains("[$-409]") || formatCode.Contains("[$-F800]") || formatCode.Contains("m/d"))
    {
        return "Date";
    }
    else if (formatCode.Contains("#,##0.0"))
    {
        return "Currency";
    }
    else if (formatCode.EndsWith("%")) // EndsWith, unlike Last(), does not throw on an empty format code
    {
        return "Percentage";
    }
    else if (formatCode.IndexOf("0") == 0)
    {
        return "Numeric";
    }
    else
    {
        return "String";
    }
}
This method worked for 99% of the cases.
Hope it helps you.

C#: Replace Text Of Blank Items In Listview?

How would I go about replacing the text of blank items in column 5 of my listView with the word "No"?
I tried this, but it threw an "InvalidArgument=Value of '4' is not valid for 'index'." error:
foreach (ListViewItem i in listView1.Items)
{
    if (i.SubItems[4].Text == " ")
    {
        i.SubItems[4].Text = i.SubItems[4].Text.Replace(" ", "No");
    }
}

The code above iterates over all items in listView1.Items and checks whether the Text of the sub-item at index 4 equals " ". This results in the described error when an item has fewer than five sub-items, because the index is then outside the collection's bounds. You can avoid this by making sure the sub-item exists before accessing it.
Example
foreach (ListViewItem i in listView1.Items) //Get all items in listView1.Items
{
    //Continue only if the item has more than 3 sub-items, so that i.SubItems[3]
    //(the 4th column) exists. Count is a plain count, so it starts at 1, not 0.
    if (i.SubItems.Count > 3)
    {
        if (i.SubItems[3].Text == " ") //Continue if i.SubItems[3].Text is equal to " "
        {
            i.SubItems[3].Text = i.SubItems[3].Text.Replace(" ", "No"); //Replace " " with No
        }
    }
}
Notice: Collections are zero-indexed, which means that they start with 0 instead of 1. Column 5 from the question is therefore SubItems[4], which requires SubItems.Count > 4.
Notice: If you only have 4 columns, i.SubItems.Count would be 4 and not 3, because Count is an ordinary count rather than a zero-based index (assuming all columns are filled).
Thanks,
I hope you find this helpful :)

If I were you, I'd use the debugger to figure out what's actually going on.
You can check what's actually in i.SubItems and make sure it's what you think it is.
The only things I can think of are that you made a typo somewhere or that i.SubItems[4] just isn't valid.
Maybe you're iterating through your list items, but not all of them have 5 columns, or maybe some are empty.

Once you get that first error figured out, your logic for replacing the text might work better like this:
if (i.SubItems != null && string.IsNullOrEmpty(i.SubItems[4].Text))
{
    i.SubItems[4].Text = "No";
}

C# - Using OpenXML how to read blank spaces on excel files

I searched for solutions but failed to find one. I currently have a problem reading an Excel file using OpenXML. With perfect data there is no problem, but when the data has blanks, the columns seem to shift to the left, producing an error saying that the index is not right. I found a solution in which you insert cells in between, but when I tried it, I got an "object reference not set to an instance of an object" error while reading a certain cell with this code (the source is the answer about inserting cells in "How do I have Open XML spreadsheet "uncollapse" cells in a spreadsheet?"):
public static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
    SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
    string value = cell.CellValue.InnerXml;
    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
    {
        return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
    }
    else if (cell == null)
    {
        return null;
    }
    else
    {
        return value;
    }
}
Are there any other ways to read blank cells as blank without the data moving to the left?
All help will be appreciated! :)
Thanks!

In Open XML, the XML file does not contain an entry for a blank cell, which is why blank cells are skipped. I faced the same problem. The only solution is to apply some logic.
For example:
When we read a cell, we can get its column name (A, B, C, etc.) with the following code:
string cellIndex = GetColumnName( objCurrentSrcCell.CellReference );
where
public static string GetColumnName(string cellReference)
{
    // Create a regular expression to match the column name portion of the cell name.
    Regex regex = new Regex("[A-Za-z]+");
    Match match = regex.Match(cellReference);
    return match.Value;
}
You can store these cells in a Hashtable, where the key is the cell's column name and the value is the cell object. When writing, fetch the cells from the hash table in order, on whatever basis suits your logic.
For example, you may loop from A to Z and read the cell at each key like this:
if (objHashTable.Contains(yourKey))
{
    Cell objCell = (Cell) objHashTable[yourKey];
    //Insert cell or process cell
}
else
{
    //do the processing for the empty cell, e.g. you may add a new blank cell
    Cell objCell = new Cell();
    //Insert cell or process cell
}
This is the only way to work with Open XML. Adding blank cells to the file while reading is a waste of time. You can add more logic according to your needs.
Try this; it should work. Or if you find a better solution, do tell me.
Have a nice day :)
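The column-name approach above can also be written with a numeric column index instead of a Hashtable keyed by letters, which handles columns past Z as well. The following is a minimal sketch under stated assumptions: RowAligner, ColumnIndex and Align are hypothetical names, and each cell is represented only by its reference string (as in cell.CellReference) and its already-extracted text value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RowAligner
{
    // Converts the letters of a cell reference to a 1-based column index:
    // "A1" -> 1, "Z9" -> 26, "AA10" -> 27.
    public static int ColumnIndex(string cellReference)
    {
        string letters = Regex.Match(cellReference, "[A-Za-z]+").Value.ToUpperInvariant();
        int index = 0;
        foreach (char c in letters)
        {
            index = index * 26 + (c - 'A' + 1);
        }
        return index;
    }

    // Places each value at the position given by its cell reference;
    // positions with no cell in the file stay as empty strings.
    public static string[] Align(IEnumerable<(string Reference, string Value)> cells, int width)
    {
        var row = new string[width];
        for (int i = 0; i < width; i++)
        {
            row[i] = string.Empty;
        }
        foreach (var cell in cells)
        {
            int col = ColumnIndex(cell.Reference);
            if (col >= 1 && col <= width)
            {
                row[col - 1] = cell.Value;
            }
        }
        return row;
    }
}
```

A row that only contains cells A1 and C1 then comes back with an empty string in position B, so the data no longer shifts to the left.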
