I have the following code:
The line with 'excelImage' initialization throws the exception: Column number out of bounds.
What is the solution of the problem?
var range2 = worksheet.Cells ["A" + limiter.ToString ()];
range2.Value = tokenGood.id; //Take from JSON-array
worksheet.Row (limiter).Height = 70; //'limiter' is like row iterator
worksheet.Column (1).Width = 10;
Bitmap img = new Bitmap(Image.FromFile (PIC_FILENAME));
OfficeOpenXml.Drawing.ExcelPicture excelImage = worksheet.Drawings.AddPicture ("random_string", img); //Error Line
excelImage.From.Column = 3;
excelImage.From.Row = limiter;
excelImage.SetSize (60, 60);
Workaround for the bug (EPPlus 4.5.3 doesn't check if image resolution exists in: ExcelDrawing.cs,
internal void SetPixelWidth(int pixels, float dpi), line 400).
If image resolution doesn't set, just set it manually. AddPicture will work fine:
if (img.HorizontalResolution == 0 || img.VerticalResolution == 0)
img.SetResolution(96, 96);
var excelImage = worksheet.Drawings.AddPicture("img", img);
This seems to be a bug in current (4.0.5) EPPlus. Even though the current version has been released recently, the backlog of open issues on CodePlex is huge, so I don't see any approaching fixed release...
I solved it by downloading the source, manually correcting 4 rows in EPPlus/Drawing/ExcelDrawingBase.cs and recompiling.
The rows are 513, 515 (both in SetPixelHeight(int, float)) and 540, 542 (in SetPixelWitdh(int, float)).
I changed the occurrences of From.Column or From.Row in those two lines to Math.Max(From.Column, 0) and Math.Max(From.Row, 0).
The problem here is that the From property is an object that has somehow not been properly initialized, and the values of From.Row and From.Column are thus set to int.MinValue - which is not a valid column/row index, as you may guess.
I've been looking for workarounds that don't require to edit the source code, but couldn't find any: hope this helps.
Related
i tried Trial version of Gembox.SpreadSheet.
when i Get Cells[,].value by for() or Foreach().
so i think after Calculate() & get Cell[].value, but that way just take same time,too.
it take re-Calculate when i Get Cell[].value.
workSheet.Calcuate(); <- after this, values are Calculated, am i right?
for( int i =0; i <worksheet.GetUsedCellRange(true).LastRowIndex+1;++i)
{
~~~~for Iteration~~~
var value = workSheet.Cells[i,j].Value; <- re-Calcuate value(?)
}
so here is a Question.
Can i Get calculated values? or you guys know pre-Calculate function or Get more Speed?
Unfortunate, I'm not sure what exactly you're asking, can you please try reformulating your question a bit so that it's easier to understand it?
Nevertheless, here is some information which I hope you'll find useful.
To iterate through all cells, you should use one of the following:
1.
foreach (ExcelRow row in workSheet.Rows)
{
foreach (ExcelCell cell in row.AllocatedCells)
{
var value = cell.Value;
// ...
}
}
2.
for (CellRangeEnumerator enumerator = workSheet.Cells.GetReadEnumerator(); enumerator.MoveNext(); )
{
ExcelCell cell = enumerator.Current;
var value = cell.Value;
// ...
}
3.
for (int r = 0, rCount = workSheet.Rows.Count; r < rCount; ++r)
{
for (int c = 0, cCount = workSheet.CalculateMaxUsedColumns(); c < cCount; ++c)
{
var value = workSheet.Cells[r, c].Value;
// ...
}
}
I believe all of them will have pretty much the same performances.
However, depending on the spreadsheet's content this last one could end up a bit slower. This is because it does not exclusively iterate only through allocated cells.
So for instance, let say you have a spreadsheet which has 2 rows. The first row is empty, it has no data, and the second row has 3 cells. Now if you use 1. or 2. approach then you will iterate only through those 3 cells in the second row, but if you use 3. approach then you will iterate through 3 cells in the first row (which previously were not allocated and now they are because we accessed them) and then through 3 cells in the second row.
Now regarding the calculation, note that when you save the file with some Excel application it will save the last calculated formula values in it. In this case you don't have to call Calculate method because you already have the required values in cells.
You should call Calculate method when you need to update, re-calculate the formulas in your spreadsheet, for instance after you have added or modified some cell values.
Last, regarding your question again it is hard to understand it, but nevertheless:
Can i Get calculated values?
Yes, that line of code var value = workSheet.Cells[i,j].Value; should give you the calculated value because you used Calculate method before it. However, if you have formulas that are currently not supported by GemBox.Spreadsheet's calculation engine then it will not be able to calculate the value. You can find a list of currently supported Excel formula functions here.
or you guys know pre-Calculate function or Get more Speed?
I don't know what "pre-Calculate function" means and for speed please refer to first part of this answer.
I have a number of documents with predicted placement of certain text which I'm trying to extract. For the most part, it works very well, but I'm having difficulties with a certain fraction of documents which have slightly thicker text.
Thin text:
Thick text:
I know it's hard to tell the difference at this resolution, but if you look at MO DAY YEAR TIME (2400) portion, you can tell that the second one is thicker.
The thin text gives me exactly what is expected:
09/28/2015
0820
However, the thick version gives me a triple of every character with white space in between each duplicated character:
1 1 11 1 1/ / /1 1 19 9 9/ / /2 2 20 0 01 1 15 5 5
1 1 17 7 70 0 02 2 2
I'm using the following code to extract text from documents:
public static Document GetDocumentInfo(string fileName)
{
// Using 11 in x 8.5 in dimensions at 72 dpi.
var boudingBoxes = new[]
{
new RectangleJ(446, 727, 85, 14),
new RectangleJ(396, 702, 43, 14),
new RectangleJ(306, 680, 58, 7),
new RectangleJ(378, 680, 58, 7),
new RectangleJ(446, 680, 45, 7),
new RectangleJ(130, 727, 29, 10),
new RectangleJ(130, 702, 29, 10)
};
var data = GetPdfData(fileName, 1, boudingBoxes);
// I would populated the new document with extracted data
// here, but it's not important for the example.
var doc = new Document();
return doc;
}
public static string[] GetPdfData(string fileName, int pageNum, RectangleJ[] boundingBoxes)
{
// Omitted safety checks, as they're not important for the example.
var data = new string[boundingBoxes.Length];
using (var reader = new PdfReader(fileName))
{
if (reader.NumberOfPages < 1)
{
return null;
}
RenderFilter filter;
ITextExtractionStrategy strategy;
for (var i = 0; i < boundingBoxes.Length; ++i)
{
filter = new RegionTextRenderFilter(boundingBoxes[i]);
strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filter);
data[i] = PdfTextExtractor.GetTextFromPage(reader, pageNum, strategy);
}
return data;
}
}
Obviously, if nothing else works, I can get rid of duplicate characters after reading them in, as there is a very apparent pattern, but I'd rather find a proper way than a hack. I tried looking around for the past few hours, but couldn't find anyone encountering a similar issue.
EDIT:
I finally came across this SO question:
Text Extraction Duplicate Bold Text
...and in the comments it's indicated that some of the lower quality PDF producers duplicate text to simulate boldness, so that's one of the things that might be happening. However, there is a mention of omitting duplicate text at the location, which I don't know how can be achieved since this portion of my code...
data[i] = PdfTextExtractor.GetTextFromPage(reader, pageNum, strategy);
...reads in the duplicated text completely in any of the specified locations.
EDIT:
I now have come across documents that duplicate contents up to four times to simulate thickness. That's a very strange way of doing things, but I'm sure designers of that method had their reasons.
EDIT:
I produced A solution (see my answer). It processes the data after it's already extracted and removes any repetitions. Ideally this would have been done during the extraction process, but it can get pretty complicated and this seemed like a very clean and easy way of getting the same accomplished.
As #mkl has suggested, one way of tackling this issue is to override LocationExtractionStrategy; however, things get pretty complicated since it would require comparison of locations for each character found at specific boundaries. I tried doing some research in order to accomplish that, but due to poor documentation, it was getting a bit out of hand.
So, instead as I created a post-processing method, loosely based around what #TheMuffinMan has suggested, to clean up any repetitions. I decided not to deal with pixels, but rather with character count anomalies in known static locations. In my case, I know that the second data piece extracted can never be greater than three characters, so it's a good comparison point for me. If you know the document layout, you can use anything on it that you know will always be of fixed length.
After I extract the data with the method listed in my original post, I check to see if the second data piece is greater than three in length. If it returns true, then I divide the given length by three, as that's the most characters it can have and since all repitions come out to even length, I know I'll get an even number of repetition cases:
var data = GetPdfData(fileName, 1, boudingBoxes);
if (data[1].Length > 3)
{
var count = data[1].Length / 3;
for (var i = 0; i < data.Length; ++i)
{
data[i] = RemoveRepetitions(data[i], count);
}
}
As you can see, I then loop over the data and pass each piece into RemoveRepetitions() method:
public static string RemoveRepetitions(string original, int count)
{
if (original.Length % count != 0)
{
return null;
}
var temp = new char[original.Length / count];
for (int i = 0; i < original.Length; i += count)
{
temp[i / count] = original[i];
}
return new string(temp);
}
This method takes the string and the number of expected repetitions, which we calculated earlier. One thing to note is that I don't have to worry about the white spaces that are inserted in the duplicated process, as the example shows in the original post, due to the fact that count will represent the total number of characters where only one should have been.
I want the Excel spreadsheet cells I populate with C# to expand or contract so that all their content displays without manually adjusting the width of the cells - displaying at "just enough" width to display the data - no more, no less.
I tried this:
_xlSheet = (MSExcel.Excel.Worksheet)_xlSheets.Item[1];
_xlSheet.Columns.AutoFit();
_xlSheet.Rows.AutoFit();
...but it does nothing in my current project (it works fine in a small POC sandbox app that contains no ranges). Speaking of ranges, the reason this doesn't work might have something to do with my having created cell ranges like so:
var rowRngMemberName = _xlSheet.Range[_xlSheet.Cells[1, 1], _xlSheet.Cells[1, 6]];
rowRngMemberName.Merge(Type.Missing);
rowRngMemberName.Font.Bold = true;
rowRngMemberName.Font.Italic = true;
rowRngMemberName.Font.Size = 20;
rowRngMemberName.Value2 = shortName;
...and then adding "normal"/generic single-cell values after that.
In other words, I have values that span multiple columns - several rows of that. Then below that, I revert to "one cell, one value" mode.
Is this the problem?
If so, how can I resolve it?
Is it possible to have independent sections of a spreadsheet whose formatting (autofitting) isn't affected by other parts of the sheet?
UPDATE
As for getting multiple rows to accommodate a value, I'm using this code:
private void AddDescription(String desc)
{
int curDescriptionBottomRow = curDescriptionTopRow + 3;
var range =
_xlSheet.Range[_xlSheet.Cells[curDescriptionTopRow, 1], _xlSheet.Cells[curDescriptionBottomRow, 1]];
range.Merge();
range.Font.Bold = true;
range.VerticalAlignment = XlVAlign.xlVAlignCenter;
range.Value2 = desc;
}
...and here's what it accomplishes:
AutoFit is what is needed, after all, but the key is to call it at the right time - after all other manipulation has been done. Otherwise, subsequent manipulation can lose the autofittedness.
If I get what you are asking correctly you are looking to wrap text... at least thats the official term for it...
xlWorkSheet.Range["A4:A4"].Cells.WrapText = true;
Here is the documentation: https://msdn.microsoft.com/en-us/library/office/ff821514.aspx
Good Afternoon. I am new to stack overflow as a poster but have referenced it for years. I have been researching this problem of mine for about 2 weeks and while I've seen solutions that are close I still am left with an issue.
I am writing a C# gui that reads in an assembly code file and highlights different text items for further processing via another program. My form has a RichTextBox that the text is displayed in. In the case below I am trying to select the text at the location of the ‘;’ until the end of the line and change the text to color red. Here is the code that I am using.
Please note: The files that are read in by the program are of inconsistent length, not all lines are formatted the same so I cannot simply search for the ';' and operate on that.
On another post a member has given an extension method for AppendText which I have gotten to work perfectly except for the original text is still present along with my reformatted text. Here is the link to that site:
How to use multi color in richtextbox
// Loop that it all runs in
Foreach (var line in inArray)
{
// getting the index of the ‘;’ assembly comments
int cmntIndex = line.LastIndexOf(';');
// getting the index of where I am in the rtb at this time.
int rtbIndex = rtb.GetFirstCharIndexOfCurrentLine();
// just making sure I have a valid index
if (cmntIndex != -1)
{
// using rtb.select to only select the desired
// text but for some reason I get it all
rtb.Select(cmntIndex + rtbIndex, rtb.SelectionLength);
rtb.SelectionColor = Color.Red;
}
}
Below is the sample assembly code from a file in it's original form all the text is black:
;;TAG SOMETHING, SOMEONE START
ASSEMBLY CODE ; Assembly comments
ASSEMBLY CODE ; Assembly comments
ASSEMBLY CODE ; Assembly comments
;;TAG SOMETHING, SOMEONE FINISH
When rtb.GetFirstCharIndexOfCurrentLine() is called it returns a valid index of the RTB and I imagine that if I add the value returned by line.LastIndexOf(';') I will then be able to just select the text above that looks like ; Assembly comments and turn it red.
What does happen is that the entire line turns red.
When I use the AppendText method above I get
ASSEMBLY CODE (this is black) ; Assembly comments (this is red) (the rest is black) ASSEMBLY CODE ; Assembly comments
The black code is the exact same code as the recolored text. In this case I need to know how to clear the line in the RTB and/or overwrite the text there. All the options that I have tried result in deletion of those lines.
Anywho, I'm sure that was lengthy but I'm really stumped here and would greatly appreciate advice.
I hope I've understood you correctly.
This loops over each line in the richtextbox, works out which lines are the assembly comments, then makes everything red after the ";"
With FOREACH loop as requested
To use a foreach loop you simply need to keep track of the index manually like so:
// Index
int index = 0;
// Loop over each line
foreach (string line in richTextBox1.Lines)
{
// Ignore the non-assembly lines
if (line.Substring(0, 2) != ";;")
{
// Start position
int start = (richTextBox1.GetFirstCharIndexFromLine(index) + line.LastIndexOf(";") + 1);
// Length
int length = line.Substring(line.LastIndexOf(";"), (line.Length - (line.LastIndexOf(";")))).Length;
// Make the selection
richTextBox1.SelectionStart = start;
richTextBox1.SelectionLength = length;
// Change the colour
richTextBox1.SelectionColor = Color.Red;
}
// Increase index
index++;
}
With FOR loop
// Loop over each line
for(int i = 0; i < richTextBox1.Lines.Count(); i++)
{
// Current line text
string currentLine = richTextBox1.Lines[i];
// Ignore the non-assembly lines
if (currentLine.Substring(0, 2) != ";;")
{
// Start position
int start = (richTextBox1.GetFirstCharIndexFromLine(i) + currentLine.LastIndexOf(";") + 1);
// Length
int length = currentLine.Substring(currentLine.LastIndexOf(";"), (currentLine.Length - (currentLine.LastIndexOf(";")))).Length;
// Make the selection
richTextBox1.SelectionStart = start;
richTextBox1.SelectionLength = length;
// Change the colour
richTextBox1.SelectionColor = Color.Red;
}
}
Edit:
Re-reading your question I'm confused as to whether you wanted to make the ; red as well.
If you do remove the +1 from this line:
int start = (richTextBox1.GetFirstCharIndexFromLine(i) + currentLine.LastIndexOf(";") + 1);
Private Sub RichTextBox1_Click(sender As Object, e As EventArgs) Handles RichTextBox1.Click
Dim MyInt1 As Integer
Dim MyInt2 As Integer
' Reset your RTB back color to white at each click
RichTextBox1.SelectionBackColor = Color.White
' Define the nth first character number of the line you clicked
MyInt1 = RichTextBox1.GetFirstCharIndexOfCurrentLine()
' use that nth to find the line number in the RTB
MyInt2 = RichTextBox1.GetLineFromCharIndex(MyInt1)
'Select the line using an array property of RTB (RichTextBox1.Lines())
RichTextBox1.Select(MyInt1, RichTextBox1.Lines(MyInt2).Length)
' This line would be for font color change : RichTextBox1.SelectionColor = Color.Maroon
' This one changes back color :
RichTextBox1.SelectionBackColor = Color.Yellow
End Sub
' There are a few bugs inherent to the rtb.select method
' It bugs if a line wraps, or fails on an "http" line... probably more.
(I just noticed the default stackoverflow.com character colors on my above code are not correct for comment lines and others.)
I'm looking to write a large 2d array to an Excel worksheet using C#. If the array is 500 x 500, the code that I would use to write this is as follows:
var startCell = Worksheet.Cells[1, 1];
var endCell = Worksheet.Cells[500, 500];
var writeRange = (Excel.Range)Worksheet.Cells[startCell, endCell;
writeRange.Value = myArray;
I get an exception on this line:
var endCell = Worksheet.Cells[500, 500];
As anybody who has used C# and Excel via COM can testify, the error message received is pretty much useless. I think that the issue is that the underlying data structure used for the worksheet is not of sufficient size to index cell 500,500 when I first create the sheet.
Does anybody know how to achieve the desired result? I'm hoping that there is a simple way to re-size the underlying data structure before creating the Range.
Thanks.
Edit: Error message is:
{"Exception from HRESULT: 0x800A03EC"}
With and excel error code of -2146827284.
Update: The link supplied in the comments below alluded to an issue with opening the Excel sheet in compatibility mode. This does seem to be the problem. If I save the document in .xlsx or .xlsm format before running my code, this seems to work. My issue is that I cannot expect my users to do this each time. Is there a programmitcal way of achieving this? Would it simply be a case of opening the file, checking the extension and then saving it in the new format if needed?
Found a solution.
Instead of using Worksheet.Cells[x, x], use Worksheet.get_range(x, x) instead.
I just wrote a small example that worked for me. Originally found at SO this answer. I had to adapt this answer as in my Interop assembly (Excel 14 Object Library) there is no more method Worksheet.get_Range(.., ..)
var startCell = ws.Cells[1, 1];
int row = 500, col = 500;
var endCell = ws.Cells[row, col];
try
{
// access range by Property and cells indicating start and end
var writeRange = ws.Range[startCell, endCell];
writeRange.Value = myArray;
}
catch (COMException ex)
{
Debug.WriteLine(ex.Message);
Debugger.Break();
}