I'm extracting text from an MS Word document (.docx) using the DocX C# library, which in general works quite well. Now I want to be able to extract tables as well. The main problem is that when I loop through the paragraphs, I can tell whether I'm inside a table cell with:
ParentContainer == Cell
but I don't get any information about how many rows and cells the table has. The second possibility I see is the list of tables exposed as a property of the document object. There I can see how many rows and columns each table has, and so on, but I don't know where in the document the tables are.
Does anyone have an idea how to deal with tables correctly? Any other solution would be appreciated as well :)
I figured it out. The trick is to check whether each paragraph is followed by a table. This can be done with:
...
if (paragraph.FollowingTable != null)
{
    tableId = paragraph.FollowingTable.Index;
}
...
FollowingTable.Index gives you the index of the table, which you can use to look up all the details about that table in the Document.Tables list.
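As a rough sketch of how the index could then be used, the snippet below walks the paragraphs, follows FollowingTable.Index into Document.Tables and prints each cell. It assumes the older Novacode namespace (newer Xceed releases of DocX use different namespaces), and the class and method names DocxTableDump and Dump are made up for illustration. It has not been checked against every DocX version.

```csharp
using System;
using System.Collections.Generic;
using Novacode; // assumption: older DocX builds; newer ones ship under Xceed namespaces

public static class DocxTableDump
{
    // Hypothetical helper: prints every table found via FollowingTable.
    public static void Dump(string path)
    {
        using (DocX document = DocX.Load(path))
        {
            var seen = new HashSet<int>(); // avoid printing a table twice

            foreach (Paragraph paragraph in document.Paragraphs)
            {
                if (paragraph.FollowingTable == null) continue;

                int tableId = paragraph.FollowingTable.Index;
                if (!seen.Add(tableId)) continue;

                // Look the table up in the document-level list, as described above.
                Table table = document.Tables[tableId];
                Console.WriteLine("Table {0}: {1} rows x {2} columns",
                    tableId, table.RowCount, table.ColumnCount);

                foreach (Row row in table.Rows)
                {
                    foreach (Cell cell in row.Cells)
                    {
                        // A cell can hold several paragraphs; print each one.
                        foreach (Paragraph cellParagraph in cell.Paragraphs)
                            Console.WriteLine("  " + cellParagraph.Text);
                    }
                }
            }
        }
    }
}
```

Because the paragraph that precedes the table is known at the point where the table is printed, this also answers the "where is the table" part of the question.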
Related
I have a table in an Excel worksheet from which I need to programmatically remove entire rows using VSTO. After a lot of searching here and everywhere else, I was unable to find the answer. Due to some unrelated code, I cannot delete the first row of the table, but I need to remove all other rows.
Here are the specific requirements:
One of the functions of this addin is to populate the table. This is done through a loop starting with the "root" named range in the left column of the first row of the table.
Whenever populating the table, I first need to delete all data from the table and then add the new data. I need to use the "root" to add the data, so I can't have it deleted.
I am using the Table for the automated formatting instead of formatting the table manually after adding each cell.
I never know how many rows will be added, but it will always be at least one.
After banging my head on this for a few hours, I slept on it and came at it refreshed this morning. After much trial and error, here is the code I came up with.
var deplTable = ThisSheet.Evaluate("DeploymentTable");
if (deplTable.ListObject.ListRows.Count > 1)
{
    // Keep deleting the second row until only the first row is left.
    do deplTable.ListObject.ListRows[2].Delete();
    while (deplTable.ListObject.ListRows.Count > 1);
}
NOTE: ThisSheet is set to the correct sheet earlier. The application works on multiple sheets, so it needs to be flexible.
I tried this a few ways before finally getting it to work. Looping through the rows gave unexpected results; possibly due to timing issues between Excel and VSTO.
Hope this helps other people!
I am making an add-in and I am trying to format the output it generates using the "Format as Table" table styles provided by Excel.
These are the styles you get from the 'Format as Table' button on the Home tab of the ribbon.
I am using the following code:
SourceRange.Worksheet.ListObjects.Add(XlListObjectSourceType.xlSrcRange,
    SourceRange, System.Type.Missing, xlYesNo, System.Type.Missing).Name =
    TableName;
SourceRange.Select();
SourceRange.Worksheet.ListObjects[TableName].TableStyle = TableStyleName;
TableStyleName is any style name, such as TableStyleMedium17; you can see the name by hovering over a particular style in Excel.
My problem is that even if I limit SourceRange to 10 columns, all the columns right to the end of the sheet get selected and are treated as one table.
Because of that, the table I populate right next to it is also treated as part of the first table that was generated. Since both tables have the same column names, Excel automatically changes the column names in all the following tables that are generated.
Also, because I am generating the tables in a loop, after 2 tables are generated I get the error:
A table cannot overlap another table.
PS: I am explicitly defining SourceRange as:
var startCell = (Range)worksheet.Cells[startRow, startCol];
var endCell = (Range)worksheet.Cells[endRow, endCol];
var SourceRange = worksheet.get_Range(startCell, endCell);
Kindly suggest a way out.
We were able to figure out what was happening on our end. On the
xlWorkbook.Worksheets.Add([before],[after], Type.Missing, Type.Missing)
call, we had to flip before and after, since we wanted the sheets to move right, not left, and then accessed
xlWorkbook.Worksheets[sheetCount]
by increasing sheetCount for however many sheets were being generated.
Having it the other way around caused the worksheet to access a table format previously assigned by the SourceRange.Worksheet.ListObjects[TableName].TableStyle = TableStyleName call.
So, I got around this problem a week after posting this; sorry I did not update in the rush of things.
This is actually built-in Excel functionality.
You can't help it; the Excel application will keep doing this.
So, ultimately, I wrote my own table styles in C# and applied them to the Excel range referred to as SourceRange. It's just like writing CSS.
If you are interested in the details, comment on this question or contact me by email from my profile.
I am working on writing the results of a database query into tables in Word. I have already written the code for accessing and creating objects from the query results, and I am a little stuck on writing these into a Word template that was given to me. It's essentially a summary document in which I have to insert tables of the data I pulled from the database at the correct position in the document. For instance, the document has, say, 4 section headers, and under each header there is some text after which I have to insert a table. One such header could look like this:
School Records
Below you will find a table in which all school records will be listed:
So when I go to print my school record data object, I need a way to insert my data as a table right below the line above in the Word document. Can anyone tell me how I can first find the correct position in the document, and then how to create a table in Word from C#?
There is an article about how to insert images at a specific position in a Word document. It may be useful for inserting a table as well.
Add a bookmark to the Word template, then search for the bookmark and replace it with the image/file.
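For a table rather than an image, the same bookmark idea might look roughly like the sketch below. It assumes the Word interop assembly (Microsoft.Office.Interop.Word) is referenced and that the template already contains a bookmark at the spot where the table belongs. The names BookmarkTable and InsertAtBookmark are invented for the example.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

public static class BookmarkTable
{
    // Hypothetical helper: builds a table at the position of an existing bookmark.
    // rows[r][c] holds the text for data row r, column c.
    public static Word.Table InsertAtBookmark(
        Word.Document doc, string bookmarkName, string[] headers, string[][] rows)
    {
        if (!doc.Bookmarks.Exists(bookmarkName))
            throw new ArgumentException("Bookmark not found: " + bookmarkName);

        // The bookmark's range marks where the table should go.
        Word.Range range = doc.Bookmarks[bookmarkName].Range;
        Word.Table table = doc.Tables.Add(range, rows.Length + 1, headers.Length);

        // Word table cells are 1-based; row 1 is used as the header row.
        for (int c = 0; c < headers.Length; c++)
            table.Cell(1, c + 1).Range.Text = headers[c];

        for (int r = 0; r < rows.Length; r++)
            for (int c = 0; c < headers.Length; c++)
                table.Cell(r + 2, c + 1).Range.Text = rows[r][c];

        return table;
    }
}
```

With one bookmark per section header (for example one placed under the "School Records" text), the same helper could be called once per section.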
Is it possible to set a foreign key between two Excel sheets and query records from the two sheets?
I have an Excel sheet of student details and another sheet containing the total marks. The field common to both sheets is RegID. I need to display the Name and Marks from the two sheets in a grid. How can this be done? Please help.
Query = "SELECT Status from [Viewer$] as a LEFT JOIN [UI$] as b ON a.[Responsible Person] = b.[Responsible Person] where b.[Responsible Person] = null" ;
This query is not returning the records to a dataset...
If you mean reading data from two sheets and combining it into records that you then enter in a database where the FK is set, it should be possible.
Read: Reading Excel Cells using C#
You can open both sheets and access them in a loop, but you will have to coordinate building the data from your sheets yourself.
NewBie,
I've used the Excel 2007 Open XML SDK (DocumentFormat.OpenXml) to great effect in this scenario. It's basically a LINQ library that loads the Excel documents into objects and allows you to query them inside C# just like any other LINQ object. Microsoft actually does have (after a quick search) a pretty good 'idiot's guide' on this topic. You can find it here:
[edit] - added a few more links
http://msdn.microsoft.com/en-us/library/dd920313%28v=office.12%29.aspx
http://blogs.msdn.com/b/johnrdurant/archive/2010/02/19/excel-open-xml-linq-part-i.aspx
http://www.briankeating.net/blog/post/2010/04/26/Linq-to-Xlsx.aspx
If you use LINQ, it's a no-brainer and is definitely the only way that I would go with this type of task. It will also cater for your idea of FKs, as it allows you to join on any arbitrary field that you care to define (as with any LINQ query), so it should address your requirement perfectly.
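To illustrate just the join part, here is a minimal sketch that assumes the rows of both sheets have already been read into plain objects. The Student and Mark classes and the NameAndMarks method are invented for the example; how the rows get loaded from the workbook is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row types standing in for the two worksheets.
public class Student
{
    public int RegID { get; set; }
    public string Name { get; set; }
}

public class Mark
{
    public int RegID { get; set; }
    public int Total { get; set; }
}

public static class SheetJoin
{
    // Inner join on RegID: one result per student that has a matching marks row.
    public static List<(string Name, int Total)> NameAndMarks(
        IEnumerable<Student> students, IEnumerable<Mark> marks)
    {
        return (from s in students
                join m in marks on s.RegID equals m.RegID
                select (s.Name, m.Total)).ToList();
    }
}
```

The resulting list can be bound to a grid directly. Students without a marks row are dropped by this inner join; a left join would need a group join with DefaultIfEmpty instead.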
Not possible using SQL. Neither UNIQUE nor FOREIGN KEY is supported where the data source is an Excel workbook.
I have a CSV data file consisting of 2 columns and 1000 rows. Loading it into my DataGridView takes a lot of time, so I just want to show the first 6 rows as a preview of the file to the user. Is there any way to show only the first 6 rows in my DataGridView? The following is the code I use to display the data in the DataGridView.
DataTable csvDataTable = CSVReader.ReadCSVFile(textBoxCsv.Text, true);
dataGridViewCsvData.DataSource = csvDataTable;
dataGridViewCsvData.SelectionMode = DataGridViewSelectionMode.FullColumnSelect;
CSVReader is an open-source project, isn't it? Try adding a ReadTopLines method to that class that reads only the top N lines, with N given as a parameter.
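A standalone sketch of what such a method could look like is shown below. It is not part of CSVReader: the class name CsvPreview and the signature are assumptions. It uses a naive comma split, so quoted fields containing commas are not handled, and it assumes unique header names and no line with more fields than the first one.

```csharp
using System;
using System.Data;
using System.IO;

public static class CsvPreview
{
    // Hypothetical helper: reads at most maxRows data rows and then stops,
    // so the rest of the file is never parsed.
    public static DataTable ReadTopLines(TextReader reader, bool headerRow, int maxRows)
    {
        var table = new DataTable();
        bool first = true;
        string line;

        while (table.Rows.Count < maxRows && (line = reader.ReadLine()) != null)
        {
            string[] fields = line.Split(',');

            if (first)
            {
                first = false;
                // Column names come from the header, or are generated as Column1, Column2, ...
                for (int i = 0; i < fields.Length; i++)
                    table.Columns.Add(headerRow ? fields[i] : "Column" + (i + 1));
                if (headerRow) continue; // the header line is not a data row
            }

            table.Rows.Add(fields);
        }

        return table;
    }
}
```

For the preview in the question, this could be called with a StreamReader opened on textBoxCsv.Text, headerRow set to true and maxRows set to 6, and the result assigned to dataGridViewCsvData.DataSource.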
Every DataTable has its own DefaultView.
http://msdn.microsoft.com/en-us/library/system.data.datatable.defaultview.aspx
You can then get a table from the view with DefaultView.ToTable, and you can manipulate the data in your view the way you want: you can filter and query it.
You can find out more about expressions here:
http://msdn.microsoft.com/en-us/library/system.data.datacolumn.expression.aspx
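As a sketch of the view-based idea, the helper below copies only the first few rows of a table's DefaultView into a new table that can be bound to the grid. The names TablePreview and TopRows are invented for the example. Note that the whole file is still read into the original DataTable first, so this only limits what the grid has to display.

```csharp
using System;
using System.Data;

public static class TablePreview
{
    // Hypothetical helper: returns a new table with the same columns
    // and at most 'count' rows, taken in DefaultView order.
    public static DataTable TopRows(DataTable source, int count)
    {
        DataTable preview = source.Clone(); // copies the schema, not the data

        foreach (DataRowView rowView in source.DefaultView)
        {
            if (preview.Rows.Count >= count) break;
            preview.ImportRow(rowView.Row);
        }

        return preview;
    }
}
```

In the question's code this would mean assigning TablePreview.TopRows(csvDataTable, 6) to dataGridViewCsvData.DataSource instead of csvDataTable itself.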
OR, since CSVReader is an open-source project, you can simply change
public DataTable CreateDataTable(bool headerRow)
Add the number of lines as a parameter to this method, and you will get what you need without reading the whole file.
I didn't read the whole source, so there might be a solution that doesn't even require changing the code.
Use open source 100%. Change it, customize it, send your patches! People do appreciate it! And you will gain experience, knowledge and new friends who might help you in the future :)