Adding text to multiple rows in Word with a single bookmark - C#

Is it possible to add several rows to a Word document with the help of bookmarks and Open XML?
We have a Word document that serves as a report template.
In that template we need to add several transaction rows.
The problem is that the number of rows isn't static. It could be 0, 1 or 42, for example.
In the current template (which we can change) we have added 3 bookmarks:
TransactionPart, TransactionPart2 and TransactionPart3.
The three transaction parts form a single row with three different pieces of data (ID, Description, Amount).
If we have just one transaction row, we have no problem adding the data to those bookmarks, but what do we do when we need to add a second row? There are no bookmarks for more rows.
Is there a smart way of doing this?
Or should we change the Word document so that the rows end up in a table? Would that solve the problem in a better way?

I would put a single bookmark, let's call it "transactions", inside a 3-column table.
When you know the design of the table but not the number of rows you'll need, the simplest way is to add a row for each line of data you have.
You could accomplish that with code like this:
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
// Make some sample data.
List<String[]> data = new List<string[]>();
for (int i = 0; i < 10; i++)
    data.Add(new String[] { "this", "is", "sparta" });
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open("yourDocument.docx", true))
{
    var mainPart = wordDoc.MainDocumentPart;
    var bookmarks = mainPart.Document.Body.Descendants<BookmarkStart>();
    var bookmark =
        from n in bookmarks
        where n.Name == "transactions"
        select n;
    OpenXmlElement elem = bookmark.First().Parent;
    // Walk up from the bookmark until we reach the enclosing table.
    while (!(elem is DocumentFormat.OpenXml.Wordprocessing.Table))
        elem = elem.Parent;
    var table = elem; // found
    // Keep a copy of the last row; it is the template for every data row.
    var oldRow = elem.Elements<TableRow>().Last();
    DocumentFormat.OpenXml.Wordprocessing.TableRow row = (TableRow)oldRow.Clone();
    // Remove the template row from the table.
    elem.RemoveChild<TableRow>(oldRow);
    foreach (String[] s in data)
    {
        DocumentFormat.OpenXml.Wordprocessing.TableRow newrow = (TableRow)row.Clone();
        var cells = newrow.Elements<DocumentFormat.OpenXml.Wordprocessing.TableCell>();
        // We know we have 3 cells, each with a paragraph, a run and a text element.
        for (int i = 0; i < cells.Count(); i++)
        {
            var c = cells.ElementAt(i);
            var run = c.Elements<Paragraph>().First().Elements<Run>().First();
            var text = run.Elements<Text>().First();
            text.Text = s[i];
        }
        table.AppendChild(newrow);
    }
}
You end up with a table that has one row for each entry in data.
I've tested this code on a fairly basic document and it works.
Good luck, and let me know if I can clarify further.

Related

Select single table row using HtmlAgilityPack and iterate its links

I am trying to iterate over a single table row and its href links, but it does not work as expected: instead of finding only the links in the selected row, it finds all links in the table. What am I doing wrong?
var allRows = doc.DocumentNode.SelectNodes("//table[@id='sortingTable']/tr");
var i = 0;
var rowNumber = 0;
foreach (var row in allRows)
{
    if (row.InnerText.Contains("Text in cell for which row I want to use"))
    {
        rowNumber = i+1;
        break;
    }
    i += 1;
}
var list = new List<SortFile>();
var rowToRead = allRows[rowNumber]; // One specific row
var numberOfLinks = rowToRead.SelectNodes("//a[@href]"); // this does not find the 2 links in the table row but all links in the whole table?
foreach (HtmlNode link in rowToRead.SelectNodes("//a[@href]"))
{
    //HtmlAttribute att = link.Attributes["href"];
    //var text = link.OuterHtml;
}
The XPath you are using (//a[@href]) gets all of the links in the document. A leading // means to search from the document root, regardless of which node you call SelectNodes on.
Use .//a[@href] instead to start from the current node. That only returns the links underneath the tr node you have selected.
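The difference between the two expressions can be observed with the XML classes built into .NET, which apply the same XPath rule. The following is only a minimal sketch: it uses System.Xml instead of HtmlAgilityPack, and the sample markup is made up for illustration. It is written with top-level statements, so on older compilers it would need to be wrapped in a Main method.

```csharp
using System;
using System.Xml;

// Hypothetical sample markup: two rows, the second one holds two links.
var xml = "<table id='sortingTable'>" +
          "<tr><td><a href='a.html'>A</a></td></tr>" +
          "<tr><td><a href='b.html'>B</a><a href='c.html'>C</a></td></tr>" +
          "</table>";
var doc = new XmlDocument();
doc.LoadXml(xml);

// Pick the second row as the context node.
XmlNode row = doc.SelectNodes("//table[@id='sortingTable']/tr")[1];

// A leading // starts at the document root, even when called on the row.
int absolute = row.SelectNodes("//a[@href]").Count;
// A leading .// starts at the context node, so only this row is searched.
int relative = row.SelectNodes(".//a[@href]").Count;

Console.WriteLine($"absolute: {absolute}, relative: {relative}");
// prints "absolute: 3, relative: 2"
```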

Excel List<List<string>> per each row

I have a small program where you can select some database tables and create an Excel file with all values for each table. This is my solution for creating the Excel file:
foreach (var selectedDatabase in this.lstSourceDatabaseTables.SelectedItems)
{
    //creates a new worksheet foreach selected table
    foreach (TableRetrieverItem databaseTable in tableItems.FindAll(e => e.TableName.Equals(selectedDatabase)))
    {
        _xlWorksheet = (Excel.Worksheet) xlApp.Worksheets.Add();
        _xlWorksheet.Name = databaseTable.TableName.Length > 31 ? databaseTable.TableName.Substring(0, 31): databaseTable.TableName;
        _xlWorksheet.Cells[1, 1] = string.Format("{0}.{1}", databaseTable.TableOwner,databaseTable.TableName);
        ColumnRetriever retrieveColumn = new ColumnRetriever(SourceConnectionString);
        IEnumerable<ColumnRetrieverItem> dbColumns = retrieveColumn.RetrieveColumns(databaseTable.TableName);
        var results = retrieveColumn.GetValues(databaseTable.TableName);
        int i = 1;
        // results.Item3 is a List<List<string>> that contains all values from a table; each row is one inner list
        for (int j = 0; j < results.Item3.Count(); j++)
        {
            int tmp = 1;
            foreach (var value in results.Item3[j])
            {
                _xlWorksheet.Cells[j + 3, tmp] = value;
                tmp++;
            }
        }
    }
}
It works, but for a table with 5,000 or more values it takes a very long time.
Does anyone know a faster way to write the List<List<string>> row by row than my for/foreach solution?
My code sample uses a GetExcelColumnName function to convert a column count to the Excel column name.
The idea is that writing Excel cells one by one is very slow. Instead, precompute the whole table of values and assign the result in a single operation. To assign values to a two-dimensional range, use a two-dimensional array of values:
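var rows = results.Item3.Count;
var cols = results.Item3.Max(x => x.Count);
object[,] values = new object[rows, cols];
// copy each row into the array; cells beyond the end of a short row stay null
for (int r = 0; r < rows; r++)
    for (int c = 0; c < results.Item3[r].Count; c++)
        values[r, c] = results.Item3[r][c];
// get the appropriate range (w is the target worksheet)
Range range = w.Range["A3", GetExcelColumnName(cols) + (rows + 2)];
// assign all values at once
range.Value = values;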
You may need to adjust some details of the index ranges; I can't test my code right now.
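The answer refers to GetExcelColumnName without showing it. The sketch below is one possible implementation, not necessarily the one the author used, together with a helper that copies the jagged list into a rectangular array. The helper name ToRectangular and the sample data are assumptions made for illustration, and nothing here touches Excel itself. It is written with top-level statements for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Converts a 1-based column number to its Excel letter name (1 -> A, 27 -> AA).
static string GetExcelColumnName(int columnNumber)
{
    var name = new StringBuilder();
    while (columnNumber > 0)
    {
        int remainder = (columnNumber - 1) % 26;
        name.Insert(0, (char)('A' + remainder));
        columnNumber = (columnNumber - 1) / 26;
    }
    return name.ToString();
}

// Hypothetical helper: copies a jagged list of rows into a rectangular array.
// Cells beyond the end of a short row stay null.
static object[,] ToRectangular(List<List<string>> source)
{
    int rows = source.Count;
    int cols = rows == 0 ? 0 : source.Max(r => r.Count);
    var values = new object[rows, cols];
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < source[r].Count; c++)
            values[r, c] = source[r][c];
    return values;
}

// Made-up sample data standing in for results.Item3.
var sample = new List<List<string>>
{
    new List<string> { "1", "Alice", "10.5" },
    new List<string> { "2", "Bob" }
};

object[,] table = ToRectangular(sample);
// Data starts in row 3, so the last row is (row count + 2).
string lastCell = GetExcelColumnName(table.GetLength(1)) + (table.GetLength(0) + 2);
Console.WriteLine("A3:" + lastCell); // prints "A3:C4"
```

The resulting address is the range that would receive the array in a single assignment.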
As far as I can see, you didn't do any profiling. I recommend profiling first (for example with dotTrace) to see which parts of your code actually cause the performance issues.
In my experience it is rare (almost unheard of) for code to execute slower than the database requests, even if the code is really awful in algorithmic terms.
First, I recommend filling your Excel sheet by rows, not by columns. If your table has many columns, filling by columns causes multiple round trips to the database, which has a big impact on performance.
Second, write to Excel in batches, by rows. Think of Excel files as mini-databases, with the same 'batch is faster than one by one' principle.

C# docx bookmarks loop

I want to iterate through all bookmarks in a document and set the text of each bookmark from the cell values of a DataGridView that is already loaded. I'm stuck in this loop. Any suggestions?
using (Novacode.DocX document = DocX.Load(template))
{
    foreach (Novacode.Bookmark bookmark in document.Bookmarks)
    {
        //MessageBox.Show("\tFound bookmarks {0}", bookmark.Name);
        //var bookmarks = bookmark.Name;
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[0].Value.ToString());
        int i = document.Bookmarks.Count;
        var bookmarks = document.Bookmarks[i].Name;
        document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[0].Value.ToString());
        document.Bookmarks[0].SetText(dataGridViewRow.Cells[1].Value.ToString());
        document.Bookmarks[1].SetText(dataGridViewRow.Cells[2].Value.ToString());
        document.Bookmarks[2].SetText(dataGridViewRow.Cells[3].Value.ToString());
        document.Bookmarks[3].SetText(dataGridViewRow.Cells[4].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[2].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[3].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[4].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[5].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[6].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[7].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[8].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[9].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[10].Value.ToString());
        //document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[11].Value.ToString());
    }
    document.SaveAs(path2);
}
If I understand you correctly, this is what you are trying to achieve with the loop:
using (Novacode.DocX document = DocX.Load(template))
{
    int i = 0;
    foreach (Novacode.Bookmark bookmark in document.Bookmarks)
    {
        var bookmarks = document.Bookmarks[i].Name;
        document.Bookmarks[bookmark.Name].SetText(dataGridViewRow.Cells[i+1].Value.ToString());
        i++;
    }
    document.SaveAs(path2);
}
Here we declare a variable i outside the loop and increment it on every foreach iteration. Alternatively, you could use a for loop instead:
for (int i = 0; i < document.Bookmarks.Count; i++)
{
    //change the code here accordingly
}
Let me know if this helps.
Thank you.

Is there a way to dynamically create an object at run time in .NET 3.5?

I'm working on an importer that takes tab delimited text files. The first line of each file contains 'columns' like ItemCode, Language, ImportMode etc and there can be varying numbers of columns.
I'm able to get the names of each column, whether there's one or 10 and so on. I use a method to achieve this that returns List<string>:
private List<string> GetColumnNames(string saveLocation, int numColumns)
{
    var data = (File.ReadAllLines(saveLocation));
    var columnNames = new List<string>();
    for (int i = 0; i < numColumns; i++)
    {
        var cols = from lines in data
                       .Take(1)
                       .Where(l => !string.IsNullOrEmpty(l))
                       .Select(l => l.Split(delimiter.ToCharArray(), StringSplitOptions.None))
                       .Select(value => string.Join(" ", value))
                   let split = lines.Split(' ')
                   select new
                   {
                       Temp = split[i].Trim()
                   };
        foreach (var x in cols)
        {
            columnNames.Add(x.Temp);
        }
    }
    return columnNames;
}
If I always knew what columns to be expecting, I could just create a new object, but since I don't, I'm wondering is there a way I can dynamically create an object with properties that correspond to whatever GetColumnNames() returns?
Any suggestions?
For what it's worth, here's how I used DataTables to achieve what I wanted.
// saveLocation is the file location
// numColumns comes from another method that gets the number of columns in the file
var columnNames = GetColumnNames(saveLocation, numColumns);
var table = new DataTable();
foreach (var header in columnNames)
{
    table.Columns.Add(header);
}
// itemAttributeData is the file split into lines
// (if a row is still a raw tab-delimited string, split it into fields before adding it)
foreach (var row in itemAttributeData)
{
    table.Rows.Add(row);
}
Although there was a bit more work involved to be able to manipulate the data in the way I wanted, Karthik's suggestion got me on the right track.
You could create a dictionary of strings in which the key is the "property" name and the value is that property's value.
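As a rough illustration of the dictionary idea, each data line can become one Dictionary<string, string> keyed by the column names from the header line. The sample lines and variable names below are invented for the example; a real importer would read the lines from the file. The sketch uses top-level statements for brevity, so on an older compiler such as the one shipped with .NET 3.5 the statements would have to live inside a Main method.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample input: a header line followed by two data lines.
string[] lines =
{
    "ItemCode\tLanguage\tImportMode",
    "A100\ten\tUpdate",
    "B200\tde\tInsert"
};

string[] headers = lines[0].Split('\t');
var records = new List<Dictionary<string, string>>();

for (int i = 1; i < lines.Length; i++)
{
    string[] fields = lines[i].Split('\t');
    var record = new Dictionary<string, string>();
    for (int c = 0; c < headers.Length; c++)
    {
        // A missing trailing field is stored as an empty string.
        record[headers[c]] = c < fields.Length ? fields[c] : string.Empty;
    }
    records.Add(record);
}

// Look up a value by column name instead of by a compiled property.
Console.WriteLine(records[1]["Language"]); // prints "de"
```

Compared with a DataTable, this keeps every value as a string and gives no typed columns, which may or may not matter for an importer.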

How to read tables from a particular place in a document?

When I use the line below, it reads all tables in the document:
foreach (Microsoft.Office.Interop.Word.Table tableContent in document.Tables)
But I want to read only the tables in a particular part of the document, for example from one identifier to another.
The first identifier can have the form [SRS oraganisation_123] and the second the form [SRS Oraganisation_456].
I want to read only the tables between those two identifiers.
Suppose page 34 contains my first identifier: I want to read all tables from that point until I come across my second identifier. I don't want to read the remaining tables.
Please ask if anything in the question needs clarification.
Say the start and end identifiers are stored in variables called myStartIdentifier and myEndIdentifier:
Range myRange = doc.Range();
int iTagStartIdx = 0;
int iTagEndIdx = 0;
if (myRange.Find.Execute(myStartIdentifier))
    iTagStartIdx = myRange.Start;
myRange = doc.Range();
if (myRange.Find.Execute(myEndIdentifier))
    iTagEndIdx = myRange.Start;
foreach (Table tbl in doc.Range(iTagStartIdx,iTagEndIdx).Tables)
{
    // Your code goes here
}
I'm not sure how your program is structured, but if you can access the identifier from tableContent, you should be able to write a LINQ query.
var identifiers = new List<string>();
identifiers.Add("myIdentifier");
// Identifier is a placeholder for however you obtain a table's identifier.
// Tables is a non-generic collection, so Cast is needed before Where.
var tablesWithOnlyTheIdentifiersIWant = document.Tables
    .Cast<Microsoft.Office.Interop.Word.Table>()
    .Where(tableContent => identifiers.Contains(tableContent.Identifier));
foreach (var tableContent in tablesWithOnlyTheIdentifiersIWant)
{
    //Do something
}
See whether the following code helps.
System.Data.DataTable dt = new System.Data.DataTable();
foreach (Microsoft.Office.Interop.Word.Cell c in r.Cells)
{
    if (c.Range.Text == "Content you want to compare")
        dt.Columns.Add(c.Range.Text);
}
foreach (Microsoft.Office.Interop.Word.Row row in newTable.Rows)
{
    System.Data.DataRow dr = dt.NewRow();
    int i = 0;
    foreach (Cell cell in row.Cells)
    {
        if (!string.IsNullOrEmpty(cell.Range.Text) && (cell.Range.Text == "Text you want to compare with"))
        {
            dr[i] = cell.Range.Text;
        }
    }
    dt.Rows.Add(dr);
    i++;
}
See also the third answer to the following linked question:
Replace bookmark text in Word file using Open XML SDK
