I've got a MS Word project where I'm building a number of Panes for users to complete some info which automatically populates text at bookmarks throughout the document. I'm just trying to find the best way of saving these values somehow that I can retrieve them easily when re-opening the document after users have typed in their values.
I could just try to retrieve them from the bookmarks themselves but of course in many cases they contain text values when I'd ideally want to store a primary key somewhere that's not visible to the user and just in case they made changes to the text which would make reverse engineering the values impossible.
I can't seem to find any information on saving custom attributes in a Word document, so would really appreciate some general guidance of how this might be achieved.
Thanks a lot!
I would suggest using custom document properties. There you can store strings in a key-value manner (at least if it is similar to Excel).
I found a thread which explains how to do it:
Set custom document properties with Word interop
After playing around with this a fair bit this is my final code in case it helps someone else, I've found this format easier to understand and work with. It's all based on the referenced article by Christian:
using Office = Microsoft.Office.Core;
using Word = Microsoft.Office.Interop.Word;
using System.Linq; // needed for Cast<T>() and the LINQ query below
using System.Reflection;
Office.DocumentProperties properties = (Office.DocumentProperties)Globals.ThisDocument.CustomDocumentProperties;
//Check if the property exists already
if (!properties.Cast<Office.DocumentProperty>().Any(c => c.Name == "nameofproperty"))
{
//Then add the property and value
properties.Add("nameofproperty", false, Office.MsoDocProperties.msoPropertyTypeString, "yourvalue");
}
else
{
//else just update the value
properties["nameofproperty"].Value = "yourvalue";
}
In terms of retrieving the value, it's as easy as using the same lines at the top to get the properties object, perhaps using the check in the if statement to see whether the property exists, and then retrieving it using properties["nameofproperty"].Value
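The retrieval described above could be sketched roughly as follows. This is only an illustration: the class and method names (CustomProps, TryGet) are made up here, and it assumes the property was stored as a string, as in the code above.

```csharp
using System.Linq;
using Office = Microsoft.Office.Core;

// Hypothetical helper, not part of the Office object model.
public static class CustomProps
{
    // Returns the stored string, or null if no property with that name exists
    // (or if its value is not a string).
    public static string TryGet(Office.DocumentProperties properties, string name)
    {
        Office.DocumentProperty match = properties
            .Cast<Office.DocumentProperty>()
            .FirstOrDefault(p => p.Name == name);

        return match == null ? null : match.Value as string;
    }
}
```

It would be called with the same properties object as above, for example CustomProps.TryGet(properties, "nameofproperty"), so a missing property yields null instead of an exception.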
I'm extracting text out of an MS Word document (.docx). I'm using the DocX C# library for this purpose, which in general works quite well. Now I want to be able to extract tables. The main problem is that if I'm looping through the paragraphs, I can tell whether I'm in a table cell with:
ParentContainer == Cell
but I do not get any information about how many rows and cells. Second possibility which I see is that there is a list with tables as property of the document object. There I can see, how many rows / columns and so on - but I do not know where they are.
Does anyone have an idea how to deal with tables correctly? Any other solution would be appreciated as well :)
I figured it out. The trick is, to check whether each paragraph is followed by a table. This can be done by
...
if (paragraph.FollowingTable != null)
{
tableId = paragraph.FollowingTable.Index;
}
...
The FollowingTable.Index will give you an index to the table, with which you can get all details about the table in the Document.Tables list.
I'm trying to access the 'Last Saved By' file property using C# as part of an MVC web app. I'm able to get pretty much every other property on the file from last modified date, to the owner and I've even used Shell32 to get really obscure properties.
However, I cannot find a way to get the 'Last Saved By' property when I am retrieving the audit properties for each file that I need to report on. The files I need to get this data from are all Excel.
The 'Last Saved By' property can be read using the WindowsAPICodePack.Shell libraries. This property is application specific, so it is not present on some files (for example it is available on .xls but not .csv). The 'Last Saved By' file property is named 'LastAuthor'.
I used nuget to get the package and the code below to access the property:
using Microsoft.WindowsAPICodePack.Shell;
using Microsoft.WindowsAPICodePack.Shell.PropertySystem;
string lastSavedBy = null;
using (var so = ShellObject.FromParsingName(file))
{
var lastAuthorProperty = so.Properties.GetProperty(SystemProperties.System.Document.LastAuthor);
if (lastAuthorProperty != null)
{
var lastAuthor = lastAuthorProperty.ValueAsObject;
if (lastAuthor != null)
{
lastSavedBy = lastAuthor.ToString();
}
}
}
You can access the property in
Workbook.BuiltinDocumentProperties(7)
Maybe it will have index 6 when accessed from the C#. See MSDN documentation.
Quick verification: in Immediate Window (Ctrl+G) of Excel VBA editor (Alt+F11) you can type
? ThisWorkbook.BuiltinDocumentProperties(7) and hit Enter to display the property. This is the Excel part.
There is also the part about how to call Excel from C#, but I am not going to cover that; you can find literally hundreds of answers and examples on this topic.
It may be even more effective to just add a reference to the Microsoft.Office.Tools.Excel namespace and work with it directly, without Excel.
I'm currently writing a function to save and read data to/from an XML document through LINQ. Currently I can write the document just fine, but if I go to add data to an existing item, it simply adds a new item. My goal is to create an address book type system (yes, I know there are 1000 out there; it's just a learning project for myself). I've tried INI and basic text files, but it seems that XML is the best way to go short of using a local DB like SQL. Currently I have:
XDocument doc = XDocument.Load(@"C:\TextXML.xml");
var data = new XElement("Entry",
new XElement("Name", textBox1.Text),
new XElement("Address", richTextBox2.Text),
new XElement("Comments", richTextBox1.Text));
doc.Element("Contacts").Add(data);
doc.Save(@"C:\TextXML.xml");
I searched SO and can't seem to find how to append/replace.
Now this saves everything properly, even when I add to the document, but if I want to update an entry I'm not sure how to do it without creating a new "Entry", nor have I gotten the knack of removing one. (I'm somewhat new to C# still and self-taught, so pardon anything obvious I've overlooked.)
My second issue revolves around loading the information into textboxes.
I'm able to load a list of Entry names into a listbox, but when I go to open the information from that entry I'm not sure how to properly get the nested info.
With the example above I'd need something similar to the following:
XDocument doc = XDocument.Load(@"C:\TextXML.xml");
boxName.Text = The name from the SelectedItem of the list box.
boxAddress.Text = The address child of the element named above etc.
Each method I've tried I wind up with a null reference exception, which tells me I'm not pointing to the right thing, but I'm not sure how to get to those things properly.
I've also tried creating a string and var of the SelectedItem from the list box to help with the naming, and using ToString methods, but still can't figure it out.
For replacing values, there are several functions you can use in XElement:
Value (property with a public setter)
SetValue()
SetElementValue()
SetAttributeValue()
ReplaceWith()
ReplaceNodes()
For example, if you wanted to replace the value in Name:
data.Element("Name").SetValue("NewValue");
or
data.Element("Name").Value = "NewValue";
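Applied to the address book from the question, updating or removing an existing entry could look roughly like the sketch below. It assumes the layout shown in the question (a Contacts root with Entry children that each have Name, Address and Comments) and that Name identifies an entry; the ContactStore class and its method names are made up for illustration.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for the Contacts/Entry layout from the question.
public static class ContactStore
{
    // Finds the first <Entry> whose <Name> matches, or null if there is none.
    public static XElement FindEntry(XDocument doc, string name)
    {
        return doc.Root
                  .Elements("Entry")
                  .FirstOrDefault(e => (string)e.Element("Name") == name);
    }

    // Overwrites the fields of an existing entry, or appends a new entry.
    public static void Upsert(XDocument doc, string name, string address, string comments)
    {
        XElement entry = FindEntry(doc, name);
        if (entry == null)
        {
            entry = new XElement("Entry", new XElement("Name", name));
            doc.Root.Add(entry);
        }

        // SetElementValue creates the child if it is missing,
        // otherwise it replaces the child's value.
        entry.SetElementValue("Address", address);
        entry.SetElementValue("Comments", comments);
    }

    // Removes the entry; returns false when no entry has that name.
    public static bool Remove(XDocument doc, string name)
    {
        XElement entry = FindEntry(doc, name);
        if (entry == null) return false;
        entry.Remove();
        return true;
    }
}
```

After calling Upsert or Remove, the document still has to be written back with doc.Save, as in the question's code.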
For loading, once you have the XElement node you desire, then it's as simple as doing
xelement.Value
Or if it's an attribute:
xelement.Attribute("AttributeName").Value
Using your code as an example:
boxName.Text = doc.Root.Element("Entry").Element("Name").Value; // doc.Root is the Contacts element; this reads the first Entry
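To fill the text boxes from the entry selected in the list box, one option is to look the entry up by name first and read its children with the explicit string cast, which yields null for a missing element instead of throwing a null reference exception. This is a sketch under the same assumptions as the question's XML layout; the Contact and ContactReader types are hypothetical.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical container for the values shown in the text boxes.
public sealed class Contact
{
    public string Name;
    public string Address;
    public string Comments;
}

public static class ContactReader
{
    // Returns the entry whose <Name> equals selectedName, or null if none matches.
    public static Contact Read(XDocument doc, string selectedName)
    {
        XElement entry = doc.Root
            .Elements("Entry")
            .FirstOrDefault(e => (string)e.Element("Name") == selectedName);

        if (entry == null) return null;

        return new Contact
        {
            // (string)element is null when the element is absent, so fall back to "".
            Name = (string)entry.Element("Name") ?? "",
            Address = (string)entry.Element("Address") ?? "",
            Comments = (string)entry.Element("Comments") ?? ""
        };
    }
}
```

With the list box from the question, the call might be ContactReader.Read(doc, listBox.SelectedItem.ToString()), followed by assigning the fields to boxName.Text and boxAddress.Text (listBox is an assumed control name).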
Edit to address comment:
If I'm reading your comment right, you're wanting to extract the Name/Address/etc. data from all the nodes within the Contacts main node?
If so, then you would probably want something like this:
boxName.Text = string.Join(",", doc.Root.Elements("Entry").Select(x => x.Element("Name").Value));
This will give you a single string that has all the names in all Entries separated by a comma. Just change "Name" to "Address" to do the same for addresses.
I'd suggest doing a search for Linq to XML for finding more information about how to use this parsing.
I'm taking over a project so I'm still learning this. The project uses Lucene.NET. I also have no idea if this piece of functionality is correct or not. Anyway, I am instantiating:
var writer = new IndexWriter(directory, analyzer, false);
For specific documents, I'm calling:
writer.DeleteDocuments(new Term(...));
In the end, I'm calling the usual writer.Optimize(), writer.Commit(), and writer.Close().
The field in the Term object is a Guid, converted to a string (.ToString("D")), and is stored in the document, using Field.Store.YES, and Field.Index.NO.
However, with these settings, I cannot seem to delete these documents. The goal is to delete, then add the updated versions, so I'm getting duplicates of the same document. I can provide more code/explanation if needed. Any ideas? Thanks.
The field must be indexed. If a field is not indexed, its terms will not show up in enumeration.
I don't think there is anything wrong with how you are handling the writer.
It sounds as if the term you are passing to DeleteDocuments is not returning any documents. Have you tried to do a query using the same term to see if it returns any results?
Also, if your goal is to simply recreate the document, you can call UpdateDocument:
// Updates a document by first deleting the document(s) containing term and
// then adding the new document. The delete and then add are atomic as seen
// by a reader on the same index (flush may happen only after the add). NOTE:
// if this method hits an OutOfMemoryError you should immediately close the
// writer. See above for details.
You may also want to check out SimpleLucene (http://simplelucene.codeplex.com) - it makes it a bit easier to do basic Lucene tasks.
[Update]
Not sure how I missed it, but @Shashikant Kore is correct: you need to make sure the field is indexed, otherwise your term query will not return anything.
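Putting both answers together, a minimal sketch might look like this. It assumes the Lucene.NET 2.9/3.x API that the question's code suggests (newer versions replace Field.Index with dedicated field types), and the field names "id" and "body" as well as the IndexUpdater class are made up for illustration.

```csharp
using System;
using Lucene.Net.Documents;
using Lucene.Net.Index;

// Hypothetical helper; assumes the Lucene.NET 2.9/3.x API.
public static class IndexUpdater
{
    public static void Upsert(IndexWriter writer, Guid id, string body)
    {
        string key = id.ToString("D");

        var doc = new Document();
        // NOT_ANALYZED indexes the Guid as a single term, so a Term built
        // from the same string can match it. Field.Index.NO would not.
        doc.Add(new Field("id", key, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("body", body, Field.Store.YES, Field.Index.ANALYZED));

        // Deletes any document containing the term, then adds the new one.
        writer.UpdateDocument(new Term("id", key), doc);
    }
}
```

Documents that were already written with Field.Index.NO cannot be matched by the term, so they would need to be re-indexed once before this has any effect on them.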
I'm working on a project where the user can insert data into a document using fields, document properties and variables. The user also needs to be able to remove the data from the document. So far, I've managed to remove the document property and variable, but I'm not sure how I would go about removing the field (that's already inserted into the document). Note that I need to compare the field to a string and, if it matches, delete it from the doc.
I'm assuming you're using .NET Interop with Word. In that case, I believe you're looking for Field.Delete.
This is of course also assuming you know how to get the field you're looking for, which would usually be enumerating through _Document.Fields (or a more finite range if you know one) until you get the right one.
The Field has a Delete method. See the documentation for Field.Delete.
So I think something like this would work:
foreach (Field f in ActiveDocument.Fields)
{
f.Select();
if (f.Type == TypeYouWantToDelete)
{
// Delete the field that matched (the loop variable is f)
f.Delete();
}
}
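Since the question asks to compare the field against a string, a variant that matches on the field code might look like the sketch below. It assumes Word interop as in the answers above; the FieldCleaner class and method name are hypothetical, and counting down through the collection is a precaution taken here so that deleting a field does not shift the fields still to be visited.

```csharp
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the Word object model.
public static class FieldCleaner
{
    // Deletes every field whose code contains the given text; returns how many were deleted.
    public static int DeleteFieldsContaining(Word.Document doc, string text)
    {
        int deleted = 0;

        // Word collections are 1-based; iterate from the end backwards.
        for (int i = doc.Fields.Count; i >= 1; i--)
        {
            Word.Field field = doc.Fields[i];
            string code = field.Code.Text ?? string.Empty;

            if (code.Contains(text))
            {
                field.Delete();
                deleted++;
            }
        }

        return deleted;
    }
}
```

For a field that displays a document property, the text to look for would typically be the property name as it appears in the field code.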