OpenXML Read CustomXMLPart - c#

Hi, I am trying to read through all the CustomXMLParts of some Excel files with the following code, but I cannot figure out how to get the XML data of each individual part.
I can't seem to find the solution anywhere online.
public void getCustomXMLParts(string path){
    int nCount = 0;
    // Open the document read-only (the second argument is false).
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(path, false)){
        // Code removed here.
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
        foreach (CustomXmlPart xmlPart in workbookPart.CustomXmlParts)
        {
            XmlDocument oDoc = new XmlDocument();
            //oDoc.Load();
            Response.Write("<Textarea cols=200 rows=10>"+ xmlPart.Uri + "</textarea>");
            nCount = nCount + 1;
        }
    }
    Response.Write("<BR>XML Parts Count=" + nCount);
}
There is XML data stored in multiple XML parts, and I would just like to read each XML part into my C# code.
Thanks.

Figured it out; I had to use the following:
using (StreamReader reader = new StreamReader(xmlPart.GetStream(FileMode.Open, FileAccess.Read)))
{
    string FullXML = reader.ReadToEnd();
}
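Where the Open XML SDK is not available, the same data can be reached with only the base class library, because an .xlsx file is a ZIP package. The sketch below is not the approach from the answer above. It assumes the custom XML parts are stored as customXml/item1.xml, customXml/item2.xml and so on (the naming Office typically uses, not something the format guarantees), and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Xml;

public static class CustomXmlZipReader
{
    // Returns a map from ZIP entry name to the loaded XML document for every
    // entry that looks like customXml/itemN.xml (assumed naming convention).
    public static Dictionary<string, XmlDocument> ReadCustomXml(Stream packageStream)
    {
        var result = new Dictionary<string, XmlDocument>();
        using (var zip = new ZipArchive(packageStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                string name = entry.FullName;
                bool isItem =
                    name.StartsWith("customXml/item", StringComparison.OrdinalIgnoreCase) &&
                    !name.StartsWith("customXml/itemProps", StringComparison.OrdinalIgnoreCase) &&
                    name.EndsWith(".xml", StringComparison.OrdinalIgnoreCase);
                if (!isItem)
                {
                    continue; // skip worksheets, property parts, relationships, etc.
                }

                using (Stream entryStream = entry.Open())
                {
                    var doc = new XmlDocument();
                    doc.Load(entryStream);
                    result[name] = doc;
                }
            }
        }
        return result;
    }
}
```

Calling CustomXmlZipReader.ReadCustomXml(File.OpenRead(path)) and reading OuterXml from each returned document gives the same kind of string as FullXML above.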

Related

Creating word add in with OpenXML

I'm new to VSTO and OpenXML and I would like to develop a Word add-in. This add-in should use OpenXML, so is it possible to edit a document that is currently open?
For example, I have a Word document open and I would like to replace some text using OpenXML on a button click.
So far I have this code:
var fileFullName = Globals.ThisAddIn.Application.ActiveDocument.FullName;
Globals.ThisAddIn.Application.ActiveDocument.Close(WdSaveOptions.wdSaveChanges, WdOriginalFormat.wdOriginalDocumentFormat, true);
//edit document using OpenXml here
Globals.ThisAddIn.Application.Documents.Open(fileFullName);
And I found this article on adding text to Word using OpenXML:
How to: Open and add text to a word processing document (Open XML SDK)
But I can't figure out how to make them work together.
Can anyone help me with this? Thanks.
This is how I solved it:
private void button1_Click(object sender, RibbonControlEventArgs e)
{
    var fileFullName = Globals.ThisAddIn.Application.ActiveDocument.FullName;
    // Save and close the document in Word before editing the file with OpenXML.
    Globals.ThisAddIn.Application.ActiveDocument.Close(WdSaveOptions.wdSaveChanges, WdOriginalFormat.wdOriginalDocumentFormat, true);
    OpenAndAddTextToWordDocument(fileFullName, "[USER_NAME]");
    // Reopen the edited file in Word.
    Globals.ThisAddIn.Application.Documents.Open(fileFullName);
}
public static void OpenAndAddTextToWordDocument(string filepath, string txt)
{
    // Open a WordprocessingDocument for editing using the filepath.
    // The using block closes the document even if an exception is thrown.
    using (WordprocessingDocument wordprocessingDocument = WordprocessingDocument.Open(filepath, true))
    {
        // Assign a reference to the existing document body.
        Body body = wordprocessingDocument.MainDocumentPart.Document.Body;
        // Add new text.
        DocumentFormat.OpenXml.Wordprocessing.Paragraph para = body.AppendChild(new DocumentFormat.OpenXml.Wordprocessing.Paragraph());
        Run run = para.AppendChild(new Run());
        run.AppendChild(new Text(txt));
    }
}
}
You can do something like this:
public static void SearchAndReplace(string document)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
    {
        string docText = null;
        using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
        {
            docText = sr.ReadToEnd();
        }
        Regex regexText = new Regex("Hello world!");
        docText = regexText.Replace(docText, "Hi Everyone!");
        using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
        {
            sw.Write(docText);
        }
    }
}
Please read this post for more details.
https://msdn.microsoft.com/en-us/library/office/bb508261.aspx
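One detail to watch when adapting the regex approach to a placeholder such as "[USER_NAME]": square brackets are regex metacharacters, so an unescaped pattern is read as a character class and not as literal text. A minimal sketch of a literal replace follows; the class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderReplacer
{
    // Replaces every literal occurrence of placeholder in text with value.
    public static string ReplaceLiteral(string text, string placeholder, string value)
    {
        // Regex.Escape turns metacharacters such as [ and . into literals.
        string pattern = Regex.Escape(placeholder);

        // A MatchEvaluator returns the value verbatim, so sequences like "$1"
        // in the value are not interpreted as substitution patterns.
        return Regex.Replace(text, pattern, match => value);
    }
}
```

For example, PlaceholderReplacer.ReplaceLiteral(docText, "[USER_NAME]", "Alice") could stand in for the regexText.Replace call above. Two caveats apply to any replace on the raw document XML: Word may split visible text across several runs, so the placeholder is not always contiguous in the markup, and a value containing characters such as & or < would need XML escaping first.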

Format text controls on pdf using iTextSharp and C#

I have a PDF document that I fill with values using the code below.
using(MemoryStream ms = new MemoryStream())
{
    // Fill the PDF with the XFA
    using(PdfStamper stamper = new PdfStamper(oInPDF, ms))
    {
        stamper.Writer.CloseStream = false;
        XfaForm.SetXfa(oXFA, stamper.Reader, stamper.Writer);
    }
    // Code to flatten the filled PDF.
}
I am trying to draw a red box around a displayed value to highlight it when the value is not in the expected range.
I would like to know how to locate the position of a control on a PDF page using iTextSharp and C#.
Any help or info on this is much appreciated.
Many thanks.
I finally managed to draw borders around controls with the code below.
XmlDocument newXMLDoc = new XmlDocument();
newXMLDoc.LoadXml(@"<border><edge thickness=""1.3mm""><color value=""0, 0, 255""/></edge></border>");
if (Rs.Rows.Count > 0)
{
    foreach (DataRow query in Rs.Rows)
    {
        if (isRET)
        {
            // Look up the field by name. The first column of the current row is
            // assumed to hold the field name (the original snippet wrote Rs[0]).
            XmlNode field = oXFA.DomDocument.SelectSingleNode("//t:*[@name='" + query[0] + "']", oNameSpace);
            if (field != null)
            {
                XmlNode newNode = oXFA.DomDocument.ImportNode(newXMLDoc.SelectSingleNode("border"), true);
                field.AppendChild(newNode);
            }
        }
    }
}
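The lookup-and-import step can be tried in isolation with plain System.Xml. This is only a sketch: the namespace URI (urn:example:xfa-template) and the field name used in the usage note stand in for the real XFA template, and the class and method names are hypothetical.

```csharp
using System;
using System.Xml;

public static class XfaBorderSketch
{
    // Finds the element whose name attribute equals fieldName (any element in
    // the namespace bound to prefix "t") and appends the given border markup.
    // Returns false when no such element exists.
    public static bool AddBorder(XmlDocument template, XmlNamespaceManager ns,
                                 string fieldName, string borderXml)
    {
        XmlNode field = template.SelectSingleNode("//t:*[@name='" + fieldName + "']", ns);
        if (field == null)
        {
            return false;
        }

        // Nodes belong to the document that created them, so the border has
        // to be imported into the template document before it is appended.
        var fragment = new XmlDocument();
        fragment.LoadXml(borderXml);
        XmlNode imported = template.ImportNode(fragment.DocumentElement, true);
        field.AppendChild(imported);
        return true;
    }
}
```

With a namespace manager that maps "t" to the template namespace, a call such as XfaBorderSketch.AddBorder(doc, ns, "Amount", borderXml) mirrors the body of the if block above. For the red box asked about in the question, the color value in the border markup would be "255, 0, 0"; the value "0, 0, 255" used above is blue.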

Changes are not getting saved - modify Track Changes Author in Header - OpenXML HeaderPart

Using C# in VS, I am trying to change the author name of tracked changes in a Word document header, based on their dates. In the debugger, the author's name appears to be changed, but the changes are not saved to the document. I have included the 'headerPart.Header.Save()' line, which would presumably do the trick, but no luck. I need help saving the document after the changes have been made. Thanks!
private void changeRevAuthor(string docPath, string input_project_date)
{
    using (Stream stream = System.IO.File.Open(docPath, FileMode.OpenOrCreate))
    {
        stream.Seek(0, SeekOrigin.End);
        XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
        using (WordprocessingDocument document = WordprocessingDocument.Open(stream, true))
        {
            foreach (HeaderPart headerPart in document.MainDocumentPart.HeaderParts)
            {
                foreach (OpenXmlElement headerElement in headerPart.RootElement.Descendants())
                {
                    OpenXmlElement children = headerPart.RootElement;
                    XElement xchildren = XElement.Parse(children.OuterXml);
                    var ychildren = xchildren.Descendants().Where(x => x.Attributes(w + "author").Count() > 0);
                    foreach (XElement descendant in ychildren)
                    {
                        var date = descendant.Attribute(w + "date").ToString().Substring(8, 10);
                        if (DateTime.Parse(date) > DateTime.Parse(input_project_date))
                        {
                            descendant.SetAttributeValue(w + "author", "new author name");
                            Debug.WriteLine("this is the new one" + descendant);
                        }
                    }
                }
                headerPart.Header.Save();
                Debug.WriteLine("We got here");
            }
            document.Close();
        }
    }
}
Use the Save() method on the main document part's Document to save the changes to the document, i.e.:
document.MainDocumentPart.Document.Save();
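One more thing worth checking in the question's code: XElement.Parse(children.OuterXml) builds a new LINQ to XML tree from a string, so SetAttributeValue changes that copy and not the header element held by the HeaderPart. That may be why the debugger shows the new author while the saved file does not. The sketch below isolates the rewrite as a string-to-string function using only System.Xml.Linq; the class and method names are made up, and how the returned XML is put back into the header part is left open here.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public static class RevisionAuthorRewriter
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Parses headerXml, replaces w:author on every element whose w:date is
    // later than cutoff, and returns the modified XML as a new string.
    // The input string itself is never changed.
    public static string Rewrite(string headerXml, DateTime cutoff, string newAuthor)
    {
        XElement root = XElement.Parse(headerXml);
        var tracked = root.DescendantsAndSelf()
                          .Where(e => e.Attribute(W + "author") != null);

        foreach (XElement element in tracked)
        {
            XAttribute dateAttr = element.Attribute(W + "date");
            DateTime date;
            if (dateAttr != null &&
                DateTime.TryParse(dateAttr.Value, CultureInfo.InvariantCulture,
                                  DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
                                  out date) &&
                date > cutoff)
            {
                element.SetAttributeValue(W + "author", newAuthor);
            }
        }
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

The point of the sketch is that the modified markup only exists in the returned string. Until that string (or an equivalent change made directly on the header's own elements) reaches the part, saving the document writes the original author names.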

EPPlus with a template is not working as expected

I am currently using the EPPlus project to manipulate some .xlsx files. The basic idea is that I have to create a new file from a given template.
But when I create the new file from a template, all calculated columns in the tables are messed up.
The code I am using is the following:
static void Main(string[] args)
{
    const string templatePath = "template_worksheet.xlsx"; // the path of the template
    const string resultPath = "result.xlsx"; // the path of our result
    using (var pck = new ExcelPackage(new FileInfo(resultPath), new FileInfo(templatePath))) // creating a package with the given template, and our result as the new stream
    {
        // note that I am not doing any work ...
        pck.Save(); // saving our work
    }
}
For example, for a .xlsx file that has a table with 3 columns, where the last column is just the sum of the others, the program creates a .xlsx file where the last column has the same value in all rows (a value that is correct only for the first row).
The following images show the result:
Now the questions are:
What is going on here? Is my code wrong?
How can I accomplish this task without that unexpected behavior?
You are definitely on to something there. I was able to reproduce it myself. It has to do with the Table you created. If you open your file and remove it using the "Convert To Range" option in the Table Tools tab, the problem goes away.
I looked at the source code: it extracts the XML files at the zip level, and I didn't see any indication that it was actually messing with them. It seemed to be a straight copy.
Very strange, because if we create and save the xlsx file including a table from EPPlus, the problem is not there. This works just fine:
[TestMethod]
public void Template_Copy_Test()
{
    //http://stackoverflow.com/questions/28722945/epplus-with-a-template-is-not-working-as-expected
    const string templatePath = "c:\\temp\\testtemplate.xlsx"; // the path of the template
    const string resultPath = "c:\\temp\\result.xlsx"; // the path of our result
    //Throw in some data
    var dtdata = new DataTable("tblData");
    dtdata.Columns.Add(new DataColumn("Col1", typeof(string)));
    dtdata.Columns.Add(new DataColumn("Col2", typeof(int)));
    dtdata.Columns.Add(new DataColumn("Col3", typeof(int)));
    for (var i = 0; i < 20; i++)
    {
        var row = dtdata.NewRow();
        row["Col1"] = "String Data " + i;
        row["Col2"] = i * 10;
        row["Col3"] = i * 100;
        dtdata.Rows.Add(row);
    }
    var templateFile = new FileInfo(templatePath);
    if (templateFile.Exists)
        templateFile.Delete();
    using (var pck = new ExcelPackage(templateFile))
    {
        var ws = pck.Workbook.Worksheets.Add("Data");
        ws.Cells["A1"].LoadFromDataTable(dtdata, true);
        for (var i = 2; i <= dtdata.Rows.Count + 1; i++)
            ws.Cells[i, 4].Formula = String.Format("{0}*{1}", ExcelCellBase.GetAddress(i, 2), ExcelCellBase.GetAddress(i, 3));
        ws.Tables.Add(ws.Cells[1, 1, dtdata.Rows.Count + 1, 4], "TestTable");
        pck.Save();
    }
    using (var pck = new ExcelPackage(new FileInfo(resultPath), templateFile)) // creating a package with the given template, and our result as the new stream
    {
        // note that I am not doing any work ...
        pck.Save(); // saving our work
    }
}
BUT.....
If we open testtemplate.xlsx, remove the table, save and close the file, reopen it, and reinsert the exact same table, the problem shows up when you run this:
[TestMethod]
public void Template_Copy_Test2()
{
    //http://stackoverflow.com/questions/28722945/epplus-with-a-template-is-not-working-as-expected
    const string templatePath = "c:\\temp\\testtemplate.xlsx"; // the path of the template
    const string resultPath = "c:\\temp\\result.xlsx"; // the path of our result
    var templateFile = new FileInfo(templatePath);
    using (var pck = new ExcelPackage(new FileInfo(resultPath), templateFile)) // creating a package with the given template, and our result as the new stream
    {
        // note that I am not doing any work ...
        pck.Save(); // saving our work
    }
}
It has to be something buried in their zip copy methods, but nothing jumped out at me.
At least you can see about working around it.
Ernie
Try the following code. It takes the formatting and other rules and adds them as an XML node to another file. Ernie described it really well in "Importing excel file with all the conditional formatting rules to epplus". The best part of the solution is that you can also import formatting along with your other rules. It should take you close to what you need.
//File with your rules, can be your template
var existingFile = new FileInfo(@"c:\temp\temp.xlsx");
//Other file where you want the rules
var existingFile2 = new FileInfo(@"c:\temp\temp2.xlsx");
using (var package = new ExcelPackage(existingFile))
using (var package2 = new ExcelPackage(existingFile2))
{
    //Make sure there is a document element for the source
    var worksheet = package.Workbook.Worksheets.First();
    var xdoc = worksheet.WorksheetXml;
    if (xdoc.DocumentElement == null)
        return;
    //Make sure there is a document element for the destination
    var worksheet2 = package2.Workbook.Worksheets.First();
    var xdoc2 = worksheet2.WorksheetXml;
    if (xdoc2.DocumentElement == null)
        return;
    //get the extension list node 'extLst' from the ws with the formatting
    var extensionlistnode = xdoc
        .DocumentElement
        .GetElementsByTagName("extLst")[0];
    //Create the import node and append it to the end of the xml document
    var newnode = xdoc2.ImportNode(extensionlistnode, true);
    xdoc2.LastChild.AppendChild(newnode);
    package2.Save();
}
}
Try this:
var package = new ExcelPackage(excelFile);
var excelSheet = package.Workbook.Worksheets[1];
for (var i = 1; i < 5; i++){
    excelSheet.InsertRow(i, 1, 1); // Use value of i or whatever is suitable for you
}
package.Workbook.Calculate();
Inserting a new row copies the previous row's format and its formula if the last parameter is set to 1.

How to extract text data from MS-Word doc file

I am developing a resume archive where people upload their resumes, and each resume is saved in a specific location. The important thing is that people may use any version of MS Word to prepare their resume, so the file extension could be .doc or .docx. Is there a free library I can use to extract text from .doc or .docx files that works for all MS Word versions and also works if MS Word is not installed on the PC? I searched Google and found some articles on extracting text from .doc files, but I am not sure whether they work for all MS Word versions. Please tell me which library I should use to extract text from MS Word files irrespective of the Word version, and point me to some good articles on this issue.
Also, is there a viewer I can use to show .doc file content from my C# apps irrespective of the MS Word version?
Thanks.
I got the answer:
**Need to add this reference Microsoft.Office.Interop.Word**
using System.Runtime.InteropServices.ComTypes;
using System.IO;
public static string GetText(string strfilename)
{
    string strRetval = "";
    System.Text.StringBuilder strBuilder = new System.Text.StringBuilder();
    if (File.Exists(strfilename))
    {
        try
        {
            using (StreamReader sr = File.OpenText(strfilename))
            {
                string s = "";
                while ((s = sr.ReadLine()) != null)
                {
                    strBuilder.AppendLine(s);
                }
            }
        }
        catch (Exception ex)
        {
            SendErrorMail(ex);
        }
        finally
        {
            if (System.IO.File.Exists(strfilename))
                System.IO.File.Delete(strfilename);
        }
    }
    if (strBuilder.ToString().Trim() != "")
        strRetval = strBuilder.ToString();
    else
        strRetval = "";
    return strRetval;
}
public static string SaveAsText(string strfilename)
{
    string fileName = "";
    object miss = System.Reflection.Missing.Value;
    Microsoft.Office.Interop.Word.Application wordApp = null;
    Microsoft.Office.Interop.Word.Document doc = null;
    try
    {
        wordApp = new Microsoft.Office.Interop.Word.Application();
        fileName = Path.GetDirectoryName(strfilename) + @"\" + Path.GetFileNameWithoutExtension(strfilename) + ".txt";
        doc = wordApp.Documents.Open(strfilename, false);
        doc.SaveAs(fileName, Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatDOSText);
    }
    catch (Exception ex)
    {
        SendErrorMail(ex);
    }
    finally
    {
        if (doc != null)
        {
            doc.Close(ref miss, ref miss, ref miss);
            System.Runtime.InteropServices.Marshal.ReleaseComObject(doc);
            doc = null;
        }
        // Quit Word so the WINWORD.EXE process does not keep running.
        if (wordApp != null)
        {
            wordApp.Quit(ref miss, ref miss, ref miss);
            System.Runtime.InteropServices.Marshal.ReleaseComObject(wordApp);
            wordApp = null;
        }
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
    return fileName;
}
See the following:
http://msdn.microsoft.com/en-us/library/cc974107%28office.12%29.aspx
How can i read .docx file?
Microsoft Interop Word Nuget
string docPath = @"C:\whereEverTheFileIs.doc";
Application app = new Application();
Document doc = app.Documents.Open(docPath);
string words = doc.Content.Text;
doc.Close();
app.Quit();
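For .docx files specifically, text can be pulled out without Word installed, since a .docx is a ZIP package whose main text lives in word/document.xml. A minimal sketch using only System.IO.Compression and LINQ to XML follows. It does not handle the older binary .doc format, headers, or footnotes, and paragraphs nested inside text boxes would be emitted twice. The class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml.Linq;

public static class DocxTextExtractor
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of the main document part, one line per w:p paragraph.
    public static string ExtractText(Stream docxStream)
    {
        using (var zip = new ZipArchive(docxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found; not a .docx package?");
            }

            XDocument document;
            using (Stream entryStream = entry.Open())
            {
                document = XDocument.Load(entryStream);
            }

            var text = new StringBuilder();
            foreach (XElement paragraph in document.Descendants(W + "p"))
            {
                // Visible text is stored in w:t elements inside the runs.
                foreach (XElement run in paragraph.Descendants(W + "t"))
                {
                    text.Append(run.Value);
                }
                text.Append('\n');
            }
            return text.ToString();
        }
    }
}
```

A call such as DocxTextExtractor.ExtractText(File.OpenRead(docPath)) then returns the paragraph text as one string. Files in the .doc format would still need Word interop, as in the answers above, or a separate library.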
