c# docx load xml from string xml - c#

I have created a program to read a file as array of bytes. The program is consuming word files by using docx library from Xceed. What I want to do is to recreate the parsed docx file from array of bytes.
To bytes:
var doc = Docx.Load("afile.docx");
...
return Encoding.Unicode.GetBytes(doc.Xml.Document.ToString());
Parse:
var doc = Docx.Create("anotherFile.docx");
var document = Encoding.Unicode.GetBytes({--returned bytes--}); <-- document is string with xml
How to save the document like the original?
I'm getting only blank file without any content.

using (var doc = DocX.Load("afile.docx"))
{
//here modify
doc.SaveAs("anotherFile.docx");
}

See this document BinaryWriter
bWriter.Writebytes(bytearray);

Related

Having problem converting big file of PDF to docx

I have almost 900mb of PDF file and I want to convert it to documents or .docx
I've use sautinsoft.pdfFocus
Using this code
string pdfFile = #"d:\Coffee Table Book NPPNP (1).pdf";
string wordFile = #"d:\sample.docx";
// Convert PDF file to DOCX file
SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
f.OpenPdf(pdfFile);
if (f.PageCount > 0)
{
// You may choose output format between Docx and Rtf.
f.WordOptions.Format = SautinSoft.PdfFocus.CWordOptions.eWordDocument.Docx;
int result = f.ToWord(wordFile);
MessageBox.Show(result.ToString());
// Show the resulting Word document.
if (result == 0)
{
System.Diagnostics.Process.Start(wordFile);
}
}
After running this code the application get laggy.
And how do I know if how many pages where converted?
String inputPath = #"d:\Coffee Table Book NPPNP (1).pdf";
String outputPath = #"d:\sample.docx";
PDFDocument doc = new PDFDocument(inputPath);
doc.ConvertToDocument(DocumentType.DOCX, outputPath);
Better to refer the above code to reduce complexity.

How to convert Multiple HTML Pages to Single Doc in c#

I am converting a single HTML page to Doc using spire doc. I need to convert multiple html pages from single folder to single Doc. How this can be done. Can anyone give some idea or any library available to achieve this?
Please find my code to convert single HTML to Doc.
Spire.Doc.Document document = new Spire.Doc.Document();
document.LoadFromFile(#"D:\DocFilesConvert\htmlfile.html", Spire.Doc.FileFormat.Html, XHTMLValidationType.None);
document.SaveToFile(#"D:\DocFilesConvert\docfiless.docx", Spire.Doc.FileFormat.Docx);
There seems no direct way to achieve this. One workaround I find is to convert each HTML document to a single Word file, and then merge these Word files in one file.
//get HTML file paths
string[] htmlfilePaths = new string[]{
#"F:\Documents\Html\1.html",
#"F:\Documents\Html\2.html",
#"F:\Documents\Html\3.html"
};
//create Document array
Document[] docs = new Document[htmlfilePaths.Length];
for (int i = 0; i < htmlfilePaths.Length; i++)
{
//load each HTML to a sperate Word file
docs[i] = new Document(htmlfilePaths[i], FileFormat.Html);
//combine these Word files in one file
if (i>=1)
{
foreach (Section sec in docs[i].Sections)
{
docs[0].Sections.Add(sec.Clone());
}
}
}
//save to a Word document
docs[0].SaveToFile("output.docx", FileFormat.Docx2013);

Value of a string for file's location is nil but a stored value says it isn't

I'm trying to convert secured PDFs to XPS and back to PDF using FreeSpire and then combine them using iTextSharp. Below is my code snippet for converting various files.
char[] delimiter = { '\\' };
string WorkDir = #"C:\Users\*******\Desktop\PDF\Test";
Directory.SetCurrentDirectory(WorkDir);
string[] SubWorkDir = Directory.GetDirectories(WorkDir);
//convert items to PDF
foreach (string subdir in SubWorkDir)
{
string[] samplelist = Directory.GetFiles(subdir);
for (int f = 0; f < samplelist.Length - 1; f++)
{
if (samplelist[f].EndsWith(".doc") || samplelist[f].EndsWith(".DOC"))
{
Spire.Pdf.PdfDocument doc = new Spire.Pdf.PdfDocument();
doc.LoadFromFile(sampleist[f], FileFormat.DOC);
doc.SaveToFile((Path.ChangeExtension(samplelist[f],".pdf")), FileFormat.PDF);
doc.Close();
}
. //other extension cases
.
.
else if (samplelist[f].EndsWith(".pdf") || sampleList[f].EndsWith(".PDF"))
{
PdfReader reader = new PdfReader(samplelist[f]);
bool PDFCheck = reader.IsOpenedWithFullPermissions;
reader.Close();
if (PDFCheck)
{
Console.WriteLine("{0}\\Full Permisions", Loan_list[f]);
reader.Close();
}
else
{
Console.WriteLine("{0}\\Secured", samplelist[f]);
Spire.Pdf.PdfDocument doc = new Spire.Pdf.PdfDocument();
string path = Loan_List[f];
doc.LoadFromFile(samplelist[f]);
doc.SaveToFile((Path.ChangeExtension(samplelist[f], ".xps")), FileFormat.XPS);
doc.Close();
Spire.Pdf.PdfDocument doc2 = new Spire.Pdf.PdfDocument();
doc2.LoadFromFile((Path.ChangeExtension(samplelist[f], ".xps")), FileFormat.XPS);
doc2.SaveToFile(samplelist[f], FileFormat.PDF);
doc2.Close();
}
The issue is I get a Value cannot be null error in doc.LoadFromFile(samplelist[f]);.I have the string path = sampleList[f]; to check if samplelist[f] was empty but it was not. I tried to replace the samplelist[f] parameter with the variable named path but it also does not go though. I tested the PDF conversion on a smaller scale it it worked (see below)
string PDFDoc = #"C:\Users\****\Desktop\Test\Test\Test.PDF";
string XPSDoc = #"C:\Users\****\Desktop\Test\Test\Test.xps";
//Convert PDF file to XPS file
PdfDocument doc = new PdfDocument();
doc.LoadFromFile(PDFDoc);
doc.SaveToFile(XPSDoc, FileFormat.XPS);
doc.Close();
//Convert XPS file to PDF
PdfDocument doc2 = new PdfDocument();
doc2.LoadFromFile(XPSDoc, FileFormat.XPS);
doc2.SaveToFile(PDFDoc, FileFormat.PDF);
doc2.Close();
I would like to understand why I am getting this error and how to fix it.
There would be 2 solutions for the problem you are facing.
Get the Document in the Document Object not in PDFDocument. And then probably try to SaveToFile Something like this
Document document = new Document();
//Load a Document in document Object
document.SaveToFile("Sample.pdf", FileFormat.PDF);
You can use Stream for the same something like this
PdfDocument doc = new PdfDocument();
//Load PDF file from stream.
FileStream from_stream = File.OpenRead(Loan_list[f]);
//Make sure the Loan_list[f] is the complete path of the file with extension.
doc.LoadFromStream(from_stream);
//Save the PDF document.
doc.SaveToFile(Loan_list[f] + ".pdf",FileFormat.PDF);
Second approach is the easy one, but I would recommend you to use the first one as for obvious reasons like document will give better convertability than stream. Since the document have section, paragraph, page setup, text, fonts everything which need to be required to do a better or exact formatting required.

C# avoid overwritten xml file

I created a C# Application to write data to xml file. It overrides all the new Data,
How to Avoid..?
Please help me.
This is my code
namespace BarcodeScaner
{
class WriteFile
{
DateTime dt=DateTime.Now;
public WriteFile()
{
}
public void createFile(string ptr,string value)
{
using (XmlWriter writer = XmlWriter.Create("Data.xml"))
{
writer.WriteStartDocument();
writer.WriteStartElement("Products");
writer.WriteStartElement("Details");
writer.WriteAttributeString("PTR", ptr);
writer.WriteAttributeString("Value", value);
writer.WriteAttributeString("DateTime", dt.ToString());
writer.WriteEndElement();
writer.WriteEndDocument();
}
}
}
}
You can use XmlDocument to add custom nodes to an existing Xml Document:
XmlDocument doc = new XmlDocument();
doc.Load("Data.xml");
XmlElement el = doc.CreateElement("child");
el.InnerText = "Example of data being appendeed";
doc.DocumentElement.AppendChild(el);
doc.Save("test.xml");
You need to read the existing file, add your data to it. And then write the combined results to the file.
Because of the format of XML files, you can't simply append new data to the file as you might with some file formats.
I'm assuming that what you're wanting to do is add new data to an existing XML file, correct?
You can't just append data to an XML file, and your code actually creates a new file each time it's run, overwriting the old one.
What you need to do is read the XML data into memory, add the new nodes, and write the whole file out again. If there is a lot of data then it may be more efficient to stream from one file to another and insert the new nodes as you go.

Server-side word automation

I am looking for alternatives to using openxml for a server-side word automation project. Does anyone know any other ways that have features to let me manipulate word bookmarks and tables?
I am currently doing a project of developing a word automation project for my company and I am using DocX Very simple and straight forward API to work with. The approach I am using is, whenever I need to work with XML directly, this API has a property named "xml" in the Paragraph class which gives you access to the underlying xml direclty so that I can work with it. The best part is its not breaking the xml and not corrupting the resulting document. Hope this helps!
Example code using DocX..
XNamespace ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
using(DocX doc = DocX.Load(#"c:\temp\yourdoc.docx"))
{
foreach( Paragraph para in doc.Paragraphs )
{
if(para.Xml.ToString().Contains("w:Bookmark"))
{
if(para.Xml.Element(ns + "BookmarkStart").Attribute("Name").Value == "yourbookmarkname")
{
// you got to your bookmark, if you want to change the text..then
para.Xml.Elements(ns + "t").FirstOrDefault().SetValue("Text to replace..");
}
}
}
}
Alternative API exclusively to work with bookmarks is .. http://simpleooxml.codeplex.com/
Example on how to delete text from bookmarkstart to bookmarkend using this API..
MemoryStream stream = DocumentReader.Copy(string.Format("{0}\\template.docx", TestContext.TestDeploymentDir));
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
MainDocumentPart mainPart = doc.MainDocumentPart;
DocumentWriter writer = new DocumentWriter(mainPart);
//Simply Clears all text between bookmarkstart and end
writer.PasteText("", "YourBookMarkName");
//Save to the memory stream, and then to a file
writer.Save();
DocumentWriter.StreamToFile(string.Format("{0}\\templatetest.docx", GetOutputFolder()), stream);
Loading the word document into different API's from memory stream.
//Loading a document file into memorystream using SimpleOOXML API
MemoryStream stream = DocumentReader.Copy(#"c\template.docx");
//Opening it from the memory stream as OpenXML document
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
//Opening it as DocX document for working with DocX Api
DocX document = DocX.Load(stream);

Categories

Resources