Creating an Open XML file in .NET - schema - c#

I'm trying to make a report generator inside a C# application for my boss. I came across this page, looked into RichTextBoxes, and think I can build on the idea to do what my boss is looking for: http://openxmldeveloper.org/articles/OpenXMLDocFromDotNet.aspx
The issue I'm running into is that their example code for the XML portion assumes you are creating the document for an Office 2007 beta. The schema listed there doesn't work for retail Office 2007. Can anyone show me where I can find out more about schemas in general, or explain what the code is doing here? Alternatively, if anyone has a different suggestion for creating a .docx file based on the contents of a rich text box, that would be greatly appreciated. I found other resources that offered advice similar to this: http://nishantrana.wordpress.com/2007/11/03/creating-word-document-using-c/
But I kept having issues getting the compiler to recognize what a WordApp was.
Here's the code from the first link with the schema issues.
private void GenerateDocument_Click(object sender, EventArgs e)
{
    string _nameSpaceURI = "http://schemas.microsoft.com/office/word/2005/10/wordml";
    string docFileName = GetSavePath();
    //-- Step 1 - Creating the document xml
    XmlDocument doc = new XmlDocument();
    XmlElement _wWordDoc = doc.CreateElement("w:wordDocument", _nameSpaceURI);
    doc.AppendChild(_wWordDoc);
    XmlElement _wbody = doc.CreateElement("w:body", _nameSpaceURI);
    _wWordDoc.AppendChild(_wbody);
    // Check if the string contains a line feed
    string[] _SplitStr = mleTextForDocument.Text.Split('\n');
    // if it contains line feed then each entry with a line feed goes to a new paragraph.
    for (int row = 0; row < _SplitStr.Length; row++)
    {
        XmlElement _wp1 = doc.CreateElement("w:p", _nameSpaceURI);
        _wbody.AppendChild(_wp1);
        XmlElement _wr1 = doc.CreateElement("w:r", _nameSpaceURI);
        _wp1.AppendChild(_wr1);
        XmlElement _wt11 = doc.CreateElement("w:t", _nameSpaceURI);
        _wr1.AppendChild(_wt11);
        XmlNode _wt1 = doc.CreateNode(XmlNodeType.Text, "w:t", _nameSpaceURI);
        _wt1.Value = _SplitStr[row];
        _wt11.AppendChild(_wt1);
    }
    //-- Step 2 - Creating the Package
    Package package = null;
    package = Package.Open(docFileName, FileMode.Create, FileAccess.ReadWrite);
    //-- Step 3 - Create the main document part (document.xml)
    Uri uri = new Uri("/word/document.xml", UriKind.Relative);
    PackagePart part = package.CreatePart(uri, "application/vnd.ms-word.main+xml");
    StreamWriter partWrt = new StreamWriter(part.GetStream(FileMode.Create, FileAccess.Write));
    doc.Save(partWrt);
    partWrt.Close();
    package.Flush();
    //-- Step 4 - Create the relationship file
    uri = new Uri("/word/document.xml", UriKind.Relative);
    PackageRelationship rel = package.CreateRelationship(uri, TargetMode.Internal, "http://schemas.microsoft.com/office/2006/relationships/officeDocument", "rId1");
    package.Flush();
    //-- Step 5- Close the document.
    package.Close();
}
I'm sorry for the lack of a clear question, but I really don't know what question to ask. I've never used schemas before, never used XML, and never had to add references to my projects before. Any advice or suggestions are appreciated.

The question is ambiguous, and apparently it comes from my bizarro evil twin (nwonknu) (elgoog). That's a joke, right?
Anyhow, I've said it before and I'll say it again: THE source of quality advice for XML/OpenXML is Eric White. He's a very active blogger with what looks like 4+ years of consistent postings (it's a pity when good sources just evaporate). Browse his blog for a few minutes and I'm sure your grasp of OpenXML and LINQ to XML will be a bit more solid.
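For anyone hitting the same schema mismatch: the namespace, content type, and relationship type in the question's sample are beta-era values. Below is a minimal sketch of the same approach using the identifiers from the final Open XML standard. The class name DocxSketch and the method Write are made up for illustration, and System.IO.Packaging is assumed to be available (a WindowsBase reference on .NET Framework, a separate package on newer runtimes). Treat it as a starting point rather than a complete document writer.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Xml;

// Hypothetical helper: writes plain text as a minimal .docx, one paragraph per line.
public static class DocxSketch
{
    // Identifiers from the final Open XML standard (not the Office 2007 beta values).
    const string WordNs = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    const string MainContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";
    const string OfficeDocumentRel = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    public static void Write(string path, string text)
    {
        // Build word/document.xml: w:document > w:body > w:p > w:r > w:t
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("w", "document", WordNs);
        doc.AppendChild(root);
        XmlElement body = doc.CreateElement("w", "body", WordNs);
        root.AppendChild(body);

        foreach (string line in text.Split('\n'))
        {
            XmlElement p = doc.CreateElement("w", "p", WordNs);
            XmlElement r = doc.CreateElement("w", "r", WordNs);
            XmlElement t = doc.CreateElement("w", "t", WordNs);
            t.InnerText = line.TrimEnd('\r');
            r.AppendChild(t);
            p.AppendChild(r);
            body.AppendChild(p);
        }

        // Store the XML as the main document part and point the package-level
        // relationship at it so Word knows where the document starts.
        using (Package package = Package.Open(path, FileMode.Create, FileAccess.ReadWrite))
        {
            Uri uri = new Uri("/word/document.xml", UriKind.Relative);
            PackagePart part = package.CreatePart(uri, MainContentType);
            using (Stream stream = part.GetStream(FileMode.Create, FileAccess.Write))
            {
                doc.Save(stream);
            }
            package.CreateRelationship(uri, TargetMode.Internal, OfficeDocumentRel, "rId1");
        }
    }
}
```

A call such as DocxSketch.Write(docFileName, mleTextForDocument.Text) would replace the body of the click handler. Note that the sketch does not add xml:space="preserve" to w:t, so leading or trailing spaces in a line may be dropped by Word.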

Related

Copy slide containing notes from one PowerPoint presentation to another with OpenXML SDK

I am trying to copy slides from one PowerPoint presentation to another. I have used the procedure outlined in the following article, and it generally works fine.
https://learn.microsoft.com/en-us/previous-versions/office/developer/office-2007/ee361883(v=office.12)?redirectedfrom=MSDN
However, when the slide to be copied contains notes, the resulting presentation after copying is corrupted. I've noticed that the code generates a new notesMaster which is not added to the notesMasterIdLst in presentation.xml, and have a suspicion this might be the issue. However, I cannot add the new notes master to the presentation, as a presentation can only have one notesMaster.
According to the Microsoft documentation, the Open XML SDK is defined this way:
"The Open XML SDK 2.5 simplifies the task of manipulating Open XML packages and the underlying Open XML schema elements within a package. The Open XML SDK 2.5 encapsulates many common tasks that developers perform on Open XML packages, so that you can perform complex operations with just a few lines of code."
It looks like your problem is not easy to solve with the Open XML SDK. If you use Aspose.Slides for .NET, you can copy a slide with its notes as shown below:
var sourceFileName = "example1.pptx";
var targetFileName = "example2.pptx";
var slideIndex = 0;
using (var sourcePresentation = new Presentation(sourceFileName))
using (var targetPresentation = new Presentation(targetFileName))
{
    var slide = sourcePresentation.Slides[slideIndex];
    targetPresentation.Slides.AddClone(slide);
    targetPresentation.Save(targetFileName, SaveFormat.Pptx);
}
You can also evaluate Aspose.Slides Cloud for manipulating presentations. This REST-based API allows you to make 150 free API calls per month for API learning and presentation processing. The following code example shows how to do the same using Aspose.Slides Cloud:
var slidesApi = new SlidesApi("my_client_id", "my_client_key");
var sourceFileName = "example1.pptx";
var targetFileName = "example2.pptx";
var slideIndex = 1;
using (var sourceStream = File.OpenRead(sourceFileName))
    slidesApi.UploadFile(sourceFileName, sourceStream);
using (var targetStream = File.OpenRead(targetFileName))
    slidesApi.UploadFile(targetFileName, targetStream);
slidesApi.CopySlide(targetFileName, slideIndex, null, sourceFileName);
using (var resultStream = slidesApi.DownloadFile(targetFileName))
using (var fileStream = File.OpenWrite(targetFileName))
    resultStream.CopyTo(fileStream);
I work as a Support Developer at Aspose.
I believe I managed to solve this issue by doing the following steps:
Make the source presentations editable when opening them:
using (PresentationDocument mySourceDeck =
    PresentationDocument.Open(
        presentationFolder + sourcePresentation, true))
{
    PresentationPart sourcePresPart =
        mySourceDeck.PresentationPart;
Copy the notes slide CommonSlideData from the slide, then delete the notes slide part from the slide:
sp = (SlidePart)sourcePresPart.GetPartById(slideId.RelationshipId);
CommonSlideData notesSlideData = null;
if (sp.NotesSlidePart != null)
{
    notesSlideData = (CommonSlideData)sp.NotesSlidePart.NotesSlide.CommonSlideData.CloneNode(true);
    sp.DeletePart(sp.NotesSlidePart);
}
Re-add any existing notes slide data by adding a new NotesSlidePart to the copied slide (now added to the target presentation and called destSp), adding relationship parts, and creating a new NotesSlide object initialised with the copied notes slide data.
if (notesSlideData != null)
{
    NotesSlidePart notesSlidePart1 = destSp.AddNewPart<NotesSlidePart>();
    notesSlidePart1.AddPart(destSp);
    notesSlidePart1.AddPart(destPresPart.NotesMasterPart);
    NotesSlide notesSlide = new NotesSlide(notesSlideData);
    notesSlidePart1.NotesSlide = notesSlide;
}
Warning: The notes slides will be deleted from the source presentation files, so you might want to make a copy of them first, or add them back to the presentation after it has been copied/merged.
This seems to retain at least some existing formatting of the notes slides, such as bold text. However, I have not yet tested this on a lot of different presentations so I suppose there could be some issues if the notes slides are based on very different notes slide masters, but I'm not sure.
Related to this, I ran into a similar issue after getting the notes slides to work, which seemed to be because of any custom xml parts that existed on the presentation to be copied. These presentations worked after adding some code to add relationships to the copied presentation's CustomXmlPart to the target presentation:
foreach (var customXmlPart in destSp.GetPartsOfType<CustomXmlPart>())
{
    destPresPart.AddPart(customXmlPart);
}

scraping data from website with a C# console application

I'm trying to learn Spanish and am making some flash cards (for my personal use) to help me learn the verbs.
Here is an example, page example. Near the top of the page you will see the past participle (bloqueado) and the gerund (bloqueando). These are the two values I wish to obtain in my code and use for my flash cards.
If this is possible I will use a C# console application. I am aware that scraping data from a website is not ideal, but this is a one-off.
Any guidance on how to start something like this, and on pitfalls to avoid, would be very helpful!
I know this isn't an exact answer, but here is the process I would suggest.
Use wget (https://www.gnu.org/software/wget/) to mirror the website to a folder. Wget is a web spider and will follow the links on the site until it has downloaded everything. You'll have to run it with a few different parameters until you find the settings you want.
Use C# to run through each file in the folder and extract the words from <section class="verb-mood-section"> in each file. Whether you output them to the console or store them in a database or flat file is up to you.
Should be that easy, in theory.
Use SgmlReader. SgmlReader is a versatile and robust component that will stream HTML to an XmlReader:
XmlDocument FromHtml(TextReader reader) {
    // setup SgmlReader
    Sgml.SgmlReader sgmlReader = new Sgml.SgmlReader();
    sgmlReader.DocType = "HTML";
    sgmlReader.WhitespaceHandling = WhitespaceHandling.All;
    sgmlReader.CaseFolding = Sgml.CaseFolding.ToLower;
    sgmlReader.InputStream = reader;
    // create document
    XmlDocument doc = new XmlDocument();
    doc.PreserveWhitespace = true;
    doc.XmlResolver = null;
    doc.Load(sgmlReader);
    return doc;
}
You can see that you need to create a TextReader first. In practice this would be a StreamReader, as TextReader is an abstract class.
Then you create the XmlDocument over that. Once you've got the page into the XmlDocument you can use the various methods it supports to isolate and extract the nodes you need. I'll leave you to explore that aspect of it.
You might try using the XDocument class instead, as it's a lot easier to handle than XmlDocument, especially if you're a newbie. It also supports LINQ.
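To give an idea of what the XDocument route might look like, here is a small sketch that pulls the text out of every section element carrying a given class attribute. The class name VerbSectionExtractor and the method ExtractSectionText are made up for illustration, the "verb-mood-section" class is taken from the other answer rather than checked against the live page, and the match is an exact comparison on the whole class attribute, which is a simplification.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: collects the text of <section class="..."> elements.
public static class VerbSectionExtractor
{
    public static List<string> ExtractSectionText(XDocument page, string cssClass)
    {
        return page.Descendants()
            // Compare on LocalName so the match works with or without an XHTML namespace.
            .Where(e => e.Name.LocalName == "section"
                     && (string)e.Attribute("class") == cssClass)
            // Value concatenates all text nodes below the element.
            .Select(e => e.Value.Trim())
            .ToList();
    }
}
```

For a quick check, XDocument.Parse("<html><body><section class=\"verb-mood-section\">bloqueado</section></body></html>") passed to ExtractSectionText with "verb-mood-section" gives a list containing "bloqueado". Since SgmlReader derives from XmlReader, XDocument.Load(sgmlReader) should work in place of the XmlDocument in the snippet above.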

XSL + XML -> PDF for C#

I know several people have asked questions like this, but no answer helped to solve my problem.
I have XSL and XML and want to generate a PDF with a processor like Apache FOP.
I am not able to use any Java programs like that. I can only use C# libraries or executables.
I tried to use nFop:
Version 1.x uses Java.io, and
Version 2.0 doesn't have the ability to set XsltSettings.
My current software uses XSL + XML -> HTML (using the standard System.Xml.Xsl in C#) and wkhtmltopdf to generate a PDF from the created HTML.
But tables get split when they are too long for the page, and on the next page there are no column headers (this is very important for my problem).
I think there is no free FO processor for pure C#.
Have a look at FoNET.
public static bool XMLToPDF(string pXmlFile, string pXslFile, string pFoFile, string pPdfFile)
{
    string lBaseDir = System.IO.Path.GetDirectoryName(pXslFile);
    XslCompiledTransform lXslt = new XslCompiledTransform();
    lXslt.Load(pXslFile);
    lXslt.Transform(pXmlFile, pFoFile);
    FileStream lFileInputStreamFo = new FileStream(pFoFile, FileMode.Open);
    FileStream lFileOutputStreamPDF = new FileStream(pPdfFile, FileMode.Create);
    FonetDriver lDriver = FonetDriver.Make();
    lDriver.BaseDirectory = new DirectoryInfo(lBaseDir);
    lDriver.CloseOnExit = true;
    lDriver.Render(lFileInputStreamFo, lFileOutputStreamPDF);
    lFileInputStreamFo.Close();
    lFileOutputStreamPDF.Close();
    return System.IO.File.Exists(pPdfFile);
}

How do I copy content from one Word document to another with images and links?

I'm having a problem copying content from one Word document to another.
The document where the content should end up has a header.
So far I have managed to copy the content to the second document without affecting the header.
However, I can't figure out how to bind the relationships for links and images.
This is my code so far:
public static void AddContentToTemplateCopy(
    string sourceDocumentPath, string endDocumentPath)
{
    using (WordprocessingDocument sourceDoc =
        WordprocessingDocument.Open(sourceDocumentPath, false))
    using (WordprocessingDocument endDoc =
        WordprocessingDocument.Open(endDocumentPath, true))
    {
        var sourceMainPart = sourceDoc.MainDocumentPart;
        var sourceBody = sourceMainPart.Document.Body;
        var endSection = endDoc.MainDocumentPart.Document.Body.Elements<SectionProperties>();
        var endDocMainPart = endDoc.MainDocumentPart;
        var sourceBodyClone = sourceBody.CloneNode(true);
        sourceBodyClone.ReplaceChild(endSection.FirstOrDefault().CloneNode(true), sourceBodyClone.Elements<SectionProperties>().FirstOrDefault());
        endDocMainPart.Document.ReplaceChild(sourceBodyClone, endDocMainPart.Document.Body);
        foreach (HyperlinkRelationship link in sourceMainPart.HyperlinkRelationships)
        {
            endDocMainPart.AddHyperlinkRelationship(link.Uri, link.IsExternal, link.Id);
        }
    }
}
I get the following error: 'rId6' ID conflicts with the ID of an existing relationship for the specified source.
And if I have an image in the content, it can't be displayed.
If I zip the document and look at the files in the package I can find the Image but for the same reason as the links the Relation
So my question is: how do I bind the links and images to their "_rels" references, or how do I copy them so that it works?
This is a Relationship entry from when I added the link by hand.
<Relationship Target="media/image1.jpg" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Id="rId11"/>
A picture to show that the link text is copied but have no formatting and that the image can't be displayed.
Thanks to the answer by JasonPlutext, I managed to use OpenXML PowerTools (version 2.2). Keep in mind that the project targets .NET 3.5 when you import it, so you might need to change that. (From what I've noticed, it supports Open XML SDK 2.5 as well.)
It makes it very simple to create new documents and take parts from old documents.
The code here covers my case, where I want the formatting and content from one document and the header from a template document. The order matters.
Hopefully this will save time for others with the same problem.
public static void AddContentToTemplateCopy(string templateDocumentPath,
    string contentDocumentPath,
    List<Source> sources,
    string outName)
{
    sources = new List<Source>()
    {
        new Source(new WmlDocument(contentDocumentPath), false),
        new Source(new WmlDocument(templateDocumentPath), true),
    };
    DocumentBuilder.BuildDocument(sources, outName);
}
You might find it easier to try Eric White's document builder.
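If pulling in PowerTools is not an option, the 'rId6' conflict from the question can in principle be avoided by hand: let the target part assign fresh relationship IDs and then rewrite the IDs inside the copied body. The sketch below assumes the Open XML SDK 2.5 API, uses a made-up class name (RelationshipRemapper), and only covers w:hyperlink elements and DrawingML pictures (a:blip). Other relationship-bearing content, such as VML images or embedded objects, is not handled, so DocumentBuilder remains the more complete route.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using A = DocumentFormat.OpenXml.Drawing;

// Hypothetical helper: re-creates relationships in the target part and
// rewrites the r:id references inside an already cloned body.
public static class RelationshipRemapper
{
    public static void Remap(MainDocumentPart source, MainDocumentPart target, OpenXmlElement copiedBody)
    {
        // Phase 1: add every hyperlink relationship without an explicit ID so the
        // SDK picks one that is free in the target, and remember old -> new.
        var linkIds = new Dictionary<string, string>();
        foreach (HyperlinkRelationship link in source.HyperlinkRelationships)
        {
            HyperlinkRelationship added = target.AddHyperlinkRelationship(link.Uri, link.IsExternal);
            linkIds[link.Id] = added.Id;
        }

        // Phase 2: a single pass over the copied hyperlinks, so an ID that was
        // already remapped is never remapped a second time.
        foreach (Hyperlink hyperlink in copiedBody.Descendants<Hyperlink>())
        {
            string newId;
            if (hyperlink.Id != null && linkIds.TryGetValue(hyperlink.Id.Value, out newId))
            {
                hyperlink.Id = newId;
            }
        }

        // Images: copy the binary part into the target and point the blip at it.
        foreach (A.Blip blip in copiedBody.Descendants<A.Blip>())
        {
            if (blip.Embed == null) continue;
            ImagePart oldPart = source.GetPartById(blip.Embed.Value) as ImagePart;
            if (oldPart == null) continue;
            ImagePart newPart = target.AddImagePart(oldPart.ContentType);
            using (Stream data = oldPart.GetStream())
            {
                newPart.FeedData(data);
            }
            blip.Embed = target.GetIdOfPart(newPart);
        }
    }
}
```

In the question's method this would be called as RelationshipRemapper.Remap(sourceMainPart, endDocMainPart, sourceBodyClone) in place of the foreach loop that passes link.Id. An image referenced from several places is copied once per reference here, which is wasteful but keeps the sketch short.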

Reading an xml file 50 lines at a time

I am currently trying to write a method that reads in XML files, at the moment 50 lines at a time. This will be increased later to allow larger files to be used in the program.
At the moment I am trying to accomplish this with the following code.
List<dataclass.DataRecord> list = new List<dataclass.DataRecord>();
string filename = "FileLocation";
XmlDocument testing = new XmlDocument();
//using (StreamReader streamreader = new StreamReader(filename))
using (XmlTextReader reader = new XmlTextReader(new StringReader(filename)))
{
    while (reader.Read() != null)
    {
        for (int i = 0; i < 50; i++)
        {
            testing.Load(reader);
            //list.add(line);
            Console.WriteLine(testing);
            //testing.Load(reader);
        }
    }
}
The commented lines are just from previous ideas I tried, and the filename has been taken out as I prefer not to place that online.
Basically, at the moment I keep getting the following error:
Data at the root level is invalid. Line 1, position 1.
So I don't know:
A. Whether I am going about this the right way.
B. Whether the only way to fix this error is to surround the "testing.Load" with "root + /root" tags.
I hope someone can help. Thanks.
As I explained in my comment, XML consists of nodes, whereas you are treating it as though it were a flat file with lines.
There are a couple of Stack Overflow questions with answers that match what you are trying to do. The real question is "How can you load a large XML file?" The answer is to use a stream rather than loading the file in one big chunk. Following on from there, you can find lots of resources about using XmlReader.
A couple of pointers to other SO articles:
C# and Reading Large XML Files
Reading large XML documents in .net
Hope that helps!
If you are only trying to load xml into XmlDocument - why not just
XmlDocument testing = new XmlDocument();
testing.Load(filename);
If your XML file is really big, you're better off using some sort of pull parser (parses tag-by-tag, attribute-by-attribute, etc) rather than DOM parser (loads whole document during parsing, keeps it in memory).
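As a rough illustration of the pull-parser idea, the sketch below walks the file with XmlReader and hands back record elements in batches (for example 50 at a time) instead of lines. The class name RecordBatchReader and the element name "record" are assumptions for the example, since the question does not show the structure of the XML.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: streams elements with a given name in fixed-size batches.
public static class RecordBatchReader
{
    public static IEnumerable<List<XElement>> ReadBatches(XmlReader reader, string elementName, int batchSize)
    {
        List<XElement> batch = new List<XElement>(batchSize);
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
            {
                // XNode.ReadFrom consumes the whole element and leaves the reader
                // on the node that follows it, so no extra Read() is needed here.
                batch.Add((XElement)XNode.ReadFrom(reader));
                if (batch.Count == batchSize)
                {
                    yield return batch;
                    batch = new List<XElement>(batchSize);
                }
            }
            else
            {
                reader.Read();
            }
        }
        // Hand back whatever is left over at the end of the file.
        if (batch.Count > 0)
        {
            yield return batch;
        }
    }
}
```

Usage would be along the lines of opening the file with XmlReader.Create(filename) inside a using block and looping over RecordBatchReader.ReadBatches(reader, "record", 50). Only one batch is held in memory at a time. Note that the reader is created from the file path; wrapping the path in a StringReader, as in the question, makes the parser treat the path text itself as XML, which would explain the "Data at the root level is invalid" error.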
