I'm using NetOffice to create Word documents.
There is little documentation and I'm struggling to add a header. Can anybody help?
You have to use the Word.Section.Headers property. In the example below I've put a right-aligned image in the page header:
foreach (Word.Section section in newDocument.Sections)
{
string picturePath = @"D:\Desktop\test.png";
section.Headers[WdHeaderFooterIndex.wdHeaderFooterPrimary].Range.InlineShapes.AddPicture(picturePath);
section.Headers[WdHeaderFooterIndex.wdHeaderFooterPrimary].Range.ParagraphFormat.Alignment = WdParagraphAlignment.wdAlignParagraphRight;
}
To add some text use:
foreach (Word.Section section in newDocument.Sections)
section.Headers[WdHeaderFooterIndex.wdHeaderFooterPrimary].Range.Text = "TEST";
Hope this helps to investigate further.
I need to check all tags on all shapes on all slides. I can select each shape, however I can't see how to get the shape's tags.
For the given DocumentFormat.OpenXml.Presentation.Shape, how can I get the "val" of the tag with name="MOUNTAIN"
In my shape, the tags rId sits in this structure: p:sp > p:nvSpPr > p:nvPr > p:custDataLst > p:tags
I'm guessing my code needs to do these steps:
• Get the rId of the p:custDataLst p:tags
• Look up the "Target" file name in the slideX.xml.rels file, based on the rId
• Look in the root/tags folder for the "Target" file
• Get the p:tagLst p:tags and look for the p:tag with name="MOUNTAIN"
<p:tagLst>
<p:tag name="MOUNTAIN" val="Denali"/>
</p:tagLst>
Here is how my code iterates through shapes on each slide:
for (int x = 0; x < doc.PresentationPart.SlideParts.Count(); x++)
{
SlidePart slide = doc.PresentationPart.SlideParts.ElementAt(x);
ShapeTree tree = slide.Slide.CommonSlideData.ShapeTree;
IEnumerable<DocumentFormat.OpenXml.Presentation.Shape> slShapes = slide.Slide.Descendants<DocumentFormat.OpenXml.Presentation.Shape>();
foreach (DocumentFormat.OpenXml.Presentation.Shape shape in slShapes)
{
//get the specified tag, if it exists
}
}
I see an example of how to add tags: How to add custom tags to powerpoint slides using OpenXml in c#
But I can't figure out how to read the existing tags.
So, how do I get the shape's tags with C#?
I was hoping to do something like this:
IEnumerable<UserDefinedTagsPart> userDefinedTagsParts = shape.NonVisualShapeProperties.ApplicationNonVisualDrawingProperties.CustomerDataList.CustomerDataTags<UserDefinedTagsPart>();
foreach (UserDefinedTagsPart userDefinedTagsPart in userDefinedTagsParts)
{}
but Visual Studio says "ApplicationNonVisualDrawingProperties does not contain a definition for CustomerDataList".
From the OpenXML Productivity Tool, here is the element tree:
You and I seem to be working on similar problems. I'm still learning the file format myself. The following code works for me, though I'm sure it can be optimized.
public void ReadTags(Shape shape, SlidePart slidePart)
{
NonVisualShapeProperties nvsp = shape.NonVisualShapeProperties;
ApplicationNonVisualDrawingProperties nvdp = nvsp.ApplicationNonVisualDrawingProperties;
IEnumerable<CustomerDataTags> data_tags = nvdp.Descendants<CustomerDataTags>();
foreach (var data_tag in data_tags)
{
UserDefinedTagsPart shape_tags = slidePart.GetPartById(data_tag.Id) as UserDefinedTagsPart;
if (shape_tags != null)
{
foreach (Tag tag in shape_tags.TagList)
{
Debug.Print($"\t{nvsp.NonVisualDrawingProperties.Name} tag {tag.Name} = '{tag.Val}'");
}
}
}
}
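If it helps to see the package-level steps from the question spelled out (rId, then the slide's .rels file, then the tags part), here is a rough Python sketch that uses only the standard library. It assumes the usual PresentationML layout inside the .pptx zip; read_shape_tags and its parameters are names made up for illustration and are not part of the Open XML SDK.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by PresentationML packages.
P = "http://schemas.openxmlformats.org/presentationml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
REL = "http://schemas.openxmlformats.org/package/2006/relationships"


def q(tag):
    """Qualify a PresentationML tag name with its namespace."""
    return f"{{{P}}}{tag}"


def read_shape_tags(pptx, slide_path):
    """Return {shape name: {tag name: tag val}} for one slide part.

    pptx may be a file name or a file-like object; slide_path is the
    part name inside the zip, e.g. "ppt/slides/slide1.xml".
    """
    result = {}
    with zipfile.ZipFile(pptx) as z:
        slide_dir, slide_file = posixpath.split(slide_path)

        # Step 2 of the question: map each rId to its Target via the .rels file.
        rels_path = posixpath.join(slide_dir, "_rels", slide_file + ".rels")
        rels = ET.fromstring(z.read(rels_path))
        targets = {
            rel.get("Id"): rel.get("Target")
            for rel in rels.iter(f"{{{REL}}}Relationship")
        }

        slide = ET.fromstring(z.read(slide_path))
        for sp in slide.iter(q("sp")):
            cnvpr = sp.find(f"{q('nvSpPr')}/{q('cNvPr')}")
            name = cnvpr.get("name", "") if cnvpr is not None else ""

            # Step 1: the rId lives on p:nvSpPr/p:nvPr/p:custDataLst/p:tags.
            refs = sp.iterfind(
                f"{q('nvSpPr')}/{q('nvPr')}/{q('custDataLst')}/{q('tags')}"
            )
            for ref in refs:
                target = targets.get(ref.get(f"{{{R}}}id"))
                if target is None:
                    continue
                # Step 3: targets are relative to the slide's folder.
                part = posixpath.normpath(posixpath.join(slide_dir, target))
                # Step 4: read every p:tag in the tags part.
                for tag in ET.fromstring(z.read(part)).iter(q("tag")):
                    result.setdefault(name, {})[tag.get("name")] = tag.get("val")
    return result
```

Shapes without a custDataLst entry simply don't appear in the returned dict, so looking up "MOUNTAIN" becomes an ordinary dictionary access on the shape's entry.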
I've spent a lot of time with OpenXML .docx and .xlsx files ... but not so much with .pptx.
Nevertheless, here are a couple of suggestions that might help:
If you haven't already done so, please download the OpenXML SDK Productivity Tool to analyze your file's contents. It's currently available on GitHub:
https://github.com/dotnet/Open-XML-SDK/releases/tag/v2.5
You might simply be able to "grep" for items you're looking for.
EXAMPLE (Word, not PowerPoint... but the same principle should apply):
using (doc = WordprocessingDocument.Open(stream, true))
{
// Init OpenXML members
mainPart = doc.MainDocumentPart;
body = mainPart.Document.Body;
...
foreach (var text in body.Descendants<Text>())
{
if (text.Text.Contains(target))
...
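The same "grep" idea can also be tried without the SDK. Below is a rough Python sketch, standard library only, that assumes the main document part sits at the conventional word/document.xml location; find_text_runs is a hypothetical helper name, not an SDK call.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace; w:t elements hold the visible text of each run.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_text_runs(docx, target):
    """Return the text of every w:t element that contains target.

    docx may be a file name or a file-like object. Assumes the main
    document part is stored at word/document.xml.
    """
    with zipfile.ZipFile(docx) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    return [
        t.text
        for t in root.iter(f"{{{W}}}t")
        if t.text and target in t.text
    ]
```

One caveat with this kind of search: Word often splits a visible phrase across several runs, so a target that spans run boundaries will not be found by checking each text element on its own.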
I have a Publisher document that I use as a template.
How do I insert an image from a file in place of the "?" placeholder image?
It's probably done the same way as in MS Word, but I can't figure it out.
I access the template this way:
using Publisher = Microsoft.Office.Interop.Publisher;
Publisher._Application pubApp = new Publisher.Application();
Publisher.Document doc = pubApp.Open(docPath);
Publisher.Page templateCard = doc.Pages[1];
Basically:
shape.PictureFormat.Replace(filePath);
The best way I've found to locate my template image is to set alternative text on the image and check for it:
foreach (Publisher.Shape shape in currentPage.Shapes) {
    if (shape.AlternativeText == "DICKBUTT") {
        // this is the placeholder image, so swap in the picture from the file
        shape.PictureFormat.Replace(filePath);
    }
}
I have created an EditTextField in a PDF using the iTextSharp library. I am able to set the FieldName on the EditTextField, and I also want to set a DataFormat on it. I will receive an XML file containing only the FieldName and Value, so while merging the Value into the PDF template I want to check the DataFormat and convert the value accordingly before setting it.
I have added an EditTextField and can give the text field a name. I want to attach a format (such as a DateTime format) to the text field, so that later I can fetch all the text fields from the PDF, check each one's format, and set the data programmatically according to that format.
TextField _DOBtext = new TextField(pdfStamper.Writer,
new Rectangle(40, 670, 110, 650), "patient-dob");
_DOBtext.SetFormat("DateFormat", "mm/dd/yyyy"); // we want to set format like this.
pdfStamper.AddAnnotation(_DOBtext.GetTextField(), 1);
pdfStamper.Close();
When processing the same PDF to fill in data, we first check the AcroField name. The intended logic is written below (this is not working code):
var GetField = pdfStamper.Acrofields.Field.where(u=>u.Key == "patient-dob").FirstOrDefault();
var Format = GetField.GetFormat(); // we want like this feature
if (Format != null) {
if (Format.Type == "DateTime") {
value = string.Format(data, Format.FormatString);
stamper.AcroFields.SetField(fieldId, value); //fieldId = "patient-dob"
}
}
Please help me set and get the DataFormat.
iText doesn't support anything like SetFormat() and GetFormat() because the PDF format itself doesn't support it.
You might notice, however, that Adobe Acrobat allows you to specify a field format. They do this via a JavaScript file that ships with all of their products. I don't know whether other PDF renderers (such as Chrome, Firefox or IE) support this. The JS file has a bunch of built-in functions that can be used for client-side validation, and one of those functions is AFDate_FormatEx(). You can see how that lays out in a PDF in this screenshot:
You might have noticed my emphasis above on client-side validation. That's because this is JS which is optional in the spec and it isn't required when programmatically interacting with forms. However, if you are trying to mimic Adobe's product you might want to go down this path. You can programmatically retrieve these settings via something like:
var additionalActions = reader.AcroFields.Fields["patient-dob"].GetWidget(0).GetAsDict(PdfName.AA);
And you can set it via something like SetAdditionalActions() on your PdfFormField.
This is probably a really fragile path but it might work for you.
However, for me I'd try something different and just hijack another field that I know that I'm not going to use. Looking through the spec I think I'd just pick something like /TM which is just a string used for the field's name when exporting data (I think via FDF or HTTP POST).
var _DOBtext = new TextField(writer, new iTextSharp.text.Rectangle(0, 0, 110, 650), "patient-dob");
var tf = _DOBtext.GetTextField();
tf.MappingName = "patient-dob:date:mm/dd/yyyy";
And to retrieve it:
var mappingName = reader.AcroFields.Fields["patient-dob"].GetWidget(0).GetAsString(PdfName.TM);
You can come up with whatever format you want for this entry, as long as it makes sense to you. I'd recommend keeping the field's original name as part of it and then picking whatever delimiters work for you. The value should travel safely with the PDF regardless of the rendering application.
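To make the delimiter idea concrete, here is a small Python sketch of how the consuming side might decode an entry such as "patient-dob:date:mm/dd/yyyy" once it has been read from /TM. Everything in it is an assumption for illustration: the three-part name:kind:pattern layout, the helper names, and the handful of Acrobat-style tokens it understands (yyyy, yy, mm, dd).

```python
from datetime import date

# Acrobat-style date tokens mapped to strftime directives.
# Longer tokens come first so "yyyy" is not consumed as two "yy".
_TOKENS = (("yyyy", "%Y"), ("yy", "%y"), ("mm", "%m"), ("dd", "%d"))


def parse_mapping_name(mapping_name):
    """Split 'name:kind:pattern' into its three parts.

    Only the first two colons are treated as delimiters, so the pattern
    itself may contain colons. The field name may not.
    """
    name, kind, pattern = mapping_name.split(":", 2)
    return name, kind, pattern


def to_strftime(pattern):
    """Translate a pattern like 'mm/dd/yyyy' into '%m/%d/%Y'."""
    for token, directive in _TOKENS:
        pattern = pattern.replace(token, directive)
    return pattern


def format_for_field(mapping_name, value):
    """Return (field name, formatted string) for a value bound to a field."""
    name, kind, pattern = parse_mapping_name(mapping_name)
    if kind == "date":
        return name, value.strftime(to_strftime(pattern))
    # Unknown kinds fall back to the plain string form.
    return name, str(value)
```

The returned pair can then be handed to whatever call sets the field value. An entry with fewer than three parts raises ValueError in this sketch, which is one reason to keep the field's original name in the entry and validate it when reading.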
Some time ago I wrote a library to manage filling in PDF AcroForms. This is sample code from my fill function:
private string fill_form(string output_file)
{
using (PdfReader _pdfReader = new PdfReader(FormPath))
{
using (PdfStamper _pdfStamper = new PdfStamper(_pdfReader, new FileStream(output_file, FileMode.Create)))
{
_pdfStamper.AcroFields.GenerateAppearances = true;
foreach (var _field in _pdfStamper.AcroFields.Fields)
foreach (TemplateField _spField in _lstFields)
{
if (_field.Key.Equals(_spField.Name))
{
switch (_spField.Type )
{
case TemplateFieldType.Text:
_pdfStamper.AcroFields.SetField(_field.Key, _spField.Value);
break;
case TemplateFieldType.Checkbox:
//TODO: check Value field cannot be set != OnValue, Offvalue
if (_spField.Value == _spField.OnValue)
_pdfStamper.AcroFields.SetField(_field.Key, _spField.OnValue);
else
_pdfStamper.AcroFields.SetField(_field.Key, _spField.OffValue);
break;
}
}
}
_pdfStamper.FormFlattening = true;
}
}
return output_file;
}
As you can see in the inner loop I do something similar to your needs.
Hope this helps.
This worked for me (using iText7)
textBox.SetAdditionalAction(PdfName.F, iText.Kernel.Pdf.Action.PdfAction.CreateJavaScript("AFDate_FormatEx(\"yyyy-mm-dd\");"));
textBox.SetAdditionalAction(PdfName.K, iText.Kernel.Pdf.Action.PdfAction.CreateJavaScript("AFDate_KeystrokeEx(\"yyyy-mm-dd\");"));
thanks to the iText RUPS tool.
I use MigraDoc to create PDF documents in my project.
The code below shows how I work with the library:
var document = new Document { Info = { Author = "title" } };
Section section = document.AddSection();
Paragraph paragraph = section.AddParagraph("Title");
var renderer = new PdfDocumentRenderer(true, PdfSharp.Pdf.PdfFontEmbedding.Always) { Document = document };
renderer.RenderDocument();
So, I'm looking for a way to add a link to a web resource inside the PDF.
Does anyone know how?
-------------Solution-------------------
I found a solution!
I tried using AddHyperlink() to add the link, and that was the first step. The code below shows the correct usage:
var h = paragraph.AddHyperlink("http://stackoverflow.com/",HyperlinkType.Web);
h.AddFormattedText("http://www.stackoverflow.com/");
So the idea is that you need to add some text to the hyperlink to make the link visible.
Use paragraph.AddHyperlink() for that purpose. You will need HyperlinkType.Web.
I have a page that contains some links to .mp3/.wav files in this format:
File Name
I need to write a script that will download all of these files so that I don't have to download them myself.
I know I can use a regular expression for something like this, but I don't know how. What is the best choice of language for this (Java, C#, JavaScript)?
Any help will be appreciated.
Thanks in advance.
You could use SgmlReader to parse the DOM and extract all the anchor links and then download the corresponding resources:
class Program
{
static void Main()
{
using (var reader = new SgmlReader())
{
reader.DocType = "HTML";
reader.Href = "http://www.example.com";
var doc = new XmlDocument();
doc.Load(reader);
var anchors = doc.SelectNodes("//a/@href[contains(., 'mp3') or contains(., 'wav')]");
foreach (XmlAttribute href in anchors)
{
using (var client = new WebClient())
{
var data = client.DownloadData(href.Value);
// TODO: do something with the downloaded data
}
}
}
}
}
Well, if you want to go hard-core, I think parsing the page with DOMDocument ( http://php.net/manual/en/class.domdocument.php ) and retrieving the files with cURL would do it if you're ok with PHP.
How many files are we talking about here?
Python's Beautiful Soup library is well-suited to this task:
http://www.crummy.com/software/BeautifulSoup/
Could be used in this way:
import urllib2, re
from BeautifulSoup import BeautifulSoup
#open the URL
page = urllib2.urlopen("http://www.foo.com")
#parse the page
soup = BeautifulSoup(page)
#get all anchor elements that actually have an href attribute
anchors = soup.findAll("a", href=True)
#keep only the anchors whose href mentions .wav or .mp3
filteredAnchors = [a for a in anchors if re.search(r"\.(wav|mp3)", a["href"])]
urlsToDownload = [a["href"] for a in filteredAnchors]
#download each anchor url...
See here for instructions on downloading the mp3's from their URLs: How do I download a file over HTTP using Python?
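For anyone on Python 3 without Beautiful Soup installed, roughly the same approach can be sketched with only the standard library. This is a minimal sketch rather than a drop-in replacement for the snippet above: AudioLinkParser, find_audio_links and download_all are hypothetical names, and the extension check looks only at the URL path.

```python
import posixpath
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

AUDIO_EXTENSIONS = (".mp3", ".wav")


class AudioLinkParser(HTMLParser):
    """Collect absolute URLs of <a href> links that point at audio files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        url = urljoin(self.base_url, href)
        # Compare the path only, so query strings do not hide the extension.
        if urlparse(url).path.lower().endswith(AUDIO_EXTENSIONS):
            self.links.append(url)


def find_audio_links(html, base_url):
    """Return the audio links found in an HTML string, in document order."""
    parser = AudioLinkParser(base_url)
    parser.feed(html)
    return parser.links


def download_all(page_url):
    """Fetch page_url and save each linked audio file in the current folder."""
    with urllib.request.urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    for url in find_audio_links(html, page_url):
        filename = posixpath.basename(urlparse(url).path)
        urllib.request.urlretrieve(url, filename)
```

Splitting link extraction from downloading makes it possible to check the list of URLs against a saved copy of the page before fetching anything.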