I am using the code below to highlight text in a PowerPoint presentation (.pptx) with Open XML. It highlights the word I want, but there are two problems:
1. The file gets corrupted: PowerPoint asks to repair it when opening the .pptx.
2. The formatting of the highlighted text is not preserved.
I have debugged my code line by line and the problem is at the line below, but I am not able to figure out what is wrong with it.
highlightRun.InsertAt(runPro, 0);
I used the Open XML SDK Productivity Tool to compare two .pptx files, one with highlighting and one without. The difference is shown below. I am not adding RunProperties twice, but it appears twice in the corrupted file:
Corrupted file :
public Run GenerateRun()
{
Run run1 = new Run();
RunProperties runProperties1 = new RunProperties(){ Language = "en-US", Dirty = false };
runProperties1.SetAttribute(new OpenXmlAttribute("", "smtClean", "", "0"));
SolidFill solidFill1 = new SolidFill();
RgbColorModelHex rgbColorModelHex1 = new RgbColorModelHex(){ Val = "FFF000" };
solidFill1.Append(rgbColorModelHex1);
runProperties1.Append(solidFill1);
RunProperties runProperties2 = new RunProperties(){ Language = "en-US", Dirty = false };
runProperties2.SetAttribute(new OpenXmlAttribute("", "smtClean", "", "0"));
SolidFill solidFill2 = new SolidFill();
RgbColorModelHex rgbColorModelHex2 = new RgbColorModelHex(){ Val = "FFFF00" };
solidFill2.Append(rgbColorModelHex2);
runProperties2.Append(solidFill2);
Text text1 = new Text();
text1.Text = "gaits";
run1.Append(runProperties1);
run1.Append(runProperties2);
run1.Append(text1);
return run1;
}
<a:r xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:rPr lang="en-US" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FFF000" />
</a:solidFill>
</a:rPr>
<a:rPr lang="en-US" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FFFF00" />
</a:solidFill>
</a:rPr>
<a:t>gaits</a:t>
</a:r>
Correct file :
public class GeneratedClass
{
// Creates an Run instance and adds its children.
public Run GenerateRun()
{
Run run1 = new Run();
RunProperties runProperties1 = new RunProperties(){ Language = "en-US", Dirty = false };
runProperties1.SetAttribute(new OpenXmlAttribute("", "smtClean", "", "0"));
SolidFill solidFill1 = new SolidFill();
RgbColorModelHex rgbColorModelHex1 = new RgbColorModelHex(){ Val = "FFFF00" };
solidFill1.Append(rgbColorModelHex1);
runProperties1.Append(solidFill1);
Text text1 = new Text();
text1.Text = "gaits";
run1.Append(runProperties1);
run1.Append(text1);
return run1;
}
}
<a:r xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:rPr lang="en-US" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FFFF00" />
</a:solidFill>
</a:rPr>
<a:t>gaits</a:t>
</a:r>
What am I missing? My complete code is below:
using OpenXmlDrawing = DocumentFormat.OpenXml.Drawing;
private void HighLightTextPresentation(OpenXmlDrawing.Paragraph paragraph, string text)
{
var found = paragraph
.Descendants<OpenXmlDrawing.Run>()
.Where(r => !string.IsNullOrEmpty(r.InnerText) && r.InnerText != "\\s")
.Select(r =>
{
var runText = r.GetFirstChild<OpenXmlDrawing.Text>();
int index = runText.Text.IndexOf(text, StringComparison.OrdinalIgnoreCase);
// 'Run' is a reference to the text run we found,
// TextNode is a reference to the run's Text object,
// 'TokenIndex` is the index of the search string in run's text
return new { Run = r, TextNode = runText, TokenIndex = index };
})
.FirstOrDefault(o => o.TokenIndex >= 0);
// Nothing found -- escape
if (found == null)
{
return;
}
// Create a node for highlighted text as a clone (to preserve formatting etc)
var highlightRun = found.Run.CloneNode(true);
// Add the highlight node after the found text run and set up the highlighting
paragraph.InsertAfter(highlightRun, found.Run);
highlightRun.GetFirstChild<OpenXmlDrawing.Text>().Text = text;
DocumentFormat.OpenXml.Drawing.RunProperties runPro = new DocumentFormat.OpenXml.Drawing.RunProperties() { Language = "en-US", Dirty = false };
runPro.SetAttribute(new OpenXmlAttribute("", "smtClean", "", "0"));
//Apply color to searched text
DocumentFormat.OpenXml.Drawing.SolidFill solidFill1 = new DocumentFormat.OpenXml.Drawing.SolidFill();
DocumentFormat.OpenXml.Drawing.RgbColorModelHex rgbColorModelHex1 = new DocumentFormat.OpenXml.Drawing.RgbColorModelHex() { Val = "FFF000" }; // Set the font color (hex "FFF000")
solidFill1.Append(rgbColorModelHex1);
runPro.Append(solidFill1);
highlightRun.InsertAt(runPro, 0);
// Check if there's some text in the text run *after* the found text
int remainderLength = found.TextNode.Text.Length - found.TokenIndex - text.Length;
if (remainderLength > 0)
{
// There is some text after the highlighted section --
// insert it in a separate text run after the highlighted text run
var remainderRun = found.Run.CloneNode(true);
paragraph.InsertAfter(remainderRun, highlightRun);
OpenXmlDrawing.Text textNode = remainderRun.GetFirstChild<OpenXmlDrawing.Text>();
textNode.Text = found.TextNode.Text.Substring(found.TokenIndex + text.Length);
// We need to set up this to preserve the spaces between text runs
//textNode.Space = new EnumValue<SpaceProcessingModeValues>(SpaceProcessingModeValues.Preserve);
}
// Check if there's some text *before* the found text
if (found.TokenIndex > 0)
{
// Something is left before the highlighted text,
// so make the original text run contain only that portion
found.TextNode.Text = found.TextNode.Text.Remove(found.TokenIndex);
// We need to set up this to preserve the spaces between text runs
//found.TextNode.Space = new EnumValue<SpaceProcessingModeValues>(SpaceProcessingModeValues.Preserve);
}
else
{
// There's nothing before the highlighted text -- remove the unneeded text run
paragraph.RemoveChild(found.Run);
}
}
As noted in the comments, the file gets corrupted because the code makes the XML structure invalid: it adds a second a:rPr element (RunProperties) to the a:r element (Run), and only one is allowed. The cloned run already carries the RunProperties of the original run, so InsertAt adds a duplicate.
So first check whether the Run already has a RunProperties element before inserting a new one. If one exists, reuse it instead.
// Either reuse an existing RunProperties element,
// or create a new one if there's none
RunProperties runPro = highlightRun.Descendants<RunProperties>().FirstOrDefault() ??
new RunProperties { Language = "en-US", Dirty = false };
// only add the element if it's really new, don't add existing one
if (runPro.Parent == null)
{
highlightRun.InsertAt(runPro, 0);
}
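To round this out, here is a minimal sketch of how the colour could then be applied to the reused element. The helper name SetRunColor and the class RunColorSketch are made up for illustration. The sketch assumes that any existing solid fill should simply be replaced, and it does not handle other fill types (a:noFill, a:gradFill and so on) that a run might already carry. Because the schema expects the fill near the start of a:rPr, after an optional a:ln, it is inserted there rather than appended.

```csharp
using A = DocumentFormat.OpenXml.Drawing;

static class RunColorSketch
{
    // Hypothetical helper: sets the text colour of a DrawingML run
    // without ever creating a second a:rPr element.
    public static void SetRunColor(A.Run run, string hexColor)
    {
        // Reuse the run's existing properties, or create them if missing.
        A.RunProperties runPro = run.GetFirstChild<A.RunProperties>();
        if (runPro == null)
        {
            runPro = new A.RunProperties { Language = "en-US", Dirty = false };
            run.InsertAt(runPro, 0);
        }

        // Drop any solid fill that is already there so only one remains.
        runPro.RemoveAllChildren<A.SolidFill>();

        var fill = new A.SolidFill(new A.RgbColorModelHex { Val = hexColor });

        // The fill goes right after a:ln (Outline) if present, else first.
        A.Outline outline = runPro.GetFirstChild<A.Outline>();
        if (outline != null)
        {
            runPro.InsertAfter(fill, outline);
        }
        else
        {
            runPro.InsertAt(fill, 0);
        }
    }
}
```

In the question's method this would be called as SetRunColor((A.Run)highlightRun, "FFFF00"), since CloneNode returns an OpenXmlElement. All other attributes and children of the cloned a:rPr stay untouched, which is what keeps the original formatting.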
Related
I am using GemBox.Document library in my ASP.Net Page. I have a paragraph which contains line breaks and I also need to set the paragraph to bold.
In my code below, the variable str contains line break characters.
TRY 1
Line breaks work well with the code below:
var p3 = new Paragraph(wDoc, str);
But how do I set this paragraph to bold?
TRY 2
Bold works well with the code below:
var p3 = new Paragraph(wDoc,
new Run(wDoc, str) { CharacterFormat = { Bold = true } }
);
But this doesn't keep the line breaks.
Please help me find a solution.
Probably the easiest way to do this is something like this:
var paragraph = new Paragraph(wDoc);
paragraph.Content.LoadText(str, new CharacterFormat() { Bold = true });
Or this:
var paragraph = new Paragraph(wDoc);
paragraph.CharacterFormatForParagraphMark.Bold = true;
paragraph.Content.LoadText(str);
But just in case you're interested, the thing to note here is that line breaks are represented with SpecialCharacter objects, not with Run objects.
So the "manual" way is to handle those breaks yourself and add the correct elements to the Paragraph.Inlines collection:
string str = "Sample 1\nSample 2\nSample 3";
string[] strLines = str.Split('\n');
var paragraph = new Paragraph(wDoc);
for (int i = 0; i < strLines.Length; i++)
{
paragraph.Inlines.Add(
new Run(wDoc, strLines[i]) { CharacterFormat = { Bold = true } });
if (i != strLines.Length - 1)
paragraph.Inlines.Add(
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak));
}
That is the same as if you were using this Paragraph constructor:
var paragraph = new Paragraph(wDoc,
new Run(wDoc, "Sample 1") { CharacterFormat = { Bold = true } },
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak),
new Run(wDoc, "Sample 2") { CharacterFormat = { Bold = true } },
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak),
new Run(wDoc, "Sample 3") { CharacterFormat = { Bold = true } });
I'm creating a word document using OXML in Visual Studio. I don't know how long it is going to be and I need to add a simple page number in the footer of the document.
To generate headers and footers I used this:
https://msdn.microsoft.com/en-us/library/ee355228(v=office.12).aspx
As I understand, this presets the default headers/footers before I even write anything in the document. So I'm not quite sure if I can add page numbering to this? I'd really appreciate the help, because I've been stuck on this for a whole day...
You can add dynamic page numbers by adding a SimpleField with an Instruction of "PAGE". Word will automatically update any such field with the correct page number.
In order to code that you can adapt the GeneratePageFooterPart in the link you provided to include a SimpleField in the Run that gets added to the Footer:
private static Footer GeneratePageFooterPart(string FooterText)
{
var element =
new Footer(
new Paragraph(
new ParagraphProperties(
new ParagraphStyleId() { Val = "Footer" }),
new Run(
new Text(FooterText),
// *** Adaptation: This will output the page number dynamically ***
new SimpleField() { Instruction = "PAGE" })
));
return element;
}
Note that you can change the format of the page number by appending a format switch to the PAGE instruction. From the Ecma Office Open XML Part 1 - Fundamentals And Markup Language Reference.pdf:
When the current page number is 19 and the following fields are updated:
PAGE
PAGE \* ArabicDash
PAGE \* ALPHABETIC
PAGE \* roman
the results are:
19
- 19 -
S
xix
So to get roman numerals for example you would need to change the SimpleField line of code above to:
new SimpleField() { Instruction = "PAGE \\* roman" })
or (if you prefer)
new SimpleField() { Instruction = @"PAGE \* roman" })
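One caveat about the footer code above: the WordprocessingML schema lists w:fldSimple as a child of the paragraph, not of a run, so a validator may flag a SimpleField placed inside a Run even if Word opens the file. Below is a sketch of a stricter variant that puts the field at paragraph level. The names FooterSketch and BuildPageNumberFooter are made up for illustration, and the "1" inside the field is only a placeholder result that Word is expected to replace when it updates the field.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class FooterSketch
{
    // Hypothetical variant of GeneratePageFooterPart with the PAGE field
    // as a sibling of the text run instead of a child of it.
    public static Footer BuildPageNumberFooter(string footerText)
    {
        return new Footer(
            new Paragraph(
                new ParagraphProperties(
                    new ParagraphStyleId { Val = "Footer" }),
                // Static footer text; Preserve keeps a trailing space.
                new Run(
                    new Text(footerText) { Space = SpaceProcessingModeValues.Preserve }),
                // The field wraps a run holding a placeholder result.
                new SimpleField(
                    new Run(new Text("1")))
                { Instruction = "PAGE" }));
    }
}
```

The same shape should work for the other instructions mentioned here, for example "NUMPAGES" or @"PAGE \* roman".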
Try this:
private static void GenerateFooterPartContent(WordprocessingDocument package, string text = null)
{
FooterPart footerPart1 = package.MainDocumentPart.FooterParts.FirstOrDefault();
if (footerPart1 == null)
{
footerPart1 = package.MainDocumentPart.AddNewPart<FooterPart>();
}
var relationshipId = package.MainDocumentPart.GetIdOfPart(footerPart1);
// Create SectionProperties and add a FooterReference with the new relationship Id
SectionProperties sectionProperties1 = new SectionProperties();
FooterReference footerReference2 = new FooterReference() { Type = HeaderFooterValues.Default, Id = relationshipId };
sectionProperties1.Append(footerReference2);
package.MainDocumentPart.Document.Body.Append(sectionProperties1);
Footer footer1 = new Footer();
Paragraph paragraph2 = CreateParagraph(package, string.Empty, "Footer");
Run r = new Run(new SimpleField() { Instruction = "DATE" });
paragraph2.Append(r);
if (!string.IsNullOrWhiteSpace(text))
{
r = new Run();
PositionalTab positionalTab1 = new PositionalTab() { Alignment = AbsolutePositionTabAlignmentValues.Center,
RelativeTo = AbsolutePositionTabPositioningBaseValues.Margin,
Leader = AbsolutePositionTabLeaderCharValues.None };
r.Append(positionalTab1);
paragraph2.Append(r);
r = new Run(new Text(text) { Space = SpaceProcessingModeValues.Preserve });
paragraph2.Append(r);
}
r = new Run();
PositionalTab positionalTab2 = new PositionalTab() { Alignment = AbsolutePositionTabAlignmentValues.Right,
RelativeTo = AbsolutePositionTabPositioningBaseValues.Margin,
Leader = AbsolutePositionTabLeaderCharValues.None };
r.Append(positionalTab2);
paragraph2.Append(r);
r = new Run(new Text("Page: ") { Space = SpaceProcessingModeValues.Preserve },
// *** Adaptation: This will output the page number dynamically ***
new SimpleField() { Instruction = "PAGE" },
new Text(" of ") { Space = SpaceProcessingModeValues.Preserve },
// *** Adaptation: This will output the number of pages dynamically ***
new SimpleField() { Instruction = "NUMPAGES" });
paragraph2.Append(r);
footer1.Append(paragraph2);
footerPart1.Footer = footer1;
}
Refer to the following link for more instructions.
I'm trying to generate Word documents using OpenXML SDK and Word Document Generator. I need to apply my custom style on ContentControls (Repeating Section).
For Recursive Placeholders, I use
foreach (var item in list)
{
var datacontext = new OpenXmlElementDataContext()
{
Element = openXmlElementDataContext.Element,
DataContext = item.Value
};
var clonedElement = CloneElementAndSetContentInPlaceholders(datacontext);
SetContentOfContentControl(clonedElement, item.Value);
}
openXmlElementDataContext.Element.Remove();
I need to apply my style to this element. How can I do that?
I looked at the code generated by the "Open XML SDK 2.5 Productivity Tool for Microsoft Office" for inspiration and tried this:
var moduleDatacontext = new OpenXmlElementDataContext()
{
Element = openXmlElementDataContext.Element,
DataContext = module.Valeur
};
var moduleClonedElement = CloneElementAndSetContentInPlaceholders(moduleDatacontext);
var sdtProperties1 = new SdtProperties();
var styleId1 = new StyleId() { Val = "FormationTitre2" };
ParagraphMarkRunProperties paragraphMarkRunProperties1 = new ParagraphMarkRunProperties();
RunFonts runFonts1 = new RunFonts() { ComplexScriptTheme = ThemeFontValues.MinorHighAnsi };
paragraphMarkRunProperties1.Append(runFonts1);
sdtProperties1.Append(styleId1);
sdtProperties1.Append(paragraphMarkRunProperties1);
Run run1 = new Run() { RsidRunProperties = "00C463E5" };
RunProperties runProperties1 = new RunProperties();
RunFonts runFonts2 = new RunFonts() { ComplexScriptTheme = ThemeFontValues.MinorHighAnsi };
runProperties1.Append(runFonts2);
run1.Append(runProperties1);
moduleClonedElement.Append(sdtProperties1);
moduleClonedElement.Append(run1);
When I open the generated document, I have this error :
We're sorry. We can't open "...docx" because we found a problem with its contents.
I validate the document and I can see 15 errors:
I've found the solution: I look for the first paragraph in the cloned element and apply my custom style to it.
// clone element
var clonedElement = CloneElementAndSetContentInPlaceholders(datacontext);
// search the first created paragraph on my clonedElement
Paragraph p = clonedElement.Descendants<Paragraph>().FirstOrDefault();
if (p != null)
{
// reuse the existing paragraph properties, or create them if there are none
// (a paragraph may contain only one ParagraphProperties element)
ParagraphProperties pPr = p.GetFirstChild<ParagraphProperties>()
?? p.PrependChild(new ParagraphProperties());
// apply style
pPr.ParagraphStyleId = new ParagraphStyleId { Val = "FormationTitre2" };
}
// set content of content control
SetContentOfContentControl(clonedElement, item.Value);
I need to highlight a sentence in a .docx file. The code below works fine for many documents, but I noticed that in some documents the text is stored word by word rather than as a whole sentence, that is, each word sits in its own Run. When I search for the sentence in such a document, it is not found, because no single Run contains it.
NOTE: I am working with Arabic text.
private void HighLightText_userSentence(Paragraph paragraph, string text, string title, string author, decimal percentage, string _color)
{
string textOfRun = string.Empty;
var runCollection = paragraph.Descendants<Run>();
Run runAfter = null;
//find the run part which contains the characters
foreach (Run run in runCollection)
{
if (run.GetFirstChild<Text>() != null)
{
textOfRun = run.GetFirstChild<Text>().Text.Trim();
if (textOfRun.Contains(text))
{
//remove the characters from this run part
run.GetFirstChild<Text>().Text = textOfRun.Replace(text, "");
runAfter = run;
break;
}
}
}
// create a new run with your customization font and the character as its text
Run HighLightRun = new Run();
RunProperties runPro = new RunProperties();
RunFonts runFont = new RunFonts() { Ascii = "Curlz MT", HighAnsi = "Curlz MT" };
Bold bold = new Bold();
DocumentFormat.OpenXml.Wordprocessing.Color color = new DocumentFormat.OpenXml.Wordprocessing.Color() { Val = _color };
DocumentFormat.OpenXml.Wordprocessing.FontSize fontSize = new DocumentFormat.OpenXml.Wordprocessing.FontSize() { Val = "22" };
FontSizeComplexScript fontSizeComplex = new FontSizeComplexScript() { Val = "24" };
Text runText = new Text() { Text = text };
//runPro.Append(runFont);
runPro.Append(bold);
runPro.Append(color);
//runPro.Append(fontSize);
// runPro.Append(fontSizeComplex);
HighLightRun.Append(runPro);
HighLightRun.Append(runText);
//HighLightRun.AppendChild(new Break());
//HighLightRun.PrependChild(new Break());
//insert the new created run part
paragraph.InsertBefore(HighLightRun, runAfter);
}
I recently used DocX and faced the same problems with searching and highlighting text. I tried an indirect way that is simple and works in most situations: replacing the text with itself and applying the formatting during the replacement.
Here searchText is the text you want to highlight:
using (DocX doc = DocX.Load("d:\\Sample.docx"))
{
Formatting form = new Formatting();
form.Highlight = Highlight.yellow;
form.Bold = true;
foreach (Paragraph sen in doc.Paragraphs)
{
// Replace the text with itself, applying the highlight formatting
sen.ReplaceText(searchText, searchText, false,
System.Text.RegularExpressions.RegexOptions.IgnoreCase,
form, null, MatchFormattingOptions.ExactMatch);
}
doc.Save();
}
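If you would rather stay with the Open XML SDK, one possible approach to the split-run problem is to treat the paragraph's runs as one continuous string, search in that, and then work out which runs the match overlaps. The sketch below only does the lookup. The names CrossRunSearch and FindRunsCovering are made up for illustration, and it assumes that Run.InnerText is a good enough approximation of the visible text, which may not hold for runs containing field codes or deleted text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

static class CrossRunSearch
{
    // Hypothetical helper: returns the runs whose text overlaps the first
    // occurrence of 'text' when all run texts are read as one string.
    public static List<Run> FindRunsCovering(Paragraph paragraph, string text)
    {
        var hits = new List<Run>();
        if (string.IsNullOrEmpty(text))
        {
            return hits;
        }

        List<Run> runs = paragraph.Descendants<Run>().ToList();
        string combined = string.Concat(runs.Select(r => r.InnerText));

        int start = combined.IndexOf(text, StringComparison.Ordinal);
        if (start < 0)
        {
            return hits; // not found, even across run boundaries
        }
        int end = start + text.Length;

        // Walk the runs, tracking each run's character range in 'combined'.
        int offset = 0;
        foreach (Run run in runs)
        {
            int length = run.InnerText.Length;
            int runEnd = offset + length;
            // Keep non-empty runs whose range intersects [start, end).
            if (length > 0 && runEnd > start && offset < end)
            {
                hits.Add(run);
            }
            offset = runEnd;
        }
        return hits;
    }
}
```

Formatting every returned run highlights whole runs, so the first and last run may include text outside the sentence. Splitting those two runs at the match boundaries, as the single-run code in the first question does, would tighten that up.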
For unit testing purposes, I would like to generate some sample data to be stored as a stream in the dataToImport variable in the following statement:
WordprocessingDocument.Open(dataToImport, false);
Does anyone know how to create a decent set of sample data?
You could potentially use something like the following:
// Open as editable (true) so that the appended paragraph is saved
using (WordprocessingDocument wpd = WordprocessingDocument.Open(filename, true))
{
wpd.MainDocumentPart.Document.Body.Append(GenerateParagraph(...text ...));
}
private Paragraph GenerateParagraph(string input)
{
Paragraph paragraph1 = new Paragraph();
Run run1 = new Run();
Break break1 = new Break() { Type = BreakValues.Page };
Text txt = new Text() { Space = SpaceProcessingModeValues.Preserve };
txt.Text = input;
run1.Append(break1);
run1.Append(txt);
paragraph1.Append(run1);
return paragraph1;
}
The value of the ...text... itself could come from any file, for example read through a FileStream.
Hope it helps!
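Since the question asks for a stream rather than a file, here is a minimal sketch that builds the sample document entirely in memory. The names SampleDocx and Create are made up for illustration, and the sketch assumes that plain paragraphs of text are enough as test data.

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class SampleDocx
{
    // Hypothetical helper: returns a stream holding a small .docx
    // with one paragraph per input string.
    public static MemoryStream Create(params string[] paragraphs)
    {
        var stream = new MemoryStream();
        using (WordprocessingDocument doc =
            WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
        {
            MainDocumentPart mainPart = doc.AddMainDocumentPart();
            var body = new Body();
            foreach (string p in paragraphs)
            {
                body.Append(new Paragraph(
                    new Run(new Text(p) { Space = SpaceProcessingModeValues.Preserve })));
            }
            mainPart.Document = new Document(body);
            mainPart.Document.Save();
        }
        // Rewind so the consumer reads from the start of the package.
        stream.Position = 0;
        return stream;
    }
}
```

A test could then call WordprocessingDocument.Open(SampleDocx.Create("first", "second"), false) in place of dataToImport.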