highlighting text in Docx using c# - c#

I need to highlight a sentence in docx file, I have this code, and its working fine for many documents , but i noticed that for some document the text inside the document is set word by word, not whole sentence, I mean each word with its own Run, so when searching for that sentence, it is not found because it is word by word in the docx.
NOTE: I am working with Arabic text.
private void HighLightText_userSentence(Paragraph paragraph, string text, string title, string author, decimal percentage, string _color)
{
string textOfRun = string.Empty;
var runCollection = paragraph.Descendants<Run>();
Run runAfter = null;
//find the run part which contains the characters
foreach (Run run in runCollection)
{
if (run.GetFirstChild<Text>() != null)
{
textOfRun = run.GetFirstChild<Text>().Text.Trim();
if (textOfRun.Contains(text))
{
//remove the character from thsi run part
run.GetFirstChild<Text>().Text = textOfRun.Replace(text, "");
runAfter = run;
break;
}
}
}
// create a new run with your customization font and the character as its text
Run HighLightRun = new Run();
RunProperties runPro = new RunProperties();
RunFonts runFont = new RunFonts() { Ascii = "Curlz MT", HighAnsi = "Curlz MT" };
Bold bold = new Bold();
DocumentFormat.OpenXml.Wordprocessing.Color color = new DocumentFormat.OpenXml.Wordprocessing.Color() { Val = _color };
DocumentFormat.OpenXml.Wordprocessing.FontSize fontSize = new DocumentFormat.OpenXml.Wordprocessing.FontSize() { Val = "22" };
FontSizeComplexScript fontSizeComplex = new FontSizeComplexScript() { Val = "24" };
Text runText = new Text() { Text = text };
//runPro.Append(runFont);
runPro.Append(bold);
runPro.Append(color);
//runPro.Append(fontSize);
// runPro.Append(fontSizeComplex);
HighLightRun.Append(runPro);
HighLightRun.Append(runText);
//HighLightRun.AppendChild(new Break());
//HighLightRun.PrependChild(new Break());
//insert the new created run part
paragraph.InsertBefore(HighLightRun, runAfter);
}

I recently used docX and was facing problems with searching and higlighting text. I tried an indirect way. It simple and works in most situations. I do it using the replace statement.
here search text is the text you want to highlight
using (DocX doc = DocX.Load("d:\\Sample.docx"))
{
for (int i = 0; i < doc.Paragraphs.Count; i++)
{
foreach (var item in doc.Paragraphs[i])
{
if (doc.Paragraphs[i] is Paragraph)
{
Paragraph sen = doc.Paragraphs[i] as Paragraph;
Formatting form = new Formatting();
form.Highlight = Highlight.yellow;
form.Bold = true;
sen.ReplaceText(searchText, searchText, false,
System.Text.RegularExpressions.RegexOptions.IgnoreCase,
form, null, MatchFormattingOptions.ExactMatch);
}
}
}
doc.Save();
}

Related

Set Bold to a Paragraph in GemBox Document ASP.Net c#

I am using GemBox.Document library in my ASP.Net Page. I have a paragraph which contains line breaks and I also need to set the paragraph to bold.
I my below code the variable str contains line break characters.
TRY 1
Line breaks work well in the below code
var p3 = new Paragraph(wDoc, str);
How to set BOLD to this paragraph
TRY 2
Bold work well in the below code
var p3 = new Paragraph(wDoc,
new Run(wDoc, str) { CharacterFormat = { Bold = true } }
);
This doesn't allow line breaks
Please help for a solution
Probably the easiest way to do this is something like this:
var paragraph = new Paragraph(wDoc);
paragraph.Content.LoadText(str, new CharacterFormat() { Bold = true });
Or this:
var paragraph = new Paragraph(wDoc);
paragraph.CharacterFormatForParagraphMark.Bold = true;
paragraph.Content.LoadText(str);
But just in case you're interested, the thing to note here is that line breaks are represented with SpecialCharacter objects, not with Run objects.
So the following would be the "manual" way in which you would need to handle those breaks yourself, you would need to add the correct elements to the Paragraph.Inlines collection:
string str = "Sample 1\nSample 2\nSample 3";
string[] strLines = str.Split('\n');
var paragraph = new Paragraph(wDoc);
for (int i = 0; i < strLines.Length; i++)
{
paragraph.Inlines.Add(
new Run(wDoc, strLines[i]) { CharacterFormat = { Bold = true } });
if (i != strLines.Length - 1)
paragraph.Inlines.Add(
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak));
}
That is the same as if you were using this Paragraph constructor:
var paragraph = new Paragraph(wDoc,
new Run(wDoc, "Sample 1") { CharacterFormat = { Bold = true } },
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak),
new Run(wDoc, "Sample 2") { CharacterFormat = { Bold = true } },
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak),
new Run(wDoc, "Sample 3") { CharacterFormat = { Bold = true } });

highlight text in a power point presentation using OpenXML?

I am using below code to highlight text in power point presentation (.pptx) using openxml but below code for pptx - it corrupts the file and ask to repair while opening pptx and after opening it highlights the word but it does not preserves the formatting. So total 2 problems:
1. File gets corrupted
2. Formatting is not preserved
But it highlights the text which i want (but it does not preserve formatting)
I have debuged my code line by line and problem is at below line, but i am not able to figure out what is the problem.
highlightRun.InsertAt(runPro, 0);
I have used openxml productivity tool to compare two files of pptx one with highlighting one without highlighting: The difference i see is as below : i am not using two times RunProperties but it is showing 2 times :
Corrupted file :
public Run GenerateRun()
{
Run run1 = new Run();
RunProperties runProperties1 = new RunProperties(){ Language = "en-US", Dirty = false };
runProperties1.SetAttribute(new OpenXmlAttribute("", "smtClean", "", "0"));
SolidFill solidFill1 = new SolidFill();
RgbColorModelHex rgbColorModelHex1 = new RgbColorModelHex(){ Val = "FFF000" };
solidFill1.Append(rgbColorModelHex1);
runProperties1.Append(solidFill1);
RunProperties runProperties2 = new RunProperties(){ Language = "en-US", Dirty = false };
runProperties2.SetAttribute(new OpenXmlAttribute("", "smtClean", "", "0"));
SolidFill solidFill2 = new SolidFill();
RgbColorModelHex rgbColorModelHex2 = new RgbColorModelHex(){ Val = "FFFF00" };
solidFill2.Append(rgbColorModelHex2);
runProperties2.Append(solidFill2);
Text text1 = new Text();
text1.Text = "gaits";
run1.Append(runProperties1);
run1.Append(runProperties2);
run1.Append(text1);
return run1;
}
<a:r xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:rPr lang="en-US" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FFF000" />
</a:solidFill>
</a:rPr>
<a:rPr lang="en-US" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FFFF00" />
</a:solidFill>
</a:rPr>
<a:t>gaits</a:t>
</a:r>
Correct file :
public class GeneratedClass
{
// Creates an Run instance and adds its children.
public Run GenerateRun()
{
Run run1 = new Run();
RunProperties runProperties1 = new RunProperties(){ Language = "en-US", Dirty = false };
runProperties1.SetAttribute(new OpenXmlAttribute("", "smtClean", "", "0"));
SolidFill solidFill1 = new SolidFill();
RgbColorModelHex rgbColorModelHex1 = new RgbColorModelHex(){ Val = "FFFF00" };
solidFill1.Append(rgbColorModelHex1);
runProperties1.Append(solidFill1);
Text text1 = new Text();
text1.Text = "gaits";
run1.Append(runProperties1);
run1.Append(text1);
return run1;
}
<a:r xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:rPr lang="en-US" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FFFF00" />
</a:solidFill>
</a:rPr>
<a:t>gaits</a:t>
</a:r>
What i am missing ? My complete code is as below :
using OpenXmlDrawing = DocumentFormat.OpenXml.Drawing;
private void HighLightTextPresentation(OpenXmlDrawing.Paragraph paragraph, string text)
{
var found = paragraph
.Descendants<OpenXmlDrawing.Run>()
.Where(r => !string.IsNullOrEmpty(r.InnerText) && r.InnerText != "\\s")
.Select(r =>
{
var runText = r.GetFirstChild<OpenXmlDrawing.Text>();
int index = runText.Text.IndexOf(text, StringComparison.OrdinalIgnoreCase);
// 'Run' is a reference to the text run we found,
// TextNode is a reference to the run's Text object,
// 'TokenIndex` is the index of the search string in run's text
return new { Run = r, TextNode = runText, TokenIndex = index };
})
.FirstOrDefault(o => o.TokenIndex >= 0);
// Nothing found -- escape
if (found == null)
{
return;
}
// Create a node for highlighted text as a clone (to preserve formatting etc)
var highlightRun = found.Run.CloneNode(true);
// Add the highlight node after the found text run and set up the highlighting
paragraph.InsertAfter(highlightRun, found.Run);
highlightRun.GetFirstChild<OpenXmlDrawing.Text>().Text = text;
DocumentFormat.OpenXml.Drawing.RunProperties runPro = new DocumentFormat.OpenXml.Drawing.RunProperties() { Language = "en-US", Dirty = false };
runPro.SetAttribute(new OpenXmlAttribute("", "smtClean", "", "0"));
//Apply color to searched text
DocumentFormat.OpenXml.Drawing.SolidFill solidFill1 = new DocumentFormat.OpenXml.Drawing.SolidFill();
DocumentFormat.OpenXml.Drawing.RgbColorModelHex rgbColorModelHex1 = new DocumentFormat.OpenXml.Drawing.RgbColorModelHex() { Val = "FFF000" };//Set Font-Color to Green (Hex "00B050").
solidFill1.Append(rgbColorModelHex1);
runPro.Append(solidFill1);
highlightRun.InsertAt(runPro, 0);
// Check if there's some text in the text run *after* the found text
int remainderLength = found.TextNode.Text.Length - found.TokenIndex - text.Length;
if (remainderLength > 0)
{
// There is some text after the highlighted section --
// insert it in a separate text run after the highlighted text run
var remainderRun = found.Run.CloneNode(true);
paragraph.InsertAfter(remainderRun, highlightRun);
OpenXmlDrawing.Text textNode = remainderRun.GetFirstChild<OpenXmlDrawing.Text>();
textNode.Text = found.TextNode.Text.Substring(found.TokenIndex + text.Length);
// We need to set up this to preserve the spaces between text runs
//textNode.Space = new EnumValue<SpaceProcessingModeValues>(SpaceProcessingModeValues.Preserve);
}
// Check if there's some text *before* the found text
if (found.TokenIndex > 0)
{
// Something is left before the highlighted text,
// so make the original text run contain only that portion
found.TextNode.Text = found.TextNode.Text.Remove(found.TokenIndex);
// We need to set up this to preserve the spaces between text runs
//found.TextNode.Space = new EnumValue<SpaceProcessingModeValues>(SpaceProcessingModeValues.Preserve);
}
else
{
// There's nothing before the highlighted text -- remove the unneeded text run
paragraph.RemoveChild(found.Run);
}
}
As noted in the comments, the reason of "getting corrupted" is that you make the XML structure invalid by creating an additional a:rPr element (RunProperties) in the a:r element (Run) while only one is allowed.
So you should first check whether there already is a RunProperties element in the Run before inserting a new one. If a RunProperties element already exists, you should instead reuse it.
// Either reuse an existing RunProperties element,
// or create a new one if there's none
RunProperties runPro = highlightRun.Descendants<RunProperties>().FirstOrDefault() ??
new RunProperties { Language = "en-US", Dirty = false };
// only add the element if it's really new, don't add existing one
if (runPro.Parent == null)
{
highlightRun.InsertAt(runPro, 0);
}

How to set font size in empty cell of a Word table using the Open XML SDK?

I am creating a table in OpenXml with C# in a Word file. I used some code mentioned in this question to set the fontsize of the text in the cells. It works fine for the cells that contain text, but the empty cells seem to be given the normal style, and with that a bigger font size, which makes the row height bigger.
Here is my sample code with a single row with a single cell, where the font size should be 9:
TableRow tr = new TableRow();
TableCell tc = new TableCell();
Paragraph par = new Paragraph();
Run run = new Run();
Text txt = new Text("txt");
RunProperties runProps = new RunProperties();
FontSize fontSize = new Fontsize() { Val = "18" }; // font size 9
runProps.Append(fontSize);
run.Append(runProps);
run.Append(txt);
para.Append(run);
tc.Append(para);
tr.Append(tc);
Here is an example of the resulting table. As you can see the middle row is taller than the others. In the cells that say "txt" the font size is 9, but in the blank cell the font size is 11. The code above is used for all the cells, where the empty cell simply has the text "". When I looked at the file with the Open XML Tool, I can see that the RunProperties with value 18 is there for all the cells including the empty one.
How do I set the font size of a cell without displaying any text?
I confirm what you report. One way around it would be to substitute a space " " for the "empty" string, adding "space preserve" to the text run, of course.
Another possibility would be to create a character style and apply it to the table cells. This turned out to be trickier than it sounds as the style needs to be applied twice to cells that contain text: once to the ParagraphMarkRunProperties and once to the RunProperties. For empty cells the style need be applied only to the ParagraphMarkRunProperties.
Actually, on reflection, you can use the same approach for the font size...
I've included both approaches in the code below. The one for just the font size is commented out (four lines).
The sample code assumes that the third cell of the one-row, four column table, has no content. Run and Text information is added only when there is content.
private void btnCreateTable_Click(object sender, EventArgs e)
{
string filePath = #"C:\X\TestCreateTAble.docx";
using (WordprocessingDocument pkg = WordprocessingDocument.Open(filePath, true))
{
MainDocumentPart partDoc = pkg.MainDocumentPart;
Document doc = partDoc.Document;
StyleDefinitionsPart stylDefPart = partDoc.StyleDefinitionsPart;
Styles styls = stylDefPart.Styles;
Style styl = CreateTableCharacterStyle();
stylDefPart.Styles.AppendChild(styl);
Table t = new Table();
TableRow tr = new TableRow();
for (int i = 1; i <= 4; i++)
{
TableCell tc = new TableCell(new TableCellProperties(new TableCellWidth() { Type = TableWidthUnitValues.Dxa, Width = "500" }));
Paragraph para = new Paragraph();
ParagraphProperties paraProps = new ParagraphProperties();
ParagraphMarkRunProperties paraRunProps = new ParagraphMarkRunProperties();
RunStyle runStyl = new RunStyle() { Val = "Table9Point" };
paraRunProps.Append(runStyl);
// FontSize runFont = new FontSize() {Val = "18" };
// paraRunProps.Append(runFont);
paraProps.Append(paraRunProps);
para.Append(paraProps);
Run run = new Run();
Text txt = null;
if (i == 3)
{
}
else
{
txt = new Text("txt");
txt.Space = SpaceProcessingModeValues.Preserve;
RunProperties runProps = new RunProperties();
RunStyle inRunStyl = (RunStyle) runStyl.Clone();
runProps.Append(inRunStyl);
// FontSize inRunFont = (FontSize) runFont.Clone();
// runProps.Append(inRunFont);
run.Append(runProps);
run.Append(txt);
para.Append(run);
}
tc.Append(para);
tr.Append(tc);
}
t.Append(tr);
//Insert at end of document
SectionProperties sectProps = doc.Body.Elements<SectionProperties>().LastOrDefault();
doc.Body.InsertBefore(t, sectProps);
}
}
private Style CreateTableCharacterStyle()
{
Style styl = new Style()
{
CustomStyle = true,
StyleId = "Table9Point",
Type = StyleValues.Character,
};
StyleName stylName = new StyleName() { Val = "Table9Point" };
styl.AppendChild(stylName);
StyleRunProperties stylRunProps = new StyleRunProperties();
stylRunProps.FontSize = new FontSize() { Val = "18" };
styl.AppendChild(stylRunProps);
BasedOn basedOn1 = new BasedOn() { Val = "DefaultParagraphFont" };
styl.AppendChild(basedOn1);
return styl;
}
Just need to set the FontSize and FontSizeComplexScript for the ParagraphProperties and RunProperties, like this:
var cell = new TableCell(
new Paragraph(
new ParagraphProperties(
new SpacingBetweenLines { LineRule = LineSpacingRuleValues.Auto, Before = "0", After = "0" },
new RunProperties(
new FontSize { Val = "18" },
new FontSizeComplexScript { Val = "18" })),
new Run(
new RunProperties(
new FontSize { Val = "18" },
new FontSizeComplexScript { Val = "18" }),
new Text { Text = String.Empty, Space = SpaceProcessingModeValues.Preserve })));

Hiding a PDF text by adding an extra layer

i need to hide a text by adding a new layer over the text i need to hide.
public void ReplacePDFText(string strSearch, StringComparison scCase, string strSource, string strDest)
{
PdfContentByte pCont = null;
if (File.Exists(strSource)) {
PdfReader pdfFileReader = new PdfReader(strSource);
using (PdfStamper psStamp = new PdfStamper(pdfFileReader, new FileStream(strDest, FileMode.Create))) {
for (int intCurrPage = 1; intCurrPage <= pdfFileReader.NumberOfPages; intCurrPage++) {
LocTextExtractionStrategy Strategy = new LocTextExtractionStrategy();
pCont = psStamp.GetUnderContent(intCurrPage);
Strategy.UndercontentCharacterSpacing = pCont.CharacterSpacing;
Strategy.UndercontentHorizontalScaling = pCont.HorizontalScaling;
string currText = PdfTextExtractor.GetTextFromPage(pdfFileReader, intCurrPage, Strategy);
List<iTextSharp.text.Rectangle> lstMatches = Strategy.GetTextLocations(strSearch, scCase);
PdfLayer pdLayer = default(PdfLayer);
pdLayer = new PdfLayer("over", psStamp.Writer);
pCont.SetColorFill(BaseColor.BLACK);
foreach (Rectangle rctRect in lstMatches) {
pCont.Rectangle(rctRect.Left, rctRect.Bottom, rctRect.Width, rctRect.Height);
pCont.Fill();
}
}
}
pdfFileReader.Close();
}
}
The problem with the approach above, is that the layer is added successfully with black color. So instead of the text i have a beautiful black line over the text.
But if i set the pCont.SetColorFill(BaseColor.BLACK) to WHITE, the text is still displayed.
How can i overcome this issue?
Instead of:
pCont = psStamp.GetUnderContent(intCurrPage);
Use:
pCont = psStamp.GetOverContent(intCurrPage);

how can I change the font open xml

How can I change the font family of the document via OpenXml ?
I tried some ways but, when I open the document, it's always in Calibri
Follow my code, and what I tried.
The Header Builder I think is useless to post
private static void BuildDocument(string fileName, List<string> lista, string tipo)
{
using (var w = WordprocessingDocument.Create(fileName, WordprocessingDocumentType.Document))
{
var mp = w.AddMainDocumentPart();
var d = new DocumentFormat.OpenXml.Wordprocessing.Document();
var b = new Body();
var p = new DocumentFormat.OpenXml.Wordprocessing.Paragraph();
var r = new Run();
// Get and format the text.
for (int i = 0; i < lista.Count; i++)
{
Text t = new Text();
t.Text = lista[i];
if (t.Text == " ")
{
r.Append(new CarriageReturn());
}
else
{
r.Append(t);
r.Append(new CarriageReturn());
}
}
// What I tried
var rPr = new RunProperties(new RunFonts() { Ascii = "Arial" });
lista.Clear();
p.Append(r);
b.Append(p);
var hp = mp.AddNewPart<HeaderPart>();
string headerRelationshipID = mp.GetIdOfPart(hp);
var sectPr = new SectionProperties();
var headerReference = new HeaderReference();
headerReference.Id = headerRelationshipID;
headerReference.Type = HeaderFooterValues.Default;
sectPr.Append(headerReference);
b.Append(sectPr);
d.Append(b);
// Customize the header.
if (tipo == "alugar")
{
hp.Header = BuildHeader(hp, "Anúncio Aluguel de Imóvel");
}
else if (tipo == "vender")
{
hp.Header = BuildHeader(hp, "Anúncio Venda de Imóvel");
}
else
{
hp.Header = BuildHeader(hp, "Aluguel/Venda de Imóvel");
}
hp.Header.Save();
mp.Document = d;
mp.Document.Save();
w.Close();
}
}
In order to style your text with a specific font follow the steps listed below:
Create an instance of the RunProperties class.
Create an instance of the RunFont class. Set the Ascii property to the desired font familiy.
Specify the size of your font (half-point font size) using the FontSize class.
Prepend the RunProperties instance to your run containing the text to style.
Here is a small code example illustrating the steps described above:
private static void BuildDocument(string fileName, List<string> text)
{
using (var wordDoc = WordprocessingDocument.Create(fileName, WordprocessingDocumentType.Document))
{
var mainPart = wordDoc.AddMainDocumentPart();
mainPart.Document = new Document();
var run = new Run();
foreach (string currText in text)
{
run.AppendChild(new Text(currText));
run.AppendChild(new CarriageReturn());
}
var paragraph = new Paragraph(run);
var body = new Body(paragraph);
mainPart.Document.Append(body);
var runProp = new RunProperties();
var runFont = new RunFonts { Ascii = "Arial" };
// 48 half-point font size
var size = new FontSize { Val = new StringValue("48") };
runProp.Append(runFont);
runProp.Append(size);
run.PrependChild(runProp);
mainPart.Document.Save();
wordDoc.Close();
}
}
Hope, this helps.
If you are using Stylesheet just add an instance of FontName property at appropriate font index during Fonts initilaization.
private Stylesheet GenerateStylesheet()
{
Stylesheet styleSheet = null;
Fonts fonts = new Fonts(
new Font( // Index 0 - default
new FontSize() { Val = 8 },
new FontName() { Val = "Arial"} //i.e. or any other font name as string
);
Fills fills = new Fills( new Fill(new PatternFill() { PatternType = PatternValues.None }));
Borders borders = new Borders( new Border() );
CellFormats cellFormats = new CellFormats( new CellFormat () );
styleSheet = new Stylesheet(fonts, fills, borders, cellFormats);
return styleSheet;
}
Then use it in Workbook style part as below.
WorkbookStylesPart stylePart = workbookPart.AddNewPart<WorkbookStylesPart>();
stylePart.Stylesheet = GenerateStylesheet();
stylePart.Stylesheet.Save();

Categories

Resources