I have to refactor very old code (older than 2005). The software should open Word documents and fill variables in the header with data. The document opens, but the variables are empty.
I use the following code snippet to show the number of variables in a Word document for testing.
It always says there are 0. Can I use Word Interop with Office 365?
using Microsoft.Office.Interop.Word;
namespace CheckForVariablesInWord
{
class Program
{
static void Main(string[] args)
{
Microsoft.Office.Interop.Word.Application ap = new Microsoft.Office.Interop.Word.Application();
Document document = ap.Documents.Open(@"C:\temp\TestDocument.docx");
ap.Visible = true;
System.Console.WriteLine(document.Variables.Count);
System.Console.ReadLine();
}
}
}
This is the test Word document:
Here is the Variable named „myTest”
{ DOCVARIABLE myTest \* MERGEFORMAT }
Thanks to anyone who can help me out before I lose my mind ;.)
I am relatively new to C# and Visual Studio, and I am completely new to Microsoft.Office.Interop.OneNote. I have begun work on a console app using as a starting place the code in this answer, which is a reply to a question about writing to a OneNote app using C# and interop. My app does successfully retrieve content from a OneNote notebook, but I am not sure how to close the app in such a way that OneNote shuts down cleanly. Whenever I run the app in Visual Studio and then close the console, the next time I try to open OneNote, I get a message saying "We're sorry. OneNote is cleaning up from the last time it was open. Please wait." After about 30 seconds, this is followed by a message saying "It looks like OneNote is having trouble starting right now. If you keep seeing this message, restart your computer and start OneNote again. We're sorry."
It is impossible to open OneNote again until I either reboot or empty the contents of C:\users\%userprofile%\appdata\local\temp.
What is the proper code to use to close out of OneNote cleanly, and/or how should I be closing my app inside Visual Studio? (As I said, I'm a noob.)
Here is my code:
using System;
using Word = Microsoft.Office.Interop.Word;
using OneNote = Microsoft.Office.Interop.OneNote;
using System.Runtime.InteropServices;
using System.Xml.Linq;
using System.Linq;
namespace OneNote_to_Word
{
class Program
{
static OneNote.Application onenoteApp = new OneNote.Application();
static XNamespace ns = null;
static void Main(string[] args)
{
GetNamespace();
string notebookId = GetObjectId(null, OneNote.HierarchyScope.hsNotebooks, "Notebook Name");
string sectionId = GetObjectId(notebookId, OneNote.HierarchyScope.hsSections, "done");
string firstPageId = GetObjectId(sectionId, OneNote.HierarchyScope.hsPages, "140506");
GetPageContent(firstPageId);
Console.Read();
}
static void GetNamespace()
{
string xml;
onenoteApp.GetHierarchy(null, OneNote.HierarchyScope.hsNotebooks, out xml);
var doc = XDocument.Parse(xml);
ns = doc.Root.Name.Namespace;
}
static string GetObjectId(string parentId, OneNote.HierarchyScope scope, string objectName)
{
string xml;
onenoteApp.GetHierarchy(parentId, scope, out xml);
var doc = XDocument.Parse(xml);
var nodeName = "";
switch (scope)
{
case (OneNote.HierarchyScope.hsNotebooks): nodeName = "Notebook"; break;
case (OneNote.HierarchyScope.hsPages): nodeName = "Page"; break;
case (OneNote.HierarchyScope.hsSections): nodeName = "Section"; break;
default:
return null;
}
var node = doc.Descendants(ns + nodeName).Where(n => n.Attribute("name").Value == objectName).FirstOrDefault();
return node.Attribute("ID").Value;
}
static string GetPageContent(string pageId)
{
string xml;
onenoteApp.GetPageContent(pageId, out xml, OneNote.PageInfo.piAll);
var doc = XDocument.Parse(xml);
var outLine = doc.Descendants(ns + "Outline").First();
var content = outLine.Descendants(ns + "T").First();
string contentVal = content.Value;
//content.Value = "modified";
//onenoteApp.UpdatePageContent(doc.ToString());
Console.WriteLine(contentVal);
return null;
}
}
}
At least for .NET versions before 5, I needed to add the following lines at the end of Main():
Marshal.FinalReleaseComObject(onenoteApp);
onenoteApp = null;
GC.Collect(); // Start .NET CLR Garbage Collection
GC.WaitForPendingFinalizers(); // Wait for Garbage Collection to finish
So Main() now looks like this:
static void Main(string[] args)
{
GetNamespace();
string notebookId = GetObjectId(null, OneNote.HierarchyScope.hsNotebooks, "My Notebook");
string sectionId = GetObjectId(notebookId, OneNote.HierarchyScope.hsSections, "done");
string firstPageId = GetObjectId(sectionId, OneNote.HierarchyScope.hsPages, "140506");
GetPageContent(firstPageId);
Marshal.FinalReleaseComObject(onenoteApp);
onenoteApp = null;
GC.Collect(); // Start .NET CLR Garbage Collection
GC.WaitForPendingFinalizers(); // Wait for Garbage Collection to finish
}
However, in .NET 5, when I tried to use Marshal.FinalReleaseComObject(), Visual Studio complained that "This call site is reachable on all platforms. 'Marshal.FinalReleaseComObject(object)' is only supported on: 'windows'". This can be fixed by placing the following attribute above the class that contains Main() (cf. this answer):
[System.Runtime.Versioning.SupportedOSPlatform("windows")]
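As a rough sketch of where the attribute can go: besides the containing class, it can also be applied to a single method, which keeps the Windows-only marking close to the COM call. The `ComCleanup` class and its `Release` method are hypothetical names used for illustration only; `Marshal.IsComObject` and `Marshal.FinalReleaseComObject` are the real runtime calls. This assumes .NET 5 or later.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Versioning;

// Hypothetical helper, not part of any Office interop library.
public static class ComCleanup
{
    // Marking only this method as Windows-only tells the platform analyzer
    // that the FinalReleaseComObject call site is intentional.
    [SupportedOSPlatform("windows")]
    public static bool Release(object comObject)
    {
        // FinalReleaseComObject throws for null and for non-COM objects,
        // so both cases are filtered out first.
        if (comObject == null || !Marshal.IsComObject(comObject))
        {
            return false;
        }

        Marshal.FinalReleaseComObject(comObject);
        return true;
    }
}
```

Any caller of `ComCleanup.Release` then needs to be marked as Windows-only as well, or guard the call with `OperatingSystem.IsWindows()`, otherwise the analyzer reports the same warning at the new call site.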
Is there any way, or any DLL, to get the location or coordinates of the first letter found by OCR in a Windows Forms application in C#, without performing OCR on the whole document?
I have used the Aspose and Tesseract DLLs to perform OCR on an image. Extracting the text takes time because they read all of it, but I only want to read the first word and get the coordinates of its first letter. I have to implement this in a Windows Forms application using C#. Please help.
Thanks in advance!
Just as a disclaimer: this answer is about a paid software toolkit, and I work for the company.
You can check out the LEADTOOLS SDK, which has the segmentation algorithm I mentioned in my comment. Use it to zone the document, find the top-left-most zone of text, and perform OCR on just those bounds.
I wrote a console application to show an example of how to achieve this using the LEADTOOLS OCR NuGet package:
https://www.nuget.org/packages/Leadtools.Ocr/
using Leadtools;
using Leadtools.Codecs;
using Leadtools.ImageProcessing.Core;
using Leadtools.Ocr;
using System;
using System.Linq;
namespace FindFirstZone
{
class Program
{
static IOcrEngine ocrEngine;
static RasterCodecs codecs;
static void Main(string[] args)
{
Initialize();
var image = codecs.Load(@"randomtext.png");
LeadRect rect = FindFirstZone(image);
DoOcr(image, rect);
Console.ReadLine();
}
static void Initialize()
{
RasterSupport.SetLicense(@"C:\LEADTOOLS 20\Common\License\LEADTOOLS.LIC",
System.IO.File.ReadAllText(@"C:\LEADTOOLS 20\Common\License\LEADTOOLS.LIC.KEY"));
codecs = new RasterCodecs();
ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD, false);
ocrEngine.Startup(null, null, null, null);
}
static LeadRect FindFirstZone(RasterImage img)
{
AutoZoningCommand autoZoningCommand = new AutoZoningCommand(
AutoZoningOptions.DetectAccurateZones |
AutoZoningOptions.DetectText |
AutoZoningOptions.DontAllowOverlap);
autoZoningCommand.Run(img);
if (autoZoningCommand.Zones != null && autoZoningCommand.Zones.Count > 0)
{
var sortedList = autoZoningCommand.Zones.OrderBy(z => z.Bounds.Top)
.ThenBy(z => z.Bounds.Left).ToList();
return sortedList[0].Bounds;
}
else
throw new Exception("No Zones");
}
static void DoOcr(RasterImage image, LeadRect rect)
{
using (var ocrPage = ocrEngine.CreatePage(image, OcrImageSharingMode.None))
{
ocrPage.Zones.Add(new OcrZone()
{
Bounds = rect,
ZoneType = OcrZoneType.Text,
});
ocrPage.Recognize(null);
Console.WriteLine(ocrPage.GetText(-1));
}
}
}
}
I tested this with some random text I generated (test image here) and here is the output from this program:
Fowl it heaven second don't thing won't third cattle from. Had said
fill brought evening, a said great him
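The zone selection in FindFirstZone is just an ordering by Top and then by Left. A minimal, library-free sketch of that ordering is below; `Box` and `ZoneOrdering` are hypothetical stand-ins invented for illustration and are not LEADTOOLS types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a zone's bounding rectangle.
public readonly struct Box
{
    public Box(int left, int top, int width, int height)
    {
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }

    public int Left { get; }
    public int Top { get; }
    public int Width { get; }
    public int Height { get; }
}

public static class ZoneOrdering
{
    // Returns the box with the smallest Top; ties are broken by the smallest Left.
    public static Box FindFirst(IEnumerable<Box> boxes)
    {
        if (boxes == null)
        {
            throw new ArgumentNullException(nameof(boxes));
        }

        var sorted = boxes
            .OrderBy(b => b.Top)
            .ThenBy(b => b.Left)
            .ToList();

        if (sorted.Count == 0)
        {
            throw new InvalidOperationException("No zones");
        }

        return sorted[0];
    }
}
```

One caveat of a strict Top-then-Left ordering: a zone on the right that starts even one pixel higher than a zone on the left is picked first. If that matters for your documents, rounding Top to a line-height tolerance before sorting may be worth trying.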
I'm using WinForms. I have a form that has a button.
Goal: On button click, open a Word document whose file path is hard-coded into the program. I don't want users to have to locate the Word document themselves.
Problem: I receive an error message. In the code below, I get a red error line under 'Application'.
private void button1_Click(object sender, EventArgs e)
{
this.Application.Documents.Open(@"C:\Test\NewDocument.docx", ReadOnly:true);
}
Instead of adding an interop reference, you could also consider using this:
System.Diagnostics.Process.Start(@"C:\Test\NewDocument.docx");
First add the Microsoft.Office.Interop.Word DLL to your references, then add this:
using Microsoft.Office.Interop.Word;
and use the following code:
Application ap = new Application();
Document document = ap.Documents.Open(@"C:\Test\NewDocument.docx");
The Application you need is not this.Application; it is Microsoft.Office.Interop.Word.Application.
So you can use this code:
using System;
using Microsoft.Office.Interop.Word;
class Program
{
static void Main()
{
// Open a doc file.
Application application = new Application();
Document document = application.Documents.Open("C:\\word.doc");
//Do whatever you want
// Close word.
application.Quit();
}
}
There is a good answer above, which is:
System.Diagnostics.Process.Start(@"C:\Test\NewDocument.docx");
For .NET Core 2 and above, this should be modified to:
var p = new Process();
p.StartInfo = new ProcessStartInfo(filename)
{
UseShellExecute = true
};
p.Start();
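A self-contained version of the same idea might look like the sketch below. `DocumentLauncher`, `BuildStartInfo`, and `Open` are hypothetical names chosen for illustration; only `Process`, `ProcessStartInfo`, and `UseShellExecute` come from System.Diagnostics. The explicit setting matters because `UseShellExecute` defaults to true on .NET Framework but to false on .NET Core, and starting a document (rather than an executable) relies on the shell to pick the associated application.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for opening a document with its associated application.
public static class DocumentLauncher
{
    // Builds the start info separately so the settings can be inspected
    // without actually launching anything.
    public static ProcessStartInfo BuildStartInfo(string path)
    {
        if (string.IsNullOrWhiteSpace(path))
        {
            throw new ArgumentException("A file path is required.", nameof(path));
        }

        return new ProcessStartInfo(path)
        {
            // Let the operating system shell resolve the file association.
            UseShellExecute = true
        };
    }

    // Starts the associated application; may return null if no new process was started.
    public static Process Open(string path)
    {
        return Process.Start(BuildStartInfo(path));
    }
}
```

In the button handler from the question, the call would then be something like `DocumentLauncher.Open(@"C:\Test\NewDocument.docx");`.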
Due to the lack of proper documentation, I'm not sure if HtmlAgilityPack supports screen capture in C# after it loads the html contents.
So is there a way I can more or less grab a screenshot using (or along with) HtmlAgilityPack so I can have a visual clue as to what happens every time I do page manipulations?
Here is my working code so far:
using HtmlAgilityPack;
using System;
namespace ConsoleApplication4
{
class Program
{
static void Main(string[] args)
{
string urlDemo = "https://htmlagilitypack.codeplex.com/";
HtmlWeb getHtmlWeb = new HtmlWeb();
var doc = getHtmlWeb.Load(urlDemo);
var sentence = doc.DocumentNode.SelectNodes("//p");
int counter = 1;
try
{
foreach (var p in sentence)
{
Console.WriteLine(counter + ". " + p.InnerText);
counter++;
}
}
catch (Exception e)
{
Console.WriteLine(e);
}
Console.ReadLine();
}
}
}
Currently, it scrapes all the p elements of the page and outputs them to the console. At the same time, I want a screen grab of the scraped contents, but I don't know how or where to begin.
Any help is greatly appreciated. TIA
You can't do this with HTML Agility Pack. Use a different tool such as Selenium WebDriver. Here is how to do it: Take a screenshot with Selenium WebDriver
Could you use Selenium WebDriver instead?
You'll need to add the following NuGet packages to your project first:
Selenium.WebDriver
Selenium.Support
Loading a page and taking a screenshot is then as simple as...
using System;
using System.Drawing.Imaging;
using System.IO;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.Support.UI;
namespace SeleniumTest
{
class Program
{
static void Main(string[] args)
{
// Create a web driver that uses Firefox
var driver = new FirefoxDriver(
new FirefoxBinary(), new FirefoxProfile(), TimeSpan.FromSeconds(120));
// Load your page
driver.Navigate().GoToUrl("http://google.com");
// Wait until the page has actually loaded
var wait = new WebDriverWait(driver, new TimeSpan(0, 0, 10));
wait.Until(d => d.Title.Contains("Google"));
// Take a screenshot and save it to a file (you must have full access rights to the save location).
var myDesktop = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
((ITakesScreenshot)driver).GetScreenshot().SaveAsFile(Path.Combine(myDesktop, "google-screenshot.png"), ImageFormat.Png);
driver.Close();
}
}
}
Is it possible to close an InfoPath form programmatically? I know that it can be configured as a form rule / action but I want to close the form via code.
Use the ApplicationClass.XDocuments.Close method and pass it your document object:
using System;
using Microsoft.Office.Interop.InfoPath;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var app = new ApplicationClass();
var uri = @".\form1.xml";
var doc = app.XDocuments.Open(uri, 0);
app.XDocuments.Close(doc);
}
}
}