Microsoft.Office.Interop.Word .doc/.docx lines count C# - c#

I want to count the total number of lines of the word document (.doc /.docx).
In my class, I added a reference to the COM library Microsoft.Office.Interop.Word through which I am counting the document's total word count.
According to the Lines.Count property documentation, the latest version of the library also exposes a line count.
But unfortunately, I am unable to find the Lines interface or property in the whole library.
Is there any other way to get the total number of lines of the MS Word document as shown in the image below?
Click here to view image
Method for words count (just for reference)
public int GetWordsCountFromWordFile(string wordFile)
{
try
{
if (!string.IsNullOrEmpty(wordFile))
{
var application = new Application();
var document = application.Documents.Open(wordFile, ReadOnly: true);
int count = document.Words.Count;
document.Close();
return count;
}
return 0;
}
catch (Exception ex)
{
LogWriter.ErrorLogWriter(nameof(Client), nameof(TaskHelper), nameof(GetWordsCountFromWordFile), "int", ex.Message);
return 0;
}
}

Answering my own question because I found the easiest and shortest working solution to the problem, exactly what I wanted.
I added two more lines of code to the method in my question and got accurate results.
int lines = document.ComputeStatistics(WdStatistic.wdStatisticLines, true);
application.Quit(WdSaveOptions.wdDoNotSaveChanges);
Notes:
Passing true as the second argument of ComputeStatistics includes footnotes and endnotes in the count.
Calling Quit with wdDoNotSaveChanges stops the save-changes window from opening and closes the MS Word process that would otherwise keep running in the background.
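Put together, the method from the question with those two lines added might look like the sketch below. It assumes the same Microsoft.Office.Interop.Word reference as the question. The class name, the method name GetLinesCountFromWordFile and the try/finally layout are illustrative choices, not part of the original code.

```csharp
using System;
using Microsoft.Office.Interop.Word;

public static class WordStatisticsSketch
{
    // Hypothetical name; mirrors GetWordsCountFromWordFile from the question.
    public static int GetLinesCountFromWordFile(string wordFile)
    {
        if (string.IsNullOrEmpty(wordFile))
        {
            return 0;
        }

        Application application = null;
        Document document = null;
        try
        {
            application = new Application();
            document = application.Documents.Open(wordFile, ReadOnly: true);

            // Second argument true: include footnotes and endnotes in the count.
            return document.ComputeStatistics(WdStatistic.wdStatisticLines, true);
        }
        finally
        {
            // Close the document and quit Word even if an exception was thrown,
            // so that no Word process is left running in the background.
            document?.Close(SaveChanges: false);
            application?.Quit(WdSaveOptions.wdDoNotSaveChanges);
        }
    }
}
```

Putting Close and Quit in a finally block is a design choice of this sketch: in the original method an exception between Open and Close would skip both calls.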

Related

C# - Shell32.NameSpace does not work when trying to extract metadata from files

I am attempting to get the metadata from a few music files and failing miserably. Online, there seems to be absolutely NO HOPE in finding an answer; no matter what I google. I thought it would be a great time to come and ask here because of this.
The specific error I got was: Error HRESULT E_FAIL has been returned from a call to a COM component. I really wish I could elaborate on this issue, but I'm simply getting nothing more back from the COMException object. The error code was -2147467259, which in hex is -0x7FFFBFFB (0x80004005 when written as an unsigned HRESULT), and I could not find anything specific documented for this error.
I'm 70% sure that it's not the file's fault. My code runs through a directory full of music and converts each file into a song, hence the name ConvertFileToSong. What I'm trying to say is that the function would not be running if the file did not exist.
The only other thing I can really say is that I'm using .NET 6, and have a massive headache.
Well, I guess I could also share another problem I had before this error showed up. .NET 6 has top-level statements, which means that I can't add the [STAThread] attribute. To solve this, I simply added the code below to the top. I'm not sure why I have to set it to Unknown first, but that's what someone else on Stack Overflow had to do. That solved the previous problem, where Shell32 could not start, but could it be causing my current problem? Who knows... definitely not me.
Thread.CurrentThread.SetApartmentState(ApartmentState.Unknown);
Thread.CurrentThread.SetApartmentState(ApartmentState.STA);
Here is the code:
// Help from: https://stackoverflow.com/questions/37869388/how-to-read-extended-file-properties-file-metadata
public static Song ConvertFileToSong(FileInfo file)
{
Song song = new Song();
List<string> headers = new List<string>();
// initialise the windows shell to parse attributes from
Shell32.Shell shell = new Shell32.Shell();
Shell32.Folder objFolder = null;
try
{
objFolder = shell.NameSpace(file.FullName);
}
catch (COMException e)
{
int code = e.ErrorCode;
string hex = "0x" + code.ToString("X8"); // format the HRESULT as hex, not decimal
Console.WriteLine("MESSAGE: " + e.Message + ", CODE: " + hex);
return null;
}
Shell32.FolderItem folderItem = objFolder.ParseName(file.Name);
// the rest of the code is not important, but I'll leave it there anyway
// pretty much loop indefinitely with a counter; better than a
// while loop because we don't have to declare an int on a new
// line
for (int i = 0; i < short.MaxValue; i++)
{
string header = objFolder.GetDetailsOf(null, i);
// the header does not exist, so we must exit
if (String.IsNullOrEmpty(header)) break;
headers.Add(header);
}
// Once the code works, I'll try and get this to work
song.Title = objFolder.GetDetailsOf(folderItem, 0);
return song;
}
Good night,
Diseased Finger
OK, so the solution isn't that hard. I passed file.FullName, which includes the file's name, but Shell32's NameSpace ONLY takes the directory path (without the file name).
This is the code that fixed it:
public static Song ConvertFileToSong(FileInfo file)
{
// .....
Shell32.Shell shell = new Shell32.Shell();
Shell32.Folder objFolder = shell.NameSpace(file.DirectoryName);
Shell32.FolderItem folderItem = objFolder.ParseName(file.Name);
// .....
return something;
}
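To make the distinction concrete, here is a small sketch that does not touch Shell32 at all. It only shows which part of a path would be passed to NameSpace and which to ParseName, and how the COMException error code could be printed as actual hex. The class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class ShellPathSketch
{
    // NameSpace expects the folder path; ParseName expects the item name inside that folder.
    public static (string Folder, string Item) SplitForShell(string fullPath)
    {
        var file = new FileInfo(fullPath);
        return (file.DirectoryName, file.Name);
    }

    // Formats an HRESULT the way it usually appears in documentation, e.g. 0x80004005 for E_FAIL.
    public static string FormatHResult(int errorCode)
    {
        return "0x" + errorCode.ToString("X8");
    }
}
```

For example, FormatHResult(-2147467259) yields 0x80004005, which is the value to search for when looking up E_FAIL.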

Word Interop Batch Printing

I have an application that creates multiple (regularly 1000+) Word files and then prints them.
I have created the code below which, once the PrintDialog is OK'd, proceeds to print the documents that reside in a folder. However, recently we've had some strange behaviour following some Office updates (it's Office 2010, which we cannot upgrade). The screen literally flickers, like a really bad graphics problem, and then Word crashes after about 500 prints. This hasn't happened before, and the code has been running for probably 12 months+ without issue.
As you can see, the application opens one instance of Word, then uses that to open, print and close each document. I can't see how this causes a problem, as Word only ever has one document open at a time. The memory size of winword.exe also doesn't increase significantly beyond its size when first opened, so the Office updates don't appear to have introduced a memory leak.
Is there a more efficient way of doing this?
if (printDialog.ShowDialog() == DialogResult.OK)
{
int c = 1;
int total = directoryInfo.GetFiles().Count();
// Set the Progress bar details
App.Current.Dispatcher.Invoke((Action)delegate { MainWindow.ui_progressBar.Maximum = directoryInfo.GetFiles().Length; });
// loop through the files in order
foreach (FileInfo report in directoryInfo.GetFiles().OrderBy(x => x.FullName))
{
UpdateUI("Printing " + c.ToString() + " / " + total.ToString());
Microsoft.Office.Interop.Word.Document reportFile = wordApp.Documents.Add(report.FullName);
wordApp.ActivePrinter = printDialog.PrinterSettings.PrinterName;
wordApp.ActiveDocument.PrintOut();
reportFile.Close(SaveChanges: false);
reportFile = null;
PrinterCounter++;
App.Current.Dispatcher.Invoke((Action)delegate { MainWindow.ui_progressBar.Value = PrinterCounter; });
c++;
}
}

Why does my file sometimes disappear in the process of reading from it or writing to it?

I have an app that reads from text files to determine which reports should be generated. It works as it should most of the time, but once in a while the program deletes one of the text files it reads from and writes to. Then an exception is thrown ("Could not find file") and progress ceases.
Here is some pertinent code.
First, reading from the file:
List<String> delPerfRecords = ReadFileContents(DelPerfFile);
. . .
private static List<String> ReadFileContents(string fileName)
{
List<String> fileContents = new List<string>();
try
{
fileContents = File.ReadAllLines(fileName).ToList();
}
catch (Exception ex)
{
RoboReporterConstsAndUtils.HandleException(ex);
}
return fileContents;
}
Then, writing to the file -- it marks the record/line in that file as having been processed, so that the same report is not re-generated the next time the file is examined:
MarkAsProcessed(DelPerfFile, qrRecord);
. . .
private static void MarkAsProcessed(string fileToUpdate, string qrRecord)
{
try
{
var fileContents = File.ReadAllLines(fileToUpdate).ToList();
for (int i = 0; i < fileContents.Count; i++)
{
if (fileContents[i] == qrRecord)
{
fileContents[i] = string.Format("{0}{1} {2}",
qrRecord, RoboReporterConstsAndUtils.COMPLETED_FLAG, DateTime.Now);
}
}
// Will this automatically overwrite the existing?
File.Delete(fileToUpdate);
File.WriteAllLines(fileToUpdate, fileContents);
}
catch (Exception ex)
{
RoboReporterConstsAndUtils.HandleException(ex);
}
}
So I do delete the file, but immediately replace it:
File.Delete(fileToUpdate);
File.WriteAllLines(fileToUpdate, fileContents);
The files being read have contents such as this:
Opas,20170110,20161127,20161231-COMPLETED 1/10/2017 12:33:27 AM
Opas,20170209,20170101,20170128-COMPLETED 2/9/2017 11:26:04 AM
Opas,20170309,20170129,20170225-COMPLETED
Opas,20170409,20170226,20170401
If "-COMPLETED" appears at the end of the record/row/line, it is ignored - will not be processed.
Also, if the second element (at index 1) is a date in the future, it will not be processed (yet).
So, for the examples shown above, the first three have already been done and will subsequently be ignored. The fourth one will not be acted on until on or after April 9th, 2017 (at which time the data within the date range given by the last two dates will be retrieved).
Why is the file sometimes deleted? What can I do to prevent it from ever happening?
If helpful, in more context, the logic is like so:
internal static string GenerateAndSaveDelPerfReports()
{
string allUnitsProcessed = String.Empty;
bool success = false;
try
{
List<String> delPerfRecords = ReadFileContents(DelPerfFile);
List<QueuedReports> qrList = new List<QueuedReports>();
foreach (string qrRecord in delPerfRecords)
{
var qr = ConvertCRVRecordToQueuedReport(qrRecord);
// Rows that have already been processed return null
if (null == qr) continue;
// If the report has not yet been run, and it is due, add it to the list
if (qr.DateToGenerate <= DateTime.Today)
{
var unit = qr.Unit;
qrList.Add(qr);
MarkAsProcessed(DelPerfFile, qrRecord);
if (String.IsNullOrWhiteSpace(allUnitsProcessed))
{
allUnitsProcessed = unit;
}
else if (!allUnitsProcessed.Contains(unit))
{
allUnitsProcessed = allUnitsProcessed + " and " + unit;
}
}
}
foreach (QueuedReports qrs in qrList)
{
GenerateAndSaveDelPerfReport(qrs);
success = true;
}
}
catch
{
success = false;
}
if (success)
{
return String.Format("Delivery Performance report[s] generated for {0} by RoboReporter2017", allUnitsProcessed);
}
return String.Empty;
}
How can I ironclad this code to prevent the files from being periodically trashed?
UPDATE
I can't really test this, because the problem occurs so infrequently, but I wonder if adding a "pause" between the File.Delete() and the File.WriteAllLines() would solve the problem?
UPDATE 2
I'm not absolutely sure what the answer to my question is, so I won't add this as an answer, but my guess is that the File.Delete() and File.WriteAllLines() were occurring too close together and so the delete was sometimes occurring on both the old and the new copy of the file.
If so, a pause between the two calls may have solved the problem 99.42% of the time, but from what I found here, it seems the File.Delete() is redundant/superfluous anyway, and so I tested with the File.Delete() commented out, and it worked fine; so, I'm just doing without that occasionally problematic call now. I expect that to solve the issue.
// Will this automatically overwrite the existing?
File.Delete(fileToUpdate);
File.WriteAllLines(fileToUpdate, fileContents);
I would simply add an extra parameter to WriteAllLines() (which could default to false) to tell the function to open the file in overwrite mode, and not call File.Delete() at all then.
Do you currently check the return value of the file open?
Update: OK, it looks like WriteAllLines() is a .NET Framework method and therefore cannot be changed, so I deleted this answer. However, the following now shows up in the comments, as a proposed solution from another forum:
"just use something like File.WriteAllText where if the file exists,
the data is just overwritten, if the file does not exist it will be
created."
And this was exactly what I meant (while thinking WriteAllLines() was a user defined function), because I've had similar problems in the past.
So a solution like that (simply overwriting the file instead of deleting and quickly reopening it) could solve some tricky problems. It also means less work for the OS, and possibly less file/disk fragmentation.
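A minimal sketch of that idea, using nothing beyond System.IO: File.WriteAllLines creates the file if it is missing and overwrites it if it exists, so the File.Delete call can simply be dropped. The completedFlag parameter stands in for RoboReporterConstsAndUtils.COMPLETED_FLAG, and the class name is made up for illustration. Whether the delete was really the cause of the disappearing file was never confirmed in the question, so treat this as a likely improvement rather than a proven fix.

```csharp
using System;
using System.IO;
using System.Linq;

public static class MarkAsProcessedSketch
{
    public static void MarkAsProcessed(string fileToUpdate, string qrRecord, string completedFlag)
    {
        var fileContents = File.ReadAllLines(fileToUpdate).ToList();
        for (int i = 0; i < fileContents.Count; i++)
        {
            if (fileContents[i] == qrRecord)
            {
                fileContents[i] = string.Format("{0}{1} {2}", qrRecord, completedFlag, DateTime.Now);
            }
        }

        // No File.Delete here: WriteAllLines replaces the contents of an existing file,
        // so the file is never removed from the directory between the two steps.
        File.WriteAllLines(fileToUpdate, fileContents);
    }
}
```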

Save and delete Excel file saved using HttpPostedFileBase

I am uploading an Excel file and extracting data from that and saving it into a database. I am using MVC4 .NET Framework. This is my code from class:
public static void Upload(HttpPostedFileBase File)
{
NIKEntities1 obj = new NIKEntities1();
MyApp = new Excel.Application();
MyApp.Visible = false;
string extension = System.IO.Path.GetExtension(File.FileName);
string pic = "Excel" + extension;
string path = System.IO.Path.Combine(System.Web.HttpContext.Current.Server.MapPath("~/Excel"), pic);
File.SaveAs(path);
MyBook = MyApp.Workbooks.Open(path);
MySheet = (Excel.Worksheet)MyBook.Sheets[1]; // Explicit cast is not required here
int lastRow = MySheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell).Row;
List<Employee> EmpList = new List<Employee>();
for (int index = 2; index <= lastRow; index++)
{
System.Array MyValues = (System.Array)MySheet.get_Range("A" +
index.ToString(), "B" + index.ToString()).Cells.Value;
EmpList.Add(new Employee
{
BatchID = MyValues.GetValue(1, 1).ToString(),
BatchName = MyValues.GetValue(1, 2).ToString()
});
}
for (int i = 0; i < EmpList.Count; i++)
{
int x=obj.USP_InsertBatches(EmpList[i].BatchID, EmpList[i].BatchName);
}
}
}
class Employee
{
public string BatchID;
public string BatchName;
}
This code is working perfectly the first time but next time it says that file is currently in use. So I thought of deleting the file at the end of code using the following line:
File.Delete(path);
But this line threw error:
HttpPostedFileBase does not contain definition for Delete
Also, if I don't write this line and try to execute code again it says that it can't save because a file exists with same name and could not be replaced because it is currently in use.
What should I do to get rid of this:
(File.Delete()) Error
Any other way of accessing the Excel file which I am receiving without saving will also be very helpful because I have to just access the data one time.
The File you use there is your variable that is the input parameter of your method. That parameter is of type HttpPostedFileBase and that type has no instance methods (nor static ones for that matter) that allow you to delete that File instance.
You are probably looking for the static Delete method on the File type that is in the System.IO namespace.
A quickfix would be to be explicit about which File you mean:
System.IO.File.Delete(path);
You might want to consider a different naming guideline for your variables, though. In C# we tend to start variable names with a lowercase letter, while almost all types in the framework start with an uppercase letter. That makes it easier to distinguish the variable file from the type File.
Note that a file can only be deleted once it has been closed by all processes and all file handles have been released by the file system. In your case you have to make sure Excel has closed the file and released its handles. If you have the search indexer or a rogue virus scanner running, you might have to try a few times before giving up.
I normally use this code:
// make sure here all Ole Automation servers (like Excel or Word)
// have closed the file (so close the workbook, document etc)
// we iterate a couple of times (10 in this case)
for(int i=0; i< 10; i++)
{
try
{
System.IO.File.Delete(path);
break;
} catch (Exception exc)
{
Trace.WriteLine(string.Format("failed delete {0}", exc.Message));
// let other threads do some work first
// http://blogs.msmvps.com/peterritchie/2007/04/26/thread-sleep-is-a-sign-of-a-poorly-designed-program/
Thread.Sleep(0);
}
}
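The same retry idea can be wrapped in a small helper. This is only a sketch: the name TryDeleteWithRetry, the default of 10 attempts and the 100 ms pause are assumptions for illustration, not values taken from the loop above (which yields with Thread.Sleep(0)).

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileCleanupSketch
{
    // Returns true once the file is gone, false if every attempt failed.
    public static bool TryDeleteWithRetry(string path, int attempts = 10, int delayMs = 100)
    {
        for (int i = 0; i < attempts; i++)
        {
            try
            {
                // File.Delete does not throw when the file is already gone.
                File.Delete(path);
                return true;
            }
            catch (IOException)
            {
                // Typically: the file is still open in another process (Excel, indexer, scanner).
            }
            catch (UnauthorizedAccessException)
            {
                // Can also be reported while another process holds the file.
            }

            // Give the other process a moment to release its handle before retrying.
            Thread.Sleep(delayMs);
        }
        return false;
    }
}
```

Returning a bool instead of swallowing the failure lets the caller decide what to do when the file is still locked after the last attempt.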
From what I can tell, you are opening Excel and reading the file, but never closing Excel.
Add:
MyApp.Workbooks.Close();
MyApp.Quit();
at the end of the Upload function. Even better, wrap all of the code you have in
try{
//here goes your current code
}
catch(Exception e)
{
//manage exception
}
finally
{
MyApp.Workbooks.Close();
MyApp.Quit();
}
Initialize MyApp outside the try/catch block; then, whatever happens, the finally block closes the file.

Code that can be used to iterate over every control on all reports in a Microsoft Access database?

Can anyone suggest C# code that can be used to iterate over every control on all reports in a Microsoft Access database? The reason for doing this is that I am converting reports from Microsoft Access to Reporting Services, and I want to find all reports in Access that have specific text in the Control Source property.
Currently I am using the Microsoft.Office.Interop.Access assemblies, but the code I am using is not working. This is because my Access API knowledge is limited.
private static void Main(string[] args)
{
OpenDatabase();
DisplayReportElements();
Console.ReadLine();
}
private static void OpenDatabase()
{
app = new Application();
app.OpenCurrentDatabase(@"database.mdb");
app.Visible = false;
//app.OpenCurrentDatabase(@"C:\DLDWorkspace\Truama\Skills Training.mdb");
}
public static void DisplayReportElements()
{
for (int i = 0; i < app.CurrentProject.AllReports.Count - 1; i++)
{
Report report = app.Reports[i];
foreach (Control control in report.Controls)
{
Console.WriteLine("{0} - {1}", report.FormName, control.Name);
ControlProperties(control);
}
}
}
The code above produces an exception with the message "The number you used to refer to the report is invalid." on the line Report report = app.Reports[i];. To get around this, I go through and open each report by calling app.DoCmd.OpenReport in a loop. There are two problems with this: 1. It takes over 12 hours to process 300 reports. 2. After about 300 (of 600) reports I get an "Index is out of bounds" error somewhere in DisplayReportElements.
To iterate through the reports and their controls from within an Access.Application your approach is correct. If you find that process to be too slow or otherwise troublesome then an alternative approach would be to dump all of the reports to text files using the Application.SaveAsText method ...
var app = new Application();
app.OpenCurrentDatabase(@"C:\Users\Public\Database1.accdb");
for (int i = 0; i < app.CurrentProject.AllReports.Count; i++)
{
string rptName = app.CurrentProject.AllReports[i].Name;
Console.WriteLine("Dumping [{0}] ...", rptName);
string fileSpec = @"C:\__tmp\ReportDump\" + rptName + ".txt";
app.SaveAsText(AcObjectType.acReport, rptName, fileSpec);
}
app.CloseCurrentDatabase();
app.Quit();
and then use your favorite text-searching tool to scan the files for lines that contain 'ControlSource =' followed by the string you want to find, e.g.,
ControlSource ="LastName"
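If you would rather stay in C# than reach for an external search tool, the dumped files can be scanned with a few lines of System.IO. This is a sketch under the assumption that the dump folder holds the .txt files written by the loop above. The class name, the method name and the case-insensitive match are choices made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReportDumpSearchSketch
{
    // Returns one "file:line: text" entry for every ControlSource line that contains searchText.
    public static List<string> FindControlSources(string dumpFolder, string searchText)
    {
        var hits = new List<string>();
        foreach (string file in Directory.EnumerateFiles(dumpFolder, "*.txt"))
        {
            int lineNumber = 0;
            foreach (string line in File.ReadLines(file))
            {
                lineNumber++;
                string trimmed = line.TrimStart();
                if (trimmed.StartsWith("ControlSource", StringComparison.OrdinalIgnoreCase)
                    && trimmed.IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    hits.Add(Path.GetFileName(file) + ":" + lineNumber + ": " + trimmed);
                }
            }
        }
        return hits;
    }
}
```

Calling FindControlSources(@"C:\__tmp\ReportDump", "LastName") would then list each report file and line whose ControlSource mentions LastName.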
