Pictures in DOC file - c#

I have a file in DOC format (MS Word 97-2003) and I want to get a list of all images used in it. I tried the Microsoft.Office.Interop.Word namespace, as in the code below:
Application application = new Application();
Document document = application.Documents.Open(dataPath);
var words = document.InlineShapes;
int count = words.Count;
for (int i = 0; i < count; i++)
{
    if (words[i] != null)
    {
        Console.WriteLine("{0} : {1}", i, words[i].PictureFormat);
    }
}
but I cannot find any images in this file (in reality it contains two). Am I doing something wrong? Could you recommend a library that would make this easier? I can't convert the file to DOCX.

Use document.InlineShapes to grab the images.

It may sound odd, but I think the numbering in this collection starts at 1. That is why you get the COMException "Element doesn't exists in collection".
Try:
for (int i = 1; i <= count; i++)
{
    if (words[i] != null)
    {
        Console.WriteLine("{0} : {1}", i, words[i].PictureFormat);
    }
}
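
For context, the 1-based loop could sit in a complete console program along these lines. This is only a sketch: it assumes Word and its interop assembly are installed, the fallback path is a placeholder, and printing Type, Width and Height is an illustrative choice rather than part of the answer above. If the count still comes back as zero, it may be worth checking document.Shapes as well, since pictures with text wrapping (floating pictures) are exposed there rather than in document.InlineShapes.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

class ListInlinePictures
{
    static void Main(string[] args)
    {
        // Placeholder path; pass the real .doc path as the first argument.
        string dataPath = args.Length > 0 ? args[0] : @"C:\temp\sample.doc";

        var application = new Word.Application();
        try
        {
            Word.Document document = application.Documents.Open(dataPath, ReadOnly: true);
            Word.InlineShapes shapes = document.InlineShapes;

            // Word collections are 1-based: valid indexes are 1..Count.
            for (int i = 1; i <= shapes.Count; i++)
            {
                Word.InlineShape shape = shapes[i];
                Console.WriteLine("{0} : {1} ({2} x {3} pt)",
                    i, shape.Type, shape.Width, shape.Height);
            }
        }
        finally
        {
            // Quit without saving so no Word process is left running.
            application.Quit(SaveChanges: false);
        }
    }
}
```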

Related

Split PDF by chapters from Table Of Contents

I'm using GemBox.Pdf and I need to extract the individual chapters of a PDF file as separate PDF files.
The first page (maybe the second page as well) contains the TOC (Table of Contents), and I need to split the rest of the PDF pages based on it.
Also, the split PDF documents should be named after the chapters they contain.
I can split the PDF by a fixed number of pages per document (I figured that out using this example):
using (var source = PdfDocument.Load("Chapters.pdf"))
{
    int pagesPerSplit = 3;
    int count = source.Pages.Count;
    for (int index = 1; index < count; index += pagesPerSplit)
    {
        using (var destination = new PdfDocument())
        {
            for (int splitIndex = 0; splitIndex < pagesPerSplit; splitIndex++)
                destination.Pages.AddClone(source.Pages[index + splitIndex]);
            destination.Save("Chapter " + index + ".pdf");
        }
    }
}
But I can't figure out how to read and process the TOC and split the chapters based on its items.

You should iterate through the document's bookmarks (outlines) and split it based on the bookmark destination pages.
For instance, try this:
// Requires: using System.Collections.Generic; using System.Linq; using GemBox.Pdf;
using (var source = PdfDocument.Load("Chapters.pdf"))
{
    PdfOutlineCollection outlines = source.Outlines;
    PdfPages pages = source.Pages;

    // Map each page to its zero-based index so a bookmark's page can be located.
    Dictionary<PdfPage, int> pageIndexes = pages
        .Select((page, index) => new { page, index })
        .ToDictionary(item => item.page, item => item.index);

    for (int index = 0, count = outlines.Count; index < count; ++index)
    {
        PdfOutline outline = outlines[index];
        PdfOutline nextOutline = index + 1 < count ? outlines[index + 1] : null;

        // A chapter runs from its own bookmark page up to the next bookmark's page
        // (or to the end of the document for the last chapter).
        int pageStartIndex = pageIndexes[outline.Destination.Page];
        int pageEndIndex = nextOutline != null ?
            pageIndexes[nextOutline.Destination.Page] :
            pages.Count;

        using (var destination = new PdfDocument())
        {
            while (pageStartIndex < pageEndIndex)
            {
                destination.Pages.AddClone(pages[pageStartIndex]);
                ++pageStartIndex;
            }
            destination.Save($"{outline.Title}.pdf");
        }
    }
}
Note that, judging from the screenshot, your chapter bookmarks include the chapter's ordinal number (in Roman numerals). If needed, you can remove it with something like this:
destination.Save($"{outline.Title.Substring(outline.Title.IndexOf(' ') + 1)}.pdf");
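
To see what that Substring call does in isolation, here is a small sketch that needs no PDF library. The class name, the sample titles and the extra ToFileName helper (which swaps out characters the operating system does not allow in file names) are assumptions added for illustration, not part of GemBox.Pdf or of the answer above.

```csharp
using System;
using System.IO;

static class ChapterFileNames
{
    // Drops everything up to and including the first space (the leading numeral).
    // If the title has no space, IndexOf returns -1 and the title is kept whole.
    public static string StripOrdinal(string title)
    {
        return title.Substring(title.IndexOf(' ') + 1);
    }

    // Replaces characters that the current OS does not allow in file names
    // with underscores and appends the .pdf extension.
    public static string ToFileName(string title)
    {
        return string.Join("_", title.Split(Path.GetInvalidFileNameChars())) + ".pdf";
    }

    static void Main()
    {
        // Made-up bookmark titles.
        Console.WriteLine(ToFileName(StripOrdinal("IV. Results/Discussion")));
        Console.WriteLine(ToFileName(StripOrdinal("Appendix")));
    }
}
```

Sanitizing the title is worth considering because a bookmark title is free text, and a character such as "/" would otherwise make Save fail or write to an unexpected location.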

C# Cannot use Streamwriter on a txt file in C# Properties.Resources

I am currently working on a school assignment in which I am trying to write a 2D string array to a text file. I have the array and know it's working fine; however, every time I pass the file to StreamWriter I get "System.ArgumentException: 'Illegal characters in path.'". I am relatively new to C# and have no idea how to fix this.
This is my code. I just need to know how to write my 2D array to the text file without getting this error. Thanks, any help is much appreciated!
// The line below is where the error happens
using (var sw = new StreamWriter(Harvey_Norman.Properties.Resources.InventoryList))
{
    for (int i = 0; i < 4; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            sw.Write(InventoryArray[i, j] + " ");
        }
        sw.Write("\n");
    }
    sw.Flush();
    sw.Close();
}

My guess is that Harvey_Norman.Properties.Resources.InventoryList is a resource in your project that is typed as a string, and the value of that string is not a valid path for your operating system (for a text file added as a resource, the property typically returns the file's contents rather than its path).
StreamWriter will either take a string, in which case it expects to open a file at the path given by that string, or it will take a stream, and you can write to that stream. It looks like you are trying to do the former, so you need to check the value of that resource to see whether it is a valid path.
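
The two constructor forms can be compared side by side in a short sketch. The sample data, the temp-folder file name and the WriteGrid helper are made up for illustration; only the StreamWriter(string) and StreamWriter(Stream) constructors come from the explanation above.

```csharp
using System;
using System.IO;
using System.Text;

static class InventoryWriterDemo
{
    // Writes one line per row, with cells separated by a single space.
    public static void WriteGrid(TextWriter writer, string[,] grid)
    {
        for (int i = 0; i < grid.GetLength(0); i++)
        {
            for (int j = 0; j < grid.GetLength(1); j++)
            {
                if (j > 0) writer.Write(" ");
                writer.Write(grid[i, j]);
            }
            writer.WriteLine();
        }
    }

    static void Main()
    {
        // Made-up sample data standing in for InventoryArray.
        string[,] inventory = { { "TV", "10", "499" }, { "Fridge", "3", "899" } };

        // A string argument is treated as a file path, so it must be a valid one.
        string path = Path.Combine(Path.GetTempPath(), "InventoryList.txt");
        using (var fileWriter = new StreamWriter(path))
        {
            WriteGrid(fileWriter, inventory);
        }
        Console.Write(File.ReadAllText(path));

        // A Stream argument is written to directly; no path is involved.
        var memory = new MemoryStream();
        using (var streamWriter = new StreamWriter(memory))
        {
            WriteGrid(streamWriter, inventory);
        }
        Console.Write(Encoding.UTF8.GetString(memory.ToArray()));
    }
}
```

Using GetLength instead of the hard-coded 4 and 3 keeps the loops in step with the array if its size changes.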

You're trying to construct a StreamWriter with an invalid file path.
Also, if you're just writing text out, you can use File.CreateText() to create a StreamWriter, for example:
// Requires: using System.IO;
var tempFilePath = Path.GetTempFileName();
using (var writer = File.CreateText(tempFilePath))
{
    for (int i = 0; i < 4; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            // Separate cells on the same row with a space (Write, not WriteLine).
            if (j > 0)
                writer.Write(" ");
            writer.Write(InventoryArray[i, j]);
        }
        // End the row.
        writer.WriteLine();
    }
}
The using statement automatically flushes and closes the file and disposes the StreamWriter.

trouble reading and writing to a file c#

I am trying to take a file of words that are not in alphabetical order, re-order the words alphabetically (using a sort I write myself rather than a built-in method), and then write the ordered list to a new txt file (one that must be created). For example, let's say the txt file contains only five words: "dog bat apple rabbit cat". I want the program to sort these into alphabetical order and then create a txt file that saves that order. Right now the program iterates through the txt file but does not save the re-ordered list to the new file. What is saved in the new file is this: "System.Collections.Generic.List`1[System.String]"
Truth be told, I am not very savvy with C# yet, so I apologize if my structure or code is not very good. The original, unordered file is called "jumbled english FILTERED.ALL.txt", and the file I am trying to write to is called "english FILTERED.ALL.txt".
static void Main(string[] args)
{
    // declaring integer for minimum.
    int min = 0;
    // declare the list for the original file
    List<string> LinuxWords = new List<string>();
    List<string> lexicalOrder = new List<string>();
    // read the text from the file
    string[] lines = System.IO.File.ReadAllLines("jumbled english FILTERED.ALL.txt");
    string line = string.Empty;
    // separate each word into a string
    //foreach (string line in lines)
    //{
    //add each word into the list.
    //LinuxWords.Add(line);
    //}
    for (int i = 0; i < lines.Length - 1; i++)
    {
        for (int j = i + 1; j < lines.Length; j++)
        {
            if (lines[i].Length < lines[j].Length)
            {
                min = lines[i].Length;
            }
            else
            {
                min = lines[j].Length;
            }
            for (int k = 0; k < min; k++)
            {
                if (lines[i][k] > lines[j][k])
                {
                    line = lines[i].ToString();
                    lines[i] = lines[j];
                    lines[j] = line;
                    break;
                }
                else if (lines[i][k] == lines[j][k])
                {
                    continue;
                }
                else
                {
                    break;
                }
            }
        }
    }
    for (int i = 0; i < lines.Length; i++)
    {
        Console.WriteLine("The program is formatting the correct order");
        lexicalOrder.Add(lines[i]);
    }
    //lexicalOrder.ForEach(Console.WriteLine);
    //}
    //LinuxWords.ForEach(Console.WriteLine);
    File.WriteAllText(AppDomain.CurrentDomain.BaseDirectory + "english FILTERED.ALL.txt",
        lexicalOrder.ToString());
    // write the ordered list back into another .txt file named "english FILTERED.ALL.txt"
    // System.IO.File.WriteAllLines("english FILTERED.ALL.txt", lexicalOrder);
    Console.WriteLine("Finished");
}

Assuming you mean that the list doesn't get saved (if that's not the problem, please be more specific), you need to change
lexicalOrder.ToString()
to something like
lexicalOrder.Aggregate((s1, s2) => s1 + " " + s2)
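
A minimal, self-contained sketch of the difference follows. It is not the asker's program: the selection sort, the sample words and the temp-folder file name are assumptions for illustration. string.Join is shown as an alternative to Aggregate that also copes with an empty list, and File.WriteAllLines (already present, commented out, in the question) writes one word per line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class WordListDemo
{
    // Hand-written selection sort using ordinal (character code) comparison.
    public static void SortOrdinal(string[] words)
    {
        for (int i = 0; i < words.Length - 1; i++)
        {
            int smallest = i;
            for (int j = i + 1; j < words.Length; j++)
            {
                if (string.CompareOrdinal(words[j], words[smallest]) < 0)
                    smallest = j;
            }
            string temp = words[i];
            words[i] = words[smallest];
            words[smallest] = temp;
        }
    }

    static void Main()
    {
        string[] lines = { "dog", "bat", "apple", "rabbit", "cat" };
        SortOrdinal(lines);
        var lexicalOrder = new List<string>(lines);

        // ToString() on a List<string> returns the type name, not the items.
        Console.WriteLine(lexicalOrder.ToString());

        // string.Join produces the items separated by a space.
        Console.WriteLine(string.Join(" ", lexicalOrder));

        // WriteAllLines writes one item per line; the file name is a placeholder.
        string path = Path.Combine(Path.GetTempPath(), "ordered-words.txt");
        File.WriteAllLines(path, lexicalOrder);
        Console.WriteLine(File.ReadAllLines(path).Length);
    }
}
```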

can not correctly insert text into a bookmark from another bookmark

I'm writing a Windows Forms application that must exchange the content of Word bookmarks between two documents.
There are two similar documents (wordDocument and wordPattern) with a similar number of bookmarks. I'm trying this:
for (int i = 1; i <= wordDocument.Bookmarks.Count; i++)
{
    object j = i;
    wordDocument.Bookmarks.get_Item(ref j).Range.Text = wordPattern.Bookmarks.get_Item(ref j).Range.Text.ToString();
    //MessageBox.Show(wordDocument.Bookmarks[i].Range.Text);
    //MessageBox.Show(wordPattern.Bookmarks[i].Range.Text);
}
But it does the task incorrectly: it processes the bookmarks in the wrong order and deletes them. Please help me find the right way to exchange the text inside the bookmarks.

// Collect the ranges first, then copy their text (Word is an alias for Microsoft.Office.Interop.Word).
var listOfRanges = new List<Word.Range>();
int count1 = 0;
int count2 = 0;
foreach (Word.Bookmark bookmark1 in wordDocument.Bookmarks)
{
    Word.Range bmRange = bookmark1.Range;
    //bmRange.Text = "note" + count1;
    listOfRanges.Add(bmRange);
    count1++;
}
foreach (Word.Bookmark bookmark2 in wordPattern.Bookmarks)
{
    Word.Range mbRange = bookmark2.Range;
    mbRange.Text = listOfRanges[count2].Text;
    count2++;
}
That is how I solved it.
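
A possible variation, offered only as a sketch: writing to a bookmark's Range.Text removes the bookmark, and pairing bookmarks by position depends on the collection order, so this version pairs them by name and re-adds each bookmark after the write. The class and method names are hypothetical, the code assumes both documents use the same bookmark names, and it has not been run against the asker's documents.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class BookmarkCopy
{
    // Copies the text of each bookmark in 'source' into the bookmark with the
    // same name in 'target', then re-adds the bookmark that the write removed.
    public static void CopyByName(Word.Document source, Word.Document target)
    {
        foreach (Word.Bookmark sourceBookmark in source.Bookmarks)
        {
            string name = sourceBookmark.Name;
            if (!target.Bookmarks.Exists(name))
            {
                continue; // no counterpart in the target document
            }

            Word.Range range = target.Bookmarks[name].Range;
            range.Text = sourceBookmark.Range.Text; // this removes the bookmark
            target.Bookmarks.Add(name, range);      // restore it over the new text
        }
    }
}
```

With the names from the answer above, the call would be BookmarkCopy.CopyByName(wordDocument, wordPattern).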

The requested member of the collection does not exist, MS Word

I tried to run a sample program from dotnetpearls.com, and at first the program didn't work at all.
Apparently I had to run VS Express 2012 as Administrator before I could create an Application object. After that, the next error occurs when I try to print the text from the document. It happens at string text = doc.Words[i].Text;
using System;
using Microsoft.Office.Interop.Word;
namespace WordTestProgram
{
    class Program
    {
        static void Main(string[] args)
        {
            Application app = new Application();
            Document doc = app.Documents.Open("C:\\word.doc");
            int count = doc.Words.Count;
            for (int i = 0; i <= count; i++)
            {
                string text = doc.Words[i].Text;
                Console.WriteLine("Word {0} = {1}",i,text);
            }
            app.Quit();
        }
    }
}
I know for a fact that the document I am trying to extract data from contains 3 words and 3 spaces, so it's not empty.

I found the answer myself.
Instead of: int i = 0; i <= count; i++
I should use: int i = 1; i <= count; i++
Word collections are 1-based, so there is no member 0, and requesting it raises this error.
