iTextSharp to Word

I am using iTextSharp in my MVC application to create a .pdf file, but is there a way to convert it to a .doc file?
public ActionResult Download(int? Id)
{
string FileName = "Test";
var FilePath = Path.Combine(Path.GetTempPath(), "Temp.pdf");
Document UserPDF = new Document();
PdfWriter.GetInstance(UserPDF, new FileStream(FilePath, FileMode.Create));
CreateCv(UserPDF); // This is where the PDF is created
var fs = new FileStream(FilePath, FileMode.Open);
var Bytes = new byte[fs.Length];
fs.Read(Bytes, 0, (int)fs.Length);
fs.Close();
return File(Bytes, "application/pdf", FileName + ".pdf");
}

To put it simply, no. There is no way to convert it to a DOC file using iTextSharp. It only supports the reading and generating of PDF files.
Converting PDF to DOC is possible with some third-party libraries, but at the moment there is no reliable way of doing it while preserving formatting and whitespace. If you search the web, people suggest creating an image from the PDF and then appending the image to the end of a blank DOC file (which is a major hack).

Related

Extract embedded package files from word document using open xml?

I am trying to extract the files embedded in a Word document. It contains embedded files (Word, Excel, package). I am not able to extract the package and save it using C# and Open XML.
The code below extracts only the Word and Excel files, but not the package.
using (WordprocessingDocument document = WordprocessingDocument.Open(fileName, false))
{
    foreach (EmbeddedPackagePart pkgPart in document.MainDocumentPart.GetPartsOfType<EmbeddedPackagePart>())
    {
        if (pkgPart.Uri.ToString().StartsWith(embeddingPartString))
        {
            string fileName1 = pkgPart.Uri.ToString().Remove(0, embeddingPartString.Length);
            // get the stream from the part
            System.IO.Stream partStream = pkgPart.GetStream();
            string filePath = "d:\\test\\" + fileName1;
            // write the stream to the file
            System.IO.FileStream writeStream = new System.IO.FileStream(filePath, FileMode.Create, FileAccess.Write);
            ReadWriteStream(pkgPart.GetStream(), writeStream);
        }
    }
}

The issue you're having is that when you go to MainDocumentPart.Parts and start searching, what you get are things like ImagePart, ChartPart, etc., and a ChartPart might have its own embedded part, which could be the Excel or Word file you are looking for.
In short, you need to extend your search for embedded parts to the parts contained in the main document.
If I just wanted to extract all embedded parts from one of the files in my own project, I would go about it like this:
// requires: using System.IO; using System.Linq; using DocumentFormat.OpenXml.Packaging;
using (var document = WordprocessingDocument.Open(@"C:\Test\myTestDocument.docx", false))
{
    // just grab all the parts; it might be worth being a bit more clever about it,
    // depending on the sizes of the files and how many files you want to search through
    foreach (var part in document.MainDocumentPart.Parts)
    {
        // for each part, see if that part contains an EmbeddedPackagePart
        var testForEmbedding = part.OpenXmlPart.GetPartsOfType<EmbeddedPackagePart>();
        foreach (EmbeddedPackagePart embedding in testForEmbedding)
        {
            // you should probably insert some clever naming scheme here
            string fileName = embedding.Uri.OriginalString.Split('/').Last();
            // stream the EmbeddedPackagePart to a file
            using (FileStream myFile = File.Create(@"C:\test\" + fileName))
            using (var stream = embedding.GetStream())
            {
                stream.Seek(0, SeekOrigin.Begin);
                stream.CopyTo(myFile);
            }
        }
    }
}
I hope this helps!

NPOI - After saving to file corrupts .xlsx workbook

I used this code to write to an existing Excel file. After writing the file, when I open it manually it is corrupted. I am using NPOI binary 2.3.0.0. Please tell me how to avoid the Excel file getting corrupted.
[Authorize]
public void ExportUsers()
{
var path = Server.MapPath(@"~\Content\ExportTemplates\") + "sample.xlsx";
FileStream sw = new FileStream(path, FileMode.Open, FileAccess.Read);
IWorkbook workbook = WorkbookFactory.Create(sw);
ISheet sheet = workbook.GetSheetAt(0);
IRow row = sheet.GetRow(12);
ICell cell = row.CreateCell(row.LastCellNum);
cell.SetCellValue("test");
workbook.CreateSheet("Ripon");
sw.Close();
using (var exportData = new MemoryStream())
{
workbook.Write(exportData);
string saveAsFileName = string.Format("Export-{0:d}.xls", DateTime.Now).Replace("/", "-");
System.Web.HttpContext.Current.Response.ContentType = "application/vnd.ms-excel";
System.Web.HttpContext.Current.Response.AddHeader("Content-Disposition", string.Format("attachment;filename={0}", saveAsFileName));
System.Web.HttpContext.Current.Response.Clear();
System.Web.HttpContext.Current.Response.BinaryWrite(exportData.GetBuffer());
System.Web.HttpContext.Current.Response.End();
}
}
A new file is created, but it is corrupted. I've seen people say this was fixed in version 2.0.6, but it is still not working for me.

There are several problems going on here.
First, you are starting with an .xlsx file but then changing the download file extension to .xls. .xls and .xlsx are not the same file format; the former is a binary format, while the latter is a zipped XML format. If you save the file with the wrong extension, then Excel will report the file as corrupted when it is opened.
Second, you are using the wrong method to get the data from the MemoryStream. GetBuffer will return the entire allocated internal buffer array, which will include any unused bytes that are beyond the end of the data if the buffer is not completely full. The extra null bytes will cause the downloaded file to be corrupted. If you want to get just the data in the buffer then you should use ToArray instead.
Third, it looks like you are using the ASP.NET MVC framework (based on the presence of the [Authorize] attribute on your method) but you are directly manipulating the current response instead of using the controller's built-in File method for returning a downloadable file. I would recommend using the built-in methods where possible, as it will make your code much cleaner.
Here is the corrected code:
[Authorize]
public ActionResult ExportUsers()
{
    var path = Server.MapPath(@"~\Content\ExportTemplates\") + "sample.xlsx";
    FileStream sw = new FileStream(path, FileMode.Open, FileAccess.Read);
    IWorkbook workbook = WorkbookFactory.Create(sw);
    ISheet sheet = workbook.GetSheetAt(0);
    IRow row = sheet.GetRow(12);
    ICell cell = row.CreateCell(row.LastCellNum);
    cell.SetCellValue("test");
    workbook.CreateSheet("Ripon");
    sw.Close();
    var exportData = new MemoryStream();
    workbook.Write(exportData);
    exportData.Seek(0, SeekOrigin.Begin); // reset stream to beginning so it can be read
    string saveAsFileName = string.Format("Export-{0:d}.xlsx", DateTime.Now).Replace("/", "-");
    return File(exportData, "application/vnd.ms-excel", saveAsFileName);
}
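To illustrate the second point, here is a minimal sketch that uses only System.IO and involves no NPOI. The class and method names (BufferDemo, Raw, Exact) are made up for this illustration. It writes a few bytes to a MemoryStream and compares what GetBuffer and ToArray hand back.

```csharp
using System;
using System.IO;

// Hypothetical demo class; not part of NPOI or ASP.NET.
static class BufferDemo
{
    // Returns the stream's whole internal buffer, which may be larger than the data written.
    public static byte[] Raw(byte[] payload)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(payload, 0, payload.Length);
            return ms.GetBuffer();
        }
    }

    // Returns a copy containing only the bytes that were actually written.
    public static byte[] Exact(byte[] payload)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(payload, 0, payload.Length);
            return ms.ToArray();
        }
    }

    static void Main()
    {
        byte[] payload = { 1, 2, 3 };
        Console.WriteLine("Bytes written:    " + payload.Length);
        Console.WriteLine("GetBuffer length: " + Raw(payload).Length);
        Console.WriteLine("ToArray length:   " + Exact(payload).Length);
    }
}
```

The ToArray result always has the length of the data written, while the GetBuffer result has the length of the stream's current capacity, so any padding at the end would be sent to the browser as part of the file.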

Generate a pdf without saving it to disk then show it to browser

Is it possible to merge PDF files without saving them to disk?
I have a generated PDF (via iTextSharp) and a physical PDF file. The two should be shown in the browser as one merged document.
What I currently have is (pseudocode):
public ActionResult Index()
{
    // Generate the dynamic PDF first
    var pdf = GeneratePdf();
    // Then save it to disk for retrieval later
    SaveToDisc(pdf);
    // Retrieve the static PDF
    var staticPdf = GetStaticPdf();
    // Retrieve the generated PDF that was made earlier
    var generatedPdf = GetGeneratedPdf("someGeneratedFile.pdf");
    // This creates the merged PDF
    MergePdf(new List<string> { generatedPdf, staticPdf }, "mergedPdf.pdf");
    // Now retrieve the merged PDF and show it in the browser
    var mergedPdf = GetMergedPdf("mergedPdf.pdf");
    return new FileStreamResult(mergedPdf, "application/pdf");
}
This works, but I was wondering whether it would be possible to merge the PDFs and show the result in the browser without saving anything to disk.
Any help would be much appreciated. Thanks

You can try to use the PdfWriter class with a MemoryStream, like this:
using (MemoryStream ms = new MemoryStream())
{
    Document doc = new Document(PageSize.A4, 60, 60, 10, 10);
    PdfWriter pw = PdfWriter.GetInstance(doc, ms);
    doc.Open();
    // your code to write something to the PDF
    doc.Close(); // finishes the PDF; the bytes are still available through ToArray
    return ms.ToArray();
}
You can also refer to: Creating a PDF from a RDLC Report in the Background
Additionally: if data is the PDF in memory, then you need to call data.CopyTo(base.Response.OutputStream);

Try this:
string path = Server.MapPath("Yourpdf.pdf");
WebClient client = new WebClient();
Byte[] buffer = client.DownloadData(path);
if (buffer != null)
{
Response.ContentType = "application/pdf";
Response.AddHeader("content-length", buffer.Length.ToString());
Response.BinaryWrite(buffer);
}

Drawing on PDF using ITextSharp, without creating a new PDF

I am trying to draw simple shapes (rectangles, circles, ...) on an existing PDF using iTextSharp, without having to create a new PDF. I found a post that discusses this issue (itextsharp modify existing pdf (no new source pdf) and add watermark) and I would like to know if anybody could tell me more about it.
My aim is to modify a PDF by adding a circle to it. The current solution involves creating a new PDF with iTextSharp. Is it possible to add a circle to a PDF without creating a new one?
Thank you.
J.

You can't read a file and write to it simultaneously. Think of how Word works: you can't open a Word document and write directly to it. Word always creates a temporary file, writes the changes to it, then replaces the original file with it and then throws away the temporary file.
You can do that too:
read the original file with PdfReader,
create a temporary file for PdfStamper, and when you're done,
replace the original file with the temporary file.
Or:
read the original file into a byte[],
create PdfReader with this byte[], and
use the path to the original file for PdfStamper.
This second option is more dangerous, as you'll lose the original file if you do something that causes an exception in PdfStamper.
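The first option can be sketched without any iText calls, using only System.IO. This is a rough outline under stated assumptions: SafeRewrite, RewriteInPlace and the transform delegate are hypothetical names, and transform stands in for whatever PdfReader/PdfStamper work you do between the two streams.

```csharp
using System;
using System.IO;

// Hypothetical helper; 'transform' is a placeholder for the PdfReader/PdfStamper step.
static class SafeRewrite
{
    public static void RewriteInPlace(string path, Action<Stream, Stream> transform)
    {
        // Keep the temporary file next to the original so both are on the same volume.
        string tempPath = path + ".tmp";
        try
        {
            using (var input = new FileStream(path, FileMode.Open, FileAccess.Read))
            using (var output = new FileStream(tempPath, FileMode.Create, FileAccess.Write))
            {
                transform(input, output);
            }
            // Swap the files only after writing succeeded; the old content is kept as a backup.
            File.Replace(tempPath, path, path + ".bak");
        }
        catch
        {
            // The original file is untouched if anything went wrong; remove the leftover temp file.
            if (File.Exists(tempPath)) File.Delete(tempPath);
            throw;
        }
    }
}
```

A call could look like SafeRewrite.RewriteInPlace(@"C:\temp\hello.pdf", (input, output) => input.CopyTo(output)); where the path is only an example and the lambda would be replaced by the actual stamping code.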
As for adding content with PdfStamper, please take a look at the section entitled "Manipulating existing PDFs" in the free ebook The Best iText Questions on StackOverflow. You'll find questions such as:
How to add a watermark to a PDF file?
How do I insert a hyperlink to another page with iTextSharp in an existing PDF?
iText - How to stamp image on existing PDF and create an anchor
...
All of these examples add content by creating a PdfContentByte instance like this:
PdfContentByte canvas = stamper.GetOverContent(pagenumber);
It's this canvas you need to use when drawing a circle on the page with page number pagenumber. It is important that you use the correct coordinates when you do this. That's explained here: How to position text relative to page using iText?
Update:
Json posted the following code in the comments:
string oldFile = @"C:\Users\ae40394\Desktop\hello.pdf";
string newFile = @"C:\Users\ae40394\Desktop\NEW.pdf";
// creating a reader with the original PDF
PdfReader reader = new PdfReader(oldFile);
Rectangle rect = reader.GetPageSize(1);
FileStream fs = new FileStream(newFile, FileMode.Create);
using (PdfStamper stamper = new PdfStamper(reader, fs)) {
    // modify the pdf content
    PdfContentByte cb = stamper.GetOverContent(1);
    cb.SetColorStroke(iTextSharp.text.BaseColor.GREEN);
    cb.SetLineWidth(5f);
    cb.Circle(rect.GetLeft() + 30, rect.GetBottom() + 30, 20f);
    cb.Stroke();
}
reader.Close();
// replace the original with the new file, keeping a backup of the original
File.Replace(newFile, oldFile, oldFile + ".bac");
I slightly adapted the code, because:
There is no need for a Document object,
The stamper is closed when using is closed,
When the stamper is closed, so is the FileStream
the coordinates of the circle were hard coded. I used the page size to make sure they are made relative to the origin of the coordinate system, although to be sure, you may also want to check if there's a Crop Box.

You CAN read a file and write to it simultaneously.
Here is an example:
private void button4_Click(object sender, EventArgs e)
{
using (PdfReader pdfReader = new PdfReader(new FileStream(pdfInput, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
using (PdfStamper pdfStamper = new PdfStamper(pdfReader, new FileStream(pdfInput, FileMode.Open, FileAccess.Write, FileShare.None)))
{
PdfContentByte canvas = pdfStamper.GetUnderContent(1);
canvas.SetColorFill(BaseColor.YELLOW);
canvas.Rectangle(36, 786, 66, 16);
canvas.Fill();
}
}
// Show/open the PDF file afterwards
System.Diagnostics.Process.Start(pdfInput);
}

string oldFile = @"C:\...6166-21.pdf";
string newFile = @"C:\...NEW.pdf";
// open the reader
PdfReader reader = new PdfReader(oldFile);
Rectangle size = reader.GetPageSizeWithRotation(1);
Document document = new Document(size);
FileStream fs = new FileStream(newFile, FileMode.Create, FileAccess.Write);
PdfWriter writer = PdfWriter.GetInstance(document, fs);
document.Open();
// the pdf content
PdfContentByte cb = writer.DirectContent;
cb.SetColorStroke(iTextSharp.text.BaseColor.GREEN);
cb.Circle(150f, 150f, 50f);
cb.Stroke();
// create the new page and add it to the pdf
PdfImportedPage page = writer.GetImportedPage(reader, 1);
cb.AddTemplate(page, 0, 0);
// close the streams and voilà, the file should be changed :)
document.Close();
fs.Close();
writer.Close();
reader.Close();

Load XPS to documentviewer from embedded resource

I am trying to build a help system for my application. I have XPS documents which I load into a DocumentViewer. These files are embedded in a resource file.
I am able to access them as byte arrays.
For example,
Properties.Resources.help_sudoku_methods_2
returns byte[].
However, DocumentViewer can't read that directly and requires a FixedDocumentSequence.
So I create a MemoryStream from the byte array, then an XpsDocument, and then a FixedDocumentSequence, like this:
private void loadDocument(byte[] sourceXPS)
{
MemoryStream ms = new MemoryStream(sourceXPS);
const string memoryName = "memorystream://ms.xps";
Uri memoryUri = new Uri(memoryName);
try
{
PackageStore.RemovePackage(memoryUri);
}
catch (Exception)
{ }
Package package = Package.Open(ms);
PackageStore.AddPackage(memoryUri, package);
XpsDocument xps = new XpsDocument(package, CompressionOption.SuperFast, memoryName);
FixedDocumentSequence fixedDocumentSequence = xps.GetFixedDocumentSequence();
doc.Document = fixedDocumentSequence;
}
This is a very unclean approach, and it also doesn't work if the files contain images: instead of the new document's images, it displays the images from the first loaded document.
Is there a cleaner way to load XPS from embedded resources into a DocumentViewer? Or do I need something like copying the file from resources to the application directory and loading it from there rather than from a MemoryStream? Thank you.

Why don't you write the file to the system temp folder and then read it from there?
// Note: manifest resource names are usually prefixed with the project's default namespace
string tempFile = Path.Combine(Path.GetTempPath(), "file1.xps");
using (Stream readStream = System.Reflection.Assembly.GetExecutingAssembly().GetManifestResourceStream("file1.xps"))
using (FileStream writeStream = new FileStream(tempFile, FileMode.Create, FileAccess.Write))
{
    readStream.CopyTo(writeStream);
}
// Read tempFile INTO memory here and then
File.Delete(tempFile);
