Converting contenteditable content to PDF - c#

I would like to know the most efficient way to convert contenteditable content (something the user types in) to a PDF. Here is an illustration of what I mean:
[Illustrations 1-3 from the original post]
I would also like to know how to convert CSS styling, since jsPDF doesn't support this (to my knowledge).

jsPDF doesn't support most of the features you need. I suggest building a small server-side application to do the conversion.
My background is C#, so here is a C# example:
Program.cs
using HtmlToPdf.Models;
namespace HtmlToPdf.Console
{
public class Program
{
public static void Main(string[] args)
{
var model = new HtmlToPdfModel();
model.HTML = "<h3>Hello world!</h3>";
model.CSS = "h3{color:#f00;}";
HtmlToPdf.Convert(model);
}
}
}
HtmlToPdfModel.cs
namespace HtmlToPdf.Models
{
public class HtmlToPdfModel
{
public string HTML { get; set; }
public string CSS { get; set; }
public string OutputPath { get; set; }
public string FontName { get; set; }
public string FontPath { get; set; }
}
}
HtmlToPdf.cs
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using HtmlToPdf.Models;
using System;
using System.IO;
using System.Text;
namespace HtmlToPdf.Console
{
public class HtmlToPdf
{
public static void Convert(HtmlToPdfModel model)
{
try
{
if (model == null) return;
Byte[] bytes;
//Boilerplate iTextSharp setup here
//Create a stream that we can write to, in this case a MemoryStream
using (var stream = new MemoryStream())
{
//Create an iTextSharp Document which is an abstraction of a PDF but **NOT** a PDF
using (var doc = new Document())
{
//Create a writer that's bound to our PDF abstraction and our stream
using (var writer = PdfWriter.GetInstance(doc, stream))
{
//Open the document for writing
doc.Open();
//In order to read CSS as a string we need to switch to a different constructor
//that takes Streams instead of TextReaders.
//Below we convert the strings into UTF8 byte array and wrap those in MemoryStreams
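//Fall back to an empty stylesheet when no CSS was supplied, since GetBytes(null) throws
using (var cssStream = new MemoryStream(Encoding.UTF8.GetBytes(model.CSS ?? string.Empty)))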
{
using (var htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(model.HTML)))
{
var fontProvider = new XMLWorkerFontProvider();
if (!string.IsNullOrEmpty(model.FontPath) && !string.IsNullOrEmpty(model.FontName))
{
fontProvider.Register(model.FontPath, model.FontName);
//Parse the HTML with css font-family
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, htmlStream, cssStream, Encoding.UTF8, fontProvider);
}
else
{
//Parse the HTML without css font-family
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, htmlStream, cssStream);
}
}
}
doc.Close();
}
}
//After all of the PDF "stuff" above is done and closed but **before** we
//close the MemoryStream, grab all of the active bytes from the stream
bytes = stream.ToArray();
}
//Now we just need to do something with those bytes.
//Here I'm writing them to disk but if you were in ASP.Net you might Response.BinaryWrite() them.
//You could also write the bytes to a database in a varbinary() column (but please don't) or you
//could pass them to another function for further PDF processing.
// use this line on Windows version
//File.WriteAllBytes(model.OutputPath, bytes);
// use these lines on Mac version
string path = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "data");
path = Path.Combine(path, "test.pdf");
File.WriteAllBytes(path, bytes);
}
catch (Exception e)
{
//Rethrow with "throw;" so the original stack trace is preserved ("throw e;" resets it)
throw;
}
}
}
}
I tested this application on Windows. On a Mac, replace the line
File.WriteAllBytes(model.OutputPath, bytes);
in HtmlToPdf.cs with
string path = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "data");
path = Path.Combine(path, "test.pdf");
File.WriteAllBytes(path, bytes);
The listing above has the Mac variant active and the Windows line commented out; both are marked with comments in the code.
About the font problem: if you want to use a specific font (e.g. Roboto), you must supply the font file and a path to it that your application can read.
NuGet packages: iTextSharp and itextsharp.xmlworker
You can turn this console application into a web application: whenever you need a PDF, send an (AJAX) request to the server and have it call HtmlToPdf.Convert.
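The Windows and Mac variants differ only in where the bytes are written, so the choice could be folded into one helper. The following is a minimal sketch, not part of the answer's code: OutputPathResolver, Resolve, and the fallbackDirectory and fallbackFileName parameters are hypothetical names, and it assumes that an empty OutputPath should fall back to a default file in a folder that may not exist yet.

```csharp
using System;
using System.IO;

public static class OutputPathResolver
{
    // Hypothetical helper: returns the path to write the PDF to.
    // Uses outputPath when it is set, otherwise fallbackDirectory/fallbackFileName.
    public static string Resolve(string outputPath, string fallbackDirectory, string fallbackFileName = "test.pdf")
    {
        string path = string.IsNullOrWhiteSpace(outputPath)
            ? Path.Combine(fallbackDirectory, fallbackFileName)
            : outputPath;

        // File.WriteAllBytes does not create missing folders, so create the
        // target folder first. This is a no-op when the folder already exists.
        string directory = Path.GetDirectoryName(Path.GetFullPath(path));
        if (!string.IsNullOrEmpty(directory))
        {
            Directory.CreateDirectory(directory);
        }

        return path;
    }
}
```

With such a helper, the write at the end of Convert might become File.WriteAllBytes(OutputPathResolver.Resolve(model.OutputPath, Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "data")), bytes); on either platform.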

Related

Get Text Before image in a PDF using PdfPig

I am trying to get the text that appears just before an image in a PDF.
Example PDF:
[image: pdfexample]
As in the image above:
For the nice picture of cat 1, I want to get 'What is image about :' and 'Gps Thing'.
For the nice picture of cat 2, I want to get 'What is image about 2 :' and 'Gps Thing 2', excluding 'Lot of useless text' (for example, by finding the last word, GPS, before image 2).
Code used so far:
// "dir" (the output folder) is defined elsewhere in the original code
public void pdf(string pdfselectionned)
{
    var pdfdoc = IronPdf.PdfDocument.FromFile(pdfselectionned); // Load the selected PDF
    string pdftxt = pdfdoc.ExtractAllText(); // Get all text from the PDF
    // Extract every image into a folder, named 1.png, 2.png, ...
    using (PdfDocument pdfDocument = PdfDocument.Open(pdfselectionned))
    {
        int imageCount = 1;
        foreach (Page page in pdfDocument.GetPages())
        {
            List<XObjectImage> images = page.GetImages().Cast<XObjectImage>().ToList();
            foreach (XObjectImage image in images)
            {
                // RawBytes holds the image stream as stored in the PDF, which is not necessarily PNG data
                byte[] imageRawBytes = image.RawBytes.ToArray();
                File.WriteAllBytes(Path.Combine(dir, $"{imageCount}.png"), imageRawBytes);
                imageCount++;
            }
        }
    }
}
Thanks a lot if someone finds a way to do that :)
(Other topics discuss something similar to what I want, but nobody there uses PdfPig. If I can avoid using a different library, that would be great.)

Word found unreadable content error when opening file

I am facing an error "Word found unreadable content in abc.docx. Do you want to recover the content of this document?" while opening the Word (.docx) file.
I have tried all the solutions I could find on the Internet, without success. Below is my code for writing the content into the stream.
private void test()
{
using (MemoryStream one = await db.DownloadFile("templates", "one.docx"))
{
using (MemoryStream two = await db.DownloadFile("templates", "two.docx"))
{
using (MemoryStream newStream = new MemoryStream())
{
one.CopyTo(newStream);
editingMemoryStream.Position = 0;
using (WordprocessingDocument mainDoc = WordprocessingDocument.Open(one, true))
{
using (WordprocessingDocument newDoc =
WordprocessingDocument.Open(newStream, true))
{
Generate(modal, new, main, report);
}
}
}
}
}
}
private void Generate(List<modal> mo, WordprocessingDocument new, MemoryStream report)
{
var main = new.MainDocumentPart;
modal = mo[0];
AddTableToBody(report, modal.table, mo);
}
public void AddTableToBody(MemoryStream temp, dailyReport,
MainDocumentPart main)
{
using (WordprocessingDocument newDoc = WordprocessingDocument.Open(editingMemoryStream, true))
{
WP.Body body= dailyReport.MainDocumentPart.Document.Body;
var main = dailyReport.MainDocumentPart;
//* some code is here*//
var clone = dailyReport.CloneNode(true);
main.Document.Body.AppendChild(new WP.Paragraph(new WP.Run(clone)));
main.Document.Save();
}
}
The file contains 2 tables with some labels.
Looking at your code, it looks like you are adding a w:body element (Body instance) to a w:r element (Run instance) in the following two lines of code:
var clone = dailyReport.CloneNode(true);
main.Document.Body.AppendChild(new WP.Paragraph(new WP.Run(clone)));
In the above excerpt, clone is a Body instance (w:body element) with all its child elements, and I can only assume main is the MainDocumentPart of some other WordprocessingDocument. You are therefore appending a new Paragraph (w:p) that contains one Run (w:r), which in turn contains a Body (w:body) with whatever children the latter has. A w:body element is not allowed inside a w:r, so this is invalid Open XML and most probably the reason why Word complains.
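One way to avoid the nesting is to copy the children of the source body into the target body instead of wrapping the whole body in a run. The sketch below assumes the DocumentFormat.OpenXml NuGet package; BodyMerger, AppendBodyContent and CreateDocument are hypothetical names used only for illustration. It skips the source's section properties and does not carry over related parts such as images or styles, which a real merge would also have to handle.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BodyMerger
{
    // Hypothetical helper: clones the block-level children (paragraphs, tables, ...)
    // of source into target, so that no w:body ends up inside a w:r.
    public static void AppendBodyContent(Body source, Body target)
    {
        // Section properties must stay the last child of a body, so insert before them.
        SectionProperties targetSection = target.Elements<SectionProperties>().LastOrDefault();

        foreach (OpenXmlElement child in source.Elements().Where(e => !(e is SectionProperties)))
        {
            OpenXmlElement clone = child.CloneNode(true);
            if (targetSection != null)
            {
                target.InsertBefore(clone, targetSection);
            }
            else
            {
                target.AppendChild(clone);
            }
        }
    }

    // Builds a one-paragraph document in memory, only to make the sketch self-contained.
    public static MemoryStream CreateDocument(string text)
    {
        var stream = new MemoryStream();
        using (var doc = WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
        {
            MainDocumentPart part = doc.AddMainDocumentPart();
            part.Document = new Document(new Body(new Paragraph(new Run(new Text(text)))));
            part.Document.Save();
        }
        stream.Position = 0;
        return stream;
    }
}
```

Called as BodyMerger.AppendBodyContent(sourceDoc.MainDocumentPart.Document.Body, targetDoc.MainDocumentPart.Document.Body), this leaves the target with ordinary paragraphs and tables as direct children of its body.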

How to programmatically create Word's serial letter and store it as a single document?

How can one generate a docx serial letter (mail merge) in an ASP.NET MVC application?
I can fill in a simple docx template with data from the DB by using a docx with Content Controls and the OpenXML library, as suggested for example here.
However, when I try to use this for a serial letter and merge the generated documents into a single output docx (hint here), every letter in the result contains the data of the first entry. For example, when I generate letters for 10 employees, the output contains 10 letters, but all of them show the data of the first employee.
Edit: (sample code added)
internal static Stream CreateMultiPartDocument(IList<object> data, string templatePath)
{
Stream mainDocumentStream = CreateTempDocument(data[0], templatePath);
for(int i = 1; i < data.Count; i++)
{
object childDocumentData = data[i];
Stream childDocumentStream = CreateTempDocument(childDocumentData, templatePath);
AppendChildDocument(mainDocumentStream, childDocumentStream);
}
mainDocumentStream.Flush();
mainDocumentStream.Position = 0;
return mainDocumentStream;
}
internal static Stream CreateTempDocument(object data, string templatePath)
{
string fullTemplatePath = Path.Combine(TEMPLATE_BASE_PATH, templatePath);
FileStream templateFile = File.Open(fullTemplatePath, FileMode.Open);
if(null != templateFile)
{
MemoryStream fileInMemory = new MemoryStream();
templateFile.CopyTo(fileInMemory);
string customXML = data.SerializeToXml();
ReplaceCustomXmlInMemory(fileInMemory, customXML);
fileInMemory.Flush();
fileInMemory.Position = 0;
templateFile.Close();
return fileInMemory;
}
return null;
}
private static void ReplaceCustomXmlInMemory(MemoryStream fileInMemory, string customXML)
{
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(fileInMemory, true))
{
MainDocumentPart mainPart = wordDoc.MainDocumentPart;
mainPart.DeleteParts(mainPart.CustomXmlParts);
CustomXmlPart customXmlPart = mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
using (StreamWriter streamWriter = new StreamWriter(customXmlPart.GetStream()))
{
streamWriter.Write(customXML);
}
wordDoc.Close();
}
}
The only solution I was able to find was to switch from OpenXml to Syncfusion's FileFormats library, which supports mail merge out of the box with several possible input formats. See link here.
It is also available as a NuGet package, so it was super simple.

How can I return a stream rather than writing to disk?

I am using OpenXML SDK.
The OpenXML SDK creates a method called CreatePackage as such:
public void CreatePackage(string filePath)
{
using (SpreadsheetDocument package = SpreadsheetDocument.Create(filePath, SpreadsheetDocumentType.Workbook))
{
CreateParts(package);
}
}
I call it from my program as follows which will create the Excel file to a given path:
gc.CreatePackage(excelFilePath);
Process.Start(_excelFilePath);
I am not sure how to tweak the code so that it returns a Stream containing the Excel file instead of creating the file on disk.
According to the documentation for SpreadsheetDocument.Create, there are multiple overloads, one of which takes a Stream.
So change your code to:
public void CreatePackage(Stream stream)
{
using (SpreadsheetDocument package = SpreadsheetDocument.Create(stream, SpreadsheetDocumentType.Workbook))
{
CreateParts(package);
}
}
And then call it with any valid Stream, for example:
using(var memoryStream = new MemoryStream())
{
CreatePackage(memoryStream);
// do something with memoryStream
}
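To make "do something with memoryStream" concrete, here is a minimal sketch that builds a workbook in memory and hands it back as a byte array. It assumes the DocumentFormat.OpenXml NuGet package; WorkbookBuilder and CreateWorkbookBytes are hypothetical names, and the body stands in for the question's CreateParts with the smallest workbook that opens (one empty sheet).

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class WorkbookBuilder
{
    // Hypothetical helper: builds a workbook with one empty sheet entirely in memory.
    public static byte[] CreateWorkbookBytes()
    {
        using (var memoryStream = new MemoryStream())
        {
            using (SpreadsheetDocument package =
                SpreadsheetDocument.Create(memoryStream, SpreadsheetDocumentType.Workbook))
            {
                WorkbookPart workbookPart = package.AddWorkbookPart();
                workbookPart.Workbook = new Workbook();

                WorksheetPart worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
                worksheetPart.Worksheet = new Worksheet(new SheetData());

                Sheets sheets = workbookPart.Workbook.AppendChild(new Sheets());
                sheets.Append(new Sheet
                {
                    Id = workbookPart.GetIdOfPart(worksheetPart),
                    SheetId = 1,
                    Name = "Sheet1"
                });

                workbookPart.Workbook.Save();
            }

            // Read the bytes only after the package has been disposed, so that
            // everything has been flushed into the stream.
            return memoryStream.ToArray();
        }
    }
}
```

Returning a byte array sidesteps the question of who owns and rewinds the stream. If a Stream is needed instead, one option is to wrap the result in a new MemoryStream, which starts at position 0.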

How to save a file without prompting the user for a name/path?

I'm trying to open a stream to a file.
First I need to save a file to my desktop and then open a stream to that file.
This code works well (from my previous project) but in this case, I don't want to prompt the user to pick the save location or even the name of the file. Just save it and open the stream:
Stream myStream;
if (saveFileDialog1.ShowDialog() == DialogResult.OK)
{
if ((myStream = saveFileDialog1.OpenFile()) != null)
{
PdfWriter.GetInstance(document, myStream);
Here's my code for the newer project (the reason for this question):
namespace Tutomentor.Reporting
{
public class StudentList
{
public void PrintStudentList(int gradeParaleloID)
{
StudentRepository repo = new StudentRepository();
var students = repo.FindAllStudents()
.Where(s => s.IDGradeParalelo == gradeParaleloID);
Document document = new Document(PageSize.LETTER);
Stream stream;
PdfWriter.GetInstance(document, stream);
document.Open();
foreach (var student in students)
{
Paragraph p = new Paragraph();
p.Content = student.Name;
document.Add(p);
}
}
}
}
Use Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory) to get the desktop directory.
string fileName = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory),
"MyFile.pdf");
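// File.Create truncates an existing file; File.OpenWrite would leave old trailing bytes behind
using(var stream = File.Create(fileName))
{
PdfWriter.GetInstance(document, stream);
// Open the document, add its content and close it here, before the stream is disposed
}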
// However you initialize your instance of StudentList
StudentList myStudentList = ...;
using (FileStream stream = File.OpenWrite(@"C:\Users\me\Desktop\myDoc.pdf")) {
try {
myStudentList.PrintStudentList(stream, gradeParaleloID);
}
finally {
stream.Close();
}
}
You should pass the stream into your method:
public void PrintStudentList(Stream stream, int gradeParaleloID) { ... }
EDIT
Even though I hard coded a path above, you shouldn't do that, use something like this to get the path to your desktop:
Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
If this is a local (e.g. Windows/console) application, just make the stream a FileStream to whatever path you want (check this for info on how to get the desktop folder path). If the user running the application has write permissions for that location, the file will be created and saved there.
If this is a web (e.g. ASP.Net) application you won't be able to save the file directly in the client machine without prompting the user (for security reasons).
Stream myStream = new FileStream(@"c:\Users\[user]\Desktop\myfile.dat", FileMode.OpenOrCreate);
Your FileMode may differ depending on what you're trying to do. Also I wouldn't advise actually using the Desktop for this, but that's what you asked for in the question. Preferably, look into Isolated Storage.
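The answers above share one idea: the code that produces the content writes to a Stream it is given, and the caller decides where that stream points. A minimal sketch of that split, using only the base class library, could look like this. ReportFile, WriteReport and SaveToFolder are hypothetical names, and WriteReport stands in for PrintStudentList(Stream, int) by writing plain text instead of a PDF.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReportFile
{
    // Hypothetical stand-in for PrintStudentList(Stream, int):
    // it writes to whatever stream the caller passes in.
    public static void WriteReport(Stream stream, string content)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(content);
        stream.Write(bytes, 0, bytes.Length);
    }

    // Creates (or overwrites) fileName in folder without prompting and returns the full path.
    public static string SaveToFolder(string folder, string fileName, string content)
    {
        string path = Path.Combine(folder, fileName);

        // File.Create truncates an existing file, so no stale bytes remain at the end.
        using (FileStream stream = File.Create(path))
        {
            WriteReport(stream, content);
        }

        return path;
    }
}
```

For the desktop case from the question, the call might be ReportFile.SaveToFolder(Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory), "MyFile.txt", "..."). The same WriteReport method could equally be given a MemoryStream or a response stream.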
