Slow SelectPDF conversion after publish - c#

I want to convert HTML to PDF, so I am using the SelectPdf library. My code is:
var converter = new HtmlToPdf();
var today = DateTime.UtcNow;
var fileName = $"test - {today}";
var doc = converter.ConvertHtmlString(html);
using var ms = new MemoryStream();
ms.Position = 0;
doc.Save(ms);
var res = ms.ToArray();
doc.Close();
return File(res, "application/pdf", fileName);
I tested on localhost and everything works well; the conversion is always fast (no more than 5 seconds).
The problem starts when I publish to the server: after the method executes, it sometimes (not always) returns a 500 error:
Failed to load resource: the server responded with a status of 500 ()
Message: "Conversion error: Navigation timeout."
Is there a way to always get a fast result? I know I can extend the page load time with:
converter.Options.MaxPageLoadTime = 120;
But I want the conversion to be fast; 2 minutes for a simple HTML to PDF conversion is too much.

If it works locally and you sometimes get a timeout on the server, it is likely that your HTML contains a file reference (e.g. JavaScript, CSS or an image) that is not available to the server at that time.
Make sure that external references in your HTML are always accessible from your server.
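One way to check this is to list the absolute URLs the HTML refers to and request each of them from the server itself. The following is only a diagnostic sketch: the helper names FindExternalReferences and FindUnreachableAsync are made up for illustration, they are not part of SelectPdf, and the regular expression only catches simple src/href attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical helper: collects absolute http(s) URLs from src="..." and href="..." attributes.
// The regex is deliberately simple; it is a diagnostic aid, not an HTML parser.
static List<string> FindExternalReferences(string html) =>
    Regex.Matches(html, @"(?:src|href)\s*=\s*[""'](https?://[^""']+)[""']", RegexOptions.IgnoreCase)
         .Cast<Match>()
         .Select(m => m.Groups[1].Value)
         .Distinct()
         .ToList();

// Hypothetical helper: returns the URLs that failed or did not answer within the timeout.
static async Task<List<string>> FindUnreachableAsync(IEnumerable<string> urls, TimeSpan timeout)
{
    using var client = new HttpClient { Timeout = timeout };
    var failed = new List<string>();
    foreach (var url in urls)
    {
        try
        {
            using var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
            if (!response.IsSuccessStatusCode) failed.Add(url);
        }
        catch (Exception)
        {
            // Any failure (DNS, connection, timeout, malformed URL) counts as unreachable here.
            failed.Add(url);
        }
    }
    return failed;
}
```

Running `await FindUnreachableAsync(FindExternalReferences(html), TimeSpan.FromSeconds(5))` on the published server, and logging the result, should show which references are slow or blocked there.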

Related

NReco HTML-to-PDF Generator GeneratePdfFromFiles method throws exception

I have a fully working system for creating single page PDFs from HTML as below;
After initializing the converter
var nRecoHTMLToPDFConverter = new HtmlToPdfConverter();
nRecoHTMLToPDFConverter = PDFGenerator.PDFSettings(nRecoHTMLToPDFConverter);
string PDFContents;
PDFContents is an HTML string which is being populated.
The following command works perfectly and gives me the byte[] which I can return;
createDTO.PDFContent = nRecoHTMLToPDFConverter.GeneratePdf(PDFContents);
The problem arises when I want to test and develop the multi page functionality of the NReco library and change an arbitrary number of HTML pages to PDF pages.
var stringArray = new string[]
{
PDFContents, PDFContents,
};
var stream = new MemoryStream();
nRecoHTMLToPDFConverter.GeneratePdfFromFiles(stringArray, null, stream);
var mybyteArray = stream.ToArray();
The PDFContents are exactly the same as above. On paper, this should give me the byte array for two identical PDF pages; however, on the call to the GeneratePdfFromFiles method, I get the following exception:
WkHtmlToPdfException: Exit with code 1 due to network error: HostNotFoundError (exit code: 1)
Please help me resolve this if you have experience with this library and its complexities. I have a feeling that I'm not familiar with the proper use of a Stream object in this scenario. I've tested the working single page line and the malfunctioning multi page lines on the same method call so their context would be identical.
Many thanks
The GeneratePdfFromFiles method you used expects an array of file names (or URLs): https://www.nrecosite.com/doc/NReco.PdfGenerator/?topic=html/M_NReco_PdfGenerator_HtmlToPdfConverter_GeneratePdfFromFiles_1.htm
If you work with HTML content as .NET strings, you can simply save each string to a temp file, generate the PDF from those files, and delete them afterwards.
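The temp-file approach described above could be sketched roughly as follows. GenerateFromHtmlStrings is a hypothetical helper name, and the conversion itself is passed in as a delegate so the sketch does not depend on any particular library; the ".html" extension on the temp files is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: writes each HTML string to a temp file, runs the supplied
// conversion over the file names, and always deletes the temp files afterwards.
static byte[] GenerateFromHtmlStrings(IEnumerable<string> htmlPages, Action<string[], Stream> generate)
{
    var tempFiles = new List<string>();
    try
    {
        foreach (var html in htmlPages)
        {
            var path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".html");
            tempFiles.Add(path);            // record first so a failed write is still cleaned up
            File.WriteAllText(path, html);
        }

        using var output = new MemoryStream();
        generate(tempFiles.ToArray(), output);
        return output.ToArray();
    }
    finally
    {
        foreach (var path in tempFiles) File.Delete(path);
    }
}
```

With the converter from the question, the call might look like `GenerateFromHtmlStrings(new[] { PDFContents, PDFContents }, (files, stream) => nRecoHTMLToPDFConverter.GeneratePdfFromFiles(files, null, stream));`.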

Trying to convert docx file to another format(pdf) using Drive API

I was trying to convert .docx file to .pdf using drive api, which sounds reasonable since you can do it manually.
Here is my code:
FilesResource.CreateMediaUpload request;
using (var stream = new System.IO.FileStream(@"test.docx",
System.IO.FileMode.Open))
{
request = driveService.Files.Create(
fileMetadata, stream, "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
request.Fields = "id, webViewLink, webContentLink, size";
var x = request.Upload();
Console.WriteLine(x);
}
var file = request.ResponseBody;
Afterwards, I am getting id of this file and trying to do:
var downloadRequest = driveService.Files.Export(file.Id, "application/pdf");
which fails with error: "Export only supports Google Docs"
Of course! I suppose it hasn't yet become a "Google Doc"; however, this format is supported for conversion, as mentioned here and here.
OK, I've noticed that if you go to Drive and open the file manually, it becomes a Google Doc file and also gets a new ID. Export with this ID works just fine. However, doing something manually isn't an acceptable approach for our needs.
I tried another approach: you can use a direct link with the &export=pdf parameter to convert a Google Doc file.
https://docs.google.com/document/d/FILE_ID/export?format=doc
But passing the file ID to that link doesn't work in this case (it works with a "DOC" file just fine). I tried doing something similar to a Stack Overflow answer. No luck.
So, is there any way to trigger the file to become a Google Doc and wait until it converts? Is there any other way?
Thanks in advance!
Thanks to @bash.d I was able to convert from docx to pdf.
Actually, one has to use v2 of the API and its "Insert" method.
https://developers.google.com/drive/v2/reference/files/insert#examples
Use the code from this link and specify
request.Convert = true;
after that I used
var downloadRequest = driveService.Files.Export(file.Id, "application/pdf");
and voilà! It worked! It takes about 30 seconds to convert the file in my case.

HTML to PDF conversion using WkHtmlToXSharp Caching / Buffering Issue

I want to convert an HTML file to a PDF file, and I was using "wkhtmltopdf.exe".
Then we moved this application to a shared hosting server. This server does not allow running .exe files, so I have to use WkHtmlToXSharp.dll [a wrapper for the above exe].
It works fine, but the problem is that it caches the output somewhere, so every time I create a new PDF it always gives me the first one.
I have called .Dispose() and set the converter to null, but to no avail.
After a certain time, though, it produces the new PDF, which means it is caching or buffering the byte data somewhere.
Below is my code. Every time, I pass a new HTML file [htmlFullPath] with different images in it.
IHtmlToPdfConverter converter = new MultiplexingConverter();
converter.ObjectSettings.Page = htmlFullPath;
converter.ObjectSettings.Web.EnablePlugins = true;
converter.ObjectSettings.Web.EnableJavascript = true;
converter.ObjectSettings.Web.Background = true;
converter.ObjectSettings.Web.LoadImages = true;
converter.ObjectSettings.Load.LoadErrorHandling = LoadErrorHandlingType.ignore;
converter.GlobalSettings.Orientation = (PdfOrientation)Enum.Parse(typeof(PdfOrientation), orientation);
if (!string.IsNullOrEmpty(pageSize))
converter.GlobalSettings.Size.PageSize = (PdfPageSize)Enum.Parse(typeof(PdfPageSize), pageSize);
converter.GlobalSettings.Margin.Top = "0cm";
converter.GlobalSettings.Margin.Bottom = "0cm";
converter.GlobalSettings.Margin.Left = "0cm";
converter.GlobalSettings.Margin.Right = "0cm";
Byte[] bufferPDF = converter.Convert();
System.IO.File.WriteAllBytes(pdfUrl, bufferPDF);
converter.Dispose();
converter = null;
As I mentioned in the question, "every time I pass a new HTML file [htmlFullPath] with different images in it".
The image for each HTML file is different, but the image name was the same.
I renamed the image with a timestamp and everything now works fine.
So an image with the same name was the real problem. It may be a buffering issue in MultiplexingConverter or some setting in IIS, which I will investigate later.
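The renaming workaround above could look something like this. WithTimestamp is a hypothetical helper, and the exact timestamp format is an assumption; the answer only says that a timestamp was added to the image name.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper: inserts a UTC timestamp before the extension so that each
// generated HTML file refers to a uniquely named image.
static string WithTimestamp(string imagePath, DateTime utcNow)
{
    var directory = Path.GetDirectoryName(imagePath) ?? "";
    var name = Path.GetFileNameWithoutExtension(imagePath);
    var extension = Path.GetExtension(imagePath);
    var stamp = utcNow.ToString("yyyyMMddHHmmssfff", CultureInfo.InvariantCulture);
    return Path.Combine(directory, $"{name}_{stamp}{extension}");
}
```

The image would be saved under the returned name, and the same name written into the HTML before it is passed to the converter.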

Uploading string as text file to SkyDrive?

I'm trying to use C# with the Live Connect API to upload a blank (or one that says "test") text file to SkyDrive. The code I have so far:
LiveConnectClient client = await LiveSignin();
string folderID = await getFolder(client);
client.BackgroundUploadAsync(folderID, "pins.txt", "", OverwriteOption.Rename);
where LiveSignin() is a function that handles the sign in code and returns a LiveConnectClient, and getFolder(LiveConnectClient client) is a function that gets the folder ID that I'm trying to upload to.
That code throws an error about the blank string (third parameter on the last line) having to be a "Windows.Storage.Streams.IInputStream", but I can't seem to find any documentation on how to convert a String to an IInputStream, or, for that matter, much of any documentation on "IInputStream" that I can find.
With earlier versions of the Windows Runtime/Live Connect (on another project) I had used:
byte[] byteArray = System.Text.Encoding.Unicode.GetBytes(Doc);
MemoryStream stream = new MemoryStream(byteArray);
App.client.UploadCompleted += client_UploadCompleted;
App.client.UploadAsync(roamingSettings.Values["folderID"].ToString(), docTitle.Text + ".txt", stream);
but that throws a lot of errors now (most of them because UploadAsync has been replaced with BackgroundUploadAsync).
So, is there a way to convert a string to an IInputStream, or do I not even need to use an IInputStream? If my method just doesn't work, how would one upload a blank text file to SkyDrive from a C# Metro app? (developing in Visual Studio 2012 Express on the evaluation of Windows 8 Enterprise, if that makes much of a difference)
EDIT: I finally found "Stream.AsInputStream", but now I'm getting the same error as this
An unhandled exception of type 'System.AccessViolationException'
occurred in Windows.Foundation.winmd
Additional information: Attempted to read or write protected memory.
This is often an indication that other memory is corrupt
the code now:
LiveConnectClient client = await LiveSignin();
string folderID = await getFolder(client);
Stream OrigStream = new System.IO.MemoryStream(System.Text.UTF8Encoding.UTF8.GetBytes("test"));
LiveOperationResult result = await client.BackgroundUploadAsync(folderID, "pins.txt", OrigStream.AsInputStream(), OverwriteOption.Rename);
Hi
I had the same problem today, and as far as I can see, the only solution is to write your text into a local file first and then upload it.
My solution looks like this:
var tmpFile = await ApplicationData.Current.LocalFolder.CreateFileAsync(
    "tmp.txt", CreationCollisionOption.ReplaceExisting);
using (var writer = new StreamWriter(await tmpFile.OpenStreamForWriteAsync()))
{
    await writer.WriteAsync("File content");
}
var operationResult = await client.BackgroundUploadAsync(
    folderId, tmpFile.Name, tmpFile, OverwriteOption.Overwrite);

Long paths bug in Windows?

I have the following file:
C:\Users\Jan\Documents\Visual Studio 2010\Projects\AzureTests\Build\82df3c44-0482-47a7-a5d8-9b39a79cf359.cskpg\WebRole1_778722b2-eb95-476d-af6a-917f269a0814.cssx\39e5cb39-cd18-4e1a-9c25-72bd1ad41b49.csman
I can open this file fine via the Open dialog in Notepad++, or via Explorer. However, opening it via the Run window doesn't work. It gives a 'cannot find the file' dialog. When I query the filesystem in C# with:
var dir = new DirectoryInfo(@"C:\Users\Jan\...");
var fil = dir.GetFiles("*.csman")[0];
The file is also in the list of returned files, but I can't do:
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(fil.FullName);
This fails with an 'incorrect data at (1,1)' error, because the XmlDocument thinks the file is empty. However, a File.ReadAllBytes on this file succeeds. This works:
var buf = File.ReadAllBytes(fil.FullName);
using (var ms = new MemoryStream())
{
ms.Write(buf, 0, (int) buf.Length);
ms.Seek(0, SeekOrigin.Begin);
xmlDoc.Load(ms);
}
The problem doesn't occur when calling...
xmlDoc.Save(fil.FullName);
Can someone explain what is happening here?
XmlDocument.LoadXml expects a string that directly contains the XML data.
Parameters
xml
Type: System.String
String containing the XML document to load.
It is therefore interpreting the path-string as if it were XML (which will obviously be invalid, which is why the exception is thrown).
Use the XmlDocument.Load method instead.
Parameters
filename
Type: System.String
URL for the file containing the XML document to load. The URL can be either a local file or an HTTP URL (a Web address).
You don't face the problem when calling XmlDocument.Save because, like Load, its single parameter represents the path to the file.
Basically, the somewhat long file path you've got there is a red herring and not the root cause of the issue you are facing.
And your other problem:
Windows "Run" requires quotes around the path name if there are spaces in it.
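To illustrate the Load versus LoadXml difference described above, here is a small self-contained sketch. The temp file and its contents are made up for the example.

```csharp
using System;
using System.IO;
using System.Xml;

// Create a small XML file to work with (illustrative content only).
var path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".xml");
File.WriteAllText(path, "<root><item>1</item></root>");

// Load treats its string argument as a file name or URL.
var fromFile = new XmlDocument();
fromFile.Load(path);

// LoadXml treats its string argument as the XML markup itself.
var fromString = new XmlDocument();
fromString.LoadXml(File.ReadAllText(path));

// Passing a path to LoadXml fails, because a path is not valid XML.
var threw = false;
try { new XmlDocument().LoadXml(path); }
catch (XmlException) { threw = true; }

File.Delete(path);
```

Both documents end up with the same content, while the third call fails at the first character of the path, which matches the (1,1) position in the error from the question.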
