Get all metadata from an existing PDF using iText7 - C#

How can I retrieve all metadata stored in a PDF with iText7?
using (var pdfReader = new iText.Kernel.Pdf.PdfReader("path-to-a-pdf-file"))
{
var pdfDocument = new iText.Kernel.Pdf.PdfDocument(pdfReader);
var pdfDocumentInfo = pdfDocument.GetDocumentInfo();
// Getting basic metadata
var author = pdfDocumentInfo.GetAuthor();
var title = pdfDocumentInfo.GetTitle();
// Getting everything else
var someMetadata = pdfDocumentInfo.GetMoreInfo("need-a-key-here");
// How to get all metadata ?
}
I was using the code below with iTextSharp, but I can't figure out how to do the same with the new iText7.
using (var pdfReader = new iTextSharp.text.pdf.PdfReader("path-to-a-pdf-file"))
{
// Getting basic metadata
var author = pdfReader.Info.ContainsKey("Author") ? pdfReader.Info["Author"] : null;
var title = pdfReader.Info.ContainsKey("Title") ? pdfReader.Info["Title"] : null;
// Getting everything else
var metadata = pdfReader.Info;
metadata.Remove("Author");
metadata.Remove("Title");
// Print metadata
Console.WriteLine($"Author: {author}");
Console.WriteLine($"Title: {title}");
foreach (var line in metadata)
{
Console.WriteLine($"{line.Key}: {line.Value}");
}
}
I am using version 7.1.1 of iText7.

In iText 7 the PdfDocumentInfo class unfortunately does not expose a method to retrieve the keys in the underlying dictionary.
But you can simply retrieve the Info dictionary contents by accessing that dictionary directly from the trailer dictionary. E.g. for a PdfDocument pdfDocument:
PdfDictionary infoDictionary = pdfDocument.GetTrailer().GetAsDictionary(PdfName.Info);
foreach (PdfName key in infoDictionary.KeySet())
Console.WriteLine($"{key}: {infoDictionary.GetAsString(key)}");
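Note that GetAsDictionary returns null if the trailer has no Info entry (the entry is optional in a PDF trailer), so a null check before iterating is advisable.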

There is a problem with "UnicodeBig", "UTF-8" or "PDF" encoded strings.
For example, if the PDF was created with Microsoft Word, the "/Creator" entry comes back in an unreadable encoding and needs to be converted.
iText7 has its own function for that conversion:
...ToUnicodeString().
But it is a method of the PdfString object, so the PdfDictionary value (a PdfObject) first has to be cast to PdfString.
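A minimal sketch of that cast (reusing the infoDictionary from the previous answer; PdfName.Creator addresses the "/Creator" entry):
PdfObject creatorObj = infoDictionary.Get(PdfName.Creator);
if (creatorObj is PdfString creatorString)
{
    Console.WriteLine(creatorString.ToUnicodeString());
}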
Complete solution as an async, "unbreakable" and auto-disposing function:
public static async Task<(Dictionary<string, string> MetaInfo, string Error)> GetMetaInfoAsync(string path)
{
try
{
var metaInfo = await Task.Run(() =>
{
var metaInfoDict = new Dictionary<string, string>();
using (var pdfReader = new PdfReader(path))
using (var pdfDocument = new PdfDocument(pdfReader))
{
metaInfoDict["PDF.PageCount"] = $"{pdfDocument.GetNumberOfPages():D}";
metaInfoDict["PDF.Version"] = $"{pdfDocument.GetPdfVersion()}";
var pdfTrailer = pdfDocument.GetTrailer();
var pdfDictInfo = pdfTrailer.GetAsDictionary(PdfName.Info);
foreach (var pdfEntryPair in pdfDictInfo.EntrySet())
{
var key = "PDF." + pdfEntryPair.Key.ToString().Substring(1);
string value;
switch (pdfEntryPair.Value)
{
case PdfString pdfString:
value = pdfString.ToUnicodeString();
break;
default:
value = pdfEntryPair.Value.ToString();
break;
}
metaInfoDict[key] = value;
}
return metaInfoDict;
}
});
return (metaInfo, null);
}
catch (Exception ex)
{
if (Debugger.IsAttached) Debugger.Break();
return (null, ex.Message);
}
}
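For reference, a hypothetical call site (assuming the function above lives in a static class named, say, PdfHelper):
var (metaInfo, error) = await PdfHelper.GetMetaInfoAsync(@"path-to-a-pdf-file");
if (error != null)
{
    Console.WriteLine($"Failed to read metadata: {error}");
}
else
{
    foreach (var pair in metaInfo)
        Console.WriteLine($"{pair.Key}: {pair.Value}");
}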

Related

Access token empty error when uploading large files to a ToDoTask using Graph API

I am trying to attach large files to a ToDoTask using the Graph API, following the example in the docs for attaching large files to a ToDoTask and the recommended class LargeFileUploadTask for uploading large files.
I have done this successfully before when attaching large files to emails and sending them, so I used that as a base for the following method.
public async Task CreateTaskBigAttachments( string idList, string title, List<string> categories,
BodyType contentType, string content, Importance importance, bool isRemindOn, DateTime? dueTime, cAttachment[] attachments = null)
{
try
{
var _newTask = new TodoTask
{
Title = title,
Categories = categories,
Body = new ItemBody()
{
ContentType = contentType,
Content = content,
},
IsReminderOn = isRemindOn,
Importance = importance
};
if (dueTime.HasValue)
{
var _timeZone = TimeZoneInfo.Local;
_newTask.DueDateTime = DateTimeTimeZone.FromDateTime(dueTime.Value, _timeZone.StandardName);
}
var _task = await _graphServiceClient.Me.Todo.Lists[idList].Tasks.Request().AddAsync(_newTask);
//Add attachments
if (attachments != null)
{
if (attachments.Length > 0)
{
foreach (var _attachment in attachments)
{
var _attachmentContentSize = _attachment.ContentBytes.Length;
var _attachmentInfo = new AttachmentInfo
{
AttachmentType = AttachmentType.File,
Name = _attachment.FileName,
Size = _attachmentContentSize,
ContentType = _attachment.ContentType
};
var _uploadSession = await _graphServiceClient.Me
.Todo.Lists[idList].Tasks[_task.Id]
.Attachments.CreateUploadSession(_attachmentInfo).Request().PostAsync();
using (var _stream = new MemoryStream(_attachment.ContentBytes))
{
_stream.Position = 0;
LargeFileUploadTask<TaskFileAttachment> _largeFileUploadTask = new LargeFileUploadTask<TaskFileAttachment>(_uploadSession, _stream, MaxChunkSize);
try
{
await _largeFileUploadTask.UploadAsync();
}
catch (ServiceException errorGraph)
{
if (errorGraph.StatusCode == HttpStatusCode.InternalServerError || errorGraph.StatusCode == HttpStatusCode.BadGateway
|| errorGraph.StatusCode == HttpStatusCode.ServiceUnavailable || errorGraph.StatusCode == HttpStatusCode.GatewayTimeout)
{
await Task.Delay(1000); //Wait time until next attempt (avoids blocking with Thread.Sleep in async code)
//Try again
await _largeFileUploadTask.ResumeAsync();
}
else
throw; // rethrow with "throw;" to preserve the stack trace
}
}
}
}
}
}
catch (ServiceException)
{
// "throw;" instead of "throw errorGraph;" preserves the original stack trace
throw;
}
catch (Exception)
{
throw;
}
}
Up to the point of creating the task, everything goes well: it creates the task for the user and the task is properly shown in the user's task list. It also creates an upload session properly.
The problem comes when I try to upload the large file with the UploadAsync instruction.
The following error happens:
Code: InvalidAuthenticationToken Message: Access token is empty.
But according to the LargeFileUploadTask doc, the client does not need to set auth headers:
param name="baseClient" To use for making upload requests. The client should not set Auth headers as upload urls do not need them.
Is LargeFileUploadTask not allowed to be used to upload large files to a ToDoTask?
If not, then what is the proper way to upload large files to a ToDoTask using the Graph API? Can someone provide an example?
If you want, you can raise an issue with the details here so that they can have a look: https://github.com/microsoftgraph/msgraph-sdk-dotnet-core/issues.
It seems like it's a bug and they are working on it.
In the meantime, I wrote the following code to deal with the large files.
var _task = await _graphServiceClient.Me.Todo.Lists[idList].Tasks.Request().AddAsync(_newTask);
//Add attachments
if (attachments != null)
{
if (attachments.Length > 0)
{
foreach (var _attachment in attachments)
{
var _attachmentContentSize = _attachment.ContentBytes.Length;
var _attachmentInfo = new AttachmentInfo
{
AttachmentType = AttachmentType.File,
Name = _attachment.FileName,
Size = _attachmentContentSize,
ContentType = _attachment.ContentType
};
var _uploadSession = await _graphServiceClient.Me
.Todo.Lists[idList].Tasks[_task.Id]
.Attachments.CreateUploadSession(_attachmentInfo).Request().PostAsync();
// Get the upload URL and the next expected range from the response
string _uploadUrl = _uploadSession.UploadUrl;
using (var _stream = new MemoryStream(_attachment.ContentBytes))
{
_stream.Position = 0;
// Create a byte array to hold the contents of each chunk
byte[] _chunk = new byte[MaxChunkSize];
//Bytes to read
int _bytesRead = 0;
//Times the stream has been read
var _ind = 0;
while ((_bytesRead = _stream.Read(_chunk, 0, _chunk.Length)) > 0)
{
// Calculate the range of the current chunk
string _currentChunkRange = $"bytes {_ind * MaxChunkSize}-{_ind * MaxChunkSize + _bytesRead - 1}/{_stream.Length}";
//Then we should calculate the next expected range, if we need it
// Create a ByteArrayContent object from the chunk
ByteArrayContent _byteArrayContent = new ByteArrayContent(_chunk, 0, _bytesRead);
// Set the header for the current chunk
_byteArrayContent.Headers.Add("Content-Range", _currentChunkRange);
_byteArrayContent.Headers.Add("Content-Type", _attachment.ContentType);
_byteArrayContent.Headers.Add("Content-Length", _bytesRead.ToString());
// Upload the chunk using the httpClient Request
var _client = new HttpClient();
var _requestMessage = new HttpRequestMessage()
{
RequestUri = new Uri(_uploadUrl + "/content"),
Method = HttpMethod.Put,
Headers =
{
{ "Authorization", bearerToken },
}
};
_requestMessage.Content = _byteArrayContent;
var _response = await _client.SendAsync(_requestMessage);
if (!_response.IsSuccessStatusCode)
throw new Exception("File attachment failed");
_ind++;
}
}
}
}
}

CsvHelper error "No header record found" when reading a CSV stream

Below is the code I am using to read a stream source of CSV files, but I get the error "No header record found". The library version is 15.0 and I am already using .ToList() as suggested in some solutions, but the error persists. Below is the method, along with the TableField class and the ReadStream method.
Also note: I can get the desired result if I pass the source as a MemoryStream, but it fails if I pass it as a Stream, and I need to avoid writing to memory each time.
public async Task<Stream> DownloadBlob(string containerName, string fileName, string connectionString)
{
// MemoryStream memoryStream = new MemoryStream();
if (string.IsNullOrEmpty(connectionString))
{
connectionString = @"UseDevelopmentStorage=true";
containerName = "testblobs";
}
Microsoft.Azure.Storage.CloudStorageAccount storageAccount = Microsoft.Azure.Storage.CloudStorageAccount.Parse(connectionString);
CloudBlobClient serviceClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = serviceClient.GetContainerReference(containerName);
CloudBlockBlob blob = container.GetBlockBlobReference(fileName);
if (!blob.Exists())
{
throw new Exception("Blob not found");
}
return await blob.OpenReadAsync();
}
public class TableField
{
public string Name { get; set; }
public string Type { get; set; }
public Type DataType
{
get
{
switch( Type.ToUpper() )
{
case "STRING":
return typeof(string);
case "INT":
return typeof( int );
case "BOOL":
case "BOOLEAN":
return typeof( bool );
case "FLOAT":
case "SINGLE":
case "DOUBLE":
return typeof( double );
case "DATETIME":
return typeof( DateTime );
default:
throw new NotSupportedException( $"CSVColumn data type '{Type}' not supported" );
}
}
}
}
private IEnumerable<Dictionary<string, EntityProperty>> ReadCSV(Stream source, IEnumerable<TableField> cols)
{
using (TextReader reader = new StreamReader(source, Encoding.UTF8))
{
var cache = new TypeConverterCache();
cache.AddConverter<float>(new CSVSingleConverter());
cache.AddConverter<double>(new CSVDoubleConverter());
var csv = new CsvReader(reader,
new CsvHelper.Configuration.CsvConfiguration(global::System.Globalization.CultureInfo.InvariantCulture)
{
Delimiter = ";",
HasHeaderRecord = true,
CultureInfo = global::System.Globalization.CultureInfo.InvariantCulture,
TypeConverterCache = cache
});
csv.Read();
csv.ReadHeader();
var map = (
from col in cols
from src in col.Sources()
let index = csv.GetFieldIndex(src, isTryGet: true)
where index != -1
select new { col.Name, Index = index, Type = col.DataType }).ToList();
while (csv.Read())
{
yield return map.ToDictionary(
col => col.Name,
col => EntityProperty.CreateEntityPropertyFromObject(csv.GetField(col.Type, col.Index)));
}
}
}
StreamReading code:
public async Task<Stream> ReadStream(string containerName, string digestFileName, string fileName, string connectionString)
{
string data = string.Empty;
string fileExtension = Path.GetExtension(fileName);
var contents = await DownloadBlob(containerName, digestFileName, connectionString);
return contents;
}
Sample CSV to be read:
PartitionKey;Time;RowKey;State;RPM;Distance;RespirationConfidence;HeartBPM
te123;2020-11-06T13:33:37.593Z;10;1;8;20946;26;815
te123;2020-11-06T13:33:37.593Z;4;2;79944;8;36635;6
te123;2020-11-06T13:33:37.593Z;3;3;80042;9;8774;5
te123;2020-11-06T13:33:37.593Z;1;4;0;06642;6925;37
te123;2020-11-06T13:33:37.593Z;6;5;04740;74753;94628;21
te123;2020-11-06T13:33:37.593Z;7;6;6;2;14;629
te123;2020-11-06T13:33:37.593Z;9;7;126;86296;9157;05
te123;2020-11-06T13:33:37.593Z;5;8;5;3;7775;08
te123;2020-11-06T13:33:37.593Z;2;9;44363;65;70;229
te123;2020-11-06T13:33:37.593Z;8;10;02;24666;2;2
I have tried to reproduce the problem with version 15.0 of the library, but could not, since I don't have your CSVSingleConverter and CSVDoubleConverter classes. With the standard classes of CsvHelper, however, reading the header works:
using System;
using System.IO;
using System.Text;
using CsvHelper;
using CsvHelper.TypeConversion;
namespace ConsoleApp2
{
class Program
{
static void Main(string[] args)
{
using (Stream stream = new FileStream(@"e:\demo.csv", FileMode.Open, FileAccess.Read))
{
ReadCSV(stream);
}
}
private static void ReadCSV(Stream source)
{
using (TextReader reader = new StreamReader(source, Encoding.UTF8))
{
var cache = new TypeConverterCache();
cache.AddConverter<float>(new SingleConverter());
cache.AddConverter<double>(new DoubleConverter());
var csv = new CsvReader(reader,
new CsvHelper.Configuration.CsvConfiguration(global::System.Globalization.CultureInfo.InvariantCulture)
{
Delimiter = ";",
HasHeaderRecord = true,
CultureInfo = global::System.Globalization.CultureInfo.InvariantCulture,
TypeConverterCache = cache
});
csv.Read();
csv.ReadHeader();
foreach (string headerRow in csv.Context.HeaderRecord)
{
Console.WriteLine(headerRow);
}
}
}
}
}
I've changed the lines ...
cache.AddConverter<float>(new CSVSingleConverter());
cache.AddConverter<double>(new CSVDoubleConverter());
... to ...
cache.AddConverter<float>(new SingleConverter());
cache.AddConverter<double>(new DoubleConverter());
I put the CSV data into a UTF-8 text file. Output at the console is:
PartitionKey
Time
RowKey
State
RPM
Distance
RespirationConfidence
HeartBPM
EDIT 2020-12-24:
Posted the whole source text, not just part of it.
Related to my answer to your other question (it has more detail; you can read it there), I didn't encounter any problem connecting CsvHelper to a stream sourced from blob storage.
This was the code used (I took the CSV data you posted, added it to a file, and uploaded it to blob storage):
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private async void button1_Click(object sender, EventArgs e)
{
var cstr = "YOUR CONNSTR HERE";
var bbc = new BlockBlobClient(cstr, "temp", "ankit.csv");
var s = await bbc.OpenReadAsync(new BlobOpenReadOptions(true) { BufferSize = 16384 });
var sr = new StreamReader(s);
var csv = new CsvHelper.CsvReader(sr, new CsvConfiguration(CultureInfo.CurrentCulture) { HasHeaderRecord = true, Delimiter = ";" });
//try by read/getrecord
while(await csv.ReadAsync())
{
var rec = csv.GetRecord<X>();
Console.WriteLine(rec.PartitionKey);
}
var x = new X();
//try by await foreach
await foreach (var r in csv.EnumerateRecordsAsync(x))
{
Console.WriteLine(r.PartitionKey);
}
}
}
class X {
public string PartitionKey { get; set; }
}
Try setting the source stream back to the start.
private IEnumerable<Dictionary<string, EntityProperty>> ReadCSV(Stream source, IEnumerable<TableField> cols)
{
source.Position = 0;
You also can't use yield return there. It delays execution of the code until you access the IEnumerable<Dictionary<string, EntityProperty>> returned from the ReadCSV method. The problem is at that point you have already closed the using statement with the TextReader that CsvHelper needs to read your data, so you get a NullReferenceException.
You either need to remove the yield return
var result = new List<Dictionary<string, EntityProperty>>();
while (csv.Read()){
// Add to result
}
return result;
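A filled-in sketch of that variant, reusing the map lookup from your original ReadCSV:
var result = new List<Dictionary<string, EntityProperty>>();
while (csv.Read())
{
    result.Add(map.ToDictionary(
        col => col.Name,
        col => EntityProperty.CreateEntityPropertyFromObject(csv.GetField(col.Type, col.Index))));
}
return result;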
Or pass the TextReader to your method. Any enumeration of the IEnumerable<Dictionary<string, EntityProperty>> must occur before leaving the using statement, which disposes of the TextReader needed by the CsvReader.
IEnumerable<Dictionary<string, EntityProperty>> result;
using (TextReader reader = new StreamReader(source, Encoding.UTF8)){
// Calling ToList() will enumerate your yield statement
result = ReadCSV(reader, cols).ToList();
}
I was getting the same error 'No header found...', and this was after several hundred successful reads of the same file. I added the delimiter=","
reader = csv.reader(filename, delimiter=",")
and that solved the problem. I think the csv reader will attempt to determine the delimiter if it is not specified, and fails after a while, maybe due to a memory leak? The comma is the default, but if the reader has to programmatically determine it, it is more likely to fail.

Fill in Word template and save as PDF using OpenXML and OpenOffice

I am trying to fill a Word document with data from an XML file. I am using OpenXML to fill the document, which works great, and saving it as .docx. The thing is, I then have to open Word and save the document as .odt, and then use the OpenOffice SDK to open the document and save it as a PDF; when I don't save the .docx as .odt first, the formatting is off.
What I need is to either convert the .docx to .odt programmatically or save it as .odt in the first place.
Here is what I have right now:
static void Main()
{
string documentText;
XmlDocument xmlDoc = new XmlDocument(); // Create an XML document object
xmlDoc.Load("C:\\Cache\\MMcache.xml"); // Load the XML document from the specified file
XmlNodeList PatientFirst = xmlDoc.GetElementsByTagName("PatientFirst");
XmlNodeList PatientSignatureImg = xmlDoc.GetElementsByTagName("PatientSignatureImg");
byte[] byteArray = File.ReadAllBytes("C:\\Cache\\TransportationRunReporttemplate.docx");
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(stream, true))
{
using (StreamReader reader = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
documentText = reader.ReadToEnd();
}
using (StreamWriter writer = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
writer.Write(documentText);
}
}
// Save the file with the new name
File.WriteAllBytes("C:\\Cache\\MYFINISHEDTEMPLATE.docx", stream.ToArray());
}
}
private static void AddPicture(Bitmap bitmap)
{
using (WordprocessingDocument doc = WordprocessingDocument.Open("C:\\Cache\\MYFINISHEDTEMPLATE.docx", true))
{
//Bitmap image = new Bitmap("C:\\Cache\\scribus.jpg");
SdtElement controlBlock = doc.MainDocumentPart.Document.Body
.Descendants<SdtElement>()
.Where
(r =>
r.SdtProperties.GetFirstChild<Tag>().Val == "Signature"
).SingleOrDefault();
// Find the Blip element of the content control.
A.Blip blip = controlBlock.Descendants<A.Blip>().FirstOrDefault();
ImagePart imagePart = doc.MainDocumentPart
.AddImagePart(ImagePartType.Jpeg);
using (MemoryStream stream = new MemoryStream())
{
bitmap.Save(stream, ImageFormat.Jpeg);
stream.Position = 0;
imagePart.FeedData(stream);
}
blip.Embed = doc.MainDocumentPart.GetIdOfPart(imagePart);
/* DW.Inline inline = controlBlock
.Descendants<DW.Inline>().FirstOrDefault();
// 9525 = EMUs per pixel (at 96 DPI)
inline.Extent.Cy = image.Size.Height * 9525;
inline.Extent.Cx = image.Size.Width * 9525;
PIC.Picture pic = inline
.Descendants<PIC.Picture>().FirstOrDefault();
pic.ShapeProperties.Transform2D.Extents.Cy
= image.Size.Height * 9525;
pic.ShapeProperties.Transform2D.Extents.Cx
= image.Size.Width * 9525;*/
}
ConvertToPDF(@"C:\Cache\MYFINISHEDTEMPLATE2.docx", @"C:\Cache\OpenPdf.pdf");
}
public static Bitmap Base64StringToBitmap(string base64String)
{
Bitmap bmpReturn = null;
byte[] byteBuffer = Convert.FromBase64String(base64String);
MemoryStream memoryStream = new MemoryStream(byteBuffer);
memoryStream.Position = 0;
bmpReturn = (Bitmap)Bitmap.FromStream(memoryStream);
memoryStream.Close();
memoryStream = null;
byteBuffer = null;
return bmpReturn;
}
public static void ConvertToPDF(string inputFile, string outputFile)
{
if (ConvertExtensionToFilterType(System.IO.Path.GetExtension(inputFile)) == null)
throw new InvalidProgramException("Unknown file type for OpenOffice. File = " + inputFile);
StartOpenOffice();
//Get a ComponentContext
var xLocalContext =
Bootstrap.bootstrap();
//Get MultiServiceFactory
var xRemoteFactory =
(XMultiServiceFactory)
xLocalContext.getServiceManager();
//Get a CompontLoader
var aLoader =
(XComponentLoader)xRemoteFactory.createInstance("com.sun.star.frame.Desktop");
//Load the sourcefile
XComponent xComponent = null;
try
{
xComponent = InitDocument(aLoader,
PathConverter(inputFile), "_blank");
//Wait for loading
while (xComponent == null)
{
Thread.Sleep(1000);
}
// save/export the document
SaveDocument(xComponent, inputFile, PathConverter(outputFile));
}
finally
{
if (xComponent != null) xComponent.dispose();
}
}
private static void StartOpenOffice()
{
// GetProcessesByName expects the process name without extension;
// if OpenOffice is already running, there is nothing to start.
var ps = Process.GetProcessesByName("soffice");
if (ps.Length > 0)
return;
var p = new Process
{
StartInfo =
{
Arguments = "-headless -nofirststartwizard",
FileName = "soffice.exe",
CreateNoWindow = true
}
};
var result = p.Start();
if (result == false)
throw new InvalidProgramException("OpenOffice failed to start.");
}
private static XComponent InitDocument(XComponentLoader aLoader, string file, string target)
{
var openProps = new PropertyValue[1];
openProps[0] = new PropertyValue { Name = "Hidden", Value = new Any(true) };
XComponent xComponent = aLoader.loadComponentFromURL(
file, target, 0,
openProps);
return xComponent;
}
private static void SaveDocument(XComponent xComponent, string sourceFile, string destinationFile)
{
var propertyValues = new PropertyValue[2];
// Setting the flag for overwriting
propertyValues[1] = new PropertyValue { Name = "Overwrite", Value = new Any(true) };
//// Setting the filter name
propertyValues[0] = new PropertyValue
{
Name = "FilterName",
Value = new Any(ConvertExtensionToFilterType(System.IO.Path.GetExtension(sourceFile)))
};
((XStorable)xComponent).storeToURL(destinationFile, propertyValues);
}
private static string PathConverter(string file)
{
if (file == null || file.Length == 0)
throw new NullReferenceException("Null or empty path passed to OpenOffice");
return String.Format("file:///{0}", file.Replace(@"\", "/"));
}
public static string ConvertExtensionToFilterType(string extension)
{
switch (extension)
{
case ".odt":
case ".doc":
case ".docx":
case ".txt":
case ".rtf":
case ".html":
case ".htm":
case ".xml":
case ".wps":
case ".wpd":
return "writer_pdf_Export";
case ".xls":
case ".xlsb":
case ".ods":
return "calc_pdf_Export";
case ".ppt":
case ".pptx":
case ".odp":
return "impress_pdf_Export";
default: return null;
}
}
}
}
Just for information: I cannot use anything that relies on Interop, because Word will not be installed on the machine. I am using OpenXML and OpenOffice.
This is what I would try (details below):
1) try Doc format instead of DocX
2) switch to Libre Office and try DocX again
3) use the odf-converter to get a better DocX -> ODT conversion.
More details...
There's something called the odf-converter (open source) that can convert DocX -> ODT, which typically gives a more accurate DocX -> ODT conversion than OpenOffice does. Take a look at odf-converter-integrator by OONinja for pre-packaged versions.
Also, Libre Office has DocX support ahead of OpenOffice so you might get a better result simply by switching to Libre Office.
A further option is to start from Doc format rather than DocX. In the OpenOffice world that translates much better to ODT and then to PDF.
Hope that helps.
You may try to use Docxpresso to generate your .odt directly from HTML + CSS code and avoid any conversion issue.
Docxpresso is free for non-commercial use.

How to set up a Web API controller for ZipFile

I am trying to create and return a zip file containing the selected documents. The console shows that the selected DocumentIds are being sent from the Angular controller to the API, but I am getting a null error.
ApiController
public HttpResponseMessage Get(string[] id)
{
List<Document> documents = new List<Document>();
using (var context = new ApplicationDbContext())
{
Document document = context.Documents.Find(id);
if (document == null)
{
if (document == null)
{
throw new HttpResponseException(new HttpResponseMessage(HttpStatusCode.NotFound));
}
}
using (var zipFile = new ZipFile())
{
// Make zip file
foreach (var d in documents)
{
var dt = d.DocumentDate.ToString("y").Replace('/', '-').Replace(':', '-');
string fileName = String.Format("{0}-{1}-{2}.pdf", dt, d.PipeName, d.LocationAb);
zipFile.AddEntry(fileName, d.DocumentUrl);
}
return ZipContentResult(zipFile);
}
}
}
protected HttpResponseMessage ZipContentResult(ZipFile zipFile)
{
var pushStreamContent = new PushStreamContent((stream, content, context) =>
{
zipFile.Save(stream);
stream.Close(); // After save we close the stream to signal that we are done writing.
}, "application/zip");
return new HttpResponseMessage(HttpStatusCode.OK) { Content = pushStreamContent };
}
UPDATE
public HttpResponseMessage Get([FromUri] string[] id)
{
var documents = new List<Document>();
using (var context = new ApplicationDbContext())
{
foreach (string doc in id)
{
Document document = context.Documents.Find(new object[] { doc });
if (document == null)
{
throw new HttpResponseException(new HttpResponseMessage(HttpStatusCode.NotFound));
}
documents.Add(document);
}
using (var zipFile = new ZipFile())
{
// Make zip file
foreach (var d in documents)
{
var dt = d.DocumentDate.ToString("y").Replace('/', '-').Replace(':', '-');
string fileName = String.Format("{0}-{1}-{2}.pdf", dt, d.PipeName, d.LocationAb);
zipFile.AddEntry(fileName, d.DocumentUrl);
}
return ZipContentResult(zipFile);
}
}
}
ERROR
{"The argument types 'Edm.Int32' and 'Edm.String' are incompatible for this operation. Near WHERE predicate, line 1, column 82."}
STACKTRACE
at System.Data.Entity.Internal.Linq.InternalSet`1.FindInStore(WrappedEntityKey key, String keyValuesParamName)
at System.Data.Entity.Internal.Linq.InternalSet`1.Find(Object[] keyValues)
at System.Data.Entity.DbSet`1.Find(Object[] keyValues)
at TransparentEnergy.ControllersAPI.apiZipPipeLineController.Get(String[] id) in e:\Development\TransparentEnergy\TransparentEnergy\ControllersAPI\BatchZipApi\apiZipPipeLineController.cs:line 25
at lambda_method(Closure , Object , Object[] )
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.<>c__DisplayClass10.<GetExecutor>b__9(Object instance, Object[] methodParameters)
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.Execute(Object instance, Object[] arguments)
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync(HttpControllerContext controllerContext, IDictionary`2 arguments, CancellationToken cancellationToken)
(Screenshot of the watch window, taken after changing string[] id to List id, omitted.)
Unless you forgot to copy all the code into the question you never add anything to the documents list object.
At the start you create a new List object named documents:
List<Document> documents = new List<Document>();
You then search for items and place them in a new document object:
Document document = context.Documents.Find(id);
Then, when attempting to make the zip file, you are iterating over the first created List object, which has had nothing put into it:
foreach (var d in documents)
I believe this is then causing your save of the zip file to throw an exception
zipFile.Save(stream);
In the find line above
Document document = context.Documents.Find(id);
did you intend
documents = context.Documents.Find(id);
UPDATE 2
I set up a database with your information and created an MVC Web API that takes a POST of JSON to pass the data. This populates the list with items pulled from the database by ID.
[HttpPost]
[ActionName("ZipFileAction")]
public HttpResponseMessage ZipFiles([FromBody]int[] id)
{
if (id == null)
{//Required IDs were not provided
throw new HttpResponseException(new HttpResponseMessage(HttpStatusCode.BadRequest));
}
List<Document> documents = new List<Document>();
using (var context = new ApplicationDbContext())
{
foreach (int NextDocument in id)
{
Document document = context.Documents.Find(NextDocument);
if (document == null)
{
throw new HttpResponseException(new HttpResponseMessage(HttpStatusCode.NotFound));
}
documents.Add(document);
}
using (var zipFile = new ZipFile())
{
// Make zip file
foreach (var d in documents)
{
var dt = d.DocumentDate.ToString("y").Replace('/', '-').Replace(':', '-');
string fileName = String.Format("{0}-{1}-{2}.pdf", dt, d.PipeName, d.LocationAb);
zipFile.AddEntry(fileName, d.DocumentUrl);
}
return ZipContentResult(zipFile);
}
}
}
{"The argument types 'Edm.Int32' and 'Edm.String' are incompatible ..."}
Sounds like your key column is an int type and you are searching it using a string. Try changing the type of the parameter from string to int. I renamed it ids as it seems to be a list of document identifiers.
An optimized version would get all the needed documents in a single query:
public HttpResponseMessage Get([FromUri] int[] ids)
{
using (var context = new ApplicationDbContext())
{
var documents = context.Documents.Where(doc => ids.Contains(doc.Id)).ToList();
if (documents.Count != ids.Length)
{
throw new HttpResponseException(new HttpResponseMessage(HttpStatusCode.NotFound));
}
using (var zipFile = new ZipFile())
{
// Make zip file
foreach (var d in documents)
{
var dt = d.DocumentDate.ToString("y").Replace('/', '-').Replace(':', '-');
string fileName = String.Format("{0}-{1}-{2}.pdf", dt, d.PipeName, d.LocationAb);
zipFile.AddEntry(fileName, d.DocumentUrl);
}
return ZipContentResult(zipFile);
}
}
}
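With [FromUri], the ids array binds from repeated query-string parameters, so a request might look like GET api/documents?ids=1&ids=2&ids=3.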

Modifying existing resource file programmatically

I am using the following code to generate a resource file programmatically.
ResXResourceWriter resxWtr = new ResXResourceWriter(#"C:\CarResources.resx");
resxWtr.AddResource("Title", "Classic American Cars");
resxWtr.Generate();
resxWtr.Close();
Now I want to modify the resource file created by the above code. If I use the same code, the existing resource file gets replaced. Please help me modify it without losing the existing contents.
Best Regards,
Ankit
public static void AddOrUpdateResource(string key, string value)
{
var resx = new List<DictionaryEntry>();
using (var reader = new ResXResourceReader(resourceFilepath))
{
resx = reader.Cast<DictionaryEntry>().ToList();
var existingResource = resx.Where(r => r.Key.ToString() == key).FirstOrDefault();
if (existingResource.Key == null && existingResource.Value == null) // NEW!
{
resx.Add(new DictionaryEntry() { Key = key, Value = value });
}
else // MODIFIED RESOURCE!
{
var modifiedResx = new DictionaryEntry() { Key = existingResource.Key, Value = value };
resx.Remove(existingResource); // REMOVING RESOURCE!
resx.Add(modifiedResx); // AND THEN ADDING RESOURCE!
}
}
using (var writer = new ResXResourceWriter(resourceFilepath)) // write back to the same file that was read
{
resx.ForEach(r =>
{
// Again Adding all resource to generate with final items
writer.AddResource(r.Key.ToString(), r.Value.ToString());
});
writer.Generate();
}
}
I had the same problem; this resolves it.
The following will append to your existing .resx file:
var reader = new ResXResourceReader(@"C:\CarResources.resx");//same fileName
var node = reader.GetEnumerator();
var writer = new ResXResourceWriter(@"C:\CarResources.resx");//same fileName(not new)
while (node.MoveNext())
{
writer.AddResource(node.Key.ToString(), node.Value.ToString());
}
reader.Close();
var newNode = new ResXDataNode("Title", "Classic American Cars");
writer.AddResource(newNode);
writer.Generate();
writer.Close();
Here is a function that would help you modify the resource file.
public static void UpdateResourceFile(Hashtable data, String path)
{
Hashtable resourceEntries = new Hashtable();
//Get existing resources
ResXResourceReader reader = new ResXResourceReader(path);
if (reader != null)
{
IDictionaryEnumerator id = reader.GetEnumerator();
foreach (DictionaryEntry d in reader)
{
if (d.Value == null)
resourceEntries.Add(d.Key.ToString(), "");
else
resourceEntries.Add(d.Key.ToString(), d.Value.ToString());
}
reader.Close();
}
//Modify resources here...
foreach (String key in data.Keys)
{
if (!resourceEntries.ContainsKey(key))
{
String value = data[key] == null ? "" : data[key].ToString();
resourceEntries.Add(key, value);
}
}
//Write the combined resource file
ResXResourceWriter resourceWriter = new ResXResourceWriter(path);
foreach (String key in resourceEntries.Keys)
{
resourceWriter.AddResource(key, resourceEntries[key]);
}
resourceWriter.Generate();
resourceWriter.Close();
}
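Note that, despite its name, the function above only adds keys that are not already present; existing values are left untouched. A hypothetical call, merging one new entry into the file from the question:
Hashtable data = new Hashtable();
data.Add("Year", "1957"); // hypothetical new key/value
UpdateResourceFile(data, @"C:\CarResources.resx");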
