Kofax Export Connector - check for attached components

I have a webservice that publishes the scanned Kofax documents to another application.
This application webservice takes the following data:
the document (binary)
the IDs of meta fields (from the application) and their values (the index fields from Kofax)
When creating a mapping for the meta fields, I store the selected index field together with the meta field ID in the releaseSetupData custom properties:
releaseSetupData.CustomProperties.Add("MetaFieldID", "IndexFieldValue");
When publishing the scanned document, I want to publish a PDF file if the PDF Generator is attached, and a multipage TIFF file otherwise.
How can I check if this generator is attached to the batch class?
As far as I know, the TIFF files from Kofax are single pages, so would I have to set up a workaround in code?

tldr:
To answer your first question: I am not sure whether the export connector has access to the queues of the relevant batch class, so just use the PDF whenever one is available, and TIFFs otherwise.
I'd check if a file exists using DocumentData.KofaxPDFPath as path. If it does, upload the PDF. If no file exists, I'd save the images to a temporary folder using DocumentData.ImageFiles.Copy(). In both cases you may want to use File.ReadAllBytes(), depending on how your web service call handles attachments.
Second question: just use 0 for the ImageType as the second argument to DocumentData.ImageFiles.Copy().
More detailed explanation:
Unfortunately, Kofax's object model is a bit messy. Here's how PDFs are handled:
The property DocumentData.KofaxPDFFileName will contain a full/absolute path to the converted PDF file, if available. This usually points to a file contained in subfolders in the server file share (i.e. CaptureSV\Images).
The method DocumentData.CopyKofaxPDFFile() will allow you to copy aforementioned file to the path DocumentData.KofaxPDFPath, if defined during setup.
It's a bit of a different story for images:
Images are exposed as a collection of ImageFile in DocumentData.ImageFiles. However, as you already mentioned - these are mostly single page TIFFs.
DocumentData.ImageFiles.Copy() copies all images either to the path defined during setup (DocumentData.ImageFilePath) or to any custom path you provide as a string argument. It also lets you specify an ImageType, where 0 means multipage TIFF, CCITT Group 4 (please refer to the API Reference for further details).
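The decision described above (PDF if one exists, multipage TIFF otherwise) could be sketched roughly as follows. Because the Kofax assemblies cannot be referenced in a standalone sample, the PDF path is passed in as a plain string and the TIFF export as a delegate; ExportPayload, PayloadSelector and exportMultipageTiff are hypothetical names, and the assumption that the TIFF export yields a single known file path is mine, not something the Kofax API is confirmed to guarantee.

```csharp
using System;
using System.IO;

// Hypothetical container for what is handed to the web service.
public sealed class ExportPayload
{
    public string Extension { get; }
    public byte[] Content { get; }

    public ExportPayload(string extension, byte[] content)
    {
        Extension = extension;
        Content = content;
    }
}

public static class PayloadSelector
{
    // pdfPath: the PDF location taken from the Kofax document data
    //          (may be null or empty when no PDF was generated).
    // exportMultipageTiff: hypothetical callback that writes a multipage TIFF
    //          and returns its path; in a real connector it would wrap
    //          DocumentData.ImageFiles.Copy(tempFolder, 0).
    public static ExportPayload Select(string pdfPath, Func<string> exportMultipageTiff)
    {
        if (!string.IsNullOrEmpty(pdfPath) && File.Exists(pdfPath))
        {
            return new ExportPayload(".pdf", File.ReadAllBytes(pdfPath));
        }

        // No PDF available: fall back to the TIFF export.
        string tiffPath = exportMultipageTiff();
        return new ExportPayload(".tif", File.ReadAllBytes(tiffPath));
    }
}
```

The callback is only invoked when no PDF is found, so the temporary TIFF copy is not created unnecessarily.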

Related

How does SharePoint versioning engine store only changes to files and not the whole file?

One of the many things that SharePoint does extremely well is that when you have versioning enabled for files uploaded to a Document Library, every time you save changes to a file it only saves the difference from the previous version of the file to the Content Database but NOT the whole file again.
I am trying to duplicate that same behavior with standard C# code on either a File System folder in Windows or a SQL Database blob field. Does anyone have any idea or pointers on how SharePoint accomplishes this and how it can be done outside of SharePoint?

SharePoint uses a technique called data "shredding" to contain each change to a given file. Unfortunately, I don't think you will find enough technical details to truly reproduce what they are doing, but you might be able to devise a reasonable approximation using your own design.
When shredded, the data associated with a file such as Document.docx is distributed across a set of BLOBs associated with the file. The independent BLOBs are each assigned a unique ID (offset) to enable reconstruction in the correct order when requested by a user.
Each document "shred" is stored in a SQL database table named DocStreams. Each BLOB contains a numerical Id representative of the source BLOB when coalesced. When a client updates a file, only the shredded BLOB that corresponds to the change is updated with the update occurring on the database server as opposed to the Web server.
For more details on shredding, see:
http://download.microsoft.com/download/9/6/6/9661DAC2-393D-445A-BDC1-E60743B1231E/Shredded%20Storage%20in%20SharePoint%202013.pdf
https://jeremythake.com/the-truth-behind-shredded-storage-in-sharepoint-2013-a84ec047f28e
https://www.c-sharpcorner.com/UploadFile/91b369/shredded-storage-in-sharepoint-2013/
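As a rough approximation of the idea outside SharePoint (and explicitly not SharePoint's actual algorithm), a file can be split into fixed-size chunks, each chunk stored once under its hash, and each version recorded as an ordered list of chunk hashes. ChunkStore, its in-memory dictionary and the chunk size are all hypothetical choices for this sketch; a real implementation would persist the chunks to a folder or a SQL table.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public sealed class ChunkStore
{
    private readonly int _chunkSize;
    // Chunk hash -> chunk bytes. Each distinct chunk is stored only once.
    private readonly Dictionary<string, byte[]> _chunks = new Dictionary<string, byte[]>();

    public ChunkStore(int chunkSize)
    {
        _chunkSize = chunkSize;
    }

    public int StoredChunkCount
    {
        get { return _chunks.Count; }
    }

    // Saves one version and returns its manifest (ordered chunk hashes).
    // Only chunks that are not already stored take up new space.
    public List<string> Save(byte[] content)
    {
        var manifest = new List<string>();
        using (var sha = SHA256.Create())
        {
            for (int offset = 0; offset < content.Length; offset += _chunkSize)
            {
                int length = Math.Min(_chunkSize, content.Length - offset);
                byte[] chunk = new byte[length];
                Buffer.BlockCopy(content, offset, chunk, 0, length);

                string key = Convert.ToBase64String(sha.ComputeHash(chunk));
                if (!_chunks.ContainsKey(key))
                {
                    _chunks[key] = chunk;
                }
                manifest.Add(key);
            }
        }
        return manifest;
    }

    // Rebuilds a version by concatenating its chunks in manifest order.
    public byte[] Load(List<string> manifest)
    {
        using (var buffer = new MemoryStream())
        {
            foreach (string key in manifest)
            {
                byte[] chunk = _chunks[key];
                buffer.Write(chunk, 0, chunk.Length);
            }
            return buffer.ToArray();
        }
    }
}
```

With fixed-size chunks, an in-place edit only adds the chunks it touches, but inserting bytes near the start of a file shifts every later chunk boundary and so stores most of the file again.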

Is it possible to get the file extension or type from a file loaded into MemoryStream?

I am creating a service in C# that is to be used by an ASP.Net Web API that will process and store files.
As part of the Web API I take the file(s) and store them in MemoryStreams which are then passed through to a service for actual processing and storage.
Part of the service will be to generate a new filename for each of the files.
That leads me to the question: is it possible to find out what file type/extension the file had, now that it's in a MemoryStream?

No for the extension, but you can guess the type:
Unless the content of the file somehow contains the name or extension, a MemoryStream by itself does not contain any information about the original file name.
You may try to detect the type of the file (many formats have a magic number/signature at the beginning, for example) by reading some or all of the stream's content.
There are also several similar questions, such as "Using .NET, how can you find the mime type of a file based on the file signature not the extension" (which discusses using UrlMon to detect the type).

You'd have to examine the data and see what content type it is based on characteristics of the content types you expect to handle. A better solution would be to capture that information as part of the process when you load the file, and pass it along with it.
To capture that information, either assume that the file extension tells you what type it is, or have the user (or calling code) select from a list of valid file types.
This post on Super User might be useful; it discusses detecting the file type based on the content.

No, MemoryStream does not know or care where its bytes came from.
You can guess the file format from the first few bytes. I'm sure there is code available on the web to do this.
Or, just make the caller of your service transmit this information.
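A minimal sketch of the signature check the answers mention, assuming the service only needs to tell apart a handful of common formats. FileSignature and GuessExtension are hypothetical names, and the signature list is deliberately short; note that a ZIP signature also matches container formats such as .docx, so the result is a guess rather than a guarantee.

```csharp
using System;
using System.IO;

public static class FileSignature
{
    // Returns a likely extension based on the first bytes of the stream,
    // or null if no known signature matches. The stream position is restored.
    public static string GuessExtension(MemoryStream stream)
    {
        long start = stream.Position;
        byte[] header = new byte[8];
        int read = stream.Read(header, 0, header.Length);
        stream.Position = start;

        if (StartsWith(header, read, 0x25, 0x50, 0x44, 0x46)) return ".pdf";  // "%PDF"
        if (StartsWith(header, read, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) return ".png";
        if (StartsWith(header, read, 0xFF, 0xD8, 0xFF)) return ".jpg";
        if (StartsWith(header, read, 0x47, 0x49, 0x46, 0x38)) return ".gif";  // "GIF8"
        if (StartsWith(header, read, 0x50, 0x4B, 0x03, 0x04)) return ".zip";  // also .docx, .xlsx, ...
        return null;
    }

    // True when the first 'length' valid bytes of data begin with the signature.
    private static bool StartsWith(byte[] data, int length, params byte[] signature)
    {
        if (length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }
}
```

If the caller knows the original file name, passing it along with the stream remains the more reliable option, as the answers point out.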

Where does file information (like DateCreated) get stored when you create a new file?

Suppose that I would like to add extra information about a file, without writing that information as content of that file. How would I do this? A couple of good examples are:
With Word documents, you can add an Author tag to a document. And,
MP3 files have lots of info stored inside of them but when you play the file, you don't see that info (unless the program playing the file has been programmed to display that information).
How does Windows do this?

This information is stored in the file system (on Windows, NTFS).
In NTFS, you can actually store another file as part of this information, and it stores much more information about each file than you might expect.
NTFS file streams
Example in C of how to consume them
About MP3 and Word: in these cases the information is stored inside the file, as part of its format.
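To illustrate that timestamps such as the creation or last-write time belong to the file system entry and not to the file's bytes, here is a small sketch using the standard System.IO.File methods. MetadataDemo and ContentUnchangedAfterTouch are hypothetical names chosen for the example.

```csharp
using System;
using System.IO;
using System.Linq;

public static class MetadataDemo
{
    // Changes the last-write timestamp kept by the file system and reports
    // whether the file's content is still byte-for-byte identical afterwards.
    public static bool ContentUnchangedAfterTouch(string path, DateTime newTimestampUtc)
    {
        byte[] before = File.ReadAllBytes(path);

        // Only the file system metadata is updated here; nothing is written
        // into the file itself.
        File.SetLastWriteTimeUtc(path, newTimestampUtc);

        byte[] after = File.ReadAllBytes(path);
        return before.SequenceEqual(after);
    }
}
```

Afterwards, File.GetLastWriteTimeUtc(path) reports the new timestamp while the content is untouched, which is the opposite of the MP3 and Word cases where the tags are part of the file format.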

Storing additional metadata about a file

For a small project, I would like to be able to store additional information about a file and keep that information with the file even when it is moved.
The additional information will be stored in an XML file. To keep the file and its description together, I thought about using ZIP archives without any compression, but I would like these ZIP archives to behave just like the original files (i.e. if the original file was a video file, a double-click on the archive should open the file in the media player). This requires me to write a small program that handles this 'new' file format.
However, I have not found a solution that would allow me to open the file without first extracting the file from the archive (even without compression), which does take some time and is not what I want.
My questions are: Is there a library (for C# or C/C++) that allows me to open a ZIP file and directly play/open a file inside it without extracting the archive? Or is there an easier way to implement what I need (maybe I am thinking in the wrong direction)?

Windows already allows you to store additional metadata about a shell item (including files) through the Windows Property System.
The Windows API Code Pack includes samples and documentation on how to work with many of the native OS capabilities, including the Property System.
The following excerpts come from the PropertyEdit sample.
To get a file's property by name:
var myObject = ShellObject.FromParsingName(fileName);
IShellProperty prop = myObject.Properties.GetProperty(propertyName);
To set a string property:
if (prop.ValueType == typeof(string))
{
    (prop as ShellProperty<string>).Value = value;
}
If you don't want to use the Property System, you can use NTFS alternate data streams (ADS) to store additional info about a file. There is no direct support for ADS in .NET, but a simple search returns multiple wrappers, libraries and SO questions about them, e.g. NTFS - Alternate Data Streams.

Automatically add and delete images in WP7 app storage

I want to make a "recent pages" section in my WP7 app that shows thumbnails of the 6 most recently browsed pages. How can I write a method that keeps only 6 image files in storage and replaces the oldest ones when new ones arrive?

Assuming that you define "new" based on the date/time that the image file in IsolatedStorage was created, you could determine this by querying GetCreationTime on the file.
You can use IsolatedStorageFile.GetFileNames to determine how many / which files exist. Note: you probably want to create these files in a specific folder so you don't have to worry about other files in IsolatedStorage.
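Putting these two calls together, a pruning method might look like the sketch below. RecentPagesStore, the "RecentPages" folder name and the limit constant are hypothetical; the store is passed in as a parameter so the caller decides how to obtain it.

```csharp
using System;
using System.IO.IsolatedStorage;
using System.Linq;

public static class RecentPagesStore
{
    // Hypothetical dedicated folder, so other files in IsolatedStorage are ignored.
    public const string Folder = "RecentPages";
    public const int MaxImages = 6;

    // Deletes everything in the folder except the MaxImages newest files.
    public static void Prune(IsolatedStorageFile store)
    {
        if (!store.DirectoryExists(Folder))
        {
            return;
        }

        // GetFileNames returns bare file names, so the folder is prepended again.
        var surplus = store.GetFileNames(Folder + "/*")
            .Select(name => Folder + "/" + name)
            .OrderByDescending(path => store.GetCreationTime(path))
            .Skip(MaxImages)
            .ToList();

        foreach (string path in surplus)
        {
            store.DeleteFile(path);
        }
    }
}
```

On the phone, this would typically be called right after saving a new thumbnail, with a store obtained from IsolatedStorageFile.GetUserStoreForApplication().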
