Trying to improve multi-page TIFF file splitting

I am trying to improve the speed at which I can split a multi-page TIFF file into its individual pages, stored as a list of byte arrays. I have this TiffPaginator class that I'm working on, and I want to improve the speed of its Paginate method.
I have heard of LibTiff.net, and wonder if it would be any faster than this process? Currently, it takes about 1333 ms to call the Paginate method on a 7-page multipage TIFF file.
Does anyone know what would be the most efficient way to retrieve the individual pages of a multipage TIFF as byte arrays? Or possibly have any suggestions as to how I can improve the speed of the process I'm currently using?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
namespace TiffSplitter
{
public class TiffPaginator
{
private List<byte[]> paginatedData;
public List<byte[]> Pages
{
get
{
return paginatedData;
}
}
public TiffPaginator()
{
paginatedData = new List<byte[]>();
}
public void Paginate(string Filename)
{
using (Image img = Image.FromFile(Filename))
{
paginatedData.Clear();
int frameCount = img.GetFrameCount(FrameDimension.Page);
for (int i = 0; i < frameCount; i++)
{
img.SelectActiveFrame(new FrameDimension(img.FrameDimensionsList[0]), i);
using (MemoryStream memstr = new MemoryStream())
{
img.Save(memstr, ImageFormat.Tiff);
paginatedData.Add(memstr.ToArray());
}
}
}
}
}
}

I tried using LibTiff.net, and for me it was quite slow. The time to split a single 2-page TIFF was measured in seconds.
In the end, I decided to reference PresentationCore and go with this:
(It splits the images to multiple files, but it should be simple to switch the output to byte arrays)
// "filename" is a placeholder for the path of the source TIFF.
// The using blocks dispose both streams even when the file has a single page or an exception is thrown.
using (Stream imageStreamSource = new FileStream("filename", FileMode.Open, FileAccess.Read, FileShare.Read))
{
    TiffBitmapDecoder decoder = new TiffBitmapDecoder(imageStreamSource, BitmapCreateOptions.PreservePixelFormat, BitmapCacheOption.Default);
    int pagecount = decoder.Frames.Count;
    if (pagecount > 1)
    {
        string fNameBase = Path.GetFileNameWithoutExtension("filename");
        string filePath = Path.GetDirectoryName("filename");
        for (int i = 0; i < pagecount; i++)
        {
            // The SplitImages folder is expected to exist already.
            string outputName = string.Format(@"{0}\SplitImages\{1}-{2}.tif", filePath, fNameBase, i);
            using (FileStream stream = new FileStream(outputName, FileMode.Create, FileAccess.Write))
            {
                // Write one frame per output file.
                TiffBitmapEncoder encoder = new TiffBitmapEncoder();
                encoder.Frames.Add(decoder.Frames[i]);
                encoder.Save(stream);
            }
        }
    }
}
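As a rough sketch of the "switch the output to byte arrays" step mentioned above, each frame can be encoded into a MemoryStream instead of a file. The class and method names (TiffPageSplitter, SplitToByteArrays) are made up for illustration, and no timing has been measured for this version, so treat it as a starting point rather than a benchmarked solution.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Windows.Media.Imaging; // from PresentationCore

public static class TiffPageSplitter
{
    // Hypothetical helper: returns every page of a multi-page TIFF as its own
    // single-page TIFF, encoded into a byte array.
    // The caller owns tiffStream and must keep it open until this returns.
    public static List<byte[]> SplitToByteArrays(Stream tiffStream)
    {
        var pages = new List<byte[]>();
        var decoder = new TiffBitmapDecoder(
            tiffStream,
            BitmapCreateOptions.PreservePixelFormat,
            BitmapCacheOption.Default);

        foreach (BitmapFrame frame in decoder.Frames)
        {
            // One encoder per page, so each output holds exactly one frame.
            var encoder = new TiffBitmapEncoder();
            encoder.Frames.Add(frame);
            using (var memory = new MemoryStream())
            {
                encoder.Save(memory);
                pages.Add(memory.ToArray());
            }
        }
        return pages;
    }
}
```

Typical use would be to open the file with File.OpenRead inside a using block and pass that stream in.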

Related

Word Interop - Save embedded shape as image

I am attempting to save an embedded shape as an image using C#.
If the object is embedded as an actual image (WMF/JPEG), I can retrieve the image without issue. But when the object is an embedded shape, or an OLE object that displays as an image in Word, I cannot seem to extract or retrieve it in order to copy it to the clipboard or save it as an image.
Here is my current code sample; either the object is empty or I get the following error:
System.Runtime.InteropServices.ExternalException: 'A generic error occurred in GDI+.'
Any help is appreciated. Thank you
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;
using System.Windows;
using System.Windows.Forms;
using System.Windows.Media;
using System.Windows.Media.Imaging;
namespace ImageMagickSandboxWinForms
{
public partial class frmMain : Form
{
public frmMain()
{
InitializeComponent();
}
public static BitmapSource ConvertBitmap(Bitmap source)
{
return System.Windows.Interop.Imaging.CreateBitmapSourceFromHBitmap(
source.GetHbitmap(),
IntPtr.Zero,
Int32Rect.Empty,
BitmapSizeOptions.FromEmptyOptions());
}
public static Bitmap BitmapFromSource(BitmapSource bitmapsource)
{
Bitmap bitmap;
using (var outStream = new MemoryStream())
{
BitmapEncoder enc = new BmpBitmapEncoder();
enc.Frames.Add(BitmapFrame.Create(bitmapsource));
enc.Save(outStream);
bitmap = new Bitmap(outStream);
}
return bitmap;
}
private void button1_Click(object sender, EventArgs e)
{
string physicsDocLocation = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop));
physicsDocLocation += @"\[Doc path Here].docx";
var wordApp = new Microsoft.Office.Interop.Word.Application();
var wordDoc = wordApp.Documents.Open(physicsDocLocation);
int iCount = wordDoc.InlineShapes.Count;
for (int i = 1; i < (wordDoc.InlineShapes.Count + 1); i++)
{
var currentInlineShape = wordDoc.InlineShapes[i];
currentInlineShape.Range.Select();
wordDoc.ActiveWindow.Selection.Range.Copy();
BitmapSource clipBoardImage = System.Windows.Clipboard.GetImage();
Bitmap bmpClipImage = BitmapFromSource(clipBoardImage);
string finalPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), @"TestConversions");
finalPath += @"\" + Guid.NewGuid().ToString() + ".jpg";
using (MemoryStream memory = new MemoryStream())
{
using (FileStream fs = new FileStream(finalPath, FileMode.Create, FileAccess.ReadWrite))
{
bmpClipImage.Save(memory, ImageFormat.Jpeg); // <-- Error happens here.
byte[] bytes = memory.ToArray();
fs.Write(bytes, 0, bytes.Length);
}
}
}
wordDoc.Close();
wordApp.Quit();
}
}
}

I have this code in my library. I don't know where I found it, but I hope it does the job for you. I am using the clipboard to capture the different images; just don't forget that a thread is needed to access the clipboard.
for (var i = 1; i <= wordApplication.ActiveDocument.InlineShapes.Count; i++)
{
var inlineShapeId = i;
var thread = new Thread(() => SaveInlineShapeToFile(inlineShapeId, wordApplication));
// STA is needed in order to access the clipboard
// https://stackoverflow.com/a/518724/700926
thread.SetApartmentState(ApartmentState.STA);
thread.Start();
thread.Join();
}
// General idea is based on: https://stackoverflow.com/a/7937590/700926
protected static void SaveInlineShapeToFile(int inlineShapeId, Application wordApplication)
{
// Get the shape, select, and copy it to the clipboard
var inlineShape = wordApplication.ActiveDocument.InlineShapes[inlineShapeId];
inlineShape.Select();
wordApplication.Selection.Copy();
// Check data is in the clipboard
if (Clipboard.GetDataObject() != null)
{
var data = Clipboard.GetDataObject();
// Check if the data conforms to a bitmap format
if (data != null && data.GetDataPresent(DataFormats.Bitmap))
{
// Fetch the image and convert it to a Bitmap
var image = (Image) data.GetData(DataFormats.Bitmap, true);
var currentBitmap = new Bitmap(image);
// Save the bitmap to a file
currentBitmap.Save(@"C:\Users\Username\Documents\" + String.Format("img_{0}.png", inlineShapeId));
}
}
}
Next, depending on whether you are using WinForms or WPF, the clipboard behaves differently for an image:
if (Clipboard.ContainsImage())
{
// ImageUIElement.Source = Clipboard.GetImage(); // does not work
System.Windows.Forms.IDataObject clipboardData = System.Windows.Forms.Clipboard.GetDataObject();
if (clipboardData != null)
{
if (clipboardData.GetDataPresent(System.Windows.Forms.DataFormats.Bitmap))
{
System.Drawing.Bitmap bitmap = (System.Drawing.Bitmap)clipboardData.GetData(System.Windows.Forms.DataFormats.Bitmap);
ImageUIElement.Source = System.Windows.Interop.Imaging.CreateBitmapSourceFromHBitmap(bitmap.GetHbitmap(), IntPtr.Zero, Int32Rect.Empty,BitmapSizeOptions.FromEmptyOptions());
Console.WriteLine("Clipboard copied to UIElement");
}
}
}
Finally, if that does not work because of a bug in the format conversion, there is this solution. It is in French, but the logic of using "DeviceIndependentBitmap" is easy to follow.

Best way to read a short array from disk in C#?

I have to write 4GB short[] arrays to and from disk, so I have found a function to write the arrays, and I am struggling to write the code to read the array from the disk. I normally code in other languages so please forgive me if my attempt is a bit pathetic so far:
using UnityEngine;
using System.Collections;
using System.IO;
public class RWShort : MonoBehaviour {
public static void WriteShortArray(short[] values, string path)
{
using (FileStream fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
{
using (BinaryWriter bw = new BinaryWriter(fs))
{
foreach (short value in values)
{
bw.Write(value);
}
}
}
} //Above is fine, here is where I am confused:
public static short[] ReadShortArray(string path)
{
byte[] thisByteArray= File.ReadAllBytes(fileName);
short[] thisShortArray= new short[thisByteArray.length/2];
for (int i = 0; i < 10; i+=2)
{
thisShortArray[i]= ? convert from byte array;
}
return thisShortArray;
}
}

Shorts are two bytes, so you have to read in two bytes each time. I'd also recommend using yield return like this so that you aren't trying to pull everything into memory in one go, though if you need all of the shorts together that won't help you. It depends on what you're doing with the data.
void Main()
{
short[] values = new short[] {
1, 999, 200, short.MinValue, short.MaxValue
};
WriteShortArray(values, @"C:\temp\shorts.txt");
foreach (var shortInfile in ReadShortArray(@"C:\temp\shorts.txt"))
{
Console.WriteLine(shortInfile);
}
}
public static void WriteShortArray(short[] values, string path)
{
using (FileStream fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
{
using (BinaryWriter bw = new BinaryWriter(fs))
{
foreach (short value in values)
{
bw.Write(value);
}
}
}
}
public static IEnumerable<short> ReadShortArray(string path)
{
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
using (BinaryReader br = new BinaryReader(fs))
{
byte[] buffer = new byte[2];
while (br.Read(buffer, 0, 2) > 0)
yield return (short)(buffer[0]|(buffer[1]<<8));
}
}
You could also define it this way, taking advantage of the BinaryReader:
public static IEnumerable<short> ReadShortArray(string path)
{
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
using (BinaryReader br = new BinaryReader(fs))
{
while (br.BaseStream.Position < br.BaseStream.Length)
yield return br.ReadInt16();
}
}

Memory-mapping the file is your friend. There's a MemoryMappedViewAccessor.ReadInt16 function that lets you read the data directly, as type short, out of the OS disk cache. There is also a Write() overload that accepts an Int16, plus ReadArray and WriteArray functions if you are calling functions that need a traditional .NET array.
Overview of using Memory-mapped files in .NET on MSDN
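A minimal sketch of the memory-mapped approach, assuming the file holds raw little-endian shorts as written by BinaryWriter on a little-endian machine. The names MappedShortReader and ReadShorts are invented for this example. It reads one window of the file at a time, which sidesteps the problem of holding a whole 4 GB array in memory.

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;

public static class MappedShortReader
{
    // Hypothetical helper: reads `count` shorts starting at element index
    // `start` from a file of raw 16-bit values. count must be greater than zero.
    public static short[] ReadShorts(string path, long start, int count)
    {
        var result = new short[count];
        long byteOffset = start * sizeof(short);
        long byteLength = (long)count * sizeof(short);

        // Map the file read-only; a capacity of 0 means "use the file's size".
        using (var mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var view = mmf.CreateViewAccessor(
            byteOffset, byteLength, MemoryMappedFileAccess.Read))
        {
            // Positions inside the view are relative to byteOffset.
            view.ReadArray(0, result, 0, count);
        }
        return result;
    }
}
```

Calling ReadShorts(path, 0, 1000000) repeatedly with an increasing start index would process the file in chunks of one million values.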
If you want to do it with ordinary file I/O, use a block size of 1 or 2 megabytes and the Buffer.BlockCopy function to move data en masse between byte[] and short[], and use the FileStream functions that accept a byte[]. Forget about BinaryWriter or BinaryReader, forget about doing 2 bytes at a time.
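The block-based approach described above could look roughly like the following. The class name BlockShortIO and the 1 MB block size are choices made for this sketch. Because Buffer.BlockCopy takes int byte offsets, this version only handles arrays under 2 GB; a 4 GB data set would have to be split across several arrays or files. It also assumes the file was written on a machine with the same byte order.

```csharp
using System;
using System.IO;

public static class BlockShortIO
{
    private const int BlockSize = 1 << 20; // 1 MB per read or write

    public static void WriteShortArray(short[] values, string path)
    {
        long totalBytes = (long)values.Length * sizeof(short);
        if (totalBytes > int.MaxValue)
            throw new NotSupportedException("Sketch is limited to arrays under 2 GB.");

        var block = new byte[BlockSize];
        // FileMode.Create truncates any existing file at this path.
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            int done = 0;
            while (done < totalBytes)
            {
                int chunk = (int)Math.Min(block.Length, totalBytes - done);
                // Offsets and counts in BlockCopy are in bytes, not elements.
                Buffer.BlockCopy(values, done, block, 0, chunk);
                fs.Write(block, 0, chunk);
                done += chunk;
            }
        }
    }

    public static short[] ReadShortArray(string path)
    {
        long fileBytes = new FileInfo(path).Length;
        if (fileBytes > int.MaxValue)
            throw new NotSupportedException("Sketch is limited to files under 2 GB.");

        // Ignore a trailing odd byte, if any.
        int totalBytes = (int)(fileBytes - fileBytes % sizeof(short));
        var result = new short[totalBytes / sizeof(short)];
        var block = new byte[BlockSize];
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int done = 0;
            while (done < totalBytes)
            {
                int want = Math.Min(block.Length, totalBytes - done);
                int got = fs.Read(block, 0, want);
                if (got == 0) throw new EndOfStreamException();
                Buffer.BlockCopy(block, 0, result, done, got);
                done += got;
            }
        }
        return result;
    }
}
```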
It's also possible to do the I/O directly into a .NET array with the help of P/Invoke; see my answer using ReadFile and passing the FileStream object's SafeFileHandle property here. But even though this has no extra copies, it still shouldn't keep up with the memory-mapped ReadArray and WriteArray calls.

Multiple file in one Stream, custom stream

Following the answer here, I want to write multiple file streams into one stream, as follows:
4 bytes are reserved for the length of each stream
each stream's content is written after its length (after the 4 bytes)
In the end, the stream will look something like this:
Stream = File1 len + File1 stream content + File2 len + File2 stream content + ....
Example code:
result = new ExportResult_C()
{
PackedStudy = packed.ToArray() ,
Stream = new MemoryStream()
};
string[] zipFiles = Directory.GetFiles(zipRoot);
foreach (string fileN in zipFiles)
{
MemoryStream outFile = new MemoryStream(File.ReadAllBytes(fileN));
MemoryStream len = new MemoryStream(4);
//initiate outFile len to 4 byte push it to main stream
//Then push outFile stream to main stream
//Continue and do this for another file
}
//For test Save stream to file(s)
Is this a good idea? I really don't know how to turn those comments into lines of code.
Thanks in advance.

Try this
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
byte[] testMessage = Encoding.UTF8.GetBytes("The quick brown fox jumped over the lazy dog");
MemoryStream outFile = new MemoryStream();
BinaryWriter writer = new BinaryWriter(outFile);
for (int i = 0; i < 10; i++ )
{
writer.Write(BitConverter.GetBytes(testMessage.Length), 0, 4);
writer.Write(testMessage, 0, testMessage.Length);
}
writer.Flush();
outFile.Position = 0;
BinaryReader reader = new BinaryReader(outFile, Encoding.UTF8);
while (outFile.Position < outFile.Length)
{
int size = reader.ReadInt32();
byte[] data = reader.ReadBytes(size);
}
}
}
}
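The same length-prefix framing can be wrapped into a pair of helpers that pack several byte arrays (for example, the results of File.ReadAllBytes) into one stream and unpack them again. This is only a sketch: the names LengthPrefixedPacker, Pack and Unpack are made up here, and Unpack assumes a seekable stream such as a MemoryStream or FileStream.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LengthPrefixedPacker
{
    // Writes each payload as a 4-byte length followed by its content.
    // The writer is flushed but not disposed, so `output` stays open.
    public static void Pack(IEnumerable<byte[]> payloads, Stream output)
    {
        var writer = new BinaryWriter(output);
        foreach (byte[] payload in payloads)
        {
            writer.Write(payload.Length); // 4-byte little-endian Int32
            writer.Write(payload);
        }
        writer.Flush();
    }

    // Reads length/content pairs from the current position to the end.
    public static List<byte[]> Unpack(Stream input)
    {
        var reader = new BinaryReader(input);
        var result = new List<byte[]>();
        while (input.Position < input.Length)
        {
            int size = reader.ReadInt32();
            byte[] data = reader.ReadBytes(size);
            if (data.Length != size)
                throw new EndOfStreamException("Stream ended inside a payload.");
            result.Add(data);
        }
        return result;
    }
}
```

After Pack, set the stream's Position back to 0 before calling Unpack on the same MemoryStream.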

I think there is a better solution, which I posted as an answer to my question here.
The bytes of multiple files are serialized into one stream, and on the client side they are deserialized into a class holding byte arrays.
See here; it may be useful.
But I have accepted @jdweng's solution, and I appreciate his attention and help.

How can I unzip a file to a .NET memory stream?

I have files (from 3rd parties) that are being FTP'd to a directory on our server. I download them and process them every 'x' minutes. Works great.
Now, some of the files are .zip files. Which means I can't process them. I need to unzip them first.
FTP has no concept of zip/unzipping - so I'll need to grab the zip file, unzip it, then process it.
Looking at the MSDN zip API, there seems to be no way I can unzip to a memory stream?
So is the only way to do this...
Unzip to a file (what directory? need some -very- temp location ...)
Read the file contents
Delete file.
NOTE: The contents of the file are small - say 4k <-> 1000k.

Zip compression support is built in:
using System.IO;
using System.IO.Compression;
// ^^^ requires a reference to System.IO.Compression.dll
static class Program
{
const string path = ...
static void Main()
{
using(var file = File.OpenRead(path))
using(var zip = new ZipArchive(file, ZipArchiveMode.Read))
{
foreach(var entry in zip.Entries)
{
using(var stream = entry.Open())
{
// do whatever we want with stream
// ...
}
}
}
}
}
Normally you should avoid copying it into another stream - just use it "as is", however, if you absolutely need it in a MemoryStream, you could do:
using(var ms = new MemoryStream())
{
stream.CopyTo(ms);
ms.Position = 0; // rewind
// do something with ms
}
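Putting the two fragments above together, one possible helper reads every entry of a zip into memory without touching the disk. The names InMemoryUnzip and ReadAllEntries are invented for this sketch, and holding all entries as byte arrays is only sensible for small files like the 4k to 1000k sizes mentioned in the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class InMemoryUnzip
{
    // Hypothetical helper: returns entry name -> uncompressed bytes.
    // zipStream is left open for the caller to dispose.
    public static Dictionary<string, byte[]> ReadAllEntries(Stream zipStream)
    {
        var result = new Dictionary<string, byte[]>();
        using (var zip = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                // Directory entries have an empty Name; skip them.
                if (entry.Name.Length == 0) continue;

                using (Stream source = entry.Open())
                using (var ms = new MemoryStream())
                {
                    source.CopyTo(ms);
                    result[entry.FullName] = ms.ToArray();
                }
            }
        }
        return result;
    }
}
```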

You can use ZipArchiveEntry.Open to get a stream.
This code assumes the zip archive has one text file.
using (FileStream fs = new FileStream(path, FileMode.Open))
using (ZipArchive zip = new ZipArchive(fs) )
{
var entry = zip.Entries.First();
using (StreamReader sr = new StreamReader(entry.Open()))
{
Console.WriteLine(sr.ReadToEnd());
}
}

using (ZipArchive archive = new ZipArchive(webResponse.GetResponseStream()))
{
foreach (ZipArchiveEntry entry in archive.Entries)
{
Stream s = entry.Open();
var sr = new StreamReader(s);
var myStr = sr.ReadToEnd();
}
}

Looks like here is what you need:
using (var za = ZipFile.OpenRead(path))
{
foreach (var entry in za.Entries)
{
using (var r = new StreamReader(entry.Open()))
{
//your code here
}
}
}

You can use SharpZipLib among a variety of other libraries to achieve this.
Their wiki shows the following MemoryStream example. Note that, as its own comments say, this particular sample compresses a memory stream into a zip rather than unzipping one:
using ICSharpCode.SharpZipLib.Zip;
// Compresses the supplied memory stream, naming it as zipEntryName, into a zip,
// which is returned as a memory stream or a byte array.
//
public MemoryStream CreateToMemoryStream(MemoryStream memStreamIn, string zipEntryName) {
MemoryStream outputMemStream = new MemoryStream();
ZipOutputStream zipStream = new ZipOutputStream(outputMemStream);
zipStream.SetLevel(3); //0-9, 9 being the highest level of compression
ZipEntry newEntry = new ZipEntry(zipEntryName);
newEntry.DateTime = DateTime.Now;
zipStream.PutNextEntry(newEntry);
StreamUtils.Copy(memStreamIn, zipStream, new byte[4096]);
zipStream.CloseEntry();
zipStream.IsStreamOwner = false; // False stops the Close also Closing the underlying stream.
zipStream.Close(); // Must finish the ZipOutputStream before using outputMemStream.
outputMemStream.Position = 0;
return outputMemStream;
// Alternative outputs (use one of these in place of the return statement above):
// ToArray is the cleaner and easiest to use correctly, with the penalty of duplicating allocated memory.
// byte[] byteArrayOut = outputMemStream.ToArray();
// GetBuffer returns a raw buffer, so you need to account for the true length yourself.
// byte[] byteArrayOut = outputMemStream.GetBuffer();
// long len = outputMemStream.Length;
}

OK, so combining all of the above, suppose you want a very simple way to take a zip file called
"file.zip" and extract it to the "C:\temp" folder. (Note: this example was only tested with compressed text files.) You may need to make some modifications for binary files.
using System.IO;
using System.IO.Compression;
static void Main(string[] args)
{
//Call it like this:
Unzip("file.zip", @"C:\temp");
}
static void Unzip(string sourceZip, string targetPath)
{
using (var z = ZipFile.OpenRead(sourceZip))
{
foreach (var entry in z.Entries)
{
using (var r = new StreamReader(entry.Open()))
{
string uncompressedFile = Path.Combine(targetPath, entry.Name);
File.WriteAllText(uncompressedFile,r.ReadToEnd());
}
}
}
}
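For binary files, one possible modification is to copy the raw bytes of each entry instead of going through StreamReader and File.WriteAllText, which decode and re-encode the content as text. The sketch below keeps the same flat layout as the code above (it uses entry.Name, so files with the same name in different folders overwrite each other); the class name BinarySafeUnzip is made up for illustration.

```csharp
using System.IO;
using System.IO.Compression;

public static class BinarySafeUnzip
{
    public static void Unzip(string sourceZip, string targetPath)
    {
        Directory.CreateDirectory(targetPath); // no-op if it already exists
        using (ZipArchive zip = ZipFile.OpenRead(sourceZip))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                // Directory entries have an empty Name; skip them.
                if (entry.Name.Length == 0) continue;

                string destination = Path.Combine(targetPath, entry.Name);
                using (Stream source = entry.Open())
                using (FileStream target = File.Create(destination))
                {
                    // Byte-for-byte copy, so no text encoding is applied.
                    source.CopyTo(target);
                }
            }
        }
    }
}
```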

Read last line in open file [duplicate]

I'm fairly new to all this, but I feel like I'm pretty close to making this work; I just need a little help! I want to create a DLL which can read and return the last line in a file that is open in another application. This is what my code looks like. I just don't know what to put in the while statement.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace SharedAccess
{
public class ReadShare {
static void Main(string path) {
FileStream stream = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
StreamReader reader = new StreamReader(stream);
while (!reader.EndOfStream)
{
//What goes here?
}
}
}
}

To read the last line:
var lastLine = File.ReadLines("YourFileName").Last();
If it's a large file:
public static String ReadLastLine(string path)
{
return ReadLastLine(path, Encoding.ASCII, "\n");
}
public static String ReadLastLine(string path, Encoding encoding, string newline)
{
int charsize = encoding.GetByteCount("\n");
byte[] buffer = encoding.GetBytes(newline);
using (FileStream stream = new FileStream(path, FileMode.Open))
{
long endpos = stream.Length / charsize;
for (long pos = charsize; pos < endpos; pos += charsize)
{
stream.Seek(-pos, SeekOrigin.End);
stream.Read(buffer, 0, buffer.Length);
if (encoding.GetString(buffer) == newline)
{
buffer = new byte[stream.Length - stream.Position];
stream.Read(buffer, 0, buffer.Length);
return encoding.GetString(buffer);
}
}
}
return null;
}
I referred to this question:
How to read only last line of big text file

File ReadLines should work for you.
var value = File.ReadLines("yourFile.txt").Last();
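Since the question is about a file that another application has open, note that File.ReadLines opens the file with its default sharing mode, which may fail with an IOException if the other application is holding the file open for writing. A small sketch that keeps the FileShare.ReadWrite option from the question and fills in the while loop could look like this (the class name SharedFileReader is made up for illustration, and it scans the whole file, so it suits small to medium files):

```csharp
using System.IO;

public static class SharedFileReader
{
    // Returns the last line of the file, or null if the file is empty.
    public static string ReadLastLine(string path)
    {
        // FileShare.ReadWrite lets other processes keep reading and writing.
        using (var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new StreamReader(stream))
        {
            string last = null;
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                last = line; // remember the most recent line seen
            }
            return last;
        }
    }
}
```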
