Read Excel using NPOI - c#

I am trying to read Excel files (xls and xlsx) using NPOI. I am using the following code, but it throws 'Unable to Read entire header; 27 bytes Read; expected 512 bytes' while reading an 8 KB xls file:
byte[] byteArray = Encoding.UTF8.GetBytes(filepath);
MemoryStream stream = new MemoryStream(byteArray);
MemoryStream stream1 = new MemoryStream(Encoding.UTF8.GetBytes(filepath ?? ""));
NPOI.HSSF.UserModel.HSSFWorkbook hssfwb = default(HSSFWorkbook);
hssfwb = new NPOI.HSSF.UserModel.HSSFWorkbook(stream1);
Sheet sheet = hssfwb.GetSheetAt(0);
DataTable dtinputExcel = new DataTable();
I have tried every piece of code I could find online for this error. What is a reliable way to read an Excel file (xls/xlsx) of any size?

The problem is that the constructor of HSSFWorkbook is expecting a stream containing the contents of the spreadsheet file, while you are passing it a MemoryStream containing the name of the file. You should be using a FileStream to read the file and passing that stream to the HSSFWorkbook constructor.
Try it like this:
IWorkbook hssfwb;
using (FileStream fs = new FileStream(filepath, FileMode.Open, FileAccess.Read))
{
hssfwb = new HSSFWorkbook(fs);
}
ISheet sheet = hssfwb.GetSheetAt(0);
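The "27 bytes Read" in the error message fits this diagnosis: Encoding.UTF8.GetBytes(filepath) encodes the characters of the path, so the stream is only as long as the path text. The sketch below illustrates the difference using only System.IO (no NPOI). The class name, method names, and the temp file standing in for the spreadsheet are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class PathVersusContents
{
    // Length of a stream built from the path *string* (what the question's code does).
    public static long PathStreamLength(string filepath)
    {
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(filepath)))
            return stream.Length;
    }

    // Length of a stream opened on the file itself (what the workbook constructor needs).
    public static long FileStreamLength(string filepath)
    {
        using (var fs = new FileStream(filepath, FileMode.Open, FileAccess.Read))
            return fs.Length;
    }

    public static void Main()
    {
        // Hypothetical stand-in for an 8 KB .xls file.
        string filepath = Path.Combine(Path.GetTempPath(), "path_vs_contents_demo.bin");
        File.WriteAllBytes(filepath, new byte[8192]);

        Console.WriteLine(PathStreamLength(filepath)); // byte count of the path text
        Console.WriteLine(FileStreamLength(filepath)); // 8192

        File.Delete(filepath);
    }
}
```

The first number depends only on how long the path is, which is why the reported byte count is far smaller than the file.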

Related

Using memory stream instead of filestream for AWS C# S3 SDK not returning full file

I have some code that is meant to retrieve a file from an S3 bucket and deserialize the file. When I use a file stream I get all the data no problem, but when I use a memory stream it seems like I am not getting all of the data:
It doesn't get the full XML:
It should look like:
Here is the code I am using:
internal object ReadDataContractFromFile(string filename, Type type)
{
GetObjectRequest getObjRequest = new GetObjectRequest();
MemoryStream memoryStream = new MemoryStream();
getObjRequest.BucketName = bucketName;
getObjRequest.Key = filename;
string path = @"C:\{PATH_TO_FILE}\requests\" + filename;
FileStream fileStream = new FileStream(path, FileMode.Create, FileAccess.ReadWrite);
using (GetObjectResponse getObjRespone = s3Client.GetObject(getObjRequest))
using (Stream responseStream = getObjRespone.ResponseStream)
{
responseStream.CopyTo(fileStream);
//memoryStream.Seek(0, 0);
XmlReaderSettings rs = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment,
};
XmlReader r = XmlReader.Create(fileStream, rs);
return new DataContractSerializer(type).ReadObject(r);
}
}
If I use memoryStream in place of the fileStream variable, I get the incomplete XML I showed above. I tried seeking to the beginning of the stream in case the position was wrong, but that didn't fix it. Any idea what I'm doing wrong?

C# equivalent to zlib.decompress

What is the equivalent of the Python function zlib.decompress() in C#? I need to decompress some zlib files using C# and I don't know how to do it.
Python example:
import zlib
file = open("myfile", mode = "rb")
data = zlib.decompress(file.read())
uncompressed_output = open("output_file", mode = "wb")
uncompressed_output.write(data)
I tried using the System.IO.Compression.DeflateStream class, but for every file I try it gives me an exception that the file contains invalid data while decoding.
byte[] binary = new byte[1000000];
using (DeflateStream compressed_file = new DeflateStream(new FileStream(@"myfile", FileMode.Open, FileAccess.Read), CompressionMode.Decompress))
compressed_file.Read(binary, 0, 1000000); //exception here
using (BinaryWriter outputFile = new BinaryWriter(new FileStream(@"output_file", FileMode.Create, FileAccess.Write)))
outputFile.Write(binary);
//Reading the file like normal with a BinaryReader and then turning it into a MemoryStream also didn't work
I should probably mention that the files are ZLIB compressed files. They start with the 78 9C header.
So, luckily, I found this post: https://stackoverflow.com/a/33855097/10505778
Basically the file must be stripped of its 2 header bytes (78 9C) before it is handed to DeflateStream. The 9C byte does matter for decompression (among other things, it specifies whether a preset dictionary has been used), but I don't need it, and I am pretty sure it would not be difficult to modify this to accommodate it:
// Requires: using System.IO; using System.IO.Compression;
byte[] binary = File.ReadAllBytes(@"myfile"); // read the entire file
byte[] decompressed;
using (MemoryStream memory_stream = new MemoryStream(binary, false))
using (MemoryStream output = new MemoryStream())
{
    memory_stream.Seek(2, SeekOrigin.Begin); // discard the 2 header bytes
    using (DeflateStream compressed_file = new DeflateStream(memory_stream, CompressionMode.Decompress))
        compressed_file.CopyTo(output); // a single Read call may return only part of the data
    decompressed = output.ToArray();
}
File.WriteAllBytes(@"output_file", decompressed);
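As an alternative to stripping the header by hand, .NET 6 and later include System.IO.Compression.ZLibStream, which reads the zlib header and trailer itself. The following is a minimal sketch assuming .NET 6+. The class and method names are illustrative, and the Compress helper exists only to produce sample data for the round trip.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZlibDemo
{
    // Decompress a zlib-wrapped buffer (2-byte header, deflate data, checksum trailer).
    public static byte[] Decompress(byte[] zlibData)
    {
        using (var input = new MemoryStream(zlibData, writable: false))
        using (var zlib = new ZLibStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            zlib.CopyTo(output); // reads until the end of the compressed data
            return output.ToArray();
        }
    }

    // Produce zlib data to test with.
    public static byte[] Compress(byte[] raw)
    {
        using (var output = new MemoryStream())
        {
            // Dispose the ZLibStream first so all compressed bytes are flushed to output.
            using (var zlib = new ZLibStream(output, CompressionMode.Compress, leaveOpen: true))
                zlib.Write(raw, 0, raw.Length);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("hello zlib");
        byte[] packed = Compress(original);
        Console.WriteLine(Encoding.UTF8.GetString(Decompress(packed))); // hello zlib
    }
}
```

For a file on disk, Decompress(File.ReadAllBytes("myfile")) would play the role of Python's zlib.decompress(file.read()).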

Getting a corrupted XLS file using NPOI

I tried to debug this code, but I didn't manage. Do any of you have any idea why my script creates a corrupted XLS file?
string strCaleSalvareTest = #"C:\Users\andrei.tudor\Documents\TipMacheta.xls";
HSSFWorkbook wbXLS;
strEr = "Er";
try
{
fsXLSCitire = new FileStream(strCaleSalvareTest, FileMode.OpenOrCreate, FileAccess.Write, FileShare.None);
wbXLS = new HSSFWorkbook(fsXLSCitire);
strEr = string.Empty;
}
catch (Exception ex)
{
strEr = ex.Message;
}
When I try to run this, it jumps from wbXLS creation to the catch exception block.
You're getting an exception because you are creating a new FileStream for writing (FileAccess.Write) and passing it to the constructor of HSSFWorkbook which is expecting to be able to read from the stream. The file is corrupt because the FileStream is creating the file, but nothing is ever written to it.
If you're just trying to create a new blank workbook and save it to a file, you can do that as shown below. Note that you need to add at least one worksheet to the new workbook, or you will still generate a corrupt file.
// Create a new workbook with an empty sheet
HSSFWorkbook wbXLS = new HSSFWorkbook();
ISheet sheet = wbXLS.CreateSheet("Sheet1");
// Write the workbook to a file
string fileName = @"C:\Users\andrei.tudor\Documents\TipMacheta.xls";
using (FileStream stream = new FileStream(fileName, FileMode.Create, FileAccess.Write))
{
wbXLS.Write(stream);
}
If you're trying to read an existing workbook OR create a new one if it doesn't exist, you need to do something like this:
string fileName = @"C:\Users\andrei.tudor\Documents\TipMacheta.xls";
HSSFWorkbook wbXLS;
try
{
// Try to open and read existing workbook
using (FileStream stream = new FileStream(fileName, FileMode.Open, FileAccess.Read))
{
wbXLS = new HSSFWorkbook(stream);
}
}
catch (FileNotFoundException)
{
// Create a new workbook with an empty sheet
wbXLS = new HSSFWorkbook();
wbXLS.CreateSheet("Sheet1");
}
ISheet sheet = wbXLS.GetSheetAt(0); // Get first sheet
// ...
// Write workbook to file
using (FileStream stream = new FileStream(fileName, FileMode.Create, FileAccess.Write))
{
wbXLS.Write(stream);
}

Uploading ".xlsx" file using DropBox API making file corrupted

DropboxClient dbx = new DropboxClient("************************");
var file = "/Excel/FileName.xlsx";
byte[] bytes = null;
FileStream fs = new FileStream("C:\\Users\\Admin\\Desktop\\Test.xlsx", FileMode.Open, FileAccess.Read);
BinaryReader br = new BinaryReader(fs);
long numBytes = fs.Length;
bytes = br.ReadBytes((int)numBytes);
var mem = new MemoryStream(Encoding.UTF8.GetBytes(bytes.ToString()));
var updated = await dbx.Files.UploadAsync(file, WriteMode.Overwrite.Instance, body: mem);
Here is the code. It overwrites the existing file as needed, but the uploaded file ends up corrupted.
I think you're overcomplicating this. UploadAsync expects a Stream. MemoryStream is indeed a Stream, but so is FileStream. Getting rid of the extra reader will result in:
var source = "C:\\Users\\Admin\\Desktop\\Test.xlsx";
var target = "/Excel/FileName.xlsx";
using(var dbx = new DropboxClient("***"))
using(var fs = new FileStream(source, FileMode.Open, FileAccess.Read))
{
var updated = await dbx.Files.UploadAsync(
target, WriteMode.Overwrite.Instance, body: fs);
}
The file gets corrupted because the data is read incorrectly. bytes.ToString() returns the array's type name, the string "System.Byte[]", not its contents. You're actually uploading the text System.Byte[] instead of the file's contents, which is not a valid Excel document. Converting a binary file into UTF-8 text doesn't work either, because the conversion alters the content being uploaded.
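The ToString() behavior is easy to check in isolation. This small sketch uses arbitrary sample bytes and a hypothetical helper name, and involves no Dropbox API:

```csharp
using System;
using System.Text;

public static class ByteArrayToStringDemo
{
    // Reproduces the question's conversion: the result is the UTF-8 encoding
    // of the array's type name, regardless of what the array contains.
    public static byte[] WhatGetsUploaded(byte[] fileBytes)
    {
        return Encoding.UTF8.GetBytes(fileBytes.ToString());
    }

    public static void Main()
    {
        byte[] fileBytes = { 0x50, 0x4B, 0x03, 0x04 }; // arbitrary sample bytes
        Console.WriteLine(fileBytes.ToString());               // System.Byte[]
        Console.WriteLine(WhatGetsUploaded(fileBytes).Length); // 13
    }
}
```

Whatever the size of the source file, the uploaded body would be those same 13 bytes.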

How do I convert this to read a zip file? [duplicate]

I am reading an unzipped binary file from disk like this:
string fn = @"c:\MyBinaryFile.DAT";
byte[] ba = File.ReadAllBytes(fn);
MemoryStream msReader = new MemoryStream(ba);
I now want to increase speed of I/O by using a zipped binary file. But how do I fit it into the above schema?
string fn = @"c:\MyZippedBinaryFile.GZ";
//Put something here
byte[] ba = File.ReadAllBytes(fn);
//Or here
MemoryStream msReader = new MemoryStream(ba);
What is the best way to achieve this?
I need to end up with a MemoryStream as my next step is to deserialize it.
You'd have to use a GZipStream on the content of your file.
So basically it should be like this:
string fn = @"c:\MyZippedBinaryFile.GZ";
byte[] ba = File.ReadAllBytes(fn);
using (MemoryStream msReader = new MemoryStream(ba))
using (GZipStream zipStream = new GZipStream(msReader, CompressionMode.Decompress))
{
// Read from zipStream instead of msReader
}
To account for the valid comment by flindenberg, you can also open the file directly without having to read the entire file into memory first:
string fn = @"c:\MyZippedBinaryFile.GZ";
using (FileStream stream = File.OpenRead(fn))
using (GZipStream zipStream = new GZipStream(stream, CompressionMode.Decompress))
{
// Read from zipStream instead of stream
}
You need to end up with a memory stream? No problem:
string fn = @"c:\MyZippedBinaryFile.GZ";
using (FileStream stream = File.OpenRead(fn))
using (GZipStream zipStream = new GZipStream(stream, CompressionMode.Decompress))
using (MemoryStream ms = new MemoryStream())
{
    zipStream.CopyTo(ms);
    ms.Seek(0, SeekOrigin.Begin); // don't forget to rewind the stream!
    // Read from ms
}
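If the MemoryStream has to outlive the block that created it, for example to be handed to a deserializer elsewhere, one option is a helper that returns it rewound and leaves disposal to the caller. This is a sketch with hypothetical names (GZipToMemory, ReadGZipFile); Main writes its own sample .gz file to the temp directory so the example runs on its own.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GZipToMemory
{
    // Decompress a .gz file into a MemoryStream positioned at the start.
    // The caller owns (and should dispose) the returned stream.
    public static MemoryStream ReadGZipFile(string path)
    {
        var ms = new MemoryStream();
        using (FileStream file = File.OpenRead(path))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        {
            gzip.CopyTo(ms);
        }
        ms.Position = 0; // rewind so the next reader starts at the first byte
        return ms;
    }

    public static void Main()
    {
        // Hypothetical sample file written to the temp directory for the demo.
        string path = Path.Combine(Path.GetTempPath(), "gzip_to_memory_demo.gz");
        byte[] payload = { 1, 2, 3, 4, 5 };
        using (FileStream file = File.Create(path))
        using (var gzip = new GZipStream(file, CompressionMode.Compress))
            gzip.Write(payload, 0, payload.Length);

        using (MemoryStream ms = ReadGZipFile(path))
            Console.WriteLine(ms.Length); // 5

        File.Delete(path);
    }
}
```

Returning the stream instead of wrapping it in a using block matters here: a MemoryStream declared in a using statement is disposed as soon as the block ends, so it can only be read inside that block.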
