I want to decompress a file that was uploaded gzip-encoded to S3, streaming it straight to a file.
Here is the code that wraps the S3 object stream in a decompressing GZipStream:
using var stream = await _s3.GetObjectStreamAsync(_processServiceOptions.BucketName, key, null);
using var gzipStream = new GZipStream(stream, CompressionMode.Decompress, true);
await WriteToFileAsync(gzipStream);
I'm trying to use it like this to copy the data directly to a file stream, instead of loading it into memory through another stream:
async Task WriteToFileAsync(Stream data)
{
using (var fs = File.OpenWrite(path))
{
await data.CopyToAsync(fs);
}
}
However, I'm getting System.IO.InvalidDataException: The archive entry was compressed using an unsupported compression method.
Why is that?
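One way to narrow this down (a diagnostic sketch, not part of the original code, reusing the _s3 client and key from the snippet above) is to check whether the downloaded object actually starts with the gzip magic bytes 0x1F 0x8B before handing it to GZipStream:
// Diagnostic only: download the object again and peek at the first two bytes.
// A valid gzip stream always begins with 0x1F 0x8B; anything else means the stored
// object is not gzip as-is (e.g. it was uploaded uncompressed or decompressed in transit).
using var probe = await _s3.GetObjectStreamAsync(_processServiceOptions.BucketName, key, null);
var header = new byte[2];
var read = await probe.ReadAsync(header, 0, 2); // a network stream may return fewer bytes per call
Console.WriteLine($"Gzip magic present: {read == 2 && header[0] == 0x1F && header[1] == 0x8B}");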
// Adapted from https://www.cnblogs.com/mahuanpeng/p/6851793.html
// Compress bytes
// 1. Create a memory stream to hold the compressed output
// 2. Wrap it in a GZipStream set to compression mode
// 3. Write the bytes to be compressed into the GZipStream
public static byte[] CompressBytes(byte[] bytes)
{
    using (MemoryStream compressStream = new MemoryStream())
    {
        using (var zipStream = new GZipStream(compressStream, System.IO.Compression.CompressionLevel.SmallestSize))
            zipStream.Write(bytes, 0, bytes.Length);
        return compressStream.ToArray();
    }
}
// Unzip the bytes
// 1. Create a memory stream over the compressed data
// 2. Create the GZipStream object and pass in the compressed stream
// 3. Create the target stream
// 4. Copy zipStream into the target stream
// 5. Return the target stream's bytes
public static byte[] Decompress(byte[] bytes)
{
    using (var compressStream = new MemoryStream(bytes))
    {
        using (var zipStream = new GZipStream(compressStream, System.IO.Compression.CompressionLevel.SmallestSize))
        {
            using (var resultStream = new MemoryStream())
            {
                zipStream.CopyTo(resultStream);
                return resultStream.ToArray();
            }
        }
    }
}
This may seem correct, but the decompression code throws the following exception:
Unhandled exception. System.NotSupportedException: Specified method is not supported.
   at System.IO.Compression.DeflateStream.CopyTo(Stream destination, Int32 bufferSize)
   at System.IO.Compression.GZipStream.CopyTo(Stream destination, Int32 bufferSize)
   at System.IO.Stream.CopyTo(Stream destination)
Version: .NET 6
I already tried this: C# Unable to copy to MemoryStream from GZipStream
I need to compress and decompress the data in memory, without FileStreams or temporary files. I'm on .NET 6; the compression algorithm doesn't matter, as long as it is in the .NET base class library rather than a NuGet package (though if there is a better alternative on NuGet, I would consider it). Other approaches are also acceptable, as long as byte[] compression and decompression performs well. The program needs to be cross-platform!
You created a compression stream, but you need a decompression stream instead:
using var unzipStream = new GZipStream(compressStream, CompressionMode.Decompress);
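For reference, a corrected version of the Decompress method from the question would look like this (same structure as above, only the GZipStream constructor argument changes):
public static byte[] Decompress(byte[] bytes)
{
    using (var compressStream = new MemoryStream(bytes))
    {
        // CompressionMode.Decompress makes the stream readable and inflates the input
        using (var zipStream = new GZipStream(compressStream, CompressionMode.Decompress))
        using (var resultStream = new MemoryStream())
        {
            zipStream.CopyTo(resultStream);
            return resultStream.ToArray();
        }
    }
}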
Whenever I upload a file to the SFTP server with a .csv file extension, the only thing inside that file is the text "System.IO.MemoryStream". If it has a .txt extension, it contains all the values. I can manually convert the .txt to .csv and it is fine. Is it possible to upload it directly to the SFTP server as a CSV file?
The SFTP Service is using the SSH.NET library by Renci.
Using statement:
using (var stream = csvFileWriter.Write(data, new CsvMapper()))
{
byte[] file = Encoding.UTF8.GetBytes(stream.ToString());
sftpService.Put(SftpCredential.Credentials.Id, file, $"/file.csv");
}
SFTP service:
public void Put(int credentialId, byte[] source, string destination)
{
    using (SftpClient client = new SftpClient(GetConnectionInfo(credentialId)))
    {
        ConnectClient(client);
        using (MemoryStream memoryStream = new MemoryStream(source))
        {
            client.BufferSize = 4 * 1024; // bypass Payload error for large files
            client.UploadFile(memoryStream, destination);
        }
        DisconnectClient(client);
    }
}
Solution:
The csvFileWriter I was using returned a Stream, not a MemoryStream, so by switching the csvFileWriter and CsvPut() over to MemoryStream it worked.
Updated using statement:
using (var stream = csvFileWriter.Write(data, new CsvMapper()))
{
stream.Position = 0;
sftpService.CsvPut(SftpCredential.Credentials.Id, stream, $"/file.csv");
}
Updated SFTP service:
public void CsvPut(int credentialId, MemoryStream source, string destination)
{
using (SftpClient client = new SftpClient(GetConnectionInfo(credentialId)))
{
ConnectClient(client);
client.BufferSize = 4 * 1024; //bypass Payload error large files
client.UploadFile(source, destination);
DisconnectClient(client);
}
}
It looks like csvFileWriter.Write already returns a MemoryStream, and its ToString returns the string "System.IO.MemoryStream". That's the root cause of your problem.
Additionally, as you already have a MemoryStream, it's overkill to copy it to yet another MemoryStream; upload it directly. Copying the data over and over again is just a waste of memory.
Like this:
var stream = csvFileWriter.Write(data, new CsvMapper());
stream.Position = 0;
client.UploadFile(stream, destination);
See also:
Upload data from memory to SFTP server using SSH.NET
When uploading memory stream with contents created by csvhelper using SSH.NET to SFTP server, the uploaded file is empty
A simple test code to upload in-memory data:
var stream = new MemoryStream();
stream.Write(Encoding.UTF8.GetBytes("this is test"));
stream.Position = 0;
using (var client = new SftpClient("example.com", "username", "password"))
{
client.Connect();
client.UploadFile(stream, "/remote/path/file.txt");
}
You can avoid the unnecessary intermediate MemoryStream like this:
using (var sftp = new SftpClient(GetConnectionInfo(SftpCredential.GetById(credentialId).Id)))
{
sftp.Connect();
using (var uplfileStream = System.IO.File.OpenRead(fileName))
{
sftp.UploadFile(uplfileStream, fileName, true);
}
sftp.Disconnect();
}
I have a service that downloads a *.tgz file from a remote endpoint. I use SharpZipLib to extract and write the content of that compressed archive to disk. But now I want to prevent writing the files to disk (because that process doesn't have write permissions on that disk) and keep them in memory.
How can I access the decompressed files from memory? (Let's assume the archive holds simple text files)
Here is what I have so far:
public void Decompress(byte[] byteArray)
{
Stream inStream = new MemoryStream(byteArray);
Stream gzipStream = new GZipInputStream(inStream);
TarArchive tarArchive = TarArchive.CreateInputTarArchive(gzipStream);
tarArchive.ExtractContents(@".");
tarArchive.Close();
gzipStream.Close();
inStream.Close();
}
Check this and this out.
Turns out, ExtractContents() works by iterating over TarInputStream. When you create your TarArchive like this:
TarArchive.CreateInputTarArchive(gzipStream);
it actually wraps the stream you're passing into a TarInputStream. Thus, if you want more fine-grained control over how you extract files, you must use TarInputStream directly.
See how you can iterate over files, directories, and actual file contents like this:
Stream inStream = new MemoryStream(byteArray);
Stream gzipStream = new GZipInputStream(inStream);
using (var tarInputStream = new TarInputStream(gzipStream))
{
TarEntry entry;
while ((entry = tarInputStream.GetNextEntry()) != null)
{
var fileName = entry.Name;
using (var fileContents = new MemoryStream())
{
tarInputStream.CopyEntryContents(fileContents);
// use entry, fileName or fileContents here
}
}
}
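Since the question assumes the archive holds simple text files, the "use entry, fileName or fileContents here" spot above could, for example, decode each entry to a string (an assumption on top of the snippet; plain UTF-8 decoding, requires System.Text):
// Decode the current entry as UTF-8 text (directory entries simply produce an empty string)
var text = Encoding.UTF8.GetString(fileContents.ToArray());
// 'text' now holds the decompressed content of the entry named 'fileName'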
I am trying to upload large files (1 GB+) to Google Drive using the Google Drive API. My code works fine with smaller files, but with larger files an error occurs.
The error occurs in the part of the code where the file is converted into a byte[]:
byte[] data = System.IO.File.ReadAllBytes(filepath);
An OutOfMemoryException is thrown here.
You probably followed the developers.google.com suggestions and are doing this:
byte[] byteArray = System.IO.File.ReadAllBytes(filename);
MemoryStream stream = new MemoryStream(byteArray);
try {
FilesResource.InsertMediaUpload request = service.Files.Insert(body, stream, mimeType);
request.Upload();
I have no idea why they suggest putting the whole file in a byte array and then creating a MemoryStream over it.
I think a better way is this:
using(var stream = new System.IO.FileStream(filename,
System.IO.FileMode.Open,
System.IO.FileAccess.Read))
{
try
{
FilesResource.InsertMediaUpload request = service.Files.Insert(body, stream, mimeType);
request.Upload();
.
.
.
}
I am trying to upload a compressed GZipStream to an SFTP server using the SSH.NET library. The problem is that when I create the GZipStream, it cannot be read any more. Below is my code:
using (SftpClient client = new SftpClient(connectionInfo))
{
client.Connect();
client.ChangeDirectory("/upload");
var uploadFileDirectory = client.WorkingDirectory + "\testXml.xml.gz";
using (GZipStream gzs = new GZipStream(stream, CompressionLevel.Fastest))
{
stream.CopyTo(gzs);
client.UploadFile(gzs, "text.xml.gz");
}
}
The SftpClient's UploadFile takes a stream, and I need to upload the GZipStream that is being compressed (without storing it to a local drive and then reading it back). But the GZipStream doesn't allow reading while it is compressing. I tried doing the upload outside the GZipStream using block, and it says the stream cannot be accessed.
How can I approach this? Is it even possible to do it directly this way, or do I need to write it to a local drive and then upload it?
For future reference, I managed to find out how to do this. You can't read the GZipStream while it is in compression mode, but you can create another MemoryStream from the previous stream's bytes, like this:
using (SftpClient client = new SftpClient(connectionInfo))
{
client.Connect();
client.ChangeDirectory("/upload");
using (MemoryStream outputStream = new MemoryStream())
{
using (var gzip = new GZipStream(outputStream, CompressionLevel.Fastest))
{
stream.CopyTo(gzip);
}
using (Stream stm = new MemoryStream(outputStream.ToArray()))
{
client.UploadFile(stm,"txt.gz");
}
}
}
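A small variation on the same idea (a sketch, assuming the same source stream and client as above) avoids the second MemoryStream entirely: create the GZipStream with leaveOpen: true so that disposing it flushes the gzip footer without closing the underlying stream, then rewind that stream and upload it directly:
using (MemoryStream outputStream = new MemoryStream())
{
    // leaveOpen: true keeps outputStream usable after the GZipStream is disposed
    using (var gzip = new GZipStream(outputStream, CompressionLevel.Fastest, leaveOpen: true))
    {
        stream.CopyTo(gzip);
    } // disposing the GZipStream writes the remaining compressed data and the gzip footer
    outputStream.Position = 0; // rewind so UploadFile reads from the start
    client.UploadFile(outputStream, "txt.gz");
}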