Looking at my code below, I am amazed at the amount of boilerplate code I have to write just to ensure that a library downloads a file correctly.
Is there any reason why I see 0 KB downloaded streams, or is it just normal to have to write a method like this?
public static async Task<string> DownloadSASUriInputDataAsync(string workingDirectory, string sasUri)
{
Trace.TraceInformation("{0}", sasUri);
var input = new CloudBlockBlob(new Uri(sasUri));
input.ServiceClient.DefaultRequestOptions.RetryPolicy = new ExponentialRetry(TimeSpan.FromMilliseconds(100), 10);
var fileName = Path.GetFileName(input.Name);
await Retry.LinearAsync(async () =>
{
try
{
using (var ms = new MemoryStream())
{
await input.DownloadToStreamAsync(ms);
ms.Seek(0, SeekOrigin.Begin);
if (ms.Length == 0)
{
throw new RunAlgorithmException("Downloaded file was 0 byte");
}
using (var fs = new FileStream(Path.Combine(workingDirectory, fileName), FileMode.Create, FileAccess.Write))
{
await ms.CopyToAsync(fs);
}
}
Trace.TraceInformation("downloaded file");
}
catch (StorageException ex)
{
Trace.TraceError("Failed to DownloadSASUriInputDataAsync : {0}", ex.ToString());
throw;
}
}, TimeSpan.FromMilliseconds(500),10);
return fileName;
}
The issue with all the 0 KB streams was that the blobs were still being copied.
Blobs can still be accessed while a copy to them is in progress, and that produces the behavior above.
Adding a check, before trying to download, that blob.CopyState is either completed or missing ensures that it works as the SLA states.
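For reference, a minimal sketch of that pre-download check, assuming the same WindowsAzure.Storage client used above; the WaitForCopyCompletionAsync name and the 500 ms poll interval are illustrative rather than part of the original code:
private static async Task WaitForCopyCompletionAsync(CloudBlockBlob blob)
{
    // Refresh the blob's properties so CopyState reflects the current status.
    await blob.FetchAttributesAsync();
    while (blob.CopyState != null && blob.CopyState.Status == CopyStatus.Pending)
    {
        await Task.Delay(TimeSpan.FromMilliseconds(500));
        await blob.FetchAttributesAsync();
    }
    // CopyState == null means the blob was never a copy destination, which is fine too.
    if (blob.CopyState != null && blob.CopyState.Status != CopyStatus.Success)
    {
        throw new RunAlgorithmException("Blob copy did not complete successfully");
    }
}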
Related
In my scenario, I have a use case where I must receive a compressed file, do some validations, and then find a specific file within the archive that I'll have to handle through a third-party library. I'm having some trouble getting that library to read the file, though. This is what I came up with so far:
public async Task ShapeIt(ZipArchive archive)
{
foreach (var entry in archive.Entries)
{
if (Path.GetExtension(entry.FullName).Equals(".shp"))
{
var stream = entry.Open();
using var ms = new MemoryStream();
await stream.CopyToAsync(ms);
ms.Position = 0;
var fileName = Path.GetTempFileName();
try
{
using var fileStream = File.Open(fileName, FileMode.OpenOrCreate, FileAccess.Write,
FileShare.ReadWrite);
var bytes = new byte[ms.Length];
ms.Read(bytes, 0, (int)ms.Length);
fileStream.Write(bytes, 0, bytes.Length);
fileStream.Flush();
fileStream.Close();
var featureSource = new ShapeFileFeatureSource(fileName); // Class from 3rd-party
featureSource.Open();
// Do my stuff with the shapefile
}
finally
{
File.Delete(fileName);
}
}
}
}
Take notice that I'm using the "old way" of copying streams, as Stream.CopyTo and Stream.CopyToAsync were creating empty files; explicitly calling fileStream.Close() looks like the only way to get the bytes into the file somehow, but that's beside my point. Regardless, after closing the stream, upon calling featureSource.Open() my application throws
"The process cannot access the file 'C:\\Users\\me\\AppData\\Local\\Temp\\tmpE926.tmp' because it is
being used by another process."
tmpE926.tmp being different every time, obviously. Also take notice that I'm creating a file because the constructor for ShapeFileFeatureSource demands not a stream, not a byte array, but a path.
A much shorter implementation
public async Task ShapeIt(ZipArchive archive)
{
foreach (var entry in archive.Entries)
{
var tempFile = Path.GetTempFileName();
try
{
entry.ExtractToFile(tempFile, true);
if (Path.GetExtension(entry.FullName).Equals(".shp"))
{
var featureSource = new ShapeFileFeatureSource(tempFile);
featureSource.Open();
var type = featureSource.GetShapeFileType();
}
}
finally
{
File.Delete(tempFile);
}
}
}
will actually amount to the same error. I honestly don't think the problem lies within this library, but rather I'm the one screwing it up somehow. Does anyone have any ideas or should I contact the vendor's (unresponsive) tech support?
Edit: Just in case, this is the library: Install-Package ThinkGeo.UI.WebApi, but you have to subscribe to an evaluation in order to use it.
I could not find a .NET Core package with such classes, so I reproduced it with the .NET Framework NuGet package. My answer mostly demonstrates how to deal with streams; it would be hard to tell what is wrong with your code without having access to the library you have.
using DotSpatial.Data;
using System.IO;
using System.IO.Compression;
namespace ConsoleApp12
{
class Program
{
static void Main(string[] args)
{
using (var fs = File.OpenRead(@"C:\Users\jjjjjjjjjjjj\Downloads\1270055001_mb_2011_vic_shape.zip"))
using (var zipFile = new ZipArchive(fs))
{
foreach (var entry in zipFile.Entries)
{
if (entry.FullName.EndsWith(".shp"))
{
var tempFile = Path.GetTempFileName();
try
{
using (var entryStream = entry.Open())
using (var newFileStream = File.OpenWrite(tempFile))
{
entryStream.CopyTo(newFileStream);
}
var featureSource = ShapefileFeatureSource.Open(tempFile);
var type = featureSource.ShapeType;
}
finally
{
File.Delete(tempFile);
}
}
}
}
}
}
}
UPD: I installed a trial version of the ThinkGeo library; instead of an unauthorized exception it gives me a FileNotFoundException with the following stack trace:
at ThinkGeo.Core.ValidatorHelper.CheckFileIsExist(String pathFilename)
at ThinkGeo.Core.ShapeFileIndex.xh8=(FileAccess readWriteMode)
^^^^^^^^^^^^^^^^^^^^^^^^^ Are we supposed to have an index?
at ThinkGeo.Core.ShapeFile.xh8=(FileAccess readWriteMode)
at ThinkGeo.Core.ShapeFileFeatureSource.WjE=()
at ThinkGeo.Core.ShapeFileFeatureSource.OpenCore()
at ThinkGeo.Core.FeatureSource.Open()
at ConsoleApp20.Program.Main(String[] args) in
C:\Users\jjjjjjjjjjjj\source\repos\ConsoleApp20\ConsoleApp20\Program.cs:line 45
ShapeFileIndex? So I thought I should dig in that direction:
var featureSource = new ShapeFileFeatureSource(tempFile);
featureSource.RequireIndex = false; // no luck
featureSource.Open();
I tried to find any reference to the .idx file it wants; it has a property IndexFilePathName, but unfortunately I am out of luck. (I also tried a different folder, so it is not a 'Temp' folder issue.)
This code morphed for a couple of days until I reached out to tech support; they tinkered with it a bit and came up with this:
public async Task ProcessFile(IFormFile file)
{
if (!Path.GetExtension(file.FileName).Equals(".zip"))
throw new System.Exception("File should be compressed in '.zip' format");
var filePaths = new List<string>();
using (var stream = new MemoryStream())
{
await file.CopyToAsync(stream);
using (var archive = new ZipArchive(stream, ZipArchiveMode.Read, false))
{
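// Map each entry's original base name to one shared temp base name, so files that belong together
// (e.g. a shapefile's .shp, .shx and .dbf) still share a base name after extraction.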
var replaceList = new Dictionary<string, string>();
foreach (ZipArchiveEntry entry in archive.Entries)
{
var tempPath = Path.GetTempFileName();
string key = Path.GetFileNameWithoutExtension(entry.FullName);
string value = Path.GetFileNameWithoutExtension(tempPath);
if (replaceList.ContainsKey(key))
{
value = replaceList[key];
}
else
{
replaceList.Add(key, value);
}
string unzippedPath = Path.Combine(Path.GetDirectoryName(tempPath), value + Path.GetExtension(entry.FullName));
entry.ExtractToFile(unzippedPath, true);
filePaths.Add(unzippedPath);
}
foreach (var unzippedPath in filePaths)
{
if (Path.GetExtension(unzippedPath).Equals(".shp"))
{
// Successfully doing third-party library stuff
}
}
foreach (var unzippedPath in filePaths)
{
if (File.Exists(unzippedPath))
{
File.Delete(unzippedPath);
}
}
}
}
}
It works. I'm happy.
I have the following code that generates two kinds of errors. First, with the current code I get the exception 'NotSupportedException: This stream from ZipArchiveEntry does not support reading.' How am I supposed to read the data?
Furthermore, if I use a MemoryStream (as in the commented code) then I can read the data and deserialize correctly, but the MemoryStream I created still remains in memory even though Dispose has been called on it, causing memory leaks. Any idea what is wrong with this code?
void Main()
{
List<Product> products;
using (var s = GetDb().Result)
{
products = Utf8Json.JsonSerializer.Deserialize<List<Product>>(s).ToList();
}
}
// Define other methods and classes here
public static Task<Stream> GetDb()
{
var filepath = Path.Combine("c:/users/tom/Downloads", "productdb.zip");
using (var archive = ZipFile.OpenRead(filepath))
{
var data = archive.Entries.Single(e => e.FullName == "productdb.json");
return Task.FromResult(data.Open());
//using (var reader = new StreamReader(data.Open()))
//{
// var ms = new MemoryStream();
// data.Open().CopyTo(ms);
// ms.Seek(0, SeekOrigin.Begin);
// return Task.FromResult((Stream)ms);
//}
}
}
With the commented code you open the stream into a reader, don't use the reader, then open the stream again and copy over to the memory stream without closing the second opened stream.
It is the second opened stream that remains in memory, not the MemoryStream.
Refactor
public static async Task<Stream> GetDb() {
var filepath = Path.Combine("c:/users/tom/Downloads", "productdb.zip");
using (var archive = ZipFile.OpenRead(filepath)) {
var entry = archive.Entries.Single(e => e.FullName == "productdb.json");
using (var stream = entry.Open()) {
var ms = new MemoryStream();
await stream.CopyToAsync(ms);
ms.Position = 0; // rewind so the caller reads from the start
return ms;
}
}
}
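A minimal usage sketch under the same assumptions as the question (Utf8Json and the Product type); note that an async Main requires C# 7.1 or later, and disposing the returned MemoryStream after deserialization is the caller's responsibility:
static async Task Main() {
    List<Product> products;
    // the MemoryStream is released by this using block once deserialization is done
    using (var s = await GetDb()) {
        products = Utf8Json.JsonSerializer.Deserialize<List<Product>>(s).ToList();
    }
}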
I have a REST GET API, written using the WCF library, that returns a Stream for a specific requested file located on the API server hosting the web service application. The service works well if the requested file is small, that is, less than 100 MB. But if the file size is greater than 100 MB, the service returns 0 bytes without any logged information I can get from the library method (say, from the "catch" block).
The library method (in the class library project) that returns the Stream of the needed file is:
public Stream GetFile(string fileId, string seekStartPosition=null)
{
_lastActionResult = string.Empty;
Stream fileStream = null;
try
{
Guid fileGuid;
if (Guid.TryParse(fileId, out fileGuid) == false)
{
_lastActionResult = string.Format(ErrorMessage.FileIdInvalidT, fileId);
}
else
{
ContentPackageItemService contentItemService = new ContentPackageItemService();
string filePath = DALCacheHelper.GetFilePath(fileId);
if (File.Exists(filePath))
{
fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read);
long seekStart = 0;
// if seek position is specified, move the stream pointer to that location
if (string.IsNullOrEmpty(seekStartPosition) == false && long.TryParse(seekStartPosition, out seekStart))
{
// make sure seek position is smaller than file size
FileInfo fi = new FileInfo(filePath);
if (seekStart >= 0 && seekStart < fi.Length)
{
fileStream.Seek(seekStart, SeekOrigin.Begin);
}
else
{
_lastActionResult = string.Format(ErrorMessage.FileSeekInvalidT, seekStart, fi.Length);
}
}
}
else
{
_lastActionResult = string.Format(ErrorMessage.FileNotFoundT, fileId);
Logger.Write(_lastActionResult,
"General", 1, Constants.LogId.RESTSync, System.Diagnostics.TraceEventType.Error, System.Reflection.MethodBase.GetCurrentMethod().Name);
}
}
}
catch(Exception ex)
{
Logger.Write(ex,"General", 1, Constants.LogId.RESTSync, System.Diagnostics.TraceEventType.Error, System.Reflection.MethodBase.GetCurrentMethod().Name);
}
return fileStream;
}
API method on the client-side project (where the .svc file is):
[WebGet(UriTemplate = "files/{fileid}")]
public Stream GetFile(string fileid)
{
ContentHandler handler = new ContentHandler();
Stream fileStream = null;
try
{
fileStream = handler.GetFile(fileid);
}
catch (Exception ex)
{
Logger.Write(string.Format("{0} {1}", ex.Message, ex.StackTrace), "General", 1, Constants.LogId.RESTSync, System.Diagnostics.TraceEventType.Error, System.Reflection.MethodBase.GetCurrentMethod().Name);
throw new WebFaultException<ErrorResponse>(new ErrorResponse(HttpStatusCode.InternalServerError, ex.Message), HttpStatusCode.InternalServerError);
}
if (fileStream == null)
{
throw new WebFaultException<ErrorResponse>(new ErrorResponse(handler.LastActionResult), HttpStatusCode.InternalServerError);
}
return fileStream;
}
As you are using REST, I presume you are using the WebHttpBinding. You need to set the MaxReceivedMessageSize on the client binding to be sufficient for the maximum expected response size. The default is 64K. Here's the msdn documentation for the property if you are creating your binding in code. If you are creating your binding in your app.config, then this is the documentation you need.
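If the binding is created in code, a minimal sketch might look like this; the 512 MB figure is only an example value, and streamed transfer is an extra option for very large files rather than something shown in the original code:
using System.ServiceModel;

// Client-side binding sized for responses larger than the 64K default.
var binding = new WebHttpBinding
{
    MaxReceivedMessageSize = 512L * 1024 * 1024, // default is 65536 bytes
    TransferMode = TransferMode.Streamed         // optional: stream large bodies instead of buffering them
};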
I used the nunrar library to extract a .rar file:
RarArchive.WriteToDirectory(fs.Name, Path.Combine(@"D:\DataDownloadCenter", path2), ExtractOptions.Overwrite);
The decompression works fine, but after the extraction I can't delete the original compressed file
System.IO.File.Delete(path);
because the file is being used by another process.
The whole function:
try
{
FileStream fs = File.OpenRead(path);
if(path.Contains(".rar")){
try
{
RarArchive.WriteToDirectory(fs.Name, Path.Combine(@"D:\DataDownloadCenter", path2), ExtractOptions.Overwrite);
fs.Close();
}
catch { }
}
}
catch { return; }
finally
{
if (zf != null)
{
zf.IsStreamOwner = true; // Makes close also shut the underlying stream
zf.Close(); // Ensure we release resources
}
}
try
{
System.IO.File.Delete(path);
}
catch { }
So how can I delete the compressed file after extracting it?
I don't know what zf is, but you can likely wrap that in a using statement as well. Try replacing your FileStream fs part with this:
using( FileStream fs = File.OpenRead(path))
{
if(path.Contains(".rar"))
{
try
{
RarArchive.WriteToDirectory(fs.Name, Path.Combine(@"D:\DataDownloadCenter", path2), ExtractOptions.Overwrite);
}
catch { }
}
}
This way fs is closed even if path doesn't contain .rar. You're only closing the fs if rar exists within the filename.
Also, does the library have its own stream handling? It could have a method that closes it.
I also had this issue with nunrar; neither Close() nor a using statement seemed to fix it.
Unfortunately the documentation is scarce, so I'm now using the SharpCompress library, which is a fork of the nunrar library according to the devs of nunrar. The documentation on SharpCompress is also scarce (but less so), so here is the method I'm using:
private static bool unrar(string filename)
{
bool error = false;
string outputpath = Path.GetDirectoryName(filename);
try
{
using (Stream stream = File.OpenRead(filename))
{
var reader = ReaderFactory.Open(stream);
while (reader.MoveToNextEntry())
{
if (!reader.Entry.IsDirectory)
{
Console.WriteLine(reader.Entry.Key);
reader.WriteEntryToDirectory(outputpath, new ExtractionOptions() { ExtractFullPath = true, Overwrite = true });
}
}
}
}
catch (Exception e)
{
Console.WriteLine("Failed: " + e.Message);
error = true;
}
if (!error)
{
File.Delete(filename);
}
return error;
}
Add the following using directives at the top:
using SharpCompress.Common;
using SharpCompress.Readers;
Install it using NuGet. This method works for SharpCompress v0.22.0 (the latest at the time of writing).
I am hoping that I won't have to copy the full example or create a minimal repro, and that the issue is just a basic knowledge gap of mine.
I have an application that syncs data with Azure Storage; it all works fine, and the upload part is already in an async method.
I was trying to optimize a little more and wanted to change a stream.CopyTo to a CopyToAsync.
using(var filestream = streamProvider())
{
filestream.CopyTo(stream);
stream.Position = 0;
}
I changed it to await filestream.CopyToAsync(stream) and only parts of my files are uploaded.
At the time of writing I have found out that the exception thrown is "Found invalid data while decoding", an IO.InvalidDataException.
This leads me back to the stream being a GZipStream or DeflateStream, so I guess my question is whether CopyToAsync is simply not expected to work with those kinds of streams?
Context
public async Task UploadFile(string storePath, Func<Stream> streamProvider, bool copyLocalBeforeUpload = false)
{
using (new Timer(this.MessageQueue, "UPLOAD", storePath))
{
using (var stream = copyLocalBeforeUpload ? new MemoryStream() : streamProvider() )
{
try
{
if (copyLocalBeforeUpload)
{
using (var filestream = streamProvider())
{
await filestream.CopyToAsync(stream);
stream.Position = 0;
}
}
var blob = GetBlobReference(storePath);
await blob.UploadFromStreamAsync(stream,new BlobRequestOptions { RetryPolicy = new LinearRetry(TimeSpan.FromMilliseconds(100), 3) });
}
catch (Exception e)
{
if (this.MessageQueue != null)
{
this.MessageQueue.Enqueue(string.Format("Failed uploading '{0}'", storePath));
}
}
}
}
}