I want to download several files with WebClient, using multiple threads running at the same time. My URL structure depends on a variable 'int i', so I use a for loop to generate the URLs and file paths. The problem is that by the time a started thread actually runs, the url and path values have already changed. The timeline looks like this:
In main loop, url = "url1" and path = "filepath1".
Thread1 is called with value "url1" and "filepath1".
In main loop, url = "url2" and path = "filepath2".
Thread2 is called with value "url2" and "filepath2".
Thread1 started with value "url2" and "filepath2".
Thread2 started with value "url2" and "filepath2".
I couldn't find any elegant solutions. What would you suggest?
string path = "";
string url = "";
string baseURL = "http://www.somewebsite.com/12/";
for (int i = 10; i <= DateTime.Now.Month; i++)
{
path = "C:\\folder\\" + i.ToString() + ".html";
url = baseURL + i.ToString();
Thread webThread = new Thread(delegate()
{
downloadScheduleFile(url,path);
});
webThread.Start();
}
private void downloadScheduleFile(string url, string filepath)
{
var client = new WebClient();
try
{
client.DownloadFile(url, filepath);
}
catch(WebException e) {
Console.WriteLine(System.Threading.Thread.CurrentThread.Name+e.Message);
}
}
It's because by the time your thread starts, path and url have changed. You have to declare them closer to where they are used, as local copies inside the loop.
string baseURL = "http://www.somewebsite.com/12/";
for (int i = 10; i <= DateTime.Now.Month; i++)
{
string path = "C:\\folder\\" + i.ToString() + ".html"; // path declared here
string url = baseURL + i.ToString(); // url declared here
Thread webThread = new Thread(delegate()
{
downloadScheduleFile(url,path);
});
webThread.Start();
}
The way your code is written, all threads calling downloadScheduleFile reference the same two variables defined in the enclosing block. What you should do is give every thread its own set of variables.
You need to capture the variables from the outer scope within the delegate. I'm pretty sure you can do this:
string path = "";
string url = "";
string baseURL = "http://www.somewebsite.com/12/";
for (int i = 10; i <= DateTime.Now.Month; i++)
{
path = "C:\\folder\\" + i.ToString() + ".html";
url = baseURL + i.ToString();
Thread webThread = new Thread(delegate()
{
string innerPath = path;
string innerUrl = url;
downloadScheduleFile(innerUrl,innerPath);
});
webThread.Start();
}
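But give it a try, as you might end up with the same issue: the copies are only made when the delegate runs, and by then the loop may already have reassigned path and url.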
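To see why the place where the variable is declared matters, here is a small sketch that replaces the threads with delegates invoked only after the loop has finished (a deterministic stand-in for a thread that starts late). The class and method names are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

static class ClosureCaptureDemo
{
    // One 'url' variable is shared by every delegate, so all of them see its final value.
    public static List<string> SharedVariable()
    {
        var actions = new List<Func<string>>();
        string url = "";
        for (int i = 1; i <= 3; i++)
        {
            url = "url" + i;
            actions.Add(() => url);
        }
        return Run(actions);
    }

    // 'url' is declared inside the loop body, so each iteration captures a fresh variable.
    public static List<string> PerIterationVariable()
    {
        var actions = new List<Func<string>>();
        for (int i = 1; i <= 3; i++)
        {
            string url = "url" + i;
            actions.Add(() => url);
        }
        return Run(actions);
    }

    // Copying inside the delegate is too late: the copy is made when the delegate runs.
    public static List<string> CopyInsideDelegate()
    {
        var actions = new List<Func<string>>();
        string url = "";
        for (int i = 1; i <= 3; i++)
        {
            url = "url" + i;
            actions.Add(() => { string inner = url; return inner; });
        }
        return Run(actions);
    }

    // Invokes every delegate after the loop that created them has completed.
    static List<string> Run(List<Func<string>> actions)
    {
        var results = new List<string>();
        foreach (var action in actions) results.Add(action());
        return results;
    }
}
```

SharedVariable and CopyInsideDelegate both return "url3" three times, while PerIterationVariable returns "url1", "url2", "url3". With real threads the outcome of the shared versions depends on timing, which is what makes the bug intermittent.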
Related
I have a script that scrapes a website, but the while loop does not work. The script downloads a page and then checks whether it is an HTML page or an image.
If the downloaded item is an HTML file, it saves it and adds 1 to both i and the URL number.
Problem: the URL does not change, even though I think it should with this code.
int i = 0;
while (i < 5)
{
using var client = new WebClient();
client.Headers.Add("User-Agent", "C# console program");
int urlnumb = 1;
string url = "http://localhost:7211/database/resource/pk/" + urlnumb;
string content = client.DownloadString(url);
string htmldefiner = "html";
if (content.Contains(htmldefiner))
{
string savedirectory = @"C:/Temp/" + i + ".html";
System.IO.File.WriteAllText(savedirectory, content);
i++;
urlnumb++;
File.WriteAllText(@"C:/Temp/" + i + ".txt", url);
}
else
{
urlnumb++;
}
}
I have an issue with files.
I am writing an image importer: clients put their files on an FTP server and can then import them into the application.
During the import process I copy the file from the FTP folder to another folder with File.Copy:
public List<Visuel> ImportVisuel(int galerieId, string[] images)
{
Galerie targetGalerie = MemoryCache.GetGaleriById(galerieId);
List<FormatImage> listeFormats = MemoryCache.FormatImageToList();
int i = 0;
List<Visuel> visuelAddList = new List<Visuel>();
List<Visuel> visuelUpdateList = new List<Visuel>();
List<Visuel> returnList = new List<Visuel>();
foreach (string item in images)
{
i++;
Progress.ImportProgress[Progress.Guid] = "Image " + i + " sur " + images.Count() + " importées";
string extension = Path.GetExtension(item);
string fileName = Path.GetFileName(item);
string originalPath = HttpContext.Current.Request.PhysicalApplicationPath + "Uploads\\";
string destinationPath = HttpContext.Current.Server.MapPath("~/Images/Catalogue") + "\\";
Visuel importImage = MemoryCache.GetVisuelByFilName(fileName);
bool update = true;
if (importImage == null) { importImage = new Visuel(); update = false; }
Size imageSize = importImage.GetJpegImageSize(originalPath + fileName);
FormatImage format = listeFormats.Where(f => f.width == imageSize.Width && f.height == imageSize.Height).FirstOrDefault();
string saveFileName = Guid.NewGuid() + extension;
File.Copy(originalPath + fileName, destinationPath + saveFileName);
if (format != null)
{
importImage.format = format;
switch (format.key)
{
case "Catalogue":
importImage.fileName = saveFileName;
importImage.originalFileName = fileName;
importImage.dossier = targetGalerie;
importImage.dossier_id = targetGalerie.id;
importImage.filePath = "Images/Catalogue/";
importImage.largeur = imageSize.Width;
importImage.hauteur = imageSize.Height;
importImage.isRoot = true;
if (update == false) { MemoryCache.Add(ref importImage); returnList.Add(importImage); }
if (update == true) visuelUpdateList.Add(importImage);
foreach (FormatImage f in listeFormats)
{
if (f.key.StartsWith("Catalogue_"))
{
string[] keys = f.key.Split('_');
string destinationFileName = saveFileName.Insert(saveFileName.IndexOf('.'), "-" + keys[1].ToString());
string destinationFileNameDeclinaison = destinationPath + destinationFileName;
VisuelResizer declinaison = new VisuelResizer();
declinaison.Save(originalPath + fileName, f.width, f.height, 1000, destinationFileNameDeclinaison);
Visuel visuel = MemoryCache.GetVisuelByFilName(fileName.Insert(fileName.IndexOf('.'), "-" + keys[1].ToString()));
update = true;
if (visuel == null) { visuel = new Visuel(); update = false; }
visuel.parent = importImage;
visuel.filePath = "Images/Catalogue/";
visuel.fileName = destinationFileName;
visuel.originalFileName = string.Empty;
visuel.format = f;
//visuel.dossier = targetGalerie; we don't care about this for the variants
visuel.largeur = f.width;
visuel.hauteur = f.height;
if (update == false)
{
visuelAddList.Add(visuel);
}
else
{
visuelUpdateList.Add(visuel);
}
//importImage.declinaisons.Add(visuel);
}
}
break;
}
}
}
MemoryCache.Add(ref visuelAddList);
// FUNCTION to implement
MemoryCache.Update(ref visuelUpdateList);
return returnList;
}
After some processing on the copy (the original file is no longer used), the client gets a pop-up asking whether he wants to delete the original files in the FTP folder.
If he clicks OK, another method is called on the same controller, and this method uses:
public void DeleteImageFile(string[] files)
{
for (int i = 0; i < files.Length; i++)
{
File.Delete(HttpContext.Current.Request.PhysicalApplicationPath + files[i].Replace(@"/", @"\"));
}
}
This method works fine and deletes the right files when I use it in other contexts.
But here I get an error message:
The process cannot access the file ... because it is being used by another process.
Does anyone have an idea?
Thank you.
Here's the screenshot of Process Explorer
There are a couple of things you can do here.
1) If you can reproduce it, use Process Explorer at that moment to see which process is locking the file. If it is your own process, make sure you close the file handle once your work is done.
2) Wrap the delete statement in try/catch and retry after a few seconds to see whether the file handle has been released.
3) If you can do it offline, put the files in a queue and delete them later.
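Option 2 could look roughly like the sketch below. The helper name TryDeleteWithRetry and its parameters are hypothetical, and the retry count and delay are values you would have to tune. Note that retrying only helps if the handle is eventually released; it does not fix a handle that your own code never closes.

```csharp
using System;
using System.IO;
using System.Threading;

static class FileCleanup
{
    // Tries to delete 'path', retrying while the file is still in use.
    // Returns true once the delete call succeeds, false if every attempt failed.
    public static bool TryDeleteWithRetry(string path, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                File.Delete(path); // does not throw if the file is already gone
                return true;
            }
            catch (IOException)
            {
                // Typically "file is being used by another process":
                // give up on the last attempt, otherwise wait and try again.
                if (attempt == maxAttempts) return false;
                Thread.Sleep(delay);
            }
        }
        return false; // only reached when maxAttempts is zero or negative
    }
}
```

A call such as FileCleanup.TryDeleteWithRetry(fullPath, 5, TimeSpan.FromSeconds(2)) would replace the bare File.Delete inside the loop of DeleteImageFile.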
You solve this by using C# locks. Just put your code inside a lock statement and your threads will be safe and wait for each other to finish processing.
I found the solution:
In my import method, there is a call to this method:
public void Save(string originalFile, int maxWidth, int maxHeight, int quality, string filePath)
{
Bitmap image = new Bitmap(originalFile);
Save(ref image, maxWidth, maxHeight, quality, filePath);
}
The Bitmap keeps the file open, which blocks the delete.
I just added
image.Dispose();
to the method and it works fine.
Thank you for your help, and thank you for Process Explorer. Very useful tool.
I have created a process which reads a "template" text file and then based on the String.Format requirements uses the tokens to place my custom text in.
So, everything works, but the process is slow.
The template file can have about 500-1000 lines; I am looking for a way to speed this process up.
Any ideas?
Here is my code below:
templateFilePath = System.IO.Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().GetName().CodeBase).Replace("file:\\", "");
templateFilePath += "\\Templates\\TemplateFile.txt";
tempRequestFilePath = System.IO.Path.GetTempPath();
tempRequestFilePath += Guid.NewGuid();
Directory.CreateDirectory(tempRequestFilePath);
responseFileToWrite = tempRequestFilePath + "\\" + Path.GetFileNameWithoutExtension(zipMergeFilePath) + ".RSP";
if (!File.Exists(templateFilePath))
{
return false;
}
templateString = System.IO.File.ReadAllText(templateFilePath);
currentRecordNumber = 1;
for (int i = 0; i < createToProcess.rtfText.Lines.Length; i++)
{
if (createToProcess.rtfText.Lines[i].Contains("TAG ID:"))
{
string currentTagID = createToProcess.rtfText.Lines[i].Substring(9, 11).Trim();
string currentCustomerNumber = createToProcess.rtfText.Lines[i].Substring(25, 12).Trim();
string currentTaxPeriod = createToProcess.rtfText.Lines[i].Substring(42, 8).Trim();
string currentCustomerPhoneNumber = createToProcess.rtfText.Lines[i].Substring(55, 9).Trim();
DateTime datePurchases = (DateTime.Now).AddDays(-7);
DateTime dateReceived = (DateTime.Now).AddYears(10);
DateTime dateModified = (DateTime.Now).AddYears(-1);
string currentResearchCreateRecord = String.Format(templateString,
currentTagID.PadRight(6),
currentCustomerNumber.PadRight(12),
currentTaxPeriod.PadRight(6),
currentCustomerPhoneNumber.PadRight(8),
datePurchases.Month.ToString("00") + datePurchases.Day.ToString("00") + datePurchases.Year.ToString("0000"),
"RecordNo: " + currentRecordNumber.ToString(),
dateReceived.Month.ToString("00") + dateReceived.Day.ToString("00") + dateReceived.Year.ToString("0000"),
dateModified.Month.ToString("00") + dateModified.Day.ToString("00") + dateModified.Year.ToString("0000")
);
System.Windows.Forms.Application.DoEvents();
File.AppendAllText(responseFileToWrite, currentResearchCreateRecord);
currentRecordNumber += 1;
}
}
using (ZipFile currentZipFile = new ZipFile())
{
currentZipFile.AddFile(responseFileToWrite, "");
currentZipFile.Save(zipMergeFilePath);
}
return true;
You're re-opening the file handle for each line. That's an expensive operation, and slows you down.
Instead, create (in a using block) a StreamWriter for the file, and call WriteLine() to write a single line without closing the file.
Also, reading the Lines property is quite slow. Change that to a foreach loop (or just cache the array) instead of rerunning all that code for each line.
Finally, don't call DoEvents().
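As a rough sketch of the first two points, assuming a template with just two hypothetical placeholders ({0} for the line text and {1} for the record number) rather than the eight in the question: the output file is opened once, and the lines are passed in as an already-cached collection. The class and method names are made up. It uses Write rather than WriteLine so that, like File.AppendAllText in the original, nothing is added beyond what the template itself contains.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ResponseFileWriter
{
    // Opens the output file once and writes one formatted record per matching line.
    // Returns the number of records written.
    public static int WriteRecords(string outputPath, string template, IEnumerable<string> lines)
    {
        int recordNumber = 0;
        using (var writer = new StreamWriter(outputPath, false))
        {
            foreach (string line in lines)
            {
                if (!line.Contains("TAG ID:")) continue; // same filter as the original loop
                recordNumber++;
                writer.Write(string.Format(template, line.Trim(), recordNumber));
            }
        } // the file handle is flushed and closed here, once
        return recordNumber;
    }
}
```

In the original code you would read the property a single time, for example string[] lines = createToProcess.rtfText.Lines;, and pass that array in, so the control does not rebuild the array on every iteration.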
Be careful with the "+" operator on strings; repeated concatenation is very slow.
You should use the StringBuilder class instead:
System.Text.StringBuilder sb = new System.Text.StringBuilder((int)(sLen * Loops * 1.1));
for(i=0;i<Loops;i++) sb.Append(sSource);
sDest = sb.ToString();
https://support.microsoft.com/en-us/kb/306822
I am using the HTML5 canvas element and the new HTML5 file I/O functionality to drop multiple files onto it and upload them. It works fine, but now I need to generate a new filename (a 7-digit integer) if there are no files in the destination directory, or otherwise take the name of the last uploaded file, convert it to an Int32 and increment it by one for every new file uploaded to the same directory. This is where GetFileName(dir) comes in. The first image always uploads fine, but the problem begins once the second file is saved and the process hits ImageJob.Build(). I presume this is because, while the first file is still being written, GetFileName() runs simultaneously for the next file in line and checks for the last written file, which is still being written, and this creates the conflict. How can I fix this? Maybe I can somehow iterate with a foreach over the Request.InputStream data, or implement some kind of process watch that waits for the process to finish?
Update: I tried using TempData to store the generated filename and just incrementing the int value in TempData for each subsequent filename. It does better and gets more images in, but still errors at some point. TempData is not meant for this, though, as it gets erased after each read, and reassigning to it does not help. Maybe I'll try storing it in Session.
The process cannot access the file 'C:\Users\Admin\Documents\Visual Studio
2010\Projects\myproj\myproj\Content\photoAlbums\59\31\9337822.jpg'
because it is being used by another process.
public PartialViewResult Upload()
{
string fileName = Request.Headers["filename"];
string catid = Request.Headers["catid"];
string pageid = Request.Headers["pageid"];
string albumname = Request.Headers["albumname"];
var dir = "~/Content/photoAlbums/" + catid + "/" + pageid + "/" + (albumname ?? null);
var noex = GetFileName(dir);
var extension = ".jpg";
string thumbFile = noex + "_t" + extension;
fileName = noex + extension;
byte[] file = new byte[Request.ContentLength];
Request.InputStream.Read(file, 0, Request.ContentLength);
string imgdir;
string thumbimgdir;
string imageurl;
if (albumname != null)
{
imgdir = Server.MapPath("~/Content/photoAlbums/" + catid + "/" + pageid + "/" + albumname + "/" + fileName);
thumbimgdir = Server.MapPath("~/Content/photoAlbums/" + catid + "/" + pageid + "/" + albumname + "/" + thumbFile);
imageurl = "/Content/photoAlbums/" + catid + "/" + pageid + "/" + albumname + "/" + thumbFile;
}
else
{
imgdir = Server.MapPath("~/Content/photoAlbums/" + catid + "/" + pageid + "/" + fileName);
thumbimgdir = Server.MapPath("~/Content/photoAlbums/" + catid + "/" + pageid + "/" + thumbFile);
imageurl = "/Content/photoAlbums/" + catid + "/" + pageid + "/" + thumbFile;
}
ImageJob b = new ImageJob(file, imgdir, new ResizeSettings("maxwidth=1024&maxheight=768&format=jpg")); b.CreateParentDirectory = true; b.Build();
ImageJob a = new ImageJob(file, thumbimgdir, new ResizeSettings("w=100&h=100&mode=crop&format=jpg")); a.CreateParentDirectory = true; a.Build();
ViewBag.CatID = catid;
ViewBag.PageID = pageid;
ViewBag.FileName = fileName;
return PartialView("AlbumImage", imageurl);
}
public string GetFileName(string dir)
{
var FullPath = Server.MapPath(dir);
var dinfo = new DirectoryInfo(FullPath);
string FileName;
if (dinfo.Exists)
{
var Filex = dinfo.EnumerateFiles().OrderBy(x => x.Name).LastOrDefault();
FileName = Filex != null ? Path.GetFileNameWithoutExtension(Filex.Name) : null;
if (FileName != null)
{
FileName = FileName.Contains("_t") ? FileName.Substring(0, FileName.Length - 2) : FileName;
int fnum;
Int32.TryParse(FileName, out fnum);
FileName = (fnum + 1).ToString();
if (fnum > 999999) { return FileName; } //Check that TryParse produced valid int
else
{
var random = new Random();
FileName = random.Next(1000000, 9999000).ToString();
}
}
else
{
var random = new Random();
FileName = random.Next(1000000, 9999000).ToString();
}
}
else
{
var random = new Random();
FileName = random.Next(1000000, 9999000).ToString();
}
return FileName;
}
You simply cannot use the Random class if you want to generate unique filenames. It uses the current time as the seed, so two exactly concurrent requests will always produce the same 'random' number.
You could use a cryptographic random number generator,
but you would still have to ensure that (a) only one thread generates it at a time, and (b) the identifier is long enough to make birthday-paradox collisions unlikely.
Thus, I suggest that everyone use GUID identifiers for their uploads, as they solve all of the above issues inherently (I believe an OS-level lock is used to prevent duplicates).
Your method also doesn't handle multiple file uploads per-request, although that may be intentional. You can support those by looping through Request.Files and passing each HttpPostedFile instance directly into the ImageJob.
Here's a simplified version of your code that uses GUIDs and won't encounter concurrency issues.
public PartialViewResult Upload()
{
string albumname = Request.Headers["albumname"];
string baseDir = "~/Content/photoAlbums/" + Request.Headers["catid"] + "/" + Request.Headers["pageid"] + "/" + (albumname != null ? albumname + "/" : "");
byte[] file = new byte[Request.ContentLength];
Request.InputStream.Read(file, 0, Request.ContentLength);
ImageJob b = new ImageJob(file, baseDir + "<guid>.<ext>", new ResizeSettings("maxwidth=1024&maxheight=768&format=jpg")); b.CreateParentDirectory = true; b.Build();
ImageJob a = new ImageJob(file, baseDir + "<guid>_t.<ext>", new ResizeSettings("w=100&h=100&mode=crop&format=jpg")); a.CreateParentDirectory = true; a.Build();
//Want both to have the same GUID? Pull it from the previous job.
//string ext = PathUtils.GetExtension(b.FinalPath);
//ImageJob a = new ImageJob(file, PathUtils.RemoveExtension(b.FinalPath) + "_t." + ext, new ResizeSettings("w=100&h=100&mode=crop&format=jpg")); a.CreateParentDirectory = true; a.Build();
ViewBag.CatID = Request.Headers["catid"];
ViewBag.PageID = Request.Headers["pageid"];
ViewBag.FileName = Request.Headers["filename"];
return PartialView("AlbumImage", PathUtils.GuessVirtualPath(a.FinalPath));
}
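If you would rather build the names yourself than rely on the <guid> placeholder used above, a minimal sketch with Guid.NewGuid() might look like this. The class and method names are hypothetical; the "_t" thumbnail suffix follows the question's naming.

```csharp
using System;
using System.IO;

static class UploadNames
{
    // Returns a base name of 32 hex digits (GUID in "N" format, no dashes).
    public static string NewBaseName()
    {
        return Guid.NewGuid().ToString("N");
    }

    // Builds the full-size and thumbnail paths so that both share one base name.
    // 'extension' is expected to include the leading dot, e.g. ".jpg".
    public static string[] BuildPaths(string directory, string extension)
    {
        string baseName = NewBaseName();
        return new[]
        {
            Path.Combine(directory, baseName + extension),
            Path.Combine(directory, baseName + "_t" + extension)
        };
    }
}
```

Because no request ever looks at what is already in the directory, two concurrent uploads no longer race to compute the same "next" number.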
If the process is relatively quick (small files) you could go in a loop, check for that exception, sleep the thread for a couple of seconds, and try again (up to a maximum number of iterations). One caveat is that if the upload is asynchronous you might miss a file.
A couple of other suggestions:
Make GetFileName a private method so that it can't be invoked from the web.
The OrderBy in the Filex query might not do what you expect once the number reaches 8 digits (possible if the first Random() value is very high), because names are compared as strings, not as numbers.
The Random() should probably be seeded to produce better randomness.
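The OrderBy point can be illustrated with a short sketch. The names below are hypothetical, and an ordinal comparer is used to keep the result independent of the current culture (the original query uses the default string comparer, which orders plain digit strings the same way). LastByNumber assumes every name is purely numeric and would throw otherwise.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FileNameOrdering
{
    // String ordering compares character by character, so "9..." sorts after "1...".
    public static string LastByName(IEnumerable<string> names)
    {
        return names.OrderBy(n => n, StringComparer.Ordinal).LastOrDefault();
    }

    // Numeric ordering parses each name first, so longer numbers sort last.
    public static string LastByNumber(IEnumerable<string> names)
    {
        return names.OrderBy(n => long.Parse(n)).LastOrDefault();
    }
}
```

For the names "9999999" and "10000000", LastByName returns "9999999" while LastByNumber returns "10000000", so the string-based query would keep handing out 10000000 as the next name.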
I am calling this zip_threading class from another class, like string a = zip_threading(?,?), but the problem is how to pass the parameter values when calling it: string[] files and bool IsOriginal. The class uses a BackgroundWorker, so the real problem is passing the values to this class and then returning a value once processing has finished in the make_zip_file class.
public class zip_threading
{
public string[] files { get; set; } // to be received by the zip method as zip file names.
public int number;
public string return_path;
public bool IsOriginal { get; set; } // to be received by the zip method as boolean true or false
public static BackgroundWorker bgw1 = new BackgroundWorker(); // make a background worker object.
public void bgw1_RunWorkerCompleted(Object sender, RunWorkerCompletedEventArgs e)
{
make_zip_file mzf1 = e.Result as make_zip_file;
return_path = mzf1.return_path;
}
public make_zip_file bgw_DoWork(string[] files, bool IsOriginal, make_zip_file argumentest)
{
Thread.Sleep(100);
argumentest.return_path = argumentest.Makezipfile(files,IsOriginal);
return argumentest;
}
public void run_async(string []files,bool IsOriginal)
{
make_zip_file mzf2 = new make_zip_file();
// mzf2.files = files;
//mzf2.IsOriginal = IsOriginal;
bgw1.DoWork += (sender, e) => e.Result = bgw_DoWork(files, IsOriginal, mzf2);
bgw1.RunWorkerAsync();
}
public class make_zip_file
{
public string return_path ;
//public string[] files{get;set;}
// public bool IsOriginal{get;set;}
public string Makezipfile(string[] files, bool IsOriginal)
{
string[] filenames = new string[files.Length];
if (IsOriginal)
for (int i = 0; i < files.Length; i++)
filenames[i] = HttpContext.Current.Request.PhysicalApplicationPath + files[i].Remove(0, 10).ToString(); // <-- the error is thrown here
else
for (int i = 0; i < files.Length; i++)
filenames[i] = HttpContext.Current.Request.PhysicalApplicationPath + files[i].Replace(HttpContext.Current.Request.UrlReferrer.ToString(), "");
string DirectoryName = filenames[0].Remove(filenames[0].LastIndexOf('/'));
DirectoryName = DirectoryName.Substring(DirectoryName.LastIndexOf('/') + 1).Replace("\\", "");
try
{
string newFile = HttpContext.Current.Request.PhysicalApplicationPath + "images\\Thumbnails\\zipFiles\\" + DirectoryName + ".zip";
if (File.Exists(newFile))
File.Delete(newFile);
using (ZipFile zip = new ZipFile())
{
foreach (string file in filenames)
{
string newfileName = file.Replace("\\'", "'");
zip.CompressionLevel = 0;
zip.AddFile(newfileName, "");
}
zip.Save(newFile);
}
}
catch (Exception ex)
{
Console.WriteLine("Exception during processing {0}", ex);
// No need to rethrow the exception as for our purposes its handled.
}
return_path = "images/Thumbnails/zipFiles/" + DirectoryName + ".zip";
return return_path;
}}
Now I am calling this method from another class, like this:
String path=zipa.run_async(fileCollection, IsOriginal);
I get an error in make_zip_file, on the line I marked: "Object reference not set to an instance of an object" at filenames[i] = HttpContext.Current.Request.PhysicalApplicationPath + files[i].Remove(0, 10).ToString();
By taking this to a different thread, you are running outside of the http-context, which may well finish long before your zip operation does (tearing down all the things like inbound stream buffers) - yet you are talking to HttpContext.Current.
You have a few options; thinking off the top of my head...
run it on the request thread; it'll take a while, but meh...
buffer all the data you need in memory, and pass that to the zip operation
write the file to disk in a temp area (not the main app folder) from the request thread, then spawn a separate thread to process it from the temp area
but to re-iterate: you can't access the request from another thread - or at least, you shouldn't.
Also, consider:
a request starts
you spin up a thread to do the zip
you return from the original request
(worker thread keeps on going)
you need to think about what you are going to do with the zip filename; you can't just give it to the client - they are no longer listening to you.
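One way to follow that advice is to read everything you need from the request on the request thread and hand only plain values to the worker. The sketch below is an assumption-laden illustration, not the original design: ZipJobs, ResolvePaths, StartZip and the makeZip callback are hypothetical names, and Task.Run stands in for the BackgroundWorker.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static class ZipJobs
{
    // Call this on the request thread. 'appRoot' is a value the caller has already
    // read from the request; the worker never touches the request itself.
    public static string[] ResolvePaths(string appRoot, string[] relativeFiles)
    {
        return relativeFiles
            .Select(f => Path.Combine(appRoot, f.TrimStart('/', '\\')))
            .ToArray();
    }

    // Runs the zip work on a pool thread. Only plain strings cross the thread boundary.
    public static Task<string> StartZip(string[] absolutePaths, Func<string[], string> makeZip)
    {
        return Task.Run(() => makeZip(absolutePaths));
    }
}
```

The caller would read HttpContext.Current.Request.PhysicalApplicationPath while still on the request thread and pass it as appRoot, and Makezipfile would have to be changed to accept absolute paths instead of calling HttpContext.Current. The returned task only tells you when the zip is done; how the client later learns the file name is still the open question raised above.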
Check whether files[i] is initialized, since it is passed into the function from elsewhere:
Makezipfile(string[] files, bool IsOriginal)
{
}
I think there will be no value in it.