Download 3000+ Images Using C#?

I have a list of around 3000 image URLs that I need to download to my desktop.
I'm a web dev, so naturally I wrote a little ASP.NET C# download method to do this, but the obvious problem happened: the page timed out before I had downloaded more than a handful of them.
Does anyone know of a good, quick and robust way of looping through all the image URLs and downloading them to a folder? I'm open to any suggestions - WinForms, a batch file - although I'm a novice at both.
Any help greatly appreciated.

What about wget? It can download a list of URLs specified in a file.
wget -i c:\list-of-urls.txt

Write a C# command-line application (or WinForms, if that's your inclination), and use the WebClient class to retrieve the files.
Here are some tutorials:
C# WebClient Tutorial
Using WebClient to Download a File
or, just Google C# WebClient.
You'll either need to provide a list of files to download and loop through it, issuing a request for each file and saving the result; or issue a request for the index page, parse it using something like the HTML Agility Pack to find all of the image tags, and then issue a request for each image, saving the result somewhere on your local drive.
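For the second approach, here's a rough sketch assuming the HTML Agility Pack package is referenced; the index page URL and the C:\Images target folder are placeholders:

using System;
using System.IO;
using System.Net;
using HtmlAgilityPack;

// Load the index page and pull out every img tag's src attribute.
var web = new HtmlWeb();
HtmlDocument doc = web.Load("http://example.com/gallery.html");
var imageNodes = doc.DocumentNode.SelectNodes("//img[@src]");   // null if the page has no images

if (imageNodes != null)
{
    using (var client = new WebClient())
    {
        foreach (HtmlNode img in imageNodes)
        {
            // Resolve relative src values against the page's base URL.
            var imageUri = new Uri(new Uri("http://example.com/"), img.GetAttributeValue("src", ""));
            client.DownloadFile(imageUri, Path.Combine(@"C:\Images", Path.GetFileName(imageUri.LocalPath)));
        }
    }
}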
Edit
If you just want to do this once (as in, not as part of an application), mbeckish's answer makes the most sense.

You might want to use an existing download manager like Orbit, rather than writing your own program for the purpose. (blasphemy, I know)
I've been pretty happy with Orbit. It lets you import a list of downloads from a text file, and it manages the connections for you, downloading portions of each file in parallel over multiple connections to speed up each download. It also takes care of retrying when connections time out, and so on. It seems like you'd have to go to a lot of effort to build those kinds of features from scratch.

If this is just a one-time job, then one easy solution would be to write an HTML page with img tags pointing to the URLs.
Then browse it with Firefox and use an extension to save all of the images to a folder.
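If you go that route, generating the page is a short loop; imageUrls and the output path below are placeholders:

using System.IO;
using System.Text;

var sb = new StringBuilder("<html><body>\n");
foreach (string url in imageUrls)                     // imageUrls = your list of 3000 URLs
    sb.AppendFormat("<img src=\"{0}\" />\n", url);
sb.Append("</body></html>");
File.WriteAllText(@"C:\images.html", sb.ToString()); // open this in Firefox and save all images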

Working on the assumption that this is a one-off, run-once project, and since you are a novice with the other technologies, I would suggest the following:
Rather than trying to download all 3000 images in one web request, download one image per request. When that image download is complete, redirect to the same page, passing the URL of the next image to fetch as a query string parameter. Download that one and repeat until all the images are downloaded.
Not what I would call a "production" solution, but if my assumption is correct it will have you up and running in no time.
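A very rough sketch of that idea as ASP.NET code-behind; Download.aspx, GetImageUrls() and the target folder are assumed names for illustration only:

// Requires System.Net and System.IO; GetImageUrls() is whatever loads your 3000 URLs.
protected void Page_Load(object sender, EventArgs e)
{
    List<string> urls = GetImageUrls();
    int index = int.Parse(Request.QueryString["index"] ?? "0");

    if (index >= urls.Count) return;   // all done

    using (var client = new WebClient())
    {
        string fileName = Path.GetFileName(new Uri(urls[index]).LocalPath);
        client.DownloadFile(urls[index], Path.Combine(@"C:\Images", fileName));
    }

    // Hand off to the next image; each request stays well under the timeout.
    Response.Redirect("Download.aspx?index=" + (index + 1));
}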
Another fairly simple solution would be to create a simple C# console application that uses WebClient to download each of the images. The following rough code should give you enough to get going:
// Requires System.Net, System.IO and System.Collections.Generic; the target folder is a placeholder.
List<string> imageUrls = new List<string>();
// ... populate imageUrls with your URLs from wherever they come from ...

foreach (string imageUrl in imageUrls)
{
    using (WebClient client = new WebClient())
    {
        byte[] raw = client.DownloadData(imageUrl);

        // Write the raw bytes to a file named after the last segment of the URL.
        string fileName = Path.GetFileName(new Uri(imageUrl).LocalPath);
        File.WriteAllBytes(Path.Combine(@"C:\Images", fileName), raw);
    }
}

I've written a similar app in WinForms that loops through URLs in an Excel spreadsheet and downloads the image files. I think the problem you're having implementing this as a web application is that the server will only allow the process to run for a short amount of time before the request from your browser times out. You could either increase this time in the web.config file (change the executionTimeout attribute of the httpRuntime element), or implement this functionality as a WinForms application, where the long execution time won't be a problem. If this is more than a throw-away application and you decide to go the WinForms route, you may want to add a progress bar to indicate progress.
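As a rough illustration of the WinForms route (progressBar1, imageUrls and targetFolder are assumed names), a BackgroundWorker keeps the UI responsive while reporting progress:

// Requires System.ComponentModel, System.Net and System.IO.
private void StartDownloads(List<string> imageUrls, string targetFolder)
{
    var worker = new BackgroundWorker { WorkerReportsProgress = true };

    worker.DoWork += (s, e) =>
    {
        using (var client = new WebClient())
        {
            for (int i = 0; i < imageUrls.Count; i++)
            {
                string fileName = Path.GetFileName(new Uri(imageUrls[i]).LocalPath);
                client.DownloadFile(imageUrls[i], Path.Combine(targetFolder, fileName));
                worker.ReportProgress((i + 1) * 100 / imageUrls.Count);
            }
        }
    };

    // ProgressChanged fires on the UI thread, so updating the control is safe here.
    worker.ProgressChanged += (s, e) => progressBar1.Value = e.ProgressPercentage;
    worker.RunWorkerAsync();
}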

Folder explorer options

I have recently been assigned a task which sounded relatively simple!
Upon attempting it, it became clear it wasn't as straightforward as I first imagined!
I am trying to download multiple files to one location on the user's machine. They select these files from lists within a custom SharePoint web part. That's the bit I have managed to get working! The downloading is done via WebClient (System.Net.WebClient).
I now want to allow the user to select a location on their local machine to download the files to.
I thought I would be able to use a file-selection control, but after attempting this I realised it can only pick files :( rather than the folder location I need, which will confuse the user.
I want something similar to the above, but I only need it to return a path location like c:\Temp or any other location the user prefers on their local machine.
Could anyone suggest a control that could provide this functionality? It can also be a SharePoint control.
In the meantime I will be attempting a TreeView, as I have never used one before and from what I have read it may be able to do this.
Cheers
Truez
Clarification on the language: ASP.NET.
Unfortunately, you can't do this without some kind of active content, like a Flash control or (ugh) ActiveX.
It seems strange at first, but consider that this kind of functionality would let a site discover the structure of anyone's storage devices; that is not 'a good thing'™.
However, perhaps a different approach might solve the problem?
Why are you using WebClient? Can't you provide the link to the client and let them choose their own download folder?
I ended up zipping the files into one archive and passing that single file to the browser to download! Thanks for your comments!
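For anyone doing the same, a minimal sketch of that zip-and-send approach, assuming .NET 4.5+ for ZipFile (on earlier frameworks you'd need a third-party zip library); the paths here are placeholders:

// Requires references to System.IO.Compression and System.IO.Compression.FileSystem.
string sourceFolder = Server.MapPath("~/App_Data/SelectedFiles");   // assumed staging folder
string zipPath = Path.Combine(Path.GetTempPath(), "files.zip");

if (File.Exists(zipPath)) File.Delete(zipPath);
ZipFile.CreateFromDirectory(sourceFolder, zipPath);

// Let the browser prompt the user for where to save it.
Response.ContentType = "application/zip";
Response.AppendHeader("Content-Disposition", "attachment; filename=files.zip");
Response.TransmitFile(zipPath);
Response.End();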

Minifying and combining files in .NET

I am looking at implementing some performance optimization around my JavaScript/CSS, in particular minifying and combining them. I am developing .NET/C# web applications.
I have a couple of options and I'm looking for feedback on each:
The first is a clever tool I came across, Chirpy, which combines, minifies, etc. from within Visual Studio -> http://chirpy.codeplex.com/ As this is a Visual Studio add-in and I am in a team environment, this tool isn't ideal.
My next option is to use an MSBuild task (http://yuicompressor.codeplex.com/) to minify the files and also combine them (maybe reading what needs to be combined from an XML file). While this works fine for minifying, my concern is that I will have to maintain what must be combined, which could be a headache.
The third option is to use the MSBuild task just for minifying and, at runtime, use some helper classes to combine the files on a per-page basis. This would combine the files, give the result a name and append a version to it.
Any other options I could consider? My concern with the last option is that it may have performance issues, as I would have to open each file from the local drive, read its contents and then combine the files; that is a lot of processing at run time. I was looking at something like SquishIt - https://github.com/jetheredge/SquishIt/downloads - which minifies the files at run time, but I would prefer to do this at compile time.
So any feedback on my approaches would be great. If the third option would not cause performance issues, I am leaning towards it.
We have done something similar with several ASP.NET web applications. Specifically, we use the Yahoo YUI Compressor, which has a .NET library version that you can reference in your applications.
The approach we took was to generate the necessary merged/minified files at runtime. We wrapped all this logic up in an ASP.NET control, but that isn't necessary depending on your project.
The first time a request is made for a page, we process the list of included JS and CSS files. In a separate thread (so the original request returns without delay) we merge the included files together (one for JS, one for CSS) and then apply the YUI compressor.
The result is then written to disk for fast reference in the future.
On subsequent requests, the page first looks for the minified versions. If found, it just serves those up. If not, it goes through the process again.
As some icing on the cake:
For debug purposes, if the query string ?debug=true is present, the merged/minified resources are ignored and the original individual files are served instead (since it can be hard to debug optimized JS).
We have found this process to work exceptionally well. We built it into a library so all our ASP.NET sites can take advantage of it. Post-build scripts can get complicated if each page has different dependencies, but the run-time approach can determine this quite easily. And if someone needs to make a quick fix to a CSS file, they can do so, delete the merged versions, and the process automatically starts over, with no need for post-build processing with MSBuild or NAnt.
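A stripped-down sketch of that flow (minus the background thread); Minify is a stand-in for whichever compressor you call, such as the YUI .NET port, and the paths are assumptions:

// Requires System, System.IO and System.Linq.
public static string EnsureBundle(string[] sourceFiles, string cachedPath, Func<string, string> minify)
{
    if (!File.Exists(cachedPath))
    {
        // First request: merge the page's includes, compress, and persist for later requests.
        string merged = string.Concat(sourceFiles.Select(File.ReadAllText));
        File.WriteAllText(cachedPath, minify(merged));
    }

    // Subsequent requests just serve the file already on disk.
    return cachedPath;
}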
RequestReduce provides a really nice solution for combining and minifying JavaScript and CSS at run time. It will also attempt to sprite your background images. It caches the processed files and serves them using custom ETags and far-future expiry headers. RequestReduce uses a response filter to transform the content, so no code or configuration is needed for basic functionality. It can be configured to work in a web farm environment and sync content across several servers, and it can be configured to point to a CDN. It can be downloaded at http://www.RequestReduce.com or from Visual Studio via NuGet. The source is available at https://github.com/mwrock/RequestReduce.
Have you heard of Combres?
Go to http://combres.codeplex.com and check it out.
It minifies your CSS and JS files at runtime, meaning you can change any file and upload it, and it is re-minified on each client request.
All you have to do is add the files you want to compress to a list in the Combres XML file and reference that list from your page / master page.
If you are using VS2010 you can easily install it in your project using NuGet.
Here's the Combres NuGet quick-start link: http://combres.codeplex.com/wikipage?title=5-Minute%20Quick%20Start
I built a really nice solution for this a couple of years back, but I don't have the source left. The solution was for WebForms but it should be straightforward to port to MVC. I'll try to explain what I did in a few simple steps. First we need to register the scripts, and we wrote a special control that did just that. When the control was rendered it did three things:
Minify all the files; I think we used the YUI compressor.
Combine all the files and store the result as a string.
Calculate a hash of the combined string and use that as a virtual filename. Store the combined string in a cached dictionary on the server with the hash value as the key; the rendered HTML needs to point to a special folder where the "scripts" are located.
The next step is to implement a special HttpHandler that handles requests for files in that special folder. When a request is made to that folder, you look the hash up in the cached dictionary and basically return the string.
One really nice feature of this is that the script the client has is always valid, so the user never needs to be told to refresh it: when you make a change to any of the script files, the hash value changes and the client asks for a new script.
You can use this for CSS files as well with no problems. I remember making it configurable so you could turn off combining, turn off minification, or exclude a single file from the process if you wanted to do some debugging.
I might have missed some details, but it wasn't that hard to implement and it turned out very well.
Update: I've implemented a solution for MVC, released it on NuGet, and put the source up on GitHub.
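A rough sketch of the hash-keyed cache and handler described above; the /scripts/ virtual folder, the minify delegate and the handler registration in web.config are assumptions:

using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Web;

public static class ScriptBundleCache
{
    // Combined content keyed by its hash; shared between the registering control and the handler.
    public static readonly ConcurrentDictionary<string, string> Bundles =
        new ConcurrentDictionary<string, string>();

    public static string Register(string[] scriptPaths, Func<string, string> minify)
    {
        string combined = string.Concat(scriptPaths.Select(p => minify(File.ReadAllText(p))));

        string hash;
        using (var md5 = MD5.Create())
            hash = BitConverter.ToString(md5.ComputeHash(Encoding.UTF8.GetBytes(combined))).Replace("-", "");

        Bundles[hash] = combined;
        return "/scripts/" + hash + ".js";   // the <script src> the page renders
    }
}

// Mapped to /scripts/* in web.config so it answers requests for the virtual file names.
public class ScriptHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        string hash = Path.GetFileNameWithoutExtension(context.Request.Path);
        string script;

        if (ScriptBundleCache.Bundles.TryGetValue(hash, out script))
        {
            context.Response.ContentType = "text/javascript";
            context.Response.Write(script);
        }
        else
        {
            context.Response.StatusCode = 404;   // cache lost (e.g. app restart); the page will re-register
        }
    }
}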
Microsoft's Ajax Minifier is surprisingly good as a minification tool. I wrote a blog post on combining files and using their minifier in a JavaScript and stylesheet handler:
http://www.markistaylor.com/javascript-concatenating-and-minifying/
It's worthwhile combining the files at run time to avoid having to synchronise new versions. However, once they are programmatically combined, cache them to disk. Then the code which runs each time the files are fetched need only check that the files haven't changed before serving the cached version.
If they have changed, then the compression code can run as a one-off.
Whilst there will be a slight performance cost, you will also receive a performance benefit from fewer file requests.
This is the approach that the Minify tool uses to compress JS/CSS, which has worked really well for me. It's Linux/PHP only, but you might get some more ideas there too.
I needed a solution for combining/minifying CSS/JS on a .NET 2.0 web app, and since SquishIt and the other tools I found weren't .NET 2.0-compatible, I created my own solution that uses a syntax similar to SquishIt but is compatible with .NET 2.0. Since I thought other people might find it useful, I put it up on GitHub. You can find it here: https://github.com/AlliterativeAlice/simpleyui

C#: Programmatically apply merge/patch to file?

I have a program that requires a few large (~4 or 5 MB) files. Once a week, every week, there are new versions of these files with minor changes, mostly just a few lines added or removed.
When the program starts, if there's an Internet connection, I'd like the program to update these files automatically. Instead of downloading the entire new versions of the files, I'd like to download just a patch, based on the client's version of the files, that updates them.
How might I do this?
I have total control over the server.
That is a tough problem to solve if you don't have any prior knowledge of what is in the file, or if the server doesn't have a facility that lets you request differences. Any program you write that has no way to determine the differences without looking at both the old and new files will have to download the whole thing anyway.
C# doesn't have any built-in facility to do this, but it sounds like your requirements aren't complicated. Look at how diff and ed on Unix can be used to patch a text file based on an easy-to-grok delta. Of course you should check the resulting file against a hash and fall back to a full download if it isn't correct.
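For the verification step, something along these lines works; the expected hash would come from your server alongside the patch, and the names here are illustrative:

// Requires System.IO and System.Security.Cryptography.
static bool PatchedFileIsValid(string patchedFilePath, string expectedSha256Hex)
{
    using (var sha = SHA256.Create())
    using (var stream = File.OpenRead(patchedFilePath))
    {
        string actual = BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "");
        return string.Equals(actual, expectedSha256Hex, StringComparison.OrdinalIgnoreCase);
    }
}
// If this returns false, discard the patched copy and fall back to downloading the full file.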

How to know when a download is finished

Hi, I'm creating an online shop where people buy files with a zip extension. They pay with their credit cards or other methods, get a key and download the product. How can I know when they have finished downloading the product?
Thanks
Unfortunately there is no really good way to do this, as some clients might not download the file in one go (e.g. download managers split the download into several parallel partial downloads).
Your options are:
If it is very important to you that the file can only be downloaded once, you could simply not support resuming. Then you can log when the file has been downloaded in its entirety (as soon as the last byte has been sent). This works well if the download is small.
Otherwise you could allow some grace data (we usually allow clients to download five times the size of the real download) and log every download attempt.
You should NOT just count the bytes downloaded (because the download might be disrupted), and you should NOT just check whether all sections have been downloaded once (also because the download might be disrupted).
Just to clarify: all of this means that you have to write your own download handler (file server).
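A bare-bones sketch of such a handler in ASP.NET, with no resume support as described in the first option; the file path and the logging call are placeholders:

using System.IO;
using System.Web;

public class ZipDownloadHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        string path = context.Server.MapPath("~/App_Data/product.zip");   // assumed location
        context.Response.ContentType = "application/zip";
        context.Response.AppendHeader("Content-Disposition", "attachment; filename=product.zip");

        byte[] buffer = new byte[64 * 1024];
        using (FileStream stream = File.OpenRead(path))
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                if (!context.Response.IsClientConnected)
                    return;                                   // aborted: do not count this attempt as complete
                context.Response.OutputStream.Write(buffer, 0, read);
            }
        }

        context.Response.Flush();
        // Only reached once every byte has been sent, so record the completed download here,
        // e.g. MarkDownloadComplete(context.Request.QueryString["key"]);   // hypothetical helper
    }
}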
You can use a custom file server that works over either HTTP or FTP and have it send a notification once the client has received the last file fragment.
All other options are problematic: the client might download the file using a download manager, so you cannot even register for any browser event, if there were any.
A custom server application does indeed seem like a solution for this, or possibly some kind of scripting.
A normal HTTP server does not notify you when a connection ends, but if you generate the output in a CGI/PHP/ASP script, you can read the file in the script and write it to the output; when you reach the end of the file, you send the notification and then end the script.
Done that way, it will only detect fully downloaded files; if the connection gets interrupted half-way, the file will not be marked as downloaded.
A 'CGI script' can also be a compiled C program (or any other language, for that matter). A compiled program would give better performance than an interpreted script solution.

best way to write a polled FTP download in C#

I currently have a manual process where we upload a text file to a business partner; they have an automated process which reads the file, processes it and then generates a 'results' log file anywhere from 3-10 minutes (typically) after the initial upload. I need to automate this process via a .NET application.
I already have the upload working; what I do not have is the download of the result. Since I don't know exactly when the file will be ready to download, I figure I need to poll the remote site every so often, get a listing of the files in the results directory and see if one matches what I am expecting.
I have done some reading and found some references to AsyncCallback, but I'm not really sure how to proceed with it. The solution has to be something I can manage without any third-party libraries outside of .NET, since I have a budget of 0 for this little project.
Any help would be greatly appreciated!
Just have a thread (or your main thread) sleep for x milliseconds and attempt the download when it wakes up. No need to buy a third-party FTP library; FTP is built into .NET (FtpWebRequest and FtpWebResponse). They aren't very good (very bare bones) but will probably do for what you want.
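A minimal polling sketch along those lines; the server address, credentials, polling interval and expected file name are all placeholders:

using System;
using System.IO;
using System.Net;
using System.Threading;

static string WaitForResultFile(string ftpDirectoryUri, string expectedFileName, NetworkCredential credentials)
{
    while (true)
    {
        // List the results directory and see if the file we expect has appeared yet.
        var listRequest = (FtpWebRequest)WebRequest.Create(ftpDirectoryUri);
        listRequest.Method = WebRequestMethods.Ftp.ListDirectory;
        listRequest.Credentials = credentials;

        using (var listResponse = (FtpWebResponse)listRequest.GetResponse())
        using (var reader = new StreamReader(listResponse.GetResponseStream()))
        {
            if (reader.ReadToEnd().Contains(expectedFileName))
            {
                // Found it: download to a temp file and return the local path.
                var downloadRequest = (FtpWebRequest)WebRequest.Create(ftpDirectoryUri.TrimEnd('/') + "/" + expectedFileName);
                downloadRequest.Method = WebRequestMethods.Ftp.DownloadFile;
                downloadRequest.Credentials = credentials;

                string localPath = Path.Combine(Path.GetTempPath(), expectedFileName);
                using (var downloadResponse = downloadRequest.GetResponse())
                using (var ftpStream = downloadResponse.GetResponseStream())
                using (var file = File.Create(localPath))
                {
                    ftpStream.CopyTo(file);   // .NET 4+; copy manually with a buffer on older frameworks
                }
                return localPath;
            }
        }

        Thread.Sleep(TimeSpan.FromSeconds(30));   // poll every 30 seconds until the file shows up
    }
}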
