C# - Autozip files in a folder based on Year-Month

IT has been tasked with reducing the file-server usage rate, so I'd like to do my part by compressing old files (i.e. Excel/Access/Txt).
We have some folders that contain thousands of files, so I don't want to just zip the whole directory into one large file - it would be preferable to have a number of smaller files so that it's easier for a user to find the data 'bucket' they are looking for.
Is there a way using C# to read through a directory and zip the files into year-month groups (all files from year-month placed together in one zip)?
Or would it be better to use a script like AutoIT?
Or are there programs already existing to do this so I don't have to code anything?

I'm not sure if your question is about zipping, selecting files from a particular year/month, or both.
About zipping: Peter already mentioned 7-zip and SharpZipLib. I personally only have experience with the latter, but it's all positive - easy to work with.
About grouping your files: it can be done by iterating over all the files in the folder and grouping them by either their creation date or last-modified date.
pseudo:
var files = new Dictionary<DateTime, IList<string>>();
foreach (var file in Directory.GetFiles(...)) {
    var fi = new FileInfo(file);
    var date = fi.CreationTime;
    var groupDate = new DateTime(date.Year, date.Month, 1);
    if (!files.ContainsKey(groupDate)) files.Add(groupDate, new List<string>());
    files[groupDate].Add(file);
}
Now you should have a dictionary containing distinct year/month keys and, for each key, a list of files belonging to that group. So for zipping:
pseudo:
foreach (var entry in files) {
    var date = entry.Key;
    var list = entry.Value;
    // create zip-file named date.ToString("yyyy-MM");
    foreach (var file in list) {
        // add file to zip
    }
}
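On .NET 4.5 or later the built-in System.IO.Compression types can do the zipping without a third-party library. Here is a minimal sketch combining both pseudo blocks above; the `MonthlyArchiver` class name and the "yyyy-MM.zip" naming scheme are my own choices, and grouping is by last-modified date:

```csharp
using System;
using System.IO;
using System.IO.Compression; // reference System.IO.Compression(.FileSystem)
using System.Linq;

class MonthlyArchiver
{
    // Groups every file in sourceDir by last-modified year-month and
    // writes one zip per group into targetDir (e.g. "2011-03.zip").
    public static void ArchiveByMonth(string sourceDir, string targetDir)
    {
        Directory.CreateDirectory(targetDir);

        var groups = Directory.EnumerateFiles(sourceDir)
            .Select(path => new FileInfo(path))
            .GroupBy(fi => new { fi.LastWriteTime.Year, fi.LastWriteTime.Month });

        foreach (var group in groups)
        {
            string zipPath = Path.Combine(targetDir,
                $"{group.Key.Year:D4}-{group.Key.Month:D2}.zip");

            using (var archive = ZipFile.Open(zipPath, ZipArchiveMode.Create))
            {
                foreach (var file in group)
                {
                    // Store each file under its own name at the archive root.
                    archive.CreateEntryFromFile(file.FullName, file.Name);
                }
            }
        }
    }
}
```

Swap `LastWriteTime` for `CreationTime` if the created date is the better "bucket" for your data.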

Surely you can do this with a bit of C# and libraries like 7-zip or SharpZipLib.

You could use System.IO.Directory.GetFiles() to loop through the files in each directory, parsing them out by file name and adding them to a 7-zip or SharpZipLib object. If it's thousands of files, it might be best to run this as a service or some kind of scheduled task overnight so as not to tax the file share.
Good luck to you!
EDIT: As an addendum, you could use a System.IO.FileInfo object for each file if you need to parse by created date or other file attributes.

Related

Is there an algorithm that can list all files in a folder (for C#)?

Hello StackOverflow community,
I'm working on a C# web application that can show all necessary files in one folder. For example, you have a folder named "Maps" that stores all information about New York City. I will describe this folder here; the arrows indicate nesting depth.
Folder Maps:
->NewYorkCity
->>satellite.png
->>coordinates.txt
->>bridges.png
->>Road1
->>>satellite1.png
->>>roads.txt
->>>houses.png
As you can see, inside the Maps folder we have the NewYorkCity folder, and inside that, the Road1 folder. Now I want to collect all files of type "*.png" - that is, all images under the root folder. The problem is the algorithm to collect the files: I thought about using "for loops", but since I don't know the number of subfolders, that seemed impossible.
Here is the code I have used to list files of a specified type, but it only works for files in a single folder without subfolders.
DirectoryInfo dInfo = new DirectoryInfo(zipPath); // Assuming Test is your folder
FileInfo[] Files = dInfo.GetFiles("*.png"); // Getting PNG files
string str = "";
foreach (FileInfo file in Files)
{
    str = str + ", " + file.Name;
}
I hope you understand my question. Thank you.
You could start by reading the documentation, where you would find System.IO.DirectoryInfo.
Create a DirectoryInfo instance and use, depending on what you want/need, any of its methods:
EnumerateDirectories()
EnumerateFiles()
EnumerateFileSystemInfos()
Like so:
DirectoryInfo di = new DirectoryInfo(@"c:\Maps");
foreach (var fsi in di.EnumerateFileSystemInfos("*", SearchOption.AllDirectories))
{
    // Do something useful with fsi here
}
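Applied to the question, the same idea narrowed to *.png might look like this; `PngCollector` and `CollectPngs` are names I made up for the sketch:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class PngCollector
{
    // Returns the full path of every *.png under root, however deeply
    // nested, without any manual recursion over the directory tree.
    public static List<string> CollectPngs(string root)
    {
        var di = new DirectoryInfo(root);
        return di.EnumerateFiles("*.png", SearchOption.AllDirectories)
                 .Select(f => f.FullName)
                 .ToList();
    }
}
```

`SearchOption.AllDirectories` is what removes the need to know the number of subfolders in advance.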

How to programmatically use GroupBy and GroupView in windows folder?

I have a lot of folders in Windows. In each folder there are Image/Videos/Excel/Word/PS/Cad etc. files.
I'm accessing each file programmatically, but every time I select a folder I have to manually set GroupBy="Date Accessed", GroupView="Details", and add a column "Date Modified". I can't find a way of saving customized folder settings for my folders.
Is it possible in C#? I'm using Windows 10.
Yes, it is possible in C# to group files by date accessed or date modified.
In the following example I'll show you how to group by year of creation date.
First you need to get file infos:
// get file-infos for all files in D:\Daten
string[] files = Directory.GetFiles(@"D:\Daten");
// and convert them to file-info-objects
List<FileInfo> filesWithFileInfo = files.Select(it => new FileInfo(it)).ToList();
Then you can use LINQ's GroupBy to group them by year of creation date:
List<IGrouping<int, FileInfo>> filesGroupedByYear = filesWithFileInfo.GroupBy(it => it.CreationTime.Year).ToList();
And this enumerates over all groups and then over files in every group:
foreach (IGrouping<int, FileInfo> filesOfOneYear in filesGroupedByYear)
{
    Console.WriteLine($"{filesOfOneYear.Count()} Files of year {filesOfOneYear.Key}:");
    foreach (FileInfo file in filesOfOneYear.ToList())
    {
        Console.WriteLine($"{file.Name}");
    }
}
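If you need month granularity as well (like Explorer's month groups), the group key can include both year and month. A small sketch of that variant; `MonthGrouping` and `ByMonth` are names I made up:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class MonthGrouping
{
    // Groups FileInfos by (year, month) of their creation time, so that
    // January 2020 and January 2021 end up in different buckets.
    public static Dictionary<(int Year, int Month), List<FileInfo>> ByMonth(
        IEnumerable<FileInfo> files)
    {
        return files
            .GroupBy(f => (f.CreationTime.Year, f.CreationTime.Month))
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

Swap `CreationTime` for `LastAccessTime` or `LastWriteTime` depending on which column you group by in Explorer.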

C# - Quickest way to loop through a folder of 30 thousand PDF files

I have a folder full of 30 thousand PDF files (please don't ask why).
I need to loop through them and match each file's date against the date chosen in the Windows Forms date-picker control.
Here is what I have:
public List<FileInfo> myList = new List<FileInfo>();
DirectoryInfo di = new DirectoryInfo(@"\\PDFs");
myList = (di.EnumerateFiles("*.pdf").Where(x => x.LastWriteTime.Date == datetime.Date).ToList());
After I have the files in the list, I then move them to another location for various other processing, but the part I definitely want to speed up is this one.
It's rather slow - is there any way to speed it up?
Thanks.
You don't have to wait for the whole list of files (myList) to be constructed - you can start processing as soon as the first file is enumerated. Just use Parallel.ForEach to copy and process each file. In the example below I'm using the ConcurrentBag collection to store the results.
var results = new ConcurrentBag<ProcessingResult>();
var files = di.EnumerateFiles("*.pdf").Where(x => x.LastWriteTime.Date == datetime.Date);
Parallel.ForEach(files, file => {
    var newLocation = CopyToNewLocation(file);
    var processingResult = ExecuteAdditionalProcessing(newLocation);
    results.Add(processingResult);
});
If Powershell is an option (and I would recommend it), try this:
Get-ChildItem c:\folder | Where{$_.LastWriteTime -gt (Get-Date).AddDays(-7)}
Get-Date returns today, so the above returns all files that were modified in the last 7 days.

How to zip the multiple files into the folder until condition is met?

Technology Used: C#, IonicZip library.
I have a list of many log files (say 10,000, each of a reasonable size) and have to zip them. But each zip archive's size must stay approximately under 4 MB. How can I produce the minimum possible number of zip archives?
private static string ZipAndReturnFolderPath(IEnumerable<string> files, string saveToFolder)
{
    int listToSkip = 0;
    using (var zip = new ZipFile())
    {
        do
        {
            zip.AddFiles(files.Skip(listToSkip * 10).Take(10));
            zip.Save(saveToFolder);
            listToSkip++;
        }
        while ((new FileInfo(saveToFolder).Length < _lessThan4MB) && totalFilesRemaining > 0);
    }
    return saveToFolder;
}
To keep it concise, I have removed a few lines of code. Parameters: files holds the paths of the remaining files to be zipped (don't worry about how I maintain that); saveToFolder is the destination of the zip archive (this will be unique each time the function is called).
I believe this works - I have checked the files it zips and found no duplication. But zipping files, checking the condition, and then repeating the same process for the next few files on the already-written archive doesn't seem like a good approach.
Am I doing anything wrong, or is there a more efficient way to achieve this?
I think what you're after has already been answered here; using ZipOutputStream could be the way to go.
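The same batching idea can also be sketched with the built-in System.IO.Compression types instead of IonicZip: stream entries straight to disk and start a new archive once the file on disk reaches the cap. The `SizeCappedZipper` name, the part-numbering scheme, and the "measure after each file" heuristic are my own choices, and the size check is only a lower bound (stream buffering means an archive can overshoot by roughly one file):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression; // reference System.IO.Compression(.FileSystem)

static class SizeCappedZipper
{
    // Packs files into consecutive zip archives (part0001.zip, part0002.zip, ...),
    // starting a new archive once the current one reaches maxBytes on disk.
    public static List<string> ZipInBatches(IEnumerable<string> files,
                                            string targetFolder,
                                            long maxBytes)
    {
        Directory.CreateDirectory(targetFolder);
        var archives = new List<string>();
        int part = 0;
        string currentPath = null;
        ZipArchive current = null;

        foreach (string file in files)
        {
            if (current == null)
            {
                part++;
                currentPath = Path.Combine(targetFolder, $"part{part:D4}.zip");
                current = ZipFile.Open(currentPath, ZipArchiveMode.Create);
                archives.Add(currentPath);
            }

            current.CreateEntryFromFile(file, Path.GetFileName(file));

            // ZipArchiveMode.Create streams entries straight to disk, so the
            // file's current length is a usable lower bound on archive size.
            if (new FileInfo(currentPath).Length >= maxBytes)
            {
                current.Dispose();   // finish this archive
                current = null;      // the next file starts a fresh one
            }
        }

        current?.Dispose();
        return archives;
    }
}
```

Unlike the question's do/while, each file is compressed exactly once and nothing is re-zipped after the size check.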

Want to list all Image Files in a folder using C# [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
GetFiles with multiple extentions
is there a function like GetFiles that takes more than one file type, like
DirectoryInfo di = new DirectoryInfo("c:/inetpub/wwwroot/demos");
FileInfo[] rgFiles = di.GetFiles("*.bmp, *.jpg, etc");
AFAIK, this isn't directly possible.
Instead, you can get every file, then filter the array:
HashSet<string> allowedExtensions = new HashSet<string>(extensionArray, StringComparer.OrdinalIgnoreCase);
FileInfo[] files = Array.FindAll(dirInfo.GetFiles(), f => allowedExtensions.Contains(f.Extension));
extensionArray must include the . before each extension, but matching is case-insensitive.
Not that I know of.
I implemented the same problem like so:
DirectoryInfo di = new DirectoryInfo("c:/inetpub/wwwroot/demos");
FileInfo[] rgFiles = di.GetFiles("*.bmp")
.Union(di.GetFiles("*.jpg"))
.Union(di.GetFiles("etc"))
.ToArray();
Note that this requires the System.Linq namespace.
How to get files with multiple extensions
If you want your code to be bullet-proof, in the sense that your file-detection mechanism identifies an image file not by its extension but by the nature of the file, you'd have to load your files as byte[] and look for a magic trail of bytes, usually at the beginning of the array. Every graphical file format has its own way of manifesting itself to software by presenting that magic sequence of bytes. I can post some code examples if you'd like.
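For illustration, such a magic-byte check might look like the sketch below. The signature table covers only a few common formats, and `ImageSniffer`/`Detect` are hypothetical names:

```csharp
using System;
using System.IO;
using System.Linq;

static class ImageSniffer
{
    // Leading "magic" bytes of a few common image formats.
    static readonly (string Name, byte[] Magic)[] Signatures =
    {
        ("png", new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }),
        ("jpg", new byte[] { 0xFF, 0xD8, 0xFF }),
        ("gif", new byte[] { 0x47, 0x49, 0x46, 0x38 }), // "GIF8"
        ("bmp", new byte[] { 0x42, 0x4D }),             // "BM"
    };

    // Returns the detected format name, or null if the header matches nothing.
    public static string Detect(string path)
    {
        byte[] header = new byte[8];
        using (var fs = File.OpenRead(path))
            fs.Read(header, 0, header.Length);

        foreach (var (name, magic) in Signatures)
            if (header.Take(magic.Length).SequenceEqual(magic))
                return name;
        return null;
    }
}
```

Reading just the first few bytes keeps this cheap even for large files, so it can run as a second pass after the extension filter.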
No, there's not. Windows does not have a way to separate filters in the search pattern.
This could be done manually through LINQ, though.
By using EnumerateFiles you get results as they come back, so you don't have to wait for all the files before you start working on the result.
var directory = new DirectoryInfo("C:\\");
var allowedExtensions = new string[] { ".jpg", ".bmp" };
var imageFiles = from file in directory.EnumerateFiles("*", SearchOption.AllDirectories)
where allowedExtensions.Contains(file.Extension.ToLower())
select file;
foreach (var file in imageFiles)
Console.WriteLine(file.FullName);
