While doing some basic validation on the ASP.Net (4.0) Request.Files (upload) collection I decided to try it with LINQ.
The collection is IEnumerable<T> and so doesn't offer ForEach. Foolishly I decided to build an extension method that would do the job. Sorry to say not so much success...
Running the extension method (below) raises the following error:
Unable to cast object of type 'System.String' to type 'System.Web.HttpPostedFile'
There is clearly something I am not getting, but I can't see what it is, so at the risk of looking like an idiot (won't be the first time) here is the code in 3 chunks, along with a promise of gratitude for any help.
First, the extension method with an Action parameter:
//Extend ForEach to IEnumerated Files
public static IEnumerable<HttpPostedFileWrapper> ForEach<T>(this IEnumerable<HttpPostedFileWrapper> source, Action<HttpPostedFileWrapper> action)
{
//breaks on first 'item' init
foreach (HttpPostedFileWrapper item in source)
action(item);
return source;
}
The error occurs when the internal foreach loop hits the 'item' in 'source'.
Here is the calling code (variables MaxFileTries and attachPath are properly set previously) :
var files = Request.Files.Cast<HttpPostedFile>()
.Select(file => new HttpPostedFileWrapper(file))
.Where(file => file.ContentLength > 0
&& file.ContentLength <= MaxFileSize
&& file.FileName.Length > 0)
.ForEach<HttpPostedFileWrapper>(f => f.SaveUpload(attachPath, MaxFileTries));
And lastly, the Action target, Saving the upload file - we don't appear to ever even get to here, but just in case, here it is:
public static HttpPostedFileWrapper SaveUpload(this HttpPostedFileWrapper f, string attachPath, int MaxFileTries)
{
// we can only upload the same file MaxTries times in one session
int tries = 0;
string saveName = f.FileName.Substring(f.FileName.LastIndexOf("\\") + 1); //strip any local
string path = attachPath + saveName;
while (File.Exists(path) && tries <= MaxFileTries)
{
tries++;
path = attachPath + " (" + tries.ToString() + ")" + saveName;
}
if (tries <= MaxFileTries)
{
if (!Directory.Exists(attachPath)) Directory.CreateDirectory(attachPath);
f.SaveAs(path);
}
return f;
}
I confess that some of the above is a cobbling together of "bits found", so I am likely getting what I deserve, but if anyone has a good understanding of (or has at least been through) this, maybe I can learn something.
Thanks for any.
Why don't you just call ToList().ForEach() on the original IEnumerable<T>?
I think this is what you want
Your class that extends HttpFileCollection should be
public static class HttpPostedFileExtension
{
//Extend ForEach to IEnumerated Files
public static void ProcessPostedFiles(this HttpFileCollection source, Func<HttpPostedFile, bool> predicate, Action<HttpPostedFile> action)
{
foreach (var item in source.AllKeys)
{
var httpPostedFile = source[item];
if (predicate(httpPostedFile))
action(httpPostedFile);
}
}
}
And then you can use it like so:
Request.Files.ProcessPostedFiles(
postedFile =>
{
return (postedFile.ContentLength > 0 && postedFile.FileName.Length > 0);
},
pFile =>
{
//Do something with pFile, which is an instance of HttpPostedFile
});
First, write a common extension method:
public static IEnumerable<T> ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
foreach (T item in source)
action(item);
return source; // or void, without return
}
Then, Request.Files is castable to IEnumerable<string>, not IEnumerable<HttpPostedFile>, because enumerating an HttpFileCollection yields its keys rather than the posted files:
IEnumerable<string> requestFiles = this.Request.Files.Cast<string>();
but indeed System.Web.UI.WebControls.FileUpload.PostedFile is HttpPostedFile:
HttpPostedFile file = fileUpload.PostedFile;
HttpPostedFileWrapper wrapper = new HttpPostedFileWrapper(file);
but that is a single file, not a collection. Where did you get your collection from?
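To see why the cast to HttpPostedFile fails, here is a minimal sketch. It uses NameValueCollection as a stand-in (an assumption for illustration only): like HttpFileCollection, it derives from NameObjectCollectionBase, whose enumerator hands back the keys as strings. The class and method names are made up for the example.

```csharp
using System;
using System.Collections.Specialized;
using System.Linq;

static class KeyEnumerationDemo
{
    // Returns what enumerating the collection actually yields: its keys.
    public static string[] EnumeratedItems(NameValueCollection collection)
    {
        return collection.Cast<string>().ToArray();
    }

    static void Main()
    {
        // Stand-in for Request.Files: field name -> file name.
        var form = new NameValueCollection();
        form.Add("upload1", "report.pdf");
        form.Add("upload2", "notes.txt");

        foreach (string key in EnumeratedItems(form))
        {
            // The values have to be looked up through the indexer.
            Console.WriteLine(key + " -> " + form[key]);
        }
    }
}
```

The same pattern applies to Request.Files: enumerate the keys (or AllKeys) and index into the collection to get each HttpPostedFile.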
Another extension method:
public static IEnumerable<HttpPostedFile> ToEnumerable(this HttpFileCollection collection)
{
foreach (var item in collection.AllKeys)
{
yield return collection[item];
}
}
Usage:
IEnumerable<HttpPostedFile> files = this.Request.Files.ToEnumerable();
IEnumerable<HttpPostedFileWrapper> wrappers = files.Select(f => new HttpPostedFileWrapper(f));
Related
If I'm not mistaken, this method searches the file paths for the search term.
So when I typed in a searchTerm, for example "Form1", it found 46 files.
But now I want to change the method to drop the searchTerm so that it loops over all the files, and in the end I want a List of all the file types. Duplicates should not be added, so I end up with a List of file type items like cs, txt, xml and know which file types exist.
IEnumerable<string> SearchAccessibleFiles(string root, string searchTerm)
{
var files = new List<string>();
foreach (var file in Directory.EnumerateFiles(root).Where(m => m.Contains(searchTerm)))
{
files.Add(file);
}
foreach (var subDir in Directory.EnumerateDirectories(root))
{
try
{
files.AddRange(SearchAccessibleFiles(subDir, searchTerm));
}
catch (UnauthorizedAccessException ex)
{
// ...
}
}
return files;
}
The problem is that if I just call GetFiles with c:\ as the root directory, it stops and returns no files at all once it reaches the Documents and Settings directory on Windows 10:
Directory.GetFiles(textBox3.Text, "*.*", SearchOption.AllDirectories).ToList();
Since I couldn't find a way to make Directory.GetFiles skip this directory, I'm trying to use the recursive method.
You can get the file extension using extension = Path.GetExtension(fileName); and remove duplicates by calling .Distinct() on the files list, like this:
IEnumerable<string> SearchAccessibleFiles(string root, string searchTerm)
{
var files = new List<string>();
foreach (var file in Directory.EnumerateFiles(root).Where(m => m.Contains(searchTerm)))
{
string extension = Path.GetExtension(file);
files.Add(extension);
}
foreach (var subDir in Directory.EnumerateDirectories(root))
{
try
{
files.AddRange(SearchAccessibleFiles(subDir, searchTerm));
}
catch (UnauthorizedAccessException ex)
{
// ...
}
}
return files.Distinct().ToList();
}
You can simply remove the .Where(m => m.Contains(searchTerm)) part to search without a search term.
Edit
If you don't want to use .Distinct() and want to check for duplicates on the go, you can try this method:
IEnumerable<string> SearchAccessibleFilesNoDistinct(string root, List<string> files)
{
if(files == null)
files = new List<string>();
foreach (var file in Directory.EnumerateFiles(root))
{
string extension = Path.GetExtension(file);
if(!files.Contains(extension))
files.Add(extension);
}
foreach (var subDir in Directory.EnumerateDirectories(root))
{
try
{
SearchAccessibleFilesNoDistinct(subDir, files);
}
catch (UnauthorizedAccessException ex)
{
// ...
}
}
return files;
}
and first time call looks like this:
var extensionsList = SearchAccessibleFilesNoDistinct("rootAddress",null);
As you can see, the files list is passed through the recursive method, so every recursive call works on the same list, and that should do the trick. Keep in mind that the recursive calls don't need the returned list, since they already share it, but the top-level caller can use the returned list afterwards.
hope that helps
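A variation on the idea above, sketched under a few assumptions: a HashSet<string> replaces the List<string> plus Contains check, since HashSet.Add already ignores duplicates. The names ExtensionCollector and Collect are hypothetical, and the case-insensitive comparer is a choice made for the example, not something the question asked for. Main builds a throwaway directory tree so the sketch can run anywhere.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ExtensionCollector
{
    // Adds the extension of every accessible file under root to 'extensions'.
    public static void Collect(string root, HashSet<string> extensions)
    {
        try
        {
            foreach (var file in Directory.EnumerateFiles(root))
                extensions.Add(Path.GetExtension(file)); // Add ignores duplicates

            foreach (var subDir in Directory.EnumerateDirectories(root))
                Collect(subDir, extensions); // same set shared by all calls
        }
        catch (UnauthorizedAccessException)
        {
            // Skip the rest of a directory we are not allowed to read.
        }
    }

    static void Main()
    {
        // Build a small temporary tree: two .cs files and one .txt file.
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(Path.Combine(root, "sub"));
        File.WriteAllText(Path.Combine(root, "a.cs"), "");
        File.WriteAllText(Path.Combine(root, "b.CS"), "");
        File.WriteAllText(Path.Combine(root, "sub", "c.txt"), "");

        var extensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        Collect(root, extensions);
        Console.WriteLine(string.Join(", ", extensions));

        Directory.Delete(root, true);
    }
}
```

Because each recursive call has its own try/catch, an inaccessible subdirectory is skipped without aborting the scan of its parent.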
The problem can be broken down into several parts:
(1) Recursively enumerate all the accessible directories
(2) Enumerate the files of multiple directories
(3) Get the distinct file extensions
Note that only (3) is specific; (1) and (2) are general and can be used for other processing (like your SearchAccessibleFiles etc.). So let's solve them separately:
(1) Enumerate recursively all the accessible directories:
This in turn can be split in two parts:
(A) Enumerate recursively a generic tree structure
(B) Specialization of the above for accessible directory tree
For (A) I personally use the helper method from my answer to How to flatten tree via LINQ? and similar:
public static class TreeUtils
{
public static IEnumerable<T> Expand<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector)
{
var stack = new Stack<IEnumerator<T>>();
var e = source.GetEnumerator();
try
{
while (true)
{
while (e.MoveNext())
{
var item = e.Current;
yield return item;
var elements = elementSelector(item);
if (elements == null) continue;
stack.Push(e);
e = elements.GetEnumerator();
}
if (stack.Count == 0) break;
e.Dispose();
e = stack.Pop();
}
}
finally
{
e.Dispose();
while (stack.Count != 0) stack.Pop().Dispose();
}
}
}
and here is the specialization for our case:
public static partial class DirectoryUtils
{
public static IEnumerable<DirectoryInfo> EnumerateAccessibleDirectories(string path, bool all = false)
{
var filter = Func((IEnumerable<DirectoryInfo> source) =>
source.Select(di =>
{
try { return new { Info = di, Children = di.EnumerateDirectories() }; }
catch (UnauthorizedAccessException) { return null; }
})
.Where(e => e != null));
var items = filter(Enumerable.Repeat(new DirectoryInfo(path), 1));
if (all)
items = items.Expand(e => filter(e.Children));
else
items = items.Concat(items.SelectMany(e => filter(e.Children)));
return items.Select(e => e.Info);
}
static Func<T, TResult> Func<T, TResult>(Func<T, TResult> func) { return func; }
}
(2) Enumerate multiple directory files:
A couple of simple helper methods to eliminate the repetitive code:
partial class DirectoryUtils
{
public static IEnumerable<FileInfo> EnumerateAccessibleFiles(string path, bool allDirectories = false)
{
return EnumerateAccessibleDirectories(path, allDirectories).EnumerateFiles();
}
public static IEnumerable<FileInfo> EnumerateFiles(this IEnumerable<DirectoryInfo> source)
{
return source.SelectMany(di => di.EnumerateFiles());
}
}
(3) Get the distinct file extensions:
With the above helpers, this is a matter of a simple LINQ query:
var result = DirectoryUtils.EnumerateAccessibleFiles(rootPath, true)
.Select(file => file.Extension).Distinct()
.ToList();
Finally, just for comparison, here is how your original method will look when using the same helpers:
IEnumerable<string> SearchAccessibleFiles(string root, string searchTerm)
{
return DirectoryUtils.EnumerateAccessibleFiles(root, true)
.Where(file => file.FullName.Contains(searchTerm))
.Select(file => file.FullName);
}
If the search term is not supposed to include directory info, you can change the filter condition to file.Name.Contains(searchTerm).
I have 2 collections of files as List<FileInfo>. I am currently using 2 x foreach to loop through each set and match the files (shown below). Is there a quicker way to do this in LINQ, calling .RemoveAt when a match is found?
I need the filenames and file lengths to match.
var sdinfo = new DirectoryInfo(srcPath);
var ddinfo = new DirectoryInfo(dstPath);
var sFiles = new List<FileInfo>(sdinfo.GetFiles("*", SearchOption.AllDirectories));
var dFiles = new List<FileInfo>(ddinfo.GetFiles("*", SearchOption.AllDirectories));
foreach (var sFile in sFiles)
{
bool foundFile = false;
int i = 0;
foreach (var dFile in dFiles)
{
if (sFile.Name == dFile.Name && sFile.Length == dFile.Length)
{
foundFile = true;
dFiles.RemoveAt(i);
}
i += 1;
}
}
Cheers.
You could use the Enumerable.Except<TSource> method:
private class FileInfoComparer : IEqualityComparer<FileInfo>
{
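public bool Equals(FileInfo x, FileInfo y)
{
if (x == null || y == null) return x == null && y == null;
return x.Name.Equals(y.Name, StringComparison.CurrentCultureIgnoreCase) && x.Length == y.Length;
}
public int GetHashCode(FileInfo obj)
{
// Must agree with Equals: FileInfo's own hash code is reference-based,
// so Except would never even call Equals for two different instances.
return StringComparer.CurrentCultureIgnoreCase.GetHashCode(obj.Name) ^ obj.Length.GetHashCode();
}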
}
sFiles = sFiles.Except(dFiles, new FileInfoComparer()).ToList();
In the example above you get all files from sFiles that are absent from dFiles.
For one, this code will throw an exception as soon as a match is found, because you're modifying a collection (dFiles) while iterating over it. This is easily solved by iterating over a copy made with ToList(). That version has its own issue, though: the index is incremented regardless of removal, so after a removal RemoveAt(i) hits the wrong element or runs out of range, the classic off-by-one error.
If you're worried about speed, don't be. LINQ's methods are themselves built on foreach and yield return, as you can see in the Reference Source.
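If you want to make the code easier to read, then this is where LINQ becomes useful. For one, there is the .Join() method, keyed here on name and length because FileInfo instances only compare equal by reference:
foreach(var fileToRemove in sFiles.Join(dFiles, s => new { s.Name, s.Length }, d => new { d.Name, d.Length }, (s, d) => d).ToArray())
dFiles.Remove(fileToRemove);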
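Assuming you're only iterating through the result afterwards, you can also use the .Except(...) method. Pass it an IEqualityComparer<FileInfo> that compares name and length (here a hypothetical fileComparer), because the default comparer only matches identical FileInfo instances:
var files = sdinfo.GetFiles("*", SearchOption.AllDirectories)
.Except(ddinfo.GetFiles("*", SearchOption.AllDirectories), fileComparer);
Finally, if you need to KEEP sFiles, the following code wraps it all together:
List<FileInfo> sFiles = sdinfo.GetFiles("*", SearchOption.AllDirectories).ToList();
List<FileInfo> dFiles = ddinfo.GetFiles("*", SearchOption.AllDirectories)
.Except(sFiles, fileComparer).ToList();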
If you want to trade space for time, you could build a hash set from one list and then look up each element of the other in the hash set. Hash-set lookups are O(1) on average, whereas scanning the list for each element is O(n).
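A minimal sketch of the hash-set idea, under some assumptions: files are represented by plain (name, length) tuples so the example runs without touching the disk, and the names HashSetMatchDemo and Unmatched are made up for illustration. Tuple<string, long> compares by value, which is what makes it usable as a set key; note that the name comparison is case-sensitive here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HashSetMatchDemo
{
    // Returns the destination entries that have no source entry
    // with the same name and length.
    public static List<Tuple<string, long>> Unmatched(
        IEnumerable<Tuple<string, long>> source,
        IEnumerable<Tuple<string, long>> destination)
    {
        // One pass to build the set, then one O(1) average lookup per entry.
        var sourceKeys = new HashSet<Tuple<string, long>>(source);
        return destination.Where(d => !sourceKeys.Contains(d)).ToList();
    }

    static void Main()
    {
        var source = new[] { Tuple.Create("a.txt", 10L), Tuple.Create("b.txt", 20L) };
        var destination = new[]
        {
            Tuple.Create("a.txt", 10L), // same name and length: matched
            Tuple.Create("b.txt", 99L), // same name, different length
            Tuple.Create("c.txt", 5L)   // not in source at all
        };

        foreach (var entry in Unmatched(source, destination))
            Console.WriteLine(entry.Item1 + " (" + entry.Item2 + ")");
        // prints:
        // b.txt (99)
        // c.txt (5)
    }
}
```

With real FileInfo objects, the keys could be built with something like f => Tuple.Create(f.Name, f.Length) before filling the set.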
I would like to do something like
Action<FileInfo> deleter = f =>
{
if (....) // delete condition here
{
System.IO.File.Delete(f.FullName);
}
};
DirectoryInfo di = new DirectoryInfo(_path);
di.GetFiles("*.pdf").Select(deleter); // <= Does not compile!
di.GetFiles("*.txt").Select(deleter); // <= Does not compile!
di.GetFiles("*.dat").Select(deleter); // <= Does not compile!
in order to delete old files from a directory. But I do not know how to directly apply the delegate to the FileInfo[] without an explicit foreach (the idea listed above does not work of course).
Is it possible?
Select() is used to project items from TSource to TResult. In your case, you do not need Select because you're not projecting. Instead, use List<T>'s ForEach method to delete files:
di.GetFiles("*.pdf").ToList().ForEach(deleter);
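There is a second reason Select is a poor fit for side effects: it is lazily evaluated, so the lambda does not run until something enumerates the result. A small sketch of that behaviour follows; the class and method names are invented for the example, and plain strings stand in for the files.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredSelectDemo
{
    // Returns how many items had been processed at three points in time.
    public static int[] Run()
    {
        var processed = new List<string>();
        string[] names = { "a.pdf", "b.pdf" };

        // Building the query runs nothing yet.
        IEnumerable<int> query = names.Select(n => { processed.Add(n); return n.Length; });
        int afterSelect = processed.Count;

        // Enumerating the query (here via ToList) finally runs the lambda.
        query.ToList();
        int afterEnumeration = processed.Count;

        // List<T>.ForEach runs its action immediately.
        names.ToList().ForEach(n => processed.Add(n));
        int afterForEach = processed.Count;

        return new[] { afterSelect, afterEnumeration, afterForEach };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "0, 2, 4"
    }
}
```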
As DarkGray suggests, you could, if somewhat unusually, use Select to first act on the file and then return null. I would recommend using a ForEach extension instead, like so:
ForEach LINQ Extension
public static void ForEach<TSource>(this IEnumerable<TSource> source, Action<TSource> action)
{
foreach(TSource item in source)
{
action(item);
}
}
You should then be able to execute the action on the array of FileInfo, as an array is enumerable. Like so:
Execution
Action<FileInfo> deleter = f =>
{
if (....) // delete condition here
{
System.IO.File.Delete(f.FullName);
}
};
DirectoryInfo di = new DirectoryInfo(_path);
di.GetFiles("*.pdf").ForEach(deleter);
Edit by Richard.
I do want to raise attention to the argument of foreach vs ForEach. In my opinion the ForEach statement should directly affect the object being passed in, and in this case it does. So I've contradicted myself. Oops! :)
di.GetFiles("*.pdf").Select(f => { deleter(f); return (object)null; }).ToList(); // ToList forces the lazy Select to run
or
di.GetFiles("*.pdf").ForEach(deleter);
public static class Hlp
{
static public void ForEach<T>(this IEnumerable<T> items, Action<T> action)
{
foreach (var item in items)
action(item);
}
}
I have a need to pass the results of a source function (which returns an IEnumerable) through a list of other processing functions (that each take and return an IEnumerable).
All is fine up to that point, but I also need to allow the processing functions to perform multiple loops over their input enumerables.
So rather than pass in IEnumerable<T>, I thought I would change the input parameter to Func<IEnumerable<T>> and allow each of the functions to restart the enumerable if required.
Unfortunately, I'm now getting a stack overflow where the final processing function is calling itself rather than passing the request back down the chain.
The example code is a bit contrived but hopefully gives you an idea of what I'm trying to achieve.
class Program
{
public static void Main(string[] args)
{
Func<IEnumerable<String>> getResults = () => GetInputValues("A", 5);
List<String> valuesToAppend = new List<String>();
valuesToAppend.Add("B");
valuesToAppend.Add("C");
foreach (var item in valuesToAppend)
{
getResults = () => ProcessValues(() => getResults(),item);
}
foreach (var item in getResults())
{
Console.WriteLine(item);
}
}
public static IEnumerable<String> GetInputValues(String value, Int32 numValues)
{
for (int i = 0; i < numValues; i++)
{
yield return value;
}
}
public static IEnumerable<String> ProcessValues(Func<IEnumerable<String>> getInputValues, String appendValue)
{
foreach (var item in getInputValues())
{
yield return item + " " + appendValue;
}
}
}
getResults is captured as a variable, not a value. I don't really like the overall approach you're using here (it seems convoluted), but you should be able to fix the stackoverflow by changing the capture:
foreach (var item in valuesToAppend)
{
var tmp1 = getResults;
var tmp2 = item;
getResults = () => ProcessValues(() => tmp1(),tmp2);
}
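The difference between capturing a variable and capturing a value can be seen in isolation. The following is a small sketch, not the original program; the names are hypothetical. The first method shows a lambda observing a later assignment, the second shows the per-iteration copy that keeps each lambda from calling itself.

```csharp
using System;

static class ClosureCaptureDemo
{
    // The lambda reads the variable itself, so it sees later assignments.
    public static int ReadThroughVariable()
    {
        int value = 1;
        Func<int> read = () => value;
        value = 2;
        return read(); // 2, not 1
    }

    // Copying to a fresh local per iteration freezes what each lambda calls.
    public static int ChainWithCopies()
    {
        Func<int> get = () => 1;
        for (int i = 0; i < 2; i++)
        {
            Func<int> previous = get;   // snapshot of the current delegate
            get = () => previous() + 1; // calls the snapshot, not itself
        }
        return get(); // 1 + 1 + 1 = 3
    }

    static void Main()
    {
        Console.WriteLine(ReadThroughVariable()); // prints 2
        Console.WriteLine(ChainWithCopies());     // prints 3
    }
}
```

Writing get = () => get() + 1 without the copy would make the lambda call whatever get refers to at invocation time, which is the lambda itself, hence the stack overflow.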
On a side note: IEnumerable[<T>] is already kind of repeatable, you simply call foreach another time; it is IEnumerator[<T>] that (despite the Reset()) isn't. I also think it is worth trying to do this without ever needing to repeat the enumeration, since in the general case that simply cannot be guaranteed to work.
Here's a simpler (IMO) implementation with the same result:
using System;
using System.Collections.Generic;
using System.Linq;
class Program {
public static void Main() {
IEnumerable<String> getResults = Enumerable.Repeat("A", 5);
List<String> valuesToAppend = new List<String> { "B", "C" };
foreach (var item in valuesToAppend) {
string tmp = item;
getResults = getResults.Select(s => s + " " + tmp);
}
foreach (var item in getResults) {
Console.WriteLine(item);
}
}
}
What's the "best" (taking both speed and readability into account) way to determine if a list is empty? Even if the list is of type IEnumerable<T> and doesn't have a Count property.
Right now I'm tossing up between this:
if (myList.Count() == 0) { ... }
and this:
if (!myList.Any()) { ... }
My guess is that the second option is faster, since it'll come back with a result as soon as it sees the first item, whereas the first option (for an IEnumerable) will need to visit every item to return the count.
That being said, does the second option look as readable to you? Which would you prefer? Or can you think of a better way to test for an empty list?
Edit: @lassevk's response seems to be the most logical, coupled with a bit of runtime checking to use a cached count if possible, like this:
public static bool IsEmpty<T>(this IEnumerable<T> list)
{
if (list is ICollection<T>) return ((ICollection<T>)list).Count == 0;
return !list.Any();
}
You could do this:
public static Boolean IsEmpty<T>(this IEnumerable<T> source)
{
if (source == null)
return true; // or throw an exception
return !source.Any();
}
Edit: Note that simply using the .Count method will be fast if the underlying source actually has a fast Count property. A valid optimization above would be to detect a few base types and simply use the .Count property of those, instead of the .Any() approach, but then fall back to .Any() if no guarantee can be made.
I would make one small addition to the code you seem to have settled on: check also for ICollection, as this is implemented even by some non-obsolete generic classes as well (i.e., Queue<T> and Stack<T>). I would also use as instead of is as it's more idiomatic and has been shown to be faster.
public static bool IsEmpty<T>(this IEnumerable<T> list)
{
if (list == null)
{
throw new ArgumentNullException("list");
}
var genericCollection = list as ICollection<T>;
if (genericCollection != null)
{
return genericCollection.Count == 0;
}
var nonGenericCollection = list as ICollection;
if (nonGenericCollection != null)
{
return nonGenericCollection.Count == 0;
}
return !list.Any();
}
LINQ itself must be doing some serious optimization around the Count() method somehow.
Does this surprise you? I imagine that for IList implementations, Count simply reads the number of elements directly while Any has to query the IEnumerable.GetEnumerator method, create an instance and call MoveNext at least once.
/EDIT @Matt:
I can only assume that the Count() extension method for IEnumerable is doing something like this:
Yes, of course it does. This is what I meant. Actually, it uses ICollection instead of IList but the result is the same.
I just wrote up a quick test, try this:
IEnumerable<Object> myList = new List<Object>();
Stopwatch watch = new Stopwatch();
int x;
watch.Start();
for (var i = 0; i <= 1000000; i++)
{
if (myList.Count() == 0) x = i;
}
watch.Stop();
Stopwatch watch2 = new Stopwatch();
watch2.Start();
for (var i = 0; i <= 1000000; i++)
{
if (!myList.Any()) x = i;
}
watch2.Stop();
Console.WriteLine("myList.Count() = " + watch.ElapsedMilliseconds.ToString());
Console.WriteLine("myList.Any() = " + watch2.ElapsedMilliseconds.ToString());
Console.ReadLine();
The second is almost three times slower :)
Trying the stopwatch test again with a Stack, an array, or other scenarios, it seems to really depend on the type of list, because for those Count proved to be slower.
So I guess it depends on the type of list you're using!
(Just to point out, I put 2000+ objects in the List and Count was still faster, the opposite of the other types.)
List.Count is O(1) according to Microsoft's documentation:
http://msdn.microsoft.com/en-us/library/27b47ht3.aspx
so just use List.Count == 0; it's much faster than a query.
This is because the list keeps an internal count that is updated any time something is added or removed, so when you call List.Count it doesn't have to iterate through every element; it just returns the stored value.
The second option is much quicker if you have multiple items.
Any() returns as soon as 1 item is found.
Count() has to keep going through the entire sequence (unless the source is a collection that already knows its count).
For instance suppose the enumeration had 1000 items.
Any() would check the first one, then return true.
Count() would return 1000 after traversing the entire enumeration.
This is potentially worse if you use one of the predicate overloads: Count() still has to check every single item, even if there is only one match.
You get used to using the Any one - it does make sense and is readable.
One caveat - if you have a List, rather than just an IEnumerable then use that list's Count property.
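The 1000-item example can be checked with a short sketch. It assumes a lazily generated sequence (an iterator method, so there is no stored count to fall back on), and the Pulled counter and the names are invented for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AnyVersusCountDemo
{
    // Number of items the iterator has produced so far.
    public static int Pulled;

    // A lazy sequence: items only exist once something asks for them.
    public static IEnumerable<int> Numbers(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    static void Main()
    {
        Pulled = 0;
        bool any = Numbers(1000).Any();
        Console.WriteLine(any + " after " + Pulled + " item(s)");  // prints "True after 1 item(s)"

        Pulled = 0;
        int total = Numbers(1000).Count();
        Console.WriteLine(total + " after " + Pulled + " item(s)"); // prints "1000 after 1000 item(s)"
    }
}
```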
@Konrad what surprises me is that in my tests, I'm passing the list into a method that accepts IEnumerable<T>, so the runtime can't optimize it by calling the Count() extension method for IList<T>.
I can only assume that the Count() extension method for IEnumerable is doing something like this:
public static int Count<T>(this IEnumerable<T> list)
{
if (list is IList<T>) return ((IList<T>)list).Count;
int i = 0;
foreach (var t in list) i++;
return i;
}
... in other words, a bit of runtime optimization for the special case of IList<T>.
/EDIT @Konrad +1 mate - you're right about it more likely being on ICollection<T>.
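One way to observe the ICollection<T> shortcut without reading the framework source is a collection that records whether it was ever enumerated. This is only a sketch: TrackingCollection is a hypothetical type written for the experiment, and it is deliberately read-only.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// A hypothetical collection that records whether it was ever enumerated.
class TrackingCollection : ICollection<int>
{
    private readonly List<int> items = new List<int> { 1, 2, 3 };
    public bool WasEnumerated { get; private set; }

    public int Count { get { return items.Count; } }
    public bool IsReadOnly { get { return true; } }
    public void Add(int item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Contains(int item) { return items.Contains(item); }
    public void CopyTo(int[] array, int arrayIndex) { items.CopyTo(array, arrayIndex); }
    public bool Remove(int item) { throw new NotSupportedException(); }

    public IEnumerator<int> GetEnumerator()
    {
        WasEnumerated = true; // set as soon as anyone asks for an enumerator
        return items.GetEnumerator();
    }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

static class CountShortcutDemo
{
    static void Main()
    {
        var collection = new TrackingCollection();
        IEnumerable<int> sequence = collection; // static type hides the Count property

        Console.WriteLine(sequence.Count());        // prints 3
        Console.WriteLine(collection.WasEnumerated); // prints False: Count() used the Count property
    }
}
```

Even though the variable is typed as IEnumerable<int>, the Count() extension method checks the runtime type and never asks for an enumerator.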
Ok, so what about this one?
public static bool IsEmpty<T>(this IEnumerable<T> enumerable)
{
// Dispose the enumerator, as foreach and Any() would do.
using (var enumerator = enumerable.GetEnumerator())
return !enumerator.MoveNext();
}
EDIT: I've just realized that someone has sketched this solution already. It was mentioned that the Any() method will do this, but why not do it yourself? Regards
Another idea:
if(enumerable.FirstOrDefault() != null)
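Note that this tests for a non-empty sequence, and it only works when the elements are reference types that are never null. However, I like the Any() approach more.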
This was critical to get this to work with Entity Framework:
var genericCollection = list as ICollection<T>;
if (genericCollection != null)
{
//your code
}
If I check with Count(), LINQ executes a "SELECT COUNT(*) ..." in the database, but I only need to check whether the results contain data, so I resolved this by using FirstOrDefault() instead of Count():
Before
var cfop = from tabelaCFOPs in ERPDAOManager.GetTable<TabelaCFOPs>() select tabelaCFOPs;
if (cfop.Count() > 0)
{
var itemCfop = cfop.First();
//....
}
After
var cfop = from tabelaCFOPs in ERPDAOManager.GetTable<TabelaCFOPs>() select tabelaCFOPs;
var itemCfop = cfop.FirstOrDefault();
if (itemCfop != null)
{
//....
}
private bool NullTest<T>(T[] list, string attribute)
{
bool status = false;
if (list != null)
{
int flag = 0;
var property = GetProperty(list.FirstOrDefault(), attribute);
foreach (T obj in list)
{
if (property.GetValue(obj, null) == null)
flag++;
}
status = flag == 0;
}
return status;
}
public PropertyInfo GetProperty<T>(T obj, string str)
{
Expression<Func<T, string, PropertyInfo>> GetProperty = (TypeObj, Column) => TypeObj.GetType().GetProperty(TypeObj
.GetType().GetProperties().ToList()
.Find(property => property.Name
.ToLower() == Column
.ToLower()).Name.ToString());
return GetProperty.Compile()(obj, str);
}
Here's my implementation of Dan Tao's answer, allowing for a predicate:
public static bool IsEmpty<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
if (source == null) throw new ArgumentNullException("source");
if (IsCollectionAndEmpty(source)) return true;
return !source.Any(predicate);
}
public static bool IsEmpty<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw new ArgumentNullException("source");
if (IsCollectionAndEmpty(source)) return true;
return !source.Any();
}
private static bool IsCollectionAndEmpty<TSource>(IEnumerable<TSource> source)
{
var genericCollection = source as ICollection<TSource>;
if (genericCollection != null) return genericCollection.Count == 0;
var nonGenericCollection = source as ICollection;
if (nonGenericCollection != null) return nonGenericCollection.Count == 0;
return false;
}
List<T> li = new List<T>();
(li.First().DefaultValue.HasValue) ? string.Format("{0:yyyy/MM/dd}", sender.First().DefaultValue.Value) : string.Empty;
myList.ToList().Count == 0. That's all
This extension method works for me:
public static bool IsEmpty<T>(this IEnumerable<T> enumerable)
{
try
{
enumerable.First();
return false;
}
catch (InvalidOperationException)
{
return true;
}
}