Sort List<string> based on character count - c#

Example:
List<string> folders = new List<string>();
folders.Add("folder1/folder2/folder3/");
folders.Add("folder1/");
folders.Add("folder1/folder2/");
I want to sort this list by the number of '/' characters in each string,
so that my output will be:
folder1/
folder1/folder2/
folder1/folder2/folder3/

LINQ:
folders = folders.OrderBy(f => f.Length).ToList(); // throws if the list contains a null string
Or List.Sort, which sorts the list in place:
folders.Sort((s1, s2) => s1.Length.CompareTo(s2.Length));
A safe approach if the list could contain nulls:
folders = folders.OrderBy(f => f?.Length ?? int.MinValue).ToList();
If you actually want to sort by folder depth rather than string length:
folders = folders.OrderBy(f => f.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar).Length).ToList();
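One thing to watch with the depth-based sort: a plain Split counts the empty entry after a trailing separator, so "folder1/" and "folder1" would get different depths. The following is a minimal sketch, not a definitive implementation, that counts only non-empty segments. The helper name Depth is hypothetical and is not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: counts the non-empty path segments,
// so "folder1/" and "folder1" both have depth 1.
static int Depth(string path) =>
    path.Split(
        new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar },
        StringSplitOptions.RemoveEmptyEntries).Length;

var folders = new List<string>
{
    "folder1/folder2/folder3/",
    "folder1/",
    "folder1/folder2/"
};

// OrderBy is a stable sort, so paths of equal depth keep their relative order.
List<string> byDepth = folders.OrderBy(f => Depth(f)).ToList();

foreach (var folder in byDepth)
    Console.WriteLine(folder);
```

Because the sort is stable, adding a ThenBy on the path itself is only needed if paths of equal depth should also be ordered by name.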

It's likely you actually want to sort by name:
folders = folders.OrderBy(f => f).ToList();
Or simply:
folders.Sort();
This will work correctly for cases like this:
folder1/
folder1/subfolder1
folder1/subfolder1/subsubfolder
folder2
folder2/subfolder2
Sorting by length alone will consider "folder1" and "folder2" equal.
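Note that folders.Sort() with no arguments uses the default string comparer, which is culture-sensitive. For paths it is often safer to compare ordinally. A small sketch of that variant, assuming ordinal ordering is acceptable for your folder names (the sample data is the list from above, shuffled):

```csharp
using System;
using System.Collections.Generic;

var folders = new List<string>
{
    "folder2/subfolder2",
    "folder1/subfolder1/subsubfolder",
    "folder2",
    "folder1/",
    "folder1/subfolder1"
};

// Ordinal comparison compares char codes and does not depend on the current culture.
folders.Sort(StringComparer.Ordinal);

foreach (var folder in folders)
    Console.WriteLine(folder);
```

With an ordinal comparison a parent path always sorts before the paths that extend it, because a string that is a prefix of another compares as smaller.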

Related

How to sort numbered filenames in list in C#

I am trying to sort a list that contains file paths,
and I want them sorted by the numbers in their names.
The code I am using does not give the expected result:
var mylist = mylist.OrderBy(x => int.Parse(Regex.Replace(x, "[^0-9]+", "0"))).ToList<string>();
I expect the result to be:
c:\somedir\1.jpg
c:\somedir\2.jpg
c:\somedir\3.jpg
c:\somedir\7.jpg
c:\somedir\8.jpg
c:\somedir\9.jpg
c:\somedir\10.jpg
c:\somedir\12.jpg
c:\somedir\20.jpg
But the output is random.
There is a simple way of achieving that.
Let's say you have a string list like this:
List<string> allThePaths = new List<string>()
{
"c:\\somedir\\1.jpg",
"c:\\somedir\\2.jpg",
"c:\\somedir\\20.jpg",
"c:\\somedir\\7.jpg",
"c:\\somedir\\12.jpg",
"c:\\somedir\\8.jpg",
"c:\\somedir\\9.jpg",
"c:\\somedir\\3.jpg",
"c:\\somedir\\10.jpg"
};
You can get the desired result with this:
List<string> sortedPaths = allThePaths
.OrderBy(stringItem => stringItem.Length)
.ThenBy(stringItem => stringItem).ToList();
Note: Also make sure you've included LINQ:
using System.Linq;
Here is a demo example just in case it's needed.
More complex solutions can be found there.
A cleaner way of doing this would be to use System.IO.Path:
public IEnumerable<string> OrderFilesByNumberedName(IEnumerable<string> unsortedPathList) =>
unsortedPathList
.Select(path => new { name = Path.GetFileNameWithoutExtension(path), path }) // Get filename
.OrderBy(file => int.Parse(file.name)) // Sort by number
.Select(file => file.path); // Return only path
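Both approaches above have limits: the length-then-name sort only works when every path shares the same directory and extension, and int.Parse throws as soon as one file name is not purely numeric. A hedged sketch of a more tolerant variant follows. The helper name NumericKey and the choice to sort non-numeric names last are assumptions, not part of the original answers. Forward slashes are used in the sample paths so the example behaves the same on every platform.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: the file name (without extension) parsed as a number,
// or int.MaxValue when the name is not purely numeric, so such files sort last.
static int NumericKey(string path)
{
    string name = Path.GetFileNameWithoutExtension(path);
    return int.TryParse(name, out int number) ? number : int.MaxValue;
}

var paths = new List<string>
{
    "c:/somedir/10.jpg",
    "c:/somedir/2.jpg",
    "c:/somedir/cover.jpg",
    "c:/somedir/1.jpg"
};

// Sort by number first; fall back to an ordinal name comparison for ties.
List<string> sorted = paths
    .OrderBy(p => NumericKey(p))
    .ThenBy(p => p, StringComparer.Ordinal)
    .ToList();

foreach (var p in sorted)
    Console.WriteLine(p);
```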

Sort List of List and maintain correct number of lists in c#

For the following code...
var lightningFileNames = ConfigurationManager.AppSettings["LightningFileNames"];
var files = Directory.GetFiles(mapPath, lightningFileNames);
List<List<LightningStrikeModel>> ltgStrikes = new List<List<LightningStrikeModel>>();
foreach (string file in files)
{
var stringData = new List<string>();
using (var reader = new StreamReader(file))
{
while (!reader.EndOfStream)
{
var data = reader.ReadLine().Trim();
if (!string.IsNullOrEmpty(data))
{
stringData.Add(data);
}
}
reader.Close();
}
//extracted from file name to get an orderby
int lgtTemp = int.Parse(Regex.Match(file, @"\d+").Value);
ltgStrikes.Add((from item in stringData
select item.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
into rawData
where rawData.Length >= 4
select new LightningStrikeModel
{
Date = rawData[0],
Time = rawData[1],
Latitude = Convert.ToDecimal(rawData[2]),
Longitude = Convert.ToDecimal(rawData[3]),
Polarity = Convert.ToDecimal(rawData[4]),
orderBy = lgtTemp
}).ToList());
}
var tempLtg = ltgStrikes
.SelectMany(record => record)
.OrderBy(record => record.orderBy)
.GroupBy(record => record.orderBy).ToList();
return ltgStrikes;
With filenames of ltg_1.txt, ltg_2.txt ... ltg_12.txt
My problem comes down to three things.
1) Because I read the files from a folder to populate a list, I get them in the order the folder returns them, so I read the files in this order:
_1.txt, _10.txt, _11.txt, _12.txt, _2.txt and so on
I am unable to change the file names.
2) Some of the files will have nothing in them (blank files). But I still need to 'read' them and add a placeholder to my List<List<LightningStrikeModel>> ltgStrikes; I essentially need a list of 12 lists regardless of data.
3) Currently I can get a list of 12 lists regardless of data, but they are in the wrong order because they are added to ltgStrikes in the order they are read. So
_1.txt has an index of [0] and _10.txt has an index of [1], but in the end result _10.txt should have an index of [9]; _5.txt has an index of [8] but should be [4].
I have tried something like the following, but because some files are empty I do not get a list of 12 lists. My current data only gives me a list of 2 lists, since only 2 files have data in them.
var tempLtg = ltgStrikes
.SelectMany(record => record)
.OrderBy(record => record.orderBy)
.GroupBy(record => record.orderBy).ToList();
What am I not seeing? FYI - orderBy is not used to order the data here but ultimately it can be. I need it in another part of the application
You have a whole pile of problems here because you're doing stuff in the wrong order. If you're mixing loops with LINQ like this, odds are good the whole thing will be much better if you just make the whole thing into one big query with no loops. Let's do that:
return Directory.GetFiles(mapPath, lightningFileNames)
.InNaturalOrder() // You write this!
OK, now we have a sequence of files in the right order. What do we want next? The contents of the files, trimmed.
.Select(f => File.ReadLines(f)
.Select(l => l.Trim())
.Where(l => l != ""))
OK, now we have a sequence of sequences of strings. What do we want? A list of lists of LightningStrikeModels. So we transform each string into a model. That gives us a sequence of models.
.Select (stringData =>
(from item in stringData
select item.Split(new[] { ' ' },
StringSplitOptions.RemoveEmptyEntries)
into rawData
where rawData.Length >= 5
select new LightningStrikeModel (...))
We transform each sequence of models into a list of models.
.ToList())
We now have a sequence of lists of models. We want a list of lists of models:
.ToList();
And we're done. We have a list of lists of models, and we can return it.
But let's not stop there. When you're done writing the code ask yourself if you could have done better.
If we do that then we immediately see that the Trim and the filter of empty strings are completely unnecessary. Why? Because we're going to take that string, split it on spaces, eliminate the empty substrings, and discard any string that had fewer than five substrings between spaces. So why did we bother to trim the leading and trailing spaces, and eliminate empty entries? The split would do the former, and the check to see if there are five substrings does the latter. So we can make this simpler:
return Directory.GetFiles(mapPath, lightningFileNames)
.InNaturalOrder()
.Select(f => File.ReadLines(f))
.Select (stringData =>
...
Now do it again. Can we make this simpler? Yes. We have two selects in a row, so we can merge them.
return Directory.GetFiles(mapPath, lightningFileNames)
.InNaturalOrder()
.Select (f =>
(from item in File.ReadLines(f)
select item.Split(new[] { ' ' },
StringSplitOptions.RemoveEmptyEntries)
into rawData
where rawData.Length >= 5
select new LightningStrikeModel (...))
Can we make this better? OH YES WE CAN. We can make two things: a splitter helper and a factory.
static string[] SpaceSplit(this string s) => s.Split( ... );
static LightningStrikeModel BuildModel(this string[] parts) => new ...
And now our query is
return Directory.GetFiles(mapPath, lightningFileNames)
.InNaturalOrder()
.Select (f =>
File.ReadLines(f)
.Select(line => line.SpaceSplit())
.Where(rawData => rawData.Length >= 5)
.Select(rawData => rawData.BuildModel())
.ToList())
.ToList();
OMG look at how much shorter and cleaner that solution is compared to the mess we started with. Look at how clearly correct it is. And easy to understand and maintain. Always be asking yourself if you can make it better.
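The query relies on an InNaturalOrder step that the answer leaves for the reader to write. One possible sketch, assuming each file name contains a single run of digits as in ltg_1.txt ... ltg_12.txt and that the number fits in an int: it is written here as a plain local function for brevity, and to call it as .InNaturalOrder() you would move it into a static class and add this to the parameter.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical implementation: orders files by the first run of digits in the file name.
// Files without any digits sort last.
static IEnumerable<string> InNaturalOrder(IEnumerable<string> files) =>
    files.OrderBy(file =>
    {
        // Look only at the file name so digits in the directory part are ignored.
        Match m = Regex.Match(Path.GetFileName(file), @"\d+");
        return m.Success ? int.Parse(m.Value) : int.MaxValue;
    });

var files = new[]
{
    "data/ltg_1.txt", "data/ltg_10.txt", "data/ltg_11.txt",
    "data/ltg_12.txt", "data/ltg_2.txt"
};

List<string> ordered = InNaturalOrder(files).ToList();

foreach (var f in ordered)
    Console.WriteLine(f);
```

Since empty files still yield an (empty) inner list in the query, ordering the file names first is enough to keep all twelve lists in the right positions.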
Can we make this solution better? Yes we can! We can notice for example that we do no error checking on whether or not the strings convert cleanly to decimals. What if they don't? That error should probably be handled somehow, but right now it is not. Think about how you might solve that in a manner that does not make the call site harder to understand.
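One way to approach that exercise, offered only as a sketch: let the factory return null for lines that do not parse, and filter those out in the query. The helper name TryParseStrike, the tuple shape, and the use of the invariant culture are all assumptions. The real LightningStrikeModel is not shown in the question, so a tuple stands in for it here.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical factory: returns null instead of throwing when a field is not a valid decimal.
static (decimal Latitude, decimal Longitude, decimal Polarity)? TryParseStrike(string[] parts)
{
    if (parts.Length < 5)
        return null;

    if (decimal.TryParse(parts[2], NumberStyles.Float, CultureInfo.InvariantCulture, out decimal lat) &&
        decimal.TryParse(parts[3], NumberStyles.Float, CultureInfo.InvariantCulture, out decimal lon) &&
        decimal.TryParse(parts[4], NumberStyles.Float, CultureInfo.InvariantCulture, out decimal pol))
    {
        return (lat, lon, pol);
    }

    return null;
}

// Made-up sample lines: one valid, one with a bad latitude, one empty.
var lines = new[]
{
    "2020-01-01 12:00 45.5 -93.2 1",
    "2020-01-01 12:01 bad -93.2 1",
    ""
};

var strikes = lines
    .Select(line => line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
    .Select(parts => TryParseStrike(parts))
    .Where(strike => strike.HasValue)
    .Select(strike => strike.Value)
    .ToList();

Console.WriteLine(strikes.Count);
```

The call site stays a simple Select/Where chain. Whether bad lines should be silently skipped, logged, or reported is a design decision this sketch does not settle.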
If you want to read the files in order, you could order the files in the foreach:
foreach(string file in files.OrderBy(x=>int.Parse(Regex.Match(x,@"(\d+)\.txt").Groups[1].Value))){

LINQ for getting all entries of a IEnumerable<string> which start with the same characters

I have an IEnumerable<string> and I want to gather all entries which start with the same characters.
For example:
Hans
Hannes
Gustav
Klaus
Herbert
Hanne
Now I want to find all entries where the first 2 characters are the same which would return Hans, Hannes, Hanne.
You just need to use .GroupBy
list.GroupBy(x=>x.Substring(0, n)).OrderByDescending(x=>x.Count()).First()
Where n is the number of char you want to compare.
You can also add a Where to filter on any requirements you may have:
list.GroupBy(x=>x.Substring(0, n))
.Where(x=>x.Count() > 1)
.OrderByDescending(x=>x.Count())
.First()
Complete example:
var lst = new string[]
{
"Hans",
"Hannes",
"Gustav",
"Klaus",
"Herbert",
"Hanne"
};
var source = lst.GroupBy(x => x.Substring(0, 2)).OrderByDescending(x => x.Count()).First();
Console.WriteLine(source.Key);
Console.WriteLine(string.Join(",", source));
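Be aware that Substring(0, n) throws an ArgumentOutOfRangeException for any entry shorter than n characters. A minimal sketch of a guard, assuming short entries should simply be grouped under their full text (the helper name Prefix and the extra entry "X" are made up for illustration):

```csharp
using System;
using System.Linq;

// Hypothetical helper: the first n characters, or the whole string if it is shorter than n.
static string Prefix(string s, int n) => s.Length >= n ? s.Substring(0, n) : s;

var names = new[] { "Hans", "Hannes", "Gustav", "Klaus", "Herbert", "Hanne", "X" };

// Group by the guarded prefix and take the largest group.
var largest = names
    .GroupBy(name => Prefix(name, 2))
    .OrderByDescending(g => g.Count())
    .First();

Console.WriteLine(largest.Key);               // Ha
Console.WriteLine(string.Join(",", largest)); // Hans,Hannes,Hanne
```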

similar file names into array or list

In each scenario I will have hundreds of .tif files that need to be merged with irfanview. These files are named so that matching character strings before a hyphen indicate they need to be merged (a small example set of file names could be as follows: 0001-1.tif, 0001-2.tif, 0001-3.tif, 0002-1.tif, 0002-2.tif, 0003.tif, 0004-1.tif, 0004-2.tif). Is there a way to put files with the same "prefix" into their own array by a means of character comparison, or would trimming the file names be easier? I would like to get all the "0001's" into an array then all the "0002's" etc. Can someone make a suggestion on the easiest way to do this?
This looks like it will work perfectly, but VS is not recognizing Substring. Is it because I'm trying to insert an array of files into the "GroupBy"? Here's what I have so far:
int i = 0;
string filmtext = textBox1.Text;
string[] filmPath = Directory.GetFiles(filmtext);
string filmfile = Path.GetFileName(filmPath[i].ToString());
filmfile.GroupBy(s => s.Substring(0,4))
.Select(g => g.ToList())
.ToList();
You could group by the first four characters:
filenames.GroupBy(s => s.Substring(0,4))
That would return a collection of groups, each holding the filenames in that group, with a Key value being the 4 characters that you grouped on.
To create a list of lists from those groupings you could do:
List<List<string>> groups =
filenames.GroupBy(s => s.Substring(0,4))
.Select(g => g.ToList())
.ToList();
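As for Substring not being recognized in the snippet from the question: filmfile there is a single string, so GroupBy enumerates its individual chars, and char has no Substring method. The grouping has to be applied to the whole sequence of file names. A sketch under that reading follows. The helper name MergeKey is hypothetical, and it groups on the text before the first hyphen rather than on a fixed four characters.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: the part of the file name before the first hyphen,
// or the whole name (without extension) when there is no hyphen.
static string MergeKey(string path)
{
    string name = Path.GetFileNameWithoutExtension(path);
    int hyphen = name.IndexOf('-');
    return hyphen < 0 ? name : name.Substring(0, hyphen);
}

// In the question these would come from Directory.GetFiles(...).
string[] filmPaths =
{
    "0001-1.tif", "0001-2.tif", "0001-3.tif", "0002-1.tif",
    "0002-2.tif", "0003.tif", "0004-1.tif", "0004-2.tif"
};

// GroupBy runs over the array of names, not over a single string.
List<List<string>> groups = filmPaths
    .GroupBy(p => MergeKey(p))
    .Select(g => g.ToList())
    .ToList();

foreach (var group in groups)
    Console.WriteLine(string.Join(", ", group));
```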

Creating a list of docs that contains same name

I'm creating a tool that is supposed to concatenate docs that contain the same name.
example: C_BA_20000_1.pdf and C_BA_20000_2.pdf
These files should be grouped in one list.
That tool runs on a directory, let's say
//directory of pdf files
DirectoryInfo dirInfo = new DirectoryInfo(@"C:\Users\derp\Desktop");
FileInfo[] fileInfos = dirInfo.GetFiles("*.pdf");
foreach (FileInfo info in fileInfos)
I want to create an ArrayList that contains filenames of the same name
ArrayList list = new ArrayList();
list.Add(info.FullName);
and then have a list that contains all the ArrayLists of similar docs.
List<ArrayList> bigList = new List<ArrayList>();
So my question: how can I group files that contain the same name and put them in the same list?
EDIT:
Files have the same pattern in their names: AB_CDEFG_i
where i is a number from 1 to n. Files with the same name should differ only by the number at the end.
AB_CDEFG_1
AB_CDEFG_2
HI_JKLM_1
Output should be:
List 1: AB_CDEFG_1 and AB_CDEFG_2
List 2: HI_JKLM_1
Create a method which extracts the 'same' part of the file name. E.g.
public string GetRawName(string fileName)
{
int index = fileName.LastIndexOf("_");
return fileName.Substring(0, index);
}
And use this method for grouping:
var bigList = Directory.EnumerateFiles(@"C:\Users\derp\Desktop", "*.pdf")
.GroupBy(file => GetRawName(file))
.Select(g => g.ToList())
.ToList();
This will return List<List<string>> (without ArrayList).
UPDATE Here is a regular expression which will work with all kinds of files, whether they have a number at the end or not:
public string GetRawName(string file)
{
string name = Path.GetFileNameWithoutExtension(file);
return Regex.Replace(name, @"(_\d+)?$", "");
}
Grouping:
var bigList = Directory.EnumerateFiles(@"C:\Users\derp\Desktop", "*.pdf")
.GroupBy(GetRawName)
.Select(g => g.ToList())
.ToList();
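To see what key the regular expression produces, here is a small check on a few sample names. It is written as a static local function for brevity, and the sample names are taken from or modelled on the question. One caveat worth knowing: a name whose last underscore-separated part is itself numeric, with no index after it, also loses that part.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Same logic as GetRawName above: drop the extension, then strip one trailing "_<digits>".
static string GetRawName(string file)
{
    string name = Path.GetFileNameWithoutExtension(file);
    return Regex.Replace(name, @"(_\d+)?$", "");
}

string withIndex = GetRawName("C_BA_20000_1.pdf");  // trailing _1 is removed
string noIndex = GetRawName("HI_JKLM.pdf");         // nothing to remove
string ambiguous = GetRawName("C_BA_20000.pdf");    // _20000 is treated as the index

Console.WriteLine(withIndex);
Console.WriteLine(noIndex);
Console.WriteLine(ambiguous);
```

If un-indexed files like the last one can occur in your directory, the key extraction needs a stricter rule than "trailing digits".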
It sounds like the difficulty is in deciding which files are the same.
static string KeyFromFileName(string file)
{
// Convert from "C_BA_20000_2" to "C_BA_20000"
return file.Substring(0, file.LastIndexOf("_"));
// Note: This assumes there is an _ in the filename.
}
Then you can use this LINQ to build a list of fileSets.
using System.Linq; // Near top of file
var files = Directory.GetFiles(@"C:\Users\derp\Desktop", "*.pdf");
var fileSets = files // Directory.GetFiles already returns the full paths as strings
.GroupBy(KeyFromFileName)
.Select(g => new { g.Key, Files = g.ToList() })
.ToList();
Your question doesn't identify what "same name" means, but this is a typical solution:
fileInfos.GroupBy ( f => f.FullName )
.Select( grp => grp.ToList() ).ToList();
This will get you a list of lists, and it also won't throw an exception if a file name doesn't contain an underscore:
private string GetKey(FileInfo fi)
{
var index = fi.Name.LastIndexOf('_');
return index == -1 ? Path.GetFileNameWithoutExtension(fi.Name)
: fi.Name.Substring(0, index);
}
var bigList = fileInfos.GroupBy(GetKey)
.Select(x => x.ToList())
.ToList();
