XML string/array Comparison - c#

I'm trying to match the entries in "filename" and "filesize" with the ones in "xml". They contain file sizes and names. I need to match them up in an if statement. I'm stuck though, and have no idea how to proceed.
public static void APB()
{
ArrayList filename = new ArrayList();
ArrayList filesize = new ArrayList();
var directory = new DirectoryInfo(Directory.GetCurrentDirectory());
var files= directory.GetFiles("*", SearchOption.AllDirectories);
long fnd = 0;
foreach (var file in files)
{
filename.Add(file.FullName);
filesize.Add(fnd += file.Length);
}
ArrayList xml = new ArrayList();
XmlTextReader reader = new XmlTextReader(dictonary.launcher);
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
xml.Add(reader.Name);
while (reader.MoveToNextAttribute())
xml.Add(reader.Name + "=" + reader.Value);
break;
}
}
}

Create anonymous types for each with the name and size and compare them.
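A minimal sketch of that idea follows. The question doesn't show the XML layout, so the `<file name="..." size="..." />` shape, the class name and the method name are all made up for illustration. It relies on the fact that two anonymous types with the same property names and types, in the same order, are the same type and compare by value, so Intersect can match them directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FileManifestComparer
{
    // Returns the names of the files whose name AND size also appear in the XML.
    // Assumed (hypothetical) layout: <files><file name="a.txt" size="123" /></files>
    public static List<string> FindMatches(
        IEnumerable<KeyValuePair<string, long>> onDisk, string xmlText)
    {
        // Project the XML entries to an anonymous type { Name, Size }.
        // The casts throw if an attribute is missing.
        var fromXml = XDocument.Parse(xmlText)
            .Descendants("file")
            .Select(e => new
            {
                Name = (string)e.Attribute("name"),
                Size = (long)e.Attribute("size")
            });

        // Project the local files to the same anonymous type.
        var local = onDisk.Select(f => new { Name = f.Key, Size = f.Value });

        // Anonymous types implement value equality, so Intersect keeps
        // only the entries where both name and size are equal.
        return local.Intersect(fromXml).Select(m => m.Name).ToList();
    }
}
```

The local side could be filled from the question's `directory.GetFiles(...)` call by projecting each FileInfo to a pair of `file.Name` and `file.Length` (the size of that one file, not a running total).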

Related

Collection was modified; comma separated string into list

I pass in a comma-separated string of file names so that I can fetch the files from a directory. Below is the code. The error is thrown after I split the string and convert it into a list. Can you tell me which part of my code causes the error?
sample value:
StudentList ="Image01.jpg,Image02.jpg"
public FileResult DownloadZipFile(string StudentList)
{
var fileName = string.Format("{0}_ImageFiles.zip", DateTime.Today.Date.ToString("dd-MM-yyyy") + "_1");
var tempOutPutPath = Server.MapPath(Url.Content("~/Assets/Student_ID")) + fileName;
using (ZipOutputStream s = new ZipOutputStream(System.IO.File.Create(tempOutPutPath)))
{
s.SetLevel(9);
byte[] buffer = new byte[4096];
List<string> stringList = StudentList.Split(',').ToList();
foreach (string str in stringList)
{
stringList.Add(Server.MapPath("~/Assets/Student_ID/" + str));
}
for (int i = 0; i < stringList.Count; i++)
{
ZipEntry entry = new ZipEntry(Path.GetFileName(stringList[i]));
entry.DateTime = DateTime.Now;
entry.IsUnicodeText = true;
s.PutNextEntry(entry);
using (FileStream fs = System.IO.File.OpenRead(stringList[i]))
{
int sourceBytes;
do
{
sourceBytes = fs.Read(buffer, 0, buffer.Length);
s.Write(buffer, 0, sourceBytes);
} while (sourceBytes > 0);
}
}
s.Finish();
s.Flush();
s.Close();
}
byte[] finalResult = System.IO.File.ReadAllBytes(tempOutPutPath);
if (System.IO.File.Exists(tempOutPutPath))
System.IO.File.Delete(tempOutPutPath);
if (finalResult == null || !finalResult.Any())
throw new Exception(String.Format("No Files found with Image"));
return File(finalResult, "application/zip", fileName);
}
The problem is in your foreach loop. You iterate through the list, but while doing so, you modify the collection. That's what causes the error. One way to solve this is to create a temporary list:
List<string> stringList = StudentList.Split(',').ToList();
List<string> tempList = new List<string>();
foreach (string str in stringList)
{
tempList.Add(Server.MapPath("~/Assets/Student_ID/" + str));
}
stringList = tempList;
An alternative solution without a second list would be to use a classic for loop, which can safely assign to existing elements by index:
List<string> stringList = StudentList.Split(',').ToList();
for (int i = 0; i < stringList.Count; i++)
{
stringList[i] = Server.MapPath("~/Assets/Student_ID/" + stringList[i]);
}
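A third option, sketched here, is to build the list of full paths in a single LINQ projection, so no list is ever changed while it is being enumerated. The `mapPath` parameter is a hypothetical stand-in for Server.MapPath so that the sketch can run outside ASP.NET. The class and method names are made up too.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PathListBuilder
{
    // Splits "Image01.jpg,Image02.jpg" and maps every name to a full path.
    // mapPath is an assumption: inside a controller you would pass Server.MapPath.
    public static List<string> BuildPaths(string commaSeparated, Func<string, string> mapPath)
    {
        return commaSeparated
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(name => mapPath("~/Assets/Student_ID/" + name.Trim()))
            .ToList(); // a brand new list; the split result is never modified
    }
}
```

In the controller this would be called as `PathListBuilder.BuildPaths(StudentList, Server.MapPath)`.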

Convert CSV in XML file with C#

I wrote this piece of code that allows me to read a CSV file and convert it to an XML file.
I have a problem: if the fields in the CSV file are separated by semicolons (;), the program cannot read the data, whereas if they are separated by commas (,), the program reads the data and inserts it correctly into the XML file.
Could you find a way to replace the semicolon (;) with the comma (,)?
Thank you very much!! :)
This is the code:
writer.WriteStartDocument();
writer.WriteStartElement("DC8_Recipes");
using (CsvReader reader = new CsvReader(path))
{
reader.ReadHeaders();
while (reader.ReadRecord())
{
writer.WriteStartElement("DC8_Recipes");
writer.WriteAttributeString("PlantNo", reader["id_imp"]);
writer.WriteAttributeString("No", reader["nome_frm"]);
writer.WriteAttributeString("Name", reader["desc_frm"]);
writer.WriteEndElement();
}
reader.Close();
}
writer.WriteEndElement();
writer.WriteEndDocument();
writer.Close();
logText.Text += DateTime.Now + " Convertion Completed\n";
logText.Text += DateTime.Now + " Saving file to: " + savepath + "\n";
try
{
logText.Text += DateTime.Now + " File save completed!\n";
logText.Text += DateTime.Now + " process END\n";
}
catch
{
}
}
The CsvReader from the CsvHelper library accepts a CsvConfiguration in its constructor, which lets you change the delimiter (by default it is taken from the given CultureInfo):
The culture is used to determine the default delimiter, default line ending, and formatting when type converting.
Since your file is separated by semicolons, set the delimiter to ";". Here reader is any TextReader over the file, for example a StreamReader:
using (var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = ";"
}))
{
csv.Read();
}
You could write your own CsvReader:
public static List<Model> ReadCsv(string path)
{
var modelList = new List<Model>();
using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
{
using (var streamReader = new StreamReader(fileStream, Encoding.Default))
{
while (!streamReader.EndOfStream)
{
var line = streamReader.ReadLine();
if (string.IsNullOrEmpty(line))
{
continue;
}
var splittedLine = line.Split(';');
var model = new Model();
for (var i = 0; i < splittedLine.Length; i++)
{
switch (i)
{
case 0:
model.FirstColumn = splittedLine[i];
break;
case 1:
model.SecondColumn = splittedLine[i];
break;
case 2:
model.ThirdColumn = Convert.ToInt32(splittedLine[i]);
break;
}
}
modelList.Add(model);
}
}
}
return modelList;
}
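To tie the hand-written approach back to the question, here is a rough sketch that reads semicolon-separated text and writes the same XML shape as the question's code, using only the base class library. The class and method names are made up. The column names (id_imp, nome_frm, desc_frm) and the element and attribute names are taken from the question. It uses a naive Split, so it assumes no field contains the delimiter or is wrapped in quotes.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class SemicolonCsvToXml
{
    // Converts delimited text with a header row into the XML layout from the question.
    public static string ToXml(TextReader csv, char delimiter = ';')
    {
        var xml = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var writer = XmlWriter.Create(xml, settings))
        {
            writer.WriteStartElement("DC8_Recipes");

            // Locate the wanted columns by their header names.
            string[] headers = (csv.ReadLine() ?? "").Split(delimiter);
            int plant = Array.IndexOf(headers, "id_imp");
            int number = Array.IndexOf(headers, "nome_frm");
            int name = Array.IndexOf(headers, "desc_frm");
            if (plant < 0 || number < 0 || name < 0)
                throw new FormatException("Expected columns id_imp, nome_frm and desc_frm.");

            string line;
            while ((line = csv.ReadLine()) != null)
            {
                if (line.Length == 0) continue; // skip blank lines
                string[] values = line.Split(delimiter);
                writer.WriteStartElement("DC8_Recipes");
                writer.WriteAttributeString("PlantNo", values[plant]);
                writer.WriteAttributeString("No", values[number]);
                writer.WriteAttributeString("Name", values[name]);
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
        } // disposing the writer flushes everything into the StringBuilder
        return xml.ToString();
    }
}
```

Passing `','` as the second argument handles the comma-separated variant of the file, so nothing in the source file has to be replaced.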

Fastest way to fuzzy match two csv files

I have written a very simple program in C# using a NuGet package to read in 2 CSV files, fuzzy match them, and output a new CSV file with all the matches. The problem is that I need the program to be able to read and compare files up to 700k and compare them to 100k. I haven't been able to find a way to speed up the process. Is there any way I can do this? I will even use another language if need be.
You can ignore all the commented code; it's just there from when I was using it for testing purposes. Sorry, I'm a newer programmer.
The ReadCSV function is for reading in the CSV. The rest is code inside another function, where I pass in the string arrays to run them through the fuzzy match.
static string[] ReadCSV(string path)
{
List<string> name = new List<string>();
List<string> address = new List<string>();
List<string> city = new List<string>();
List<string> state = new List<string>();
List<string> zip = new List<string>();
using (var reader = new StreamReader(path))
{
reader.ReadLine();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
name.Add(values[0] +", "+ values[1]);
//address.Add(values[1]);
//city.Add(values[2]);
//state.Add(values[3]);
//zip.Add(values[4]);
}
}
string[] name1 = name.ToArray();
return name1;
//foreach (var item in name)
//{
// Console.WriteLine(item.ToString());
//}
}
StringBuilder csvcontent = new StringBuilder();
string csvpath = @"C:\Users\bigel\Documents\outputtest.csv";
csvcontent.AppendLine("Name,Address,Match");
//Console.WriteLine("Levenshtein Edit Distance:");
int x = 1;
foreach (var name in string1)
{
for (int i = 0; i < length; i++)
{
int leven = match[i].LevenshteinDistance(name);
//Console.WriteLine(match[i] + "\t{0} against {1}", leven, name);
if (leven <= 7)
{
output[i] = input[i] + ",match";
csvcontent.AppendLine(output[i]);
//Console.WriteLine(match[i] + " " + leven + " against " + name + " is a Match");
//Console.WriteLine(output[i]);
}
else
{
if (i == 500)
{
Console.WriteLine(x);
x++;
}
}
}
}
File.AppendAllText(csvpath, csvcontent.ToString());

Csharp substring text and add it to list

I have file.txt like:
EDIT: I didn't write this before, but I guess it is important: file.txt can contain other lines too!
folder=c:\user;c:\test;c:\something;
I need to add each path as a separate list item (List<string> Folders).
So my List should look like:
Folders[0] = c:\user
Folders[1] = c:\test
etc. (without the text "folder=" that starts the line in file.txt and without the ";" that marks the end of each path).
The file can contain many more paths.
I did something like this:
using (FileStream fss = new FileStream(path, FileMode.Open))
{
using (StreamReader sr = new StreamReader(fss))
{
while (sr.EndOfStream == false)
{
string line = sr.ReadLine();
if(line.StartsWith("folders"))
{
int index = line.IndexOf("=");
int index1 = line.IndexOf(";");
string folder = line.Substring(index + 1, index1 - (index + 1));
Folders.Add(folder);
Now the Folders list contains the first path, but what now? I can't get any further :(
using(var sr = new StreamReader(path))
{
var folders = sr.ReadToEnd()
.Split(new char[]{';','\n','\r'}, StringSplitOptions.RemoveEmptyEntries)
.Select(o => o.Replace("folder=",""))
.ToArray();
Folders.AddRange(folders);
}
You can try the following code, which uses File.ReadAllText:
string Filepath = @"c:\abc.txt";
string filecontent = File.ReadAllText(Filepath);
string startingString = "=";
var startIndex = filecontent.IndexOf(startingString);
filecontent = filecontent.Substring(startIndex + 1, filecontent.Length - startIndex - 2);
List<String> folders = filecontent.Split(';').ToList();
Here's a simple example:
List<String> Folders = new List<string>();
private void button1_Click(object sender, EventArgs e)
{
string path = @"C:\Users\mikes\Documents\SomeFile.txt";
string folderTag = "folder=";
using (FileStream fss = new FileStream(path, FileMode.Open))
{
using (StreamReader sr = new StreamReader(fss))
{
while (!sr.EndOfStream)
{
string line = sr.ReadLine();
if (line.StartsWith(folderTag))
{
line = line.Substring(folderTag.Length); // remove the folderTag from the beginning
Folders.AddRange(line.Split(";".ToCharArray(), StringSplitOptions.RemoveEmptyEntries));
}
}
}
}
foreach(string folder in Folders)
{
Console.WriteLine(folder);
}
}
I'd use this approach if you're going to read line by line, and do something else based on what each line starts with. In that case you could add different else if(...) blocks:
if (line.StartsWith(folderTag))
{
line = line.Substring(folderTag.Length); // remove the folderTag from the beginning
Folders.AddRange(line.Split(";".ToCharArray(), StringSplitOptions.RemoveEmptyEntries));
}
else if(line.StartsWith("parameters="))
{
// do something different with a line starting with "parameters="
}
else if (line.StartsWith("unicorns="))
{
// do something else different with a line starting with "unicorns="
}

How to find the Filename with the latest version in C#

I have a folder that is filled with dwg files so I just need to find the latest version of a File or if a File has no versions then copy it to a directory. For example here are three files:
ABBIE 08-10 #6-09H4 FINAL 06-12-2012.dwg
ABBIE 08-10 #6-09H4 FINAL 06-12-2012_1.dwg
ABBIE 08-10 #6-09H4 FINAL 06-12-2012_2.dwg
Notice the difference is one file has a _1 and another has a _2 so the latest file here is the _2. I need to keep the latest file and copy it to a directory. Some files will not have different versions so those can be copied. I cannot focus on the creation date of the file or the modified date because in many instances they are the same so all I have to go on is the file name itself. I'm sure there is a more efficient way to do this than what I will post below.
DirectoryInfo myDir = new DirectoryInfo(@"H:\Temp\Test");
var Files = myDir.GetFiles("*.dwg");
string[] fileList = Directory.GetFiles(@"H:\Temp\Test", "*FINAL*", SearchOption.AllDirectories);
ArrayList list = new ArrayList();
ArrayList WithUnderscores = new ArrayList();
string nameNOunderscores = "";
for (int i = 0; i < fileList.Length; i++)
{
//Try to get just the filename..
string filename = fileList[i].Split('.')[0];
int position = filename.LastIndexOf('\\');
filename = filename.Substring(position + 1);
filename = filename.Split('_')[0];
foreach (FileInfo allfiles in Files)
{
var withoutunderscore = allfiles.Name.Split('_')[0];
withoutunderscore = withoutunderscore.Split('.')[0];
if (withoutunderscore.Equals(filename))
{
nameNOunderscores = filename;
list.Add(allfiles.Name);
}
}
//If there is a number after the _ then capture it in an ArrayList
if (list.Count > 0)
{
foreach (string nam in list)
{
if (nam.Contains("_"))
{
//need regex to grab numeric value after _
var match = new Regex("_(?<number>[0-9]+)").Match(nam);
if (match.Success)
{
var value = match.Groups["number"].Value;
var number = Int32.Parse(value);
WithUnderscores.Add(number);
}
}
}
int removedcount = 0;
//Whats the max value?
if (WithUnderscores.Count > 0)
{
var maxval = GetMaxValue(WithUnderscores);
Int32 intmax = Convert.ToInt32(maxval);
foreach (FileInfo deletefile in Files)
{
string shorten = deletefile.Name.Split('.')[0];
shorten = shorten.Split('_')[0];
if (shorten == nameNOunderscores && deletefile.Name != nameNOunderscores + "_" + intmax + ".dwg")
{
//Keep track of count of Files that are no good to us so we can iterate to next set of files
removedcount = removedcount + 1;
}
else
{
//Copy the "Good" file to a seperate directory
File.Copy(@"H:\Temp\Test\" + deletefile.Name, @"H:\Temp\AllFinals\" + deletefile.Name, true);
}
}
WithUnderscores.Clear();
list.Clear();
}
i = i + removedcount;
}
else
{
//This File had no versions so it is good to be copied to the "Good" directory
File.Copy(@"H:\Temp\SH_Plats\" + filename, @"H:\Temp\AllFinals" + filename, true);
i = i + 1;
}
}
I've made a Regex based solution, and apparently come late to the party in the meantime.
(?<fileName>[A-Za-z0-9-# ]*)_?(?<version>[0-9]+)?\.dwg
This regex recognises the fileName and version and splits them into groups. A pretty simple foreach loop then collects the most recent version of each file in a dictionary (cos I'm lazy), and you just need to put the file names back together again before you access them.
var fileName = file.Key + "_" + file.Value + ".dwg"
full code
var files = new[] {
"ABBIE 08-10 #6-09H4 FINAL 06-12-2012.dwg",
"ABBIE 08-10 #6-09H4 FINAL 06-12-2012_1.dwg",
"ABBIE 08-10 #6-09H4 FINAL 06-12-2012_2.dwg",
"Second File.dwg",
"Second File_1.dwg",
"Third File.dwg"
};
// regex to split fileName from version
var r = new Regex( @"(?<fileName>[A-Za-z0-9-# ]*)_?(?<version>[0-9]+)?\.dwg" );
var latestFiles = new Dictionary<string, int>();
foreach (var f in files)
{
var parsedFileName = r.Match( f );
var fileName = parsedFileName.Groups["fileName"].Value;
var version = parsedFileName.Groups["version"].Success ? int.Parse( parsedFileName.Groups["version"].Value ) : 0;
if( !latestFiles.ContainsKey( fileName ) )
{
// add all newly found filenames
latestFiles.Add( fileName, version );
}
else if( version > latestFiles[fileName] )
{
// replace if this file has a newer version
latestFiles[fileName] = version;
}
}
// open all most recent files
foreach (var file in latestFiles)
{
// version 0 means the file name had no "_n" suffix
var name = file.Value == 0 ? file.Key + ".dwg" : file.Key + "_" + file.Value + ".dwg";
var fileToCopy = File.Open( name, FileMode.Open );
// ...
}
You can use this LINQ query with Enumerable.GroupBy, which should work (now tested):
var allFiles = Directory.EnumerateFiles(sourceDir, "*.dwg")
.Select(path => new
{
Path = path,
FileName = Path.GetFileName(path),
FileNameWithoutExtension = Path.GetFileNameWithoutExtension(path),
VersionStartIndex = Path.GetFileNameWithoutExtension(path).LastIndexOf('_')
})
.Select(x => new
{
x.Path,
x.FileName,
IsVersionFile = x.VersionStartIndex != -1,
Version = x.VersionStartIndex == -1 ? new Nullable<int>()
: x.FileNameWithoutExtension.Substring(x.VersionStartIndex + 1).TryGetInt(),
NameWithoutVersion = x.VersionStartIndex == -1 ? x.FileNameWithoutExtension
: x.FileName.Substring(0, x.VersionStartIndex)
})
.OrderByDescending(x => x.Version)
.GroupBy(x => x.NameWithoutVersion)
.Select(g => g.First());
foreach (var file in allFiles)
{
string oldPath = Path.Combine(sourceDir, file.FileName);
string newPath;
if (file.IsVersionFile && file.Version.HasValue)
newPath = Path.Combine(versionPath, file.FileName);
else
newPath = Path.Combine(noVersionPath, file.FileName);
File.Copy(oldPath, newPath, true);
}
Here's the extension method which I'm using to determine if a string is parsable to int:
public static int? TryGetInt(this string item)
{
int i;
bool success = int.TryParse(item, out i);
return success ? (int?)i : (int?)null;
}
Note that I'm not using regex but string methods only.
Try this
var files = new My.Computer().FileSystem.GetFiles(@"c:\to\the\sample\directory", Microsoft.VisualBasic.FileIO.SearchOption.SearchAllSubDirectories, "*.dwg");
foreach (String f in files) {
Console.WriteLine(f);
};
NB: Add a reference to Microsoft.VisualBasic and use the following line at the beginning of the class:
using My = Microsoft.VisualBasic.Devices;
UPDATE
The working sample[tested]:
String dPath=@"C:\to\the\sample\directory";
var xfiles = new My.Computer().FileSystem.GetFiles(dPath, Microsoft.VisualBasic.FileIO.SearchOption.SearchAllSubDirectories, "*.dwg").Where(c => Regex.IsMatch(c,@"\d{3,}\.dwg$"));
XElement filez = new XElement("filez");
foreach (String f in xfiles)
{
var yfiles = new My.Computer().FileSystem.GetFiles(dPath, Microsoft.VisualBasic.FileIO.SearchOption.SearchAllSubDirectories, string.Format("{0}*.dwg",System.IO.Path.GetFileNameWithoutExtension(f))).Where(c => Regex.IsMatch(c, @"_\d+\.dwg$"));
if (yfiles.Count() > 0)
{
filez.Add(new XElement("file", yfiles.Last()));
}
else {
filez.Add(new XElement("file", f));
};
};
Console.Write(filez);
Can you do this with a string sort? The only tricky part I see here is converting the file name to a sortable format. Just do a string replace from dd-mm-yyyy to yyyymmdd. Then sort the list and take the last record.
This is what you want, assuming fileList contains all the file names:
List<string> latestFiles=new List<string>();
foreach(var groups in fileList.GroupBy(x=>Regex.Replace(x,@"(_\d+\.dwg$|\.dwg$)","")))
{
latestFiles.Add(groups.OrderBy(s=>Regex.Match(s,@"\d+(?=\.dwg$)").Value==""?0:int.Parse(Regex.Match(s,@"\d+(?=\.dwg$)").Value)).Last());
}
latestFiles now holds the latest version of every file.
If fileList is large, use threading or PLINQ.
