Check if a location is indexed in Windows Search - C#

How do I check whether a location is indexed or not? I found the following code to add a location to the Windows index, which works fine, but I want to check whether it is already indexed before I add it.
Uri path = new Uri(location);
string indexingPath = path.AbsoluteUri;
CSearchManager csm = new CSearchManager();
CSearchCrawlScopeManager manager = csm.GetCatalog("SystemIndex").GetCrawlScopeManager();
manager.AddUserScopeRule(indexingPath, 1, 1, 0);
manager.SaveAll();
I have found a way to check whether the location has been included for indexing by using IncludedInCrawlScope.
CSearchManager csm = new CSearchManager();
CSearchCrawlScopeManager manager = csm.GetCatalog("SystemIndex").GetCrawlScopeManager();
if (manager.IncludedInCrawlScope(indexingPath) == 0)
{
manager.AddUserScopeRule(indexingPath, 1, 1, 0);
manager.SaveAll();
}
But this only checks whether the location has been added for indexing, not whether indexing is complete. Since I will be querying the SystemIndex, I need to make sure that the location has actually been indexed.

I ran into a similar need and this is what I came up with. In my case I have certain file extensions that are going to end up being sent to a document management system.
I have two methods; one uses System.IO to get a list of the files in the directory whose extensions appear in the list.
public IEnumerable<string> DirectoryScan(string directory)
{
List<string> extensions = new List<string>
{
"docx","xlsx","pptx","docm","xlsm","pptm","dotx","xltx","xlw","potx","ppsx","ppsm","doc","xls","ppt","doct","xlt","xlm","pot","pps"
};
IEnumerable<string> myFiles =
Directory.GetFiles(directory, "*", SearchOption.AllDirectories)
.Where(s => extensions.Any(s.EndsWith))
.ToList();
return myFiles;
}
The second method uses the Windows index search via Microsoft.Search.Interop.
public IEnumerable<string> QueryWindowsDesktopSearch(string directory)
{
List<string> extensions = new List<string>
{ "docx","xlsx","pptx","docm","xlsm","pptm","dotx","xltx","xlw","potx","ppsx","ppsm","doc","xls","ppt","doct","xlt","xlm","pot","pps"};
string userQuery = "*";
Boolean fShowQuery = true;
List<string> list = new List<string>();
CSearchManager manager = new CSearchManager();
CSearchCatalogManager catalogManager = manager.GetCatalog("SystemIndex");
CSearchQueryHelper queryHelper = catalogManager.GetQueryHelper();
queryHelper.QueryWhereRestrictions = string.Format("AND (\"SCOPE\" = 'file:{0}')", directory);
if (extensions != null)
{
queryHelper.QueryWhereRestrictions += " AND Contains(System.ItemType,'";
bool fFirst = true;
foreach (string ext in extensions)
{
if (!fFirst)
{
queryHelper.QueryWhereRestrictions += " OR ";
}
queryHelper.QueryWhereRestrictions += "\"" + ext + "\"";
fFirst = false;
}
queryHelper.QueryWhereRestrictions += "') ";
}
string sqlQuery = queryHelper.GenerateSQLFromUserQuery(userQuery);
using (OleDbConnection connection = new OleDbConnection(queryHelper.ConnectionString))
{
using (OleDbCommand command = new OleDbCommand(sqlQuery, connection))
{
connection.Open();
OleDbDataReader dataReader = command.ExecuteReader();
while (dataReader.Read())
{
var file = dataReader.GetString(0);
if (file != null)
{
list.Add(file.Replace("file:", ""));
}
}
}
}
return list;
}
I call both of these methods from another method that takes the two results, compares them, and returns a Boolean value indicating whether the two lists match. If they do not match, the folder has not been fully indexed.
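For completeness, a minimal sketch of that comparison method might look like this (the method name IsFolderFullyIndexed and the path normalization are my own assumptions, not part of the original code):
public bool IsFolderFullyIndexed(string directory)
{
    // DirectoryScan and QueryWindowsDesktopSearch are the two methods shown above.
    // Paths are normalized because the index returns URI-style paths (forward slashes)
    // while System.IO returns backslash paths.
    var onDisk = DirectoryScan(directory)
        .Select(p => p.Replace('/', '\\').ToLowerInvariant())
        .OrderBy(p => p)
        .ToList();
    var inIndex = QueryWindowsDesktopSearch(directory)
        .Select(p => p.Replace('/', '\\').ToLowerInvariant())
        .OrderBy(p => p)
        .ToList();
    // The folder counts as fully indexed only when both sides agree exactly.
    return onDisk.SequenceEqual(inIndex);
}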
If you call QueryWindowsDesktopSearch on a folder that has not been indexed, it returns zero files. You could use that as an indication that the folder isn't in the index, but it's also possible that the location has been added for indexing while the indexer itself is stopped.
You could check the indexer status by calling something like this:
CSearchManager manager = new CSearchManager();
CSearchCatalogManager catalogManager = manager.GetCatalog("SystemIndex");
_CatalogPausedReason pReason;
_CatalogStatus pStatus;
catalogManager.GetCatalogStatus(out pStatus, out pReason);
That may return something like pStatus = CATALOG_STATUS_PAUSED and pReason = CATALOG_PAUSED_REASON_USER_ACTIVE
In that case you would know that the indexer is not running. Another thing you could do is call the following:
int incrementalCount, notificationQueue, highPriorityQueue;
catalogManager.NumberOfItemsToIndex(out incrementalCount, out notificationQueue, out highPriorityQueue);
This returns, in the incrementalCount value (plIncrementalCount in the native API), the number of files that the entire SystemIndex has queued for indexing.
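Putting the status check and the queue counter together, a rough sketch of an "is the index settled?" test could look like this (the idle-and-empty-queue condition is my assumption; note that incrementalCount covers the whole SystemIndex, not just one folder):
CSearchManager manager = new CSearchManager();
CSearchCatalogManager catalogManager = manager.GetCatalog("SystemIndex");

_CatalogStatus status;
_CatalogPausedReason pausedReason;
catalogManager.GetCatalogStatus(out status, out pausedReason);

int incrementalCount, notificationQueue, highPriorityQueue;
catalogManager.NumberOfItemsToIndex(out incrementalCount, out notificationQueue, out highPriorityQueue);

// Treat the index as settled only when the catalog is idle and nothing is queued.
bool indexIsSettled = status == _CatalogStatus.CATALOG_STATUS_IDLE && incrementalCount == 0;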

Check this implementation from a document management system:
https://code.google.com/p/olakedms/source/browse/SearchEngine/CSearchDAL.cs?r=171


Lucene 4.8 facets usage

I have difficulty understanding this example of how to use facets:
https://lucenenet.apache.org/docs/4.8.0-beta00008/api/Lucene.Net.Demo/Lucene.Net.Demo.Facet.SimpleFacetsExample.html
My goal is to create an index in which each document field has a facet, so that at search time I can choose which facets to use to navigate the data.
What I am confused about is the setup of facets at index-creation time. To summarize my question: is an index with facets compatible with ReferenceManager?
Does the DirectoryTaxonomyWriter need to be actually written and persisted on disk, or is it embedded into the index itself and just temporary? Given the code indexWriter.AddDocument(config.Build(taxoWriter, doc)); from the example, I expected it to be temporary and embedded into the index (but then the example also shows that you need the taxonomy to drill down into facets). So can the taxonomy be tied to the index in some way so that the two are handled together by ReferenceManager?
If not, may I just use the same folder I use for storing the index?
Here is a more detailed list of points that confuse me:
In my scenario I am indexing the documents asynchronously (in a background process) and then fetching the index as soon as possible through ReferenceManager in an ASP.NET application. I hope this way of fetching the index is compatible with the DirectoryTaxonomyWriter needed by facets.
I then modified my code to introduce the taxonomy writer as indicated in the example, but I am a bit confused: it seems I can't store the DirectoryTaxonomyWriter in the same folder as the index because the folder is locked. Do I need to persist it, or will it be embedded into the index (so a RAMDirectory is enough)? If I need to persist it in a different directory, can I safely persist it into a subdirectory?
Here is the code I am currently using:
private static void BuildIndex (IndexEntry entry)
{
string targetFolder = ConfigurationManager.AppSettings["IndexFolder"] ?? string.Empty;
//** LOG
if (System.IO.Directory.Exists(targetFolder) == false)
{
string message = #"Index folder not found";
_fileLogger.Error(message);
_consoleLogger.Error(message);
return;
}
var metadata = JsonConvert.DeserializeObject<IndexMetadata>(File.ReadAllText(entry.MetdataPath) ?? "{}");
string[] header = new string[0];
List<dynamic> csvRecords = new List<dynamic>();
using (var reader = new StreamReader(entry.DataPath))
{
CsvConfiguration csvConfiguration = new CsvConfiguration(CultureInfo.InvariantCulture);
csvConfiguration.AllowComments = false;
csvConfiguration.CountBytes = false;
csvConfiguration.Delimiter = ",";
csvConfiguration.DetectColumnCountChanges = false;
csvConfiguration.Encoding = Encoding.UTF8;
csvConfiguration.HasHeaderRecord = true;
csvConfiguration.IgnoreBlankLines = true;
csvConfiguration.HeaderValidated = null;
csvConfiguration.MissingFieldFound = null;
csvConfiguration.TrimOptions = CsvHelper.Configuration.TrimOptions.None;
csvConfiguration.BadDataFound = null;
using (var csvReader = new CsvReader(reader, csvConfiguration))
{
csvReader.Read();
csvReader.ReadHeader();
csvReader.Read();
header = csvReader.HeaderRecord;
csvRecords = csvReader.GetRecords<dynamic>().ToList();
}
}
string targetDirectory = Path.Combine(targetFolder, "Index__" + metadata.Boundle + "__" + DateTime.Now.ToString("yyyyMMdd_HHmmss") + "__" + Path.GetRandomFileName().Substring(0, 6));
System.IO.Directory.CreateDirectory(targetDirectory);
//** LOG
{
string message = #"..creating index : {0}";
_fileLogger.Information(message, targetDirectory);
_consoleLogger.Information(message, targetDirectory);
}
using (var dir = FSDirectory.Open(targetDirectory))
{
using (DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(dir))
{
Analyzer analyzer = metadata.GetAnalyzer();
var indexConfig = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
using (IndexWriter writer = new IndexWriter(dir, indexConfig))
{
long entryNumber = csvRecords.Count();
long index = 0;
long lastPercentage = 0;
foreach (dynamic csvEntry in csvRecords)
{
Document doc = new Document();
IDictionary<string, object> dynamicCsvEntry = (IDictionary<string, object>)csvEntry;
var indexedMetadataFiled = metadata.IdexedFields;
foreach (string headField in header)
{
if (indexedMetadataFiled.ContainsKey(headField) == false || (indexedMetadataFiled[headField].NeedToBeIndexed == false && indexedMetadataFiled[headField].NeedToBeStored == false))
continue;
var field = new Field(headField,
((string)dynamicCsvEntry[headField] ?? string.Empty).ToLower(),
indexedMetadataFiled[headField].NeedToBeStored ? Field.Store.YES : Field.Store.NO,
indexedMetadataFiled[headField].NeedToBeIndexed ? Field.Index.ANALYZED : Field.Index.NO
);
doc.Add(field);
var facetField = new FacetField(headField, (string)dynamicCsvEntry[headField]);
doc.Add(facetField);
}
long percentage = (long)(((decimal)index / (decimal)entryNumber) * 100m);
if (percentage > lastPercentage && percentage % 10 == 0)
{
_consoleLogger.Information($"..indexing {percentage}%..");
lastPercentage = percentage;
}
writer.AddDocument(doc);
index++;
}
writer.Commit();
}
}
}
//** LOG
{
string message = #"Index Created : {0}";
_fileLogger.Information(message, targetDirectory);
_consoleLogger.Information(message, targetDirectory);
}
}
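For what it's worth, the usual Lucene.NET 4.8 facet pattern (the one the linked example follows) keeps the taxonomy in its own directory — a subdirectory of the index folder works, since each writer takes its own write.lock — and passes every document through FacetsConfig.Build before adding it. A rough, untested sketch of how the writer section above could look under that assumption (it relies on the Lucene.Net.Facet package for FacetsConfig and DirectoryTaxonomyWriter, and reuses targetDirectory, metadata and csvRecords from the code above):
using (var indexDir = FSDirectory.Open(targetDirectory))
using (var taxoDir = FSDirectory.Open(Path.Combine(targetDirectory, "taxonomy")))
using (var taxoWriter = new DirectoryTaxonomyWriter(taxoDir))
using (var writer = new IndexWriter(indexDir, new IndexWriterConfig(LuceneVersion.LUCENE_48, metadata.GetAnalyzer())))
{
    var facetsConfig = new FacetsConfig();
    foreach (dynamic csvEntry in csvRecords)
    {
        Document doc = new Document();
        // ...add the stored/indexed fields and FacetFields exactly as in the loop above...
        // Build() translates the FacetFields into index terms plus taxonomy entries.
        writer.AddDocument(facetsConfig.Build(taxoWriter, doc));
    }
    writer.Commit();
    taxoWriter.Commit();
}
At search time the taxonomy subdirectory can then be opened with a DirectoryTaxonomyReader alongside the index reader managed by ReferenceManager.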

Why is PropertyDataCollection object persisting multiple records to database

I have a utility that reads the status of Microsoft BizTalk Server resources, specifically the ReceiveLocation component. My problem is that the program is submitting multiple entries for each item, i.e. each item in the returned data is being multiplied by 25, so that instead of persisting only 5 rows it persists 125. For example, instead of having just 1 row for the first item returned, I have 25.
This is my program :
public List<BizTalk> GetBizTalkServicesStatistics()
{
List<BizTalk> model = new List<BizTalk>();
try
{
//Create the WMI search object.
ManagementObjectSearcher Searcher = new ManagementObjectSearcher();
ConnectionOptions options = new ConnectionOptions
{
Username = "+username+",
Password = "+password+",
Authority = "+domain+"
};
var server = "+server+";
// create the scope node so we can set the WMI root node correctly.
ManagementScope Scope = new ManagementScope("\\\\" + server + "\\root\\MicrosoftBizTalkServer", options);
Searcher.Scope = Scope;
// Build a Query to enumerate the MSBTS_ReceiveLocation instances if an argument
// is supplied use it to select only the matching RL.
//if (args.Length == 0)
SelectQuery Query = new SelectQuery();
Query.QueryString = "SELECT * FROM MSBTS_ReceiveLocation";
// else
//Query.QueryString = "SELECT * FROM MSBTS_ReceiveLocation WHERE Name = '" + args[0] + "'";
// Set the query for the searcher.
Searcher.Query = Query;
// Execute the query and determine if any results were obtained.
ManagementObjectCollection QueryCol = Searcher.Get();
// Use a bool to tell if we enter the for loop
// below because Count property is not supported
bool ReceiveLocationFound = false;
// Enumerate all properties.
foreach (ManagementBaseObject envVar in QueryCol)
{
// There is at least one Receive Location
ReceiveLocationFound = true;
PropertyDataCollection envVarProperties = envVar.Properties;
foreach (PropertyData envVarProperty in envVarProperties)
{
BizTalk bizTalk = new BizTalk();
bizTalk.Name = Convert.ToString(envVar["Name"]);
bizTalk.TransportType = Convert.ToString(envVar["AdapterName"]);
bizTalk.Uri = Convert.ToString(envVar["InboundTransportURL"]);
bizTalk.Status = Convert.ToString(envVar["Name"]);
bizTalk.ReceiveHandler = Convert.ToString(envVar["HostName"]);
bizTalk.ReceivePort = Convert.ToString(envVar["ReceivePortName"]);
bizTalk.RunDate = DateTime.Now;
bizTalk.ApplicationId = 24;
bizTalk.ServerId = 8;
bizTalk.InstanceName = "FBCZOP";
model.Add(bizTalk);
}
}
if (!ReceiveLocationFound)
{
Console.WriteLine("No receive locations found matching the specified name.");
}
}
catch (Exception excep)
{
ExceptionLogger.SendErrorToText(excep);
}
return model;
}
Save Function
public void SaveStatistics(BizTalk entity)
{
List<BizTalk> ServerInfo = new List<BizTalk>();
ServerInfo = GetBizTalkServicesStatistics();
foreach (var di in ServerInfo)
{
entity.RunDate = di.RunDate;
entity.Name = di.Name;
entity.Status = di.Status;
entity.Uri = di.Uri;
entity.InstanceName = di.InstanceName;
entity.ReceivePort = di.ReceivePort;
entity.TransportType= di.TransportType;
entity.RunDate = DateTime.Now;
entity.ReceiveHandler = di.ReceiveHandler;
entity.ServerId = entity.ServerId;
entity.ApplicationId = entity.ApplicationId;
appEntities.BizTalk.Add(entity);
appEntities.SaveChanges();
}
}
When I step through the code, the envVarProperties variable shows a record count of 125 in the ResultsView:
Link 1
while the QueryCol variable shows a count of 5:
Link 2
It looks like you're iterating one level too deep in your GetBizTalkServicesStatistics() method.
Remove the foreach loop that starts with foreach (PropertyData envVarProperty in envVarProperties). That loop goes through every property the object has (all 25 properties) for each instance (5 instances): 25 * 5 = 125 values retrieved. You only want to iterate over the instances and pull the properties you need; that way you end up with 5 objects in your model list.
I'd suggest maybe something like this (untested because I don't have BizTalk)
public List<BizTalk> GetBizTalkServicesStatistics()
{
List<BizTalk> model = new List<BizTalk>();
try
{
//Create the WMI search object.
ConnectionOptions options = new ConnectionOptions
{
Username = "+username+",
Password = "+password+",
Authority = "+domain+"
};
var server = "+server+";
// create the scope node so we can set the WMI root node correctly.
ManagementScope Scope = new ManagementScope("\\\\" + server + "\\root\\MicrosoftBizTalkServer", options);
ManagementObjectSearcher Searcher = new ManagementObjectSearcher(Scope, new ObjectQuery("SELECT * FROM MSBTS_ReceiveLocation"));
// Enumerate all properties.
foreach (ManagementObject instance in Searcher.Get())
{
BizTalk bizTalk = new BizTalk();
bizTalk.Name = instance.Properties["Name"]?.Value?.ToString();
bizTalk.TransportType = instance.Properties["AdapterName"]?.Value?.ToString();
bizTalk.Uri = instance.Properties["InboundTransportURL"]?.Value?.ToString();
bizTalk.Status = instance.Properties["Name"]?.Value?.ToString();
bizTalk.ReceiveHandler = instance.Properties["HostName"]?.Value?.ToString();
bizTalk.ReceivePort = instance.Properties["ReceivePortName"]?.Value?.ToString();
bizTalk.RunDate = DateTime.Now;
bizTalk.ApplicationId = 24;
bizTalk.ServerId = 8;
bizTalk.InstanceName = "FBCZOP";
model.Add(bizTalk);
}
// Determine whether any receive locations were found
if (model.Count == 0)
{
Console.WriteLine("No receive locations found matching the specified name.");
}
}
catch (Exception excep)
{
ExceptionLogger.SendErrorToText(excep);
}
return model;
}
Also, this can be simplified further if you remove the ConnectionOptions (unless you are hard-coding credentials, which is strongly discouraged). If you are just using the identity of the executing user, that data is not needed.
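For example, when running under the identity of the executing user, the setup can shrink to something like this (same WMI namespace, untested sketch):
var scope = new ManagementScope(@"\\" + server + @"\root\MicrosoftBizTalkServer");
var query = new ObjectQuery("SELECT * FROM MSBTS_ReceiveLocation");
using (var searcher = new ManagementObjectSearcher(scope, query))
{
    foreach (ManagementObject instance in searcher.Get())
    {
        // map the properties exactly as in the loop above
    }
}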
-Paul
You are adding the same entity instance over and over and overwriting its properties by reference. You need to initialize a new entity inside your loop:
foreach (var di in ServerInfo)
{
var entity = new BizTalk();
entity.RunDate = di.RunDate;
entity.Name = di.Name;
entity.Status = di.Status;
entity.Uri = di.Uri;
entity.InstanceName = di.InstanceName;
entity.ReceivePort = di.ReceivePort;
entity.TransportType= di.TransportType;
entity.RunDate = DateTime.Now;
entity.ReceiveHandler = di.ReceiveHandler;
entity.ServerId = entity.ServerId;
entity.ApplicationId = entity.ApplicationId;
appEntities.BizTalk.Add(entity);
appEntities.SaveChanges();
}
As you don't show the code where SaveStatistics is called, I'm not sure this will fix your complete problem, but it is at least one method that does not do what you expect it to do.

Grouping Data from a text file in C#

I am currently working on a small project that returns text from a 'txt' file based on criteria and then groups it before I export it to a database. The text file contains:
c:\test\123
Other Lines...
c:\test\124
Problem: "description of error". (this error is for directory 124)
Problem: "description of error". (this error is for directory 124)
c:\test\125
...
I would like to group the 'problems' with their associated directory when importing them into the database. So far I have tried using 'foreach' to return the rows where the line contains/begins with a directory or a problem. Although this passes the values in order, it is not clear to users which directory a problem belongs to. Ideally I am after:
Directory(column1) Problem(column2)
c:\test\123 || Null
c:\test\124 || Problem: "description of Error".
c:\test\124 || Problem: "description of Error".
c:\test\125 || Null
Any help that you can give would be greatly appreciated. I have been racking my brains on this for the last week!
(current code)
var lines = File.ReadAllLines(filename);
foreach (var line in File.ReadLines(filename))
{
String stringTest = line;
if (stringTest.Contains(directory))
{
String test = stringTest;
var csb = new SqlConnectionStringBuilder();
csb.DataSource = host;
csb.InitialCatalog = catalog;
csb.UserID = user;
csb.Password = pass;
using (var sc = new SqlConnection(csb.ConnectionString))
using (var cmd = sc.CreateCommand())
{
sc.Open();
cmd.CommandText = "DELETE FROM table";
cmd.CommandText = "INSERT INTO table (ID, Directory) values (NEWID(), #val)";
cmd.Parameters.AddWithValue("#VAL", test);
cmd.ExecuteNonQuery();
sc.Close();
}
}
if (stringTest.Contains(problem))
{
Same for problem....
Here is one solution:
Assuming that you have the following class to hold a result item:
public class ResultItem
{
public string Directory { get; set; }
public string Problem { get; set; }
}
You can do the following:
var lines = File.ReadAllLines(filename);
string current_directory = null;
List<ResultItem> results = new List<ResultItem>();
//maintain the number of results added for the current directory
int problems_added_for_current_directory = 0;
foreach (var line in lines)
{
if (line.StartsWith("c:\\test"))
{
//If we are changing to a new directory
//And we didn't add any items for current directory
//Add a null result item
if (current_directory != null && problems_added_for_current_directory == 0)
{
results.Add(new ResultItem
{
Directory = current_directory,
Problem = null
});
}
current_directory = line;
problems_added_for_current_directory = 0;
}
else if (line.StartsWith("Problem"))
{
results.Add(new ResultItem
{
Directory = current_directory,
Problem = line
});
problems_added_for_current_directory++;
}
}
//If we are done looping
//And we didn't add any items for current (last) directory
//Add a null result item
if (current_directory != null && problems_added_for_current_directory == 0)
{
results.Add(new ResultItem
{
Directory = current_directory,
Problem = null
});
}
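Once the results list is built, it can be written to the two-column table with a single parameterized command reused per row. A sketch along the lines of the code in the question (csb is the SqlConnectionStringBuilder from the question; the table and column names are assumptions):
// requires using System.Data; and System.Data.SqlClient;
using (var sc = new SqlConnection(csb.ConnectionString))
using (var cmd = sc.CreateCommand())
{
    sc.Open();
    cmd.CommandText = "INSERT INTO table (ID, Directory, Problem) VALUES (NEWID(), @dir, @problem)";
    cmd.Parameters.Add("@dir", SqlDbType.NVarChar, 260);
    cmd.Parameters.Add("@problem", SqlDbType.NVarChar, -1); // -1 = NVARCHAR(MAX)
    foreach (var item in results)
    {
        cmd.Parameters["@dir"].Value = item.Directory;
        cmd.Parameters["@problem"].Value = (object)item.Problem ?? DBNull.Value;
        cmd.ExecuteNonQuery();
    }
}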

Trying to query many text files in the same folder with linq

I need to search a folder containing CSV files. The records I'm interested in have 3 fields: Rec, Country and Year. My job is to search the files and see if any of the files has records for more than a single year. Below is the code I have so far:
// Get each individual file from the folder.
string startFolder = @"C:\MyFileFolder\";
System.IO.DirectoryInfo dir = new System.IO.DirectoryInfo(startFolder);
IEnumerable<System.IO.FileInfo> fileList = dir.GetFiles("*.*",
System.IO.SearchOption.AllDirectories);
var queryMatchingFiles =
from file in fileList
where (file.Extension == ".dat" || file.Extension == ".csv")
select file;
Then I came up with this code to read the year field from each file and find those where the year count is more than 1 (the count part was not successfully implemented):
public void GetFileData(string filesname, char sep)
{
using (StreamReader reader = new StreamReader(filesname))
{
var recs = (from line in reader.Lines(sep.ToString())
let parts = line.Split(sep)
select parts[2]);
}
Below is a sample file:
REC,IE,2014
REC,DE,2014
REC,FR,2015
Now I am struggling to combine these 2 ideas to solve my problem in a single query. The query should list those files that have records for more than one year.
Thanks in advance
Something along these lines:
string startFolder = @"C:\MyFileFolder\";
System.IO.DirectoryInfo dir = new System.IO.DirectoryInfo(startFolder);
IEnumerable<System.IO.FileInfo> fileList = dir.GetFiles("*.*",
System.IO.SearchOption.AllDirectories);
var fileData =
    (from file in fileList
     where file.Extension == ".dat" || file.Extension == ".csv"
     select GetFileData(file.FullName, ','))
    .Where(name => name != null); // drop files that only contain a single year
public string GetFileData(string filesname, char sep)
{
    // Returns the file name when the file contains more than one distinct year, otherwise null.
    var years = from line in File.ReadLines(filesname)
                let parts = line.Split(sep)
                select parts[2];
    if (years.Distinct().Count() > 1)
        return filesname;
    return null;
}
I'm not on my development machine, so this might not compile as-is, but here's a direction:
var lines = // file.readalllines();
var years = from line in lines
            let parts = line.Split(new[] { ',' })
            select parts[2];
var distinct_years = years.Distinct();
if (distinct_years.Count() > 1)
    // this file has several years
"My job is to search the files and see if any of the files has records
for more then a single year."
This specifies that you want a Boolean result, one that says if any of the files has those records.
For fun I'll extend it a little bit more:
My job is to get the collection of files whose records cover more than a single year.
You were almost there. Let's first declare a class for the records in your file:
public class MyRecord
{
public string Rec { get; set; }
public string CountryCode { get; set; }
public int Year { get; set; }
}
I'll create an extension method on the FileInfo class that reads the file and returns the sequence of MyRecord items in it.
For extension methods see MSDN Extension Methods (C# Programming Guide)
public static class FileInfoExtension
{
public static IEnumerable<MyRecord> ReadMyRecords(this FileInfo file, char separator)
{
var records = new List<MyRecord>();
using (var reader = new StreamReader(file.FullName))
{
var lineToProcess = reader.ReadLine();
while (lineToProcess != null)
{
var splitLines = lineToProcess.Split(new char[] { separator }, 3);
if (splitLines.Length < 3) throw new InvalidDataException();
var record = new MyRecord()
{
Rec = splitLines[0],
CountryCode = splitLines[1],
Year = Int32.Parse(splitLines[2]),
};
records.Add(record);
lineToProcess = reader.ReadLine();
}
}
return records;
}
}
I could have used string instead of FileInfo, but IMHO a string is something completely different than a filename.
After the above you can write the following:
string startFolder = @"C:\MyFileFolder\";
var directoryInfo = new DirectoryInfo(startFolder);
var allFiles = directoryInfo.EnumerateFiles("*.*", SearchOption.AllDirectories);
var sequenceOfFileRecordCollections = allFiles.Select(file => file.ReadMyRecords(','));
So now you have, per file, the sequence of MyRecord items in that file. You want to know which files have more than one year, so let's add another extension method to the FileInfoExtension class:
public static bool IsMultiYear(this FileInfo file, char separator)
{
// read the file, only return true if there are any records,
// and if any record has a different year than the first record
var myRecords = file.ReadMyRecords(separator);
if (myRecords.Any())
{
int firstYear = myRecords.First().Year;
return myRecords.Any(record => record.Year != firstYear);
}
else
{
return false;
}
}
The sequence of files that contain more than one year is:
allFiles.Where(file => file.IsMultiYear(','));
Put everything in one line:
var allFilesWithMultiYear = new DirectoryInfo(@"C:\MyFileFolder\")
.EnumerateFiles("*.*", SearchOption.AllDirectories)
.Where(file => file.IsMultiYear(','));
By creating two fairly simple extension methods your problem became one highly readable statement.
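And if all you need is the Boolean the question literally asks for (does any file have records for more than a single year), it is just:
bool anyFileHasMultipleYears = new DirectoryInfo(@"C:\MyFileFolder\")
    .EnumerateFiles("*.*", SearchOption.AllDirectories)
    .Any(file => file.IsMultiYear(','));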

Find new file in two folders with a cross check

I am trying to sort two folders into a patched folder, finding which file is new in the new folder and marking it as new so I can transfer that file only. I don't care about dates or hash changes, just which files are in the new folder but not in the old folder.
Somehow the line
pf.NFile = !( oldPatch.FindAll(s => s.Equals(f)).Count() == 0);
is always returning false. Is there something wrong with my cross-checking logic?
List<string> newPatch = DirectorySearch(_newFolder);
List<string> oldPatch = DirectorySearch(_oldFolder);
foreach (string f in newPatch)
{
string filename = Path.GetFileName(f);
string Dir = (Path.GetDirectoryName(f).Replace(_newFolder, "") + @"\");
PatchFile pf = new PatchFile();
pf.Dir = Dir;
pf.FName = filename;
pf.NFile = !( oldPatch.FindAll(s => s.Equals(f)).Count() == 0);
nPatch.Files.Add(pf);
}
foreach (string f in oldPatch)
{
string filename = Path.GetFileName(f);
string Dir = (Path.GetDirectoryName(f).Replace(_oldFolder, "") + @"\");
PatchFile pf = new PatchFile();
pf.Dir = Dir;
pf.FName = filename;
if (!nPatch.Files.Exists(item => item.Dir == pf.Dir &&
item.FName == pf.FName))
{
nPatch.removeFiles.Add(pf);
}
}
I don't have the classes you are using (like DirectorySearch and PatchFile), so I can't compile your code, but IMO the line oldPatch.FindAll(...) never finds anything because you are comparing the full path (c:\oldpatch\filea.txt is not c:\newpatch\filea.txt) rather than the file name only. IMO your algorithm could be simplified, something like this pseudocode (using List.Contains instead of List.FindAll):
var _newFolder = "d:\\temp\\xml\\b";
var _oldFolder = "d:\\temp\\xml\\a";
List<FileInfo> missing = new List<FileInfo>();
List<FileInfo> nPatch = new List<FileInfo>();
List<FileInfo> newPatch = new DirectoryInfo(_newFolder).GetFiles().ToList();
List<FileInfo> oldPatch = new DirectoryInfo(_oldFolder).GetFiles().ToList();
// take all files in new patch
foreach (var f in newPatch)
{
nPatch.Add(f);
}
// search for hits in old patch
foreach (var f in oldPatch)
{
if (!nPatch.Select (p => p.Name.ToLower()).Contains(f.Name.ToLower()))
{
missing.Add(f);
}
}
// files that exist only in the old folder are now in missing
One possible solution with less code would be to select the file names, put them into lists and use the LINQ Except or, if needed, Intersect methods. This way, finding which files are in A but not in B can be done quickly like this:
var locationA = "d:\\temp\\xml\\a";
var locationB = "d:\\temp\\xml\\b";
// takes file names from A and B and put them into lists
var filesInA = new DirectoryInfo(locationA).GetFiles().Select (n => n.Name).ToList();
var filesInB = new DirectoryInfo(locationB).GetFiles().Select (n => n.Name).ToList();
// Except retrieves all files that are in A but not in B
foreach (var file in filesInA.Except(filesInB).ToList())
{
Console.WriteLine(file);
}
I have 1.xml, 2.xml, 3.xml in A and 1.xml, 3.xml in B. The output is 2.xml - missing in B.
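Since the question actually asks for the opposite direction (files that are in the new folder but not in the old one), the same Except call with the operands swapped gives exactly that. A minimal sketch using the folder variables from the question:
var newFiles = new DirectoryInfo(_newFolder).GetFiles().Select(f => f.Name)
    .Except(new DirectoryInfo(_oldFolder).GetFiles().Select(f => f.Name), StringComparer.OrdinalIgnoreCase)
    .ToList();
// newFiles now holds the names that exist only in _newFolder, i.e. the files to transfer.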
