How to manipulate every string that an IList gives you? - c#

I have a list that comes from this page: http://www.codigo-postal.pt/?cp4=4710&cp3= . If you visit the link, you can see that each entry has a line that always ends with the word "Braga". What I want is to reduce every string the list gives me to the last word after the comma. How can I do that?
The List is given by this code:
IList<string> Distritos = new List<string>();
foreach (var Distritoelement in Gdriver.FindElements(By.ClassName("local")))
{
//Distritos.Add(Distritoelement.Text);
table.Rows.Add(Distritoelement.Text);
}

In order to get the word after the last comma, run this code...
IList<string> Distritos = new List<string>();
foreach (var Distritoelement in Gdriver.FindElements(By.ClassName("local")))
{
//Distritos.Add(Distritoelement.Text);
table.Rows.Add(Distritoelement.Text.Substring(Distritoelement.Text.LastIndexOf(',') + 1));
}
You might want to put the Text into a local variable first. If you don't want the extra space before the word, call .Trim() on the result of Substring, as in ...LastIndexOf(',') + 1).Trim());
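As a minimal sketch of that suggestion, the helper below keeps the text in a local value and trims the result. The class and method names (TextHelpers, LastSegment) are hypothetical and not part of the original answer; the input is assumed to be an ordinary string such as the element's Text.

```csharp
using System;

public static class TextHelpers
{
    // Hypothetical helper: returns the part of 'text' after the last comma, trimmed.
    // If there is no comma, LastIndexOf returns -1, so Substring(0) yields the whole string.
    public static string LastSegment(string text)
    {
        int lastComma = text.LastIndexOf(',');
        return text.Substring(lastComma + 1).Trim();
    }
}
```

Inside the loop this would be called as table.Rows.Add(TextHelpers.LastSegment(Distritoelement.Text));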

And if you have more than one list, you should do:
IList<string> Freguseia = new List<string>();
foreach (var freguesiaelement in Fdriver.FindElements(By.ClassName("local")))
{
Freguseia.Add(freguesiaelement.Text);
}
IList<string> GPS = new List<string>();
foreach (var gpselement in Fdriver.FindElements(By.ClassName("gps")))
{
GPS.Add(gpselement.Text);
}
for (int i = 0; i < Freguseia.Count; i++)
{
table.Rows.Add( Freguseia.ElementAt(i), GPS.ElementAt(i));
}

Related

Creating a new list by incrementing two separate lists

I'm trying to create a new list from two separate lists like so:
List 1 (sCat) = MD0, MD1, MD3, MD4
List 2 (sLev) = 01, 02, 03, R
Output-->
MD0-01
MD0-02
MD0-03
MD0-R
MD1-01
MD1-02
MD1-03
MD1-R
MD3-01
MD3-02
MD3-03
MD3-R
etc...
I would like to know if there is a function that would produce the results above. Ultimately, I would like the user to provide List 2 and have that information added to List 1 and stored as a new list that I could call later.
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args)
{
List<string> sCat = new List<string>();
// add Categories for the Sheets
sCat.Add("MD0");
sCat.Add("MD1");
sCat.Add("MD3");
List<string> sLev = new List<string>();
// add Levels for the Project
sLev.Add("01");
sLev.Add("02");
sLev.Add("03");
sLev.Add("R");
for (int i = 0; i < sCat.Count; i++)
{
// I am getting stuck here.
// I don't know how to take one item from the sCat list and
// add it to the sLev List incrementally.
Console.WriteLine(sCat[i],i);
}
Console.ReadLine();
}
}
Combine the values of all the elements selected from the first collection with the elements contained in the other collection:
var combined = sCat.SelectMany(s => sLev.Select(s1 => $"{s}-{s1}")).ToList();
Which is like iterating the two collections in a nested for/foreach loop, adding each combined element to a new List<string>:
List<string> combined = new List<string>();
foreach (var s1 in sCat)
foreach (var s2 in sLev) {
combined.Add(s1 + "-" + s2);
}
You can replace your for loop with the following:
foreach(var sCatValue in sCat)
{
foreach(var sLevValue in sLev)
{
Console.WriteLine($"{sCatValue}-{sLevValue}");
}
}
private static void Main()
{
List<string> sCat = new List<string>();
// add Categories for the Sheets
sCat.Add("MD0");
sCat.Add("MD1");
sCat.Add("MD3");
List<string> sLev = new List<string>();
// add Levels for the Project
sLev.Add("01");
sLev.Add("02");
sLev.Add("03");
sLev.Add("R");
string dash = "-";
List<string> newList = new List<string>();
for (int i = 0; i < sCat.Count; i++)
{
for (int j = 0; j < sLev.Count; j++)
{
newList.Add(sCat[i] + dash + sLev[j]);
}
}
foreach (var item in newList)
{
Console.WriteLine(item);
}
Console.ReadLine();
}

Read and process 100 text files in c# in parallel

I have a project that reads 100 text files with 5000 words in each.
I insert the words into a list. I have a second list that contains English stop words. I compare the two lists and delete the stop words from the first list.
It takes 1 hour to run the application. I want to parallelize it. How can I do that?
Here's my code:
private void button1_Click(object sender, EventArgs e)
{
List<string> listt1 = new List<string>();
string line;
for (int ii = 1; ii <= 49; ii++)
{
string d = ii.ToString();
using (StreamReader reader = new StreamReader(@"D" + d.ToString() + ".txt"))
while ((line = reader.ReadLine()) != null)
{
string[] words = line.Split(' ');
for (int i = 0; i < words.Length; i++)
{
listt1.Add(words[i].ToString());
}
}
listt1 = listt1.ConvertAll(d1 => d1.ToLower());
StreamReader reader2 = new StreamReader("stopword.txt");
List<string> listt2 = new List<string>();
string line2;
while ((line2 = reader2.ReadLine()) != null)
{
string[] words2 = line2.Split('\n');
for (int i = 0; i < words2.Length; i++)
{
listt2.Add(words2[i]);
}
listt2 = listt2.ConvertAll(d1 => d1.ToLower());
}
for (int i = 0; i < listt1.Count(); i++)
{
for (int j = 0; j < listt2.Count(); j++)
{
listt1.RemoveAll(d1 => d1.Equals(listt2[j]));
}
}
listt1=listt1.Distinct().ToList();
textBox1.Text = listt1.Count().ToString();
}
}
}
}
I fixed many things up with your code. I don't think you need multi-threading:
private void RemoveStopWords()
{
HashSet<string> stopWords = new HashSet<string>();
using (var stopWordReader = new StreamReader("stopword.txt"))
{
string line2;
while ((line2 = stopWordReader.ReadLine()) != null)
{
string[] words2 = line2.Split('\n');
for (int i = 0; i < words2.Length; i++)
{
stopWords.Add(words2[i].ToLower());
}
}
}
var fileWords = new HashSet<string>();
for (int fileNumber = 1; fileNumber <= 49; fileNumber++)
{
using (var reader = new StreamReader("D" + fileNumber.ToString() + ".txt"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
foreach(var word in line.Split(' '))
{
fileWords.Add(word.ToLower());
}
}
}
}
fileWords.ExceptWith(stopWords);
textBox1.Text = fileWords.Count().ToString();
}
Because of the way your code is structured, you read through the list of stop words many times, keep adding to the same word list, and re-attempt to remove the same stop words over and over. Your needs are also better matched by a HashSet than by a List, since it provides set-based operations and handles uniqueness for you.
If you still want to make this parallel, you could read the stop word list once and pass it to an async method that reads one input file, removes the stop words, and returns the resulting set; you would then merge the results once all the asynchronous calls have completed. Test before deciding you need that, though, because it is quite a bit more work and complexity than this code already has.
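The parallel approach described above could be sketched roughly as follows. This is an illustration under assumptions, not a tuned implementation: the names ParallelStopWordFilter, FilterFile and FilterFilesAsync are hypothetical, the stop words are assumed to be lower case already, and one Task.Run per file is assumed to be acceptable for about 100 files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelStopWordFilter
{
    // Reads one file and returns its distinct lower-cased words minus the stop words.
    // 'stopWords' is only read here; concurrent reads of an unmodified HashSet are safe.
    private static HashSet<string> FilterFile(string path, HashSet<string> stopWords)
    {
        var words = File.ReadLines(path)
            .SelectMany(line => line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .Select(word => word.ToLowerInvariant())
            .Where(word => !stopWords.Contains(word));
        return new HashSet<string>(words);
    }

    // Starts one task per file, waits for all of them, then merges the per-file sets.
    public static async Task<HashSet<string>> FilterFilesAsync(
        IEnumerable<string> paths, HashSet<string> stopWords)
    {
        var tasks = paths
            .Select(path => Task.Run(() => FilterFile(path, stopWords)))
            .ToList();
        HashSet<string>[] perFile = await Task.WhenAll(tasks);

        var merged = new HashSet<string>();
        foreach (var set in perFile)
        {
            merged.UnionWith(set);
        }
        return merged;
    }
}
```

Each task builds its own set, so no locking is needed until the merge, which runs on a single thread after Task.WhenAll completes.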
If I understand you correctly, you want to:
Read all words from a file into a List
Remove all "stop words" from the List
Repeat for 99 more files, saving only the unique words
If this is correct, the code is pretty simple:
// The list of words to delete ("stop words")
var stopWords = new List<string> { "remove", "these", "words" };
// The list of files to check - you can get this list in other ways
var filesToCheck = new List<string>
{
@"f:\public\temp\temp1.txt",
@"f:\public\temp\temp2.txt",
@"f:\public\temp\temp3.txt"
};
// This list will contain all the unique words from all
// the files, except the ones in the "stopWords" list
var uniqueFilteredWords = new List<string>();
// Loop through all our files
foreach (var fileToCheck in filesToCheck)
{
// Read all the file text into a variable
var fileText = File.ReadAllText(fileToCheck);
// Split the text into distinct words (splitting on null
// splits on all whitespace) and ignore empty lines
var fileWords = fileText.Split(null)
.Where(line => !string.IsNullOrWhiteSpace(line))
.Distinct();
// Add all the words from the file, except the ones in
// your "stop list" and those that are already in the list
uniqueFilteredWords.AddRange(fileWords.Except(stopWords)
.Where(word => !uniqueFilteredWords.Contains(word)));
}
This can be condensed into a single statement with no explicit loop (here the stop word comparison is case-insensitive, and Distinct is applied across all files):
// This sequence will contain all the unique words from all
// the files, except the ones in the "stopWords" list
var uniqueFilteredWords = filesToCheck.SelectMany(fileToCheck =>
File.ReadAllText(fileToCheck)
.Split(null)
.Where(word => !string.IsNullOrWhiteSpace(word) &&
!stopWords.Any(stopWord => stopWord.Equals(word,
StringComparison.OrdinalIgnoreCase))))
.Distinct();
This code processed over 100 files with more than 12000 words each in less than a second (WAY less than a second... 0.0001782 seconds)
One issue that hurts performance is that listt1.ConvertAll() runs in O(n) over the whole list. You are already looping to add the items to the list, so why not convert them to lower case there? Also, why not store the words in a hash set, so that lookup and insertion are O(1)? You could store the stop words in one hash set and, as you read your text input, check whether each word is a stop word; if it is not, add it to a second hash set that holds the output for the user.
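A rough sketch of that single-pass idea, working on lines that have already been read so it stays independent of file handling. The names StopWordFilter and UniqueWords are hypothetical, and splitting on single spaces mirrors the question's code rather than any requirement.

```csharp
using System;
using System.Collections.Generic;

public static class StopWordFilter
{
    // Lower-cases each word as it is read and keeps it only if it is not a stop word.
    // Both lookups and insertions on the hash sets are O(1) on average.
    public static HashSet<string> UniqueWords(IEnumerable<string> lines, IEnumerable<string> stopWords)
    {
        // Case-insensitive set, so the stop word list does not have to be lower case.
        var stop = new HashSet<string>(stopWords, StringComparer.OrdinalIgnoreCase);
        var result = new HashSet<string>();

        foreach (var line in lines)
        {
            foreach (var word in line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                var lower = word.ToLowerInvariant();
                if (!stop.Contains(lower))
                {
                    result.Add(lower); // Add ignores duplicates, so no Distinct() is needed
                }
            }
        }
        return result;
    }
}
```

With File.ReadLines supplying the lines, textBox1.Text could then be set from the Count of the returned set.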

Making a multidimensional jagged array

I want to make a dynamic array of dynamic arrays, how can I do that?
I've tried with list of list where I use the AddRange() method.
I've also tried iterating through arrays.
Maybe it makes more sense to show what I'm trying to do. I cannot get it to work:
String[] lines = System.IO.File.ReadAllLines(fileName);
String[] linesArr;
String[][] MultiArr;
int i = 0;
foreach (string line in lines)
{
if (line.Contains("EFIX"))
{
linesArr = line.Split(delimiterChars);
for (int x = 0; x < linesArr.Length; x++)
{
MultiArr[i][x] = linesArr[x];
}
Console.WriteLine(fixationsData[i]);
i++;
}
}
LINQ makes this pretty trivial.
string[][] data = File.ReadLines(filename)
.Where(line => line.Contains("EFIX"))
.Select(line => line.Split(delimiterChars))
.ToArray(); // omit this last call to allow the data to be streamed,
// greatly reducing the memory footprint of the application at no real
// additional cost, assuming you have no compelling reason to eagerly
// load the whole file into memory.
foreach (var line in data)
Console.WriteLine(string.Join(" ", line)); // each item is a string[], so join its fields for display
Using a list of list of strings should work fine. Here's what I'd write.
var lines = System.IO.File.ReadAllLines(fileName);
var multiArr = new List<List<string>>();
var i = 0;
foreach (var line in lines.Where(line => line.Contains("EFIX")))
{
multiArr.Add(line.Split(delimiterChars).ToList());
Console.WriteLine(fixationsData[i]);
i++;
}
String[] lines = System.IO.File.ReadAllLines(fileName);
String[] linesArr;
List<String[]> MultiArr = new List<String[]>();
int i = 0; // index into fixationsData, as in the question's code
foreach (string line in lines)
{
if (line.Contains("EFIX"))
{
linesArr = line.Split(delimiterChars);
MultiArr.Add(linesArr);
Console.WriteLine(fixationsData[i]);
i++;
}
}
This is a list of string arrays. This is what you are looking for.
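If a true jagged array is needed at the end, the list can be converted once it has been filled. The sketch below assumes the lines are already in memory; JaggedArrayBuilder and Build are hypothetical names, and the marker and delimiters are passed in rather than taken from the question's variables.

```csharp
using System;
using System.Collections.Generic;

public static class JaggedArrayBuilder
{
    // Keeps only the lines containing 'marker' and splits each one into its own array.
    // The list grows as needed; ToArray() turns it into a string[][] at the end.
    public static string[][] Build(IEnumerable<string> lines, string marker, char[] delimiters)
    {
        var rows = new List<string[]>();
        foreach (var line in lines)
        {
            if (line.Contains(marker))
            {
                rows.Add(line.Split(delimiters));
            }
        }
        return rows.ToArray();
    }
}
```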

How do I make the foreach instruction iterate in 2 places?

How do I make the foreach instruction iterate over both the "files" variable and the "names" array?
var files = Directory.GetFiles(@".\GalleryImages");
string[] names = new string[8] { "Matt", "Joanne", "Robert","Andrei","Mihai","Radu","Ionica","Vasile"};
I've tried two options. The first one gives me lots of errors, and the second one displays 8 images of each kind:
foreach(var file in files,var i in names)
{
//Do stuff
}
and
foreach(var file in files)
{
foreach (var i in names)
{
//Do stuff
}
}
You can try using the Zip Extension method of LINQ:
int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };
var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);
foreach (var item in numbersAndWords)
Console.WriteLine(item);
In your case, it would look something like this:
var files = Directory.GetFiles(@".\GalleryImages");
string[] names = new string[] { "Matt", "Joanne", "Robert", "Andrei", "Mihai","Radu","Ionica","Vasile"};
var zipped = files.Zip(names, (f, n) => new { File = f, Name = n });
foreach(var fn in zipped)
Console.WriteLine(fn.File + " " + fn.Name);
But I haven't tested this one.
It's not clear what you're asking. You can't iterate two iterators with a single foreach, but you can increment another variable in the foreach body:
int i = 0;
foreach(var file in files)
{
var name = names[i++];
// TODO: do something with name and file
}
This, of course, assumes that files and names are of the same length.
You can't. Use a for loop instead.
for(int i = 0; i < files.Length; i++)
{
var file = files[i];
var name = names[i];
}
If both arrays have the same length, this should work.
You have two options here; the first works if you are iterating over something that has an indexer, like an array or List, in which case use a simple for loop and access things by index:
for (int i = 0; i < files.Length && i < names.Length; i++)
{
var file = files[i];
var name = names[i];
// Do stuff with names.
}
If you have a collection that doesn't have an indexer, e.g. you just have an IEnumerable and you don't know what it is, you can use the IEnumerable interface directly. Behind the scenes, that's all foreach is doing, it just hides the slightly messier syntax. That would look like:
var filesEnum = files.GetEnumerator();
var namesEnum = names.GetEnumerator();
while (filesEnum.MoveNext() && namesEnum.MoveNext())
{
var file = filesEnum.Current;
var name = namesEnum.Current;
// Do stuff with files and names.
}
Both of these assume that both collections have the same number of items. The for loop will only iterate as many times as the smaller one, and the smaller enumerator will return false from MoveNext when it runs out of items. If one collection is bigger than the other, the 'extra' items won't get processed, and you'll need to figure out what to do with them.
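One possible way to handle the unequal-length case is to fail loudly instead of silently dropping the extra items. The sketch below is only one option under that assumption; PairwiseHelpers and ZipStrict are hypothetical names, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;

public static class PairwiseHelpers
{
    // Pairs items from two sequences like Zip does, but throws if one sequence
    // runs out before the other. The using blocks dispose both enumerators.
    public static IEnumerable<KeyValuePair<TFirst, TSecond>> ZipStrict<TFirst, TSecond>(
        IEnumerable<TFirst> first, IEnumerable<TSecond> second)
    {
        using (var firstEnum = first.GetEnumerator())
        using (var secondEnum = second.GetEnumerator())
        {
            while (true)
            {
                // Advance both enumerators every time so a mismatch is always detected.
                bool hasFirst = firstEnum.MoveNext();
                bool hasSecond = secondEnum.MoveNext();
                if (hasFirst != hasSecond)
                {
                    throw new InvalidOperationException("The sequences have different lengths.");
                }
                if (!hasFirst)
                {
                    yield break;
                }
                yield return new KeyValuePair<TFirst, TSecond>(firstEnum.Current, secondEnum.Current);
            }
        }
    }
}
```

Because this is an iterator method, the exception is raised while enumerating (for example inside the foreach), not when ZipStrict is called.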
I assume the files array and the names array line up index by index.
If that is the case AND you always want the same index from both at once, do this:
for (int key = 0; key < files.Length; ++key)
{
// access names[key] and files[key] here
}
You can try something like this:
var pairs = files.Zip(names, (f,n) => new {File=f, Name=n});
foreach (var item in pairs)
{
Console.Write(item.File);
Console.Write(item.Name);
}

My hashtable doesn't work

I am using a hashtable to read data from file and make clusters.
Say the data in file is:
umair,i,umair
sajid,mark,i , k , i
The output is like:
[{umair,umair},i]
[sajid,mark,i,i,k]
But my code does not work. Here is the code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Collections;
namespace readstringfromfile
{
class Program
{
static void Main()
{
/* int i = 0;
foreach (string line in File.ReadAllLines("newfile.txt"))
{
string[] parts = line.Split(',');
foreach (string part in parts)
{
Console.WriteLine("{0}:{1}", i,part);
}
i++; // For demo only
}*/
Hashtable hashtable = new Hashtable();
using (StreamReader r = new StreamReader("newfile.txt"))
{
string line;
while ((line = r.ReadLine()) != null)
{
string[] records = line.Split(',');
foreach (string record in records)
{
if (hashtable[records] == null)
hashtable[records] = (int)0;
hashtable[records] = (int)hashtable[records] + 1;
Console.WriteLine(hashtable.Keys);
}
/////this portion is not working/////////////////////////////////////
foreach (DictionaryEntry entry in hashtable)
{
for (int i = 0; i < (int)hashtable[records]; i++)
{
Console.WriteLine(entry);
}
}
}
}
}
}
}
You're using the records array as the key when inserting into the hashtable (and when reading from it) instead of the foreach variable record. Also, in the final loop, you iterate based on records instead of the current entry.Key. You're also declaring the hashtable in too wide a scope, which causes all rows to be inserted into the same hashtable instead of one per row.
public static void Main() {
var lines = new[] { "umair,i,umair", "sajid,mark,i,k,i" };
foreach (var line in lines) {
var hashtable = new Hashtable();
var records = line.Split(',');
foreach (var record in records) {
if (hashtable[record] == null)
hashtable[record] = 0;
hashtable[record] = (Int32)hashtable[record] + 1;
}
var str = "";
foreach (DictionaryEntry entry in hashtable) {
var count = (Int32)hashtable[entry.Key];
for (var i = 0; i < count; i++) {
str += entry.Key;
if (i < count - 1)
str += ",";
}
str += ",";
}
// Remove last comma.
str = str.TrimEnd(',');
Console.WriteLine(str);
}
Console.ReadLine();
}
However, you should consider using the generic Dictionary<TKey,TValue> class, and use a StringBuilder if you're building a lot of strings.
public static void Main() {
var lines = new[] { "umair,i,umair", "sajid,mark,i,k,i" };
foreach (var line in lines) {
var dictionary = new Dictionary<String, Int32>();
var records = line.Split(',');
foreach (var record in records) {
if (!dictionary.ContainsKey(record))
dictionary.Add(record, 1);
else
dictionary[record]++;
}
var str = "";
foreach (var entry in dictionary) {
for (var i = 0; i < entry.Value; i++) {
str += entry.Key;
if (i < entry.Value - 1)
str += ",";
}
str += ",";
}
// Remove last comma.
str = str.TrimEnd(',');
Console.WriteLine(str);
}
Console.ReadLine();
}
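The StringBuilder suggestion is not shown in the code above, so here is a rough sketch of how the same per-line clustering might look with it. RecordClusterer and Cluster are hypothetical names. Note that Dictionary does not formally guarantee enumeration order, so the order of the groups in the output is an assumption about current behavior, not a contract.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RecordClusterer
{
    // Counts each record in one comma-separated line, then writes every record
    // 'count' times in a row so that equal records end up next to each other.
    public static string Cluster(string line)
    {
        var counts = new Dictionary<string, int>();
        foreach (var record in line.Split(','))
        {
            int current;
            counts.TryGetValue(record, out current); // current stays 0 when the key is missing
            counts[record] = current + 1;
        }

        var builder = new StringBuilder();
        bool first = true;
        foreach (var entry in counts)
        {
            for (int i = 0; i < entry.Value; i++)
            {
                if (!first)
                {
                    builder.Append(','); // separator before every item except the first
                }
                builder.Append(entry.Key);
                first = false;
            }
        }
        return builder.ToString();
    }
}
```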
You're attempting to group elements of a sequence. LINQ has a built-in operator for that; it's used as group ... by ... into ... or the equivalent method .GroupBy(...)
That means you can write your code (excluding File I/O etc.) as:
var lines = new[] { "umair,i,umair", "sajid,mark,i,k,i" };
foreach (var line in lines) {
var groupedRecords =
from record in line.Split(',')
group record by record into recordgroup
from record in recordgroup
select record;
Console.WriteLine(
string.Join(
",", groupedRecords
)
);
}
If you prefer shorter code, the loop can equivalently be written as:
foreach (var line in lines)
Console.WriteLine(string.Join(",",
line.Split(',').GroupBy(rec=>rec).SelectMany(grp=>grp)));
Both versions will output:
umair,umair,i
sajid,mark,i,i,k
Note that you really shouldn't be using a Hashtable - that's just a type-unsafe slow version of Dictionary for almost all purposes. Also, the output example you mention includes [] and {} characters - but you didn't specify how or whether they're supposed to be included, so I left those out.
A LINQ group is nothing more than a sequence of elements (here, identical strings) with a Key (here a string). Calling GroupBy thus transforms the sequence of records into a sequence of groups. However, you want to simply concatenate those groups. SelectMany is such a concatenation: from a sequence of items, it concatenates the "contents" of each item into one large sequence.
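To make the two steps visible separately, the sketch below first inspects the groups (key and size) and then flattens them again. GroupingDemo, CountGroups and Regroup are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Describes each group as "key x count". LINQ to Objects yields the groups
    // in the order in which each key first appears in the source.
    public static List<string> CountGroups(string line)
    {
        return line.Split(',')
            .GroupBy(record => record)
            .Select(g => g.Key + " x " + g.Count())
            .ToList();
    }

    // Flattens the groups back into one sequence, so equal records become adjacent.
    public static string Regroup(string line)
    {
        return string.Join(",", line.Split(',').GroupBy(record => record).SelectMany(g => g));
    }
}
```

For the line "umair,i,umair", CountGroups gives "umair x 2" and "i x 1", and Regroup gives "umair,umair,i".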
