Remove All Indexes in String - c#

I have a dictionary (Dictionary<int, int>) that holds the indexes of all parentheses in a string (the key is openStartParenthesisIndex and the value is closeEndParenthesisIndex).
e.g in text
stringX.stringY(())() -> stringX.stringY$($()^)^$()^
$ = openParenthesisStartIndex
^ = closeParenthesisEndIndex
Dictionary items:
key value
(openParenthesisStartIndex) --- (closeParenthesisEndIndex)
item1 15 19
item2 16 18
item3 19 21
My problem is that when I loop over my dictionary and remove the characters from the string, the indexes are no longer valid on the next iteration because the string has already changed.
string myText = "stringX.stringY(())()";
Dictionary<int, int> myIndexs = new Dictionary<int, int>();
foreach (var item in myIndexs)
{
myText = myText.Remove(item.Key, 1).Remove(item.Value-1);
}
Question: how can I remove all indexes in a string (from startIndex[key] to endIndex[value])?

To prevent the indexes from changing, one trick is to remove the occurrences starting from the end:
string myText = "stringX.stringY(())()";
Dictionary<int, int> myIndexs = new Dictionary<int, int>();
// The values in the question point one past each closing parenthesis, hence the "- 1".
var allIndexes = myIndexs.Keys.Concat(myIndexs.Values.Select(v => v - 1));
foreach (var index in allIndexes.OrderByDescending(i => i))
{
myText = myText.Remove(index, 1);
}
Note that you probably don't need a dictionary at all. Consider replacing it by a list.
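As a minimal sketch of the remove-from-the-end idea with a plain list instead of a dictionary: the class and method names below are made up for illustration, and the positions are assumed to be the exact indexes of the characters to delete.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class IndexRemoval
{
    // Hypothetical helper: removes the characters at the given positions.
    public static string RemoveAt(string text, IEnumerable<int> positions)
    {
        var sb = new StringBuilder(text);
        // Distinct() guards against a position being listed twice;
        // descending order keeps the remaining positions valid.
        foreach (var position in positions.Distinct().OrderByDescending(p => p))
        {
            sb.Remove(position, 1);
        }
        return sb.ToString();
    }
}
```

For the string in the question, IndexRemoval.RemoveAt("stringX.stringY(())()", new[] { 15, 16, 17, 18, 19, 20 }) should give "stringX.stringY".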

StringBuilder is better suited to your case because you are continuously changing the data (see the StringBuilder documentation on MSDN).
Ordering the keys by descending order will work as well for removing all indexes.
Another workaround could be to place an intermediary character at required index and replace all instances of that character in the end.
StringBuilder ab = new StringBuilder("ab(cd)");
ab.Remove(2, 1);
ab.Insert(2, "`");
ab.Remove(5, 1);
ab.Insert(5, "`");
ab.Replace("`", "");
System.Console.Write(ab);

Strings are immutable: whenever you change a string, a new string is created, so what you want is to build a new string without the removed parts. The code below copies a character only if no range covers its position, which also copes with overlapping ranges. Maybe the better way would be to clean up the indexes first, making a list of indexes that represent the same removals in the right order without overlap.
public static string removeAtIndexes(string source)
{
    // Ranges from the question: start index inclusive, end index exclusive.
    var indexes = new Tuple<int, int>[]
    {
        new Tuple<int, int>(15, 19),
        new Tuple<int, int>(16, 18),
        new Tuple<int, int>(19, 21)
    };
    var sb = new StringBuilder();
    for (var i = 0; i < source.Length; i++)
    {
        // Check whether any range covers position i.
        var removed = false;
        foreach (var index in indexes)
        {
            if (index.Item1 <= i && i < index.Item2)
            {
                removed = true;
                break;
            }
        }
        if (!removed)
        {
            sb.Append(source[i]);
        }
    }
    return sb.ToString();
}
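The "clean up the indexes" idea mentioned above could look roughly like this. It is only a sketch: the names are hypothetical, each range is assumed to be (start inclusive, end exclusive) with non-negative values, and overlapping or nested ranges are folded together while walking the ranges in start order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class RangeRemoval
{
    // Hypothetical helper. Each range is (start inclusive, end exclusive).
    public static string RemoveRanges(string source, IEnumerable<Tuple<int, int>> ranges)
    {
        var sb = new StringBuilder();
        var position = 0; // first character not yet handled
        foreach (var range in ranges.OrderBy(r => r.Item1))
        {
            var start = Math.Min(range.Item1, source.Length);
            var end = Math.Min(range.Item2, source.Length);
            // Copy the gap before this range, if there is one.
            if (start > position)
            {
                sb.Append(source, position, start - position);
            }
            // Skip the range; Math.Max copes with overlapping or nested ranges.
            position = Math.Max(position, end);
        }
        // Copy whatever follows the last range.
        sb.Append(source, position, source.Length - position);
        return sb.ToString();
    }
}
```

Each character is visited at most once, so the cost is dominated by sorting the ranges rather than by repeated string copies.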

Related

Dictionary to return a character list with their indices

I've been tasked with taking a string and returning a dictionary that has a map of characters to a list of their indices in a given string. The output should show which characters occur where in the given string.
This code passes the test:
public class CharacterIndexDictionary
{
public static Dictionary<string, List<int>> ConcordanceForString(string input)
{
var result = new Dictionary<string, List<int>>();
for (var index = 0; index < input.Length; index++)
{
// Get the character positioned at the current index.
// We could just use input[index] everywhere, but
// this is a little easier to read.
string currentCharacter = input[index].ToString();
// If the dictionary doesn't already have an entry
// for the current character, add one.
if (!result.ContainsKey(currentCharacter))
{
result.Add(currentCharacter, new List<int>());
}
// Add the current index to the list for
// the current character.
result[currentCharacter].Add(index);
}
return result;
}
}
If I wanted to index characters I'd use a Dictionary<char, List<int>> instead of using a string as the key, but this uses string because the test requires it.
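For comparison, a char-keyed version might look like the sketch below. The class and method names are invented here, and TryGetValue is used so each character needs only one dictionary lookup in the common case.

```csharp
using System.Collections.Generic;

public static class CharIndex
{
    // Hypothetical helper: maps each character to the list of its positions.
    public static Dictionary<char, List<int>> Build(string input)
    {
        var result = new Dictionary<char, List<int>>();
        for (var index = 0; index < input.Length; index++)
        {
            List<int> positions;
            // Create the list the first time a character is seen.
            if (!result.TryGetValue(input[index], out positions))
            {
                positions = new List<int>();
                result.Add(input[index], positions);
            }
            positions.Add(index);
        }
        return result;
    }
}
```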
This code block is similar to your code and is written in a way that should be easy to follow:
public Dictionary<string, List<int>> ConcordanceForString(string s)
{
Dictionary<string, List<int>> newDictionary = new Dictionary<string, List<int>>();
List<char> charList = new List<char>();
foreach (var item in s)
{
if (!charList.Any(x => x == item))
{
charList.Add(item);
List<int> itemInds = new List<int>();
for (int i = 0; i< s.Length; i++)
{
if (s[i] == item)
{
itemInds.Add(i);
}
}
newDictionary.Add(item.ToString(), itemInds);
}
}
return newDictionary;
}

Quickest algorithm for identifying pairs in collection of string

I am looking for the quickest algorithm:
GOAL: output the total number of pair occurrences found on a line. The individual elements may be in any order on any given line.
INPUT:
a;b;c;d
a;e;f;g
a;b;f;h
OUTPUT
a;b = 2
a;c = 1
a;d = 1
a;e = 1
a;f = 2
a;g = 1
b;c = 1
b;d = 1
I am programming in C#. I've got a nested for loop adding to a common dictionary of type Dictionary<string, int>, where the string is like a;b; when an occurrence is found it adds to the existing int tally or adds a new entry with tally = 0.
Note this:
a;b = 1
b;a = 1
Should be reduced to this:
a;b = 1
I am open to using other languages, the output is in a plain text file which I feed into Gephi visualization tool.
Bonus: Very interested to know the name of this particular algorithm if it's out there. Pretty sure it is.
String[] data = File.ReadAllLines(@"C:\input.txt");
Dictionary<string, int> ress = new Dictionary<string, int>();
foreach (var line in data)
{
string[] outStrings = line.Split(';');
for (int i = 0; i < outStrings.Count(); i++)
{
for (int y = 0; y < outStrings.Count(); y++)
{
if (outStrings[i] != outStrings[y])
{
try
{
if (ress.Any(x => x.Key == outStrings[i] + ";" + outStrings[y]))
{
ress[outStrings[i] + ";" + outStrings[y]] += 1;
}
else
{
ress.Add(outStrings[i] + ";" + outStrings[y], 0);
}
}
catch (Exception)
{
}
}
}
}
}
foreach (var val in ress)
{
Console.WriteLine(val.Key + "----" + val.Value);
}
I think your inner loop should start with i + 1 instead of starting back at 0 again, and the outer loop should only run until Length - 1, since the last item will be compared on the inner loop. Also, when you add a new item, you should add the value 1, not 0 (since the whole reason we're adding it is because we found one).
You can also just store the key into a string once instead of doing multiple concatenations during your comparison and assignment, and you can use the ContainsKey method to determine if a key exists already.
Also, you might want to consider avoiding empty catch blocks unless you're really certain that you don't care if or what went wrong. If I'm expecting an exception and know how to handle it, then I catch that exception, otherwise I'll just let it bubble up the stack.
Here's one way you could modify your code to find all pairs and their counts:
Update
I added a check to ensure that the "pair" key is always sorted, so that "b;a" becomes "a;b". This wasn't an issue in your sample data, but I extended the data to include lines like b;a;a;b;a;b;a;. Also I added StringSplitOptions.RemoveEmptyEntries to the Split method to handle cases where a line begins or ends with a ; (otherwise the empty entry resulted in a pair like ";a").
private static void Main()
{
var data = File.ReadAllLines(@"f:\public\temp\temp.txt");
var pairCount = new Dictionary<string, int>();
foreach (var line in data)
{
var lineItems = line.Split(new[] {';'}, StringSplitOptions.RemoveEmptyEntries);
for (var outer = 0; outer < lineItems.Length - 1; outer++)
{
for (var inner = outer + 1; inner < lineItems.Length; inner++)
{
var outerComparedToInner = string.Compare(lineItems[outer],
lineItems[inner], StringComparison.Ordinal);
// If both items are the same character, ignore them and keep looping
if (outerComparedToInner == 0) continue;
// Create the pair such that the lower of the two
// values is first, so that "b;a" becomes "a;b"
var thisPair = outerComparedToInner < 0
? $"{lineItems[outer]};{lineItems[inner]}"
: $"{lineItems[inner]};{lineItems[outer]}";
if (pairCount.ContainsKey(thisPair))
{
pairCount[thisPair]++;
}
else
{
pairCount.Add(thisPair, 1);
}
}
}
}
Console.WriteLine("Pair\tCount\n----\t-----");
foreach (var val in pairCount.OrderBy(i => i.Key))
{
Console.WriteLine($"{val.Key}\t{val.Value}");
}
Console.Write("\nDone!\nPress any key to exit...");
Console.ReadKey();
}
Output
Given a file containing your sample data, the output is:
@mrmcgreg, finally, after changing the implementation to the ECLAT algorithm, everything runs in seconds instead of hours.
Basically for each unique tag, keep track of the LINE NUMBERS where those tags are found, and simply intersect the pair of list of numbers by combination pairs to get the count.
Dictionary<string, List<int>> uniqueTagList = new Dictionary<string, List<int>>();
foreach (var uniqueTag in uniquetags)
{
List<int> lineNumbers = new List<int>();
foreach (var item in data.Select((value, i) => new { i, value }))
{
var value = item.value;
var index = item.i;
//split data into tags
var tags = value.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
foreach (var tag in tags)
{
if (uniqueTag == tag)
{
lineNumbers.Add(index);
}
}
}
//keep only tags that exceed the support threshold.
if (lineNumbers.Count > 5)
{
uniqueTagList.Add(uniqueTag, lineNumbers);
}
}
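The intersection step described above is not shown in the snippet, so here is a rough sketch of how it could be done. Everything in it is an assumption for illustration: the class and method names are hypothetical, there is no support threshold, and a pair is counted once per line even if a tag repeats on that line.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairCounter
{
    // Hypothetical sketch: for each unordered pair of tags, counts the
    // number of lines that contain both tags.
    public static Dictionary<string, int> CountPairs(IEnumerable<string> lines)
    {
        // Map each tag to the set of line numbers it appears on.
        var linesByTag = new SortedDictionary<string, HashSet<int>>(StringComparer.Ordinal);
        var lineNumber = 0;
        foreach (var line in lines)
        {
            foreach (var tag in line.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
            {
                HashSet<int> numbers;
                if (!linesByTag.TryGetValue(tag, out numbers))
                {
                    numbers = new HashSet<int>();
                    linesByTag.Add(tag, numbers);
                }
                numbers.Add(lineNumber);
            }
            lineNumber++;
        }

        // Intersect the line-number sets of every pair of distinct tags.
        // The keys are sorted, so "b;a" can never appear next to "a;b".
        var tags = linesByTag.Keys.ToList();
        var counts = new Dictionary<string, int>();
        for (var i = 0; i < tags.Count - 1; i++)
        {
            for (var j = i + 1; j < tags.Count; j++)
            {
                var other = linesByTag[tags[j]];
                var shared = linesByTag[tags[i]].Count(n => other.Contains(n));
                if (shared > 0)
                {
                    counts.Add(tags[i] + ";" + tags[j], shared);
                }
            }
        }
        return counts;
    }
}
```

With the three sample lines from the question, this should report a;b = 2 and a;f = 2, matching the expected output listed there.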

Convert a txt file to dictionary<string, string>

I have a text file and I need to put all odd lines into the Dictionary keys and all even lines into the Dictionary values. What is the best solution to my problem?
int count_lines = 1;
Dictionary<string, string> stroka = new Dictionary<string, string>();
foreach (string line in ReadLineFromFile(readFile))
{
if (count_lines % 2 == 0)
{
stroka.Add Value
}
else
{
stroka.Add Key
}
count_lines++;
}
Try this:
var res = File
.ReadLines(pathToFile)
.Select((v, i) => new {Index = i, Value = v})
.GroupBy(p => p.Index / 2)
.ToDictionary(g => g.First().Value, g => g.Last().Value);
The idea is to group all lines by pairs. Each group will have exactly two items - the key as the first item, and the value as the second item.
Demo on ideone.
You probably want to do this:
var array = File.ReadAllLines(filename);
for(var i = 0; i + 1 < array.Length; i += 2)
{
stroka.Add(array[i + 1], array[i]);
}
This reads the file in steps of two instead of every line separately.
I suppose you wanted to use these pairs: (2,1), (4,3), ... . If not, please change this code to suit your needs.
You can read line by line and add to a Dictionary
public void TextFileToDictionary()
{
Dictionary<string, string> d = new Dictionary<string, string>();
using (var sr = new StreamReader("txttodictionary.txt"))
{
string line = null;
// while it reads a key
while ((line = sr.ReadLine()) != null)
{
// add the key and whatever it
// can read next as the value
d.Add(line, sr.ReadLine());
}
}
}
This way you will get a dictionary, and if the file has an odd number of lines, the last entry will have a null value.
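A variation on the line-by-line approach, sketched under a few assumptions: the helper name is hypothetical, it takes any sequence of lines (for example File.ReadLines(path)), a trailing key without a value is dropped rather than stored with null, and a repeated key overwrites the earlier value instead of throwing.

```csharp
using System.Collections.Generic;

public static class LinePairs
{
    // Hypothetical helper: odd lines (1st, 3rd, ...) become keys and
    // the line after each key becomes its value.
    public static Dictionary<string, string> ToDictionary(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, string>();
        string key = null;
        foreach (var line in lines)
        {
            if (key == null)
            {
                key = line;          // remember the key line
            }
            else
            {
                result[key] = line;  // pair it with the following line
                key = null;
            }
        }
        // A trailing key without a value is dropped (an assumption of this sketch).
        return result;
    }
}
```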
String fileName = @"c:\MyFile.txt";
Dictionary<string, string> stroka = new Dictionary<string, string>();
using (TextReader reader = new StreamReader(fileName)) {
String key = null;
Boolean isValue = false;
while (reader.Peek() >= 0) {
if (isValue)
stroka.Add(key, reader.ReadLine());
else
key = reader.ReadLine();
isValue = !isValue;
}
}

Proper way to iterate this?

I have a list of strings that are semicolon separated.
There will always be an even number because the first is the key, the next is the value,
ex:
name;Milo;site;stackoverflow;
So I split them:
var strList = settings.Split(';').ToList();
But now I would like to use a foreach loop to put these into a List<ListItem>
I am wondering if it can be done via iteration, or if I have to use a value 'i' to get [i] and [i+1]
It can be done with LINQ but I am not sure this one is better
var dict = input.Split(';')
.Select((s, i) => new { s, i })
.GroupBy(x => x.i / 2)
.ToDictionary(x => x.First().s, x => x.Last().s);
You can also use moreLinq's Batch for this
var dict2 = input.Split(';')
.Batch(2)
.ToDictionary(x=>x.First(),x=>x.Last());
I can't compile this, but this should work for you:
var list = new List<ListItem>();
for (int i = 0; i < strList.Count; i++)
{
i++;
var li = new ListItem(strList[i - 1], strList[i]);
list.Add(li);
}
Again, I'm not in a position to fully recreate your environment, but since the first is the key and the second is the value, and you're sure of the state of the string, it's a pretty easy algorithm.
However, leveraging a foreach loop would still require you to know a bit more about the index so it's a little more straight forward with a basic for loop.
First, a valuable helper function I use. It is similar to GroupBy except it groups by sequential indexes rather than some key.
public static IEnumerable<List<T>> GroupSequential<T>(this IEnumerable<T> source, int groupSize, bool includePartialGroups = true)
{
if (groupSize < 1)
throw new ArgumentOutOfRangeException("groupSize", groupSize, "Must have groupSize >= 1.");
var group = new List<T>(groupSize);
foreach (var item in source)
{
group.Add(item);
if (group.Count == groupSize)
{
yield return group;
group = new List<T>(groupSize);
}
}
if (group.Any() && (includePartialGroups || group.Count == groupSize))
yield return group;
}
Now you can simply do
var listItems = settings.Split(';')
.GroupSequential(2, false)
.Select(group => new ListItem { Key = group[0], Value = group[1] })
.ToList();
If you want to use foreach:
string key=string.Empty;
string value=string.Empty;
bool isStartsWithKey=true;
var strList = settings.Split(';').ToList();
foreach(var item in strList)
{
if(isStartsWithKey)
{
key=item;
}
else
{
value=item;
//TODO: now you can use key and value
}
isStartsWithKey=!isStartsWithKey;
}
var yourlist = new List<ListItem>();
for(int i=0;i<strList.Count/2;i++)
{
yourlist.Add(new ListItem(strList[i*2], strList[i*2+1]));
}
This seems to me to be the simplest way:
for(var i = 0; i < strList.Count(); i = i + 2){
var li = new ListItem(strList[i], strList[i + 1]);
listToAdd.Add(li);
}
Updated Example
for (var i = 0; i < strList.Count(); i = i + 2){
// Only add a pair when both the key and the value exist.
if (i + 1 < strList.Count()){
listToAdd.Add(new ListItem(strList[i], strList[i + 1]));
}
}

Split and join multiple logical "branches" of string data

I know there's a couple similarly worded questions on SO about permutation listing, but they don't seem to be quite addressing really what I'm looking for. I know there's a way to do this but I'm drawing a blank. I have a flat file that resembles this format:
Col1|Col2|Col3|Col4|Col5|Col6
a|b,c,d|e|f|g,h|i
. . .
Now here's the trick: I want to create a list of all possible permutations of these rows, where a comma-separated list in the row represents possible values. For example, I should be able to take an IEnumerable<string> representing the above row as such:
IEnumerable<string> row = new string[] { "a", "b,c,d", "e", "f", "g,h", "i" };
IEnumerable<string> permutations = GetPermutations(row, delimiter: "/");
This should generate the following collection of string data:
a/b/e/f/g/i
a/b/e/f/h/i
a/c/e/f/g/i
a/c/e/f/h/i
a/d/e/f/g/i
a/d/e/f/h/i
This to me seems like it would elegantly fit into a recursive method, but apparently I have a bad case of the Mondays and I can't quite wrap my brain around how to approach it. Some help would be greatly appreciated. What should GetPermutations(IEnumerable<string>, string) look like?
You had me at "recursive". Here's another suggestion:
private IEnumerable<string> GetPermutations(string[] row, string delimiter,
int colIndex = 0, string[] currentPerm = null)
{
//First-time initialization:
if (currentPerm == null) { currentPerm = new string[row.Length]; }
var values = row[colIndex].Split(',');
foreach (var val in values)
{
//Update the current permutation with this column's next possible value..
currentPerm[colIndex] = val;
//..and find values for the remaining columns..
if (colIndex < (row.Length - 1))
{
foreach (var perm in GetPermutations(row, delimiter, colIndex + 1, currentPerm))
{
yield return perm;
}
}
//..unless we've reached the last column, in which case we create a complete string:
else
{
yield return string.Join(delimiter, currentPerm);
}
}
}
I'm not sure whether this is the most elegant approach, but it might get you started.
private static IEnumerable<string> GetPermutations(IEnumerable<string> row,
string delimiter = "|")
{
var separator = new[] { ',' };
var permutations = new List<string>();
foreach (var cell in row)
{
var parts = cell.Split(separator);
var perms = permutations.ToArray();
permutations.Clear();
foreach (var part in parts)
{
if (perms.Length == 0)
{
permutations.Add(part);
continue;
}
foreach (var perm in perms)
{
permutations.Add(string.Concat(perm, delimiter, part));
}
}
}
return permutations;
}
Of course, if the order of the permutations is important, you can add an .OrderBy() at the end.
Edit: added an alternative
You could also build a list of string arrays, by calculating some numbers before determining the permutations.
private static IEnumerable<string> GetPermutations(IEnumerable<string> row,
string delimiter = "|")
{
var permutationGroups = row.Select(o => o.Split(new[] { ',' })).ToArray();
var numberOfGroups = permutationGroups.Length;
var numberOfPermutations =
permutationGroups.Aggregate(1, (current, pg) => current * pg.Length);
var permutations = new List<string[]>(numberOfPermutations);
for (var n = 0; n < numberOfPermutations; n++)
{
permutations.Add(new string[numberOfGroups]);
}
// Number of consecutive rows that share one value at the current position.
var repeat = 1;
for (var position = numberOfGroups - 1; position >= 0; position--)
{
var permutationGroup = permutationGroups[position];
var numberOfCharacters = permutationGroup.Length;
// Cycle through this group's values, holding each one for 'repeat' rows,
// so every combination with the later positions appears exactly once.
for (var n = 0; n < numberOfPermutations; n++)
{
permutations[n][position] = permutationGroup[(n / repeat) % numberOfCharacters];
}
repeat *= numberOfCharacters;
}
return permutations.Select(p => string.Join(delimiter, p));
}
One algorithm you can use is basically like counting:
Start with the 0th item in each list (00000)
Increment the last value (00001, 00002 etc.)
When you can't increase one value, reset it and increment the next (00009, 00010, 00011 etc.)
When you can't increase any value, you're done.
Function:
static IEnumerable<string> Permutations(
string input,
char separator1, char separator2,
string delimiter)
{
var enumerators = input.Split(separator1)
.Select(s => s.Split(separator2).GetEnumerator()).ToArray();
if (!enumerators.All(e => e.MoveNext())) yield break;
while (true)
{
yield return String.Join(delimiter, enumerators.Select(e => e.Current));
if (enumerators.Reverse().All(e => {
bool finished = !e.MoveNext();
if (finished)
{
e.Reset();
e.MoveNext();
}
return finished;
}))
yield break;
}
}
Usage:
foreach (var perm in Permutations("a|b,c,d|e|f|g,h|i", '|', ',', "/"))
{
Console.WriteLine(perm);
}
I really thought this would be a great recursive function, but I ended up not writing it that way. Ultimately, this is the code I created:
public IEnumerable<string> GetPermutations(IEnumerable<string> possibleCombos, string delimiter)
{
var permutations = new Dictionary<int, List<string>>();
var comboArray = possibleCombos.ToArray();
var splitCharArr = new char[] { ',' };
permutations[0] = new List<string>();
permutations[0].AddRange(
possibleCombos
.First()
.Split(splitCharArr)
.Where(x => !string.IsNullOrEmpty(x.Trim()))
.Select(x => x.Trim()));
for (int i = 1; i < comboArray.Length; i++)
{
permutations[i] = new List<string>();
foreach (var permutation in permutations[i - 1])
{
permutations[i].AddRange(
comboArray[i].Split(splitCharArr)
.Where(x => !string.IsNullOrEmpty(x.Trim()))
.Select(x => string.Format("{0}{1}{2}", permutation, delimiter, x.Trim()))
);
}
}
return permutations[permutations.Keys.Max()];
}
... my test conditions provided me with exactly the output I expected:
IEnumerable<string> row = new string[] { "a", "b,c,d", "e", "f", "g,h", "i" };
IEnumerable<string> permutations = GetPermutations(row, delimiter: "/");
foreach(var permutation in permutations)
{
Debug.Print(permutation);
}
This produced the following output:
a/b/e/f/g/i
a/b/e/f/h/i
a/c/e/f/g/i
a/c/e/f/h/i
a/d/e/f/g/i
a/d/e/f/h/i
Thanks to everyone's suggestions, they really were helpful in sorting out what needed to be done in my mind. I've upvoted all your answers.
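The same expansion can also be written as a cartesian product with LINQ's Aggregate. This is only a sketch under stated assumptions: the class name is invented, cells are split on ',' without trimming, and empty entries are kept as they are.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RowExpander
{
    // Hypothetical helper: expands comma-separated cells into every combination.
    public static IEnumerable<string> GetPermutations(IEnumerable<string> row, string delimiter)
    {
        // Start with a single empty combination and extend it column by column.
        IEnumerable<IEnumerable<string>> seed = new[] { Enumerable.Empty<string>() };
        var combinations = row
            .Select(cell => cell.Split(','))
            .Aggregate(seed, (acc, values) =>
                acc.SelectMany(prefix => values.Select(v => prefix.Concat(new[] { v }))));
        return combinations.Select(parts => string.Join(delimiter, parts));
    }
}
```

Because each existing prefix is extended with the values of the next column in turn, the results come out in the same order as the list in the question.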
