Searching through keys dictionary - c#

How can I search through the keys of a Dictionary with a for loop (or something like it) and check whether any key starts with the same first three characters as another string? The following example isn't real code, but it is basically the result I want.
Key1(3932030)
Key2(4201230)
Key3(5209872)
ArrayWithKeys(3930000,4200000,5200000)
Dictionary searchForkeys(ArrayWithKeys[i])
keyFound(3932030)

First, get the substring to search for, then use it to find matching keys in the dictionary object.
string[] keyArray = new string[]{ "3930000", "4200000" , "5200000"};
string substringToSearch ;
foreach(string inputKey in keyArray)
{
substringToSearch = inputKey.Length >= 3 ? inputKey.Substring(0, 3) : inputKey;
if(dictionaryObject.Keys.Any(x => x.StartsWith(substringToSearch)))
{
// below is the value stored under the first key that matches inputKey
var matchedValue = dictionaryObject.Where(x => x.Key.StartsWith(substringToSearch)).First().Value;
}
}
EDIT
Using only for loop
string substringToSearch = inputKey.Length >= 3 ? inputKey.Substring(0, 3) : inputKey;
for(int i = 0; i < dictionaryObject.Keys.Count; i++)
{
if( dictionaryObject.ElementAt(i).Key.StartsWith(substringToSearch) )
{
// key matched with inputKey
// below is key
string keyStr = dictionaryObject.ElementAt(i).Key;
}
}
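If many probe strings have to be checked, the dictionary keys can be grouped by prefix once instead of being scanned for every probe. The following is only a sketch under assumptions: the names PrefixSearch, FindKeysByPrefix and Prefix are made up for illustration, and every key and probe is assumed to have at least three characters (shorter strings are compared whole, which differs slightly from the StartsWith version above).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixSearch
{
    // Returns every dictionary key whose first prefixLength characters
    // equal the first prefixLength characters of one of the probes.
    public static List<string> FindKeysByPrefix(
        Dictionary<string, string> source, IEnumerable<string> probes, int prefixLength = 3)
    {
        // Group the keys once by prefix so each probe is a single lookup.
        ILookup<string, string> keysByPrefix = source.Keys.ToLookup(
            key => Prefix(key, prefixLength), StringComparer.Ordinal);

        var found = new List<string>();
        foreach (string probe in probes)
        {
            // An ILookup yields an empty sequence for an unknown prefix.
            found.AddRange(keysByPrefix[Prefix(probe, prefixLength)]);
        }
        return found;
    }

    // Hypothetical helper: the first length characters, or the whole string if shorter.
    private static string Prefix(string s, int length) =>
        s.Length >= length ? s.Substring(0, length) : s;
}
```

With the keys from the question, probing with "3930000" would return "3932030". Two probes that share a prefix return the same key twice; add Distinct() if that matters.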

Related

How to replace multiple substrings in a string in C#?

I have to replace multiple substrings from a string (max length 32 of input string). I have a big dictionary which can have millions of items as a key-value pair. I need to check for each word if this word is present in the dictionary and replace with the respective value if present in the dictionary. The input string can have multiple trailing spaces.
This method is called millions of times, and that is hurting performance badly.
Is there any scope for optimizing this code, or is there a better way to do this?
public static string RandomValueCompositeField(object objInput, Dictionary<string, string> g_rawValueRandomValueMapping) {
if (objInput == null)
return null;
string input = objInput.ToString();
if (input == "")
return input;
//List<string> ls = new List<string>();
int count = WhiteSpaceAtEnd(input);
foreach (string data in input.Substring(0, input.Length - count).Split(' ')) {
try {
string value;
gs_dictRawValueRandomValueMapping.TryGetValue(data, out value);
if (value != null) {
//ls.Add(value.TrimEnd());
input = input.Replace(data, value);
}
else {
//ls.Add(data);
}
}
catch(Exception ex) {
}
}
//if (count > 0)
// input = input + new string(' ', count);
//ls.Add(new string(' ', count));
return input;
}
EDIT:
I missed one important thing in the question: a substring can occur only once in the input string, and a dictionary key and its value always have the same number of characters.
Here's a method that will take an input string and will build a new string by finding "words" (any consecutive non-whitespace) and then checking if that word is in a dictionary and replacing it with the corresponding value if found. This will fix the issues of Replace doing replacements on "sub-words" (if you have "hello hell" and you want to replace "hell" with "heaven" and you don't want it to give you "heaveno heaven"). It also fixes the issue of swapping. For example if you want to replace "yes" with "no" and "no" with "yes" in "yes no" you don't want it to first turn that into "no no" and then into "yes yes".
public string ReplaceWords(string input, Dictionary<string, string> replacements)
{
var builder = new StringBuilder();
int wordStart = -1;
int wordLength = 0;
for(int i = 0; i < input.Length; i++)
{
// If the current character is white space check if we have a word to replace
if(char.IsWhiteSpace(input[i]))
{
// If wordStart is not -1 then we have hit the end of a word
if(wordStart >= 0)
{
// get the word and look it up in the dictionary
// if found use the replacement, if not keep the word.
var word = input.Substring(wordStart, wordLength);
if(replacements.TryGetValue(word, out var replace))
{
builder.Append(replace);
}
else
{
builder.Append(word);
}
}
// Make sure to reset the start and length
wordStart = -1;
wordLength = 0;
// append whatever whitespace was found.
builder.Append(input[i]);
}
// If this isn't whitespace we set wordStart if it isn't already set
// and just increment the length.
else
{
if(wordStart == -1) wordStart = i;
wordLength++;
}
}
// If wordStart is not -1 then we have a trailing word we need to check.
if(wordStart >= 0)
{
var word = input.Substring(wordStart, wordLength);
if(replacements.TryGetValue(word, out var replace))
{
builder.Append(replace);
}
else
{
builder.Append(word);
}
}
return builder.ToString();
}
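The same word-by-word idea can be written more compactly with a regular expression. This is a sketch of an alternative technique, not the method above: the class name WordReplacer is made up, and it will probably be slower than the hand-written loop, so measure before using it in a hot path.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordReplacer
{
    // Matches each run of non-whitespace characters, i.e. one "word".
    private static readonly Regex Word = new Regex(@"\S+", RegexOptions.Compiled);

    public static string ReplaceWords(string input, Dictionary<string, string> replacements)
    {
        // Every word is looked up exactly once against the original input,
        // so a replacement is never replaced again and whitespace is kept as-is.
        return Word.Replace(input, match =>
            replacements.TryGetValue(match.Value, out string replacement)
                ? replacement
                : match.Value);
    }
}
```

With a dictionary mapping "yes" to "no" and "no" to "yes", the input "yes no" comes back as "no yes"; with "hell" mapped to "heaven", "hello hell" comes back as "hello heaven".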

Best way to extract string from another string based on collection

I have str, which is a string, and I want to check whether the last part of it is equal to another string. Below I do it manually, but let's say I have an array string[] keys = {"From", "To", ...}. If the end of str matches one of them, I want to extract (remove) it from str and put it in key. What is the best way to achieve that?
string key;
if(str.Substring(str.Length - 4) == "From") {
key = "From";
//Do something with key
}
else if (str.Substring(str.Length - 2) == "To") {
key = "To";
//Do something with key
}
... //There may be more string to compare with
str = str.Remove(str.Length - key.Length);
You can just use FirstOrDefault and EndsWith. This will either give you the key it ends with or null. You'll have to include the using System.Linq for this to work.
string key = keys.FirstOrDefault(k => str.EndsWith(k));
if(key != null)
{
str = str.Remove(str.Length - key.Length);
}
Use a foreach loop to iterate over your keys, then EndsWith() to detect a match and Substring() to extract the rest:
foreach(string key in keys)
{
if(str.EndsWith(key))
{
int len = str.Length - key.Length;
result = str.Substring(0, len);
break;
}
}
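Both answers take the first key that matches. If one key can be the tail of another (say "o" and "To"), the result depends on the order of the array. The sketch below prefers the longest match and uses an ordinal comparison; the names SuffixKeys and SplitTrailingKey are made up for illustration, and whether longest-match is what you want is an assumption.

```csharp
using System;
using System.Linq;

public static class SuffixKeys
{
    // Splits str into (Rest, Key), where Key is the longest entry of keys
    // that str ends with. Key is null and Rest is str when nothing matches.
    public static (string Rest, string Key) SplitTrailingKey(string str, string[] keys)
    {
        string key = keys
            .Where(k => k.Length > 0 && str.EndsWith(k, StringComparison.Ordinal))
            .OrderByDescending(k => k.Length)
            .FirstOrDefault();

        if (key == null)
            return (str, null);

        return (str.Substring(0, str.Length - key.Length), key);
    }
}
```

For example, SplitTrailingKey("DateFrom", new[] { "From", "To" }) gives Rest "Date" and Key "From".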

Parse proprietary return string from a server

I need to parse a proprietary string from a TCP server.
The string I get is the following:
!re.tag=3=.id=*1=name=1 Hour=owner=admin=name-for-users==validity=3h=starts-at=logon=price=0=override-shared-users=off~!re.tag=3=.id=*2=name=3 Hour=owner=admin=name-for-users==validity=3h=starts-at=logon=price=0=override-shared-users=off~!done.tag=3~
So after stripping off the !done.tag.... part and splitting the string at the ~, I can break it down into (in this case) two objects:
!re.tag=3=.id=*1=name=1 Hour=owner=admin=name-for-users==validity=3h=starts-at=logon=price=0=override-shared-users=off~
!re.tag=3=.id=*2=name=3 Hour=owner=admin=name-for-users==validity=3h=starts-at=logon=price=0=override-shared-users=off~
Then I'm facing the problem of how to split the properties and their values.
!re.tag=3
=.id=*2
=name=3 Hour
=owner=admin
=name-for-users=
=validity=3h
=starts-at=logon
=price=0
=override-shared-users=off
Normally I'd do a split on the equals sign, like this:
List<string> arProfiles = profilString.Split('=').ToList();
and then I can guess(!) that the value of the "name" property is at position 5.
Is there a more proper way to parse these kinds of strings (especially since I'll get the same kind of strings from different functions)?
Paul
//so. we've got the response here
var response = "!re.tag=3=.id=*1=name=1 Hour=owner=admin=name-for-users==validity=3h=starts-at=logon=price=0=override-shared-users=off~!re.tag=3=.id=*2=name=3 Hour=owner=admin=name-for-users==validity=3h=starts-at=logon=price=0=override-shared-users=off~!done.tag=3~";
// first we split the line into sections
var sections = Regex.Matches(response, @"!(?<set>.*?)~").Cast<Match>().Select(s=>s.Groups["set"].Value).ToArray();
// next we can parse any section into key/value pairs
var parsed = Regex.Matches(sections[0], @"(?<key>.*?)=(?<value>[^=]*)=?").Cast<Match>()
.Select(pair => new
{
key = pair.Groups["key"].Value,
value = pair.Groups["value"].Value,
}).ToArray();
Don't forget
using System.Text.RegularExpressions;
Seems like each of the values (not the parameter names) are surrounded by a pair of "=".
This should give you what you want, more or less:
var input = "!re.tag=3=.id=*1=name=1 Hour=(...etc...)";
Dictionary<string, string> values = new Dictionary<string, string>();
while(input.Count() > 0){
var keyChars = input.TakeWhile(x=> x != '=');
var currTag = new string(keyChars.ToArray());
var valueChars = input.Skip(currTag.Count() + 1).TakeWhile(x=> x != '=');
var value = new string(valueChars.ToArray());
values.Add(currTag, value);
input = new string(input.Skip(currTag.Length + value.Length + 2)
.ToArray());
}
This results in the following keys and values:
!re.tag | 3
.id | *1
name | 1 Hour
owner | admin
name-for-users |
validity | 3h
starts-at | logon
price | 0
override-shared-users | off~
Each parameter name starts and ends with an '=' symbol. That means you need to scan the string for the first name between two '=' symbols. Whatever comes after that, up to the next '=' symbol or the end of the string, is the value of that property. A property may have an empty value, so that must be handled as well.
The first part of the string is different:
!re.tag=3
You'll have to remove or process it individually.
Way to parse it would be:
var inString = @"=.id=*1=name=1 Hour=owner=admin=name-for-users==validity=3h=starts-at=logon=price=0=override-shared-users=off~";
int startOfParameterName = 0;
int endOfParameterName = 0;
int startOfParameterValue = 0;
bool paramerNameEndFound = false;
bool paramerNameStartFound = false;
var arProfiles = new Dictionary<string, string>();
for(int index = 0; index < inString.Length; index++)
{
if (inString[index] == '=' || index == inString.Length - 1)
{
if (paramerNameEndFound || index == inString.Length - 1)
{
var parameterName = inString.Substring(startOfParameterName, endOfParameterName - startOfParameterName);
var parameterValue = startOfParameterValue == index ? string.Empty : inString.Substring(startOfParameterValue, index - startOfParameterValue);
arProfiles.Add(parameterName, parameterValue);
startOfParameterName = index + 1;
paramerNameEndFound = false;
paramerNameStartFound = true;
}
else
{
if (paramerNameStartFound == false)
{
paramerNameStartFound = true;
startOfParameterName = index + 1;
}
else
{
paramerNameEndFound = true;
endOfParameterName = index;
startOfParameterValue = index + 1;
}
}
}
}
There is room for improvement, but it works!
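For the sample response in the question, a plain Split on '=' also works without guessing positions, because the tokens strictly alternate between name and value and an empty value shows up as an empty token. This is only a sketch under that assumption: it breaks if a value can itself contain '=' or '~', and the names ReplyParser and Parse are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ReplyParser
{
    // Parses "!re.tag=3=.id=*1=name=1 Hour=...~!done.tag=3~" into one
    // dictionary per "!re" sentence.
    public static List<Dictionary<string, string>> Parse(string response)
    {
        var records = new List<Dictionary<string, string>>();
        foreach (string sentence in response.Split(new[] { '~' }, StringSplitOptions.RemoveEmptyEntries))
        {
            if (!sentence.StartsWith("!re", StringComparison.Ordinal))
                continue; // skip the trailing "!done" sentence

            // Tokens alternate name, value, name, value, ...
            // "name-for-users==" produces an empty value token.
            string[] tokens = sentence.Split('=');
            var record = new Dictionary<string, string>();
            for (int i = 0; i + 1 < tokens.Length; i += 2)
            {
                record[tokens[i]] = tokens[i + 1];
            }
            records.Add(record);
        }
        return records;
    }
}
```

After parsing, the name of the first profile is available as records[0]["name"] instead of by position.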

Find and replace text in a string using C#

Anyone know how I would find & replace text in a string? Basically I have two strings:
string firstS = "/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDABQODxIPDRQSERIXFhQYHzMhHxwcHz8tLyUzSkFOTUlBSEZSXHZkUldvWEZIZoxob3p9hIWET2ORm4+AmnaBhH//2wBDARYXFx8bHzwhITx/VEhUf39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f3//";
string secondS = "abcdefg2wBDABQODxIPDRQSERIXFh/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/abcdefg";
I want to search firstS to see if it contains any sequence of characters that's in secondS and then replace it. It also needs to be replaced with the number of replaced characters in squared brackets:
[NUMBER-OF-CHARACTERS-REPLACED]
For example, because firstS and secondS both contain "2wBDABQODxIPDRQSERIXFh" and "/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/" they would need to be replaced. So then firstS becomes:
string firstS = "/9j/4AAQSkZJRgABAQEAYABgAAD/[22]QYHzMhHxwcHz8tLyUzSkFOTUlBSEZSXHZkUldvWEZIZoxob3p9hIWET2ORm4+AmnaBhH//2wBDARYXFx8bHzwhITx/VEhUf39[61]f3//";
Hope that makes sense. I think I could do this with Regex, but I don't like the inefficiency of it. Does anyone know of another, faster way?
Does anyone know of another, faster way?
Yes, this problem actually has a proper name. It is called the Longest Common Substring, and it has a reasonably fast solution.
Here is an implementation on ideone. It finds and replaces all common substrings of ten characters or longer.
// This comes straight from Wikipedia article linked above:
private static string FindLcs(string s, string t) {
var L = new int[s.Length, t.Length];
var z = 0;
var ret = new StringBuilder();
for (var i = 0 ; i != s.Length ; i++) {
for (var j = 0 ; j != t.Length ; j++) {
if (s[i] == t[j]) {
if (i == 0 || j == 0) {
L[i,j] = 1;
} else {
L[i,j] = L[i-1,j-1] + 1;
}
if (L[i,j] > z) {
z = L[i,j];
ret = new StringBuilder();
}
if (L[i,j] == z) {
ret.Append(s.Substring( i-z+1, z));
}
} else {
L[i,j]=0;
}
}
}
return ret.ToString();
}
// With the LCS in hand, building the answer is easy
public static string CutLcs(string s, string t) {
for (;;) {
var lcs = FindLcs(s, t);
if (lcs.Length < 10) break;
s = s.Replace(lcs, string.Format("[{0}]", lcs.Length));
}
return s;
}
You need to be very careful to distinguish between "longest common substring" and "longest common subsequence".
For Substring: http://en.wikipedia.org/wiki/Longest_common_substring_problem
For SubSequence: http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
I would also suggest watching a few videos on YouTube about these two topics:
http://www.youtube.com/results?search_query=longest+common+substring&oq=longest+common+substring&gs_l=youtube.3..0.3834.10362.0.10546.28.17.2.9.9.2.225.1425.11j3j3.17.0...0.0...1ac.lSrzx8rr1kQ
http://www.youtube.com/results?search_query=longest+common+subsequence&oq=longest+common+s&gs_l=youtube.3.0.0l6.2968.7905.0.9132.20.14.2.4.4.0.224.2038.5j2j7.14.0...0.0...1ac.4CYZ1x50zpc
You can find C# implementations of longest common subsequence here:
http://www.alexandre-gomes.com/?p=177
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_subsequence
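To make the difference concrete, here is a small sketch that computes only the lengths of both results with the usual dynamic-programming tables. The class and method names are made up for illustration; a substring must be contiguous, while a subsequence only has to keep the characters in order.

```csharp
using System;

public static class CommonLengths
{
    // Length of the longest run of characters that appears contiguously in both strings.
    public static int LongestCommonSubstring(string s, string t)
    {
        var table = new int[s.Length + 1, t.Length + 1];
        int best = 0;
        for (int i = 1; i <= s.Length; i++)
        {
            for (int j = 1; j <= t.Length; j++)
            {
                if (s[i - 1] == t[j - 1])
                {
                    // Extend the common run that ended just before these two characters.
                    table[i, j] = table[i - 1, j - 1] + 1;
                    best = Math.Max(best, table[i, j]);
                }
            }
        }
        return best;
    }

    // Length of the longest sequence of characters that appears in both strings
    // in the same order, with gaps allowed.
    public static int LongestCommonSubsequence(string s, string t)
    {
        var table = new int[s.Length + 1, t.Length + 1];
        for (int i = 1; i <= s.Length; i++)
        {
            for (int j = 1; j <= t.Length; j++)
            {
                table[i, j] = s[i - 1] == t[j - 1]
                    ? table[i - 1, j - 1] + 1
                    : Math.Max(table[i - 1, j], table[i, j - 1]);
            }
        }
        return table[s.Length, t.Length];
    }
}
```

For "abcde" and "abXde" the longest common substring has length 2 ("ab" or "de"), while the longest common subsequence has length 4 ("abde").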
I have a similar issue, but for word occurrences! so, I hope this can help. I used SortedDictionary and a binary search tree
/* Application counts the number of occurrences of each word in a string
and stores them in a generic sorted dictionary. */
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
public class SortedDictionaryTest
{
public static void Main( string[] args )
{
// create sorted dictionary
SortedDictionary< string, int > dictionary = CollectWords();
// display sorted dictionary content
DisplayDictionary( dictionary );
}
// create sorted dictionary
private static SortedDictionary< string, int > CollectWords()
{
// create a new sorted dictionary
SortedDictionary< string, int > dictionary =
new SortedDictionary< string, int >();
Console.WriteLine( "Enter a string: " ); // prompt for user input
string input = Console.ReadLine();
// split input text into tokens
string[] words = Regex.Split( input, @"\s+" );
// processing input words
foreach ( var word in words )
{
string wordKey = word.ToLower(); // get word in lowercase
// if the dictionary contains the word
if ( dictionary.ContainsKey( wordKey ) )
{
++dictionary[ wordKey ];
}
else
// add new word with a count of 1 to the dictionary
dictionary.Add( wordKey, 1 );
}
return dictionary;
}
// display dictionary content
private static void DisplayDictionary< K, V >(
SortedDictionary< K, V > dictionary )
{
Console.WriteLine( "\nSorted dictionary contains:\n{0,-12}{1,-12}",
"Key:", "Value:" );
/* generate output for each key in the sorted dictionary
by iterating through the Keys property with a foreach statement*/
foreach ( K key in dictionary.Keys )
Console.WriteLine( "{0,-12}{1,-12}", key, dictionary[ key ] );
Console.WriteLine( "\nsize: {0}", dictionary.Count );
}
}
This is probably dog slow, but if you're willing to incur some technical debt and need something now for prototyping, you could use LINQ.
string firstS = "123abc";
string secondS = "456cdeabc123";
int minLength = 3;
var result =
from subStrCount in Enumerable.Range(0, firstS.Length)
where firstS.Length - subStrCount >= minLength
let subStr = firstS.Substring(subStrCount, minLength)
where secondS.Contains(subStr)
select secondS.Replace(subStr, "[" + subStr.Length + "]");
Results in
456cdeabc[3]
456cde[3]123

How to make these 2 methods more Efficient [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
Hi guys this is my first question ever on SO so please go easy on me.
I am playing with Lambda/LINQ while building myself few utility methods.
First method takes string like,
"AdnanRazaBhatti"
and breaks it up like ,
"Adnan Raza Bhatti"
The second method takes a string like the first method and also takes
out String[] brokenResults
and returns broken string like the first method as well as fill up brokenResults array as follows.
"Adnan" "Raza" "Bhatti"
Questions:
A. Can you please suggest how to make these methods more efficient?
B. When I try to use StringBuilder, it tells me that extension methods like Where and Select do not exist for the StringBuilder class. Why is that? The indexer does work on StringBuilder to get the characters, e.g. StringBuilder s = new StringBuilder("Dang"); char c = s[0]; Here c will be 'D'.
Code
Method 1:
public static string SplitCapital( string source )
{
string result = "";
int i = 0;
//Separate all the Capital Letter
var charUpper = source.Where( x => char.IsUpper( x ) ).ToArray<char>( );
//If there is only one Capital letter then it is already atomic.
if ( charUpper.Count( ) > 1 ) {
var strLower = source.Split( charUpper );
foreach ( string s in strLower )
if ( i < strLower.Count( ) - 1 && !String.IsNullOrEmpty( s ) )
result += charUpper.ElementAt( i++ ) + s + " ";
return result;
}
return source;
}
Method 2:
public static string SplitCapital( string source, out string[] brokenResults )
{
string result = "";
int i = 0;
var strUpper = source.Where( x => char.IsUpper( x ) ).ToArray<char>( );
if ( strUpper.Count( ) > 1 ) {
var strLower = source.Split( strUpper );
brokenResults = (
from s in strLower
where i < strLower.Count( ) - 1 && !String.IsNullOrEmpty( s )
select result = strUpper.ElementAt( i++ ) + s + " " ).ToArray( );
result = "";
foreach ( string s in brokenResults )
result += s;
return result;
}
else { brokenResults = new string[] { source }; }
return source;
}
Note:
I am planning to use these utility methods to break up the table column names I get from my database.
For Example if column name is "BooksId" I will break it up using one of these methods as "Books Id" programmatically, I know there are other ways or renaming the column names like in design window or [dataset].[tableName].HeadersRow.Cells[0].Text = "Books Id" but I am also planning to use this method somewhere else in the future.
Thanks
you can use the following extension methods to split your string based on Capital letters:
public static string Wordify(this string camelCaseWord)
{
/* CamelCaseWord will become Camel Case Word,
if the word is all upper, just return it*/
if (!Regex.IsMatch(camelCaseWord, "[a-z]"))
return camelCaseWord;
return string.Join(" ", Regex.Split(camelCaseWord, @"(?<!^)(?=[A-Z])"));
}
To split a string in a string array, you can use this:
public static string[] SplitOnVal(this string text,string value)
{
return text.Split(new[] { value }, StringSplitOptions.None);
}
If we take your example for consideration, the code will be as follows:
string strTest = "AdnanRazaBhatti";
var capitalCase = strTest.Wordify(); //Adnan Raza Bhatti
var brokenResults = capitalCase.SplitOnVal(" "); //split on the space into an array
Check this code
public static string SeperateCamelCase(this string value)
{
return Regex.Replace(value, "((?<=[a-z])[A-Z]|[A-Z](?=[a-z]))", " $1");
}
Hope this answer helps you. If you find solution kindly mark my answer and point it up.
Looks to me like regular expressions is the way to go.
I think [A-Z][a-z]+ might be a good one to start with.
Updated version. String builder was used to reduce memory utilization.
string SplitCapital(string str)
{
//Search all capital letters and store indexes
var indexes = str
.Select((c, i) => new { c = c, i = i }) // Select information about char and position
.Where(c => Char.IsUpper(c.c)) // Get only capital chars
.Select(cl => cl.i); // Get indexes of capital chars
// If no indexes were found, or if the index count equals the source string length, return the source string
if (!indexes.Any() || indexes.Count() == str.Length)
{
return str;
}
// Create string builder from the source string
var sb = new StringBuilder(str);
// Reverse indexes and remove 0 if necessary
foreach (var index in indexes.Reverse().Where(i => i != 0))
{
// Insert spaces before capital letter
sb.Insert(index, ' ');
}
return sb.ToString();
}
string SplitCapital(string str, out string[] parts)
{
var splitted = SplitCapital(str);
parts = splitted.Split(new[] { ' ' }, StringSplitOptions.None);
return splitted;
}
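On question B: StringBuilder has an indexer but does not implement IEnumerable<char>, and the LINQ extension methods such as Where and Select are defined on IEnumerable<T>, so they are not offered for it. One way around this is a small adapter; the sketch below is only an illustration, and the names StringBuilderExtensions and Chars are made up.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class StringBuilderExtensions
{
    // Exposes the characters of a StringBuilder as a sequence so that
    // LINQ operators such as Where and Select can be applied to it.
    public static IEnumerable<char> Chars(this StringBuilder builder)
    {
        for (int i = 0; i < builder.Length; i++)
            yield return builder[i];
    }
}
```

With this in place, new StringBuilder("AdnanRaza").Chars().Count(c => char.IsUpper(c)) compiles and counts the capital letters. Calling ToString() and querying the resulting string is the other simple option.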
