I want to replace characters in the string contents of a file.
In the dictionary below, each key is an unwanted character that I need to replace with the corresponding value.
Dictionary<string, string> unwantedCharacters = new Dictionary<string, string>();
unwantedCharacters["É"] = "#";
unwantedCharacters["Ä"] = "[";
unwantedCharacters["Ö"] = "\\";
unwantedCharacters["Å"] = "]";
unwantedCharacters["Ü"] = "^";
unwantedCharacters["é"] = "`";
unwantedCharacters["ä"] = "{";
unwantedCharacters["ö"] = "|";
unwantedCharacters["å"] = "}";
unwantedCharacters["ü"] = "~";
Here is the code I am currently using. It feels like it takes too much execution time.
for (int index = 0; index < fileContents.Length; index++)
{
foreach (KeyValuePair<string, string> item in unwantedCharacters)
{
if (fileContents.IndexOf(item.Key) > -1)
{
fileContents = fileContents.Replace(item.Key, item.Value); // Replacing straight characters
}
}
}
That is, I am looping at two levels. Is there another way to implement this? Any help will be appreciated.
Since you're not modifying the length of the string, if you make unwantedCharacters a Dictionary<char, char> rather than <string, string>, you can do the following:
var charArray = fileContents.ToCharArray();
for (int i = 0; i < charArray.Length; i++)
{
char replacement;
if (unwantedCharacters.TryGetValue(charArray[i], out replacement))
charArray[i] = replacement;
}
fileContents = new string(charArray);
Performance is O(n) in relation to the length of the input string.
It seems fileContents is a string value here. You could simply call replace on the string.
foreach (KeyValuePair<string, string> item in unwantedCharacters)
{
fileContents = fileContents.Replace(item.Key, item.Value);
}
Look at the linked answer, but put your own characters in the map:
IDictionary<string,string> map = new Dictionary<string,string>()
{
{"É", "#"},
{"Ä", "["},
{"Ö", "\\"},
...
};
To replace many characters in a string, consider using the StringBuilder class. Replacing one character in a string creates an entirely new string, so doing it repeatedly is highly inefficient. Try the code below:
var sb = new StringBuilder(fileContents.Length);
foreach (var c in fileContents)
sb.Append(unwantedCharacters.ContainsKey(c) ? unwantedCharacters[c] : c);
fileContents = sb.ToString();
I assumed here that your dictionary contains characters (Dictionary<char, char>). If that is not the case, just comment and I will edit the solution.
I also assumed, that fileContents is a string.
You can also use LINQ instead of StringBuilder:
var fileContentsEnumerable = from c in fileContents
select unwantedCharacters.ContainsKey(c) ? unwantedCharacters[c] : c;
fileContents = new string(fileContentsEnumerable.ToArray());
You want to build a filter. You process the contents of the file, and do the substitution while you process it.
Something like this:
using(StreamReader reader = new StreamReader("filename"))
using (StreamWriter writer = new StreamWriter("outfile"))
{
int currChar; // Read() returns an int, or -1 at the end of the stream
while ((currChar = reader.Read()) >= 0)
{
char c = (char)currChar;
char outChar = unwantedCharacters.ContainsKey(c)
? unwantedCharacters[c]
: c;
writer.Write(outChar);
}
}
You can use a memory stream if your data is in memory, or loop through fileContents if that's a string or char array.
This solution is O(n), where n is the length of the file, thanks to the dictionary (note that you could use a simple sparse array instead of the dictionary and gain quite a bit of speed).
Do not iterate through the dictionary as others suggest: each substitution is O(n), so you end up with a total time of O(n*d), d being the dictionary size, because you have to go through the file many times.
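The sparse-array idea mentioned above could look roughly like the sketch below. It is only an illustration: the names CharFilter, Table and Clean are hypothetical, and the 256-entry table relies on the assumption that every unwanted character has a code below 256, which holds for the ten characters in the question.

```csharp
using System;

static class CharFilter
{
    // Replacement table indexed by character code; '\0' means "leave unchanged".
    // Assumption: all unwanted characters have codes below 256.
    static readonly char[] Table = BuildTable();

    static char[] BuildTable()
    {
        var table = new char[256];
        table['É'] = '#';  table['Ä'] = '[';  table['Ö'] = '\\';
        table['Å'] = ']';  table['Ü'] = '^';  table['é'] = '`';
        table['ä'] = '{';  table['ö'] = '|';  table['å'] = '}';
        table['ü'] = '~';
        return table;
    }

    public static string Clean(string input)
    {
        char[] chars = input.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            char c = chars[i];
            // One array lookup per character, so a single O(n) pass over the input.
            if (c < Table.Length && Table[c] != '\0')
                chars[i] = Table[c];
        }
        return new string(chars);
    }
}
```

Calling CharFilter.Clean(fileContents) then replaces the two nested loops from the question with one pass.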
Remove the foreach and replace with a for loop from 0 to item.Count. This article will help, hopefully.
I need to split a line of text
The general syntax for a delivery instruction is |||name|value||name|value||…..|||
Each delivery instruction starts and ends with 3 pipe characters - |||
A delivery instruction is a set of name/value pairs separated by a single pipe eg name|value
Each name value pair is separated by 2 pipe characters ||
Names and Values may not contain the pipe character
The value of any pair may be a blank string.
I need a regex that will help me resolve the above problem.
My latest attempt with my limited Regex skills:
string SampleData = "|||env|af245g||mail_idx|39||gen_date|2016/01/03 11:40:06||docm_name|Client Statement (01.03.2015−31.03.2015)||docm_cat_name|Client Statement||docm_type_id|9100||docm_type_name|Client Statement||addr_type_id|1||addr_type_name|Postal address||addr_street_nr|||addr_street_name|Robinson Road||addr_po_box|||addr_po_box_type|||addr_postcode|903334||addr_city|Singapore||addr_state|||addr_country_id|29955||addr_country_name|Singapore||obj_nr|10000023||bp_custr_type|Customer||access_portal|Y||access_library|Y||avsr_team_id|13056||pri_avsr_id|||pri_avsr_name|||ctact_phone|||dlv_type_id|5001||dlv_type_name|Channel to standard mail||ao_id|14387||ao_name|Corp Limited||ao_title|||ao_mob_nr|||ao_email_addr||||??";
string[] Split = Regex.Matches(SampleData, "(\|\|\|(?:\w+\|\w*\|\|)*\|)").Cast<Match>().Select(m => m.Value).ToArray();
The expected output should be as follows(based on the sample data string provided):
env|af245g
mail_idx|39
gen_date|2016/01/03 11:40:06
docm_name|Client Statement (01.03.2015−31.03.2015)
docm_cat_name|Client Statement
docm_type_id|9100
docm_type_name|Client Statement
addr_type_id|1
addr_type_name|Postal address
addr_street_nr|
addr_street_name|Robinson Road
addr_po_box|
addr_po_box_type|
addr_postcode|903334
addr_city|Singapore
addr_state|
addr_country_id|29955
addr_country_name|Singapore
obj_nr|10000023
bp_custr_type|Customer
access_portal|Y
access_library|Y
avsr_team_id|13056
pri_avsr_id|
pri_avsr_name|
ctact_phone|
dlv_type_id|5001
dlv_type_name|Channel to standard mail
ao_id|14387
ao_name|Corp Limited
ao_title|
ao_mob_nr|
ao_email_addr|
You can also do it without using Regex. It's just simple splitting.
string nameValues = "|||zeeshan|1||ali|2||ahsan|3|||";
string sub = nameValues.Substring(3, nameValues.Length - 6);
Dictionary<string, string> dic = new Dictionary<string, string>();
string[] subsub = sub.Split(new string[] {"||"}, StringSplitOptions.None);
foreach (string item in subsub)
{
string[] nameVal = item.Split('|');
dic.Add(nameVal[0], nameVal[1]);
}
foreach (var item in dic)
{
// Retrieve key and value here i.e:
// item.Key
// item.Value
}
Hope this helps.
I think you're making this more difficult than it needs to be. This regex yields the desired result:
@"[^|]+\|([^|]*)"
Assuming you're dealing with a single, well-formed delivery instruction, there's no need to match the starting and ending triple-pipes. You don't need to worry about the double-pipe separators either, because the "name" part of the "name|value" pair is always present. Just look for the first thing that looks like a name with a pipe following it, and everything up to the next pipe character is the value.
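As a rough illustration of how that pattern might be applied, the sketch below collects each whole match as one "name|value" string. The class and method names (DeliveryInstruction, SplitPairs) are made up for the example, and it assumes a single, well-formed delivery instruction as described above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class DeliveryInstruction
{
    // Returns every "name|value" pair found in the instruction.
    // A blank value yields "name|" because [^|]* may match nothing.
    public static string[] SplitPairs(string instruction)
    {
        return Regex.Matches(instruction, @"[^|]+\|([^|]*)")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

If only the value is needed, m.Groups[1].Value gives the captured part after the pipe.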
(?<=\|\|\|).*?(?=\|\|\|)
You can use this to get all the key/value pairs between the ||| delimiters. See the demo:
https://regex101.com/r/fM9lY3/59
string strRegex = @"(?<=\|\|\|).*?(?=\|\|\|)";
Regex myRegex = new Regex(strRegex, RegexOptions.Multiline);
string strTargetString = @"|||env|af245g||mail_idx|39||gen_date|2016/01/03 11:40:06||docm_name|Client Statement (01.03.2015−31.03.2015)||docm_cat_name|Client Statement||docm_type_id|9100||docm_type_name|Client Statement||addr_type_id|1||addr_type_name|Postal address||addr_street_nr|||addr_street_name|Robinson Road||addr_po_box|||addr_po_box_type|||addr_postcode|903334||addr_city|Singapore||addr_state|||addr_country_id|29955||addr_country_name|Singapore||obj_nr|10000023||bp_custr_type|Customer||access_portal|Y||access_library|Y||avsr_team_id|13056||pri_avsr_id|||pri_avsr_name|||ctact_phone|||dlv_type_id|5001||dlv_type_name|Channel to standard mail||ao_id|14387||ao_name|Corp Limited||ao_title|||ao_mob_nr|||ao_email_addr||||??";
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
if (myMatch.Success)
{
// Add your code here
}
}
Here's a variation of @Syed Muhammad Zeeshan's code that runs faster:
string nameValues = "|||zeeshan|1||ali|2||ahsan|3|||";
string[] nameArray = nameValues.Split(new char[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
Dictionary<string, string> dic = new Dictionary<string, string>();
// Note: RemoveEmptyEntries drops blank values, so the pairs only line up when every value is non-empty.
for (int i = 0; i + 1 < nameArray.Length; i += 2)
{
dic.Add(nameArray[i], nameArray[i + 1]);
}
Interesting, I would like to try:
class Program
{
static void Main(string[] args)
{
string nameValueList = "|||zeeshan|1||ali|2||ahsan|3|||";
while (nameValueList != "|||")
{
nameValueList = nameValueList.TrimStart('|');
string nameValue = GetNameValue(ref nameValueList);
Console.WriteLine(nameValue);
}
Console.ReadLine();
}
private static string GetNameValue(ref string nameValues)
{
string retVal = string.Empty;
while(nameValues[0] != '|') // for name
{
retVal += nameValues[0];
nameValues = nameValues.Remove(0, 1);
}
retVal += nameValues[0];
nameValues = nameValues.Remove(0, 1);
while (nameValues[0] != '|') // for value
{
retVal += nameValues[0];
nameValues = nameValues.Remove(0, 1);
}
return retVal;
}
}
https://dotnetfiddle.net/WRbsRu
I want to make a program that takes a string you enter and turns it into a different string. For example, if I input "Hello World", every char will turn into a string and the console will output something like "Alpha Beta Gamma Gamma Zeta Foxtrot Dona Rama Lana Zema", making every char a word.
I tried doing it like this :
static string WordMap(string value)
{
char[] buffer = value.ToCharArray();
for (int i = 0; i < buffer.Length; i++)
{
if (letter = "a")
{
letter = ("Alpha");
}
//and so on
buffer[i] = letter;
}
return new string(buffer);
}
But I just can't get it to work.
Can anyone give me a tip or point me in the right direction?
What you need is a Dictionary<char,string>
var words = new Dictionary<char, string>();
words.Add('a', "Alpha");
words.Add('b',"Beta");
...
string input = Console.ReadLine();
string[] contents = new string[input.Length];
for (int i = 0; i < input.Length; i++)
{
if (words.ContainsKey(input[i]))
{
contents[i] = words[input[i]];
}
}
string result = string.Join(" ", contents);
Or LINQ way:
var result = string.Join(" ", input.Where(words.ContainsKey).Select(c => words[c]));
First off, the buffer is a char array. Arrays have a fixed size and to expand them you need to create a new one. To overcome this cumbersome work, there is a StringBuilder class that does this automatically.
Secondly, if you keep these 'Alpha', 'Beta', ... strings in if statements you will have a very long piece of code. You can replace this by using a dictionary, or create it from a single string or text file.
To put this into practice:
class MyClass
{
static Dictionary<char, string> _map = new Dictionary<char, string>();
static MyClass()
{
_map.Add('a', "Alpha");
_map.Add('b', "Beta");
// etc
}
static string WordMap(string data)
{
var output = new StringBuilder();
foreach (char c in data)
{
if (_map.ContainsKey(c))
{
output.Append(_map[c]);
output.Append(' ');
}
}
return output.ToString();
}
}
Solution without a dictionary:
static string WordMap(string data)
{
const string words = "Alpha Beta Gamma Delta ...";
string[] wordMap = words.Split(' ');
var output = new StringBuilder();
foreach (char c in data)
{
int index = c - 'a';
if (index >= 0 && index < wordMap.Length)
{
output.Append(wordMap[index]);
output.Append(' ');
}
}
return output.ToString();
}
With LINQ and String.Join it's short and readable. Since you want to have a new word for special chars, you need a word map. I would use a Dictionary<char, string>:
static Dictionary<char, string> wordMap = new Dictionary<char, string>()
{
{'a', "Alpha"}, {'b', "Beta"},{'c', "Gamma"}, {'d', "Delta"}
};
static string WordMap(string value)
{
var strings = value
.Select(c =>
{
string word;
if(!wordMap.TryGetValue(c, out word))
word = c.ToString();
return word;
});
return string.Join("", strings);
}
Test:
string newString = WordMap("abcdeghj"); // AlphaBetaGammaDeltaeghj
Tips:
You don't have to create character array from string, you can easily access single characters in string by indexer:
char some = "123"[2];
When you use "" then you create string not char therefore you should use '' to create character for comparison:
if (some == 'a') Console.WriteLine("character is a, see how you compare chars!!!");
A good solution ...
string[] words = { "Alpha", "Beta", "C_word", "D_Word" }; // ....
string myPhrase = "aBC";
myPhrase.Replace(" ", string.Empty).ToList().ForEach(a =>
{
int asciiCode = (int)a;
// Manual ToUpper(): lowercase ASCII letters start at 97 ('a'), 32 above their uppercase forms.
int index = asciiCode >= 97 ? asciiCode - 32 : asciiCode;
Console.WriteLine(words[index - 65]); // ASCII: 65 = 'A', 66 = 'B', ...
});
One more variation of answer which contains not found option.
static Dictionary<char, string> Mapping =
new Dictionary<char, string>()
{ { 'a', "alpha" }, { 'b', "beta" }, { 'c', "gamma" }, { 'd', "zeta" } };
static void Main(string[] args)
{
string test = "abcx";
Console.WriteLine(string.Join(" ", test.Select(t => GetMapping(t))));
//output alpha beta gamma not found
Console.ReadKey();
}
static string GetMapping(char key)
{
string value;
// TryGetValue does the lookup once instead of scanning the dictionary with First().
return Mapping.TryGetValue(key, out value) ? value : "not found";
}
Just declare a second variable where you build up your result.
And I think you have some syntax problems, you need to have "==" in a
condition, otherwise it's an assignment.
static string WordMap(string value)
{
string result = string.Empty;
char[] buffer = value.ToCharArray();
for (int i = 0; i < buffer.Length; i++)
{
if (buffer[i] == 'a') // compare the current char; use '' for char literals
{
result += ("Alpha");
}
//and so on
}
return result;
}
But I would only do it this way if this is "just for fun" code,
as it will not be very fast.
Building up the result with += like I did is slow for long inputs, because every
concatenation creates a new string. Writing it as
result = string.Concat(result, "Alpha");
is equivalent and no faster.
A faster way is using a StringBuilder (see the documentation for that),
which offers you fast and convenient methods to deal with bigger strings.
You can optionally pass an initial capacity if you know roughly how many characters
the result will have. If you do not, the StringBuilder grows automatically as needed,
at the cost of some extra allocations.
But as said, for just-for-fun code, all of that does not matter...
And of course, you need to be aware that if you do it like that, your result will
be in one straight line, with no breaks. If you want line breaks, add "\n" at the end of
the strings. Or add anything else you need.
Regards,
Markus
My problem is getting values from a .txt file.
For example, I have this (all on one line, without line breaks):
damage=20 big explosion=50 rangeweapon=50.0
and I want to get the values after "=". I just want to build a string[] with something like this:
damage=20
big explosion=50
rangeweapon=50.0
I already have another mechanism, but I want to find a universal one that gets all the values into a string[] so that I can then just check them in a switch.
Thank You.
I tried to solve your problem with a regex. I found one solution, though it is not the best one.
Maybe it can help or guide you toward a better solution.
Please try this:
string operation = "damage=20 big explosion=50 rangeweapon=50.0";
// Splits out the words; the last element of the resulting array is empty and should be removed.
string[] wordText = Regex.Split(operation, @"(\=\d+\.?\d?)");
// Splits out the numeric values; the first element of the resulting array is empty and should be removed.
string[] wordValue = Regex.Split(operation, @"(\s*\D+\=)");
After that you can join the arrays or do anything else you want with them.
This should parse the string you describe, but keep in mind it isn't very robust and has no error handling.
string stringToParse = "damage=20 big explosion=50 rangeweapon=50.0";
string[] values = stringToParse.Split(' ');
Dictionary<string, double> parsedValues = new Dictionary<string, double>();
string temp = "";
foreach (var value in values)
{
if (value.Contains('='))
{
string[] keyValue = value.Split('=');
parsedValues.Add(temp + keyValue[0], Double.Parse(keyValue[1]));
temp = string.Empty;
}
else
{
temp += value + ' ';
}
}
After this, the parsedValues dictionary should have the information you're looking for.
I'm not an expert about it, but what about using a Regex ?
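A regex-based version might look like the sketch below. It is a guess at what such a solution could be, not a tested production parser: the names SettingsParser and Parse are hypothetical, it assumes every value is a non-negative number written with a '.' decimal separator, and it assumes keys never contain '='.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

static class SettingsParser
{
    public static Dictionary<string, double> Parse(string input)
    {
        var result = new Dictionary<string, double>();
        // key: shortest run of non-'=' characters (may contain spaces);
        // value: digits with an optional fractional part.
        foreach (Match m in Regex.Matches(input, @"(?<key>[^=]+?)=(?<value>\d+(?:\.\d+)?)"))
        {
            string key = m.Groups["key"].Value.Trim();
            // InvariantCulture so "50.0" parses the same on every machine.
            double value = double.Parse(m.Groups["value"].Value, CultureInfo.InvariantCulture);
            result[key] = value;
        }
        return result;
    }
}
```

The resulting dictionary can then be checked key by key, for example in a switch over the key names.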
Not the cleanest code in the world, but will work for your situation.
string input = "damage=20 big explosion=50 rangeweapon=50.0";
string[] parts = input.Split('=');
Dictionary<string, double> dict = new Dictionary<string, double>();
for (int i = 0; i < (parts.Length - 1); i++)
{
string key = i==0?parts[i]:parts[i].Substring(parts[i].IndexOf(' '));
string value = i==parts.Length-2?parts[i+1]:parts[i + 1].Substring(0, parts[i + 1].IndexOf(' '));
dict.Add(key.Trim(), Double.Parse(value));
}
foreach (var el in dict)
{
Console.WriteLine("Key {0} contains value {1}", el.Key, el.Value);
}
Console.ReadLine();
You want to read numbers from text. You can save your data in the text file like this:
damage=20,big explosion=50,rangeweapon=50
Then read it via File.ReadAllLines():
// filePath is a placeholder for the path to your text file.
string[] lines = File.ReadAllLines(filePath);
foreach (string line in lines)
{
    foreach (string entry in line.Split(','))
    {
        // Strip everything except digits and the decimal point, leaving only the value.
        string x = Regex.Replace(entry, "[^0-9.]", "");
        Console.WriteLine(x);
    }
}
I want to extract unique characters from a string. For example:- 'AAABBBBBCCCCFFFFGGGGGDDDDJJJJJJ' will return 'ABCFGDJ'
I have tried the code below, but now I want to optimize it.
Any suggestions are welcome.
static string extract(string original)
{
List<char> characters = new List<char>();
string unique = string.Empty;
foreach (char letter in original.ToCharArray())
{
if (!characters.Contains(letter))
{
characters.Add(letter);
}
}
foreach (char letter in characters)
{
unique += letter;
}
return unique;
}
I don't know if this is faster, but surely shorter
string s = "AAABBBBBCCCCFFFFGGGGGDDDDJJJJJJ";
var newstr = String.Join("", s.Distinct());
Another LINQ approach, but not using string.Join:
var result = new string(original.Distinct().ToArray());
I honestly don't know which approach to string creation would be faster. It probably depends on whether string.Join ends up internally converting each element to a string before appending to a StringBuilder, or whether it has custom support for some well-known types to avoid that.
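For comparison, a hand-written version that keeps the shape of the question's loop but avoids the repeated List<char>.Contains scan could look like this. It is only a sketch: UniqueChars and Extract are made-up names, and no claim is made here about how it compares in speed with the Distinct() versions.

```csharp
using System.Collections.Generic;
using System.Text;

static class UniqueChars
{
    public static string Extract(string original)
    {
        var seen = new HashSet<char>();
        var unique = new StringBuilder();
        foreach (char letter in original)
        {
            // HashSet<char>.Add returns false when the character was already seen.
            if (seen.Add(letter))
                unique.Append(letter);
        }
        return unique.ToString();
    }
}
```

Characters come out in order of first appearance, as in the question's example.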
How about
var result = string.Join("", "AAABBBBBCCCCFFFFGGGGGDDDDJJJJJJ".Distinct());
Make sure that you include System.Linq namespace.
Try this
string str = "AAABBBBBCCCCFFFFGGGGGDDDDJJJJJJ";
string answer = new String(str.Distinct().ToArray());
I hope this helps.
If "AAABBBAAA" should return "ABA", then the following does it, albeit not very fast.
List<char> no_repeats = new List<char>();
no_repeats.Add(s[0]);
for (int i = 1; i < s.Length; i++)
{
if (s[i] != no_repeats.Last()) no_repeats.Add(s[i]);
}
string result = string.Join("", no_repeats);
My ultimate goal here is to turn the following string into JSON, but I would settle for something that gets me one step closer by combining the fieldname with each of the values.
Sample Data:
Field1:abc;def;Field2:asd;fgh;
Using Regex.Replace(), I need it to at least look like this:
Field1:abc,Field1:def,Field2:asd,Field2:fgh
Ultimately, this result would be awesome if it can be done via Regex in a single call.
{"Field1":"abc","Field2":"asd"},{"Field1":"def","Field2":"fgh"}
I've tried many different variations of this pattern, but can't seem to get it right:
(?:(\w+):)*?(?:([^:;]+);)
Only one other example I could find that is doing something similar, but just enough differences that I can't quite put my finger on it.
Regex to repeat a capture across a CDL?
EDIT:
Here's my solution. I'm not going to post it as a "Solution" because I want to give credit to one that was posted by others. In the end, I took a piece from each of the posted solutions and came up with this one. Thanks to everyone who posted. I gave credit to the solution that compiled, executed fastest and had the most accurate results.
string hbi = "Field1:aaa;bbb;ccc;ddd;Field2:111;222;333;444;";
Regex re = new Regex(@"(\w+):(?:([^:;]+);)+");
MatchCollection matches = re.Matches(hbi);
SortedDictionary<string, string> dict = new SortedDictionary<string, string>();
for (int x = 0; x < matches.Count; x++)
{
Match match = matches[x];
string property = match.Groups[1].Value;
for (int i = 0; i < match.Groups[2].Captures.Count; i++)
{
string key = i.ToString() + x.ToString();
dict.Add(key, string.Format("\"{0}\":\"{1}\"", property, match.Groups[2].Captures[i].Value));
}
}
Console.WriteLine(string.Join(",", dict.Values));
Now you have two problems
I don't think regular expressions will be the best way to handle this. You should probably start by splitting on semicolons, then loop through the results looking for a value that starts with "Field1:" or "Field2:" and collect the results into a Dictionary.
Treat this as pseudo code because I have not compiled or tested it:
string[] data = input.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
// Each field can have several values, so collect them in a list per field name.
var map = new Dictionary<string, List<string>>();
string currentKey = null;
foreach (string item in data)
{
    string value = item; // the foreach variable itself cannot be reassigned
    // This part should change depending on how the fields are defined.
    // If it's a fixed set you could have an array of fields to search,
    // or you might need to use a regular expression.
    if (value.StartsWith("Field1:") || value.StartsWith("Field2:"))
    {
        currentKey = value.Substring(0, value.IndexOf(':'));
        value = value.Substring(currentKey.Length + 1);
        map[currentKey] = new List<string>();
    }
    map[currentKey].Add(value);
}
// convert map to json
I had an idea that it should be possible to do this in a shorter and more clear way. It ended up not being all that much shorter and you can question if it's more clear. At least it's another way to solve the problem.
var str = "Field1:abc;def;Field2:asd;fgh";
var rows = new List<Dictionary<string, string>>();
int index = 0;
string value;
string fieldname = "";
foreach (var s in str.Split(';'))
{
if (s.Contains(":"))
{
index = 0;
var tmp = s.Split(':');
fieldname = tmp[0];
value = tmp[1];
}
else
{
value = s;
index++;
}
if (rows.Count < (index + 1))
rows.Insert(index, new Dictionary<string, string>());
rows[index][fieldname] = value;
}
var arr = rows.Select(dict =>
String.Join("," , dict.Select(kv =>
String.Format("\"{0}\":\"{1}\"", kv.Key, kv.Value))))
.Select(r => "{" + r + "}");
var json = String.Join(",", arr );
Debug.WriteLine(json);
Outputs:
{"Field1":"abc","Field2":"asd"},{"Field1":"def","Field2":"fgh"}
I would go with RegEx as the simplest and most straightforward way to parse the strings, but I'm sorry, pal, I couldn't come up with a clever-enough replacement string to do this in one shot.
I hacked it out for fun though, and the monstrosity below accomplishes what you need, albeit hideously. :-/
// The ':' and ';' separators sit outside the named groups so they are not captured.
Regex r = new Regex(@"(?<FieldName>\w+):(?:(?<Value>[^:;]+);)+");
var matches = r.Matches("Field1:abc;def;Field2:asd;fgh;moo;"); // Modified to test "uneven" data as well.
var tuples = new[] { new { FieldName = "", Value = "", Index = 0 } }.ToList(); tuples.Clear();
foreach (Match match in matches)
{
var matchGroups = match.Groups;
var fieldName = matchGroups[1].Captures[0].Value;
int index = 0;
foreach (Capture cap in matchGroups[2].Captures)
{
var tuple = new { FieldName = fieldName, Value = cap.Value, Index = index };
tuples.Add(tuple);
index++;
}
}
var maxIndex = tuples.Max(tup => tup.Index);
var jsonItemList = new List<string>();
for (int a = 0; a < maxIndex+1; a++)
{
var jsonBuilder = new StringBuilder();
jsonBuilder.Append("{");
foreach (var tuple in tuples.Where(tup => tup.Index == a))
{
jsonBuilder.Append(string.Format("\"{0}\":\"{1}\",", tuple.FieldName, tuple.Value));
}
jsonBuilder.Remove(jsonBuilder.Length - 1, 1); // trim last comma.
jsonBuilder.Append("}");
jsonItemList.Add(jsonBuilder.ToString());
}
foreach (var item in jsonItemList)
{
// Write your items to your document stream.
}