Ok let's say I have an ObservableCollection<string> object. Within this object I have a variety of strings:
SomeString01
SomeString-02
somestring-03
SOMESTRING.04
aString
I want to take an input from a user interface, which we'll call pattern and store as a string, and do partial matching against the ObservableCollection. The matching must be case insensitive. In the end I want the matches compiled into a brand new ObservableCollection. Here are some example cases:
pattern = "SoME"
// RESULTS:
SomeString01
SomeString-02
somestring-03
SOMESTRING.04
/* --- */
pattern = "-0"
// RESULTS:
SomeString-02
somestring-03
/* --- */
pattern = "ING0"
// RESULTS:
SomeString01
/* --- */
pattern = "s"
// RESULTS:
SomeString01
SomeString-02
somestring-03
SOMESTRING.04
aString
What is the best approach for this in a ClickOnce application?
Like Gabe's answer in the comments, but slightly more specific:
.Where(x => x.IndexOf("Some",StringComparison.InvariantCultureIgnoreCase) != -1)
Ok I actually dug around more with Google, and found a better solution:
You could use IndexOf() and pass StringComparison.OrdinalIgnoreCase
string title = "STRING";
bool contains = title.IndexOf("string", StringComparison.OrdinalIgnoreCase) >= 0;
Even better is defining a new extension method for string:
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
return source.IndexOf(toCheck, comp) >= 0;
}
string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);
Contributed by: JaredPar
Source: Case insensitive 'Contains(string)'
So now I have implemented it in my source as follows:
foreach (string source in SourceStrings)
{
// Code for some pre-reqs here
if (source.IndexOf(Pattern, StringComparison.OrdinalIgnoreCase) >= 0)
{
subset.Add(source);
}
// Finish up anything else I had to do here
}
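The same filtering can also be written as a single LINQ query that builds the new collection directly. This is only a minimal sketch, assuming the pre- and post-processing steps in the loop are not needed; the PatternFilter class and FilterByPattern method are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public static class PatternFilter
{
    // Hypothetical helper: returns a new collection holding every item
    // that contains pattern, ignoring case. Null items are skipped.
    public static ObservableCollection<string> FilterByPattern(
        IEnumerable<string> source, string pattern)
    {
        return new ObservableCollection<string>(
            source.Where(s => s != null &&
                s.IndexOf(pattern, StringComparison.OrdinalIgnoreCase) >= 0));
    }
}
```

With the sample strings from the question, PatternFilter.FilterByPattern(SourceStrings, "-0") should yield SomeString-02 and somestring-03.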
I have an enum whose values I want to present as strings in a particular way:
public enum FailureDescription
{
MemoryFailure,
Fragmentation,
SegmentationFault
}
I want to print the value of that enum as follows: FailureDescription.MemoryFailure.ToString() -> Memory Failure
Can I do that? How? Should I implement ToString?
You can write your own extension method:
public static string ToFormattedText(this MyEnum value)
{
var stringVal = value.ToString();
var bld = new StringBuilder();
for (var i = 0; i < stringVal.Length; i++)
{
if (i > 0 && char.IsUpper(stringVal[i])) // skip the first letter so the result has no leading space
{
bld.Append(" ");
}
bld.Append(stringVal[i]);
}
return bld.ToString();
}
If you want the method available for all enums, just replace MyEnum with Enum.
Usage:
var test = MyEnum.SampleName.ToFormattedText();
Consider caching the result; building the string on every call may be inefficient.
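The caching suggestion could look roughly like this. It is a sketch under assumptions rather than a definitive implementation: CachedEnumText, ToCachedText and InsertSpaces are hypothetical names, and a ConcurrentDictionary is just one possible cache that is safe to use from several threads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;

public static class CachedEnumText
{
    // Maps each enum value to its formatted text; entries are added on first use.
    private static readonly ConcurrentDictionary<Enum, string> Cache =
        new ConcurrentDictionary<Enum, string>();

    // Hypothetical extension method: formats the name once, then reuses it.
    public static string ToCachedText(this Enum value)
    {
        return Cache.GetOrAdd(value, v => InsertSpaces(v.ToString()));
    }

    // Puts a space before every upper-case letter except the first character.
    private static string InsertSpaces(string name)
    {
        var bld = new StringBuilder(name.Length + 4);
        for (var i = 0; i < name.Length; i++)
        {
            if (i > 0 && char.IsUpper(name[i]))
                bld.Append(' ');
            bld.Append(name[i]);
        }
        return bld.ToString();
    }
}
```

Usage would be FailureDescription.MemoryFailure.ToCachedText(). Because the key is a boxed Enum, each call still boxes the value; a generic cache per enum type would avoid that at the cost of more code.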
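Use the Description attribute to decorate your enumeration values. I'd suggest adding a resx file for resources so that you can localise more easily. If you hardcode "Memory Failure", it becomes more work to change it to another language (as Hans Passant mentioned in the comments). Note that attribute arguments must be compile-time constants, so the resource members used below have to be const strings (for example resource keys that the reading method looks up); a property generated from a resx file cannot be passed to an attribute directly.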
public enum FailureDescription
{
[Description("Memory Failure")] //hardcoding English
MemoryFailure,
[Description(MyStringsResourceFile.FragmentationDescription)] //reading from a resx file makes localisation easier
Fragmentation,
[Description(MyStringsResourceFile.SegmentationFaultDescription)]
SegmentationFault
}
You can then create a method, or extension method (as shown) to read the Description value.
public static class Extensions
{
public static string GetDescription(this Enum value)
{
FieldInfo fi = value.GetType().GetField(value.ToString());
DescriptionAttribute[] attributes =
(DescriptionAttribute[])fi.GetCustomAttributes(
typeof(DescriptionAttribute),
false);
if (attributes != null &&
attributes.Length > 0)
return attributes[0].Description;
else
return value.ToString();
}
}
Then call the method like so:
Console.WriteLine(FailureDescription.MemoryFailure.GetDescription());
This extension method will do it for you:
public static string ToFormattedText(this FailureDescription value)
{
return new string(value.ToString()
.SelectMany(c =>
char.IsUpper(c)
? new [] { ' ', c }
: new [] { c })
.ToArray()).Trim();
}
You can also use a simple mixture of regex and LINQ to extract and concatenate the words:
var withSpaces =
Regex
.Matches(
FailureDescription.MemoryFailure.ToString(),
@"([A-Z][a-z]+)(?=[A-Z]|$)")
.Cast<Match>()
.Select(m => m.Groups[1].Value)
.Aggregate((str, next) => (str = str + " " + next));
DEMO at ideone.com
where:
([A-Z][a-z]+)(?=[A-Z]|$)
matches words that begin with upper-case letters until the next upper-case letter or the end of string is found: DEMO at regex101
.Select(m => m.Groups[1].Value)
selects the matched values from the group 1
.Aggregate((str, next) => (str = str + " " + next));
concatenates words and inserts a space between them.
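As a self-contained variation on the regex approach above, the matched words can be joined with string.Join instead of Aggregate. This is a sketch, not the answer's own code: PascalCaseSplitter and SplitWords are hypothetical names. One practical difference is that Aggregate without a seed throws on an empty sequence, while string.Join simply returns an empty string.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PascalCaseSplitter
{
    // Same pattern as above: an upper-case letter followed by lower-case
    // letters, up to the next upper-case letter or the end of the string.
    private static readonly Regex Word = new Regex(@"([A-Z][a-z]+)(?=[A-Z]|$)");

    // Hypothetical helper: returns the matched words separated by spaces.
    public static string SplitWords(string name)
    {
        var words = Word.Matches(name)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value);
        return string.Join(" ", words);
    }
}
```

Keep in mind that the pattern only captures words made of one capital followed by lower-case letters, so names containing digits or runs of capitals (such as a hypothetical IOError) are not fully reproduced.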
Here is one of the utilities I've been using. @HansPassant raised a good point in his comment about localizing.
This code takes resource files into consideration. In the attribute with two parameters, the first is the key in the resource file, whereas the second is the namespace of the resource.
Check out the git repo https://github.com/seanpaul/EnumExtensions
public enum TestEnum
{
//You can pass what ever string value you want
[StringValue("From Attribute")]
FromAttribute = 1,
//If localizing, you can use resource files
//First param is Key in resource file, second is namespace for Resources.
[StringValue("Test", "EnumExtensions.Tests.Resources")]
FromResource = 2,
//or don't mention anything and it will use built-in ToString
BuiltInToString = 3
}
[Test ()]
public void GetValueFromAttribute ()
{
var testEnum = TestEnum.FromAttribute;
Assert.AreEqual ("From Attribute", testEnum.GetStringValue ());
}
[Test ()]
public void GetValueFromResourceFile ()
{
var testEnum = TestEnum.FromResource;
Assert.AreEqual ("From Resource File", testEnum.GetStringValue ());
}
An elegant solution following the DRY and KISS principles would be to use Humanizer. Its Humanize() extension method turns an enum value into spaced text, and DehumanizeTo<T>() converts such text back into the enum value:
"Memory Failure".DehumanizeTo<FailureDescription>(); // Returns FailureDescription.MemoryFailure.
"Fragmentation".DehumanizeTo<FailureDescription>(); // Returns FailureDescription.Fragmentation.
"Segmentation Fault".DehumanizeTo<FailureDescription>(); // Returns FailureDescription.SegmentationFault.
I was trying out ways to check whether a string is a palindrome, using the following logic:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Anagram solver");
Console.WriteLine(IsPalindrome("HIMA", "AMHI").ToString());
Console.ReadKey();
}
static bool IsPalindrome(string s1, string s2)
{
return s1.OrderBy(c => c).SequenceEqual(s2.OrderBy(c => c));
}
}
My idea was to take the characters of one string and compare them with the characters of another string to detect a possible palindrome. Is such a thing possible with the LINQ SequenceEqual method?
Looking at the sample above:
'H' shall be compared with 'A' (default equality comparison)
'I' shall be compared with 'M'
'M' shall be compared with 'H'
'A' shall be compared with 'I'
Can anyone guide me here?
Thanks and Cheers
Srivatsa
If you want a palindrome check, you should not order the characters; just reverse one string and compare:
static bool IsPalindrome(string s1, string s2)
{
return s1.SequenceEqual(s2.Reverse());
}
For case insensitivity, try:
static bool IsPalindrome(string s1, string s2)
{
return s1.ToLower().SequenceEqual(s2.ToLower().Reverse());
}
In your case, "HIMA" and "AMHI" are sorted by the OrderBy LINQ function, which results in two collections containing the characters "AHIM". If you call SequenceEqual this returns true.
For SequenceEqual to return true, both collections must have the same number of elements, with equal elements in exactly the same order. An extra, missing, or moved element makes it return false.
If you want to determine if two words are anagrams, that is exactly the functionality you want.
For palindromes, you could use the following:
public bool CheckPalindrome(string first, string second)
{
if (first == null) throw new ArgumentNullException("first");
if (second == null) throw new ArgumentNullException("second");
return first.Reverse().SequenceEqual(second);
}
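The difference between the anagram check and the palindrome-style check described above can be sketched side by side. The class and method names here (WordChecks, AreAnagrams, AreMirrored) are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Linq;

public static class WordChecks
{
    // True when both strings contain the same characters, in any order.
    // Sorting both sides makes the original order irrelevant.
    public static bool AreAnagrams(string a, string b)
    {
        return a.OrderBy(c => c).SequenceEqual(b.OrderBy(c => c));
    }

    // True when b is a spelled backwards. Order matters, so nothing is sorted.
    public static bool AreMirrored(string a, string b)
    {
        return a.SequenceEqual(b.Reverse());
    }
}
```

For the strings in the question, AreAnagrams("HIMA", "AMHI") is true, while AreMirrored("HIMA", "AMHI") is false because "AMHI" reversed is "IHMA"; AreMirrored("HIMA", "AMIH") is true.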
You could use this method:
public static bool IsPalindromWith(this string str1, string str2)
{
if(str1 == null || str2 == null) return false;
return str1.SequenceEqual(str2.Reverse());
}
Usage: bool isPalindrom = "HIMA".IsPalindromWith("AMIH");
However, it is a very simple approach which ignores many edge cases.
Here is a better version that takes at least the case into account:
public static bool IsPalindromWith(this string str1, string str2, StringComparison comparison = StringComparison.CurrentCultureIgnoreCase)
{
if(str1 == null || str2 == null) return false;
char[] str2Chars = str2.ToCharArray();
Array.Reverse(str2Chars);
return str1.Equals(new String(str2Chars), comparison);
}
To elaborate on the existing (and in my opinion correct) answer by @feO2x:
Try looking at it like this:
static bool IsAnagram(string s1, string s2)
{
var lst1 = s1.OrderBy(c => c); //will result in { 'A','H','I', 'M' }
var lst2 = s2.OrderBy(c => c); //will *also* result in { 'A','H','I', 'M' }
return lst1.SequenceEqual(lst2);
}
The OrderBy(...) calls destroy the original order, which is exactly what you are trying to test.
Simply removing them will solve your problem:
static bool IsAnagram(string s1, string s2)
{
var lst1 = s1.AsEnumerable();
var lst2 = s2.AsEnumerable();
return lst1.SequenceEqual(lst2);
}
I'm working on a SignalR WPF application. I'm sending messages from Windows Phone, and I want to find a specific item in the Messages collection.
My view model:
public ViewModel()
{
Messages = new ObservableCollection<string>();
_connection = new HubConnection("http://localhost:49671/");
_dataHub = _connection.CreateHubProxy("dataHub");
}
public ObservableCollection<string> Messages
{
get { return _messages; }
set
{
if (Equals(value, _messages)) return;
_messages = value;
OnPropertyChanged("Messages");
}
}
public async Task Login(string roomName, string userName)
{
_userName = userName;
_roomName = roomName;
await _connection.Start();
await _dataHub.Invoke("JoinRoom", new object[] { _roomName, _userName });
_dataHub.Subscribe("ReceiveMessage").Received += list =>
Dispatcher.CurrentDispatcher.BeginInvoke((Action)(() =>
Messages.Add(list[0].ToString())));
}
Code that I tried for searching:
var asd2 = App.MainViewModel.Messages.Where(a => a.Contains("on"));
var on = App.MainViewModel.Messages.IndexOf(App.MainViewModel.Messages.Where(x => x == "on").FirstOrDefault());
List<string> asd = App.MainViewModel.Messages.Where(a => a.Contains("on")).ToList();
var q = App.MainViewModel.Messages.IndexOf(App.MainViewModel.Messages.Contains("on").ToString());
Nothing has worked so far. Please help.
Edit: The answer on this site didn't work for me. I don't know where the problem is.
Attempt no 1 should work fine, as long as the target string has the same casing (UPPERCASE vs lowercase). This search is case sensitive, meaning it will NOT find "On", "oN" or "ON" because they have different casing. To make the search case insensitive, you can use IndexOf instead, which takes a StringComparison parameter:
var asd2 = App.MainViewModel.Messages.Where(a => a.IndexOf("on", StringComparison.CurrentCultureIgnoreCase) >= 0);
Attempt no 2 looks for the first message that is exactly equal to "on" (again case sensitive) and then asks the collection for that message's index. It only succeeds when a whole message is exactly "on"; a message that merely contains "on" is never matched. When nothing matches, FirstOrDefault returns null and IndexOf returns -1 (assuming the collection holds no null entries).
Attempt no 3 does the same as attempt no 1, but converts the result to a list (Where returns an IEnumerable).
Attempt no 4 calls Contains on the collection, which returns true only if some message is exactly equal to "on". That bool is then converted to the string "True" or "False" and passed to IndexOf, so the code ends up looking for a message that reads "True" or "False".
UPDATE
Where returns an IEnumerable (with all the matches found). If you only need to check if "on" exists, you can use Any:
bool containsOn = App.MainViewModel.Messages.Any(a => a.IndexOf("on", StringComparison.CurrentCultureIgnoreCase) >= 0);
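The distinction between searching the collection for a whole message and searching inside each message can be sketched as follows. This is an illustrative sketch only: MessageSearchDemo, Sample, IndexOfExact and IndexOfPartial are hypothetical names, and the sample messages are made up.

```csharp
using System;
using System.Collections.ObjectModel;

public static class MessageSearchDemo
{
    // Made-up sample data for illustration.
    public static ObservableCollection<string> Sample()
    {
        return new ObservableCollection<string> { "Light is ON", "on", "button pressed" };
    }

    // Collection-level search: index of the first message that IS exactly
    // equal to text (case sensitive), or -1.
    public static int IndexOfExact(ObservableCollection<string> messages, string text)
    {
        return messages.IndexOf(text);
    }

    // String-level search: index of the first message that CONTAINS text,
    // ignoring case, or -1.
    public static int IndexOfPartial(ObservableCollection<string> messages, string text)
    {
        for (int i = 0; i < messages.Count; i++)
        {
            if (messages[i].IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0)
                return i;
        }
        return -1;
    }
}
```

With the sample data, IndexOfExact(Sample(), "on") finds the second message (index 1), while IndexOfPartial(Sample(), "on") already matches the first one (index 0) because "Light is ON" contains the text when case is ignored.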
If you are dealing with case differences and don't have an async issue, the code below works.
Check out this post.
The extension method is basically taken from that post.
public static class StringExt
{
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
return source.IndexOf(toCheck, comp) >= 0;
}
}
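Note that the extension method above will find everything with "on" in it regardless of case when you pass a case-insensitive comparison such as StringComparison.OrdinalIgnoreCase. Add or modify methods to suit your needs; extension methods make life easier :) I personally love them!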
Then for searching
// get first message with on in it
var res = App.MainViewModel.Messages.FirstOrDefault(m => m.Contains("on", StringComparison.OrdinalIgnoreCase));
// get everything with on in it
var res2 = App.MainViewModel.Messages.Where(m => m.Contains("on", StringComparison.OrdinalIgnoreCase));
Hope it helps, and was what you were after
Cheers
Stian
Sending messages as strings like this is really not ideal. Maybe have a look at this library that uses the Event aggregation pattern?
Disclaimer: I'm the author
https://github.com/AndersMalmgren/SignalR.EventAggregatorProxy/wiki
Let's say I have several short strings:
string[] shortStrings = new string[] {"xxx","yyy","zzz"};
(the length of the array and of the strings can change, so this definition is not fixed)
Given a string, I would like to check whether it is built ONLY from concatenations of the shortStrings. How?
Let's say the function looks like bool TestStringFromShortStrings(string s),
then
TestStringFromShortStrings("xxxyyyzzz") = true;
TestStringFromShortStrings("xxxyyyxxx") = true;
TestStringFromShortStrings("xxxyyy") = true;
TestStringFromShortStrings("xxxxxx") = true;
TestStringFromShortStrings("xxxxx") = false;
TestStringFromShortStrings("xxxXyyyzzz") = false;
TestStringFromShortStrings("xxx2yyyxxx") = false;
Please suggest a method that is light on memory and relatively fast.
[EDIT] What is this function for?
I will personally use this function to test whether a string is a combination of Pinyin syllables (some Chinese stuff): detecting whether a string is Hanyu Pinyin, for example to detect a Pinyin domain name. All the Hanyu Pinyin syllables are:
Regex PinYin = new Regex(@"^(a|ai|an|ang|ao|ba|bai|ban|bang|bao|bei|ben|beng|bi|bian|biao|bie|bin|bing|bo|bu|ca|cai|can|cang|cao|ce|cen|ceng|cha|chai|chan|chang|chao|che|chen|cheng|chi|chong|chou|chu|chua|chuai|chuan|chuang|chui|chun|chuo|ci|cong|cou|cu|cuan|cui|cun|cuo|da|dai|dan|dang|dao|de|den|dei|deng|di|dia|dian|diao|die|ding|diu|dong|dou|du|duan|dui|dun|duo|e|ei|en|eng|er|fa|fan|fang|fei|fen|feng|fo|fou|fu|ga|gai|gan|gang|gao|ge|gei|gen|geng|gong|gou|gu|gua|guai|guan|guang|gui|gun|guo|ha|hai|han|hang|hao|he|hei|hen|heng|hong|hou|hu|hua|huai|huan|huang|hui|hun|huo|ji|jia|jian|jiang|jiao|jie|jin|jing|jiong|jiu|ju|juan|jue|jun|ka|kai|kan|kang|kao|ke|ken|keng|kong|kou|ku|kua|kuai|kuan|kuang|kui|kun|kuo|la|lai|lan|lang|lao|le|lei|leng|li|lia|lian|liang|liao|lie|lin|ling|liu|long|lou|lu|lv|luan|lue|lve|lun|luo|ma|mai|man|mang|mao|me|mei|men|meng|mi|mian|miao|mie|min|ming|miu|mo|mou|mu|na|nai|nan|nang|nao|ne|nei|nen|neng|ni|nian|niang|niao|nie|nin|ning|niu|nong|nou|nu|nv|nuan|nuo|nun|ou|pa|pai|pan|pang|pao|pei|pen|peng|pi|pian|piao|pie|pin|ping|po|pou|pu|qi|qia|qian|qiang|qiao|qie|qin|qing|qiong|qiu|qu|quan|que|qun|ran|rang|rao|re|ren|reng|ri|rong|rou|ru|ruan|rui|run|ruo|sa|sai|san|sang|sao|se|sen|seng|sha|shai|shan|shang|shao|she|shei|shen|sheng|shi|shou|shu|shua|shuai|shuan|shuang|shui|shun|shuo|si|song|sou|su|suan|sui|sun|suo|ta|tai|tan|tang|tao|te|teng|ti|tian|tiao|tie|ting|tong|tou|tu|tuan|tui|tun|tuo|wa|wai|wan|wang|wei|wen|weng|wo|wu|xi|xia|xian|xiang|xiao|xie|xin|xing|xiong|xiu|xu|xuan|xue|xun|ya|yan|yang|yao|ye|yi|yin|ying|yo|yong|you|yu|yuan|yue|yun|za|zai|zan|zang|zao|ze|zei|zen|zeng|zha|zhai|zhan|zhang|zhao|zhe|zhei|zhen|zheng|zhi|zhong|zhou|zhu|zhua|zhuai|zhuan|zhuang|zhui|zhun|zhuo|zi|zong|zou|zu|zuan|zui|zun|zuo)+$");
I tried the regular expression approach: it is the simplest and gives very good results, but it is a bit slow. :( The recursive approach is troublesome for long strings and can easily overflow the stack.
Edit: Simplified this a lot thanks to L.B and millimoose.
Regular Expressions to the rescue! Using System.Text.RegularExpressions.Regex, we get:
public static bool TestStringFromShortStrings(string checkText, string[] pieces) {
// Build the expression. Ultimate result will be
// of the form "^(xxx|yyy|zzz)+$".
var expr = "^(" +
String.Join("|", pieces.Select(Regex.Escape)) +
")+$";
// Check whether the supplied string matches the expression.
return Regex.IsMatch(checkText, expr);
}
This should properly handle cases that have multiple repeated patterns of different lengths, e.g. if the list of possible pieces includes the strings "xxx" and "xxxx".
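If the same set of pieces is tested against many strings, the expression can be built once and reused instead of being rebuilt on every call. The following is a sketch under that assumption; PieceMatcher is a hypothetical class name, and whether RegexOptions.Compiled actually pays off depends on how many strings are checked, since it trades a slower start for faster matching.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper that builds the expression once and reuses it.
public sealed class PieceMatcher
{
    private readonly Regex _regex;

    public PieceMatcher(string[] pieces)
    {
        // Same shape as above, "^(xxx|yyy|zzz)+$", but with a
        // non-capturing group (?: ... ) since the captures are not needed.
        string pattern = "^(?:" +
            string.Join("|", pieces.Select(Regex.Escape)) +
            ")+$";
        _regex = new Regex(pattern, RegexOptions.Compiled);
    }

    // True when text consists only of one or more of the pieces.
    public bool IsMatch(string text)
    {
        return _regex.IsMatch(text);
    }
}
```

Usage: create var matcher = new PieceMatcher(new[] { "xxx", "yyy", "zzz" }); once, then call matcher.IsMatch(candidate) for each string to test.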
Copy the target string to a StringBuilder. For each string in the shortStrings array, remove all occurrences from the target. If you end up with a zero-length string, return true, else false.
Edit:
This approach is not correct. Please refer to comments. Keeping this answer still here as it may look reasonably correct initially.
You could compare the start of the input string with each of the short strings. As soon as you have a match, you take the rest of the string and repeat. As soon as you have no more string left, you're done. For example:
string[] shortStrings = new string[] { "xxx", "yyy", "zzz" };
bool Test(string input)
{
if (input.Length == 0)
return true;
foreach (string shortStr in shortStrings)
{
if (input.StartsWith(shortStr))
{
if (Test(input.Substring(shortStr.Length)))
return true;
}
}
return false;
}
You might optimize this by removing the recursion, or by sorting the short strings and doing a binary search instead of a linear one.
Here is a non-recursive version, that uses a Stack object instead. No chance of getting a StackOverflowException:
string[] shortStrings = new string[] { "xxx", "yyy", "zzz" };
bool Test(string input)
{
Stack<string> stack = new Stack<string>();
stack.Push(input);
while (stack.Count > 0)
{
string str = stack.Pop();
if (str.Length == 0)
return true;
foreach (string shortStr in shortStrings)
{
if (str.StartsWith(shortStr))
stack.Push(str.Substring(shortStr.Length));
}
}
return false;
}
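Both versions above create a new substring for every step and may revisit the same position many times when pieces overlap. A different technique, dynamic programming over prefix lengths, avoids both; it is not what the answer above proposes, only a sketch of an alternative. ShortStringComposer and IsComposedOf are hypothetical names. Like the recursive version, it treats the empty string as a match, which differs from the regex with "+".

```csharp
using System;

public static class ShortStringComposer
{
    // Hypothetical helper: true when input is a concatenation of pieces.
    public static bool IsComposedOf(string input, string[] pieces)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (pieces == null) throw new ArgumentNullException(nameof(pieces));

        // reachable[i] is true when the first i characters of input can be
        // written as a concatenation of pieces.
        var reachable = new bool[input.Length + 1];
        reachable[0] = true; // the empty prefix needs no pieces

        for (int i = 0; i < input.Length; i++)
        {
            if (!reachable[i]) continue;
            foreach (string piece in pieces)
            {
                if (string.IsNullOrEmpty(piece)) continue;
                // Compare in place (ordinal, case sensitive); no substring is allocated.
                if (i + piece.Length <= input.Length &&
                    string.CompareOrdinal(input, i, piece, 0, piece.Length) == 0)
                {
                    reachable[i + piece.Length] = true;
                }
            }
        }
        return reachable[input.Length];
    }
}
```

The only extra memory is one bool per input character, and each position is expanded at most once, so the work is bounded by the input length times the total length of the pieces.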
EDIT 2:
Confirmed that my performance problems were due to the static function call to the StringExtensions class. Once removed, the IndexOf method is indeed the fastest way of accomplishing this.
What is the fastest, case insensitive, way to see if a string contains another string in C#? I see the accepted solution for the post here at Case insensitive 'Contains(string)' but I have done some preliminary benchmarking and it seems that using that method results in orders of magnitude slower calls on larger strings (> 100 characters) whenever the test string cannot be found.
Here are the methods I know of:
IndexOf:
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
if (string.IsNullOrEmpty(toCheck) || string.IsNullOrEmpty(source))
return false;
return source.IndexOf(toCheck, comp) >= 0;
}
ToUpper:
source.ToUpper().Contains(toCheck.ToUpper());
Regex:
bool contains = Regex.Match("StRiNG to search", "string", RegexOptions.IgnoreCase).Success;
So my question is, which really is the fastest way on average and why so?
EDIT:
Here is my simple test app I used to highlight the performance difference. Using this, I see 16 ms for ToLower(), 18 ms for ToUpper and 140 ms for the StringExtensions.Contains():
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Globalization;
namespace ScratchConsole
{
class Program
{
static void Main(string[] args)
{
string input = "";
while (input != "exit")
{
RunTest();
input = Console.ReadLine();
}
}
static void RunTest()
{
List<string> s = new List<string>();
string containsString = "1";
bool found;
DateTime now;
for (int i = 0; i < 50000; i++)
{
s.Add("AAAAAAAAAAAAAAAA AAAAAAAAAAAA");
}
now = DateTime.Now;
foreach (string st in s)
{
found = st.ToLower().Contains(containsString);
}
Console.WriteLine("ToLower(): " + (DateTime.Now - now).TotalMilliseconds);
now = DateTime.Now;
foreach (string st in s)
{
found = st.ToUpper().Contains(containsString);
}
Console.WriteLine("ToUpper(): " + (DateTime.Now - now).TotalMilliseconds);
now = DateTime.Now;
foreach (string st in s)
{
found = StringExtensions.Contains(st, containsString, StringComparison.OrdinalIgnoreCase);
}
Console.WriteLine("StringExtensions.Contains(): " + (DateTime.Now - now).TotalMilliseconds);
}
}
public static class StringExtensions
{
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
return source.IndexOf(toCheck, comp) >= 0;
}
}
}
ToUpper has to create a new string, so StringComparison.OrdinalIgnoreCase should be faster, and regex carries a lot of overhead for a simple comparison like this. String.IndexOf(String, StringComparison.OrdinalIgnoreCase) should therefore be the fastest, since it does not involve creating new strings.
I would guess (there I go again) that RegEx has the better worst case because of how it evaluates the string. IndexOf will always do a linear search; I'm guessing (again) that RegEx uses something a little better. RegEx should also have a best case that is close to, though not as good as, IndexOf (due to the additional complexity of its language).
15,000 length string, 10,000 loop
00:00:00.0156251 IndexOf-OrdinalIgnoreCase
00:00:00.1093757 RegEx-IgnoreCase
00:00:00.9531311 IndexOf-ToUpper
00:00:00.9531311 IndexOf-ToLower
Placement in the string also makes a huge difference:
At start:
00:00:00.6250040 Match
00:00:00.0156251 IndexOf
00:00:00.9687562 ToUpper
00:00:01.0000064 ToLower
At End:
00:00:00.5781287 Match
00:00:01.0468817 IndexOf
00:00:01.4062590 ToUpper
00:00:01.4218841 ToLower
Not Found:
00:00:00.5625036 Match
00:00:01.0000064 IndexOf
00:00:01.3750088 ToUpper
00:00:01.3906339 ToLower
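The timings listed above all appear to be multiples of roughly 15.6 ms, which suggests they were limited by the resolution of the system clock rather than measured precisely. A sketch of a finer-grained measurement using System.Diagnostics.Stopwatch follows; ContainsTimer, Measure and PrintComparison are hypothetical names, and the test string is made up to resemble the 15,000 character, 10,000 loop setup.

```csharp
using System;
using System.Diagnostics;

public static class ContainsTimer
{
    // Runs check the given number of times and returns the elapsed time.
    public static TimeSpan Measure(Func<bool> check, int iterations)
    {
        check(); // warm-up call so JIT compilation is not included in the timing
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            check();
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Prints timings for two of the approaches; the numbers depend on the machine.
    public static void PrintComparison()
    {
        string haystack = new string('A', 15000) + "Needle"; // match at the end
        const int loops = 10000;

        Console.WriteLine("IndexOf-OrdinalIgnoreCase: " + Measure(
            () => haystack.IndexOf("needle", StringComparison.OrdinalIgnoreCase) >= 0, loops));
        Console.WriteLine("ToUpper: " + Measure(
            () => haystack.ToUpper().Contains("NEEDLE"), loops));
    }
}
```

For results that are meant to be compared seriously, a dedicated benchmarking tool is usually a better choice than a hand-written loop like this one.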
I have found that a compiled RegEx is by far the fastest solution and is obviously much more versatile. Compiling it helps put it on par with smaller string comparisons and as you stated, there is no comparison with larger strings.
http://www.dijksterhuis.org/regular-expressions-advanced/ contains some hints to gain maximum speed from RegEx comparisons; you might find it helpful.
This was an interesting question to me, so I created a little test using different methods:
string content = "";
for (var i = 0; i < 10000; i++)
content = String.Format("{0} asldkfjalskdfjlaskdfjalskdfj laksdf lkwiuirh 9238 r9849r8 49834", content);
string test = String.Format("{0} find_me {0}", content);
string search = test;
var tickStart = DateTime.Now.Ticks;
//6ms
//var b = search.ToUpper().Contains("find_me".ToUpper());
//2ms
//Match m = Regex.Match(search, "find_me", RegexOptions.IgnoreCase);
//a little bit over 1ms
var c = false;
if (search.Length == search.ToUpper().Replace("find_me".ToUpper(), "x").Length)
c = true;
var tickEnd = DateTime.Now.Ticks;
Debug.Write(String.Format("{0} {1}", tickStart, tickEnd));
So what I have done is create a string and search in it:
first method: search.ToUpper().Contains("find_me".ToUpper()) took 5ms
second method: Match m = Regex.Match(search, "find_me", RegexOptions.IgnoreCase) took 2ms
third method:
if (search.Length == search.ToUpper().Replace("find_me".ToUpper(), "x").Length)
c = true;
took a little more than 1ms. Note that c is true here when the length is unchanged, that is, when the text was not found.