Split function in c# - c#

I have a program that accepts a URL (for example, care.org), gets the page source of that URL, and does some calculations.
string text = <the page source of care.org>
string separator = "car";
var cnt = text.ToLower().Split(separator,StringSplitOptions.None);
My aim is to count the number of occurrences of "car" in the page source.
My code treats care as 'car' | 'e' and splits it that way, but I want it to treat the separator as a whole word when splitting.
Please help me with this.

You should use regular expressions instead of the Split() method:
Regex regex = new Regex(@"\bcar\b"); // adjust the pattern if forms such as `car's` need different handling
Match match = regex.Match(text);
int cnt = 0;
while (match.Success)
{
    cnt++;
    match = match.NextMatch();
}
// cnt now holds the number of occurrences of `car`

This is how you can achieve what you want using regular expressions:
string text = "the page source of care.org";
string separator = @"\bcar\b";
MatchCollection resultsarray = Regex.Matches(text, separator);
Now resultsarray contains your matches. You can count them using
resultsarray.Count
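For anyone who wants to try the word-boundary approach end to end, here is a minimal sketch. The sample text is made up for illustration (it is not the real page source), and it is written as top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample text standing in for the downloaded page source.
string text = "care.org sells car parts; a Car is not a cart";

// \b matches a word boundary, so "care" and "cart" are skipped.
// RegexOptions.IgnoreCase stands in for the ToLower() call from the question.
int count = Regex.Matches(text, @"\bcar\b", RegexOptions.IgnoreCase).Count;

Console.WriteLine(count); // prints 2
```

Only the standalone "car" and "Car" are counted here; whether that is the right definition of a match depends on what you want to count.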

Split returns a string array, so you could just count the results. Splitting on n separators yields n + 1 pieces, so subtract one:
var cnt = text.ToLower().Split(separator, StringSplitOptions.None).Length - 1;

I don't think you need to split, since you are not going to do anything with the substrings. You only want a count, so look into using Regex.Matches(text, "car[^a-zA-Z0-9]") or similar to define the patterns you are interested in. Good luck!

Related

How to get the count of only special character in a string using Regex?

If my input string is ~!@#$%^&*()_+{}:"<>?
How do I get the count of each special character using Regex? For example:
Regex.Matches(inputText, "each special character").Count;
This should answer your question:
Regex.Matches("Little?~ birds! like to# sing##", "[~!@#$%^&*()_+{}:\"<>?]").Count
Count returns 6 for this input; replace the literal with your own variable.
You can find more information about regular expressions here:
http://www.zytrax.com/tech/web/regex.htm
Best Regards!
Instead of listing every special character and adding them up, do it the other way around: count the letters and digits and subtract that from the total length.
You can do that with a simple one-liner:
string input = "abc?&;3";
int numberOfSpecialCharacters = input.Length - input.Count(char.IsLetterOrDigit); //Gives 3
Which you can also change to
int numberOfSpecialCharacters = input.Count(c => !char.IsLetterOrDigit(c));
Regex is not the best way to do this. Here is a LINQ-based solution:
string chars = "~!@#$%^&*()_+{}:\"<>?";
foreach (var item in chars.Where(x => !char.IsLetterOrDigit(x)).GroupBy(x => x))
{
    Console.WriteLine(string.Format("{0},{1}", item.Key, item.Count()));
}
I understand that you need the count of each special character. Correct me if I am mistaken.
The non-regex way (which sounds much easier) is to make a list of the characters you want to check and use LINQ to count them.
string inputString = "asdf1!%jkl(!*";
List<char> charsToCheckFor = new List<char>() { '!', '@', '#', ..... };
int charCount = inputString.Count(x => charsToCheckFor.Contains(x));
I am making you write out all the characters you need to check for, because you need to decide which ones you want.
If you want to follow another approach, you can use:
string str = "#123:*&^789'!##$*()_+=";
int count = 0;
foreach (char c in str)
{
    if (!char.IsLetterOrDigit(c))
    {
        count++;
    }
}
MessageBox.Show(count.ToString());
It's been a while and I needed a similar answer for handling password validation. Pretty much what VITA said, but here was my specific take for others needing it for the same thing:
var pwdSpecialCharacterCount = Regex.Matches(item, "[~!@#$%^&*()_+{}:\"<>?]").Count;
var pwdMinNumericalCharacters = Regex.Matches(item, "[0-9]").Count;
var pwdMinUpperCaseCharacters = Regex.Matches(item, "[A-Z]").Count;
var pwdMinLowerCaseCharacters = Regex.Matches(item, "[a-z]").Count;
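As a rough sketch of how those four counts could feed a password check: the MeetsPolicy helper, its parameter names, and the minimums used below are all assumptions for illustration, not part of the answer above. It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical policy check; the name and the minimums are assumptions.
bool MeetsPolicy(string pwd, int minSpecial, int minDigits, int minUpper, int minLower)
{
    int special = Regex.Matches(pwd, "[~!@#$%^&*()_+{}:\"<>?]").Count;
    int digits = Regex.Matches(pwd, "[0-9]").Count;
    int upper = Regex.Matches(pwd, "[A-Z]").Count;
    int lower = Regex.Matches(pwd, "[a-z]").Count;

    return special >= minSpecial && digits >= minDigits
        && upper >= minUpper && lower >= minLower;
}

Console.WriteLine(MeetsPolicy("Abc123!?", 2, 3, 1, 2)); // prints True
Console.WriteLine(MeetsPolicy("abc123", 1, 1, 1, 1));   // prints False
```

Note that [A-Z] and [a-z] only cover ASCII letters, so accented or non-Latin letters are not counted by this sketch.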

C# how to pick out certain part in a string

I have a string in a masked TextBox that looks like this:
123.456.789.abc.def.ghi
"---.---.---.---.---.---" (masked TextBox format when empty, cannot use underscore X( )
Please ignore the value of the characters (they can be duplicated, and not unique as above). How can I pick out part of the string, say "789"? String.Remove() does not work, as it removes everything after the index.
You could use Split in order to separate your values if the . is always contained in your string.
string input = "123.456.789.abc.def";
string[] mySplitString = input.Split('.');
for (int i = 0; i < mySplitString.Length; i++)
{
    // Do your search here
}
Do you mean you want to obtain that part of the string? If so, you could use string.Split
string s = "123.456.789.abc.def.ghi";
var splitString = s.Split('.');
// splitString[2] will return "789"
You could simply use String.Split (if the string is actually what you have shown)
string str = "123.456.789.abc.def.ghi";
string[] parts = str.Split('.');
string third = parts.ElementAtOrDefault(2); // zero based index
if (third != null)
    Console.Write(third);
I used Enumerable.ElementAtOrDefault because it returns null instead of throwing an exception if there is no such index in the collection (otherwise it behaves like parts[2]).
Finding a string:
string str="123.456.789.abc.def.ghi";
int i = str.IndexOf("789");
string subStr = str.Substring(i,3);
Replacing the substring:
str = str.Replace("789","").Replace("..",".");
Regex:
str = Regex.Replace(str,"789","");
Regex gives you a lot of flexibility for finding things with minimal code; the drawback is that the patterns can be difficult to write.
If you know the index of where your substring begins and the length that it will be, you can use String.Substring(). This will give you the substring:
String myString = "123.456.789";
// looking for "789", which starts at index 8 and is length 3
String smallString = myString.Substring(8, 3);
If you are trying to remove a specific part of the string, use String.Replace():
String myString = "123.456.789";
String smallString = myString.Replace("789", "");
var newstr = new string(str.Where(c => "789".Contains(c)).ToArray()); // I guess this would work, or you can use something like this
Try using Replace.
str.Replace("789", "")
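On the point in the question that String.Remove() removes everything after the index: that is the one-argument overload. A small sketch, assuming the sample string from the question and top-level statements (C# 9 or later), showing the two-argument overload and a Split/Join alternative:

```csharp
using System;
using System.Linq;

string str = "123.456.789.abc.def.ghi";

// Remove(startIndex, count) deletes only count characters;
// here "789." starts at index 8 and is 4 characters long.
string viaRemove = str.Remove(8, 4);

// Alternative without hard-coded offsets: drop the third segment and rejoin.
string viaSplit = string.Join(".", str.Split('.').Where((part, index) => index != 2));

Console.WriteLine(viaRemove); // prints 123.456.abc.def.ghi
Console.WriteLine(viaSplit);  // prints 123.456.abc.def.ghi
```

The Split/Join version does not depend on the segments having a fixed width, which may matter if the mask ever changes.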

match first digits before # symbol

How to match all first digits before # in this line
26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#[URL=http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html]http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html[/URL]#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#http://bitshare.com/?f=dvk9o1oz#http://bitshare.com/delete/dvk9o1oz/4511e6f3612961f961a761adcb7e40a0/Sbrntrl_7x06-lilla.avi.html
I'm trying to get this number: 26909578
My attempt:
string text = @"26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#[URL=http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html]http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html[/URL]#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#http://bitshare.com/?f=dvk9o1oz#http://bitshare.com/delete/dvk9o1oz/4511e6f3612961f961a761adcb7e40a0/Sbrntrl_7x06-lilla.avi.html";
MatchCollection m1 = Regex.Matches(text, @"(.+?)#", RegexOptions.Singleline);
but it outputs all of the text.
Make it explicit that it has to start at the beginning of the string:
@"^(.+?)#"
Alternatively, if you know that this will always be a number, restrict the possible characters to digits:
@"^\d+"
Alternatively use the function Match instead of Matches. Matches explicitly says, "give me all the matches", while Match will only return the first one.
Or, in a trivial case like this, you might also consider a non-RegEx approach. The IndexOf() method will locate the '#' and you could easily strip off what came before.
I even wrote a sscanf() replacement for C#, which you can see in my article A sscanf() Replacement for .NET.
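A minimal sketch of the IndexOf() idea next to the anchored regex, for comparison. The input is a shortened, made-up version of the line from the question, and the code assumes top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Shortened, hypothetical version of the line from the question.
string text = "26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0";

// Non-regex approach: everything before the first '#'.
// If there is no '#', this sketch falls back to the whole string.
int hash = text.IndexOf('#');
string viaIndexOf = hash >= 0 ? text.Substring(0, hash) : text;

// Regex approach: leading digits only; Value is "" when nothing matches.
string viaRegex = Regex.Match(text, @"^\d+").Value;

Console.WriteLine(viaIndexOf); // prints 26909578
Console.WriteLine(viaRegex);   // prints 26909578
```

The two differ when the first field is not purely numeric: IndexOf() still returns it, while ^\d+ returns only the leading digits.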
If you don't want to use regex, use a StringBuilder and just loop until you hit the #, like this:
StringBuilder sb = new StringBuilder();
string yourdata = "yourdata";
int i = 0;
// stop at the first '#', or at the end of the string if there is none
while (i < yourdata.Length && yourdata[i] != '#')
{
    sb.Append(yourdata[i]);
    i++;
}
// when you reach the #, the StringBuilder holds the number you want, so return it with ToString()
string answer = sb.ToString();
The entire string (except the final url) is composed of segments that can be matched by (.+?)#, so you will get several matches. Retrieve only the first match from the collection returned by matching .+?(?=#)

Regex: C# extract text within double quotes

I want to extract only those words within double quotes. So, if the content is:
Would "you" like to have responses to your "questions" sent to you via email?
The answer must be
you
questions
Try this regex:
\"[^\"]*\"
or
\".*?\"
Explanation:
[^ character_group ]
Negation: Matches any single character that is not in character_group.
*?
Matches the previous element zero or more times, but as few times as possible.
Sample code:
foreach (Match match in Regex.Matches(inputString, "\"([^\"]*)\""))
    Console.WriteLine(match.Groups[1].Value); // group 1 is the text without the quotes
// or in LINQ
var result = from Match match in Regex.Matches(line, "\"([^\"]*)\"")
             select match.Groups[1].Value;
Based on @Ria's answer:
static void Main(string[] args)
{
    string str = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
    var reg = new Regex("\".*?\"");
    var matches = reg.Matches(str);
    foreach (var item in matches)
    {
        Console.WriteLine(item.ToString());
    }
}
The output is:
"you"
"questions"
You can use string.TrimStart() and string.TrimEnd() to remove the double quotes if you don't want them.
I like the regex solutions. You could also think of something like this
string str = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
var stringArray = str.Split('"');
Then take the odd-indexed elements from the array. If you use LINQ, you can do it like this:
var stringArray = str.Split('"').Where((item, index) => index % 2 != 0);
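One thing worth checking before choosing Split over the regex is how each behaves when a quote is left unclosed. A small sketch with a made-up input, assuming top-level statements (C# 9 or later):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input with an unbalanced third quote.
string input = "a \"b\" c \"d";

// Split approach: every odd-indexed piece is treated as quoted text.
string[] viaSplit = input.Split('"')
    .Where((item, index) => index % 2 != 0)
    .ToArray();

// Regex approach: only text between a matched pair of quotes is returned.
string[] viaRegex = Regex.Matches(input, "\"([^\"]*)\"")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

Console.WriteLine(string.Join(",", viaSplit)); // prints b,d
Console.WriteLine(string.Join(",", viaRegex)); // prints b
```

So the Split version also reports the text after a dangling quote, which may or may not be what you want.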
This also steals the regex from @Ria, but lets you get the matches into a collection and then remove the quotes:
string strText = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
MatchCollection mc = Regex.Matches(strText, "\"([^\"]*)\"");
for (int z = 0; z < mc.Count; z++)
{
    Response.Write(mc[z].ToString().Replace("\"", ""));
}
I combine Regex and Trim:
const string searchString = "This is a \"search text\" and \"another text\" and not \"this text";
var collection = Regex.Matches(searchString, "\\\"(.*?)\\\"");
foreach (var item in collection)
{
Console.WriteLine(item.ToString().Trim('"'));
}
Result:
search text
another text
Try this (\"\w+\")+
I suggest you download Expresso:
http://www.ultrapico.com/Expresso.htm
I needed to do this in C# for parsing CSV and none of these worked for me so I came up with this:
\s*(?:(?:(['"])(?<value>(?:\\\1|[^\1])*?)\1)|(?<value>[^'",]+?))\s*(?:,|$)
This will parse out a field with or without quotes and will exclude the quotes from the value while keeping embedded quotes and commas. <value> contains the parsed field value. Without using named groups, either group 2 or 3 contains the value.
There are better and more efficient ways to do CSV parsing and this one will not be effective at identifying bad input. But if you can be sure of your input format and performance is not an issue, this might work for you.
Slight improvement on the answer by @Ria:
\"[^\" ][^\"]*\"
Will recognize a starting double quote only when not followed by a space to allow trailing inch specifiers.
Side effect: It will not recognize "" as a quoted value.

C# Regex - Match and replace, Auto Increment

I have been toiling with a problem and any help would be appreciated.
Problem: I have a paragraph and I want to replace a variable which appears several times (Variable = @Variable). This is the easy part; the part I am having difficulty with is replacing the variable with different values.
I need for each occurrence to have a different value. For instance, I have a function that does a calculation for each variable. What I have thus far is below:
private string SetVariables(string input, string pattern)
{
    Regex rx = new Regex(pattern);
    MatchCollection matches = rx.Matches(input);
    int i = 1;
    if (matches.Count > 0)
    {
        foreach (Match match in matches)
        {
            rx.Replace(match.ToString(), getReplacementNumber(i));
            i++;
        }
    }
}
I am able to replace each variable that I need to with the number returned from the getReplacementNumber(i) function, but how do I put it back into my original input with the replaced values, in the same order found in the match collection?
Thanks in advance!
Marcus
Use the overload of Replace that takes a MatchEvaluator as its second parameter.
string result = rx.Replace(input, match => { return getReplacementNumber(i++); });
I'm assuming here that getReplacementNumber(int i) returns a string. If not, you will have to convert the result to a string.
See it working online: ideone
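In case the link goes stale, here is a self-contained sketch of the MatchEvaluator approach. The body of getReplacementNumber, the input text, and the "@Variable" pattern are assumptions made up for illustration; the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the question's getReplacementNumber.
string getReplacementNumber(int n) => (n * 10).ToString();

string input = "a = @Variable, b = @Variable, c = @Variable";
var rx = new Regex("@Variable");

int i = 1;
// The lambda runs once per match, in order, so each occurrence
// gets the next value and the rest of the input is left untouched.
string result = rx.Replace(input, match => getReplacementNumber(i++));

Console.WriteLine(result); // prints a = 10, b = 20, c = 30
```

Replace returns a new string rather than modifying input, which is why the loop in the question appeared to do nothing.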
