String operation in C#

I have input strings in the following format:
"http://testing/site/name/lists/tasks"
"http://testing/site/name1/lists/tasks"
"http://testing/site/name2/lists/tasks", etc.
How can I extract only name, name1, name2, etc. from these strings?
Here is what I have tried:
SiteName = (Url.Substring("http://testing/site/".Length)).Substring(Url.Length-12)
It is throwing an exception stating StartIndex cannot be greater than the number of characters in the string. What is wrong with my expression? How can I fix it? Thanks.

A better option would be to use Regex matching or replacement.
However, the following also works, assuming all the URLs follow the same pattern:
var value = Url.Replace(@"http://testing/site/", "").Replace(@"/lists/tasks", "");
Another option is to use the Uri class:
var uriAddress = new Uri(@"http://testing/site/name/lists/tasks");
and then break the URI down into its parts (for example via uriAddress.Segments) as required.

This is a job for a regular expression:
string strRegex = @"http://testing/site/(.+)/lists/tasks";
RegexOptions myRegexOptions = RegexOptions.IgnoreCase;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"http://testing/site/name/lists/tasks" + "\r\n" + @"http://testing/site/name1/lists/tasks" + "\r\n" + @"http://testing/site/name2/lists/tasks" + "\r\n" + @"http://testing/site/name3/lists/tasks";
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
    if (myMatch.Success)
    {
        // Add your code here. The site name is in the first group: myMatch.Groups[1].Value
    }
}

You could also use the Uri class to get the desired part:
string[] urlString = urlText.Split();
Uri uri = default(Uri);
List<string> names = urlString
    .Where(u => Uri.TryCreate(u, UriKind.Absolute, out uri))
    .Select(u => uri.Segments.FirstOrDefault(s => s.StartsWith("name", StringComparison.OrdinalIgnoreCase)))
    .ToList();
This assumes that the part always starts with "name". Note that each entry in Segments keeps its trailing slash (for example "name/"), so trim it if you need the bare name.

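The exception occurs because Substring with a single argument takes the index of the starting character and returns everything from there to the end of the string. Your first Substring call already leaves only "name/lists/tasks" (16 characters), so the second call's start index of Url.Length-12 (24 for the first URL) lies past the end of that shorter string. A somewhat naive fix is to skip the 20-character prefix "http://testing/site/" with Url.Substring(20); you then still have to cut off the trailing "/lists/tasks".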
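To make the index arithmetic concrete, here is a minimal sketch that uses the two-argument Substring(startIndex, length) overload. The class name, method name, and the Prefix/Suffix constants are made up for illustration and simply mirror the sample URLs in the question; it is one possible approach, not the only one.

```csharp
using System;

public static class SiteNameExample
{
    // Hypothetical constants taken from the sample URLs in the question.
    private const string Prefix = "http://testing/site/";
    private const string Suffix = "/lists/tasks";

    // Returns the part between Prefix and Suffix, or null if the URL does not fit the pattern.
    public static string ExtractSiteName(string url)
    {
        if (url == null
            || url.Length < Prefix.Length + Suffix.Length
            || !url.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase)
            || !url.EndsWith(Suffix, StringComparison.OrdinalIgnoreCase))
        {
            return null;
        }

        // The second argument of Substring is a length, not an end index.
        int length = url.Length - Prefix.Length - Suffix.Length;
        return url.Substring(Prefix.Length, length);
    }

    public static void Main()
    {
        Console.WriteLine(ExtractSiteName("http://testing/site/name1/lists/tasks")); // prints "name1"
    }
}
```

Checking the prefix and suffix first means a URL that does not match the expected shape yields null instead of an ArgumentOutOfRangeException.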

Related

Regex pattern BBCode to Wiki Notation, C#

I am tasked with converting BB code to wiki notation, and thanks to the many examples on SO I have cracked most of the tougher nuts. This is my first foray into Regex and I'm trying to learn it as I go (I would prefer StringBuilder, but it doesn't seem to work with BB code). There are 4 items I need replaced for which I cannot seem to create the proper pattern: (original string on left, what I need on right after double dash)
The first item is a problem child because the wiki engine adds a new line where the spaces are. It is not a separate field but part of a larger string, so I can't Trim() it. I am currently using
result = result.Replace("[b]", "*").Replace("[/b]", "*");
The img issue is that I need to somehow include the attributes, if possible, in the given format.
For the last 2 I am stumped. I have used
Regex r = new Regex(@"<a .*?href=['""](.+?)['""].*?>(.+?)</a>");
foreach (var match in r.Matches(multistring).Cast<Match>().OrderByDescending(m => m.Index))
{
    string href = match.Groups[1].Value;
    string txt = match.Groups[2].Value;
    string wikilink = "[" + txt + "|" + href + "]";
    sb.Remove(match.Groups[2].Index, match.Groups[2].Length);
    sb.Insert(match.Groups[2].Index, wikilink);
}
in the past for HTML, but I can't seem to refactor it for my current needs. Suggestions, links to resources, all would be appreciated.
EDIT
I solved the img issue, though it's not pretty and I still risk removing a closing [/img] tag that may not be caught earlier. The [img] code is fairly consistent, so I used:
Regex imgparser = new Regex(@"\[img[^\]]*\]([^\[]*)");
foreach (var itag in imgparser.Matches(multistring).Cast<Match>().OrderByDescending(m => m.Index))
{
    string isrc = itag.Groups[1].Value;
    string wikipic = itag.ToString().Replace("[img ", "!" + isrc).Replace("width=", "!width=").Replace("height=", ",height=").Replace("]" + isrc, string.Empty);
    result = result.Replace(itag.ToString(), wikipic);
}
result = result.Replace("[/img]", "!");

I can give you a small example for the last case:
string str1 = "[url=http://aadqsdqsd]link[/url]";
var pattern = @"^\[url=(.*)\](.*)\[\/url\]$";
var match = Regex.Match(str1, pattern);
var result = string.Format("[{0}| {1}]", match.Groups[2].Value, match.Groups[1].Value);
//[link| http://aadqsdqsd]
Is this what you want?
EDIT
If you want to match a larger string you can do:
var strTomatch = "[url=http://1]link1[/url][url=http://2]link2[/url]" + Environment.NewLine +
    "[url = http://3]link3[/url]" + Environment.NewLine +
    "[url=http://4]link4[/url]";
var match = Regex.Match(strTomatch, @"\[url\s*=\s*(.*?)\](.*?)\[\/url\]", RegexOptions.Multiline);
while (match.Success)
{
    var result = string.Format("[{0}| {1}]", match.Groups[2].Value, match.Groups[1].Value);
    Debug.WriteLine(result);
    match = match.NextMatch();
}
Output
[link1| http://1]
[link2| http://2]
[link3| http://3]
[link4| http://4]
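If the goal is to rewrite the tags in place rather than just print them, the same pattern can be handed to Regex.Replace. The following is only a sketch: the class and method names are invented for illustration, and the output format [text|target] is assumed from the wikilink string built in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BbCodeLinks
{
    // Matches [url=TARGET]TEXT[/url], allowing spaces around '='.
    private static readonly Regex UrlTag = new Regex(
        @"\[url\s*=\s*(.*?)\](.*?)\[/url\]",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Rewrites every [url=...]...[/url] tag as [text|target] and leaves other text untouched.
    public static string ToWikiLinks(string bbCode)
    {
        return UrlTag.Replace(bbCode, m => "[" + m.Groups[2].Value + "|" + m.Groups[1].Value + "]");
    }

    public static void Main()
    {
        Console.WriteLine(ToWikiLinks("see [url=http://1]link1[/url] and [url = http://2]link2[/url]"));
        // prints "see [link1|http://1] and [link2|http://2]"
    }
}
```

Because Regex.Replace rebuilds the whole string in one pass, there is no need to walk the matches in reverse order or to adjust indexes in a StringBuilder.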

Regex from a html parsing, how do I grab a specific string?

I'm trying to specifically get the string after charactername= and before " >. How would I use a regex to capture only the player name?
This is what I have so far, but it doesn't print anything. client.DownloadString returns a string like this:
<a href="https://my.examplegame.com/charactername=Atro+Roter" >
So I know it actually gets the string; I'm just stuck on the regex.
using (var client = new WebClient())
{
    // Example of what the string looks like on the console when I Console.WriteLine(html):
    // <a href="https://my.examplegame.com/charactername=Atro+Roter" >
    // I want the "Atro+Roter"
    string html = client.DownloadString(worldDest + world + inOrderName);
    string playerName = "https://my.examplegame.com/charactername=(.+?)\" >";
    MatchCollection m1 = Regex.Matches(html, playerName);
    foreach (Match m in m1)
    {
        Console.WriteLine(m.Groups[1].Value);
    }
}

I'm trying to specifically get the string after charactername= and before " >.
So, you just need a lookbehind with a lookahead, and you can use LINQ to collect all the match values into a list:
var input = "your input string";
var rx = new Regex(@"(?<=charactername=)[^""]+(?="")");
var res = rx.Matches(input).Cast<Match>().Select(p => p.Value).ToList();
The res variable should now hold all your character names.
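As a rough, self-contained illustration of the lookbehind/lookahead idea, the sketch below runs the pattern against the sample anchor tag from the question. The class and method names are hypothetical, and the WebUtility.UrlDecode step is an optional extra (an assumption about what the caller wants) that turns "Atro+Roter" into "Atro Roter".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

public static class CharacterNames
{
    // Everything after "charactername=" up to (but not including) the next double quote.
    private static readonly Regex NamePattern = new Regex(@"(?<=charactername=)[^""]+(?="")");

    // Returns all character names found in the HTML, URL-decoded.
    public static List<string> Extract(string html)
    {
        return NamePattern.Matches(html)
            .Cast<Match>()
            .Select(m => WebUtility.UrlDecode(m.Value))
            .ToList();
    }

    public static void Main()
    {
        string html = @"<a href=""https://my.examplegame.com/charactername=Atro+Roter"" >";
        foreach (string name in Extract(html))
        {
            Console.WriteLine(name); // prints "Atro Roter"
        }
    }
}
```

Drop the UrlDecode call if the raw value ("Atro+Roter") is what you need.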

I assume your issue is trying to parse the URL. Don't - use what .NET gives you:
var playerName = "https://my.examplegame.com/?charactername=NAME_HERE";
var uri = new Uri(playerName);
var queryString = HttpUtility.ParseQueryString(uri.Query);
Console.WriteLine("Name is: " + queryString["charactername"]);
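This is much easier to read and should perform well too. Note that it relies on charactername being a query-string parameter (there is a ? in the sample URL above, unlike the URL in the question), and that HttpUtility lives in the System.Web namespace.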
Working sample here: https://dotnetfiddle.net/iJlBKW

You can escape the forward slashes with backslashes, like this: \/ (in .NET regular expressions this is harmless but not actually required). The pattern below stops at the closing double quote:
string input = @"<a href=""https://my.examplegame.com/charactername=Atro+Roter"" >";
string playerName = @"https:\/\/my.examplegame.com\/charactername=(.+?)""";
Match match = Regex.Match(input, playerName);
string result = match.Groups[1].Value;
Result = Atro+Roter

Regex replace all matched tokens with lowercase

Given the following html text snippet
<th>Member name:</th>
<td>$$FULLNAME$$</td>
<th>Club:</th>
<td>$$ClubName$$</td>
<th>Business Category:</th>
<td>$$SubCategory$$</td>
I am trying to convert all the tokens to lowercase using C#, e.g. $$FULLNAME$$ becomes $$fullname$$. The output should be
<th>Member name:</th>
<td>$$fullname$$</td>
<th>Club:</th>
<td>$$clubname$$</td>
<th>Business Category:</th>
<td>$$subcategory$$</td>
I have come up with the following, which does not work correctly because the \L is not converting the matches to lowercase:
public static string TokenReplacer(string value)
{
    var pattern = Regex.Escape("$$") + "(.*?)" + Regex.Escape("$$");
    var regex = new Regex(pattern);
    return regex.Replace(value, Regex.Unescape("$$$$") + @"\L$1" + Regex.Unescape("$$$$"));
}

var output = Regex.Replace(input, #"\$\$.+?\$\$", m => m.Value.ToLower());
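The one-liner works because .NET replacement strings have no case-conversion operators such as \L, so the lowercasing has to happen in code via a MatchEvaluator delegate. Here is a slightly expanded sketch of the same idea; the class and method names are made up, and restricting token names to word characters (\w+) is an assumption about the template format.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenLowercaser
{
    // Matches $$TOKEN$$ where TOKEN consists of word characters (an assumption).
    private static readonly Regex Token = new Regex(@"\$\$(\w+)\$\$");

    // Lowercases the name inside every $$...$$ token and leaves the rest of the text alone.
    public static string LowercaseTokens(string html)
    {
        // The evaluator runs once per match; its return value is inserted literally.
        return Token.Replace(html, m => "$$" + m.Groups[1].Value.ToLowerInvariant() + "$$");
    }

    public static void Main()
    {
        Console.WriteLine(LowercaseTokens("<th>Club:</th><td>$$ClubName$$</td>"));
        // prints "<th>Club:</th><td>$$clubname$$</td>"
    }
}
```

ToLowerInvariant is used here so the result does not depend on the current culture; ToLower, as in the answer above, behaves the same for plain ASCII token names.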

best possible way to get given substring

Let's say I have a string in the format below:
[val1].[val2].[val3] ...
What is the best way to get the value from the last bracket set [valx]?
So for the given example
[val1].[val2].[val3]
the result would be val3.

You have to define "best" first: best in terms of readability or CPU cycles?
I assume this is efficient and readable enough:
string values = "[val1].[val2].[val3]";
string lastValue = values.Split('.').Last().Trim('[',']');
or with Substring, which can be more efficient but is not as safe, since you have to handle the case where there is no dot at all:
lastValue = values.Substring(values.LastIndexOf('.') + 1).Trim('[',']');
So you need to check this first:
int indexOflastDot = values.LastIndexOf('.');
if (indexOflastDot >= 0)
{
    lastValue = values.Substring(indexOflastDot + 1).Trim('[',']');
}

For a quick (though not very general) solution to your problem, I'd say:
var startIndex = input.LastIndexOf(".["); // index of the last ".["
then use the Substring method
var value = input.Substring(startIndex + 2); // 2 is the length of ".["
then remove the "]" with the TrimEnd function
value = value.TrimEnd(']');
This is by no means the only solution, and it is not a general one; it is just one of many answers to your problem.

I think you want to access the valx.
The easiest solution that comes to mind is this one:
public void Test()
{
    var splitted = "[val1].[val2].[val3]".Split('.');
    var val3 = splitted[2]; // "[val3]", brackets included
}

You can use the following:
string[] myStrings = ("[val1].[val2].[val3]").Split('.');
Now you can access the parts by index. For the last one you can use myStrings[myStrings.Length - 1]

Provided that none of val1...valN contains '.', '[' or ']', you can use simple LINQ code:
String str = @"[val1].[val2].[val3]";
String[] vals = str.Split('.').Select((x) => x.TrimStart('[').TrimEnd(']')).ToArray();
Or if all you want is the last value:
String str = @"[val1].[val2].[val3]";
String last = str.Split('.').Last().TrimStart('[').TrimEnd(']');

I'm assuming you always need the last bracket set. I would do it like this:
string input = "[val1].[val2].[val3]";
string[] splittedInput = input.Split('.');
string lastBraceSet = splittedInput[splittedInput.Length - 1];
string result = lastBraceSet.Substring(1, lastBraceSet.Length - 2);

string str = "[val1].[val2].[val3]";
string last = str.Split('.').LastOrDefault();
string result = last.Replace("[", "").Replace("]", "");

string input = "[val1].[val2].[val3]";
int startpoint = input.LastIndexOf("[") + 1;
string result = input.Substring(startpoint, input.Length - startpoint - 1);
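The LastIndexOf approach can be made a bit more defensive by locating the closing bracket as well, so that trailing text or a missing bracket does not cause an exception. This is just a sketch under the assumption that values never contain '[' or ']' themselves; the class and method names are invented for illustration.

```csharp
using System;

public static class LastBracketValue
{
    // Returns the text inside the last [...] pair, or null if there is no such pair.
    public static string GetLast(string input)
    {
        if (string.IsNullOrEmpty(input)) return null;

        int close = input.LastIndexOf(']');
        if (close < 0) return null;

        // Search backwards from the closing bracket for its opening bracket.
        int open = input.LastIndexOf('[', close);
        if (open < 0) return null;

        return input.Substring(open + 1, close - open - 1);
    }

    public static void Main()
    {
        Console.WriteLine(GetLast("[val1].[val2].[val3]")); // prints "val3"
    }
}
```

Since it never splits on '.', this variant also copes with values that contain dots.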

I'd use the regex below. One caveat is that it won't work if there are unbalanced square brackets after the last pair of brackets. Most of the other answers suffer from that too.
string s = "[val1].[val2].[val3]";
string pattern = @"(?<=\[)[^\]]+(?=\][^\[\]]*$)";
Match m = Regex.Match(s, pattern);
string result = null;
if (m.Success)
{
    result = m.Value;
}

I would use a regular expression, as it expresses the intent most clearly (note that the match includes the brackets):
string input = "[val1].[val2].[val3] ...";
string match = Regex.Matches(input, @"\[val\d+\]")
    .Cast<Match>()
    .Select(m => m.Value)
    .Last();

Remove String After Determinate String

I need to remove certain strings after another string within a piece of text.
I have a text file with some URLs and after the URL there is the RESULT of an operation. I need to remove the RESULT of the operation and leave only the URL.
Example of text:
http://website1.com/something Result: OK(registering only mode is on)
http://website2.com/something Result: Problems registered 100% (SOMETHING ELSE) Other Strings;
http://website3.com/something Result: error: "Âíèìàíèå, îáíàðóæåíà îøèáêà - Ìåñòî æèòåëüñòâà ñîäåðæèò íåäîïóñòèìûå ê
I need to remove all strings starting from Result: so the remaining strings have to be:
http://website1.com/something
http://website2.com/something
http://website3.com/something
Without Result: ........
The results are generated randomly so I don't know exactly what there is after RESULT:

One option is to use regular expressions as per some other answers. Another is just IndexOf followed by Substring:
int resultIndex = text.IndexOf("Result:");
if (resultIndex != -1)
{
    text = text.Substring(0, resultIndex);
}
Personally, I find that if I can get away with just a couple of very simple, easy-to-understand string operations, that is easier to get right than a regex. Once you get into real patterns (at least 3 of these, then one of those), regexes become a lot more useful, of course.
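Applied line by line, the IndexOf/Substring idea might look like the sketch below. The class and method names are hypothetical, and trimming the whitespace left in front of "Result:" is an extra step the question does not explicitly ask for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ResultStripper
{
    // Cuts a line at the first "Result:" marker (if any) and trims trailing whitespace.
    public static string StripResult(string line)
    {
        int resultIndex = line.IndexOf("Result:", StringComparison.Ordinal);
        string kept = resultIndex < 0 ? line : line.Substring(0, resultIndex);
        return kept.TrimEnd();
    }

    // Applies StripResult to every line, e.g. the output of File.ReadLines.
    public static List<string> StripAll(IEnumerable<string> lines)
    {
        return lines.Select(StripResult).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(StripResult("http://website1.com/something Result: OK(registering only mode is on)"));
        // prints "http://website1.com/something"
    }
}
```

Lines without a "Result:" marker pass through unchanged apart from the trailing-whitespace trim.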

string input = "Action2 Result: Problems registered 100% (SOMETHING ELSE) Other Strings; ";
string pattern = "^(Action[0-9]*) (.*)$";
string replacement = "$1";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
You use $1 to keep the match ActionXX.

Use Regex for this.
Example:
var r = new System.Text.RegularExpressions.Regex("Result:(.)*");
var result = r.Replace("Action Result:1231231", "");
Then result contains "Action " (including the trailing space, which you can remove with Trim()).

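You can try this code, which uses string.Replace:
var pattern = "Result:";
var lineContainYourValue = "jdfhkjsdfhsdf Result:ljksdfljh"; // the line to process
lineContainYourValue = lineContainYourValue.Replace(pattern, ""); // Replace returns a new string, so assign it
Note that this only removes the "Result:" marker itself, not the text that follows it.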

Something along the lines of this, perhaps?
string line;
using ( var reader = new StreamReader ( File.Open ( @"C:\temp\test.txt", FileMode.Open ) ) )
using ( var sw = new StreamWriter(File.Open( @"C:\Temp\test.edited.txt", FileMode.CreateNew ) ))
    while ( (line = reader.ReadLine()) != null )
        if(!line.StartsWith("Result:")) sw.WriteLine(line);

You can use Regex for this kind of processing.
using System.Text.RegularExpressions;
private string ParseString(string originalString)
{
    string pattern = ".*(?=Result:.*)";
    Match match = Regex.Match(originalString, pattern);
    return match.Value;
}

A LINQ approach:
IEnumerable<String> result = System.IO.File
    .ReadLines(path)
    .Where(l => l.StartsWith("Action") && l.Contains("Result"))
    .Select(l => l.Substring(0, l.IndexOf("Result")));

Given your current example, where you want only the website, use a regex that matches up to the first whitespace.
var fileLine = "http://example.com/sub/ random text";
Regex regexPattern = new Regex("(.*?)\\s");
var websiteMatch = regexPattern.Match(fileLine).Groups[1].ToString();
Debug.Print("!" + websiteMatch + "!");
Repeat this for each line in your text file. Regex explained: .* matches anything, ? makes the match ungreedy, the parentheses put the match into a group, and \\s matches whitespace.
