Retrieving string from a source string which is between 2 strings

This might be very simple, but I would like to know whether there is an alternative way to extract the string that lies between a given start string and end string within a source string.
The following code does the job, but is there any better code than this? I think it will slow the system down if it is used in many places.
string strSource = "The LoadUserProfile call failed with the following error: ";
string strResult = string.Empty;
string strStart = "loaduserProfile";
string strEnd = "error";
int startindex = strSource.IndexOf(strStart, StringComparison.OrdinalIgnoreCase);
int endindex = strSource.LastIndexOf(strEnd, StringComparison.OrdinalIgnoreCase);
startindex = startindex + strStart.Length;
int length = endindex - startindex;
strResult = strSource.Substring(startindex, length);
Thanks
D.Mahesh

Use a regex and read the group value, though I'm not sure whether it will be faster or slower.
Here is example code that implements this using Regex (no VS here, so excuse any syntax errors):
string pattern = Regex.Escape(strStart) + @"(?<middle>[\s\S]*)" + Regex.Escape(strEnd);
Match match = Regex.Match(strSource, pattern, RegexOptions.IgnoreCase);
if (match.Success)
{
    // read the value of the group named "middle"
    strResult = match.Groups["middle"].Value;
}

Your code is pretty spot-on string manipulation. I don't think it can be made faster algorithmically. You can also do this with a regular expression, but I don't believe that will end up being any faster either.
If you don't need case insensitivity, changing StringComparison.OrdinalIgnoreCase to StringComparison.Ordinal should provide some speedup.
Otherwise, you probably have to look elsewhere for speed improvements.
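If this lookup is needed in many places, it may help to wrap the IndexOf/LastIndexOf approach from the question in one helper that also handles missing markers. The following is only a sketch: the class name TextBetween and the method name Find are made up for illustration, and returning null for "not found" is an assumption about what the caller wants.

```csharp
using System;

public static class TextBetween
{
    // Hypothetical helper: returns the text between the first occurrence of
    // `start` and the last occurrence of `end`, or null when either marker
    // is missing or the end marker does not come after the start marker.
    public static string Find(string source, string start, string end,
        StringComparison comparison = StringComparison.OrdinalIgnoreCase)
    {
        int startIndex = source.IndexOf(start, comparison);
        if (startIndex < 0) return null;
        startIndex += start.Length;

        int endIndex = source.LastIndexOf(end, comparison);
        if (endIndex < startIndex) return null;

        return source.Substring(startIndex, endIndex - startIndex);
    }
}
```

With the strings from the question, TextBetween.Find(strSource, strStart, strEnd) yields " call failed with the following ". Passing StringComparison.Ordinal as the fourth argument switches to the case-sensitive comparison suggested above.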

Related

counting a string with special characters in a string in c#

I would like to count a string (search term) in another string (logfile).
Splitting the string with the method Split and searching the array afterwards is too inefficient for me, because the logfile is very large.
On the net I found the following approach, which has worked quite well so far:
count = Regex.Matches(_editor.Text, txtLookFor.Text, RegexOptions.IgnoreCase).Count;
However, I am now running into a problem: I get the following error when I count a string of the form "Nachricht erhalten (".
Error message:
System.ArgumentException: "Nachricht erhalten (" analysed - not enough )-characters.

You need to escape the ( symbol, as it has a special meaning in regular expressions:
var test = Regex.Matches("Nachricht erhalten (3)", @"Nachricht erhalten \(", RegexOptions.IgnoreCase).Count;
If the search term comes from user input and the user is not familiar with regular expressions, you are probably better off using IndexOf in a while loop, where each call starts searching from the index found in the previous iteration. This might also perform a bit better than a regular expression. Example:
var test = "This is a test";
var searchFor = "is";
var count = 0;
var index = test.IndexOf(searchFor, 0);
while (index != -1)
{
++count;
index = test.IndexOf(searchFor, index + searchFor.Length);
}
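Both routes can be made safe for arbitrary user input. The sketch below shows one way, under the assumption that the search should stay case-insensitive as in the question: Regex.Escape escapes characters such as ( before the term is used as a pattern, and the IndexOf loop is wrapped in a method that guards against an empty search term. The class and method names are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OccurrenceCounter
{
    // Regex route: Regex.Escape makes the term match literally,
    // so "(" no longer opens a group.
    public static int CountWithRegex(string text, string term)
    {
        return Regex.Matches(text, Regex.Escape(term), RegexOptions.IgnoreCase).Count;
    }

    // IndexOf route: counts non-overlapping occurrences, ignoring case.
    public static int CountWithIndexOf(string text, string term)
    {
        if (string.IsNullOrEmpty(term)) return 0; // an empty term would never advance

        int count = 0;
        int index = text.IndexOf(term, StringComparison.OrdinalIgnoreCase);
        while (index != -1)
        {
            count++;
            index = text.IndexOf(term, index + term.Length, StringComparison.OrdinalIgnoreCase);
        }
        return count;
    }
}
```

Which of the two is faster on a large log file would have to be measured; neither needs to split the text into an array first.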

Find repeated occurrences in String

I'm currently trying to find all matches to a rule in a string and copy those to a vector. The purpose is to build an application which retrieves the top N .mp3 files (podcasts) from a community website.
My current tactic:
public static string getBetween(string strSource, string strStart, string strEnd)
{
int Start, End;
if (strSource.Contains(strStart) && strSource.Contains(strEnd))
{
Start = strSource.IndexOf(strStart, 0) + strStart.Length;
End = strSource.IndexOf(strEnd, Start);
string sFound = strSource.Substring(Start, End + 4 - Start);
strSource = strSource.Remove(Start, End + 4 - Start);
return sFound;
}
else
{
return "";
}
}
Called like this:
for (int i = 0; i < N; i++)
{
Podcast.Add(getBetween(searchDoc(@TARGET_HTM), "Sound/", ".mp3"));
}
Where searchDoc is:
public static string searchDoc(string strFile)
{
StreamReader sr = new StreamReader(strFile);
String line = sr.ReadToEnd();
return line;
}
Why am I posting such a big chunk of code?
This is my first application in C#. I assume my current tactic is flawed and I'd rather see a solution for the underlying problem than a cheap fix for lousy code. Feel free to do whatever you feel like though.
What it should do:
Find all occurrences of "Sound/" + * + ".mp3" (all MP3 files in the directory Sound, whatever their name), from the top of the target .htm file until N are found. Do so by returning the top occurrence and removing it from the string.
What it does:
It finds the first occurrence just fine. It also removes the occurrence just fine. However, it only does so from strSource which gets discarded at the end of the function.
Problem:
How do I return the modified string in a safe manner (no global variables or other improper tricks), so the found occurrence is properly removed and the next can be found?

This is the wrong approach. You can use Regex.Matches to get all matches of the pattern that you want. The regex would be something like @"Sound/[^/""]+\.mp3" (written as a verbatim string so that \. reaches the regex engine unchanged).
Once you have a list of matches you can apply .Cast<Match>().Take(3).Select(m => m.Value) to it to get the top 3 matches as strings.
It looks like you have a C++ background. This can lead to low-level designs out of habit. Try to avoid manual string parsing and loops.
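Put together, the Regex.Matches suggestion might look like the sketch below. The class name PodcastLinks and the method name FindTop are hypothetical, and the pattern is the approximate one given above, so it may need adjusting to the real markup of the target page.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PodcastLinks
{
    // Hypothetical helper: returns the first n "Sound/....mp3" paths found in html.
    public static List<string> FindTop(string html, int n)
    {
        return Regex.Matches(html, @"Sound/[^/""]+\.mp3")
            .Cast<Match>()          // MatchCollection is non-generic on older frameworks
            .Take(n)                // stop after the first n matches
            .Select(m => m.Value)   // keep only the matched text
            .ToList();
    }
}
```

Because the matches are collected in one pass, the source string never has to be modified or returned, which sidesteps the original problem entirely.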

Three flaws:
First, these two things seem to belong together strongly, but you split them over two functions.
Second, you forgot to use the startIndex parameter of IndexOf, requiring you to rebuild strings that are later discarded (this is a performance hit!)
Third, you had a small error: you hardcoded the length of strEnd as 4.
I just made an extension method based on your code, which fixes these 3 flaws. Untested, since I have no VS on this computer.
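public static List<string> Split(this string source, string start, string end) {
    List<string> result = new List<string>();
    int i = 0;
    int startIndex;
    // Search from position i instead of rebuilding the string each time.
    while ((startIndex = source.IndexOf(start, i)) != -1) {
        startIndex += start.Length;
        int endIndex = source.IndexOf(end, startIndex);
        if (endIndex == -1) break; // no closing marker left
        // Like the original code, the result includes the end marker.
        result.Add(source.Substring(startIndex, endIndex + end.Length - startIndex));
        i = endIndex + end.Length;
    }
    return result;
}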

How to convert a string containing escape characters to a string

I have a string that is returned to me which contains escape characters.
Here is a sample string
"test\40gmail.com"
As you can see it contains escape characters. I need it to be converted to its real value which is
"test@gmail.com"
How can I do this?

If you are looking to replace all escaped character codes, not only the code for @, you can use this snippet of code to do the conversion:
public static string UnescapeCodes(string src) {
var rx = new Regex("\\\\([0-9A-Fa-f]+)");
var res = new StringBuilder();
var pos = 0;
foreach (Match m in rx.Matches(src)) {
res.Append(src.Substring(pos, m.Index - pos));
pos = m.Index + m.Length;
res.Append((char)Convert.ToInt32(m.Groups[1].ToString(), 16));
}
res.Append(src.Substring(pos));
return res.ToString();
}
The code relies on a regular expression to find all sequences of hex digits, converting them to int, and casting the resultant value to a char.
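One caveat with matching any run of hex digits: in a string such as "a\40bc" the pattern would consume "40bc" rather than just "40". If the escapes are always exactly two hex digits, as in the sample, a narrower pattern avoids that. The sketch below assumes that two-digit format; the class name JidEscaping is made up, and it decodes every two-digit sequence rather than a specific whitelist of codes.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class JidEscaping
{
    // Replaces each backslash followed by exactly two hex digits
    // with the character that has that code.
    public static string Unescape(string src)
    {
        return Regex.Replace(src, @"\\([0-9A-Fa-f]{2})",
            m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }
}
```

For example, JidEscaping.Unescape(@"test\40gmail.com") returns "test@gmail.com".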

string test = @"test\40gmail.com";
test = test.Replace(@"\40", "@");
If you want a more general approach ...
HTML Decode

The sample string provided ("test\40gmail.com") is JID escaped. It is not malformed, and HttpUtility/WebUtility will not correctly handle this escaping scheme.
You can certainly do it with string or regex functions, as suggested in the answers from dasblinkenlight and C.Barlow. This is probably the cleanest way to achieve the desired result. I'm not aware of any .NET libraries for decoding JID escaping, and a brief search hasn't turned up much. Here is a link to some source which may be useful, though.

I just wrote this piece of code and it seems to work beautifully... It requires that the escape sequence is in hex, and is valid for values 0x00 to 0xFF.
// Example
str = remEscChars(@"Test\x0D"); // str = "Test\r"
Here is the code.
private string remEscChars(string str)
{
    int pos = 0;
    string subStr = null;
    string escStr = null;
    // Replace each \xNN sequence with the character it encodes.
    while ((pos = str.IndexOf(@"\x")) >= 0)
    {
        subStr = str.Substring(pos + 2, 2);
        escStr = Convert.ToString(Convert.ToChar(Convert.ToInt32(subStr, 16)));
        str = str.Replace(@"\x" + subStr, escStr);
    }
    return str;
}

.NET provides the static methods Regex.Unescape and Regex.Escape to perform this task and back again. Regex.Unescape will do what you need.
https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.unescape

parse string value from a URL using c#

I have a string "http://site1/site2/site3". I would like to get the value of "site2" out of the string. What is the best algorithm in C# to get the value? (No regex, because it needs to be fast.) I also need to make sure it doesn't throw any errors (it should just return null).
I am thinking something like this:
currentURL = currentURL.ToLower().Replace("http://", "");
int idx1 = currentURL.IndexOf("/");
int idx2 = currentURL.IndexOf("/", idx1);
string secondlevelSite = currentURL.Substring(idx1, idx2 - idx1);

Assuming currentURL is a string:
string result = new Uri(currentURL).Segments[1];
result = result.Substring(0, result.Length - 1);
The Substring call is needed because Segments[1] returns "site2/" instead of "site2".

Your example should be fast enough. If we really want to be nitpicky, then don't do the initial replace, because that will be at least an O(n) operation. Do a
int idx1 = currentURL.IndexOf("/", 8 /* or something */);
instead.
Thus you have two O(n) index look-ups that you optimized in the best possible way, and two O(1) operations with maybe a memcopy in the .NET's Substring(...) implementation... you can't go much faster with managed code.
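As a rough sketch of that idea, the helper below skips the scheme without a Replace, searches for the two slashes with IndexOf and a start offset, and returns null instead of throwing, as the question asks. The names UrlParts and SecondLevel are hypothetical, and treating an empty segment as "not found" is an assumption.

```csharp
using System;

public static class UrlParts
{
    // Hypothetical helper: returns the first path segment after the host
    // ("site2" in "http://site1/site2/site3"), or null if there is none.
    public static string SecondLevel(string url)
    {
        if (string.IsNullOrEmpty(url)) return null;

        // Skip "http://", "https://" and so on without allocating a new string.
        int schemeEnd = url.IndexOf("://", StringComparison.Ordinal);
        int hostStart = schemeEnd < 0 ? 0 : schemeEnd + 3;

        int firstSlash = url.IndexOf('/', hostStart);
        if (firstSlash < 0 || firstSlash + 1 >= url.Length) return null;

        // The segment ends at the next slash, or at the end of the string.
        int secondSlash = url.IndexOf('/', firstSlash + 1);
        int end = secondSlash < 0 ? url.Length : secondSlash;

        return end > firstSlash + 1
            ? url.Substring(firstSlash + 1, end - firstSlash - 1)
            : null;
    }
}
```

Note that the second search starts at firstSlash + 1; starting it at firstSlash would simply find the same slash again.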

currentURL = currentURL.ToLower().Replace("http://", "");
var arrayOfString = currentURL.Split('/'); // arrayOfString[1] is "site2"

My assumption is that you only need the second level. If there's no second level, it'll just return an empty value.
string secondLevel = string.Empty;
try
{
string currentURL = "http://stackoverflow.com/questionsdgsgfgsgsfgdsggsg/3358184/parse-string-value-from-a-url-using-c".Replace("http://", string.Empty);
int secondLevelStartIndex = currentURL.IndexOf("/", currentURL.IndexOf("/", 0)) + 1;
secondLevel = currentURL.Substring(secondLevelStartIndex, (currentURL.IndexOf("/", secondLevelStartIndex) - secondLevelStartIndex));
}
catch
{
secondLevel = string.Empty;
}

Remove characters after specific character in string, then remove substring?

I feel kind of dumb posting this when this seems kind of simple and there are tons of questions on strings/characters/regex, but I couldn't find quite what I needed (except in another language: Remove All Text After Certain Point).
I've got the following code:
[Test]
public void stringManipulation()
{
String filename = "testpage.aspx";
String currentFullUrl = "http://localhost:2000/somefolder/myrep/test.aspx?q=qvalue";
String fullUrlWithoutQueryString = currentFullUrl.Replace("?.*", "");
String urlWithoutPageName = fullUrlWithoutQueryString.Remove(fullUrlWithoutQueryString.Length - filename.Length);
String expected = "http://localhost:2000/somefolder/myrep/";
String actual = urlWithoutPageName;
Assert.AreEqual(expected, actual);
}
I tried the solution in the question above (hoping the syntax would be the same!) but nope. I want to first remove the queryString which could be any variable length, then remove the page name, which again could be any length.
How can I remove the query string from the full URL such that this test passes?

For string manipulation, if you just want to kill everything after the ?, you can do this
string input = "http://www.somesite.com/somepage.aspx?whatever";
int index = input.IndexOf("?");
if (index >= 0)
input = input.Substring(0, index);
Edit: If you want to remove everything after the last slash, do something like
string input = "http://www.somesite.com/somepage.aspx?whatever";
int index = input.LastIndexOf("/");
if (index >= 0)
input = input.Substring(0, index); // or index + 1 to keep slash
Alternately, since you're working with a URL, you can do something with it like this code
System.Uri uri = new Uri("http://www.somesite.com/what/test.aspx?hello=1");
string fixedUri = uri.AbsoluteUri.Replace(uri.Query, string.Empty);
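For the test in the question, the Uri route can also be written with Uri.GetLeftPart, which returns the URL up to and including the path and therefore drops the query string without a Replace. This is a sketch only: the names UrlTrim and FolderOf are invented, and the Uri constructor will throw if the input is not a valid absolute URL.

```csharp
using System;

public static class UrlTrim
{
    // Hypothetical helper: drops the query string, then the page name,
    // keeping the trailing slash of the containing folder.
    public static string FolderOf(string url)
    {
        var uri = new Uri(url);
        string withoutQuery = uri.GetLeftPart(UriPartial.Path);
        return withoutQuery.Substring(0, withoutQuery.LastIndexOf('/') + 1);
    }
}
```

Called with "http://localhost:2000/somefolder/myrep/test.aspx?q=qvalue", it returns "http://localhost:2000/somefolder/myrep/", which is the expected value in the test. It also does not need to know the page name in advance.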

To remove everything before the first /
input = input.Substring(input.IndexOf("/"));
To remove everything after the first /
input = input.Substring(0, input.IndexOf("/") + 1);
To remove everything before the last /
input = input.Substring(input.LastIndexOf("/"));
To remove everything after the last /
input = input.Substring(0, input.LastIndexOf("/") + 1);
An even simpler solution for removing characters after a specified char is to use the String.Remove() method as follows:
To remove everything after the first /
input = input.Remove(input.IndexOf("/") + 1);
To remove everything after the last /
input = input.Remove(input.LastIndexOf("/") + 1);
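The one-liners above assume that the separator is present. When it is not, IndexOf and LastIndexOf return -1, and the result is either an exception or an empty string, depending on the variant. A guarded version could look like the sketch below; the class and method names are hypothetical, and leaving the input unchanged when the separator is missing is an assumption about the desired behaviour.

```csharp
using System;

public static class SafeTrim
{
    // Keeps everything up to and including the last separator;
    // returns the input unchanged if the separator is absent.
    public static string UpToLast(string input, char separator)
    {
        int index = input.LastIndexOf(separator);
        return index >= 0 ? input.Substring(0, index + 1) : input;
    }

    // Keeps everything from the first separator onward;
    // returns the input unchanged if the separator is absent.
    public static string FromFirst(string input, char separator)
    {
        int index = input.IndexOf(separator);
        return index >= 0 ? input.Substring(index) : input;
    }
}
```

Substring is used here instead of Remove so that an index equal to the string length (a separator in the last position) is handled without special cases.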

Here's another simple solution. The following code will return everything before the '|' character:
if (path.Contains('|'))
path = path.Split('|')[0];
In fact, you could have as many separators as you want, but assuming you only have one separation character, here is how you would get everything after the '|':
if (path.Contains('|'))
path = path.Split('|')[1];
(All I changed in the second piece of code was the index of the array.)

The Uri class is generally your best bet for manipulating URLs.

To remove everything before a specific char, use the code below.
string1 = string1.Substring(string1.IndexOf('$') + 1);
This takes everything up to and including the $ char and removes it. If you want to remove everything after a character instead, keep only the part before it with string1.Substring(0, string1.IndexOf('$')).
But for a URL, I would use the built-in .NET class to take care of that.

Request.QueryString lets you read the parameters and values included in the URL.
Example:
string http = "http://dave.com/customers.aspx?customername=dave"; // the URL being requested
string customername = Request.QueryString["customername"].ToString();
So the customername variable should be equal to "dave".
Regards

I second Hightechrider: there is a specialized Url class already built for you.
I must also point out, however, that PHP's replaceAll uses regular expressions for the search pattern, which you can do in .NET as well - look at the Regex class.

You can use .NET's built-in method to remove a query string parameter,
i.e., Request.QueryString.Remove("whatever");
where whatever is the name of the query string parameter that you want to remove.
Try this...
I hope this will help.

You can use this extension method to remove query parameters (everything after the ?) in a string
public static string RemoveQueryParameters(this string str)
{
int index = str.IndexOf("?");
return index >= 0 ? str.Substring(0, index) : str;
}
