Getting substring for string

I have a string of the format
[00:26:19] Completed 80000 out of 500000 steps (16%)
from which I want to get the 16 part.
Should I search for ( and then get the % and get the portion in between, or would it be wiser to set up a regex query?

RegEx is probably going to be the trend, but I don't see a good reason for it, personally.
That being said, this should work:
String s = "[00:26:19] Completed 80000 out of 500000 steps (16%)";
Int32 start = s.LastIndexOf('(') + 1;
Console.WriteLine(s.Substring(start,s.LastIndexOf('%')-start));
And you can Convert.ToInt32() if you feel it necessary.

I would use a regular expression like this:
\(([^%]+)%\)$
This expression also captures non-numeric data. If you are certain that the text inside the parentheses, just to the left of the percent sign, will always be a number, you can simplify the expression to this:
(\d+)%\)$
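For completeness, here is a minimal sketch of how the numeric pattern might be applied in C#. The class and method names (ProgressParser, TryGetPercent) are made up for illustration, and the pattern is a variant that also requires the opening parenthesis, which is my assumption rather than part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProgressParser
{
    // Matches "(<digits>%)" at the very end of the input and captures the digits.
    // Requiring the leading "\(" is an assumption added for this sketch.
    private static readonly Regex PercentAtEnd = new Regex(@"\((\d+)%\)$");

    // Returns the captured percentage, or null when the line does not end in "(N%)".
    public static int? TryGetPercent(string line)
    {
        Match m = PercentAtEnd.Match(line);
        return m.Success ? int.Parse(m.Groups[1].Value) : (int?)null;
    }
}
```

Returning a nullable int lets the caller tell "no match" apart from a genuine 0%.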

Another fast way is:
string s = "[00:26:19] Completed 80000 out of 500000 steps (16%)";
string res = s.Split("(%".ToCharArray())[1];
This assumes that '(' and '%' each appear only once in the string.

It depends on how variable you expect the input string (the "haystack") to be, and how variable you expect your target pattern (the "needle") to be. Regexes are extremely useful for describing a whole class of needles in a largely unknown haystack, but they're not the right tool for input that's in a very static format.
If you know your string will always be something like:
"[A:B:C] Completed D out of E steps (F%)"
where 1) A-F are the only variable portions, and 2) A-F are always numeric, then all you need is a little string manipulation:
int GetPercentage(string str)
{
    return int.Parse(
        str.Substring(
            str.IndexOf('(') + 1,
            // length of the text between '(' and '%', excluding both
            str.IndexOf('%') - str.IndexOf('(') - 1
        )
    );
}
The key question here is: "Is the presence of ( and % sufficient to indicate the substring I'm trying to capture?" That is, will they only occur in that one position? If the rest of the haystack might contain ( or % somewhere, I'd use regex:
@"(?<=\()\d+(?=%\)$)"

Task to replace first and last character of two strings

I am doing an exercise that comes with a solution, but the solution's code is not explained and I cannot understand it. I hope I can get some help understanding it.
Exercise:
Write a C# program to create a new string from a given string where the first and last characters will change their positions.
Strings:
w3resource
Python
Expected output:
e3resourcw
nythoP
Solution:
public class Exercise16 {
    static void Main(string[] args)
    {
        Console.WriteLine(first_last("w3resource"));
        Console.WriteLine(first_last("Python"));
        Console.WriteLine(first_last("x"));
    }
    public static string first_last(string ustr)
    {
        // code that I don't understand
        return ustr.Length > 1
            ? ustr.Substring(ustr.Length - 1) + ustr.Substring(1, ustr.Length - 2) + ustr.Substring(0, 1) : ustr;
    }
}
P.S. I am a beginner in C#, but not in programming overall.

The ? operator is also called the conditional operator in C#. It acts like a miniature if statement, letting you express the entire statement in a single expression. In this case it is used to verify that there are at least two characters in the string; otherwise it returns the string itself unchanged.
As for the Substring calls, consider which characters are being extracted from ustr with each call:
ustr.Substring(ustr.Length - 1): extract the last character
ustr.Substring(1, ustr.Length - 2): extract all characters from the second to the second to last
ustr.Substring(0, 1): extract the first character
When concatenated in the order above you can see that the resulting string will start with the final character of the original string, followed by all characters from the second to the second to last, finally followed by the first character.
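To make the three pieces concrete, here is a small sketch that does the same thing as first_last but names each piece. SwapDemo and SwapFirstAndLast are names I made up for illustration; the comments trace the input "Python".

```csharp
using System;

public static class SwapDemo
{
    // Same logic as first_last, split into named pieces for readability.
    public static string SwapFirstAndLast(string s)
    {
        if (s.Length <= 1)
            return s; // nothing to swap

        string last = s.Substring(s.Length - 1);      // "n" for "Python"
        string middle = s.Substring(1, s.Length - 2); // "ytho" for "Python"
        string first = s.Substring(0, 1);             // "P" for "Python"
        return last + middle + first;                 // "nythoP"
    }
}
```

For a two-character string the middle piece is simply empty, so "ab" becomes "ba".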

Basically it says if the length is greater than 1 then execute this:
ustr.Substring(ustr.Length - 1) + ustr.Substring(1, ustr.Length - 2) + ustr.Substring(0, 1)
If not, return this string variable:
ustr
This is an example of the conditional operator "?:" (see the Microsoft Docs page on the conditional operator).
Substring returns a specific range of characters from a string. For examples, you can check the Substring examples in the documentation.

String.Contains and String.LastIndexOf C# return different result?

I have this problem where String.Contains returns true and String.LastIndexOf returns -1. Could someone explain to me what happened? I am using .NET 4.5.
static void Main(string[] args)
{
    String wikiPageUrl = @"http://it.wikipedia.org/wiki/ʿAbd_Allāh_al-Sallāl";
if (wikiPageUrl.Contains("wikipedia.org/wiki/"))
{
int i = wikiPageUrl.LastIndexOf("wikipedia.org/wiki/");
Console.WriteLine(i);
}
}

While @sa_ddam213's answer definitely fixes the problem, it might help to understand exactly what's going on with this particular string.
If you try the example with other "special characters", the problem doesn't appear. For example, the following strings work as expected:
string url1 = @"http://it.wikipedia.org/wiki/»Abd_Allāh_al-Sallāl";
Console.WriteLine(url1.LastIndexOf("it.wikipedia.org/wiki/")); // 7
string url2 = @"http://it.wikipedia.org/wiki/~Abd_Allāh_al-Sallāl";
Console.WriteLine(url2.LastIndexOf("it.wikipedia.org/wiki/")); // 7
The character in question, "ʿ", is called a spacing modifier letter [1]. A spacing modifier letter doesn't stand on its own, but modifies the previous character in the string, in this case a "/". Another way to put this is that it doesn't take up its own space when rendered.
LastIndexOf, when called with no StringComparison argument, compares strings using the current culture.
When strings are compared in a culture-sensitive manner, the "/" and "ʿ" characters are not seen as two distinct characters--they're processed into one character, which does not match the parameter passed in to LastIndexOf.
When you pass in StringComparison.Ordinal to LastIndexOf, the characters are treated as distinct, due to the nature of Ordinal comparison.
Another way to make this work would be to use CompareInfo.LastIndexOf and supply the CompareOptions.IgnoreNonSpace option:
Console.WriteLine(
CultureInfo.CurrentCulture.CompareInfo.LastIndexOf(
wikiPageUrl, @"it.wikipedia.org/wiki/", CompareOptions.IgnoreNonSpace));
// 7
Here we're saying that we don't want combining characters included in our string comparison.
As a side note, this means that @Partha's answer and @Noctis' answer only work because the character is being applied to a character that doesn't appear in the search string that's passed to LastIndexOf.
Contrast this with the Contains method, which by default performs an Ordinal (case-sensitive and culture-insensitive) comparison. This explains why Contains returns true and LastIndexOf returns -1.
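As a rough sketch of the practical takeaway: if both the check and the search are done ordinally, they cannot disagree. The class and member names below are hypothetical, and the URL is written with \u escapes (U+02BF for the modifier letter, U+0101 for "ā") only so that the source stays ASCII. The culture-sensitive result is deliberately not shown, since it depends on the runtime and culture in use.

```csharp
using System;

public static class OrdinalSearchDemo
{
    // The URL from the question, with the non-ASCII characters written as escapes.
    public const string Url =
        "http://it.wikipedia.org/wiki/\u02BFAbd_All\u0101h_al-Sall\u0101l";

    // Ordinal search: compares UTF-16 code units, ignoring culture rules.
    public static int OrdinalLastIndexOf(string needle) =>
        Url.LastIndexOf(needle, StringComparison.Ordinal);

    // An explicit ordinal "contains" check.
    public static bool OrdinalContains(string needle) =>
        Url.IndexOf(needle, StringComparison.Ordinal) >= 0;
}
```

With the needle "wikipedia.org/wiki/", OrdinalContains returns true and OrdinalLastIndexOf returns 10, the offset just past "http://it.".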
For a fantastic overview of how strings should be manipulated in the .NET framework, check out this article.
[1]: Is this different from a combining character, or is it a type of combining character? I would appreciate it if someone could clear that up for me.

Try using StringComparison.Ordinal
This compares the strings by evaluating the numeric values of the corresponding chars in each string, so it should work with the special characters you have in that example string:
string wikiPageUrl = @"http://it.wikipedia.org/wiki/ʿAbd_Allāh_al-Sallāl";
int i = wikiPageUrl.LastIndexOf("http://it.wikipedia.org/wiki/", StringComparison.Ordinal);
// returns 0;

The thing is, C#'s LastIndexOf searches from the end of the string.
And wikipedia.org/wiki/ is followed by ', which it takes as an escape sequence. So either remove the ' after wiki/ or put an @ there too.
Either of the following will work:
string wikiPageUrl = @"http://it.wikipedia.org/wiki/Abd_Allāh_al-Sallāl";
string wikiPageUrl = @"http://it.wikipedia.org/wiki/@ʿAbd_Allāh_al-Sallāl";
int i = wikiPageUrl.LastIndexOf("wikipedia.org/wiki");
All of these work.
If you want a generalized solution for this problem, replace ' with @' in your string before you perform any operations.

The ' character throws it off.
This should work when you escape the ' as \':
wikiPageUrl = @"http://it.wikipedia.org/wiki/\'Abd_Allāh_al-Sallāl";
if (wikiPageUrl.Contains("wikipedia.org/wiki/"))
{
"contains".Dump();
int i = wikiPageUrl.LastIndexOf("wikipedia.org/wiki/");
Console.WriteLine(i);
}
Figure out what you want to do (remove the ', escape it, or dig deeper :) ).

How to get the index of any character in a Unicode string

I have a string variable which holds the Chinese equivalent of an English string.
String temp = "'％1'不能输入步骤'％2'";
But when I try to check whether the string contains %1 by using the IndexOf function:
if(temp.IndexOf("%1") != -1)
{
}
the condition is not true even though the string contains %1.
Is this an issue caused by the Chinese characters, or by something else?
Please suggest how I can get the index of any character in the above case.

That is because ％1 is not equal to %1. As a workaround in this case, you can select the symbols out of the string you have, like this:
var s = "'％1'不能输入步骤'％2'";
var firstFragment = s.Substring(1, 2); // this should select you ％1
and then do
if (s.IndexOf(firstFragment) != -1) {
}

Comments gave the answer. Use the same percent character, so instead of:
"%1"
use:
"％1"
Or, if you find that problematic (your source code is in a "poor" code page, or you fear the code is hard to read when it contains full-width characters that resemble ASCII characters), use:
"\uFF051"
or even:
"\uFF05" + "1"
(the concatenation is done by the C# compiler, so no extra concatenation happens at run time).
Another approach might be Unicode normalization:
temp = temp.Normalize(NormalizationForm.FormKC);
which seems to project the "exotic" percent char into the usual ASCII percent char, although I am not sure if that behavior is guaranteed, but see the Decomposition field on Unicode Character 'FULLWIDTH PERCENT SIGN' (U+FF05).
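A minimal sketch of the normalization approach, assuming that compatibility normalization is acceptable for your data (it also folds other compatibility characters, not just the percent sign). The helper name IndexOfNormalized is made up for illustration.

```csharp
using System;
using System.Text;

public static class FullWidthSearch
{
    // Normalizes the haystack with NFKC first, then searches ordinally.
    // The returned index refers to the normalized string, which may differ
    // in length from the original.
    public static int IndexOfNormalized(string haystack, string needle)
    {
        string normalized = haystack.Normalize(NormalizationForm.FormKC);
        return normalized.IndexOf(needle, StringComparison.Ordinal);
    }
}
```

For example, FullWidthSearch.IndexOfNormalized("'\uFF051'", "%1") finds the ASCII "%1" at index 1, whereas a plain ordinal IndexOf on the original string would return -1.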

Modifying string.Substring

9 times out of 10, when I want to use the Substring() method on a string, it's because I want to shave some characters off the END of the string, not the start. While the default Substring does have support for this, it does so by taking a second parameter which is the length of the desired string, rather than the endpoint. This means if I consistently want to shave off N characters off of a series of strings of differing length, I have to do an extra step, which can result in a good deal more code. Take for example:
//Shaving the first N chars
string trimmed = foo.bar.widget.GetType().Name.Substring(N);
vs.
//Shaving the last N chars
string trimmed = foo.bar.widget.GetType().Name.Substring(0, foo.bar.widget.GetType().Name.Length - N);
or maybe to save the extra function call, use a second line:
string name = foo.bar.widget.GetType().Name;
string trimmed = name.Substring(0, name.Length - N);
Either way, you're basically doubling the amount of code necessary to shave characters off the end of the string rather than the beginning. My modification would be simple. If you pass a negative N (which would otherwise be meaningless), it would shave -N characters off the end of the string instead of the beginning. I can already code this up with an extension method:
public static string MySubstring(this string str, int val)
{
return (val > 0) ? str.Substring(val) : str.Substring(0, str.Length + val);
}
And then when I want to shave off the final N chars, I just do:
string trimmed = foo.bar.widget.GetType().Name.MySubstring(-N);
Short and sweet, just like shaving off the beginning characters. My question is - would it be possible to override the behavior of the default Substring() function so that I can do this without having to use my own unique name for the function? It's not like it would invalidate any existing code, because previously there was no reason to pass it a negative number, and doing so would simply throw an exception and crash. This is just one of those simple no-nonsense features that feels like it should've been part of the implementation to begin with.

According to C# documentation, you can use extension methods to extend a class or interface, but not to override them. An extension method with the same name and signature as an interface or class method will never be called. So the answer is "No".
Arguably, this is a good thing™, because otherwise your code would become a nightmare to read to someone not familiar with your extension.
Note: str.Substring(0, str.Length + val); can be replaced with str.Remove(str.Length + val)
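As a side note, if you are on C# 8 or later with a runtime that supports indices and ranges (an assumption about your environment), the "shave N characters off the end" case can be written without a custom Substring at all. The names TrimEndDemo and DropLast are hypothetical.

```csharp
using System;

public static class TrimEndDemo
{
    // Drops the last n characters using a range with a from-end index.
    // n must be between 0 and s.Length, otherwise an exception is thrown.
    public static string DropLast(string s, int n) => s[..^n];
}
```

For instance, TrimEndDemo.DropLast("widget", 2) gives "widg", and passing 0 returns the whole string.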

You can't override a method on string in the strict sense using extension methods, as the compiler will always choose an instance method over an extension method with the same signature when compiling a method call. However, you can achieve something close to what you want using named arguments. This should also help avoid readability issues. Here's an example
public static string Substring(this string @this, int trimFromEnd)
{
    return @this.Substring(0, @this.Length - trimFromEnd);
}
// if you do
"abc".Substring(1); // returns "bc"
// if you do
"abc".Substring(trimFromEnd: 1); // returns "ab"
Personally, I find this a bit more readable than Substring(-1) or just Substring(varName), where varName happens to be negative.

validate excel worksheet name

I'm getting the error below when setting the worksheet name dynamically. Does anyone have a regex to validate the name before setting it?
The name that you type does not exceed 31 characters. The name does not contain any of the following characters: : \ / ? * [ or ]
You did not leave the name blank.

You can use a method like this to check whether the sheet name is valid:
private bool IsSheetNameValid(string sheetName)
{
if (string.IsNullOrEmpty(sheetName))
{
return false;
}
if (sheetName.Length > 31)
{
return false;
}
char[] invalidChars = new char[] {':', '\\', '/', '?', '*', '[', ']'};
if (invalidChars.Any(sheetName.Contains))
{
return false;
}
return true;
}

To do worksheet validation for those specified invalid characters using Regex, you can use something like this:
string wsName = @"worksheetName"; // verbatim string so that special characters are taken literally
// the null/empty check comes first so that the regex is never run against null
bool nameIsValid = !string.IsNullOrEmpty(wsName) && wsName.Length <= 31 && !Regex.IsMatch(wsName, @"[:\\/?*\[\]]");
This also includes a check to see if the worksheet name is null or empty, or if it's longer than 31 characters. Those two checks aren't done via Regex for the sake of simplicity and to avoid over-engineering this problem.

Let's match the start of the string, then between 1 and 31 things that aren't on the forbidden list, then the end of the string. Requiring at least one means we refuse empty strings:
^[^:\/\\\?\*\[\]]{1,31}$
There's at least one nuance that this regex will miss: this will accept a sequence of spaces, tabs and newlines, which will be a problem if that is considered to be blank (as it probably is).
If you take the length check out of the regex, then you can get the blankness check by doing something like:
^[^:\/\\\?\*\[\]]*[^\s:\/\\\?\*\[\]][^:\/\\\?\*\[\]]*$
How does that work? If we abbreviate the forbidden characters above as WORKSHEET, that would be:
^[^WORKSHEET]*[^\sWORKSHEET][^WORKSHEET]*$
So we match zero or more non-forbidden characters, then a character that is neither forbidden nor whitespace, then zero or more non-forbidden characters. The key is that we demand at least one non-whitespace character in the middle section.
But we've lost the length check. It's hard to do both the length check and the regex in one expression. In order to count, we have to phrase things in terms of matching n times, and the things being matched have to be known to be of length 1. But in order to allow whitespace to be placed freely - as long as it's not all whitespace - we need to have a part of the match that is not necessarily of length 1.
Well, that's not quite true. At this point this starts to become a really bad idea, but nevertheless: onwards, into the breach! (for educational purposes only)
Instead of using * for the possibly-blank sections, we can bound the length of each one and list every way for the three sections to add up to at most 31. The first section can hold anywhere from 0 to 30 characters, so there are 31 alternatives:
^[^WORKSHEET]{0}[^\sWORKSHEET][^WORKSHEET]{0,30}$
|^[^WORKSHEET]{1}[^\sWORKSHEET][^WORKSHEET]{0,29}$
|^[^WORKSHEET]{2}[^\sWORKSHEET][^WORKSHEET]{0,28}$
....
|^[^WORKSHEET]{30}[^\sWORKSHEET][^WORKSHEET]{0,0}$
Obviously, if this were a good idea, you'd write a program that generates that expression rather than specifying it all by hand (and getting something wrong). But I don't think I need to tell you it's not a good idea. It is, however, the only answer I have to your question.
While admittedly not actually answering your question, I think @HatSoft has the right approach, encoding the conditions directly and clearly. After all, I'm now satisfied that an answer to your question as asked is not actually a helpful thing.
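Purely for the educational angle, here is a sketch of what "a program that writes the expression" might look like. SheetNameRegexBuilder and Build are hypothetical names. I am also assuming that the trailing section should be a range ({0,m}) so that names shorter than the maximum match too, and that ':' belongs in the forbidden set, as the error message says.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SheetNameRegexBuilder
{
    // Characters Excel rejects, escaped for use inside a character class.
    private const string Forbidden = @":\\/?*\[\]";

    // Builds one alternative per possible position k of a required
    // non-whitespace character: k leading characters, the required one,
    // then at most (maxLength - 1 - k) trailing characters.
    public static string Build(int maxLength = 31)
    {
        var alternatives = Enumerable.Range(0, maxLength).Select(k =>
            $"^[^{Forbidden}]{{{k}}}[^\\s{Forbidden}][^{Forbidden}]{{0,{maxLength - 1 - k}}}$");
        return string.Join("|", alternatives);
    }
}
```

Regex.IsMatch(name, SheetNameRegexBuilder.Build()) then rejects blank names, names containing a forbidden character, and names longer than 31 characters. The plain method with explicit checks is still far easier to read.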

You might want to do a check for the name History as this is a reserved sheet name in Excel.

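Something like this?
public string validate(string name)
{
    // Path.GetInvalidFileNameChars() is platform-dependent and does not
    // include '[' or ']', so strip those two explicitly as well (needs System.Linq).
    foreach (char c in Path.GetInvalidFileNameChars().Concat(new[] { '[', ']' }))
        name = name.Replace(c.ToString(), "");
    if (name.Length > 31)
        name = name.Substring(0, 31);
    return name;
}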
