Validate Excel worksheet name - C#

I'm getting the error below when setting the worksheet name dynamically. Does anyone have a regex to validate the name before setting it?
The name that you type does not exceed 31 characters. The name does
not contain any of the following characters: : \ / ? * [ or ]
You did not leave the name blank.

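You can use the following method to check whether the sheet name is valid (the Any call requires using System.Linq):
private bool IsSheetNameValid(string sheetName)
{
    // Blank names are rejected.
    if (string.IsNullOrEmpty(sheetName))
    {
        return false;
    }
    // Names longer than 31 characters are rejected.
    if (sheetName.Length > 31)
    {
        return false;
    }
    // None of the characters : \ / ? * [ ] may appear in the name.
    char[] invalidChars = new char[] {':', '\\', '/', '?', '*', '[', ']'};
    if (invalidChars.Any(sheetName.Contains))
    {
        return false;
    }
    return true;
}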

To validate the worksheet name against the specified invalid characters using Regex, you can use something like this:
string wsName = @"worksheetName"; // verbatim string, so special characters are taken literally
Match m = Regex.Match(wsName, @"[:\\/?*\[\]]"); // matches any one of the forbidden characters : \ / ? * [ ]
bool nameIsValid = !(m.Success || string.IsNullOrEmpty(wsName) || wsName.Length > 31);
This also checks whether the worksheet name is null or empty, or longer than 31 characters. Those two checks aren't done via Regex, for the sake of simplicity and to avoid over-engineering the problem.

Let's match the start of the string, then between 1 and 31 things that aren't on the forbidden list, then the end of the string. Requiring at least one means we refuse empty strings:
^[^:\/\\\?\*\[\]]{1,31}$
There's at least one nuance that this regex will miss: it will accept a sequence of spaces, tabs and newlines, which will be a problem if that is considered to be blank (as it probably is).
If you take the length check out of the regex, then you can get the blankness check by doing something like:
^[^:\/\\\?\*\[\]]*[^\s:\/\\\?\*\[\]][^:\/\\\?\*\[\]]*$
How does that work? If we defined our class above as WORKSHEET, that would be:
^[^WORKSHEET]*[^\sWORKSHEET][^WORKSHEET]*$
So we match zero or more non-forbidden characters, then a character that is neither forbidden nor whitespace, then zero or more non-forbidden characters. The key is that we demand at least one non-whitespace character in the middle section.
But we've lost the length check. It's hard to do both the length check and the regex in one expression. In order to count, we have to phrase things in terms of matching n times, and the things being matched have to be known to be of length 1. But in order to allow whitespace to be placed freely - as long as it's not all whitespace - we need to have a part of the match that is not necessarily of length 1.
Well, that's not quite true. At this point this starts to become a really bad idea, but nevertheless: onwards, into the breach! (for educational purposes only)
Instead of using * for the possibly-blank sections, we can specify how many characters we expect in each, and include every way of placing the mandatory non-whitespace character within a name of at most 31 characters. How many positions can that character occupy? Well, there's 31 of them: 0, 1, 2, ... 30. If it sits at position k, it is preceded by exactly k allowed characters and followed by at most 30-k:
^[^WORKSHEET]{0}[^\sWORKSHEET][^WORKSHEET]{0,30}$
|^[^WORKSHEET]{1}[^\sWORKSHEET][^WORKSHEET]{0,29}$
|^[^WORKSHEET]{2}[^\sWORKSHEET][^WORKSHEET]{0,28}$
....
|^[^WORKSHEET]{30}[^\sWORKSHEET][^WORKSHEET]{0}$
Obviously, if this were a good idea, you'd write a program to generate that expression rather than specifying it all by hand (and getting something wrong). But I don't think I need to tell you it's not a good idea. It is, however, the only answer I have to your question.
While admittedly not actually answering your question, I think @HatSoft has the right approach, encoding the conditions directly and clearly. After all, I'm now satisfied that an answer to your question as asked is not actually a helpful thing.
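For completeness, .NET regexes support lookahead, which is one way to state the length limit separately from the character rules inside a single pattern. The following is only a sketch under assumptions: the class and method names are made up for illustration, whitespace-only names are treated as blank, and \A and \z are used instead of ^ and $ so that a trailing newline cannot slip past the end anchor.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SheetNameRegex
{
    // \A(?=[\s\S]{1,31}\z)   lookahead: the whole name is 1 to 31 characters long
    // [^:\\/?*\[\]]*          any number of allowed characters
    // [^\s:\\/?*\[\]]         one allowed character that is not whitespace
    // [^:\\/?*\[\]]*\z        any number of allowed characters, then end of input
    private static readonly Regex Pattern = new Regex(
        @"\A(?=[\s\S]{1,31}\z)[^:\\/?*\[\]]*[^\s:\\/?*\[\]][^:\\/?*\[\]]*\z");

    // Hypothetical helper: true when the name passes all three rules.
    public static bool IsValid(string name)
    {
        return name != null && Pattern.IsMatch(name);
    }
}
```

Even in this form the pattern is harder to read than three plain if statements, which supports the conclusion above.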

You might want to do a check for the name History as this is a reserved sheet name in Excel.
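A minimal sketch of how that check could sit next to the other rules, without a regex. The class name is hypothetical, and two points are assumptions rather than something stated in this thread: that whitespace-only names count as blank, and that the reserved name should be compared case-insensitively.

```csharp
using System;

public static class SheetNameValidator
{
    private static readonly char[] InvalidChars = { ':', '\\', '/', '?', '*', '[', ']' };

    public static bool IsValid(string sheetName)
    {
        // Assumption: a name made only of whitespace is treated as blank.
        if (string.IsNullOrWhiteSpace(sheetName)) return false;

        // Names longer than 31 characters are rejected.
        if (sheetName.Length > 31) return false;

        // Reject any of the forbidden characters.
        if (sheetName.IndexOfAny(InvalidChars) >= 0) return false;

        // Assumption: the reserved name "History" is matched regardless of case.
        return !sheetName.Equals("History", StringComparison.OrdinalIgnoreCase);
    }
}
```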

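Something like that?
public string validate(string name)
{
    // On Windows, Path.GetInvalidFileNameChars() includes : \ / ? * but not [ or ],
    // so the two brackets are added explicitly (Concat requires using System.Linq).
    foreach (char c in Path.GetInvalidFileNameChars().Concat(new[] { '[', ']' }))
        name = name.Replace(c.ToString(), "");
    // Truncate to the 31-character limit.
    if (name.Length > 31)
        name = name.Substring(0, 31);
    return name;
}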

Related

String.Contains and String.LastIndexOf C# return different result?

I have this problem where String.Contains returns true and String.LastIndexOf returns -1. Could someone explain to me what happened? I am using .NET 4.5.
static void Main(string[] args)
{
    String wikiPageUrl = @"http://it.wikipedia.org/wiki/ʿAbd_Allāh_al-Sallāl";
    if (wikiPageUrl.Contains("wikipedia.org/wiki/"))
    {
        int i = wikiPageUrl.LastIndexOf("wikipedia.org/wiki/");
        Console.WriteLine(i);
    }
}

While @sa_ddam213's answer definitely fixes the problem, it might help to understand exactly what's going on with this particular string.
If you try the example with other "special characters", the problem isn't exhibited. For example, the following strings work as expected:
string url1 = @"http://it.wikipedia.org/wiki/»Abd_Allāh_al-Sallāl";
Console.WriteLine(url1.LastIndexOf("it.wikipedia.org/wiki/")); // 7
string url2 = @"http://it.wikipedia.org/wiki/~Abd_Allāh_al-Sallāl";
Console.WriteLine(url2.LastIndexOf("it.wikipedia.org/wiki/")); // 7
The character in question, "ʿ", is called a spacing modifier letter [1]. A spacing modifier letter doesn't stand on its own, but modifies the previous character in the string, in this case a "/". Another way to put this is that it doesn't take up its own space when rendered.
LastIndexOf, when called with no StringComparison argument, compares strings using the current culture.
When strings are compared in a culture-sensitive manner, the "/" and "ʿ" characters are not seen as two distinct characters--they're processed into one character, which does not match the parameter passed in to LastIndexOf.
When you pass in StringComparison.Ordinal to LastIndexOf, the characters are treated as distinct, due to the nature of Ordinal comparison.
Another way to make this work would be to use CompareInfo.LastIndexOf and supply the CompareOptions.IgnoreNonSpace option:
Console.WriteLine(
    CultureInfo.CurrentCulture.CompareInfo.LastIndexOf(
        wikiPageUrl, @"it.wikipedia.org/wiki/", CompareOptions.IgnoreNonSpace));
// 7
Here we're saying that we don't want combining characters included in our string comparison.
As a side note, this means that @Partha's answer and @Noctis' answer only work because the character is being applied to a character that doesn't appear in the search string that's passed to LastIndexOf.
Contrast this with the Contains method, which by default performs an ordinal (case-sensitive and culture-insensitive) comparison. This explains why Contains returns true and LastIndexOf returns -1.
For a fantastic overview of how strings should be manipulated in the .NET framework, check out this article.
1: Is this different from a combining character, or is it a type of combining character? I would appreciate it if someone would clear that up for me.

Try using StringComparison.Ordinal.
This compares the strings by evaluating the numeric values of the corresponding chars in each string, so it should work with the special characters in that example string:
string wikiPageUrl = @"http://it.wikipedia.org/wiki/ʿAbd_Allāh_al-Sallāl";
int i = wikiPageUrl.LastIndexOf("http://it.wikipedia.org/wiki/", StringComparison.Ordinal);
// returns 0

The thing is that C#'s LastIndexOf searches from the end of the string.
And wikipedia.org/wiki/ is followed by ', which it takes as an escape sequence. So either remove the ' after wiki/ or put an @ there too.
Either of the following will work:
string wikiPageUrl = @"http://it.wikipedia.org/wiki/Abd_Allāh_al-Sallāl";
string wikiPageUrl = @"http://it.wikipedia.org/wiki/@ʿAbd_Allāh_al-Sallāl";
int i = wikiPageUrl.LastIndexOf("wikipedia.org/wiki");
All three work.
If you want a generalized solution for this problem, replace ' with @' in your string before you perform any operations.

The ' character throws it off.
This should work when you escape the ' as \':
wikiPageUrl = @"http://it.wikipedia.org/wiki/\'Abd_Allāh_al-Sallāl";
if (wikiPageUrl.Contains("wikipedia.org/wiki/"))
{
    "contains".Dump();
    int i = wikiPageUrl.LastIndexOf("wikipedia.org/wiki/");
    Console.WriteLine(i);
}
Figure out what you want to do (remove the ', escape it, or dig deeper :) ).

How to get index of any character in unicode string

I have a string variable that holds the Chinese translation of an English phrase.
String temp = "'％1'不能输入步骤'％2'";
But when I check whether the string contains %1 by using the IndexOf function,
if(temp.IndexOf("%1") != -1)
{
}
the condition is not true even though the string contains %1.
Is this an issue with the Chinese characters, or something else?
Please suggest how I can get the index of a character in this case.

That is because ％1 (with a full-width percent sign) is not equal to %1. As a workaround, you can select the symbols out of the string you already have, like
var s = "'％1'不能输入步骤'％2'";
var firstFragment = s.Substring(1, 2); // this selects ％1
and then do
if (s.IndexOf(firstFragment) != -1) {
}

Comments gave the answer. Use the same percent character, so instead of:
"%1"
use:
"％1"
Or, if you find that problematic (your source code is in a "poor" code page, or you fear the code is hard to read when it contains full-width characters that resemble ASCII characters), use:
"\uFF051"
or even:
"\uFF05" + "1"
(the concatenation is done by the C# compiler; no extra concatenation happens at run time).
Another approach might be Unicode normalization:
temp = temp.Normalize(NormalizationForm.FormKC);
which seems to map the "exotic" percent character to the usual ASCII percent character. I am not sure whether that behavior is guaranteed, but see the Decomposition field on Unicode Character 'FULLWIDTH PERCENT SIGN' (U+FF05).
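The normalization approach could be sketched as follows. The class and method names are hypothetical; both the text and the search value are normalized to form KC and then compared ordinally. Note that the returned index refers to the normalized text, which can differ in length from the original for some inputs.

```csharp
using System;
using System.Text;

public static class FullWidthSearch
{
    // Returns the index of value within text after both have been
    // normalized to NFKC, or -1 if it is not found.
    public static int IndexOfNormalized(string text, string value)
    {
        string normalizedText = text.Normalize(NormalizationForm.FormKC);
        string normalizedValue = value.Normalize(NormalizationForm.FormKC);
        return normalizedText.IndexOf(normalizedValue, StringComparison.Ordinal);
    }
}
```

With the string from the question, FullWidthSearch.IndexOfNormalized(temp, "%1") finds the placeholder whether it was typed with the ASCII or the full-width percent sign.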

c# Split sentence

Is it possible to split these combined words into two?
ex: "Firstname" to
"First"
"Name"
I have a bunch of properties, e.g. FirstName, LastName, etc., and I need to display them on my page. That's why I need to split the property names to display them in a more appropriate way.

Your aim is fuzzy.
If each word in a property name always starts with an uppercase letter, you can find the positions of all uppercase letters in the name and divide it at those positions.
If uppercase letters are not guaranteed, the best way would be to create a transform table. The table would define pairs of initial property name and resulting text. This way you will have a simple map for the transformation.

Edit: OP specified that he needs to split property names
If you follow CamelCase naming convention for properties (i.e. "FirstName" instead of "Firstname"), you can split the words by upper case characters quite easily.
// Requires using System.Text; for StringBuilder.
string[] SplitByCaps(string input)
{
    StringBuilder output = new StringBuilder();
    for (int i = 0; i < input.Length; i++)
    {
        char c = input[i];
        // Insert a space before every uppercase letter except the first character.
        if (i > 0 && Char.IsUpper(c))
            output.Append(' ');
        output.Append(c);
    }
    return output.ToString().Split(' ');
}
Original answer:
I would say, for practical purposes, it's not possible to do this for any arbitrary string.
Of course it is possible to write a program to do this, but whatever your actual needs are, that program would be overkill. There might also be libraries that already do this, but they would be so heavy that you wouldn't want to take a dependency on them.
Any program which could achieve this would have to know all words in the English language (let's not even consider multilanguage solutions). You would also require an intelligent lexical parser, because for any word, there might be more than one possible way to split it.
I suggest you look into some other way to solve your particular problem.

Unless you have a dictionary of all 'single' words the only solution I can think of is to split on upper letters:
FirstName -> First Name
The problem will still exist for UIFilter -> UI Filter.
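One possible way to handle the acronym case is to split with zero-width lookarounds instead of on every uppercase letter. This is only a sketch: the class and method names are made up, it assumes ASCII letters only, and it treats a run of capitals followed by a capitalized word as an acronym plus a word.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PropertyNameSplitter
{
    // Split positions:
    //   between a lowercase letter and an uppercase letter    (First|Name)
    //   between an acronym and the next capitalized word      (UI|Filter)
    private static readonly Regex WordBoundary = new Regex(
        @"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

    public static string ToDisplayName(string propertyName)
    {
        return string.Join(" ", WordBoundary.Split(propertyName));
    }
}
```

A name with no inner capital, such as "Firstname", is returned unchanged, so the dictionary problem described above still applies to it.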

You can use Substring to get the first 5 characters from the string. Then replace the first five characters in the original string with an empty string.
string str = "Firstname";
string firstPart = str.Substring(0, 5); // "First"
string secondPart = str.Replace(firstPart, ""); // "name"
If you want to make it generic for any word to be split, then you need to have some definite criteria on which you can divide the word into parts. Without definite criteria, it is not possible to split the string as expected by you.

C# Extracting a name from a string

I want to extract 'James\, Brown' from the string below, but I don't always know what the name will be. The comma is causing me some difficulty, so what would you suggest to extract James\, Brown?
OU=James\, Brown,OU=Test,DC=Internal,DC=Net
Thanks

A regex is likely your best approach:
static string ParseName(string arg) {
    var regex = new Regex(@"^OU=([a-zA-Z\\]+\,\s+[a-zA-Z\\]+)\,.*$");
    var match = regex.Match(arg);
    return match.Groups[1].Value;
}

You can use a regex:
string input = @"OU=James\, Brown,OU=Test,DC=Internal,DC=Net";
Match m = Regex.Match(input, "^OU=(.*?),OU=.*$");
Console.WriteLine(m.Groups[1].Value);

A quite brittle way to do this might be...
string name = @"OU=James\, Brown,OU=Test,DC=Internal,DC=Net";
string[] splitUp = name.Split("=".ToCharArray(), 3);
string namePart = splitUp[1].Replace(",OU", "");
Console.WriteLine(namePart);
I wouldn't necessarily advocate this method, but I've just come back from a departmental Christmas lunch and my brain is not fully engaged yet.

I'd start off with a regex to split up the groups:
Regex rx = new Regex(@"(?<!\\),"); // a comma that is not preceded by a backslash
String test = "OU=James\\, Brown,OU=Test,DC=Internal,DC=Net";
String[] segments = rx.Split(test);
But from there I would split each segment manually, so that you don't have to use a regex that depends on more than the separator character used. Since this looks like an LDAP query, it might not matter if you always look at segments[0], but there is a chance that the name might be set as "CN=". You can cover both cases by just reading the query like this:
String name = segments[0].Split(new[] { '=' }, 2)[1];
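Put together, the two steps might look like the sketch below. The class and method names are hypothetical, and it assumes the only escaping that matters is a backslash directly before a comma; an escaped backslash followed by a real separator comma would be split incorrectly.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DistinguishedName
{
    // A comma that is not preceded by a backslash separates two segments.
    private static readonly Regex UnescapedComma = new Regex(@"(?<!\\),");

    // Returns the value of the first segment, whatever its key is (OU, CN, ...).
    public static string FirstValue(string dn)
    {
        string[] segments = UnescapedComma.Split(dn);

        // Split on the first '=' only, so any '=' inside the value is kept.
        string[] pair = segments[0].Split(new[] { '=' }, 2);
        return pair.Length == 2 ? pair[1] : string.Empty;
    }
}
```

The value is returned as written, including the escaping backslash (James\, Brown), which is what the question asks for.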

That looks suspiciously like an LDAP or Active Directory distinguished name formatted according to RFC 2253/4514.
Unless you're working with well known names and/or are okay with a fragile hackaround (like the regex solutions) - then you should start by reading the spec.
If you, like me, generally hate implementing code according to RFCs - then hope this guy did a better job following the spec than you would. At least he claims to be 2253 compliant.

If the backslash is always there, I would look at potentially using a regex to do the match; you can use match groups for the last and first names.
^OU=([a-zA-Z]+)\\,\s([a-zA-Z]+)
That regex will match names that consist of letters only; you will need to refine it a bit to better match non-standard names. Here is a RegEx tester to help you along the way if you go this route.

Replace \, with your own preferred magic string (perhaps &#44;), split on the remaining commas or search until the first comma, then replace your magic string with a single comma.
i.e. something like:
string originalStr = @"OU=James\, Brown,OU=Test,DC=Internal,DC=Net";
string replacedStr = originalStr.Replace(@"\,", "&#44;");
string name = replacedStr.Substring(0, replacedStr.IndexOf(","));
Console.WriteLine(name.Replace("&#44;", ","));

Assuming you're running in Windows, use PInvoke with DsUnquoteRdnValueW. For code, see my answer to another question: https://stackoverflow.com/a/11091804/628981

If the format is always the same:
string line = GetStringFromWherever();
int start = line.IndexOf("=") + 1;//+1 to get start of name
int end = line.IndexOf("OU=",start) -1; //-1 to remove comma
string name = line.Substring(start, end - start);
Forgive if syntax is not quite right - from memory. Obviously this is not very robust and fails if the format ever changes.

Phone Number Formatting, OnBlur

I have a .NET WinForms textbox for a phone number field. After allowing free-form text, I'd like to format the text as a "more readable" phone number after the user leaves the textbox. (Outlook has this feature for phone fields when you create/edit a contact)
1234567 becomes 123-4567
1234567890 becomes (123) 456-7890
(123)456.7890 becomes (123) 456-7890
123.4567x123 becomes 123-4567 x123
etc

A fairly simple-minded approach would be to use a regular expression. Depending on which type of phone numbers you're accepting, you could write a regular expression that looks for the digits (for US-only, you know there can be 7 or 10 total - maybe with a leading '1') and potential separators between them (period, dash, parens, spaces, etc.).
Once you run the match against the regex, you'll need to write the logic to determine what you actually got and format it from there.
EDIT: Just wanted to add a very basic example (by no means is this going to work for all of the examples you posted above). Geoff's suggestion of stripping non-numeric characters might help out a bit depending on how you write your regex.
Regex regex = new Regex(@"(?<areaCode>([\d]{3}))?[\s.-]?(?<leadingThree>([\d]{3}))[\s.-]?(?<lastFour>([\d]{4}))[x]?(?<extension>[\d]{1,})?");
string phoneNumber = "701 123-4567x324";
Match phoneNumberMatch = regex.Match(phoneNumber);
if (phoneNumberMatch.Success)
{
    if (phoneNumberMatch.Groups["areaCode"].Success)
    {
        Console.WriteLine(phoneNumberMatch.Groups["areaCode"].Value);
    }
    if (phoneNumberMatch.Groups["leadingThree"].Success)
    {
        Console.WriteLine(phoneNumberMatch.Groups["leadingThree"].Value);
    }
    if (phoneNumberMatch.Groups["lastFour"].Success)
    {
        Console.WriteLine(phoneNumberMatch.Groups["lastFour"].Value);
    }
    if (phoneNumberMatch.Groups["extension"].Success)
    {
        Console.WriteLine(phoneNumberMatch.Groups["extension"].Value);
    }
}

I think the easiest thing to do is to first strip any non-numeric characters from the string so that you just have a number then format as mentioned in this question

I thought about stripping any non-numeric characters and then formatting, but I don't think that works so well for the extension case (123.4567x123)

Lop off the extension then strip the non-numeric character from the remainder. Format it then add the extension back on.
Start: 123.4567x123
Lop: 123.4567
Strip: 1234567
Format: 123-4567
Add: 123-4567 x123
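The lop, strip, format, add steps above could be sketched like this. The class and method names are hypothetical, and several choices are assumptions: the extension starts at the first "x" (either case), only 7- and 10-digit US-style numbers are formatted, and anything else is returned unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneFormatter
{
    public static string Format(string input)
    {
        // Lop: separate the extension at the first 'x' or 'X', if there is one.
        string main = input;
        string extension = "";
        int x = input.IndexOf("x", StringComparison.OrdinalIgnoreCase);
        if (x >= 0)
        {
            main = input.Substring(0, x);
            extension = Regex.Replace(input.Substring(x + 1), "[^0-9]", "");
        }

        // Strip: keep only the digits of the main number.
        string digits = Regex.Replace(main, "[^0-9]", "");

        // Format: 7 digits -> 123-4567, 10 digits -> (123) 456-7890.
        string formatted;
        if (digits.Length == 7)
            formatted = digits.Substring(0, 3) + "-" + digits.Substring(3);
        else if (digits.Length == 10)
            formatted = "(" + digits.Substring(0, 3) + ") "
                      + digits.Substring(3, 3) + "-" + digits.Substring(6);
        else
            return input; // Assumption: leave unrecognized input as typed.

        // Add: put the extension back on.
        return extension.Length > 0 ? formatted + " x" + extension : formatted;
    }
}
```

In a WinForms form this would typically be called from the textbox's Leave or Validated event handler, assigning the result back to the Text property.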

I don't know of any way other than doing it yourself by possibly making some masks and checking which one it matches and doing each mask on a case by case basis. Don't think it'd be too hard, just time consuming.

My guess is that you could accomplish this with a conditional statement to look at the input and then parse it into a specific format. But I'm guessing there is going to be a good amount of logic to investigate the input and format the output.

This works for me. Worth checking performance if you are doing this in a tight loop...
public static string FormatPhoneNumber(string phone)
{
    phone = Regex.Replace(phone, @"[^\d]", "");
    if (phone.Length == 10)
        return Regex.Replace(phone,
            "(?<ac>\\d{3})(?<pref>\\d{3})(?<num>\\d{4})",
            "(${ac}) ${pref}-${num}");
    else if ((phone.Length < 16) && (phone.Length > 10))
        return Regex.Replace(phone,
            "(?<ac>\\d{3})(?<pref>\\d{3})(?<num>\\d{4})(?<ext>\\d{1,5})",
            "(${ac}) ${pref}-${num} x${ext}");
    else
        return string.Empty;
}
