C# method to check if character is markup or not

C# method to check if character is markup or not - c#

Is there any method in the .NET framework which will return true if a given character is an XML markup character? That is, one of the characters, '"<>_&, and any others that may exist.
I understand I can go for a simple string search also, but was wondering if a built-in method exists which would not rely on manually typing the characters.

You may checkout the following KB article.

I'm not sure the term "XML markup character" (or rather a context-independent function for detecting these) makes much sense, since some of the characters you list only have special meaning depending on the context in which they appear (such as ' and ", which are normal characters if they appear outside of a tag).
Apart from that, you could always write your own such function:
bool IsMarkupCharacter(char ch)
{
switch (ch)
{
case '\'':
case '\"':
case '<':
case '>':
case '&':
return true;
default:
return false;
}
}
Of course you would want to check this against the XML specification to check if it's truly complete. (I didn't include _ from your list, by the way; it is not special to XML in any way, AFAIK.)

You can also use this code
const string XMLCHARS = "'\"\\<>&";
if(XMLCHARS.Contains(c))
{
--
}

You can use extension
public static class CharExtension
{
public static bool IsXmlMarkup(this char charecter)
{
if(charecter == '\'' || charecter == '\"' || e.t.c)
return true;
return false;
}
}
and then just use
char c = '\'';
var res = c.IsXmlMarkup();

Related

detectecting destructive SQL queries with C#

So I am looking to find a more effective way to determine all variants of the strings in the array in this this C# code I wrote. I could loop over the whole string and compare each character in sqltext to the one before it and make it overly complicated or i could try to learn something new. I was thinking there has to be a more efficient way. I showed this to a co-worker and she suggested I use a regular expression. I have looked into regular expressions a little bit, but i cant seem to find the right expression.
what I am looking for is a version that takes all variants of the indexes of the array in this code:
public bool securitycheck(String sqltext)
{
string[] badSqlList = new string[] {"insert","Insert","INSERT",
"update","Update","UPDATE",
"delete","Delete","DELETE",
"drop","Drop", "DROP"};
for (int i = 0; i < badSqlList.Count(); i++)
{
if (sqltext.Contains(badSqlList[i]) == true)
{
return true;
}
}
return false;
}
but takes into account for alternate spelling. this code for example does not take into account for "iNsert, UpDate, dELETE, DrOP" but according to my coworker there is a way using Regular expressions to take into account for this.
What is the best way to do this in your opinion?
[Update]
thank you everyone, there is lots of really good information here and it really does open my eyes to handling SQL programatically. the scope on this tool I am building is very small and anyone with the permissions to access this tool and who has intent on being malicious would be someone who has direct access to the database anyway. these checks are in place to more or less prevent laziness. The use-case does not permit for parameterized queries or i would be doing that. your insight has been very educational and I appreciate all your help!

You can do:
if (badSqlList.Any(r => sqltext.IndexOf(r, StringComparison.InvariantCultureIgnoreCase) >= 0))
{
//bad SQL found
}
IndexOf with StringComparison enum value will ensure case insensitive comparison.
Another approach could be:
return sqltext.Split()
.Intersect(badSqlList,StringComparer.InvariantCultureIgnoreCase)
.Any()
Split your Sql on white space and then compare each word with your white list array. This could save you in cases where your legal table name has keyword like INESRTEDStudents
Not really sure about your requirements, but, generally, a better option would be to use Parameterized queries in the first place. You can't be 100% sure with your white list and there still would be ways to bypass it.

Do not reinvent the wheel - just use parameterized queries as everyone here tells you (fixes even more problem than you are currently aware), you'll thank as all in the future...
But do use this to sanitaze all your filter strings that go in WHERE clauses:
public static string EscapeSpecial(string s)
{
Contract.Requires(s != null);
var sb = new StringBuilder();
foreach(char c in s)
{
switch(c)
{
case '[':
case ']':
case '%':
case '*':
{
sb.AppendFormat(CultureInfo.InvariantCulture, "[{0}]", c);
break;
}
case '\'':
{
sb.Append("''");
break;
}
default:
{
sb.Append(c);
break;
}
}
}
return sb.ToString();
}

Parsing CSV data using a finite state machine

I want to read a file containing comma-separated values, so have written a finite state machine:
private IList<string> Split(string line)
{
List<string> values = new List<string>();
string value = string.Empty;
ParseState state = ParseState.Initial;
foreach (char c in line)
{
switch (state)
{
case ParseState.Initial:
switch (c)
{
case COMMA:
values.Add(string.Empty);
break;
case QUOTE:
state = ParseState.Quote;
break;
default:
value += c;
state = ParseState.Data;
break;
}
break;
case ParseState.Data:
switch (c)
{
case COMMA:
values.Add(value);
value = string.Empty;
state = ParseState.Initial;
break;
case QUOTE:
throw new InvalidDataException("Improper quotes");
default:
value += c;
break;
}
break;
case ParseState.Quote:
switch (c)
{
case QUOTE:
state = ParseState.QuoteInQuote;
break;
default:
value += c;
break;
}
break;
case ParseState.QuoteInQuote:
switch (c)
{
case COMMA:
values.Add(value);
value = string.Empty;
state = ParseState.Initial;
break;
case QUOTE:
value += c;
state = ParseState.Quote;
break;
default:
throw new InvalidDataException("Unpaired quotes");
}
break;
}
}
switch (state)
{
case ParseState.Initial:
case ParseState.Data:
case ParseState.QuoteInQuote:
values.Add(value);
break;
case ParseState.Quote:
throw new InvalidDataException("Unclosed quotes");
}
return values;
}
Yes, I know the advice about CSV parsers is "don't write your own", but
I needed it quickly and
our download policy at work would take several days to allow me to
get open source off the 'net.
Hey, at least I didn't start with string.Split() or, worse, try using a Regex!
And yes, I know it could be improved by using a StringBuilder, and it's restrictive on quotes in the data, but
performance is not an issue and
this is only to generate well-defined test data in-house,
so I don't care about those.
What I do care about is the apparent trailing block at the end for mopping up all the data after the final comma, and the way that it's starting to look like some sort of an anti-pattern down there, which was exactly the sort of thing that "good" patterns such as a FSM were supposed to avoid.
So my question is this: is this block at the end some sort of anti-pattern, and is it something that's going to come back to bite me in the future?

All of the FSMs I've ever seen (not that I go hunting for them, mind you) all have some kind of "mopping up" step, simply due to the nature of enumeration.
In an FSM you're always acting upon the current state, and then resetting the 'current state' for the next iteration, so once you've hit the end of your iterations you have to do one last operation to act upon the 'current state'. (Might be better to think about it as acting upon the 'previous state' and setting the 'current state').
Therefore, I would consider that what you've done is part of the pattern.
But why didn't you try some of the other answers on SO?
Split CSV String (specifically this answer)
How to properly split a CSV using C# split() function? (specifically this answer)
Adapted solution, still an FSM:
public IEnumerable<string> fsm(string s)
{
int i, a = 0, l = s.Length;
var q = true;
for (i = 0; i < l; i++)
{
switch (s[i])
{
case ',':
if (q)
{
yield return s.Substring(a, i - a).Trim();
a = i + 1;
}
break;
// pick your flavor
case '"':
//case '\'':
q = !q;
break;
}
}
yield return s.Substring(a).Trim();
}
// === usage ===
var row = fsm(csvLine).ToList();
foreach(var column in fsm(csvLine)) { ... }

In a FSM you identify which states are the permitted halting states. So in a typical implementation, when you come out of the loop you need to at least check to make sure that your last state is one of the permitting halting states or throw a jam error. So having that one last state check outside of the loop is part of the pattern.

The source of the problem, if you want to call it that, is the absence of an end-of-line marker in your input data. Add a newline character, for example, at the end of your input string and you will be able to get rid of the "trailing block" that seems to annoy you so much.
As far as I'm concerned, your code is correct and, no, there is no reason why this implementation will come back to bite you in the future!

I had a similiar issue, but i was parsing a text file character by character. I didnt like this big clean-up-switch-block after the while loop. To solve this, I made a wrapper for the streamreader. The wrapper checked when streamreader had no characters left. In this case, the wrapper would return an EOT-ascii character once (EOT is equal to EOF). This way my state machine could react to the EOF depending on the state it was in at that moment.

does contain and does not contain in same if

For some reason i cannot get this if statement to work
if (htmlCode.Contains("Sign out") && !htmlCode.Contains("bye bye"))
{
// do stuff...
}
is there any way to get contains and does not contain to work in same if statement?

First of all check the htmlCode, the text could be mixed with some html tags or something like that, also the issue can be with cases, when trying to find some string in the text you should always remember about cases.
You can use .Contains method or .IndexOf, actually the contains method in the Framework is implemented like this:
public bool Contains(string value)
{
return this.IndexOf(value, StringComparison.Ordinal) >= 0;
}
For comparing large strings without knowing the case I would use:
htmlCode.IndexOf("Sign out", StringComparison.InvariantCultureIgnoreCase);
htmlCode.IndexOf("Bye bye", StringComparison.InvariantCultureIgnoreCase);
If you know that response will be small, you can use .ToLower() or .ToUpper()

Try to compare by converting either upper case or lower case.
if (htmlCode.ToUpper().Contains("SIGN OUT") && !htmlCode.ToUpper().Contains("BYE BYE"))
{
// do stuff...
}

You if clause works correctly
It might be not working because of the string case
So i would suggest you do it this way
if (htmlCode.ToUpper().Contains("Sign out".ToUpper()) && !htmlCode.ToUpper().Contains("bye bye".ToUpper()))

How to make C# Switch Statement use IgnoreCase

If I have a switch-case statement where the object in the switch is string, is it possible to do an ignoreCase compare?
I have for instance:
string s = "house";
switch (s)
{
case "houSe": s = "window";
}
Will s get the value "window"? How do I override the switch-case statement so it will compare the strings using ignoreCase?

A simpler approach is just lowercasing your string before it goes into the switch statement, and have the cases lower.
Actually, upper is a bit better from a pure extreme nanosecond performance standpoint, but less natural to look at.
E.g.:
string s = "house";
switch (s.ToLower()) {
case "house":
s = "window";
break;
}

Sorry for this new post to an old question, but there is a new option for solving this problem using C# 7 (VS 2017).
C# 7 now offers "pattern matching", and it can be used to address this issue thusly:
string houseName = "house"; // value to be tested, ignoring case
string windowName; // switch block will set value here
switch (true)
{
case bool b when houseName.Equals("MyHouse", StringComparison.InvariantCultureIgnoreCase):
windowName = "MyWindow";
break;
case bool b when houseName.Equals("YourHouse", StringComparison.InvariantCultureIgnoreCase):
windowName = "YourWindow";
break;
case bool b when houseName.Equals("House", StringComparison.InvariantCultureIgnoreCase):
windowName = "Window";
break;
default:
windowName = null;
break;
}
This solution also deals with the issue mentioned in the answer by #Jeffrey L Whitledge that case-insensitive comparison of strings is not the same as comparing two lower-cased strings.
By the way, there was an interesting article in February 2017 in Visual Studio Magazine describing pattern matching and how it can be used in case blocks. Please have a look: Pattern Matching in C# 7.0 Case Blocks
EDIT
In light of #LewisM's answer, it's important to point out that the switch statement has some new, interesting behavior. That is that if your case statement contains a variable declaration, then the value specified in the switch part is copied into the variable declared in the case. In the following example, the value true is copied into the local variable b. Further to that, the variable b is unused, and exists only so that the when clause to the case statement can exist:
switch(true)
{
case bool b when houseName.Equals("X", StringComparison.InvariantCultureIgnoreCase):
windowName = "X-Window";):
break;
}
As #LewisM points out, this can be used to benefit - that benefit being that the thing being compared is actually in the switch statement, as it is with the classical use of the switch statement. Also, the temporary values declared in the case statement can prevent unwanted or inadvertent changes to the original value:
switch(houseName)
{
case string hn when hn.Equals("X", StringComparison.InvariantCultureIgnoreCase):
windowName = "X-Window";
break;
}

As you seem to be aware, lowercasing two strings and comparing them is not the same as doing an ignore-case comparison. There are lots of reasons for this. For example, the Unicode standard allows text with diacritics to be encoded multiple ways. Some characters includes both the base character and the diacritic in a single code point. These characters may also be represented as the base character followed by a combining diacritic character. These two representations are equal for all purposes, and the culture-aware string comparisons in the .NET Framework will correctly identify them as equal, with either the CurrentCulture or the InvariantCulture (with or without IgnoreCase). An ordinal comparison, on the other hand, will incorrectly regard them as unequal.
Unfortunately, switch doesn't do anything but an ordinal comparison. An ordinal comparison is fine for certain kinds of applications, like parsing an ASCII file with rigidly defined codes, but ordinal string comparison is wrong for most other uses.
What I have done in the past to get the correct behavior is just mock up my own switch statement. There are lots of ways to do this. One way would be to create a List<T> of pairs of case strings and delegates. The list can be searched using the proper string comparison. When the match is found then the associated delegate may be invoked.
Another option is to do the obvious chain of if statements. This usually turns out to be not as bad as it sounds, since the structure is very regular.
The great thing about this is that there isn't really any performance penalty in mocking up your own switch functionality when comparing against strings. The system isn't going to make a O(1) jump table the way it can with integers, so it's going to be comparing each string one at a time anyway.
If there are many cases to be compared, and performance is an issue, then the List<T> option described above could be replaced with a sorted dictionary or hash table. Then the performance may potentially match or exceed the switch statement option.
Here is an example of the list of delegates:
delegate void CustomSwitchDestination();
List<KeyValuePair<string, CustomSwitchDestination>> customSwitchList;
CustomSwitchDestination defaultSwitchDestination = new CustomSwitchDestination(NoMatchFound);
void CustomSwitch(string value)
{
foreach (var switchOption in customSwitchList)
if (switchOption.Key.Equals(value, StringComparison.InvariantCultureIgnoreCase))
{
switchOption.Value.Invoke();
return;
}
defaultSwitchDestination.Invoke();
}
Of course, you will probably want to add some standard parameters and possibly a return type to the CustomSwitchDestination delegate. And you'll want to make better names!
If the behavior of each of your cases is not amenable to delegate invocation in this manner, such as if differnt parameters are necessary, then you’re stuck with chained if statments. I’ve also done this a few times.
if (s.Equals("house", StringComparison.InvariantCultureIgnoreCase))
{
s = "window";
}
else if (s.Equals("business", StringComparison.InvariantCultureIgnoreCase))
{
s = "really big window";
}
else if (s.Equals("school", StringComparison.InvariantCultureIgnoreCase))
{
s = "broken window";
}

An extension to the answer by #STLDeveloperA. A new way to do statement evaluation without multiple if statements as of C# 7 is using the pattern matching switch statement, similar to the way #STLDeveloper though this way is switching on the variable being switched
string houseName = "house"; // value to be tested
string s;
switch (houseName)
{
case var name when string.Equals(name, "Bungalow", StringComparison.InvariantCultureIgnoreCase):
s = "Single glazed";
break;
case var name when string.Equals(name, "Church", StringComparison.InvariantCultureIgnoreCase):
s = "Stained glass";
break;
...
default:
s = "No windows (cold or dark)";
break;
}
The visual studio magazine has a nice article on pattern matching case blocks that might be worth a look.

In some cases it might be a good idea to use an enum. So first parse the enum (with ignoreCase flag true) and than have a switch on the enum.
SampleEnum Result;
bool Success = SampleEnum.TryParse(inputText, true, out Result);
if(!Success){
//value was not in the enum values
}else{
switch (Result) {
case SampleEnum.Value1:
break;
case SampleEnum.Value2:
break;
default:
//do default behaviour
break;
}
}

One possible way would be to use an ignore case dictionary with an action delegate.
string s = null;
var dic = new Dictionary<string, Action>(StringComparer.CurrentCultureIgnoreCase)
{
{"house", () => s = "window"},
{"house2", () => s = "window2"}
};
dic["HouSe"]();
// Note that the call doesn't return text, but only populates local variable s.
// If you want to return the actual text, replace Action to Func<string> and values in dictionary to something like () => "window2"

Here's a solution that wraps #Magnus 's solution in a class:
public class SwitchCaseIndependent : IEnumerable<KeyValuePair<string, Action>>
{
private readonly Dictionary<string, Action> _cases = new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);
public void Add(string theCase, Action theResult)
{
_cases.Add(theCase, theResult);
}
public Action this[string whichCase]
{
get
{
if (!_cases.ContainsKey(whichCase))
{
throw new ArgumentException($"Error in SwitchCaseIndependent, \"{whichCase}\" is not a valid option");
}
//otherwise
return _cases[whichCase];
}
}
public IEnumerator<KeyValuePair<string, Action>> GetEnumerator()
{
return _cases.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return _cases.GetEnumerator();
}
}
Here's an example of using it in a simple Windows Form's app:
var mySwitch = new SwitchCaseIndependent
{
{"hello", () => MessageBox.Show("hello")},
{"Goodbye", () => MessageBox.Show("Goodbye")},
{"SoLong", () => MessageBox.Show("SoLong")},
};
mySwitch["HELLO"]();
If you use lambdas (like the example), you get closures which will capture your local variables (pretty close to the feeling you get from a switch statement).
Since it uses a Dictionary under the covers, it gets O(1) behavior and doesn't rely on walking through the list of strings. Of course, you need to construct that dictionary, and that probably costs more. If you want to reuse the Switch behavior over and over, you can create and initialize the the SwitchCaseIndependent object once and then use it as many times as you want.
It would probably make sense to add a simple bool ContainsCase(string aCase) method that simply calls the dictionary's ContainsKey method.

I would say that with switch expressions (added in C# 8.0), discard patterns and local functions the approaches suggested by #STLDev and #LewisM can be rewritten in even more clean/shorter way:
string houseName = "house"; // value to be tested
// local method to compare, I prefer to put them at the bottom of the invoking method:
bool Compare(string right) => string.Equals(houseName, right, StringComparison.InvariantCultureIgnoreCase);
var s = houseName switch
{
_ when Compare("Bungalow") => "Single glazed",
_ when Compare("Church") => "Stained glass",
// ...
_ => "No windows (cold or dark)" // default value
};

It should be sufficient to do this:
string s = "houSe";
switch (s.ToLowerInvariant())
{
case "house": s = "window";
break;
}
The switch comparison is thereby culture invariant. As far as I can see this should achieve the same result as the C#7 Pattern-Matching solutions, but more succinctly.

I hope this helps try to convert the whole string into particular case either lower case or Upper case and use the Lowercase string for comparison:
public string ConvertMeasurements(string unitType, string value)
{
switch (unitType.ToLower())
{
case "mmol/l": return (Double.Parse(value) * 0.0555).ToString();
case "mg/dl": return (double.Parse(value) * 18.0182).ToString();
}
}

Using the Case Insensitive Comparison:
Comparing strings while ignoring case.
switch (caseSwitch)
{
case string s when s.Equals("someValue", StringComparison.InvariantCultureIgnoreCase):
// ...
break;
}
for more detail Visit this link: Switch Case When In C# Statement And Expression

Now you can use the switch expression (rewrote the previous example):
return houseName switch
{
_ when houseName.Equals("MyHouse", StringComparison.InvariantCultureIgnoreCase) => "MyWindow",
_ when houseName.Equals("YourHouse", StringComparison.InvariantCultureIgnoreCase) => "YourWindow",
_ when houseName.Equals("House", StringComparison.InvariantCultureIgnoreCase) => "Window",
_ => null
};

Multiple variables in switch statement in c

How to write following statement in c using switch statement in c
int i = 10;
int j = 20;
if (i == 10 && j == 20)
{
Mymethod();
}
else if (i == 100 && j == 200)
{
Yourmethod();
}
else if (i == 1000 || j == 2000) // OR
{
Anymethod();
}
EDIT:
I have changed the last case from 'and' to 'or' later. So I appologise from people who answered my question before this edit.
This scenario is for example, I just wanted to know that is it possible or not. I have google this and found it is not possible but I trust gurus on stackoverflow more.
Thanks

You're pressing for answers that will unnaturally force this code into a switch - that's not the right approach in C, C++ or C# for the problem you've described. Live with the if statements, as using a switch in this instance leads to less readable code and the possibility that a slip-up will introduce a bug.
There are languages that will evaluate a switch statement syntax similar to a sequence of if statements, but C, C++, and C# aren't among them.
After Jon Skeet's comment that it can be "interesting to try to make it work", I'm going to go against my initial judgment and play along because it's certainly true that one can learn by trying alternatives to see where they work and where they don't work. Hopefully I won't end up muddling things more than I should...
The targets for a switch statement in the languages under consideration need to be constants - they aren't expressions that are evaluated at runtime. However, you can potentially get a behavior similar to what you're looking for if you can map the conditions that you want to have as switch targets to a hash function that will produce a perfect hash the matches up to the conditions. If that can be done, you can call the hash function and switch on the value it produces.
The C# compiler does something similar to this automatically for you when you want to switch on a string value. In C, I've manually done something similar when I want to switch on a string. I place the target strings in a table along with enumerations that are used to identify the strings, and I switch on the enum:
char* cmdString = "copystuff"; // a string with a command identifier,
// maybe obtained from console input
StrLookupValueStruct CmdStringTable[] = {
{ "liststuff", CMD_LIST },
{ "docalcs", CMD_CALC },
{ "copystuff", CMD_COPY },
{ "delete", CMD_DELETE },
{ NULL, CMD_UNKNOWN },
};
int cmdId = strLookupValue( cmdString, CmdStringTable); // transform the string
// into an enum
switch (cmdId) {
case CMD_LIST:
doList();
break;
case CMD_CALC:
doCalc();
break;
case CMD_COPY:
doCopy();
break;
// etc...
}
Instead of having to use a sequence of if statements:
if (strcmp( cmdString, "liststuff") == 0) {
doList();
}
else if (strcmp( cmdString, "docalcs") == 0) {
doCalc();
}
else if (strcmp( cmdString, "copystuff") == 0) {
doCopy();
}
// etc....
As an aside, for the string to function mapping here I personally find the table lookup/switch statement combination to be a bit more readable, but I imagine there are people who might prefer the more direct approach of the if sequence.
The set of expressions you have in your question don't look particularly simple to transform into a hash - your hash function would almost certainly end up being a sequence of if statements - you would have basically just moved the construct somewhere else. Jon Skeet's original answer was essentially to turn your expressions into a hash, but when the or operation got thrown into the mix of one of the tests, the hash function broke down.

In general you can't. What you are doing already is fine, although you might want to add an else clause at the end to catch unexpected inputs.
In your specific example it seems that j is often twice the value of i. If that is a general rule you could try to take advantage of that by doing something like this instead:
if (i * 2 == j) /* Watch out for overflow here if i could be large! */
{
switch (i)
{
case 10:
// ...
break;
case 100:
// ...
break;
// ...
}
}

(Removed original answer: I'd missed the fact that the condition was an "OR" rather than an "AND". EDIT: Ah, because apparently it wasn't to start with.)
You could still theoretically use something like my original code (combining two 32-bit integers into one 64-bit integer and switching on that), although there would be 2^33 case statements covering the last condition. I doubt that any compiler would actually make it through such code :)
But basically, no: use the if/else structure instead.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

C# method to check if character is markup or not - c#

You may checkout the following KB article.

You can also use this code const string XMLCHARS = "'\"\\<>&"; if(XMLCHARS.Contains(c)) { -- }

You can use extension public static class CharExtension { public static bool IsXmlMarkup(this char charecter) { if(charecter == '\'' || charecter == '\"' || e.t.c) return true; return false; } } and then just use char c = '\''; var res = c.IsXmlMarkup();

Related

detectecting destructive SQL queries with C#

Parsing CSV data using a finite state machine

does contain and does not contain in same if

How to make C# Switch Statement use IgnoreCase

Multiple variables in switch statement in c

Categories

Resources