I am trying to extract the number out of the last part of the string, I have wrote a function to do this but am having problems with out of range index.
Here is the string
type="value" cat=".1.3.6.1.4.1.26928.1.1.1.2.1.2.1.1" descCat=".1.3.6.1.4.1.26928.1.1.1.2.1.2.1.3"
and here is my function
private static string ExtractDescOID(string property)
{
string result = "";
int startPos = property.LastIndexOf("descOid=\"") + "descOid=\"".Length;
int endPos = property.Length - 1;
if (endPos - startPos != 1)
{
//This now gets rid of the first . within the string.
startPos++;
result = property.Substring(startPos, endPos);
}
else
{
result = "";
}
if (startPos == endPos)
{
Console.WriteLine("Something has gone wrong");
}
return result;
}
I want to be able to get 1.3.6.1.4.1.26928.1.1.1.2.1.2.1.3 this part of the string. I have stepped through the code, the string length is 99 however when AND MY startPos becomes 64 and endPos becomes 98 which is actually within the range.
The second argument to Substring(int, int) isn't the "end position", but the length of the substring to return.
result = property.Substring(startPos, endPos - startPos);
Read the documentation again, the second value is the length, not the index.
As found on MSDN:
public string Substring(
int startIndex,
int length
)
A different approach to this problem could be through using string.Split() to take care of the parsing for you. The only reason why I would propose this (other than that I like to present additional options to what's already there, plus this is the lazy man's way out) is that from a code perspective, the code is easier IMHO to decompose, and when decomposed, is easier to comprehend by others.
Here's the sample program with some comments to illustrate my point (tested, btw).
class Program
{
static void Main(string[] args)
{
var someAttributesFromAnXmlNodeIGuess =
"type=\"value\" cat=\".1.3.6.1.4.1.26928.1.1.1.2.1.2.1.1\" descCat=\".1.3.6.1.4.1.26928.1.1.1.2.1.2.1.3\"";
var descCat = GetMeTheAttrib(someAttributesFromAnXmlNodeIGuess, "descCat");
Console.WriteLine(descCat);
Console.ReadLine();
}
// making the slightly huge assumption that you may want to
// access other attribs in the string...
private static string GetMeTheAttrib(string attribLine, string attribName)
{
var parsedDictionary = ParseAttributes(attribLine);
if (parsedDictionary.ContainsKey(attribName))
{
return parsedDictionary[attribName];
}
return string.Empty;
}
// keeping the contracts simple -
// i could have used IDictionary, which might make sense
// if this code became LINQ'd one day
private static Dictionary<string, string> ParseAttributes(string attribLine)
{
var dictionaryToReturn = new Dictionary<string, string>();
var listOfPairs = attribLine.Split(' '); // items look like type=value, etc
foreach (var pair in listOfPairs)
{
var attribList = pair.Split('=');
// we were expecting a type=value pattern... if this doesn't match then let's ignore it
if (attribList.Count() != 2) continue;
dictionaryToReturn.Add(attribList[0], attribList[1]);
}
return dictionaryToReturn;
}
}
Related
Firstly I understand that there are several ways to do this and I do have some code which runs, but what I just wanted to find out was if anyone else has a recommended way to do this. Say I have a string which I already know that would have contain a specific character (a ‘,’ in this case). Now I just want to validate that this comma is used only once and not more. I know iterating through each character is an option but why go through all that work because I just want to make sure that this special character is not used more than once, I’m not exactly interested in the count per se. The best I could think was to use the split and here is some sample code that works. Just curious to find out if there is a better way to do this.
In summary,
I have a certain string in which I know has a special character (‘,’ in this case)
I want to validate that this special character has only been used once in this string
const char characterToBeEvaluated = ',';
string myStringToBeTested = "HelloWorldLetus,code";
var countOfIdentifiedCharacter = myStringToBeTested.Split(characterToBeEvaluated).Length - 1;
if (countOfIdentifiedCharacter == 1)
{
Console.WriteLine("Used exactly once as expected");
}
else
{
Console.WriteLine("Used either less than or more than once");
}
You can use string's IndexOf methods:
const char characterToBeEvaluated = ',';
string myStringToBeTested = "HelloWorldLetus,code";
string substringToFind = characterToBeEvaluated.ToString();
int firstIdx = myStringToBeTested.IndexOf(substringToFind, StringComparison.Ordinal);
bool foundOnce = firstIdx >= 0;
bool foundTwice = foundOnce && myStringToBeTested.IndexOf(substringToFind, firstIdx + 1, StringComparison.Ordinal) >= 0;
Try it online
You could use the LINQ Count() method:
const char characterToBeEvaluated = ',';
string myStringToBeTested = "HelloWorldLetus,code";
var countOfIdentifiedCharacter = myStringToBeTested.Count(x => x == characterToBeEvaluated);
if (countOfIdentifiedCharacter == 1)
{
Console.WriteLine("Used exactly once as expected");
}
else
{
Console.WriteLine("Used either less than or more than once");
}
This is the most readable and simplest approach and is great if you need to know the exact count but for your specific case #ProgrammingLlama's answer is better in terms of efficiency.
Adding another answer using a custom method:
public static void Main()
{
const char characterToBeEvaluated = ',';
string myStringToBeTested = "HelloWorldLetus,code";
var characterAppearsOnlyOnce = DoesCharacterAppearOnlyOnce(characterToBeEvaluated, myStringToBeTested);
if (characterAppearsOnlyOnce)
{
Console.WriteLine("Used exactly once as expected");
}
else
{
Console.WriteLine("Used either less than or more than once");
}
}
public static bool DoesCharacterAppearOnlyOnce(char characterToBeEvaluated, string stringToBeTested)
{
int count = 0;
for (int i = 0; i < stringToBeTested.Length && count < 2; ++i)
{
if (stringToBeTested[i] == characterToBeEvaluated)
{
++count;
}
}
return count == 1;
}
The custom method DoesCharacterAppearOnlyOnce() performs better than the method using IndexOf() for smaller strings - probably due to the overhead calling IndexOf. As the strings get larger the IndexOf method is better.
I have a text file full of strings, one on each line. Some of these strings will contain an unknown number of "#" characters. Each "#" can represent the numbers 1, 2, 3, or 4. I want to generate all possible combinations (permutations?) of strings for each of those "#"s. If there were a set number of "#"s per string, I'd just use nested for loops (quick and dirty). I need help finding a more elegant way to do it with an unknown number of "#"s.
Example 1: Input string is a#bc
Output strings would be:
a1bc
a2bc
a3bc
a4bc
Example 2: Input string is a#bc#d
Output strings would be:
a1bc1d
a1bc2d
a1bc3d
a1bc4d
a2bc1d
a2bc2d
a2bc3d
...
a4bc3d
a4bc4d
Can anyone help with this one? I'm using C#.
This is actually a fairly good place for a recursive function. I don't write C#, but I would create a function List<String> expand(String str) which accepts a string and returns an array containing the expanded strings.
expand can then search the string to find the first # and create a list containing the first part of the string + expansion. Then, it can call expand on the last part of the string and add each element in it's expansion to each element in the last part's expansion.
Example implementation using Java ArrayLists:
ArrayList<String> expand(String str) {
/* Find the first "#" */
int i = str.indexOf("#");
ArrayList<String> expansion = new ArrayList<String>(4);
/* If the string doesn't have any "#" */
if(i < 0) {
expansion.add(str);
return expansion;
}
/* New list to hold the result */
ArrayList<String> result = new ArrayList<String>();
/* Expand the "#" */
for(int j = 1; j <= 4; j++)
expansion.add(str.substring(0,i-1) + j);
/* Combine every expansion with every suffix expansion */
for(String a : expand(str.substring(i+1)))
for(String b : expansion)
result.add(b + a);
return result;
}
I offer you here a minimalist approach for the problem at hand.
Yes, like other have said recursion is the way to go here.
Recursion is a perfect fit here, since we can solve this problem by providing the solution for a short part of the input and start over again with the other part until we are done and merge the results.
Every recursion must have a stop condition - meaning no more recursion needed.
Here my stop condition is that there are no more "#" in the string.
I'm using string as my set of values (1234) since it is an IEnumerable<char>.
All other solutions here are great, Just wanted to show you a short approach.
internal static IEnumerable<string> GetStrings(string input)
{
var values = "1234";
var permutations = new List<string>();
var index = input.IndexOf('#');
if (index == -1) return new []{ input };
for (int i = 0; i < values.Length; i++)
{
var newInput = input.Substring(0, index) + values[i] + input.Substring(index + 1);
permutations.AddRange(GetStrings(newInput));
}
return permutations;
}
An even shorter and cleaner approach with LINQ:
internal static IEnumerable<string> GetStrings(string input)
{
var values = "1234";
var index = input.IndexOf('#');
if (index == -1) return new []{ input };
return
values
.Select(ReplaceFirstWildCardWithValue)
.SelectMany(GetStrings);
string ReplaceFirstWildCardWithValue(char value) => input.Substring(0, index) + value + input.Substring(index + 1);
}
This is shouting out loud for a recursive solution.
First, lets make a method that generates all combinations of a certain length from a given set of values. Because we are only interested in generating strings, lets take advantage of the fact that string is immutable (see P.D.2); this makes recursive functions so much easier to implement and reason about:
static IEnumerable<string> GetAllCombinations<T>(
ISet<T> set, int length)
{
IEnumerable<string> getCombinations(string current)
{
if (current.Length == length)
{
yield return current;
}
else
{
foreach (var s in set)
{
foreach (var c in getCombinations(current + s))
{
yield return c;
}
}
}
}
return getCombinations(string.Empty);
}
Study carefully how this methods works. Work it out by hand for small examples to understand it.
Now, once we know how to generate all possible combinations, building the strings is easy:
Figure out the number of wildcards in the specified string: this will be our combination length.
For every combination, insert in order each character into the string where we encounter a wildcard.
Ok, lets do just that:
public static IEnumerable<string> GenerateCombinations<T>(
this string s,
IEnumerable<T> set,
char wildcard)
{
var length = s.Count(c => c == wildcard);
var combinations = GetAllCombinations(set, length);
var builder = new StringBuilder();
foreach (var combination in combinations)
{
var index = 0;
foreach (var c in s)
{
if (c == wildcard)
{
builder.Append(combination[index]);
index += 1;
}
else
{
builder.Append(c);
}
}
yield return builder.ToString();
builder.Clear();
}
}
And we're done. Usage would be:
var set = new HashSet<int>(new[] { 1, 2, 3, 4 });
Console.WriteLine(
string.Join("; ", "a#bc#d".GenerateCombinations(set, '#')));
And sure enough, the output is:
a1bc1d; a1bc2d; a1bc3d; a1bc4d; a2bc1d; a2bc2d; a2bc3d;
a2bc4d; a3bc1d; a3bc2d; a3bc3d; a3bc4d; a4bc1d; a4bc2d;
a4bc3d; a4bc4d
Is this the most performant or efficient implementation? Probably not but its readable and maintainable. Unless you have a specific performance goal you are not meeting, write code that works and is easy to understand.
P.D. I’ve omitted all error handling and argument validation.
P.D.2: if the length of the combinations is big, concatenting strings inside GetAllCombinations might not be a good idea. In that case I’d have GetAllCombinations return an IEnumerable<IEnumerable<T>>, implement a trivial ImmutableStack<T>, and use that as the combination buffer instead of string.
I have some string like this:
str1 = STA001, str2 = STA002, str3 = STA003
and have code to compare strings:
private bool IsSubstring(string strChild, string strParent)
{
if (!strParent.Contains(strChild))
{
return false;
}
else return true;
}
If I have strChild = STA001STA002 and strParent = STA001STA002STA003 then return true but when I enter strChild = STA001STA003 and check with strParent = STA001STA002STA003 then return false although STA001STA003 have contains in strParent. How can i resolve it?
What you're describing is not a substring. It is basically asking of two collections the question "is this one a subset of the other one?" This question is far easier to ask when the collection is a set such as HashSet<T> than when the collection is a big concatenated string.
This would be a much better way to write your code:
var setOne = new HashSet<string> { "STA001", "STA003" };
var setTwo = new HashSet<string> { "STA001", "STA002", "STA003" };
Console.WriteLine(setOne.IsSubsetOf(setTwo)); // True
Console.WriteLine(setTwo.IsSubsetOf(setOne)); // False
Or, if the STA00 part was just filler to make it make sense in the context of strings, then use ints directly:
var setOne = new HashSet<int> { 1, 3 };
var setTwo = new HashSet<int> { 1, 2, 3 };
Console.WriteLine(setOne.IsSubsetOf(setTwo)); // True
Console.WriteLine(setTwo.IsSubsetOf(setOne)); // False
The Contains method only looks for an exact match, it doesn't look for parts of the string.
Divide the child string into its parts, and look for each part in the parent string:
private bool IsSubstring(string child, string parent) {
for (int i = 0; i < child.Length; i+= 6) {
if (!parent.Contains(child.Substring(i, 6))) {
return false;
}
}
return true;
}
You should however consider it cross-part matches are possible, and if that is an issue. E.g. looking for "1STA00" in "STA001STA002". If that would be a problem, then you should divide the parent string similarly, and only make direct comparisons between the parts, not using the Contains method.
Note: Using hungarian notation for the data type of variables is not encouraged in C#.
This may be a bit on the overkill side, but it might be incredibly beneficial.
private static bool ContainedWord(string input, string phrase)
{
var pattern = String.Format(#"\b({0})", phrase);
var result = Regex.Match(input, pattern);
if(string.Compare(result, phrase) == 0)
return true;
return false;
}
If the expression finds a match, then compare the result to your phrase. If they're zero, it matched. I may be misunderstanding your intent.
I am new to C# and I ran into the following problem (I have looked for a solution here and on google but was not successful):
Given an array of strings (some columns can possibly be doubles or integers "in string format") I would like to convert this array to an integer array.
The question only concerns the columns with actual string values (say a list of countries).
Now I believe a Dictionary can help me to identify the unique values in a given column and associate an integer number to every country that appears.
Then to create my new array which should be of type int (or double) I could loop through the whole array and define the new array via the dictionary. This I would need to do for every column which has string values.
This seems inefficient, is there a better way?
In the end I would like to do multiple linear regression (or even fit a generalized linear model, meaning I want to get a design matrix eventually) with the data.
EDIT:
1) Sorry for being unclear, I will try to clarify:
Given:
MAKE;VALUE ;GENDER
AUDI;40912.2;m
WV;3332;f
AUDI;1234.99;m
DACIA;0;m
AUDI;12354.2;m
AUDI;123;m
VW;21321.2;f
I want to get a "numerical" matrix with identifiers for the the string valued columns
MAKE;VALUE;GENDER
1;40912.2;0
2;3332;1
1;1234.99;0
3;0;0
1;12354.2;0
1;123;0
2;21321.2;1
2) I think this is actually not what I need to solve my problem. Still it does seem like an interesting question.
3) Thank you for the responses so far.
This will take all the possible strings which represent an integer and puts them in a List.
You can do the same with strings wich represent a double.
Is this what you mean??
List<int> myIntList = new List<int>()
foreach(string value in stringArray)
{
int myInt;
if(Int.TryParse(value,out myInt)
{
myIntList.Add(myInt);
}
}
Dictionary is good if you want to map each string to a key like this:
var myDictionary = new Dictionary<int,string>();
myDictionary.Add(1,"CountryOne");
myDictionary.Add(2,"CountryTwo");
myDictionary.Add(3,"CountryThree");
Then you can get your values like:
string myCountry = myDictionary[2];
But still not sure if i'm helping you right now. Do you have som code to specify what you mean?
I'm not sure if this is what you are looking for but it does output the result you are looking for, from which you can create an appropriate data structure to use. I use a list of string but you can use something else to hold the processed data. I can expand further, if needed.
It does assume that the number of "columns", based on the semicolon character, is equal throughout the data and is flexible enough to handle any number of columns. Its kind of ugly but it should get what you want.
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApplication3
{
class StringColIndex
{
public int ColIndex { get; set; }
public List<string> StringValues {get;set;}
}
class Program
{
static void Main(string[] args)
{
var StringRepresentationAsInt = new List<StringColIndex>();
List<string> rawDataList = new List<string>();
List<string> rawDataWithStringsAsIdsList = new List<string>();
rawDataList.Add("AUDI;40912.2;m");rawDataList.Add("VW;3332;f ");
rawDataList.Add("AUDI;1234.99;m");rawDataList.Add("DACIA;0;m");
rawDataList.Add("AUDI;12354.2;m");rawDataList.Add("AUDI;123;m");
rawDataList.Add("VW;21321.2;f ");
foreach(var rawData in rawDataList)
{
var split = rawData.Split(';');
var line = string.Empty;
for(int i= 0; i < split.Length; i++)
{
double outValue;
var isNumberic = Double.TryParse(split[i], out outValue);
var txt = split[i];
if (!isNumberic)
{
if(StringRepresentationAsInt
.Where(x => x.ColIndex == i).Count() == 0)
{
StringRepresentationAsInt.Add(
new StringColIndex { ColIndex = i,
StringValues = new List<string> { txt } });
}
var obj = StringRepresentationAsInt
.First(x => x.ColIndex == i);
if (!obj.StringValues.Contains(txt)){
obj.StringValues.Add(txt);
}
line += (string.IsNullOrEmpty(line) ?
string.Empty :
("," + (obj.StringValues.IndexOf(txt) + 1).ToString()));
}
else
{
line += "," + split[i];
}
}
rawDataWithStringsAsIdsList.Add(line);
}
rawDataWithStringsAsIdsList.ForEach(x => Console.WriteLine(x));
Console.ReadLine();
/*
Desired output:
1;40912.2;0
2;3332;1
1;1234.99;0
3;0;0
1;12354.2;0
1;123;0
2;21321.2;1
*/
}
}
}
I have an old VB6 project that I am converting to C#. I have a string. I am verifying, 1 character at a time, that it only contains [E0-9.+-].
This is the old VB6 code:
sepIndex = 1
Do While Mid(xyString, sepIndex, 1) Like "[E0-9.+-]"
sepIndex = sepIndex + 1
Loop
I am then using the index number to calculate the Left of the same string, and convert it to a double. Using Right, I then cut the string down to the remaining part of the string that I want:
xFinal = CDbl(left(xyString, sepIndex - 1))
xyString = LTrim(right(xyString, Len(xyString) - sepIndex + 1))
This works great in VB, and I have no problems whatsoever. In C# it is different. Knowing that the only way to emulate Left/Mid/Right in C# is to create my own function, I did so (part of my Utils namespace):
public static string Mid(string param, int startIndex)
{
try
{
//start at the specified index and return all characters after it
//and assign it to a variable
string result = param.Substring(startIndex);
//return the result of the operation
return result;
}
catch
{
return string.Empty;
}
}
public static string Right(string param, int length)
{
try
{
//start at the index based on the lenght of the sting minus
//the specified lenght and assign it a variable
string result = param.Substring(param.Length - length, length);
//return the result of the operation
return result;
}
catch
{
return string.Empty;
}
}
public static string Left(string param, int length)
{
try
{
//we start at 0 since we want to get the characters starting from the
//left and with the specified lenght and assign it to a variable
string result = param.Substring(0, length);
//return the result of the operation
return result;
}
catch
{
return string.Empty;
}
}
To the best of my knowledge, I have converted the code from VB6 to C#. The problem is that when it finally sets xyString, it is cutting the string in the wrong spot, and still leaving a trailing character that I want in xFinal.
From what I gather, this may be either an index issue (which I know in C#, Substring is 0-based, so I changed sepIndex to the value of 0), or problem with the wrong loop structure.
Here is the converted code, and I hope I could get some idea of what is going on here:
sepIndex = 0;
while (Regex.IsMatch(Utils.Mid(xyString, sepIndex, 1), "[E0-9.+-]"))
{
sepIndex++;
}
xFinal = Convert.ToDouble(Utils.Left(xyString, sepIndex - 1)); // - 1
xyString = Utils.Right(xyString, xyString.Length - sepIndex + 1).TrimStart(' '); // + 1
EDIT
This has been solved. I simply removed the +1 and -1 from my Utils.Left function, and it returned the proper string. Thanks for the inputs.
Seems to be just RegEx math with negative condition - "[^E0-9.+-]":
var sepIndex = Regex.Match(xyString, #"[^E0-9\.+-]+").Index;
If you need to do it by hand the easiest way to get single character is to just get single character from the string without trimming:
// replace Utils.Mid(xyString, sepIndex, 1) with
xyString[sepIndex]
Try using following
do{
sepIndex++;
}while (Regex.IsMatch(Utils.Mid(xyString, sepIndex, 1), "[E0-9.+-]"))
why not importing microsoft.visualbasic namespace?
class Program
{
static void Main(string[] args)
{
string t = Microsoft.VisualBasic.Strings.Mid("hello", 2);
}
}
you can reuse what you had before