Split a non-delimited, variable-length string - C#

Given the following string:
38051211)JLRx(0x04>0x01):JL_PAGE_INFO(0x63,0x00,0x01,0x03,0x00,0x73,0x00,0x00,0x0A,0x01,0x01,0xF2,0x01)
How can I split it so that I can use each part in a ListView column?
I can split at the ':' for example, but then I need to split at the next '(' and split each value after that at the ','.
Any advice would be greatly appreciated. It is adding the 2nd, 3rd, and later parts that I am struggling with.
{
if (line.Contains("JLTx"))
{
string[] JLTx = line.Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
listView1.Items.Add(JLTx[0]);
listView1.Items[listView1.Items.Count - 1].SubItems.Add(JLTx[1]);
}
}
So, using the following regex:
Regex regex = new Regex(@"(.*)JLTx\((.*)\):(JL_[(A-Z)_]*)\((.*)\)");
I can't seem to split at the ':' because it is not in any of the matches. Where am I going wrong?
Thanks all

As others have pointed out, you have a lot of options for parsing this into some header-friendly format. @John's answer would work if your JL_PAGE_INFO is stable for all input. You could also use a regex. A lot depends on how stable your input data is. Here is an example using string functions to create the list of headers you described.
static IEnumerable<string> Tokenize(string input)
{
if (string.IsNullOrEmpty(input))
yield break;
if (')' != input[input.Length - 1])
yield break;
int colon = input.IndexOf(':');
string pageData = input.Substring(colon + 1);
if (string.IsNullOrEmpty(pageData))
yield break;
int open = pageData.IndexOf('(');
if (colon != -1 && open != -1)
{
yield return input.Substring(0, colon+1);
foreach (var token in pageData.Substring(open+1, pageData.Length - (open + 1) - 1).Split(','))
yield return token;
}
}

If the second item of your split string - JLTx[1] - is always going to be JL_PAGE_INFO(...) I would try this:
string[] mystring = JLTx[1].Replace("JL_PAGE_INFO(","").Replace(")","").Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
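On the regex from the question: the ':' sits outside every pair of parentheses in that pattern, so it is matched but never captured, which is why it does not show up in any group. A minimal sketch of a regex-based alternative follows. It assumes every line has the layout id)JLRx(route):NAME(comma-separated values), or the same with JLTx. The names LineParser, ToColumns, and the group names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LineParser
{
    // Assumed layout (hypothetical group names):
    //   <id>)JLRx(<route>):<name>(<comma-separated args>)
    // JL[RT]x accepts both the JLRx and JLTx markers seen in the question.
    private static readonly Regex LinePattern = new Regex(
        @"^(?<id>[^)]*)\)JL[RT]x\((?<route>[^)]*)\):(?<name>JL_[A-Z_]+)\((?<args>[^)]*)\)$");

    // Returns one string per ListView column, or an empty list if the
    // line does not follow the assumed layout.
    public static List<string> ToColumns(string line)
    {
        var columns = new List<string>();
        Match match = LinePattern.Match(line);
        if (!match.Success)
            return columns;

        columns.Add(match.Groups["id"].Value);
        columns.Add(match.Groups["route"].Value);
        columns.Add(match.Groups["name"].Value);
        // The argument list is the only part that is really delimited.
        columns.AddRange(match.Groups["args"].Value.Split(','));
        return columns;
    }
}
```

The first element could go to listView1.Items.Add(...) and the remaining ones to SubItems.Add(...), as in the question's code.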

Related

How to fix Join(" ") doesnt add a space to the concatenated string

When I try to Join() an array of words using a space as the separator,
the space seems to be left out. Any idea why?
A little background: I am trying to reverse a string of words given to me, so that
"hi my name is" should become "is name my hi"
public static string ReverseWords(string text)
{
string[] words = text.Split(' ');
string s = "";
for (int i = words.Length - 1; i >= 0; i--)
{
s+= string.Join(" ", words [i]);
}
return s;
}
The expected outcome would have been: "world! hello"
But it was: "world!hello"
As you can see I'm missing the space between world! and hello.
Any ideas?

You're calling Join with a single word at a time, using the overload accepting a parameter array. Joining a single item will always just return that item - there's nothing else to join it with.
Instead of that, just call it with all the words, in reverse order:
public static string ReverseWords(string text)
{
string[] words = text.Split(' ');
return string.Join(" ", words.Reverse());
}

Join combines the elements of the array with the specified separator, but you are adding each element separately. So no space is added.
string.Join(" ", words.Reverse());

string.Join() is designed to work on a list of strings, placing the separator char between each. You're only giving it one at a time, hence no separator. Try this:
public static string ReverseWords(string text)
{
string[] words = text.Split(' ');
return string.Join(" ", words.Reverse());
}
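For reference, a self-contained sketch of the approach from the answers above. It calls Enumerable.Reverse explicitly (from System.Linq), which reverses the order of the array elements, not the characters inside each word. The class name WordReverser is only for illustration.

```csharp
using System;
using System.Linq;

public static class WordReverser
{
    public static string ReverseWords(string text)
    {
        // Split into words, reverse the sequence of words, then let
        // Join place a single space between neighbouring words.
        string[] words = text.Split(' ');
        return string.Join(" ", Enumerable.Reverse(words));
    }
}
```

Called with "hi my name is", this returns "is name my hi".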

Sorry that it took me a while, guys, I'm still a beginner.
But the .Reverse() function wasn't it for me, because it turned every char in the word backwards, which wasn't my objective.
The trick was as follows:
public static string ReverseWords(string text)
{
string[] words = text.Split(' ');
string s = words[words.Length - 1];
for (int i = words.Length - 2; i >= 0; i--)
{
s+= " " + words[i];
}
return s;
}
Many thanks for the guidance.
@igor, you were right: I kept trying to do this exercise in an online environment, and when I copied it to VS I could discover what was wrong by slowly debugging.

Regex patterns C#

I want to validate a string so that, if a "-" is present, it has a letter before and after it.
But I am unable to form the regex pattern.
Can anyone please help me with this?

Rather than using a regex to check this I think I would write an extension method using Char.IsLetter(). You can handle multiple dashes then, and use languages other than English.
public static bool IsValidDashedString(this String text)
{
bool temp = true;
//retrieve the location of all the dashes
var indexes = Enumerable.Range(0, text.Length)
.Where(i => text[i] == '-')
.ToList();
//check if any dashes occur, if they are the 1st character or the last character
if (indexes.Count() == 0 ||
indexes.Any(i => i == 0) ||
indexes.Any(i => i == text.Length-1))
{
temp = false;
}
else //check if each dash is preceded and followed by a letter
{
foreach (int i in indexes)
{
if (!Char.IsLetter(text[i - 1]) || !Char.IsLetter(text[i + 1]))
{
temp = false;
break;
}
}
}
return temp;
}

The following will match a string with one alphabetic character before the "-" and one after (note that [A-z] would also match the punctuation characters that sit between 'Z' and 'a' in ASCII, so both ranges are spelled out):
[A-Za-z]-[A-Za-z]
You may need to first test whether a "-" is present, if that is not always the case. More information about the possible string contents, and exactly why you need to perform the test, would help.

(^.+)(\D+)(-)(\D+)(.+)
I have tested this for some examples here http://regexr.com/39vfq
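Another way to phrase the check as a regex is to look for a "bad" dash and negate the result. The sketch below assumes the rule is read as "a string without any dash is valid", which is a different choice from the extension method above (that one rejects strings with no dash). DashValidator and its member names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DashValidator
{
    // Matches a dash that is NOT preceded by a letter, or a dash that is
    // NOT followed by a letter. \p{L} is any Unicode letter.
    private static readonly Regex BadDash =
        new Regex(@"(?<!\p{L})-|-(?!\p{L})");

    // True when every dash in the text has a letter on both sides.
    // A text without any dash is treated as valid (assumption).
    public static bool HasOnlyLetterFlankedDashes(string text)
    {
        return !BadDash.IsMatch(text);
    }
}
```

Because the lookarounds inspect only the neighbouring characters, this handles any number of dashes in one pass.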

Cut a text on a string when '.' is found in the text [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Remove characters after specific character in string, then remove substring?
I have a string variable that contains text like this:
"Hello My name is B and I love soccer. I live in California."
I want to cut the text after the first '.' so that it displays
"Hello My name is B and I love soccer."
How can I do it in the simplest way?
I tried:
Char Mychar = '.';
Stringvariabel.trimstart(Mychar);
But I guess it's wrong.

Char Mychar = '.';
Stringvariabel = Stringvariabel.Split(Mychar).First() + Mychar.ToString();

If you're only interested in the first sentence, then just grab a substring starting at the beginning and ending at the '.'.
Stringvariabel.Substring(0, Stringvariabel.IndexOf('.') + 1);

You can use string.Split to get the result:
string input = "Hello My name is B and I love soccer. I live in California. ";
string result = string.Format("{0}.", input.Split('.').First());

The IndexOf function will do the work for you:
string input = "Hello My name is B and I love soccer. I live in California. ";
int i = input.IndexOf('.');
string result = input.Substring(0, i + 1);

One convenient way is to use string.Split and ask for at most two parts, keeping only the first (a count of 1 would return the whole string unsplit):
var firstPart = input.Split(new[] { '.' }, 2).First();
This is quite efficient because it won't continue splitting the string after the first dot, but it will remove the dot (if it exists) and you will not be able to tell if there was a dot in the first place.
The other option is string.IndexOf and a conditional:
var index = input.IndexOf(".");
if (index != -1) {
input = input.Substring(0, index + 1); // + 1 keeps the dot
}

TrimStart removes characters from the start of a string that are in the list you give it. It would only remove a . if it appears at the very start.
You can find the first . and take a substring up to that point:
stringVar.Substring(0, stringVar.IndexOf('.') + 1);

You can do something like below
stringVariable.Split('.')[0]
or
stringVariable.Substring(0, stringVariable.IndexOf(".") + 1)
Hope this Helps!!

The simplest way would be to take the substring up until the first occurrence of the character.
public string TrimAtFirstChar(string s, char c)
{
int index = s.IndexOf(c);
if(index == -1) //there is no '.' in the string
return s;
return s.Substring(0, index);
}
Alternately, to avoid worrying about the case where there is no '.', you could use stringvariable.Split('.')[0].

Take a look at the String.Split method.
var myString = "Hello My name is B and I love soccer. I live in California. ";
var firstPart = myString.Split('.')[0];

var splitLine = yourString.Split('.');
if (splitLine != null && splitLine.Length > 0)
return splitLine[0];

Without split, only using Substring and IndexOf (which is more efficient when the text is very large):
int index = text.IndexOf(".") + 1;
String result = text;
if(index > 0)
result = text.Substring(0, index);
http://ideone.com/HL6GwN

Please be aware that String.Split() has a cost, because it creates an array containing all substrings delimited by the given separator, while you are only interested in the first occurrence. So using IndexOf() and Substring() makes much more sense.
string input = "Hello My name is B and I love soccer. I live in California. ";
var index = input.IndexOf(".");
var result = index >= 0
? input.Substring(0, index + 1)
: input;

variablename.Substring(0, variablename.IndexOf('.') + 1);
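Several of the one-liners above throw or return an empty string when the text contains no '.', because IndexOf returns -1 in that case. A minimal sketch that keeps the terminating dot and falls back to the whole text follows. The names SentenceCutter and FirstSentence are made up for this example.

```csharp
using System;

public static class SentenceCutter
{
    // Returns everything up to and including the first terminator,
    // or the unchanged text when the terminator does not occur.
    public static string FirstSentence(string text, char terminator = '.')
    {
        if (string.IsNullOrEmpty(text))
            return text;

        int index = text.IndexOf(terminator);
        return index >= 0 ? text.Substring(0, index + 1) : text;
    }
}
```

With the string from the question, FirstSentence returns "Hello My name is B and I love soccer."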

Regex to match separate integer list in c#

I want to match a comma-separated integer list using Regex. I have used the pattern below, but it doesn't work for me.
if (!Regex.IsMatch(textBox_ImportRowsList.Text, @"^([0-9]+(,[0-9]+))*"))
{
errorProvider1.SetError(label_ListRowPosttext, "Row Count invalid!");
}
Valid Inputs:
1
1,2
1,4,6,10
Invalid Inputs:
1,
1.1
1,A
2,/,1
,1,3

use this regular expression:
^\d+(,\d+)*$

EDIT
Best way to validate your comma separated string is
string someString = "1,2,3";
bool myResults = someString.Split(',').
Any<string>(s => !isNumeric(s));
if(myResults)
Console.WriteLine("invalid number");
else
Console.WriteLine("valid number");
public bool isNumeric(string val)
{
if(val == String.Empty)
return false;
int result;
return int.TryParse(val,out result);
}
The following might also work for you. This regex will also match an empty string:
^(\d+(,\d+)*)?$
or, if an empty string should be rejected:
^\d+(,\d+)*$
Start with an integer, so '\d+'. That is one or more digit characters ('0'-'9').
Then add a set of parentheses containing ',\d+' and put an asterisk after it. This allows any number of further "comma plus integer" pairs.

You've got the asterisk in the wrong place. Instead of this:
@"^([0-9]+(,[0-9]+))*"
...use this:
@"^([0-9]+(,[0-9]+)*)"
Additionally, you should anchor the end like you did the beginning, and don't really need the outermost set of parentheses:
@"^[0-9]+(,[0-9]+)*$"

You could use ^\d+(,\d+)*$ but, as @Lazarus has pointed out, this may be a case where regex is a bit of overkill and string.Split() would be a better fit. You could even combine it with int.TryParse if you are going to work with the numbers afterwards.
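Both suggestions can be sketched side by side. The names RowListValidator, IsValidByRegex, and IsValidBySplit are hypothetical. The regex variant uses \z rather than $ because $ in .NET also matches just before a trailing newline. The split variant passes NumberStyles.None so that int.TryParse accepts digits only (no sign, no surrounding whitespace), and it also rejects values that do not fit in an int, which the regex does not.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class RowListValidator
{
    // One or more digits, followed by any number of ",digits" groups,
    // anchored at both ends of the input.
    public static bool IsValidByRegex(string input)
    {
        return Regex.IsMatch(input, @"^[0-9]+(,[0-9]+)*\z");
    }

    // Split on commas and require every piece to parse as a plain
    // non-negative integer. Empty pieces (from "1," or ",1") fail.
    public static bool IsValidBySplit(string input)
    {
        return input.Split(',').All(part =>
            int.TryParse(part, NumberStyles.None,
                         CultureInfo.InvariantCulture, out _));
    }
}
```

Either method can replace the Regex.IsMatch call in the question's if statement.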

you can try with this code
List<string> list = new List<string>();
list.Add("1");
list.Add("1.1");
list.Add("1,A");
list.Add("2,/,1");
foreach (var item in list)
{
if (!Regex.IsMatch(item, @"^([0-9]+(,[0-9]+)*)$"))
{
Console.WriteLine("no match :" + item);
}
}

try this one
String strInput = textBox_ImportRowsList.Text;
foreach (String s in strInput.Split(new[]{',', ' '}, StringSplitOptions.RemoveEmptyEntries))
{
if(!Regex.IsMatch(s, @"^\d+(,\d+)*$"))
{
errorProvider1.SetError(label_ListRowPosttext, "Row Count invalid!");
}
}

How to remove words based on a word count

Here is what I'm trying to accomplish. I have an object coming back from
the database with a string description. This description can be up to 1000
characters long, but we only want to display a short view of it. So I coded
up the following, but I'm having trouble actually removing the extra words
once the regular expression has found the total word count. Does anyone
have a good way of displaying only the first N words found by Regex.Matches?
Thanks!
if (!string.IsNullOrEmpty(myObject.Description))
{
string original = myObject.Description;
MatchCollection wordColl = Regex.Matches(original, @"[\S]+");
if (wordColl.Count < 70) // 70 words?
{
uxDescriptionDisplay.Text =
string.Format("<p>{0}</p>", myObject.Description);
}
else
{
string shortendText = original.Remove(200); // 200 characters?
uxDescriptionDisplay.Text =
string.Format("<p>{0}</p>", shortendText);
}
}
EDIT:
So this is what I got working on my own:
else
{
int count = 0;
StringBuilder builder = new StringBuilder();
string[] workingText = original.Split(' ');
foreach (string word in workingText)
{
if (count < 70)
{
builder.AppendFormat("{0} ", word);
}
count++;
}
string shortendText = builder.ToString();
}
It's not pretty, but it worked. I would call it a pretty naive way of doing this. Thanks for all of the suggestions!

I would opt to go by a strict character count rather than a word count because you might happen to have a lot of long words.
I might do something like (pseudocode)
if text.Length > someLimit
find first whitespace after someLimit (or perhaps last whitespace immediately before)
display substring of text
else
display text
Possible code implementation:
string TruncateText(string input, int characterLimit)
{
if (input.Length > characterLimit)
{
// find last whitespace immediately before limit
int whitespacePosition = input.Substring(0, characterLimit).LastIndexOf(" ");
// or find first whitespace after limit (what is spec?)
// int whitespacePosition = input.IndexOf(" ", characterLimit);
if (whitespacePosition > -1)
return input.Substring(0, whitespacePosition);
}
return input;
}

One method, if you're using at least C#3.0, would be a LINQ like the following. This is provided you're going strictly by word count, not character count.
if (wordColl.Count > 70)
{
foreach (var subWord in wordColl.Cast<Match>().Select(r => r.Value).Take(70))
{
//Build string here out of subWord
}
}
I did a test using a simple Console.WriteLine with your Regex and your question body (which is over 70 words, it turns out).
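The "build string here" step can be filled in with Take and string.Join. The following is a sketch that goes strictly by word count and splits on whitespace instead of using the regex. The names DescriptionShortener and TakeWords, and the "..." suffix, are assumptions for illustration.

```csharp
using System;
using System.Linq;

public static class DescriptionShortener
{
    // Returns the text unchanged when it has at most maxWords words,
    // otherwise the first maxWords words followed by "...".
    public static string TakeWords(string text, int maxWords)
    {
        if (string.IsNullOrEmpty(text) || maxWords <= 0)
            return string.Empty;

        // A null separator array makes Split break on any whitespace;
        // RemoveEmptyEntries collapses runs of spaces.
        string[] words = text.Split((char[])null,
                                    StringSplitOptions.RemoveEmptyEntries);
        if (words.Length <= maxWords)
            return text;

        return string.Join(" ", words.Take(maxWords)) + "...";
    }
}
```

In the question's code this would be used as uxDescriptionDisplay.Text = string.Format("<p>{0}</p>", DescriptionShortener.TakeWords(original, 70)).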

You can use Regex Capture Groups to hold the match and access it later.
For your application, I'd recommend instead simply splitting the string by spaces and returning the first n elements of the array:
if (!string.IsNullOrEmpty(myObject.Description))
{
string original = myObject.Description;
string[] words = original.Split(' ');
if (words.Length < 70)
{
uxDescriptionDisplay.Text =
string.Format("<p>{0}</p>", original);
}
else
{
string shortDesc = string.Empty;
for(int i = 0; i < 70; i++) shortDesc += words[i] + " ";
uxDescriptionDisplay.Text =
string.Format("<p>{0}</p>", shortDesc.Trim());
}
}

Are you wanting to remove 200 characters or start truncating at the 200th character? When you call original.Remove(200) you are indexing the start of the truncation at the 200th character. This is how you use Remove() for a certain number of characters to remove:
string shortendText = original.Remove(0,200);
This starts at the first character and removes 200 characters from there. I imagine that's not what you're trying to do, since you're shortening a description; it is merely the way to use Remove() when you want to delete a given number of characters.
Instead of using a Regex MatchCollection, why not just split the string? It's a lot easier and more straightforward. You can set the delimiter to a space character and split that way. I'm not sure what the data in your description looks like, so this may not completely cover your need, but it just might. You split this way:
String[] wordArray = original.Split(' ');
From there you can determine the word count with wordArray's Length property value.
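The difference between the two Remove overloads is easy to see on a short string. The helper names below are hypothetical, and the length guards are there only so that the sketch does not throw on short input.

```csharp
using System;

public static class RemoveDemo
{
    // Remove(startIndex) deletes everything from startIndex to the end,
    // so the first 'count' characters are what remains.
    public static string KeepFirst(string text, int count)
    {
        return text.Length > count ? text.Remove(count) : text;
    }

    // Remove(0, count) deletes the first 'count' characters instead.
    public static string DropFirst(string text, int count)
    {
        return text.Length > count ? text.Remove(0, count) : string.Empty;
    }
}
```

For "abcdefghij" and a count of 3, KeepFirst gives "abc" and DropFirst gives "defghij".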

If I were you, I would go by characters, as you may have many one-letter words or many long words in your text.
Go through the text until you reach your character limit. Then either find the next space and copy everything up to it into a new string (possibly using the Substring method), or cut at the limit and append a few full stops. The latter could look unprofessional, I suppose.
