Parsing a string with, seemingly, no delimiter

I have the following string that I need to parse so I can insert its parts into a DB. The delimiter is "`":
`020 Some Description `060 A Different Description `100 And Yet Another `
I split the string into an array using this
var responseArray = response.Split('`');
Each item in responseArray then looks like this: 020 Some Description
How would I get the two different parts out of each item? The first part will be either 3 or 4 characters long. The second part will be no more than 35 characters long.
Due to some ridiculous strangeness beyond my control, there is a random amount of space between the first and second parts.

Or put the other two answers together, and get something that's more complete:
string[] response = input.Split(new[] { '`' }, StringSplitOptions.RemoveEmptyEntries);
foreach (String str in response) {
    int splitIndex = str.IndexOf(' ');
    if (splitIndex < 0) continue; // no space, so there is nothing to split
    string num = str.Substring(0, splitIndex);
    // Trim returns a new string, so the result has to be assigned.
    string desc = str.Substring(splitIndex).Trim();
}
So, basically, you use the first space as a delimiter to create two strings. Then you trim the second one, since Trim only applies to leading and trailing spaces, not to anything in between.
Edit: this is a straight implementation of Brad M's comment.
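As a slightly fuller sketch of the same idea, the version below also skips whitespace-only entries and tolerates an entry that has no space at all. The BacktickParser class and its Parse method are hypothetical names chosen for illustration; they do not come from the original posts.

```csharp
using System;
using System.Collections.Generic;

public static class BacktickParser
{
    // Hypothetical helper: returns (code, description) pairs from the raw response.
    public static List<KeyValuePair<string, string>> Parse(string response)
    {
        var rows = new List<KeyValuePair<string, string>>();
        var entries = response.Split(new[] { '`' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string raw in entries)
        {
            string entry = raw.Trim();
            if (entry.Length == 0) continue; // entry held only whitespace

            int space = entry.IndexOf(' ');
            if (space < 0)
            {
                // Assumption: a code without a description gets an empty description.
                rows.Add(new KeyValuePair<string, string>(entry, string.Empty));
                continue;
            }

            string code = entry.Substring(0, space);
            // Trim absorbs the variable-width gap after the code.
            string description = entry.Substring(space).Trim();
            rows.Add(new KeyValuePair<string, string>(code, description));
        }
        return rows;
    }
}
```

Calling BacktickParser.Parse(response) on the sample string should give three pairs, each ready to be passed to whatever insert routine the database layer uses.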

You can try this solution:
var inputString = "`020 Some Description `060 A Different Description `100 And Yet Another `";
int firstWordLength = 3;
int secondWordMaxLength = 35;
var result = inputString.Split('`')
.SelectMany(x => new[]
{
new String(x.Take(firstWordLength).ToArray()).Trim(),
new String(x.Skip(firstWordLength).Take(secondWordMaxLength).ToArray()).Trim()
});
Update: My first solution has some problems because it calls Trim after Take. Here is another approach with an extension method:
public static class Extensions
{
    public static IEnumerable<string> GetWords(this string source, int firstWordLength, int secondWordLength)
    {
        List<string> words = new List<string>();
        foreach (var word in source.Split(new[] {'`'}, StringSplitOptions.RemoveEmptyEntries))
        {
            // Splitting on spaces and dropping empty entries absorbs the variable-width gap.
            var parts = word.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length == 0) continue; // entry contained only spaces
            words.Add(new string(parts[0].Take(firstWordLength).ToArray()));
            words.Add(new string(string.Join(" ", parts.Skip(1)).Take(secondWordLength).ToArray()));
        }
        return words;
    }
}

Try this
string response = "`020 Some Description `060 A Different Description `100 And Yet Another `";
var responseArray = response.Split('`');
string[] splitArray = {};
string result = "";
foreach (string it in responseArray)
{
splitArray = it.Split(' ');
foreach (string ot in splitArray)
{
if (!string.IsNullOrWhiteSpace(ot))
result += "-" + ot.Trim();
}
}
splitArray = result.Substring(1).Split('-');

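string[] entries = input.Split(new[] { '`' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string entry in entries)
{
    // An iterator method does nothing until it is enumerated, so materialize it.
    string[] parts = GetStringParts(entry).ToArray();
}
IEnumerable<String> GetStringParts(String input)
{
    // Split on the first space only, so the description stays in one piece.
    foreach (string s in input.Split(new[] { ' ' }, 2))
        yield return s.Trim();
}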
Trim only removes leading/trailing whitespace per MSDN, so spaces in the description won't hurt you.

This works if the first part is an integer and you need to account for empty entries.
For me, the first entry was empty.
public void parse()
{
    string s = @"`020 Some Description `060 A Different Description `100 And Yet Another `";
    Int32 first;
    String second;
    foreach (string firstSecond in s.Split('`'))
    {
        // The foreach variable is read-only, so keep the trimmed copy in a local.
        string entry = firstSecond.TrimStart();
        if (string.IsNullOrEmpty(entry))
            continue; // the first and last entries are empty
        Int32 firstSpace = entry.IndexOf(' ');
        if (firstSpace > 0 && Int32.TryParse(entry.Substring(0, firstSpace), out first))
        {
            second = entry.Substring(firstSpace).Trim();
            System.Diagnostics.Debug.WriteLine("'" + first + "' '" + second + "'");
        }
    }
}

You can get the first part by finding the first space and taking a substring. The second part is also a substring. Try something like this:
foreach (string st in responseArray)
{
    int index = st.IndexOf(' ');
    if (index < 0) continue; // skip entries without a space
    string firstPart = st.Substring(0, index);
    // Taking the last 35 characters (st.Substring(st.Length - 35)) fails on shorter
    // entries, so take everything after the first space and trim it instead.
    string secondPart = st.Substring(index).Trim();
}

Related

string.PadRight() doesn't seem to work in my code

I have PowerShell output to reformat, because the formatting gets lost in my StandardOutput.ReadToEnd(). Each line contains several blanks that need to be removed, and I want the output to be readable.
Current output in my messageBox looks like
Microsoft.MicrosoftJigsaw All
Microsoft.MicrosoftMahjong All
What I want is
Microsoft.MicrosoftJigsaw All
Microsoft.MicrosoftMahjong All
What am I doing wrong?
My C# knowledge is still only at a basic level.
I found the question below, but maybe I don't understand the answer correctly; the solution doesn't work for me.
Padding a string using PadRight method
This is my current code:
string first = "";
string last = "";
int idx = line.LastIndexOf(" ");
if (idx != -1)
{
first = line.Substring(0, idx).Replace(" ","").PadRight(10, '~');
last = line.Substring(idx + 1);
}
MessageBox.Show(first + last);

The first parameter of String.PadRight() (and String.PadLeft()) defines the total length of the padded string, not the number of padding characters.
First, iterate through all your strings, split each one, and save the parts.
Second, find the length of the longest first part.
Finally, pad each first part to that length plus a gap and append the second part.
var strings = new []
{
"Microsoft.MicrosoftJigsaw All",
"Microsoft.MicrosoftMahjong All"
};
var keyValuePairs = new List<KeyValuePair<string, string>>();
foreach(var item in strings)
{
var parts = item.Split(new [] {" "}, StringSplitOptions.RemoveEmptyEntries);
keyValuePairs.Add(new KeyValuePair<string, string>(parts[0], parts[1]));
}
var longestStringCharCount = keyValuePairs.Select(kv => kv.Key).Max(k => k.Length);
var minSpaceCount = 5; // min space count between parts of the string
var formattedStrings = keyValuePairs.Select(kv => string.Concat(kv.Key.PadRight(longestStringCharCount + minSpaceCount, ' '), kv.Value));
foreach(var item in formattedStrings)
{
Console.WriteLine(item);
}
Result:
Microsoft.MicrosoftJigsaw      All
Microsoft.MicrosoftMahjong     All

PadRight(10) is not enough: the argument is the total length of the resulting string, not the number of characters to add.
I would probably go for something like:
string[] lines = new[]
{
"Microsoft.MicrosoftJigsaw All",
"Microsoft.MicrosoftMahjong All"
};
// iterate all (example) lines
foreach (var line in lines)
{
// split the string on spaces and remove empty ones
// (so multiple spaces are ignored)
// Of course, you must check that the split array has at least 2 elements.
string[] splitted = line.Split(new Char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
// reformat the string, with padding the first string to a total of 40 chars.
var formatted = splitted[0].PadRight(40, ' ') + splitted[1];
// write to anything as output.
Trace.WriteLine(formatted);
}
Will show:
Microsoft.MicrosoftJigsaw               All
Microsoft.MicrosoftMahjong              All
So you need to determine the maximum length of the first string.
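A minimal sketch of that last point, assuming each line consists of a name and a value separated by spaces. The ColumnAligner class, the Align method and the gap parameter are hypothetical names used only for this illustration. It relies on the fact that PadRight returns the string unchanged when the requested total width is not larger than the string's length, which is why PadRight(10, '~') on a 25-character name appears to do nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnAligner
{
    // Hypothetical helper: aligns "name   value" lines into two columns.
    public static List<string> Align(IEnumerable<string> lines, int gap)
    {
        // Split each line on runs of spaces; keep only lines with at least two parts.
        var rows = lines
            .Select(l => l.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .Where(p => p.Length >= 2)
            .ToList();
        if (rows.Count == 0) return new List<string>();

        // PadRight takes the TOTAL width, so base it on the longest first column.
        int width = rows.Max(p => p[0].Length) + gap;
        return rows.Select(p => p[0].PadRight(width) + p[1]).ToList();
    }
}
```

Note that the columns only line up visually in a monospaced font; a standard MessageBox typically uses a proportional font, so the padding may still look uneven there.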

This assumes the second part of your string is at most 10 characters long, but you can change that. Try the code below:
Function:
private string PrepareStringAfterPadding(string line, int totalLength)
{
int secondPartLength = 10;
int lastIndexOfSpace = line.LastIndexOf(" ");
string firstPart = line.Substring(0, lastIndexOfSpace + 1).Trim().PadRight(totalLength - secondPartLength);
string secondPart = line.Substring(lastIndexOfSpace + 1).Trim().PadLeft(secondPartLength);
return firstPart + secondPart;
}
Calling:
string line1String = PrepareStringAfterPadding("Microsoft.MicrosoftJigsaw All", 40);
string line2String = PrepareStringAfterPadding("Microsoft.MicrosoftMahjong All", 40);
Result:
Microsoft.MicrosoftJigsaw            All
Microsoft.MicrosoftMahjong           All
Note:
The code is for demonstration only; adjust totalLength, secondPartLength, and the way the function is called to fit your requirements.

Get Text From file C#

I am reading a text file line by line, and for each line I want to check whether it contains a special marker and, if so, extract the data inside it. In my case the marker is <#Tag()>: if a line contains <#Tag(param1)>, the code should return param1.
The problem is that a line may contain more than one <#Tag()>.
For Example Line is having - <#Tag(value1)> <#Tag(value2)> <#Tag(value3)>
Then it should return first value1 then value2 and then value3
string contents = File.ReadAllText(@"D:\Report Format.txt");
int start = contents.IndexOf("Header") + "Header".Length;
int end = contents.IndexOf("Data") - "Header".Length;
int length = end - start;
string headerData = contents.Substring(start, length);
headerData = headerData.Trim(' ', '-');
MessageBox.Show(headerData);
using (StringReader reader = new StringReader(headerData))
{
string line;
while ((line = reader.ReadLine()) != null)
{
if (line.Contains("<#Tag"))
{
string input = line;
string output = input.Split('<', '>')[1];
MessageBox.Show(output);
Globals.Tags.SystemTagDateTime.Read();
string newoutput = Globals.Tags.SystemTagDateTime.Value.ToString();
input = input.Replace(output, newoutput);
input = Regex.Replace(input, "<", "");
input = Regex.Replace(input, ">", "");
MessageBox.Show(input);
}
}
}

Try the following:
var matches = Regex.Matches(line, @"(?<=\<\#Tag\()\w+(?=\)\>)");
foreach (Match match in matches)
    MessageBox.Show(match.Value);
If you want to do what is described in the comments (replace each tag with a value), try the following:
var line = "<#Tag(value1)> <#Tag(value2)> <#Tag(value3)>";
var matches = Regex.Matches(line, @"(?<=\<\#Tag\()\w+(?=\)\>)");
// Use matches in your case to find the values; I assume 10, 20 and 30.
var values = new Dictionary<string, int>() { { "value1", 10 }, { "value2", 20 }, { "value3", 30 } };
const string fullMatchRegexTemplate = @"\<\#Tag\({0}\)\>";
foreach (var value in values)
    // Regex.Replace returns a new string, so assign the result back.
    line = Regex.Replace(line, string.Format(fullMatchRegexTemplate, value.Key), value.Value.ToString());
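An alternative sketch does the replacement in a single pass by using a capture group and a MatchEvaluator instead of building one regex per key. The TagReplacer class and its Replace method are hypothetical names, and leaving unknown tags untouched is an assumption that may not suit every case.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagReplacer
{
    // Matches <#Tag(name)> and captures the name between the parentheses.
    private static readonly Regex TagPattern = new Regex(@"<#Tag\((\w+)\)>");

    // Hypothetical helper: replaces each known tag with its value in one pass.
    public static string Replace(string line, IDictionary<string, string> values)
    {
        return TagPattern.Replace(line, m =>
        {
            string value;
            // Assumption: tags without a known value are left as they are.
            return values.TryGetValue(m.Groups[1].Value, out value) ? value : m.Value;
        });
    }
}
```

Because only the tags themselves are replaced, the spacing between them is preserved exactly as it was in the original line.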

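This might do the trick for you:
[^a-zA-Z0-9]
Basically, it matches all non-alphanumeric characters.
private string removeTag()
{
    string n = "<#Tag(value1)> <#Tag(value2)> <#Tag(value3)>";
    string tmp = Regex.Replace(n, "Tag", ""); // drop the literal "Tag"
    // Collapse every run of non-alphanumeric characters into one comma,
    // then trim the commas left at both ends.
    tmp = Regex.Replace(tmp, "[^0-9a-zA-Z]+", ",");
    return tmp.Trim(','); // "value1,value2,value3"
}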
Another one could be
string tmp = Regex.Replace(n, "[^0-9a-zA-Z]*[Tag]*[^0-9a-zA-Z]", ",");

You could do this with a regex (I'll work on one)- but as a simple shortcut just do:
var tags = line.Split(new string[] { "<#Tag" }, StringSplitOptions.None);
foreach(var tag in tags)
{
//now parse each one
}
I see tchelidze just posted regex that looks pretty good so I'll defer to that answer as the regex one.

You can also collect them after splitting the string by the constant values <#Tag( and )> like this:
string str = "<#Tag(value1)> <#Tag(value2)> <#Tag(value3)>";
string[] values = str.Split(new string[] { "<#Tag(", ")>" }, StringSplitOptions.RemoveEmptyEntries);
values contains:
value1, value2, value3
Show the results in MessageBox:
foreach (string val in values) {
if (!(String.IsNullOrEmpty(val.Trim()))) {
MessageBox.Show(val);
}
}
Edit based on your comment:
Can i display complete value1 value2 value3 in one message box not with comma but with the same spacing as it was
string text = "";
foreach (string val in values) {
text += val;
}
MessageBox.Show(text);
Based on the comment:
Now the last query Before showing it in message box I want to replace it by thier values for example 10 20 and 30
string text = "";
foreach (string val in values) {
// where val is matching your variable (let's assume you are using dictionary for storing the values)
// else is white space or other... just add to text var.
if (yourDictionary.ContainsKey(val)) {
text += yourDictionary[val];
} else {
text += val;
}
}
MessageBox.Show(text);

Separating string on each character and putting it in an array

I'm pulling a string from a MySQL database containing all ID's from friends in this format:
5+6+12+33+1+9+
Now, whenever I have to add a friend it's simple: I just take the string and append the new ID and a "+". My problem lies in separating all the IDs and putting them in an array. I'm currently using this method:
string InputString = "5+6+12+33+1+9+";
string CurrentID = string.Empty;
List<string> AllIDs = new List<string>();
for (int i = 0; i < InputString.Length; i++)
{
if (InputString.Substring(i,1) != "+")
{
CurrentID += InputString.Substring(i, 1);
}
else
{
AllIDs.Add(CurrentID);
CurrentID = string.Empty;
}
}
string[] SeparatedIDs = AllIDs.ToArray();
Even though this works, it seems overly complicated for what I'm trying to do.
Is there an easier or cleaner way to split strings?

Try this:
var result = InputString.Split(new char[] { '+' });
You can use other overloads of Split as well if you want to remove empty entries.

You should use the Split method with StringSplitOptions.RemoveEmptyEntries. RemoveEmptyEntries keeps an empty string from appearing as the last array element.
Example:
char[] delimiters = new[] { '+' };
string InputString = "5+6+12+33+1+9+";
string[] SeparatedIDs = InputString.Split(delimiters,
    StringSplitOptions.RemoveEmptyEntries);
foreach (string SeparatedID in SeparatedIDs)
Console.WriteLine(SeparatedID);

string[] IDs = InputString.Split('+'); will split your InputString into an array of strings using the + character

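var result = InputString.Split('+', StringSplitOptions.RemoveEmptyEntries);
The extra option is needed because of the trailing +; without it, the last array element would be an empty string. (This char overload with options is available in .NET Core 2.0 and later; on .NET Framework, pass new[] { '+' } instead.)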
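If the IDs are needed as numbers rather than strings, the split can be followed by a parse step. This is only a sketch: the FriendIds class and Parse method are hypothetical names, and it assumes every entry is a valid integer (int.Parse throws a FormatException otherwise).

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class FriendIds
{
    // Hypothetical helper: turns "5+6+12+" into { 5, 6, 12 }.
    public static int[] Parse(string input)
    {
        return input
            // RemoveEmptyEntries drops the empty string after the trailing '+'.
            .Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(id => int.Parse(id.Trim(), CultureInfo.InvariantCulture))
            .ToArray();
    }
}
```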

trying to remove unique words from a string

I believe my code needs to remove duplicate instances of the same word (the complete word).
What do I mean by complete words? Well, given the following string:
THIIS IS IS A TEST STRRING
I need this returned:
THIIS IS A TEST STRRING
My code returns this:
THIS IS A TEST STRING
var items = sString.Split(' ');
var uniqueItems = new HashSet<string>(items);
foreach (var item in uniqueItems)
{
strBuilder.Append(item);
strBuilder.Append(" ");
}
finalString = strBuilder.ToString().TrimEnd();
How can I retain duplicate characters within a word, but remove complete duplicate words entirely?

You need Split and Distinct
var words = "THIIS IS IS A TEST STRRING".Split(' ').Distinct();
var result = string.Join(" ", words);
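If you would rather make the ordering explicit than depend on how Distinct enumerates its results, a loop over a HashSet keeps the first occurrence of each word in its original position. This is a sketch only; WordDeduper and RemoveDuplicateWords are hypothetical names, and the comparison is case-sensitive by assumption.

```csharp
using System;
using System.Collections.Generic;

public static class WordDeduper
{
    // Hypothetical helper: keeps the first occurrence of each whole word, in order.
    public static string RemoveDuplicateWords(string sentence)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        var kept = new List<string>();
        foreach (string word in sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // HashSet.Add returns false when the word was already present.
            if (seen.Add(word)) kept.Add(word);
        }
        return string.Join(" ", kept);
    }
}
```

Only whole words are compared, so repeated letters inside a word such as THIIS are never touched. Swapping in StringComparer.OrdinalIgnoreCase would treat "Is" and "IS" as the same word.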

Bro, call this method. I know it's a bit complex and lengthy, but you will be amazed by the results! (Note that it only removes a repeated word when it directly follows its duplicate.)
public string IdentifySamewords(string str)
{
string[] subs=null;
char[] ch=str.ToCharArray();
int count=0;
for(int i=0;i<ch.Length;i++)
{
if(ch[i]==' ')
count++;
}
count++;
subs=new string[count];
count=0;
for(int i=0;i<ch.Length;i++)
{
if(ch[i]==' ')
count++;
else
subs[count]+=ch[i].ToString();
}
string current=null,prev=null,res=null;
for(int i=0;i<subs.Length;i++)
{
current=subs[i];
if(current!=prev)
res+=current+" ";
prev=current;
}
return res;
}

C# fix sentence

I need to take a sentence that is all on one line with no spaces, where each new word starts with a capital letter (e.g. "StopAndSmellTheRoses"), and convert it to "Stop and smell the roses". This is the function I have, but I keep getting an argument out of range error on the Insert method. Thanks in advance for any help.
private void FixSentence()
{
// String to hold our sentence in trim at same time
string sentence = txtSentence.Text.Trim();
// loop through the string
for (int i = 0; i < sentence.Length; i++)
{
if (char.IsUpper(sentence, i) & sentence[i] != 0)
{
// Change to lowercase
char.ToLower(sentence[i]);
// Insert space behind the character
// This is where I get my error
sentence = sentence.Insert(i-1, " ");
}
}
// Show our Fixed Sentence
lblFixed.Text = "";
lblFixed.Text = "Fixed: " + sentence;
}

The best way to build up a String in this manner is to use a StringBuilder instance.
var sentence = txtSentence.Text.Trim();
var builder = new StringBuilder();
foreach (var cur in sentence) {
    if (Char.IsUpper(cur) && builder.Length != 0) {
        builder.Append(' ');
        builder.Append(Char.ToLower(cur)); // lowercase every word after the first
    } else {
        builder.Append(cur);
    }
}
// Show our Fixed Sentence
lblFixed.Text = "";
lblFixed.Text = "Fixed: " + builder.ToString();
Using the Insert method creates a new string instance every time resulting in a lot of needlessly allocated values. The StringBuilder though won't actually allocate a String until you call the ToString method.

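Modifying the sentence variable inside the loop that walks it is fragile: each Insert shifts the remaining characters, and at i = 0 the call Insert(i - 1, " ") passes -1, which throws the ArgumentOutOfRangeException.
Instead, use a second string variable and append each word you find to it.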

Here is the answer
var finalstr = Regex.Replace(
"StopAndSmellTheRoses",
"(?<=[a-z])(?<x>[A-Z])|(?<=.)(?<x>[A-Z])(?=[a-z])|(?<=[^0-9])(?<x>[0-9])(?=.)",
me => " " + me.Value.ToLower()
);
will output
Stop and smell the roses

Another version:
public static class StringExtensions
{
    public static string FixSentence(this string instance)
    {
        if (string.IsNullOrEmpty(instance)) return instance;
        // Splitting on the capitals themselves would discard them, so split on the
        // zero-width position in front of each capital (except at the start).
        string[] words = Regex.Split(instance, "(?<!^)(?=[A-Z])");
        string result = string.Join(" ", words);
        return char.ToUpper(result[0]) + result.Substring(1).ToLower();
    }
}
