Get Text From file C# - c#

I am reading text file line By line and in that I want to get data between special characters after checking whether line containing special character or not.In my case I want to check whether line contains <#Tag()> and if it contains then fetch the string between () i.e. line is having <#Tag(param1)> then it should return param1
But the problem is line may contains more then one <#Tag()>
For Example Line is having - <#Tag(value1)> <#Tag(value2)> <#Tag(value3)>
Then it should return first value1 then value2 and then value3
string contents = File.ReadAllText(#"D:\Report Format.txt");
int start = contents.IndexOf("Header") + "Header".Length;
int end = contents.IndexOf("Data") - "Header".Length;
int length = end - start;
string headerData = contents.Substring(start, length);
headerData = headerData.Trim(' ', '-');
MessageBox.Show(headerData);
using (StringReader reader = new StringReader(headerData))
{
string line;
while ((line = reader.ReadLine()) != null)
{
if (line.Contains("<#Tag"))
{
string input = line;
string output = input.Split('<', '>')[1];
MessageBox.Show(output);
Globals.Tags.SystemTagDateTime.Read();
string newoutput = Globals.Tags.SystemTagDateTime.Value.ToString();
input = input.Replace(output, newoutput);
input = Regex.Replace(input, "<", "");
input = Regex.Replace(input, ">", "");
MessageBox.Show(input);
}
}
}

Try following
var matches = Regex.Matches(line, #"(?<=\<\#Tag\()\w+(?=\)\>)")
foreach (Match match in matches)
MessageBox.Show(match.Value);
If you want to accomplish context described in comments try following.
var line = "<#Tag(value1)> <#Tag(value2)> <#Tag(value3)>";
var matches = Regex.Matches(line, #"(?<=\<\#Tag\()\w+(?=\)\>)");
//use matches in your case to find values. i assume 10, 20 , 30
var values = new Dictionary<string, int>() { { "value1", 10 }, { "value2", 20 }, { "value3", 30 } };
const string fullMatchRegexTemplate = #"\<\#Tag\({0}\)\>";
foreach (var value in values)
Regex.Replace(line, string.Format(fullMatchRegexTemplate, value.Key), value.Value.ToString());

This might do the trick for you
[^a-zA-Z0-9]
Basically it matches all non-alphanumeric characters.
private void removeTag()
{
string n = "<#Tag(value1)> <#Tag(value2)> <#Tag(value3)>";
string tmp = Regex.Replace(n, "Tag+", "");
tmp = Regex.Replace(tmp, "[^0-9a-zA-Z]+", ",") ;
}
Another one could be
string tmp = Regex.Replace(n, "[^0-9a-zA-Z]*[Tag]*[^0-9a-zA-Z]", ",");

You could do this with a regex (I'll work on one)- but as a simple shortcut just do:
var tags = line.Split(new string[] { "<#Tag" }, StringSplitOptions.None);
foreach(var tag in tags)
{
//now parse each one
}
I see tchelidze just posted regex that looks pretty good so I'll defer to that answer as the regex one.

You can also collect them after splitting the string by the constant values <#Tag( and )> like this:
string str = "<#Tag(value1)> <#Tag(value2)> <#Tag(value3)>";
string[] values = str.Split(new string[] { "<#Tag(", ")>" }, StringSplitOptions.RemoveEmptyEntries);
values contains:
value1, value2, value3
Show the results in MessageBox:
foreach (string val in values) {
if (!(String.IsNullOrEmpty(val.Trim()))) {
MessageBox.Show(val);
}
}
Edit based on you comment:
Can i display complete value1 value2 value3 in one message box not with comma but with the same spacing as it was
string text = "";
foreach (string val in values) {
text += val;
}
MessageBox.Show(text);
Based on the comment:
Now the last query Before showing it in message box I want to replace it by thier values for example 10 20 and 30
string text = "";
foreach (string val in values) {
// where val is matching your variable (let's assume you are using dictionary for storing the values)
// else is white space or other... just add to text var.
if (yourDictionary.ContainsKey(val)) {
text += yourDictionary[val];
} else {
text += val;
}
}
MessageBox.Show(text);

Related

string.PadRight() doesn't seem to work in my code

I have a Powershell output to re-format, because formatting gets lost in my StandardOutput.ReadToEnd().There are several blanks to be removed in a line and I want to get the output formatted readable.
Current output in my messageBox looks like
Microsoft.MicrosoftJigsaw All
Microsoft.MicrosoftMahjong All
What I want is
Microsoft.MicrosoftJigsaw All
Microsoft.MicrosoftMahjong All
What am I doing wrong?
My C# knowledge still is basic level only
I found this question here, but maybe I don't understand the answer correctly. The solution doesn't work for me.
Padding a string using PadRight method
This is my current code:
string first = "";
string last = "";
int idx = line.LastIndexOf(" ");
if (idx != -1)
{
first = line.Substring(0, idx).Replace(" ","").PadRight(10, '~');
last = line.Substring(idx + 1);
}
MessageBox.Show(first + last);
String.PadLeft() first parameter defines the length of the padded string, not padding symbol count.
Firstly, you can iterate through all you string, split and save.
Secondly, you should get the longest string length.
Finally, you can format strings to needed format.
var strings = new []
{
"Microsoft.MicrosoftJigsaw All",
"Microsoft.MicrosoftMahjong All"
};
var keyValuePairs = new List<KeyValuePair<string, string>>();
foreach(var item in strings)
{
var parts = item.Split(new [] {" "}, StringSplitOptions.RemoveEmptyEntries);
keyValuePairs.Add(new KeyValuePair<string, string>(parts[0], parts[1]));
}
var longestStringCharCount = keyValuePairs.Select(kv => kv.Key).Max(k => k.Length);
var minSpaceCount = 5; // min space count between parts of the string
var formattedStrings = keyValuePairs.Select(kv => string.Concat(kv.Key.PadRight(longestStringCharCount + minSpaceCount, ' '), kv.Value));
foreach(var item in formattedStrings)
{
Console.WriteLine(item);
}
Result:
Microsoft.MicrosoftJigsaw All
Microsoft.MicrosoftMahjong All
The PadRight(10 is not enough, it is the size of the complete string.
I would probably go for something like:
string[] lines = new[]
{
"Microsoft.MicrosoftJigsaw All",
"Microsoft.MicrosoftMahjong All"
};
// iterate all (example) lines
foreach (var line in lines)
{
// split the string on spaces and remove empty ones
// (so multiple spaces are ignored)
// ofcourse, you must check if the splitted array has atleast 2 elements.
string[] splitted = line.Split(new Char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
// reformat the string, with padding the first string to a total of 40 chars.
var formatted = splitted[0].PadRight(40, ' ') + splitted[1];
// write to anything as output.
Trace.WriteLine(formatted);
}
Will show:
Microsoft.MicrosoftJigsaw All
Microsoft.MicrosoftMahjong All
So you need to determine the maximum length of the first string.
Assuming the length of second part of your string is 10 but you can change it. Try below piece of code:
Function:
private string PrepareStringAfterPadding(string line, int totalLength)
{
int secondPartLength = 10;
int lastIndexOfSpace = line.LastIndexOf(" ");
string firstPart = line.Substring(0, lastIndexOfSpace + 1).Trim().PadRight(totalLength - secondPartLength);
string secondPart = line.Substring(lastIndexOfSpace + 1).Trim().PadLeft(secondPartLength);
return firstPart + secondPart;
}
Calling:
string line1String = PrepareStringAfterPadding("Microsoft.MicrosoftJigsaw All", 40);
string line2String = PrepareStringAfterPadding("Microsoft.MicrosoftMahjong All", 40);
Result:
Microsoft.MicrosoftJigsaw All
Microsoft.MicrosoftMahjong All
Note:
Code is given for demo purpose please customize the totalLength and secondPartLength and calling of the function as per your requirement.

Split string with string and retrieve value c#

I have below string
string arguments = "-p=C:\Users\mplususer\Documents\SharpDevelop Projects\o9\o9\bin\Debug\o9.exe""-t=False""-r=TestRun";
I want to split string with "-t=" and get false value
if (arguments.Contains("-t="))
{
string[] value = arguments.Split(new string[] { "\"-t=" }, StringSplitOptions.None);
if (value.Length != 0)
{
mode = value[1].Replace("\"", "");
}
}
Simply make an IndexOf:
var i = myString.IndexOf("-t=") + 3;
if(i != -1)
value = myString.SubString(i, myString.IndexOf("\"", i) - i);
You can also use a regex for this:
var r = new Regex("-t (true|false)");
var g = r.Match(myString).Groups;
if(g.Count > 1)
value = Convert.ToBoolean(g[1].ToString());
You can do it also like this like your approach without regex:
string arguments = "-p=C:\\Users\\mplususer\\Documents\\SharpDevelop Projects\\o9\\o9\\bin\\Debug\\o9.exe\" \"-t=False\" \"-r=TestRun";
if (arguments.Contains("-t="))
{
string splitted = arguments.Split(new string[] { "\"-t=" }, StringSplitOptions.None)[1];
string value = splitted.Split(new string[] { "\"" }, StringSplitOptions.None)[0];
}
When you split on "-t", you get two fragments. The fragment after that delimiter also contains a "-r" option - which is the issue you are probably seeing.
If you know that the "-t" option is always followed by a "-r", you could simply split on new string[] { "\"-t=", "\"-r=" }. Then you get three fragments and you still want the second one (at index 1).

Splitting a line of text that has key value pairs where value can be empty

I need to split a line of text
The general syntax for a delivery instruction is |||name|value||name|value||…..|||
Each delivery instruction starts and ends with 3 pipe characters - |||
A delivery instruction is a set of name/value pairs separated by a single pipe eg name|value
Each name value pair is separated by 2 pipe characters ||
Names and Values may not contain the pipe character
The value of any pair may be a blank string.
I need a regex that will help me resolve the above problem.
My latest attempt with my limited Regex skills:
string SampleData = "|||env|af245g||mail_idx|39||gen_date|2016/01/03 11:40:06||docm_name|Client Statement (01.03.2015−31.03.2015)||docm_cat_name|Client Statement||docm_type_id|9100||docm_type_name|Client Statement||addr_type_id|1||addr_type_name|Postal address||addr_street_nr|||addr_street_name|Robinson Road||addr_po_box|||addr_po_box_type|||addr_postcode|903334||addr_city|Singapore||addr_state|||addr_country_id|29955||addr_country_name|Singapore||obj_nr|10000023||bp_custr_type|Customer||access_portal|Y||access_library|Y||avsr_team_id|13056||pri_avsr_id|||pri_avsr_name|||ctact_phone|||dlv_type_id|5001||dlv_type_name|Channel to standard mail||ao_id|14387||ao_name|Corp Limited||ao_title|||ao_mob_nr|||ao_email_addr||||??";
string[] Split = Regex.Matches(SampleData, "(\|\|\|(?:\w+\|\w*\|\|)*\|)").Cast<Match>().Select(m => m.Value).ToArray();
The expected output should be as follows(based on the sample data string provided):
env|af245g
mail_idx|39
gen_date|2016/01/03 11:40:06
docm_name|Client Statement (01.03.2015−31.03.2015)
docm_cat_name|Client Statement
docm_type_id|9100
docm_type_name|Client Statement
addr_type_id|1
addr_type_name|Postal address
addr_street_nr|
addr_street_name|Robinson Road
addr_po_box|
addr_po_box_type|
addr_postcode|903334
addr_city|Singapore
addr_state|
addr_country_id|29955
addr_country_name|Singapore
obj_nr|10000023
bp_custr_type|Customer
access_portal|Y
access_library|Y
avsr_team_id|13056
pri_avsr_id|
pri_avsr_name|
ctact_phone|
dlv_type_id|5001
dlv_type_name|Channel to standard mail
ao_id|14387
ao_name|Corp Limited
ao_title|
ao_mob_nr|
ao_email_addr|
You can also do it without using Regex. Its just simple splitting.
string nameValues = "|||zeeshan|1||ali|2||ahsan|3|||";
string sub = nameValues.Substring(3, nameValues.Length - 6);
Dictionary<string, string> dic = new Dictionary<string, string>();
string[] subsub = sub.Split(new string[] {"||"}, StringSplitOptions.None);
foreach (string item in subsub)
{
string[] nameVal = item.Split('|');
dic.Add(nameVal[0], nameVal[1]);
}
foreach (var item in dic)
{
// Retrieve key and value here i.e:
// item.Key
// item.Value
}
Hope this helps.
I think you're making this more difficult than it needs to be. This regex yields the desired result:
#"[^|]+\|([^|]*)"
Assuming you're dealing with a single, well-formed delivery instruction, there's no need to match the starting and ending triple-pipes. You don't need to worry about the double-pipe separators either, because the "name" part of the "name|value" pair is always present. Just look for the first thing that looks like a name with a pipe following it, and everything up to the next pipe character is the value.
(?<=\|\|\|).*?(?=\|\|\|)
You can use this to get all the key value pairs between |||.See demo.
https://regex101.com/r/fM9lY3/59
string strRegex = #"(?<=\|\|\|).*?(?=\|\|\|)";
Regex myRegex = new Regex(strRegex, RegexOptions.Multiline);
string strTargetString = #"|||env|af245g||mail_idx|39||gen_date|2016/01/03 11:40:06||docm_name|Client Statement (01.03.2015−31.03.2015)||docm_cat_name|Client Statement||docm_type_id|9100||docm_type_name|Client Statement||addr_type_id|1||addr_type_name|Postal address||addr_street_nr|||addr_street_name|Robinson Road||addr_po_box|||addr_po_box_type|||addr_postcode|903334||addr_city|Singapore||addr_state|||addr_country_id|29955||addr_country_name|Singapore||obj_nr|10000023||bp_custr_type|Customer||access_portal|Y||access_library|Y||avsr_team_id|13056||pri_avsr_id|||pri_avsr_name|||ctact_phone|||dlv_type_id|5001||dlv_type_name|Channel to standard mail||ao_id|14387||ao_name|Corp Limited||ao_title|||ao_mob_nr|||ao_email_addr||||??";
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
if (myMatch.Success)
{
// Add your code here
}
}
Here's a variation of #Syed Muhammad Zeeshan code that runs faster:
string nameValues = "|||zeeshan|1||ali|2||ahsan|3|||";
string[] nameArray = nameValues.Split(new char[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
Dictionary<string, string> dic = new Dictionary<string, string>();
int i = 0;
foreach (string item in nameArray)
{
if (i < nameArray.Length - 1)
dic.Add(nameArray[i], nameArray[i + 1]);
i = i + 2;
}
Interesting, I will like to try:
class Program
{
static void Main(string[] args)
{
string nameValueList = "|||zeeshan|1||ali|2||ahsan|3|||";
while (nameValueList != "|||")
{
nameValueList = nameValueList.TrimStart('|');
string nameValue = GetNameValue(ref nameValueList);
Console.WriteLine(nameValue);
}
Console.ReadLine();
}
private static string GetNameValue(ref string nameValues)
{
string retVal = string.Empty;
while(nameValues[0] != '|') // for name
{
retVal += nameValues[0];
nameValues = nameValues.Remove(0, 1);
}
retVal += nameValues[0];
nameValues = nameValues.Remove(0, 1);
while (nameValues[0] != '|') // for value
{
retVal += nameValues[0];
nameValues = nameValues.Remove(0, 1);
}
return retVal;
}
}
https://dotnetfiddle.net/WRbsRu

Parsing a string with, seemingly, no delimiter

I have the following string that I need to parse out so I can insert them into a DB. The delimiter is "`":
`020 Some Description `060 A Different Description `100 And Yet Another `
I split the string into an array using this
var responseArray = response.Split('`');
So then each item in the responseArrray[] looks like this: 020 Some Description
How would I get the two different parts out of that array? The 1st part will be either 3 or 4 characters long. 2nd part will be no more then 35 characters long.
Due to some ridiculous strangeness beyond my control there is random amounts of space between the 1st and 2nd part.
Or put the other two answers together, and get something that's more complete:
string[] response = input.Split(`);
foreach (String str in response) {
int splitIndex = str.IndexOf(' ');
string num = str.Substring(0, splitIndex);
string desc = str.Substring(splitIndex);
desc.Trim();
}
so, basically you use the first space as a delimiter to create 2 strings. Then you trim the second one, since trim only applies to leading and trailing spaces, not everything in between.
Edit: this a straight implementation of Brad M's comment.
You can try this solution:
var inputString = "`020 Some Description `060 A Different Description `100 And Yet Another `";
int firstWordLength = 3;
int secondWordMaxLength = 35;
var result =inputString.Split('`')
.SelectMany(x => new[]
{
new String(x.Take(firstWordLength).ToArray()).Trim(),
new String(x.Skip(firstWordLength).Take(secondWordMaxLength).ToArray()).Trim()
});
Here is the result in LINQPad:
Update: My first solution has some problems because the use of Trim after Take.Here is another approach with an extension method:
public static class Extensions
{
public static IEnumerable<string> GetWords(this string source,int firstWordLengt,int secondWordLenght)
{
List<string> words = new List<string>();
foreach (var word in source.Split(new[] {'`'}, StringSplitOptions.RemoveEmptyEntries))
{
var parts = word.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
words.Add(new string(parts[0].Take(firstWordLengt).ToArray()));
words.Add(new string(string.Join(" ",parts.Skip(1)).Take(secondWordLenght).ToArray()));
}
return words;
}
}
And here is the test result:
Try this
string response = "020 Some Description060 A Different Description 100 And Yet Another";
var responseArray = response.Split('`');
string[] splitArray = {};
string result = "";
foreach (string it in responseArray)
{
splitArray = it.Split(' ');
foreach (string ot in splitArray)
{
if (!string.IsNullOrWhiteSpace(ot))
result += "-" + ot.Trim();
}
}
splitArray = result.Substring(1).Split('-');
string[] entries = input.Split('`');
foreach (string s in entries)
GetStringParts(s);
IEnumerable<String> GetStringParts(String input)
{
foreach (string s in input.Split(' ')
yield return s.Trim();
}
Trim only removes leading/trailing whitespace per MSDN, so spaces in the description won't hurt you.
If the first part is an integer
And you need to account for some empty
For me the first pass was empty
public void parse()
{
string s = #"`020 Some Description `060 A Different Description `100 And Yet Another `";
Int32 first;
String second;
if (s.Contains('`'))
{
foreach (string firstSecond in s.Split('`'))
{
System.Diagnostics.Debug.WriteLine(firstSecond);
if (!string.IsNullOrEmpty(firstSecond))
{
firstSecond.TrimStart();
Int32 firstSpace = firstSecond.IndexOf(' ');
if (firstSpace > 0)
{
System.Diagnostics.Debug.WriteLine("'" + firstSecond.Substring(0, firstSpace) + "'");
if (Int32.TryParse(firstSecond.Substring(0, firstSpace), out first))
{
System.Diagnostics.Debug.WriteLine("'" + firstSecond.Substring(firstSpace-1) + "'");
second = firstSecond.Substring(firstSpace).Trim();
}
}
}
}
}
}
You can get the first part by finding the first space and make a substring. The second is also a Substring. Try something like this.
foreach(string st in response)
{
int index = response.IndexOf(' ');
string firstPart = response.Substring(0, index);
//string secondPart = response.Substring(response.Lenght-35);
//better use this
string secondPart = response.Substring(index);
secondPart.Trim();
}

How to split string that delimiters remain in the end of result?

I have several delimiters. For example {del1, del2, del3 }.
Suppose I have text : Text1 del1 text2 del2 text3 del3
I want to split string in such way:
Text1 del1
text2 del2
text3 del3
I need to get array of strings, when every element of array is texti deli.
How can I do this in C# ?
String.Split allows multiple split-delimeters. I don't know if that fits your question though.
Example :
String text = "Test;Test1:Test2#Test3";
var split = text.Split(';', ':', '#');
//split contains an array of "Test", "Test1", "Test2", "Test3"
Edit: you can use a regex to keep the delimeters.
String text = "Test;Test1:Test2#Test3";
var split = Regex.Split(text, #"(?<=[;:#])");
// contains "Test;", "Test1:", "Test2#","Test3"
This should do the trick:
const string input = "text1-text2;text3-text4-text5;text6--";
const string matcher= "(-|;)";
string[] substrings = Regex.Split(input, matcher);
StringBuilder builder = new StringBuilder();
foreach (string entry in substrings)
{
builder.Append(entry);
}
Console.Out.WriteLine(builder.ToString());
note that you will receive empty strings in your substring array for the matches for the two '-';s at the end, you can choose to ignore or do what you like with those values.
You could use a regex. For a string like this "text1;text2|text3^" you could use this:
(.*;|.*\||.*\^)
Just add more alternative pattens for each delimiter.
If you want to keep the delimiter when splitting the string you can use the following:
string[] delimiters = { "del1", "del2", "del3" };
string input = "text1del1text2del2text3del3";
string[] parts = input.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
for(int index = 0; index < parts.Length; index++)
{
string part = parts[index];
string temp = input.Substring(input.IndexOf(part) + part.Length);
foreach (string delimter in delimiters)
{
if ( temp.IndexOf(delimter) == 0)
{
parts[index] += delimter;
break;
}
}
}
parts will then be:
[0] "text1del1"
[1] "text2del2"
[2] "text3del3"
As #Matt Burland suggested, use Regex
List<string> values = new List<string>();
string s = "abc123;def456-hijk,";
Regex r = new Regex(#"(.*;|.*-|.*,)");
foreach(Match m in r.Matches(s))
values.Add(m.Value);

Categories

Resources