What can I do to achieve the output shown on the right side of the image? Note that later there will be a lot of data with this kind of inconsistent alignment. Is there a way to loop over all the text and adjust the alignment as shown on the right side of the image?
You could use Regex to replace each run of multiple whitespace characters (2 or more, including tabs) with a single space. For example,
var result = Regex.Replace(str,"[\t ]{2,}"," ");
The common forms of whitespace could be divided into
Space
Tab(\t)
NewLine(\n)
Return(\r)
In the above scenario, it looks like you need to replace runs of Space and Tab characters (2 or more) with a single whitespace character and leave the NewLine/Return characters alone. For that purpose, you could use Regex as shown in the code above.
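To apply this to many lines in a loop, one option is to normalize each line separately and then join the lines again. The following is only a minimal sketch: the class and method names are made up for illustration, and it uses the pattern "[\t ]+" instead of "[\t ]{2,}" on the assumption that a single tab should also become a space.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Collapses every run of spaces/tabs into one space and trims both ends.
    public static string NormalizeLine(string line)
    {
        return Regex.Replace(line, "[\t ]+", " ").Trim();
    }

    // Applies NormalizeLine to each line; the line structure is preserved.
    public static string NormalizeText(string text)
    {
        string[] lines = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        return string.Join(Environment.NewLine, lines.Select(l => NormalizeLine(l)));
    }
}
```

Because NewLine/Return characters are not part of the pattern, the text is never merged across lines.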
Short answer: use Trim() or TrimEnd(). Here is how I would do your task using my NuGet package; you can convert this to string.Split() if you don't want to use a NuGet package:
Nuget Package:
DataJuggler.Core.UltimateHelper (.Net Framework)
DataJuggler.UltimateHelper.Core (.Net Core)
.Net Framework is shown
// source input
string inputFileText = "blah blah blah blah blah " + Environment.NewLine + "blah blah blah blah blah";
// parse the lines
List<TextLine> lines = WordParser.GetTextLines(inputFileText);
// If the lines collection exists and has one or more items
if (ListHelper.HasOneOrMoreItems(lines))
{
// Iterate the collection of TextLine objects
foreach (TextLine line in lines)
{
// Get the words
List<Word> words = WordParser.GetWords(line.Text);
}
}
Then you can do what you want with each Line, and each Line contains a list of Words (string text).
Here is the code the Nuget package uses, if you would rather copy it:
public static List<TextLine> GetTextLines(string sourceText)
{
// initial value
List<TextLine> textLines = new List<TextLine>();
// typical delimiter characters
char[] delimiterChars = Environment.NewLine.ToCharArray();
// local
int counter = -1;
// verify the sourceText exists
if (!String.IsNullOrEmpty(sourceText))
{
// Get the list of strings
string[] linesOfText = sourceText.Split(delimiterChars);
// now iterate the strings
foreach (string lineOfText in linesOfText)
{
// local
string text = lineOfText;
// increment the counter
counter++;
// add every other row
if ((counter % 2) == 0)
{
// Create a new TextLine
TextLine textLine = new TextLine(text);
// now add this textLine to textLines collection
textLines.Add(textLine);
}
}
}
// return value
return textLines;
}
public static List<Word> GetWords(string sourceText, char[] delimeters = null, bool allowEmptyStrings = false)
{
// initial value
List<Word> words = new List<Word>();
// typical delimiter characters
char[] delimiterChars = { ' ','-','/', ',', '.', '\t' };
// if custom delimiters were supplied
if (NullHelper.Exists(delimeters))
{
// use these delimiters
delimiterChars = delimeters;
}
}
// verify the sourceText exists
if (!String.IsNullOrEmpty(sourceText))
{
// Get the list of strings
string[] strings = sourceText.Split(delimiterChars);
// now iterate the strings
foreach(string stringWord in strings)
{
// verify the word is not an empty string or a space
if ((allowEmptyStrings) || (TextHelper.Exists(stringWord)))
{
// Create a new Word
Word word = new Word(stringWord);
// now add this word to words collection
words.Add(word);
}
}
}
// return value
return words;
}
I'm new to C# and still learning the language. I'm trying to make an app that reads a text, and I need only specific lines from it. The text looks like:
[HEADING]
Some value
[HEADING]
Some other value
[HEADING]
Some other text
and continuation of this text in new line
[HEADING]
Last text
I'm trying to write a method that reads the text and splits it into a string[] like this:
string[0] = Some value
string[1] = Some other value
string[2] = Some other text and continuation of this text in new line
string[3] = Last text
So I want to read from each [HEADING] line up to the next empty line. I thought I should use ReadAllLines and check line by line, taking [HEADING] as the start position and an empty line as the end position. I tried this code:
string s = "mystring";
int start = s.IndexOf("[HEADING]");
int end = s.IndexOf("\n", start);
string result = s.Substring(start, end - start);
but it applies the substring to the whole text instead of looping between the first [HEADING] and the following empty line, then the second, and so on.
Maybe someone can help me with this?
You could try to split the string by "[HEADING]" to get the strings between these lines. Then you could join each string into a single line and trim the whitespace around the strings:
string content = @"[HEADING]
Some value
[HEADING]
Some other value
[HEADING]
Some other text
and continuation of this text in new line
[HEADING]
Last text";
var segments = content.Split(new[] { "[HEADING]"}, StringSplitOptions.RemoveEmptyEntries) // Split into multiple strings
.Select(p=>p.Replace("\r\n"," ").Replace("\r"," ").Replace("\n"," ").Trim()) // Join each single string into single line
.ToArray();
Result:
segments[0] = "Some value"
segments[1] = "Some other value"
segments[2] = "Some other text and continuation of this text in new line"
segments[3] = "Last text"
Here's a solution which avoids the substring/index checking, which could potentially be fraught with errors.
There are answers such as this one that use LINQ, but for a newcomer to the language, basic looping is an OK place to start. Also, this is not necessarily the best solution for efficiency or whatever.
This foreach loop will handle your case, and some of the "dirty" cases.
var segments = new List<string>();
bool headingChanged = false;
foreach (var line in File.ReadAllLines("somefilename.txt"))
{
// skip blank lines
if (string.IsNullOrWhiteSpace(line)) continue;
// detect a heading
if (line.Contains("[HEADING]"))
{
headingChanged = true;
continue;
}
if (headingChanged)
{
segments.Add(line);
// this keeps us working on the same segment if there
// are more lines to be added to the segment
headingChanged = false;
}
else
{
segments[segments.Count - 1] += " ";
segments[segments.Count - 1] += line;
// you could replace the above two lines with string interpolation...
// segments[segments.Count - 1] = $"{segments[segments.Count - 1]} {line}";
}
}
In the above loop, ReadAllLines obviates the need to check for \r and \n. Contains will detect [HEADING] no matter where it appears in the line.
You don't need substring, you can just compare the value s == "[HEADING]".
Here's an easy to understand example:
var lines = System.IO.File.ReadAllLines(myFilePath);
var resultLines = new List<String>();
var collectedText = new List<String>();
foreach (var line in lines)
{
if (line == "[HEADING]")
{
collectedText = new List<String>();
}
else if (line != "")
{
collectedText.Add(line);
}
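else // line == "": the current segment is complete
{
if (collectedText.Count > 0)
{
resultLines.Add(String.Join(" ", collectedText));
collectedText.Clear();
}
}
}
// flush the last segment if the file does not end with an empty line
if (collectedText.Count > 0)
{
resultLines.Add(String.Join(" ", collectedText));
}
return resultLines.ToArray();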
The loop does this:
go through the file line by line
"start collecting" (create a new list) when we encounter a "[HEADING]" line
"collect" (add to the list) every line that is not empty
"finish collecting" (join and add to the results list) when the line is empty
I'm searching for a solution to this case:
I have a method inside a DLL that receives a string containing some words that act as placeholders/parameters, each of which will be replaced by the result of another specific method (also inside the DLL).
To simplify: a query string is passed as an argument to a method inside a DLL, and every word in it that matches a specific pattern is replaced.
My method receives a string that could look like this:
(on .exe app)
string str = "INSERT INTO mydb.mytable (id_field, description, complex_number) VALUES ('#GEN_COMPLEX_ID#','A complex solution', '#GEN_COMPLEX_ID#');";
MyDLLClass.MyMethod(str);
So the problem is: if I replace #GEN_COMPLEX_ID# in this string and want a different value for each match, that does not happen, because the replacement evaluates the function once for the whole string (not step by step). So I need help implementing a step-by-step replace of any text: find a word, replace it, move to the next match, replace it, and so on.
Could you help me?
Thanks!
This works pretty well for me:
string yourOriginalString = "ab cd ab cd ab cd";
string pattern = "ab";
string yourNewDescription = "123";
int startingPositionOffset = 0;
int yourOriginalStringLength = yourOriginalString.Length;
MatchCollection match = Regex.Matches(yourOriginalString, pattern, RegexOptions.IgnoreCase | RegexOptions.Multiline);
foreach (Match m in match)
{
yourOriginalString = yourOriginalString.Substring(0, m.Index+startingPositionOffset) + yourNewDescription + yourOriginalString.Substring(m.Index + startingPositionOffset+ m.Length);
startingPositionOffset = yourOriginalString.Length - yourOriginalStringLength;
}
If what you're asking is how to replace each placeholder with a different value, you can do it using the Regex.Replace overload which accepts a MatchEvaluator delegate, and executes it for each match:
// conceptually, something like this (note that it's not checking if there are
// enough values in the replacementValues array)
static string ReplaceMultiple(
string input, string placeholder, IEnumerable<string> replacementValues)
{
var enumerator = replacementValues.GetEnumerator();
return Regex.Replace(input, placeholder,
m => { enumerator.MoveNext(); return enumerator.Current; });
}
This is, of course, presuming that all placeholders look the same.
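A variation on the same idea, as a rough sketch: instead of a fixed list of values, pass a function that is called once per match, so each placeholder gets a freshly generated value. The class name, method name and the nextValue parameter are hypothetical; Regex.Escape is added on the assumption that the placeholder should be matched literally rather than as a pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // Replaces each occurrence of placeholder with a new value from nextValue.
    // nextValue is invoked once per match, in left-to-right order.
    public static string ReplaceEach(string input, string placeholder, Func<string> nextValue)
    {
        // Regex.Escape makes the placeholder match as literal text.
        return Regex.Replace(input, Regex.Escape(placeholder), m => nextValue());
    }
}
```

For example, with a counter declared as int id = 0, calling PlaceholderDemo.ReplaceEach(str, "#GEN_COMPLEX_ID#", () => (++id).ToString()) would put 1 into the first placeholder and 2 into the second.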
Pseudo-code
var split = source.Split(placeholder); // create array of items without placeholders
var result = split[0]; // copy first item
for(int i = 1; i < split.Length; i++)
{
bool replace = ... // ask user
result += replace ? replacement : placeholder; // to put replacement or not to put
result += split[i]; // copy next item
}
You should use the Split method like this:
string [] placeholder = {"#Placeholder#"} ;
string[] request = cd.Split(placeholder, StringSplitOptions.RemoveEmptyEntries);
StringBuilder requetBuilding = new StringBuilder();
requetBuilding.Append(request[0]);
int index = 1;
requetBuilding.Append("Your place holder replacement");
requetBuilding.Append(request[index]);
index++; //next replacement
// requetBuilding.Append("Your next place holder replacement");
// requetBuilding.Append(request[index]);
I want to separate multiple values from a GridView control and show them in four textboxes. Is that possible?
Right now I get this value:
With this code:
var lblRef = new Label
{
Text = ((Label) row.FindControl("LabelAssignmentReference")).Text
};
string valueTextBox = lblRef.Text;
int indexOfRefSwe = valueTextBox.IndexOf(",", StringComparison.Ordinal);
string valueRef = valueTextBox.Substring(0, indexOfRefSwe);
TextBoxReference.Text = valueRef;
But how do I get the other values?
TextBoxReference.Text = valueRef;
TextBoxRefPhone.Text = "??";
TextBoxRefEmail.Text = "??";
TextBoxRefDesc.Text = "??";
This should get you started.
string[] splits = lblRef.Text.Split(',');
Console.WriteLine(splits[0]); // refname
Console.WriteLine(splits[1]); // 08712332
Console.WriteLine(splits[2]); // ref@gmail.com
Console.WriteLine(splits[3]); // refdescription
I suggest also adding validation checks to make sure you don't get any errors, such as checking that splits.Length == 4 as expected.
Note that the spaces will be included in the beginning of the last three elements of splits. You can eliminate those using the Trim method, or by providing an array of delimiters new[] {',', ' '} to the split function and ignore empty elements (there's an overload for that).
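The validation and trimming suggested above could be combined along these lines. This is only a sketch: the class and method names are invented for illustration, and it assumes the label always holds exactly four comma-separated fields with no commas inside a field.

```csharp
using System;
using System.Linq;

public static class ReferenceDemo
{
    // Splits "name, phone, email, description" into four trimmed parts.
    // Returns null when the text does not contain exactly four fields.
    public static string[] SplitReference(string text)
    {
        string[] parts = text.Split(',').Select(p => p.Trim()).ToArray();
        return parts.Length == 4 ? parts : null;
    }
}
```

Trimming each part after splitting on ',' alone keeps spaces inside a field (for example in the description) intact, which splitting on both ',' and ' ' would not.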
There is System.String.Split()-method:
string[] parts = str.Split(new char[] {','});
Afterwards, work on the parts.
Example from MSDN
using System;
public class SplitTest {
public static void Main() {
string words = "This is a list of words, with: a bit of punctuation" +
"\tand a tab character.";
string [] split = words.Split(new Char [] {' ', ',', '.', ':', '\t' });
foreach (string s in split) {
if (s.Trim() != "")
Console.WriteLine(s);
}
}
}
You can do it as below:
var values = lblRef.Text.Split(',');
TextBoxReference.Text = values[0].Trim();
if (values.Length > 1)
TextBoxRefPhone.Text = values[1].Trim();
if (values.Length > 2)
TextBoxRefEmail.Text = values[2].Trim();
if (values.Length > 3)
TextBoxRefDesc.Text = values[3].Trim();
Edit
There is a Split overload that accepts params, so we can pass a single character:
public string[] Split(params char[] separator);
The params keyword lets you specify a method parameter that takes an
argument where the number of arguments is variable.
This is a program that reads in a CSV file, adds the values to a dictionary and then analyses a string in a textbox to see if any of the words match a dictionary entry. It will replace abbreviations (LOL, ROFL etc.) with their real words. It matches strings by splitting the inputted text into individual words.
public void btnanalyze_Click(object sender, EventArgs e)
{
var abbrev = new Dictionary<string, string>();
using (StreamReader reader = new StreamReader("C:/Users/Jordan Moffat/Desktop/coursework/textwords0.csv"))
{
string line;
string[] row;
while ((line = reader.ReadLine()) != null)
{
row = line.Split(',');
abbrev.Add(row[0], row[1]);
Console.WriteLine(abbrev);
}
}
string twitterinput;
twitterinput = "";
// string output;
twitterinput = txtInput.Text;
{
char[] delimiterChars = { ' ', ',', '.', ':', '\t' };
string text = twitterinput;
string[] words = twitterinput.Split(delimiterChars);
string merge;
foreach (string s in words)
{
if (abbrev.ContainsKey(s))
{
string value = abbrev[s];
merge = string.Join(" ", value);
}
if (!abbrev.ContainsKey(s))
{
string not = s;
merge = string.Join(" ", not);
}
;
MessageBox.Show(merge);
}
The problem so far is that the final string is outputted into a text box, but only prints the last word as it overwrites. This is a University assignment, so I'm looking for a push in the correct direction as opposed to an actual answer. Many thanks!
string.Join() takes a collection of strings, concatenates them together and returns the result. But in your case, the collection contains only one item: value, or not.
To make your code work, you could use something like:
merge = string.Join(" ", merge, value);
But because of the way strings work, this will be quite slow, so you should use StringBuilder instead.
This is the problem:
string not = s;
merge = string.Join(" ", not);
You are just joining a single element (the latest) with a space delimiter, thus overwriting what you previously put into merge.
If you want to stick with string you need to use Concat to append the new word onto the output, though this will be slow as you are recreating the string each time. It will be more efficient to use StringBuilder to create the output.
If your assignment requires that you use Join to build up the output, then you'll need to replace the target words in the words array as you loop over them. However, for that you'll need to use some other looping mechanism than foreach as that doesn't let you modify the array you're looping over.
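As a rough sketch of that last approach (the class and method names are hypothetical, and for brevity it assumes the words are separated by single spaces only), an index-based for loop can overwrite entries of the words array before a single Join at the end:

```csharp
using System;
using System.Collections.Generic;

public static class AbbreviationDemo
{
    // Replaces each word that is a key in abbrev with its expansion,
    // then joins all words back into one string.
    public static string Expand(string input, IDictionary<string, string> abbrev)
    {
        string[] words = input.Split(' ');
        for (int i = 0; i < words.Length; i++)
        {
            string expansion;
            if (abbrev.TryGetValue(words[i], out expansion))
            {
                words[i] = expansion; // allowed in a for loop, unlike foreach
            }
        }
        return string.Join(" ", words);
    }
}
```

Here Join is called once, on the whole array, so nothing is overwritten between iterations.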
It is better to use the StringBuilder class for this purpose:
http://msdn.microsoft.com/en-us/library/system.text.stringbuilder.aspx
Need some ideas how to solve this problem.
I have a template file that describes a line in the text file. For example:
Template
[%f1%]|[%f2%]|[%f3%]"[%f4%]"[%f5%]"[%f6%]
Text file
1234|1234567|123"12345"12"123456
Now I need to read the fields from the text file. In the template file, fields are described with [%some name%]. The template file also defines the field separators; in this example they are | and ". The length of the fields can change between files, but the separators stay the same. What would be the best way to read the template and, using it, read the text file?
EDIT: Text file has multiple rows, like this:
1234|1234567|123"12345"12"123456"\r\n
1234|field|123"12345"12"asdasd"\r\n
123sd|1234567|123"asdsadf"12"123456"\r\n
45gg|somedata|123"12345"12"somefield"\r\n
EDIT2: OK, let's make it even harder. Some fields can contain binary data, and I know the start and end positions of the binary data field. I should be able to mark those fields in the template so that the parser knows the field is binary. How can I solve this problem?
I would create a regex based on the template and then parse the text file using that:
class Parser
{
private static readonly Regex TemplateRegex =
new Regex(@"\[%(?<field>[^]]+)%\](?<delim>[^[]+)?");
readonly List<string> m_fields = new List<string>();
private readonly Regex m_textRegex;
public Parser(string template)
{
var textRegexString = '^' + TemplateRegex.Replace(template, Evaluator) + '$';
m_textRegex = new Regex(textRegexString);
}
string Evaluator(Match match)
{
// add field name to collection and create regex for the field
var fieldName = match.Groups["field"].Value;
m_fields.Add(fieldName);
string result = "(.*?)";
// add delimiter to the regex, if it exists
// TODO: check, that only last field doesn't have delimiter
var delimGroup = match.Groups["delim"];
if (delimGroup.Success)
{
string delim = delimGroup.Value;
result += Regex.Escape(delim);
}
return result;
}
public IDictionary<string, string> Parse(string text)
{
var match = m_textRegex.Match(text);
var groups = match.Groups;
var result = new Dictionary<string, string>(m_fields.Count);
for (int i = 0; i < m_fields.Count; i++)
result.Add(m_fields[i], groups[i + 1].Value);
return result;
}
}
You can parse the template using regular expressions. An expression like this will match each field definition and separator:
Match m = Regex.Match(template, @"^(\[%(?<name>.+?)%\](?<separator>.)?)+$");
The match will contain two named groups (name and separator), each of which will contain one capture for each time it matched in the input string. In your example, the separator group would have one fewer capture than the name group.
You can then iterate over the captures, and use the results to extract the fields from the input string and store the values, like this:
if( m.Success )
{
Group name = m.Groups["name"];
Group separator = m.Groups["separator"];
int index = 0;
Dictionary<string, string> fields = new Dictionary<string, string>();
for( int x = 0; x < name.Captures.Count; ++x )
{
int separatorIndex = input.Length;
if( x < separator.Captures.Count )
separatorIndex = input.IndexOf(separator.Captures[x].Value, index);
fields.Add(name.Captures[x].Value, input.Substring(index, separatorIndex - index));
index = separatorIndex + 1;
}
// Do something with results.
}
Obviously in a real program you'd have to account for invalid input and such, which I didn't do here.
I would do this with a few lines of code. Loop through your template row, grabbing all text between "[" and "]" as the variable name and everything else as a terminator. Read all the text up to the terminator, assign it to the variable name, repeat.
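The loop described above might look roughly like the following. It is a sketch under several assumptions that are not confirmed by the question: the template starts with a field, every field is written as [%name%], and a terminator never occurs inside a field value (so binary fields are not handled). The class and method names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateDemo
{
    // Reads one line of text according to a template such as [%f1%]|[%f2%].
    public static Dictionary<string, string> Parse(string template, string line)
    {
        var fields = new Dictionary<string, string>();
        int t = 0; // current position in the template
        int p = 0; // current position in the line
        while (t < template.Length)
        {
            // the variable name sits between "[%" and "%]"
            int nameStart = template.IndexOf("[%", t, StringComparison.Ordinal) + 2;
            int nameEnd = template.IndexOf("%]", nameStart, StringComparison.Ordinal);
            string name = template.Substring(nameStart, nameEnd - nameStart);

            // everything up to the next "[%" (or the end) is the terminator
            int next = template.IndexOf("[%", nameEnd, StringComparison.Ordinal);
            if (next < 0) next = template.Length;
            string terminator = template.Substring(nameEnd + 2, next - (nameEnd + 2));

            // read up to the terminator; a field without one runs to the end
            int end = terminator.Length == 0
                ? line.Length
                : line.IndexOf(terminator, p, StringComparison.Ordinal);
            if (end < 0) throw new FormatException("Terminator not found: " + terminator);

            fields[name] = line.Substring(p, end - p);
            p = end + terminator.Length;
            t = next;
        }
        return fields;
    }
}
```

For a file with multiple rows, Parse would be called once per row with the same template.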
1- Use an API for that, such as sscanf(line, format, __arglist).
2- Use string.Split, like this:
public IEnumerable<int> GetDataFromLines(string[] lines)
{
//handle the output data
List<int> data = new List<int>();
foreach (string line in lines)
{
string[] seperators = new string[] { "|", "\"" };
string[] results = line.Split(seperators, StringSplitOptions.RemoveEmptyEntries);
foreach (string result in results)
{
data.Add(int.Parse(result));
}
}
return data;
}
Test it with line:
string line = "1234|1234567|123\"12345\"12\"123456";
string[] lines = new string[] { line };
GetDataFromLines(lines);
//output list items are:
1234
1234567
123
12345
12
123456