Split string into several strings at specific points - C#

I have a text file with lines of text laid out like so
12345MLOL68
12345MLOL68
12345MLOL68
I want to read the file, insert commas after the 5th, 6th and 9th characters of each line, and write the result to a different text file, so the output would be:
12345,M,LOL,68
12345,M,LOL,68
12345,M,LOL,68
This is what I have so far
public static void ToCSV(string fileWRITE, string fileREAD)
{
int count = 0;
string x = "";
StreamWriter commas = new StreamWriter(fileWRITE);
string FileText = new System.IO.StreamReader(fileREAD).ReadToEnd();
var dataList = new List<string>();
IEnumerable<string> splitString = Regex.Split(FileText, "(.{1}.{5})").Where(s => s != String.Empty);
foreach (string y in splitString)
{
dataList.Add(y);
}
foreach (string y in dataList)
{
x = (x + y + ",");
count++;
if (count == 3)
{
x = (x + "NULL,NULL,NULL,NULL");
commas.WriteLine(x);
x = "";
count = 0;
}
}
commas.Close();
}
The problem I'm having is trying to figure out how to split the original string lines I read in at several points. The line
IEnumerable<string> splitString = Regex.Split(FileText, "(.{1}.{5})").Where(s => s != String.Empty);
is not working the way I want it to. It just adds up the 1 and the 5 and splits every string at the 6th character.
Can anyone help me split each string at specific points?

Simpler code:
public static void ToCSV(string fileWRITE, string fileREAD)
{
string[] lines = File.ReadAllLines(fileREAD);
string[] splitLines = lines.Select(s => Regex.Replace(s, "(.{5})(.)(.{3})(.*)", "$1,$2,$3,$4")).ToArray();
File.WriteAllLines(fileWRITE, splitLines);
}

Just insert at the right place in descending order like this.
string str = "12345MLOL68";
int[] indices = {5, 6, 9};
indices = indices.OrderByDescending(x => x).ToArray();
foreach (var index in indices)
{
str = str.Insert(index, ",");
}
We insert in descending order because inserting at a lower index first would shift every later index, which makes the positions hard to track.

Why don't you use Substring? For example:
string editedString = input.Substring(0, 5) + "," + input.Substring(5, 1) + "," + input.Substring(6, 3) + "," + input.Substring(9);
This should suit your needs.
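The fixed-position answers above can be generalized so the cut positions are data rather than hard-coded. The following is only a minimal sketch: the class name FixedWidthSketch, the method InsertSeparators and the parameter cutPoints are hypothetical names, and it assumes positions beyond the end of a short line should simply be ignored.

```csharp
using System;
using System.Linq;
using System.Text;

public static class FixedWidthSketch
{
    // Inserts a separator before each given 0-based character position.
    // Positions outside the line are ignored instead of throwing.
    public static string InsertSeparators(string line, int[] cutPoints, string separator = ",")
    {
        var sb = new StringBuilder(line);
        // Work from the highest position down so that earlier inserts
        // do not shift the positions still to be processed.
        foreach (int pos in cutPoints.OrderByDescending(p => p))
        {
            if (pos > 0 && pos < line.Length)
            {
                sb.Insert(pos, separator);
            }
        }
        return sb.ToString();
    }
}
```

To convert a whole file, something like File.WriteAllLines(fileWRITE, File.ReadAllLines(fileREAD).Select(l => FixedWidthSketch.InsertSeparators(l, new[] { 5, 6, 9 }))) should work, with fileWRITE and fileREAD as in the question.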

Related

Remove Identical Words from a string array

The goal is to remove a common prefix word from each string in a string array, for example ["Market1", "Market2", "Market3"]. The prefix Market is shared by every string in the array, so removing it should give ["1", "2", "3"]. Please note that the prefix could be anything, not just Market.

Look for the first character that is not identical among all strings and select a substring starting at that position to remove the prefix.
string[] words = new string[] { "Market1", "Market2", "Market3" };
int i = 0;
while (words.All(word => word.Length > i && word[i] == words[0][i])) ++i;
var wordsWithoutPrefixes = words.Select(word => word.Substring(i)).ToArray();

Make a delimited string and then replace all the Market with an empty string and then split the string to an array.
string[] arr = new string[] { "Market1", "Market2", "Market3" };
string[] result = string.Join(".", arr).Replace("Market", "").Split('.');

Loop through each item in the array and, for each item, chop off the beginning if it matches the prefix.
var commonPrefix = "Market";
for (int i = 0; i < arr.Length; i++) {
if(arr[i].IndexOf(commonPrefix) == 0) {
arr[i] = arr[i].Substring(commonPrefix.Length);
}
}

You can use LINQ:
string[] myArray = { "Market1", "Market2", "Market3" };
string prefix = myArray[0];
foreach (var s in myArray)
{
while (!s.StartsWith(prefix))
prefix = prefix.Substring(0, prefix.Length - 1);
}
string[] result = myArray
.Select(s => s.Substring(prefix.Length))
.ToArray();

Loop through the array of strings and replace the prefix substring with an empty string.
string[] s = new string[] { "Market1", "Market2", "Market3" };
string prefix = "Market";
// A foreach iteration variable is read-only, so index into the array instead.
for (int i = 0; i < s.Length; i++)
{
if (s[i].Contains(prefix))
{
s[i] = s[i].Replace(prefix, "");
}
}
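Since the question says the prefix could be anything, the answers that compute it can be wrapped into a reusable helper. This is a hedged sketch only: PrefixSketch, CommonPrefix and StripCommonPrefix are hypothetical names, and the comparison is assumed to be ordinal and case-sensitive.

```csharp
using System;
using System.Linq;

public static class PrefixSketch
{
    // Returns the longest prefix shared by every string in words.
    public static string CommonPrefix(string[] words)
    {
        if (words.Length == 0) return "";
        string prefix = words[0];
        foreach (string word in words)
        {
            // Shorten the candidate until the current word starts with it.
            // The empty string is a prefix of everything, so this terminates.
            while (!word.StartsWith(prefix, StringComparison.Ordinal))
            {
                prefix = prefix.Substring(0, prefix.Length - 1);
            }
        }
        return prefix;
    }

    // Removes the shared prefix from every string.
    public static string[] StripCommonPrefix(string[] words)
    {
        int length = CommonPrefix(words).Length;
        return words.Select(w => w.Substring(length)).ToArray();
    }
}
```

One caveat of computing the prefix this way: if the variable parts happen to share leading characters (for example "Shop10" and "Shop11"), those characters are treated as part of the prefix too.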

Replace string with multiple different options

Hi there, wonderful people of Stack Overflow.
I am currently totally stuck. What I want to do is take a word out of a text and replace it with a synonym. I thought about it for a while and figured out how to do it if I have ONLY one possible synonym, using this code:
string pathToDesk = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
string text = System.IO.File.ReadAllText(pathToDesk + "/Text.txt");
string replacementsText = System.IO.File.ReadAllText(pathToDesk + "/Replacements.txt");
string wordsToReplace = System.IO.File.ReadAllText(pathToDesk + "/WordsToReplace.txt");
string[] words = text.Split(' ');
string[] reWords = wordsToReplace.Split(' ');
string[] replacements = replacementsText.Split(' ');
for(int i = 0; i < words.Length; i++) {//for each word
for(int j = 0; j < replacements.Length; j++) {//compare with the possible synonyms
if (words[i].Equals(reWords[j], StringComparison.InvariantCultureIgnoreCase)) {
words[i] = replacements[j];
}
}
}
string newText = "";
for(int i = 0; i < words.Length; i++) {
newText += words[i] + " ";
}
txfInput.Text = newText;
But let's say we get the word hi. Then I want to be able to replace it with any of {"Hello","Yo","Hola"} (for example).
Then my code is no good, since the words will no longer be at matching positions in the arrays.
Is there a smart solution to this? I would really like to know.

You need to store your synonyms differently.
In your file you need something like:
hello yo hola hi
awesome fantastic great
Then, for each line, split the words and put them in an array, which gives you an array of arrays.
Now use that to find replacement words.
This won't be super optimized, but you can easily index each word to a group of synonyms as well.
Something like:
public class SynonymReplacer
{
private Dictionary<string, List<string>> _synonyms;
public void Load(string s)
{
_synonyms = new Dictionary<string, List<string>>();
var lines = s.Split(new[] {'\r', '\n'}, StringSplitOptions.RemoveEmptyEntries);
foreach (var line in lines)
{
var words = line.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries).ToList();
words.ForEach(word => _synonyms.Add(word, words));
}
}
public string Replace(string word)
{
if (_synonyms.ContainsKey(word))
{
return _synonyms[word].OrderBy(a => Guid.NewGuid())
.FirstOrDefault(w => w != word) ?? word;
}
return word;
}
}
The OrderBy gets you a random synonym...
then
var s = new SynonymReplacer();
s.Load("hi hello yo hola\r\nawesome fantastic great\r\n");
Console.WriteLine(s.Replace("hi"));
Console.WriteLine(s.Replace("ok"));
Console.WriteLine(s.Replace("awesome"));
var words = new string[] {"hi", "you", "look", "awesome"};
Console.WriteLine(string.Join(" ", words.Select(s.Replace)));
and you get something like this (the synonym is picked at random, so the output varies):
hello
ok
fantastic
hello you look fantastic

Your first task will be to build a list of words and synonyms. A Dictionary will be perfect for this. The text file containing this list might look like this:
word1|synonym11,synonym12,synonym13
word2|synonym21,synonym22,synonym23
word3|synonym31,synonym32,synonym33
Then you can construct the dictionary like this:
public Dictionary<string, string[]> GetSynonymSet(string synonymSetTextFileFullPath)
{
var dict = new Dictionary<string, string[]>();
string line;
// Read the file line by line and add each entry to the dictionary.
using (var file = new StreamReader(synonymSetTextFileFullPath))
{
while((line = file.ReadLine()) != null)
{
var split = line.Split('|');
if (!dict.ContainsKey(split[0]))
{
dict.Add(split[0], split[1].Split(','));
}
}
}
return dict;
}
The eventual code will look like this
public string ReplaceWordsInText(Dictionary<string, string[]> synonymSet, string text)
{
var newText = new StringBuilder();
string[] words = text.Split(' ');
for (int i = 0; i < words.Length; i++) //for each word
{
string[] synonyms;
if (synonymSet.TryGetValue(words[i], out synonyms))
{
// The exact synonym you wish to use is up to you.
// I will just use the first one
words[i] = synonyms[0];
}
newText.AppendFormat("{0} ", words[i]);
}
return newText.ToString();
}
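The answer above leaves the choice of synonym open and just takes the first one. As one possible variation, here is a sketch that picks a synonym at random. SynonymSketch and ReplaceWords are hypothetical names, and the dictionary is assumed to have the same shape as the one returned by GetSynonymSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SynonymSketch
{
    private static readonly Random Rng = new Random();

    public static string ReplaceWords(Dictionary<string, string[]> synonymSet, string text)
    {
        var words = text.Split(' ').Select(word =>
        {
            string[] synonyms;
            if (synonymSet.TryGetValue(word, out synonyms) && synonyms.Length > 0)
            {
                // Pick one of the available synonyms at random.
                return synonyms[Rng.Next(synonyms.Length)];
            }
            return word; // no synonym known, keep the original word
        });
        return string.Join(" ", words);
    }
}
```

If lookups should ignore case, as in the question's original comparison, the dictionary could be created with new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase).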

Parsing a string with, seemingly, no delimiter

I have the following string that I need to parse out so I can insert them into a DB. The delimiter is "`":
`020 Some Description `060 A Different Description `100 And Yet Another `
I split the string into an array using this
var responseArray = response.Split('`');
So then each item in responseArray[] looks like this: 020 Some Description
How would I get the two different parts out of each item? The first part will be either 3 or 4 characters long. The second part will be no more than 35 characters long.
Due to some ridiculous strangeness beyond my control, there is a random amount of space between the first and second parts.

Or put the other two answers together, and get something that's more complete:
string[] response = input.Split('`');
foreach (String str in response) {
int splitIndex = str.IndexOf(' ');
if (splitIndex < 0) continue; // skip the empty entries the split produces
string num = str.Substring(0, splitIndex);
// Trim returns a new string, so keep the result.
string desc = str.Substring(splitIndex).Trim();
}
So, basically, you use the first space as a delimiter to create two strings. Then you trim the second one; Trim only removes leading and trailing spaces, not the ones in between.
Edit: this is a straight implementation of Brad M's comment.

You can try this solution:
var inputString = "`020 Some Description `060 A Different Description `100 And Yet Another `";
int firstWordLength = 3;
int secondWordMaxLength = 35;
var result =inputString.Split('`')
.SelectMany(x => new[]
{
new String(x.Take(firstWordLength).ToArray()).Trim(),
new String(x.Skip(firstWordLength).Take(secondWordMaxLength).ToArray()).Trim()
});
Update: My first solution has some problems because of the use of Trim after Take. Here is another approach with an extension method:
public static class Extensions
{
public static IEnumerable<string> GetWords(this string source,int firstWordLengt,int secondWordLenght)
{
List<string> words = new List<string>();
foreach (var word in source.Split(new[] {'`'}, StringSplitOptions.RemoveEmptyEntries))
{
var parts = word.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
words.Add(new string(parts[0].Take(firstWordLengt).ToArray()));
words.Add(new string(string.Join(" ",parts.Skip(1)).Take(secondWordLenght).ToArray()));
}
return words;
}
}

Try this
string response = "`020 Some Description `060 A Different Description `100 And Yet Another `";
var responseArray = response.Split('`');
string[] splitArray = {};
string result = "";
foreach (string it in responseArray)
{
splitArray = it.Split(' ');
foreach (string ot in splitArray)
{
if (!string.IsNullOrWhiteSpace(ot))
result += "-" + ot.Trim();
}
}
splitArray = result.Substring(1).Split('-');

string[] entries = input.Split('`');
foreach (string s in entries)
GetStringParts(s);
IEnumerable<String> GetStringParts(String input)
{
foreach (string s in input.Split(' '))
yield return s.Trim();
}
Trim only removes leading/trailing whitespace per MSDN, so spaces in the description won't hurt you.

This assumes the first part is an integer.
You also need to account for empty entries.
For me, the first item from the split was empty.
public void parse()
{
string s = @"`020 Some Description `060 A Different Description `100 And Yet Another `";
Int32 first;
String second;
if (s.Contains('`'))
{
foreach (string rawPart in s.Split('`'))
{
System.Diagnostics.Debug.WriteLine(rawPart);
if (!string.IsNullOrEmpty(rawPart))
{
// TrimStart returns a new string, so keep the result.
string firstSecond = rawPart.TrimStart();
Int32 firstSpace = firstSecond.IndexOf(' ');
if (firstSpace > 0)
{
System.Diagnostics.Debug.WriteLine("'" + firstSecond.Substring(0, firstSpace) + "'");
if (Int32.TryParse(firstSecond.Substring(0, firstSpace), out first))
{
System.Diagnostics.Debug.WriteLine("'" + firstSecond.Substring(firstSpace-1) + "'");
second = firstSecond.Substring(firstSpace).Trim();
}
}
}
}
}
}

You can get the first part by finding the first space and make a substring. The second is also a Substring. Try something like this.
foreach (string st in responseArray)
{
int index = st.IndexOf(' ');
if (index < 0) continue; // skip the empty entries the split produces
string firstPart = st.Substring(0, index);
//string secondPart = st.Substring(st.Length - 35);
//better use this
string secondPart = st.Substring(index).Trim();
}
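As an alternative to searching for the first space, a regular expression can capture both parts in one step while absorbing the variable amount of whitespace. This is only a sketch under the constraints stated in the question (a first part of 3 or 4 characters, then whitespace, then the description); BacktickParserSketch, Parse and EntryPattern are hypothetical names, and entries that do not fit the pattern are silently skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BacktickParserSketch
{
    // Group 1: 3 or 4 non-space characters. Group 2: the description,
    // without the surrounding whitespace.
    private static readonly Regex EntryPattern = new Regex(@"^\s*(\S{3,4})\s+(.+?)\s*$");

    public static List<KeyValuePair<string, string>> Parse(string response)
    {
        var result = new List<KeyValuePair<string, string>>();
        // RemoveEmptyEntries drops the empty strings before the first
        // and after the last backtick.
        foreach (string entry in response.Split(new[] { '`' }, StringSplitOptions.RemoveEmptyEntries))
        {
            Match m = EntryPattern.Match(entry);
            if (m.Success)
            {
                result.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
            }
        }
        return result;
    }
}
```

Each returned pair holds the code as Key and the trimmed description as Value, ready to be passed to whatever database insert is in use.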

SubString editing

I've tried a few different methods and none of them work correctly, so I'm just looking for someone to show me outright how to do it. I want my application to read in a file chosen through an OpenFileDialog.
When the file is read in, I want to go through it and run this function, which uses LINQ to insert the data into my DB:
objSqlCommands.sqlCommandInsertorUpdate
However, I want to go through the string counting the number of ","s found. When the count reaches four, I want to take only the characters up to the next ",", and repeat this until the end of the file. Can someone show me how to do this?
Based on the answers given here my code now looks like this
string fileText = File.ReadAllText(ofd.FileName).Replace(Environment.NewLine, ",");
int counter = 0;
int idx = 0;
List<string> foo = new List<string>();
foreach (char c in fileText.ToArray())
{
idx++;
if (c == ',')
{
counter++;
}
if (counter == 4)
{
string x = fileText.Substring(idx);
foo.Add(fileText.Substring(idx, x.IndexOf(',')));
counter = 0;
}
}
foreach (string s in foo)
{
objSqlCommands.sqlCommandInsertorUpdate("INSERT", s);//laClient[0]);
}
However, I am getting a "length cannot be less than 0" error on the foo.Add call. Any ideas?

A somewhat hacky example. You would pass this the entire text from your file as a single string.
string str = "1,2,3,4,i am some text,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20";
int counter = 0;
int idx = 0;
List<string> foo = new List<string>();
foreach (char c in str.ToArray())
{
idx++;
if (c == ',')
{
counter++;
}
if (counter == 4)
{
string x = str.Substring(idx);
foo.Add(str.Substring(idx, x.IndexOf(',')));
counter = 0;
}
}
foreach(string s in foo)
{
Console.WriteLine(s);
}
Console.Read();
Prints:
i am some text
9
13
17
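The "length cannot be less than 0" error from the question is consistent with IndexOf returning -1, which happens when the field being captured is the last one in the text and no further comma follows it. Here is a sketch of the same scan with a guard for that case; FieldScanSketch and FieldsAfterEveryFourthComma are hypothetical names, and treating the end of the text as the end of the field is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class FieldScanSketch
{
    // Same counting scheme as the loop above, but safe when the
    // captured field is the last one in the text.
    public static List<string> FieldsAfterEveryFourthComma(string text)
    {
        var fields = new List<string>();
        int counter = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] != ',') continue;
            counter++;
            if (counter == 4)
            {
                int start = i + 1;
                int end = text.IndexOf(',', start);
                // IndexOf returns -1 when no further comma exists;
                // in that case take everything up to the end of the text.
                if (end < 0) end = text.Length;
                fields.Add(text.Substring(start, end - start));
                counter = 0;
            }
        }
        return fields;
    }
}
```

Note that, like the loop above, this restarts the count at the comma following each captured field, which is why the sample data yields fields 5, 9, 13 and 17 rather than every fifth field.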

As Raidri indicates in his answer, String.Split is definitely your friend. To catch every fifth word, you could try something like this (not tested):
string fileText = File.ReadAllText(OpenDialog.FileName).Replace(Environment.NewLine, ",");
string[] words = fileText.Split(',');
List<string> everyFifthWord = new List<string>();
for (int i = 4; i <= words.Length - 1; i += 5)
{
everyFifthWord.Add(words[i]);
}
The above code reads the selected file from the OpenFileDialog, then replaces every newline with a ",". Then it splits the string on ",", and starting with the fifth word takes every fifth word in the string and adds it to the list.

File.ReadAllText reads a text file into a string, and Split turns that string into an array separated at the commas:
File.ReadAllText(OpenDialog.FileName).Split(',')[4]
If you have more than one line use:
File.ReadAllLines(OpenDialog.FileName).Select(l => l.Split(',')[4])
This gives an IEnumerable<string> where each string contains the wanted part from one line of the file

It's not clear to me if you're after every fifth piece of text between the commas or if there are multiple lines and you want only the fifth piece of text on each line. So I've done both.
Every fifth piece of text:
var text = "1,2,3,4,i am some text,6,7,8,9"
+ ",10,11,12,13,14,15,16,17,18,19,20";
var everyFifth =
text
.Split(',')
.Where((x, n) => n % 5 == 4);
Only the fifth piece of text on each line:
var lines = new []
{
"1,2,3,4,i am some text,6,7",
"8,9,10,11,12,13,14,15",
"16,17,18,19,20",
};
var fifthOnEachLine =
lines
.Select(x => x.Split(',')[4]);

How can I read X lines down from another line in a text file?

I have a text file that I load into a string array. The contents of the file looks something like this:
OTI*IA*IX*NA~
REF*G1*J EVERETTE~
REF*11*0113722462~
AMT*GW*229.8~
NM1*QC*1*JENNINGS*PHILLIP~
OTI*IA*IX*NA~
REF*G1*J EVERETTE~
REF*11*0113722463~
AMT*GW*127.75~
NM1*QC*1*JENNINGS*PHILLIP~
OTI*IA*IX*NA~
REF*G1*J EVERETTE~
REF*11*0113722462~
AMT*GW*10.99~
NM1*QC*1*JENNINGS*PHILLIP~
...
I'm looking for the lines that start with OTI, and if it's followed by "IA" then I need to get the 10 digit number from the line that starts with REF*11. So far, I have this:
string[] readText = File.ReadAllLines("myfile.txt");
foreach (string s in readText) //string contains 1 line of text from above example
{
string[] currentline = s.Split('*');
if (currentline[0] == "OTI")
{
//move down 2 lines and grab the 10 digit
//number from the line that starts with REF*11
}
}
The line I need is always 2 lines after the current OTI line. How do I access the line that's 2 lines down from my current line?

Instead of using foreach() you can use a for(int index = 0; index < readText.Length; index++)
Then you know the line you are accessing and you can easily say int otherIndex = index + 2
string[] readText = File.ReadAllLines("myfile.txt");
for(int index = 0; index < readText.Length; index++)
{
string[] currentline = readText[index].Split('*');
if (currentline[0] == "OTI")
{
//move down 2 lines and grab the 10 digit
//number from the line that starts with REF*11
int refIndex = index + 2;
string refLine = readText[refIndex];
}
}

What about:
string[] readText = File.ReadAllLines("myfile.txt");
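// Stop two lines early so readText[i+2] never runs past the end of the array.
for (int i = 0; i < readText.Length - 2; i++)
{
if (readText[i].StartsWith("OTI") && readText[i+2].StartsWith("REF*11*")){
// Skip the whole "REF*11*" prefix, including the second asterisk.
string number = readText[i+2].Substring("REF*11*".Length, 10);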
//do something
}
}

This looks like an EDI file! Ahh, EDI, the memories...
The good news is that the EDI file is delimited, just like most CSV file formats. You can use any standard CSV file library to load the EDI file into a gigantic array, and then iterate through it by position.
I published my open source CSV library here, feel free to use it if it's helpful. You can simply specify the "asterisk" as the delimiter:
https://code.google.com/p/csharp-csv-reader/
// This code assumes the file is on disk, and the first row of the file
// has the names of the columns on it
DataTable dt = CSVReader.LoadDataTable(myfilename, '*', '\"');
At this point, you can iterate through the datatable as normal.
for (int i = 0; i < dt.Rows.Count; i++) {
if (dt.Rows[i][0].ToString() == "OTI") {
Console.WriteLine("The row I want is: " + dt.Rows[i + 2][0]);
}
}

If you want to use regex to tokenize the items and create dynamic entities, here is such a pattern
string data = @"NM1*QC*1*JENNINGS*PHILLIP~
OTI*IA*IX*NA~
REF*G1*J EVERETTE~
REF*11*0113722463~
AMT*GW*127.75~
NM1*QC*1*JENNINGS*PHILLIP~
OTI*IA*IX*NA~
REF*G1*J EVERETTE~
REF*11*0113722462~
AMT*GW*10.99~
NM1*QC*1*JENNINGS*PHILLIP~";
string pattern = @"^(?<Command>\w{3})((?:\*)(?<Value>[^~*]+))+";
var lines = Regex.Matches(data, pattern, RegexOptions.Multiline)
.OfType<Match>()
.Select (mt => new
{
Op = mt.Groups["Command"].Value,
Data = mt.Groups["Value"].Captures.OfType<Capture>().Select (c => c.Value)
}
);
That produces a list of items, each with an Op and its Data values, to which you can apply your business logic.

Why don't you use regular expression matching with Regex.Match or Regex.Matches, defined in System.Text.RegularExpressions? You can also look at string pattern matching algorithms such as the Knuth-Morris-Pratt algorithm.

string[] readText = File.ReadAllLines("myfile.txt");
foreach (string s in readText) //string contains 1 line of text from above example
{
string[] currentline = s.Split('*');
// Match the REF*11 line directly and take its third field, dropping the trailing '~'.
if (currentline.Length > 2 && currentline[0] == "REF" && currentline[1] == "11")
{
string number = currentline[2].TrimEnd('~');
}
}

string[] readText = File.ReadAllLines("myfile.txt");
for(int linenum = 0;linenum < readText.Length; linenum++)
{
string s = readText[linenum];
string[] currentline = s.Split('*');
if (currentline[0] == "OTI")
{
//move down 2 lines and grab the 10 digit
linenum +=2;
string refLine = readText[linenum];
//number from the line that starts with REF*11
// Extract your number here from refline
}
}

Thanks, guys. This is what I came up with, but I'm also reading your answers, as I KNOW I will learn something! Thanks again!
string[] readText = File.ReadAllLines("myfile.txt");
int i = 0;
foreach (string s in readText)
{
string[] currentline = s.Split('*');
if (currentline[0] == "OTI")
{
lbRecon.Items.Add(readText[i+2].Substring(8, 9));
}
i++;
}
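The answers above all rely on the REF*11 line sitting exactly two lines below the OTI line and can run past the end of the array on a truncated file. As a more defensive variation, here is a sketch that scans forward from each OTI*IA line until it finds the REF*11 segment or reaches the next OTI line. EdiScanSketch and ReferenceNumbers are hypothetical names, and it assumes one segment per line, each ending in '~'.

```csharp
using System;
using System.Collections.Generic;

public static class EdiScanSketch
{
    public static List<string> ReferenceNumbers(string[] lines)
    {
        var numbers = new List<string>();
        for (int i = 0; i < lines.Length; i++)
        {
            if (!lines[i].StartsWith("OTI*IA", StringComparison.Ordinal)) continue;

            // Look ahead for the REF*11 segment; stop at the next OTI segment
            // so a number is never taken from a different record.
            for (int j = i + 1; j < lines.Length && !lines[j].StartsWith("OTI", StringComparison.Ordinal); j++)
            {
                string[] parts = lines[j].TrimEnd('~', ' ').Split('*');
                if (parts.Length >= 3 && parts[0] == "REF" && parts[1] == "11")
                {
                    numbers.Add(parts[2]);
                    break;
                }
            }
        }
        return numbers;
    }
}
```

Calling EdiScanSketch.ReferenceNumbers(File.ReadAllLines("myfile.txt")) would then return the full reference numbers, including any leading zero.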
