Remove part of a string from List<string> - c#

I have a List<string>, the string part representing filenames that I need to filter out: anything that comes before the character '&' included must be erased.
List<string> zippedTransactions = new List<string>();
zippedTransactions.Add("33396&20151007112154000549659S03333396SUMMARIES.PDF");
zippedTransactions.Add("33395&20151007112400000549659S03333395SUMMARIES.PDF");
zippedTransactions.Add("33397&20151007112555000549659S03333397SUMMARIES.PDF");
// desired output:
// "20151007112154000549659S03333396SUMMARIES.PDF";
// "20151007112400000549659S03333395SUMMARIES.PDF";
// "20151007112555000549659S03333397SUMMARIES.PDF"
NOTE: I don't want to use the classic iterative style. C# provides plenty of functional interfaces for interacting with this sort of data structure, and I want to start using them.

Here is one LINQ approach with Regex:
zippedTransactions = zippedTransactions.Select(x => Regex.Replace(x, ".*&", string.Empty)).ToList();
That's more fault tolerant than Split('&')[1] in case there is no & in your filename: the string is then left unchanged instead of an exception being thrown.
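To make the fault-tolerance point concrete, here is a minimal sketch. The names `PrefixStripper` and `StripWithRegex` are made up for illustration, and the inputs in the usage note are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrefixStripper
{
    // Removes everything up to and including the LAST '&' (".*" is greedy).
    // A string that contains no '&' does not match and is returned unchanged.
    public static List<string> StripWithRegex(IEnumerable<string> names)
    {
        return names.Select(x => Regex.Replace(x, ".*&", string.Empty)).ToList();
    }
}
```

Calling `PrefixStripper.StripWithRegex(new[] { "33396&2015SUMMARIES.PDF", "NOAMPERSAND.PDF" })` leaves the second entry as it is, whereas `"NOAMPERSAND.PDF".Split('&')[1]` would throw an IndexOutOfRangeException.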

Try this
for (int i = 0; i < zippedTransactions.Count; i++)
{
zippedTransactions[i] = zippedTransactions[i].Split('&')[1];
}

If you happen to have Visual Studio, and a version that supports C# Interactive, I suggest you try this.
> zippedTransactions = new List<string>() {
"33396&20151007112154000549659S03333396SUMMARIES.PDF",
"33395&20151007112400000549659S03333395SUMMARIES.PDF",
"33397&20151007112555000549659S03333397SUMMARIES.PDF"
};
>
> zippedTransactions.Select(dirname => dirname.Split('&')[1])
Enumerable.WhereSelectListIterator<string, string> { "20151007112154000549659S03333396SUMMARIES.PDF", "20151007112400000549659S03333395SUMMARIES.PDF", "20151007112555000549659S03333397SUMMARIES.PDF" }
>
And even if you don't, you can get an idea of what's happening just by looking at the code.
The WhereSelectListIterator is a data structure holding the logic you intend to execute on the data structure. It is evaluated (read: the loop actually happens) only when you consume it (for example, calling .ToList() at the end).
This code only takes the second element after splitting the string on '&', so you might want to generalize it or tune it for your requirements.
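The deferred evaluation described above can be observed with a small sketch. The class name `DeferredDemo`, the counter, and the sample data are assumptions added purely for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns how many times the lambda ran before and after ToList() was called.
    public static (int Before, int After) CountCalls()
    {
        var names = new List<string> { "1&a.pdf", "2&b.pdf" };
        int calls = 0;

        // Building the query does not execute the lambda yet.
        var query = names.Select(n => { calls++; return n.Split('&')[1]; });
        int before = calls;

        // Consuming the query is what actually runs the loop.
        query.ToList();
        return (before, calls);
    }
}
```

Here `Before` is 0 and `After` is 2: the lambda only runs once the query is consumed.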

Use string.Split to split the string at the desired character and retrieve the portion that you want:
foreach (var item in zippedTransactions)
{
string[] result = item.Split('&');
Console.WriteLine(result[1]);
}

You can use the string.IndexOf function to find the location of a character in the string and then use string.Remove to remove the characters up to that point:
for(int i =0; i < zippedTransactions.Count; i++)
{
int count = zippedTransactions[i].IndexOf("&") + 1;
zippedTransactions[i] = zippedTransactions[i].Remove(0, count);
}
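Since the question asks for a functional style, the same IndexOf idea can also be sketched with LINQ. This is only one possible variant; `AmpersandTrimmer` and `TrimPrefix` are hypothetical names, and `Substring` stands in for `Remove`.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AmpersandTrimmer
{
    // Drops everything up to and including the FIRST '&'.
    // IndexOf returns -1 when '&' is absent, so Substring(0) keeps the whole string.
    public static List<string> TrimPrefix(IEnumerable<string> names)
    {
        return names.Select(s => s.Substring(s.IndexOf('&') + 1)).ToList();
    }
}
```

Usage would look like `zippedTransactions = AmpersandTrimmer.TrimPrefix(zippedTransactions);`.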

The following code will help you:
for (int i = 0; i < zippedTransactions.Count; i++)
{
string[] result = zippedTransactions[i].Split('&');
zippedTransactions[i] = result[result.Length-1];
}

Related

How to add from an array of strings to a list, each word that has its letters in alphabetical order?

string[] words = new string[5] { "abbey","billowy", "chills","abced","abcde" };
it should display only:
abbey billowy chills abcde
I tried this code
List<string> AlphabeticOrder = new List<string>();
foreach (var word in words)
{
for (int i = 1; i < word.Length; i++)
{
if (word[i] < word[i - 1])
{
break;
}
AlphabeticOrder.Add(word);
break;
}
}
One line solution:
var alphabeticOrder = words.Where(w => Enumerable.SequenceEqual(w.OrderBy(x => x), w)).ToList();
EDIT: As pointed out in comments this approach is not the most optimal when it comes to the performance, so if this is a priority, one can consider solutions proposed in other answers.
This becomes easier if you break it into pieces. What you need is a function that takes a string and tells you if the characters in the string are in alphabetical order.
For example:
public static class CharacterSequence // I didn't think hard about the name
{
public static bool CharactersAreInAlphabeticalOrder(string input)
{
return input.SequenceEqual(input.OrderBy(c => c));
}
}
A string is a collection of characters (char). This method takes the sequence of characters and sorts it. Then it compares the original to the sorted. If they are the same, then the original sequence was in order.
Having done that, the next part is just checking a collection of strings and returning only the ones where the characters are in order:
var wordsWithCharactersInOrder =
words.Where(CharacterSequence.CharactersAreInAlphabeticalOrder);
One reason why it's helpful to break it up like this is that it's easier to understand. It's very easy to read the above code and tell what it does. Also, if you realize that there's something you want to change about the way you check for characters in order, you can change that in the smaller function.
For example, you might realize that the original function is case-sensitive. C comes before d, but D comes before c. In this example it's less noticeable because the function is small, but as logic becomes more complex it's easier to read and think about when we break things into smaller functions. The case-insensitive version would be
public static bool CharactersAreInAlphabeticalOrder(string input)
{
var lowerCase = input.ToLower();
return lowerCase.SequenceEqual(lowerCase.OrderBy(c => c));
}
If you want to go a step further then you can compare the characters one at a time instead of sorting the entire string.
public static bool CharactersAreInAlphabeticalOrder(string input)
{
if (input.Length < 2) return true;
var lowerCase = input.ToLower();
var characterIndexes = Enumerable.Range(0, input.Length - 1);
return characterIndexes.All(characterIndex =>
lowerCase[characterIndex] <= lowerCase[characterIndex + 1]);
}
You can also write unit tests for it. If you know that the smaller function always returns the expected results, then the larger one that checks a collection of strings will return the correct results.
Here's an example of a unit test. It's much easier to test lots of conditions this way and have confidence that the function works than to edit the code and run it over and over. If you realize that there's another case you have to account for, you can just add it.
[DataTestMethod]
[DataRow("A", true)]
[DataRow("AB", true)]
[DataRow("abc", true)]
[DataRow("aBc", true)]
[DataRow("ba", false)]
public void CharactersAreInAlphabeticalOrder_returns_expected_result(string input, bool expected)
{
var result = CharacterSequence.CharactersAreInAlphabeticalOrder(input);
Assert.AreEqual(expected, result);
}
There was a small error in my original code. It didn't work if a word had only two letters. Without the test that error could have gone into the application without being noticed until later when it would take longer to find and fix. It's much easier with a test.
Words with letters in alphabetical order are known as abecedarian.
The difficulty in your algorithm is breaking out of a nested loop. There are different strategies to solve this problem:
Use a labeled statement and goto. Goto is frowned upon.
Use of a Boolean guard. This is okay but not very readable.
Place the inner loop into a method. This is the clean and easy to read solution that I decided to present.
Let us create a helper method:
private static bool IsAbecedarianWord(string word)
{
for (int i = 1; i < word.Length; i++) {
if (word[i] < word[i - 1]) {
return false;
}
}
return true;
}
With its help we can write:
foreach (var word in words) {
if (IsAbecedarianWord(word)) {
AlphabeticOrder.Add(word);
}
}
Clean and simple!
One note on naming conventions in C#. The usual conventions are (in short):
Type names, Method names and Property names are written in PascalCase. Interfaces are additionally prefixed with an upper case I (IPascalCase).
Names of method parameters and local variables are written in camelCase.
Field names (class and struct variables) are written in _camelCase with a leading underscore.
With that in mind, I suggest renaming AlphabeticOrder to abecedarian.
If you want to use your method try adding this:
foreach (var word in words)
{
for (int i = 1; i < word.Length; i++)
{
if (word[i] < word[i - 1])
{
break;
}
if (i == word.Length - 1)
{
AlphabeticOrder.Add(word);
}
}
}
The problem in your code is that it checks only the first two letters and, if they are in alphabetical order, adds the word to the list.
The reason this is not working as expected is that the logic on whether to discard a result is flawed. The for loop which iterates the letters within the word only checks the first pair of letters and then exits the loop regardless of the result.
I've added comments to your function below to help explain this.
for (int i = 1; i < word.Length; i++)
{
if (word[i] < word[i - 1]) // check the 2nd letter of the word against the 1st
{
break; // if the 2nd letter comes before the 1st in the alphabet, exit
}
AlphabeticOrder.Add(word); // add the word to the list
break; // exit the for loop
}
You should refactor the code such that it checks every letter of the word before adding it to the list of alphabetical words. You can also still end the for loop early if the condition fails.
There are a few ways to solve this. You could track the letters like Adam's answer above. Another possibility is to sort the array of letters and compare it to the original. If the arrays match then it's an alphabetical word for your scenario; if they don't, it's not.
E.g.
foreach (var word in words)
{
var letters = word.ToList();
var sorted = word.OrderBy(l => l);
if (letters.SequenceEqual(sorted))
{
AlphabeticOrder.Add(word);
}
}
Which outputs:
abbey,billowy,chills,abcde
Logic is flawed.
Condition is satisfied in the first 2 letters and immediately added to the list.
List<string> AlphabeticOrder = new List<string>();
bool isOrdered = true; // Added this
foreach (var word in words)
{
isOrdered = true; // Added this
for (int i = 1; i < word.Length; i++)
{
if (word[i] < word[i - 1])
{
isOrdered = false; // Added this
break;
}
}
// Added this
if(isOrdered)
AlphabeticOrder.Add(word);
}

How to remove rows from IEnumerable

I'm loading CSV files into an IEnumerable.
string[] fileNames = Directory.GetFiles(@"read\", "*.csv");
for (int i = 0; i < fileNames.Length; i++)
{
string file = @"read\" + Path.GetFileName(fileNames[i]);
var lines = from rawLine in File.ReadLines(file, Encoding.Default)
where !string.IsNullOrEmpty(rawLine)
select rawLine;
}
After that I work with the Data but now there are couple of Files that are pretty much empty and only have ";;;;;;" (the amount varies) written in there.
How can I delete those rows before working with them and without changing anything in the csv files?
If the amount of ; characters per line is variable, this is what your "where" condition should look like:
where !string.IsNullOrEmpty(rawLine) && !string.IsNullOrEmpty(rawLine.Trim(';'))
rawLine.Trim(';') will return a copy of the string with all leading and trailing ; characters removed. If this new string is empty, it means this line can be ignored, since it only contained ; characters.
You can't remove anything from an IEnumerable (as you can from a List<T>), but you can add a filter:
lines = lines.Where(l => !l.Trim().All(c => c == ';'));
This won't delete anything, but you won't process these lines anymore.
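As a quick check of what such a filter keeps, here is a self-contained sketch. `CsvLineFilter` and `DropSeparatorOnlyLines` are illustrative names, and the sample lines in the usage note are invented.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CsvLineFilter
{
    // Keeps only lines that contain something other than ';' (ignoring surrounding whitespace).
    // All() is true for an empty sequence, so empty lines are filtered out as well.
    public static IEnumerable<string> DropSeparatorOnlyLines(IEnumerable<string> lines)
    {
        return lines.Where(l => !l.Trim().All(c => c == ';'));
    }
}
```

For the input `"a;b;c"`, `";;;;;;"`, `""`, `" ;; "`, `";x;"` only `"a;b;c"` and `";x;"` survive. The source sequence, and therefore the file, is not modified.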
You can't remove rows from an enumerable - https://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx.
Instead try creating a new array with the filtered data or filter it on the where clause that you presented like:
string[] fileNames = Directory.GetFiles(@"read\", "*.csv");
for (int i = 0; i < fileNames.Length; i++)
{
    string file = @"read\" + Path.GetFileName(fileNames[i]);
    var lines = from rawLine in File.ReadLines(file, Encoding.Default)
                where !string.IsNullOrEmpty(rawLine) && rawLine != ";;;;;;"
                select rawLine;
}
There are multiple solutions.
Convert the enumerable to a List, then delete from the List. This is a bit expensive.
Create a function that filters the data. You can apply multiple filters if required.
public IEnumerable<string> GetData(IEnumerable<string> data)
{
return data.Where(c => !String.Equals(c, "<<data that you want to filter>>"));
}
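Another option for reading a CSV file is the TextFieldParser class (in Microsoft.VisualBasic.FileIO). It has CommentTokens and Delimiters properties which may help you here.
Specifying ; as a comment token may help: lines that start with a comment token are skipped. Note that this would also skip any data row whose first field is empty.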

Replace character at specific index in List<string>, but indexer is read only [duplicate]

This question already has answers here:
Is there an easy way to change a char in a string in C#?
This is kind of a basic question, but I learned programming in C++ and am just transitioning to C#, so my ignorance of the C# methods is getting in my way.
A client has given me a few fixed length files and they want the 484th character of every odd numbered record, skipping the first one (3, 5, 7, etc...) changed from a space to a 0. In my mind, I should be able to do something like the below:
static void Main(string[] args)
{
List<string> allLines = System.IO.File.ReadAllLines(@"C:\...").ToList();
foreach(string line in allLines)
{
//odd numbered logic here
line[483] = '0';
}
...
//write to new file
}
However, the property or indexer cannot be assigned to because it is read only. All my reading says that I have not set a setter for the variable, and I have tried what was shown at this SO article, but I am doing something wrong every time. Should what is shown in that article work? Should I do something else?
You cannot modify C# strings directly, because they are immutable. You can convert the string to a char[], modify it, then make a string again, and write it to a file:
File.WriteAllLines(
    @"c:\newfile.txt",
    File.ReadAllLines(@"C:\...").Select((s, index) => {
        // index is zero-based; adjust this test to select the records you need
        if (index % 2 == 0) {
            return s; // Even indexes do not change
        }
        var chars = s.ToCharArray();
        chars[483] = '0';
        return new string(chars);
    })
);
Since strings are immutable, you can't modify a single character at a specific index the way you would with a char[]. However, you can "modify" it by assigning a new string to the variable.
We can use the Substring() method to return any part of the original string. Combining this with some concatenation, we can take the first part of the string (up to the character you want to replace), add the new character, and then add the rest of the original string.
Also, since we can't directly modify the items in a collection being iterated over in a foreach loop, we can switch your loop to a for loop instead. Now we can access each line by index, and can modify them on the fly:
for (int i = 0; i < allLines.Count; i++)
{
if (allLines[i].Length > 483)
{
allLines[i] = allLines[i].Substring(0, 483) + "0" + allLines[i].Substring(484);
}
}
It's possible that, depending on how many lines you're processing and how many in-line concatenations you end up doing, there is some chance that using a StringBuilder instead of concatenation will perform better. Here is an alternate way to do this using a StringBuilder. I'll leave the perf measuring to you...
var sb = new StringBuilder();
for (int i = 0; i < allLines.Count; i++)
{
if (allLines[i].Length > 483)
{
sb.Clear();
sb.Append(allLines[i].Substring(0, 483));
sb.Append("0");
sb.Append(allLines[i].Substring(484));
allLines[i] = sb.ToString();
}
}
The first item after the foreach (string line in this case) is the iteration variable, which is read-only inside the loop, so you can't assign a new value to it. Try using a regular for loop instead.
foreach is meant for iterating over a container, and its iteration variable is read-only. You should use a regular for loop. It will work.
static void Main(string[] args)
{
List<string> allLines = System.IO.File.ReadAllLines(@"C:\...").ToList();
for (int i = 0; i < allLines.Count; ++i)
{
if (allLines[i].Length > 483)
{
allLines[i] = allLines[i].Substring(0, 483) + "0" + allLines[i].Substring(484);
}
}
...
//write to new file
}
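None of the loops above restricts the change to records 3, 5, 7, and so on, as the question asks. A minimal sketch of that selection follows, assuming records are counted from 1. `FixedLengthPatcher` and `PatchOddRecords` are hypothetical names, and the character position is passed in rather than hard-coded.

```csharp
using System.Collections.Generic;

public static class FixedLengthPatcher
{
    // Sets the character at zero-based `position` to '0' on records 3, 5, 7, ...
    // (one-based), which are the zero-based indexes 2, 4, 6, ...
    public static void PatchOddRecords(IList<string> lines, int position)
    {
        for (int i = 2; i < lines.Count; i += 2)
        {
            // Lines that are too short are left untouched.
            if (lines[i].Length > position)
            {
                char[] chars = lines[i].ToCharArray();
                chars[position] = '0';
                lines[i] = new string(chars);
            }
        }
    }
}
```

For the 484th character you would call `FixedLengthPatcher.PatchOddRecords(allLines, 483);` before writing the lines to the new file.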

How to find a pair of chars within a string in c# netMF?

This has probably (somewhere) been asked before, but can't find any documentation on it (i have looked!).
Say I had declared a string like:
String Test = "abcdefg";
How would i go about searching the string to see if I could see "cd" anywhere in the string by searching through the string in pairs, like:
{ab}{bc}{cd}{de}{ef}{fg}
That is, if I split each of the values up, and searched for a pair of chars next to each other? Is there a built in function for this?
I have thought about using a char array for this, but it seems to (logically) be very 'heavy'/'slow'. Would there be a better solution to search this string?
EDIT 1
Once I see this "cd", I would then need to doSomething() at that position (which I have already implemented by using the substring method).
Try this.
String.IndexOf(...) != -1
For more info, read here.
Similar to the answer from Neo, but in a loop to get all instances within the string:
string Test = "abcdefgcd";
int index = Test.IndexOf("cd");
while (index > -1)
{
//DoSomething();
index = Test.IndexOf("cd", ++index);
}
The first IndexOf checks for the existence of what you want, whilst the second IndexOf (in the loop) checks for a match after the last index.
In the above we find two matches and then the loop ends.
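The same loop can be wrapped in a helper that collects every match position. This is a sketch for the full .NET Framework; `PairFinder` and `FindAll` are made-up names, and on .NET Micro Framework generic collections such as `List<int>` may not be available, so an array or ArrayList might have to be substituted there.

```csharp
using System;
using System.Collections.Generic;

public static class PairFinder
{
    // Returns the zero-based start index of every occurrence of `pair` in `text`.
    public static List<int> FindAll(string text, string pair)
    {
        var positions = new List<int>();
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(pair))
        {
            return positions; // nothing to search for
        }

        int index = text.IndexOf(pair, StringComparison.Ordinal);
        while (index > -1)
        {
            positions.Add(index);
            // Continue searching just after the start of the previous match.
            index = text.IndexOf(pair, index + 1, StringComparison.Ordinal);
        }
        return positions;
    }
}
```

`PairFinder.FindAll("abcdefgcd", "cd")` returns the positions 2 and 7, which is where DoSomething() would be called.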
There is no built-in function that will do that.
A for loop should do what you want.
Something like this:
string str = string.Empty;
for (int i = 0; i < ch.Length - 1; i++) {
// ch[i] and ch[i + 1] form one adjacent pair
str += ch[i].ToString() + ch[i + 1];
}
You can also use a regex, however that won't be fast either.
In order to optimize this on a large scale you can implement byte shifting.
The ASCII code of your string characters is your friend in this case, full working example below:
var yourString = "abcdefg";
var x = '\0';
for (var i = 0; i < yourString.Length; i++)
{
//check whether i+1 index is not out of range
if (i + 1 != yourString.Length)
{
var test = yourString[i + 1];
x = yourString[i];
if(x.ToString() + test.ToString() == "cd")
{
Console.Write("Found at position " + i);
}
}
}

C# equivalent to Javascript "push"

I'm trying to convert this code over to C# and was wondering what is the equivalent to Javascript's "Array.push"?
Here's a few lines of the code i'm converting:
var macroInit1, macroInit2;
var macroSteps = new Array();
var i, step;
macroInit1 = "Random String";
macroInit2 = "Random String two";
macroSteps.push(macroInit1 + "another random string");
macroSteps.push(macroInit2 + "The last random string");
for (i=0; i<10; i++)
{
for (step = 0; step < macroSteps.length; step++)
{
// Do some stuff
}
}
You could use a List<string>:
var macroInit1 = "Random String";
var macroInit2 = "Random String two";
var macroSteps = new List<string>();
macroSteps.Add(macroInit1 + "another random string");
macroSteps.Add(macroInit2 + "The last random string");
for (int i = 0; i < 10; i++)
{
for (int step = 0; step < macroSteps.Count; step++)
{
}
}
Of course this code looks extremely ugly in C#. Depending on what manipulations you are performing on those strings you could take advantage of the LINQ features built into C# to convert it into a one-liner and avoid writing all those imperative loops.
This is to say that when converting source code from one language to another it's not a matter of simply searching for the equivalent data type, etc... You could also take advantage of what the target language has to offer.
You can replace that with either
List<string> macroSteps for a type-safe list of strings
or
ArrayList macroSteps for a flexible list of objects
or
Stack<string> macroSteps. It has .Push() and .Pop() like in JS.
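One difference between these options is worth seeing in code: a Stack<string> enumerates in last-in-first-out order, so it does not iterate like a JavaScript array that was filled with push. The sketch below uses made-up names (`PushDemo`, `ListOrder`, `StackOrder`) and placeholder values.

```csharp
using System.Collections.Generic;

public static class PushDemo
{
    // List<T>.Add appends at the end, like JavaScript's Array.push.
    public static string[] ListOrder()
    {
        var steps = new List<string>();
        steps.Add("first");
        steps.Add("second");
        return steps.ToArray(); // insertion order
    }

    // Stack<T>.Push also adds an item, but the stack is read back newest-first.
    public static string[] StackOrder()
    {
        var steps = new Stack<string>();
        steps.Push("first");
        steps.Push("second");
        return steps.ToArray(); // last in, first out
    }
}
```

`ListOrder()` yields "first", "second", while `StackOrder()` yields "second", "first". For a loop that walks the steps in the order they were added, List<string> is the closer match.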
It can be much more clean, declarative and nice in C#, for example:
//In .NET both lists and arrays implement the IList interface, so we don't care what's behind
//A parameter is just a sequence, again, we just enumerate through
//and don't care if you put array or list or whatever, any enumerable
public static IList<string> GenerateMacroStuff(IEnumerable<string> macroInits)
{
return macroInits
.Select(x => x + "some random string or function returns that") //your re-initialization
.Select(x => YourDoSomeStuff(x)) //what you had in your foreach
.ToArray();
}
And it can be used then:
var myInits = new[] {"Some init value", "Some init value 2", "Another Value 3"};
var myMacroStuff = GenerateMacroStuff(myInits); //here is an array of what we need
BTW, we can suggest how to "do stuff" properly and nicely if you just describe what you want, instead of showing us code we don't have any clue how to use and asking how to translate it literally.
A literal translation can be unnatural and ugly in the .NET world, and you will have to maintain this ugliness... We don't want you to be in this position :)
