Convert file with int value in each line to IEnumerable<int> - c#

I have a file with int values in each line (although it's possible that some values are not ints like some comments). But the structure of the file is:
1
2
3
4
5
6
7
#some comment
9
10
etc...
What's the fastest way to convert it to IEnumerable<int>? I could read line by line, use a List and call the Add method, but I guess that's not the best in terms of performance.
Thanks

You could create your IEnumerable on-the-fly while reading the file:
IEnumerable<Int32> GetInts(string filename)
{
int tmp = 0;
foreach(string line in File.ReadLines(filename))
if (Int32.TryParse(line, out tmp))
yield return tmp;
}
This way, you can do whatever you want to do with your integers while reading the file, using a foreach loop.
foreach(int i in GetInts(@"yourfile"))
{
... do something with i ...
}
If you just want to create a list, simply use the ToList extension:
List<Int32> myInts = GetInts(@"yourfile").ToList();
but there probably won't be any measurable performance difference if you "manually" create a list as you described in your question.

var lines = File.ReadLines(path).Where(l => !l.StartsWith("#"));
you can also append .Select(x => int.Parse(x))

public static IEnumerable<int> ReadInts(TextReader tr)
{
//put using here to have this manage cleanup, but in calling method
//is probably better
for(string line = tr.ReadLine(); line != null; line = tr.ReadLine())
if(line.Length != 0 && line[0] != '#')
yield return int.Parse(line);
}
I assume from your description that a line that doesn't match should throw an exception, but I also guessed that blank lines where you don't want them are very common, so I do catch that case. Adapt it to catch other cases as appropriate.
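As a rough sketch of the "cleanup in the calling method" idea from the code comment above: the caller owns the reader and has to finish enumerating before disposing it, because the iterator is lazy. The class name `ReadIntsUsage` and the `LoadAll` helper are made up for illustration, and a `StringReader` stands in for a real file.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ReadIntsUsage
{
    // Same iterator as above: skips blank lines and '#' comments,
    // and throws on any other line that is not an int.
    public static IEnumerable<int> ReadInts(TextReader tr)
    {
        for (string line = tr.ReadLine(); line != null; line = tr.ReadLine())
            if (line.Length != 0 && line[0] != '#')
                yield return int.Parse(line);
    }

    // Hypothetical caller that owns the reader's lifetime.
    public static List<int> LoadAll(string text)
    {
        using (var reader = new StringReader(text))
        {
            // ToList() runs the lazy iterator while the reader is still open.
            return ReadInts(reader).ToList();
        }
    }
}
```

Returning the un-enumerated sequence out of the `using` block would instead read from a reader that has already been disposed.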

If you want to add lines only if they are convertible to ints, you could use int.TryParse. I suggest using File.ReadLines instead of File.ReadAllLines (which creates an array in memory):
int value;
IEnumerable<string> lines = File.ReadLines(path)
.Where(l => int.TryParse(l.Trim(), out value));
or (if you want to select those ints):
int value;
IEnumerable<int> ints = File.ReadLines(path)
.Where(l => int.TryParse(l.Trim(), out value))
.Select(l => value);
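The Where/Select pair above works only because LINQ pulls each line through both lambdas before moving on to the next one, so the shared `value` still holds the right number when Select runs. Here is a sketch that avoids the shared variable, assuming C# 7 or later for `out int`; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IntLineParser
{
    // Hypothetical helper: keeps only the lines that parse as int.
    public static IEnumerable<int> ParseInts(IEnumerable<string> lines)
    {
        return lines
            // Each line becomes its parsed value, or null when it is not an int.
            .Select(l => int.TryParse(l.Trim(), out int v) ? (int?)v : null)
            .Where(v => v.HasValue)
            .Select(v => v.Value);
    }
}
```

It could be fed with `File.ReadLines(path)` in the same way as the snippets above.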

Related

How to convert a string into Int collection [duplicate]

Below is my string.
var str = "1,2,3,4,5";
var strArr = str.Split(','); // this gives me an array of type string
List<int> intCol = new List<int>(); //I want integer collection. Like
What I am doing is:-
foreach(var v in strArr)
{
intCol.Add(Convert.ToInt32(v));
}
Is this the right way to do it?
Well that's a way of doing it, certainly - but LINQ makes it a lot easier, as this is precisely the kind of thing it's designed for. You want a pipeline that:
Splits the original string into substrings
Converts each string into an integer
Converts the resulting sequence into a list
That's simple:
List<int> integers = bigString.Split(',').Select(int.Parse).ToList();
The use of int.Parse as a method group is clean here, but if you're more comfortable with using lambda expressions, you can use
List<int> integers = bigString.Split(',').Select(s => int.Parse(s)).ToList();
var numbers = str.Split(',').Select(x => int.Parse(x)).ToList();
But in such cases I would add some error handling in case the item could not be converted to an integer like this:
var strArr = str.Split(',')
.Select(x =>
{
int num;
if (int.TryParse(x, out num))
{
return num;
}
// Parse failed so return -1 or some other value or log it
// or throw exception but then this whole operation will fail
// so it is upto you and your needs to decide what to do in such
// a case.
return -1;
});
Note: Convert.ToInt32() will throw a FormatException if the value cannot be converted. TryParse will not.
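To make the difference in the note above concrete, here is a small sketch contrasting a parse that throws with one that falls back to a default. The class and method names are invented for illustration, and `out int` assumes C# 7 or later.

```csharp
using System;

public static class ParseBehaviour
{
    // Returns true when int.Parse rejects the text with a FormatException.
    public static bool ParseThrowsFormatException(string text)
    {
        try
        {
            int.Parse(text);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }

    // Returns the parsed value, or the supplied fallback when the text is not an int.
    public static int ParseOrFallback(string text, int fallback)
    {
        return int.TryParse(text, out int value) ? value : fallback;
    }
}
```

Whether a sentinel such as -1 is acceptable depends on whether -1 can also be a legitimate value in your data.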

Maximum number of occurrences a character appears in an array of strings

In C#, given the array :
string[] myStrings = new string[] {
"test#test",
"##test",
"######", // Winner (outputs 6)
};
How can I find the maximum number of occurrences that the character # appears in a single string ?
My current solution is :
int maxOccurrences = 0;
foreach (var myString in myStrings)
{
var occurrences = myString.Count(x => x == '#');
if (occurrences > maxOccurrences)
{
maxOccurrences = occurrences;
}
}
return maxOccurrences;
Is there a simpler way using LINQ that can act directly on the myStrings[] array?
And can this be made into an extension method that can work on any IEnumerable<string> ?
First of all let's project your strings into a sequence with count of matches:
myStrings.Select(s => s.Count(c => c == '#')) // {1, 2, 6} in your example
Then pick maximum value:
int maximum = myStrings
.Select(s => s.Count(x => x == '#'))
.Max(); // 6 in your example
Let's make an extension method:
public static int CountMaximumOccurrencesOf(this IEnumerable<string> strings, char ch)
{
return strings
.Select(s => s.Count(c => c == ch))
.Max();
}
However, there is a big HOWEVER. What C# calls a char is not what you would call a character in your language. This has been widely discussed in other posts, for example: Fastest way to split a huge text into smaller chunks and How can I perform a Unicode aware character by character comparison?, so I won't repeat everything here. To be "Unicode aware" you need to make your code more complicated (please note the code was written here, so it's untested):
private static IEnumerable<string> EnumerateCharacters(this string s)
{
var enumerator = StringInfo.GetTextElementEnumerator(s.Normalize());
while (enumerator.MoveNext())
yield return (string)enumerator.Value;
}
Then change our original code to:
public static int CountMaximumOccurrencesOf(this IEnumerable<string> strings, string character)
{
return strings
.Select(s => s.EnumerateCharacters().Count(c => String.Equals(c, character, StringComparison.CurrentCulture)))
.Max();
}
Note that Max() alone requires the collection to be non-empty (use DefaultIfEmpty() if the collection may be empty and that's not an error). To avoid arbitrarily deciding what to do in this situation (throw an exception or just return 0), you can make this method less specialized and leave this responsibility to the caller:
public static IEnumerable<int> CountOccurrencesOf(this IEnumerable<string> strings,
string character,
StringComparison comparison = StringComparison.CurrentCulture)
{
Debug.Assert(character.EnumerateCharacters().Count() == 1);
return strings
.Select(s => s.EnumerateCharacters().Count(c => String.Equals(c, character, comparison)));
}
Used like this:
var maximum = myStrings.CountOccurrencesOf("#").Max();
If you need it case-insensitive:
var maximum = myStrings.CountOccurrencesOf("à", StringComparison.CurrentCultureIgnoreCase)
.Max();
As you can now imagine, this comparison isn't limited to some esoteric languages; it also applies to the invariant culture (en-US), so for strings that must always be compared with the invariant culture you should specify StringComparison.InvariantCulture. Don't forget that you may also need to call String.Normalize() on the input character.
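As a small illustration of the char-versus-character point made in this answer, the sketch below compares the number of `char` values in a string with the number of text elements reported by `StringInfo`. The class and method names are only illustrative.

```csharp
using System.Globalization;

public static class TextElementDemo
{
    // Number of UTF-16 code units, which is what string.Length and Count() see.
    public static int CharCount(string s)
    {
        return s.Length;
    }

    // Number of text elements (roughly, user-perceived characters).
    public static int TextElementCount(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }
}
```

For example, `"e\u0301"` (an "e" followed by a combining acute accent) has two chars but one text element, and so does a character outside the Basic Multilingual Plane such as `"\U0001F600"`.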
You can write something like this. Note the usage of DefaultIfEmpty, to not throw an exception if myStrings is empty, but revert to 0.
var maximum = myStrings.Select(e => e.Count(ee => ee == '#')).DefaultIfEmpty().Max();
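Since the question also asked for an extension method on any IEnumerable<string>, the DefaultIfEmpty approach could be wrapped roughly like this. The method name `MaxOccurrencesOrZero` is an assumption, and it counts plain `char` values, not Unicode text elements.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class StringSequenceExtensions
{
    // Highest number of times 'ch' appears in any single string;
    // returns 0 when the sequence is empty instead of throwing.
    public static int MaxOccurrencesOrZero(this IEnumerable<string> strings, char ch)
    {
        return strings
            .Select(s => s.Count(c => c == ch))
            .DefaultIfEmpty() // supplies a single 0 for an empty sequence
            .Max();
    }
}
```

Usage would look like `myStrings.MaxOccurrencesOrZero('#')`.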
You can do that with Linq combined with Regex:
myStrings.Select(x => Regex.Matches(x, "#").Count).Max();

C# Loop Through An Array

I am completely new to C#. I am trying to loop through a short array, where the string elements in the array are placed at the end of a website search. The code:
int n = 1;
string[] s = {"firstitem","seconditem","thirditem"}
int x = s.Max(); // note, from my research this should return the maximum value in the array, but this is the first error
x = x + 1
while (n < x)
{
System.Diagnostics.Process.Start("www.website.com/" + b[0]);
b[]++; // this also generates an error "identifier expected"
}
My coding, logic or both are wrong. Based on what I've read, I should be able to get the maximum value in an array (as an int), then add to the arrays value while a WHILE loop adds each value in the array at the end of the website (and then stops). Note, that on the first error, I tried coding it differently, like the below:
int x = Convert.ToInt32(s.Max);
However, it generates an overload error. If I'm reading things correctly, MAX should find the maximum value in a sequence.
foreach(var str in s)
{
System.Diagnostics.Process.Start("www.website.com/" + str);
}
You have a collection of strings. The largest string is still a string, not an int. Since s.Max() is a string, and you're assigning it to a variable of type int: int x = s.Max(); the compiler (correctly) informs you that the types do not match. You need to convert that string to an int. Since, looking at your data, they aren't integers, and I see no sensible way of converting those strings into integers, I see no reasonable solution. What integer should "firstitem" be?
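A short sketch of the distinction described above: Max() on a string array yields a string (the one that sorts last), whereas the number of elements comes from Length. The class and method names are made up for this illustration.

```csharp
using System.Linq;

public static class MaxVersusLength
{
    // The string that sorts last under the default string comparison.
    public static string LargestString(string[] items)
    {
        return items.Max();
    }

    // The number of elements, which is what a counting loop needs.
    public static int ItemCount(string[] items)
    {
        return items.Length;
    }
}
```

With the array from the question, `LargestString` gives "thirditem" and `ItemCount` gives 3.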
If you just want to execute some code for each item in the array then use one of these patterns:
foreach(string item in s)
{
System.Diagnostics.Process.Start("www.website.com/" + item);
}
or
for(int i = 0; i < s.Length; i++)
{
System.Diagnostics.Process.Start("www.website.com/" + s[i]);
}
You're missing a couple of semi-colons
x should presumably be the Length of the array, not the largest value in it
You need to increment x inside of your loop - at the end of it, not outside of it
You should actually be incrementing n, not x
n should be starting at 0, not at 1
Inside the loop you're using b[0] where you probably want to use b[n]
I'm no C++ guru, but I have no idea what b[]++ might mean
As other answers have mentioned, you may want to use a for or foreach instead of a while.
Make an effort to go through some introductory tutorials. Trial and error can be a useful tool, but there's no need to fall back on that when learning the very basics
Following is an image to point out what are the errors of your code:
After the correction, it would be:
int n=0; // arrays are zero-based, so start at 0 to include the first item
string[] s= { "firstitem", "seconditem", "thirditem" };
int x=s.Length;
while(n<x) {
System.Diagnostics.Process.Start("www.website.com/"+s[n]);
n++; // or ++n
}
And we can make it more semantic:
var items=new[] { "firstitem", "seconditem", "thirditem" };
for(int index=0, count=items.Length; index<count; ++index)
Process.Start("www.website.com/"+items[index]);
If the starting order doesn't matter, we can use foreach instead, and we can use Linq to make the code even simpler:
var list=(new[] { "firstitem", "seconditem", "thirditem" }).ToList();
list.ForEach(item => Process.Start("www.website.com/"+item));
and we might quite often write in another form:
foreach(var item in new[] { "firstitem", "seconditem", "thirditem" })
Process.Start("www.website.com/"+item);
from the sample
var processList = (new string[]{"firstitem","seconditem","thirditem"})
.Select(s => Process.Start("www.website.com/" + s))
.ToList();
and here is a test version that outputs to console
(new string[] { "firstitem", "seconditem", "thirditem" })
.Select(s => { Console.WriteLine(@"www.website.com/" + s); return s; })
.ToList();
note: Select requires a return type and the .ToList() enforces evaluation.
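The remark about .ToList() enforcing evaluation can be checked with a small sketch: the lambda passed to Select does not run until the sequence is enumerated. The class name and the counter are only for demonstration.

```csharp
using System.Linq;

public static class DeferredDemo
{
    // Returns { calls before ToList, calls after ToList }.
    public static int[] Run()
    {
        int calls = 0;
        var query = new[] { "firstitem", "seconditem", "thirditem" }
            .Select(s => { calls++; return s; });

        int before = calls; // nothing has been evaluated yet
        query.ToList();     // enumerating runs the lambda once per item
        return new[] { before, calls };
    }
}
```

This is also why a Select with side effects such as Process.Start does nothing if the result is never enumerated.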

Simple Way to Read Integers from a File

I am trying to read in a file which is essentially a list of integers, separated by line breaks. Obviously, file input can never be trusted so I need to filter out non-integers.
1
2
3
4
I know the as operator usually converts if it can and then assigns a null, however because int isn't nullable this isn't the case. I thought that perhaps I could cast to Nullable<int>. I have never really delved into this, I thought perhaps I could do:
var lines = File.ReadAllLines("");
var numbers = lines.Select(line => line as int?).Where(i => i != null);
I know that I could potentially get around this by doing:
var numbers = lines.Select(line =>
{
int iReturn = 0;
if (int.TryParse(line, out iReturn ))
return iReturn;
else
return null;
}).Where(i => i != null);
I also could potentially do it as an extension method.
I was just looking for a neat, concise way of doing the cast in a statement and also understanding why my code is invalid.
I'm always using this simple extension method:
public static int? TryGetInt(this string item)
{
int i;
bool success = int.TryParse(item, out i);
return success ? (int?)i : (int?)null;
}
Then it's easy:
var numbers = lines.Select(line => line.TryGetInt())
.Where(i => i.HasValue)
.Select(i => i.Value);
You can also use int.TryParse without the extension, but that relies on undocumented behavior (Where and Select sharing the variable i during deferred execution) and hence might stop working in the future:
int i = 0;
var numbers = lines.Where(line => int.TryParse(line, out i))
.Select(line => i);
Edit
"also understanding why my code is invalid"
relevant code:
if (int.TryParse(line, out iReturn ))
return iReturn;
else
return null;
It would work if you'd replace
else
return null;
with
else
return (int?)null;
because you are returning an int, but null is not convertible implicitly to an int.
There isn't a concise way to do this because you don't need to cast here (you cannot cast); you need to convert from one type to another. The types here are int and string (so not exactly "any" types), but as in the general case, a conversion between unrelated types cannot be done "just like that".
Nope. C# is deliberately cautious about changing strings to numbers.
You can make your code shorter (no more nulls) using a foreach loop
var numbers = new List<int>();
foreach(string line in lines)
{
int n;
if (int.TryParse(line, out n))
numbers.Add(n);
}
If I understand you correctly and you want just filter the non integer lines, maybe regex is an option?
var lines = File.ReadAllLines("");
var numbers = lines.Where(i => Regex.IsMatch(i, "^[0-9]+$"));
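Whether the pattern is anchored matters for this kind of filter: an unanchored `[0-9]+` accepts any line that contains a digit somewhere, while `^[0-9]+$` accepts only lines made up of digits. A sketch comparing the two, with illustrative class and method names:

```csharp
using System.Text.RegularExpressions;

public static class DigitLineFilter
{
    // True if the line contains at least one digit anywhere.
    public static bool ContainsDigits(string line)
    {
        return Regex.IsMatch(line, "[0-9]+");
    }

    // True only if the whole line consists of digits.
    public static bool IsAllDigits(string line)
    {
        return Regex.IsMatch(line, "^[0-9]+$");
    }
}
```

Even the anchored filter still leaves strings, not ints, and a very long run of digits can overflow int.Parse, so int.TryParse is probably the safer final step.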
Here's the best I came up with:
Use this extension method:
public static class Int32Extensions
{
public static int? ParseOrDefault(this string text)
{
int iReturn = 0;
if (int.TryParse(text, out iReturn))
{
return iReturn;
}
return null;
}
}
Like this:
var query = lines.Select(x => x.ParseOrDefault()).Where(x => x.HasValue);
You can create an extension method
public static int? ToInt(this string s, int? defaultValue = null){ ... }
and use it in LINQ:
var lines = File.ReadAllLines(path);
var numbers = lines.Select(line => line.ToInt())
.Where(i => i != null);

int array to string

In C#, I have an array of ints, containing digits only. I want to convert this array to string.
Array example:
int[] arr = {0,1,2,3,0,1};
How can I convert this to a string formatted as: "012301"?
In .NET 3.5, use:
String.Join("", new List<int>(array).ConvertAll(i => i.ToString()).ToArray());
In .NET 4.0 or above, use (see @Jan Remunda's answer):
string result = string.Join("", array);
You can simply use the String.Join function with string.Empty as the separator; it uses a StringBuilder internally.
string result = string.Join(string.Empty, new []{0,1,2,3,0,1});
E.g.: If you use semicolon as separator, the result would be 0;1;2;3;0;1.
It actually works with null separator, and second parameter can be enumerable of any objects, like:
string result = string.Join(null, new object[]{0,1,2,3,0,"A",DateTime.Now});
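A minimal sketch of the String.Join approach applied to the array from the question, assuming .NET 4.0 or later (where Join accepts a sequence of any type). The class and method names are invented for illustration.

```csharp
public static class JoinDemo
{
    // Concatenates the values with no separator, e.g. {0,1,2,3,0,1} -> "012301".
    public static string Concatenate(int[] values)
    {
        return string.Join("", values);
    }

    // Same, but with a caller-chosen separator between the values.
    public static string WithSeparator(int[] values, string separator)
    {
        return string.Join(separator, values);
    }
}
```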
I realize my opinion is probably not the popular one, but I guess I have a hard time jumping on the Linq-y band wagon. It's nifty. It's condensed. I get that and I'm not opposed to using it where it's appropriate. Maybe it's just me, but I feel like people have stopped thinking about creating utility functions to accomplish what they want and instead prefer to litter their code with (sometimes) excessively long lines of Linq code for the sake of creating a dense 1-liner.
I'm not saying that any of the Linq answers that people have provided here are bad, but I guess I feel like there is the potential that these single lines of code can start to grow longer and more obscure as you need to handle various situations. What if your array is null? What if you want a delimited string instead of just purely concatenated? What if some of the integers in your array are double-digit and you want to pad each value with leading zeros so that the string for each element is the same length as the rest?
Taking one of the provided answers as an example:
result = arr.Aggregate(string.Empty, (s, i) => s + i.ToString());
If I need to worry about the array being null, now it becomes this:
result = (arr == null) ? null : arr.Aggregate(string.Empty, (s, i) => s + i.ToString());
If I want a comma-delimited string, now it becomes this:
result = (arr == null) ? null : arr.Skip(1).Aggregate(arr[0].ToString(), (s, i) => s + "," + i.ToString());
This is still not too bad, but I think it's not obvious at a glance what this line of code is doing.
Of course, there's nothing stopping you from throwing this line of code into your own utility function so that you don't have that long mess mixed in with your application logic, especially if you're doing it in multiple places:
public static string ToStringLinqy<T>(this T[] array, string delimiter)
{
// edit: let's replace this with a "better" version using a StringBuilder
//return (array == null) ? null : (array.Length == 0) ? string.Empty : array.Skip(1).Aggregate(array[0].ToString(), (s, i) => s + "," + i.ToString());
return (array == null) ? null : (array.Length == 0) ? string.Empty : array.Skip(1).Aggregate(new StringBuilder(array[0].ToString()), (s, i) => s.Append(delimiter).Append(i), s => s.ToString());
}
But if you're going to put it into a utility function anyway, do you really need it to be condensed down into a 1-liner? In that case why not throw in a few extra lines for clarity and take advantage of a StringBuilder so that you're not doing repeated concatenation operations:
public static string ToStringNonLinqy<T>(this T[] array, string delimiter)
{
if (array != null)
{
// edit: replaced my previous implementation to use StringBuilder
if (array.Length > 0)
{
StringBuilder builder = new StringBuilder();
builder.Append(array[0]);
for (int i = 1; i < array.Length; i++)
{
builder.Append(delimiter);
builder.Append(array[i]);
}
return builder.ToString();
}
else
{
return string.Empty;
}
}
else
{
return null;
}
}
And if you're really so concerned about performance, you could even turn it into a hybrid function that decides whether to do string.Join or to use a StringBuilder depending on how many elements are in the array (this is a micro-optimization, not worth doing in my opinion and possibly more harmful than beneficial, but I'm using it as an example for this problem):
public static string ToString<T>(this T[] array, string delimiter)
{
if (array != null)
{
// determine if the length of the array is greater than the performance threshold for using a stringbuilder
// 10 is just an arbitrary threshold value I've chosen
if (array.Length < 10)
{
// assumption is that for arrays of less than 10 elements
// this code would be more efficient than a StringBuilder.
// Note: this is a crazy/pointless micro-optimization. Don't do this.
string[] values = new string[array.Length];
for (int i = 0; i < values.Length; i++)
values[i] = array[i].ToString();
return string.Join(delimiter, values);
}
else
{
// for arrays of length 10 or longer, use a StringBuilder
StringBuilder sb = new StringBuilder();
sb.Append(array[0]);
for (int i = 1; i < array.Length; i++)
{
sb.Append(delimiter);
sb.Append(array[i]);
}
return sb.ToString();
}
}
else
{
return null;
}
}
For this example, the performance impact is probably not worth caring about, but the point is that if you are in a situation where you actually do need to be concerned with the performance of your operations, whatever they are, then it will most likely be easier and more readable to handle that within a utility function than using a complex Linq expression.
That utility function still looks kind of clunky. Now let's ditch the hybrid stuff and do this:
// convert an enumeration of one type into an enumeration of another type
public static IEnumerable<TOut> Convert<TIn, TOut>(this IEnumerable<TIn> input, Func<TIn, TOut> conversion)
{
foreach (TIn value in input)
{
yield return conversion(value);
}
}
// concatenate the strings in an enumeration separated by the specified delimiter
public static string Delimit<T>(this IEnumerable<T> input, string delimiter)
{
IEnumerator<T> enumerator = input.GetEnumerator();
if (enumerator.MoveNext())
{
StringBuilder builder = new StringBuilder();
// start off with the first element
builder.Append(enumerator.Current);
// append the remaining elements separated by the delimiter
while (enumerator.MoveNext())
{
builder.Append(delimiter);
builder.Append(enumerator.Current);
}
return builder.ToString();
}
else
{
return string.Empty;
}
}
// concatenate all elements
public static string ToString<T>(this IEnumerable<T> input)
{
return ToString(input, string.Empty);
}
// concatenate all elements separated by a delimiter
public static string ToString<T>(this IEnumerable<T> input, string delimiter)
{
return input.Delimit(delimiter);
}
// concatenate all elements, each one left-padded to a minimum length
public static string ToString<T>(this IEnumerable<T> input, int minLength, char paddingChar)
{
return input.Convert(i => i.ToString().PadLeft(minLength, paddingChar)).Delimit(string.Empty);
}
Now we have separate and fairly compact utility functions, each of which is arguably useful on its own.
Ultimately, my point is not that you shouldn't use Linq, but rather just to say don't forget about the benefits of creating your own utility functions, even if they are small and perhaps only contain a single line that returns the result from a line of Linq code. If nothing else, you'll be able to keep your application code even more condensed than you could achieve with a line of Linq code, and if you are using it in multiple places, then using a utility function makes it easier to adjust your output in case you need to change it later.
For this problem, I'd rather just write something like this in my application code:
int[] arr = { 0, 1, 2, 3, 0, 1 };
// 012301
result = arr.ToString<int>();
// comma-separated values
// 0,1,2,3,0,1
result = arr.ToString(",");
// left-padded to 2 digits
// 000102030001
result = arr.ToString(2, '0');
To avoid the creation of an extra array you could do the following.
var builder = new StringBuilder();
Array.ForEach(arr, x => builder.Append(x));
var res = builder.ToString();
string result = arr.Aggregate("", (s, i) => s + i.ToString());
(Disclaimer: If you have a lot of digits (hundreds, at least) and you care about performance, I suggest eschewing this method and using a StringBuilder, as in JaredPar's answer.)
You can do:
int[] arr = {0,1,2,3,0,1};
string results = string.Join("",arr.Select(i => i.ToString()).ToArray());
That gives you your results.
I like using StringBuilder with Aggregate(). The "trick" is that Append() returns the StringBuilder instance itself:
var sb = arr.Aggregate( new StringBuilder(), ( s, i ) => s.Append( i ) );
var result = sb.ToString();
string.Join("", (from i in arr select i.ToString()).ToArray())
In the .NET 4.0 the string.Join can use an IEnumerable<string> directly:
string.Join("", from i in arr select i.ToString())
I've left this here for posterity but don't recommend its use as it's not terribly readable. This is especially true now that I've come back to see it after some time and have wondered what I was thinking when I wrote it (I was probably thinking 'crap, must get this written before someone else posts an answer'.)
string s = string.Concat(arr.Cast<object>().ToArray());
The most efficient way is not to convert each int into a string, but rather create one string out of an array of chars. Then the garbage collector only has one new temp object to worry about.
int[] arr = {0,1,2,3,0,1};
string result = new string(Array.ConvertAll<int,char>(arr, x => Convert.ToChar(x + '0')));
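The char-array approach above only produces sensible output when every element is a single digit from 0 to 9. A sketch with that assumption made explicit follows; the class name, method name and the choice to throw are hypothetical.

```csharp
using System;

public static class DigitArrayConverter
{
    // Builds the string from one char array, rejecting values outside 0-9.
    public static string DigitsToString(int[] digits)
    {
        var chars = new char[digits.Length];
        for (int i = 0; i < digits.Length; i++)
        {
            int d = digits[i];
            if (d < 0 || d > 9)
                throw new ArgumentOutOfRangeException(nameof(digits), "Each element must be a single digit (0-9).");
            chars[i] = (char)('0' + d); // '0' + 3 is the character '3'
        }
        return new string(chars);
    }
}
```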
This is a roundabout way to go about it, but it's not much code and is easy for beginners to understand:
int[] arr = {0,1,2,3,0,1};
string joined = "";
foreach(int i in arr){
joined += i.ToString();
}
int number = int.Parse(joined);
If this is a long array, you could use
var result = arr.Aggregate(new StringBuilder(), ( s, i ) => s.Append( i ), s => s.ToString());
// This is the original array
int[] nums = {1, 2, 3};
// This is an empty string we will end up with
string numbers = "";
// iterate over every number in the array
foreach (var item in nums)
{
// append the number, converted to a string, to the result
numbers += Convert.ToString(item);
}
// Write the string in the console
Console.WriteLine(numbers);
