How to limit the number of lines read into a string array to 15 - C#

How can I limit the lines pulled from the document to 15 at a time? Right now it displays all the lines at once. Thanks.
class Program {
static void Main(string[] args) {
string[] lines = System.IO.File.ReadAllLines(
@"C:\Users\chri749y\Documents\Skrive til fil\Testprogram.txt");
foreach (string line in lines) {
Console.WriteLine("{0}", line);
}
Console.ReadKey();
}
}

If you only want the first 15 lines, try Take (LINQ), which is designed for exactly this:
var lines = System.IO.File
.ReadLines(@"C:\Users\chri749y\Documents\Skrive til fil\Testprogram.txt")
.Take(15);
If you want batch processing instead, i.e. lines 0 .. 14, then lines 15 .. 29, and so on:
// Split input into batches with at most "size" items each
private static IEnumerable<T[]> Batch<T>(IEnumerable<T> lines, int size) {
List<T> batch = new List<T>(size);
foreach (var item in lines) {
if (batch.Count >= size) {
yield return batch.ToArray();
batch.Clear();
}
batch.Add(item);
}
if (batch.Count > 0) // tail, possibly incomplete batch
yield return batch.ToArray();
}
Then
var batches = Batch(System.IO.File
.ReadLines(@"C:\Users\chri749y\Documents\Skrive til fil\Testprogram.txt"),
15);
foreach (var batch in batches) { // Array with at most 15 items
foreach (var line in batch) {
...
}
}

If I've understood you correctly, you can do this using System.IO.File.ReadLines which lets you stream the file in as an IEnumerable<string>. I've created a custom batching function which will read 15 lines at a time.
static void Main(string[] args)
{
var lines = System.IO.File.ReadLines(@"C:\Users\chri749y\Documents\Skrive til fil\Testprogram.txt");
foreach (var batch in Batch(lines, 15))
{
foreach (var line in batch)
{
Console.WriteLine(line);
}
Console.ReadKey();
}
Console.ReadKey();
}
Batch yields one List for every batchSize (e.g. 15) lines of the file; the last list may be shorter.
private static IEnumerable<List<T>> Batch<T>(IEnumerable<T> source, int batchSize) // static so that the static Main can call it
{
if (batchSize < 1)
{
throw new ArgumentException("Batch size must be at least 1.", nameof(batchSize));
}
var batch = new List<T>(batchSize);
foreach (var item in source)
{
batch.Add(item);
if (batch.Count == batchSize)
{
yield return batch;
batch = new List<T>(batchSize);
}
}
if (batch.Any())
{
yield return batch;
}
}
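On .NET 6 or later, the built-in Enumerable.Chunk does the same batching without a hand-written helper. A minimal sketch, assuming .NET 6+; the ChunkDemo class, the Pages helper and the generated "line N" strings are hypothetical stand-ins (the generated strings replace File.ReadLines so the example runs without the file):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ChunkDemo
{
    // Splits any sequence into arrays of at most pageSize items using
    // Enumerable.Chunk (available from .NET 6 onwards).
    public static List<string[]> Pages(IEnumerable<string> lines, int pageSize)
    {
        return lines.Chunk(pageSize).ToList();
    }

    static void Main()
    {
        // Stand-in for File.ReadLines(path): 40 numbered lines.
        IEnumerable<string> lines = Enumerable.Range(1, 40).Select(n => "line " + n);

        foreach (string[] page in Pages(lines, 15))
        {
            foreach (string line in page)
            {
                Console.WriteLine(line);
            }
            Console.WriteLine("-- {0} lines shown --", page.Length);
        }
    }
}
```

On older frameworks Chunk is not available, so a hand-written Batch method like the ones above is still needed.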

Related

Print 2D list in console in C#

I have following code.
class Solution
{
static void Main(String[] args)
{
var matrix = new List<List<int>>();
for (int i = 0; i < 6; ++i)
{
string[] elements = Console.ReadLine().Split(' ');
matrix.Add(new List<int>());
foreach (var item in elements)
{
matrix[i].Add(int.Parse(item));
}
}
}
}
I know that to print an array read from the console, after converting its elements from string to int, we have to use a foreach loop. But how do I write the code to print this list of lists to the console?
Print the values line by line:
foreach (var line in matrix)
{
foreach (var item in line)
{
Console.Write(item+"\t");
}
Console.WriteLine();
}
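A variation on the nested loop above, sketched with string.Join so that each row is built in one call and no trailing tab is written. The MatrixPrinter class and FormatRows helper are hypothetical names, and the sample matrix is made up for illustration:

```csharp
using System;
using System.Collections.Generic;

static class MatrixPrinter
{
    // Formats each inner list as one tab-separated line.
    public static IEnumerable<string> FormatRows(List<List<int>> matrix)
    {
        foreach (List<int> row in matrix)
        {
            yield return string.Join("\t", row);
        }
    }

    static void Main()
    {
        var matrix = new List<List<int>>
        {
            new List<int> { 1, 2, 3 },
            new List<int> { 4, 5, 6 }
        };

        foreach (string line in FormatRows(matrix))
        {
            Console.WriteLine(line);
        }
    }
}
```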

Pluck a chunk out of Dictionary<string, int> [duplicate]

I am developing a C# program which has an IEnumerable<User> users that stores the ids of 4 million users. I need to loop through the IEnumerable and extract a batch of 1000 ids each time to perform some operations in another method.
How do I extract 1000 ids at a time from the start of the IEnumerable, do something else, then fetch the next batch of 1000, and so on?
Is this possible?
You can use MoreLINQ's Batch operator (available from NuGet):
foreach(IEnumerable<User> batch in users.Batch(1000))
// use batch
If adding the library is not an option, you can reuse its implementation:
public static IEnumerable<IEnumerable<T>> Batch<T>(
this IEnumerable<T> source, int size)
{
T[] bucket = null;
var count = 0;
foreach (var item in source)
{
if (bucket == null)
bucket = new T[size];
bucket[count++] = item;
if (count != size)
continue;
yield return bucket.Select(x => x);
bucket = null;
count = 0;
}
// Return the last bucket with all remaining elements
if (bucket != null && count > 0)
{
Array.Resize(ref bucket, count);
yield return bucket.Select(x => x);
}
}
BTW, for performance you can simply return bucket without calling Select(x => x). Select is optimized for arrays, but the selector delegate would still be invoked on each item. So in your case it's better to use
yield return bucket;
Sounds like you need the Skip and Take methods. Example:
users.Skip(1000).Take(1000)
This skips the first 1000 items and takes the next 1000. You just need to increase the amount skipped with each call.
You could pass an integer variable to Skip to control how much is skipped, and wrap the call in a method:
public IEnumerable<User> GetBatch(int pageNumber)
{
return users.Skip(pageNumber * 1000).Take(1000);
}
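One thing to keep in mind with Skip/Take paging over a lazy IEnumerable: every call starts enumerating the source again from the beginning, so later pages get progressively more expensive. The sketch below makes that visible with a counter. The SkipTakeCost class, the Ids generator and the GetPage helper are hypothetical stand-ins for the users sequence and GetBatch above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SkipTakeCost
{
    // How many items the source has yielded so far, across all enumerations.
    public static int ItemsProduced;

    // Hypothetical lazy source standing in for the users sequence.
    public static IEnumerable<int> Ids(int total)
    {
        for (int id = 0; id < total; id++)
        {
            ItemsProduced++;
            yield return id;
        }
    }

    // Same idea as GetBatch, with the source and page size as parameters.
    public static List<int> GetPage(IEnumerable<int> source, int pageNumber, int pageSize)
    {
        return source.Skip(pageNumber * pageSize).Take(pageSize).ToList();
    }

    static void Main()
    {
        IEnumerable<int> ids = Ids(3000);
        for (int page = 0; page < 3; page++)
        {
            List<int> batch = GetPage(ids, page, 1000);
            Console.WriteLine("page {0}: {1} ids, source has produced {2} items in total",
                page, batch.Count, ItemsProduced);
        }
    }
}
```

The running total grows faster than the number of ids returned, which is why a single-pass Batch method tends to be the better fit for millions of items. If the source is already a List<T> or an array, the cost of skipping is much less of a concern.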
The easiest way to do this is probably just to use the GroupBy method in LINQ:
var batches = myEnumerable
.Select((x, i) => new { x, i })
.GroupBy(p => p.i / 1000, p => p.x); // key: batch number, element: the original item
But for a more sophisticated solution, see this blog post on how to create your own extension method to do this. Duplicated here for posterity:
public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> collection, int batchSize)
{
List<T> nextbatch = new List<T>(batchSize);
foreach (T item in collection)
{
nextbatch.Add(item);
if (nextbatch.Count == batchSize)
{
yield return nextbatch;
nextbatch = new List<T>();
// not nextbatch.Clear(): that would empty the list the caller was just given
}
}
if (nextbatch.Count > 0)
yield return nextbatch;
}
How about
int batchsize = 5;
List<string> collection = new List<string> { "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"};
for (int x = 0; x < Math.Ceiling((decimal)collection.Count / batchsize); x++)
{
var t = collection.Skip(x * batchsize).Take(batchsize);
}
try using this:
public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
this IEnumerable<TSource> source,
int batchSize)
{
var batch = new List<TSource>();
foreach (var item in source)
{
batch.Add(item);
if (batch.Count == batchSize)
{
yield return batch;
batch = new List<TSource>();
}
}
if (batch.Any()) yield return batch;
}
and to use above function:
foreach (var list in Users.Batch(1000))
{
}
You can achieve that using the Take and Skip Enumerable extension methods. For more information on usage, check out LINQ 101.
Something like this would work:
List<MyClass> batch = new List<MyClass>();
foreach (MyClass item in items)
{
batch.Add(item);
if (batch.Count == 1000)
{
// Perform operation on batch
batch.Clear();
}
}
// Process last batch
if (batch.Any())
{
// Perform operation on batch
}
And you could generalize this into a generic method, like this:
static void PerformBatchedOperation<T>(IEnumerable<T> items,
Action<IEnumerable<T>> operation,
int batchSize)
{
List<T> batch = new List<T>();
foreach (T item in items)
{
batch.Add(item);
if (batch.Count == batchSize)
{
operation(batch);
batch.Clear();
}
}
// Process last batch
if (batch.Any())
{
operation(batch);
}
}
You can use the LINQ Take operator.
Link : http://msdn.microsoft.com/fr-fr/library/vstudio/bb503062.aspx
In a streaming context, where the enumerator might block in the middle of a batch simply because the next value has not been produced yet, it is useful to have a timeout so that a partial batch is emitted after a given time. I used this, for example, for tailing a cursor in MongoDB. It's a little complicated because the enumeration has to be done on another thread.
public static IEnumerable<List<T>> TimedBatch<T>(this IEnumerable<T> collection, double timeoutMilliseconds, long maxItems)
{
object _lock = new object();
List<T> batch = new List<T>();
AutoResetEvent yieldEventTriggered = new AutoResetEvent(false);
AutoResetEvent yieldEventFinished = new AutoResetEvent(false);
bool yieldEventTriggering = false;
var task = Task.Run(delegate
{
foreach (T item in collection)
{
lock (_lock)
{
batch.Add(item);
if (batch.Count == maxItems)
{
yieldEventTriggering = true;
yieldEventTriggered.Set();
}
}
if (yieldEventTriggering)
{
yieldEventFinished.WaitOne(); //wait for the yield to finish, and batch to be cleaned
yieldEventTriggering = false;
}
}
});
while (!task.IsCompleted)
{
//Wait for the event to be triggered, or the timeout to finish
yieldEventTriggered.WaitOne(TimeSpan.FromMilliseconds(timeoutMilliseconds));
lock (_lock)
{
if (batch.Count > 0) //yield return only if the batch accumulated something
{
yield return batch;
batch.Clear();
yieldEventFinished.Set();
}
}
}
task.Wait();
}

Split list in foreach loop to add new line each iteration

I want to print all the data line by line where each line contains "n" number of digits, n being user defined.
Something like:
void Print(List<int> list, int charactersPerLine)
{
// TODO: magic here
}
Usage
List<int> list = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };
Print(list, 4);
Desired output:
1234
5678
So far I tried:
List.AddRange(array);
List.Add(array2);
List.Add(array3);
foreach (int i in List)
{
Console.Write("{0}", i);
}
and when the loop writes to the console everything in the List<int> is written in a line, and the output is like:
12345678
Is this possible?
Use:
foreach (int i in List)
{
Console.WriteLine("{0}", i);
}
If your input is in {1,2,3,4,5,6,7,8} format, you need a condition that decides when to call Console.WriteLine() and when to call Console.Write():
const int LineCharacterLimit = 4;
int count = 0; // items written on the current line
foreach (int i in List)
{
count++;
if (count == LineCharacterLimit)
{
Console.WriteLine("{0}", i);
count = 0;
}
else
{
Console.Write("{0}", i);
}
}
You could use a StringBuilder first, then append a \n after each line.
StringBuilder str = new StringBuilder();
int count = 1;
foreach (int i in List)
{
str.Append(i.ToString());
if(count%4 ==0)
str.Append("\n");
count++;
}
Console.Write(str.ToString());
More generic version:
static class LinqExtensions
{
public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int batchSize)
{
int currentBatchSize = 0;
List<T> batch = new List<T>();
foreach (var e in source)
{
batch.Add(e);
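if (++currentBatchSize % batchSize == 0)
{
yield return batch;
batch = new List<T>(); // fresh list, so a batch the caller kept is not emptied
}
}
if (batch.Count > 0) // skip the empty tail when the count is an exact multiple of batchSize
yield return batch;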
}
}
I'm sure you can find something like this in morelinq package.
Usage:
static void Print(List<int> list, int charactersPerLine)
{
foreach (var batch in list.Batch(charactersPerLine))
{
var strBatch = batch.Select(e => e.ToString()).ToArray();
Console.WriteLine(string.Join("", strBatch)); // no separator, to match the desired output (1234)
}
}

How can I write strings to a file with line breaks in between?

I want to be able to write some values to a file whilst creating blank lines in between. Here is the code that I have so far:
TextWriter w_Test = new StreamWriter(file_test);
foreach (string results in searchResults)
{
w_Test.WriteLine(Path.GetFileNameWithoutExtension(results));
var list1 = File.ReadAllLines(results).Skip(10);
foreach (string listResult in list1)
{
w_Test.WriteLine(listResult);
}
}
w_Test.Close();
This creates 'Test' with the following output:
result1
listResult1
listResult2
result2
listResult3
result3
result4
I want to write the results so that each result block is 21 lines in size before writing the next, e.g.
result1
(20 lines even if no 'listResult' found)
result2
(20 lines even if no 'listResult' found)
etc.......
What would be the best way of doing this?
TextWriter w_Test = new StreamWriter(file_test);
foreach (string results in searchResults)
{
int noLinesOutput = 0;
w_Test.WriteLine(Path.GetFileNameWithoutExtension(results));
noLinesOutput++;
var list1 = File.ReadAllLines(results).Skip(10);
foreach (string listResult in list1)
{
w_Test.WriteLine(listResult);
noLinesOutput++;
}
for ( int i = 21; i > noLinesOutput; i-- ) // pad the block to 21 lines: 1 header + 20 body lines
w_Test.WriteLine();
}
w_Test.Close();
Here's a simple helper method I use in such cases:
// pad the sequence with 'elem' until it's 'count' elements long
static IEnumerable<T> PadRight<T>(IEnumerable<T> enm,
T elem,
int count)
{
int ii = 0;
foreach(var item in enm) // 'item', because 'elem' is already the padding parameter
{
yield return item;
ii += 1;
}
for (; ii < count; ++ii)
{
yield return elem;
}
}
Then
foreach (string listResult in
PadRight(list1, "", 20))
{
w_Test.WriteLine(listResult);
}
should do the trick.
Perhaps with this loop:
var lines = 20;
foreach(string fullPath in searchResults)
{
List<string> allLines = new List<string>();
allLines.Add(Path.GetFileNameWithoutExtension(fullPath));
int currentLine = 0;
foreach(string line in File.ReadLines(fullPath).Skip(10))
{
if(++currentLine > lines) break;
allLines.Add(line);
}
while (currentLine++ < lines)
allLines.Add(String.Empty);
File.AppendAllLines(file_test, allLines); // append to the output file; writing to fullPath would overwrite the input
}
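The padding can also be expressed with LINQ alone: append a run of empty strings to the body and take exactly the number of lines wanted. A minimal sketch, assuming each block is one header line followed by a fixed number of body lines; the FixedBlocks class and the Block helper are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FixedBlocks
{
    // Builds a block of exactly 1 + bodyLines lines: the header, then the body
    // padded with empty strings (or truncated) to bodyLines entries.
    public static List<string> Block(string header, IEnumerable<string> body, int bodyLines)
    {
        var block = new List<string> { header };
        block.AddRange(body.Concat(Enumerable.Repeat("", bodyLines)).Take(bodyLines));
        return block;
    }

    static void Main()
    {
        List<string> block = Block("result1", new[] { "listResult1", "listResult2" }, 20);
        Console.WriteLine("block has {0} lines", block.Count);
    }
}
```

Each block could then be written with w_Test.WriteLine in a loop, as in the question's code.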

Split large file into smaller files by number of lines in C#?

I am trying to figure out how to split a file by the number of lines in each file. The files are CSV and I can't do it by bytes; I need to do it by lines. 20k seems to be a good number per file. What is the best way to read a stream at a given position? Stream.BaseStream.Position? So if I read the first 20k lines, would I start the position at 39,999? How do I know I am almost at the end of a file? Thanks all.
using (System.IO.StreamReader sr = new System.IO.StreamReader("path"))
{
int fileNumber = 0;
while (!sr.EndOfStream)
{
int count = 0;
using (System.IO.StreamWriter sw = new System.IO.StreamWriter("other path" + ++fileNumber))
{
sw.AutoFlush = true;
while (!sr.EndOfStream && count++ < 20000) // post-increment, so each file gets a full 20000 lines
{
sw.WriteLine(sr.ReadLine());
}
}
}
}
int index=0;
var groups = from line in File.ReadLines("myfile.csv")
group line by index++/20000 into g
select g.AsEnumerable();
int file=0;
foreach (var group in groups)
File.WriteAllLines((file++).ToString(), group.ToArray());
I'd do it like this:
// helper method to break up into blocks lazily
public static IEnumerable<ICollection<T>> SplitEnumerable<T>
(this IEnumerable<T> Sequence, int NbrPerBlock) // 'this' makes it an extension method, as SplitFile expects; it must live in a static class
{
List<T> Group = new List<T>(NbrPerBlock);
foreach (T value in Sequence)
{
Group.Add(value);
if (Group.Count == NbrPerBlock)
{
yield return Group;
Group = new List<T>(NbrPerBlock);
}
}
if (Group.Any()) yield return Group; // flush out any remaining
}
// now it's trivial; if you want to make smaller files, just foreach
// over this and write out the lines in each block to a new file
public static IEnumerable<ICollection<string>> SplitFile(string filePath)
{
return File.ReadLines(filePath).SplitEnumerable(20000);
}
Is that not sufficient for you? You mention moving from position to position, but I don't see why that's necessary.
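For completeness, here is a sketch of the whole job done in one pass, without tracking stream positions: read lines lazily and switch to a new output file every N lines. The CsvSplitter class, the Split helper and the part1.csv, part2.csv, ... file names are hypothetical choices, not anything from the question:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CsvSplitter
{
    // Copies inputPath into files of at most linesPerFile lines each inside
    // outputDirectory and returns the paths of the files it created.
    public static List<string> Split(string inputPath, string outputDirectory, int linesPerFile)
    {
        if (linesPerFile < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(linesPerFile));
        }

        var created = new List<string>();
        StreamWriter writer = null;
        int linesInCurrentFile = 0;
        try
        {
            foreach (string line in File.ReadLines(inputPath))
            {
                // Open the next part file lazily, so empty input creates no files.
                if (writer == null || linesInCurrentFile == linesPerFile)
                {
                    if (writer != null) writer.Dispose();
                    string path = Path.Combine(outputDirectory, "part" + (created.Count + 1) + ".csv");
                    writer = new StreamWriter(path);
                    created.Add(path);
                    linesInCurrentFile = 0;
                }
                writer.WriteLine(line);
                linesInCurrentFile++;
            }
        }
        finally
        {
            if (writer != null) writer.Dispose();
        }
        return created;
    }

    static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.WriteLine("usage: <input file> <output directory>");
            return;
        }
        foreach (string path in Split(args[0], args[1], 20000))
        {
            Console.WriteLine(path);
        }
    }
}
```

Because the reader simply runs out of lines, there is no need to detect being "almost at the end" of the file; the last part is just shorter than the others.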
