I am using this code to extract a chunk from a file:
// info is FileInfo object pointing to file
var percentSplit = info.Length * 50 / 100; // extract 50% of file
var bytes = new byte[percentSplit];
var fileStream = File.OpenRead(fileName);
fileStream.Read(bytes, 0, bytes.Length);
fileStream.Dispose();
File.WriteAllBytes(splitName, bytes);
Is there any way to speed up this process?
Currently for a 530 MB file it takes around 4 - 5 seconds. Can this time be improved?
There are several aspects to your question, but none of them is language-specific.
Here are some things to consider:
What is the file system of the source/destination file?
Do you want to keep the original source file?
Do they lie on the same drive?
In C#, you will hardly find a method faster than File.Copy, which invokes the WinAPI CopyFile internally. Because the percentage is fifty, however, the following code might not be faster: it copies the whole file and then sets the length of the destination file.
var info = new FileInfo(fileName);
var percentSplit = info.Length * 50 / 100; // extract 50% of file
File.Copy(info.FullName, splitName);
using (var outStream = File.OpenWrite(splitName))
    outStream.SetLength(percentSplit);
Further, if
you don't keep the original source after the file is split,
the destination drive is the same as the source, and
you are not using a crypto/compression-enabled file system,
then the best thing you can do is not copy file contents at all.
For example, if your source file lies on a FAT or FAT32 file system, what you can do is:
create new directory entries for the newly split parts of the file,
point each entry to the first cluster of its target part,
set the correct file size for each entry, and
check for cross-linked clusters and avoid them.
If your file system was NTFS, you might need to spend a long time to study the spec.
Good luck!
var percentSplit = (int)(info.Length * 50 / 100); // extract 50% of file
var buffer = new byte[8192];
using (Stream input = File.OpenRead(info.FullName))
using (Stream output = File.OpenWrite(splitName))
{
int bytesRead = 1;
while (percentSplit > 0 && bytesRead > 0)
{
bytesRead = input.Read(buffer, 0, Math.Min(percentSplit, buffer.Length));
output.Write(buffer, 0, bytesRead);
percentSplit -= bytesRead;
}
output.Flush();
}
The flush may not be needed, but it doesn't hurt. This was quite interesting: changing the loop to a do-while rather than a while had a big hit on performance; I suppose the generated IL is not as fast. My PC was running the original code in 4-6 seconds, while the attached code seemed to run in about 1 second.
I get better results when reading/writing in chunks of a few megabytes. The performance also changes depending on the size of the chunk.
FileInfo info = new FileInfo(@"C:\source.bin");
FileStream f = File.OpenRead(info.FullName);
BinaryReader br = new BinaryReader(f);
FileStream t = File.OpenWrite(@"C:\split.bin");
BinaryWriter bw = new BinaryWriter(t);
long count = 0;
long split = info.Length * 50 / 100;
long chunk = 8000000;
DateTime start = DateTime.Now;
while (count < split)
{
if (count + chunk > split)
{
chunk = split - count;
}
bw.Write(br.ReadBytes((int)chunk));
count += chunk;
}
Console.WriteLine(DateTime.Now - start);
I would like to anticipate the exact size of my file before writing it to my device, to handle the error or prevent a crash in case there is no space on the corresponding drive. So I have this simple console script that generates the file:
using System;
using System.IO;
namespace myNamespace
{
class Program
{
static void Main(string[] args) {
byte[] myByteArray = new byte[100];
MemoryStream stream = new MemoryStream();
string fileName = "E:\\myFile.mine";
FileStream myFs = new FileStream(fileName, FileMode.CreateNew);
BinaryWriter toStreamWriter = new BinaryWriter(stream);
BinaryWriter toFileWriter = new BinaryWriter(myFs, System.Text.Encoding.ASCII);
myFs.Write(myByteArray, 0, myByteArray.Length);
for (int i = 0; i < 30000; i++) {
toStreamWriter.Write(i);
}
Console.WriteLine($"allocated memory: {stream.Capacity}" );
Console.WriteLine($"stream length {stream.Length}");
Console.WriteLine($"file size: {(stream.Length / 4) * 4.096 }");
toFileWriter.Write(stream.ToArray());
Console.ReadLine();
}
}
}
I got to the point where I can anticipate the size of the file.
It will be (stream.Length / 4) * 4.096, but only as long as the remainder of stream.Length / 4 is 0.
For example, in the case of adding 13589 integers to the stream:
for (int i = 0; i < 13589; i++) {
toStreamWriter.Write(i);
}
I get that the file size is 55660,544 bytes in the script, but then it's 57344 bytes in Explorer.
That is the same result as if 14000 integers had been added instead of 13589.
How can I anticipate the exact size of my created file when the remainder of stream.Length / 4 is not 0?
Edit: if you want to run the script to help out, note that you need to delete the created file every time the script is run! Of course, use a path and fileName of your choice :)
Regarding the relation (stream.Length / 4) * 4.096: the 4 comes from the byte size of an integer, and I guess that the 4.096 comes from the array and file generation; however, any further explanation would be much appreciated.
Edit 2: note that if the results for the remaining counts are logged with:
for (int i = 13589; i <= 14000; i++) {
Console.WriteLine($"result for {i} : {(i*4 / 4) * 4.096} ");
}
You obtain:
....
result for 13991 : 57307,136
result for 13992 : 57311,232
result for 13993 : 57315,328
result for 13994 : 57319,424
result for 13995 : 57323,52
result for 13996 : 57327,616
result for 13997 : 57331,712
result for 13998 : 57335,808
result for 13999 : 57339,904
result for 14000 : 57344
So I assume that the file size is rounded up to the next cluster boundary above the byte stream size, with no decimal remainder. Would this logic make sense for anticipating the file size, even if the stream is very big?
From what I understand from the comments, the question is about how to get the actual size of the file, not the file size on disk. And your code is actually almost correct in doing so.
The math is pretty basic. In your example, you create a file stream and write a 100 byte long array to the file stream. Then you create a memory stream and write 30000 integers into the memory stream. Then you write the memory stream into the file stream. Considering that each integer here is 4 bytes long, as specified by C#, the resulting file has a file size of (30000 * 4) + 100 = 120100 bytes. At least for me, that's exactly what the file properties say in Windows Explorer.
You could get the same result a bit easier with the following code:
FileStream myFs = new FileStream("test.file", FileMode.CreateNew);
byte[] myByteArray = new byte[100];
myFs.Write(myByteArray, 0, myByteArray.Length);
BinaryWriter toFileWriter = new BinaryWriter(myFs, System.Text.Encoding.ASCII);
for (int i = 0; i < 30000; i++)
{
toFileWriter.Write(i);
}
Console.WriteLine($"stream length {myFs.Length}");
myFs.Close();
This will return a stream length of 120100 bytes.
In case I misunderstood your question and comments and you were actually trying to get the file size on disk:
Don't go there. You cannot reliably predict the file size on disk due to variable circumstances. For example, file compression, encryption, various RAID types, various file systems, various disk types, various operating systems.
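That said, if a rough estimate is good enough, one common approximation on a plain NTFS/FAT volume is to round the logical size up to the volume's cluster size. The following is only a sketch of that idea (the P/Invoke declaration and the EstimateSizeOnDisk helper are illustrative, not part of your code), and it deliberately ignores compression, sparse files, and tiny files stored inside the MFT:
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class DiskSizeEstimator
{
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    static extern bool GetDiskFreeSpace(string rootPathName,
        out uint sectorsPerCluster, out uint bytesPerSector,
        out uint numberOfFreeClusters, out uint totalNumberOfClusters);

    // Rounds a logical file size up to the next cluster boundary of the given drive.
    public static long EstimateSizeOnDisk(long logicalSize, string driveRoot) // e.g. "E:\\"
    {
        uint sectorsPerCluster, bytesPerSector, freeClusters, totalClusters;
        if (!GetDiskFreeSpace(driveRoot, out sectorsPerCluster, out bytesPerSector,
                              out freeClusters, out totalClusters))
            throw new Win32Exception(); // wraps the last Win32 error

        long clusterSize = (long)sectorsPerCluster * bytesPerSector; // typically 4096
        return (logicalSize + clusterSize - 1) / clusterSize * clusterSize;
    }
}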
We're writing an application to move content from a OneDrive account into Azure Storage. We've managed to get this working, but we ran into memory issues working with big files (> 1 GB) and Block Blobs. We've decided that Append Blobs are the best way forward, as that will solve the memory issues.
We're using a RPC call to SharePoint to get the file stream for big files, more info can be found here:
http://sharepointfieldnotes.blogspot.co.za/2009/09/downloading-content-from-sharepoint-let.html
Using the following code is working fine when writing the file from OneDrive to local storage
using (var strOut = System.IO.File.Create("path"))
using (var sr = wReq.GetResponse().GetResponseStream())
{
byte[] buffer = new byte[16 * 1024];
int read;
bool isHtmlRemoved = false;
while ((read = sr.Read(buffer, 0, buffer.Length)) > 0)
{
if (!isHtmlRemoved)
{
string result = Encoding.UTF8.GetString(buffer);
int startPos = result.IndexOf("</html>");
if (startPos > -1)
{
//get the length of the text, '</html>' as well
startPos += 8;
strOut.Write(buffer, startPos, read - startPos);
isHtmlRemoved = true;
}
}
else
{
strOut.Write(buffer, 0, read);
}
}
}
This creates the file with the correct size, but when we try to write it to an append blob in Azure Storage, we are not getting the complete file and in other cases getting bigger files.
using (var sr = wReq.GetResponse().GetResponseStream())
{
byte[] buffer = new byte[16 * 1024];
int read;
bool isHtmlRemoved = false;
while ((read = sr.Read(buffer, 0, buffer.Length)) > 0)
{
if (!isHtmlRemoved)
{
string result = Encoding.UTF8.GetString(buffer);
int startPos = result.IndexOf("</html>");
if (startPos > -1)
{
//get the length of the text, '</html>' as well
startPos += 8;
//strOut.Write(buffer, startPos, read - startPos);
appendBlob.UploadFromByteArray(buffer, startPos, read - startPos);
isHtmlRemoved = true;
}
}
else
{
//strOut.Write(buffer, 0, read);
appendBlob.AppendFromByteArray(buffer, 0, read);
}
}
}
Is this the correct way of doing it? Why would we be getting different file sizes?
Any suggestions will be appreciated
Thanks
In response to "Why would we be getting different file sizes?":
From the CloudAppendBlob.AppendFromByteArray documentation: "This API should be used strictly in a single writer scenario because the API internally uses the append-offset conditional header to avoid duplicate blocks which does not work in a multiple writer scenario." If you are indeed using a single writer, you need to explicitly set BlobRequestOptions.AbsorbConditionalErrorsOnRetry to true.
You can also check whether you are exceeding the 50,000 committed block limit. Your block sizes are relatively small, so this is a possibility with sufficiently large files (> 16 KB * 50,000 = 0.82 GB).
In response to "Is this the correct way of doing it?":
If you feel you need to use Append Blobs, try using the CloudAppendBlob.OpenWrite method to achieve functionality more similar to your code example for local storage.
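For example, something along these lines might work, assuming the WindowsAzure.Storage SDK's CloudAppendBlob.OpenWrite(bool createNew) overload and reusing the wReq and appendBlob objects from your snippet (a sketch only; error handling and the HTML-stripping logic are omitted):
using (var sr = wReq.GetResponse().GetResponseStream())
using (var blobStream = appendBlob.OpenWrite(createNew: true))
{
    byte[] buffer = new byte[16 * 1024];
    int read;
    while ((read = sr.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Strip the leading HTML here exactly as in your local-storage version,
        // then write the remaining bytes to the blob stream.
        blobStream.Write(buffer, 0, read);
    }
}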
Block Blobs seem like they might be a more appropriate fit for your scenario. Can you please post the code you were using to upload Block Blobs? You should be able to upload to Block Blobs without running out of memory. You can upload different blocks in parallel to achieve faster throughput. Using Append Blobs to append (relatively) small blocks will result in degradation of sequential read performance, as currently append blocks are not defragmented.
Please let me know if any of these solutions work for you!
I have written an application that implements a file copy, written as below. I was wondering why, when attempting to copy from one network drive to another network drive, the copy times are huge (20-30 minutes to copy a 300 MB file) with the following code:
public static void CopyFileToDestination(string source, string dest)
{
_log.Debug(string.Format("Copying file {0} to {1}", source, dest));
DateTime start = DateTime.Now;
string destinationFolderPath = Path.GetDirectoryName(dest);
if (!Directory.Exists(destinationFolderPath))
{
Directory.CreateDirectory(destinationFolderPath);
}
if (File.Exists(dest))
{
File.Delete(dest);
}
FileInfo sourceFile = new FileInfo(source);
if (!sourceFile.Exists)
{
throw new FileNotFoundException("source = " + source);
}
long totalBytesToTransfer = sourceFile.Length;
if (!CheckForFreeDiskSpace(dest, totalBytesToTransfer))
{
throw new ApplicationException(string.Format("Unable to copy file {0}: Not enough disk space on drive {1}.",
source, dest.Substring(0, 1).ToUpper()));
}
long bytesTransferred = 0;
using (FileStream reader = sourceFile.OpenRead())
{
using (FileStream writer = new FileStream(dest, FileMode.OpenOrCreate, FileAccess.Write))
{
byte[] buf = new byte[64 * 1024];
int bytesRead = reader.Read(buf, 0, buf.Length);
double lastPercentage = 0;
while (bytesRead > 0)
{
double percentage = ((float)bytesTransferred / (float)totalBytesToTransfer) * 100.0;
writer.Write(buf, 0, bytesRead);
bytesTransferred += bytesRead;
if (Math.Abs(lastPercentage - percentage) > 0.25)
{
System.Diagnostics.Debug.WriteLine(string.Format("{0} : Copied {1:#,##0} of {2:#,##0} MB ({3:0.0}%)",
sourceFile.Name,
bytesTransferred / (1024 * 1024),
totalBytesToTransfer / (1024 * 1024),
percentage));
lastPercentage = percentage;
}
bytesRead = reader.Read(buf, 0, buf.Length);
}
}
}
System.Diagnostics.Debug.WriteLine(string.Format("{0} : Done copying", sourceFile.Name));
_log.Debug(string.Format("{0} copied in {1:#,##0} seconds", sourceFile.Name, (DateTime.Now - start).TotalSeconds));
}
However, with a simple File.Copy, the time is as expected.
Does anyone have any insight? Could it be because we are making the copy in small chunks?
Changing the size of your buf variable doesn't change the size of the buffer that FileStream.Read or FileStream.Write use when communicating with the file system. To see any change with buffer size, you have to specify the buffer size when you open the file.
As I recall, the default buffer size is 4K. Performance testing I did some time ago showed that the sweet spot is somewhere between 64K and 256K, with 64K being more consistently the best choice.
You should change your File.OpenRead() to:
new FileStream(sourceFile.FullName, FileMode.Open, FileAccess.Read, FileShare.None, BufferSize)
Change the FileShare value if you don't want exclusive access, and declare BufferSize as a constant equal to whatever buffer size you want. I use 64*1024.
Also, change the way you open your output file to:
new FileStream(dest, FileMode.Create, FileAccess.Write, FileShare.None, BufferSize)
Note that I used FileMode.Create rather than FileMode.OpenOrCreate. If you use OpenOrCreate and the source file is smaller than the existing destination file, I don't think the file is truncated when you're done writing. So the destination file would contain extraneous data.
That said, I wouldn't expect this to change your copy time from 20-30 minutes down to the few seconds that it should take. I suppose it could if every low-level read requires a network call. With the default 4K buffer, you're making 16 read calls to the file system in order to fill your 64K buffer. So by increasing your buffer size you greatly reduce the number of OS calls (and potentially the number of network transactions) your code makes.
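Putting the two suggestions together, the copy loop from your method might look roughly like this (a sketch only; BufferSize is whatever value you settle on, and the progress reporting is left out):
const int BufferSize = 64 * 1024;

using (var reader = new FileStream(sourceFile.FullName, FileMode.Open, FileAccess.Read,
                                   FileShare.None, BufferSize))
using (var writer = new FileStream(dest, FileMode.Create, FileAccess.Write,
                                   FileShare.None, BufferSize))
{
    byte[] buf = new byte[BufferSize];
    int bytesRead;
    while ((bytesRead = reader.Read(buf, 0, buf.Length)) > 0)
    {
        writer.Write(buf, 0, bytesRead);
    }
}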
Finally, there's no need to check to see if a file exists before you delete it. File.Delete silently ignores an attempt to delete a file that doesn't exist.
Call the SetLength method on your writer stream before the actual copying; this should reduce the number of operations performed by the target disk.
Like so
writer.SetLength(totalBytesToTransfer);
You may need to set the stream's position back to the start after calling this method by using Seek. Check the position of the stream after calling SetLength; it should still be zero.
writer.Seek(0, SeekOrigin.Begin); // Not sure on that one
If that is still too slow, use the SetFileValidData Windows API function.
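For reference, SetFileValidData is a Win32 function, so it has to be called through P/Invoke, and the process needs the SE_MANAGE_VOLUME_NAME privilege; treat the following fragments as a best-effort sketch rather than something the answer above spells out:
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

[DllImport("kernel32.dll", SetLastError = true)]
static extern bool SetFileValidData(SafeFileHandle hFile, long validDataLength);

// ... after writer.SetLength(totalBytesToTransfer):
if (!SetFileValidData(writer.SafeFileHandle, totalBytesToTransfer))
{
    // Privilege not held or unsupported file system; safe to ignore and copy anyway.
}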
I am developing a TCP file transfer client-server program. At the moment I am able to send text files and other file formats perfectly fine, such as .zip with all contents intact on the server end. However, when I transfer a .gif the end result is a gif with same size as the original but with only part of the image showing as if most of the bytes were lost or not written correctly on the server end.
The client sends a 1KB header packet with the name and size of the file to the server. The server then responds with OK if ready and then creates a fileBuffer as large as the file to be sent is.
Here is some code to demonstrate my problem:
// Serverside method snippet dealing with data being sent
while (true)
{
// Spin the data in
if (streams[0].DataAvailable)
{
streams[0].Read(fileBuffer, 0, fileBuffer.Length);
break;
}
}
// Finished receiving file, write from buffer to created file
FileStream fs = File.Open(LOCAL_FOLDER + fileName, FileMode.CreateNew, FileAccess.Write);
fs.Write(fileBuffer, 0, fileBuffer.Length);
fs.Close();
Print("File successfully received.");
// Clientside method snippet dealing with a file send
while(true)
{
con.Read(ackBuffer, 0, ackBuffer.Length);
// Wait for OK response to start sending
if (Encoding.ASCII.GetString(ackBuffer) == "OK")
{
// Convert file to bytes
FileStream fs = new FileStream(inPath, FileMode.Open, FileAccess.Read);
fileBuffer = new byte[fs.Length];
fs.Read(fileBuffer, 0, (int)fs.Length);
fs.Close();
con.Write(fileBuffer, 0, fileBuffer.Length);
con.Flush();
break;
}
}
I've tried a binary writer instead of just using the filestream with the same result.
Am I incorrect in believing successful file transfer to be as simple as conversion to bytes, transportation and then conversion back to filename/type?
All help/advice much appreciated.
It's not about your image; it's about your code.
If your image bytes were lost or not written correctly, that means your file-transfer code is wrong, and even a .zip file or any other file you receive could end up corrupted.
It's a huge mistake to set the byte buffer length to the file size. Imagine that you're going to send a large file of about 1 GB; then it's going to take 1 GB of RAM. For an ideal transfer you should loop over the file as you send it.
Here's a way to send/receive files nicely with no size limitation.
Send File
using (FileStream fs = new FileStream(srcPath, FileMode.Open, FileAccess.Read))
{
long fileSize = fs.Length;
long sum = 0; //sum here is the total of sent bytes.
int count = 0;
data = new byte[1024]; // 1 KB buffer .. you might use a different size also.
while (sum < fileSize)
{
count = fs.Read(data, 0, data.Length);
network.Write(data, 0, count);
sum += count;
}
network.Flush();
}
Receive File
long fileSize = ...; // the size of the file that you are going to receive
using (FileStream fs = new FileStream(destPath, FileMode.Create, FileAccess.Write))
{
int count = 0;
long sum = 0; //sum here is the total of received bytes.
data = new byte[1024 * 8]; // 8 KB buffer .. you might use a smaller size also.
while (sum < fileSize)
{
if (network.DataAvailable)
    {
        count = network.Read(data, 0, data.Length);
        fs.Write(data, 0, count);
        sum += count;
    }
}
}
happy coding :)
When you write over TCP, the data can arrive in a number of packets. I think your early tests happened to fit into one packet, but this gif file is arriving in 2 or more. So when you call Read, you'll only get what's arrived so far - you'll need to check repeatedly until you've got as many bytes as the header told you to expect.
I found Beej's guide to network programming a big help when doing some work with TCP.
As others have pointed out, the data doesn't necessarily all arrive at once, and your code is overwriting the beginning of the buffer each time through the loop. The more robust way to write your reading loop is to read as many bytes as are available and increment a counter to keep track of how many bytes have been read so far so that you know where to put them in the buffer. Something like this works well:
int totalBytesRead = 0;
int bytesRead;
do
{
bytesRead = streams[0].Read(fileBuffer, totalBytesRead, fileBuffer.Length - totalBytesRead);
totalBytesRead += bytesRead;
} while (bytesRead != 0);
Stream.Read will return 0 when there's no data left to read.
Doing things this way will perform better than reading a byte at a time. It also gives you a way to ensure that you read the proper number of bytes. If totalBytesRead is not equal to the number of bytes you expected when the loop is finished, then something bad happened.
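Since your protocol already sends the file size in the 1 KB header packet, you can also make the loop stop exactly when the whole file has arrived. A sketch along those lines, reusing streams[0] and fileBuffer from your code, with a hypothetical fileSize (an int parsed from the header):
int totalBytesRead = 0;
while (totalBytesRead < fileSize)
{
    int bytesRead = streams[0].Read(fileBuffer, totalBytesRead, fileSize - totalBytesRead);
    if (bytesRead == 0)
        throw new IOException("Connection closed before the whole file arrived.");
    totalBytesRead += bytesRead;
}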
Thanks for your input, Tvanfosson. I tinkered around with my code and managed to get it working; the synchronization between my client and server was off. I took your advice, though, and replaced the single read with reading a byte at a time.
I have to split a huge file into many smaller files. Each of the destination files is defined by an offset and length as the number of bytes. I'm using the following code:
private void copy(string srcFile, string dstFile, int offset, int length)
{
BinaryReader reader = new BinaryReader(File.OpenRead(srcFile));
reader.BaseStream.Seek(offset, SeekOrigin.Begin);
byte[] buffer = reader.ReadBytes(length);
BinaryWriter writer = new BinaryWriter(File.OpenWrite(dstFile));
writer.Write(buffer);
}
Considering that I have to call this function about 100,000 times, it is remarkably slow.
Is there a way to make the Writer connected directly to the Reader? (That is, without actually loading the contents into the Buffer in memory.)
I don't believe there's anything within .NET to allow copying a section of a file without buffering it in memory. However, it strikes me that this is inefficient anyway, as it needs to open the input file and seek many times. If you're just splitting up the file, why not open the input file once, and then just write something like:
public static void CopySection(Stream input, string targetFile, int length)
{
byte[] buffer = new byte[8192];
using (Stream output = File.OpenWrite(targetFile))
{
int bytesRead = 1;
// This will finish silently if we couldn't read "length" bytes.
// An alternative would be to throw an exception
while (length > 0 && bytesRead > 0)
{
bytesRead = input.Read(buffer, 0, Math.Min(length, buffer.Length));
output.Write(buffer, 0, bytesRead);
length -= bytesRead;
}
}
}
This has a minor inefficiency in creating a buffer on each invocation - you might want to create the buffer once and pass that into the method as well:
public static void CopySection(Stream input, string targetFile,
int length, byte[] buffer)
{
using (Stream output = File.OpenWrite(targetFile))
{
int bytesRead = 1;
// This will finish silently if we couldn't read "length" bytes.
// An alternative would be to throw an exception
while (length > 0 && bytesRead > 0)
{
bytesRead = input.Read(buffer, 0, Math.Min(length, buffer.Length));
output.Write(buffer, 0, bytesRead);
length -= bytesRead;
}
}
}
Note that this also closes the output stream (due to the using statement) which your original code didn't.
The important point is that this will use the operating system file buffering more efficiently, because you reuse the same input stream, instead of reopening the file at the beginning and then seeking.
I think it'll be significantly faster, but obviously you'll need to try it to see...
This assumes contiguous chunks, of course. If you need to skip bits of the file, you can do that from outside the method. Also, if you're writing very small files, you may want to optimise for that situation too - the easiest way to do that would probably be to introduce a BufferedStream wrapping the input stream.
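For the BufferedStream idea, the wrapping could look something like this (a sketch only; srcFile and the per-section calls are placeholders):
using (Stream raw = File.OpenRead(srcFile))
using (Stream input = new BufferedStream(raw, 1024 * 1024)) // 1 MB read-ahead buffer
{
    byte[] buffer = new byte[8192];
    // call CopySection(input, targetFile, length, buffer) for each section, in file order
}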
The fastest way to do file I/O from C# is to use the Windows ReadFile and WriteFile functions. I have written a C# class that encapsulates this capability as well as a benchmarking program that looks at different I/O methods, including BinaryReader and BinaryWriter. See my blog post at:
http://designingefficientsoftware.wordpress.com/2011/03/03/efficient-file-io-from-csharp/
How large is length? You may do better to re-use a fixed-size (moderately large, but not obscene) buffer, and forget BinaryReader... just use Stream.Read and Stream.Write.
(edit) something like:
private static void copy(string srcFile, string dstFile, int offset,
int length, byte[] buffer)
{
using(Stream inStream = File.OpenRead(srcFile))
using (Stream outStream = File.OpenWrite(dstFile))
{
inStream.Seek(offset, SeekOrigin.Begin);
int bufferLength = buffer.Length, bytesRead;
while (length > bufferLength &&
(bytesRead = inStream.Read(buffer, 0, bufferLength)) > 0)
{
outStream.Write(buffer, 0, bytesRead);
length -= bytesRead;
}
while (length > 0 &&
(bytesRead = inStream.Read(buffer, 0, length)) > 0)
{
outStream.Write(buffer, 0, bytesRead);
length -= bytesRead;
}
}
}
You shouldn't re-open the source file each time you do a copy; better to open it once and pass the resulting BinaryReader to the copy function. Also, it might help if you order your seeks, so you don't make big jumps inside the file.
If the lengths aren't too big, you can also try to group several copy calls by grouping offsets that are near to each other and reading the whole block you need for them, for example:
offset = 1234, length = 34
offset = 1300, length = 40
offset = 1350, length = 1000
can be grouped to one read:
offset = 1234, length = 1074
Then you only have to "seek" in your buffer and can write the three new files from there without having to read again.
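As a rough illustration of that grouping (file names, offsets, and the SubArray helper here are made up for the example):
using (var reader = new BinaryReader(File.OpenRead(srcFile)))
{
    reader.BaseStream.Seek(1234, SeekOrigin.Begin);
    byte[] block = reader.ReadBytes(1074); // one read covering all three parts

    File.WriteAllBytes("out1.bin", SubArray(block, 1234 - 1234, 34));
    File.WriteAllBytes("out2.bin", SubArray(block, 1300 - 1234, 40));
    File.WriteAllBytes("out3.bin", SubArray(block, 1350 - 1234, 1000));
}

static byte[] SubArray(byte[] source, int offset, int count)
{
    byte[] result = new byte[count];
    Buffer.BlockCopy(source, offset, result, 0, count);
    return result;
}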
Have you considered using the CCR? Since you are writing to separate files, you can do everything in parallel (read and write), and the CCR makes it very easy to do this.
static void Main(string[] args)
{
Dispatcher dp = new Dispatcher();
DispatcherQueue dq = new DispatcherQueue("DQ", dp);
Port<long> offsetPort = new Port<long>();
Arbiter.Activate(dq, Arbiter.Receive<long>(true, offsetPort,
new Handler<long>(Split)));
FileStream fs = File.Open(file_path, FileMode.Open);
long size = fs.Length;
fs.Dispose();
for (long i = 0; i < size; i += split_size)
{
offsetPort.Post(i);
}
}
private static void Split(long offset)
{
FileStream reader = new FileStream(file_path, FileMode.Open,
FileAccess.Read);
reader.Seek(offset, SeekOrigin.Begin);
long toRead = 0;
if (offset + split_size <= reader.Length)
toRead = split_size;
else
toRead = reader.Length - offset;
byte[] buff = new byte[toRead];
reader.Read(buff, 0, (int)toRead);
reader.Dispose();
File.WriteAllBytes("c:\\out" + offset + ".txt", buff);
}
This code posts offsets to a CCR port which causes a Thread to be created to execute the code in the Split method. This causes you to open the file multiple times but gets rid of the need for synchronization. You can make it more memory efficient but you'll have to sacrifice speed.
The first thing I would recommend is to take measurements. Where are you losing your time? Is it in the read, or the write?
Over 100,000 accesses (sum the times):
How much time is spent allocating the buffer array?
How much time is spent opening the file for read (is it the same file every time?)
How much time is spent in read and write operations?
If you aren't doing any type of transformation on the file, do you need a BinaryWriter, or can you use a filestream for writes? (try it, do you get identical output? does it save time?)
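One simple way to get those numbers is a pair of Stopwatch instances (from System.Diagnostics) accumulated across all the calls; the sketch below uses the reader/writer names from the original copy method and is only meant to show the shape of the measurement:
// declared once, outside the copy method
static readonly Stopwatch readTime = new Stopwatch();
static readonly Stopwatch writeTime = new Stopwatch();

// inside copy():
readTime.Start();
byte[] buffer = reader.ReadBytes(length);
readTime.Stop();

writeTime.Start();
writer.Write(buffer);
writeTime.Stop();

// after all 100,000 calls:
Console.WriteLine("read: {0} ms, write: {1} ms",
    readTime.ElapsedMilliseconds, writeTime.ElapsedMilliseconds);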
Using FileStream + StreamWriter I know it's possible to create massive files in little time (less than 1 min 30 seconds). I generate three files totaling 700+ megabytes from one file using that technique.
Your primary problem with the code you're using is that you are opening a file every time. That is creating file I/O overhead.
If you knew the names of the files you would be generating ahead of time, you could extract the File.OpenWrite into a separate method; it will increase the speed. Without seeing the code that determines how you are splitting the files, I don't think you can get much faster.
No one suggests threading? Writing the smaller files looks like a textbook example of where threads are useful. Set up a bunch of threads to create the smaller files; this way, you can create them all in parallel and you don't need to wait for each one to finish. My assumption is that creating the files (a disk operation) will take WAY longer than splitting up the data. And of course you should verify first that a sequential approach is not adequate.
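On .NET 4.0 that can be as simple as a Parallel.ForEach (System.Threading.Tasks) over the section list; the sketch below invents a sections collection with Offset, Length, and TargetFile properties just to show the shape:
Parallel.ForEach(sections, section =>
{
    using (var input = File.OpenRead(srcFile)) // each task gets its own handle
    using (var output = File.OpenWrite(section.TargetFile))
    {
        input.Seek(section.Offset, SeekOrigin.Begin);
        byte[] buffer = new byte[64 * 1024];
        long remaining = section.Length;
        int read;
        while (remaining > 0 &&
               (read = input.Read(buffer, 0, (int)Math.Min(buffer.Length, remaining))) > 0)
        {
            output.Write(buffer, 0, read);
            remaining -= read;
        }
    }
});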
(For future reference.)
Quite possibly the fastest way to do this would be to use memory mapped files (so primarily copying memory, and the OS handling the file reads/writes via its paging/memory management).
Memory Mapped files are supported in managed code in .NET 4.0.
But as noted, you need to profile, and expect to switch to native code for maximum performance.
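For reference, a minimal sketch of the memory-mapped approach on .NET 4.0 (System.IO.MemoryMappedFiles; srcFile, dstFile, offset, and length are placeholders):
using (var mmf = MemoryMappedFile.CreateFromFile(srcFile, FileMode.Open))
using (var view = mmf.CreateViewStream(offset, length, MemoryMappedFileAccess.Read))
using (var output = File.OpenWrite(dstFile))
{
    view.CopyTo(output); // the OS pages the source in; no intermediate byte[] of size 'length'
}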