Since my original question was a bit too vague, let me clarify.
My goals are:
to estimate the blank disc size after selecting a filesystem via IMAPI
to estimate the space my file will consume on this disc if I burn it.
What I would like to know:
Is it possible to get the bytes-per-sector value for the selected file system programmatically?
If not, is there a default bytes-per-sector value that IMAPI uses for different file systems / media types, and is it documented anywhere?
OK, so the short answer to my question is: one can safely assume that the sector size for DVD/BD discs is 2048 bytes.
The reason I was getting different sizes during my debug sessions was an error in the code that retrieved the sector count :)
The code block in question was copied from http://www.codeproject.com/Articles/24544/Burning-and-Erasing-CD-DVD-Blu-ray-Media-with-C-an , so just in case I'm posting a quick fix.
Original code:
discFormatData = new MsftDiscFormat2Data();
discFormatData.Recorder = discRecorder;
IMAPI_MEDIA_PHYSICAL_TYPE mediaType = discFormatData.CurrentPhysicalMediaType;
fileSystemImage = new MsftFileSystemImage();
fileSystemImage.ChooseImageDefaultsForMediaType(mediaType);
if (!discFormatData.MediaHeuristicallyBlank)
{
    fileSystemImage.MultisessionInterfaces = discFormatData.MultisessionInterfaces;
    fileSystemImage.ImportFileSystem();
}
Int64 freeMediaBlocks = fileSystemImage.FreeMediaBlocks;
Fixed code:
discFormatData = new MsftDiscFormat2Data { Recorder = discRecorder };
fileSystemImage = new MsftFileSystemImage();
fileSystemImage.ChooseImageDefaults(discRecorder);
if (!discFormatData.MediaHeuristicallyBlank)
{
    fileSystemImage.MultisessionInterfaces = discFormatData.MultisessionInterfaces;
    fileSystemImage.ImportFileSystem();
}
Int64 freeMediaBlocks = fileSystemImage.FreeMediaBlocks;
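With the fixed code, the free-space estimate then becomes a one-liner. A minimal sketch, assuming the 2048-byte sector size for DVD/BD media discussed above (the constant is my assumption, not something IMAPI reports here):
// Sketch: estimate free space in bytes from the block count.
// 2048 bytes per sector is assumed for DVD/BD media (see above).
const long BytesPerSector = 2048;
long freeMediaBytes = freeMediaBlocks * BytesPerSector;
Console.WriteLine("Estimated free space: {0} bytes", freeMediaBytes);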
If you know the number of free/used blocks and the total size of the storage volume (regardless of how much is used or free), you can calculate the size per block and then work the rest out:
block size = total size / (blocks used + blocks free)
free space = block size * blocks free
I'd be surprised if you found the block size was anything other than 1K, though.
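As a quick worked example of those two formulas (the numbers are invented purely for illustration):
// Worked example of the formulas above (illustrative numbers only).
long totalSize = 4700000000;   // total capacity of the volume in bytes
long blocksUsed = 100000;
long blocksFree = 2194824;
long blockSize = totalSize / (blocksUsed + blocksFree); // block size = total size / (used + free)
long freeSpace = blockSize * blocksFree;                // free space = block size * blocks free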
via IMAPI - IWriteEngine2::get_BytesPerSector
http://msdn.microsoft.com/en-us/library/windows/desktop/aa832661(v=vs.85).aspx
This project uses a managed IMAPI2 wrapper to make life easier - http://www.codeproject.com/Articles/24544/Burning-and-Erasing-CD-DVD-Blu-ray-Media-with-C-an
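A rough sketch of querying that property from C#, assuming an IMAPI2 COM interop that exposes the MsftWriteEngine2 coclass and its BytesPerSector property (your wrapper may name these slightly differently):
// Sketch only: query IWriteEngine2::get_BytesPerSector through the COM interop.
// MsftWriteEngine2/IWriteEngine2 are the IMAPI2 names; a managed wrapper may
// expose them under different identifiers.
IWriteEngine2 writeEngine = new MsftWriteEngine2();
int bytesPerSector = writeEngine.BytesPerSector;
Console.WriteLine("Bytes per sector: {0}", bytesPerSector);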
I have to translate a project from C# to R. In this C# project I have to handle binary files.
I have three problems:
1. I am having some issues converting this code:
// C#
// this works fine
using (BinaryReader rb = new BinaryReader(archive.Entries[0].Open()))
{
    a = rb.ReadInt32();
    b = rb.ReadInt32();
    c = rb.ReadDouble();
}
# R
# this works, but it reads different values
# I tried to change the size in readBin, but it's the same story. The working directory is the right one.
to.read <- "myBinaryFile.tmp"
line1<-c(readBin(to.read,"integer",2),
readBin(to.read,"double",1))
2. How can I read a float (in C# I have rb.ReadSingle()) in R?
3. Is there a function in R to remember the position you have reached while reading a binary file, so that the next time you read it you can skip what you have already read (as in C# with BinaryReader)?
Answering your questions directly:
I am having some issues converting this code...
What is the problem here? Your code block contains the comment "but it's the same story", but what is the story? You haven't explained anything here. If your problem is with the double, try setting readBin(..., size = 8). In your case, the code would read line1 <- c(readBin(to.read, "integer", 2), readBin(to.read, "double", 1, 8)).
How can I read a float (in C# I have rb.ReadSingle()) in R?
Floats are 4 bytes in size in this case (I would presume), so set size = 4 in readBin().
Is there a function in R to remember the position you have reached while reading a binary file, so that the next time you read it you can skip what you have already read (as in C# with BinaryReader)?
As far as I know there is nothing built in (more knowledgeable people are welcome to add their input). You could, however, easily write a wrapper around readBin() that does this for you. For instance, you could specify how many bytes you want to discard (i.e., the n bytes that you have already read into R) and consume them via a dummy call such as readBin(con = yourinput, what = "raw", n = n), where the integer n indicates the number of bytes you wish to throw away. After that, your wrapper can read the succeeding bytes into a variable of your choice.
In my web application I use LeadTools to create a multi-page TIFF file from a stream. Below is the code that shows how I use LeadTools.
using (RasterCodecs codecs = new RasterCodecs())
{
    RasterImage ImageToAppened = default(RasterImage);
    RasterImage imageSrc = default(RasterImage);
    codecs.Options.Load.AllPages = true;
    ImageToAppened = codecs.Load(fullInputPath, 1);
    FileInfo fileInfooutputTiff = new FileInfo(fullOutputPath);
    if (fileInfooutputTiff.Exists)
    {
        imageSrc = codecs.Load(fullOutputPath);
        imageSrc.AddPage(ImageToAppened);
        codecs.Save(imageSrc, fullOutputPath, RasterImageFormat.Ccitt, 1);
    }
    else
    {
        codecs.Save(ImageToAppened, fullOutputPath, RasterImageFormat.Ccitt, 1);
    }
}
The above code works properly, and my web application receives a lot of requests, around 2000. In some cases I get the error below, but later it works properly again for other requests.
You have exceeded the amount of memory allowed for RasterImage allocations.See RasterDefaults::MemoryThreshold::MaximumGlobalRasterImageMemory.
Is this memory limit per request, or is it for all objects allocated since the application started (a global limit)?
So what is the solution for this error?
The error you report references the MaximumGlobalRasterImageMemory:
You have exceeded the amount of memory allowed for RasterImage allocations.See RasterDefaults::MemoryThreshold::MaximumGlobalRasterImageMemory.
In the documentation it states:
Gets or sets a value that specifies the maximum size allowed for all RasterImage object allocations.
When allocating a new RasterImage object, if the new allocation causes the total memory used by all allocated RasterImage objects to exceed the value of MaximumGlobalRasterImageMemory, then the allocation will throw an exception.
So it looks like it's for all objects.
These are the specified default values:
On x86 systems, this property defaults to 1.5 GB.
On x64 systems, this property defaults to either 1.5 GB or 75 percent of the system's total physical RAM, whichever is larger.
I would advise that you familiarise yourself with the documentation for the SDK.
When handling files with many pages, here are a few general tips that could help with both web and desktop applications:
Avoid loading all pages and adding them to one RasterImage in memory. Instead, loop through them, load them one (or a few) at a time, and append them to the output file without keeping them all in memory. Appending to a file can get slower as the page count grows, but this help topic explains how you can speed that up.
You have "using (RasterCodecs codecs ..)" in your code, but the large memory is used by the image, not the codecs object. Consider wrapping your RasterImage objects in a "using" scope to speed up their disposal; in other words, go for "using (RasterImage image = ...)" (see the sketch after these tips).
And the obvious suggestion: go for 64-bit, install as much RAM as you can and increase the value of MaximumGlobalRasterImageMemory.
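To make the disposal tip concrete, here is a minimal sketch that only rearranges the calls already shown in the question (it does not demonstrate the page-by-page append optimization, just the "using" scopes; File.Exists is the only added call):
using (RasterCodecs codecs = new RasterCodecs())
{
    codecs.Options.Load.AllPages = true;
    // Dispose the page image as soon as it has been appended/saved.
    using (RasterImage pageToAppend = codecs.Load(fullInputPath, 1))
    {
        if (File.Exists(fullOutputPath))
        {
            using (RasterImage target = codecs.Load(fullOutputPath))
            {
                target.AddPage(pageToAppend);
                codecs.Save(target, fullOutputPath, RasterImageFormat.Ccitt, 1);
            }
        }
        else
        {
            codecs.Save(pageToAppend, fullOutputPath, RasterImageFormat.Ccitt, 1);
        }
    }
}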
This is a continuation of my question about downloading files in chunks. The explanation will be quite long, so I'll try to divide it into several parts.
1) What I tried to do
I was creating a download manager for a Windows Phone application. First, I tried to solve the problem of downloading
large files (the explanation is in the previous question). Now I want to add a "resumable download" feature.
2) What I've already done
At the moment I have a well-working download manager that allows me to work around the Windows Phone RAM limit.
The idea of this manager is that it downloads small chunks of the file sequentially, using the HTTP Range header.
A quick explanation of how it works:
The file is downloaded in chunks of constant size. Let's call this size "delta". After a file chunk has been downloaded,
it is saved to local storage (hard disk; on WP it's called Isolated Storage) in append mode (so the downloaded byte array is
always added to the end of the file). After downloading a single chunk, the statement
if (mediaFileLength >= delta) // mediaFileLength is the length of the downloaded chunk
is checked. If it's true, that
means there's something left to download and the method is invoked recursively. Otherwise this chunk
was the last one and there's nothing left to download.
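Roughly, a single download step looks like this (a simplified sketch rather than the real code; fileUrl, offset, delta and the helper methods are placeholders, and the phone networking API differs slightly from the desktop HttpWebRequest shown here):
// Sketch of one chunk request (requires System.Net and System.IO).
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(fileUrl); // fileUrl: placeholder
request.AddRange(offset, offset + delta - 1); // asks for bytes offset..offset+delta-1
using (WebResponse response = request.GetResponse())
using (Stream body = response.GetResponseStream())
using (MemoryStream memory = new MemoryStream())
{
    body.CopyTo(memory);                  // .NET 4 Stream.CopyTo
    byte[] chunk = memory.ToArray();
    AppendChunkToIsolatedStorage(chunk);  // placeholder: append to the partially downloaded file
    if (chunk.Length >= delta)            // a full chunk means there may be more to download
        DownloadChunk(offset + chunk.Length); // placeholder: recurse for the next chunk
}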
3) What's the problem?
As long as I used this logic for one-time downloads (by "one-time" I mean starting the download and waiting until it is finished),
it worked well. However, I decided that I need a "resume download" feature. So, the facts:
3.1) I know that the chunk size is a constant.
3.2) I know whether the file has been completely downloaded or not (that's an indirect result of my app logic;
I won't weary you with the explanation, just take it as a fact).
From these two statements it follows that the number of downloaded chunks is equal to
CurrentFileLength / delta, where CurrentFileLength is the size in bytes of the already downloaded file.
To resume downloading the file I should simply set the required headers and invoke the download method. That seems logical, doesn't it? And I tried to implement it:
// Check file size
using (IsolatedStorageFileStream fileStream = isolatedStorageFile.OpenFile("SomewhereInTheIsolatedStorage", FileMode.Open, FileAccess.Read))
{
    int currentFileSize = Convert.ToInt32(fileStream.Length);
    int currentFileChunkIterator = currentFileSize / delta;
}
And what do I see as a result? The downloaded file length is equal to 2432000 bytes (delta is 304160; the total file size is about 4.5 MB, so we've downloaded only half of it). So the result of the division is
approximately 7.995 (it actually has a long/int type, so it's 7, and it should be 8 instead!). Why is this happening?
Simple math tells us that the file length should be 2433280, so the reported value is very close, but not equal.
Further investigation showed that all the values returned by fileStream.Length are not accurate, but all are close.
Why is this happening? I don't know precisely; perhaps the .Length value is taken from file metadata somewhere.
Perhaps such rounding is normal for this method. Perhaps, when the download was interrupted, the file wasn't saved completely... (no, that's pure fantasy, it can't be).
So the problem is set: "How to determine the number of chunks downloaded". The question is how to solve it.
4) My thoughts about solving the problem
My first thought was to use maths here: set some epsilon-neighborhood and use it in the currentFileChunkIterator = currentFileSize / delta; statement.
But that requires us to keep type I and type II errors in mind (or false alarms and misses, if you don't like the statistics terms); perhaps there's actually nothing left to download.
Also, I haven't checked whether the difference between the reported value and the true value grows steadily
or fluctuates cyclically. With small sizes (about 4-5 MB) I've only seen growth, but that doesn't prove anything.
So I'm asking for help here, as I don't like my solution.
5) What I would like to hear as an answer:
What causes the difference between the real value and the reported value?
Is there a way to get the true value?
If not, is my solution good for this problem?
Are there other, better solutions?
P.S. I won't add a Windows Phone tag, because I'm not sure this problem is OS-related. I used the Isolated Storage Tool
to check the size of the downloaded file, and it showed me the same value as the one I received.
I'm answering your update:
This is my understanding so far: the length actually written to the file is larger (rounded up to the next 1 KiB) than what you actually wrote to it. This makes your assumption "file.Length == amount downloaded" wrong.
One solution would be to track this information separately. Create some metadata structure (which can be persisted using the same storage mechanism) to accurately track which blocks have been downloaded, as well as the entire size of the file:
[DataContract] //< I forgot how serialization on the phone works, please forgive me if the tags differ
struct Metadata
{
    [DataMember]
    public int Length;

    [DataMember]
    public int NumBlocksDownloaded;
}
This would be enough to reconstruct which blocks have been downloaded and which have not, assuming that you keep downloading them in a consecutive fashion.
edit
Of course you would have to change your code from a simple append to moving the position of the stream to the correct block, before writing the data to the stream:
file.Position = currentBlock * delta;
file.Write(block, 0, block.Length);
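For completeness, persisting and reloading that metadata could look roughly like this (a sketch assuming DataContractSerializer and isolated storage are available on the phone; "download.meta" and the method names are just example placeholders):
// Sketch: persist/reload the download metadata in isolated storage.
// Requires System.IO.IsolatedStorage and System.Runtime.Serialization.
static void SaveMetadata(Metadata meta)
{
    using (var store = IsolatedStorageFile.GetUserStoreForApplication())
    using (var stream = store.OpenFile("download.meta", FileMode.Create, FileAccess.Write))
    {
        new DataContractSerializer(typeof(Metadata)).WriteObject(stream, meta);
    }
}

static Metadata LoadMetadata()
{
    using (var store = IsolatedStorageFile.GetUserStoreForApplication())
    using (var stream = store.OpenFile("download.meta", FileMode.Open, FileAccess.Read))
    {
        return (Metadata)new DataContractSerializer(typeof(Metadata)).ReadObject(stream);
    }
}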
Just as a possible bug: don't forget to verify whether the file was modified between requests, especially when a long time passes between them, which can happen with pause/resume.
The error could be serious: if the file is modified to a smaller size your chunk count goes wrong, and if the file stays the same size but with modified contents you will end up with a corrupted file.
Have you heard the anecdote about a noob programmer and ten guru programmers? The guru programmers were trying to find an error in his solution, and the noob had already found it but didn't tell anyone, because it was something so stupid that he was afraid of being laughed at.
Why did I remember this? Because the situation is similar.
The explanation in my question was already quite heavy, and I decided not to mention some small aspects that I was sure worked correctly. (And they really did work correctly.)
One of these small aspects was the fact that the downloaded file was encrypted with AES using PKCS7 padding. Well, the decryption worked correctly, I knew it, so why should I mention it? And I didn't.
So, I then tried to find out what exactly causes the error with the last chunk. The most plausible theory was a problem with buffering, and I tried to find where I was losing the missing bytes. I tested again and again, but I couldn't find them, as every chunk was saved without any losses. And one day I understood:
There is no spoon
There is no error.
What's the point of AES with PKCS7? Well, the relevant one here is that it makes the decrypted data smaller than the encrypted data. Not by much, only by 16 bytes. And it was accounted for in my decryption method and download method, so there should be no problem, right?
But what happens when the download process is interrupted? The last chunk is saved correctly; there are no errors with buffering or anything else. And then we want to continue the download. The number of downloaded chunks is computed as currentFileChunkIterator = currentFileSize / delta;
And here I should have asked myself: "Why are you trying to do something THAT stupid?"
"The size of one downloaded chunk is not delta. Actually, it's less than delta." (The decryption makes each part 16 bytes smaller, remember?)
The delta itself consists of 10 equal parts that are decrypted separately. So we should divide not by delta, but by (delta - 16 * 10), which is (304160 - 160) = 304000.
I sense a rat here. Let's try to find out the number of the downloaded chunks:
2432000 / 304000 = 8. Wait... OH SHI~
So, that's the end of the story.
The whole solution logic was right.
The only reason it failed was my assumption that, for some reason, the downloaded decrypted file size should be the same as the sum of the downloaded encrypted chunks.
And of course, since I didn't mention the decryption (it's mentioned only in the previous question, which is only linked), none of you could give me a correct answer. I'm terribly sorry about that.
Continuing from my comment:
The original file size, as I understand from your description, is 2432000 bytes.
The chunk size is set to 304160 bytes (one "delta").
So, the machine which sent the file was able to fill 7 chunks completely and sent them.
The receiving machine now has 7 x 304160 bytes = 2129120 bytes.
The last chunk will not be filled to the end, as there are not enough bytes left to fill it, so it will contain: 2432000 - 2129120 = 302880, which is less than 304160.
If you add the numbers you get 7 x 304160 + 1 x 302880 = 2432000 bytes.
So according to that, the original file was transferred in full to the destination.
The problem is that you are calculating 8 x 304160 = 2433280, insisting that even the last chunk must be filled completely - but with what, and why?
In all humility: are you stuck in some kind of math confusion, or did I misunderstand your problem?
Please answer: what is the original file size, and what size is being received at the other end (totals)?
What is the best memory buffer size to allocate when downloading a file from the Internet? Some of the samples say it should be 1 KB. Well, I need to know in general why that is. And also, what's the difference if we download a small .PNG or a large .AVI?
Stream remoteStream;
Stream localStream;
WebResponse response;
try
{
    response = request.EndGetResponse(result);
    if (response == null)
        return;
    remoteStream = response.GetResponseStream();
    var localFile = Path.Combine(FileManager.GetFolderContent(), TaskResult.ContentItem.FileName);
    localStream = File.Create(localFile);
    var buffer = new byte[1024];
    int bytesRead;
    do
    {
        bytesRead = remoteStream.Read(buffer, 0, buffer.Length);
        localStream.Write(buffer, 0, bytesRead);
        BytesProcessed += bytesRead;
    } while (bytesRead > 0);
}
For what it's worth, I tested reading a 1484 KB text file using progressive powers of two as buffer sizes (2, 4, 8, 16, ...). I printed to the console window the number of milliseconds required to read the file with each size. Much past 8192 there didn't seem to be much of a difference. Here are the results on my Windows 7 64-bit machine.
2^1 = 2 :264.0151
2^2 = 4 :193.011
2^3 = 8 :175.01
2^4 = 16 :153.0088
2^5 = 32 :139.0079
2^6 = 64 :134.0077
2^7 = 128 :132.0075
2^8 = 256 :130.0075
2^9 = 512 :133.0076
2^10 = 1024 :133.0076
2^11 = 2048 :90.0051
2^12 = 4096 :69.0039
2^13 = 8192 :60.0035
2^14 = 16384 :56.0032
2^15 = 32768 :53.003
2^16 = 65536 :53.003
2^17 = 131072 :52.003
2^18 = 262144 :53.003
2^19 = 524288 :54.0031
2^20 = 1048576 :55.0031
2^21 = 2097152 :54.0031
2^22 = 4194304 :54.0031
2^23 = 8388608 :54.003
2^24 = 16777216 :55.0032
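A loop of roughly this shape reproduces the test (my own reconstruction for illustration, not the exact code that produced the numbers above; the file path is a placeholder):
// Reconstruction of a buffer-size timing test (illustrative only).
using System;
using System.Diagnostics;
using System.IO;

class BufferBenchmark
{
    static void Main()
    {
        for (int power = 1; power <= 24; power++)
        {
            int bufferSize = 1 << power;
            byte[] buffer = new byte[bufferSize];
            Stopwatch watch = Stopwatch.StartNew();
            using (FileStream stream = File.OpenRead(@"C:\temp\test.txt")) // placeholder path
            {
                while (stream.Read(buffer, 0, buffer.Length) > 0) { }
            }
            watch.Stop();
            Console.WriteLine("2^{0} = {1} : {2} ms", power, bufferSize, watch.Elapsed.TotalMilliseconds);
        }
    }
}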
Use at least 4KB. It's the normal page size for Windows (i.e. the granularity at which Windows itself manages memory), which means that the .Net memory allocator doesn't need to break down a 4KB page into 1KB allocations.
Of course, using a 64KB block will be faster, but only marginally so.
2K, 4K or 8K are good choices.
The page size is not that important; the change in speed would be marginal and unpredictable.
First of all, C# memory can be moved: C# uses a compacting generational garbage collector, so there is no information about where your data will end up.
Second, arrays in C# can be backed by non-contiguous areas of memory:
arrays are stored contiguously in virtual memory, but contiguous virtual memory doesn't mean contiguous physical memory.
Third, the array data structure in C# occupies a few bytes more than the content itself (it stores the array length and other information). If you allocate exactly a page-size worth of bytes, using the array will almost always straddle a page boundary!
I would think that optimizing code around the page size can end up being a non-optimization.
Usually C# arrays perform very well, but if you really need precise allocation of data you need to use pinned arrays or Marshal allocation, and that will slow down the garbage collector.
Using Marshal allocation and unsafe code can be a little faster, but it's really not worth the effort.
I would say it is better to just use your arrays without thinking too much about the page size. Use 2K, 4K or 8K buffers.
I had a problem with the remote machine closing the connection when I used a 64K buffer while downloading from IIS.
I solved the problem by raising the buffer to 2 MB.
It will depend on the hardware and scope too. I work on cloud-deployed workloads; in the server world you may find 40G Ethernet cards and you can assume MTUs of 9000 bytes. Additionally, you don't want your Ethernet card to interrupt your processor for every single frame. So, ignoring the middle actors in the Windows/Linux kernel, you should go one or two orders higher:
100 * 9000 ≈ 900 KB, so I generally choose 512 KB as a default value (as long as I know this value is not oversized for the regular expected size of the files being downloaded).
In some cases you can find out (or know, or hack around in a debugger and hence find out, albeit in a non-change-resistant way) the size of a buffer used by the stream(s) you are writing to or reading from. In that case you'll get a slight advantage if you match that size or, failing that, make one buffer a whole multiple of the other.
Otherwise use 4096 unless you have a reason not to (wanting a small buffer to give rapid UI feedback, for example), for the reasons already given.
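For example, reusing the names from the question's snippet, you could size the FileStream's internal buffer to match your copy buffer (a sketch; 4096 is just the default suggested above):
// Sketch: align the copy buffer with the FileStream's internal buffer.
const int bufferSize = 4096;
byte[] buffer = new byte[bufferSize];
using (FileStream localStream = new FileStream(localFile, FileMode.Create,
                                               FileAccess.Write, FileShare.None, bufferSize))
{
    int bytesRead;
    while ((bytesRead = remoteStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        localStream.Write(buffer, 0, bytesRead);
    }
}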
I am going to store 350M pre-calculated double numbers in a binary file and load them into memory as my DLL starts up. Is there any built-in way to load them in parallel, or should I split the data into multiple files myself and take care of multiple threads myself too?
Answering the comments: I will be running this DLL on powerful enough boxes, most likely only 64-bit ones. Because all the access to my numbers will be via properties anyway, I can store my numbers in several arrays.
[update]
Everyone, thanks for answering! I'm looking forward to a lot of benchmarking on different boxes.
Regarding the need: I want to speed up a very slow calculation, so I am going to pre-calculate a grid, load it into memory, and then interpolate.
Well, I did a small test and I would definitely recommend using Memory Mapped Files.
I created a file containing 350M double values (2.6 GB, as many mentioned before) and then tested the time it takes to map the file to memory and then access any of the elements.
In all my tests on my laptop (Win7, .NET 4.0, Core 2 Duo 2.0 GHz, 4 GB RAM) it took less than a second to map the file, and at that point accessing any of the elements took virtually 0 ms (all the time is spent in the validation of the index).
Then I decided to go through all 350M numbers and the whole process took about 3 minutes (paging included), so if in your case you have to iterate over everything, that is something to take into account.
Nevertheless, I wrapped the access (just for example purposes; there are a lot of conditions you should check before using this code), and it looks like this:
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

public class Storage<T> : IDisposable, IEnumerable<T> where T : struct
{
    MemoryMappedFile mappedFile;
    MemoryMappedViewAccessor accessor;
    long elementSize;
    long numberOfElements;

    public Storage(string filePath)
    {
        if (string.IsNullOrWhiteSpace(filePath))
        {
            throw new ArgumentNullException();
        }
        if (!File.Exists(filePath))
        {
            throw new FileNotFoundException();
        }
        FileInfo info = new FileInfo(filePath);
        mappedFile = MemoryMappedFile.CreateFromFile(filePath);
        accessor = mappedFile.CreateViewAccessor(0, info.Length);
        elementSize = Marshal.SizeOf(typeof(T));
        numberOfElements = info.Length / elementSize;
    }

    public long Length
    {
        get
        {
            return numberOfElements;
        }
    }

    public T this[long index]
    {
        get
        {
            // Note: the bound check uses >= so that index == numberOfElements is rejected.
            if (index < 0 || index >= numberOfElements)
            {
                throw new ArgumentOutOfRangeException();
            }
            T value = default(T);
            accessor.Read<T>(index * elementSize, out value);
            return value;
        }
    }

    public void Dispose()
    {
        if (accessor != null)
        {
            accessor.Dispose();
            accessor = null;
        }
        if (mappedFile != null)
        {
            mappedFile.Dispose();
            mappedFile = null;
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        T value;
        for (long index = 0; index < numberOfElements; index++)
        {
            value = default(T);
            accessor.Read<T>(index * elementSize, out value);
            yield return value;
        }
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    public static T[] GetArray(string filePath)
    {
        T[] elements;
        int elementSize;
        long numberOfElements;
        if (string.IsNullOrWhiteSpace(filePath))
        {
            throw new ArgumentNullException();
        }
        if (!File.Exists(filePath))
        {
            throw new FileNotFoundException();
        }
        FileInfo info = new FileInfo(filePath);
        using (MemoryMappedFile mappedFile = MemoryMappedFile.CreateFromFile(filePath))
        {
            using (MemoryMappedViewAccessor accessor = mappedFile.CreateViewAccessor(0, info.Length))
            {
                elementSize = Marshal.SizeOf(typeof(T));
                numberOfElements = info.Length / elementSize;
                if (numberOfElements > int.MaxValue)
                {
                    // A single .NET array cannot hold this many elements; you will need to split it.
                    throw new NotSupportedException("Too many elements for a single array; split the data.");
                }
                elements = new T[numberOfElements];
                accessor.ReadArray<T>(0, elements, 0, (int)numberOfElements);
            }
        }
        return elements;
    }
}
Here is an example of how you can use the class
Stopwatch watch = Stopwatch.StartNew();
using (Storage<double> helper = new Storage<double>("Storage.bin"))
{
    Console.WriteLine("Initialization Time: {0}", watch.ElapsedMilliseconds);
    string item;
    long index;
    Console.Write("Item to show: ");
    while (!string.IsNullOrWhiteSpace((item = Console.ReadLine())))
    {
        if (long.TryParse(item, out index) && index >= 0 && index < helper.Length)
        {
            watch.Reset();
            watch.Start();
            double value = helper[index];
            Console.WriteLine("Access Time: {0}", watch.ElapsedMilliseconds);
            Console.WriteLine("Item: {0}", value);
        }
        else
        {
            Console.Write("Invalid index");
        }
        Console.Write("Item to show: ");
    }
}
UPDATE: I added a static method to load all the data in a file into an array. Obviously this approach takes more time initially (on my laptop it takes between 1 and 2 minutes), but after that the access performance is what you expect from .NET. This method should be useful if you have to access the data frequently.
Usage is pretty simple
double[] helper = Storage<double>.GetArray("Storage.bin");
HTH
It sounds extremely unlikely that you'll actually be able to fit this into a contiguous array in memory, so presumably the way in which you parallelize the load depends on the actual data structure.
(Addendum: LukeH pointed out in comments that there is actually a hard 2GB limit on object size in the CLR. This is detailed in this other SO question.)
Assuming you're reading the whole thing from one disk, parallelizing the disk reads is probably a bad idea. If there's any processing you need to do to the numbers as or after you load them, you might want to consider running that in parallel at the same time you're reading from disk.
The first question you have presumably already answered is "does this have to be precalculated?". Is there some algorithm you can use that will make it possible to calculate the required values on demand to avoid this problem? Assuming not...
That is only 2.6GB of data - on a 64 bit processor you'll have no problem with a tiny amount of data like that. But if you're running on a 5 year old computer with a 10 year old OS then it's a non-starter, as that much data will immediately fill the available working set for a 32-bit application.
One approach that would be obvious in C++ would be to use a memory-mapped file. This makes the data appear to your application as if it is in RAM, but the OS actually pages bits of it in only as it is accessed, so very little real RAM is used. I'm not sure if you could do this directly from C#, but you could easily enough do it in C++/CLI and then access it from C#.
Alternatively, assuming the question "do you need all of it in RAM simultaneously" has been answered with "yes", then you can't go for any kind of virtualisation approach, so...
Loading in multiple threads won't help - you are going to be I/O bound, so you'll have n threads waiting for data (and asking the hard drive to seek between the chunks they are reading) rather than one thread waiting for data (which is being read sequentially, with no seeks). So threads will just cause more seeking and may well make it slower. (The only case where splitting the data up might help is if you split it across different physical disks so different chunks of data can be read in parallel - don't do this in software; buy a RAID array.)
The only place where multithreading may help is to make the load happen in the background while the rest of your application starts up, and allow the user to start using the portion of the data that is already loaded while the rest of the buffer fills, so the user (hopefully) doesn't have to wait much while the data is loading.
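A minimal sketch of that background-load idea with .NET 4 tasks (LoadAllDoubles and the file name are placeholders for whatever sequential loader you end up with):
// Sketch only: kick off the sequential load in the background (requires System.Threading.Tasks).
// LoadAllDoubles is a hypothetical method that reads the whole file into an array.
Task<double[]> loadTask = Task.Factory.StartNew(() => LoadAllDoubles("grid.bin"));

// ... the rest of the application keeps starting up ...

// Later, when the numbers are actually needed (this blocks only if loading hasn't finished):
double[] grid = loadTask.Result;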
So, you're back to loading the data into one massive array in a single thread...
However, you may be able to speed this up considerably by compressing the data. There are a couple of general approaches worth considering:
If you know something about the data, you may be able to invent an encoding scheme that makes the data smaller (and therefore faster to load). e.g. if the values tend to be close to each other (e.g. imagine the data points that describe a sine wave - the values range from very small to very large, but each value is only ever a small increment from the last) you may be able to represent the 'deltas' in a float without losing the accuracy of the original double values, halving the data size. If there is any symmetry or repetition to the data you may be able to exploit it (e.g. imagine storing all the positions to describe a whole circle, versus storing one quadrant and using a bit of trivial and fast maths to reflect it 4 times - an easy way to quarter the amount of data I/O). Any reduction in data size would give a corresponding reduction in load time. In addition, many of these schemes would allow the data to remain "encoded" in RAM, so you'd use far less RAM but still be able to quickly fetch the data when it was needed.
Alternatively, you can very easily wrap your stream with a generic compression algorithm such as Deflate. This may not work, but usually the cost of decompressing the data on the CPU is less than the I/O time that you save by loading less source data, so the net result is that it loads significantly faster. And of course, save a load of disk space too.
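A sketch of the stream-wrapping variant for reading the numbers back (this assumes the file was written through a DeflateStream in the first place; the file name is a placeholder and count stands for the number of values, known from a header or from the uncompressed length; requires System.IO and System.IO.Compression):
// Sketch: read doubles through a Deflate decompression stream.
using (FileStream file = File.OpenRead("grid.bin.deflate"))
using (DeflateStream deflate = new DeflateStream(file, CompressionMode.Decompress))
using (BinaryReader reader = new BinaryReader(deflate))
{
    double[] values = new double[count];
    for (int i = 0; i < count; i++)
        values[i] = reader.ReadDouble();
}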
In the typical case, loading speed will be limited by the speed of the storage you're loading the data from - i.e. the hard drive.
If you want it to be faster, you'll need to use faster storage, e.g. multiple hard drives joined in a RAID scheme.
If your data can be reasonably compressed, do that. Try to find an algorithm which uses exactly as much CPU power as you have - less than that and your external storage speed will be the limiting factor; more than that and your CPU speed will be the limiting factor. If your compression algorithm can use multiple cores, then multithreading can be useful.
If your data is somehow predictable, you might want to come up with a custom compression scheme. E.g. if consecutive numbers are close to each other, you might want to store the differences between numbers - this might help compression efficiency.
Do you really need double precision? Maybe floats will do the job? Maybe you don't need the full range of doubles? For example, if you need the full 53 bits of mantissa precision but only need to store numbers between -1.0 and 1.0, you can try to shave a few bits per number by not storing the exponent in its full range.
Making this parallel would be a bad idea unless you're running on an SSD. The limiting factor is going to be the disk I/O - and if you run two threads, the head is going to be jumping back and forth between the two areas being read. This will slow it down a lot more than any possible speedup from parallelization.
Remember that drives are MECHANICAL devices and insanely slow compared to the processor. If you can do a million instructions in order to avoid a single head seek, you will still come out ahead.
Also, once the file is on disk, make sure to defrag the disk to ensure it's in one contiguous block.
That does not sound like a good idea to me. 350,000,000 * 8 bytes = 2,800,000,000 bytes. Even if you manage to avoid the OutOfMemoryException, the process may be swapping in and out of the page file anyway. You might as well leave the data in the file and load smaller chunks as they are needed. The point is that just because you can allocate that much memory does not mean you should.
With a suitable disk configuration, splitting into multiple files across disks would make sense - and reading each file in a separate thread would then work nicely (if you have some stripiness - RAID, whatever :) - then it could make sense to read from a single file with multiple threads).
I think you're on a hiding to nothing attempting this with a single physical disk, though.
Just saw this : .NET 4.0 has support for memory mapped files. That would be a very fast way to do it, and no support required for parallelization etc.