Reading data from a socket directly into a Memory Mapped File - c#

I'm trying to create a .NET Core application and am getting a little stuck on IPC.
I have data coming in (let's say, from a socket, or a file, some sort of streaming interface), via executable 1. Now, I want this data to be read by executable 2. So, I've created a MMF, where executable 1 writes data, and executable 2 reads data. All is well.
However, I would really like to skip the "copy" step here. If I have a message coming in via a socket, I need to read the message (ergo; store it in some byte array), and then copy it to the appropriate Memory Mapped File.
Is there a way (especially now, with the new Memory, Span etc) to have them use the same memory?
This code almost seems to work, but not completely:
const int bufferSize = 1024;
var mappedFile = MemoryMappedFile.CreateNew("IPCFile", bufferSize);
var fileAccessor = mappedFile.CreateViewAccessor(0, bufferSize, MemoryMappedFileAccess.ReadWrite);
// There now exists a region of byte[] * bufferSize somewhere. I want to write to that.
byte[] bufferMem = new byte[bufferSize];
// I know the memory address where this area is, and I know the size of it:
unsafe
{
    byte* startOfRegion = (byte*)0;
    fileAccessor.SafeMemoryMappedViewHandle.AcquirePointer(ref startOfRegion);
    // But how do I "assign" this region to the managed object?
    // This throws "System.MissingMethodException: 'No parameterless constructor defined for type 'System.Byte[]'."
    //bufferMem = Marshal.PtrToStructure<byte[]>(new IntPtr(startOfRegion));
    // This almost looks like it works, but bufferMem remains null. Does not give any errors though.
    bufferMem = Unsafe.AsRef<byte[]>(startOfRegion);
}
// For a shorter example, just using a file stream
var incomingData = File.OpenRead(@"C:\Temp\plaatje.png");
// Pass in the "reserved" memory region. But StreamPipeReaderOptions wants a MemoryPool<byte>, not a byte[].
// How would I cast that? MemoryPool is abstract, so can't even be instantiated
var reader = PipeReader.Create(incomingData, new StreamPipeReaderOptions(bufferMem));
ReadResult readResult;
// Actually read data
while (true)
{
    readResult = await reader.ReadAsync();
    if (readResult.IsCompleted || readResult.IsCanceled)
        break;
    reader.AdvanceTo(readResult.Buffer.Start, readResult.Buffer.End);
}
// Now signal the other process to read the contents of the MMF and continue
This question seems to ask a similar thing, but does not have an answer, and is from 2013.
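One approach that may get close to this is a custom MemoryManager<byte> that wraps the view's pointer, so that APIs accepting Memory<byte> can write into the mapped region. The following is only a sketch under assumptions: MappedViewMemoryManager and MappedReceive are hypothetical names (not framework types), the project must be compiled with unsafe code allowed, and whether a copy is really avoided depends on the stream or socket implementation being read from.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Threading.Tasks;

// Hypothetical helper: exposes a memory-mapped view as Memory<byte>.
public sealed unsafe class MappedViewMemoryManager : MemoryManager<byte>
{
    private readonly MemoryMappedViewAccessor _accessor;
    private readonly int _length;
    private byte* _pointer;

    public MappedViewMemoryManager(MemoryMappedViewAccessor accessor, int length)
    {
        _accessor = accessor;
        _length = length;
        byte* basePointer = null;
        accessor.SafeMemoryMappedViewHandle.AcquirePointer(ref basePointer);
        // The acquired pointer is page-aligned; PointerOffset locates the view within it.
        _pointer = basePointer + accessor.PointerOffset;
    }

    public override Span<byte> GetSpan() => new Span<byte>(_pointer, _length);

    // The mapped region is not moved by the GC, so no actual pinning is needed.
    public override MemoryHandle Pin(int elementIndex = 0) =>
        new MemoryHandle(_pointer + elementIndex);

    public override void Unpin() { }

    protected override void Dispose(bool disposing)
    {
        if (_pointer != null)
        {
            _accessor.SafeMemoryMappedViewHandle.ReleasePointer();
            _pointer = null;
        }
    }
}

public static class MappedReceive
{
    // Reads from a stream into the mapped view until it is full or the
    // stream ends. Returns the number of bytes written to the view.
    public static async Task<int> FillFromAsync(
        Stream source, MemoryMappedViewAccessor accessor, int length)
    {
        using (var manager = new MappedViewMemoryManager(accessor, length))
        {
            Memory<byte> memory = manager.Memory;
            int total = 0;
            while (total < length)
            {
                int read = await source.ReadAsync(memory.Slice(total));
                if (read == 0)
                    break; // end of stream
                total += read;
            }
            return total;
        }
    }
}
```

With a socket, the same Memory<byte> could be passed to Socket.ReceiveAsync(Memory<byte>, SocketFlags) instead of Stream.ReadAsync. The Memory<byte> must not be used after the manager or the accessor has been disposed.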

Related

Strange results from OpenReadAsync() when reading data from Azure Blob storage

I'm having a go at modifying an existing C# (dot net core) app that reads a type of binary file to use Azure Blob Storage.
I'm using Windows.Azure.Storage (8.6.0).
The issue is that this app reads the binary data from files from a Stream in very small blocks (e.g. 5000-6000 bytes). This reflects how the data is structured.
Example pseudo code:
var blocks = new List<byte[]>();
var numberOfBytesToRead = 6240;
var numberOfBlocksToRead = 1700;
using (var stream = await blob.OpenReadAsync())
{
    stream.Seek(3000, SeekOrigin.Begin); // start reading at a particular position
    for (int i = 1; i <= numberOfBlocksToRead; i++)
    {
        byte[] traceValues = new byte[numberOfBytesToRead];
        stream.Read(traceValues, 0, numberOfBytesToRead);
        blocks.Add(traceValues);
    }
}
If I try to read a 10 MB file using OpenReadAsync(), I get invalid/junk values in the byte arrays after around 4,190,000 bytes.
If I set StreamMinimumReadSize to 100 MB, it works.
If I read more data per block (e.g. 1 MB), it works.
Some of the files can be more than 100 MB, so setting the StreamMinimumReadSize may not be the best solution.
What is going on here, and how can I fix this?
Are the invalid/junk values zeros? If so (and maybe even if not) check the return value from stream.Read. That method is not guaranteed to actually read the number of bytes that you ask it to. It can read less. In which case you are supposed to call it again in a loop, until it has read the total amount that you want. A quick web search should show you lots of examples of the necessary looping.
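A minimal sketch of such a loop, assuming a plain synchronous Stream; StreamReadHelper and ReadFully are hypothetical names chosen for illustration:

```csharp
using System;
using System.IO;

public static class StreamReadHelper
{
    // Keeps calling Read until 'count' bytes have arrived or the stream ends.
    // Returns the number of bytes actually placed in the buffer.
    public static int ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, offset + total, count - total);
            if (read == 0)
                break; // end of stream: fewer than 'count' bytes were available
            total += read;
        }
        return total;
    }
}
```

In the loop from the question, stream.Read(traceValues, 0, numberOfBytesToRead) would become StreamReadHelper.ReadFully(stream, traceValues, 0, numberOfBytesToRead), and a return value smaller than numberOfBytesToRead would mean the blob ended early. On .NET 7 and later, Stream.ReadExactly does the same job and throws if the stream ends first.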

How to efficiently set the number of bytes to download for HttpWebRequest?

I'm currently working on a file downloader project. The application is designed to support resumable downloads. All downloaded data and its metadata (download ranges) are stored on disk immediately on each call to ReadBytes. Let's say I use the following code snippet:
var reader = new BinaryReader(response.GetResponseStream());
var buffr = reader.ReadBytes(_speedBuffer);
DownloadSpeed += buffr.Length;//used for reporting speed and zeroed every second
Here _speedBuffer is the number of bytes to download which is set to a default value.
I have tested the application in two ways. First, I downloaded a file hosted on a local IIS server; the speed is great. Second, I downloaded a copy of the same file from the internet (where it originally came from), over my very slow connection. What I observed is that if I increase _speedBuffer, the download speed from the local server is good, but the reported speed for the internet copy is slow. If I decrease _speedBuffer, the reported speed for the internet copy is good, but the local server download is not. So I thought, why not change _speedBuffer at runtime? But all the custom algorithms I came up with for changing the value were inefficient: the download speed was still slow compared to other downloaders.
Is this approach OK?
Am I doing it the wrong way?
Should I stick with default value for _speedBuffer(byte count)?
The problem with ReadBytes in this case is that it attempts to read exactly that number of bytes, or it returns when there is no more data to read.
So if you receive a packet containing 99 bytes of data, calling ReadBytes(100) will block until the next packet arrives with the missing byte.
I wouldn't use a BinaryReader at all:
byte[] buffer = new byte[bufferSize];
using (Stream responseStream = response.GetResponseStream())
{
    int bytes;
    while ((bytes = responseStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        DownloadSpeed += bytes; // used for reporting speed and zeroed every second
        // on each iteration, "bytes" bytes of the buffer have been filled, store these to disk
    }
    // bytes was 0: end of stream
}

Principles behind FileStreaming

I've been working on a project recently that involves a lot of FileStreaming, something which I've not really touched on before.
To try and better acquaint myself with the principles of such methods, I've written some code that (theoretically) downloads a file from one dir to another, and gone through it step by step, commenting in my understanding of what each step achieves, like so...
Get fileinfo object from DownloadRequest Object
RemoteFileInfo fileInfo = svr.DownloadFile(request);
DownloadFile method in WCF Service
public RemoteFileInfo DownloadFile(DownloadRequest request)
{
    RemoteFileInfo result = new RemoteFileInfo(); // create empty fileinfo object
    try
    {
        // set filepath (Path.Combine inserts the separator itself)
        string filePath = System.IO.Path.Combine(request.FilePath, request.FileName);
        System.IO.FileInfo fileInfo = new System.IO.FileInfo(filePath); // get fileinfo from path
        // check if exists
        if (!fileInfo.Exists)
            throw new System.IO.FileNotFoundException("File not found",
                request.FileName);
        // open stream
        System.IO.FileStream stream = new System.IO.FileStream(filePath,
            System.IO.FileMode.Open, System.IO.FileAccess.Read);
        // return result
        result.FileName = request.FileName;
        result.Length = fileInfo.Length;
        result.FileByteStream = stream;
    }
    catch (Exception ex)
    {
        // do something
    }
    return result;
}
Use returned FileStream from fileinfo to read into a new write stream
// set new location for downloaded file (Path.Combine inserts the separators itself)
string basePath = System.IO.Path.Combine(@"C:\SST Software\DSC\Compilations", compName);
string serverFileName = System.IO.Path.Combine(basePath, file);
double totalBytesRead = 0.0;
if (!Directory.Exists(basePath))
    Directory.CreateDirectory(basePath);
int chunkSize = 2048;
byte[] buffer = new byte[chunkSize];
// create new write file stream
using (System.IO.FileStream writeStream = new System.IO.FileStream(serverFileName, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
    do
    {
        // read bytes from fileinfo stream
        int bytesRead = fileInfo.FileByteStream.Read(buffer, 0, chunkSize);
        totalBytesRead += (double)bytesRead;
        if (bytesRead == 0) break;
        // write bytes to output stream
        writeStream.Write(buffer, 0, bytesRead);
    } while (true);
    // report end
    Console.WriteLine(fileInfo.FileName + " has been written to " + basePath + " - Done!");
    writeStream.Close();
}
What I was hoping for is any clarification or expansion on what exactly happens when using a FileStream.
I can achieve the download, and now I know what code I need to write in order to perform such a download, but I would like to know more about why it works. I can find no 'beginner-friendly' or step by step explanations on the web.
What is happening here behind the scenes?
A stream is just an abstraction; fundamentally, it works like a pointer within a collection of data.
Take the string "Hello World!" as an example: it is just a collection of characters, which are fundamentally just bytes.
As a stream, it could be represented as having:
A length of 12 (possibly more, including termination characters etc.)
A position in the stream.
You read a stream by moving the position around and requesting data.
So reading the text above could be (in pseudocode) seen to be like this:
do
    get next byte
    add gotten byte to collection
while not the end of the stream
the entire data is now in the collection
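In C#, the pseudocode above could look roughly like this. It is a sketch for illustration only, assuming a MemoryStream over the ASCII bytes of "Hello World!"; StreamDemo and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class StreamDemo
{
    // Reads a stream one byte at a time, mirroring the pseudocode.
    public static byte[] ReadAllBytes(Stream stream)
    {
        var collected = new List<byte>();
        int next;
        // ReadByte advances the position and returns -1 at the end of the stream.
        while ((next = stream.ReadByte()) != -1)
        {
            collected.Add((byte)next);
        }
        return collected.ToArray();
    }

    public static string ReadHelloWorld()
    {
        byte[] source = Encoding.ASCII.GetBytes("Hello World!");
        using (var stream = new MemoryStream(source))
        {
            byte[] data = ReadAllBytes(stream);
            return Encoding.ASCII.GetString(data);
        }
    }
}
```

Reading byte by byte keeps the example close to the pseudocode; real code usually reads into a buffer of a few kilobytes per call, as the download loop in the question does.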
Streams are really useful when it comes to accessing data from sources such as the file system or remote machines.
Imagine a file that is several gigabytes in size. If the OS loaded all of that into memory any time a program (say, a video player) wanted to read it, there would be a lot of problems.
Instead, what happens is the program requests access to the file, and the OS returns a stream; the stream tells the program how much data there is, and allows it to access that data.
Depending on the implementation, the OS may load a certain amount of data into memory ahead of the program accessing it; the memory holding that data is known as a buffer.
Fundamentally though, the program just requests the next bit of data, and the OS either gets it from the buffer, or from the source (e.g. the file on disk).
The same principle applies to streams between different computers, except requesting the next bit of data may very well involve a trip to the remote machine to request it.
The .NET FileStream class, built on the Stream base class, ultimately just defers to the operating system for the actual I/O. There's nothing particularly special about these classes; it's what you can do with the abstraction that makes them so powerful.
Writing to a stream is just the same, but it just puts data into the buffer, ready for the requester to access.
Infinite Data
As a user pointed out, streams can be used for data of indeterminate length.
All stream operations take time, so reading a stream is typically a blocking operation that will wait until data is available.
So you could loop forever while the stream is still open, and just wait for data to come in - an example of this in practice would be a live video broadcast.
I've since located a book - C# 5.0 All-In-One For Dummies - It explains everything about all Stream classes, how they work, which one is most appropriate and more.
Only been reading about 30 minutes, already have such a better understanding. Excellent guide!

C# Networkstream BeginRead How to obtain buffer length/size?

I have a problem obtaining the right buffer size in my application.
From what I have read, the buffer size is normally declared before reading:
byte[] buffer = new byte[2000];
And then the buffer is used to get the result.
However, this method will stop once the received data contains '00', and my return data contains something like this: 5300000002000000EF0000000A00. The length is not fixed; it can be this short or up to 400 bytes.
So here is the problem: if I define a fixed length like above, e.g. 2000, the returned value is
5300000002000000EF0000000A000000000000000000000000000000000000000000000000000..........
which makes it impossible for me to split off the correct number of bytes.
Can anyone show me how to obtain the actual received data size from a NetworkStream, or any method/trick to get what I need?
Thanks in advance.
Network streams have no length.
Unfortunately, your question is light on detail, so it's hard to offer specific advice. But you have a couple of options:
If the high-level protocol being used here offers a way to know the length of the data that will be sent, use that. This could be as simple as the remote host sending the byte count before the rest of the data, or some command you could send to the remote host to query the length of the data. Without knowing what high-level protocol you're using, it's not possible to say whether this is even an option or not.
Write the incoming data into a MemoryStream object. This would always work, whether or not the high-level protocol offers a way to know in advance how much data to expect. Note that if it doesn't, then you will simply have to receive data until the end of the network stream.
The latter option looks something like this:
MemoryStream outputStream = new MemoryStream();
int readByteCount;
byte[] rgb = new byte[1024]; // can be any size
while ((readByteCount = inputStream.Read(rgb, 0, rgb.Length)) > 0)
{
    outputStream.Write(rgb, 0, readByteCount);
}
return outputStream.ToArray();
This assumes you have a network stream named "inputStream".
I show the above mainly because it illustrates the more general practice of reading from a network stream in pieces and then storing the result elsewhere. Also, it is easily adapted to directly reading from a socket instance (you didn't mention what you're actually using for network I/O).
However, if you are actually using a Stream object for your network I/O, then as of .NET 4.0, there has been a more convenient way to write the above:
MemoryStream outputStream = new MemoryStream();
inputStream.CopyTo(outputStream);
return outputStream.ToArray();
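For the first option, where the remote host sends the byte count before the data, a reader might look like the sketch below. The framing is purely an assumption for illustration (a 4-byte big-endian length followed by the payload); your actual protocol may differ, and LengthPrefixedReader is a hypothetical name.

```csharp
using System;
using System.IO;

public static class LengthPrefixedReader
{
    // Assumed (hypothetical) framing: 4-byte big-endian length, then the payload.
    public static byte[] ReadMessage(Stream inputStream)
    {
        byte[] header = ReadExact(inputStream, 4);
        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0)
            throw new InvalidDataException("Negative message length.");
        return ReadExact(inputStream, length);
    }

    // Loops until exactly 'count' bytes have been read, since a single
    // Read call may return fewer bytes than requested.
    private static byte[] ReadExact(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the full message arrived.");
            total += read;
        }
        return buffer;
    }
}
```

Because the length comes from the header rather than from the content, zero bytes inside the payload are preserved and the returned array has exactly the size that was sent.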

How to shrink a string and be able to find the original later

I am working on this app that is still in beta, so I set up a logging system. The log is too long to be used in a mailto url so I thought about shrinking the text and then decrypt it.
Let's say I have a 50 line long log, this should help me make something like this zef16z1e6f8 and then have a procedure to use that to find out all 50 lines of the log.
I would like to note that I don't need any fancy TripleDES encryption or something.
First I would suggest re-looking at why you can't just mail the entire log content? Unless you have large logs (>5MB) I'd suggest just mailing the log. If you still want to pursue some shrinking strategy there are two I'd consider.
If you want a simple reference string which can be used to lookup your log data at some later stage you can just associate some sort of identifier with the data (e.g. a GUID as suggested by Eugene). This has the benefit of having a constant length, irrespective of the log size.
Alternatively you could just compress the log, this will shrink the data somewhat (anything up to about 90%, as Dan mentioned). However this has the downside of having a variable length and for very large logs may still exceed your size limitations. If you go this route you could do something like this (not tested):
private string GetCompressedString()
{
    byte[] byteArray = Encoding.UTF8.GetBytes("Some long log string");
    using (var ms = new MemoryStream())
    {
        // leaveOpen: true keeps ms usable after the GZipStream is disposed
        using (var gz = new GZipStream(ms, CompressionMode.Compress, true))
        {
            // write to the GZipStream (not ms directly) so the data is actually compressed
            gz.Write(byteArray, 0, byteArray.Length);
        }
        // disposing gz has flushed the compressed bytes into ms
        return Convert.ToBase64String(ms.ToArray());
    }
}
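Getting the original log back is the same process in reverse. A minimal sketch, assuming the string was produced by GZip-compressing UTF-8 text and Base64-encoding the result; LogCompression and Decompress are hypothetical names:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class LogCompression
{
    // Reverses Base64 + GZip and returns the original UTF-8 text.
    public static string Decompress(string base64)
    {
        byte[] compressed = Convert.FromBase64String(base64);
        using (var input = new MemoryStream(compressed))
        using (var gz = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gz.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}
```

Note that Base64 makes the compressed bytes about a third longer again, so the result is shorter than the log only when the text compresses well, and it will never be a short token like the one in the question.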
