Is it possible to read the Nth attachment of the Mth message from an mbox file using MimeKit.MimeParser? In my case, I would store a few messages (a few fields for each message, including a list of attachments) in an in-memory data structure, and later I want to be able to return to a specific message attachment and read its contents.
Things I have tried so far:
1. Remembering the underlying stream position for each message as it is read, and seeking the stream back to that position before calling _parser.ParseMessage() again later to get the message and its attachment.
2. Using LINQ methods to find a message by MessageId, both with and without resetting the stream position to 0 and calling SetStream again.
Neither approach works.
Here is some code just to illustrate my efforts:
public void SaveAttachment(Attachment att, Stream outStream)
{
    _inputStream.Seek(0, SeekOrigin.Begin);
    _parser.SetStream(_inputStream, false);

    //MimeMessage mimeMsg = _parser.Skip((int)(att.Parent as Message).Position).First();
    MimeMessage mimeMsg = _parser.SingleOrDefault(x => x.MessageId == (att.Parent as Message).EntryID);
    MimeEntity mimeAtt = mimeMsg.Attachments.ToList()[att.AttachmentIndex];

    if (mimeAtt is MessagePart)
    {
        (mimeAtt as MessagePart).Message.WriteTo(outStream);
    }
    else
    {
        (mimeAtt as MimePart).Content.DecodeTo(outStream);
    }
}
Is it possible to read the Nth attachment of the Mth message, from an mbox file, using MimeKit.MimeParser?
If you want to do this, you will need the exact stream start and end offsets of the MimeEntity that you want.
What you'll want to do is wrap the stream in a MimeKit.IO.BoundStream with those offsets, to prevent the parser from straying outside of those bounds, and then set that BoundStream on the MimeParser.
When you set the stream, make sure to use MimeFormat.Entity (and not MimeFormat.Mbox), since you are only interested in parsing a single MimeEntity (which can be a multipart containing other MimeEntities).
To get these offsets, you'll need to subscribe to the MimeParser's MimeEntityBegin/End events when you first parse the mbox (see the sketch below): http://www.mimekit.net/docs/html/Events_T_MimeKit_MimeParser.htm
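Here is a minimal sketch of that two-step approach. The event argument property names (BeginOffset/EndOffset) and the BoundStream constructor signature are written from memory of the MimeKit docs, so verify them against the version you are using:

using System.Collections.Generic;
using System.IO;
using MimeKit;
using MimeKit.IO;

class EntityRange
{
    public long Begin;
    public long End;
}

static class MboxIndexer
{
    // First pass: parse the whole mbox once and record the offsets of every entity.
    public static List<EntityRange> IndexEntities(Stream mboxStream)
    {
        var ranges = new List<EntityRange>();
        var parser = new MimeParser(mboxStream, MimeFormat.Mbox);

        // Assumed event/property names; check the MimeParser events documentation.
        parser.MimeEntityEnd += (sender, e) =>
            ranges.Add(new EntityRange { Begin = e.BeginOffset, End = e.EndOffset });

        while (!parser.IsEndOfStream)
            parser.ParseMessage();

        return ranges;
    }

    // Second pass: re-parse a single entity by bounding the stream to its offsets.
    public static MimeEntity ParseEntityAt(Stream mboxStream, EntityRange range)
    {
        var bounded = new BoundStream(mboxStream, range.Begin, range.End, leaveOpen: true);
        var parser = new MimeParser(bounded, MimeFormat.Entity);
        return parser.ParseEntity();
    }
}

In your SaveAttachment scenario, you would look up the recorded range for the attachment's entity and call something like ParseEntityAt instead of re-parsing the whole mbox.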
I want to be able to return to a specific message attachment and read its contents.
Have you looked into the persistent argument to MimeParser.SetStream()?
This may still use more memory than you want to use (since it will have all of the headers loaded + track stream offsets for each MimeEntity's content), but you may find that it's more convenient and has low-enough memory usage to fit your practical needs.
When this argument is set to true, instead of loading each MimePart's content into RAM, the parser creates a BoundStream that wraps the stream provided to the MimeParser, so that when you request the content of those MimeParts it is lazily loaded from disk.
By default (or when persistent = false), the MimeParser will load that content into a MemoryBlockStream (effectively a MemoryStream that tries to reduce byte array resizing for performance) which can, as you probably know, use quite a bit of memory if the messages have large attachments (or a lot of them).
The thing to watch out for when using persistent = true is that you will need to keep the mbox file stream open if you want to be able to get the content of any of the MimeParts parsed by the parser. Once you close the stream, trying to get the content of any MimeParts will likely result in an ObjectDisposedException.
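A rough sketch of what that looks like (the exact constructor/SetStream overloads that take a persistent flag vary slightly between MimeKit versions, so treat the signature below as an assumption to verify):

using System.Collections.Generic;
using System.IO;
using MimeKit;

// Keep this stream open for as long as you need the lazily-loaded content.
var mboxStream = File.OpenRead("input.mbox");            // hypothetical file name
var parser = new MimeParser(mboxStream, MimeFormat.Mbox, persistent: true);

var messages = new List<MimeMessage>();
while (!parser.IsEndOfStream)
    messages.Add(parser.ParseMessage());

// MimePart content is now read from mboxStream on demand (e.g. when calling
// Content.DecodeTo); disposing mboxStream before that point would break it.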
Related
I am looking into PdfReport.Core and have been asked to have our .NET Core 2.0 Web API return a PDF to the calling client. The client could be any HTTPS caller, such as an AJAX or MVC client.
Below is a bit of the code I am using. I am using Swashbuckle to test the API, and it looks like the report is being returned, but when I try to open it in a PDF viewer it says the file is corrupted. I suspect I am not actually writing the PDF content to the stream. Suggestions?
[HttpGet]
[Route("api/v1/pdf")]
public FileResult GetPDF()
{
    var outputStream = new MemoryStream();
    InMemoryPdfReport.CreateStreamingPdfReport(_hostingEnvironment.WebRootPath, outputStream);
    outputStream.Position = 0;

    return new FileStreamResult(outputStream, "application/pdf")
    {
        FileDownloadName = "report.pdf"
    };
}
I'm not familiar with that particular library, but generally speaking with streams, file corruption is the result of either 1) the write not being flushed or 2) incorrect positioning within the stream.
Since you've set the position back to zero, I'm guessing the problem is that your write isn't being flushed correctly. Essentially, when you write to a stream, the data is not necessarily "complete" in the stream right away. Sometimes writes are queued so they can be written more efficiently in batches. Sometimes there are cleanup tasks a particular stream writer needs to perform to "finalize" everything. For example, with a format like PDF, end matter particular to the format may need to be appended to the bytes. A stream writer that produces PDF would take care of this in a flush operation, since it cannot be completed until all writing is done.
Long and short: review the documentation of the library. In particular, look for any method or process that deals with "flushing"; that's most likely what you're missing.
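As a generic illustration (not specific to PdfReport.Core, whose API I don't know), the usual pattern is to flush or dispose whatever writer produces the bytes before rewinding the stream:

using System.IO;
using System.Text;

var outputStream = new MemoryStream();
using (var writer = new StreamWriter(outputStream, Encoding.UTF8, bufferSize: 1024, leaveOpen: true))
{
    writer.Write("...report content...");   // stand-in for the real report generator
}   // Dispose() flushes any buffered data into outputStream

outputStream.Position = 0;   // only rewind after the writer has been flushed/closed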
I am a little confused about exactly how C# streams work, and when and how streaming happens with them. I have read various articles, but I am not sure I completely understand it yet.
I have the following code:
[HttpGet]
[Route("GetImage/ImageId/{imageId:long}")]
public HttpResponseMessage GetImage(long imageId)
{
    var imageStream = OpenStream(imageId);   // opens the backing stream for the image

    var response = new HttpResponseMessage(HttpStatusCode.OK);   // declaration omitted in the original snippet
    response.Content = new StreamContent(imageStream);
    response.Content.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg");
    return response;
}
I understand that a stream is basically an abstraction over a backing file, so when you open the stream it is not necessarily loaded fully into memory yet (depending on how it's opened); let's assume it's opened without loading it into memory.
So the questions are:
1. If I took imageStream and used .CopyTo to copy it to another stream, I'm assuming this has to read the whole stream into memory to perform that operation?
2. With the code above, if the client reads this stream a few bytes at a time, how does the API know to only send pieces at a time? Does the connection stay open until the streaming is done?
3. If I need the full image to do anything on the client side, is there a benefit to streaming it, or would getting the whole image via GetStreamAsync do the same? Does GetStreamAsync pull the entire stream down into memory?
I am using
using (StreamWriter writer = new StreamWriter(Response.OutputStream, System.Text.Encoding.UTF8));
In order to directly write some lines of text and send them to the browser as an attachment.
I now also want to save that text locally in a file, but I'd rather avoid changing too much of my code. Can I write the contents of Response.OutputStream to a text file before ending the response?
I believe what you are asking for is not doable. I am quite sure Response.OutputStream is not seekable (its CanSeek property yields false), meaning you won't be able to seek back to its start to read its content. It is probably not readable either (CanRead yielding false).
Attempting either of those operations would throw a NotSupportedException.
If your needs are just basic logging, you may work around this by enabling the standard .NET network traces, or by coding an IHttpModule as suggested here.
Otherwise, you can use an intermediate MemoryStream with your StreamWriter: reset the MemoryStream to position 0 and copy it to OutputStream, then reset it to position 0 again and write it to your file.
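A minimal sketch of that workaround (the local path is just an example):

using (var buffer = new MemoryStream())
{
    using (var writer = new StreamWriter(buffer, Encoding.UTF8, bufferSize: 1024, leaveOpen: true))
    {
        writer.WriteLine("first line of the attachment");
        writer.WriteLine("second line");
    }   // disposing the writer flushes it into the MemoryStream

    buffer.Position = 0;
    buffer.CopyTo(Response.OutputStream);                        // send to the browser

    buffer.Position = 0;
    using (var file = File.Create(@"C:\temp\attachment.txt"))    // hypothetical local path
    {
        buffer.CopyTo(file);
    }
}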
You can use the CopyTo method of the Stream object. At the end you can copy the whole stream to another one, which writes it to a file. (https://msdn.microsoft.com/en-us/library/dd782932(v=vs.110).aspx)
I have a problem obtaining the right buffer size in my application.
What I have read about specifying the buffer size is that it is normally declared before reading:
byte[] buffer = new byte[2000];
and then used to receive the result.
However, this method will stop once the received data contains '00', but my return data contains something like this: 5300000002000000EF0000000A00, and the length is not fixed; it can be this short or up to 400 bytes.
So the problem comes: if I define a fixed length like above, e.g. 2000, the return value is
5300000002000000EF0000000A000000000000000000000000000000000000000000000000000..........
which makes me unable to split the bytes into the correct amount.
Can anyone show me how to obtain the actual received data size from the NetworkStream, or any method/trick to get what I need?
Thanks in advance.
Network streams have no length.
Unfortunately, your question is light on detail, so it's hard to offer specific advice. But you have a couple of options:
If the high-level protocol being used here offers a way to know the length of the data that will be sent, use that. This could be as simple as the remote host sending the byte count before the rest of the data, or some command you could send to the remote host to query the length of the data (a minimal sketch of the length-prefix idea appears at the end of this answer). Without knowing what high-level protocol you're using, it's not possible to say whether this is even an option.
Write the incoming data into a MemoryStream object. This will always work, whether or not the high-level protocol offers a way to know in advance how much data to expect. Note that if it doesn't, you will simply have to receive data until the end of the network stream.
The latter option looks something like this:
MemoryStream outputStream = new MemoryStream();
int readByteCount;
byte[] rgb = new byte[1024]; // can be any size

while ((readByteCount = inputStream.Read(rgb, 0, rgb.Length)) > 0)
{
    outputStream.Write(rgb, 0, readByteCount);
}

return outputStream.ToArray();
This assumes you have a network stream named "inputStream".
I show the above mainly because it illustrates the more general practice of reading from a network stream in pieces and then storing the result elsewhere. Also, it is easily adapted to directly reading from a socket instance (you didn't mention what you're actually using for network I/O).
However, if you are actually using a Stream object for your network I/O, then as of .NET 4.0, there has been a more convenient way to write the above:
MemoryStream outputStream = new MemoryStream();
inputStream.CopyTo(outputStream);
return outputStream.ToArray();
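For the first option, a length-prefixed protocol typically looks something like the sketch below. It assumes the remote host sends a 4-byte little-endian length before the payload, which is an assumption about your protocol rather than something stated in the question:

static byte[] ReadLengthPrefixed(Stream inputStream)
{
    // Read the 4-byte length prefix, then exactly that many payload bytes.
    byte[] lengthBytes = ReadExactly(inputStream, 4);
    int length = BitConverter.ToInt32(lengthBytes, 0);
    return ReadExactly(inputStream, length);
}

static byte[] ReadExactly(Stream stream, int count)
{
    byte[] buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0)
            throw new EndOfStreamException("Connection closed before the full payload arrived.");
        offset += read;
    }
    return buffer;
}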
I'm trying to add file compression to an application. The application has been around for a while, so it needs to be able to read uncompressed documents written by previous versions. I expected that DeflateStream would be able to process an uncompressed file, but for GZipStream I get the "The magic number in GZip header is not correct" error, and for DeflateStream I get "Found invalid data while decoding". I guess neither finds the header that marks the file as the expected type.
If it's not possible to simply process an uncompressed file, then the second-best option would be a way to determine whether a file is compressed and choose the method of reading it accordingly. I've found this link: http://blog.somecreativity.com/2008/04/08/how-to-check-if-a-file-is-compressed-in-c/, but this is very implementation specific and doesn't feel like the right approach. It can also produce false positives (I'm sure this would be rare, but it does indicate that it's not the right approach).
A third option I've considered is to attempt to use DeflateStream and fall back to normal stream I/O if an exception occurs. This also feels messy, and it causes Visual Studio to break at the exception (unless I untick that exception, which I don't really want to do).
Of course, I may simply be going about this the wrong way. This is the code I've tried in .NET 3.5:
Stream reader = new FileStream(fileName, FileMode.Open,
    readOnly ? FileAccess.Read : FileAccess.ReadWrite,
    readOnly ? FileShare.ReadWrite : FileShare.Read);

using (DeflateStream decompressedStream = new DeflateStream(reader, CompressionMode.Decompress))
{
    workspace = (Workspace)new XmlSerializer(typeof(Workspace)).Deserialize(decompressedStream);

    if (readOnly)
    {
        reader.Close();
        workspace.FilePath = fileName;
    }
    else
    {
        workspace.SetOpen(reader, fileName);
    }
}
Any ideas?
Thanks!
Luke.
Doesn't your file format have a header? If not, now is the time to add one (you're changing the file format by supporting compression, anyway). Pick a good magic value, make sure the header is extensible (add a version field, or use specific magic values for specific versions), and you're ready to go.
Upon loading, check for the magic value. If not present, use your current legacy loading routines. If present, the header will tell you whether the contents are compressed or not.
Update
Compressing the stream means the file is no longer a plain XML document, so there's no reason the file can't contain more than just your data stream. You really do want a header identifying your file :)
The code below is example (pseudo-)code; I don't know whether .NET has a "substream", so SubRangeStream is likely something you'll have to write yourself (DeflateStream probably adds its own header, so a substream might not be strictly necessary, but it could turn out useful further down the road).
Int64 oldPosition = reader.Position;
byte[] magic = new byte[4];             // whatever length your magic value uses
reader.Read(magic, 0, magic.Length);

if (IsRightMagicValue(magic))
{
    Header header = ReadHeader(reader);
    Stream furtherReader = new SubRangeStream(reader, reader.Position, header.ContentLength);

    if (header.IsCompressed)
    {
        furtherReader = new DeflateStream(furtherReader, CompressionMode.Decompress);
    }

    XmlSerializer xml = new XmlSerializer(typeof(Workspace));
    workspace = (Workspace)xml.Deserialize(furtherReader);
}
else
{
    reader.Position = oldPosition;
    LegacyLoad(reader);
}
In real life, I would do things a bit differently: some proper error handling and cleanup, for instance. Also, I wouldn't have the new loader code directly in the IsRightMagicValue block; rather, I'd dispatch the work either based on the magic value (one magic value per file version), or I would keep a "common header" portion with fields shared by all versions. In both cases, I'd use a Factory Method to return an IWorkspaceReader depending on the file version (a rough sketch of that idea follows).
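Something along these lines, where IWorkspaceReader and the concrete reader classes are purely illustrative names rather than an existing API:

interface IWorkspaceReader
{
    Workspace Read(Stream stream);
}

static IWorkspaceReader CreateReader(byte[] magic)
{
    // Hypothetical dispatch: pick a reader based on the magic value / file version.
    if (!IsRightMagicValue(magic))
        return new LegacyWorkspaceReader();      // old, uncompressed XML files

    return new HeaderedWorkspaceReader();        // new files: the header says whether the content is compressed
}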
Can't you just create a wrapper class/function for reading the file and catch the exception? Something like
try
{
    // Try to return a decompressed stream
}
catch (InvalidDataException e)
{
    // Assume it is already decompressed and return it as it is
}