C++/zlib/gzip compression and C# GZipStream decompression fails

I know there's a ton of questions about zlib/gzip etc but none of them quite match what I'm trying to do (or at least I haven't found it). As a quick overview, I have a C# server that decompresses incoming strings using a GZipStream. My task is to write a C++ client that will compress a string compatible with GZipStream decompression.
When I use the code below I get an error that says "The magic number in GZip header is not correct. Make sure you are passing in a GZip stream." I understand what the magic number is and everything, I just don't know how to magically set it properly.
Finally, I'm using the C++ zlib NuGet package, but I've also tried the source files directly from zlib with the same bad luck.
Here's a more in-depth view.
The server's function for decompression:
public static string ReadMessage(NetworkStream stream)
{
    byte[] buffer = new byte[512];
    StringBuilder messageData = new StringBuilder();
    GZipStream gzStream = new GZipStream(stream, CompressionMode.Decompress, true);
    int bytes = 0;
    while (true)
    {
        try
        {
            bytes = gzStream.Read(buffer, 0, buffer.Length);
        }
        catch (InvalidDataException ex)
        {
            Console.WriteLine($"Busted: {ex.Message}");
            return "";
        }
        // Use Decoder class to convert from bytes to Default
        // in case a character spans two buffers.
        Decoder decoder = Encoding.Default.GetDecoder();
        char[] chars = new char[decoder.GetCharCount(buffer, 0, bytes)];
        decoder.GetChars(buffer, 0, bytes, chars, 0);
        messageData.Append(chars);
        Console.WriteLine(messageData);
        // Check for EOF or an empty message.
        if (messageData.ToString().IndexOf("<EOF>", StringComparison.Ordinal) != -1)
            break;
    }
    int eof = messageData.ToString().IndexOf("<EOF>", StringComparison.Ordinal);
    string message = messageData.ToString().Substring(0, eof).Trim();
    // Returns message without ending EOF
    return message;
}
To sum it up: it accepts a NetworkStream, reads the compressed data, decompresses it, appends it to a string, and loops until it finds <EOF>, which is then stripped off before the final decompressed string is returned. This is almost an exact match of the example from MSDN.
Here's the C++ client side code:
char* CompressString(char* message)
{
    int messageSize = sizeof(message);
    // Compress string
    z_stream zs;
    memset(&zs, 0, sizeof(zs));
    zs.zalloc = Z_NULL;
    zs.zfree = Z_NULL;
    zs.opaque = Z_NULL;
    zs.next_in = reinterpret_cast<Bytef*>(message);
    zs.avail_in = messageSize;
    int iResult = deflateInit2(&zs, Z_BEST_COMPRESSION, Z_DEFLATED, (MAX_WBITS + 16), 8, Z_DEFAULT_STRATEGY);
    if (iResult != Z_OK) zerr(iResult);
    int ret;
    char* outbuffer = new char[messageSize];
    std::string outstring;
    // retrieve the compressed bytes blockwise
    do {
        zs.next_out = reinterpret_cast<Bytef*>(outbuffer);
        zs.avail_out = sizeof(outbuffer);
        ret = deflate(&zs, Z_FINISH);
        if (outstring.size() < zs.total_out) {
            // append the block to the output string
            outstring.append(outbuffer, zs.total_out - outstring.size());
        }
    } while (ret == Z_OK);
    deflateEnd(&zs);
    if (ret != Z_STREAM_END) { // an error occurred that was not EOF
        std::ostringstream oss;
        oss << "Exception during zlib compression: (" << ret << ") " << zs.msg;
        throw std::runtime_error(oss.str());
    }
    return &outstring[0u];
}
Long story short here, it accepts a string and goes through a pretty standard zlib compression with the WBITS being set to wrap it in a gzip header/footer. It then returns a char* of the compressed input. This is what is sent to the server above to be decompressed.
Thanks for any help you can give me! Also, let me know if you need any more information.

In your CompressString function you return a char* obtained from a locally declared std::string. The string is destroyed when the function returns, which releases the memory the returned pointer refers to.
It's likely that something else is then being allocated in that memory region and writing over your compressed data before it gets sent.
You need to ensure the memory containing the compressed data remains allocated until it has been sent, perhaps by passing a std::string& into the function and storing the result in there (see the sketch below).
An unrelated bug: you do char* outbuffer = new char[messageSize]; but there is no matching delete[] for that buffer, which results in a memory leak. As you're also throwing exceptions from this function, I would recommend std::unique_ptr<char[]> instead of trying to manage this manually with your own delete[] calls. In fact I would always recommend std::unique_ptr over explicit delete where possible.
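As a rough sketch of what I mean (my own illustration, not a drop-in replacement): take the input as a std::string so the length comes from size() rather than sizeof on a char* (which only yields the pointer size), write the result into a caller-supplied std::string that stays alive after the call, and use a fixed stack buffer so there is no new[] to forget to delete[]:

#include <cstring>
#include <string>
#include <zlib.h>

// Sketch: compress `message` into `out` with a gzip wrapper (MAX_WBITS + 16).
// Returns true on success; buffer size and error handling are illustrative only.
bool CompressString(const std::string& message, std::string& out)
{
    z_stream zs;
    std::memset(&zs, 0, sizeof(zs));
    zs.next_in = reinterpret_cast<Bytef*>(const_cast<char*>(message.data()));
    zs.avail_in = static_cast<uInt>(message.size());   // actual length, not sizeof(char*)

    if (deflateInit2(&zs, Z_BEST_COMPRESSION, Z_DEFLATED,
                     MAX_WBITS + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
        return false;

    char outbuffer[32768];                              // stack buffer: nothing to delete[]
    int ret;
    do {
        zs.next_out = reinterpret_cast<Bytef*>(outbuffer);
        zs.avail_out = sizeof(outbuffer);               // here sizeof really is the array size
        ret = deflate(&zs, Z_FINISH);
        out.append(outbuffer, sizeof(outbuffer) - zs.avail_out);
    } while (ret == Z_OK);

    deflateEnd(&zs);
    return ret == Z_STREAM_END;
}

Since deflateInit2 with MAX_WBITS + 16 already emits a proper gzip header and trailer, keeping the compressed bytes alive (and sending exactly out.size() of them) should be enough to satisfy GZipStream's magic-number check.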

Related

Google.Protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero)

I have a problem with my school project. I use the Protobuf library, but I get the following error:
Google.Protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
My protocol message wrapper is:
syntax = "proto3";
package CardGameGUI.Network.Protocol.Message;
message WrapperMessage {
    enum MessageType {
        HELLO_MESSAGE = 0;
        JOIN_ROOM_MESSAGE = 1;
        JOIN_ROOM_RESPONSE_MESSAGE = 2;
    }
    MessageType type = 1;
    bytes payload = 2;
}
I use this to send a message:
public void SendObject<T>(Protocol.Message.WrapperMessage.Types.MessageType type, T messageObject)
{
    byte[] message;
    // Serialize message
    using (var stream = new MemoryStream())
    {
        ((IMessage)messageObject).WriteTo(stream);
        message = stream.GetBuffer();
    }
    byte[] wrapper = new Protocol.Message.WrapperMessage{Type = type, Payload = Google.Protobuf.ByteString.CopyFrom(message)}.ToByteArray();
    Connection.SendObject<byte[]>("ByteMessage", wrapper);
}
And my server handler:
private void IncommingMessageHandler(PacketHeader header, Connection connection, byte[] message)
{
    Protocol.Message.WrapperMessage wrapper = Protocol.Message.WrapperMessage.Parser.ParseFrom(message);
    switch (wrapper.Type)
    {
        case Protocol.Message.WrapperMessage.Types.MessageType.HelloMessage:
            GetClient(connection.ConnectionInfo.NetworkIdentifier).MessageHandler(Protocol.Message.HelloMessage.Parser.ParseFrom(wrapper.Payload.ToByteArray()));
            break;
    }
}
The wrapper message is deserialized perfectly, and the type is matched correctly, but when I process the Payload, the exception pops up.
Am I doing something wrong?
Edit: a small screenshot of the message Payload
The problem is probably that you used GetBuffer without making use of the known length. GetBuffer returns the oversized backing array. The data after the stream's .Length is garbage and should not be consumed - it will typically (but not always) be zeros, which is what you are seeing.
Either use ToArray() instead of GetBuffer(), or track the .Length of the stream and only consume that much of the oversized buffer.
Another possibility is "framing" - it looks like you're handling packets, but if this is TCP there is no guarantee that the chunks you receive are the same sizes as the chunks you send. If you are sending multiple messages over TCP you need to implement your own framing (typically via a length prefix, since you're talking binary data).
Incidentally, this isn't protobuf-net.
If neither of those is the problem: check the data you receive is exactly (byte for byte) the data you send (including lengths). It is easy for data to get corrupted or mis-chunked by IO code.
I encountered this problem in a case where my serialized byte stream was missing the varint length prefix.
For example, if I serialize a "Person.proto" message that is 672 bytes and then try to deserialize just those 672 bytes, I hit this error.
The fix is to prepend the varint-encoded length to the 672 bytes, giving a 674-byte stream.
The extra data is the varint encoding of 672, which is the two bytes 160, 5.
You can get the varint bytes with this function:
public static byte[] VarInt(int value)
{
    // data len
    List<byte> varIntBuffer = new List<byte>();
    int index = 0;
    while (true)
    {
        if ((value & ~0x7f) == 0)
        {
            varIntBuffer.Add((byte)(value & 0x7f));
            break;
        }
        else
        {
            varIntBuffer.Add((byte)((value & 0x7f) | 0x80));
            value = value >> 7;
        }
        index++;
    }
    return varIntBuffer.ToArray();
}
I had this same issue when attempting to deserialize a byte array which had been initialized to a fixed size but there was a bug which meant I was not populating the array with proto bytes (so the byte array was populated with zeros when I was attempting to deserialize).
It turns out that I was reading bytes from a JMS BytesMessage twice in a test case but was not calling BytesMessage.reset() before the second read.
I'm guessing you could get a similar bug if attempting to read from an InputStream twice without calling reset()

Sending image over socket and saving it on the server

I'm currently facing some problems with my project, and I hope you are able to spot the problem, as I'm not able to see it myself.
I'm trying to send a picture from a C# client (Windows) to my C server running on a Linux system. I'm transmitting the image's binary data via a TCP socket, and that works just fine. The problem is that when I write the received buffer to a file on the Linux system with fwrite, it seems that some of the information present in the buffer is not written, or is written with a corrupted value, to the file.
E.g. I'm trying to send this picture here:
And this is what I get on the server:
The client:
public static void sendPicture(Image image)
{
    Byte[] imageBytes;
    using (MemoryStream s = new MemoryStream())
    {
        image.Save(s, ImageFormat.Jpeg);
        imageBytes = s.ToArray();
        s.Close();
    }
    if (imageBytes.Length <= 5242880)
    {
        try
        {
            NetworkStream stream = client.GetStream();
            File.WriteAllBytes("before.jpg", imageBytes);
            //Send image Size
            Byte[] imgSize = BitConverter.GetBytes((UInt32)imageBytes.Length);
            stream.Write(imgSize, 0, imgSize.Length);
            //Get answer from server if filesize is ok
            Byte[] data = new Byte[1];
            // recv only looks if we have a partial read,
            // works like stream.Read(data, 0, data.Length)
            Int32 count = recv(stream, data, data.Length);
            if (count != 1 || data[0] != 0x4)
                return;
            stream.Write(imageBytes, 0, imageBytes.Length);
            ...
}
The server:
...
// INFO_BUFFER_SIZE (1)
buffer = malloc(INFO_BUFFER_SIZE);
// does allow parital reads
if (read_from_client(connfd, buffer, INFO_BUFFER_SIZE) == NULL) {
    error_out(buffer,
              "Error during receiv of client data # read_from_client");
    close(connfd);
    continue;
}
// reconstruct the image_size
uint32_t image_size =
    buffer[3] << (0x8 * 3) | buffer[2] << (0x8 * 2) | buffer[1] << 0x8 |
    buffer[0];
fprintf(stderr, "img size: %u\n", image_size);
// Check if file size is ok
if (check_control(image_size, 1, response) > 0) {
    inform_and_close(connfd, response);
    continue;
}
// Inform that the size is ok and we're ready to receive the image now
(void) send(connfd, response, CONT_BUFFER_SIZE, 0);
free(buffer);
buffer = malloc(image_size);
if (read_from_client(connfd, buffer, image_size) == NULL) {
    error_out(buffer,
              "Error during receiv of client data # read_from_client");
    close(connfd);
    continue;
}
FILE * f;
// Generate a GUID for the name
uuid_t guid;
uuid_generate_random(guid);
char filename[37];
uuid_unparse(guid, filename);
if ((f = fopen(filename, "wb")) == NULL) {
    inform_and_close(connfd, 0x0);
    error_out(buffer,
              "Error while trying to open a file to save the data");
    continue;
}
if (fwrite(buffer, sizeof(buffer[0]), image_size, f) != image_size) {
    inform_and_close(connfd, 0x0);
    error_out(buffer, "Error while writing to file");
    continue;
}
char output[100];
(void) snprintf(output, sizeof(output), "mv %s uploads/%s.jpg",
                filename, filename);
system(output);
free(buffer);
close(connfd);
...
If I receive the image and send it directly back to the client, and the client then writes the received buffer to a file, everything is fine: there is no difference between the file that was sent and the one that was received. Because of that I'm quite sure that the transmission works as expected.
If you need anything more, let me know!
fopen creates a buffered I/O stream. But you aren't flushing the buffer and closing the file. Cutting out the error checking, you're doing:
f = fopen(filename, "wb");
fwrite(buffer, sizeof (buffer[0]), image_size, f);
You should add these at the end:
fflush(f);
fclose(f);
fclose will actually flush for you but it's best to do the flush separately so that you can check for errors prior to closing.
What's happening here (somewhat oversimplified) is:
fopen creates the file on disk (by using open(2) [or creat(2)] system call) and also allocates an internal buffer
fwrite repeatedly fills the internal buffer. Each time the buffer fills to some boundary (determined by BUFSIZ -- see setbuf(3) man page), fwrite flushes it to disk (i.e. it does a write(2) system call).
However, when you're finished, the end of the file is still sitting in the internal buffer -- because the library can't know that you're done writing to the file, and you didn't happen to land on a BUFSIZ boundary. Either fflush or fclose will tell the library to flush out that last partial buffer, writing it to disk. Then, with fclose, the underlying OS file descriptor is closed (which you should always do anyway, or your server will have a "file descriptor leak").
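To make that concrete, here is a minimal sketch (my own illustration, using a hypothetical write_image helper rather than code from the question) of a write path that flushes and closes, checking each step:

#include <stdio.h>

// Sketch: write a received buffer to disk; true only if every byte reached the file.
bool write_image(const char* filename, const unsigned char* buffer, size_t image_size)
{
    FILE* f = fopen(filename, "wb");
    if (!f)
        return false;
    bool ok = fwrite(buffer, 1, image_size, f) == image_size;
    // Flush first so a short write or disk error is still reported here...
    ok = ok && fflush(f) == 0;
    // ...then always close, otherwise the server leaks one file descriptor per upload.
    ok = (fclose(f) == 0) && ok;
    return ok;
}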

How memory management works between bytes and string?

In C#, if I have 4-5 GB of data that is currently in the form of bytes and I convert it to a string, what will the impact on memory be, and how can I manage memory better when working with such a large string variable?
Code
public byte[] ExtractMessage(int start, int end)
{
    if (end <= start)
        return null;
    byte[] message = new byte[end - start];
    int remaining = Size - end;
    Array.Copy(Frame, start, message, 0, message.Length);
    // Shift any remaining bytes to front of the buffer
    if (remaining > 0)
        Array.Copy(Frame, end, Frame, 0, remaining);
    Size = remaining;
    ScanPosition = 0;
    return message;
}

byte[] rawMessage = Buffer.ExtractMessage(messageStart, messageEnd);
// Once bytes are received, I want to create an xml file which will be used for further work
string msg = Encoding.UTF8.GetString(rawMessage);
CreateXMLFile(msg);

public void CreateXMLFile(string msg)
{
    string fileName = "msg.xml";
    if (File.Exists(fileName))
    {
        File.Delete(fileName);
    }
    using (File.Create(fileName)) { };
    TextWriter tw = new StreamWriter(fileName, true);
    tw.Write(msg);
    tw.Close();
}
.NET strings are stored as UTF-16, which means two bytes per character. As your source is UTF-8, you'll roughly double the memory usage when converting to a string.
Once you've converted the text to a string, nothing more will happen unless you try to modify it. string objects are immutable, which means a new copy of the string is created each time you modify it using one of its methods such as Remove().
You can read more here: How are strings passed in .NET?
A byte array, however, is a reference type: every variable holding it refers to the same array, and changes are made in place rather than producing copies. Thus changes will not hurt performance/memory consumption.
You can get a byte[] from a string by using var buffer = yourEncoding.GetBytes(yourString);. Common encodings can be accessed through static properties: var buffer = Encoding.UTF8.GetBytes(yourString);

Invalid paramether in Image.FromStream(MemoryStream)

I'm trying to send an image via a network stream. I have SendData and GetData functions, and I always get an "invalid parameter" error when using the Image.FromStream function.
This is my code:
I grab the picture from the screen, convert it to a byte[], and put it into a MemoryStream that I send via a NetworkStream.
private void SendData()
{
    StreamWriter swWriter = new StreamWriter(this._nsClient);
    // BinaryFormatter bfFormater = new BinaryFormatter();
    // this method
    lock (this._secLocker)
    {
        while (this._bShareScreen)
        {
            // Check if you need to send the screen
            if (this._bShareScreen)
            {
                MemoryStream msStream = new MemoryStream();
                this._imgScreenSend = new Bitmap(this._imgScreenSend.Width, this._imgScreenSend.Height);
                // Send an image code
                swWriter.WriteLine(General.IMAGE);
                swWriter.Flush();
                // Copy image from screen
                this._grGraphics.CopyFromScreen(0, 0, 0, 0, this._sizScreenSize);
                this._imgScreenSend.Save(msStream, System.Drawing.Imaging.ImageFormat.Jpeg);
                msStream.Seek(0, SeekOrigin.Begin);
                // Create the package
                byte[] btPackage = msStream.ToArray();
                // Send its length
                swWriter.WriteLine(btPackage.Length.ToString());
                swWriter.Flush();
                // Send the package
                _nsClient.Write(btPackage, 0, btPackage.Length);
                _nsClient.Flush();
            }
        }
    }
}

private void ReciveData()
{
    StreamReader srReader = new StreamReader(this._nsClient);
    string strMsgCode = String.Empty;
    bool bContinue = true;
    // BinaryFormatter bfFormater = new BinaryFormatter();
    DataContractSerializer x = new DataContractSerializer(typeof(Image));
    // Lock this method
    lock (this._objLocker)
    {
        while (bContinue)
        {
            // Get the next msg
            strMsgCode = srReader.ReadLine();
            // Check code
            switch (strMsgCode)
            {
                case (General.IMAGE):
                {
                    // Read byte array
                    int nSize = int.Parse(srReader.ReadLine().ToString());
                    byte[] btImageStream = new byte[nSize];
                    this._nsClient.Read(btImageStream, 0, nSize);
                    // Get the stream
                    MemoryStream msImageStream = new MemoryStream(btImageStream, 0, btImageStream.Length);
                    // Set seek, so we read the image from the beginning of the stream
                    msImageStream.Position = 0;
                    // Build the image from the stream
                    this._imgScreenImg = Image.FromStream(msImageStream); // Error Here
Part of the problem is that you're using WriteLine() which adds Environment.NewLine at the end of the write. When you just call Read() on the other end, you're not dealing with that newline properly.
What you want to do is just Write() to the stream and then read it back on the other end.
The conversion to a string is strange.
What you're doing, when transferring an image, is sending an array of bytes. All you need to do is send the length of the expected stream and then the image itself, and then read the length and the byte array on the other side.
The most basic and naive way of transferring a byte array over the wire is to first send an integer that represents the length of the array, and read that length on the receiving end.
Once you now know how much data to send/receive, you then send the array as a raw array of bytes on the wire and read the length that you previously determined on the other side.
Now that you have the raw bytes and a size, you can reconstruct the array from your buffer into a valid image object (or whatever other binary format you've just sent).
Also, I'm not sure why that DataContractSerializer is there. It's raw binary data, and you're already manually serializing it to bytes anyway, so that thing isn't useful.
One of the fundamental problems of network programming using sockets and streams is defining your protocol, because the receiving end can't otherwise know what to expect or when the stream will end. That's why every common protocol out there either has a very strictly defined packet size and layout or else does something like sending length/data pairs, so that the receiving end knows what to do.
If you implement a very simple protocol such as sending an integer which represents array length and reading an integer on the receiving end, you've accomplished half the goal. Then, both sender and receiver are in agreement as to what happens next. Then, the sender sends exactly that number of bytes on the wire and the receiver reads exactly that number of bytes on the wire and considers the read to be finished. What you now have is an exact copy of the original byte array on the receiving side and you can then do with it as you please, since you know what that data was in the first place.
If you need a code example, I can provide a simple one or else there are numerous examples available on the net.
Trying to keep it short:
The Stream.Read function (which you use) returns an int stating how many bytes were actually read; this is returned to you so you can verify that all the bytes you need have been received.
Something like:
int byteCount = 0;
while (byteCount < nSize)
{
    int read = this._nsClient.Read(btImageStream, byteCount, nSize - byteCount);
    byteCount += read;
}
this is not the best code for the job

C++ zlib inflate failing - translation of c# fixup?

I'm trying to inflate a string that was deflated with zlib, but it's failing, apparently because it doesn't have the right header. I read elsewhere that the C# solution to this problem is:
public static byte[] FlateDecode(byte[] inp, bool strict) {
    MemoryStream stream = new MemoryStream(inp);
    InflaterInputStream zip = new InflaterInputStream(stream);
    MemoryStream outp = new MemoryStream();
    byte[] b = new byte[strict ? 4092 : 1];
    try {
        int n;
        while ((n = zip.Read(b, 0, b.Length)) > 0) {
            outp.Write(b, 0, n);
        }
        zip.Close();
        outp.Close();
        return outp.ToArray();
    }
    catch {
        if (strict)
            return null;
        return outp.ToArray();
    }
}
But I know nothing about C#. I can surmise that all it's doing is adding a prefix to the string, but what that prefix is, I have no idea. Would someone be able to phrase this function (or even just the header creation and string concatenation) in C++?
The data which I'm trying to inflate is taken from a PDF using zlib deflation.
Thanks a million,
Wyatt
I've had better luck using SharpZipLib for zlib interop than with the native .Net Framework classes. This correctly handles streams from C++ (zlib native) and from Java's compression classes without any funny business being needed.
I can't see any prefixes, sorry. Here's what the logic appears to be; sorry this isn't in C++:
MemoryStream stream = new MemoryStream(inp);
InflaterInputStream zip = new InflaterInputStream(stream);
Create an inflate stream from the data passed
MemoryStream outp = new MemoryStream();
Create a memory buffer stream for output
byte[] b = new byte[strict ? 4092 : 1];
try {
int n;
while ((n = zip.Read(b, 0, b.Length)) > 0) {
If you're in strict mode, read up to 4092 bytes - or 1 in non-strict mode - into a byte buffer
outp.Write(b, 0, n);
Write all the bytes decoded (may be less than the 4092) to the output memory buffer stream
zip.Close();
outp.Close();
return outp.ToArray();
Clean up, and return the output memory buffer stream as an array.
I'm a bit confused, though: why not just cut array b off at n elements and return that rather than go via a MemoryStream? The code also ought really to take care to clean up the memory streams and zip on exception (e.g. using using) since they're all IDisposable but I guess that's not really important since they don't correspond to I/O file handles, only memory structures.
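Since the question explicitly asks for a C++ version: there is no prefix being added anywhere; the C# code is just a plain zlib ("Flate") decompression loop. A rough equivalent in C++ using zlib directly could look like the sketch below (my own untested translation, not the C# library's code; inflateInit expects the two-byte zlib header, which is what PDF FlateDecode streams carry):

#include <stdexcept>
#include <string>
#include <zlib.h>

// Sketch of a C++ counterpart to FlateDecode: inflate a zlib-wrapped buffer.
// In non-strict mode, whatever was decoded before an error is returned anyway.
std::string FlateDecode(const std::string& inp, bool strict)
{
    z_stream zs = {};
    if (inflateInit(&zs) != Z_OK)
        throw std::runtime_error("inflateInit failed");

    zs.next_in = reinterpret_cast<Bytef*>(const_cast<char*>(inp.data()));
    zs.avail_in = static_cast<uInt>(inp.size());

    std::string outp;
    char buffer[4096];
    int ret = Z_OK;
    while (ret == Z_OK) {
        zs.next_out = reinterpret_cast<Bytef*>(buffer);
        zs.avail_out = sizeof(buffer);
        ret = inflate(&zs, Z_NO_FLUSH);
        outp.append(buffer, sizeof(buffer) - zs.avail_out);
    }
    inflateEnd(&zs);

    // The C# version returns null on error in strict mode; throwing is the closest analogue here.
    if (strict && ret != Z_STREAM_END)
        throw std::runtime_error("zlib inflate failed");
    return outp;   // may be partial in non-strict mode
}

If the data turned out to be a raw deflate stream without the zlib header, inflateInit2 with a negative windowBits value (e.g. -MAX_WBITS) would be the variant to try instead.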
