From a call to an external API, my method receives an image/png as an IRestResponse whose Content property holds a string representation of the image.
I need to convert this string representation of the image/png into a byte array without first saving it to disk and reading it back with File.ReadAllBytes. How can I achieve this?
You can try a hex-string-to-byte conversion. Here is a method I've used before. Note that you may have to pad the string depending on how it arrives; the method throws an exception to let you know. However, if whoever sent the image converted it into bytes and then into a hex string (which they should have, based on what you're describing), you won't have to worry about padding.
// Requires: using System.Globalization;
public static byte[] HexToByte(string hexString)
{
    if (hexString.Length % 2 != 0)
        throw new ArgumentException("Invalid HEX", nameof(hexString));

    byte[] retArray = new byte[hexString.Length / 2];
    for (int i = 0; i < retArray.Length; ++i)
    {
        retArray[i] = byte.Parse(hexString.Substring(i * 2, 2),
            NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    }
    return retArray;
}
This might not be the fastest solution, by the way, but it's a good representation of what needs to happen, so you can optimize later.
This also assumes the string being sent to you is the raw hex-converted string. If the sender applied anything like a Base58 conversion first, you will need to decode that and then use this method.
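As a quick sanity check, a hex round trip might look like this (a self-contained sketch that reproduces the conversion as a local function; the sample hex value is just the first four bytes of the PNG signature, not data from the original post):

```csharp
using System;
using System.Globalization;

// Same conversion as the answer's method, inlined so the snippet stands alone.
static byte[] HexToByte(string hexString)
{
    if (hexString.Length % 2 != 0)
        throw new ArgumentException("Invalid HEX", nameof(hexString));
    byte[] result = new byte[hexString.Length / 2];
    for (int i = 0; i < result.Length; ++i)
        result[i] = byte.Parse(hexString.Substring(i * 2, 2),
            NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    return result;
}

byte[] png = HexToByte("89504E47");            // first four bytes of a PNG file
Console.WriteLine(BitConverter.ToString(png)); // 89-50-4E-47
```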
I have found that the IRestResponse from RestSharp actually contains a RawBytes property, which holds the raw bytes of the response content. This meets my needs and no conversion is necessary!
Related
I'm calling a JSON-RPC API that returns a UCHAR array representing a PDF file (so the result property on the response contains a string representation of a UCHAR array). I need to convert this result string into a byte array so I can handle it as a PDF file, i.e., save it and/or forward it as a file in a POST to another API.
I have tried the following (the result variable is the returned UCHAR string):
char[] pdfChar = result.ToCharArray();
byte[] pdfByte = new byte[pdfChar.Length];
for (int i = 0; i < pdfChar.Length; i++)
{
    pdfByte[i] = Convert.ToByte(pdfChar[i]);
}
File.WriteAllBytes(basePath + "test.pdf", pdfByte);
I have also tried:
byte[] pdfByte = Encoding.ASCII.GetBytes(pdfObj.result);
File.WriteAllBytes(basePath + "test.pdf", pdfByte);
With both of these, when I try to open the resulting test.pdf file, it will not open, presumably because it was not converted properly.
Turns out that, although the output of the API function is UCHAR, when it comes in as part of the JSON string, it is a base64 string, so this works for me:
byte[] pdfBytes = Convert.FromBase64String(pdfObj.result);
I'm pretty sure the API is making that conversion "under the hood": while the function being called returns UCHAR, the API uses a framework to create the JSON-RPC responses, and that framework is likely performing the conversion before sending the result out. If it is .NET that makes this conversion from UCHAR to Base64, please feel free to chime in and confirm.
Do you know the file's encoding format? Try using this:
return System.Text.Encoding.UTF8.GetString(pdfObj.result);
EDIT:
The solution you found is also reported here
var base64EncodedBytes = System.Convert.FromBase64String(pdfObj.result);
return System.Text.Encoding.UTF8.GetString(base64EncodedBytes);
byte[] header = new byte[]{255, 216};
string ascii = Encoding.ASCII.GetString(header);
I expect ascii to equal "FFD8" (the JPEG SOI marker).
Instead I get "????"
In this case you'd be better off comparing the byte arrays rather than converting to string.
If you must convert to string, I suggest using the Latin-1 encoding, aka ISO-8859-1, aka Code Page 28591, as this encoding maps every byte value in the range 0-255 to the Unicode character with the same value, which is convenient for this scenario. Any of the following will get this encoding:
Encoding.GetEncoding(28591)
Encoding.GetEncoding("Latin1")
Encoding.GetEncoding("ISO-8859-1")
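A quick round trip illustrates why Latin-1 is lossless here (a minimal sketch covering all 256 byte values):

```csharp
using System;
using System.Linq;
using System.Text;

// Latin-1 (ISO-8859-1) maps byte value N to Unicode code point N, so a
// byte -> string -> byte round trip is lossless for every value 0-255.
Encoding latin1 = Encoding.GetEncoding(28591);
byte[] original = Enumerable.Range(0, 256).Select(i => (byte)i).ToArray();

string asText = latin1.GetString(original);
byte[] roundTripped = latin1.GetBytes(asText);

Console.WriteLine(original.SequenceEqual(roundTripped)); // True
Console.WriteLine((int)asText[0xFF]);                    // 255
```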
Yes, that's because ASCII is only 7-bit - it doesn't define any values above 127. Encodings typically decode unknown binary values to '?' (although this can be changed using DecoderFallback).
If you're about to mention "extended ASCII" I suspect you actually want Encoding.Default which is "the default code page for the operating system"... code page 1252 on most Western systems, I believe.
What characters were you expecting?
EDIT: As per the accepted answer (I suspect the question was edited after I added my answer; I don't recall seeing anything about JPEG originally) you shouldn't convert binary data to text unless it's genuinely encoded text data. JPEG data is binary data - so you should be checking the actual bytes against the expected bytes.
Any time you convert arbitrary binary data (such as images, music or video) into text using a "plain" text encoding (such as ASCII, UTF-8 etc) you risk data loss. If you have to convert it to text, use Base64 which is nice and safe. If you just want to compare it with expected binary data, however, it's best not to convert it to text at all.
EDIT: Okay, here's a class providing an image-detection method for a given byte array. I haven't made it HTTP-specific; I'm not entirely sure whether you should really fetch the InputStream, read just a bit of it, and then fetch the stream again, so I've ducked the issue by sticking to byte arrays :)
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public sealed class SignatureDetector
{
    public static readonly SignatureDetector Png =
        new SignatureDetector(0x89, 0x50, 0x4e, 0x47);

    public static readonly SignatureDetector Bmp =
        new SignatureDetector(0x42, 0x4d);

    public static readonly SignatureDetector Gif =
        new SignatureDetector(0x47, 0x49, 0x46);

    public static readonly SignatureDetector Jpeg =
        new SignatureDetector(0xff, 0xd8);

    public static readonly IEnumerable<SignatureDetector> Images =
        new ReadOnlyCollection<SignatureDetector>(new[] { Png, Bmp, Gif, Jpeg });

    private readonly byte[] bytes;

    public SignatureDetector(params byte[] bytes)
    {
        if (bytes == null)
        {
            throw new ArgumentNullException("bytes");
        }
        this.bytes = (byte[])bytes.Clone();
    }

    public bool Matches(byte[] data)
    {
        if (data == null)
        {
            throw new ArgumentNullException("data");
        }
        if (data.Length < bytes.Length)
        {
            return false;
        }
        for (int i = 0; i < bytes.Length; i++)
        {
            if (data[i] != bytes[i])
            {
                return false;
            }
        }
        return true;
    }

    // Convenience method
    public static bool IsImage(byte[] data)
    {
        return Images.Any(detector => detector.Matches(data));
    }
}
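The core comparison the class performs ("does the data start with this signature?") can be sketched stand-alone like this (a compact equivalent of Matches, not the original code; the sample bytes are a PNG header):

```csharp
using System;
using System.Linq;

// Does the data start with the given signature bytes?
static bool StartsWithSignature(byte[] data, byte[] signature) =>
    data.Length >= signature.Length &&
    signature.SequenceEqual(data.Take(signature.Length));

byte[] pngHeader = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
byte[] pngSignature = { 0x89, 0x50, 0x4E, 0x47 };

Console.WriteLine(StartsWithSignature(pngHeader, pngSignature));              // True
Console.WriteLine(StartsWithSignature(new byte[] { 0xFF, 0xD8 }, pngSignature)); // False
```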
If you then wrote:
Console.WriteLine(ascii);
and expected "FFD8" to print out, that's not how GetString works. For that, you would need:
string ascii = String.Format("{0:X02}{1:X02}", header[0], header[1]);
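An alternative that scales to any number of bytes is BitConverter.ToString, which formats each byte as two hex digits (dash-separated, which you can strip):

```csharp
using System;

byte[] header = new byte[] { 255, 216 };

// BitConverter.ToString renders each byte as two hex digits, dash-separated.
string hex = BitConverter.ToString(header); // "FF-D8"
string compact = hex.Replace("-", "");      // "FFD8"

Console.WriteLine(compact);
```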
I once wrote a custom encoder/decoder that encoded bytes 0-255 to unicode characters 0-255 and back again.
It was only really useful for using string functions on something that isn't actually a string.
Are you sure "????" is the result?
What is the result of:
(int)ascii[0]
(int)ascii[1]
On the other hand, pure ASCII is 0-127 only...
I'm trying to convert a string to a byte[] using the ASCIIEncoding class in the .NET library. The string will never contain non-ASCII characters, but it will usually have a length greater than 16. My code looks like the following:
public static byte[] Encode(string packet)
{
    ASCIIEncoding enc = new ASCIIEncoding();
    byte[] byteArray = enc.GetBytes(packet);
    return byteArray;
}
By the end of the method, the byte array should contain packet.Length bytes, but IntelliSense tells me that all bytes after byteArray[15] are literally question marks that cannot be observed. I used Wireshark to view byteArray after I sent it; it was received fine on the other side, but the end device did not follow the instructions encoded in byteArray. I'm wondering if this has anything to do with IntelliSense not being able to display all elements in byteArray, or if my packet is completely wrong.
If your packet string basically contains characters in the range 0-255, then ASCIIEncoding is not what you should be using. ASCII only defines character codes 0-127; anything in the range 128-255 will get turned into question marks (as you have observed) because those characters are not defined in ASCII.
Consider using a method like this to convert the string to a byte array. (This assumes that the ordinal value of each character is in the range 0-255 and that the ordinal value is what you want.)
public static byte[] ToOrdinalByteArray(this string str)
{
    if (str == null) { throw new ArgumentNullException("str"); }

    var bytes = new byte[str.Length];
    for (int i = 0; i < str.Length; ++i) {
        // Wrapping the cast in checked() will trigger an OverflowException
        // if the character being converted is out of range for a byte.
        bytes[i] = checked((byte)str[i]);
    }
    return bytes;
}
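Exercising that logic (rewritten as a plain static method so the snippet stands alone; the sample strings are illustrative):

```csharp
using System;

// Same ordinal conversion as the extension method above, as a local function.
static byte[] ToOrdinalByteArray(string str)
{
    if (str == null) { throw new ArgumentNullException(nameof(str)); }
    var bytes = new byte[str.Length];
    for (int i = 0; i < str.Length; ++i)
    {
        bytes[i] = checked((byte)str[i]);
    }
    return bytes;
}

byte[] packet = ToOrdinalByteArray("AB\u00FF"); // 'A'=65, 'B'=66, U+00FF=255
Console.WriteLine(string.Join(",", packet));    // 65,66,255

// A character outside the 0-255 range overflows instead of silently corrupting:
try { ToOrdinalByteArray("\u0100"); }
catch (OverflowException) { Console.WriteLine("overflow"); }
```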
The Encoding class hierarchy is specifically designed for handling text. What you have here doesn't seem to be text, so you should avoid using these classes.
The standard encoders use the replacement character fallback strategy. If a character doesn't exist in the target character set, they encode a replacement character ('?' by default).
To me, that's worse than a silent failure; It's data corruption. I prefer that libraries tell me when my assumptions are wrong.
You can derive an encoder that throws an exception:
Encoding.GetEncoding(
    "us-ascii",
    new EncoderExceptionFallback(),
    new DecoderExceptionFallback());
If you are truly using only characters in Unicode's ASCII range then you'll never see an exception.
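To see the difference in behavior (a minimal sketch; the sample strings are illustrative):

```csharp
using System;
using System.Text;

// An ASCII encoder configured to throw instead of substituting '?'.
Encoding strictAscii = Encoding.GetEncoding(
    "us-ascii",
    new EncoderExceptionFallback(),
    new DecoderExceptionFallback());

Console.WriteLine(strictAscii.GetBytes("plain").Length); // 5

try
{
    strictAscii.GetBytes("caf\u00E9"); // 'é' is outside the ASCII range
}
catch (EncoderFallbackException)
{
    Console.WriteLine("non-ASCII input rejected");
}
```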
I'm reading a file into a byte[] buffer. The file contains a lot of UTF-16 strings (millions) in the following format:
The first byte contains the string length in chars (range 0..255).
The following bytes contain the string's characters in UTF-16 encoding (each char is represented by 2 bytes, so byteCount = charCount * 2).
I need to perform standard string operations for all strings in the file, for example: IndexOf, EndsWith and StartsWith, with StringComparison.OrdinalIgnoreCase and StringComparison.Ordinal.
For now my code first converts each string from the byte array to the System.String type. I found the following code to be the most efficient way to do so:
// position/length validation removed to minimize the code
string result;
byte charLength = _buffer[_bufferI++];
int byteLength = charLength * 2;
fixed (byte* pBuffer = &_buffer[_bufferI])
{
    result = new string((char*)pBuffer, 0, charLength);
}
_bufferI += byteLength;
return result;
Still, new string(char*, int, int) is very slow because it performs an unnecessary copy for each string.
The profiler says System.String.wstrcpy(char*, char*, int32) is what's slow.
I need a way to perform string operations without copying bytes for each string.
Is there a way to perform string operations on byte array directly?
Is there a way to create new string without copying its bytes?
No, you can't create a string without copying the character data.
The String object stores the metadata for the string (Length, etc.) in the same memory area as the character data, so you can't keep the character data in the byte array and pretend it's a String object.
You could try other ways of constructing the string from the byte data and see if any of them has less overhead, like Encoding.Unicode.GetString (Encoding.Unicode is .NET's UTF-16 encoding).
If you are using a pointer, you could try to get multiple strings at a time, so that you don't have to fix the buffer for each string.
You could read the file using a StreamReader with Encoding.Unicode (UTF-16) so you do not have the "byte overhead" in between:

using (StreamReader sr = new StreamReader(filename, Encoding.Unicode))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // Your code
    }
}
You could create extension methods on byte arrays to handle most of those string operations directly on the byte array and avoid the cost of converting. I'm not sure which string operations you perform, so not all of them may be achievable this way.
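For example, an ordinal StartsWith over the raw UTF-16LE bytes might be sketched like this (method and parameter names are illustrative, not from the original code):

```csharp
using System;
using System.Text;

// Sketch: ordinal StartsWith directly on UTF-16LE bytes, avoiding string creation.
// The record starts at byteOffset and is charLength chars long (2 bytes per char).
static bool StartsWithOrdinal(byte[] buffer, int byteOffset, int charLength, string prefix)
{
    if (prefix.Length > charLength) return false;
    for (int i = 0; i < prefix.Length; i++)
    {
        // Little-endian UTF-16: low byte first.
        char c = (char)(buffer[byteOffset + 2 * i] | (buffer[byteOffset + 2 * i + 1] << 8));
        if (c != prefix[i]) return false;
    }
    return true;
}

byte[] utf16 = Encoding.Unicode.GetBytes("hello");
Console.WriteLine(StartsWithOrdinal(utf16, 0, 5, "he")); // True
Console.WriteLine(StartsWithOrdinal(utf16, 0, 5, "no")); // False
```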
I have the following code to encode a password field, but it throws an error when the password field is longer than ten characters.
private string base64Encode(string sData)
{
    try
    {
        byte[] encData_byte = new byte[sData.Length];
        encData_byte = System.Text.Encoding.UTF8.GetBytes(sData);
        string encodedData = Convert.ToBase64String(encData_byte);
        return encodedData;
    }
    catch (Exception ex)
    {
        throw new Exception("Error in base64Encode" + ex.Message);
    }
}
This is the code to decode the encoded value
public string base64Decode(string sData)
{
    System.Text.UTF8Encoding encoder = new System.Text.UTF8Encoding();
    System.Text.Decoder utf8Decode = encoder.GetDecoder();
    byte[] todecode_byte = Convert.FromBase64String(sData);
    int charCount = utf8Decode.GetCharCount(todecode_byte, 0, todecode_byte.Length);
    char[] decoded_char = new char[charCount];
    utf8Decode.GetChars(todecode_byte, 0, todecode_byte.Length, decoded_char, 0);
    string result = new String(decoded_char);
    return result;
}
That code itself shouldn't be failing - but it's not actually providing any protection for the password. I'm not sure what kind of "encoding" you're really trying to do, but this is not the way to do it. Issues:
Even if this worked, it's terrible from a security point of view - this isn't encryption, hashing, or anything like it
You're allocating a new byte array for no good reason - why?
You're catching Exception, which is almost always a bad idea
Your method ignores .NET naming conventions
If you can explain to us what the bigger picture is, we may be able to suggest a much better approach.
My guess is that the exception you're seeing is actually coming when you call Convert.FromBase64String, i.e. in the equivalent decoding method, which you haven't shown us.
I think you will need to modify your code.
These are 2 links which gives more details -
Encrypt and decrypt a string
http://msdn.microsoft.com/en-us/library/system.security.cryptography.rsacryptoserviceprovider.aspx
They are correct that this is not secure, but the question you asked was why the code is failing. Base64 strings usually take up more space than the data they encode: you are storing the same amount of information using a smaller alphabet (64 characters instead of 256), so the string expands. If you dimension an array based on the size of the input string, any time the Base64 string exceeds the input's length you get an error. Instead of writing the code yourself, use the built-in converters.
System.Convert.ToBase64String()
System.Convert.FromBase64String()
But as I mentioned before, this is not secure, so only use it if you are working with a legacy system and need to preserve existing functionality for some reason.
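A minimal round trip with those converters (the sample password is illustrative):

```csharp
using System;
using System.Text;

// Round trip: string -> UTF-8 bytes -> Base64 -> bytes -> string.
string password = "a-password-longer-than-ten-chars";

string encoded = Convert.ToBase64String(Encoding.UTF8.GetBytes(password));
string decoded = Encoding.UTF8.GetString(Convert.FromBase64String(encoded));

Console.WriteLine(decoded == password); // True
```

Note that the encoded form is longer than the input, which is exactly the expansion described above.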