Strings not comparing - c#

A message that was sent over a socket from another form and then decoded cannot be compared. The comparison compiles, but the if never succeeds: the if block is skipped every time.
byte[] receivedData = new byte[1500];
receivedData = (byte[])aResult.AsyncState;
string data = encoder.GetString(receivedData);
listMessage.Items.Add("Friend: " + data);
if (data == "Friend Disconnected")
{
// this block never runs
listMessage.Items.Clear();
lblHostPort.Text = "";
lblLocalPort.Text = "";
grpFriend.Visible = true;
grpHost.Visible = true;
button1.Text = "connect";
}

String comparison only works if the strings are exactly the same. An extra, missing, or different whitespace character, a lowercase letter where an uppercase one should be, even a different Unicode normalisation: all of this and more can get in the way. As you are creating that string from raw bytes, a different encoding on either side could also throw a wrench into the mix.
As a general rule, string is a terrible type for processing and transmitting information. The only type somewhat worse is raw bytes themselves. The only advantage of string is that it is (often) human-readable.
A numeric error code or even an enumeration tends to be leagues more reliable for this kind of work.
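One plausible cause, offered as a guess rather than a diagnosis: if the 1500-byte buffer is only partly filled, decoding the whole buffer produces a 1500-character string padded with '\0' characters, which can never equal "Friend Disconnected". The sketch below assumes ASCII text and a known received byte count (for example the value returned by Socket.EndReceive); MessageDecoder and its method names are hypothetical.

```csharp
using System;
using System.Text;

static class MessageDecoder
{
    // Decodes only the bytes that were actually received, so the unused
    // tail of the buffer never ends up in the string.
    // Assumption: the sender used ASCII; swap in the encoding you really use.
    public static string Decode(byte[] buffer, int bytesReceived)
    {
        return Encoding.ASCII.GetString(buffer, 0, bytesReceived);
    }

    // Fallback when the byte count is unknown: strip the trailing NUL
    // padding first, then any surrounding whitespace.
    public static string DecodeAndTrim(byte[] buffer)
    {
        return Encoding.ASCII.GetString(buffer).TrimEnd('\0').Trim();
    }
}
```

Printing data.Length next to the message in the list box is a quick way to check whether padding is the problem.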

There are two possibilities for your issue.
The first is that you may not be using the correct encoding in your encoder object (difficult to say without additional information about this object).
Encoding
Something else you can try is to check whether you get a better result by using the Compare method on the strings instead of the == operator.
You will then be able to perform a case-insensitive comparison or one with specific options.
Again, I can't give you more information right now, as you don't indicate the content of the data variable in your question.
string comparison method
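As a rough illustration of comparing with explicit options, the sketch below trims the incoming text and compares it case-insensitively. MessageComparer and IsDisconnect are hypothetical names, and whether ignoring case and surrounding whitespace is acceptable depends on your protocol.

```csharp
using System;

static class MessageComparer
{
    private const string DisconnectMessage = "Friend Disconnected";

    // Tolerates surrounding whitespace and differences in letter case.
    public static bool IsDisconnect(string data)
    {
        string cleaned = (data ?? string.Empty).Trim();
        return string.Equals(cleaned, DisconnectMessage,
                             StringComparison.OrdinalIgnoreCase);
    }

    // Same check written with string.Compare, which returns 0 on a match.
    public static bool IsDisconnectViaCompare(string data)
    {
        string cleaned = (data ?? string.Empty).Trim();
        return string.Compare(cleaned, DisconnectMessage,
                              StringComparison.OrdinalIgnoreCase) == 0;
    }
}
```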

Related

Does string.Replace(string, string) create additional strings?

We have a requirement to transform a string containing a date in dd/mm/yyyy format to ddmmyyyy format (In case you want to know why I am storing dates in a string, my software processes bulk transactions files, which is a line based textual file format used by a bank).
And I am currently doing this:
string oldFormat = "01/01/2014";
string newFormat = oldFormat.Replace("/", "");
Sure enough, this converts "01/01/2014" to "01012014". But my question is, does the replace happen in one step, or does it create an intermediate string (e.g.: "0101/2014" or "01/012014")?
Here's the reason why I am asking this:
I am processing transaction files ranging in size from a few kilobytes to hundreds of megabytes. So far I have not had a performance or memory problem, because I am still testing with very small files. But when it comes to megabytes, I am not sure whether I will have problems with these additional strings. I suspect that would be the case because strings are immutable. With millions of records, this additional memory consumption will build up considerably.
I am already using StringBuilders for output file creation. And I also know that the discarded strings will be garbage collected (at some point before the end of time). I was wondering if there is a better, more efficient way of replacing all occurrences of a specific character/substring in a string that does not create an additional string.
Sure enough, this converts "01/01/2014" to "01012014". But my question
is, does the replace happen in one step, or does it create an
intermediate string (e.g.: "0101/2014" or "01/012014")?
No, it doesn't create intermediate strings for each replacement. But it does create a new string, because, as you already know, strings are immutable.
Why?
There is no reason to create a new string on each replacement. It's very simple to avoid, and avoiding it gives a huge performance boost.
If you are very interested, referencesource.microsoft.com and the SSCLI 2.0 source code will demonstrate this (see also: how to see the code of a method which is marked as MethodImplOptions.InternalCall):
FCIMPL3(Object*, COMString::ReplaceString, StringObject* thisRefUNSAFE,
        StringObject* oldValueUNSAFE, StringObject* newValueUNSAFE)
{
    // unnecessary code omitted
    while (((index = COMStringBuffer::LocalIndexOfString(thisBuffer, oldBuffer,
            thisLength, oldLength, index)) > -1) && (index <= endIndex - oldLength))
    {
        replaceIndex[replaceCount++] = index;
        index += oldLength;
    }
    if (replaceCount != 0)
    {
        // Calculate the new length of the string and ensure that we have
        // sufficient room.
        INT64 retValBuffLength = thisLength -
            ((oldLength - newLength) * (INT64)replaceCount);
        gc.retValString = COMString::NewString((INT32)retValBuffLength);
        // unnecessary code omitted
    }
}
As you can see, retValBuffLength is calculated from replaceCount, so the result string is allocated once, at its final size. The real implementation can be a bit different for .NET 4.0 (SSCLI 4.0 is not released), but I assure you it's not doing anything silly :-).
I was wondering if there is a better, more efficient way of replacing
all occurrences of a specific character/substring in a string, that
does not additionally create an string.
Yes: a reusable StringBuilder with a capacity of ~2000 characters, which avoids any memory allocation. This is only true if the replacement lengths are equal, and it can get you a nice performance gain if you're in a tight loop.
Before writing anything, run benchmarks with big files, and see if the performance is enough for you. If performance is enough - don't do anything.
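If benchmarks do show a problem, one way to skip the intermediate string entirely is to copy the characters straight into the output StringBuilder you already have, leaving out the separator as you go. This is a minimal sketch of that idea, not a measured optimisation; DateFieldWriter and AppendWithout are hypothetical names.

```csharp
using System.Text;

static class DateFieldWriter
{
    // Appends every character of source to output except the one to skip.
    // No intermediate string is created; the only growth is inside output.
    public static void AppendWithout(StringBuilder output, string source, char skip)
    {
        foreach (char c in source)
        {
            if (c != skip)
            {
                output.Append(c);
            }
        }
    }
}
```

Usage would look like DateFieldWriter.AppendWithout(outputBuilder, "01/01/2014", '/'), after which outputBuilder holds the characters 01012014 appended to whatever it contained before.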
Well, I'm not a .NET development team member (unfortunately), but I'll try to answer your question.
Microsoft has a great site of .NET Reference Source code, and according to it, String.Replace calls an external method that does the job. I wouldn't argue about how it is implemented, but there's a small comment to this method that may answer your question:
// This method contains the same functionality as StringBuilder Replace. The only difference is that
// a new String has to be allocated since Strings are immutable
Now, if we follow the StringBuilder.Replace implementation, we'll see what it actually does inside.
A little more on string objects:
Although String is immutable in .NET, this is not some kind of limitation; it's a contract. String is actually a reference type, and what it contains is the length of the actual string plus the buffer of characters. You can actually get an unsafe pointer to this buffer and change it "on the fly", but I wouldn't recommend doing this.
Now, the StringBuilder class also holds a character array, and when you pass a string to its constructor it actually copies the string's buffer to its own (see Reference Source). What it doesn't have, though, is the contract of immutability, so when you modify a string using StringBuilder you are actually working with the char array. Note that when you call ToString() on a StringBuilder, it creates a new "immutable" string and copies its buffer there.
So, if you need a fast and memory-efficient way to make changes to a string, StringBuilder is definitely your choice, especially given that Microsoft explicitly recommends using StringBuilder if you "perform repeated modifications to a string".
I haven't found any sources, but I strongly doubt that the implementation creates a new string for every replacement. I would implement it with a StringBuilder internally as well. So String.Replace is absolutely fine if you want to do one replacement on a huge string. But if you have to replace many times, you should consider using StringBuilder.Replace, because every call of String.Replace creates a new string.
So you can use StringBuilder.Replace since you're already using a StringBuilder.
Is StringBuilder.Replace() more efficient than String.Replace?
String.Replace() vs. StringBuilder.Replace()
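To make the difference concrete, here is a small sketch that applies several replacements to one StringBuilder and materialises a string only once at the end. The Normalize helper and the choice of characters to strip are assumptions for illustration, not part of the question.

```csharp
using System.Text;

static class BuilderReplace
{
    // Each StringBuilder.Replace call edits the builder's own buffer and
    // returns the same builder, so the calls can be chained. A new string
    // is created only by the final ToString().
    public static string Normalize(string record)
    {
        var sb = new StringBuilder(record);
        sb.Replace("/", "").Replace("-", "").Replace(" ", "");
        return sb.ToString();
    }
}
```

The equivalent chain of String.Replace calls would allocate one new string per call instead.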
There is no string method for that. You are on your own. But you can try something like this:
string oldFormat = "dd/mm/yyyy";
string[] dt = oldFormat.Split('/');
// Join the three parts with no separator to get ddmmyyyy.
string newFormat = string.Format("{0}{1}{2}", dt[0], dt[1], dt[2]);
or
StringBuilder sb = new StringBuilder(dt[0]);
sb.AppendFormat("{0}{1}", dt[1], dt[2]);

Looping as an expression

I come across situations where this would be useful rather often, and if it exists I want to know about it. I'm not really sure how to describe it well enough to search for it, but it's basically a one-line loop statement, similar to a lambda. This isn't the best example (there's a simple solution without it), but it's what was on my mind when I decided to finally ask this question, and it's the kind of thing I'm talking about.
(The following is what I imagine it would look like. I'm asking if something similar exists.)
In my current situation, I am converting a string into a byte array to write to a stream. I want to be able to do this to create the byte array:
byte[] data = String ==> (int i; Convert.ToByte(String[i]))
Here i is the index into the string, running up to its length, and the expression after it is the output for each item.
You should read about LINQ.
Your code can be written as:
var String = "some string";
byte[] data = String.Select(x => Convert.ToByte(x)).ToArray();
or even with method group:
byte[] data = String.Select(Convert.ToByte).ToArray();
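One caveat worth knowing: Convert.ToByte(char) throws an OverflowException for any character above 255, so the LINQ version only works for text in that range. If the goal is simply to write text to a stream, an Encoding is usually the more direct tool. The sketch below shows both side by side; StringBytes is a hypothetical name and UTF-8 is an assumed choice of encoding.

```csharp
using System;
using System.Linq;
using System.Text;

static class StringBytes
{
    // One byte per char; throws OverflowException for chars above 255.
    public static byte[] ViaLinq(string s)
    {
        return s.Select(c => Convert.ToByte(c)).ToArray();
    }

    // Encodes any text; non-ASCII characters take more than one byte.
    public static byte[] ViaEncoding(string s)
    {
        return Encoding.UTF8.GetBytes(s);
    }
}
```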

How to work with numeric/integer values in Redis with Booksleeve

I'm currently using an in-memory cache and looking to switch to a distributed caching mechanism with Redis. I had a look at ServiceStack's client, but the rate-limiting licensing doesn't work for me, so Booksleeve seems to be recommended.
I've set up a test program to just set a value and then get that same value back from Booksleeve, but it seems like the result I'm getting back isn't in my expected format, and I'm not sure what the right way to handle this is (there isn't much in the way of documentation that I can see). Here is my simple test program:
RedisConnection conn = new RedisConnection("localhost", 6379);
conn.Open();
conn.Strings.Set(1, "test", 100);
var task = conn.Strings.Get(1, "test");
task.Wait();
var x = task.Result; // byte[] result = {49, 48, 48};
var str = BitConverter.ToString(x); // string result = "31-30-30";
var l = BitConverter.ToInt64(x, 0); // Destination array is not long enough to copy all the items in the collection. Check array index and length.
As you can see, at no point do I get back the same value of "100" that I cached with my original "Set" statement. It's also interesting that I don't seem to be able to cache by numeric values (since Get and Set are members of conn.Strings). Is the recommended approach to just .ToString() all numeric key values?
Any explanation as to why I'm unable to get back the original cached value (and best practices for doing this) would be greatly appreciated.
My answer has two parts:
1. Redis always saves strings, even if you set a number. But internally it knows how to perform some specific operations on strings that represent numbers.
For example, if right after your first .Set() assignment you add:
conn.Strings.Increment(1, "test", 1);
the test key will have the value "101", which is a string, but one that Redis produced by an arithmetic calculation.
2. You need to fix your conversion function. The bytes {49, 48, 48} are the text characters '1', '0', '0', so instead of using BitConverter, this is the right way to convert:
var str = System.Text.Encoding.UTF8.GetString(x);
var value = int.Parse(str);
Of course, this snippet doesn't include any kind of error checking, which is fairly easy to add (e.g. what if the value is empty or contains something that is not a number).
As for your last question, whether using .ToString() is the recommended approach: yes. That's the way to work with Redis. But of course, you can write your own utility wrappers that take care of converting values that are supposed to contain numbers into numbers. Something like GetIntValue(string key) or so.
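A wrapper along those lines might look like the sketch below. It deliberately works on the raw byte[] rather than on a Booksleeve connection, so it makes no assumptions about the client API; RedisValues and TryGetInt are hypothetical names.

```csharp
using System.Globalization;
using System.Text;

static class RedisValues
{
    // Converts the raw bytes returned for a key into an int.
    // Returns false for missing, empty, or non-numeric values
    // instead of throwing.
    public static bool TryGetInt(byte[] raw, out int value)
    {
        value = 0;
        if (raw == null || raw.Length == 0)
        {
            return false;
        }

        string text = Encoding.UTF8.GetString(raw);
        return int.TryParse(text, NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out value);
    }
}
```

With the result from the question, RedisValues.TryGetInt(task.Result, out number) would set number to 100.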

converting bytes to a string C#

I want to convert a binary file to a string which could be then converted back to the binary file.
I tried this:
byte[] byteArray = File.ReadAllBytes(@"D:\pic.png");
for (int i = 0; i < byteArray.Length; i++)
{
textBox1.Text += (char)byteArray[i];
}
but it's too slow, it takes about 20 seconds to convert 5KB on i5 CPU.
I noticed that notepad does the same in much less time.
Any ideas on how to do it?
Thanks
If you want to be able to convert back to binary without losing any information, you shouldn't be doing this sort of thing at all - you should use base64 encoding or something similar:
textBox1.Text = Convert.ToBase64String(byteArray);
You can then convert back using byte[] data = Convert.FromBase64String(text);. The important thing is that base64 converts arbitrary binary data to known ASCII text; all byte sequences are valid, all can be round-tripped, and as it only requires ASCII it's friendly to many transports.
There are four important things to take away here:
Don't treat arbitrary binary data as if it were valid text in a particular encoding. Phil Haack wrote about this in a blog post recently, in response to some of my SO answers.
Don't perform string concatenation in a loop; use a StringBuilder if you want to create one final string out of lots of bits, and you don't know how many bits in advance
Don't use UI properties in a loop unnecessarily - even if the previous steps were okay, it would have been better to construct the string with a loop and then do a single assignment to the Text property
Learn about System.Text.Encoding for the situation where you really have got encoded text; Encoding.UTF8.GetString(byteArray) would have been appropriate if this had been UTF-8-encoded data, for example
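The round trip described above can be sketched as a pair of small helpers. BinaryText is a hypothetical name; Convert.ToBase64String and Convert.FromBase64String are the framework methods mentioned in the answer.

```csharp
using System;

static class BinaryText
{
    // Turns arbitrary bytes into ASCII-only text.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Recovers exactly the original bytes from that text.
    public static byte[] Decode(string text)
    {
        return Convert.FromBase64String(text);
    }
}
```

In the question's scenario this becomes a single assignment, textBox1.Text = BinaryText.Encode(byteArray), with no loop and no repeated updates of the Text property.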

Is there a better way to convert to ASCII from an arbitrary input?

I need to be able to take an arbitrary text input that may have a byte order mark (BOM) on it to mark its encoding, and output it as ASCII. We have some old tools that don't understand BOMs and I need to send them ASCII-only data.
Now, I just got done writing this code and I just can't quite believe the inefficiency here. Four copies of the data, not to mention any intermediate buffers internally in StreamReader. Is there a better way to do this?
// i_fileBytes is an incoming byte[]
string unicodeString = new StreamReader(new MemoryStream(i_fileBytes)).ReadToEnd();
byte[] unicodeBytes = Encoding.Unicode.GetBytes(unicodeString.ToCharArray());
byte[] ansiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, unicodeBytes);
string ansiString = Encoding.ASCII.GetString(ansiBytes);
I need the StreamReader() because it has an internal BOM detector to choose the encoding to read the rest of the file. Then the rest is just to make it convert into the final ASCII string.
Is there a better way to do this?
If you've got i_fileBytes in memory already, you can just check whether or not it starts with a BOM, and then convert either the whole of it or just the bit after the BOM using Encoding.Unicode.GetString. (Use the overload which lets you specify an index and length.)
So as code:
// Skip the two-byte UTF-16 little-endian BOM (FF FE) if it is present.
int start = (i_fileBytes.Length >= 2 && i_fileBytes[0] == 0xff && i_fileBytes[1] == 0xfe) ? 2 : 0;
string text = Encoding.Unicode.GetString(i_fileBytes, start, i_fileBytes.Length-start);
Note that that assumes a genuinely little-endian UTF-16 encoding, however. If you really need to detect the encoding first, you could either reimplement what StreamReader does, or perhaps just build a StreamReader from the first (say) 10 bytes, and use the CurrentEncoding property to work out what you should use for the encoding.
EDIT: Now, as for the conversion to ASCII - if you really only need it as a .NET string, then presumably all you want to do is replace any non-ASCII characters with "?" or something similar. (Alternatively it might be better to throw an exception... that's up to you, of course.)
EDIT: Note that when detecting the encoding, it would be a good idea to just call Read() a single time to read one character. Don't call ReadToEnd() as by picking 10 bytes as an arbitrary amount of data, it might end mid-character. I don't know offhand whether that would throw an exception, but it has no benefits anyway...
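Putting the pieces together, here is a minimal sketch that lets StreamReader detect the BOM and then replaces every non-ASCII character with "?". The names AsciiConverter and ToAsciiString are hypothetical, and falling back to UTF-8 when no BOM is present is an assumption you may need to change.

```csharp
using System.IO;
using System.Text;

static class AsciiConverter
{
    public static string ToAsciiString(byte[] fileBytes)
    {
        // detectEncodingFromByteOrderMarks: true lets the reader pick the
        // encoding from the BOM and consume it; UTF-8 is the assumed
        // fallback when there is no BOM.
        using (var reader = new StreamReader(new MemoryStream(fileBytes),
                                             Encoding.UTF8, true))
        {
            string text = reader.ReadToEnd();
            var ascii = new StringBuilder(text.Length);
            foreach (char c in text)
            {
                // Anything outside the 7-bit ASCII range becomes '?'.
                ascii.Append(c <= 127 ? c : '?');
            }
            return ascii.ToString();
        }
    }
}
```

This still makes one decoded copy plus the result, but it avoids the extra Unicode byte array and the Encoding.Convert step from the question.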
System.Text.Encoding.ASCII.GetBytes(new StreamReader(new MemoryStream(i_fileBytes)).ReadToEnd())
That should save a few round-trips.
