I run into situations where this would be useful fairly often, and if it exists I want to know about it. I'm not sure how to describe it well enough to search for it, but it's basically a one-line loop statement, similar to a lambda. This isn't the best example (there's a simple solution without it), but it's what was on my mind when I finally decided to ask this question.
(The following is what I imagine it would look like. I'm asking whether something similar exists.)
In my current situation, I am converting a string into a byte array to write to a stream. I want to be able to do this to create the byte array:
byte[] data = String ==> (int i; Convert.ToByte(String[i]))
Where i is the index into the string, running over its length, and the expression after it produces the output for each item.
You should read about LINQ.
Your code can be written as:
var String = "some string";
byte[] data = String.Select(x => Convert.ToByte(x)).ToArray();
or even with method group:
byte[] data = String.Select(Convert.ToByte).ToArray();
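One caveat with the Select approach: Convert.ToByte(char) throws an OverflowException for any character above 255, so it only suits text known to be ASCII or Latin-1. A minimal sketch of the two options side by side, assuming UTF-8 is what the reader of the stream expects (the class and method names are illustrative, not from the question):

```csharp
using System;
using System.Linq;
using System.Text;

public static class StringBytes
{
    // One byte per character; throws OverflowException for chars above 255.
    public static byte[] PerChar(string text)
    {
        return text.Select(c => Convert.ToByte(c)).ToArray();
    }

    // Encoding-aware conversion; handles any character (assumes UTF-8 is wanted).
    public static byte[] Utf8(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }
}
```

For plain ASCII input both methods return the same bytes; they only diverge once the text contains characters outside that range.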
Related
A message sent over a socket to another form is decoded on the receiving side, but comparing the decoded string doesn't work: the if below is skipped every time, even when the message should match.
byte[] receivedData = new byte[1500];
receivedData = (byte[])aResult.AsyncState;
string data = encoder.GetString(receivedData);
listMessage.Items.Add("Friend: " + data);
if (data == "Friend Disconnected")
{
    // this block is never executed
    listMessage.Items.Clear();
    lblHostPort.Text = "";
    lblLocalPort.Text = "";
    grpFriend.Visible = true;
    grpHost.Visible = true;
    button1.Text = "connect";
}
String comparison only succeeds if the strings are exactly the same. Extra, missing or different whitespace, a lowercase letter where an uppercase one should be, even a different Unicode normalisation: all of this and more can get in the way. As you are creating that string from raw bytes, a different encoding could also throw a wrench into the mix.
As a general rule, string is a terrible type for processing and for transmitting information. The only type that is somewhat worse is raw bytes themselves. The only advantage of a string is that it is (often) human-readable.
A numeric error code, or even an enumeration, tends to be leagues more reliable for this kind of work.
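As a rough illustration of that last point, a disconnect notice could travel as a one-byte code rather than as text. The enum, its values and the framing layout below are hypothetical, invented for this sketch:

```csharp
using System;

// Hypothetical message codes; the names and values are assumptions.
public enum MessageKind : byte
{
    Chat = 1,
    FriendDisconnected = 2
}

public static class MessageCodes
{
    // Prefix the payload with a single byte identifying the message kind.
    public static byte[] Frame(MessageKind kind, byte[] payload)
    {
        byte[] framed = new byte[payload.Length + 1];
        framed[0] = (byte)kind;
        Array.Copy(payload, 0, framed, 1, payload.Length);
        return framed;
    }

    // Read the kind back; no string comparison is involved.
    public static MessageKind KindOf(byte[] framed)
    {
        return (MessageKind)framed[0];
    }
}
```

The receiver can then branch on KindOf(receivedData) == MessageKind.FriendDisconnected, which cannot be thrown off by whitespace, casing or encoding.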
There are two possibilities for your issue.
The first is that you may not be using the correct encoding in your encoder object (difficult to say without more information about that object).
Encoding
Something else you can try is to check whether you get a better result by using the String.Compare method instead of the == operator.
You will then be able to perform a case-insensitive comparison, or one with other specific options.
Again, I can't give you more information right now, as you don't show the content of the data variable in your question.
string comparison method
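One common cause with socket code of this shape (an assumption here, since the question doesn't show what data contains) is that the whole fixed-size buffer is decoded, so the string carries trailing '\0' characters or a line break. A sketch that strips those before an explicit comparison; the helper names and the choice of ASCII are illustrative only:

```csharp
using System;
using System.Text;

public static class MessageCompare
{
    // Decode only the bytes actually received, then strip padding and whitespace.
    public static string Decode(byte[] buffer, int bytesReceived)
    {
        string text = Encoding.ASCII.GetString(buffer, 0, bytesReceived);
        // '\0' is not whitespace, so Trim() alone would leave it in place.
        return text.TrimEnd('\0').Trim();
    }

    // Explicit ordinal, case-insensitive comparison instead of ==.
    public static bool IsDisconnect(string data)
    {
        return string.Equals(data, "Friend Disconnected",
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

Printing data.Length next to the message is a quick way to check whether padding like this is the culprit.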
So I've looked at Dictionaries and various arrays for this, and I'm sure I'm missing an elegant solution.
Currently, I have a configuration dictionary that has information about what data needs to be retrieved.
Then I create a string[,] array where the first string is the item number and the second is the configuration value for a given item, then the value is the value for that configuration item. Something like this:
ret[0,0] = "12345678"
ret[0,1] = "\\localhost\images"
ret[0,2] = "\test.img"
ret[1,0] = "23231231"
ret[1,1] = "\\localhost\images"
ret[1,2] = "\here.img"
There are more values, but that's the gist of it.
Now I also need to grab each of those .img files (which are concatenated TIFF files) and extract the images into byte[] values. Some of the additional values are an offset and length in the file for that item number's image, so extracting the images is easy. For some reason, however, I'm having a hard time finding a smart way to index the byte[] arrays for a given image (there's a front and a rear image for each) with the index value of the ret[,] array. Neither Dictionaries nor Lists seem like they'd work. If I could have a jagged array with mixed values, that would work, but I don't really see how to do that.
Please let me know if I'm not making sense regarding what I'm looking for. I may need to draw it out lol
Thanks!
You need something like:
Dictionary<int, Dictionary<int, string>> myVar = new Dictionary<int, Dictionary<int, string>>();
myVar.Add(0, new Dictionary<int, string> { { 0, "string" } });
Console.WriteLine(myVar[0][0]);
You might also want to check the DataTable class.
You can simply define your own class for the image.
It stores the number, and all the other strings and the byte array. Then you implement a List of this class.
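A minimal sketch of such a class, with a helper that indexes a list of them by item number. The property names are guesses based on the values shown in the question, not a known schema:

```csharp
using System.Collections.Generic;

// Hypothetical record for one item; property names are assumptions.
public class ItemImage
{
    public string ItemNumber { get; set; }
    public string Folder { get; set; }
    public string FileName { get; set; }
    public byte[] Front { get; set; }
    public byte[] Rear { get; set; }
}

public static class ItemImages
{
    // Index the items by item number for direct lookup.
    public static Dictionary<string, ItemImage> ByNumber(IEnumerable<ItemImage> items)
    {
        var map = new Dictionary<string, ItemImage>();
        foreach (ItemImage item in items)
        {
            map[item.ItemNumber] = item;
        }
        return map;
    }
}
```

With this in place the strings and both byte arrays for an item stay together, so there is no second structure to keep in step with the ret[,] indices.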
I want to convert a binary file to a string which could be then converted back to the binary file.
I tried this:
byte[] byteArray = File.ReadAllBytes(@"D:\pic.png");
for (int i = 0; i < byteArray.Length; i++)
{
    textBox1.Text += (char)byteArray[i];
}
but it's too slow, it takes about 20 seconds to convert 5KB on i5 CPU.
I noticed that notepad does the same in much less time.
Any ideas on how to do it?
Thanks
If you want to be able to convert back to binary without losing any information, you shouldn't be doing this sort of thing at all - you should use base64 encoding or something similar:
textBox1.Text = Convert.ToBase64String(byteArray);
You can then convert back using byte[] data = Convert.FromBase64String(text);. The important thing is that base64 converts arbitrary binary data to known ASCII text; all byte sequences are valid, all can be round-tripped, and as it only requires ASCII it's friendly to many transports.
There are four important things to take away here:
Don't treat arbitrary binary data as if it were valid text in a particular encoding. Phil Haack wrote about this in a blog post recently, in response to some of my SO answers.
Don't perform string concatenation in a loop; use a StringBuilder if you want to create one final string out of lots of bits, and you don't know how many bits in advance.
Don't use UI properties in a loop unnecessarily - even if the previous steps were okay, it would have been better to construct the string with a loop and then do a single assignment to the Text property.
Learn about System.Text.Encoding for the situation where you really have got encoded text; Encoding.UTF8.GetString(byteArray) would have been appropriate if this had been UTF-8-encoded data, for example.
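To make the first two points concrete, here is a small sketch of a lossless round trip through base64, plus a StringBuilder loop for the case where a per-byte loop really is needed. The class name and the hex format are illustrative choices, not part of the answer above:

```csharp
using System;
using System.Text;

public static class BinaryText
{
    // Lossless: every byte sequence maps to ASCII text and back.
    public static string ToText(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    public static byte[] FromText(string text)
    {
        return Convert.FromBase64String(text);
    }

    // If a per-byte loop is needed, build the string once with a StringBuilder
    // and assign the result to the UI property a single time afterwards.
    public static string ToHex(byte[] data)
    {
        var sb = new StringBuilder(data.Length * 2);
        foreach (byte b in data)
        {
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString();
    }
}
```

In the question's terms that would be a single textBox1.Text = BinaryText.ToText(byteArray); assignment after reading the file.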
Does anyone else feel that iterators come up short when you want to take apart a sequence piece by piece?
Maybe I should just start writing my code in F# (by the way, does anybody know whether F# uses lazy evaluation?), but I've found myself wanting a way to pull at a sequence in a very distinct way.
e.g.
// string.Split implemented as a generator, with lazy evaluation
var it = "a,b,c".GetSplit(',').GetEnumerator();
it.MoveNext();
var a = it.Current;
it.MoveNext();
it.MoveNext();
var c = it.Current;
That works, but I don't like it, it is ugly. So can I do this?
var it = "a,b,c".GetSplit(',');
string a;
var c = it.Yield(out a).Skip(1).First();
That's better. But I'm wondering if there's another way of generalizing the same semantic, maybe this is good enough. Usually I'm doing some embedded string parsing, that's when this pops out.
There's also the case where I wish to consume a sequence up to a specific point, then basically, fork it (or clone it, that's better). Like so
var s = "abc";
IEnumerable<char> a;
var b = s.Skip(1).Fork(out a);
var s2 = new string(a.ToArray()); // "bc"
var s3 = new string(b.ToArray()); // "bc"
This last one might not look that useful at first, but I find that it solves backtracking issues rather conveniently.
My question is do we need this? or does it already exist in some manner and I've just missed it?
Sequences basically work OK at what they do, which is to provide a simple interface that yields a stream of values on demand. If you have more complicated demands then you're welcome to use a more powerful interface.
For instance, your string examples look like they could benefit from being written as a parser: that is, a function that consumes a sequence of characters from a stream and uses internal state to keep track of where it is in the stream.
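As one possible shape for that idea, here is a small cursor that splits lazily and keeps its own position, so it can be advanced piece by piece and copied when a backtracking point is needed. SplitCursor and its members are hypothetical names made up for this sketch; nothing like it ships with the framework:

```csharp
using System;

// Hypothetical cursor over a delimited string; it tracks its own position.
public class SplitCursor
{
    private readonly string _text;
    private readonly char _separator;
    private int _position; // index of the next unread character

    public SplitCursor(string text, char separator)
    {
        _text = text;
        _separator = separator;
        _position = 0;
    }

    // Reads the next token on demand; returns null once the input is exhausted.
    public string Next()
    {
        if (_position > _text.Length) return null;
        int end = _text.IndexOf(_separator, _position);
        if (end < 0) end = _text.Length;
        string token = _text.Substring(_position, end - _position);
        _position = end + 1;
        return token;
    }

    // Copies the current state, so a caller can backtrack to this point.
    public SplitCursor Clone()
    {
        SplitCursor copy = new SplitCursor(_text, _separator);
        copy._position = _position;
        return copy;
    }
}
```

The first example from the question becomes three calls to Next() with the middle result ignored, and the Fork case becomes a call to Clone() before reading on.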
I am trying to compare two large datasets from a SQL query. Right now the SQL query is run externally and the results from each dataset are saved into their own csv file. My little C# console application loads the two text/csv files, compares them for differences and saves the differences to a text file.
It's a very simple application that just loads all the data from the first file into an ArrayList and does a .compare() on the ArrayList as each line is read from the second csv file. It then saves the records that don't match.
The application works, but I would like to improve the performance. I figure I can greatly improve performance if I can take advantage of the fact that both files are sorted, but I don't know of a data type in C# that keeps order and would allow me to select a specific position. There's a basic array, but I don't know how many items are going to be in each list. I could have over a million records. Is there a data type available that I should be looking at?
If the data in both of your CSV files is already sorted and the files have the same number of records, you could skip the data structure entirely and do the analysis in place.
// Verbatim strings (@"...") stop "\f" from being read as a form-feed escape.
StreamReader one = new StreamReader(@"C:\file1.csv");
StreamReader two = new StreamReader(@"C:\file2.csv");
String lineOne;
String lineTwo;
StreamWriter differences = new StreamWriter("Output.csv");
while (!one.EndOfStream)
{
    lineOne = one.ReadLine();
    lineTwo = two.ReadLine();
    // do your comparison; plain line equality is used here as a placeholder.
    bool areDifferent = lineOne != lineTwo;
    if (areDifferent)
        differences.WriteLine(lineOne + lineTwo);
}
one.Close();
two.Close();
differences.Close();
System.Collections.Specialized.StringCollection allows you to add a range of values and, using the .IndexOf(string) method, allows you to retrieve the index of that item.
That being said, you could likely just load a couple of byte[] arrays from a FileStream and do a byte comparison. Don't even worry about loading the data into a formal data structure like StringCollection or string[]; if all you're doing is checking for differences and you want speed, I would reckon byte differences are where it's at.
This is an adaptation of David Sokol's code to work with a varying number of lines, outputting the lines that are in one file but not the other:
StreamReader one = new StreamReader(@"C:\file1.csv");
StreamReader two = new StreamReader(@"C:\file2.csv");
StreamWriter differences = new StreamWriter("Output.csv");
String lineOne = one.ReadLine();
String lineTwo = two.ReadLine();
// ReadLine returns null at end of file, so loop until both files are exhausted
while (lineOne != null || lineTwo != null)
{
    // strings have no < operator; compare ordinally, treating an exhausted file as "greater"
    int order = lineOne == null ? 1
              : lineTwo == null ? -1
              : String.CompareOrdinal(lineOne, lineTwo);
    if (order == 0)
    {
        // lines match, read next line from each and continue
        lineOne = one.ReadLine();
        lineTwo = two.ReadLine();
    }
    else if (order < 0)
    {
        // lineOne is only in the first file
        differences.WriteLine(lineOne);
        lineOne = one.ReadLine();
    }
    else
    {
        // lineTwo is only in the second file
        differences.WriteLine(lineTwo);
        lineTwo = two.ReadLine();
    }
}
one.Close();
two.Close();
differences.Close();
Standard caveat about code written off the top of my head applies. The lines are compared ordinally, so both files need to be sorted in that same order, but I think this basic approach should do what you're looking for.
Well, there are several approaches that would work. You could write your own data structure that did this. Or you can try and use SortedList. You can also return the DataSets in code, and then use .Select() on the table. Granted, you would have to do this on both tables.
You can easily use a SortedList to do fast lookups. If the data you are loading is already sorted, insertions into the SortedList should not be slow.
If you are looking simply to see if all lines in FileA are included in FileB you could read it in and just compare streams inside a loop.
File 1
Entry1
Entry2
Entry3
File 2
Entry1
Entry3
You could loop through with two counters and find omissions, going line by line through each file and see if you get what you need.
Maybe I misunderstand, but the ArrayList will maintain its elements in the same order by which you added them. This means you can compare the two ArrayLists within one pass only - just increment the two scanning indices according to the comparison results.
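The single-pass idea with two scanning indices can be sketched as below. It assumes both lists are sorted in ordinal order; SortedDiff is an illustrative name, and IList<string> stands in for the ArrayList from the question:

```csharp
using System;
using System.Collections.Generic;

public static class SortedDiff
{
    // Returns the lines present in exactly one of the two lists.
    // Assumes both lists are sorted in ordinal order.
    public static List<string> Differences(IList<string> a, IList<string> b)
    {
        var result = new List<string>();
        int i = 0, j = 0;
        while (i < a.Count && j < b.Count)
        {
            int order = string.CompareOrdinal(a[i], b[j]);
            if (order == 0) { i++; j++; }                    // same line in both
            else if (order < 0) { result.Add(a[i++]); }      // only in a
            else { result.Add(b[j++]); }                     // only in b
        }
        // Whatever remains in either list has no counterpart in the other.
        while (i < a.Count) result.Add(a[i++]);
        while (j < b.Count) result.Add(b[j++]);
        return result;
    }
}
```

Each element is visited once, so the work grows with the combined length of the two lists rather than with their product.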
One question I have is have you considered "out-sourcing" your comparison. There are plenty of good diff tools that you could just call out to. I'd be surprised if there wasn't one that let you specify two files and get only the differences. Just a thought.
I think the reason everyone has so many different answers is that you haven't quite specified your problem well enough for it to be answered. First off, it depends on what kind of differences you want to track. Do you want the differences to be output as in WinDiff, where the first file is the "original" and the second file is the "modified", so you can list changes as INSERT, UPDATE or DELETE? Do you have a primary key that will allow you to match up two lines as different versions of the same record (when fields other than the primary key are different)? Or is this some sort of reconciliation where you just want your difference output to say something like "RECORD IN FILE 1 AND NOT FILE 2"?
I think the answers to these questions will help everyone to give you a suitable answer to your problem.
If you have two files of a million lines each, as mentioned in your post, you might be using up a lot of memory. Some of the performance problem might be that you are swapping to disk. If you are simply comparing line 1 of file A to line 1 of file B, line 2 of file A to line 2 of file B, and so on, I would recommend a technique that does not store so much in memory. You could read from two file streams, as a previous commenter posted, and write out your results "in real time" as you find them. This would not explicitly store anything in memory. You could also load chunks of each file into memory, say one thousand lines at a time, into something like a List. This could be fine-tuned to meet your needs.
To resolve question #1 I'd recommend looking into creating a hash of each line. That way you can compare hashes quick and easy using a dictionary.
To resolve question #2, one quick and dirty solution would be to use an IDictionary<string, string>, with itemId as the key and the rest of the line as the value. You can then quickly find whether an itemId exists and compare the lines. This of course assumes .NET 2.0+.
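A sketch of the dictionary idea follows. It assumes itemId is the first comma-separated field of each line, which the question does not confirm, and it stores the whole line as the value for simplicity; KeyedDiff and its methods are illustrative names:

```csharp
using System;
using System.Collections.Generic;

public static class KeyedDiff
{
    // Maps itemId (assumed to be the first comma-separated field) to the full line.
    public static Dictionary<string, string> Index(IEnumerable<string> lines)
    {
        var map = new Dictionary<string, string>();
        foreach (string line in lines)
        {
            int comma = line.IndexOf(',');
            string itemId = comma < 0 ? line : line.Substring(0, comma);
            map[itemId] = line;
        }
        return map;
    }

    // Lines of the second file whose itemId is missing from, or differs in, the first.
    public static List<string> Changed(Dictionary<string, string> first,
                                       IEnumerable<string> second)
    {
        var changed = new List<string>();
        foreach (KeyValuePair<string, string> pair in Index(second))
        {
            string original;
            if (!first.TryGetValue(pair.Key, out original) || original != pair.Value)
            {
                changed.Add(pair.Value);
            }
        }
        return changed;
    }
}
```

Unlike the sorted merge, this does not rely on the files being in order, at the cost of holding one file's worth of lines in memory.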