Profiling results - how to understand - C#

I profiled my console application, which uses Unity IoC and makes a lot of calls through HttpClient. How should I read the results?
Function Name, Inclusive Samples, Exclusive Samples, Inclusive Samples %, Exclusive Samples %
Microsoft.Practices.Unity.UnityContainer.Resolve, 175, 58, 38.89, 12.89
Microsoft.Practices.Unity.UnityContainer..ctor, 29, 29, 6.44, 6.44
System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.DateTime].Start, 36, 13, 8.00, 2.89
Microsoft.Practices.Unity.UnityContainerExtensions.RegisterInstance, 9, 9, 2.00, 2.00
System.Net.Http.HttpClientHandler..ctor, 9, 9, 2.00, 2.00
System.Net.Http.HttpMessageInvoker.Dispose, 9, 9, 2.00, 2.00
System.Activator.CreateInstance, 20, 8, 4.44, 1.78
Microsoft.Practices.Unity.ObjectBuilder.NamedTypeDependencyResolverPolicy.Resolve, 115, 3, 25.56, 0.67
What does it mean that the inclusive samples for Microsoft.Practices.Unity.UnityContainer.Resolve are 38.89% but the exclusive samples are 12.89%? Is that OK, or is it too much?

"Inclusive" means "exclusive time plus time spent in all callees".
Forget the "exclusive" stuff.
"Inclusive" is what it's costing you.
It says UnityContainer.Resolve is costing you 39% of time,
and Unity.ObjectBuilder.NamedTypeDependencyResolverPolicy.Resolve is costing you 26%.
It looks like the first one calls the second one, so you can't add their times together.
If you could avoid calling all that stuff, you would save at least 40%, giving you a speedup of at least 100/60 = 1.67, i.e. 67% faster.
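The arithmetic behind that estimate can be sketched as follows. `SpeedupIfRemoved` is a hypothetical helper name, not part of any profiler API. It uses the measured 38.89% from the table rather than the rounded 40%, so it comes out at about 1.64 instead of 1.67.

```csharp
using System;

// Speedup obtained if a given share of total run time disappears entirely:
// new time = 1 - fraction, so speedup = 1 / (1 - fraction).
double SpeedupIfRemoved(double inclusivePercent)
{
    double fraction = inclusivePercent / 100.0;
    return 1.0 / (1.0 - fraction);
}

// Inclusive percentage of UnityContainer.Resolve, taken from the table in the question.
double resolveSpeedup = SpeedupIfRemoved(38.89);
Console.WriteLine($"Speedup without Resolve: {resolveSpeedup:F2}");
```

Only inclusive percentages of functions that do not call each other can be combined this way, which is why Resolve and NamedTypeDependencyResolverPolicy.Resolve are not added together.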
By the way, that Unity stuff, while not exactly deprecated, is no longer being maintained.

Related

Read Fortran binary file into C# without knowledge of Fortran source code?

Part one of my question is whether this is even possible. I will briefly describe my situation first.
My work has a licence for software that performs a very specific task; however, most of our time is spent exporting data from the results into Excel etc. to perform further analysis. I was wondering if it was possible to dump all of the data into a C# object so that I can then write my own analysis code, which would save us a lot of time.
The software we license was written in Fortran, but we have no access to the source code. The file looks like it is written out in binary; however, I do not know if it is unformatted / sequential etc. (is there any way to discern this?).
I have used some of the other answers on this site to successfully read in the data to a byte[], however this is as far as I have got. I have tried to change portions to doubles (which I assume most of the data is) but the numbers do not strike me as being meaningful (most appear too large or too small).
I have the documentation for the software and I can see that most of the internal variable names are 8 character strings, would this be saved with the data? If not I think it would be almost impossible to match all the data to its corresponding variable. I imagine most of the data will be double arrays of the same length (the number of time points), however there will also be some arrays with a longer length as some data would have been interpolated where shorter time steps were needed for convergence.
Any tips or hints would be appreciated, or even if someone tells me it's just not possible so I don't waste any more time trying to solve this.
Thank you.
If it was formatted, you should be able to read it with a text editor: The numbers are written in plain text.
So yes, it's probably unformatted.
There are different methods still. The file can have a fixed record length, or it might have a variable one.
But it seems to me that the first 4 bytes represent an integer containing the length of that record in bytes. For example, here I've written the numbers 1 to 10, and then 11 to 30 into an unformatted file, and the file looks like this:
40 1 2 3 4 5 6 7 8 9 10 40
80 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 80
(I added the line break.) Here, the first 4 bytes represent the number 40, followed by ten 4-byte blocks representing the numbers 1-10, followed by another 40. The next record starts with an 80, followed by twenty 4-byte blocks containing the numbers 11 through 30, followed by another 80.
So that might be a pattern you could look for. Read the first 4 bytes and convert them to an integer, then read that many bytes and convert them to whatever you think they should be (4-byte float, 8-byte float, et cetera), and then check whether the next 4 bytes again represent the number that you read first.
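A minimal C# sketch of that check, assuming 4-byte little-endian length markers and a seekable stream (both are assumptions; your file may differ). `ReadRecords` and `BuildSample` are hypothetical helper names. The sample data is the two-record example above, rebuilt in memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Reads records framed as [int32 length][payload][int32 length].
// Throws if the trailing marker does not repeat the leading one.
List<byte[]> ReadRecords(Stream stream)
{
    var records = new List<byte[]>();
    using var reader = new BinaryReader(stream);
    while (stream.Position < stream.Length)
    {
        int length = reader.ReadInt32();           // leading record marker
        if (length < 0)
            throw new InvalidDataException("Negative record length.");
        byte[] payload = reader.ReadBytes(length);
        int trailer = reader.ReadInt32();          // trailing record marker
        if (payload.Length != length || trailer != length)
            throw new InvalidDataException("Record markers do not match.");
        records.Add(payload);
    }
    return records;
}

// Rebuilds the example from the answer: integers 1..10, then 11..30.
byte[] BuildSample()
{
    using var buffer = new MemoryStream();
    using var writer = new BinaryWriter(buffer);
    foreach (var (first, count) in new[] { (1, 10), (11, 20) })
    {
        writer.Write(count * 4);
        for (int i = 0; i < count; i++) writer.Write(first + i);
        writer.Write(count * 4);
    }
    writer.Flush();
    return buffer.ToArray();
}

foreach (byte[] record in ReadRecords(new MemoryStream(BuildSample())))
{
    // Interpret the payload as 4-byte integers; try ToDouble/ToSingle for real data.
    var values = new int[record.Length / 4];
    for (int i = 0; i < values.Length; i++)
        values[i] = BitConverter.ToInt32(record, i * 4);
    Console.WriteLine($"{record.Length} bytes: {string.Join(" ", values)}");
}
```

If the markers match for every record in your file, it was very likely written with sequential unformatted access, and each payload can then be tried as doubles, floats or 8-character names.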
But there are other ways to write data in Fortran that don't have this behaviour, for example direct access and stream. So no guarantees.

Teaching an ANN how to add

Preface: I'm currently learning about ANNs because I have ~18.5k images in ~83 classes. They will be used to train an ANN to recognize approximately equal images in real time. I followed the image example in the book, but it doesn't work for me. So I'm going back to the beginning, as I've likely missed something.
I took the Encog XOR example and extended it to teach it how to add numbers less than 100. So far, the results are mixed, even for exact input after training.
Inputs (normalized from 100): 0+0, 1+2, 3+4, 5+6, 7+8, 1+1, 2+2, 7.5+7.5, 7+7, 50+50, 20+20.
Outputs are the numbers added, then normalized to 100.
After training 100,000 times, some sample output from input data:
0+0=1E-18 (great!)
1+2=6.95
3+4=7.99 (so close!)
5+6=9.33
7+8=11.03
1+1=6.70
2+2=7.16
7.5+7.5=10.94
7+7=10.48
50+50=99.99 (woo!)
20+20=41.27 (close enough)
From cherry-picked unseen data:
2+4=7.75
6+8=10.65
4+6=9.02
4+8=9.91
25+75=99.99 (!!)
21+21=87.41 (?)
I've messed with layers, neuron numbers, and [Resilient|Back]Propagation, but I'm not entirely sure if it's getting better or worse. With the above data, the layers are 2, 6, 1.
I have no frame of reference for judging this. Is this normal? Do I have not enough input? Is my data not complete or random enough, or too weighted?
You are not the first one to ask this. It seems logical to teach an ANN to add. We teach them to function as logic gates, so why not as addition/multiplication operators? I can't answer this completely, because I have not researched it myself to see how well an ANN performs in this situation.
If you are just teaching addition or multiplication, you might have best results with a linear output and no hidden layer. For example, to learn to add, the two weights would need to be 1.0 and the bias weight would have to go to zero:
linear( (input1 * w1) + (input2 * w2) + bias) =
becomes
linear( (input1 * 1.0) + (input2 * 1.0) + (0.0) ) =
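As a rough illustration of the linear case, here is a single linear neuron trained with the delta rule in plain C#. It does not use Encog. The training grid, learning rate and epoch count are assumptions chosen for the sketch, and `Predict` is a hypothetical helper name.

```csharp
using System;

// One linear neuron: output = x1*w1 + x2*w2 + bias, with no hidden layer.
double w1 = 0.0, w2 = 0.0, bias = 0.0;
const double learningRate = 0.1;

double Predict(double inputA, double inputB) => inputA * w1 + inputB * w2 + bias;

for (int epoch = 0; epoch < 2000; epoch++)
{
    // Training pairs on a 5x5 grid of inputs in [0, 1]; the target is their sum.
    for (int a = 0; a <= 4; a++)
    {
        for (int b = 0; b <= 4; b++)
        {
            double x1 = a / 4.0, x2 = b / 4.0;
            double error = (x1 + x2) - Predict(x1, x2);
            // Delta rule: move each weight in proportion to the error and its input.
            w1 += learningRate * error * x1;
            w2 += learningRate * error * x2;
            bias += learningRate * error;
        }
    }
}

Console.WriteLine($"w1={w1:F4} w2={w2:F4} bias={bias:F4}");
// 21 + 21, normalized by 100 as in the question, then scaled back.
Console.WriteLine($"21+21 ~= {Predict(0.21, 0.21) * 100:F2}");
```

Because the target is exactly linear in the inputs, the weights settle at 1.0 and the bias at 0.0, and unseen pairs such as 21+21 come out right as well.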
Training a sigmoid or tanh might be more problematic. The weights/bias and hidden layer would basically have to undo the sigmoid to truly get back to an addition like the one above.
I think part of the problem is that the neural network is recognizing patterns, not really learning math.
An ANN can learn an arbitrary function, including all of arithmetic. For example, it has been proved that the addition of N numbers can be computed by a polynomial-size network of depth 2. One way to teach an NN arithmetic is to use a binary representation (i.e. not input normalized from 100, but a set of input neurons each representing one binary digit, with the same representation for the output). This way you will be able to implement addition and other arithmetic. See this paper for further discussion and a description of the ANN topologies used in learning arithmetic.
PS. If you want to work with image recognition, it's not a good idea to start practicing with your original dataset. Try some well-studied dataset like MNIST, where it is known what results can be expected from correctly implemented algorithms. After mastering the classical examples, you can move on to your own data.
I am in the middle of a demo that makes the computer learn how to multiply, and I'll share my progress here: as Jeff suggested, I used the linear approach, in particular ADALINE. At this moment my program "knows" how to multiply by 5. This is the output I am getting:
1 x 5 ~= 5.17716232607829
2 x 5 ~= 10.147218373698
3 x 5 ~= 15.1172744213176
4 x 5 ~= 20.0873304689373
5 x 5 ~= 25.057386516557
6 x 5 ~= 30.0274425641767
7 x 5 ~= 34.9974986117963
8 x 5 ~= 39.967554659416
9 x 5 ~= 44.9376107070357
10 x 5 ~= 49.9076667546553
Let me know if you are interested in this demo. I'd be happy to share.

Consistent number generator from multiple input variables

I want to generate a fictional job title from some information I have about the visitor.
For this, I have a table of about 30 different job titles:
01 CEO
02 CFO
03 Key Account Manager
...
29 Window Cleaner
30 Dishwasher
I'm trying to find a way to generate one of these titles from a few different variables like name, age, education history, work history and so on. I want it to be somewhat random but still consistent, so that the same variables always result in the same title.
I also want the different variables to have some impact on the result. Lower numbers are "better" jobs and higher numbers are "worse" jobs, but it doesn't have to be very accurate, just not completely random.
So take these two people as an example.
Name: Joe Smith
Number of previous employers: 10
Number of years education: 8
Age: 56
Name: Samantha Smith
Number of previous employers: 1
Number of years education: 0
Age: 19
Now the reason I want the name in there is to have a bit of randomness, so that two co-workers of the same age with the same background don't get exactly the same title. So I was thinking of using the number of letters in the name to mix it up a bit.
Now I can generate consistent numbers in an infinite number of ways, like the number of letters in the name * age * years of education * number of employers. This would come out as 35 840 for Joe Smith and 247 for Samantha Smith. But I want it to be a number between 1-30 where Samantha is closer to 25-30 and Joe is closer to 1-5.
Maybe this is more of a math problem than a programming problem, but I have seen a lot of "What's your pirate name?" and similar apps out there and I can't figure out how they work. "What's your pirate name?" might be a bad example, since it's probably completely random and I want my variables to matter some, but the idea is the same.
What I have tried
I tried adding weights to variable groups so I would get an easier number to use in my calculations.
Age
01-20 5
20-30 4
30-40 3
40-50 2
...
Years of education
00-01 0
01-02 1
02-03 2
04-05 3
...
I added them together and played around with those numbers, but there were a lot of problems, like everyone ending up in pretty much the same mid-range (no one got to be CEO or dishwasher, everyone was somewhere in the middle), not to mention how messy the code was.
Is there a good way to accomplish what I want to do without having to build a massive math engine?
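int numberOfTitles = 30;
// Combine the hash codes of every input into one number.
// Note: string.GetHashCode() is not guaranteed to be stable across processes or runtimes.
var semiRandomID = person.Name.GetHashCode()
    ^ person.NumberOfPreviousEmployers.GetHashCode()
    ^ person.NumberOfYearsEducation.GetHashCode()
    ^ person.Age.GetHashCode();
// Mask off the sign bit instead of using Math.Abs, which throws for int.MinValue.
var semiRandomTitle = (semiRandomID & int.MaxValue) % numberOfTitles;
// adjust semiRandomTitle as you see fit
semiRandomTitle += ((person.Age / 10) - 2);
semiRandomTitle += (person.NumberOfYearsEducation / 2);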
The semiRandomID is a number that is generated from the hash codes of each component. The hash codes are deterministic within one run of the program, so you will always generate the same number for "Joe", for example, but they don't mean anything (string hash codes are not guaranteed to be stable across processes or runtimes). It's just a number. So we take all those numbers and pick one job title out of the 30 available. Every person has roughly the same chance to get each job title (probably some math freak will prove that there are edge cases to the contrary, but for all practical, non-cryptographic purposes, it's sufficient).
Now each person has one job title assigned that looks random. However, as it's math and not randomness, they will get the same every time.
Now let's assume Joe got Taxi-Driver, the number 20. However, he has 10 years of formal education, so you decide you want that aspect to have some weight. You could just add the years onto the job title number, but that would make anyone with 30 years of college parties a CEO, so you decide (arbitrarily) that each year of education counts for half a job title. You add (NumberOfYearsEducation / 2) to the job title.
Let's assume Jane got CIO, the number 5. However, she is only 22 years old, a little young to be that high on the list. Again, you could just add the years onto the job title number, but that would make anyone with 30 years of age a CEO, so you decide (arbitrarily) that each year counts as 1/10 of a job title. In addition, you think that being very young should instead subtract from the job title. All years below the first 20 should indeed be a negative weight. So the formula would be ((Age / 10) - 2). One point for each 10 years of age, with the first 2 counting as negative.
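If the title has to stay the same across program runs and machines, one option is to swap the built-in hash codes for a fixed hash function. The sketch below uses FNV-1a for that. `StableHash` and `TitleNumber` are hypothetical helper names, and the direction of the adjustments (subtracting, so that age and education move towards the lower, "better" numbers as the question defines them) is an assumption, as are the weights.

```csharp
using System;
using System.Text;

// FNV-1a, 32-bit: a simple hash that gives the same value on every run and machine.
uint StableHash(string text)
{
    uint hash = 2166136261;
    foreach (byte b in Encoding.UTF8.GetBytes(text))
    {
        hash = unchecked((hash ^ b) * 16777619);
    }
    return hash;
}

// Returns a title number in 1..numberOfTitles for the given person data.
int TitleNumber(string name, int employers, int yearsEducation, int age, int numberOfTitles = 30)
{
    string key = $"{name}|{employers}|{yearsEducation}|{age}";
    int title = (int)(StableHash(key) % (uint)numberOfTitles) + 1;

    // Assumed weighting: lower numbers are "better" jobs, so subtract the bonuses.
    title -= (age / 10) - 2;
    title -= yearsEducation / 2;

    // Keep the result inside the table.
    return Math.Max(1, Math.Min(numberOfTitles, title));
}

Console.WriteLine(TitleNumber("Joe Smith", 10, 8, 56));
Console.WriteLine(TitleNumber("Samantha Smith", 1, 0, 19));
```

The clamp at the end keeps people who would otherwise fall off either end of the list at title 1 or title 30.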

Reading data over serial port from voltmeter

I'm sort of new at this and I'm writing a small application to read data from a voltmeter. It's a RadioShack Digital Multimeter 46-range. The purpose of my program is to perform something automatically when it detects a certain voltage. I'm using C# and I'm already familiar with the SerialPort class.
My program runs and reads the data in from the voltmeter. However, the data is all unformatted/gibberish. The device does come with its own software that displays the voltage on the PC, however this doesn't help me since I need to grab the voltage from my own program. I just can't figure out how to translate this data into something useful.
For reference, I'm using the SerialPort.Read() method:
byte[] voltage = new byte[100];
_serialPort.Read(voltage, 0, 99);
It grabs the data and displays it as so:
16 0 30 0 6 198 30 6 126 254 30 0 30 16 0 30 0 6 198 30 6 126 254 30 0 30 16 0 3
0 0 6 198 30 6 126 254 30 0 30 16 0 30 0 6 198 30 6 126 254 30 0 30 16 0 30 0 6
198 30 6 126 254 30 0 30 24 0 30 0 6 198 30 6 126 254 30 0 30 16 0 30 0 254 30 6
126 252 30 0 6 0 30 0 254 30 6 126 254 30 0
The space separates each element of the array. If I use a char[] array instead of byte[], I get complete gibberish:
▲ ? ? ▲ ♠ ~ ? ▲ ♠ ▲ ? ? ▲ ♠ ~ ? ▲ ♠ ▲ ? ? ▲ ♠ ~ ? ▲ ♠
Using the .ReadExisting() method gives me:
▲ ?~?♠~?▲ ▲? ▲ ?~♠~?▲ ?↑ ▲ ??~♠~?▲ F? ▲ ??~♠~?▲ D? ▲ ??~♠~?▲ f?
.ReadLine() times out, so doesn't work. ReadByte() and ReadChar() just give me numbers similar to the Read() into array function.
I'm in way over my head as I've never done something like this, not really sure where else to turn.
It sounds like you're close, but you need to figure out the correct Encoding to use.
To get a string from an array of bytes, you need to know the Code Page being used. If it's not covered in the manual, and you can't find it via a google/bing/other search, then you will need to use trial and error.
To see how to use GetChars() to get a string from a byte array, see Decoder.GetChars Method
In the code sample, look at this line:
Decoder uniDecoder = Encoding.Unicode.GetDecoder();
That line selects the Unicode (UTF-16) encoding as the one used to decode the bytes.
From there, you can use other members of the Encoding class to specify different code pages. This is documented here: Encoding Class
If the encoding being used isn't one of the standard ones, you can pass a code page ID to Encoding.GetEncoding(Int32). A list of valid code page IDs can be found at Code Pages Supported by Windows
There are two distinct strategies for solving your communications problem.
Locate and refer to appropriate documentation and design/modify a program to implement the specification.
The following may be appropriate, but are not guaranteed to describe the particular model of DVM that you have. Nonetheless, they may serve as a starting point.
Note that the authors of these documents comment that the respective models may be "visually identical", but also that "Open-source packages that reportedly worked on LINUX with earlier RS-232 models do not work with the 2200039".
http://forums.parallax.com/attachment.php?attachmentid=88160&d=1325568007
http://sigrok.org/wiki/RadioShack_22-812
http://code.google.com/p/rs22812/
Try to reverse engineer the protocol. If you can read the data in a loop and collect the results, a good approach to reverse engineering a protocol is to apply various representative signals to the DVM. You can use short-circuit resistance measurements, various stable voltage measurements, etc.
The technique I consider most valuable is to use an automated variable signal generator. In this way, by analyzing the patterns in the data, you should more readily be able to identify which bytes represent the raw data and which represent stable descriptive data, like the unit of measurement, mode of operation, etc.
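A first step in that kind of pattern analysis is finding how long one frame is. The sketch below guesses the frame length by looking for the shortest shift at which the captured bytes repeat. `GuessFrameLength` is a hypothetical helper name, and the input is the first 39 bytes of the capture shown in the question.

```csharp
using System;

// Returns the shortest period p for which data[i] == data[i + p] holds most often.
int GuessFrameLength(byte[] data)
{
    int bestPeriod = 0;
    double bestScore = -1.0;
    for (int period = 1; period <= data.Length / 2; period++)
    {
        int matches = 0;
        for (int i = 0; i + period < data.Length; i++)
        {
            if (data[i] == data[i + period]) matches++;
        }
        double score = (double)matches / (data.Length - period);
        // Strict comparison keeps the shortest period when scores tie.
        if (score > bestScore)
        {
            bestScore = score;
            bestPeriod = period;
        }
    }
    return bestPeriod;
}

// First 39 bytes of the capture from the question (stable reading).
byte[] capture =
{
    16, 0, 30, 0, 6, 198, 30, 6, 126, 254, 30, 0, 30,
    16, 0, 30, 0, 6, 198, 30, 6, 126, 254, 30, 0, 30,
    16, 0, 30, 0, 6, 198, 30, 6, 126, 254, 30, 0, 30,
};
Console.WriteLine(GuessFrameLength(capture)); // prints 13
```

Once the frame length is known, captures taken at different voltages can be lined up frame by frame to see which byte positions change with the reading and which stay fixed.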
Some digital multimeters use 7 bit data transfer. You should set serial communication port to 7 data bits instead of standard 8 data bits.
I modified and merged a couple of older open source C programs on Linux in order to read the data values from the RadioShack meter whose part number is 2200039. This is over USB. I really only added a C or an F on one range. My program is here, and it has the links where I got the other two programs in it.
I know this example is not in C#, but it does provide the format info you need. Think of it as the API documentation written in C; you just have to translate it into C# yourself.
The protocol runs at 4800 baud, and 8N1 appears to work.

Brute force or algorithm

I'm not sure what kind of approach is needed but let me describe the problem:
1. An arbitrary number of workers (2 or more) are scheduled to work in any given month (including weekends).
2. Only one worker may work on any given day.
2a. This worker may not work the day before or after.
3. Workers also work weekends, distributed as equally as possible among the workers.
3a. Saturdays and Sundays are weighted equally.
4. Allow for possible vacations taken.
4a. No restriction on sequential days
4b. May not take so much vacation that it will interfere with rule(s) #2 and #3
What is the most flexible way to satisfy these criteria?
What is this type of problem called?
Can someone point me in the right direction so I can read and learn about it? Obviously, if this is something that has already been solved with an algorithm, point me to the right paper or book so I can read and understand it.
Clarification: I'm not looking for how many [total] days and weekends each worker would work but a way to [evenly] distribute the days worked in that month.
E.g. Workers A B C; A requested vacation 17 to 20
Obviously there are other permutations than the example I listed below.
           M   T   W   Th  F   Sa  Su
           ==========================
October    1   2   3   4   5   6   7
2012       A   B   C   A   B   C   A
           8   9   10  11  12  13  14
           B   C   A   B   C   A   B
           15  16  17  18  19  20  21
           C   A   B   C   B   C   A
           22  23  24  25  26  27  28
           B   A   C   A   C   B   C
           29  30  31
           A   B   A
Use the simplex algorithm. You can express the constraints as follows:
Each day needs to be filled by exactly one person
For each day and for each worker, they should work at least one day out of every three-day block
Nobody should work on their vacation days
The maximum number of weekend shifts a worker has in a month should be no more than 1+floor(weekend shifts/workers)
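For a problem this small, the hard constraints can also be checked with a plain backtracking search, which is easier to write by hand than a simplex model. The sketch below is that swapped-in technique, not the simplex formulation. It enforces one worker per day, no two days in a row, and no work on vacation days. It only evens out total days by trying the least-loaded worker first and does not balance weekends. `BuildSchedule` and its parameters are hypothetical names.

```csharp
using System;
using System.Linq;

// Assigns one worker (0-based index) to each day so that nobody works two days
// in a row or on a vacation day. Returns null if no such assignment exists.
int[] BuildSchedule(int days, int workers, Func<int, int, bool> onVacation)
{
    var schedule = new int[days];
    var load = new int[workers];

    bool Place(int day)
    {
        if (day == days) return true;
        // Try the least-loaded workers first to keep the distribution even.
        var candidates = Enumerable.Range(0, workers).OrderBy(i => load[i]).ToList();
        foreach (int w in candidates)
        {
            if (day > 0 && schedule[day - 1] == w) continue; // worked yesterday
            if (onVacation(w, day + 1)) continue;            // day numbers are 1-based
            schedule[day] = w;
            load[w]++;
            if (Place(day + 1)) return true;
            load[w]--;                                       // undo and try the next worker
        }
        return false;
    }

    return Place(0) ? schedule : null;
}

// Workers 0, 1, 2 stand for A, B, C; A is on vacation from the 17th to the 20th.
int[] october = BuildSchedule(31, 3, (worker, day) => worker == 0 && day >= 17 && day <= 20);
Console.WriteLine(string.Join(" ", october.Select(w => "ABC"[w])));
```

Weekend fairness could be added as another check inside the loop, or handled properly by moving to an integer programming solver as the answer suggests.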
