I have a quick question about something I haven't found out how to do efficiently (in C#).
I have a list (array) of Points (X,Y). I need to find which 3 points form the tightest cluster. It's for a mapping project.
What would be the best way to do this? There are only about 6 to 9 items in the list.
Thanks in advance.
Cheers!
For such small numbers, the brute force method should work just fine. With six points, there are 20 possible combinations of three points. With 9 points, there are 84 possible combinations. I wouldn't recommend this approach for a lot of points, but with just a handful, it's going to be plenty fast enough and it's dead simple to write.
You can easily generate the combinations:
for (int i = 0; i < points.Length - 2; ++i)
{
    for (int j = i + 1; j < points.Length - 1; j++)
    {
        for (int k = j + 1; k < points.Length; k++)
        {
            // Here, your three points are
            // points[i], points[j], and points[k]
            // compute "tightness" and store
        }
    }
}
You'll need a structure to hold your combinations:
struct PointGroup
{
    public readonly int i;
    public readonly int j;
    public readonly int k;
    public readonly double tightness;
    public PointGroup(int i, int j, int k, double tight)
    {
        this.i = i;
        this.j = j;
        this.k = k;
        this.tightness = tight;
    }
}
If you create one of those structures for each group and store them in an array, you can simply sort the array and take the best three.
Your bigger problem is coming up with a definition of "tight group." Also, you have to decide if a point can be in more than one of those "tightest" groups. Three possible ways to define tightness are:
The sum of the distances between the points is minimized.
The average distance from each point to the center of the group is minimized.
The perimeter of the triangle formed by the three points is minimized (for three points this is the same quantity as the sum of the distances between them).
Undoubtedly there are more.
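As a rough illustration of the brute-force idea, here is a minimal sketch that uses the first definition (smallest sum of pairwise distances) and keeps only the single best group instead of sorting all of them. The names TightestTriple, Find and Distance are made up for this example, and the points are assumed to be plain (X, Y) tuples rather than whatever Point type the mapping project uses.

```csharp
using System;

public static class TightestTriple
{
    // Euclidean distance between two points given as (X, Y) tuples.
    static double Distance((double X, double Y) a, (double X, double Y) b)
    {
        double dx = a.X - b.X, dy = a.Y - b.Y;
        return Math.Sqrt(dx * dx + dy * dy);
    }

    // Returns the indices of the three points whose pairwise distances
    // have the smallest sum, together with that sum (assumed tightness measure).
    public static (int I, int J, int K, double Tightness) Find((double X, double Y)[] points)
    {
        if (points.Length < 3)
            throw new ArgumentException("Need at least three points.", nameof(points));

        var best = (I: -1, J: -1, K: -1, Tightness: double.PositiveInfinity);
        for (int i = 0; i < points.Length - 2; i++)
            for (int j = i + 1; j < points.Length - 1; j++)
                for (int k = j + 1; k < points.Length; k++)
                {
                    double t = Distance(points[i], points[j])
                             + Distance(points[j], points[k])
                             + Distance(points[i], points[k]);
                    if (t < best.Tightness)
                        best = (i, j, k, t);
                }
        return best;
    }
}
```

Swapping in a different tightness measure only means changing how t is computed inside the innermost loop.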
If the points are not identical, this becomes a form of cluster analysis.
There are various algorithms that differ in how they measure and "cluster" points, though with only a few points, a brute force approach might be the easiest... You could just measure the distance between each pair of points, and sort...
You can simplify the problem as follows:
Don't check a Point against itself; distance is zero.
Exploit symmetry: distance from Point i to Point j is the same as Point j to Point i
Those eliminate a number of combinations.
But, given those, you have to calculate the distance between each pair and sort.
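A small sketch of that pairwise step, assuming the points are (X, Y) tuples; PairDistances and Sorted are illustrative names, not part of any library. Starting the inner loop at i + 1 is what skips both the self-pairs and the mirrored pairs.

```csharp
using System;
using System.Collections.Generic;

public static class PairDistances
{
    // Lists every unordered pair (i < j) once, sorted by ascending distance.
    public static List<(int I, int J, double Distance)> Sorted((double X, double Y)[] points)
    {
        var pairs = new List<(int I, int J, double Distance)>();
        for (int i = 0; i < points.Length - 1; i++)
        {
            // Starting at i + 1 skips self-pairs and mirrored pairs.
            for (int j = i + 1; j < points.Length; j++)
            {
                double dx = points[i].X - points[j].X;
                double dy = points[i].Y - points[j].Y;
                pairs.Add((i, j, Math.Sqrt(dx * dx + dy * dy)));
            }
        }
        pairs.Sort((a, b) => a.Distance.CompareTo(b.Distance));
        return pairs;
    }
}
```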
I've been trying to come up with an algorithm which deals, as the title states, X amount of Cards, per Y amount of Cards, over Z amount of Players, from a normal (52 piece) Deck of Cards which is sorted or unsorted. I've been running into a wall for the past few hours trying to come up with a working solution, while also Googling to find similar problems. Unfortunately without success, hence this question.
An example would be: dealing 2 Cards, per 1, over 2 Players would result in
Player 1 receiving 1 card
Player 2 receiving 1 card
Player 1 receiving 1 card
Player 2 receiving 1 card
So far I have a solution with which I'm able to run my application, although the actual dealing algorithm isn't taking the 'per' parameter into account. It deals the right number of cards to each of the players, but each player receives the total amount to be dealt in one go.
I was wondering if anyone here has had to handle a similar problem in the past? Or could point me in the right direction? :/
public List<Card>[] Deel(int per, int players, int cards)
{
    _currentCard = 0;
    List<Card>[] output = new List<Card>[players];
    if (_cardsDistributed < _deck.Count)
    {
        for (int i = 0; i < players; i++)
        {
            List<Card> hand = new List<Card>();
            for (int j = 0; j < cards; j++)
            {
                _currentCard = 0;
                hand.Add(_deck[_currentCard]);
                _deck.Remove(_deck[_currentCard]);
                _cardsDistributed++;
            }
            output[i] = hand;
        }
        return output;
    }
    else
        return null;
}
One way to think about this is to get a deck of cards and do it yourself, by hand, and write down the steps. For example, you have three players and you want to deal each player four cards, two at a time. So what do you do?
You take the deck in hand, hold it over the first player's pile, and deal two cards. The code for that is pretty simple:
player = 1
for i = 1 to 2
    deal next card to player
Then you move to the second player's pile and deal two cards, and you do the same thing for the third player. So you need a loop to go through the players:
for player = 1 to 3
    for i = 1 to 2
        deal next card to player
At this point you've dealt two cards to each of the three players.
If you want to deal X cards Y at a time, and Y is smaller than X, then you need to go around to each player multiple times. How many? Well, how many times does Y go into X? The answer is X/Y.
If you were doing this by hand, you would start over at player 1, deal him two cards, move on to player 2, etc. Adding that in code is simple:
numRounds = 4/2
for round = 1 to numRounds
    for player = 1 to 3
        for card = 1 to 2
            deal next card to player
Now, replace the constant values with X, Y, and Z, and try it with some other combination by following those exact steps. Deal six cards to each player, three at a time. Did it work? Try a few other combinations to verify that the steps you wrote down always work.
Once you've determined that the algorithm you've developed works, then writing the code to implement it on the computer is easy. There are, of course, some minor details like how to deal a card, but those are easy compared to figuring out the overall approach to the problem.
I was fortunate that I discovered this approach to problem solving early in my education. Casting an algorithmic problem into physical terms lets me build a model that I can play with, and write down the steps I took to solve the problem. After that, writing the program is a simple matter of duplicating those steps in code. It doesn't work for all problems, but it's very effective for a large number of different problems that you will encounter.
If I understood your question correctly you need something like this:
public List<Card>[] Deel(int per, int players, int cards)
{
    List<Card>[] output = new List<Card>[players];
    // init hand for each player
    for (int i = 0; i < players; i++)
    {
        output[i] = new List<Card>();
    }
    // assume the number of cards is divisible by 'per' without a remainder,
    // otherwise you need one more round to deal the remaining (cards % per) cards
    int rounds = cards / per;
    for (int round = 0; round < rounds; round++)
    {
        for (int i = 0; i < players; i++)
        {
            for (int j = 0; j < per; j++)
            {
                if (_deck.Count > 0)
                {
                    // take the top card and remove it from the deck
                    output[i].Add(_deck[0]);
                    _deck.RemoveAt(0);
                    _cardsDistributed++;
                }
                else
                {
                    // should throw an exception because the deck contains no more cards
                    // or maybe you need to check it before dealing
                }
            }
        }
    }
    return output;
}
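The comment about the remainder can be handled with one extra, shorter round. The following is only a sketch under a few assumptions: the deck is passed in as a List<T> instead of living in a _deck field, cards are taken from the front of the list, and running out of cards is reported up front with an exception. Dealer and Deal are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;

public static class Dealer
{
    // Deals 'cards' cards to each of 'players' hands, at most 'per' at a time,
    // taking cards from the front of 'deck'. A final shorter round covers cards % per.
    public static List<T>[] Deal<T>(List<T> deck, int per, int players, int cards)
    {
        if (per <= 0 || players <= 0 || cards < 0)
            throw new ArgumentOutOfRangeException();
        if (deck.Count < players * cards)
            throw new InvalidOperationException("Not enough cards in the deck.");

        var hands = new List<T>[players];
        for (int p = 0; p < players; p++)
            hands[p] = new List<T>();

        int next = 0;                                   // index of the next undealt card
        for (int dealt = 0; dealt < cards; dealt += per)
        {
            int batch = Math.Min(per, cards - dealt);   // the last round may be short
            for (int p = 0; p < players; p++)
                for (int c = 0; c < batch; c++)
                    hands[p].Add(deck[next++]);
        }
        deck.RemoveRange(0, next);                      // remove the dealt cards in one step
        return hands;
    }
}
```

Dealing five cards two at a time to three players then goes round three times, with batches of 2, 2 and 1.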
I am trying to calculate the time complexity of my fitness function for the genetic algorithm I wrote.
What I did: I have already read a few articles and examples:
How to calculate Time Complexity for a given algorithm
Big-O Complexity Chart
Determining The Complexity Of Algorithm (The Basic Part)
How to find time complexity of an algorithm
Time Complexity of Evolutionary Algorithms for Combinatorial Optimization: A Decade of Results
However, none of these were really satisfying, in the sense that I could say: now I know how to apply this to my code.
Let me show you my fitness function, where I guessed a few execution times.
public static List<double> calculateFitness(List<List<Point3d>> cF, Point3d startpoint)
{
    List<double> Fitness = new List<double>(); // 1+1
    for (int i = 0; i < cF.Count; i++) // 1 ; N+1 ; N
    {
        Point3d actual; // N
        Point3d next; // N
        double distance; // N
        double totalDistance = startpoint.DistanceTo(cF[i][0]); // (1+1+1+1)*N
        for (int j = 0; j < cF[i].Count - 1; j++) // { 1 ; N ; N-1 }*N
        {
            actual = cF[i][j]; // (1+1)*(N-1)
            next = cF[i][j + 1]; // (1+1)*(N-1)
            distance = actual.DistanceTo(next); // (1+1+1+1)*(N-1)
            totalDistance += distance; // (1+1)*(N-1)
        }
        totalDistance += cF[i][cF[i].Count - 1].DistanceTo(startpoint); // (1+1+1+1)*N
        Fitness.Add(totalDistance); // N
    }
    return Fitness; // 1
}
Do you know of any links with examples, so that I could learn how to calculate time complexity in a practical, hands-on way?
Or maybe someone can explain it here. For example for this code piece I'm not sure at all: double totalDistance = startpoint.DistanceTo(cF[i][0]); --> (1+1)N ?
Or this: actual = cF[i][j]; --> (1+1)NN ?
So in general, the time complexity would be: 1+1+ (1+N+1+N+N+N+N+4N+ N*{ 1+N+N-1+2*(N-1)+2*(N-1)+4*(N-1)+2*(N-1) } +4N+N) = 2 + (2+14N+ N*{12N-10}) = 12N^2 + 4N + 4 = O(N^2)
Generally when doing Big-O analysis, we ignore constant time operations (i.e. O(1)) and any constant factors. We are just trying to get a sense of how well the algorithm scales with N. What this means in practice is that we are looking for loops and non-constant time operations.
With that in mind, I've copied your code below and then annotated certain points of interest.
public static List<double> calculateFitness(List<List<Point3d>> cF, Point3d startpoint)
{
    List<double> Fitness = new List<double>();
    for (int i = 0; i < cF.Count; i++) // 1.
    {
        Point3d actual; // 2.
        Point3d next;
        double distance;
        double totalDistance = startpoint.DistanceTo(cF[i][0]); // 3.
        for (int j = 0; j < cF[i].Count - 1; j++) // 4.
        {
            actual = cF[i][j]; // 5.
            next = cF[i][j + 1];
            distance = actual.DistanceTo(next);
            totalDistance += distance;
        }
        totalDistance += cF[i][cF[i].Count - 1].DistanceTo(startpoint);
        Fitness.Add(totalDistance); // 6.
    }
    return Fitness;
}
1. The i loop will execute N times where N is cF.Count. If we were being incredibly formal, we would say that the comparison i < cF.Count takes some constant time c and i++ takes some constant time d. Since they are each executed N times, the total time here is (c + d)N. But as I mentioned, Big-O ignores these constant factors and so we say that it is O(N).
2. These declarations are constant time, O(1).
3. Indexing into a .NET List is documented as being O(1). I can't find documentation for the DistanceTo method, but I can't imagine it being anything but O(1) because it would be simple math operations.
4. Here we have another loop that executes N times. If we were being strict about it, we would introduce a second variable here because cF[i].Count isn't necessarily equal to cF.Count. I'm not going to be that strict.
5. Again, indexing into a list is O(1).
6. This is actually the tricky one. The Add method is documented as follows:
If Count is less than Capacity, this method is an O(1) operation. If the capacity needs to be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.
As this is typically implemented, the operation is O(1) most of the time, but is occasionally O(n) where n is the length of the list being added to, Fitness in this case. This is generally referred to as amortized O(1).
So in the end you mainly just have O(1) operations. What there is though is one O(N) loop within another O(N) loop. So the algorithm as a whole is O(N) * O(N) = O(N^2).
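One way to convince yourself of this is to count the dominant operation instead of timing it. The sketch below mirrors the loop structure of calculateFitness but only counts how many distance evaluations would happen; it uses the stricter two-variable view mentioned in point 4, with a hypothetical routes (cF.Count) and pointsPerRoute (each cF[i].Count, assumed equal for all routes). FitnessCost and DistanceCalls are names invented for this illustration.

```csharp
public static class FitnessCost
{
    // Counts how many distance evaluations calculateFitness would perform
    // for 'routes' routes of 'pointsPerRoute' points each, mirroring its loops.
    public static long DistanceCalls(int routes, int pointsPerRoute)
    {
        long calls = 0;
        for (int i = 0; i < routes; i++)
        {
            calls++;                                    // start point -> first point
            for (int j = 0; j < pointsPerRoute - 1; j++)
                calls++;                                // point j -> point j + 1
            calls++;                                    // last point -> start point
        }
        return calls;
    }
}
```

The count works out to routes * (pointsPerRoute + 1), so when both grow together as N, the work grows roughly as N^2, which matches the O(N^2) result.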
This one is for you CompSci or stats people. Can you please tell me, if theList contains 72,786 "things," what the value of compareCount will be at the end of the loops? I'm thinking it's 72,786^2-1 but it's been so long since this old brain worked like that. Much obliged for your time and assistance!
List<thing> theList = new List<thing>();//list contains 72,786 "things"
private void compare()
{
    int compareCount = 0;
    for(int i = 0; i < theList.Count-1; i++)
    {
        for(int comp = i + 1; comp < theList.Count; comp++)
        {
            compare(theList[i], theList[comp]);
            compareCount++;
        }
    }
}
The compareCount in your code will have the value (72786^2 - 72786) / 2 = 2648864505. I've confirmed that by running it. As it is written now, there is no need to have the call compare(theList[i], theList[comp]) in the inner loop (as it doesn't influence the count in any way).
Here's how I remember the (n^2 - n)/2 formula: a round robin tournament with n players, each of them meeting all other players exactly once.
The match plan is a square with n rows and columns (n * n = n^2 combinations). Since a player doesn't play against himself, the n matches on the diagonal from upper left to lower right must be subtracted (n^2 - n matches left now). Pairings of player A against player B in the triangle above the diagonal are the same as pairings of B against A in the triangle below (there are (n^2 - n)/2 of such pairings). Subtracting this number from n^2 - n gives the final result of (n^2 - n)/2 possible matches.
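A quick way to check the formula is to compare it against the nested loops for a small n. The sketch below does that; PairCount, Formula and ByLoops are illustrative names. One detail worth noting: 2,648,864,505 is larger than int.MaxValue (2,147,483,647), so an int counter like the one in the question would wrap around in C#'s default unchecked context. The sketch therefore uses long.

```csharp
public static class PairCount
{
    // Closed form: number of unordered pairs among n items, (n^2 - n) / 2.
    public static long Formula(long n) => n * (n - 1) / 2;

    // Same count obtained by running the two nested loops from the question.
    public static long ByLoops(int n)
    {
        long count = 0;
        for (int i = 0; i < n - 1; i++)
            for (int comp = i + 1; comp < n; comp++)
                count++;
        return count;
    }
}
```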
I have a problem with this task: An n-vertex graph is a scorpion if it has a vertex 1 (the sting) connected to a vertex 2 (the tail) connected to a vertex 3 (the body) connected to the other vertices (the feet). Some of the feet may be connected to other feet. Design an algorithm that decides whether a given drawing represents a scorpion and tells in which line the sting, tail, body and feet are.
This is my data file to read from:
(+) marks an edge and (-) marks no edge.
I'm trying to find the sting first, but how could I search for its connections with the tail and body? Also, I have to use recursion.
EDIT:
OK, now I have found how many "+" there are in each line:
int[] Saving = new int[100];
for (int i = 0; i < n; i++)
{
    for (int j = 0; j < n; j++)
    {
        int HowMuch = Regex.Matches(A[i,j], @"\+").Count;
        Saving[i] += HowMuch;
    }
    if(Saving[i]>=3)
    {
        Console.WriteLine("It's a scorpion!");
        Console.WriteLine("The body is in: " + i + " part");
    }
}
And with recursion I'm trying to find path connections... How should I continue?
static void Find(string[,] A, int n, int j)
{
    for (int i = 0; i < n; i++)
    {
        if(A[i,j]=="+")
        {
            j = i;
            Find(A, n, j);
        }
    }
}
Here is an idea for how to solve this. I took help from this; you should take a look there, as there is a hint on that site.
My approach is slightly different from theirs.
From an abstract point of view, you are asking how to determine, from an adjacency matrix, whether the given points look like this image (aka the scorpion). (Taken from that site.)
Now, how does the adjacency matrix convert to a scorpion? Let's look at your example.
I drew the adjacency matrix and the graph by hand. I hope it's not too difficult to understand.
Now how to solve it? You compute the degree of each node, which you can do from the adjacency matrix. (The degree is the number of nodes a node is connected to. For example, in the graph I drew, the degree of 1 is 1, the degree of 0 is 2, and so on.)
First, find the degree of all the nodes (node means vertex and vice versa).
The sting should be the node with degree one. There is a problem with this, which I'll get back to, but for now let's not consider it.
The tail has degree 2 and is connected to the sting. So you find the one node connected to the sting and you are done: that is the tail.
The node that is connected to the tail (apart from the sting) is the body.
The body has degree >= 2. So if there is a vertex with that degree, then that's the body for sure, and the other nodes connected to it are the feet.
Now you may say: some feet have degree 2, so why are they not the tail? Because they are not connected to the sting (which you found earlier).
You may also say: some feet have degree 1, so why are they not the sting? Because a foot is connected to a node with degree > 2, which cannot be the tail (as the tail has degree 2).
That's all well and good, but consider a problem: if the graph is like this,
1-0-3-4
then which would be the sting and which would be the foot? My answer is both: either 1 or 4 can be the foot or the sting.
I hope you understand what I have said.
Clarification on the image if needed:
You said that where there is a +, there is an edge. Notice the + on 1 and 3 on row 0. So, 0 is connected to 1 and 4. I've connected them just like that. The connections are bidirectional, as you can see from the adjacency matrix.
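The degree-based steps above could be sketched roughly as follows. This is an illustration, not a full solution to the assignment: it assumes the + / - file has already been read into a symmetric bool[,] adjacency matrix, it does not use recursion, and it uses the definition from the question that the body is connected to every remaining vertex (so its degree is n - 2). Scorpion, Find and Neighbour are made-up names.

```csharp
public static class Scorpion
{
    // Tries to identify sting, tail and body in an undirected graph given as a
    // symmetric adjacency matrix. Returns null if no vertex triple fits.
    public static (int Sting, int Tail, int Body)? Find(bool[,] adj)
    {
        int n = adj.GetLength(0);
        var degree = new int[n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (i != j && adj[i, j]) degree[i]++;

        for (int sting = 0; sting < n; sting++)
        {
            if (degree[sting] != 1) continue;           // sting: exactly one neighbour
            int tail = Neighbour(adj, sting, -1);
            if (degree[tail] != 2) continue;            // tail: sting and body only
            int body = Neighbour(adj, tail, sting);
            if (degree[body] == n - 2)                  // body: tail plus every foot
                return (sting, tail, body);
        }
        return null;
    }

    // First neighbour of v other than 'skip', or -1 if there is none.
    static int Neighbour(bool[,] adj, int v, int skip)
    {
        for (int u = 0; u < adj.GetLength(0); u++)
            if (u != v && u != skip && adj[v, u]) return u;
        return -1;
    }
}
```

Every vertex that is not the sting, tail or body is then a foot. For the ambiguous path 1-0-3-4, this sketch simply reports the first candidate it meets.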
I tried to make this code perform faster using Parallel.ForEach and ConcurrentBag, but it's still running way too long (especially bearing in mind that in my scenario the number of points may also be 1,000,000 or more):
List<Point> points = new List<Point>();
for(int i = 0; i<100000;i++) {
    Point point = new Point {X = i-50000, Y = i+50000, CanDelete = false};
    points.Add(point);
}
foreach (Point point in points) {
    foreach (Point innerPoint in points) {
        if (innerPoint.CanDelete == false && (point.X - innerPoint.X) < 2) {
            innerPoint.Y = point.Y;
            point.CanDelete = true;
        }
    }
}
That code will perform WORSE in parallel, due to the data access patterns.
The best way to speed it up is to recognize that you don't need to consider all O(N^2) pairs of points, but only the ones having nearby X-coordinates.
First, sort the list by X-coordinate, O(N log N), then process forward and backward in the list from each point until you leave the neighborhood. You'll need to use indexing and not foreach.
In your sample data, the list is already sorted.
Since your distance test is symmetric, and removes matching points from consideration, you can skip looking at earlier points.
for (int j = 0; j < points.Count; ++j) {
    int x1 = points[j].X;
    //for (int k = j; k >= 0 && points[k].X > x1 - 2; --k ) { /* merge points */ }
    for (int k = j + 1; k < points.Count && points[k].X < x1 + 2; ++k ) { /* merge points */ }
}
Not only is the complexity better, the cache behavior is far superior. And it can be split among multiple threads with far less cache contention.
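To make the effect of the sorted sweep concrete, here is a minimal sketch that only counts the pairs that would be visited, working on bare X values rather than on the Point class from the question. NeighbourSweep, CountClosePairs and the window parameter are hypothetical names, and the merge step itself is left out because its exact rules depend on the application.

```csharp
using System;

public static class NeighbourSweep
{
    // Counts the pairs (j, k), j < k, whose X-coordinates differ by less than
    // 'window', after sorting the X values. Only nearby points are compared.
    public static long CountClosePairs(int[] xs, int window)
    {
        var sorted = (int[])xs.Clone();
        Array.Sort(sorted);                             // O(N log N)

        long pairs = 0;
        for (int j = 0; j < sorted.Length; j++)
        {
            // Scan forward only; stop as soon as we leave the neighbourhood.
            for (int k = j + 1;
                 k < sorted.Length && (long)sorted[k] - sorted[j] < window;
                 k++)
            {
                pairs++;                                // a real version would merge points here
            }
        }
        return pairs;
    }
}
```

For data like the sample, where consecutive X values differ by 1 and the window is 2, each point is compared with just one neighbour, so the inner loop does about N steps in total instead of N^2.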
Well, I don't know exactly what you want, but let's try.
First, when creating the List, you might want to set its initial capacity, since you know how many items it will hold. That way it does not need to grow all the time.
List<Point> points = new List<Point>(100000);
Next, you could sort the list by the X property. Then you would only compare each point with the points that are near it: when you find the first one, forward or backward, that is too distant, you can stop comparing.