I have been failing at this problem for several hours now and just can't get my head around it. It seems fairly simple from a "human" POV, but somehow I can't manage to write it as code.
Situation: Given are several number ranges, each defined by a starting number and the current "active" number, and each assigned to a specific location (or 0 for generic ones):
startno | actualno | location
100 | 159 | 0
200 | 203 | 1
300 | 341 | 2
400 | 402 | 0
Now, as you can see, there can also be two ranges for one location. In this case, only the range with the highest startno (in this case, 400) is regarded as active, the other one only exists for history purposes.
Every user is assigned to a specific location (the same IDs as in the location column), but never to a generic one (zero).
When a user wants a new number, he gets one from the active range assigned to his location or, if none is found, from the highest generic one (e.g. a user whose location has no range of its own would get 403, user.location = 2 would get 342).
The user can then choose to use just this number or a block of X numbers starting from the assigned number.
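The assignment rule described so far could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `next_number` and the tuple layout `(startno, actualno, location)` are made up for the example, and the real lookup would of course run against the database.

```python
def next_number(ranges, location):
    """Return the next free number for a user at `location`.

    ranges: list of (startno, actualno, location) tuples; location 0 is generic.
    """
    def active(loc):
        # Of all ranges for a location, only the one with the highest
        # startno is active; older ones are kept for history.
        own = [r for r in ranges if r[2] == loc]
        return max(own, key=lambda r: r[0]) if own else None

    # Prefer the location's own range, fall back to the generic one.
    chosen = active(location) or active(0)
    if chosen is None:
        raise LookupError("no range available")
    return chosen[1] + 1


RANGES = [(100, 159, 0), (200, 203, 1), (300, 341, 2), (400, 402, 0)]
print(next_number(RANGES, 2))  # location with its own range
print(next_number(RANGES, 3))  # location without a range, uses generic 400
```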
Here comes the question: How can I ensure that the ranges don't overlap? Say the user (location = 2) gets the next number 342 and decides he needs 100 numbers starting there. This would put the end number at 441, which is inside the generic range, and that mustn't happen.
I tried several nested SELECTs, using both the starting and ending number, aggregating with MAX(), JOINing the table on itself, but I just can't get it 100% right.
As I understand it, you could simply create a trigger on the table to do the validation and raise an error if an overlap is found when the application updates the table, so the user just gets an error saying the request can't be done. If the user wants the range to end at 441, let the application try to update actualno to 441; the trigger then compares the new number against the startno values of the other ranges and raises the error if the new number reaches into one of them. Something like the following in the update trigger:
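IF EXISTS (SELECT 1
           FROM Table1
           WHERE startno > @currentStartNo  -- only ranges that begin after the one being updated
             AND startno <= @newNumber)     -- ...and that the new number would run into
BEGIN
    -- raise the error here
END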
Maybe I missed a case where this won't work; if so, please let me know.
Using a trigger for data integrity checks is perfectly fine and shouldn't be a problem at all. It is much easier than validating ahead of time, especially since concurrent requests could cause real problems for up-front validation.
On the other hand, to keep this from happening so easily, I would add a few more zeros to the initial values:
startno | actualno | location
100000 | 100059 | 0
200000 | 200003 | 1
300000 | 300041 | 2
400000 | 400002 | 0
As so often, I found an approach not long after posting the question. It seems that describing a problem so other people understand it is halfway to the solution. At least I have a possible one, which has so far proved quite robust.
I query the database with
SELECT nostart FROM numbers
WHERE nostart BETWEEN X AND Y
where X is the start number requested and Y is the end number of the user's request. (To match my introductory example, X = 342 and Y = 441.)
This will then give me a list of all ranges whose starting number is inside the range of the numbers the user requested, in this case the list would be
nostart
400
Now, if the query doesn't find a result, I'm golden and the numbers can be used. If the query finds a single result, and that result is equal to the starting number of the user, I'm also OK because this means it's the first time a user requested something from this range.
If that is not the case, the range cannot be used, because another range starts inside it. Also, if the query finds multiple results (e.g. for X = 100 and Y = 350, which would return 100, 200 and 300), I deny the request, because several ranges would overlap.
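For illustration, the decision rule described above can be written as a small Python function. This is a sketch only: `can_reserve` is a made-up name, and the list of start numbers stands in for the result of the SELECT.

```python
def can_reserve(range_starts, first, last):
    """Decide whether the block first..last may be handed out.

    range_starts: start numbers of all known ranges.
    """
    # Equivalent of: SELECT nostart ... WHERE nostart BETWEEN first AND last
    hits = [s for s in range_starts if first <= s <= last]
    # OK if no range starts inside the block, or if the only hit is the
    # block's own starting number (first request from a fresh range).
    return not hits or hits == [first]


STARTS = [100, 200, 300, 400]
print(can_reserve(STARTS, 342, 441))  # block would run into range 400
print(can_reserve(STARTS, 342, 399))  # block stays inside range 300
```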
If anyone has a better solution or notes on this one, I'll leave this here and use it as long as it works out.
Given a string in the format {Length}.{Text} (such as 3.foo), I want to determine which string, from a finite list, the given string is.
The reader starts at the 0-index and can seek forward (skipping characters if desired).
As an example, consider the following list:
10.disconnect
7.dispose
7.distort
The shortest way to determine which of those strings has been presented might look like:
if (reader.Current == "1")
{
// the word is "disconnect"
}
else
{
reader.MoveForward(5);
if (reader.Current == "p")
{
// the word is "dispose"
}
else
{
// the word is "distort"
}
}
The question has 2 parts, though I hope someone can just point me at the right algorithm or facet of information theory that I need to read more about.
1) Given a finite list of strings, what is the best way to generate logic that requires the least number of seeks & comparisons, on average, to determine which word was presented?
2) As with the first, but allowing weighting such that hotpaths can be accounted for. i.e. if the word "distort" is 4 times more likely than the words "disconnect" and "dispose", the logic shown above would be more performant on average if structured as:
reader.MoveForward(5);
if (reader.Current == "t")
{
// the word is distort
}
else //...
Note: I'm aware that the 6th character in the example set is unique so all you need to do to solve the example set is switch on that character, but please assume there is a longer list of words.
Also, this isn't some homework assignment - I'm writing a parser/interception layer for the Guacamole protocol. I've looked at Binary Trees, Tries, Ulam's Game, and a few others, but none of those fit my requirements.
I don't know if this will be of any help, but I'll throw in my 5 cents anyway.
What about a tree that automatically gets more granular as you add more strings to the list, and in which the existing leaves are checked in "hot path" order?
For example, with your list I would have something like this:
10.disconnect
7.dispose
7.distort
root ---- 7 ---- "check 4th letter"    ---- if "t" return "distort"
 |               (in hot-path order)   ---- if "p" return "dispose"
 |
 ---- 10 ---- return "disconnect"
You can have this built up dynamically. For example, if you add 7.display it would be:
root ---- 7 ---- "check 4th letter"    ---- if "t" return "distort"
 |               (in hot-path order)   ---- if "p" ---- "check 5th letter" ---- if "o" ...
 |                                                                         ---- if "l" ...
 ---- 10 ---- return "disconnect"
So nodes in the tree would have a variable "which index to check", and leaves corresponding to the possible results (their order is determined statistically). Something like:
# python example
class Node:
    def __init__(self, which_index, letter, weight=0):
        self.which_index = which_index  # index this node checks to pick the next node
        self.letter = letter            # letter that leads to this node
        self.weight = weight            # how "hot" this path is (hypothetical field)
        self.leaves = []

    def add_leaf(self, node):
        # keep the hottest paths first so they are compared first
        self.leaves.append(node)
        self.leaves.sort(key=lambda leaf: leaf.weight, reverse=True)

    def get_next(self, some_string):
        for leaf in self.leaves:
            if some_string[self.which_index] == leaf.letter:
                return leaf
        raise LookupError("not found")
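To make the idea a bit more concrete, here is a rough sketch of how such a tree could be built automatically. It is a greedy heuristic, not a proven optimum: at each step it inspects the character index that splits the remaining candidates into the most groups, and it orders the branches by total weight so hot paths are compared first. The names `build_tree` and `identify` and the weight dictionary are assumptions made for this example.

```python
def build_tree(words, weights):
    """Return a word (leaf) or a tuple (index, [(char, subtree), ...])."""
    if len(words) == 1:
        return words[0]
    shortest = min(len(w) for w in words)
    # Greedy choice: the index with the most distinct characters.
    best = max(range(shortest), key=lambda i: len({w[i] for w in words}))
    groups = {}
    for w in words:
        groups.setdefault(w[best], []).append(w)
    if len(groups) == 1:
        raise ValueError("words cannot be told apart: %r" % words)
    # Hot paths first: order branches by the summed weight of their words.
    branches = sorted(groups.items(),
                      key=lambda kv: -sum(weights[w] for w in kv[1]))
    return (best, [(ch, build_tree(g, weights)) for ch, g in branches])


def identify(tree, text):
    """Walk the tree; assumes `text` is one of the words the tree was built from."""
    while not isinstance(tree, str):
        index, branches = tree
        for ch, subtree in branches:
            if text[index] == ch:
                tree = subtree
                break
        else:
            raise LookupError("not found")
    return tree


WORDS = ["10.disconnect", "7.dispose", "7.distort"]
WEIGHTS = {"10.disconnect": 1, "7.dispose": 1, "7.distort": 4}
TREE = build_tree(WORDS, WEIGHTS)
print(TREE)
```

For the example list this picks index 5 (the 6th character mentioned in the question) and tests for "t" first, because "distort" carries the highest weight.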
Another alternative is, of course, hashing.
But if you are micro-optimizing, it is hard to say which is best, as other factors come into play (e.g. the time you save through memory caching would probably be very significant).
0 1 1 1 1
1 0 1 1 2
1 1 0 1 2
1 1 1 0 2
1 2 2 2 0
The table above gives the value of a relation for each pair of strings. As you can see, the matrix is symmetric (the relation is commutative). Now I am supposed to find all possible groups (it is fine if subsets of an already found group are skipped; a group can have any size) such that, for every pair of strings in a group, the relation value is less than a particular threshold (say 2).
I've tried to do it in C#, but my attempt did not cover all possibilities and resulted in too many loops. The reason I didn't turn to clustering algorithms is that this relation is not a distance metric.
An algorithm, a syntactic element in C# that might make the process easier, or a clue about how to approach the problem... any help will be appreciated.
That is quite out of my league, but I did a quick search and found these, which might give you some insight:
"Cluster analysis" with MySQL
This one is an article, in Portuguese though, about using the Tocher method in cluster analysis:
http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0100-204X2007001000008
Sorry for not being able to help more. I'll study it, see if I can come up with something more useful
Not all clustering algorithms require a metric.
For example, most HAC (hierarchical agglomerative clustering) variants can work with similarities as well as with distances (except for Ward's method, apparently).
Your requirement sounds exactly like complete linkage clustering. Even if you don't use clustering, you'll still get the same result as with HAC.
Bad news is that HAC usually is O(n^3). But I believe if you fix the threshold beforehand, it is only O(n^2).
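As a different angle from the clustering suggestions above (not something the answers propose): if literally every group is wanted, the requirement can be read as finding the maximal cliques of a graph in which two strings are connected when their relation value is below the threshold. A minimal Python sketch using the basic Bron-Kerbosch recursion follows; `maximal_groups` is a made-up name, and the worst case is exponential, so this is only meant for small inputs.

```python
def maximal_groups(matrix, threshold):
    """Return all maximal groups whose pairwise values are < threshold."""
    n = len(matrix)
    # neighbours[i]: every j that may share a group with i
    neighbours = [{j for j in range(n) if j != i and matrix[i][j] < threshold}
                  for i in range(n)]
    groups = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            groups.append(sorted(current))  # nothing can be added: maximal
            return
        for v in list(candidates):
            expand(current | {v},
                   candidates & neighbours[v],
                   excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    return groups


MATRIX = [
    [0, 1, 1, 1, 1],
    [1, 0, 1, 1, 2],
    [1, 1, 0, 1, 2],
    [1, 1, 1, 0, 2],
    [1, 2, 2, 2, 0],
]
print(sorted(maximal_groups(MATRIX, 2)))
```

With the matrix from the question and a threshold of 2, the groups are strings 0 to 3 together, and strings 0 and 4 together (indices are zero-based row numbers).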
I'm really sorry, I would have searched for my question, but I don't know how to word it correctly. I have a big problem with mental math and I've been trying to think of a solution. I'm trying to build a program for my guild that will add names to a list. Each entry in this list is a struct that holds the person's name and number of donations to a raffle.
I want to pick randomly from all the names and have the results positively influenced by the number of donations people have put in, like a 1 or 1% increase in probability per donation. I've done randomization before, but I've never had to influence the odds slightly. Help me please? I would post code, but I don't have ANY randomization at the moment. I can give you some other stuff, though:
I made a struct to hold each user's entry:
public struct pEntry
{
public string Name;
public int Entries;
}
I made a list to hold all the entries:
public List<pEntry> lEntries = new List<pEntry>();
I want my randomization to be positively influenced by the number of entries (donations) they've made. This has no cap on how high it can go, but if I need to, I can cap it at something like 255. After it randomizes, it will display a message saying that the user has won, remove that entry from the list, and pick a few more. This program will be used for multiple raffles and similar things, all using the same system.
You need to know how much donations increase the chance of winning. Does 1 donation = 1 more chance than someone who hasn't donated? Once you have a base and a bonus, you can just generate a "chart", see what your top number is, and generate a random number. After that, you can remove that person or reduce their chances by whatever fraction you want, regenerate your chance list, and generate a new number.
For instance, if you want 1 donation unit to give 10% more chance for that person, then you could generate a list of all the guild, and each person gets 10 "lots" + 1 lot per donation unit. So if you had 3 people, where:
Ann gave 5 donation units
Tom gave 2 donation units
Bob didn't give any
The resulting list would be:
Ann lot count: 10 (base) + 5 (donations) = 15 lots
Tom lot count: 10 (base) + 2 (donations) = 12 lots
Bob lot count: 10 (base) + 0 (no donations) = 10 lots
Then you generate a number between 1 and 37 (15 + 12 + 10) to determine the winner.
Figure out how much each donation should improve their odds (the increase in their lots), and then you can start building your ranges and generating your numbers.
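The lot scheme above could look roughly like this in Python (the question is about C#, so treat it as pseudocode for the idea). The name `draw_winners` and the `base_lots` parameter are assumptions for the example; `random.choices` from the standard library does the weighted pick, so the lot ranges do not have to be built by hand.

```python
import random


def draw_winners(donations, count, base_lots=10, rng=random):
    """Draw `count` distinct winners; each donation adds one lot to the base."""
    pool = dict(donations)  # copy, so the caller's data is left untouched
    winners = []
    for _ in range(min(count, len(pool))):
        names = list(pool)
        lots = [base_lots + pool[name] for name in names]
        winner = rng.choices(names, weights=lots, k=1)[0]
        winners.append(winner)
        del pool[winner]    # a winner cannot be drawn again
    return winners


GUILD = {"Ann": 5, "Tom": 2, "Bob": 0}
print(draw_winners(GUILD, 2))
```

Removing the winner and recomputing the weights for the next draw corresponds to the "regenerate your chance list" step.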
Yesterday at work I set out to figure out how to sort numbers without using the library method Array.Sort. I worked on it on and off when time permitted and was finally able to come up with a basic working algorithm at the end of today. It might be rather stupid and the slowest way, but I am content that I have working code.
But there is something wrong or missing in the logic, that is causing the output to hang before printing the line: Numbers Sorted. (12/17/2011 2:11:42 AM)
This delay is directly proportionate to the number of elements in the array. To be specific, the output just hangs at the position where I put the tilde in the results section below. The content after tilde is getting printed after that noticeable delay.
Here is the code that does the sort:
while(pass != unsortedNumLen)
{
for(int i=0,j=1; i < unsortedNumLen-1 && j < unsortedNumLen; i++,j++)
{
if (unsorted[i] > unsorted[j])
{
pass = 0;
swaps++;
Console.Write("Swapping {0} and {1}:\t", unsorted[i], unsorted[j]);
tmp = unsorted[i];
unsorted[i] = unsorted[j];
unsorted[j] = tmp;
printArray(unsorted);
}
else pass++;
}
}
The results:
Numbers unsorted. (12/17/2011 2:11:19 AM)
4 3 2 1
Swapping 4 and 3: 3 4 2 1
Swapping 4 and 2: 3 2 4 1
Swapping 4 and 1: 3 2 1 4
Swapping 3 and 2: 2 3 1 4
Swapping 3 and 1: 2 1 3 4
Swapping 2 and 1: 1 2 3 4
~
Numbers sorted. (12/17/2011 2:11:42 AM)
1 2 3 4
Number of swaps: 6
Can you help identify the issue with my attempt?
Link to full code
This is not homework, just me working out.
Change the condition in your while to this:
while (pass < unsortedNumLen)
Logically pass never equals unsortedNumLen so your while won't terminate.
pass does eventually equal unsortedNumLen, once it exceeds the maximum value of an int and wraps around to it.
In order to see what's happening yourself while it's in the hung state, just hit the pause button in Visual Studio and hover your mouse over pass to see that it contains a huge value.
You could also set a breakpoint on the while line and add a watch for pass. That would show you that the first time the list is sorted, pass equals 5.
It sounds like you want a hint to help you work through it and learn, so I am not posting a complete solution.
Change your else block to the below and see if it puts you on the right track.
else {
Console.WriteLine("Nothing to do for {0} and {1}", unsorted[i], unsorted[j]);
pass++;
}
Here is the fix:
while(pass < unsortedNumLen)
And here is why the delay occurred.
After the for loop in which the array finally became sorted, pass contains at most unsortedNumLen - 2 (when the last swap was between the first and second elements). That does not equal the array length, so another iteration of the while and the inner for starts. Since the array is now sorted, unsorted[i] > unsorted[j] is always false, so pass is incremented on every step of the for loop, that is, unsortedNumLen - 1 times. In the 4-element example pass goes from 2 to 5, skipping right over 4, so the != test still holds and another iteration of the while begins. Nothing essentially changes: every further iteration adds another unsortedNumLen - 1 to pass, and the loop continues unless pass happens to land exactly on unsortedNumLen. And so on.
When pass reaches int.MaxValue, an overflow happens, and the next value pass gets is int.MinValue. The process goes on until pass finally holds the value unsortedNumLen at the moment the while condition is checked. If you are particularly unlucky, this might never happen at all.
P.S. You might want to check out this link.
This is just a characteristic of the algorithm you're using to sort. Once it's completed sorting the elements it has no way of knowing the sort is complete, so it does one final pass checking every element again. You can fix this by adding --unsortedNumLen; at the end of your for loop as follows:
for(int i=0,j=1; i < unsortedNumLen-1 && j < unsortedNumLen; i++,j++)
{
/// existing sorting code
}
--unsortedNumLen;
Reason? Because your algorithm bubbles the biggest value to the end of the array, there is no need to check this element again, since it has already been determined to be larger than all other elements.
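For comparison, a common way to avoid the pass counter altogether is a "swapped" flag: stop as soon as one full pass makes no swap. This is a different termination test from the one in the question, sketched here in Python rather than C#; it also shrinks the scanned part after every pass, as suggested above. The function name `bubble_sort` is chosen for the example.

```python
def bubble_sort(values):
    """Return (sorted copy, number of swaps) using bubble sort."""
    items = list(values)
    swaps = 0
    unsorted_len = len(items)
    swapped = True
    while swapped:
        swapped = False
        for i in range(unsorted_len - 1):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
                swaps += 1
        # The largest remaining value has bubbled to the end of the
        # scanned part, so the next pass can skip it.
        unsorted_len -= 1
    return items, swaps


print(bubble_sort([4, 3, 2, 1]))
```

For the input 4 3 2 1 this performs the same six swaps as in the output shown in the question and then stops after a single pass without swaps.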
I am trying to figure out a way to let the user sort records (e.g. a list of friends).
I want to let the user move a record (friend) straight to the top or bottom of the list, or to a position in between by entering a number.
First I thought of just adding a column called SortOrder (int) to the table with all the user's friends and setting the number according to the order in which the records should be shown.
But what I am trying to avoid is this: if a user has e.g. 400 friends and wants to move friend number 400 to position 1 in the list, I would have to update every single record with a new SortOrder.
All data is stored in an MS SQL database.
I hope someone out there has a magic solution for this?
Use floating point numbers for the sort column.
Set the initial items as 0.0, 1.0 etc.
Moving to top, use min - 1.0. Moving to bottom, use max + 1.0. Moving between two items, use (prev + next) / 2.0.
This is similar to the line numbers approach, but there is more "space" between the numbers. Theoretically, there is still a point where you need to renumber, when two adjacent values grow too close. I have no idea how soon this happens in practice, but I expect it to be very infrequent, so it can be handled by a maintenance task.
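A small Python sketch of these three rules, for illustration only. The function name `new_sort_key` and its arguments are made up; `keys` is assumed to be the sorted list of sort values of the other items, and `target_index` the position the moved item should end up at.

```python
def new_sort_key(keys, target_index):
    """Return a sort value that places an item at `target_index` among `keys`."""
    if not keys:
        return 0.0
    if target_index <= 0:
        return keys[0] - 1.0          # move to top: min - 1.0
    if target_index >= len(keys):
        return keys[-1] + 1.0         # move to bottom: max + 1.0
    # move between two neighbours: (prev + next) / 2.0
    return (keys[target_index - 1] + keys[target_index]) / 2.0


KEYS = [0.0, 1.0, 2.0]
print(new_sort_key(KEYS, 0), new_sort_key(KEYS, 1), new_sort_key(KEYS, 3))
```

Only the moved record gets a new value; all other rows keep theirs.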
[edit] FWIW, this problem came back to me a few times, so here's a way that does roughly the same, but with strings.
I wouldn't imagine they'd be doing this often enough to be a real concern but, if you're worried, use the trick we pioneered with our BASIC code from days of yore.
Back when BASIC had line numbers, we'd simply number them 10, 20, 30 and so on, so that if we needed to insert one between 10 and 20, we'd call it 15. Or if 20 should have come before 10, we'd renumber it to 5.
With a 32 bit integer column you could have 200,000 friends with a spacing of 100, more than enough to move things around, especially if you're clever.
You may want to run a sweep job occasionally to renumber the friends to 100, 200, and so on (sort of a disk defragmenter for your social network). Don't try to detect the need for this by looking at the friend numbers; use another field, setting it to true when a user re-arranges their friends and clearing it when you defragment. This will be more efficient.
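The gap approach and the sweep job might be sketched like this in Python. The helper names `key_between` and `renumber` and the `GAP` constant are assumptions for the example; in practice the renumbering would be an UPDATE over the user's rows.

```python
GAP = 100  # spacing between neighbouring sort values after a sweep


def key_between(lower, upper):
    """Return an integer strictly between the two keys, or None if the gap is used up."""
    middle = (lower + upper) // 2
    return middle if lower < middle < upper else None


def renumber(ordered_ids):
    """Sweep job: give the records fresh keys 100, 200, ... in their current order."""
    return {record_id: (position + 1) * GAP
            for position, record_id in enumerate(ordered_ids)}


print(key_between(100, 200))   # room for an insert
print(key_between(100, 101))   # no room left, a sweep is due
print(renumber(["ann", "bob", "tom"]))
```

A None result from `key_between` is the signal that the list has become too dense at that spot and the sweep should run before the move is applied.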
It sounds like you're looking for a linked list type structure, where each record would hold the ID of the next record in order.
I don't know about magic, but for moving to the top or bottom, you could just set the SortOrder to MIN/MAX(SortOrder) +/- 1. Who says the top has to be 1 or 0?
Here's how I'd do it: Use your SortOrder column. Presumably, there would be an initial default sort order, say alphabetical, and so everyone would be given a SortOrder value based on their alphabetical order.
Then, when a user moved someone to the top, you could just set SortOrder to max +1. If they moved someone to the bottom, then it would be min -1. If they moved someone to somewhere in the middle, then you would want to calculate which half of the middle they are moving to. If it's the top half, then bump up the SortOrder of everyone above them. If it's the bottom half, then decrease the SortOrder of everyone below.
Not sure there's a more expedient way of doing it...
You could look at it as groups of friends.
Initially, everyone is in Group 0, and the order is by name or something.
- If the user then increases the "Group" of friend (a) to 1, then they move to the top
- If the user then increases the "Group" of friend (b) to 1, then (a) and (b) appear at the top
- If the user then increases the "Group" of friend (b) again, then (b) appears 1st and (a) 2nd
Just a thought...