I'm looking for an efficient method to generate a list of numbers in every possible ordering. So, if I had a generic list of integers (1–120), I would want one result to be all 120 numbers in numerical order from 1 to 120, and then I would need every other arrangement in which those numbers appear in a different order.
For what it’s worth, here’s how it may be done for small ranges (e.g. 1–8) using LINQ and recursion.
If you try increasing the range incrementally, you will quickly see why this approach cannot work for large ranges: the number of orderings grows factorially.
static void Main(string[] args)
{
    int[][] combinations = GetCombinations(8).Select(c => c.ToArray()).ToArray();
    string s = string.Join("\n", combinations.Select(c => string.Join(",", c)));
    Console.WriteLine(s);
}

static IEnumerable<IEnumerable<int>> GetCombinations(int count)
{
    return GetCombinations(Enumerable.Range(1, count));
}

// Recursive step: pick each element in turn as the head,
// then permute whatever remains.
static IEnumerable<IEnumerable<int>> GetCombinations(IEnumerable<int> elements)
{
    if (elements.Count() == 1)
        return EnumerableSingle(elements);
    return elements.SelectMany((element, index) =>
        GetCombinations(elements.ExceptAt(index)).Select(tail =>
            tail.Prepend(element)));
}

// Note: the following extension methods must live in a top-level static class.
static IEnumerable<T> ExceptAt<T>(this IEnumerable<T> source, int index)
{
    return source.Take(index).Concat(source.Skip(index + 1));
}

static IEnumerable<T> Prepend<T>(this IEnumerable<T> source, T element)
{
    return EnumerableSingle(element).Concat(source);
}

static IEnumerable<T> EnumerableSingle<T>(T element)
{
    return Enumerable.Repeat(element, 1);
}
Well, if you find a way to do that fast, go and claim a Nobel Prize.
You will have broken every modern encryption mechanism, which rests on a similar principle: the fact that enumerating every possible combination of two (prime) numbers cannot be done quickly.
If this is homework, you have been set up with a joke. If you really think there is a magic hidden secret we are not telling you, you are living in delusions.
Sorry, this is one of those questions that just makes no sense.
I'm looking for an efficient method
Define efficient. The most efficient method I can see right now is grabbing a TON of computers and going at it with brute force. The NSA supposedly can do that for 128-bit numbers within an acceptable timeframe now ;)
The second alternative, if you have limited money, is to trade money for time. Put a small machine with a solar panel somewhere and let it calculate for a long while. Supposedly, per the one true story of the world (as told in "The Hitchhiker's Guide to the Galaxy"), this is why Earth exists: to calculate the question to the absolute answer, which is 42.
The THIRD way, by far the most efficient, is just to use 42 as the answer. If it fits, you have just found THE question; if not, it is just another failure.
Sorry, I HAD to make that non-serious. People regularly come in with "simple" mathematical questions that just FALL into the factorization type of trap.
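To put a number on why brute force is hopeless here: the count of orderings of 120 items is 120!. A minimal C# sketch using System.Numerics.BigInteger to compute it exactly (the class and variable names here are mine):
using System;
using System.Numerics;

class PermutationCount
{
    static void Main()
    {
        // 120! = the number of distinct orderings of 120 items.
        BigInteger count = BigInteger.One;
        for (int i = 2; i <= 120; i++)
            count *= i;

        // Prints a 199-digit number, roughly 6.7e198: far more orderings
        // than could ever be enumerated by brute force.
        Console.WriteLine(count);
        Console.WriteLine(count.ToString().Length + " digits");
    }
}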
First, a little background: I enjoy working on Project Euler problems (https://projecteuler.net/archives), but many of them require a ton of heavy computation, so I try to keep known constants in memory so they don't have to be recalculated every time. These include things like n!, nPr, nCr, and lists of primes. For the purpose of this question let's just stick with primes, because any solution for those can easily be ported to the others.
The question: let's say I want to keep the first 1,000,000 primes in memory for repeated access while doing heavy computation. The 1,000,000th prime is 15,485,863, so ints will do just fine here. I need to store these values in a way such that access is O(1), because they will be accessed a lot.
What I've tried so far:
Clearly I can't put all 1,000,000 in one .cs file because Visual Studio throws a fit. I've been trying to break it into multiple files using a partial class and a 2-D List&lt;List&lt;int&gt;&gt;:
public partial class Primes
{
    public readonly List<int> _primes_1 = new List<int>
    {
        2, 3, ... 999983
    };
}
So _primes_1 has the primes less than 1,000,000, _primes_2 has the primes between 1,000,000 and 2,000,000, etc., 15 files' worth. Then I put them together:
public partial class Primes
{
    // Note: a field initializer cannot reference other instance fields,
    // so in practice this list has to be populated in a constructor.
    public List<List<int>> _primes = new List<List<int>>()
    {
        _primes_1, _primes_2, _primes_3, _primes_4, _primes_5,
        _primes_6, _primes_7, _primes_8, _primes_9, _primes_10,
        _primes_11, _primes_12, _primes_13, _primes_14, _primes_15
    };
}
This methodology does work as it is easy to enumerate through the list and IsPrime(n) checks are fairly simple as well (binary search). The big downfall with this methodology is that VS starts to freak out because each file has ~75,000 ints in it (~8000 lines depending on spacing). In fact, much of my editing of these files has to be done in NPP just to keep VS from hanging/crashing.
Other things I've considered:
I originally read the numbers in from a text file and could do that in the program, but clearly I would want to do that at startup and then just have the values available. I also considered dumping them into SQL, but again, eventually they need to be in memory. For the in-memory storage I considered memcache, but I don't know enough about it to know how efficient its lookups are.
In the end, this comes down to two questions:
How do the numbers get in to memory to begin with?
What mechanism is used to store them?
Spending a little more time in spin up is fine (within reason) as long as the lookup mechanism is fast fast fast.
Quick note: Yes I know that if I only do 15 pages as shown then I won't have all 1,000,000 because 15,485,863 is on page 16. That's fine, for our purposes here this is good enough.
Bring them in from a single text file at startup. This data shouldn't be in source files (as you are discovering).
Store them in a HashSet<int>, so for any number n, isPrime = n => primeHashSet.Contains(n). This will give you your desired O(1) complexity.
// Requires: using System.IO; using System.Linq;
// using System.Text.RegularExpressions;
HashSet<int> primeHashSet = new HashSet<int>(
    File.ReadLines(filePath)
        .AsParallel() // maybe? measure before keeping this
        .SelectMany(line => Regex.Matches(line, @"\d+").Cast<Match>())
        .Select(m => m.Value)
        .Select(int.Parse));

Predicate<int> isPrime = primeHashSet.Contains;
bool someNumIsPrime = isPrime(5000); // for example
On my (admittedly fairly snappy) machine, this loads in about 300ms.
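If you also need ordinal access (the nth prime) rather than only membership tests, a plain array alongside the set gives O(1) indexing. A sketch under the same assumptions (the file at filePath lists the primes in ascending order):
// Array for "give me the nth prime" in O(1); keep the HashSet for IsPrime.
int[] primesInOrder = File.ReadLines(filePath)
    .SelectMany(line => Regex.Matches(line, @"\d+").Cast<Match>())
    .Select(m => int.Parse(m.Value))
    .ToArray();

int thousandthPrime = primesInOrder[999]; // 7919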
Hey, I've been working on something from time to time and it has become relatively large now (and slow). However, I managed to pinpoint the bottleneck after closely measuring performance as a function of time.
Say I want to "permute" the string "ABC". What I mean by "permute" is not quite a permutation, but rather the set of contiguous substrings following this pattern:
A
AB
ABC
B
BC
C
I have to check, for every substring, whether it is contained within another string str2, so I've written a quick'n'dirty literal implementation as follows:
// strlen1 = str1.Length
for (int i = 0; i < strlen1; i++)
{
    // Bound j by str1's remaining length so Substring stays valid;
    // start at 1 to skip the empty substring (contained in everything).
    for (int j = 1; j <= strlen1 - i; j++)
    {
        string sub = str1.Substring(i, j);
        if (str2.Contains(sub)) { /* do stuff */ }
        else break;
    }
}
This was very slow initially, but then I realised that if a shorter substring doesn't occur, there is no need to check its longer extensions: if sub isn't contained within str2, I can break out of the inner loop.
OK, this gave blazingly fast results, but when calculating my algorithm's complexity I realised that the worst case is around N^4. I had forgotten that String.Contains() and String.Substring() both have their own complexities (N or N^2, I forget which).
The fact that I make a huge number of calls to those inside a doubly nested loop makes it perform, well... N^4. Enough said.
However, I also calculated the average run time mathematically, using probability theory to evaluate how likely a matching substring is to keep growing in a pool of randomly generated strings (this was my baseline), measuring when the probability became > 0.5 (50%).
This showed a (roughly) exponential relationship between the number of distinct characters and the string length, which means that in the scenarios where I use my algorithm, the length of string1 will most probably never exceed 7.
Thus the average complexity would be ~O(N * M), where N is the length of string1 and M is the length of string2. Since I tested N as a function of a constant M, I got linear growth, ~O(N) (not bad compared to the N^4, eh?).
I did timing tests and plotted a graph, which showed nearly perfect linear growth, so my actual results matched my mathematical predictions (yay!).
However, this did NOT take into account the cost of String.Contains() and String.Substring(), which made me wonder whether this could be optimized even further.
I've also been thinking of rewriting this in C++ because I need rather low-level control. What do you guys think? I have put a great deal of time into analysing this; I hope I've explained everything clearly enough :)!
Your question is tagged both C++ and C#.
In C++ the optimal solution will be to use iterators and std::search. The original strings remain unmodified, and no intermediate objects get created. There won't be an equivalent of your Substring() taking place at all, so this eliminates that part of the overhead.
This should achieve the theoretically-best performance: brute force search, testing all permutations, with no intermediate object construction or destruction, other than the iterators themselves, which simply replace your two int index variables. I can't think of any faster way of implementing this basic algorithm.
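Since the rest of the thread is C#, here is a rough C# analogue of the same zero-allocation idea using ReadOnlySpan&lt;char&gt; (requires .NET Core 2.1+ or the System.Memory package, plus using System; a sketch, not a drop-in replacement):
// Search all contiguous substrings of str1 inside str2 without
// allocating intermediate strings, using spans.
static void FindSubstrings(string str1, string str2)
{
    ReadOnlySpan<char> s1 = str1.AsSpan();
    ReadOnlySpan<char> s2 = str2.AsSpan();

    for (int i = 0; i < s1.Length; i++)
    {
        for (int len = 1; len <= s1.Length - i; len++)
        {
            ReadOnlySpan<char> sub = s1.Slice(i, len); // a view into str1, no copy
            if (s2.IndexOf(sub) < 0)
                break; // if this prefix is absent, any extension of it is too
            // "do stuff" with sub here
        }
    }
}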
Are you testing one string against one string? If you test a bunch of strings against another bunch of strings, it is a whole different story. Even if you have the best algorithm for comparing one string against another, O(X), it does not follow that repeating it M*N times gives the best algorithm for processing M strings against N.
When I built something similar, I made a dictionary of all substrings of all N strings:
Dictionary<string, List<int>>
The string key is a substring, and the int is the index of the string that contains it. Then I tested all substrings of all M strings against it. The complexity was suddenly not O(M*N*X) but O(max(M,N)*S), where S is the number of substrings of one string. Depending on M, N, X and S, that may be faster. I am not saying the dictionary of substrings is the best approach; I just want to point out that you should always try to see the whole picture.
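A minimal sketch of that index (the names are mine; sourceStrings stands for the N strings):
// Build an index from every contiguous substring of every source string
// to the indices of the source strings that contain it.
static Dictionary<string, List<int>> BuildSubstringIndex(IList<string> sourceStrings)
{
    var index = new Dictionary<string, List<int>>();
    for (int n = 0; n < sourceStrings.Count; n++)
    {
        string s = sourceStrings[n];
        for (int i = 0; i < s.Length; i++)
        {
            for (int len = 1; len <= s.Length - i; len++)
            {
                string sub = s.Substring(i, len);
                if (!index.TryGetValue(sub, out List<int> owners))
                    index[sub] = owners = new List<int>();
                owners.Add(n);
            }
        }
    }
    return index;
}

// Each query substring is then a single dictionary probe:
// index.TryGetValue(querySub, out List<int> matches)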
I am currently looking at typical interview questions, to get me in the right frame of mind.
I am trying to come up with my own solutions to the problems instead of trying to remember the given solutions.
The problem is that I'm not sure whether my solutions are optimal or have a major design flaw that I am not seeing.
So here is one of the solutions I came up with for the basic "is this string unique" problem, i.e. checking whether all characters in a string are unique.
public static bool IsUnique(string str)
{
    bool isUnique = true;
    for (int i = 0; i < str.Length; i++)
    {
        if (str.LastIndexOf(str.ElementAt(i)) != i)
        {
            isUnique = false;
            break;
        }
    }
    return isUnique;
}
Does anyone have advice on whether this code is optimal and has acceptable time and space complexity?
For the purposes of this answer I will use Big-O notation to indicate the complexity of an algorithm. The trick to efficiency is to work out the minimum Big-O bound at which the problem can be solved, and then attempt to achieve it.
You can derive some efficiency facts by thinking about the algorithm logically: to check that all characters are unique, you need to examine all characters, so an O(n) traversal of the string is guaranteed, and you won't easily do better than that. Now, can you solve it yourself in one pass, or at worst a couple of passes (still O(n))? If so, that's pretty decent, because your algorithm runs in linear time and will scale linearly (it steadily gets slower for larger string inputs).
Your current algorithm loops over the string and then, for each character, iterates over the string again looking for an equal character. This makes it an n-traversal where each visit does an n-traversal itself, i.e. an O(n^2), or quadratic, algorithm. That is not very good, because it does not scale linearly: it will get much slower with larger inputs.
A quick change to make it slightly more efficient would be to start the comparison at the current index + 1. You already know that all previously checked characters are unique, so you only care about the characters ahead. This becomes an n-traversal where each visit traverses only the remainder of the string (less work as you go), but the total work, about n^2/2 comparisons, is still O(n^2). It is slightly more efficient, yet it will still scale badly with larger inputs.
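A sketch of that slightly improved variant (still quadratic):
// Compare each character only with the characters after it.
public static bool IsUniqueQuadratic(string str)
{
    for (int i = 0; i < str.Length; i++)
        for (int j = i + 1; j < str.Length; j++)
            if (str[i] == str[j])
                return false;   // duplicate found
    return true;                // about n^2/2 comparisons in the worst case
}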
Think of alternative ways to avoid repeated iterations. These often come at the cost of memory, but are practical. I know how I would try and solve it, but telling you my answer doesn't help you learn. ;)
EDIT: As you requested, I'll share my answer
I'd do it by having a HashSet that I load each visited character into. HashSet lookups and adds are approximately an O(1) operation. The beauty of the HashSet.Add method is that it returns true if it added the value and false if the value already existed (which is the condition that determines your algorithm result). So mine would be:
var hashSet = new HashSet<char>();
foreach (char c in myString)
{
    if (!hashSet.Add(c))
    {
        return false;
    }
}
return true;
Pros: O(n) linear algorithm.
Cons: Extra memory used for HashSet.
EDIT2: Everyone loves cheap LINQ tricks, so here's another way
var hashSet = new HashSet<char>();
return myString.All(c => hashSet.Add(c)); // true only if every Add succeeds, i.e. no duplicates
Using a HashSet is more efficient, as it has a constant lookup time, O(1), compared to looking up a character in a string, which has linear lookup time, O(n):
public static bool AreCharsUnique(string str)
{
    var charset = new HashSet<char>();
    foreach (char c in str) {
        if (charset.Contains(c)) {
            return false;
        } else {
            charset.Add(c);
        }
    }
    return true;
}
This was a question asked by an interviewer; I was unable to answer it.
The question was: assume you want to pick a random number from a given array.
The conditions are that you are not supposed to pick elements sequentially,
and you may not use the built-in Random function.
I have no idea. I'd also like to know what Math.Random actually does for us.
I googled and didn't find the implementation/logic behind it.
Anyone know?
So far three people have told you to use the last digit of Ticks. This doesn't work. Try doing so in a tight loop and you will quickly see why it is a bad idea.
The question is not very well posed. I like giving ambiguously posed questions in interviews because you get to find out how the candidate deals with an ambiguous situation. In this case I would immediately push back and find out what the interviewer means by "random". Is pseudo-randomness good enough? Is there a source of high-quality entropy available?
Once you have a clarified question it should be easier to answer.
The problem comes down to managing entropy. If you have a very weak source of entropy -- like the value of Ticks (not the last digit, which is worthless, but the entire value) then you can use that to seed a pseudo-random-number generator. If you have a high quality source of entropy then you can just use that to generate random bits directly.
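To illustrate that last point, here is a minimal linear congruential generator seeded once from Ticks. The multiplier and increment are the well-known Numerical Recipes constants; this is a sketch of the idea, not a production-quality PRNG (the class name is mine):
// A tiny LCG: state = state * a + c (mod 2^32), seeded once from Ticks.
class TinyLcg
{
    private uint _state;

    public TinyLcg()
    {
        // Weak entropy source used only as a seed, as described above.
        _state = (uint)DateTime.Now.Ticks;
    }

    public int Next(int maxValue)
    {
        // Numerical Recipes constants for a 32-bit LCG.
        _state = _state * 1664525u + 1013904223u;
        // Use the high bits; the low bits of an LCG are the least random.
        return (int)((_state >> 16) % (uint)maxValue);
    }
}

// Usage: var rng = new TinyLcg(); int index = rng.Next(array.Length);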
Guaranteed to be random. (tongue FIRMLY in cheek):
// Runs in LINQPad (.Dump() is a LINQPad extension method).
// Requires: using System.IO; using System.Net;
void Main()
{
    Enumerable.Range(0, 10).Select(x => ComeOnItsKindaRandom(0, 10)).Dump();
}

public int ComeOnItsKindaRandom(int minValue, int maxValue)
{
    var query = "http://www.random.org/integers/?num=1&min={0}&max={1}&col=1&base=10&format=plain&rnd=new";
    var request = WebRequest.Create(string.Format(query, minValue, maxValue));
    var response = request.GetResponse();
    using (var sr = new StreamReader(response.GetResponseStream()))
    {
        var body = sr.ReadToEnd().Trim();
        return int.Parse(body);
    }
}
If you only want to get one item from an array, without using the Random class, you could use a modulo function with an unknown value, such as DateTime.Now.Ticks:
string[] items = new[] { "1", "2", "3", "4", "5" };

// Modulo items.Length returns a value from 0 to Length - 1.
int index = (int)(DateTime.Now.Ticks % items.Length);
Console.WriteLine(items[index]);
I would use DateTime.Now.Ticks and then take just enough digits: for example, for a Random(10) equivalent, only the last two digits. Or you can take the modulo of the ticks, as follows:
public static class MyMath
{
    private static int counter = 1;

    public static int Random(int max)
    {
        counter++; // vary the divisor so rapid successive calls differ
        long ticks = DateTime.Now.Ticks;
        int result = Math.Abs((int)(ticks / counter) % max);
        return result;
    }
}
See the following test:
[Test]
public void test()
{
    List<int> test = new List<int>();
    for (int i = 0; i < 10; i++)
    {
        test.Add(MyMath.Random(100));
    }
    Console.WriteLine("result:");
    foreach (int i in test)
    {
        Console.WriteLine(i);
    }
}
Here is an implementation of random number generators in C. You could try rewriting it in C#:
Random Numbers for C: End, at last?
http://www.cse.yorku.ca/~oz/marsaglia-rng.html
It seems to be of very high quality.
But writing this code in an interview might not be easy, though you could definitely explain the ideas behind it.
Just one idea: you could use one of the last digits of DateTime.Now.Ticks as a basis for choosing the index, or apply some hash function to the same value. Or use a web service that serves random numbers derived from physical measurements such as radiation. As for Math.Random: .NET's Random class is a deterministic pseudo-random generator (a variant of Knuth's subtractive algorithm), so indeed it's not truly random.
I think they were just seeing if you knew about LCG (Linear Congruential Generator) algorithms.
The maths behind them is somewhat tricky however, so I doubt they could expect you to be able to write one off the top of your head.
But failing that, couldn't you just cheat like this to generate a random index?
int index = (Guid.NewGuid().GetHashCode() & int.MaxValue) % array.Length; // mask off the sign bit: GetHashCode can return a negative value
First of all, it's known as pseudo-random rather than random, since a truly random series cannot be generated computationally.
Most pseudo-random number generators (PRNGs) have this form:
at time 0: R(0) = Random(Seed)
at time i: R(i) = Random(R(i-1))
Second, random doesn't mean you cannot know what the ith outcome will be; it means the series is statistically robust, and that it is very difficult to recover the formula or the seed from a chain of outcomes.
Hope this helps
I have a list of items which I would like to partition into subsets. For the sake of discussion, let's say they're files. I would like each subset to contain at most 5 files and the total size of the files in a subset to be less than 1 MB where possible. If a single file exceeds 1 MB, it should be in a subset by itself.
I wrote this up in a slightly more generic form, using a generic "item metric" instead of file size. But I suspect there's a simpler and/or better way to do this. Any suggestions?
Here's what I've got:
public static IEnumerable<IEnumerable<T>> InSetsOf<T>(this IEnumerable<T> source, int maxItemsPerSet, int maxMetricPerSet, Func<T, int> getMetric)
{
    int currentMetricSum = 0;
    List<T> currentSet = new List<T>();
    foreach (T listItem in source)
    {
        int itemMetric = getMetric(listItem);
        // Close the current set if it is full, or if adding this item would
        // exceed the metric budget. The Count > 0 check means an oversized
        // item still lands in a set by itself.
        if (currentSet.Count > 0 &&
            (currentSet.Count >= maxItemsPerSet || (currentMetricSum + itemMetric) > maxMetricPerSet))
        {
            yield return currentSet;
            //Start a new subset
            currentSet = new List<T>();
            currentMetricSum = 0;
        }
        currentSet.Add(listItem);
        currentMetricSum += itemMetric;
    }
    //Return the last set (guard against yielding an empty set for empty input)
    if (currentSet.Count > 0)
        yield return currentSet;
}
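For the file scenario described above, usage might look like this (the directory path and the 1 MB budget are illustrative):
// Partition a directory's files into sets of at most 5 files
// and at most ~1 MB total per set.
FileInfo[] files = new DirectoryInfo(@"C:\temp").GetFiles();

IEnumerable<IEnumerable<FileInfo>> sets =
    files.InSetsOf(5, 1000000, f => (int)f.Length); // f.Length is long; the cast assumes files < 2 GB

foreach (var set in sets)
    Console.WriteLine(string.Join(", ", set.Select(f => f.Name)));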
Bin packing is an NP-hard problem. The only way to get an optimal solution is to test all combinations. If there is a fixed number of distinct sizes, it can be done systematically using dynamic programming (there is an answer on SO with sample code for this case), but the running time of such an algorithm is terrible.
This means you should be looking for a heuristic that gets you close to the optimal solution in reasonable time. Your algorithm (first-fit; strictly speaking next-fit, since you never reopen a closed set) is a good starting point. With little effort it can be improved slightly by presorting the items by decreasing size, as sketched below. There are, however, several other more-or-less complex heuristics which improve both speed and results.
A Google search returned this as one of the results: Basic analysis of bin-packing heuristics (there is a paper which analyses the results). Apparently, a best-fit algorithm with a bin lookup table provides good results in a reasonable running time.
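That presorting suggestion is a one-line change at the call site (a sketch, reusing the illustrative file example from above):
// "Decreasing" variant of the heuristic: place large items first, then pack as before.
var sets = files
    .OrderByDescending(f => f.Length)
    .InSetsOf(5, 1000000, f => (int)f.Length);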
The 1MB test is missing, but otherwise your code looks OK to me. I do not think there is a significantly better way of doing it.