Euler Problem 14: Learn to Memoize

This is going to be fun. It’s a bit of LINQ, a bit of academic Computer Science, and a bit of meteorology.

Euler Problem 14 concerns a sequence referred to as hailstorm numbers.

Hailstorm number sequences are generated by applying one of two functions to a number in order to generate the next number. In this case, the sequence is:

n –> n / 2 (n is even)

n –> 3n + 1 (n is odd)

When n is even, the next number in the sequence is smaller. When n is odd, the next number in the sequence is larger

In all cases, it is believed that for every starting number, the sequence will oscillate for some time, and then eventually converge to some minimum number. (In this case, that minimum number is 1.)

These sequences are called hailstorm numbers because they (very simplistically) act like hail in a storm: oscillating up and down in a thunderhead before eventually falling to earth.

This particular problem asks you to find the sequence that has the longest chain, given input numbers less than 1 million.

The brute force method will take your computer a very long time to compute. The problem asks for the longest sequence, and you’ve got a million of them to compute.

Stack size is a related problem. You could write a method that computes the next value, and then finds the sequence size recursively:

1: private static long Generate(long n)

` 2: {`

3: if (n == 1) { return 1; }

` 4: var next = (n % 2 == 0) ? n / 2 : 3 * n + 1;`

5: return Generate(next) + 1;

` 6: }`

A LINQ query gives you the answer:

1: var answer = (from StartingValue in Enumerable.Range(1, 999999)

` 2: let SequenceSize = Generate(StartingValue)`

` 3: orderby SequenceSize descending`

4: select new { StartingValue, SequenceSize }).First();

This works, and does give the correct answer.

But I’m not really satisfied, because it takes more than 15 seconds (on my PDC laptop) to finish.

There are two steps to making this faster. The final version uses a technique called Memoization. Memoization enables you to avoid computing the same result more than once. Pure functions have a useful property in that the output depends on their inputs, and nothing else. Therefore, once you’ve computed the sequence length for, say 64, you should never need to compute it again. It’s always going to be the same.

Moemoization means to store the result for a computation and returned the stored result rather than do the work again. This can provide significant savings in a recursive algorithm like this. For example, memoization of the result for 64 (7) means saving the computations for 64, 32,16,8,4,2 and 1. Memoization of the results for longer sequences would mean correspondingly larger savings.

You could modify the Generate method to provide storage for previously computed results. Here’s how you would do that:

1: private static Dictionary<long, long> StoredResults = new Dictionary<long, long>();

` 2: `

3: private static long Generate(long n)

` 4: {`

5: if (StoredResults.ContainsKey(n))

6: return StoredResults[n];

7: if (n == 1)

` 8: { `

` 9: StoredResults[n] = 1;`

10: return 1;

` 11: }`

` 12: var next = (n % 2 == 0) ? n / 2 : 3 * n + 1;`

` 13: var answer = Generate(next) + 1;`

` 14: StoredResults[n] = answer;`

15: return answer;

` 16: }`

But, what’s the fun in that? There’s no reuse. It’s a one off solution.

I want to write a generic Memoize function that lets me memoize any function with one variable. Wes Dyer’s post explains this technique in detail. Memoize is a generic method that executes a function, abstracting away the types for both the parameter type and the result type:

1: public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)

` 2: {`

3: var previousResults = new Dictionary<T, TResult>();

` 4: `

5: // Important: This is a lamdba, not the result.

6: return (T arg) =>

` 7: {`

8: if (previousResults.ContainsKey(arg))

9: return previousResults[arg];

10: else

` 11: {`

` 12: TResult result = function(arg);`

` 13: previousResults.Add(arg, result);`

14: return result;

` 15: }`

` 16: };`

` 17: }`

The first question you may have is how the previous results actually works. It’s a local variable, not a static storage. How can it possible live beyond the scope of the method?

Welll, that’s the magic of a closure. Memoize doesn’t return a value, it returns some func that enables you to find the value later. That func contains the dictionary. I go into this technique in Items 33 and 40 in More Effective C#.

In order to use this, you need to move Generate() from a regular method to a lamdba expression (even if it is a multi-line lambda):

1: Func<long, long> GenerateSequenceSize = null;

2: GenerateSequenceSize = (long n) =>

` 3: {`

4: if (n == 1) { return 1; }

` 5: var next = (n % 2 == 0) ? n / 2 : 3 * n + 1;`

6: return GenerateSequenceSize(next) + 1;

` 7: };`

` 8: `

` 9: GenerateSequenceSize = GenerateSequenceSize.Memoize();`

Now, we have the generate function in a form we can memoize. The only trick here is that you have to set GenerateSequenceSize to null before you assign it in line 2. Otherwise, the compiler complains about using an unassigned value in line 6.

You can extend the Memoize function to methods with more than one input, but for now, I’ll leave that as an exercise for the reader.

Tags:
C#
C# General
Euler
Linq
Technology

Created: 2/4/2010 8:48:00 PM

I create content for .NET Core. My work appears in the .NET Core documentation site. I'm primarily responsible for the section that will help you learn C#.

All of these projects are Open Source (using the Creative Commons license for content, and the MIT license for code). If you would like to contribute, visit our GitHub Repository. Or, if you have questions, comments, or ideas for improvement, please create an issue for us.