Friday, November 13, 2015

Project Euler - Problem 6: Sum Square Difference

Project Euler - Problem 6: Sum Square Difference

The sum of the squares of the first ten natural numbers is, 1 2 + 2 2 + ... + 10 2 = 385 The square of the sum of the first ten natural numbers is, ( 1 + 2 + ... + 10 ) 2 = 55 2 = 3025 Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 - 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

Explanation:

This one is a two part problem. We have to add up all numbers to our limit and then square the result. And we have to add up the squares of all numbers up to our limit. Then subtract the two and report the result.

Solution:

We can do combine two of the steps so that we are only iterating over the series of numbers once. We will have to create two variables to hold the value of the sum at that point in time - since we do not know how large this number will be (in a real world situation) I have made them long which should increase their data capacity size. After the loop, we square the proper variable and display the subtraction to the screen.
private static void Problem6()
{
    int limit = 100;
    long sum = 0;
    long sumSquared = 0;

    for(int i = 1; i <= limit; i++)
    {
        sum += i;
        sumSquared += (i * i);
    }
    sum *= sum;

    Console.WriteLine(sum - sumSquared);
}
An online sample of which can be found here.

With this I get an average tick count in the middle 400000s on the sample above.

Math (Optimized) Solution:

The previous solution is a rather elementary solution, and it works, but there are some changes we can make to make it more efficient. There was a famous German mathematician named Johann Gauss who solved the sum of all numbers. In elementary school, Gauss' class was tasked to find the sum of the numbers from 1 to 100. The task was assigned to keep them busy, as the teacher did not expect anyone to finish quickly. Gauss however found the answer rather quickly by discovering he could split the set of numbers into two (divide and conquer). Each number in the set would correspond to a number in the other set to generate the same sum. Let's look at it this way:

1 + 2 + ... + 50
100 + 99 + ... + 51
Now when we add them in this way we get all of the sums to equal 101.

1 + 100 = 101
2 + 99 = 101
So now the trick is to multiply that sum by the number of sums that add to that sum (in Gauss' case it was 50) because you have that many sums of the number (because you divided them into two sets).

50 * 101 = 5050
Our formula for this then would be:

F(n) = n ( n+1 ) 2
This optimizes only half of our algorithm however - we need to optimize the sum of all squares up to the limit. Look at the first few squares and see if we can come up with a pattern.

1 2 + 2 2 + 3 2 + ... + n 2
1 + 4 + 9 + ... + n 2
This is most likely a polynomial with an exponent to the third power somewhere in it because we are dealing with squares. We can find the generating function by solving for a, b, and c in the following equation.

a n 3 + b n 2 + c n
We can do this by solving for the various values of n = namely when n is 1, 2, and 3. We need the equation to be solved for 1, 5, and 14 respectively - so let's start substituting.
First start with n = 1 and solve for a.

a ( 1 3 ) + b ( 1 2 ) + c = 1
Which should get something like this:

a = 1 - b - c Next do n = 2 and solve for b.

a ( 2 3 ) + b ( 2 2 ) + 2 c = 5 8 ( 1 - b - c ) + 4 b + 2 c = 5 8 - 8 b - 8 c + 4 b + 2 c = 5 4 b = 3 - 6 c

And should get something like this. b = 3 - 6 c 4 Then do n = 3 and solve for c.

a ( 3 3 ) + b ( 3 2 ) + 3 c = 14 27 ( 1 - b - c ) + 9 b + 3 c = 14 27 - 27 b - 27 c + 9 b + 3 c = 14 13 - 18 b = 24 c 13 - 18 ( 3 - 6 c 4 ) = 24 c 13 - ( 54 - 108 c 4 ) = 24 c 52 - 54 + 108 c = 96 c 12 c = 2 c = 1 6
Now substitute c back into the other equations. b = 3 - 6 ( 1 6 ) 4 b = 3 - 1 4 b = 2 4 = 1 2
And finally substitute back in to solve for a. a = 1 - 1 2 - 1 6 a = 1 2 - 1 6 = 3 6 - 1 6 = 2 6 = 1 3
After all that we can plug in the values for a, b, and c and test if it holds for all numbers.

F(n) = 1 3 n 3 + 1 2 n 2 + 1 6 n F(n) = 2 6 n 3 + 3 6 n 2 + 1 6 n F(n) = 2 n 3 + 3 n 2 + n 6 F(n) = n ( 2 n 2 + 3 n + 1 ) 6 F(n) = n ( n + 1 ) ( 2 n + 1 ) 6
This can be proven to work for all cases by using induction. The proof of which can be found here.

And we can now update our code!
private static void Problem6(int limit)
{
    long sum = limit*(limit+1)/2;
    sum *= sum;
    long sumSquared = (limit*(limit+1)*(2*limit+1))/6;

    Console.WriteLine(sum - sumSquared);
}
An online sample of which can be found here.

This solution has a somewhat noticeable impact. I got tick times in the low 400000s consistently.

Summary:

There are many ways to solve a problem, and sometimes the ease of the solution can be the deciding factor over performance. In this one we were able to show an actual performance increase by finding the generator functions for the values instead of looping and manually calculating the numbers. This method is not always as straight-forward however. And the performance gain may be negligible.

No comments:

Post a Comment