The nested means problem.

This month I have been introduced to the following curious problem:

Take the numbers from 1 to 12. Put them into groups. E.g. (1, 2), (8, 10), (3, 4, 5, 6, 7, 9, 11, 12). Average each group e.g. 1.5, 9, 7.125. Then average those averages: 5.875. How do you group the numbers to get the minimum?

In this blog post I will try my best to answer the following questions:

What is the solution to the original puzzle?
What about the range from 1 to $n$ ?

And elaborate on the following open question:

What if groups can be nested $m$ times, instead of just two times?

Diving in⌗

To solve this problem and gain some intuition about it, I will use Dyalog APL. I start by defining the average function and verifying the numbers given in the problem statement:

      a←+/÷≢
      a (a 1 2) (a 8 10) (a 3 4 5 6 7 9 11 12)
5.875

To reason about this problem, recall the Simpson Paradox, which states that an average of averages does not necessarily yield an average overall. The average of numbers from 1 to 12 is 6.5. Notice that:

      a (a 1) (a 2 3 4 5 6 7 8 9 10 11 12)
4

The average has decreased below the expected 6.5, but is it the best that we can do? Notice that it is trivially more efficient to group small numbers into separate mean groups:

      a (a 1 2) (a 3 4 5 6 7 8 9 10 11 12)
4.5
      a (a 1) (a 2) (a 3 4 5 6 7 8 9 10 11 12)
3.5

Knowing this, we can attempt at discovering the mean that taking $n$ first numbers to separate groups yields, for $n$ up to 11:

      {a(a¨⍳⍵),a⍵+⍳12-⍵}¨⍳11
4 3.5 3.5 3.7 4 4.357142857 4.75 5.166666667 5.6 6.045454545 6.5

Hence, the optimal grouping that solves problem 1 is as follows: (1) (2) (3 ... 12).

$n$ elements.⌗

Let’s start by modifying the model to actually compute the minimum element:

      ⊃⍋{a(a¨⍳⍵),a⍵+⍳12-⍵}¨⍳11
2

Mapping this function over ranges from 3 to 33 yields a curious pattern:

      {x←⍵⋄⊃⍋{a(a¨⍳⍵),a⍵+⍳x-⍵}¨⍳⍵-1}¨3+⍳30
1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5

The pattern can be explored further if we pick the largest possible partitioning and group the result:

      {x←⍵⋄{⊃⌽⍸⍵∊⌊/⍵}{a(a¨⍳⍵),a⍵+⍳x-⍵}¨⍳⍵-1}¨⍳70
0 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5
       5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7

This sequence is clearly derived from the integer square root function - a(n) = floor((-1 + sqrt(1+4*n))/2) - as evidenced below:

      ⌊.5×¯1+.5*⍨1+4×⍳70
0 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5
       5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7

Hence, the optimal grouping that solves problem 2 is the one that puts the first $a(n)$ numbers in separate groups, and then the remainder in the last group.

But why?⌗

The intuition is clearly sufficient to derive a solution with plenty of evidence to back it, but let’s try to prove it now.

We will consider the following function which models our averages ( $m$ is the amount of single groups, $n$ is the amount of numbers in total):

$\displaystyle b(n,m) = \frac{\displaystyle \sum_{k=1}^m k + \frac{\displaystyle \sum_{k=1}^{n-m} m+k}{m-n}}{m+1}$

The function can be trivially simplified as follows. Start by writing the sums in a closed form:

$\displaystyle b(n,m) = \frac{\displaystyle 0.5m(m+1) + \frac{0.5 (n^2 - m^2 + n - m)}{m-n}}{m+1}$

Simply fractions:

$\displaystyle b(n,m) = \frac{\displaystyle 0.5m(m+1) - 0.5 (m+n+1)}{m+1} = 0.5\frac{n(m+1)^2}{m+1}$

We want to find the value of $m \in [1, n)$ for which the function has the smallest value. Obviously:

$\frac{\partial b}{\partial m} = 0.5 - \frac{0.5 n}{(m+1)^2}$

It is sufficient that the partial derivative is zero for a fixed $n$ at some point $m > 0$ . The formula for critical points is rewritten as:

$n = m^2 + m + 1$

Which trivially yields $m = \sqrt{n} - 1$ . This result is admittedly different from the result obtained previously: a(n) = floor((-1 + sqrt(1+4*n))/2), but the discrepancy is easily explained by rounding - after all, $\sqrt{n} - 1$ does not have to be a whole number and our choice of a(n) was skewed towards the last index of the smallest value; notice that the following holds for $n>0$ .

$\left| \frac{1}{2}(\sqrt{4n+1}-1) - (\sqrt{n}-1) \right| < 1$

Elementary operations assuming $n>0$ yield:

$\sqrt{4n+1} < 2\sqrt{n}+1$

Notice that $\sqrt{4n+1} < \sqrt{n} + \sqrt{n+1}$ since, by squaring both sides, $4n+1<2n+1+2\sqrt{n}\sqrt{n+1}$ , hence $n<\sqrt{n}\sqrt{n+1}$ , which holds since $n=\sqrt{n}\sqrt{n}<\sqrt{n}\sqrt{n+1}$ . Substituting yields $\sqrt{n}+1 > \sqrt{n+1}$ , which is trivially true by natural induction.

Nested groups.⌗

Combining the observations from the previous sections, we can now tackle the problem of having $m$ nested groups. Following the previous schema, the optimal result is in the form $\text{avg}(a_1, a_2, …, \text{avg}(b_1, b_2, \dots, \text{avg}(\dots)))$ . The helper function used in this scenario is as follows:

      c←{a↑,∘a/(∊1↑⍨¨⍵)⊂⍳+/⍵}
      c 2 3 4 ⍝ avg(1,2,avg(3,4,5,avg(6,7,8,9)))
2.625

If $m$ were not fixed, the most optimal layout would trivially be 1 1 1 1....

The question is as follows:

Given $n$ (top of the range) and $m$ (amount of groups), determine the formula for the terms $c$ such that the average is minimised. For example, given $n=32$ and $m=3$ , c=2 4 25

A special case: 3-deep nesting.⌗

Trivially, the following function models c for 3-deep nesting (the proof can be formulated following the reasoning from previous sections):

$f(a,b,c) = \frac{a^2b+a^2+3ab+3a+b^2+3b+c+1}{2(a+1)(b+1)}$

Since $c=n-a-b$ to get a solution for the range up to $n$ , notice the following partial derivatives of $f(a,b,n-a-b)$ :

$\frac{\partial f}{\partial a} = \frac{-b^2+b+1+2a(b+1)+a^2(b+1)-n}{2(a+1)(b+1)^2}$

$\frac{\partial f}{\partial b} = \frac{b^2+2b+a+1-n}{2(a+1)(b+1)^2}$

Since the denominators are irrelevant in finding the critical points (poles only for improbable values of $a,b$ ), the second equation relates $b$ to $a$ and $n$ :

$b^2+2b+a+1-n = 0 \implies b=\sqrt{n-a}-1$

Substituting gives an equation that relates $a$ and $n$ :

$\sqrt{n-a}(a^2+2a+3)+a=2n+1$

This equation will always have a few solution but it is remarkably easy to pick the correct one: $b$ is at least $1$ , so $a$ must be smaller than $n$ by at least 4. Neither $a$ nor $b$ may be negative. The equation is impossible to solve for $a$ given $n$ through analytic methods and numerical methods are required.

Using Newton-Raphson, I arrive at $a \approx 2.082$ , which trivially yields $b \approx 4.4696$ and $c \approx 25.4484$ :

      c 2 4 25
3.466666667

Which is provably the optimal way to group the range from 1 to 32 using 3-deep ranges.