More intuitive explanation for the harmonic series's sum

Hey everyone! I want to talk about something that’s been bothering me for a while.

It’s fairly well known that the sum of \frac{n}{i}, 1 \leq i \leq n for any n (that is, \sum\limits_{i=1}^n{\frac{n}{i}}) is O(n\log{n}). This is useful for, for example, finding all divisors of numbers from 1 to n in O(n\log{n}) time. Most editorials just throw this fact at us or say that it can be done with calculus. That’s cool, but what about the majority of people that haven’t learned calculus? I have an alternate explanation that should be easier to understand.

Let’s write \sum\limits_{i=1}^n{\frac{n}{i}} as n \cdot \sum\limits_{i=1}^n{\frac{1}{i}} (note: \sum\limits_{i=1}^n{\frac{1}{i}} is the harmonic series), and instead show that \sum\limits_{i=1}^n{\frac{1}{i}} = O(\log{n}). Then the series is:

\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \frac{1}{9} + \frac{1}{10} + \dots

Now I will replace this series with another, strictly greater series:

\frac{1}{1} + \frac{1}{2} + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \dots

where each denominator is “rounded” down to the next smallest power of 2. Because each term of the new series is not less than the corresponding term of the old series, the sum of the new series is an upper bound to the sum of the old series. So now let’s find the sum of the new series by grouping all similar elements together:

(\frac{1}{1}) + (\frac{1}{2} + \frac{1}{2}) + (\frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4}) + (\frac{1}{8} + \dots + \frac{1}{8}) + \dots

Each group of the series with elements of \frac{1}{x} has at most x elements, so each group’s sum is upper-bounded by 1. How many groups are there? Short answer: O(\log{n}).

Long answer

Let’s also round n up to one less than its nearest power of 2, and write it as 2k - 1. Then, the sum of the sizes of the groups are 1 + 2 + 4 + 8 + \dots + \frac{k}{2} + k = 2k - 1 (See why by writing the partial sums out by hand, they’re always of the form 2^p - 1. Or you could think of the numbers in binary, the sum will always be in the form 111\dots111, which is a power of 2 (1000\dots000) minus 1). This is essentially asking "how many times can you divide k by 2 until you get 1", which is the definition of \log_2{k} = O(\log{n}).

Since each group’s sum \leq 1 and there are O(\log{n}) groups, the total sum is also O(\log{n}), and we are done. A similar explanation is on Wikipedia, but it’s rather only used to show that for n = \infty, the sum is also \infty, and doesn’t give any actual information about the sum.

I’ve noticed a general lack of tutorials on this forum (at least, as compared to the “help me” posts). Do you guys want more explanations of things? I’m probably too lazy to make stuff like videos, but I’m fine with continuing this (possibly with images included).

By the way, as a bonus:

All divisors of 1 to n
vector<int> divisors[n + 1];
for (int i = 1; i <= n; i++) {
  for (int j = i; j <= n; j += i) {
    divisors[j].push_back(i);
  }
}

Can you see, given the explanation above, why this is O(n\log{n})?

Additional note by @everule1: if you round each term’s denominator up to the next greatest power of 2, as such:

\frac{1}{1} + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{16} + \frac{1}{16} + \dots

you can also lower-bound the series’s sum as O(\log{n}) in the exact same way.

71 Likes

That’s a clever explanation. I never myself reasoned enough over that expression before to convince myself. Now it’s clear :slight_smile:

Yaa it would be great if you could explain more topics in the same. Including images would be even better.

I don’t think any images would suit this specific topic, but yeah they’d work for other ones.

2 Likes

This also leads to a similar proof for why it’s at least O(N\log{N}). I think it’s at least worth mentioning.

2 Likes

Cool! I didn’t even notice that, I’ll add it

2 Likes

That’s really helpfull, and surprisingly difficult to find on the internet otherwise!

it’s very obvious . Also it is sandwiching it is very useful in limits and approximation . Actually u will get theta(n*log n)