Two New Deep Conjectures in Probabilistic Number Theory

Two New Deep Conjectures in Probabilistic Number Theory


The material discussed here is also of interest to machine learning, AI, big data, and data science practitioners, as much of the work is based on heavy data processing, algorithms, efficient coding, testing, and experimentation. Also, it’s not just two new conjectures, but paths and suggestions to solve these problems. The last section contains a few new, original exercises, some with solutions, and may be useful to students, researchers, and instructors offering math and statistics classes at the college level: they range from easy to very difficult. Some great probability theorems are also discussed, in layman’s terms: see section 2.1. 
The two deep conjectures highlighted in this article (conjectures B and C) are related to the digit distribution of well known math constants such as Pi or log 2, with an emphasis on binary digits of SQRT(2). This is an old problem, one of the most famous ones in mathematics, still unsolved today.
1, A Strange Recursive Formula
Let us consider the following recurrence system:
  If z(n) < 2y(n) Then

y(n + 1) = 4y(n) – 2z(n)
z(n + 1) = 2z(n) + 3

  Else

y(n + 1) = 4y(n)
z(n + 1) = 2z(n) -1 

Now, let us consider the following limit:

The limit may or may not exist, and it may or may not depend on the initial conditions. Let us assume that the initial values y(1) and z(1) are strictly positive integers. If for some value of n, either y(n) = 2z(n) or 2y(n) = z(n), we call it the non-standard case. Otherwise, we call it the standard case. The first result below is about the limit mentioned earlier. However this is not one of the two deep conjectures featured in this article, but rather an interesting intermediary result. 
Conjecture A
In the standard case, the above limit is always equal to 1. In the non-standard case, it is always equal to 3.
Now let’s the fun begin. Let d(n) = (z(n) − 2z(n−1) + 1)  / 4. The sequence d(n) represents the binary digits of some unknown number x, a number that depends on the initial conditions. In the standard case, x is irrational while in the non-standard case, x is a rational number. It turns out that if y(1) = 2, and z(1) = 5 then that number is x = SQRT(2). You may ask: so what? Here is where it becomes fascinating: Proving Conjecture A for y(1) = 2 and z(1) = 5 (this is the standard case) amounts to proving that exactly 50% of the binary digits of SQRT(2) are equal to one.
Generally speaking, the above limit is equal to 4p − 1 where p is the proportion of binary digits equal to one for the number x in question: this fact is easy to prove, almost trivial, as the formula in the limit was built with that goal in mind.
An article about this recursion for SQRT(2), with the connection to its digit distribution, can be found here. An application to the gaming industry can be found in section 2.1 in the following article. The source code that I used in my computations is accessible here (Perl code using the Bignum library for exact arithmetic.)
The convergence to 1, for the standard case and in particular for x = SQRT(2), is backed by empirical evidence. This also works even if y(1) or z(1) are not integers. Indeed, non-integer initial values could be investigated to get deeper insights and try to prove conjecture A. For instance, it might be worth investigating what happens when y(1) tends to 2, and z(1) tends to 5. Below is a chart slowing the rather slow convergence for the limit, of the order 1 / SQRT(n) — the same speed (not a coincidence!) as for the central limit theorem. 

Figure 1: Convergence of the limit to 1, for x = SQRT(2), based on the first n = 10,000 iterations 
1.1. A deeper result
The slow convergence, mimicking the central limit theorem based on empirical evidence, led me to issue the following deeper result. 
Let 
Then E(n) = SQRT(n) * (L(n) – 1) = O(1). In other words, E(n) is bounded as n tends to infinity. The first deep result below (Conjecture B) is equivalent (actually a little stronger) to the statement just made if x = SQRT(2):
Conjecture B
Let q(n) denotes the proportion of binary digits of SQRT(2), or any other normal number (also called good seed, for instance Pi or log 2) that are equal to one, among their first n digits. Then we have:

This is a much stronger statement than the best result proved so far. The best formally proved result states that among the first n binary digits of SQRT(2), at least SQRT(2n) of them are equal to one as n tends to infinity (reference: here) but everyone “knows” that this proportion actually tends to n / 2.
My conjecture is confirmed by empirical evidence, pictured in the chart below.

Figure 2: SQRT(n) * |q(n) – 1/2| for n up to 1,000,000;
q(n) is the proportion of binary digits of SQRT(2) equal to one, among its first n digits
The time series in Figure 2 looks like a Brownian motion at first glance, but it is not. The variance of a Brownian motion is strictly increasing, and becomes infinite over time, but here we are dealing with a bounded Brownian motion.
1.2. Connection to the Berry-Esseen theorem
The Berry-Esseen theorem provides a second-order approximation to the central limit theorem (itself a first order approximation.) Higher order approximations are available, see here. If the random variables are Bernoulli, better approximations are available, this is a topic of active research. Let me mention a recent result, which is the topic of a PhD thesis, which is where Conjecture B is derived from  (the full thesis available here):
If the binary digits of normal numbers were distributed as i.i.d Bernouilli variables of parameter 1/2, as it very, very, very much seems to be the case, then Conjecture B would be true as a result of the above theorem. As of today, it is still one of the biggest mysteries of mother nature. Millions of smart mathematicians have worked on this problem without success for hundreds of years. 
1.3. Potential path to solving this problem
On a positive note, there are a few promising approaches. In this section, I describe a possible path to a solution.  The recurrence system described at the very beginning of this article, yielding the digits d(n) of some unknown real number x, with one of them being x = SQRT(2), has the following nice features:

It is ergodic as far as the sequence d(n) is concerned
It does not depend on the initial values y(1) and z(1), for its limiting distribution
It does not involve SQRT(2) at all, as far as asymptotics are concerned
It only has two statistical solutions (called attractors in chaos theory): the standard and non-standard cases, and SQRT(2) belongs to the standard case

In short, proving for the standard case that being a normal number is the only solution (the attactor) to the underlying discrete stochastic equation attached to the recurrence system, may be the way to go. This is equivalent to proving that the unique attractor is the uniform distribution, for the digit sequence d(n). 
To the contrary, the classical approach relies on the recursion x(n + 1) = bx(n) – INT(bx(n)) where b is the base (here b = 2) and x(1) = x. Then d(n) = INT(bx(n)). It yields the stochastic integral equation P(Y  <  y) = P( bY – INT(bY)  <  y) with infinitely many solutions (statistical distributions.) Most of these solutions, for non-normal numbers, are pretty wild: see here and here. In addition, in the classical approach, the solution depends on the initial value: the number x. 
Instead, using the discrete recurrence system featured in the introduction, eliminates these problems, at least for the number x = SQRT(2) in base b = 2.
Note: The recursion mentioned here is identical to the one featured in section 2.1 in this article after the change of variable z(n) = 4x(n) + 1. An additional change of variables, w(n) = z(n) − 2y(n), could prove useful.
2. Potential Solution Based on Special Rational Number Sequences
Here we follow a totally different approach to study the binary digits of SQRT(2). We start with the following sequence:

The brackets represent the fractional part function. It is easy to prove that x(n) is a rational number, and that its period in its base 2 expansion is equal to 2 * 3^(n -1). What’s more, the proportion of zeros and ones in the binary expansion of any x(n) is exactly 50/50 (see here). The median value M(n) computed on x(1), …, x(n) is one of these numbers, if n is odd: it is the middle value once x(1), …, x(n) have been sorted in numerical order. Thus the proportion of zeros and ones, in M(n), is always 50/50 if n is odd, regardless of n. At the limit as n tends to infinity, M(n) tends to SQRT(2)/2. This is due to the fact that x(n) has the distribution of 1 / 2^X where X is uniform on [0, 1], because {n log 3 / log 2} is equidistributed mod 1, itself a consequence of the fact that log 3 / log 2 is irrational.  
It is tempting to conclude that the binary digits of SQRT(2) are thus 50/50 zeros and ones, and that this ratio is exact, not an approximation. It is too good to be true though, the situation is actually more complicated than that. If you look at the minimum or the maximum rather than the median, it does not work. Even if looking at the median, but with different x(n)’s, say x(n)’s that are perfect random numbers uniformly distributed on [0, 1], while M(n) always has 50/50 zeros and ones, at the limit as n tends to infinity, it is no longer true: M(n) tends to 1/2, the worst non-normal number!
The idea to make this work is to remove infinitely many x(n)’s, in order words, to consider a subsequence x(t(n)) with t(n) being increasingly large integers. If the thinning process used to carefully remove selected x(n)’s is in some sense uniform, then the distributions of x(t(n)) ,and x(n) will be identical and thus the limiting medians, will be unchanged. The new x(n) can be defined as follows:

Here, t(n) is a strictly increasing sequence of integers. If t(n) is a polynomial in n, with the leading coefficient equal to one, then {t(n) log 3 / log 2} is still equidistributed modulo 1 (see why here), and thus the distribution of the new x(n)’s and its limiting median is unchanged. If any time you remove an original x(n) smaller than SQRT(2), your remove one that is larger than SQRT(2)/2, and conversely, then the limiting median is also unchanged, equal to SQRT(2)/2. Or if t(n) is the arrival time of the n-th minimum record of |x(k) – SQRT(2)/2| computed on the first original x(k), k = 1, …, n, then the limit of the new sequence x'(n) = x(t(n)), as n tends to infinity, will be SQRT(2)/2: this is discussed in details here. This latter approach is mathematically elegant: it involves the convergents of the continued fraction of log 3 / log 2, and has some connections with music theory.
I haven’t found so far a sequence t(n) that leads to a fool-proof solution. One approach that is promising, is discussed in section 2.1. It also leads to our second deep conjecture, Conjecture C. 
2.1. Interesting statistical result
Let N(m, x) be the number of binary digits equal to 1, among the first m binary digits of x, and assume x is in [0, 1]. Using the sequence x(n) defined at the beginning of section 2, we say that the term x(n) is a bad number if and only if:

for  at least one value of m between m = 1 and m = n. Also we say that x is an irregular number if the same condition is satisfied for at least one value of m, however large or small. 
Now let’s remove all the bad numbers from the sequence x(n). This will remove a very tiny proportion of the x(n)’s, less than one out of 2^15 terms, but a strictly positive proportion nevertheless. Any term x(n) starting with 20+ successive digits equal to one, in its binary expansion, will be removed. Will the limiting median SQRT(2)/2 be preserved if we do this? Very likely, but no one knows. Among the standard mathematical constants, none are expected to be irregular numbers if Conjecture B is correct. This leads us to the following, even deeper conjecture:
Conjecture C
The number SQRT(2) is not an irregular number.  Indeed, none of the top 20 most popular mathematical constants (Pi, e, log 2 and so on) is an irregular number.

Of course after removing the bad terms in the sequence x(n), the limiting median will have 50% zeros and 50% ones in its binary expansion, by construction, whatever that new median is. But the median might have been shifted by this process. For instance, the minimum won’t tend to 1/2 anymore, the maximum won’t tend to 1 anymore. But what about the median? Yet Conjecture C gives very strong bounds about the binary digit distribution for SQRT(2), and knowing these bounds might help solving the problem.  
2.2. Another curious statistical result
Let’s get back to the original sequence x(n) defined at the beginning of section 2. Let’s look at the m-th binary digit of x(n). It’s average value computed over all x(n), n = 1, 2, and so on is denoted as m(m). We have:

Also,

The number 1 / (2 log 2) is the average value of x(n), computer over all the terms of the sequence x(n). Finally m(m) converges exponentially fast to 1/2 as m tends to infinity, and the sequence m(m) is strictly increasing. As a side note, the number v(m) consisting of the successive m-th binary digit of x(n) for n = 1, 2, and so on, is irrational: its proportion of binary digits equal to one is m(m), and since m(m) is irrational, that number v(m) must also be irrational, and of course, non-normal regardless of m. For more details, see here.
3. Exercises
The material presented in sections 1 and 2 represents only a tiny part of my research on this topic. The exercises below explore some of the interesting issues not covered in the main part of this article.
[1] This a starter, easy to prove, even though the result sounds intriguing. Let d(n) be the n-th binary digit of a number x in [0, 1]. Prove that

[2] More difficult. Define R(n) as 

where h(n) is an increasing sequence of positive integers, with h(n)/n tends to 1 as n tends to infinity. Prove that R(n) tends to log 2 as n tends to infinity. If h(n) = n, then R(n) = H(2n) – H(n) where H(n) is the n-th harmonic number. These fractions have been well studied, and could potentially lead to a proof that log 2 is a normal number, especially if h(n) is carefully chosen. 
[3] Using the equidistribution theorem, prove that the proportion of binary digits equal to one, for an irrational number x in [0, 1], is equal to 

The change of variable z = 2^y can help. 
[4] More difficult. Prove that if if x(n) = { nq } and y(n) = { nq * 12 / 35 }, then the correlation between the two sequences is 1 / (12 * 35). The brackets represent the fractional part function. Replace 12 and 35 by any pair of integers that do not share a common divisor, and as long as q is irrational, this result easily generalizes. For a proof, see here.
[5] Let us consider the 2^n possible sequences of n binary digits. We define a run of length k as being a sub-sequence of k identical digits. How many of these 2^n  n-digits “words” have a maximum run of length k? Start with k = 2. The answer, for k = 2, is F(n), the n-th Fibonacci number. See here the solution to this problem. More generally, the longest run is of order log n.  See here for details, this is a pretty deep result. See also this article. 
[6] At the beginning of section 2, we defined a specific sequence x(n) that has this property: all x(n)’s are rational numbers with (in binary representation) a period equal to 2 * 3^(n-1). The denominator, for these rational numbers is always 3^n. If we want rational numbers with a bigger period, indeed with the maximum possible period, we should look at denominators that are reptend primes: see here  and here. See also my question posted on StackExchange, here. Would it be easier to work with these special primes, rather than with my original sequence x(n)?
[7] It is possible to get an incredibly good approximation for the sum of the first n terms {k log 3 / log 2}, k = 1, 2, …, n. As usual, the brackets represent the fractional part function. More specifically:

Prove this result. See solution here. 
[8] Consider the following recursion:  x(1) = 1, x(n+1) = ax(n) + bn, y(n) = −1 + x(n)/a^n. This recursion seems quite intractable, but it has some interesting features and it is much more friendly than it looks at first glance. Here a, b are two positive parameters, with a > 1. Let g(a, b) be the limit of y(n) as n tends to infinity. Prove that if a and b are rational numbers, then g(a, b) is also a rational number. See solution here. Unfortunately, it means that this recursion is of no use to prove the normalcy of standard mathematical constants, contrarily to what I initially thought.
[9] Difficult. Let x(n) = { b^n x} with the brackets representing the fractional part function. Does the auto-correlation structure of this sequence characterizes a good seed: if correl(x(n), x(n+k)) = 1 / b^k, does it mean that x is a good seed (that is, a normal number) in base b? The definition of this correlation coefficient is posted here: see section “definition of correlation”. There is no solution to this day/ 

Link: Two New Deep Conjectures in Probabilistic Number Theory