A Simple Introduction to Complex Stochastic Processes – Part 2


In my first article on this topic (see here) I introduced some of the complex stochastic processes used by Wall Street data scientists, using a simple approach that can be understood by people with no statistics background other than a first course such as stats 101. I defined and illustrated the continuous Brownian motion (the mother of all these stochastic processes) using approximations by discrete random walks, simply re-scaling the X-axis and the Y-axis appropriately, and making time increments (the X-axis) smaller and smaller, so that the limiting process is a time-continuous one. This was done without using any complicated mathematics such as measure theory or filtrations.
Here I am going one step further, introducing the integral and derivative of such processes using rudimentary mathematics. All the articles that I’ve found on this subject are full of complicated equations and formulas; that is not the case here. Not only do I explain this material in simple English, but I also provide pictures showing what an integrated Brownian motion looks like (I could not find such illustrations in the literature) and how to compute its variance, with a focus on applications, especially to number theory, Fintech and cryptography problems. Along the way, I discuss moving averages in a theoretical but basic framework (again with pictures), including what the optimal window should be for these (time-continuous or discrete) time series.
1. General Framework
As in my previous article, we define a time-continuous process as the limit of a time-discrete process. The time-discrete process is referred to as the base process. The fundamental example is as follows:
Start with discrete random variables U(k), k = 1, 2, and so on (the base process) that are independent and identically distributed, with mean equal to 0. Typically, the time series { U(k) } is a white noise.

Define X(n) = U(1) + … + U(n)
Standardize X(n) so that its variance does not depend on n, that is, introduce Y(n) = X(n) / SQRT(n); then Var[Y(n)] = Var[U(1)] for all n. This step consists of re-scaling the Y-axis.
Re-scale the time axis (the X-axis) so that time increments are now equal to 1 / n rather than 1, and let n tend to infinity. The limiting variable for Y(n), as n tends to infinity, is denoted as Z(1). The value at time t (t being continuous this time) is the limiting value of Y(INT(nt)) as n tends to infinity, where INT is the integer part function; it is denoted as Z(t).

The collection of random variables { Z(t) } defines the resulting, time-continuous, properly re-scaled stochastic process. In this case, Var[Z(t)] = t Var[U(1)]. Also Z(t) has a Gaussian distribution, by construction and by virtue of the central limit theorem. This process is known as a Brownian motion. The initial random variables U(k) could be Gaussian, or uniform on {-1, +1}, or uniform on [-1, +1]. It does not matter. See illustration here. 
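To make the construction concrete, here is a short simulation sketch in Python (using NumPy), with U(k) uniform on {-1, +1} so that Var[U(1)] = 1; the function name and sample sizes are mine, chosen for illustration. It checks that Var[Z(t)] is close to t Var[U(1)] at t = 1/2.

```python
import numpy as np

def brownian_path(n, rng):
    """Approximate a Brownian motion on [0, 1] by a re-scaled random walk."""
    u = rng.choice([-1.0, 1.0], size=n)   # base process U(k): mean 0, variance 1
    x = np.cumsum(u)                      # X(k) = U(1) + ... + U(k)
    return x / np.sqrt(n)                 # Y-axis re-scaling: Y(k) = X(k) / SQRT(n)

# Check Var[Z(t)] = t * Var[U(1)] at t = 1/2, over many simulated paths.
rng = np.random.default_rng(0)
n = 10_000
samples = [brownian_path(n, rng)[n // 2 - 1] for _ in range(1_000)]
print(np.var(samples))   # close to 0.5
```

With other choices of U(k) (Gaussian, uniform on [-1, +1] after dividing by its standard deviation), the same check gives the same limit, illustrating the universality mentioned above.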
This process is continuous everywhere but nowhere differentiable. Hence the idea of building processes derived from { Z(t) } that are smoother (differentiable everywhere). We introduce two such types of processes that meet this goal:

The cumulative or integrated process { S(t) } derived from { Z(t) }
The theoretical moving average process { M(t) } derived from { Z(t) }

Finally, we also define differentiation as the inverse operation of integration. The differentiated process of S(t) is Z(t). In practice, the smoother (integrated or moving average) process is easier to study, and sometimes displays patterns that can’t be identified in the original process. More on this in the last section.
2. Integrated, Moving Average and Differential Process
Here we define three operations that transform a stochastic process into another one, hopefully more useful for interpretation and decision making than the original process: integration, differentiation, and moving average. In all three cases, the construction follows the same principle: in the construction of the Brownian motion described in the previous section, replace X(n) = U(1) + … + U(n) by X(n) = V(1) + … + V(n), where V(k) is described below for each transformation. We also discuss some of the challenges involved in making this methodology mathematically robust.

Integrated Process, Construction: V(k) = U(1) + … + U(k).
Differential Process, Construction: V(k) = U(k+1) – U(k). If { Z(t) } is the Brownian motion described in the introduction, then the resulting process is a white noise: nowhere continuous and nowhere differentiable.
Moving Average Process, Construction: V(k) = U(1) + … + U(h(k)) where h(k) is as small as possible to make the resulting process continuous and differentiable everywhere.  For Brownian motions,  h(k) = SQRT(k) works. Does h(k) = log(k) work? This would make the resulting process far more similar to the original one, but maybe barely (if at all) continuous — in other words, more chaotic than with h(k) = SQRT(k).
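The three constructions above can be sketched in a few lines of Python (NumPy); variable names are illustrative, and h(k) = SQRT(k) is truncated to an integer so it can serve as an index into the base process.

```python
import numpy as np

# Build the three V(k) sequences from one base process U(k).
rng = np.random.default_rng(0)
n = 5_000
u = rng.choice([-1.0, 1.0], size=n)            # base process, Var[U(1)] = 1

# Integrated process: V(k) = U(1) + ... + U(k)
v_integrated = np.cumsum(u)

# Differential process: V(k) = U(k+1) - U(k)
v_differential = np.diff(u)

# Moving average process with window h(k) = SQRT(k): V(k) = U(1) + ... + U(h(k))
h = np.sqrt(np.arange(1, n + 1)).astype(int)   # h(k) = INT(SQRT(k)), always >= 1
v_moving_avg = np.cumsum(u)[h - 1]

# Each V(k) then feeds the same machinery: X(n) = V(1) + ... + V(n), re-scaled.
x_integrated = np.cumsum(v_integrated)
```

Replacing SQRT with log in the definition of h would only require changing one line, making it easy to experiment with the open question raised above.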

Challenges
The general construction procedure described above needs to be further studied from a theoretical point of view, for the following reasons:

Do the derived processes depend on the particular choice of U(1), U(2), and so on? They should not. This issue is especially critical for the differentiated process.
Is the differentiation operation truly the reverse of integration?
Are my definitions of differential and integrated processes compatible with or equivalent to the highly technical definitions found in the literature?

Proper Re-scaling and Variance Computation
The second step in the general framework (see first section) needs to be adjusted to get things right when constructing differential, integrated, or moving average processes. In short, you must choose the re-scaling so that the variance of Y(n) does not depend on n as n tends to infinity. Let us show how it works for the integrated Brownian motion. In this case, for the integrated process, we have:

X(n) = V(1) + … + V(n) = n U(1) + (n – 1) U(2) + … + U(n).

Thus,

Var[X(n)] = (1^2 + 2^2 + … + n^2) Var[U(1)] = n(n + 1)(2n + 1) Var[U(1)] / 6, which is asymptotically equal to (n^3 / 3) Var[U(1)].

So, the proper re-scaling factor for the Y-axis, as n tends to infinity, is in this case Y(n) = X(n) / SQRT(n^3 / 3). Using similar arguments, one can easily prove that

Var[S(t)] = t^3 Var[U(1)].
Also, S(t) has a Gaussian distribution for the same reasons as described in the first section. The same logic applies to compute Var[M(t)]; the details are left as an exercise. A more complicated exercise consists of computing the covariance between S(t) and S(t – s) for s > 0, and proving that { S(t) } is NOT a Brownian motion itself (it is differentiable everywhere, unlike Brownian motions).
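The re-scaling can be verified numerically. The Monte Carlo sketch below (Python/NumPy, with arbitrary sample sizes) simulates the integrated process and checks that after dividing X(n) by SQRT(n^3 / 3), the variance no longer depends on n and is close to Var[U(1)] = 1.

```python
import numpy as np

# With V(k) = U(1) + ... + U(k) and X(n) = V(1) + ... + V(n), the coefficient
# of U(j) in X(n) is (n - j + 1), so Var[X(n)] = (1^2 + ... + n^2) Var[U(1)].
# Hence Y(n) = X(n) / SQRT(n^3 / 3) should have variance close to Var[U(1)] = 1.
rng = np.random.default_rng(42)

def rescaled_endpoint(n):
    u = rng.choice([-1.0, 1.0], size=n)   # base process, Var[U(1)] = 1
    x = np.cumsum(np.cumsum(u))           # integrated process: X(1), ..., X(n)
    return x[-1] / np.sqrt(n**3 / 3)      # Y(n)

samples = [rescaled_endpoint(1_000) for _ in range(3_000)]
print(np.var(samples))   # close to 1
```

The small residual gap from 1 comes from the (n + 1)(2n + 1) / (2n^2) correction factor, which tends to 1 as n grows.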

Figure 1: Brownian motion realization (blue) and its moving average (red)
In Figures 1 and 2, the X-axis represents the time axis, between 0 and 1. The Brownian motion is continuous everywhere but differentiable nowhere. By contrast, the moving average and integrated processes are both continuous and differentiable everywhere.

Figure 2:  Brownian motion realization (blue) and its integrated process (green)
3. Application to Number Theory
The purpose here is to study pairs of large prime numbers p and q, with p < q, used to design cryptographic keys m = pq. Some of these primes exhibit strong patterns that make them unsuitable for use in highly secure cryptographic systems, as they make it less difficult to factor m. Factoring a product of two large primes, each with hundreds of digits, is usually an intractable problem if p and q are carefully chosen. In other words, we are looking for primes p and q that are less “random” (or, to put it differently, less strongly prime) than your typical large prime numbers. Factoring m allows you to break the key in cryptographic systems, which is precisely what a careful choice of p and q is meant to prevent.
We discuss here one example of such a bad pair, namely p = 1879 and q = 3803. We use the same construction technique as outlined in the first section, resulting in a process that looks exactly like a Brownian motion up to some value of t, and then suddenly exhibits two big jolts; in this particular example, both occur very early on the time scale.
The base process { U(k) } (see first section) is defined by
The resulting process { Z(t) } created using the technique described in the first section, is a Brownian motion up to the first jolt occurring at k = SQRT(m); the second jolt occurs at k = q.  Note that for small k’s, { U(k) } is reminiscent of number representations in various systems, as described in this article.  In other examples, the interesting jolt occurs at k = p (the smaller prime.) In many other cases, no jolts are found.

Figure 3: Brownian motion up to first jolt; second jolt used to break cryptographic key
Figure 3 is based on the first 25,000 observations. The base process { U(k) } does not display these jolts; they are only visible in the process { Z(t) } pictured in Figure 3. Detecting these jolts (in particular the second one) is of special interest, since it allows you to factor m and retrieve the two large prime factors p and q. To do this, one solution consists of using sampled values of U(k) together with interpolation techniques. Note that as k becomes large (higher than 12,000), the U(k)’s become so heavily auto-correlated that the process { Z(t) } is no longer a Brownian motion. In some ways, it could be used as a better model than standard Brownian motions to simulate stock market prices.
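Since the exact definition of the base process U(k) is not reproduced above, the sketch below is purely illustrative: it mimics the phenomenon with a {-1, +1} random walk in which a run of identical values is injected at a hypothetical location (playing the role of q). No single U(k) stands out, yet the run produces a visible jolt in { Z(t) }, which a simple largest-climb scan recovers.

```python
import numpy as np

# Illustrative stand-in for the article's base process: a +/-1 walk with an
# injected run of +1's. Each U(k) is still +/-1, so the jolt is invisible in
# any single observation of the base process, but shows up as a sharp climb
# in the cumulative process Z(t).
rng = np.random.default_rng(1)
n, w = 25_000, 200
u = rng.choice([-1.0, 1.0], size=n)
jolt_at = 7_129                        # hypothetical jolt location
u[jolt_at:jolt_at + w] = 1.0           # run of +1's -> jolt in Z
z = np.cumsum(u) / np.sqrt(n)

# Detect the jolt as the length-w window with the largest climb in Z.
climbs = z[w:] - z[:-w]
print(int(np.argmax(climbs)))          # at or near jolt_at
```

Real jolt detection would work on the actual { U(k) } derived from m, possibly with the sampling-plus-interpolation approach mentioned above; this snippet only shows why the cumulative view makes the jolt detectable at all.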
For testing purposes, tables of large prime numbers are easy to find on the Internet, for instance here.
For related articles by the same author, click here or visit www.VincentGranville.com. Follow me on Twitter at @GranvilleDSC or on LinkedIn.

