Divide-and-conquer
Asymptotic behavior
D&C logo

Philippe Dumas's home page
Divide-and-conquer | Maple
Publications | Address | Version française

hline

Introduction
A fundamental idea
Complexity and recurrences
2-rational sequences
Asymptotic behavior
References

The asymptotic behavior of divide-and-conquer sequences is fundamental in algorithmic since the efficiency of an algorithm is usually given by its behavior on data of large size. One can distinguish two approaches : an elementary approach and a sophisticated approach.

Elementary approach. Let us start with the elementary approach. It is employed in almost all the works on the subject, [6, p.117], [9, chap. 4], [19, p. 546]. The results are often shown in an expeditious way; on occasion they are wrong by a lack of precison on the assumptions. It is impossible to treat all the cases and it seems more interesting to state clear, valid results in a well defined class of recurrences. This framework will provide heuristics for the examples met in practice and those examples will be treated by an ad hoc method if needed.

The sequences (Tn) which satisfy the recurrence

\begin{displaymath}
T_n=aT_{\lfloor n/2\rfloor}+f_n,\end{displaymath}

where $a\geq1$ and the sequence (fn) is given, constitute a simple class. Their asymptotic behavior is provided by the following result.
I.
If $f_n=O(n^{\log_2{a}+\varepsilon})$ then $T_n=O(n^{\log_2{a}+\varepsilon})$ for all $\varepsilon\gt$.
II.
If both sequences (fn) and (Tn) are positive and $f_n=\Theta(n^{\log_2{a}})$ then $T_n=\Theta(n^{\log_2{a}}\log_2{n})$.
III.
If $f_n=O(n^{\log_2{a}-\varepsilon})$ then $T_n=O(n^{\log_2{a}})$ for all $\varepsilon\gt$.
Let us recall that in computer science the notation $\Theta$ is employed to express the similarity, preferably noted $\asymp$ in mathematics; the relation $u_n=\Theta(v_n)$ abbreviates the conjunction of the two relations un=O(vn) and vn=O(un).

A heuristic method that is frequently used to determine the behavior of a divide-and-conquer sequence consists in evaluating it on the powers of 2. In the domain of 2-rational sequences this process is natural: such a sequence is the translation of a rational series, in the sense of formal language theory. A subsequence associated with the integers whose binary expansion is of the form (uvk)2, where u and v are fixed patterns, has a rational generating series in the ordinary meaning. This subsequence thus satisfies a linear recurrence with constant coefficients of classical type. The sequence being evaluated at powers of 2, the comparison may be extended to all the integers under some assumptions of monotony and regularity [19, p. 565].

The simple result that we quoted provides the complexity of several divide-and-conquer algorithms. The multiplication of integers by Karatsuba's method, the multiplication of polynomials, the multiplication of matrices by Strassen's method all have cost bounded by some sequences which satisfy

\begin{displaymath}
u_n=3u_{\lfloor n/2\rfloor}+cn,\qquad
v_n=7v_{\lfloor n/2\rfloor}+cn^2,\end{displaymath}

respectively for the first and the second, and for the last. Thus they have some cost respectively of order $O(n^{\log_23})$ and $O(n^{\log_27})$, and thus of order O(n1,58) and O(n2,81). This is asymptotically better than the cost of the naive algorithms which are respectively of order n2 and n3.

For mergesort the cost in the average case and in the worst case both satisfy a recurrence of the form

\begin{displaymath}
T_n=T_{\lfloor n/2\rfloor}+T_{\lceil
n/2\rceil}+O(n).\end{displaymath}

The second point above does not apply directly, but a simple variant shows that the cost is in both cases of order $O(n\log_2n)$. In the search of the convex hull by the divide-and-conquer method the most expensive part is the sorting of the points according to their x-coordinate and the complexity is still in $O(n\log_2n)$. One can improve these results by an experimental evaluation in the average case.

One could give other illustrations; we refer the reader to [7, pp. 23-24], [9, §28.5],[9, § 29.3.2], [9, §35.4], [14, p. 172], [20, §5.1.7], [24, p. 190], [27, p. 27], [32, pp. 153, 164]), [33, § 10.2], [34, chap. 28]. One will also consult [18], [26, §4.6.3], [28], and [13].

Sophisticated approach. Now let us handle the sophisticated approach. The divide-and-conquer sequences encountered in the analysis of algorithms have a polynomial behavior i.e. of order $O(n^{\alpha})$. It is for example the case of all 2-rational sequences; indeed such a sequence (un) is expressed by

\begin{displaymath}
u_n=\lambda A_{n_\ell}\dotsb A_{n_0}\gamma\qquad\text{if
$n=(n_{\ell}\ldots n_0)_2$}\,;\end{displaymath}

this formula gives immediately

\begin{displaymath}
\vert u_n\vert\leq \Vert \lambda\Vert \Vert A_{n_\ell}\Vert\...
 ...\gamma\Vert\leq C\max(\Vert A_0\Vert,\Vert A_1\Vert)^{\ell+1}, \end{displaymath}

if a norm on the space of dimension N is used, and this provides the announced growth because the length of the binary expansion of n is in $\log_2{n}$.

Such a behavior makes it possible to consider the Dirichlet series associated with the sequence,

\begin{displaymath}
u(s)=\sum_{n\geq1}\frac{u_n}{n^s}.\end{displaymath}

It is defined in a right half-plane of the complex plane and very often it is extended to a meromorphic function over the whole plane. It is in particular the case for 2-rational sequences because the Dirichlet series then satisfies an infinite functional equation which expresses u(s) in terms of u(s+1), u(s+2), etc. In addition Perron's formula  [5, p. 245], [36, p. 147],

\begin{displaymath}
\sum_{n<N}u_n+\frac{1}{2}u_N=\frac{1}{2i\pi}
\int_{c-i\infty}^{c+i\infty}u(s)N^s\frac{ds}{s},\end{displaymath}

in which integration is done on a vertical line located to the right of the absolute convergence abscissa, allows us to express the partial sums of the sequence. If the function u(s) has a good behavior towards $i\infty$ one can push the line of integration towards the left, over the pole. By taking the residues into account, one obtains an asymptotic expansion, say for the Cesàro mean of the sequence. On occasion, this expansion is an exact expression. For example [17], the sequence (sn) which gives the number of 1s in the binary expansion of the integers between 1 and n has value

\begin{displaymath}
S_n=\frac{1}{2}n\log_2{n}+nF(\log_2n),\end{displaymath}

where F(u) is the Fourier series

\begin{displaymath}
F(u)=\frac{\log_2\pi}{2}-\frac{1}{2\log 2}-\frac{3}{4}-
\fra...
 ...sum_{k\neq0}\frac{\zeta(\chi_k)}{\chi_k(\chi_k+1)}
e^{2ik\pi u}\end{displaymath}

and $\zeta(s)$ is the Riemann zeta function, $\chi_k=2ik\pi/\log 2$. In the same way the number $\Phi_n$ of odd binomial coefficients in the first n lines of Pascal's triangle has value

\begin{displaymath}
\Phi_n=n^{\log_23}\,G(\log_2n),\end{displaymath}

where G(u) is a 1-periodic function defined by its Fourier series.

The remarkable point of these results is the occurrence of periodic fluctuations in logarithmic scale. One notices them in the picture below where the sequence of points $(\log_2{n},\Phi_n/n^{\log_23})$ is shown. On this topic one can consult  [16], [15], [2], [3].

The divide-and-conquer strategy thus provides a method to obtain effecient algorithms. Moreover it leads to a large class of recurrences. Their algebraic study shows links with automata theory and more generally to rational series of formal language theory. In addition their precise asymptotic analysis rests on methods of analytic number theory, which make it possible to explain the fluctuations that one observes in experiments.

hline

Previous page: 2-rational sequences
Next page: References