Divide-and-conquer Asymptotic behavior

Divide-and-conquer | Maple
Publications | Address | Version française

The asymptotic behavior of divide-and-conquer sequences is fundamental in algorithmic since the efficiency of an algorithm is usually given by its behavior on data of large size. One can distinguish two approaches : an elementary approach and a sophisticated approach.

Elementary approach. Let us start with the elementary approach. It is employed in almost all the works on the subject, [6, p.117], [9, chap. 4], [19, p. 546]. The results are often shown in an expeditious way; on occasion they are wrong by a lack of precison on the assumptions. It is impossible to treat all the cases and it seems more interesting to state clear, valid results in a well defined class of recurrences. This framework will provide heuristics for the examples met in practice and those examples will be treated by an ad hoc method if needed.

The sequences (Tn) which satisfy the recurrence

where and the sequence (fn) is given, constitute a simple class. Their asymptotic behavior is provided by the following result.
I.
If then for all .
II.
If both sequences (fn) and (Tn) are positive and then .
III.
If then for all .
Let us recall that in computer science the notation is employed to express the similarity, preferably noted in mathematics; the relation abbreviates the conjunction of the two relations un=O(vn) and vn=O(un).

A heuristic method that is frequently used to determine the behavior of a divide-and-conquer sequence consists in evaluating it on the powers of 2. In the domain of 2-rational sequences this process is natural: such a sequence is the translation of a rational series, in the sense of formal language theory. A subsequence associated with the integers whose binary expansion is of the form (uvk)2, where u and v are fixed patterns, has a rational generating series in the ordinary meaning. This subsequence thus satisfies a linear recurrence with constant coefficients of classical type. The sequence being evaluated at powers of 2, the comparison may be extended to all the integers under some assumptions of monotony and regularity [19, p. 565].

The simple result that we quoted provides the complexity of several divide-and-conquer algorithms. The multiplication of integers by Karatsuba's method, the multiplication of polynomials, the multiplication of matrices by Strassen's method all have cost bounded by some sequences which satisfy

respectively for the first and the second, and for the last. Thus they have some cost respectively of order and , and thus of order O(n1,58) and O(n2,81). This is asymptotically better than the cost of the naive algorithms which are respectively of order n2 and n3.

For mergesort the cost in the average case and in the worst case both satisfy a recurrence of the form

The second point above does not apply directly, but a simple variant shows that the cost is in both cases of order . In the search of the convex hull by the divide-and-conquer method the most expensive part is the sorting of the points according to their x-coordinate and the complexity is still in . One can improve these results by an experimental evaluation in the average case.

One could give other illustrations; we refer the reader to [7, pp. 23-24], [9, §28.5],[9, § 29.3.2], [9, §35.4], [14, p. 172], [20, §5.1.7], [24, p. 190], [27, p. 27], [32, pp. 153, 164]), [33, § 10.2], [34, chap. 28]. One will also consult [18], [26, §4.6.3], [28], and [13].

Sophisticated approach. Now let us handle the sophisticated approach. The divide-and-conquer sequences encountered in the analysis of algorithms have a polynomial behavior i.e. of order . It is for example the case of all 2-rational sequences; indeed such a sequence (un) is expressed by

this formula gives immediately

if a norm on the space of dimension N is used, and this provides the announced growth because the length of the binary expansion of n is in .

Such a behavior makes it possible to consider the Dirichlet series associated with the sequence,

It is defined in a right half-plane of the complex plane and very often it is extended to a meromorphic function over the whole plane. It is in particular the case for 2-rational sequences because the Dirichlet series then satisfies an infinite functional equation which expresses u(s) in terms of u(s+1), u(s+2), etc. In addition Perron's formula  [5, p. 245], [36, p. 147],

in which integration is done on a vertical line located to the right of the absolute convergence abscissa, allows us to express the partial sums of the sequence. If the function u(s) has a good behavior towards one can push the line of integration towards the left, over the pole. By taking the residues into account, one obtains an asymptotic expansion, say for the Cesàro mean of the sequence. On occasion, this expansion is an exact expression. For example [17], the sequence (sn) which gives the number of 1s in the binary expansion of the integers between 1 and n has value

where F(u) is the Fourier series

and is the Riemann zeta function, . In the same way the number of odd binomial coefficients in the first n lines of Pascal's triangle has value

where G(u) is a 1-periodic function defined by its Fourier series.

The remarkable point of these results is the occurrence of periodic fluctuations in logarithmic scale. One notices them in the picture below where the sequence of points is shown. On this topic one can consult  [16], [15], [2], [3].

The divide-and-conquer strategy thus provides a method to obtain effecient algorithms. Moreover it leads to a large class of recurrences. Their algebraic study shows links with automata theory and more generally to rational series of formal language theory. In addition their precise asymptotic analysis rests on methods of analytic number theory, which make it possible to explain the fluctuations that one observes in experiments.

 Previous page: 2-rational sequences Next page: References