Average Bit-Complexity of Euclidean Algorithms

Brigitte Vallée

GREYC, Université de Caen

Algorithms Seminar

May 22, 2000

[summary by Marni Mishna]


Abstract
The complexity of the Euclidean algorithm and its variants is well studied. This work refines the problem further by considering the precise average bit-complexity. The technique is sufficiently general to apply to a wide class of gcd-type algorithms: it identifies the elementary transformations underlying each algorithm and derives asymptotic information from their analytic behaviour. The methods rely on properties of transfer operators adapted from dynamical systems theory. In the continuous case (continued fraction algorithms), ergodic theorems foreshadow the results; in the discrete case, Tauberian theorems take their place. This is joint work with Ali Akhavi [1].



1   Why the Bit Case?

Since the initial average-case analysis of the Euclidean algorithm in 1969 by Heilbronn and Dixon, a wide variety of approaches have been used to examine its variants, the most recent of which is the method of transfer operators [3, 4].

The technique involves viewing the algorithm as a dynamical system and each iterative step as a linear fractional transformation (LFT). Previous talks by the speaker [2] shed some light on this technique: how several classes of gcd algorithms fall under a unified approach and, furthermore, why they divide naturally into two categories, slow ($\Theta(\log^2 n)$ iterations) and fast ($\Theta(\log n)$ iterations).

This same technique now aids in the study of bit complexity. The motivation for this refinement is the following. It is not a priori evident whether the properties which make the slow algorithms slow extend to the bit case: there are certainly more iterations, but what of the size of each iteration? This work answers the question definitively, yielding the same division between slow and fast algorithms, but with new complexity bounds, $\Theta(\log^3 n)$ and $\Theta(\log^2 n)$. Furthermore, it is of interest to consider a practical complexity measure: the method offers precise insights into the distribution of costs, which enables a further refinement of the classification of fast and slow algorithms.

1.1   Standard algorithm

The standard Euclidean algorithm determines the gcd of $v_0$ and $v_1$ ($v_1<v_0$) by a finite number of divisions of the form $v_{i-1}=m_i v_i+v_{i+1}$ with $0\le v_{i+1}<v_i$, terminating when $v_{k+1}=0$. Define $\ell(x)=\lfloor\log_2 x\rfloor+1$, the binary length of $x$. Each step performs one ``naive'' division with bit cost $\ell(v_i)\,\ell(m_i)$ and two assignments involving $v_i$ and $v_{i+1}$, so the total bit cost of one iteration is $\ell(v_i)\ell(m_i)+\ell(v_i)+\ell(v_{i+1})$. Bounding $\ell(v_{i+1})$ by $\ell(v_i)$, the cost of the standard algorithm is
\[ C(v_1,v_0)=\sum_{i=1}^{k}\ell(v_i)\,\bigl(\ell(m_i)+2\bigr). \]
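To make the cost model concrete, here is a minimal Python sketch (ours, not from the paper); the names ell and bit_cost_standard are invented for the illustration.
\begin{verbatim}
def ell(x):                          # binary length  l(x) = floor(log2 x) + 1
    return x.bit_length()

def bit_cost_standard(v1, v0):       # total cost  sum_i l(v_i) * (l(m_i) + 2)
    assert 1 <= v1 < v0
    cost, a, b = 0, v0, v1           # a = v_{i-1},  b = v_i
    while b > 0:
        m, r = divmod(a, b)          # division step  v_{i-1} = m_i v_i + v_{i+1}
        cost += ell(b) * (ell(m) + 2)
        a, b = b, r
    return cost, a                   # a is now gcd(v0, v1)

print(bit_cost_standard(34, 55))     # Fibonacci pair: many small quotients
\end{verbatim}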

2   Main Result: Bit-Wise Complexity

The following two sets are the valid inputs to the Euclidean algorithm:
\[ \Omega=\{(u,v)\mid \gcd(u,v)=1,\ 1\le u<v\} \quad\text{and}\quad \Omega_N=\{(u,v)\in\Omega\mid v\le N\}. \]
The goal is to estimate the mean value of a cost $C:\Omega\to\mathbb{R}$ on $\Omega_N$; more precisely, to determine the asymptotic behaviour as $N\to\infty$ of the mean value $E_N[C]=C_N/|\Omega_N|$, where $C_N=\sum_{(u,v)\in\Omega_N}C(u,v)$.
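As a sanity check of this definition, $E_N[C]$ can be computed by brute force for small $N$ (our sketch; it simply enumerates $\Omega_N$).
\begin{verbatim}
from math import gcd

def bit_cost(u, v):                  # the cost C(u, v) of Section 1.1, u < v
    cost, a, b = 0, v, u
    while b > 0:
        m, r = divmod(a, b)
        cost += b.bit_length() * (m.bit_length() + 2)
        a, b = b, r
    return cost

def mean_cost(N):                    # E_N[C] = C_N / |Omega_N|
    total = count = 0
    for v in range(2, N + 1):
        for u in range(1, v):
            if gcd(u, v) == 1:
                total += bit_cost(u, v)
                count += 1
    return total / count

for N in (100, 300, 1000):
    print(N, mean_cost(N))
\end{verbatim}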

The function of interest in this presentation is the bit cost of the standard Euclidean algorithm, so the cost is the one defined in the previous section, but the methods are general enough to apply to a number of cases. The technique views the algorithm as a dynamical system in which each iterative step is an LFT; modifying the LFT yields the variants. The continued fraction expression of the problem,
\[ \frac{v_1}{v_0}=\cfrac{1}{m_1+\cfrac{1}{m_2+\cfrac{1}{m_3+\dots+\cfrac{1}{m_k}}}}\,, \]
motivates the use of the transformations $U(x)=1/x-\lfloor 1/x\rfloor$ and $M(x)=\lfloor 1/x\rfloor$. Notice that $m_{i+1}=M(U^i(v_1/v_0))$. The number $k$ of quotients in the continued fraction is the depth.
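The relation $m_{i+1}=M(U^i(v_1/v_0))$ can be checked exactly with rational arithmetic; the sketch below (ours) compares the digits produced by the maps $U$ and $M$ with the quotients produced by the divisions.
\begin{verbatim}
from fractions import Fraction

def digits_via_maps(v1, v0):             # iterate U, read digits with M
    x, ds = Fraction(v1, v0), []
    while x != 0:
        ds.append(int(1 / x))            # M(x) = floor(1/x)
        x = 1 / x - int(1 / x)           # U(x) = 1/x - floor(1/x)
    return ds

def digits_via_divisions(v1, v0):        # quotients of the Euclidean divisions
    a, b, ds = v0, v1, []
    while b > 0:
        m, r = divmod(a, b)
        ds.append(m)
        a, b = b, r
    return ds

print(digits_via_maps(34, 55))           # [1, 1, 1, 1, 1, 1, 1, 2]
print(digits_via_divisions(34, 55))      # the same list
\end{verbatim}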

2.1   Ergodic theory estimates

Gauss observed that the transformation $U$ has the invariant density $\psi(t)=\frac{1}{\log 2}\,\frac{1}{1+t}$. For any $A:\mathbb{N}\to\mathbb{R}$ such that $\sum A(m)\,m^{-2}<\infty$, define $E_\infty[A(m)]=\int_0^1 A[m(t)]\,\psi(t)\,dt$, where $m(t)=M(t)$ is the first quotient of $t$. This is equal to
\[ E_\infty[A(m)]=\sum_{m\ge 1}A(m)\left(\log_2\Bigl(1+\frac{1}{m}\Bigr)-\log_2\Bigl(1+\frac{1}{m+1}\Bigr)\right). \]
For example, when applied to $\ell(m)$:
\[ E_\infty[\ell(m)]=\frac{1}{\log 2}\,\log\prod_{k\ge 0}\bigl(1+2^{-k}\bigr)\approx 2.2535. \]
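A quick numerical check of this value (our sketch; the truncation bounds are arbitrary): the series over $m$ and the closed-form product should both give about $2.2535$.
\begin{verbatim}
from math import log2

series = sum(m.bit_length() * (log2(1 + 1/m) - log2(1 + 1/(m + 1)))
             for m in range(1, 10**6))                 # truncated series

closed = sum(log2(1 + 2.0**(-k)) for k in range(60))   # (1/log 2) log prod(1+2^-k)

print(series, closed)                                  # both close to 2.2535
\end{verbatim}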

In the continuous case, ergodic theory applies: for almost every $x$, the time average $\frac{1}{n}\sum_{k=0}^{n-1}A[m(U^k(x))]$ approaches $E_\infty[A]$. Although ergodic theory does not apply in the discrete case, it does suggest what to expect. The choice $A(m)=\ell(m)$ gives the expected size of $m_i$ in bits. The discrete version is formulated as $E_N\bigl[\sum_{k=1}^{p(x)}A(m_k(x))\bigr]$, where $p(x)$ is the depth of the (necessarily finite) continued fraction expansion of the rational $x$. In this framework one can study the asymptotic behaviour of several functions on $\Omega_N$, such as $\tilde A(x)=\sum_{k=1}^{p(x)}A(m_k(x))$ and $\tilde C(x)=\sum_{k=1}^{p(x)}\ell(m_k(x))\cdot\log_2 v_k(x)$.

One might anticipate that the value of $E_N[\tilde A]$ should, under suitable conditions, be governed by the expected depth and the expected size of an iteration. The expected depth $E_N[p]$ corresponds to the number of iterations of the Euclidean algorithm on inputs from $\Omega_N$, and is asymptotic to $\frac{12\log 2}{\pi^2}\log N$. So, in the case $A(m)=\ell(m)$,
\[ E_N[\tilde A]\sim E_N[p]\cdot E_\infty[A(m)]=\frac{12}{\pi^2}\,\log\Bigl(\prod_{k\ge 0}\bigl(1+2^{-k}\bigr)\Bigr)\,\log N. \]
This is the mean size of the continued fraction encoding of a rational number. A similar heuristic analysis of $E_N[\tilde C]$ shows the relation
\[ E_N[\tilde C]\sim E_N[p]\cdot\tfrac{1}{2}\log_2 N\cdot E_\infty[\ell(m)]. \]
Adding the assignment costs $2\sum_i\ell(v_i)\approx 2\cdot E_N[p]\cdot\tfrac12\log_2 N$ gives, for the full bit cost, the heuristic prediction $E_N[C]\approx E_N[p]\cdot\tfrac12\log_2 N\,\bigl(E_\infty[\ell(m)]+2\bigr)$. These observations give a context for the main result.
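These heuristics can be confronted with brute-force averages (our sketch; lower-order terms are still significant at such small $N$, so only rough agreement should be expected).
\begin{verbatim}
from math import gcd, log, log2, pi

def stats(u, v):                  # depth p, sum of l(m_i), full bit cost C
    a, b, p, digit_bits, cost = v, u, 0, 0, 0
    while b > 0:
        m, r = divmod(a, b)
        p += 1
        digit_bits += m.bit_length()
        cost += b.bit_length() * (m.bit_length() + 2)
        a, b = b, r
    return p, digit_bits, cost

N = 1000
data = [stats(u, v) for v in range(2, N + 1)
                    for u in range(1, v) if gcd(u, v) == 1]
avg = [sum(t[i] for t in data) / len(data) for i in range(3)]

E_inf = sum(log2(1 + 2.0**(-k)) for k in range(60))    # E_inf[l(m)] ~ 2.2535
depth = 12 * log(2) / pi**2 * log(N)                   # heuristic E_N[p]
print("depth      :", avg[0], "vs", depth)
print("digit bits :", avg[1], "vs", depth * E_inf)
print("bit cost   :", avg[2], "vs", depth * 0.5 * log2(N) * (E_inf + 2))
\end{verbatim}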
Theorem 1   The average bit-complexity of the standard Euclidean algorithm on the set of valid inputs of denominator less than $N$ is asymptotically of log-squared order:
\[ E_N[C]\sim\frac{6\log 2}{\pi^2}\Bigl(3\log 2+\log\prod_{k\ge 1}\bigl(1+2^{-k}\bigr)\Bigr)\,\log_2^2 N. \]

This agrees with the heuristic argument. Numerically, $E_N[C]\sim 1.24237\,\log_2^2 N$.

3   Summary of Methods

The general method for obtaining this result is similar to the speaker's earlier analyses of gcd-type algorithms. The average is expressible through partial sums of coefficients of Dirichlet series. Tauberian theory transfers the analytic behaviour of a series near its singularities into asymptotic behaviour of its coefficients. When the algorithm is seen as a dynamical system, the generating functions of the bit cost are related to the Ruelle operators associated with it. The singularities of the Dirichlet series are related to spectral projections of these operators and are easy to describe.

3.1   Dirichlet generating functions

Define $\omega_n$ to be the set of all pairs $(u,v)$ in $\Omega$ with $v=n$, and $C_n$ as the cumulative value of $C$ over $\omega_n$. The corresponding encoding into a Dirichlet generating function is
\[ F^{\langle c\rangle}(s)=\sum_{n\ge 1}\frac{C_n}{n^s}=\sum_{(v_1,v_0)\in\Omega}\frac{C(v_1,v_0)}{v_0^{\,s}}. \]
Thus the expected average cost is $E_N[C]=\bigl(\sum_{n\le N}C_n\bigr)/\bigl(\sum_{n\le N}|\omega_n|\bigr)$.

3.2   Tauberian theorem

The Tauberian theorems are a natural tool to consider, as they give asymptotic information about the partial sums of coefficients of Dirichlet series. They rely on the nature and position of the singularities of $F(s)=\sum a_n n^{-s}$.
Theorem 2  [Delange]   Let $F(s)$ be a Dirichlet series with non-negative coefficients that converges for $\Re(s)>\sigma>0$. Assume that:
  1. $F$ is analytic at every point of $\Re(s)=\sigma$ with $s\ne\sigma$;
  2. near $s=\sigma$, $F(s)=A(s)\,(s-\sigma)^{-\gamma-1}+C(s)$ for some $\gamma\ge 0$, where $A$ and $C$ are analytic at $\sigma$ and $A(\sigma)\ne 0$.
Then, as $N\to\infty$, the partial sum of the coefficients satisfies
\[ \sum_{n\le N}a_n=\frac{A(\sigma)}{\sigma\,\Gamma(\gamma+1)}\,N^{\sigma}\,\log^{\gamma}N\;\bigl(1+\varepsilon(N)\bigr),\qquad \varepsilon(N)\to 0. \]
However, these conditions are difficult to verify for $F^{\langle c\rangle}(s)$ in its present form. A transformation gives the required information about the singularities.
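For a simple instance of the theorem in action (a classical illustration, not specific to the talk), take for coefficients the cardinalities $|\omega_n|$ themselves: $|\omega_n|=\varphi(n)$ (Euler's totient) for $n\ge 2$, so
\[ \sum_{n\ge 1}\frac{|\omega_n|}{n^{s}}=\frac{\zeta(s-1)}{\zeta(s)}-1, \]
which converges for $\Re(s)>2$ and is analytic on the line $\Re(s)=2$ except for a simple pole at $s=2$ of residue $1/\zeta(2)=6/\pi^2$. Thus $\sigma=2$, $\gamma=0$, $A(\sigma)=6/\pi^2$, and the theorem yields
\[ |\Omega_N|=\sum_{n\le N}|\omega_n|\sim\frac{6/\pi^{2}}{2\,\Gamma(1)}\,N^{2}=\frac{3}{\pi^{2}}\,N^{2}, \]
the denominator appearing in $E_N[C]$.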

3.3   Ruelle operators

The classical operator is
\[ G_s[F](x)=\sum_{m\ge 1}\frac{1}{(m+x)^s}\,F\Bigl(\frac{1}{m+x}\Bigr). \]
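A small numerical illustration (ours): at $s=2$ the operator fixes the Gauss density, $G_2[f]=f$ for $f(x)=1/(1+x)$, which is the transfer-operator form of the invariance noted in Section 2.1; the truncation bound $M$ below is arbitrary.
\begin{verbatim}
def G(s, F, x, M=100000):            # truncated sum over the branches m
    return sum((m + x)**(-s) * F(1.0 / (m + x)) for m in range(1, M))

f = lambda x: 1.0 / (1.0 + x)        # un-normalised Gauss density
for x in (0.0, 0.25, 0.5, 0.9):
    print(x, G(2, f, x), f(x))       # the two values agree up to ~1/M
\end{verbatim}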
Let $\mathcal{H}=\{\,h\mid h(x)=1/(m+x),\ m\ge 1\,\}$ be the set of inverse branches of $U$. If $D[h]$ denotes the denominator of the LFT $h(x)$, then, since $D[h\circ g](x)=D[h](g(x))\cdot D[g](x)$, the iterates of $G_s$ are given by
\[ G_s^k[F](x)=\sum_{h\in\mathcal{H}^k}\frac{1}{D[h](x)^s}\,F\circ h(x). \]
The rationals associated with $\Omega$ can be written $x=h(0)$ for some $h\in\mathcal{H}^k$, $k\ge 0$. The Dirichlet generating function for $|\omega_n|$ is then $\sum_{n\ge 1}|\omega_n|\,n^{-s}=\sum_{h\in\mathcal{H}^*}D[h](0)^{-s}=(I-G_s)^{-1}[1](0)$. A cost version of $R_{s,h}[F](x)=D[h](x)^{-s}\,F\circ h(x)$ is defined as $R^{[c]}_{s,h}[F](x)=c(h)\,D[h](x)^{-s}\,F\circ h(x)$. Similarly, the cost companion of $G_s=\sum_{h\in\mathcal{H}}R_{s,h}$ is $G_s^{[c]}=\sum_{h\in\mathcal{H}}R^{[c]}_{s,h}$.
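The multiplicative property of the denominators can be checked directly on $2\times 2$ matrix representations of the LFTs (our sketch; the example branches are arbitrary).
\begin{verbatim}
from fractions import Fraction

def mat(m):                          # h_m(x) = 1/(m+x)  <->  [[0,1],[1,m]]
    return ((0, 1), (1, m))

def compose(A, B):                   # matrix product  =  h_A o h_B
    return ((A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]),
            (A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]))

def apply_lft(A, x):                 # (a x + b) / (c x + d), exact for Fraction x
    (a, b), (c, d) = A
    return (a * x + b) / (c * x + d)

def D(A, x):                         # denominator D[h](x) = c x + d
    (_, _), (c, d) = A
    return c * x + d

g = compose(mat(2), mat(3))          # g = h_2 o h_3
h = mat(5)
x = Fraction(1, 4)
print(D(compose(h, g), x), D(h, apply_lft(g, x)) * D(g, x))   # equal: 163/4
\end{verbatim}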

Recall that $C(v_1,v_0)\approx\sum_{i=1}^{k}\log_2(v_i)\,c(m_i)$. If $x=v_1/v_0=h_1\circ h_2\circ\cdots\circ h_k(0)$, then $c(m_i)$ depends only on $h_i$, and $v_i$ depends only on $h_{i+1}\circ\cdots\circ h_k(0)$. That is, the branch can be decomposed as
\[ h=(h_1\circ\cdots\circ h_{i-1})\circ h_i\circ(h_{i+1}\circ\cdots\circ h_k)=b_i(h)\circ h_i\circ e_i(h). \]
Defining
\[ C_{s,h}=-\sum_{i=1}^{k}\frac{\partial}{\partial s}R_{s,e_i(h)}\circ R^{[c]}_{s,h_i}\circ R_{s,b_i(h)} \]
yields $F^{\langle c\rangle}(s)=\sum_{h\in\mathcal{H}^*}C_{s,h}[1](0)$.

3.4   Functional analysis

The singularities of the cost series $F^{\langle c\rangle}(s)$ can now be described in terms of the singularities of the $C_{s,h}$, and hence of $(I-G_s)^{-1}$. Analysis of $(I-G_s)^{-1}$ determines the values for the Tauberian theorem to be $\sigma=2$ and $\gamma=2$, from which Theorem 1 follows. For the operators related to the slow algorithms, the corresponding result is $\gamma=3$, accounting for the log-cubed behaviour.

4   Variants and Encoding

As before, the technique applies to a family of variants. For example, the bit-complexity of the centred algorithm is asymptotic to
\[ \frac{6\log 2}{\pi^2}\,\log\Biggl(\phi^2\prod_{k=3}^{\infty}\frac{\phi^2+\dfrac{2\phi}{2^k-1}}{\phi^2-\dfrac{2}{2^k-1}}\Biggr)\,\log_2^2 N,\qquad\text{where }\phi=\frac{\sqrt{5}+1}{2}. \]

Finally, the average length of a continued fraction encoding is also computable. This is the space occupied in memory by $(m_1,m_2,\ldots,m_k,v_k)$. The encoding uses the same principles as Fano--Shannon coding.
Theorem 3   The average Fano--Shannon code-length $D_N$ of the continued fraction expansion produced by the standard algorithm on valid inputs with denominator at most $N$ satisfies
\[ D_N\sim\frac{12\log 2}{\pi^2}\Bigl(1+\frac{2}{\log 2}\,\log\prod_{k\ge 1}\bigl(1+2^{-k}\bigr)\Bigr)\,\log N. \]
The numerical value is $2.04868\,\log_2 N$, which is close to the optimal $2\log_2 N$.
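As a numerical cross-check (ours) of the two constants quoted in this summary, computed from the displayed formulas:
\begin{verbatim}
from math import log, pi

lnP = sum(log(1 + 2.0**(-k)) for k in range(1, 60))    # log prod_{k>=1}(1+2^-k)

c1 = 6 * log(2) / pi**2 * (3 * log(2) + lnP)           # Theorem 1 constant
c3 = 12 * log(2) / pi**2 * (1 + 2 / log(2) * lnP) * log(2)  # Theorem 3, per log_2 N
print(c1, c3)                                          # about 1.24237 and 2.04868
\end{verbatim}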

References

[1]
Akhavi (A.) and Vallée (B.). -- Average bit-complexity of Euclidean algorithms. In Montanari (Ugo), Rolim (José D. P.), and Welzl (Emo) (editors), Automata, languages and programming. Lecture Notes in Computer Science, vol. 1853, pp. 374--387. -- Springer, New York, 2000. Proceedings of the 27th ICALP Conference, Geneva, Switzerland, July 2000.

[2]
Salvy (B.). -- Algorithms Seminar, 1998--1999. -- Research Report n°3830, Institut National de Recherche en Informatique et en Automatique, December 1999.

[3]
Vallée (B.). -- Dynamics of the binary Euclidean algorithm: functional analysis and operators. Algorithmica, vol. 22, n°4, 1998, pp. 660--685. -- Average-case analysis of algorithms.

[4]
Vallée (Brigitte). -- A unifying framework for the analysis of a class of Euclidean algorithms. In Gonnet (Gastón H.), Panario (Daniel), and Viola (Alfredo) (editors), LATIN 2000: Theoretical Informatics. Lecture Notes in Computer Science, vol. 1776, pp. 343--354. -- Springer, Berlin, 2000. Proceedings of the 4th Latin American Symposium, Punta del Este, Uruguay, April 2000.
