Average Bit-Complexity of Euclidean Algorithms

Brigitte Vallée

GREYC, Université de Caen

Algorithms Seminar

May 22, 2000

[summary by Marni Mishna]


Abstract
The complexity of the Euclidean algorithm and its variants is well studied. This work refines the problem further by considering the precise average bit-complexity. The technique is sufficiently general to apply to a wide class of gcd-type algorithms: it identifies the elementary transformations underlying each algorithm and derives asymptotic information from their analytic behaviour. The methods rely on properties of transfer operators adapted from dynamical systems theory. In the continuous case (continued fraction algorithms), ergodic theorems foreshadow the results; in the discrete case, Tauberian theorems take their place. This is joint work with Ali Akhavi [1].



1   Why the Bit Case?

Since the initial average-case analysis of the Euclidean algorithm in 1969 by Heilbronn and Dixon, a wide variety of approaches have been used to examine its variants, the most recent of which is the method of transfer operators [3, 4].

The technique involves viewing the algorithm as a dynamical system and each iterative step as a linear fractional transformation (LFT). Previous talks by the speaker [2] shed some light on this technique: how several classes of gcd algorithms fall under a unified approach and, furthermore, why they divide naturally into two categories, slow ($\Theta(\log^2 n)$ iterations) and fast ($\Theta(\log n)$ iterations).

This same technique now aids in the study of bit complexity. The motivation for this refinement is the following. It is not a priori evident whether the properties which make the slow algorithms slow extend to the bit case: there are certainly more iterations, but what of the size of each iteration? This work answers the question definitively, yielding the same division between slow and fast algorithms, but with new complexity bounds, $\Theta(\log^3 n)$ and $\Theta(\log^2 n)$. Furthermore, it is of interest to consider a practical complexity measure: the method offers precise insights into the distribution of costs, which enables a further refinement of the classification of fast and slow algorithms.

1.1   Standard algorithm

The standard Euclidean algorithm determines the gcd of $v_0$ and $v_1$ ($v_1<v_0$) by a finite number of divisions of the form $v_{i-1}=m_i v_i+v_{i+1}$ with $0\le v_{i+1}<v_i$, terminating when $v_{k+1}=0$. Define $\ell(x)=\lfloor\log_2 x\rfloor+1$, the binary length of $x$. Each step performs one ``naive'' division with bit cost $\ell(v_i)\,\ell(m_i)$ and two assignments involving $v_i$ and $v_{i+1}$, so the total bit cost of one iteration is $\ell(v_i)\ell(m_i)+\ell(v_i)+\ell(v_{i+1})$. Bounding $\ell(v_{i+1})$ by $\ell(v_i)$, the cost of the standard algorithm is
\[ C(v_1,v_0)=\sum_{i=1}^{k}\ell(v_i)\,\bigl(\ell(m_i)+2\bigr). \]
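To make the cost model concrete, here is a minimal Python sketch (ours, not from the paper); the names ell and bit_cost_standard are invented for the illustration.
\begin{verbatim}
def ell(x):                          # binary length  l(x) = floor(log2 x) + 1
    return x.bit_length()

def bit_cost_standard(v1, v0):       # total cost  sum_i l(v_i) * (l(m_i) + 2)
    assert 1 <= v1 < v0
    cost, a, b = 0, v0, v1           # a = v_{i-1},  b = v_i
    while b > 0:
        m, r = divmod(a, b)          # division step  v_{i-1} = m_i v_i + v_{i+1}
        cost += ell(b) * (ell(m) + 2)
        a, b = b, r
    return cost, a                   # a is now gcd(v0, v1)

print(bit_cost_standard(34, 55))     # Fibonacci pair: many small quotients
\end{verbatim}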

2   Main Result: Bit-Wise Complexity

The following two sets are the valid inputs to the Euclidean algorithm:
\[ \Omega=\{(u,v)\mid \gcd(u,v)=1,\ 1\le u<v\} \quad\text{and}\quad \Omega_N=\{(u,v)\in\Omega\mid v\le N\}. \]
The goal is to estimate the mean value of a cost $C:\Omega\to\mathbb{R}$ on $\Omega_N$; more precisely, to determine the asymptotic behaviour as $N\to\infty$ of the mean value $E_N[C]=C_N/|\Omega_N|$, where $C_N=\sum_{(u,v)\in\Omega_N}C(u,v)$.
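As a sanity check of this definition, $E_N[C]$ can be computed by brute force for small $N$ (our sketch; it simply enumerates $\Omega_N$).
\begin{verbatim}
from math import gcd

def bit_cost(u, v):                  # the cost C(u, v) of Section 1.1, u < v
    cost, a, b = 0, v, u
    while b > 0:
        m, r = divmod(a, b)
        cost += b.bit_length() * (m.bit_length() + 2)
        a, b = b, r
    return cost

def mean_cost(N):                    # E_N[C] = C_N / |Omega_N|
    total = count = 0
    for v in range(2, N + 1):
        for u in range(1, v):
            if gcd(u, v) == 1:
                total += bit_cost(u, v)
                count += 1
    return total / count

for N in (100, 300, 1000):
    print(N, mean_cost(N))
\end{verbatim}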

The function of interest in this presentation is the bit cost of the standard Euclidean algorithm, so the cost is the one defined in the previous section, but the methods are general enough to apply to a number of cases. The technique views the algorithm as a dynamical system in which each iterative step is an LFT; modifying the LFT yields the variants. The continued fraction expression of the problem,
\[ \frac{v_1}{v_0}=\cfrac{1}{m_1+\cfrac{1}{m_2+\cfrac{1}{m_3+\dots+\cfrac{1}{m_k}}}}\,, \]
motivates the use of the transformations $U(x)=1/x-\lfloor 1/x\rfloor$ and $M(x)=\lfloor 1/x\rfloor$. Notice that $m_{i+1}=M(U^i(v_1/v_0))$. The number $k$ of quotients in the continued fraction is the depth.
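The relation $m_{i+1}=M(U^i(v_1/v_0))$ can be checked exactly with rational arithmetic; the sketch below (ours) compares the digits produced by the maps $U$ and $M$ with the quotients produced by the divisions.
\begin{verbatim}
from fractions import Fraction

def digits_via_maps(v1, v0):             # iterate U, read digits with M
    x, ds = Fraction(v1, v0), []
    while x != 0:
        ds.append(int(1 / x))            # M(x) = floor(1/x)
        x = 1 / x - int(1 / x)           # U(x) = 1/x - floor(1/x)
    return ds

def digits_via_divisions(v1, v0):        # quotients of the Euclidean divisions
    a, b, ds = v0, v1, []
    while b > 0:
        m, r = divmod(a, b)
        ds.append(m)
        a, b = b, r
    return ds

print(digits_via_maps(34, 55))           # [1, 1, 1, 1, 1, 1, 1, 2]
print(digits_via_divisions(34, 55))      # the same list
\end{verbatim}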

2.1   Ergodic theory estimates

Gauss observed that the transformation $U$ has the invariant density $\psi(t)=\frac{1}{\log 2}\,\frac{1}{1+t}$. For any $A:\mathbb{N}\to\mathbb{R}$ such that $\sum A(m)\,m^{-2}<\infty$, define $E_\infty[A(m)]=\int_0^1 A[m(t)]\,\psi(t)\,dt$, where $m(t)=M(t)$ is the first quotient of $t$. This is equal to
\[ E_\infty[A(m)]=\sum_{m\ge 1}A(m)\left(\log_2\Bigl(1+\frac{1}{m}\Bigr)-\log_2\Bigl(1+\frac{1}{m+1}\Bigr)\right). \]
For example, when applied to $\ell(m)$:
\[ E_\infty[\ell(m)]=\frac{1}{\log 2}\,\log\prod_{k\ge 0}\bigl(1+2^{-k}\bigr)\approx 2.2535. \]
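A quick numerical check of this value (our sketch; the truncation bounds are arbitrary): the series over $m$ and the closed-form product should both give about $2.2535$.
\begin{verbatim}
from math import log2

series = sum(m.bit_length() * (log2(1 + 1/m) - log2(1 + 1/(m + 1)))
             for m in range(1, 10**6))                 # truncated series

closed = sum(log2(1 + 2.0**(-k)) for k in range(60))   # (1/log 2) log prod(1+2^-k)

print(series, closed)                                  # both close to 2.2535
\end{verbatim}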

In the continuous case, ergodic theory applies: for almost every $x$, the time average $\frac{1}{n}\sum_{k=0}^{n-1}A[m(U^k(x))]$ approaches $E_\infty[A]$. Although ergodic theory does not apply in the discrete case, it does suggest what to expect. The choice $A(m)=\ell(m)$ gives the expected size of $m_i$ in bits. The discrete version is formulated as $E_N\bigl[\sum_{k=1}^{p(x)}A(m_k(x))\bigr]$, where $p(x)$ is the depth of the (necessarily finite) continued fraction expansion of the rational $x$. In this framework one can study the asymptotic behaviour of several functions on $\Omega_N$, such as $\tilde A(x)=\sum_{k=1}^{p(x)}A(m_k(x))$ and $\tilde C(x)=\sum_{k=1}^{p(x)}\ell(m_k(x))\cdot\log_2 v_k(x)$.

One might anticipate that the value of $E_N[\tilde A]$ should, under suitable conditions, be governed by the expected depth and the expected size of an iteration. The expected depth $E_N[p]$ corresponds to the number of iterations of the Euclidean algorithm on inputs from $\Omega_N$, and is asymptotic to $\frac{12\log 2}{\pi^2}\log N$. So, in the case $A(m)=\ell(m)$,
\[ E_N[\tilde A]\sim E_N[p]\cdot E_\infty[A(m)]=\frac{12}{\pi^2}\,\log\Bigl(\prod_{k\ge 0}\bigl(1+2^{-k}\bigr)\Bigr)\,\log N. \]
This is the mean size of the continued fraction encoding of a rational number. A similar heuristic analysis of $E_N[\tilde C]$ shows the relation
\[ E_N[\tilde C]\sim E_N[p]\cdot\tfrac{1}{2}\log_2 N\cdot E_\infty[\ell(m)]. \]
Adding the assignment costs $2\sum_i\ell(v_i)\approx 2\cdot E_N[p]\cdot\tfrac12\log_2 N$ gives, for the full bit cost, the heuristic prediction $E_N[C]\approx E_N[p]\cdot\tfrac12\log_2 N\,\bigl(E_\infty[\ell(m)]+2\bigr)$. These observations give a context for the main result.
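These heuristics can be confronted with brute-force averages (our sketch; lower-order terms are still significant at such small $N$, so only rough agreement should be expected).
\begin{verbatim}
from math import gcd, log, log2, pi

def stats(u, v):                  # depth p, sum of l(m_i), full bit cost C
    a, b, p, digit_bits, cost = v, u, 0, 0, 0
    while b > 0:
        m, r = divmod(a, b)
        p += 1
        digit_bits += m.bit_length()
        cost += b.bit_length() * (m.bit_length() + 2)
        a, b = b, r
    return p, digit_bits, cost

N = 1000
data = [stats(u, v) for v in range(2, N + 1)
                    for u in range(1, v) if gcd(u, v) == 1]
avg = [sum(t[i] for t in data) / len(data) for i in range(3)]

E_inf = sum(log2(1 + 2.0**(-k)) for k in range(60))    # E_inf[l(m)] ~ 2.2535
depth = 12 * log(2) / pi**2 * log(N)                   # heuristic E_N[p]
print("depth      :", avg[0], "vs", depth)
print("digit bits :", avg[1], "vs", depth * E_inf)
print("bit cost   :", avg[2], "vs", depth * 0.5 * log2(N) * (E_inf + 2))
\end{verbatim}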
Theorem 1   The average bit-complexity of the standard Euclidean algorithm on the set of valid inputs of denominator less than $N$ is asymptotically of log-squared order:
\[ E_N[C]\sim\frac{6\log 2}{\pi^2}\Bigl(3\log 2+\log\prod_{k\ge 1}\bigl(1+2^{-k}\bigr)\Bigr)\,\log_2^2 N. \]

This agrees with the heuristic argument. Numerically, $E_N[C]\sim 1.24237\,\log_2^2 N$.

3   Summary of Methods

The general method for obtaining this result is similar to the speaker's earlier analyses of gcd-type algorithms. The average is expressible through partial sums of coefficients of Dirichlet series. Tauberian theory transfers the analytic behaviour of a series near its singularities into asymptotic behaviour of its coefficients. When the algorithm is seen as a dynamical system, the generating functions of the bit cost are related to the Ruelle operators associated with it. The singularities of the Dirichlet series are related to spectral projections of these operators and are easy to describe.

3.1   Dirichlet generating functions

Define $\omega_n$ to be the set of all pairs $(u,v)$ in $\Omega$ with $v=n$, and $C_n$ as the cumulative value of $C$ over $\omega_n$. The corresponding encoding into a Dirichlet generating function is
\[ F^{\langle c\rangle}(s)=\sum_{n\ge 1}\frac{C_n}{n^s}=\sum_{(v_1,v_0)\in\Omega}\frac{C(v_1,v_0)}{v_0^{\,s}}. \]
Thus the expected average cost is $E_N[C]=\bigl(\sum_{n\le N}C_n\bigr)/\bigl(\sum_{n\le N}|\omega_n|\bigr)$.

3.2   Tauberian theorem

The Tauberian theorems are a natural tool to consider, as they give asymptotic information about the partial sums of coefficients of Dirichlet series. They rely on the nature and position of the singularities of $F(s)=\sum a_n n^{-s}$.
Theorem 2  [Delange]   Let $F(s)$ be a Dirichlet series with non-negative coefficients that converges for $\Re(s)>\sigma>0$. Assume that:
  1. $F$ is analytic at every point of $\Re(s)=\sigma$ with $s\ne\sigma$;
  2. near $s=\sigma$, $F(s)=A(s)\,(s-\sigma)^{-\gamma-1}+C(s)$ for some $\gamma\ge 0$, where $A$ and $C$ are analytic at $\sigma$ and $A(\sigma)\ne 0$.
Then, as $N\to\infty$, the partial sum of the coefficients satisfies
\[ \sum_{n\le N}a_n=\frac{A(\sigma)}{\sigma\,\Gamma(\gamma+1)}\,N^{\sigma}\,\log^{\gamma}N\;\bigl(1+\varepsilon(N)\bigr),\qquad \varepsilon(N)\to 0. \]
However, these conditions are difficult to verify for $F^{\langle c\rangle}(s)$ in its present form. A transformation gives the required information about the singularities.
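For a simple instance of the theorem in action (a classical illustration, not specific to the talk), take for coefficients the cardinalities $|\omega_n|$ themselves: $|\omega_n|=\varphi(n)$ (Euler's totient) for $n\ge 2$, so
\[ \sum_{n\ge 1}\frac{|\omega_n|}{n^{s}}=\frac{\zeta(s-1)}{\zeta(s)}-1, \]
which converges for $\Re(s)>2$ and is analytic on the line $\Re(s)=2$ except for a simple pole at $s=2$ of residue $1/\zeta(2)=6/\pi^2$. Thus $\sigma=2$, $\gamma=0$, $A(\sigma)=6/\pi^2$, and the theorem yields
\[ |\Omega_N|=\sum_{n\le N}|\omega_n|\sim\frac{6/\pi^{2}}{2\,\Gamma(1)}\,N^{2}=\frac{3}{\pi^{2}}\,N^{2}, \]
the denominator appearing in $E_N[C]$.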

3.3   Ruelle operators

The classical operator is
\[ G_s[F](x)=\sum_{m\ge 1}\frac{1}{(m+x)^s}\,F\Bigl(\frac{1}{m+x}\Bigr). \]
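A small numerical illustration (ours): at $s=2$ the operator fixes the Gauss density, $G_2[f]=f$ for $f(x)=1/(1+x)$, which is the transfer-operator form of the invariance noted in Section 2.1; the truncation bound $M$ below is arbitrary.
\begin{verbatim}
def G(s, F, x, M=100000):            # truncated sum over the branches m
    return sum((m + x)**(-s) * F(1.0 / (m + x)) for m in range(1, M))

f = lambda x: 1.0 / (1.0 + x)        # un-normalised Gauss density
for x in (0.0, 0.25, 0.5, 0.9):
    print(x, G(2, f, x), f(x))       # the two values agree up to ~1/M
\end{verbatim}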
Let $\mathcal{H}=\{\,h\mid h(x)=1/(m+x),\ m\ge 1\,\}$ be the set of inverse branches of $U$. If $D[h]$ denotes the denominator of the LFT $h(x)$, then, since $D[h\circ g](x)=D[h](g(x))\cdot D[g](x)$, the iterates of $G_s$ are given by
\[ G_s^k[F](x)=\sum_{h\in\mathcal{H}^k}\frac{1}{D[h](x)^s}\,F\circ h(x). \]
The rationals associated with $\Omega$ can be written $x=h(0)$ for some $h\in\mathcal{H}^k$, $k\ge 0$. The Dirichlet generating function for $|\omega_n|$ is then $\sum_{n\ge 1}|\omega_n|\,n^{-s}=\sum_{h\in\mathcal{H}^*}D[h](0)^{-s}=(I-G_s)^{-1}[1](0)$. A cost version of $R_{s,h}[F](x)=D[h](x)^{-s}\,F\circ h(x)$ is defined as $R^{[c]}_{s,h}[F](x)=c(h)\,D[h](x)^{-s}\,F\circ h(x)$. Similarly, the cost companion of $G_s=\sum_{h\in\mathcal{H}}R_{s,h}$ is $G_s^{[c]}=\sum_{h\in\mathcal{H}}R^{[c]}_{s,h}$.
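The multiplicative property of the denominators can be checked directly on $2\times 2$ matrix representations of the LFTs (our sketch; the example branches are arbitrary).
\begin{verbatim}
from fractions import Fraction

def mat(m):                          # h_m(x) = 1/(m+x)  <->  [[0,1],[1,m]]
    return ((0, 1), (1, m))

def compose(A, B):                   # matrix product  =  h_A o h_B
    return ((A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]),
            (A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]))

def apply_lft(A, x):                 # (a x + b) / (c x + d), exact for Fraction x
    (a, b), (c, d) = A
    return (a * x + b) / (c * x + d)

def D(A, x):                         # denominator D[h](x) = c x + d
    (_, _), (c, d) = A
    return c * x + d

g = compose(mat(2), mat(3))          # g = h_2 o h_3
h = mat(5)
x = Fraction(1, 4)
print(D(compose(h, g), x), D(h, apply_lft(g, x)) * D(g, x))   # equal: 163/4
\end{verbatim}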

Recall that $C(v_1,v_0)\approx\sum_{i=1}^{k}\log_2(v_i)\,c(m_i)$. If $x=v_1/v_0=h_1\circ h_2\circ\cdots\circ h_k(0)$, then $c(m_i)$ depends only on $h_i$, and $v_i$ depends only on $h_{i+1}\circ\cdots\circ h_k(0)$. That is, the branch can be decomposed as
\[ h=(h_1\circ\cdots\circ h_{i-1})\circ h_i\circ(h_{i+1}\circ\cdots\circ h_k)=b_i(h)\circ h_i\circ e_i(h). \]
Defining
\[ C_{s,h}=-\sum_{i=1}^{k}\frac{\partial}{\partial s}R_{s,e_i(h)}\circ R^{[c]}_{s,h_i}\circ R_{s,b_i(h)} \]
yields $F^{\langle c\rangle}(s)=\sum_{h\in\mathcal{H}^*}C_{s,h}[1](0)$.

3.4   Functional analysis

The singularities of the cost series $F^{\langle c\rangle}(s)$ can now be described in terms of the singularities of the $C_{s,h}$, and hence of $(I-G_s)^{-1}$. Analysis of $(I-G_s)^{-1}$ determines the values for the Tauberian theorem to be $\sigma=2$ and $\gamma=2$, from which Theorem 1 follows. For the operators related to the slow algorithms, the corresponding result is $\gamma=3$, accounting for the log-cubed behaviour.

4   Variants and Encoding

As before, the technique applies to a family of variants. For example, the bit-complexity of the centred algorithm is asymptotic to
\[ \frac{6\log 2}{\pi^2}\,\log\Biggl(\phi^2\prod_{k=3}^{\infty}\frac{\phi^2+\dfrac{2\phi}{2^k-1}}{\phi^2-\dfrac{2}{2^k-1}}\Biggr)\,\log_2^2 N,\qquad\text{where }\phi=\frac{\sqrt{5}+1}{2}. \]

Finally, the average length of a continued fraction encoding is also computable. This is the space occupied in memory by $(m_1,m_2,\ldots,m_k,v_k)$. The encoding uses the same principles as Fano--Shannon coding.
Theorem 3   The average Fano--Shannon code-length $D_N$ of the continued fraction expansion produced by the standard algorithm on valid inputs with denominator at most $N$ satisfies
\[ D_N\sim\frac{12\log 2}{\pi^2}\Bigl(1+\frac{2}{\log 2}\,\log\prod_{k\ge 1}\bigl(1+2^{-k}\bigr)\Bigr)\,\log N. \]
The numerical value is $2.04868\,\log_2 N$, which is close to the optimal $2\log_2 N$.
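As a numerical cross-check (ours) of the two constants quoted in this summary, computed from the displayed formulas:
\begin{verbatim}
from math import log, pi

lnP = sum(log(1 + 2.0**(-k)) for k in range(1, 60))    # log prod_{k>=1}(1+2^-k)

c1 = 6 * log(2) / pi**2 * (3 * log(2) + lnP)           # Theorem 1 constant
c3 = 12 * log(2) / pi**2 * (1 + 2 / log(2) * lnP) * log(2)  # Theorem 3, per log_2 N
print(c1, c3)                                          # about 1.24237 and 2.04868
\end{verbatim}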

References

[1]
Akhavi (A.) and Vallée (B.). -- Average bit-complexity of Euclidean algorithms. In Montanari (Ugo), Rolim (José D. P.), and Welzl (Emo) (editors), Automata, languages and programming. Lecture Notes in Computer Science, vol. 1853, pp. 374--387. -- Springer, New York, 2000. Proceedings of the 27th ICALP Conference, Geneva, Switzerland, July 2000.

[2]
Salvy (B.). -- Algorithms Seminar, 1998--1999. -- Research Report n°3830, Institut National de Recherche en Informatique et en Automatique, December 1999.

[3]
Vallée (B.). -- Dynamics of the binary Euclidean algorithm: functional analysis and operators. Algorithmica, vol. 22, n°4, 1998, pp. 660--685. -- Average-case analysis of algorithms.

[4]
Vallée (Brigitte). -- A unifying framework for the analysis of a class of Euclidean algorithms. In Gonnet (Gastón H.), Panario (Daniel), and Viola (Alfredo) (editors), LATIN 2000: Theoretical Informatics. Lecture Notes in Computer Science, vol. 1776, pp. 343--354. -- Springer, Berlin, 2000. Proceedings of the 4th Latin American Symposium, Punta del Este, Uruguay, April 2000.
