ASYMPTOTICS OF THE STIRLING NUMBERS
OF THE SECOND KIND
Bruno Salvy & John Shackell
January 1998
We show how computer algebra can be used to compute automatically the asymptotics of the Bell numbers and of the average value and variance of the number of parts in a partition. This worksheet details the computations involved in Section 3 of our article Symbolic Asymptotics: Multiseries of Inverse Functions , available as INRIA Research Report #3264, 1997. The starting point for all these asymptotic expansions is the bivariate generating function of the Stirling numbers of the second kind:
> F:=exp(u*(exp(x)-1));
> series(F,x);
The coefficient of in this series is the Stirling number of the second kind divided by . These numbers count the number of partitions of the set {1,..., } in parts.
For functions with a fast growth at infinity, like many functions related to , the asymptotic behaviour of the Taylor coefficients can be computed by the saddle-point method. This method relies on using an integral along a closed contour in the complex plane chosen in such a way as to pass through a special point called the saddle-point. The integral is concentrated in the neighborhood of the saddle-point and good asymptotic estimates are obtained from a local expansion there. However, in these applications, the saddle-point is defined implicitly by an equation and only an asymptotic expansion of it is available. This introduces technical difficulties in the manipulation of the expansions. This worksheet explores how our algorithm for asymptotic inversion applies in such circumstances.
We first load our experimental implementation
> read `/export/salvy/Alcom/97/asympt_inv/load.mpl`;
Sufficient conditions for deriving an asymptotic expansion by the saddle-point method have been given by Harris and Schoenfeld. These conditions are satisfied by all the generating functions we consider in this worksheet. The following procedure (which can be skipped in a first reading) takes as input a function , the associated saddle-point equation, an asymptotic scale (see below) and computes the asymptotic expansion of the th Taylor coefficient of as tends to infinity in terms of the saddle-point.
> harris_schoenfeld:=proc(f,x,R,nofR,nbterms,scale)local A, B, Bb, Cc, F, i, m, u, compo;A:=diff(log(f),x);for i to 2*nbterms do B[i]:=normal(x^i/i!*diff(A,[x$(i-1)])) od;Bb:=x*diff(B[1],x)/2;for i to 2*nbterms-2 do Cc[i]:=normal(-(B[i+2]+(-1)^i/(i+2)*B[1])/Bb) od;F[0]:=1;for i to nbterms-1 do F[i]:=(-1)^i/sqrt(Pi)*add(GAMMA(m+i+1/2)/m!*add(mul(Cc[u],u=compo),compo=combinat[composition](2*i,m)),m=1..2*i)od;tower(subs([x=R,n=nofR],f/2/sqrt(Pi)/sqrt(Bb)/x^n*add(normal(F[i]/Bb^i),i=0..nbterms-1)),scale)end:
Asymptotics of the Bell numbers
The Bell numbers are the sum over k of the Stirling numbers . Their generating function is
> f:=subs(u=1,F);
Alternatively, since these numbers count the number of partitions of a set, this generating function and the numbers can be obtained using combstruct :
> G:={P=Set(Set(Atom,card>0))}:
> combstruct[gfsolve](G,labelled,x);
Here are the first numbers:
> series(f,x,20);
or, using combstruct
> seq(combstruct[count]([P,G,labelled],size=i),i=0..19);
the factor 1/ being due to the use of exponential generating functions.
The saddle-point equation we use is
> speq:=subs(x=R,diff(log(f),x)*x)-1=n;
Then the asymptotic analysis of the Bell numbers is classically based on the asymptotic behaviour of the saddle-point viewed as a function of as tends to infinity.
From the above equation our algorithm produces the asymptotic expansion of the saddle-point in two stages: first a sequence of ``exact'' information is computed by the procedure Invert which also creates the scale scaleR necessary for the expansions of functions of R :
> r1:=Invert(lhs(speq),R,'scaleR');
The last argument has been assigned the corresponding asymptotic scale:
> scaleR[list];
In a second stage, asymptotic information is extracted from the exact result r1 and the scale necessary for the expansions of functions of n is created
> exp1:=multiseriesinverse([r1],scaleR,0,2,n,3,'scalen');
Here again the variable scalen holds the asymptotic scale related to the variable n :
> scalen[list];
The procedure harris_schoenfeld written above is then able to compute the expansion of the Bell numbers in terms of the saddle-point R in the scale scaleR which has been obtained above
> d1:=harris_schoenfeld(f,x,R,lhs(speq),3,scaleR);
The result follows upon substitution of R by its asymptotic expansion in terms of n ( exp1 above). Since the logarithm of this expansion is easier to handle, we first compute it:
> d2:=`tower/ln`(d1,scaleR):
> multiseries(d2,scaleR,0,3,nops(scaleR[list]),3);
Hence the result by substituting the expansion of in terms of in this last expansion:
> substitute(d2,scaleR,exp1,scalen,3);
The first part of this expansion is the classical expansion of which can be found in (de Bruijn 1981, p. 108). The remaining parts reflect the use of multiseries, a recent tool which makes it possible to deal with indefinite cancellation in a simple way.
Average number of parts in a partition
The technique demonstrated above also applies to the average number of parts in a partition, where an indefinite cancellation occurs and this exemplifies the use of multiple scales. The generating function of is
> g:=subs(u=1,diff(F,u));
The average number of parts in a partition of a set of size is the th Taylor coefficient of divided by the th coefficient of whose asymptotics we have just computed. The saddle-point equation for is:
> speq2:=normal(subs(x=R,diff(log(g),x)*x)-1)=n;
From this equation our algorithm produces the asymptotic expansion of the saddle-point in two stages as before:
> r2:=Invert(op(1,speq2),R):
> exp2:=multiseriesinverse([r2],scaleR,0,2,n,3,scalen);
This expansion of this new saddle-point is the same as that of the previous case. But this does not mean that both saddle-points are equal. The difference between their asymptotic behaviours can be obtained by working in a different scale. Here is a refined estimate for the first saddle-point:
> sp1:=multiseriesinverse([r1],scaleR,2,1,n,3,'scalezeta');
> alias(zeta=%1):
> sp1;
and here is an estimate for the second saddle-point refined similarly
> sp2:=multiseriesinverse([r2],scaleR,2,1,n,3,scalezeta);
Hence the difference can be computed at this level of the asymptotic scale
> `multiseries/expand`(x1-x2,[x1,x2],[sp1,sp2],scalezeta,3);
or in terms of
> substitute(SERIEStoseries(",'polynom'),scalezeta,exp1,scalen,3);
Thus the saddle-points differ asymptotically by / , which could not be detected from the asymptotic expansion in the original scale.
In order to compute the average number of parts in a partition, we have to divide the estimate for the th coefficient of by the estimate for the th coefficient of . Since these estimates are given in terms of different saddle-points, the computation is more easily done by obtaining both estimates in the same scale, which involves , and then substituting the estimate for in terms of .
> estden:=substitute(d1,scaleR,sp1,scalezeta,3);
> n1:=harris_schoenfeld(g,x,R,op(1,speq2),3,scaleR):
> estnum:=substitute(SERIEStoseries(multiseries(n1,scaleR,2,1,nops(scaleR[list]),3),'polynom'),scaleR,sp2,scalezeta,3);
Hence the ratio in is
> ratio:=`multiseries/expand`(x2/x1,[x1,x2],[estden,estnum],scalezeta,3);
And here is the final result in : the average number of parts in a partition of elements:
> substitute(SERIEStoseries(ratio,'polynom'),scalezeta,exp1,scalen,3);
This result is classical. However, it is worth noting that classical references give at best the first term of the asymptotic behaviour (n/log n) and one of them is wrong by a factor of e=exp(1). This gives an idea of the difficulty of performing such computations by hand.
Variance
Computation of the variance leads to further cancellation. It is obtained from the generating function
> h:=subs(u=1,diff(u*diff(F,u),u));
The corresponding saddle-point equation is
> speq3:=subs(x=R,normal(diff(log(h),x)*x))-1=n;
The expansion of the saddle-point in terms of is obtained as before
> r3:=Invert(op(1,speq3),R);
> sp3:=multiseriesinverse([r3],scaleR,2,1,n,3,scalezeta);
This leads to the following estimate for the th coefficient of , first in terms of R and then in terms of by substituting R by its asymptotic expansion in terms of :
> n2:=harris_schoenfeld(h,x,R,op(1,speq3),3,scaleR):
> estnum2:=substitute(SERIEStoseries(multiseries(n2,scaleR,2,1,nops(scaleR[list]),3),'polynom'),scaleR,sp3,scalezeta,3);
From there we get the variance in terms of
> finalres:=`multiseries/expand`(x3/x1-(x2/x1)^2,[x1,x2,x3],[estden,estnum,estnum2],scalezeta,3);
and in terms of
> substitute(SERIEStoseries(",'polynom'),scalezeta,exp1,scalen,3);
Again, only the first term is given in the literature. Our method can compute as many terms as desired, as well as the expansions of higher moments.
Conclusion
Multiseries are specially useful for saddle-point asymptotics that arise frequently in combinatorial applications. Delicate expansions (that lead to errors in the technical literature) can be handled automatically. Such a process is specially useful for moment computations that involve minute saddle-point perturbations.