ROBUSTNESS IN RANDOM INTERCONNECTIONS GRAPHS
Philippe Flajolet, January 11, 1998
(Based on joint work with Kostas Hatzis, Patras)
Random graphs are usually studied by the probabilistic method that is based on inequalities and on approximations of random variables. Many models that appear in this context can however be subjected to analytic methods. This worksheet explores one such situation using Combstruct , Gfun , and the Maple system. The objective is to characterise the interplay between three parameters of a random graph: its density (number of edges), its robustness to link failures, and its connectivity by short paths.
A triple
, where
is a graph,
and
are two designated nodes (the source and the target), is said to be
-robust if
and
are connected by at least two edge-disjoint vertices of length exactly
. This definition captures an intuitive notion of robustness to link failures, since
and
will remain connected by "short" paths even in the event of a link (i.e., edge) becoming unavailable in the interconnection graph represented by
.
The random graph model is that of where the graph has vertices and each of the possible edges is independently taken with probability . Under this model, and because of the symmetry it implies with respect to the naming of nodes, it is strictly equivalent to examine robustness properties between a fixed source and destination or between a random pair of nodes . For definiteness, we adopt the latter formulation. A graph with very samll will not be robust, while a graph with close to 1 will have a high number of edge-disjoint paths of small length. The purpose is to evaluate the threshold value of from which -robustness becomes likely. In this worksheet, we focus on the problem of estimating the mean number of edge-disjoint pairs of length between a random source and a random target, and deduce the associated threshold.
The analysis proceeds in stages:
- Step 1 . Enumerate the number of pairs of paths of length connecting nodes and on a line, so that both paths use the same set of nodes .This is really a problem of counting so-called avoiding permutations defined as cyclic permutations with constraints on their so-called "succession gaps", i.e., the values of the differences between succesive values that are constrained not to belong to the set . This steps itself decomposes into:
S tep 1a . Enumerate so-called templates that are skelettons of permutations with a distinguished set of exceptions .
Step 1b . Transform the counting of templates into enumeration of permutations with a distinguished set of exceptions and conclude by means of the Principle of Inclusion-Exclusion , a classical combinatorial principle.
- Step 2 . Modify the model in order to allow for nodes taken from outside the segment . We then count avoiding paths that may borrow "outer nodes". This situation models the original random graph problem by allowing nodes taken from the whole pool of nodes that are available from the graph .
- Step 3 . Return to the original random graph problem and obtain the expected number of edge-disjoint pairs between a source and a destination in a random graph that obeys the model.
References .
[Comtet, 1974] : L. Comtet, Advanced Combinatorics , Reidel, 1974.
[EIS]: N. Sloane and S. Plouffe, The Encyclopedia of Integer Sequences , Academic Press, 1995.
This Maple worksheet is based on the current versions of the Maple packages combstruct and gfun (for version V.4) that can be found under http://www-rocq.inria.fr/algo/ .
> restart;
> with(combstruct);
> with(gfun);
1. Permutations with constrained succession gaps
The goal of this section is to enumerate permutations of that are cyclic, of the form , and with no succession gap in . We call such permutations avoiding permutations . In another language, what is the number of paths of length n-1 that join 1 to and have no edge in common with those of the line graph defined by the interval ?
There are no such paths for . Surprisingly enough, the first nontrivial configurations occur at (try it!), namely
{[1, 4, 2, 5, 3, 6], [1, 3, 5, 2, 4, 6]}
and for , one has
{[1, 3, 6, 4, 2, 5, 7], [1, 3, 5, 2, 6, 4, 7], [1, 4, 6, 2, 5, 3, 7], [1, 4, 2, 6, 3, 5, 7], [1, 5, 3, 6, 4, 2, 7],
[1, 5, 3, 6, 2, 4, 7], [1, 5, 2, 4, 6, 3, 7], [1, 4, 6, 3, 5, 2, 7], [1, 6, 3, 5, 2, 4, 7], [1, 6, 4, 2, 5, 3, 7]}
so that the numbers are
. A brute force enumeration routine based on the predefined structures availaible under
Combstruct
is given below.
1.1. Templates
A template is a scheme that specifies which "bits and pieces" of the interval may be exceptions. We firtst need to enumerate templates.
In combstruct, we specify templates as decompositions of the interval into blocks that are either
-- isolated points ,
-- blocks of contiguous unit intervals (based at integer points) oriented left-to-right ,
-- blocks of contiguous unit intervals (based at integer points) oriented right-to-left .
A template must start with an or block and end similarly with an or block.
For instance, for , the template
will correspond to any cyclic permutation that has successions of values (in the cycle traversal)
1,2; 2,3; 5,6; 11,10; 10,9; 12,13
as distinguished exceptions to the basic constraint of avoiding permutations.
The combinatorial specification
First a preliminary specification. Let be a binary alphabet. The collection of strings beginning and ending with a letter is described by
> sp0:=S=Prod(Sequence(Prod(a,Sequence(b))),a);
(It suffices to decompose according to each occurrence of the letter .)
Now, the three types of blocks in a template are described by
>
sp1:=Prod(begin_blockP,Z,end_blockP); sp2:=Prod(begin_blockLR,Z,Sequence(Prod(mu_length,Z),card>=1),end_blockLR);
sp3:=Prod(begin_blockRL,Sequence(Prod(mu_length,Z),card>=1),Z,end_blockRL);
Clearly, and are combinatorially isomorphic. For reasons related to application of the inclusion exclusion argument, we keep track of the size (number of nodes of the basic interval graph) as well as of the length of blocks and of their or character. Then, the grammar of templates is:
> Q:=subs([a=Union(sp1,sp2),b=sp3],sp0), begin_blockP=Epsilon, end_blockP=Epsilon, begin_blockLR=Epsilon, end_blockLR=Epsilon, begin_blockRL=Epsilon, end_blockRL=Epsilon, mu_length=Epsilon;
> temp15:=draw([S,{Q},unlabelled],size=15);
> decode:=proc(e) subs([Prod=proc() args end,Sequence=proc() args end,mu_length=NULL,Epsilon=NULL], e); ["] end;
> decode(temp15);
> seq(count([S,{Q},unlabelled],size=n),n=0..20);
Generating functions
A trivariate generating function immediately results from the specification. There, records size, records the total number of blocks (needed for subsequent permutation enumerations since blocks should be chained to each other), and records the total length of or blocks (the number of distiguished exceptions needed for inclusion exclusion).
> gfsolve({Q},unlabelled,z,[[u,begin_blockP,begin_blockLR,begin_blockRL],[v,mu_length]]);
> f_zuv:=subs(",S(z,u,v));
This GF can be checked by comparing its Taylor expansion and exhaustive listing, at least for small size values.
> map(expand,series(f_zuv,z));
The exhaustive listing is given by Combstruct[allstructs]
> map(decode,allstructs([S,{Q},unlabelled],size=3));
This corresponds to the correct listing
[[1],[2,3]]; [[1],[2],[3]]; [1,2,3]; [[1,2],3]
hence the monomial in the series expansion
1.2. Avoiding permutations
From templates to permutations
Let be the number of permutations of size with distinguished exceptions: is the number of permutations with distinguished special successions of the type or . Let be the number of permutations with no exception. Then, by inclusion exclusion, one has
.
(Roughly, one takes objects with at least 0 exception, subtract those with at least one exception, then add back those with at least two exceptions, etc.)
Let be the number of templates with a total of nodes, blocks, and a total length of the and blocks equal to . Then, by looking at all ways to connect the blocks, one has
, and , if ,
since any such connection is determined by an arbitrary permutation of the intermediate blocks.
The trivariate GF of the has been determined in the previous section. This shows a chain by which the are computable.
Observe that the extension of by linearity to an arbitrary series in is given by
.
That is to say, we just replace in expansions and apply the Euler integral
.
Thus, with = ,
the OGF satisfies
Generating functions
Start from the previously determined trivariate GF and apply the basic chain.
> f_zuv; map(expand,series(f_zuv,z));
For inclusion-exclusion, one must set :
> f_zu:=subs(v=-1,f_zuv);
Application of the -transformation (that count the number of ways to connect the blocks) requires the modified form
> k_zu:=normal(f_zu-(u-u^2)*subs(u=0,diff(f_zu,u)));
The OGF is here
> Q_z:=Int(k_zu*exp(-u)/u^2,u=0..infinity);
This solves the original counting problem numerically:
> Q_ser:=series(map(proc(e) int(e*exp(-u)/u^2,u=0..infinity) end,map(expand,convert(series(k_zu,z=0,15),polynom))),z,15);
Closed form . The quantity can be expressed in terms of the exponential integral:
> Q_z_closed:=eval(subs(Int=int,Q_z));
Since one deals with ordinary generating functions (OGF's), this is to be taken as a formal (asymptotic) series:
> map(normal,subs(z=1/y,Q_z_closed));map(simplify,asympt(",y,11));
.
The exponential integral ( ) involves the divergent series of factorials:
> Sum(n!*(-y)^(-n-1),n=0..infinity)=asympt(Ei(1,y)*exp(y),y,10);
The divergent series is also a hypergeometric series:
> series(hypergeom([1,1],[],z),z=0,9);
This gives rise to a general conversion procedure from Exponential integrals to hypergeometric forms:.
> convert_hypergeom:=proc(e) subs(Ei=proc(a,b) if a<>1 then ERROR(`unable to convert`) else exp(-b)/b*hypergeom([1,1],[],-1/b) fi end,e); simplify("); end;
Hence another closed form for the OGF of the
> Q_z_closed2:=convert_hypergeom(Q_z_closed);
Recurrences . Holonomic descriptions (by means of linear differential equations) can be obtained by Gfun[holexprtodiffeq] .
> Qz_ode0:=holexprtodiffeq(Q_z_closed2,Y(z));
> subs(Y(z)=Qz_ser,Qz_ode0):series(",z=0,11);
> Qz_ode:=subs(_C[0]=1,Qz_ode0);
The transformation to a linear recurrence is obtained by Gfun[diffeqtorec]
> Q_rec:=diffeqtorec({Qz_ode=0,Y(0)=0},Y(z),u(n));
This provides an algorithm that uses a linear number of arithmetic operations to determine the quantities .
> ha:=rectoproc(Q_rec,u(n),remember);
> ha(5);ha(50);ha(500);
Asymptotics.
An asymptotic pattern is easily inferred from numerical values..
> seq(ha(n)/(n-2)!*1.0,n=2..50);
> ha(500)/(498)!*1.0;
> exp(-2)=exp(-2.0);
And one can even guess the next term in an asymptotic expansion
> seq(n*(1-ha(n)/(n-2)!*exp(2.0)),n=6..50);
> nn:=500; ha(nn)/((nn-2)!*(1.0-2.0/nn)*exp(-2.0));
Thus, with reasonable certainty, we expect
.
The principle of a proof based on the generating function method is that
,
provided that the argument of the hypergeometric is a function that is analytic at the origin. Here, one has
> Q_z_closed2;
> -z*(z-1)/(1+z)=series(-z*(z-1)/(1+z),z=0,6);
so that the asymptotic proportion of legal permutations is provedly equal to :
> Qn_asy:=`Q `[n]/(n-2)!=exp(-2)*(1+O(1/n));
Results
We have thus obtained here a few original results regarding the enumeration of avoiding permutations.
Theorem 1 . The number of avoiding permutations has ordinary generating function:
> Q_z_closed=Q_z_closed2;
The coefficients satisfy the recurrence:
> Q_rec;
Accordingly, the ordinary generating function satisfies the corresponding linear ODE:
> map(factor,Qz_ode);
The coefficients satisfy the asymptotic estimate
> Qn_asy;
2. Paths with outer nodes
We now define avoiding paths . An avoiding path of size is a sequence of values from the set , where represents generically a node that does not belong to the interval , with the constraint that the initial value of the sequence is , the final value is , no successive value pairs can be of the type or , and no value from can occur more than once. Here are the sets of avoiding paths for sizes :
An avoiding path thus describes a path from to that is allowed to go through some "outer nodes". The goal is here to enumerate avoiding paths by size and by number of outer nodes conventionally represented by .
Let be the number of avoiding paths of size with outer nodes. Clearly, is the number of avoiding permutations. An extension of the earlier argument is thus needed.
2.1. Templates
One first needs templates on which an inclusion-exclusion argument is applied. The specifications are a simple modification of templates associated to avoiding permutations.
Let be a ternary alphabet. The collection of strings beginning with and and containing only one that occurs at the end is described by
> sp0:=S=Prod(Sequence(Prod(a,Sequence(b))),x);
(It suffices to decompose according to each occurrence of the letter .)
We first need so-called "outer points" that are taken from outer space.
> Outerpoints:=Sequence(Prod(Z,mu_outerpoint));
We also need "inner points".
> Innerpoints:=Sequence(Prod(Z,mu_innerpoint));
Size is defined as the cumulated number of points in the pair of paths that underlies an avoiding path in the sense above: it is thus equal to the length of the avoiding path plus the number of symbols corresponding to the outer nodes. We thus introduce a special notation for nodes of the integer line that are shared by the two paths:
> Z2:=Prod(Z,Z);
Now, the three types of blocks are described by
>
sp1:=Prod(mu_block,Z2,Outerpoints,Innerpoints); sp2:=Prod(mu_block,Z2,Sequence(Prod(mu_length,Z2),card>=1),Outerpoints,Innerpoints);
sp3:=Prod(Sequence(Prod(mu_length,Z2),card>=1),Z2,Outerpoints,Innerpoints,mu_block);
(Clearly, and are combinatorially isomorphic.)
The blocks that can occur at the end are of type and can only be of type or but without outer points nor innerpoints.
> sp1x:=Prod(mu_block,Z2);
> sp2x:=Prod(mu_block,Z2,Sequence(Prod(mu_length,Z2),card>=1));
Then, the grammar is:
> Q:=subs([a=Union(sp1,sp2),b=sp3,x=Union(sp1x,sp2x)],sp0), mu_block=Epsilon, mu_length=Epsilon,mu_outerpoint=Epsilon,mu_innerpoint=Epsilon;
> draw([S,{Q},unlabelled],size=8);
> seq(count([S,{Q},unlabelled],size=n),n=0..20);
> for j from 0 to 6 do allstructs([S,{Q},unlabelled],size=j) od;
2.2. Generating functions
The trivariate generating function immediately results from the specification.
> gfsolve({Q},unlabelled,z,[[u,mu_block],[v,mu_length],[w1,mu_outerpoint],[w2,mu_innerpoint]]);
For inclusion-exclusion, one must set v=-1:
> f_zu:=factor(subs(v=-1,subs(",S(z,u,v,w1,w2))));
Application of the -transformation (that counts the number of ways to connect the blocks) requires the modified form
> h_zu:=factor(normal(f_zu-(u-u^2)*subs(u=0,diff(f_zu,u))));
This solves the original counting problem:
> Q_ser:=series(map(proc(e) int(e*exp(-u)/u^2,u=0..infinity) end,map(expand,convert(series(h_zu,z=0,12),polynom))),z,15);
There, the coefficient of counts the number of avoiding paths with size and with outer nodes.
Closed form and recurrences . The OGF is here
> Q_z:=Int(h_zu*exp(-u)/u^2,u=0..infinity);
And this can be expressed in terms of the exponential integral:
> Q_z_closed:=eval(subs(Int=int,Q_z));
Again, there is an "explicit form" of the OGF the problem
> Qz_closed2:=map(factor,convert_hypergeom(Q_z_closed));
> Qz_closed3:=factor(Qz_closed2-subs(hypergeom=0,Qz_closed2))+factor(subs(hypergeom=0,Qz_closed2));
The diagonal, where and appear with equal exponents, is then easily extracted.
> Q_order:=20: Qz_ser:=map(normal,series(Qz_closed3,z=0,Q_order+2)):subs([w1=w,w2=t/w],Qz_ser): subs([w1=w,w2=t/w],Qz_ser): map(series,",w=0,2*Q_order+5):Qzt_ser:=map(coeff,",w,0);
We thus get results that generalize the counting of avoiding permutations.
> subs(t=0,Qzt_ser);subs(t=1,Qzt_ser);
2.3. Exhaustive listings
Here's a procedure that lists balanced avoiding pairs exhaustively (by generating all permutations and filtering the relevant ones) and derives the counting polynomials. This fully confirms the GF computations done previously.
>
test:=proc()
local i, perms, dontcares, poly, count1, p, p1, j, x, s, n, listing; n:=args[1];
perms:=combstruct[allstructs](Permutation([seq(i,i=2..n-1)]),size=n-2); dontcares := combstruct[allstructs](Subset({seq(i,i=2..n-1)}));
listing:={};
poly:=0;
for s in dontcares do
count1:=0;
for p in perms do p1:=[1,op(p),n];
for x in s do p1[x]:=-1; od;
for j from 1 to n-1 while abs(p1[j+1]-p1[j])<>1 do
od;
if j=n then count1:=count1+1; listing:=listing union {p1} fi;
od;
poly:=poly+count1*t^nops(s)/nops(s)!;
od;
if nargs>1 then RETURN(sort(listing)) fi;
RETURN(poly);
end;
> for j from 2 to 6 do j,test(j) od;
> for j from 2 to 6 do j,test(j,LIST_THEM_ALL) od;
> remove(has,test(6,LIST_THEM_ALL),-1);
> remove(has,test(7,LIST_THEM_ALL),-1);
These results confirm the GF computation of the previous subsection.
2.4. Explicit binomial formulae
We can now return to the GF of avoiding paths enumerated by size and number of outer nodes.
> Qz_closed3;
The coefficient of is obtained by straight expansion and avoiding paths are then enumerated by . The corresponding formulae are obtained directly by symbolic expansions (performed manually, though!)
> j:='j': C_formula:=C(n+2,j)=Sum(Sum( (-1)^(k1+k2)*(n-j-k1-k2)!*'B'(n-j-k1-k2,k1)* 'B'(n-j-k1+1,k2)*'B'(n-k1-k2,j)^2, k1=0..n-j-k2), k2=0..n-j);
where is the binomial, coefficient .
> Bin:=proc(n,k) if n<0 or k<0 then 0 else binomial(n,k) fi; end;
>
coef:=proc(n,j) option remember; local k1,k2;
add( add( (-1)^(k1+k2)*(n-j-k1-k2)!*Bin(n-j-k1-k2,k1)* Bin(n-j-k1+1,k2)*Bin(n-k1-k2,j)^2, k1=0..n-j-k2), k2=0..n-j); end;
>
Coef:=proc(n,j) local r; option remember; r:=coef(n-2,j);
if j=0 then r:=r-(-1)^n fi; r; end;
This gives the generating polynomials .
> co:=proc(n,t) local j; option remember; add(Coef(n,j)*t^j,j=0..n) end;
Here is a short table, again consistent with values obtained by exhaustive listing of small cases.
> for nn from 2 to 8 do 'co'(nn,t)=co(nn,t) od;
Results
Theorem 2 . The number of avoiding paths counted by size and by number of outer nodes satisfies for
> C_formula;
For , the formula simplifies and provides the number of avoiding permutations
> 'Q[n+2]'=subs(j=0,B(n-k1-k2,0)=1,op(2,"))-(-1)^n;
( denotes a binomial coefficient.)
3. The random graph problem
We conclude this section by showing how to estimate the robustness of connections to link failures between a source and a target in a random graph that obeys the model, where edges between nodes are chosen independently with probability .
An avoiding pair of length in a graph is an unordered pair of paths, each of length that may share some nodes but are edge disjoint.The mean number of (unordered) avoiding pairs in a random graph obeying the model between an arbitrary source and an arbitary destination is an indicator of the robustness of the graph to a single link failure. This mean value is:
> l:='l': Numpath_formula:=1/2*1/(n*(n-1))*Sum(C(l+1,j)*B(n,l+1+j)*(l+1+j)!,j=0..l)*p^(2*l);
The argument is as follows. The coefficient corresponds to the fact that one takes unordered pairs of paths, the coefficient averages over all sources and destinations, and the arrangement numbers account for the number of ways to embed an avoiding path into a graph by choosing certain nodes and assigning them in some order to an avoiding path.
The -coefficients are as provided by Theorem 2 and still means binomial coefficient.
The procedure that implements the formula for the mean number of paths is then
> means:=proc(n,p,l) local j; option remember; 1/2*1/(n*(n-1))*add(Coef(l+1,j)*binomial(n,l+1+j)*(l+1+j)!,j=0..l+1)*p^(2*l); expand("); if l<=12 then factor(") else " fi end;
> for ll from 1 to 10 do means(n,p,ll) od;
Here is an instance for a graph of nodes, a probability of , so that the mean node degree is about :
> seq(subs(n=10^5,p=5.0*10^(-5),[ll,means(n,p,ll)]),ll=2..16);
It appears that there is a sharp threshold between and . This is to be compared with the number of nodes reachable from the root of a tree with node degree in steps: is just below while exceeds
The probability threshold after which the mean number of unordered pairs exceeds 1 is given by
> p0:=proc(nn,l) local p; fsolve(subs(n=nn,means(n,p,l))=1,p,0..1) end;
And this quantity is best visualised in terms of the mean node degree:
> delta:=proc(n,p) (n-1)*p; end;
For a graph of size , we have
> seq([ll,delta(10^5,p0(10^5,ll))],ll=2..10);
The thresholds are fairly sharp as attested by plots of the mean values of the number of (unordered) avoiding pairs when the probability increases (the horizontal axis gives the mean degree ).
> nn:=10^5: plot([seq(means(nn,delta/(nn-1),ll),ll=3..8)],delta=0..20,mu=0..1,thickness=3);
(The curves from left to right correspond to path length = 8, 7, 6, 5, 4, 3.)
Asymptotics . For dense graphs and large size, a numerical pattern clearly emerges. Here is what happens for mean degree and size :
> seq(subs(n=10^5,p=1.0*10^(-4),[ll,means(n,p,ll)]),ll=2..20);
The mantissas all start with 0.49. This corresponds to a regime where the distance is small so that only dominant terms in the mean value polynomial intervene. The mean number of avoiding pairs is then approximately
> mu=1/2*n^(2*l)*p^(2*l+2)*(1+o(1));
or, in terms of the mean degree parameter :
> mu=1/2*delta^(2*l)*p^2;
This simplified formula explains the numerical regularities seen above, where the mean number increases by a factor of about when increases by 1.
Conversely, a clear example of nonasymptotic regime is provided by and , that is .
> nn:=10^3; dd:=10.0^(1/8); pp:=dd/(nn-1);for ll from 10 to 28 by 2 do l=ll,ExactMean=evalf(subs(n=nn,p=pp,means(n,p,ll)),5),ApproxMean=evalf(1/2*nn^(2*ll)*pp^(2*ll+2),5) od;
The ratio between the exact formula and the simplified approximation is typically of about 1 to 2 or 1 to 4, with the approximation being systematically optimistic. An explanation is that, apart from asymptotic approximations in , the simplified formula precisely neglects the exclusion effect on edges and this is permissible only in dense large graphs having high connectivity.
Conclusion
Permutations with constrained succession gaps are accessible to combstruct and gfun . This is achieved via the notion of templates that are nicely decomposable in conjunction with the inclusion-exclusion principle that is expressed by a simple integral transformation. Generating functions and recurrences result automatically and this leads to explicit binomial formulae. An application to robustness in a random interconnection graph derives from this approach and we have shown how to determine the mean number of edge-disjoint pairs between a source and a target in such a random graph.