Generating graphs that approach a prescribed modularity


Academic year: 2021


S. Trajanovski∗, F.A. Kuipers, J. Martín-Hernández, P. Van Mieghem

Delft University of Technology, Faculty of Electrical Engineering, Mathematics and Computer Science, P.O. Box 5031, 2600 GA Delft, The Netherlands

Abstract

Modularity is a quantitative measure for characterizing the existence of a community structure in a network. A network's modularity depends on the chosen partitioning of the network into communities, which makes finding the specific partition that leads to the maximum modularity a hard problem. In this paper, we prove that deciding whether a graph with a given number of links, number of communities, and modularity exists is NP-complete and subsequently propose a heuristic algorithm for generating graphs with a given modularity. Our graph generator allows constructing graphs with a given number of links and different topological properties.

The generator can be used in the broad field of modeling and analyzing clustered social or organizational networks.

Keywords: Modularity, Graph generator, Modeling community structure

1. Introduction

Community structure is observed in many real-world networks, such as (online) social networks, where groups of friends of a certain person are often also friends of each other. For instance, one group of friends could originate from the school community, another from the sports community, and yet another group could be living in the same neighborhood.

Detecting communities, or characterizing the level of community structure in a network, is difficult. The modularity metric, initially proposed by Newman and Girvan [1] to detect network communities, has attracted significant attention, e.g. see [2, 3, 4]. The maximum modularity expresses how clustered the network is and gives the resulting partitioning into the corresponding clustered communities. Modularity has its limitations in detecting community structure: for instance, communities smaller than a certain resolution limit may be undetectable [5], while larger sub-graphs may be partitioned even if they are random graphs [6]. Additionally, computing the maximum modularity of a given graph is an NP-complete problem, as was proved by Brandes et al. [2]. Nonetheless, modularity has remained a popular metric for representing community structure, and several heuristic algorithms for detecting maximum modularity [7, 4, 8] have been proposed.

Ever since the seminal work of Erdős and Rényi [9] on modeling and analyzing random graphs, various graph generators have been proposed. Graph generators are predominantly used to mimic existing networks, either so that a proper network abstraction can be analyzed or simply to test new algorithms and applications when the actual network is too big or not completely known. Popular graph generators include the:

• Erdős-Rényi random graph generator [9, 10] that generates networks with a binomial degree distribution and where links exist with a fixed probability p.

• Barabási-Albert power-law graph generator [11] and its variations [12, 13] that produce graphs with a power-law degree distribution. Power-law graphs are for instance used to reflect the Internet AS topology [14].

• Watts and Strogatz small-world graph generator [15], which was proposed to generate networks with high clustering coefficient and small diameter.

∗ Corresponding author.
Email addresses: S.Trajanovski@tudelft.nl (S. Trajanovski), F.A.Kuipers@tudelft.nl (F.A. Kuipers), J.MartinHernandez@tudelft.nl (J. Martín-Hernández), P.F.A.VanMieghem@tudelft.nl (P. Van Mieghem)

However, the proposed models produce graphs with low modularity, thus failing to match the strong community structure of social networks. To date, there does not exist any generator that produces graphs with a given number of communities and fixed modularity. This paper aims to fill this gap by proposing such a generator. Artificially generated graphs with a required modularity would offer the possibility to analyze community detection, information spreading, or robustness properties on an appropriate scale.

We study the problem of finding a graph G with a given modularity m, number L of links and number c of communities. As shown in the paper, the modularity m taken together with the number of communities c quantitatively indicates the presence or absence of communities. Our main contributions are:

(a) We prove that deciding whether a graph, with a modularity m, number L of links, and partitioning into c communities exists, is NP-complete.

(b) We analyze the influence of link rewiring strategies on the modularity of a graph.

(c) We propose a novel graph generator that produces graphs with a given number of communities and a modularity close to a prescribed value.


The paper is organized as follows. A short overview of the state-of-the-art on modularity, community detection and related graph generators is given in Section 2. The complexity of generating graphs with a given modularity is discussed in Section 3. Section 4 analyzes the effect of link rewiring on the modularity of a graph. Section 5 proposes a heuristic algorithm for generating network structures with a given modularity and number of communities. The properties of the generated graphs are discussed in Section 6. We conclude in Section 7.

2. Related Work

The modularity metric has been proposed by Newman and Girvan [1] as a global metric for quantifying community existence in networks. Subsequently, modularity has been explored as a metric for community detection in graphs and networks [16, 7, 8, 17, 18]. A thorough summary of the state-of-the-art in community detection in general and modularity in particular has been provided by Fortunato [19]. Brandes et al. [2] proved that finding the maximum modularity is an NP-complete problem. In addition, they proposed a linear programming (LP) technique for finding the maximum modularity. A similar LP-based approach for modularity maximization was proposed in [20]. In our previous work [21], we have determined a tight bound and the properties of the maximum modular graphs for a given number of links. A greedy algorithm that seeks local maxima has been given in [16]. Fast modularity-based community detection algorithms for very large networks have been proposed in [17, 8, 22]. Some weaknesses in modularity optimization have also been determined, such as the incapability to detect communities smaller than a resolution limit [5] or the breaking up of large random sub-graphs into separate communities [6]. A spectral analysis of the modularity as well as correlation with other metrics, such as assortativity [23, 24], has been conducted in [25].

Orman et al. [26] have made a qualitative comparison of community detection algorithms and surveyed the models for generating graphs with community structure. The model presented by Girvan and Newman [27] generates a network consisting of a small number of Erdős-Rényi graphs [9] that are weakly connected. A few other models with a larger number of communities have been proposed that lead to more realistic (e.g., power-law) degree distributions [28, 29]. Finally, models that produce weighted and undirected graphs with community overlap have been proposed by Lancichinetti and Fortunato [30].

Unlike previous work, we first prove the NP-completeness of deciding whether a graph with a given modularity, number of links and number of communities exists. To the best of our knowledge, our generator is the first in producing graphs with a given modularity, number of links and number of communities. Moreover, our generator returns the number of links per community, leaving space for leveraging other structural properties per community, such as the degree distribution.

3. Complexity of modular graph generation

For a certain partitioning of a network G of N nodes into c communities, modularity has been defined by Newman and Girvan [1] as a function of the graph's adjacency matrix values $a_{ij}$ and its node degrees $d_i$ for $i, j = 1, 2, \ldots, N$ as

$$m = \frac{1}{2L} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( a_{ij} - \frac{d_i d_j}{2L} \right) 1_{\{i, j \in \text{the same community}\}} \tag{1}$$

where we follow the notation introduced in [31].
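As an illustration of definition (1), the following is a minimal Python sketch that evaluates the double sum directly for a small graph. The function name `modularity_from_adjacency` and the example graph (two triangles joined by one link) are assumptions made for this illustration, not part of the paper.

```python
from fractions import Fraction


def modularity_from_adjacency(adj, community):
    """Evaluate Eq. (1) for a symmetric 0/1 adjacency matrix.

    `community[i]` is the (hypothetical) community label of node i.
    Exact rational arithmetic avoids floating-point noise.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_L = sum(degree)  # each link contributes 2 to the degree sum
    total = Fraction(0)
    for i in range(n):
        for j in range(n):
            if community[i] == community[j]:
                total += adj[i][j] - Fraction(degree[i] * degree[j], two_L)
    return total / two_L


# Illustrative graph: two triangles {0,1,2} and {3,4,5} joined by link (2,3).
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
community = [0, 0, 0, 1, 1, 1]
m = modularity_from_adjacency(adj, community)
print(m)  # 5/14
```

With L = 7 links and cumulative degree 7 per triangle, the sum reduces to 2(3/7 − 1/4) = 5/14, which is what the sketch prints.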

By considering the cumulative degree $D_{C_i}$, which is the sum of all the nodal degrees in community $C_i$; the total number $L_{C_i}$ of links within $C_i$; and the number $L_{inter}$ of links that connect nodes in different communities, the original form for the modularity (1) can be modified [25] into

$$m = 1 - \frac{1}{c} - \frac{L_{inter}}{L} - \frac{1}{2c} \sum_{j=1}^{c} \sum_{k=1}^{c} \left( \frac{D_{C_j} - D_{C_k}}{2L} \right)^2 \tag{2}$$

We use the term inter-community links to refer to links that connect nodes in different communities and the term intra-community links for those links where both end-points reside in the same community. For each community $C_i$ ($i = 1, \ldots, c$), the number of inter-community links, where exactly one node is in $C_i$, is denoted as $L_{out}^{C_i}$ and the number of intra-community links within $C_i$ as $L_{in}^{C_i}$. Because, from a degree perspective, all intra-community links in $C_i$ are counted twice, we have

$$D_{C_i} = 2L_{in}^{C_i} + L_{out}^{C_i}$$
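The link-count form (2) can be evaluated without building a graph. The sketch below is a hedged illustration: it computes (2) from per-community counts and cross-checks it against the per-community sum of $L_{in}^{C_i}/L - (D_{C_i}/2L)^2$, which follows from (1) by grouping node pairs per community. The function names and the sample counts are assumptions chosen for this example.

```python
from fractions import Fraction


def modularity_eq2(l_in, l_out):
    """Eq. (2) from intra-community counts l_in[i] and inter-community counts l_out[i]."""
    c = len(l_in)
    l_inter = Fraction(sum(l_out), 2)      # each inter-community link is seen from both ends
    L = sum(l_in) + l_inter
    D = [2 * a + b for a, b in zip(l_in, l_out)]  # D_Ci = 2 L_in + L_out
    spread = sum((Fraction(dj - dk) / (2 * L)) ** 2 for dj in D for dk in D)
    return 1 - Fraction(1, c) - l_inter / L - spread / (2 * c)


def modularity_per_community(l_in, l_out):
    """Same quantity written as a sum over communities (derived from Eq. (1))."""
    L = sum(l_in) + Fraction(sum(l_out), 2)
    return sum(
        Fraction(a) / L - (Fraction(2 * a + b) / (2 * L)) ** 2
        for a, b in zip(l_in, l_out)
    )


# Two triangles joined by a single link: 3 internal links each, 1 outgoing link each.
print(modularity_eq2([3, 3], [1, 1]))                 # 5/14
# Four isolated, equally sized communities reach the bound 1 - 1/c.
print(modularity_eq2([4, 4, 4, 4], [0, 0, 0, 0]))     # 3/4
```

Working on counts rather than on an explicit graph mirrors how the remainder of the section treats the problem.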

Over all possible partitions of G, the partitioning that leads to the highest modularity m is of general interest. Based on (2), an immediate conclusion is that maximum modularity is achieved by minimizing the number $L_{inter}$ of links that connect nodes in different communities, while keeping the cumulative degrees of the communities as equal as possible.

In order to gain more control over modularity-based community structure (and its weaknesses as exposed in [5, 6]), we consider the modularity m and the number of communities c as joint indicators for the community existence in a graph. For a fixed number c of communities, a rough upper bound for the modularity is $(1 - \frac{1}{c})$. The modularity value should therefore be interpreted based on the number of communities. For instance, for c = 2, a modularity value m = 0.48 would constitute a “highly clustered” network, while the same value for c = 5 could be interpreted as “medium clustered.” Theoretically, m < 1 and the asymptotic value of 1 is only achieved for an infinite number of fully isolated communities. However, we are interested in modularity maximization in connected networks.

We proceed to formalize the problem of graph construction with a given modularity. Using the fact that $\sum_{i=1}^{c} D_{C_i} = 2L$, we transform (2) into

$$\sum_{i=1}^{c} D_{C_i}^2 = \frac{4cL\,(L - L_{inter} - mL)}{\binom{c}{2} + 1} \tag{3}$$


We consider two variants of the graph generation problem, namely one where $L_{inter}$ is fixed, and the other in which it is not.

Problem 1. Find a graph G with a given total number L of links and corresponding partitioning into c communities, where the communities are connected by $L_{inter}$ links, for which the modularity of the generated graph equals m, i.e.

$$\begin{cases} \sum_{i=1}^{c} D_{C_i}^2 = \dfrac{4cL(L - L_{inter} - mL)}{\binom{c}{2} + 1} \\ D_{C_i} = 2L_{in}^{C_i} + L_{out}^{C_i} \\ \sum_{i=1}^{c} D_{C_i} = 2L \\ \sum_{i=1}^{c} L_{out}^{C_i} = 2L_{inter} \end{cases}$$

Problem 1 is equivalent to

Problem 1*. For given L, c, $L_{inter}$ and m, find a non-negative integer vector $\vec{L}_C = \{L_{in}^{C_i}, L_{out}^{C_i}\}_{i=1,\ldots,c}$ of 2c elements in total, such that

$$\begin{cases} \sum_{i=1}^{c} \left( 2L_{in}^{C_i} + L_{out}^{C_i} \right)^2 = \dfrac{4cL(L - L_{inter} - mL)}{\binom{c}{2} + 1} \\ \sum_{i=1}^{c} L_{out}^{C_i} = 2L_{inter} \\ \sum_{i=1}^{c} L_{in}^{C_i} = L - L_{inter} \end{cases}$$

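Relaxing the requirement for $\vec{L}_C$ to be an integer valued vector results in a convex quadratically constrained program, which can be solved in polynomial time (i.e., $\sum_{i=1}^{c} \left( 2L_{in}^{C_i} + L_{out}^{C_i} \right)^2 = \vec{L}_C^T P \vec{L}_C$, with P a $2c \times 2c$ matrix consisting of the sub-matrix $\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}$ along the diagonal and 0 for the other elements, as follows from expanding $(2L_{in}^{C_i} + L_{out}^{C_i})^2 = 4(L_{in}^{C_i})^2 + 4L_{in}^{C_i}L_{out}^{C_i} + (L_{out}^{C_i})^2$. Since P is positive semi-definite, the quadratic constraint is convex).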

Problem 2. Find a graph G with a given number of links L, a corresponding partitioning into c communities, and a given modularity m, such that

$$\begin{cases} 4cL\,L_{inter} + \left( \binom{c}{2} + 1 \right) \sum_{i=1}^{c} D_{C_i}^2 = 4cL^2(1 - m) \\ D_{C_i} = 2L_{in}^{C_i} + L_{out}^{C_i} \\ \sum_{i=1}^{c} D_{C_i} = 2L \\ \sum_{i=1}^{c} L_{out}^{C_i} = 2L_{inter} \end{cases}$$

Problem 2 is equivalent to

Problem 2*. For given L, c, and m, find a non-negative integer vector $\vec{L}_C = \{L_{in}^{C_i}, L_{out}^{C_i}\}_{i=1,\ldots,c}$ of 2c elements in total, such that

$$\begin{cases} 2cL \sum_{i=1}^{c} L_{out}^{C_i} + \left( \binom{c}{2} + 1 \right) \sum_{i=1}^{c} \left( 2L_{in}^{C_i} + L_{out}^{C_i} \right)^2 = 4cL^2(1 - m) \\ \sum_{i=1}^{c} \left( 2L_{in}^{C_i} + L_{out}^{C_i} \right) = 2L \end{cases}$$

Problem 2* is the problem of main interest in this paper and in the remainder we refer to it as the Modular Graph Existence (MGE) problem. A solution to the MGE problem does not constitute a graph, but gives the number of links inside and between communities. Based on this information, various instantiations of graphs might be possible. We will now prove that the MGE problem is NP-complete, even for a fixed partitioning c = 2 into two communities. We start with the following Lemma 1.

Lemma 1. For $x < \lfloor \sqrt{C} \rfloor$, $x^2 \equiv C \pmod{B}$ is equivalent to $x^2 + By = C$.

Proof. Let us assume that x is a solution of $x^2 \equiv C \pmod{B}$; then the pair $(x, y = \frac{C - x^2}{B})$ is a solution of $x^2 + By = C$, since $x^2 = Bk + C$ for some integer k and thus $x^2 + By = Bk + C + B\,\frac{C - Bk - C}{B} = C$. On the other hand, assuming that (x, y) is a solution of $x^2 + By = C$ and taking modulo B on both sides, using $(By) \bmod B = 0$, we arrive at $x^2 \equiv C \pmod{B}$, hence x is a solution.
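The equivalence in Lemma 1 can be checked by brute force on small instances. The sketch below is only an illustration under an explicit assumption: it restricts x to $0 \le x$ with $x^2 \le C$, so that $y = (C - x^2)/B$ is a non-negative integer. The helper names are hypothetical.

```python
from math import isqrt


def congruence_solutions(B, C):
    """All x with 0 <= x and x^2 <= C such that x^2 = C (mod B)."""
    return {x for x in range(isqrt(C) + 1) if (x * x - C) % B == 0}


def diophantine_solutions(B, C):
    """All pairs (x, y) of non-negative integers with x^2 + B*y = C."""
    solutions = set()
    for y in range(C // B + 1):
        rest = C - B * y
        x = isqrt(rest)
        if x * x == rest:          # rest is a perfect square
            solutions.add((x, y))
    return solutions


# Example: B = 6, C = 40 has the solutions (4, 4) and (2, 6).
print(sorted(diophantine_solutions(6, 40)))   # [(2, 6), (4, 4)]
print(sorted(congruence_solutions(6, 40)))    # [2, 4]
```

The two searches enumerate different variables (x in one, y in the other), yet they agree on the set of admissible x, as the lemma states.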

Lemma 1 shows that finding a solution to the quadratic Diophantine equation $x^2 + By = C$ is as hard as finding a solution to $x^2 \equiv C \pmod{B}$. This problem has been shown to be NP-complete by Manders and Adleman [32] even for few known factors of B, for instance with B an even number¹. Hence, the quadratic Diophantine problem $x^2 + By = C$ is NP-complete.

Theorem 2. The MGE problem, i.e. deciding whether a graph, with modularity m, number L of links, and a partitioning into c = 2 communities, exists, is NP-complete.

Proof. Given c = 2 and L, a solution to the MGE problem returns two integer numbers, namely $L_{in}^{C_1}$ and $L_{out}^{C_1}$ (where $L_{in}^{C_2} = L - L_{in}^{C_1} - L_{out}^{C_1}$ and $L_{out}^{C_2} = L_{out}^{C_1}$). Based on (2), it can be verified in polynomial time whether those numbers indeed lead to a modularity m, and hence the problem is in the class NP². To prove that the MGE problem is also NP-hard³, we demonstrate how solving the modular graph existence problem would present a solution to the NP-complete quadratic Diophantine problem, which asks whether an $x \in \mathbb{N}$ exists for which $x^2 + By = C$ holds with $B, C \in \mathbb{N}$ and B even. We proceed in two steps. First we translate, in polynomial time, the quadratic Diophantine problem into an MGE problem and subsequently demonstrate how a solution to that MGE problem can be translated back, in polynomial time, to a solution of the quadratic Diophantine problem.

1. Diophantine to MGE. Let us assume that we are looking for a solution (x, y) to $x^2 + By = C$ with B even, where the implicit factor of 2 does not affect the hardness of the problem. This problem translates to deciding whether a graph G exists with $L = \frac{B}{2}$ links and with modularity $m = \frac{1}{2} - \frac{C}{2L^2}$. If indeed a solution (x, y) exists, then a solution to MGE also exists where community $C_1$ contains $\frac{L - y + x}{2}$ links and community $C_2$ contains $\frac{L - y - x}{2}$ links, and where both communities are connected via y links. Indeed, based on the expression in (2), such a solution has L links and a modularity

$$m = 1 - \frac{1}{2} - \frac{y}{L} - \frac{1}{8L^2} \left( 2\,\frac{L - y + x}{2} - 2\,\frac{L - y - x}{2} \right)^2 = \frac{1}{2} - \frac{y}{L} - \frac{1}{8L^2}\,4x^2 = \frac{1}{2} - \frac{x^2 + 2Ly}{2L^2} = \frac{1}{2} - \frac{x^2 + By}{2L^2} = \frac{1}{2} - \frac{C}{2L^2}$$

2. MGE to Diophantine. Let us assume that the constraints of the MGE problem are satisfied, namely

$$\begin{cases} 4L \left( L_{out}^{C_1} + L_{out}^{C_2} \right) + 2 \left[ \left( 2L_{in}^{C_1} + L_{out}^{C_1} \right)^2 + \left( 2L_{in}^{C_2} + L_{out}^{C_2} \right)^2 \right] = 8L^2(1 - m) \\ \left( 2L_{in}^{C_1} + L_{out}^{C_1} \right) + \left( 2L_{in}^{C_2} + L_{out}^{C_2} \right) = 2L \end{cases}$$

Going back to the notation of $D_{C_i} = 2L_{in}^{C_i} + L_{out}^{C_i}$, i = 1, 2, and setting $y = L_{out}^{C_1} = L_{out}^{C_2}$ we have

$$\begin{cases} 4L(y + y) + 2 \left( D_{C_1}^2 + D_{C_2}^2 \right) = 8L^2(1 - m) \\ D_{C_1} + D_{C_2} = 2L \end{cases}$$

With $D_{C_2} = 2L - D_{C_1}$, where we choose $D_{C_1} \ge D_{C_2}$, we obtain

$$8Ly + 2 \left( D_{C_1}^2 + (2L - D_{C_1})^2 \right) = 8L^2(1 - m)$$

or

$$(D_{C_1} - L)^2 + 2Ly = L^2 - 2mL^2$$

From our initial Diophantine to MGE translation we have that $B = 2L$ and $C = L^2 - 2mL^2$, thus the solution to $x^2 + By = C$ is obtained from a solution to the corresponding MGE problem as $x = D_{C_1} - L$, and $y = L_{out}^{C_1}$, with $C_1$ the largest community.

¹ In the same paper [32], Manders and Adleman have also proved that finding a solution to the general quadratic Diophantine equation $Ax^2 + By = C$ is NP-complete.

² NP (non-deterministic polynomial time) refers to a class of problems whose solution correctness can be verified in polynomial time [33].

³ NP-hard problems refer to a class of problems that are “at least as hard as the hardest problems in NP,” and it is generally believed that they cannot be solved in polynomial time. NP-hard problems that themselves are in NP are called NP-complete [33].
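The two translation steps of the proof can be traced on a concrete instance. The following sketch is an illustration only: the function names are hypothetical, and the example (B = 10, C = 21, with solution x = 1, y = 2) is chosen so that $L - y \pm x$ is even and the link counts come out as integers.

```python
from fractions import Fraction


def modularity_two_communities(l_in1, l_in2, y):
    """Modularity of two communities with l_in1, l_in2 internal links and y links between them."""
    L = l_in1 + l_in2 + y
    d1, d2 = 2 * l_in1 + y, 2 * l_in2 + y      # cumulative degrees D_C1, D_C2
    return (Fraction(l_in1, L) - Fraction(d1, 2 * L) ** 2
            + Fraction(l_in2, L) - Fraction(d2, 2 * L) ** 2)


def diophantine_to_mge(B, C):
    """Step 1: map x^2 + B*y = C (B even) to an MGE instance (L, m)."""
    L = B // 2
    return L, Fraction(1, 2) - Fraction(C, 2 * L * L)


def witness_from_xy(x, y, L):
    """Link counts (L_in^C1, L_in^C2, L_out) built from a Diophantine solution."""
    return (L - y + x) // 2, (L - y - x) // 2, y


def mge_to_diophantine(l_in1, l_in2, y):
    """Step 2: recover (x, y, B, C) from an MGE solution."""
    L = l_in1 + l_in2 + y
    m = modularity_two_communities(l_in1, l_in2, y)
    x = (2 * l_in1 + y) - L                    # x = D_C1 - L
    return x, y, 2 * L, L * L - 2 * m * L * L  # B = 2L, C = L^2 - 2 m L^2


L, m = diophantine_to_mge(10, 21)              # 1^2 + 10*2 = 21
l_in1, l_in2, l_out = witness_from_xy(1, 2, L)
print(L, m, (l_in1, l_in2, l_out))             # 5 2/25 (2, 1, 2)
print(mge_to_diophantine(l_in1, l_in2, l_out)) # (1, 2, 10, Fraction(21, 1))
```

Both directions only manipulate a handful of integers, which is the point made in the paragraph that follows the proof.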

In our proof, we have relied on quantifying the number of links in and between communities that would lead to a given modularity and we have not relied on a possible graph realization. Although the difference is subtle, since the Diophantine problem depends on numbers, our reliance on link numbers instead of real links in a graph is crucial. Numbers can be stored in binary representation and therefore only grow logarithmically in the size of the input, while real links in a graph cannot be represented in binary notation (and are often represented via an adjacency matrix).

Within a community $C_i$, several (sub)-graph structures can be devised that obey the required number $L_{in}^{C_i}$ of links in the solution vector $\vec{L}_C = \{L_{in}^{C_i}, L_{out}^{C_i}\}_{i=1,\ldots,c}$ to the MGE problem. The denser (in terms of the average degree E[D]) this community graph is, the better it actually reflects a community, and the less likely it becomes that another partitioning would result in a higher modularity.

4. Changing the modularity via link rewiring

We identify three link rewiring steps, referred to as transformations, to change a graph's modularity.

Transformation 1. The modularity m of a graph G (partitioned into communities $C_i$) increases by replacing an inter-community link between $C_i$ and $C_j$ with an intra-community link in $C_i$ or $C_j$ (in Figure 1).

Figure 1: Replacing an inter-community link between $C_i$ and $C_j$ with an intra-community link in $C_j$ (Transformation 1).

The difference $\Delta m_1$ in modularity between G and the resulting graph G′ after having rewired is

$$\Delta m_1(G, D_{C_i}, D_{C_j}) = \frac{2L + D_{C_j} - D_{C_i} - 1}{2L^2}$$

The derivation of $\Delta m_1$ has been placed in the Appendix. Because the sum of all degrees equals twice the number of links, we have $D_{C_i} < 2L$ and $D_{C_j} \ge 1$. Therefore,

$$\Delta m_1(G, D_{C_i}, D_{C_j}) > \frac{2L + 1 - 2L - 1}{2L^2} = 0$$

The reverse operation, which decreases the modularity, is also possible, provided that we ensure that a rewiring does not disconnect the graph.
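The expression for $\Delta m_1$ can be compared against a direct evaluation of (2) before and after the rewiring. The sketch below is a hedged illustration with hypothetical names. It assumes the new intra-community link is placed in $C_i$, so that $D_{C_i}$ grows by 1 and $D_{C_j}$ shrinks by 1; with this convention the stated formula is reproduced, and placing the link in $C_j$ gives the same expression with the roles of i and j swapped.

```python
from fractions import Fraction


def modularity(D, l_inter):
    """Eq. (2) from the cumulative degrees D and the number of inter-community links."""
    c = len(D)
    L = Fraction(sum(D), 2)
    spread = sum(((dj - dk) / (2 * L)) ** 2 for dj in D for dk in D)
    return 1 - Fraction(1, c) - l_inter / L - spread / (2 * c)


def delta_m1(D, i, j):
    """Closed-form modularity gain of Transformation 1, as stated in the text."""
    L = Fraction(sum(D), 2)
    return (2 * L + D[j] - D[i] - 1) / (2 * L * L)


def apply_transformation1(D, l_inter, i, j):
    """Replace one C_i-C_j link by a link inside C_i (assumed convention)."""
    new_D = list(D)
    new_D[i] += 1      # loses one inter-link end, gains two intra-link ends
    new_D[j] -= 1      # loses one inter-link end
    return new_D, l_inter - 1


# Illustrative state: 3 communities, L = 12 links, 3 of them inter-community.
D, l_inter = [7, 9, 8], 3
new_D, new_l_inter = apply_transformation1(D, l_inter, 0, 1)
gain = modularity(new_D, new_l_inter) - modularity(D, l_inter)
print(gain, delta_m1(D, 0, 1))   # 25/288 25/288
```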

Transformation 2. If there are two communities $C_i$ and $C_j$, such that $D_{C_i} - D_{C_j} > 2$, then the modularity can be increased by moving an intra-community link from $C_i$ to $C_j$ (in Figure 2).

Figure 2: Replacing an intra-community link in $C_i$ with an intra-community link in $C_j$ (Transformation 2).

In this case, the number of inter-community links remains the same, while $D_{C_j}$ is increased by 2 and $D_{C_i}$ decreased by 2. The difference $\Delta m_2$ in modularity, as derived in the Appendix, after this transformation is

$$\Delta m_2(G, D_{C_i}, D_{C_j}) = \frac{D_{C_i} - D_{C_j} - 2}{L^2}$$

Transformation 2 demonstrates that the modularity of G increases by making the cumulative degrees $D_{C_i}$ of all the communities as close as possible.
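A direct evaluation of (2) gives a quick check on $\Delta m_2$. In the sketch below the denominator is taken to be $L^2$, which is what the calculation from (2) yields when $D_{C_i}$ drops by 2 and $D_{C_j}$ rises by 2; the function names and the sample degrees are assumptions for illustration.

```python
from fractions import Fraction


def modularity(D, l_inter):
    """Eq. (2) from the cumulative degrees D and the number of inter-community links."""
    c = len(D)
    L = Fraction(sum(D), 2)
    spread = sum(((dj - dk) / (2 * L)) ** 2 for dj in D for dk in D)
    return 1 - Fraction(1, c) - l_inter / L - spread / (2 * c)


def delta_m2(D, i, j):
    """Modularity change when one intra-community link moves from C_i to C_j."""
    L = Fraction(sum(D), 2)
    return (D[i] - D[j] - 2) / (L * L)


def apply_transformation2(D, i, j):
    """Move one intra-community link from C_i to C_j; inter-community links are untouched."""
    new_D = list(D)
    new_D[i] -= 2
    new_D[j] += 2
    return new_D


# Unbalanced illustrative state: L = 12 links, 3 inter-community links.
D, l_inter = [12, 6, 6], 3
new_D = apply_transformation2(D, 0, 1)
gain = modularity(new_D, l_inter) - modularity(D, l_inter)
print(gain, delta_m2(D, 0, 1))   # 1/36 1/36
```

When $D_{C_i} - D_{C_j} = 2$ the move merely swaps the two degrees and the modularity is unchanged, which matches the strict inequality in the condition of Transformation 2.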

Transformation 3. The modularity of a graph G increases by replacing an inter-community link between $C_i$ and $C_j$ with an intra-community link in a third community $C_k$, if $2L + D_{C_i} + D_{C_j} > 2D_{C_k} + 3$ (in Figure 3).

Figure 3: Replacing an inter-community link between $C_i$ and $C_j$ with an intra-community link in a third community $C_k$ (Transformation 3).

As demonstrated in the Appendix, the difference between the modularity of G and the resulting graph G′ is

$$\Delta m_3(G, D_{C_i}, D_{C_j}, D_{C_k}) = \frac{2L + D_{C_i} + D_{C_j} - 2D_{C_k} - 3}{2L^2} > 0$$

Transformation 3 is in fact obtained by consecutively applying Transformations 1 and 2.
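The composition claim can be illustrated numerically. The sketch below assumes that Transformation 1 first places the new intra-community link in $C_i$ and that Transformation 2 then moves that link from $C_i$ to $C_k$; it also assumes $\Delta m_2 = (D_{C_i} - D_{C_j} - 2)/L^2$. Names and sample values are hypothetical.

```python
from fractions import Fraction


def modularity(D, l_inter):
    """Eq. (2) from the cumulative degrees D and the number of inter-community links."""
    c = len(D)
    L = Fraction(sum(D), 2)
    spread = sum(((dj - dk) / (2 * L)) ** 2 for dj in D for dk in D)
    return 1 - Fraction(1, c) - l_inter / L - spread / (2 * c)


def delta_m1(D, i, j):
    L = Fraction(sum(D), 2)
    return (2 * L + D[j] - D[i] - 1) / (2 * L * L)


def delta_m2(D, i, j):
    L = Fraction(sum(D), 2)
    return (D[i] - D[j] - 2) / (L * L)


def delta_m3(D, i, j, k):
    L = Fraction(sum(D), 2)
    return (2 * L + D[i] + D[j] - 2 * D[k] - 3) / (2 * L * L)


D, l_inter = [7, 9, 8], 3       # L = 12 links
after_t1 = [8, 8, 8]            # C_0-C_1 link replaced by a link inside C_0
after_t3 = [6, 8, 10]           # that link then moved from C_0 to C_2

direct = modularity(after_t3, l_inter - 1) - modularity(D, l_inter)
composed = delta_m1(D, 0, 1) + delta_m2(after_t1, 0, 2)
print(direct, composed, delta_m3(D, 0, 1, 2))   # 7/96 7/96 7/96
```

In this example the second step lowers the modularity slightly (by 1/72), yet the net change stays positive, in line with the condition stated in Transformation 3.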

In our proposed graph generator TMGG, explained in Section 5, we start with an initial graph and subsequently apply the transformations until we reach the desired modularity. We propose to start with the connected graph (determined in our previous work [21]) of L links and c communities that has maximum modularity

$$m_{max} = 1 - \frac{1}{c} - \frac{c - 1}{L} - \begin{cases} \dfrac{1}{2L^2}, & r = 0 \\ \dfrac{r(c - 2r)}{2cL^2}, & 1 \le r \le \lfloor \frac{c}{2} \rfloor \\ \dfrac{(c - r)(2r - c)}{2cL^2}, & \lfloor \frac{c}{2} \rfloor < r \le c - 1 \end{cases}$$

where $r = L \bmod c$.
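The piecewise expression for $m_{max}$ is straightforward to evaluate. The sketch below is a minimal illustration (the function name `max_modularity` is hypothetical); it uses exact fractions so that the small correction term is not lost to rounding.

```python
from fractions import Fraction


def max_modularity(L, c):
    """Maximum modularity of a connected graph with L links and c communities,
    following the piecewise expression above with r = L mod c."""
    r = L % c
    base = 1 - Fraction(1, c) - Fraction(c - 1, L)
    if r == 0:
        correction = Fraction(1, 2 * L * L)
    elif r <= c // 2:
        correction = Fraction(r * (c - 2 * r), 2 * c * L * L)
    else:
        correction = Fraction((c - r) * (2 * r - c), 2 * c * L * L)
    return base - correction


# L = 1000, c = 5 gives a value that rounds to 0.796.
print(float(max_modularity(1000, 5)))   # 0.7959995
# Small cases that can be checked by hand from the link counts.
print(max_modularity(4, 3), max_modularity(5, 3))   # 5/32 13/50
```

For L = 4, c = 3 the value 5/32 corresponds to internal link counts (1, 0, 1) on a path of three communities, and for L = 5, c = 3 the value 13/50 corresponds to (1, 1, 1).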

5. Tunable modularity graph generator

Let us denote by community graph the abstraction where a node reflects one community and a link connects two nodes from different communities. In this section, we propose the Tunable Modularity Graph Generator (TMGG) algorithm that generates graphs with a given modularity m and number c of partitions. Our generator starts by generating a graph of maximum attainable modularity for a given L and c in Initialize. The initial community graph is a tree with no more than 1 link between two communities. We subsequently use Transformations 1 and 2 (in ReplaceInternalExternal and ShiftInternal, respectively) to increase/decrease the modularity towards the desired modularity m.

Algorithm 1: Initialize
input : Number L of links, number c of communities
output: Max modularity $m_{max} = \max\{m(L, c)\}$, initial community graph C, initial internal link sums $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$
 1  $r \leftarrow L \bmod c$, $k \leftarrow \lfloor L/c \rfloor$, $m_{max} \leftarrow 1 - \frac{1}{c} - \frac{c-1}{L}$;
 2  $L_{in}^{C_1} \leftarrow k$, $i \leftarrow 2$;
 3  if $r == 0$ then
 4      while $i \le c$ do
 5          C: create a link $(i - 1, i)$
 6          $L_{in}^{C_i} \leftarrow k - 1$, $i \leftarrow i + 1$;
 7      $m_{max} \leftarrow m_{max} - \frac{1}{2L^2}$
 8  else if $r \le \lfloor c/2 \rfloor$ then
 9      while $i \le c - r$ do
10          C: create a link $(i - 1, i)$
11          $L_{in}^{C_i} \leftarrow k - 1$;
12          if $i \le r$ then
13              C: create a link $(i, c - i + 1)$;
14              $L_{in}^{C_{c-i+1}} \leftarrow k$;
15          $i \leftarrow i + 1$;
16      $L_{in}^{C_i} \leftarrow k$, $m_{max} \leftarrow m_{max} - \frac{r(c - 2r)}{2cL^2}$
17  else
18      while $i \le r$ do
19          C: create a link $(i - 1, i)$
20          $L_{in}^{C_i} \leftarrow k$;
21          if $i \le c - r$ then
22              C: create a link $(i, c - i + 1)$;
23              $L_{in}^{C_i} \leftarrow L_{in}^{C_i} - 1$, $L_{in}^{C_{c-i+1}} \leftarrow k$;
24          $i \leftarrow i + 1$;
25      $m_{max} \leftarrow m_{max} - \frac{(c - r)(2r - c)}{2cL^2}$;

We vary the order of using these transformations, resulting in three generator variants:

• StartReplacing

• StartShifting

• Random

All generator variants use Initialize to construct a community graph of maximum attainable modularity $m_{max}$ for a given L and c. Variant StartReplacing (lines 6-11 in TMGG) starts by applying procedure ReplaceInternalExternal to the community graph to establish a modularity close to the interval $[m - \epsilon, m + \epsilon]$. If the obtained modularity fluctuates twice around the interval $[m - \epsilon, m + \epsilon]$ (explained later in this section), StartReplacing continues with the procedure ShiftInternal (lines 10-11 in TMGG). As soon as the range […]


Procedure ReplaceInternalExternal (Transformation 1)
input : Number L of links, number c of communities, desired modularity m, the current modularity $m_{cur}$, the current modularity change $\Delta m_{cur}$, the current state ∈ {1, 2}, internal link sums $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$
 1  find i and j, such that $\Delta m_1(G, D_{C_i}, D_{C_j})$ is minimum;
 2  if $m_{cur} > m$ then  // in state 1
 3      if state == 2 and $\Delta m_1(G, D_{C_i}, D_{C_j}) \ge \Delta m_{cur}$ then return false;
 4      if $L_{in}^{C_j} == 0$ then break;
 5      C: add 1 link between $C_i$ and $C_j$;
 6      $\Delta m_{cur} \leftarrow \Delta m_1(G, D_{C_i}, D_{C_j})$, $m_{cur} \leftarrow m_{cur} - \Delta m_{cur}$;
 7      $L_{in}^{C_j} \leftarrow L_{in}^{C_j} - 1$, state ← 1;
 8  else  // in state 2
 9      if state == 1 and $\Delta m_1(G, D_{C_i}, D_{C_j}) \ge \Delta m_{cur}$ then return false;
10      $\Delta m_{cur} \leftarrow \Delta m_1(G, D_{C_i}, D_{C_j})$, $m_{cur} \leftarrow m_{cur} + \Delta m_{cur}$;
11      if ∃! a link between $C_i$ and $C_j$ then break;
12      C: remove 1 link between $C_i$ and $C_j$ if C is still connected; otherwise break;
13      $L_{in}^{C_j} \leftarrow L_{in}^{C_j} + 1$, state ← 2;
14  return true

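Procedure ShiftInternal (Transformation 2)
input : Number L of links, number c of communities, desired modularity m, the current modularity $m_{cur}$, the current modularity change $\Delta m_{cur}$, the current state ∈ {1, 2}, internal link sums $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$
 1  find i and j, such that $\Delta m_2(G, D_{C_i}, D_{C_j})$ is minimum;
 2  if $m_{cur} > m$ then  // in state 1
 3      if state == 2 and $\Delta m_2(G, D_{C_i}, D_{C_j}) \ge \Delta m_{cur}$ then return false;
 4      $\Delta m_{cur} \leftarrow \Delta m_2(G, D_{C_i}, D_{C_j})$, $m_{cur} \leftarrow m_{cur} - \Delta m_{cur}$;
 5      $L_{in}^{C_i} \leftarrow L_{in}^{C_i} + 1$, $L_{in}^{C_j} \leftarrow L_{in}^{C_j} - 1$, state ← 1;
 6  else  // in state 2
 7      if state == 1 and $\Delta m_2(G, D_{C_i}, D_{C_j}) \ge \Delta m_{cur}$ then return false;
 8      $\Delta m_{cur} \leftarrow \Delta m_2(G, D_{C_i}, D_{C_j})$, $m_{cur} \leftarrow m_{cur} + \Delta m_{cur}$;
 9      $L_{in}^{C_i} \leftarrow L_{in}^{C_i} - 1$, $L_{in}^{C_j} \leftarrow L_{in}^{C_j} + 1$, state ← 2;
10  return true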

StartShifting (lines 12-17 in TMGG) tries to obtain a modularity in the interval $[m - \epsilon, m + \epsilon]$, but with a reversed order of the procedures as in StartReplacing. First, the procedure ShiftInternal is preferred over ReplaceInternalExternal. Finally, the last variant Random (lines 18-23 in algorithm TMGG) randomly chooses one of the procedures ReplaceInternalExternal (with a certain probability p) and ShiftInternal (with probability (1 − p)) until a value in the interval $[m - \epsilon, m + \epsilon]$ is achieved.

For a very small value of $\epsilon$, a modularity in $[m - \epsilon, m + \epsilon]$ may not be found. The termination condition takes effect when in consecutive (link rewiring) transformations the modularity value alternately goes below and above the interval $[m - \epsilon, m + \epsilon]$ (lines 3 and 9 in ReplaceInternalExternal; lines 3 and 7 in ShiftInternal; and line 25 in TMGG), without getting closer to that interval. In the algorithm, this is reflected by the current modularity going from state 1 (above m) to 2 (below m) or vice versa twice in a row. Hence, TMGG either finds a modularity in the interval $[m - \epsilon, m + \epsilon]$ (as it “converges” towards the interval) or it terminates when no further improvements are observed in four consecutive transformations. All three variants StartReplacing, StartShifting and Random return the community graph, i.e., a family of graphs or the topology between communities and the number of links within each community. Based on the output, we are able to construct arbitrary graphs with a given number of links for each community. The topological differences of the resulting graphs are studied in Section 6.

5.1. Algorithm complexity and accuracy

The algorithm variants approach the given value m with different speed and accuracy. In the paper, we use the probability p = 0.5 in the variant Random, leading to an equal probability in choosing between ReplaceInternalExternal and ShiftInternal. Since ReplaceInternalExternal is chosen with probability p, for p ≈ 1 Random would be closer to the StartReplacing variant, and for p ≈ 0 Random would be closer to the StartShifting variant. Figure 4 presents the speed, in terms of number of iteration steps, at which the three algorithm variants approach the requested modularity m. One iteration step corresponds to a single modularity change in the TMGG variants.

[Figure 4 plot: modularity (vertical axis) versus iteration step (horizontal axis) for StartReplacing, Random (p = 0.5) and StartShifting, with the maximum modularity $m_{max}$ and the desired modularity m marked.]

Figure 4: Approaching speed of algorithm variants, with L = 1000, c = 5, m = 0.655, $m_{max} = 0.796$ and $\epsilon = 5 \cdot 10^{-3}$. One iteration step corresponds to a single modularity change in the TMGG variants.

Algorithm 2: TMGG
input : Number L of links, number c of communities, desired modularity m, variant algVariant, probability p
output: community graph C, internal link sums $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$
 1  $[m_{max}, C, \{L_{in}^{C_i}\}_{i=1,\ldots,c}]$ ← Initialize(L, c);
 2  $m_{cur} \leftarrow m_{max}$;
 3  if $m_{cur} + \epsilon < m$ then return There is no graph with modularity in $[m - \epsilon, m + \epsilon]$;
 4  $\Delta m_{cur} \leftarrow +\infty$, state ← 0, approachM ← true;
 5  switch algVariant do
 6      case StartReplacing  // try 1st Transformation 1 then 2
 7          while $|m_{cur} - m| > \epsilon$ and approachM == true do
 8              approachM ← ReplaceInternalExternal(L, c, m, $m_{cur}$, $\Delta m_{cur}$, state, $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$);
 9          approachM ← true;
10          while $|m_{cur} - m| > \epsilon$ and approachM == true do
11              approachM ← ShiftInternal(L, c, m, $m_{cur}$, $\Delta m_{cur}$, state, $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$);
12      case StartShifting  // try 1st Transformation 2 then 1
13          while $|m_{cur} - m| > \epsilon$ and approachM == true do
14              approachM ← ShiftInternal(L, c, m, $m_{cur}$, $\Delta m_{cur}$, state, $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$);
15          approachM ← true;
16          while $|m_{cur} - m| > \epsilon$ and approachM == true do
17              approachM ← ReplaceInternalExternal(L, c, m, $m_{cur}$, $\Delta m_{cur}$, state, $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$);
18      case Random  // choose randomly Transformation 1 or 2
19          while $|m_{cur} - m| > \epsilon$ and approachM == true do
20              choose randomly 1) with probability p OR 2) with probability (1 − p):
21              1) approachM ← ReplaceInternalExternal(L, c, m, $m_{cur}$, $\Delta m_{cur}$, state, $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$);
22              2) approachM ← ShiftInternal(L, c, m, $m_{cur}$, $\Delta m_{cur}$, state, $\{L_{in}^{C_i}\}_{i=1,\ldots,c}$);
23              if the procedure has changed then state ← 0; approachM ← true;
24              otherwise break;
25  if approachM == false then return There is no graph with modularity in $[m - \epsilon, m + \epsilon]$;

The variant StartReplacing reaches m in the smallest number of iterations, which is expected because its modularity change $\Delta m_1 = O(1/L)$ is bigger than the modularity change $\Delta m_2 = O(1/L^2)$ in StartShifting. Regarding the time complexity, all three variants start with Initialize, which “costs” O(c). If we denote by $m_{start}$ the initial modularity obtained after Initialize, we obtain the time complexity of StartReplacing as

$$O(\text{StartReplacing}) = \frac{m_{start} - m - \epsilon}{O(1/2L)} = O((m_{start} - m - \epsilon)L)$$

Similarly, the time complexity of StartShifting is

$$O(\text{StartShifting}) = \frac{m_{start} - m - \epsilon}{O(1/2L^2)} = O((m_{start} - m - \epsilon)L^2)$$

Moreover, because $\Delta m_2 < \Delta m_1$, we have a better accuracy in StartShifting. The variant Random is in between StartShifting and StartReplacing, in terms of the approaching speed, the time-complexity and the accuracy. The modularity of the produced graph, if one is returned, differs from the desired modularity m by at most $\pm\epsilon$ in all three variants. The smaller $\epsilon$, the higher the accuracy. Figure 4 illustrates that both StartReplacing and Random variants attain the modularity m linearly, as opposed to a “non-linear” ($\Delta m_2 = O(1/L^2)$) decrease for the variant StartShifting.

6. Properties of the obtained graphs

The three algorithm variants generate community graphs with different topological properties.

6.1. Topological properties

The variant StartShifting ends up with a community graph that has a very small number of inter-community links. In most of the cases, the community graph is a tree or very close to a tree. On the other hand, there are just a few (usually only one) communities with a very high number of links, while all the other communities have a similar number of links. Unlike StartShifting, the StartReplacing variant generates graphs with a higher number of inter-community links, but all the communities have a similar number of intra-community links (communities of similar size). These properties are exhibited in Figure 5. When comparing the number of inter-community links, the variant Random (p = 0.5) is somewhere in between StartShifting and StartReplacing.

Table ?? shows the difference in topological metrics for the three graphs produced by the three variants for given values of L, c, m and $\epsilon$. The variant Random (p = 0.5) has val-
321 312 305 306 319 289 290 110 128 170 171 (a) StartReplacing pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pppppppp pp pp pp pp pp pp pp pp pp pp pp pppp pp pppp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pppp pp pp pp pppppp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pppp pp pp pp pp pp pppppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp 
pp pp pp pp pp pp pppp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pppppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp 202229226 218 225 213 228214 231 239 241 215 230 217 208 103 79 112 13 129 93 4 197 200 199 198 220233 209 222 238 243 240 232 207 219 211 227 236 205 210 235 234 212 242 138 97 139 53 85 142 144 50 182 196 189 165 195 188190 193 194 152 171 186168 169163 158 151 175 155 178 192 162 187 157 3 223221 216 206 203 224 204 244 179 173 150 183 149 185160 164 191 156 184 153 159 167 176 172 174 177 170 181 180 166 161 154 312 334 303 325 320 301328 309 338 321 322 299 336 332302 311 289 282 280 256 272 292 261278 290 268 250 326 318 327 340 335 295 339 281 291 262 271 246 276 287 293 275 274 288 284 253 245 285 251 257 260 263 267 249270 247 248 279 266 273 294 313 324 343 306 308 329 341 24 28 69 128 106 122 147 84 35 70 104 37 89 92 116 140 68 48 44 117 141 11325 96 52 56 95 124 41 5972 49 125 10 22 46 315 296 305 317 307 304 310 201 51 5 277 283 258 252 39 265 331 342 300 333 323 319 330 6 269 254286 264 255259 237 298 297 316 314 337 131 98 63 82 111 135 60 21 88 114 14374 90 145 107 118 2 86 146 34 31 80 17 55110 76 2775 109 130 77 14 18 
67 78 108 58 121 87 133 105 38 40 45 94 100 1 132 71 36 15 19 33 5411 7 136 16 14812 61 30 81 47 83 91 126 119 9 26 62 73 134 99 66 123 23 42 8 29 120 43 32 57 101 127 64102 65 20 115 137 (b) Random (p = 0.5) pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pppppp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pppppp pp pppp pp pppp pppp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pppp pp pppp pp pp pppppp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pppp pppp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp 
pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pppp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppp pp pp pp pppppppp pp pp pppp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pppppp pp pp pp pp pp pp pp pp pppp pp pp pp pp pp pppp pp pp 28 20 29 41 1 16 4 18 37 34 14 24 26 21 25 35 10 39 8 31 3833 19 9 7 12 36 17 5 6 2 11 23 40 15 13 27 322 32 88 66 64 86 45 52 84 85 43 50 42 82 30 276 235 171 202 282 256 184 143 46 215 218 237 48 49 76 57 83 73 53 79 81 63 60 74 67 77 58 62 80 44 69 65 61 56 87 54 75 68 71 51 70 55 72 59 78 47 400 368 401 398 367 399 380 370 366 396 387 386 406 394 378 390 391381 395 369 402 374384 372 376 373 379 377 389 393 397 392 375 405404403 388382 385 383 371 203 289 265 144 158 251 283 182 141 288 163 181 157 95 247 234 285 108 314 167 209 191 319 189 162 112 232 116 326 105 310 269 219 244 97 111 321 103 115 258 169 305 173 271 250 273 267 166 94 92 168 164 216 233190 223 186 311 227 179 99 106 30789 151 313 93 287 255 136 109 262 253 306 152 217 135 96 309 213 294 133242 126 139 222 324 180 194 150 246 303 268 140 128 327104137 
257 323 214 102 208 129 318 146 238 122 316266 297 293 138 176 295 188170 123 165 160308130 107 260110 220 228 290 134 281 322 245 90 236 117 270 201 312 113 174118 300 298 183 196 148 187 302 264 291 243 211 98 239 154 274 159 210 229 155 193 230 206 120 249 254 261 161296 124100 205 299 127 315 342 340 349 345 362 331 350 346 353 347 365 332 338 330 329 343 361 341 352 363 337 334 359 333 339 360 355 364 344 354 357 336 348 356 351 335 358 131 192224 221 172 248 145 301 195 278 231 101 241 200 199207 259 263 212 91 156 284 198 292 119 272 304 325 317277178286 252 175 142225 132 177197 279 185 328 114 149 153 320 121 240 280 147 204 125 275 226 (c) StartShifting

Figure 5: Graphs returned by the three algorithm variants (L= 1000, c = 5, m = 0.655 and  = 5 · 10−3).

Table 1: Topological metrics of the three returned graphs (L= 1000, c = 5, m = 0.655 and  = 5 · 10−3).

Algorithm variant E[D] C E[H] ρD µN−1 λ1 K

StartReplacing 5.88 0.355 3.70 -0.06 0.041 9.87 84%

Random (p = 0.5) 5.67 0.167 4.00 -0.04 0.036 8.84 86%

StartShifting 4.93 0.151 5.26 -0.01 ≈ 0 6.52 95%

ues for StartReplacing and StartShifting. In general, the vari-ant StartReplacing (StartShifting) produces graphs with the highest (lowest) average degree E[D]; the highest (lowest)

av-erage clustering coefficient C; the lowest (highest) average

hop-count E[H]; the highest (lowest) algebraic connectivity µN−1;

the highest (lowest) spectral radius λ1; and the smallest (largest)

assortativity ρD.

We define the modularity quality coefficient $K = m / m_{\max}$ as the ratio between the desired modularity m and the maximum modularity $m_{\max}$ of the obtained graph. We compute $m_{\max}$ with Newman's algorithm [16] because, as stated before, finding $m_{\max}$ is also an NP-complete problem [2]. Because $m_{\max}$ is the maximum for a given graph with an unknown number c of communities, we have K ∈ [0, 1]. The higher K, the more likely the original number c of communities is preserved. Table 1 (the last column) shows that the StartShifting variant has produced the graph with the largest K, due to the small number of inter-community links and the "higher link density" within the communities, followed by Random (p = 0.5) and StartReplacing.
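The coefficient K can be illustrated with a small sketch. The code below is not the procedure used for Table 1: it assumes the usual modularity form (sum over communities of the intra-link fraction minus the squared degree fraction) and uses a naive greedy agglomeration as a stand-in for the heuristic of [16]. The function names `greedy_max_modularity` and `quality_coefficient` are hypothetical, and the edge list is assumed non-empty with comparable node labels.

```python
def greedy_max_modularity(edges):
    """Estimate the maximum modularity of an undirected simple graph by naive
    greedy agglomeration: start from singleton communities, repeatedly merge
    the linked pair with the largest modularity gain, keep the best value."""
    L = len(edges)
    comm = {}   # node -> label of its current community
    deg = {}    # community label -> sum of the degrees of its nodes
    for u, v in edges:
        for node in (u, v):
            comm.setdefault(node, node)
            deg[node] = deg.get(node, 0) + 1
    # Singleton partition: no intra-community links, only the degree term.
    q = -sum((d / (2 * L)) ** 2 for d in deg.values())
    best = q
    while len(deg) > 1:
        # Count the links between every pair of current communities.
        between = {}
        for u, v in edges:
            a, b = comm[u], comm[v]
            if a != b:
                pair = (a, b) if a < b else (b, a)
                between[pair] = between.get(pair, 0) + 1
        if not between:
            break  # no linked pairs left: any further merge lowers modularity
        # Merging a and b changes modularity by L_ab / L - D_a * D_b / (2 L^2).
        gain, (a, b) = max(
            ((n / L - deg[a] * deg[b] / (2 * L * L), (a, b))
             for (a, b), n in between.items()),
            key=lambda item: item[0],
        )
        for node, label in comm.items():
            if label == b:
                comm[node] = a
        deg[a] += deg.pop(b)
        q += gain
        best = max(best, q)
    return best


def quality_coefficient(m_desired, edges):
    """K = m / m_max, with m_max estimated by the greedy heuristic above."""
    return m_desired / greedy_max_modularity(edges)


# Two triangles joined by a single link, with a desired modularity of 0.3.
example_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
example_k = quality_coefficient(0.3, example_edges)
```

Because the heuristic only returns a lower bound on the true maximum, a value of K computed this way can overestimate the exact ratio.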

[Figure 6 plot: average clustering coefficient (C), axis range 0.1 to 0.45, versus modularity (m), axis range 0.3 to 0.8, for StartReplacing, Random (p = 0.5) and StartShifting.]

Figure 6: Clustering coefficient C as a function of the desired modularity value m for the algorithm variants with L = 1000, c = 5 and $\epsilon = 5 \cdot 10^{-3}$. Internally, the communities are constructed as random graphs.

In Figure 6, we display the relation between the average clustering coefficient and the desired modularity. The average clustering coefficient reflects to what extent nodes tend to cluster together and depends on the number of triangles in a graph. Figure 6 shows a linear relation between the modularity and the average clustering coefficient, where StartReplacing produces the graphs with the highest average clustering coefficient. The graphs produced by StartReplacing have many inter-community links, which means there is a higher probability of also having triangles spanning different communities than with the other variants.
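As a reminder of how the average clustering coefficient depends on triangles, the following minimal sketch computes it from an edge list. It assumes the common per-node definition (closed neighbor pairs divided by all neighbor pairs), counts nodes of degree below 2 as zero, and ignores isolated nodes that do not appear in any link; the function name `average_clustering` is hypothetical and the exact convention behind Figure 6 may differ.

```python
from itertools import combinations


def average_clustering(edges):
    """Average over all nodes of (links among neighbors) / (possible links)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # fewer than two neighbors: contributes zero
        # Each connected pair of neighbors closes one triangle through `node`.
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * closed / (k * (k - 1))
    return total / len(adj)


# A triangle (0, 1, 2) with one pendant node attached to node 2.
example_c = average_clustering([(0, 1), (0, 2), (1, 2), (2, 3)])
```

Every inter-community triangle adds closed neighbor pairs for its three nodes, which is why a variant with many inter-community links can still score a high C.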

[Figure 7, panels: (a) User-centric friendship network of the person X in Facebook; (b) TMGG modeled network.]

Figure 7: Real Facebook friendship and TMGG constructed networks.

6.2. Online social network modeling

To demonstrate that TMGG can indeed generate realistic community-structured networks, we make a comparison with a real user-centric friendship network of a single person X in Facebook (footnote 4), as displayed in Figure 7a. The nodes are Facebook friends of X, and a link exists between two nodes if the corresponding two friends of X are also friends of each other. The visualization shows a clear community structure. Using TMGG (variant StartShifting), we have generated a network, shown in Figure 7b, that has the same modularity (m = 0.7), number of communities (c = 5) and number of links (L = 1773) as the Facebook network of X. The two networks have similar properties, such as a similar average nodal degree (E[D] = 20) and clustering coefficient (C = 0.68), which supports our claim that TMGG can generate realistic networks.
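The construction of a user-centric network described above can be sketched as follows. This is an illustration only: the input format (a dictionary mapping each person to a set of friends, assumed symmetric) and the names `ego_network` and `average_degree` are hypothetical, and no Facebook API is involved.

```python
def ego_network(friends, x):
    """Return (nodes, links) of the user-centric network of person x.

    Nodes are the friends of x; x itself is left out. A link joins two
    nodes when those two friends of x are also friends of each other.
    """
    nodes = set(friends[x])
    links = set()
    for a in nodes:
        for b in friends.get(a, ()):
            if b in nodes and a < b:   # store each undirected link once
                links.add((a, b))
    return nodes, links


def average_degree(nodes, links):
    """E[D] = 2L / N for an undirected graph with N nodes and L links."""
    return 2 * len(links) / len(nodes)


# Toy friendship data, made up for illustration.
toy_friends = {
    "X": {"a", "b", "c", "d"},
    "a": {"X", "b"},
    "b": {"X", "a", "c"},
    "c": {"X", "b"},
    "d": {"X"},
}
toy_nodes, toy_links = ego_network(toy_friends, "X")
```

The resulting link count L, together with a modularity and a number of communities measured on the same network, would form the three inputs that TMGG needs.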

Footnote 4: http://www.facebook.com

7. Conclusions

We have considered the problem of constructing graphs with a given modularity and have proved that deciding whether such a graph exists is NP-complete. Subsequently, we have proposed a heuristic algorithm TMGG that generates graphs with a given modularity and number of links. TMGG has three variants and all start from a graph with maximum modularity [21] that is altered via rewiring. Furthermore, we have analyzed the difference in speed and accuracy of the three variants, and we have studied the topological properties of the graphs generated by them. All three TMGG variants produce community graphs, i.e. a family of graphs consisting of the topology between communities and the number of links within each community. The community graph presents ample flexibility to generate and fine-tune the final graph towards other desired topological properties, such as nodal degree distribution, without affecting the modularity.

Acknowledgments

We are grateful to Norbert Blenn for the useful discussions. This research has been supported by the EU FP7 Network of Excellence in Internet Science EINS (project no. 288021) and by the GigaPort3 project led by SURFnet.

Appendix A. The derivations for the modularity changes

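We consider the difference $\Delta m$ in modularity between the graph $G$ and the graph $G'$, obtained from $G$ after a change in communities $C_i$ and $C_j$, taken as the modularity of $G'$ minus the modularity of $G$. Using the modularity definition (2), the difference is reflected in the terms for which $(D_{C_p} - D_{C_k})^2 - (D'_{C_p} - D'_{C_k})^2 \neq 0$, with $p \in \{i, j\}$. Hence, $\Delta m$ boils down to

\[
\begin{aligned}
\Delta m &= \frac{L_{\mathrm{inter}} - L'_{\mathrm{inter}}}{L} - \frac{1}{8cL^2} \sum_{p=1}^{c} \sum_{k=1}^{c} \left[ \left(D'_{C_p} - D'_{C_k}\right)^2 - \left(D_{C_p} - D_{C_k}\right)^2 \right] \\
&= \frac{L_{\mathrm{inter}} - L'_{\mathrm{inter}}}{L} - \frac{2}{8cL^2} \Bigg\{ \sum_{\substack{k=1 \\ k \neq i,j}}^{c} \left[ \left(D'_{C_i} - D_{C_k}\right)^2 - \left(D_{C_i} - D_{C_k}\right)^2 + \left(D'_{C_j} - D_{C_k}\right)^2 - \left(D_{C_j} - D_{C_k}\right)^2 \right] \\
&\qquad\qquad + \left(D'_{C_i} - D'_{C_j}\right)^2 - \left(D_{C_i} - D_{C_j}\right)^2 \Bigg\} \\
&= \frac{L_{\mathrm{inter}} - L'_{\mathrm{inter}}}{L} - \frac{1}{4cL^2} \sum_{\substack{k=1 \\ k \neq i,j}}^{c} \left[ \left(D_{C_i} + D'_{C_i} - 2D_{C_k}\right)\left(D'_{C_i} - D_{C_i}\right) + \left(D_{C_j} + D'_{C_j} - 2D_{C_k}\right)\left(D'_{C_j} - D_{C_j}\right) \right] \\
&\quad - \frac{1}{4cL^2} \left(D'_{C_i} - D_{C_i} - \left(D'_{C_j} - D_{C_j}\right)\right)\left(D'_{C_i} + D_{C_i} - \left(D'_{C_j} + D_{C_j}\right)\right)
\end{aligned}
\tag{A.1}
\]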
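The first line of (A.1) can be checked numerically. The sketch below assumes that definition (2) is the usual modularity, m = Σ_k [L_k / L − (D_Ck / 2L)²], which is equivalent to the pairwise form with prefactor 1/(8cL²) whenever the community degree sums add up to 2L. The function names and the example numbers are hypothetical; exact rational arithmetic from the standard `fractions` module avoids rounding issues.

```python
from fractions import Fraction


def modularity_sum_form(L, L_inter, D):
    """m = 1 - L_inter/L - sum_k (D_k / 2L)^2, with D the community degree sums."""
    return 1 - Fraction(L_inter, L) - sum(Fraction(d * d, 4 * L * L) for d in D)


def modularity_pairwise_form(L, L_inter, D):
    """Same quantity via pairwise differences; requires sum(D) == 2 * L."""
    c = len(D)
    pairwise = sum((dp - dk) ** 2 for dp in D for dk in D)
    return (1 - Fraction(L_inter, L) - Fraction(1, c)
            - Fraction(pairwise, 8 * c * L * L))


def delta_m_pairwise(L, L_inter, D, L_inter_new, D_new):
    """First line of (A.1): modularity of G' minus modularity of G."""
    c = len(D)
    diff = sum((D_new[p] - D_new[k]) ** 2 - (D[p] - D[k]) ** 2
               for p in range(c) for k in range(c))
    return Fraction(L_inter - L_inter_new, L) - Fraction(diff, 8 * c * L * L)


# Example with c = 3 communities, L = 12 links and 3 inter-community links.
L, L_inter, D = 12, 3, [10, 8, 6]
# A change that removes one inter-community link and shifts one unit of degree.
L_inter_new, D_new = 2, [9, 9, 6]
direct = (modularity_sum_form(L, L_inter_new, D_new)
          - modularity_sum_form(L, L_inter, D))
via_a1 = delta_m_pairwise(L, L_inter, D, L_inter_new, D_new)
```

In this example both `direct` and `via_a1` evaluate to the same fraction, which is the behavior (A.1) predicts for any change that preserves L and the total degree.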

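Appendix A.1. Transformation 1

Here, $L'_{\mathrm{inter}} = L_{\mathrm{inter}} - 1$, $D'_{C_i} = D_{C_i} - 1$ and $D'_{C_j} = D_{C_j} + 1$, as has been discussed in Transformation 1. The expression (A.1) becomes

\[
\begin{aligned}
\Delta m_1(G, D_{C_i}, D_{C_j}) &= \frac{1}{L} - \frac{1}{4cL^2} \sum_{\substack{k=1 \\ k \neq i,j}}^{c} \left[ -\left(2D_{C_i} - 2D_{C_k} - 1\right) + \left(2D_{C_j} - 2D_{C_k} + 1\right) \right] - \frac{1}{4cL^2} \left(-2\right)\left(2D_{C_i} - 2D_{C_j} - 2\right) \\
&= \frac{1}{L} + \frac{2}{4cL^2} \sum_{\substack{k=1 \\ k \neq i,j}}^{c} \left(D_{C_i} - D_{C_j} - 1\right) + \frac{2\left(2D_{C_i} - 2D_{C_j} - 2\right)}{4cL^2} \\
&= \frac{1}{L} + \frac{c - 2}{2cL^2} \left(D_{C_i} - D_{C_j} - 1\right) + \frac{D_{C_i} - D_{C_j} - 1}{cL^2} \\
&= \frac{1}{L} + \frac{c - 2 + 2}{2cL^2} \left(D_{C_i} - D_{C_j} - 1\right) \\
&= \frac{1}{L} + \frac{1}{2L^2} \left(D_{C_i} - D_{C_j} - 1\right) \\
&= \frac{2L - 1 + D_{C_i} - D_{C_j}}{2L^2}
\end{aligned}
\]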

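Appendix A.2. Transformation 2

Here, $L'_{\mathrm{inter}} = L_{\mathrm{inter}}$, $D'_{C_i} = D_{C_i} - 2$ and $D'_{C_j} = D_{C_j} + 2$, as has been discussed in Transformation 2. The expression (A.1) becomes

\[
\begin{aligned}
\Delta m_2(G, D_{C_i}, D_{C_j}) &= -\frac{1}{4cL^2} \sum_{\substack{k=1 \\ k \neq i,j}}^{c} \left[ \left(D_{C_i} + D'_{C_i} - 2D_{C_k}\right)\left(D'_{C_i} - D_{C_i}\right) + \left(D_{C_j} + D'_{C_j} - 2D_{C_k}\right)\left(D'_{C_j} - D_{C_j}\right) \right] \\
&\quad - \frac{1}{4cL^2} \left(D'_{C_i} - D_{C_i} - \left(D'_{C_j} - D_{C_j}\right)\right)\left(D'_{C_i} + D_{C_i} - \left(D'_{C_j} + D_{C_j}\right)\right) \\
&= \frac{4}{4cL^2} \sum_{\substack{k=1 \\ k \neq i,j}}^{c} \left[ \left(D_{C_i} - D_{C_k} - 1\right) - \left(D_{C_j} - D_{C_k} + 1\right) \right] + \frac{4}{4cL^2} \left(2D_{C_i} - 2 - 2D_{C_j} - 2\right) \\
&= \frac{1}{cL^2} \sum_{\substack{k=1 \\ k \neq i,j}}^{c} \left(D_{C_i} - D_{C_j} - 2\right) + \frac{2}{cL^2} \left(D_{C_i} - D_{C_j} - 2\right) \\
&= \frac{c - 2 + 2}{cL^2} \left(D_{C_i} - D_{C_j} - 2\right) \\
&= \frac{1}{L^2} \left(D_{C_i} - D_{C_j} - 2\right)
\end{aligned}
\]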
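The closed form for Transformation 2 can be cross-checked against a direct evaluation. The sketch assumes, as before, that definition (2) is the usual modularity m = 1 − L_inter/L − Σ_k (D_Ck / 2L)², and it takes Δm as the modularity after the change minus the modularity before it. The function names are hypothetical; since L_inter is unchanged by this transformation, only the degree term contributes.

```python
from fractions import Fraction


def delta_m2_closed_form(L, d_i, d_j):
    """Closed form derived above: (D_Ci - D_Cj - 2) / L^2."""
    return Fraction(d_i - d_j - 2, L * L)


def delta_m2_direct(L, D, i, j):
    """Apply D_Ci -> D_Ci - 2 and D_Cj -> D_Cj + 2, then difference the degree term."""
    D_new = list(D)
    D_new[i] -= 2
    D_new[j] += 2
    before = sum(Fraction(d * d, 4 * L * L) for d in D)
    after = sum(Fraction(d * d, 4 * L * L) for d in D_new)
    return before - after   # the L_inter term cancels


# Example: c = 3 communities with degree sums adding up to 2L = 24.
example_direct = delta_m2_direct(12, [12, 6, 6], 0, 1)
example_closed = delta_m2_closed_form(12, 12, 6)
```

The sign of the result shows when the transformation raises modularity: only when community C_i has a degree sum more than 2 above that of C_j.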

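Appendix A.3. Transformation 3

The difference $\Delta m$ in modularity between the graph $G$ and the graph $G'$, obtained from $G$ after a change in communities $C_i$, $C_j$ and $C_k$ (Transformation 3), is given below. The second equality substitutes $D'_{C_i} = D_{C_i} - 1$, $D'_{C_j} = D_{C_j} - 1$ and $D'_{C_k} = D_{C_k} + 2$, for which the term of the pair $(i, j)$ vanishes.

\[
\begin{aligned}
\Delta m_3(G, D_{C_i}, D_{C_j}, D_{C_k}) &= \frac{1}{L} - \frac{1}{4cL^2} \sum_{\substack{p=1 \\ p \neq i,j,k}}^{c} \Big[ \left(D_{C_i} + D'_{C_i} - 2D_{C_p}\right)\left(D'_{C_i} - D_{C_i}\right) + \left(D_{C_j} + D'_{C_j} - 2D_{C_p}\right)\left(D'_{C_j} - D_{C_j}\right) \\
&\qquad\qquad + \left(D_{C_k} + D'_{C_k} - 2D_{C_p}\right)\left(D'_{C_k} - D_{C_k}\right) \Big] \\
&\quad - \frac{1}{4cL^2} \left(D'_{C_i} - D_{C_i} - \left(D'_{C_j} - D_{C_j}\right)\right)\left(D'_{C_i} + D_{C_i} - \left(D'_{C_j} + D_{C_j}\right)\right) \\
&\quad - \frac{1}{4cL^2} \left(D'_{C_i} - D_{C_i} - \left(D'_{C_k} - D_{C_k}\right)\right)\left(D'_{C_i} + D_{C_i} - \left(D'_{C_k} + D_{C_k}\right)\right) \\
&\quad - \frac{1}{4cL^2} \left(D'_{C_j} - D_{C_j} - \left(D'_{C_k} - D_{C_k}\right)\right)\left(D'_{C_j} + D_{C_j} - \left(D'_{C_k} + D_{C_k}\right)\right) \\
&= \frac{1}{L} + \frac{1}{4cL^2} \sum_{\substack{p=1 \\ p \neq i,j,k}}^{c} \left[ \left(2D_{C_i} - 2D_{C_p} - 1\right) + \left(2D_{C_j} - 2D_{C_p} - 1\right) - 2\left(2D_{C_k} - 2D_{C_p} + 2\right) \right] \\
&\quad - \frac{1}{4cL^2} \left(-1 - 2\right)\left(2D_{C_i} - 1 - \left(2D_{C_k} + 2\right)\right) - \frac{1}{4cL^2} \left(-1 - 2\right)\left(2D_{C_j} - 1 - \left(2D_{C_k} + 2\right)\right) \\
&= \frac{1}{L} + \frac{1}{2cL^2} \sum_{\substack{p=1 \\ p \neq i,j,k}}^{c} \left(D_{C_i} + D_{C_j} - 2D_{C_k} - 3\right) + \frac{3}{4cL^2} \left(2D_{C_i} + 2D_{C_j} - 4D_{C_k} - 6\right) \\
&= \frac{1}{L} + \frac{2c - 6 + 6}{4cL^2} \left[ D_{C_i} + D_{C_j} - 2D_{C_k} - 3 \right] \\
&= \frac{2L + D_{C_i} + D_{C_j} - 2D_{C_k} - 3}{2L^2}
\end{aligned}
\]

References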

[1] M. E. J. Newman, M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69 (2004) 026113.

[2] U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, D. Wagner, On Finding Graph Clusterings with Maximum Modularity, in: Graph-Theoretic Concepts in Computer Science, volume 4769 of Lecture Notes in Computer Science, Springer Berlin/Heidelberg, 2007, pp. 121–132.

[3] R. Guimerà, L. A. N. Amaral, Functional cartography of complex metabolic networks, Nature 433 (2005) 895–900.

[4] J. Duch, A. Arenas, Community detection in complex networks using extremal optimization, Phys. Rev. E 72 (2005) 027104.

[5] S. Fortunato, M. Barthélemy, Resolution limit in community detection, Proceedings of the National Academy of Sciences 104 (2007) 36–41.

[6] A. Lancichinetti, S. Fortunato, Limits of modularity maximization in community detection, Phys. Rev. E 84 (2011) 066122.

[7] R. Guimerà, M. Sales-Pardo, L. A. N. Amaral, Modularity from fluctuations in random graphs and complex networks, Phys. Rev. E 70 (2004) 025101.

[8] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment (2008) P10008.

[9] P. Erdős, A. Rényi, On the evolution of random graphs, Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5 (1960) 17–61.

[10] E. Gilbert, Random graphs, Annals of Mathematical Statistics 30 (1959) 1141.

[11] R. Albert, A.-L. Barabási, Statistical Mechanics of Complex Networks, Reviews of Modern Physics 74 (2002) 47–97.

[12] R. Albert, A.-L. Barabási, Topology of evolving networks: Local events and universality, Phys. Rev. Lett. 85 (2000) 5234–5237.

[13] T. Bu, D. Towsley, On distinguishing between internet power law topology generators, in: INFOCOM 2002. Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies, volume 2, pp. 638–647.

[14] M. Faloutsos, P. Faloutsos, C. Faloutsos, On power-law relationships of the Internet topology, in: Proceedings of SIGCOMM '99, ACM, New York, NY, USA, 1999, pp. 251–262.

[15] D. J. Watts, S. H. Strogatz, Collective dynamics of small world networks, Nature (1998) 440–442.

[16] M. E. J. Newman, Detecting community structure in networks, Eur. Phys. J. B 38 (2004) 321–330.

[17] A. Clauset, M. E. J. Newman, C. Moore, Finding community structure in very large networks, Phys. Rev. E 70 (2004) 066111.

[18] P. Schumm, C. Scoglio, Bloom: A stochastic growth-based fast method of community detection in networks, Journal of Computational Science 3 (2012) 356–366.


[19] S. Fortunato, Community detection in graphs, Physics Reports 486 (2010) 75–174.

[20] G. Agarwal, D. Kempe, Modularity-maximizing graph communities via mathematical programming, Eur. Phys. J. B 66 (2008) 409–418.

[21] S. Trajanovski, H. Wang, P. Van Mieghem, Maximum modular graphs, Eur. Phys. J. B 85 (2012) 1–14.

[22] N. Blenn, C. Doerr, S. van Kester, P. Van Mieghem, Crawling and detecting community structure in online social networks using local information, in: IFIP Networking, Prague, Czech Republic, 2012.

[23] M. E. J. Newman, Mixing patterns in networks, Phys. Rev. E 67 (2003) 026126.

[24] P. Van Mieghem, H. Wang, X. Ge, S. Tang, F. A. Kuipers, Influence of assortativity and degree-preserving rewiring on the spectra of networks, Eur. Phys. J. B 76 (2010) 643–652.

[25] P. Van Mieghem, X. Ge, P. Schumm, S. Trajanovski, H. Wang, Spectral graph analysis of modularity and assortativity, Phys. Rev. E 82 (2010) 056113.

[26] G. K. Orman, V. Labatut, H. Cherifi, Qualitative comparison of community detection algorithms, in: DICTAP (2), volume 167 of Communications in Computer and Information Science, Springer, 2011, pp. 265–279.

[27] M. Girvan, M. E. J. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences 99 (2002) 7821–7826.

[28] A. Lancichinetti, S. Fortunato, F. Radicchi, Benchmark graphs for testing community detection algorithms, Phys. Rev. E 78 (2008) 046110.

[29] J. P. Bagrow, Evaluating local community methods in networks, Journal of Statistical Mechanics: Theory and Experiment (2008) P05001.

[30] A. Lancichinetti, S. Fortunato, Community detection algorithms: A comparative analysis, Phys. Rev. E 80 (2009) 056117.

[31] P. Van Mieghem, Graph Spectra for Complex Networks, Cambridge University Press, Cambridge, UK, 2011.

[32] K. Manders, L. Adleman, NP-complete decision problems for quadratic polynomials, in: Proceedings of the eighth annual ACM symposium on theory of computing, STOC '76, ACM, New York, NY, USA, 1976, pp. 23–29.

[33] M. R. Garey, D. S. Johnson, Computers and Intractability; A Guide to the Theory of NP-Completeness, W. H. Freeman & Co., New York, NY, USA, 1990.
