
Maximum parsimony distance on phylogenetic trees

A linear kernel and constant factor approximation algorithm

Jones, Mark; Kelk, Steven; Stougie, Leen

DOI

10.1016/j.jcss.2020.10.003

Publication date

2021

Document Version

Final published version

Published in

Journal of Computer and System Sciences

Citation (APA)

Jones, M., Kelk, S., & Stougie, L. (2021). Maximum parsimony distance on phylogenetic trees: A linear kernel and constant factor approximation algorithm. Journal of Computer and System Sciences, 117, 165-181. https://doi.org/10.1016/j.jcss.2020.10.003

Important note

To cite this publication, please use the final published version (if applicable).

Please check the document version above.

Copyright

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Takedown policy

Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology.

Contents lists available at ScienceDirect

Journal of Computer and System Sciences

www.elsevier.com/locate/jcss

Maximum parsimony distance on phylogenetic trees: A linear kernel and constant factor approximation algorithm

Mark Jones a,b,*, Steven Kelk c, Leen Stougie b,d,e

a Delft Institute of Applied Mathematics, Delft University of Technology, Van Mourik Broekmanweg 6, 2628 XE, Delft, the Netherlands
b Centrum Wiskunde & Informatica (CWI), 1098 XG Amsterdam, the Netherlands
c Department of Data Science and Knowledge Engineering (DKE), Maastricht University, 6200 MD Maastricht, the Netherlands
d Vrije Universiteit Amsterdam, 1081 HV Amsterdam, the Netherlands
e INRIA-Erable, France

Article info

Article history:
Received 7 April 2020
Received in revised form 23 October 2020
Accepted 26 October 2020
Available online 7 December 2020

Keywords:
Phylogenetics
Maximum parsimony
Fixed parameter tractability
Maximum agreement forest

Abstract

Maximum parsimony distance is a measure used to quantify the dissimilarity of two unrooted phylogenetic trees. It is NP-hard to compute, and very few positive algorithmic results are known due to its complex combinatorial structure. Here we address this shortcoming by showing that the problem is fixed parameter tractable. We do this by establishing a linear kernel, i.e., that after applying certain reduction rules the resulting instance has size that is bounded by a linear function of the distance. As powerful corollaries to this result we prove that the problem permits a polynomial-time constant-factor approximation algorithm; that the treewidth of a natural auxiliary graph structure encountered in phylogenetics is bounded by a function of the distance; and that the distance is within a constant factor of the size of a maximum agreement forest of the two trees, a well studied object in phylogenetics.

© 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Phylogenetics is the science of inferring and comparing trees (or more generally, graphs) that represent the evolutionary history of a set of species [34]. In this article we focus on trees. The inference problem has been comprehensively studied: given only data about the species in $X$ (such as DNA data) construct a phylogenetic tree which optimizes a particular objective function [17,40]. Informally, a phylogenetic tree is simply a tree whose leaves are bijectively labelled by $X$. Due to different objective functions, multiple optima and the phenomenon that certain genomes are the result of several evolutionary paths (rather than just one) we are often confronted with multiple “good” phylogenetic trees [32]. In such cases we wish to formally quantify how dissimilar these trees really are. This leads naturally to the problem of defining and computing the distance between phylogenetic trees [36]. Many such distances have been proposed, some of which can be computed in polynomial time, such as Robinson-Foulds (RF) distance [33], and some of which are NP-hard, such as Subtree Prune and Regraft (SPR) distance [9] or Tree Bisection and Reconnection (TBR) distance [1].

Interestingly, distances are not only relevant as a numerical quantification of difference: they also appear in constructive methods for the inference of phylogenetic networks [20], which generalise trees to graphs, and phylogenetic supertrees, which seek to merge multiple trees into a single summary tree [42]. In recent decades NP-hard phylogenetic distances have attracted quite some attention from the discrete optimization and parameterized complexity communities, see e.g. [12,16].

* Corresponding author at: Delft Institute of Applied Mathematics, Delft University of Technology, Van Mourik Broekmanweg 6, 2628 XE, Delft, the Netherlands. E-mail address: M.E.L.Jones@tudelft.nl (M. Jones).

In this article we focus on a relatively new distance measure, maximum parsimony distance, henceforth denoted $d_{MP}$. Let $T_1$ and $T_2$ be two unrooted (i.e. undirected) binary phylogenetic trees, with the same set of leaf labels $X$. Consider an arbitrary assignment of colours (“states”) to $X$; we call such an assignment a character. The parsimony score of $T_1$ with respect to the character is the minimum number of bichromatic edges in $T_1$, ranging over all possible colourings of the internal vertices of $T_1$. The parsimony distance of $T_1$ and $T_2$ is the maximum absolute difference between parsimony scores of $T_1$ and $T_2$, ranging over all characters [18,31].

The distance has several attractive properties; it is a metric, and (unlike e.g. RF distance) it is not confounded by the influence of horizontal evolutionary events [18]. Furthermore, the concept of parsimony, which lies at the heart of $d_{MP}$, is fundamental in phylogenetics since it articulates the idea that explanations of evolutionary history should be no more complex than necessary. Alongside its historical significance for applied phylogenetics [17], the study of character-based parsimony has given rise to many beautiful combinatorial and algorithmic results; we refer to e.g. [37,29,38,2,30] for overviews.

Unfortunately, it is NP-hard to compute $d_{MP}$ [22]. A simple exponential-time algorithm is known [26], which runs in time $O(\phi^n \cdot \mathrm{poly}(n))$, where $|X| = n$ and $\phi \approx 1.618$ is the golden ratio, but beyond this few positive results are known. This is frustrating and surprising, since a number of results link $d_{MP}$ to the well-studied TBR distance, henceforth denoted $d_{TBR}$. Namely, it has been proven that $d_{MP}$ is a lower bound on $d_{TBR}$ [18], which, informally, asks for the minimum number of topological rearrangement operations to transform one tree into the other; an empirical study has suggested that in practice the distances are often very close [23]. Also, $d_{MP}$ has been used to prove the tightness of the best-known kernelization results for $d_{TBR}$ [24,25]. What, exactly, is the relationship between $d_{MP}$ and $d_{TBR}$? This is a pertinent question, which transcends the specifics of TBR distance because, crucially, $d_{TBR}$ can be characterized using the powerful maximum agreement forest abstraction.

Distances based on agreement forests have been intensively and successfully studied in recent years, as the use of the agreement forest abstraction almost always yields fixed parameter tractability and constant-factor approximation algorithms [10], many of which are effective in practice. We refer to [41,39,14,35] for recent overviews of the agreement forest literature, and books such as [15] for an introduction to fixed parameter tractability. In particular, $d_{TBR}$ can be computed in $O(3^{d_{TBR}} \cdot \mathrm{poly}(n))$ time [13], permits a polynomial-time 3-approximation algorithm, and has a kernel of size $11 d_{TBR} - 9$ [25]. In contrast, prior to this paper very little was known about $d_{MP}$: nothing was known about the approximability of $d_{MP}$; it was not known whether it is fixed parameter tractable (where $d_{MP}$ is the parameter); and, while, as mentioned above, it is known that $d_{MP} \le d_{TBR}$, it remained unclear how much smaller $d_{MP}$ can be than $d_{TBR}$ in the worst case. Despite promising partial results it even remained unclear whether questions such as “Is $d_{MP} \ge k$?” can be solved in polynomial time when $k$ is a constant [8,23]. This is another important difference with distances such as $d_{TBR}$, where corresponding questions are trivially polynomial time solvable for fixed $k$. The apparent extra complexity of $d_{MP}$ seems to stem from the unusual max-min definition of the problem, and the fact that unlike $d_{TBR}$, which is based on topological rearrangements of subtrees, $d_{MP}$ is based only on characters.

In this article we take a significant step forward in understanding the deeper complexity of $d_{MP}$ and resolve all of the above questions. Our central result is that we prove that two common polynomial-time reduction rules encountered in phylogenetics, the subtree and chain reductions [1], are sufficient to produce a linear kernel for $d_{MP}$. This means that, after exhaustive application of these rules, which preserve $d_{MP}$, the reduced trees will have at most $\alpha \cdot (d_{MP} + 1)$ leaves, with $\alpha = 560$. The fixed parameter tractability of computing $d_{MP}$ (parameterized by itself) then follows, by solving the kernel using the exact algorithm from [26]. The fact that the reduction rules preserve $d_{MP}$ was already known [23]. However, proving the bound on the size of the reduced trees requires rather involved combinatorial arguments, which have a very different flavour to the arguments typically encountered in the maximum agreement forest literature. The main goal of this article is to present these arguments as clearly as possible, rather than to optimize the resulting constants.

The kernel confirms that questions such as “Is $d_{MP} \ge k$?” can, indeed, be solved in polynomial time: it is striking that here the proof of fixed parameter tractability has preceded the weaker result of polynomial-time solvability for fixed $k$.

Next, by producing a modified, constructive version of the bounding argument underpinning the kernelization, we are able to demonstrate a polynomial-time $\alpha(1 + 1/r)$-factor approximation algorithm for computation of $d_{MP}$ for any constant $r$, placing the problem in APX.

A number of other powerful corollaries result from the kernelization. We leverage the fact that the reduction rules also preserve $d_{TBR}$, to show that $1 \le \frac{d_{TBR}}{d_{MP}} \le 2\alpha$, which limits how much smaller $d_{MP}$ can be than $d_{TBR}$. Subsequently, we show that the treewidth of an auxiliary graph structure known as the display graph [11] is bounded by a linear function of $d_{MP}$, resolving an open question posed several times [28,23]. The treewidth bound, and the existence of a non-trivial approximation algorithm for $d_{MP}$, were specified as sufficient conditions for proving the fixed parameter tractability of $d_{MP}$ via Courcelle’s Theorem [23]; our linear kernel implies them. Summarising, our central result shows how kernelization can open the gateway to a host of strong auxiliary results and bypass intermediate steps in the algorithm design process.

The structure of the paper is as follows. In Section 2 we give formal definitions and insightful preliminary results. In Section 3 we prove our main result: the linear kernel. The section starts with Subsection 3.1, which gives a high-level overview of how a sequence of lemmas and theorems leads to the kernel, whereas in the rest of the section these lemmas and theorems are proved. Interesting corollaries of the existence of a linear kernel are derived in Section 4: a constant approximation algorithm in Section 4.1; a bound on the ratio between $d_{MP}$ and $d_{TBR}$ in Section 4.2; a bound on the treewidth of the so-called display graph in terms of $d_{MP}$ in Section 4.3. Section 5 concludes with some directions for future research.

Fig. 1. Two unrooted binary phylogenetic trees $T_1$, $T_2$ on $X = \{a, \ldots, g\}$. Solid edges are monochromatic and dashed edges are bichromatic under an optimal extension for the character $\chi : X \to \{\text{red}, \text{blue}\}$, where $\chi(a) = \chi(b) = \chi(c) = \text{red}$, $\chi(d) = \chi(e) = \chi(f) = \chi(g) = \text{blue}$. As there is one bichromatic edge in $T_1$ and two in $T_2$, we have that $l_\chi(T_1) = 1$, $l_\chi(T_2) = 2$, proving that $d_{MP}(T_1, T_2) \ge |1 - 2| = 1$. In fact, it can be verified that no character can cause the parsimony scores of these two trees to differ by more, so $d_{MP}(T_1, T_2) = 1$. We will show in Section 4.2 that $d_{TBR}(T_1, T_2) = 2$, because a maximum agreement forest of these two trees contains three blocks [23]. (For interpretation of the colours in the figure(s), the reader is referred to the web version of this article.)

2. Definitions and preliminaries

An unrooted binary phylogenetic tree on a set of species (or taxa) $X$ is an undirected tree in which all internal vertices have degree 3, and the degree-1 vertices (the leaves) are bijectively labelled with elements from $X$. For brevity we will refer to unrooted binary phylogenetic trees as phylogenetic trees, or even shorter, trees. See Fig. 1 for an example.

Given a set $S \subseteq X$ and a tree $T$ on $X$, we denote by $T[S]$ the spanning subtree on $S$ in $T$, that is, the minimal connected subgraph $T'$ of $T$ such that $T'$ contains every element of $S$. The induced subtree $T|S$ by $S$ in $T$ is the tree derived from $T[S]$ by suppressing any vertices of degree 2.

Given a subset $S \subseteq X$ and a tree $T$ on $X$, we say that $S$ has degree $d$ in $T$ if there are exactly $d$ edges $uv$ in $T$ for which $u$ is in $T[S]$ and $v$ is not; in other words, $d$ is the number of edges separating $T[S]$ from the rest of $T$. We call these edges pending edges of $S$ in $T$.

For two disjoint subsets $S_1, S_2 \subseteq X$, we say $S_1$ and $S_2$ are spanning-disjoint in $T$ if the spanning subtrees $T[S_1]$ and $T[S_2]$ are edge-disjoint. (Observe that as $T$ is binary, this also implies that $T[S_1]$ and $T[S_2]$ are vertex-disjoint.) Similarly, we say a collection $S_1, \ldots, S_m$ of subsets of $X$ are spanning-disjoint in $T$ if $S_i, S_j$ are spanning-disjoint in $T$ for any $i \ne j$.

2.1. Characters and parsimony

A character on $X$ is a function $\chi : X \to C$, where $C$ is a set of states. In this paper there is no limit on the size of $C$, in contrast to some contexts where $|C|$ is assumed to be quite small (for example, in genetic data the nucleobases A, C, G, T). Think of the states as colours, say $1, 2, \ldots, t =: [t]$.

For a given character $\chi$ and tree $T$ on $X$, the parsimony score measures how well $T$ fits $\chi$. It is defined in the following way. Call a colouring $\phi : V(T) \to [t]$ an extension of $\chi$ to $T$ if $\phi(x) = \chi(x)$ for all $x \in X$. Denote by $l_T(\phi)$ the number of bichromatic edges $uv$ in $T$, i.e. for which $\phi(u) \ne \phi(v)$. We usually omit the subscript $T$ when the tree is clear from context. The parsimony score for $T$ with respect to $\chi$ is defined as

$$l_\chi(T) = \min_{\phi} l_T(\phi),$$

where the minimum is taken over all possible extensions $\phi$ of $\chi$ to $T$. An extension $\phi$ that achieves this bound is called an optimal extension of $\chi$ to $T$. An optimal extension, and thus the parsimony score, can be easily computed in polynomial time using dynamic programming or e.g. Fitch’s algorithm [19].

Observe that for any $T$ and $\chi$, the parsimony score for $T$ with respect to $\chi$ is at least $|\chi(X)| - 1$, i.e. the number of colours assigned by $\chi$ minus 1. If $l_\chi(T)$ is exactly $|\chi(X)| - 1$, we say that $T$ is a perfect phylogeny for $\chi$. For trees $T_1, T_2$ and a character $\chi$ on $X$, the parsimony distance with respect to $\chi$ is defined as $d_{MP_\chi}(T_1, T_2) = |l_\chi(T_1) - l_\chi(T_2)|$.

Now we are ready to define the maximum parsimony distance between two trees (see also Fig. 1). For two trees $T_1, T_2$ on $X$, the maximum parsimony distance is defined as

$$d_{MP}(T_1, T_2) = \max_{\chi} d_{MP_\chi}(T_1, T_2),$$

where the maximum is taken over all possible characters $\chi$ on $X$ [18,31]. Equivalently, we may write it as

$$d_{MP}(T_1, T_2) = \max_{\chi} \left| l_{T_1}(\phi_1) - l_{T_2}(\phi_2) \right|,$$

where $\phi_1$ is an optimal extension of $\chi$ to $T_1$, and $\phi_2$ an optimal extension of $\chi$ to $T_2$. This measure satisfies the properties of a distance metric on the space of unrooted binary phylogenetic trees [18,31]. For two trees on $n$ taxa it is known that $d_{MP}$ is at most $n - 2\sqrt{n} + 1$ [18]. A weaker bound of $n - 1$ is easily obtained by observing that the parsimony score of a character on a tree is at least 0 and at most $n - 1$.
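To make the max-min structure of the definition concrete, here is a brute-force sketch that enumerates every character on a very small taxon set and takes the largest score difference. It is only an illustration under assumed conventions (trees as nested 2-tuples rooted on an arbitrary edge, hypothetical function names); it is not the exact algorithm of [26] and is exponential in $n$.

```python
from itertools import product


def leaves(tree):
    """List the leaf labels of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def score(tree, chi):
    """Parsimony score via Fitch's bottom-up pass on a rooted binary tree."""
    def visit(node):
        if not isinstance(node, tuple):
            return {chi[node]}, 0
        (ls, lc), (rs, rc) = visit(node[0]), visit(node[1])
        common = ls & rs
        return (common, lc + rc) if common else (ls | rs, lc + rc + 1)
    return visit(tree)[1]


def d_mp_brute_force(t1, t2):
    """Maximise |l_chi(T1) - l_chi(T2)| over all characters (exponential time)."""
    taxa = sorted(leaves(t1))
    assert taxa == sorted(leaves(t2)), "both trees must have the same taxa"
    best = 0
    # A character on n taxa uses at most n distinct states.
    for states in product(range(len(taxa)), repeat=len(taxa)):
        chi = dict(zip(taxa, states))
        best = max(best, abs(score(t1, chi) - score(t2, chi)))
    return best
```

For the two quartet trees ab|cd and ac|bd this returns 1, which agrees with the bound $n - 2\sqrt{n} + 1 = 1$ for $n = 4$.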

Given a tree $T$ on $X$ and a colouring $\phi : V(T) \to [t]$, the forest induced by $\phi$ is derived from $T$ by deleting every bichromatic edge under $\phi$. Observe that the number of connected components in the forest induced by $\phi$ is exactly $l_T(\phi) + 1$.

Lemma 1. If $\chi : X \to [t]$ is a character with $S_i = \chi^{-1}(i) \ne \emptyset$ (i.e. at least one taxon is coloured $i$) for each $i \in [t]$, and $T$ is a tree on $X$, then $l_\chi(T) \ge t - 1$, with equality if and only if $S_1, \ldots, S_t$ are spanning-disjoint in $T$.

Proof. To see that $l_\chi(T) \ge t - 1$, consider an optimal extension $\phi$ of $\chi$ to $T$, and let $F$ be the forest induced by $\phi$. As each connected component in $F$ is monochromatically coloured by $\phi$, there must be at least $t$ connected components, and thus $l_T(\phi) \ge t - 1$, which implies $l_\chi(T) \ge t - 1$.

Now suppose that $S_1, \ldots, S_t$ are spanning-disjoint in $T$. Then construct an extension $\phi$ of $\chi$ to $T$ by first setting $\phi(u) = i$ for every vertex $u$ in $T[S_i]$, for each $i \in [t]$. (As the spanning trees are edge-disjoint and thus vertex-disjoint in $T$, this is well-defined.) For any remaining unassigned vertex $v$, if $v$ has a neighbour $u$ for which $\phi(u)$ is defined, then set $\phi(v) = \phi(u)$. Repeat this process until every vertex is assigned a colour by $\phi$. Now observe that by construction, the vertices assigned colour $i$ by $\phi$ form a connected subtree for each $i \in [t]$. Thus the forest induced by $\phi$ has exactly $t$ connected components, and so $l_T(\phi) = t - 1$. Together with the lower bound, this gives $l_\chi(T) = t - 1$.

Finally, suppose $l_\chi(T) = t - 1$, and let $\phi$ be an optimal extension of $\chi$. Then the forest $F$ induced by $\phi$ has exactly $t$ connected components, which implies by the pigeonhole principle that each $S_i$ is a subset of one connected component in $F$. Then as each $S_i$ is contained within a different connected component of $F$, the spanning trees $T[S_i]$ are also contained within these components, and so $S_1, \ldots, S_t$ are spanning-disjoint. □

2.2. Parameterized complexity and kernelization

A parameterized problem is a problem for which the inputs are of the form $(x, k)$, where $k$ is a non-negative integer, called the parameter. A parameterized problem is fixed-parameter tractable (FPT) if there exists an algorithm that solves any instance $(x, k)$ in $f(k) \cdot |x|^{O(1)}$ time, where $f()$ is a computable function depending only on $k$. A parameterized problem has a kernel of size $g(k)$, where $g()$ is a computable function depending only on $k$, if there exists a polynomial time algorithm transforming any instance $(x, k)$ into an equivalent instance $(x', k')$, with $|x'|, k' \le g(k)$. If $g(k)$ is a polynomial in $k$ then we call this a polynomial kernel; if $g(k) = O(k)$ then it is a linear kernel. It is well-known that a parameterized problem is fixed-parameter tractable if and only if it has a (not necessarily polynomial) kernel. For more information, we refer the reader to [15].

For a maximization problem $\Pi$ and $\rho \ge 1$, we say $\Pi$ has a constant factor approximation with approximation ratio $\rho$ if there exists a polynomial-time algorithm such that for any instance $\pi$ of $\Pi$, the following inequalities hold, where $\mathrm{opt}(\pi)$ denotes the maximum value of a solution to $\pi$, and $\mathrm{alg}(\pi)$ denotes the value of the solution to $\pi$ returned by the algorithm:

$$1 \le \frac{\mathrm{opt}(\pi)}{\mathrm{alg}(\pi)} \le \rho.$$

In this paper we study the following maximization problem:

Maximum Parsimony Distance (dmp)
Input: Two trees $T_1, T_2$ on a set of taxa $X$.
Output: A character $\chi$ on $X$ that maximizes $|l_\chi(T_1) - l_\chi(T_2)|$.

3. Kernel bound

3.1. Overview

In this section we give an overview of the constituent parts of our kernelization result, and how they fit together. The first step is to apply two reduction rules, the Cherry rule and the Chain rule, described in the next section. These rules correspond roughly to reduction rules that often appear in papers on computational phylogenetics. The correctness of these rules was proved in [23]; our contribution is to show that the exhaustive application of these rules grants a linear kernel, as stated in the following theorem.

Theorem 1. There exists a constant $\alpha$ ($\alpha = 560$) for which the following holds. Let $(T_1, T_2)$ be a pair of binary unrooted phylogenetic trees on $X$ that are irreducible under Reduction Rules 1 and 2. Then if $|X| \ge \alpha k$, it holds that $d_{MP}(T_1, T_2) \ge k$, and we can find a witnessing character, i.e. a character $\chi$ yielding $d_{MP_\chi}(T_1, T_2) \ge k$, in polynomial time.

This theorem, together with the correctness of the reduction rules as proved in [23], immediately implies a linear kernel for dmp.

To show how we prove the theorem, we will need to introduce some terminology as we go.

A quartet $Q$ is any set of 4 elements in $X$. If $T_1|Q \ne T_2|Q$, we say that $Q$ is a conflicting quartet for $(T_1, T_2)$.
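The quartet definition can be checked mechanically. The sketch below decides the split of $T|Q$ with the four-point condition on path lengths: for topology $ab|cd$, the sum $d(a,b) + d(c,d)$ is strictly smaller than the other two pairings. The adjacency-dictionary encoding, the helper names and the two five-leaf example trees are assumptions chosen for illustration, not taken from the paper.

```python
from collections import deque


def adjacency(edges):
    """Build an undirected adjacency dictionary from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def distances_from(adj, source):
    """Breadth-first search: number of edges from `source` to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def quartet_split(adj, quartet):
    """Return the split of T|Q as a set of two leaf pairs."""
    a, b, c, d = sorted(quartet)
    dist = {x: distances_from(adj, x) for x in (a, b, c, d)}
    pairings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    # Four-point condition: the true pairing has the smallest distance sum.
    best = min(pairings, key=lambda p: dist[p[0][0]][p[0][1]] + dist[p[1][0]][p[1][1]])
    return frozenset(frozenset(pair) for pair in best)


def is_conflicting(adj1, adj2, quartet):
    return quartet_split(adj1, quartet) != quartet_split(adj2, quartet)


# Hypothetical trees on {a, ..., e} with internal vertices u, w, v.
T1 = adjacency([("a", "u"), ("b", "u"), ("u", "w"), ("c", "w"),
                ("w", "v"), ("d", "v"), ("e", "v")])
T2 = adjacency([("a", "u"), ("b", "u"), ("u", "w"), ("d", "w"),
                ("w", "v"), ("c", "v"), ("e", "v")])
```

Here `is_conflicting(T1, T2, {"a", "c", "d", "e"})` holds, since the quartet is resolved as ac|de in the first tree and ad|ce in the second, while the quartet {a, b, c, d} is resolved as ab|cd in both.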

As a crucial step we prove that for any $S$ large enough with respect to the degree of $S$ in both $T_1$ and $T_2$, either there exists a conflicting quartet or one of the reduction rules applies.

Lemma 2. Let $S$ be a subset of $X$ with $d_1$ the degree of $S$ in $T_1$, and $d_2$ the degree of $S$ in $T_2$. If $|S| > 9(d_1 + d_2) - 12$, then either $T_1|S \ne T_2|S$ or one of Reduction Rules 1 or 2 applies to $(T_1, T_2)$. In particular if $(T_1, T_2)$ is irreducible under Rules 1 or 2 and $|S| \ge 9(d_1 + d_2) - 11$, then there exists a conflicting quartet $Q \subseteq S$, and such a quartet can be found in polynomial time.

The next result implies that if we have a large enough number of conflicting quartets that are also spanning-disjoint in both $T_1$ and $T_2$, then we are done. While it is intuitively clear that such quartets can be leveraged to create a high parsimony score in one tree, some care has to be taken to keep the parsimony score low in the other tree.

Lemma 3. Let $\mathcal{Q} = \{Q_1, \ldots, Q_k\}$ be a set of conflicting quartets for $T_1, T_2$, such that $Q_1, \ldots, Q_k$ are spanning-disjoint in $T_1$ and in $T_2$. Then $d_{MP}(T_1, T_2) \ge k$, and we can find a witnessing character in polynomial time.

In combination, Lemmas 2 and 3 allow us to show that $d_{MP}(T_1, T_2) \ge k$ provided that we can find at least $k$ sets $S_1, \ldots, S_k$ that are spanning-disjoint in both trees and satisfy the conditions of Lemma 2.

We will find $k$ such sets as part of the construction of a character that witnesses $d_{MP}(T_1, T_2) \ge k$, for any reduced instance with $|X| \ge \alpha k$. In order to construct this character, we first create a partition of $X$ into large subsets, as described by the following lemma.

Lemma 4. Suppose that $|X| \ge 2ct$ for some integers $c$ and $t$, and let $T_1$ be a phylogenetic tree on $X$. Then in polynomial time we can construct a partition $S_1, \ldots, S_t$ of $X$ with $S_1, \ldots, S_t$ spanning-disjoint in $T_1$, such that $|S_i| \ge c$ for each $i$.

We note that there is a one-to-one correspondence between partitions and characters on $X$, in the following sense. Given a partition $S_1, \ldots, S_t$ of $X$, we may define a character $\chi : X \to [t]$ such that $\chi(x) = i$ if $x \in S_i$, for each $i \in [t]$. Call such a character the character defined by $S_1, \ldots, S_t$.

Thus let us consider the character $\chi$ on $X$ defined by the partition described by Lemma 4. Since $S_1, \ldots, S_t$ are spanning-disjoint in $T_1$, Lemma 1 tells us that the parsimony score of $T_1$ with respect to $\chi$ is exactly $t - 1$.

Lemma 5. Let $\chi$ be the character defined by the partition $S_1, \ldots, S_t$ where $S_1, \ldots, S_t$ are spanning-disjoint in $T_1$, let $d_1, d_2$ be positive integers such that $d_1 d_2 - d_1 - d_2 > 0$, and assume

$$t \ge \frac{2 d_1 d_2 + d_1}{d_1 d_2 - d_1 - d_2}\, k.$$

Then either $d_{MP_\chi}(T_1, T_2) \ge k$, or in polynomial time we can find a set of indices $i_1, \ldots, i_{k'}$ with $k' \ge k$ such that:

• $S_{i_1}, \ldots, S_{i_{k'}}$ are spanning-disjoint in $T_2$ (as well as in $T_1$);
• $S_{i_j}$ has degree at most $d_1$ in $T_1$ for each $j \in [k']$; and
• $S_{i_j}$ has degree at most $d_2$ in $T_2$ for each $j \in [k']$.

We will prove Theorem 1 by combining these results in the following way. Fix integers $d_1, d_2$ to be determined later. Assume $(T_1, T_2)$ is irreducible under Reduction Rules 1 and 2, and assume that $|X| \ge 2ct$, where $c = 9(d_1 + d_2) - 11$ and $t \ge \frac{2 d_1 d_2 + d_1}{d_1 d_2 - d_1 - d_2}\, k$ (this holds if $|X| \ge \alpha k$).

By Lemma 4, there exists a partition $S_1, \ldots, S_t$ of $X$ with $S_1, \ldots, S_t$ spanning-disjoint in $T_1$ and $|S_i| \ge c$ for each $i \in [t]$. Let $\chi$ be the character defined by $S_1, \ldots, S_t$. By Lemma 5, either $d_{MP_\chi}(T_1, T_2) \ge k$, in which case we are done, or we get a set of indices $i_1, \ldots, i_k$ such that $S_{i_1}, \ldots, S_{i_k}$ are spanning-disjoint in $T_2$ (as well as in $T_1$), each $S_{i_j}$ has degree at most $d_1$ in $T_1$, and each $S_{i_j}$ has degree at most $d_2$ in $T_2$. But then each $S_{i_j}$ satisfies the conditions of Lemma 2, and therefore for each $j \in [k]$ there exists a conflicting quartet $Q_j \subseteq S_{i_j}$. Moreover, as $S_{i_1}, \ldots, S_{i_k}$ are spanning-disjoint in $T_1$ and $T_2$, the quartets $Q_1, \ldots, Q_k$ are also spanning-disjoint in $T_1$ and $T_2$. Then Lemma 3 implies that $d_{MP}(T_1, T_2) \ge k$.

By setting $d_1 = 4$ and $d_2 = 5$, we get that $\alpha = 560$, giving the desired bound.
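The arithmetic behind $\alpha = 560$ can be checked directly from the two requirements above, $|X| \ge 2ct$ with $c = 9(d_1 + d_2) - 11$ and $t \ge \frac{2 d_1 d_2 + d_1}{d_1 d_2 - d_1 - d_2} k$. The helper below is a hypothetical convenience for this check and is not part of the paper.

```python
from fractions import Fraction


def kernel_constant(d1, d2):
    """Return 2 * c * (t / k) for the given degree thresholds d1 and d2."""
    denominator = d1 * d2 - d1 - d2
    assert denominator > 0, "Lemma 5 requires d1*d2 - d1 - d2 > 0"
    c = 9 * (d1 + d2) - 11                           # block size needed by Lemma 2
    t_per_k = Fraction(2 * d1 * d2 + d1, denominator)  # blocks needed per unit of k
    return 2 * c * t_per_k


alpha = kernel_constant(4, 5)  # c = 70 and t/k = 44/11 = 4
```

With $d_1 = 4$ and $d_2 = 5$ this gives $c = 70$, $t = 4k$ and hence $2ct = 560k$.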

In the next subsections we prove each of these lemmas, and then the main theorem, in turn.

3.2. Reduction rules

We begin by stating the reduction rules for our kernelization result. In what follows, a pair $(x, y)$ with $x, y \in X$ is a cherry in a tree $T$ if there exists an internal vertex $u$ in $T$ adjacent to both $x$ and $y$. A cherry is also sometimes known in the literature as a sibling-pair. A sequence of leaves $x_1, \ldots, x_r \in X$ is a chain in $T$ if there exists a path of internal vertices $p_1, \ldots, p_r$ (possibly with $p_1 = p_2$ and possibly with $p_{r-1} = p_r$), such that for each $i \in [r]$, $p_i$ is the internal vertex adjacent to $x_i$. We call $r$ the length of this chain.

Reduction Rule 1. [Cherry reduction rule] If there exist $x, y \in X$ such that $(x, y)$ is a cherry in each of $T_1, T_2$, then replace $(T_1, T_2)$ with $(T_1|X \setminus \{x\},\ T_2|X \setminus \{x\})$.

Reduction Rule 2. [Chain reduction rule] If there exists a sequence of leaves $x_1, \ldots, x_r \in X$ such that $x_1, \ldots, x_r$ is a chain in both $T_1$ and $T_2$, and $r \ge 5$, then replace $(T_1, T_2)$ with $(T_1|X \setminus \{x_5, \ldots, x_r\},\ T_2|X \setminus \{x_5, \ldots, x_r\})$ (thus, the common chain is reduced to length 4).
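As a rough illustration of where Reduction Rule 1 applies, the sketch below lists the cherries shared by two trees. It assumes trees given as adjacency dictionaries with at least four leaves each; the function names and the example trees are hypothetical and only serve to show the check.

```python
def adjacency(edges):
    """Build an undirected adjacency dictionary from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def cherries(adj, taxa):
    """Return the cherries of a binary tree as a set of 2-element frozensets."""
    leaves_by_parent = {}
    for x in taxa:
        parent = next(iter(adj[x]))  # a leaf has exactly one neighbour
        leaves_by_parent.setdefault(parent, []).append(x)
    return {frozenset(xs) for xs in leaves_by_parent.values() if len(xs) == 2}


def common_cherries(adj1, adj2, taxa):
    """Cherries present in both trees: the places where Rule 1 can fire."""
    return cherries(adj1, taxa) & cherries(adj2, taxa)


# Hypothetical trees on {a, ..., e} with internal vertices u, w, v.
taxa = {"a", "b", "c", "d", "e"}
T1 = adjacency([("a", "u"), ("b", "u"), ("u", "w"), ("c", "w"),
                ("w", "v"), ("d", "v"), ("e", "v")])
T2 = adjacency([("a", "u"), ("b", "u"), ("u", "w"), ("d", "w"),
                ("w", "v"), ("c", "v"), ("e", "v")])
shared = common_cherries(T1, T2, taxa)
```

In this example the only common cherry is (a, b), so Rule 1 would delete one of these two leaves from both trees.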

The correctness of these rules (in the sense that they preserve $d_{MP}$) was previously proved in [23].

Theorem 2. Let $(T_1', T_2')$ be an instance of dmp derived from $(T_1, T_2)$ by an application of Reduction Rules 1 or 2. Then $d_{MP}(T_1', T_2') = d_{MP}(T_1, T_2)$.

Correctness of the chain reduction rule follows from Theorem 3.1 in [23]. Correctness of the cherry reduction rule follows as a subcase of Theorem 4.1 in [23].

Our main contribution is to show that if an instance is reduced by these rules then its size is bounded by a linear function of $d_{MP}$.

3.3. Small degree sets

In this section we prove Lemma 2.

Lemma 2. Let $S$ be a subset of $X$ with $d_1$ the degree of $S$ in $T_1$, and $d_2$ the degree of $S$ in $T_2$. If $|S| > 9(d_1 + d_2) - 12$, then either $T_1|S \ne T_2|S$ or one of Reduction Rules 1 or 2 applies to $(T_1, T_2)$. In particular if $(T_1, T_2)$ is irreducible under Rules 1 or 2 and $|S| \ge 9(d_1 + d_2) - 11$, then there exists a conflicting quartet $Q \subseteq S$, and such a quartet can be found in polynomial time.

Proof. Since unrooted binary trees are characterized by their quartets [34, Theorem 6.3.5(iii)], the last statement of the lemma follows directly.

We will show that if $T_1|S = T_2|S$ and neither of the reduction rules applies to $(T_1, T_2)$, then $|S| \le 9(d_1 + d_2) - 12$. This implies the main claim of the lemma. Let us denote $T|S = T_1|S = T_2|S$.

Consider the backbone graph of $T|S$ obtained by deleting all leaves (see Fig. 2 for an example). Let $P_C$ be the set of nodes having degree 1 on the backbone, which we refer to as parents of a cherry in $T|S$. Let $P_L$ be the set of nodes having degree 2 on the backbone, which we refer to as parents of a leaf of $T|S$. All remaining vertices on the backbone have degree 3. Thus $|S|$, the total number of leaves of $T|S$, is $2|P_C| + |P_L|$. We call the path between any two odd degree vertices on the backbone, having internal nodes only in $P_L$, a side of the backbone.

First notice that for each cherry in $T|S$, there must exist in $T_1[S]$, the spanning tree on $S$ in $T_1$, or in $T_2[S]$ a node, incident to a pending edge of $S$, between at least one of its two leaves and its corresponding node in $P_C$. Otherwise Reduction Rule 1 can be applied. In particular this implies that $|P_C| \le d_1 + d_2$.

Thus at least $|P_C|$ of the $d_1 + d_2$ pending edges must be used for “cutting” the cherries, each of them cutting 1 leaf of a cherry. Let us choose one such leaf from each cherry, and call these the cut-leaves.

After removing cut-leaves, every node in $P_C$ and $P_L$ is now the parent of 1 leaf in $T|S$. Every side of the backbone contains at most 4 vertices in $P_C$ and $P_L$, unless $T_1[S]$ or $T_2[S]$ has a node of a pending edge of $S$ or a node adjacent to a node of a pending edge on that side. We show that every such pending edge on a side may increase the number of $P_L$-nodes on that side by at most 5 (see Fig. 2). Indeed, suppose a side of the backbone has in total $d$ pending edges of $S$ in both $T_1$ and $T_2$, but more than $4 + 5d$ nodes in $P_L$, i.e. at least $5(d + 1)$. Then $T|S$ contains a chain of length $5(d + 1)$, which we can split up into $d + 1$ chains of length 5. Clearly at least one of these chains has no pending edge in either $T_1$ or $T_2$, and so $T_1, T_2$ have a common chain of length 5, a contradiction.

Fig. 2. Example illustration of the backbone of $T|S = T_1|S = T_2|S$ within $T_1$ and $T_2$, where $S = \{s_1, \ldots, s_{29}\}$. Edges and vertices of the backbone are in bold. Observe that $T|S$ has the chain $s_1, \ldots, s_9$, but $(T_1, T_2)$ do not have a common chain of length greater than 4, as the leaf $s_5$ has a sibling $a$ in $T_2$.

Thus the total number of nodes from $P_C$ and $P_L$ on a side is at most five times the number of pending edges of $S$ (in $T_1[S]$ or $T_2[S]$) on that side, plus 4. Otherwise Reduction Rule 2 can be applied. Given that we already used $|P_C|$ pending edges for cutting the cherries, we have $d_1 + d_2 - |P_C|$ pending edges left to be distributed over the sides.

The number of sides on the backbone is the number of edges in an unrooted binary tree with $|P_C|$ leaves, which is $2|P_C| - 3$. Therefore the total number of leaves of $T|S$ is

$$|S| = 2|P_C| + |P_L| \le |P_C| + 4(2|P_C| - 3) + 5(d_1 + d_2 - |P_C|) = 4|P_C| + 5(d_1 + d_2) - 12.$$

Clearly, this attains its largest value if $|P_C| = d_1 + d_2$, in which case $|S| \le 9(d_1 + d_2) - 12$, as was to be proven. □

3.4. Combining conflicting quartets

Lemma 3. Let $\mathcal{Q} = \{Q_1, \ldots, Q_k\}$ be a set of conflicting quartets for $T_1, T_2$, such that $Q_1, \ldots, Q_k$ are spanning-disjoint in $T_1$ and in $T_2$. Then $d_{MP}(T_1, T_2) \ge k$, and we can find a witnessing character in polynomial time.

Proof. For a quartet $Q$ and tree $T$, we say that $T|Q = ab|cd$ if $Q = \{a, b, c, d\}$ and in $T$ the path between $a$ and $b$ is edge-disjoint from the path between $c$ and $d$. Without loss of generality, we may assume $Q_i = \{a_i, b_i, c_i, d_i\}$, $T_1|Q_i = a_i b_i | c_i d_i$ and $T_2|Q_i = a_i c_i | b_i d_i$ for each $i \in [k]$.

We will show how to build a character $\chi$ with two states, such that $l_\chi(T_1) \le k$ and $l_\chi(T_2) \ge 2k$. This shows that $d_{MP_\chi}(T_1, T_2) \ge k$, as required.

The idea is to construct $\chi$ in such a way that, for each quartet $Q_i$, $\chi(a_i) = \chi(b_i) \ne \chi(c_i) = \chi(d_i)$. This will ensure that $l_\chi(T_2)$ is at least $2k$, as $T_2$ will have at least $2k$ edge-disjoint paths (from $a_i$ to $c_i$ and from $b_i$ to $d_i$, for each $i \in [k]$) that each require at least one change in state along some edge.

For each $Q_i$, let $e_{Q_i}$ denote an edge in $T_1$ such that in $T_1[Q_i]$, $e_{Q_i}$ is on the path that separates $\{a_i, b_i\}$ from $\{c_i, d_i\}$. Now we construct a function $\phi : V(T_1) \to \{\text{red}, \text{blue}\}$ as follows. Start by choosing an arbitrary leaf in $T_1$, say without loss of generality $a_1$, and set $\phi(a_1) = \text{red}$. Now proceed as follows. For any edge $uv$ in $T_1$ such that $\phi(u)$ is defined but $\phi(v)$ is not, we set $\phi(v) = \phi(u)$, unless $uv = e_{Q_i}$ for some $i$. In that case, we set $\phi(v) = \text{blue}$ if $\phi(u) = \text{red}$, and set $\phi(v) = \text{red}$ otherwise.

Now we can let $\chi$ be the restriction of $\phi$ to $X$. By construction, $\phi$ is an extension of $\chi$ to $T_1$ and $l_{T_1}(\phi) = |\{e_{Q_i} : i \in [k]\}| = k$. This is enough to show that $l_\chi(T_1) \le k$.

We now show that $\chi(a_i) = \chi(b_i) \ne \chi(c_i) = \chi(d_i)$, for each $i \in [k]$. To see this, consider the spanning tree $T_1[Q_i]$. By construction, $T_1[Q_i]$ contains the edge $e_{Q_i}$ and $e_{Q_i}$ separates $\{a_i, b_i\}$ from $\{c_i, d_i\}$. Let $u_i, v_i$ be the vertices of $e_{Q_i}$, with $u_i$ the vertex closer to $a_i$ and $b_i$. Note that $T_1[Q_i]$ cannot contain $e_{Q_j}$ for any $j \ne i$, as $T_1[Q_i]$ and $T_1[Q_j]$ are edge-disjoint. It follows that $u_i, a_i, b_i$ are all assigned the same value by $\phi$ and $v_i, c_i, d_i$ are assigned the opposite value. Thus by definition of $\chi$, we have $\chi(a_i) = \chi(b_i) = \phi(u_i) \ne \phi(v_i) = \chi(c_i) = \chi(d_i)$.

It remains to observe that as $Q_1, \dots, Q_k$ are spanning-disjoint in $T_2$, the $a_i$-$c_i$ and $b_i$-$d_i$ paths in $T_2$ are pairwise edge-disjoint for all $i \in [k]$. Then as $\chi(a_i) \ne \chi(c_i)$ and $\chi(b_i) \ne \chi(d_i)$, there exist at least $2k$ edges $uv$ in $T_2$ with $\phi_2(u) \ne \phi_2(v)$, for any extension $\phi_2$ of $\chi$ to $T_2$. It follows that $\ell_\chi(T_2) \ge 2k$, and so $d_{MP}(T_1, T_2) \ge d_{MP_\chi}(T_1, T_2) = |\ell_\chi(T_1) - \ell_\chi(T_2)| \ge 2k - k = k$.

Since each edge is processed at most once in the construction of $\chi$, it is clear that this construction takes polynomial time.



3.5. Constructing an initial partition

In this section we prove Lemma 4.

Lemma 4. Suppose that $|X| \ge 2ct$ for some integers $c$ and $t$, and let $T_1$ be a phylogenetic tree on $X$.

Then in polynomial time we can construct a partition $S_1, \dots, S_t$ of $X$ with $S_1, \dots, S_t$ spanning-disjoint in $T_1$, such that $|S_i| \ge c$ for each $i$.

Proof. We prove the claim by induction on $t$. For the base case, if $t = 1$ then we may let $S_1 = X$, and we have the desired partition.

For the inductive step, assume $|X| \ge 2ct$ and that the claim is true for smaller values of $t$. We first fix an arbitrary rooting on $T_1$. That is, choose an arbitrary edge $e$ in $T_1$ and subdivide it with a new (temporary) vertex $r$, then orient all edges in $T_1$ away from $r$. Under this rooting, let $u$ be a lowest vertex in $T_1$ for which $u$ has at least $c$ descendants in $X$. Let $S_t \subseteq X$ be the set of these descendants. Note that since $T_1$ is binary, $|S_t| < 2c$, as otherwise one of the two children of $u$ would be a lower vertex with at least $c$ descendants.

Now consider the induced subtree $T_1|X'$, where $X' = X \setminus S_t$. As $|S_t| < 2c$, we have $|X'| \ge 2c(t-1)$. Then by the inductive hypothesis, we can construct a partition $S_1, \dots, S_{t-1}$ of $X'$ with $S_1, \dots, S_{t-1}$ spanning-disjoint in $T_1|X'$, such that $|S_i| \ge c$ for each $i$. By construction it is clear that $S_t$ is spanning-disjoint in $T_1$ from $S_1, \dots, S_{t-1}$. Thus $S_1, \dots, S_t$ is the desired partition.

As the construction of $S_t$ can be done in polynomial time and this process is repeated $t \le |X|$ times, the entire process takes polynomial time.
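The inductive construction can be unrolled into a loop. The following Python sketch assumes the tree has already been rooted and is given as a `children` dictionary (leaves map to an empty list); the function names are hypothetical. Instead of building the induced subtree $T_1|X'$ explicitly, it simply ignores leaves that have already been assigned, which selects the same leaf sets. Blocks are returned in the order they are carved off, so the first block corresponds to $S_t$ in the proof.

```python
def leaves_below(children, u):
    """Return the set of leaves in the subtree rooted at u."""
    found, stack = set(), [u]
    while stack:
        v = stack.pop()
        if children[v]:
            stack.extend(children[v])
        else:
            found.add(v)
    return found


def spanning_disjoint_partition(children, root, c, t):
    """Split the leaves of a rooted binary tree into t blocks of size >= c."""
    remaining = leaves_below(children, root)
    if c < 1 or t < 1 or len(remaining) < 2 * c * t:
        raise ValueError("need c, t >= 1 and at least 2*c*t leaves")

    # Parents appear before children; reversed, this is a bottom-up order.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    blocks = []
    for _ in range(t - 1):
        # Count the not-yet-assigned leaves below every vertex.
        count = {}
        for v in reversed(order):
            if children[v]:
                count[v] = sum(count[w] for w in children[v])
            else:
                count[v] = 1 if v in remaining else 0
        # Walk down to a lowest vertex with at least c remaining leaves.
        u = root
        while True:
            heavy = [w for w in children[u] if count[w] >= c]
            if not heavy:
                break
            u = heavy[0]
        block = leaves_below(children, u) & remaining
        blocks.append(block)
        remaining -= block
    blocks.append(remaining)  # base case: everything left forms one block
    return blocks


# Complete binary tree with 16 leaves (heap numbering), c = 2, t = 4.
tree = {v: ([2 * v, 2 * v + 1] if v < 16 else []) for v in range(1, 32)}
for block in spanning_disjoint_partition(tree, 1, 2, 4):
    print(sorted(block))
```

Each block except the last has between $c$ and $2c - 1$ leaves, as in the proof; the last block absorbs whatever remains.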



3.6. Well-behaved sets

In this section we prove Lemma 5. We start with an observation:

Observation 1. For any (not necessarily binary) unrooted tree $T$ with $n$ vertices, and any integer $d \ge 1$, the number of vertices in $T$ with degree strictly greater than $d$ is at most $n/d$.$^{1}$


Proof. For each vertex $v$ in $T$ let $d(v)$ denote the degree of $v$. Recall that an unrooted tree with $n$ vertices has exactly $n - 1$ edges. It follows that

$$\sum_{v \in V(T)} d(v) = 2|E(T)| = 2n - 2.$$

Now suppose that $T$ has $m > n/d$ vertices with degree strictly greater than $d$, i.e. at least $d + 1$. The remaining $n - m$ vertices all have degree at least 1, from which it follows that

$$\sum_{v \in V(T)} d(v) \ge m(d + 1) + n - m = md + n \ge (n/d)d + n = 2n,$$

a contradiction.
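As a quick sanity check of Observation 1, the bound can be tested empirically. The sketch below is not part of the proof; it builds random trees by attaching each new vertex to a random earlier one (an arbitrary choice of generator, with hypothetical function names) and counts the vertices of degree greater than $d$.

```python
import random


def random_tree(n, seed=0):
    """Random tree on vertices 0..n-1: vertex v attaches to a random u < v."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(1, n):
        u = rng.randrange(v)
        adj[u].add(v)
        adj[v].add(u)
    return adj


def high_degree_count(adj, d):
    """Number of vertices with degree strictly greater than d."""
    return sum(1 for v in adj if len(adj[v]) > d)


# Check m <= n/d, written as m*d <= n to stay in integer arithmetic.
for n in range(2, 30):
    tree = random_tree(n, seed=n)
    for d in range(1, n + 1):
        assert high_degree_count(tree, d) * d <= n
```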



Lemma 5. Let $\chi$ be the character defined by the partition $S_1, \dots, S_t$ where $S_1, \dots, S_t$ are spanning-disjoint in $T_1$, let $d_1, d_2$ be positive integers such that $d_1 d_2 - d_1 - d_2 > 0$, and assume

$$t \ge \left(\frac{2d_1d_2 + d_1}{d_1d_2 - d_1 - d_2}\right) k.$$

Then either $d_{MP_\chi}(T_1, T_2) \ge k$, or in polynomial time we can find a set of indices $i_1, \dots, i_{k'}$ with $k' \ge k$ such that:

• $S_{i_1}, \dots, S_{i_{k'}}$ are spanning-disjoint in $T_2$ (as well as in $T_1$);

• $S_{i_j}$ has degree at most $d_1$ in $T_1$ for each $j \in [k']$; and

• $S_{i_j}$ has degree at most $d_2$ in $T_2$ for each $j \in [k']$.

Proof. By Lemma 1, $\ell_\chi(T_1) = t - 1$. If $\ell_\chi(T_2) \ge t + k - 1$, then $d_{MP_\chi}(T_1, T_2) \ge k$ as required. So we may assume that $\ell_\chi(T_2) \le t + k - 2$. Let $\delta = \ell_\chi(T_2) - \ell_\chi(T_1)$, and observe that $0 \le \delta \le k - 1$.

We now construct a partition $P_1, \dots, P_s$ of $X$ which is spanning-disjoint in $T_2$ (see Fig. 3 for an illustration). Let $\phi_2$ be an optimal extension of $\chi$ to $T_2$. As $\ell_\chi(T_2) = \ell_\chi(T_1) + \delta = t + \delta - 1$, the forest induced by $\phi_2$ has exactly $s$ monochromatic connected components, where $s = t + \delta$. Let $P_1, \dots, P_s$ be the partition of $X$ formed by taking the intersection of $X$ with the vertex set of each tree in this forest. Observe that by construction $P_1, \dots, P_s$ are spanning-disjoint in $T_2$, and that furthermore each $P_j$ is a subset of $S_i$ for some $i \in [t]$ (as each element of $P_j$ is assigned the same value by $\phi_2$, and thus by $\chi$).
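To make the quantities $\ell_\chi(T_1)$, $\ell_\chi(T_2)$ and $\delta$ concrete, the sketch below computes parsimony scores with Fitch's algorithm. Fitch's algorithm is a standard polynomial-time method for binary trees; the text above does not name a specific algorithm, so its use here is an assumption, as are the function name and the rooted `children` representation (rooting an unrooted binary tree on an arbitrary edge does not change the score).

```python
def fitch_score(children, root, chi):
    """Parsimony score of character chi on a rooted binary tree (Fitch).

    children -- dict: vertex -> list of children (empty list for leaves)
    chi      -- dict: leaf -> state
    """
    # Parents before children; reversed, this is a bottom-up order.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    states, score = {}, 0
    for v in reversed(order):
        if not children[v]:
            states[v] = {chi[v]}
            continue
        child_sets = [states[w] for w in children[v]]
        common = set.intersection(*child_sets)
        if common:
            states[v] = common
        else:
            # No state shared by the children: one change is unavoidable here.
            states[v] = set.union(*child_sets)
            score += 1
    return score


leaves = {x: [] for x in "abcd"}
t1 = {"r": ["u", "v"], "u": ["a", "b"], "v": ["c", "d"], **leaves}  # ab|cd
t2 = {"r": ["u", "v"], "u": ["a", "c"], "v": ["b", "d"], **leaves}  # ac|bd
chi = {"a": 0, "b": 0, "c": 1, "d": 1}  # character of the partition {a,b},{c,d}

print(fitch_score(t1, "r", chi))  # 1, i.e. t - 1 with t = 2
print(fitch_score(t2, "r", chi))  # 2, so delta = 1 and s = t + delta = 3
```

In this toy instance the partition $\{a,b\},\{c,d\}$ is split into $s = 3$ monochromatic components in the second tree, one more than the number of states.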

Now let $I \subseteq [t]$ denote the set of indices $i$ in $[t]$ such that

• $S_i = P_j$ for some $j \in [s]$;

• $S_i$ has degree at most $d_1$ in $T_1$; and

• $S_i$ has degree at most $d_2$ in $T_2$.

Note that since $P_1, \dots, P_s$ are spanning-disjoint in $T_2$, the sets $\{S_i : i \in I\}$ are also spanning-disjoint in $T_2$. Notice that it is sufficient to prove that $|I| \ge k$, whence any subset of $k$ indices from $I$ satisfies the lemma. We will prove this by providing upper bounds on the number of indices in $[t]$ that do not satisfy the conditions of $I$.

Let $I_0$ denote the set of indices $i \in [t]$ such that $P_j \ne S_i$ for any $j \in [s]$. We first claim that $|I_0| \le \delta$. Indeed, since every $P_j$ is a subset of some $S_i$, and $S_1, \dots, S_t$ and $P_1, \dots, P_s$ are both partitions of $X$, we have that for every $i \in I_0$, there exist at least two distinct indices $j, j' \in [s]$ for which $P_j, P_{j'} \subseteq S_i$. Hence,

$$s \ge 2|I_0| + |[t] \setminus I_0| = t + |I_0|.$$

Therefore if $|I_0| > \delta$ then $s > t + \delta$, contradicting the definition of $s$. Thus, we have $|I_0| \le \delta$.

Next, let $I_{>d_1}$ denote the set of indices $i \in [t]$ for which $S_i$ has degree greater than $d_1$ in $T_1$. We will show that $|I_{>d_1}| \le t/d_1$. For each $i \in [t]$, compress the spanning subtree $T_1[S_i]$ to a single vertex, and observe that the degree of this vertex is equal to the degree of $S_i$ in $T_1$. Any vertex $u$ which is not part of any $T_1[S_i]$ is merged with one of its neighbours. Note that this merging process can only increase the degrees of the remaining vertices. Call the resulting tree $T_1'$. See Fig. 4.

$T_1'$ has $t$ vertices, each of them corresponding to a subset $S_i$, and having degree at least the degree of the corresponding $S_i$ in $T_1$. Now by Observation 1, there are at most $t/d_1$ vertices in $T_1'$ with degree greater than $d_1$. It follows that there are at most $t/d_1$ values of $i \in [t]$ for which $S_i$ has degree greater than $d_1$ in $T_1$, and thus $|I_{>d_1}| \le t/d_1$ as we wanted to show.

Similarly let $J_{>d_2}$ denote the set of indices $j \in [s]$ for which $P_j$ has degree greater than $d_2$ in $T_2$. By similar arguments as used for $I_{>d_1}$ above, we can show that $|J_{>d_2}| \le s/d_2$.

Fig. 3. Illustration of the construction of partition $P_1, P_2, P_3, P_4, P_5$ from $S_1, S_2, S_3$. Solid edges are monochromatic and dashed edges are bichromatic under an optimal extension for $\chi$, where $\chi$ is the character induced by $S_1, S_2, S_3$.

Fig. 4. Illustration of the construction of auxiliary tree $T_1'$, given a partition of $X$ with $S_1 = \{a, b, c\}$, $S_2 = \{d, e, f\}$, $S_3 = \{g, h, i\}$, $S_4 = \{j, k\}$, $S_5 = \{l, m\}$. Note that the internal vertex labelled $u$ is not part of $T_1[S_i]$ for any $i$, so we merge it with an arbitrary adjacent vertex. In this case we merge $u$ into …

Notice that for any $i \in [t]$, if $i$ is not in $I$, then either $i \in I_0$, or $i \in I_{>d_1}$, or there exists $j \in J_{>d_2}$ such that $S_i = P_j$. We therefore have that

$$|I| \ge t - |I_0| - |I_{>d_1}| - |J_{>d_2}|.$$

Now, using that $t \ge \left(\frac{2d_1d_2 + d_1}{d_1d_2 - d_1 - d_2}\right) k$, $s = t + \delta$ and $\delta \le k - 1$, we have:

$$
\begin{aligned}
|I| &\ge t - \delta - t/d_1 - s/d_2 \\
&= t - \delta - t/d_1 - (t + \delta)/d_2 \\
&= \frac{d_1d_2t - d_1d_2\delta - d_2t - d_1t - d_1\delta}{d_1d_2} \\
&= \frac{(d_1d_2 - d_1 - d_2)t - (d_1d_2 + d_1)\delta}{d_1d_2} \\
&\ge \frac{(d_1d_2 - d_1 - d_2)t - (d_1d_2 + d_1)(k - 1)}{d_1d_2} \\
&\ge \frac{(2d_1d_2 + d_1)k - (d_1d_2 + d_1)(k - 1)}{d_1d_2} \\
&= \frac{d_1d_2k + d_1d_2 + d_1}{d_1d_2} \\
&> \frac{d_1d_2k}{d_1d_2} = k,
\end{aligned}
$$

as we needed to prove. To see that $I$ can be constructed in polynomial time, it suffices to observe that the partition $P_1, \dots, P_s$ can be constructed in polynomial time (as $\phi_2$ can be found in polynomial time), and after this each $S_i$ can be checked for membership in $I$ in polynomial time.
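The counting argument at the end of this proof is easy to check numerically. The sketch below evaluates the quantity $t - \delta - t/d_1 - (t + \delta)/d_2$ with exact rational arithmetic for the smallest admissible integer $t$ and every $\delta \in \{0, \dots, k - 1\}$. The function names are hypothetical, and taking the ceiling to obtain an integer $t$ is an assumption made for this illustration.

```python
from fractions import Fraction
from math import ceil


def min_t(d1, d2, k):
    """Smallest integer t with t >= (2*d1*d2 + d1) / (d1*d2 - d1 - d2) * k."""
    denom = d1 * d2 - d1 - d2
    if denom <= 0:
        raise ValueError("need d1*d2 - d1 - d2 > 0")
    return ceil(Fraction((2 * d1 * d2 + d1) * k, denom))


def index_bound(t, delta, d1, d2):
    """Lower bound t - delta - t/d1 - (t + delta)/d2 on the size of I."""
    return t - delta - Fraction(t, d1) - Fraction(t + delta, d2)


# Worst case delta = k - 1 with d1 = 4, d2 = 5, k = 2 (so t = 8):
print(index_bound(min_t(4, 5, 2), 1, 4, 5))  # 16/5, which exceeds k = 2

# The bound stays above k over a small grid of parameters.
for d1 in range(2, 7):
    for d2 in range(2, 7):
        if d1 * d2 - d1 - d2 <= 0:
            continue
        for k in range(1, 9):
            t = min_t(d1, d2, k)
            assert all(index_bound(t, delta, d1, d2) > k for delta in range(k))
```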



3.7. Proof of Theorem 1

Lemma 6. Let $d_1, d_2$ be positive integers such that $d_1d_2 - d_1 - d_2 > 0$. Let $(T_1, T_2)$ be a pair of binary unrooted phylogenetic trees on $X$ that are irreducible under Reduction Rules 1 and 2.

Then if $|X| \ge 2ct$, where $c = 9(d_1 + d_2) - 11$ and $t = \left(\frac{2d_1d_2 + d_1}{d_1d_2 - d_1 - d_2}\right) k$, it holds that $d_{MP}(T_1, T_2) \ge k$, and we can find a witnessing character in polynomial time.

Proof. By Lemma 4, there exists a partition $S_1, \dots, S_t$ of $X$, all spanning-disjoint in $T_1$, and with $|S_i| \ge c$ for all $i \in [t]$. Let $\chi$ be the character defined by $S_1, \dots, S_t$. If $\chi$ is a witness to $d_{MP}(T_1, T_2) \ge k$, then we may return $\chi$ and we are done.

Otherwise, we may apply Lemma 5 to find indices $i_1, \dots, i_k$ such that:

• $S_{i_1}, \dots, S_{i_k}$ are all spanning-disjoint in $T_2$ (as well as in $T_1$);

• each $S_{i_j}$ has degree at most $d_1$ in $T_1$; and

• each $S_{i_j}$ has degree at most $d_2$ in $T_2$.

Now for each $S_{i_j}$, we have that $S_{i_j}$ has degree $d_1^j \le d_1$ in $T_1$ and $d_2^j \le d_2$ in $T_2$, and that

$$|S_{i_j}| \ge c = 9(d_1 + d_2) - 11 \ge 9(d_1^j + d_2^j) - 11,$$

and also that $(T_1, T_2)$ is irreducible under Rules 1 and 2. Thus we may apply Lemma 2 to find a conflicting quartet $Q_j \subseteq S_{i_j}$ for each $i_j$.

Finally, as $S_{i_1}, \dots, S_{i_k}$ are spanning-disjoint in both $T_1$ and $T_2$, and as each $Q_j$ is a subset of $S_{i_j}$, we have that $Q_1, \dots, Q_k$ are also spanning-disjoint in both $T_1$ and $T_2$. Therefore we may apply Lemma 3 to find a witnessing character for $d_{MP}(T_1, T_2) \ge k$. As each step of this process takes polynomial time, the construction of a witnessing character takes polynomial time.



It remains to complete the proof of Theorem 1.

Theorem 1. There exists a constant $\alpha$ ($\alpha = 560$) for which the following holds. Let $(T_1, T_2)$ be a pair of binary unrooted phylogenetic trees on $X$ that are irreducible under Reduction Rules 1 and 2.

Then if $|X| \ge \alpha k$, it holds that $d_{MP}(T_1, T_2) \ge k$, and we can find a witnessing character, i.e. a character $\chi$ yielding $d_{MP_\chi}(T_1, T_2) \ge k$, in polynomial time.
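The constant $\alpha = 560$ can be related to Lemma 6 by evaluating $2ct/k$ for concrete degree bounds. The sketch below does this with exact arithmetic. The choice $d_1 = 4$, $d_2 = 5$ is an assumption made here because it reproduces $560$ with an integer ratio $t/k$; this excerpt does not state which values the proof uses. The function name is hypothetical.

```python
from fractions import Fraction


def kernel_constant(d1, d2):
    """Value of 2*c*t/k from Lemma 6 for the degree bounds d1 and d2."""
    denom = d1 * d2 - d1 - d2
    if denom <= 0:
        raise ValueError("need d1*d2 - d1 - d2 > 0")
    c = 9 * (d1 + d2) - 11                      # minimum block size
    t_per_k = Fraction(2 * d1 * d2 + d1, denom)  # number of blocks per unit of k
    return 2 * c * t_per_k


print(kernel_constant(4, 5))  # 560: c = 70 and t = 4k
print(kernel_constant(3, 3))  # 602: c = 43 and t = 7k
```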
