
Dynamic protein assemblies in homologous recombination with single DNA molecules



Thesis

for the degree of doctor at the Technische Universiteit Delft, by the authority of the Rector Magnificus prof. dr. ir. J.T. Fokkema, chairman of the Board for Doctorates, to be defended in public on Wednesday 17 October 2007 at 10:00 by

Adrianus Hendricus (Thijn) van der Heijden

Composition of the doctoral committee:

Rector Magnificus, chairman
Prof. Dr. C. Dekker, Technische Universiteit Delft, promotor
Dr. C. Wyman, Erasmus Medical Center
Prof. Dr. S. Kowalczykowski, University of California, United States
Prof. Dr. A. Stasiak, Université de Lausanne, Switzerland
Dr. Ir. S.J.T. van Noort, Universiteit Leiden
Prof. Dr. D. Frenkel, Universiteit van Amsterdam and AMOLF
Prof. Dr. I.T. Young, Technische Universiteit Delft
Prof. Dr. H.W.M. Salemink, Technische Universiteit Delft, reserve member

Keywords: DNA-protein interaction, homologous recombination, single molecule, magnetic tweezers, atomic force microscopy

Published by: Thijn van der Heijden

Cover design: Thijn van der Heijden & Frank van Heesch

Printed by: Ponsen & Looijen B.V.

The production of this thesis was financially supported by Delft University of Technology, Global Fruit, T. van der Heijden Holding B.V, and Van der Heijden Holding Mariahout B.V.

An electronic version of this thesis, including color figures, is available at: http://www.library.tudelft.nl/dissertations/

Casimir PhD series, Delft-Leiden, 2007-08

ISBN: 978-90-8593-038-9

Contents

1 Introduction
1.1 DNA repair
1.2 Properties of DNA-repair proteins
1.2.1 Prokaryotic RecA
1.2.2 Eukaryotic Rad51
1.2.3 Eukaryotic Rad50
1.3 Tools to study individual protein-DNA interaction
1.3.1 Atomic Force Microscopy
1.3.2 Magnetic Tweezers
1.4 Outline of this thesis
Bibliography

2 DNA repair by homologous recombination, a highly dynamic process
2.1 Introduction
2.2 Filament assembly
2.3 Role of ATP hydrolysis
2.4 Rearrangement of the nucleoprotein filament
2.5 Disassembly
2.6 Outlook
Bibliography

3 Monte Carlo simulations of protein assembly, disassembly, and translocation on DNA
3.1 Introduction
3.2 Description of the model
3.3 Methods
3.4 Results
3.4.1 Non-cooperative binding
3.4.3 Multimeric binding and Hill coefficient
3.4.4 Dissociation
3.4.5 Rearrangements
3.4.6 Combination of processes
3.5 Discussion
3.6 Application of Monte Carlo modeling
3.7 Conclusion
3.8 Acknowledgements
Bibliography

4 Real-time assembly and disassembly of human RAD51 filaments on individual DNA molecules
4.1 Introduction
4.2 Materials & Methods
4.2.1 RAD51 purification
4.2.2 DNA substrates
4.2.3 Magnetic Tweezers
4.2.4 Flow cell
4.2.5 RAD51/DNA reactions
4.2.6 Monte Carlo Simulations
4.3 Results
4.3.1 Assembly/disassembly of RAD51 onto/from double-strand DNA
4.3.2 Assembly/disassembly of RAD51 onto/from single-strand DNA
4.3.3 Monte Carlo simulations of nucleoprotein filament formation
4.3.4 Growth by monomers or multimers
4.3.5 Quantitative comparison of model and data
4.3.6 Quantifying dissociation
4.3.7 Final coverage of RAD51-coated DNA
4.3.8 Implications for the filament structure
4.4 Discussion
4.5 Acknowledgements
Bibliography
Supplementary Information

5 AFM tip-induced dissociation of RecA-dsDNA filaments
5.1 Introduction
5.2.1 Tip-sample interaction strongly influences RecA filament disassembly
5.2.2 Size determination from AFM images is highly dependent on tip shape and requires quantification
5.3 Conclusions
Bibliography
Supplementary Information
Bibliography

6 Torque-limited RecA polymerization on dsDNA
6.1 Introduction
6.2 Materials & Methods
6.2.1 Magnetic tweezers
6.2.2 DNA substrates
6.2.3 Flow cell
6.2.4 RecA/DNA reactions
6.3 Results
6.3.1 Torsionally unconstrained dsDNA
6.3.2 Torsionally constrained dsDNA
6.4 Data analysis & Discussion
6.5 Conclusion
6.6 Acknowledgments
Bibliography

7 Translocation of RecA on single-stranded DNA in the presence of ATP hydrolysis
7.1 Introduction
7.2 Materials & Methods
7.2.1 RecA protein
7.2.2 DNA substrates
7.2.3 Magnetic Tweezers
7.2.4 Flow cell
7.2.5 RecA/DNA reactions
7.2.6 Monte Carlo Simulations
7.3 Results & Discussion
7.3.1 Assembly/disassembly in the presence of Mg2+
7.3.2 Assembly/disassembly in the presence of Ca2+
7.3.4 Monte Carlo simulations of RecA-single-strand DNA interaction
7.4 Conclusion
7.5 Acknowledgements
Bibliography

8 Homologous recombination in real time: DNA strand exchange by RecA
8.1 Introduction
8.2 Materials & Methods
8.2.1 DNA substrates
8.2.2 Magnetic Tweezers
8.2.3 Flow cell
8.2.4 RecA/DNA reactions
8.2.5 D-loop assay
8.3 Results & Discussion
8.3.1 Rate of joint-molecule formation in the presence of ATPγS
8.3.2 Kinetics of strand invasion and exchange in the presence of ATP
8.3.3 Structure of the joint molecule
8.3.4 Length of DNA synapsis during strand exchange
8.3.5 Structure of DNA after strand exchange
8.4 Acknowledgements
Bibliography

9 The coiled-coil of the human Rad50 DNA repair protein contains specific segments of increased flexibility
9.1 Introduction
9.2 Materials & Methods
9.2.1 Analysis of DNA flexibility with the local curvature method
9.3 Results
9.3.1 Quantitative analysis of flexibility along the Rad50 coiled-coil
9.3.2 Mapping Rad50 regions of increased flexibility on its amino acid sequence
9.4 Discussion
9.5 Acknowledgments

10 High flexibility of DNA on short length scales probed by atomic force microscopy
10.1 Introduction
10.2 Materials & Methods
10.2.1 Sample preparation and AFM
10.2.2 Image analysis
10.2.3 Data Analysis
10.3 Results
10.3.1 The WLC
10.3.2 AFM measurements of DNA contours
10.3.3 Anharmonic model
10.4 Discussion
10.5 Acknowledgments
Bibliography
Supplementary Information
10.A Theory
10.A.1 Scope of model
10.A.2 Scale dependence in equilibrium statistical physics
10.A.3 Model-independent tests
10.A.4 Tests that distinguish different models
10.A.5 Relation to other work
10.B Materials & Methods
10.B.1 Sample preparation, AFM imaging, and control experiments
10.B.2 Image analysis
10.C Monte Carlo evaluation of models
10.C.1 Monte Carlo code
10.C.2 Simulated data
10.C.3 Excise big kinks
10.D Other calculations
10.D.1 Force–extension and Cyclization
10.D.2 Nematic ordering
10.D.3 Comparison to kinkable WLC theory
10.D.4 Rounded energy function
10.D.5 Comparison to Du et al.
10.E Out-of-equilibrium adsorption model
Bibliography

Summary (Samenvatting)

Acknowledgements (Dankwoord)

Curriculum Vitae


Introduction


All living creatures are made of cells, small membrane-bounded compartments filled with a concentrated aqueous solution of molecules. The blueprints of the cell, containing the instructions to construct proteins and RNA molecules, are stored in long polymers called deoxyribonucleic acid (DNA). The DNA segments that carry this genetic information are called genes. Upon cell division, the genetic information stored in these molecules should be passed on to the next generation with as few errors as possible. The long-term survival of a species may be enhanced by genetic changes, but the survival of the individual demands genetic stability. Maintaining genetic stability requires not only an extremely accurate mechanism for replicating the DNA before cell division, but also mechanisms for repairing the many accidental lesions that occur continually in DNA. Most such spontaneous changes in DNA are temporary because they are immediately corrected by processes that are collectively called DNA repair. Despite the thousands of random changes created every day in the DNA of a human cell by heat, metabolic accidents, and external damaging agents like ultraviolet light, only a few stable changes (mutations) accumulate in the DNA sequence of an average cell in a year. We now know that in eukaryotes fewer than one in a thousand accidental base changes in DNA causes a mutation; the rest are eliminated with remarkable efficiency by DNA repair.

DNA can be damaged in many different ways (see Figure 1.1). The type of DNA damage produced depends on the type of mutagen. For example, ultraviolet light mostly damages DNA by producing thymine dimers, which are cross-links between adjacent pyrimidine bases in a DNA strand. On the other hand, oxidants such as free radicals within the cell's interior produce multiple forms of damage, ranging from single base mutations to a complete break of both strands of the double helix. Many mutagens intercalate into the space between two adjacent base pairs. Binding of an intercalator distorts the DNA strands by unwinding the double helix. Such structural changes of the DNA molecule may inhibit both transcription and DNA replication, causing toxicity and mutations. As a result, DNA intercalators are often carcinogens; examples are ethidium bromide and benzopyrene, the major mutagen in tobacco smoke. Because they inhibit DNA transcription and replication, intercalators are also used in chemotherapy to inhibit rapidly growing cancer cells.

1.1 DNA repair

Figure 1.1: Different types of DNA damage. Under the influence of internal and external DNA-damaging agents, DNA constantly undergoes random changes. These changes can vary from single nucleotide mutations or deletions and the formation of pyrimidine dimers to the most severe case, where both strands of the helix are separated.

this damage is reversed by the action of the enzyme photolyase, whose activity depends on energy absorbed from ultraviolet light [1]. For single base mutations, the other strand can be used as a template to guide the correction of the damaged strand. Different pathways exist to repair damage to one of the two strands of the DNA molecule, such as base excision repair [2, 3] and nucleotide excision repair [3–5]. The former pathway repairs a single nucleotide damaged, for instance, by oxidation; the latter repairs damage affecting longer stretches of 2–30 bases. These mechanisms depend on the existence of two copies of the genetic information, one in each strand of the DNA double helix: if the nucleotide sequence in one strand is accidentally changed, information is not lost irretrievably because a complementary copy of the altered strand remains in the sequence of nucleotides in the other strand.

The most severe lesion that can occur is a double-stranded break, where both strands of the double helix break. Double-stranded breaks are not only induced by internal or external damaging agents, but also occur during DNA replication. Inaccurate repair of double-stranded breaks is hazardous to the cell because it can lead to genome rearrangements, eventually causing the cell to malfunction or, even worse, to become cancerous. In cells, two major, mechanistically distinct pathways exist to repair double-stranded breaks [6, 7]: non-homologous end-joining and homologous recombination (see Figures 1.2 and 1.3).

during the break are affected. Loss of damaged nucleotides at the site of the break can lead to deletions, and joining of non-matching termini creates translocations. Non-homologous end-joining is especially important before the cell has replicated its DNA, since there is no template available in close proximity to the double-strand break required for repair by homologous recombination.

Genetic and biochemical analyses identified and assigned functions for the proteins that are key factors during non-homologous end-joining (for reviews see [8, 9]). A model for double-strand break repair through non-homologous end-joining is depicted in Figure 1.2. Within the non-homologous end-joining pathway, two subpathways exist: direct end-joining and microhomology-dependent end-joining. The first pathway directly joins the DNA ends after no or limited processing. The latter pathway uses small direct repeat sequences close to the DNA end to join the ends. The first step in non-homologous end-joining is some processing of the ends to make them blunt or to reveal these microhomologies, short stretches of homologous nucleotides (3 to 16 bp). The Rad50/Mre11/Nbs1 complex has been implicated in this processing step, holding the broken ends in close proximity [10, 11]. Other protein complexes important in an early step of the end-joining reaction are DNA-PKcs and the Ku heterodimer. DNA-PKcs can interact with the Ku heterodimer bound at a DNA end, and the resulting complex can promote tethering of two DNA ends [12, 13]. Next, DNA-PKcs can recruit the ligase IV and Xrcc4 protein complex [14, 15], followed by ligation of the DNA ends. In the case of joining through microhomologies, non-complementary short 3′ and 5′ overhangs would result in 'flap structures', a single-strand overhang on one of the strands. Structure-specific endonucleases can remove these flap structures before ligation [16, 17].

Homologous recombination, in contrast to non-homologous end-joining, uses the sequence template of the sister chromatid to restore the original DNA molecule without the loss of any nucleotide information. In addition to break repair, homologous recombination is also used to restore collapsed, stalled, and reversed replication forks. Recombination is not only required for chromosome stability but is also necessary for creating genetic diversity.

Figure 1.2: Schematic representation of non-homologous end-joining. In the left route, the two ends of the double-strand break are directly brought together and undergo limited processing. As an alternative to such direct end-joining, short repeats (blue) can be used to align the ends (right branch). Candidate proteins that act in these steps are the Rad50/Mre11/Nbs1 complex as well as DNA-PKcs and the heterodimer Ku70/80. Finally, any resulting flap structures could be cleaved by structure-specific endonucleases, and the ends are ligated through the activity of the ligase IV and Xrcc4 protein complex.

overhangs (schematically depicted in Figure 1.3). These overhangs are coated with protein to form a helical nucleoprotein filament. During the second phase, called synapsis, this structure searches for a homologous sequence in the sister chromatid. Upon recognition, the invasion of the single-stranded DNA into the unwound homologous double-stranded DNA molecule results in the formation of a so-called D-loop structure. After formation of the D-loop structure, the invaded 3′ end can serve as a primer for a DNA polymerase to fill in the missing nucleotides.

[Figure 1.3 labels: double-strand break, RecBCD/Exo1, RecA/Rad51, polymerase, helicase, ligase, resolvase]

1.2 Properties of DNA-repair proteins

In this section, we discuss in more detail the properties of three specific DNA-repair proteins which are studied in this thesis.

1.2.1 Prokaryotic RecA

During homologous recombination, a single DNA strand from one DNA double helix must invade another double helix. In Escherichia coli, this requires recombination protein A (RecA), produced by the recA gene, which was identified in 1965 as being essential for recombination between chromosomes [21]. The recA gene of E. coli is indispensable to a number of processes which both maintain and diversify the genetic material of the bacterial cell. Mutations in recA genes are remarkably pleiotropic, involving not only recombination but also DNA repair, SOS mutagenesis, cell division, and chromosomal segregation.

Long sought by biochemists, this elusive gene product was finally purified to homogeneity in 1976 [22]. The RecA protein is a 352-amino-acid polypeptide of 37,842 Da (see Figure 1.4A). RecA possesses a number of interesting biochemical activities which contribute to its biological functions, including DNA binding, ATP binding and hydrolysis, binding and cleavage of target proteins, helical-filament formation, and the pairing and exchange of homologous DNA strands. Given its important role in ensuring cell viability, it is not surprising that the RecA protein is both ubiquitous and well conserved among a range of prokaryotes. Additionally, proteins displaying homology to RecA and possessing similar DNA-pairing and exchange activities have been found in eukaryotic organisms.

During the initial presynaptic stage of homologous recombination, RecA interacts with ssDNA to form a nucleoprotein filament. The functional presynaptic filament is a ternary complex consisting of RecA, ssDNA, and the nucleotide cofactor ATP. RecA binds to ssDNA in a nonspecific cooperative manner and forms a contiguous filament, in which the molecule is completely saturated by RecA. The nucleoprotein filament formed has a right-handed helical form with six RecA monomers per turn, a pitch of 95 Å, and a diameter of 100 Å [23] (see Figure 1.4C and D). In this filament, a RecA monomer covers three nucleotides with a rise per nucleotide of 5.1 Å. Compared to dsDNA, where the rise per base pair is 3.4 Å, RecA binding induces the DNA substrate to elongate by 50%. In the absence of a nucleotide cofactor or with ADP, RecA binds to ssDNA forming a collapsed filament that has a reduced pitch of 75 Å.
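The filament dimensions quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not taken from the thesis; the function and constant names are illustrative. It reproduces the 50% elongation from the two rise values and computes the pitch implied by the rounded stoichiometry (three nucleotides per monomer, six monomers per turn), which comes out close to, but slightly below, the quoted 95 Å, presumably because the quoted numbers are rounded.

```python
# Filament geometry quoted in the text (all lengths in angstroms).
RISE_PER_NT_FILAMENT = 5.1   # rise per nucleotide inside the RecA filament
RISE_PER_BP_BDNA = 3.4       # rise per base pair of bare dsDNA
NT_PER_MONOMER = 3           # nucleotides covered by one RecA monomer
MONOMERS_PER_TURN = 6        # RecA monomers per helical turn (rounded)


def relative_extension(rise_filament, rise_bare):
    """Fractional length increase of the DNA upon filament formation."""
    return rise_filament / rise_bare - 1.0


def pitch_from_rise(rise_per_nt, nt_per_monomer, monomers_per_turn):
    """Helical pitch implied by the rise per nucleotide and the stoichiometry."""
    return rise_per_nt * nt_per_monomer * monomers_per_turn


extension = relative_extension(RISE_PER_NT_FILAMENT, RISE_PER_BP_BDNA)
pitch = pitch_from_rise(RISE_PER_NT_FILAMENT, NT_PER_MONOMER, MONOMERS_PER_TURN)
print(f"extension relative to B-DNA: {extension:.0%}")
print(f"pitch implied by rounded stoichiometry: {pitch:.1f} angstrom")
```

The same two functions can be reused for the Rad51 filament by substituting its quoted rise and pitch values.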


Figure 1.4: Crystal structure of the recombinase monomers of (a) RecA and (b) human Rad51. The colors represent the electrostatic potential of the protein surface with red and blue denoting respectively negatively and positively charged groups. (c) Transparent surface models of the helical nucleoprotein filaments formed by the yeast Rad51 protein (top) and the bacterial RecA protein (bottom). Within Rad51 filaments, DNA (red) is extended and untwisted in a manner similar to its extension and untwisting in RecA filaments.

1.2.2 Eukaryotic Rad51

In bacteria, genetic recombination is catalysed by the RecA protein, the product of the recA gene. One of the human genes that shares homology with E. coli recA is RAD51 [24, 25]. The human RAD51 gene encodes a 339-amino-acid protein with a molecular weight of 36,966 Da (see Figure 1.4B). The human Rad51 protein was first purified in 1994 [26]. Overall, the human Rad51 sequence shows 56% homology (30% identity) with E. coli RecA. Despite the similarity in size, the homology only extends from residues 33 to 240 in RecA, since human Rad51 lacks the C-terminal region of RecA while containing 63 extra residues at the N-terminus of the protein [24, 25].

The Rad51 protein has ATP-dependent DNA-binding characteristics that are similar to those of RecA. The Rad51-dsDNA filament is very similar to that of RecA. It has a slightly greater helical pitch (99 versus 95 Å) but has the same axial rise per base pair (5.1 Å), and the same apparent DNA-binding stoichiometry (3 bp per monomer) as the RecA filament [26–29].

1.2.3 Eukaryotic Rad50

A protein complex that is involved in both major repair pathways is Rad50/Mre11/Nbs1. The Rad50/Mre11 complex is a heterotetramer, R2M2, arranged with a globular DNA-binding domain, including the Mre11 dimer and the two Rad50 ATPase domains, from which the long intramolecular coiled coils of Rad50 protrude [10, 30, 31] (see Figure 1.5A). The coiled-coil apex contains a structure described as a zinc hook. These zinc hooks can dimerize by the coordination of a Zn2+ ion, providing a possible interface for interaction between Rad50/Mre11 complexes. Rad50/Mre11 complexes form oligomers on linear DNA, where interactions between the apices of the coiled coils then tether DNA molecules (see Figure 1.5B) [10, 11, 32].

1.3 Tools to study individual protein-DNA interaction

Figure 1.5: Structure and interaction of the Rad50/Mre11 complex with DNA. (a) The Rad50/Mre11 complex has a globular domain from which two long flexible arms protrude. (b) The globular domain binds to DNA, and interaction between the extended arms brings the two ends of a broken DNA molecule into close proximity.

level. This approach is practically impossible and we therefore choose to follow the interaction of individual molecules in time.

Over the last decades, different single-molecule techniques have been developed to probe the interaction between proteins and DNA in various ways. Imaging tools like electron and atomic force microscopy allow visualization at the nanometer scale. In an optical or magnetic tweezers setup, the interaction between DNA and protein is probed via force spectroscopy, exerting forces in the pN range. Labeling molecules with fluorescent dyes allows tracking by light microscopy. Throughout this thesis, two different single-molecule techniques were applied to study the role of the above proteins in homologous recombination: Atomic Force Microscopy and Magnetic Tweezers.

1.3.1 Atomic Force Microscopy

Figure 1.6: Single-molecule techniques. (a) In an atomic force microscope, a laser beam is focused on a small cantilever with a pyramidal tip. The reflection of the laser light is detected by a quadrant photodetector. Changes in intensity on the photodetector are proportional to the deflection of the cantilever and thus to the tip position. During imaging, the tip is scanned over the sample while measuring the deflection at every position. This yields a high-resolution topographical image of the surface of the sample. (b) In a magnetic tweezers setup, a DNA molecule is tethered between a glass surface and a magnetic bead. A pair of external magnets is used to apply a force on the magnetic bead. Translation and rotation of the external magnets allows changing the force and torque applied on the bead and thus on the tethered DNA molecule. The position of the bead is followed in three dimensions in real time. Changes in the properties of the DNA molecule by protein interaction are reflected in the position of the magnetic bead.

to measure tip-sample distances very accurately, the Atomic Force Microscope (AFM) was invented. By mounting a sharp tip on a very flexible cantilever, a local and sensitive force sensor is created that can be operated in a similar way as the STM. Using forces rather than tunneling current, the AFM is not limited to conducting samples and high-resolution topography images can be obtained even in aqueous solutions.

1.3.2 Magnetic Tweezers

The first setups that were able to manipulate magnetic objects in solution were constructed by biophysicists for in vivo studies of the viscoelastic properties of the cytoplasm [36]. In 1992, researchers measured the response of a single DNA molecule to a stretching force [37]. Here, a single DNA molecule was anchored at one extremity to a treated glass cover slip and at the other to a magnetic bead (see Figure 1.6B). A magnet placed above the flow cell exerted a force on the superparamagnetic bead and hence on the DNA molecule. The stretching force was determined by analysis of the Brownian fluctuations of the bead, which can be followed by ordinary light microscopy. Because the magnetic bead is attached to a DNA molecule, the Brownian motion of the bead is restricted and varies with the applied force, which is set by the distance between the external magnets and the magnetic bead. The forces range from tenths to tens of piconewtons. Measuring the movement of the tethered bead with respect to a reference particle, i.e. a non-magnetic bead, allows tracking of the bead in three dimensions with nanometer accuracy, while the temporal resolution is on the order of milliseconds.
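As an illustration of how a force can be obtained from Brownian fluctuations, the sketch below uses the equipartition estimate commonly applied to a tethered bead, F = k_B T L / ⟨δx²⟩, where L is the tether extension and ⟨δx²⟩ the variance of the lateral bead position. That this exact relation was used in the work described here is an assumption; the function name, the temperature, and the synthetic trace are hypothetical and serve only to show the principle.

```python
import random
import statistics

K_B = 1.380649e-23  # Boltzmann constant in J/K


def force_from_fluctuations(x_positions_m, extension_m, temperature_k=295.0):
    """Estimate the stretching force (N) via F = k_B * T * L / var(x)."""
    variance = statistics.pvariance(x_positions_m)
    return K_B * temperature_k * extension_m / variance


# Synthetic check: generate lateral positions for a known force and recover it.
rng = random.Random(0)
true_force = 1e-12   # 1 pN (illustrative)
extension = 2e-6     # 2 micrometre tether extension (illustrative)
sigma = (K_B * 295.0 * extension / true_force) ** 0.5
trace = [rng.gauss(0.0, sigma) for _ in range(20000)]

estimate = force_from_fluctuations(trace, extension)
print(f"estimated force: {estimate * 1e12:.2f} pN")
```

The estimate tightens as more bead positions are recorded, since the statistical error of the variance decreases with the number of independent samples.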

An advantage of magnetic tweezers with respect to optical tweezers is the ability to control the torsion in the DNA molecule. Rotation of the external magnets is coupled to rotation of the magnetic bead, inducing either negative or positive plectonemic supercoils in the tethered DNA molecule [38]. In a magnetic tweezers setup, torque and force can be applied independently to the tethered DNA molecule. Protein-induced changes in the physical properties of the DNA molecule are reflected in the motion of the magnetic bead, which allows the interaction between protein and DNA to be studied at the single-molecule level.

1.4 Outline of this thesis

This thesis mainly reports experimental work studying the interaction of DNA-repair proteins with DNA at the single-molecule level. It also contains a theoretical study to develop a tool to extract interaction rates from such single-molecule data.

Chapter 3 describes our Monte Carlo simulations of proteins that bind, dissociate, and translocate along a single DNA substrate. Different schemes of nonspecific binding are modeled, namely cooperative and non-cooperative binding. Also, two different schemes for dissociation and translocation are discussed and simulated. In single-molecule experiments, a combination of these interactions can be expected when a protein binds nonspecifically to a single DNA molecule. Using Monte Carlo simulations, different combinations are simulated and used to fit data from single-molecule experiments. Such a fit not only yields a measure for the underlying kinetic scheme but also a quantification of the interaction rates involved. This method was successfully applied to extract (dis)assembly rates for RecA and Rad51.

Chapter 4 describes the binding properties of human Rad51 with single- and double-stranded DNA. Using different buffer conditions, the assembly and disassembly of Rad51 filaments can be measured separately in a magnetic tweezers setup. Experiments at varying concentrations of Rad51 yielded a measure for the binding unit during nucleation and filament extension, which was found to be multimeric in both cases. The rates of nucleation and filament extension are such that the observed filament formation consists of multiple nucleation events, leading to the formation of multiple short filament patches that are only a few tens of monomers long.

Chapter 7 reports on the filament assembly of RecA on ssDNA. The interaction is followed in real time using a magnetic tweezers setup by monitoring the end-to-end distance. As observed for Rad51, the binding unit during both filament nucleation and extension is multimeric. In contrast to Rad51, the interaction cannot solely be described by assembly and ATP-hydrolysis-induced disassembly, but also involves translocation of small filament patches along the ssDNA substrate.

Chapter 6 shows the interaction of RecA with a torsionally constrained DNA molecule followed in real time using magnetic tweezers. RecA binding induces local unwinding of the dsDNA molecule, causing the formation of positive plectonemes in the remainder of the molecule. The built-up torsion causes the filament assembly reaction to stall. Removal of these positive plectonemes eventually leads to a fully covered dsDNA molecule.

conditions and the force exerted by the scanning tip on the adhered nucleoprotein filaments.

Chapter 8 reports the interaction of a RecA-coated ssDNA filament with a target duplex DNA molecule containing a region of homology during strand invasion and exchange. This interaction is followed in real time using a magnetic tweezers setup. Changes in length and twist of the tethered molecule suggest a protein-coated three-stranded intermediate when ATP hydrolysis is suppressed. In the presence of ATP hydrolysis, which allows protein dissociation, a 'D-wrap' structure is formed when strand exchange is completed. The synapsis formed between the invading RecA-ssDNA filament and the duplex DNA is only about 80 base pairs long and travels along the region of homology while continuously performing strand exchange.

Chapter 9 shows the visualization of the human DNA-repair-protein complex Rad50/Mre11 by atomic force microscopy. From high-resolution images, the protruding arms of the protein complex are characterized using a novel method to locally determine flexibility. The coiled-coil region of the Rad50 protein contains specific segments of increased flexibility that correlate with the predicted protein structure. The segments with increased flexibility are likely to be important for the function of the Rad50 complex. Since the protein can tether DNA molecules through multiple flexible interactions of the tips of the coiled-coil arms, it is essential that the arms are bendable enough to adopt different conformations.

Chapter 10 describes experiments in which the flexibility of double-stranded DNA is characterized at short length scales, extracted from high-resolution images of dsDNA molecules adhered to a mica surface and visualized with atomic force microscopy. The analysis shows that at length scales much shorter than the persistence length of dsDNA (50 nm), the flexibility of dsDNA cannot be described by the worm-like chain model. Instead, a phenomenological model dubbed the sub-elastic chain describes the observed flexibility over the entire range.
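To give a flavour of the Monte Carlo approach summarized for Chapter 3, the following minimal sketch simulates only the simplest case: irreversible, non-cooperative binding of a protein with a fixed footprint on a one-dimensional lattice. It is not the code used in this thesis; the function name `simulate_binding`, the lattice size, and the number of attempts are hypothetical choices, and cooperativity, dissociation, and translocation are deliberately left out.

```python
import random


def simulate_binding(n_sites, footprint, n_attempts, seed=0):
    """Irreversible, non-cooperative binding on a 1D lattice.

    Each attempt picks a random start site; the protein binds only if all
    `footprint` sites are free. Returns the coverage after every attempt.
    """
    rng = random.Random(seed)
    occupied = [False] * n_sites
    covered = 0
    coverage = []
    for _ in range(n_attempts):
        start = rng.randrange(n_sites - footprint + 1)
        if not any(occupied[start:start + footprint]):
            for i in range(start, start + footprint):
                occupied[i] = True
            covered += footprint
        coverage.append(covered / n_sites)
    return coverage


# A binder covering three sites, mimicking the 3-nucleotide footprint quoted
# for a RecA or Rad51 monomer.
trace = simulate_binding(n_sites=3000, footprint=3, n_attempts=60000)
print(f"final coverage: {trace[-1]:.3f}")
```

Even in this stripped-down model the lattice does not fill completely, because gaps shorter than the footprint remain between randomly placed proteins. Covering such gaps requires additional processes such as dissociation or rearrangement, which is why those processes matter when simulated curves are compared with measured ones.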

Bibliography

(25)

1

[4] de Laat, W. L., Jaspers, N. G. J. & Hoeijmakers, J. H. J. Genes &

Devel-opment 13, 768–785 (1999).

[5] Batty, D. P. & Wood, R. D. Gene 241, 193–204 (2000). [6] Haber, J. E. Trends in Genetics 16, 259–264 (2000).

[7] Khanna, K. K. & Jackson, S. P. Nature Genetics 27, 247–254 (2001). [8] Lieber, M. R. Genes to Cells 4, 77–85 (1999).

[9] Lewis, L. K. & Resnick, M. A. Mutation Research-Fundamental and Molec-ular Mechanisms of Mutagenesis 451, 71–89 (2000).

[10] de Jager, M. et al. Molecular Cell 8, 1129–1135 (2001). [11] Moreno-Herrero, F. et al. Nature 437, 440–443 (2005).

[12] Cary, R. B. et al. Proceedings of the National Academy of Sciences of the United States of America 94, 4267–4272 (1997).

[13] Yaneva, M., Kowalewski, T. & Lieber, M. R. Embo Journal 16, 5098–5112 (1997).

[14] Chen, L., Trujillo, K., Sung, P. & Tomkinson, A. E. Journal of Biological Chemistry 275, 26196–26205 (2000).

[15] Calsou, P., Delteil, C., Frit, P., Droulet, J. & Salles, B. Journal of Molecular Biology 326, 93–103 (2003).

[16] Paull, T. T. & Gellert, M. Molecular Cell 1, 969–979 (1998).

[17] Wu, X. T., Wilson, T. E. & Lieber, M. R. Proceedings of the National Academy of Sciences of the United States of America 96, 1303–1308 (1999). [18] Kowalczykowski, S. C., Dixon, D. A., Eggleston, A. K., Lauder, S. D. &

Rehrauer, W. M. Microbiological Reviews 58, 401–465 (1994).

[19] Haber, J. E. Mutation Research-Fundamental and Molecular Mechanisms of Mutagenesis 451, 53–69 (2000).

[20] Szostak, J. W., Orrweaver, T. L., Rothstein, R. J. & Stahl, F. W. Cell 33, 25–35 (1983).

[21] Clark, A. J. & Margulies, A. D. Proceedings of the National Academy of Sciences of the United States of America 53, 451–459 (1965).

[22] McEntee, K., Hesse, J. E. & Epstein, W. Proceedings of the National Academy of Sciences of the United States of America 73, 3979–3983 (1976).

[23] Roca, A. I. & Cox, M. M. Progress in Nucleic Acid Research and Molecular Biology 56, 129–223 (1997).

[24] Shinohara, A. et al. Nature Genetics 4, 239–243 (1993).

[25] Yoshimura, Y., Morita, T., Yamamoto, A. & Matsushiro, A. Nucleic Acids Research 21, 1665–1665 (1993).

[26] Benson, F. E., Stasiak, A. & West, S. C. Embo Journal 13, 5764–5771 (1994).

[27] Yu, X., Jacobs, S. A., West, S. C., Ogawa, T. & Egelman, E. H. Proceedings of the National Academy of Sciences of the United States of America 98, 8419–8424 (2001).

[28] Ogawa, T., Yu, X., Shinohara, A. & Egelman, E. H. Science 259, 1896–1899 (1993).

[29] Sung, P. & Robberson, D. L. Cell 82, 453–461 (1995).

[30] Hopfner, K. P. et al. Cell 101, 789–800 (2000).

[31] Hopfner, K. P. et al. Cell 105, 473–485 (2001).

[32] Wiltzius, J. J. W., Hohl, M., Fleming, J. C. & Petrini, J. H. J. Nature Structural & Molecular Biology 12, 403–407 (2005).

[33] Weiss, S. Science 283, 1676–1683 (1999).

[34] Binnig, G., Rohrer, H., Gerber, C. & Weibel, E. Applied Physics Letters 40, 178–180 (1982).

[35] Binnig, G., Quate, C. F. & Gerber, C. Physical Review Letters 56, 930–933 (1986).

[36] Crick, F. H. C. & Hughes, A. F. W. Experimental Cell Research 1, 37–80 (1950).

[37] Smith, S. B., Finzi, L. & Bustamante, C. Science 258, 1122–1126 (1992).

[38] Strick, T. R., Allemand, J. F., Bensimon, D., Bensimon, A. & Croquette, V. Science 271, 1835–1837 (1996).

DNA repair by homologous recombination, a highly dynamic process

Abstract

2.1 Introduction

Homologous recombination and its central event of DNA strand exchange are essential processes in all living organisms, both for generating genetic diversity and for repairing potentially disastrous DNA breaks [1–3]. In homologous recombination, the broken duplex molecule is first processed to yield single-stranded overhangs (see Figure 2.1). A protein machinery uses these overhangs to search for homologous sequences in the intact sister chromatid, which forms the basis for restoring the original DNA molecule. The molecular mechanism that accomplishes the fascinating changes in DNA during homologous recombination is not yet entirely clear.

In this chapter, we review the current understanding of the homologous recombination pathway and the role of single-molecule techniques herein. Recent single-molecule experiments (see Figure 2.2) have revived the physical picture by defining quantitative details of the mechanism of strand exchange. The key players in homologous recombination are the RecA-like recombinases. In prokaryotes, this is the RecA protein [4, 5], and in eukaryotes Rad51 [6, 7]. Since the discovery of these key players, many bulk experiments have been conducted to unravel the mechanism of homologous recombination. This pathway can be split into different stages in which recombinase filament assembly, nucleoprotein filament rearrangement accompanying exchange of DNA strands, and recombinase filament disassembly from the newly formed products occur sequentially. Information about the kinetics of nucleoprotein filament assembly on single-strand DNA by RecA or Rad51, or about the process of homology recognition and strand exchange by the nucleoprotein filament, was difficult to extract from bulk experiments, for two reasons. First, the interactions behave stochastically. Because a bulk experiment tests many reactions simultaneously, only an average behavior of the interaction mechanism is obtained. Second, to study for instance the intermediate stage of nucleoprotein filament rearrangement accompanying strand exchange, all reactions in the bulk experiment would need to be synchronized.

Figure 2.1: Repair of DNA double-strand breaks by homologous recombination. In homologous recombination, repair is initiated by resection of a double-strand break, creating ssDNA overhangs which are covered by RecA-like recombinases. This is followed by homology search of these protein-bound ssDNA overhangs in the sister chromatid. Once homology is found, a joint molecule is formed between the invading ssDNA and the target duplex DNA, followed by an exchange of strands. During the last phase, the missing nucleotides are filled in and the formed branched structures are resolved to obtain two intact DNA molecules without the loss of any nucleotides.

Figure 2.2: Single-molecule techniques addressing the interaction between DNA repair proteins and DNA. Examples are magnetic tweezers, Förster resonance energy transfer (FRET), and single-molecule fluorescence on flow-stretched DNA molecules.

In the currently established model of homologous recombination, the reaction is split into three different stages: recombinase or nucleoprotein filament assembly, nucleoprotein filament rearrangement accompanying exchange of strands, and recombinase filament disassembly from the newly formed products (see Figure 2.1). These stages are discussed below, and the prokaryotic and eukaryotic machinery are compared at each stage.

2.2 Filament assembly

that recruits the single-stranded binding proteins and the RecA-like recombinase.

First, the broken DNA ends are processed to give single-strand DNA overhangs [17–21]. Studies directed at deciphering the recombinase mechanism have been guided by knowledge of the prototype, the prokaryotic RecA protein. RecA and its orthologs [22, 23], such as Rad51, Dmc1, and UvsX, are ATPases. RecA and Rad51 polymerize on single-strand DNA to form a right-handed helical protein filament [24]. Structural characterization of these nucleoprotein filaments has almost exclusively been performed using static methods, which has resulted in an emphasis on regularity. However, even static methods reveal variation in filament structure, often depending on the nucleotide cofactor bound, indicating flexibility and possibly dynamic rearrangements. Electron microscopy studies revealed that nucleoprotein filaments have a pitch of 9.5 – 10.0 nm, comprising 6 recombinase molecules and 18 nucleotides per helical repeat, thus 3 nucleotides per recombinase molecule [25]. Interestingly, the single-strand DNA in the nucleoprotein filament is held in an extended conformation, being stretched by as much as 50 % relative to the length of a B-form duplex DNA molecule. This seems to prevent the bound single-strand DNA from forming a long interaction with the sister chromatid, because adjacent nucleotides are quickly out of register with the target homologous sequence. Double-stranded DNA bound by RecA and similar recombinases is, in turn, stretched and unwound in a DNA structure proposed to be intrinsically recombinogenic [25].

The dynamics of the formation of a nucleoprotein filament by the recombinases is thought to occur via a pathway that is similar for RecA and Rad51. In the classical view obtained from bulk experiments, an initial nucleation event is followed by adjacent monomeric binding of recombinase proteins that extend the nucleoprotein filament (see Figure 2.3A) [26, 27]. This preference for adjacent binding is called cooperative binding. For very high degrees of cooperativity, this yields long continuous nucleoprotein filaments.

Figure 2.3 (panel labels): (a) RecA; RecA-ssDNA nucleoprotein filament; RecA translocation. (b) Rad51; Rad51-ssDNA nucleoprotein filament; Rad52. (c) RecA/Rad51; SSB/RPA; RecA/Rad51-ssDNA nucleoprotein filament.

have quantitatively described these essential interactions in new detail. Recent dynamic force spectroscopy (see Chapter 7 and [31–33]) and fluorescence experiments [34–38] have elucidated important details of the mechanism of filament formation by RecA and Rad51.

In the experiment by Joo et al. [34], the binding of RecA to short oligonucleotides was monitored via FRET in the presence of various nucleotide cofactors and analyzed by a method based upon hidden Markov modeling. It was shown that RecA nucleates on single-stranded DNA via at least a pentameric binding unit and that the nucleoprotein filament extends by monomeric association. This technique allowed characterization of the (dis)assembly rates on both ends of the substrate. FRET allows only the use of short substrates, which potentially biases protein binding from multimers to monomers. On the other hand, the spatial and temporal resolution with which the FRET efficiency is determined are in the range of nanometers and tens of milliseconds, respectively, more than sufficient to detect single protein binding. However, to extend the lifetime of the fluorophores attached to the DNA substrate, special buffer conditions are applied to minimize the amount of free oxygen radicals. In such buffer conditions, the ATPase activity of the protein can be attenuated, potentially affecting protein-DNA interaction and thus the outcome of the experiment. Furthermore, the use of hidden Markov modeling requires an assumption about the mechanistic interaction between protein and DNA, i.e. only assembly or also dissociation and even protein reorganization, before relevant rates can be extracted from the experimental data [39].


A similar approach was applied to probe the interaction between Rad51 and single-strand DNA and to extract rates for nucleation and filament extension [33]. In contrast to RecA, Rad51 did not form a single continuous nucleoprotein filament; rather, the tethered DNA molecule was covered by short filament patches. As observed for RecA in similar conditions, nucleation and filament extension occurred via multimeric binding units.

In the course of homologous recombination, RecA-like recombinases not only interact with single-strand DNA, but also bind double-strand DNA to initiate, for instance, strand invasion and exchange. The kinetics of the interaction between RecA and double-strand DNA was monitored by binding of fluorescently labeled proteins to a single flow-stretched double-strand DNA molecule [35] and in a magnetic tweezers setup [31, 32]. As observed in bulk experiments [40], binding of RecA to double-strand DNA is limited by nucleation, followed by a rapid extension of the nucleoprotein filament leading to a single continuous nucleoprotein filament. Applying different concentrations of RecA protein showed that nucleation occurs via a multimer (most probably a pentamer) [35], similar to what was observed for RecA and Rad51 on single-strand DNA (see Chapter 7 and [33, 34]). The fluorescence approach allows direct visualization of the random nucleation events along the flow-stretched DNA molecule, although the spatial resolution suffers from convolution: gaps between proteins on the order of 100 nanometers will disappear due to this effect. Furthermore, to visualize a protein by fluorescence it must contain a fluorescent chemical group. Modification of the wild-type protein to enable fluorescent detection can influence its biochemical and biological activity [37].

2.3 Role of ATP hydrolysis

The formation of a nucleoprotein filament requires the presence of a nucleotide cofactor, e.g. ATP [24]. Hydrolysis of this cofactor causes disassembly at one end of the filament. Recent single-molecule experiments that measure in real time the (dis)assembly of RecA and Rad51 from single-strand DNA show that both proteins have disassembly rates comparable to their assembly rates (see Chapter 7 and [33, 34]). This yields a highly dynamic rearrangement of the filaments due to continuous binding and dissociation. This interplay between nucleation, filament extension, and dissociation does not yield long continuous filaments, but instead short filament patches with small stretches of bare single-strand DNA in between.

Next to this continuous interplay, data in this thesis indicate that RecA was able to reorganize along single-strand DNA during filament formation, which is possible upon ATP hydrolysis (see Chapter 7). This dynamic RecA-single-strand DNA interaction readily explains the formation of the long continuous filaments that were observed. Rearrangement of DNA-bound protein could eliminate gaps between filament patches. Comparison of the ATP hydrolysis rates of RecA and Rad51 in similar reaction conditions shows that RecA hydrolyzes ATP two orders of magnitude faster [41, 42]. This difference can be explained by the fact that both recombinases use ATP hydrolysis to dissociate from the DNA substrate, but RecA requires an extra amount of ATP hydrolysis to reorganize along single-strand DNA.

2.4 Rearrangement of the nucleoprotein filament

After the formation of a nucleoprotein filament by the recombinase RecA or Rad51, the sequence within the filament has to be aligned with a partner carrying the same nucleotide sequence before strand exchange can occur (see Figure 2.1). The exact mechanism for the search of homology and the subsequent sequence alignment is not known, but different schemes have been proposed [48, 49]. Homology recognition may occur via a melting-annealing process or a three-stranded intermediate. In the former, the target duplex DNA is locally unwound prior to the invasion of the single strand, and sequence recognition relies on the RecA-bound single strand probing the complementary strand of the duplex DNA. In this structure, called a paranemic joint, the double-strand DNA molecule is locally unwound, but the bases of both DNA molecules are not topologically linked [50]. Alternatively, the invading single strand may bind to the target duplex DNA, forming a RecA-bound three-stranded intermediate [49, 51]. After strand invasion, strand exchange occurs, forming a new heteroduplex DNA molecule and expelling one of the strands of the target duplex DNA in a displacement loop (D-loop). The homologous strands are now topologically linked (base paired and truly interwound) in a structure called a plectonemic joint.

It has not been determined whether a three-stranded intermediate appears in the course of the strand-exchange process or what the structure is of the paranemic or plectonemic joints. Although a number of experiments support the three-stranded intermediate hypothesis, others argue against it [52]. These contrasting results may be due to different experimental conditions, especially the nucleotide cofactor used. In recent experiments, however, the process of strand exchange by RecA between single DNA molecules could be followed in real time (see Chapter 8 and [53]). In a magnetic tweezers setup, torsionally constrained double-stranded DNA molecules, with a defined region of homology to the invading RecA-single-strand DNA nucleoprotein filament, were tethered between magnetic beads and a glass surface. In a magnetic tweezers setup, not only the force exerted on the tethered DNA molecule is controlled, but also the amount of twist or writhe, depending on the stretching force applied [10]. Changes in end-to-end distance can therefore be attributed to either changes in contour length or changes in the helical twist of the tethered DNA molecule. This is necessary because RecA not only elongates the DNA substrate upon binding but also unwinds it [54].


RecA-2

Figure 2.4: Synapsis in homologous recombination. Single-molecule experiments show intermediate stages of the strand invasion reaction. First, a triple-strand intermediate covered by RecA is formed. While in the presence of ATP hydrolysis one end invades the target duplex DNA, at the trailing end RecA dissociates from the synapsis, forming a D-wrap structure where the expelled strand is wrapped around the newly formed heteroduplex DNA.


is different from the classical picture sketched above.

2.5 Disassembly

The DNA templates have to be protein-free in order to be accessible for other essential processes, e.g. branch migration [55, 56]. Dissociation of Rad51 from double-strand DNA was monitored in both a fluorescence [37] and a magnetic tweezers setup [33]. Direct visualization showed that dissociation of labeled Rad51 proteins occurred at multiple positions along the DNA substrate [37]. The spatial resolution of this technique does not allow individual proteins to be seen dissociating in time, but on longer timescales the amount of bound Rad51 decreases. As Rad51 forms many short filament patches, multiple ends of filament patches are available from which Rad51 can dissociate. This end-dependent disassembly behavior was also observed in the magnetic tweezers experiments, with a disassembly rate similar to that obtained from the Monte Carlo simulations [33].

2.6 Outlook

The application of single-molecule techniques in recent years has elucidated many aspects of how homologous recombination works in both prokaryotes and eukaryotes, yielding new insights into the different stages of filament assembly, rearrangement of strands, and disassembly. The improved temporal and spatial resolution of these techniques compared to bulk methods allows better characterization of the mechanisms involved during these stages. ATP hydrolysis causes protein rearrangement during filament assembly: a signature of richer kinetics than the binding alone proposed in the classic scheme. Another advantage of single-molecule techniques is that a reaction can be followed in time while it passes through different stages. This allows probing the kinetics and structures formed during an intermediate stage of homologous recombination, such as DNA strand exchange.

structure seems necessary for homology recognition or for stabilizing the three-strand intermediate [25, 58].

Both recombinases, RecA and Rad51, start filament formation via nucleation with a multimeric (most likely pentameric) binding complex. Because recombinases lack affinity for a particular sequence, nucleation can occur along the entire single-strand DNA substrate, after which filament extension follows. In the absence of single-stranded binding proteins, multiple short filament patches are formed due to random nucleation events. Binding of single-strand-binding proteins reduces the number of possible nucleation events, stimulating the formation of longer continuous filaments. The fact that filament extension also occurs via multimers (most likely pentamers) for both RecA and Rad51 (see Chapter 7 and [33]), although some experiments suggest monomeric association [34], poses a problem for displacing single-stranded binding proteins, because a pentamer requires at least 15 bases to be available for binding. It has been shown that single-stranded binding proteins have a strong affinity for single-strand DNA and occupy approximately 30 to 60 bases per complex, depending on the buffer conditions used [59]. Although RecA and Rad51 show similar binding interactions with single-strand DNA, RecA is able to displace bound SSB during filament formation, whereas Rad51 filament formation is blocked by bound RPA [29, 30]. The origin of this difference between the two recombinases can now be addressed by the techniques described.

Studying homologous recombination is important for understanding double-strand break repair. Several cancer-prone genetic diseases, including Bloom's syndrome and Fanconi anaemia, are associated with homologous recombination dysfunction or deficiency. Furthermore, homologous recombination impairment is probably the underlying cause of breast, ovarian, and other cancers in individuals who harbor mutations in the BRCA1 and BRCA2 genes. Given the link to cancer, research on the mechanism and regulation of homologous recombination has received increasing attention. With the use of novel single-molecule techniques, the mechanism of the repair pathway of homologous recombination is now slowly being unraveled, yielding valuable information about the interaction between the proteins and DNA involved. Such fundamental information will eventually allow us to understand why certain mutations in the DNA repair pathway lead to the formation of cancerous cells.

Bibliography


[2] Haber, J. E. Trends in Genetics 16, 259–264 (2000).

[3] Karran, P. Current Opinion in Genetics & Development 10, 144–150 (2000).

[4] Clark, A. J. & Margulies, A. D. Proceedings of the National Academy of Sciences of the United States of America 53, 451–459 (1965).

[5] Kowalczykowski, S. C., Dixon, D. A., Eggleston, A. K., Lauder, S. D. & Rehrauer, W. M. Microbiological Reviews 58, 401–465 (1994).

[6] Shinohara, A. et al. Nature Genetics 4, 239–243 (1993).

[7] Haber, J. E. Mutation Research - Fundamental and Molecular Mechanisms of Mutagenesis 451, 53–69 (2000).

[8] Ashkin, A. Nature 330, 608–609 (1987).

[9] Ashkin, A. Proceedings of the National Academy of Sciences of the United States of America 94, 4853–4860 (1997).

[10] Strick, T. R., Allemand, J. F., Bensimon, D. & Croquette, V. Biophysical Journal 74, 2016–2028 (1998).

[11] Strick, T. R., Allemand, J. F., Bensimon, D., Bensimon, A. & Croquette, V. Science 271, 1835–1837 (1996).

[12] Strick, T. R., Croquette, V. & Bensimon, D. Nature 404, 901–904 (2000).

[13] Weiss, S. Science 283, 1676–1683 (1999).

[14] Ha, T. et al. Proceedings of the National Academy of Sciences of the United States of America 93, 6264–6268 (1996).

[15] Perkins, T. T., Smith, D. E. & Chu, S. Science 264, 819–822 (1994).

[16] Brewer, L. R., Corzett, M. & Balhorn, R. Science 286, 120–123 (1999).

[17] Dixon, D. A. & Kowalczykowski, S. C. Cell 66, 361–371 (1991).

[18] Dixon, D. A. & Kowalczykowski, S. C. Cell 73, 87–96 (1993).

[19] Ponticelli, A. S., Schultz, D. W., Taylor, A. F. & Smith, G. R. Cell 41, 145–151 (1985).

[20] Taylor, A. F., Schultz, D. W., Ponticelli, A. S. & Smith, G. R. Cell 41, 153–163 (1985).

[21] Sugawara, N. & Haber, J. E. Molecular and Cellular Biology 12, 563–575 (1992).


[23] Story, R. M., Weber, I. T. & Steitz, T. A. Nature 355, 318–325 (1992).

[24] Wyman, C. & Kanaar, R. Current Biology 14, R629–R631 (2004).

[25] Yu, X., VanLoock, M. S., Yang, S., Reese, J. T. & Egelman, E. H. Current Protein & Peptide Science 5, 73–79 (2004).

[26] Roca, A. I. & Cox, M. M. Progress in Nucleic Acid Research and Molecular Biology 56, 129–223 (1997).

[27] Register, J. C. & Griffith, J. Journal of Biological Chemistry 260, 2308–2312 (1985).

[28] Sung, P. & Robberson, D. L. Cell 82, 453–461 (1995).

[29] Sugiyama, T. & Kowalczykowski, S. C. Journal of Biological Chemistry 277, 31663–31672 (2002).

[30] Kowalczykowski, S. C. & Krupp, R. A. Journal of Molecular Biology 193, 97–113 (1987).

[31] Fulconis, R. et al. Biophysical Journal 87, 2552–2563 (2004).

[32] van der Heijden, T. et al. Nucleic Acids Research 33, 2099–2105 (2005).

[33] van der Heijden, T. et al. Nucleic Acids Research (2007). In print.

[34] Joo, C. et al. Cell 126, 515–527 (2006).

[35] Galletto, R., Amitani, I., Baskin, R. J. & Kowalczykowski, S. C. Nature 443, 875–878 (2006).

[36] van Mameren, J. et al. Biophysical Journal 91, L78–L80 (2006).

[37] Modesti, M. et al. Structure 15, 599–609 (2007).

[38] Granéli, A., Yeykal, C. C., Robertson, R. B. & Greene, E. C. Proceedings of the National Academy of Sciences of the United States of America 103, 1221–1226 (2006).

[39] Rabiner, L. R. Proceedings of the IEEE 77, 257–286 (1989).

[40] Pugh, B. F. & Cox, M. M. Journal of Biological Chemistry 262, 1326–1336 (1987).

[41] Tombline, G. & Fishel, R. Journal of Biological Chemistry 277, 14417–14425 (2002).

[42] Weinstock, G. M., McEntee, K. & Lehman, I. R. Journal of Biological Chemistry 256, 8829–8834 (1981).


[44] Krejci, L. et al. Journal of Biological Chemistry 277, 40132–40141 (2002).

[45] Shinohara, A. & Ogawa, T. Nature 391, 404–407 (1998).

[46] Sugiyama, T., New, J. H. & Kowalczykowski, S. C. Proceedings of the National Academy of Sciences of the United States of America 95, 6049–6054 (1998).

[47] Park, M. S., Ludwig, D. L., Stigger, E. & Lee, S. H. Journal of Biological Chemistry 271, 18996–19000 (1996).

[48] Howard-Flanders, P., West, S. C. & Stasiak, A. Nature 309, 215–220 (1984).

[49] Camerini-Otero, R. D. & Hsieh, P. Cell 73, 217–223 (1993).

[50] Bianchi, M., Dasgupta, C. & Radding, C. M. Cell 34, 931–939 (1983).

[51] Zhurkin, V. B., Raghunathan, G., Ulyanov, N. B., Camerini-Otero, R. D. & Jernigan, R. L. Journal of Molecular Biology 239, 181–200 (1994).

[52] Prevost, C. & Takahashi, M. Quarterly Reviews of Biophysics 36, 429–453 (2003).

[53] Fulconis, R., Mine, J., Bancaud, A., Dutreix, M. & Viovy, J. L. Embo Journal 25, 4293–4304 (2006).

[54] Stasiak, A. & DiCapua, E. Nature 299, 185–186 (1982).

[55] West, S. C. Annual Review of Genetics 31, 213–244 (1997).

[56] Bugreev, D. V., Mazina, O. M. & Mazin, A. V. Nature 442, 590–593 (2006).

[57] West, S. C., Cassuto, E. & Howard-Flanders, P. Nature 290, 29–33 (1981).

[58] Klapstein, K., Chou, T. & Bruinsma, R. Biophysical Journal 87, 1466–1477 (2004).

Monte Carlo simulations of protein assembly, disassembly, and translocation on DNA

Abstract

3.1 Introduction

In this chapter, we follow the interaction of multiple proteins with a one-dimensional lattice representing the DNA substrate in a Monte Carlo simulation. In this approach, every bound protein can be followed in simulated time while allowing different forms of interaction with the DNA substrate. Comparison between simulated and experimental data yields insight into the binding mechanism involved between the proteins and the DNA substrate studied.

In the last decade, new experimental techniques have opened the way to study protein-DNA or protein-protein interaction between single molecules. In contrast to bulk experiments, single-molecule experiments do not suffer from averaging over multiple events and thereby allow the characterization of processes in much greater detail. The interaction between proteins and DNA involves a variety of relevant processes, e.g. binding, dissociation, translocation, and shape deformation. To describe the dynamic interactions between a protein and DNA, one can study systems where the protein-DNA interaction is restricted to a specific region, i.e. a recognition sequence, as observed for the translocation of the restriction enzyme EcoR124I [1]. For this type of experiment, where a single protein is followed in time while interacting with a DNA molecule, models were developed to extract kinetic interaction rates.

On the other hand, many protein-DNA or protein-protein interactions involve iterative interactions. One approach to study these systems in detail at a single-molecule level, has been to avoid multiple events at the same time by reducing the amount of target area. For example, the length of available DNA substrate can be limited to only tens of bases, or the concentration of protein present in the reaction can be substantially lowered with respect to the target area. Another approach to study non-specific interactions of proteins with DNA or other proteins at a single-molecule level is to develop models that go beyond describing single-entity binding. An analytical toolbox should be available that is able to extract interaction parameters for the different processes involved between protein and DNA.

Figure 3.1: Schematic drawings of different pathways for protein-DNA kinetics. (a) Assembly of a non-specific binding protein with its substrate can be divided into two modes, non-cooperative and cooperative. In the former (left panel), the protein binds randomly, whereas in the latter (right panel) a preference exists to bind next to an already bound protein. (b) Disassembly of bound proteins can also be divided into two different modes, end-dependent and position-independent dissociation. In the first case (left), only proteins located at the end of a protein complex can dissociate, whereas in the second case (right) all bound proteins, regardless of their position within the protein complex, have the same probability to dissociate. (c) Reorganization of a bound protein can be described by either a diffusive (left) or a unidirectional (right) mode.

protein concentration and the cooperativity number, the fractional final coverage was deduced at equilibrium, yielding a value for the binding constant of the protein to the lattice.


proteins, due to the infinite lattice that they consider. This is a priori not true, because even in the non-cooperative case, proteins can bind next to an already bound protein. Finally, the obtained coverage in the model is always complete. This outcome is incorrect because gaps smaller than the binding size of the protein remain on the lattice due to the random binding nature of the protein.
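The point about residual gaps can be illustrated with a minimal sketch, assuming irreversible, non-cooperative binding of proteins that each cover `n` lattice sites. The function names, the lattice length, and the binding size of 3 are illustrative choices, not values taken from this chapter.

```python
import random

def saturate_lattice(length, n, rng):
    """Bind n-site proteins at random positions until no protein fits anymore."""
    lattice = [False] * length
    # Start positions at which a protein of size n would still fit.
    free_starts = list(range(length - n + 1))
    while free_starts:
        start = rng.choice(free_starts)
        for i in range(start, start + n):
            lattice[i] = True
        # Keep only start positions that do not overlap the new protein.
        free_starts = [s for s in free_starts if s + n <= start or s >= start + n]
    return lattice

def longest_gap(lattice):
    """Length of the longest run of uncovered sites."""
    longest = run = 0
    for covered in lattice:
        run = 0 if covered else run + 1
        longest = max(longest, run)
    return longest

lattice = saturate_lattice(length=3000, n=3, rng=random.Random(0))
coverage = sum(lattice) / len(lattice)
print(f"coverage = {coverage:.3f}, longest gap = {longest_gap(lattice)} sites")
```

At saturation every remaining gap is shorter than the binding size, so no further protein fits even though the coverage stays below 1.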

Recently, an analytical tool based upon hidden Markov modeling was applied to extract kinetic rates from single-molecule fluorescence data [6, 7] and ion-channel data [8, 9]. Hidden Markov modeling was developed originally to aid in speech recognition [10]. A regular Markov model consists of a series of states where at each time the system may have changed from the state it was in the moment before, or it may have stayed in the same state. These states are directly visible to the observer. In a hidden Markov model, the observation is a probabilistic function of the state, i.e., the resulting model contains an underlying stochastic process that is not observable (it is hidden) [11].
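As a hedged illustration of the distinction between hidden states and observations, the sketch below evaluates the likelihood of a short observation sequence with the standard forward algorithm. The two hidden states ("free", "bound"), the two observation symbols ("low", "high"), and all probabilities are hypothetical and are not taken from the cited studies.

```python
def forward_likelihood(observations, start_p, trans_p, emit_p):
    """Likelihood of an observation sequence under a discrete hidden Markov model."""
    states = list(start_p)
    # alpha[s]: probability of the observations so far with the system ending in state s.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical model: the hidden state is whether a site is free or protein-bound;
# the observation is a noisy low/high signal that depends on the hidden state.
start_p = {"free": 0.5, "bound": 0.5}
trans_p = {"free": {"free": 0.9, "bound": 0.1},
           "bound": {"free": 0.2, "bound": 0.8}}
emit_p = {"free": {"low": 0.8, "high": 0.2},
          "bound": {"low": 0.3, "high": 0.7}}

likelihood = forward_likelihood(["low", "low", "high", "high"], start_p, trans_p, emit_p)

# Sanity check: the likelihoods of all length-2 sequences sum to one.
symbols = ("low", "high")
total = sum(forward_likelihood([a, b], start_p, trans_p, emit_p)
            for a in symbols for b in symbols)
print(likelihood, total)
```

Fitting rates to data then amounts to adjusting the transition and emission probabilities so that this likelihood is maximized, which is why the assumed set of states matters.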

A correct interpretation of single-molecule data using hidden Markov modeling depends on the underlying system chosen, viz., the number of states in the model and the corresponding probabilities involved [10]. Furthermore, the states within the model should be independent of each other. For example, hidden Markov modeling does not work well for RNA secondary structure analysis [11]. Studying protein binding to a lattice using hidden Markov modeling poses a similar problem, because a distantly bound protein can influence interactions of a different protein cluster.

3.2 Description of the model

We model protein-DNA interaction, e.g. (cooperative) binding, disassembly, and reorganization, using Monte Carlo simulations [13, 14]. Monte Carlo simulations have been successfully applied in different areas of physics and mathematics [15, 16]. In Monte Carlo simulations, we model the interaction between protein and DNA as a Markov chain; the next possible state of the protein-DNA complex depends on the current state. Depending on the transition probability between different states, a certain pathway is followed, and we can study both the dynamics and the equilibrium states.

We first describe the implementation of a two-state process in our simulations. A two-state process can be written as

$$ A \xrightarrow{\;k_1\;} B. \tag{3.1} $$

The reaction rate $k_1$ is coupled to a transition probability $p_1$ in a Markov chain as follows

$$ k_1 = \frac{p_1}{\Delta t}, \tag{3.2} $$

where ∆t is the duration of a single simulation step in the Monte Carlo simulations. The duration of a simulation step is taken such that (i) the transition probability within a single simulation step is always much smaller than unity and (ii) the chance of having two local transitions within a single simulation step is negligible. In the Monte Carlo simulations, a transition between states occurs whenever the transition probability is larger than a random value extracted from a uniform distribution between 0 and 1. The transition rate between states is a priori unknown, and needs to be extracted from comparison to experimental data.
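The relation between rate and transition probability can be checked with a minimal sketch of the two-state process. The rate, step duration, and number of molecules are arbitrary illustrative values, and the function name is hypothetical; the original simulations were written in a different language.

```python
import random

def simulate_lifetimes(k1, dt, n_molecules, seed=0):
    """Return the simulated time (in s) at which each molecule switches from A to B."""
    rng = random.Random(seed)
    p1 = k1 * dt  # Eq. 3.2 rearranged: p1 = k1 * dt
    if p1 >= 0.1:
        raise ValueError("choose a smaller dt so that p1 is much smaller than 1")
    lifetimes = []
    for _ in range(n_molecules):
        steps = 1
        # A transition occurs when p1 is larger than a uniform random value.
        while rng.random() >= p1:
            steps += 1
        lifetimes.append(steps * dt)
    return lifetimes

lifetimes = simulate_lifetimes(k1=2.0, dt=0.005, n_molecules=5000)
mean_lifetime = sum(lifetimes) / len(lifetimes)
print(f"mean lifetime = {mean_lifetime:.3f} s, expected about 1/k1 = 0.5 s")
```

The mean lifetime of state A approaches 1/k1 as long as the probability per step stays small, which is the reason for condition (i) above.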


3.3 Methods

Binding of protein onto a DNA substrate was modeled using Monte Carlo simulations implemented in Interactive Data Language (RSI, Boulder CO). The DNA substrate was represented by a one-dimensional array with a number of elements equal to the number of nucleotides or base pairs of the DNA molecule of interest. Simulations were done with various binding sizes for the protein. Cooperative binding was described by nucleation followed by growth that extended the nucleation point, whereas non-cooperative binding involved only nucleation.

Nucleation was allowed to occur at any point along the entire molecule. In the Monte Carlo simulations, the nucleation step was simulated as follows: a value between 0 and 1 was drawn from a uniform distribution. If this value was smaller than a given threshold corresponding to the set nucleation rate for the entire molecule, a protein was bound. The binding location was deduced from a second random number, drawn from a uniform distribution between 0 and 1 and multiplied by the number of elements in the one-dimensional array. Binding occurred only when this site plus the following n − 1 sites were not covered by another protein, to account for the fact that each protein covers n nucleotides or base pairs.
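The nucleation step described above can be sketched as follows. This is an illustrative Python version rather than the original Interactive Data Language code; the name `nucleation_step`, the Boolean-list representation of the lattice, and the rule that a protein must fit entirely within the array are assumptions.

```python
import random


def nucleation_step(lattice, n, p_nucleation, rng):
    """Attempt one nucleation event; return True if a protein was bound."""
    # First draw: does a nucleation attempt happen on this molecule at all?
    if rng.random() >= p_nucleation:
        return False
    # Second draw: scale a uniform number by the lattice length to pick a site.
    start = int(rng.random() * len(lattice))
    sites = lattice[start:start + n]
    # Bind only if this site and the following n - 1 sites exist and are free
    # (the end-of-lattice rule is an assumption, not stated in the text).
    if len(sites) < n or any(sites):
        return False
    lattice[start:start + n] = [True] * n
    return True


# Hypothetical parameters: 300-nt substrate, binding size 3, p = 0.05 per step.
rng = random.Random(1)
lattice = [False] * 300
for _ in range(20000):
    nucleation_step(lattice, 3, 0.05, rng)
print(sum(lattice) / len(lattice))
```

Because a rejected attempt leaves the lattice unchanged, gaps smaller than n accumulate and the final coverage stays below 1 for n > 1.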

For cooperative binding, we evaluated all nucleation sites where protein patch extension could occur. For each site, a value was drawn from a uniform distribution and compared to a given threshold corresponding to the set rate of extension for a single protein patch. If this value was smaller than the threshold, the protein patch was extended, provided that the next n nucleotides or base pairs were not already covered by protein. Extension was only permitted in the direction of higher numbers in the one-dimensional array.
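A possible implementation of the extension step is sketched below, again in Python and under stated assumptions: the lattice is a list of Booleans, a patch is any contiguous run of occupied sites, and the name `extension_step` is hypothetical.

```python
import random


def extension_step(lattice, n, p_extension, rng):
    """Try to extend every protein patch by one protein towards higher indices."""
    size = len(lattice)
    # A patch end is an occupied site whose higher-index neighbour is free.
    patch_ends = [i for i in range(size - 1) if lattice[i] and not lattice[i + 1]]
    added = 0
    for end in patch_ends:
        if rng.random() >= p_extension:
            continue
        new_sites = range(end + 1, end + 1 + n)
        # Extend only if the next n sites exist and are all uncovered.
        if new_sites.stop <= size and not any(lattice[j] for j in new_sites):
            for j in new_sites:
                lattice[j] = True
            added += 1
    return added


# One patch of size 3 at the start of a 12-site lattice, extended once.
lattice = [True] * 3 + [False] * 9
print(extension_step(lattice, 3, 1.0, random.Random(0)), lattice)
```

Occupancy is checked against the live lattice, so two neighbouring patches cannot both extend into the same free stretch within one step.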

The probabilities for nucleation and growth per time step were taken so small that the chance of two binding events within a single Monte Carlo step was negligible. The threshold values, which are rates expressed in units of (Monte Carlo step)$^{-1}$, can be converted into kinetic rates expressed in s$^{-1}$ by adjusting the time axis of the Monte Carlo growth curve to the experimental growth data. Whereas our simple modeling involved protein patch extension and disassembly in a unidirectional fashion, essentially the same results are found if extension and disassembly occur in both directions, albeit with slightly different values for the two rates, which change by a factor of up to 2.
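As a worked illustration of the conversion from thresholds to kinetic rates, suppose the simulated and experimental growth curves overlap when one Monte Carlo step is mapped onto $\Delta t_\mathrm{fit} = 0.05$ s, and the extension threshold was set to $p_\mathrm{ext} = 2 \times 10^{-3}$ per step. Both numbers and both symbols are hypothetical and chosen only for the arithmetic. Eq. 3.2 then gives:

```latex
k_\mathrm{ext} = \frac{p_\mathrm{ext}}{\Delta t_\mathrm{fit}}
              = \frac{2 \times 10^{-3}}{0.05\ \mathrm{s}}
              = 4 \times 10^{-2}\ \mathrm{s^{-1}}
```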


m ≥ 1) opposite to the protein-patch-extension end (i.e. towards lower numbers in the array), a value was drawn from a uniform distribution and, if this value was smaller than the threshold set by the dissociation rate, the protein dissociated and a vacancy was created. Alternatively, a second route was considered in which dissociation was allowed at all monomer sites, i.e. also in the middle of protein patches. Here, the above procedure was extended to all bound proteins.

Reorganization of individual proteins or protein patches along the DNA substrate was incorporated as follows: a value between 0 and 1 was drawn from a uniform distribution. If this value was smaller than a given threshold corresponding to the reorganization rate, the protein patch made a step of one nucleotide or base pair. For unidirectional translocation, the direction was fixed; here we chose steps towards lower numbers in the array. For diffusive motion, the stepping direction was random: towards higher numbers in the array when a second value drawn from a uniform distribution was larger than 0.5, and towards lower numbers when it was smaller.
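The reorganization step can be sketched in Python as shown below. The function name `reorganization_step`, the description of a patch by its start index and length, and the rule that a move is rejected when the target site is occupied or off the lattice are assumptions; the text does not specify how collisions were handled.

```python
import random


def reorganization_step(lattice, start, length, p_move, rng, diffusive=False):
    """Try to move the patch at lattice[start:start+length] by one site.

    Returns the (possibly unchanged) start index of the patch.
    """
    if rng.random() >= p_move:
        return start
    # Unidirectional translocation always steps towards lower indices;
    # diffusive motion picks the direction from a second uniform draw.
    step = -1
    if diffusive and rng.random() > 0.5:
        step = 1
    new_start = start + step
    # Site the patch moves onto; it must exist and be free (assumption).
    target = new_start if step == -1 else new_start + length - 1
    if target < 0 or target >= len(lattice) or lattice[target]:
        return start
    vacated = start + length - 1 if step == -1 else start
    lattice[vacated] = False
    lattice[target] = True
    return new_start


# A patch of three sites translocating towards the start of the array.
lattice = [False, False, True, True, True, False]
print(reorganization_step(lattice, 2, 3, 1.0, random.Random(0)), lattice)
```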

3.4 Results

We modeled the interaction between proteins and DNA for a variety of processes, i.e. binding, dissociation, reorganization, and combinations of these. Protein-DNA interaction via binding can be divided into two different schemes, non-cooperative and cooperative binding (see Figure 3.1A). We first present the results for non-cooperative binding of proteins to DNA.

3.4.1 Non-cooperative binding


to the lattice is therefore smaller than it would be if all proteins aligned such that no gaps remained on the lattice. Dividing the length of the lattice by the number of bound proteins yields the apparent binding site size of the protein, which is larger than its intrinsic binding site size due to the existence of gaps. For large binding site sizes, n > 15 nt, the apparent binding site size is approximately 30% larger than the actual binding site size (see Figure 3.2E).

The kinetics of simple non-cooperative binding can be described analytically as follows. The binding process is limited by the number of free base pairs available on the DNA molecule (see Figure 3.2A). During growth, the number of free base pairs $N_\mathrm{free}$ obeys

$$\frac{dN_\mathrm{free}}{dt} = -a N_\mathrm{free}, \qquad (3.3)$$

where $a$ is the binding rate of the protein to the lattice, which together with the boundary condition $N_\mathrm{free}(0) = N$ yields

$$N_\mathrm{free} = N e^{-at}. \qquad (3.4)$$

The time-dependent occupancy θ becomes

$$\theta(t) \equiv \frac{N_\mathrm{bound}(t)}{N} = 1 - e^{-at}, \qquad (3.5)$$

showing an exponential binding profile in excellent agreement with the profiles obtained in the Monte Carlo simulations (see black lines in Figure 3.2B).
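The agreement between Eq. 3.5 and a simulated binding curve can be checked with a short script. The sketch below is an illustrative Python version for the simplest case n = 1, not the original code. The function names are hypothetical, and the identification a = p/N per Monte Carlo step is an assumption that follows from the nucleation scheme of Section 3.3: an attempt occurs with probability p and succeeds with probability N_free/N.

```python
import math
import random


def simulate_occupancy(n_sites, p_nucleation, n_steps, seed=0):
    """Occupancy after each step for non-cooperative binding with n = 1."""
    rng = random.Random(seed)
    lattice = [False] * n_sites
    bound = 0
    occupancy = []
    for _ in range(n_steps):
        if rng.random() < p_nucleation:
            site = int(rng.random() * n_sites)
            if not lattice[site]:  # binding succeeds only on a free site
                lattice[site] = True
                bound += 1
        occupancy.append(bound / n_sites)
    return occupancy


def analytical_occupancy(n_sites, p_nucleation, step):
    """Eq. 3.5 with a = p_nucleation / n_sites per Monte Carlo step."""
    a = p_nucleation / n_sites
    return 1.0 - math.exp(-a * step)


# Hypothetical parameters: 1000 sites, p = 0.1 per step, 20000 steps.
sim = simulate_occupancy(1000, 0.1, 20000)
print(sim[-1], analytical_occupancy(1000, 0.1, 20000))
```

For n > 1 the same exponential shape is expected only in the early phase, since the final occupancy is then limited by gaps.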

The final occupancy depends on the binding site size $n$ of the protein. Assume a lattice that consists of $N$ possible binding sites, allowing binding of up to $N/n$ proteins. During the non-cooperative binding process, 'gaps' of size $i$ ($1 \le i \le n - 1$) are created throughout the entire lattice, reducing the maximum number of proteins. A protein that binds covers the $n$ sites needed for actually binding the protein plus the average gap formed between proteins, and can thus be represented by an effective binding size $n^* = n + s_\mathrm{gap}$. The average gap size between proteins is not equal to $\tfrac{1}{2}(n - 1)$ but to

$$s_\mathrm{gap} = \sum_{i=1}^{n-1} \frac{i}{n + i}$$

instead. This is because the binding site size increases as the gap size increases. Therefore, one needs to take into account the actual number of proteins with gap size $i$ that are bound to the lattice, which decreases as $(n + i)^{-1}$. Together, this yields for the fractional
