(1)

RNA 3D structure:

bioinformatics perspective

Janusz M. Bujnicki

IIMCB, Warsaw & UAM, Poznan

(4)

Today

MEDICINE and BIOLOGY depend on DNA analysis

identification of specific DNA sequences: …AAGTGCT CGAGCA T…

diagnostics, gene therapy, forensics etc.

Human karyotype with color added to distinguish chromosome pairs. {{PD-USGov-NIH}}

(6)

Today & Tomorrow

rapid growth of research on…

RNA!!!

http://upload.wikimedia.org/wikipedia/commons/f/f2/0324_DNA_Translation_and_Codons.jpg

information storage

executive layer

workhorse

6

(8)

RNA

… T A C G G C G T T A G A C A A G T G C G T G A G T A C A C A …

… A T G C C G C A A T C T G T T C A C G C A C T C A T G T G T …

… U A C G G C G U U A G A C A A G U G C G U G A G U A C A C A …

MASTER REGULATOR OF LIVING CELLS!

8

(10)

RNAs as the… new proteins?

Analogies between proteins and RNAs:

Sequences: linear polymers

Structures: complex 3D shapes

Functions: catalysis, regulation

The “1D-3D-F” code:

sequence ↔ 3D structure ↔ function

[Figure: RNase P (RNA) & RNase P (protein)]

• Experimental structure determination is very difficult for RNA

• There are many RNA genes with unknown function (in particular in Eukaryota)

• We need to break the code (at least 1D-3D) to better understand their function

(11)

PROTEIN 3D structure prediction: two schools of thought

Charles Darwin (1809-1882): EVOLUTIONARY BIOLOGY

Ludwig Eduard Boltzmann (1844-1906): STATISTICAL THERMODYNAMICS

(12)

PROTEIN structure prediction: two schools of thought

Homologous macromolecules retain very similar 3D structures despite accumulated substitutions in sequences (examples: RNase A, RNase 4)

According to Anfinsen: native structure corresponds to the global free energy minimum of the system

(13)

EVOLUTIONARY MODEL: DIVERGENCE & CONSERVATION

TIME: ~MILLIONS OF YEARS

GRANULARITY: ~RESIDUES

PHYSICAL MODEL: 1D→3D FOLDING

TIME: ~MILLISECONDS / SECONDS

GRANULARITY: ~ATOMS

(14)

RNA folding and evolution: …quite like proteins?

Families of homologous RNAs also retain very similar 3D structures despite accumulated substitutions in sequences (examples: A-riboswitch, G-riboswitch)

RNAs also fold into low energy structures. Thus far it has been exploited on the level of 2D prediction

(15)

STRUCTURE PREDICTION FOR RNA

UAUCGUAUGCUUUGCGCGCAGCAGCGAAGCGCUGACAC

EVOLUTIONARY MODEL: DIVERGENCE & CONSERVATION

TIME: ~MILLIONS OF YEARS

GRANULARITY: ~RESIDUES

PHYSICAL MODEL: 1D→3D FOLDING

TIME: ~MILLISECONDS / SECONDS

GRANULARITY: ~ATOMS

(16)

ModeRNA: RNA homology-modeling

Lena Rother, Kristian Rother

TEMPLATE

(17)

MODEL

17

ModeRNA:

RNA homology-modeling

(19)

ModeRNA: RNA homology-modeling

MODEL VS CRYSTAL STRUCTURE: RMSD 2.27 Å

aminoacyl tRNA synthetase

(20)

RNA is more dynamic than proteins

http://pharmacy.utah.edu/medchem/faculty/Davis_D.htm

NMR structure of a functionally important domain of the HCV IRES RNA

in complex with an inhibitor of viral replication.

(21)

RNA changes conformation upon ligand binding

decoding region A-site HIV-1 frameshift inducing element

HIV-1 TAR RNA

(22)

RNA thermometers

change structure depending on temperature

Jens Kortmann, Franz Narberhaus

Nat Rev Microbiol 2012 Apr 16;10(4):255-65.

Bacterial RNA thermometers: molecular zippers and switches.

(24)

Riboswitches:

change structure upon ligand binding

Kim, J. N.; Breaker, R. R.,

Purine sensing by riboswitches.

Biology of the Cell 2008, 100, (1), 1-11.

(26)

RNA is much more dynamic than proteins

(27)

SimRNA

Coarse-grained model for RNA folding

1zih.pdb: GCAA RNA tetraloop, NMR

Boniecki MJ, Lach G, Dawson WK, Tomala K, Lukasz P, Soltysinski T, Rother KM, Bujnicki JM. SimRNA: a coarse-grained method for RNA folding simulations and 3D structure prediction. Nucleic Acids Res. 2015, in press

(28)

SimRNA

Coarse-grained model for RNA folding

1zih.pdb: GCAA RNA tetraloop, NMR

5' GGGCGCAAGCCU 3'

Michał Boniecki

(29)

SimRNA Energy function

Statistical potential:

• Distances between close atoms

• Angles between virtual bonds

• Torsion (dihedral) angles

• Residue-residue interactions (short and long-range)

Boltzmann distribution law: frequently observed conformations = local energy minima

Eta-Theta map = potential for backbone

Distribution of contacts in 3D = potential for interactions
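The idea that frequently observed conformations correspond to energy minima can be illustrated with a minimal Python sketch of Boltzmann inversion, E = -kT ln(p_obs / p_ref). This is not the SimRNA parameterization: the function name, the pseudocount, the choice of reference distribution and the toy counts are all assumptions made for illustration.

```python
import math
from collections import Counter


def boltzmann_inversion(observed, reference, kT=1.0, pseudocount=1.0):
    """Convert bin counts into pseudo-energies E = -kT * ln(p_obs / p_ref).

    `observed` and `reference` map a bin label to a count. The pseudocount
    (an assumption of this sketch) avoids log(0) for empty bins.
    """
    bins = sorted(set(observed) | set(reference))
    n_obs = sum(observed.get(b, 0) + pseudocount for b in bins)
    n_ref = sum(reference.get(b, 0) + pseudocount for b in bins)
    energies = {}
    for b in bins:
        p_obs = (observed.get(b, 0) + pseudocount) / n_obs
        p_ref = (reference.get(b, 0) + pseudocount) / n_ref
        energies[b] = -kT * math.log(p_obs / p_ref)
    return energies


# Hypothetical counts of an inter-atomic distance falling into 1 Å bins.
observed = Counter({5: 80, 6: 15, 7: 5})     # counts in known structures
reference = Counter({5: 30, 6: 35, 7: 35})   # counts in a reference state
potential = boltzmann_inversion(observed, reference)

# The over-represented bin (5 Å) gets the lowest pseudo-energy.
best_bin = min(potential, key=potential.get)
```

Bins seen more often than the reference predicts receive negative pseudo-energies; rarely seen bins are penalized.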


(30)

SimRNA Sampling

• Monte Carlo, Metropolis algorithm

perform random move

if energy decreases, move is accepted

otherwise move is accepted with probability: $e^{-(E_2 - E_1)/kT}$

• simulated annealing (folding, unfolding)

• Replica Exchange (typically 10 replicas)
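The sampling scheme above can be sketched in a few lines of Python. This is a generic illustration of the Metropolis criterion, a geometric annealing schedule and the standard replica-exchange swap probability, not SimRNA code: the toy one-dimensional energy function, the move set, the schedule and all parameter values are assumptions.

```python
import math
import random


def metropolis_accept(e_old, e_new, kT, rng=random):
    """Accept downhill moves always, uphill moves with exp(-(E2 - E1) / kT)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)


def replica_swap_probability(e_i, e_j, kT_i, kT_j):
    """Standard replica-exchange swap probability between two temperatures."""
    delta = (1.0 / kT_i - 1.0 / kT_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def anneal(energy, propose, state, kT_start=2.0, kT_end=0.1, steps=5000, seed=0):
    """Simulated annealing with a geometric temperature schedule."""
    rng = random.Random(seed)
    e = energy(state)
    best_state, best_e = state, e
    for i in range(steps):
        kT = kT_start * (kT_end / kT_start) ** (i / (steps - 1))
        candidate = propose(state, rng)
        e_new = energy(candidate)
        if metropolis_accept(e, e_new, kT, rng):
            state, e = candidate, e_new
            if e < best_e:
                best_state, best_e = state, e
    return best_state, best_e


# Toy double-well landscape standing in for a conformational energy.
def toy_energy(x):
    return (x * x - 1.0) ** 2 + 0.3 * x


def toy_move(x, rng):
    return x + rng.uniform(-0.5, 0.5)


best_x, best_e = anneal(toy_energy, toy_move, state=3.0)
```

In a replica-exchange run, several such simulations would proceed at different temperatures and periodically attempt swaps accepted with `replica_swap_probability`.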


(31)

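SimRNA Restraints

• complete freezing

• secondary structure (WC cis base pairs)

• atom position (”pinning”)

• atom distance (”tethering”)

[Figure: restraint score plots. Pinning: Score(r) vs. r, parameters: radius threshold, slope. Tethering: Score(d) vs. d, parameters: min dist, max dist, slope; alternative form with min dist, max dist, well depth. Secondary structure restraint given in dot-bracket notation: ...(...)...]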
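One plausible reading of the restraint plots is a set of flat-bottom penalty functions, sketched below in Python. The exact functional forms used by SimRNA are not given on the slide, so the shapes, function names and parameter names here are assumptions inferred from the axis labels (radius threshold, min/max distance, slope, well depth).

```python
def pinning_score(r, radius, slope):
    """Position restraint: no penalty within `radius` of the reference
    position, linear penalty beyond it (assumed shape)."""
    return 0.0 if r <= radius else slope * (r - radius)


def tether_slope_score(d, min_dist, max_dist, slope):
    """Distance restraint: no penalty inside [min_dist, max_dist],
    linear penalty outside (assumed shape)."""
    if d < min_dist:
        return slope * (min_dist - d)
    if d > max_dist:
        return slope * (d - max_dist)
    return 0.0


def tether_well_score(d, min_dist, max_dist, well_depth):
    """Distance restraint as a square well: a bonus of -well_depth inside
    the window, zero outside (assumed shape)."""
    return -well_depth if min_dist <= d <= max_dist else 0.0


# Hypothetical values, in Å and arbitrary energy units.
inside = pinning_score(1.5, radius=2.0, slope=1.0)
outside = pinning_score(5.0, radius=2.0, slope=1.0)
too_far = tether_slope_score(12.0, min_dist=4.0, max_dist=8.0, slope=0.5)
in_well = tether_well_score(6.0, min_dist=4.0, max_dist=8.0, well_depth=2.0)
```

Flat-bottom forms leave the molecule free to move within the tolerated range, which suits restraints derived from low-resolution experimental data.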


(32)

SimRNA folding simulation

1l2x.pdb, viral pseudoknot

5' GGCGCGGCACCGUCCGCGGAACAAACGG 3'

[Plot: energy E vs. RMSD to the crystal structure]


(34)

SimRNA folding simulation

1l2x.pdb, viral pseudoknot

results for all replicas

[Plots: energy E vs. time; energy E vs. RMSD to the crystal structure]

(35)

SimRNA folding simulation

1l2x.pdb, viral pseudoknot

[Plot: energy E vs. RMSD to the crystal structure, with an energy barrier marked]

NATIVE: ..(((((...)))))... and ...(((...)))

PREDICTION: ..(((((...)))))... and ...(((...)))

RMSD: 4.2 Å
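The RMSD values quoted on these slides measure the average deviation between equivalent atoms of two structures. A minimal Python sketch follows, assuming the two coordinate sets have the same atoms in the same order and are already superimposed; optimal superposition (for example with the Kabsch algorithm) is deliberately left out, and the coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)
    tuples. No superposition is performed here."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates (Å): the model is shifted by 2 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
deviation = rmsd(reference, model)
```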


(36)

SimRNA folding recapitulates folding funnels for many RNA molecules

Energy funnel: native structure = minimal energy

[Figure: energy vs. RMSD plots (RMSD 0 = native) for PDB entries 1a60, 1e95, 1ymo, 437d, 2a43 and 1l3d, each annotated with the native and the predicted secondary structure in dot-bracket notation]

(37)

Alternative (suboptimal) structures

1fqz: domain IIID of hepatitis C virus IRES

[Figure: NMR structure compared with representatives of cluster 1, cluster 2 and cluster 3]

(38)

Non-canonical base pairs predicted correctly

38

1fqz: domain IIID of hepatitis C virus IRES superposition


(39)

SimRNA workflow

39

• conversion to coarse-grained representation

• simulation

• clustering of low-energy decoys

• selection of decoys:

- lowest energy

- biggest clusters (typically 1st, 2nd, 3rd)

• conversion to full-atom representation

• full-atom (fine-grained) refinement

• additional model quality verification
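The "clustering of low-energy decoys" and "biggest clusters" steps can be sketched as follows. This is a simple greedy neighbour-count clustering written for illustration, not the procedure implemented in SimRNA; the function names, the cutoff and the toy data (one number per decoy, with an absolute difference standing in for RMSD) are assumptions.

```python
def lowest_energy_indices(energies, n_keep):
    """Indices of the n_keep lowest-energy decoys, best first."""
    return sorted(range(len(energies)), key=lambda i: energies[i])[:n_keep]


def greedy_clusters(items, distance, threshold):
    """Repeatedly pick the item with the most neighbours within `threshold`
    and remove it together with those neighbours. Returns lists of indices
    into `items`; cluster sizes are non-increasing."""
    remaining = list(range(len(items)))
    clusters = []
    while remaining:
        neighbours = {
            i: [j for j in remaining if distance(items[i], items[j]) <= threshold]
            for i in remaining
        }
        centre = max(remaining, key=lambda i: len(neighbours[i]))
        members = neighbours[centre]
        clusters.append(members)
        remaining = [i for i in remaining if i not in members]
    return clusters


# Toy decoys: one number per decoy stands in for a 3D structure,
# and |a - b| stands in for the RMSD between two decoys.
decoys = [0.0, 0.1, 5.0, 0.2, 9.0, 5.1]
energies = [-10.0, -9.0, -8.0, -7.0, 2.0, 3.0]

kept = lowest_energy_indices(energies, n_keep=4)
kept_decoys = [decoys[i] for i in kept]
clusters = greedy_clusters(kept_decoys, lambda a, b: abs(a - b), threshold=0.3)
cluster_sizes = [len(c) for c in clusters]
```

Representatives of the first few clusters, together with the lowest-energy decoy, would then be passed on to full-atom reconstruction and refinement.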


(40)

QRNAS: physics-based energy function discriminates native-like structures (RMSD < 2 Å)

PDB ID: 1L2X

Decoys generated by unfolding

Juliusz Stasiewicz, Janusz M. Bujnicki, unpublished

(41)

QRNAS: MolProbity score improvement

PDB ID: 1BYX

[Figure: original structure vs. structure refined with QRNAS]

Juliusz Stasiewicz

(42)

ModeRNA / SimRNA / QRNAS modeling pipeline

template: Azoarcus intron structure (1zzn)

target: phage Twort intron

• comparative model (ModeRNA)

• refolded variable parts (SimRNA)

• all-atom refinement (QRNAS)

reference: phage Twort intron structure (1y0q)

[Figure: model vs reference (1y0q)]

(48)

RNA-protein docking: DARS-RNP & QUASI-RNP

[Figure: coarse-grained representation. Nucleotide: P, RIB and base pseudoatoms R5, R6, with Watson-Crick, Hoogsteen and Sugar edges. Amino acid: CA, S1, S2.]

Coarse-grained potentials: $E_{stat} = E_r + E_a + E_s + E_p$

Terms: distance, orientation, nt edge, steric clashes

Derived from known structures of protein-RNA complexes:

Statistical (reference: GRAMM decoys): $p_{AB} = n_{AB}^{obs} / n_{AB}^{exp}$

Quasichemical: $p_{AB} = n_{AB}^{obs} / n_{AB}^{exp}$, with $n_{AB}^{exp} = x_A \, x_B \, N_{tot}$

Tuszynska I, Bujnicki JM. DARS-RNP and QUASI-RNP: new statistical potentials for protein-RNA docking. BMC Bioinformatics. 2011 Aug 18;12:348

(49)

NPDock: rigid body RNA/DNA-protein docking

Tuszynska I, Magnus M, Jonak K, Dawson W, Bujnicki JM. NPDock: a web server for protein-nucleic acid docking. Nucleic Acids Res. 2015 Jul 1;43(W1):W425-30.

(50)

SimRNP: modeling of RNA-protein complexes

Representation of protein chain: like in REFINER

Representation of aromatic residues

[Figure: chemical structures and coarse-grained representations of ARG, HIS, PHE, TYR and TRP]

Michal Boniecki et al.

(51)

SimRNP: modeling of RNA-protein complexes

E TOTAL = E(protein) + E(RNA) + E(protein,RNA)

[Animation 1E8O_P_posRestr_.avi: 4 protein chains, 1 RNA chain]

Michal Boniecki et al.

(52)

PyRy3D: coarse-grained modeling of complexes

INPUT:

• sequences of all components

• structures/models of some components

• disorder / flexibility

• molecule shapes (cryoEM, SAXS/SANS)

• distance restraints (cross-linking, FRET, etc.)

• accessibility (enzyme active sites exposed etc.)

http://genesilico.pl/pyry3d

Kasprzak JM, Dobrychłop M, Koryciński M, Potrzebowski W, Susik M, Pogorzelska L, Niemiec R, Rudnicki W, Bujnicki JM. PyRy3D: a software tool for modeling of large macromolecular complexes with user-defined restraints. Unpublished

(53)

PyRy3D: coarse-grained modeling of complexes

Is it possible to build models that agree with all the input data?

How many such models exist?

Joanna Kasprzak

(54)

PyRy3D benchmarking

21 complexes with simulated maps

5 complexes with experimental cryoEM maps

(55)

PyRy3D Chimera plugin

• INPUT FILE PREPARATION

• PARAMETER TESTING

• SIMULATIONS

• SOLUTION SCORING

• MODEL RANKING

• ANIMATIONS

• RESTRAINT VIOLATION VISUALIZED

• ENERGY / SCORE PLOTS

http://iimcb.genesilico.pl/pyry3d

Mateusz Dobrychlop, Joanna Kasprzak et al.

(56)

http://genesilico.pl/pyry3d/

Mateusz Dobrychlop, Joanna Kasprzak et al., collaboration: Witold Rudnicki (ICM) 17

(57)

RNA 3D structure prediction

Given a target sequence, predict its 3D structure

UAUCGUAUGCUUUGCGCGCAGCAGCGAAGCGCUGACAC

Grzegorz Lach, Krzysztof Formanowicz, Michal Boniecki et al.

(60)

RNA structure-based sequence design

Given target 3D structure, predict a sequence that folds to form that structure

...with the aid of secondary structure design (and prediction)

UAUCGUAUGCUUUGCGCGCAGCAGCGAAGCGCUGACAC

Positive design: maximize the positive energetic effect of forming the target structure

Negative design: minimize the positive energetic effect of forming all other structures

Grzegorz Lach, Krzysztof Formanowicz, Michal Boniecki et al.

(61)

RNA design with DesiRNA and SimRNA

DesiRNA – algorithm for secondary structure design

• use both positive and negative design

• consider oligomerization (monomers vs homooligomers)

1. generate initial sequences randomly or according to constraints

2. for each sequence compute the MFE and suboptimal structures

3. select sequences forming structures similar to the target structure; penalize sequences that form other structures

4. identify potential sites of mutations with largest effect on structure

5. exhaustively mutate at selected positions

6. go back to 2.
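Step 1 of the loop above can be illustrated with a short Python sketch that seeds a sequence compatible with a target secondary structure given in dot-bracket notation. It is not DesiRNA code: the function names, the set of allowed pairs (Watson-Crick plus G-U wobble) and the example target are assumptions, and the folding and scoring steps (2 to 5) would need an external secondary structure predictor that is not shown.

```python
import random

# Pairs allowed at paired positions (assumption: Watson-Crick plus G-U wobble).
CANONICAL_PAIRS = [("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")]


def parse_dot_bracket(structure):
    """Return sorted (i, j) index pairs for a pseudoknot-free dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '('")
    return sorted(pairs)


def random_compatible_sequence(structure, rng):
    """Random sequence whose paired positions can form canonical base pairs."""
    seq = [rng.choice("ACGU") for _ in structure]
    for i, j in parse_dot_bracket(structure):
        seq[i], seq[j] = rng.choice(CANONICAL_PAIRS)
    return "".join(seq)


# Hypothetical target: a 4-bp stem closed by a tetraloop.
target = "((((....))))"
seed_sequence = random_compatible_sequence(target, random.Random(42))
target_pairs = parse_dot_bracket(target)
```

Starting from compatible sequences only guarantees that the target can form; the negative-design step is what discourages competing structures.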

Grzegorz Lach, Krzysztof Formanowicz, Michal Boniecki et al.

SimRNA – algorithm for 3D structure folding used in a positive design mode

• define target structure (starting coordinates and/or restraints)

• use additional ”move”: sequence substitution

• use DesiRNA fitness function for sequence

(62)

Design, modeling, and engineering of a sequence-specific RNase H

[Figure: fusion of RNase H with zinc finger ZF-QQR on a DNA-RNA hybrid; N and C termini marked]

• protein structure modeling

• nucleic acid structure modeling

• protein-nucleic acid docking

• re-design of specific contacts

• SELEX

• enzyme engineering

Sulej AA, Tuszynska I, Skowronek KJ, Nowotny M, Bujnicki JM. Sequence-specific cleavage of the RNA strand in DNA-RNA hybrids by the fusion of ribonuclease H with a zinc finger. Nucleic Acids Res. 2012 Dec;40(22):11563-70

(63)

BsMiniIII RNase cuts dsRNA sequence-specifically

Głów D, Pianka D, Sulej AA, Kozłowski ŁP, Czarnecka J, Chojnowski G, Skowronek KJ, Bujnicki JM. Sequence-specific cleavage of dsRNA by Mini-III RNase. Nucleic Acids Res. 2015 Mar 11;43(5):2864-73, Breakthrough Article

(64)

Challenges and work in progress

• Modeling of DNA 3D structure (and DNA/RNA complexes)

• Flexible modeling of protein-RNA complexes

• Flexible docking of small molecules (to RNA, to proteins, and to protein-nucleic acid complexes)

• Design of RNA-protein interactions

• Engineering of sequence-specific RNA-binding proteins

• Design of new RNA- and RNP-based nanomachines

(65)

Wiley Interdisciplinary Reviews: Computational Molecular Science

(66)
(67)

Acknowledgements

Laboratory of Bioinformatics and Protein Engineering, IIMCB Warsaw:

Michał Boniecki, Grzegorz Łach, Wayne Dawson, Irina Tuszynska, Grzegorz Chojnowski, Marcin Magnus

Laboratory of Bioinformatics, Adam Mickiewicz University, Poznań:

Joanna Kasprzak, Mateusz Dobrychłop, Kristian Rother, Lena Rother

our software: http://genesilico.pl

Recent collaborations:

Marcin Nowotny (IIMCB, Warsaw, Poland)

Sean McKenna (University of Manitoba, Winnipeg, Canada)

Andrzej Dziembowski (IBB PAN, Warsaw, Poland)

Wim Versées & Louis Droogmans (Free University of Brussels, Belgium)
