5. HUMAN EVOLUTION

5.1. Foundations

In the last decade many relevant discoveries have been made concerning the origin of our species. They range from fossils several million years old, such as the skeleton of Pierolapithecus catalaunicus, an early great ape from the middle Miocene (Moya-Sola et al. 2004), or the skeletons, a few million years younger, of Sahelanthropus tchadensis, Orrorin tugenensis, Ardipithecus ramidus and Australopithecus anamensis, claimed to be our extinct ancestors living in the Pliocene (Leakey and Walker 2003, Tattersall 2003a), to fossils as young as several thousand years, such as the LB1 skeleton of Homo floresiensis (Brown et al. 2004).

The latter is especially intriguing, as it represents a member of the genus Homo that was probably distinct from our own species and lived on an Indonesian island in the late Pleistocene, only about 38,000-18,000 years ago (Morwood et al. 2004), i.e. after Homo sapiens had appeared in the region (55,000-35,000 years ago). Because of its body height (approximately 1 m) and brain size (about 380 cm³), H. floresiensis is the most extreme case within the genus Homo and hardly matches either of the two main interpretations of human origins.

The first interpretation, called the multiregional hypothesis (Wolpoff 1999), assumes that modern humans evolved from H. erectus, which dispersed over the Old World more than one million years ago. In this hypothesis the gene flow between these archaic human populations was so strong that it is justified to speak of one large-scale evolutionary process leading from H. erectus to H. sapiens. The competing theory, known as the recent out-of-Africa origin hypothesis (Wilson and Cann 1992), assumes that there was very limited gene flow between the archaic human populations which emerged from H. erectus and a population of anatomically modern humans, which left Africa about 100,000 years ago and spread through the Old World over the subsequent tens of thousands of years, reaching the New World across the Bering Sea, frozen during the Ice Ages, some 20,000 years ago. The debate between these two models is still open, although the recent out-of-Africa hypothesis is considered by the majority to be the one that better reflects the genetic record of humans (Jobling et al. 2004).

While it will take some time (and perhaps new discoveries) to give a coherent explanation of H. floresiensis within (slightly?) rewritten hypotheses of human origins, the early conclusions about the evident isolation of these small-bodied humans seem to contradict multiregionality. Mirazon Lahr and Foley (2004) express this even more strongly, writing in Nature that “H. floresiensis puts yet another (the last?) nail in the multiregional coffin”. It is doubtful whether multiregionalists will be convinced. They claim to possess strong paleoanthropological support for the multiregional evolution of humans in the continuity of anatomical features (especially in Asia, but also in Australia and Europe) before and after the arrival of modern humans dispersing out of Africa (Thorne and Wolpoff 1992). Indeed, assuming no interbreeding between archaic (autochthonous) and modern (invading) humans, it is hard to explain why some bone features of Australians, clearly distinct from those of Africans, are present in Australian fossils both before and after the appearance of modern humans in the region.

Tattersall (2003b) does not agree with this interpretation and considers Homo erectus a local evolutionary dead end, while Wilson and Cann (1992) address the problem by pointing out that the bone features in question are not necessarily independent and selectively neutral. They suggest that the repeated re-evolution of similar bone patterns is plausible under similar environmental conditions. Still, the relatively short time required for such changes to recur makes this explanation at least disputable, especially given that some nuclear genes also support histories different from those inferred from mitochondrial DNA (mtDNA) (Hey 1997).

Nevertheless, owing to the ease of PCR amplification of mtDNA, which is present in multiple copies in a single cell, mtDNA-based inferences are an important source of our knowledge about the origin of modern humans. This is all the more true in the light of the conflicting inferences drawn from multiple autosomal microsatellite loci. Kimmel et al. (1998) suggested that extensive population growth occurred in Asia and Europe and not in Africa, whereas Reich and Goldstein (1998) inferred just the opposite. Therefore, the successful sequencing of mtDNA (which yields less ambiguous results owing to the lack of recombination) from Neanderthal fossils became a milestone in revealing our evolutionary paths.

For example, until recently the estimation of the mitochondrial mutation rate could rely only on human-chimpanzee divergence data. However, because this divergence is relatively distant in time, all estimates of its date were very inaccurate, ranging from 4 to 9 million years (O'Connell 1995), with the most probable value of 6 million years. Consequently the estimated mutation rate could not be accurate, and the same is true of the dating of the mitochondrial Eve (mtEve) epoch.
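The sensitivity of the rate estimate to the divergence date can be made explicit with a short calculation. This is only an illustrative sketch under a strict molecular clock; the symbols d (human-chimpanzee mtDNA sequence divergence per site), T (divergence time) and μ (substitution rate per site per year along one lineage) are notation assumed here and are not taken from the cited works.

```latex
\mu = \frac{d}{2T},
\qquad
\frac{\mu_{\max}}{\mu_{\min}}
  = \frac{d / (2 \cdot 4\,\mathrm{My})}{d / (2 \cdot 9\,\mathrm{My})}
  = \frac{9}{4} = 2.25
```

Under these assumptions the same observed divergence d gives rate estimates that differ by a factor of 2.25 across the 4 to 9 million year range, and any date computed from such a rate inherits the same proportional uncertainty.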

O'Connell (1995) showed that the same genetic diversity of modern humans, applied to his branching-process-based model, can give estimates of the mtEve epoch ranging from 700 thousand up to 1.5 million years.

These results differed greatly from the estimates of 280 and 200 thousand years obtained with phylogenetic trees by Hasegawa and Horai (1991) and Wilson and Cann (1992), respectively. The difference was due not only to the very small sample used by O'Connell (just 19 individuals, resulting in too large a genetic diversity of contemporary humans compared with more recent data) but mainly to the insufficient concordance of his model with the actual evolution of humans over times of the order of a million years. In his paper O'Connell also pointed out the decreasing reliability of outgroup-based methods when the outgroup is not close enough in genetic distance to the considered sample. In summary, until recently mtEve dating estimates depended on inaccurate inference about the human-chimpanzee divergence time and, furthermore, depended to a great extent on the inference method applied.

When mtDNA was sequenced for the first time from Homo neanderthalensis in 1997 (Krings et al. 1997), from a specimen dated to about 40,000 years ago (Schmitz et al. 2002), fewer than 400 base pairs were obtained. The subsequent successful sequencing of Neanderthal mtDNA in 1999 (Krings et al. 1999) and 2000 (Ovchinnikov et al. 2000, Krings et al. 2000) confirmed the accuracy of the first experiment. Since then, the mtDNA divergence rate no longer has to be guessed by relying on the assumption of its constancy over a few million years and on the problematic dating of the human-chimpanzee split.

In 2004 four additional Neanderthal fossils yielded mtDNA sequences, together with five early modern human fossils (Serre et al. 2004), and the results were in full concordance with the previous sequencing efforts. Importantly, the fossils sequenced by Serre et al. (2004) included specimens (Vindija 77, Vindija 80, Mladeč 25c, Mladeč 2) considered by multiregionalists to be “transitional” between Neanderthals and early modern humans because of some morphological features (Smith 1984, Frayer 1986, 1992, and Wolpoff 1999). Yet the mtDNA proved to be of Neanderthal type for the Vindija fossils, considered Neanderthals, and of modern human type for the Mladeč fossils, considered modern humans. This is exactly what the recent out-of-Africa model predicts: some morphological features shared by these fossils may result from similar environmental influences or may have arisen by chance, without strong gene flow between Neanderthals and early modern humans.

Serre et al. (2004), apart from reporting these results, try to estimate the upper limit of the possible Neanderthal admixture to early modern humans that is consistent with the mtDNA evidence. They use a coalescence method under three different demographies: (i) constant population size, (ii) population growth before, and (iii) population growth after the potential point of Neanderthal admixture. A numerical value of the estimate, equal to 25 percent, is given only for the simplest case of constant population size, which is known to be unrealistic. In section 5.4, a similar limit (but indicating smaller admixture) is estimated using branching process methodology. Interestingly, branching processes have recently also been used to infer the age of the last common ancestor of primates, based on archeological stratification and the number of species known to have lived in a given period (Tavare et al. 2002).

The results obtained by the author (section 5.4, see also Cyran and Kimmel 2005, Cyran 2010) further reduce the hypothetical Neanderthal mtDNA admixture to the gene pool of early modern humans. Even better estimates will become possible when the history of the human population, inferred from archeological studies correlating the Aurignacian, Chatelperronian and Gravettian cultures with Neanderthals or modern humans (Mellars 2004) and from the influence of the Ice Ages on demography (Forster 2004), yields more reliable estimates of population size in different regions of the globe, and when the corresponding time-inhomogeneous branching processes are used.

As stated above, human evolution at the molecular level is reflected in the genome record. However, this record is often hard to interpret, because the population under consideration may have undergone periods of expansion which, if undetected, could lead to erroneous inferences. Therefore, the problem of detecting past population growth has become one of the crucial issues in contemporary population genetics. This problem is addressed in section 5.2 using microsatellite markers.

Microsatellites are short tandem repeats, STRs (Renwick et al. 2001, Agrafioti and Stumpf 2007, Vowles and Amos 2006), which are quite abundant in genomes and mutate relatively fast. Therefore, they are suitable for studying the evolution of populations rather than the emergence of species, and they have found applications in various tests for detecting population growth. Using such data the author has proposed a new statistical test, which has greater power to detect population growth than other available microsatellite-based methods (see section 5.2 for details).

Moreover, some genes have been under strong pressure of natural selection (efforts to search for signatures of such selection are described in section 4.3), while genetic variation in others is mainly the result of genetic drift (see section 3.2) and selectively neutral mutations (see sections 3.3 and 4.1). If the gene under consideration exhibits signatures of natural selection (see section 3.4), then some of its variants must be more or less fit to the environment. Very often such selection is associated with a disorder having a genetic background, but in some cases it is responsible for the development of the species.

The best known example of the latter is the ASPM gene, responsible for brain size in primates, including humans (Zhang 2003). As presented in section 3.4, and also in sections 4.2 and 4.3, there is also balancing selection, in which heterozygotes (i.e. organisms having different alleles on the two homologous chromosomes) are fitter than either homozygote (i.e. organisms having identical variants on both homologous chromosomes). This is the case with human sickle cell anemia, which is caused by two identical copies of a mutated allele. However, if this allele is present in a heterozygote together with the wild-type allele, then the carrier of one copy of the mutant allele not only does not suffer from sickle cell anemia, but is also able to mount a successful immune response to malaria. Therefore, in malaria-endemic regions the mutant allele is frequent, even though it is responsible for a severe disorder in homozygotes.
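How heterozygote advantage keeps a harmful allele at an intermediate frequency can be illustrated with the standard one-locus, two-allele viability selection recursion. The sketch below is an illustration only: it assumes random mating and an effectively infinite population, and the fitness values are hypothetical numbers chosen for the example, not estimates for the sickle cell locus.

```python
def next_frequency(q, w_aa, w_as, w_ss):
    """One generation of viability selection under random mating.

    q is the frequency of the mutant allele S; w_aa, w_as and w_ss are the
    relative fitnesses of wild-type homozygotes, heterozygotes and mutant
    homozygotes (hypothetical values supplied by the caller).
    """
    p = 1.0 - q
    mean_fitness = p * p * w_aa + 2.0 * p * q * w_as + q * q * w_ss
    return (p * q * w_as + q * q * w_ss) / mean_fitness


def equilibrium_frequency(w_aa, w_as, w_ss):
    """Stable frequency of S when the heterozygote is the fittest genotype."""
    s = w_as - w_aa  # fitness loss of the wild-type homozygote
    t = w_as - w_ss  # fitness loss of the mutant homozygote
    return s / (s + t)


def iterate(q0, generations, w_aa, w_as, w_ss):
    """Apply the recursion repeatedly, starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, w_aa, w_as, w_ss)
    return q


if __name__ == "__main__":
    # Hypothetical fitnesses: malaria lowers w_aa, anemia lowers w_ss.
    fitnesses = (0.85, 1.0, 0.2)
    print("after 500 generations:", iterate(0.01, 500, *fitnesses))
    print("predicted equilibrium:", equilibrium_frequency(*fitnesses))
```

Starting from a rare mutant allele, the iterated frequency approaches the value s / (s + t) given by `equilibrium_frequency`, which is greater than zero whenever the heterozygote is fitter than both homozygotes.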

The indices of genetic variation, including allele distribution, heterozygosity and linkage disequilibrium, are affected by population history. Therefore statistical geneticists have put a lot of effort into estimating the long-term demographic history of populations belonging to various species. For this purpose many statistical tests detecting past population expansion have been proposed, for example by King et al. (2000), Bjorklund (2003), Laan et al. (2005), and Cyran and Myszor (2008b). Section 5.2 details the efforts in this field and, in particular, presents an original neural network-based test (Cyran and Myszor 2008b, 2008c) whose power exceeds that of other known tests for detecting past population expansion.

In particular, interest in our own history has in recent decades prompted research focused on inferring the history of the human population (Polański and Kimmel 2003). DNA sequences reflecting genetic diversity, taken from many qualitatively different loci of H. sapiens and H. neanderthalensis, have been analyzed. These analyses include, for example, studies of maternally inherited mitochondrial DNA (mtDNA) (Serre et al. 2004, Krings et al. 2000, Krings et al. 1999, Krings et al. 1997, Rogers 1995), paternally inherited Y chromosomes (Jobling 2001, Thompson et al. 2000), X chromosomes (Wooding and Rogers 2000), autosomal DNA sequences (Yu et al. 2001, Noonan et al. 2006, Pennisi 2007), nuclear short tandem repeats (STRs) (Kimmel et al. 1998), and protein sequences including β-globin (Harding et al. 1997, Fullerton et al. 1994), pyruvate dehydrogenase alpha 1 (PDHA1) (Hey 1997) and the Duchenne muscular dystrophy gene product (DMD) (Zietkiewicz et al. 1998).

Despite these and similar efforts the problem of the human population trajectory is still open, and thus there is growing interest in studying how sensitive genetic variation indices are to departures from the population histories assumed in different models. Moreover, the applicability of methods for calculating the distribution of the time to coalescence is limited to the model within which they were formulated.

The most widely used models assume simplifications such as multinomial sampling or deterministic population size. The question arises of how robust they are for populations evolving stochastically. One interesting example that incorporates stochasticity is O'Connell's limit theory of genealogy in branching processes. This problem is explored in section 5.3. In particular, that section considers how quickly, in terms of the number of generations, the limiting distributions of O'Connell become adequate descriptions of the transient distributions.

To answer this question, extensive simulations of slightly supercritical branching processes were performed and the results were compared with the O'Connell limits. Furthermore, coalescent computations under the Wright-Fisher model are compared with the limiting O'Connell results and with full genealogy-based expectations. These expectations are used to estimate the age of the root of the mitochondrial polymorphism of modern humans (in other words, to date the mitochondrial Eve epoch), based on mtDNA sequenced from living humans and Neanderthal fossils.
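A simulation of a slightly supercritical branching process can be sketched as follows. This is a minimal illustration and not the author's code: it assumes a Galton-Watson process with Poisson-distributed offspring numbers (the offspring distribution is not specified in this section), and the function names and parameter values are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate by Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_sizes(initial_size, mean_offspring, generations, rng):
    """Population sizes of a Galton-Watson process, one entry per generation.

    Assumption: every individual independently leaves a Poisson number of
    offspring with the given mean; mean_offspring slightly above 1 makes
    the process slightly supercritical. Stops early on extinction.
    """
    sizes = [initial_size]
    for _ in range(generations):
        size = sum(poisson(rng, mean_offspring) for _ in range(sizes[-1]))
        sizes.append(size)
        if size == 0:
            break
    return sizes


def extinction_probability(mean_offspring, iterations=10000):
    """Extinction probability of a single lineage with Poisson offspring.

    Iterates q <- exp(mean_offspring * (q - 1)) from q = 0, which converges
    to the smallest fixed point of the offspring generating function.
    """
    q = 0.0
    for _ in range(iterations):
        q = math.exp(mean_offspring * (q - 1.0))
    return q


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed for reproducibility
    print(simulate_sizes(50, 1.02, 100, rng)[-1])
    print(extinction_probability(1.02))
```

The last function shows why drift matters even in a growing population: with a mean offspring number only slightly above one, a single lineage is still far more likely to die out than to survive.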

Finally, the problem of Neanderthal admixture in the gene pool of Upper Paleolithic anatomically modern humans is considered in section 5.4. The methodology applied accounts for the effect of genetic drift, which could have eliminated the hypothetical Neanderthal mtDNA admixture by the present. To model the demography, a slightly supercritical Markov branching process (BP) based on the O'Connell model has been proposed. Relying on the relatively fast convergence to O'Connell's limiting properties, it was possible to estimate the time of extinction of the Neanderthals relative to the time of the root of the mtDNA polymorphism of modern humans.

The results of the study presented in section 5.4 indicate that the maximum hypothetical contribution of Neanderthal mtDNA which could have been eliminated by genetic drift, at the 0.05 significance level, is about 12%. Moreover, the expected value of the admixture has been estimated to be about 4%. The relevance of the research considered in section 5.4 lies in treating mtDNA-based studies as complementary to those based on nuclear DNA sequenced by the Neanderthal genome project.

5.2. Inferring demography

Coalescent theory (see section 3.5) makes it possible to generate very large numbers of samples in a short time (Marjoram and Wall 2006), yet its methods were developed some years ago, when computers were rather expensive and had relatively low computational power. In recent years the situation has changed owing to the invention of multi-core processors and overall progress in technology, which make contemporary hardware highly efficient in computations and available at a reasonable price. What is more, some recent research shows that, under certain circumstances, coalescent methods may return results different from those of the time-forward simulation approach.
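The speed of coalescent sampling comes from simulating only the genealogy of the sample rather than the whole population. The sketch below is an illustration under stated assumptions, not the method of the cited works: it uses the standard coalescent for a haploid population of constant size N, with time measured in units of N generations, and hypothetical function names.

```python
import random


def sample_tmrca(sample_size, rng):
    """Draw one time to the most recent common ancestor of the sample.

    Assumption: standard coalescent, constant population size. While k
    lineages remain, the waiting time to the next coalescence is
    exponential with rate k * (k - 1) / 2 (time in units of N generations).
    """
    total = 0.0
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / 2.0
        total += rng.expovariate(rate)
    return total


def expected_tmrca(sample_size):
    """Theoretical mean of the quantity drawn by sample_tmrca."""
    return 2.0 * (1.0 - 1.0 / sample_size)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    draws = [sample_tmrca(10, rng) for _ in range(20000)]
    print("simulated mean:", sum(draws) / len(draws))
    print("theoretical mean:", expected_tmrca(10))
```

Each draw needs only `sample_size - 1` random numbers regardless of N, whereas a time-forward simulation has to process every individual in every generation.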

In both coalescent-based and time-forward simulation methods it is often desirable to obtain a sample from a population whose number of individuals has changed between generations. One interesting application is to simulate changes in chosen genetic markers caused by the mutation process. Microsatellites can serve as such markers. These are short stretches of DNA built from repeating motifs 2-6 nucleotides long (Renwick et al. 2001). The length of a microsatellite is expressed as the number of such repeated motifs, usually 60 or so (Goldstein and Pollock 1997).

The common mutations in microsatellites are changes in the number of repeated motifs, i.e. changes in the length of a microsatellite (Sia et al. 2000). The one-step symmetric stepwise mutation model (SSMM) is usually used, in which a microsatellite may change its length by one motif, with the additional assumption that the addition and the deletion of one repeating motif are equally probable (Kimura and Ohta 1978). Microsatellites became popular because of their relatively high mutation rate (about 10⁻⁴ to 10⁻⁵) and the fact that they are spread all over the genome (Zhivotovsky et al. 1997); in the human genome more than 10,000 microsatellites have been identified (Agrafioti and Stumpf 2007). Additionally, most of them lie in non-coding DNA, so according to the neutral model of molecular evolution they probably have no influence on the reproductive capabilities of individuals. Furthermore, microsatellites are easy to analyze mathematically.
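The one-step symmetric stepwise mutation model can be written down in a few lines. The sketch below is illustrative only; the function names are hypothetical, and the lower bound of one repeat is an assumption added here to keep lengths positive, not part of the model as described.

```python
import random


def mutate_ssmm(length, mutation_rate, rng, min_length=1):
    """Apply one generation of the symmetric stepwise mutation model.

    With probability mutation_rate the repeat count changes by exactly one
    motif, gaining or losing a motif with equal probability. min_length is
    an assumed floor that prevents non-positive lengths.
    """
    if rng.random() < mutation_rate:
        step = 1 if rng.random() < 0.5 else -1
        return max(min_length, length + step)
    return length


def evolve_lineage(length, mutation_rate, generations, rng):
    """Follow a single microsatellite along one line of descent."""
    for _ in range(generations):
        length = mutate_ssmm(length, mutation_rate, rng)
    return length


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed for reproducibility
    # 60 repeats and a rate of 1e-4 follow the orders of magnitude in the text.
    print(evolve_lineage(60, 1e-4, 100000, rng))
```

Because steps up and down are equally likely, the expected length along a lineage stays at its starting value (away from the assumed floor), while its variance grows with the number of mutations.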

In the research performed by Cyran and Myszor (2008a), a series of populations was created that underwent growth of different kinds and magnitudes. To simulate the development of a population, a model providing a dynamic description of the evolution was formulated. It was based on the Wright-Fisher model (see section 3.2), which, in its most often used version, assumes (Hein et al. 2005):

• discrete and non-overlapping generations,

• haploid individuals in the population,

• constant population size,

• equal fitness of all individuals in the population,

• lack of geographical or social structure in the population,

• no recombination in the population.

Because the simulated populations changed in size over time, the applied W-F model allowed for changes in population size. The experiments concerned the Y chromosome (Bachtrog and Charlesworth 2001) or mtDNA (Eyre-Walker and Awadalla 2001), in order to eliminate recombination issues and provide haploid individuals. When a new generation was created the old one was deleted, so there were no overlapping generations. During the creation of a new individual every parent could be chosen with equal probability, which eliminated the problems of individual fitness and of geographical or social structure.
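The time-forward scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of Cyran and Myszor (2008a): the function name, the list of per-generation sizes and the inline one-step mutation with a floor of one repeat are hypothetical choices made for the example.

```python
import random


def simulate_wright_fisher(initial_alleles, sizes, mutation_rate, rng):
    """Forward Wright-Fisher simulation with a changing population size.

    initial_alleles: microsatellite lengths of the founding haploid
        individuals.
    sizes: number of individuals in each successive generation, so growth
        is expressed simply as an increasing sequence (assumed interface).
    Each child copies a parent chosen uniformly at random, which encodes
    equal fitness and no population structure; the whole parental
    generation is then discarded, so generations do not overlap.
    """
    population = list(initial_alleles)
    for size in sizes:
        children = []
        for _ in range(size):
            allele = rng.choice(population)
            if rng.random() < mutation_rate:
                # Symmetric one-step mutation; the floor of 1 is an assumption.
                allele = max(1, allele + rng.choice((-1, 1)))
            children.append(allele)
        population = children
    return population


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed for reproducibility
    growth = [100 + 10 * g for g in range(50)]  # hypothetical linear growth
    final = simulate_wright_fisher([60] * 100, growth, 1e-3, rng)
    print(len(final), min(final), max(final))
```

Passing a constant list of sizes recovers the classical constant-size model, so the same function can produce both the null and the growth scenarios.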

In time forward simulation, the succeeding generation was generated based on the previous one. Each individual in the previous generation might have influence on the current