| Title | The genetic architecture of morphological changes in the Ninespine Stickleback (Pungitius pungitius) and the domesticated pigeon (Columba livia): Insights from population analysis, quantitative trait mapping and whole-genome resequencing |
| Publication Type | dissertation |
| School or College | College of Science |
| Department | Biological Sciences |
| Author | Stringham, Sydney Ann |
| Date | 2014-08 |
| Description | Here I present the results of my doctoral dissertation, which is aimed at increasing our understanding of the genetic basis of large morphological changes. The work I have done has primarily been carried out using ninespine sticklebacks (Pungitius pungitius) as a model organism. Specifically, I have investigated the genetic architecture of reduction of the pelvis (a structure homologous to tetrapod hindlimbs) in multiple populations using a combination of traditional QTL mapping and whole-genome comparisons. Additionally, I examined the structure of breed groups among domesticated pigeons (Columba livia) in order to determine whether similar derived traits are found in genetically unrelated breeds. This work lays the foundation to develop the domesticated pigeon as a genetic and developmental model for avian diversity. |
| Type | Text |
| Publisher | University of Utah |
| Subject | Evolution; Pungitius; Stickleback |
| Dissertation Institution | University of Utah |
| Dissertation Name | Doctor of Philosophy |
| Language | eng |
| Rights Management | Copyright © Sydney Ann Stringham 2014 |
| Format | application/pdf |
| Format Medium | application/pdf |
| Format Extent | 1,748,296 bytes |
| Identifier | etd3/id/3168 |
| ARK | ark:/87278/s6mp8bhw |
| DOI | https://doi.org/doi:10.26053/0H-M60T-9500 |
| Setname | ir_etd |
| ID | 196734 |
| OCR Text | THE GENETIC ARCHITECTURE OF MORPHOLOGICAL CHANGES IN THE NINESPINE STICKLEBACK (PUNGITIUS PUNGITIUS) AND THE DOMESTICATED PIGEON (COLUMBA LIVIA): INSIGHTS FROM POPULATION ANALYSIS, QUANTITATIVE TRAIT MAPPING AND WHOLE-GENOME RESEQUENCING

by Sydney Ann Stringham

A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Department of Biology
The University of Utah
August 2014

Copyright © Sydney Ann Stringham 2014
All Rights Reserved

The University of Utah Graduate School

STATEMENT OF DISSERTATION APPROVAL

The dissertation of Sydney Ann Stringham has been approved by the following supervisory committee members:

Michael D. Shapiro, Chair (date approved: 4/30/2014)
Richard M. Clark, Member (date approved: 4/30/2014)
Jon Seger, Member (date approved: 4/30/2014)
Gabrielle Kardon, Member (date approved: 4/30/2014)
Lynn B. Jorde, Member (date approved: 4/30/2014)

and by Neil Vickers, Chair of the Department of Biology, and by David B. Kieda, Dean of The Graduate School.

ABSTRACT

Here I present the results of my doctoral dissertation, which is aimed at increasing our understanding of the genetic basis of large morphological changes. The work I have done has primarily been carried out using ninespine sticklebacks (Pungitius pungitius) as a model organism. Specifically, I have investigated the genetic architecture of pelvic reduction (a structure homologous to tetrapod hindlimbs) in multiple populations using a combination of traditional QTL mapping as well as whole-genome comparisons. Additionally, I examined the structure of breed groups among domesticated pigeons (Columba livia) in order to determine whether or not similar derived traits are found in genetically unrelated breeds. This work lays the foundation to develop the domesticated pigeon as a genetic and developmental model for avian diversity.
TABLE OF CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF TABLES

CHAPTERS

1 COMPLEX GENETIC ARCHITECTURE UNDERLIES PELVIC REDUCTION IN A CANADIAN POPULATION OF NINESPINE STICKLEBACK (PUNGITIUS PUNGITIUS): A COMBINED GENOMIC RESEQUENCING AND QTL MAPPING APPROACH
  Introduction
  Materials and Methods
    Genome Sequencing and Assembly
    Annotation
    Mutation Rate in Ninespine Lineage
    Population Structure Analysis
    Pooled Resequencing
    Variant Calling
    Fst and Likelihood Ratio Test Analyses
    Cross Husbandry
    Phenotyping
    Bulked Segregant Analysis of Crosses
    QTL Mapping
  Results and Discussion
    Draft Genome and Comparative Resequencing
      Genome sequencing and assembly
      Pooled resequencing
    Quantitative Trait Mapping
      Crosses and bulked segregant analysis
      QTL mapping
    Overlap of Pooled Resequencing and QTL Mapping
    Linkage Group 12 and Pelvic Phenotype
    Differentiation Between Salt River and Pine Lake Populations
      Candidate genes in regions of differentiation
  Conclusions
  References

2 DIVERGENCE, CONVERGENCE, AND THE ANCESTRY OF FERAL POPULATIONS IN THE DOMESTIC ROCK PIGEON
  Summary
  Results and Discussion
    Genetic Structure of Domestic Pigeon Breeds
    Convergent Evolution of Traits
    Geographic Origins of Breeds
    Ancestry of Feral Pigeon Populations
    The Domestic Pigeon as a Model for Avian Genetics and Diversity
  Accession Numbers
  Supplemental Information
  Acknowledgements
  References

3 THE GENETIC BASIS OF DIVERGENCE AND CONVERGENCE IN TELEOST FISH
  Abstract
  Introduction
  Sticklebacks (Family Gasterosteidae)
    Introduction
    Armor Plate Variation
    Reduction and Loss of the Pelvic Fin Complex
    Body Shape Variation
    Summary
  Mexican Cavefish (Family Characidae, Astyanax mexicanus)
    Introduction
    Pigment Variation
    Eye Loss
    Selection, Neutral Mutation, and Pleiotropy
    Summary
  Cichlids (Family Cichlidae)
    Introduction
    Feeding Morphology
    Summary
  Discussion
    Genetic Architecture of Derived Traits
    Coding Versus Regulatory Mutations
    Convergent Evolution
  Future Directions
  Glossary
  References

LIST OF FIGURES

1.1 Collection sites and variation in phenotype in Salt River and Pine Lake
1.2 STRUCTURE analysis of Salt River and Pine Lake by pelvic phenotype
1.3 Summary of whole genome scans and QTL mapping
1.4 Comparison of LRT and FST from Salt River and Pine Lake
1.5 Mean pelvic score in Salt River crosses
1.6 Histogram of pelvic scores in Salt River half-sibling families
1.7 Nucleotide diversity (pi) in Pine Lake and Salt River populations
1.8 LRT and FST values in interpopulation comparisons
2.1 Genetic structure of the rock pigeon (Columba livia)
2.2 Consensus neighbor-joining tree of forty domestic breeds and one free-living population of rock pigeon
2.3 Comparison of Darwin's morphology-based classification and genetic structure analysis of domestic pigeon breeds
2.4 Distribution of several derived traits across groups of domestic pigeons
3.1 Variation in stickleback plate and pelvic phenotypes
3.2 Schematic of quantitative trait locus mapping in laboratory crosses
3.3 Eye loss and pigmentation differences in Mexican cavefish
3.4 A sample of the cichlid diversity in Lake Tanganyika and Lake Malawi

LIST OF TABLES

1.1 Summary of genomic libraries used for reference sequence
1.2 Summary of additional teleost genome assemblies
1.3 Summary of Salt River crosses
1.4 Summary of samples used in bulked segregant analysis
1.5 Genome metrics
1.6 Summary of genomic regions identified by QTL mapping
1.7 Candidate genes of pelvic reduction
1.8 Pelvic phenotypes by sex in wild-caught fish

CHAPTER 1

COMPLEX GENETIC ARCHITECTURE UNDERLIES PELVIC REDUCTION IN A CANADIAN POPULATION OF NINESPINE STICKLEBACK (PUNGITIUS PUNGITIUS): A COMBINED GENOMIC RESEQUENCING AND QTL MAPPING APPROACH

Introduction

Although there are many examples of divergent vertebrate lineages evolving similar traits, relatively little is known about the types and number of mutations underlying these convergent events. Stickleback fish provide an ideal model to investigate dramatic convergent events because they are one of a few groups that have evolved anatomical, physiological, or behavioral differences among populations of the same species that are of a magnitude typically seen between different species. This allows powerful methods such as quantitative trait locus (QTL) mapping to be used to investigate the evolution of dramatic, and adaptively relevant, changes in wild populations.

The retreat of glacial ice less than 20,000 years ago allowed populations of marine sticklebacks to colonize new, inland freshwater habitats (Bernatchez and Wilson 1998; Hewitt 2000). This shift to freshwater presented novel trophic niches as well as new physiological and predatory challenges, and many geographically and phylogenetically distinct populations of sticklebacks adapted to their new habitats in similar ways. One dramatic example of a repeated change is the loss of the pelvic complex (Bell and Foster 1994). The stickleback pelvis is homologous to the tetrapod hindlimb and is composed of a pelvic girdle and two serrated spines. While the pelvis provides protection from gape-limited predators (Hoogland et al. 1957; Hagen and Gilbertson 1972; Moodie 1972; Gross 1978; Lescak and von Hippel 2011), it is thought to be disadvantageous when grasping predators are a bigger threat (Hoogland et al. 1957; Reimchen 1980; Reist 1980; Bell et al.
1993; Bell and Orti 1994; Ziuganov and Zotin 1995; Marchinko 2009). Interestingly, parallel reduction in the pelvic skeleton has occurred not only among multiple populations of threespine sticklebacks (Gasterosteus aculeatus, which has been the subject of many classic behavioral, ecological, and recent genomic studies), but also in the ninespine stickleback (Pungitius pungitius) and the brook stickleback (Culaea inconstans) (Nelson and Atton 1971; Wootton 1976; Blouw and Boyd 1992; Bell and Foster 1994; Ziuganov and Zotin 1995). Therefore, the stickleback family (Gasterosteidae) is an ideal multispecies system to examine the genetics of adaptive traits on both micro- and macroevolutionary scales.

Several studies have examined the genetic basis of pelvic reduction in threespine stickleback populations. First, mapping studies showed that pelvic reduction maps to a major-effect quantitative trait locus (QTL) on linkage group 7 along with three to four minor-effect QTL (Cresko et al. 2004; Shapiro et al. 2004; Coyle et al. 2007; Shapiro et al. 2009; Shikano et al. 2013). This region of the genome was interesting because it contained the hindlimb-specific transcription factor Pitx1 (Shapiro et al. 2004). Later, Chan et al. (2010) confirmed that independent deletions of a pelvic enhancer of Pitx1 were associated with pelvic reduction in several populations. Thus, the same phenotype in different populations is controlled by independent mutations in the same gene. Complementation tests (Shapiro et al. 2006a) and QTL mapping (Shikano et al. 2013) also identified Pitx1 as a candidate for pelvic reduction in ninespine sticklebacks. Therefore, the same gene might be responsible for similar morphological changes in two species that have been separated by more than 10 million years.

However, unlike in threespine sticklebacks, pelvic reduction in ninespine sticklebacks does not map to Pitx1 in all populations examined to date.
For example, a study of an Alaskan population found that the major contributor to pelvic reduction mapped to linkage group 4, which is unlinked to Pitx1 (Shapiro et al. 2009). Therefore, at least two different genetic changes potentially lead to pelvic reduction in ninespine sticklebacks.

With only a handful of examples, it is difficult to make broad conclusions about the genetic architecture of pelvic reduction in ninespine sticklebacks. Recent phylogenetic analysis suggests that, in North America, ninespine sticklebacks probably dispersed from three distinct refugia and evolved pelvic reduction independently in each case. Populations from the west coast of North America descended from populations from the Bering refugium, inland populations dispersed from the Mississippi refugium, and those found along the east coast of North America came from the Atlantic refugium. Given that the two mapping studies conducted thus far in ninespine sticklebacks come from the Bering lineage in North America (Shapiro et al. 2009) or an Eastern European lineage (Shikano et al. 2013), a closer investigation of the genetic mechanisms underlying pelvic reduction in a population derived from the Mississippi refugium could provide insight into general genomic patterns of pelvic reduction in ninespine sticklebacks.

To this end, we examined two populations of ninespine sticklebacks from the Northwest Territories of Canada, Salt River and Pine Lake, which exhibit an unusually broad range of pelvic phenotypes. That is, while most freshwater populations of ninespine sticklebacks have a complete pelvic skeleton and a few exhibit pelvic loss in all individuals, these populations comprise individuals with a wide range of pelvic phenotypes. In this study, we took a two-step approach, combining traditional QTL mapping and comparative whole-genome sequencing, to identify genomic regions that contribute to pelvic phenotype in Salt River and Pine Lake.
We began by conducting traditional QTL mapping in the Salt River population. QTL mapping is a robust method to identify genomic regions that contribute to phenotypic variation, and it has relatively low rates of false positives (Sahana et al. 2006). However, in laboratory crosses with limited numbers of progeny, QTL mapping can often result in candidate genomic regions that contain hundreds of genes. To address this challenge, we also assembled a draft genome for the ninespine stickleback and used it as the basis for whole-genome resequencing aimed at identifying genomic regions with divergent allele frequencies between phenotypic classes, presumably as a result of selection. In contrast to linkage mapping in laboratory populations, whole-genome sequencing studies of trait variation in natural populations can potentially implicate smaller genomic regions. This precision results from historical recombination events that break down linkage disequilibrium between causative mutations and their surrounding (presumably neutral) variants. However, the large datasets in such studies are more prone to false positives. A combination of these approaches should allow for the identification of QTL with relative confidence, which can then be validated and narrowed with resequencing data (Stinchcombe and Hoekstra 2008).

Materials and Methods

Genome Sequencing and Assembly

The DNA for reference genome sequencing was extracted from a single female fish from an unnamed creek in Wasilla, Alaska (61° 37' N, 149° 30' W). This population was chosen because it has low rates of heterozygosity compared to other populations (Aldenhoven et al. 2010), which facilitates genome assembly (Holt et al. 2002; Vinson et al. 2005). We constructed two paired-end sequencing libraries with insert sizes of 250 bp and 500 bp using the Illumina Paired-End DNA Sample Prep Kit.
An additional mate-pair library with an insert size of 2400 bp was also constructed (National Center for Genome Resources, Santa Fe, New Mexico). Paired-end sequencing (101 bp) was performed on all libraries using the Illumina HiSeq2000 platform (University of Utah High Throughput Genomics Core). Statistics of the raw reads are listed in Table 1.1.

An initial genome assembly was constructed using ALLPATHS-LG (r40776) (Gnerre et al. 2011) and contained 8784 scaffolds with a total length of 387.6 Mb. We improved the assembly by using SSPACE (Boetzer et al. 2011) to perform further scaffolding with end reads from two ninespine stickleback BAC libraries (92,160 reads, mean length 910 bp; BAC libraries VMRC34 and VMRC35, Benaroya Research Institute at Virginia Mason, Seattle, WA). Because SSPACE uses short paired-end reads as input, we used the following protocol to convert the Sanger reads into a short-read library: for each fragment in a read pair, we split the read into 2 segments and selected the first 80 bp from each segment to construct a new paired-end library in silico. By aligning this library to the previous genome assembly with Bowtie (Langmead et al. 2009), we calculated a mean BAC insert size of 140 kb (SD = 41 kb). We then ran SSPACE to scaffold our genome assembly with the new library.

The final genome assembly contains 7824 scaffolds with a total length of 428.1 Mb and contig and scaffold N50 lengths of 122.8 kb and 302.8 kb, respectively. The longest contig is 1.16 Mb and the longest scaffold is 4.14 Mb. The expected genome size, based on the k-mer distribution in the 250- and 500-bp libraries calculated using Jellyfish (Marcais and Kingsford 2011), was 518.4 Mb. Although this is longer than our assembled genome size, the CEGMA (Parra et al.
2007) pipeline reports that 90.32% of conserved eukaryotic proteins were found within this assembly, indicating a relatively complete gene annotation and suggesting that unassembled genomic regions are probably enriched for repetitive sequences.

Finally, to place scaffolds in a relative genomic order, we identified regions of synteny between threespine (BROADS1 assembly) and ninespine stickleback genomes. Using protein sequence, we identified reciprocal best BLAST hits between the two genomes. Any ninespine scaffold that had at least one reciprocal best BLAST hit on a threespine chromosome was considered orthologous and was placed in a relative order based on the threespine stickleback genome. Fifteen ninespine stickleback scaffolds showed synteny with two different threespine stickleback chromosomes; these scaffolds were not assigned a relative position. A total of 2,672 ninespine stickleback scaffolds (83% of the genome assembly) showed synteny with a single threespine stickleback chromosome.

Annotation

MAKER version 2.29 (Holt and Yandell 2011) was used to annotate the genome assembly using multiple lines of evidence. An RNA-seq library was created using mRNA extracted from adult heart, eye, brain, liver, and muscle tissue as well as whole embryos at 3 and 6 days postfertilization (chorion and yolk removed). mRNA samples from all tissues were combined in equimolar amounts for Illumina library construction. RNA-seq reads were assembled using Trinity (Grabherr et al. 2011) and provided as evidence for the gene finders in MAKER. Additional evidence included all RefSeq teleost proteins (downloaded July 30, 2013 from http://www.ncbi.nlm.nih.gov) and all Uniprot/SwissProt proteins (downloaded July 29, 2013 from ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete). Repetitive regions were masked using a species-specific repeat library generated by RepeatModeler (Smit and Hubley 2008).
This library was then aligned to Uniprot/SwissProt proteins using BLAST (Altschul et al. 1990) (E < 0.0001), and repeat library entries with matches to a known protein-coding gene were removed. Additional masking was done with a list of known transposable elements provided by MAKER. Other areas of low complexity were soft-masked (Korf et al. 2003) using RepeatMasker (Smit et al. 1996) to prevent the seeding of evidence alignments in those regions while still allowing extension of evidence alignments through them (Altschul et al. 1990; Cantarel et al. 2008). Genes were predicted using SNAP (Korf 2004) and Augustus (Stanke and Waack 2003; Stanke et al. 2008), trained for Pungitius pungitius using MAKER in an iterative fashion (Cantarel et al. 2008). The final annotation set consisted of all MAKER-generated annotations with protein or mRNA-seq support and the subset of the unsupported gene predictions that contained one or more protein family domains as detected by IPRscan (Quevillon et al. 2005).

In total, we identified 22,432 protein-coding genes (mean length = 10,015 bp). Of these, 21,516 showed homology with other species, and 16,654 were supported by mRNA-seq data. Overall, the ninespine stickleback genome is similar to other published teleost fish genomes in terms of the number of annotated genes, but is relatively compact in length in comparison to several other teleosts (Table 1.2).

Mutation Rate in Ninespine Lineage

We estimated the species-specific mutation rate for the ninespine stickleback as described previously by Shapiro et al. (2013). Briefly, we used TBLASTX (Altschul et al. 1990) alignments (E < 10^-8) to identify one-to-one orthologs between fugu (Takifugu rubripes), threespine stickleback, and ninespine stickleback, with fugu aligned to threespine and ninespine stickleback separately. We then identified four-fold degenerate codon positions shared between the three species and generated three-way alignments from 5,773 orthologous genes.
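The selection of four-fold degenerate positions from a three-species codon alignment can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name `fourfold_sites`, the toy sequences, and the rule that all species must share the same first two codon bases are assumptions made for illustration, and the sketch assumes the standard genetic code and gap-free, in-frame alignments.

```python
# Codon prefixes (first two bases) whose third position is four-fold
# degenerate under the standard genetic code: Leu, Val, Ser, Pro, Thr,
# Ala, Arg, Gly.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_sites(aligned_cds):
    """Return 0-based indices of shared four-fold degenerate third positions.

    aligned_cds maps a species name to an in-frame coding sequence; all
    sequences must have the same length, and that length must be a
    multiple of three (hypothetical input format).
    """
    seqs = [seq.upper() for seq in aligned_cds.values()]
    length = len(seqs[0])
    if length % 3 != 0 or any(len(seq) != length for seq in seqs):
        raise ValueError("sequences must be in-frame and of equal length")

    sites = []
    for start in range(0, length, 3):
        codons = [seq[start:start + 3] for seq in seqs]
        prefixes = {codon[:2] for codon in codons}
        # Assumption: a site counts as shared only when every species has
        # the same four-fold degenerate codon family at this position.
        if len(prefixes) != 1 or prefixes.pop() not in FOURFOLD_PREFIXES:
            continue
        # Skip codons whose third base is ambiguous or a gap.
        if any(codon[2] not in "ACGT" for codon in codons):
            continue
        sites.append(start + 2)
    return sites


toy_alignment = {
    "fugu": "GCTATGCCA",
    "threespine": "GCCATGCCG",
    "ninespine": "GCAATGCTG",
}
shared = fourfold_sites(toy_alignment)
print(shared)
```

In the toy alignment only the first codon (alanine in all three species) qualifies; the second codon is ATG, and the third codon differs in its first two bases between species. The bases at the returned positions would then be concatenated across genes to build the alignment passed to a substitution-model program.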
We ran MODELTEST (Posada and Crandall 1998) using these alignments and found that the General Time Reversible (GTR) substitution model best fit the observed data. We then ran the baseml program in the PAML package (Yang 2007) under the GTR model with divergence times of 85 MYA and 100 MYA based on reports by Near et al. (2012) and Santini et al. (2009), respectively.

Population Structure Analysis

We assessed population structure within and between the Salt River and Pine Lake populations using the Bayesian clustering analysis in Structure with a 50,000-iteration burn-in followed by 500,000 iterations (Pritchard et al. 2000). We genotyped 186 Pine Lake fish and 160 Salt River fish at 12 unlinked microsatellite loci: Pun44, Pun68, Pun157, Pun78, Pun117, Pun255, Pun212, Pun203, Pun171, Pun19, Pun261, and Pun238 (Shapiro et al. 2009). We examined population models from K = 2 to K = 6.

Pooled Resequencing

Fish with the most extreme pelvic phenotypes, that is, fish with a pelvic score of 8 (complete) and fish with a pelvic score of 0-2 (reduced) (Bell et al. 1987), were collected from Salt River (complete, n = 100; reduced, n = 64) and Pine Lake (complete, n = 100; reduced, n = 89) (Northwest Territories, Canada; Pine Lake: 59° 33' N, 112° 15' W; Salt River: 59° 49' N, 111° 58' W). Equimolar amounts of DNA from each individual were combined into one of four pools of DNA (Pine Lake complete, Pine Lake reduced, Salt River complete, Salt River reduced). These pools were used to construct Illumina sequencing libraries with an insert size of 250 bp, which were sequenced using the Illumina HiSeq2000 platform to a depth of 25-45x coverage using 101-bp paired-end reads (University of Utah High Throughput Genomics Core).

Variant Calling

We aligned sequencing reads from each population pool to the reference genome using Bowtie2 (Langmead and Salzberg 2012). We then used two different software pipelines to call nucleotide variants from the resulting BAM alignment files.
First, SNVer (Wei et al. 2011), software specifically designed to detect variants in pooled sequences, identified 5.2 million variants. Second, the Genome Analysis Toolkit (Van der Auwera et al. 2013) was used to realign indels (RealignerTargetCreator, IndelRealigner) and call variants with UnifiedGenotyper; this method identified 5.4 million variants. The two variant sets were then intersected to include only SNPs that were identified by both methods, resulting in a final variant set of 3,510,585 SNPs.

Fst and Likelihood Ratio Test Analyses

A number of metrics have been used in whole-genome comparisons, but not all are applicable to pooled sequencing, as many require individual haplotype information and are not designed to properly account for sequencing errors in pooled data. In order to assess allele frequency differentiation between phenotypic classes, we used both FST (Weir and Cockerham 1984) and a likelihood ratio test (LRT; Kim et al. 2010). The latter method was included because it incorporates depth of coverage as a factor, which is important because regions of low coverage may not accurately reflect the allele frequency of the population. For both tests, we excluded sites with less than 10x coverage to avoid variants that might not accurately reflect allele frequencies in the population (Zhu et al. 2012). We also excluded sites with greater than 100x coverage, as these sites are probably repetitive sequences that do not map uniquely. Both FST and LRT metrics were smoothed over a 10-kb sliding window with 2-kb steps. Depending on the population and metric used, the total number of windows analyzed for each population and pool was between 174,914 and 175,169.

Cross Husbandry

Twenty-eight crosses were made between individuals from the Salt River population. Offspring were raised to at least 30-mm standard length in 29-gallon aquaria with a 16-hour light/8-hour dark cycle. Fish were euthanized using MS-222 and preserved in 70% ethanol.
Tissue was removed from the liver and right pectoral fin for subsequent DNA isolation. To stain external bone, fish were fixed in 10% neutral buffered formalin, stained with alizarin red, and preserved in 70% ethanol for phenotyping. Four half-sibling families (12 crosses in total) showed variation in pelvic phenotype in the F1 generation and were used for QTL mapping (Table 1.3). Pelvic score in these crosses ranged from 0 (no pelvis) to 8 (complete pelvis). In total, 381 F1 offspring were included in subsequent analyses.

Phenotyping

For all crosses, skeletal measurements were taken using digital calipers under a dissecting microscope. Measurements included standard length, pelvic girdle length, pelvic spine length, and pelvic ascending process height (Shapiro et al. 2009). Separate measurements were taken on left and right sides. The same person measured each trait, and each measurement was taken three times and then averaged. Numbers of lateral plates were assessed separately for left and right sides and, because mid-body plates were absent in all individuals, we counted anterior and posterior rows separately.

Bulked Segregant Analysis of Crosses

Using crosses from Salt River established in 2010, DNA was extracted from individuals with extreme pelvic phenotypes (that is, those with a pelvic score of 8 or those with a pelvic score of 0-2) and pooled, within crosses, in equimolar amounts (see Table 1.4). These pools, along with individual (unpooled) parents, were genotyped with 192 microsatellite markers as previously described (Shapiro et al. 2009), and the results were visualized with GeneMapper software (Applied Biosystems, Foster City, CA). The relative PCR amplification intensities of alleles from each pool were compared by eye, and we found that 32 markers showed differential allele representation between the two phenotypic pools in at least two of the three families tested.
These 32 markers were then used to genotype individual fish from four families from 2008 and 2010. Additionally, because the QTL mapping software we used to analyze our crosses is not able to process linkage groups with a single marker, Pun61, Pun98, Pun159, Stn259, Stn329, and Stn435 were added to anchor markers that were the only representative on a given linkage group. Since many progeny of the crosses had immature gonads, Stn19 was also used to genotype all fish individually in order to determine sex (Shikano et al. 2011). This marker produced a discernible genotype in 88% and 79% of fish in 2008 and 2010 crosses, respectively.

QTL Mapping

The number of offspring in each of the 12 crosses was between 15 and 49. By grouping crosses with the same male parent and conducting all further mapping analyses with F1 half-sibling families, we were able to use larger sample sizes and improve our power to identify quantitative trait loci (Family 1, n = 178; Family 2, n = 56; Family 3, n = 89; Family 4, n = 58). To that end, we used the half-sibling portal of GridQTL (Seaton 2006) to analyze each family separately. We included genotype data for 38 markers, phenotype data for 8 pelvic traits, standard length and sex, and marker distances from a previously published ninespine linkage map (Shapiro et al. 2009). GridQTL was run using length as a covariate, sex as a cofactor, and 1000 chromosome-wide permutations. Default settings were used for all other parameters.

Results and Discussion

Draft Genome and Comparative Resequencing

Genome sequencing and assembly. The ninespine stickleback reference assembly was sequenced from a single fish from the Church Road population (southcentral Alaska) using the Illumina HiSeq2000 platform. The draft genome contains 7,850 scaffolds (N50 length = 299.5 kb) and a total assembled length of 441.1 Mb with approximately 140.5x coverage. See Table 1.5 for a summary of genome metrics. The final annotation set ("MAKER standard build"; Campbell et al.
2014) contains 22,432 protein coding genes, 80.1% of which contain a protein domain as detected by IPRscan (Quevillon et al. 2005). 87% of genes have an annotation edit distance less than 0.5, consistent with a well annotated genome (Holt and Yandell 2011), and 95.9% of the annotated genes are similar to proteins in SwissProt as identified by BLAST (E < 0.0001) (Altschul et al. 1990). We used annotated protein sequence from threespine stickleback and fugu (Takifugu rubripes) to calculate a lineage-specific mutation rate of between 0.009 and 0.010 mutations/site/MY assuming divergence times of 85 MY and 100 MY, respectively (Santini et al. 2009; Near et al. 2012). These values are similar to mutation rates calculated for other teleost fish lineages (0.007-0.04 mutations/site/MY) (Jaillon et al. 2004; Burridge et al. 2008).

Pooled resequencing. Pooled resequencing is an effective method for identifying genomic regions that are differentiated or under selection when single-genome resequencing is cost-prohibitive or impractical. The estimation of allele frequencies in pooled data has been shown to accurately represent the allele frequencies in pooled populations given a minimum of 10x coverage (Zhu et al. 2012; Rellstab et al. 2013). This method has been used to successfully identify selective sweeps and candidate genomic regions underlying traits in many organisms, including Arabidopsis, maize, Drosophila, domestic chickens and pigs, and humans (Burke et al. 2010; Marklund and Carlborg 2010; Rubin et al. 2010; Turner et al. 2010; Janssen et al. 2011; Udpa et al. 2011; Zhou et al. 2011; Rubin et al. 2012). The Salt River and Pine Lake populations (Figure 1.1 A) of ninespine sticklebacks present a unique opportunity to use this method to better understand the genetic architecture of pelvic reduction.
The majority of freshwater ninespine stickleback populations have a complete pelvic skeleton and at least 10 exhibit a reduced pelvis; however, few show the range of pelvic phenotypes seen in Salt River and Pine Lake (Figure 1.1 B, C) (Nelson 1971; Nelson and Paetz 1972; Blouw and Boyd 1992; Ziuganov and Zotin 1995; Shapiro et al. 2006b; Herczeg et al. 2010; Mobley et al. 2011; Klepaker et al. 2013). Comparing very divergent phenotypes within a single population controls for demographic differences in genetic background that could confound results when comparing phenotypes between two populations. We began by genotyping individuals that contributed to the resequencing pool with 12 unlinked microsatellite markers and confirmed that there was no genetic differentiation based on pelvic phenotype within these populations. We analyzed Pine Lake and Salt River samples together. Using the Evanno method (Evanno et al. 2005) as implemented in STRUCTURE HARVESTER (Earl and vonHoldt 2011), we determined that the most likely number of populations is 2 (Figure 1.2); there was no substructure seen between phenotypes within a population.

We then collected and pooled DNA from between 68 and 100 fish with the most extreme pelvic phenotypes from each population (for a total of 4 groups) and sequenced each pool to a depth of between 25x and 45x coverage. Because pools were assembled based only on pelvic phenotype in unstructured populations, some of the genomic regions that show signatures of selection or differentiation are expected to influence pelvic variation. Allele frequency differences between pools of distinct phenotypes within a population were assessed using both FST (Weir and Cockerham 1984) and a likelihood ratio test (LRT) (Kim et al. 2010) in 10-kb sliding windows, with 2-kb steps, along the length of the genome.
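As a rough illustration of the windowing described above, the following Python sketch filters sites by coverage (10x to 100x, as stated in the Methods) and averages a per-site statistic in 10-kb windows advanced in 2-kb steps. The function name, the input tuple layout, the treatment of partial windows at the scaffold end, and the use of a simple arithmetic mean are all assumptions made for illustration; the smoothing actually applied to the FST and LRT values may differ.

```python
from statistics import mean


def windowed_means(sites, scaffold_length, window=10_000, step=2_000,
                   min_cov=10, max_cov=100):
    """Average a per-site statistic in sliding windows along one scaffold.

    sites: iterable of (position, coverage, value) tuples (hypothetical layout).
    Sites with coverage below min_cov or above max_cov are discarded first.
    Returns a list of (window_start, window_end, mean_value); windows that
    contain no retained sites are skipped.
    """
    # Keep only sites inside the accepted coverage range.
    kept = [(pos, val) for pos, cov, val in sites if min_cov <= cov <= max_cov]

    results = []
    start = 0
    while start < scaffold_length:
        end = start + window  # half-open interval [start, end)
        values = [val for pos, val in kept if start <= pos < end]
        if values:
            results.append((start, end, mean(values)))
        start += step
    return results


# Toy input: two sites fall outside the 10x-100x coverage range and are dropped.
example_sites = [
    (500, 20, 0.1),
    (1_500, 5, 0.9),     # coverage too low
    (2_500, 50, 0.3),
    (11_000, 150, 0.8),  # coverage too high
    (11_500, 30, 0.5),
]
example_windows = windowed_means(example_sites, scaffold_length=12_000)
```

With a 10-kb window and 2-kb step, each site contributes to up to five overlapping windows, which is why neighboring window values are strongly correlated.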
The mean values for each test when comparing phenotypes within a population were as follows: Salt River, mean LRT = 0.512, mean FST = 0.016; Pine Lake, mean LRT = 0.362, mean FST = 0.007. The overall low value for both of these statistics is expected, as both are metrics of population differentiation and the phenotypic groups that are being compared originate from the same population. To identify genomic regions that might contribute to pelvic phenotype in these populations, we further examined the top 0.1% of all windows (Salt River: FST > 0.060, LRT > 2.00; Pine Lake: FST > 0.022, LRT > 1.24). In Salt River, all linkage groups except 2, 10, and 21 have windows that are in the top 0.1%. In Pine Lake, all linkage groups have windows that meet this threshold. Additionally, similar regions of the genome show elevated FST and LRT values when comparing pelvic complete and pelvic reduced pools in Salt River and in Pine Lake (Figures 1.3 and 1.4). In both populations, linkage group 12 is enriched for elevated FST and LRT scores. There are also regions in the center of linkage group 4, the end of linkage group 19 and a segment of the unordered region of the genome that are elevated in both populations.

Based on the geographic proximity and similarity of pelvic phenotypes seen in both of these populations, we hypothesized that similar regions of the genome may be affecting pelvic phenotype in these populations. In order to test for overlap between the results from Salt River and those from Pine Lake, we counted how many of the windows found in the top 0.1% of LRT scores in Salt River overlap exactly with any windows in the top 0.1% identified in Pine Lake. Overall, we found that of the windows in the top 0.1% in both Salt River and Pine Lake, 259 overlapped exactly between the two populations (14.8% of all windows in the top 0.1%). There are also regions of the genome with high differentiation between phenotypes in one population, but not the other.
For example, linkage group 10 has a region of prominent differentiation in Pine Lake, but not in Salt River. Additionally, linkage group 21 contains 23 windows in the top 0.1% of scores in Pine Lake, while there are no high-LRT windows on that linkage group in Salt River. Overall, these patterns suggest that some genomic regions are indeed associated with pelvic reduction in both populations, for example, linkage groups 4, 12, and 19. However, there are also other genomic regions associated with pelvic phenotype that are unique to one population (e.g., linkage group 10 in Pine Lake).

Quantitative Trait Mapping

Crosses and bulked segregant analysis. Because whole-genome comparisons are expected to contain some false-positive signals, we combined this approach with traditional QTL mapping, which is more robust to this problem. In order to rapidly screen for genomic regions associated with pelvic phenotype, we began by using bulked segregant genotyping (Postlethwait et al. 1994; Cresko et al. 2004) with 192 previously described microsatellite markers located throughout the genome (Shapiro et al. 2009) (see Methods and Table 1.4). We found that 32 markers showed differential allele frequencies between complete and reduced pools in at least two families. These candidate markers were used to genotype all individual fish from all four families.

QTL mapping. F1 offspring from all families were genotyped individually with the 32 candidate markers identified by bulked segregant analysis. QTL analysis of genotypes and pelvic phenotypes with GridQTL (Seaton 2006) identified 14 linkage groups that affect at least one component of the pelvic skeleton (Table 1.6). Seven linkage groups (1a, 1b, 3, 8, 12, 14a, and 17) were identified in more than one family, while another seven linkage groups (2, 4, 10, 15b, 16, 18, and 19) affected only one component of the pelvis in a single family.
We also note that some linkage groups affect one pelvic structure in one family, and another structure in a second family. For example, linkage group 1b affects left spine and left girdle length in cross 2010-05 but ascending process height in cross 2010-03. These differences among crosses could be due to low numbers of offspring in some families, which would make small-effect QTL difficult to identify; to differences in the distribution of pelvic phenotypes across families; or to differences in the genetic backgrounds of the parents, which differ between families.

Overall, in the Salt River population, we detected genomic regions of a smaller effect than the major loci previously observed in other crosses (between 4.3 and 25.1 percent variance explained [PVE]). This is notable because, to date, examination of the genetic architecture of pelvic reduction across populations of both threespine and ninespine sticklebacks has found variation in pelvic phenotype to be controlled primarily by a single, large-effect QTL (PVE: 59.0-87.0%) and between 1 and 4 secondary QTL (PVE: 5.6-33.2%) (Shapiro et al. 2004; Coyle et al. 2007; Shapiro et al. 2009; Shikano et al. 2013). Furthermore, although some of the genomic regions implicated in our QTL mapping have been seen in other stickleback populations (linkage groups 4, 7b, and 8), these genomic regions may be acting on pelvic phenotype in a way that is distinct from that in other populations. For example, while linkage groups 4 and 8 have been identified as large-effect QTL in ninespine and threespine stickleback populations, respectively, they have only a small effect on pelvic phenotype in Salt River (LG4, 11.9 PVE; LG8, 4.9-5.9 PVE). Finally, we have also identified a number of novel small-effect QTL that contribute to pelvic reduction in the Salt River population, including linkage groups 1b, 3, 10, 12, 14a, 15b, 16, 17, 18, and 19.
Overlap of Pooled Resequencing and QTL Mapping

Of the 14 genomic regions we identified that affect pelvic phenotype in Salt River, 9 were located within 500 kb of a window in the top 0.1% of LRT values. In order to identify genes that might affect pelvic phenotype in Salt River, we compiled a list of candidate genes that were located within 50 kb up- or downstream of a window in the top 0.1% of LRT values that was also within 500 kb of a microsatellite identified in QTL mapping. This list included a total of 12 genes on 3 linkage groups (Table 1.7). As there has only been one gene previously implicated in stickleback pelvic reduction, it is unsurprising that the candidates identified by our study are novel. Furthermore, while some were located in genomic regions previously implicated in pelvic phenotype (linkage group 8, spine length in threespine sticklebacks) (Peichel et al. 2001), others were not (those on linkage groups 3 and 12). Of the list of 12 candidates, two stand out because of their role in sonic hedgehog (Shh) signaling (EFCAB7) and skeletal development (Chst11). EFCAB7 is interesting as its depletion has recently been shown to impair Shh signaling in skeletal tissues and mimic Ellis van Creveld syndrome, which is characterized by, among other phenotypes, shortened limbs (Pusapati et al. 2014). Chst11 homozygous null mice exhibit neonatal lethality, dwarfism, and abnormal skeletal structures. Taken together, the combination of traditional QTL mapping and whole-genome resequencing allowed us to characterize the genetic basis of pelvic reduction in Salt River and Pine Lake as a combination of multiple loci of relatively small to moderate effect and also identify at least two noteworthy candidate genes that can easily be assessed in terms of coding or, in the future, expression differences between the two phenotypes.
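The candidate-gene filter described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function names, the tuple layouts, and the toy coordinates and gene names are hypothetical, and distances are measured as the gap between intervals on the same scaffold (overlapping intervals have a gap of zero).

```python
def top_fraction_cutoff(scores, fraction=0.001):
    """Return the lowest score that still falls in the top `fraction` of scores."""
    ranked = sorted(scores, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    return ranked[n_top - 1]


def gap_between(a_start, a_end, b_start, b_end):
    """Distance in bp between two intervals; 0 if they overlap."""
    return max(b_start - a_end, a_start - b_end, 0)


def candidate_genes(windows, markers, genes, lrt_cutoff,
                    marker_dist=500_000, gene_dist=50_000):
    """Genes near a high-LRT window that is itself near a QTL marker.

    windows: (scaffold, start, end, lrt) tuples
    markers: (scaffold, position) tuples for QTL-associated microsatellites
    genes:   (name, scaffold, start, end) tuples
    """
    # Step 1: windows at or above the LRT cutoff.
    top = [w for w in windows if w[3] >= lrt_cutoff]
    # Step 2: keep top windows within marker_dist of any QTL marker.
    near_qtl = [
        w for w in top
        if any(scaf == w[0] and gap_between(w[1], w[2], pos, pos) <= marker_dist
               for scaf, pos in markers)
    ]
    # Step 3: genes within gene_dist up- or downstream of a retained window.
    names = {
        name for name, scaf, g_start, g_end in genes
        if any(scaf == w[0] and gap_between(w[1], w[2], g_start, g_end) <= gene_dist
               for w in near_qtl)
    }
    return sorted(names)


# Toy data with hypothetical scaffolds, coordinates, and gene names.
toy_windows = [("s1", 0, 10_000, 0.5),
               ("s1", 600_000, 610_000, 3.0),
               ("s2", 0, 10_000, 2.5)]
toy_markers = [("s1", 900_000)]
toy_genes = [("gene_a", "s1", 640_000, 650_000),
             ("gene_b", "s1", 700_000, 710_000),
             ("gene_c", "s2", 5_000, 8_000)]
toy_candidates = candidate_genes(toy_windows, toy_markers, toy_genes, lrt_cutoff=2.0)
```

In the toy data, only `gene_a` passes both distance filters: `gene_b` lies more than 50 kb from the retained window, and `gene_c` sits on a scaffold with no QTL marker.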
Linkage Group 12 and Pelvic Phenotype

Using both whole-genome resequencing and QTL mapping techniques, we identified linkage group 12 as a contributor to pelvic phenotype. In an Alaskan population of ninespine sticklebacks, lateral plate number maps to this linkage group (Shapiro et al. 2009). Additionally, this linkage group contains the sex-determination region of the genome in ninespine sticklebacks (Ross et al. 2009; Shapiro et al. 2009) and when we compared the overall pelvic score between males and females in Salt River crosses, we found that females have significantly lower pelvic scores than males (ANOVA; p-value < 0.001) (Figures 1.5 and 1.6). This difference between the sexes was seen in all crosses and in all pelvic structures with the exception of the left ascending process.

Because we found that pelvic phenotype differed between the sexes, we tested for potential differences between QTL identified in males and females using only the largest cross (2008-01). In doing so, we found that the QTL on linkage group 12 is only present in females. This could be because males have an XY genotype at the sex-determining region and would therefore show no heterozygosity at markers in this region; without heterozygosity at a marker, it would not be possible to identify a QTL. We also found three QTL that were specific to males (linkage groups 1a, 5, and 18). This may be because, in general, females are more likely to be missing components of the pelvis. Therefore, QTL found in exclusively female samples could be explaining presence or absence of a structure, while those identified in exclusively male populations account for variation in a structure that is present.

Motivated by these results, we tested for correlations between sex and pelvic phenotype in wild-caught fish. For both the Pine Lake and Salt River populations, we used the first 100 pelvic-complete fish that we collected for resequencing.
Likewise, we used the first 62-86 pelvic-reduced fish that we collected. Therefore, we do not have a random sample of the population as a whole, but fish were taken randomly from the population within each pelvic phenotype. We found that within wild-caught Salt River fish, there are significantly more females with a reduced pelvis and significantly more males with a complete pelvis (Fisher's exact test, p-value < 0.01) (Table 1.8). This pattern is also seen in Salt River F1 offspring: the mean pelvic score is significantly lower in females than males (ANOVA, p < 0.001). Surprisingly, wild-caught fish from Pine Lake do not show any sex-specific differences in pelvic score.

Nucleotide diversity in these populations also suggests that there is no significant difference in pelvic phenotype between the sexes in Pine Lake (Figure 1.7). Overall, linkage group 12 shows increased values of pi compared to other genomic regions. This is probably because the reference genome was assembled using a female, which would have two X chromosomes; any reads from Y chromosomes in the pooled sequencing data would likely map on top of the X chromosome. This would result in an elevated level of nucleotide diversity on this linkage group. Interestingly, there are differences in pi on LG12 between the two phenotypic pools in Salt River. Pi is higher on LG12 in the complete pool, which is composed primarily of males. Pi is lower in the reduced (female-dominant) pool. This difference in pi is not seen in Pine Lake, which implies that the ratio of sexes in each phenotypic pool is similar.

These results are unexpected given how close, phylogenetically and geographically, these two populations are. However, there is the possibility that while linkage group 12 contributes to pelvic phenotype in both of these populations (which is observed in resequencing data) the specific region of the linkage group is different between the two.
We did not conduct QTL mapping in Pine Lake, but it may be possible that a recombination event since the separation of Salt River and Pine Lake populations has separated the regions of LG12 that contribute to pelvic phenotype and sex determination in one of these populations. That is, a hypothetical "pelvic reduction" gene may originally have been linked to the sex-determining region of LG12 but, after a recombination event since the split of the two populations, moved to the pseudoautosomal region of the chromosome in Pine Lake. One way to determine if pelvic reduction is mapping to distinct regions of linkage group 12 in Salt River and Pine Lake would be a comparison of QTL mapping results between the two populations. Currently, QTL mapping data for Pine Lake are not available. Furthermore, resequencing data are uninformative as recombination is highly reduced in sex chromosomes, thereby elevating Fst and LRT values across the entire linkage group.

What might drive this difference in pelvic phenotypes between the sexes? A reduced pelvic skeleton could provide a female-specific advantage, for example, an increase in clutch or egg size. Life history traits such as these have long been thought to be a prime target of selection (Mousseau and Roff 1987) but while many studies have surveyed life history traits in threespine sticklebacks (reviewed in Baker 1994), very few have specifically examined differences between individuals or populations that vary in the extent of bony armor. Data comparing the clutch size of populations with varying lateral plate morphology have been conflicting; while Kynard (1972) reported that females with fewer lateral plates had significantly larger clutches, Baker (1994) reanalyzed the same data and found no correlation. Conversely, Baker et al.
(1998) compared female life history traits among 12 Alaskan populations of threespine sticklebacks that varied in pelvic phenotype and found that while clutch size was not statistically different between morphs, mean egg mass was greater in pelvic-reduced populations. Egg size has been shown to be correlated with larger embryos as well as juveniles (Blaxter and Hempel 1963; Reagan and Conley 1977; Thorpe et al. 1984; McKay et al. 1985), an increased growth rate in hatchlings (Wallace and Aasjord 1984), and increased juvenile survival (Marsh 1986). Therefore, egg size is an important life history trait that could impact an individual's lifetime fecundity and fitness.

Differentiation Between Salt River and Pine Lake Populations

Salt River and Pine Lake are located within 40 km of one another and may have been connected at some point in the recent past (Nelson and Paetz 1974), yet these bodies of water differ in several fundamental respects. In addition to typical differences between lake and stream habitats (e.g., depth, vegetation, water movement), there are also large differences in salinity (Salt River = 2%; Pine Lake = 0.31%) (Nelson 1972). Furthermore, ninespine stickleback residents of each of these habitats differ in overall body morphology and may occupy distinct niches. Fish from Salt River have a more "benthic" appearance, that is, shorter and deeper bodies with shorter fins (personal observation). In contrast, those in Pine Lake appear more "limnetic" with long, streamlined bodies and heads, longer fins, and a narrower caudal region. This combination of traits suggests an open-water niche (Webb 1982; Walker 1997; Walker and Bell 2000; Spoljaric and Reimchen 2007). Because of these morphological and potential physiological dissimilarities, we also examined genomic differentiation between the two populations as a whole to identify genes that may potentially be under selection in these very different habitats.
Both FST and LRT showed broadly similar patterns and, as expected, overall values of both of these metrics were higher than in within-population comparisons (mean FST and LRT were 0.053 and 3.55, respectively) (Figure 1.8). Similar genomic comparisons have been done in threespine sticklebacks and have identified genomic regions that differ significantly between marine and freshwater populations as well as benthic and limnetic morphs within a lake (Hohenlohe et al. 2010; Jones et al. 2012a; Jones et al. 2012b). While it is possible that our work may not be directly comparable to previous studies using a different species, any regions that overlap between the two may provide information about recurrent selection at similar genomic regions between species.

Collectively, previous work has identified hundreds of SNPs with differing allele frequencies between habitats and nearly every linkage group contains regions of increased differentiation. However, a handful of linkage groups have been identified repeatedly in multiple studies using SNP genotyping arrays, RAD Tag genotyping, and whole-genome resequencing (1, 4, 7, 11, 20, and 21). The genomic regions with the highest differentiation identified in our study are located on linkage groups 5, 6, 8, 10, 13, 14, 16, and 20. Hohenlohe et al. (2010) identified regions of linkage groups 2, 4, 9, 11, 16, and 19 that were highly differentiated among freshwater populations of threespine sticklebacks, but with the exception of linkage group 16, there is very little overlap between this interpopulation comparison and our own. This could be due to the fact that all populations included in the threespine stickleback comparison were from lake populations and may be detecting selection on genomic regions that would be beneficial in those specific habitats.
Because previous work has been conducted in a different species and focused primarily on differences between marine and freshwater habitats (both populations included in our comparison are from freshwater habitats), the lack of overlap between the two is unsurprising. The best comparison to our study may be between benthic and limnetic threespine stickleback species pairs as, morphologically, Salt River resembles a benthic form while Pine Lake appears more limnetic. Jones et al. (2012a) identified 15 genomic regions that differed between benthics and limnetics in multiple lakes, including portions of linkage groups 1, 2, 4, 7, 10, 11, 12, 20, and 21. Again, we found only one linkage group (10) in common between stream and lake populations of ninespine sticklebacks in our study and the comparison between benthic and limnetic threespine stickleback species pairs (Jones et al. 2012a).

The overall lack of overlap between our study and other published work could be because in all other comparisons the authors were specifically looking for regions that were under selection across multiple populations. Because our work focused on just two populations, we may be seeing signals of differentiation that are very population-specific and would not be picked up in studies such as those done previously. It could also be that the genomic regions under selection in ninespine sticklebacks after a shift to freshwater are different than those in threespine sticklebacks. It has already been noted that different genomic regions control sex, lateral plates, and pelvic reduction in the two species (Peichel et al. 2001; Colosimo et al. 2004; Shapiro et al. 2004; Shapiro et al. 2009). Finally, although both threespine and ninespine sticklebacks have adapted to superficially similar habitats and undergone a similar set of morphological changes associated with a shift to freshwater, because they are distinct species with unique natural histories, they simply may not be directly comparable.
Candidate genes in regions of differentiation. In other between-population comparisons in threespine sticklebacks, specific genes have been identified that show repeated selection between benthic and limnetic species pairs (IGK, KITLG, THUMPD3) and marine and freshwater populations (WNT7B, ATPase, EDA, Mucin, SULT4A) (Jones et al. 2012a; Jones et al. 2012b). To test whether any of the same genes might show signatures of selection between Salt River and Pine Lake, we examined genes that were near genomic regions that showed the highest differentiation between the two populations. We compiled a list of genes that were found within 50 kb up- or downstream of peaks that had an LRT score over 60 (top 0.5% of scores). A total of 79 windows across 10 scaffolds met this criterion and contained a total of 45 genes. Interestingly, many genes on this list play a role in immune function (JAK2, CD274, AC3H2, SIGLEC15, SLAMF7). In threespine sticklebacks, even benthic and limnetic species pairs in the same lake differ in parasite load and populations show adaptation to local parasites (MacColl 2009; Eizaguirre et al. 2012); therefore, it is expected that the Salt River and Pine Lake populations would adapt to distinct local immunological challenges. Other genomic scans in threespine sticklebacks that compared marine and freshwater populations and benthic and limnetic species pairs also identified genes associated with immune function (Jones et al. 2012a; Jones et al. 2012b). Another notable gene that was near a peak of differentiation is DKK1. This gene has been shown to play a role in face and head morphogenesis (Roessler et al. 2000; Mao et al. 2001; Mukhopadhyay et al. 2001) and is an interesting candidate given that a primary morphological difference between Salt River and Pine Lake fish is craniofacial shape (personal observation).
In conclusion, while we did not identify any of the same specific genes that show high signatures of selection between benthic and limnetic threespine stickleback populations, we did find some genes that are in the same class as those identified in threespine sticklebacks (i.e., immune function). We also identified genes such as DKK1, which have not been noted in other species but are interesting because of their potential role in a morphological difference that characterizes Salt River and Pine Lake.

Conclusions

Because of their dramatic morphological, physiological, and behavioral variation between populations, ninespine sticklebacks provide an ideal model to examine convergent, and adaptively relevant, skeletal phenotypes in wild populations. By using this species to better understand the genetic architecture underlying a dramatic change such as pelvic reduction, we can gain a better understanding of the patterns that underlie evolutionary change in general. For example, how many genetic changes are responsible for large morphological changes, and do the same genetic changes underlie the repeated evolution of similar traits in different lineages? The work presented here suggests that pelvic reduction is not always explained by a small number of genes of large effect and that, of the populations of ninespine sticklebacks that have been examined to date, there are at least three distinct genetic mechanisms that could lead to pelvic reduction. Additionally, because a considerable amount of work has been done on the genetics of phenotypic variation in the closely related threespine stickleback, work in ninespine sticklebacks adds to the understanding of convergent traits on a micro- and macroevolutionary scale.
While there may not be enough populations examined in either species to draw definitive conclusions, this work, combined with current literature, suggests that while there are some mechanisms that underlie pelvic reduction in both species (i.e., Pitx1), ninespine sticklebacks have exhibited a broader range of genetic possibilities for pelvic loss.

Here we have presented the draft genome of the ninespine stickleback, which was used as the reference for comparative resequencing of two Canadian populations that display an unusually broad range of pelvic phenotypes. This allowed us to identify a larger than expected number of differentiated genomic regions between pools of individuals with divergent pelvic phenotypes. We were then able to compare patterns of differentiation between these two populations with similar population-level pelvic phenotypes. We found that linkage group 12, the sex-determining linkage group, shows elevated levels of differentiation in both Salt River and Pine Lake. In addition, other regions of the genome show differentiation in both populations. These results suggest that similar genetic mechanisms are responsible for pelvic reduction in these two populations.

To complement the whole-genome resequencing we also conducted QTL mapping using four half-sibling families from Salt River. These results implicated 14 different genomic regions that contribute to pelvic phenotype. Among these were nine that were located within 500 kb of a window identified as being highly differentiated by whole-genome resequencing. This combination of techniques allowed us to compile a list of candidate genes possibly contributing to pelvic reduction, including candidates that have been implicated in skeletal development and Shh signaling.

Overall, the genetic architecture of pelvic reduction that we have described for Salt River and Pine Lake contrasts with previously published studies.
Previous work in both threespine and ninespine sticklebacks suggests that pelvic reduction is primarily caused by a few genes of large effect with a small number of modifiers, and in one study a specific molecular change in Pitx1 was found in multiple populations (Chan et al. 2010). While identification of individual genes controlling phenotype is important and allows for a deeper understanding of the developmental pathways that lead to specific morphologies as well as the selective forces that may be acting on individual alleles in wild populations, simply understanding the genetic architecture of a dramatic and ecologically relevant trait can add to our knowledge of the overall patterns of evolution.

Our current knowledge of the genetic basis of large morphological changes may be biased because genes of large effect are easier to identify. Furthermore, once a gene is identified, it is added to a list of candidate genes for future studies on a phenotype and, therefore, may be overrepresented in subsequent literature. Complementation tests or gene expression analyses, for example, are testing for possible effects of previously identified genes or genomic regions (Cole et al. 2003; Shapiro et al. 2009). Other work has used mapping crosses to examine only a subset of genomic markers in the vicinity of Pitx1 (Coyle et al. 2007).

The work presented here suggests that pelvic reduction in sticklebacks does not always have a genetically simple basis. Before sticklebacks were used for molecular genetic studies, it had already been noted that pelvic reduction, at least in ninespine sticklebacks, might have different genetic architectures between populations. For example, Ziuganov and Zotin (1995) hypothesize that pelvic reduction in Levin Navolok Creek (Russia) is the result of a single genomic region with complete genetic dominance.
In contrast, Blouw and Boyd (1992) found that a polygenic model with a genotypic threshold best explains pelvic reduction in O'Keefe's Lake (Prince Edward Island, Canada). Therefore, while there are many cases of ecologically relevant traits in sticklebacks and other organisms that have been attributed to a small number of genes or genomic regions, the work presented here is one of a growing number of examples of a complex genetic basis for a major phenotypic change. See Orr and Coyne (1992) and Rockman (2012) for reviews of the possible overrepresentation of large-effect QTL in the literature to date. Furthermore, we identified several QTL that have never been shown to affect pelvic phenotype in other populations. Linkage group 12 is a particularly interesting QTL because this linkage group is the sex-determining region of the genome. This suggests that pelvic phenotype is correlated with sex in this population. In fact, we saw significantly lower pelvic scores in wild-caught females from Salt River as well as in females from crosses, but did not see the same pattern in wild-caught Pine Lake fish, suggesting a recent recombination event in the Pine Lake population. Differences in pelvic phenotype between the sexes have not been previously described and raise the possibility that pelvic reduction could have an effect on female-specific reproductive traits such as clutch size. Possible differences in reproductive traits, such as increased clutch or egg size in fish with less armor, have been reported, but those studies compared armor phenotypes from different populations. More work will be needed to determine whether this pattern holds within the Salt River and Pine Lake populations.
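The association between sex and pelvic phenotype in wild-caught fish can be checked with a small worked example using the counts in Table 1.8. This is a sketch, not the analysis code used here: the function name is hypothetical, and the two-sided rule (summing all tables no more probable than the observed one) is an assumed convention.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # The small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Counts from Table 1.8: rows are complete and reduced pelvis, columns are female and male.
p_salt = fisher_exact_two_sided(38, 60, 42, 18)   # Salt River
p_pine = fisher_exact_two_sided(43, 54, 45, 37)   # Pine Lake

print(f"Salt River: P = {p_salt:.2g}")
print(f"Pine Lake:  P = {p_pine:.2g}")
```

With these counts the Salt River table falls below the P < 0.01 threshold reported in Table 1.8, while the Pine Lake table does not reach significance at 0.05.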
If pelvic reduction is found to correlate with differences in reproductively relevant traits, then increased fecundity may need to be considered, alongside the current hypotheses (including differences in calcium availability and the presence of grasping predators), as a possible selective force driving pelvic reduction.

References

Aldenhoven, J. T., M. A. Miller, P. S. Corneli, and M. D. Shapiro. 2010. Phylogeography of ninespine sticklebacks (Pungitius pungitius) in North America: glacial refugia and the origins of adaptive traits. Mol Ecol 19:4061-4076. Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J Mol Biol 215:403-410. Baker, J. A. 1994. Life history variation in female threespine stickleback. Pp. 144-187 in M. A. Bell, and S. A. Foster, eds. The Evolutionary Biology of the Threespine Stickleback. Oxford University Press, New York. Baker, J. A., S. A. Foster, D. C. Heins, M. A. Bell, and R. W. King. 1998. Variation in female life-history traits among Alaskan populations of the threespine stickleback, Gasterosteus aculeatus L. (Pisces: Gasterosteidae). Biol J Linn Soc Lond 63:141-159. Bell, A. M., G. Orti, J. A. Walker, and J. P. Koenings. 1993. Evolution of pelvic reduction in threespine stickleback fish: a test of competing hypotheses. Evolution 47:906-914. Bell, M. A. 1987. Interacting evolutionary constraints in pelvic reduction of threespine sticklebacks, Gasterosteus aculeatus (Pisces, Gasterosteidae). Biol J Linn Soc 31:347-382. Bell, M. A. and S. A. Foster. 1994. The Evolutionary Biology of the Threespine Stickleback. Oxford Univ Press, Oxford. Bell, M. A. and G. Orti. 1994. Pelvic reduction in threespine stickleback from Cook Inlet lakes: geographic distribution and intrapopulation variation. Copeia 1994:314-325. Bernatchez, L. and C. C. Wilson. 1998. Comparative phylogeography of Nearctic and Palearctic fishes. Molecular Ecology 7:431-452. Blaxter, J. H. S. and G.
Hempel. 1963. The influence of egg size on herring larvae, Clupea harengus. J. Cons. Inter. Explor. Mer. 28:211-240. Blouw, D. M. and G. J. Boyd. 1992. Inheritance of reduction, loss, and asymmetry of the pelvis of Pungitius pungitius (ninespine stickleback). Heredity 68:33-42. Burke, M. K., J. P. Dunham, P. Shahrestani, K. R. Thornton, M. R. Rose, and A. D. Long. 2010. Genome-wide analysis of a long-term evolution experiment with Drosophila. Nature 467:587-590. Burridge, C. P., D. Craw, D. Fletcher, and J. M. Waters. 2008. Geological dates and molecular rates: fish DNA sheds light on time dependency. Mol Biol Evol 25:624-633. Campbell, M. S., M. Law, C. Holt, J. C. Stein, G. D. Moghe, D. E. Hufnagel, J. Lei, R. Achawanantakun, D. Jiao, C. J. Lawrence, D. Ware, S. H. Shiu, K. L. Childs, Y. Sun, N. Jiang, and M. Yandell. 2014. MAKER-P: A Tool Kit for the Rapid Creation, Management, and Quality Control of Plant Genome Annotations. Plant physiology 164:513-524. Cantarel, B. L., I. Korf, S. M. Robb, G. Parra, E. Ross, B. Moore, C. Holt, A. Sanchez Alvarado, and M. Yandell. 2008. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res 18:188-196. Chan, Y. F., M. E. Marks, F. C. Jones, G. Villarreal, Jr., M. D. Shapiro, S. D. Brady, A. M. Southwick, D. M. Absher, J. Grimwood, J. Schmutz, R. M. Myers, D. Petrov, B. Jonsson, D. Schluter, M. A. Bell, and D. M. Kingsley. 2010. Adaptive evolution of pelvic reduction in sticklebacks by recurrent deletion of a Pitx1 enhancer. Science 327:302-305. Cole, N. J., M. Tanaka, A. Prescott, and C. Tickle. 2003. Expression of limb initiation genes and clues to the morphological diversification of threespine stickleback. Curr Biol 13:R951-952. Colosimo, P. F., C. L. Peichel, K. Nereng, B. K. Blackman, M. D. Shapiro, D. Schluter, and D. M. Kingsley. 2004. The genetic architecture of parallel armor plate reduction in threespine sticklebacks. PLoS Biol 2:E109. 32 Coyle, S. M., F. A. 
Huntingford, and C. L. Peichel. 2007. Parallel evolution of Pitx1 underlies pelvic reduction in Scottish threespine stickleback (Gasterosteus aculeatus). J Hered 98:581-586. Cresko, W. A., A. Amores, C. Wilson, J. Murphy, M. Currey, P. Phillips, M. A. Bell, C. B. Kimmel, and J. H. Postlethwait. 2004. Parallel genetic basis for repeated evolution of armor loss in Alaskan threespine stickleback populations. Proc Natl Acad Sci U S A 101:6050-6055. Earl, D. A. and B. M. vonHoldt. 2011. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv. Genet. Resour. 4:359-361. Eizaguirre, C., T. L. Lenz, M. Kalbe, and M. Milinski. 2012. Divergent selection on locally adapted major histocompatibility complex immune genes experimentally proven in the field. Ecol Lett 15:723-731. Evanno, G., S. Regnaut, and J. Goudet. 2005. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol 14:2611-2620. Grabherr, M. G., B. J. Haas, M. Yassour, J. Z. Levin, D. A. Thompson, I. Amit, X. Adiconis, L. Fan, R. Raychowdhury, Q. Zeng, Z. Chen, E. Mauceli, N. Hacohen, A. Gnirke, N. Rhind, F. di Palma, B. W. Birren, C. Nusbaum, K. Lindblad-Toh, N. Friedman, and A. Regev. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature biotechnology 29:644-652. Gross, H. P. 1978. Natural selection by predators on the defensive apparatus of the three-spined stickleback, Gasterosteus aculeatus L. Can J Zool 56:398-413. Hagen, D. W. and L. G. Gilbertson. 1972. Geographic variation and environmental selection in Gasterosteus aculeatus L. in the Pacific northwest, America. Evolution 26:32-51. Herczeg, G., M. Turtiainen, and J. Merila. 2010. Morphological divergence of North-European nine-spined sticklebacks (Pungitius pungitius): signatures of parallel evolution. Biol J Linn Soc 101:403-416. Hewitt, G. 2000. The genetic legacy of the Quaternary ice ages.
Nature 405:907-913. Hohenlohe, P. A., S. Bassham, P. D. Etter, N. Stiffler, E. A. Johnson, and W. A. Cresko. 2010. Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags. PLoS Genet 6:e1000862. Holt, C. and M. Yandell. 2011. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 33 12:491. Holt, R. A., G. M. Subramanian, A. Halpern, G. G. Sutton, R. Charlab, D. R. Nusskern, P. Wincker, A. G. Clark, J. M. Ribeiro, R. Wides, S. L. Salzberg, B. Loftus, M. Yandell, W. H. Majoros, D. B. Rusch, Z. Lai, C. L. Kraft, J. F. Abril, V. Anthouard, P. Arensburger, P. W. Atkinson, H. Baden, V. de Berardinis, D. Baldwin, V. Benes, J. Biedler, C. Blass, R. Bolanos, D. Boscus, M. Barnstead, S. Cai, A. Center, K. Chaturverdi, G. K. Christophides, M. A. Chrystal, M. Clamp, A. Cravchik, V. Curwen, A. Dana, A. Delcher, I. Dew, C. A. Evans, M. Flanigan, A. Grundschober-Freimoser, L. Friedli, Z. Gu, P. Guan, R. Guigo, M. E. Hillenmeyer, S. L. Hladun, J. R. Hogan, Y. S. Hong, J. Hoover, O. Jaillon, Z. Ke, C. Kodira, E. Kokoza, A. Koutsos, I. Letunic, A. Levitsky, Y. Liang, J. J. Lin, N. F. Lobo, J. R. Lopez, J. A. Malek, T. C. McIntosh, S. Meister, J. Miller, C. Mobarry, E. Mongin, S. D. Murphy, D. A. O'Brochta, C. Pfannkoch, R. Qi, M. A. Regier, K. Remington, H. Shao, M. V. Sharakhova, C. D. Sitter, J. Shetty, T. J. Smith, R. Strong, J. Sun, D. Thomasova, L. Q. Ton, P. Topalis, Z. Tu, M. F. Unger, B. Walenz, A. Wang, J. Wang, M. Wang, X. Wang, K. J. Woodford, J. R. Wortman, M. Wu, A. Yao, E. M. Zdobnov, H. Zhang, Q. Zhao, S. Zhao, S. C. Zhu, I. Zhimulev, M. Coluzzi, A. della Torre, C. W. Roth, C. Louis, F. Kalush, R. J. Mural, E. W. Myers, M. D. Adams, H. O. Smith, S. Broder, M. J. Gardner, C. M. Fraser, E. Birney, P. Bork, P. T. Brey, J. C. Venter, J. Weissenbach, F. C. Kafatos, F. H. Collins and S. L. Hoffman. 2002. The genome sequence of the malaria mosquito Anopheles gambiae. 
Science 298:129-149. Hoogland, R. D., D. Morris, and N. Tinbergen. 1957. The spines of sticklebacks (Gasterosteus and Pygosteus) as means of defense against predators (Perca and Esox). Behaviour 10:205-230. Jaillon, O., J. M. Aury, F. Brunet, J. L. Petit, N. Stange-Thomann, E. Mauceli, L. Bouneau, C. Fischer, C. Ozouf-Costaz, A. Bernot, S. Nicaud, D. Jaffe, S. Fisher, G. Lutfalla, C. Dossat, B. Segurens, C. Dasilva, M. Salanoubat, M. Levy, N. Boudet, S. Castellano, V. Anthouard, C. Jubin, V. Castelli, M. Katinka, B. Vacherie, C. Biemont, Z. Skalli, L. Cattolico, J. Poulain, V. De Berardinis, C. Cruaud, S. Duprat, P. Brottier, J. P. Coutanceau, J. Gouzy, G. Parra, G. Lardier, C. Chapple, K. J. McKernan, P. McEwan, S. Bosak, M. Kellis, J. N. Volff, R. Guigo, M. C. Zody, J. Mesirov, K. Lindblad-Toh, B. Birren, C. Nusbaum, D. Kahn, M. Robinson-Rechavi, V. Laudet, V. Schachter, F. Quetier, W. Saurin, C. Scarpelli, P. Wincker, E. S. Lander, J. Weissenbach, and H. Roest Crollius. 2004. Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature 431:946-957. Janssen, S., G. Ramaswami, E. E. Davis, T. Hurd, R. Airik, J. M. Kasanuki, L. Van Der Kraak, S. J. Allen, P. L. Beales, N. Katsanis, E. A. Otto, and F. Hildebrandt. 2011. Mutation analysis in Bardet-Biedl syndrome by DNA pooling and massively parallel resequencing in 105 individuals. Hum Genet 129:79-90. 34! Jones, F. C., Y. F. Chan, J. Schmutz, J. Grimwood, S. D. Brady, A. M. Southwick, D. M. Absher, R. M. Myers, T. E. Reimchen, B. E. Deagle, D. Schluter, and D. M. Kingsley. 2012a. A genome-wide SNP genotyping array reveals patterns of global and repeated species-pair divergence in sticklebacks. Curr Biol 22:83-90. Jones, F. C., M. G. Grabherr, Y. F. Chan, P. Russell, E. Mauceli, J. Johnson, R. Swofford, M. Pirun, M. C. Zody, S. White, E. Birney, S. Searle, J. Schmutz, J. Grimwood, M. C. Dickson, R. M. Myers, C. T. Miller, B. R. Summers, A. K. Knecht, S. D. 
Brady, H. Zhang, A. A. Pollen, T. Howes, C. Amemiya, J. Baldwin, T. Bloom, D. B. Jaffe, R. Nicol, J. Wilkinson, E. S. Lander, F. Di Palma, K. Lindblad-Toh, and D. M. Kingsley. 2012b. The genomic basis of adaptive evolution in threespine sticklebacks. Nature 484:55-61. Kim, S. Y., Y. Li, Y. Guo, R. Li, J. Holmkvist, T. Hansen, O. Pedersen, J. Wang, and R. Nielsen. 2010. Design of association studies with pooled or un-pooled next-generation sequencing data. Genet Epidemiol 34:479-491. Klepaker, T., K. Ostbye, and M. A. Bell. 2013. Regressive evolution of the pelvic complex in stickleback fishes: a study of convergent evolution. Evol Ecol Res 15:1-23. Korf, I. 2004. Gene Finding in Novel Genomes. BMC Bioinformatics 5:59-67. Korf, I., M. Yandell, and J. Bedel. 2003. BLAST. O'Reilly, Cambridge. Kynard, B. E. 1972. Male breeding behavior and lateral plate phenotypes in the threespine stickleback (Gasterosteus aculeatus L.). University of Washington, Seattle. Langmead, B. and S. L. Salzberg. 2012. Fast gapped-read alignment with Bowtie 2. Nature methods 9:357-359. Lescak, E. A. and F. A. von Hippel. 2011. Selective predation of threespine stickleback by rainbow trout. Ecology of Freshwater Fish 20:308-314. MacColl, A. D. C. 2009. Parasite burdens differ between sympatric three-spined stickleback species. Ecography 32:153-160. Mao, B., W. Wu, Y. Li, D. Hoppe, P. Stannek, A. Glinka, and C. Niehrs. 2001. LDL-receptor-related protein 6 is a receptor for Dickkopf proteins. Nature 411:321-325. Marcais, G. and C. Kingsford. 2011. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27:764-770. Marchinko, K. B. 2009. Predation's role in repeated phenotypic and genetic divergence of armor in threespine stickleback. Evolution 63:127-138. Marklund, S. and O. Carlborg. 2010. SNP detection and prediction of variability between chicken lines using genome resequencing of DNA pools. BMC Genomics 11:665. Marsh, E. 1986.
Effects of Egg Size on Offspring Fitness and Maternal Fecundity in the Orangethroat Darter, Etheostoma spectabile (Pisces: Percidae). Copeia 1986:18-30. McKay, L. R., P. E. Ihssen, and G. W. Friars. 1985. Genetic parameters of growth in rainbow trout, Salmo gairdneri, prior to maturation. Can. J. Genet. Cyto 28:306-312. Mobley, K. B., D. Lussetti, F. Johansson, G. Englund, and F. Bokma. 2011. Morphological and genetic divergence in Swedish postglacial stickleback (Pungitius pungitius) populations. BMC Evol Biol 11:287. Moodie, G. E. E. 1972. Predation, natural selection and adaptation in an unusual threespine stickleback. Heredity 28:155-167. Mousseau, T. A. and D. A. Roff. 1987. Natural selection and the heritability of fitness components. Heredity (Edinb) 59 (Pt 2):181-197. Mukhopadhyay, M., S. Shtrom, C. Rodriguez-Esteban, L. Chen, T. Tsukui, L. Gomer, D. W. Dorward, A. Glinka, A. Grinberg, S.-P. Huang, C. Niehrs, J. C. I. Belmonte, and H. Westphal. 2001. Dickkopf1 is required for embryonic head induction and limb morphogenesis in the mouse. Dev. Cell 1:423-434. Near, T. J., R. I. Eytan, A. Dornburg, K. L. Kuhn, J. A. Moore, M. P. Davis, P. C. Wainwright, M. Friedman, and W. L. Smith. 2012. Resolution of ray-finned fish phylogeny and timing of diversification. Proc Natl Acad Sci U S A 109:13698-13703. Nelson, J. S. 1971. Absence of the pelvic complex in ninespine sticklebacks, Pungitius pungitius, collected in Ireland and Wood Buffalo National Park region, Canada, with notes on meristic variation. Copeia:707-717. Nelson, J. S. and F. M. Atton. 1971. Geographic and morphological variation in the presence and absence of the pelvic skeleton in the brook stickleback, Culaea inconstans (Kirtland), in Alberta and Saskatchewan. Can J Zool 49:343-352. Nelson, J. S. and M. J. Paetz. 1972. Fishes of the north-eastern Wood Buffalo National Park region, Alberta and North-West Territories. Can Field-Nat 86:133-144. Orr, H. A. and J. A. Coyne. 1992.
The genetics of adaptation: a reassessment. Am Nat 140:725-742. 36 Parra, G., K. Bradnam, and I. Korf. 2007. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23:1061-1067. Peichel, C. L., K. S. Nereng, K. A. Ohgi, B. L. Cole, P. F. Colosimo, C. A. Buerkle, D. Schluter, and D. M. Kingsley. 2001. The genetic architecture of divergence between threespine stickleback species. Nature 414:901-905. Posada, D. and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817-818. Postlethwait, J. H., S. L. Johnson, C. N. Midson, W. S. Talbot, M. Gates, E. W. Ballinger, D. Africa, R. Andrews, T. Carl, J. S. Eisen, and et al. 1994. A genetic linkage map for the zebrafish. Science 264:699-703. Pritchard, J. K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945-959. Pusapati, G. V., C. E. Hughes, K. V. Dorn, D. Zhang, P. Sugianto, L. Aravind, and R. Rohatgi. 2014. EFCAB7 and IQCE Regulate Hedgehog Signaling by Tethering the EVC-EVC2 Complex to the Base of Primary Cilia. Developmental cell 28:483-496. Quevillon, E., V. Silventoinen, S. Pillai, N. Harte, N. Mulder, R. Apweiler, and R. Lopez. 2005. InterProScan: protein domains identifier. Nucleic Acids Res 33:W116-120. Reagan, R. E. and C. M. Conley. 1977. Effect of egg diameter on growth of channel catfish. Prog. Fish Cult. 39:133-134. Reimchen, T. E. 1980. Spine deficiency and polymorphism in a population of Gasterosteus aculeatus-an adaptation to predators. Can J Zool 58:1232-1244. Reist, J. D. 1980. Predation upon pelvic phenotypes of brook stickleback, Culaea inconstans, by selected invertebrates. Can J Zool 58:1253-1258. Rellstab, C., S. Zoller, A. Tedder, F. Gugerli, and M. C. Fischer. 2013. Validation of SNP allele frequencies determined by pooled next-generation sequencing in natural populations of a non-model plant species. PLoS One 8:e80422. Rockman, M. V. 2012. 
The QTN program and the alleles that matter for evolution: all that's gold does not glitter. Evolution 66:1-17. Roessler, E., Y. Du, A. Glinka, A. Dutra, C. Niehrs, and M. Muenke. 2000. The genomic structure, chromosome location, and analysis of the human DKK1 head inducer gene as a candidate for holoprosencephaly. Cytogenet Cell Genet 89:220-224. Ross, J. A., J. R. Urton, J. Boland, M. D. Shapiro, and C. L. Peichel. 2009. Turnover of 37 sex chromosomes in the stickleback fishes (gasterosteidae). PLoS Genet 5:e1000391. Rubin, C. J., H. J. Megens, A. Martinez Barrio, K. Maqbool, S. Sayyab, D. Schwochow, C. Wang, O. Carlborg, P. Jern, C. B. Jorgensen, A. L. Archibald, M. Fredholm, M. A. Groenen, and L. Andersson. 2012. Strong signatures of selection in the domestic pig genome. Proc Natl Acad Sci USA 109:19529-19536. Rubin, C. J., M. C. Zody, J. Eriksson, J. R. Meadows, E. Sherwood, M. T. Webster, L. Jiang, M. Ingman, T. Sharpe, S. Ka, F. Hallbook, F. Besnier, O. Carlborg, B. Bed'hom, M. Tixier-Boichard, P. Jensen, P. Siegel, K. Lindblad-Toh, and L. Andersson. 2010. Whole-genome resequencing reveals loci under selection during chicken domestication. Nature 464:587-591. Sahana, G., D. J. de Koning, B. Guldbrandtsen, P. Sorensen, and M. S. Lund. 2006. The efficiency of mapping of quantitative trait loci using cofactor analysis in half-sib design. Genetics, selection, evolution : GSE 38:167-182. Santini, F., L. J. Harmon, G. Carnevale, and M. E. Alfaro. 2009. Did genome duplication drive the origin of teleosts? A comparative study of diversification in ray-finned fishes. BMC Evol Biol 9. Seaton, G. H., J., Grunchec, J-A.; White, I.; Allen, J.; De Koning, D.J.; Wei, W.; Berry, D.; Haley, C., Knott, S. 2006. GridQTL: A grid portal for QTL mapping of compute intensive datasets. Proceedings of the 8th World Congress on Genetics Applied to Livestock Production, Belo Horizonte, Brazil. Shapiro, M. D., M. A. Bell, and D. M. Kingsley. 2006a. 
Parallel genetic origins of pelvic reduction in vertebrates. Proc Natl Acad Sci USA 103:13753-13758. Shapiro, M. D., Z. Kronenberg, C. Li, E. T. Domyan, H. Pan, M. Campbell, H. Tan, C. D. Huff, H. Hu, A. I. Vickrey, S. C. Nielsen, S. A. Stringham, H. Hu, E. Willerslev, M. T. Gilbert, M. Yandell, G. Zhang, and J. Wang. 2013. Genomic diversity and evolution of the head crest in the rock pigeon. Science 339:1063-1067. Shapiro, M. D., M. E. Marks, C. L. Peichel, B. K. Blackman, K. S. Nereng, B. Jonsson, D. Schluter, and D. M. Kingsley. 2004. Genetic and developmental basis of evolutionary pelvic reduction in threespine sticklebacks. Nature 428:717-723. Shapiro, M. D., M. E. Marks, C. L. Peichel, B. K. Blackman, K. S. Nereng, B. Jonsson, D. Schluter, and D. M. Kingsley. 2006b. Corrigendum: Genetic and developmental basis of evolutionary pelvic reduction in threespine sticklebacks. Nature 439. Shapiro, M. D., B. R. Summers, S. Balabhadra, J. T. Aldenhoven, A. L. Miller, C. B. Cunningham, M. A. Bell, and D. M. Kingsley. 2009. The genetic architecture of skeletal convergence and sex determination in ninespine sticklebacks. Curr Biol 38 19:1140-1145. Shikano, T., G. Herczeg, and J. Merila. 2011. Molecular sexing and population genetic inference using a sex-linked microsatellite marker in the nine-spined stickleback (Pungitius pungitius). BMC Res Notes 4:119. Shikano, T., V. N. Laine, G. Herczeg, J. Vilkki, and J. Merila. 2013. Genetic architecture of parallel pelvic reduction in ninespine sticklebacks. G3 3:1833-1842. Smit, A. F. A. and R. Hubley. 2008. Smit, A. F. A., R. Hubley, and P. Green. 1996. RepeatMasker Open-3.0. Spoljaric, M. A. and T. E. Reimchen. 2007. 10,000 years later: evolution of body shape in Haida Gwaii three-spined stickleback. J Fish Biol 70:1484. Stanke, M., M. Diekhans, R. Baertsch, and D. Haussler. 2008. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24:637-644. Stanke, M. and S. Waack. 2003. 
Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19 Suppl 2:ii215-225. Stinchcombe, J. R. and H. E. Hoekstra. 2008. Combining population genomics and quantitative genetics: finding the genes underlying ecologically important traits. Heredity 100:158-170. Thorpe, J. E., M. S. Miles, and D. S. Keay. 1984. Developmental rate, fecundity and egg size in Atlantic salmon, Salmo salar. Aquaculture. 43:289-305. Turner, T. L., E. C. Bourne, E. J. Von Wettberg, T. T. Hu, and S. V. Nuzhdin. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet 42:260-263. Udpa, N., D. Zhou, G. G. Haddad, and V. Bafna. 2011. Tests of selection in pooled case-control data: an empirical study. Front Genet 2:83. Van der Auwera, G. A., M. O. Carneiro, C. Hartl, R. Poplin, G. del Angel, A. Levy- Moonshine, T. Jordan, K. Shakir, D. Roazen, J. Thibault, E. Banks, K. V. Garimella, D. Altshuler, S. Gabriel, and M. A. DePristo. 2013. From FastQ data to high-confidence variant calls: The Genome Analysis Toolkit best practices pipeline. Current Protocols in Bioinformatics 43:11.10.11-11.10.33. Vinson, J. P., D. B. Jaffe, K. O'Neill, E. K. Karlsson, N. Stange-Thomann, S. Anderson, J. P. Mesirov, N. Satoh, Y. Satou, C. Nusbaum, B. Birren, J. E. Galagan, and E. S. Lander. 2005. Assembly of polymorphic genomes: algorithms and application to 39 Ciona savignyi. Genome Res 15:1127-1135. Walker, J. A. 1997. Ecological morphology of lacustrine three-spine stickleback Gasterosteus aculeatus L. (Gasterosteidae) body shape. Biol J Linn Soc 61:3-50. Walker, J. A. and M. A. Bell. 2000. Net evolutionary trajectories of body shape evolution within a microgeographic radiation of threespine sticklebacks. J Zool Lond 252:293-302. Wallace, J. C. and D. Aasjord. 1984. An investigation of the consequences of egg size for the culture of Arctic Charr, Salvelinus alpinus (L.) Journal of Fish Biology 24:427-435. Webb, P. W. 1982. 
Locomotor patterns in the evolution of actinopterygian fishes. Am Zool 22:329-342. Wei, Z., W. Wang, P. Hu, G. J. Lyon, and H. Hakonarson. 2011. SNVer: a statistical tool for variant calling in analysis of pooled or individual next-generation sequencing data. Nucleic Acids Res 39:e132. Weir, B. S. and C. C. Cockerham. 1984. Estimating F-Statistics for the Analysis of Population Structure. Evolution 38:1358-1370. Wootton, R. J. 1976. The Biology of the Sticklebacks. Academic, London. Yang, Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24:1586-1591. Zhou, D., N. Udpa, M. Gersten, D. W. Visk, A. Bashir, J. Xue, K. A. Frazer, J. W. Posakony, S. Subramaniam, V. Bafna, and G. G. Haddad. 2011. Experimental selection of hypoxia-tolerant Drosophila melanogaster. Proc Natl Acad Sci U S A 108:2349-2354. Zhu, Y., A. O. Bergland, J. Gonzalez, and D. A. Petrov. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901. Ziuganov, V. V. and A. A. Zotin. 1995. Pelvic girdle polymorphism and reproductive barriers in the ninespine stickleback Pungitius pungitius (L.) from northwest Russia. Behaviour 132:1095-1105.

Table 1.1. Summary of genomic libraries used for reference sequence

| Library Insert Size | Standard Deviation of Insert Size | Number of Reads | Coverage |
| --- | --- | --- | --- |
| 250 bp | ± 30 bp | 132,407,570 | 33.64x |
| 500 bp | ± 60 bp | 116,127,466 | 29.5x |
| 2400 bp | ± 300 bp | 304,565,738 | 77.37x |

Table 1.2. Summary of additional teleost genome assemblies

| Species | Assembly | Size (bp) | Protein Coding Genes |
| --- | --- | --- | --- |
| Astyanax mexicanus | AstMex102 | 964,248,202 | 23,042 |
| Danio rerio | Zv.9 | 1,505,581,940 | 26,459 |
| Gadus morhua | gadMor1 | 608,029,870 | 20,095 |
| Gasterosteus aculeatus | BROADS1 | 446,627,861 | 20,787 |
| Oreochromis niloticus | Orenil 1.0 | 815,725,529 | 21,437 |
| Oryzias latipes | MEDAKA1 | 700,386,597 | 19,699 |
| Pungitius pungitius | PunPun1 | 441,103,789 | 22,432 |
| Takifugu rubripes | FUGU 4.0 | 393,312,790 | 18,523 |
| Tetraodon nigroviridis | TETRAODON 8.0 | 342,419,788 | 19,602 |

Table 1.3. Summary of Salt River crosses

| Family | Year | Male | Male Pelvic Score | Female | Female Pelvic Score | Offspring (n) |
| --- | --- | --- | --- | --- | --- | --- |
| 2008-01 | 2008 | Male 08-2 | 8 | Female 08-4 | 7 | 45 |
| | 2008 | Male 08-2 | 8 | Female 08-5 | 8 | 53 |
| | 2008 | Male 08-2 | 8 | Female 08-6 | 8 | 31 |
| | 2008 | Male 08-2 | 8 | Female 08-7 | 8 | 49 |
| 2010-01 | 2010 | Male 10-1 | 8 | Female 10-1 | 0 | 21 |
| | 2010 | Male 10-1 | 8 | Female 10-3 | 0 | 35 |
| 2010-03 | 2010 | Male 10-3 | 8 | Female 10-11 | 8 | 38 |
| | 2010 | Male 10-3 | 8 | Female 10-13 | 0 | 24 |
| | 2010 | Male 10-3 | 8 | Female 10-14 | 0 | 27 |
| 2010-05 | 2010 | Male 10-5 | 8 | Female 10-20 | 8 | 15 |
| | 2010 | Male 10-5 | 8 | Female 10-22 | 0 | 24 |
| | 2010 | Male 10-5 | 8 | Female 10-23 | 8 | 19 |

Table 1.4. Summary of samples used in bulked segregant analysis

| Cross Name | n included in complete BSA pool | n included in reduced BSA pool |
| --- | --- | --- |
| 2010-01 | 14 | 16 |
| 2010-03 | 21 | 22 |
| 2010-05 | 16 | 11 |

Table 1.5. Genome metrics

| Metric | Value |
| --- | --- |
| Genome size (bp) | 441,103,789 |
| Coverage | 140.5x |
| Number of contigs | 8784 |
| Contig N50 length (bp) | 122,764 |
| Mean contig length (bp) | 45,262 |
| Number of scaffolds | 7850 |
| Scaffold N50 length (bp) | 302,754 |
| Mean scaffold length (bp) | 56,191 |
| Exonic sequence (bp) | 47,709,142 |
| Intronic sequence (bp) | 171,695,097 |
| Intergenic sequence (bp) | 221,699,550 |
| Number of genes | 22,432 |
| Median gene length (bp) | 5,604 |
| Median exon length (bp) | 132 |
| Median intron length (bp) | 227 |

Table 1.6. Summary of genomic regions identified by QTL mapping

| Linkage Group | Structure | Percent variance explained (families 2008-01, 2010-01, 2010-03, 2010-05) |
| --- | --- | --- |
| 1a | left spine | 9.4, 6.3 |
| 1a | right spine | 11.7 |
| 1b | left spine | 6.9 |
| 1b | left ascending process | 5.7 |
| 1b | right ascending process | 5.4 |
| 1b | left girdle | 5.6 |
| 1b | right girdle | 5.8 |
| 2 | left girdle | 4.6 |
| 3 | left ascending process | 9.6 |
| 3 | left girdle | 4.5, 6.9 |
| 4 | left ascending process | 11.9 |
| 8 | left spine | 6.4 |
| 8 | right spine | 5.9 |
| 8 | right ascending process | 5.9 |
| 8 | left girdle | 4.9 |
| 8 | right girdle | 5.6 |
| 10 | right girdle | 4.2 |
| 12 | left spine | 17.1, 5.7, 5.2 |
| 12 | right spine | 25.1 |
| 12 | right girdle | 5.9, 4.4 |
| 12 | left ascending process | 7.4 |
| 14a | left spine | 5.5 |
| 14a | right ascending process | 9.6 |
| 14a | right girdle | 5.2 |
| 15b | right ascending process | 5.2 |
| 15b | right girdle | 4.1 |
| 16 | right spine | 5.9 |
| 16 | left ascending process | 7.9 |
| 16 | right ascending process | 5.8 |
| 17 | right spine | 4.3 |
| 17 | left girdle | 9.9 |
| 17 | right girdle | 5.2 |
| 18 | left spine | 7.5 |
| 18 | right ascending process | 5.5 |
| 19 | left girdle | 6.9 |

F-test: light pink, p < 0.05; dark pink, p < 0.01; numbers indicate percent variance explained by male genotype as determined by ANOVA.

Table 1.7. Candidate genes of pelvic reduction

| Linkage Group | Scaffold | Gene | Abbreviation |
| --- | --- | --- | --- |
| 3 | scaffold1112 | Neurogenic differentiation factor 6-B | neurod6b |
| 3 | scaffold1112 | Calcium/calmodulin-dependent 3',5'-cyclic nucleotide phosphodiesterase 1C | PDE1C |
| 8 | scaffold80 | EF-hand calcium-binding domain-containing protein 7 | EFCAB7 |
| 8 | scaffold80 | Phosphoglucomutase-1 | Pgm1 |
| 8 | scaffold80 | Tyrosine-protein kinase transmembrane receptor ROR1 | Ror1 |
| 8 | scaffold140 | DNA-dependent protein kinase catalytic subunit | Prkdc |
| 8 | scaffold140 | Intestinal-type alkaline phosphatase 1 | Alpi |
| 12 | scaffold14 | Tyrosine-protein phosphatase nonreceptor type 11 | PTPN11 |
| 12 | scaffold14 | Plexin-A1 | PLXNA1 |
| 12 | scaffold14 | Carbohydrate sulfotransferase 11 | Chst11 |
| 12 | scaffold14 | RING finger protein 223 | RNF223 |
| 12 | scaffold14 | Agrin | AGRN |

Table 1.8. Pelvic phenotypes by sex in wild-caught fish

| Population and pelvic phenotype | Female | Male |
| --- | --- | --- |
| SALT RIVER - complete* | 38 | 60 |
| SALT RIVER - reduced* | 42 | 18 |
| PINE LAKE - complete | 43 | 54 |
| PINE LAKE - reduced | 45 | 37 |

Fisher's exact test (*P < 0.01)

Figure 1.1. Collection sites and variation in phenotype in Salt River and Pine Lake.
A) Locations of sites where fish were collected for both whole-genome resequencing and QTL mapping crosses. B) Variation in pelvic phenotype (top) and whole body shape (bottom) seen in Salt River and C) Pine Lake (right). In both of these populations, there are individuals with pelvic phenotypes ranging from complete (top, left) to absent (bottom, right).

[Figure 1.2 plot: panels PINE LAKE and SALT RIVER, each split into Complete and Reduced groups; y-axis from 0.00 to 1.00.]

Figure 1.2. STRUCTURE analysis of Salt River and Pine Lake by pelvic phenotype. STRUCTURE plot based on genotypes at 12 unlinked microsatellite markers, showing that Pine Lake and Salt River are distinct populations but do not show any substructure based on pelvic phenotype within a population.

Figure 1.3. Summary of whole-genome scans and QTL mapping. Likelihood ratio test values averaged in 10-kb sliding windows (2-kb step) plotted across the genome. Putative linkage groups based on synteny with threespine sticklebacks are pictured from left to right. Any window with an LRT score in the top 0.1% of all windows is indicated by a red point. Quantitative trait loci identified by mapping are indicated by colored vertical lines. A) LRT values and QTL results in Salt River. Colors of vertical lines indicate how many families a given QTL was identified in (green = 1 family, orange = 2 families, black = 3 families). B) LRT values and QTL results in Salt River when sexes are analyzed separately. Colors of vertical lines indicate which sex a given QTL was identified in (pink = females only, blue = males only, purple = both sexes). C) LRT values in Pine Lake show broadly similar patterns to those seen in Salt River.

[Figure 1.4 plot: panels Pine Lake and Salt River; x-axis Chromosome.]

Figure 1.4. Comparison of LRT and FST from Salt River and Pine Lake.

[Figure 1.5 plot: Females and Males panels; y-axis from 0 to 8; p-value < 0.001 (ANOVA).]

Figure 1.5. Mean pelvic score in Salt River crosses.
Mean pelvic score (±SE) separated by family and sex. Females (red) have a significantly lower pelvic score than males (blue). This trend holds when all fish are grouped together (ANOVA; P < 0.001) as well as within each family (Family 1, P < 0.01; Family 2, P < 0.01; Family 3, P < 0.001; Family 4, P < 0.001).

[Figure 1.6 panels: A) Salt River - 2008; B) Salt River - 2010 - male 1; C) Salt River - 2010 - male 3; D) Salt River - 2010 - male 5. Axes: female and male pelvic score (0 to 8) against count. p-value < 0.001 (ANOVA).]

Figure 1.6. Histogram of pelvic scores in Salt River half-sibling families. Female scores are depicted in red and male scores in blue. The mean pelvic phenotype for each group is denoted by a vertical dashed line.

Figure 1.7. Nucleotide diversity (pi) in Pine Lake and Salt River populations. Values of pi were calculated in 10-kb sliding windows with 2-kb steps for complete (top) and reduced (center) pools separately, as well as for the total population (bottom).

[Figure 1.7 panels: A) PINE LAKE and B) SALT RIVER, each with complete, reduced, and total tracks; x-axis Chromosome (1 to 21, then unordered); y-axis from 0.000 to 0.025.]

[Figure 1.8 plot: x-axis Chromosome (1 to 21, then unordered).]

Figure 1.8. LRT and FST values in interpopulation comparisons.
LRT and FST values in a comparison of Salt River and Pine Lake (averaged in 25-kb sliding windows with 2-kb steps) plotted across the genome. Putative linkage groups based on synteny with threespine sticklebacks are pictured from left to right. Any window with an LRT score in the top 0.1% of all windows is indicated by a red point.
CHAPTER 2
DIVERGENCE, CONVERGENCE, AND THE ANCESTRY OF FERAL POPULATIONS IN THE DOMESTIC ROCK PIGEON
Reprinted from Curr. Biol., 22, Stringham, S. et al., Divergence, convergence, and the ancestry of feral populations in the domestic rock pigeon, 1-7, Copyright (2012), with permission from Elsevier.
Please cite this article in press as: Stringham et al., Divergence, Convergence, and the Ancestry of Feral Populations in the Domestic Rock Pigeon, Current Biology (2012), doi:10.1016/j.cub.2011.12.045
Current Biology 22, 1-7, February 21, 2012 ©2012 Elsevier Ltd All rights reserved DOI 10.1016/j.cub.2011.12.045
Report
Divergence, Convergence, and the Ancestry of Feral Populations in the Domestic Rock Pigeon
Sydney A. Stringham,1,3 Elisabeth E. Mulroy,1,3 Jinchuan Xing,2 David Record,1 Michael W. Guernsey,1 Jaclyn T. Aldenhoven,1 Edward J. Osborne,1 and Michael D. Shapiro1,*
1Department of Biology
2Department of Human Genetics
University of Utah, Salt Lake City, UT 84112, USA
Summary
Domestic pigeons are spectacularly diverse and exhibit variation in more traits than any other bird species [1]. In The Origin of Species, Charles Darwin repeatedly calls attention to the striking variation among domestic pigeon breeds (generated by thousands of years of artificial selection on a single species by human breeders) as a model for the process of natural divergence among wild populations and species [2]. Darwin proposed a morphology-based classification of domestic pigeon breeds [3], but the relationships among major groups of breeds and their geographic origins remain poorly understood [4, 5].
We used a large, geographically diverse sample of 361 individuals from 70 domestic pigeon breeds and two free-living populations to determine genetic relationships within this species. We found unexpected relationships among phenotypically divergent breeds as well as convergent evolution of derived traits among several breed groups. Our findings also illuminate the geographic origins of breed groups in India and the Middle East and suggest that racing breeds have made substantial contributions to feral pigeon populations.
[Footnotes: 3 These authors contributed equally to this work. *Correspondence: shapiro@biology.utah.edu]
Results and Discussion
Genetic Structure of Domestic Pigeon Breeds
Charles Darwin was a pigeon aficionado and relied heavily on the dramatic results of artificial selection in domestic pigeons to communicate his theory of natural selection in wild populations and species [2]. "Believing that it is always best to study some special group, I have, after deliberation, taken up domestic pigeons," he wrote in The Origin of Species [2] (p. 20). Darwin noted that unique pigeon breeds are so distinct that, based on morphology alone, a taxonomist might be tempted to classify them as completely different genera [3], yet he also concluded that all breeds are simply variants within a single species, the rock pigeon Columba livia. Pigeons were probably domesticated in the Mediterranean region at least 3,000-5,000 years ago, and possibly even earlier as a food source [3, 6, 7]. Their remarkable diversity can be viewed as the outcome of a massive selection experiment. Breeds show dramatic variation in craniofacial structures, color and pattern of plumage pigmentation, feather placement and structure, number and size of axial and appendicular skeletal elements, vocalizations, flight behaviors, and many other traits [1-5]. Furthermore, many of these traits are present in multiple breeds.
Today, a large and dedicated pigeon hobbyist community counts thousands of breeders among its ranks worldwide. These hobbyists are the caretakers of a valuable, but largely untapped, reservoir of biological diversity. Here, as an initial step in developing the pigeon as a model for evolutionary genetics and developmental biology, we address two fundamental questions about the evolution of derived traits in this species. First, what are the genetic relationships among modern pigeon breeds? And second, does genetic evidence support the shared ancestry of breeds with similar traits, or did some traits evolve repeatedly in genetically unrelated breeds? To address these questions, we studied the genetic structure and phylogenetic relationships among a large sample of domestic pigeon breeds. Our primary goal was to examine relationships among traditional breed groups, to which breeds are assigned based on phenotypic similarities and/or geographic regions of recent breed development (Figure 1) [4, 5, 8]. First, we used 32 unlinked microsatellite markers to genotype 361 individual birds from 70 domestic breeds and two free-living populations. We next used the Bayesian clustering method in STRUCTURE software [9] to detect genetically similar individuals within the sample (Figure 1; see also Figure S1 available online). When two genetic clusters were assumed (K = 2, where K is the number of putative clusters of genetically similar individuals; Figure 1), the first cluster combined several breed groups with dramatically different morphologies. Principal members of this grouping included the pouters and croppers, which have a greatly enlarged, inflatable crop (an outpocketing of the esophagus); the fantails, which have supernumerary and elevated tail feathers; and mane pigeons, breeds with unusual feather manes or hoods about the head (Figure 1).
The second ancestral cluster consisted mainly of the tumblers (including rollers and highflyers), the most breed-rich of the major groups (at least 80 breeds recognized in the USA) [4, 8]. Tumblers are generally small bodied and were originally bred as performance flyers, with many breeds still capable of performing backward somersaults in flight. In most modern tumbler breeds, however, selection is most intense on morphological traits such as beak size and plumage. Also included in this cluster are the owl and the wattle breeds (wattles are skin thickenings emanating from the beak). These two breed groups contrast dramatically in several key traits: owls are typically diminutive in body size, have a pronounced breast or neck frill, and have among the smallest beaks of all breeds, whereas the wattle breeds (English carrier, scandaroon, and dragoon in our analysis) are larger bodied, lack a frill, and have among the most elaborated beak skeletons of all domestic pigeons [4, 5]. The homers (homing pigeons and their relatives) are included in the second cluster as well. The carrier, cumulet, and owl breeds, all members of this cluster, contributed to the modern homing pigeon during its development in England and Belgium approximately 200 years ago [5]. Consistent with this recent admixture, the owls and several homer breeds continue to share partial membership in the same cluster at K = 4 and beyond, and the cumulet shares similarity with the homers and wattles at K = 7.
[Figure 1 group labels: Modena & European free-living; Wattles & Homers; Voice, Utility, & North American ferals.]
Figure 1. Genetic Structure of the Rock Pigeon (Columba livia)
Results from STRUCTURE analysis showing coefficients of genetic cluster membership of 361 individuals representing 70 domestic breeds and two free-living populations (European and North American, at the far left and far right of the plots, respectively) of rock pigeon. Each vertical line represents an individual bird, and proportion of membership in a genetic cluster is represented by different colors. Thin black lines separate breeds. At K = 2, the owls, wattles, and tumblers are the predominant members of one cluster (blue), while other breeds comprise another cluster (orange). At K = 3, the pouters and fantails (yellow) separate from the toys and other breeds, and at K = 5, the fantails separate from the pouters. Pouters and fantails also share genetic similarity with the recently derived king, a breed with a complex hybrid background that probably includes contributions from Indian breeds [5]. At K = 5, fantails are also united with the Modena, an ancient Italian breed, and a free-living European population. The latter two form a discrete cluster at K = 9. At K = 10 and greater (Figure S1), some of the breed groups are assigned to different genetic clusters. This suggests that a number of assumed clusters beyond K = 9 reveals the structure of individual breeds, rather than lending additional insights about genetically similar breed groups. Top row of photos, left to right: Modena, English trumpeter, fantail, scandaroon, king, Cauchois. Bottom row: Jacobin, English pouter, Oriental frill, West of England tumbler, Zitterhals (Stargard shaker). Photos are courtesy of Thomas Hellmann and are not to scale. See Figure S1 for results from K = 2-25 and Tables S1 and S2 for breed and marker information, respectively.
Numbers of clusters beyond K = 9 reveal the structure of individual breeds, rather than lending additional insights about breed groups (Figure S1). Notably, although allelic similarity is potentially indicative of shared ancestry, this analysis does not explicitly generate a phylogenetic hypothesis. Moreover, an alternative explanation for clustering is that large effective population sizes might result in an abundance of shared alleles. We next used multilocus genotype data from a subset of breeds (those with >50% membership in a cluster at K = 9) to calculate genetic distances among breeds and to generate a neighbor-joining tree (Figure 2). Among the major groups, only subsets of the pouter, fantail, mane, tumbler, Modena and free-living European, and owl branches of the tree have strong statistical support (Figure 2).
Figure 2. Consensus Neighbor-Joining Tree of Forty Domestic Breeds and One Free-Living Population of Rock Pigeon
The tree here was constructed using pairwise Cavalli-Sforza chord genetic distances and includes the subset of breeds with >50% membership in one genetic cluster at K = 9. Branch colors match cluster colors in Figure 1, except all tumbler breeds are represented with light blue for clarity. A notable incongruence between the STRUCTURE analysis and the tree is the grouping of the English pouter with a tumbler rather than with the other pouters; however, this grouping is not well supported. Percent bootstrap support on branches (>50%) is based on 1,000 iterations, and branch lengths are proportional to bootstrap values.
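The distance step behind the neighbor-joining tree can be illustrated with a small sketch. It assumes one commonly cited form of the Cavalli-Sforza and Edwards chord distance, D = (2 / (pi r)) * sum over r loci of sqrt(2 (1 - sum over alleles of sqrt(x y))), where x and y are the allele frequencies in the two breeds. The function name, data layout, and toy frequencies are hypothetical; the text does not state which software or exact formula variant was used, so this is not the authors' pipeline.

```python
import math

def chord_distance(pop_a, pop_b):
    """Chord distance between two populations, averaged over loci.

    pop_a, pop_b: lists with one dict per locus mapping allele -> frequency.
    Both lists must describe the same loci in the same order (assumption).
    """
    total = 0.0
    for freqs_a, freqs_b in zip(pop_a, pop_b):
        alleles = set(freqs_a) | set(freqs_b)
        # Sum of sqrt(x * y) over alleles; equals 1 for identical frequencies.
        overlap = sum(
            math.sqrt(freqs_a.get(a, 0.0) * freqs_b.get(a, 0.0)) for a in alleles
        )
        overlap = min(overlap, 1.0)  # guard against floating-point overshoot
        total += math.sqrt(2.0 * (1.0 - overlap))
    return 2.0 * total / (math.pi * len(pop_a))

# Toy data: two microsatellite loci, made-up allele frequencies.
breed_1 = [{"150": 0.5, "154": 0.5}, {"200": 1.0}]
breed_2 = [{"150": 0.5, "154": 0.5}, {"204": 1.0}]
print(chord_distance(breed_1, breed_2))
```

A matrix of such pairwise values is the kind of input a neighbor-joining routine takes. Bootstrap support, as in Figure 2, would come from resampling loci and rebuilding the tree, which this sketch does not attempt.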
Nevertheless, at the breed level we observed substantial genetic differentiation, suggesting that in many cases, hybridization among breeds has been limited (mean pairwise FST = 0.204 for all breeds, maximum FST = 0.446; potentially more reliable differentiation estimates considering the modest sample sizes for some breeds [10]: mean Dest = 0.156, maximum Dest = 0.421; Tables S4 and S5). As a comparison, mean pairwise differentiation among African and Eurasian human populations with historically limited gene flow is lower (mean FST = 0.106, maximum FST = 0.240 for the comparison between Pygmy and Chinese populations using a dense genome-wide SNP set [11]). Taking these results together, our analysis shows both expected and unexpected genetic affinities among breeds. Like other domesticated animals such as dogs and chickens, pigeons probably have a reticular rather than hierarchical evolutionary history, which is reflected in the complex genetic structure of many breeds and a star-shaped phylogeny. These findings probably result from hybridization that has occurred throughout the domestication history of the pigeon; this practice continues among some modern breeders as well, often with the goal of transferring a new color into an established breed, or "improving" an existing trait. Unlike the stringent regulations for registering purebred dogs, in which modern breeds are effectively closed breeding populations separated by large genetic distances [12, 13], no barriers exist to mixed ancestry or parentage of pigeons (average FST = 0.33 between dog breeds [12] compared to 0.24 for pigeons).
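To make the two kinds of differentiation statistic quoted above more concrete, the following sketch computes a heterozygosity-based fixation index (Nei's GST, used here as a stand-in for FST) and Jost's D [10] for a single locus. It assumes equally weighted populations and known allele frequencies, and it omits the sample-size corrections that distinguish the estimator Dest from D. The function names and toy frequencies are hypothetical, and the estimators actually used for Tables S4 and S5 are not specified in the text.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs.values())

def differentiation(pops):
    """Return (GST, Jost's D) for one locus.

    pops: list of dicts mapping allele -> frequency, one dict per population.
    Populations are weighted equally (a simplifying assumption).
    """
    n = len(pops)
    alleles = set().union(*pops)
    # Mean within-population heterozygosity.
    h_s = sum(expected_heterozygosity(p) for p in pops) / n
    # Heterozygosity of the pooled (averaged) allele frequencies.
    pooled = {a: sum(p.get(a, 0.0) for p in pops) / n for a in alleles}
    h_t = expected_heterozygosity(pooled)
    gst = (h_t - h_s) / h_t if h_t > 0 else 0.0
    jost_d = ((h_t - h_s) / (1.0 - h_s)) * (n / (n - 1)) if h_s < 1 else 0.0
    return gst, jost_d

# Toy case: two populations that share no alleles at a variable locus.
pop_1 = {"150": 0.5, "154": 0.5}
pop_2 = {"158": 0.5, "162": 0.5}
print(differentiation([pop_1, pop_2]))
```

In this toy case the populations are completely differentiated, yet GST stays well below 1 because within-population diversity is high, while D reaches 1. This is the behavior that motivates reporting Dest alongside FST for highly variable markers such as microsatellites.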
On the other hand, little genetic variation divides dog breeds into subgroups [13], and like our tree (Figure 2), neighbor-joining trees of dogs show limited structuring of the internal branches [12, 13].
Convergent Evolution of Traits
Darwin classified 32 pigeon breeds into four major groups based primarily on morphological traits, especially beak size (Figure 3A). We repeated our STRUCTURE analysis with 14 breeds from Darwin's study that were available to us and found that his morphological classification is broadly congruent with our genetic results (Figure 3B). Beak size is only one of many traits that pigeon breeders have selected over the past several centuries, or in some cases millennia. Feathered feet, head crests, and a multitude of color variants appear in many lineages [8] and must have evolved more than once (Figure 4). Together, these findings suggest that traits do often, but not always, track the ancestry of breeds. This theme of repeated evolution is widespread in genetic studies of other natural and domesticated species as well [14-17].
Geographic Origins of Breeds
Modern breeds are frequently described as having origins in England, Germany, Belgium, or elsewhere in Europe, but their progenitors were probably brought there from afar by traders or colonialists [3-5, 18, 19]. Although we may never definitively know the sites of pigeon domestication, genetic data combined with historical records may provide new clues about the geographic origins of some of the major breed groups.
Most historical accounts trace the origins of the wattle breeds, owls, and tumblers to the Middle and Near East hundreds of years ago, with ancient breeds transported to Europe and India for further development by hybridization or selection [3, 5, 19-21]. Our genetic analyses are consistent with this common geographic origin: these three groups share substantial membership in the same genetic cluster at K = 2-3, and two of the three wattle breeds (English carrier and dragoon) retain high membership coefficients in the tumbler cluster through K = 5 (Figure 1). The fantail breeds probably originated in India and have undergone less outcrossing than many other breeds [5]. In our STRUCTURE analysis, the fantail (and the Indian fantail to a lesser extent) shows a surprising affinity with the pouters at K = 2-3, and these two groups share a major branch on the neighbor-joining tree (Figures 1 and 2); these two groups are among the most morphologically extreme of all domestic pigeons, and among the most different from each other. European breeders have developed pouters for several hundred years [22, 23], and Dutch traders might have originally brought them to Europe from India [5]. Together, historical accounts and genetic similarity between fantails and pouters support the hypothesis of common geographic origin in India.
Ancestry of Feral Pigeon Populations
Domestic rock pigeons were first brought to North America approximately 400 years ago, and feral populations were probably established shortly thereafter [24, 25]. Likewise, some Eurasian and North African feral populations are probably nearly as old as the most ancient domestication events. In addition to the domestic breeds in our study, we also included a feral pigeon population (Salt Lake City, Utah). Escaped individuals from nearly any domestic breed have the potential to contribute to the feral gene pool, and feral birds showed highly heterogeneous membership across clusters at most values of K (Figure 1).
However, we expected that the racing homer would be a major contributor to the feral gene pool. Pigeon racing is an enormously popular and high-stakes hobby worldwide. Although many birds in homing competitions are elite racers that reliably navigate hundreds of miles to their home lofts, some breeders report that up to 20% of their birds that start a race do not return. As predicted, pairwise Dest for the racing homer to feral comparison was among the lowest 0.1% of all pairwise comparisons (Dest = 0.006), and pairwise FST was the lowest for any pairwise comparison (FST = 0.049). Therefore, feral pigeons and racing homers show very little genetic differentiation, and wayward racing homers probably make a substantial contribution to the genetic profile of this local feral population. We also included samples of free-living rock pigeons (the existence of "pure" wild populations uncontaminated by domestics or ferals is questionable [26]) from Scotland to test for genetic similarities with domestic breeds and with our North American feral sample. Consistent with previous studies [24, 27], European and North American free-living populations are highly differentiated (Dest = 0.162). The European sample groups with the Modena, a former racing breed that was developed in Italy up to 2,000 years ago [5] (Figures 1 and 2). This suggests either that Modenas were developed from European free-living populations or that, as in North America, wayward racers contributed to the local feral population, perhaps for centuries. Studies of additional feral populations will reveal whether strong affinities with racing breeds occur locally and sporadically or, as we suspect, almost everywhere.
The Domestic Pigeon as a Model for Avian Genetics and Diversity
Darwin enthusiastically promoted domestic pigeons as a proxy for understanding natural selection in wild populations and species, and pigeons thus hold a unique station in the history of evolutionary biology.
More recently, domesticated animals have emerged as important models for rapid evolutionary change [28]. Feathered feet, head ornamentation, skeletal differences, plumage color variation, and other traits prized by breeders offer numerous opportunities to examine the genetic and developmental bases of morphological novelty in birds. These and other traits evolved repeatedly in many breeds, and a challenge arising from this study is to determine whether this distribution of traits resulted from selection on standing variation (either by hybridization between breeds or repeated selection on variants in wild populations), from de novo mutation in independent lineages, or both. In the first case, we would expect certain regions of the pigeon genome to share histories and haplotypes that reflect the transfer of valued traits between breeds. This hypothesis will be testable when we have more detailed information about genomic diversity in this species. Pigeons are also easily bred in the lab, and morphologically distinct breeds are interfertile [2, 3, 29]. Therefore, hybrid crosses should be a fruitful method to map the genetic architecture of derived traits, many of which are known to have a relatively simple genetic basis [4, 29]. The extreme range of variation in domestic pigeons mirrors, if not exceeds, the diversity among wild species of columbids (pigeons and doves) and other birds. Domestic pigeons and wild bird species vary in many of the same traits, so domestic pigeons provide an entry point to the genetic basis of avian evolutionary diversity in general [1, 30]. Changes in the same genes, and even in some cases the same mutations, have recently been shown to underlie similar phenotypes in both wild and domesticated populations [31, 32]. The genetic history of pigeons is a critical framework for the analysis of the genetic control of many novel traits in this fascinating avian species.
[Figure 3 panels: (A) Darwin's diagram, "COLUMBA LIVIA or ROCK-PIGEON", Groups I-IV; (B) cluster membership plots at K = 2, 3, 4, and 5.]
Figure 3. Comparison of Darwin's Morphology-Based Classification and Genetic Structure Analysis of Domestic Pigeon Breeds
(A) Darwin classified 32 breeds into four groups: (I) the pouters and croppers, which have enlarged crops (see also Figures 1 and 4); (II) wattle breeds, many of which have elaborated beaks, and the large-bodied runts; (III) an "artificial" grouping diagnosed by a relatively short beak; and (IV) breeds that resemble the ancestral rock pigeon "in all important points of structure, especially in the beak" [3] (p. 154). Image reproduced with permission from John van Wyhe ed. 2002, The Complete Work of Charles Darwin Online (http://darwin-online.org.uk/).
(B) Mean coefficients of genetic cluster membership for 14 domestic breeds represented in Darwin's classification and our genetic analysis. When two clusters are assumed (K = 2), fantails are separated from all other breeds. At K = 3, the breeds in Darwin's group IV and the African owl (group II) share a high coefficient of membership in a new cluster. At K = 4, the African owl, laugher, and (to a lesser extent) English pouter share membership in a new cluster that includes members of three different morphological groups. At K = 5, the English pouter and Jacobin form a cluster. Although some genetic clusters span more than one morphological group, others are consistent within a group. For example, the wattle breeds (group II), tumblers (group III), and most of group IV remain united with breeds of similar morphology at K = 2-5. Taken together, these results confirm that morphology is a good general predictor of genetic similarity in domestic pigeons, yet they also show that breeds that share allelic similarity can be morphologically distinct. Darwin, too, recognized that breeds united in form were not necessarily united in ancestry and, conversely, that anatomically dissimilar breeds might be related. For example, he classified the short-beaked barb (not in our genetic data set) with the long-beaked breeds of group II.
[Figure 4 column labels: beak size, body mass, enlarged crop, feathered feet, head crest.]
Figure 4. Distribution of Several Derived Traits across Groups of Domestic Pigeons
The phylogenetic tree in Figure 2 was converted to a cladogram format with equal branch lengths (far left). For the beak size column, "+" indicates a substantial increase in size relative to the ancestral condition, and "O" indicates a decrease [4, 8]. For body mass, "+" indicates breeds with a maximum over 550 g, and "O" indicates those under 340 g [4, 8]. Although a 4-fold difference in body mass is depicted here, extremes in body mass among all known breeds differ by more than an order of magnitude. For crop, feathered feet, and head crest, "+" indicates fixed or variable presence of the trait (substantial departure from the ancestral condition [4, 8]). All traits shown were selected in multiple groups except an enlarged crop, which is confined to the pouters and croppers. A possible exception is the Cauchois (not included in the tree; see Figure 1), a non-pouter breed with an enlarged and inflatable crop, thought to have been developed centuries ago from a cross between a pouter and large-bodied Mondain breed [5, 33]. Our STRUCTURE analysis supports this hypothesis, with the Cauchois sharing 37.8%-89.7% membership in the genetic cluster containing the pouters at K = 2-9 (Figure 1). Breeds shown (clockwise from upper left) are African owl*, scandaroon, Norwich cropper, old German owl, West of England tumbler*, white Carneau, and Budapest short-face tumbler. Scale bars represent 10 cm. *Photos courtesy of Thomas Hellmann.
Accession Numbers
The microsatellite markers and sequences reported in this paper have been deposited at GenBank with the accession numbers GF111523-GF111539.
Supplemental Information
Supplemental Information includes one figure, four tables, and Supplemental Experimental Procedures and can be found with this article online at doi:10.1016/j.cub.2011.12.045.
Acknowledgments
We thank Kyle Christensen and members of the Utah Pigeon Club, National Pigeon Association, and Bund Deutscher Rassegeflügelzüchter for their spirited collaboration; Elena Boer, Terry Dial, Jennifer Koop, Matt Miller, and Jessica Waite for collection assistance; Jon Seger, Kyle Christensen, and Eric Domyan for comments on drafts of the manuscript; and Thomas Hellmann for photos used in Figures 1 and 4.
Animal protocols were approved by the University of Utah Institutional Animal Care and Use Committee (protocol 09-04015). This work was supported by National Institutes of Health (NIH) grant T32GM007464 (S.A.S. and E.J.O.), National Science Foundation grant DGE0841233 (S.A.S.), the University of Utah BioURP and UROP programs (E.E.M. and M.W.G.), NIH/National Human Genome Research Institute grant K99HG005846 (J.X.), a Burroughs Wellcome Fund Career Award in the Biomedical Sciences (M.D.S.), and a gift from Onorio Catenacci.
Received: September 2, 2011 Revised: December 19, 2011 Accepted: December 19, 2011 Published online: January 19, 2012
References
1. Price, T.D. (2002). Domesticated birds as a model for the genetics of speciation by sexual selection. Genetica 116, 311-327.
2. Darwin, C. (1859). On the Origin of Species by Means of Natural Selection (London: John Murray).
3. Darwin, C.R. (1868). The Variation of Animals and Plants under Domestication, Volume 1 (London: John Murray).
4. Levi, W.M. (1965). Encyclopedia of Pigeon Breeds (Sumter, SC: Levi Publishing).
5. Levi, W.M. (1986). The Pigeon, Second Revised Edition (Sumter, SC: Levi Publishing).
6. Sossinka, R. (1982). Domestication in birds. In Avian Biology, Volume 6, D.S. Farner, A.S. King, and K.C. Parkes, eds. (London: Academic Press), pp. 373-403.
7. Driscoll, C.A., Macdonald, D.W., and O'Brien, S.J. (2009). From wild animals to domestic pets, an evolutionary view of domestication. Proc. Natl. Acad. Sci. USA 106 (Suppl 1), 9971-9978.
8. National Pigeon Association. (2010). National Pigeon Association Book of Standards (Goodlettsville, TN: Purebred Pigeon Publishing).
9. Pritchard, J.K., Stephens, M., and Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics 155, 945-959.
10. Jost, L. (2008). GST and its relatives do not measure differentiation. Mol. Ecol. 17, 4015-4026.
11. Xing, J., Watkins, W.S., Witherspoon, D.J., Zhang, Y., Guthery, S.L., Thara, R., Mowry, B.J., Bulayeva, K., Weiss, R.B., and Jorde, L.B. (2009). Fine-scaled human genetic structure revealed by SNP microarrays. Genome Res. 19, 815-825.
12. Parker, H.G., Kim, L.V., Sutter, N.B., Carlson, S., Lorentzen, T.D., Malek, T.B., Johnson, G.S., DeFrance, H.B., Ostrander, E.A., and Kruglyak, L. (2004). Genetic structure of the purebred domestic dog. Science 304, 1160-1164.
13. Vaysse, A., Ratnakumar, A., Derrien, T., Axelsson, E., Rosengren Pielberg, G., Sigurdsson, S., Fall, T., Seppala, E.H., Hansen, M.S., Lawley, C.T., et al.; LUPA Consortium. (2011). Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping. PLoS Genet. 7, e1002316.
14. Aldenhoven, J.T., Miller, |
| Reference URL | https://collections.lib.utah.edu/ark:/87278/s6mp8bhw |



