Evolutionary and Developmental Genetics

Penn State University

Research

====> Notice: software is available for simulating evolutionary genetics and genetic epidemiology (see the ForSim section, below)

Also, our research interests are easy to see by checking our blog, named The Mermaid's Tale after our recent book. Comments and discussions take place there that you may find interesting (we hope so, at least!).

Our lab is interested in the evolution of complex traits, evolutionary principles generally, and issues in the nature of our knowledge in the life sciences, including bioethical and societal aspects. We are also interested in human variation, how it got here, how much there is, and what it means in regard to complex traits. Our work is largely in genetics and involves studies of human polymorphisms and the amount of variation in genes related to human phenotypes, including disease-related traits, but with a concentration on the evolutionary processes that generated it, and issues involved in finding it. We take bioinformatic and simulation approaches to this subject. Programming and data analysis skills are important, and there are and will be great job opportunities in this area for the foreseeable future.

A major thrust of my work is in developmental genetics and the evolution of developmental processes that control complex traits. We are currently working on the genetic basis of morphological traits that have been important in vertebrate evolution and constitute the fossil record, with particular attention to the patterning of the vertebrate dentition, and craniofacial morphology and its evolution. Our current model systems include collaborative work with a baboon genealogy in San Antonio (Southwest Foundation for Biomedical Research), and various experimental mouse models. We have extensive collaborations with Drs Richtsmeier, Jablonski, Ryan, Walker, and Buchanan in our Department, with Jeff Rogers at Baylor University in Houston and the Southwest Foundation for Biomedical Research in San Antonio, and with Jim Cheverud at Washington University, and other colleagues and consultants.

Tying these areas together is an interest in an over-arching view of evolution as we know it from the science itself, as well as the way that the post-Darwin world has adopted Darwinian concepts. These concepts connect genes, history, origins, and the process of change as well as biological development. But the same general ideas of competitive natural selection are extended to many other areas of science and philosophy, including physics, the assumed nature of extra-terrestrial life, and human societies generally (see Bioethics, below). Often this is done uncritically or even based on misunderstandings about what we know about life and evolution. That can lead to misunderstandings both within biology itself and in the general public. Darwinian ideas have also often been upsetting to some religions, as we know, and we're in a time when that sensitivity is high in this country.

I collaborate with Joe Terwilliger and Joe Lee, both at Columbia University in New York. Joe T also has developed many tools for genetic inference in the disease context. His slightly wacky web page (linkage.cpmc.columbia.edu/index2.html) shows why that is interesting to do! We co-teach a mini-course called Logical Reasoning in Human Genetics in which we discuss the amount and evolutionary origin of human variation in the context of human disease, and approaches to find and characterize that variation. See my Courses page or Joe's web page for more details and the next scheduled offering.

A major collaboration with Joe, and with Brian Lambert, a research scientist/programmer here, is a broad-purpose forward evolutionary simulation program package for examining how evolution works and the issues we face in inferring the underlying genetic architecture of complex traits. In real life, we can never be sure we know the whole truth, but with simulated data we can. Our program, called ForSim, is being developed for this purpose. More details are given below.

You can generally find our published papers by searching on PubMed or this lab web page.

::::Please note that while I try to keep this webpage reasonably current, I can't guarantee its precise accuracy. Things change as grants, interests, and personnel come and go. I do my best to keep it updated, however.

FOR QUESTIONS ABOUT MY RESEARCH, PLEASE CONTACT ME: kenweiss@psu.edu

Development and patterning of the mammalian dentition

Human variation and disease

Craniofacial development and evolution

ForSim: Forward Evolutionary Simulation

Bioethics and Biocosmology


Development and patterning of the mammalian dentition

Recombinant Inbred Mice

Evolution of Complex Traits

We are searching for candidate genes involved in tooth development by comparing morphological differences in the teeth of inbred strains of mice. The photographs on the left show a comparison of the lower second molars of two strains of inbred mice (Balb/cJ (left) and SJL/J (right)). The SJL/J mice lack the fifth cusp on their lower second molars. We are doing gene expression studies to try to find genes in candidate intervals that differ between the parental strains.

 

Another major project here is being led mainly by Kazuhiko Kawasaki in my lab. See the Publications tab on my web page for publication references, of which there are several.
Gene duplication creates evolutionary novelties by using older tools in new ways. We have identified evidence that the genes for enamel matrix proteins (EMPs), milk caseins, and salivary proteins comprise a family called the SCPP genes (secretory calcium phospho-proteins) that are descended from a common ancestral gene called SPARCL1 by tandem gene duplication. These genes remain linked, except for one EMP gene, amelogenin. The SCPP genes show common structural features and are expressed in ontogenetically similar tissues. Many of these genes encode secretory Ca-binding phosphoproteins, which regulate the Ca-phosphate concentration of the extracellular environment. By exploiting this fundamental property, these genes have subsequently diversified to serve specialized adaptive functions. Casein makes milk supersaturated with Ca-phosphate, which was critical to the successful mammalian divergence. The innovation of enamel led to mineralized feeding apparatus, which enabled active predation of early vertebrates. The EMP genes comprise a subfamily not identified previously. A set of genes for dentine and bone extracellular matrix proteins constitutes an additional cluster distal to the EMP gene cluster, with similar structural features to EMP genes. The duplication and diversification of the primordial genes for enamel/dentine/bone extracellular matrix may have been important in core vertebrate feeding adaptations, the mineralized skeleton, the evolution of saliva, and, eventually, lactation. The order of duplication events may help delineate early events in mineralized skeletal formation, which is a major characteristic of vertebrates. We are also exploring the similar evolutionary history of other gene families involved in biomineralization, in particular genes related to collagen.

The SCPP family of genes provides an example of phenogenetic drift, in which a trait (the mineralized nature of teeth, for example) is retained by natural selection while its genetic basis changes. We are currently concentrating on early vertebrate evolution, including sharks, agnathans (lamprey), and amphibians. My Publications page lists other SCPP papers by Kazz and myself that are out or forthcoming--search also in PubMed under Kawasaki K as author for papers he has authored on his own on this interesting subject.

Craniofacial growth and evolution

In collaboration with Joan Richtsmeier, Nina Jablonski, Tim Ryan, Anne Buchanan, and Alan Walker in our Department, Jim Cheverud at Washington University, and Jeffrey Rogers at the Southwest Foundation for Biomedical Research in San Antonio, Texas, I am involved in studies of the genetic basis and evolution of the shape of the head in primate evolution. This is part of the NSF Hominid, or Human Origins, program (see www.hominid.psu.edu, or the Hominid project's web page at nsf.org). Our project involves gene mapping in baboons and mice and various aspects of informatics and experimental mouse genetics, based on morphometric findings on CT-scanned individuals from a cross between large and small mice and 800 baboons from a large known research pedigree at the Southwest Foundation. In addition, we are using the human, chimpanzee, macaque, and mouse (and other) whole-genome sequences, and bioinformatic (comparative DNA sequence analysis) approaches, to identify genes or regulatory sequence elements that might have undergone natural selection or similar evolutionary processes during the evolution of the uniquely shaped human head and face. We have already identified some interesting candidate regions in the baboon data, and mapping studies of a large sample of mice will occur sometime in 2010, along with scaling up and sharpening of the baboon results.

In this project we'll also be doing mapping analysis on 1200 skulls from a 34th-generation intercross between Lg and Sm mice, which differ in body size. This is a resource developed by Jim Cheverud in St Louis, and the specimens are being scanned and landmarked at the present time.

Candidate regions will be followed up in various ways, including gene expression and other experimental studies of the role of candidate genes in craniofacial development in mouse embryos. For more information about this project and its many facets, go here: www.hominid.psu.edu.

Human Variation and Disease

We’re interested in the molecular genetic investigation of the amount of human variation, its geographic distribution, and the relationship of that variation to risk of common, complex conditions like cardiovascular disease (CVD) and diabetes. The first main aim of our work was to elucidate the full nature of standing variation as it occurs within and among human populations, both to relate observed variation to the effects of demographic factors and, where applicable, of natural selection, and to define more clearly the ‘normal’ variation whose perturbations may be associated with disease. Currently I am not directly working with subjects affected by disease. Rather, I'm interested in the evolutionary and population processes that generate the variation in our species, including disease-associated variation. From time to time we do become involved in studies of genetic variation more directly in relation to disease, and we are involved, via Dr Sue Rutherford (a Research Scientist in the group--see Personnel), in studies of diabetes and cardiovascular disease. Students (at this stage, especially undergrads) may find opportunities for learning and working with Sue. At present we are accepting graduate students who wish to do research on disease using simulation and bioinformatic techniques. For those interests, see my work in forward evolutionary simulation (next section).

ForSim: A Forward Evolutionary Computer Simulation

Population genetics theory provides vital tools to understand many aspects of genetic change over the long and the short term. There are many excellent backward (coalescent) simulation programs that take a set of sequences sampled today and work backwards in time to reconstruct their common ancestral sequence. Some of these programs can handle recombination and natural selection in rather restricted ways. But life is really lived forward, and to be able to understand many aspects of evolution we need to be able to simulate the actual processes of genetic change as they happen forward in time. Forward evolutionary simulations work the way nature does, screening on phenotypes and only indirectly on genotypes. Forward simulation is brute-force simulation, rather than resting on elegant theory, but for the same reason it is much more flexible.

With Brian Lambert, we have developed a forward evolutionary simulation program called ForSim that is phenogenetic rather than genetic in nature, an attempt at full-fledged evolutionary simulation. A phenogenetic simulation generates genetic variation, but then rather than having that evolve directly, translates that variation (as real organisms do) into phenotypes, and it is those that are subject to various modes of natural selection, migration, and mate choice, in finite populations of various types, sizes, structures, and dynamics. Populations grow, divide, die out, send migrants to each other, and so on. Modeling such phenomena is important for understanding the genetic architecture that results from the evolution of real biological traits. For example, migration can be based on phenotypes, such as those more suitable to a new environment, which carries relevant genotypes along with it. The figure above shows the basic logical flow of the program's components. Below we present results that suggest just a taste of what the program package can do. ForSim is exceedingly flexible, and allows users to specify many different aspects of a simulation, but because of that, the output may require analytic scripting for analysis. At user-specified points during the simulation, and/or at the end, the entire data set at that time (or user-specified subsets) can be saved for post-run analysis. At the end of a run, the population, case-control samples for a specified phenotype, and user-specified numbers of multigeneration pedigrees are saved, along with many figures displaying conditions during the run (e.g., population size, heritability, losses due to selection, etc.).
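The phenogenetic idea described above can be sketched in a few lines of Python. This is only an illustrative toy, not ForSim's code or its input language: the function names, the additive phenotype model, the Gaussian stabilizing-selection fitness function, free recombination between loci, and all parameter values are assumptions made for the example. The point it tries to show is that selection acts on phenotypes, and genotypes change only as a consequence.

```python
import math
import random


def phenotype(individual, rng, env_sd):
    """Additive genetic value of both haplotypes plus environmental noise."""
    genetic_value = sum(individual[0]) + sum(individual[1])
    return genetic_value + rng.gauss(0.0, env_sd)


def fitness(pheno, optimum, width):
    """Gaussian stabilizing selection: fitness depends only on the phenotype."""
    return math.exp(-((pheno - optimum) ** 2) / (2.0 * width ** 2))


def make_gamete(parent, rng, mut_rate, mut_sd):
    """Build one haplotype from a parent: free recombination, then mutation."""
    n_loci = len(parent[0])
    # At each locus, copy the allele effect from one of the two parental haplotypes.
    gamete = [parent[rng.randrange(2)][i] for i in range(n_loci)]
    for i in range(n_loci):
        if rng.random() < mut_rate:
            gamete[i] += rng.gauss(0.0, mut_sd)  # new mutation shifts the effect
    return gamete


def simulate(pop_size=200, n_loci=10, generations=50, optimum=1.0,
             width=1.0, env_sd=0.5, mut_rate=0.01, mut_sd=0.2, seed=1):
    """Evolve a diploid population forward in time, screening on phenotype."""
    rng = random.Random(seed)
    # Every individual starts with two haplotypes carrying no effects.
    population = [([0.0] * n_loci, [0.0] * n_loci) for _ in range(pop_size)]
    mean_phenotypes = []
    for _ in range(generations):
        phenos = [phenotype(ind, rng, env_sd) for ind in population]
        weights = [fitness(p, optimum, width) for p in phenos]
        mean_phenotypes.append(sum(phenos) / pop_size)
        # Parents are drawn in proportion to phenotype-based fitness
        # (for simplicity, an individual may be drawn as both parents).
        mothers = rng.choices(population, weights=weights, k=pop_size)
        fathers = rng.choices(population, weights=weights, k=pop_size)
        population = [
            (make_gamete(m, rng, mut_rate, mut_sd),
             make_gamete(f, rng, mut_rate, mut_sd))
            for m, f in zip(mothers, fathers)
        ]
    return population, mean_phenotypes


if __name__ == "__main__":
    final_population, means = simulate()
    print(f"mean phenotype, first generation: {means[0]:.3f}")
    print(f"mean phenotype, last generation:  {means[-1]:.3f}")
```

Because the whole population is held in memory at every generation, a sketch like this can also save the full genetic data at any point during a run, which is what makes post-run analysis of simulated data possible.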

 

The figure on the left shows results of a case-control comparison among simulated individuals, as reflected in the haplotypes at a simulated gene. White (controls) and grey (cases) are given as haplotypes, one row to each. The haplotype's frequency is given in the bar graph at the left, and its net effect on a simulated quantitative phenotype on the right (red, negative effect, blue positive effect). Each contributing SNP is shown (green: no phenotype effect). These are arranged in haplotype-similarity order based on ClustalW clustering. You can see the multiallelic haplotypes and their relationships, as well as the small differences between what makes a case vs a control.

The next figure shows the haplotypes and linkage disequilibrium (LD) in two source populations (A and B) that evolved separately for 2,500 generations after splitting from a single population that had evolved for 7,500 generations, and then formed an admixed population (C) that evolved for 10 generations to the 'present' (this is like many admixed human populations in the US today). The figure shows the LD pattern differences between the populations as resolved by the Haploview program (on which HapMap was based, but applied to our simulated data). Of the 10 simulated genes shown here, 5 affected a trait under weak selection (identified by arrows), while the other genes did not affect a trait and evolved neutrally. The admixture and selection effects are not great, as is usually the case, but can be seen, and these data are like real human data on most genome regions in admixed populations.
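To make the admixture effect concrete, here is a small Python sketch of one common pairwise LD measure, r-squared, applied to made-up haplotypes. It is not how Haploview computes or displays LD, and the function name and the toy haplotype counts are assumptions chosen for illustration. Two sites are in linkage equilibrium within each source population, yet show LD once the populations are pooled, simply because allele frequencies differ between the sources.

```python
def ld_r_squared(haplotypes, site_a, site_b):
    """r^2 between two biallelic sites; each haplotype is a sequence of 0/1 alleles."""
    n = len(haplotypes)
    p_a = sum(h[site_a] for h in haplotypes) / n   # frequency of allele 1 at site A
    p_b = sum(h[site_b] for h in haplotypes) / n   # frequency of allele 1 at site B
    p_ab = sum(1 for h in haplotypes if h[site_a] == 1 and h[site_b] == 1) / n
    d = p_ab - p_a * p_b                           # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic site: r^2 is undefined, reported here as 0
    return d * d / denom


# Source population A: allele 1 is common (frequency 0.8) at both sites.
pop_a = [(1, 1)] * 16 + [(1, 0)] * 4 + [(0, 1)] * 4 + [(0, 0)] * 1
# Source population B: allele 1 is rare (frequency 0.2) at both sites.
pop_b = [(1, 1)] * 1 + [(1, 0)] * 4 + [(0, 1)] * 4 + [(0, 0)] * 16
# Pooling equal numbers of haplotypes from A and B stands in for recent admixture.
pop_c = pop_a + pop_b

for name, pop in (("A", pop_a), ("B", pop_b), ("C (admixed)", pop_c)):
    print(f"population {name}: r^2 = {ld_r_squared(pop, 0, 1):.4f}")
```

In this toy case r-squared is essentially zero in A and in B but about 0.13 in the pooled sample. Recombination in later generations would erode that signal, which is one reason the number of generations since admixture matters.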


It is clear that human geneticists are having difficulty understanding the genetic architecture and specific contributing genes that underlie variation in important complex traits like those responsible for human diseases. This is a major impetus for the program, and ForSim is being written in collaboration with Joe Terwilliger and Joe Lee at Columbia, with whom we share an interest in how study design and analytic strategy may be optimized in searches for disease genes, by mapping and other means. The examples shown here reflect that biomedical interest. Similar problems face anthropologists interested in the genetic basis of normal traits, with focus on how they evolve, such as craniofacial shape or stature, and their variation among primates including ourselves (applications we are currently making). ForSim is designed to address these problems. At the end of a simulation, pedigree data are saved that can be tested in biostatistical inference packages. But the applications are not limited to humans, nor to very short-term population history; they extend to longer-term evolution as well.

ForSim accommodates multiple loci, chromosomes, and populations, and a range of mating, gene flow, phenogenetic, and selection models with parameters that can be changed at any point during a simulation, and copious output data during the run (as specified by the user) and at the end. Version 1.0 includes the ability to specify multiple genes and phenotypes, with each gene able to affect different phenotypes; phenotypes determined by user-written functions of genes and environmental effects that can vary over time or among populations; fitness functions of similar nature and complexity that can be based on multiple phenotypes; individual and family-specific environments; phenotype-based assortative mating; multiple populations; and gene x environment interaction. Users can address (and alter) many of the variables, in various ways and at chosen points during the run, by so specifying in the input specification file, written in ForSim's own block-structured input scripting language. So there is extensive freedom for users to specify problems of their own interest, as free as possible from program-based restrictions. Most characteristics can be altered at specified points during a simulation. The program is fast given its flexibility; for example, ForSim can simulate a 100Mb chromosome with 10 genes of 40Kb each evolving in a population of 10,000 (roughly the effective population size of the human species) for 10,000 generations (roughly the age of the human species), with mutation and recombination and the genes evolving neutrally, in 28 minutes on a 2005-vintage 2.8GHz Pentium computer. Zooming in on the first figure above, you can see the relative time demands of the various program functional branches (based on tracking C++ class calls during the run).

ForSim is not as fast as (more restrictive) coalescent simulations are, and is not designed to generate 100,000 replications of a situation to generate detailed probability distributions (unless you have access to a CPU farm). But it can generate a reasonable number of reps, sufficient for understanding basic genetic architecture and its heterogeneity for interestingly complex situations, and by using multiple unlinked genes and traits, and multiple populations, it can generate repetitions of multiple events during the same stochastic evolutionary history. Even for realistically complex scenarios, as CPU speed and memory increase exponentially, and with increasingly available and affordable high-speed CPU farms, large numbers of replications will increasingly become possible. Long-term evolutionary events can be understood, even to the point of generating 'ancient DNA', that is, directly saving the entire genetic population data at points during a run.
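A back-of-envelope calculation shows why the number of replications matters. The Python sketch below takes the 28-minute benchmark quoted above as the cost of one replicate and assumes, purely for illustration, that every replicate costs the same and that a CPU farm divides the work evenly with no overhead; the function names and the 1,000-CPU figure are hypothetical.

```python
def serial_runtime_minutes(replicates, minutes_per_replicate=28.0):
    """Total minutes if replicates run one after another on a single CPU."""
    return replicates * minutes_per_replicate


def farm_runtime_days(replicates, cpus, minutes_per_replicate=28.0):
    """Days needed when replicates are spread evenly over independent CPUs."""
    minutes = serial_runtime_minutes(replicates, minutes_per_replicate) / cpus
    return minutes / (60 * 24)


one_cpu_years = farm_runtime_days(100_000, cpus=1) / 365.25
farm_days = farm_runtime_days(100_000, cpus=1_000)
print(f"100,000 replicates on one CPU: about {one_cpu_years:.1f} years")
print(f"100,000 replicates on 1,000 CPUs: about {farm_days:.1f} days")
```

Under those assumptions, 100,000 replicates would take roughly 5.3 years on a single machine of that vintage but about 1.9 days on a 1,000-CPU farm, while a hundred replicates fit in about two days on one machine.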

ForSim is intended for simulation of the generation of genetic variation within a user-specified genetic etiology (number of genes and their general mode of action and interaction, etc.), on a scale of up to millions of years (technically, open-ended), so that we can better understand the amount and nature of causal variation for complex traits today. There is extensive output, and some knowledge of programming (e.g., scripting in Perl, Ruby, or Python) is important in parsing the data for specific uses, because so much data are made available. Many population geneticists and genetic epidemiologists have written ad hoc forward simulations of one kind or another. We think that ForSim is generally more flexible and general than they are, more natural in terms of modeling evolution by phenotype, and less dependent on theoretical assumptions. However, we did not develop ForSim to be in a contest with other programs, and each has its own uses and unique value.

A brief introduction to ForSim is published (Lambert et al., Bioinformatics, 24: 1821-1822, 2008). The ForSim User Manual is available, and version 1.0 of the program (C++ source code and other wrapper scripts and resources) is available at no cost upon emailed request (contact Brian Lambert: bwl1@psu.edu), conditional on agreeing to an open-source non-commercialization license. The package includes the latest version of the Manual, which describes the program and how to set up the input file that specifies the many different user-variable run conditions, along with samples of output text and graphics. The features, output data, and so on are continually being updated and improved. We will want users to register with us so we can notify you of bugs and changes, and to agree on how to credit the program and describe any changes you make to the code in any resulting publications, watermarking the modified source code appropriately. The posted Manual may not be the most current; we are always updating it to make explanations clearer (we hope!) and as we modify the program or find errors. New features are regularly added. If you have a serious interest in using ForSim, contact us for the latest versions.

Version 2.0 is anticipated in about a year to add the ability for users to write their own algorithms of unrestricted complexity (in C++ class definitions to be compiled along with the program), that will allow open-ended phenotype, migration/mate-choice, and selection functions.

Bioethics and biocosmology

With the completion of the Human Genome Project (HGP) and many other genomic resources rapidly expanding, leading to accelerating use of genetic information in diagnosis and risk assessment, it is important to take stock of the larger social and ethical implications of population genetics research. Several relevant areas are currently of interest to me and my group. We explore the empirical and epistemological limits of genetic information, to highlight the current tendency toward what I think is excessive genetic determinism that goes beyond what we can actually say with data (or, often, directly against what we already know). It is important that people have a fuller understanding of the genetic and environmental contributions to complex traits and how difficult such traits, and their evolution, are to understand. Another major research interest is the implications of population-specific (or as some would insist on calling it, 'race' specific) analysis, attempting to balance the likely benefits of increased knowledge regarding disease etiology and risk against potential harms of stigmatization, discrimination, and categorical treatment of quantitative variation. I am also interested in the ways in which vested interests, ranging from private biotechnology firms to representatives of communities being investigated, interact to determine research priorities and hence scientific outcomes and even our views of the science itself.

These subjects are connected through the over-arching theory of evolution, and genes as the underlying 'atoms' of this worldview. The powerful and perceptive ideas of genes and natural selection and historical evolution are extended, sometimes uncritically but with important implications for society and human well-being, to economics, politics, behavior, social relations and many other areas. We're interested in how this happens and what it means--especially because even biologists seem sometimes unaware of its limitations within biology itself.

We work closely with faculty and staff in Penn State's Rock Ethics Institute and department of Science, Technology, and Society (and others) in our bioethics work, and we have recently established an undergraduate minor in the subject, with future developments in the works. Post-docs or graduate students with interests in the history, philosophy, or societal aspects of relevant areas of biology or biological anthropology might find our program interesting. We like to train students who know genetics well but who are primarily interested in the societal application issues of the science. A current student in this program is a lawyer, interested in human rights law related to the use of DNA to establish, or claim, ancestry or to predict disease.

Bioethics Links

Rock Ethics Institute at Penn State:
http://rockethics.psu.edu/

Science, Medicine, and Technology in Culture program at Penn State:
http://rockethics.psu.edu/smtc/

ELSI Research Program of the National Human Genome Research Institute:
http://www.genome.gov/10001618

Bioethics Resources (from the NIH):
http://www.nih.gov/sigs/bioethics/index.html

Bioethics.net:
http://www.ajobonline.com/

New Mouse Under Development

"The Kawasaki"

"Easy to handle, but extremely difficult to breed" says Chief of Mouse Development Kazu Kawasaki of his latest mouse.

Consisting of only a head and tail, the Kawasaki Mouse eliminates the fuss and muss of internal organs* for researchers interested in tooth, brain, or tail development. An ancillary advantage is the ability to study flagellar locomotion in mammals. That work is in progress.

Look for this handy research tool to be available soon!



* NOTE: Requires special "pre-digested" feed (also under development)