Saturday, April 22, 2017

Simulating continuous space in SLiM 2

I've written before about SLiM 2, the genetically-explicit, individual-based evolutionary simulation software package that I work on nowadays with Philipp Messer at Cornell (see here and here).  SLiM is very flexible, because it's highly scriptable; you can customize and extend almost every aspect of its simulations.  It's also quite fast, as we showed in our paper on it (and it has gotten even faster since then).  It can run on Mac OS X or on Linux, including at the command line and on computing clusters.  And it includes a full-featured graphical modeling environment for OS X, allowing interactive model development and visual debugging.  And it's open-source (on GitHub) and free to use.  If you haven't checked it out already, those links above are a good place to start.

The SLiMgui modeling environment.

In this post I want to call attention to some new features I added recently: support for continuous space, and for spatial interactions.  These features are now released in version 2.3 of SLiM (downloadable from the SLiM home page).  SLiM 2 has always support spatially discrete subpopulations connected via migration, and it has always been possible to write script that governs interactions between individuals.  With these new additions, we now support continuous space as well as discrete space (and even both in the same model, if you wish), and we now provide built-in support for extremely fast evaluation of spatial interactions (i.e., individuals interacting with the other individuals nearest to them in space).

We'll look at three spatial models in this post.  Let's get right into it with a video of a continuous-space model running in SLiMgui:



What you see here, if you click play, are individuals (the little colored squares) in a 2D space (the black background).  SLiM now supports 1D, 2D, and 3D models, but since 2D is the easiest for us humans to visualize we'll stick with that here.  This model includes two spatial interactions: spatial competition, and local mate choice.  Individuals in this model are colored according to their fitness; when a dense cluster of individuals occurs, they compete with each other so strongly that their fitness is substantially decreased (the colors toward red).  On the other hand, it doesn't pay to be off by yourself in the wilderness; you might experience little competition, but you will also be unlikely to be chosen as a mate since mate choice is also local in this model.  These considerations thus trade off, leading to dynamically evolving spatial clusters that walk the line between being too dense and being too dispersed.  The number, size, and density of the clusters depend upon factors such dispersal distance (which is also local in this model), competition strength, and the mating kernel.

So we have a fairly sophisticated spatial model here, and it also includes all of the usual sophistication of SLiM models: individuals here have explicitly modeled chromosomes, with every mutation tracked through time, for example.  What does the SLiM script for a model like this look like?

initialize() {
   initializeSLiMOptions(dimensionality="xy");
   initializeMutationRate(1e-7);
   initializeMutationType("m1", 0.5, "f", 0.0);
   initializeGenomicElementType("g1", m1, 1.0);
   initializeGenomicElement(g1, 0, 99999);
   initializeRecombinationRate(1e-8);
   
   // spatial competition
   initializeInteractionType(1, "xy", reciprocal=T,
      maxDistance=0.3);
   i1.setInteractionFunction("n", 3.0, 0.1);
   
   // spatial mate choice
   initializeInteractionType(2, "xy", reciprocal=T,
      maxDistance=0.1);
   i2.setInteractionFunction("n", 1.0, 0.02);
}
1 late() {
   sim.addSubpop("p1", 500);
   
   for (ind in p1.individuals)
      ind.setSpatialPosition(p1.pointUniform());
}
1: late() {
   i1.evaluate();
   i2.evaluate();
}
fitness(NULL) {
   totalStrength = i1.totalOfNeighborStrengths(individual);
   return 1.1 - totalStrength / p1.individualCount;
}
mateChoice() {
   // spatial mate choice
   return i2.strength(individual);
}
modifyChild() {
   do pos = parent1.spatialPosition + rnorm(2, 0, 0.01);
   while (!p1.pointInBounds(pos));
   child.setSpatialPosition(pos);
   
   return T;
}
2000 late() { sim.outputFixedMutations(); }

That's all it takes!  This is actually recipe 14.4 in the "SLiM cookbook", a compendium of example models that occupies the first half of SLiM's manual; for a more complete discussion of this model you can refer to that recipe.  Here, let's just walk through it very quickly.

The script is divided up into script blocks that do different things and run at different times; each script block is enclosed by braces {} and has indented script inside.  The first block is an initialize() callback, which SLiM runs when the simulation is first initialized.  It tells SLiM to run a 2D model using x-y coordinates, and sets up some basic genetic structure (mutation rate, chromosome length, recombination rate, background neutral mutations generated stochastically, etc.).  Then it initializes two interaction types: one for spatial competition, the other for spatial mate choice.  Without getting into the details too much, the calls here declare that these interactions both depend upon both the x and y coordinates of individuals, but they use different spatial interaction kernels and have different maximum distances.

Next comes a script block that runs at the end of generation 1.  It kicks off the simulation by adding a new subpopulation of 500 individuals, and gives each individual an initial position in space that is drawn randomly from within the spatial bounds of the subpopulation (we use the default bounds here, of [0,1] for x and y, but those bounds can be changed, of course).

The next block runs at the end of every generation from 1 onwards; that is the meaning of "1: late()".  It simply tells the two spatial interactions to evaluate themselves.  Interactions can be evaluated at any time; we have chosen this point in the generation cycle to do it.  (Since the positions of individuals can change at any point in time, it is necessary to choose such a "reference time" for interactions.)

Next comes a fitness() callback.  This is a script block that affects the fitness values for individuals, based upon genetics or upon any other simulation state.  Here we use the first interaction type that we defined, i1, to total up the interaction strengths between the focal individual and all other nearby individuals, and we decrease the focal individual's fitness accordingly.  This implements spatial competition: the more individuals are near us in space, and the nearer they are, the lower our fitness.

The next block is a mateChoice() callback.  This consults the second interaction type that we defined, i2, in order to obtain mating weights to use between the focal individual and all other individuals.  In essence, it says that the mating weight for each potential weight is equal to its interaction strength: the closer the potential mate is, then, the more likely it is to be chosen as a mate.

Then comes a modifyChild() callback; this is called whenever a new offspring individual is being generated by SLiM.  Its job is just to set the spatial position of offspring.  It does this by starting with the position of the first (i.e., maternal) parent, adding random deviations to x and y drawn from a normal distribution, looping until the new position is within spatial bounds (thus implementing a "reprising" boundary condition), and finally setting the position on the offspring.  It returns T (true) to tell SLiM to proceed with generating the proposed offspring.

Finally, an event at generation 2000 outputs a list of all of the mutations that fixed during the run of the model;  The model then terminates, since there are no events scheduled for generations beyond 2000.

Apologies if that description was eye-wateringly boring!  The point is just to show that it really is possible to set up a genetically and spatially explicit model, including both spatial competition and local mate choice, in less than a page of code.  SLiM is doing a whole lot of work behind the scenes, of course; the spatial competition interaction here, for example, is actually evaluated using an advanced data structure called a k-d tree, allowing very fast interaction calculations even for relatively large population sizes.  But all of that complexity is hidden, effectively encapsulated by SLiM's scripting interface.



We're going to look at two other spatial models here, but in much less depth; we won't look at more Eidos script, we'll just see videos and discuss the models a little.  These models are also recipes in SLiM's "cookbook", though, so you can look at their code in detail if you wish.

The second model we'll look at (recipe 14.10 in the cookbook) involves a "landscape map" that defines an environmental characteristic that varies across space.  For this somewhat silly model, we have created a landscape map that reflects the land/water dichotomy across the whole planet.  This video might get clipped on the right by the narrow width of the blog; you can open it in YouTube to see it at full size:



The model uses this landscape to limit the dispersal of individuals; new offspring can disperse onto land locations, but not water locations.  The whole population starts in a tight cluster in East Africa, and then disperses outward from there to reach Madagascar and the Middle East, India and Russia, and Western Europe.  If the model runs longer, the population finds its way across the Indonesian archipelago and into Australia; making it to the New World, however, is highly unlikely since it would require such a long jump across open ocean.  (The Bering Strait is not open to migration, since the left and right boundaries of the map are not connected in this model.)

Obviously this is just a toy model – the way that the map projection distorts the globe makes it unrealistic, just for starters – but it is a proof of concept, showing that any arbitrary landscape map can be defined and can then be used by the model to modify the simulation dynamics in any way desired.



The third and final model we'll look at (recipe 14.11 in the cookbook) generates a landscape map programmatically, rather than using a pre-fab map as the previous model did.  Each run of the model uses a different, random map, and this map defines what sort of phenotype is most fit at that point in space.  You could think of the mapped environmental trait as being elevation, for example; at a high elevation location, high-elevation-adapted individuals will be most fit, and mutatis mutandi for low elevations.  There is thus local, divergent selection pressure applied across the map, selecting for local adaptation.  And that is what we see:



The colors of individuals here reflect the environment for which they are best adapted; yellow individuals are most fit in yellow landscapes, red individuals in red landscapes.  The population starts out as all red (with occasional mutants), and those red individuals quickly die off in areas of the map that aren't red, leaving red populations in the redder areas of the map.  Mutants arise and manage to establish locally-adapted populations in several locations across the map, however; this is actually speciation occurring in the model.  This is a sexual model, and mate choice is both spatial and assortative; it is a "magic trait" model in which the trait under divergent selection is also used in assortative mating, facilitating divergence and speciation.  A longer runtime often leads to colonization and local adaptation to conditions across the map.  Although it is not the same model, this is in many ways similar to a model I wrote with Rupert Mazzucco and Ulf Dieckmann a couple of years ago; but that model was not written in SLiM, and so it involved a huge amount of code and took months to write (and it was not even genetically explicit, nor was it a sexual model!), whereas I coded this recipe in SLiM in about an hour, and it is about a page and a half of code!

You might wonder exactly how the individuals in this model are adapting; what is the genetics underlying the phenotype modeled here?  This is a QTL-based model, as it happens.  QTLs that affect the phenotype of individuals can arise randomly anywhere in the genome, with any random effect size.  Phenotypes are then calculated based upon the additive effects of all of the QTLs possessed by a given individual.  So it is essentially a quantitative genetics model, but instead of assuming an infinite number of loci of infinitesimal effect, as analytical quantitative genetics models typically do, it actually allows a QTL-based genetic structure to emerge dynamically.  One could then look at what that emergent QTL structure looks like – the distribution of effect sizes, for example – and how that is affected by the spatial structure of the model.  There are all sorts of fascinating theoretical questions in this area that can now be explored in SLiM, with just a few lines of script, in a graphical modeling environment.  Empirical questions, too, can be modeled since you can specify whatever genetic structure you wish (based on the genome map of your study organism, perhaps), set up whatever initial model state you desire (based upon empirical measurements), create whatever landscape maps you wish (based upon your field site), and so forth.

I must say I can't wait to see how people use these new features!  If you're interested, download SLiM from the SLiM home page, join the slim-announce mailing list to hear about new versions with new features like this, and/or join the slim-discuss mailing list to ask questions or discuss ideas.  Thanks for reading, and happy modeling!

No comments:

Post a Comment