Parameters
How to Use This Section:
In order to create and run a Geonomics Model, you will need a valid Geonomics parameters file. Don’t worry, though: this is very easy to create! To generate a new, template parameters file, you will simply call the gnx.make_parameters_file function, feeding it the appropriate arguments (to indicate how many Species and Layers you want to include in your Model; which parameters sections you want included in the file, both for those Layers and Species and for other components of the Model; and the path and filename for your new parameters file). Geonomics will then automatically create the file for you, arranged as you requested and saved where you requested.
When you then open that file, you will see the following:
#<your_filename>.py
#This is a default parameters file generated by Geonomics
#(by the gnx.params.make_parameters_file() function).
# :::::: ::: :: :::::::::::#
#:::::: :::: ::: :: :: :: ::::::::::: ::#
#::::::::: :: :: :::::::::::::::::::::::::#
#:::::::::: :::::::::: :::::: :::::::: ::#
# : :::: :: :::: : :: :::::::: : :: : #
# GGGGG :EEEE: OOOOO NN NN OOOOO MM MM IIIIII CCCCC SSSSS #
# GG EE OO OO NNN NN OO OO MM MM II CC SS #
# GG EE OO OO NN N NN OO OO MMM MMM II CC SSSSSS #
# GG GGG EEEE OO OO NN NNN OO OO MM M MM II CC SS #
# GG G EE OO OO NN NN OO OO MM MM II CC SSS #
# GGGGG :EEEE: OOOOO NN NN OOOOO MM MM IIIIII CCCCC SSSSS #
# : :::::::: :::::::::: :: :: : : #
#: ::::: :::::: ::: ::::::: #
# ::: ::::: :: ::::: #
# :: :::: #
# :: #
params = {
##############
#### LAND ####
##############
'land': {
##############
#### main ####
##############
'main': {
# y,x (a.k.a. i,j) dimensions of the Landscape
'dim': (20,20),
#.
#.
#.
This is the beginning of a file that is really just a long but simple Python script (hence the ‘.py’ extension); this whole file just defines a single, long, nested dict (i.e. a Python ‘dictionary’) containing all of your parameter values. It may look like a lot, but don’t be concerned, for two reasons:
All the hard work is already done for you. You’ll just need to change the default values where and how you want to, to set up your particular simulation scenario.
You will probably leave a good number of the parameters defined in this file untouched. Geonomics does its best to set sensible default values for all its parameters. Though of course, you’ll want to think clearly nonetheless about whether the default value for each parameter is satisfactory for your purposes.
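Because the parameters file is ordinary Python, its nested dict can be read and changed like any other. The sketch below uses a heavily trimmed, illustrative stand-in for a real parameters file: only the 'land', 'main', 'dim' nesting is taken from the template shown above, and everything else is omitted.

```python
# A trimmed, illustrative stand-in for a Geonomics parameters dict.
# Only the 'land' -> 'main' -> 'dim' nesting comes from the template;
# a real file contains many more sections and parameters.
params = {
    'land': {
        'main': {
            # dimensions of the Landscape
            'dim': (20, 20),
        },
    },
}

# Changing a default means changing the value stored at its nested key.
params['land']['main']['dim'] = (50, 50)

print(params['land']['main']['dim'])
```

In practice you would normally make the same change by editing the value directly in the file with a text editor.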
Each parameter in the parameters file is preceded by a terse comment, to remind you what the parameter does. But for detailed information about each parameter, you’ll want to refer to the following information. What follows is a list of all of the Geonomics parameters (in the sections and the top-to-bottom order in which they’ll appear in your parameters files). For each parameter, you will see a section with the following information:
a snippet of the context (i.e. lines of Python code) in which it appears in a parameters file;
the valid Python data type(s) the parameter can take
the default value of the parameter
a ranking score, indicating how likely it is that you will want to reset this parameter (i.e. change it from its default value), and encoded as follows:
‘Y’: almost certainly, or must be reset for your Model to run
‘P’: it is quite possible that you will want to reset this parameter, but this will depend on your use and scenario
‘N’: almost certainly not, or no need to reset because it should be set intelligently anyhow (Note: this does not mean that you cannot reset the parameter! If that is the case for any value then it does not appear in the parameters file.)
other relevant, detailed information about the parameter, including an explanation of what it defines, how its value is used, where to look for additional information about parameters related to other Python packages, etcetera
These sections will be formatted as follows:
<param_name>
#brief comment about the parameter
'<param_name>': <default_param_value>,
<valid Python data type(s)>
default: <default value>
reset? <ranking>
<Explanation of what the parameter defines, how its value is used, and any other relevant information.>
This section should serve as your primary point of reference if you confront any uncertainty while creating your own parameters files. We’ll start with the section of parameters that pertains to the Landscape object.
Landscape parameters
Main
dim
# x,y (a.k.a. j,i) dimensions of the Landscape
'dim': (20,20),
tuple
default: (20,20)
reset: P
This defines the y,x dimensions of the Landscape, in units of cells. As you might imagine, these values are used for a wide variety of basic operations throughout Geonomics. Change the default value to the dimensions of the landscape you wish to simulate on.
res
# x,y resolution of the Landscape
'res': (1,1),
tuple
default: (1,1)
reset: N
This defines the Landscape resolution (or cell-size) in the y,x dimensions (matching the convention of the dim parameter). This information is only used if Landscape layers are to be written out as GIS raster files (as parameterized in the ‘Data’ parameters). Defaults to the meaningless value (1,1), and this value generally needn’t be changed in your parameters file, because it will be automatically updated to the resolution of any GIS rasters that are read in for use as Layers (assuming they all share the same resolution; otherwise, an Error is thrown).
ulc
# x,y coords of upper-left corner of the Landscape
'ulc': (0,0),
tuple
default: (0,0)
reset: N
This defines the x,y upper-left corner (ULC) of the Landscape (in the units of some real-world coordinate reference system, e.g. decimal degrees, or meters). This information is only used if Landscape layers are to be written out as GIS raster files. Defaults to the meaningless value (0,0), and this value usually needn’t be changed in your parameters file, because it will be automatically updated to match the ULC value of any GIS rasters that are read in for use as Layers (assuming they all share the same ULC; otherwise, an Error is thrown).
prj
#projection of the Landscape
'prj': None,
str (WKT projection string)
default: None
reset: N
This defines the projection of the Landscape, as a string of Well Known Text (WKT). This information is only used if Landscape layers are to be written out as GIS raster files. Defaults to None, which is fine, because this value will be automatically updated to match the projection of any GIS rasters that are read in for use as Layers (assuming they all share the same projection; otherwise, an Error is thrown).
Layers
layer_<n>
################
#### layers ####
################
'layers': {
#layer name (LAYER NAMES MUST BE UNIQUE!)
'layer_0': {
{str, int}
default: layer_<n>
reset? P
This parameter defines the name for each Layer. (Note that unlike most parameters, this parameter is a dict key, the value for which is a dict of parameters defining the Layer being named.) As the capitalized reminder in the parameters file states, each Layer must have a unique name (so that a parameterized Layer isn’t overwritten in the ParametersDict by a second, identically-named Layer); Geonomics checks for unique names and throws an Error if this condition is not met. Layer names can, but needn’t, be descriptive of what each Layer represents. Example valid values include: 0, 0.1, ‘layer_0’, 1994, ‘1994’, ‘mean_ann_tmp’. Names default to layer_<n>, where n is a series of integers starting from 0 and counting the number of Layers.
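The reason unique names matter can be seen in plain Python, independent of Geonomics: a dict literal that repeats a key keeps only the last value. The sketch below illustrates that language behavior (the 'note' entries are hypothetical placeholders, not Geonomics parameters); it does not reproduce Geonomics’ own uniqueness check.

```python
# Python keeps only the last value when a dict literal repeats a key,
# so a duplicated Layer name would silently discard the first definition.
layers = {
    'layer_0': {'note': 'first definition'},
    'layer_0': {'note': 'second definition'},  # same name: replaces the first
}
print(len(layers))
print(layers['layer_0']['note'])

# Mixed str and int names are fine as long as each one is unique.
ok_layers = {0: {}, 'layer_1': {}, 'mean_ann_tmp': {}}
print(len(ok_layers))
```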
Init
There are four different types of Layers that can be created. The parameters for each are explained in the next four subsections.
random
n_pts
#parameters for a 'random'-type Layer
'rand': {
#number of random points
'n_pts': 500,
int
default: 500
reset? P
This defines the number of randomly located, randomly valued points from which the random Layer will be interpolated. (Locations are drawn from uniform distributions between 0 and the Landscape dimensions on each axis. Values are drawn from a uniform distribution between 0 and 1.)
interp_method
#interpolation method ('linear', 'cubic', or 'nearest')
'interp_method': 'linear',
},
{'linear', 'cubic', 'nearest'}
default: 'linear'
reset? N
This defines the method to use to interpolate random points to the array that will serve as the Layer’s raster. Whichever of the three valid values is chosen ('linear', 'cubic', or 'nearest') will be passed on as an argument to scipy.interpolate.griddata. Note that the 'nearest' method will generate a random categorical array, such as might be used for modeling habitat types.
defined
rast
#parameters for a 'defined'-type Layer
'defined': {
#raster to use for the Layer
'rast': np.ones((100,100)),
2d np.ndarray
default: np.ones((100,100))
reset? Y
This defines the raster that will be used for this Layer. Can be set to None if an array for the raster should instead be interpolated from a set of valued points using the pts, vals, and interp_method parameters. Dimensions of this array must match the dimensions of the Landscape.
pts
#parameters for a 'defined'-type Layer
'defined': {
#point coordinates
'pts': None,
nx2 np.ndarray
default: None
reset? Y
This defines the coordinates of the points to use to interpolate this Layer. Can be left as None if the rast parameter is given a numpy.ndarray.
vals
#point values
'vals': None,
{list, 1xn np.ndarray}
default: None
reset? Y
This defines the values of the points to use to interpolate this Layer. Can be left as None if the rast parameter is given a numpy.ndarray.
interp_method
#interpolation method {None, 'linear', 'cubic',
#'nearest'}
'interp_method': None,
},
{None, 'linear', 'cubic', 'nearest'}
default: None
reset? N
This defines the method to use to interpolate the defined points (given by the pts and vals parameters) to the array that will serve as the Layer’s raster. Whichever of the valid string values is chosen ('linear', 'cubic', or 'nearest') will be passed on as an argument to scipy.interpolate.griddata. Note that the 'nearest' method will generate a categorical array, such as might be used for modeling habitat types. Can be left as None if the rast parameter is given a numpy.ndarray.
file
filepath
#parameters for a 'file'-type Layer
'file': {
#</path/to/file>.<ext>
'filepath': '/PATH/TO/FILE.EXT',
str
default: '/PATH/TO/FILE.EXT'
reset? Y
This defines the location and name of the file that should be read in as the raster-array for this Layer. Valid file types include a ‘.txt’ file containing a 2d np.ndarray, or any GIS raster file that can be read by rasterio.open. In all cases, the raster-array read in from the file must have dimensions equal to the stipulated dimensions of the Landscape (as defined in the dim parameter, above); otherwise, Geonomics will throw an Error. Defaults to a dummy filename that must be changed.
scale_min_val
#minimum value to use to rescale the Layer to [0,1]
'scale_min_val': None,
{float, int}
default: None
reset? P
This defines the minimum value (in the units of the variable represented by the file you are reading in) to use when rescaling the file’s array to values between 0 and 1. (This is done to satisfy the requirement that all Geonomics Layers have arrays in that interval.) Defaults to None (in which case Geonomics will set it to the minimum value observed in this file’s array). But note that you should put good thought into this parameter, because it won’t necessarily be the minimum value observed in the file; for example, if this file is being used to create a Layer that will undergo environmental change in your Model, causing its real-world values to drop below this file’s minimum value, then you will probably want to set this value to the minimum real-world value that will occur for this Layer during your Model scenario, so that low values that later arise on this Layer don’t get truncated at 0.
scale_max_val
#maximum value to use to rescale the Layer to [0,1]
'scale_max_val': None,
{float, int}
default: None
reset? P
This defines the maximum value (in the units of the variable represented by the file you are reading in) to use when rescaling the file’s array to values between 0 and 1. (This is done to satisfy the requirement that all Geonomics Layers have arrays in that interval.) Defaults to None (in which case Geonomics will set it to the maximum value observed in this file’s array). But note that you should put good thought into this parameter, because it won’t necessarily be the maximum value observed in the file; for example, if this file is being used to create a Layer that will undergo environmental change in your Model, causing its real-world values to increase above this file’s maximum value, then you will probably want to set this value to the maximum real-world value that will occur for this Layer during your Model scenario, so that high values that later arise on this Layer don’t get truncated at 1.
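The effect of scale_min_val and scale_max_val can be sketched with a small min-max rescaling function. This is an illustration under assumptions, not Geonomics’ implementation: the rescale helper and the temperature values are hypothetical, and clipping to [0, 1] is assumed from the "truncated at 0" and "truncated at 1" wording above.

```python
def rescale(values, min_val=None, max_val=None):
    """Linearly rescale values to [0, 1], clipping anything outside that range."""
    lo = min(values) if min_val is None else min_val
    hi = max(values) if max_val is None else max_val
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]

temps_now = [10.0, 15.0, 20.0]     # hypothetical values read from the file
temps_future = [14.0, 19.0, 24.0]  # the same cells after the change event

# Scaled with the file's own observed range: the warmest future cell is clipped to 1.
clipped = rescale(temps_future, min(temps_now), max(temps_now))
print(clipped)

# Scaled with a range wide enough for the whole scenario: nothing is clipped.
unclipped = rescale(temps_future, 10.0, 24.0)
print(unclipped)
```

With the wider range, the two warmest future cells stay distinguishable instead of both saturating near 1.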
coord_prec
#decimal-precision to use for coord-units (ulc & res)
'coord_prec': 5,
int
default: 5
reset? P
This defines the number of decimals to which to round upper-left corner coordinates and resolution values read in from a raster file. Because Geonomics requires equality of these values amongst all input raster files, this allows the user to stipulate the level of precision of their coordinate system, avoiding false coordinate-system mismatch errors because of arbitrary float imprecision. (Note that for Layers for which change rasters will be read in, the same coordinate precision value will be used for all input rasters.)
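The kind of false mismatch that coord_prec guards against is easy to reproduce with ordinary floats. In this sketch the two resolution values are hypothetical, and Python’s built-in round stands in for whatever rounding Geonomics applies internally.

```python
coord_prec = 5  # the default value of the coord_prec parameter

# Two hypothetical resolution values that should be identical
# but differ by floating-point noise.
res_a = 0.1 + 0.2
res_b = 0.3

# An exact comparison reports a mismatch...
print(res_a == res_b)
# ...while comparing after rounding to coord_prec decimals does not.
print(round(res_a, coord_prec) == round(res_b, coord_prec))
```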
units
#units of this file's variable
'units': None,
{str, None}
default: None
reset? P
This is an optional parameter providing a string-representation of the units in which a raster file’s variable is expressed. If provided, it will be used to label the colorbar on plots of the raster’s Layer.
nlmpy
function
#parameters for an 'nlmpy'-type Layer
'nlmpy': {
#nlmpy function to use to create this Layer
'function': 'mpd',
str that is the name of an nlmpy function
default: 'mpd'
reset? P
This indicates the nlmpy function that should be used to generate this Layer’s array. (nlmpy is a Python package for generating neutral landscape models, or NLMs.) Defaults to 'mpd' (the function for creating a midpoint-displacement NLM). Can be set to any other str that identifies a valid nlmpy function, but then the remaining parameters in this section must be changed to the parameters that that function needs, and only those parameters (because they will be unpacked into this function, i.e. passed on to it, at the time it is called). (Visit the Cheese Shop, i.e. the Python Package Index, for more information about the nlmpy package and available functions.)
nRow
#number of rows (MUST EQUAL LAND DIMENSION y!)
'nRow': 20,
int
default: 20
reset? P
This defines the number of rows in the nlmpy array that is created. As the capitalized reminder in the parameters file mentions, this must be equal to the y-dimension of the Landscape; otherwise, an error will be thrown. Note that this parameter (and the remaining parameters in this section, other than the function parameter) is valid for the default nlmpy function (nlmpy.mpd, which is set by the function parameter); if you are using a different nlmpy function to create this Layer then this and the remaining parameters must be changed to the parameters that that function needs, and only those parameters (because they will be unpacked into that function, i.e. passed on to it, at the time it is called).
nCol
#number of cols (MUST EQUAL LAND DIMENSION x!)
'nCol': 20,
int
default: 20
reset? P
This defines the number of columns in the nlmpy array that is created. As the capitalized reminder in the parameters file mentions, this must be equal to the x-dimension of the Landscape; otherwise, an error will be thrown. Note that this parameter (and the remaining parameters in this section, other than the function parameter) is valid for the default nlmpy function (nlmpy.mpd, which is set by the function parameter); if you are using a different nlmpy function to create this Layer then this and the remaining parameters must be changed to the parameters that that function needs, and only those parameters (because they will be unpacked into that function, i.e. passed on to it, at the time it is called).
h
#level of spatial autocorrelation in element values
'h': 1,
float
default: 1
reset? P
This defines the level of spatial autocorrelation in the element values of the nlmpy array that is created. Note that this parameter (and the remaining parameters in this section, other than the function parameter) is valid for the default nlmpy function (nlmpy.mpd, which is set by the function parameter); but if you are using a different nlmpy function to create this Layer then this and the remaining parameters must be changed to the parameters that that function needs, and only those parameters (because they will be unpacked into that function, i.e. passed on to it, at the time it is called).
Change
change_rast
#land-change event for this Layer
'change': {
#array or file for final raster of event, or directory
#of files for each stepwise change in event
'change_rast': '/PATH/TO/FILE.EXT',
{2d np.ndarray, str}
default: '/PATH/TO/FILE.EXT'
reset? Y
This defines either the final raster of the Landscape change event (with valid values being a numpy.ndarray or a string pointing to a valid raster file, i.e. a file that can be read by rasterio.open); or the stepwise series of changes to be made over the course of the Landscape change event (with the valid value being a string pointing to a directory full of valid raster files). Note that whether an array, a raster, or multiple rasters are input, their dimensions must be equal to the dimensions of the Layer that is being changed (and hence to the Landscape to which it belongs). Also note that if a directory of stepwise-change rasters is provided, the rasters’ filenames must begin with the integer timesteps at which they should be used during the change event, followed by underscores. (For example, files with the filenames ‘50_mat_2001.tif’, ‘60_mat_2011.tif’, ‘65_mat_2011.tif’ would be used at timesteps 50, 60, and 65 during a model.) Defaults to a dummy file name that must be changed.
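The filename convention described above can be checked with a few lines of Python before running a Model. This is a minimal sketch: timestep_from_filename is a hypothetical helper, not a Geonomics function, and it assumes only the stated rule that each filename begins with an integer timestep followed by an underscore.

```python
def timestep_from_filename(filename):
    """Return the integer that precedes the first underscore in a filename."""
    return int(filename.split('_', 1)[0])

# The example filenames from the text, deliberately out of order.
filenames = ['60_mat_2011.tif', '50_mat_2001.tif', '65_mat_2011.tif']
timesteps = sorted(timestep_from_filename(f) for f in filenames)
print(timesteps)

# Values that the start_t, end_t and n_steps parameters would have to
# take for this directory of files.
start_t, end_t, n_steps = timesteps[0], timesteps[-1], len(timesteps)
print(start_t, end_t, n_steps)
```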
start_t
#starting timestep of event
'start_t': 50,
int
default: 50
reset? P
This indicates the first timestep of the Landscape-change event. Defaults to 50, but should be set to suit your specific scenario. If a directory of files is provided for the change_rast parameter, then this must match the earliest timestep in that series of files (as indicated by the integers at the beginning of the file names).
end_t
#ending timestep of event
'end_t': 100,
int
default: 100
reset? P
This indicates the last timestep of the Landscape-change event. Defaults to 100, but should be set to suit your specific scenario. If a directory of files is provided for the change_rast parameter, then this must match the final timestep in that series of files (as indicated by the integers at the beginning of the file names).
n_steps
#number of stepwise changes in event
'n_steps': 5,
int
default: 5
reset? P
This indicates the number of stepwise changes to use to model a Landscape-change event. If the change_rast parameter is a directory of files, then the value of this parameter must be the number of files in that directory. If the change_rast parameter is either an np.ndarray or a file name, then the changes during the Landscape-change event are linearly interpolated (cellwise for the whole Layer) to this number of discrete, instantaneous Landscape changes between the starting and ending rasters. Thus, the fewer the steps, the larger each change will be. So more steps may be ‘better’, as they will better approximate change that is continuous in time. However, there is a potentially significant memory trade-off here: the whole series of stepwise-changed arrays is computed when the Model is created, then saved and used at the appropriate timestep during each Model run (and if the Layer that is changing is used by any Species as a _ConductanceSurface, then each intermediate _ConductanceSurface is also calculated when the Model is first built, which can be much more memory-intensive because these are 3-dimensional arrays). These objects take up memory, which may be limiting for larger Models and/or Landscape objects. This often will not be a major issue, but depending on your use case it could pose a problem, so it is worth considering.
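To make the cellwise linear interpolation concrete, the sketch below builds the intermediate rasters for a tiny 2x2 Layer using nested lists. It is an assumption-laden illustration: interpolate_steps is a hypothetical helper, and the choice to exclude the starting raster and end exactly on the final raster is made here for simplicity, since the text does not specify how Geonomics spaces its steps.

```python
def interpolate_steps(start, end, n_steps):
    """Return n_steps rasters moving linearly, cell by cell, from start to end.

    The last raster returned equals `end`; `start` itself is not included.
    """
    steps = []
    for k in range(1, n_steps + 1):
        frac = k / n_steps
        steps.append([[s + frac * (e - s) for s, e in zip(s_row, e_row)]
                      for s_row, e_row in zip(start, end)])
    return steps

start = [[0.0, 0.5], [1.0, 0.5]]  # hypothetical starting raster
end = [[1.0, 0.5], [0.0, 0.0]]    # hypothetical final raster
steps = interpolate_steps(start, end, n_steps=5)

print(len(steps))   # one stored array per step
print(steps[0])     # first intermediate raster
print(steps[-1])    # final raster of the event
```

Every step is a full copy of the raster, which is why a larger n_steps costs proportionally more memory.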
Community and Species parameters
Species
spp_<n>
#spp name (SPECIES NAMES MUST BE UNIQUE!)
'spp_0' : {
{str, int}
default: spp_<n>
reset? P
This parameter defines the name for each Species. (Note that unlike most parameters, this parameter is a dict key, the value for which is a dict of parameters defining the Species being named.) As the capitalized reminder in the parameters file states, each Species must have a unique name (so that a parameterized Species isn’t overwritten in the ParametersDict by a second, identically-named Species); Geonomics checks for unique names and throws an Error if this condition is not met. Species names can, but needn’t, be descriptive of what each Species represents. Example valid values include: 0, ‘spp0’, ‘high-dispersal’, ‘C. fasciata’. Names default to spp_<n>, where n is a series of integers starting from 0 and counting the number of Species.
Init
N
'init': {
#starting number of individs
'N': 250,
int
default: 250
reset? P
This defines the starting size of this Species. Importantly, this may or may not be near the stationary size of the Species after the Model has burned in, because that size will depend on the carrying-capacity raster (set by the K_layer and K_factor parameters), and on the dynamics of a specific Model (because of the interaction of its various parameters).
K_layer
#name of the carrying-capacity Layer
'K_layer': 'layer_0',
str
default: ‘layer_0’
reset? P
This indicates, by name, the Layer to be used as the carrying-capacity raster for a Species. The values of this Layer, multiplied by K_factor, should express the carrying capacity at each cell, in number of Individuals. Note that the sum of the values of the product of this Layer and K_factor can serve as a rough estimate of the expected stationary number of individuals of a Species; however, observed stationary size could vary substantially depending on various other Model parameters (e.g. birth and death rates and mean number of offspring per mating event) as well as on stochastic events (e.g. failure to colonize, or survive in, all habitable portions of the Landscape).
K_factor
#multiplicative factor for carrying-capacity layer
'K_factor': 1,
{int, float}
default: 1
reset? P
This defines the factor by which the raster of the Layer indicated by K_layer will be multiplied to create a Species’ carrying-capacity raster. Because Layers’ rasters are constrained to [0,1], this allows the user to stipulate that cells have carrying capacities in excess of 1.
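The relationship between K_layer, K_factor and the rough estimate of stationary population size can be worked through by hand. The sketch below uses a hypothetical 2x2 raster and a hypothetical K_factor of 10; as noted above, the observed stationary size in a real Model can differ substantially from this sum.

```python
K_layer = [[0.0, 0.5], [1.0, 0.5]]  # hypothetical Layer raster, values in [0, 1]
K_factor = 10                       # hypothetical multiplicative factor

# Carrying capacity of each cell, in number of Individuals.
K = [[cell * K_factor for cell in row] for row in K_layer]
print(K)

# Summing over all cells gives a rough estimate of the stationary size.
expected_N = sum(sum(row) for row in K)
print(expected_N)
```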
msprime
# params for 1+ msprime pops for sampling starting individs
'msprime': {
# index of msprime source pop
0: {
# number of individs to sample from pop
100: {
# 1x2 coord pair or nx2 array of coord pairs of individs
'coords': [0, 0],
# valid kwargs dict for Model.add_individuals
# 'source_msprime_params' argument...
'recomb_rate': 0.5,
'mut_rate': 0.0001,
'demography': None,
'population_size': None,
'ancestry_model': None,
'random_seed': None,
}
},
{dict, None}
default: values displayed above
reset? Y
This block of parameters can be used to set up one or more msprime-simulated populations from which all starting Individuals for a Geonomics Model are sourced (replacing the Individuals simulated during a Geonomics burn-in). Each dict keyed to a serial integer defines a separate population. The only key in that dict is the number of Individuals to draw from that population, and the value is a dict of parameters that includes: 1.) the starting coordinates to be assigned to the Individuals (either a single coordinate pair at which all Individuals should be placed or an nx2 numpy.ndarray containing a coordinate pair for each Individual); and 2.) a series of parameters that can be fed as kwargs into the ‘source_msprime_params’ parameter of the Model.add_individuals method. The msprime functionality provided through the Model.add_individuals method is a wrapper around msprime.sim_ancestry, but it only allows a certain subset of parameters to be fed through to that msprime function (with the remaining parameters, e.g., sequence_length, being drawn directly from the Geonomics parameters file, to ensure compatibility); see help(Model.add_individuals) for details on which kwargs can be specified and used. Also, please note that msprime is highly flexible, allowing for many different combined choices of demography and ancestry models, so it is the user’s responsibility to ensure that they do not stipulate msprime parameters that lead to nonsensical scenarios (e.g., generating a number of msprime-derived Geonomics Individuals that is larger than the census size defined for the msprime population).
Mating
repro_age
'mating' : {
#age(s) at sexual maturity (if tuple, female first)
'repro_age': 0,
{int, (int, int), None}
default: 0
reset? P
This defines the age at which Individuals in the Species can begin to reproduce. If the value provided is a 2-tuple of different numbers (and the Species uses separate sexes), then the first number will be used as females’ reproductive age, the second as males’. If the value is 0, or None, Individuals are capable of reproduction from the time of birth.
sex
#whether to assign sexes
'sex': False,
bool
default: False
reset? P
This determines whether Individuals will be assigned separate sexes that are used to ensure only male-female mating events.
sex_ratio
#ratio of males to females
'sex_ratio': 1/1,
{float, int}
default: 1/1
reset? P
This defines the ratio of males to females (i.e. it will be converted to a probability that an offspring is a male, which is used as the probability of a Bernoulli draw of that offspring’s sex).
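As a worked example of that conversion: a male-to-female ratio r corresponds to a probability r / (r + 1) that any one offspring is male. The sketch below assumes that standard conversion (the text does not show Geonomics’ exact code), and prob_male is a hypothetical helper name.

```python
import random

def prob_male(sex_ratio):
    """Convert a male:female ratio into the probability that an offspring is male."""
    return sex_ratio / (sex_ratio + 1)

print(prob_male(1/1))  # the default ratio -> 0.5
print(prob_male(3))    # three males per female -> 0.75

# One Bernoulli draw per offspring, using the default ratio.
rng = random.Random(42)  # seeded only so the sketch is repeatable
p = prob_male(1/1)
sexes = ['male' if rng.random() < p else 'female' for _ in range(10)]
print(sexes)
```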
R
#intrinsic growth rate
'R': 0.5,
float
default: 0.5
reset? P
This defines a Species’ intrinsic growth rate, which is used as the ‘R’ value in the spatialized logistic growth equation that regulates population density (\(\frac{\mathrm{d}N_{x,y}}{\mathrm{d}t}=RN_{x,y}\left(1-\frac{N_{x,y}}{K_{x,y}}\right)\)).
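Evaluating the logistic growth equation at a single cell shows how density is regulated around the carrying capacity. The sketch below plugs hypothetical values of N and K into the equation with the default R of 0.5; logistic_growth is an illustrative helper, not a Geonomics function.

```python
def logistic_growth(N, K, R):
    """Rate of change of local population size, dN/dt = R * N * (1 - N / K)."""
    return R * N * (1 - N / K)

R = 0.5  # the default value of the R parameter

print(logistic_growth(N=5, K=10, R=R))   # below K: positive growth (1.25)
print(logistic_growth(N=10, K=10, R=R))  # at K: no net change (0.0)
print(logistic_growth(N=20, K=10, R=R))  # above K: decline (-10.0)
```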
b
#intrinsic birth rate (MUST BE 0<=b<=1)
'b': 0.2,
float
in interval [0, 1]
default: 0.2
reset? P
This defines a Species’ intrinsic birth rate, which is implemented as the probability that an identified potential mating pair successfully produces offspring. Because this is a probability, as the capitalized reminder in the parameters file mentions, this value must be in the inclusive interval [0, 1].
NOTE: this may later need to be re-implemented to allow for spatial variation in intrinsic rate (i.e. expression of a birth-rate raster), and/or for density-dependent birth as well as mortality.
n_births_distr_lambda
#expectation of distr of n offspring per mating pair
'n_births_distr_lambda': 1,
{float, int}
default: 1
reset? P
This defines the lambda parameter for the Poisson distribution from which a mating pair’s number of offspring is drawn (unless n_births_fixed is set to True, in which case it defines the number of offspring produced by each successful mating event). Hence, this is either the expected or exact value for the number of offspring born in a successful mating event (depending on how n_births_fixed is set).
n_births_fixed
#whether n births should be fixed at n_births_distr_lambda
'n_births_fixed': True,
bool
default: True
reset? P
This determines whether or not the number of births for each mating event will be fixed. If set to True, each successful mating event will produce n_births_distr_lambda new offspring.
mating_radius
#radius of mate-search area (None, for panmixia)
'mating_radius': 1,
{float, int, None}
default: 1
reset? Y
This defines the radius within which an Individual can find a mate. This radius is provided to queries run on the _KDTree object. (If set to None then true panmixia will be used, i.e. each Individual, with probability equal to its Species’ birth rate, will choose any other individual in the population as its mate, after which the chosen pair will then go through age- and sex-eligibility checks as needed given the parameterization.)
choose_nearest_mate
#whether individs should choose nearest neighs as mates
'choose_nearest_mate': False,
bool
default: False
reset? P
This determines whether or not each Individual will always choose its nearest neighbor as a mate. Defaults to False, allowing each focal Individual to randomly choose from among all other Individuals occurring within its mating radius. (In that case, if inverse_dist_mating is False then all other nearby Individuals will have equal probability of being chosen; if inverse_dist_mating is True, then other Individuals will have probabilities linearly related to their inverse distance from the focal Individual.) (Note that this parameter will only be used if mating_radius is not None.)
inverse_dist_mating
#whether mate-choice should be inverse distance-weighted
'inverse_dist_mating': False,
bool
default: False
reset? P
This determines whether or not each focal Individual will use the inverse of the distance between itself and all other Individuals occurring within its mating radius to weight the mutually exclusive probabilities of choosing each of those other Individuals as a mate. If False, then each other Individual within the focal Individual’s mating radius has a uniform probability of being chosen as a mate. (Note that this parameter will only be used if choose_nearest_mate is False and mating_radius is not None.)
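The difference between uniform and inverse-distance-weighted mate choice can be illustrated with a few distances. This is a sketch under assumptions, not the Geonomics implementation: mate_probabilities is a hypothetical helper, the weights are taken to be exactly 1/distance and normalized to sum to 1, and all distances are assumed to be positive.

```python
def mate_probabilities(distances, inverse_dist_mating=False):
    """Return the probability of choosing each candidate mate.

    distances: positive distances from the focal Individual to each candidate
    within its mating radius.
    """
    if inverse_dist_mating:
        weights = [1 / d for d in distances]
    else:
        weights = [1.0] * len(distances)
    total = sum(weights)
    return [w / total for w in weights]

dists = [1.0, 2.0, 4.0]  # hypothetical distances to three candidates

uniform = mate_probabilities(dists)
weighted = mate_probabilities(dists, inverse_dist_mating=True)
print(uniform)   # every candidate equally likely
print(weighted)  # nearer candidates more likely
```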
Mortality
max_age
#maximum age
'max_age': 1,
{int, None}
default: 1
reset? P
This defines the maximum age an individual can achieve before being forcibly culled from the Species. Defaults to 1 (which will create a Wright-Fisher-like simulation, with discrete generations). Can be set to any other age, or can be set to None (in which case no maximum age is enforced).
d_min
#min P(death) (MUST BE 0<=d_min<=1)
'd_min': 0,
float in interval [0, 1]
default: 0
reset? N
This defines the minimum probability of death that an Individual can face each time its Bernoulli death-decision is drawn. Because this is a probability, as the capitalized reminder in the parameters file mentions, this value must be in the inclusive interval [0, 1].
d_max
#max P(death) (MUST BE 0<=d_max<=1)
'd_max': 1,
float in interval [0, 1]
default: 1
reset? N
This defines the maximum probability of death that an Individual can face each time its Bernoulli death-decision is drawn. Because this is a probability, as the capitalized reminder in the parameters file mentions, this value must be in the inclusive interval [0, 1].
dens_grid_window_width
'mortality' : {
#width of window used to estimate local pop density
'dens_grid_window_width': None,
{float, int, None}
default: None
reset? N
This defines the width of the window used by the _DensityGridStack to estimate a raster of local Species densities. The user should feel free to set different values for this parameter (which could be especially helpful when calling Model.plot_density to inspect the resulting surfaces calculated at different window widths, if trying to heuristically choose a reasonable value to set for a particular simulation scenario). But be aware that choosing particularly small window widths (in our experience, windows smaller than ~1/20th of the larger Landscape dimension) can cause dramatic increases in the run-time of the density calculation (which runs twice per timestep). Defaults to None, which will internally be set to the integer nearest to 1/10th of the larger Landscape dimension; for many purposes this will work, but in some cases the user may wish to control this.
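The default described above is simple to compute by hand when deciding whether to override it. The sketch below follows the stated rule (the integer nearest to 1/10th of the larger Landscape dimension); default_window_width is a hypothetical helper, and the exact rounding Geonomics uses internally is assumed.

```python
def default_window_width(dim):
    """Integer nearest to 1/10th of the larger Landscape dimension."""
    return int(round(0.1 * max(dim)))

print(default_window_width((20, 20)))    # default 20x20 Landscape
print(default_window_width((100, 250)))  # a larger, rectangular Landscape

# Widths below roughly 1/20th of the larger dimension are the ones the
# text warns can slow the density calculation dramatically.
slow_threshold = max((100, 250)) / 20
print(slow_threshold)
```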
Movement
move
#whether or not the species is mobile
'move': True,
bool
default: True
reset? P
This determines whether the Species being parameterized is mobile
(i.e. whether its individuals should move). A Species
without movement
will still undergo dispersal of offspring, but after dispersing
those offspring will remain fixed in location until death.
direction_distr_mu
'movement': {
#mode of distr of movement direction
'direction_distr_mu': 1,
{int, float}
default: 1
reset? N
This is the \(\mu\) parameter of the VonMises distribution
(a circularized normal distribution) from which
movement directions are chosen when movement is random and isotropic
(rather than
being determined by a _ConductanceSurface
;
if a _ConductanceSurface
is being used, this parameter is ignored). The default \(\kappa\) value
that is fed into this same distribution (direction_distr_kappa, default 0)
causes it to be very dispersed,
such that the distribution is effectively a uniform distribution on
the unit circle (i.e. all directions are effectively equally probable).
For this reason, changing this parameter without also changing the
direction_distr_kappa value will not change the directions
drawn for movement. If random, isotropic
movement is what you aim to model then there is probably little reason
to change these parameters.
direction_distr_kappa
#concentration of distr of movement direction
'direction_distr_kappa': 0,
{int
, float
}
default: 0
reset? N
This is the \(\kappa\) parameter of the VonMises distribution
(a circularized normal distribution) from which
movement directions are chosen when movement is random and isotropic
(rather than
being determined by a _ConductanceSurface
;
if a _ConductanceSurface
is being used, this parameter is ignored). The default value of 0 will
cause this distribution to be very dispersed, approximating a uniform
distribution on the unit circle and rendering the \(\mu\)
value (direction_distr_mu) effectively meaningless. However, as this
parameter’s value increases the resulting circular distributions will become
more concentrated around \(\mu\), making the value fed to
direction_distr_mu influential. If random, isotropic
movement is what you aim to model then there is probably little reason
to change these parameters.
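The interplay between \(\mu\) and \(\kappa\) can be checked numerically. The sketch below uses NumPy's own von Mises sampler (not Geonomics internals) and a hypothetical helper, mean_resultant_length, to compare draws at the default \(\kappa\) of 0 with draws at a larger, arbitrarily chosen \(\kappa\) of 12.

```python
import numpy as np

rng = np.random.default_rng(42)


def mean_resultant_length(angles):
    """Length of the mean unit vector: near 0 if uniform, near 1 if concentrated."""
    return float(np.hypot(np.cos(angles).mean(), np.sin(angles).mean()))


mu = 1  # direction_distr_mu default

# kappa = 0 (direction_distr_kappa default): effectively uniform on the circle
uniform_like = rng.vonmises(mu, 0, size=10_000)

# a larger kappa (arbitrary choice for illustration) concentrates draws around mu
concentrated = rng.vonmises(mu, 12, size=10_000)

r_uniform = mean_resultant_length(uniform_like)
r_concentrated = mean_resultant_length(concentrated)
```

With \(\kappa\) = 0 the mean resultant length stays close to 0 whatever \(\mu\) is, which is why \(\mu\) only matters once \(\kappa\) is increased.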
movement_distance_distr_param1
#1st param of distr of movement distance
'movement_distance_distr_param1': 0.1,
{int
, float
}
default: 0.1
reset? Y
This is the first parameter of the distribution used to draw movement distances. The values generated by the movement distribution will be expressed in units of raster-cell widths. This parameter and movement_distance_distr_param2 should be set to reflect a distribution of movement distances that is appropriate for your scenario. The distribution to which this parameter applies depends on the value of the movement_distance_distr parameter.
movement_distance_distr_param2
#2nd param of distr of movement distance
'movement_distance_distr_param2': 0.5,
{int
, float
}
default: 0.5
reset? Y
This is the second parameter of the distribution used to draw movement distances. The values generated by the movement distribution will be expressed in units of raster-cell widths. This parameter and movement_distance_distr_param1 should be set to reflect a distribution of movement distances that is appropriate for your scenario. The distribution to which this parameter applies depends on the value of the movement_distance_distr parameter.
movement_distance_distr
#movement distance distr to use ('lognormal','levy','wald')
'movement_distance_distr': 'lognormal',
str
default: ‘lognormal’
reset? Y
This determines whether movement is modeled using a lognormal distribution (‘lognormal’; default), a Lévy distribution (‘levy’), or a Wald distribution (‘wald’).
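To get a feel for what a given pair of distance parameters implies, it can help to draw some distances and inspect them. The sketch below is only an illustration: the function draw_distances is hypothetical, and the way param1 and param2 are mapped onto each distribution's parameters is an assumption, not something stated in this section.

```python
import numpy as np


def draw_distances(distr, param1, param2, size, rng):
    """Draw distances (in raster-cell widths) from the named distribution.

    ASSUMPTION: the mapping of param1/param2 onto each distribution's
    parameters is illustrative only; check Geonomics itself for the
    mapping it actually uses.
    """
    if distr == 'lognormal':
        # assumed: param1 = mean of log-distance, param2 = sd of log-distance
        return rng.lognormal(mean=param1, sigma=param2, size=size)
    if distr == 'wald':
        # assumed: param1 = mean, param2 = scale
        return rng.wald(mean=param1, scale=param2, size=size)
    if distr == 'levy':
        # assumed: param1 = location, param2 = scale; uses the fact that
        # scale / Z**2 is Levy-distributed when Z is standard normal
        z = rng.standard_normal(size)
        return param1 + param2 / z**2
    raise ValueError(f"unknown distribution: {distr!r}")


rng = np.random.default_rng(0)
dists = draw_distances('lognormal', 0.1, 0.5, 10_000, rng)
median_dist = float(np.median(dists))
```

Under the assumed lognormal mapping, the median distance is exp(param1) raster-cell widths, so the defaults correspond to typical moves of roughly one cell.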
dispersal_distance_distr_param1
#1st param of distr of dispersal distance
'dispersal_distance_distr_param1': -1,
{int
, float
}
default: -1
reset? Y
This is the first parameter of the distribution used to draw dispersal distances. The values generated by the dispersal distribution will be expressed in units of raster-cell widths. This parameter and dispersal_distance_distr_param2 should be set to reflect a distribution of dispersal distances that is appropriate for your scenario. The distribution to which this parameter applies depends on the value of the dispersal_distance_distr parameter.
dispersal_distance_distr_param2
#2nd param of distr of dispersal distance
'dispersal_distance_distr_param2': 0.05,
{int
, float
}
default: 0.05
reset? Y
This is the second parameter of the distribution used to draw dispersal distances. The values generated by the dispersal distribution will be expressed in units of raster-cell widths. This parameter and dispersal_distance_distr_param1 should be set to reflect a distribution of dispersal distances that is appropriate for your scenario. The distribution to which this parameter applies depends on the value of the dispersal_distance_distr parameter.
dispersal_distance_distr
#dispersal distance distr to use ('lognormal','levy','wald')
'dispersal_distance_distr': 'lognormal',
str
default: ‘lognormal’
reset? Y
This determines whether dispersal is modeled using a lognormal distribution (‘lognormal’; default), a Lévy distribution (‘levy’), or a Wald distribution (‘wald’).
Movement and Dispersal _ConductanceSurfaces
layer
'move_surf' : {
#move-surf Layer name
'layer': 'layer_0',
str
default: 'layer_0'
reset? P
This indicates, by name, the Layer to be used to construct the
_ConductanceSurface for a Species. Note that this can also
be thought of as the Layer that should serve as a
Species’ permeability raster (because Individuals moving
on this _ConductanceSurface tend to move toward the higher
(if mixture distributions are used) or highest
(if unimodal distributions are used) values in their neighborhoods).
mixture
#whether to use mixture distrs
'mixture': True,
bool
default: True
reset? P
This indicates whether the _ConductanceSurface
should be built using
VonMises mixture distributions or unimodal VonMises distributions.
If True, each cell in the _ConductanceSurface
will have an approximate
circular distribution that is a
weighted sum of 8 unimodal VonMises distributions (one per cell in the 8-cell
neighborhood); each of those summed unimodal distributions will have as its
mode the direction of the neighboring cell on which it is based and as its
weight the relative permeability of the cell on which it is based
(relative to the full neighborhood). If False, each cell in the
_ConductanceSurface
will have an approximated circular distribution
that is a single
VonMises distribution with its mode being the direction of the maximum-valued
cell in the 8-cell neighborhood and its concentration determined by
vm_distr_kappa.
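The two constructions described above can be sketched for a single cell with plain NumPy. This is not the Geonomics implementation: the neighbor ordering, the permeability values, and the helper vonmises_pdf are all assumptions made for illustration.

```python
import numpy as np


def vonmises_pdf(x, mu, kappa):
    """Density of a von Mises distribution at angle(s) x (in radians)."""
    return np.exp(kappa * np.cos(x - mu)) / (2 * np.pi * np.i0(kappa))


# directions (radians) toward the 8 neighbors; this ordering is an assumption
neighbor_dirs = np.arange(8) * (np.pi / 4)
# hypothetical permeability values of those 8 neighboring cells
permeability = np.array([0.1, 0.2, 0.9, 0.4, 0.1, 0.1, 0.3, 0.2])
kappa = 12  # vm_distr_kappa default

angles = np.linspace(-np.pi, np.pi, 5000, endpoint=False)

# mixture=True: weighted sum of 8 unimodal densities, one per neighbor,
# each weighted by that neighbor's relative permeability
weights = permeability / permeability.sum()
mixture_pdf = sum(w * vonmises_pdf(angles, d, kappa)
                  for w, d in zip(weights, neighbor_dirs))

# mixture=False: a single density aimed at the most permeable neighbor
best_dir = neighbor_dirs[permeability.argmax()]
unimodal_pdf = vonmises_pdf(angles, best_dir, kappa)
```

The mixture keeps some probability of moving toward every neighbor, whereas the unimodal version sends nearly all movement toward the single best neighbor.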
vm_distr_kappa
#concentration of distrs
'vm_distr_kappa': 12,
{int
, float
}
default: 12
reset? N
This sets the concentration of the VonMises distributions used to build
the approximated circular distributions in the _ConductanceSurface
.
The default value was chosen heuristically as one that provides a reasonable
concentration in the direction of a unimodal VonMises distribution’s mode
without causing VonMises mixture distributions built from an
evenly weighted sum of distributions pointing toward the
8-cell-neighborhood directions to have 8 pronounced modes.
There will probably be little need to change the default value, but if
interested then the user could create Model
s with various values
of this parameter and then use the Model.plot_movement_surface
method to explore the influence of the parameter on the resulting
_ConductanceSurface
s.
approx_len
#length of approximation vectors for distrs
'approx_len': 5000,
{int
}
default: 5000
reset? P
This determines the length of the vector of values used to approximate each
distribution on the _ConductanceSurface
(i.e. the size of the z-axis
of the np.ndarray
used to hold all the distribution-approximations, where
the y and x axes have the same dimensions as the Landscape
). The default
value of 5000 is fine for many cases, but may need to be
reduced depending on the Landscape
dimensions (because for a larger
Landscape
, say 1000x1000 cells, it would create a
_ConductanceSurface
that is roughly 4Gb,
and if the Layer
on which the _ConductanceSurface
is based will be
undergoing landscape changes then numerous versions of an object of this size
would need to be generated when the Model
is built and held in memory).
The value to use for this parameter will depend on the size of the
Landscape
, the exact scenario being simulated, and the memory of the
machine on which the Model
is to be run.
_GenomicArchitecture
gen_arch_file
'gen_arch': {
#file defining custom genomic arch
'gen_arch_file': None,
{str
, None
}
default: {None, '<your_model_name>_spp-<n>_gen_arch.csv'}
reset? P
This argument indicates whether a custom genomic architecture file should
be used to create a Species
’ GenomicArchitecture
, and if so,
where that file is located. If the value is None
, no file will be
used and the values of this Species
’ other genomic
architecture parameters in the parameters file will be used to create
the GenomicArchitecture
. If the value is a str pointing to a
custom genomic-architecture file, that file will be used instead
(it should be a CSV file with loci as rows and ‘locus’,
‘p’, ‘dom’, ‘r’, ‘trait’, and ‘alpha’ as columns stipulating the
locus numbers, starting
allele frequencies, dominance values, inter-locus recombination rates,
trait names, and effect sizes of all loci; values can be left blank if not applicable).
Geonomics will create an empty
file of this format for each Species
for which the
‘genomes’ argument is given the value ‘custom’ when
gnx.make_parameters_file
is called (which will be saved as
‘<your_model_name>_spp-<n>_gen_arch.csv’).
Note that when Geonomics reads in a custom genomic architecture file
to create a Model
, it will check
that the length (i.e. number of rows) in this file is equal to the length
stipulated by the L parameter, and will also check that the first value
at the top of the ‘r’ column is 0 (which is used to implement independent
assortment during gametogenesis). If either of these checks fails,
Geonomics throws an Error.
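For orientation, here is a minimal sketch of writing a file in the column layout described above and then applying the two checks mentioned in the text. The file name, the number of loci, and every value in the rows are made up for illustration; the checks are written from the description above rather than taken from Geonomics' own code.

```python
import csv
import os
import tempfile

L = 4  # hypothetical; must match the 'L' parameter in the parameters file
columns = ['locus', 'p', 'dom', 'r', 'trait', 'alpha']
rows = [
    # locus, p,   dom, r,    trait,     alpha  (blank = not applicable)
    [0,      0.5, 0,   0,    '',        ''],
    [1,      0.5, 0,   0.01, '',        ''],
    [2,      0.5, 0,   0.01, 'trait_0', 0.1],
    [3,      0.5, 0,   0.01, '',        ''],
]

# write the example file to a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'example_gen_arch.csv')
with open(path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(columns)
    writer.writerows(rows)

# read it back and apply the two checks described in the text
with open(path, newline='') as f:
    records = list(csv.DictReader(f))

n_rows_ok = len(records) == L                # one row per locus
first_r_ok = float(records[0]['r']) == 0     # first 'r' value, per the text above
```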
L
#num of loci
'L': 1000,
int
default: 1000
reset? P
This defines the total number of loci in the genomes in a
Species
.
l_c
#num of chromosomes
'l_c': [100],
list
of int
s
default: [100]
reset? P
This defines the lengths (in number of loci) of each of the chromosomes
in the genomes in a Species
. Note that the sum of this list
must equal L, otherwise Geonomics will throw an Error.
Also note that Geonomics models genomes as single L x 2
arrays, where separate chromosomes are delineated by points along
the genome where the recombination rate is 0.5;
thus, for a model where recombination rates are often at or near 0.5, this
parameter will have little meaning.
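The relationship between L, l_c, and chromosome boundaries can be sketched as follows. The chromosome lengths and the within-chromosome rate are hypothetical, and placing the 0.5 rate at the first locus of each subsequent chromosome is an assumption about where the boundary falls.

```python
import numpy as np

L = 1000
l_c = [750, 250]  # hypothetical chromosome lengths that sum to L

if sum(l_c) != L:
    raise ValueError("the chromosome lengths in l_c must sum to L")

# index of the first locus on each chromosome
chrom_starts = np.concatenate(([0], np.cumsum(l_c)[:-1]))

# illustrative recombination-rate vector: a low rate within chromosomes and
# 0.5 at the start of each chromosome after the first (assumed placement)
r = np.full(L, 1 / L)
r[chrom_starts[1:]] = 0.5
```

If most entries of r were already at or near 0.5, these boundary positions would no longer stand out, which is the situation the text above describes.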
start_p_fixed
#fixed starting allele freq; None/False -> rand; True -> 0.5
'start_p_fixed': 0.5,
{float
, bool
, None
}
default: 0.5
reset? P
If a float
on the interval [0,1] is provided,
that value will be used as the starting allele
frequency at which all loci (except neutral loci,
if start_neut_zero is True) will be fixed.
If None
, the starting allele
frequency of each locus will be drawn as a uniform random variable
between 0 and 1, inclusive.
If a bool
is provided, True
will fix all
loci at the default starting allele frequency of 0.5,
whereas False
will not fix starting
allele frequencies, effectively having the same effect
as None
.
Defaults to 0.5.
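The accepted values can be summarized with a small sketch. The function starting_allele_freqs is hypothetical and simply encodes the rules listed above; it ignores start_neut_zero and is not how Geonomics itself is written.

```python
import numpy as np


def starting_allele_freqs(start_p_fixed, L, rng):
    """Return one starting allele frequency per locus (illustrative only)."""
    # None or False: draw each locus' frequency uniformly on [0, 1]
    if start_p_fixed is None or start_p_fixed is False:
        return rng.uniform(0, 1, size=L)
    # True: fix all loci at the default of 0.5
    # (checked before the float case because bool is a subclass of int)
    if start_p_fixed is True:
        return np.full(L, 0.5)
    # float: fix all loci at the given frequency
    p = float(start_p_fixed)
    if not 0 <= p <= 1:
        raise ValueError("start_p_fixed must lie in the interval [0, 1]")
    return np.full(L, p)


rng = np.random.default_rng(1)
fixed_default = starting_allele_freqs(True, 5, rng)
fixed_custom = starting_allele_freqs(0.2, 5, rng)
random_freqs = starting_allele_freqs(None, 5, rng)
```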
start_neut_zero
#whether to start neutral locus freqs at 0
'start_neut_zero': False,
bool
default: False
reset? P
If True, all neutral loci will start with the ‘1’ allele at a frequency of 0 (i.e. all individuals will be homozygous ‘0’|’0’ at those loci).
mu_neut
#genome-wide per-base neutral mut rate (0 to disable)
'mu_neut': 0,
float
default: 1e-9
reset? P
This defines the genome-wide per-base neutral mutation rate. This value can be set to 0 to disable neutral mutation.
mu_delet
#genome-wide per-base deleterious mut rate (0 to disable)
'mu_delet': 0,
float
default: 0
reset? P
This defines the genome-wide per-base deleterious mutation rate.
This value can be set to 0 to disable deleterious mutation. Note that all
deleterious mutation will fall outside the loci that affect any Trait
s
a Species
may have, and will behave simply as globally
deleterious mutations (i.e. mutations that reduce the mutated
Individual
’s fitness regardless of that Individual
’s
spatial location).
delet_alpha_distr_shape
#shape of distr of deleterious effect sizes
'delet_alpha_distr_shape': 0.2,
float
default: 0.2
reset? P
This defines the shape parameter of the gamma distribution from which the effect sizes of deleterious loci are drawn. (Values drawn will be truncated to the interval [0,1].)
delet_alpha_distr_scale
#scale of distr of deleterious effect sizes
'delet_alpha_distr_scale': 0.2,
float
default: 0.2
reset? P
This defines the scale parameter of the gamma distribution from which the effect sizes of deleterious loci are drawn. (Values drawn will be truncated to the interval [0,1].)
r_distr_alpha
#alpha of distr of recomb rates
'r_distr_alpha': None,
{float
, None
}
default: None
reset? P
This defines the alpha parameter of the beta distribution from which interlocus recombination rates are drawn. (Values drawn will be truncated to the interval [0, 0.5].) If r_distr_beta is None, recombination rates will be fixed at this value. Defaults to None, and r_distr_beta defaults to None, such that all recombination rates will be fixed at a value (1/L) that yields approximately 1 expected recombination event per gamete per generation.
r_distr_beta
#beta of distr of recomb rates
'r_distr_beta': None,
{float
, None
}
default: None
reset? P
This defines the beta parameter of the beta distribution from which interlocus recombination rates are drawn. (Values drawn will be truncated to the interval [0, 0.5].) Defaults to None, which will fix recombination rates at the value of r_distr_alpha (if that is not None), or else will fix all rates at a value (1/L) that yields approximately 1 expected recombination event per gamete per generation (if r_distr_alpha is also None).
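The combinations of r_distr_alpha and r_distr_beta described above can be sketched as one function. The name draw_recomb_rates is hypothetical, truncation is implemented here by clipping (an assumption), and the case where only r_distr_beta is given is not covered by the text, so the sketch simply rejects it.

```python
import numpy as np


def draw_recomb_rates(L, r_distr_alpha=None, r_distr_beta=None, rng=None):
    """Return L interlocus recombination rates (illustrative only)."""
    if r_distr_beta is None:
        # fixed rates: r_distr_alpha if given, else 1/L
        fixed = 1 / L if r_distr_alpha is None else r_distr_alpha
        return np.full(L, float(fixed))
    if r_distr_alpha is None:
        # not described in the text; rejected in this sketch
        raise ValueError("r_distr_alpha is required when r_distr_beta is set")
    rng = np.random.default_rng() if rng is None else rng
    rates = rng.beta(r_distr_alpha, r_distr_beta, size=L)
    # assumption: 'truncated' is implemented here as clipping to [0, 0.5]
    return np.clip(rates, 0, 0.5)


default_rates = draw_recomb_rates(1000)
fixed_rates = draw_recomb_rates(10, r_distr_alpha=0.1)
drawn_rates = draw_recomb_rates(1000, 2, 5, rng=np.random.default_rng(1))
```

With both parameters left at None, the rates sum to 1 across the genome, which corresponds to the roughly one expected recombination event per gamete mentioned above.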
dom
#whether loci should be dominant (for allele '1')
'dom': False,
bool
default: False
reset? P
This indicates whether loci should be treated as dominant (if True) for the ‘1’ allele or as codominant (if False). Codominance is the default behavior, because it is assumed that Geonomics will often be used to model quantitative traits, for which this is a reasonable assumption.
pleiotropy
#whether to allow pleiotropy
'pleiotropy': False,
bool
default: False
reset? P
This indicates whether pleiotropy should be allowed. If True, loci will be
permitted to contribute to more than one Trait
.
recomb_rate_custom_fn
#custom fn for drawing recomb rates
'recomb_rate_custom_fn': None,
{function
, None
}
default: None
reset? P
This parameter allows the user to provide a custom function according to which
interlocus recombination rates will be assigned. If set to None
, the
default behavior (i.e. recombination rates chosen from a beta distribution
using r_distr_alpha and r_distr_beta) will be used.
n_recomb_paths_mem
#number of recomb paths to hold in memory
'n_recomb_paths_mem': int(1e4),
int
default: int(1e4)
reset? P
This defines the maximum number of recombination paths for Geonomics to hold in
memory at one time. Geonomics models recombination by using the interlocus
recombination rates to draw a large number of recombination ‘paths’
along the Lx2 genome array (when the Model
is first built), and
then shuffling and cycling through those recombination paths as
needed during Model
runs. Of the total number of paths created, some
subset will be held in memory (the number of these is defined by
this parameter), while the remainder will live in a temporary
file (which is occasionally read in whenever the paths in memory are close to
being used up). Thus, to avoid problems, the number provided to this parameter
should be comfortably larger than the largest anticipated number of
recombination paths that will be needed during a single mating event (i.e.
larger than two times the largest anticipated number of offspring to be born
to the Species
during one timestep).
n_recomb_paths_tot
#total number of recomb paths to simulate
'n_recomb_paths': int(1e5),
This defines the total number of recombination paths that Geonomics will
generate. Geonomics models recombination by using the interlocus
recombination rates to draw a large number of recombination ‘paths’
along the Lx2 genome array (when the Model
is first built), and
then shuffling and cycling through those recombination paths as
needed during Model
runs. The larger the total number of these paths
that is created, the more closely Geonomics will model truly
free recombination and the more precisely it will model the exact
interlocus recombination rates defined in a Species
’
GenomicArchitecture
.
allow_ad_hoc_recomb
#whether to generate recombination paths at each timestep
'allow_ad_hoc_recomb': False,
bool
default: False
reset? P
This determines whether or not recombinants should be drawn each timestep (rather than recombination paths being drawn and stored when a model is first built, then used randomly throughout the model run). This is advantageous because it models recombination exactly (rather than approximating recombination by drawing some number of fixed recombination paths that get repeatedly used), and because, for combinations of larger genome sizes (L) and larger mean population sizes (N), it avoids the memory used by storing so many recombination paths drawn at model creation (thus making these parameterizations feasible on computers with memory limitations). It is disadvantageous, however, because it runs somewhat slower than the default approach (recombinants drawn at model creation) for a range of L and N values, and also because it is only available for parameterizations with homogeneous recombination across the genome.
jitter_breakpoints
#whether to jitter recomb bps, to correctly track num_trees
'jitter_breakpoints': False,
bool
default: False
reset? P
This determines whether or not the recombination breakpoints stored by the tskit TableCollection should be slightly jittered off of their x.5 default positions. Enabling this will render each recombination effectively unique, and thus will allow the tskit.TreeSequence to correctly report the number of trees (TreeSequence.num_trees). However, it will do this at the expense of additional memory usage, which could potentially be limiting. Thus, this should be set to True if and only if the TreeSequence information will be used in a way that requires accurate representation of the number of trees.
mut_log
#whether to save mutation logs
'mut_log': None,
{str
, None
}
default: None
reset? P
This indicates the location of the mutation-log file where Geonomics should
save a record of each mutation that occurs for a Species
, for each iteration. If None
, no mutation log
will be created and written to.
use_tskit
#whether to use tskit (to record full spatial pedigree)
'use_tskit': True,
bool
default: True
reset? P
This indicates whether Geonomics should use the tskit
API
to store individuals’ genomes in a tskit.TableCollection
.
If tskit
is used, the full spatial pedigree of the current population
will be available at any time, and each Individual will carry a numpy.ndarray
containing just its genotypes for any non-neutral loci.
If tskit
is not used, each Individual will carry a numpy.ndarray
containing its genotypes for all simulated loci (neutral and non-neutral).
Defaults to using tskit
(True),
but may need to be switched to False for models
using large numbers of independent loci
and not needing access to the spatial pedigree
(because larger numbers of independent loci cause a geometric
expansion of the number of trees stored in the tskit.TreeSequence
,
requiring significantly more memory overall,
and significantly greater runtime to store them at each time step).
tskit_simp_interval
#time step interval for simplification of tskit tables
'tskit_simp_interval': 100,
int
default: 100
reset? N
This sets the interval, in timesteps,
between subsequent tskit
simplifications.
Defaults to simplifying every 100 timesteps, as suggested by the
tskit package authors. This most likely need not be changed, but for simulations
with especially large population and/or genome sizes the user may
wish to experiment with reducing this interval so as to improve performance.
Traits
trait_<n>
#trait name (TRAIT NAMES MUST BE UNIQUE!)
'trait_0' : {
{str
, int
}
default: trait_<n>
reset? P
This parameter defines the name for each Trait
.
(Note that unlike most parameters, this parameter is a dict
key,
the value for which is a dict
of parameters defining the Trait
being named.) As the capitalized
reminder in the parameters file states, each Trait
must have a unique name (so that a parameterized
Trait
isn’t overwritten in the ParametersDict
by a
second, identically-named Trait
); Geonomics
checks for unique names and throws an Error if this condition is not met.
Trait
names can, but needn’t be, descriptive of what each
Trait
represents. Example valid values include: 0, ‘trait0’,
‘tmp_trait’, ‘bill length’. Names default to trait_<n>
,
where n is a series of integers starting from 0 and counting the
number of Trait
s for this Species
.
layer
#trait-selection Layer name
'layer': 'layer_0',
str
default: 'layer_0'
reset? P
This indicates, by name, the Layer
that serves as the selective force
acting on this Trait
. (For example, if this Trait is selected upon by
annual mean temperature, then the name of the Layer
representing annual mean temperature should be provided here.)
phi
#phenotypic selection coefficient
'phi': 0.05,
{float
, np.ndarray
of float
s}
default: 0.05
reset? P
This defines the phenotypic selection coefficient on this Trait
(i.e.
the selection coefficient acting on the phenotypes, rather than the genotypes,
of this Trait
). The effect of this value can be thought of as the
reduction (from 1) in an Individual
’s survival probability when that
Individual
is maximally unfit (i.e. when that Individual
has a
phenotypic value of 1.0 but is located in a location with an environmental
value of 0.0, or vice versa). When the value is a float
then the
strength of selection will be the same for all locations on the
Landscape
. When the value is an np.ndarray
of
equal dimensions to the Landscape
then the strength of
selection will vary across space, as indicated by the values in this array
(what Geonomics refers to as a “spatially contingent” selection regime).
Importantly, most individuals will experience selection on a given trait that
is only a fraction of the strength dictated by this parameter,
because an individual’s fitness for a given trait
is determined by the product of the trait’s selection coefficient
and the individual’s degree of mismatch to its local environment,
and it is often unlikely that individuals would occur in environments
that are completely opposed to those individuals’ phenotypic values
(e.g. a 0-valued individual in a 1-valued environmental cell).
Because of this, selection coefficients that would be considered ‘strong’
in classical, aspatial population genetics models will tend to behave
less strongly in Geonomics models.
(For mathematical detail, see the Selection section.)
n_loci
#number of loci underlying trait
'n_loci': 1,
int
default: 10
reset? P
This defines the number of loci that should contribute to the phenotypes
of this Trait
. These loci will be randomly drawn from across the
genome.
mu
#mutation rate at loci underlying trait
'mu': 1e-9,
float
default: 1e-9
reset? P
This defines the mutation rate for this Trait
(i.e. the rate at which
mutations that affect the phenotypes of this Trait
will arise). Set to
0 to disable mutation for this Trait
.
alpha_distr_mu
#mean of distr of effect sizes
'alpha_distr_mu' : 0.1,
float
default: 0.1
reset? N
This defines the mean of the normal distribution from which a Trait
’s
initially parameterized loci and new mutations’ effect sizes are drawn (with
the exception of monogenic traits, whose starting locus always has an alpha
value of 0.5, but whose later mutations are influenced by this parameter).
For effect sizes drawn from a distribution, it is recommended
to set this value to 0 and adjust alpha_distr_sigma.
For fixed effect sizes, set this value to the fixed
effect size and set alpha_distr_sigma to 0; effects will alternate
between positive and negative when they are assigned to loci.
In either case, new mutations in a Trait
will then be equally likely to decrease or increase Individual
s’
phenotypes from the multigenic baseline phenotype of 0.5 (which is also
the central value on a Geonomics Landscape
).
It is also recommended that the user consider the number of loci for a trait
when setting the fixed or distributed effect sizes; for example, for a trait
with 10 underlying loci, an average or fixed absolute effect size of 0.1
will enable phenotypes that cover the range of values on a
Geonomics Landscape
(i.e. phenotypes 0 <= z <= 1), whereas
0.05 will likely not enable that full range of phenotypes, and 0.5 will
generate many phenotypes that fall outside that range and will be selected
against at all locations on the Landscape
.
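The effect-size guidance above can be checked with a little arithmetic. The sketch below assumes that a phenotype equals the baseline of 0.5 plus the sum, over loci, of each locus' effect size multiplied by a genotype between 0 and 1, with fixed effects alternating in sign; the function phenotype_range is hypothetical.

```python
def phenotype_range(alpha, n_loci, baseline=0.5):
    """Return the (min, max) phenotype reachable with fixed effects of size alpha.

    ASSUMPTION: phenotype = baseline + sum(effect * genotype), where each
    genotype lies in [0, 1] and effects alternate between +alpha and -alpha.
    """
    effects = [alpha if i % 2 == 0 else -alpha for i in range(n_loci)]
    # lowest phenotype: only the negative-effect loci carry the '1' allele
    z_min = baseline + sum(e for e in effects if e < 0)
    # highest phenotype: only the positive-effect loci carry the '1' allele
    z_max = baseline + sum(e for e in effects if e > 0)
    return z_min, z_max


range_010 = phenotype_range(0.1, 10)   # spans the Landscape's 0-1 range
range_005 = phenotype_range(0.05, 10)  # covers only part of that range
range_050 = phenotype_range(0.5, 10)   # extends well outside that range
```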
alpha_distr_sigma
#variance of distr of effect size
'alpha_distr_sigma': 0,
float
default: 0
reset? P
This defines the standard deviation of the normal distribution from which
a Trait
’s new mutations’ effect sizes are drawn.
For effect sizes drawn from a distribution, it is recommended
to set this value to some nonzero number
and set alpha_distr_mu to 0. For fixed effect sizes,
set this value to 0 and set alpha_distr_mu to the fixed effect size;
effects will alternate between positive and negative when they are
assigned to loci. In either case, new mutations in a Trait
will then be equally likely to decrease or increase Individual
s’
phenotypes from the multigenic baseline phenotype of 0.5 (which is also
the central value on a Geonomics Landscape
).
max_alpha_mag
#max allowed magnitude for an alpha value
'max_alpha': None,
{float
}
default: None
reset? P
This defines the maximum value that can be drawn for a locus’ effect size (i.e. alpha). Defaults to None, but the user may want to set this to some reasonable value, to prevent chance creation of loci with extreme effects.
gamma
#curvature of fitness function
'gamma': 1,
{int
, float
}
default: 1
reset? N
This defines the curvature of the fitness function (i.e.
how fitness decreases as the absolute difference between an
Individual
’s optimal and actual phenotypes increases). The user
will probably have no need to change this from the default value of 1
(which causes fitness to decrease linearly around the optimal
phenotypic value). Values < 1 will cause the fitness function to be
concave up; values > 1 will cause it to be concave down.
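To make the roles of phi and gamma concrete, the sketch below assumes that single-trait fitness takes the form 1 - phi * |e - z|**gamma, where z is an Individual's phenotype and e is the environmental value at its location. This form is consistent with the descriptions of phi and gamma in this section, but it is an assumption here and should be checked against the Selection section.

```python
def fitness(z, e, phi=0.05, gamma=1):
    """Assumed single-trait fitness: 1 - phi * |e - z| ** gamma."""
    return 1 - phi * abs(e - z) ** gamma


# maximally unfit: phenotype 1.0 in an environment of 0.0
w_worst = fitness(1.0, 0.0)

# perfectly matched to the local environment
w_best = fitness(0.5, 0.5)

# same moderate mismatch (0.25) under different curvatures
w_gamma_low = fitness(0.75, 0.5, gamma=0.5)
w_gamma_high = fitness(0.75, 0.5, gamma=2)
```

Under this assumed form, a moderate mismatch costs more fitness when gamma is below 1 than when it is above 1, and the full cost phi is only paid at the maximum mismatch of 1.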
univ_adv
#whether the trait is universally advantageous
'univ_adv': False
bool
default: False
reset? P
This indicates whether selection on a Trait
should be
universal (i.e. whether a phenotype of 1 should be optimal everywhere
on the Landscape
. When set to True, selection on the Trait
will be directional on the entire Species
, regardless
of Individual
s’ spatial contexts.
Species change
Demographic change
kind
#kind of event {'monotonic', 'stochastic',
#'cyclical', 'custom'}
'kind': 'monotonic',
{'monotonic'
, 'stochastic'
, 'cyclical'
, 'custom'
}
default: 'monotonic'
reset? P
This indicates what type of demographic change is being parameterized. Each event has a certain length (in timesteps; defined by the start and end parameters). Note that of the other parameters in this section, only those that are necessary to parameterize the type of change event indicated here will be used.
In 'monotonic'
change events, a Species
’
carrying capacity raster (K) is multiplied by a constant factor
(rate) at each timestep during the event.
In 'stochastic'
change events, K fluctuates
around the baseline value (i.e. the K-raster at the time that the change event
begins) at each required timestep during the event (where the sizes of the
fluctuations are drawn from the distribution indicated by
distr, the floor and ceiling on those sizes are set by
size_range, and the required timesteps are determined by interval).
In 'cyclical'
change events, K undergoes a number (indicated
by n_cycles) of sinusoidal cycles between some minimum and maximum
values (indicated by size_range).
In 'custom'
change events, the baseline K is multiplied by a series
of particular factors (defined by sizes) at a series of particular
timesteps (defined by timesteps).
start_t
#starting timestep
'start_t': 50,
int
default: 50
reset? P
This indicates the timestep at which the demographic change event should start.
end_t
#ending timestep
'end_t': 100,
int
default: 100
reset? P
This indicates the last timestep of the change event.
rate
#rate, for monotonic change
'rate': 1.02,
float
default: 1.02
reset? P
This indicates the rate at which a 'monotonic'
change event should occur.
At each timestep during the event, a new carrying capacity raster (K)
will be calculated by multiplying the previous step’s K by this factor.
Thus, values should be expressed relative to 1.0 indicating no change.
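Because the factor is applied at every timestep, its effect compounds. The sketch below uses a hypothetical K raster and assumes that the change is applied 50 times (whether both start_t and end_t are included is not stated here).

```python
import numpy as np

K = np.full((20, 20), 2.0)  # hypothetical carrying-capacity raster
rate = 1.02                 # default 'rate'
n_steps = 50                # assumed number of timesteps in the event

# multiply the previous step's K by the rate at each timestep
K_end = K.copy()
for _ in range(n_steps):
    K_end = K_end * rate

# equivalent closed form: the total factor is rate ** n_steps
total_factor = rate ** n_steps
```

Even a rate that looks small compounds quickly: under these assumptions a 2% increase per timestep multiplies K by more than 2.5 over the event.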
interval
#interval of changes, for stochastic change
'interval': 1,
int
default: 1
reset? P
This indicates the interval at which fluctuations should occur during a
'stochastic'
change event (i.e. the number of timesteps to wait
between fluctuations).
distr
#distr, for stochastic change {'uniform', 'normal'}
'distr': 'uniform',
{'uniform'
, 'normal'
}
default: 'uniform'
reset? P
This indicates the distribution from which to draw the sizes of
fluctuations in a 'stochastic'
change event. Valid options are
‘uniform’ and ‘normal’.
n_cycles
#num cycles, for cyclical change
'n_cycles': 10,
int
default: 10
reset? P
This indicates the number of cyclical fluctuations that should occur during
a 'cyclical'
change event.
size_range
#min & max sizes, for stochastic & cyclical change
'size_range': (0.5, 1.5),
tuple
of float
s
default: (0.5, 1.5)
reset? P
This defines the minimum and maximum sizes of fluctuations that can occur
during 'stochastic'
and 'cyclical'
change events.
timesteps
#list of timesteps, for custom change
'timesteps': [50, 90, 95],
list
of int
s
default: [50, 90, 95]
reset? P
This defines the series of particular timesteps at which fluctuations should
occur during a 'custom'
change event.
sizes
#list of sizes, for custom change
'sizes': [2, 5, 0.5],
list
of float
s
default: [2, 5, 0.5]
reset? P
This defines the series of particular fluctuations that should occur
during a 'custom'
change event.
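The pairing of timesteps and sizes in a 'custom' change event can be sketched as below, following the description that the baseline K is multiplied by each factor at its corresponding timestep. The K raster is hypothetical, and the equal-length check is an assumption about what a sensible parameterization requires.

```python
import numpy as np

K_baseline = np.full((20, 20), 2.0)  # hypothetical baseline K raster
timesteps = [50, 90, 95]             # default 'timesteps'
sizes = [2, 5, 0.5]                  # default 'sizes'

if len(timesteps) != len(sizes):
    raise ValueError("timesteps and sizes must have the same length")

# K raster scheduled for each timestep: baseline K times that step's factor
K_schedule = {t: K_baseline * s for t, s in zip(timesteps, sizes)}
```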
Life-history change
<life_hist_param>
#life-history parameter to change
'<life_hist_param>': {
str
default: '<life_hist_param>'
reset? P
This indicates the life-history parameter to be changed by this life-history
change event. (Note that unlike most parameters, this parameter is
a dict
key, the value for which is a dict
of parameters controlling how the life-history parameter that is named
will change.)
timesteps
#list of timesteps
'timesteps': [],
list
of int
s
default: []
reset? P
This indicates the timesteps at which the life-history parameter being changed should change (to the values indicated by vals).
vals
#list of values
'vals': [],
list
of float
s
default: []
reset? P
This indicates the values to which the life-history parameter being changed should change (at the timesteps indicated by timesteps).
Other parameters
Main
T
#total Model runtime (in timesteps)
'T': 100,
int
default: 100
reset? Y
This indicates the total number of timesteps for which the main portion of
a Model
(i.e. the portion after the burn-in has completed) will be run
during each iteration.
burn_T
#min burn-in runtime (in timesteps)
'burn_T': 30,
int
default: 30
reset? P
This indicates the minimum number of timesteps for which a Model
’s
burn-in will run. (Note this is only a minimum because the test for
burn-in completion includes a check that at least this many timesteps have
elapsed, but also includes two statistical checks of stationarity of the
size of each Species
in a Community
.)
num
#seed number
'num': None,
{int
, None
}
default: None
reset? P
This indicates whether or not to set the seeds of the random number
generators (by calling np.random.seed
and random.seed
)
before building and running a Model
. If value is an integer, the seeds
will be set to that value. If value is None
, seeds will not be set.
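The effect of num can be sketched with the two seeding calls named above. The helper set_seeds is hypothetical and is not how Geonomics itself is written; the NumPy call is guarded so that the sketch still runs where NumPy is not installed.

```python
import random

def set_seeds(num):
    """Seed the random number generators if num is an integer.

    Returns True if seeds were set, and False if num is None.
    """
    if num is None:
        return False
    random.seed(num)
    try:
        import numpy as np
        np.random.seed(num)
    except ImportError:
        pass  # NumPy unavailable; only the stdlib generator is seeded
    return True

# Seeding twice with the same value reproduces the same draw.
set_seeds(42)
first = random.random()
set_seeds(42)
second = random.random()
print(first == second)
```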
Iterations
n_its
#num iterations
'n_its': 2,
int
default: 2
reset? Y
This indicates the number of iterations for which the Model
should be run. (Note that for each iteration a separate subdirectory of
data and stats will be written, if your Model
has parameterized data
and stats to be collected.)
rand_landscape
#whether to randomize Landscape each iteration
'rand_landscape': False,
bool
default: False
reset? P
This indicates whether the Landscape
should be randomized for each
iteration. If True, a new Landscape
will be generated at the start
of each iteration. If False, the Landscape
from iteration 0 will be
saved and reused for each subsequent iteration.
rand_comm
#whether to randomize Community each iteration
'rand_comm': False,
bool
default: False
reset? P
This indicates whether the Community
should be randomized for each
iteration. If True, a new Community
will be generated at the start
of each iteration. If False, the Community
from iteration 0 will be
saved and reused for each subsequent iteration (and whether that
Community
is saved before or after being burned in will depend on
the value provided to the repeat_burn parameter).
rand_genarch
#whether to randomize GenomicArchitectures each iteration
'rand_genarch': True,
bool
default: True
reset? N
This indicates whether all Species
with genomes should have their
GenomicArchitectures
randomized (and then genomes redrawn) for each
iteration.
Defaults to True because this, in combination with the other default settings
(rand_landscape False, rand_comm False, repeat_burn False) creates
a reasonable default behavior for the set of iterations in a run:
Each iteration will use the same landscape and the same number and spatial
distribution of starting individuals, thus obviating the need for the most
computationally costly components of a Geonomics model prior to the main phase,
but those individuals’ GenomicArchitectures
and genomes
will be drawn at random, providing independent instantiations
of the landscape genomic scenario being simulated.
Note, however, that this parameter is inconsequential for any Species
with a custom genomic architecture file; if a Species
’ gen_arch_file
parameter is not None then the custom genomic architecture file will be used
to set the genomic architecture for that Species
on each iteration,
regardless of whether rand_genarch is True or False.
repeat_burn
#whether to burn in each iteration
'repeat_burn': False,
bool
default: False
reset? P
This indicates whether a reused Community
should be burned in
separately for each iteration for which it is reused. If True, the
Community
from iteration 0 will be saved as soon as it is instantiated,
but will have a new burn-in run for each iteration in which it is used. If
False, the Community
from iteration 0 will be saved after its burn-in
is complete, and then will only have the main portion of its Model
run
separately during each iteration. (Note that if rand_comm is set to True then
the value of this parameter will not be used.)
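One way to reason about how rand_comm and repeat_burn interact is to count the burn-ins a run would need. The sketch below is a reading of the descriptions above, not Geonomics code. The dict name its and the helper burn_in_runs are hypothetical. It assumes that a newly generated Community is always burned in, and it ignores any interaction with rand_landscape.

```python
# Default iteration settings, copied from the template values above.
its = {
    'n_its': 2,
    'rand_landscape': False,
    'rand_comm': False,
    'rand_genarch': True,
    'repeat_burn': False,
}

def burn_in_runs(its):
    """Number of burn-ins implied by rand_comm and repeat_burn.

    Assumption: a new Community (rand_comm True) is burned in on every
    iteration, in which case repeat_burn is not consulted.
    """
    if its['rand_comm'] or its['repeat_burn']:
        return its['n_its']
    return 1  # burned-in Community from iteration 0 is reused

print(burn_in_runs(its))
```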
Data
Sampling
scheme
#sampling scheme {'all', 'random', 'point', 'transect'}
'scheme': 'random',
{'all'
, 'random'
, 'point'
, 'transect'
}
default: 'random'
reset? P
This indicates the sampling scheme to use when collecting data from a
Model
. Currently valid values include 'all'
,
'random'
, 'point'
, and 'transect'
.
With 'all'
, data will be collected for all Individual
s
at each sampling timestep. With 'random'
, data will be collected from a
random sample of Individual
s (of size indicated by parameter n)
from anywhere on the Landscape
.
With 'point'
, data will be collected from random samples of size n
within a certain distance (radius) of each of a set of particular
points (points). With 'transect'
, a linear transect of some
number of points (n_transect_points) between some endpoints
(transect_endpoints) will be created, and then data will be collected
from random samples of size n within a certain distance (radius)
of each point along the transect.
n
#sample size at each point, for point & transect sampling
'n': 250,
int
default: 250
reset? P
This indicates the total number of Individual
s to sample each time
data is collected (if scheme is 'random'
), or the number of
Individual
s to sample around each one of a set of points (if scheme
is 'point'
or 'transect'
). This parameter will only be used if
scheme is 'random'
, 'point'
, or 'transect'
; otherwise
it may be set to None
.
points
#coords of collection points, for point sampling
'points': None,
{tuple
of 2-tuple
s, None
}
default: None
reset? P
This indicates the points around which to sample Individual
s for data
collection. This parameter will only be used if scheme is 'point'
;
otherwise it may be set to None
.
transect_endpoints
#coords of transect endpoints, for transect sampling
'transect_endpoints': None,
{2-tuple
of 2-tuple
s, None
}
default: None
reset? P
This indicates the endpoints between which to create a transect, along which
Individual
s will be sampled for data collection.
This parameter will only be used if scheme is 'transect'
;
otherwise it may be set to None
.
n_transect_points
#num points along transect, for transect sampling
'n_transect_points': None,
{int
, None
}
default: None
reset? P
This indicates the number of points to create on the transect along which
Individual
s will be sampled for data collection.
This parameter will only be used if scheme is 'transect'
;
otherwise it may be set to None
.
radius
#collection radius around points, for point & transect sampling
'radius': None,
{float
, int
, None
}
default: None
reset? P
This indicates the radius around sampling points within which
Individual
s may be sampled for data collection.
This parameter will only be used if scheme is 'point'
or
'transect'
; otherwise it may be set to None
.
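To make the 'transect' scheme more concrete, here is a minimal sketch of how transect_endpoints, n_transect_points, radius, and n could fit together. It is not the Geonomics implementation. The helpers transect_points and sample_near are hypothetical, and the sketch assumes evenly spaced points that include both endpoints, Euclidean distance, sampling without replacement, and that all nearby Individuals are taken when fewer than n are within the radius. Individuals are represented simply as (x, y) tuples.

```python
import math
import random

def transect_points(endpoints, n_points):
    """Evenly spaced points between two endpoints (endpoints included)."""
    (x0, y0), (x1, y1) = endpoints
    if n_points == 1:
        return [(x0, y0)]
    return [(x0 + (x1 - x0) * i / (n_points - 1),
             y0 + (y1 - y0) * i / (n_points - 1))
            for i in range(n_points)]

def sample_near(point, individuals, radius, n, rng):
    """Randomly sample up to n individuals within radius of point."""
    nearby = [ind for ind in individuals if math.dist(ind, point) <= radius]
    return rng.sample(nearby, min(n, len(nearby)))

# Made-up coordinates, for illustration only.
individuals = [(0.5, 0.5), (5, 1), (9, 9), (10, 0.2), (4.8, -0.3)]
rng = random.Random(0)
for pt in transect_points(((0, 0), (10, 0)), 3):
    print(pt, sample_near(pt, individuals, radius=1.5, n=2, rng=rng))
```

The 'point' scheme would follow the same pattern, with the points taken directly from the points parameter rather than generated along a transect.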
when
#when to collect data
'when': None,
{int
, list
of int
s, None
}
default: None
reset? P
This indicates the timesteps during main Model
iterations
at which data should be collected (in addition to after the final timestep
of each iteration, when data is always collected for any Model
for which
data collection is parameterized). If value is a non-zero int
,
it will be treated as a frequency at which data should be collected (e.g.
a value of 5 will cause data to be collected every 5 timesteps). If value
is a list of int
s, they will be treated as the particular timesteps
at which data should be collected. If value is 0 or None
,
data will be collected only after the final timestep.
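The three ways of specifying when can be summarized in a short sketch. The helper collection_timesteps is hypothetical, not a Geonomics function. It assumes timesteps are numbered from 0 to T - 1, that a frequency counts from timestep 0, and that listed timesteps outside that range are ignored; none of these details is stated above.

```python
def collection_timesteps(when, T):
    """Return the sorted timesteps at which data would be collected."""
    final = T - 1  # assumption: timesteps are 0-indexed
    if when is None or when == 0:
        steps = set()
    elif isinstance(when, int):
        steps = set(range(0, T, when))  # treat the int as a frequency
    else:
        steps = {t for t in when if 0 <= t < T}  # explicit timesteps
    steps.add(final)  # data is always collected after the final timestep
    return sorted(steps)

print(collection_timesteps(25, 100))
print(collection_timesteps([10, 50], 100))
print(collection_timesteps(None, 100))
```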
include_landscape
#whether to save current Layers when data is collected
'include_landscape': False,
bool
default: False
reset? P
This indicates whether to include the Landscape
Layer
s among the
data that is collected. If True, each Layer
will be written to a raster
or array file (according to the format indicated by
geo_rast_format) each time data is collected.
include_fixed_sites
#whether to include fixed loci in VCF files
'include_fixed_sites': False,
bool
default: False
reset? P
This indicates whether fixed sites (i.e. loci which are fixed for either the
0 or 1 allele) should be included in any VCF files that are written. Thus,
this parameter is only relevant if 'vcf'
is one of the genetic data
formats indicated by gen_format.
Format
gen_format
#format for genetic data {'vcf', 'fasta'}
'gen_format': ['vcf', 'fasta'],
{'vcf'
, 'fasta'
, ['vcf', 'fasta']
}
default: ['vcf', 'fasta']
reset? P
This indicates the format or formats to use for writing genetic data.
Currently valid formats include 'vcf'
and 'fasta'
formats.
Either or both formats may be specified; all formats that are specified will
be written each time data is collected.
geo_vect_format
#format for vector geodata {'csv', 'shapefile', 'geojson'}
'geo_vect_format': 'csv',
{'csv'
, 'shapefile'
, 'geojson'
}
default: 'csv'
reset? P
This indicates the format to use for writing geographic vector data (i.e.
Individual
s’ point locations).
Currently valid formats include 'csv'
, 'shapefile'
,
and 'geojson'
. Any one format may be specified.
geo_rast_format
#format for raster geodata {'geotiff', 'txt'}
'geo_rast_format': 'geotiff',
{'geotiff'
, 'txt'
}
default: 'geotiff'
reset? P
This indicates the format to use for writing geographic raster data (i.e.
Layer
arrays). Currently valid formats include 'geotiff'
and 'txt'
. Either format may be specified. Note that this parameter
will only be used if the include_landscape parameter is set to True.
nonneut_loc_format
#format for files containing non-neutral loci
'nonneut_loc_format': 'csv',
{'csv'
, None
}
default: 'csv'
reset? P
This indicates the format to use for writing files
containing lists (in columns) of the non-neutral loci subtending
each of the traits (whose names are used as column names).
Currently valid values include 'csv'
,
which saves a simple CSV file, and None
,
which tells Geonomics not to save these files as output.
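A quick sanity check of the four format parameters against the valid values listed above might look like the following. The names VALID, fmt, and invalid_format_keys are hypothetical, and Geonomics may perform its own validation differently.

```python
# Valid values, as listed in the parameter descriptions above.
VALID = {
    'gen_format': {'vcf', 'fasta'},
    'geo_vect_format': {'csv', 'shapefile', 'geojson'},
    'geo_rast_format': {'geotiff', 'txt'},
    'nonneut_loc_format': {'csv', None},
}

# Template defaults, copied from above.
fmt = {
    'gen_format': ['vcf', 'fasta'],
    'geo_vect_format': 'csv',
    'geo_rast_format': 'geotiff',
    'nonneut_loc_format': 'csv',
}

def invalid_format_keys(fmt):
    """Return the names of format parameters holding invalid values."""
    bad = []
    for key, allowed in VALID.items():
        value = fmt[key]
        # gen_format may be a single format or a list of formats
        values = value if isinstance(value, list) else [value]
        if not all(v in allowed for v in values):
            bad.append(key)
    return bad

print(invalid_format_keys(fmt))
```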
Stats
The stats parameters section has a subsection for each statistic that Geonomics
can calculate. Currently valid statistics include:
- ‘Nt’: number of individuals at timestep t
- ‘het’: heterozygosity
- ‘maf’: minor allele frequency
- ‘mean_fit’: mean fitness of a Species
- ‘ld’: linkage disequilibrium
There are only a few parameters, which are shared across all of those subsections, and each parameter means the same thing whichever statistic it is parameterizing. Thus, hereafter we provide a single description of each of those parameters and how it works:
calc
#whether to calculate
'calc': True,
bool
default: (varies by statistic)
reset? P
This indicates whether or not a given statistic should be calculated. Thus,
only those statistics whose calc parameters are set to True will be
calculated and saved when their Model
is run.
freq
#calculation frequency (in timesteps)
'freq': 5,
int
default: (varies by statistic)
reset? P
This indicates the frequency with which a given statistic should be calculated during each iteration (in timesteps). If set to 0, Geonomics will calculate and save this statistic for only the first and last timesteps of each iteration.
mean
#whether to mean across sampled individs
'mean': False,
bool
default: (varies by statistic, and only valid for certain statistics)
reset? P
For some statistics that produce a vector of values each timestep
when they are collected (containing one value per Individual
),
such as heterozygosity, this indicates
whether those values should instead be averaged and saved as a
single value for each timestep.
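The sketch below shows how calc and mean might be read for two statistics. It is illustrative only: the dict name stats, the freq values, and the per-individual heterozygosity values are made up, and the calculation is not performed by Geonomics here.

```python
from statistics import fmean

# Hypothetical settings for two of the statistics listed above.
stats = {
    'het': {'calc': True, 'freq': 5, 'mean': True},
    'ld': {'calc': False, 'freq': 100},
}

# Only statistics whose calc parameter is True would be calculated.
enabled = [name for name, p in stats.items() if p['calc']]

# Made-up per-Individual heterozygosity values for one timestep.
per_individual_het = [0.25, 0.5, 0.75]

# With mean True, a single averaged value is kept instead of the vector.
if stats['het']['mean']:
    saved = fmean(per_individual_het)
else:
    saved = per_individual_het

print(enabled, saved)
```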