For additional help with these tools, please contact mget-help@nicholas.duke.edu
We are pleased to announce a major update to MGET’s larval connectivity modeling tools. Prior to MGET 0.8a62, the larval connectivity modeling tools were based on the approach described in Treml et al. (2008). In MGET 0.8a62, we added new tools that implement the Treml et al. (2012) approach, which provides numerous advantages, including:
Workflow
- Larvae now settle. The old approach dispersed larvae without settling them and reported connections between patches based on the maximum number of larvae dispersed by a source patch that were suspended over a destination patch over the course the the simulation. In the new approach, when larvae drift over a patch they settle there and are removed from the simulation and become unavailable to downstream patches, providing a more realistic simulation of dispersal.
- Competency and settlement behavior are configurable. You can control the rate at which larvae become competent to settle (via a gamma cumulative distribution function), the rate that they settle over habitat patches once they are competent (via a simple linear rate), and whether all larvae within a grid cell can sense and move to the habitat available there, or just the fraction that are over that habitat (i.e. whether or not the larvae have a “sensory zone”).
- The advection algorithm is more accurate. The new approach is based on the Multidimensional positive definite advection transport algorithm (MPDATA) (Smolarkiewicz 1983; Smolarkiewicz and Margolin 1998; Smolarkiewicz 2006).
- Larvae can die. You can specify a mortality rate. Different rates can be tested quickly without rerunning the entire hydrodynamic algorithm.
- The connectivity network can be represented in two ways. Depending on your research question, you may want to characterize connectivity according to the absolute quantity of larvae transmitted from one site to another, or to the probability that a larva released from the first site will settle on the second.
The new tools use a four-step workflow:
- Create a simulation using three rasters that define the study area and habitat patches.
- Load ocean currents into the simulation for dates of interest. You may choose from several popular sources of currents data.
- Run the dispersal algorithm for the patches and the settlement parameters you desire.
- Visualize the results as a series of rasters showing larvae density through time and a line feature class showing connections between patches. If desired, you apply mortality at this step.
Here’s the toolbox:
Here’s an example model:
Example analysis
For this example, we’ll examine connectivity of coral reefs within the central and western Gulf of Mexico. I chose this region because it is relatively small, allowing my example simulation to run quickly, and because I had data readily available from a recent paper, Schill et al. (2015). Although this covers a specific dataset and region, I’ll offer hints for how you can best configure the tools for your own data and region. (But please, do not contact me asking for step-by-step instructions on how to do basic GIS tasks, such as how produce a water mask from publicly available GIS data. If you do not already have the GIS skills to do these kinds of things yourself, you are not ready to use the connectivity tools.)
Step 1: Create the simulation
The first step is to create the simulation. In order to do that, we need three rasters: a water mask raster, a patch IDs raster, and a patch cover raster. The rasters must all have the same coordinate system, cell size, extent, and number of rows and columns. Before discussing each raster in detail, I’ll discuss their coordinate system, cell size, and extent. You should select these parameters very carefully; they strongly influence on the results, the length of time required to run the simulation, and the likelihood that the simulation will fail due to insufficient memory.
Coordinate system
The rasters should use a projected coordinate system such as Mercator with meters as the linear unit. The hydrodynamic algorithm performs calculations using Cartesian coordinates with meters as the unit. The coordinate system should, as much as possible, be configured such that its rows and columns run parallel to lines of longitude and latitude. When MGET loads ocean currents data into the simulation, it reprojects the original data into the coordinate system of your rasters. The original data come as vector components–pairs of u and v images giving current speed in the north/south and east/west directions. When MGET reprojects them, it does not know how to adjust the results to account for the fact that north/south/east/west may no longer be up/down/right/left on portions of your map. Projections in which north/south/east/west deviate strongly from up/down/right/left will not have accurate results. This can problem is easy to avoid for study areas near the equator but is increasingly difficult as latitude increases.
If you do not know what to choose, I recommend ArcGIS’s “WGS 1984 Mercator” projection. (I do not recommend any of the “Web Mercator” projections, however, due to the difficulty in properly reprojecting ocean currents data to these projections.) Feel free to contact us for advice.
Cell size, extent, and number of rows and columns
Choosing a cell size is difficult. Often you are working with coral reef data at resolutions of 1 km or higher. High resolution maps are nice, so you should just use the cell size of your reef data for the connectivity analysis, right? There are three reasons why this is probably not a good idea:
- Most ocean currents data are much coarser resolution. Most of the global currents datasets are derived from remote sensing observations reported by satellite altimeters which have along-track resolutions on the order of 10 km. The resulting gridded products typically range in resolution from 10-25 km at the equator. If you use a very fine scale such as 1 km, MGET will interpolate these products at that resolution but that does not mean the results will be accurate at that scale. Indeed, it is not possible to know whether or not the results are accurate because the ocean currents datasets are not able to resolve phenomena at that scale.
- Simulation run time increases dramatically as cell size decreases. Decreasing the cell size increases the number of cells in the study area; halving the cell size quadruples the number of cells. When you reduce cell size, you must also reduce the time step of the simulation so that the hydrodynamic algorithm remains numerically stable. This means that more time steps are needed to span a given simulation length. The increased number of cells and time steps can greatly lengthen the time it takes to complete the simulation.
- The tool may run out of memory. At present, the tool is restricted to running as a 32-bit process, which means that in practice it can access about 2 GB of memory before it fails, regardless of how much memory you have on your computer. We aim to remove this limitation in a future release, but until then, you must be careful about the number of rows and columns of your rasters, and the number of time steps in the simulation.
I recommend that you:
- First research ocean currents datasets and select one. As you will see in the next bullet, I recommend that you limit the resolution of your analysis to that of the ocean currents data. To follow this advice, you need to select an ocean currents dataset that matches your region and time period of interest. I discuss this further in the Step 2 section below.
- Choose a cell size that is not smaller than half the resolution of your ocean currents. For example, if you’re using AVISO 0.25° currents in a tropical area, their resolution is about 28 km at the equator, so I’d recommend not going smaller than about 14 km, or perhaps 10 km if you are working somewhat north or south of the equator. Working at a finer scale than that does not seem likely to produce results that will be plausible at that finer scale. (I have not tested this assertion, however.)
- Keep your rasters to less than 200,000 cells. For example, 500 columns x 400 rows = 200,000 cells. If fewer would work, by all means make your rasters smaller! The fewer cells you have, the faster your simulation will run. Simulations with 200,000 cells and a few hundred patches often max out available memory and run the risk of failing.
- Do not clip your rasters tightly around habitat patches. Leave a buffer between habitat patches and the edges of rasters, to allow currents to circulate in this area and perhaps return larvae to the spawning patch. Once larvae move beyond the edge of the raster, they are lost and cannot return.
Finally, if you are trying to model fine scale processes around small extents–e.g. nearshore movements of larvae around a small island or atoll–this may not be the tool for you. The tool is intended for use at regional scales where dominant mesoscale circulation features connect patches over 10s to 100s of km.
The Gulf of Mexico example
The Gulf of Mexico example uses data from Schill et al. (2015). In that analysis, we were interested in coral reef connectivity throughout the entire Caribbean, Gulf of Mexico, southeastern U.S., and Bermuda. We used currents data from NOAA’s Atlantic Real-Time Ocean Forecast System (RTOFS), a basin-scale ocean forecast system based on the HYbrid Coordinate Ocean Model (HYCOM). Atlantic RTOFS used a variably-spaced grid that ranged from about 4 km resolution in the western Gulf of Mexico to about 17 km near Africa. We resampled the RTOFS to an 8 km cell size, the coarsest resolution in our study area (the eastern Caribbean). We used the WGS 1984 Mercator projection. The rasters were 594 columns by 413 rows.
For the example here, I clipped the Schill et al. (2015) data to the central and western Gulf of Mexico (see below). I maintained the same coordinate system, but clipped the rasters to 180 columns by 198 rows. I focused on the smaller region to limit the run time of the tool. At this extent, the simulation takes about 25 minutes to run on my circa-2015 laptop, vs. about 20 hours for the full Schill et al. (2015) extent.
The variably-spaced Atlantic RTOFS data we used in Schill et al. (2015) could not be represented as rasters in ArcGIS, which requires constant cell spacing. To work around this, we developed a custom MATLAB script for resampling it into the Mercator projection. We have not had time to redevelop this script for use in MGET, so the Gulf of Mexico example presented here does not use Atlantic RTOFS. Instead I used the HYCOM Global GLBa0.08 dataset, which has a resolution of about 9 km and uses a Mercator projection. These characteristics make it quite suitable for the 8 km rasters we developed for the Schill et al. (2015) paper. Because the Gulf of Mexico is within the extent of the HYCOM Gulf of Mexico GOMl0.04 dataset, it would have worked as well. It offers a 4.5 km resolution.
You can download the Gulf of Mexico example data here.
Now I will discuss the details of the three rasters needed to create the simulation.
Water mask raster
The water mask indicates which cells are land and which are water. It must have an integer data type. The value 0 or No Data indicates land, all other values indicate water. During the simulation, larvae are allowed to move between water cells but cannot enter land cells. Larvae that are moved in the direction of land are “blocked” and remain in the adjacent water cell. Larvae that are moved beyond the extent of the raster from water cells on the edge are lost.
You can produce water masks from bathymetry datasets, which are usually given in raster form. SRTM30_PLUS, ETOPO, and GEBCO are widely used global bathymetries. Depending on your study area, you may also be able to find a regional bathymetry. Assuming you followed our advice about selecting a cell size that is not too small relative to your ocean currents data, it is very likely that the bathymetry has a smaller cell size than you want to use. You should project your bathymetry to your selected coordinate system at high resolution, clip it, use the ArcGIS Spatial Analyst to flag cells as water or land (e.g. with the Con tool or Raster Calculator), and then use the Resample tool to increase the cell size to your desired coarseness. I do not recommend using the Project Raster tool to increase the cell size; I have observed it introducing spatial error in the results. Check your results carefully.
You can also build water masks from shoreline databases, such as GSHHG, which are often distributed as land polygons. Reproject the polygons to the desired coordinate system and clip them. (It might be faster to first clip them to a region larger than your study area, then reproject them.) Then convert them to rasters at the desired cell size. You’ll probably have data for land and No Data for water. Use ArcGIS Spatial Analyst tools (e.g. Is Null, Raster calculator) to convert them to have 1 for water and 0 or No Data for land.
Here’s the water mask for our Gulf of Mexico example:
Patch IDs raster
This raster specifies the locations and IDs of habitat patches from which larvae will be released and upon which larvae can settle. It must have an integer data type. Each patch is defined as one or more cells having the same integer ID value. Patch IDs may range from 1 to 65535, inclusive. The value 0 may not be used. No Data indicates that the cell is not part of a patch. Typically, each patch’s cells form a single contiguous blob, but this is not required; the cells of a patch may be separated by No Data cells. All patches must occur in cells labelled as water by the water mask raster. If your patches occur close to land, you may need to carefully edit the patch IDs and water mask rasters together to ensure no desired habitat is missed.
Here’s the patch IDs raster for our Gulf of Mexico example. Note how this does not fully follow our recommendation above to buffer patches near the edge of the study area: in the southeast, the fringing reefs along the Yucatán penninsula extend all the way to the bottom edge of raster. The real reason for this is laziness; I clipped the Caribbean-wide data from Schill et al. (2015)and did not bother to edit out the southeasternmost reefs. But I can also offer some reasonable justifications. In this example, I’m not very interested in self recruitment and connectivity between those southeastern reefs. Also, the dominant current patterns flow from south to north here, so there is reduced risk that larvae will be carried off the southern edge of the study area.
How should patches be defined?
These fringing reefs also prompt the questions: how big should patches be? Should long runs of continuous habitat cells be grouped into a single patch or split into multiple patches? If they are split into multiple patches, how should it be done? And would it be better to have each cell of the raster be its own patch?
The answers to these questions depend on the question you’re seeking to answer, and also on a trade-off between level of detail and speed of execution. Adjacent areas of suitable habitat are usually highly connected. There is usually no benefit to treating each cell as a distinct patch. This usually results in a very large number of patches, greatly increasing execution time. Also, the connectivity network that results is usually very dense and must be further summarized to clearly communicate the results.
On the other hand, it may not be sufficiently informative to create a single patch for all the cells of a reef that extends for hundreds of kilometers, especially when it spans multiple political jurisdictions or management zones and the purpose of your study is to investigate connectivity in the context of human uses of the ocean. In this situation, it may be best to break up the reef into multiple patches at political or management boundaries. It is also appropriate to split patches up at locations where physiography or oceanography might strongly affect the flow of larvae. For example, if a reef extends around both sides of a peninsula, it might be appropriate to split the reef into two patches–one for either side of the peninsula–or three–one for either side and one for the tip.
There are no hard-and-fast rules for grouping cells into patches. I recommend you take the approach that seems most appropriate for your situation and document it accordingly. That said, I prefer to to have no more than “a few hundred” patches in a large analysis, or “a few dozen” in a small analysis, as is presented here. Once you exceed a few hundred patches, the simulation time stretches into days for little apparent benefit. When the number of patches is limited to a few dozen, the number of connections is often in the low hundreds, which is not too difficult to visualize on a map.
I followed your advice and selected a coarse cell size commensurate with the ocean currents I’m using. But now some cells only have a little suitable habitat and some have a lot. How should I handle this?
Elaborating on this question: When a cell has only a tiny amount of reef, it seems wrong to mark the cell as being part of a patch. That makes the patch extend over a much larger area than occurs in reality. But if I don’t mark the cell as being part of patch, that reef will be lost. How do I resolve this dilemma?
Mark the cell as being part of a patch, then use the patch cover raster to specify how much of the cell is covered by suitable habitat. See below for details.
Where can I get coral reef data for my own analysis?
UNEP-WCMC Global Distribution of Coral Reefs is a good place to start.
Patch cover raster
The patch cover raster specifies the proportion of each cell’s area that is occupied by habitat from which larvae can be released or upon which larvae can settle. It must have a floating point data type. The raster values must be greater than or equal to 0 and less than or equal to 1. The value 1 indicates that the entire cell is occupied by suitable habitat, while 0.5 indicates that only half of it is. For example, if the cell size is 8 km, the values 1 and 0.5 mean the cell contains 64 and 32 square km of suitable habitat, respectively.
At the start of the simulation, cells are allocated quantities of larvae according to the patch cover raster. A cell with a patch cover of 1 receives 1 unit of larvae; a cell with a patch cover of 0.2 receives 0.2 units of larvae. The simulator assumes that both suitable habitat and larvae are distributed uniformly across the cell at all times. As the simulation progresses, when larvae that are competent to settle drift over a cell, by default only the fraction of the cell that is occupied by suitable habitat receives settlers. (This behavior can be modified by the Use Sensory Zone parameter, described further below.)
Use the patch cover raster to account for the loss of resolution you experienced rescaling your high resolution reef data to the lower resolution ocean currents data–e.g. that some of the low resolution cells are completely covered by habitat, while others are only slightly covered. You can use the ArcGIS Spatial Analyst Block Statistics tool to compute how many high resolution cells occupy the lower resolution cells and obtain a proportion ranging from 0 to 1. This will work both for high resolution presence/absence data (classified as 0 or 1) or high resolution percent cover data (ranging from 0 to 100). In the latter case, if values range from 0 to 100, you may have to divide by 100 to rescale them to 0 to 1.
Here is the patch cover raster for the Gulf of Mexico example:
Note how most of the raster is the value 0. For 0 or No Data, the tool assumes that the cell does not contain any suitable habitat, even if the patch IDs raster indicates that suitable habitat is there. Conversely, if the value is greater than 0 but the corresponding patch IDs raster contains No Data, it is assumed that the cell does not contain any suitable habitat.
Running the tool
Once you have the rasters created, running the Create Larval Dispersal Simulation From ArcGIS Rasters tool is simple:
(If you wish to try this yourself using the Gulf of Mexico example data, download it here.)
You must specify a Simulation Directory that the tool will create, as well as your three rasters. The tool will create the directory, initialize it with some private files and directories, and make copies of your three rasters. Do not manually tamper with the contents of the simulation directory. It is not intended to be read or modified directly.
If you do not see the documentation on the right, click the Show Help >> button and it will appear. Show Help >> will then change to << Hide Help, as you see above.
Step 2: Load ocean currents into the simulation
After creating the simulation, you must load ocean currents data into the Simulation Directory. These will be used in Step 3 to run the simulation. But before you load currents, you must determine which ocean currents product you will use.
Selecting an ocean currents product
At the time of this writing, MGET included support for five ocean currents products for larval dispersal simulations. (At the time of this writing, it was not possible to use products other than these, but see below for further discussion).
Product | Spatial extent | Time range | Cell size | Time step | Depth range |
---|---|---|---|---|---|
AVISO DUACS 2014 | Global | 1993-now | 1/4° (~28 km at equator) | 1 day | Surface only |
Mediterranean Sea | 1993-now | 1/8° (~11-14 km in Med) | 1 day | Surface only | |
HYCOM GLBa0.08 | Global, 66°S-60°N | Sept 2008-now | 9 km | 1 day | 0-5500m |
HYCOM GOMl0.04 | 76-98°W, 18-32°N | 2003-now | 4.5 km | 1 day | 0-5500m |
OSCAR | Global, 80°S-80°N | Oct 1992-now | 1/3° (~37 km at equator) | 5 days | Surface only (15m) |
Pacific ROMS-CoSiNE | 100°E-70°W, 45°S-65°N | 1991-2015 | 13.9 km | 73 hours | Surface to seafloor |
First, you should eliminate the products that do not match the spatial extents and time ranges of interest to you. Next, consider the cell size and time step. Also read up on each product and see how it works. After building an understanding, select the product that best meets your needs, in terms of resolution and any other characteristics important to you.
Note that for products that support multiple depth ranges, MGET does allow you to pick a depth of interest, but it does not presently implement vertical migration as a larval behavior. MGET assumes that larvae remain at a constant depth throughout the simulation. For experiments in which larvae remain near the surface, this assumption may be reasonable. For larvae that inhabit sub-surface layers, it may not be reasonable. If not, this tool may not be useful for your research question.
For the Gulf of Mexico example, the best product might be HYCOM GOMl0.04, an 4.5 km resolution ocean model specifically designed for the Gulf of Mexico. I definitely would have chosen that had the reef rasters been of comparable resolution. But I already had reef rasters readily available at 8 km resolution and did not want to undertake the effort of producing new ones at a higher resolution just for this example. Plus, had I done so, it would greatly increase the run time of the tool, probably by a factor of four or more. Finally, it is likely that many more MGET users will be interested in the global HYCOM model (GLBa0.08), and its 9 km cell size is similar to the reef rasters, so I’m demonstrating it instead.
Running the tool
After you select a currents product, run the “Load” tool for that product. The procedure and tool parameters are similar for all of the products. For the Gulf of Mexico example, the tool was Load HYCOM GLBa0.08 Currents Into Larval Dispersal Simulation:
The purpose of this tool is to download currents data for a specified range of dates and add them to the Simulation Directory, thereby making it possible to run simulations for dates within that range. The values you provide for Start date and End date do not control when a simulation starts and ends. You’ll decide that during Step 3, below, when you actually run simulations. The load currents tool performs the job of downloading the currents, storing them locally, and making them ready for use by the simulator, so that when it needs them they can be retrieved quickly. The Start date and End date tell the tool the range of dates for which the currents data should be prepared.
You can run this tool multiple times to load different ranges of dates into the same Simulation Directory. For example, if you want to run simulations during the August-October periods of three different years, you could run the tool three times, once for each year, providing 1 August as the Start date and 31 October as the End date. The tool can take a long time to run, so if speed matters it is better to run the tool multiple times for short ranges of dates (e.g. August-October for three different years), rather than once for a single long range (e.g. August of year 1 through October of year 3).
In the Gulf of Mexico example, I wanted to simulate dispersal of larvae for 30 days; this is known as a 30 day pelagic larval duration (PLD). Repeating what was done by Schill et al. (2015), I wanted the larvae to be released on the last quarter moon of September 2011, which fell on 20 September. So I put in 20 September 2011 as the Start date and 20 October 2011 as the End date. Because these dates are just those for loading data, not for running the simulation, I could have put in an earlier start date or later end date and obtained more data while still covering the focal period. For example, if I wanted to simulate August 2011 last quarter moon as well, as was done by Schill et al. (2015), I could have used 21 August as the start date and loaded two months of currents, which I would use to run two consecutive 30-day simulations (in Step 3 below).
I assumed my larvae occupied the surface, so I used 0 for Depth. I used the default resampling technique of CUBIC for projecting the ocean currents to the coral reef raster coordinate system.
I selected Del2a for the Method for estimating missing currents values. This parameter tells the tool how to fill in cells that are marked as water by your water mask but are missing currents data. It is important that you do this; if the cells are left as No Data, then larvae cannot leave the cells by advection, only by diffusion. Then when you run the simulation in Step 3, the tool will report a warning (see Step 4 below for what this looks like). If you select a Method for estimating missing currents, the tool will interpolate values for those missing cells using values from nearby cells that do have data. You should review the documentation of the parameter in ArcGIS. (Thanks to John D’Errico for the MATLAB implementation of the algorithms underlying this parameter.)
In my experience, the results of the various Del2 methods are fairly similar. I recommend Del2a as a default. All of them can take a significant amount of time to execute for large rasters. The Del4 method while more numerically accurate, is much slower and can require significantly more memory, so I tend not to use it. If you receive an OUT OF MEMORY error while loading currents, switch to the Del2c method. If you still receive the error, you’ll probably have to recreate the simulation at a coarser cell size to reduce the number of cells. You are welcome to contact me for advice.
Finally, this parameter is intended to fill in small areas of missing data for which these interpolation algorithms are reasonable. If your currents product always lacks data in a certain area–e.g. a nearshore environment or at shallow depths–you should not blindly assume that this interpolation procedure will produce correct results! Instead, you should try to find another currents product that offers estimates for your areas of interest. Most likely the developer of the currents data, usually oceanographers, will have a better way of estimating currents values than the simple interpolation algorithms offered by this tool.
I left the other parameters at default values except for Cache Directory, which I set to a different folder than the Simulation Directory. The cache directory is optional; it tells MGET to store a private copy of whatever data were downloaded, so that if MGET needs those data again–if you recreate the simulation directory, for example–it can pull the data from there, which is much faster than downloading them again from the server.
Can I load several different products into a single Simulation Directory, e.g. for different time ranges?
No. At the time of this writing, the tools would only allow you to load a single product in a given simulation directory. You may load as many different time ranges as you like for that product. If you need to use a different product, create another Simulation Directory.
Does MGET support the global HYCOM GLBu0.08 datasets?
Short answer: No, but you might be able to do it yourself, and you are welcome to contact us to advocate we add it.
Long answer: The global HYCOM GLBu0.08 datasets span October 1992 to the present day using a 0.08° equirectangular coordinate system (a.k.a. “geographic” projection) spanning 80.48°S-80.48°N. By contrast, the global HYCOM GLBa0.08 datasets span September 2008 to the present day using a complex “tripole grid”; MGET can access a 9 km Mercator-projected portion of this grid spanning 66°S-60°N. HYCOM produces both from the same underlying results, thus they are effectively the same data, just on a different projection.
We only had time to build a tool for loading GLBa0.08 into connectivity simulations. You are welcome to contact us and advocate we add it. Meanwhile you can try to do it yourself. In a different part of the MGET toolbox there is a tool called Create Rasters for HYCOM GLBu0.08 4D Variable that can download GLBu0.08 rasters and store them in a directory. After reverse engineering the format of the Simulation Directory (see next question below), you could run then run the MGET Find ArcGIS Rasters and Project to Template tool to reproject the rasters to the same coordinate system, cell size, and extent as your patch rasters, interpolate cells missing data that are marked as water by your mask, and store the results in the Simulation Directory.
Can I load my own ocean currents data in the Simulation Directory?
As I am writing this, we do not provide a tool for loading your own currents data into the simulation. If this is something that is absolutely critical for your project, please inquire with us and maybe we can collaborate on it.
You may also try to do it yourself by reverse engineering the format that currents are stored in the Simulation Directory. The currents go into a subdirectory named Currents. In it, there are subdirectories u and v for the horizontal and vertical vector components rasters. In each of those, there are subdirectories by year. In the year directories, the currents rasters appear as .img files with the name uYYYYDDDHHMM.img or vYYYYDDDHHMM.img, where YYYY is the four-digit year, DDD is the three digit day of the year (001 to 365, or 366 on leap years), HH is the hour (00 to 23) and MM is the minute (00 to 59). (The second is assumed to be 0.)
The currents rasters must use the same coordinate system, cell size, extent, and number of rows and columns as the three rasters used to initialize the simulation. The MGET tools Project Raster To Template and Find ArcGIS Rasters and Project to Template can help automate this procedure. MGET’s various Load Currents tools use them internally to convert the original currents rasters to the geographic characteristics of your simulation rasters.
The images must use a constant time increment. There must be no gaps in the series of time slices for periods for which you want to run a simulation. For example, if want to conduct a simulation for the month of September 2015 and you have a daily currents product, you must provide images for all 30 days of September 2015. There must not be any missing days.
You must also modify Simulation.ini file at the root of the Simulation Directory to set four variables:
- currentsloaded – set this to True
- currentsproduct – set this to some arbitrary string describing your product; it does not matter what it is
- currentsdatetype – you probably want to set this to Center, indicating that the date/time of the file indicates the center of the time window that the image applies to
- maxsecondsbetweencurrentsimages – set this to the number of seconds between each image, e.g. 86400 for daily images
Step 3: Run the simulation
After loading currents for a range of dates you want to simulate, you can run the simulation. This step will disperse the larvae using the ocean currents and track where they settle. This step takes a long time to run, ranging from a few minutes for a simulation spanning a few days for a few patches, to hours or even a few days for a 30-day simulation for dozens or hundreds of patches.
The tool has many parameters. To obtain good results, you must choose values carefully. Let’s walk through them in detail.
Main parameters
Here’s how the tool looks with you first open it:
Simulation directory
The directory you created with the Create Larval Dispersal Simulation tool and then loaded with ocean currents.
Results directory
An existing directory to receive the results. You must create it before running the tool. The tool will create a bunch of outputs within it. The outputs will be overwritten if they already exist. The documentation for this parameter describes the outputs in detail. I discuss them more below.
I suggest you make this directory a peer of the simulation directory. Alternatively you can put it inside the simulation directory, but be careful: if you run the Create Larval Dispersal Simulation tool again (e.g. by accidentally rerunning your geoprocessing model) it will delete and recreate the simulation directory and your results will be lost. If you plan to run simulations using multiple parameters–e.g. different start dates, durations, settlement parameters–you should name the directory in a way that lets you remember those values. However, if you forget to do this, the tool creates a text file within the results directory that lists the parameter values for that run.
Start date
The start date for your simulation. If desired, you may also include a time; if you don’t, midnight (00:00:00) will be used. (Given the resolution of the currents data, specifying the time is not likely to be important.)
Duration
The number of days to simulate. The first time you run your simulation, I highly recommend you use a short duration such as 2-5 days and limit the simulation to just a few patches (using the Patches that disperse larvae parameter described below). This will let you check whether there are any basic problems, such as the Time Step being too large or too small (see below), or some cells missing ocean currents values. You can also quickly check the results to see if the tool is doing what you thought it would do.
Next, increase the duration to the full amount you’d like to simulate. Usually this is the pelagic larval duration (PLD) value for the species you are modeling. If you are testing multiple PLD values, use the shortest one. Again, do this for just a few patches and check that the output looks reasonable. This will demonstrate that the tool can run for a few patches without failing due to insufficient memory (see below for more about that). It will also give you an idea how long the full simulation takes, per patch, so you can plan when to run the full simulation (e.g. overnight, while you’re away from work).
Finally, when you’re ready for the complete simulation, use the full duration with all of the patches. This will usually take several hours or more.
Time step
Number of hours the simulator should advance its clock after recalculating larval movements and settlement. This is an important parameter that directly affects both the accuracy of the results and how long the tool will take to run. Although we provided a default of 1 hour, this may not be an appropriate value. Depending on your cell size and the maximum speed of your ocean currents, you might need a smaller value for the simulation to be numerically stable. Fortunately it is not difficult to determine an appropriate value, using the following procedure.
First run the tool using default time step of 1 hour. Use a short duration (2-5 days) and just one patch so it runs very quickly. Every time the tool runs it performs a numerical stability check. If you get this message:
Warning: The stability condition is 1.566760, which is greater than 2 ^ -1/2 = 0.707106. The simulation MAY NOT be numerically stable and the output results MAY NOT be correct. To fix this problem, reduce the simulation time step to 0.037610 days or less.
it indicates that the time step is too long. When the cell size is small or the ocean currents are fast, a shorter time step is needed. Reduce the time step to the recommended value, or less. Note that the warning message reports the time step in daysbut the tool requires you input it hours. We are sorry for the inconvenience. To convert days to hours, multiply by 24.
Later, when the results are visualized, it will be inconvenient if the time step cannot be divided evenly into 24 hours. So don’t just convert the recommended value reported by the warning to hours and plug it into the tool. For example, in the example above, don’t use 0.90264 hours. Use something like 0.5 hours instead, so exactly 48 time steps occur per day.
Now run the tool again and verify that the warning goes away. Instead you should get this message:
The stability condition is 0.391690, which is less than or equal to 2 ^ -1/2 = 0.707106. The simulation will be numerically stable.
Now run the tool again for the full simulation duration (e.g. 30 days). If you get the warning again, it means that during this period there was a faster current than you encountered during the initial shorter simulation. Reduce the time step further and rerun the full simulation again until the warning goes away.
If you see that the stability condition is very low, e.g. below 0.1, it indicates that the time step is shorter than it needs to be. This may happen for the default time step of 1 hour if you’re using relatively coarse resolution currents or your study area has relatively slow currents. There is no harm in simulating using a short time step. But the length of time the tool requires to run is directly proportional to the time step. If you increase the time step the tool will complete in a shorter time, which can be advantageous.
The amount of memory required to run the simulation does not depend on the time step. If you receive an OUT OF MEMORYerror while running the simulation, increasing the time step will generally not solve the problem (but increasing the Simulation summarization period might; see below).
Whenever you change the time step, after you settle on a final value always adjust the Simulation summarization period as well (see below).
Simulation summarization period
Period, expressed as the number of simulation time steps, at which the simulation results should be “summarized”. This parameter affects two important aspects of the simulation and directly influences the amount of memory required to execute the simulation, so it should be chosen carefully.
The first important aspect this parameter influences concerns the production of visualizations (Step 4) that summarize the results of the simulation. As the simulation runs, each time the summary period elapses the tool records an image of larval density at that point in the simulation, among other things. The first summarization occurs at the moment larvae are released but before any time has elapsed. The subsequent summarizations occur after the specified number of time steps elapse. For example, if the time step is 0.5 hours and the summarization period is 24, a summarization will occur every 12 hours. Using these records, the tool in Step 4 produces a time series of larval density rasters–one raster each time a summary occurs. The date and time of the summarization are stamped into the raster’s name. It is convenient, therefore, to choose a time step and summarization period that produces rasters at human-friendly summarization times, such as once per day, once per 6 hours, etc. In this respect, your choice of the summarization period is mainly an aesthetic choice: do you want many, frequent larval density rasters, or just a few, infrequent rasters?
The second important aspect this parameter influences concerns larval mortality, an optional process that you apply in Step 4 below. The summarization period determines the frequency at which mortality will be applied during that step.
Settlement parameters
<TO BE WRITTEN>
Additional parameters
<TO BE WRITTEN>
Common warning and error messages
The ocean currents images are missing data
Warning: The ocean currents images are missing data for some cells flagged as water by your water mask. This will affect the accuracy of the simulation. The simulator will assume the ocean currents have a velocity of zero in these cells. Larvae that enter these cells can only exit via diffusion. These cells may retain larvae in a manner that is not realistic.
or:
Warning: The ocean currents images are missing data for some cells flagged as habitat patches. This will affect the accuracy of the simulation. The simulator will assume the ocean currents have a velocity of zero in these cells. Larvae can only exit these cells via diffusion. These cells may retain larvae in a a manner that is not realistic. The IDs of the affected patches are: <a list of patch IDs>
Fix this by recreating the simulation and then, when you run the Load Currents tool, use the Method for estimating missing currents parameter to fill in the missing values. See the discussion of that parameter in Step 3 above.
The stability condition is <some value>, which is greater than 2 ^ -1/2
Warning: The stability condition is 1.566760, which is greater than 2 ^ -1/2 = 0.707106. The simulation MAY NOT be numerically stable and the output results MAY NOT be correct. To fix this problem, reduce the simulation time step to 0.037610 days or less.
Fix this by decreasing the Time step parameter. See above for further discussion.
During simulation of patch 137, a negative density was detected
Warning: During simulation of patch 137, a negative density was detected during step 5 of the simulation. This indicates a numerical stability problem with the MPDATA algorithm. You should reduce the simulation time step to a lower value and try again. If this does not resolve the problem, please contact the MGET development team for assistance.
This should only happen when the Time step is much too large. Fix this by decreasing the Time step parameter. See above for further discussion.
The start date / end date occurs too far before / after …
ValueError: The start date of the simulation (2010-09-20 00:00:00) occurs too far before the date of the first ocean currents image (2011-09-20 00:00:00) that is loaded in the larval dispersal simulation in directory C:\GOM_Connectivity\Simulation. To fix this problem, either move the start date forward or load some older ocean currents data into the simulation, so that the start date matches up with the currents data.
or:
ValueError: The end date of the simulation (2011-11-19 00:00:00) occurs too far after the date of the last ocean currents image (2011-10-20 00:00:00) that is loaded in the larval dispersal simulation in directory C:\GOM_Connectivity\Simulation. To fix this problem, either move the start date backward, reduce the duration of the simulation, or load some more recent ocean currents data into the simulation, so that the end date matches up with the currents data.
The usual reason for these messages is that you that changed your dates or duration but forgot to load additional ocean currents into the simulation. The messages suggest solutions.
Step 4: Visualize the results
<TO BE WRITTEN>
References
Schill SR, Raber GT, Roberts JJ, Treml EA, Brenner J, Halpin PN (2015) No Reef Is an Island: Integrating Coral Reef Connectivity Data into the Design of Regional-Scale Marine Protected Area Networks. PLoS ONE 10(12): e0144199.
Smolarkiewicz PK (1983) A simple positive definite advection scheme with small implicit diffusion. Monthly Weather Review 111:479–86.
Smolarkiewicz PK (2006) Multidimensional positive definite advection transport algorithm: an overview. International Journal for Numerical Methods in Fluids 50:1123–44.
Smolarkiewicz PK, Margolin LG (1998) MPDATA: a finite difference solver for geophysical flows. Journal of Computational Physics 140:459–80.
Treml E, Halpin P, Urban D, Pratson L (2008) Modeling population connectivity by ocean currents, a graph-theoretic approach for marine conservation. Landscape Ecology 23:19–36.
Treml EA, Roberts J, Chao Y, Halpin P, Possingham HP, Riginos C (2012) Reproductive output and duration of the pelagic larval stage determine seascape-wide connectivity of marine populations. Integrative and Comparative Biology 52(4): 525-537.