Impact Statement
The paper presents agenerator of spatio-temporal dynamic for bio-imaging, called the birth-death-move (BDM) model. This stochastic model simulates particle dynamics, accounting for interactions and colocalization. We illustrate the high flexibility of this model by presenting results on real-word image series. Model calibration from real fluorescence microscopy data shows that it faithfully reproduces the joint dynamics of the Langerin and Rab11 proteins.
1. Introduction
A long-term goal in fundamental biology is to decipher the spatiotemporal dynamic coordination and organization of interacting molecules within molecular complexes at the single cell-level. This includes the characterization of intracellular dynamics, which is essential to a better understanding of fundamental mechanisms like membrane transport. To that end, dedicated image analysis methods have been developed to process challenging temporal series of 2D–3D images acquired by fluorescence microscopy.(Reference Kervrann, Sorzano, Acton, Olivo-Marin and Unser1)
In this context, mathematical and biophysical models are indispensable to decode and synthesize the traffic flows of biomolecules. They constitute crucial prior models in most particle tracking procedures and they are needed to carry out simulations in order to evaluate the performance of image analysis algorithms and to facilitate the data augmentation step for the training of complex models like deep neural networks. Among them, particle-based stochastic models form the main class of tracking models(Reference Chenouard, Smal, De Chaumont, Maška, Sbalzarini, Gong, Cardinale, Carthel, Coraluppi, Winter, Cohen, Godinez, Rohr, Kalaidzidis, Liang, Duncan, Shen, Xu, Magnusson, Jaldén, Blau, Paul-Gilloteaux, Roudot, Kervrann, Waharte, Tinevez, Shorte, Willemse, Celler, van Wezel, Dan, Tsai, Ortiz de Solórzano, Olivo-Marin and Meijering2–Reference Vo, Vo and Phung5) and they are often at the basis of single molecule localization microscopy (SMLM) simulators.(Reference Badoual, Arizono, Denizot, Ducros, Berry, Nägerl and Kervrann6–Reference Venkataramani, Herrmannsdörfer, Heilemann and Kuner10) Popular softwares providing particle-based stochastic simulations include Virtual Cell,(Reference Schaff, Fink, Slepchenko, Carson and Loew11) MCell,(Reference Gupta, Czech, Kuczewski, Bartol, Sejnowski, Lee and Faeder12) and Smoldyn,(Reference Andrews, Addy, Brent and Arkin13) but they are mainly dedicated to reaction-diffusion dynamics for specific biophysics applications. In particular, as mentioned in the review paper,(Reference Andrews14) they are “also known as Brownian motion simulators” and as such they hardly represent the diversity of particle motions observed in some applications.
The aim of particle-based models, as those exploited in the above references, is to represent the collective motion of particles and global biomolecule trafficking. The latter should ideally account for the stochastic displacement of all individual particles, but also for a possible regime switching of each trajectory, the time of appearance of new biomolecules and their lifetime. Moreover, interactions between biomolecules should be possible, for instance between different types of proteins, giving rise to the colocalization phenomena observed in several applications.(Reference Bolte and Cordelieres15–Reference Lavancier, Pécot, Zengzhen and Kervrann18)
Beyond the aforementioned popular softwares, there is already a vast variety of stochastic models introduced in the literature to represent the individual trajectories, allowing for instance for Brownian, confined, anomalous, or directed motions with variable velocities within the cell,(Reference Bourgeois7, Reference Lagardère, Chamma, Bouilhol, Nikolski and Thoumine8, Reference Arts, Smal, Paul, Wyman and Meijering19–Reference Pécot, Zengzhen, Boulanger, Salamero and Kervrann26) or even supported along a cytoskeleton network.(Reference Boulanger, Kervrann and Bouthemy20, Reference Kervrann27, Reference Lagache, Dauty and Holcman28) However, these dynamics are rarely prone to regime switching, though this feature is often observed in real applications.(Reference Briane, Kervrann and Vimond22) They also generally assume independence between particles. Regarding the time and location of appearance of new particles, the existing models (including those provided by the popular softwares) are unsophisticated if not ignoring this feature. A constant rate of birth is generally assumed and no interaction with the existing particles is considered for the location of appearance, ruling out any colocalized dynamics. The same restriction occurs for the dynamics of disappearance of particles. Consequently, there is still an avenue to improve the existing particle-based models in order to take into account this lack of features.
We propose in this contribution to leverage a tailored stochastic model introduced in Ref. (Reference Lavancier and Le Guével29), which is flexible enough to include all aforementioned features in an unified and theoretically well-grounded framework. In agreement with our objective, this so-called birth-death-move (BDM) spatial point process is a model for the dynamics of a system of particles, that move over time, while some new particles may appear in the cell and some existing particles may disappear. Moreover, each particle may be marked by a given label, for example, among different possible labeled proteins and/or different types of motion regimes, and this mark may change over time, for example, a particle may switch from one regime (e.g., Brownian) to another (e.g., directed motion). This switch of a mark is sometimes called a “mutation” in the literature, but we prefer here to use the term “transformation” to avoid misunderstanding with a genuine biological mutation. The trajectories can be driven by any continuous Markov diffusion model, that includes most models for individual trajectories previously considered in the literature, and some interactions may be introduced so that colocalization phenomena can be generated. The intensity of births, that govern the waiting time before the next appearance of a new particle, may depend on the current configuration of particles, and similarly for the intensity of deaths. For instance, we may design that the more biomolecules in the cell there are, the higher the death intensity is, implying a rapid disappearance. Some spatial effects may also be taken into account, in order to create distinct motion regimes in some regions of the cell, or to encourage some spatial regions for the appearance of a new particle, for example, nearby some existing particles due to colocalization.
In a nutshell, compared to existing particle-based stochastic models and softwares, our approach enables to simulate a vast variety of Markov trajectories for the system of particles, including interactions between them during their displacements, as well as in the dynamics of births and deaths, thus accounting for possible colocalization effects. Additionally, it allows for regime-switching within each individual trajectory.
The remainder of the article is organized as follows. In Section 2, we give the precise definition of our stochastic process, and we in particular list all ingredients needed to fully specify the model. An iterative construction is presented in Section 2.2, clarifying how the dynamics proceeds, and an effective simulation algorithm is formally detailed in the Appendix and made available online. In Section 2.3, we provide numerous examples for the specifications of the model, that we think are relevant for many real biomolecules dynamics. In Section 3, we demonstrate the potential of our approach by focusing on the joint dynamics of Langerin and Rab11 proteins being involved in membrane trafficking. We start by the inspection of a real dataset in order to calibrate judiciously the different parameters of the BDM model to be in agreement with this example. The dataset consists of a sequence of images acquired by 3D multi-angle TIRF (total internal reflection fluorescence) microscopy technique,(Reference Boulanger, Gueudry, Munch, Cinquin, Paul-Gilloteaux, Bardin, Guérin, Senger, Blanchoin and Salamero30) depicting the locations of Langerin and Rab11 proteins close to the plasma membrane of the cell, specifically over a distance of 1 μm in the z-axis. After some post-processing, the sequence shows a set of trajectories for both type of proteins, that follow different motion regimes, are spatially distributed within the cell in a specific way, and occur at different periods during the sequence, which is in perfect line with the dynamics of a BDM process. The observed trajectories for the Langerin channel are depicted on the leftmost plot of Figure 1. We compute a set of descriptors from this dataset in order to calibrate the parameters of our stochastic process, but also to create some benchmark features for the assessment of our synthetic sequences. Finally, in Section 3.2, we generate several simulated sequences and show that they exhibit comparable features as those observed in the real sequence. An example of generated trajectories is displayed in the rightmost plot of Figure 1. For this illustration, although the individual trajectories exhibit basic dynamics (they are independent, homogeneous in space and either follow Brownian, confined, or directed motions), the advantage of our approach lies in its ability to incorporate regime switching within these trajectories and to account for the colocalization phenomenon when new particles appear.
Supplementary materials, including the Python code for simulation, the raw data, and some further simulated sequences, are available in our online GitHub repository at https://github.com/balsollier-lisa/BDM-generator-for-bioimaging.
2. The mathematical model
2.1. Heuristic and notations
In order to mimic the dynamics of biomolecules, we consider a multitype BDM process with mutations, denoted by $ {\left({\mathbf{X}}_t\right)}_{t\ge 0} $ . This process is a generalization of BDM processes, as introduced in Ref. (Reference Lavancier and Le Guével29). In the following, to avoid misunderstanding we rather use the term “transformation” instead of “mutation,” as explained in Section 1. This section describes the spatiotemporal dynamics of $ {\left({\mathbf{X}}_t\right)}_{t\ge 0} $ and introduces some notation.
At each time $ t $ , $ {\mathbf{X}}_t $ is a collection of particles located in a bounded set $ \Lambda $ of $ {\mathrm{\mathbb{R}}}^d $ . Each particle is assigned a mark that represents a certain feature. We denote by $ \unicode{x1D544} $ the collection of possible marks. Through time, the particles move (possibly depending on their associated mark and in interaction with each other) and three sudden changes may occur, that we call “jumps”:
-
1. a “birth”: a new particle, assigned with a mark, may appear;
-
2. a “death”: an existing particle may disappear;
-
3. a “transformation”: the mark of an existing particle may change.
Example: In our biological application treated in Section 3, $ \Lambda $ represents a cell in dimension $ d=2 $ or $ d=3 $ . We observe inside this cell two types of particles, associated to Langerin and Rab11 proteins, and each of them moves according to three different possible regimes: Brownian, directed, or confined motion. For this example, each particle is therefore marked out of six possibilities, whether it is associated to Langerin (L) or Rab11 (R), and depending on its motion regime (1 to 3), so that $ \unicode{x1D544}=\left\{\left(L,1\right),\Big(L,2\Big),\Big(L,3\Big),\Big(R,1\Big),\Big(R,2\Big),\Big(R,3\Big)\right\} $ . Through time, each particle moves independently of the others according to its motion regime and eventually a new particle may appear, an existing one may disappear, and the motion regime of some particles may change.
We denote by $ n\left({\mathbf{X}}_t\right) $ the number of particles at time $ t $ . Each particle $ {x}_i\in {\mathbf{X}}_t $ , for $ i=1,\dots, n\left({\mathbf{X}}_t\right) $ , is decomposed as $ {x}_i=\left({z}_i,{m}_i\right) $ where $ {z}_i\in \Lambda $ stands for its position while $ {m}_i\in \unicode{x1D544} $ denotes its mark. Accordingly we have $ {\mathbf{X}}_t\in {\left(\Lambda \times \unicode{x1D544}\right)}^{n\left({\mathbf{X}}_t\right)} $ . (Strictly speaking the ordering of particles in $ {\mathbf{X}}_t $ does not matter, because any permutation of particles leads to the same collection of particles. We choose in this article to bypass this nuance and use the same notation as if $ {\mathbf{X}}_t $ was a vector a particles, even it is actually a set of particles.) Since the number of particles changes over time, the stochastic process $ {\left({\mathbf{X}}_t\right)}_{t\ge 0} $ takes its values in the space
To stress the fact that $ {\mathbf{X}}_t $ is not a simple value but encodes the positions and marks of a system of particles, we will say that this system at time $ t $ is in configuration $ {\mathbf{X}}_t $ .
To fully specify the dynamics of $ {\left({\mathbf{X}}_t\right)}_{t\ge 0} $ , we need the following ingredients:
-
1. A system of equations $ \left(\mathrm{Move}\right) $ that rules the way each particle of $ {\mathbf{X}}_t $ moves continuously between two jumps. We will typically consider a system of stochastic differential equations acting on the position of each particle, possibly depending on their associated mark and in interaction with the other particles;
-
2. Three continuous bounded functions $ \beta $ , $ \delta $ , and $ \tau $ from $ E $ to $ \left[0,\infty \right) $ , called birth, death and transformation intensity functions respectively, that govern the waiting times before a new birth, a new death, and a new transformation. At each time $ t $ , we may interpret $ \beta \left({\mathbf{X}}_t\right) dt $ as the probability that a birth occurs in the interval $ \left[t,t+ dt\right] $ , given that the system of particles is in the configuration $ {\mathbf{X}}_t $ , and similarly for $ \delta $ and $ \tau $ .
-
3. Three transition probability functions that indicate how each jump occurs:
-
• $ {p}^{\beta}\left(\left(z,m\right)|\mathbf{x}\right) $ : probability density function that the birth occurs at the position $ z $ with the mark $ m $ , given that there is a birth and that the system of particles is in configuration $ \mathbf{x} $ at the birth time;
-
• $ {p}^{\delta}\left({x}_i|\mathbf{x}\right) $ , for $ {x}_i\in \mathbf{x} $ : probability that the death concerns the particle $ {x}_i $ in $ \mathbf{x} $ , given that there is a death and that the system of particles is in configuration $ \mathbf{x} $ at the death time;
-
• $ {p}^{\tau}\left(\left({z}_i,{m}_i\right),m|\mathbf{x}\right) $ , for $ \left({z}_i,{m}_i\right)\in \mathbf{x} $ : probability that the particle $ {x}_i=\left({z}_i,{m}_i\right) $ in $ \mathbf{x} $ changes its mark and that this transformation leads to the new mark $ m\ne {m}_i $ , given that there is a transformation and that the system of particles is in configuration $ \mathbf{x} $ at the transformation time.
We provide in Section 2.3 some examples for the choice of these characteristics. Finally, we will denote by $ {T}_1,{T}_2,\dots $ the jump times of the process and we agree that $ {T}_0=0 $ .
2.2. Algorithmic construction
Assume we are given the characteristics of the process $ {\left({\mathbf{X}}_t\right)}_{t\ge 0} $ as introduced in the previous section, that are the system of equations $ \left(\mathrm{Move}\right) $ , the intensity functions $ \beta $ , $ \delta $ , $ \tau $ , and the transition probability functions $ {p}^{\beta } $ , $ {p}^{\delta } $ , and $ {p}^{\tau } $ . Then, starting from an initial configuration $ {\mathbf{X}}_0 $ at time $ {T}_0=0 $ , we construct iteratively the process in the time interval $ \left[0,T\right] $ as follows. Here we set $ \alpha =\beta +\delta +\tau $ to be the total intensity of jumps.
-
1. Generate $ n\left({\mathbf{X}}_0\right) $ continuous trajectories as solutions of $ \left(\mathrm{Move}\right) $ in the interval $ \left[0,T-{T}_0\right] $ , given the initial conditions $ {\mathbf{X}}_0 $ . Denote $ {\left({\mathbf{Y}}_t\right)}_{t\in \left[0,T-{T}_0\right]} $ these trajectories.
-
2. By flipping a coin, test whether the jump time $ {T}_1 $ occurs after $ T $ (this is with probability $ p $ ) or before $ T $ (this is with probability $ 1-p $ ), where
-
• If $ {T}_1>T $ , then $ {\mathbf{X}}_t={\mathbf{Y}}_{t-{T}_0} $ for all $ t\in \left[{T}_0,T\right] $ , which completes the simulation.
-
• Otherwise, we continue by generating $ {T}_1 $ in $ \left[{T}_0,T\right] $ and the associated jump as in the following.
3. Generate $ {T}_1-{T}_0 $ , given that $ {T}_1<T $ , according to the probability distribution
The process until the time $ {T}_1 $ is then given by the generated trajectories, that is,
4. Draw which kind of jump occurs at $ {T}_1 $ (we denote by $ {\mathbf{X}}_{T_1^{-}} $ the configuration of the process just before the jump, which is $ {\mathbf{Y}}_{T_1-{T}_0} $ by continuity of $ \left({\mathbf{Y}}_t\right) $ ):
-
• this is a birth with probability $ \beta \left({\mathbf{X}}_{T_1^{-}}\right)/\alpha \left({\mathbf{X}}_{T_1^{-}}\right) $ ;
-
• this is a death with probability $ \delta \left({\mathbf{X}}_{T_1^{-}}\right)/\alpha \left({\mathbf{X}}_{T_1^{-}}\right) $ ;
-
• this is a transformation with probability $ \tau \left({\mathbf{X}}_{T_1^{-}}\right)/\alpha \left({\mathbf{X}}_{T_1^{-}}\right) $ .
5. Generate the jump at $ t={T}_1 $ to get $ {\mathbf{X}}_{T_1} $ as follows:
-
• if this is a birth, generate the new particle $ x=\left(z,m\right) $ according to the probability density function $ {p}^{\beta}\left(\left(z,m\right)|{\mathbf{X}}_{T_1^{-}}\right) $ . Then set $ {\mathbf{X}}_{T_1}={\mathbf{X}}_{T_1^{-}}\cup \hskip0.2em \left(z,m\right) $ ;
-
• if this is a death, draw which particle $ {x}_i\in {\mathbf{X}}_{T_1^{-}} $ to delete according to the probability $ {p}^{\delta}\left({x}_i|{\mathbf{X}}_{T_1^{-}}\right) $ , for $ i=1,\dots, n\left({\mathbf{X}}_{T_1^{-}}\right) $ . Then set $ {\mathbf{X}}_{T_1}={\mathbf{X}}_{T_1^{-}}\backslash \hskip0.35em {X}_i $ ;
-
• if this is a transformation, draw which particle $ {x}_i=\left({z}_i,{m}_i\right)\in {\mathbf{X}}_{T_1^{-}} $ is transformed and generate its transformation according to $ {p}^{\tau}\left(\left({z}_i,{m}_i\right),m|{\mathbf{X}}_{T_1^{-}}\right) $ , for $ i=1,\dots, n\left({\mathbf{X}}_{T_1^{-}}\right) $ . Then set $ {\mathbf{X}}_{T_1}=\left({\mathbf{X}}_{T_1^{-}}\backslash \left({z}_i,{m}_i\right)\right)\cup \hskip0.2em \left({z}_i,m\right) $ .
6. Back to step 1 with $ {T}_0\leftarrow {T}_1 $ and $ {\mathbf{X}}_0\leftarrow {\mathbf{X}}_{T_1} $ in order to generate the new trajectories starting from $ {\mathbf{X}}_{T_1} $ and the next jump time $ {T}_2 $ , and so on.
In the first step of the above construction, the trajectories are generated up to the final time $ T $ . It is however very likely that the next jump occurs much before $ T $ so that it would be sufficient and computationally more efficient to generate these trajectories on a shorter time interval. We provide in the Appendix a formal algorithm of simulation of $ {\mathbf{X}}_t $ for $ t\in \left[0,T\right] $ , following the above construction and including the latter idea. This algorithm has been implemented in Python and is available in our GitHub repository.
From a theoretical side, note that the specific exponential form of the probability distribution of the inter-jump waiting time in step 3 is necessary to imply the interpretation of $ \beta $ , $ \delta $ , and $ \tau $ explained in the previous section. This exponential form also implies that $ {\left({\mathbf{X}}_t\right)}_{t\ge 0} $ is a Markov process, meaning that its future dynamics only depends on its present configuration. We refer to Ref. (Reference Lavancier and Le Guével29) for more details about these theoretical aspects.
2.3. Exemplified specifications of the model
2.3.1. The inter-jumps motion
Recall that during an inter-jump period, the process $ \left({\mathbf{X}}_t\right) $ has a constant cardinality $ n=n\left({\mathbf{X}}_t\right) $ and the marks of all its particles remain constant. We denote by $ \left({\mathbf{Y}}_t\right) $ a system of $ n $ such particles $ \left({z}_{i,t},{m}_i\right) $ , for $ i=1,\dots, n $ , where $ {z}_{i,t}\in \Lambda $ represents the position of the $ i $ th particle at time $ t $ and $ {m}_i\in \unicode{x1D544} $ is its constant mark, that is
In agreement with the construction of the previous section, the inter-jump trajectory of each particle of $ \left({\mathbf{X}}_t\right) $ will coincide with the $ n $ trajectories of $ \left({\mathbf{Y}}_t\right) $ during this period.
As a general example, we assume that $ \left({\mathbf{Y}}_t\right) $ follows the following system of stochastic differential equation, starting at $ t=0 $ at the configuration $ {\mathbf{y}}_0\in {\left(\Lambda \times \unicode{x1D544}\right)}^n $ ,
where the drift functions $ {v}_i $ take their values in $ {\mathrm{\mathbb{R}}}^d $ , the diffusion $ {\sigma}_i $ are nonnegative functions, and $ \left({B}_{i,t}\right) $ , $ i=1,\dots, n $ , are $ n $ independent standard Brownian motions in $ {\mathrm{\mathbb{R}}}^d $ . Here, $ {\mathbf{y}}_0 $ , $ {v}_i $ , and $ {\sigma}_i $ are free parameters to be chosen.
Some conditions on the drift and diffusion functions are necessary to ensure the existence and unicity of the solution of $ \left(\mathrm{Move}\right) $ . This holds for instance if these functions are Lipschitz,(Reference Øksendal31) a condition met for the following examples. In addition, since each particle is supposed to evolve in the bounded set $ \Lambda $ of $ {\mathrm{\mathbb{R}}}^d $ , we need in practice to force the trajectories of $ \left(\mathrm{Move}\right) $ to stay in $ \Lambda $ . This may be achieved by reflecting the trajectories at the boundary of $ \Lambda $ .
In its general form, $ \left(\mathrm{Move}\right) $ allows the motion of each particle to depend on its mark, but also on the position and mark of the other particles (that are part of $ {\mathbf{Y}}_t $ ). We detail several examples below, that may be realistic for biological applications.
Example 1 (Brownian motions): If $ {v}_i\left(t,{\mathbf{Y}}_t\right)=0 $ and $ {\sigma}_i\left(t,{\mathbf{Y}}_t\right)=\sigma $ (for $ \sigma >0 $ ) is constant, then each particle follows a Brownian motion with the same diffusion coefficient $ \sigma $ , independently of the other particles.
Example 2 (spatially varying diffusion coefficients): If $ {v}_i\left(t,{\mathbf{Y}}_t\right)=0 $ and $ {\sigma}_i\left(t,{\mathbf{Y}}_t\right)={\sigma}_{m_i}\left({z}_{i,t}\right) $ , where $ {\sigma}_{m_i} $ is a positive function defined on $ \Lambda $ , then each particle follows an independent diffusive motion, where the diffusive coefficient depends on the associated mark and may vary in space. For instance, assume that $ \Lambda ={\Lambda}_1\cup {\Lambda}_2 $ with $ {\Lambda}_1\cap \hskip0.3em {\Lambda}_2=\varnothing $ and that for $ m\in \unicode{x1D544} $ ,
where $ {\sigma}_{1,m}>0 $ , $ {\sigma}_{2,m}>0 $ . Then each particle with mark $ m $ follows locally in $ {\Lambda}_1 $ a Brownian motion with diffusion coefficient $ {\sigma}_{1,m} $ and locally in $ {\Lambda}_2 $ a Brownian motion with diffusion coefficient $ {\sigma}_{2,m} $ . Note that as such, $ {\sigma}_m $ is not Lipschitz and it needs to be smooth so as to fit the theoretical setting. This may be achieved by taking the convolution of $ {\sigma}_m $ by a bump function.
Example 3 (directed and confined motions): If $ {v}_i\left(t,{\mathbf{Y}}_t\right)={v}_{m_i}\left({z}_{i,t}\right) $ and $ {\sigma}_i\left(t,{\mathbf{Y}}_t\right)={\sigma}_{m_i} $ , where $ {v}_{m_i} $ is defined on $ \Lambda $ and $ {\sigma}_{m_i}>0 $ , then each particle evolves independently of each other with a drift and a diffusion coefficient that depend on its mark. This example includes the directed motion considered in Ref. (Reference Briane, Kervrann and Vimond22) when $ {v}_{m_i}(z)={v}_{m_i} $ is a constant drift. It also includes the Ornstein–Uhlenbeck dynamics, also considered in Ref. (Reference Briane, Kervrann and Vimond22), when $ {v}_{m_i}\left({z}_{i,t}\right)=-{\lambda}_{m_i}\left({z}_{i,t}-{z}_{i,0}\right) $ , where $ {\lambda}_{m_i}>0 $ can be interpreted as a force of attraction toward the initial position $ {z}_{i,0} $ , leading to a confined trajectory.
Example 4 (interacting particles): In this example, we show how we can include interactions between the particles through a Langevin dynamics. To do so, we introduce, for $ m,{m}^{\prime}\in \unicode{x1D544} $ , pairwise interaction functions $ {\Phi}_{m,{m}^{\prime }} $ , as considered in statistical physics: For $ r>0 $ , $ {\Phi}_{m,{m}^{\prime }}(r) $ represents the pairwise interaction between a particle with mark $ m $ and a particle with mark $ {m}^{\prime } $ at a distance $ r $ apart. If $ {\Phi}_{m,{m}^{\prime }}(r)=0 $ , there is no interaction, if $ {\Phi}_{m,{m}^{\prime }}(r)>0 $ there is inhibition between the two particles at distance $ r $ , and if $ {\Phi}_{m,{m}^{\prime }}(r)<0 $ there is attraction. Examples of inhibitive interaction functions can be found in Ref. (Reference Lavancier, Le Guével and Manent32). The (overdamped) Langevin dynamics associated to these interactions reads as $ \left(\mathrm{Move}\right) $ with $ {\sigma}_i\left(t,{\mathbf{Y}}_t\right)={\sigma}_{m_i} $ , $ {\sigma}_{m_i}>0 $ , and
where $ \nabla $ denotes the Gradient operator. Accordingly, each particle moves in a direction that tend to decrease the value of the pairwise interaction function with the other particles.
Example 5 (colocalized particles): Assume that some particles, say with mark $ m $ , are thought to be colocalized with particles having the mark $ \tilde{m} $ . This means that we expect the former to be localized nearby the latter and to follow approximately the same motion. Specifically, to let the particle $ i $ with mark $ m $ be colocalized with the particle $ j $ with mark $ \tilde{m} $ , we may simply define $ {z}_{i,t}={z}_{j,t}+{\sigma}_i{B}_{i,t} $ , $ t>0 $ , where $ {B}_{i,t} $ is a standard Brownian motion in $ {\mathrm{\mathbb{R}}}^d $ representing the deviation of the trajectory $ i $ around the trajectory $ j $ , and $ {\sigma}_i>0 $ quantifies the strength of this deviation. Here $ {z}_{j,t} $ may be defined as in the previous examples, for instance as the typical trajectory of a particle with mark $ \tilde{m} $ .
2.3.2. The intensity functions
Recall that the intensity functions $ \beta $ , $ \delta $ , and $ \tau $ rule the waiting times until the next birth, death, and transformation, respectively. Heuristically, the probability that a birth occurs in the time interval $ \left[t,t+ dt\right] $ given that the particles are in configuration $ {\mathbf{X}}_t $ is $ \beta \left({\mathbf{X}}_t\right) dt $ , and similarly for $ \delta $ and $ \tau $ . As a consequence these probabilities may evolve over time according to the configuration of particles, making for instance a death more likely to happen when there are many particles or a high concentration of them in some region, due to competition. We provide some natural examples below. For each example, any of $ \beta $ , $ \delta $ , or $ \tau $ can be set similarly, even if we focus only on one of them.
Example 6 (constant intensities): The simplest situation is when the intensity functions are constant, for instance $ \beta \left({\mathbf{X}}_t\right)=\beta $ with $ \beta >0 $ . Then births appear at a constant rate and we can expect that in average $ \beta \times \left(s-t\right) $ new particles appear during the interval $ \left[s,t\right] $ .
Example 7 (intensities depending on the cardinality): If $ \delta \left({\mathbf{X}}_t\right)=\delta n\left({\mathbf{X}}_t\right) $ , with $ \delta >0 $ , then the more particles there are, the more deaths we observe. This is a natural situation when each particle is thought to have a constant death rate $ \delta $ , so that the total death intensity for the system of particles at time $ t $ is just the sum of them, that is $ \delta n\left({\mathbf{X}}_t\right) $ .
Example 8 (spatially varying intensities): Assume that the mark of a particle (say its motion regime) has more chance to change in some region of $ \Lambda $ than another, then the transformation intensity $ \tau $ may reflect this dependency. Let for instance $ \Lambda ={\Lambda}_1\cup {\Lambda}_2 $ with $ {\Lambda}_1\cap {\Lambda}_2=\varnothing $ and define $ \tau \left({\mathbf{X}}_t\right)={\tau}_1{n}_1\left({\mathbf{X}}_t\right)+{\tau}_2{n}_2\left({\mathbf{X}}_t\right) $ where $ {\tau}_1<{\tau}_2 $ and $ {n}_1\left({\mathbf{X}}_t\right) $ (resp. $ {n}_2\left({\mathbf{X}}_t\right) $ ) denotes the number of particles in $ {\Lambda}_1 $ (resp. in $ {\Lambda}_2 $ ). Then for a given cardinality $ n\left({\mathbf{X}}_t\right) $ , the more proportion of particles in $ {\Lambda}_2 $ , the more transformations happen. Note that in order to be rigorous, we should consider a continuous version of $ \tau $ , which can be achieved by convolution with a bump function.
Example 9 (transformation due to colocalization): Assume that some $ m $ -particles (that are the particles with mark $ m $ ) can be colocalized with some $ \tilde{m} $ -particles. Assume in addition that the particles are assigned a second mark that encodes their motion regime (e.g., diffuse, confined, or directed). Eventually, during the dynamics of particles, a noncolocolized $ m $ -particle may become colocolized with a $ \tilde{m} $ -particle, meaning that it becomes $ D $ -close to a $ \tilde{m} $ -particle, where $ D $ is some prescribed colocalization distance. If so, we may expect that the motion regime of the $ m $ -particle becomes similar as the $ \tilde{m} $ -particle, so that a transformation must occur. Let $ {n}_{D,m,\tilde{m}}\left({\mathbf{X}}_t\right) $ be the number of $ D $ -close pairs of particles with marks $ m $ and $ \tilde{m} $ , whose motion regimes are different. Then we may define $ \tau \left({\mathbf{X}}_t\right)=\tau \hskip0.1em {n}_{D,m,\tilde{m}}\left({\mathbf{X}}_t\right) $ , for some $ \tau >0 $ , so that a transformation (of motion regime here) is very likely to occur when the aforementioned situation happen. Note that if $ \tau $ is large, such transformation will quickly happen as soon as $ {n}_{D,m,\tilde{m}}\left({\mathbf{X}}_t\right)=1 $ , and so $ {n}_{D,m,\tilde{m}}\left({\mathbf{X}}_t\right)>1 $ will be unlikely to be observed. Here again, a smooth version of $ \tau $ can be introduced by convolution to ensure its continuity.
2.3.3. The transition probability functions
We detail examples for the three possible transitions, in order below: births, deaths, and transformations.
For the births, remember that $ {p}^{\beta}\left(\left(z,m\right)|\mathbf{x}\right) $ denotes the probability density function (pdf) that a particle appears at the position $ z $ with the mark $ m $ , given that the system of particles are in configuration $ \mathbf{x} $ . To set this probability, two approaches are possible:
-
1. First drawing the mark $ m $ of the new particle with respect to some probability $ {p}^{\beta}\left(m|\mathbf{x}\right) $ , then the position of the new particle given its mark according to some pdf $ {p}^{\beta}\left(z|\mathbf{x},m\right) $ . This leads to the decomposition $ {p}^{\beta}\left(\left(z,m\right)|\mathbf{x}\right)={p}^{\beta}\left(m|\mathbf{x}\right){p}^{\beta}\left(z|\mathbf{x},m\right) $ .
-
2. First generating the position of the new particle with respect to some pdf $ {p}^{\beta}\left(z|\mathbf{x}\right) $ , then its mark $ m $ given the position with probability $ {p}^{\beta}\left(m|\mathbf{x},z\right). $ This leads to the decomposition $ {p}^{\beta}\left(\left(z,m\right)|\mathbf{x}\right)={p}^{\beta}\left(z|\mathbf{x}\right){p}^{\beta}\left(m|\mathbf{x},z\right) $ .
Example 10 (uniform births): This is the simple example where the births do not depend on the environment, are uniform in space and the marks are drawn with respect to some prescribed probabilities $ {p}_m $ , where $ {\sum}_{m\in \unicode{x1D544}}{p}_m=1 $ . The two above approaches then coincide with $ {p}^{\beta}\left(m|\mathbf{x}\right)={p}^{\beta}\left(m|\mathbf{x},z\right)={p}_m $ and $ {p}^{\beta}\left(z|\mathbf{x},m\right)={p}^{\beta}\left(z|\mathbf{x}\right)=1/\mid \Lambda \mid $ for $ z\in \Lambda $ .
Example 11 (colocalized births): We adopt here the first approach above. We first draw the marks independently of the environment by setting $ {p}^{\beta}\left(m|\mathbf{x}\right)={p}_m $ with $ {\sum}_{m\in \unicode{x1D544}}{p}_m=1 $ , as in the previous example. Second, in order to generate the position of a new $ m $ -particle, thought to be colocalized with the $ \tilde{m} $ -particles, we may use a mixture of isotropic normal distribution, centered at each $ \tilde{m} $ -particle, with deviation $ \sigma >0 $ . Denoting by $ \tilde{n}\left(\mathbf{x}\right) $ the number of $ \tilde{m} $ -particles in $ \mathbf{x} $ and $ {\tilde{z}}_i $ their positions ( $ i=1,\dots, \tilde{n}\left(\mathbf{x}\right) $ ), this means that
Note that to be rigorous $ {p}^{\beta}\left(z|\mathbf{x},m\right) $ should be restricted to $ \Lambda $ with a proper normalization, otherwise some particles might be generated outside $ \Lambda $ . We omit these details.
Example 12 (spatially dependent new marks): We may adopt the second approach by first generating a uniform position for the new particle, that is, $ {p}^{\beta}\left(z|\mathbf{x}\right)=1/\mid \Lambda \mid $ for $ z\in \Lambda $ , and second by drawing the mark according to the generated position. Let for instance $ \Lambda ={\Lambda}_1\cup {\Lambda}_2 $ with $ {\Lambda}_1\cap {\Lambda}_2=\varnothing $ and set
where $ {\sum}_{m\in \unicode{x1D544}}{p}_{1,m}={\sum}_{m\in \unicode{x1D544}}{p}_{2,m}=1 $ . Then depending on the position, the distribution of the marks may be different.
We now focus on the death transition, namely the probability $ {p}^{\delta}\left({x}_i|\mathbf{x}\right) $ , for $ {x}_i\in \mathbf{x} $ , that the particle $ {x}_i $ in $ \mathbf{x} $ disappears when there is a death.
Example 13 (uniform deaths): The simplest example is when a death occurs uniformly over the existing particles, that is $ {p}^{\delta}\left({x}_i|\mathbf{x}\right)=1/n\left(\mathbf{x}\right) $ for $ i=1,\dots, n\left(\mathbf{x}\right) $ .
Example 14 (deaths due to competition): We may imagine that, due to competition, a particle is more likely to disappear if there are too many neighbors around it. Let $ {n}_D\left({x}_i\right) $ be the number of neighboring particles around $ {x}_i $ within distance $ D>0 $ . To take into account the competition at distance $ D>0 $ , we may define $ {p}^{\delta}\left({x}_i|\mathbf{x}\right)={n}_D\left({x}_i\right)/{\sum}_{j=1}^{n\left(\mathbf{x}\right)}{n}_D\left({x}_j\right) $ . Similarly, if relevant, we may count the number of neighbors of a certain mark only.
Finally, we focus on $ {p}^{\tau}\left(\left({z}_i,{m}_i\right),m|\mathbf{x}\right) $ , for $ \left({z}_i,{m}_i\right)\in \mathbf{x} $ , which is the probability that the particle $ {x}_i=\left({z}_i,{m}_i\right) $ in $ \mathbf{x} $ changes its mark from $ {m}_i $ to $ m\ne {m}_i $ , when a transformation happens. Similarly, as for the birth transition probability, it is natural to decompose this probability as
where $ {p}^{\tau}\left(\left({z}_i,{m}_i\right)|\mathbf{x}\right) $ represents the probability to choose the particle $ \left({z}_i,{m}_i\right) $ in the configuration $ \mathbf{x} $ , in order to change its mark, and $ {p}^{\tau}\left(m|\mathbf{x},\left({z}_i,{m}_i\right)\right) $ is the probability to choose the new mark $ m $ given that the transformed particle is located at $ {z}_i $ with mark $ {m}_i $ .
Example 15 (transformations independent on the environment): A typical situation is when the particle to transform is drawn uniformly over the existing particles, that is $ {p}^{\tau}\left(\left({z}_i,{m}_i\right)|\mathbf{x}\right)=1/n\left(\mathbf{x}\right) $ , and the transformation is carried out independently on the environment, according to a transition matrix with entries $ {p}_{m,{m}^{\prime }}\ge 0 $ , $ m,{m}^{\prime}\in \unicode{x1D544} $ , representing the probability to be transformed from mark $ m $ to mark $ {m}^{\prime } $ . Here, for all $ m\in \unicode{x1D544} $ , we assume $ {p}_{m,m}=0 $ in order to ensure a genuine transformation, and of course $ {\sum}_{m^{\prime}\in \unicode{x1D544}}{p}_{m,{m}^{\prime }}=1 $ . With this formalism, we thus have $ {p}^{\tau}\left(m|\mathbf{x},\left({z}_i,{m}_i\right)\right)={p}_{m_i,m} $ .
Example 16 (spatially dependent transformations): To make the previous example spatially dependent, introduce $ {p}_m(z) $ , a pdf in $ \Lambda $ representing the locations in $ \Lambda $ where a particle with mark $ m $ is more or less likely to be transformed. Then we may set
where $ {n}_i\left(\mathbf{x}\right) $ denotes the number of particles with marks $ {m}_i $ in $ \mathbf{x} $ and $ {z}_j^{\left({m}_i\right)} $ , $ j=1,\dots, {n}_i\left(\mathbf{x}\right) $ , their positions. In this expression $ {n}_i\left(\mathbf{x}\right)/n\left(\mathbf{x}\right) $ is a weight accounting for the prevalence of mark $ {m}_i $ in $ \mathbf{x} $ and the sum in the denominator is a normalization so that the probabilities sum to 1. Note that if $ {p}_m(z) $ is the uniform pdf on $ \Lambda $ , then we recover the uniform distribution $ {p}^{\tau}\left(\left({z}_i,{m}_i\right)|\mathbf{x}\right)=1/n\left(\mathbf{x}\right) $ . Furthermore, once the particle is chosen as above, we may apply a spatially dependent transformation as follows. Let $ \Lambda ={\Lambda}_1\cup {\Lambda}_2 $ with $ {\Lambda}_1\cap {\Lambda}_2=\varnothing $ and let two different transition matrices with respective entries $ {p}_{m,{m}^{\prime}}^{(1)} $ and $ {p}_{m,{m}^{\prime}}^{(2)} $ , for $ m,{m}^{\prime}\in \unicode{x1D544} $ . Then we may set
Accordingly, the transformation does not follow the same distribution, whether the chosen particle to be transformed is located in $ {\Lambda}_1 $ or $ {\Lambda}_2 $ .
Example 17 (transformation due to colocalization): Assume that we are in the same situation as in Example 9 where $ m $ -particles can be colocalized to $ \tilde{m} $ -particles. We assume like in this example that a transformation occurs if $ {n}_{D,m,\tilde{m}}\left({\mathbf{X}}_t\right)\ge 1 $ , where $ {n}_{D,m,\tilde{m}}\left({\mathbf{X}}_t\right) $ denotes the number of $ D $ -close pairs of particles with marks $ m $ and $ \tilde{m} $ , whose motion regimes are different. Then, when a transformation happens, we may choose the $ m $ -particle to be transformed uniformly over those $ m $ -particles that are $ D $ -close to a $ \tilde{m} $ -particle with a different motion regime. Then the transformation makes the motion regime of the selected $ m $ -particle similar as the motion regime of its closest $ \tilde{m} $ -particle.
3. Application to the joint dynamics of Langerin/Rab11 proteins
3.1. Description of the dataset
The dataset we consider comes from the observation by a 3D multi-angle TIRF (total internal reflection fluorescence) microscopy technique of the intracellular trafficking of YFP Langerin and m-Cherry Rab11 proteins in a RPE1 living cell,(Reference Boulanger, Gueudry, Munch, Cinquin, Paul-Gilloteaux, Bardin, Guérin, Senger, Blanchoin and Salamero30) specifically projected along the z-axis onto the 2D plane close to the plasma membrane. This provides a 2D image sequence of 1199 frames, each lasting 140 ms and showing the simultaneous locations of the two types of proteins. The two images at the top of Figure 2 depict the first frame of the raw sequence for the Langerin fluorescent channel and the Rab11 fluorescent channel, respectively, recorded simultaneously using a dual-view optical device. Note that the cell adheres on a fibronectin micropattern, which constrains intracellular constituents such as cytoskeleton elements and gives a reproducible shape, explaining the “umbrella” shape of the cell. These raw sequences are post-processed following Refs. (Reference Pécot, Bouthemy, Boulanger, Chessel, Bardin, Salamero and Kervrann33, Reference Pécot, Kervrann, Bardin, Goud and Salamero34), then each bright spot is represented by a single point, and we apply the U-track algorithm(Reference Jaqaman, Loerke, Mettlen, Kuwata, Grinstein, Schmid and Danuser35) to estimate particle trajectories. The bottom images of Figure 2 show the resulting trajectories for the Langerin channel and the Rab11 channel, respectively. These trajectories have been further analyzed by the method developed in Ref. (Reference Briane, Kervrann and Vimond22) to classify them into three diffusion regimes: Brownian, directed, and confined, which corresponds to the blue, red, and green colors, respectively, in Figure 2.
To be more specific in the analysis of all trajectories, we fit three parametric models to each of them, following Ref. (Reference Briane, Kervrann and Vimond22), depending on their regime:
-
1. for a Brownian regime (in blue): a Brownian motion,
-
2. for a directed motion regime (in red): a Brownian motion with constant drift,
-
3. for a confined motion regime (in green): an Ornstein–Uhlenbeck process.
Each trajectory has its individual parameters (see Examples 1 and 3), estimated by maximum likelihood.(Reference Liptser and Shiryaev36) Furthermore, some trajectories may change from one regime to another, which corresponds to a “transformation” in the BDM model that will be specified in the next section.
Figure 3 summarizes different features of the obtained trajectories for the Langerin sequence (the same characteristics have been analyzed for the Rab11 sequence, but are not detailed here). The histograms at the bottom display the duration of all trajectories (in frames), according to their regime. We can observe that the (blue) Brownian and (red) directed trajectories have quite a short lifetime in average, in comparison with the confined trajectories (in green). The top-right boxplots represent the distribution of the number of particles per frame, according to their regime: there is a majority of Brownian motions, followed by confined motions and a minority of directed motions. Finally, the top-left circular histogram aims at depicting the orientation of the drift vectors for the directed (red) trajectories. Specifically, for this plot, we have recorded the deviation of the drift angle (in degrees) with respect to the direction toward the center of the cell. For instance, this deviation is $ {0}^{\circ } $ if the drift goes toward the center, and $ {180}^{\circ } $ if it goes in the opposite direction. It appears from this plot that most deviation angles are around $ {0}^{\circ } $ or $ {180}^{\circ } $ , meaning that the red trajectories mainly move in a radial direction going to (or starting from) the center of the cell.
The above descriptors will be helpful to calibrate the parameters of the BDM model in the next section and they will also serve as benchmarks to evaluate the quality of our simulations. However, it is important to keep in mind that they come with some approximations and errors induced by imperfect tracking algorithms. In particular, no trajectory can last less than 10 frames in the data, which is a minimal length of detection for our tracking method. It is also clear in the bottom plots of Figure 1, that some directed trajectories appear wrongly in blue, which can be explained by the multiple testing procedure of Ref. (Reference Briane, Kervrann and Vimond22) that aims at minimizing the number of false positives (that are bad green or bad red trajectories) to the detriment of possibly too many false negatives (that are wrong blue trajectories).
Concerning the births and deaths of trajectories, we summarize in Table 1 their total numbers observed in the real dataset, according to the type of proteins and motion regime. The number of regime transformations is in turn given in Table 2 for the Langerin proteins. For the Rab11 proteins, only one switching from a Brownian motion to a confined motion was observed during the sequence.
To address in detail these jumps dynamics, we leverage the study carried out for the same dataset in Ref. (Reference Lavancier and Le Guével29), where it has been concluded that for each type of proteins and motion regimes, the birth intensity is constant, like in Example 6, while the death intensity is proportional to the number of existing particles, like in Example 7. Given the small number of observed motion regime transformations, its intensity can also be considered as constant. Concerning the transition probability functions, the deaths occur uniformly over all existing particles, like in Example 13. As to the birth transition, there is no reason to choose another density than the uniform distribution over the cell for the Rab11 proteins (Example 10). But due to colocalization (as observed for this dataset in Ref. (Reference Lavancier, Pécot, Zengzhen and Kervrann18)), the birth density for the Langerin positive structures can be approximated by a mixture between a uniform distribution, for $ 93\% $ of the Langerin births, and a colocalized density around the existing Rab11 vesicles, like in Example 11, for $ 7\% $ of the Langerin births. These proportions, along with the other parameters, have been estimated by maximum likelihood, the theoretical foundations of which can be found in Refs. (Reference Löcherbach37, Reference Löcherbach38) for stochastic models that include the BDM model. Note however that at this step, the goal is to provide a guideline to set the parameters of the BDM model in order to generate realistic realizations, as carried out in the next section. For this reason, any alternative estimation method or biological expertizes to set the parameters could be appropriate.
3.2. Simulation of synthetic sequences
3.2.1. Model parameters setting
Based on the data analysis of the previous section, we are now in position to specify all characteristics of the BDM process with transformations presented in Section 2, so as to mimic the joint dynamics of Langerin/Rab11 proteins within a cell. To make the connection with the theoretical notation, the region of interest $ \Lambda $ represents the cell in dimension $ d=2 $ . Each particle in $ \Lambda $ will be marked by a label from the set $ \unicode{x1D544}=\left\{\left(L,1\right),\Big(L,2\Big),\Big(L,3\Big),\Big(R,1\Big),\Big(R,2\Big),\Big(R,3\Big)\right\} $ , where $ L $ stands for the Langerin proteins, $ R $ for the Rab11 proteins, and the number $ 1,2 $ , or $ 3 $ indicates the motion regime of the particle: Brownian, directed, or confined, respectively.
Concerning the motion of each trajectory, it follows the regime indicated by its mark and is in agreement with the observed trajectories from the real dataset detailed in the previous section, see also Examples 1 and 3:
-
1. For a Brownian motion, we draw the diffusion coefficient according to the empirical distribution of the diffusion coefficients estimated from the Brownian motions of the real dataset, for the same type of proteins ( $ L $ or $ R $ );
-
2. For a directed motion, we generate a Brownian motion with constant drift, with the same strategy for the choice of the diffusion coefficient, and where the drift vector is chosen as follows: it is by default oriented toward the center of the cell, this orientation being subjected to a deviation drawn from the empirical distribution depicted in the top-left circular histogram of Figure 3. In addition, its norm is drawn from the empirical distribution of the drift norms observed from the real dataset. Here again, each set of parameters is distinct for the Langerin and Rab11 proteins;
-
3. For a confined motion, we generate an Ornstein–Uhlenbeck process with diffusion coefficient $ 0.4 $ for all particles (which is the average from the real dataset), and parameter $ \lambda =5.96 $ for the Langerin proteins, and $ \lambda =7.41 $ for the Rab11 proteins.
In these values, the unit is pixels, and one pixel is $ 160\times 160 $ nm $ {}^2 $ in our images.
Concerning the intensity functions, we set the birth intensity and the transformation intensity to constant values, as concluded from the real-data analysis. In agreement with Table 1, the total birth intensity can be estimated by $ \beta \left(\mathbf{x}\right)=1248/167.86=7.43 $ , whatever the configuration $ \mathbf{x} $ of particles is, because 1248 is the total number of observed births and $ 167.86 $ is the total time length of the sequence (in seconds). Similarly, we set $ \tau \left(\mathbf{x}\right)=16/167.86=0.095 $ since 16 transformations have been observed in the real sequence. For the death intensity, for each mark $ m\in \unicode{x1D544} $ , we let it proportional to the number of particles, that is $ {\delta}_m{n}_m\left(\mathbf{x}\right) $ , where $ {n}_m\left(\mathbf{x}\right) $ is the number of particles with mark $ m $ in the configuration $ \mathbf{x} $ and $ {\delta}_m $ has been estimated from the real-dataset as follows: $ {\delta}_m=0.17 $ if $ m=\left(L,1\right) $ , $ {\delta}_m=0.14 $ if $ m=\left(L,2\right) $ , $ {\delta}_m=0.07 $ if $ m=\left(L,3\right) $ , $ {\delta}_m=0.21 $ if $ m=\left(R,1\right) $ , $ {\delta}_m=0.25 $ if $ m=\left(R,2\right) $ , and $ {\delta}_m=0.08 $ if $ m=\left(R,3\right) $ . The total death intensity for the configuration $ \mathbf{x} $ of particles is then $ \delta \left(\mathbf{x}\right)={\sum}_{m\in \unicode{x1D544}}{\delta}_m{n}_m\left(\mathbf{x}\right) $ .
Finally, we set the transition probability functions as follows. For the death transition, the probability to kill the particle $ {x}_i=\left({z}_i,{m}_i\right) $ in the configuration $ \mathbf{x} $ is set to
which means that we first draw the mark $ m $ with probability $ {\delta}_m{n}_m\left(\mathbf{x}\right)/\delta \left(\mathbf{x}\right) $ and then the particle uniformly among all existing particles with mark $ m $ . For the transformation transition, we first select the type of proteins to transform with probability $ 15/16 $ for Langerin and $ 1/16 $ for Rab11, in line with the transformations rates observed in the real sequence, second we choose a particle uniformly among all existing particles of this type, and third, as in Example 15, we apply a regime transformation with respect to the following transition matrix (from the regime in rows to the regime in columns):
This matrix is in agreement with Table 2 concerning the Langerin proteins, where we have added some possible transitions from regime 1 to 2, and from regime 2 to 1, that appear to us likely to occur, even if they were not observed in the (quite rare) transformations in the real-sequence. The same transition matrix has been set for the Rab11 proteins, since there is not enough observed transformations in the real sequence (only one) to design a finer choice.
It remains to set the birth transition probability. First, we select which type of protein to create: following Table 1, it is a Langerin protein with probability $ 747/1248 $ and a Rab11 protein with probability $ 501/1248 $ . If the selected type is Rab11, then it is generated uniformly in the cell with regime $ 1 $ with probability $ 0.784 $ , $ 2 $ with probability $ 0.048 $ , and $ 3 $ with probability $ 0.168 $ , which corresponds to the relative proportion of births of each regime over all births for the real Rab11 sequence. If the selected type is Langerin, then we flip a coin for colocalization with probability $ p=0.07 $ . If there is no colocalization, then the new Langerin protein is generated uniformly in the cell with regime $ 1 $ with probability $ 0.807 $ , $ 2 $ with probability $ 0.104 $ , and $ 3 $ with probability $ 0.088 $ (the observed relative proportions of births). If there is colocalization, then the new Langerin protein is generated around an existing Rab11 protein according to the density (1) in Example 11, where by maximum likelihood estimation $ \sigma =1.1 $ . In this case, the regime of the new Langerin protein and its drift vector for a directed motion are similar as those of its colocalized Rab11 protein.
3.2.2. Analysis of resulting simulations
We have generated 100 sequences following the model of the previous section, during the same time length as the real sequence of Section 3.1, that is $ 167.86 $ s for 1199 frames. Some descriptors concerning the generated Langerin trajectories coming from two simulated sequences are depicted in Figures 4 and 5, that are to be compared with the similar outputs of the real data in Figures 2 and 3. The results for other simulated sequences can be seen in our GitHub repository. We have also summarized the mean number of births and deaths over the 100 simulated sequences in Table 3, to be compared with Table 1. Both graphical and quantitative results demonstrate that our model is able to create a joint dynamics with comparable features as those observed in the real-data sequence.
Data availability statement
The real data presented in the manuscript and replication code may be obtained from the authors and can be found in our GitHub repository at https://github.com/balsollier-lisa/BDM-generator-for-bioimaging.
Acknowledgments
We thank the PICT imaging platform of Institut Curie, member of the France-BioImaging infrastructure (ANR-10-INBS-04-07), for providing real TIRM image sequences. We are grateful to Cesar Augusto Valades-Cruz for assistance in tracking the particles from the data sequences and to Vincent Briane for fruitful discussion and for providing the code developed in Ref. (Reference Briane, Kervrann and Vimond22).
Author contribution
F.L. and C.K. conceived and designed the study. J.S. conducted data gathering. L.B. and F.L. performed statistical analyses and simulations. All authors contributed to the writing and approved the final submitted draft.
Funding statement
This research was supported by the 80|Prime grant from the CNRS.
Competing interest
The authors declare no competing interests.
Ethical standard
The research meets all ethical guidelines, including adherence to the legal requirements of the study country.
A. Appendix
A.1. Algorithm for simulation
We provide in this appendix a formal algorithm to simulate a BDM process with mutations (or transformations), following the construction of Section 2.2. Its implementation is available in our GitHub repository at https://github.com/balsollier-lisa/BDM-generator-for-bioimaging. It is a refinement of the algorithm introduced in Ref. (Reference Lavancier and Le Guével29) for a BDM process (without transformations). The idea is to generate the inter-jump move on a small time length $ \Delta $ , then to test whether a jump has occurred during this period (this is with probability $ p $ in the following Algorithm 1). If so, we generate the jump time and the jump. If not, we continue the simulation of the inter-jump move on a further time length $ \Delta $ , test whether a jump has occurred, and so on. The algorithm is valid whatever $ \Delta >0 $ is, but an efficient choice is to set a small value for $ \Delta $ . A default recommendation is to set $ \Delta $ as the discretization step used to simulate the trajectories in the $ \mathrm{Algo}.\mathrm{Move} $ algorithm (which is an input of our algorithm, see below).
We let as in Section 2.2 $ \alpha =\beta +\delta +\tau $ and $ {T}_0=0 $ . We denote in Algorithm 2 $ {\mathbf{X}}_{T_j^{-}} $ the configuration just before the jump time $ {T}_j $ . In order to run Algorithms 1 and 2, we need the following inputs:
-
1. $ T>0 $ : final time of simulation;
-
2. $ {\mathbf{X}}_0\in E $ : initial configuration of particles;
-
3. $ \Delta >0 $ : small time length for piecewise simulation;
-
4. $ \beta $ , $ \delta $ , $ \tau $ : intensity functions of births, deaths, and transformations;
-
5. $ {p}^{\beta } $ , $ {p}^{\delta } $ , $ {p}^{\tau } $ : transition probability for a birth, a death, and a transformation;
-
6. $ \mathrm{Algo}.\mathrm{Move}\left({\mathbf{y}}_0,\Delta \right) $ : algorithm that returns, for $ {\mathbf{y}}_0\in E $ and $ \Delta >0 $ , $ n\left({\mathbf{y}}_0\right) $ (discretized) trajectories on $ \left[0,\Delta \right] $ following the system of SDEs $ \left(\mathrm{Move}\right) $ with initial configuration $ {\mathbf{y}}_0 $ .
Algorithm 1. Simulation on the time interval $ \left[0,T\right] $
Algorithm 2. Simulation of the jump at $ t={T}_{j+1} $ given $ {\mathbf{X}}_{T_{j+1}^{-}} $
A.2. Conclusion and perspectives
We have leveraged an original stochastic model, namely a multitype BDM process with transformations, in order to simulate biomolecules dynamics within a cell. This stochastic process not only models the individual trajectory of particles with any Markovian dynamics, but it is also able to generate the appearance (i.e., birth), disappearance (i.e., death), and regime switching (i.e., transformation) of each trajectory over time. Importantly, interactions between particles can be included, accounting for the possible colocalization phenomenon. The model is very flexible and is specified thanks to three sets of parameters: (1) a system describing the set of trajectories (typically a system of stochastic differential equations); (2) the intensity functions, ruling the waiting time before a new appearance, a disappearance or a switching; (3) the transition probability functions, driving where a new particle appears when there is a birth, which particle disappears when there is a death, and which particle switches its regime (and how) when there is a transformation. Numerous examples of these model specifications have been detailed. As an illustration, we demonstrated the relevance of this approach by generating the joint dynamics of Langerin/Rab11 proteins within a cell, based on a preliminary data-based analysis in order to finely calibrate the model.
Since the model is very flexible, an important step is the choice of model characteristics and parameters. The calibration carried out for our illustration is specific to the application at hand, and of course another calibration must be carried out for another application. In our case, we used one observed sequence of Langerin/Rab11 proteins. In order to improve the choice of parameters, a deeper empirical study based from several image sequences might help calibrating robustly the model. Once the parameters are fitted, the simulation of a sequence is quite fast: about one minute on an regular laptop for the generation of 2000 frames containing each about 70 trajectories in interaction. In general terms, the bottleneck is the simulation of all trajectories between two jumps: if each particle moves independently of the others, this scales linearly with the number of particles and parallelization is easy to set up. When complicated interactions are introduced between particles, then the simulation of all trajectories scales badly with the number of particles. As a restriction, due to the Markovian framework ensuring the theoretical well-posedness of the model, anomalous trajectories(Reference Arts, Smal, Paul, Wyman and Meijering19, Reference Muñoz-Gil, Volpe, Garcia-March, Aghion, Argun, Hong, Bland, Bo, Conejero, Firbas, Garibo i Orts, Gentili, Huang, Jeon, Kabbech, Kim, Kowalek, Krapf, Loch-Olszewska, Lomholt, Masson, Meyer, Park, Requena, Smal, Song, Szwabiński, Thapa, Verdier, Volpe, Widera, Lewenstein, Metzler and Manzo25, Reference Korabel, Waigh, Fedotov and Allan39) are not allowed in theory, though the algorithmic construction in Section 2.2 does not rule out their introduction. However, a rigorous understanding of the model in this setting remains challenging and constitutes an exciting perspective. In an effort to generate even more realistic image sequences, we may consider to blur the system of generated particles using the point spread function, and to add some noise and background, as carried out for instance in Ref. (Reference Lagardère, Chamma, Bouilhol, Nikolski and Thoumine8). In relation, additional features could be computed from both the real-image sequence and the synthetic ones in order to strengthen the quality assessment of the generator.