TASEPy: a Python-based package to iteratively solve the inhomogeneous exclusion process

The totally asymmetric simple exclusion process (TASEP) is a paradigmatic lattice model for one-dimensional particle transport subject to excluded-volume interactions. Solving the inhomogeneous TASEP in which particles' hopping rates vary across the lattice is a long-standing problem. In recent years, a power series approximation (PSA) has been developed to tackle this problem, however no computer algorithm currently exists that implements this approximation. This paper addresses this issue by providing a Python-based package TASEPy that finds the steady state solution of the inhomogeneous TASEP for any set of hopping rates using the PSA truncated at a user-defined order.


Introduction
The exclusion process is a paradigmatic lattice model for one-dimensional transport subject to excluded-volume interactions.It was introduced as a model for mRNA translation in 1968 [1,2], and independently in 1970 as a generalization of the lattice random walk to include multiple particles with excluded-volume interactions [3], when it was named the exclusion process.The exclusion process rose to prominence in the 1990s due to its connection to a variety of physical phenomena, including boundary-induced phase transitions [4], vehicular traffic [5], quantum spin chains [6] and surface growth [7].In nonequilibrium statistical physics, the totally asymmetric simple exclusion process (TASEP) in which particles move unidirectionally and with uniform hopping rates (the homogeneous TASEP) is one of very few models whose nonequilibrium steady-state solution is known exactly [8][9][10].The steady-state solution of the homogeneous TASEP via matrix-product Ansatz has inspired a large body of research on nonequilibrium steady states [11].Its large deviation properties [12] have played an important role in the development of the macroscopic fluctuation theory [13] describing coarse-grained fluctuations in driven diffusive systems [14].Applications of the TASEP in biology, other than mRNA translation [15][16][17][18], include DNA transcription [19][20][21][22][23] and intracellular transport by molecular motors [24,25].
In the context of mRNA translation, the TASEP captures stochastic motion of individual ribosomes on the mRNA, including excluded-volume (steric) interactions between ribosomes that may cause traffic jams.Ribosome speed along the mRNA is non-uniform, which has been linked to variations in the availability of the transfer RNA (tRNA) molecules delivering the correct amino acid to the ribosome [26].Most amino acids are encoded by two to six (synonymous) codons, whose frequency of usage is non-random [27], which is known as the codon usage bias.In biotechnology, codon (sequence) optimization by replacing rare with frequent codons has become an important tool to increase the production of proteins that are non-native to their host cell, with some studies reporting up to 1000-fold increase in protein levels [28].These successes are seemingly in contrast to numerous studies demonstrating that translation is rate-limited by initiation [29][30][31][32], raising an opportunity for the TASEP to explore theoretically the effect of variable codon speed on the protein production rate.
Variable ribosome speed can be modelled by the TASEP with inhomogeneous hopping rates that are fixed to each lattice site, which we refer to as the inhomogeneous TASEP (other names that circulate in the literature are the TASEP with site-wise disorder and the disordered TASEP).In stark contrast to the homogeneous TASEP for which many exact results are known, solving the inhomogeneous TASEP is a challenging problem, even if all but one lattice sites have the same hopping rate [33][34][35].One option is to employ the mean-field approximation, in which correlations between neighbouring particles are ignored.This approximation leads to a set of nonlinear equations for the local particle densities at each lattice position that must be solved numerically [36,37].If the hopping rates are slowly varying along the lattice and the number of lattice sites is large, then a hydrodynamic (coarse-grained) limit of the TASEP is justified, yielding simple analytical results for the local density [38].Finally, if the TASEP has only few sites with slow hopping rates, then a combination of the mean-field approximation and the exact solution of the homogeneous TASEP can be used [34,39,40].
The other option, which accounts for correlations between particles, is the power series approximation (PSA) [41][42][43].This approximation is based on the formal series expansion of the steady-state solution, with the initiation rate (at which new particles are added to the lattice) being the expansion variable.The coefficients in this expansion can be computed exactly, whereas the approximation comes from the truncation of the series at some given order.Since initiation is typically the rate-limiting step in translation, only the first few terms are needed to obtain accurate results [42].Consequently, the power series approximation is expected to be considerably faster than the stochastic simulation algorithm (the Gillespie algorithm) [44].This advantage has proved instrumental for solving the inverse problem of inferring variable ribosome speed from experimental data obtained by ribosome profiling [45].However, computer implementation of the power series approximation has so far been limited to the first few orders, preventing its wider use.
In this paper, we close this gap by providing a computer algorithm that finds the steady state of the inhomogeneous TASEP for arbitrary hopping rates using the power series truncated at an arbitrary, user-defined order.After introducing the model, we summarize the power series approximation [41][42][43] and present an iterative method to solve it for any order.Finally, we implement this method in a Python code, which we package as the TASEPy module.We provide detailed instructions how to use it, and test its correctness using exact results for small systems and stochastic simulations for large systems.The code is available in a GitHub repository [46] under the MIT licence.

Definition of the model
We model a driven gas of particles advancing on a unidimensional discrete lattice consisting of L sites, labelled from 1 to L. As we are motivated by the mRNA translation process [1], we assume that each particle occupies ℓ sites.In order to locate each particle on the lattice, we arbitrarily chose a site of the particle that we will call the "tracking site" (identical for all particles); we can then identify the position of a particle on the lattice by the position of its 'tracking' site.For instance, in Fig. 1 each particle has size ℓ = 3 and the tracking site is the middle one.For ribosomes, ℓ ≈ 10 and the tracking site is approximately 5 lattice sites from the ribosome's trailing end.However, the procedure that we describe is general, meaning that Figure 1: Sketch of the TASEP for ℓ = 3.A black dot denotes the tracking site.A particle is injected with its tracking site on the first lattice site with rate α.The position of the tracking site then moves from site i to i + 1 with rate ω i , and when it reaches the last site, the particle is ejected from the lattice with rate β.At each step, the dynamics has to respect the exclusion rules explained in the text.
it can be applied to any value of ℓ, and it does not depend on the position of the particle's tracking site [2,15].
The model is illustrated in Fig. 1.Particles enter the lattice with their tracking site (highlighted with a black dot) on site 1 with a probability per unit time α, provided that no other particles interfere with the binding of that particle.In other words, the tracking site of the following particle should be at least ℓ + 1 sites downstream.In the following, we will often identify the position of a particle with the position of its tracking site.In the bulk, a particle moves from site i to site i + 1 with a rate ω i , provided that there is no particle on site i + ℓ.On the last site L, particles exit the lattice with a rate β.
For each site i = 1, . . ., L we define the corresponding particle occupancy number τ i ∈{0, 1}, These numbers determine the configuration of the system denoted by C = {τ 1 , . . ., τ L }.Using this notation, kinetic steps of the driven lattice gas can be summarized as: Eqs. (2a)-(2d) define the totally asymmetric simple exclusion process first proposed by Mac-Donald, Gibbs and Pipkin as a model of mRNA translation [1].

Particle current and particle density
Our main goal is to compute the particle current J and particle densities ρ i on each site i, in the steady state.As we do not consider particle binding and unbinding inside the lattice, the current J in the steady state is conserved across the lattice, and we can write: where the brackets 〈. . .〉 denote an ensemble average with respect to the steady-state probability P(C).As the steady-state current is conserved along the lattice, the equation above can be understood as equating the entry current (injection rate α multiplied by the probability that the first ℓ sites are empty), and the exit current on the last site (termination rate β multiplied by the probability of occupancy of the last site).The local particle densities ρ i (the probability of finding a particle tracking site on site i), and the average lattice density ρ are defined as

Steady-state master equation
The steady-state probability P(C) satisfies the master equation, where W (C → C ′ ) denotes the rate of transition from configuration C to C ′ .To be specific, we adopt an alternative notation for C that specifies the positions x m of each particle m on the lattice (i.e. the position of its tracking site), In other words, N is the number of particles in configuration C, and x 1 , . . ., x N are the positions of particles in C, where x 1 is the position of the leftmost and x N the position of the rightmost particle.We use index i to indicate the lattice site, index m to indicate the mth particle, while n is reserved to the order of the power series expansion.The particle positions x 1 , . . ., x N must respect the excluded-volume interactions, To simplify the notation, we define ω 0 = α and ω L = β.The steady-state master equation, Eq. ( 5), can be written as e(C)P({x 1 , . . ., x N }) where e(C) is the exit rate from configuration C, 1 A is the indicator function that is equal to 1 if the condition A is true and is 0 otherwise, and m is the ladder operator that moves the mth particle one lattice point to the left, provided the move is allowed by the excluded-volume interactions.For x m > 1, the ladder operator m is defined as For x 1 = 1, the ladder operator 1 is defined as where denotes the empty lattice.
In summary, to find the master equation for a given configuration C of particles, we compute the exit rate e(C) from that configuration [the left-hand side of Eq. ( 8)], and then check all particles that can be moved one step to the left-that determines the configurations C ′ that C can be entered from [the right-hand side of Eq. (8)].For example, consider a system with ℓ = 10 and L = 100, and a configuration with three particles C = {x 1 , x 2 , x 3 } = {1, 11, 30}.The master equation for P (1,11,30) reads, (ω 11 + ω 30 )P (1,11,30) = ω 0 P (11,30) + ω 29 P (1,11,29) + ω 100 P (1,11,30,100) . ( The first particle at x 1 = 1 cannot move because the second particle is at x 2 = x 1 +ℓ = 11.This simple way of generating the master equation for each configuration is useful for the power series approximation discussed in the next section. 3 Power series approximation (PSA)

General results
The power series approximation (PSA), as previously formulated in the Refs.[41-43, 47, 48], represents P(C) as a power series expansion in the initiation rate α, The unknown coefficients c n (C) depend on both the particle configuration C and the rates ω 1 , . . ., ω L .Considering that the sum of all P(C) must equal to 1, it directly follows that While it is possible to expand P(C) in other rates, we choose to expand in the initiation rate α as we are mainly motivated by mRNA translation for which we expect the translation initiation rate to be much smaller than any other rate.To illustrate, the median value of α estimated for the S. cerevisiae genome is an order of magnitude smaller than any of the elongation rates [31].This let us approximate the series expansion P(C) in Eq. ( 13) using the first K terms We emphasize that, although Eq. ( 13) is exact, we may introduce notable errors by truncating the series expansion to a limited number of terms as done in Eq. (15).Considering an initiation rate comparable to other rates may thus result in non-physical values of P(C) < 0 or P(C) > 1.
In particular, if for a specific choice of α the approximation fails, it becomes necessary to include higher-order terms in Eq. (15).Later in this work, we will provide criteria to establish the reliability of the PSA's results.
To compute the steady-state probabilities P(C) and consequently calculate particle currents and densities, we first need to determine the coefficients c n (C).We insert Eq. ( 13) into Eq.( 5) and gather terms involving α n .The sum of these terms equals to zero, since the left-hand side of Eq. ( 5) equals zero.Next, we differentiate between cases where W (C → C ′ ) = α (resulting in terms of the order α n+1 when multiplied by P(C)) and cases where W (C → C ′ ) ̸ = α (yielding terms of the order α n ).Using an indicator function I C,C ′ , we write W (C → C ′ ) as where Inserting P(C) from ( 13) into ( 5) we obtain After gathering all terms containing α n and equating their sum to 0, we obtain the following equation for where e 0 (C) is the total exit rate from C excluding initiation For C = , we can use Eq. ( 14) instead, which gives The equation ( 19) applies to n ≥ 1.For n = 0, the equation is simpler and reads, We remark that Eq. ( 22) coincides with the initial master equation when α is set to 0. Since in that case there is no initiation, From (23) it follows that all coefficients c n (C) of order n smaller than the total number of particles N (C) in C will be equal to zero.This is summarized in the condition Mathematically, this result can be derived using the Markov chain tree theorem [49], known in physics as Schnakenberg network theory [50].Here we omit the details of the derivation, which can be found in Ref. [43] where we proved (24) for the TASEP with particles of size ℓ = 1.The same arguments hold to the general case studied in this paper.Relations (24) tell us that for each order n we only have to consider lattice configurations with at most N = n particles, and that considerably simplifies the calculation of the coefficients c n (C).This simplification plays a crucial role in making the power series approximation practical and applicable.
In that sense, the power series expansion can be seen as a form of perturbation theory, where each initiation event corresponds to one order of the perturbation theory, starting from the empty lattice.
Applying the power series to Eqs. ( 3)-(4b), we get the following expressions for the steadystate particle current J, local density ρ i and the average density ρ, Here, index n and the coefficients J n , ρ i,n and ρ n correspond to the order of the series expansion of P(C) in Eq. ( 13).Since the current is by definition multiplied by α, the 0th order in the series expansion of P(C) contributes to the 1st order in the series expansion of J.

Analytical solution for the first-order coefficients
The first order in the power series expansion can be solved analytically.According to Eq. ( 24), the coefficients c 1 (C) are non-zero for configurations having a number of particles smaller than 1, i.e.C = {x 1 } and C = .The equation for c 1 ({x 1 }) reads, This recurrence relation can be easily solved, from which we get that The expression for c 1 ( ) follows from Eq. ( 21),

Iterative solution for higher-order coefficients
We can compute all c n (C) of a given order n recursively by following a natural order, which forms the basis of the code we developed.To see that, we rewrite Eq. ( 19) for c n (C) using the notation C = {x 1 , . . ., x N }, 1 ≤ N ≤ n, as previously done for the master equation (8).
The pedagogical case ℓ = 1 is presented in Appendix A, while below we directly derive the equations for the general case ℓ ≥ 1.We obtain where e 0 ({x 1 , . . ., x N }) is given by Each term on the right-side of Eq. ( 29) has a simple interpretation, which we explain below: (a) The first term on the right-hand side of Eq. ( 29) is a contribution to c n (C) from the previous order n − 1, provided the first particle is at site x 1 = 1.
(b) The second and third terms account for all configurations that lead to C by moving one particle to the right, provided the move is allowed by excluded-volume interactions.The second term is for the leftmost particle (b1), and the third term is for all other particles (b2).
(c) The fourth term computes the contribution to c n (C) coming from a configuration that has an extra particle at the last site L, provided that x N ≤ L − ℓ and N + 1 ≤ n because of Eq. ( 24).
(d) The last term removes the contribution coming from the same configuration but from the previous order n − 1, provided x 1 > ℓ.This term is not zero, provided n − 1 ≥ N .
There is a natural hierarchy of configurations for solving Eq. ( 29) recursively, such that all coefficients on the right-hand side are known before computing c n (C) on the left-hand side.
Let us assume that we have calculated the values of all non-zero coefficients c n−1 .We start with a configuration in which n particles are stacked together at positions x i = 1 + (i − 1)ℓ for i = 1, . . ., n.We refer to this configuration as a 'stacked' configuration.For now, we assume that n is smaller or equal to the maximum number of particles that can fit onto the lattice, N max = ⌊L/ℓ⌋ + 1.For the stacked configuration in that case, where x n = 1 + (n − 1)ℓ is the position of the rightmost particle.Importantly, the right-hand side depends only on a coefficient of the order of n − 1, which is known.The next coefficient that can be computed is for a configuration in which the nth particle is moved one lattice site to the right (x n = 2+(n−1)ℓ).The coefficient for this configuration depends on the previously computed coefficient (c n ({1, . . ., 1 + (n − 1)ℓ})) and a coefficient that is of the order of n − 1.This procedure continues for all allowed positions of the nth particle, the last being x N = L.The next configuration after that one is obtained by removing the nth particle from the lattice.
After we have exhausted all positions of the nth particle, we move the (n − 1)th particle one lattice site to the right, and set the nth particle next to it, stacked together.We then leave the (n − 1)th particle intact while we cycle through all allowed positions of the nth particle, including the configuration in which the nth particle is removed from the lattice.This procedure is repeated until (n − 1)th and nth particle are both removed from the lattice.We then move (n−2)th particle one lattice site to the right and stack the two removed particles next to it, if possible.This procedure is repeated for all particles on the lattice, until we finally reach a configuration in which the first particle is at the last site.The next configuration from there would be an empty lattice, but we can get the coefficient c n ( ) also by summing all other coefficients of the same order, according to Eq. ( 14).For n ≥ N max , we start from a stacked configuration with N max particles and cycle through all configurations as before until we reach an empty lattice.In other words, for each n ≥ N max we cycle through all lattice configurations, whereas for n < N max we cycle through only a subset of configurations because of Eq. (24).
For n ≥ N max , S n contains all lattice configurations.

TASEPy usage instructions
In this section, we provide a practical implementation of the power series approximation coded into a Python-based package named TASEPy.To use this package, copy the TASEPy.pyfile into the working directory and import the following functions: from TASEPy import psa_compute from TASEPy import total_coeffs from TASEPy import local_density from TASEPy import mean_density from TASEPy import current In the upcoming sections, we will explore the functions used to evaluate densities and cur- rents obtained by the PSA.For a comprehensive step-by-step guide and additional details, please refer to the tutorial tutorial_TASEPy.ipynb.Finally, in the last section, we will present benchmarks involving exact results and stochastic simulations.

Function psa_compute
At the core of TASEPy is the psa_compute function.It takes a list containing hopping rates ω 1 , . . ., ω L , the maximum order of the series expansion K and the particle size ℓ (ℓ = 1 by default) as input, and returns a two-dimensional list containing coefficients ρ i,n and a onedimensional list containing coefficients J n , for all n = 0, . . ., K.These coefficients are needed to compute density profiles and particle current in the power-series approximation for any value of α, following Eqs.(25a)-(25c).In the following code, rhocoeff, Jcoeff = psa_compute(wlist, K, ll) wlist = [ω 1 , . . ., ω L ] is the list of hopping rates and ll is the particle size ℓ.The coefficients ρ i,n and J n are stored in rhocoeff and Jcoeff, respectively, such that rhocoeff[i] is the list [ρ i+1,0 , . . ., ρ i+1,K ] and Jcoeff[i] = J i .
By default, the function psa_compute does not save the coefficients c n (X ).If one is interested in physical quantities other than the local density and current, then it is paramount that the coefficients are saved.This option is activated by setting save_coeff=True in the function psa_compute (the default value is False if omitted), which saves the coefficients in a file.The name of this file is specified by an optional argument coeff_file='〈filename〉', e.g.'test.csv' in the example below: rhocoeff, Jcoeff = psa_compute(wlist, K, ll, True, 'test.csv') We note that the resulting file may be very large, since the number of coefficients grows exponentially with the system size.We have therefore implemented a function total_coeffs that counts the total number of coefficients that will be stored in the file.It is advised that this function is called before psa_compute when the option save_coeffs is set to True.For example, for L = 100, ℓ = 1 and K = 4, the total number of coefficients is close to 4,300,000, which takes about 173 MB, or about 43 bytes per coefficient.
The inner workings of psa_compute are the following.For a given order n, we generate the initial stacked configuration X as a list of particle positions and compute the coefficient c n (X ) using Eq. ( 31).This coefficient is then stored in a dictionary c_n with key X .Using function next_config(), we go through all other configurations X ∈ S n and use Eq.( 29) to compute the corresponding coefficient c n (X ), which is stored in the dictionary c_n using key X .Due to the hierarchy of configurations explained in Section 3, computing c n requires only coefficients that have already been stored in the dictionaries c_n-1 and c_n.As we go through all configurations, we compute the density coefficients ρ i,n for each lattice site i using Eq.(25b), and the particle current coefficients J n using (25a).This process is repeated for all orders n = 0, . . ., K. We remark that, while computing the coefficients of order n, psa_compute only keeps in memory the dictionaries of order n − 1 and n.

Function local_density
Following the execution of psa_compute, we can compute the density profile for each order n and for any fixed value of α.This is implemented in the function local_density, rho = local_density(rhocoeff, alpha) which takes as input the two-dimensional list rhocoeff and the value of the initiation rate α, and returns a two-dimensional list containing the density profiles for all orders 0, . . ., K. Let us denote by ρ (n) i the local density truncated at the order of n, The density profile [ρ For instance, rho [2] is a list containing local particle densities for each lattice site, truncated at the second order.

Function current
Finally, the particle current J for a given value of α can be computed from Eq.(25a) knowing the current coefficients J 0 , . . ., J K .This is done by the function current, which takes the list Jcoeff and the value of α as input, and returns the list [J (0) , . . ., J (K) ], where . This whole procedure is demonstrated in the tutorial file tutorial_TASEPy.ipynbfor lattice size L = 100, particle size ℓ = 1 and the PSA order K = 4, using randomly generated hopping rates in the range between 1 and 10.The resulting density profiles ρ (n) i for n = 1, . . ., 4 are presented in Fig. 3.In Fig. 4(a) and Fig. 4(b) we show the mean density ρ (n) and particle current J (n) versus α, respectively.Plotting mean density and current versus α is useful for estimating values of α for which the PSA of given order is no longer a good approximation.As α is increased, the highest-order terms in the PSA begin to dominate, leading to values diverging from the exact ones.We know that 0 ≤ ρ i ≤ 1 and 0 ≤ J ≤ α, so any value outside these bounds indicates that α is too big.We also expect the local density and current to be non-decreasing in α. i.e. that dρ i /dα ≥ 0 and d J/dα ≥ 0. Using Eq. ( 3), we also get that d J/dα ≤ 1.These conditions can be easily checked using coefficients stored in rhocoeff and Jcoeff, as explained in the tutorial file tutorial_TASEPy.ipynbwhere we find the smallest value of α for which any of the conditions above fails.In practice, however, it is best to consider values of α that are much smaller than this value.

Benchmarks
The file benchmarks_TASEPy.ipynb is a Jupyter notebook comparing the results obtained with TASEPy to symbolic exact calculation for small systems and to stochastic simulations.
Exact results were obtained by solving the stationary master equation M P = 0 in Eq. ( 5) for small lattices, where M is the stochastic transition matrix, and P is the probability vector.This system of equations was solved exactly using Mathematica® to obtain P as a function of the initiation rate α, which was kept as a variable.The probability vector was then expanded in α up to the order of K, from which the coefficients ρ i,n and J n were computed for n = 0, . . ., K. The Mathematica® code is available in the repository.
For larger systems, the matrix M is too big to solve the system of equations M P = 0 analytically.We thus performed stochastic simulations of the TASEP using the Gillespie algorithm and compared the local density and current to those obtained by TASEPy.The Fortran code used to simulate the TASEP is available in the repository.In Fig. 5, we compare the density profiles for L = 50, ℓ = 5, K = 4 and α = 0.2.The TASEPy algorithm takes less than a second on a standard laptop to compute the PSA coefficients for this medium-sized lattice.The order of the PSA is K = 4 (for more details, see benchmarks_TASEPy.ipynb).

Applications
In this section, we explore two potential applications of TASEPy that are related to mRNA translation, the biological process for which the exclusion process was originally developed [1,2].In both applications, we use TASEPy to infer parameters of the TASEP from input data: either the mRNA-dependent translation initiation rate α from the mean ribosome density data obtained by polysome profiling [31], or the set of hopping rates ω i from the local density data obtained by ribosome profiling [48].

Inferring initiation rates from mean density measurements
The mean ribosome density ρ, defined as the number of ribosomes N on the mRNA divided by the length L of the mRNA in codon units, can be experimentally measured with a technique called polysome profiling.This quantity has for instance been quantified genome-wide in yeast S. cerevisiae [51].As mentioned in the introduction, it is generally assumed that the codondependent ribosome speed mainly depends on the codon, and those rates can be roughly estimated from the abundance of the corresponding tRNA.In Ciandrini et al. ( 2013) [31], the authors estimated the value of the initiation rate α for each mRNA of the yeast genome, assuming a codon-dependent set of ribosome hopping rates ω i .The method involved comparing the mean particle density ρ(α) obtained from simulated driven lattice gas to the experimental ribosome density ρ exp .The goal was to identify the physiological initiation rate that best matches the experimentally observed densities.To achieve this, the model had to be simulated for many different values of α to find ρ(α) ≃ ρ exp at an arbitrarily close resolution.
The benefit of using TASEPy lies in the fact that it eliminates the need to run numerous stochastic simulations for different values of α (which could be computationally expensive depending on the required points).Instead, as demonstrated in the preceding sections, our approach enables the computation of mean density coefficients ρ n to determine the mean density ρ(α) for any initiation rate, as seen in Eq. (25c).
In the notebook applications_TASEPy.ipynb, available in the repository [46], we implement this inference procedure for a random gene (YAL008W) of the S. cerevisiae genome.We compute the density coefficients ρ i,n with the psa_compute function, then calculate the mean density coefficients ρ n , Eq. (25c), and finally use the scipy.optimize.newtonfunction to find the roots of f (α) = ρ(α) − ρ exp using Newton's method since the derivative f ′ is also computable.The initiation rate inferred with TASEPy is comparable to the one found in [31] for this particular gene (L = 198, ℓ = 9), with a relative error of 1.5% between the two values using the third order of the PSA.The inferred value of α is approximately 0.15/s, which is between one and two orders of magnitude smaller than the estimated hopping rates ω i .This is a typical scenario that justifies the use of the PSA, leading to reliable results.Further details regarding the implementation of this inference procedure can be found in the notebook applications_TASEPy.ipynb.

Inferring elongation-to-initiation rate ratios from local density measurements
The exclusion process has often been used to solve the forward problem of predicting the particle current J and local density ρ i from a given set of particle hopping rates ω i for i = 1, . . ., L and the initiation rate α.In contrast, our focus lies on addressing the inverse problem, which involves inferring the rates ω i and α given the local density profile ρ i .Notably, as the local density ρ i is dimensionless, we can only determine the ratios κ i ≡ ω i /α for i = 1, . . ., L, whereas the absolute rates cannot be directly inferred.Specifically, our objective is to find κ 1 , . . ., κ L in a manner that satisfies the following set of conditions: where ρ i,exp represents the known local density on site i.This inference problem occurs when interpreting ribosome profiling experiments [52,53], which measure local ribosome density relative to mean ribosome density, ρ i /ρ for all codons of the mRNA sequence.These data alone, however, are not enough to estimate the ratios κ i -one needs also the mean ribosome density ρ to obtain the absolute local density ρ i .
For illustration purposes, we use local density obtained from stochastic simulations of the TASEP rather than from ribosome profiling.In this way, the original and inferred rates can be directly compared.In the notebook applications_TASEPy.ipynb [46], we demonstrate how to tackle this problem using TASEPy.The approach involves iteratively computing the PSA coefficients and evaluating the density profiles for a given set of κ i 's, aiming to minimize the objective function (also known as the root-mean-square deviation or RMSD): The set of κ i for i = 1, . . ., L that minimizes the RMSD will represent an optimal solution to the inverse problem.
As a proof of principle, we show this inference procedure for a small lattice (L = 20, ℓ = 1), which takes few minutes on a commercial laptop (K = 3).The minimization of RMSD(κ 1 , . . . ,κ L ) was done using scipy.optimize.minimizefunction with the Powell method and bounds on κ i between 10 −2 and 10 5 , see applications_TASEPy.ipynb for more details [46].The initial values of κ i were obtained using the mean-field approximation, which ignores correlations between particles [1,2].These values were relatively close to the original values, but with obvious discrepancies (the Pearson's correlation coefficient is 0.954) [Fig.6(a)].After the optimization, the inferred rates provide a good match to the original ones (the Pearson's correlation coefficient is 0.998) [Fig.6(b)].More advanced or faster optimization methods can be implemented, but this is out of the scope of this work.
We note that the inference procedure presented above is similar to the Non-Equilibrium Analysis of Ribo-seq (NEAR) procedure introduced a few years ago [48] for analysing ribosome profiling data using the inhomogeneous TASEP as a model for mRNA translation.The NEAR procedure too uses the power-series approximation of the TASEP, but is limited by construction to the third-order of the PSA (K = 3), whereas TASEPy has no such limitation.

Discussion and conclusion
The steady-state solution of the inhomogeneous totally asymmetric simple exclusion process (TASEP) is still unknown after more than 50 years since its first appearance.A method developed in the 1990s that provides an exact solution of the homogenous TASEP is unfortunately not applicable to the inhomogeneous TASEP, prompting a need for approximative solutions.
The power series approximation (PSA) provides an approximative solution of the inhomogeneous TASEP when one of the rates is limiting, such as the initiation rate α in our case.This method provides an exact series expansion of the steady-state solution in the limiting parameter, whereby the only approximation is the truncation of the series at a desired order.However, the implementation of the PSA is rather cumbersome, with results so far limited to low orders of the series expansion.
In this article, we have developed a new iterative method for computing the PSA up to any desired order K.We have implemented this method in a Python package called TASEPy  In (a) we compare the original κ i 's (which are the ones to be estimated and used as inputs for the stochastic simulations) with their initial guess provided by the mean-field approximation of the TASEP [1,2].Each point on the plot corresponds to one lattice site.Points that are away from the y = x line indicate deviations from the original rates.After optimization (b), the inferred values of κ i 's closely match the original ones.distributed under a permissive free software licence for anyone to use.The TASEPy package computes the local density, mean density and particle current for any set of hopping rates and for any order of the PSA.Optionally, it stores the probability coefficients in a file, which is useful for analysing other physical quantities, such as density-density correlations between distance lattice sites.The correctness of the algorithm has been extensively tested using exact results for small lattice sizes and stochastic simulations.
The TASEPy iterates over all particle configurations for which the probability coefficients c n (C) ̸ = 0 for n ≤ K. Hence, the time complexity of the TASEPy is equal to the total number of coefficients that need to be computed.For ℓ = 1, this number is of the order of L K , where L is the lattice size.Hence, it is advisable to be cautious while setting K for larger lattice sizes to avoid excessive computation time.For L = 100 and K = 4, the calculation takes less than a minute on a laptop with i7 CPU and 16 GB of RAM.This indicates that TASEPy can efficiently handle computations for lattice sizes that are commonly used in literature.
We would like to emphasize that the PSA provides an approximate solution of the TASEP, which becomes increasingly accurate for larger orders n and smaller values of α.However, if the initiation rate α is not limiting, the approximation may not hold.This happens when the last term in the series expansion at the truncation order begins to dominate over low-order terms, leading to inaccurate or even unphysical results.This can be easily spotted by checking a number of bounds that the local density ρ i and particle current J must satisfy: 0 ≤ ρ i ≤ 1, dρ i /dα ≥ 0, 0 ≤ J ≤ α and 0 ≤ d J/dα ≤ 1. Users are encouraged to use these bounds in their calculations to ensure the reliability of the results.
We also underline that, in the case ℓ = 1, the system is particle-hole symmetric.This symmetry can be used to solve the model when β is limiting.In practice, this can be done by replacing in the PSA equations α with β, τ i with 1 − τ i for i = 1, . . ., L, and i with L − i + 1.This symmetry, however, does not hold for ℓ > 1, and for this reason our method cannot be immediately mapped to the regime in which the exit rate is small.However, one could develop a PSA on another limiting rate (β for instance).This and other extensions can be implemented in future versions of TASEPy.
In this paper, we have explored two practical applications of TASEPy, which are both closely related to mRNA translation, the biological process for which the TASEP was originally developed.These applications solve the inverse problem (as problems in biology often are) of inferring model inputs from model outputs: one application focuses on estimating mRNAdependent translation initiation rate α from mean ribosome density measured by polysome profiling, while the other application involves determining the set of elongation-to-initiation rate ratios ω i /α from local ribosome density measured by ribosome profiling.We note, however, that these applications have been explored for illustration purposes only, and have not been optimized for speed.Future development of TASEPy will focus on refining the provided code and optimizing the package for speed.Additionally, expanding TASEPy to other, more computationally-oriented languages will enhance its versatility and utility.
In conclusion, TASEPy provides a significant advance in solving the inhomogeneous TASEP by packaging an intricate theoretical framework into a practical, user-friendly tool that allows the exploration of the model with just a few lines of code.Due to the versatility of the TASEP, we expect the TASEPy to serve as a valuable tool for various applications.We encourage researchers to explore and enhance TASEPy further, leveraging its capabilities for their specific needs.
For the following orders, we develop a recursive equation to compute all the coefficients c n (C) for any order n and configuration C = {x 1 , . . ., x N } with 1 ≤ N ≤ n.(a) This term is a contribution to c n (C) from the previous order n − 1, provided the first particle is at site x 1 = 1 (condition 1 x 1 =1 ).
(b) These two terms give the contribution to c n ({x 1 , . . ., x N }) from all configurations that lead to C = {x 1 , . . ., x N } by moving one particle to the right (provided that excludedvolume interactions allow this move).The term (b1) considers the movement of the leftmost particle (i.e. with label 1) from position x 1 − 1 to position x 1 (and thus one needs to check that x 1 > 1).The term (b2) considers the stepping of all other particles m = 2, . . ., N from x m − 1 to x m , and for which one does not have to check the condition x 1 > 1.
(c) The case in which a particle exits the lattice is considered in (c).This reduces the number of particles from N + 1 to N , resulting in the configuration C = {x 1 , . . ., x N }.Since this coefficient is equal to zero if n is smaller than the number of particles N in the configuration C, one needs to check that N + 1 ≤ n.Furthermore, because of volume exclusion, the position x N must less than or equal to L − 1, i.e. the condition x N + 1 ≤ L has to be satisfied.
(d) The last term (d) removes the contribution of the same configuration C from the previous order n − 1, provided that the first particle is not on the first site (x 1 > 1).This term comes from exiting C by means of adding a new particle at the lattice site 1, which can occur only if x 1 > 1.As above, n − 1 ≥ N otherwise this term is zero.
The coefficients c n can be computed recursively following a precise order of configurations that allows evaluating c n such that the terms (a)-(d) are known.To explain what this order of configurations is, let us imagine that all c n−1 have been computed, and we want to calculate the coefficients c n .We first consider the case n ≤ L. The first configuration we need to consider is the one with n particles stacked at the beginning of the lattice (we remind that c n (C) = 0 for any C with more particles than n).For ℓ = 1, this 'stacked' configuration is simply {1, 2, . . ., n}.We can then compute the coefficient of order n for this configuration, as only the term (a) contributes: c n ({1, 2, . . ., n}) = 1 ω x n c n−1 ({2, . . ., n}) .

Figure 2 :
Figure 2: Graphical representation of the iteration procedure for L = 4 and ℓ = 2.For n > N max = 2, the iteration procedure is the same as for n = 2. Orders are separated by dashed horizontal lines.Following this procedure, the coefficient c n (C) of a given configuration C depends only on previously computed coefficients.

Figure 5 :
Figure 5: Comparison between density profile obtained with TASEPy (continuous lines) and stochastic simulations (gray circles) for a system with L = 50 and ℓ = 5.

Figure 6 :
Figure6: Results of inferring the ratios κ i = ω i /α for i = 1, . . ., L from the local densities.In (a) we compare the original κ i 's (which are the ones to be estimated and used as inputs for the stochastic simulations) with their initial guess provided by the mean-field approximation of the TASEP[1,2].Each point on the plot corresponds to one lattice site.Points that are away from the y = x line indicate deviations from the original rates.After optimization (b), the inferred values of κ i 's closely match the original ones.