SciPost Submission Page
An objective criterion for cluster detection in stochastic epidemic models
by Eugenio Lippiello and Polyzois Bountzis
Submission summary
As Contributors:  Eugenio Lippiello 
Preprint link:  scipost_202103_00025v1 
Date submitted:  2021-03-25 00:01 
Submitted by:  Lippiello, Eugenio 
Submitted to:  SciPost Physics 
Academic field:  Physics 
Approach:  Theoretical 
Abstract
The correct identification of clusters is crucial for accurately monitoring the spread of a disease, and also in many other natural, social and physical phenomena which exhibit an epidemic structure. Nevertheless, even when an accurate mathematical model is available, no simple tool exists which allows one to identify how many independent clusters are present and to link elements to the appropriate clusters. Here we develop an automatic method for the detection of the internal structure of the clusters and their number, independently of the model that describes the dynamics of the phenomenon. It is based on the difference of the log-likelihood, $\delta {\cal LL}$, evaluated when all elements are connected and when they are grouped into clusters. As a function of the number of connected elements, $\delta {\cal LL}$ presents a change of slope and a singularity, both of which can be used in cluster identification. Our method is validated for an epidemic model with a minimal temporal structure and for the Epidemic Type Aftershock Sequence model describing the spatiotemporal clustering of earthquakes.
Reports on this Submission
Anonymous Report 1 on 2021-5-2 (Invited Report)
Report
This work describes an intriguing method for reconstructing the clusters of an epidemic model. It is based on a clever use of a log-likelihood difference and provides indicators of the best number of links for optimally detecting “immigrants” (nodes without ancestors) and clusters. I think that the paper deserves publication in SciPost because it opens a new direction in the widely used field of unsupervised clustering methods. Nevertheless, it first needs some clarifications.
– Literature: some papers by Ogata already used the log-likelihood for declustering seismicity (probably refs. 8 and 9). This should be discussed in the text.
– Log-likelihood: is there a rigorous reason for using the log-likelihood difference (2), or is it more a matter of intuition?
– Algorithm: it is not clear how the algorithm is implemented in practice and how it can scale linearly with the number of links. Is an ordering of the q_ij involved? The paper would become much clearer with pseudocode for the main algorithm.
– Parametrization: in what sense can the number of links be used to parametrize the partitions Y? Partitions with the same number of links can be very different.
– Symbol j: the symbol “j” used on page 4 as an increment of the number of links could probably be replaced with something that does not read like the index of an event.
– R: after eq. (3), I do not understand the sentence saying that R drops fast to zero after n1*. Why is this so? Should it not be after n2*?
– Averages: it is not clear in what sense averages are taken. Are they over different realizations? See e.g. the sentences before eq. (4).
– Notation: numbers in the form 2E5 are unusual; normally one sees $2 \times 10^5$ (as on the figures’ axes).
– Figures: vertical lines marking the true number of clusters and of links would be a good guide to the eye.
– Model: the construction of the first model is not totally clear, especially concerning how the immigrants spread.
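
The parametrization point above can be made concrete with a minimal sketch. The link-list representation, the event labels and the function name `clusters_from_links` are assumptions made for illustration only; they are not taken from the manuscript. With each event having at most one parent, two link sets of the same size give the same number of clusters but can group the events differently:

```python
def clusters_from_links(n_events, links):
    """Group events 0..n_events-1 into clusters from (parent, child) links.

    Hypothetical helper for illustration; uses a simple union-find.
    """
    root = list(range(n_events))

    def find(i):
        # Follow pointers to the cluster representative (with path halving).
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for parent, child in links:
        # Attach the child's cluster to the parent's cluster.
        root[find(child)] = find(parent)

    groups = {}
    for i in range(n_events):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two hypothetical link sets with the same number of links (2) on 4 events.
partition_a = clusters_from_links(4, [(0, 1), (2, 3)])
partition_b = clusters_from_links(4, [(0, 1), (1, 2)])
print(partition_a)  # [[0, 1], [2, 3]]
print(partition_b)  # [[0, 1, 2], [3]]
```

Both configurations contain two links and two clusters, yet the partitions differ, so the number of links alone does not appear to single out a partition unless an additional rule (for example an ordering of the links) is specified.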