Organizers: Eduardo Gutiérrez Peña, UNAM, México, and Manuel Mendoza, ITAM, México
- María Lomelí, Babylon Health, UK. Title: Amortised inference using a faithful inverse for sequential importance sampling
Abstract: Automated decision-making for medical diagnosis consists of producing differentials for various diseases based on evidence about the state of the patient. One way to encode the relationships between symptoms, risk factors, and diseases is a Bayesian network, where the edge structure reflects the underlying causal mechanisms between the nodes. Because computing posterior distributions exactly suffers from a combinatorial explosion, various approximate inference schemes have been proposed to tackle this problem, such as variational inference and importance sampling. In addition, amortisation techniques allow us to reduce the cost of inference by carrying out and storing some computations offline. In the medical-diagnosis task, producing highly accurate marginals is key to differential diagnosis. Importance sampling is particularly suited to this, as it is asymptotically exact and a good choice of proposal can reduce the variance of its estimates. In this paper, we construct various data-driven proposals by using an inverse factorisation of the model’s joint distribution. The proposal distributions are based on a neural network that is trained with samples from the generative model before inference takes place, whereas the inverse factorisation provides the sampling schedule for the importance sampling scheme. In the experiment section, we compared this scheme to likelihood weighting as well as to a standard variational inference scheme. We also explored the impact of different inverse factorisations on variance reduction. Our findings reveal that the new scheme produces competitive data-driven proposals for importance sampling.
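As a rough illustration of the likelihood-weighting baseline mentioned in the abstract above, the following sketch estimates a posterior marginal in a toy two-node network (disease, then symptom). The network, its probabilities, and the function names are hypothetical and chosen only for illustration; this is not the amortised scheme presented in the talk, which replaces the prior proposal with a trained neural network.

```python
import random

# Hypothetical two-node network: Disease -> Symptom.
# All probabilities below are illustrative assumptions, not values from the talk.
P_DISEASE = 0.01                     # prior P(D = 1)
P_SYMPTOM_GIVEN = {1: 0.9, 0: 0.1}   # P(S = 1 | D = d)


def exact_posterior():
    """Exact P(D = 1 | S = 1) by Bayes' rule, for comparison."""
    joint_d1 = P_DISEASE * P_SYMPTOM_GIVEN[1]
    joint_d0 = (1 - P_DISEASE) * P_SYMPTOM_GIVEN[0]
    return joint_d1 / (joint_d1 + joint_d0)


def likelihood_weighting(n_samples, seed=0):
    """Self-normalised importance sampling estimate of P(D = 1 | S = 1).

    The proposal is the prior over D; each sample is weighted by the
    likelihood of the observed evidence S = 1.
    """
    rng = random.Random(seed)
    weighted_hits = 0.0
    total_weight = 0.0
    for _ in range(n_samples):
        d = 1 if rng.random() < P_DISEASE else 0   # sample D from the prior
        w = P_SYMPTOM_GIVEN[d]                     # weight = P(S = 1 | D = d)
        weighted_hits += w * d
        total_weight += w
    return weighted_hits / total_weight


if __name__ == "__main__":
    print("exact:   ", exact_posterior())
    print("estimate:", likelihood_weighting(50_000))
```

Because the prior rarely proposes D = 1, most samples carry little information about the posterior; a proposal closer to the posterior would reduce the variance, which is the motivation for data-driven proposals.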
- Rosana Zenil-Ferguson, University of Hawaii, USA. Title: When your favorite trait is not enough to explain diversification
Abstract: The effect of having extra sets of chromosomes, a mutation known as polyploidy, on the rates of speciation and extinction of flowering plants remains a contentious issue in botany. Recent studies have found that plant polyploids have slower speciation rates and higher extinction rates than diploids. This result is counterintuitive for botanists, since the extra genetic information created by polyploidy should be a source of innovation and ultimately enhance the speciation process.
In this talk, we re-investigate the role of polyploidy in the rates of speciation and extinction by proposing new, custom birth-death stochastic processes. The newly proposed models include the possibility that a different trait, other than polyploidy, dictates the speciation and extinction process. Using RevBayes (https://revbayes.github.io), a new software tool that deals with the lack of independence amongst plant species via a probabilistic graphical model approach, we estimated the posterior distributions and Bayes factors of the proposed models. The stochastic models were fitted to a phylogeny of the tomato family (Solanaceae) containing 595 species with polyploidy and breeding system data. We found that polyploidy might not be the driver of diversification and that other observed or unobserved traits might be contributing to the speciation and extinction processes in the evolutionary history of plants.
- Pedro Regueiro, University of California, Santa Cruz, USA. Title: Dynamic Evolution of Communities in Networks
Abstract: The class of Bayesian stochastic blockmodels has become a popular approach for modeling and prediction with relational network data. This is due, in part, to the fact that inference on structural properties of networks follows naturally in this framework. Here, we introduce a new dynamic stochastic blockmodel that allows us to study the evolution of communities across time. Our approach models both shifts in community membership, using a fragmentation-coagulation prior, and changes in the propensities of interaction among communities, using a variant of the autoregressive process. Computation is performed using a Markov chain Monte Carlo algorithm. Illustrations are provided using both real and simulated data.
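For readers unfamiliar with stochastic blockmodels, the sketch below samples a single static network snapshot given fixed community memberships and a matrix of interaction propensities. It is a simplified illustration only: the function name and inputs are hypothetical, and it does not implement the fragmentation-coagulation prior or the autoregressive dynamics described in the abstract above.

```python
import random


def sample_sbm(memberships, propensity, seed=0):
    """Sample an undirected adjacency matrix from a static stochastic blockmodel.

    memberships[i] is the community index of node i (hypothetical input).
    propensity[a][b] is the edge probability between communities a and b.
    """
    rng = random.Random(seed)
    n = len(memberships)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = propensity[memberships[i]][memberships[j]]
            if rng.random() < p:          # edge present with probability p
                adj[i][j] = adj[j][i] = 1  # keep the matrix symmetric
    return adj


if __name__ == "__main__":
    # Two communities with dense within-group and sparse between-group ties.
    z = [0, 0, 0, 1, 1, 1]
    theta = [[0.8, 0.1], [0.1, 0.8]]
    for row in sample_sbm(z, theta):
        print(row)
```

In a dynamic version, both the membership vector and the propensity matrix would change from one time step to the next, which is the part the talk's model addresses.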