One of the main goals in cosmology is to understand how the Universe evolves, how it forms structures, why it expands, and what is the nature of dark matter and dark energy. Next decade large and expensive observational projects will bring information on the structure and the distribution of many millions of galaxies at different redshifts enabling us to make great progress in answering these questions. However, these data require a very special and complex set of analysis tools to extract the maximum valuable information. Statistical inference techniques are being developed, bridging the gaps between theory, simulations, and observations. In particular, we discuss the efforts to address the question: What is the underlying nonlinear matter distribution and dynamics at any cosmic time corresponding to a set of observed galaxies in redshift space?
An accurate reconstruction of the initial conditions encodes the full phase-space information at any later cosmic time (given a particular structure formation model and a set of cosmological parameters). We present advances to solve this problem in a self-consistent way with Big Data techniques of the Cosmic Web.