Introducing FIMM Group Leader Andrea Ganna

Andrea Ganna is a FIMM-EMBL Group Leader at the Institute for Molecular Medicine Finland at the University of Helsinki. He specializes in statistical genetics and epidemiology. In this article, Andrea shares his research interests and explains how deep learning approaches and artificial intelligence can be used to leverage information from large-scale epidemiological datasets to predict or even prevent the course of human disease and to optimize existing interventions in human health care.

Group leader Andrea Ganna. Credit: FIMM, University of Helsinki

Andrea studied statistics. During his PhD at Karolinska Institute with Professor Erik Ingelsson, he worked on epidemiology, biostatistics, data science and molecular epidemiology, a field that uses molecular, metabolomic and genetic markers to, for example, distinguish the development of certain diseases in a population. During his postdocs, he focused on human genetics, especially large-scale genetic studies including genome-wide association studies (GWAS) and exome and genome sequencing studies.


What got you interested in epidemiology, genetics and statistics?


Andrea comments: “I was initially more interested in social sciences like sociology, but I found that it didn’t really have a strong quantitative component, and statistics was a nice way to study society from a more robust perspective. I have always worked on population studies that include large collections of data on individuals. Lately, I am more interested in the societal aspects and in what we can do from a public health perspective, using genetics, molecular markers, large-scale epidemiological studies and available databases.”

“Today, it is possible to combine factors such as genetics, molecular markers and environmental factors and model them to improve public health and interventions, and to identify the individuals at high risk for common diseases. We can then target them and run trial-based experiments to show that our approaches are working.”

What questions are you setting out to address with your research and what are you working on at the moment?


“In Finland it is possible to do prevention studies, as we have been collecting information since the 1960s and currently have sociodemographic data on the entire Finnish population. We are also seeing increased use of large-scale genetic information. It is therefore not inconceivable that in the next 10-15 years everyone in Finland will be able to access their genetic information. Given that everyone will have this information and we have collected health information from everyone, the question is then: how can we use this information to identify individuals at high risk, inform them of that risk, and try to motivate them to see a GP or undergo an intervention to lower it?”

Andrea’s group seeks to set up ways to use large-scale registry-based information and genetic information to predict common diseases early, and to use their findings to identify high-risk groups and help these groups benefit from already existing interventions. The challenge lies in setting up these projects: it is very difficult to get approval to access the data, and equally difficult to assemble and analyze such highly complex data. Andrea’s research strategy is to approach this challenge with modern methods such as artificial intelligence, novel statistical methods and machine learning to try to leverage the complexity of the data.

Andrea asks: “How do we model all the information we have on each individual through his or her entire lifetime in a sequential order, so that these data can tell us whether the individual is at risk of disease? How do we analyze the data in an ethically appropriate and secure way? How do we collaborate with the different institutions so that they share their data with us?”

What methodologies do you use in your research?


In our research, we mostly combine statistical and epidemiological methods with modern deep learning approaches, because it is simply impossible to use traditional statistical software and methods on datasets that include millions of records. Deep learning approaches have been engineered to handle huge datasets. Even though artificial intelligence can make errors and interpretability is still a challenge, we have experience in assessing how well these methods perform and are constantly trying to find ways to make data interpretation easier.

Are you collaborating with other researchers at the moment?


We are currently working together with the Finnish Institute for Health and Welfare (THL). On the genetics side, we are collaborating with Mark Daly (Director of FIMM), Samuli Ripatti (Vice Director of FIMM), and Aarno Palotie (Research Director at FIMM). We are also collaborating with researchers in the USA and other parts of Europe. In August 2019, we concluded a very large project, “genetics of same-sex behavior”, which was done in collaboration with researchers from the Netherlands, the UK and the US.

Could you tell me a bit about your group at FIMM and what are your goals for the first phase as a FIMM-EMBL group leader?


Mari Niemi and Aoxing Lui are postdocs in the group and are working on brain-derived single-cell sequencing and on leveraging national databases to study human reproductive success, respectively. Vincent Llorens is a software engineer and is working on building a web portal to study nationwide disease paths. Mattia Cordioli is a PhD student and specializes in AI methods to model health trajectories and evaluate polygenic risk scores. Sakari Jukarainen is a postdoc and focuses on building a portal where we can leverage these data to give useful information to doctors. We want to create an automated system that uses already existing but updated health registries to provide real-time information to doctors.

Andrea comments: “In the first phase we aim to complete a few of our large projects involving artificial intelligence and health registries. The next five years will involve working on biological translation or recontact studies over the long term with the FinnGen project. We are thinking about how to recontact people and take more in-depth measurements on interesting trends, and also do follow-up analyses in stem-cell models and brain organoids.”

Visit the FIMM website to learn more about Andrea Ganna.

Also visit the Data Science – Genetics Epidemiology lab website to learn more about the group and their research projects.