

BERTHA - Big Data Centre for Environment and Health

Big Data Solutions



Until now, individual location-time-activity patterns have not been available for environmental and social exposure assessments. In BERTHA, we collect and join numerous data sources, such as personalised sensors, social media, public Danish medical records and population registers, static environmental monitors and mathematical models. Assembling this large and varied collection of data sources and models to significantly improve environmental health exposure assessment is a complex task that calls for holistic solutions. In short, it requires Big Data solutions.


The unique asset of BERTHA is the application of dynamic exposure profiles, based on tracking people through the various microenvironments they encounter in daily life, and the linkage of these data to existing Big Data sources, such as public Danish medical records and population registers.




In BERTHA, we develop new, and extend existing, spatial and temporal algorithms in an exploratory visualisation environment to reveal patterns and interactions in the complex Big Data sources for chosen health outcomes. We combine interactive data mining, data analytics, machine learning, exploratory visualisation and spatial data analysis, and apply them to our many data sources and health outcomes.


A tweet map of Europe: a data visualisation from Twitter that maps human activity rather than just infrastructure, based on the geotagged tweets sent since 2009 (used with permission from Miguel Ribs, Twitter Inc.).

Examples of methods and tools we may use in BERTHA:

  • An integrated assessment approach in which model calculations and measurements of environmental exposures are combined, using the AirGIS system for modelling air pollution and noise exposures and including data from both static, routine monitoring and field experimental work with personalised sensors. The environmental exposure profiles may be analysed together with data on health outcomes from Danish health registries or from biomarker assessments.
  • Social media data mining: representing, analysing and extracting patterns from citizen-volunteered social media data. For example, extracting tweets in which Twitter users express poor or good mental health, and mapping them geographically to reveal locations that are better, or worse, in terms of mental health.
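
The geographic mapping step in the social media example can be sketched as a simple grid aggregation. This is a minimal illustration under stated assumptions, not the BERTHA method: the posts, their coordinates and their labels are invented, the classification of a post as expressing good (+1) or poor (-1) mental health is assumed to have happened already, and the names `grid_cell` and `mean_score_per_cell` are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical geotagged posts: (latitude, longitude, label).
# label is +1 for a post classified as expressing good mental health,
# -1 for poor. The classifier itself is assumed and not shown.
posts = [
    (56.16, 10.20, +1),
    (56.17, 10.21, -1),
    (56.18, 10.22, +1),
    (55.68, 12.57, -1),
    (55.69, 12.58, -1),
]


def grid_cell(lat, lon, cell_size_deg=0.5):
    """Return the (row, column) index of the grid cell containing a point."""
    return (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))


def mean_score_per_cell(posts, cell_size_deg=0.5):
    """Average the labels of all posts that fall in the same grid cell."""
    sums = defaultdict(lambda: [0, 0])  # cell -> [post count, label sum]
    for lat, lon, label in posts:
        cell = grid_cell(lat, lon, cell_size_deg)
        sums[cell][0] += 1
        sums[cell][1] += label
    return {cell: total / count for cell, (count, total) in sums.items()}


cell_scores = mean_score_per_cell(posts)
for cell, score in sorted(cell_scores.items()):
    print(cell, round(score, 2))
```

A regular latitude/longitude grid is used here only because it is easy to follow. A real analysis would more likely aggregate to administrative or neighbourhood units and would need a minimum number of posts per cell before a score is reported.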


We supplement the use of static environmental sensors with an individual, personalised assessment regime. Environmental data from GPS-enabled, personalised micro-sensors carried by an individual, together with social media postings and longitudinal register data, give us a more complete understanding of that individual’s mobility and hence of the exposures related to health outcomes.

Illustration of how concentration, spatiotemporal and contextual data are integrated to analyse an individual exposure profile (calculated PM2.5 at 1-minute resolution). The colours indicate which of the defined microenvironments the person was in at each point in time.
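
An exposure profile of the kind illustrated can be summarised per microenvironment once each 1-minute concentration value carries a microenvironment label. The sketch below assumes that labelling has already been done. The records, the microenvironment names and the function names are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict

# Hypothetical 1-minute records: (minute index, microenvironment, PM2.5 in µg/m³).
# All values are invented for illustration.
records = [
    (0, "home", 8.0),
    (1, "home", 10.0),
    (2, "commute", 30.0),
    (3, "commute", 34.0),
    (4, "work", 12.0),
    (5, "work", 14.0),
]


def summarise_by_microenvironment(records):
    """Return {microenvironment: (minutes spent, mean concentration)}."""
    totals = defaultdict(lambda: [0, 0.0])  # label -> [minutes, concentration sum]
    for _minute, label, concentration in records:
        totals[label][0] += 1
        totals[label][1] += concentration
    return {label: (n, total / n) for label, (n, total) in totals.items()}


def time_weighted_mean(records):
    """Mean over all minutes, so each microenvironment counts by time spent."""
    return sum(concentration for _, _, concentration in records) / len(records)


profile = summarise_by_microenvironment(records)
overall = time_weighted_mean(records)

for label, (minutes, mean_concentration) in profile.items():
    print(f"{label}: {minutes} min, mean {mean_concentration:.1f} µg/m³")
print(f"time-weighted mean: {overall:.1f} µg/m³")
```

Because the records are evenly spaced at one minute, the plain mean over all records is already weighted by time spent. With irregular sampling, each record would need an explicit duration.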

Examples of methods and tools we may use in BERTHA:

  • Develop, test and apply personal air pollution (e.g. PM2.5, PM10 and NO2) and noise exposure monitors for healthy individuals from the Danish Blood Donor cohort and for patients with implanted heart defibrillators, investigating the environmental impact on biomarkers and on disease severity.
  • A study of the environmental impact on biomarkers in healthy individuals who exercise regularly, using equipment that assesses environmental exposures together with heart rate, work intensity and a range of other variables, along with GPS and time coordinates recorded during running.


Before the Big Data revolution brought the technology and computational power needed to handle data collected over decades, environmental exposure assessments had to rely on several assumptions. Traditional environmental exposure assessment is based on the assumption that environmental exposures at the home address, or at static monitors near the home address, may be used as proxies for personal exposure.

In recent years, the use of mathematical models in exposure assessment, instead of or as a supplement to measurements, has become more common. In almost all health assessment studies related to environmental exposures, an individual’s address is still used as a proxy for personal exposure. Yet people are mobile, so using a static location can lead to errors in the estimated individual exposure, because time spent commuting, at work or socialising is not accounted for in the exposure assessment.
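
The estimation error described above can be written down using a time-weighted formulation that is common in exposure science. It is given here as an illustration, not as the specific model used in BERTHA. The symbols are assumptions introduced for this sketch: $C_j$ is the mean concentration in microenvironment $j$, $t_j$ is the time spent there, and $C_{\text{home}}$ is the concentration at the home address.

```latex
\[
E_{\text{personal}} = \frac{\sum_{j} C_j \, t_j}{\sum_{j} t_j},
\qquad
E_{\text{proxy}} = C_{\text{home}},
\qquad
\Delta = E_{\text{proxy}} - E_{\text{personal}} .
\]
```

As an invented example, take a day with 16 hours at home at 9 µg/m³, 2 hours commuting at 32 µg/m³ and 6 hours at work at 13 µg/m³. Then $E_{\text{personal}} = (16 \cdot 9 + 2 \cdot 32 + 6 \cdot 13)/24 = 286/24 \approx 11.9$ µg/m³, while the home-address proxy gives 9 µg/m³, an underestimate of about 2.9 µg/m³.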

In BERTHA, we can reduce these estimation errors by combining model calculations and measurements from routine, static monitoring with field experimental work using personalised sensors, tracking people through the various microenvironments they encounter in daily life.