Data science is the study of the computational principles, methods, and systems for extracting knowledge from data. Large data sets are now generated by almost every activity in science, society, and commerce — ranging from molecular biology to social media, from sustainable energy to health care.

Data science asks: How can we efficiently find patterns in these vast streams of data? Many research areas have tackled parts of this problem: machine learning and artificial intelligence provide methods for finding patterns and making predictions and decisions from data; databases are needed for efficiently accessing data and ensuring its quality; statistics and optimization provide fundamental mathematical ideas and methods; ideas from algorithms are required to build systems that scale to big data streams; and natural language processing, computer vision, and speech processing are each needed for analysis of different types of unstructured data. Recently, these distinct disciplines have begun to converge into a single field called data science.

The EPSRC Centre for Doctoral Training (CDT) in Data Science, based at the University of Edinburgh, will train a new generation of data scientists, comprising 50 PhDs over five intake years, with the technical skills and interdisciplinary awareness necessary to become R&D leaders in this emerging area. The first cohort started the programme in September 2014.