5–8 Dec 2011
City Conference Center
Europe/Stockholm timezone

Session

Keynote: eScience: Past, Present and Future, Tony Hey, Microsoft Research

7 Dec 2011, 09:00
City Conference Center

Drottninggatan 71B Stockholm Sweden

Description

The talk will review the origins of the eScience initiative
starting with John Taylor’s ambitious £250M program in
the UK. One strand of the eScience agenda concerns data-
intensive science and ‘big data’. In Europe and the UK, and
also globally, the particle physicists used complex
middleware to build a grid of centers to move data from
CERN and to share data and computing resources for the
analysis of the LHC experiments. Other scientific
communities also have big data challenges: Jim Gray and
Alex Szalay’s pioneering work with the Sloan Digital Sky
Survey and their creation of the SkyServer Database were
major landmarks for ‘big data’ astronomy. The global
astronomy community also came together to create ‘Virtual
Observatories’ as a forum for collaboration and exchange of
data.



Both particle physics and astronomy have significant
amounts of data, yet they do not present the same challenge
for discovery and insight as the analysis of genetic and
bioinformatics data. There, the goal is to extract
new knowledge from very disparate types of data ranging
from gene sequences to 3-D protein structures. Similar
remarks can be made about biomedical data where
understanding features in medical images and integrating
this information with many other types of medical data is a
major challenge. In these last two examples, computer
science technologies such as Machine Learning and
Computer Vision clearly have a key role to play. Finally, the
increasing deployments of sensor networks and the use of
satellite imagery are transforming many areas of
environmental science. In all these cases there is a need to
use advanced IT to assist scientists in managing, visualizing
and analyzing their data.



The eScience agenda is not only about very big data in the
Terabyte and Petabyte range. The need to collaborate, re-
use and mine many small data sets is a common feature of
many different fields and eScience covers the tools and
technologies required to make this possible. The tools must
cover the entire data life cycle, from acquisition to archive.
Furthermore, the tools needed by scientists can incorporate
advanced computer science algorithms but they also need
to be robust and reliable - not just research prototypes
beloved by computer science researchers!



Increasingly, eScience technologies will be relevant to the
Humanities and Social Sciences, and perhaps the term
eResearch, as in the Australian eResearch program, is a
more appropriate name. The explosive growth in scientific
data will also affect scholarly publishing and libraries. In
addition to the scientific data revolution, we are also seeing a
transformation in how we publish scientific data and how
we assign credit for such tasks as data curation.



After a brief survey of the state of eScience today, with
some examples of what Jim Gray called the ‘Fourth
Paradigm’ for scientific research, the talk concludes with a
look to the future where semantic computing technologies
and Cloud services are certain to play an increasingly
important role.
