Data science: an emerging discipline
Main Author: | Galpin, Ixent |
---|---|
Published: | 2018 |
Online Access: | http://hdl.handle.net/11634/11508 |
Summary: |
The role of data scientist has been described
as the “sexiest job of the 21st Century”. While
possibly there is a degree of hype associated with
such a claim, there are factors at play such as the
unprecedented growth in the amount of data
being generated. This paper characterises the
already established disciplines which underpin
data science, viz., data engineering, statistics,
and data mining. Following a characterisation
of the previous fields, data science is found to be
most closely related to data mining. However, in
contrast to data mining, data science promises
to operate over datasets that exhibit significant
challenges in terms of the four Vs: Volume, Variety,
Velocity and Veracity. This paper notes that the
current emphasis, both in industry and academia,
is on the first three Vs, which pose mainly
scientific or technological challenges, rather than
Veracity, which is a truly scientific (and arguably
a more complex) challenge. Data science can
be seen to have a more ambitious objective than
data mining traditionally has: as a science,
data science aims to lead to the creation of new
theories and knowledge. This paper notes that,
ironically, the veracity dimension, which is
arguably the one most closely related to this objective,
is being neglected. Despite the current media
frenzy about data science, the paper concludes
that more time is needed to see whether it will
emerge as a discipline in its own right. |
---|