Data Science


Data science, like artificial intelligence, is not new; it has been around for decades. It drew wide attention in 2012, when Harvard Business Review called data scientist “The Sexiest Job of the 21st Century.” It is closely related to the themes of this course, data engineering and data management.

Data science is an interdisciplinary field that uses scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured. In this respect it resembles data mining. The main difference is one of scope: data mining is an activity that forms one part of the broader Knowledge Discovery in Databases (KDD) process, while data science is a field of study, like applied mathematics or computer science.

Data science is often viewed in a broad sense, while data mining is considered a narrower specialty. The data science process consists of the following six steps:

  1. Frame the problem,
  2. Collect the raw data,
  3. Process the data for analysis,
  4. Explore the data,
  5. Perform in-depth analysis, and
  6. Communicate results.
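
The six steps above can be sketched as a very small program. This is only an illustration, not a prescribed workflow: the question, the data set, and every function name (`collect`, `process`, `explore`, `analyze`, `communicate`) are hypothetical and were invented for this example. It uses only the Python standard library.

```python
import csv
import io
import statistics

# Step 1: frame the problem (a hypothetical question for illustration).
QUESTION = "What share of website visitors sign up?"

# Hypothetical raw data, invented for this sketch. Note the missing value on Wed.
RAW_CSV = """day,visitors,signups
Mon,120,6
Tue,150,9
Wed,,7
Thu,200,14
Fri,180,12
"""


def collect(raw_text):
    """Step 2: collect the raw data (here, parse an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def process(rows):
    """Step 3: process the data by dropping incomplete rows and converting types."""
    clean = []
    for row in rows:
        if not row["visitors"] or not row["signups"]:
            continue  # skip rows with missing values
        clean.append({
            "day": row["day"],
            "visitors": int(row["visitors"]),
            "signups": int(row["signups"]),
        })
    return clean


def explore(rows):
    """Step 4: explore the data with simple summary statistics."""
    visitors = [r["visitors"] for r in rows]
    return {
        "n": len(rows),
        "mean_visitors": statistics.mean(visitors),
        "max_visitors": max(visitors),
    }


def analyze(rows):
    """Step 5: in-depth analysis, reduced here to an overall conversion rate."""
    total_visitors = sum(r["visitors"] for r in rows)
    total_signups = sum(r["signups"] for r in rows)
    return total_signups / total_visitors


def communicate(summary, rate):
    """Step 6: communicate the result in one plain sentence."""
    return f"Across {summary['n']} complete days, {rate:.1%} of visitors signed up."


rows = process(collect(RAW_CSV))
summary = explore(rows)
rate = analyze(rows)
report = communicate(summary, rate)
print(QUESTION)
print(report)
```

In practice each step is far larger than a single function, and the steps are usually revisited several times rather than run once in order; the sketch only shows how the output of one step feeds the next.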




      “People who know little are usually great talkers,    
      while men who know much say little.”    
      ― Jean Jacques Rousseau