In part five of our series, we look at the skills and background needed to embark on a career in data science.
So what are the skills and background needed to embark on a data science career?
A bachelor’s or master’s degree in any quantitative subject is a good starting point for a data science career. The most relevant subjects include computer science, mathematics, physics and statistics. However, having studied something else does not rule you out.
Fyte’s Ali Hussnain and Paul van Loon, Head of Analytics at Forecast, discuss in more detail the educational background you need to break into the field of data science.
The data science lifecycle
The concept of data science is relatively new: the term ‘data scientist’ (coined in 2008 and popularised in 2012) continues to evolve. Data science now involves a whole spectrum of tasks, some of which you might already have carried out on the way to earning your degree.
According to the School of Information at Berkeley, the data science life cycle has five stages:
- Capture: The data scientist acquires the data via one or more means – by actively creating new data, obtaining existing data produced by organisations or receiving data created by devices.
- Maintain: This stage is mainly about data integration and pre-processing, without deriving commercial value from the data just yet. The idea is to prepare your data for synthesis and use in the next stage. Maintenance tasks include data warehousing, data cleansing and the extract-transform-load (ETL) processes where data staging occurs.
- Process: This is where data mining – discovering patterns in the dataset – comes in. A variety of methods are used, from simple summarisation to more advanced methods like clustering and classification.
- Analyse: At this stage, insights are extracted from the dataset using both quantitative and qualitative analysis. Tasks range from Exploratory Data Analysis (EDA) and confirmatory analysis to regressions and predictive modelling. Text analytics and Natural Language Processing (NLP) are also becoming increasingly popular for analysing customer comments or legal documents.
- Communicate: Finally, you present the results to stakeholders. This includes data reporting and visualisation, and usually some business intelligence and commercial impact analysis to inform decision makers.
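To make the five stages more concrete, here is a minimal Python sketch that walks a tiny dataset through each of them. It is an illustration only: the advertising and sales figures, the field names and the helper functions (`clean`, `summarise`, `fit_line`, `report`) are all made up for this example, and a real project would typically rely on dedicated libraries and far larger data.

```python
from statistics import mean

# 1. Capture: hypothetical raw records, e.g. exported from a web form.
raw = [
    {"ad_spend": "10", "sales": "25"},
    {"ad_spend": "20", "sales": "45"},
    {"ad_spend": "", "sales": "30"},  # incomplete record
    {"ad_spend": "30", "sales": "65"},
    {"ad_spend": "40", "sales": "85"},
]


# 2. Maintain: drop incomplete rows and convert text to numbers.
def clean(records):
    return [
        (float(r["ad_spend"]), float(r["sales"]))
        for r in records
        if r["ad_spend"] and r["sales"]
    ]


# 3. Process: simple summarisation of the cleaned data.
def summarise(rows):
    return {
        "n": len(rows),
        "mean_spend": mean(x for x, _ in rows),
        "mean_sales": mean(y for _, y in rows),
    }


# 4. Analyse: least-squares fit of sales = intercept + slope * ad_spend.
def fit_line(rows):
    mx = mean(x for x, _ in rows)
    my = mean(y for _, y in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, my - slope * mx


# 5. Communicate: turn the numbers into a sentence for stakeholders.
def report(summary, slope, intercept):
    return (
        f"Based on {summary['n']} complete records, each extra unit of ad "
        f"spend is associated with about {slope:.1f} extra units of sales "
        f"(baseline {intercept:.1f})."
    )


rows = clean(raw)
summary = summarise(rows)
slope, intercept = fit_line(rows)
print(report(summary, slope, intercept))
```

Each function maps to one stage, which is also roughly how responsibilities tend to be split in practice: the cleaning step can be reused and tested independently of the modelling step.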
STEM and stats background
Now that you have a basic understanding of the tasks a data scientist performs on a day-to-day basis, it is clear that a STEM subject could help you kick-start your career. But you also need in-depth knowledge of maths and statistics, solid programming skills and a comprehensive grasp of data collection, data cleaning, data analysis and data reporting. Hands-on experience of working with different datasets would certainly help. There is an abundance of written material and online courses out there for you to learn along the way.
What kind of data scientist is in demand? First, let’s look at the diverse types of careers related to data. It’s not just data scientists: there are also business analysts, data developers and data engineers. The skillset and knowledge required for each job family vary.
A conference paper by De Mauro et al. (‘Beyond Data Scientists, a Review of Big Data Skills and Job Families’, 2016) analysed a large number of job posts published online and shed some light on the skills required to thrive in the data industry. For example, business analysts lean more towards the commercial side of things, and are often equipped with effective communication skills and financial acumen to make a real impact on business transformation.
Core competencies of a data scientist
However, on your career path as a data scientist, the focus of your skillset has to be on analytical methods and the ability to turn data into actual insights. As a data scientist, you need to be proficient at using data warehouses and adept at querying or extracting data from databases, whether in the cloud or on your local machine.
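As a small illustration of what querying a database can look like, the sketch below uses Python’s built-in `sqlite3` module with an in-memory database. The `orders` table, its columns and the figures in it are hypothetical and exist only for this example; a production data warehouse would use a different engine and connection method, but the SQL pattern of grouping and aggregating is much the same.

```python
import sqlite3

# An in-memory database stands in for a real warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")

# Hypothetical sample rows, inserted with parameterised placeholders.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("north", 120.0),
        ("south", 80.0),
        ("north", 60.0),
        ("south", 40.0),
        ("east", 200.0),
    ],
)

# Extract an aggregate view: order count and revenue per region.
totals = conn.execute(
    """
    SELECT region, COUNT(*) AS n_orders, SUM(amount) AS revenue
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC
    """
).fetchall()

for region, n_orders, revenue in totals:
    print(f"{region}: {n_orders} orders, revenue {revenue:.2f}")

conn.close()
```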
It is also your responsibility as a data scientist to leverage the data at hand – identifying patterns, extracting information, and designing and implementing models for descriptive or predictive purposes with the business context in mind. Therefore, a solid understanding of statistics is a must, and you also need some programming skills to implement your model in R, Python or other languages.
Data scientists also seek to improve metrics and statistical models continually, and to integrate research and learning as part of that process. A ‘scientist’ mindset helps you accumulate expertise through trial and error.
Here, Ali Hussnain and Paul van Loon go into more detail about the discipline of data science and the mindset needed to succeed.