Posted on: 22 May 2019

Key differences between a Data Scientist and Data Engineer


In the information age, data is everything, so it’s no wonder that positions like ‘data scientist’ and ‘data engineer’ have been created. While these are new job titles, the core work has existed for a while in the form of data analysis. In the past, anyone who analyzed data would be known as a ‘data analyst’ and anyone who created backend support platforms to handle the data analysis would be known as a ‘business intelligence (BI) developer’.

However, with the emergence of big data, corporations and research facilities needed to handle much larger volumes of data, and the demand for data scientists and data engineers grew with it. Below is a quick guide to the world of big data, the different roles available, and some of the skills and tools used in each:

1. The Data Analyst

Analysts turn all forms of data into manageable chunks of information and process them into reports and visual representations. They also tend to have a strong understanding of the professional tools used to solve problems and help guide business decisions. From pie charts to bar graphs, visual representations can be very useful for making decisions about complex problems when there is an excess of information to process. However, don’t expect data analysts to analyze big data. They generally don’t have the mathematical background or the technical know-how to develop complex algorithms for specific problems.
Core skills: Statistics, data munging, visualization and exploratory data analysis.
Tools used: Microsoft Excel, SPSS, SPSS Modeler, SAS, SAS Miner, SQL, Microsoft Access, Tableau, SSAS.
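
To make "exploratory data analysis" a little more concrete, here is a minimal sketch in Python of the kind of summary an analyst might produce before charting anything. The sales figures, the region names and the summarize helper are all invented for illustration; in practice the same summary would more likely be built in Excel, SQL or Tableau.

```python
import statistics
from collections import defaultdict

# Hypothetical records (region, units sold); invented for illustration only.
sales = [
    ("North", 120), ("North", 135), ("North", 150),
    ("South", 80), ("South", 95), ("South", 70),
]

def summarize(records):
    """Group values by category and compute simple summary statistics."""
    groups = defaultdict(list)
    for category, value in records:
        groups[category].append(value)
    return {
        category: {
            "count": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for category, values in groups.items()
    }

summary = summarize(sales)
for region, stats in summary.items():
    print(region, stats)
```

A table like this is usually the starting point for the pie charts and bar graphs mentioned above.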

2. The business intelligence developers

BI developers are known to be data experts who work with internal stakeholders to understand reporting needs. They collect requirements, then design and build business intelligence and reporting solutions. They are also tasked with designing, developing and supporting new and existing data warehouses, ETL packages, dashboards and analytical reports. They are expected to work with SQL to integrate data from different sources. However, BI developers don’t perform data analysis.
Core skills: ETL, developing reports, OLAP, cubes, web intelligence, business objects design.
Tools used: Tableau, dashboard tools, SQL, SSAS, SSIS and SPSS Modeler.

3. The data engineers

These engineers are tasked with creating the infrastructure that holds the data analyzed by data scientists. They are software engineers who design, build, integrate and manage big data. They generally work on optimizing the performance of the big data ecosystem within a corporation and ensuring everything is easily accessible and working smoothly. Using tools like MySQL or MongoDB, they might run Extract, Transform and Load (ETL) jobs on top of large datasets and create data warehouses which can be used for reports and analysis by the data scientists. Data engineers keep themselves busy with the design and architecture of the whole ecosystem and don’t generally dabble with machine learning.
Core skills: Hadoop, MapReduce, Hive, Pig, Data streaming, NoSQL, SQL, programming.
Tools used: DashDB, MySQL, MongoDB, Cassandra.
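
The Extract, Transform and Load steps mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV content, the orders table and the function names are hypothetical, and an in-memory SQLite database stands in for a production system such as MySQL so that the example runs anywhere.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a source system.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,BOB,20.00
3,alice,
4,Carol,5.25
"""

def extract(text):
    """Extract: read raw rows from CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with no amount, normalize names, cast types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(row["amount"]))
        )
    return cleaned

def load(conn, rows):
    """Load: write the cleaned rows into a table that reports can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

# SQLite in memory is an assumption made to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Keeping the three steps as separate functions mirrors how larger pipelines are usually organized: each stage can be tested, scheduled or replaced on its own.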

4. Finally, the data scientists

There are many sources on the internet that hail data scientists as modern-day alchemists, turning raw data into purified insights. While that may be a bit of an exaggeration, it’s not too far from the truth! Statistics, machine learning and analysis are the main tools of a data scientist, and they can be used to solve critical business problems. Turning large rivers of big data into valuable and actionable insights is no easy task. Data science is an evolution of the data analysis done in past years, improving on it with automation and machine learning. Data scientists are expected to be veteran programmers with an ability to design new algorithms on the fly. There is a huge demand for these individuals these days.

But what’s the point of all these actionable insights if they cannot be visualized properly? In fact, one of the expectations of this job is to be able to visualize findings using apps or other technologies. For example, think of the Jarvis UI from the Iron Man movies. Telling interesting stories about the solutions to data or business problems becomes part of the job.

Data scientists are also required to understand traditional and new data analysis methods to build statistical models and discover patterns in data. Currently, Microsoft is offering courses in data science.
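
As a very small illustration of what building a statistical model can look like, the sketch below fits a straight line to a handful of points using ordinary least squares. The numbers and the fit_line helper are invented for illustration; real projects work with far larger datasets and usually rely on dedicated libraries rather than hand-written formulas.

```python
# Hypothetical observations: advertising spend (x) and resulting sales (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized,
    # since the common factor cancels in the ratio).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(spend, sales)
print(f"sales = {slope:.2f} * spend + {intercept:.2f}")

# Use the fitted model to estimate sales for an unseen level of spend.
predicted = slope * 6.0 + intercept
print(f"predicted sales at spend 6.0: {predicted:.2f}")
```

The pattern discovered here is simply that sales rise by roughly two units for each extra unit of spend, which is the kind of insight that then needs to be visualized and explained.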

If you’d like to discuss your big data requirements, do feel free to contact us at Asahi Technologies. We build custom applications and we can work with your data science team for all your reporting needs.

Do subscribe to our newsletter for more blogs on big data and custom software.

Malathi Aranganathan

Senior QA Engineer