Keynote Remarks By DR. YEMI KALE
Statistician-General of the Federation / Chief Executive Officer, National Bureau of Statistics
Delivered at the
1st National Summit on Big Data Economy /
2nd Data Science Bootcamp
I am honored to join you here today at the 1st National Summit on Big Data Economy and the 2nd Data Science Bootcamp, organized by Data Science Nigeria. I consider this event, its organization and its objectives highly commendable, especially as it aims to address a pressing challenge in our society: preparing today’s youth for the workforce of tomorrow, in particular by closing the skills gap among young professionals and students in the emerging field of data science, specifically in machine learning, programming and data analytics.
Undoubtedly, one of the most engaging and increasingly important areas of discussion since the dawn of the 21st century has been statistics and getting data right. Today we talk of open data, big data, and the right data. We hear debates about whether African data is poor, and whether there is a statistical tragedy or a statistical renaissance in Africa. Attention to the quality of data has increased globally. Even in more developed countries like the UK, Canada and the USA, questions are being raised about the quality of data, errors in data, misuse of data and polarisation of data.
All around us, we observe a quantum leap in the type, size and scale of data, driven by rapid advances in the world of computing. Vast amounts of data are being generated every second of the day, across the world, in various forms and across various sectors. This is what we call ‘Big Data’. (The term simply represents the increasing amount and the varied types of data that are now being collected.)
We often define or describe data as a collection of facts that has been translated into a form that provides information. When we hear the word data, we often think of numbers and figures. However, data can take the form of numbers, images, words, figures, facts or ideas. It is thought to be the lowest unit of information, from which other measurements and analyses can be derived. In other words, data (whether ordinary or what we call big data) is simply another word for information. What differentiates Big Data from the “regular data” we were analyzing before is that the tools we use to collect, store and analyze it have had to change to accommodate the increase in size and complexity. With the latest tools on the market, we no longer have to rely on sampling alone. Instead, we can often process datasets in their entirety and gain a far more complete picture of the world around us.