Why everyone should become a data scientist

[Music]

data exists

in many forms and is often generated by

our non-deliberate actions

when we spend time with our friends and

loved ones on social media

we’re generating data about our likes

dislikes the language we speak

our location and so much more and this

data

is often used by social media companies

to help

improve our next online experience

imagine a life without social media

imagine a life without the ability to

buy from the comfort of our homes

imagine a life without data

take this picture for instance which

most of you would identify to be

someone’s running sessions actually my

running sessions

from this picture we can tell so many

stories because

it holds a set of facts that help us

make the next key decision

or simply put because it holds data

for instance we can tell that there’s an

almost fixed distance that i run

and you might notice that my sessions

happen on a saturday

or a sunday in the month of february

and maybe you might even go on to make a

mental note

of my average running pace and perhaps

compare it with your own pace of running

or that of someone else that you know

and most importantly you might have

already noticed that

i won’t be representing zambia at the

next olympic games

in the case of online purchases data

about

our purchase transactions is usually

used to suggest the next items

that we might be interested in buying

it’s even further used to protect our

money from fraudsters

who might attempt to make purchases

using our bank details

the fraudsters are detected through

anomalous transactions

when data about a current transaction is

compared against the history of our past

purchases
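
The comparison described above, a current transaction checked against the history of past purchases, can be sketched in a few lines. This is only an illustration under simple assumptions, not how any particular bank or store detects fraud: the function name `is_anomalous`, the threshold of three standard deviations, and the purchase amounts are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, amount, threshold=3.0):
    """Flag `amount` if it lies more than `threshold` sample standard
    deviations away from the mean of the past purchases in `history`."""
    if len(history) < 2:
        raise ValueError("need at least two past purchases")
    centre = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Every past purchase was identical, so any other amount stands out.
        return amount != centre
    return abs(amount - centre) / spread > threshold


# Hypothetical purchase history (amounts in any one currency).
past_purchases = [20, 25, 22, 30, 18, 27]

print(is_anomalous(past_purchases, 24))   # a purchase close to the usual amounts
print(is_anomalous(past_purchases, 500))  # a purchase far outside the usual range
```

A real system would likely also look at location, time of day and merchant, but the idea is the same: the history defines what is normal, and the new transaction is measured against it.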

data has become an important game

changer in the world today

or as most people would put it it has

become the new oil

imagine such a great power being

harnessed by a country zambia

this is a dream that is not beyond reach

but it’s one that calls for all of us to

become data scientists

stay with me there it’s not as scary as

it sounds

i’ll explain along the way how we can

at least be data literate

to harness the power within data the

world today offers two solutions that

are often spoken of interchangeably

one is called data science

which is a collection of scientific

approaches to obtain meaning from data

and the other is called big data or the

ability to leverage the power of

modern day computing devices such as

storage

and processing speeds to analyze large

amounts of data with great speeds and

efficiencies

that have never been known before

big data and data science have led to

mixed experiences

that may discourage a developing

continent like africa

from taking part in their use however

the power might be critical in unlocking

africa’s

next level of growth and ushering in

zambia’s new development

some of you may recall the scandal

between facebook and cambridge analytica

where it was reported that data of up to

87 million facebook users

was illegally acquired and

inappropriately used

by the british political consulting firm

cambridge analytica

you may have even heard of how in the

united kingdom

many parents and their children cried

foul when the uk’s office of

qualifications

and examinations regulation ofqual

allowed a data algorithm to determine

the results of students

and the results were found to be biased

against students coming from poorer

backgrounds

and this may leave you wondering is big

data safe

can we trust data science i have spent

the last two years collaborating with

friends

and organizing free classes in data

science and artificial intelligence

in community outreach projects we aim to

raise awareness of how

data science when used properly can

bring about development

in my work with

these other data enthusiasts i have

learnt of a great concern

a concern as to whether it is even

possible to take up

data science as a fully fledged career

in zambia

and what is the reason for their concern

that we do not have enough data to work

with in zambia

is it really that we lack the capacity

as a country to generate enough data to

sustain a career in data science

is this doubt keeping us as a nation

from becoming the next leader in a

data-driven tomorrow

well you might think for a moment that

we’re nowhere near generating and

consuming our own data

in the sizes of big data but while this

might seem true

we do not lack data we have so much

data that surrounds even the

simplest of tasks

so let’s address this hesitation imagine

that

you’re a grocery store owner and you’re able to

identify patterns

in customer purchases with just a basic

literacy level in data

you could use these patterns to rearrange

your shelves

for greater customer convenience and

thereby maximize your profits

there’s so much data out there sitting

and waiting for us

to harness its value but sometimes

discovering it

calls for creativity and an innovative

way of thinking

other times it calls for deliberate

attempts to harness it

but this is not only limited to business

settings

i have a story i’d like to share with

you of a personal project

in which i applied analytics

to gain insight

i’m an i.t auditor by trade and part of

my work

involves performing data analytics

and in this story as you can see i used

my analytics to actually gain an insight

of my use of a basic commodity

sugar we often tend to think that we

know ourselves better than

others do however the outcome of my project

awakened me to how much of a stranger i

was to my own body cues

and this is clear evidence of how facts

and figures

can clearly transform how we operate when

you move away from leading based on gut

feeling

to making decisions driven by data

insight

and my success in having captured enough

data to work with

is evidence enough of how readily

available data can be

so here goes my sugar analytics

on a summer day in 2020 i dashed into

the kitchen to make a bowl of cereal

but while in the process of trying to

make my cereal

i discovered that my weekly allocation

of sugar had run out

because it had run out only halfway

through the week i found this quite

strange

but since i live alone i knew there was

no one else to suspect apart from myself

and wanting to learn what had caused

this to happen piqued my interest

so i began recording each refill of my

sugar container

and the circumstances surrounding the refill

i had questions that needed to be

answered

whether my high use of sugar had

anything to do with stress

i even had my assumptions that it did

because i couldn’t think of any other

possible factors

after three months of recording my sugar

usage and recording the number of tasks

that i took up at work

i analyzed my data i noticed that in

most cases i used more sugar with each

increase in workloads

and to confirm this i checked my

correlation coefficient

which is a mathematical scale that helps

us get rid of false assumptions

the scale runs from negative one to one

with zero meaning no linear relationship and anything

moving further away from zero being

a stronger relationship in my analysis my

correlation coefficient was

negative 0.32 which is a weak

inverse relationship
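
For readers who want to try the same check, here is a minimal sketch of the correlation coefficient described above (the Pearson coefficient, which runs from negative one to one). The function name `pearson_r` and the weekly figures are hypothetical and are not the data from the talk.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly records: tasks taken up at work and spoons of sugar used.
tasks_per_week = [3, 5, 2, 6, 4, 7]
sugar_per_week = [14, 10, 15, 11, 12, 13]

print(round(pearson_r(tasks_per_week, sugar_per_week), 2))
```

A value near one means the two rise together, a value near negative one means one falls as the other rises, and a value near zero means there is little linear relationship either way.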

and this was just about one of the last

things that could go wrong with my

analysis

with this failure i decided to consider

where my sugar was being consumed

and it was only after considering my

location that i was able to establish

a connection between

my sugar usage and workload

i learned that i used more sugar when

stressed if working from home

and on the other hand i used more sugar

when not stressed if working from the

office
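
One way to see how location can hide a relationship is to compute the correlation separately for each location. The sketch below uses made-up numbers chosen so that sugar use rises with workload at home and falls with workload at the office; the names `pearson_r` and `records` are hypothetical, and the figures are not the speaker's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical records: (location, tasks taken up, spoons of sugar used).
records = [
    ("home", 2, 4), ("home", 4, 5), ("home", 6, 7), ("home", 8, 8),
    ("office", 2, 12), ("office", 4, 10), ("office", 6, 7), ("office", 8, 5),
]

# Correlation over all records, ignoring location.
overall = pearson_r([t for _, t, _ in records], [s for _, _, s in records])

# Correlation within each location.
by_location = {}
for place in ("home", "office"):
    tasks = [t for loc, t, _ in records if loc == place]
    sugar = [s for loc, _, s in records if loc == place]
    by_location[place] = pearson_r(tasks, sugar)

print("overall:", round(overall, 2))
for place, r in by_location.items():
    print(place + ":", round(r, 2))
```

With these numbers the overall coefficient is weak and negative, while each location on its own shows a strong relationship in opposite directions, which is the kind of pattern that only appears once the data is split.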

and this is just one of the many

insights that i gained from analysis of

my sugar usage

i learned how my emotions work and body

state

affected my sugar intake i also

discovered that my sugar intake levels

actually

shot up even before my workload

increased but i knew this to be due to

anxiety

that i usually faced before taking up

a new job

an analysis of my sugar research data

also helped me then to discover how

this anxiety for me led to stress

lengthened working hours and reduced

productivity

i began cautiously looking out for some

of these triggers thanks to the

information that i gathered about myself

by using my analytics skills to know when

i’m anxious

and i now actively consult and reach out

to people who may have

more technical knowledge about a new

task that i would be assigned to undertake

and this helps to calm myself and to

plan more efficiently

before taking up the new job

accurately knowing how much i exceeded

the daily recommended sugar intake

levels

led me to take up running as an active

sport that i do each weekend

and knowing the impact of my fast-paced

work environment on myself

has led me to appreciate the

importance of rest

analysis of my sugar research data and

the changes it prompted

enabled me to build up my physical

stamina

and keep a keen eye on my mental

well-being thus increasing

productivity

well i’m sure you’re now considering taking

up an online course

in data science for others you can

always

take a step by appreciating the data

around you

as a continent taking full

responsibility for the world we’re

creating today

for our children tomorrow calls for all

of us to have

a basic literacy level in data science

as this would help us

understand the value of data and ensure

accountability

among users of our data and learning to

be brokers of our own data

is the beginning of a dawn for a new

africa i’d like to take you back to that

shopkeeper

who has a chance of increasing their

profits by becoming data literate

you might even know of a teacher who

could better understand their students’

strengths

by analyzing their grades or you might

even be that doctor

with a wealth of data on past patient

illnesses or that

government official

seeking to make meaningful development

data is always at the mercy of the

explorer

thank you

[Music]
