Visualizing the world's Twitter data - Jer Thorp

Transcriber: Andrea McDonough
Reviewer: Bedirhan Cinar

A couple of years ago I started using Twitter,

and one of the things that really charmed me about Twitter

is that people would wake up in the morning

and they would say, “Good morning!”

which I thought,

I’m a Canadian,

so I was a little bit,

I liked that politeness.

And so, I’m also a giant nerd,

and so I wrote a computer program

that would record 24 hours of everybody on Twitter

saying, “Good morning!”

And then I asked myself my favorite question,

“What would that look like?”

Well, as it turns out, I think it would look something like this.

Right, so we’d see this wave of people

saying, “Good morning!” across the world as they wake up.

Now the green people, these are people that wake up

at around 8 o’clock in the morning,

Who wakes up at 8 o’clock or says, “Good morning!” at 8?

And the orange people,

they say, “Good morning!” around 9.

And the red people, they say, “Good morning!” around 10.

Yeah, more 10s than 8s.
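Editor's sketch: the talk does not show the program itself, so the following is only a minimal illustration of the idea described above, assuming each recorded tweet carries a UTC timestamp and a known UTC offset for the person tweeting. The names `HOUR_COLORS`, `is_good_morning`, `local_hour`, and `color_for` are hypothetical, and the colour table simply mirrors the green/orange/red hours mentioned in the talk.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical colour table following the talk: green ~8, orange ~9, red ~10.
HOUR_COLORS = {8: "green", 9: "orange", 10: "red"}


def is_good_morning(text: str) -> bool:
    """Return True if the tweet text contains the greeting (case-insensitive)."""
    return "good morning" in text.lower()


def local_hour(utc_time: datetime, utc_offset_hours: float) -> int:
    """Return the hour of day in the tweeter's local time (assumed offset)."""
    return (utc_time + timedelta(hours=utc_offset_hours)).hour


def color_for(utc_time: datetime, utc_offset_hours: float) -> str:
    """Map a greeting's local hour to a colour; other hours fall back to grey."""
    return HOUR_COLORS.get(local_hour(utc_time, utc_offset_hours), "grey")


if __name__ == "__main__":
    # A greeting sent at 13:05 UTC by someone five hours behind UTC.
    sent = datetime(2012, 1, 1, 13, 5, tzinfo=timezone.utc)
    if is_good_morning("Good Morning!"):
        print(color_for(sent, -5))
```

Plotting each coloured point at the tweeter's location, ordered by timestamp, would give the kind of wave the talk describes; how the original visualization actually did this is not stated in the transcript.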

And actually if you look at this map,

we can learn a little bit about how people wake up

in different parts of the world.

People on the West Coast, for example,

they wake up a little bit later

than those people on the East Coast.

But that’s not all that people say on Twitter, right?

We also get these really important tweets, like,

“I just landed in Orlando!! [plane sign, plane sign]”

Or, or, “I just landed in Texas [exclamation point]!”

Or “I just landed in Honduras!”

These lists, they go on and on and on,

all these people, right?

So, on the outside, these people are just telling us

something about how they’re traveling.

But we know the truth, don’t we?

These people are show-offs!

They are showing off that they’re in Cape Town and I’m not.

So I thought, how can we take this vanity

and turn it into utility?

So, using an approach similar to the one I used with “Good morning,”

I mapped all those people’s trips

because I know where they’re landing,

they just told me,

and I know where they live

because they share that information on their Twitter profile.

So what I’m able to do with 36 hours of Twitter

is create a model of how people are traveling

around the world during that 36 hours.
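Editor's sketch: the trip model described above pairs where a person says they landed with the home location on their profile. The code below is a rough illustration under that assumption, not the speaker's implementation. The `Trip` type, the `LANDED` pattern, and `extract_trip` are hypothetical names, and a real pipeline would presumably also geocode both place names, which is omitted here.

```python
import re
from typing import NamedTuple, Optional

# Captures the place name after "just landed in". This is a simplification:
# it stops at the first character that is not a letter, space, period,
# apostrophe, or hyphen, so trailing words in a longer sentence are kept too.
LANDED = re.compile(
    r"\bjust landed in\s+([A-Za-z][A-Za-z .'-]*[A-Za-z])", re.IGNORECASE
)


class Trip(NamedTuple):
    origin: str       # taken from the profile location field
    destination: str  # taken from the tweet text


def extract_trip(tweet_text: str, profile_location: str) -> Optional[Trip]:
    """Return a Trip if the tweet announces a landing and the profile has a location."""
    match = LANDED.search(tweet_text)
    if match is None or not profile_location.strip():
        return None
    return Trip(profile_location.strip(), match.group(1).strip())


if __name__ == "__main__":
    print(extract_trip("I just landed in Orlando!!", "Toronto"))
```

Collecting these origin and destination pairs over a fixed window, such as the 36 hours mentioned in the talk, gives a list of edges that can be drawn as arcs on a map.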

And this is kind of a prototype

because I think if we listen to everybody

on Twitter and Facebook and the rest of our social media,

we’d actually get a pretty clear picture

of how people are traveling from one place to the other,

which actually turns out to be a very useful thing for scientists,

particularly those who are studying how disease is spread.

So, I work upstairs in the New York Times,

and for the last two years,

we’ve been working on a project called, “Cascade,”

which in some ways is kind of similar to this one.

But instead of modeling how people move,

we’re modeling how people talk.

We’re looking at what a discussion looks like.

Well, here’s an example.

This is a discussion around an article called,

“The Island Where People Forget to Die”.

It’s about an island in Greece where people live

a really, really, really, really, really, really long time.

And what we’re seeing here

is we’re seeing a conversation that’s stemming

from that first tweet down in the bottom, left-hand corner.

So we get to see the scope of this conversation

over about 9 hours right now,

we’re going to creep up to 12 hours here in a second.

But, we can also see what that conversation

looks like in three dimensions.

And that three-dimensional view is actually much more useful for us.

As humans, we are really used to things

that are structured as three dimensions.

So, we can look at those little off-shoots of conversation,

we can find out what exactly happened.

And this is an interactive, exploratory tool

so we can go through every step in the conversation.

We can look at who the people were,

what they said,

how old they are,

where they live,

who follows them,

and so on, and so on, and so on.

So, the Times creates about 6,500 pieces of content every month,

and we can model every single one

of the conversations that happen around them.

And they look somewhat different.

Depending on the story

and depending on how fast people are talking about it

and how far the conversation spreads,

these structures, which I call conversational architectures,

end up looking different.
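Editor's sketch: one way to see why these structures differ is to treat a conversation as a tree, where each tweet points back to the tweet it responded to, and then measure its size, depth, and branching. This is an assumption-laden illustration and not how Cascade itself works, which the talk does not detail. The input format (`parent_of`) and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def build_children(parent_of: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Invert a child -> parent mapping into parent -> list of children."""
    children: Dict[str, List[str]] = defaultdict(list)
    for tweet_id, parent_id in parent_of.items():
        if parent_id is not None:
            children[parent_id].append(tweet_id)
    return children


def cascade_shape(parent_of: Dict[str, Optional[str]]) -> Tuple[int, int, int]:
    """Return (size, depth, widest fan-out) for a conversation tree.

    Depth counts the root tweet as level 1; fan-out is the largest number
    of direct responses to any single tweet.
    """
    children = build_children(parent_of)
    roots = [t for t, p in parent_of.items() if p is None]
    depth = 0
    stack = [(root, 1) for root in roots]
    while stack:  # iterative depth-first walk over the tree
        node, level = stack.pop()
        depth = max(depth, level)
        stack.extend((child, level + 1) for child in children.get(node, []))
    fan_out = max((len(kids) for kids in children.values()), default=0)
    return len(parent_of), depth, fan_out


if __name__ == "__main__":
    # "a" is the first tweet; "b" and "c" respond to it; "d" responds to "b".
    print(cascade_shape({"a": None, "b": "a", "c": "a", "d": "b"}))
```

A story that spreads quickly from one account would show a large fan-out and small depth, while a long back-and-forth would show the opposite, which is one plausible reading of why the structures look different from story to story.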

So, these projects that I’ve shown you,

I think they all involve the same thing:

we can take small pieces of data

and by putting them together,

we can generate more value,

we can do more exciting things with them.

But so far we’ve only talked about Twitter, right?

And Twitter isn’t all the data.

We learned a moment ago

that there are tons and tons,

tons more data out there.

And specifically, I want you to think about one type of data

because all of you guys,

everybody in this audience, we,

we, me as well,

are data-making machines.

We are producing data all the time.

Every single one of us, we’re producing data.

Somebody else, though, is storing that data.

Usually we put our trust into companies to store that data,

but what I want to suggest here

is that rather than putting our trust

in companies to store that data,

we should put the trust in ourselves

because we actually own that data.

Right, that is something we should remember.

Everything that someone else measures about you,

you actually own.

So, it’s my hope,

maybe because I’m a Canadian,

that all of us can come together

with this really valuable data that we’ve been storing,

and we can collectively launch that data

toward some of the world’s most difficult problems

because big data can solve big problems,

but I think it can do it best

if it’s all of us who are in control.

Thank you.
