The race to sequence the human genome - Tien Nguyen

Packed inside every cell in your body
is a set of genetic instructions,

3.2 billion base pairs long.

Deciphering these directions
would be a monumental task

but could offer unprecedented insight
into the human body.

In 1990, a consortium
of 20 international research centers

embarked on the world’s largest
biological collaboration

to accomplish this mission.

The Human Genome Project proposed
to sequence the entire human genome

over 15 years
with $3 billion of public funds.

Then, seven years
before its scheduled completion,

a private company called Celera announced
that it could accomplish the same goal

in just three years
and at a fraction of the cost.

The two camps discussed a joint venture,
but talks quickly fell apart

as disagreements arose over legal
and ethical issues of genetic property.

And so the race began.

Though both teams used the same technology
to sequence the entire human genome,

it was their strategies
that made all the difference.

Their paths diverged
in the most critical of steps:

the first one.

In the Human Genome Project’s approach,

the genome was first divided into smaller,
more manageable chunks

about 150,000 base pairs long

that overlapped each other
a little bit on both ends.

Each of these fragments of DNA

was inserted into a bacterial
artificial chromosome,

where it was cloned and fingerprinted.

The fingerprints showed scientists
where the fragments overlapped

even before the actual sequence was known.

Using the overlapping bits as a guide,

the researchers marked
each fragment’s place in the genome

to create a contiguous map,

a process that took about six years.
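
To make the mapping idea concrete, here is a toy sketch in Python. It is not the consortium's actual pipeline. It assumes a fingerprint can be modeled as a set of band sizes, and the clone names, band values, and the `shared_bands` and `order_clones` helpers are all hypothetical. Clones that share bands are treated as overlapping, and the map is built by walking from one clone to its best-matching neighbor.

```python
# Hypothetical fingerprints: each clone is reduced to a set of band sizes.
# Shared bands suggest that two clones cover overlapping stretches of DNA.
FINGERPRINTS = {
    "clone_A": {120, 340, 560, 910},
    "clone_B": {560, 910, 1300, 1750},
    "clone_C": {1300, 1750, 2100, 2600},
}

def shared_bands(fingerprints, a, b):
    """Count the bands two clones have in common."""
    return len(fingerprints[a] & fingerprints[b])

def order_clones(fingerprints, start):
    """Greedily chain clones by picking the neighbor with the most shared bands."""
    order = [start]
    remaining = set(fingerprints) - {start}
    while remaining:
        # sorted() keeps the choice deterministic when two clones tie
        best = max(sorted(remaining),
                   key=lambda c: shared_bands(fingerprints, order[-1], c))
        if shared_bands(fingerprints, order[-1], best) == 0:
            break  # no detectable overlap: the contiguous map ends here
        order.append(best)
        remaining.remove(best)
    return order

contig_map = order_clones(FINGERPRINTS, "clone_A")
print(contig_map)
```

Note that the ordering is recovered without reading a single letter of sequence, which is the point of the fingerprinting step.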

The cloned fragments were sequenced
in labs around the world

following one of the project’s
two major principles:

that collaboration on our shared heritage
was open to all nations.

In each case, the fragments
were arbitrarily broken up

into small, overlapping pieces
about 1,000 base pairs long.
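
The "shotgun" step of breaking a fragment into random, overlapping pieces can be sketched as follows. This is only an illustration under stated assumptions: the `shotgun_reads` function, the short example sequence, and the piece length of 10 letters (standing in for about 1,000 base pairs) are made up for the demo.

```python
import random

def shotgun_reads(sequence, read_len, n_reads, seed=0):
    """Sample n_reads pieces of length read_len at random start positions."""
    rng = random.Random(seed)  # fixed seed so the demo is repeatable
    last_start = len(sequence) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    # Pieces taken at nearby positions overlap one another by construction.
    return [sequence[s:s + read_len] for s in sorted(starts)]

fragment = "ATGCGTACCTGAAGTCCGATTACGGCTAAGCTTGCAT"  # hypothetical toy fragment
reads = shotgun_reads(fragment, read_len=10, n_reads=12)
for read in reads:
    print(read)
```

Sampling more pieces than strictly needed is what makes neighboring pieces overlap, and those overlaps are what later allow the pieces to be stitched back together.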

Then, using a technology
called the Sanger method,

each piece was sequenced letter by letter.

This rigorous map-based approach,
called hierarchical shotgun sequencing,

minimized the risk of misassembly,

a huge hazard of sequencing genomes
with many repetitive portions,

like the human genome.
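
Why repeats are such a hazard can be shown with a small, hedged example. The two toy "genomes" below are hypothetical. They differ in the order of two segments, but because the repeated element is longer than a piece, every possible 5-letter piece occurs equally often in both. The pieces alone cannot say which arrangement is the true one.

```python
from collections import Counter

def all_reads(sequence, read_len):
    """Every substring of length read_len, counted with multiplicity."""
    return Counter(sequence[i:i + read_len]
                   for i in range(len(sequence) - read_len + 1))

REPEAT = "GATTACA"  # hypothetical repeated element, longer than one piece

# Same building blocks, but the middle segments CCG and AAT are swapped.
genome_1 = "TTC" + REPEAT + "CCG" + REPEAT + "AAT" + REPEAT + "GGC"
genome_2 = "TTC" + REPEAT + "AAT" + REPEAT + "CCG" + REPEAT + "GGC"

same_reads = all_reads(genome_1, 5) == all_reads(genome_2, 5)
print(genome_1 != genome_2, same_reads)
```

A map that pins each large fragment to its place in the genome limits this ambiguity to repeats within a single fragment, which is the safeguard the map-based approach was buying.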

The consortium’s
“better safe than sorry” approach

contrasted starkly with Celera’s strategy,
called whole-genome shotgun sequencing.

It hinged on skipping
the mapping phase entirely,

a faster, though foolhardy, approach
according to some.

The entire genome was directly chopped up

into a giant heap
of small, overlapping bits.

Once these bits were sequenced
via the Sanger method,

Celera would take the formidable risk
of reconstructing the genome

using just the overlaps.
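
Reconstructing a sequence from overlaps alone can be sketched with a simple greedy merge. This is a minimal illustration, not Celera's actual assembler: the `overlap` and `greedy_assemble` functions, the three toy pieces, and the minimum overlap of 3 letters are assumptions made for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of pieces with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more: leave the pieces separate
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads)
                 if k not in (best_i, best_j)] + [merged]
    return reads

# Hypothetical pieces of the toy sequence ATGCGTACCTGAAGTC, given out of order.
pieces = ["CTGAAGTC", "ATGCGTAC", "GTACCTGA"]
print(greedy_assemble(pieces))
```

With unique overlaps the greedy merge recovers the original sequence, but when pieces fall inside repeats it can join the wrong neighbors, which is why skipping the map was seen as risky.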

But perhaps their decision
wasn’t such a gamble

because guess whose freshly completed map
was available online for free?

The Human Genome Consortium’s,

in accordance with
the project’s second major principle

which held that all of the project’s data

would be shared publicly
within 24 hours of collection.

So in 1998, scientists around the world

were furiously sequencing
lines of genetic code

using the tried-and-true, yet laborious,
Sanger method.

Finally, after three exhausting years
of continuous sequencing and assembling,

the verdict was in.

In February 2001, both groups
simultaneously published

working drafts of more than 90%
of the human genome,

several years ahead
of the consortium’s schedule.

The race ended in a tie.

The Human Genome Project’s practice
of immediately sharing its data

was an unusual one.

It is more typical for scientists
to closely guard their data

until they are able to analyze it
and publish their conclusions.

Instead, the Human Genome Project
accelerated the pace of research

and created an international
collaboration on an unprecedented scale.

Since then, robust investment in both
the public and private sector

has led to the identification
of many disease-related genes

and remarkable advances
in sequencing technology.

Today, a person’s genome can be sequenced
in just a few days.

However, reading the genome
is only the first step.

We’re still a long way from understanding
what most of our genes do

and how they are controlled.

Those are some of the challenges

for the next generation
of ambitious research initiatives.