Building the Taxonomy of Life

Questions / by Sam Kean

Over the next 10 years, scientists will use the internet to store and share information — videos, descriptions, locations, behaviors — about most organisms on Earth. The Encyclopedia of Life project is drawing this information together and reassembling it into pages about each species. So far, the Encyclopedia of Life has pages for 180,000 species, which, according to David “Paddy” Patterson, is a tenth of the way through the effort.

Patterson speaks with an Irish accent and sometimes works on a comically small laptop with keys much too small for his fingers. But that disparity between organic and electronic is fitting. Patterson heads the Encyclopedia of Life’s Biodiversity Informatics Group (BIG), which has the daunting task of gathering humanity’s collective biological knowledge and making sure experts and amateurs alike can post all that information as easily as they’d update a Facebook page. We spoke to him about parallels between the Encyclopedia of Life and the Oxford English Dictionary, about why unicorns don’t have a place in EoL but creationists might, and about the prospects of failure.

Seed: How will the EoL change science?
David Patterson: Most biology that’s being done is very parochial, in that each scientist works on their own or in a small group. They collect data; they lodge that data in some kind of electronic format from which they derive some kind of understanding; and they report that understanding in a scientific paper. The data itself tends to be inaccessible, and one gets just the paper, the synthesis. The information is in isolated pools. As researchers retire, much of that information is lost, so that vast amounts of information are no longer accessible to us. We want to change that.

Seed: You studied protists as a biologist, and now you’re essentially doing information science.
DP: Although interested in particular organisms, I now would point to the fact that I was a taxonomist, and one role of a taxonomist is to develop a comprehensive understanding of everything ever written about their favorite organisms. They become a walking compendium. Or if not holding the information in their head, if you go into their room, they can reach out their hand and pick the right book off the bookshelf and go to the right page. It’s that skill we’re building into Encyclopedia of Life. In essence, it’s a massive indexing problem.

Seed: What’s the biggest obstacle to getting all species on the site?
DP: [laughs] Oh, God. Content acquisition. We’ve got the infrastructure. From an informatics point of view, there are no longer any impediments. The problem is getting the content in. It’s a social challenge. I foresee that social networking tools are going to be some of the most valuable things we can use.

The Encyclopedia of Life is adopting a strategy different from the one most people had been imagining. The presumption was that you’d need experts and require them to write pages, and somehow you were going to end up with 2 million or so. I was absolutely clear from the start that that wasn’t going to work.

Seed: What about people who are enthusiastic about a species and just decide, “I want to write something about it”? How much of a role will they play?
DP: I honestly would hope in excess of 75 percent, maybe over 90 percent. I just cannot see how it’s going to be feasible any other way. The problem is, you don’t know what you’re going to get, so you have the same concerns as with the Wikipedia environment.

Seed: I imagine it’s easy to find people who can talk about flowering plants or butterflies or other popular species. How are you going to find experts for things like beetles, where there are so many species and a lot fewer people working with them nowadays?
DP: You’d be surprised what’s out there. You really would be surprised. One of the tasks that I did to prepare the growing of the Encyclopedia of Life was to collect names. I set myself the task of searching across the full spectrum of biodiversity for all of the names of all of the genera, and I only encountered two names where there were zero people working.

Seed: What were those areas?
DP: A couple of the smaller beetle families.

Seed: Have you looked at the history of the Oxford English Dictionary, which was another daunting project and an early example of using citizens to get information that experts could never acquire on their own?
DP: I actually have somewhere around here a book about it called The Professor and the Madman. I think that there are a number of things that make the Encyclopedia of Life a tractable project and that explain why this attempt is likely to be successful whereas other attempts have failed, and there have been plenty of other attempts, both on paper and electronically. The capacity to use aggregation technology is one component, so you don’t redo the work of others. Coming up with management of names is critical, because that allows you to integrate the distributed information. But getting community buy-in as contributors, as creators, and as fact checkers is an absolutely essential part. If we fail on that front, I suspect the Encyclopedia of Life would fail.

Seed: What would constitute failure? And are you scared of that?
DP: It is incumbent on me to worry about things like that. I believe we should have a certain proportion of species represented by 10 years after the start of the project, so about 2016, 2017. I’d like to think by that date you can type in any known species and nine times out of 10 get a result.

Seed: You’ve had people ask about some strange animals, haven’t you?
DP: We’ve got logs, so we know when people come in looking for things like unicorns. We’ve had queries for dragons and the like. Fairies was another one. People are clearly just probing the system to see what they’ll get. The unicorn has been used to evaluate some of the other efforts out there. If you type “unicorn” into a biodiversity website and get results, it’s deemed a bad website.

Seed: You were in a bit of controversy in 2007 because you said you wouldn’t mind if a creationist or someone hostile to biology contributed. Have you rethought that?
DP: Only dug my heels in. I don’t care what kind of color, gender, belief structure someone has. If they’ve got nice pictures of organisms, if they’ve got knowledge about organisms, I’d be delighted for them to put it in there. If they write something nonsensical, they’re going to be the ones covered in mud because their reputations will be at fault. In this process maybe some of those nuts are going to be educated, so it’ll be clear why their view is not a credible view.

The biggest problem is that any kind of exclusivity filter will deter people from contributing. You’ve got to be like Flickr, YouTube, Wikipedia: Leave the doors wide open, let people put stuff in there. But you also need to have devices in place to remove stuff that’s offensive or seriously stupid or wrong. Open the doors wide — let it in — then worry about the quality afterward.

 

Originally published March 23, 2009

