Bioinformatics
Scientific Discipline or Support Field?

by Emmanouil Skoufos

(Posted November 27, 1998 · Issue 43)

Abstract

Bioinformatics marries computer science and biology. The young union won't flourish unless we cultivate practitioners well versed in both disciplines, and give them creative work to do.


Bioinformatics is one of the hottest areas in the biomedical marketplace. Every week, the back pages of the major scientific weeklies Science and Nature carry several advertisements for bioinformatics positions at employers ranging from start-up biotechnology companies to Fortune 500 pharmaceutical giants. In the academic world, universities are having a hard time holding on to, and replacing, bioinformatics specialists lured to industry.

With the increase in complexity and power of both computer systems and bench research techniques, human "bridges" who understand both disciplines, and can communicate with scientists in either field, are in great demand. Bioinformatics was called a scientific "wave of the future" in Science feature stories in both 1996 and 1997. An article in The Scientist last year by Thomas W. Durso declared that "As Genomics Grows, Future For Bioinformatics Is Bright," further bolstering the confident forecasts for the field as the various genome sequencing projects sprint toward completion.

All these signs point to a prosperous future, and to streets that seem paved with salary and grant gold for scientists who are, or want to be, bioinformatics specialists. But a more careful examination may reveal some hidden clouds in these sunny skies. A closer look at the advertisements for industrial bioinformatics positions reveals that expert training in the biological sciences, and the demonstrated ability to solve biological problems, are buried among a fairly long array of requirements regarding knowledge of programming languages and methods. Even though it might be difficult to find bioinformatics scientists to fill positions in academic departments that offer bioinformatics degrees, there are very few openings in other departments for scientists who use bioinformatics to answer fundamental biological questions. There are only five programs in North America that offer Ph.D. degrees in bioinformatics or computational biology.

In addition, funding for the training of bioinformatics scientists is limited to training grants in medical informatics programs from the National Library of Medicine, and to Department of Energy fellowships to computer scientists who want to enter the field. Funding mechanisms for stand-alone bioinformatics research projects are not yet in place. For certain projects, such as databases, funding is usually provided only for the development of the project, and not for its maintenance. Furthermore, publication of research using only bioinformatics approaches is very rare in mainstream scientific journals. It seems that on the one hand there is a great demand for (and only a small supply of) those skilled in bioinformatics, compounded by the fact that very few institutions provide formal training in it. On the other hand, there are very few openings and opportunities other than support and collaborative positions for bioinformatics scientists in today's job market.

What is the reason for this discrepancy? One answer might be that the field is still very young and not well defined, even among bioinformatics practitioners themselves who, to add to the complexity, come from diverse training backgrounds such as computer science and medicine. Even the name of the discipline - computational biology, or bioinformatics? - is a matter of debate among those in the field. Historically, the use of computers to answer biological questions, which is a functional definition of bioinformatics, started with the development of algorithms and their application to understanding the interactions of biological processes and the phylogenetic relationships among organisms based on gene sequence information. The exponential increase in the amount of genomic sequence data available, as well as the increase in computer-driven machinery for data acquisition and analysis, expanded the breadth of bioinformatics.

Databases must be constructed to hold the data, and specialists must be employed in computer-based data collection and analysis. Scientists involved in this type of project are now thought of as support personnel assisting the bench scientists, who alone are involved in collecting the information that goes into the databases constructed or analyzed by bioinformatics scientists. This is the kind of thinking that informs the previously mentioned industry hiring practices. This practice is understandable, since most companies have a lot of data on hand, and much more on the way, to be stored and retrieved as fast as possible. The scientists who are best able to construct these databases quickly and well are computer scientists, who have little background or interest in data analysis. That is usually the work of the bench-oriented genomics scientists supported by these databases.

What differentiates a scientific discipline from a support field is that the former involves hypothesis-driven research, while the latter supports such research. Unlike many support fields, bioinformatics has involved hypothesis-driven research since its inception. Theories of molecular evolution have been examined using post-sequencing genomics. Theories on molecular interactions, and on complex processes such as nervous excitation, have been examined using molecular modeling. The genomics and modeling areas of bioinformatics are starting to be viewed as a scientific discipline, as evidenced by the increased publication of stand-alone papers on these subjects.

How could bioinformatics research in areas such as database structure be treated as a scientific discipline? A database is conceived to aid data collection and analysis by bench scientists, and so is a construction for support and collaboration. But there is much room for hypothesis-driven research in the database field. One can point to an analogy with the plasmid construction area of molecular biology: databases that are just storehouses of data are only as useful as plasmids that store a particular gene but lack any additional functionality that would allow investigators to get information about the activity or the structure of the gene. Just as plasmids are usually constructed for use in particular experiments to answer particular questions, so databases can be constructed so that particular biological questions can be answered by data mining.

The challenge in database construction is to establish an architecture that allows for intelligent searching, communication with other databases, and the coupling of specific analytical tools to solve specific biological problems. Scientists who can construct these databases must have the background to determine which particular scientific problems need solving, and which methods best solve them. Scientists not versed in basic biological research cannot meet both of these requirements.
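As a rough illustration of what "coupling of specific analytical tools" to a database could look like, here is a minimal Python sketch using the standard sqlite3 module. Everything in it is an assumption made for illustration: the table layout, the placeholder organism and gene names, the toy sequences, and the choice of GC content as the analysis are not taken from any real database or from the projects discussed here.

```python
import sqlite3


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def build_database() -> sqlite3.Connection:
    """Create a toy in-memory sequence database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sequences (organism TEXT, gene TEXT, dna TEXT)")
    conn.executemany(
        "INSERT INTO sequences VALUES (?, ?, ?)",
        [
            ("organism_a", "gene_x", "ATGCGCGC"),
            ("organism_b", "gene_x", "ATATATGC"),
            ("organism_c", "gene_y", "GGGGCCCC"),
        ],
    )
    # Couple the analytical tool to the database: after this call the
    # Python function can be used directly inside SQL queries.
    conn.create_function("gc_content", 1, gc_content)
    return conn


def gc_rich(conn: sqlite3.Connection, gene: str, threshold: float):
    """Return (organism, GC fraction) for copies of a gene at or above a GC threshold."""
    rows = conn.execute(
        "SELECT organism, gc_content(dna) FROM sequences "
        "WHERE gene = ? AND gc_content(dna) >= ? ORDER BY organism",
        (gene, threshold),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_database()
    print(gc_rich(db, "gene_x", 0.5))
```

The point of the sketch is the design choice rather than the analysis itself: the question ("which organisms carry a GC-rich copy of this gene?") is posed to the database directly, instead of exporting raw records for someone else to analyze. Deciding which analyses are worth building in is where the biological training argued for above comes in.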

In terms of their acceptance as scientific disciplines, there are many parallels between the early days of molecular biology and those of bioinformatics. Only time will tell whether the latter will follow the course of the former. Watch for the important signs in the back of your favorite scientific weekly, not in a special section dedicated to the field, but in the classifieds. Those signs are positions for bioinformatics researchers in basic science academic departments, grants and fellowships specifically for bioinformatics training and research, new bioinformatics departments, and industry positions requiring expertise in both biology and informatics.

Emmanouil Skoufos is a postdoctoral fellow at the Center for Medical Informatics at Yale University School of Medicine.
Andrzej Krauze is an illustrator, poster maker, cartoonist, and painter who illustrates regularly for HMS Beagle, The Guardian, The Sunday Telegraph, Bookseller, and New Statesman.


Endlinks

Science: Current Positions Advertised - weekly job listings.

Genome Monitoring Table - with up-to-date statistics on the status of the different genome projects.

Genome Sequencing Projects - linked list of all genome sequencing projects.

A Curriculum for Bioinformatics: The Time is Ripe - editorial by Russ Altman of Stanford University. Requires Adobe Reader.

Bioinformatics in a Post-Genomics Age - by Diane Gershon, from the September 25, 1997 issue of Nature. Free registration is required for access.

University Bioinformatics Programs - list of training programs. Maintained by Indiana University.

Computational Biology and the Cross-Disciplinary Challenge: Finding a Home in Academia - paper from the Computer Science and Telecommunications Board National Research Council symposium of May 16, 1996. By E.H. Shortliffe. Requires Adobe Reader.


