April 2001, Version 1.1
Yutaka Akiyama, Director
This document describes the operating policy of the Computational
Biology Research Center (CBRC).
Please note that the following deals mostly with operating policy
and that other materials should be referenced for details on research
topics and research projects of the Center. Note also that target
readership of this document includes internal members of the Center.
I would therefore like to devote most of the latter half to problems
affecting individual researchers. In the first section, however,
I deal mostly with matters concerning the Center itself, such as
its purpose, the state of bioinformatics research both inside and
outside Japan, and Center strategies in response to overseas activities.
1. Purpose of the Computational Biology Research Center
Bioinformatics is a comprehensive science that examines a wide
range of biological phenomena on the basis of information theory.
These phenomena range from the structure and regulatory mechanism
of the genome sequence to the 3D structure and function of protein
molecules, the product of this sequence, and the mutual relationship
between molecular structure and function within cells and individuals.
Up until now, there has been no large-scale research base in Japan
dealing solely with bioinformatics, and the field of bioinformatics
has generally been treated as simply a part of experimental projects
like the Genome Project for performing computer analysis of obtained
data. Outside of Japan, however, bioinformatics research bases independent
of so-called wet molecular biology experiments have come to be established.
These include the National Center for Biotechnology Information
(NCBI) under the National Institutes of Health (NIH) established
in 1988 in the United States, and the European Bioinformatics Institute
(EBI) established independent of the European Molecular Biology
Laboratory (EMBL) in the European Union in 1992. Each of these research
laboratories has been recruiting high-level talent and has come
to build up an interdisciplinary staff from fields such as biology,
physics, information science, and mathematics. At these laboratories,
researchers are given the freedom to devote themselves exclusively
to computer-based research; they perform no wet experiments. In Europe
and the United States, it is now fully recognized that an independent
and concentrated approach to bioinformatics is essential considering
its broad scope and the need for developing advanced algorithms.
At present, the NCBI operates with a staff of 300 with plans to
expand to a 500-researcher system in the near future, and the EBI
currently has a staff of slightly more than 100 researchers.
While the years before 1995 gave birth to many superb bioinformatics
researchers in Japan, no base was established to bring these people
together, and they instead became scattered among experimental projects,
educational institutions, and companies. Of course, there are many
benefits to conducting on-site informatics research in experimental
projects where direct feedback from experimental researchers can
be obtained. On the other hand, this approach by itself does not
promote the development of novel technologies or the cultivation
of a dynamic research community. At experimental sites where competition
is fierce, development work takes on a shortsighted approach and
the birth of general-purpose technologies is not the rule. The employment
framework is unfortunately based on satisfying that which is minimally
required for processing the data on hand and this makes it difficult
to achieve any kind of synergetic effect.
As a base for bioinformatics researchers in Japan, the Computational
Biology Research Center aims to uncover new bioinformatics methodologies
on the basis of close international cooperation with NCBI in the
United States, EBI in Europe, and other institutions. In common
with the NCBI and EBI, the Center is a research facility specializing
in bioinformatics. Other than this, however, there are a number
of differences that arise, for the most part, from the fact that
the Center was established about ten years later than its counterparts
in Europe and the United States. The following explains these differences
and other features that illustrate how the Center is not simply
a copy of other institutions like the NCBI and EBI.
First, many biology-related bases already exist in Japan under
the umbrella of various ministries and government offices, and many
of these bases provide public database services. For example, organizations
like The Center for Information Biology and DNA Data Bank of Japan
(DDBJ) of the National Institute of Genetics, the Human Genome Center
(HGC) of the Institute of Medical Science of the University of Tokyo,
and the Kyoto University Institute for Chemical Research have been
energetically providing molecular-biological database services for
some time over the Internet. Under these circumstances, it would
be a waste of national resources for the Center to provide similar
services, and for this reason, we have decided to avoid committing
research resources to providing mirroring services for public databases,
simple database integration, etc. We will instead concentrate on
the development of new computational techniques and the management
of a public database that stores only new analysis results from
the use of those techniques. This, however, is a choice that involves
extremely high risk since most molecular biological scientists are
more appreciative of databases than computational techniques. It
is widely acknowledged that the high international evaluation that
institutions like the NIH and EBI have received is basically because
of the praise given to their database services. It is true, of course,
that experimental data is essential to understanding biological
phenomena; we cannot deny for a moment that study of "content" by
its very nature will always attract more attention than the study
of "framework." Nevertheless, it is still our desire to devote our
energies to the research of bioinformatics algorithms, an area that
has been lacking depth in Japan, and to make a contribution to technical
systems and the study of framework. The reason for our motivation
here is that now, with analysis of the human genome nearing completion
and concern shifting to the complex mutual relationships and metabolic
processes of intra-cell genes and their products, there is a worldwide
need for completely new methodologies of information analysis. In
the post-genome era, it will not be possible to take the initiative
in technology by simply refining technologies like homology search
and assembly (splicing together of DNA fragments) as has been done
up to now.
The second difference is that respect and consideration must be
given to not only the database services provided by the research
laboratories described above but also to the superb research in
bioinformatics that has been performed at these locations (despite
their relatively small staffs). In this regard, the Center, despite
its founding as Japan's first genuine bioinformatics research base
of significant scale, will never be the country's only public base
in this area. It would not be easy to gather for the short term
all outstanding researchers working at small bases scattered throughout
the country, especially on a scale similar to that of the NCBI or
EBI. That is to say, the idea of making an all-out effort to suddenly
concentrate research personnel that have come to be scattered over
these last ten years would be a mistake or, at the least, unrealistic.
In Japan, where the pool of researchers is relatively thin, we need
a more realistic and effective method of bringing research power
together. Specifically, we must work on achieving a good balance
in forming alliances with existing bioinformatics research organizations,
moving research leaders at a steady pace based on intermediate- and
long-term strategies, and providing large-scale education and training
for new people in the field.
Against this background, the features and identity of the Computational
Biology Research Center established within the National Institute
of Advanced Industrial Science and Technology (AIST) become important
matters of concern. One key feature of the Center will be its large
group of researchers (about 50 researchers are planned for the initial
fiscal year) made up, for example, of full-time staff, part-time
staff, fellows of the New Energy and Industrial Technology Development
Organization (NEDO), and outside collaborating researchers. We plan
to make best use of the benefits associated with a large number
of researchers gathered in one place to discuss bioinformatics,
and to give importance to research that goes beyond the traditional
scope of bioinformatics through the convergence of various fields.
Here, the introduction of new informatics theories and proposal
of new information-analysis techniques (in which design work shifts
from measurement equipment to analysis techniques) will be encouraged.
In appointing staff, our idea is to establish a 3:2:5 ratio among
full-time, part-time, and outside researchers. (Although part-time
service should be increased, the current term of service is two
years, which makes it difficult to increase the absolute number
of part-time researchers.) Obtaining enough outside researchers
from companies, universities, etc. to make up about half of the
Center's staff will be achieved through an open organization. In
addition, great importance will be attached to having each and every
researcher become a next-generation leader. At the Center, it is
our desire to play the role of an incubation site where researchers
can take the first step in becoming a leader in bioinformatics before
returning to their university or company positions. When hiring
staff, moreover, we would like to defy past common sense and target
people that have as broad a background as possible. This, however,
must be pursued with care giving full consideration to the tradeoff
involved in achieving our short-term mission as a research center
within the Center's prescribed time limit of seven years. With full
awareness of the functions and roles described above, the Center
will come to share the burden of bioinformatics research with existing
research groups in universities and elsewhere in Japan while becoming
a center for interaction in Japan's bioinformatics research community.
In this way, the Center will gradually grow in significance.
The third difference relates to the fact that while the NCBI and
the EBI are completely independent of experimental projects in terms
of operating policy and budget, they are nevertheless quite close
to those projects in a physical sense. Specifically, the NCBI is
located on the NIH campus and the EBI is situated adjacent to the
Sanger Center. The Center, by contrast, has no experimental project
close at hand. To make up for this lack of proximity to experimental
projects, the Center plans to collaborate with life-science units
within the AIST beginning with the Biological Information Research
Center located at a newly developed seaside city center, and to
engage in joint research with domestic and international public
experimental projects and with pharmaceutical and chemical companies.
Much research time will have to be spent meeting with experimental
researchers at experimental sites. For this reason, the ideal format
for the Center is a mosaic-like existence in which some researchers
are deep in study at the Center while others are up and around returning
only infrequently.
The fourth difference is that the Center was not founded against
the same background as that of NCBI and EBI. Now, with sequencing
of the human genome completed, we are at the dawn of a new age in
which efforts can be devoted exclusively to post-genome analysis
technologies. These last ten years, moreover, have seen dramatic
advances in computer-related technologies. As a result, this difference
between the start time of the Center and that of existing institutions
like the NCBI and EBI means that the Center can place more emphasis
on specific research themes and commit resources in a project-by-project
manner. While research laboratories like the NCBI and EBI have been
carrying out a public mission that creates a particular research
culture, the Center is more likely to become project oriented in
nature. These laboratories, moreover, built up a large-scale computing
environment in a gradual fashion as data-processing needs escalated.
In contrast, the Center intends to focus intensely on the issue
of computing from the very beginning and to pay particular attention
to achieving efficient operation and expansion of computing resources.
The Magi PC Cluster (1024 CPUs) already acquired by the Center is
one example of this policy. Overall, the Center plans to employ
high-speed computer power of one or two orders of magnitude greater
than that of existing research groups and to significantly accelerate
the speed of research. Of course, computing speeds continue to increase
year by year as a matter of course, but making a concerted effort
at all times to use new and powerful computing facilities effectively
can reduce development time. It can also make up for an insufficient
number of research personnel by simplifying the trial and error
process. To achieve these goals, it is essential that researchers
excelling in parallel-processing technology be aggressively recruited,
that collaboration be pursued with the informatics community, and
that close links be formed with the Tsukuba Advanced Computing Center
(TACC) and computer-science research departments within the AIST.
If we were not to pursue cutting-edge research techniques in this
way, it would be all the more difficult for the Center to catch
up in this new age of bioinformatics.
At this point in time, bioinformatics has two distinctive aspects.
One is its use as a tool for analyzing genome information, and in
particular, as a quick but useful tool to take the lead in research
as competition in development work and patent acquisition continues
to intensify. The other aspect is its use as a new technical system
for achieving a deep understanding of the mechanisms behind biological
phenomena from the molecular level. There is little doubt that using
models constructed within a computer to perform computational experiments
will be a major field in molecular biology by the middle of the
21st century. By reducing the number of wet experiments
that have to be performed, we can expect this technology to reduce
the cost and time of R&D in the biotechnology industry and to
contribute to society from the standpoint of ethics and safety.
While therefore fulfilling our mission of providing tools associated
with the first aspect above to industry, the Center must also engage
in research activities that keep an eye on the advent of future
technologies in conjunction with the second aspect above. Generally
speaking, a research center is designed to devote all of its energies
to short-term and intermediate-term R&D. In the field of bioinformatics,
however, it sometimes happens that research intended for future
technologies comes to be used on site the following year contrary
to all expectations. The truth is that it has become increasingly
difficult to judge whether certain research is short or long term.
It should be mentioned here that the words "computational biology"
in the English name of the Center (Computational Biology Research
Center) are not a direct translation of the corresponding expression
in the Japanese name (literally, "biology information science").
It is our goal to resolve the difference in nuance between these two expressions
and to bring about an era in which both are used with the same meaning.
2. Review of Individual Researchers
From this section on, I will discuss specific issues associated
with researchers themselves.
The most basic criterion for evaluating the achievements of a researcher
is the content and quality of published papers. In bioinformatics,
however, which is an interdisciplinary field with a relatively short
history, review on the basis of published papers must be performed
with care. At present, journals that specialize in bioinformatics
are few, and journals that are optimized for accepting contributions
are either divided by topic (such as genomics, protein-structure
analysis, or metabolic networks) or by information-processing technique,
and thus vary greatly. Numerical evaluation such as by impact factor,
moreover, is not commonly used in the field of bioinformatics; it
is important that each paper be accurately judged on its effects
on the community as opposed to examining the overall reputation
of the journal that published the paper in question. For this reason,
the Center does not mechanically evaluate the number of published
papers and instead emphasizes opportunities given to the researcher
to explain his or her achievements.
At the same time, a researcher under review who has achieved superb
results in a) patent acquisition, b) creation and release of software,
or c) education of industry, academia, or government will be given
more credit for these achievements than for published papers. In
fact, it is these kinds of results that I would like to encourage.
It is not easy, though, to establish incentives in this regard for
individual researchers who are only passing through on their way to
other positions. (At present, about 80% of researchers at the Center are
employed on a term basis and increasing one's number of papers is
considered to be the best choice for the individual that is considering
reemployment elsewhere. Likewise, patents will never be greatly
attractive to individual researchers due to the rule that patent
rights are transferred to AIST.) Nevertheless, there are still researchers
that have achieved remarkable results in patents, software, and
education while sacrificing some of their writing of journal papers.
The Center gives such researchers as much credit as possible for
these achievements.
The above, however, are not the only criteria for reviewing an
individual researcher for his or her scientific accomplishments. We also
intend to give as much credit as we are allowed to actions that
enhance the name of the Center (such as outside commendations, writing
activities, demonstrations, and receiving visitors) and activities
related to mutual education and volunteer work within the Center.
The object of an individual review, moreover, will generally not
be the effort involved but rather actual results. That is, individual
researchers will be encouraged to come up with the most efficient
method within the scope of Center rules without having to worry
about superficial matters like work attitude and work format (business
trips and outside duty are common).
Clarifying the method of individual review is of course important.
At the same time, though, it can be said that most researchers with
a desire to work at the Center are aiming for a step up in their
career as a researcher as opposed to ensuring long-term employment
with AIST. For this reason, it can be surmised that there is much
more concern with freedom in selecting research themes and research-budget
allocation than tiny increases or decreases in merit pay based on
individual review.
3. Freedom in Selection of Research Themes
At AIST, each research department conducts fundamental and germinating
research and each research center steers its research in the direction
defined by its mission. The Computational Biology Research Center
is no exception. Furthermore, considering that the technical development
of bioinformatics is a critical theme of national importance in
Japan, it is imperative that the allocation of research resources
be performed with utmost care. For details in regard to what themes
are receiving attention, I refer the reader to the "Priority Themes"
section found in descriptions of research projects and elsewhere.
On the other hand, research of the type that refines conventional
technology and germinating research of the trial-and-error type
exist adjacent to each other in a complicated way within the narrow
field of bioinformatics. In other words, bioinformatics research
has a "fractal" structure, and it is clear that germinating trial-and-error
research occupies a definite if small percentage of research. At
research centers, it is impossible to completely prohibit germinating
research.
A point to consider here is the degree of margin that should be
allowed in research investment taking into account the seven-year
research limit of the Center and its relatively small staff of about
50 people (about 40 people at its founding). On one hand, one needs
to be aware that departing from a priority theme established by
the Center can rapidly diminish the probability of being allowed
to research that departing theme (compared with research departments).
On the other hand, the Center also recognizes the value of continuing
research that clearly results in a world-class breakthrough.
To give a concrete example, research themes at the Center in fiscal
year 2001 have been concentrated in the area surrounding molecular
biology and cellular biology. As a consequence, simulation at the
level of an individual composed of multiple organs or research into
the properties of an ecological group made up of interacting individuals,
for example, would in most cases be disallowed despite their affinity
with bioinformatics. These research themes, though attractive, are
simply too distant from the priority themes established by the Center.
Yet, if a certain unconventional idea should lead to research that
could be published in an extremely renowned publication, an exception
may very well be made to allow it to continue.
Bioinformatics research is a fast-flowing field that undergoes
great expansion every time a new experimental technique appears.
Sufficient consideration is therefore given to the fact that previously
established priority themes can become outdated after several years.
From this point of view, the generation of many good results even
if somewhat removed from priority themes should be encouraged rather
than frowned upon, and the contents of established priority themes
should be put up for discussion and reevaluation once a year.
Each team leader is responsible for making primary decisions on
theme selection for individual researchers. The team leader must
evaluate each case separately based on the basic policy described
above.
4. Research-budget Allocation
In allocating a budget to each team, a budget proposal is prepared
in the previous fiscal year based on budget requirements, and a
revised amount is then allotted at the beginning of the fiscal year
in question. The Center, however, also attaches importance to "common
expenses" and "reserved expenses" in addition to the amount allotted
to each team.
To give some background here, one policy of the Center is to use
a large-scale computational environment as described earlier. Such
an environment, however, cannot be set up for a single team. To
achieve effective use of a large-scale cutting-edge computational
environment, it is important to set up hardware, software, and databases
on a shared computer even if this slightly reduces the allotment
to each team, and to promote the sharing of these resources to raise
the utilization level of the system. Providing a common platform
like this, while appearing at first to be a roundabout approach,
is extremely significant in terms of achieving robust technologies,
diverting technologies to other applications, and developing common
specifications that cover multiple themes.
In the field of bioinformatics, moreover, research flow is exceedingly
fast, and for this reason, the Center enables additional budget
to be committed if an idea proposed after the beginning of a fiscal
year has been judged by the director to have merit. To this end,
"reserved expenses" are established at the beginning of a fiscal
year and allotted to appropriate teams in the latter half of the
year. These expenses may also be used to cultivate "germinating
research allowed within the Center" as described in the previous
section.
Although not realizable in the initial fiscal year due to pressures
related to startup expenses, our basic plan is to assign initial
expenses, common expenses, and reserved expenses to each team in
a ratio of 6:2:2.
For the most part, each team leader is responsible for allotment
of research budget and expenditures within his or her team. In some
cases, however, like startup expenses for a new staff member or
running research expenses for individual researchers, a budget may
be allotted after the director has approved the expenditure in question.
Also, in regard to budgets that researchers bring with them from
the outside, the Center takes no overhead to the extent possible.
This is an absolutely necessary measure for two main reasons. First,
AIST headquarters must already pay for much overhead, and second,
researchers must be provided with an incentive to apply for outside
funds.
For example, even if the amount of money is small, we are giving
importance to research for which external funds can be obtained
from private corporations or public projects from the viewpoint
of future expansion. Because there are many cases in which overhead
is large in proportion to research expenses, the Center provides
administrative support as much as possible.
5. Patent Acquisition
The Computational Biology Research Center strongly encourages the
acquisition of patents. To complete this policy statement, this
section discusses patent-related matters.
The Center divides acquirable patents into two major types. The
first type concerns inventions of new computational algorithms or
computational systems in bioinformatics. The second type concerns
individual discoveries related to genes or proteins obtained through
the use of these computational algorithms and systems. In either
case, the appropriateness of acquiring patents of these types must
be thoroughly discussed, as the former type is considered by some
to be a kind of mathematical or algorithm patent, and the latter
type originates directly from genome information, an asset shared
by the human race. The Center recognizes that some researchers are
deeply opposed to patent acquisition.
I would nevertheless like to encourage the acquisition of patents
at the Center for the reasons described below. First of all, there
are indeed many problems intrinsic to the acquisition of patents
from genome information, a shared asset of the human race. And,
if I may speak freely, it is for this very reason that semi-public
organizations like AIST should try to acquire patents as much as
possible before commercial enterprises like venture firms. In contrast,
there is the method of releasing research results on the Internet
for everyone to see so as to suppress the acquisition of patents
by other parties. This approach, however, is not that effective,
as specific private sectors are eventually given permission to acquire
patents. This is not to say that the acquisition of patents by the
private sector is necessarily wrong. I am saying, rather, that making
an aggressive effort to secure patents should not be an issue (and
is hardly a matter of choice) when compared with having one's own
inventions or discoveries passed on to another party right in front
of one's eyes. There is also the opinion that because research at
the Center is funded by taxes, the favor should be returned to the
people or industry in the form of patent acquisition. It is our
aim, however, to be a group that proves its worth by making important
contributions by ways other than patents (such as by research-leader
incubation, education, guidance, and scientific presence). In short,
the Center must be capable of choosing at will either patent acquisition
for the sake of revenue or defensive patent acquisition for humane
reasons.
The patent rights of an invention originating from an AIST researcher
are transferred to AIST. To implement the patent, the Center searches
for an implementing company through the services of a technology
licensing office (TLO). Since the Center represents the inventor's
side, it is imperative that extreme pressure be applied to ensure
that such bioinformatics patents are implemented appropriately.
This is the mission that the inventor must fulfill with respect
to society, and it is this that I would like AIST headquarters to
fully understand. While a patent of the first type above like a
computational system can be implemented on a broad scale, there
are many examples of gene-related patents of the second type that
are implemented through an exclusive license (since there are many
companies that demand safety guaranties for their investment, resulting
in a long path time-wise before drug discovery). The root of this
problem is very deep.
The same line of thinking for patents as described above can also
be applied to software created in the course of research and to
databases obtained as a result of calculations performed with that
software. In other words, we consider the mission of the Center
to be the dissemination of its results to the public at large both
domestically and internationally, and we assume that the Center
should be able to actively manage the implementation of those results.
It is common for the deputy director of a research center to be
placed in charge of patents and intellectual property. At the Computational
Biology Research Center, however, these are placed under the charge of
the director himself as a reflection of the importance attached
to them. As described above in the section on individual reviews,
there are no strong incentives under current rules for researchers
to apply for patents as long as they must abandon writing papers.
For this reason, the Center endeavors to construct a system in conjunction
with patent lawyers and TLO to support researcher acquisition of
patents by, for example, attaching importance to patents in individual
reviews.