
Annotation of the human genome sequence and of the genomes of other species has revealed rich sets of information encoded in nucleic acid sequences. Internet software technology allows researchers to retrieve data of interest, which they then integrate on their desktop computers for biological reasoning. These computational tasks include tedious file format conversion by text processing. Even with established tools and data formats such as Perl and GFF, such tasks sometimes run into methodological limitations when a new kind of data appears. We consider that a data processing language should facilitate these computing tasks and support the construction of analysis systems for various research activities. Among the currently available language systems, we adopted the Lua programming language, which fulfills our requirements for the computational tasks of sequence map layout: handling of data containers, symbolic reference to data, and a simple programming syntax.
Lua provides an interpreter that allows fine-grained programming of data processing within a procedural syntax similar to Pascal. In addition, its data description facilities allow hierarchical data management, an approach also illustrated by the use of ASN.1 for processing annotation data at NCBI. Lua supports both data description and programming in a single language context, with garbage collection. Our sequence map visualization program GUPPY (Genetic Understanding Perspective Preview sYstem) was successfully implemented with an embedded Lua interpreter for processing annotation data and layout scripts. We developed GUPPY applications that process data from GenBank, GoldenPath, and FlyBase, as well as results from BLAST or Clustal-W. When a foreign file is imported, the original data is first decomposed into Lua data structures that preserve the imported data schema. Portions of the annotations are then selected and arranged into the target format through flexible data schema conversion. Our method resembles the Document Object Model technique in that it exploits the large main memory of modern computers. Further practical applications to expression data from microarray experiments are in progress.
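
The import, select, and arrange steps can be sketched in a few lines of Lua. This is only an illustration under stated assumptions, not GUPPY's actual code or schema: the record layout, the field names (`accession`, `features`, `qualifiers`), the accession `EXAMPLE01`, and the helper functions `select_features` and `to_gff` are all hypothetical, and the output is a simplified nine-column, GFF-style line rather than a validated GFF file.

```lua
-- Hypothetical annotation record written as a Lua table constructor.
-- The nesting loosely mirrors a feature table from an imported file;
-- every field name here is an assumption made for illustration.
local entry = {
  accession = "EXAMPLE01",
  features = {
    { key = "gene", from = 100,  to = 900,  strand = "+",
      qualifiers = { gene = "abcA" } },
    { key = "CDS",  from = 150,  to = 850,  strand = "+",
      qualifiers = { gene = "abcA", product = "hypothetical protein" } },
    { key = "gene", from = 1200, to = 2100, strand = "-",
      qualifiers = { gene = "abcB" } },
  },
}

-- Select the features whose key matches, preserving their order.
local function select_features(e, key)
  local out = {}
  for _, f in ipairs(e.features) do
    if f.key == key then
      out[#out + 1] = f
    end
  end
  return out
end

-- Arrange one feature into a tab-separated, nine-column GFF-style line.
-- Score and frame are left as "." because the record does not carry them.
local function to_gff(seqid, f)
  return table.concat({
    seqid, "example", f.key, f.from, f.to, ".", f.strand, ".",
    "gene " .. f.qualifiers.gene,
  }, "\t")
end

-- Schema conversion: hierarchical record in, flat lines out.
local gff_lines = {}
for _, f in ipairs(select_features(entry, "gene")) do
  gff_lines[#gff_lines + 1] = to_gff(entry.accession, f)
end

print(table.concat(gff_lines, "\n"))
```

Because the record is itself valid Lua, the same interpreter can load the data and run the conversion, with no separate parser for an intermediate format. The whole record is held in memory, which matches the Document Object Model comparison above.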
