DNA SEQUENCE ANALYSIS

 Bioinformatics ✶ Python, FinchTV

July 2021

I. Introduction

In 11th grade, I had the unique opportunity to conduct bioinformatics research at my high school, the Academy for Information Technology. I conducted the research primarily on my own, guided by my Bioinformatics teacher, Dr. Andrew Colasanti, and a Rutgers professor, Dr. Janet Mead. The research involved DNA sequence analysis of a species of duckweed, Landoltia punctata, and the resulting sequence was published by the National Center for Biotechnology Information (NCBI).

II. Abstract

Duckweed, a freshwater aquatic plant, is of particular interest to the scientific community because of its use in bioremediation and its potential use as a biofuel. An mRNA population of the duckweed plant Landoltia punctata was used to explore the research question: Which genes are expressed in this organism, and how do they compare with expressed genes (i.e., proteins) from other species? To answer this, DNA sequence analysis was performed on one cDNA clone, and comparisons were made between this gene and those found in other species.

Plasmid DNA was purified from bacteria that had been transformed with a Landoltia punctata cDNA library (a collection of mRNAs copied into DNAs). The specific gene clone researched in this project, named 20JM571.20, was analyzed using modern molecular biology laboratory techniques, including polymerase chain reaction (PCR) and agarose gel electrophoresis. The sequence of the cDNA was trimmed and edited using the bioinformatics software FinchTV, and was found to be 556 base pairs long.
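The trimming step described above was done interactively in FinchTV. As a rough illustration of what end-trimming does, the sketch below removes low-quality base calls from both ends of a read. The function name `trim_by_quality`, the quality cutoff of 20, and the toy read are all assumptions made for this example; they are not the actual settings or data used for 20JM571.20.

```python
def trim_by_quality(bases, quals, min_q=20):
    """Drop base calls from each end of a read until a call with
    quality >= min_q is reached. Hypothetical helper for illustration."""
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")

    # Walk inward from the 5' end past low-quality calls.
    start = 0
    while start < len(quals) and quals[start] < min_q:
        start += 1

    # Walk inward from the 3' end past low-quality calls.
    end = len(quals)
    while end > start and quals[end - 1] < min_q:
        end -= 1

    return bases[start:end]


# Toy read (made up): ambiguous, low-quality calls at both ends.
toy_bases = "NNACGTTGCAAGNN"
toy_quals = [5, 8, 30, 35, 40, 38, 36, 40, 39, 37, 33, 31, 9, 4]

trimmed = trim_by_quality(toy_bases, toy_quals)
print(trimmed, len(trimmed))
```

Only the ends are trimmed here; low-quality calls in the middle of a read would still need to be checked by eye against the chromatogram, which is the kind of manual editing FinchTV is used for.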

The edited sequence and its corresponding protein were then compared with those from other organisms using BLAST (Basic Local Alignment Search Tool). Several organisms with similar sequences and proteins were found, including Spirodela intermedia, Colocasia esculenta, and Cinnamomum kanehirae. Matching proteins were found exclusively in plants. These matches helped infer 20JM571.20’s function within Landoltia punctata.
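To make "corresponding protein" and "similar" concrete, here is a minimal Python sketch that translates a cDNA in the three forward reading frames, picks the longest open reading frame, and computes percent identity between two equal-length sequences. It is not BLAST, which performs a heuristic local alignment against a database; the helper names (`translate`, `longest_orf`, `percent_identity`) and the toy sequences are assumptions made for this example, not the actual 20JM571.20 data.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate one forward reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_orf(dna):
    """Longest stretch starting at M and containing no stop codon,
    across the three forward frames. A trailing stop is not required,
    since a cDNA clone may be partial."""
    best = ""
    for frame in range(3):
        for segment in translate(dna, frame).split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


def percent_identity(a, b):
    """Percent of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Toy cDNA (made up): the ORF sits in the third forward frame.
toy_cdna = "CCATGGCTAAAGGTTGATT"
toy_protein = longest_orf(toy_cdna)
print(toy_protein)
print(percent_identity(toy_protein, "MARG"))
```

A real comparison would also consider the reverse-complement strand and would let an alignment tool insert gaps, both of which this sketch leaves out for brevity.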

Through further research and analysis, 20JM571.20 was found to encode a transcription activator similar to the FAR1-related sequence 5-like protein, which has crucial functions in plant adaptation and evolution. When expressed as a protein, 20JM571.20 responds to light and in turn regulates cellular processes by activating or repressing certain genes. The protein is most highly localized in the nucleus and cytosol, and 20JM571.20 itself is most highly expressed in the dry seed and shoot apex.

III. My Role

My individual contributions are as follows:

IV. Publication

The sequence and its function have been published in GenBank, the international repository of all known DNA sequences, which is part of the National Center for Biotechnology Information (NCBI). Please note that my sequence is marked as “Journal: Unpublished” on the GenBank website because it is not a journal publication; however, the sequence is officially available on GenBank for scientists around the world to use. The official citation is below.


Landoltia punctata clone 20JM571.20 protein FAR1-related 5-like protein-like, mRNA sequence. Monga, N., Colasanti, A. and Mead, J. National Center for Biotechnology Information. July 9, 2021. Accession #JZ984995.1 https://www.ncbi.nlm.nih.gov/nuccore/2063194577