Implementation of basic RNA structure prediction algorithms


Projekt-Type Master Team-Project
Project Name RNA-Google: Implementing a prototype for fast search in large RNA repositories
Duration 07.05.2010 - 08.10.2010
Short Description

In this project i implement two bioinformation algorithms for the RNA secondary structure prediction. One is Nussinov Algorithm and another one is Zuker Algorithm. The basic idea of Nussinov Algorithm is try to calculate the maximum based RNA pairs(A-U, C-G, U-G) within a given RNA string, and then use the maximum based pairs to traceback, and get the corresponding best RNA secondary structure. The basic idea of Zuker Algorithm is to calculate the minimum Gibbs Free Energy within a given RNA string, and use the minimum energy to traceback, and get the corresponding best RNA secondary structure.

Supervisors: Prof. Dr. Hannah Bast, Jens Hoffmann
Participant: Li Zhang

Running and Testing

Source Code Download NussinovAlgorithm, ZukerAlgorithm
SVN Checkout

You can get the current source code by doing an svn check out action at '' with the username 'zhangl', password 'Zl1985.' There you will find two folders, one is NussinovAlgorithm, another one is ZukerAlgorithm. Inside them there are .h file, .cpp file, makefile, cpplint.

Running And Usage You could run the code by doing the command 'make', then the code will make test, make lint, make perf respectively. If you want to see the time complexity of the algorithm, first you do the 'make' command, then the souce code will generate strings with length from 4 to 100. And then you can use the three commands below to see the plot of the algorithm: 'gnuplot', 'set log xy', 'plot "./ZukerAlgorithmPerf.output.txt" using 1:2 with linespoint'. If you want to clean all the generated files in the last make, then you should type 'make clean'.If you want to use the code to calculate the optimal best structures, you should do like this: ./NussinovAlgorithmMain [Input string], where the string consists of 'a', 'u', 'c', 'g' or ./ZukerAlgorithmMain [Input string], where the string consists of 'a', 'u', 'c', 'g'.
Documentation You could get the documented source code here: Documentation Files