Secondary structure prediction

The Workbenches use a minimum free energy (MFE) approach to predict RNA secondary structure. Here, the stability of a given secondary structure is defined by the amount of free energy consumed (or released) by its formation. The more negative the free energy of a structure, the more likely it is to form, since more stored energy is released by its formation.

Free energy contributions are considered additive, so the total free energy of a secondary structure can be calculated by adding the free energies of its individual structural elements. Hence, the task of the prediction algorithm is to find the secondary structure with the minimum free energy. The algorithm takes empirical energy parameters as input. These parameters summarize the free energy contributions associated with a large number of structural elements.
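The additivity assumption can be illustrated with a few lines of Python. This is only a sketch: the element names and energy values below are hypothetical and are not taken from the empirical parameter tables used by the Workbenches.

```python
# Hypothetical free energy increments in kcal/mol. These values are
# illustrative assumptions, not real thermodynamic parameters.
elements = [
    ("stacked pair", -2.1),
    ("stacked pair", -3.3),
    ("stacked pair", -2.4),
    ("hairpin loop", 4.5),
]


def total_free_energy(elements):
    """Sum the free energy contributions of the individual structural elements."""
    return sum(energy for _name, energy in elements)


total = total_free_energy(elements)
print(f"Total free energy: {total:.1f} kcal/mol")
```

In this toy example the stabilizing stacked pairs outweigh the destabilizing hairpin loop, so the total is negative.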

In the Workbenches, structures are predicted by a modified version of Professor Michael Zuker's well-known algorithm, which is the algorithm behind a number of RNA-folding packages, including Mfold. Our algorithm is a dynamic programming algorithm for free energy minimization that includes free energy increments for coaxial stacking of stems when they are either adjacent or separated by a single mismatch. The thermodynamic energy parameters used are from the latest Mfold version 3; see http://www.bioinfo.rpi.edu/~zukerm/rna/energy/.
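To give a feel for how dynamic programming finds a minimum energy structure, the sketch below uses a deliberately simplified model. It is not the algorithm used in the Workbenches and not Zuker's algorithm: it is a Nussinov-style recursion in which every canonical base pair contributes the same assumed energy, with no loop, stacking, or coaxial stacking terms. The names fold, PAIR_ENERGY, and MIN_LOOP are chosen for this illustration only.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3      # assumed minimum number of unpaired bases in a hairpin loop
PAIR_ENERGY = -1  # toy value: every base pair contributes the same energy


def fold(seq):
    """Return (minimum energy, dot-bracket structure) under the toy model."""
    n = len(seq)
    # E[i][j] holds the minimum energy of seq[i..j]. The extra row and column
    # let E[k + 1][j] be read safely when k == j.
    E = [[0] * (n + 1) for _ in range(n + 1)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            best = E[i + 1][j]  # case 1: base i stays unpaired
            for k in range(i + MIN_LOOP + 1, j + 1):  # case 2: i pairs with k
                if (seq[i], seq[k]) in CANONICAL:
                    best = min(best, PAIR_ENERGY + E[i + 1][k - 1] + E[k + 1][j])
            E[i][j] = best

    # Trace back through the table to recover one optimal set of base pairs.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue  # too short to contain a base pair
        if E[i][j] == E[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + MIN_LOOP + 1, j + 1):
            paired = PAIR_ENERGY + E[i + 1][k - 1] + E[k + 1][j]
            if (seq[i], seq[k]) in CANONICAL and E[i][j] == paired:
                structure[i], structure[k] = "(", ")"
                stack.append((i + 1, k - 1))
                stack.append((k + 1, j))
                break
    return (E[0][n - 1] if n else 0), "".join(structure)


energy, structure = fold("GGGAAAUCC")
print(energy, structure)
```

The table is filled from short subsequences to long ones, so every subproblem is solved once and reused. A realistic energy model replaces the single PAIR_ENERGY term with element-specific increments but keeps the same overall strategy.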

Structure output

The predict secondary structure algorithm has two options for structure output: Compute minimum free energy and structure, and Also compute sample of suboptimal structures.

Compute minimum free energy and structure. This displays only one structure: the optimal structure.

Also compute sample of suboptimal structures. In addition to the optimal structure, this option computes a specified number of suboptimal structures.

If you choose to also compute suboptimal structures, you can specify how many structures to include in the output. The algorithm then iterates over all permissible canonical base pairs and, for each one, computes the minimum free energy and associated secondary structure constrained to contain that base pair. These structures are then sorted by minimum free energy, and the best ones are reported, up to the specified number of structures. Note that two different suboptimal structures can have the same minimum free energy.
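The sampling procedure described above can be sketched for very short sequences by exhaustive enumeration. This is an illustration of the idea only, under stated assumptions: it enumerates every nested structure instead of using dynamic programming, and it uses a toy energy of -1 per base pair in place of the real thermodynamic parameters. All function names are hypothetical.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases in a hairpin loop


def structures(seq, i, j):
    """Yield every nested set of canonical base pairs on seq[i..j]."""
    if j - i <= MIN_LOOP:
        yield frozenset()
        return
    yield from structures(seq, i + 1, j)  # base i unpaired
    for k in range(i + MIN_LOOP + 1, j + 1):  # base i paired with k
        if (seq[i], seq[k]) in CANONICAL:
            for inner in structures(seq, i + 1, k - 1):
                for right in structures(seq, k + 1, j):
                    yield inner | right | {(i, k)}


def energy(pairs):
    """Toy energy model (an assumption): -1 per base pair."""
    return -len(pairs)


def dot_bracket(pairs, n):
    out = ["."] * n
    for i, j in pairs:
        out[i], out[j] = "(", ")"
    return "".join(out)


def suboptimal_sample(seq, count):
    """For each permissible base pair, keep the best structure containing it,
    then report the 'count' best distinct structures sorted by energy."""
    best_for_pair = {}
    for pairs in structures(seq, 0, len(seq) - 1):
        e = energy(pairs)
        for pair in pairs:
            if pair not in best_for_pair or e < best_for_pair[pair][0]:
                best_for_pair[pair] = (e, pairs)
    # Several base pairs may share the same best structure; keep each once.
    distinct = {pairs: e for e, pairs in best_for_pair.values()}
    ranked = sorted(distinct.items(), key=lambda item: (item[1], sorted(item[0])))
    return [(e, dot_bracket(pairs, len(seq))) for pairs, e in ranked[:count]]


sample = suboptimal_sample("GGGAAAUCC", 3)
for e, s in sample:
    print(e, s)
```

The first structure reported is the optimal one, because every base pair in the optimal structure has that structure as its best constrained folding. Enumeration grows exponentially with sequence length, so this approach is only practical for toy inputs.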

The suboptimal Mfold output also allows you to calculate P-optimal base pairs. A P-optimal base pair is a base pair contained in at least one folding within P percent of the minimum free energy. Thus, a folding within P percent of the minimum free energy contains only P-optimal base pairs or P'-optimal base pairs where P'<P. The minimum free energy structure contains only 0-optimal base pairs. If you compute 20 suboptimal structures that all contain a stem (a group of 2 or more consecutive base pairs) in which all base pairs are 0-optimal (contained in a folding with the minimum free energy), it is natural to give that stem a high confidence.
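The definition of P-optimal base pairs can be made concrete with a short sketch. The foldings below are hypothetical data, the function names are chosen for illustration, and the percentage is computed relative to the absolute value of the minimum free energy, which is assumed to be negative.

```python
def p_optimality(foldings):
    """Map each base pair to the smallest P for which it is P-optimal.

    foldings: list of (free_energy, set_of_base_pairs). The minimum free
    energy is assumed to be negative.
    """
    mfe = min(e for e, _ in foldings)
    result = {}
    for e, pairs in foldings:
        # Percentage by which this folding deviates from the minimum.
        percent = 100.0 * (e - mfe) / abs(mfe)
        for pair in pairs:
            result[pair] = min(percent, result.get(pair, percent))
    return result


def p_optimal_pairs(foldings, p):
    """Return the base pairs found in at least one folding within p percent."""
    return {pair for pair, value in p_optimality(foldings).items() if value <= p}


# Hypothetical foldings: (free energy in kcal/mol, base pairs as index tuples).
foldings = [
    (-10.0, {(0, 8), (1, 7)}),
    (-9.0, {(0, 8), (2, 6)}),
    (-8.0, {(1, 6)}),
]

print(p_optimal_pairs(foldings, 0))
print(p_optimal_pairs(foldings, 10))
```

With P set to 0, only the base pairs of the minimum free energy folding are returned. Raising P admits base pairs from progressively less stable foldings.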

P-optimal base pairs are distinguished by colors, which can be displayed in the normal sequence view using the Side Panel.
