Abstract
Computer-aided prediction of aptamer sequences has focused on primary sequence alignment and motif comparison. We observed that many aptamers contain a conserved hairpin, yet the sequence of the hairpin can be highly variable. Taking this secondary structure information into account, we developed a new algorithm that considers both conserved primary sequences and secondary structures by combining three scores based on sequence abundance, stability, and structure, respectively. The algorithm was used to predict aptamers from caffeine and theophylline selections. In the late rounds of selection, when the library had converged, the predicted sequences matched well with the most abundant sequences. When the library was far from convergence and the sequences were deemed impossible to analyze by traditional methods, the algorithm still predicted aptamer sequences that were experimentally verified by isothermal titration calorimetry. This algorithm offers a new way to look for patterns in aptamer selection libraries and mimics the sequence evolution process. It will help shorten aptamer selection time and promote the biosensor applications of aptamers.
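The idea of ranking library sequences by three combined scores can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the equal weighting, the perfect-stem hairpin search, and the hydrogen-bond count used as a stand-in for stability are all assumptions made for illustration. A practical implementation would more likely take stability from a nearest-neighbor folding tool.

```python
from collections import Counter

# Watson-Crick pairs for DNA (assumption: no wobble or mismatched pairs).
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def best_hairpin(seq, min_loop=3):
    """Return (stem_length, gc_pairs) for the longest perfect stem-loop in seq.

    Each (i, j) is tried as the innermost base pair closing the loop,
    and the stem is extended outward while the bases stay complementary.
    """
    best = (0, 0)
    n = len(seq)
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            k = gc = 0
            while i - k >= 0 and j + k < n and (seq[i - k], seq[j + k]) in PAIRS:
                if seq[i - k] in "GC":
                    gc += 1
                k += 1
            best = max(best, (k, gc))
    return best


def rank_candidates(reads, min_stem=3, weights=(1.0, 1.0, 1.0)):
    """Rank unique reads by a weighted mean of three hypothetical scores."""
    counts = Counter(reads)
    total = sum(counts.values())
    w_abund, w_stab, w_struct = weights
    ranked = []
    for seq, count in counts.items():
        stem, gc = best_hairpin(seq)
        # Abundance: fraction of reads carrying this exact sequence.
        abundance = count / total
        # Stability proxy: hydrogen bonds in the stem (A-T = 2, G-C = 3),
        # normalized by a fully G-C paired half-length stem.
        hbonds = 2 * (stem - gc) + 3 * gc
        stability = hbonds / (3 * max(1, len(seq) // 2))
        # Structure: 1 if a hairpin with a long enough stem exists, else 0.
        structure = 1.0 if stem >= min_stem else 0.0
        score = (w_abund * abundance + w_stab * stability
                 + w_struct * structure) / sum(weights)
        ranked.append((seq, score))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

In this toy scheme, a hairpin-forming sequence can outrank a more abundant sequence that cannot fold, which mirrors the situation described above for libraries that are far from convergence.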
Supplementary materials
Title
Supporting Information
Description
DNA sequences tested in this work and additional aptamer alignment data.