Methodology — RNAfold

01 · Overview

The v0.2 pipeline takes an RNA sequence (or a PDB upload) and returns a top-3 ranked shortlist of candidate druggable cleft pockets, alongside the predicted 3D structure, a conformational ensemble, full per-cluster metadata and a branded PDF report.

sequence (or uploaded PDB)
   -> 3D structure prediction              (single-seq default; opt-in MSA path)
   -> conformational ensemble (5 frames)    (ANM normal-mode sampling)
   -> cavity detection per frame            (fpocket, RNA-tuned parameters)
   -> cross-frame clustering at 4 A         (Kabsch alignment, persistence-aware)
   -> RNA-applicable ranking                (persistence x binding-residue stability)
   -> top-3 + ensemble PDB + JSON + PDF

02 · 3D structure prediction

We use RhoFold+ (Shen et al., Nature Methods 2024) under its Apache 2.0 licence to predict the 3D tertiary structure from sequence. Single-sequence prediction is the default; for targets that pass the pre-pilot tractability screen we run an MSA-driven path using the same model on a covariance-aware multiple-sequence alignment built from RNAcentral homology search.

We do not modify, fine-tune, or re-train the pretrained weights. The training data licence is non-commercial; the model is used strictly for inference under its software licence.

Licence: Apache 2.0 (model code & weights). Training data licence non-commercial; weights used for inference only.

03 · Conformational ensemble

We sample a five-frame conformational ensemble around the predicted structure using anisotropic network model (ANM) normal-mode analysis (Atilgan et al., 2001), implemented via ProDy (Bakan et al., 2011) under BSD-3. Perturbation is along the 10 lowest-frequency collective modes on the C3′ backbone.

ANM is deterministic, fast, and produces motion that captures the kind of low-frequency conformational change relevant to pocket formation. We piloted short molecular dynamics (AMBER OL3, 500 ps production) but observed equilibration drift of 2.9–3.3 Å away from the prediction before production starts; this drift demoted real binding-site clusters in the ranking on both targets where MSA mode helps. MD is on the v0.3 research roadmap pending resolution of that drift.

Licence: ProDy (BSD-3).

04 · Cavity detection

Cavities are detected on each frame using fpocket (Le Guilloux et al., 2009) under its MIT licence, with RNA-tuned alpha-sphere and clustering parameters: minimum radius 3.0 Å, maximum 5.7 Å, minimum alpha-spheres 35, clustering distance 1.65 Å.
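As a rough illustration of how the per-frame detection call could be assembled, the sketch below builds an fpocket command line from the parameter values quoted above. The flag letters (-m, -M, -i, -D) are an assumption taken from fpocket's command-line options and should be checked against the installed version; the function name and the frame file name are hypothetical, and this is not the pipeline's actual wrapper.

```python
from typing import List, Mapping

# Parameter values quoted in the text. The mapping of each value to a flag
# letter is an assumption, not something this page specifies.
RNA_TUNED: Mapping[str, float] = {
    "-m": 3.0,   # minimum alpha-sphere radius (angstrom)
    "-M": 5.7,   # maximum alpha-sphere radius (angstrom)
    "-i": 35,    # minimum number of alpha-spheres per pocket
    "-D": 1.65,  # clustering distance (angstrom)
}


def fpocket_command(pdb_path: str, params: Mapping[str, float] = RNA_TUNED) -> List[str]:
    """Build the argv list for one fpocket run on a single ensemble frame."""
    cmd = ["fpocket", "-f", pdb_path]
    for flag, value in params.items():
        cmd += [flag, str(value)]
    return cmd


# Hypothetical frame file name, for illustration only.
print(" ".join(fpocket_command("frame_0.pdb")))
```

The list form can be passed directly to subprocess.run without shell quoting, once per ensemble frame.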

fpocket’s default parameters over-predict pockets on RNA because the polar grooves of duplex RNA get mistaken for cavities. The RNA community has converged on a tighter parameter set; the most recent formalisation is fpocketR (Veenbaas et al., PNAS 2025), an RNA-specific wrapper for fpocket 4.0 with parameters calibrated on a comprehensive analysis of RNA–ligand co-crystal structures. Our parameter set is consistent with the fpocketR tuning; on the seven-target benchmark below, our detection produces the same number of pockets and the same rank-1 binding-site overlap as fpocketR’s parameters, target-by-target. We treat the RNA-tuned detection step as field-standard rather than as a proprietary contribution.

Pockets are then clustered across the ensemble at 4 Å (centroid distance) after Kabsch-aligning frames to the reference. Persistence is the fraction of frames in which a cluster is detected; clusters with persistence below the configured floor are excluded from the customer-visible top-3.
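The clustering and persistence step can be sketched as follows. This is a minimal greedy centroid-distance version, assuming the frames have already been Kabsch-aligned to the reference; the function name, the dictionary layout, and the 0.6 floor in the usage lines are illustrative assumptions, not the shipped implementation or its configured value.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def cluster_across_frames(frames: List[List[Point]], eps: float = 4.0) -> List[Dict]:
    """Greedy clustering of pocket centroids from pre-aligned frames.

    frames[i] holds the pocket centroids detected in frame i. A pocket joins
    the first cluster whose running mean centroid lies within eps angstrom;
    otherwise it starts a new cluster.
    """
    clusters: List[Dict] = []
    for frame_idx, pockets in enumerate(frames):
        for centroid in pockets:
            for c in clusters:
                if math.dist(centroid, c["centroid"]) <= eps:
                    c["members"].append((frame_idx, centroid))
                    n = len(c["members"])
                    # Recompute the running mean centroid of the cluster.
                    c["centroid"] = tuple(
                        sum(m[1][k] for m in c["members"]) / n for k in range(3)
                    )
                    break
            else:
                clusters.append({"centroid": centroid, "members": [(frame_idx, centroid)]})
    for c in clusters:
        # Persistence: fraction of distinct frames in which the cluster appears.
        c["persistence"] = len({f for f, _ in c["members"]}) / len(frames)
    return clusters


# Five toy frames: one pocket recurs in every frame, another in two frames.
frames = [
    [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0)],
    [(0.5, 1.0, 0.0), (21.0, 0.0, 0.0)],
    [(0.0, 0.5, 0.5)],
    [(1.0, 1.0, 0.0)],
]
clusters = cluster_across_frames(frames)
kept = [c for c in clusters if c["persistence"] >= 0.6]  # 0.6 is an illustrative floor
print([c["persistence"] for c in clusters], len(kept))
```

With five frames, persistence can only take values in steps of 0.2, so the floor effectively sets a minimum number of frames.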

Licence: fpocket (MIT); fpocketR (MIT; not used as a dependency, cited as the published reference for the RNA tuning).

05 · Cross-frame geometric ranking // load-bearing contribution

This is the load-bearing architectural contribution of v0.2 and the part of the pipeline that is not subsumed by any published RNA pocket-detection tool. Cavity detection (the previous section) is field-standard and equivalent to the published fpocketR approach. What we add on top is the cross-frame geometric ranker operating over the five-frame ANM conformational ensemble.

We rank candidate pockets by:

score = persistence x n_residues_intersected

where persistence is the fraction of frames in which the cluster is detected, and n_residues_intersected is the count of residues contacted by the cluster in every member frame. Both features are RNA-applicable by construction — no hydrophobicity term, no protein-trained classifier — and both require the conformational ensemble to be computed: a single frame does not give a persistence measurement and does not yield the intersected-residue set.
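A minimal sketch of the scoring rule, assuming each cluster is represented as a mapping from frame index to the set of residues it contacts in that frame. The representation, function names, and residue labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def score_cluster(residues_by_frame: Dict[int, Set[str]], n_frames: int) -> float:
    """score = persistence x n_residues_intersected for one cluster."""
    if not residues_by_frame:
        return 0.0
    persistence = len(residues_by_frame) / n_frames
    # Residues contacted in every frame where the cluster is present.
    intersected = set.intersection(*residues_by_frame.values())
    return persistence * len(intersected)


def rank_clusters(
    clusters: Dict[str, Dict[int, Set[str]]], n_frames: int, top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k (cluster name, score) pairs, highest score first."""
    scored = [(name, score_cluster(r, n_frames)) for name, r in clusters.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy example with made-up residue labels and a five-frame ensemble.
clusters = {
    "A": {f: {"G10", "A11", "C12", f"U{20 + f}"} for f in range(5)},  # stable core of 3
    "B": {0: {"U30", "U31", "G32", "A33", "C34"},
          2: {"U30", "U31", "G32", "A33", "C34"}},                    # transient, 5 residues
    "C": {4: {"A40", "A41", "G42", "C43"}},                           # seen once
}
print(rank_clusters(clusters, n_frames=5))
```

In the toy data, cluster A outranks B even though B contacts more residues, because A is present in every frame and B in only two.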

The empirical contribution of this step is direct. On the seven-target benchmark with the same RNA-tuned detection in every configuration, single-frame fpocket-native ranking (whether using our parameters or the published fpocketR parameters) produces 0 of 7 strict@1 and 2 of 7 near@1. Adding the cross-frame geometric ranker on top of an identical detection step lifts recovery to 3 of 7 strict@1 and 6 of 7 near@1. On 5 of the 7 targets the lift crosses a recovery class boundary (NEITHER → NEAR, NEITHER → STRICT, or NEAR → STRICT); see the comparison section below for the per-target breakdown.

fpocket’s native druggability score — the output of a logistic regression trained on protein druggable-vs-non-druggable cavities — does not transfer to RNA; on the seven-target benchmark it consistently scores the actual binding-site cluster near zero while assigning non-binding cavities scores in the 0.1–0.7 range. The fpocket-on-RNA limitation is independently characterised by Veenbaas et al. (2025) and addressed in fpocketR through parameter tuning rather than scoring; our orthogonal answer is to leave the protein-trained scorer aside and rank by RNA-applicable geometric features over a conformational ensemble. fpocket’s druggability score is still computed and reported per cluster as metadata for transparency; druggability assessment itself is left to the customer’s medicinal chemistry workflow.

06 · Pre-pilot MSA tractability screen

For targets with diverse-tail evolutionary representation an MSA-driven structure prediction path is offered as an opt-in mode. The screen criterion is empirical:

min_id < 0.77   OR   >= 5% of homologs at 70-80% identity

When a target meets this criterion, the MSA carries enough divergence to add value to AI structure prediction; when it does not, the MSA is dominated by near-duplicate orthologs that perturb the prediction without improving it. The criterion is calibrated on a benchmark of seven RNA targets; we will refine it as more targets accumulate.
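The criterion can be written as a small predicate. The sketch assumes that identities are fractional sequence identities (0 to 1) of each homolog against the query, that min_id is the minimum of those values, and that the 70-80% band is inclusive at both ends; none of these details is specified above, and the function name is hypothetical.

```python
from typing import Sequence


def passes_msa_screen(identities: Sequence[float]) -> bool:
    """Return True when the MSA path is indicated for a target.

    identities: fractional identity of each homolog to the query (assumed).
    Rule: min_id < 0.77  OR  >= 5% of homologs at 70-80% identity.
    """
    if not identities:
        return False  # no homologs, nothing to build an MSA from
    min_id = min(identities)
    in_band = sum(1 for x in identities if 0.70 <= x <= 0.80)
    return min_id < 0.77 or in_band / len(identities) >= 0.05


print(passes_msa_screen([0.98, 0.97, 0.62]))          # divergent tail present
print(passes_msa_screen([0.99] * 18 + [0.78, 0.79]))  # 10% of homologs in the band
print(passes_msa_screen([0.95, 0.96, 0.99]))          # near-duplicates only
```

Under these assumptions the second clause only changes the outcome when every homolog is at or above 77% identity, since any homolog below 0.77 already satisfies the first clause.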

Groove-binding targets, surface-recognition interfaces, and large complex folds (e.g. group I intron core) are out of scope for v0.2 by construction. The reason is not the screen: the cavity-detection step itself requires cleft geometry. The screen flags these classes upfront.

07 · Validation

The v0.2 benchmark covers seven cleft-binder RNAs with deposited co-crystal structures. The set is constructed in two layers. Four v0.1 retro targets, 2GDI (TPP RF00059), 4GXY (B12 RF00174), 2GIS (SAM-I RF00162), and 5C45 (FMN RF00050), were carried over from the v0.1 fpocket-druggability era to give an apples-to-apples comparison. Three new pilot targets, 4LVV (THF RF01831), 2HOJ (thi-box TPP), and 3DIL (group I intron), were introduced to stress the geometric ranker and the MSA tractability screen on unseen sequences and unseen fold families.

As-shipped configuration: single-sequence prediction by default; MSA mode where the pre-pilot screen indicates. The as-shipped arm is the one the screen selects for each target — single-sequence for the majority, MSA for 4GXY, 4LVV, and 3DIL.

Benchmark definitions

For each target we compare each pipeline-produced cluster against the experimental binding-site residue set from the co-crystal structure. Overlap is the fraction of binding-site residues that fall within the cluster:

overlap(cluster, target) = | residues(cluster) ∩ binding_site(target) |
                           --------------------------------------------
                                    | binding_site(target) |

For each target we then evaluate the top-K ranked clusters and report:

strict@K: at least one cluster in the top K has overlap ≥ 50%. We report strict@1, meaning the customer-facing rank-1 cluster recovers the experimental site.
near@K: at least one cluster in the top K has overlap ≥ 30%. We report near@1, meaning the rank-1 cluster picks up a substantial fraction of the site even if it falls short of the strict threshold.

These are stricter than the “centroid-within-X-Å” criterion common in the protein cavity literature: a cluster whose centroid is close to the binding site but whose residue contacts miss the actual binding pocket scores 0% here. Both thresholds are reported because the 30–50% band corresponds to predictions that point a medicinal chemist at the right region of the molecule even though geometry refinement is still needed.
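The overlap definition and the two thresholds can be expressed directly in code. This is a sketch that assumes clusters and binding sites are given as sets of residue identifiers; the function names and residue labels are hypothetical.

```python
from typing import List, Set


def overlap(cluster: Set[str], site: Set[str]) -> float:
    """Fraction of binding-site residues that fall within the cluster."""
    return len(cluster & site) / len(site) if site else 0.0


def recovered_at_k(ranked: List[Set[str]], site: Set[str], k: int, threshold: float) -> bool:
    """True if any of the top-k ranked clusters reaches the overlap threshold."""
    return any(overlap(c, site) >= threshold for c in ranked[:k])


def recovery_class(ranked: List[Set[str]], site: Set[str]) -> str:
    """Rank-1 recovery class: STRICT (>= 50%), NEAR (>= 30%), or NEITHER."""
    if recovered_at_k(ranked, site, 1, 0.50):
        return "STRICT"
    if recovered_at_k(ranked, site, 1, 0.30):
        return "NEAR"
    return "NEITHER"


# Toy example: a 10-residue site; rank 1 covers 4 residues, rank 2 covers 6.
site = {f"R{i}" for i in range(10)}
ranked = [
    {"R0", "R1", "R2", "R3", "X1", "X2"},
    {"R4", "R5", "R6", "R7", "R8", "R9"},
]
print(recovery_class(ranked, site))             # rank-1 overlap is 40%
print(recovered_at_k(ranked, site, 2, 0.50))    # strict@2 looks one rank further
```

Residues in the cluster that lie outside the site (X1, X2 here) do not lower the overlap, which is why a single oversized pocket can reach a high overlap without being a useful prediction.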

Recovery results

→3 of 7 strict@1 — 2GDI (TPP, single-seq, 71%), 2HOJ (thi-box TPP, single-seq, 53%), 4LVV (THF, MSA, 50%)
→6 of 7 near@1 — the strict set plus 4GXY (B12, MSA, 35%), 2GIS (SAM-I, single-seq, 38%), 5C45 (FMN, single-seq, 40%)
→1 of 7 neither — 3DIL (group I intron) at 0% overlap on both arms; underlying prediction quality is the bound, and group I intron sits at the boundary of v0.2’s addressable scope (see below)

For comparison, the v0.1 fpocket-druggability ranker on the same four v0.1 retro targets produces 0 of 4 strict@1 — the protein-trained classifier placed non-binding cavities at rank 1 in every case. The v0.2 geometric ranker on those same four targets in their as-shipped arms produces 1 strict (2GDI) and 3 near (4GXY MSA, 2GIS, 5C45). The full delta is the seven-target view: 3/7 strict@1, 6/7 near@1.

The full per-target table is in the homepage benchmark section. Per-cell records: logs/v02_launch_benchmark.json. Reproducer: tmp/v02_launch_benchmark.py.

08 · Comparison against fpocketR

The Weeks lab at UNC Chapel Hill published fpocketR, an RNA-tuned wrapper for fpocket 4.0, in PNAS in April 2025 (Veenbaas et al.). fpocketR formalises the same observation our pipeline had been built around: fpocket’s default parameters over-predict on RNA, mistaking grooves for cavities. Their published answer is a tuned parameter set (m = 3.0 Å, M = 5.7 Å, i = 42 spheres, D = 1.65 Å, with explicit polar-sphere and apolar-ratio cutoffs). Ours is m = 3.0 Å, M = 5.7 Å, i = 35 spheres, D = 1.65 Å.

To answer the load-bearing positioning question — does fpocketR’s tuned detection already produce the recovery we report, making the cross-frame geometric ranker redundant? — we ran the seven-target benchmark under four configurations against the same as-shipped-arm predicted structure for each target:

Aggregate rank-1 recovery (n = 7)

Configuration	strict@1	near@1
Vanilla fpocket defaults	3 / 7	3 / 7
Our v0.2 params (single-frame, no ensemble, no geometric ranker)	0 / 7	2 / 7
fpocketR params (single-frame, no ensemble, no geometric ranker)	0 / 7	2 / 7
Our locked v0.2 (ensemble + geometric ranker)	3 / 7	6 / 7

Vanilla fpocket appears to win at strict@1 in aggregate. It does so by over-detecting (12–28 pockets per target) and accidentally placing a binding-site-covering cavity at rank-1 from noise. The most extreme case is 2GIS, where vanilla fpocket’s rank-1 has 100% overlap because it is a single huge pocket that engulfs the entire binding site plus a large number of unrelated residues — useless as a top-3 customer surface.

Per-target lift (rank-1 overlap)

Target	fpocketR single-frame	v0.2 ensemble + ranker	Δ
2GDI (TPP)	43% NEAR	71% STRICT	+28 pp · NEAR→STRICT
4GXY (B12)	0% NEITHER	35% NEAR	+35 pp · NEITHER→NEAR
2GIS (SAM-I)	6% NEITHER	38% NEAR	+32 pp · NEITHER→NEAR
5C45 (FMN)	40% NEAR	40% NEAR	flat (already NEAR)
3DIL (group I intron)	0% NEITHER	0% NEITHER	flat (out-of-scope)
2HOJ (thi-box TPP)	0% NEITHER	53% STRICT	+53 pp · NEITHER→STRICT
4LVV (THF)	6% NEITHER	50% STRICT	+44 pp · NEITHER→STRICT

On five of the seven targets (2GDI, 4GXY, 2GIS, 2HOJ, 4LVV) the ensemble + cross-frame ranker moves recovery across a class boundary. On 5C45 it is flat (NEAR under both, the smallest target in the benchmark) and on 3DIL it is flat NEITHER (the out-of-scope group I intron whose AI structure prediction quality is the bound: no ranking change can fix backbone RMSD of 11.6–20.3 Å).

What this comparison establishes

RNA-tuned cavity detection is field-standard, not proprietary. Our parameters and fpocketR’s parameters produce equivalent detection on this benchmark.
The cross-frame geometric ranker over a conformational ensemble is the v0.2 contribution that is not subsumed by any published RNA pocket-detection tool. It produces a measurable, target-specific lift on top of either parameter set.
fpocket’s native druggability score is not a ranking signal on RNA. Removing it from the ranking step (as we do) and removing it via parameter-tuning (as fpocketR does) are both valid responses to the same underlying problem.
Backbone RMSD is not the customer-facing quality metric. The 2HOJ case (NEITHER at 14 Å RMSD with single-frame ranking; STRICT at 14 Å RMSD with ensemble + ranker) and the 2GIS case (NEITHER at 1.3 Å RMSD with single-frame ranking; NEAR at 1.3 Å RMSD with ensemble + ranker) both confirm this. Binding-site overlap is the metric that maps to customer usefulness.

Comparison source: tests/regression/fpocketr_comparison_baseline.json. Reproducer: tmp/fpocketR_compare_all.py.

09 · Scope and limitations

v0.2 is calibrated for a specific class of RNA target. The boundaries below are constraints we know about and surface upfront; the pre-pilot screen described in section 06 enforces them at the customer-intake step.

In scope

Pre-organized helical / internal-loop RNA elements. Our competence is local pocket pre-organization, not global fold accuracy: when a binding site sits in a pre-organized helical or internal-loop element, cavity detection recovers it even when the rest of the fold is less certain, because pocket detection is a local problem rather than a whole-molecule one. Riboswitches and aptamers are the prototypical case; the same fold architecture extends to structured viral elements such as HCV IRES IIa and HIV-1 RRE.
Sequence length up to ~500 nt. The benchmark spans 54–174 nt; the pipeline is designed to scale to ~500 nt within the same architecture, though we report results only for the benchmark range.
Sequence-only input. A deposited PDB upload is also accepted and skips the structure-prediction step; the downstream ensemble, cavity detection, and ranking are unchanged.

Out of scope (by construction)

H-type pseudoknots and G-quadruplexes. For these fold classes, structure-prediction confidence can be misleading — the model reports high confidence on a fold it has not actually built correctly — so we exclude them from the product rather than hand back an unreliable pocket. Before we report pockets on any target, we screen it against the fold-confidence signal from structure prediction, and any target we cannot confidently model — including known pseudoknot and G-quadruplex cases — is reviewed by hand before results go out.
Groove-binding interfaces. Binding modes that occupy the major or minor groove without forming a cleft cavity are not detectable by the cavity-detection step in v0.2. We flag these classes in the pre-pilot screen.
Surface-recognition interfaces. Large flat protein-RNA-style binding surfaces are outside the cavity-detection regime. v0.2 does not attempt to surface them.
Large complex folds with reduced AI prediction quality. Group I introns are the canonical example: 3DIL fails on both arms not because the ranker is wrong but because the predicted structure is too distorted (RMSD 11.6–20.3 Å on the two arms) for cavity geometry to align with the experimental site. We surface AI prediction quality (mean pLDDT, RMSD) so this is visible.
MD-driven ensembles. Short molecular dynamics (AMBER OL3, 500 ps production) on top of the RhoFold prediction was tried but is not shipped: we observed 2.9–3.3 Å equilibration drift away from the prediction that demoted real binding-site clusters in the ranking on both targets where MSA mode helps. The shipped pipeline uses ANM only.
Druggability assessment. The pipeline ranks pockets; it does not predict druggability. Druggability assessment belongs in the customer’s medicinal chemistry workflow.

What results are and are not

Results are computational predictions. We make geometric claims (a cluster of a given size persists in this position across the ensemble, and it overlaps the experimental binding site at X%) and we make recovery claims against the seven benchmark targets. We do not make claims about ligand binding affinity, hit rate against the cluster, or any in-vitro property; experimental validation belongs in the customer’s workflow before the predictions are used in drug-development decisions.

10 · Known limitations

The open items called out above — MD equilibration drift, groove-binder geometry, larger folds — remain unresolved in the shipped pipeline. They are research questions, not committed features, and nothing on a research branch is represented as available here. See findings for the induced-fit survey, which bears on how much any apo-structure pocket prediction can be trusted regardless of these items.

11 · Reproducibility

The pipeline is deterministic given the inputs:

Structure prediction is deterministic at single_seq_pred=True.
ANM with seed 0 produces identical perturbations on a fixed CPU/numpy stack.
fpocket with fixed parameters is deterministic.
Cross-frame clustering at ε = 4 Å with the locked Kabsch reference is deterministic.

A re-run on the same input sequence produces bit-identical output bundles, including the binding-site overlap percentages reported on this page.
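One way to check the bit-identical claim on two runs is to hash every file in each output bundle and compare the digests. The sketch below assumes a bundle is a directory of files; the function name and the directory names in the usage comment are hypothetical, and this is not part of the shipped reproducer scripts.

```python
import hashlib
from pathlib import Path


def bundle_digest(bundle_dir: str) -> str:
    """SHA-256 over the relative path and bytes of every file, in sorted order."""
    root = Path(bundle_dir)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Include the relative path so renamed or moved files change the digest.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


# Usage (hypothetical directory names):
#   assert bundle_digest("run_a/") == bundle_digest("run_b/")
```

Hashing relative paths together with contents means the comparison does not depend on where each bundle is stored on disk.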

12 · Citations

Shen, T. et al. (2024). RhoFold+: Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nature Methods.
Atilgan, A.R. et al. (2001). Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophysical Journal 80, 505–515.
Bakan, A., Meireles, L.M., Bahar, I. (2011). ProDy: Protein Dynamics Inferred from Theory and Experiment. Bioinformatics 27, 1575–1577.
Le Guilloux, V., Schmidtke, P., Tuffery, P. (2009). Fpocket: An open source platform for ligand pocket detection. BMC Bioinformatics 10, 168.
Veenbaas, S.D., Koehn, J.T., Irving, P.S., Lama, N.N., Weeks, K.M. (2025). Ligand-binding pockets in RNA and where to find them. PNAS 122(17), e2422346122. doi:10.1073/pnas.2422346122

// White paper

A full white paper covering the methodology, validation, and head-to-head against alternative approaches is in preparation. Until it is published, this page is the canonical methods document. For early discussion or to request the draft, email info@rnafold.com.
