01 · Overview
The v0.2 pipeline takes an RNA sequence (or a PDB upload) and returns a top-3 ranked shortlist of candidate druggable cleft pockets, alongside the predicted 3D structure, a conformational ensemble, full per-cluster metadata and a branded PDF report.
sequence (or uploaded PDB)
-> 3D structure prediction (single-seq default; opt-in MSA path)
-> conformational ensemble (5 frames; ANM normal-mode sampling)
-> cavity detection per frame (fpocket, RNA-tuned parameters)
-> cross-frame clustering at 4 Å (Kabsch alignment, persistence-aware)
-> RNA-applicable ranking (persistence x binding-residue stability)
-> top-3 + ensemble PDB + JSON + PDF
02 · 3D structure prediction
We use RhoFold+ (Shen et al., Nature Methods 2024) under its Apache 2.0 licence to predict the tertiary (3D) structure from sequence. Single-sequence prediction is the default; for targets that pass the pre-pilot tractability screen we offer an opt-in MSA-driven path using the same model on a covariance-aware multiple-sequence alignment built from RNAcentral homology search.
We do not modify, fine-tune, or re-train the pretrained weights. The training data licence is non-commercial; the model is used strictly for inference under its software licence.
03 · Conformational ensemble
We sample a five-frame conformational ensemble around the predicted structure using anisotropic network model (ANM) normal-mode analysis (Atilgan et al., 2001), implemented via ProDy (Bakan et al., 2011) under BSD-3. Perturbation is along the 10 lowest-frequency collective modes on the C3′ backbone.
ANM is deterministic, fast, and produces motion that captures the kind of low-frequency conformational change relevant to pocket formation. We piloted short molecular dynamics (AMBER OL3, 500 ps production) but observed equilibration drift of 2.9–3.3 Å away from the prediction before production started; this drift demoted real binding-site clusters in the ranking on both targets where MSA mode helps. MD is on the v0.3 research roadmap pending resolution of that drift.
04 · Cavity detection
Cavities are detected on each frame using fpocket (Le Guilloux et al., 2009) under its MIT licence, with RNA-tuned alpha-sphere and clustering parameters: minimum radius 3.0 Å, maximum 5.7 Å, minimum alpha-spheres 35, clustering distance 1.65 Å.
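As a minimal sketch, the parameter set above maps onto fpocket's command-line flags as follows (`-m`, `-M`, `-i`, `-D`, matching the m/M/i/D notation used in section 08). The helper name, the frame file names, and the idea of invoking fpocket once per frame through a subprocess are assumptions for illustration, not a description of the production wrapper.

```python
from typing import List

# RNA-tuned values quoted in the text; the flag letters are fpocket CLI options.
RNA_TUNED = {
    "-m": 3.0,   # minimum alpha-sphere radius (Å)
    "-M": 5.7,   # maximum alpha-sphere radius (Å)
    "-i": 35,    # minimum number of alpha-spheres per pocket
    "-D": 1.65,  # alpha-sphere clustering distance (Å)
}


def fpocket_command(frame_pdb: str) -> List[str]:
    """Build the argv list for one fpocket run on a single ensemble frame."""
    cmd = ["fpocket", "-f", frame_pdb]
    for flag, value in RNA_TUNED.items():
        cmd += [flag, str(value)]
    return cmd


# Hypothetical frame file names, one per ensemble member.
commands = [fpocket_command(f"frame_{k}.pdb") for k in range(5)]
```

Each command could then be executed with `subprocess.run(cmd, check=True)` on a machine where fpocket is installed.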
fpocket’s default parameters over-predict pockets on RNA because the polar grooves of duplex RNA get mistaken for cavities. The RNA community has converged on a tighter parameter set; the most recent formalisation is fpocketR (Veenbaas et al., PNAS 2025), an RNA-specific wrapper for fpocket 4.0 with parameters calibrated on a comprehensive analysis of RNA–ligand co-crystal structures. Our parameter set is consistent with the fpocketR tuning; on the seven-target benchmark below, our detection produces the same number of pockets and the same rank-1 binding-site overlap as fpocketR’s parameters, target-by-target. We treat the RNA-tuned detection step as field-standard rather than as a proprietary contribution.
Pockets are then clustered across the ensemble at 4 Å (centroid distance) after Kabsch-aligning frames to the reference. Persistence is the fraction of frames in which a cluster is detected; clusters with persistence below the configured floor are excluded from the customer-visible top-3.
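The clustering and persistence step can be sketched as below. This assumes pocket centroids have already been transformed into the reference frame by the Kabsch alignment, and it uses a simple greedy assignment (join the first cluster whose running-mean centroid lies within 4 Å); the text does not specify the clustering algorithm, so that choice, the names, and the example floor value are all illustrative.

```python
import math
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Point = Tuple[float, float, float]


@dataclass
class Cluster:
    centroid: Point                                  # running mean of member centroids
    frames: Set[int] = field(default_factory=set)    # frames in which the cluster appears
    n_members: int = 0

    def add(self, frame: int, c: Point) -> None:
        n = self.n_members
        self.centroid = tuple((m * n + x) / (n + 1) for m, x in zip(self.centroid, c))
        self.n_members = n + 1
        self.frames.add(frame)


def cluster_pockets(frames: List[List[Point]], cutoff: float = 4.0) -> List[Cluster]:
    """Greedy cross-frame clustering of aligned pocket centroids (cutoff in Å)."""
    clusters: List[Cluster] = []
    for k, pockets in enumerate(frames):
        for c in pockets:
            home = next((cl for cl in clusters if math.dist(cl.centroid, c) <= cutoff), None)
            if home is None:
                home = Cluster(centroid=c)
                clusters.append(home)
            home.add(k, c)
    return clusters


def persistence(cluster: Cluster, n_frames: int) -> float:
    """Fraction of frames in which the cluster is detected."""
    return len(cluster.frames) / n_frames


# Toy ensemble: one pocket present in all 5 frames, one present in only 2.
frames = [
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (10.5, 0.0, 0.0)],
    [(0.2, 0.2, 0.0)],
    [(0.0, 0.0, 0.8)],
]
clusters = cluster_pockets(frames)
floor = 0.6  # hypothetical persistence floor; the configured value is not stated here
kept = [cl for cl in clusters if persistence(cl, len(frames)) >= floor]
```

In this toy case only the pocket seen in every frame survives the floor; the two-frame pocket is dropped from the shortlist.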
05 · Cross-frame geometric ranking // load-bearing contribution
This is the load-bearing architectural contribution of v0.2 and the part of the pipeline that is not subsumed by any published RNA pocket-detection tool. Cavity detection (the previous section) is field-standard and equivalent to the published fpocketR approach. What we add on top is the cross-frame geometric ranker operating over the five-frame ANM conformational ensemble.
We rank candidate pockets by:
score = persistence x n_residues_intersected
where persistence is the fraction of frames in which the cluster is detected, and n_residues_intersected is the count of residues contacted by the cluster in every member frame. Both features are RNA-applicable by construction — no hydrophobicity term, no protein-trained classifier — and both require the conformational ensemble to be computed: a single frame does not give a persistence measurement and does not yield the intersected-residue set.
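A minimal sketch of the scoring rule follows, assuming each cluster is represented as a mapping from frame index to the set of residues it contacts in that frame (only frames where the cluster was detected appear as keys). The residue labels and cluster names are made up for illustration.

```python
from typing import Dict, Set


def geometric_score(contacts_by_frame: Dict[int, Set[str]], n_frames: int) -> float:
    """score = persistence x n_residues_intersected for one cluster."""
    if not contacts_by_frame:
        return 0.0
    persistence = len(contacts_by_frame) / n_frames
    # Residues contacted in every member frame of the cluster.
    intersected = set.intersection(*contacts_by_frame.values())
    return persistence * len(intersected)


N_FRAMES = 5
candidates = {
    # Present in all 5 frames, 3 residues contacted in every frame.
    "A": {k: {"G10", "A11", "C12", f"U{30 + k}"} for k in range(5)},
    # Present in 2 frames, 4 residues shared between them.
    "B": {0: {"G40", "G41", "A42", "C43", "U44"}, 3: {"G40", "G41", "A42", "C43"}},
    # Present in 1 frame only: large contact set, low persistence.
    "C": {2: {"A60", "A61", "C62", "G63", "U64"}},
}
scores = {name: geometric_score(c, N_FRAMES) for name, c in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)[:3]
```

Cluster C shows why the ensemble matters: a single frame yields a large contact set, but its low persistence keeps it below the clusters that recur across frames.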
The empirical contribution of this step is direct. On the seven-target benchmark with the same RNA-tuned detection in every configuration, single-frame fpocket-native ranking (whether using our parameters or the published fpocketR parameters) produces 0 of 7 strict@1 and 2 of 7 near@1. Adding the cross-frame geometric ranker on top of an identical detection step lifts recovery to 3 of 7 strict@1 and 6 of 7 near@1. On 5 of the 7 targets the lift crosses a recovery class boundary (NEITHER → NEAR or NEITHER → STRICT or NEAR → STRICT); see the comparison section below for the per-target breakdown.
fpocket’s native druggability score — the output of a logistic regression trained on protein druggable-vs-non-druggable cavities — does not transfer to RNA; on the seven-target benchmark it consistently scores the actual binding-site cluster near zero while assigning non-binding cavities scores in the 0.1–0.7 range. The fpocket-on-RNA limitation is independently characterised by Veenbaas et al. (2025) and addressed in fpocketR through parameter tuning rather than scoring; our orthogonal answer is to leave the protein-trained scorer aside and rank by RNA-applicable geometric features over a conformational ensemble. fpocket’s druggability score is still computed and reported per cluster as metadata for transparency; druggability assessment itself is left to the customer’s medicinal chemistry workflow.
06 · Pre-pilot MSA tractability screen
For targets with diverse-tail evolutionary representation an MSA-driven structure prediction path is offered as an opt-in mode. The screen criterion is empirical:
min_id < 0.77 OR >= 5% of homologs at 70-80% identity
When the criterion is met, the MSA carries enough divergence to add value to AI structure prediction; when it is not, the MSA is dominated by near-duplicate orthologs that perturb the prediction without improving it. The criterion is calibrated on a benchmark of seven RNA targets; we will refine it as more targets accumulate.
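The screen criterion can be written as a small predicate. This sketch assumes the input is the fractional sequence identity of each homolog to the query, that `min_id` is the minimum of those values, and that the 70-80% band is inclusive at both ends; none of these details are spelled out above, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def passes_msa_screen(
    identities: Sequence[float],
    min_id_cutoff: float = 0.77,
    band: Tuple[float, float] = (0.70, 0.80),
    band_fraction: float = 0.05,
) -> bool:
    """min_id < 0.77 OR >= 5% of homologs at 70-80% identity."""
    if not identities:
        return False  # no homologs found: nothing for an MSA to contribute
    min_id = min(identities)
    in_band = sum(1 for x in identities if band[0] <= x <= band[1])
    return min_id < min_id_cutoff or in_band / len(identities) >= band_fraction


# Near-duplicate orthologs only: the screen keeps the single-sequence default.
near_duplicates = [0.99, 0.98, 0.97, 0.96]
# A divergent tail pulls min_id below the cutoff: MSA mode is offered.
diverse_tail = [0.99, 0.95, 0.88, 0.72]

use_msa_a = passes_msa_screen(near_duplicates)
use_msa_b = passes_msa_screen(diverse_tail)
```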
For groove-binding targets, surface-recognition interfaces, and large complex folds (e.g. group I intron core), v0.2 is out of scope by construction — not because of the screen, but because the cavity-detection step itself requires cleft geometry. The screen flags these classes upfront.
07 · Validation
The v0.2 benchmark covers seven cleft-binder RNAs with deposited co-crystal structures. The set is constructed in two layers. Four v0.1 retro targets, 2GDI (TPP RF00059), 4GXY (B12 RF00174), 2GIS (SAM-I RF00162), and 5C45 (FMN RF00050), were carried over from the v0.1 fpocket-druggability era to give an apples-to-apples comparison. Three new pilot targets, 4LVV (THF RF01831), 2HOJ (thi-box TPP), and 3DIL (group I intron), were introduced to stress the geometric ranker and the MSA tractability screen on unseen sequences and unseen fold families.
As-shipped configuration: single-sequence prediction by default; MSA mode where the pre-pilot screen indicates. The as-shipped arm is the one the screen selects for each target — single-sequence for the majority, MSA for 4GXY, 4LVV, and 3DIL.
Benchmark definitions
For each target we compare each pipeline-produced cluster against the experimental binding-site residue set from the co-crystal structure. Overlap is the fraction of binding-site residues that fall within the cluster:
overlap(cluster, target) = |residues(cluster) ∩ binding_site(target)| / |binding_site(target)|
For each target we then evaluate the top-K ranked clusters and report:
- strict@K: at least one cluster in the top K has overlap ≥ 50%. We report strict@1, meaning the customer-facing rank-1 cluster recovers the experimental site.
- near@K: at least one cluster in the top K has overlap ≥ 30%. We report near@1, meaning the rank-1 cluster picks up a substantial fraction of the site even if it falls short of the strict threshold.
These are stricter than the “centroid-within-X-Å” criterion common in the protein cavity literature: a cluster whose centroid is close to the binding site but whose residue contacts miss the actual binding pocket scores 0% here. Both thresholds are reported because the 30–50% band corresponds to predictions that point a medicinal chemist at the right region of the molecule even though geometry refinement is still needed.
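The overlap metric and the two top-K criteria can be sketched as follows. Residues are represented as plain string labels, and the example binding site and clusters are invented for illustration rather than taken from any benchmark target.

```python
from typing import List, Set

STRICT = 0.50  # strict@K threshold
NEAR = 0.30    # near@K threshold


def overlap(cluster_residues: Set[str], binding_site: Set[str]) -> float:
    """Fraction of experimental binding-site residues that fall within the cluster."""
    if not binding_site:
        raise ValueError("binding site must contain at least one residue")
    return len(cluster_residues & binding_site) / len(binding_site)


def hit_at_k(ranked: List[Set[str]], binding_site: Set[str], k: int, threshold: float) -> bool:
    """True if any of the top-k ranked clusters reaches the overlap threshold."""
    return any(overlap(c, binding_site) >= threshold for c in ranked[:k])


# Hypothetical 10-residue binding site and two ranked clusters.
site = {f"R{i}" for i in range(1, 11)}
rank1 = {"R1", "R2", "R3", "R4", "X1", "X2"}    # covers 4 of 10 site residues
rank2 = {"R5", "R6", "R7", "R8", "R9", "R10"}   # covers 6 of 10 site residues

near_at_1 = hit_at_k([rank1, rank2], site, k=1, threshold=NEAR)
strict_at_1 = hit_at_k([rank1, rank2], site, k=1, threshold=STRICT)
strict_at_2 = hit_at_k([rank1, rank2], site, k=2, threshold=STRICT)
```

Here the rank-1 cluster lands in the 30–50% band (near@1 but not strict@1), and the strict threshold is only met once the rank-2 cluster is included. Residues outside the site (X1, X2) do not affect the score, because the denominator is the binding site rather than the cluster.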
Recovery results
- 3 of 7 strict@1: 2GDI (TPP, single-seq, 71%), 2HOJ (thi-box TPP, single-seq, 53%), 4LVV (THF, MSA, 50%)
- 6 of 7 near@1: the strict set plus 4GXY (B12, MSA, 35%), 2GIS (SAM-I, single-seq, 38%), 5C45 (FMN, single-seq, 40%)
- 1 of 7 neither: 3DIL (group I intron) at 0% overlap on both arms; underlying prediction quality is the bound, and group I intron sits at the boundary of v0.2’s addressable scope (see below)
For comparison, the v0.1 fpocket-druggability ranker on the same four v0.1 retro targets produces 0 of 4 strict@1 — the protein-trained classifier placed non-binding cavities at rank 1 in every case. The v0.2 geometric ranker on those same four targets in their as-shipped arms produces 1 strict (2GDI) and 3 near (4GXY MSA, 2GIS, 5C45). The full delta is the seven-target view: 3/7 strict@1, 6/7 near@1.
The full per-target table is in the homepage benchmark section. Per-cell records: logs/v02_launch_benchmark.json. Reproducer: tmp/v02_launch_benchmark.py.
08 · Comparison against fpocketR
The Weeks lab at UNC Chapel Hill published fpocketR, an RNA-tuned wrapper for fpocket 4.0, in PNAS in April 2025 (Veenbaas et al.). fpocketR formalises the same observation our pipeline had been built around: fpocket’s default parameters over-predict on RNA, mistaking grooves for cavities. Their published answer is a tuned parameter set (m=3.0 Å, M=5.7 Å, i=42 spheres, D=1.65 Å, with explicit polar-sphere and apolar-ratio cutoffs). Ours is m=3.0 Å, M=5.7 Å, i=35 spheres, D=1.65 Å.
To answer the load-bearing positioning question — does fpocketR’s tuned detection already produce the recovery we report, making the cross-frame geometric ranker redundant? — we ran the seven-target benchmark under four configurations against the same as-shipped-arm predicted structure for each target:
Aggregate rank-1 recovery (n = 7)
| Configuration | strict@1 | near@1 |
|---|---|---|
| Vanilla fpocket defaults | 3 / 7 | 3 / 7 |
| Our v0.2 params (single-frame, no ensemble, no geometric ranker) | 0 / 7 | 2 / 7 |
| fpocketR params (single-frame, no ensemble, no geometric ranker) | 0 / 7 | 2 / 7 |
| Our locked v0.2 (ensemble + geometric ranker) | 3 / 7 | 6 / 7 |
Vanilla fpocket appears to win at strict@1 in aggregate. It does so by over-detecting (12–28 pockets per target) and accidentally placing a binding-site-covering cavity at rank-1 from noise. The most extreme case is 2GIS, where vanilla fpocket’s rank-1 has 100% overlap because it is a single huge pocket that engulfs the entire binding site plus a large number of unrelated residues — useless as a top-3 customer surface.
Per-target lift (rank-1 overlap)
| Target | fpocketR single-frame | v0.2 ensemble + ranker | Δ |
|---|---|---|---|
| 2GDI (TPP) | 43% NEAR | 71% STRICT | +28 pp · NEAR→STRICT |
| 4GXY (B12) | 0% NEITHER | 35% NEAR | +35 pp · NEITHER→NEAR |
| 2GIS (SAM-I) | 6% NEITHER | 38% NEAR | +32 pp · NEITHER→NEAR |
| 5C45 (FMN) | 40% NEAR | 40% NEAR | flat (already NEAR) |
| 3DIL (group I intron) | 0% NEITHER | 0% NEITHER | flat (out-of-scope) |
| 2HOJ (thi-box TPP) | 0% NEITHER | 53% STRICT | +53 pp · NEITHER→STRICT |
| 4LVV (THF) | 6% NEITHER | 50% STRICT | +44 pp · NEITHER→STRICT |
On five of the seven targets the ensemble + cross-frame ranker moves recovery across a class boundary. On 5C45 it is flat (NEAR under both, the smallest target in the benchmark) and on 3DIL it is flat NEITHER (the out-of-scope group I intron whose AI structure prediction quality is the bound; no ranking change can fix backbone RMSD of 11.6–20.3 Å).
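As a worked check, the class labels and percentage-point deltas in the per-target table can be recomputed from its two overlap columns using the 50% and 30% thresholds from the benchmark definitions. The overlap values are copied from the table; the function and variable names are illustrative.

```python
def recovery_class(overlap_pct: int) -> str:
    """Map a rank-1 overlap percentage onto the recovery classes used in the table."""
    if overlap_pct >= 50:
        return "STRICT"
    if overlap_pct >= 30:
        return "NEAR"
    return "NEITHER"


# target -> (fpocketR single-frame overlap %, v0.2 ensemble + ranker overlap %)
rank1_overlap = {
    "2GDI": (43, 71),
    "4GXY": (0, 35),
    "2GIS": (6, 38),
    "5C45": (40, 40),
    "3DIL": (0, 0),
    "2HOJ": (0, 53),
    "4LVV": (6, 50),
}

lift = {}
for target, (single, ensemble) in rank1_overlap.items():
    before, after = recovery_class(single), recovery_class(ensemble)
    lift[target] = {
        "delta_pp": ensemble - single,          # percentage-point change
        "transition": (before, after),
        "crossed_boundary": before != after,
    }

crossed = [t for t, row in lift.items() if row["crossed_boundary"]]
flat = [t for t, row in lift.items() if not row["crossed_boundary"]]
```

`crossed` collects the targets whose recovery class changes between the two configurations, and `flat` collects those whose class is unchanged.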
What this comparison establishes
- RNA-tuned cavity detection is field-standard, not proprietary. Our parameters and fpocketR’s parameters produce equivalent detection on this benchmark.
- The cross-frame geometric ranker over a conformational ensemble is the v0.2 contribution that is not subsumed by any published RNA pocket-detection tool. It produces a measurable, target-specific lift on top of either parameter set.
- fpocket’s native druggability score is not a ranking signal on RNA. Removing it from the ranking step (as we do) and removing it via parameter-tuning (as fpocketR does) are both valid responses to the same underlying problem.
- Backbone RMSD is not the customer-facing quality metric. The 2HOJ case (NEITHER at 14 Å RMSD with single-frame ranking; STRICT at 14 Å RMSD with ensemble + ranker) and the 2GIS case (NEITHER at 1.3 Å RMSD with single-frame ranking; NEAR at 1.3 Å RMSD with ensemble + ranker) both confirm this. Binding-site overlap is the metric that maps to customer usefulness.
Comparison source: tests/regression/fpocketr_comparison_baseline.json. Reproducer: tmp/fpocketR_compare_all.py.
09 · Scope and limitations
v0.2 is calibrated for a specific class of RNA target. The boundaries below are constraints we know about and surface upfront; the pre-pilot screen described in section 06 enforces them at the customer-intake step.
In scope
- Cleft-binding RNAs — small-molecule binding sites that form geometric cavities in the folded RNA. Riboswitches are the prototypical case; the v0.2 benchmark targets all sit in this class.
- Sequence length up to ~500 nt. The benchmark spans 54–174 nt; the pipeline is designed to scale to ~500 nt within the same architecture, though we report results only for the benchmark range.
- Sequence-only input. A deposited PDB upload is also accepted and skips the structure-prediction step; the downstream ensemble, cavity detection, and ranking are unchanged.
Out of scope (by construction)
- Groove-binding interfaces. Binding modes that occupy the major or minor groove without forming a cleft cavity are not detectable by the cavity-detection step in v0.2. We flag these classes in the pre-pilot screen.
- Surface-recognition interfaces. Large flat protein-RNA-style binding surfaces are outside the cavity-detection regime. v0.2 does not attempt to surface them.
- Large complex folds with reduced AI prediction quality. Group I introns are the canonical example: 3DIL fails on both arms not because the ranker is wrong but because the predicted structure is too distorted (RMSD 11.6–20.3 Å on the two arms) for cavity geometry to align with the experimental site. We surface AI prediction quality (mean pLDDT, RMSD) so this is visible.
- MD-driven ensembles. Short molecular dynamics (AMBER OL3, 500 ps production) on top of the RhoFold prediction was piloted but is not shipped in v0.2: we observed 2.9–3.3 Å equilibration drift away from the prediction that demoted real binding-site clusters in the ranking on both targets where MSA mode helps. MD is on the v0.3 research roadmap; v0.2 ships with ANM only.
- Druggability assessment. The pipeline ranks pockets; it does not predict druggability. Druggability assessment belongs in the customer’s medicinal chemistry workflow.
What results are and are not
Results are computational predictions. We make geometric claims (a cluster of a given size persists in this position across the ensemble, and it overlaps the experimental binding site at X%) and we make recovery claims against the seven benchmark targets. We do not make claims about ligand binding affinity, hit rate against the cluster, or any in-vitro property; experimental validation belongs in the customer’s workflow before the predictions are used in drug-development decisions.
10 · What’s next
The public roadmap covers v0.2.1 (in development) and v0.3 (planned). Three things shape the queue: customer pilot conversations during the v0.2 ship cycle, evidence from the expanded benchmark as new targets accumulate, and resolution of the open research items called out above (MD drift, groove-binder geometry, larger folds).
Full list with status labels: homepage roadmap section. We do not commit to dates; items move when the work is done.
11 · Reproducibility
The pipeline is deterministic given the inputs:
- Structure prediction is deterministic at single_seq_pred=True.
- ANM with seed 0 produces identical perturbations on a fixed CPU/numpy stack.
- fpocket with fixed parameters is deterministic.
- Cross-frame clustering at ε = 4 Å with the locked Kabsch reference is deterministic.
A re-run on the same input sequence produces bit-identical output bundles, including the binding-site overlap percentages reported on this page.
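One way a reader could check the bit-identical claim on their own runs is to hash every file in two output bundles and compare the digests. The sketch below assumes a bundle is a directory of files; the function name and the directory names in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def bundle_digest(bundle_dir: str) -> str:
    """SHA-256 over the relative path and bytes of every file, in sorted order."""
    root = Path(bundle_dir)
    h = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(path.relative_to(root).as_posix().encode("utf-8"))
        h.update(b"\0")  # separator between file name and file contents
        h.update(path.read_bytes())
    return h.hexdigest()
```

Two runs on the same input would then be compared with `bundle_digest("run_a") == bundle_digest("run_b")`.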
12 · Citations
- Shen, T. et al. (2024). RhoFold+: Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nature Methods.
- Atilgan, A.R. et al. (2001). Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophysical Journal 80, 505–515.
- Bakan, A., Meireles, L.M., Bahar, I. (2011). ProDy: Protein Dynamics Inferred from Theory and Experiment. Bioinformatics 27, 1575–1577.
- Le Guilloux, V., Schmidtke, P., Tuffery, P. (2009). Fpocket: An open source platform for ligand pocket detection. BMC Bioinformatics 10, 168.
- Veenbaas, S.D., Koehn, J.T., Irving, P.S., Lama, N.N., Weeks, K.M. (2025). Ligand-binding pockets in RNA and where to find them. PNAS 122(17), e2422346122. doi:10.1073/pnas.2422346122
// White paper
A full white paper covering the methodology, validation, and head-to-head against alternative approaches is in preparation. Until it is published, this page is the canonical methods document. For early discussion or to request the draft, email info@rnafold.com.