Fraunhofer HHI × Stanford University

Transformer-Based BCR Repertoire Analysis for Your Research

Sequence-level clonal family assignment, cross-database semantic search, and quantitative repertoire profiling — methodology you can publish on, powered by infrastructure your lab doesn't need to build.

Technical Pipeline

From Raw Sequences to Quantitative Immune Profiling

Module-level transparency. Every step is documented, reproducible, and ready for your methods section.

1

Raw Sequencing Input

Bulk BCR-seq or 10x single-cell VDJ. FASTQ → adapter trimming, quality filtering, UMI deduplication.

fastp, pRESTO
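To make the UMI deduplication in step 1 concrete, here is a minimal sketch. It assumes reads have already been parsed into (UMI, sequence) pairs and collapses each UMI group to its most frequent sequence. The function name and the majority rule are illustrative assumptions, not the behaviour of fastp or pRESTO.

```python
from collections import Counter, defaultdict

def dedup_by_umi(reads):
    """Collapse reads sharing a UMI to their most common sequence.

    reads: iterable of (umi, sequence) tuples.
    Returns a dict mapping umi -> (representative_sequence, read_count).
    """
    groups = defaultdict(Counter)
    for umi, seq in reads:
        groups[umi][seq] += 1

    collapsed = {}
    for umi, counts in groups.items():
        # Most frequent sequence wins; ties fall back to first-seen order.
        seq, _ = counts.most_common(1)[0]
        collapsed[umi] = (seq, sum(counts.values()))
    return collapsed
```

A production workflow would typically also tolerate sequencing errors within the UMI itself, which this sketch does not attempt.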
2

V(D)J Annotation

Germline gene assignment, CDR3 extraction, framework/CDR boundary identification.

IgBLAST, IMGT reference
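As an illustration of CDR3 extraction in step 2, the sketch below reads an AIRR-format TSV (the tabular layout IgBLAST can emit) and derives the CDR3 from the junction. It assumes the standard AIRR columns sequence_id, v_call and junction_aa are present. The helper name is hypothetical, and trimming one residue from each end follows the IMGT convention that the junction includes the conserved Cys and Trp/Phe anchors.

```python
import csv
import io

def read_airr_cdr3(handle):
    """Yield (sequence_id, v_gene, cdr3_aa) from an AIRR-format TSV."""
    for row in csv.DictReader(handle, delimiter="\t"):
        junction = row["junction_aa"]
        if len(junction) < 3:
            continue  # no usable junction call for this sequence
        # IMGT junction = conserved Cys + CDR3 + conserved Trp/Phe
        cdr3 = junction[1:-1]
        # Keep the first call if several are listed and drop the allele suffix
        v_gene = row["v_call"].split(",")[0].split("*")[0]
        yield row["sequence_id"], v_gene, cdr3

# Tiny in-memory example standing in for a real annotation file
example = "sequence_id\tv_call\tjunction_aa\nseq1\tIGHV3-23*01\tCARDYW\n"
rows = list(read_airr_cdr3(io.StringIO(example)))
```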
3

Transformer Embeddings

Antibody-language-model embeddings (AbLang/DNABERT-derived) map each BCR sequence into a 512-d latent space, capturing structural and functional similarity beyond edit distance.

Custom fine-tuned AbLang
4

Clonal Family Clustering

Embedding-space clustering assigns clonal families with higher accuracy than Hamming-distance methods. Benchmarked against Change-O/SCOPer on synthetic and real repertoires.

HDBSCAN on embeddings
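For context, the Hamming-distance baseline that step 4 is compared against can be sketched as single-linkage clustering on length-normalized Hamming distance between junction sequences. This is a simplified illustration, not the embedding method and not Change-O's implementation: the 0.15 threshold is an assumed value, and the V/J-gene partitioning that such tools usually apply first is omitted.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def cluster_by_hamming(junctions, threshold=0.15):
    """Single-linkage clustering on length-normalized Hamming distance.

    Returns a list of integer cluster labels, one per input junction.
    """
    n = len(junctions)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            a, b = junctions[i], junctions[j]
            if len(a) != len(b) or not a:
                continue  # Hamming distance is undefined across lengths
            if hamming(a, b) / len(a) <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    # Relabel union-find roots as consecutive integers in first-seen order
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

Because sequences of different lengths can never be linked here, clonally related sequences that differ by an insertion or deletion end up in separate families, which is one limitation that motivates comparing in embedding space instead.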
5

Repertoire Feature Extraction

Clonality metrics (Gini, Shannon entropy, top-clone frequency), isotype distribution, SHM quantification, V-gene usage profiling per sample.

Custom feature pipeline
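The clonality metrics named in step 5 have standard definitions. The following sketch computes them from a list of clone sizes for a single sample. The function name and the use of natural-log entropy are assumptions for illustration, and the custom feature pipeline may normalize differently.

```python
import math

def clonality_metrics(clone_sizes):
    """Compute Gini index, Shannon entropy (nats), and top-clone frequency.

    clone_sizes: list of positive read or cell counts, one per clonal family.
    """
    total = sum(clone_sizes)
    n = len(clone_sizes)
    freqs = [c / total for c in clone_sizes]

    shannon = -sum(p * math.log(p) for p in freqs if p > 0)

    # Gini via the sorted-rank formula:
    #   G = sum_i (2i - n - 1) * x_i / (n * sum(x)), with x sorted ascending
    ordered = sorted(clone_sizes)
    gini = sum((2 * i - n - 1) * x
               for i, x in enumerate(ordered, start=1)) / (n * total)

    return {"gini": gini, "shannon": shannon, "top_clone_freq": max(freqs)}
```

A perfectly even repertoire gives a Gini of 0 and maximal entropy. A repertoire dominated by a few expanded clones pushes Gini toward 1 and entropy toward 0.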
6

Semantic Autoantibody Search

Cross-database embedding similarity search against IEDB, CoV-AbDab, OAS. Identifies functional homologs even at low sequence identity.

FAISS index, cosine similarity
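The similarity search in step 6 reduces to ranking database embeddings by cosine similarity to a query embedding. Below is a brute-force sketch in plain Python, with a hypothetical function name and toy identifiers. At database scale, an inner-product index over L2-normalized vectors (for example with FAISS) gives the same ranking far more efficiently.

```python
import math

def cosine_top_k(query, database, k=3):
    """Return the k database entries most similar to the query embedding.

    database: dict mapping an identifier to an embedding (list of floats).
    Returns a list of (identifier, cosine_similarity), best match first.
    """
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    qn = norm(query)
    scored = []
    for key, vec in database.items():
        dot = sum(q * x for q, x in zip(query, vec))
        scored.append((key, dot / (qn * norm(vec))))

    scored.sort(key=lambda kv: kv[1], reverse=True)
    return scored[:k]
```

The sketch assumes every embedding has non-zero norm and the same dimensionality as the query.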
7

Composite Score & Report

Autoimmune B-Cell Activity Score generation. Per-patient clonal landscape visualization, autoreactive signature flags, stratification tier.

Statistical modeling

Methodology

Embedding-Based Analysis Outperforms Edit-Distance Methods

Repertoire Clonality Heatmap

Illustrative
Legend: clonal families ranked by size; expanded (top 10%)

BCR Embedding Space (UMAP)

Illustrative
Legend: IgG1 expanded, IgG3/IgA, naïve IgM

Transformer embeddings separate isotype-switched, affinity-matured clones from naïve populations without manual gating.

Benchmarking

Embedding-based clonal family assignment achieves higher V-measure and adjusted Rand index than Hamming-distance clustering (Change-O/SCOPer) on both synthetic lineage trees and paired heavy-light chain validation sets. Full benchmarking data available upon request.

V-measure: 0.94 vs 0.81
Adj. Rand Index: 0.91 vs 0.76
Runtime (10k seq): < 2 min

Clono embeddings vs. Hamming-distance (Change-O)
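For readers who want to reproduce this kind of comparison on their own labelled data, the adjusted Rand index can be computed from the contingency table of true and predicted clonal family labels. This is a minimal sketch of the standard formula, not the benchmarking code behind the figures above.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat clusterings of the same items.

    Assumes both label lists have the same length and at least two items.
    """
    n = len(labels_true)
    pairs = Counter(zip(labels_true, labels_pred))

    # Pair counts within contingency cells, true clusters, predicted clusters
    sum_cells = sum(comb(c, 2) for c in pairs.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every item in its own cluster
    return (sum_cells - expected) / (max_index - expected)
```

A score of 1 means the two partitions agree up to relabelling, and values near 0 indicate chance-level agreement.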

Collaboration

We Analyze Your Samples. You Publish the Findings.

Full data ownership remains with your institution. Clono provides the bioinformatic infrastructure, analysis pipeline, and methodology support — you retain publication rights and intellectual property over your cohort data.

Free pilot

Pilot Analysis

Send us 10–20 samples. We run the full pipeline at no cost and deliver a preliminary report to evaluate fit for your research question.

Co-authorship

Collaborative Study

Joint study design, sample processing, bioinformatic analysis, and co-authorship. We contribute methods expertise; you bring the cohort and clinical context.

Fee-for-service

Service Analysis

Full-service BCR repertoire analysis for your samples with standardized reporting. Ideal for multi-site studies requiring consistent methodology across cohorts.

Selected References

Published Evidence Base

Nature, 2019

BCR repertoire signatures are disease-specific across 6 immune-mediated diseases (n=209).

Bashford-Rogers R, et al.

Arthritis Res & Ther, 2024

BCR repertoire dynamics during B-cell reconstitution correlate with clinical outcomes in RA.

Pollastro S, et al.

JCI Insight, 2024

Combined scRNA-seq and BCR-seq confirm that a naïve reconstitution signature corresponds to immune reset after CAR-T therapy in SLE.

Wilhelm A, et al.

JCI Insight, 2020

Pre-treatment autoreactive BCR clones persist through rituximab and dominate at relapse in MuSK-MG.

Stathopoulos P & O'Connor KC

Fraunhofer Heinrich Hertz Institute

Fraunhofer is Europe's largest applied research organization, with 76 institutes and a €3.4B annual budget. Its Heinrich Hertz Institute (HHI) specializes in deep learning and transformer architectures.

Stanford University

Co-founded by postdocs in immunology and digital health from Stanford. Deep expertise in B-cell biology, repertoire analysis, and translational biomarker development.

Collaborate

Request a Pilot Analysis

We'll run a complimentary analysis on a small batch of your samples so you can evaluate the methodology firsthand.