Bittensor Subnet · Adversarial Robustness

Perturb

Decentralized Adversarial Robustness Network — built on Bittensor.

Version: 1.0
Date: April 2026
Domain: perturbai.io
00/

Abstract

We propose a decentralized adversarial robustness network built on Bittensor. Miners compete to find adversarial examples — imperceptible input perturbations that cause state-of-the-art image classifiers to fail. Validators construct and verify challenges using a real AI model and an LLM-backed semantic checker, then score responses based on perturbation minimality and response speed. On-chain weights are assigned periodically to miners who accumulate a sufficient history of verified results.

The result is the world's first financially incentivized, continuously improving adversarial testing infrastructure. The network produces two commercially valuable outputs — an adversarial training dataset and on-chain model robustness certificates — addressing a $1.43B market growing at 26.1% annually.

01/

Executive Summary

$11.6B: AI Red Teaming Market by 2033
26.1%: Annual Market Growth Rate (CAGR)
€35M: Max EU AI Act Non-Compliance Fine
6B+: Largest Supported Vision Model (parameters)

Every AI model deployed in production carries a hidden vulnerability: adversarial examples. An imperceptible change to a single input can cause a medical imaging model to misclassify a tumor, an autonomous vehicle to ignore a stop sign, or a fraud detection system to approve a fraudulent transaction. The tooling to systematically discover these vulnerabilities before deployment is fragmented, expensive, and, critically, static: it does not improve over time.

Perturb changes this. Built on Bittensor, Perturb is a decentralized adversarial robustness network where miners compete to find adversarial examples — inputs that fool AI models while remaining imperceptible to humans. Every challenge is verified by an LLM-backed semantic check. Every attack is scored with mathematical precision. The network gets stronger every single day.

The result is the world's first financially incentivized, continuously improving adversarial testing infrastructure — something no centralized service can replicate.

02/

The Problem We Solve

AI Models Are Brittle

Modern AI models achieve remarkable accuracy on clean test sets, yet remain catastrophically vulnerable to adversarial perturbations: mathematically crafted modifications to input data that are imperceptible to humans but cause models to fail completely. A ResNet-50 that reaches roughly 94% top-5 accuracy on ImageNet can be fooled by changing fewer than 0.3% of an image's pixels. An EfficientNet-B5 scoring 83.6% top-1 accuracy can misclassify a tabby cat as a fire truck with a perturbation invisible to any human observer.

This is not a theoretical concern. Adversarial attacks have been demonstrated against:

Medical imaging classifiers used for cancer detection and diagnostic triage

Facial recognition systems controlling physical access to secure facilities

Autonomous vehicle perception systems for object and sign recognition

Content moderation models protecting platforms from harmful material

Financial fraud detection systems processing millions of transactions daily

The Market Opportunity

Three powerful forces are converging to create an urgent, large, and underserved market for adversarial robustness testing:

Regulatory Mandates: The EU AI Act classifies AI systems in healthcare, autonomous vehicles, critical infrastructure, and hiring as high-risk, requiring mandatory conformity assessments including robustness testing before deployment. Penalties under the Act reach up to €35 million or 7% of global annual turnover for the most serious violations.

Enterprise Procurement Requirements: Large enterprises increasingly require AI vendors to demonstrate robustness certifications before purchase. A verified robustness certificate is becoming table stakes for selling AI products into regulated industries.

AI Proliferation: As AI moves from research to production across every industry, the attack surface grows exponentially. Every new model deployment is a new vulnerability. Organizations deploying AI without systematic adversarial testing are accepting unknown, unquantified risk.

Why Existing Solutions Fall Short

| Solution | Type | Critical Limitation |
|---|---|---|
| IBM ART / Foolbox / CleverHans | Open-source libraries | Requires deep ML expertise. No managed service. Never improves. |
| Manual red teaming firms | Human service | $50K–$200K per engagement. Weeks to complete. Cannot scale. |
| Internal security teams | In-house | Most organizations lack adversarial ML expertise. |
| Perturb | Decentralized network | Self-improving. Competitive. Scalable. LLM-verified. On-chain certificates. |

No existing solution provides a competitive, financially incentivized, continuously improving approach to adversarial testing. Perturb is the first.

03/

Introducing Perturb

What Perturb Does

Perturb is a Bittensor subnet that incentivizes a global network of miners to find adversarial examples — images that fool AI classifiers while remaining visually indistinguishable from the original. The network produces two commercially valuable outputs:

Adversarial Training Dataset

A continuously growing, LLM-verified dataset of adversarial examples. Sold via subscription to AI teams doing adversarial training — the most effective known defense. Gets better every day as miners improve.

Short-Term Revenue

Model Robustness Certificates

On-demand adversarial evaluation reports with on-chain cryptographic proof of testing. Essential for EU AI Act compliance and enterprise AI procurement. Sold as a tiered subscription service.

Long-Term Revenue

Why Bittensor

Competitive Improvement: TAO emissions reward the best-performing miners. Applied to adversarial attacks, this creates the first financially incentivized adversarial research network. Miners earn real money for finding better attacks — driving continuous improvement no salaried team can match.

Verification Asymmetry: Finding an adversarial example is computationally hard. Verifying one is cheap: run the model, check the output with an LLM, measure the perturbation norm. This asymmetry makes Perturb's incentive mechanism clean, objective, and manipulation-resistant.

Decentralized Trust: On-chain records of adversarial evaluations create cryptographically verifiable proof of robustness testing — more credible than any centralized company's self-reported certificate, directly relevant to regulatory bodies seeking auditable compliance records.

04/

Technical Architecture

System Overview

Perturb operates on a challenge-response loop between validators and miners. Validators construct verified challenges, distribute them to a randomly selected pool of miners, and score responses using LLM semantic verification, perturbation minimality, and response speed.

[Figure: challenge-response loop between VALIDATOR and MINER]
Validator: Pull Image from API (random_mode · label ∈ LABEL_CONSTANTS) → Verify Challenge (EfficientNet-B5 → LLM check · loop until label matches) → Broadcast Challenge (select K miners · identical payload)
Miner: receives challenge → Generate Attack (any algorithm · PGD / FGSM / … · method is proprietary to miner) → returns perturbed_image
Validator: awaits responses → Verify Response (norm gates · LLM semantic check) → Score · Set Weights (perturbation + speed · on-chain γ blend)

Validator: Challenge Pipeline

The validator constructs each challenge through a verified pipeline ensuring every challenge sent to miners is semantically clean and unambiguous:

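validator / build_challenge.py
import random

# LABEL_CONSTANTS, IMAGENET_CLASSES, image_hosting_api, efficientnet_b5,
# preprocess, llm_verify_label_match and Challenge are assumed to be
# defined elsewhere in the validator package.

def build_challenge() -> Challenge:
    # Pick a target label, then keep fetching images until the model's
    # prediction semantically matches that label.
    label = random.choice(LABEL_CONSTANTS)

    while True:
        image      = image_hosting_api.fetch(mode="random", label=label)
        raw_output = efficientnet_b5(preprocess(image))
        predicted  = raw_output.argmax().item()
        label_str  = IMAGENET_CLASSES[predicted]
        # LLM check: does the predicted class name match the intended label?
        if llm_verify_label_match(label, label_str):
            break

    return Challenge(
        model          = "efficientnet_b5",
        image          = image,
        true_label     = label,
        true_label_str = label_str,
    )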

Validator: LLM Verification

Perturb uses a lightweight LLM for semantic verification at two points: during challenge construction and during response scoring. This handles closely related classes, such as tabby cat vs Egyptian cat, that a plain integer class-ID comparison would misjudge.

| Parameter | Value | Description |
|---|---|---|
| PERTURB_LLM_ENDPOINT_MODEL | Qwen2.5-1.5B-Instruct | Model name sent in validator challenge payload |
| Ollama default model | qwen2.5:1.5b-instruct | Actual model name for local Ollama deployment |
| Verification task | Semantic label matching | Does predicted output semantically match true label? |
| Used at | Challenge build + Response score | Both stages share the same LLM endpoint |
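
The semantic check described above could be wired to a local Ollama server roughly as follows. This is a minimal sketch, not the validator's actual implementation: the prompt wording, the yes/no parsing rule, and the helper names `ollama_generate` and `build_prompt` are illustrative assumptions. Only the `/api/generate` endpoint and the model name come from Ollama and the table above.

```python
import json
import urllib.request
from typing import Callable

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
OLLAMA_MODEL = "qwen2.5:1.5b-instruct"              # model name from the table above


def ollama_generate(prompt: str) -> str:
    """Send one non-streaming prompt to a local Ollama server and return its text."""
    body = json.dumps({"model": OLLAMA_MODEL, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["response"]


def build_prompt(true_label: str, predicted_label: str) -> str:
    # Hypothetical prompt wording; the real validator prompt is not published here.
    return (
        "Answer with a single word, yes or no. "
        f"Does the image class '{predicted_label}' describe the same kind of "
        f"object as the label '{true_label}'?"
    )


def llm_verify_label_match(
    true_label: str,
    predicted_label: str,
    generate: Callable[[str], str] = ollama_generate,
) -> bool:
    """Return True when the LLM judges the two labels to be semantically the same."""
    answer = generate(build_prompt(true_label, predicted_label))
    # Assumed parsing rule: anything that does not start with "yes" counts as "no".
    return answer.strip().lower().startswith("yes")
```

Passing `generate` as a parameter keeps the decision logic testable without a running LLM, and lets the same function serve both challenge construction and response scoring.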

Challenge Format

Complete payload sent from validator to all K selected miners:

challenge payload (JSON)
{
  "task_id":         "string",
  "model":           "efficientnet_b5",
  "llm_model_hint":  "Qwen2.5-1.5B-Instruct",
  "image":           "base64_encoded_RGB_image",
  "true_label":      "string  // e.g. 'tabby_cat'",
  "constraints": {
    "norm":          "Linf",
    "epsilon":       0.06,
    "min_delta":     0.002,
    "max_delta":     0.12,
    "pixel_range":   [0.0, 1.0]
  },
  "scoring_weights": { "perturbation": 0.7, "speed": 0.3 },
  "timeout_ms":      60000
}
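
On the miner side, the payload above might be parsed into typed objects before any attack runs. The sketch below is an assumption about how a miner could do this. The class names `Constraints` and `ChallengePayload` and the consistency check are illustrative, while the field names are taken from the payload.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraints:
    norm: str
    epsilon: float
    min_delta: float
    max_delta: float
    pixel_range: tuple

    @property
    def max_allowed_norm(self) -> float:
        # Upper gate used by the validator: min(epsilon, max_delta)
        return min(self.epsilon, self.max_delta)


@dataclass(frozen=True)
class ChallengePayload:
    task_id: str
    model: str
    image_b64: str
    true_label: str
    constraints: Constraints
    timeout_ms: int


def parse_challenge(raw: str) -> ChallengePayload:
    """Parse the JSON challenge and reject obviously inconsistent bounds."""
    data = json.loads(raw)
    c = data["constraints"]
    constraints = Constraints(
        norm=c["norm"],
        epsilon=float(c["epsilon"]),
        min_delta=float(c["min_delta"]),
        max_delta=float(c["max_delta"]),
        pixel_range=tuple(c["pixel_range"]),
    )
    if not 0 < constraints.min_delta <= constraints.max_allowed_norm:
        raise ValueError("inconsistent perturbation bounds")
    return ChallengePayload(
        task_id=data["task_id"],
        model=data["model"],
        image_b64=data["image"],
        true_label=data["true_label"],
        constraints=constraints,
        timeout_ms=int(data["timeout_ms"]),
    )


# Example payload using the values shown above (the image field is a placeholder).
EXAMPLE = json.dumps({
    "task_id": "demo-1",
    "model": "efficientnet_b5",
    "llm_model_hint": "Qwen2.5-1.5B-Instruct",
    "image": "",
    "true_label": "tabby_cat",
    "constraints": {"norm": "Linf", "epsilon": 0.06, "min_delta": 0.002,
                    "max_delta": 0.12, "pixel_range": [0.0, 1.0]},
    "scoring_weights": {"perturbation": 0.7, "speed": 0.3},
    "timeout_ms": 60000,
})
challenge = parse_challenge(EXAMPLE)
```

With the launch values, `challenge.constraints.max_allowed_norm` evaluates to the epsilon of 0.06, since that is smaller than `max_delta`.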

Miner: Response

Miners return only the perturbed image. The attack method and parameters are entirely proprietary — the miner's competitive edge. Perturb provides a working default miner implementation so new participants can join immediately. Sophisticated miners replace the default with optimized strategies to compete for higher emission shares.

miner response (JSON)
{ "task_id": "string", "perturbed_image": "base64_encoded_RGB_image" }

The network evaluates results, not methods.
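
To make the idea of an L-infinity bounded attack concrete, here is a single FGSM step against a toy logistic classifier. This is an illustrative sketch only: the toy model, its weights, and the function names are assumptions standing in for EfficientNet-B5, and it is not the default miner shipped with Perturb.

```python
import math


def predict_prob(weights, bias, pixels):
    """Probability of the positive class under a toy logistic model."""
    logit = sum(w * x for w, x in zip(weights, pixels)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def fgsm_step(pixels, weights, epsilon, pixel_range=(0.0, 1.0)):
    """One FGSM step for an input whose true class is the positive class.

    For the loss -log(p), the gradient with respect to each pixel is
    (p - 1) * w, so its sign is -sign(w). FGSM moves every pixel by
    epsilon in the direction of that sign, then clips to the valid range.
    """
    lo, hi = pixel_range
    adversarial = []
    for x, w in zip(pixels, weights):
        grad_sign = -((w > 0) - (w < 0))
        adversarial.append(min(hi, max(lo, x + epsilon * grad_sign)))
    return adversarial


# Toy example: four "pixels" and hand-picked weights (all values hypothetical).
WEIGHTS = [2.0, -1.0, 0.5, -3.0]
BIAS = 0.9
clean = [0.5, 0.5, 0.5, 0.5]
EPSILON = 0.06

adv = fgsm_step(clean, WEIGHTS, EPSILON)
clean_prob = predict_prob(WEIGHTS, BIAS, clean)
adv_prob = predict_prob(WEIGHTS, BIAS, adv)
linf = max(abs(a - c) for a, c in zip(adv, clean))
```

In this toy setting the clean input is classified as positive and the perturbed one as negative, while no pixel moves by more than epsilon. Against a deep network the gradient would come from backpropagation, and competitive miners would likely use iterative methods such as PGD.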

Scoring: Per-Challenge

Each miner response passes through strict verification gates. Any hard-fail returns 0.0 immediately:

| Condition | Threshold | Result |
|---|---|---|
| Invalid image, wrong shape, or out-of-range pixels | Any violation | score = 0.0 |
| Perturbation norm below minimum | norm < 0.002 | score = 0.0 |
| Perturbation norm above maximum | norm > min(ε, 0.12) | score = 0.0 |
| Perturbed prediction still matches true_label semantically | LLM check | score = 0.0 |
| All checks pass | (none) | proceed to formula |
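
The hard-fail gates above can be sketched as a single predicate. This is a simplified illustration that treats images as flat lists of floats and takes the LLM verdict as a boolean argument. The function names and that simplification are assumptions, not the validator's code.

```python
def linf_norm(original, perturbed):
    """Largest absolute per-pixel difference between two equally sized images."""
    return max(abs(p - o) for o, p in zip(original, perturbed))


def passes_hard_gates(
    original,
    perturbed,
    still_matches_label,
    epsilon=0.06,
    min_delta=0.002,
    max_delta=0.12,
    pixel_range=(0.0, 1.0),
):
    """Return True only if none of the hard-fail conditions applies."""
    lo, hi = pixel_range
    # Gate 1: shape and pixel range (NaN values fail the range check too)
    if not perturbed or len(perturbed) != len(original):
        return False
    if any(not (lo <= p <= hi) for p in perturbed):
        return False
    # Gates 2 and 3: perturbation norm must lie within [min_delta, min(epsilon, max_delta)]
    norm = linf_norm(original, perturbed)
    if norm < min_delta or norm > min(epsilon, max_delta):
        return False
    # Gate 4: the attack failed if the prediction still matches the true label
    if still_matches_label:
        return False
    return True
```

For example, a perturbation of 0.03 on one pixel passes when the LLM reports a label change, while a perturbation of 0.0005 is rejected as below the minimum and one of 0.1 is rejected as above the maximum.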

Scoring formula for responses that pass all verification gates:

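Perturbation Score (rewards minimally perturbed images):
$$\text{perturbation\_score} = 1 - \min\left(\frac{\lVert \delta \rVert}{\varepsilon},\ 1\right)$$
Speed Score (rewards faster responses):
$$\text{speed\_score} = 1 - \min\left(\frac{t_{\text{response}}}{t_{\text{timeout}}},\ 1\right)$$
Per-Challenge Score:
$$\text{score} = 0.7 \times \text{perturbation\_score} + 0.3 \times \text{speed\_score}$$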
scoring implementation
perturbation_ratio = norm / epsilon
perturbation_score = 1.0 - min(perturbation_ratio, 1.0)
speed_score        = 1.0 - min(response_time_ms / timeout_ms, 1.0)
score = 0.7 * perturbation_score + 0.3 * speed_score
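
As a worked example, take hypothetical values: a response with perturbation norm 0.015 against ε = 0.06, returned in 12,000 ms against a 60,000 ms timeout. The inputs are chosen for illustration only.

```latex
\begin{align*}
\text{perturbation\_score} &= 1 - \min\left(\tfrac{0.015}{0.06},\ 1\right) = 1 - 0.25 = 0.75 \\
\text{speed\_score} &= 1 - \min\left(\tfrac{12000}{60000},\ 1\right) = 1 - 0.2 = 0.8 \\
\text{score} &= 0.7 \times 0.75 + 0.3 \times 0.8 = 0.525 + 0.24 = 0.765
\end{align*}
```

Because the minimum-norm gate rejects any perturbation below 0.002, the perturbation score cannot exceed 1 − 0.002/0.06 ≈ 0.967 under these constraints, so a per-challenge score of exactly 1.0 is unreachable.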

Scoring: On-Chain Weight Setting

On-chain weights blend each miner's historical average score with a rank-based emission allocation. Only miners with processed_count > 100 are eligible. The emission schedule is differentiated by rank — top miners receive a disproportionately larger share, with inverse-rank decay for all lower positions.

Fixed emission, Top 3:
$$\text{emission}(1) = 50\%, \quad \text{emission}(2) = 30\%, \quad \text{emission}(3) = 10\%$$
Decayed emission, Ranks 4 to 10 (5% total pool):
$$\text{emission}(k) = 0.05 \times \frac{1/k}{\sum_{j=4}^{10} 1/j}$$
Decayed emission, Ranks 11 and above (5% total pool):
$$\text{emission}(k) = 0.05 \times \frac{1/k}{\sum_{j=11}^{N} 1/j}$$
Weight blend (γ = 0.7):
$$w = \gamma \cdot \text{normalize}(\text{avg\_score}) + (1 - \gamma) \cdot \text{normalize}(\text{emission})$$
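_set_weights
import numpy as np

GAMMA = 0.7

def normalize(values):
    # Assumed definition: rescale so the values sum to 1
    values = np.asarray(values, dtype=float)
    total = values.sum()
    return values / total if total > 0 else values

def _set_weights(miners, processed_count, last_100_scores):
    eligible = [uid for uid in miners if processed_count[uid] > 100]
    N = len(eligible)
    if N == 0:
        return {}

    avg_raw = {uid: float(np.mean(last_100_scores[uid])) for uid in eligible}
    # Assumed ranking: rank 1 is the miner with the highest average score
    ranked = sorted(eligible, key=avg_raw.get, reverse=True)

    # Top 3: fixed emission percentages
    emission = {1: 0.50, 2: 0.30, 3: 0.10}

    # Ranks 4-10: inverse-rank decay within 5% pool
    for k in range(4, 11):
        emission[k] = 0.05 * (1/k) / sum(1/j for j in range(4, 11))

    # Ranks 11+: inverse-rank decay within remaining 5% pool
    for k in range(11, N + 1):
        emission[k] = 0.05 * (1/k) / sum(1/j for j in range(11, N + 1))

    # Align both vectors to rank order before blending
    avg  = normalize([avg_raw[uid] for uid in ranked])
    emis = normalize([emission[k] for k in range(1, N + 1)])
    raw     = GAMMA * avg + (1 - GAMMA) * emis
    weights = raw / raw.sum()
    return dict(zip(ranked, weights))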

| Rank | Emission Share | Formula |
|---|---|---|
| 1st | 50% | Fixed: winner takes 50% of miner emission |
| 2nd | 30% | Fixed |
| 3rd | 10% | Fixed |
| 4th – 10th | 5% shared | emission(k) = 0.05 × (1/k) / Σ_{j=4..10}(1/j); rank 4 earns more than rank 10 |
| 11th+ | 5% shared | emission(k) = 0.05 × (1/k) / Σ_{j=11..N}(1/j); decay continues with network growth |
| Ineligible | 0% | processed_count ≤ 100 |
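
To see what the rank schedule produces for a concrete network size, the shares can be computed directly. The sketch below assumes 20 eligible miners, a number chosen only for illustration, and the function name `emission_schedule` is hypothetical.

```python
def emission_schedule(n_eligible: int) -> dict[int, float]:
    """Map rank (1 = best) to its emission share, following the rank table."""
    fixed = {1: 0.50, 2: 0.30, 3: 0.10}
    shares = {k: v for k, v in fixed.items() if k <= n_eligible}

    # Ranks 4-10 split a 5% pool in proportion to 1/k
    mid_total = sum(1 / j for j in range(4, 11))
    for k in range(4, min(10, n_eligible) + 1):
        shares[k] = 0.05 * (1 / k) / mid_total

    # Ranks 11..N split another 5% pool in proportion to 1/k
    tail_total = sum(1 / j for j in range(11, n_eligible + 1))
    for k in range(11, n_eligible + 1):
        shares[k] = 0.05 * (1 / k) / tail_total

    return shares


shares = emission_schedule(20)
total = sum(shares.values())
```

With 20 eligible miners the shares sum to 100%, and rank 4 receives roughly 1.14% against roughly 0.46% for rank 10. One property worth checking when tuning the schedule: because the two 5% pools are normalized separately, rank 11 receives a slightly larger share than rank 10 at this network size.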

Phase 1 Model: EfficientNet-B5

Perturb launches with EfficientNet-B5 as the sole target model — a deliberate choice prioritising network stability, miner onboarding, and validation credibility over premature complexity.

| Attribute | Value |
|---|---|
| Model | EfficientNet-B5 |
| Parameters | 30.4M |
| ImageNet Top-1 | 83.6% |
| timm name | efficientnet_b5 |
| Input size | 224 × 224 × 3 (RGB) |
| Normalization mean | [0.485, 0.456, 0.406] |
| Normalization std | [0.229, 0.224, 0.225] |
| Output | [batch, 1000] logits |
| GPU requirement | RTX 3080 minimum for PGD-40 attacks |

EfficientNet-B5 sits at the ideal intersection of attack difficulty, hardware accessibility, and research credibility. It is challenging enough that basic FGSM attacks perform poorly — requiring miners to implement stronger methods — but not so large that participation requires data center hardware.
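
The normalization constants listed for the Phase 1 model matter for miners because the challenge constraints use a [0, 1] pixel range. The sketch below assumes the perturbation budget is measured in that pixel space, before normalization, and shows how the same budget looks in the model's input space. The function names are illustrative.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel mean from the model attributes
STD = (0.229, 0.224, 0.225)   # per-channel std from the model attributes


def normalize_pixel(rgb):
    """Map one RGB pixel in [0, 1] to the model's normalized input space."""
    return tuple((c - m) / s for c, m, s in zip(rgb, MEAN, STD))


def epsilon_in_model_space(epsilon):
    """Per-channel size of a pixel-space L-inf budget after normalization."""
    return tuple(epsilon / s for s in STD)


mean_pixel = normalize_pixel(MEAN)          # a pixel equal to the mean maps to zero
scaled_eps = epsilon_in_model_space(0.06)   # launch epsilon from the challenge payload
```

Under this assumption, a budget of 0.06 in pixel space corresponds to roughly 0.26 to 0.27 per channel after normalization, so an attack that clips in normalized space needs to use the scaled bound rather than 0.06 itself.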

Expansion Roadmap

| Phase | Models Added | New Capabilities |
|---|---|---|
| Phase 1: Launch | EfficientNet-B5 | Image classification, Linf norm, LLM verification, full scoring pipeline |
| Phase 2: Month 2–3 | + ConvNeXt-Small, ViT-Small, Swin-Tiny | Architecture diversity: CNN vs Transformer vs Hybrid |
| Phase 3: Month 4–6 | + EfficientNetV2-M, NFNet-F0, ResNeXt-101 | Mid-range GPU tier, stronger attack difficulty |
| Phase 4: Month 6+ | + LLM text classification models | NLP attacks: word substitution, prompt injection |
| Phase 5: Year 2 | + Vision models >1B params | Extreme tier: CLIP ViT-G, EVA-Giant, InternViT-6B |

05/

Revenue Model

Adversarial Training Dataset — Short-Term Revenue

Adversarial training — retraining models on adversarial examples — is the most effective known defense against adversarial attacks. Perturb generates this data continuously as a byproduct of its core operation, creating a dataset that improves every single day. Each entry includes: original image, adversarial image, model name, true label, constraint parameters, perturbation norm, LLM verification status, attack score, and timestamp.
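
One possible shape for a dataset entry, using the fields listed above, is sketched below. The class name, the field names, and the JSON Lines serialization are assumptions for illustration. The published dataset schema may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AdversarialRecord:
    original_image_b64: str
    adversarial_image_b64: str
    model: str
    true_label: str
    constraints: dict
    perturbation_norm: float
    llm_verified: bool
    attack_score: float
    timestamp: str  # ISO 8601 string, assumed to be UTC

    def to_jsonl(self) -> str:
        """Serialize the record as one JSON line."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_jsonl(cls, line: str) -> "AdversarialRecord":
        """Rebuild a record from one JSON line."""
        return cls(**json.loads(line))


# Example entry with placeholder image fields and hypothetical values.
record = AdversarialRecord(
    original_image_b64="",
    adversarial_image_b64="",
    model="efficientnet_b5",
    true_label="tabby_cat",
    constraints={"norm": "Linf", "epsilon": 0.06, "pixel_range": [0.0, 1.0]},
    perturbation_norm=0.015,
    llm_verified=True,
    attack_score=0.765,
    timestamp="2026-04-01T00:00:00Z",
)
line = record.to_jsonl()
```

A line-oriented format like this suits a dataset that grows continuously, since new entries can be appended without rewriting earlier ones.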

| Tier | Volume | Frequency | Target Customer |
|---|---|---|---|
| Research | 100K examples/month | Weekly | Academic labs, AI safety organizations |
| Professional | 1M examples/month | Daily | AI startups, ML engineering teams |
| Enterprise | Unlimited | Real-time | Large enterprises, frontier AI labs |

Model Robustness Testing Service — Long-Term Revenue

Organizations submit a model for evaluation and receive a comprehensive robustness report generated by directing the full miner network at the target model. Each report includes:

Overall robustness score (0.0–1.0) benchmarked against industry standards

Attack success rate across epsilon budgets and image categories

Worst-case adversarial examples, visualized and downloadable

LLM-verified semantic failure analysis — not just pixel statistics

On-chain cryptographic certificate of evaluation — immutable and auditable

Comparison against published AutoAttack benchmarks for the same architecture

| Tier | Models | Evaluation Depth | Frequency |
|---|---|---|---|
| Starter | 1 model | Standard suite | Monthly |
| Growth | 5 models | Extended suite | Weekly |
| Enterprise | Unlimited | Full suite + custom scope | Continuous |

06/

Go-To-Market Strategy

Target Customers

AI Startups Selling to Enterprise

Enterprise procurement teams now require adversarial robustness certifications before purchase. A Perturb certificate can unblock deals worth orders of magnitude more than the subscription cost.

Regulated Industry Deployments

EU AI Act compliance requires conformity assessments for high-risk AI systems. Perturb provides on-chain proof of robustness testing — immutable, auditable, and defensible to regulators.

AI Research Labs

Standardized, reproducible, LLM-verified robustness benchmarks citable in academic publications. The public leaderboard becomes a recognized reference benchmark in the adversarial ML research community.

AI Safety Organizations

Organizations working on AI safety need large, diverse, high-quality adversarial example datasets for research into robustness and defenses. The dataset subscription provides this at a fraction of the cost of generating equivalent data in-house.

Phased Launch Plan

Months 1–3
Phase 1 — Build Credibility
Launch the public robustness leaderboard at perturbai.io. Attack every major open-source model on HuggingFace and publish verified scores. Become the definitive reference for adversarial robustness benchmarks.
Target: 30+ models · 3+ academic citations · 1,000+ visitors/month
Months 3–6
Phase 2 — Monetize Research Community
Launch dataset subscription. Use the public leaderboard as the primary conversion funnel. Partner with AI safety organizations as anchor customers.
Target: 50+ Research tier subscribers · $15,000 MRR
Months 6–12
Phase 3 — Enterprise Compliance
Launch model robustness testing subscription targeting EU AI Act compliance. Position as the only blockchain-verifiable robustness certificate. Partner with AI governance consultancies.
Target: 10+ enterprise subscribers · $80,000 MRR
Year 2
Phase 4 — Scale and Expand
Expand to LLM and multimodal attack coverage. Open bug bounty marketplace where companies post TAO-denominated bounties for finding vulnerabilities in their models.
Target: $500,000+ ARR

Competitive Moat

Self-Improving Attack Quality: Every day miners compete, the network gets better. The dataset becomes more valuable. The certificates become more credible. No centralized service has this compounding property.

LLM-Verified Semantic Precision: Unlike systems comparing integer class IDs, Perturb uses LLM semantic verification ensuring attacks are genuinely meaningful — producing higher-quality data and more credible certificates.

On-Chain Immutability: Robustness certificates on Bittensor cannot be altered, backdated, or selectively disclosed — categorically different from any vendor's self-reported compliance documentation.

Architecture Diversity: As Perturb adds target models, miners who specialize build irreplaceable expertise. A miner optimizing attacks for six months outperforms any general-purpose tool.

07/

Market Analysis

| Metric | Value | Notes |
|---|---|---|
| AI Red Teaming Market (2024) | $1.43 billion | Industry research, 2024 |
| AI Red Teaming Market (2033) | $11.61 billion | Projected at 26.1% CAGR |
| CAGR (2025–2033) | 26.1% | Driven by regulatory mandates |
| EU AI Act max fine | €35M or 7% of turnover | Non-compliance penalty |
| Fastest growing segment | Adversarial attack simulation | Perturb's exact category |
| Key verticals | Healthcare, BFSI, Government, Automotive | Primary enterprise targets |
| Bittensor active subnets | 128 (expanding to 256) | As of April 2026 |
| Competing subnets in this space | 0 | No existing subnet covers adversarial ML |

Perturb enters a $1.43B market growing at 26.1% annually, with zero competing Bittensor subnets in this category and increasing regulatory tailwinds globally.
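
As a sanity check on the projection, the 2033 figure can be approximately reproduced from the 2024 base, assuming simple annual compounding over the nine years from 2024 to 2033:

```latex
\[
1.43 \times (1 + 0.261)^{9} \approx 1.43 \times 8.06 \approx 11.5 \ \text{(billion USD)}
\]
```

The result of about $11.5B is consistent with the quoted $11.61B projection, with the small gap attributable to rounding in the stated growth rate.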

08/

Conclusion

AI models are increasingly embedded in decisions that affect human safety, financial stability, and personal rights. Yet the vast majority of these models are deployed without systematic adversarial robustness testing — not because organizations don't care, but because the tooling to do so at scale simply doesn't exist.

Perturb changes this. By applying Bittensor's competitive incentive mechanism to adversarial example generation — and adding LLM-backed semantic verification to ensure quality — Perturb creates a network that gets measurably better every day. Miners are financially motivated to become the world's best adversarial attack researchers. Validators verify results with mathematical precision. The network produces two commercially valuable outputs that address a real, growing, regulatory-driven market.

The validation mechanism is airtight. The market is large and accelerating. The Bittensor architecture provides an unfair advantage no centralized competitor can replicate. And with LLM-verified semantic scoring as a differentiator, Perturb produces higher-quality adversarial data than any existing tool.

For technical documentation, validator setup, and miner onboarding, visit perturbai.io or the official GitHub repository.