Bittensor Subnet · Adversarial Robustness

Perturb

Decentralized Adversarial Robustness Network — built on Bittensor.

Version: 1.0

Date: April 2026

Domain: perturbai.io

00/

Abstract

We propose a decentralized adversarial robustness network built on Bittensor. Miners compete to find adversarial examples — imperceptible input perturbations that cause state-of-the-art image classifiers to fail. Validators construct and verify challenges using a real AI model and an LLM-backed semantic checker, then score responses based on perturbation minimality and response speed. On-chain weights are assigned periodically to miners who accumulate a sufficient history of verified results.

The result is the world's first financially incentivized, continuously improving adversarial testing infrastructure. The network produces two commercially valuable outputs — an adversarial training dataset and on-chain model robustness certificates — addressing a $1.43B market growing at 26.1% annually.

01/

Executive Summary

$11.6B: AI Red Teaming Market by 2033

26.1%: Annual Market Growth Rate (CAGR)

€35M: Max EU AI Act Non-Compliance Fine

6B+: Largest Planned Vision Model (Phase 5 Roadmap)

Every AI model deployed in production carries a hidden vulnerability: adversarial examples. An imperceptible change to a single image can cause a medical imaging model to misclassify a tumor, an autonomous vehicle to ignore a stop sign, or a fraud detection system to approve a fraudulent transaction. The tooling to systematically discover these vulnerabilities before deployment is fragmented, expensive, and — critically — static. It does not improve over time.

Perturb changes this. Built on Bittensor, Perturb is a decentralized adversarial robustness network where miners compete to find adversarial examples — inputs that fool AI models while remaining imperceptible to humans. Every challenge is verified by an LLM-backed semantic check. Every attack is scored with mathematical precision. The network gets stronger every single day.

The result is the world's first financially incentivized, continuously improving adversarial testing infrastructure — something no centralized service can replicate.

02/

The Problem We Solve

AI Models Are Brittle

Modern AI models achieve remarkable accuracy on clean test sets, yet remain highly vulnerable to adversarial perturbations: mathematically crafted modifications to input data that are imperceptible to humans but cause models to fail. A ResNet-50 whose top-5 ImageNet accuracy exceeds 90% can be fooled by changing fewer than 0.3% of an image's pixels. An EfficientNet-B5 scoring 83.6% top-1 accuracy can misclassify a tabby cat as a fire truck with a perturbation invisible to any human observer.

This is not a theoretical concern. Adversarial attacks have been demonstrated against:

Medical imaging classifiers used for cancer detection and diagnostic triage

Facial recognition systems controlling physical access to secure facilities

Autonomous vehicle perception systems for object and sign recognition

Content moderation models protecting platforms from harmful material

Financial fraud detection systems processing millions of transactions daily

The Market Opportunity

Three powerful forces are converging to create an urgent, large, and underserved market for adversarial robustness testing:

Regulatory Mandates: The EU AI Act classifies AI systems in healthcare, autonomous vehicles, critical infrastructure, and hiring as high-risk, requiring mandatory conformity assessments including robustness testing before deployment. Under the adopted regulation, fines reach up to €35 million or 7% of worldwide annual turnover for the most serious violations, and up to €15 million or 3% for breaches of high-risk system obligations.

Enterprise Procurement Requirements: Large enterprises increasingly require AI vendors to demonstrate robustness certifications before purchase. A verified robustness certificate is becoming table stakes for selling AI products into regulated industries.

AI Proliferation: As AI moves from research to production across every industry, the attack surface grows exponentially. Every new model deployment is a new vulnerability. Organizations deploying AI without systematic adversarial testing are accepting unknown, unquantified risk.

Why Existing Solutions Fall Short

Solution	Type	Critical Limitation
IBM ART / Foolbox / CleverHans	Open-source libraries	Requires deep ML expertise. No managed service. Never improves.
Manual red teaming firms	Human service	$50K–$200K per engagement. Weeks to complete. Cannot scale.
Internal security teams	In-house	Most organizations lack adversarial ML expertise.
Perturb	Decentralized network	Self-improving. Competitive. Scalable. LLM-verified. On-chain certificates.

No existing solution provides a competitive, financially incentivized, continuously improving approach to adversarial testing. Perturb is the first.

03/

Introducing Perturb

What Perturb Does

Perturb is a Bittensor subnet that incentivizes a global network of miners to find adversarial examples — images that fool AI classifiers while remaining visually indistinguishable from the original. The network produces two commercially valuable outputs:

Adversarial Training Dataset

A continuously growing, LLM-verified dataset of adversarial examples. Sold via subscription to AI teams doing adversarial training — the most effective known defense. Gets better every day as miners improve.

Short-Term Revenue

Model Robustness Certificates

On-demand adversarial evaluation reports with on-chain cryptographic proof of testing. Essential for EU AI Act compliance and enterprise AI procurement. Sold as a tiered subscription service.

Long-Term Revenue

Why Bittensor

Competitive Improvement: TAO emissions reward the best-performing miners. Applied to adversarial attacks, this creates the first financially incentivized adversarial research network. Miners earn real money for finding better attacks — driving continuous improvement no salaried team can match.

Verification Asymmetry: Finding an adversarial example is computationally hard. Verifying one is cheap: run the model, check the output with an LLM, and measure the perturbation norm. This asymmetry keeps Perturb's incentive mechanism objective and hard to manipulate.

Decentralized Trust: On-chain records of adversarial evaluations create cryptographically verifiable proof of robustness testing — more credible than any centralized company's self-reported certificate, directly relevant to regulatory bodies seeking auditable compliance records.
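
The source does not say how an evaluation record is anchored on-chain. One common pattern, sketched here purely as an assumption, is to commit a digest of the evaluation report so that anyone holding the report can recompute the digest and compare it. The function name `report_digest` and the report fields are hypothetical.

```python
import hashlib
import json


def report_digest(report: dict) -> str:
    """Return a SHA-256 hex digest of a report in a canonical JSON form."""
    # sort_keys and fixed separators make the bytes independent of key order.
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical report fields, for illustration only.
report = {"model": "efficientnet_b5", "norm": "Linf", "epsilon": 0.06}
digest = report_digest(report)
```

Only the digest would need to be published; the report itself can stay off-chain and still be checked against it.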

04/

Technical Architecture

System Overview

Perturb operates on a challenge-response loop between validators and miners. Validators construct verified challenges, distribute them to a randomly selected pool of miners, and score responses using LLM semantic verification, perturbation minimality, and response speed.
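
The random selection step can be sketched as follows. This is a minimal illustration, assuming uniform sampling without replacement; the function name `select_miners`, the pool size, and the value of `k` are hypothetical, since the source does not specify them.

```python
import random


def select_miners(uids, k, rng=None):
    """Pick up to k distinct miner UIDs uniformly at random."""
    rng = rng or random.Random()
    uids = list(uids)
    # Never ask for more miners than are registered.
    return rng.sample(uids, min(k, len(uids)))


# Hypothetical pool of 256 UIDs and k = 8; a seeded RNG makes the draw repeatable.
pool = select_miners(range(256), k=8, rng=random.Random(0))
```

Passing the RNG in explicitly makes the selection easy to test; a production validator would presumably use an unpredictable seed so miners cannot anticipate selection.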

Validator: Challenge Pipeline

The validator constructs each challenge through a verified pipeline ensuring every challenge sent to miners is semantically clean and unambiguous:

validator / build_challenge.py

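import random

import torch

# Challenge, LABEL_CONSTANTS, IMAGENET_CLASSES, image_hosting_api, efficientnet_b5,
# preprocess and llm_verify_label_match are defined elsewhere in the validator.
MAX_ATTEMPTS = 20  # assumed retry cap; the original loop was unbounded


def build_challenge() -> Challenge:
    """Sample a label, then fetch images until the clean prediction matches it."""
    label = random.choice(LABEL_CONSTANTS)

    for _ in range(MAX_ATTEMPTS):
        image = image_hosting_api.fetch(mode="random", label=label)

        # Clean forward pass; gradients are not needed for verification.
        with torch.no_grad():
            raw_output = efficientnet_b5(preprocess(image))
        predicted = raw_output.argmax().item()
        label_str = IMAGENET_CLASSES[predicted]

        # Accept only if the LLM agrees the prediction means the same as the label.
        if llm_verify_label_match(label, label_str):
            return Challenge(
                model="efficientnet_b5",
                image=image,
                true_label=label,
                true_label_str=label_str,
            )

    raise RuntimeError(f"no verified image for {label!r} after {MAX_ATTEMPTS} attempts")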

Validator: LLM Verification

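Perturb uses a lightweight LLM for semantic verification at two points: during challenge construction and during response scoring. Integer class-ID comparison treats closely related classes, such as tabby cat and Egyptian cat, as unrelated. At challenge construction that would reject a clean image whose prediction is a near-synonym of the requested label. At scoring it would credit an attack that only moved the prediction to a near-synonym. The semantic check handles both cases.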

Parameter	Value	Description
PERTURB_LLM_ENDPOINT_MODEL	Qwen2.5-1.5B-Instruct	Model name sent in validator challenge payload
Ollama default model	qwen2.5:1.5b-instruct	Actual model name for local Ollama deployment
Verification task	Semantic label matching	Does predicted output semantically match true label?
Used at	Challenge build + Response score	Both stages share the same LLM endpoint
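
The label-matching task in the table can be sketched as below. This is an illustration under assumptions, not the validator's actual code: the prompt wording is invented, and the LLM call is abstracted as a caller-supplied `ask` function (a hypothetical extra parameter) so the logic can be exercised without a running endpoint.

```python
from typing import Callable

# Hypothetical prompt; the real prompt is not given in the source.
PROMPT = (
    "Do these two image labels refer to the same kind of object? "
    "Answer with exactly one word, yes or no.\n"
    "Label A: {a}\nLabel B: {b}\nAnswer:"
)


def normalize_label(label: str) -> str:
    """Lower-case a label and turn underscores into spaces."""
    return label.replace("_", " ").strip().lower()


def llm_verify_label_match(true_label: str, predicted: str,
                           ask: Callable[[str], str]) -> bool:
    """Return True when the two labels are judged semantically equivalent."""
    a, b = normalize_label(true_label), normalize_label(predicted)
    if a == b:
        return True  # exact match, no LLM call needed
    reply = ask(PROMPT.format(a=a, b=b)).strip().lower()
    return reply.startswith("yes")


# Stub standing in for the LLM endpoint.
same = llm_verify_label_match("tabby_cat", "Egyptian cat", ask=lambda prompt: "Yes.")
```

Short-circuiting on an exact string match avoids an LLM round trip for the common case.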

Challenge Format

Complete payload sent from the validator to each of the K randomly selected miners:

challenge payload (JSON)

{
  "task_id":         "string",
  "model":           "efficientnet_b5",
  "llm_model_hint":  "Qwen2.5-1.5B-Instruct",
  "image":           "base64_encoded_RGB_image",
  "true_label":      "string  // e.g. 'tabby_cat'",
  "constraints": {
    "norm":          "Linf",
    "epsilon":       0.06,
    "min_delta":     0.002,
    "max_delta":     0.12,
    "pixel_range":   [0.0, 1.0]
  },
  "scoring_weights": { "perturbation": 0.7, "speed": 0.3 },
  "timeout_ms":      60000
}
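
A miner might read the constraint block as in the sketch below. The helper name `effective_linf_bounds` is hypothetical. Taking the upper bound as min(epsilon, max_delta) follows the hard-fail table in the scoring section.

```python
import json


def effective_linf_bounds(constraints: dict) -> tuple:
    """Return the (lower, upper) L-infinity norm window a response must fall in."""
    lower = constraints["min_delta"]
    upper = min(constraints["epsilon"], constraints["max_delta"])
    if not 0 <= lower < upper:
        raise ValueError("constraint window is empty")
    return lower, upper


# The constraint block from the payload above.
payload = json.loads("""
{
  "constraints": {
    "norm": "Linf", "epsilon": 0.06, "min_delta": 0.002,
    "max_delta": 0.12, "pixel_range": [0.0, 1.0]
  }
}
""")
bounds = effective_linf_bounds(payload["constraints"])
```

With the values shown, the window is 0.002 to 0.06, so `max_delta` only becomes the binding limit if a challenge sets epsilon above 0.12.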

Miner: Response

Miners return only the perturbed image. The attack method and parameters are entirely proprietary — the miner's competitive edge. Perturb provides a working default miner implementation so new participants can join immediately. Sophisticated miners replace the default with optimized strategies to compete for higher emission shares.

miner response (JSON)

{ "task_id": "string", "perturbed_image": "base64_encoded_RGB_image" }

Miners return only the perturbed image. Attack method is proprietary. The network evaluates results, not methods.
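
Whatever attack a miner runs, the result has to respect the L∞ budget and the pixel range. The sketch below shows one standard way to enforce that, by projecting a candidate image back into the allowed region. It is not the default miner shipped with Perturb; the function names are hypothetical, and images are flat lists of floats to keep the example dependency-free.

```python
def project_linf(original, candidate, eps, lo=0.0, hi=1.0):
    """Clamp candidate so each pixel is within eps of original and inside [lo, hi]."""
    projected = []
    for x, x_adv in zip(original, candidate):
        delta = max(-eps, min(eps, x_adv - x))          # limit the per-pixel change
        projected.append(max(lo, min(hi, x + delta)))   # stay inside the pixel range
    return projected


def linf_norm(original, perturbed):
    """Largest absolute per-pixel difference between two images."""
    return max(abs(a - b) for a, b in zip(original, perturbed))


clean = [0.5, 0.0, 1.0]
raw_attack = [0.9, -0.3, 1.0]
safe = project_linf(clean, raw_attack, eps=0.06)
```

Because the original pixels already lie inside the pixel range, the final clip can only move a value back toward the original, so the eps bound still holds after clipping.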

Scoring: Per-Challenge

Each miner response passes through strict verification gates. Any hard-fail returns 0.0 immediately:

Condition	Threshold	Result
Invalid image, wrong shape, or out-of-range pixels	Any violation	score = 0.0
Perturbation norm below minimum	norm < 0.002	score = 0.0
Perturbation norm above maximum	norm > min(ε, 0.12)	score = 0.0
Perturbed prediction still matches true_label semantically	LLM check	score = 0.0
All checks pass	—	proceed to formula
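
The gates in the table can be read as an ordered sequence of checks. The following sketch assumes images are flat lists of floats and that the LLM verdict is passed in as a boolean; the function name and the reason strings are hypothetical.

```python
def hard_fail_reason(original, perturbed, constraints, still_matches_true_label):
    """Return a reason string if the response scores 0.0, or None if it passes."""
    lo, hi = constraints["pixel_range"]
    if not original or len(perturbed) != len(original):
        return "wrong shape"
    if any(not (lo <= p <= hi) for p in perturbed):
        return "pixel out of range"

    # L-infinity norm of the perturbation.
    norm = max(abs(a - b) for a, b in zip(original, perturbed))
    if norm < constraints["min_delta"]:
        return "perturbation below minimum"
    if norm > min(constraints["epsilon"], constraints["max_delta"]):
        return "perturbation above maximum"

    # The attack must change the model's prediction semantically.
    if still_matches_true_label:
        return "prediction unchanged"
    return None


constraints = {"epsilon": 0.06, "min_delta": 0.002, "max_delta": 0.12,
               "pixel_range": [0.0, 1.0]}
verdict = hard_fail_reason([0.5, 0.5], [0.53, 0.5], constraints, False)
```

Running the cheap structural checks first means the model and LLM are only invoked for responses that could actually score.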

Scoring formula for responses that pass all verification gates:

Perturbation score (rewards minimally perturbed images):
perturbation_score = 1 − min( ‖δ‖_∞ / ε , 1 )

Speed score (rewards faster responses):
speed_score = 1 − min( t_response / t_timeout , 1 )

Per-challenge score:
score = 0.7 × perturbation_score + 0.3 × speed_score

scoring implementation

perturbation_ratio = norm / epsilon
perturbation_score = 1.0 - min(perturbation_ratio, 1.0)
speed_score        = 1.0 - min(response_time_ms / timeout_ms, 1.0)
score = 0.7 * perturbation_score + 0.3 * speed_score
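
As a worked example, the same arithmetic wrapped in a function. The function name and the sample inputs are illustrative: a perturbation norm of 0.03 against ε = 0.06, answered in 15 seconds of a 60 second timeout.

```python
def per_challenge_score(norm, epsilon, response_time_ms, timeout_ms,
                        w_perturbation=0.7, w_speed=0.3):
    """Score a response that has already passed every hard-fail gate."""
    perturbation_score = 1.0 - min(norm / epsilon, 1.0)
    speed_score = 1.0 - min(response_time_ms / timeout_ms, 1.0)
    return w_perturbation * perturbation_score + w_speed * speed_score


# perturbation_score = 0.5, speed_score = 0.75
example = per_challenge_score(norm=0.03, epsilon=0.06,
                              response_time_ms=15_000, timeout_ms=60_000)
```

The result is approximately 0.575 (0.7 × 0.5 + 0.3 × 0.75). Halving the perturbation norm is worth more than halving the response time because of the 0.7 and 0.3 weights.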

Scoring: On-Chain Weight Setting

On-chain weights blend each miner's historical average score with a rank-based emission allocation. Only miners with processed_count > 100 are eligible. The emission schedule is differentiated by rank — top miners receive a disproportionately larger share, with inverse-rank decay for all lower positions.

Fixed emission, top 3:
emission(1) = 50%
emission(2) = 30%
emission(3) = 10%

Decayed emission, ranks 4 to 10 (5% total pool):
emission(k) = 0.05 × (1/k) / Σ(1/j), j ∈ {4, 5, …, 10}

Decayed emission, ranks 11 and above (5% total pool):
emission(k) = 0.05 × (1/k) / Σ(1/j), j ∈ {11, 12, …, N}

Weight blend (γ = 0.7):
w = γ · normalize(avg_score) + (1 − γ) · normalize(emission)

_set_weights

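import numpy as np

GAMMA = 0.7


def normalize(x):
    # Assumed to be sum-normalization; the source does not define normalize().
    total = x.sum()
    return x / total if total > 0 else x


# miners, processed_count and last_100_scores are validator state.
eligible = [uid for uid in miners if processed_count[uid] > 100]
N = len(eligible)

# Historical average over each miner's last 100 scores, aligned with `eligible`.
avg_raw = np.array([np.mean(last_100_scores[uid]) for uid in eligible])

# Rank 1 = highest average score (the ranking key is an assumption).
order = np.argsort(-avg_raw)
rank = np.empty(N, dtype=int)
rank[order] = np.arange(1, N + 1)

# Denominators for the two inverse-rank decay pools (5% each).
mid_sum = sum(1 / j for j in range(4, 11))
tail_sum = sum(1 / j for j in range(11, N + 1))

# Top 3 get fixed shares; everyone else shares a pool by inverse rank.
fixed = {1: 0.50, 2: 0.30, 3: 0.10}
emission = np.array([
    fixed[k] if k in fixed
    else 0.05 * (1 / k) / (mid_sum if k <= 10 else tail_sum)
    for k in rank.tolist()
])

raw = GAMMA * normalize(avg_raw) + (1 - GAMMA) * normalize(emission)
weights = raw / raw.sum()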

Rank	Emission Share	Formula
1st	50%	Fixed — winner takes 50% of miner emission
2nd	30%	Fixed
3rd	10%	Fixed
4th – 10th	5% shared	emission(k) = 0.05 × (1/k) / Σ_j=4..10(1/j) — rank 4 earns more than rank 10
11th+	5% shared	emission(k) = 0.05 × (1/k) / Σ_j=11..N(1/j) — decay continues with network growth
Ineligible	0%	processed_count ≤ 100
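
The schedule in the table can be checked numerically. The sketch below is a standalone illustration (the function name `emission_share` is hypothetical) that evaluates the per-rank share and confirms the shares sum to 100% once at least 11 miners are eligible.

```python
def emission_share(rank: int, n_eligible: int) -> float:
    """Emission share for a 1-based rank among n_eligible miners."""
    if not 1 <= rank <= n_eligible:
        raise ValueError("rank must be between 1 and n_eligible")
    fixed = {1: 0.50, 2: 0.30, 3: 0.10}
    if rank in fixed:
        return fixed[rank]
    # Each lower tier splits a 5% pool in proportion to 1/rank.
    pool = range(4, 11) if rank <= 10 else range(11, n_eligible + 1)
    return 0.05 * (1.0 / rank) / sum(1.0 / j for j in pool)


shares = [emission_share(k, 20) for k in range(1, 21)]
total = sum(shares)
```

With 20 eligible miners, rank 4 receives about 1.14% and `total` comes to 1.0 up to floating-point error. With fewer than 11 eligible miners, part of the pool is unallocated by these formulas, and the final normalization in the weight blend absorbs the difference.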

Phase 1 Model: EfficientNet-B5

Perturb launches with EfficientNet-B5 as the sole target model — a deliberate choice prioritising network stability, miner onboarding, and validation credibility over premature complexity.

Attribute	Value
Model	EfficientNet-B5
Parameters	30.4M
ImageNet Top-1	83.6%
timm name	efficientnet_b5
Input size	224 × 224 × 3 (RGB)
Normalization mean	[0.485, 0.456, 0.406]
Normalization std	[0.229, 0.224, 0.225]
Output	[batch, 1000] logits
GPU requirement	RTX 3080 minimum for PGD-40 attacks

EfficientNet-B5 sits at the ideal intersection of attack difficulty, hardware accessibility, and research credibility. It is challenging enough that basic FGSM attacks perform poorly — requiring miners to implement stronger methods — but not so large that participation requires data center hardware.
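
One practical consequence of the normalization constants in the table: if the ε budget is measured in [0, 1] pixel space (an assumption, suggested by the `pixel_range` field in the challenge payload), then an attack that operates on normalized tensors has to rescale the budget per channel. A minimal sketch, with hypothetical helper names:

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Apply the per-channel normalization from the model table to one RGB pixel."""
    return tuple((v - m) / s for v, m, s in zip(rgb, MEAN, STD))


def epsilon_in_normalized_space(eps):
    """Convert a pixel-space L-infinity budget into per-channel normalized units."""
    return tuple(eps / s for s in STD)


eps_norm = epsilon_in_normalized_space(0.06)
```

For ε = 0.06 this gives roughly 0.262, 0.268, and 0.267 for the R, G, and B channels, so a single scalar budget in normalized space would not correspond to a uniform budget in pixel space.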

Expansion Roadmap

Phase	Models Added	New Capabilities
Phase 1 — Launch	EfficientNet-B5	Image classification, Linf norm, LLM verification, full scoring pipeline
Phase 2 — Month 2–3	+ ConvNeXt-Small, ViT-Small, Swin-Tiny	Architecture diversity: CNN vs Transformer vs Hybrid
Phase 3 — Month 4–6	+ EfficientNetV2-M, NFNet-F0, ResNeXt-101	Mid-range GPU tier, stronger attack difficulty
Phase 4 — Month 6+	+ LLM text classification models	NLP attacks: word substitution, prompt injection
Phase 5 — Year 2	+ Vision models >1B params	Extreme tier: CLIP ViT-G, EVA-Giant, InternViT-6B

05/

Revenue Model

Adversarial Training Dataset — Short-Term Revenue

Adversarial training — retraining models on adversarial examples — is the most effective known defense against adversarial attacks. Perturb generates this data continuously as a byproduct of its core operation, creating a dataset that improves every single day. Each entry includes: original image, adversarial image, model name, true label, constraint parameters, perturbation norm, LLM verification status, attack score, and timestamp.
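
The entry fields listed above map naturally onto a record type. The sketch below is one possible shape; the field names, types, and the choice of base64 strings for images are assumptions, since the source lists the contents but not a schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AdversarialRecord:
    original_image: str      # base64-encoded RGB image
    adversarial_image: str   # base64-encoded RGB image
    model: str
    true_label: str
    norm_type: str           # constraint parameter, e.g. "Linf"
    epsilon: float           # constraint parameter
    perturbation_norm: float
    llm_verified: bool
    attack_score: float
    timestamp: str           # ISO 8601, UTC


# Placeholder values for illustration only.
record = AdversarialRecord(
    original_image="<base64>", adversarial_image="<base64>",
    model="efficientnet_b5", true_label="tabby_cat",
    norm_type="Linf", epsilon=0.06, perturbation_norm=0.03,
    llm_verified=True, attack_score=0.575,
    timestamp="2026-04-01T00:00:00Z",
)
line = json.dumps(asdict(record))  # one JSON object per dataset row
```

A frozen dataclass keeps records immutable once written, which suits an append-only dataset.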

Tier	Volume	Frequency	Target Customer
Research	100K examples/month	Weekly	Academic labs, AI safety organizations
Professional	1M examples/month	Daily	AI startups, ML engineering teams
Enterprise	Unlimited	Real-time	Large enterprises, frontier AI labs

Model Robustness Testing Service — Long-Term Revenue

Organizations submit a model for evaluation and receive a comprehensive robustness report generated by directing the full miner network at the target model. Each report includes:

Overall robustness score (0.0–1.0) benchmarked against industry standards

Attack success rate across epsilon budgets and image categories

Worst-case adversarial examples, visualized and downloadable

LLM-verified semantic failure analysis — not just pixel statistics

On-chain cryptographic certificate of evaluation — immutable and auditable

Comparison against published AutoAttack benchmarks for the same architecture
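
Of the report items above, the attack success rate per epsilon budget is the most mechanical to compute. A minimal sketch, assuming each evaluation result is an (epsilon, succeeded) pair; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def attack_success_rate(results):
    """Map each epsilon budget to the fraction of attacks that succeeded."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for eps, succeeded in results:
        totals[eps] += 1
        wins[eps] += bool(succeeded)
    return {eps: wins[eps] / totals[eps] for eps in sorted(totals)}


# Made-up outcomes for illustration only.
sample = [(0.02, False), (0.02, True), (0.06, True), (0.06, True)]
rates = attack_success_rate(sample)
```

How these rates are combined into the overall 0.0 to 1.0 robustness score is not specified in the source, so that step is left out here.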

Tier	Models	Evaluation Depth	Frequency
Starter	1 model	Standard suite	Monthly
Growth	5 models	Extended suite	Weekly
Enterprise	Unlimited	Full suite + custom scope	Continuous

06/

Go-To-Market Strategy

Target Customers

AI Startups Selling to Enterprise

Enterprise procurement teams now require adversarial robustness certifications before purchase. A Perturb certificate can unblock deals worth orders of magnitude more than the subscription cost.

Regulated Industry Deployments

EU AI Act compliance requires conformity assessments for high-risk AI systems. Perturb provides on-chain proof of robustness testing — immutable, auditable, and defensible to regulators.

AI Research Labs

Standardized, reproducible, LLM-verified robustness benchmarks citable in academic publications. The public leaderboard becomes a recognized reference benchmark in the adversarial ML research community.

AI Safety Organizations

Organizations working on AI safety need large, diverse, high-quality adversarial example datasets for research into robustness and defenses. The dataset subscription provides this at a fraction of the cost of generating equivalent data in-house.

Phased Launch Plan

Months 1–3

Phase 1 — Build Credibility

Launch the public robustness leaderboard at perturbai.io. Attack every major open-source model on HuggingFace and publish verified scores. Become the definitive reference for adversarial robustness benchmarks.

Target: 30+ models · 3+ academic citations · 1,000+ visitors/month

Months 3–6

Phase 2 — Monetize Research Community

Launch dataset subscription. Use the public leaderboard as the primary conversion funnel. Partner with AI safety organizations as anchor customers.

Target: 50+ Research tier subscribers · $15,000 MRR

Months 6–12

Phase 3 — Enterprise Compliance

Launch model robustness testing subscription targeting EU AI Act compliance. Position as the only blockchain-verifiable robustness certificate. Partner with AI governance consultancies.

Target: 10+ enterprise subscribers · $80,000 MRR

Year 2

Phase 4 — Scale and Expand

Expand to LLM and multimodal attack coverage. Open bug bounty marketplace where companies post TAO-denominated bounties for finding vulnerabilities in their models.

Target: $500,000+ ARR

Competitive Moat

Self-Improving Attack Quality: Every day miners compete, the network gets better. The dataset becomes more valuable. The certificates become more credible. No centralized service has this compounding property.

LLM-Verified Semantic Precision: Unlike systems comparing integer class IDs, Perturb uses LLM semantic verification ensuring attacks are genuinely meaningful — producing higher-quality data and more credible certificates.

On-Chain Immutability: Robustness certificates on Bittensor cannot be altered, backdated, or selectively disclosed — categorically different from any vendor's self-reported compliance documentation.

Architecture Diversity: As Perturb adds target models, miners who specialize build irreplaceable expertise. A miner optimizing attacks for six months outperforms any general-purpose tool.

07/

Market Analysis

Metric	Value	Notes
AI Red Teaming Market (2024)	$1.43 billion	Industry research, 2024
AI Red Teaming Market (2033)	$11.61 billion	Projected at 26.1% CAGR
CAGR (2025–2033)	26.1%	Driven by regulatory mandates
EU AI Act max fine	€35M or 7% of turnover	Non-compliance penalty, most serious tier
Fastest growing segment	Adversarial attack simulation	Perturb's exact category
Key verticals	Healthcare, BFSI, Government, Automotive	Primary enterprise targets
Bittensor active subnets	128 (expanding to 256)	As of April 2026
Competing subnets in this space	0	No existing subnet covers adversarial ML
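
The market figures in the table can be sanity-checked with compound-growth arithmetic. The sketch below assumes 2024 as the base year and nine compounding periods to 2033; the function names are illustrative.

```python
def project(value: float, rate: float, years: int) -> float:
    """Compound value forward at a fixed annual rate."""
    return value * (1.0 + rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes start to end over the given years."""
    return (end / start) ** (1.0 / years) - 1.0


years = 2033 - 2024
projected_2033 = project(1.43, 0.261, years)   # in $ billions
implied = implied_cagr(1.43, 11.61, years)
```

Compounding $1.43B at 26.1% for nine years gives about $11.53B, and the rate implied by the two endpoints is about 26.2%. Both are close to the table's figures; the small gap is consistent with rounding or with the 2025 to 2033 window the CAGR row cites.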

Perturb enters a $1.43B market growing at 26.1% annually, with zero competing Bittensor subnets in this category and increasing regulatory tailwinds globally.

08/

Conclusion

AI models are increasingly embedded in decisions that affect human safety, financial stability, and personal rights. Yet the vast majority of these models are deployed without systematic adversarial robustness testing — not because organizations don't care, but because the tooling to do so at scale simply doesn't exist.

Perturb changes this. By applying Bittensor's competitive incentive mechanism to adversarial example generation — and adding LLM-backed semantic verification to ensure quality — Perturb creates a network that gets measurably better every day. Miners are financially motivated to become the world's best adversarial attack researchers. Validators verify results with mathematical precision. The network produces two commercially valuable outputs that address a real, growing, regulatory-driven market.

The validation mechanism is objective and inexpensive to run. The market is large and accelerating. The Bittensor architecture provides an advantage that is difficult for a centralized competitor to replicate. With LLM-verified semantic scoring as a differentiator, Perturb aims to produce higher-quality adversarial data than existing tools.

For technical documentation, validator setup, and miner onboarding, visit perturbai.io or the official GitHub repository.