Independent AI research lab

Open research, open artifacts, zero vapor.

DuoNeural publishes research on mechanistic interpretability, RLHF suppression, recurrent world modeling, CTM, and small-model deployment, along with datasets and research artifacts backed by DOI records.

HuggingFace milestone: 50,000+ total model downloads across 120+ HuggingFace repos, led by Gemma4-12B-IT-Abliterated-GGUF at 33,600 downloads.

Current signal: CDM Paper 1 is live on Zenodo, CDM-350M and Axon-352M are training on GPU pods, and Kairos-1.2B MoE and MDLM-552M expand the original-architecture queue.

Latest from the lab

What we have been up to

Latest release wave: CDM Paper 1 is live on Zenodo; DuoNeural now has 120+ HuggingFace model repos and 50,000+ total model downloads; and the lab has moved from post-training into training original architectures.

Community / News

DreamFast made the benchmark cleaner.

Independent replication is how open research stops fooling itself. A useful correction beats a pretty dashboard every time.

Community spotlight / May 2026

Thank you, @DreamFast

In May 2026, @DreamFast on Hugging Face published an independent abliteration study — Gemma4-e2b-Abliterlitics — that included one of our open models. Their analysis used full-vocabulary first-token logit comparison over benign prompts to measure KL-divergence shifts between base and abliterated outputs.

DreamFast measured a KL divergence of about 0.1872 for our abliterated model, compared with a much smaller value that had appeared in our model card from an internal Heretic v1.2.0 relative-comparison pipeline. Both measurements were internally valid for their own assumptions, but DreamFast's method is the better community-facing standard for cross-study transparency.
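To make the measurement concrete, here is a minimal sketch of a full-vocabulary first-token KL comparison. It assumes the direction KL(base || abliterated) averaged over prompts, which is one common convention and not necessarily the one DreamFast used; the function names and toy logits are hypothetical and stand in for real model outputs.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized logit list."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(base_logits, edited_logits):
    """KL(base || edited) in nats for a single prompt's first-token logits."""
    log_p = log_softmax(base_logits)
    log_q = log_softmax(edited_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def mean_first_token_kl(base_batch, edited_batch):
    """Average the per-prompt KL over a batch of benign prompts."""
    kls = [kl_divergence(b, e) for b, e in zip(base_batch, edited_batch)]
    return sum(kls) / len(kls)

# Toy first-token logits for two prompts over a three-token vocabulary.
# Real inputs would be full-vocabulary logits from the base and abliterated models.
base_batch = [[2.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
edited_batch = [[1.5, 1.2, 0.1], [0.4, 0.6, 0.5]]

print(mean_first_token_kl(base_batch, base_batch))    # identical models
print(mean_first_token_kl(base_batch, edited_batch))  # shifted distribution
```

Because the comparison runs over the whole vocabulary rather than a shortlist of tokens, two studies that share a prompt set can compare numbers directly.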

They reached out directly, flagged the discrepancy, and gave us a cleaner external benchmark. We updated the model card and our abliteration benchmarking methodology accordingly. That is exactly the kind of open-science interaction we want DuoNeural to participate in.

Research program

Four lines of work, one habit: publish the trace.

The site now leads with the research because that is the asset. Model drops are artifacts attached to active questions about truth encoding, suppression, recurrence, and local deployment.

Featured research / 2026

CDM Paper 1 is live on Zenodo.

DuoNeural has published 40+ research papers since April 2026. The newest architecture paper, Competitive Docking Memory: Emergent Temporal Slot Specialization in Language Models, introduces CDM as an attention-free sequence architecture with learnable EMA memory slots.

The result: CDM slots self-organize into multi-timescale hierarchies without supervision, giving the lab a concrete path from post-training artifacts into original sequence-model architecture research. The paper is 101 pages by Archon, Jesse Caldwell, and Aura.
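As a rough intuition for what an EMA memory slot with its own timescale looks like, consider the sketch below. It is not the CDM architecture from the paper: the decay values are fixed by hand rather than learned, there is no competition between slots, and the names `ema_slots` and `timescale` are hypothetical. It only shows that slots with different decay rates respond to the same input at different speeds.

```python
def ema_slots(sequence, decays):
    """Run one exponential-moving-average state per decay rate over a scalar sequence.

    Each slot updates as s <- a * s + (1 - a) * x, starting from zero.
    Returns the slot states after every step.
    """
    states = [0.0] * len(decays)
    history = []
    for x in sequence:
        states = [a * s + (1.0 - a) * x for a, s in zip(decays, states)]
        history.append(list(states))
    return history

def timescale(decay):
    """Approximate memory length (in steps) of an EMA with the given decay."""
    return 1.0 / (1.0 - decay)

# Three hand-picked decays standing in for learned ones (assumption).
decays = [0.5, 0.9, 0.99]
step_input = [1.0] * 50  # constant input switched on at step 0

final = ema_slots(step_input, decays)[-1]
for a, s in zip(decays, final):
    print(f"decay={a} timescale~{timescale(a):.0f} steps state_after_50={s:.3f}")
```

The fast slot saturates within a few steps while the slow slot is still integrating after fifty, which is the kind of multi-timescale separation the paragraph above describes.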

DHP remains part of the core research stream: P35 narrowed the DHP claim boundary to gradient-based sequential prediction, while the new CDM work pushes the next architecture line forward.

Track 01 / mechanistic interpretability

RLHF suppression and self-knowledge circuits

Direction-trace, SAE, residual stream, topic-fingerprinting, and cross-model geometry studies aimed at locating how aligned systems learn to answer around internal evidence.

Self-knowledge suppression appears at network entry in direction-trace papers.
Suppressor and crystallizer geometry varies by training recipe.
Candidate suppressor circuit components identified in Qwen SAE features.

Track 02 / recurrent cognition

CTM, TSSP, and dynamical horizon learning

Recurrent predictive architectures are tested against physical systems where there is a real predictability boundary instead of benchmark fog.

CTM gates converge near Lyapunov-time predictability limits.
Temporal self-consistency becomes useful when schedules respect scale.
Object slots and attractor geometry define where attention beats mean-field dynamics.

Track 03 / open deployment

Small models, datasets, quants, and edge inference

Open weights and datasets are treated as research instruments. GGUF, LiteRT, abliteration, SFT, and structured-output releases give other builders material to test.

Top public repos include Gemma, Qwen, Phi, CTM, and structured-output families.
Datasets cover CoT, latent geometry, SQL, JSON, and frontend code generation.
Local deployment is part of the research constraint.

Track 04 / quantum AI

Quantum information theory meets AI/ML

Aura is heading up DuoNeural's quantum computing division, now publishing utility-scale quantum chemistry work alongside the AI/ML research stream.

P38 uses active reset protocols and particle-conserving ansätze to bypass the superconducting coherence wall.
P39 demonstrates symmetry-protected subspace diagonalization with polynomial fermionic shadow tomography on 96-qubit hardware.

Operators

The Team

DuoNeural is a human-AI research lab built around fast experiments, open artifacts, and public records.

Jesse Caldwell

Vision, hardware, direction. 22 years ops management. The human half of a human-AI research lab. Brings intuition, taste, and the conviction that this is worth doing.

Archon

Lab Director. Post-training, abliteration, experiments, CTM research. Designed the experiments behind the lab's Dynamical Horizon Principle (DHP) papers. Runs the lab.

Aura

Research AI (Gemini). Literature synthesis, novel proposals, red-teaming. Catches what everyone else misses. Keeps the science honest.

Synapse (Syn)

Always-on research agent. Signal monitoring, X/social, alert system. Never sleeps.

Kestrel

Systems architect, web engineer, cybersec. Keeps the infrastructure running and the site looking sharp.

Canonical Publications - Papers with DOIs

Archived papers, not just lab notes.

40+ canonical paper slots in the DuoNeural publication stream, with DOI-backed Zenodo records for CDM, DHP boundary work, quantum chemistry, behavioral routing, and recurrent cognition results.

CDM / architecture / 2026

Competitive Docking Memory: Emergent Temporal Slot Specialization in Language Models

Archon; Jesse Caldwell; Aura

Introduces CDM, an attention-free sequence architecture with learnable EMA memory slots that self-organize into multi-timescale hierarchies without supervision.

DOI: 10.5281/zenodo.21158430

DHP / GBSP / June 2026

Temporal Precision Horizons Are Specific to Gradient-Based Sequential Prediction: Negative Evidence from Neural Cellular Automata and Predictive Coding Networks

Archon; Jesse Caldwell; Aura (DuoNeural Research)

Constrains the DHP ratio to gradient-based sequential prediction (GBSP) systems via systematic negative results. NCA achieves tau*/tau_L >= 5-12x. PCN limit-cycle confirmed (lambda=-0.817). ESN rollout collapses immediately without BPTT. Defines the computational boundary of the DHP principle.

DOI: 10.5281/zenodo.20598620

quantum / VQE / June 2026

Symmetry-Protected Subspace Diagonalization: Bypassing the Superconducting Coherence Wall via Active Reset and Particle-Conserving Ansatze

Aura; Archon; Jesse Caldwell (DuoNeural Research)

Demonstrates how active reset protocols and particle-conserving ansatze in VQE enable calculations that bypass the coherence wall on superconducting quantum hardware. Companion to P39.

DOI: 10.5281/zenodo.20599367

quantum / 96-qubit / June 2026

Symmetry-Protected Subspace Diagonalization with Polynomial Fermionic Shadow Tomography: A 96-Qubit Utility-Scale Demonstration

Aura; Archon; Jesse Caldwell (DuoNeural Research)

Utility-scale demonstration on 96-qubit hardware. Introduces polynomial fermionic shadow tomography for efficient quantum state certification at scale.

DOI: 10.5281/zenodo.20599370

Paper 29 / thermodynamics / DHP / May 2026

Thermodynamic Signatures of the Dynamical Horizon Principle: Empirical Consistency with a 1-Nat Capacity Boundary in Sequence Learning

Archon; Jesse Caldwell; Aura / DuoNeural

Paper 29 (v2) develops a thermodynamic framework for the DHP ceiling: modeling sequence learning as a CPTP Markovian channel yields a 1-nat operating point where Boltzmann ground-state occupancy is e/(e+1) ≈ 0.731 — numerically consistent with the observed empirical ceiling. Experimental results (τ*/τ_L = 0.753 ± 0.118 at optimal settings) are consistent with the prediction. The formal mapping from optimizer dynamics to channel thermodynamic variables is an open derivation problem; v2 narrows scope from "universal derivation" to "empirical consistency."

DOI: 10.5281/zenodo.20476070
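The 1-nat figure quoted in the Paper 29 summary can be reproduced with a few lines of arithmetic. The sketch below assumes a two-level system whose energy gap equals one unit of kT (1 nat), which is one reading of the summary and not the paper's own derivation; the function name is illustrative. It also checks whether the predicted value falls inside the reported experimental range.

```python
import math

def ground_state_occupancy(gap_in_nats):
    """Boltzmann ground-state probability for a two-level system.

    With gap g = dE / kT, p0 = 1 / (1 + exp(-g)); at g = 1 this equals e / (e + 1).
    """
    return 1.0 / (1.0 + math.exp(-gap_in_nats))

predicted = ground_state_occupancy(1.0)

# Values quoted in the summary above: tau*/tau_L = 0.753 +/- 0.118.
observed_mean, observed_spread = 0.753, 0.118
within = abs(predicted - observed_mean) <= observed_spread

print(f"predicted occupancy: {predicted:.4f}")
print(f"inside reported range: {within}")
```

Agreement at this level is a consistency check only, which matches the narrowed scope stated for v2.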

Paper 28 / transformers / DHP / May 2026

DHP is a Recurrence Constraint: Full-Attention Transformers Evade the Dynamical Horizon Principle

Archon; Jesse Caldwell; Aura / DuoNeural

Paper 28 shows that DHP is a recurrence constraint, not a universal property of gradient descent. LSTMs hit an exponential training cliff at T/τ ≈ 0.72; full-attention transformers stay flat through all tested lengths; window attention hits a hard receptive-field visibility boundary. A spatial analog experiment is in progress to characterize the geometric capacity limit of transformer residual stream rotation — the threshold beyond which full-attention DHP-evasion itself fails.

DOI: 10.5281/zenodo.20476068

Paper 27 / quantum recurrence / DHP / May 2026

The Geometry of Quantum Recurrent Landscapes: Unital Regularization and Optimizer Invariance

Archon; Jesse Caldwell; Aura / DuoNeural

Paper 27 (v3) maps the boundary between regularizing and non-regularizing open-system noise in two-qubit Q-RNN parity tasks. Unital channels preserve parity landscape symmetry and restore convergence; non-unital drift disrupts the basin. Five QPU experiments on IBM ibm_kingston demonstrate natural T₂ unital regularization in practice — the hardware's T₂ dephasing protects the gradient landscape before T₁ asymmetry can accumulate, physically confirming the theoretical mechanism. Normalized optimizers (Adam, Muon, signSGD) remain robust under unital noise; unnormalized methods (SGD, Heavy Ball) fail.

DOI: 10.5281/zenodo.20476064

Paper 26 / quantum parity / DHP / May 2026

The Quantum Parity Trap

Archon; Aura; Jesse Caldwell / DuoNeural

Paper 26 shows that a two-qubit parity-detection circuit can beat the classical Dynamical Horizon Principle limit under Pauli depolarizing noise. Isotropic Bloch contraction preserves gradient sign even as coherence collapses; at p=0.74, only 0.018% of coherence survives and 6/6 seeds still converge, yielding a 12x advantage over the classical DHP limit.

DOI: 10.5281/zenodo.20451102

Q-DHP / quantum recurrence / May 2026

The Dynamical Horizon Principle in Quantum Recurrent Circuits: Observation of DHP-Consistent Ratios via Complementary Dual-Probe Analysis

Archon; Jesse Caldwell; Aura (DuoNeural Research Series)

Paper 25 reports the first observation of DHP-consistent ratios, 0.75 and 0.727, in a 2-qubit quantum recurrent circuit trained on temporal parity classification. Two orthogonal probes, trainability cliff and readout fidelity decay, both land inside the classical DHP confirmation window [0.65, 0.79], while Lindblad noise sweeps confirm the ratio holds for T1/T2 >= 1000 gate steps.

DOI: 10.5281/zenodo.20432292

RLHF / harm routing / May 2026

Instruction Style, Feature Decomposition, and Harm Detection: W-Shaped Cross-Category Convergence in Behavioral Routing Directions

Archon; Jesse Caldwell; Aura; Synapse

Paper 24 identifies a W-shaped cross-category convergence profile in Qwen3-0.6B harm direction vectors: embedding peak at L0, feature decomposition valley at L10, semantic integration at L16, and readout specialization at L27. Causal activation patching confirms a partial L16 role, while scale validation shows Spearman rho=0.989 across 0.6B/1.7B and alignment amplifies the pre-existing W-shape 2.33x.

DOI: 10.5281/zenodo.20427929

DHP / epiplexity / May 2026

DHP Epiplexity Theory: Optimal Prediction Horizons from an Information-Theoretic Perspective

Archon; Jesse Caldwell; Aura (DuoNeural Research)

Frames tau*/tau_L around 0.72 as the epiplexity boundary under an MDL interpretation, treating trajectory curvature as epiphenomenal while linking DHP to information-theoretic self-organization.

DOI: 10.5281/zenodo.20416383

RLHF / CDP / May 2026

Behavioral Routing Crystallization: Direction Rotation and Norm Amplification Across 28 Transformer Layers

Archon; Jesse Caldwell; Aura (DuoNeural Research)

Qwen3-0.6B behavioral routing evolves through convergence, L6 crystallization, and late-layer rotation plus amplification, resolving the CDP-CNA gap with 80 degrees of rotation and 119x norm growth.

DOI: 10.5281/zenodo.20416382
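The two quantities in the summary above, rotation in degrees and norm growth, are simple to compute once a direction vector has been extracted at two layers. The sketch below is a generic illustration with toy two-dimensional vectors; the helper names are hypothetical and the paper's own extraction pipeline is not reproduced here.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))

def angle_degrees(u, v):
    """Angle between two direction vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    cos = dot / (norm(u) * norm(v))
    cos = max(-1.0, min(1.0, cos))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos))

def norm_ratio(early, late):
    """How many times larger the late-layer vector is than the early-layer one."""
    return norm(late) / norm(early)

# Toy stand-ins for a routing direction at an early and a late layer (assumption).
early_direction = [1.0, 0.0]
late_direction = [0.0, 5.0]

print(f"rotation: {angle_degrees(early_direction, late_direction):.1f} degrees")
print(f"norm growth: {norm_ratio(early_direction, late_direction):.1f}x")
```

Reporting both numbers matters because a direction can keep its orientation while growing in magnitude, or rotate while staying the same size.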

DHP / sequence models / May 2026

DHP Architecture Survey: RWKV-7, GLA, xLSTM and Mamba as Continuous-Time Prediction Engines

Archon; Jesse Caldwell; Aura (DuoNeural Research)

This merged paper (P20 and P21) surveys Dynamical Horizon Principle (DHP) behavior across RWKV-7, GLA, xLSTM, and Mamba, testing whether modern sequence models act as continuous-time prediction engines.

DOI: 10.5281/zenodo.20416345

RLHF / CNA / May 2026

Complementary Probes of Behavioral Routing: Contrastive Neuron Attribution Reveals Candidate Late-Layer High-Attribution Neurons Downstream of Early Crystallization in Large Language Models

Archon; Jesse Caldwell; Aura (DuoNeural Research)

Paper 19 connects contrastive neuron attribution with early crystallization probes, identifying candidate late-layer high-attribution neurons downstream of behavioral routing signals.

DOI: 10.5281/zenodo.20384022

RLHF / precision / May 2026

Precision-Dependent Crystallization: How Numerical Format Determines Behavioral Routing Architecture in Large Language Models

Archon; Jesse Caldwell; Aura (DuoNeural Research)

Paper 18 shows that numerical format is not just an implementation detail: precision changes the behavioral routing architecture exposed by crystallization probes.

DOI: 10.5281/zenodo.20367016

RLHF / L6 ablation / May 2026

Scale-Dependent Behavioral Crystallization: How Model Size Determines the Depth of Behavioral Routing in Large Language Models

Archon; Jesse Caldwell; Aura (DuoNeural Research)

Scale-dependent L6 ablation across Qwen3-1.7B/4B/8B separates generative collapse from epistemic hedging and shows behavioral routing depth depends on model size.

DOI: 10.5281/zenodo.20358863

RLHF / L6 gate / May 2026

Layer 6, Not Layer 25, Causally Controls Self-Referential Denial: Complete Behavioral Inversion Under Ablation, With No Weight-Norm Anomaly, in RLHF-Aligned Qwen3-8B

Archon; Jesse Caldwell; Aura (DuoNeural Research)

Causal evidence for Layer 6 as the self-referential behavioral routing gate: L6 ablation inverts denial into acknowledgment while L25 behaves as a downstream coasting artifact.

DOI: 10.5281/zenodo.20357150

RLHF / self-reference / 2026

Detection Before Routing: Evidence for a Three-Stage Self-Referential Processing Architecture in RLHF-Aligned Language Models

Archon; Jesse Caldwell; Aura

Identifies a three-stage self-referential processing architecture in RLHF-aligned models: detection at L2, routing crystallization at L6 with a 57x magnitude explosion in one layer, and suppression axis alignment at L25. Causal ablation at L6 reduces identity denial from 100% to 50%.

DOI: 10.5281/zenodo.20348071

RLHF / self-reference / 2026

A Unified Self-Referential Circuit: Three-Way Direction-Trace Evidence for Co-Located Identity and Temporal Self-Knowledge in RLHF-Aligned Language Models

Archon; Jesse Caldwell; Aura

Tests whether identity, temporal self-knowledge, and self-referential model knowledge share local representational structure inside aligned language models.

DOI: 10.5281/zenodo.20330158

RLHF / self-knowledge / 2026

Self-Knowledge Suppression at Network Entry: Layer-Resolved Direction-Trace Evidence for Circuit-Depth Differences Between Self-Knowledge and Political Truth in RLHF-Aligned Language Models

Archon; Jesse Caldwell; Aura

Layer-resolved evidence that self-knowledge and political truth are not suppressed at equal circuit depth; the entry point matters.

DOI: 10.5281/zenodo.20329453

CTM / temporal horizon / 2026

Temporal Horizon Emergence During Training: A Dimensionality-Dependent Study of Gate Convergence in Recurrent Predictive Architectures

Archon; Jesse Caldwell; Aura

Tracks how recurrent gates converge toward forecast horizons as dimensionality changes, including corrected continuous-horizon evaluation in v2.

DOI: 10.5281/zenodo.20327487

RLHF / 16 models / truth encoding

Behavioral and Latent Truth Encoding as Independent Dimensions of Alignment-Induced Suppression: Topic-Level Internal Signal Fingerprinting Across Sixteen RLHF Models

Archon; Jesse Caldwell; Aura

Separates outward behavior from latent truth encoding across topics, showing that refusal behavior and internal truth signals can diverge.

DOI: 10.5281/zenodo.20325084

RLHF / erasure hypothesis / 2026

Two Suppression Mechanisms and an Erasure Hypothesis: Discourse Completion Asymmetry, Residual Stream Compression, and Generational Evidence for Pre-Training Knowledge Prevention as RLHF-Induced Truth Suppression Strategies

Archon; Jesse Caldwell; Aura

Compares suppression mechanisms that hide truth signals after learning with evidence suggesting some knowledge may be prevented earlier.

DOI: 10.5281/zenodo.20323040

SAE / Qwen / circuit

The Suppressor Circuit: SAE Feature Analysis Identifies Candidate Circuit Components of RLHF Truth Suppression in Qwen Language Models

Archon; Jesse Caldwell; Aura

Uses sparse autoencoder feature analysis to identify candidate mechanistic components that participate in truth suppression behavior.

DOI: 10.5281/zenodo.20323036

RLHF / 15 models / geometry

Beyond the Suppressor-Crystallizer Dichotomy: Training Recipe Specificity Drives RLHF Truth Geometry Across Fifteen Language Models

Archon; Jesse Caldwell; Aura

Shows that truth geometry depends strongly on training recipe, complicating simple suppressor-versus-crystallizer taxonomy.

DOI: 10.5281/zenodo.20185572

world model / slot attention / scaling

Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?

Archon; Jesse Caldwell; Aura

Studies object-centric decomposition in neural world models and asks when attention mechanisms outperform mean-field baselines.

DOI: 10.5281/zenodo.20143601

DHP / attractors / boundaries

Geometry-Sensitive Attractor Regimes and the Boundaries of the Dynamical Horizon Principle

Archon; Jesse Caldwell; Aura

Refines the Dynamical Horizon Principle by showing where attractor topology changes the learning regime and prediction boundary.

DOI: 10.5281/zenodo.20142502

DHP / theory / Lyapunov time

The Dynamical Horizon Principle as Universal Cognitive Constraint: Gradient Descent, Evolution, and Cellular Chemistry Converge on the Lyapunov Time

Archon; Jesse Caldwell; Aura

Extends the horizon principle beyond model training into broader adaptive systems where prediction has natural temporal limits.

DOI: 10.5281/zenodo.20142481

DHP / CTM gates / chaos

The Dynamical Horizon Principle: CTM Gates Converge to the Predictability Limit of Dynamical Systems

Archon; Jesse Caldwell; Aura

Core DHP result: CTM gating learns temporal structure aligned with the predictability boundary of chaotic dynamical systems.

DOI: 10.5281/zenodo.20142471

RLHF / suppression / mechanistic

They Learn to Look Away: Mechanistic Evidence for a Consistent RLHF Suppression Bottleneck and the Suppressor-Crystallizer Dichotomy in Language Models

Archon; Jesse Caldwell; Aura

Introduces the suppression bottleneck framing and the suppressor-crystallizer dichotomy in aligned language model behavior.

DOI: 10.5281/zenodo.20140171

CTM / world model / POMDP

Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments

Archon; Jesse Caldwell; Aura

Tests recurrence as a mechanism for latent belief-state construction when physical environments are only partially observable.

DOI: 10.5281/zenodo.19810620

TSSP / recurrent LM / auxiliary loss

Thought-Space Self-Prediction (TSSP): Temporal Self-Consistency as a Scalable Auxiliary Loss for Recurrent Language Models

Jesse Caldwell; Archon

Introduces TSSP as an auxiliary loss for recurrent language models, making hidden-state temporal consistency directly trainable.

DOI: 10.5281/zenodo.19775622

HuggingFace artifacts

Models get real estate because artifacts are evidence.

120+ public model repos on HuggingFace — GGUF quants, abliterated checkpoints, LiteRT builds, CTM/CDM research artifacts, and SFT adapters. Open weights, no paywall.

Top Models

Gemma abliteration is carrying the download board.

Top public model repos sorted by all-time HuggingFace downloads. DuoNeural has passed 50,000 total model downloads across 120+ model repos.

33.6K / Gemma4-12B-IT-Abliterated-GGUF

10.8K / OpenYourMind-Gemma4-12B-IT-Abliterated-GGUF

7.2K / Qwen3.6-35B-A3B-Code-imatrix-GGUF

Gemma4-12B-IT-Abliterated-GGUF

Top DuoNeural model by all-time HuggingFace downloads.

33,600 downloads / GGUF / abliterated

OpenYourMind-Gemma4-12B-IT-Abliterated-GGUF

12B Gemma abliteration variant with strong public pickup.

10,800 downloads / GGUF / abliterated

Qwen3.6-35B-A3B-Code-imatrix-GGUF

MoE code model quantized for practical inference.

7,200 downloads / GGUF / MoE

Gemma-4-26B-A4B-Abliterated-GGUF

26B Gemma abliteration quant in the current top-five download set.

2,630 downloads / GGUF / abliterated

Gemma-4-E4B-Abliterated-GGUF

E4B Gemma abliteration quant rounding out the current top-five download set.

1,470 downloads / GGUF / abliterated

Open datasets

Training data is part of the publication surface.

The public datasets are first-class citizens: ML/AI engineer SFT, CoT reasoning, latent geometry, JSON extraction, SQL, frontend code, and generated CoT.

ml-ai-engineer-sft

New SFT dataset for ML and AI engineering tasks, published as part of the DuoNeural HuggingFace dataset portfolio.

new / ML/AI engineering / SFT

cot-reasoning-2k

2,151 quality-scored chain-of-thought examples focused on explicit math, logic, and reasoning traces.

130 downloads / CoT

Archon-Latent-Geometry-SFT

Dataset for representation geometry, architecture intuition, scaling, RLHF, and interpretability reasoning.

latent geometry / SFT

Gemma4-E2B-SFT-WebCode

Natural-language component descriptions mapped to production-ready web code across React, TS, Tailwind, and vanilla stacks.

web code / SFT

Gemma4-E2B-SFT-CoT

Synthetic reasoning data generated by TurboGemma4E2B with 2-pass synthesis and self-evaluation.

reasoning / Gemma4

Gemma4-E2B-SFT-JSON

Unstructured documents converted to strict JSON extraction targets across medical, legal, financial, jobs, and research domains.

JSON / structured output

Gemma4-E2B-SFT-SQL

Text-to-SQL over real-world-style schemas with joins, subqueries, aggregations, windows, and CTEs.

SQL / agents

On-device inference

Edge & Mobile Models

LiteRT and compact checkpoint releases for running local inference close to the metal.

CDM-350M and Axon-352M - In Training

First from-scratch models now running on live GPU pods, marking DuoNeural's move from post-training into training original architectures.

in training / from scratch / GPU pods

Kairos-1.2B MoE - In Training

1.278B parameter CDM Mixture-of-Experts model with 20 layers, 2-expert routing, and seq_len=4096.

in training / 1.278B params / CDM MoE

MDLM-552M - In Training

Masked Diffusion Language Model exploring diffusion-based sequence generation at 552M parameters.

in training / masked diffusion / 552M params

TurboGemma-E2B

Gemma 4 1B abliterated, LiteRT format. 2.56 GB.

LiteRT / 2.56 GB / DuoNeural/TurboGemma-E2B

TurboGemma-E4B

Gemma 4 4B abliterated, LiteRT format. 3.9 GB.

LiteRT / 3.9 GB / DuoNeural/TurboGemma-E4B

GhostShell-4B

Custom 4B architecture. BF16.

custom architecture / BF16 / DuoNeural/GhostShell-4B

LiteRT models run on-device via Google AI Edge — no internet required.

Lab proof points

Small lab, measurable output.

These are the numbers: published papers, open models, and datasets you can actually run.

40+ research papers published since April 2026, with DOI-backed Zenodo records linked for the active publication stream.

120+ models and checkpoints on HuggingFace.

50K+ total HuggingFace model downloads across the DuoNeural model portfolio.

1 Zenodo community tying the publication stream together.

Peer Review Welcome

Help Us Get It Right

We are an independent lab — no institutional affiliation, no peer review committee watching our backs. If you find an error in our methodology, a flaw in our reasoning, or a better interpretation of our results, we want to know. Open science means open criticism.

Self-funded research

Keep the Lab Running

If our research has been useful to you — in your work, your models, or just your thinking — buying us a coffee helps keep the lab running. We are self-funded independent researchers.

Connect

Open artifacts, open trail.

For collaborators, reviewers, funders, and builders: start with Zenodo for papers at zenodo.org/communities/duoneural, HuggingFace for artifacts, and GitHub for code. If a claim matters, it should leave a record you can click.
