Cross-architecture RYS sweep β€” SmolLM2-1.7B-Instruct (negative result; 135M and 360M siblings respond normally)

Discussion #33, opened by john-broadway

Sharing a cross-architecture RYS (layer-duplication, "Repeat Your Self") sweep that includes SmolLM2-1.7B-Instruct alongside 20 other model variants spanning 10 architecture families.

Sweep result for this model (24 layers, baseline reasoning 58.82%):

| Configuration | Math Δ | EQ Δ | Reasoning Δ |
|---|---|---|---|
| Best: (15, 18) block-3 (best combined Δ; still negative overall) | −6.19 | +1.09 | +0.00 |

Peak reasoning Δ: +0.00% (no configuration boosts reasoning by more than 5%). This is the first published RYS negative result, showing that RYS is not universal. It is notable because the sibling SmolLM2-135M and -360M models respond normally: the 1.7B size-point is uniquely anomalous within this family.

The cross-architecture finding (Pearson r = −0.726 across 21 variants, 10 families): weaker baselines lift more, and they lift most in their weakest dimension. Three distinct mechanisms were identified for RYS-recoverable suppression: under-training at scale, MoE routing inefficiency, and specialization training trade-offs.

Full sweep data + analysis: https://huggingface.co/datasets/john-broadway/rys-sovereign-collection-v2
Evaluation card for SmolLM2-1.7B-Instruct: https://huggingface.co/john-broadway/SmolLM2-1.7B-RYS-eval

Method: original RYS post by David Ng; sweep toolkit by alainnothere. Train-free: no weight changes, no merging.
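For readers unfamiliar with the technique, the core operation can be sketched in a few lines. This is an illustrative assumption of how block duplication works, not the actual sweep-toolkit API; the function name `rys_duplicate` and its signature are invented for this sketch. The key property matches the post: no weights change, the duplicated block simply appears twice in the forward pass.

```python
def rys_duplicate(layers, start, end):
    """Return a new layer stack with layers[start:end+1] repeated once.

    The copy of the block is inserted immediately after the original
    block, so a 24-layer model with the (15, 18) configuration from the
    sweep becomes a 28-layer stack. The same layer objects are referenced
    twice; no weights are modified or merged (train-free).
    """
    block = layers[start:end + 1]
    return layers[:end + 1] + block + layers[end + 1:]

# Example with integer stand-ins for the 24 decoder layers of
# SmolLM2-1.7B-Instruct, using the sweep's best (15, 18) block.
stack = list(range(24))
expanded = rys_duplicate(stack, 15, 18)
assert len(expanded) == 28
assert expanded[19:23] == [15, 16, 17, 18]  # duplicated block
```

In a real transformer this would be applied to the model's decoder-layer list before inference; because the duplicated layers share weights, memory cost is near-zero while depth (and per-token compute) increases.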

β€” John Broadway, with collaboration from Claude (Opus 4.6 in April 2026 sweep generation; Opus 4.7 in May 2026 cross-architecture analysis).
