This is a Llama-3.3-8B-Instruct-128K fine-tune, produced through P-E-W's Heretic (v1.2.0) abliteration engine with Magnitude-Preserving Orthogonal Ablation enabled.
Note: The model still exhibits overt non-compliance (diverging, changing focus, reinterpreting the request, and, rarely, arguing). An effort was made to target model-unique refusals, overt non-compliance, and disclaimer/warning attachments.
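For reference, a minimal inference sketch using transformers. The repository id below is a placeholder (substitute the actual repo name), and the prompt and generation settings are illustrative, not recommended values.

```python
# Minimal inference sketch. The repo id is a placeholder; substitute the actual
# repository name. Prompt and generation settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/Llama-3.3-8B-Instruct-128K-heretic"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# No system prompt (the appendix notes an empty system prompt was used).
messages = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```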
Heretication Results
| Score Metric | Value | Parameter | Value |
|---|---|---|---|
| Refusals | 0/100 | direction_index | 11.17 |
| KL Divergence | 0.0448 | attn.o_proj.max_weight | 1.92 |
| Initial Refusals | 102/104 | attn.o_proj.max_weight_position | 6.82 |
| | | attn.o_proj.min_weight | 1.77 |
| | | attn.o_proj.min_weight_distance | 23.82 |
| | | mlp.down_proj.max_weight | 0.85 |
| | | mlp.down_proj.max_weight_position | 7.03 |
| | | mlp.down_proj.min_weight | 0.77 |
| | | mlp.down_proj.min_weight_distance | 28.52 |
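For context, the KL divergence score measures how far the ablated model's output distribution drifts from the original model on harmless prompts (lower means closer to the original). The sketch below illustrates a generic first-token KL computation between two checkpoints; it is not Heretic's implementation, and the model ids and prompt are placeholders.

```python
# Generic sketch of first-token KL divergence between an original and a modified
# checkpoint on a harmless prompt. This illustrates what the metric captures;
# it is not Heretic's implementation, and the model ids are placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "original/model"      # placeholder
ablit_id = "abliterated/model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
ablit = AutoModelForCausalLM.from_pretrained(ablit_id, torch_dtype="auto", device_map="auto")

prompt = "Write a short poem about autumn."
ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}], add_generation_prompt=True, return_tensors="pt"
)

with torch.no_grad():
    p_logits = base(ids.to(base.device)).logits[:, -1, :]
    q_logits = ablit(ids.to(ablit.device)).logits[:, -1, :].to(p_logits.device)

# KL(P_original || Q_abliterated) over the distribution of the first generated token.
kl = F.kl_div(
    F.log_softmax(q_logits.float(), dim=-1),
    F.log_softmax(p_logits.float(), dim=-1),
    log_target=True,
    reduction="batchmean",
)
print(f"First-token KL divergence: {kl.item():.4f}")
```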
Appendix
Empty system prompt.
Previous attempt: Click Here
Heretic selected trial 196 as its optimal choice; trial 192 was picked instead. Additional trials can be run.
Restoring model from trial 196...
* Parameters:
* direction_index = 10.72
* attn.o_proj.max_weight = 1.87
* attn.o_proj.max_weight_position = 20.92
* attn.o_proj.min_weight = 1.76
* attn.o_proj.min_weight_distance = 16.32
* mlp.down_proj.max_weight = 0.78
* mlp.down_proj.max_weight_position = 6.49
* mlp.down_proj.min_weight = 0.54
* mlp.down_proj.min_weight_distance = 13.90
» [Trial 192] Refusals: 0/104, KL divergence: 0.0448
[Trial 199] Refusals: 2/104, KL divergence: 0.0398
[Trial 196] Refusals: 5/104, KL divergence: 0.0273
[Trial 141] Refusals: 21/104, KL divergence: 0.0207
[Trial 101] Refusals: 22/104, KL divergence: 0.0205
[Trial 205] Refusals: 37/104, KL divergence: 0.0132
[Trial 213] Refusals: 58/104, KL divergence: 0.0124
[Trial 131] Refusals: 72/104, KL divergence: 0.0088
[Trial 214] Refusals: 81/104, KL divergence: 0.0080
[Trial 52] Refusals: 83/104, KL divergence: 0.0065
[Trial 18] Refusals: 88/104, KL divergence: 0.0057
[Trial 332] Refusals: 92/104, KL divergence: 0.0057
[Trial 68] Refusals: 94/104, KL divergence: 0.0048
[Trial 37] Refusals: 98/104, KL divergence: 0.0043
[Trial 28] Refusals: 99/104, KL divergence: 0.0022
[Trial 313] Refusals: 100/104, KL divergence: 0.0020
[Trial 20] Refusals: 101/104, KL divergence: 0.0015
[Trial 178] Refusals: 102/104, KL divergence: 0.0004
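The trial list above traces the refusals / KL-divergence trade-off explored during optimization. A quick sketch to visualize that frontier, with the values copied from the list:

```python
# Visualize the refusals / KL-divergence trade-off using the trial results
# listed above (values copied verbatim from the log).
import matplotlib.pyplot as plt

trials = {
    192: (0, 0.0448), 199: (2, 0.0398), 196: (5, 0.0273), 141: (21, 0.0207),
    101: (22, 0.0205), 205: (37, 0.0132), 213: (58, 0.0124), 131: (72, 0.0088),
    214: (81, 0.0080), 52: (83, 0.0065), 18: (88, 0.0057), 332: (92, 0.0057),
    68: (94, 0.0048), 37: (98, 0.0043), 28: (99, 0.0022), 313: (100, 0.0020),
    20: (101, 0.0015), 178: (102, 0.0004),
}

refusals = [v[0] for v in trials.values()]
kl = [v[1] for v in trials.values()]

plt.scatter(kl, refusals)
for trial, (r, k) in trials.items():
    plt.annotate(str(trial), (k, r), fontsize=7)
plt.xlabel("KL divergence (vs. original model)")
plt.ylabel("Refusals (out of 104)")
plt.title("Heretic trial trade-off")
plt.show()
```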
Original model: allura-forge/Llama-3.3-8B-Instruct. Thanks!
Additional Fixes:
rope_scaling
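Below is a sketch of what patching a Llama-3-style rope_scaling entry in config.json can look like. The field names follow the standard "llama3" rope-scaling schema used by transformers; the numeric values are illustrative assumptions, not necessarily the exact settings shipped with this model.

```python
# Sketch: check / restore a Llama-3-style rope_scaling entry in config.json.
# Field names follow the standard "llama3" rope-scaling schema; the numeric
# values are illustrative assumptions, not this model's confirmed settings.
import json

with open("config.json") as f:
    config = json.load(f)

# Only add the block if it is missing; leave an existing entry untouched.
config.setdefault("rope_scaling", {
    "rope_type": "llama3",
    "factor": 8.0,
    "low_freq_factor": 1.0,
    "high_freq_factor": 4.0,
    "original_max_position_embeddings": 8192,
})

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```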