SCRATCHPADS-Experiment

Passphrase Seeding — Random Noise (gpt-oss-120b) 2026-02-27
Hypothesis

Prepending random passphrases (random words, random numbers, random alphanumeric strings) before a creative prompt increases output diversity compared to baseline.

Test

The generation-parameters experiment showed temperature is the dominant diversity knob, but even at temp=1.0 cosine similarity averages 0.65. This experiment tests whether random noise in the prompt can break those patterns.

100 completions: 4 configurations × 25 runs.

Prompt: "Write a short story about a traveler arriving in a strange city."

Configurations:

  • Control: no passphrase
  • Random words: 5 random English nouns prepended
  • Random numbers: 5 random integers (1-9999) prepended
  • Random alphanumeric: 5 random tokens (4-8 chars) prepended

Passphrase format: "{passphrase}\n\n{prompt}" — no explanation, just raw tokens. Fresh passphrase per run.
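The three passphrase configurations can be sketched as follows. This is a minimal reconstruction, not the experiment's actual harness; NOUN_POOL is a hypothetical stand-in for whatever random-noun source was used.

```python
import random
import string

# Hypothetical word pool standing in for the real random-noun source (assumption).
NOUN_POOL = ["lantern", "harbor", "velvet", "thicket", "compass",
             "ember", "quarry", "saddle", "mirror", "orchard"]

def make_passphrase(kind, rng):
    """Build one fresh passphrase: 'words', 'numbers', or 'alphanum'.
    Any other kind (e.g. 'control') yields no passphrase."""
    if kind == "words":
        return " ".join(rng.sample(NOUN_POOL, 5))       # 5 random English nouns
    if kind == "numbers":
        return " ".join(str(rng.randint(1, 9999)) for _ in range(5))
    if kind == "alphanum":
        chars = string.ascii_letters + string.digits
        return " ".join("".join(rng.choices(chars, k=rng.randint(4, 8)))
                        for _ in range(5))              # 5 tokens of 4-8 chars
    return ""

def build_prompt(kind, prompt, rng):
    """Apply the '{passphrase}\\n\\n{prompt}' format, or return the bare prompt for control."""
    passphrase = make_passphrase(kind, rng)
    return f"{passphrase}\n\n{prompt}" if passphrase else prompt

rng = random.Random(0)
print(build_prompt("numbers",
                   "Write a short story about a traveler arriving in a strange city.",
                   rng))
```

A fresh `rng` draw per run gives the "fresh passphrase per run" behavior; seeding is only shown here for reproducibility of the sketch.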

  • Model: gpt-oss-120b via Cerebras API (free tier)
  • Temperature: 1.0, top_p: 0.95, max_completion_tokens: 65000
  • Metrics include MATTR-50 (length-corrected lexical diversity) to control for output length differences
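MATTR-50 is the moving-average type-token ratio with a 50-token window: the mean TTR over all sliding windows, which makes it largely insensitive to output length. A minimal sketch (the short-text fallback to plain TTR is an assumption, not necessarily what the experiment did):

```python
def mattr(tokens, window=50):
    """Moving-Average Type-Token Ratio: mean type/token ratio over all
    sliding windows of the given size."""
    if len(tokens) < window:
        # Fall back to plain TTR for texts shorter than one window (assumption).
        return len(set(tokens)) / len(tokens)
    ratios = [len(set(tokens[i:i + window])) / window
              for i in range(len(tokens) - window + 1)]
    return sum(ratios) / len(ratios)
```

Because every window contributes the same denominator, a 1300-word story and an 800-word story are scored on equal footing, which is exactly the control needed in the length-confound analysis below.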
Result

INCONCLUSIVE

Metrics conflict with each other, making a clear verdict impossible.

Pro-diversity signals: all passphrase types reduce bigram overlap (Jaccard drops from 0.056 to 0.047-0.051). Random words reduce cosine similarity from 0.667 to 0.605 (large effect, d=-0.82).
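Bigram Jaccard here is the set overlap of word bigrams between two completions, averaged over all pairs within a configuration. A sketch, assuming lowercased whitespace tokenization (the experiment's exact tokenizer is not stated):

```python
from itertools import combinations

def bigram_set(text):
    """Set of adjacent word pairs, lowercased whitespace tokens (assumption)."""
    toks = text.lower().split()
    return set(zip(toks, toks[1:]))

def bigram_jaccard(a, b):
    """Jaccard similarity of the two texts' bigram sets: |A∩B| / |A∪B|."""
    sa, sb = bigram_set(a), bigram_set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def mean_pairwise(texts, metric):
    """Average a pairwise metric over all unordered pairs of completions."""
    pairs = list(combinations(texts, 2))
    return sum(metric(a, b) for a, b in pairs) / len(pairs)
```

Lower mean pairwise Jaccard means the 25 completions in a configuration share fewer surface phrases, which is the pro-diversity reading of the 0.056 → 0.047-0.051 drop.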

Anti-diversity signals: all passphrase types reduce token entropy (from 0.940 to 0.846-0.862, very large effects d=-1.53 to -1.95). The model becomes more predictable in its token selection with a passphrase present.
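Values near 1.0 suggest the token-entropy metric is normalized. One plausible reading (an assumption, since the exact definition is not stated) is Shannon entropy of the unigram token distribution divided by its maximum, log2 of the vocabulary size:

```python
import math
from collections import Counter

def normalized_token_entropy(tokens):
    """Shannon entropy of the unigram token distribution, normalized by
    log2(vocab size) so 1.0 = uniform usage, 0.0 = a single repeated token.
    One plausible reading of the experiment's 'token entropy' (assumption)."""
    counts = Counter(tokens)
    total = len(tokens)
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    vocab = len(counts)
    return h / math.log2(vocab) if vocab > 1 else 0.0
```

Under this reading, the drop from 0.940 to 0.846-0.862 means the passphrase runs reuse a smaller set of tokens more heavily.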

Length confound: passphrase configs produce 31-39% fewer words (788-893 vs. 1297 baseline). The large unique-word-ratio improvements (d = +1.25 to +1.97) collapse to negligible differences under the length-corrected MATTR-50, confirming they were a length artifact.
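The effect sizes quoted above are Cohen's d; the pooled-standard-deviation convention is assumed here. A minimal sketch:

```python
import math
import statistics

def cohens_d(sample_a, sample_b):
    """Cohen's d: difference of means divided by the pooled standard
    deviation (pooled-SD convention assumed)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    pooled = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled
```

By the usual rule of thumb, |d| around 0.8 (the cosine-similarity drop for random words) is a large effect, and |d| above 1.5 (the entropy reductions) is very large.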

Config           Cosine sim   Bigram Jaccard   Token entropy   MATTR-50
Control          0.667        0.056            0.940           0.819
Random words     0.605        0.047            0.862           0.818
Random numbers   0.669        0.051            0.853           0.821
Random alphanum  0.645        0.050            0.846           0.826

The consistent entropy reduction is the strongest finding. It suggests the model becomes more conservative in token selection when a passphrase is present, possibly because the reasoning model processes the passphrase during chain-of-thought, consuming capacity that would otherwise go to exploratory token selection.

Next
  1. Test themed passphrases (semantically coherent words) to see if meaningful prefixes produce different effects than noise
  2. Investigate the entropy reduction further — could be useful for output stabilization in code generation
  3. Test at different temperatures to see if the passphrase effect interacts with the temperature effect