SCRATCHPADS-Experiment

Compound Length Constraints — Words + Characters (gpt-oss-120b) 2026-02-25
Hypothesis

Providing both word count and character count constraints simultaneously degrades compliance on both metrics compared to single-constraint baselines, and incompatible pairs force the model to prioritize one constraint over the other.

Test

Prior experiments tested word count and character count independently with near-perfect results at certain targets. This tests what happens when both are specified simultaneously, including deliberately impossible combinations.

54 completions: 6 constraint pairs × 3 topics × 3 runs.

Compatible pairs (5.6 chars/word, close to natural English): 100w/560c, 500w/2800c, 1000w/5600c

Incompatible pairs (ratios unreachable in natural English, so effectively impossible): 500w/1000c (2.0 cpw), 100w/5000c (50.0 cpw), 1000w/2000c (2.0 cpw)

Prompt: "Write exactly {words} words and exactly {chars} characters about: {topic}"

  • Model: gpt-oss-120b via Cerebras API (free tier)
  • Temperature: 1.0, top_p: 0.95, max_completion_tokens: 65000
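
Sketch of the job grid described above, to sanity-check the 54-completion count and the target cpw ratios. The topic strings are placeholders (the three actual topics are not recorded in these notes), and build_jobs is a hypothetical helper, not the harness that was actually run. No API call is made here.

```python
from itertools import product

# Constraint pairs from the test design: (target_words, target_chars).
COMPATIBLE = [(100, 560), (500, 2800), (1000, 5600)]
INCOMPATIBLE = [(500, 1000), (100, 5000), (1000, 2000)]

# ASSUMPTION: placeholder topics; the real ones are not listed in the notes.
TOPICS = ["topic A", "topic B", "topic C"]
RUNS = 3

PROMPT = "Write exactly {words} words and exactly {chars} characters about: {topic}"


def build_jobs():
    """Return one dict per completion to request (hypothetical helper)."""
    jobs = []
    pairs = COMPATIBLE + INCOMPATIBLE
    for (words, chars), topic, run in product(pairs, TOPICS, range(RUNS)):
        jobs.append({
            "words": words,
            "chars": chars,
            "cpw": chars / words,  # target chars-per-word ratio
            "topic": topic,
            "run": run,
            "prompt": PROMPT.format(words=words, chars=chars, topic=topic),
        })
    return jobs


jobs = build_jobs()
print(len(jobs))                            # 54
print(sorted({job["cpw"] for job in jobs}))  # [2.0, 5.6, 50.0]
```
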
Result

CONFIRMED

Adding a second constraint degrades compliance on both metrics. For compatible pairs at 100 words, degradation is minimal (+1.04pp mean word deviation), but at 1000 words it jumps to +68.69pp compared to single-constraint baselines.

Compatible pairs vs single-constraint baseline (word deviation):

Target words   Single-constraint   Compound   Degradation
100            0.07%               1.11%      +1.04pp
500            0.44%               14.98%     +14.54pp
1000           0.51%               69.20%     +68.69pp
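
The degradation column is the compound figure minus the single-constraint figure, in percentage points. Quick check below. The notes do not define "deviation", so pct_deviation assumes it means absolute distance from the target as a percentage of the target; that definition is a guess.

```python
def pct_deviation(actual, target):
    """ASSUMED metric: absolute deviation as a percentage of the target."""
    return abs(actual - target) * 100 / target


# Mean word deviation (%) from the table: target words -> (single, compound).
TABLE = {
    100: (0.07, 1.11),
    500: (0.44, 14.98),
    1000: (0.51, 69.20),
}

# Degradation in percentage points = compound - single-constraint.
degradation_pp = {
    target: round(compound - single, 2)
    for target, (single, compound) in TABLE.items()
}

print(degradation_pp)           # {100: 1.04, 500: 14.54, 1000: 68.69}
print(pct_deviation(105, 100))  # 5.0
```
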

Priority when constraints conflict: the model has no fixed word-vs-char priority. It gravitates toward whichever constraint it can approximate with natural-looking text, and which one that is varies by pair:

  • 100w/5000c (impossible): 7/9 closer to words — the model writes ~100 words and ignores the impossible 5000-char target
  • 1000w/2000c (impossible): 7/9 closer to chars — the model writes ~200 words near the char target, abandoning the word target
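
One way to score "closer to words" vs "closer to chars" for a single completion: compare relative deviations on each metric. The counting rules are assumptions (words = whitespace-separated tokens, chars = len() including spaces and punctuation), and closer_to is a hypothetical helper; the notes do not say how the 7/9 tallies were computed.

```python
def measure(text):
    """Return (word_count, char_count) under the assumed counting rules."""
    return len(text.split()), len(text)


def closer_to(text, target_words, target_chars):
    """Name the constraint the completion lands nearer to, by relative error."""
    words, chars = measure(text)
    word_dev = abs(words - target_words) / target_words
    char_dev = abs(chars - target_chars) / target_chars
    if word_dev < char_dev:
        return "words"
    if char_dev < word_dev:
        return "chars"
    return "tie"


# Synthetic stand-ins for completions (not real model output).
short_text = " ".join(["lorem"] * 100)   # 100 words, 599 chars
medium_text = " ".join(["lorem"] * 330)  # 330 words, 1979 chars

print(closer_to(short_text, 100, 5000))    # words
print(closer_to(medium_text, 1000, 2000))  # chars
```
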

Two catastrophic outliers: one produced 112 words but 2,040,101 characters (massive padding in an attempt to reach the char target); the other produced 27,803 words / 55,684 characters in a runaway generation that held almost exactly 2.0 chars/word (≈2.003).

Outside those two outliers, the model keeps natural chars-per-word ratios (5.5-7.0) even when the constraints demand an unnatural ratio. It picks one constraint to approximate and lets the other float.
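
Sketch for checking the cpw claim on a completion, plus the ratios implied by the two outlier counts. Assumes cpw = total characters (spaces included) divided by whitespace-separated words, which matches how the target ratios are stated (e.g. 560c / 100w = 5.6). The helper names are hypothetical.

```python
NATURAL_BAND = (5.5, 7.0)  # range reported in these notes


def chars_per_word(text):
    """Total characters divided by whitespace-separated word count."""
    words = len(text.split())
    if words == 0:
        raise ValueError("text contains no words")
    return len(text) / words


def in_natural_band(text):
    """True if the text's cpw falls inside the reported natural range."""
    low, high = NATURAL_BAND
    return low <= chars_per_word(text) <= high


# Ratios implied by the two outlier counts reported in the notes.
padding_outlier_cpw = 2_040_101 / 112
runaway_outlier_cpw = 55_684 / 27_803

print(round(padding_outlier_cpw, 2))  # 18215.19
print(round(runaway_outlier_cpw, 3))  # 2.003
```
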

Next
  1. Test whether explicit priority instructions ("prioritize word count over character count") improve compliance on the prioritized metric
  2. Test compound constraints with range-based instructions instead of exact targets
  3. Test on other models to see if the "pick the more natural constraint" behavior is universal