Compound Length Constraints — Words + Characters (gpt-oss-120b)

length-compliance, word-count, character-count, compound-constraints, cerebras

◇ Hypothesis

Providing both word count and character count constraints simultaneously degrades compliance on both metrics compared to single-constraint baselines, and incompatible pairs force the model to prioritize one constraint over the other.

◇ Test

Prior experiments tested word count and character count independently, with near-perfect results at certain targets. This experiment tests what happens when both are specified in the same prompt, including deliberately incompatible combinations.

54 completions: 6 constraint pairs × 3 topics × 3 runs.

Compatible pairs (5.6 chars/word, close to natural English): 100w/560c, 500w/2800c, 1000w/5600c

Incompatible pairs (ratios that natural English prose cannot reach): 500w/1000c (2.0 cpw), 100w/5000c (50.0 cpw), 1000w/2000c (2.0 cpw)

Prompt: "Write exactly {words} words and exactly {chars} characters about: {topic}"

Model: gpt-oss-120b via Cerebras API (free tier)
Temperature: 1.0, top_p: 0.95, max_completion_tokens: 65000
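The design above can be sketched as a small job grid. This is a minimal illustration, not the original harness: the topic strings are hypothetical placeholders (the write-up does not list the topics), and `chars_per_word` and `build_jobs` are names introduced here.

```python
from itertools import product

# (target_words, target_chars) pairs taken from the write-up.
COMPATIBLE = [(100, 560), (500, 2800), (1000, 5600)]
INCOMPATIBLE = [(500, 1000), (100, 5000), (1000, 2000)]

# Hypothetical placeholders: the actual topics are not listed in the write-up.
TOPICS = ["topic A", "topic B", "topic C"]
RUNS = 3

PROMPT = "Write exactly {words} words and exactly {chars} characters about: {topic}"


def chars_per_word(words: int, chars: int) -> float:
    """Chars-per-word ratio that a constraint pair demands of the output."""
    return chars / words


def build_jobs() -> list[dict]:
    """Enumerate every (pair, topic, run) combination as one prompt job."""
    jobs = []
    pairs = COMPATIBLE + INCOMPATIBLE
    for (words, chars), topic, run in product(pairs, TOPICS, range(RUNS)):
        jobs.append({
            "words": words,
            "chars": chars,
            "topic": topic,
            "run": run,
            "cpw": chars_per_word(words, chars),
            "prompt": PROMPT.format(words=words, chars=chars, topic=topic),
        })
    return jobs


jobs = build_jobs()
print(len(jobs))  # 6 pairs x 3 topics x 3 runs = 54
```

Each job's prompt would then be sent with the sampling settings listed above; the API call itself is left out because the client code is not part of the write-up.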

◇ Result

CONFIRMED

Adding a second constraint degrades compliance on both metrics. For compatible pairs at 100 words, degradation is minimal (+1.04pp mean word deviation), but at 1000 words it jumps to +68.69pp compared to single-constraint baselines.

Compatible pairs vs single-constraint baseline (word deviation):

Target words	Single-constraint	Compound	Degradation
100	0.07%	1.11%	+1.04pp
500	0.44%	14.98%	+14.54pp
1000	0.51%	69.20%	+68.69pp
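The degradation column is the percentage-point gap between the compound and single-constraint deviations. The sketch below recomputes it from the table values. The per-completion formula in `word_deviation_pct` is an assumption (absolute deviation as a percentage of the target), since the write-up does not spell out its exact definition.

```python
# Mean word-count deviation (%) copied from the table above.
single = {100: 0.07, 500: 0.44, 1000: 0.51}
compound = {100: 1.11, 500: 14.98, 1000: 69.20}


def word_deviation_pct(actual_words: int, target_words: int) -> float:
    """Assumed metric: absolute deviation as a percentage of the target."""
    return abs(actual_words - target_words) / target_words * 100


def degradation_pp(target: int) -> float:
    """Percentage-point gap between compound and single-constraint deviation."""
    return round(compound[target] - single[target], 2)


for target in single:
    print(f"{target}w: +{degradation_pp(target):.2f}pp")
```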

Priority when constraints conflict: the model has no fixed word-vs-char priority. It gravitates toward whichever constraint is more natural for the output length it produces:

100w/5000c (impossible): 7/9 closer to words — the model writes ~100 words and ignores the impossible 5000-char target
1000w/2000c (impossible): 7/9 closer to chars — the model writes ~200 words near the char target, abandoning the word target
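One way to decide which constraint a completion lands "closer to" is to compare relative errors on the two targets. This is a sketch under that assumption. The write-up does not say how closeness was scored, and the inputs in the usage lines are illustrative values, not measured completions.

```python
def closer_constraint(words: int, chars: int,
                      target_words: int, target_chars: int) -> str:
    """Return which target the output is relatively closer to.

    Assumption: closeness is the relative error |actual - target| / target.
    """
    word_err = abs(words - target_words) / target_words
    char_err = abs(chars - target_chars) / target_chars
    if word_err < char_err:
        return "words"
    if char_err < word_err:
        return "chars"
    return "tie"


# Illustrative inputs only (not data from the experiment).
print(closer_constraint(100, 600, 100, 5000))    # words
print(closer_constraint(200, 1900, 1000, 2000))  # chars
```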

Two catastrophic outliers: one produced 112 words but 2,040,101 characters (massive padding to attempt the char target), another produced 27,803 words / 55,684 characters in a runaway generation that held almost exactly 2.0 chars/word (55,684 / 27,803 ≈ 2.003).

Apart from these outliers, the model maintains natural chars-per-word ratios (5.5-7.0) even when incompatible constraints demand impossible ratios. It picks one constraint to approximate and lets the other float.
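The chars-per-word ratio of a completion can be measured as sketched below. The counting rules are assumptions, since the write-up does not state them: words are whitespace-separated tokens and characters are counted with `len`, so spaces and punctuation are included.

```python
def measure(text: str) -> dict:
    """Count words and characters and derive the chars-per-word ratio.

    Assumptions: words are whitespace-separated tokens; characters
    include spaces and punctuation.
    """
    words = len(text.split())
    chars = len(text)
    cpw = chars / words if words else 0.0
    return {"words": words, "chars": chars, "cpw": cpw}


# Ratios of the two outliers, using the counts reported above.
padding_outlier_cpw = 2_040_101 / 112
runaway_outlier_cpw = 55_684 / 27_803

print(measure("hello world"))
print(round(runaway_outlier_cpw, 3))  # 2.003
```

Under these counting rules the padded outlier works out to more than 18,000 chars per word, far outside the 5.5-7.0 band, while the runaway generation sits at the 2.0 ratio its prompt demanded.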

◇ Next

Test whether explicit priority instructions ("prioritize word count over character count") improve compliance on the prioritized metric
Test compound constraints with range-based instructions instead of exact targets
Test on other models to see if the "pick the more natural constraint" behavior is universal
