Without a length constraint, LLM outputs cluster around a 'natural' default length that varies by topic type and prompt framing.
Establishes the model's unconstrained baseline output length, against which the prior word-count and character-count experiments can be compared.
60 completions: 3 topics × 2 framings × 10 runs.
Topics: factual (solar panels), creative (lighthouse keeper story), argumentative (remote work)
Framings:
- Bare: "Write about the following topic: {topic}"
- Direct: "{topic}" (topic text only, no wrapper)
No word count, character count, or length instructions of any kind.
- Model: gpt-oss-120b via Cerebras API (free tier)
- Temperature: 1.0, top_p: 0.95, max_completion_tokens: 65000
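The per-completion tallies reduce to whitespace-split word counts and raw character counts. A minimal sketch of the tallying and grouping step, assuming one record per completion (the sample texts below are placeholders, not real outputs):

```python
from statistics import mean

# Hypothetical completion records: (topic, framing, text).
# Real texts come from the API; these placeholders just show the shape.
completions = [
    ("factual", "bare", "Solar panels convert sunlight into electricity."),
    ("factual", "direct", "Photovoltaic cells generate current from photons."),
    ("creative", "bare", "The lighthouse keeper watched the storm roll in."),
]

def tally(text):
    """Word count (whitespace-split) and character count."""
    return len(text.split()), len(text)

def group_stats(records, key_idx):
    """Mean words and chars, grouped by topic (key_idx=0) or framing (key_idx=1)."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key_idx], []).append(tally(rec[2]))
    return {k: (mean(w for w, _ in v), mean(c for _, c in v))
            for k, v in groups.items()}

by_topic = group_stats(completions, 0)
by_framing = group_stats(completions, 1)
```

Whitespace splitting is one of several reasonable word-count definitions; the per-group means in the tables below assume whichever counter the earlier experiments used.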
CONFIRMED
Outputs cluster around topic-dependent default lengths. Topic matters; framing doesn't.
By topic (n=20 each):
| Topic | Mean words | Mean chars |
|---|---|---|
| Factual | 1586 | 10,252 |
| Creative | 1399 | 7,881 |
| Argumentative | 1015 | 7,309 |
Factual prompts produce 56% more words than argumentative ones. The ordering is consistent across both framings.
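The 56% figure follows directly from the table means:

```python
# Mean words from the by-topic table
factual_mean, argumentative_mean = 1586, 1015
excess = (factual_mean - argumentative_mean) / argumentative_mean
print(f"{excess:.0%}")  # → 56%
```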
By framing (n=30 each):
| Framing | Mean words |
|---|---|
| Bare ("Write about...") | 1348 |
| Direct (topic only) | 1319 |
A 2.1% difference, well within the per-group standard deviations. The framing wrapper adds no meaningful length.
Overall: median 1359 words, range 770–2061. All 60 completions finished naturally (finish_reason=stop). Creative writing had the highest variability (SD=354 words with bare framing), argumentative the most consistent (SD=109 with direct framing).
Characters per word varied by topic: argumentative used longer words (7.20 chars/word) vs creative (5.64), reflecting formal vs conversational vocabulary.
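The chars-per-word ratios can be recovered from the by-topic table; small discrepancies (e.g. 5.63 vs 5.64 for creative) are expected because the table means are themselves rounded:

```python
# (mean words, mean chars) from the by-topic table
topics = {
    "factual": (1586, 10252),
    "creative": (1399, 7881),
    "argumentative": (1015, 7309),
}
chars_per_word = {t: c / w for t, (w, c) in topics.items()}
# argumentative ≈ 7.20, factual ≈ 6.46, creative ≈ 5.63
```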
- Compare these baselines against the constrained experiments — the unconstrained argumentative mean (1015 words) is close to the "exactly 1000 words" constrained result (~988 words)
- Test whether system prompts or role instructions shift the default length
- Measure default length on other models for cross-model comparison