Pikumo · probe image-gen pipeline · aspect ratio A/B/C
What's the right aspect ratio for a story panel?
Same 4-beat grief story, same character photo, same style, same gpt-5.4-mini
orchestrator with <story_source> attached. The only variable is
the image_generation size parameter: 1024×1024 (square, current
production default), 1024×1536 (portrait 2:3), 1536×1024 (landscape 3:2).
All panels rendered natively at that size — not resized post-hoc. Below each panel: which composition genuinely "won" the beat (green pill) vs runner-up (faded green).
Beat 1 — Entryway
runner-up
winner
Square crops Mei into a portrait shot — the hall is barely visible. Portrait stretches the hall deep into the apartment, terracotta tiles + key clearly visible. Landscape goes further: reveals the coat hook on the left wall, scarf detail, and the kitchen at the end of the hall. Landscape reads as "she is in a place" rather than "she is centered in a frame."
Beat 2 — Kitchen (one bowl, one spoon, one teacup)
runner-up
winner
Square turns the kitchen into a stage behind Mei. Portrait gives the full vertical kitchen — range hood, stove with kettle, sage-green curtain, dish rack with bowl & cup. Landscape reveals a red sitting nook on the right we never see otherwise — the apartment reads as a lived-in space, not just a kitchen. Counter-intuitive but consistent finding: landscape on a small interior feels expansive, not cramped, because interior architecture is horizontal.
Beat 3 — Bedroom (sleeve of the green sweater)
winner
runner-up
The one beat portrait clearly wins. Vertical light from the window → seated body → green sweater sleeve to face → pink bedspread anchoring the bottom — feels like a page from a picture book. Landscape is cinematic but loses the storybook intimacy. Square crops the bed entirely and squeezes the closet detail.
Beat 4 — Leaving
runner-up
winner
Square reverts to the "smiling at camera in a fluorescent box" failure mode from the story-source probe. Portrait gives a closed apartment door on the left and twilight city through the open-air walkway on the right. Landscape reveals both walls of the walkway plus the full skyline — the most evocative of the three on this corridor geometry. Reads as departure, not portrait.
The numbers
| metric | square 1:1 | portrait 2:3 | landscape 3:2 |
|---|---|---|---|
| size (px) | 1024×1024 | 1024×1536 | 1536×1024 |
| wall seconds (direct OpenAI) | 137.4 | 143.2 | 135.2 |
| input tokens | 9,273 | 11,321 | 11,003 |
| cached tokens | 0 | 1,792 | 1,792 |
| reasoning tokens | 1,239 | 661 | 486 |
| output tokens | 2,039 | 1,674 | 1,327 |
| image cost (low qual, est.) | $0.044 | $0.064 | $0.064 |
| panels delivered | 4/4 | 4/4 | 4/4 |
All three runs: gpt-5.4-mini · OpenAI direct · image_generation quality: low · same prompt (variant B from story-source probe). Latency parity is the headline — non-square ratios are not meaningfully slower.
Verdict
Retire square as the default. The 1:1 crop forces character-dominates-frame on every beat — environment is always sacrificed. It's a worse composition for every beat tested.
Default to landscape 3:2 (1536×1024). Wins beats 1, 2, 4 outright. Strong second on beat 3. Reveals architectural context — corridors, kitchens, walls, beds, and outdoor scenes all have horizontal geometry, and landscape lets that breathe. Reads as "she is in a place" rather than "she is centered in a frame."
Hold portrait 2:3 (1024×1536) as a per-story or per-beat option. Wins on intimate interior beats (the bedroom) where the vertical composition reads as a picture-book page. Worth a follow-up probe that mixes the two within a single story — maybe per-beat: outdoor / wide → landscape; close intimate → portrait.
Cost +45% per image vs square. $0.044 → $0.064 per 4-panel story at low quality. Negligible against orchestrator + extract costs. Latency is unchanged (~140s on direct OpenAI).
Caveat. This is one story (interior-heavy grief). Trip stories with outdoor vistas would lean even harder landscape. Pure single-character intimate-moment stories might favor portrait. Worth a second probe on a trip-shaped story before fully committing — but landscape is the safe default move now.