Guide · AI image API readable text

An image API where the headline is real text, not pixels

Most AI image tools render text as diffusion blur. This API renders headlines and CTAs as HTML/CSS on top of the visual, so letters are spelled exactly as typed and stay sharp at the output resolution.

Diffusion models do not draw letters; they draw pixels that resemble letters. That is why general-purpose image generators such as DALL·E, Midjourney, and Ideogram tend to produce something like 'M4RK3T1NG' instead of 'MARKETING' once the text runs past one or two words. For ad creative, this means you generate the visual and then retype the headline in Photoshop.

This guide explains how the 42rows image creative API works around the problem. The visual is generated by Imagen 4 (a diffusion model). The headline, stat, and CTA are then composited as an HTML/CSS layer on top, using a real font renderer. Every letter comes from actual glyph outlines, so the text is crisp in the final image.

Step by step

  1. Two-stage pipeline

    Internally the request becomes: (1) Imagen 4 generates the background and visual elements at the target aspect ratio; (2) a Playwright-based compositor renders the headline, subheadline, and CTA as HTML/CSS and merges them onto the image. The output you receive is the merged PNG — you do not see the two stages.

  2. Be explicit about the copy

    The compositor renders exactly what you put in the prompt for the headline and CTA fields. Treat them like form inputs: type the literal sentence you want to read on the ad. The art director, the internal step that interprets the brief, extracts them from the prompt, so putting them in quotes helps it tell copy apart from instructions (for example, Headline: "Build streaks that stick").

  3. Test at thumb-stop scale

    Letters that read at 100% may still feel small at 30% (mobile feed). For the LinkedIn 1200×628 case, headlines longer than 9 words start to look cramped. Test the output at 30% zoom in your browser to approximate how it will display in feed.
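Steps 2 and 3 can be sketched as a small pre-flight helper. This is a minimal illustration, not part of the API: the function names are hypothetical, swapping embedded double quotes for single quotes is an assumption about what keeps the fields unambiguous, and the 9-word threshold is the one this guide gives for the 1200×628 LinkedIn case only.

```python
def quote_field(text):
    """Wrap copy in double quotes; swap embedded double quotes for single ones.

    Assumption: the swap keeps field boundaries clear for the prompt parser.
    """
    return '"' + text.replace('"', "'") + '"'


def build_prompt(headline, cta, fmt, style=None):
    """Assemble a brief with the literal headline and CTA in quotes."""
    parts = [
        f"{fmt} ad.",
        f"Headline: {quote_field(headline)}.",
        f"CTA: {quote_field(cta)}.",
    ]
    if style:
        parts.append(f"Style: {style}.")
    return " ".join(parts)


def is_cramped(headline, max_words=9):
    """True when the headline is longer than the guide's 9-word threshold."""
    return len(headline.split()) > max_words


def feed_preview_size(width, height, zoom=0.30):
    """Pixel size of the creative when viewed at the given zoom level."""
    return round(width * zoom), round(height * zoom)


if __name__ == "__main__":
    print(build_prompt("Stop the meeting marathon", "Try async reviews", "1080x1080"))
    print(is_cramped("Stop the meeting marathon"))
    print(feed_preview_size(1200, 628))
```

At 30% zoom the 1200×628 creative occupies roughly 360×188 pixels, which is why a headline that looks fine at full size can feel cramped in feed.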

Example prompts

Copy a prompt and tweak it. On the live page, each prompt opens in the terminal pre-loaded.

Headline-driven ad

    1080×1080 ad. Headline: "Stop the meeting marathon". CTA: "Try async reviews". Brand colour: navy. Style: editorial type-driven, minimal visual.

Stat-driven ad

    1200×628 LinkedIn ad. Headline: "47%". Sub: "of B2B teams ship faster with async standups". CTA: "Learn how". Style: clean dashboard mock.

Quote-driven ad

    Square ad with a customer quote. Quote: "We cut reporting time by two thirds in week one". Attribution: "VP Finance, Acme Corp". CTA: "Read the case study".

API call

Standard REST. Bearer token, JSON body, URL response. Works in any HTTP client, n8n, Make, Zapier, or MCP agent.

curl -X POST https://api.42rows.com/v1/image-creative \
  -H "Authorization: Bearer sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Headline: \"Stop the meeting marathon\". CTA: \"Try async reviews\". Square 1080x1080.",
    "format": "1080x1080"
  }'
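
The same call can be sketched in Python with only the standard library. The endpoint, headers, and body fields are copied from the curl example above; the function names are hypothetical, and treating the response as a JSON document is an assumption, since this guide only says that a URL comes back.

```python
import json
import urllib.request

API_URL = "https://api.42rows.com/v1/image-creative"


def build_request(api_key, prompt, fmt):
    """Build the POST request without sending it."""
    body = json.dumps({"prompt": prompt, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request, timeout=120):
    """Send the request and decode the body.

    Assumption: the response body is JSON; its exact shape is not documented here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_request(
        "sk_...",  # replace with a real key before calling send(req)
        'Headline: "Stop the meeting marathon". CTA: "Try async reviews". Square 1080x1080.',
        "1080x1080",
    )
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes the payload easy to inspect or log before any call is made.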

Pricing

Pay-per-call, no subscription. Subscription plans are on the roadmap — they will not change pay-per-call rates.

FAQ

Which font is used for the overlay?

A small set of brand-neutral sans-serif typefaces, picked by the art director based on tone (technical, editorial, friendly, etc.). Custom font upload is on the roadmap.

Can I get the visual without the text overlay?

Yes. The actor input has a `skip_compositor` flag that returns just the Imagen output without the HTML/CSS pass. Useful if you want to do the text layer yourself in Figma.

What about non-Latin scripts?

The compositor renders any script supported by the bundled font fallback chain (Latin, Greek, Cyrillic, basic CJK). Devanagari and Arabic support is partial; long-form right-to-left Arabic layouts are on the roadmap.

Why HTML/CSS specifically and not SVG?

HTML/CSS gives us multi-line wrap, line-height tuning, and gradient text out of the box, all rendered through Chromium under Playwright. SVG would work for static layouts but is more painful for responsive headline lengths.
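
To make the idea concrete, here is a rough sketch of an HTML/CSS text layer rendered through Chromium with Playwright. It is not the 42rows compositor: the layout, colours, font sizes, and function names are all illustrative assumptions. It only shows how wrapping, line-height, and gradient text come from plain CSS. Playwright is imported lazily so the HTML builder runs without it installed.

```python
import html


def build_overlay_html(background_url, headline, cta, width, height):
    """Return an HTML page that layers live text over a background image."""
    bg = html.escape(background_url, quote=True)
    return f"""<!doctype html>
<html><head><meta charset="utf-8"><style>
  html, body {{ margin: 0; width: {width}px; height: {height}px; }}
  body {{ position: relative; overflow: hidden; font-family: sans-serif; }}
  img.bg {{ position: absolute; inset: 0; width: 100%; height: 100%; object-fit: cover; }}
  h1 {{
    position: absolute; left: 8%; right: 8%; top: 12%; margin: 0;
    font-size: 72px; line-height: 1.1; overflow-wrap: break-word;
    background: linear-gradient(90deg, #ffffff, #9cc4ff);
    -webkit-background-clip: text; background-clip: text; color: transparent;
  }}
  .cta {{
    position: absolute; left: 8%; bottom: 10%; padding: 16px 28px;
    border-radius: 8px; background: #0b1f3a; color: #ffffff; font-size: 32px;
  }}
</style></head><body>
  <img class="bg" src="{bg}" alt="">
  <h1>{html.escape(headline)}</h1>
  <div class="cta">{html.escape(cta)}</div>
</body></html>"""


def render_png(page_html, width, height, out_path):
    """Screenshot the page at the target size using headless Chromium."""
    from playwright.sync_api import sync_playwright  # requires `pip install playwright`

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": width, "height": height})
        page.set_content(page_html, wait_until="networkidle")
        page.screenshot(path=out_path)
        browser.close()


if __name__ == "__main__":
    markup = build_overlay_html(
        "https://example.com/background.png",  # placeholder URL
        "Stop the meeting marathon",
        "Try async reviews",
        1080,
        1080,
    )
    print(len(markup) > 0)
```

Because the headline is an ordinary block element, a longer sentence simply wraps onto more lines; the equivalent SVG would need manual line breaking.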

How does this compare to Bannerbear or Placid?

Bannerbear and Placid expect you to design templates first and fill them in via API. Our endpoint takes a brief and chooses the template internally. Trade-off: less control over final layout, much faster to ship.

Ship it

Use the first example prompt as a starter. On the live page, the button opens the public terminal with it pre-filled.