Voice-matched AI drafts: how the model learns to sound like you
AI drafts have a recognisability problem. Voice matching is the technical answer: fewer "I hope this finds you well" openers, more drafts that are indistinguishable from the email you would have written yourself.
AI email drafting has a recognisability problem. By the time you've received the third "I hope this email finds you well" opener from someone you know doesn't actually talk like that, you've learned to spot the AI. The drafts are syntactically fine, the tone is professional, the structure is correct — and they sound nothing like the sender.
Recipients notice. Trust takes a small hit. The convenience of the AI draft is bought at the cost of correspondence that feels slightly less authentic. Over time, that compounds into a real reputational cost.
Voice matching is the technical answer. The idea: instead of the model generating in its default register, give it your actual writing style as input, and have it produce drafts that mirror your sentence structure, your level of formality, your typical openings and sign-offs, your idioms. Done well, the drafts are indistinguishable from something you wrote in two minutes.
What the model needs to learn
"Your voice" isn't one thing — it's a set of styles that vary by recipient and context. You write differently to a customer than to a co-founder. Differently in a first reply than in a sixth-message thread. Differently to someone in France than to someone in San Francisco. A useful voice model captures at least:
- Greeting + closing patterns — "hey [name]," vs "Hi all," vs no greeting; "Best," vs "Cheers," vs no sign-off.
- Sentence length distribution — short and punchy vs long with subordinate clauses.
- Formality calibration per recipient — you almost certainly write more casually to repeat contacts.
- Hedging vs declarative bias — "I think we should" vs "We should".
- Idiosyncratic phrases — the words you use that nobody else does. "Worth a beat," "in the meantime," "quick thought".
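As a rough illustration of the features listed above, here is a minimal Python sketch that derives a few of them from a list of sent email bodies. Everything in it is an assumption made for illustration: the function names, the hedge word list, and the heuristic that a short line ending in a comma is a greeting or sign-off. Per-recipient formality and idiosyncratic phrases are left out because they need more than simple counting.

```python
import re
from collections import Counter
from statistics import mean

# Hypothetical hedge markers; a real list would be tuned per user.
HEDGES = ("i think", "maybe", "perhaps", "i guess", "might")


def _split_email(body):
    """Split a body into (greeting, content lines, closing) with a crude heuristic."""
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    greeting = closing = None
    # Assumption: a first line of at most three words ending in a comma is a greeting.
    if lines and lines[0].endswith(",") and len(lines[0].split()) <= 3:
        greeting = lines.pop(0)
    # Assumption: a sign-off such as "Cheers," sits within the last two lines.
    for i in range(len(lines) - 1, max(len(lines) - 3, -1), -1):
        if lines[i].endswith(",") and len(lines[i].split()) <= 2:
            closing = lines[i]
            lines = lines[:i]
            break
    return greeting, lines, closing


def extract_voice_features(emails):
    """Summarise greeting, closing, sentence length and hedging across emails."""
    greetings, closings = Counter(), Counter()
    lengths = []
    hedged = 0
    for body in emails:
        greeting, content, closing = _split_email(body)
        greetings[greeting.split()[0].strip(",").lower() if greeting else "(none)"] += 1
        closings[closing.rstrip(",").lower() if closing else "(none)"] += 1
        for sentence in re.split(r"(?<=[.!?])\s+", " ".join(content)):
            if not sentence:
                continue
            lengths.append(len(sentence.split()))
            if any(h in sentence.lower() for h in HEDGES):
                hedged += 1
    return {
        "top_greeting": greetings.most_common(1)[0][0] if greetings else None,
        "top_closing": closings.most_common(1)[0][0] if closings else None,
        "mean_sentence_words": mean(lengths) if lengths else 0.0,
        "hedge_ratio": hedged / len(lengths) if lengths else 0.0,
    }
```

Running this per recipient or per recipient group, rather than over the whole sent folder, would be one way to approximate the per-recipient variation described above.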
The two ways to do voice modelling
There are two mainstream approaches:
- Few-shot in the prompt. Pull your last 20-50 sent emails that match the recipient context, paste them into the prompt as examples, and ask the model to write the next one in the same style. Cheap, no training, works from day one. Quality is good but not great, because the model averages across the examples.
- Fine-tuned per-user model. Take a base model, train it on your sent history, and deploy the resulting per-user model. Quality is better, because the model internalises your style rather than inferring it on each call, but the per-user training cost and serving infrastructure are real, and the lift over good few-shot is small enough that most products skip it.
A third option, increasingly common in 2026, is a hybrid: few-shot in the prompt plus a short structured "voice profile" the user can edit. The profile captures the things that are easy to articulate (typical sign-off, preferred level of formality, words to avoid) and the few-shot examples handle the rest. The user can correct the profile when they want to nudge the output.
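To make the hybrid concrete, here is a minimal Python sketch of how a prompt could be assembled from an editable profile plus few-shot examples. The `VoiceProfile` and `SentEmail` types, the function names, and the prompt wording are all hypothetical. "Matching the recipient context" is approximated here as "same address first, then same domain, newest first", which is only one possible reading.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VoiceProfile:
    """Hypothetical user-editable profile: the easy-to-articulate preferences."""
    sign_off: str = ""
    formality: str = "neutral"
    avoid_words: List[str] = field(default_factory=list)


@dataclass
class SentEmail:
    recipient: str
    sent_at: float  # Unix timestamp
    body: str


def select_examples(history, recipient, limit=20):
    """Pick few-shot examples: same recipient first, then same domain, newest first."""
    domain = recipient.rsplit("@", 1)[-1]
    exact = [e for e in history if e.recipient == recipient]
    related = [
        e for e in history
        if e.recipient != recipient and e.recipient.rsplit("@", 1)[-1] == domain
    ]
    exact.sort(key=lambda e: e.sent_at, reverse=True)
    related.sort(key=lambda e: e.sent_at, reverse=True)
    return (exact + related)[:limit]


def build_prompt(profile, examples, incoming):
    """Combine explicit profile rules with example emails into one prompt string."""
    rules = [f"Formality: {profile.formality}."]
    if profile.sign_off:
        rules.append(f'Sign off with "{profile.sign_off}".')
    if profile.avoid_words:
        rules.append("Never use: " + ", ".join(profile.avoid_words) + ".")
    parts = [
        "Write a reply in the same style as the examples.",
        "Voice profile:\n" + "\n".join(rules),
    ]
    parts += [f"Example {i}:\n{e.body}" for i, e in enumerate(examples, 1)]
    parts.append("Message to reply to:\n" + incoming)
    parts.append("Reply:")
    return "\n\n".join(parts)
```

Keeping the profile as explicit rules and the examples as raw text mirrors the split described above: the rules are what the user can correct directly, while the examples carry the style that is hard to put into words.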
The cold-start problem
Voice matching requires a sent-history baseline. New users with empty sent folders get default-register drafts until enough mail accumulates — typically 50-100 sent messages. For most knowledge workers that's a week of normal use; for low-volume users it can take a month.
The interim should be honest about the limitation. A draft labelled "default style (still learning your voice)" is more trustworthy than one that pretends to be in your voice while sounding like everyone else's drafts.
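The cold-start behaviour can be sketched in a few lines of Python. The threshold of 50 is an assumption taken from the lower end of the 50-100 range above, and the function names and the "your voice" label are hypothetical.

```python
import math

# Assumption: lower end of the 50-100 sent-message range mentioned above.
MIN_SENT_FOR_VOICE = 50


def draft_label(sent_count, threshold=MIN_SENT_FOR_VOICE):
    """Label a draft honestly while the sent history is too small to match a voice."""
    if sent_count < threshold:
        return "default style (still learning your voice)"
    return "your voice"


def estimated_days_until_ready(sent_count, sent_per_day, threshold=MIN_SENT_FOR_VOICE):
    """Rough number of days until enough mail has accumulated, or None if unknown."""
    if sent_count >= threshold:
        return 0
    if sent_per_day <= 0:
        return None  # no sending activity, so no meaningful estimate
    return math.ceil((threshold - sent_count) / sent_per_day)
```

For example, a user with 20 sent messages who sends about 10 a day would see the default-style label for roughly three more days.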
What good voice matching prevents
The clearest test of voice matching is whether the recipient can tell. The threshold isn't "sounds professional" — it's "the recipient assumes you wrote it." That's a high bar. Reaching it requires:
- Matching your typical email length. If you write two-sentence replies and the draft is six paragraphs, it's wrong.
- Preserving your typos and informalisms when context allows. Over-polish is its own tell.
- Skipping the corporate phrasings you never use. If you've never written "circling back" in your life, the draft shouldn't introduce it.
- Inheriting your relationship tone. Your reply to a long-time customer should sound warmer than your reply to a cold inbound.
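Two of the requirements above, length and unused corporate phrasings, lend themselves to a simple automated check before a draft is shown. The Python sketch below is illustrative only: the phrase list, the 2x length tolerance, and the function names are assumptions, and tone or deliberate informality would need something more sophisticated.

```python
import re
from statistics import median

# Hypothetical list of stock phrasings to watch for.
CORPORATE_PHRASES = (
    "circling back",
    "i hope this email finds you well",
    "touch base",
    "per my last email",
)


def word_count(text):
    return len(re.findall(r"\S+", text))


def check_draft(draft, sent_bodies, length_tolerance=2.0):
    """Return a list of reasons the draft may not sound like the sender."""
    if not sent_bodies:
        return []  # nothing to compare against yet
    issues = []
    typical = median(word_count(b) for b in sent_bodies)
    n = word_count(draft)
    # Flag drafts much longer than the sender's typical email.
    if typical > 0 and n > typical * length_tolerance:
        issues.append(f"too long: {n} words vs a typical {typical:g}")
    # Flag stock phrases the sender has never used in their own mail.
    history_text = " ".join(sent_bodies).lower()
    for phrase in CORPORATE_PHRASES:
        if phrase in draft.lower() and phrase not in history_text:
            issues.append(f'introduces "{phrase}", which never appears in sent mail')
    return issues
```

A draft that fails such a check could be regenerated with a tighter instruction, or simply shown with the issue flagged.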
How Inboxer handles it
Inboxer builds a voice profile from your sent history on connect — a one-off backfill that takes a few minutes for most accounts. The profile captures sentence length, formality calibration per recipient, typical greetings/closings, and idiosyncratic phrases. Drafts are generated few-shot with the most recent matching context, conditioned on the profile.
You can edit the profile from /settings/voice, which is useful when you want to nudge the output (always use "Hi" not "Hey", never sign off with "Best", etc.). Every edit takes effect on the next draft. Nothing sends without your click, so if a draft doesn't feel right, you can rewrite it yourself before it goes out.