Familiar | Best AI model to run D&D in Foundry VTT

Which model should you pick?

The short version: connect a current flagship from Anthropic, OpenAI, or Google and you will not have to think about this again. A capable mid-tier model covers a tighter budget and carries most of a normal session. A strong local model runs offline if you want your world to stay on your machine. You bring your own AI, so all three are yours to take, and you can change your mind between sessions.

Familiar names the models it can reach in the picker itself, which is the honest place for it. Line-ups change every few months, so a list printed on a marketing page is stale before you read it. What does not change is the shape of the choice.

The model sets the ceiling. On D&D 5e (2024), Familiar's code holds the damage, the conditions and the dice steady, so a weak model cannot break the maths. What it can do is bore you. The words are the model's own: whether the read-aloud lands, whether your innkeeper still sounds like himself in hour three, whether the recap is worth reading. That, and the judgement when two conditions collide, is what a stronger model buys.

The model tiers compared on what each one is for and what it costs you in practice.
Tier	Best for	The catch
Flagship	Hard rulings, messy multi-creature fights, anything that turns on judgement	Priciest per token, and the slowest to answer
Mid-tier	Most of a normal night: narration, lookups, NPC talk, routine combat	Less headroom when a rule gets genuinely tangled
Fast	Quick lookups and recaps where you want the answer back now	Thinnest margin on a long chain of tool calls
Local	Offline play on your own hardware	Needs a tool-trained model of 14B or larger, and a discrete GPU to feel like a conversation

Nothing in your world is tied to any of this. Switch tiers whenever the night calls for it.

Where do you actually get one?

Three routes, and they hand you different amounts of choice. Worth knowing before you sign up for anything, because one of them quietly costs you a setting.

Whichever you take, none of it is a commitment. Your campaign lives in Foundry journals and Familiar's memory, not in the model, so swapping provider halfway through a campaign costs you nothing but the time to paste a key.

One key for everything: OpenRouter reaches hundreds of models, and you swap between them from a dropdown without signing up anywhere else. The shortest path from nothing to playing.
A direct provider key: reaches that provider's own catalogue, and on Anthropic it also unlocks the full Thinking ladder. OpenRouter only reports whether a model can reason, not how finely, so routing Claude through it gives you Low, Medium and High and nothing above.
Over MCP: you do not pick a model in Familiar at all. Your client picks, and Familiar runs the tools it calls.

Where the model actually shows up

Think of the model as the brain that executes your prep. The journals, statblocks, and encounters are already on the page; the model is what reads them and works the tools to make them happen at the table. Prep fixes what runs. The model sets how well it runs.

One turn of combat is a multi-step tool sequence: roll initiative, set the order, apply a condition, force a save, update the canvas, then play the monster turn. Every step is a chance to slip, and the slips compound. A model that gets each step right nineteen times in twenty still comes through that six-step chain clean only about three turns in four.
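If you want to check that figure, here is the arithmetic as a rough sketch. It assumes every step succeeds independently and with the same odds, which is a simplification for illustration and not a measured property of any model. The function name is made up for this example.

```python
def clean_chain_probability(per_step_success: float, steps: int) -> float:
    """Chance that every step in a chain succeeds.

    Assumes the steps are independent and equally reliable,
    which real tool chains only approximate.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


# Nineteen in twenty per step, six steps in one combat turn.
six_step = clean_chain_probability(19 / 20, 6)
print(f"{six_step:.3f}")  # 0.735
```

Under the same assumption, a longer chain drops off quickly, which is why the multi-creature fights are where a stronger model earns its price.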

On D&D 5e (2024) Familiar takes some of that risk off the model: the damage, the conditions and the dice run in code, so the arithmetic holds up whatever you connect. What the model still owns in that chain is the order and the reading: which tool to reach for next, whether a rule applies at all, whether a ruling survives two stacked conditions. That is why the model shows up most in combat.

On other Foundry systems your own system resolves all of that, so more of the night rides on the model itself: the language, the lookups, and the calls you hand back to the table.

Speed is the other half of the trade, and pricing pages never mention it. A flagship thinking hard is dead air at a table with people sitting in it. Playing solo you will barely notice. With four players watching you wait, you will.

Combat & AI

What Familiar does on your system: which guardrails run where, drawn per system.

Give it room to think

Reasoning effort, what Familiar calls Thinking, is how hard the model deliberates before it acts. Turn it down and the AI answers off the top of its head. Turn it up and it works the problem through first: reads the rule, checks the board, plans the sequence, then moves. You set the dial, and the model spends the extra effort only where you point it.

Familiar ships this on High. That is already the right place for combat and hard rules, so if you never touch the setting, you are where the fiddly nights want you to be.

Which rungs you see depends on the model you connected. Claude Opus offers the whole ladder, Off through Max. Most others give you Low, Medium and High and stop there, so on Gemini Pro, Grok, Mistral or DeepSeek there is no way to switch reasoning off at all. Familiar only shows the levels your model genuinely accepts, so a shorter dropdown is the model's limit rather than a missing feature.

Combat & tricky rules: High
Where Familiar starts out, and where it should stay for the multi-step bookkeeping. This is the work that earns the extra thinking.
Roleplay & exploration: Medium
One rung down from the shipped default. NPC banter, reading a journal, narrating a room. Enough deliberation to stay in character and on the page, without spending it on a conversation.
Quick lookups & recaps: Low, or Off where you have it
Fetch a stat block, restate a rule, summarise last session. Off is the fastest and cheapest, but only Claude and GPT offer it; everywhere else Low is the floor.
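The three recommendations above, plus the rule that a model only offers some rungs, can be sketched as a small lookup. Everything named here is hypothetical and written for illustration; it is not Familiar's code or API. The ladder lists only the rungs this page names, and the fall-back (take the nearest higher rung the model accepts) is an assumption about what "Low is the floor" means in practice.

```python
# Rungs in ascending order. Only the levels named on this page are listed;
# a real model may expose others between High and Max.
LADDER = ["Off", "Low", "Medium", "High", "Max"]

# Suggested starting rung for each kind of work (hypothetical task keys).
RECOMMENDED = {
    "combat": "High",
    "roleplay": "Medium",
    "lookup": "Off",
}


def pick_thinking(task: str, available: list[str]) -> str:
    """Return the suggested rung, or the nearest one the model accepts."""
    wanted = LADDER.index(RECOMMENDED[task])
    # Prefer the suggested rung, then climb to the nearest higher one.
    for rung in LADDER[wanted:]:
        if rung in available:
            return rung
    # Nothing at or above the suggestion: take the highest rung below it.
    for rung in reversed(LADDER[:wanted]):
        if rung in available:
            return rung
    raise ValueError("model offers no known Thinking level")


# A model that stops at Low, Medium and High has no Off to fall back on.
print(pick_thinking("lookup", ["Low", "Medium", "High"]))  # Low
```

Climbing rather than dropping is a deliberate choice in this sketch: a lookup that thinks a little too hard costs a few tokens, while a fight that thinks too little drops a condition.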

Thinking is a dial you control, not a promise the model keeps. Turning it up improves the odds on a hard sequence; it does not guarantee the answer. Past a point the model over-thinks a simple ask and you pay for tokens you did not need. Give it enough room for the task in front of it, and no more.

What a session costs: more thinking means more tokens. Here is what that adds up to.

Who controls thinking depends on how you connected

The reasoning dial sits in a different place depending on whether you use the built-in chat or MCP, and that catches people out. On the built-in chat route you bring an API key and pick everything inside Familiar: the model in settings, and a Thinking dropdown beside it. Over MCP the controls move out to your client, which decides both which model answers and how hard it reasons before it does.

Built-in chat (API key): you pick the model and the Thinking level inside Familiar.
MCP: your AI client picks the model and controls how hard it thinks; Familiar runs the tools it calls.

Is it the model, the dial, or your prep?

Before you go and buy a bigger model, work out what actually went wrong. Most of what looks like a weak model is not one, and a flagship will not fix a page you never wrote.

Symptoms you notice at the table, what usually causes them, and what to change.
Symptom	Likely cause	What to change
A condition gets dropped, or initiative lands out of order	The dial, then the tier	Raise Thinking first. If it still happens on High, move up a tier.
It loses the thread halfway through a big fight	The tier	A longer tool chain needs more headroom. This is the one a flagship genuinely fixes.
It forgot an NPC from two sessions ago	Not the model	Nothing was written down for it to read. Fix the record, not the brain.
It invented a beat you never wrote	Not the model	The page it needed did not exist. Fix the prep.
Replies are slow and the table is waiting	The dial is too high for the moment	Drop to Medium for the quiet stretch, and raise it when the fight starts.

A bigger model does not fix a prep problem. Two of these five are not the model at all.

A starting point

If you would rather not tune any of this: connect a current flagship, leave Thinking where Familiar puts it, and play a session. That is close to the out-of-the-box setup already, and it suits most tables.

Adjust from there. Drop the dial when a night is all talk, and only reach for a cheaper model once the bill starts to bother you.

More in Connect your assistant

Play D&D with Claude in Foundry

The most direct connection of the three: Claude Desktop and Claude Code speak MCP natively, no agent app in between. Your Anthropic plan drives the DM's side of a published adventure with no per-token bill.

Play D&D with ChatGPT in Foundry

ChatGPT reaches your Foundry game through Codex, the agent a ChatGPT plan already includes, and runs the DM's side of a published adventure from there. Prefer a key? Bring an OpenAI API key instead.

Play D&D with Gemini in Foundry

The cheapest way in: Antigravity connects Gemini over MCP on any Google plan, including a free account, or you paste a Google API key. Either route runs the DM's side of a published adventure.

MCP vs API: which to choose

Subscription or API key? MCP subscription versus a pay-per-token key, and which is cheaper for how your table actually plays.

Running a local LLM

Run the model on your own PC instead of a provider, so your world never leaves the machine and the bill moves from a token meter to your graphics card. Setup is one settings tab and one switch. Without a discrete GPU, expect minutes per turn.

New to Familiar? I'm Ryan, the person who built it. The Discord is small and brand new, so if you join now I'll help you get set up myself.
