AI Tool Selection
The AI tooling market is loud, fast-moving, and full of demos that look better than the production reality. This page is the framework we use to cut through it.
The build-vs-buy decision tree
Before evaluating any vendor, decide whether you should be evaluating one at all.
Three honest options:
- Buy. Mature SaaS, bounded scope, you customize the surface but not the engine. Fastest path. Best when the workflow is not your moat.
- Assemble. Low-code (Zapier / Make) or a thin custom layer over commodity primitives (LLM APIs, MCP servers, vector DBs). The right answer most of the time.
- Build. Full custom. Reserve for capabilities that are core to your differentiation or where no tool fits.
When in doubt: buy or assemble first; rebuild only if and when the seams hurt enough.
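One way to make the three options concrete is to write the decision down as a small function. The sketch below is only one reading of the list above: the input names (`core_differentiator`, `mature_saas_fits`, `primitives_cover_it`) and the order of the checks are assumptions made for illustration, and each input is a judgment call rather than something you can measure.

```python
from enum import Enum


class Option(Enum):
    BUY = "buy"
    ASSEMBLE = "assemble"
    BUILD = "build"


def recommend(core_differentiator: bool,
              mature_saas_fits: bool,
              primitives_cover_it: bool) -> Option:
    """Return a starting option for a workflow.

    All three inputs are hypothetical yes/no judgments, not measurements.
    """
    # Build is reserved for capabilities that are core to differentiation.
    if core_differentiator:
        return Option.BUILD
    # A mature SaaS with bounded scope is the fastest path when the
    # workflow is not the moat.
    if mature_saas_fits:
        return Option.BUY
    # Otherwise try a thin layer over commodity primitives.
    if primitives_cover_it:
        return Option.ASSEMBLE
    # Nothing fits: build.
    return Option.BUILD
```

For example, `recommend(False, False, True)` returns `Option.ASSEMBLE`. Treat the result as a starting point to revisit if the seams start to hurt.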
The vendor evaluation rubric
Score every candidate vendor across these eight dimensions. Use a simple 1–5 scale and weight the dimensions for your context.
| Dimension | What you are checking | Disqualifiers |
|---|---|---|
| Fit | Does the product solve your problem, not a parallel one? | Demo only ever shows their canonical use case, not yours. |
| Data residency & use | Where does your data go? Is it used to train? | "We may use your data to improve our models" with no opt-out. |
| Security posture | SOC 2, ISO, encryption, SSO, audit logs, vulnerability disclosure. | No SOC 2 and no roadmap for one. |
| Model agnosticism | Can the product use a model you choose, or are you locked to theirs? | A single proprietary model with no swap path. |
| Exit cost | Can you get your data out? Your prompts? Your eval set? | Closed formats. No export. |
| Support & maturity | Funding stage, customer count, response times, named contacts. | Pre-revenue with no support SLA. |
| Integration surface | API, webhooks, MCP, native connectors. Documented? | "Coming soon" on every integration you need. |
| Pricing model | Per-seat? Per-token? Per-action? Predictable? | Pricing only revealed at the end of a sales cycle. |
For an evaluation to be honest, at least one team member must run a real workflow on real data in a sandbox before signing. Vendor-led demos do not count.
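The rubric's scoring can be kept in a few lines of code instead of a spreadsheet. The following is a minimal sketch, assuming a weighted average over the eight dimensions and assuming that any disqualifier rejects the vendor regardless of score. The dimension keys, the `VendorScore` class, and the default weight of 1.0 are all hypothetical names and choices, not part of the rubric itself.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical keys for the eight rubric dimensions.
DIMENSIONS = (
    "fit", "data_residency_and_use", "security_posture", "model_agnosticism",
    "exit_cost", "support_and_maturity", "integration_surface", "pricing_model",
)


@dataclass
class VendorScore:
    name: str
    scores: dict[str, int]  # dimension -> score on the 1-5 scale
    disqualifiers: list[str] = field(default_factory=list)


def weighted_score(vendor: VendorScore,
                   weights: dict[str, float]) -> Optional[float]:
    """Return the weighted mean on the 1-5 scale, or None if disqualified."""
    # Assumption: a single disqualifier overrides any numeric score.
    if vendor.disqualifiers:
        return None
    for dim in DIMENSIONS:
        score = vendor.scores.get(dim)
        if score is None or not 1 <= score <= 5:
            raise ValueError(f"{vendor.name}: need a 1-5 score for {dim!r}")
    # Dimensions without an explicit weight default to 1.0.
    total_weight = sum(weights.get(dim, 1.0) for dim in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(vendor.scores[dim] * weights.get(dim, 1.0)
                       for dim in DIMENSIONS)
    return weighted_sum / total_weight
```

A vendor that scores 4 everywhere and 5 on fit, with weights `{"fit": 3.0, "exit_cost": 2.0}`, comes out at 47/11, or roughly 4.27. Returning `None` for a disqualified vendor keeps a strong average from hiding a deal-breaker.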
The build-vs-buy trap to avoid
"We'll just build it ourselves; the API is right there."
Building looks cheap because the visible cost is the API call. The hidden costs:
- Auth, rate limiting, retries, idempotency.
- Logging, monitoring, alerting.
- Human-in-the-loop (HITL) approval flow.
- Eval suite and regression CI.
- Security review, threat model, incident response.
- The maintenance tail — every model upgrade, every API change, every new browser quirk.
If a tool covers 80% of what you need and the remaining 20% is something you can live without, buy it and accept the gap. If the missing 20% is exactly your differentiator, build that 20% and integrate it with the tool.
RFP question bank
The questions we ask before signing. Adapt to your situation.
Product fit
- Walk us through a customer who looked like us. What did they buy and what did they extend?
- What does your product not do that customers frequently ask for?
- What is the smallest meaningful deployment? Who succeeds at that scale?
- Who is the worst-fit customer for you and why?
Data and security
- Where is data stored geographically? Can we pin a region?
- Is our data used to train your models or any third-party model? Is there an opt-out?
- Provide your most recent SOC 2 / ISO report.
- How are credentials and secrets handled across your platform?
- What is your incident response and breach-notification process?
- Do you support SSO (which protocols), SCIM, and audit log export?
Models and lock-in
- Which models can your product use? Can we bring our own?
- If you change the underlying model, how are we notified and what is our right to test?
- Do you support self-hosted or VPC deployment for regulated workloads?
Operations
- Share your status page URL, your incident history for the last 90 days, and your published SLA.
- Who is our named technical contact during onboarding? After?
- What is the typical response time for a P1 issue?
- Show us your roadmap for the next two quarters.
Pricing and exit
- What is the all-in monthly cost for our projected usage? Include overages.
- What does pricing look like at 5× our current scale?
- How do we export our data, prompts, and configuration if we leave?
- What is the off-boarding process and timeline?
- What is the minimum contract length and the renewal mechanic?
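The first two pricing questions come down to arithmetic you can check yourself once the vendor shares a rate card. The sketch below assumes a hypothetical pricing shape (flat platform fee, per-seat price, and a token allowance with per-million overage). Every parameter name and every number in the usage note is made up for illustration, so substitute the vendor's real terms.

```python
def monthly_cost(seats: int,
                 tokens_millions: float,
                 *,
                 seat_price: float,
                 included_tokens_millions: float,
                 overage_per_million: float,
                 platform_fee: float = 0.0) -> float:
    """All-in monthly cost under a hypothetical seat-plus-overage plan."""
    # Only usage above the included allowance is billed as overage.
    overage = max(0.0, tokens_millions - included_tokens_millions)
    return platform_fee + seats * seat_price + overage * overage_per_million


def cost_at_scale(seats: int, tokens_millions: float, factor: float,
                  **plan: float) -> float:
    """Cost when both seats and token usage grow by `factor`."""
    return monthly_cost(round(seats * factor), tokens_millions * factor, **plan)
```

With an illustrative plan of `seat_price=30.0`, `included_tokens_millions=50`, `overage_per_million=2.0`, and `platform_fee=500.0`, 20 seats using 40 million tokens cost 1100.0 per month. At 5× scale the same plan costs 3800.0, because usage crosses the allowance and overages start to apply. Thresholds like that are what the 5× question is meant to surface.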
Evaluation
- Provide a sandbox or trial against our real data for at least two weeks.
- Provide reference customers we can call directly.
- What evaluation evidence do you have that your product works? (Your own eval suite, win rates against benchmarks, anything beyond marketing.)
Red flags
- Pricing only revealed at the end of a long sales cycle.
- Demos that won't run on your data.
- Vague answers on data use ("we take privacy seriously").
- A roadmap that perfectly matches your asks. (They are telling you what you want to hear.)
- No SOC 2, no compensating controls, and no plan.
- "Our model is proprietary." (Translation: opaque, untestable, lock-in.)
- A reference list you cannot actually call.
Green flags
- Specific, measured answers — including limits.
- Willingness to lose the deal if the fit is bad.
- Public changelog and incident history.
- A free or low-cost tier you can prove value on first.
- Documented data export and exit process.
- Engineers, not only sales, in the room when you ask hard questions.