The AI-First PM Toolchain

The Short Version

Large language models make PMs measurably better: more creative, more thorough, and faster. They work best with text, which means every PM artifact locked in a binary Office format is harder for AI to use well. That constraint affects the PM operating model: the tools change, the audience changes, the definition of "source of truth" changes, and the way we organize knowledge changes with it. This document makes the case for markdown, Git, and repository-aware AI assistance as the PM foundation, and explains why the alternatives we tested still fall short as of May 2026.

LLMs Make PMs Better

Large language models improve PM work, and the gain is in quality as well as speed. They add creativity and thoroughness that are hard for any individual PM to sustain under time pressure. A PM working with a capable agent can produce better briefs, more complete specifications, and more rigorous acceptance criteria than the same PM working alone. The improvement shows up both in what we produce and in how quickly we produce it.

That describes the current state as of mid-2026. It raises a practical question: if LLMs are this valuable, what needs to change so every PM can use them effectively?

The answer starts with one technical fact about how LLMs work.

LLMs Work Best With Text

Large language models operate primarily on text. Multimodal models can accept images and video, but the work PMs most often need — drafting a spec, reviewing a brief, generating acceptance criteria — still depends on text. The format the AI ecosystem has largely standardized on for that text is markdown.

That has practical consequences for PM tooling. If AI works best with text, then PM source artifacts should be text: markdown files stored in a system designed for text, not Word documents, PowerPoint decks, or spreadsheets treated as the canonical version.

That system is Git, paired with a markdown-capable editor and an AI assistant that can work directly against the repository.

This is not about preferring developer tools to Office. It follows from how the technology works. Markdown is easy for agents to read and write. Git provides explicit version history, branching, pull-request review, and collaboration. That explicitness matters more when agents help write the content. Productivity apps give you background versioning; Git gives you a deliberate review process, a named diff, a revision, and an approval trail. SharePoint can store rendered artifacts, while Git gives text the source-level control that agent-authored work requires.

The Office Alternatives Are Not Ready Yet

In mid-May 2026, we tested the current attempts to bring AI agents into the Office ecosystem. The category is not yet ready for this workflow: in the tools we evaluated, the permission models are too restrictive and the agent capabilities too limited to support real PM work.

They could become viable. SharePoint and Word may evolve into text-based editors that integrate naturally with LLMs, or a secure Office-connected agent may become powerful enough to replace this model. This is a May 2026 assessment, not a permanent commitment.

Today, though, they are not capable enough. Waiting for them means continuing to create PM artifacts in formats AI cannot reliably use.

Our Audience Has Changed

There is a second shift that reinforces the move to text: PM content is no longer written only for humans.

Consider who actually reads our output now. A specification feeds a coding agent. A brief feeds a specification agent. Documentation produced by documentation agents feeds customer agents that act on behalf of the customer. A stakeholder deck feeds a storytelling agent that renders it for human consumption. The primary consumers of PM content are other agents, not people.

We create content directed by humans, produced by agents, for consumption by other agents. Humans still need that content — but they get rendered versions, not the source.

This changes how we should think about deliverables. PowerPoint, Word, and blog platforms become renderers. The PM creates markdown, and a build process turns it into whatever format the audience needs: slides, documents, video, HTML. The software analogy is useful: engineers do not edit executables directly; they edit source code that compiles into executables. PMs should not maintain slide decks as source; they should maintain markdown that can render into slides.
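
A minimal sketch of the source-to-render idea, in Python. It assumes a hypothetical convention in which each `## ` heading starts a slide and each `- ` line becomes a bullet. The `Slide` type, the `markdown_to_slides` function, and the sample content are illustrative only and do not describe any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    title: str
    bullets: list[str] = field(default_factory=list)


def markdown_to_slides(markdown: str) -> list[Slide]:
    """Split a markdown document into slides, one per '## ' heading (assumed convention)."""
    slides: list[Slide] = []
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            slides.append(Slide(title=line[3:].strip()))
        elif line.startswith("- ") and slides:
            slides[-1].bullets.append(line[2:].strip())
        # Any other line (body prose, the top-level title) is ignored in this sketch.
    return slides


# Made-up sample source, for illustration only.
SOURCE = """# Checkout redesign

## Problem
- Cart abandonment is high on mobile
- Payment form has too many fields

## Proposal
- Single-page checkout
"""

slides = markdown_to_slides(SOURCE)
for slide in slides:
    print(slide.title, slide.bullets)
```

A real build step would pass these `Slide` objects to whatever renderer produces the deck. The point of the sketch is only that the deck is derived from the markdown and never edited as source.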

One Source of Truth

This audience shift creates a coordination problem. If agents read markdown and humans read PowerPoint, the temptation is to maintain both: a markdown spec for the coding agent, a Word version for stakeholders, and a summarized deck for leadership.

That path creates drift. Six months later, no one remembers which version is authoritative. The spec says one thing, the Word doc says another, and every decision that depends on them slows down while someone figures out which is correct.

The markdown in Git should be the single source of truth. PowerPoint, Word, video, and blog posts are build artifacts generated from that source. They can live in SharePoint for viewing, but they should not be the canonical version. If the rendered version drifts from the source, regenerate it from the repository.
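
One way to detect drift is to record fingerprints at build time. The sketch below is an assumption about how that could work, not a description of an existing pipeline: `build`, `needs_rebuild`, and the artifact fields are hypothetical names, and the renderer is a stand-in.

```python
import hashlib


def digest(text: str) -> str:
    """Return a stable fingerprint of a piece of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build(markdown: str) -> dict[str, str]:
    """Stand-in renderer: a real build would emit slides, a document, or HTML."""
    body = f"[rendered]\n{markdown}"
    return {
        "body": body,
        "source_digest": digest(markdown),  # which source this artifact came from
        "body_digest": digest(body),        # what the build originally produced
    }


def needs_rebuild(artifact: dict[str, str], current_markdown: str) -> bool:
    """True if the source changed since the build or the artifact was edited by hand."""
    source_changed = artifact["source_digest"] != digest(current_markdown)
    hand_edited = artifact["body_digest"] != digest(artifact["body"])
    return source_changed or hand_edited


spec = "# Checkout spec\nUsers pay on a single page.\n"
artifact = build(spec)
print(needs_rebuild(artifact, spec))  # False

artifact["body"] += "\nEdited directly in the rendered copy."
print(needs_rebuild(artifact, spec))  # True

artifact = build(spec)  # regenerate from source; the hand edit is discarded
```

Regenerating discards any change made in the rendered copy, which is the intended behavior: a change that matters has to be made in the markdown.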

This is the same principle software engineering already uses. Source code lives in Git. Compiled binaries are build artifacts. Nobody edits the compiled binary and expects the source to update. The same discipline now applies to PM content.

We should not split the difference by treating markdown as authoritative for agents and Office as authoritative for humans. The problem is operational, not philosophical: dual sources of truth do not stay in sync. Software engineering has decades of evidence for this, and the same pattern applies to PM artifacts.

Agents Are Code — Treat Them Like Code

If agents are central to this model, we also need to decide where the agents themselves live and how they are developed.

Agents and skills are code. They may be written in English rather than Python, but they still encode workflows, templates, and reasoning strategies. They need the same development rigor as other code: builds, evals, quality gates, pull request reviews, and dedicated expert teams.

Agents should live in their own dedicated development repositories, maintained by experts who specialize in agent development. They should not be checked into the PM working repository or copied between repos. They should be developed once, published through a marketplace, and installed wherever they are needed.

This follows the same single-source-of-truth principle. If I develop a documentation agent and copy it into three documentation repositories, I now have three copies. Which one has the latest version? Which one works best? They will drift like any other duplicated artifact. By developing agents in a dedicated repo and distributing them through an installation process, we maintain one canonical version.

Agent marketplace systems can support this model: a centrally maintained repository publishes agents and skills, and working repositories install the versions they need.
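
The publish-once, install-by-version idea can be sketched as a lookup against a catalog. Everything here is hypothetical: the `CATALOG` contents, the agent names, the version strings, and the `install` function are invented for illustration and do not correspond to any particular marketplace product.

```python
# Hypothetical marketplace catalog: agent name -> published version -> definition.
CATALOG: dict[str, dict[str, str]] = {
    "documentation-agent": {
        "1.0.0": "Write reference docs from a spec.",
        "1.1.0": "Write reference docs from a spec and flag missing sections.",
    },
    "spec-agent": {"2.0.0": "Turn a brief into a specification."},
}


def install(pins: dict[str, str], catalog: dict[str, dict[str, str]]) -> dict[str, str]:
    """Resolve each pinned agent to the single published definition for that version."""
    installed = {}
    for name, version in pins.items():
        try:
            installed[name] = catalog[name][version]
        except KeyError:
            raise LookupError(f"{name}=={version} is not published") from None
    return installed


# Two working repositories pin what they need; neither keeps its own copy of the source.
docs_repo = install({"documentation-agent": "1.1.0"}, CATALOG)
pm_repo = install({"documentation-agent": "1.1.0", "spec-agent": "2.0.0"}, CATALOG)
print(docs_repo["documentation-agent"] == pm_repo["documentation-agent"])  # True
```

Because both repositories resolve the same name and version against one catalog, there is no second copy that can drift.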

Four Types of Knowledge, Four Natural Homes

With the tooling and agent model established, the next question is what knowledge PMs need access to, and where each type should live.

There are four distinct types. The differences matter because each one has different ownership, update cadence, and access requirements.

What We Create

The first type is the artifacts PMs produce: specifications, briefs, stakeholder decks (as markdown), blog posts, documentation. These are prescriptive — they define how things should work. They live in the team's Git repository, organized by feature area.

What We Know About Our Features

The second type is descriptive knowledge about the team's product. Not what we want to build next, but what already exists — feature descriptions, business value summaries, how components interact. This evolves alongside the prescriptive artifacts and lives in the same repository. When a spec changes a feature, the description of that feature should update in the same pull request. Things that evolve together live together.
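
The "same pull request" rule could be enforced by a simple check on the list of changed paths. This is a sketch under assumptions: the file names `spec.md` and `knowledge.md` and the example paths are hypothetical, and the function name is invented for illustration.

```python
from pathlib import PurePosixPath

SPEC_NAME = "spec.md"              # hypothetical name for the prescriptive artifact
DESCRIPTION_NAME = "knowledge.md"  # hypothetical name for the descriptive entry


def features_missing_description_update(changed_paths: list[str]) -> list[str]:
    """Return feature folders where the spec changed but the description did not."""
    changed = {PurePosixPath(p) for p in changed_paths}
    missing = []
    for path in sorted(changed):
        if path.name == SPEC_NAME and path.with_name(DESCRIPTION_NAME) not in changed:
            missing.append(str(path.parent))
    return missing


pull_request = [
    "payments/checkout/spec.md",
    "payments/checkout/knowledge.md",
    "payments/refunds/spec.md",
]
print(features_missing_description_update(pull_request))  # ['payments/refunds']
```

A check like this can only flag that the description was not touched. Whether the description still matches the feature remains a reviewer's judgment.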

What We Know About the Platform

The third type is system-wide knowledge that cuts across teams: the platform's permission model, billing model, deployment architecture, BCDR plans, core capabilities. No single PM team owns this. It belongs in a centralized repository maintained by a dedicated technical team, with organization-wide read access. Every PM team consumes it; nobody outside the maintaining team writes to it.

What We Know About Our Customers

The fourth type is customer knowledge: call transcripts, feedback, CRM data, market signals, competitive intelligence, and support incidents. This material is massive, multi-structured, and sensitive.

This is where the Git model stops fitting. Customer data includes information shared under NDA. It contains private business details from direct customer conversations. There are regulatory constraints on who can access it, how it can be stored, and where it can travel. You cannot simply check this into a Git repository and clone it onto every PM's laptop. That is not a technical limitation — it is a compliance, privacy, and security boundary.

The right home for customer knowledge is a governed customer data platform. The architecture follows a bronze/silver/gold data lake pattern. Raw transcripts, unstructured feedback, web-scraped data from forums and social media — all of that is the bronze layer, massive and unstructured. Semi-structured summaries — customer pain-point briefs, aggregated feedback by theme — form the silver layer. Fully structured, tabular data — queryable databases showing that 26 customers requested feature A with high urgency while 35 requested feature B with lower urgency — is the gold layer.

The platform should expose all of these layers to agents through governed query and retrieval interfaces. An agent can start with a structured query against the gold layer, then drill into silver-layer summaries, and if needed, read raw bronze-layer transcripts — all within the organization's compliance and permission framework. This is not achievable with Git repositories.
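
The gold-to-silver-to-bronze drill-down can be sketched with an in-memory SQLite table standing in for the gold layer. The table name, column names, customer identifiers, and placeholder texts are all assumptions made for illustration; the counts reuse the example above. The `allowed_layers` argument is a deliberately crude stand-in for the platform's real permission framework.

```python
import sqlite3

# Gold layer: structured, queryable rows (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gold_requests (customer_id TEXT, feature TEXT, urgency TEXT)")
rows = [(f"cust-a-{i}", "feature A", "high") for i in range(26)]
rows += [(f"cust-b-{i}", "feature B", "low") for i in range(35)]
conn.executemany("INSERT INTO gold_requests VALUES (?, ?, ?)", rows)

# Silver and bronze layers, reduced to placeholder lookups for this sketch.
SILVER_SUMMARIES = {"feature A": "Placeholder theme summary for feature A."}
BRONZE_TRANSCRIPTS = {"cust-a-0": "Placeholder raw transcript text."}


def demand_by_feature(conn: sqlite3.Connection) -> list[tuple[str, str, int]]:
    """Gold-layer query: how many distinct customers asked for each feature, by urgency."""
    cursor = conn.execute(
        "SELECT feature, urgency, COUNT(DISTINCT customer_id) "
        "FROM gold_requests GROUP BY feature, urgency ORDER BY feature"
    )
    return cursor.fetchall()


def drill_down(conn: sqlite3.Connection, feature: str, allowed_layers: set[str]):
    """Return the silver summary and bronze transcripts the caller is allowed to see."""
    summary = SILVER_SUMMARIES.get(feature) if "silver" in allowed_layers else None
    transcripts = {}
    if "bronze" in allowed_layers:
        ids = conn.execute(
            "SELECT customer_id FROM gold_requests WHERE feature = ?", (feature,)
        )
        transcripts = {cid: BRONZE_TRANSCRIPTS[cid] for (cid,) in ids if cid in BRONZE_TRANSCRIPTS}
    return summary, transcripts


print(demand_by_feature(conn))
print(drill_down(conn, "feature A", {"gold", "silver"}))
```

The second call returns the summary but no transcripts, because the caller was not granted the bronze layer. In a real platform that decision would come from the governed access controls, not from an argument the caller supplies.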

Data platforms are designed for this boundary in a way Git repositories are not. They support governance models, security controls, permissioning, auditability, retention policies, and controlled access patterns built explicitly for sensitive and regulated data. The goal is not just to store customer knowledge somewhere larger than Git; it is to store it somewhere that can enforce the compliance rules that customer knowledge requires.

The customer knowledge base is likely the biggest infrastructure investment in this model. It requires data engineers, possibly data scientists, and a dedicated team to own it. Validating that dependency is critical: without this foundation, the ideation phase of the PM workflow has no data to work from.

One Team, One Repository

Given the move to text-based artifacts, agents as consumers, a single source of truth, and four knowledge types, how should we organize the repositories themselves?

The principle is the same one that governs code organization: things that evolve together live together. A group of PMs working toward a common goal shares a repository. That boundary might not follow the org chart — a single org might have three repositories if it serves three independent goals, or three orgs might share one repository if they ship together. The unit of separation is the unit of evolution.

A practical warning sign is access control. If you are trying to map folders to teams and restrict who can write to each part of the repository, the repository is probably too broad. The right fix is not a more elaborate permissions scheme inside one repo; it is a smaller repository boundary that matches the group of people who should share write access.

Feature-First, Not Artifact-First

Inside each repository, structure follows features, not artifact types. The root directories are product areas. Under each product area: features. Under each feature: specs, briefs, decks, blog posts, knowledge base entries.

The alternative — a root-level specs/ folder, a root-level briefs/ folder, a root-level decks/ folder — scatters related work across the repository. When a PM ships a feature, they change the spec, update the brief, revise the knowledge base, and draft a blog post. If those files live in four different root directories, every pull request touches paths all over the repository. When they live under the same feature folder, the PR is self-contained. Things that evolve together live together.
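
The difference between the two layouts can be made concrete by counting the folders a single feature change touches. The paths and file names below are hypothetical examples, and `folders_touched` is an illustrative helper.

```python
from pathlib import PurePosixPath


def folders_touched(paths: list[str], depth: int) -> set[str]:
    """Return the distinct folders, truncated to `depth` path segments, that a change touches."""
    return {"/".join(PurePosixPath(p).parts[:depth]) for p in paths}


# The same four edits for one feature, under each layout.
feature_first = [
    "payments/checkout/spec.md",
    "payments/checkout/brief.md",
    "payments/checkout/knowledge.md",
    "payments/checkout/blog-post.md",
]
artifact_first = [
    "specs/checkout.md",
    "briefs/checkout.md",
    "knowledge/checkout.md",
    "blog/checkout.md",
]

print(folders_touched(feature_first, depth=2))        # {'payments/checkout'}
print(len(folders_touched(artifact_first, depth=1)))  # 4
```

Under the feature-first layout the pull request stays inside one feature folder. Under the artifact-first layout the same change spans four root directories.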

Skills as Knowledge Indexes

The feature-first structure creates a challenge: the descriptive knowledge base is spread across every feature folder in the repository. An agent looking for knowledge about the product cannot simply scan one directory.

The solution is to use skills as indexes. A small number of skills live in a dedicated skills folder. Each skill's YAML front matter acts as a first-level index: a short description of the product area it covers. The skill body acts as a second-level index, pointing to specific files distributed across feature folders.
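
The two-level index can be sketched as follows. The skill file content, the `name` and `description` keys, and the file paths are assumptions made for illustration. The parser handles only simple `key: value` front matter lines and is not a full YAML implementation.

```python
# Hypothetical knowledge-index skill: front matter first, then pointers to files.
SKILL_TEXT = """---
name: payments-knowledge
description: Index of descriptive knowledge for the payments product area
---
Files that describe existing payments features:
- payments/checkout/knowledge.md
- payments/refunds/knowledge.md
"""


def parse_skill(text: str) -> tuple[dict[str, str], list[str]]:
    """Split a skill into front matter (first-level index) and file pointers (second-level)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill must start with a front matter block")
    closing = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    meta = {}
    for line in lines[1:closing]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    pointers = [
        line.strip()[2:].strip()
        for line in lines[closing + 1:]
        if line.strip().startswith("- ")
    ]
    return meta, pointers


def files_for(topic: str, skills: list[str]) -> list[str]:
    """Match the topic against each description, then follow the matching skills' pointers."""
    found = []
    for text in skills:
        meta, pointers = parse_skill(text)
        if topic.lower() in meta.get("description", "").lower():
            found.extend(pointers)
    return found


print(files_for("payments", [SKILL_TEXT]))
```

An agent reads only the short descriptions to decide which skill is relevant, then opens the listed files. It never has to scan every feature folder.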

These knowledge-index skills are the one exception to the rule that skills live in development repositories. Because they are tightly coupled to the repository's content structure, they must reside in the working repository itself. Functional agent skills — the ones that do work — are developed and distributed centrally. Knowledge-index skills — the ones that navigate information — live where the information lives.

At the repository root, an agents.md file provides structural context for any generic agent that enters the repository. It names the organization, lists product areas and features, and tells agents where to find and place artifacts. That means a PM does not need to specify file paths every time they ask an agent to create a spec; the agent can read agents.md and follow the structure.

Why Not a Monorepo

In code, monorepos sometimes make sense because multiple packages compile into a shared build artifact. There is a technical reason for co-location. PM artifacts do not have the same constraint. Briefs do not compile together. Specs do not link at build time. There is no shared output that requires co-location.

The downsides are concrete. Permission models become complicated: OWNERS files across the repository, PR approval bottlenecks, and protection rules dictating who can edit which folder. Agents need extra instructions not to touch other teams' folders, which is fragile and easy to get wrong. Pull requests become large and cross-cutting. File systems become harder to navigate. The benefits do not justify that complexity when org-wide read access to each team's repository gives people the access they need.

Per-team repositories are simpler: five, ten, twenty people, everyone with equal write access, everyone knowing each other by name. No OWNERS files. No approval bottlenecks. Less chance of an agent editing the wrong area. If you need another team's knowledge, clone their repo. Read access is cheap. Write access stays scoped to the people who actually own the content.

Repositories are cheap. There is little efficiency gained by joining them, and meaningful risk in making them too large.

Adopt the Markdown-First Model

Adopt markdown, Git, a markdown-capable editor, and repository-aware AI assistance as the primary PM toolchain. Every new PM artifact is authored in markdown, stored in Git, and reviewed via pull requests. Rendered outputs — PowerPoint, Word, video — are build artifacts generated from the markdown source and stored in SharePoint for viewing, never as the canonical version.

Three conditions must be met for this to succeed:

Invest in training. PMs need structured onboarding for markdown editing and Git. The tools are learnable, but the shift is real and must be resourced — not just announced.

Build agent development capacity. Dedicated teams must build, test, and maintain the agents that power this workflow. Agent quality directly determines PM productivity. Underfunding agent development weakens the entire model.

Validate the customer data foundation. This needs to be built, and it is a real infrastructure investment. Without it, the ideation phase has no reliable data source, and the PM lifecycle has no strong starting point.

This is a May 2026 assessment. If Office tools become viable for agent workflows, this model loses its primary differentiator. Until then, it is the most workable path we have found. The choice is not between familiar tools and unfamiliar ones. It is between a toolchain AI can reliably use and one it cannot.