How We Built an AI-Powered Content System That Turns a 4-Hour Job Into a 10-Minute Conversation
There's a particular kind of work that eats hours without anyone noticing. Not the big strategic decisions. The slide decks. The branded presentations that need to look professional, use the right terminology, and follow brand guidelines down to the trademark symbol placement. Every company has this work, and every company underestimates how long it takes.
We decided to stop accepting that tax on our time. What we built is a system of interconnected AI skills that handles the whole workflow, from a casual description of what you need to a finished, brand-compliant deck, without anyone opening PowerPoint.
This post covers how it works and why it beat the alternatives we tried first.
The Problem Isn't "Making Slides." It's Everything Around It
When someone on the team needs a presentation, the actual content is usually the easy part. They know what they want to say. The time sink is everything else: finding the right template, applying brand colors correctly, sizing images, writing speaker notes that don't sound robotic, making sure product names use the correct trademark symbols, and then running the whole thing past someone to confirm it "looks right."
We realized the bottleneck wasn't creativity or knowledge. It was the mechanical compliance work between having an idea and having a finished artifact.
What We Built: A Skill System, Not a Single Tool
We learned early that a single all-purpose tool wouldn't cut it. Different tasks need different levels of sophistication, and trying to build one monolithic solution meant every interaction carried the overhead of every feature.
Instead, we built a system of specialized skills that work together. Each skill handles one job extremely well, and they share a common understanding of brand voice, visual identity, and quality standards.
The presentation workflow starts with a conversation. You describe what you need in plain language ("I need a deck for the Q1 earnings review, internal audience, focus on storage growth numbers and the new product line"), and the system builds a slide-by-slide outline. You review it, tweak it, approve it, and it generates a finished .pptx with branded layouts, consistent formatting, and properly styled content. The whole loop takes minutes.
The image generation skill creates presentation visuals that match the brand. Instead of hunting through stock photo libraries or waiting on a designer, you describe what you need and it generates images with the right color palette and tone for the slide. If you have an existing image that's almost right, you can upload it and describe the edits (darken the background, add a gradient, shift the color temperature), and it handles the change.
The editing skill lets you modify existing decks without starting from scratch. Swap slides, rewrite speaker notes, update product mentions across the whole deck, replace images. All through conversation. It understands the structure of the existing file and makes targeted changes while preserving everything else.
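The outline step of the presentation workflow can be made concrete with a small TypeScript sketch. The `SlideOutline` and `DeckOutline` types, their field names, and the `approveOutline` helper are illustrative assumptions, not the system's actual schema. The point is only that the conversation produces a structured, reviewable outline before any .pptx is generated.

```typescript
// Hypothetical outline schema: all names are illustrative assumptions.
type OutlineLayout = "title" | "metrics" | "comparison" | "narrative";

interface SlideOutline {
  title: string;
  layout: OutlineLayout;
  bullets: string[];
  speakerNotes: string;
}

interface DeckOutline {
  topic: string;
  audience: "internal" | "external";
  slides: SlideOutline[];
  approved: boolean;
}

// Returns an approved copy only if every slide has a non-empty title.
function approveOutline(outline: DeckOutline): DeckOutline {
  const untitled = outline.slides.findIndex((s) => s.title.trim() === "");
  if (untitled !== -1) {
    throw new Error(`Slide ${untitled + 1} needs a title before approval`);
  }
  return { ...outline, approved: true };
}

// Draft built from the example request quoted earlier in this post.
const draft: DeckOutline = {
  topic: "Q1 earnings review",
  audience: "internal",
  slides: [
    { title: "Q1 earnings review", layout: "title", bullets: [], speakerNotes: "" },
    { title: "Storage growth in Q1", layout: "metrics", bullets: [], speakerNotes: "" },
    { title: "The new product line", layout: "narrative", bullets: [], speakerNotes: "" },
  ],
  approved: false,
};

const approvedOutline = approveOutline(draft);
```

Keeping the outline as plain data is what makes the review step cheap: the user edits a short list of titles and layouts, not a rendered file.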
Brand Voice: The Part That Actually Matters
The system isn't just "AI that makes slides." Every skill enforces a shared set of brand rules automatically, every time, with no extra steps.
This means every piece of text the system produces is written against the brand guidelines before it ever reaches the user. Product names are capitalized correctly. Trademark symbols appear on first mention. The tone stays advisory rather than salesy. Slide titles are specific and descriptive rather than generic. The writing avoids the hollow transitions and empty superlatives that make AI-generated content feel obviously artificial.
We baked this into the architecture instead of making it a separate review step. Each skill reads the brand voice rules before generating any text, so compliance is built into the output, not checked after.
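To show what one of these rules means in practice, here is a minimal TypeScript sketch of "correct capitalization, trademark symbol on first mention" written as a plain function. In the system described here the rules guide generation rather than run as a post-processing pass, so treat this purely as an illustration of the rule itself. `ProductRule`, `applyProductRule`, and the product name "ExampleStore" are made up for the example.

```typescript
// Illustrative only: "ExampleStore" is a made-up product name.
interface ProductRule {
  name: string;       // canonical capitalization
  symbol: "™" | "®";  // mark to attach on first mention
}

// Escapes regex metacharacters so product names are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Fixes capitalization on every mention and adds the mark on the first one only.
function applyProductRule(text: string, rule: ProductRule): string {
  const pattern = new RegExp(`\\b${escapeRegExp(rule.name)}\\b[™®]?`, "gi");
  let seen = false;
  return text.replace(pattern, () => {
    if (seen) return rule.name;
    seen = true;
    return rule.name + rule.symbol;
  });
}

const fixedText = applyProductRule(
  "examplestore scales out. Teams adopt ExampleStore quickly.",
  { name: "ExampleStore", symbol: "™" },
);
// fixedText === "ExampleStore™ scales out. Teams adopt ExampleStore quickly."
```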
Why This Approach Saves Real Time
The time savings come from a few places, and they add up:
Elimination of format wrestling. Nobody spends 45 minutes adjusting text boxes, aligning images, or figuring out which slide master to use. The system handles layout decisions based on content type, and it knows which layouts work best for comparisons, metrics, narratives, and product features.
Consistency without effort. When you have multiple people creating presentations, consistency requires either rigid templates (which people work around) or manual review (which creates bottlenecks). The skill system produces consistent output by default because every skill draws from the same brand authority.
Iteration speed. The most underrated win is how fast you can iterate. "Make slide 4 a two-column layout" or "Add a section about the competitive landscape after slide 6." Changes that would take 15 minutes of manual rearranging happen in seconds. People actually revise their content instead of shipping the first draft because editing no longer feels expensive.
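The layout and iteration points above can be sketched together in TypeScript. This is a minimal sketch under stated assumptions: the `ContentType` and `Layout` names, the default mapping, and the `EditOp` shape are all hypothetical, chosen only to show how a content-type lookup and small, targeted edit operations keep revisions cheap.

```typescript
// All names here are illustrative assumptions, not the system's real schema.
type ContentType = "comparison" | "metrics" | "narrative" | "productFeature";
type Layout = "twoColumn" | "metricGrid" | "titleAndBody" | "imageWithCallouts";

interface Slide {
  title: string;
  layout: Layout;
}

// One default layout per content type.
const DEFAULT_LAYOUT: Record<ContentType, Layout> = {
  comparison: "twoColumn",
  metrics: "metricGrid",
  narrative: "titleAndBody",
  productFeature: "imageWithCallouts",
};

function makeSlide(title: string, contentType: ContentType): Slide {
  return { title, layout: DEFAULT_LAYOUT[contentType] };
}

// Slide numbers are 1-based, matching how people refer to them in conversation.
type EditOp =
  | { kind: "setLayout"; slide: number; layout: Layout }
  | { kind: "insertAfter"; slide: number; newSlide: Slide };

// Returns a new slide list; the input list is left untouched.
function applyEdit(slides: readonly Slide[], op: EditOp): Slide[] {
  if (op.slide < 1 || op.slide > slides.length) {
    throw new RangeError(`No slide ${op.slide} in a ${slides.length}-slide deck`);
  }
  if (op.kind === "setLayout") {
    const layout = op.layout;
    const index = op.slide - 1;
    return slides.map((s, i) => (i === index ? { ...s, layout } : s));
  }
  return [...slides.slice(0, op.slide), op.newSlide, ...slides.slice(op.slide)];
}

// A seven-slide deck of narrative slides.
let deck: Slide[] = Array.from({ length: 7 }, (_, i) => makeSlide(`Slide ${i + 1}`, "narrative"));

// "Make slide 4 a two-column layout"
deck = applyEdit(deck, { kind: "setLayout", slide: 4, layout: "twoColumn" });
// "Add a section about the competitive landscape after slide 6"
deck = applyEdit(deck, {
  kind: "insertAfter",
  slide: 6,
  newSlide: makeSlide("Competitive landscape", "comparison"),
});
```

Because each request maps to one small operation on the slide list, nothing else in the deck has to be touched or regenerated.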
The Technical Architecture (For the Curious)
Under the hood, the system is a set of Claude skills: structured instruction sets that give the AI specialized knowledge and workflows for specific tasks. Each skill has its own instruction file and reference materials, but they all share access to the same brand voice definition and product documentation.
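As a rough sketch of that sharing arrangement, each skill can be thought of as its own instruction file and references plus a common set of brand material. The `SkillDefinition` type, the `contextFiles` helper, and every file path below are hypothetical and only illustrate the structure.

```typescript
// Hypothetical layout: every path and name here is an assumption for illustration.
interface SkillDefinition {
  name: string;
  instructionFile: string;
  references: string[];
}

// Material every skill reads, regardless of its task.
const SHARED_REFERENCES: readonly string[] = [
  "shared/brand-voice.md",
  "shared/product-docs.md",
];

const createDeckSkill: SkillDefinition = {
  name: "create-deck",
  instructionFile: "skills/create-deck/SKILL.md",
  references: ["skills/create-deck/layouts.md"],
};

const editDeckSkill: SkillDefinition = {
  name: "edit-deck",
  instructionFile: "skills/edit-deck/SKILL.md",
  references: [],
};

// Shared brand material comes first, then the skill's own files.
function contextFiles(skill: SkillDefinition): string[] {
  return [...SHARED_REFERENCES, skill.instructionFile, ...skill.references];
}

const createContext = contextFiles(createDeckSkill);
const editContext = contextFiles(editDeckSkill);
```

Both skills end up with different task instructions but an identical brand foundation, which is the property the architecture depends on.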
The presentation skills use PptxGenJS for programmatic slide generation, which means every element is placed at explicit coordinates and sizes rather than approximated. Images are generated through an MCP integration with Google's Gemini image model, with brand-aware prompting that automatically appends the correct color palette and aesthetic guidelines to every generation request.
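The brand-aware prompting step is simple to sketch. The TypeScript below assumes a hypothetical `BrandVisualStyle` object and `buildImagePrompt` helper, and the hex colors and style text are placeholders, not real brand values. It shows only the idea of appending palette and aesthetic guidance to whatever the user asked for. The call to the image model itself is left out.

```typescript
// Hypothetical brand definition: the values below are placeholders.
interface BrandVisualStyle {
  palette: string[];  // hex colors
  aesthetic: string;  // short style description
}

// Appends the palette and aesthetic guidance to the user's request.
function buildImagePrompt(userRequest: string, brand: BrandVisualStyle): string {
  // Normalize trailing punctuation so the pieces join cleanly.
  const request = userRequest.trim().replace(/[.\s]+$/, "");
  return [
    `${request}.`,
    `Color palette: ${brand.palette.join(", ")}.`,
    `Style: ${brand.aesthetic}.`,
  ].join(" ");
}

const exampleBrand: BrandVisualStyle = {
  palette: ["#0B3D91", "#F5A623"],
  aesthetic: "clean, flat, high contrast",
};

const imagePrompt = buildImagePrompt(
  "Abstract data center backdrop for a title slide.",
  exampleBrand,
);
// imagePrompt === "Abstract data center backdrop for a title slide. Color palette: #0B3D91, #F5A623. Style: clean, flat, high contrast."
```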
A visual QA step at the end of each workflow catches text overlap, alignment problems, and contrast issues before the user sees the output. Small detail, but it kills the usual "almost done but not quite" rework loop.
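Two of those checks are easy to express as code. The sketch below is an assumption about how such checks could work, not the system's actual QA implementation: `boxesOverlap` is a standard axis-aligned rectangle test, and `contrastRatio` follows the WCAG 2.x relative-luminance formula. The 4.5:1 threshold is the WCAG AA value for normal-size text. Whether the system uses that exact threshold is not stated here.

```typescript
// Element bounds on a slide, in inches from the top-left corner.
interface Box {
  x: number;
  y: number;
  w: number;
  h: number;
}

// True when two rectangles share any interior area (touching edges do not count).
function boxesOverlap(a: Box, b: Box): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w && a.y < b.y + b.h && b.y < a.y + a.h;
}

// Linearized sRGB channel from a "#RRGGBB" string, per WCAG 2.x.
function linearChannel(hex: string, offset: number): number {
  const c = parseInt(hex.replace("#", "").slice(offset, offset + 2), 16) / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance(hex: string): number {
  return (
    0.2126 * linearChannel(hex, 0) +
    0.7152 * linearChannel(hex, 2) +
    0.0722 * linearChannel(hex, 4)
  );
}

// WCAG contrast ratio, from 1 (identical colors) to 21 (black on white).
function contrastRatio(foreground: string, background: string): number {
  const l1 = relativeLuminance(foreground);
  const l2 = relativeLuminance(background);
  return (Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05);
}

const MIN_TEXT_CONTRAST = 4.5; // WCAG AA for normal-size text

function hasReadableContrast(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= MIN_TEXT_CONTRAST;
}
```

For example, `hasReadableContrast("#FFFFFF", "#FFFFFF")` fails because the ratio is 1, while black text on a white background passes at the maximum ratio of 21.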
All configuration (API keys, default preferences, output folders) lives in a single environment file, which means handing the system to a new team is a five-minute setup rather than a documentation deep-dive.
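A minimal sketch of what reading such a file could look like follows. The variable names (`IMAGE_API_KEY`, `OUTPUT_DIR`, `DEFAULT_THEME`), the defaults, and the `SkillConfig` shape are hypothetical. The parser handles only simple `KEY=value` lines and comments, which is an assumption about the file format.

```typescript
// Hypothetical config shape: variable names and defaults are assumptions.
interface SkillConfig {
  apiKey: string;
  outputDir: string;
  defaultTheme: string;
}

// Parses simple KEY=value lines; blank lines and "#" comments are skipped.
function parseEnv(text: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Fails fast on the one required value and fills in defaults for the rest.
function loadConfig(envText: string): SkillConfig {
  const vars = parseEnv(envText);
  const apiKey = vars["IMAGE_API_KEY"];
  if (!apiKey) {
    throw new Error("IMAGE_API_KEY is required");
  }
  return {
    apiKey,
    outputDir: vars["OUTPUT_DIR"] ?? "./output",
    defaultTheme: vars["DEFAULT_THEME"] ?? "light",
  };
}

const sampleEnv = [
  "# Example values only",
  "IMAGE_API_KEY=replace-me",
  "OUTPUT_DIR=./decks",
].join("\n");

const config = loadConfig(sampleEnv);
```

Failing fast on the missing key is what keeps setup short: a new team sees one clear error instead of a broken image step later in the workflow.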
What We Learned
A few things stuck.
Brand voice enforcement is the highest-leverage feature. We thought layout automation would be the big win. It helps, but the real value is never having to manually check whether someone used the wrong product name or wrote "In today's rapidly evolving landscape" as a slide opener. Killing that whole class of error is worth more than any formatting automation.
Conversational intake beats form-filling. Early versions asked structured questions: audience, purpose, slide count, layout preferences. People found it tedious and often didn't know the answers upfront. We switched to a conversational model where the system asks follow-up questions only when there are real gaps. The experience feels natural instead of bureaucratic.
Modular skills beat monolithic tools. Separate skills for creation, editing, images, and templates keep each one focused and maintainable. You can use them independently too: generate an image for a one-off doc, or edit a deck that was originally created by hand.
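The conversational intake lesson comes down to asking only about real gaps. A minimal TypeScript sketch of that idea follows, assuming a hypothetical `DeckBrief` extracted from the user's message and a made-up list of required fields. Optional details such as slide count are never asked about and would fall back to defaults.

```typescript
// Hypothetical brief extracted from the user's message; all fields are optional.
interface DeckBrief {
  topic?: string;
  audience?: string;
  focus?: string[];
  slideCount?: number;
}

// Only these fields trigger a follow-up when missing (an assumed policy).
const REQUIRED_QUESTIONS: Array<{ field: keyof DeckBrief; question: string }> = [
  { field: "topic", question: "What is the deck about?" },
  { field: "audience", question: "Who is the audience?" },
];

// Returns the questions still worth asking, in order; empty means "go ahead".
function followUpQuestions(brief: DeckBrief): string[] {
  return REQUIRED_QUESTIONS
    .filter(({ field }) => brief[field] === undefined)
    .map(({ question }) => question);
}

// The example request from earlier in this post already covers both required fields.
const completeBrief: DeckBrief = {
  topic: "Q1 earnings review",
  audience: "internal",
  focus: ["storage growth numbers", "new product line"],
};

const noQuestions = followUpQuestions(completeBrief);          // []
const oneQuestion = followUpQuestions({ topic: "Roadmap" });   // ["Who is the audience?"]
```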
Where This Goes Next
The same pattern could extend beyond slide decks. Any workflow that mixes content creation with brand compliance and professional formatting is a candidate: marketing one-pagers, internal comms, customer-facing proposals, training materials. The architecture would handle them once you define the brand rules and output formats.
The shift isn't AI replacing creative work. It's AI augmentation: taking on the compliance and formatting overhead so humans can focus on what they're good at, deciding what to say and why it matters.