AI Agents Full Course 2026: Master Agentic AI (2 Hours)

Summary

This comprehensive course on AI agents covers everything from foundational loops to multi-agent architectures without requiring a computer science background. Guided by an expert running a $4M yearly enterprise, it outlines the core agentic workflow and details setup across three primary platforms: OpenAI's Codeex, Anthropic's Claude Code, and Google's Anti-gravity. Key advanced techniques demonstrated include self-modifying system prompts, multi-agent MCP orchestration, video-to-action pipelines, stochastic consensus, agent debate chat rooms, sub-agent verification, prompt contracts, and cost-effective context management using the 60/30/10 rule and the iceberg technique.

Key Insights

The Core Agent Loop continuously iterates through Observation, Reasoning, and Action until reaching the Definition of Done.

At the molecular level, AI agents operate using a core loop of Observe, Reason, and Act. During observation, the agent loads files, system prompts, conversation history, and previous tool results. It then transitions to a reasoning or 'thinking' step to devise an execution plan. Finally, it acts by executing code, calling APIs, or editing files. The cycle repeats, accumulating tokens and information, until the agent triggers its 'definition of done,' which is a explicit set of criteria and constraints defining when a task is successfully resolved.

Self-modifying system prompts reduce repeating errors by appending learning rules to configuration files at session ends.

By establishing a global or local markdown instruction file (such as agents.mmd, claude.mmd, or gemini.md), users can create a self-correcting prompt architecture. When an agent experiences a failure, hits a bug, or receives explicit user correction, it is instructed to automatically write a clear imperative rule to the bottom of the instruction file (e.g., 'Never generate dark-mode UI' because 'the user prefers clean white minimalist aesthetics'). This allows historical learnings to accumulate across independent sessions, steadily driving down the agent's error rate.

Stochastic consensus and chat room debates leverage model variance to generate highly refined strategic solutions.

Rather than relying on a single, linear query, developers can exploit the statistical stochasticity of LLMs. In Stochastic Multi-Agent Consensus, a parent agent spawns multiple parallel sub-agents with slight prompt variations to map the response 'search space,' analyzing outcomes based on median, mode, and outliers. Similarly, Agent Chat Rooms group role-playing sub-agents (systems thinker, pragmatist, contrarian, user advocate, and edge-case finder) to debate a problem round-robin, sharpening ideas through direct challenge before delivering a synthesized report.

Sustainable agent deployment requires managing context degradation and token budgets using strategic abstraction patterns.

As conversation context approaches limits, model accuracy degrades, leading to dropped tool calls and lost details. Advanced architectures handle this via auto-compaction and the Iceberg Technique—statically loading only essential system rules, memory, and task states while abstracting deep files, history, and internet searches behind tools (GP, Glob, Web Fetch, and Read). Combined with the 60/30/10 rule (allocating 60% of simple tasks to cheap models, 30% to mid-tier, and 10% to premium routers), operators can drop operational token costs by up to 60% without quality losses.

Sections

Course Overview and Multi-Chrome Demo

The course is designed for both personal and business use without requiring prior programming experience.

The instructor shares that they teach over 2,000 people and run a business generating over $4 million annually using AI agents. No formal computer science degree is necessary, as everything they learned was derived from free resources. This course is platform-agnostic, focusing on the architectural methodologies underlying Codeex, Claude Code, and Google's Anti-gravity.

A live demo showcases five automated Chrome browser instances filling out contact forms in parallel.

The instructor demonstrates how to parallelize lead-outreach efforts using multiple Claude Code sub-agents. These agents independently open their own Chrome browsers, navigate to target website contact pages, extract contact forms, dynamically map out the input fields (first name, last name, email), generate a personalized outreach message based on context, and submit the form—massively scaling outreach throughput.

Parallel processing minimizes the limitations of lower model accuracy compared to manual human iteration.

Though individual AI agents may lack the pinpoint, one-shot accuracy of a expert human developer, they dominate in execution speed. By parallelizing workflows—running dozens of instances of themselves simultaneously to test multiple unique paths—agents can iterate repeatedly until a high-quality, verified solution is found, outpacing manual labor.

The Core Agent Loop and System Components

The core agent workflow relies on a continuous loop of observation, reasoning, action, and feedback.

The loop begins with the Observe step, where the agent reads files, previous tool calls, system prompts, and multimodal inputs. It processes this in the Think/Reason step, outputting a clear, step-by-step plan. The Act step follows, using tools, running commands via CLIs, or editing files, generating outputs that loop back to a new Observation phase.

A robust definition of done is essential to prevent loop infinity and ensure prompt completeness.

The definition of done represents the set of specific, concrete parameters that must be met for the loop to terminate. Without a clear definition of done embedded in the agent's prompts, the agent may perpetually query tools, get stuck in feedback loops, or deliver underwhelming final summaries.

An AI agent consists of a reasoning LLM wrapped in tools, goals, and persistent memory.

While a chatbot is merely a raw Large Language Model answering text inputs, an AI agent coordinates several layers of architecture. The LLM acts as the reasoning engine, but the agent system includes tools (file systems, CLI access, API controllers, web browsers), user-directed goals, and persistent memory spaces ( conversation history, custom .md configuration files, automated memory systems, and skill libraries).

Setting Up the Three Major Platforms

OpenAI's Codeex agent platform can be set up easily on MacOS, Windows, and Linux.

To download Codeex, users navigate to OpenAI's website and select the app installer. On MacOS, installation is completed by dragging the download file to the Applications folder. Inside, users open a specific directory and can directly type instructions to build complex applications, as demonstrated with a minimalist single-page personal portfolio.

Anthropic's Claude Code is a paid agent interface that provides a high return on investment.

To use Claude Code, users download the application and subscribe to a Pro tier ($17 per year or $20 monthly). Inside Code mode, developers choose a local project folder and issue prompt sequences. A demonstration shows Claude Code autonomously creating a portfolio site by analyzing directory files, executing build scripts, and opening result pages.

Google's Anti-gravity app offers a slick interface wrapping the Gemini 3.1 Pro model.

Google Anti-gravity is downloaded directly, logging users in automatically via their Google profiles. On the right-hand panel, users interact with the Gemini agent, while the center displays file contents. Gemini showcases strong design capabilities by building a portfolio page styled with modern glassmorphism features.

Claude, Gemini, and GPT-based platforms each feature distinct operational strengths and weaknesses.

Claude Code provides the most interpretable reasoning logs, letting users inspect, pause, and steer agents, but runs slower without expensive fast modes. Gemini excels in front-end design, UI aesthetics, fast output rendering, and native video processing, though it lacks reasoning transparency. GPT (featured in Codeex) is the strongest at backend development, mathematics, and test-driven parallel workflows.

Self-Modifying System Prompts and Skills

System configurations like agents.mmd serve as static anchors for local and regional prompts.

Whether using agents.mmd (Codeex), claude.mmd (Claude Code), or gemini.md (Anti-gravity), these files act as a system header that is automatically prepended to every conversation thread. This allows users to store permanent rules, persona details, API patterns, and design preferences directly in the project directory.

Self-modifying system prompts automatically update rule structures based on user corrections and issues.

By establishing instructions inside the system configuration file to listen for user feedback, the agent learns to append a numbered sequence of imperative rules to the bottom of the document upon error detection. When the user asserts a preference—such as requesting light mode—the prompt writes a persistent instruction for future sessions.

Global configurations, local project configurations, and skills establish a structured tier of prompt hierarchy.

The prompt hierarchy is structured from general to specific. A global system metadata file enforces company-wide standards; a local project configuration file handles repository-specific behaviors; local skill files encapsulate deterministic workflows; and the final inline user prompt initiates immediate actions, conserving the overall context budget.

Agent skills provide highly repeatable, deterministic blueprints for complex programmatic actions.

Skills are markdown documents featuring names, descriptions, and explicit workflow steps marked by triple-hyphen delimiters. A demonstrative run using an 'algorithmic art' skill guides a model to construct a particle-rendering browser application in an identical, structured format every time, producing highly consistent outputs.

High-Level Multi-Agent Orchestration Patterns

Multi-agent MCP orchestration allows a manager model to delegate specific subtasks to specialized models.

Using the Model Context Protocol (MCP), a top-level router/manager models (like Claude) can divide a complex task into multiple subtasks. Claude plans the build, delegates UI design to Gemini, hands API development and test-driven verification to Codeex, and then merges and validates the components to form a finished application.

Implementing multi-agent MCP requires linking direct API keys from different platform developers.

For multi-platform routing, developers generate API keys from the respective console dashboards: Anthropic Console, OpenAI Developer Platform, and Google AI Studio. These keys are introduced to the environment variables of the manager model, which spawns backend subprocesses to interface with the external engines.

The video-to-action pipeline utilizes Gemini's video understanding to extract step-by-step tutorial guides.

Since video files capture details hard to write out in plain text, Google's Gemini API is used to watch videos (like YouTube tutorials). Gemini analyzes the video frame-by-frame and exports a structured markdown action list. Claude Code then reads this list to drive browser-controlling agents, replicating the tutorial autonomously.

Harnessing Stochasticity: Consensus and Chat Rooms

Stochastic Multi-Agent Consensus spawns several parallel agents to filter out hallucinations and surface rare ideas.

To exploit LLM stochasticity, a parent agent issues parallel requests containing slight framing variations (e.g., 'assume limited budget,' 'focus on user perspective'). Multiple sub-agents analyze the problem, and their results are aggregated. The manager evaluates commonalities (the mode or consensus) and highlights unique edge-case suggestions.

Running parallel agent queries scales the search space surface area within short turnaround times.

A single linear session might take multiple back-and-forth prompt turns, consuming up to 15 minutes of developer wait time. By parallelizing queries across ten separate instances, the system traverses a massive portion of the search space in a single five-minute window, identifying critical solutions like TikTok hook restructuring.

Agent Chat Rooms utilize custom perspectives to debate and solve difficult strategic challenges.

A centralized chat history file (chat.json) is opened for five distinct sub-agents representing a systems thinker, pragmatist, contrarian, edge-case finder, and user advocate. The agents take round-robin turns debating a topic. This structured friction challenges assumptions, catching flaws a single model sequence would ignore.

Verification Loops, Prompt Contracts, and Reverse Prompting

Sub-agent verification loops cure development biases by handing raw outputs to unbiased reviewer models.

An implementer agent can develop blind spots after debugging code, building a 'sunk cost bias' regarding its architectural choices. In a verification loop, the raw output is sent to a fresh, context-free reviewer agent. If issues are found, they are routed to a resolver agent, producing clean, verified results.

Prompt contracts establish clear goals, constraints, formats, and failure states prior to project execution.

Vague instructions (e.g., 'make a beautiful website') often lead to project failures. Under a prompt contract system, the agent analyzes the task and writes a four-section document outlining the goals, constraints, format details, and precise failure conditions. The task begins only after the developer signs off on this contract.

Reverse prompting forces models to ask clarifying questions before attempting a one-shot generation.

Instead of generating code based on assumptions, a reverse-prompting skill instructs the agent to ask the user five precise clarifying questions first. Answering these questions narrows down design tastes, performance standards, and environment constraints, greatly increasing the chances of a one-shot success.

Context, Compaction, and Token Cost Optimization

Large context windows degrade model output quality over long chat conversations.

Though context windows range from 150k words up to 700k words, model accuracy degrades as token counts rise. When context grows, models struggle to locate relevant information, leading to dropped tool calls, errors, and high API costs. This makes proactive context-window management critical.

Auto-compaction limits token bloat by summarizing and condensing previous message history on the fly.

Many agent systems combat token bloat through automatic compaction. When active tokens hit a threshold (e.g., 80% capacity), the system condenses the chat history to about 30%, retaining key settings and preferences while removing tool outputs and junk tokens.

The Iceberg Technique stores essential instructions statically while loading bulky codebase files on demand.

To save tokens, the Iceberg Technique keeps only global prompts, task states, and active file contexts in the main window. Deep codebases, web results, and historical details stay 'underwater.' Tools like GP, Glob, and Read fetch these files only when required, keeping the active context small.

The 60/30/10 allocation rule cuts API costs by routing tasks to different price-point models.

Organizations can save up to 60% in token expenses by reserving premium models (like Opus or GPT-5) for high-level orchestration (10% of tasks). Mid-tier models (like Sonnet) handle standard writing and extraction (30% of tasks), while low-cost models (like Haiku or Gemini Flash) manage basic scraping tasks (60% of tasks).

Batch processing APIs leverage low-inference server windows to offer 50% discounts on bulk queries.

Developers can submit non-urgent, high-volume tasks (such as lead enrichment list processing) via batch API endpoints on OpenAI, Anthropic, or Google. Platforms queue these requests and process them overnight when server load decreases, returning the results within 24 hours at a 50% discount.

Ask a Question

*Uses 1 Wisdom coin from your coin balance

Watch Video

Open in YouTube