Prompt Engineering Is Not Enough: How Java Developers Should Structure AI Agent Workflows Using Embabel or Koog

Eleftheria Drosopoulou · April 30th, 2026 · Last Updated: April 23rd, 2026

Goal-oriented planning, tool registration, and multi-step task execution — finally, on the JVM.

1. The Problem With Prompt-Only Thinking

If you have spent any real time building LLM-powered features in Java, you have almost certainly run into the same ceiling. At first, a clever prompt seems to solve everything. You craft a system message, shape the user turn, sprinkle in some examples, and — for a while — it works surprisingly well. Then, as soon as the task gets more complex, the cracks appear. The model forgets earlier steps. It invents data you did not give it. It confidently completes the wrong goal. Moreover, when it does fail, you have very little idea why.

This is not a model quality problem. It is an architectural problem. As Rod Johnson — the creator of the Spring Framework — put it when describing why he built Embabel: “Without agentic systems, we are more like alchemists than engineers, our prompts more like incantations.” That framing resonates with any developer who has spent an afternoon chasing a flaky prompt through staging logs.

Prompt engineering optimises a single LLM call. Agent frameworks orchestrate sequences of calls, tools, and decisions — and they do it in a way that is testable, explainable, and survives refactoring. These are entirely different problems.

For years, Java developers who wanted agent capabilities faced an uncomfortable choice: adopt Python frameworks like LangChain or CrewAI and maintain a polyglot stack, or build their own orchestration on top of Spring AI or LangChain4j — useful primitives, but not full agent planners. That gap has now closed, and it closed quickly.

2. The New JVM Agent Landscape

Two frameworks arrived within months of each other in 2025, and both came from credible, well-resourced teams. Embabel, launched by Rod Johnson in May 2025, is built on Spring Boot and written in Kotlin with first-class Java interoperability. Koog, open-sourced by JetBrains at KotlinConf 2025, is built entirely on Kotlin coroutines, with a graph-based strategy model that targets not just the JVM but also Android, iOS, and WebAssembly via Kotlin Multiplatform.

By contrast, the earlier generation of JVM AI tooling — Spring AI and LangChain4j — deliberately sits lower in the stack. As the Java Code Geeks 2026 trends report summarises it: “Spring AI is the pragmatic entry point. Embabel and Koog are for teams building serious multi-step agent workflows on the JVM.” Think of the relationship the same way you think of servlets versus Spring MVC — the lower layer is still there, but most developers should not be coding directly against it.

The stack, visualised: LLM API → Spring AI / LangChain4j (primitives) → Embabel / Koog (agent orchestration) → your business application. The middle layer is where planning, tool registration, and multi-step execution actually live.

Figure: JVM AI Framework GitHub Stars, Comparative Growth (2025). Approximate star counts at key milestones, based on public repository data and reported figures from InfoQ and The New Stack.

3. Embabel in Depth: GOAP Meets Spring

Embabel is, in essence, Rod Johnson’s answer to a question he believes the Python ecosystem has not answered well: how do you bring engineering discipline to agentic AI? The framework’s most distinctive idea is that it separates planning from execution. Rather than asking an LLM to decide what to do next, Embabel delegates that decision to a deterministic AI algorithm borrowed from video game development called Goal-Oriented Action Planning (GOAP).

In a game engine, GOAP lets a non-player character (NPC) work out a sequence of actions (pick up weapon, find cover, flank the enemy) given a set of preconditions and a desired goal state. Embabel applies the same algorithm to enterprise workflows. The developer defines actions (what the agent can do), conditions (what must be true before or after an action), and goals (the desired end state). The GOAP planner then computes an optimal path through those actions at runtime and replans after every step, forming what Johnson calls an OODA loop (Observe, Orient, Decide, Act).

Furthermore, because the planner is a non-LLM algorithm, the decisions it makes are fully explainable. You can log exactly why the planner chose action B over action A, which is critical for any regulated or audited business process.
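To make the planning idea concrete, here is a minimal plain-Java sketch of a GOAP-style planner. It does not use Embabel’s API: the names GoapSketch, Action, plan, and orderActions are hypothetical, the world state is reduced to a set of string facts, and a breadth-first search stands in for whatever search Embabel actually runs. The three illustrative order-handling actions are listed out of order on purpose, so the sequence comes from the preconditions and effects alone.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical names throughout; this is not Embabel's API.
final class GoapSketch {

    // An action can run once all its preconditions hold; running it adds its effects.
    record Action(String name, Set<String> preconditions, Set<String> effects) {}

    // Breadth-first search over world states. Returns the shortest sequence of
    // action names that makes `goal` true, or Optional.empty() if none exists.
    static Optional<List<String>> plan(Set<String> start, String goal, List<Action> actions) {
        Set<String> initial = Set.copyOf(start);
        Deque<Set<String>> frontier = new ArrayDeque<>();
        Map<Set<String>, List<String>> pathTo = new HashMap<>();
        frontier.add(initial);
        pathTo.put(initial, List.of());

        while (!frontier.isEmpty()) {
            Set<String> state = frontier.poll();
            List<String> path = pathTo.get(state);
            if (state.contains(goal)) {
                return Optional.of(path);
            }
            for (Action action : actions) {
                if (!state.containsAll(action.preconditions())) {
                    continue; // not applicable in this state
                }
                Set<String> next = new HashSet<>(state);
                next.addAll(action.effects());
                Set<String> key = Set.copyOf(next);
                if (pathTo.containsKey(key)) {
                    continue; // already reached by a path at least as short
                }
                List<String> nextPath = new ArrayList<>(path);
                nextPath.add(action.name());
                pathTo.put(key, List.copyOf(nextPath));
                frontier.add(key);
            }
        }
        return Optional.empty();
    }

    // Illustrative order-handling actions, deliberately listed out of order.
    static List<Action> orderActions() {
        return List.of(
            new Action("dispatch", Set.of("orderValidated", "inventoryChecked"), Set.of("dispatched")),
            new Action("checkInventory", Set.of("orderValidated"), Set.of("inventoryChecked")),
            new Action("validateOrder", Set.of("orderPending"), Set.of("orderValidated")));
    }

    public static void main(String[] args) {
        // prints Optional[[validateOrder, checkInventory, dispatch]]
        System.out.println(plan(Set.of("orderPending"), "dispatched", orderActions()));
    }
}
```

Because every decision is a set comparison, logging why an action was or was not applicable is a one-line change, which is the explainability property described above.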

Key Embabel Concepts at a Glance

Concept	What it does	Familiar Spring analogy
@AgentComponent	Marks a class as an agent that the platform can discover	@Service / @Component
@Action	Declares a method as a step the planner can use	@RequestMapping (a route the framework can invoke)
Goal	Desired end state; the planner works backward from here	Return type of a controller endpoint
Condition	A typed boolean checked before/after each action	@PreAuthorize / guard clause
Blackboard	Shared typed state that all actions read and write	The model object passed through a request pipeline
AgentPlatform	Bootstraps and executes agent flows (Focused / Closed / Open)	DispatcherServlet

Beyond GOAP, Embabel’s 2025 year-end update added a Utility planner (for open-ended exploration without a fixed goal) and a Supervisor pattern that maps directly to the LangGraph supervisor-with-workers model. Teams migrating from Python get a familiar mental model while gaining the compile-time type safety and testability of the JVM.

For a practical starting point, the Baeldung Embabel tutorial walks through building a quiz-generation agent, and the Dan Vega first-look guide covers tool registration and MCP server integration in detail. Both are solid starting points before you touch the official GitHub repository.

4. Koog in Depth: Coroutines Meet Agent Graphs

JetBrains took a different architectural path. Where Embabel delegates planning to a deterministic algorithm, Koog asks developers to define the agent’s strategy as an explicit directed graph of nodes. Each node performs one operation — calling an LLM, invoking a tool, summarising message history, routing to a subgraph — and edges between nodes define flow control including loops, branches, fallbacks, and parallel paths.
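The node-and-edge idea can be sketched in a few lines of plain Java. This is not Koog’s API (Koog strategies are written in its Kotlin DSL): GraphStrategySketch, node, edge, run, and orderFlow are hypothetical names, the shared state is an untyped map, and the nodes here are plain lambdas where a real strategy would call an LLM or a tool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical names throughout; this is not Koog's API.
final class GraphStrategySketch {

    // A conditional transition to another node.
    record Edge(String to, Predicate<Map<String, Object>> condition) {}

    private final Map<String, Consumer<Map<String, Object>>> nodes = new LinkedHashMap<>();
    private final Map<String, List<Edge>> edges = new HashMap<>();

    GraphStrategySketch node(String name, Consumer<Map<String, Object>> operation) {
        nodes.put(name, operation);
        return this;
    }

    GraphStrategySketch edge(String from, String to, Predicate<Map<String, Object>> condition) {
        if (!nodes.containsKey(from) || !nodes.containsKey(to)) {
            throw new IllegalArgumentException("unknown node in edge " + from + " -> " + to);
        }
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, condition));
        return this;
    }

    // Executes nodes from `start`, following the first edge whose condition holds.
    // Stops when no edge matches or after maxSteps (a guard against endless loops).
    List<String> run(String start, Map<String, Object> state, int maxSteps) {
        if (!nodes.containsKey(start)) {
            throw new IllegalArgumentException("unknown start node: " + start);
        }
        List<String> trace = new ArrayList<>();
        String current = start;
        for (int step = 0; current != null && step < maxSteps; step++) {
            nodes.get(current).accept(state);
            trace.add(current);
            String next = null;
            for (Edge e : edges.getOrDefault(current, List.of())) {
                if (e.condition().test(state)) {
                    next = e.to();
                    break;
                }
            }
            current = next;
        }
        return trace;
    }

    // An illustrative flow with one branch: dispatch when in stock, else backorder.
    static GraphStrategySketch orderFlow() {
        return new GraphStrategySketch()
            .node("validate", s -> s.put("valid", true))
            .node("checkInventory", s -> s.put("inStock", (Integer) s.get("quantity") <= 5))
            .node("dispatch", s -> s.put("result", "dispatched"))
            .node("backorder", s -> s.put("result", "backordered"))
            .edge("validate", "checkInventory", s -> Boolean.TRUE.equals(s.get("valid")))
            .edge("checkInventory", "dispatch", s -> Boolean.TRUE.equals(s.get("inStock")))
            .edge("checkInventory", "backorder", s -> !Boolean.TRUE.equals(s.get("inStock")));
    }

    public static void main(String[] args) {
        Map<String, Object> state = new HashMap<>();
        state.put("quantity", 3);
        // prints [validate, checkInventory, dispatch]
        System.out.println(orderFlow().run("validate", state, 10));
    }
}
```

The flow control is entirely in the edges, so a reviewer can read the possible paths without running anything; the trade-off is that every path has to be wired by hand.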

This node-edge model will feel immediately familiar to anyone who has worked with LangGraph. However, Koog’s implementation is considerably more idiomatic for JVM developers because it is built entirely on Kotlin coroutines, which means concurrency, streaming responses, and parallel tool calls all compose naturally without callback hell or thread management. Additionally, Koog ships with built-in history compression — an important practical detail when running long agentic sessions where raw message history would otherwise exhaust the model’s context window.
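History compression is easier to reason about with a small example. The sketch below is one simple way to do it, not Koog’s implementation: it assumes the first message is the system prompt, keeps the most recent turns verbatim, and replaces everything in between with a single summary message. HistoryCompressionSketch, Message, and compress are hypothetical names, and the summariser argument stands in for what would normally be an LLM call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout; this is not Koog's API.
final class HistoryCompressionSketch {

    record Message(String role, String text) {}

    // Keeps the first (system) message and the last `keepRecent` messages,
    // and replaces the turns in between with one summary message.
    static List<Message> compress(List<Message> history, int keepRecent,
                                  Function<List<Message>, String> summariser) {
        if (history.size() <= keepRecent + 1) {
            return List.copyOf(history); // nothing old enough to summarise
        }
        int cut = history.size() - keepRecent;
        List<Message> older = history.subList(1, cut);

        List<Message> compressed = new ArrayList<>();
        compressed.add(history.get(0));
        compressed.add(new Message("system",
            "Summary of earlier turns: " + summariser.apply(older)));
        compressed.addAll(history.subList(cut, history.size()));
        return List.copyOf(compressed);
    }

    public static void main(String[] args) {
        List<Message> history = new ArrayList<>();
        history.add(new Message("system", "You are an order processing agent."));
        for (int i = 1; i <= 6; i++) {
            history.add(new Message(i % 2 == 1 ? "user" : "assistant", "turn " + i));
        }
        // A trivial stand-in summariser: it only reports how many turns were folded away.
        List<Message> result = compress(history, 2, older -> older.size() + " messages");
        result.forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```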

Arguably, Koog’s biggest differentiator is its multiplatform reach. Because it is built on Kotlin Multiplatform, the same agent logic can run on the JVM backend, on Android, on iOS, and even in the browser via WebAssembly — a capability no other JVM agent framework currently offers.

What Koog Ships Out of the Box

Feature	Details	Version introduced
Graph-based strategies	Nodes + edges with loops, branches, parallel paths	0.1.0 (initial)
Non-graph strategy API	Define strategies as Kotlin extension functions without explicit wiring	0.5.0
MCP integration	Native Model Context Protocol support via Kotlin MCP SDK	0.1.0
A2A protocol	Agent-to-Agent communication; agents discover and call each other	0.5.0 (Oct 2025)
Persistence & checkpointing	Snapshot agent state; resume exactly where execution paused	0.4.0
History compression	Intelligent summarisation to manage context window usage	0.1.0
OpenTelemetry observability	Built-in exporters for Langfuse and W&B Weave	0.3.0
AIAgentService	Manage multiple uniform running agents as state-managed services	0.5.0

The JetBrains team has also published a multi-part blog series that builds a real coding agent step by step, arguably the most readable practical onboarding content available for any JVM agent framework today. Koog’s documentation lives at docs.koog.ai.

5. Side-by-Side Comparison

Both frameworks solve the same high-level problem, but they optimise for different teams and different kinds of agents. Rather than declaring a winner, the table below lays out the architectural tradeoffs as clearly as possible so that you can make the choice that fits your stack.

Dimension	Embabel	Koog
Planning model	GOAP (deterministic, non-LLM)	Explicit graph / coroutines
Primary language	Kotlin + excellent Java interop	Kotlin (Java API available)
Spring integration	Deep — built on Spring Boot	Available (Spring Boot + Ktor adapters)
Multiplatform	JVM only	JVM, Android, iOS, JS, WASM
History / memory	Domain blackboard; conversation memory is app-level concern	Built-in history compression + RAG support
Observability	Spring Actuator; logging; prompt testing library	OpenTelemetry, Langfuse, W&B Weave out of the box
Fault tolerance	Replanning after each action (OODA loop)	Checkpointing, rollback tool side-effects, retries
License	Apache 2.0	Apache 2.0
Planner explainability	High — non-LLM algorithm, fully loggable	Medium — graph is explicit but LLM drives node decisions
Best fit for	Spring teams with complex domain models	Kotlin-first teams; multiplatform or mobile deployment

As Rod Johnson himself noted in the GitHub discussion comparing the two: “For Java developers, especially those using Spring, the choice is obviously Embabel. For Kotlin developers it probably comes down to whether Koog’s explicit node-edge wiring is what they’re looking for.”

6. Goal-Oriented Planning in Practice

Let us make goal-oriented planning concrete with a realistic scenario: an order-processing agent that needs to validate a customer order, check inventory, apply pricing rules, and then dispatch a fulfilment event. Without a framework, you would write this as a waterfall of chained LLM calls or hand-coded state machines. With Embabel, you instead declare the pieces and let the planner sequence them.

Here is a simplified view of how Embabel’s annotations map to the planner’s vocabulary. Notice that the code is idiomatic Java — no XML, no special runner, just annotated methods that Spring and Embabel pick up automatically:

Java · Embabel — Order Processing Agent (simplified)

import com.embabel.agent.api.annotation.*;
import com.embabel.agent.api.common.OperationContext;

@AgentComponent
public class OrderProcessingAgent {

    // Collaborating services are ordinary Spring beans, injected through the constructor
    private final InventoryService inventory;

    public OrderProcessingAgent(InventoryService inventory) {
        this.inventory = inventory;
    }

    // Action 1: validate the incoming order
    // Precondition: an unvalidated order exists on the blackboard
    // Effect: a ValidatedOrder object is placed on the blackboard
    @Action(
        pre  = "order.status == 'PENDING'",
        post = "validatedOrder != null"
    )
    public ValidatedOrder validateOrder(Order order, OperationContext ctx) {
        // The LLM prompt is focused: just validation rules, nothing else
        return ctx.promptForObject(
            "Validate this order for completeness and fraud signals: " + order,
            ValidatedOrder.class
        );
    }

    // Action 2: check inventory (pure Java, no LLM involved)
    // Method parameters are domain objects taken from the blackboard
    @Action(
        pre  = "validatedOrder != null",
        post = "inventoryResult != null"
    )
    public InventoryResult checkInventory(ValidatedOrder validatedOrder) {
        return inventory.check(validatedOrder.items());
    }

    // Goal — what the planner is trying to reach
    @AchievesGoal
    @Action(
        pre = "inventoryResult.allInStock == true"
    )
    public FulfilmentEvent dispatch(ValidatedOrder order,
                                    InventoryResult inv) {
        return new FulfilmentEvent(order, inv.reservationId());
    }
}

There are several things worth noticing here. First, each action has a narrow, focused responsibility. The LLM in validateOrder is only asked about fraud and completeness — not inventory, not pricing. That focus makes the prompt far more reliable than one giant prompt trying to do everything at once. Second, checkInventory involves no LLM at all — it is plain Java calling a service. Embabel happily mixes LLM-powered and code-driven actions in the same plan. Third, you never write the sequence yourself; the GOAP planner infers it from the preconditions and postconditions at runtime.

Embabel’s planner evaluates preconditions as typed expressions against the blackboard. Keep them to data-presence checks (validatedOrder != null) rather than complex business rules. Move logic into the action methods themselves, where it is testable and debuggable.
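The advice about keeping preconditions to data-presence checks can be illustrated with a small typed blackboard in plain Java. This is a sketch under stated assumptions, not Embabel’s blackboard: BlackboardSketch, canValidate, and validate are hypothetical names, and the Order and ValidatedOrder records are cut-down stand-ins for the domain types used earlier.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical names throughout; this is not Embabel's API.
final class BlackboardSketch {

    // Cut-down stand-ins for the domain types.
    record Order(String id, int quantity) {}
    record ValidatedOrder(String id, int quantity) {}

    // One entry per type: the type itself is the key.
    private final Map<Class<?>, Object> entries = new HashMap<>();

    void put(Object value) {
        entries.put(value.getClass(), value);
    }

    <T> Optional<T> get(Class<T> type) {
        return Optional.ofNullable(type.cast(entries.get(type)));
    }

    boolean has(Class<?> type) {
        return entries.containsKey(type);
    }

    // Precondition: a pure presence check, cheap to evaluate and easy to log.
    static boolean canValidate(BlackboardSketch bb) {
        return bb.has(Order.class) && !bb.has(ValidatedOrder.class);
    }

    // The business rule lives inside the action, where a unit test can reach it.
    static void validate(BlackboardSketch bb) {
        Order order = bb.get(Order.class).orElseThrow();
        if (order.quantity() <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        bb.put(new ValidatedOrder(order.id(), order.quantity()));
    }

    public static void main(String[] args) {
        BlackboardSketch bb = new BlackboardSketch();
        bb.put(new Order("A1", 2));
        System.out.println(canValidate(bb)); // prints true
        validate(bb);
        System.out.println(canValidate(bb)); // prints false
    }
}
```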

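The equivalent flow in Koog can be written graph-first: instead of declaring preconditions, you wire nodes together explicitly. The minimal example below starts one step earlier, with Koog’s default single-agent pattern, in which you register tools and let the model sequence them: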

Kotlin · Koog — Order Processing Strategy (simplified)

import ai.koog.agents.core.agent.AIAgent
import ai.koog.agents.core.tools.ToolRegistry
import ai.koog.prompt.executor.clients.openai.OpenAIModels
import ai.koog.prompt.executor.llms.all.simpleOpenAIExecutor
import kotlinx.coroutines.runBlocking

// Step 1: define tools as annotated functions and register them
// (the three tool functions are your own code and are not shown here)
val toolRegistry = ToolRegistry {
    tool(::validateOrderTool)   // wraps your validation logic
    tool(::checkInventoryTool)  // wraps InventoryService.check()
    tool(::dispatchOrderTool)   // wraps FulfilmentEvent creation
}

// Step 2: create the agent, passing the system prompt directly
val agent = AIAgent(
    promptExecutor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")),
    systemPrompt   = """
        You are an order processing agent.
        Use the available tools in sequence to validate, check
        inventory, and dispatch orders. Stop when dispatched.
    """.trimIndent(),
    llmModel       = OpenAIModels.Chat.GPT4o,
    toolRegistry   = toolRegistry
)

// Step 3: run with a specific order input
// run() is a suspend function, so it needs a coroutine scope
fun main() = runBlocking {
    // `order` is your domain object; toJson() is a placeholder serialiser
    val result = agent.run("Process order: ${order.toJson()}")
    println(result)
}

In Koog, the LLM itself decides which tool to call and in what order, guided by the system prompt and the tool descriptions. The graph-based strategy API adds more explicit control when you need branches or loops. This is a meaningful philosophical difference: Embabel’s planner is deterministic by design, while Koog’s default single-agent pattern trusts the model to sequence tools correctly — which is faster to prototype but harder to audit in regulated contexts.
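The model-driven loop described here can be reduced to a few lines of plain Java. In this sketch the chooser function stands in for the LLM and simply follows a fixed script, so it shows the shape of the loop rather than any real model behaviour. ToolLoopSketch, run, orderTools, and scriptedChooser are hypothetical names and not part of Koog.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical names throughout; this is not Koog's API.
final class ToolLoopSketch {

    // `chooser` stands in for the LLM: given the history so far,
    // it names the next tool to call, or "STOP" to finish.
    static List<String> run(String input,
                            Map<String, UnaryOperator<String>> tools,
                            Function<List<String>, String> chooser,
                            int maxIterations) {
        List<String> history = new ArrayList<>();
        history.add("user: " + input);
        String current = input;
        for (int i = 0; i < maxIterations; i++) {
            String choice = chooser.apply(List.copyOf(history));
            if ("STOP".equals(choice)) {
                break;
            }
            UnaryOperator<String> tool = tools.get(choice);
            if (tool == null) {
                // Feed the mistake back so the chooser can correct itself.
                history.add("error: unknown tool " + choice);
                continue;
            }
            current = tool.apply(current);
            history.add("tool " + choice + ": " + current);
        }
        return history;
    }

    static Map<String, UnaryOperator<String>> orderTools() {
        Map<String, UnaryOperator<String>> tools = new LinkedHashMap<>();
        tools.put("validate", s -> s + " [validated]");
        tools.put("checkInventory", s -> s + " [in stock]");
        tools.put("dispatch", s -> s + " [dispatched]");
        return tools;
    }

    // A scripted stand-in for the model: one tool per turn, then stop.
    static Function<List<String>, String> scriptedChooser() {
        List<String> script = List.of("validate", "checkInventory", "dispatch");
        return history -> {
            int turnsTaken = history.size() - 1; // entries after the user message
            return turnsTaken < script.size() ? script.get(turnsTaken) : "STOP";
        };
    }

    public static void main(String[] args) {
        run("order-42", orderTools(), scriptedChooser(), 10).forEach(System.out::println);
    }
}
```

The maxIterations guard matters more than it looks: nothing in this pattern guarantees the chooser ever says "STOP", which is one reason the sequence is harder to audit than a precomputed plan.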

7. Tool Registration and Multi-Step Execution

Tool registration is where both frameworks shine brightest, and where the contrast with raw prompt engineering is most visible. Instead of describing a capability in natural language and hoping the model extracts the right parameters, you register a typed function that the framework exposes to the model in a structured way. The model receives a JSON schema describing the tool’s inputs and outputs; it calls the tool by name with typed arguments; the framework validates the call, executes the function, and returns the result — all without you writing any serialisation code.

1. Define your tool as a typed function

In Embabel, any Spring bean method annotated with @Tool is automatically registered. In Koog, you pass a function reference to ToolRegistry. Both frameworks generate the JSON schema from the method signature using reflection — no manual schema writing required.

2. The planner or agent selects the tool

Embabel’s GOAP planner picks the next action based on which preconditions are currently satisfied. Koog’s LLM-driven agent selects tools by reasoning over the tool descriptions sent with each request, the same pattern used by OpenAI’s function-calling API.

3. The framework validates and executes

Both frameworks deserialise the model’s tool-call arguments into your typed objects before execution. If the model provides malformed arguments, the framework catches that before your code ever sees it — a safety net that raw prompt-parsing utterly lacks.

4. Results flow back into the next step

In Embabel, the return value of an @Action is placed on the typed blackboard, making it available to subsequent actions. In Koog, it is appended to the message history as a tool result, and the agent reasons over it in the next iteration of its execution loop.

5. MCP servers extend your tool catalogue

Both frameworks support the Model Context Protocol. This means any MCP-compatible server — databases, APIs, file systems — can be connected as a tool source without writing custom integration code.

A common misconception is that MCP alone is enough to build agents. As Johnson has argued, MCP solves tool discoverability and invocation — but it does not solve planning, sequencing, error recovery, or explainability. That is precisely what Embabel and Koog add on top.
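Steps 1 to 3 of the tool lifecycle can be sketched in plain Java using only reflection. This is a simplified illustration, not how either framework is implemented: ToolRegistrySketch, the @Param annotation, and OrderTools are hypothetical names, only String and int parameters are supported, and the schema is a JSON-like string rather than a full JSON Schema document.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.lang.reflect.Parameter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical names throughout; neither Embabel's nor Koog's API.
final class ToolRegistrySketch {

    // Names a tool parameter, so the schema does not depend on compiler flags.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface Param { String value(); }

    private final Map<String, Method> tools = new LinkedHashMap<>();

    // Step 1: register a static method of `holder` by name.
    void register(Class<?> holder, String methodName) {
        for (Method m : holder.getDeclaredMethods()) {
            if (!m.getName().equals(methodName) || !Modifier.isStatic(m.getModifiers())) continue;
            for (Parameter p : m.getParameters()) {
                if (p.getAnnotation(Param.class) == null) {
                    throw new IllegalArgumentException("unnamed parameter in " + methodName);
                }
                jsonType(p.getType()); // rejects unsupported parameter types early
            }
            m.setAccessible(true);
            tools.put(methodName, m);
            return;
        }
        throw new IllegalArgumentException("no static method named " + methodName);
    }

    // Builds a JSON-like schema from the method signature.
    String schema(String toolName) {
        StringJoiner props = new StringJoiner(",");
        for (Parameter p : lookup(toolName).getParameters()) {
            props.add("\"" + p.getAnnotation(Param.class).value()
                + "\":{\"type\":\"" + jsonType(p.getType()) + "\"}");
        }
        return "{\"name\":\"" + toolName + "\",\"parameters\":{" + props + "}}";
    }

    // Step 3: validate and convert the model's raw arguments, then invoke the tool.
    Object call(String toolName, Map<String, String> rawArgs) {
        Method m = lookup(toolName);
        Parameter[] params = m.getParameters();
        Object[] args = new Object[params.length];
        for (int i = 0; i < params.length; i++) {
            String name = params[i].getAnnotation(Param.class).value();
            String raw = rawArgs.get(name);
            if (raw == null) throw new IllegalArgumentException("missing argument: " + name);
            args[i] = convert(name, raw, params[i].getType());
        }
        try {
            return m.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("tool failed: " + toolName, e);
        }
    }

    private Method lookup(String toolName) {
        Method m = tools.get(toolName);
        if (m == null) throw new IllegalArgumentException("unknown tool: " + toolName);
        return m;
    }

    private static String jsonType(Class<?> type) {
        if (type == int.class) return "integer";
        if (type == String.class) return "string";
        throw new IllegalArgumentException("unsupported type: " + type);
    }

    private static Object convert(String name, String raw, Class<?> type) {
        if (type == String.class) return raw;
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("argument " + name + " is not an integer: " + raw);
        }
    }

    // An example tool: an ordinary typed method, with no serialisation code.
    static final class OrderTools {
        static String checkInventory(@Param("sku") String sku, @Param("quantity") int quantity) {
            return sku + (quantity <= 5 ? ": in stock" : ": backorder");
        }
    }
}
```

Registering OrderTools.checkInventory and calling it with the arguments sku=A1 and quantity=3 returns "A1: in stock", while quantity=three is rejected with an IllegalArgumentException before the tool method runs.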

8. Adoption in Numbers

Both frameworks are young, but the signals from the developer community are already clear. Embabel crossed 3,000 GitHub stars within its first year, a trajectory that mirrors early Spring adoption in enterprise Java circles. Meanwhile, the JetBrains team reports that Koog is already used internally to power the AI stack behind Junie (their AI coding agent) and AI Assistant in IntelliJ IDEA — giving it an unusually large and demanding real-world workload for an open-source framework less than a year old.

Figure: Perceived Production-Readiness, JVM AI Frameworks (Developer Survey Proxy). Composite score (0–10) based on InfoQ Java Trends 2025, The New Stack developer interviews, and GitHub issue/PR activity as of early 2026. Higher = more production-ready.

Interestingly, Deutsche Telekom has already built one of Europe’s largest LLM-powered customer-service chatbots on the Kotlin/JVM stack — an early signal that enterprise teams are not waiting for full framework maturity before shipping. As JetBrains’ tech lead Vadim Briliantov noted in an interview with The New Stack: “For many enterprises, Python is not considered a production-ready language, even though most modern AI tools are built on it.” That sentiment is increasingly shaping procurement decisions at the platform level.

9. Which Framework Should You Choose?

The honest answer is that neither framework is universally superior. They optimise for different team profiles and different agent architectures, so the right choice comes down to a few concrete questions.

Choose Embabel if…

Your team is deep in the Spring ecosystem and wants to add agent capabilities to existing Spring applications with minimum friction.
Your domain is complex with rich typed objects that the planner can reason about.
You need high explainability: regulatory, compliance, or audit requirements mean you must be able to trace every planning decision.
You are not targeting mobile or browser runtimes.

Choose Koog if…

Your team is Kotlin-first and values coroutines as a native concurrency model.
You need multiplatform deployment: backend and Android, or backend and browser via WASM.
You want built-in history compression and out-of-the-box OpenTelemetry integration without writing adapters.
You prefer the explicit control of graph-based wiring over automatic GOAP planning.

As the JCG deep-dive analysis from March 2026 concludes: “Spring-deep teams adding intelligence to existing services will likely reach for Embabel first; Kotlin-first teams building new agents or targeting mobile will likely reach for Koog.” That framing holds up well in practice.

In either case, before you start a new project, it is worth checking the Embabel releases page and the Koog releases page for the latest versions — both frameworks have been shipping patch releases weekly throughout early 2026, and APIs are still evolving.

10. What We Have Learned

This article has covered a lot of ground, so let us bring it together in one place before you move on.

Prompt engineering alone cannot deliver the reliability, testability, or explainability that enterprise Java applications require from AI workflows. Agent frameworks are not optional once complexity grows.
Embabel, from Spring creator Rod Johnson, introduces Goal-Oriented Action Planning — a deterministic, non-LLM algorithm that plans sequences of typed actions toward a declared goal. It is built on Spring Boot, offers excellent Java interoperability, and is the natural choice for Spring-heavy teams.
Koog, from JetBrains, takes a graph-based, coroutine-native approach that gives Kotlin developers explicit wiring control, built-in history compression, and Kotlin Multiplatform reach across backend, Android, iOS, and browser targets.
Both frameworks sit above Spring AI and LangChain4j in the stack, adding the planning and orchestration layer those lower-level libraries intentionally omit.
Tool registration in both frameworks is typed, schema-driven, and MCP-compatible — a significant step beyond raw function-calling or prompt-described capabilities.
The choice between them is primarily a team profile decision: Spring-first teams naturally gravitate to Embabel; Kotlin-first or multiplatform teams naturally gravitate to Koog. Neither is wrong.
