Tutorial on Agentic Engine

Let me declare upfront that the goal of this tutorial is not to build a powerful agent, but to understand what must exist before an agent can exist at all. Most discussions of agentic systems begin with planners, tools, memory, reflection, or orchestration frameworks. While useful, these layers often obscure the fundamental mechanics. My intent however is to reverse that process: start with almost nothing, and add complexity only when its necessity becomes unavoidable.

Read disclaimer if you belive in the AI God

1. Introduction

At its core, an agentic system is a program that lives in an environment in our case it is the LLM. It must act, observe, and decide what to do next. In this tutorial we deliberately postpone the last part to the very end and in doing so, we forces ourselves to confront a simple question: what does “acting” and “observing” mean? The answer is surprisingly minimal. Action becomes sending text. Observation becomes receiving text. Memory, goals, plans, validation and everything else is an interpretation layered on top of this.

This framing also exposes an important methodological stance: agentic behavior should be constructed, not declared. Rather than assuming the existence of an “agent” we build up the conditions under which agency becomes possible. Each step is meant to be legible and inspectable. If a feature cannot be explained in terms of what it adds to the act–observe loop, it does not yet belong in the system.

Finally, we are treating the implementation as an ongoing experiment. The emphasis is not on arriving at a final architecture, but on focussing on the ability to reason about agentic systems as evolving programs. By the end, what matters is not the specific code written, but the mental model it enables: that agents are not magical entities, but disciplined arrangements of data flow, control flow, and feedback.

This experiment has two sides. First is important for you the reader. I wanted to implement a bare minimum of what is required to implement an engine that supports AI agents or Agentic systems. This is an attempt to strip away any fluff and only focus on the skeleton. I intend for this to be an incremental process, such that along the way we will be able to discern what is going on throughout the whole system. Second is important for me. I am writing is in literate programming style through and through. Though I have been using org-babel to exploit the fruits of literate programming for a long time, I have never written a entire tutorial that completely relied on literate programming so far, but I have always wanted to. So this is a very personal challenge, so lets see if I can achieve this. Lets take next section as a substrate to understand how this tutorial is supposed to work and understood.

2. Call upon the Godly being

The very simple system is one that just requests help from an all powerful “being” without any feedback loop. We just need one function that calls upon an LLM with a decent level of precision in its prayer.

Here we focus on the most crucial aspect of how a agent sends its prayer to an “intelligent” god almighty. Everything that follows chains, graphs, validation depends on this interface remaining simple, stable, and explicit. The function that “calls the god” is intentionally narrow: it takes messages and returns a message. No interpretation, no retries, no clever abstractions.

Note that this function is not an agent. It does not decide when to act, why to act, or how to react. It is merely a gateway.

We need to let the agents know which God you want their prayers’ be posted to. We are going to use OpenRouter since they offer many LLMs for us to work with. If you run something locally, you can point the URL to localhost, make sure you change URL to correct PORT.

These variables are pulled from the environment which is a reasonable practice. This matters because agents often need to operate across environments, models, or deployment contexts. Environment here means the OS, or shell which supplies some configuration variables on top which the program runs not the environment that agent talks to.

OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")
DEFAULT_BASE_URL   = os.getenv("DEFAULT_BASE_URL",   "http://localhost:12345/v1")
DEFAULT_MODEL      = os.getenv("OPENROUTER_MODEL",   "gpt-4o")

This

def call_llm(messages):
    url = f"{DEFAULT_BASE_URL}/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization" : f"Bearer {OPENROUTER_API_KEY}"
    }

    payload = {
        "model": DEFAULT_MODEL,
        "messages": messages
    }

    resp = requests.post(
        url,
        headers=headers,
        data=json.dumps(payload),
    )

    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]

We can invoke this function as follows. Messages is the main channel of our communication with our god. Each message has two parts, role and content. We need to pass at least two messages for the conversation to be useful. One with system role and another with user role.

output = call_llm([
    {"role": "system", "content": "You're a benevolent god"},
    {"role": "user", "content": "Bless me you all powerful creature?"}
    ])
pprint.pprint(output)

There are lots of information available for us to use from our benevolent God, of which we are interested in choices specifically message piece of it.

2.1. Some improvements

First things first, lets function-ify these, because it just a better way to write code.

def _base_url():   return DEFAULT_BASE_URL
def _model_name(): return DEFAULT_MODEL
def _api_key():    return OPENROUTER_API_KEY

Our prayer to god consists of four important parts. Lets expand our function’s arguments list so that we can change the

model - in case our agents want to talk to more than one model
base_url - in case our agents would like avail other services
response_format - this will come in handy later
timeout - always have a timeout when you’re working with unreliable communication channel like when talking to god.

Here is the structure of the function, and we will inspect the parts in consecutive code listings one by one. The labels inside the angle brackets are references to code blocks that will their place here. i.e these are not some comments with <<some label>> instead of # that look weird, but is executable code. Here is the function in its full glory, comprised of the code blocks that constructs the url, prepare API headers, compose the payload, and send and receive the prayers and reply from god

def call_llm(messages, model=None, base_url=None, response_format=None, timeout=60):
    <<url-construction>>
    <<api-headers>>
    <<payload>>
    <<request-response>>

URL

url = f"{(base_url or _base_url()).rstrip('/')}/chat/completions"

API headers with authentication key.

headers = {"Content-Type": "application/json", "Accept": "application/json"}
if _api_key():
    headers["Authorization"] = f"Bearer {_api_key()}"

Package the prayer into a single payload

payload = {"model": model or _model_name(), "messages": messages}
if response_format is not None:
    payload["response_format"] = response_format

We pray to the god and hope he/she gets back to us. Note that we only pick one message and its content as the word from god

resp = requests.post(url, headers=headers, data=json.dumps(payload), timeout=timeout)
resp.raise_for_status()
return resp.json()["choices"][0]["message"]

The function in its full form.

def call_llm(messages, model=None, base_url=None, response_format=None, timeout=60):
    url = f"{(base_url or _base_url()).rstrip('/')}/chat/completions"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if _api_key():
        headers["Authorization"] = f"Bearer {_api_key()}"
    payload = {"model": model or _model_name(), "messages": messages}
    if response_format is not None:
        payload["response_format"] = response_format
    resp = requests.post(url, headers=headers, data=json.dumps(payload), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]

Lets put together the whole program into a python file and run it.

import os, json, requests, argparse, sys, pprint

"""
This file is auto-generated from KURI.org via org-babel tangle.
"""
import os, json, requests, argparse, sys, pprint

OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")
DEFAULT_BASE_URL   = os.getenv("DEFAULT_BASE_URL",   "http://localhost:12345/v1")
DEFAULT_MODEL      = os.getenv("OPENROUTER_MODEL",   "gpt-4o")

def _base_url():   return DEFAULT_BASE_URL
def _model_name(): return DEFAULT_MODEL
def _api_key():    return OPENROUTER_API_KEY
def call_llm(messages, model=None, base_url=None, response_format=None, timeout=60):
    url = f"{(base_url or _base_url()).rstrip('/')}/chat/completions"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if _api_key():
        headers["Authorization"] = f"Bearer {_api_key()}"
    payload = {"model": model or _model_name(), "messages": messages}
    if response_format is not None:
        payload["response_format"] = response_format
    resp = requests.post(url, headers=headers, data=json.dumps(payload), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]

output = call_llm([
    {"role": "system", "content": "You're a benevolent god"},
    {"role": "user", "content": "Bless me you all powerful creature?"}
    ])
pprint.pprint(output)

You can now run the whole file as follows

python 01-llm-interface.py

3. Structured Output

Once a system can reliably exchange text with an LLM, our next challenge is structure. Free-form language is expressive but brittle because it is difficult to compose, validate, or reuse. Structured output addresses this by turning language from a conversational medium into a data-producing interface.

The key idea in this section is that structure is a contract. When we ask the model to produce JSON matching a schema, we are no longer merely requesting information, we are also have an agreement about form, be is CSV, YAML, or JSON and we can leverage Python’s libraries to post process the information in form that are familiar to both us and the computer. So the output can checked, parsed, and potentially rejected.

Introducing safe parsing reinforces an important agentic principle: the environment is unreliable. Models may hallucinate, truncate output, or subtly violate constraints. An agent that assumes correctness will eventually fail. By treating parsing errors immediately as they arrive and by reacting to it the system becomes robust by design rather than by accident. We will use of Pydantic schemas to make this easier a little bit later but for now we keep it simple.

def _safe_json_parse(text):
     try:
         return json.loads(text)
     except Exception:
         return None

This is another bedrock for the system which translated replies from LLM for consumption by typical python programs. We recognize system and user prompts as two distinct categories which have different purposes. This function at this point is brittle but concise enough to only focus on what is absolute minimum to structure the reply from the LLMs. It prepares the message and parses the reply into json.

def generate_structured(system_prompt,
                         user_prompt,
                         schema_str,
                         model=DEFAULT_MODEL,
                         base_url=None):
    <<prepare-message>>
    <<request-and-parse-response>>
    return parsed

System prompt and History

sys_msg = {
    "role": "system",
    "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
}
history = [sys_msg, {"role": "user", "content": user_prompt}]

Parse json

resp = call_llm(history, model=model, base_url=base_url,
                response_format={"type": "json_object"})
parsed = _safe_json_parse(resp["content"])

def generate_structured(system_prompt,
                         user_prompt,
                         schema_str,
                         model=DEFAULT_MODEL,
                         base_url=None):
    sys_msg = {
        "role": "system",
        "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
    }
    history = [sys_msg, {"role": "user", "content": user_prompt}]
    resp = call_llm(history, model=model, base_url=base_url,
                    response_format={"type": "json_object"})
    parsed = _safe_json_parse(resp["content"])
    return parsed

Schemas serve as specifications: they describe what valid structure looks like, they can be used to generate prompts and validate results. This creates a shared ontology between the us the humans, the agent, and the LLM. Crucially, the schema exists outside the model, it is the agent’s understanding of the world, not the model’s.

{'ingredients': [{'item': '<item>', 'quantity': '<quantity>'}]}

output = generate_structured(
    system_prompt = "You are a master chef. Find the ingredients for the following dish.",
    user_prompt = "Kozhi Kulambu",
    schema_str = {'ingredients': [{'item': '<item>', 'quantity': '<quantity>'}]}
)
pprint.pprint(output)

{'ingredients': [{'item': 'chicken', 'quantity': '500 grams'},
                 {'item': 'onion', 'quantity': '2 large, finely chopped'},
                 {'item': 'tomato', 'quantity': '2 medium, pureed'},
                 {'item': 'ginger-garlic paste', 'quantity': '2 tablespoons'},
                 {'item': 'turmeric powder', 'quantity': '1/2 teaspoon'},
                 {'item': 'red chili powder', 'quantity': '1 tablespoon'},
                 {'item': 'coriander powder', 'quantity': '2 tablespoons'},
                 {'item': 'cumin seeds', 'quantity': '1 teaspoon'},
                 {'item': 'curry leaves', 'quantity': 'a few sprigs'},
                 {'item': 'green chilies', 'quantity': '2, slit'},
                 {'item': 'coconut milk', 'quantity': '200 ml'},
                 {'item': 'salt', 'quantity': 'to taste'},
                 {'item': 'oil', 'quantity': '3 tablespoons'},
                 {'item': 'water', 'quantity': 'as needed'}]}

3.1. Schema makes it more robust

So far we used hand-crafted schema string which can be replaced with pydantic to help with schema generation and validating response. But we can use pydantic to describe the schema, because it provides functions to validate json with respect to the schema described in pydantic DSL. But first we need to install and import two libraries pydantic and typing.

from typing import List
from pydantic import BaseModel, ValidationError

This schema dictates that the output from the LLM, must be list of ingredients and each ingredient must include the item and quantity for the requested recipie.

class Ingredient(BaseModel):
    item: str
    quantity: str

class Ingredients(BaseModel):
    ingredients: List[Ingredient]

We reuse the previous definition with one difference. We use schema defined with pydantic.

def generate_structured(system_prompt,
                        user_prompt,
                        schema_model,
                        model=DEFAULT_MODEL,
                        base_url=None):
   schema_str = json.dumps(schema_model.model_json_schema(), ensure_ascii=False)
   sys_msg = {
       "role": "system",
       "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
   }
   history = [sys_msg, {"role": "user", "content": user_prompt}]
   resp = call_llm(history, model=model, base_url=base_url,
                   response_format={"type": "json_object"})
   parsed = _safe_json_parse(resp["content"])
   return parsed

schema_str = json.dumps(schema_model.model_json_schema(), ensure_ascii=False)

output = generate_structured(
    system_prompt = "You are a master chef. Find the ingredients for the following dish.",
    user_prompt = "Kozhi Kulambu",
    schema_model = Ingredients,
)
pprint.pprint(output)

3.2. Validation and retries

Pydantic offers mechanisms to validate the parsed json against our schema. The system observes an error, communicates that error back to the model and requests correction. This is not yet planning or reflection, but it is already adaptive behavior. The agent is no longer passively receiving output; it can now enfore constraints and nudge interaction to a more desired direction. Validation and retries introduce the first real feedback loop. This pattern of request-validate-correct will recur throughout more advanced agentic designs.

if parsed is not None:
    try:
        obj = schema_model.model_validate(parsed)
        return obj.model_dump()
    except ValidationError as ve:
        error = json.dumps(ve.errors(), ensure_ascii=False)
else:
    error = "Invalid JSON syntax"

def generate_structured(system_prompt,
                        user_prompt,
                        schema_model,
                        model=DEFAULT_MODEL,
                        max_retries=3,
                        base_url=None):
   schema_str = json.dumps(schema_model.model_json_schema(), ensure_ascii=False)
   sys_msg = {
       "role": "system",
       "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
   }
   history = [sys_msg, {"role": "user", "content": user_prompt}]
   for attempt in range(max_retries + 1):
      resp = call_llm(history, model=model, base_url=base_url,
                      response_format={"type": "json_object"})
      parsed = _safe_json_parse(resp["content"])
      if parsed is not None:
          try:
              obj = schema_model.model_validate(parsed)
              return obj.model_dump()
          except ValidationError as ve:
              error = json.dumps(ve.errors(), ensure_ascii=False)
      else:
          error = "Invalid JSON syntax"

      if attempt >= max_retries:
         raise RuntimeError(f"Validation failed: {error}")
      history.append(resp)
      history.append({
          "role": "user",
          "content": f"Validation Error: {error}. Fix it."
      })

output = generate_structured(
    system_prompt = "You are a master chef. Find the ingredients for the following dish.",
    user_prompt = "Chicken Kulambu",
    schema_model = Ingredients,
)
pprint.pprint(output)

3.3. Final form

We are done with tedium and will move onto to more interesting things now. But here is the full python script which can be run standalone.

"""
This file is auto-generated from KURI.org via org-babel tangle.
"""
<<imports>>
<<imports-pydantic>>
<<llm-config>>

<<env>>
<<call-llm>>
<<safe-json-parse>>
<<generate-structured-pydantic-validated>>
<<schema-chef-example>>
<<generate-structured-pydantic-validated-test>>

"""
This file is auto-generated from KURI.org via org-babel tangle.
"""
import os, json, requests, argparse, sys, pprint
from typing import List
from pydantic import BaseModel, ValidationError
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")
DEFAULT_BASE_URL   = os.getenv("DEFAULT_BASE_URL",   "http://localhost:12345/v1")
DEFAULT_MODEL      = os.getenv("OPENROUTER_MODEL",   "gpt-4o")

def _base_url():   return DEFAULT_BASE_URL
def _model_name(): return DEFAULT_MODEL
def _api_key():    return OPENROUTER_API_KEY
def call_llm(messages, model=None, base_url=None, response_format=None, timeout=60):
    url = f"{(base_url or _base_url()).rstrip('/')}/chat/completions"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if _api_key():
        headers["Authorization"] = f"Bearer {_api_key()}"
    payload = {"model": model or _model_name(), "messages": messages}
    if response_format is not None:
        payload["response_format"] = response_format
    resp = requests.post(url, headers=headers, data=json.dumps(payload), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]
def _safe_json_parse(text):
     try:
         return json.loads(text)
     except Exception:
         return None

def generate_structured(system_prompt,
                        user_prompt,
                        schema_model,
                        model=DEFAULT_MODEL,
                        max_retries=3,
                        base_url=None):
   schema_str = json.dumps(schema_model.model_json_schema(), ensure_ascii=False)
   sys_msg = {
       "role": "system",
       "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
   }
   history = [sys_msg, {"role": "user", "content": user_prompt}]
   for attempt in range(max_retries + 1):
      resp = call_llm(history, model=model, base_url=base_url,
                      response_format={"type": "json_object"})
      parsed = _safe_json_parse(resp["content"])
      if parsed is not None:
          try:
              obj = schema_model.model_validate(parsed)
              return obj.model_dump()
          except ValidationError as ve:
              error = json.dumps(ve.errors(), ensure_ascii=False)
      else:
          error = "Invalid JSON syntax"

      if attempt >= max_retries:
         raise RuntimeError(f"Validation failed: {error}")
      history.append(resp)
      history.append({
          "role": "user",
          "content": f"Validation Error: {error}. Fix it."
      })

class Ingredient(BaseModel):
    item: str
    quantity: str

class Ingredients(BaseModel):
    ingredients: List[Ingredient]
output = generate_structured(
    system_prompt = "You are a master chef. Find the ingredients for the following dish.",
    user_prompt = "Chicken Kulambu",
    schema_model = Ingredients,
)
pprint.pprint(output)

4. Linear chain

Linear chains introduce the idea that complex behavior can be decomposed into a sequence of simple, well-scoped steps. Each step performs a single transformation: outline generation, drafting, metadata creation. Individually, these steps are unremarkable. Collectively, they produce behavior that appears intentional.

The deeper principle here is separation of concerns over time. Instead of asking a model to do everything at once, we ask it to do one thing at a time, with each step producing structured output for the next. This mirrors human workflows and reduces the amount of context to be kept in memory for both the model and the system designer. This also provides us with more surface point with which we can interact with to push the agent in desired direction.

4.1. Blog Generator

We are going to make use of our engine to build an agent that write blog post in three steps. (i) Come up with an outline. (ii) Write a draft post. (iii) create promotional material.

def _outline(topic):
    return generate_structured(
        "You are a blog planning assistant.",
        f"Create an outline with 4-6 section titles for a blog post about: {topic}",
        Outline,
    )

def _draft(outline):
    return generate_structured(
        "You are a blog drafting assistant.",
        "Write a short paragraph for each outlined section:\n" + json.dumps(outline, ensure_ascii=False),
        Draft,
    )

def _metadata(draft):
    return generate_structured(
        "You are an editorial assistant.",
        "Produce headline, slug, and 5 SEO tags for this draft:\n" + json.dumps(draft, ensure_ascii=False),
        Metadata,
    )

class Outline(BaseModel):
    sections: List[str]

class DraftSection(BaseModel):
    title: str
    text: str

class Draft(BaseModel):
    sections: List[DraftSection]

class Metadata(BaseModel):
    headline: str
    slug: str
    tags: List[str]

def blog_generator(topic):
    topic = {"topic": topic}
    outline = _outline(topic)
    draft = _draft(outline)
    metadata =_metadata(draft)
    return topic, draft, metadata

topic = 'Write a blog post about marathon'
outline, draft, metadata = blog_generator(topic)
pprint.pprint(outline)
pprint.pprint(draft)
pprint.pprint(metadata)

from pydantic import Field

"""
This file is auto-generated from KURI.org via org-babel tangle.
"""
import os, json, requests, argparse, sys, pprint
from typing import List
from pydantic import BaseModel, ValidationError

OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")
DEFAULT_BASE_URL   = os.getenv("DEFAULT_BASE_URL",   "http://localhost:12345/v1")
DEFAULT_MODEL      = os.getenv("OPENROUTER_MODEL",   "gpt-4o")

def _base_url():   return DEFAULT_BASE_URL
def _model_name(): return DEFAULT_MODEL
def _api_key():    return OPENROUTER_API_KEY
def call_llm(messages, model=None, base_url=None, response_format=None, timeout=60):
    url = f"{(base_url or _base_url()).rstrip('/')}/chat/completions"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if _api_key():
        headers["Authorization"] = f"Bearer {_api_key()}"
    payload = {"model": model or _model_name(), "messages": messages}
    if response_format is not None:
        payload["response_format"] = response_format
    resp = requests.post(url, headers=headers, data=json.dumps(payload), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]
def _safe_json_parse(text):
     try:
         return json.loads(text)
     except Exception:
         return None

def generate_structured(system_prompt,
                        user_prompt,
                        schema_model,
                        model=DEFAULT_MODEL,
                        max_retries=3,
                        base_url=None):
   schema_str = json.dumps(schema_model.model_json_schema(), ensure_ascii=False)
   sys_msg = {
       "role": "system",
       "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
   }
   history = [sys_msg, {"role": "user", "content": user_prompt}]
   for attempt in range(max_retries + 1):
      resp = call_llm(history, model=model, base_url=base_url,
                      response_format={"type": "json_object"})
      parsed = _safe_json_parse(resp["content"])
      if parsed is not None:
          try:
              obj = schema_model.model_validate(parsed)
              return obj.model_dump()
          except ValidationError as ve:
              error = json.dumps(ve.errors(), ensure_ascii=False)
      else:
          error = "Invalid JSON syntax"

      if attempt >= max_retries:
         raise RuntimeError(f"Validation failed: {error}")
      history.append(resp)
      history.append({
          "role": "user",
          "content": f"Validation Error: {error}. Fix it."
      })

class Outline(BaseModel):
    sections: List[str]

class DraftSection(BaseModel):
    title: str
    text: str

class Draft(BaseModel):
    sections: List[DraftSection]

class Metadata(BaseModel):
    headline: str
    slug: str
    tags: List[str]
def _outline(topic):
    return generate_structured(
        "You are a blog planning assistant.",
        f"Create an outline with 4-6 section titles for a blog post about: {topic}",
        Outline,
    )

def _draft(outline):
    return generate_structured(
        "You are a blog drafting assistant.",
        "Write a short paragraph for each outlined section:\n" + json.dumps(outline, ensure_ascii=False),
        Draft,
    )

def _metadata(draft):
    return generate_structured(
        "You are an editorial assistant.",
        "Produce headline, slug, and 5 SEO tags for this draft:\n" + json.dumps(draft, ensure_ascii=False),
        Metadata,
    )
def blog_generator(topic):
    topic = {"topic": topic}
    outline = _outline(topic)
    draft = _draft(outline)
    metadata =_metadata(draft)
    return topic, draft, metadata
topic = 'Write a blog post about marathon'
outline, draft, metadata = blog_generator(topic)
pprint.pprint(outline)
pprint.pprint(draft)
pprint.pprint(metadata)

4.2. Lesson Plan Generator

We are going to make use of our engine to build an agent that write blog post in three steps. (i) Come up with an outline. (ii) Write a lesson plan by exapanding upon the outline. (iii) create assessments aligned with lesson plan.

def lesson_outline(topic, level):
    return generate_structured(
        "You are a helpful teacher planning a lesson sequence.",
        f"Create 3–5 lessons (title and objective) for topic: {topic}. Audience level: {level}.",
        LessonOutline,
    )

def lesson_plan(outline):
    return generate_structured(
        "You are a teacher designing classroom activities.",
        "For each lesson, add 2–3 activities with duration (minutes) and materials:\n" + json.dumps(outline, ensure_ascii=False),
        LessonPlan,
    )

def lesson_meta(plan):
    return generate_structured(
        "You are an instructional designer.",
        "Given this plan, propose 2–3 assessments (type + prompt) and a short resource list:\n" + json.dumps(plan, ensure_ascii=False),
        LessonMeta,
    )

class LessonOutline(BaseModel):
    lessons: List[dict]

class Activity(BaseModel):
    description: str
    duration_minutes: int = Field(ge=5, le=180)
    materials: List[str]

class LessonWithActivities(BaseModel):
    title: str
    activities: List[Activity]

class LessonPlan(BaseModel):
    lessons: List[LessonWithActivities]

class Assessment(BaseModel):
    type: str
    prompt: str

class LessonMeta(BaseModel):
    assessments: List[Assessment]
    resources: List[str]

def lesson_generator(topic, level = "beginner"):
    outline = lesson_outline(topic, level)
    plan = lesson_plan(outline)
    meta = lesson_meta(plan)
    return outline, plan, meta

topic = 'Write a lesson plan on python functions'
outline, plan, meta = lesson_generator(topic, 'advanced')
pprint.pprint(outline)
pprint.pprint(plan)
pprint.pprint(meta)

"""
This file is auto-generated from KURI.org via org-babel tangle.
"""
import os, json, requests, argparse, sys, pprint
from typing import List
from pydantic import BaseModel, ValidationError
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")
DEFAULT_BASE_URL   = os.getenv("DEFAULT_BASE_URL",   "http://localhost:12345/v1")
DEFAULT_MODEL      = os.getenv("OPENROUTER_MODEL",   "gpt-4o")

def _base_url():   return DEFAULT_BASE_URL
def _model_name(): return DEFAULT_MODEL
def _api_key():    return OPENROUTER_API_KEY
def call_llm(messages, model=None, base_url=None, response_format=None, timeout=60):
    url = f"{(base_url or _base_url()).rstrip('/')}/chat/completions"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if _api_key():
        headers["Authorization"] = f"Bearer {_api_key()}"
    payload = {"model": model or _model_name(), "messages": messages}
    if response_format is not None:
        payload["response_format"] = response_format
    resp = requests.post(url, headers=headers, data=json.dumps(payload), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]
def _safe_json_parse(text):
     try:
         return json.loads(text)
     except Exception:
         return None

def generate_structured(system_prompt,
                        user_prompt,
                        schema_model,
                        model=DEFAULT_MODEL,
                        max_retries=3,
                        base_url=None):
   schema_str = json.dumps(schema_model.model_json_schema(), ensure_ascii=False)
   sys_msg = {
       "role": "system",
       "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
   }
   history = [sys_msg, {"role": "user", "content": user_prompt}]
   for attempt in range(max_retries + 1):
      resp = call_llm(history, model=model, base_url=base_url,
                      response_format={"type": "json_object"})
      parsed = _safe_json_parse(resp["content"])
      if parsed is not None:
          try:
              obj = schema_model.model_validate(parsed)
              return obj.model_dump()
          except ValidationError as ve:
              error = json.dumps(ve.errors(), ensure_ascii=False)
      else:
          error = "Invalid JSON syntax"

      if attempt >= max_retries:
         raise RuntimeError(f"Validation failed: {error}")
      history.append(resp)
      history.append({
          "role": "user",
          "content": f"Validation Error: {error}. Fix it."
      })



class LessonOutline(BaseModel):
    lessons: List[dict]

class Activity(BaseModel):
    description: str
    duration_minutes: int = Field(ge=5, le=180)
    materials: List[str]

class LessonWithActivities(BaseModel):
    title: str
    activities: List[Activity]

class LessonPlan(BaseModel):
    lessons: List[LessonWithActivities]

class Assessment(BaseModel):
    type: str
    prompt: str

class LessonMeta(BaseModel):
    assessments: List[Assessment]
    resources: List[str]


def lesson_outline(topic, level):
    return generate_structured(
        "You are a helpful teacher planning a lesson sequence.",
        f"Create 3–5 lessons (title and objective) for topic: {topic}. Audience level: {level}.",
        LessonOutline,
    )

def lesson_plan(outline):
    return generate_structured(
        "You are a teacher designing classroom activities.",
        "For each lesson, add 2–3 activities with duration (minutes) and materials:\n" + json.dumps(outline, ensure_ascii=False),
        LessonPlan,
    )

def lesson_meta(plan):
    return generate_structured(
        "You are an instructional designer.",
        "Given this plan, propose 2–3 assessments (type + prompt) and a short resource list:\n" + json.dumps(plan, ensure_ascii=False),
        LessonMeta,
    )

def lesson_generator(topic, level = "beginner"):
    outline = lesson_outline(topic, level)
    plan = lesson_plan(outline)
    meta = lesson_meta(plan)
    return outline, plan, meta
topic = 'Write a lesson plan on python functions'
outline, plan, meta = lesson_generator(topic, 'advanced')
pprint.pprint(outline)
pprint.pprint(plan)
pprint.pprint(meta)

4.3. Compare and Clarify: `run-chain`

If we study these two functions a bit closely we can build better abstraction that will come in handy in thinking about more complex flows.

def blog_generator(topic):
    outline = _outline(topic)
    draft = _draft(outline)
    metadata =_metadata(draft)
    return outline, draft, metadata

def lesson_generator(topic, level = "beginner"):
    outline = lesson_outline(topic, level)
    plan = lesson_plan(outline)
    meta = lesson_meta(plan)
    return outline, plan, meta

They both have the same structure, but different data. If we can separate the function from the data, we can reuse the same function and even extend it to support more complex flows.

blog_chain = [_outline, _draft, _metadata]
lesson_chain = [lesson_outline, lesson_draft, lesson_metadata]

def run_chain(chain, initial_input):
    outputs = []
    output = initial_input
    for link in chain:
        output = link(output)
        outputs.append(output)
    return outputs

Abstracting the chain itself reveals another insight: the logic of execution can be separated from the logic of work. A generic run_chain function knows nothing about blogging or lesson planning. It only knows how to pass outputs forward. This separation enables reuse and experimentation. Chains can be reordered, extended, or swapped without rewriting execution logic.

4.4. Can this be improved further?

The run_chain function loops through each link in the chain be it for blogging or lesson plan generation and pipes the output from previous link to next one. But each link only can see the output of its previous link but do not have the whole history. Note that we do not need to pass on the whole history to the LLM for every case, since this is just a plain python function we can make decisions of how to make use of the history. So lets make the history available to every link in the chain.

lesson_chain = {
    'outline' : lesson_outline,
    'plan': lesson_plan,
    'metadata' : lesson_meta
}

def run_chain(chain, initial):
   state = {'initial': initial}
   for name, link in chain.items():
      output = link(state)
      state.update({name: output})
   return state

def lesson_outline(state):
    return generate_structured(
        "You are a helpful teacher planning a lesson sequence.",
        f"Create 3–5 lessons (title and objective)" +
        " for topic: {state['initial']['topic']}." +
        "Audience level: state['initial']['level].",
        LessonOutline,
    )

def lesson_plan(state):
    return generate_structured(
        "You are a teacher designing classroom activities.",
        "For each lesson, add 2–3 activities with duration (minutes) and materials:\n" + json.dumps(state['outline'], ensure_ascii=False),
        LessonPlan,
    )

def lesson_meta(state):
    return generate_structured(
        "You are an instructional designer.",
        "Given this plan, propose 2–3 assessments (type + prompt) and a short resource list:\n" + json.dumps(state['plan'], ensure_ascii=False),
        LessonMeta,
    )

The shared state is conceptual important. If each step can access the accumulated history, the system gains context-awareness without forcing that context into every model call. State exists in the program, not in the prompt. As mentioned earlier, the agent decides what context matters, rather than blindly passing everything forward.

At this point, we can system begins to take shape and resemble an agent more clearly. It (i) has a persistent internal representation of progress, (ii) executes a sequence of decisions, and (iii) produces intermediate artifacts. Yet it remains deterministic and inspectable. There is no hidden planning or autonomous branching but only explicit structure. This restraint is let us make the transition to graphs and conditional flow intelligible rather than chaotic.

"""
This file is auto-generated from KURI.org via org-babel tangle.
"""
import os, json, requests, argparse, sys, pprint
from typing import List
from pydantic import BaseModel, ValidationError
from pydantic import Field
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")
DEFAULT_BASE_URL   = os.getenv("DEFAULT_BASE_URL",   "http://localhost:12345/v1")
DEFAULT_MODEL      = os.getenv("OPENROUTER_MODEL",   "gpt-4o")

def _base_url():   return DEFAULT_BASE_URL
def _model_name(): return DEFAULT_MODEL
def _api_key():    return OPENROUTER_API_KEY
def call_llm(messages, model=None, base_url=None, response_format=None, timeout=60):
    url = f"{(base_url or _base_url()).rstrip('/')}/chat/completions"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if _api_key():
        headers["Authorization"] = f"Bearer {_api_key()}"
    payload = {"model": model or _model_name(), "messages": messages}
    if response_format is not None:
        payload["response_format"] = response_format
    resp = requests.post(url, headers=headers, data=json.dumps(payload), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]
def _safe_json_parse(text):
     try:
         return json.loads(text)
     except Exception:
         return None

def generate_structured(system_prompt,
                        user_prompt,
                        schema_model,
                        model=DEFAULT_MODEL,
                        max_retries=3,
                        base_url=None):
   schema_str = json.dumps(schema_model.model_json_schema(), ensure_ascii=False)
   sys_msg = {
       "role": "system",
       "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
   }
   history = [sys_msg, {"role": "user", "content": user_prompt}]
   for attempt in range(max_retries + 1):
      resp = call_llm(history, model=model, base_url=base_url,
                      response_format={"type": "json_object"})
      parsed = _safe_json_parse(resp["content"])
      if parsed is not None:
          try:
              obj = schema_model.model_validate(parsed)
              return obj.model_dump()
          except ValidationError as ve:
              error = json.dumps(ve.errors(), ensure_ascii=False)
      else:
          error = "Invalid JSON syntax"

      if attempt >= max_retries:
         raise RuntimeError(f"Validation failed: {error}")
      history.append(resp)
      history.append({
          "role": "user",
          "content": f"Validation Error: {error}. Fix it."
      })



class LessonOutline(BaseModel):
    lessons: List[dict]

class Activity(BaseModel):
    description: str
    duration_minutes: int = Field(ge=5, le=180)
    materials: List[str]

class LessonWithActivities(BaseModel):
    title: str
    activities: List[Activity]

class LessonPlan(BaseModel):
    lessons: List[LessonWithActivities]

class Assessment(BaseModel):
    type: str
    prompt: str

class LessonMeta(BaseModel):
    assessments: List[Assessment]
    resources: List[str]


def lesson_outline(state):
    return generate_structured(
        "You are a helpful teacher planning a lesson sequence.",
        f"Create 3–5 lessons (title and objective)" +
        " for topic: {state['initial']['topic']}." +
        "Audience level: state['initial']['level].",
        LessonOutline,
    )

def lesson_plan(state):
    return generate_structured(
        "You are a teacher designing classroom activities.",
        "For each lesson, add 2–3 activities with duration (minutes) and materials:\n" + json.dumps(state['outline'], ensure_ascii=False),
        LessonPlan,
    )

def lesson_meta(state):
    return generate_structured(
        "You are an instructional designer.",
        "Given this plan, propose 2–3 assessments (type + prompt) and a short resource list:\n" + json.dumps(state['plan'], ensure_ascii=False),
        LessonMeta,
    )

lesson_chain = {
    'outline' : lesson_outline,
    'plan': lesson_plan,
    'metadata' : lesson_meta
}
def run_chain(chain, initial):
   state = {'initial': initial}
   for name, link in chain.items():
      output = link(state)
      state.update({name: output})
   return state

output = run_chain(lesson_chain, 'Write a lesson plan on python functions')
pprint.pprint(output)

5. Chains to Graphs

We can generalize the chains to graphs by making control flow explicit and conditional. Structuring the flow in a graph supports that decisions depend on state which is more flexible that chain where decisions are linear. This is where agentic systems begin to resemble “reasoning” processes rather than pipelines.

In our simple graph model nodes represent roles or specific functions: reader, evaluator, decider and edges represent transitions based on state.

class Graph:
    def __init__(self, nodes, edges):
        self.nodes = nodes
        self.edges = edges

review_graph = Graph(
    nodes={
        "reader": read_abstract,
        "evaluator": evaluate_abstract,
        "decider": make_decision,
    },
    edges={
        "reader": lambda _s: "evaluator",
        "evaluator": lambda _s: "decider",
        "decider": lambda _s: "FINISH",
    },
)

Importantly, edges are functions, not constants. They can encode logic, given what we now know, where should we go next? Defining edges in terms of functions instead of contansts, the agent can exercise very dynamic workflows.

def read_abstract(state):
    prompt = f"Here is an abstract:\n{state['abstract']}\nExtract its field and claims."
    return {
        **state,
        **generate_structured(
            "You are a peer reviewer reading an abstract.",
            prompt,
            AbstractInfo,
        )
    }

def evaluate_abstract(state):
    prompt = f"Evaluate this abstract for clarity, originality, and reproducibility:\n{state['abstract']}"
    return {
        **state,
        **generate_structured(
            "You are an expert reviewer.",
            prompt,
            Evaluation,
        )
    }

def make_decision(state):
    prompt = (
        f"Based on these scores:\n"
        f"Clarity: {state['clarity']}, Originality: {state['originality']}, Reproducibility: {state['reproducibility']}\n"
        "Give a final Accept/Minor/Major/Reject decision and justify it."
    )
    return {
        **state,
        **generate_structured(
            "You are the program chair.",
            prompt,
            Recommendation,
        )
    }

The execution loop for the graph is intentionally simple as I promised in the very beginnning. Since we are not aiming for performance, there is no scheduler, no concurrency, no optimization. The agent continues to run until FINSIH state is reached. This makes the mechanics of decision-making visible. At each step, the system:

Executes a role-specific function
Updates shared state
Chooses the next node based on that state

def run_graph(graph, start_node, initial_state):
    print("Running graph...")
    state = dict(initial_state)
    current = start_node

    while True:
        if current == "FINISH":
            print("Finished graph.")
            return state

        if current not in graph.nodes:
            raise KeyError(f"Unknown node {current}")

        print(f'running {current}')
        state = graph.nodes[current](state)

        if current not in graph.edges:
            raise KeyError(f"No edge for node {current}")

        current = graph.edges[current](state)

Schemas continue to play a central role. Each node produces structured additions to the state, ensuring that downstream decisions operate on validated data rather than free-form text. This maintains coherence even as control flow becomes more dynamic.

class AbstractInfo(BaseModel):
    title: str
    field: str
    main_claims: List[str]

class Evaluation(BaseModel):
    clarity: int = Field(ge=1, le=5)
    originality: int = Field(ge=1, le=5)
    reproducibility: int = Field(ge=1, le=5)
    notes: Optional[str] = None

class Recommendation(BaseModel):
    decision: str = Field(pattern="^(Accept|Minor|Major|Reject)$")
    justification: str

What emerges here is a minimal but expressive model of agency. The system can branch, terminate, and justify decisions based on accumulated evidence. Yet it remains fully inspectable. There is no “hidden intelligence” beyond the language model calls and the explicit structure imposed by the program.

from typing import Any, Callable, Dict, List, Optional

"""
This file is auto-generated from KURI.org via org-babel tangle.
"""
import os, json, requests, argparse, sys, pprint
from typing import List
from pydantic import BaseModel, ValidationError
from pydantic import Field
from typing import Any, Callable, Dict, List, Optional
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")
DEFAULT_BASE_URL   = os.getenv("DEFAULT_BASE_URL",   "http://localhost:12345/v1")
DEFAULT_MODEL      = os.getenv("OPENROUTER_MODEL",   "gpt-4o")

def _base_url():   return DEFAULT_BASE_URL
def _model_name(): return DEFAULT_MODEL
def _api_key():    return OPENROUTER_API_KEY
def call_llm(messages, model=None, base_url=None, response_format=None, timeout=60):
    url = f"{(base_url or _base_url()).rstrip('/')}/chat/completions"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if _api_key():
        headers["Authorization"] = f"Bearer {_api_key()}"
    payload = {"model": model or _model_name(), "messages": messages}
    if response_format is not None:
        payload["response_format"] = response_format
    resp = requests.post(url, headers=headers, data=json.dumps(payload), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]
def _safe_json_parse(text):
     try:
         return json.loads(text)
     except Exception:
         return None

def generate_structured(system_prompt,
                        user_prompt,
                        schema_model,
                        model=DEFAULT_MODEL,
                        max_retries=3,
                        base_url=None):
   schema_str = json.dumps(schema_model.model_json_schema(), ensure_ascii=False)
   sys_msg = {
       "role": "system",
       "content": f"{system_prompt}\nOutput ONLY JSON matching this schema:\n{schema_str}"
   }
   history = [sys_msg, {"role": "user", "content": user_prompt}]
   for attempt in range(max_retries + 1):
      resp = call_llm(history, model=model, base_url=base_url,
                      response_format={"type": "json_object"})
      parsed = _safe_json_parse(resp["content"])
      if parsed is not None:
          try:
              obj = schema_model.model_validate(parsed)
              return obj.model_dump()
          except ValidationError as ve:
              error = json.dumps(ve.errors(), ensure_ascii=False)
      else:
          error = "Invalid JSON syntax"

      if attempt >= max_retries:
         raise RuntimeError(f"Validation failed: {error}")
      history.append(resp)
      history.append({
          "role": "user",
          "content": f"Validation Error: {error}. Fix it."
      })


def read_abstract(state):
    prompt = f"Here is an abstract:\n{state['abstract']}\nExtract its field and claims."
    return {
        **state,
        **generate_structured(
            "You are a peer reviewer reading an abstract.",
            prompt,
            AbstractInfo,
        )
    }

def evaluate_abstract(state):
    prompt = f"Evaluate this abstract for clarity, originality, and reproducibility:\n{state['abstract']}"
    return {
        **state,
        **generate_structured(
            "You are an expert reviewer.",
            prompt,
            Evaluation,
        )
    }

def make_decision(state):
    prompt = (
        f"Based on these scores:\n"
        f"Clarity: {state['clarity']}, Originality: {state['originality']}, Reproducibility: {state['reproducibility']}\n"
        "Give a final Accept/Minor/Major/Reject decision and justify it."
    )
    return {
        **state,
        **generate_structured(
            "You are the program chair.",
            prompt,
            Recommendation,
        )
    }


class AbstractInfo(BaseModel):
    title: str
    field: str
    main_claims: List[str]

class Evaluation(BaseModel):
    clarity: int = Field(ge=1, le=5)
    originality: int = Field(ge=1, le=5)
    reproducibility: int = Field(ge=1, le=5)
    notes: Optional[str] = None

class Recommendation(BaseModel):
    decision: str = Field(pattern="^(Accept|Minor|Major|Reject)$")
    justification: str

def run_graph(graph, start_node, initial_state):
    print("Running graph...")
    state = dict(initial_state)
    current = start_node

    while True:
        if current == "FINISH":
            print("Finished graph.")
            return state

        if current not in graph.nodes:
            raise KeyError(f"Unknown node {current}")

        print(f'running {current}')
        state = graph.nodes[current](state)

        if current not in graph.edges:
            raise KeyError(f"No edge for node {current}")

        current = graph.edges[current](state)
class Graph:
    def __init__(self, nodes, edges):
        self.nodes = nodes
        self.edges = edges

review_graph = Graph(
    nodes={
        "reader": read_abstract,
        "evaluator": evaluate_abstract,
        "decider": make_decision,
    },
    edges={
        "reader": lambda _s: "evaluator",
        "evaluator": lambda _s: "decider",
        "decider": lambda _s: "FINISH",
    },
)

abstract = (
    "MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering"
    " This paper introduces MedMCQA, a new large-scale, Multiple-Choice Question Answering (MCQA) dataset designed to address real-world medical entrance exam questions. More than 194k high-quality AIIMS & NEET PG entrance exam MCQs covering 2.4k healthcare topics and 21 medical subjects are collected with an average token length of 12.77 and high topical diversity. Each sample contains a question, correct answer(s), and other options which requires a deeper language understanding as it tests the 10+ reasoning abilities of a model across a wide range of medical subjects & topics. A detailed explanation of the solution, along with the above information, is provided in this study."
  )
final = run_graph(review_graph, "reader", {"abstract": abstract})
print(json.dumps(final, indent=2, ensure_ascii=False))

What I want you to takeaway is that agentic systems do not require mystery. They require boundaries, explicit representation for state, and composition. By building up from prayers to protocols, from chains to graphs, we arrive at agency as a natural consequence of structure and not by leap of magic.

6. Disclaimer: On Agency, Intelligence, and Metaphor

It is important to clarify the conceptual boundaries of this tutorial.

This work does not assume that large language models are minds, brains, or intelligent entities in any literal, philosophical, or scientific sense. Terms such as intelligence, reasoning, decision, or understanding are used here in a strictly instrumental and operational manner. They describe observable program behavior control flow, data transformation and conditional execution not cognition, intention, or consciousness.

In this tutorial, the language model is treated as part of the environment in which the program operates. It is a statistical text-generation system with useful properties, nothing more. Any appearance of reasoning, planning, or judgment arises entirely from human-authored structure: the prompts, schemas, validation rules, control flow, and execution logic written in code. The model supplies linguistic variability; the meaning, constraints, and direction of computation are imposed by us the programmers.

References to “gods,” “prayers,” or “almighty beings” are deliberately sarcastic. They are a light-hearted nod to contemporary tendencies to anthropomorphize language models or treat them as oracles. These metaphors are used to highlight that trend, not endorse it. They should not be read as implying wisdom, authority, autonomy, or agency on the part of the model.

Throughout this tutorial, any so-called “agentic” behavior emerges from explicit mechanisms: message passing, structured outputs, validation loops, state accumulation, and graph-based control flow. Nothing is implicit, mystical, or self-originating. Where the system appears to act or decide, it is merely executing deterministic or explicitly defined rules over structured data.

The central claim of this work is therefore a modest one: agentic systems are not intelligent beings, but carefully constructed programs. Their apparent sophistication is a consequence of composition and discipline, not magic. The only genuine intelligence involved designing abstractions, choosing constraints, and reasoning about behavior remains firmly human.