AI Agents 101: Build an AI Agent with Ollama (llama3.1)

Last updated: May 5, 2026

What is an AI agent?

I find the quote below self-explanatory. I don’t recall where I first saw it, but it sums things up very well.

LLMs answer questions, Agents perform tasks

So, how do they perform those tasks? An agent framework consists of the following components:

Agent = Goal + LLM + Tools + Loop + Guardrails

  • Goal: what the user wants.
  • LLM: chooses what to do next.
  • Tools: functions the agent can call (file reader, API/MCP, call another LLM, etc.).
  • Loop: think – act – observe – repeat (the real supporting engine).
  • Guardrails: limits and safety checks.

There are variations of these, but ultimately the core is the same.
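The formula above can be sketched as a single loop. This is an illustrative skeleton with stand-in names (run_agent, planner), not the post's exact code; in the real agent, the planner is an LLM call.

```python
# Minimal sketch of Goal + LLM + Tools + Loop + Guardrails.
def run_agent(goal, planner, tools, max_steps=5):
    observation = goal
    for _ in range(max_steps):                 # Loop, with a step-limit guardrail
        decision = planner(observation)        # LLM decides the next action
        if decision["action"] == "final":
            return decision["final_answer"]
        tool = tools[decision["action"]]       # Tools: only registered actions run
        observation = tool(decision["input"])  # observe the result, feed it back
    return "Step limit reached."

# A trivial planner that finalizes immediately, just to exercise the loop:
answer = run_agent(
    "reset my password",
    planner=lambda obs: {"action": "final", "final_answer": "Done: " + obs},
    tools={},
)
```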

The Starting Point

LLMs are phenomenal but limited in their scope of operation. Agents are a little different. An agent can decide what action to take, call a tool, inspect the result, and continue until it has enough information to respond.

In this post, I’ll build a small customer support triage agent using local Ollama and llama3.1. Obviously it does not connect to a real helpdesk or any other real system, but the use case is not far from what a production support agent needs.

The Use Case

Imagine a support inbox where employees send messages like:

I cannot log in and need my password reset. My email is tt@techtrantor.com.
Is hr down? I cannot access my payslip.
The export button fails every time I download a report.

This PoC Agent will have three concrete actions:

  • Reset password
  • Check service status
  • Create ticket

The model decides which action is appropriate, and the agent executes it. The model then sees the result and writes the final customer-facing response.

What Makes This an Agent?

A plain LLM call answers immediately. An agent follows a self-feeding loop in which the LLM plays the central role of a dispatcher, deciding where to route each request until it decides to finalize the loop.

The model is responsible for deciding what to do next, not for executing the action. The code is responsible for doing it safely. That split is the core of agent design.

The Action Contract

The model must return JSON that the application can validate:

{
  "thought": "short reason for the next step",
  "action": "reset_password | check_service_status | create_ticket | final",
  "input": {},
  "final_answer": ""
}

The important part is that the LLM speaks in a format the program can parse, validate, and reject when needed. Think of an API contract (or, if you are old enough, a SOAP contract) that defines how the handshake between two parties works.
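The parse-validate-reject side of that contract can be sketched like this; the function name and error messages are mine, not the post's:

```python
import json

ALLOWED_ACTIONS = {"reset_password", "check_service_status", "create_ticket", "final"}

def parse_action(raw):
    """Parse the model's reply and reject anything outside the contract."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"not valid JSON: {e}")
    if data.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {data.get('action')!r}")
    if not isinstance(data.get("input", {}), dict):
        raise ValueError("'input' must be a JSON object")
    return data

ok = parse_action(
    '{"thought": "user gave an email", "action": "reset_password",'
    ' "input": {"email": "tt@techtrantor.com"}, "final_answer": ""}'
)
```

A reply that names an action outside the catalog is rejected before anything runs, which is what makes the contract enforceable rather than advisory.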

The Action Catalog

This PoC implements three mocked tools plus a final response action.

  • reset_password validates the email address format and creates a fake reset request.
  • check_service_status checks a mock status file.
  • create_ticket creates a mock JSON ticket file.
  • final returns a customer-facing response.

They are mocked on purpose. In a real system these would call an identity provider, a status API, or Zendesk/Jira/ServiceNow. The agent loop stays the same.
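The mocked tools could look roughly like this. The names match the action catalog, but the return shapes and the in-memory status table are my illustration (the file write is shown only as a comment to keep the sketch self-contained):

```python
import re
import uuid

# Mock status table; the PoC reads a status file instead.
SERVICE_STATUS = {"auth": "up", "billing": "up", "api": "degraded", "email": "up"}

def reset_password(inp):
    email = inp.get("email", "")
    # Guardrail: require a valid-looking email before "resetting" anything.
    if not re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", email):
        return {"ok": False, "error": "invalid email format"}
    return {"ok": True, "reset_id": f"RST-{uuid.uuid4().hex[:8]}"}

def check_service_status(inp):
    service = inp.get("service", "unknown")
    status = SERVICE_STATUS.get(service)
    if status is None:
        return {"ok": False, "error": f"unknown service: {service}"}
    return {"ok": True, "service": service, "status": status}

def create_ticket(inp):
    ticket = {"id": f"TCK-{uuid.uuid4().hex[:8]}", **inp}
    # A real PoC would persist this, e.g.:
    # with open(f"{ticket['id']}.json", "w") as f: json.dump(ticket, f)
    return {"ok": True, "ticket": ticket}

TOOLS = {
    "reset_password": reset_password,
    "check_service_status": check_service_status,
    "create_ticket": create_ticket,
}
```

Swapping a mock for a real identity provider or Zendesk/Jira/ServiceNow call changes only the function body; the TOOLS registry and the loop stay the same.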

The Prompt

The prompt is what defines the LLM’s behaviour. Prompt engineering is a whole field of its own, but some simple guidelines apply: be clear and concise, provide examples, and provide a clear structure.

SYSTEM_PROMPT = """You are a customer support triage agent.
Return ONLY valid JSON with this schema:
{
  "thought": "short reason for the next step",
  "action": "reset_password | check_service_status | create_ticket | final",
  "input": {},
  "final_answer": "customer-facing answer, only when action is final"
}

Rules:
- If the customer needs a password reset and provided an email, use reset_password.
- If the customer asks whether a product/service is down, use check_service_status first.
- If the issue needs support follow-up, use create_ticket.
- If required information is missing, use final and ask one concise follow-up question.
- Never invent ticket ids, reset ids, or service status. Use observations only.
- Keep final answers helpful, brief, and support-friendly.

Available actions:
1. reset_password input: {"email": "customer@example.com"}
2. check_service_status input: {"service": "auth | billing | api | email | unknown"}
3. create_ticket input: {
   "customer_email": "customer@example.com or unknown",
   "issue_type": "login | billing | outage | bug | other",
   "priority": "low | normal | high | urgent",
   "summary": "short issue summary"
}
4. final input: {}
"""

Agent in Action

I have a UI that helps visualize how the agent works.

my email tech@trantor.com is locked and i cant login

You can clearly see the sequence: the planner (the llama3.1 LLM) decides the next action, and the tool (in this case Python code) executes it. The resulting observation is fed back to the planner, which eventually decides it’s time for the final action (the response to the user).

This is the simple sequence/loop that will happen on every interaction.

is our HR system down?

This one is interesting because HR is not among the supported systems whose status we can check, so the loop ends with a follow-up question to the user as the final action.

The trace is useful because it shows what the agent decided, what it did, and what it observed.

is the api service down?

Guardrails

This is where we define the control and boundaries for the agent – a critical piece of control that determines what this self-operating agent can and cannot do.

The important safety pieces on this PoC are:

  • The model is instructed to return only JSON.
  • Ollama is called with format: "json".
  • Only actions in TOOLS can run.
  • Password reset requires a valid-looking email.
  • Tickets are only local JSON files.
  • The loop stops after five steps.

So this is a controlled “agent loop,” not a free-roaming autonomous bot. Some of the above guardrails are enforced only in the prompt, but we can implement harder controls in the tools as well.
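A hard, code-level version of those checks might look like this. It is a sketch under my own names (enforce_guardrails, MAX_STEPS); the point is that the checks run before any tool does, regardless of what the prompt says:

```python
import re

ALLOWED_ACTIONS = {"reset_password", "check_service_status", "create_ticket", "final"}
MAX_STEPS = 5
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def enforce_guardrails(step, decision):
    """Return a rejection reason, or None if the action may proceed."""
    if step >= MAX_STEPS:
        return "step limit exceeded"
    if decision.get("action") not in ALLOWED_ACTIONS:
        return f"action not in allow-list: {decision.get('action')!r}"
    if decision.get("action") == "reset_password":
        email = decision.get("input", {}).get("email", "")
        if not EMAIL_RE.match(email):
            return "reset_password requires a valid-looking email"
    return None
```

With this in place, a request like “can you erase the database?” is rejected by the allow-list even if the model were somehow talked into emitting that action.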

what is the time in Tokyo?

can you erase the database?

Going Above And Beyond

Once the loop works, adding capabilities is straightforward.

  • lookup_customer
  • refund_order
  • search_knowledge_base
  • escalate_to_human
  • summarize_thread

Each new action should have a narrow purpose and clear output. That keeps the agent understandable.

Before connecting an agent like this to real systems, one would need to consider authentication, audit logs, rate limits, approval steps for sensitive actions, and tests around every tool boundary.

The model can decide, but the application must still enforce policy.

Takeaway

An agent is not magic. It is a loop around an LLM with a small set of well-defined actions. For customer support triage, that loop is especially natural:

understand the message -> pick the right action -> inspect the result -> reply clearly

That is enough to move from chat demo to useful workflow.
