AI: LLMs, Workflows, Agents


We hear about AI and LLMs almost every day, and lately the AI agent trend has taken off. But what exactly is an AI agent? How does it differ from an AI workflow, or from AI in general?


Decoding AI: Understanding the Levels – From LLMs to Agents

The world of Artificial Intelligence is rapidly evolving, bringing with it new terminology and capabilities. Concepts like Large Language Models (LLMs), AI Workflows, and AI Agents are frequently discussed, but their distinctions can sometimes feel complex. Today we will break down these concepts into clear, understandable levels.

Let’s explore this progression, building from the foundational technology to more autonomous systems.

Level 1: Large Language Models (LLMs)

At the base level are Large Language Models (LLMs). These are the core technology behind popular AI chatbots such as ChatGPT, Google Gemini, and Claude. Their primary strength lies in their ability to generate and edit text.

The interaction with an LLM is straightforward: you, the human, provide an input (a prompt), and the LLM produces an output based on its extensive training data. For instance, if you ask ChatGPT to draft an email for a coffee chat, your request is the input, and the resulting email is the output.
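The input-output interaction above can be sketched in a few lines. Here `call_llm` is a hypothetical stand-in for any real chat API (OpenAI, Gemini, Claude, etc.); the stubbed response is for illustration only.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; a real one would hit a provider's API."""
    # Stubbed response standing in for model-generated text.
    return f"Draft email based on: {prompt!r}"

# The human supplies the input; the model produces the output. Nothing else happens:
# no tools, no memory of your calendar, no action taken on its own.
reply = call_llm("Draft a short email inviting Alex to a coffee chat.")
print(reply)
```

The key point is the shape of the exchange: one prompt in, one text response out, and the model waits passively for the next prompt.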

However, LLMs have important limitations:

  • Limited Knowledge: Despite being trained on vast datasets, LLMs have limited knowledge of proprietary information. This includes personal details or internal company data. For example, asking ChatGPT when your next coffee chat is will likely fail because it doesn’t have access to your personal calendar.
  • Passive Nature: LLMs are inherently passive. They wait for your prompt before generating a response. They don’t initiate actions on their own.

Imagine a wise, ancient dragon who only knows stories from books it has read its entire life. You ask it to weave a tale, and it can do so beautifully based on its training. But ask it what you had for breakfast this morning, and it won’t know, as that’s not in its books, and it certainly won’t start telling you stories until you ask.

Level 2: AI Workflows

Building upon LLMs, there are AI Workflows. This level involves guiding an LLM to follow a predefined path set by a human. This path, sometimes called the ‘control logic’, can include steps that involve accessing external tools or data.

In an AI workflow, you might instruct the LLM: "Every time I ask about a personal event, perform a search query and fetch data from my Google Calendar before providing a response." With this workflow logic implemented, asking "When is my coffee chat with Elon Husky?" would lead the LLM to first access your Google Calendar to find the information before responding.

However, the critical limitation here is that the workflow can only follow the path you defined. If your next question is “What will the weather be like that day?”, and your workflow only included accessing the calendar, the system would fail to answer because that step wasn’t programmed into the path. Even adding more steps, like accessing a weather API or using a text-to-audio model, still constitutes an AI workflow as long as a human defines the exact sequence of actions.
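A minimal sketch of that control logic, using illustrative stubs (the in-memory `FAKE_CALENDAR` and both functions are hypothetical, not a real Google Calendar integration):

```python
# The human-defined path: calendar questions trigger a calendar lookup.
# Anything outside the predefined path simply fails.

FAKE_CALENDAR = {"coffee chat with elon husky": "Friday 10:00"}

def fetch_calendar_event(query: str):
    """Hypothetical tool call; a real workflow would query the Calendar API."""
    for event, when in FAKE_CALENDAR.items():
        if event in query.lower():
            return f"{event} is on {when}"
    return None

def workflow(query: str) -> str:
    # Step programmed by the human: route calendar questions to the tool.
    if "coffee chat" in query.lower():
        found = fetch_calendar_event(query)
        return found or "No matching event found."
    # No step was programmed for anything else (e.g. weather), so it fails.
    return "Sorry, this workflow has no step for that question."

print(workflow("When is my coffee chat with Elon Husky?"))
print(workflow("What will the weather be like that day?"))
```

The second call fails not because the model is incapable, but because the human never wrote a weather step into the path.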

A practical example could involve compiling news article links from Google Sheets, using one AI (Perplexity) to summarize them, and then another AI (Claude) to draft social media posts, which can be scheduled automatically. This follows a clear, predefined path: step 1, step 2, step 3, run daily at 8 a.m.

A fancy term often associated with this level is Retrieval Augmented Generation (RAG). In simple terms, RAG is a process within an AI workflow that allows the AI model to look up external information (like accessing a calendar or a weather service) before generating its response. It is essentially a type of AI workflow.
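The retrieve-then-generate pattern behind RAG can be shown with a toy example. The "retriever" here is a trivial keyword match over an in-memory document list, and `generate` is a stub; real RAG systems typically use vector search and an actual LLM call.

```python
DOCUMENTS = [
    "The coffee chat with Elon Husky is scheduled for Friday at 10:00.",
    "The quarterly report is due next Monday.",
]

def retrieve(question: str) -> list:
    """Toy retriever: keep documents sharing a meaningful word with the question."""
    words = {w.strip("?.,") for w in question.lower().split() if len(w) > 3}
    return [doc for doc in DOCUMENTS
            if words & {w.strip("?.,") for w in doc.lower().split()}]

def generate(question: str, context: list) -> str:
    # Stand-in for an LLM call that receives the retrieved context in its prompt.
    return f"Answering {question!r} using {len(context)} retrieved document(s)."

question = "When is the coffee chat?"
print(generate(question, retrieve(question)))
```

The augmentation happens before generation: the model answers with the retrieved context in hand, rather than from its training data alone.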

The defining characteristic of an AI workflow is that the human remains the decision maker. If the final output isn't satisfactory – for instance, a social media post isn't funny enough – the human must manually intervene, go back, and adjust the workflow or rewrite prompts. This trial-and-error iteration is performed by the human.

Consider an AI workflow as a magical instruction scroll given to our dragon. The scroll details each command precisely: “If the traveler has a shiny object, use fire breath. If they carry a sword, use icy breath.” The dragon follows the scroll perfectly. But if a traveler approaches with a shield, the scroll has no instruction for that, and the dragon is stuck. The wizard must update the scroll manually if the dragon needs to handle shields.

Level 3: AI Agents

The significant leap occurs at Level 3 with AI Agents. The fundamental shift that transforms an AI workflow into an AI agent is the replacement of the human decision maker by an LLM.

Instead of following a rigid, predefined path, an AI agent is given a goal. The LLM within the agent takes on the role of the decision maker, performing reasoning to determine the most effective strategy to achieve that goal. It then takes action using available tools. Critically, the agent also observes the results of its actions and iterates autonomously if needed to improve the outcome and meet the initial goal.

The key traits of AI agents are:

  • Reasoning: The agent thinks about the best approach. For the social media post example, the agent might reason: “What’s the most efficient way to compile news articles? Copying/pasting into Word? No. Compiling links and using another tool to fetch data? Yes, that makes more sense.”
  • Acting via Tools: The agent uses various tools to perform tasks identified during the reasoning phase. For compiling links, it might decide using Google Sheets is better than Microsoft Word or Excel because the user is already connected to make.com with their Google account.
  • Iterating Autonomously: Unlike workflows where the human manually adjusts, an AI agent can critique its own output and automatically make improvements. For the social media post, the agent could add another LLM to critique its draft based on best practices and repeat this cycle until the criteria are met.

The most common configuration for AI agents is the ReAct framework, which stands for Reason and Act, reflecting these core capabilities.
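The reason-act-observe-iterate cycle can be sketched as a simple loop. Everything here is a hypothetical stub: `draft_post` and `critique` stand in for tool calls, and the hard-coded stopping rule stands in for an LLM's reasoning about whether the goal is met.

```python
def draft_post(topic: str, attempt: int) -> str:
    """Hypothetical tool: each attempt yields a refined draft."""
    return f"Post about {topic}, draft #{attempt}"

def critique(post: str) -> bool:
    """Hypothetical critic: approves only after a few refinement rounds."""
    return "#3" in post

def agent(goal: str, max_steps: int = 5) -> str:
    post = ""
    for attempt in range(1, max_steps + 1):
        post = draft_post(goal, attempt)   # act via a tool
        if critique(post):                 # observe the result
            return post                    # goal met: stop
        # iterate: the agent decides on its own to try again
    return post  # best effort after max_steps

print(agent("AI agents"))
```

Note what the human supplies: only the goal and the tools. Deciding how many iterations to run, and when the output is good enough, happens inside the loop rather than in a human-authored script.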

An AI agent is like our dragon, given the goal: “Find the Orb of Understanding.” You don’t give it step-by-step directions. The dragon reasons (Where might an orb be hidden? A cave? A mountain peak?), acts (Flies to a cave, uses fire breath to light it up), observes (Is the orb here? No, just bats), and iterates (Maybe I should try using my enhanced night vision in this cave instead?). It independently works towards the goal, adapting its approach based on its observations, until the orb is found.

Summary

In essence, the journey from LLM to AI Agent represents a significant shift in autonomy:

  • Level 1 (LLM): Input -> Output. The LLM is passive, limited in real-world knowledge.
  • Level 2 (AI Workflow): Input -> Predefined Path (potentially using tools) -> Output. The human programs the path and is the decision maker.
  • Level 3 (AI Agent): Goal -> LLM Reasons -> Acts (using tools) -> Observes -> Iterates -> Final Output. The LLM is the decision maker and operates autonomously to achieve the goal.

Understanding these levels helps clarify the increasing complexity and capability of modern AI systems, moving from simple response generation to goal-oriented, autonomous action.

