Agentic data scientists: Redefining the future of analytics

For the past two decades, the role of a data scientist has been about mastering data pipelines, algorithms, and tools to transform raw data into insights. But with the rise of agentic AI systems, frameworks where autonomous agents collaborate, reason, and take actions, data science is undergoing a radical shift. The Agentic Data Scientist isn’t just a human role anymore; it’s becoming a hybrid of human expertise and LLM-powered agents that can plan, code, execute, and self-improve data pipelines.

This article explains the concept of Agentic Data Scientists, their working principles, and their role in advancing data-driven intelligence. It also includes complete code to help you build an Agentic Data Scientist for yourself.

What is an Agentic Data Scientist?

An Agentic Data Scientist is a human-machine partnership where large language models (LLMs) orchestrate multiple specialized agents to automate and augment the traditional data science lifecycle.

Instead of a human manually writing preprocessing scripts, training models, and tuning hyperparameters, the agentic system uses several agents that collect the user's intent, break the task down into step-by-step instructions, generate executable Python code, run that code in a sandbox and capture its output, validate the results, suggest fixes, and loop back through the other agents if needed.

The human data scientist still drives problem formulation, governance, and interpretation, but much of the repetitive work becomes autonomous.

Why now? The convergence of LLMs and LangGraph

The agentic paradigm has become feasible because of two breakthroughs:

LLMs with reasoning abilities (such as GPT-4o, LLaMA-3, or IBM Granite models) that can interpret tasks, generate code, and critique results.
Orchestration frameworks like LangGraph or watsonx Orchestrate, which allow AI developers to design stateful, looping workflows where agents collaborate, replan, and self-correct data pipelines.

Sample Architecture for an Agentic Data Scientist

Let’s consider a sample Agentic Data Scientist.

UI Input Agent. The user uploads the dataset (in CSV or Excel format) for the data science task, such as regression or classification, and describes the task in natural language (English). The UI Input Agent passes the task and the data (as a DataFrame) to the Planner Agent.
Planner Agent. Generates a clear step-by-step plan (no code yet).
Coder Agent. Translates the plan into Python code operating directly on the DataFrame.
Executor Agent. Runs the code, capturing stdout or error tracebacks.
Reviewer Agent. Automatically fixes buggy code or suggests human intervention.

The agents form a loop (see the following figure). If execution fails, the Reviewer Agent suggests corrections and the Executor Agent tries again. This continues until either the code runs or the system gives up with a human-readable suggestion.

Figure: Workflow of agents in the Agentic Data Scientist.
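
The retry loop described above can be sketched in plain Python, without any framework. This is a minimal illustration rather than the article's implementation: `run_code`, `review`, and `solve` are hypothetical names, the attempt cap of 3 is an arbitrary choice, and the reviewer is a stub that applies a hard-coded fix where a real system would call an LLM.

```python
import contextlib
import io
import traceback

MAX_ATTEMPTS = 3  # assumption: small cap chosen for the illustration


def run_code(code: str) -> tuple[str, str]:
    """Execute code in-process and return (stdout, error traceback)."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {"__name__": "__main__"})
        return buf.getvalue(), ""
    except Exception:
        return "", traceback.format_exc()


def review(code: str, error: str) -> str:
    """Stub reviewer: a real system would send the code and error to an LLM."""
    if "NameError" in error:
        return "total = 0\n" + code  # hard-coded fix, for illustration only
    return code


def solve(code: str) -> dict:
    """Run, review, and retry until the code succeeds or attempts run out."""
    attempts = 0
    while True:
        output, error = run_code(code)
        if not error or attempts >= MAX_ATTEMPTS:
            return {"code": code, "output": output, "error": error, "attempts": attempts}
        code = review(code, error)
        attempts += 1


# The first run fails with a NameError; the stub reviewer repairs it once.
result = solve("total += 5\nprint(total)")
print(result["output"].strip(), result["attempts"])
```

The attempt counter is what guarantees termination: without it, a reviewer that keeps returning broken code would loop forever.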

Code walkthrough of the sample Agentic Data Scientist

Let’s walk through the Python code for building the sample Agentic Data Scientist. We build this agentic system using:

Gradio, for the UI
LangGraph, for the workflow: the system is built around a LangGraph state machine in which each node represents an agent.
IBM watsonx, specifically ChatWatsonx, which is a wrapper for the watsonx.ai LLMs

Imports

First, we import all the required Python libraries. These include standard libraries (os, io, sys, traceback, contextlib), third-party libraries (pandas, gradio, dotenv), and specialized packages from LangGraph and watsonx.

import os
import io
import sys
import traceback
import contextlib
from typing import TypedDict, Literal, Any

import pandas as pd
import gradio as gr
from dotenv import load_dotenv

# LangGraph
from langgraph.graph import END, StateGraph, START
from langgraph.checkpoint.memory import MemorySaver

# LangChain / Watsonx
from langchain_ibm import ChatWatsonx
from langchain_core.prompts import ChatPromptTemplate

Environment and LLM setup

Next, we load credentials for watsonx.ai using environment variables. The model we used is 'openai/gpt-oss-120b'.

Also, we initialize the ChatWatsonx instance.

load_dotenv()

credentials = {
    "watsonx_api_key": os.getenv("WATSONX_APIKEY"),
    "watsonx_url": os.getenv("WATSONX_URL", "https://us-south.ml.cloud.ibm.com/"),
    "watsonx_project_id": os.getenv("WATSONX_PROJECT_ID"),
}

model_id = "openai/gpt-oss-120b"
llm = ChatWatsonx(
    model_id=model_id,
    api_key=credentials["watsonx_api_key"],
    url=credentials["watsonx_url"],
    project_id=credentials["watsonx_project_id"],
    max_tokens=100_000,
)

GraphState definition

LangGraph workflows use a typed dictionary to represent state, which keeps the keys consistent as the state flows through the agents. The GraphState class below defines those keys.

class GraphState(TypedDict):
    task: str
    uploaded_file: Any
    dataset_info: str
    instructions: str
    code: str
    exec_output: str
    exec_error: str
    attempts: int
    suggestions: str
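
Each agent returns only the keys it changed, and LangGraph merges that partial update into the shared state; keys without a custom reducer are overwritten. The sketch below imitates that default behavior with plain dictionaries. `MiniState` and `apply_update` are hypothetical names used for illustration and are not part of LangGraph.

```python
from typing import TypedDict


class MiniState(TypedDict, total=False):
    # Trimmed-down stand-in for GraphState, for illustration only.
    task: str
    code: str
    exec_error: str
    attempts: int


def apply_update(state: MiniState, update: MiniState) -> MiniState:
    """Return a new state in which keys from `update` overwrite those in `state`."""
    return {**state, **update}


state: MiniState = {"task": "fit a regression", "attempts": 0}
state = apply_update(state, {"code": "print('hello')"})  # a coder-style update
state = apply_update(state, {"exec_error": "", "attempts": 1})  # an executor-style update
print(state)
```

This is why an agent such as the planner can return a dictionary with a single key: the other keys keep their previous values.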

Agents

Agents represent nodes in the LangGraph workflow. Each agent performs a specific role: input preprocessing, planning, coding, executing, or reviewing. They communicate with each other via the shared GraphState.

Input Agent

This agent loads the uploaded CSV or Excel file into a pandas DataFrame and prepares a short description of the dataset columns.

def ui_input_agent(state: GraphState) -> GraphState:
    global df
    df = load_file(state.get("uploaded_file"))
    if isinstance(df, pd.DataFrame) and not df.empty:
        cols = ", ".join(df.columns.tolist())
        dataset_info = f"The uploaded file contains the following columns: {cols}."
    else:
        dataset_info = "No data was uploaded (empty DataFrame)."
    return {
        "task": state["task"],
        "uploaded_file": state.get("uploaded_file"),
        "dataset_info": dataset_info,
        "attempts": 0,
        "suggestions": "",
    }

Planner Agent

This agent uses the LLM to create a clear, numbered plan of steps to solve the user's data-science task.

planner_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "... system instructions ..."),
        ("human", "Task description: {task}\nDataset info: {dataset_info}\n\nProvide the numbered instructions."),
    ]
)

def planner_agent(state: GraphState) -> GraphState:
    prompt = planner_prompt.format_messages(
        task=state["task"],
        dataset_info=state["dataset_info"],
    )
    response = llm.invoke(prompt)
    instructions = response.content.strip()
    return {"instructions": instructions}

Coder Agent

This agent converts the planner's instructions into executable Python code.

coder_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "... system instructions ..."),
        ("human", "Instructions:\n{instructions}\n\nWrite the code now:"),
    ]
)

def coder_agent(state: GraphState) -> GraphState:
    prompt = coder_prompt.format_messages(instructions=state["instructions"])
    response = llm.invoke(prompt)
    code = response.content.strip()
    if code.startswith("```python"):
        code = code[len("```python"):].strip()
    if code.startswith("```"):
        code = code[3:].strip()
    if code.endswith("```"):
        code = code[:-3].strip()
    return {"code": code}
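
The prefix and suffix checks in coder_agent assume that the reply starts and ends with a code fence. Models sometimes add a sentence before or after the code, in which case those checks leave the extra text in place. One possible alternative, shown here as a sketch, is to search for the first fenced block with a regular expression. `extract_code` is a hypothetical helper, not part of the article's code.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Match the first fenced block, with an optional language tag after the opening fence.
_FENCED_BLOCK = re.compile(FENCE + r"[a-zA-Z0-9_+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(reply: str) -> str:
    """Return the first fenced block in `reply`, or the stripped reply if none exists."""
    match = _FENCED_BLOCK.search(reply)
    if match:
        return match.group(1).strip()
    return reply.strip()


reply = "Here is the code:\n" + FENCE + "python\nprint(df.shape)\n" + FENCE + "\nHope this helps!"
print(extract_code(reply))
print(extract_code("print('no fence')"))
```

The fallback to the stripped reply keeps the behavior unchanged for replies that contain only code.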

Executor Agent

This agent runs the generated Python code and captures its output or error trace.

def executor_agent(state: GraphState) -> GraphState:
    code = state.get("code", "")
    if not code:
        return {"exec_output": "", "exec_error": "No code was provided by the coder."}
    exec_namespace = dict(globals())
    exec_namespace.update({"__name__": "__main__"})
    stdout_buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(stdout_buf):
            exec(code, exec_namespace)
        return {"exec_output": stdout_buf.getvalue(), "exec_error": ""}
    except Exception:
        tb = traceback.format_exc()
        return {"exec_output": "", "exec_error": tb}
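
The executor above runs the generated code with exec inside the application's own process, so a faulty script can hang the app or overwrite its globals. One common way to add some isolation, which is not the approach used in this article, is to run the code in a separate interpreter with a timeout. The sketch below assumes a hypothetical helper named `run_isolated`; note that the child process cannot see the in-memory `df`, so the generated code would have to load the data from a file path, and that a subprocess alone is not a security sandbox.

```python
import subprocess
import sys


def run_isolated(code: str, timeout_s: float = 30.0) -> dict:
    """Run `code` in a fresh Python process and capture stdout or the error text."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"exec_output": "", "exec_error": f"Timed out after {timeout_s} seconds."}
    if proc.returncode != 0:
        # A non-zero exit code means the script raised; stderr holds the traceback.
        return {"exec_output": proc.stdout, "exec_error": proc.stderr}
    return {"exec_output": proc.stdout, "exec_error": ""}


ok = run_isolated("print(1 + 1)")
bad = run_isolated("raise ValueError('boom')")
print(ok["exec_output"].strip())
print("ValueError" in bad["exec_error"])
```

The returned dictionary uses the same `exec_output` and `exec_error` keys as the executor agent, so the two are interchangeable from the point of view of the shared state.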

Reviewer Agent

This agent checks for errors in execution. If the code failed, it attempts to fix it or provides human-readable suggestions.

reviewer_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "... system instructions ..."),
        ("human", "Error:\n```\n{error}\n```\n\nOriginal code:\n```\n{code}\n```"),
    ]
)

def reviewer_agent(state: GraphState) -> GraphState:
    prompt = reviewer_prompt.format_messages(error=state["exec_error"], code=state["code"])
    response = llm.invoke(prompt)
    reply = response.content.strip()
    if reply.upper().startswith("SUGGEST:"):
        suggestion = reply[len("SUGGEST:"):].strip()
        return {"suggestions": suggestion, "code": state["code"]}
    corrected = reply
    if corrected.startswith("```python"):
        corrected = corrected[len("```python"):].strip()
    if corrected.startswith("```"):
        corrected = corrected[3:].strip()
    if corrected.endswith("```"):
        corrected = corrected[:-3].strip()
    return {"code": corrected, "suggestions": ""}

Workflow definition

This section defines the LangGraph workflow by adding nodes and edges. It also sets up conditional paths for retrying execution or ending the workflow.

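MAX_ATTEMPTS = 5
workflow = StateGraph(GraphState)


def reviewer_node(state: GraphState) -> GraphState:
    # Wrap the reviewer so that every review counts toward MAX_ATTEMPTS.
    update = reviewer_agent(state)
    update["attempts"] = state.get("attempts", 0) + 1
    return update


workflow.add_node("input", ui_input_agent)
workflow.add_node("planner", planner_agent)
workflow.add_node("coder", coder_agent)
workflow.add_node("executor", executor_agent)
workflow.add_node("reviewer", reviewer_node)

workflow.add_edge(START, "input")
workflow.add_edge("input", "planner")
workflow.add_edge("planner", "coder")
workflow.add_edge("coder", "executor")

def after_executor(state: GraphState) -> Literal["reviewer", END]:
    # Send failures to the reviewer until the retry budget is used up.
    if state["exec_error"] and state["attempts"] < MAX_ATTEMPTS:
        return "reviewer"
    return END

def after_reviewer(state: GraphState) -> Literal["executor", END]:
    # A suggestion means the reviewer could not fix the code, so stop for the human.
    if state.get("suggestions"):
        return END
    return "executor"

workflow.add_conditional_edges("executor", after_executor, {"reviewer": "reviewer", END: END})
workflow.add_conditional_edges("reviewer", after_reviewer, {"executor": "executor", END: END})
app = workflow.compile()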

Helper utilities

These functions support the Agentic Data Scientist’s core workflow by handling key backend operations:

_normalise_path() – Ensures uploaded files are correctly interpreted, whether as objects or file paths.
load_file() – Reads CSV or Excel files into pandas DataFrames and handles errors gracefully.
run_workflow() – Orchestrates the full agentic process: executes tasks, collects outputs from planner, coder, executor, and reviewer, and compiles final results.

Together, they make the system stable, modular, and easy to maintain.

def _normalise_path(file_obj) -> str:
    if file_obj is None:
        return ""
    if isinstance(file_obj, str):
        return file_obj
    return getattr(file_obj, "name", "")


def load_file(file_obj):
    path = _normalise_path(file_obj)
    if not path:
        return pd.DataFrame()
    ext = path.lower().split(".")[-1]
    try:
        if ext == "csv":
            df = pd.read_csv(path)
        elif ext in ("xlsx", "xls"):
            df = pd.read_excel(path)
        else:
            raise ValueError("Unsupported file type. Please upload CSV or Excel.")
    except Exception as exc:
        raise ValueError(f"Failed to read the uploaded file: {exc}") from exc
    return df.drop(columns="Date", errors="ignore")

def run_workflow(task: str, uploaded_file) -> dict:
    final_state = app.invoke({"task": task, "uploaded_file": uploaded_file})
    result = {
        "planner": final_state.get("instructions", ""),
        "coder": final_state.get("code", ""),
        "executor_output": final_state.get("exec_output", ""),
        "executor_error": final_state.get("exec_error", ""),
        "reviewer": final_state.get("suggestions", ""),
        "final_output": "",
    }
    if final_state.get("exec_error"):
        if final_state.get("suggestions"):
            result["final_output"] = (
                "❗️  The system could not automatically fix the code.\n"
                f"💡  Suggested next step for the human:\n{final_state['suggestions']}"
            )
        else:
            result["final_output"] = (
                "❗️  Execution failed after the maximum number of attempts.\n"
                f"Error was:\n{final_state['exec_error']}"
            )
    else:
        result["final_output"] = "✅  Code ran successfully!  Output:\n" + final_state["exec_output"]
    return result

Gradio user interface

This Gradio user interface acts as an interactive front-end for the Agentic Data-Science Assistant. It provides a simple workflow where users can:

Upload a CSV or Excel dataset.
Describe their desired analysis or model in plain English.
Run the assistant, which then automatically plans the steps, generates Python code, executes it, reviews the outcome, and presents the final results.

Each output area (Planner, Coder, Executor, Reviewer, Final Output) displays the respective stage’s result, giving users full transparency into how the analysis is performed.

with gr.Blocks(theme=gr.themes.Default()) as demo:
    gr.Markdown("""
        # 🤖 Data‑Science Assistant (LangGraph + Watsonx)
        1️⃣ Upload a CSV or Excel file.  
        2️⃣ Describe the analysis / model you want in plain English.  
        3️⃣ Press Run – the assistant will plan, code, execute, review and finally give you the result.
    """)

    with gr.Row():
        file_input = gr.File(label="📂 Upload CSV / Excel (optional)", file_types=[".csv", ".xlsx", ".xls"])
        task_input = gr.Textbox(label="📝 Task description", placeholder="e.g. train a linear regression model", lines=3)

    run_btn = gr.Button("🚀 Run", variant="primary")

    planner_box = gr.Textbox(label="🗒️ Planner – numbered steps", lines=6)
    coder_box   = gr.Code(label="👩‍💻 Coder – generated Python code", language="python", lines=12)
    executor_out = gr.Textbox(label="⚙️ Executor – stdout", lines=6)
    executor_err = gr.Textbox(label="❌ Executor – error (if any)", lines=6)
    reviewer_box = gr.Textbox(label="🧐 Reviewer – suggestion (if any)", lines=6)
    final_box    = gr.Textbox(label="🎉 Final output", lines=8)

    def on_click(task, uploaded_file):
        out = run_workflow(task, uploaded_file)
        return (
            out["planner"], out["coder"], out["executor_output"],
            out["executor_error"], out["reviewer"], out["final_output"],
        )

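    run_btn.click(fn=on_click, inputs=[task_input, file_input], outputs=[planner_box, coder_box, executor_out, executor_err, reviewer_box, final_box])

# Start the local web server when the script is run directly.
if __name__ == "__main__":
    demo.launch()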

Launch

To launch the Agentic Data Scientist interface, do the following:

Put all of the above code snippets into a Python script named something like agentic_data_scientist.py.
Put your watsonx credentials (WATSONX_URL, WATSONX_APIKEY, and WATSONX_PROJECT_ID) in a .env file.
From your terminal, run your script: python agentic_data_scientist.py.
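
A missing credential usually surfaces only later, as an authentication error from the client. A small check before the client is constructed can make the problem easier to spot. The sketch below uses the variable names from the steps above; `missing_settings` is a hypothetical helper, and treating WATSONX_URL as required is a stricter choice than the code in this article, which falls back to a default URL.

```python
import os

REQUIRED_VARS = ("WATSONX_APIKEY", "WATSONX_URL", "WATSONX_PROJECT_ID")


def missing_settings(env=None) -> list[str]:
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping; call missing_settings() to check the real environment.
example_env = {"WATSONX_APIKEY": "placeholder-key", "WATSONX_URL": ""}
print(missing_settings(example_env))
```

In the script, this check would run after load_dotenv() so that values from the .env file are already present in the environment.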

This will start a local web server and display a link (for example, http://127.0.0.1:7860) that opens the Agentic Data Scientist interface in your browser. You can then interact with it to perform end-to-end automated data-science workflows.

After you click the link, the following interface opens:

In this example, the user uploaded the sales data file, Supplement_Sales_Weekly.csv, and described the task as “Train and evaluate a random forest regressor model on the data. Output variable is Units Sold.”
After reviewing the task, click the Run button.

You will see the outputs of the Planner Agent and the Coder Agent displayed. Once the program has run, the final output of the task is displayed on the screen, as in the following screen capture:

In this example, the agents performed the task specified by the user and reported evaluation metrics for the trained model, such as RMSE and R², along with the feature importances.

Conclusion

In this article, we showed how you can build agents that plan, code, execute, and review data science tasks using watsonx LLMs, LangGraph, and a Gradio UI.

The era of the Agentic Data Scientist is here. By combining human intuition with agent-driven automation, we can create systems that are faster, smarter, and more ethical.