Open Source · Apache 2.0

The Language Built for Data & AI

Python made data accessible. TL makes it fast, safe, and intelligent — in one compiled language. 1,322 tests passing across 34 implementation phases. AI agents with tool-use, a full MCP ecosystem (client + server), generics, pattern matching, Python FFI, LLVM & WASM backends, a package manager, a full LSP, and a comprehensive security audit, all already shipping.

1,322 Tests Passing · Rust-Powered · Zero GIL
pipeline.tl
source users = postgres("db").table("users") -> User

transform active_users(src: table<User>) -> table<User> {
    src
    |> filter(is_active == true)
    |> clean(nulls: { name: "unknown" })
    |> with { tenure = today() - signup_date }
}

model churn = train xgboost {
    data: users |> with { tenure = today() - signup_date }
    target: "is_active"
    features: [tenure, monthly_spend]
}
Philosophy

Seven Principles That Define TL

DATA IS A TYPE

Tables, Streams, Tensors are native types in the language.

PIPELINES ARE PROGRAMS

ETL/ELT flows are composable first-class constructs.

AI IS A VERB

train, predict, embed, agent are keywords — not libraries.

PARALLEL BY DEFAULT

No GIL, automatic partitioning across cores.

FAIL LOUD, RECOVER SMART

Built-in error handling for unreliable data sources.

READABLE BEATS CLEVER

Python-like readability, Rust-like safety guarantees.

FAST WITHOUT TRYING

Compiled to native code with lazy evaluation. Performance is the default, not an afterthought.

The Problem

Replace Your Entire Stack

Stop duct-taping together a dozen tools. TL unifies the modern data stack into one language.

Today's Stack
TL Equivalent
Python + Pandas
Native table type
SQL in strings
Native query syntax
Spark / PySpark
Built-in distributed execution
Airflow / Dagster
Native pipeline construct
PyTorch / TF / sklearn
Native model / train / predict
Kafka consumers
Native stream type
dbt
Native transformations with typing
Docker + K8s
tl deploy CLI command
LangChain / CrewAI / AutoGen
Native agent construct with tool-use
Custom MCP integrations
Built-in mcp_connect() + mcp_serve()
Syntax

Clean, Expressive, Powerful

Schema & Source
schema User {
    id:          int64
    name:        string
    email:       string
    signup_date: date
    is_active:   bool
}

source users =
    postgres("db")
    .table("users")
    -> User
Transform & Pipeline
transform clean_users(src: table<User>) {
    src
    |> filter(is_active == true)
    |> clean(nulls: {
        name: "unknown"
    })
    |> with {
        tenure = today() - signup_date
    }
}

pipeline daily_etl {
    users |> clean_users
}
AI Training
model churn_predictor =
    train xgboost {
        data: users |> with {
            tenure = today() - signup_date
        }
        target: "is_active"
        features: [
            tenure,
            monthly_spend
        ]
        split: 0.8
    }

// Use the model
let result =
    predict(churn_predictor, new_user)
Pattern Matching
match load_users("data.csv") {
    Ok(users) => process(users)
    Err(DataError::FileNotFound(p)) =>
        log("Missing: {p}")
    Err(e) => alert("{e}")
}

// Destructuring + guards
let [head, ...tail] = items
let Point { x, y } = origin
Generics & Traits
fn top_n<T: Comparable>(
    data: table<T>,
    col: fn(T) -> float64,
    n: int
) -> table<T> {
    data |> sort(col, desc) |> limit(n)
}

trait Connectable {
    fn connect() -> result<Conn, Error>
}
AI Agents

AI Agents as Language Primitives

No frameworks. No glue code. Define autonomous AI agents with tool-use, multi-provider LLM support, and lifecycle hooks — all with a single keyword.

research_agent.tl
// Define tool functions in pure TL
fn search(query) {
    let resp = http_request("GET",
        "https://api.search.com/v1?q=" + query,
        none, none)
    json_parse(resp.body)
}

// Declare the agent
agent research_bot {
    model: "gpt-4o",
    system: "You are a research assistant.",
    tools {
        search: {
            description: "Search the web",
            parameters: {
                type: "object",
                properties: {
                    query: { type: "string" }
                }
            }
        }
    },
    max_turns: 5,
    on_tool_call {
        println("[LOG] " + tool_name)
    }
}

// Run it
let result = run_agent(research_bot,
    "What is quantum computing?")
println(result.response)

First-Class Keyword

agent is a language keyword, not a library import. Tools are TL functions wired directly to the LLM.

Any LLM Provider

OpenAI, Anthropic, Ollama, or any OpenAI-compatible endpoint. Auto-detects protocol from model name. One base_url field to switch.

Automatic Tool Loop

The runtime handles multi-turn tool calling, JSON arg conversion, and result formatting. You just write the function.

Lifecycle Hooks

on_tool_call and on_complete blocks for logging, metrics, or custom logic at each step.

Pipeline Integration

Agents use the same table, stream, and connectors as your data pipelines — no serialization layer needed.

Conversation Persistence

run_agent(agent, message, history) maintains context across multi-turn sessions. No external memory store needed.

SSE Streaming

stream_agent() delivers real-time token-by-token output via Server-Sent Events.

Retry & JSON Mode

Automatic exponential backoff on 429/5xx errors. output_format: "json" for guaranteed structured output.

run_agent(agent, msg, history)
Multi-turn agent with conversation persistence
stream_agent(agent, msg)
Real-time SSE streaming responses
embed(text) → tensor
Vector embeddings + similarity search
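Tying these together, here is a sketch of a multi-turn session with the research_bot agent defined above — the .history field used to thread context between turns is an assumption about the result shape, not confirmed API:

session.tl
// Turn 1
let first = run_agent(research_bot, "Find recent MCP news")

// Turn 2: pass prior history so the agent keeps context
let second = run_agent(research_bot,
    "Summarize the top result", first.history)

// Or stream tokens as they arrive (SSE)
stream_agent(research_bot, "Explain SSE briefly")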
MCP Ecosystem

Full MCP Ecosystem

Model Context Protocol — the open standard that lets AI tools and data systems talk to each other. TL implements both sides: connect to any MCP server as a client, or expose TL functions to any AI tool as a server.

mcp_client.tl
// Connect to any MCP server
let github = mcp_connect("github-server")

// Discover available tools
let tools = mcp_list_tools(github)

// Call any tool directly
let issues = mcp_call_tool(github,
    "list_issues",
    { repo: "myorg/myrepo" })

// Read resources from server
let schema = mcp_read_resource(github,
    "repo://myorg/myrepo/schema")

// Get prompt templates
let prompt = mcp_get_prompt(github,
    "summarize_pr", { pr: "42" })

mcp_disconnect(github)
mcp_server.tl
// Expose TL functions as MCP tools
fn query_sales(target_region, target_quarter) {
    // Parameters renamed so they don't shadow the columns
    postgres("warehouse", "sales")
        |> filter(region == target_region)
        |> filter(q == target_quarter)
}

fn run_report(name) {
    read_csv("reports/" + name + ".csv")
        |> aggregate(sum(revenue))
}

// Start the MCP server
mcp_serve()

// Claude Desktop, Cursor, Windsurf etc.
// can now discover & call these functions
// Config: { "command": "tl run server.tl" }

MCP Client

mcp_connect() auto-detects stdio or HTTP transport. 11 builtins for tools, resources, prompts, and ping.

MCP Server

mcp_serve() turns TL functions into MCP tools. Claude Desktop, Cursor, Windsurf can discover and call them.

Agent Integration

mcp_servers: [...] in agent definitions. LLM sees one unified tool list — MCP and native tools dispatched transparently.

Sampling

MCP servers can request LLM completions back through TL. Bidirectional AI communication over the protocol.

agent_with_mcp.tl
// Connect to multiple MCP servers
let github = mcp_connect("github-server")
let db = mcp_connect("database-server")

// Agent auto-discovers tools from all MCP servers
agent ops_bot {
    model: "claude-sonnet-4-20250514",
    system: "You are a DevOps assistant.",
    mcp_servers: [github, db],
    tools {
        // Native TL tools alongside MCP tools
        deploy: {
            description: "Deploy to production",
            parameters: { type: "object",
                properties: { service: { type: "string" } } }
        }
    }
}

// LLM sees unified tool list: GitHub + DB + native
let result = run_agent(ops_bot,
    "Check open PRs, verify staging DB, then deploy")
11 builtins
BuiltinId 216–226 — connect, discover, call, read, ping, disconnect
2 transports
stdio (subprocess) + Streamable HTTP (reqwest/axum)
rmcp 1.1 SDK
Feature-gated: --features mcp
Sandbox-aware
Sandbox mode blocks subprocess spawning; the HTTP transport stays available

One Protocol, Both Directions

TL agents gain access to the entire MCP ecosystem — filesystem, GitHub, databases, Slack, and hundreds more — without building each integration natively. Any AI tool gains access to TL's data engine via MCP server. Connect or serve. Your choice.

Type System

Data Types as First-Class Citizens

table<T>

Columnar, lazy-evaluated, and partitionable. The core data type for batch processing.

let users: table<User> = postgres("db").table("users")

stream<T>

Infinite, windowed, real-time. For continuous data processing and event streams.

stream process_events {
  from: kafka("events")
  window: tumbling(5m)
}

tensor<dtype, shape>

N-dimensional arrays for AI and machine learning. Shape-checked at compile time.

let embeddings: tensor<float32, [256, 768]>

model

A trained AI model as a first-class value. Serialize, version, deploy natively.

model churn = train xgboost { ... }

agent

Autonomous AI agent with tool-use, MCP server integration, and lifecycle hooks.

agent bot { model: "gpt-4o", mcp_servers: [...], tools { ... } }
Memory Safety

Four Rules. Zero Data Races.

Rust-inspired ownership without lifetime annotations. The compiler guarantees memory safety and data-race freedom at compile time.

1

Every value has one owner

let users = load("users.parquet")
// `users` is the sole owner
2

Pipe |> moves ownership

let active = users |> filter(age > 25)
// `users` is now consumed
3

Clone or borrow for reuse

let copy = users.clone()
let view = &users // read-only borrow
4

Parallel partitions own data

parallel for shard in users.partition(by: "region")
// No locks needed — compiler guarantees it
Under the Hood

Six-Stage Compilation Pipeline

Lexer
Parser
Semantic Analysis
TL-IR
Optimization
Code Generation
LLVM
Native
Cranelift
JIT
WASM
Web
CUDA
GPU

Built entirely in Rust. TL-IR doubles as a query plan — enabling data-aware optimizations like predicate pushdown, column pruning, and join reordering.

Benchmarks

Performance That Speaks

CSV Parse 1B rows
Python
45s
TL
<4s
Filter + Aggregate
Python
30s
TL
<2s
ETL Pipeline
Python
5min
TL
<30s
Stream Processing
Python
10K/s
TL
500K/s
Cold Start
Python
3-5s
TL
<100ms

End-to-End ML Pipeline

TL's compiler sees the entire pipeline as one program — eliminating serialization boundaries between tools.

Python Stack (~275s)
pandas.read_csv()45s
DataFrame transforms30s
df.to_numpy()5s
xgboost.train()120s
model.predict()15s
pandas.to_sql()60s
TL Pipeline (~120s)
load + filter + with4s
train xgboost { ... }110s
predict + with + save6s
Speedup~2.3x
Zero-copy Arrow handoff. No serialization boundaries.

Targets based on architecture analysis. Benchmarks will be published with reproducible scripts.

Error Handling

Data Is Messy. TL Handles It.

Rust-inspired result<T, E> with data-specific error types and declarative cleaning — not try/catch bolted on as an afterthought.

Result Types & Pattern Match
fn load_users(path: string)
    -> result<table<User>, DataError> {
    let raw = read_csv(path)?
    let valid = raw |> validate_schema(User)?
    Ok(valid)
}

match load_users("data.csv") {
    Ok(users) => process(users)
    Err(DataError::SchemaViolation(d))
        => alert("Drift: {d}")
    Err(e) => log("{e}")
}
Declarative Data Cleaning
let users = load("raw_users.csv")
    |> clean {
        nulls: {
            name: fill("UNKNOWN")
            email: drop_row
            age: fill(median)
        }
        duplicates: dedupe(by: email)
        outliers: {
            age: clamp(0, 150)
        }
    }
    |> validate {
        assert null_rate(email) == 0.0
        assert unique(id)
    }
Connectors

Connect to Everything

First-class connectors for databases, object storage, message queues, and APIs. All type-safe and schema-aware.

PostgreSQL
Shipped
MySQL
Shipped
SQLite
Shipped
DuckDB
Shipped
Redshift
Shipped
MSSQL
Shipped
Snowflake
Shipped
BigQuery
Shipped
Databricks
Shipped
ClickHouse
Shipped
MongoDB
Shipped
Kafka
Shipped
AWS S3
Shipped
Redis
Shipped
GraphQL
Shipped
HTTP/REST
Shipped
Parquet/CSV
Shipped
SFTP/SCP
Shipped

19 connectors shipped — more coming with every release.
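As a sketch of how connectors compose in a single program — write_parquet, the s3(...) path helper, and the Order schema are illustrative assumptions, not confirmed API:

etl.tl
// Warehouse table in, Parquet on S3 out
let orders = postgres("warehouse").table("orders") -> Order

orders
    |> filter(total > 100.0)
    |> write_parquet(s3("bucket/orders/"))  // helper names assumed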

Python Interop

Use Any Python Library

Bidirectional Python FFI via pyo3. Import Python modules, call functions, convert tensors to NumPy — all from TL code.

interop.tl
// Import any Python library
let np = py_import("numpy")
let pd = py_import("pandas")
let sklearn = py_import("sklearn.metrics")

// Call Python functions with TL values
let score = py_call(
    sklearn.accuracy_score,
    y_true, y_pred
)

// Attribute access & calls via dot notation
let pi = np.pi
let root = np.sqrt(16)

Bidirectional Conversion

int, float, string, bool, list, map, set — all auto-converted between TL and Python.

Tensor ↔ NumPy

TL tensors convert seamlessly to/from NumPy ndarrays for ML workflows.

Dot Notation Access

Use natural math.sqrt(16) syntax on Python objects via method dispatch.

Feature-Gated

Python FFI is opt-in via feature flag. Zero overhead when not used.

Developer Experience

Everything You Need, Built In

terminal
$ tl init my-project
Created my-project/ with tl.toml
$ tl build
Compiled 12 modules in 0.3s
$ tl test
1,322 tests passed, 0 failed
$ tl add kafka-connector
Added kafka-connector v0.8
$ tl fmt && tl lint && tl check
Formatted 12 files. No warnings. Types OK.
$ tl debug pipeline.tl
Breakpoint hit at pipeline.tl:42
dbg> inspect rows → 1,248 records
$ tl doc src/ --public-only
Generated docs for 8 public modules
$ tl deploy pipeline.tl --target k8s
Deployed to cluster: prod-east

VS Code Extension & LSP

Syntax highlighting, diagnostics, go-to-definition, hover docs, document symbols, rename refactoring, and find-references across files.

Package Manager

tl add, tl update, tl outdated — full dependency management with lockfile and transitive resolution.

Formatter, Linter & Type Checker

tl fmt, tl lint, tl check — AST-guided formatting, naming conventions, and compile-time type safety.

Doc Generation

tl doc generates HTML, Markdown, or JSON docs from /// doc comments with cross-references.
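For example, a function annotated like this (a sketch — the hypothetical active_only is not from the docs above) would appear in the generated output:

docs_example.tl
/// Keep only active users.
///
/// Rows with is_active == false are dropped.
fn active_only(src: table<User>) -> table<User> {
    src |> filter(is_active == true)
}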

Interactive Step Debugger

tl debug — breakpoints, variable inspection, source listing, and stack traces. Debug pipelines interactively.

Data Inspection & Lineage

tl inspect, tl profile, tl lineage — preview data, statistical profiles, and lineage graphs.

Standard Library

Batteries Included

Rich standard library methods, native DateTime, window functions, and data engineering primitives — all built in.

15+ List Methods

find sort_by group_by unique flatten chunk zip each

Map & String Methods

merge entries map_values trim_start is_numeric strip_prefix count

Math & Randomization

exp sign clamp is_nan random() random_int() sample()
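A quick sketch of these in use — exact argument semantics (e.g. whether random_int bounds are inclusive, or the range of random()) are assumptions:

math.tl
let roll    = random_int(1, 6)        // random integer in a range
let bounded = clamp(score, 0.0, 1.0)  // pin a value between limits
let noise   = random()                // random float
let picks   = sample(items, 10)       // random subset of a list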

Native DateTime

First-class VmValue::DateTime type with full arithmetic.

today() date_add() date_diff() date_trunc() date_extract()
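A short sketch of the DateTime builtins in use — the unit arguments and argument order are assumptions:

dates.tl
let due    = date_add(today(), 30, "days")     // 30 days from now
let tenure = date_diff(today(), signup_date)   // elapsed time between dates
let month  = date_trunc(today(), "month")      // truncate to start of month
let year   = date_extract(signup_date, "year")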

Window Functions

DataFusion UDWF-backed analytics on tables.

rank row_number dense_rank lag lead ntile
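A sketch of applying window functions inside a pipeline — the over:/argument syntax shown here is an assumption, not confirmed by the material above:

windows.tl
users
    |> with {
        spend_rank = rank(over: monthly_spend)  // syntax assumed
        prev_spend = lag(monthly_spend, 1)      // previous row's value
    }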

Table Operations

Pipeline-native data manipulation.

table1 |> union(table2)
table |> sample(100)
table |> sample(fraction: 0.1)
assert_table_eq(t1, t2)

11 MCP Builtins

Full Model Context Protocol client + server.

mcp_connect mcp_list_tools mcp_call_tool mcp_read_resource mcp_get_prompt mcp_serve mcp_ping mcp_disconnect
"""...""" Triple-quoted strings with automatic dedentation
Positioning

How TL Compares

The only language where data pipelines, SQL-like queries, ML training, AI agents, MCP ecosystem, and real-time streaming are all first-class features — not libraries.

Tool
Their Strength
TL's Advantage
Python
Largest ML/data ecosystem
10-50x faster, type-safe, compiled
Mojo
Compiled ML, Python superset
Better data engineering: pipelines, streaming, connectors
Rust
Max performance, memory safety
Domain-specific abstractions as primitives
DuckDB
Embedded analytics, great SQL
Full language, not just SQL — plus ML and streaming
Polars
Fastest DataFrame library
First-class syntax, integrated ML/streaming
Scala + Spark
Battle-tested distributed computing
Simpler syntax, faster single-node, no JVM
SQL / dbt
Declarative, universally understood
Full programming language + AI + streaming
LangChain / CrewAI
Rich agent ecosystem, Python flexibility
Native syntax, no Python dep, compiled speed, type-safe tools
Custom MCP SDKs
Protocol-level flexibility
Client + server built-in, agent-integrated, zero config

Open Source.

ThinkingLanguage is licensed under Apache 2.0.

Apache 2.0 Built with Rust