Getting Started

This guide walks you through installing the core Noon tools, running your first DuckDB query, and profiling data with FineType.

Noon tools run on macOS, Linux, and Windows (via WSL).

Tool      Purpose                          Required
────────  ───────────────────────────────  ───────────
Nushell   Shell for data workflows         Recommended
DuckDB    In-memory analytical SQL engine  Yes
FineType  Semantic type classification     Yes

You’ll need a terminal and a package manager. The examples below use Homebrew (macOS/Linux), but alternatives are listed for each platform.

Nushell is a modern shell that treats data as structured tables — ideal for piping between tools.

# macOS / Linux
brew install nushell
# Windows
winget install nushell
# Or via Cargo (any platform with Rust)
cargo install nu

Verify:

nu --version

DuckDB is a fast, embeddable SQL engine for analytics. It reads CSV, JSON, and Parquet natively.

# macOS
brew install duckdb
# Linux (apt)
sudo apt install duckdb
# Windows
winget install DuckDB.cli

Verify:

duckdb --version

FineType classifies text into 152 semantic types with a character-level CNN model.

# macOS (Homebrew)
brew install noon-org/tap/finetype
# Any platform with Rust
cargo install finetype-cli

Verify:

finetype --version
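To confirm all three installs in one pass, a small check along these lines may help. It is a sketch for a POSIX shell (bash, zsh, sh), not Nushell syntax, and `check_tools` is a hypothetical helper name; the binary names `nu`, `duckdb`, and `finetype` are the ones used in the verify steps above.

```shell
# Hypothetical helper: report whether each named command is on PATH.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
    fi
  done
}

# Binary names taken from the verify commands in this guide.
check_tools nu duckdb finetype
```

Any tool reported as missing either failed to install or is not on your PATH yet; opening a new terminal session often resolves the latter.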

Let’s walk through a complete analytics workflow: create a dataset, query it with DuckDB, then profile the column types with FineType.

Save this as contacts.csv:

id,name,email,created_at,ip_address,amount
1,Alice Chen,[email protected],2024-01-15T09:30:00Z,192.168.1.10,149.99
2,Bob Smith,[email protected],2024-02-20T14:15:00Z,10.0.0.42,2500.00
3,Carol Wu,[email protected],2024-03-08T11:00:00Z,172.16.0.1,89.50
4,Dan Reeves,[email protected],2024-04-12T16:45:00Z,192.168.0.5,1200.00
5,Eve Nakamura,[email protected],2024-05-01T08:00:00Z,10.10.10.1,340.75

This dataset has a mix of types: names, emails, timestamps, IP addresses, and numeric amounts.
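Before querying, it can be worth checking that the file was saved intact. The sketch below, for a POSIX shell with `awk`, counts data rows and confirms that every row has as many fields as the header. `csv_shape` is a hypothetical helper name, and the check splits naively on commas, so it assumes no quoted fields containing commas (true for this sample).

```shell
# Hypothetical helper: print "<rows> rows, <columns> columns" for a simple CSV,
# or exit non-zero if any row has a different field count than the header.
csv_shape() {
  awk -F, '
    NF == 0    { next }                # skip blank lines
    !cols      { cols = NF; next }     # first non-blank line is the header
    NF != cols { printf "line %d: expected %d fields, got %d\n", NR, cols, NF; bad = 1 }
               { rows++ }
    END        { if (bad) exit 1; printf "%d rows, %d columns\n", rows, cols }
  ' "$1"
}

# Only run against the sample if it exists in the current directory.
if [ -f contacts.csv ]; then
  csv_shape contacts.csv
fi
```

For the sample file this should report 5 rows and 6 columns.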

Open a DuckDB shell and explore the data:

-- Start DuckDB (run this in your terminal, then enter the SQL below at the DuckDB prompt)
duckdb

-- Load and inspect
SELECT * FROM 'contacts.csv';

-- Aggregate query
SELECT
  count(*) AS total_contacts,
  avg(amount) AS avg_amount,
  min(created_at) AS earliest,
  max(created_at) AS latest
FROM 'contacts.csv';

Expected output:

┌─────────────────┬────────────┬──────────────────────┬──────────────────────┐
│ total_contacts  │ avg_amount │       earliest       │        latest        │
│      int64      │   double   │       varchar        │       varchar        │
├─────────────────┼────────────┼──────────────────────┼──────────────────────┤
│               5 │     856.05 │ 2024-01-15T09:30:00Z │ 2024-05-01T08:00:00Z │
└─────────────────┴────────────┴──────────────────────┴──────────────────────┘

DuckDB automatically reads the CSV and lets you query it immediately — no schema definition needed.
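As a cross-check on the aggregate, the average can be recomputed outside DuckDB. The five amounts sum to 4280.24, so the mean is 856.048, which rounds to the 856.05 shown above. The sketch below does the same arithmetic with `awk` in a POSIX shell; `avg_amount` is a hypothetical helper name, and it assumes `amount` is the sixth comma-separated field with no quoted commas earlier in the row.

```shell
# Hypothetical helper: mean of the sixth CSV column, skipping the header
# and any blank lines, printed to two decimal places.
avg_amount() {
  awk -F, 'NR > 1 && NF { sum += $6; n++ } END { if (n) printf "%.2f\n", sum / n }' "$1"
}

# Only run against the sample if it exists in the current directory.
if [ -f contacts.csv ]; then
  avg_amount contacts.csv
fi
```

This is only a sanity check for small files; for anything larger, the DuckDB query is the better tool.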

Now let’s see what FineType detects in each column:

finetype profile -f contacts.csv

Expected output:

Column         Type                             Confidence
────────────── ──────────────────────────────── ──────────
id             representation.numeric.increment 0.95
name           identity.person.full_name        0.92
email          identity.person.email            0.99
created_at     datetime.timestamp.iso_8601      0.98
ip_address     technology.internet.ip_v4        0.97
amount         representation.numeric.decimal   0.94

FineType identifies semantic types beyond what SQL type inference gives you — it distinguishes emails from strings, IP addresses from text, and ISO timestamps from generic dates.

You can also classify single values:

finetype infer -i "[email protected]"
# → identity.person.email
finetype infer -i "192.168.1.10"
# → technology.internet.ip_v4
finetype infer -i "2024-01-15T09:30:00Z"
# → datetime.timestamp.iso_8601

Each prediction is a transformation contract — it maps to a DuckDB SQL expression guaranteed to parse the value correctly.

With the FineType DuckDB extension, you can classify values directly in SQL. The statements below install and load the extension, then classify each value in the email column:

INSTALL finetype FROM community;
LOAD finetype;
SELECT
  email,
  finetype(email) AS semantic_type
FROM 'contacts.csv';

Keep exploring

Dive deeper into the Noon analytics ecosystem.

FineType Docs

Explore the full taxonomy, CLI commands, DuckDB extension, and performance benchmarks. Read more →

DuckDB

Learn more about DuckDB’s SQL dialect, file format support, and extensions. Documentation →

Nushell

Discover Nushell’s structured data pipelines and how they complement SQL workflows. The Nushell Book →