dt-cli-tools

CLI tools for viewing, filtering, and comparing tabular data files

commit f4e310b5dd54a86f9939f7cfb4fe8db12f1e423c
parent 945e9cb45239d63a0a7e3c8241aedc6c2e9ce344
Author: Erik Loualiche <eloualic@umn.edu>
Date:   Fri,  3 Apr 2026 15:29:43 -0500

docs: update README to match xl-cli-tools style, add .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Diffstat:
A .gitignore | 2 ++
M README.md | 280 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------
2 files changed, 215 insertions(+), 67 deletions(-)

diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,2 @@
+/target
+Cargo.lock
diff --git a/README.md b/README.md
@@ -1,16 +1,48 @@
-# dt-cli-tools
+<div align="center">
 
-CLI tools for viewing, filtering, and comparing tabular data files. Supports CSV, TSV, Parquet, Arrow/Feather, JSON, NDJSON, and Excel.
+<h1>dt-cli-tools</h1>
+<h3>View, filter, and diff tabular data files from the command line</h3>
 
-Three read-only tools: **dtcat**, **dtfilter**, **dtdiff**.
+[![Vibecoded](https://img.shields.io/badge/vibecoded-%E2%9C%A8-blueviolet)](https://claude.ai)
+[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
 
-## Install
+<table>
+<tr>
+<td align="center" width="33%"><strong>dtcat</strong> — view</td>
+<td align="center" width="33%"><strong>dtfilter</strong> — query</td>
+<td align="center" width="33%"><strong>dtdiff</strong> — compare</td>
+</tr>
+<tr>
+<td><img src="demo/dtcat.gif" alt="dtcat demo" /></td>
+<td><img src="demo/dtfilter.gif" alt="dtfilter demo" /></td>
+<td><img src="demo/dtdiff.gif" alt="dtdiff demo" /></td>
+</tr>
+</table>
+
+</div>
+
+***
+
+[**dtcat**](#dtcat--view-data-files) · [**dtfilter**](#dtfilter--query-and-filter) · [**dtdiff**](#dtdiff--compare-two-files) · [**Install**](#installation) · [**Claude Code**](#claude-code-integration)
+
+***
+
+Three read-only binaries, no runtime dependencies. Supports CSV, TSV, Parquet, Arrow/Feather, JSON, NDJSON, and Excel.
 
 ```bash
-cargo install --path .
+# View a file
+dtcat data.parquet
+
+# Filter rows
+dtfilter data.csv --filter "Amount>1000" --sort "Amount:desc"
+
+# Diff two files
+dtdiff old.csv new.csv --key ID
 ```
 
-## Formats
+> **Supersedes [xl-cli-tools](https://github.com/LouLouLibs/xl-cli-tools)** — same filtering and diff capabilities, now for all tabular formats.
+
+## Supported Formats
 
 | Format | Extensions | Detection |
 |--------|-----------|-----------|
@@ -22,77 +54,204 @@ cargo install --path .
 | NDJSON | `.ndjson`, `.jsonl` | `{` prefix |
 | Excel | `.xlsx`, `.xls`, `.xlsb`, `.ods` | ZIP/OLE magic |
 
-Format detection: `--format` flag > magic bytes > file extension.
+Format detection: `--format` flag > magic bytes > file extension. CSV delimiter auto-detected (comma, tab, semicolon).
+
+## Installation
 
-CSV delimiter auto-detected (comma, tab, semicolon).
+### Pre-built binaries (macOS)
 
----
+Download from [Releases](https://github.com/LouLouLibs/dt-cli-tools/releases):
 
-## dtcat
+```bash
+# Apple Silicon (macOS)
+for tool in dtcat dtfilter dtdiff; do
+  curl -L "https://github.com/LouLouLibs/dt-cli-tools/releases/latest/download/${tool}-aarch64-apple-darwin" \
+    -o ~/.local/bin/$tool
+done
+chmod +x ~/.local/bin/dt{cat,filter,diff}
+```
 
-View and inspect files. Outputs markdown tables by default.
+### From source
 
 ```bash
-dtcat data.parquet # schema + data (≤50 rows all, >50 head/tail 25)
-dtcat data.csv --schema # column names and types
-dtcat data.csv --describe # summary statistics
-dtcat report.xlsx --info # file metadata (size, format, sheets)
-dtcat report.xlsx --sheet Revenue # specific Excel sheet
-dtcat data.csv --head 10 # first 10 rows
-dtcat data.csv --tail 5 # last 5 rows
-dtcat data.parquet --csv # output as CSV for piping
-dtcat data.txt --format csv # override format detection
-dtcat data.csv --skip 2 # skip metadata rows above header
+cargo install --path .
 ```
 
-Modes `--schema`, `--describe`, and data (default) are mutually exclusive.
+Requires Rust 1.85+.
+
+## dtcat — View Data Files
+
+```bash
+# Overview: schema + data (<=50 rows all, >50 head/tail 25)
+dtcat data.parquet
 
-## dtfilter
+# Column names and types only
+dtcat data.csv --schema
+
+# Summary statistics (count, mean, std, min, max, median)
+dtcat data.csv --describe
+
+# File metadata (size, format, sheets)
+dtcat report.xlsx --info
+
+# Pick a sheet in a multi-sheet workbook
+dtcat report.xlsx --sheet Revenue
+
+# First 10 rows / last 5 rows
+dtcat data.csv --head 10
+dtcat data.csv --tail 5
+
+# CSV output for piping
+dtcat data.parquet --csv
+
+# Override format detection
+dtcat data.txt --format csv
+
+# Skip metadata rows above header
+dtcat data.csv --skip 2
+```
+
+### Example output
+
+```
+# File: sales.parquet (245 KB)
+# Format: Parquet
+
+## Data (1240 rows x 4 cols)
+
+| Column | Type |
+|---------|--------|
+| date | Date |
+| region | String |
+| amount | Float |
+| units | Int |
+
+| date | region | amount | units |
+|------------|--------|---------|-------|
+| 2024-01-01 | East | 1234.56 | 100 |
+| 2024-01-02 | West | 987.00 | 75 |
+... (1190 rows omitted) ...
+| 2024-12-30 | East | 1100.00 | 92 |
+| 2024-12-31 | West | 1250.75 | 110 |
+```
 
-Filter, sort, and select.
+### Adaptive defaults
+
+- **Single sheet/file, <=50 rows:** shows all data
+- **Single sheet/file, >50 rows:** first 25 + last 25 rows
+- **Multiple sheets:** lists schemas, pick one with `--sheet`
+
+Modes `--schema`, `--describe`, `--info`, and data (default) are mutually exclusive.
+
+## dtfilter — Query and Filter
+
+```bash
-dtfilter data.csv --filter State=CA # equality
-dtfilter data.csv --filter Amount>1000 # numeric comparison
-dtfilter data.csv --filter State=CA --filter Amount>1000 # AND logic
-dtfilter data.csv --filter Name~john # contains (case-insensitive)
-dtfilter data.csv --filter Status!=Draft # not equals
-dtfilter data.csv --columns State,City,Amount # select columns
-dtfilter data.csv --sort Amount:desc # sort descending
-dtfilter data.csv --sort Name # sort ascending (default)
-dtfilter data.csv --filter Active=true --limit 10 # cap output rows
-dtfilter data.csv --head 100 --filter State=CA # window before filter
-dtfilter data.parquet --filter value>0 --csv # CSV output
+# Filter rows by value
+dtfilter data.csv --filter State=CA
+
+# Numeric comparisons
+dtfilter data.csv --filter Amount>1000
+
+# Multiple filters (AND)
+dtfilter data.csv --filter State=CA --filter Amount>1000
+
+# Contains filter (case-insensitive)
+dtfilter data.csv --filter Name~john
+
+# Select columns
+dtfilter data.csv --columns State,City,Amount
+
+# Sort results
+dtfilter data.csv --sort Amount:desc
+
+# Limit output
+dtfilter data.csv --sort Amount:desc --limit 10
+
+# Window before filter
+dtfilter data.csv --head 100 --filter State=CA
+
+# CSV output for piping
+dtfilter data.parquet --filter value>0 --csv
 ```
 
-Filter operators: `=` `!=` `>` `<` `>=` `<=` `~` (contains) `!~` (not contains).
+### Filter operators
 
-`--head`/`--tail` apply before filtering. `--limit` applies after. `--head` and `--tail` are mutually exclusive.
+
+| Operator | Meaning | Example |
+|----------|---------|---------|
+| `=` | Equals | `State=CA` |
+| `!=` | Not equals | `Status!=Draft` |
+| `>` | Greater than | `Amount>1000` |
+| `<` | Less than | `Year<2024` |
+| `>=` | Greater or equal | `Score>=90` |
+| `<=` | Less or equal | `Price<=50` |
+| `~` | Contains (case-insensitive) | `Name~john` |
+| `!~` | Not contains | `Name!~test` |
 
-## dtdiff
 
-Compare two files of the same format. Exit code 0 = identical, 1 = differences, 2 = error.
+`--head`/`--tail` apply before filtering. `--limit` applies after. Row count is printed to stderr.
+
+## dtdiff — Compare Two Files
 
 ```bash
-dtdiff old.csv new.csv # positional comparison
-dtdiff old.csv new.csv --key ID # key-based (added/removed/modified)
-dtdiff old.csv new.csv --key Date,Ticker # composite key
-dtdiff old.csv new.csv --key ID --tolerance 0.01 # float tolerance
-dtdiff old.csv new.csv --key ID --json # JSON output
-dtdiff old.csv new.csv --key ID --csv # CSV output
-dtdiff old.csv new.csv --no-color # plain text
-dtdiff report.xlsx other.xlsx --sheet Revenue # Excel sheets
+# Positional diff (whole-row comparison)
+dtdiff old.csv new.csv
+
+# Key-based diff (match rows by ID, compare cell by cell)
+dtdiff old.csv new.csv --key ID
+
+# Composite key
+dtdiff old.csv new.csv --key Date,Ticker
+
+# Float tolerance (differences <= 0.01 treated as equal)
+dtdiff old.csv new.csv --key ID --tolerance 0.01
+
+# Only compare specific columns
+dtdiff old.csv new.csv --key ID --columns Name,Salary
+
+# Excel sheets
+dtdiff report.xlsx other.xlsx --sheet Revenue
+
+# Output formats
+dtdiff old.csv new.csv --key ID --json
+dtdiff old.csv new.csv --key ID --csv
+dtdiff old.csv new.csv --no-color
+```
+
+### Example output
+
+```
+--- old.csv
++++ new.csv
+
+Added: 1 | Removed: 1 | Modified: 2
+
+- ID: "3" Name: "Charlie" Department: "Engineering" Salary: "88000"
++ ID: "5" Name: "Eve" Department: "Marketing" Salary: "70000"
+~ ID: "1"
+  Salary: "95000" → "98000"
+~ ID: "2"
+  Department: "Marketing" → "Design"
+  Salary: "72000" → "75000"
 ```
 
-Both files must be the same format (CSV/TSV are treated as compatible).
+### Diff modes
 
-**Positional mode** (no `--key`): reports added/removed rows based on full-row equality.
+**Positional (no `--key`):** Every column defines row identity. Reports added/removed rows only.
 
-**Key-based mode** (`--key`): matches by key columns, reports added/removed/modified with cell-level changes.
+**Key-based (`--key`):** Match rows by key columns, compare remaining columns cell by cell. Reports added, removed, and modified rows with per-cell changes. Supports composite keys, duplicate key detection, and float tolerance.
 
----
+### Exit codes (diff convention)
 
-## Exit Codes
+| Code | Meaning |
+|------|---------|
+| 0 | No differences |
+| 1 | Differences found |
+| 2 | Error |
+
+## Claude Code integration
+
+Claude Code skills are available in [claude-skills](https://github.com/LouLouLibs/claude-skills). Claude can view data files, analyze schemas, filter rows, and compare files in conversations.
+
+## Exit codes
 
 | Tool | 0 | 1 | 2 |
 |------|---|---|---|
@@ -100,19 +259,6 @@ Both files must be the same format (CSV/TSV are treated as compatible).
 | dtfilter | success | runtime error | invalid arguments |
 | dtdiff | no differences | differences found | error |
 
-## Architecture
-
-Library crate `dtcore` with three thin binaries. ~60% ported from [xl-cli-tools](https://github.com/LouLouLibs/xl-cli-tools).
-
-```
-src/
-  format.rs # format detection (magic bytes + extension)
-  reader.rs # dispatch to format-specific readers
-  readers/ # CSV, Parquet, Arrow, JSON, Excel
-  formatter.rs # DataFrame → markdown/CSV output
-  filter.rs # filter expressions, sort, pipeline
-  diff.rs # positional and key-based comparison
-  metadata.rs # FileInfo, SheetInfo, display helpers
-```
+## License
 
-Built on [Polars](https://pola.rs/) for DataFrames, [calamine](https://github.com/tafia/calamine) for Excel, [clap](https://clap.rs/) for CLI.
+MIT
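
Note on the exit codes documented in this README change: `dtdiff` is described as returning 0 for no differences, 1 for differences found, and 2 for an error. A minimal sketch of how a script might branch on those values follows. The helper names `describe_dtdiff_status` and `compare_tables` are hypothetical and not part of dt-cli-tools; only the `--key` flag and the three exit codes come from the README text.

```shell
#!/usr/bin/env bash

# Map a dtdiff exit status to the meaning given in the README's exit-code table.
# describe_dtdiff_status is a hypothetical helper, not shipped with dt-cli-tools.
describe_dtdiff_status() {
  case "$1" in
    0) echo "no differences" ;;
    1) echo "differences found" ;;
    2) echo "error" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Run a key-based diff, leave dtdiff's own report on stdout, and print a
# one-line summary on stderr. compare_tables is also a hypothetical name.
compare_tables() {
  local old="$1" new="$2" key="$3" status=0
  # Capture the status without aborting under `set -e`.
  dtdiff "$old" "$new" --key "$key" || status=$?
  echo "dtdiff: $(describe_dtdiff_status "$status")" >&2
  return "$status"
}
```

Called as `compare_tables old.csv new.csv ID`, the function passes the original status through, so a caller can still treat 1 (differences) and 2 (error) differently.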