README.md - dt-cli-tools - CLI tools for viewing, filtering, and comparing tabular data files

<div align="center">

<h1>dt-cli-tools</h1>
<h3>View, filter, and diff tabular data files from the command line</h3>

[![Vibecoded](https://img.shields.io/badge/vibecoded-%E2%9C%A8-blueviolet)](https://claude.ai)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

<img src="demo/hero.gif" alt="dt-cli-tools demo" width="80%" />

</div>

***

[**dtcat**](#dtcat--view-data-files) · [**dtfilter**](#dtfilter--query-and-filter) · [**dtdiff**](#dtdiff--compare-two-files) · [**Install**](#installation) · [**Claude Code**](#claude-code-integration)

***

Three read-only binaries, no runtime dependencies. Supports CSV, TSV, Parquet, Arrow/Feather, JSON, NDJSON, and Excel.

```bash
# View a file
dtcat data.parquet

# Filter rows
dtfilter data.csv --filter "Amount>1000" --sort "Amount:desc"

# Diff two files
dtdiff old.csv new.csv --key ID
```

> **Supersedes [xl-cli-tools](https://github.com/LouLouLibs/xl-cli-tools)**: same filtering and diff capabilities, now for all tabular formats.

## Supported Formats

| Format | Extensions | Detection |
|--------|-----------|-----------|
| CSV | `.csv` | delimiter heuristic |
| TSV | `.tsv`, `.tab` | delimiter heuristic |
| Parquet | `.parquet`, `.pq` | `PAR1` magic |
| Arrow/Feather | `.arrow`, `.feather`, `.ipc` | `ARROW1` magic |
| JSON | `.json` | `[` prefix |
| NDJSON | `.ndjson`, `.jsonl` | `{` prefix |
| Excel | `.xlsx`, `.xls`, `.xlsb`, `.ods` | ZIP/OLE magic |

Format detection precedence: `--format` flag, then magic bytes, then file extension. The CSV delimiter is auto-detected (comma, tab, or semicolon).
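
The precedence above can be pictured with a small shell sketch. This is only an illustration of the idea, not how the tools implement detection: the function name `sniff_format` and its optional second argument (standing in for `--format`) are hypothetical, and only the ZIP branch of the Excel check is covered.

```bash
# sniff_format FILE [FORMAT]: guess a format name for FILE.
# Hypothetical helper that mirrors the documented precedence:
# explicit override, then leading bytes, then file extension.
sniff_format() {
  file=$1
  # 1. An explicit override wins (stand-in for the --format flag).
  if [ -n "${2:-}" ]; then echo "$2"; return; fi
  # 2. Look at the first bytes; drop NULs so the shell can hold them.
  magic=$(head -c 6 "$file" 2>/dev/null | tr -d '\0')
  case "$magic" in
    PAR1*)   echo parquet; return ;;
    ARROW1*) echo arrow;   return ;;
    PK*)     echo excel;   return ;;   # ZIP container (.xlsx, .xlsb, .ods)
    "["*)    echo json;    return ;;
    "{"*)    echo ndjson;  return ;;
  esac
  # 3. Fall back to the extension.
  case "$file" in
    *.csv)       echo csv ;;
    *.tsv|*.tab) echo tsv ;;
    *)           echo unknown ;;
  esac
}
```

With this sketch, a file named `export.csv` whose content starts with `{` is reported as `ndjson`, because the content check runs before the extension check.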

## Installation

### Pre-built binaries (macOS)

Download from [Releases](https://github.com/LouLouLibs/dt-cli-tools/releases):

```bash
# Create the install directory if it does not exist yet
mkdir -p ~/.local/bin

# Apple Silicon (-f makes curl fail on HTTP errors instead of saving an error page)
for tool in dtcat dtfilter dtdiff; do
  curl -fL "https://github.com/LouLouLibs/dt-cli-tools/releases/latest/download/${tool}-aarch64-apple-darwin" \
    -o ~/.local/bin/"$tool"
done
chmod +x ~/.local/bin/dt{cat,filter,diff}

# Intel Mac
for tool in dtcat dtfilter dtdiff; do
  curl -fL "https://github.com/LouLouLibs/dt-cli-tools/releases/latest/download/${tool}-x86_64-apple-darwin" \
    -o ~/.local/bin/"$tool"
done
chmod +x ~/.local/bin/dt{cat,filter,diff}
```
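
The commands above install into `~/.local/bin`, which is not on `PATH` on every system. A minimal check, assuming a POSIX-style shell (the helper name `path_contains` is illustrative and not part of dt-cli-tools):

```bash
# path_contains DIR: succeed if DIR is one of the entries in $PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

if path_contains "$HOME/.local/bin"; then
  echo "$HOME/.local/bin is on PATH"
else
  # Print the line to add to a shell profile; nothing is modified here.
  echo "add to your shell profile: export PATH=\"\$HOME/.local/bin:\$PATH\""
fi
```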

### From source

```bash
cargo install --path .
```

Requires Rust 1.85+.

## dtcat — View Data Files

<img src="demo/dtcat.gif" alt="dtcat demo" width="80%" />

```bash
# Overview: schema + data (all rows if <=50, otherwise first 25 and last 25)
dtcat data.parquet

# Column names and types only
dtcat data.csv --schema

# Summary statistics (count, mean, std, min, max, median)
dtcat data.csv --describe

# File metadata (size, format, sheets)
dtcat report.xlsx --info

# Pick a sheet in a multi-sheet workbook
dtcat report.xlsx --sheet Revenue

# First 10 rows / last 5 rows
dtcat data.csv --head 10
dtcat data.csv --tail 5

# CSV output for piping
dtcat data.parquet --csv

# Random sample of rows
dtcat huge.parquet --sample 20
dtcat huge.parquet --sample 50 --csv

# Convert between formats
dtcat data.csv --convert parquet -o data.parquet
dtcat report.xlsx --sheet Revenue --convert csv -o revenue.csv
dtcat data.parquet --convert ndjson              # text formats go to stdout

# Override format detection
dtcat data.txt --format csv

# Skip metadata rows above header
dtcat data.csv --skip 2
```

### Example output

```
# File: sales.parquet (245 KB)
# Format: Parquet

## Data (1240 rows x 4 cols)

| Column  | Type   |
|---------|--------|
| date    | Date   |
| region  | String |
| amount  | Float  |
| units   | Int    |

| date       | region | amount  | units |
|------------|--------|---------|-------|
| 2024-01-01 | East   | 1234.56 | 100   |
| 2024-01-02 | West   | 987.00  | 75    |
... (1190 rows omitted) ...
| 2024-12-30 | East   | 1100.00 | 92    |
| 2024-12-31 | West   | 1250.75 | 110   |
```

### Adaptive defaults

- **Single sheet/file, <=50 rows:** shows all data
- **Single sheet/file, >50 rows:** first 25 + last 25 rows
- **Multiple sheets:** lists schemas, pick one with `--sheet`
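
The 50-row rule can be sketched with standard tools on a plain text file, one row per line. This is an analogy for the behaviour described above, not dtcat's implementation; the function name `preview` is made up for the example, and the marker text follows the example output.

```bash
# preview FILE: print every line when there are at most 50,
# otherwise the first 25, an omission marker, and the last 25.
preview() {
  total=$(wc -l < "$1")
  total=$((total))   # normalise: some wc implementations pad with spaces
  if [ "$total" -le 50 ]; then
    cat "$1"
  else
    head -n 25 "$1"
    echo "... ($((total - 50)) rows omitted) ..."
    tail -n 25 "$1"
  fi
}
```

For a 1240-line file this prints 25 lines, the marker `... (1190 rows omitted) ...`, and the final 25 lines, matching the count in the example output.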

The modes `--schema`, `--describe`, `--info`, and the default data view are mutually exclusive.

`--sample N` randomly selects N rows; it is mutually exclusive with `--head`/`--tail`/`--all`.

`--convert FORMAT` writes the data in another format. Use `-o PATH` to set the output file: it is required for the binary formats (Parquet, Arrow) and optional for text formats, which default to stdout. Supported targets: `csv`, `tsv`, `parquet`, `arrow`, `json`, `ndjson`.
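
For batch conversion, the documented `--convert parquet -o PATH` form can be wrapped in a loop. A minimal sketch, where the function name `convert_dir` is an assumption for illustration:

```bash
# convert_dir DIR: write a .parquet file next to every .csv file in DIR.
convert_dir() {
  for src in "$1"/*.csv; do
    [ -e "$src" ] || continue   # no matches: the glob pattern stays literal
    # -o is required here because Parquet is a binary target.
    dtcat "$src" --convert parquet -o "${src%.csv}.parquet"
  done
}
```

Calling `convert_dir exports` would run one `dtcat` conversion per CSV file found in `exports/`.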

## dtfilter — Query and Filter

<img src="demo/dtfilter.gif" alt="dtfilter demo" width="80%" />

```bash
# Filter rows by value
dtfilter data.csv --filter State=CA

# Numeric comparisons (quoted so the shell does not treat > as a redirect)
dtfilter data.csv --filter "Amount>1000"

# Multiple filters (AND)
dtfilter data.csv --filter State=CA --filter "Amount>1000"

# Contains filter (case-insensitive)
dtfilter data.csv --filter Name~john

# Select columns
dtfilter data.csv --columns State,City,Amount

# Sort results
dtfilter data.csv --sort Amount:desc

# Limit output
dtfilter data.csv --sort Amount:desc --limit 10

# Window before filter
dtfilter data.csv --head 100 --filter State=CA

# CSV output for piping
dtfilter data.parquet --filter "value>0" --csv
```

### Filter operators

| Operator | Meaning | Example |
|----------|---------|---------|
| `=` | Equals | `State=CA` |
| `!=` | Not equals | `Status!=Draft` |
| `>` | Greater than | `Amount>1000` |
| `<` | Less than | `Year<2024` |
| `>=` | Greater or equal | `Score>=90` |
| `<=` | Less or equal | `Price<=50` |
| `~` | Contains (case-insensitive) | `Name~john` |
| `!~` | Not contains | `Name!~test` |

On the command line, quote expressions that contain `<` or `>`, which the shell otherwise parses as redirections. Single quotes also keep `!` from triggering history expansion in interactive shells.

`--head`/`--tail` apply before filtering. `--limit` applies after. Row count is printed to stderr.
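
The difference between windowing before the filter and limiting after it can be shown with standard tools on a tiny headerless CSV. This is an analogy under stated assumptions, not dtfilter itself: both helper names are made up, and the data is invented for the example.

```bash
# Both helpers read a headerless CSV (State,Amount) on stdin.

# window_then_filter N STATE: keep the first N rows, then filter
# (the order used by --head N --filter State=STATE).
window_then_filter() { head -n "$1" | awk -F, -v s="$2" '$1 == s'; }

# filter_then_limit N STATE: filter first, then keep N matches
# (the order used by --filter State=STATE --limit N).
filter_then_limit() { awk -F, -v s="$2" '$1 == s' | head -n "$1"; }

rows='CA,10
NY,20
CA,30
CA,40'
printf '%s\n' "$rows" | window_then_filter 2 CA   # prints CA,10
printf '%s\n' "$rows" | filter_then_limit 2 CA    # prints CA,10 and CA,30
```

The first call sees only the first two rows and so finds one match; the second scans every row and then truncates the matches to two.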

## dtdiff — Compare Two Files

<img src="demo/dtdiff.gif" alt="dtdiff demo" width="80%" />

```bash
# Positional diff (whole-row comparison)
dtdiff old.csv new.csv

# Key-based diff (match rows by ID, compare cell by cell)
dtdiff old.csv new.csv --key ID

# Composite key
dtdiff old.csv new.csv --key Date,Ticker

# Float tolerance (differences <= 0.01 treated as equal)
dtdiff old.csv new.csv --key ID --tolerance 0.01

# Only compare specific columns
dtdiff old.csv new.csv --key ID --columns Name,Salary

# Excel sheets
dtdiff report.xlsx other.xlsx --sheet Revenue

# Output formats
dtdiff old.csv new.csv --key ID --json
dtdiff old.csv new.csv --key ID --csv
dtdiff old.csv new.csv --no-color
```

### Example output

```
--- old.csv
+++ new.csv

Added: 1 | Removed: 1 | Modified: 2

- ID: "3"  Name: "Charlie"  Department: "Engineering"  Salary: "88000"
+ ID: "5"  Name: "Eve"  Department: "Marketing"  Salary: "70000"
~ ID: "1"
    Salary: "95000" → "98000"
~ ID: "2"
    Department: "Marketing" → "Design"
    Salary: "72000" → "75000"
```

### Diff modes

**Positional (no `--key`):** Every column defines row identity. Reports added and removed rows only, so a modified row appears as one removal plus one addition.

**Key-based (`--key`):** Matches rows by the key columns and compares the remaining columns cell by cell. Reports added, removed, and modified rows with per-cell changes. Supports composite keys, duplicate key detection, and float tolerance.
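
The idea behind key-based matching can be sketched in a few lines of awk. This is a simplified illustration, not dtdiff's algorithm: it assumes headerless CSV files with the key in column 1 and no quoted commas, it reports modified keys without per-cell detail, and the function name `keyed_diff` is hypothetical.

```bash
# keyed_diff OLD NEW: classify rows of NEW against OLD by the key in column 1.
keyed_diff() {
  awk -F, '
    FILENAME == ARGV[1] { old[$1] = $0; next }   # index OLD rows by key
    { seen[$1] = 1 }                             # remember keys present in NEW
    !($1 in old)        { print "+ " $0; next }  # key only in NEW: added
    old[$1] != $0       { print "~ " $1 }        # key in both, row differs: modified
    END {
      # keys in OLD that never appeared in NEW: removed
      for (k in old) if (!(k in seen)) print "- " old[k]
    }
  ' "$1" "$2"
}
```

Added and modified lines come out in the order of the new file. Removed lines follow at the end, in whatever order awk iterates the array.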

### Exit codes (diff convention)

| Code | Meaning |
|------|---------|
| 0 | No differences |
| 1 | Differences found |
| 2 | Error |
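
In scripts, these codes let a caller tell "files differ" apart from "something went wrong". A minimal sketch, where the wrapper name `diff_status` is illustrative only:

```bash
# diff_status OLD NEW KEY: print a one-word summary based on dtdiff's exit code.
diff_status() {
  rc=0
  # "|| rc=$?" captures the code without tripping `set -e` in the calling script.
  dtdiff "$1" "$2" --key "$3" > /dev/null 2>&1 || rc=$?
  case $rc in
    0) echo same ;;
    1) echo changed ;;
    *) echo error ;;
  esac
}
```

A CI step could then branch on the printed word, for example failing the job only on `error` while treating `changed` as a signal to publish a report.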

## Claude Code integration

Claude Code skills are available in [claude-skills](https://github.com/LouLouLibs/claude-skills). With them, Claude can view data files, inspect schemas, filter rows, and compare files within a conversation.

## Exit codes

| Tool | 0 | 1 | 2 |
|------|---|---|---|
| dtcat | success | runtime error | invalid arguments |
| dtfilter | success | runtime error | invalid arguments |
| dtdiff | no differences | differences found | error |

## License

MIT