dt-cli-tools

CLI tools for viewing, filtering, and comparing tabular data files

commit c943e393050777ab5305ac65cef3ff1ab7db8a83
parent a0e0a2972a6c6acac7789b432d28cc058f411d22
Author: Erik Loualiche <eloualic@umn.edu>
Date:   Mon, 30 Mar 2026 22:57:56 -0500

docs: add dt-cli-tools implementation plan

17-task plan covering project scaffolding, format detection, readers
(CSV, Parquet, Arrow, JSON, Excel), ported modules (formatter, filter,
diff), three CLI binaries, and integration tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Diffstat:
Adocs/superpowers/plans/2026-03-30-dt-cli-tools.md | 2305+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 2305 insertions(+), 0 deletions(-)

diff --git a/docs/superpowers/plans/2026-03-30-dt-cli-tools.md b/docs/superpowers/plans/2026-03-30-dt-cli-tools.md
@@ -0,0 +1,2305 @@
+# dt-cli-tools Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Build a Rust CLI tool suite (`dtcat`, `dtfilter`, `dtdiff`) for inspecting, querying, and comparing tabular data files across formats (CSV, Parquet, Arrow, JSON, Excel).
+
+**Architecture:** Multi-format reader layer with automatic format detection feeds DataFrames into format-agnostic modules (formatter, filter, diff) ported from xl-cli-tools. Three binaries share the `dtcore` library crate.
+
+**Tech Stack:** Rust 2024 edition, Polars 0.46 (DataFrame engine + CSV/Parquet/Arrow/JSON readers), calamine (Excel), clap (CLI), anyhow (errors), serde_json (JSON output).
+
+**Source reference:** xl-cli-tools at `/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/`
+
+---
+
+## File Structure
+
+```
+dt-cli-tools/
+  Cargo.toml
+  src/
+    lib.rs        # pub mod declarations
+    format.rs     # Format enum, magic-byte + extension detection
+    reader.rs     # ReadOptions, read_file dispatch
+    metadata.rs   # FileInfo, format_file_size (generalized)
+    formatter.rs  # ported from xl-cli-tools (pure DataFrame formatting)
+    filter.rs     # ported from xl-cli-tools (letter-based column resolution removed)
+    diff.rs       # ported from xl-cli-tools (pure DataFrame comparison)
+    readers/
+      mod.rs      # sub-module declarations
+      csv.rs      # CSV/TSV reader via Polars CsvReader
+      parquet.rs  # Parquet reader via Polars ParquetReader
+      arrow.rs    # Arrow IPC reader via Polars IpcReader
+      json.rs     # JSON/NDJSON reader via Polars JsonReader/JsonLineReader
+      excel.rs    # Excel reader via calamine (ported from xl-cli-tools reader.rs)
+  src/bin/
+    dtcat.rs      # view/inspect any tabular file
+    dtfilter.rs   # filter/query any tabular file
+    dtdiff.rs     # compare two tabular files
+  tests/
+    integration/
+      dtcat.rs
+      dtfilter.rs
+      dtdiff.rs
+    demo/         # fixture files for tests
+```
+
+---
+
+### Task 1: Project Scaffolding
+
+**Files:**
+- Create: `Cargo.toml`
+- Create: `src/lib.rs`
+- Create: `src/readers/mod.rs`
+
+- [ ] **Step 1: Create Cargo.toml**
+
+```toml
+[package]
+name = "dt-cli-tools"
+version = "0.1.0"
+edition = "2024"
+description = "CLI tools for viewing, filtering, and comparing tabular data files"
+license = "MIT"
+
+[lib]
+name = "dtcore"
+path = "src/lib.rs"
+
+[[bin]]
+name = "dtcat"
+path = "src/bin/dtcat.rs"
+
+[[bin]]
+name = "dtfilter"
+path = "src/bin/dtfilter.rs"
+
+[[bin]]
+name = "dtdiff"
+path = "src/bin/dtdiff.rs"
+
+[dependencies]
+polars = { version = "0.46", default-features = false, features = [
+    "dtype-datetime",
+    "csv",
+    "parquet",
+    "ipc",
+    "json",
+] }
+calamine = "0.26"
+clap = { version = "4", features = ["derive"] }
+anyhow = "1"
+serde_json = { version = "1", features = ["preserve_order"] }
+
+[profile.release]
+strip = true
+lto = true
+codegen-units = 1
+panic = "abort"
+opt-level = "z"
+
+[dev-dependencies]
+assert_cmd = "2"
+predicates = "3"
+tempfile = "3"
+```
+
+- [ ] **Step 2: Create src/lib.rs with module declarations**
+
+```rust
+pub mod diff;
+pub mod filter;
+pub mod format;
+pub mod formatter;
+pub mod metadata;
+pub mod reader;
+pub mod readers;
+```
+
+- [ ] **Step 3: Create src/readers/mod.rs**
+
+```rust
+pub mod arrow;
+pub mod csv;
+pub mod excel;
+pub mod json;
+pub mod parquet;
+```
+
+- [ ] **Step 4: Create placeholder files so the project compiles**
+
+Create minimal empty-module stubs for every file declared in lib.rs and readers/mod.rs. Each stub is just an empty file or contains only `use anyhow::Result;` as needed. Also create empty `src/bin/dtcat.rs`, `src/bin/dtfilter.rs`, `src/bin/dtdiff.rs` with `fn main() {}`.
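Step 4's stub creation can be scripted. A minimal sketch (an assumption, not part of the plan; it must run from the repo root, and the file names simply mirror the lib.rs and readers/mod.rs declarations above):

```shell
# Create empty module stubs so `cargo check` passes before the real code lands.
mkdir -p src/readers src/bin
for m in format reader metadata formatter filter diff; do
    : > "src/$m.rs"                              # empty file satisfies `pub mod $m;`
done
for m in csv parquet arrow json excel; do
    : > "src/readers/$m.rs"
done
for b in dtcat dtfilter dtdiff; do
    printf 'fn main() {}\n' > "src/bin/$b.rs"    # minimal binary entry point
done
```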
+ +- [ ] **Step 5: Verify the project compiles** + +Run: `cargo check 2>&1` +Expected: compiles with no errors (warnings OK at this stage) + +- [ ] **Step 6: Commit** + +```bash +git add Cargo.toml src/ +git commit -m "feat: scaffold dt-cli-tools project structure" +``` + +--- + +### Task 2: Format Detection (`format.rs`) + +**Files:** +- Create: `src/format.rs` + +- [ ] **Step 1: Write tests for format detection** + +```rust +// src/format.rs + +use anyhow::{Result, bail}; +use std::path::Path; +use std::io::Read; + +#[derive(Debug, Clone, Copy, PartialEq)] +pub enum Format { + Csv, + Tsv, + Parquet, + Arrow, + Json, + Ndjson, + Excel, +} + +impl Format { + /// Returns true if this format and `other` belong to the same family + /// (e.g. Csv and Tsv are both delimited text). + pub fn same_family(&self, other: &Format) -> bool { + matches!( + (self, other), + (Format::Csv, Format::Tsv) + | (Format::Tsv, Format::Csv) + | (Format::Json, Format::Ndjson) + | (Format::Ndjson, Format::Json) + ) || self == other + } +} + +// Placeholder public functions — will implement in Step 3 +pub fn detect_format(path: &Path, override_fmt: Option<&str>) -> Result<Format> { + todo!() +} + +pub fn parse_format_str(s: &str) -> Result<Format> { + todo!() +} + +fn detect_by_magic(path: &Path) -> Result<Option<Format>> { + todo!() +} + +fn detect_by_extension(path: &Path) -> Result<Format> { + todo!() +} + +/// Auto-detect CSV delimiter by sampling the first few lines. +/// Returns b',' (comma), b'\t' (tab), or b';' (semicolon). 
+pub fn detect_csv_delimiter(path: &Path) -> Result<u8> { + todo!() +} + +#[cfg(test)] +mod tests { + use super::*; + use std::io::Write; + use tempfile::NamedTempFile; + + // -- parse_format_str -- + + #[test] + fn parse_csv() { + assert_eq!(parse_format_str("csv").unwrap(), Format::Csv); + } + + #[test] + fn parse_tsv() { + assert_eq!(parse_format_str("tsv").unwrap(), Format::Tsv); + } + + #[test] + fn parse_parquet() { + assert_eq!(parse_format_str("parquet").unwrap(), Format::Parquet); + } + + #[test] + fn parse_arrow() { + assert_eq!(parse_format_str("arrow").unwrap(), Format::Arrow); + } + + #[test] + fn parse_json() { + assert_eq!(parse_format_str("json").unwrap(), Format::Json); + } + + #[test] + fn parse_ndjson() { + assert_eq!(parse_format_str("ndjson").unwrap(), Format::Ndjson); + } + + #[test] + fn parse_excel() { + assert_eq!(parse_format_str("excel").unwrap(), Format::Excel); + assert_eq!(parse_format_str("xlsx").unwrap(), Format::Excel); + } + + #[test] + fn parse_unknown_is_err() { + assert!(parse_format_str("banana").is_err()); + } + + #[test] + fn parse_case_insensitive() { + assert_eq!(parse_format_str("CSV").unwrap(), Format::Csv); + assert_eq!(parse_format_str("Parquet").unwrap(), Format::Parquet); + } + + // -- detect_by_extension -- + + #[test] + fn ext_csv() { + assert_eq!(detect_by_extension(Path::new("data.csv")).unwrap(), Format::Csv); + } + + #[test] + fn ext_tsv() { + assert_eq!(detect_by_extension(Path::new("data.tsv")).unwrap(), Format::Tsv); + assert_eq!(detect_by_extension(Path::new("data.tab")).unwrap(), Format::Tsv); + } + + #[test] + fn ext_parquet() { + assert_eq!(detect_by_extension(Path::new("data.parquet")).unwrap(), Format::Parquet); + assert_eq!(detect_by_extension(Path::new("data.pq")).unwrap(), Format::Parquet); + } + + #[test] + fn ext_arrow() { + assert_eq!(detect_by_extension(Path::new("data.arrow")).unwrap(), Format::Arrow); + assert_eq!(detect_by_extension(Path::new("data.feather")).unwrap(), Format::Arrow); + 
assert_eq!(detect_by_extension(Path::new("data.ipc")).unwrap(), Format::Arrow); + } + + #[test] + fn ext_json() { + assert_eq!(detect_by_extension(Path::new("data.json")).unwrap(), Format::Json); + } + + #[test] + fn ext_ndjson() { + assert_eq!(detect_by_extension(Path::new("data.ndjson")).unwrap(), Format::Ndjson); + assert_eq!(detect_by_extension(Path::new("data.jsonl")).unwrap(), Format::Ndjson); + } + + #[test] + fn ext_excel() { + assert_eq!(detect_by_extension(Path::new("data.xlsx")).unwrap(), Format::Excel); + assert_eq!(detect_by_extension(Path::new("data.xls")).unwrap(), Format::Excel); + assert_eq!(detect_by_extension(Path::new("data.xlsb")).unwrap(), Format::Excel); + assert_eq!(detect_by_extension(Path::new("data.ods")).unwrap(), Format::Excel); + } + + #[test] + fn ext_unknown_is_err() { + assert!(detect_by_extension(Path::new("data.txt")).is_err()); + assert!(detect_by_extension(Path::new("data")).is_err()); + } + + // -- detect_by_magic -- + + #[test] + fn magic_parquet() { + let mut f = NamedTempFile::with_suffix(".bin").unwrap(); + f.write_all(b"PAR1some_data").unwrap(); + f.flush().unwrap(); + assert_eq!(detect_by_magic(f.path()).unwrap(), Some(Format::Parquet)); + } + + #[test] + fn magic_arrow() { + let mut f = NamedTempFile::with_suffix(".bin").unwrap(); + f.write_all(b"ARROW1some_data").unwrap(); + f.flush().unwrap(); + assert_eq!(detect_by_magic(f.path()).unwrap(), Some(Format::Arrow)); + } + + #[test] + fn magic_xlsx_zip() { + let mut f = NamedTempFile::with_suffix(".bin").unwrap(); + f.write_all(&[0x50, 0x4B, 0x03, 0x04, 0x00]).unwrap(); + f.flush().unwrap(); + assert_eq!(detect_by_magic(f.path()).unwrap(), Some(Format::Excel)); + } + + #[test] + fn magic_xls_ole() { + let mut f = NamedTempFile::with_suffix(".bin").unwrap(); + f.write_all(&[0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1]).unwrap(); + f.flush().unwrap(); + assert_eq!(detect_by_magic(f.path()).unwrap(), Some(Format::Excel)); + } + + #[test] + fn magic_json_array() { + let mut 
f = NamedTempFile::with_suffix(".bin").unwrap(); + f.write_all(b"[{\"a\":1}]").unwrap(); + f.flush().unwrap(); + assert_eq!(detect_by_magic(f.path()).unwrap(), Some(Format::Json)); + } + + #[test] + fn magic_json_object() { + let mut f = NamedTempFile::with_suffix(".bin").unwrap(); + f.write_all(b"{\"a\":1}\n{\"a\":2}").unwrap(); + f.flush().unwrap(); + // Leading { suggests NDJSON + assert_eq!(detect_by_magic(f.path()).unwrap(), Some(Format::Ndjson)); + } + + #[test] + fn magic_csv_fallback_none() { + // Plain text with commas — magic returns None, falls back to extension + let mut f = NamedTempFile::with_suffix(".bin").unwrap(); + f.write_all(b"a,b,c\n1,2,3\n").unwrap(); + f.flush().unwrap(); + assert_eq!(detect_by_magic(f.path()).unwrap(), None); + } + + // -- detect_format (integration) -- + + #[test] + fn override_wins() { + // Even with .csv extension, override to parquet + assert_eq!( + detect_format(Path::new("data.csv"), Some("parquet")).unwrap(), + Format::Parquet + ); + } + + // -- same_family -- + + #[test] + fn csv_tsv_same_family() { + assert!(Format::Csv.same_family(&Format::Tsv)); + assert!(Format::Tsv.same_family(&Format::Csv)); + } + + #[test] + fn json_ndjson_same_family() { + assert!(Format::Json.same_family(&Format::Ndjson)); + } + + #[test] + fn csv_parquet_different_family() { + assert!(!Format::Csv.same_family(&Format::Parquet)); + } + + // -- detect_csv_delimiter -- + + #[test] + fn delimiter_comma() { + let mut f = NamedTempFile::with_suffix(".csv").unwrap(); + f.write_all(b"a,b,c\n1,2,3\n4,5,6\n").unwrap(); + f.flush().unwrap(); + assert_eq!(detect_csv_delimiter(f.path()).unwrap(), b','); + } + + #[test] + fn delimiter_tab() { + let mut f = NamedTempFile::with_suffix(".tsv").unwrap(); + f.write_all(b"a\tb\tc\n1\t2\t3\n").unwrap(); + f.flush().unwrap(); + assert_eq!(detect_csv_delimiter(f.path()).unwrap(), b'\t'); + } + + #[test] + fn delimiter_semicolon() { + let mut f = NamedTempFile::with_suffix(".csv").unwrap(); + 
f.write_all(b"a;b;c\n1;2;3\n").unwrap(); + f.flush().unwrap(); + assert_eq!(detect_csv_delimiter(f.path()).unwrap(), b';'); + } +} +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: `cargo test --lib format:: 2>&1 | tail -5` +Expected: all tests FAIL (todo! panics) + +- [ ] **Step 3: Implement format detection** + +Replace the `todo!()` bodies with real implementations: + +```rust +pub fn parse_format_str(s: &str) -> Result<Format> { + match s.to_lowercase().as_str() { + "csv" => Ok(Format::Csv), + "tsv" | "tab" => Ok(Format::Tsv), + "parquet" | "pq" => Ok(Format::Parquet), + "arrow" | "feather" | "ipc" => Ok(Format::Arrow), + "json" => Ok(Format::Json), + "ndjson" | "jsonl" => Ok(Format::Ndjson), + "excel" | "xlsx" | "xls" | "xlsb" | "ods" => Ok(Format::Excel), + _ => bail!("unknown format '{}'. Supported: csv, tsv, parquet, arrow, json, ndjson, excel", s), + } +} + +fn detect_by_extension(path: &Path) -> Result<Format> { + let ext = path + .extension() + .and_then(|e| e.to_str()) + .map(|e| e.to_lowercase()); + + match ext.as_deref() { + Some("csv") => Ok(Format::Csv), + Some("tsv") | Some("tab") => Ok(Format::Tsv), + Some("parquet") | Some("pq") => Ok(Format::Parquet), + Some("arrow") | Some("feather") | Some("ipc") => Ok(Format::Arrow), + Some("json") => Ok(Format::Json), + Some("ndjson") | Some("jsonl") => Ok(Format::Ndjson), + Some("xlsx") | Some("xls") | Some("xlsb") | Some("ods") => Ok(Format::Excel), + Some(other) => bail!("unrecognized extension '.{}'. Use --format to specify.", other), + None => bail!("no file extension. 
Use --format to specify the format."), + } +} + +fn detect_by_magic(path: &Path) -> Result<Option<Format>> { + let mut file = std::fs::File::open(path)?; + let mut buf = [0u8; 8]; + let n = file.read(&mut buf)?; + if n < 2 { + return Ok(None); + } + + // Parquet: "PAR1" + if n >= 4 && &buf[..4] == b"PAR1" { + return Ok(Some(Format::Parquet)); + } + // Arrow IPC: "ARROW1" + if n >= 6 && &buf[..6] == b"ARROW1" { + return Ok(Some(Format::Arrow)); + } + // ZIP (xlsx, ods): PK\x03\x04 + if buf[0] == 0x50 && buf[1] == 0x4B { + return Ok(Some(Format::Excel)); + } + // OLE2 (xls): D0 CF 11 E0 + if n >= 4 && buf[0] == 0xD0 && buf[1] == 0xCF && buf[2] == 0x11 && buf[3] == 0xE0 { + return Ok(Some(Format::Excel)); + } + // JSON array: starts with [ + // Need to skip leading whitespace + let first_non_ws = buf[..n].iter().find(|b| !b.is_ascii_whitespace()); + if let Some(b'[') = first_non_ws { + return Ok(Some(Format::Json)); + } + if let Some(b'{') = first_non_ws { + return Ok(Some(Format::Ndjson)); + } + + // CSV/TSV: no distinctive magic bytes — return None to fall through to extension + Ok(None) +} + +pub fn detect_format(path: &Path, override_fmt: Option<&str>) -> Result<Format> { + if let Some(fmt) = override_fmt { + return parse_format_str(fmt); + } + if let Some(fmt) = detect_by_magic(path)? 
{ + return Ok(fmt); + } + detect_by_extension(path) +} + +pub fn detect_csv_delimiter(path: &Path) -> Result<u8> { + let mut file = std::fs::File::open(path)?; + let mut buf = String::new(); + // Read up to 8KB for sampling + file.take(8192).read_to_string(&mut buf)?; + + let lines: Vec<&str> = buf.lines().take(10).collect(); + if lines.is_empty() { + return Ok(b','); + } + + let delimiters = [b',', b'\t', b';']; + let mut best = b','; + let mut best_score = 0usize; + + for &d in &delimiters { + let counts: Vec<usize> = lines + .iter() + .map(|line| line.as_bytes().iter().filter(|&&b| b == d).count()) + .collect(); + // Score: minimum count across lines (consistency matters) + let min_count = *counts.iter().min().unwrap_or(&0); + if min_count > best_score { + best_score = min_count; + best = d; + } + } + + Ok(best) +} +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: `cargo test --lib format:: 2>&1` +Expected: all tests PASS + +- [ ] **Step 5: Commit** + +```bash +git add src/format.rs +git commit -m "feat: add format detection with magic bytes and extension matching" +``` + +--- + +### Task 3: Metadata Module (`metadata.rs`) + +**Files:** +- Create: `src/metadata.rs` + +- [ ] **Step 1: Write metadata module with tests** + +Port `format_file_size` from xl-cli-tools (`/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/metadata.rs`). Generalize `FileInfo` to include the detected format and work for non-Excel files. + +```rust +// src/metadata.rs + +use crate::format::Format; + +/// Info about a single sheet (Excel) or the entire file (other formats). +#[derive(Debug, Clone)] +pub struct SheetInfo { + pub name: String, + pub rows: usize, // total rows including header + pub cols: usize, +} + +/// Info about the file. +#[derive(Debug)] +pub struct FileInfo { + pub file_size: u64, + pub format: Format, + pub sheets: Vec<SheetInfo>, +} + +/// Format file size for display: "245 KB", "1.2 MB", etc. 
+pub fn format_file_size(bytes: u64) -> String { + if bytes < 1_024 { + format!("{bytes} B") + } else if bytes < 1_048_576 { + format!("{:.0} KB", bytes as f64 / 1_024.0) + } else if bytes < 1_073_741_824 { + format!("{:.1} MB", bytes as f64 / 1_048_576.0) + } else { + format!("{:.1} GB", bytes as f64 / 1_073_741_824.0) + } +} + +/// Format name for a Format variant. +pub fn format_name(fmt: Format) -> &'static str { + match fmt { + Format::Csv => "CSV", + Format::Tsv => "TSV", + Format::Parquet => "Parquet", + Format::Arrow => "Arrow IPC", + Format::Json => "JSON", + Format::Ndjson => "NDJSON", + Format::Excel => "Excel", + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_format_file_size() { + assert_eq!(format_file_size(500), "500 B"); + assert_eq!(format_file_size(2_048), "2 KB"); + assert_eq!(format_file_size(1_500_000), "1.4 MB"); + } + + #[test] + fn test_format_name() { + assert_eq!(format_name(Format::Csv), "CSV"); + assert_eq!(format_name(Format::Parquet), "Parquet"); + assert_eq!(format_name(Format::Excel), "Excel"); + } +} +``` + +- [ ] **Step 2: Run tests** + +Run: `cargo test --lib metadata:: 2>&1` +Expected: PASS + +- [ ] **Step 3: Commit** + +```bash +git add src/metadata.rs +git commit -m "feat: add metadata module with FileInfo and format_file_size" +``` + +--- + +### Task 4: Formatter Module (`formatter.rs`) + +**Files:** +- Create: `src/formatter.rs` + +- [ ] **Step 1: Port formatter.rs from xl-cli-tools** + +Copy `/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/formatter.rs` and update imports: +- Change `use crate::metadata::{format_file_size, FileInfo, SheetInfo};` to `use crate::metadata::{format_file_size, FileInfo, SheetInfo, format_name};` +- Update `format_header` to include the format name: `# File: report.csv (245 KB) [CSV]` +- The rest of the module (format_schema, format_data_table, format_head_tail, format_csv, format_describe, all helper functions, and all tests) transfers verbatim. 
+ +Key change to `format_header`: +```rust +pub fn format_header(file_name: &str, info: &FileInfo) -> String { + let size_str = format_file_size(info.file_size); + let fmt_name = format_name(info.format); + let sheet_count = info.sheets.len(); + if sheet_count > 1 { + format!("# File: {file_name} ({size_str}) [{fmt_name}]\n# Sheets: {sheet_count}\n") + } else { + format!("# File: {file_name} ({size_str}) [{fmt_name}]\n") + } +} +``` + +Update the `format_header` test to match the new output: +```rust +#[test] +fn test_format_header() { + let info = FileInfo { + file_size: 250_000, + format: Format::Excel, + sheets: vec![ + SheetInfo { name: "Sheet1".into(), rows: 100, cols: 5 }, + SheetInfo { name: "Sheet2".into(), rows: 50, cols: 3 }, + ], + }; + let out = format_header("test.xlsx", &info); + assert!(out.contains("# File: test.xlsx (244 KB) [Excel]")); + assert!(out.contains("# Sheets: 2")); +} + +#[test] +fn test_format_header_single_sheet() { + let info = FileInfo { + file_size: 1_000, + format: Format::Csv, + sheets: vec![SheetInfo { name: "data".into(), rows: 10, cols: 3 }], + }; + let out = format_header("data.csv", &info); + assert!(out.contains("[CSV]")); + assert!(!out.contains("Sheets")); +} +``` + +All other tests (format_data_table, format_head_tail, format_schema, format_csv, format_describe, etc.) transfer verbatim from xl-cli-tools. They test pure DataFrame formatting and don't reference Excel-specific types. + +- [ ] **Step 2: Run tests** + +Run: `cargo test --lib formatter:: 2>&1` +Expected: all tests PASS + +- [ ] **Step 3: Commit** + +```bash +git add src/formatter.rs +git commit -m "feat: port formatter module from xl-cli-tools with format-name support" +``` + +--- + +### Task 5: Filter Module (`filter.rs`) + +**Files:** +- Create: `src/filter.rs` + +- [ ] **Step 1: Port filter.rs from xl-cli-tools, removing letter-based column resolution** + +Copy `/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/filter.rs` and make these changes: + +1. 
**Remove** `col_letter_to_index` function entirely. +2. **Simplify** `resolve_column` to only do name matching (exact, then case-insensitive). Remove the letter-based fallback step: + +```rust +/// Resolve a column specifier to a DataFrame column name. +/// Accepts a header name (exact match first, then case-insensitive). +pub fn resolve_column(spec: &str, df_columns: &[String]) -> Result<String, String> { + // 1. Exact header name match + if df_columns.contains(&spec.to_string()) { + return Ok(spec.to_string()); + } + // 2. Case-insensitive header name match + let spec_lower = spec.to_lowercase(); + for col in df_columns { + if col.to_lowercase() == spec_lower { + return Ok(col.clone()); + } + } + let available = df_columns.join(", "); + Err(format!("column '{}' not found. Available columns: {}", spec, available)) +} +``` + +3. **Remove** the letter-based tests: `resolve_by_letter`, `resolve_by_letter_lowercase`, `resolve_header_takes_priority_over_letter`, `resolve_letter_out_of_range_is_err`, `pipeline_cols_by_letter`. +4. Keep everything else: `parse_filter_expr`, `parse_sort_spec`, `build_filter_mask`, `apply_filters`, `filter_pipeline`, `FilterOptions`, `SortSpec`, `FilterExpr`, `FilterOp`, `apply_sort`, and all their tests. 
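The behavioral change above can be sanity-checked in isolation. A standalone sketch of the simplified resolver (mirrors the `resolve_column` listing above; no Polars needed, and the column names here are illustrative only):

```rust
/// Simplified column resolver from Task 5: exact header match first,
/// then case-insensitive; no Excel-style letter fallback.
fn resolve_column(spec: &str, df_columns: &[String]) -> Result<String, String> {
    if df_columns.iter().any(|c| c == spec) {
        return Ok(spec.to_string());
    }
    let spec_lower = spec.to_lowercase();
    for col in df_columns {
        if col.to_lowercase() == spec_lower {
            return Ok(col.clone());
        }
    }
    Err(format!(
        "column '{}' not found. Available columns: {}",
        spec,
        df_columns.join(", ")
    ))
}

fn main() {
    let cols: Vec<String> = vec!["Name".into(), "Total Value".into()];
    // Exact match wins.
    assert_eq!(resolve_column("Name", &cols).unwrap(), "Name");
    // Case-insensitive fallback returns the canonical header spelling.
    assert_eq!(resolve_column("name", &cols).unwrap(), "Name");
    // Letter specs like "B" no longer resolve; they error with the column list.
    assert!(resolve_column("B", &cols).is_err());
}
```

The error message lists the available columns, so a failed lookup in `dtfilter` tells the user exactly what names are valid.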
+ +- [ ] **Step 2: Run tests** + +Run: `cargo test --lib filter:: 2>&1` +Expected: all tests PASS + +- [ ] **Step 3: Commit** + +```bash +git add src/filter.rs +git commit -m "feat: port filter module from xl-cli-tools without letter-based column resolution" +``` + +--- + +### Task 6: Diff Module (`diff.rs`) + +**Files:** +- Create: `src/diff.rs` + +- [ ] **Step 1: Port diff.rs verbatim from xl-cli-tools** + +Copy `/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/diff.rs` and update the import path: +- Change `use crate::formatter;` to `use crate::formatter;` (same - no change needed) + +The entire module (SheetSource, DiffRow, CellChange, ModifiedRow, DiffResult, DiffOptions, diff_positional, diff_keyed, diff_sheets, and all tests) transfers verbatim. No changes to logic. + +- [ ] **Step 2: Run tests** + +Run: `cargo test --lib diff:: 2>&1` +Expected: all tests PASS + +- [ ] **Step 3: Commit** + +```bash +git add src/diff.rs +git commit -m "feat: port diff module from xl-cli-tools" +``` + +--- + +### Task 7: CSV Reader (`readers/csv.rs`) + +**Files:** +- Create: `src/readers/csv.rs` + +- [ ] **Step 1: Write CSV reader with tests** + +```rust +// src/readers/csv.rs + +use anyhow::Result; +use polars::prelude::*; +use std::path::Path; + +use crate::reader::ReadOptions; + +pub fn read(path: &Path, opts: &ReadOptions) -> Result<DataFrame> { + let separator = opts.separator.unwrap_or_else(|| { + crate::format::detect_csv_delimiter(path).unwrap_or(b',') + }); + + let mut reader = CsvReadOptions::default() + .with_has_header(true) + .with_skip_rows(opts.skip_rows.unwrap_or(0)) + .with_parse_options( + CsvParseOptions::default().with_separator(separator), + ) + .try_into_reader_with_file_path(Some(path.into()))?; + + let df = reader.finish()?; + Ok(df) +} + +#[cfg(test)] +mod tests { + use super::*; + use std::io::Write; + use tempfile::NamedTempFile; + + fn default_opts() -> ReadOptions { + ReadOptions::default() + } + + #[test] + fn read_basic_csv() { + let mut f = 
NamedTempFile::with_suffix(".csv").unwrap(); + write!(f, "name,value\nAlice,100\nBob,200\n").unwrap(); + f.flush().unwrap(); + + let df = read(f.path(), &default_opts()).unwrap(); + assert_eq!(df.height(), 2); + assert_eq!(df.width(), 2); + let names: Vec<String> = df.get_column_names().iter().map(|s| s.to_string()).collect(); + assert_eq!(names, vec!["name", "value"]); + } + + #[test] + fn read_tsv() { + let mut f = NamedTempFile::with_suffix(".tsv").unwrap(); + write!(f, "a\tb\n1\t2\n3\t4\n").unwrap(); + f.flush().unwrap(); + + let opts = ReadOptions { separator: Some(b'\t'), ..Default::default() }; + let df = read(f.path(), &opts).unwrap(); + assert_eq!(df.height(), 2); + assert_eq!(df.width(), 2); + } + + #[test] + fn read_with_skip() { + let mut f = NamedTempFile::with_suffix(".csv").unwrap(); + write!(f, "metadata line\nname,value\nAlice,100\n").unwrap(); + f.flush().unwrap(); + + let opts = ReadOptions { skip_rows: Some(1), ..Default::default() }; + let df = read(f.path(), &opts).unwrap(); + assert_eq!(df.height(), 1); + let names: Vec<String> = df.get_column_names().iter().map(|s| s.to_string()).collect(); + assert_eq!(names, vec!["name", "value"]); + } +} +``` + +Note: This requires `ReadOptions` from `reader.rs`. Define it first (in the next step, or define a minimal version now). + +- [ ] **Step 2: Define ReadOptions in reader.rs** + +```rust +// src/reader.rs + +/// Options that control how a file is read. 
+#[derive(Debug, Clone, Default)] +pub struct ReadOptions { + pub sheet: Option<String>, // Excel only + pub skip_rows: Option<usize>, + pub separator: Option<u8>, // CSV override +} +``` + +- [ ] **Step 3: Run tests** + +Run: `cargo test --lib readers::csv:: 2>&1` +Expected: all tests PASS + +- [ ] **Step 4: Commit** + +```bash +git add src/readers/csv.rs src/reader.rs +git commit -m "feat: add CSV/TSV reader with delimiter auto-detection" +``` + +--- + +### Task 8: Parquet Reader (`readers/parquet.rs`) + +**Files:** +- Create: `src/readers/parquet.rs` + +- [ ] **Step 1: Write Parquet reader with tests** + +```rust +// src/readers/parquet.rs + +use anyhow::Result; +use polars::prelude::*; +use std::path::Path; + +use crate::reader::ReadOptions; + +pub fn read(path: &Path, opts: &ReadOptions) -> Result<DataFrame> { + let file = std::fs::File::open(path)?; + let mut df = ParquetReader::new(file).finish()?; + + if let Some(skip) = opts.skip_rows { + if skip > 0 && skip < df.height() { + df = df.slice(skip as i64, df.height() - skip); + } + } + + Ok(df) +} + +#[cfg(test)] +mod tests { + use super::*; + use tempfile::NamedTempFile; + + fn default_opts() -> ReadOptions { + ReadOptions::default() + } + + #[test] + fn read_parquet_roundtrip() { + // Create a parquet file using Polars writer + let s1 = Series::new("name".into(), &["Alice", "Bob"]); + let s2 = Series::new("value".into(), &[100i64, 200]); + let mut df = DataFrame::new(vec![s1.into_column(), s2.into_column()]).unwrap(); + + let f = NamedTempFile::with_suffix(".parquet").unwrap(); + let file = std::fs::File::create(f.path()).unwrap(); + ParquetWriter::new(file).finish(&mut df).unwrap(); + + let result = read(f.path(), &default_opts()).unwrap(); + assert_eq!(result.height(), 2); + assert_eq!(result.width(), 2); + } +} +``` + +- [ ] **Step 2: Run tests** + +Run: `cargo test --lib readers::parquet:: 2>&1` +Expected: PASS + +- [ ] **Step 3: Commit** + +```bash +git add src/readers/parquet.rs +git commit -m "feat: 
add Parquet reader" +``` + +--- + +### Task 9: Arrow IPC Reader (`readers/arrow.rs`) + +**Files:** +- Create: `src/readers/arrow.rs` + +- [ ] **Step 1: Write Arrow IPC reader with tests** + +```rust +// src/readers/arrow.rs + +use anyhow::Result; +use polars::prelude::*; +use std::path::Path; + +use crate::reader::ReadOptions; + +pub fn read(path: &Path, opts: &ReadOptions) -> Result<DataFrame> { + let file = std::fs::File::open(path)?; + let mut df = IpcReader::new(file).finish()?; + + if let Some(skip) = opts.skip_rows { + if skip > 0 && skip < df.height() { + df = df.slice(skip as i64, df.height() - skip); + } + } + + Ok(df) +} + +#[cfg(test)] +mod tests { + use super::*; + use tempfile::NamedTempFile; + + fn default_opts() -> ReadOptions { + ReadOptions::default() + } + + #[test] + fn read_arrow_roundtrip() { + let s1 = Series::new("x".into(), &[1i64, 2, 3]); + let mut df = DataFrame::new(vec![s1.into_column()]).unwrap(); + + let f = NamedTempFile::with_suffix(".arrow").unwrap(); + let file = std::fs::File::create(f.path()).unwrap(); + IpcWriter::new(file).finish(&mut df).unwrap(); + + let result = read(f.path(), &default_opts()).unwrap(); + assert_eq!(result.height(), 3); + assert_eq!(result.width(), 1); + } +} +``` + +- [ ] **Step 2: Run tests** + +Run: `cargo test --lib readers::arrow:: 2>&1` +Expected: PASS + +- [ ] **Step 3: Commit** + +```bash +git add src/readers/arrow.rs +git commit -m "feat: add Arrow IPC reader" +``` + +--- + +### Task 10: JSON/NDJSON Reader (`readers/json.rs`) + +**Files:** +- Create: `src/readers/json.rs` + +- [ ] **Step 1: Write JSON reader with tests** + +```rust +// src/readers/json.rs + +use anyhow::Result; +use polars::prelude::*; +use std::path::Path; + +use crate::format::Format; +use crate::reader::ReadOptions; + +pub fn read(path: &Path, format: Format, opts: &ReadOptions) -> Result<DataFrame> { + let file = std::fs::File::open(path)?; + + let mut df = match format { + Format::Ndjson => { + 
JsonLineReader::new(file).finish()? + } + _ => { + // JSON array format + JsonReader::new(file).finish()? + } + }; + + if let Some(skip) = opts.skip_rows { + if skip > 0 && skip < df.height() { + df = df.slice(skip as i64, df.height() - skip); + } + } + + Ok(df) +} + +#[cfg(test)] +mod tests { + use super::*; + use std::io::Write; + use tempfile::NamedTempFile; + + fn default_opts() -> ReadOptions { + ReadOptions::default() + } + + #[test] + fn read_json_array() { + let mut f = NamedTempFile::with_suffix(".json").unwrap(); + write!(f, r#"[{{"name":"Alice","value":1}},{{"name":"Bob","value":2}}]"#).unwrap(); + f.flush().unwrap(); + + let df = read(f.path(), Format::Json, &default_opts()).unwrap(); + assert_eq!(df.height(), 2); + } + + #[test] + fn read_ndjson() { + let mut f = NamedTempFile::with_suffix(".ndjson").unwrap(); + write!(f, "{}\n{}\n", + r#"{{"name":"Alice","value":1}}"#, + r#"{{"name":"Bob","value":2}}"#, + ).unwrap(); + f.flush().unwrap(); + + let df = read(f.path(), Format::Ndjson, &default_opts()).unwrap(); + assert_eq!(df.height(), 2); + } +} +``` + +Note: Polars JSON reader API may vary. If `JsonReader` is not directly available, use `JsonFormat::Json` with the appropriate reader. The implementer should check the exact Polars 0.46 API and adapt. Alternative approach if `JsonReader` doesn't exist: + +```rust +// Alternative using LazyFrame +let lf = LazyJsonLineReader::new(path).finish()?; +let df = lf.collect()?; +``` + +- [ ] **Step 2: Run tests** + +Run: `cargo test --lib readers::json:: 2>&1` +Expected: PASS (adapt if Polars API differs) + +- [ ] **Step 3: Commit** + +```bash +git add src/readers/json.rs +git commit -m "feat: add JSON/NDJSON reader" +``` + +--- + +### Task 11: Excel Reader (`readers/excel.rs`) + +**Files:** +- Create: `src/readers/excel.rs` + +- [ ] **Step 1: Port Excel reader from xl-cli-tools** + +Copy `/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/reader.rs` to `src/readers/excel.rs` and adapt: + +1. 
Change the public API from `read_sheet(path, sheet_name)` / `read_sheet_with_skip(path, sheet_name, skip)` to a single function matching the reader pattern: + +```rust +pub fn read(path: &Path, opts: &ReadOptions) -> Result<DataFrame> +``` + +This function: +- Resolves the sheet name from `opts.sheet` (defaults to the first sheet). +- Applies `opts.skip_rows`. +- Reuses `range_to_dataframe_skip`, `infer_column_type`, `build_series` verbatim from xl-cli-tools. + +2. Also provide a helper for reading Excel metadata (sheet names, dimensions): + +```rust +pub fn read_excel_info(path: &Path) -> Result<Vec<SheetInfo>> +``` + +This reuses the calamine-based metadata reading from xl-cli-tools `metadata.rs:read_file_info`, but returns just the sheet list. + +3. Port all internal functions (`infer_column_type`, `build_series`, `range_to_dataframe_skip`) and unit tests verbatim. + +- [ ] **Step 2: Run tests** + +Run: `cargo test --lib readers::excel:: 2>&1` +Expected: PASS + +- [ ] **Step 3: Commit** + +```bash +git add src/readers/excel.rs +git commit -m "feat: port Excel reader from xl-cli-tools" +``` + +--- + +### Task 12: Reader Dispatch (`reader.rs`) + +**Files:** +- Modify: `src/reader.rs` (already has ReadOptions from Task 7) + +- [ ] **Step 1: Add read_file dispatch function** + +```rust +// Add to src/reader.rs + +use anyhow::{Result, bail}; +use polars::prelude::*; +use std::path::Path; + +use crate::format::Format; +use crate::metadata::{FileInfo, SheetInfo}; +use crate::readers; + +/// Options that control how a file is read. +#[derive(Debug, Clone, Default)] +pub struct ReadOptions { + pub sheet: Option<String>, // Excel only + pub skip_rows: Option<usize>, + pub separator: Option<u8>, // CSV override +} + +/// Read a file into a DataFrame, dispatching to the appropriate reader. 
+pub fn read_file(path: &Path, format: Format, opts: &ReadOptions) -> Result<DataFrame> { + match format { + Format::Csv | Format::Tsv => readers::csv::read(path, opts), + Format::Parquet => readers::parquet::read(path, opts), + Format::Arrow => readers::arrow::read(path, opts), + Format::Json | Format::Ndjson => readers::json::read(path, format, opts), + Format::Excel => readers::excel::read(path, opts), + } +} + +/// Read file metadata: size, format, and sheet info (for Excel). +pub fn read_file_info(path: &Path, format: Format) -> Result<FileInfo> { + let file_size = std::fs::metadata(path)?.len(); + + let sheets = match format { + Format::Excel => readers::excel::read_excel_info(path)?, + _ => vec![], // Non-Excel formats have no sheet concept + }; + + Ok(FileInfo { + file_size, + format, + sheets, + }) +} +``` + +- [ ] **Step 2: Write integration test for dispatch** + +```rust +#[cfg(test)] +mod tests { + use super::*; + use std::io::Write; + use tempfile::NamedTempFile; + + #[test] + fn dispatch_csv() { + let mut f = NamedTempFile::with_suffix(".csv").unwrap(); + write!(f, "a,b\n1,2\n").unwrap(); + f.flush().unwrap(); + + let df = read_file(f.path(), Format::Csv, &ReadOptions::default()).unwrap(); + assert_eq!(df.height(), 1); + } + + #[test] + fn dispatch_parquet() { + use polars::prelude::*; + let s = Series::new("x".into(), &[1i64, 2]); + let mut df = DataFrame::new(vec![s.into_column()]).unwrap(); + + let f = NamedTempFile::with_suffix(".parquet").unwrap(); + let file = std::fs::File::create(f.path()).unwrap(); + ParquetWriter::new(file).finish(&mut df).unwrap(); + + let result = read_file(f.path(), Format::Parquet, &ReadOptions::default()).unwrap(); + assert_eq!(result.height(), 2); + } +} +``` + +- [ ] **Step 3: Run tests** + +Run: `cargo test --lib reader:: 2>&1` +Expected: PASS + +- [ ] **Step 4: Commit** + +```bash +git add src/reader.rs +git commit -m "feat: add reader dispatch with read_file and read_file_info" +``` + +--- + +### Task 13: dtcat 
Binary (`src/bin/dtcat.rs`) + +**Files:** +- Create: `src/bin/dtcat.rs` + +- [ ] **Step 1: Implement dtcat** + +Adapt from xl-cli-tools `xlcat.rs` (`/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/bin/xlcat.rs`). Key changes: + +1. Replace `xlcat::` imports with `dtcore::`. +2. Add `--format` flag for format override. +3. Replace Excel-specific file validation with format detection. +4. Add `--info` flag (show file metadata). +5. For non-Excel files, skip sheet resolution (no sheets concept). For Excel files with multiple sheets, keep the same listing behavior. +6. Use `reader::read_file` and `reader::read_file_info` instead of `metadata::read_file_info` + `reader::read_sheet`. + +```rust +// src/bin/dtcat.rs + +use dtcore::format; +use dtcore::formatter; +use dtcore::metadata::{self, SheetInfo}; +use dtcore::reader::{self, ReadOptions}; + +use anyhow::Result; +use clap::Parser; +use polars::prelude::*; +use std::path::PathBuf; +use std::process; + +#[derive(Parser, Debug)] +#[command(name = "dtcat", about = "View tabular data files in the terminal")] +struct Cli { + /// Path to data file + file: PathBuf, + + /// Override format detection (csv, tsv, parquet, arrow, json, ndjson, excel) + #[arg(long)] + format: Option<String>, + + /// Select sheet by name or 0-based index (Excel only) + #[arg(long)] + sheet: Option<String>, + + /// Skip first N rows + #[arg(long)] + skip: Option<usize>, + + /// Show column names and types only + #[arg(long)] + schema: bool, + + /// Show summary statistics + #[arg(long)] + describe: bool, + + /// Show first N rows (default: 50) + #[arg(long)] + head: Option<usize>, + + /// Show last N rows + #[arg(long)] + tail: Option<usize>, + + /// Output as CSV instead of markdown table + #[arg(long)] + csv: bool, + + /// Show file metadata (size, format, shape, sheets) + #[arg(long)] + info: bool, +} + +#[derive(Debug)] +struct ArgError(String); + +impl std::fmt::Display for ArgError { + fn fmt(&self, f: &mut std::fmt::Formatter) -> 
std::fmt::Result { + write!(f, "{}", self.0) + } +} + +impl std::error::Error for ArgError {} + +fn run(cli: &Cli) -> Result<()> { + // Validate flag combinations + if cli.schema && cli.describe { + return Err(ArgError("--schema and --describe are mutually exclusive".into()).into()); + } + + // Detect format + let fmt = format::detect_format(&cli.file, cli.format.as_deref())?; + + // Read file info + let file_info = reader::read_file_info(&cli.file, fmt)?; + let file_name = cli.file + .file_name() + .map(|s| s.to_string_lossy().to_string()) + .unwrap_or_else(|| cli.file.display().to_string()); + + // --info mode + if cli.info { + let mut out = formatter::format_header(&file_name, &file_info); + out.push_str(&format!("Format: {}\n", metadata::format_name(fmt))); + if !file_info.sheets.is_empty() { + for sheet in &file_info.sheets { + out.push_str(&format!(" {}: {} rows x {} cols\n", sheet.name, sheet.rows, sheet.cols)); + } + } + print!("{out}"); + return Ok(()); + } + + // Build read options + let read_opts = ReadOptions { + sheet: cli.sheet.clone(), + skip_rows: cli.skip, + separator: None, + }; + + // For Excel with multiple sheets and no --sheet flag: list sheets + if fmt == format::Format::Excel && file_info.sheets.len() > 1 && cli.sheet.is_none() { + let has_row_flags = cli.head.is_some() || cli.tail.is_some() || cli.csv; + if has_row_flags { + return Err(ArgError( + "Multiple sheets found. 
Use --sheet <name> to select one.".into(), + ).into()); + } + + // List all sheets with schemas + let mut out = formatter::format_header(&file_name, &file_info); + out.push('\n'); + for sheet in &file_info.sheets { + let opts = ReadOptions { sheet: Some(sheet.name.clone()), ..read_opts.clone() }; + let df = reader::read_file(&cli.file, fmt, &opts)?; + if sheet.rows == 0 && sheet.cols == 0 { + out.push_str(&formatter::format_empty_sheet(sheet)); + } else { + out.push_str(&formatter::format_schema(sheet, &df)); + } + out.push('\n'); + } + out.push_str("Use --sheet <name> to view a specific sheet.\n"); + print!("{out}"); + return Ok(()); + } + + // Read the data + let df = reader::read_file(&cli.file, fmt, &read_opts)?; + + // Build a SheetInfo for display + let sheet_info = if let Some(si) = file_info.sheets.first() { + si.clone() + } else { + SheetInfo { + name: file_name.clone(), + rows: df.height() + 1, // +1 for header + cols: df.width(), + } + }; + + // Render output + render_output(cli, &file_name, &file_info, &sheet_info, &df) +} + +fn render_output( + cli: &Cli, + file_name: &str, + file_info: &metadata::FileInfo, + sheet_info: &SheetInfo, + df: &DataFrame, +) -> Result<()> { + if cli.csv { + let selected = select_rows(cli, df); + print!("{}", formatter::format_csv(&selected)); + return Ok(()); + } + + let mut out = formatter::format_header(file_name, file_info); + out.push('\n'); + + if df.height() == 0 { + out.push_str(&formatter::format_schema(sheet_info, df)); + out.push_str("\n(no data rows)\n"); + print!("{out}"); + return Ok(()); + } + + if cli.schema { + out.push_str(&formatter::format_schema(sheet_info, df)); + } else if cli.describe { + out.push_str(&formatter::format_schema(sheet_info, df)); + out.push_str(&formatter::format_describe(df)); + } else { + out.push_str(&formatter::format_schema(sheet_info, df)); + out.push('\n'); + out.push_str(&format_data_selection(cli, df)); + } + + print!("{out}"); + Ok(()) +} + +fn format_data_selection(cli: &Cli, 
df: &DataFrame) -> String {
+    let total = df.height();
+
+    if cli.head.is_some() || cli.tail.is_some() {
+        let head_n = cli.head.unwrap_or(0);
+        let tail_n = cli.tail.unwrap_or(0);
+        if head_n + tail_n >= total || (head_n == 0 && tail_n == 0) {
+            return formatter::format_data_table(df);
+        }
+        if cli.tail.is_none() {
+            return formatter::format_data_table(&df.head(Some(head_n)));
+        }
+        if cli.head.is_none() {
+            return formatter::format_data_table(&df.tail(Some(tail_n)));
+        }
+        return formatter::format_head_tail(df, head_n, tail_n);
+    }
+
+    // Default: <=50 rows show all, >50 show head 25 + tail 25
+    if total <= 50 {
+        formatter::format_data_table(df)
+    } else {
+        formatter::format_head_tail(df, 25, 25)
+    }
+}
+
+fn select_rows(cli: &Cli, df: &DataFrame) -> DataFrame {
+    let total = df.height();
+
+    if cli.head.is_some() || cli.tail.is_some() {
+        let head_n = cli.head.unwrap_or(0);
+        let tail_n = cli.tail.unwrap_or(0);
+        if head_n + tail_n >= total || (head_n == 0 && tail_n == 0) {
+            return df.clone();
+        }
+        if cli.tail.is_none() {
+            return df.head(Some(head_n));
+        }
+        if cli.head.is_none() {
+            return df.tail(Some(tail_n));
+        }
+        let head_df = df.head(Some(head_n));
+        let tail_df = df.tail(Some(tail_n));
+        return head_df.vstack(&tail_df).unwrap_or_else(|_| df.clone());
+    }
+
+    if total <= 50 {
+        df.clone()
+    } else {
+        let h = df.head(Some(25));
+        let t = df.tail(Some(25));
+        h.vstack(&t).unwrap_or_else(|_| df.clone())
+    }
+}
+
+fn main() {
+    let cli = Cli::parse();
+    if let Err(err) = run(&cli) {
+        if err.downcast_ref::<ArgError>().is_some() {
+            eprintln!("dtcat: {err}");
+            process::exit(2);
+        }
+        eprintln!("dtcat: {err}");
+        process::exit(1);
+    }
+}
+```
+
+- [ ] **Step 2: Verify it compiles**
+
+Run: `cargo build --bin dtcat 2>&1`
+Expected: compiles successfully
+
+- [ ] **Step 3: Manual smoke test**
+
+Create a quick test CSV and run dtcat on it (use `printf` rather than `echo` so the `\n` escapes are interpreted in any shell):
+```bash
+printf 'name,value\nAlice,100\nBob,200\n' > /tmp/test.csv
+cargo run --bin dtcat -- /tmp/test.csv
+cargo run --bin 
dtcat -- /tmp/test.csv --schema +cargo run --bin dtcat -- /tmp/test.csv --csv +``` + +- [ ] **Step 4: Commit** + +```bash +git add src/bin/dtcat.rs +git commit -m "feat: add dtcat binary for viewing tabular data files" +``` + +--- + +### Task 14: dtfilter Binary (`src/bin/dtfilter.rs`) + +**Files:** +- Create: `src/bin/dtfilter.rs` + +- [ ] **Step 1: Implement dtfilter** + +Adapt from xl-cli-tools `xlfilter.rs` (`/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/bin/xlfilter.rs`). Key changes: + +1. Replace `xlcat::` imports with `dtcore::`. +2. Add `--format` flag. +3. Replace Excel-specific file reading with format detection + `reader::read_file`. +4. Remove Excel-specific sheet resolution for non-Excel formats. +5. Change `--cols` description to "Select columns by name" (no letter-based). + +```rust +// src/bin/dtfilter.rs + +use std::path::PathBuf; +use std::process; + +use anyhow::Result; +use clap::Parser; + +use dtcore::filter::{parse_filter_expr, parse_sort_spec, filter_pipeline, FilterOptions}; +use dtcore::format; +use dtcore::formatter; +use dtcore::reader::{self, ReadOptions}; + +#[derive(Parser)] +#[command( + name = "dtfilter", + about = "Filter and query tabular data files", + version +)] +struct Args { + /// Path to data file + file: PathBuf, + + /// Override format detection + #[arg(long)] + format: Option<String>, + + /// Select sheet (Excel only) + #[arg(long)] + sheet: Option<String>, + + /// Skip first N rows + #[arg(long)] + skip: Option<usize>, + + /// Select columns by name (comma-separated) + #[arg(long)] + columns: Option<String>, + + /// Filter expressions (e.g., Amount>1000, Name~john) + #[arg(long = "filter")] + filters: Vec<String>, + + /// Sort specification (e.g., Amount:desc) + #[arg(long)] + sort: Option<String>, + + /// Max rows in output (applied after filter) + #[arg(long)] + limit: Option<usize>, + + /// First N rows (applied before filter) + #[arg(long)] + head: Option<usize>, + + /// Last N rows (applied before filter) + 
#[arg(long)] + tail: Option<usize>, + + /// Output as CSV + #[arg(long)] + csv: bool, +} + +#[derive(Debug)] +struct ArgError(String); +impl std::fmt::Display for ArgError { + fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result { + write!(f, "{}", self.0) + } +} +impl std::error::Error for ArgError {} + +fn run(args: Args) -> Result<()> { + if !args.file.exists() { + return Err(ArgError(format!("file not found: {}", args.file.display())).into()); + } + if args.head.is_some() && args.tail.is_some() { + return Err(ArgError("--head and --tail are mutually exclusive".into()).into()); + } + + let fmt = format::detect_format(&args.file, args.format.as_deref())?; + + let read_opts = ReadOptions { + sheet: args.sheet, + skip_rows: args.skip, + separator: None, + }; + + let df = reader::read_file(&args.file, fmt, &read_opts)?; + + if df.height() == 0 { + eprintln!("0 rows"); + println!("(no data rows)"); + return Ok(()); + } + + // Parse filter expressions + let filters: Vec<_> = args.filters + .iter() + .map(|s| parse_filter_expr(s)) + .collect::<Result<Vec<_>, _>>() + .map_err(|e| anyhow::anyhow!(ArgError(e)))?; + + let sort = args.sort + .as_deref() + .map(parse_sort_spec) + .transpose() + .map_err(|e| anyhow::anyhow!(ArgError(e)))?; + + let cols = args.columns.map(|s| { + s.split(',').map(|c| c.trim().to_string()).collect::<Vec<_>>() + }); + + let opts = FilterOptions { + filters, + cols, + sort, + limit: args.limit, + head: args.head, + tail: args.tail, + }; + + let result = filter_pipeline(df, &opts)?; + + eprintln!("{} rows", result.height()); + + if result.height() == 0 { + println!("{}", formatter::format_data_table(&result)); + } else if args.csv { + print!("{}", formatter::format_csv(&result)); + } else { + println!("{}", formatter::format_data_table(&result)); + } + + Ok(()) +} + +fn main() { + let args = Args::parse(); + if let Err(err) = run(args) { + if err.downcast_ref::<ArgError>().is_some() { + eprintln!("dtfilter: {err}"); + process::exit(2); + 
} + eprintln!("dtfilter: {err}"); + process::exit(1); + } +} +``` + +- [ ] **Step 2: Verify it compiles** + +Run: `cargo build --bin dtfilter 2>&1` +Expected: compiles + +- [ ] **Step 3: Commit** + +```bash +git add src/bin/dtfilter.rs +git commit -m "feat: add dtfilter binary for filtering tabular data files" +``` + +--- + +### Task 15: dtdiff Binary (`src/bin/dtdiff.rs`) + +**Files:** +- Create: `src/bin/dtdiff.rs` + +- [ ] **Step 1: Implement dtdiff** + +Adapt from xl-cli-tools `xldiff.rs` (`/Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/bin/xldiff.rs`). Key changes: + +1. Replace `xlcat::` imports with `dtcore::`. +2. Add `--format` flag. +3. **Same-format enforcement**: detect format of both files and error if they differ (Csv/Tsv are same family and allowed). +4. Replace Excel-specific reading with format detection + `reader::read_file`. +5. Remove letter-based column resolution in key/cols parsing (use name-only `resolve_column`). +6. Port all output formatters (format_text, format_markdown, format_json, format_csv) and tests verbatim. + +Exit codes: 0 = no differences, 1 = differences found, 2 = error. 
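The same-format enforcement described in item 3 relies on a `same_family` helper that this plan never defines. A minimal sketch on the `Format` enum in `src/format.rs` could look like the following (the variant list mirrors the Task 12 dispatch; the method name and derive list are assumptions for the implementer to confirm):

```rust
// Sketch only: Format variants mirror the Task 12 dispatch match;
// `same_family` is a helper this plan assumes but does not define elsewhere.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Csv,
    Tsv,
    Parquet,
    Arrow,
    Json,
    Ndjson,
    Excel,
}

impl Format {
    /// CSV and TSV differ only in delimiter, so dtdiff treats them as one
    /// comparable family; every other format matches only itself.
    pub fn same_family(&self, other: &Format) -> bool {
        self == other
            || matches!(
                (self, other),
                (Format::Csv | Format::Tsv, Format::Csv | Format::Tsv)
            )
    }
}
```

Keeping the check on `Format` itself keeps dtdiff free of format-specific branching; widening a family later (say, Json/Ndjson) is then a one-line change.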
+ +```rust +// src/bin/dtdiff.rs +// Adapted from xl-cli-tools xldiff.rs + +use std::io::IsTerminal; +use std::path::PathBuf; +use std::process; + +use anyhow::{Result, bail}; +use clap::Parser; +use serde_json::{Map, Value, json}; + +use dtcore::diff::{DiffOptions, DiffResult, SheetSource}; +use dtcore::format; +use dtcore::formatter; +use dtcore::reader::{self, ReadOptions}; + +#[derive(Parser)] +#[command( + name = "dtdiff", + about = "Compare two tabular data files and show differences", + version +)] +struct Args { + /// First file + file_a: PathBuf, + + /// Second file + file_b: PathBuf, + + /// Override format detection (both files must be this format) + #[arg(long)] + format: Option<String>, + + /// Select sheet (Excel only) + #[arg(long)] + sheet: Option<String>, + + /// Key column(s) for matched comparison (comma-separated names) + #[arg(long)] + key: Option<String>, + + /// Numeric tolerance for float comparisons (default: 1e-10) + #[arg(long, default_value = "1e-10")] + tolerance: f64, + + /// Output as JSON + #[arg(long)] + json: bool, + + /// Output as CSV + #[arg(long)] + csv: bool, + + /// Disable colored output + #[arg(long)] + no_color: bool, +} + +fn run(args: Args) -> Result<()> { + if !args.file_a.exists() { + bail!("file not found: {}", args.file_a.display()); + } + if !args.file_b.exists() { + bail!("file not found: {}", args.file_b.display()); + } + + // Detect formats + let fmt_a = format::detect_format(&args.file_a, args.format.as_deref())?; + let fmt_b = format::detect_format(&args.file_b, args.format.as_deref())?; + + // Same-format enforcement (Csv/Tsv are same family) + if !fmt_a.same_family(&fmt_b) { + bail!( + "format mismatch: {} is {:?} but {} is {:?}. 
Both files must be the same format.", + args.file_a.display(), fmt_a, + args.file_b.display(), fmt_b, + ); + } + + let read_opts = ReadOptions { + sheet: args.sheet.clone(), + skip_rows: None, + separator: None, + }; + + let df_a = reader::read_file(&args.file_a, fmt_a, &read_opts)?; + let df_b = reader::read_file(&args.file_b, fmt_b, &read_opts)?; + + // Resolve key columns + let key_columns: Vec<String> = if let Some(ref key_str) = args.key { + key_str.split(',').map(|s| s.trim().to_string()).collect() + } else { + vec![] + }; + + let file_name_a = args.file_a.file_name() + .map(|s| s.to_string_lossy().to_string()) + .unwrap_or_else(|| args.file_a.display().to_string()); + let file_name_b = args.file_b.file_name() + .map(|s| s.to_string_lossy().to_string()) + .unwrap_or_else(|| args.file_b.display().to_string()); + + let source_a = SheetSource { + file_name: file_name_a, + sheet_name: args.sheet.clone().unwrap_or_else(|| "data".into()), + }; + let source_b = SheetSource { + file_name: file_name_b, + sheet_name: args.sheet.unwrap_or_else(|| "data".into()), + }; + + let opts = DiffOptions { + key_columns, + tolerance: Some(args.tolerance), + }; + + let result = dtcore::diff::diff_sheets(&df_a, &df_b, &opts, source_a, source_b)?; + + let use_color = !args.no_color && std::io::stdout().is_terminal(); + + // Format output + let output = if args.json { + format_json(&result) + } else if args.csv { + format_csv_output(&result) + } else { + format_text(&result, use_color) + }; + + print!("{}", output); + + if result.has_differences() { + process::exit(1); + } + + Ok(()) +} + +// Port format_text, format_json, format_csv_output (renamed from format_csv to avoid +// collision with the flag) verbatim from xl-cli-tools xldiff.rs. +// Include format_row_inline, csv_quote, csv_row helpers. 
+
+// [Full implementations copied from xl-cli-tools xldiff.rs - see source at
+// /Users/loulou/Dropbox/projects_claude/xl-cli-tool/src/bin/xldiff.rs lines 141-455]
+// The only rename: format_csv -> format_csv_output to avoid name collision.
+
+fn main() {
+    let args = Args::parse();
+    if let Err(err) = run(args) {
+        eprintln!("dtdiff: {err}");
+        process::exit(2);
+    }
+}
+```
+
+The output formatter functions (`format_text`, `format_json`, `format_csv_output`, `format_row_inline`, `csv_quote`, `csv_row`) and their tests transfer verbatim from xldiff.rs lines 141-827. Copy them into dtdiff.rs.
+
+- [ ] **Step 2: Verify it compiles**
+
+Run: `cargo build --bin dtdiff 2>&1`
+Expected: compiles
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add src/bin/dtdiff.rs
+git commit -m "feat: add dtdiff binary for comparing tabular data files"
+```
+
+---
+
+### Task 16: Demo Fixtures and Integration Tests
+
+**Files:**
+- Create: `demo/` fixture files
+- Create: `tests/integration/dtcat.rs`
+- Create: `tests/integration/dtfilter.rs`
+- Create: `tests/integration/dtdiff.rs`
+
+- [ ] **Step 1: Create demo fixture files**
+
+Create small test files in `demo/`:
+
+```bash
+# demo/sample.csv
+echo 'name,value,category
+Alice,100,A
+Bob,200,B
+Charlie,300,A
+Diana,400,B
+Eve,500,A' > demo/sample.csv
+
+# demo/sample.tsv
+printf 'name\tvalue\tcategory\nAlice\t100\tA\nBob\t200\tB\n' > demo/sample.tsv
+```
+
+Also create Parquet and Arrow fixtures programmatically in a test helper, or via a small Rust script. 
+ +- [ ] **Step 2: Write dtcat integration tests** + +```rust +// tests/integration/dtcat.rs + +use assert_cmd::Command; +use predicates::prelude::*; +use std::io::Write; +use tempfile::NamedTempFile; + +fn dtcat() -> Command { + Command::cargo_bin("dtcat").unwrap() +} + +fn csv_file(content: &str) -> NamedTempFile { + let mut f = NamedTempFile::with_suffix(".csv").unwrap(); + write!(f, "{}", content).unwrap(); + f.flush().unwrap(); + f +} + +#[test] +fn shows_csv_data() { + let f = csv_file("name,value\nAlice,100\nBob,200\n"); + dtcat() + .arg(f.path()) + .assert() + .success() + .stdout(predicate::str::contains("Alice")) + .stdout(predicate::str::contains("Bob")); +} + +#[test] +fn schema_flag() { + let f = csv_file("name,value\nAlice,100\n"); + dtcat() + .arg(f.path()) + .arg("--schema") + .assert() + .success() + .stdout(predicate::str::contains("Column")) + .stdout(predicate::str::contains("Type")); +} + +#[test] +fn csv_output_flag() { + let f = csv_file("name,value\nAlice,100\n"); + dtcat() + .arg(f.path()) + .arg("--csv") + .assert() + .success() + .stdout(predicate::str::contains("name,value")); +} + +#[test] +fn head_flag() { + let f = csv_file("x\n1\n2\n3\n4\n5\n"); + dtcat() + .arg(f.path()) + .arg("--head") + .arg("2") + .assert() + .success() + .stdout(predicate::str::contains("1")) + .stdout(predicate::str::contains("2")); +} + +#[test] +fn nonexistent_file_exits_1() { + dtcat() + .arg("/tmp/does_not_exist.csv") + .assert() + .failure(); +} + +#[test] +fn format_override() { + // A .txt file read as CSV + let mut f = NamedTempFile::with_suffix(".txt").unwrap(); + write!(f, "a,b\n1,2\n").unwrap(); + f.flush().unwrap(); + + dtcat() + .arg(f.path()) + .arg("--format") + .arg("csv") + .assert() + .success() + .stdout(predicate::str::contains("1")); +} +``` + +- [ ] **Step 3: Write dtfilter integration tests** + +```rust +// tests/integration/dtfilter.rs + +use assert_cmd::Command; +use predicates::prelude::*; +use std::io::Write; +use 
tempfile::NamedTempFile; + +fn dtfilter() -> Command { + Command::cargo_bin("dtfilter").unwrap() +} + +fn csv_file(content: &str) -> NamedTempFile { + let mut f = NamedTempFile::with_suffix(".csv").unwrap(); + write!(f, "{}", content).unwrap(); + f.flush().unwrap(); + f +} + +#[test] +fn filter_eq() { + let f = csv_file("name,value\nAlice,100\nBob,200\n"); + dtfilter() + .arg(f.path()) + .arg("--filter") + .arg("name=Alice") + .assert() + .success() + .stdout(predicate::str::contains("Alice")) + .stdout(predicate::str::contains("Bob").not()); +} + +#[test] +fn filter_gt() { + let f = csv_file("name,value\nAlice,100\nBob,200\nCharlie,300\n"); + dtfilter() + .arg(f.path()) + .arg("--filter") + .arg("value>150") + .assert() + .success() + .stdout(predicate::str::contains("Bob")) + .stdout(predicate::str::contains("Charlie")); +} + +#[test] +fn sort_desc() { + let f = csv_file("name,value\nAlice,100\nBob,200\n"); + dtfilter() + .arg(f.path()) + .arg("--sort") + .arg("value:desc") + .assert() + .success(); +} + +#[test] +fn columns_select() { + let f = csv_file("name,value,extra\nAlice,100,x\n"); + dtfilter() + .arg(f.path()) + .arg("--columns") + .arg("name,value") + .assert() + .success() + .stdout(predicate::str::contains("name")) + .stdout(predicate::str::contains("extra").not()); +} + +#[test] +fn csv_output() { + let f = csv_file("name,value\nAlice,100\n"); + dtfilter() + .arg(f.path()) + .arg("--csv") + .assert() + .success() + .stdout(predicate::str::contains("name,value")); +} +``` + +- [ ] **Step 4: Write dtdiff integration tests** + +```rust +// tests/integration/dtdiff.rs + +use assert_cmd::Command; +use predicates::prelude::*; +use std::io::Write; +use tempfile::NamedTempFile; + +fn dtdiff() -> Command { + Command::cargo_bin("dtdiff").unwrap() +} + +fn csv_file(content: &str) -> NamedTempFile { + let mut f = NamedTempFile::with_suffix(".csv").unwrap(); + write!(f, "{}", content).unwrap(); + f.flush().unwrap(); + f +} + +#[test] +fn no_diff_exits_0() { + let 
a = csv_file("name,value\nAlice,100\n"); + let b = csv_file("name,value\nAlice,100\n"); + dtdiff() + .arg(a.path()) + .arg(b.path()) + .assert() + .success() + .stdout(predicate::str::contains("No differences")); +} + +#[test] +fn diff_exits_1() { + let a = csv_file("name,value\nAlice,100\n"); + let b = csv_file("name,value\nBob,200\n"); + dtdiff() + .arg(a.path()) + .arg(b.path()) + .assert() + .code(1); +} + +#[test] +fn keyed_diff() { + let a = csv_file("id,name\n1,Alice\n2,Bob\n"); + let b = csv_file("id,name\n1,Alice\n2,Robert\n"); + dtdiff() + .arg(a.path()) + .arg(b.path()) + .arg("--key") + .arg("id") + .assert() + .code(1) + .stdout(predicate::str::contains("Bob").or(predicate::str::contains("Robert"))); +} + +#[test] +fn json_output() { + let a = csv_file("id,val\n1,a\n"); + let b = csv_file("id,val\n1,b\n"); + dtdiff() + .arg(a.path()) + .arg(b.path()) + .arg("--key") + .arg("id") + .arg("--json") + .assert() + .code(1) + .stdout(predicate::str::contains("\"modified\"")); +} + +#[test] +fn format_mismatch_exits_2() { + let csv = csv_file("a,b\n1,2\n"); + // Create a file with .parquet extension but CSV content - format detection + // will see it as parquet by extension, creating a mismatch + let mut pq = NamedTempFile::with_suffix(".parquet").unwrap(); + write!(pq, "a,b\n1,2\n").unwrap(); + pq.flush().unwrap(); + // This should fail because formats differ (or parquet reader fails on CSV content) + dtdiff() + .arg(csv.path()) + .arg(pq.path()) + .assert() + .failure(); +} +``` + +- [ ] **Step 5: Run all integration tests** + +Run: `cargo test --test '*' 2>&1` +Expected: all integration tests PASS + +- [ ] **Step 6: Commit** + +```bash +git add demo/ tests/ +git commit -m "feat: add demo fixtures and integration tests for all binaries" +``` + +--- + +### Task 17: Final Verification + +- [ ] **Step 1: Run full test suite** + +Run: `cargo test 2>&1` +Expected: all unit tests and integration tests PASS + +- [ ] **Step 2: Run clippy** + +Run: `cargo clippy 
2>&1`
+Expected: no errors (warnings acceptable)
+
+- [ ] **Step 3: Build release binaries**
+
+Run: `cargo build --release 2>&1`
+Expected: builds successfully, produces `dtcat`, `dtfilter`, `dtdiff` in `target/release/`
+
+- [ ] **Step 4: Smoke test all binaries**
+
+Use `printf` rather than `echo` so the `\n` escapes are interpreted in any shell:
+
+```bash
+printf 'name,value\nAlice,100\nBob,200\n' > /tmp/dt_test.csv
+./target/release/dtcat /tmp/dt_test.csv
+./target/release/dtcat /tmp/dt_test.csv --schema
+./target/release/dtcat /tmp/dt_test.csv --describe
+./target/release/dtfilter /tmp/dt_test.csv --filter "value>100"
+printf 'name,value\nAlice,100\nCharlie,300\n' > /tmp/dt_test2.csv
+./target/release/dtdiff /tmp/dt_test.csv /tmp/dt_test2.csv
+```
+
+- [ ] **Step 5: Final commit**
+
+```bash
+git add -A
+git commit -m "chore: final cleanup and verification for v0.1"
+```