xl-cli-tools

CLI tools for viewing and editing Excel files
Log | Files | Refs | README | LICENSE

2026-03-17-xldiff.md (58492B)


      1 # xldiff Implementation Plan
      2 
      3 > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.
      4 
      5 **Goal:** Port go-xldiff to Rust as the `xldiff` binary in this repo, with type-aware comparison and tolerance support.
      6 
      7 **Architecture:** New `src/diff.rs` module for core diff logic (data structures, positional/key-based algorithms, tolerance). New `src/bin/xldiff.rs` binary for CLI parsing, file:sheet argument handling, and four output formatters (text, markdown, json, csv). Reuses existing `reader`, `metadata`, `filter` (for `resolve_column`), and `formatter` modules.
      8 
      9 **Tech Stack:** Rust, polars (DataFrames), calamine (reading), clap (CLI), serde_json (JSON output), std::io::IsTerminal (TTY detection)
     10 
     11 **Spec:** `docs/superpowers/specs/2026-03-17-xldiff-design.md`
     12 
     13 ---
     14 
     15 ### Task 1: Cargo.toml & lib.rs setup
     16 
     17 **Files:**
     18 - Modify: `Cargo.toml`
     19 - Modify: `src/lib.rs`
     20 
     21 - [ ] **Step 1: Add xldiff binary entry and dependencies to Cargo.toml**
     22 
     23 Add after the xlfilter `[[bin]]` block:
     24 
     25 ```toml
     26 [[bin]]
     27 name = "xldiff"
     28 path = "src/bin/xldiff.rs"
     29 ```
     30 
     31 Add to `[dependencies]`:
     32 
     33 ```toml
     34 serde_json = { version = "1", features = ["preserve_order"] }
     35 ```
     36 
     37 - [ ] **Step 2: Add diff module to lib.rs**
     38 
     39 ```rust
     40 pub mod cell;
     41 pub mod diff;
     42 pub mod filter;
     43 pub mod formatter;
     44 pub mod metadata;
     45 pub mod reader;
     46 pub mod writer;
     47 ```
     48 
     49 - [ ] **Step 3: Create empty diff.rs so the project compiles**
     50 
     51 Create `src/diff.rs`:
     52 
     53 ```rust
     54 // Diff engine for comparing two Excel sheets.
     55 ```
     56 
     57 - [ ] **Step 4: Verify project compiles**
     58 
     59 Run: `cargo check`
     60 Expected: success (no errors)
     61 
     62 - [ ] **Step 5: Commit**
     63 
     64 ```bash
     65 git add Cargo.toml src/lib.rs src/diff.rs
     66 git commit -m "chore: add xldiff binary entry, serde deps, empty diff module"
     67 ```
     68 
     69 ---
     70 
     71 ### Task 2: Make formatter helpers public
     72 
     73 **Files:**
     74 - Modify: `src/formatter.rs`
     75 
     76 The markdown output formatter in xldiff needs `render_table`, `compute_col_widths`, `render_table_header`, `render_table_rows`, and `format_any_value` from `formatter.rs`. These are currently private.
     77 
     78 - [ ] **Step 1: Make the five helper functions public**
     79 
     80 In `src/formatter.rs`, change:
     81 
     82 - `fn compute_col_widths(` → `pub fn compute_col_widths(`
     83 - `fn render_table_header(` → `pub fn render_table_header(`
     84 - `fn render_table_rows(` → `pub fn render_table_rows(`
     85 - `fn render_table(` → `pub fn render_table(`
     86 - `fn format_any_value(` → `pub fn format_any_value(`
     87 
     88 - [ ] **Step 2: Verify project compiles**
     89 
     90 Run: `cargo check`
     91 Expected: success
     92 
     93 - [ ] **Step 3: Commit**
     94 
     95 ```bash
     96 git add src/formatter.rs
     97 git commit -m "refactor: make formatter table helpers public for xldiff reuse"
     98 ```
     99 
    100 ---
    101 
    102 ### Task 3: Make `col_letter_to_index` and `resolve_column` available from filter module
    103 
    104 **Files:**
    105 - Modify: `src/filter.rs`
    106 
    107 `col_letter_to_index` is currently private in `filter.rs`. The xldiff binary needs `resolve_column` (already public) and `resolve_columns` (already public), which internally use `col_letter_to_index`. Since `resolve_column` is already public, no changes are needed here — xldiff will call `xlcat::filter::resolve_column()` directly.
    108 
    109 **This task is a no-op** — confirming that the existing public API is sufficient. Skip to Task 4.
    110 
    111 ---
    112 
    113 ### Task 4: diff.rs — data structures
    114 
    115 **Files:**
    116 - Modify: `src/diff.rs`
    117 
    118 - [ ] **Step 1: Write the data structures**
    119 
    120 ```rust
    121 use polars::prelude::*;
    122 
    123 /// Source file and sheet metadata for display.
    124 #[derive(Debug, Clone)]
    125 pub struct SheetSource {
    126     pub file_name: String,
    127     pub sheet_name: String,
    128 }
    129 
    130 /// A single row from an added or removed set.
    131 #[derive(Debug, Clone)]
    132 pub struct DiffRow {
    133     pub values: Vec<String>,
    134 }
    135 
    136 /// A change in a single cell.
    137 #[derive(Debug, Clone)]
    138 pub struct CellChange {
    139     pub column: String,
    140     pub old_value: String,
    141     pub new_value: String,
    142 }
    143 
    144 /// A row present in both files with cell-level differences.
    145 #[derive(Debug, Clone)]
    146 pub struct ModifiedRow {
    147     pub key: Vec<String>,
    148     pub changes: Vec<CellChange>,
    149 }
    150 
    151 /// Result of comparing two sheets.
    152 #[derive(Debug, Clone)]
    153 pub struct DiffResult {
    154     pub headers: Vec<String>,
    155     pub key_columns: Vec<String>,
    156     pub added: Vec<DiffRow>,
    157     pub removed: Vec<DiffRow>,
    158     pub modified: Vec<ModifiedRow>,
    159     pub source_a: SheetSource,
    160     pub source_b: SheetSource,
    161 }
    162 
    163 impl DiffResult {
    164     pub fn has_differences(&self) -> bool {
    165         !self.added.is_empty() || !self.removed.is_empty() || !self.modified.is_empty()
    166     }
    167 }
    168 
    169 /// Options controlling how the diff is performed.
    170 #[derive(Debug, Clone, Default)]
    171 pub struct DiffOptions {
    172     pub key_columns: Vec<String>,
    173     pub tolerance: Option<f64>,
    174 }
    175 ```
    176 
    177 - [ ] **Step 2: Verify project compiles**
    178 
    179 Run: `cargo check`
    180 Expected: success
    181 
    182 - [ ] **Step 3: Commit**
    183 
    184 ```bash
    185 git add src/diff.rs
    186 git commit -m "feat(xldiff): add diff data structures — DiffResult, DiffRow, ModifiedRow, CellChange"
    187 ```
    188 
    189 ---
    190 
    191 ### Task 5: diff.rs — positional mode
    192 
    193 **Files:**
    194 - Modify: `src/diff.rs`
    195 - Create: `tests/test_xldiff.rs` (unit tests only at first)
    196 
    197 - [ ] **Step 1: Write the failing test for positional diff**
    198 
    199 Add at the bottom of `src/diff.rs`:
    200 
    201 ```rust
    202 #[cfg(test)]
    203 mod tests {
    204     use super::*;
    205 
    206     fn make_source(name: &str) -> SheetSource {
    207         SheetSource {
    208             file_name: format!("{name}.xlsx"),
    209             sheet_name: "Sheet1".to_string(),
    210         }
    211     }
    212 
    213     #[test]
    214     fn test_positional_no_diff() {
    215         let s1 = Series::new("name".into(), &["Alice", "Bob"]);
    216         let s2 = Series::new("score".into(), &[90i64, 80]);
    217         let df_a = DataFrame::new(vec![s1.clone().into_column(), s2.clone().into_column()]).unwrap();
    218         let df_b = DataFrame::new(vec![s1.into_column(), s2.into_column()]).unwrap();
    219 
    220         let opts = DiffOptions::default();
    221         let result = diff_positional(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    222         assert!(!result.has_differences());
    223         assert!(result.added.is_empty());
    224         assert!(result.removed.is_empty());
    225     }
    226 
    227     #[test]
    228     fn test_positional_added_removed() {
    229         let df_a = DataFrame::new(vec![
    230             Series::new("name".into(), &["Alice", "Bob"]).into_column(),
    231         ]).unwrap();
    232         let df_b = DataFrame::new(vec![
    233             Series::new("name".into(), &["Alice", "Charlie"]).into_column(),
    234         ]).unwrap();
    235 
    236         let opts = DiffOptions::default();
    237         let result = diff_positional(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    238         assert_eq!(result.removed.len(), 1);
    239         assert_eq!(result.removed[0].values, vec!["Bob"]);
    240         assert_eq!(result.added.len(), 1);
    241         assert_eq!(result.added[0].values, vec!["Charlie"]);
    242         assert!(result.modified.is_empty());
    243     }
    244 
    245     #[test]
    246     fn test_positional_duplicate_rows() {
    247         let df_a = DataFrame::new(vec![
    248             Series::new("x".into(), &["A", "A", "A"]).into_column(),
    249         ]).unwrap();
    250         let df_b = DataFrame::new(vec![
    251             Series::new("x".into(), &["A", "A"]).into_column(),
    252         ]).unwrap();
    253 
    254         let opts = DiffOptions::default();
    255         let result = diff_positional(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    256         assert_eq!(result.removed.len(), 1);
    257         assert!(result.added.is_empty());
    258     }
    259 }
    260 ```
    261 
    262 - [ ] **Step 2: Run tests to verify they fail**
    263 
    264 Run: `cargo test --lib diff::tests`
    265 Expected: FAIL — `diff_positional` not found
    266 
    267 - [ ] **Step 3: Implement positional diff**
    268 
    269 Add to `src/diff.rs` before the tests module:
    270 
    271 ```rust
    272 use crate::formatter;
    273 use std::collections::HashMap;
    274 
    275 /// Format a cell value for hashing/display, using a null sentinel to distinguish null from empty.
    276 fn cell_to_string(col: &Column, idx: usize) -> String {
    277     match col.get(idx) {
    278         Ok(AnyValue::Null) | Err(_) => String::new(),
    279         Ok(v) => formatter::format_any_value(&v),
    280     }
    281 }
    282 
    283 fn cell_to_key_part(col: &Column, idx: usize) -> String {
    284     match col.get(idx) {
    285         Ok(AnyValue::Null) | Err(_) => "\x01NULL\x01".to_string(),
    286         Ok(v) => formatter::format_any_value(&v),
    287     }
    288 }
    289 
    290 /// Convert a DataFrame row to a string key by joining cell values with null bytes.
    291 fn row_to_key(df: &DataFrame, row_idx: usize) -> String {
    292     let cols = df.get_columns();
    293     let parts: Vec<String> = cols
    294         .iter()
    295         .map(|c| cell_to_key_part(c, row_idx))
    296         .collect();
    297     parts.join("\0")
    298 }
    299 
    300 /// Convert a DataFrame row to a Vec<String> of display values.
    301 fn row_to_strings(df: &DataFrame, row_idx: usize) -> Vec<String> {
    302     df.get_columns()
    303         .iter()
    304         .map(|c| cell_to_string(c, row_idx))
    305         .collect()
    306 }
    307 
    308 /// Positional diff: treats entire row as identity using multiset comparison.
    309 pub fn diff_positional(
    310     df_a: &DataFrame,
    311     df_b: &DataFrame,
    312     _opts: &DiffOptions,
    313     source_a: SheetSource,
    314     source_b: SheetSource,
    315 ) -> anyhow::Result<DiffResult> {
    316     let headers_a: Vec<String> = df_a.get_column_names().iter().map(|s| s.to_string()).collect();
    317     let headers_b: Vec<String> = df_b.get_column_names().iter().map(|s| s.to_string()).collect();
    318 
    319     // Warn about mismatched column counts in positional mode
    320     if headers_a.len() != headers_b.len() {
    321         eprintln!(
    322             "warning: column count differs ({} vs {}), padding shorter side with empty values",
    323             headers_a.len(), headers_b.len()
    324         );
    325     }
    326 
    327     // Use headers from the file with more columns
    328     let headers = if headers_b.len() > headers_a.len() { headers_b } else { headers_a };
    329 
    330     // Build frequency maps
    331     let mut counts_a: HashMap<String, Vec<usize>> = HashMap::new();
    332     for i in 0..df_a.height() {
    333         counts_a.entry(row_to_key(df_a, i)).or_default().push(i);
    334     }
    335     let mut counts_b: HashMap<String, Vec<usize>> = HashMap::new();
    336     for i in 0..df_b.height() {
    337         counts_b.entry(row_to_key(df_b, i)).or_default().push(i);
    338     }
    339 
    340     let mut removed = Vec::new();
    341     let mut consumed_b: HashMap<String, usize> = HashMap::new();
    342 
    343     // Walk A: try to match with B
    344     for i in 0..df_a.height() {
    345         let key = row_to_key(df_a, i);
    346         let b_count = counts_b.get(&key).map(|v| v.len()).unwrap_or(0);
    347         let already_consumed = consumed_b.get(&key).copied().unwrap_or(0);
    348         if already_consumed < b_count {
    349             *consumed_b.entry(key).or_insert(0) += 1;
    350         } else {
    351             let mut vals = row_to_strings(df_a, i);
    352             vals.resize(headers.len(), String::new()); // pad if needed
    353             removed.push(DiffRow { values: vals });
    354         }
    355     }
    356 
    357     // Walk B: anything not consumed is added
    358     let mut consumed_a: HashMap<String, usize> = HashMap::new();
    359     let mut added = Vec::new();
    360     for i in 0..df_b.height() {
    361         let key = row_to_key(df_b, i);
    362         let a_count = counts_a.get(&key).map(|v| v.len()).unwrap_or(0);
    363         let already_consumed = consumed_a.get(&key).copied().unwrap_or(0);
    364         if already_consumed < a_count {
    365             *consumed_a.entry(key).or_insert(0) += 1;
    366         } else {
    367             let mut vals = row_to_strings(df_b, i);
    368             vals.resize(headers.len(), String::new()); // pad if needed
    369             added.push(DiffRow { values: vals });
    370         }
    371     }
    372 
    373     Ok(DiffResult {
    374         headers,
    375         key_columns: vec![],
    376         added,
    377         removed,
    378         modified: vec![],
    379         source_a,
    380         source_b,
    381     })
    382 }
    383 ```
    384 
    385 - [ ] **Step 4: Run tests to verify they pass**
    386 
    387 Run: `cargo test --lib diff::tests`
    388 Expected: all 3 tests PASS
    389 
    390 - [ ] **Step 5: Commit**
    391 
    392 ```bash
    393 git add src/diff.rs
    394 git commit -m "feat(xldiff): implement positional diff with multiset comparison"
    395 ```
    396 
    397 ---
    398 
    399 ### Task 6: diff.rs — key-based mode
    400 
    401 **Files:**
    402 - Modify: `src/diff.rs`
    403 
    404 - [ ] **Step 1: Write failing tests for key-based diff**
    405 
    406 Add to the `tests` module in `src/diff.rs`:
    407 
    408 ```rust
    409     #[test]
    410     fn test_keyed_no_diff() {
    411         let df_a = DataFrame::new(vec![
    412             Series::new("id".into(), &["1", "2"]).into_column(),
    413             Series::new("val".into(), &[10i64, 20]).into_column(),
    414         ]).unwrap();
    415         let df_b = df_a.clone();
    416 
    417         let opts = DiffOptions { key_columns: vec!["id".to_string()], ..Default::default() };
    418         let result = diff_keyed(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    419         assert!(!result.has_differences());
    420     }
    421 
    422     #[test]
    423     fn test_keyed_added_removed() {
    424         let df_a = DataFrame::new(vec![
    425             Series::new("id".into(), &["1", "2"]).into_column(),
    426             Series::new("val".into(), &[10i64, 20]).into_column(),
    427         ]).unwrap();
    428         let df_b = DataFrame::new(vec![
    429             Series::new("id".into(), &["2", "3"]).into_column(),
    430             Series::new("val".into(), &[20i64, 30]).into_column(),
    431         ]).unwrap();
    432 
    433         let opts = DiffOptions { key_columns: vec!["id".to_string()], ..Default::default() };
    434         let result = diff_keyed(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    435         assert_eq!(result.removed.len(), 1);
    436         assert_eq!(result.added.len(), 1);
    437         assert_eq!(result.removed[0].values, vec!["1", "10"]);
    438         assert_eq!(result.added[0].values, vec!["3", "30"]);
    439     }
    440 
    441     #[test]
    442     fn test_keyed_modified() {
    443         let df_a = DataFrame::new(vec![
    444             Series::new("id".into(), &["1", "2"]).into_column(),
    445             Series::new("score".into(), &[90i64, 80]).into_column(),
    446         ]).unwrap();
    447         let df_b = DataFrame::new(vec![
    448             Series::new("id".into(), &["1", "2"]).into_column(),
    449             Series::new("score".into(), &[95i64, 80]).into_column(),
    450         ]).unwrap();
    451 
    452         let opts = DiffOptions { key_columns: vec!["id".to_string()], ..Default::default() };
    453         let result = diff_keyed(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    454         assert!(result.added.is_empty());
    455         assert!(result.removed.is_empty());
    456         assert_eq!(result.modified.len(), 1);
    457         assert_eq!(result.modified[0].key, vec!["1"]);
    458         assert_eq!(result.modified[0].changes.len(), 1);
    459         assert_eq!(result.modified[0].changes[0].column, "score");
    460         assert_eq!(result.modified[0].changes[0].old_value, "90");
    461         assert_eq!(result.modified[0].changes[0].new_value, "95");
    462     }
    463 
    464     #[test]
    465     fn test_keyed_composite_key() {
    466         let df_a = DataFrame::new(vec![
    467             Series::new("date".into(), &["2024-01", "2024-01"]).into_column(),
    468             Series::new("ticker".into(), &["AAPL", "GOOG"]).into_column(),
    469             Series::new("price".into(), &[150i64, 100]).into_column(),
    470         ]).unwrap();
    471         let df_b = DataFrame::new(vec![
    472             Series::new("date".into(), &["2024-01", "2024-01"]).into_column(),
    473             Series::new("ticker".into(), &["AAPL", "GOOG"]).into_column(),
    474             Series::new("price".into(), &[155i64, 100]).into_column(),
    475         ]).unwrap();
    476 
    477         let opts = DiffOptions {
    478             key_columns: vec!["date".to_string(), "ticker".to_string()],
    479             ..Default::default()
    480         };
    481         let result = diff_keyed(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    482         assert_eq!(result.modified.len(), 1);
    483         assert_eq!(result.modified[0].key, vec!["2024-01", "AAPL"]);
    484     }
    485 ```
    486 
    487 - [ ] **Step 2: Run tests to verify they fail**
    488 
    489 Run: `cargo test --lib diff::tests`
    490 Expected: FAIL — `diff_keyed` not found
    491 
    492 - [ ] **Step 3: Implement key-based diff**
    493 
    494 Add to `src/diff.rs`:
    495 
    496 ```rust
    497 /// Row data stored during key-based comparison.
    498 struct KeyedRow {
    499     values: Vec<String>,
    500     key_values: Vec<String>,
    501 }
    502 
    503 /// Key-based diff: match rows by key columns, compare remaining cells.
    504 pub fn diff_keyed(
    505     df_a: &DataFrame,
    506     df_b: &DataFrame,
    507     opts: &DiffOptions,
    508     source_a: SheetSource,
    509     source_b: SheetSource,
    510 ) -> anyhow::Result<DiffResult> {
    511     let headers_a: Vec<String> = df_a.get_column_names().iter().map(|s| s.to_string()).collect();
    512     let headers_b: Vec<String> = df_b.get_column_names().iter().map(|s| s.to_string()).collect();
    513 
    514     // Resolve key column indices for both frames
    515     let key_indices_a: Vec<usize> = opts.key_columns.iter()
    516         .map(|k| headers_a.iter().position(|h| h == k)
    517             .ok_or_else(|| anyhow::anyhow!("key column '{}' not found in first file", k)))
    518         .collect::<anyhow::Result<Vec<_>>>()?;
    519 
    520     let key_indices_b: Vec<usize> = opts.key_columns.iter()
    521         .map(|k| headers_b.iter().position(|h| h == k)
    522             .ok_or_else(|| anyhow::anyhow!("key column '{}' not found in second file", k)))
    523         .collect::<anyhow::Result<Vec<_>>>()?;
    524 
    525     // Find common non-key columns for comparison
    526     let non_key_a: Vec<String> = headers_a.iter()
    527         .filter(|h| !opts.key_columns.contains(h))
    528         .cloned().collect();
    529     let non_key_b: Vec<String> = headers_b.iter()
    530         .filter(|h| !opts.key_columns.contains(h))
    531         .cloned().collect();
    532     let common_non_key: Vec<String> = non_key_a.iter()
    533         .filter(|h| non_key_b.contains(h))
    534         .cloned().collect();
    535 
    536     // Warn about columns only in one file
    537     for col in &non_key_a {
    538         if !non_key_b.contains(col) {
    539             eprintln!("warning: column '{}' only in first file, skipping", col);
    540         }
    541     }
    542     for col in &non_key_b {
    543         if !non_key_a.contains(col) {
    544             eprintln!("warning: column '{}' only in second file, skipping", col);
    545         }
    546     }
    547 
    548     // Headers = key columns + union of non-key columns (common first, then only-in-A, only-in-B)
    549     let mut headers = opts.key_columns.clone();
    550     headers.extend(non_key_a.iter().chain(
    551         non_key_b.iter().filter(|h| !non_key_a.contains(h))
    552     ).cloned());
    553 
    554     // Build row maps
    555     let map_a = build_key_map(df_a, &key_indices_a);
    556     let map_b = build_key_map(df_b, &key_indices_b);
    557 
    558     // Detect duplicate keys
    559     check_duplicate_keys(df_a, &key_indices_a, &source_a);
    560     check_duplicate_keys(df_b, &key_indices_b, &source_b);
    561 
    562     let mut added = Vec::new();
    563     let mut removed = Vec::new();
    564     let mut modified = Vec::new();
    565 
    566     // Removed: keys in A but not B
    567     for (key_str, row_a) in &map_a {
    568         if !map_b.contains_key(key_str) {
    569             removed.push(DiffRow { values: row_a.values.clone() });
    570         }
    571     }
    572 
    573     // Added: keys in B but not A; Modified: keys in both
    574     for (key_str, row_b) in &map_b {
    575         match map_a.get(key_str) {
    576             None => {
    577                 added.push(DiffRow { values: row_b.values.clone() });
    578             }
    579             Some(row_a) => {
    580                 let changes = compare_rows(
    581                     df_a, df_b,
    582                     &headers_a, &headers_b,
    583                     row_a, row_b,
    584                     &common_non_key,
    585                     opts,
    586                 );
    587                 if !changes.is_empty() {
    588                     modified.push(ModifiedRow {
    589                         key: row_a.key_values.clone(),
    590                         changes,
    591                     });
    592                 }
    593             }
    594         }
    595     }
    596 
    597     Ok(DiffResult {
    598         headers,
    599         key_columns: opts.key_columns.clone(),
    600         added,
    601         removed,
    602         modified,
    603         source_a,
    604         source_b,
    605     })
    606 }
    607 
    608 fn build_key_map(df: &DataFrame, key_indices: &[usize]) -> HashMap<String, KeyedRow> {
    609     let cols = df.get_columns();
    610     let mut map = HashMap::new();
    611     for i in 0..df.height() {
    612         let key_values: Vec<String> = key_indices.iter()
    613             .map(|&idx| formatter::format_any_value(&cols[idx].get(i).unwrap_or(AnyValue::Null)))
    614             .collect();
    615         let key_str = key_values.join("\0");
    616         let values: Vec<String> = cols.iter()
    617             .map(|c| formatter::format_any_value(&c.get(i).unwrap_or(AnyValue::Null)))
    618             .collect();
    619         map.insert(key_str, KeyedRow { values, key_values });
    620     }
    621     map
    622 }
    623 
    624 fn check_duplicate_keys(df: &DataFrame, key_indices: &[usize], source: &SheetSource) {
    625     let cols = df.get_columns();
    626     let mut seen: HashMap<String, usize> = HashMap::new();
    627     for i in 0..df.height() {
    628         let key: Vec<String> = key_indices.iter()
    629             .map(|&idx| formatter::format_any_value(&cols[idx].get(i).unwrap_or(AnyValue::Null)))
    630             .collect();
    631         let key_str = key.join(", ");
    632         let count = seen.entry(key_str.clone()).or_insert(0);
    633         *count += 1;
    634         if *count == 2 {
    635             eprintln!("warning: duplicate key \"{}\" in {}", key_str, source.file_name);
    636         }
    637     }
    638 }
    639 
    640 /// Compare non-key columns of two matched rows.
    641 fn compare_rows(
    642     df_a: &DataFrame,
    643     df_b: &DataFrame,
    644     headers_a: &[String],
    645     headers_b: &[String],
    646     row_a: &KeyedRow,
    647     row_b: &KeyedRow,
    648     common_columns: &[String],
    649     opts: &DiffOptions,
    650 ) -> Vec<CellChange> {
    651     let mut changes = Vec::new();
    652     for col_name in common_columns {
    653         let idx_a = headers_a.iter().position(|h| h == col_name).unwrap();
    654         let idx_b = headers_b.iter().position(|h| h == col_name).unwrap();
    655         let val_a = &row_a.values[idx_a];
    656         let val_b = &row_b.values[idx_b];
    657 
    658         let is_equal = if let Some(tol) = opts.tolerance {
    659             values_equal_with_tolerance(val_a, val_b, tol, df_a, df_b, col_name)
    660         } else {
    661             val_a == val_b
    662         };
    663 
    664         if !is_equal {
    665             changes.push(CellChange {
    666                 column: col_name.clone(),
    667                 old_value: val_a.clone(),
    668                 new_value: val_b.clone(),
    669             });
    670         }
    671     }
    672     changes
    673 }
    674 
    675 fn is_float_dtype(dt: &DataType) -> bool {
    676     matches!(dt, DataType::Float32 | DataType::Float64)
    677 }
    678 
    679 fn is_int_dtype(dt: &DataType) -> bool {
    680     matches!(dt, DataType::Int8 | DataType::Int16 | DataType::Int32 | DataType::Int64
    681         | DataType::UInt8 | DataType::UInt16 | DataType::UInt32 | DataType::UInt64)
    682 }
    683 
    684 /// Check if two values are equal, applying tolerance for float columns.
    685 /// Tolerance applies to Float64 columns and Int64-vs-Float64 cross-type,
    686 /// but NOT to pure integer columns (spec requirement).
    687 fn values_equal_with_tolerance(
    688     val_a: &str,
    689     val_b: &str,
    690     tolerance: f64,
    691     df_a: &DataFrame,
    692     df_b: &DataFrame,
    693     col_name: &str,
    694 ) -> bool {
    695     if let (Ok(a), Ok(b)) = (val_a.parse::<f64>(), val_b.parse::<f64>()) {
    696         // Handle NaN
    697         if a.is_nan() && b.is_nan() {
    698             return true;
    699         }
    700         if a.is_nan() || b.is_nan() {
    701             return false;
    702         }
    703         let type_a = df_a.column(col_name).ok().map(|c| c.dtype().clone());
    704         let type_b = df_b.column(col_name).ok().map(|c| c.dtype().clone());
    705 
    706         let a_is_float = type_a.as_ref().map_or(false, is_float_dtype);
    707         let b_is_float = type_b.as_ref().map_or(false, is_float_dtype);
    708         let a_is_int = type_a.as_ref().map_or(false, is_int_dtype);
    709         let b_is_int = type_b.as_ref().map_or(false, is_int_dtype);
    710 
    711         // Pure integer columns: exact comparison, no tolerance
    712         if a_is_int && b_is_int {
    713             return val_a == val_b;
    714         }
    715         // At least one float, or cross-type int/float: apply tolerance
    716         if a_is_float || b_is_float {
    717             return (a - b).abs() <= tolerance;
    718         }
    719     }
    720     // Fallback: exact string comparison
    721     val_a == val_b
    722 }
    723 ```
    724 
    725 - [ ] **Step 4: Run tests to verify they pass**
    726 
    727 Run: `cargo test --lib diff::tests`
    728 Expected: all 7 tests PASS
    729 
    730 - [ ] **Step 5: Commit**
    731 
    732 ```bash
    733 git add src/diff.rs
    734 git commit -m "feat(xldiff): implement key-based diff with composite keys and duplicate detection"
    735 ```
    736 
    737 ---
    738 
    739 ### Task 7: diff.rs — tolerance tests
    740 
    741 **Files:**
    742 - Modify: `src/diff.rs`
    743 
    744 - [ ] **Step 1: Write tolerance tests**
    745 
    746 Add to the `tests` module:
    747 
    748 ```rust
    749     #[test]
    750     fn test_keyed_tolerance_within() {
    751         let df_a = DataFrame::new(vec![
    752             Series::new("id".into(), &["1"]).into_column(),
    753             Series::new("price".into(), &[100.001f64]).into_column(),
    754         ]).unwrap();
    755         let df_b = DataFrame::new(vec![
    756             Series::new("id".into(), &["1"]).into_column(),
    757             Series::new("price".into(), &[100.002f64]).into_column(),
    758         ]).unwrap();
    759 
    760         let opts = DiffOptions {
    761             key_columns: vec!["id".to_string()],
    762             tolerance: Some(0.01),
    763         };
    764         let result = diff_keyed(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    765         assert!(!result.has_differences(), "values within tolerance should be equal");
    766     }
    767 
    768     #[test]
    769     fn test_keyed_tolerance_exceeded() {
    770         let df_a = DataFrame::new(vec![
    771             Series::new("id".into(), &["1"]).into_column(),
    772             Series::new("price".into(), &[100.0f64]).into_column(),
    773         ]).unwrap();
    774         let df_b = DataFrame::new(vec![
    775             Series::new("id".into(), &["1"]).into_column(),
    776             Series::new("price".into(), &[100.05f64]).into_column(),
    777         ]).unwrap();
    778 
    779         let opts = DiffOptions {
    780             key_columns: vec!["id".to_string()],
    781             tolerance: Some(0.01),
    782         };
    783         let result = diff_keyed(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    784         assert_eq!(result.modified.len(), 1);
    785     }
    786 
    787     #[test]
    788     fn test_keyed_nan_handling() {
    789         let df_a = DataFrame::new(vec![
    790             Series::new("id".into(), &["1"]).into_column(),
    791             Series::new("val".into(), &[f64::NAN]).into_column(),
    792         ]).unwrap();
    793         let df_b = DataFrame::new(vec![
    794             Series::new("id".into(), &["1"]).into_column(),
    795             Series::new("val".into(), &[f64::NAN]).into_column(),
    796         ]).unwrap();
    797 
    798         let opts = DiffOptions {
    799             key_columns: vec!["id".to_string()],
    800             tolerance: Some(0.01),
    801         };
    802         let result = diff_keyed(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    803         assert!(!result.has_differences(), "NaN == NaN should be equal");
    804     }
    805 ```
    806 
    807 - [ ] **Step 2: Run tests**
    808 
    809 Run: `cargo test --lib diff::tests`
    810 Expected: all 10 tests PASS
    811 
    812 - [ ] **Step 3: Commit**
    813 
    814 ```bash
    815 git add src/diff.rs
    816 git commit -m "test(xldiff): add tolerance and NaN handling tests"
    817 ```
    818 
    819 ---
    820 
    821 ### Task 8: diff.rs — public entry point `diff_sheets`
    822 
    823 **Files:**
    824 - Modify: `src/diff.rs`
    825 
    826 - [ ] **Step 1: Write test for the unified entry point**
    827 
    828 Add to tests:
    829 
    830 ```rust
    831     #[test]
    832     fn test_diff_sheets_positional() {
    833         let df_a = DataFrame::new(vec![
    834             Series::new("x".into(), &["a", "b"]).into_column(),
    835         ]).unwrap();
    836         let df_b = DataFrame::new(vec![
    837             Series::new("x".into(), &["a", "c"]).into_column(),
    838         ]).unwrap();
    839 
    840         let opts = DiffOptions::default(); // no key → positional
    841         let result = diff_sheets(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    842         assert_eq!(result.removed.len(), 1);
    843         assert_eq!(result.added.len(), 1);
    844     }
    845 
    846     #[test]
    847     fn test_diff_sheets_keyed() {
    848         let df_a = DataFrame::new(vec![
    849             Series::new("id".into(), &["1"]).into_column(),
    850             Series::new("v".into(), &[10i64]).into_column(),
    851         ]).unwrap();
    852         let df_b = DataFrame::new(vec![
    853             Series::new("id".into(), &["1"]).into_column(),
    854             Series::new("v".into(), &[20i64]).into_column(),
    855         ]).unwrap();
    856 
    857         let opts = DiffOptions { key_columns: vec!["id".to_string()], ..Default::default() };
    858         let result = diff_sheets(&df_a, &df_b, &opts, make_source("a"), make_source("b")).unwrap();
    859         assert_eq!(result.modified.len(), 1);
    860     }
    861 ```
    862 
    863 - [ ] **Step 2: Run tests to verify they fail**
    864 
    865 Run: `cargo test --lib diff::tests::test_diff_sheets`
    866 Expected: FAIL
    867 
    868 - [ ] **Step 3: Implement `diff_sheets`**
    869 
    870 Add to `src/diff.rs`:
    871 
    872 ```rust
    873 /// Compare two DataFrames. Dispatches to positional or key-based mode.
    874 pub fn diff_sheets(
    875     df_a: &DataFrame,
    876     df_b: &DataFrame,
    877     opts: &DiffOptions,
    878     source_a: SheetSource,
    879     source_b: SheetSource,
    880 ) -> anyhow::Result<DiffResult> {
    881     if opts.key_columns.is_empty() {
    882         diff_positional(df_a, df_b, opts, source_a, source_b)
    883     } else {
    884         diff_keyed(df_a, df_b, opts, source_a, source_b)
    885     }
    886 }
    887 ```
    888 
    889 - [ ] **Step 4: Run tests**
    890 
    891 Run: `cargo test --lib diff::tests`
    892 Expected: all 12 tests PASS
    893 
    894 - [ ] **Step 5: Commit**
    895 
    896 ```bash
    897 git add src/diff.rs
    898 git commit -m "feat(xldiff): add diff_sheets entry point dispatching positional/keyed modes"
    899 ```
    900 
    901 ---
    902 
    903 ### Task 9: xldiff.rs — CLI skeleton + file:sheet parsing
    904 
    905 **Files:**
    906 - Create: `src/bin/xldiff.rs`
    907 
    908 - [ ] **Step 1: Write the CLI binary with clap args and file:sheet parsing**
    909 
    910 Create `src/bin/xldiff.rs`:
    911 
    912 ```rust
    913 use std::io::IsTerminal;
    914 use std::path::PathBuf;
    915 use std::process;
    916 
    917 use anyhow::Result;
    918 use clap::Parser;
    919 
    920 use xlcat::diff::{self, DiffOptions, DiffResult, SheetSource};
    921 use xlcat::filter;
    922 use xlcat::metadata;
    923 use xlcat::reader;
    924 
    925 #[derive(Parser)]
    926 #[command(
    927     name = "xldiff",
    928     about = "Compare two Excel sheets and report differences",
    929     version
    930 )]
    931 struct Args {
    932     /// First file, optionally with :sheet suffix (e.g., old.xlsx:Sales)
    933     file_a: String,
    934 
    935     /// Second file, optionally with :sheet suffix
    936     file_b: String,
    937 
    938     /// Comma-separated key columns for row matching (name or column letter).
    939     /// Omit for positional (whole-row) comparison.
    940     #[arg(long)]
    941     key: Option<String>,
    942 
    943     /// Comma-separated columns to compare (name or column letter).
    944     /// Default: all columns. Key columns are always included.
    945     #[arg(long)]
    946     cols: Option<String>,
    947 
    948     /// Rows to skip before header. Single value for both files, or N,M for independent skips.
    949     #[arg(long, default_value = "0")]
    950     skip: String,
    951 
    952     /// Treat first row as data; generate synthetic column_0, column_1, … headers.
    953     /// NOTE: declared for CLI compatibility but not yet implemented — follow-up task.
    954     #[arg(long)]
    955     no_header: bool,
    956 
    957     /// Numeric tolerance for float comparisons (e.g., 0.01)
    958     #[arg(long)]
    959     tolerance: Option<f64>,
    960 
    961     /// Output format: text (default), markdown, json, csv
    962     #[arg(long, default_value = "text")]
    963     format: String,
    964 
    965     /// Force-disable ANSI colors
    966     #[arg(long)]
    967     no_color: bool,
    968 }
    969 
    970 /// Parse a file argument like "path.xlsx:Sheet1" into (path, Option<sheet>).
    971 /// Uses rfind(':') to handle Windows drive letters (C:\...).
    972 fn parse_file_arg(arg: &str) -> (PathBuf, Option<String>) {
    973     if let Some(colon_pos) = arg.rfind(':') {
    974         // Don't split if the colon is part of a Windows drive letter (e.g., C:\)
    975         // A drive letter colon is at position 1 and followed by \ or /
    976         if colon_pos == 1 && arg.len() > 2 && (arg.as_bytes()[2] == b'\\' || arg.as_bytes()[2] == b'/') {
    977             return (PathBuf::from(arg), None);
    978         }
    979         // Don't split if nothing after the colon
    980         if colon_pos == arg.len() - 1 {
    981             return (PathBuf::from(&arg[..colon_pos]), None);
    982         }
    983         let path = PathBuf::from(&arg[..colon_pos]);
    984         let sheet = arg[colon_pos + 1..].to_string();
    985         (path, Some(sheet))
    986     } else {
    987         (PathBuf::from(arg), None)
    988     }
    989 }
    990 
    991 /// Parse the --skip flag: "3" or "3,5"
    992 fn parse_skip(s: &str) -> Result<(usize, usize)> {
    993     if let Some((a, b)) = s.split_once(',') {
    994         let skip_a = a.trim().parse::<usize>().map_err(|_| anyhow::anyhow!("invalid skip value: {}", a))?;
    995         let skip_b = b.trim().parse::<usize>().map_err(|_| anyhow::anyhow!("invalid skip value: {}", b))?;
    996         Ok((skip_a, skip_b))
    997     } else {
    998         let skip = s.trim().parse::<usize>().map_err(|_| anyhow::anyhow!("invalid skip value: {}", s))?;
    999         Ok((skip, skip))
   1000     }
   1001 }
   1002 
   1003 /// Resolve sheet name from user input or default to first sheet.
   1004 fn resolve_sheet(info: &metadata::FileInfo, sheet_arg: Option<&str>) -> Result<String> {
   1005     match sheet_arg {
   1006         None => info.sheets.first()
   1007             .map(|s| s.name.clone())
   1008             .ok_or_else(|| anyhow::anyhow!("workbook has no sheets")),
   1009         Some(s) => {
   1010             // Try exact name match
   1011             if let Some(sheet) = info.sheets.iter().find(|si| si.name == s) {
   1012                 return Ok(sheet.name.clone());
   1013             }
   1014             // Try as 0-based index
   1015             if let Ok(idx) = s.parse::<usize>() {
   1016                 if let Some(sheet) = info.sheets.get(idx) {
   1017                     return Ok(sheet.name.clone());
   1018                 }
   1019             }
   1020             let names: Vec<_> = info.sheets.iter().map(|s| s.name.as_str()).collect();
   1021             anyhow::bail!("sheet '{}' not found. Available: {}", s, names.join(", "))
   1022         }
   1023     }
   1024 }
   1025 
   1026 fn run(args: Args) -> Result<()> {
   1027     let (path_a, sheet_sel_a) = parse_file_arg(&args.file_a);
   1028     let (path_b, sheet_sel_b) = parse_file_arg(&args.file_b);
   1029 
   1030     // Validate files exist
   1031     if !path_a.exists() {
   1032         anyhow::bail!("file not found: {}", path_a.display());
   1033     }
   1034     if !path_b.exists() {
   1035         anyhow::bail!("file not found: {}", path_b.display());
   1036     }
   1037 
   1038     // Validate format
   1039     let format = match args.format.as_str() {
   1040         "text" | "markdown" | "json" | "csv" => args.format.as_str(),
   1041         other => anyhow::bail!("unknown format '{}'. Use: text, markdown, json, csv", other),
   1042     };
   1043 
   1044     // Parse skip
   1045     let (skip_a, skip_b) = parse_skip(&args.skip)?;
   1046 
   1047     // Resolve sheets
   1048     let info_a = metadata::read_file_info(&path_a)?;
   1049     let info_b = metadata::read_file_info(&path_b)?;
   1050     let sheet_a = resolve_sheet(&info_a, sheet_sel_a.as_deref())?;
   1051     let sheet_b = resolve_sheet(&info_b, sheet_sel_b.as_deref())?;
   1052 
   1053     // Read DataFrames
   1054     let df_a = if skip_a > 0 {
   1055         reader::read_sheet_with_skip(&path_a, &sheet_a, skip_a)?
   1056     } else {
   1057         reader::read_sheet(&path_a, &sheet_a)?
   1058     };
   1059     let df_b = if skip_b > 0 {
   1060         reader::read_sheet_with_skip(&path_b, &sheet_b, skip_b)?
   1061     } else {
   1062         reader::read_sheet(&path_b, &sheet_b)?
   1063     };
   1064 
   1065     // Resolve key columns (validate they exist in both files)
   1066     let key_columns: Vec<String> = if let Some(ref key_str) = args.key {
   1067         let specs: Vec<String> = key_str.split(',').map(|s| s.trim().to_string()).collect();
   1068         let cols_a: Vec<String> = df_a.get_column_names().iter().map(|s| s.to_string()).collect();
   1069         let cols_b: Vec<String> = df_b.get_column_names().iter().map(|s| s.to_string()).collect();
   1070         let resolved = filter::resolve_columns(&specs, &cols_a)
   1071             .map_err(|e| anyhow::anyhow!("first file: {}", e))?;
   1072         // Validate key columns also exist in second file
   1073         for k in &resolved {
   1074             if !cols_b.iter().any(|c| c == k) {
   1075                 anyhow::bail!("key column '{}' not found in second file", k);
   1076             }
   1077         }
   1078         resolved
   1079     } else {
   1080         vec![]
   1081     };
   1082 
   1083     // Column filtering
   1084     let (df_a, df_b) = if let Some(ref cols_str) = args.cols {
   1085         let specs: Vec<String> = cols_str.split(',').map(|s| s.trim().to_string()).collect();
   1086         let cols_a: Vec<String> = df_a.get_column_names().iter().map(|s| s.to_string()).collect();
   1087         let mut selected = filter::resolve_columns(&specs, &cols_a)
   1088             .map_err(|e| anyhow::anyhow!(e))?;
   1089         // Ensure key columns are included
   1090         for k in &key_columns {
   1091             if !selected.contains(k) {
   1092                 selected.insert(0, k.clone());
   1093             }
   1094         }
   1095         let sel_refs: Vec<&str> = selected.iter().map(|s| s.as_str()).collect();
   1096         let df_a = df_a.select(&sel_refs)?;
   1097         let df_b = df_b.select(&sel_refs)?;
   1098         (df_a, df_b)
   1099     } else {
   1100         (df_a, df_b)
   1101     };
   1102 
   1103     let source_a = SheetSource {
   1104         file_name: path_a.file_name().unwrap_or_default().to_string_lossy().to_string(),
   1105         sheet_name: sheet_a,
   1106     };
   1107     let source_b = SheetSource {
   1108         file_name: path_b.file_name().unwrap_or_default().to_string_lossy().to_string(),
   1109         sheet_name: sheet_b,
   1110     };
   1111 
   1112     let opts = DiffOptions {
   1113         key_columns,
   1114         tolerance: args.tolerance,
   1115     };
   1116 
   1117     let result = diff::diff_sheets(&df_a, &df_b, &opts, source_a, source_b)?;
   1118 
   1119     // Determine color
   1120     let use_color = !args.no_color && std::io::stdout().is_terminal();
   1121 
   1122     // Format and print
   1123     let output = match format {
   1124         "text" => format_text(&result, use_color),
   1125         "markdown" => format_markdown(&result),
   1126         "json" => format_json(&result),
   1127         "csv" => format_csv(&result),
   1128         _ => unreachable!(),
   1129     };
   1130     print!("{output}");
   1131 
   1132     // Exit code
   1133     if result.has_differences() {
   1134         process::exit(1);
   1135     }
   1136     Ok(())
   1137 }
   1138 
   1139 fn main() {
   1140     let args = Args::parse();
   1141     if let Err(err) = run(args) {
   1142         eprintln!("xldiff: {err}");
   1143         process::exit(2);
   1144     }
   1145 }
   1146 
   1147 // Output formatters — implemented in subsequent tasks
   1148 fn format_text(_result: &DiffResult, _color: bool) -> String { todo!() }
   1149 fn format_markdown(_result: &DiffResult) -> String { todo!() }
   1150 fn format_json(_result: &DiffResult) -> String { todo!() }
   1151 fn format_csv(_result: &DiffResult) -> String { todo!() }
   1152 ```
   1153 
   1154 - [ ] **Step 2: Verify it compiles**
   1155 
   1156 Run: `cargo check --bin xldiff`
   1157 Expected: success (todo!() stubs compile but will panic at runtime)
   1158 
   1159 - [ ] **Step 3: Commit**
   1160 
   1161 ```bash
   1162 git add src/bin/xldiff.rs
   1163 git commit -m "feat(xldiff): add CLI skeleton with file:sheet parsing, skip, key/cols resolution"
   1164 ```
   1165 
   1166 ---
   1167 
   1168 ### Task 10: xldiff.rs — text output formatter
   1169 
   1170 **Files:**
   1171 - Modify: `src/bin/xldiff.rs`
   1172 
   1173 - [ ] **Step 1: Implement `format_text`**
   1174 
   1175 Replace the `format_text` stub:
   1176 
   1177 ```rust
   1178 fn format_text(result: &DiffResult, color: bool) -> String {
   1179     if !result.has_differences() {
   1180         return "No differences found.\n".to_string();
   1181     }
   1182 
   1183     let mut out = String::new();
   1184 
   1185     // Header
   1186     let (red, green, yellow, reset) = if color {
   1187         ("\x1b[31m", "\x1b[32m", "\x1b[33m", "\x1b[0m")
   1188     } else {
   1189         ("", "", "", "")
   1190     };
   1191 
   1192     out.push_str(&format!(
   1193         "--- {} ({})\n+++ {} ({})\n\n",
   1194         result.source_a.sheet_name, result.source_a.file_name,
   1195         result.source_b.sheet_name, result.source_b.file_name,
   1196     ));
   1197 
   1198     // Summary
   1199     out.push_str(&format!(
   1200         "Added: {} | Removed: {} | Modified: {}\n\n",
   1201         result.added.len(),
   1202         result.removed.len(),
   1203         result.modified.len(),
   1204     ));
   1205 
   1206     // Removed rows
   1207     for row in &result.removed {
   1208         out.push_str(red);
   1209         out.push_str("- ");
   1210         out.push_str(&format_row_inline(&result.headers, &row.values));
   1211         out.push_str(reset);
   1212         out.push('\n');
   1213     }
   1214 
   1215     // Added rows
   1216     for row in &result.added {
   1217         out.push_str(green);
   1218         out.push_str("+ ");
   1219         out.push_str(&format_row_inline(&result.headers, &row.values));
   1220         out.push_str(reset);
   1221         out.push('\n');
   1222     }
   1223 
   1224     if (!result.removed.is_empty() || !result.added.is_empty()) && !result.modified.is_empty() {
   1225         out.push('\n');
   1226     }
   1227 
   1228     // Modified rows
   1229     for m in &result.modified {
   1230         out.push_str(yellow);
   1231         out.push_str("~ ");
   1232         // Show key
   1233         let key_display: Vec<String> = result.key_columns.iter()
   1234             .zip(m.key.iter())
   1235             .map(|(col, val)| format!("{}: \"{}\"", col, val))
   1236             .collect();
   1237         out.push_str(&key_display.join("  "));
   1238         out.push_str(reset);
   1239         out.push('\n');
   1240         for change in &m.changes {
   1241             out.push_str(yellow);
   1242             out.push_str(&format!("    {}: \"{}\" → \"{}\"", change.column, change.old_value, change.new_value));
   1243             out.push_str(reset);
   1244             out.push('\n');
   1245         }
   1246     }
   1247 
   1248     out
   1249 }
   1250 
   1251 /// Format a row as inline key-value pairs: Name: "Alice"  Score: "90"
   1252 fn format_row_inline(headers: &[String], values: &[String]) -> String {
   1253     headers.iter()
   1254         .zip(values.iter())
   1255         .map(|(h, v)| format!("{}: \"{}\"", h, v))
   1256         .collect::<Vec<_>>()
   1257         .join("  ")
   1258 }
   1259 ```
   1260 
   1261 - [ ] **Step 2: Verify it compiles**
   1262 
   1263 Run: `cargo check --bin xldiff`
   1264 Expected: success
   1265 
   1266 - [ ] **Step 3: Commit**
   1267 
   1268 ```bash
   1269 git add src/bin/xldiff.rs
   1270 git commit -m "feat(xldiff): implement colored text output formatter"
   1271 ```
   1272 
   1273 ---
   1274 
   1275 ### Task 11: xldiff.rs — markdown output formatter
   1276 
   1277 **Files:**
   1278 - Modify: `src/bin/xldiff.rs`
   1279 
   1280 - [ ] **Step 1: Implement `format_markdown`**
   1281 
   1282 Replace the `format_markdown` stub:
   1283 
   1284 ```rust
   1285 fn format_markdown(result: &DiffResult) -> String {
   1286     use xlcat::formatter;
   1287 
   1288     if !result.has_differences() {
   1289         return "No differences found.\n".to_string();
   1290     }
   1291 
   1292     let mut out = String::new();
   1293 
   1294     // Added
   1295     if !result.added.is_empty() {
   1296         out.push_str(&format!("## Added ({})\n", result.added.len()));
   1297         let rows: Vec<Vec<String>> = result.added.iter().map(|r| r.values.clone()).collect();
   1298         out.push_str(&formatter::render_table(&result.headers, &rows));
   1299         out.push('\n');
   1300     }
   1301 
   1302     // Removed
   1303     if !result.removed.is_empty() {
   1304         out.push_str(&format!("## Removed ({})\n", result.removed.len()));
   1305         let rows: Vec<Vec<String>> = result.removed.iter().map(|r| r.values.clone()).collect();
   1306         out.push_str(&formatter::render_table(&result.headers, &rows));
   1307         out.push('\n');
   1308     }
   1309 
   1310     // Modified
   1311     if !result.modified.is_empty() {
   1312         out.push_str(&format!("## Modified ({})\n", result.modified.len()));
   1313         let key_label = if result.key_columns.len() == 1 {
   1314             format!("Key ({})", result.key_columns[0])
   1315         } else {
   1316             format!("Key ({})", result.key_columns.join(", "))
   1317         };
   1318         let mod_headers = vec![key_label, "Column".to_string(), "Old".to_string(), "New".to_string()];
   1319         let mut mod_rows = Vec::new();
   1320         for m in &result.modified {
   1321             let key_str = m.key.join(", ");
   1322             for (i, change) in m.changes.iter().enumerate() {
   1323                 let key_cell = if i == 0 { key_str.clone() } else { String::new() };
   1324                 mod_rows.push(vec![
   1325                     key_cell,
   1326                     change.column.clone(),
   1327                     change.old_value.clone(),
   1328                     change.new_value.clone(),
   1329                 ]);
   1330             }
   1331         }
   1332         out.push_str(&formatter::render_table(&mod_headers, &mod_rows));
   1333         out.push('\n');
   1334     }
   1335 
   1336     out
   1337 }
   1338 ```
   1339 
   1340 - [ ] **Step 2: Verify it compiles**
   1341 
   1342 Run: `cargo check --bin xldiff`
   1343 Expected: success
   1344 
   1345 - [ ] **Step 3: Commit**
   1346 
   1347 ```bash
   1348 git add src/bin/xldiff.rs
   1349 git commit -m "feat(xldiff): implement markdown output formatter"
   1350 ```
   1351 
   1352 ---
   1353 
   1354 ### Task 12: xldiff.rs — JSON output formatter
   1355 
   1356 **Files:**
   1357 - Modify: `src/bin/xldiff.rs`
   1358 
   1359 - [ ] **Step 1: Implement `format_json`**
   1360 
   1361 Replace the `format_json` stub:
   1362 
   1363 ```rust
   1364 fn format_json(result: &DiffResult) -> String {
   1365     use serde_json::{json, Map, Value};
   1366 
   1367     let row_to_obj = |row: &diff::DiffRow| -> Value {
   1368         let mut map = Map::new();
   1369         for (h, v) in result.headers.iter().zip(row.values.iter()) {
   1370             map.insert(h.clone(), Value::String(v.clone()));
   1371         }
   1372         Value::Object(map)
   1373     };
   1374 
   1375     let added: Vec<Value> = result.added.iter().map(row_to_obj).collect();
   1376     let removed: Vec<Value> = result.removed.iter().map(row_to_obj).collect();
   1377 
   1378     let modified: Vec<Value> = result.modified.iter().map(|m| {
   1379         let mut key_map = Map::new();
   1380         for (col, val) in result.key_columns.iter().zip(m.key.iter()) {
   1381             key_map.insert(col.clone(), Value::String(val.clone()));
   1382         }
   1383         let changes: Vec<Value> = m.changes.iter().map(|c| {
   1384             json!({
   1385                 "column": c.column,
   1386                 "old": c.old_value,
   1387                 "new": c.new_value,
   1388             })
   1389         }).collect();
   1390         json!({
   1391             "key": Value::Object(key_map),
   1392             "changes": changes,
   1393         })
   1394     }).collect();
   1395 
   1396     let obj = json!({
   1397         "added": added,
   1398         "removed": removed,
   1399         "modified": modified,
   1400     });
   1401 
   1402     serde_json::to_string_pretty(&obj).unwrap() + "\n"
   1403 }
   1404 ```
   1405 
   1406 - [ ] **Step 2: Verify it compiles**
   1407 
   1408 Run: `cargo check --bin xldiff`
   1409 Expected: success
   1410 
   1411 - [ ] **Step 3: Commit**
   1412 
   1413 ```bash
   1414 git add src/bin/xldiff.rs
   1415 git commit -m "feat(xldiff): implement JSON output formatter"
   1416 ```
   1417 
   1418 ---
   1419 
   1420 ### Task 13: xldiff.rs — CSV output formatter
   1421 
   1422 **Files:**
   1423 - Modify: `src/bin/xldiff.rs`
   1424 
   1425 - [ ] **Step 1: Implement `format_csv`**
   1426 
   1427 Replace the `format_csv` stub:
   1428 
   1429 ```rust
   1430 fn format_csv(result: &DiffResult) -> String {
   1431     let mut out = String::new();
   1432 
   1433     // Header: _status, col1, col2, ..., _old_col1, _old_col2, ...
   1434     let mut header_parts = vec!["_status".to_string()];
   1435     header_parts.extend(result.headers.clone());
   1436     for h in &result.headers {
   1437         header_parts.push(format!("_old_{}", h));
   1438     }
   1439     out.push_str(&csv_row(&header_parts));
   1440 
   1441     // Added rows
   1442     for row in &result.added {
   1443         let mut parts = vec!["added".to_string()];
   1444         parts.extend(row.values.clone());
   1445         // Empty _old_ columns
   1446         for _ in &result.headers {
   1447             parts.push(String::new());
   1448         }
   1449         out.push_str(&csv_row(&parts));
   1450     }
   1451 
   1452     // Removed rows
   1453     for row in &result.removed {
   1454         let mut parts = vec!["removed".to_string()];
   1455         parts.extend(row.values.clone());
   1456         // Empty _old_ columns
   1457         for _ in &result.headers {
   1458             parts.push(String::new());
   1459         }
   1460         out.push_str(&csv_row(&parts));
   1461     }
   1462 
   1463     // Modified rows
   1464     for m in &result.modified {
   1465         let mut main_vals = vec![String::new(); result.headers.len()];
   1466         let mut old_vals = vec![String::new(); result.headers.len()];
   1467 
   1468         // Fill key columns in main values
   1469         for (key_col, key_val) in result.key_columns.iter().zip(m.key.iter()) {
   1470             if let Some(idx) = result.headers.iter().position(|h| h == key_col) {
   1471                 main_vals[idx] = key_val.clone();
   1472             }
   1473         }
   1474 
   1475         // Fill changed columns
   1476         for change in &m.changes {
   1477             if let Some(idx) = result.headers.iter().position(|h| h == &change.column) {
   1478                 main_vals[idx] = change.new_value.clone();
   1479                 old_vals[idx] = change.old_value.clone();
   1480             }
   1481         }
   1482 
   1483         let mut parts = vec!["modified".to_string()];
   1484         parts.extend(main_vals);
   1485         parts.extend(old_vals);
   1486         out.push_str(&csv_row(&parts));
   1487     }
   1488 
   1489     out
   1490 }
   1491 
   1492 /// Format a single CSV row with RFC 4180 quoting.
   1493 fn csv_row(values: &[String]) -> String {
   1494     let escaped: Vec<String> = values.iter().map(|v| {
   1495         if v.contains(',') || v.contains('"') || v.contains('\n') {
   1496             format!("\"{}\"", v.replace('"', "\"\""))
   1497         } else {
   1498             v.clone()
   1499         }
   1500     }).collect();
   1501     escaped.join(",") + "\n"
   1502 }
   1503 ```
   1504 
   1505 - [ ] **Step 2: Verify it compiles**
   1506 
   1507 Run: `cargo check --bin xldiff`
   1508 Expected: success
   1509 
   1510 - [ ] **Step 3: Commit**
   1511 
   1512 ```bash
   1513 git add src/bin/xldiff.rs
   1514 git commit -m "feat(xldiff): implement CSV output formatter"
   1515 ```
   1516 
   1517 ---
   1518 
   1519 ### Task 14: Test fixtures
   1520 
   1521 **Files:**
   1522 - Modify: `tests/common/mod.rs`
   1523 
   1524 - [ ] **Step 1: Add diff test fixtures**
   1525 
   1526 Append to `tests/common/mod.rs`:
   1527 
   1528 ```rust
   1529 /// Create a pair of files for positional diff testing.
   1530 /// File A: Alice/90, Bob/80, Charlie/70
   1531 /// File B: Alice/90, Charlie/70, Dana/85 (Bob removed, Dana added)
   1532 pub fn create_diff_pair(path_a: &Path, path_b: &Path) {
   1533     let mut wb = Workbook::new();
   1534     let ws = wb.add_worksheet().set_name("Data").unwrap();
   1535     ws.write_string(0, 0, "Name").unwrap();
   1536     ws.write_string(0, 1, "Score").unwrap();
   1537     ws.write_string(1, 0, "Alice").unwrap();
   1538     ws.write_number(1, 1, 90.0).unwrap();
   1539     ws.write_string(2, 0, "Bob").unwrap();
   1540     ws.write_number(2, 1, 80.0).unwrap();
   1541     ws.write_string(3, 0, "Charlie").unwrap();
   1542     ws.write_number(3, 1, 70.0).unwrap();
   1543     wb.save(path_a).unwrap();
   1544 
   1545     let mut wb = Workbook::new();
   1546     let ws = wb.add_worksheet().set_name("Data").unwrap();
   1547     ws.write_string(0, 0, "Name").unwrap();
   1548     ws.write_string(0, 1, "Score").unwrap();
   1549     ws.write_string(1, 0, "Alice").unwrap();
   1550     ws.write_number(1, 1, 90.0).unwrap();
   1551     ws.write_string(2, 0, "Charlie").unwrap();
   1552     ws.write_number(2, 1, 70.0).unwrap();
   1553     ws.write_string(3, 0, "Dana").unwrap();
   1554     ws.write_number(3, 1, 85.0).unwrap();
   1555     wb.save(path_b).unwrap();
   1556 }
   1557 
   1558 /// Create a pair for key-based diff testing.
   1559 /// File A: ID=1/Score=90, ID=2/Score=80, ID=3/Score=70
   1560 /// File B: ID=1/Score=95, ID=2/Score=80, ID=4/Score=85
   1561 /// Expected: ID=1 modified (90→95), ID=3 removed, ID=4 added
   1562 pub fn create_diff_pair_with_keys(path_a: &Path, path_b: &Path) {
   1563     let mut wb = Workbook::new();
   1564     let ws = wb.add_worksheet().set_name("Data").unwrap();
   1565     ws.write_string(0, 0, "ID").unwrap();
   1566     ws.write_string(0, 1, "Name").unwrap();
   1567     ws.write_string(0, 2, "Score").unwrap();
   1568     ws.write_string(1, 0, "1").unwrap();
   1569     ws.write_string(1, 1, "Alice").unwrap();
   1570     ws.write_number(1, 2, 90.0).unwrap();
   1571     ws.write_string(2, 0, "2").unwrap();
   1572     ws.write_string(2, 1, "Bob").unwrap();
   1573     ws.write_number(2, 2, 80.0).unwrap();
   1574     ws.write_string(3, 0, "3").unwrap();
   1575     ws.write_string(3, 1, "Charlie").unwrap();
   1576     ws.write_number(3, 2, 70.0).unwrap();
   1577     wb.save(path_a).unwrap();
   1578 
   1579     let mut wb = Workbook::new();
   1580     let ws = wb.add_worksheet().set_name("Data").unwrap();
   1581     ws.write_string(0, 0, "ID").unwrap();
   1582     ws.write_string(0, 1, "Name").unwrap();
   1583     ws.write_string(0, 2, "Score").unwrap();
   1584     ws.write_string(1, 0, "1").unwrap();
   1585     ws.write_string(1, 1, "Alice").unwrap();
   1586     ws.write_number(1, 2, 95.0).unwrap();
   1587     ws.write_string(2, 0, "2").unwrap();
   1588     ws.write_string(2, 1, "Bob").unwrap();
   1589     ws.write_number(2, 2, 80.0).unwrap();
   1590     ws.write_string(3, 0, "4").unwrap();
   1591     ws.write_string(3, 1, "Dana").unwrap();
   1592     ws.write_number(3, 2, 85.0).unwrap();
   1593     wb.save(path_b).unwrap();
   1594 }
   1595 
   1596 /// Create a pair for tolerance testing.
   1597 /// File A: ID=1/Price=100.001, ID=2/Price=200.5
   1598 /// File B: ID=1/Price=100.002, ID=2/Price=200.6
   1599 /// With tolerance 0.01: only ID=2 should be modified (diff=0.1 > 0.01)
   1600 pub fn create_diff_pair_with_floats(path_a: &Path, path_b: &Path) {
   1601     let mut wb = Workbook::new();
   1602     let ws = wb.add_worksheet().set_name("Data").unwrap();
   1603     ws.write_string(0, 0, "ID").unwrap();
   1604     ws.write_string(0, 1, "Price").unwrap();
   1605     ws.write_string(1, 0, "1").unwrap();
   1606     ws.write_number(1, 1, 100.001).unwrap();
   1607     ws.write_string(2, 0, "2").unwrap();
   1608     ws.write_number(2, 1, 200.5).unwrap();
   1609     wb.save(path_a).unwrap();
   1610 
   1611     let mut wb = Workbook::new();
   1612     let ws = wb.add_worksheet().set_name("Data").unwrap();
   1613     ws.write_string(0, 0, "ID").unwrap();
   1614     ws.write_string(0, 1, "Price").unwrap();
   1615     ws.write_string(1, 0, "1").unwrap();
   1616     ws.write_number(1, 1, 100.002).unwrap();
   1617     ws.write_string(2, 0, "2").unwrap();
   1618     ws.write_number(2, 1, 200.6).unwrap();
   1619     wb.save(path_b).unwrap();
   1620 }
   1621 ```
   1622 
   1623 - [ ] **Step 2: Verify compilation**
   1624 
   1625 Run: `cargo test --no-run`
   1626 Expected: compiles successfully
   1627 
   1628 - [ ] **Step 3: Commit**
   1629 
   1630 ```bash
   1631 git add tests/common/mod.rs
   1632 git commit -m "test(xldiff): add diff test fixtures — positional, keyed, and float tolerance pairs"
   1633 ```
   1634 
   1635 ---
   1636 
   1637 ### Task 15: Integration tests
   1638 
   1639 **Files:**
   1640 - Create: `tests/test_xldiff.rs`
   1641 
   1642 - [ ] **Step 1: Write integration tests**
   1643 
   1644 Create `tests/test_xldiff.rs`:
   1645 
   1646 ```rust
   1647 mod common;
   1648 
   1649 use assert_cmd::Command;
   1650 use predicates::prelude::*;
   1651 use tempfile::TempDir;
   1652 
   1653 #[test]
   1654 fn test_no_diff() {
   1655     let dir = TempDir::new().unwrap();
   1656     let path_a = dir.path().join("a.xlsx");
   1657     let path_b = dir.path().join("b.xlsx");
   1658     // Use same data for both
   1659     common::create_simple(&path_a);
   1660     common::create_simple(&path_b);
   1661 
   1662     Command::cargo_bin("xldiff").unwrap()
   1663         .arg(path_a.to_str().unwrap())
   1664         .arg(path_b.to_str().unwrap())
   1665         .arg("--no-color")
   1666         .assert()
   1667         .success() // exit 0
   1668         .stdout(predicate::str::contains("No differences found."));
   1669 }
   1670 
   1671 #[test]
   1672 fn test_positional_diff() {
   1673     let dir = TempDir::new().unwrap();
   1674     let path_a = dir.path().join("a.xlsx");
   1675     let path_b = dir.path().join("b.xlsx");
   1676     common::create_diff_pair(&path_a, &path_b);
   1677 
   1678     Command::cargo_bin("xldiff").unwrap()
   1679         .arg(path_a.to_str().unwrap())
   1680         .arg(path_b.to_str().unwrap())
   1681         .arg("--no-color")
   1682         .assert()
   1683         .code(1) // differences found
   1684         .stdout(predicate::str::contains("Removed: 1"))
   1685         .stdout(predicate::str::contains("Added: 1"))
   1686         .stdout(predicate::str::contains("Bob"));
   1687 }
   1688 
   1689 #[test]
   1690 fn test_keyed_diff() {
   1691     let dir = TempDir::new().unwrap();
   1692     let path_a = dir.path().join("a.xlsx");
   1693     let path_b = dir.path().join("b.xlsx");
   1694     common::create_diff_pair_with_keys(&path_a, &path_b);
   1695 
   1696     Command::cargo_bin("xldiff").unwrap()
   1697         .arg(path_a.to_str().unwrap())
   1698         .arg(path_b.to_str().unwrap())
   1699         .arg("--key").arg("ID")
   1700         .arg("--no-color")
   1701         .assert()
   1702         .code(1)
   1703         .stdout(predicate::str::contains("Modified: 1"))
   1704         .stdout(predicate::str::contains("Removed: 1"))
   1705         .stdout(predicate::str::contains("Added: 1"));
   1706 }
   1707 
   1708 #[test]
   1709 fn test_tolerance() {
   1710     let dir = TempDir::new().unwrap();
   1711     let path_a = dir.path().join("a.xlsx");
   1712     let path_b = dir.path().join("b.xlsx");
   1713     common::create_diff_pair_with_floats(&path_a, &path_b);
   1714 
   1715     // With tolerance=0.01, ID=1 diff (0.001) within tolerance, ID=2 diff (0.1) exceeds
   1716     Command::cargo_bin("xldiff").unwrap()
   1717         .arg(path_a.to_str().unwrap())
   1718         .arg(path_b.to_str().unwrap())
   1719         .arg("--key").arg("ID")
   1720         .arg("--tolerance").arg("0.01")
   1721         .arg("--no-color")
   1722         .assert()
   1723         .code(1)
   1724         .stdout(predicate::str::contains("Modified: 1"));
   1725 }
   1726 
   1727 #[test]
   1728 fn test_json_format() {
   1729     let dir = TempDir::new().unwrap();
   1730     let path_a = dir.path().join("a.xlsx");
   1731     let path_b = dir.path().join("b.xlsx");
   1732     common::create_diff_pair_with_keys(&path_a, &path_b);
   1733 
   1734     Command::cargo_bin("xldiff").unwrap()
   1735         .arg(path_a.to_str().unwrap())
   1736         .arg(path_b.to_str().unwrap())
   1737         .arg("--key").arg("ID")
   1738         .arg("--format").arg("json")
   1739         .assert()
   1740         .code(1)
   1741         .stdout(predicate::str::contains("\"added\""))
   1742         .stdout(predicate::str::contains("\"removed\""))
   1743         .stdout(predicate::str::contains("\"modified\""));
   1744 }
   1745 
   1746 #[test]
   1747 fn test_markdown_format() {
   1748     let dir = TempDir::new().unwrap();
   1749     let path_a = dir.path().join("a.xlsx");
   1750     let path_b = dir.path().join("b.xlsx");
   1751     common::create_diff_pair_with_keys(&path_a, &path_b);
   1752 
   1753     Command::cargo_bin("xldiff").unwrap()
   1754         .arg(path_a.to_str().unwrap())
   1755         .arg(path_b.to_str().unwrap())
   1756         .arg("--key").arg("ID")
   1757         .arg("--format").arg("markdown")
   1758         .assert()
   1759         .code(1)
   1760         .stdout(predicate::str::contains("## Added"))
   1761         .stdout(predicate::str::contains("## Removed"))
   1762         .stdout(predicate::str::contains("## Modified"));
   1763 }
   1764 
   1765 #[test]
   1766 fn test_csv_format() {
   1767     let dir = TempDir::new().unwrap();
   1768     let path_a = dir.path().join("a.xlsx");
   1769     let path_b = dir.path().join("b.xlsx");
   1770     common::create_diff_pair_with_keys(&path_a, &path_b);
   1771 
   1772     Command::cargo_bin("xldiff").unwrap()
   1773         .arg(path_a.to_str().unwrap())
   1774         .arg(path_b.to_str().unwrap())
   1775         .arg("--key").arg("ID")
   1776         .arg("--format").arg("csv")
   1777         .assert()
   1778         .code(1)
   1779         .stdout(predicate::str::contains("_status"))
   1780         .stdout(predicate::str::contains("added"))
   1781         .stdout(predicate::str::contains("removed"))
   1782         .stdout(predicate::str::contains("modified"));
   1783 }
   1784 
   1785 #[test]
   1786 fn test_file_not_found() {
   1787     Command::cargo_bin("xldiff").unwrap()
   1788         .arg("nonexistent.xlsx")
   1789         .arg("also_nonexistent.xlsx")
   1790         .assert()
   1791         .code(2)
   1792         .stderr(predicate::str::contains("file not found"));
   1793 }
   1794 
   1795 #[test]
   1796 fn test_sheet_selector() {
   1797     let dir = TempDir::new().unwrap();
   1798     let path = dir.path().join("multi.xlsx");
   1799     common::create_multi_sheet(&path);
   1800     let path_str = path.to_str().unwrap();
   1801 
   1802     // Compare Revenue vs Expenses sheets within same file
   1803     Command::cargo_bin("xldiff").unwrap()
   1804         .arg(format!("{}:Revenue", path_str))
   1805         .arg(format!("{}:Expenses", path_str))
   1806         .arg("--no-color")
   1807         .assert()
   1808         .code(1); // different data → exit 1
   1809 }
   1810 
   1811 #[test]
   1812 fn test_cols_filter() {
   1813     let dir = TempDir::new().unwrap();
   1814     let path_a = dir.path().join("a.xlsx");
   1815     let path_b = dir.path().join("b.xlsx");
   1816     common::create_diff_pair_with_keys(&path_a, &path_b);
   1817 
   1818     // Only compare Name column (ignore Score), so ID=1 shows no modification
   1819     Command::cargo_bin("xldiff").unwrap()
   1820         .arg(path_a.to_str().unwrap())
   1821         .arg(path_b.to_str().unwrap())
   1822         .arg("--key").arg("ID")
   1823         .arg("--cols").arg("ID,Name")
   1824         .arg("--no-color")
   1825         .assert()
   1826         .code(1)
   1827         // ID=1 has same Name in both, so no modification for that row
   1828         .stdout(predicate::str::contains("Modified: 0").or(predicate::str::contains("Modified").not()));
   1829 }
   1830 ```
   1831 
   1832 - [ ] **Step 2: Run integration tests**
   1833 
   1834 Run: `cargo test --test test_xldiff`
   1835 Expected: all tests PASS
   1836 
   1837 - [ ] **Step 3: Commit**
   1838 
   1839 ```bash
   1840 git add tests/test_xldiff.rs
   1841 git commit -m "test(xldiff): add integration tests — all modes, formats, tolerance, errors"
   1842 ```
   1843 
   1844 ---
   1845 
   1846 ### Task 16: Update CLAUDE.md and README
   1847 
   1848 **Files:**
   1849 - Modify: `CLAUDE.md`
   1850 
   1851 - [ ] **Step 1: Update CLAUDE.md**
   1852 
   1853 Update the "Current tools" section to include xldiff, remove it from "Related tools" and "Planned":
   1854 
   1855 In CLAUDE.md, update:
   1856 - Add `- **xldiff** — compare two sheets, report added/removed/modified rows` to "Current tools"
   1857 - Remove or update the "Related tools" xldiff entry
   1858 - Remove xlfilter from "Planned" if still there (it's already built)
   1859 - Update the Architecture line to mention the diff module
   1860 
   1861 - [ ] **Step 2: Commit**
   1862 
   1863 ```bash
   1864 git add CLAUDE.md
   1865 git commit -m "docs: update CLAUDE.md — add xldiff to current tools, remove from planned"
   1866 ```
   1867 
   1868 ---
   1869 
   1870 ### Task 17: Final verification
   1871 
   1872 - [ ] **Step 1: Run all tests**
   1873 
   1874 Run: `cargo test`
   1875 Expected: all tests pass (existing + new)
   1876 
   1877 - [ ] **Step 2: Run clippy**
   1878 
   1879 Run: `cargo clippy --all-targets`
   1880 Expected: no warnings
   1881 
   1882 - [ ] **Step 3: Build release binary**
   1883 
   1884 Run: `cargo build --release --bin xldiff`
   1885 Expected: binary created at `target/release/xldiff`
   1886 
   1887 - [ ] **Step 4: Manual smoke test**
   1888 
   1889 Run: `cargo run --bin xldiff -- demo/budget.xlsx demo/sales.xlsx --no-color`
   1890 Expected: shows differences between the two demo files
   1891 
   1892 - [ ] **Step 5: Commit any fixes**
   1893 
   1894 If clippy or tests revealed issues, fix and commit.