BazerUtils.jl

Assorted Julia utilities including custom logging
Log | Files | Refs | README | LICENSE

2026-03-24-logger-cleanup-design.md (9661B)


      1 # BazerUtils.jl Logger Cleanup & Format Expansion
      2 
      3 **Date:** 2026-03-24
      4 **Scope:** CustomLogger.jl rewrite — robustness, new formats, breaking changes
      5 **Version target:** v0.11.0 (breaking)
      6 
      7 ## Summary
      8 
      9 Rewrite the logging subsystem to fix robustness issues, add three new output formats (JSON, logfmt, log4j_standard), rename the misleading `:log4j` symbol to `:oneline`, introduce exact-level file filtering, and refactor the format dispatch from if/elseif chains to multiple dispatch on format types.
     10 
     11 ## Format Lineup
     12 
     13 Six active formats plus one deprecated alias:
     14 
     15 | Symbol | Description | Use case |
     16 |--------|-------------|----------|
     17 | `:pretty` | Box-drawing + ANSI colors | Stdout, human reading |
     18 | `:oneline` | Single-line with timestamp, level, module, file:line, message | File logs, simple grep |
     19 | `:syslog` | RFC 5424 syslog | Syslog collectors |
     20 | `:json` | One JSON object per line (hand-rolled, no JSON.jl dependency) | Structured log aggregation (ELK, Datadog, Loki) |
     21 | `:logfmt` | `key=value` pairs | Splunk, Heroku, grep-friendly structured |
     22 | `:log4j_standard` | Actual Apache Log4j PatternLayout (`%d %-5p [%t] %c - %m%n`) | Java tooling interop |
     23 | `:log4j` | **Deprecated alias** for `:oneline` — emits one-time `Base.depwarn` | Backwards compat only |
     24 
     25 ### JSON format (zero-dependency)
     26 
     27 Hand-rolled serializer. Log records are flat with known types (strings, ints, symbols), so we only need string escaping + concatenation. No JSON.jl required.
     28 
     29 Output:
     30 ```json
     31 {"timestamp":"2024-01-15T14:30:00","level":"ERROR","module":"MyApp","file":"src/app.jl","line":42,"message":"Something failed"}
     32 ```
     33 
     34 ### logfmt format
     35 
     36 ```
     37 ts=2024-01-15T14:30:00 level=error module=MyApp file=src/app.jl line=42 msg="Something failed"
     38 ```
     39 
     40 ### log4j_standard format
     41 
     42 Follows Apache Log4j PatternLayout convention:
     43 ```
     44 2024-01-15 14:30:00,123 ERROR [1] MyApp - Something failed
     45 ```
     46 
     47 Where `[1]` is the Julia thread ID (analogous to Java's `[%t]` thread name).
     48 
     49 ### Deprecation timeline
     50 
     51 - **v0.11.0 (now):** `:log4j` = deprecated alias for `:oneline`, `:log4j_standard` = real Apache format
     52 - **~March 2027:** Remove `:log4j` alias, rename `:log4j_standard` to `:log4j`, bump major version
     53 
     54 ## Bug Fixes & Robustness
     55 
     56 ### 1. FileSink finalizer
     57 
     58 Register `finalizer(close, obj)` in the `FileSink` constructor so GC cleans up leaked file handles. The `close` method must guard against closing `stdout`/`stderr` (defensive check: `if io !== stdout && io !== stderr`).
     59 
     60 ### 2. Thread safety
     61 
     62 Add `locks::Vector{ReentrantLock}` to `FileSink`. When `create_files=false` (all streams write to the same file), use a **single shared lock** for all streams. When `create_files=true` (separate files), use one lock per stream. Wrap all `println(io, ...)` calls in `lock(lk) do ... end`. `ReentrantLock` (not `SpinLock`) to handle re-entrant logging from `show` methods.
     63 
     64 Implementation: deduplicate IO handles when all files point to the same path — open the file once and share the IO + lock across all slots. This eliminates the aliased-file race condition.
     65 
     66 ### 3. `_module=nothing` in all formats
     67 
     68 All six format methods must defensively handle `log_record._module === nothing`, falling back to `"unknown"`. Currently only `format_log4j` has this fix (v0.10.1). The three new formats (json, logfmt, log4j_standard) must include it from the start, and `format_pretty` needs the same fix applied.
     69 
     70 ### 4. Unknown log_format throws
     71 
     72 `custom_format` will `throw(ArgumentError("Unknown log_format: :$log_format. Valid options: :pretty, :oneline, :syslog, :json, :logfmt, :log4j_standard"))` instead of silently producing no output.
     73 
     74 ### 5. reformat_msg double-call
     75 
     76 Remove the dead `reformat_msg` call at top of `custom_format` (line 292). Each format function already calls it independently.
     77 
     78 ### 6. get_log_filenames(::Vector) inconsistency
     79 
     80 Require exactly `length(file_loggers)` files. Throw `ArgumentError` for any other count. Remove the warn-for->4-then-truncate path.
     81 
     82 ### 7. Single timestamp per log entry
     83 
     84 Call `now()` once in `custom_format` and pass the resulting `DateTime` to each format function. All three existing format functions (`format_pretty` line 370, `format_log4j` line 397, `format_syslog` line 425) currently call `now()` independently — these internal calls must all be removed and replaced with the passed-in timestamp parameter. Eliminates timestamp drift between capture and format.
     85 
     86 ## Breaking Changes
     87 
     88 ### cascading_loglevels
     89 
     90 New kwarg `cascading_loglevels::Bool=false`:
     91 
     92 - **`false` (new default):** Each file logger captures only its exact level. The error file gets errors only, the warn file gets warns only, etc. Implemented by replacing `MinLevelLogger` with an `EarlyFilteredLogger` that checks `log.level == target_level`. Non-standard log levels (arbitrary integers) are silently dropped under exact filtering — this is acceptable since custom levels are rare and users who need them can use `cascading_loglevels=true`.
     93 - **`true` (old behavior):** Uses `MinLevelLogger` so each file captures its level and above (debug file gets everything, info gets info+warn+error, etc.).
     94 
     95 ### log_format default symbol
     96 
     97 Default value changes from `:log4j` to `:oneline`. Functionally identical — same format, different name.
     98 
     99 ## Architecture: Multiple Dispatch Refactor
    100 
    101 Replace the if/elseif chain in `custom_format` with dispatch on format types:
    102 
    103 ```julia
    104 abstract type LogFormat end
    105 struct PrettyFormat <: LogFormat end
    106 struct OnelineFormat <: LogFormat end
    107 struct SyslogFormat <: LogFormat end
    108 struct JsonFormat <: LogFormat end
    109 struct LogfmtFormat <: LogFormat end
    110 struct Log4jStandardFormat <: LogFormat end
    111 ```
    112 
    113 A `resolve_format` function maps symbols to types (handling the `:log4j` deprecation here):
    114 
    115 ```julia
    116 function resolve_format(s::Symbol)::LogFormat
    117     s == :pretty && return PrettyFormat()
    118     s == :oneline && return OnelineFormat()
    119     s == :log4j && (Base.depwarn("...", :log4j); return OnelineFormat())
    120     s == :syslog && return SyslogFormat()
    121     s == :json && return JsonFormat()
    122     s == :logfmt && return LogfmtFormat()
    123     s == :log4j_standard && return Log4jStandardFormat()
    124     throw(ArgumentError("Unknown log_format: :$s"))
    125 end
    126 ```
    127 
    128 Each format implements a method that **writes directly to `io`** (not returns a string):
    129 ```julia
    130 format_log(io, fmt::PrettyFormat, log_record, timestamp; kwargs...)
    131 ```
    132 
    133 All format methods write directly to `io` so that the thread-safe lock wraps the entire format+write. The pretty format's multi-line box-drawing output (header + continuation lines + last line) is written as multiple `println` calls within the same locked block.
    134 
    135 **`reformat_msg` refactor:** The current `reformat_msg` has a format-aware branch (`:color=>true` for pretty). Under the new dispatch, `reformat_msg` becomes format-unaware — it always returns a plain string. Pretty format adds color via ANSI codes in its own `format_log` method, not via `IOContext(:color=>true)` in `reformat_msg`.
    136 
    137 **`shorten_path` applicability:** Only `:oneline` uses `shorten_path` (as the current `:log4j` does). The new formats (`:json`, `:logfmt`, `:log4j_standard`) use the raw file path. All formats are available for both `log_format` (file) and `log_format_stdout`.
    138 
    139 **Deprecation warning text:** `":log4j is deprecated, use :oneline for single-line format or :log4j_standard for Apache Log4j format. :log4j will be removed in a future major version."`
    140 
    141 **File organization:** All format types, `resolve_format`, and `format_log` methods stay in `CustomLogger.jl`. The file is small enough that splitting is unnecessary.
    142 
    143 Shared logic (timestamp formatting, message reformatting, single-lining) lives in helper functions called by the format methods, not in a shared preamble.
    144 
    145 ## Thread Safety Architecture
    146 
    147 The locking must happen at the `FormatLogger` callback level, since that's where `println(io, ...)` occurs. The `FileSink` owns the locks, and the format callback closures capture both the IO stream and its corresponding lock:
    148 
    149 ```julia
    150 format_log_file = (io, log_record) -> lock(sink.locks[i]) do
    151     custom_format(io, fmt, log_record, now(); kwargs...)
    152 end
    153 ```
    154 
    155 This ensures the entire format+write is atomic per stream.
    156 
    157 ## Testing Plan
    158 
    159 ### New format tests
    160 - `:json` — verify output parses as valid JSON, all expected keys present, string escaping works (newlines, quotes, backslashes in messages)
    161 - `:logfmt` — verify key=value structure, quoted values with spaces, no unescaped quotes
    162 - `:log4j_standard` — verify matches `%d %-5p [%t] %c - %m%n` pattern
    163 - `:oneline` — existing tests adapted to use new symbol name
    164 
    165 ### Behavior tests
    166 - `cascading_loglevels=false` — each file has only its level's messages; **negative assertions** that error file does NOT contain warn/info/debug, warn file does NOT contain error/info/debug, etc.
    167 - `cascading_loglevels=true` — old cascading behavior preserved; error file has errors only, warn file has warn+error, info has info+warn+error, debug has everything
    168 - Unknown format symbol — throws `ArgumentError`
    169 - `:log4j` symbol — works but emits deprecation warning
    170 - `_module=nothing` — works in all six formats
    171 
    172 ### Thread safety test
    173 - Spawn N tasks each logging M messages concurrently
    174 - Read the log file, verify no interleaved partial lines (every line is a complete log entry)
    175 
    176 ### Robustness tests
    177 - FileSink finalizer — verify files are closeable after GC (hard to test deterministically, but verify `close(sink)` works)
    178 
    179 ## Out of Scope (Future PRs)
    180 
    181 - **Log rotation** — file size limits, file renaming, count limits
    182 - **`:log4j` alias removal** — ~March 2027, rename `:log4j_standard` to `:log4j`