BazerUtils.jl

Assorted Julia utilities including custom logging
Log | Files | Refs | README | LICENSE

2026-03-24-log-rotation-design.md (7601B)


      1 # Log Rotation Design
      2 
      3 **Date:** 2026-03-24
      4 **Scope:** Size-based log rotation for FileSink
      5 **Version target:** v0.11.1 (non-breaking, additive)
      6 
      7 ## Summary
      8 
      9 Add opt-in size-based log rotation to `FileSink`. When a log file exceeds `max_size` bytes, it is closed, renamed with a timestamp suffix, and a fresh file is opened. Old rotated files can be automatically cleaned up via `max_files`.
     10 
     11 ## Public API
     12 
     13 Two new kwargs on the convenience constructors of `custom_logger` (the `filename` and batch-mode methods). These kwargs are threaded into `FileSink` — the core `custom_logger(sink::LogSink; ...)` method does not need them since they live on the sink.
     14 
     15 | Kwarg | Type | Default | Description |
     16 |-------|------|---------|-------------|
     17 | `max_size` | `Int` | `0` (disabled) | Max bytes per log file before rotation |
     18 | `max_files` | `Int` | `0` (no limit) | Max rotated files to keep per log path. `0` = keep all |
     19 
     20 Rotation is opt-in — `max_size=0` means no rotation (current behavior unchanged).
     21 
     22 **Validation:** `max_size < 0` or `max_files < 0` throws `ArgumentError`. No minimum `max_size` enforced, but values under 1024 emit a warning.
     23 
     24 ```julia
     25 custom_logger("/tmp/myapp";
     26     create_log_files=true,
     27     max_size=10_000_000,   # 10 MB
     28     max_files=5)           # keep 5 rotated + 1 current
     29 ```
     30 
     31 ## RotatableIO Wrapper
     32 
     33 **Problem:** `FormatLogger` from LoggingExtras.jl captures its IO handle at construction time. After rotation replaces `sink.ios[i]`, the `FormatLogger` would still write to the old closed IO. Additionally, `format_cb` never sees the formatted output string, so there is no natural place to count bytes.
     34 
     35 **Solution:** A thin IO wrapper that solves both problems:
     36 
     37 ```julia
     38 mutable struct RotatableIO <: IO
     39     sink::FileSink
     40     index::Int
     41     bytes::Ref{Int}  # shared counter (deduplicated like ios/locks)
     42 end
     43 
     44 function Base.write(rio::RotatableIO, data::Union{UInt8, AbstractVector{UInt8}})
     45     n = write(rio.sink.ios[rio.index], data)
     46     rio.bytes[] += n
     47     # Check rotation after write
     48     if rio.sink.max_size > 0 && rio.bytes[] > rio.sink.max_size
     49         rotate!(rio.sink, rio.index)
     50     end
     51     return n
     52 end
     53 
     54 # Forward other IO methods needed by FormatLogger
     55 Base.flush(rio::RotatableIO) = flush(rio.sink.ios[rio.index])
     56 Base.isopen(rio::RotatableIO) = isopen(rio.sink.ios[rio.index])
     57 ```
     58 
     59 **How it works:**
     60 - `FormatLogger` holds a `RotatableIO` instead of the raw file IO
     61 - Every `write` delegates to `sink.ios[rio.index]` — always the current handle
     62 - Byte counting is exact (counts actual bytes flowing through `write`)
     63 - Rotation check happens in `write`, inside the existing lock
     64 - No changes needed to `format_cb` or `custom_format` or any `format_log` method
     65 
     66 ## Byte Tracking
     67 
     68 Byte counting is exact — `RotatableIO.write` counts every byte as it flows through. No `stat()` calls, no approximation.
     69 
     70 In single-file mode (IO deduplication), all `RotatableIO` wrappers that share the same file path share the same `Ref{Int}` counter. This mirrors the existing deduplication pattern for `ios` and `locks`: a `Dict(path => Ref{Int}(initial_size))` is built in the constructor, and each slot maps to its shared counter.
     71 
     72 On construction, each counter is initialized to the current file size via `stat(path).size` (one-time cost at startup, handles append mode correctly).
     73 
     74 ## FileSink Changes
     75 
     76 ```julia
     77 mutable struct FileSink <: LogSink
     78     files::Vector{String}
     79     ios::Vector{IO}
     80     locks::Vector{ReentrantLock}
     81     rios::Vector{RotatableIO}    # wrapper IOs passed to FormatLogger
     82     max_size::Int
     83     max_files::Int
     84 end
     85 ```
     86 
     87 The constructor builds `rios` after `ios` and `locks`:
     88 
     89 ```julia
     90 # Deduplicated byte counters (one per unique path)
     91 path_to_bytes = Dict(p => Ref{Int}(stat(p).size) for p in unique_paths)
     92 rios = [RotatableIO(obj, i, path_to_bytes[files[i]]) for i in eachindex(files)]
     93 ```
     94 
     95 `create_demux_logger` passes `sink.rios[io_index]` to `FormatLogger` instead of `sink.ios[io_index]`.
     96 
     97 ## Rotation Mechanics
     98 
     99 ### rotate! function
    100 
    101 ```julia
    102 function rotate!(sink::FileSink, trigger_index::Int)
    103     path = sink.files[trigger_index]
    104 
    105     # 1. Close the current IO for this path
    106     old_io = sink.ios[trigger_index]
    107     isopen(old_io) && close(old_io)
    108 
    109     # 2. Rename to timestamp suffix
    110     rotated_name = make_rotated_name(path)
    111     mv(path, rotated_name)
    112 
    113     # 3. Open fresh file
    114     new_io = open(path, "a")
    115 
    116     # 4. Update ALL slots sharing this path (handles IO dedup)
    117     for j in eachindex(sink.ios)
    118         if sink.files[j] == path
    119             sink.ios[j] = new_io
    120         end
    121     end
    122 
    123     # 5. Reset the shared byte counter
    124     sink.rios[trigger_index].bytes[] = 0
    125 
    126     # 6. Cleanup old files (done inline — fast for typical file counts)
    127     if sink.max_files > 0
    128         cleanup_rotated_files!(path, sink.max_files)
    129     end
    130 end
    131 ```
    132 
    133 ### Thread safety
    134 
    135 Rotation happens inside the existing `lock(sink.locks[i]) do ... end` block (via the `RotatableIO.write` method, which is called from within the locked `format_cb` closure). Since the lock is already held, rotation is atomic with respect to concurrent writers on the same stream.
    136 
    137 For single-file mode, all streams share one lock, so rotation of the shared file is safe. The `rotate!` function updates ALL slots sharing the same path, preventing stale IO references.
    138 
    139 ### Cleanup
    140 
    141 `cleanup_rotated_files!(path, max_files)`:
    142 1. Extract base name and extension from `path`
    143 2. Glob for `base.*.ext` in the same directory
    144 3. Sort by filename (timestamp suffix sorts chronologically)
    145 4. Delete oldest files until count <= `max_files`
    146 
    147 ## Timestamp Suffix Format
    148 
    149 | State | Filename |
    150 |-------|----------|
    151 | Current | `app.log` |
    152 | Rotated | `app.2026-03-24T14-30-00-123.log` |
    153 
    154 Pattern: insert timestamp before the final extension. Format `yyyy-mm-ddTHH-MM-SS-sss` (includes milliseconds to avoid sub-second collisions). Uses hyphens instead of colons for filesystem compatibility.
    155 
    156 For files without an extension (e.g., `myapp`), the suffix is appended: `myapp.2026-03-24T14-30-00-123`.
    157 
    158 For multi-level extensions (e.g., `app_error.log`), the timestamp is inserted before `.log`: `app_error.2026-03-24T14-30-00-123.log`.
    159 
    160 ## Testing Plan
    161 
    162 ### Unit tests
    163 - `RotatableIO`: write bytes, verify counter incremented correctly
    164 - `rotate!` on a FileSink: verify old file renamed with timestamp, new file opened, counter reset
    165 - `rotate!` in single-file mode: verify all slots updated, shared counter reset
    166 - `rotate!` with `max_files=2`: verify only 2 rotated files kept after 3 rotations
    167 - `rotate!` with `max_files=0`: verify all rotated files kept
    168 - `make_rotated_name`: correct timestamp insertion for `.log`, no-extension, multi-extension
    169 - Validation: `max_size=-1` throws `ArgumentError`, `max_files=-1` throws `ArgumentError`
    170 
    171 ### Integration tests
    172 - Create logger with `max_size=500`, write enough messages to trigger rotation, verify multiple files exist with timestamp suffixes
    173 - Verify current log file is always the original filename
    174 - Verify content continuity: messages before and after rotation are all present across files
    175 - Verify log format is preserved after rotation (e.g., JSON lines still valid)
    176 
    177 ### Edge cases
    178 - `max_size=0`: no rotation (current behavior)
    179 - Single-file mode with rotation: all levels rotate together since they share one IO
    180 - Append to existing file: byte counter initialized to current file size, rotation triggers correctly
    181 - Sub-second rotation: millisecond suffix prevents filename collisions
    182 
    183 ## Out of Scope
    184 
    185 - Time-based rotation
    186 - Compression of rotated files (could be a future addition)
    187 - Remote/network log sinks