# Signal-Based Reload Implementation (Implemented)

This document outlines the implemented zero-downtime approach to address [GitHub Issue #56](https://github.com/tuannvm/slack-mcp-client/issues/56) using signal-based application reload.

## Problem Statement

When MCP servers restart in a Kubernetes deployment, the Slack MCP client must reload so that it can reconnect and rediscover the available tools, and it should do so without downtime.

## Solution Overview

The client implements **signal-based reload**: a SIGUSR1 signal or a periodic timer triggers a complete application reload within the same process, so the pod is never restarted and the reload incurs zero downtime.

## Advantages

* **Zero downtime**: Continuous operation during reload
* **On-demand reload**: `kubectl exec <pod-name> -- kill -USR1 1`
* **Faster**: No Kubernetes restart delay (~10s saved)
* **Fresh state**: Complete reinitialization of all components
* **Production-ready**: Comprehensive error handling and resource cleanup

## Implementation

### Configuration Structure

```go
type ReloadConfig struct {
    Enabled  bool   `json:"enabled,omitempty"`  // Enable periodic reload (default: false)
    Interval string `json:"interval,omitempty"` // Reload interval (default: "30m")
}
```

### Core Components

1. **Application Lifecycle** (`internal/app/lifecycle.go`)
   * `RunWithReload()` - Main wrapper function
   * Signal handling for SIGUSR1 (reload) and SIGINT/SIGTERM (shutdown)
   * Periodic timer for automatic reloads
   * Graceful shutdown with 10-second timeout
2. **Configuration Integration** (`internal/config/config.go`)
   * Added ReloadConfig to main Config struct
   * Default values: disabled, 30-minute interval
   * Minimum interval validation (10 seconds)
3. **Monitoring Metrics** (`internal/monitoring/reload_metrics.go`)
   * Reload counters by trigger type (signal, periodic)
   * Reload duration histogram
   * Prometheus metrics endpoint

### Key Features

* **Centralized timeouts**: Constants for shutdown and minimum intervals
* **Helper functions**: Configuration loading, signal handling setup
* **Structured logging**: Key-value pairs for better observability
* **Error handling**: Graceful fallback to normal operation on config errors
* **Resource cleanup**: Proper signal handler cleanup

## Configuration Examples

### Enabled with Custom Interval

```json
{
  "reload": {
    "enabled": true,
    "interval": "15m"
  }
}
```

### Disabled (Default)

```json
{
  "reload": {
    "enabled": false
  }
}
```

## Usage

### On-Demand Reload

```bash
# In Kubernetes
kubectl exec -it <pod-name> -- kill -USR1 1

# Local process
kill -USR1 <process-id>
```

### Periodic Reload

* Automatically reloads based on configured interval
* Minimum interval: 10 seconds
* Default interval: 30 minutes

## Testing

The implementation includes comprehensive unit tests:

* Signal handling validation
* Configuration parsing and validation
* Timeout constant verification
* Trigger type handling

## Benefits

1. **Zero Downtime**: Application continues running during reload
2. **Flexible**: Both manual (signal) and automatic (periodic) triggers
3. **Safe**: Minimum interval prevents excessive reloading
4. **Observable**: Prometheus metrics for monitoring
5. **Maintainable**: Clean, modular code with helper functions

## Production Deployment

Works seamlessly with Kubernetes:

* Pod stays running during reloads
* No service interruption
* Compatible with health checks
* Metrics available for monitoring dashboards

## Monitoring

Available Prometheus metrics:

* `mcp_reloads_total` - Counter by trigger type
* `mcp_reload_duration_seconds` - Reload timing histogram

## Implementation Status

✅ **Complete** - Fully implemented and tested

* Configuration structure and validation
* Signal-based and periodic reload triggers
* Graceful shutdown handling
* Prometheus metrics integration
* Comprehensive unit test coverage

