Runtime Variables & MySQL Failover - Operations Guide
Overview
This guide covers two critical operational resilience features:
- Runtime Variables (SET/SHOW VARIABLES): Change configuration at runtime without server restart using MySQL-compatible commands.
- MySQL Failover: Switch MySQL servers at runtime while preserving GTID position.
These features enable zero-downtime configuration updates and safe handling of MySQL master failover scenarios.
1. Runtime Variables (SET/SHOW VARIABLES)
MygramDB supports MySQL-compatible SET and SHOW VARIABLES commands for runtime configuration changes without server restart.
Overview
Runtime variables allow you to:
- Change logging levels on the fly
- Switch MySQL servers during failover (GTID position preserved)
- Adjust cache behavior
- Modify API and rate limiting parameters
Basic Commands
-- Show all variables
SHOW VARIABLES;
-- Show specific pattern
SHOW VARIABLES LIKE 'mysql%';
SHOW VARIABLES LIKE 'cache%';
-- Set single variable
SET logging.level = 'debug';
SET mysql.host = '192.168.1.100';
-- Set multiple variables
SET api.default_limit = 200, cache.enabled = true;Mutable vs Immutable Variables
Mutable (Runtime Changeable)
| Variable | Description | Example |
|---|---|---|
logging.level | Log level (debug/info/warn/error) | SET logging.level = 'debug' |
logging.format | Log format (json/text) | SET logging.format = 'json' |
mysql.host | MySQL server hostname | SET mysql.host = '192.168.1.101' |
mysql.port | MySQL server port | SET mysql.port = 3306 |
cache.enabled | Enable/disable query cache | SET cache.enabled = false |
cache.min_query_cost_ms | Minimum query cost to cache | SET cache.min_query_cost_ms = 20.0 |
cache.ttl_seconds | Cache TTL in seconds | SET cache.ttl_seconds = 7200 |
api.default_limit | Default LIMIT value | SET api.default_limit = 200 |
api.max_query_length | Max query expression length | SET api.max_query_length = 256 |
rate_limiting.capacity | Token bucket capacity | SET rate_limiting.capacity = 200 |
rate_limiting.refill_rate | Tokens per second | SET rate_limiting.refill_rate = 20 |
Immutable (Requires Restart)
- Server port/bind address (
tcp.bind,tcp.port,http.bind,http.port) - Table configuration (name, primary key, text columns)
- MySQL database name, user, password
- Replication settings (
server_id,start_from) - Memory allocator settings
- Network security (
allow_cidrs)
Usage Examples
Change Logging Level
-- Increase verbosity for debugging
SET logging.level = 'debug';
-- Verify change
SHOW VARIABLES LIKE 'logging%';
-- Return to normal
SET logging.level = 'info';Disable Cache During Maintenance
-- Disable cache before maintenance
SET cache.enabled = false;
-- Perform maintenance operations...
-- Re-enable cache
SET cache.enabled = true;Adjust API Limits
-- Temporarily increase limits for bulk operations
SET api.default_limit = 1000;
SET api.max_query_length = 512;
-- Check current values
SHOW VARIABLES LIKE 'api%';Validation and Error Handling
SET commands validate values before applying:
-- Type mismatch error
SET api.default_limit = 'invalid';
ERROR: Invalid value for api.default_limit: must be integer
-- Out of range error
SET api.default_limit = 99999;
ERROR: Invalid value for api.default_limit: must be between 5 and 1000
-- Unknown variable error
SET unknown.variable = 'value';
ERROR: Unknown variable: unknown.variable
-- Immutable variable error
SET mysql.database = 'newdb';
ERROR: Variable mysql.database is immutable (requires restart)Variables are validated before being applied, ensuring the server remains in a consistent state even if invalid values are provided.
Monitoring Variable Changes
Monitor variable changes using server logs:
# Watch for variable changes
tail -f /var/log/mygramdb/server.log | grep -i "variable"
# Check current configuration
mygram-cli -c "SHOW VARIABLES;"2. MySQL Failover with Runtime Variables
MygramDB supports zero-downtime MySQL failover using SET commands. When switching to a new MySQL server, the GTID position is preserved.
Failover Workflow
-- 1. Check current MySQL connection
SHOW VARIABLES LIKE 'mysql%';
-- 2. Switch to new MySQL server (replica promoted to primary)
SET mysql.host = '192.168.1.101', mysql.port = 3306;
-- 3. Verify reconnection succeeded
SHOW VARIABLES LIKE 'mysql%';How It Works
When you execute SET mysql.host/port:
- Save GTID Position: Current GTID position is saved
- Stop Binlog Reader: Gracefully stop reading from old server
- Close Old Connection: Close connection to old MySQL server
- Create New Connection: Connect to new MySQL server
- Validate New Server: Check GTID mode, binlog format, table existence
- Resume from GTID: Restart binlog reader from saved GTID position
Requirements
The new MySQL server must:
- Have GTID mode enabled (
gtid_mode = ON) - Use ROW binlog format (
binlog_format = ROW) - Contain the saved GTID position in its GTID set
- Have all required tables defined in configuration
Example: Planned Failover
-- Before failover: connected to mysql-primary-1
SHOW VARIABLES LIKE 'mysql.host';
-- Output: mysql-primary-1.example.com
-- Promote replica to new primary
-- (Done externally via MySQL replication management)
-- Switch MygramDB to new primary
SET mysql.host = 'mysql-primary-2.example.com';
-- Verify failover
SHOW VARIABLES LIKE 'mysql.host';
-- Output: mysql-primary-2.example.comServer logs will show:
[info] Reconnecting to MySQL: mysql-primary-2.example.com:3306
[info] Stopping binlog reader...
[info] Binlog reader stopped. Processed 12345 events
[info] Creating new MySQL connection: mysql-primary-2.example.com:3306
[info] Connection validated successfully
[info] Binlog reader started from GTID: <saved-gtid>Error Handling
If failover fails, the server stops replication and logs detailed errors:
[error] Failed to reconnect to new MySQL server: Connection refused
[error] Binlog replication stopped. Manual intervention required.In this case:
- Check new MySQL server is accessible
- Verify network connectivity
- Check MySQL credentials
- Retry SET command or restart server
3. MySQL Failover Detection
The server automatically validates MySQL connections to detect:
- Failover: MySQL master server changed (different server UUID)
- Invalid Servers: Missing tables, GTID mode disabled, inconsistent state
- Connection Loss: Automatic reconnection with validation
This prevents replicating from incorrect servers after network changes or failover events.
Validation Checks
When connecting or reconnecting to MySQL, the server performs:
- GTID Mode Check: Ensures
gtid_mode=ON - Server UUID Tracking: Detects server changes (failover)
- Table Existence: Verifies all required tables exist
- GTID Consistency: Checks replication state consistency
Failover Detection Behavior
Scenario 1: Normal Reconnection (Same Server)
[info] [binlog worker] Reconnected successfully
[info] [binlog worker] Connection validated successfully after reconnect- Action: Continue replication normally
- Impact: None
Scenario 2: Failover Detected (Different Server)
[info] [binlog worker] Reconnected successfully
[warn] [binlog validation] Server UUID changed: a1b2c3d4-... -> e5f6g7h8-... (failover detected)
{"event":"mysql_failover_detected","old_uuid":"a1b2c3d4-...","new_uuid":"e5f6g7h8-..."}
[info] [binlog worker] Connection validated successfully after reconnect- Action: Continue replication with warning logged
- Impact: Operator alerted to failover event
- Next Steps:
- Verify failover was intentional
- Check replication lag
- Monitor for data consistency
Scenario 3: Invalid Server (Missing Tables)
[info] [binlog worker] Reconnected successfully
{"event":"binlog_connection_validation_failed","gtid":"<current-gtid>","error":"Required tables are missing: users, messages"}
[error] [binlog worker] Connection validation failed after reconnect: Required tables are missing: users, messages
[info] Binlog reader stopped. Processed 12345 events- Action: Stop replication (prevents data corruption)
- Impact: Replication halted until operator intervenes
- Next Steps:
- Verify MySQL server is correct
- Check table schema
- Fix configuration or restore correct server
- Restart mygramdb
Scenario 4: GTID Mode Disabled
{"event":"connection_validation_failed","reason":"gtid_disabled","error":"GTID mode is not enabled"}
[error] [binlog worker] Connection validation failed after reconnect: GTID mode is not enabled- Action: Stop replication
- Impact: Server will not start binlog reader
- Next Steps: Enable GTID mode on MySQL server
Monitoring Failover Events
Structured Log Queries
# Detect failover events
grep 'mysql_failover_detected' /var/log/mygramdb/server.log
# Check validation failures
grep 'connection_validation_failed' /var/log/mygramdb/server.log
# Monitor server UUID changes
grep 'Server UUID changed' /var/log/mygramdb/server.logExample Structured Log Output
{
"event": "mysql_failover_detected",
"old_uuid": "a1b2c3d4-e5f6-1234-5678-90abcdef1234",
"new_uuid": "e5f6g7h8-i9j0-5678-9012-34567890abcd",
"timestamp": "2025-11-17T10:30:45Z"
}Operational Procedures
Planned MySQL Failover
Before Failover:
- Ensure new master has GTID enabled
- Verify all required tables exist on new master
- Check replication lag
During Failover:
- Use SET command to switch MySQL server:
SET mysql.host = 'new-primary.example.com'; - Monitor logs for failover detection
- Use SET command to switch MySQL server:
After Failover:
bash# Verify connection to new master grep "Connection validated successfully" /var/log/mygramdb/server.log | tail -1 # Check for failover warning grep "mysql_failover_detected" /var/log/mygramdb/server.log | tail -1 # Verify replication continues using the health endpoint curl http://localhost:8080/health/detail
Unplanned Failover (Emergency)
Detect Issue:
bash# Check if replication stopped grep "Binlog reader stopped" /var/log/mygramdb/server.log | tail -1 # Check validation failures grep "validation_failed" /var/log/mygramdb/server.log | tail -5Diagnose:
- Is server UUID different? → Failover occurred
- Missing tables? → Wrong server or schema issue
- GTID disabled? → Configuration problem
Recover:
sql-- If wrong server, use SET to switch to correct server SET mysql.host = 'correct-server.example.com'; -- Verify connection SHOW VARIABLES LIKE 'mysql%';bash# If validation failed, check logs and restart if needed systemctl restart mygramdb
4. Health Monitoring Integration
Health Check Endpoints
Use the health endpoint to monitor replication status:
# Basic health check
curl http://localhost:8080/health/live
# Returns: 200 OK (server is running)
# Readiness check
curl http://localhost:8080/health/ready
# Returns: 200 if ready, 503 if loading
# Detailed health status
curl http://localhost:8080/health/detail
# Returns: 200 with status = healthy or status = degraded. Use /health/ready for traffic gating.Detailed Health Response Example:
{
"status": "healthy",
"uptime_seconds": 3600,
"binlog_reader": {
"running": true,
"current_gtid": "a1b2c3d4-e5f6-1234-5678-90abcdef1234:1-12345",
"processed_events": 12345,
"queue_size": 42
},
"mysql_connection": {
"connected": true,
"server_uuid": "a1b2c3d4-e5f6-1234-5678-90abcdef1234"
}
}Alerting Rules
Recommended monitoring alerts:
Failover Detected (INFO)
- Trigger:
event: mysql_failover_detected - Action: Notify operator, verify failover was intentional
- Trigger:
Validation Failed (CRITICAL)
- Trigger:
event: connection_validation_failed - Action: Page on-call, replication stopped
- Trigger:
Replication Stopped (CRITICAL)
- Trigger:
binlog_reader.running = falsein health endpoint - Action: Page on-call, investigate immediately
- Trigger:
5. Testing & Validation
Testing Runtime Variables
# 1. Start server
./bin/mygramdb --config config.yaml
# 2. Connect with client
mygram-cli
# 3. Test variable changes
SET logging.level = 'debug';
# 4. Verify change applied
SHOW VARIABLES LIKE 'logging%';
# 5. Check logs
tail -f /var/log/mygramdb/server.logTesting Failover Detection
Unit Tests (No MySQL Required):
# Run validator unit tests
cd build
./bin/connection_validator_test --gtest_filter="ConnectionValidatorUnitTest.*"Integration Tests (Requires MySQL with GTID):
# Set environment
export ENABLE_MYSQL_INTEGRATION_TESTS=1
export MYSQL_HOST=127.0.0.1
export MYSQL_USER=root
export MYSQL_PASSWORD=test
export MYSQL_DATABASE=test
# Run integration tests
ctest -R ConnectionValidatorIntegrationTest -VManual Failover Simulation
# 1. Start mygramdb connected to mysql-server-1
./bin/mygramdb --config config.yaml
# 2. Connect with client
mygram-cli
# 3. Switch to mysql-server-2
SET mysql.host = 'mysql-server-2.example.com';
# 4. Check logs for failover detection
tail -100 /var/log/mygramdb/server.log | grep -E "(failover|UUID changed|Reconnecting)"6. Troubleshooting
Issue: SET Variable Fails
Symptoms:
SET mysql.host = 'new-host';
ERROR: Failed to reconnect to MySQL serverSolutions:
- Check MySQL server is accessible:
mysql -h new-host -u user -p - Verify network connectivity:
ping new-host - Check server logs for detailed error message
- Verify GTID mode is enabled on new server
- Check credentials are correct
Issue: Replication Stopped After Reconnect
Symptoms:
[error] Connection validation failed: Required tables are missing
[info] Binlog reader stoppedSolutions:
- Verify MySQL server is correct: Check
mysql.hostin config - Check tables exist:sql
SHOW TABLES FROM myapp; - Verify GTID mode:sql
SHOW VARIABLES LIKE 'gtid_mode'; - Check server UUID:sql
SELECT @@server_uuid;
Issue: Failover Warning But Replication Works
Symptoms:
[warn] Server UUID changed: <old> -> <new> (failover detected)
[info] Connection validated successfullyThis is normal if:
- Planned MySQL failover occurred
- DNS/IP switched to new master
- All validation checks pass
Action Required:
- Document the failover event
- Verify replication lag
- Monitor for consistency issues
Issue: Repeated Reconnection Attempts
Symptoms:
[info] Reconnect attempt #5, waiting 5000 ms
[error] [binlog worker] Connection validation failed: <error>
[info] Reconnect attempt #6, waiting 6000 msSolutions:
- Check MySQL server availability
- Verify network connectivity
- Check credentials in config
- Review firewall rules
- Consider increasing
mysql.connect_timeout_msin config
7. Performance Considerations
Runtime Variable Change Impact
- Logging Level Change: No impact, immediate
- MySQL Failover (SET mysql.host): Brief interruption (1-5 seconds)
- Binlog reader stops gracefully
- Existing queries complete normally
- Search operations continue normally
- Replication resumes from saved GTID
- Cache Toggle: Immediate, no query interruption
- API Parameter Changes: Applied to new requests only
Recommendation: MySQL failover should be performed during low-traffic periods if possible.
Validation Overhead
- Initial Connection: ~100-200ms validation time
- Reconnection: ~50-100ms additional overhead
- Impact: Negligible (occurs only during connect/reconnect)
Cost Breakdown:
- GTID mode check: 1 query (~10ms)
- Server UUID check: 1 query (~10ms)
- Table existence: N queries (~10ms each, N = table count)
- GTID consistency: 1 query (~20ms)
8. Best Practices
Configuration Management
- Version Control: Keep
config.yamlin git - Validation: Test config changes in staging first
- Rollback Plan: Keep previous config backup
- Documentation: Document all config changes
Monitoring
- Log Aggregation: Send logs to centralized logging (ELK, Splunk)
- Structured Logs: Use JSON parsing for alerts
- Health Checks: Monitor
/health/detailendpoint every 30s - Metrics: Track GTID lag, event processing rate
- Process Supervision: Run MygramDB under systemd, Kubernetes, or supervisord with a restart policy.
/health/detailreports known server, replication, and cache state, but it is not a full background-thread supervisor; alert on process exit and error logs as well.
Failover Procedures
- Document MySQL Topology: Keep diagram of master/replica setup
- Automate DNS/IP Updates: Use orchestration tools
- Test Failover Regularly: Quarterly failover drills
- Monitor Lag: Alert if replication lag > 60s
Security
- Client Access Control: Use
network.allow_cidrsto restrict client access - Config File Permissions:
chmod 600 config.yaml(sensitive credentials) - Log Rotation: Use SIGUSR1 signal for zero-downtime log rotation (see Section 10)
- Audit Trail: Log all variable changes and failovers
9. Runtime Variable Categories
Variables that can be changed at runtime using SET commands:
logging:
level: "info" # Mutable via SET
format: "json" # Mutable via SET
mysql:
host: "127.0.0.1" # Mutable via SET (triggers reconnection)
port: 3306 # Mutable via SET (triggers reconnection)
cache:
enabled: true # Mutable via SET
min_query_cost_ms: 10.0 # Mutable via SET
ttl_seconds: 3600 # Mutable via SET
api:
default_limit: 100 # Mutable via SET
max_query_length: 128 # Mutable via SET
rate_limiting:
capacity: 100 # Mutable via SET
refill_rate: 10 # Mutable via SETQuick Reference: Commands
-- Change logging level
SET logging.level = 'debug';
-- Switch MySQL server
SET mysql.host = 'new-primary.example.com';
-- Disable cache
SET cache.enabled = false;
-- Show all variables
SHOW VARIABLES;
-- Show specific variables
SHOW VARIABLES LIKE 'mysql%';# Check process status
ps aux | grep mygramdb
# Monitor logs
tail -f /var/log/mygramdb/server.log
# Health check
curl http://localhost:8080/health/detailLog Events
| Event | Severity | Meaning |
|---|---|---|
mysql_failover_detected | WARNING | Server UUID changed during reconnection |
connection_validation_failed | ERROR | Validation failed, replication stopped |
| Variable change logs | INFO | Runtime variable modified via SET |
Connection validated successfully | INFO | Validation passed |
Error Codes
| Error | Code | Action |
|---|---|---|
| GTID mode disabled | gtid_disabled | Enable GTID on MySQL |
| Missing tables | table_validation_failed | Check schema |
| Server UUID changed | failover_detected | Verify failover |
| Connection lost | connection_failed | Check network |
10. Log Rotation with SIGUSR1
MygramDB supports zero-downtime log rotation using the SIGUSR1 signal, similar to nginx. This allows seamless log file rotation without restarting the server.
How It Works
- Rename the current log file (by logrotate or manually)
- Send SIGUSR1 signal to the MygramDB process
- MygramDB reopens the log file with the original path, creating a new file
The process continues logging to the new file immediately, while the old file (now renamed) can be compressed or archived.
Manual Rotation
# 1. Rename the current log file
mv /var/log/mygramdb/app.log /var/log/mygramdb/app.log.1
# 2. Send SIGUSR1 to reopen log file
kill -USR1 $(pidof mygramdb)
# or
kill -USR1 $(cat /var/run/mygramdb.pid)
# 3. MygramDB creates new app.log and continues logginglogrotate Configuration
Create /etc/logrotate.d/mygramdb:
/var/log/mygramdb/*.log {
daily
rotate 7
compress
delaycompress
missingok
notifempty
create 0640 mygramdb mygramdb
sharedscripts
postrotate
kill -USR1 $(cat /var/run/mygramdb.pid 2>/dev/null) 2>/dev/null || true
endscript
}Configuration Options:
daily: Rotate logs dailyrotate 7: Keep 7 days of logscompress: Compress rotated logs with gzipdelaycompress: Don't compress the most recently rotated file (useful for debugging)missingok: Don't error if log file doesn't existnotifempty: Don't rotate empty filescreate 0640 mygramdb mygramdb: Create new file with specified permissions and ownershipsharedscripts: Run postrotate only once for all matched filespostrotate: Command to run after rotation (send SIGUSR1)
Verification
After sending SIGUSR1, check the logs:
# Verify new log file was created
ls -la /var/log/mygramdb/
# Check new log file contains reopen confirmation
grep "Log file reopened" /var/log/mygramdb/app.logExpected log entry after rotation:
[2025-11-25 12:34:56.789] [mygramdb] [info] Log file reopened for rotationsystemd Integration
If using systemd, add a reload target for log rotation:
# /etc/systemd/system/mygramdb.service
[Service]
ExecReload=/bin/kill -USR1 $MAINPIDThen use systemctl reload mygramdb for log rotation:
# Rotate logs using systemd
systemctl reload mygramdbImportant Notes
- File logging required: SIGUSR1 only affects file logging. If logging to stdout, the signal is acknowledged but no action is taken.
- Log level preserved: After reopening, the log level remains unchanged.
- Signal safety: SIGUSR1 only sets a flag; the actual file reopen happens in the main loop within 100ms.
- Error handling: If reopening fails, an error is written to stderr (not the log file, since it may be broken).
Troubleshooting
Log file not created after SIGUSR1:
- Check file permissions on the log directory
- Verify the process received the signal:
grep "Log file reopened" /var/log/mygramdb/app.log - Check stderr for errors
Old log file still being written to:
- Ensure the mv command completed before sending SIGUSR1
- Check process PID is correct
- Verify signal was sent to the correct process
See Also
- Replication Management - Automatic replication coordination
- Replication Guide - MySQL replication setup
- Configuration Guide - Full configuration reference
- HTTP API Guide - Health and metrics endpoints