Monitoring at Scale

Scaled Zabbix/Nagios monitoring across distributed services, backed by alert hygiene and runbooks.
Result: fewer noisy alerts, faster incident response.