Socops.Rocks is hosted on a WordPress site:
- Pro – WordPress allows for quick, easy deployment
- Con – WordPress sites get attacked a lot and crash, needing restarts
The problem can be described as a set of requirements:
- 24/7 monitoring
- When an outage is found, start a process
- Require approval from team
- Fix the situation automatically
- Full Audit log
SOAR to the rescue! We need:
- Testing criteria –> HTTP GET & response code
- Automated frequency –> “Jobs”
- Process approval –> Me
- Remediation –> Reboot box (/restart service/other)
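As a rough illustration of the testing criteria, here is a minimal Python sketch of a GET that returns the HTTP response code. It is not how the SOAR platform implements the check; the function names `fetch_status` and `is_healthy` and the 10-second timeout are assumptions made for this example.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Issue a GET and return the HTTP status code, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, TLS error and similar.
        return None


def is_healthy(status):
    """Treat only HTTP 200 as healthy, matching the workflow in this post."""
    return status == 200
```

A crashed site would typically show up here as a 5xx code or as `None`, and both count as an outage.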
Job 1 – Configure an SSH integration using a secure SSH key (i.e. not password auth)
Job 2 – Configure a Task to connect to Linux and issue a reboot
Job 3 – Build a very simple workflow around the ‘Reboot’ task. If we get an HTTP 200, simply close the ticket, ‘else’ ask the sysadmin whether to issue a reboot.
Job 4 – Create a schedule to run this process every 5 minutes
Job 5 – Enjoy life at a socially distanced BBQ
Job 5.1 – If/when needed, approve the process (here I’m using the mobile app… because I’m at the BBQ)
Job 6 – Make it a Dashboard / Report
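The decision logic behind Jobs 2 to 5.1 can be sketched in plain Python. This is only an approximation of what the SOAR workflow does: `run_check`, `ssh_reboot_command` and their parameters are hypothetical names, the four callables stand in for the platform's own tasks, and the reboot command assumes key-based login plus passwordless `sudo` on the target box.

```python
def ssh_reboot_command(host, user, key_path):
    """Build an ssh command line that reboots the box using key auth only."""
    return [
        "ssh",
        "-i", key_path,          # private key file (hypothetical path)
        "-o", "BatchMode=yes",   # never fall back to a password prompt
        f"{user}@{host}",
        "sudo", "reboot",        # assumes passwordless sudo for this user
    ]


def run_check(get_status, ask_approval, reboot, audit):
    """One scheduled run: test the site, then close the ticket or escalate.

    get_status()      -> HTTP status code, or None if the site is unreachable
    ask_approval(st)  -> True if the sysadmin approves a reboot
    reboot()          -> performs the remediation
    audit(message)    -> records one audit log entry
    """
    status = get_status()
    audit(f"status={status}")
    if status == 200:
        audit("closed: site healthy")
        return "closed"
    if not ask_approval(status):
        audit("closed: reboot declined")
        return "declined"
    reboot()
    audit("reboot issued")
    return "rebooted"
```

Outside a SOAR platform, `reboot` could be something like `lambda: subprocess.run(ssh_reboot_command(...), check=True)`, and a scheduler such as cron would call `run_check` every 5 minutes.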
There are of course many improvements I can make (and I probably will, to squeeze a second blog article out of this…):
- Reboot, wait 180 seconds, and retest HTTP 200
- Check the HTML content for unpredicted changes
- Check SSL cert validity
- Restart web service instead of a reboot
- Download last 20 log entries (pass through Threat Intel Platform)
- Etc
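Two of these improvements are easy to sketch in Python: the retest after a reboot, and the SSL certificate validity check. Both functions below are illustrative assumptions, not part of the existing workflow; `retest_after_reboot` and `cert_days_remaining` are made-up names, and the certificate date is expected in the `notAfter` format returned by Python's `ssl.SSLSocket.getpeercert()`.

```python
import ssl
import time


def retest_after_reboot(get_status, wait_seconds=180, sleep=time.sleep):
    """Wait for the box to come back, then report whether it serves HTTP 200.

    get_status is a caller-supplied function returning the HTTP status code.
    sleep is injectable so the wait can be skipped in tests.
    """
    sleep(wait_seconds)
    return get_status() == 200


def cert_days_remaining(not_after, now=None):
    """Return the number of days until a certificate expires.

    not_after is a string such as 'Jan  5 09:34:43 2018 GMT'.
    now is a Unix timestamp; it defaults to the current time.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400
```

A negative result from `cert_days_remaining` means the certificate has already expired, and a small positive one could raise a ticket before visitors start seeing browser warnings.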
Without a video it’s hard to show this in action, but I’m happy to say that it works perfectly.
Result
- With no manual labour, the site is checked every 5 minutes; if there’s an issue, I get a mobile notification asking for my authorisation to reboot
- I can reboot the server from anywhere in the world without needing my SSH keys with me
- Full audit log, easy to expand
- Dashboards
Andy