Protocols for Server Downtime, Maintenance, or 3rd Party Service Notification
I) Monitoring and Notification
- Heartbeat monitors run on all systems, with remote monitoring as backup, to auto-notify key parties when a single unit is down, the total system is down, or failover engages
- All parties who receive notification of downtime must check in with Network Admin or Developer Team Lead within 5 minutes of notification.
- All key parties are on call 24/7, 365 days a year. Phones, pagers, and other alert devices must be on unless prohibited by law or regulation. Before going out of contact, key parties must tell the CEO or COO how long they will be offline and when they expect to be reachable again.
- Key parties:
- Lead Developer
- Network Admin
- Other Members of Developer Team
- SMO Team (for notification)
- VP Customer Service
- Errors or omissions
- If an error results in downtime, all key parties must be notified within 10 minutes of the error, regardless of time of day or location.
- All parties who receive notification must check in with Network Admin or Developer Team Lead within 5 minutes of notification.
- CS and SMO teams must communicate issues in the forums and via social media
- After Action report must be submitted to Community
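The monitoring rules in this section can be illustrated with a minimal Python sketch. It is not the actual monitoring system: the function names, the input shapes, and the choice to report failover ahead of other states are assumptions made for illustration. Only the three alert conditions and the 5 and 10 minute deadlines come from the protocol.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Dict, List, Optional

# Deadlines taken from the protocol text.
NOTIFY_DEADLINE = timedelta(minutes=10)   # notify key parties after a downtime error
CHECK_IN_DEADLINE = timedelta(minutes=5)  # check in after receiving a notification


class Status(Enum):
    OK = "ok"
    SINGLE_UNIT_DOWN = "single unit down"    # one or more units down, but not all
    TOTAL_SYSTEM_DOWN = "total system down"
    FAILOVER_ENGAGED = "failover engaged"


def classify(heartbeats: Dict[str, bool], failover_active: bool) -> Status:
    """Map per-unit heartbeat results to an alert status.

    `heartbeats` maps a unit name to True if its heartbeat was received.
    Both the input shape and the priority given to failover are assumptions.
    """
    if failover_active:
        return Status.FAILOVER_ENGAGED
    down = [unit for unit, alive in heartbeats.items() if not alive]
    if not down:
        return Status.OK
    if len(down) == len(heartbeats):
        return Status.TOTAL_SYSTEM_DOWN
    return Status.SINGLE_UNIT_DOWN


def overdue_check_ins(
    notified_at: datetime,
    check_ins: Dict[str, Optional[datetime]],
    now: datetime,
) -> List[str]:
    """Return parties who missed the 5 minute check-in window.

    `check_ins` maps a party to its check-in time, or None if it has not
    checked in yet. A late check-in still counts as overdue.
    """
    overdue = []
    for party, checked_in_at in check_ins.items():
        if checked_in_at is None:
            if now - notified_at > CHECK_IN_DEADLINE:
                overdue.append(party)
        elif checked_in_at - notified_at > CHECK_IN_DEADLINE:
            overdue.append(party)
    return sorted(overdue)
```

A caller would run `classify` on each monitoring cycle and, once an alert has gone out, poll `overdue_check_ins` so that the Network Admin or Developer Team Lead can follow up with anyone who has not responded.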
II) Reboot Protocols
- Initiating Reboot (Unplanned)
- The initiating developer must notify the Network Admin and the Lead Developer.
- A trained troubleshooting Developer or Admin must be physically at the server location to facilitate repair of any reboot issues
- All issues that occur must be logged in the after-action report for developer repair
- CEO, COO, SMO Team, and Customer Service must be notified by email
- CEO and COO must be notified by phone
- Initiating Reboot or Downtime (planned)
- CEO, COO, Network Admin and Lead Developer must be notified in writing for approval
- Once approved, the ArtFire community must be notified via staff announcements in the forums.
- SMO and CS must monitor the forums and social media during the planned downtime event
- After Action report must be submitted and published to ArtFire.com Community
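The planned-downtime sequence above can be read as an approval gate followed by a fixed set of steps. The Python sketch below shows one way to encode that. It assumes that all four notified roles must approve before the announcement is made, which the protocol implies but does not state outright; the function names and data shapes are hypothetical.

```python
from typing import Dict, List

# Roles that must be notified in writing for approval, per the protocol.
REQUIRED_APPROVERS = ("CEO", "COO", "Network Admin", "Lead Developer")


def missing_approvals(approvals: Dict[str, bool]) -> List[str]:
    """Return required roles that have not yet approved (assumes all must approve)."""
    return [role for role in REQUIRED_APPROVERS if not approvals.get(role, False)]


def planned_downtime_steps(approvals: Dict[str, bool]) -> List[str]:
    """Return the next steps for a planned reboot or downtime."""
    missing = missing_approvals(approvals)
    if missing:
        # The announcement is blocked until every approval is recorded.
        return ["Obtain written approval from: " + ", ".join(missing)]
    return [
        "Post a staff announcement in the forums",
        "SMO and CS monitor the forums and social media during the downtime",
        "Submit and publish the after-action report to the ArtFire.com community",
    ]
```

Keeping the approver list in a single constant means a change to the approval chain only has to be made in one place.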
III) Service notifications from 3rd party providers
- All service notifications or potential events that are communicated by 3rd party providers are assumed to indicate potential downtime.
- All such communication must be forwarded in writing to all Key parties within 2 hours of receipt
- All such events will be assumed to create failure or error potential and will initiate failure preparation protocols
- If any potential issue, maintenance, 3rd party provider service notification, or utility or network-wide event is anticipated, the Development, CS, and Network Admin teams will initiate failure preparation protocols.
- Notification about potential event will be made to ArtFire.com community.
- CS representatives and SMO will be placed on standby to address concerns or questions, both internally and Internet-wide via scrapers
- A trained troubleshooting member of the development or network admin team will be placed on site with the servers to address issues that cannot be resolved remotely
- CEO and COO will be notified.
- When the issue window ends, all parties will receive a stand-down notification from the team lead, and the ArtFire.com community will be notified that the potential error window has ended.
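As a rough illustration of the 3rd party notification rules, the Python sketch below computes the 2 hour forwarding deadline and lists the failure preparation steps as a checklist. The names and structure are assumptions for illustration only; the 2 hour limit and the steps themselves are taken from the protocol.

```python
from datetime import datetime, timedelta

# Forwarding deadline taken from the protocol text.
FORWARD_DEADLINE = timedelta(hours=2)

# Failure preparation steps, in the order the protocol lists them.
FAILURE_PREP_STEPS = (
    "Notify the ArtFire.com community of the potential event",
    "Place CS representatives and SMO on standby",
    "Place a trained troubleshooter on site with the servers",
    "Notify the CEO and COO",
)


def forward_by(received_at: datetime) -> datetime:
    """Latest time a 3rd party notice may be forwarded to all key parties."""
    return received_at + FORWARD_DEADLINE


def is_forward_late(received_at: datetime, forwarded_at: datetime) -> bool:
    """True if the notice was forwarded more than 2 hours after receipt."""
    return forwarded_at - received_at > FORWARD_DEADLINE
```

For example, a provider notice received at 09:30 would need to reach all key parties in writing by 11:30 the same day.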