Preparing for digital recovery in advance of disaster

People and Places | Strategy 5: Systems Renewal
Theme: Innovation

For years, British Columbians have been preparing for “the Big One,” as the province sits on a seismic fault line. Regardless of their size or nature, disasters (natural or man-made) can have major implications for global research institutions like UBC. Today, when technology is fundamental to the learning and research ecosystems, a compromised IT infrastructure can have a significant impact on the university. In 2019, the university established a UBC IT Disaster Recovery Plan (DRP) to prepare in advance of any disaster.

UBC’s IT Disaster Recovery Plan

The DRP for UBC consists of a series of recovery processes for a select number of on-campus enterprise systems in Vancouver that have been identified as critical. Data for these services is stored in Vancouver and replicated at an off-site location in Kamloops, which is also home to BC’s Provincial Data Centre. If a localized disaster takes place, data can be restored to any of the critical services that are offline.

One of the key considerations in identifying services as “critical” is their impact on other services and on the university’s capability to function if they were ever offline.

At the moment, the systems within the IT Disaster Recovery Plan are:

  • Legacy on-premise Student Information System (SIS)
  • Key on-premise Learning and Teaching Enterprise systems
  • Research Information System
  • Treasury
  • Pensions
  • SharePoint
  • TeamShare

The decision to declare an IT disaster rests with the UBC Executive. The process will be coordinated via the UBC Crisis Management Team with input from the Office of the CIO and UBC IT.

Preparing for an IT disaster

To ensure that service owners of critical IT services are prepared for a disaster, an IT disaster simulation is held annually to help teams test the recovery processes in a replica of the services’ production environment. After each simulation, a post-mortem is completed to identify improvements to the processes.

Over the past two years, recovery time has improved drastically, from the goal of 24 hours down to 6 hours.

“Knowing that there is an IT Disaster Recovery Plan and we are practicing the steps to recover data is a peace of mind,” says Mario Angers, Senior Manager, System. “My team manages TeamShare and we have over a petabyte (one million gigabytes, or one thousand terabytes) of data stored on it.”

What’s next?

Establishing an IT DRP is essential for a large-scale institution like UBC. The next phase is to use the plan as a foundation to expand to other on-premise systems that were not in scope for the initial plan. UBC Okanagan is also in the midst of aligning its IT recovery process with the central IT disaster plan.

Because it is just as vital to ensure that other services at the university are prepared for a disaster, Disaster Recovery will soon be launched as a service for the university. Faculties and administrative units will be able to replicate the solution and self-manage their own disaster recovery plans. Such planning is a key tenet in support of UBC’s Strategic Plan, Strategy 5: Systems Renewal, as technology plays an increasingly crucial role in enabling efforts to work more synergistically across our campuses and learning sites.