Build resilience into mission-critical CCTV systems using server failover design.

Resilience is designed into CCTV systems to ensure continuity of a site’s security, process and operational control, should any part of the CCTV system fail for any reason.

Many larger CCTV systems, especially those deployed in critical and national infrastructure projects, are mission-critical, with an expectation of 24/7 operation. An immediate and appropriate response to component or system failure must be built in, including when that failure is caused deliberately, by malicious attack or even terrorism. Larger CCTV systems also contain more video cameras, power supplies, storage devices, servers and network devices, and with more parts comes a greater risk of failure.

In this article, Nick Bowden, Managing Director of Digifort UK, which supplies and supports the Digifort VMS (video management software) and neural analytics platform in the UK, explains how different levels of server failover design can provide different levels of cost-effective resilience and continuity. The impact of failure can be mitigated, data protected, and appropriate site operation maintained.

Failover and downtime

VMS-based CCTV systems, such as Digifort, use servers for recording and control. Even in a basic VMS system, these provide significantly higher levels of failure protection than systems using NVRs and DVRs, but at an increased cost. The hard drives used for storage are usually arranged in RAID5 format, where a damaged drive can be replaced in the live server (‘hot swapped’) and its contents rebuilt from parity, so no video recordings are lost. Two solid state drives (SSDs) run the operating system. These are RAID1 (mirrored), so if one SSD fails, the other keeps the system running. This level of resilience is appropriate for many high-security CCTV systems. For mission-critical applications, however, server failover provides even higher levels of protection and continuity.
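
The storage cost of this drive-level protection is simple to estimate. The sketch below is illustrative only: the function names, drive count and drive size are assumptions, not Digifort specifications. It relies on the standard properties that RAID5 gives up one drive's worth of capacity to parity and that RAID1 stores the same data on both drives.

```python
def raid5_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Usable capacity of a RAID5 array of equal-sized drives.

    One drive's worth of space is consumed by parity, which is what
    allows a single failed drive to be rebuilt.
    """
    if drive_count < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (drive_count - 1) * drive_tb


def raid1_usable_tb(drive_tb: float) -> float:
    """Usable capacity of a two-drive RAID1 mirror: one drive's worth."""
    return drive_tb


# Hypothetical recording server: eight 10 TB drives in RAID5.
video_storage_tb = raid5_usable_tb(drive_count=8, drive_tb=10.0)
print(f"Usable video storage: {video_storage_tb} TB")
```

In this hypothetical example, 80 TB of raw disk yields 70 TB of usable video storage.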

The principles of failover are straightforward. If any live server in a CCTV system fails, the CCTV system must recognise that the failure has occurred and automatically trigger a standby, ‘redundant’ server, device or system to take over. Speed is of the essence. Quick failover reduces server downtime and data loss and, in the case of Digifort failover technology, takes just a few seconds. Server failover is only part of an effective, resilient CCTV system design. The network infrastructure must have equal levels of resilience, offering multiple routes between network switches and CCTV system devices, should the primary network route fail. Measures to ensure power is maintained must also be taken.
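
The detection step can be pictured as a heartbeat check. The following is a minimal sketch of the general technique, not Digifort's implementation: the `ServerStatus` type, the server names and the three-second timeout are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    name: str
    last_heartbeat: float  # seconds, taken from a monotonic clock


def failed_servers(servers, now, timeout_s=3.0):
    """Return the names of servers whose last heartbeat is older than timeout_s.

    timeout_s is an assumed value; a shorter timeout detects failure
    sooner but is more easily triggered by a brief network delay.
    """
    return [s.name for s in servers if now - s.last_heartbeat > timeout_s]


# Hypothetical snapshot: rec-02 has been silent for 4.5 seconds.
snapshot = [
    ServerStatus("rec-01", last_heartbeat=10.0),
    ServerStatus("rec-02", last_heartbeat=6.0),
]
for name in failed_servers(snapshot, now=10.5):
    print(f"{name} missed its heartbeat: start standby server")
```

In a running system, `now` would come from `time.monotonic()` and the check would repeat continuously. The timeout sets the trade-off between fast failover and false alarms.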

Digifort, unlike many other VMS solutions, maintains not only the CCTV camera recording capability in failover, but also all system resources and settings. This ensures comprehensive system operation is maintained after a failure, including live and remote client viewing options, playback and all system configuration, including event handling.

One-to-one failover

The highest level of server failover is one-to-one, where video recording is duplicated on the live server and a ‘partner’ server of similar specification. The partner server is in constant operation to allow the fastest possible response and avoid potential data loss. It is an expensive option, with 100% server redundancy: a complete duplication of servers. However, it provides very high levels of resilience, minimises data loss and maintains system operation.

Spread servers failover

A cost-saving alternative is ‘spread servers’ where, instead of a ‘one-to-one’ server ratio, we have ‘one-to-many’, with one failover server covering several live servers. Here, any one of several live servers can fail over to a single redundant failover server, remembering that all the servers still have RAID5 storage and a RAID1 OS. For example, in a 5:1 ratio, any one of the five live servers could fail over, just not all at once. Spread server architecture is cheaper than one-to-one because the redundancy is lower (in this case 20%, one failover server for five live servers), but it still brings many benefits. The IT architecture logic applied is that more than one live server is highly unlikely to fail simultaneously. However, it is a risk versus cost budget decision, offering lower site continuity than one-to-one but better than no failover at all.
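
The redundancy figures and the ‘just not all at once’ limit can be sketched in a few lines. This is an illustrative model only: the `SpreadFailoverPool` class, its method and the server names are hypothetical, and it assumes a single failover server that can cover exactly one failed live server at a time.

```python
def redundancy_percent(live_count: int, standby_count: int = 1) -> float:
    """Standby servers expressed as a percentage of live servers."""
    return 100.0 * standby_count / live_count


class SpreadFailoverPool:
    """Several live servers sharing one failover server (assumed model)."""

    def __init__(self, live_servers, standby="standby-01"):
        self.live = set(live_servers)
        self.standby = standby
        self.covering = None  # live server the standby is currently replacing

    def report_failure(self, server: str) -> bool:
        """Return True if the standby is covering this failed server."""
        if server not in self.live:
            raise ValueError(f"unknown server: {server}")
        if self.covering is None:
            self.covering = server  # standby was free, so it takes over
        return self.covering == server


print(redundancy_percent(5))  # 5:1 spread servers -> 20.0
print(redundancy_percent(1))  # one-to-one -> 100.0

pool = SpreadFailoverPool(["rec-01", "rec-02", "rec-03", "rec-04", "rec-05"])
print(pool.report_failure("rec-02"))  # True: standby takes over
print(pool.report_failure("rec-04"))  # False: standby is already in use
```

The second failure returning `False` is the risk accepted in exchange for the lower cost.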

Critical camera failover

Digifort offers critical camera failover, where only critical cameras fail over to the spare, redundant server. These cameras can be from more than one live server, allowing security managers to prioritise the importance of each camera in the CCTV system. With the same level of redundancy and cost as the spread server architecture described above, security managers can instead designate which cameras are critical, redirecting their recording path to the redundant server on failure. Unlike spread servers, in this design all live servers could fail and the critical cameras would still operate.
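
The logic of critical camera failover can be illustrated with a short sketch. It is not Digifort's configuration format or API: the `Camera` type, the camera and server names and the `recording_targets` function are hypothetical, and the sketch assumes the redundant server has been sized to record every camera marked critical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Camera:
    name: str
    server: str     # live server that normally records this camera
    critical: bool  # designated by the security manager


def recording_targets(cameras, failed, standby="standby-01"):
    """Map each camera name to the server recording it, or None if unrecorded."""
    targets = {}
    for cam in cameras:
        if cam.server not in failed:
            targets[cam.name] = cam.server  # normal operation
        elif cam.critical:
            targets[cam.name] = standby     # redirected on failure
        else:
            targets[cam.name] = None        # non-critical: recording lost
    return targets


cameras = [
    Camera("gate", "rec-01", critical=True),
    Camera("car-park", "rec-01", critical=False),
    Camera("control-room", "rec-02", critical=True),
]
# Both live servers fail; the critical cameras are still recorded.
print(recording_targets(cameras, failed={"rec-01", "rec-02"}))
```

Here the gate and control-room cameras move to the redundant server even though both live servers are down, while the car-park camera is not recorded until its server is restored.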

Parallel recording

In applications where there is a real threat from attack and the data is critical, ‘parallel recording’ is an option. It is not really a ‘failover’ method, because the entire system is duplicated in different locations and operates entirely separately, but in parallel. Should an attack occur, such as an explosion destroying one of the locations, recording continues at the second location, assuming appropriate power backup and network design. This is an expensive option, as the entire recording facility is duplicated. However, its prime advantage is that recorded footage is available from before, during and after a catastrophic event, except from those cameras which may also be lost in the attack. In mission-critical and high-threat applications, parallel recording maximises site continuity and always maintains some level of site visibility and control.

Power failover

A sudden power outage can damage servers in a CCTV system, in particular the storage hard drives. Loss of power to servers is easily addressed using uninterruptible power supplies (UPS), which are good practice and recommended for any servers in any configuration. A UPS not only allows the servers and other network devices to shut down cleanly if power is lost altogether and recover completely when power is restored, but also protects the servers against damage from mains power spikes. Dual redundant PSUs in servers can also be specified, where if the primary PSU fails, a spare takes over. Switching is quick, and alerts can be configured within Digifort to notify the system administrator that a problem has occurred and a repair is required.

Digifort allows CCTV systems to be deployed with different levels of resilience and budget, proportional to their critical nature, using robust failover server architecture. This ensures that security, process and operational control of a site are maintained and that camera, power supply, storage, server and network failures are mitigated, even when the threat is malicious or catastrophic.
