Disk group copy 2 vs. copy 1

Andrew_Szymusia
Champ in-the-making

Trying to get a better grasp of disk groups beyond having just copy 1, and of the advantage of Copy 2, etc. Given that the file server that the data will reside on already has a backup strategy in place, does implementing a disk group copy 2 aid in redundancy? In other words, what's the benefit of having more than one copy? Thanks! Andrew.

7 REPLIES

MichaelBertrand
Star Collaborator

The typical argument for multiple copies is that they provide an online second copy of the files in case the primary Diskgroup is no longer available. Think of it as a Distributed File Server, but only for OnBase. There is a secondary use case for multiple copies: hosting files on faster or slower media, or on media that is fundamentally different from a regulatory standpoint. There is also a third use: backups and exports.

In the first case, I always recommend to customers that if they are going to duplicate a server just for file serving, they should make sure that their primary server is as robust as it should be. Typically this is not much of a discussion, since we are dealing with the IT department and multiple power supplies, RAID, enterprise-class drives, SANs, etc. This allows them to make the OnBase diskgroups as highly available as other file stores on their network, using the same tools and the same knowledge. They don't have to learn how to do the OnBase Diskgroup Analysis (DGA) check and copy. While this procedure is fairly straightforward, if not fast, doing High-Availability in the same way they do it for all other files (i.e. reusing existing procedures) is even simpler.

The second case is one where multiple copies can, in my opinion, have a place, even if the DGA needs to be done. Some companies use devices that are certified to be archive stores; users can't just go to the share and modify, change, or delete a file. The two common ones we see are EMC Centera and IBM Tivoli. There is also KOM Compliance and possibly other similar solutions that are directly integrated into OnBase.

With multiple copies of diskgroups, if your first diskgroup goes down, users (and import services) can no longer add documents or possibly modify existing documents (e.g. with EDM services; can someone confirm this?). So your new work effectively stops. New e-forms (not virtual e-forms or Unity Forms) cannot be created either? A DFS or replicated SAN will provide as much security, with less effort (no DGA), for the same hardware and software price.

When multiple copies of Diskgroups were made available, RAID was not common and RAID controllers were relatively expensive, so multiple copies had their place. Today, given the cost of disk space and the robust infrastructure supporting it, we only use one diskgroup copy, other than for compliance (i.e. EMC, IBM, KOM).

William_Howell
Star Contributor

Another use for Copy 2 might be disaster recovery where the primary site is compromised, although help from Hyland may be required to change all the Db pointers so that Copy 1 is available. You cannot create new documents without Copy 1. If you only have a backup, you should determine the length of time a full restore of that data would require and factor the cost of that downtime. (Don't forget about the database restore as well.)
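To put rough numbers on that restore estimate, here is a minimal back-of-the-envelope sketch. It assumes the restore runs at a constant throughput and that the database restore happens after the file restore rather than in parallel; the function name, parameters, and sample figures are all hypothetical and should be replaced with values measured in your own environment.

```python
def estimate_restore(data_gb, throughput_mb_per_s, db_restore_hours,
                     downtime_cost_per_hour):
    """Estimate total restore time (hours) and the downtime cost.

    Assumes constant restore throughput and a database restore that
    runs sequentially after the disk group files are restored.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput_mb_per_s must be positive")
    # Convert GB to MB, divide by MB/s to get seconds, then to hours.
    file_restore_hours = (data_gb * 1024) / throughput_mb_per_s / 3600
    total_hours = file_restore_hours + db_restore_hours
    return total_hours, total_hours * downtime_cost_per_hour


# Hypothetical example: 2 TB of disk group data restored at 100 MB/s,
# a 2 hour database restore, and downtime valued at 5000 per hour.
total_hours, cost = estimate_restore(
    data_gb=2048,
    throughput_mb_per_s=100,
    db_restore_hours=2,
    downtime_cost_per_hour=5000,
)
print(f"Estimated outage: {total_hours:.1f} h, estimated cost: {cost:,.0f}")
```

Real restores rarely sustain their peak throughput, particularly with many small files, so treat the result as a lower bound.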

There are multiple levels of RAID and you need to understand the failure points of any solution you choose. I have personally been a victim of multiple drive failures in a RAID 5 environment where all data was lost. Michael is correct that most larger IT shops will be using EMC or IBM SAN devices that are very robust and fault tolerant. None of that will matter if the power is out for a week or a fire destroys the building.

We have used Centera for Copy 2 in the past. The advantage of Centera, and possibly some others, is that the data is write-once. This helps prevent accidental or malicious deletion. Users also cannot access the Centera device directly due to its proprietary storage format, which enhances security. However, Centera is not well optimized for OnBase in that it assigns its own "file handle" of sorts to each file. The pool of file handles is limited and you can easily run out of handles before you exhaust storage space. Other than the high cost of Centera, the other issue is that it's very difficult to move Centera data to a different device, should you wish to do so, because of its proprietary format. Hyland has recently developed a utility to support migration off Centera, so that will help.

We are considering elimination of Copy 2, and use of Double-Take to mirror our data to a second location. This software is relatively inexpensive but should we lose our mass storage copy, the backup copy can quickly take over behind the scenes without requiring database pointer changes.

Bill

Jim_Dimmick
Confirmed Champ

Andrew,

Both Michael and Bill are correct in their points. The question to ask is what problem you want to solve. In most cases, using enterprise-class hardware for storage will cover high availability (you should have no single points of failure in your storage structure). OnBase copies allow you to leverage the various capabilities of your storage infrastructure. For high availability, does your current single-copy file server meet the SLA time for the application? If so, then HA is covered. Secondly, does the file server backup schedule meet both the recovery point and recovery time objectives (RPO/RTO) for the OnBase solution? If so, DR is covered. If HA and DR goals are not met with the file server backup, then you may choose to create a second copy of OnBase disk groups in order to meet the requirements for the solution.
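The HA and DR questions above can be written down as a simple checklist. This is only a sketch of the reasoning, not anything built into OnBase; the class names, field names, and sample figures are hypothetical, and it assumes the worst-case data loss equals the interval between backups.

```python
from dataclasses import dataclass


@dataclass
class StorageProfile:
    """What the existing single-copy file server actually delivers."""
    expected_availability: float  # e.g. 0.999 for "three nines"
    backup_interval_hours: float  # worst-case window of lost data
    restore_hours: float          # time to restore files and database


@dataclass
class Requirements:
    """What the OnBase solution is required to meet."""
    sla_availability: float
    rpo_hours: float
    rto_hours: float


def second_copy_worth_considering(profile, required):
    """Return True when the current storage misses the HA or DR goals."""
    ha_met = profile.expected_availability >= required.sla_availability
    dr_met = (profile.backup_interval_hours <= required.rpo_hours
              and profile.restore_hours <= required.rto_hours)
    return not (ha_met and dr_met)


# Hypothetical example: nightly backups against a 4 hour RPO.
current = StorageProfile(expected_availability=0.999,
                         backup_interval_hours=24,
                         restore_hours=8)
needed = Requirements(sla_availability=0.999, rpo_hours=4, rto_hours=12)
print(second_copy_worth_considering(current, needed))
```

In this example the availability and restore time are acceptable, but nightly backups cannot meet a 4 hour RPO, so a second copy (or some form of replication) would be worth evaluating.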

A second use case for multiple copies is placing copies in different locations. If you had an office in California and one in Virginia, you could place a copy of disk groups in each location. You could then point users to access files from their location as opposed to pulling files from the other coast. OnBase has other functionality to cover this use case as well (gateway caching server) but that's another topic.

Essentially, you only need 1 copy of OnBase if your infrastructure has HA and DR requirements met, or as both Michael and Bill mention, you want to archive to Centera or Tivoli. The last important point is, OnBase copy 2 can be a Backup copy. This backup is an OnBase backup scheduled or run through the OnBase application. This backup only makes a copy of "promoted" or closed volumes. Open volumes are not copied to the second copy location. The backup can be to removable media or near line storage. In your case, a second copy as a backup is not needed as backups are already being made.

Hope this helps-

MichaelBertrand
Star Collaborator

Bill, we have multiple customers running their Diskgroup(s) on a DFS volume replicated over a WAN to a remote site. All files (not just committed ones) are copied more or less immediately and are available without having to call their first (and then second) line of support, as you identify in your last paragraph. The restore time and effort are excellent points.

I think we have all seen RAID systems go down, and go down permanently. RAID is not a backup. RAID on a copy 2 diskgroup should not be thought of as a backup. Only a backup, and a tested backup at that, is a backup! My point is, if I am going to spend money on 2 RAID 5 setups, it may make more sense to use the same number of drives and controllers to build a RAID 6 setup with multiple hot or warm spares. Though, as with any parity-based RAID system, the time to rebuild parity is only getting longer and longer (due to the size of data stored).
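As a rough illustration of that trade-off, the sketch below compares two RAID 5 sets (one per diskgroup copy) against a single RAID 6 set with hot spares built from the same drives. It relies only on the standard parity overhead (one drive's worth for RAID 5, two for RAID 6); the drive counts, sizes, and rebuild rate are hypothetical, and the rebuild figure is a best case that assumes the array rebuilds at a constant rate with no other load.

```python
def raid5_usable_tb(drives, drive_tb):
    """Usable capacity of one RAID 5 set: one drive's worth goes to parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drives - 1) * drive_tb


def raid6_usable_tb(drives, drive_tb, hot_spares=0):
    """Usable capacity of one RAID 6 set: two drives' worth go to parity."""
    active = drives - hot_spares
    if active < 4:
        raise ValueError("RAID 6 needs at least 4 active drives")
    return (active - 2) * drive_tb


def min_rebuild_hours(drive_tb, rebuild_mb_per_s):
    """Best-case time to rewrite one full drive at a constant rate."""
    return drive_tb * 1_000_000 / rebuild_mb_per_s / 3600


# Hypothetical example: twelve 4 TB drives in total.
# Option A: two 6-drive RAID 5 sets, copy 2 mirroring copy 1, so the
# unique data stored is the capacity of a single set.
option_a = raid5_usable_tb(drives=6, drive_tb=4)
# Option B: one RAID 6 set with 2 hot spares held in reserve.
option_b = raid6_usable_tb(drives=12, drive_tb=4, hot_spares=2)

print(f"Two RAID 5 sets (mirrored copies): {option_a} TB of unique data")
print(f"One RAID 6 set with 2 hot spares:  {option_b} TB of unique data")
print(f"Best-case rebuild of one 4 TB drive at 100 MB/s: "
      f"{min_rebuild_hours(4, 100):.1f} h")
```

Capacity is only part of the picture: the two-copy layout can survive the loss of an entire array, while the single RAID 6 set survives any two concurrent drive failures but not the loss of the enclosure or controller.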

We are not an EMC or IBM shop, so when we have used Integration for Centera or Tivoli it was a requirement from the customer. They know how it works, the benefits and limitations. They also already have the procedures in place to take care of those devices.