Incremental Backup vs Deleted File Restoration

eswbitto
Confirmed Champ
I'd like to get a consensus from the community on this. Right now I'm running nightly backups on a test install of Alfresco 4.2.c, which we plan to take live within weeks. My question is: what is the preferred approach when implementing a backup strategy? What has worked for you?

The reason I ask is that we are doing full, compressed backups, following the instructions taken from HERE. I've skimmed through the forums and I know that the contentstore organizes uploaded files by date and time, which is really great! The other aspect is that, from what I understand, when a file (or a version of a file) is deleted, it can be restored, I believe within 14 days of the deletion.

That being said, when running an incremental that follows Alfresco's date/time layout, is a database backup still required alongside that incremental? Looking at the database file structure, it doesn't appear to use the same layout as the contentstore.
So would just backing up the whole database directory to coincide with that particular incremental work?

OR
Am I just overthinking this completely? I'm currently using just a nightly (full) hot backup that will get progressively larger over time. Is that, coupled with simply relying on being able to restore within the 14-day window, a better solution?
2 REPLIES

bopolissimus
Confirmed Champ
Yes, your dbdump that corresponds to the incremental must be stored alongside the incremental.
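
One way to check that pairing before relying on a backup set is sketched below. The file naming scheme (`contentstore-<timestamp>.*` and `db-<timestamp>.*`) is purely hypothetical and not something Alfresco or any backup tool prescribes; adapt the pattern to whatever your own scripts produce.

```python
import re

# Hypothetical naming convention, e.g.:
#   contentstore-20130901T010000.tar.gz
#   db-20130901T010000.sql.gz
PATTERN = re.compile(r"^(contentstore|db)-(\d{8}T\d{6})\.")


def unpaired_backups(filenames):
    """Return timestamps that have a content archive but no db dump, and vice versa."""
    seen = {"contentstore": set(), "db": set()}
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            kind, stamp = match.groups()
            seen[kind].add(stamp)
    return {
        "missing_db": sorted(seen["contentstore"] - seen["db"]),
        "missing_content": sorted(seen["db"] - seen["contentstore"]),
    }
```

Any timestamp reported under `missing_db` is an incremental you could not restore consistently, since there is no database dump from the same point in time.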

Your incremental should also know about files that have been moved or removed (e.g., moved from contentstore to contentstore.deleted), and restoring the full plus the intervening incrementals should exactly replicate the state of alf_data at the time of the last incremental in the restore.  So if files have been removed completely from contentstore.deleted, or have been moved from contentstore to contentstore.deleted, your restore procedure should reflect that.

If it doesn't (e.g., if a FULL+incrementals restore is only additive and doesn't reflect moves or removes), then Alfresco will not start (content integrity).  Whatever your backup procedure, you should test the restore; this gets more important as your procedure gets more complex, as with FULL+incrementals.  You'll also want to run the incrementals when Alfresco is either down (if you can tolerate the downtime) or when there's likely to be no traffic.  This is because content updates, adds, or removes between when you take the incremental and when you take the db backup are likely to cause content integrity issues.
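
To make the "not just additive" point concrete, here is a minimal sketch of a manifest-based check. It is not how any particular backup tool works; the function names and the manifest format (relative path mapped to a SHA-256 digest) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def snapshot_manifest(root):
    """Map each file path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(previous, current):
    """Describe an incremental: files that are new or changed, and files that are gone."""
    changed = {p: h for p, h in current.items() if previous.get(p) != h}
    removed = sorted(p for p in previous if p not in current)
    return {"changed": changed, "removed": removed}


def replay(full, incrementals):
    """Rebuild the expected manifest from a full backup plus ordered incrementals."""
    state = dict(full)
    for inc in incrementals:
        state.update(inc["changed"])
        # Skipping this loop is what makes a restore "additive only".
        for path in inc["removed"]:
            state.pop(path, None)
    return state
```

After a test restore, comparing `snapshot_manifest(restored_alf_data)` against `replay(full, incrementals)` would show whether a file that moved from contentstore to contentstore.deleted was wrongly left behind in contentstore.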

Where I work, the sysadmins have settled on dirvish.  You might look into that if you're using Linux.  It's a lot simpler than incrementals.  Identical files from multiple backups are stored as Linux hard links, so only one copy is kept around.  If all links to the file are deleted (it was deleted N days ago and we only keep N backup generations), then the disk space is freed up.
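
The hard-link idea can be sketched in a few lines of Python. This is only an illustration of the technique, not dirvish's implementation (dirvish is built on rsync); the `snapshot` function and the generation directory names are hypothetical, and hard links require all generations to live on the same filesystem.

```python
import filecmp
import os
import shutil
from pathlib import Path


def snapshot(source, dest, previous=None):
    """Copy source into dest, hard-linking files unchanged since the previous generation."""
    source, dest = Path(source), Path(dest)
    previous = Path(previous) if previous else None
    dest.mkdir(parents=True, exist_ok=True)
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue  # empty directories are not reproduced in this sketch
        rel = src.relative_to(source)
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
            # Same content as last generation: add another name for the same data.
            os.link(old, target)
        else:
            shutil.copy2(src, target)
```

Each generation looks like a complete copy of alf_data, so a restore is a plain copy of one directory. Deleting an old generation with `shutil.rmtree` only frees the space of files that no newer generation still links to.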

There will be occasions when we keep two copies of the same file (moved from contentstore to contentstore.deleted and not yet physically removed from one), but the disk space used isn't anywhere near N x [content size], where N is the number of generations.

I don't think I would (for myself) use FULL+incrementals for Alfresco backup.  Disk is pretty cheap, and the added complexity of FULL+incrementals is not something I'd like to deal with when restoring.

Good luck.

eswbitto
Confirmed Champ
@bopolissimus

I'm leaning toward agreeing with you. The more I research this, the more it seems like more of a hassle than just restoring a full backup. There would be ways to make sure deleted content stays deleted. Thanks again for your input. Have a good one!