Need advice about backup/restore

zomurn
Champ in-the-making
Hello,

I was wondering how to proceed with backup/restore of the server in this scenario:

1) The production version is working well.
2) I back up the production version.
3) One week later, I update the custom model to add more metadata, and I apply the new default metadata values to all documents in production.
4) The application in production crashes… I need to restore from step 2).
….
Documents in production have the new metadata, which is no longer present in the model because of the restore… How do I deal with this scenario?

Help me please
23 REPLIES

derek
Star Contributor
Hi,
If you restored to before you installed the new model, then you should not have the new metadata anywhere in the data.  I don't understand.
Regards

zomurn
Champ in-the-making
Yes, you're right. But then I would have to tell my customer (150 people) that the work they did during that week (injecting thousands of documents through CIFS) is lost and that they need to redo their scans… impossible!
Hence I need to keep the "HEAD version" of alf_data + database.

zaizi
Champ in-the-making
There is no way you can keep the latest alf_data + database without the model changes, because the latest data already contains those changes.

Your best course of action is to work out what caused the crash and try to fix it.

zomurn
Champ in-the-making
OK thanks.
Fortunately, to deal with that, I have a pre-production version which mirrors the production version (at least the latest model, so not necessarily the latest alf_data + database).
Hence I can test the deployment there to see whether the disaster might occur.
One last question: is there only one "critical" file in Alfresco, namely "custom-model.xml"?
Normally, all other files can be edited without risk, can't they?

Thanks for your advice.

zaizi
Champ in-the-making
If by critical you mean files that modify the repository, there are a bunch of them.

If you have pre-production servers, then all changes should be tested on them first. You should not be modifying live directly. That doesn't just apply to Alfresco.

zomurn
Champ in-the-making
Of course, on pre-production I test the new functionality in addition to checking that the Alfresco server starts up properly.
Exactly, that is what I mean by critical files… I only know of custom-model.xml; you're scaring me… can you give me some more files?

Thanks.

derek
Star Contributor
Hi,

We attempt to write the code so that any unrecognized types, aspects and properties (they're in the data but not in the model) are just handled normally.  It is very likely, if you are running on the latest release, that the system will work fine.  If not, identifying and cleaning up the data can be done directly on the database.

Your maximum backup interval is dictated by how much you can afford to lose.  If you can't afford to lose a week when you restore, then you have to take backups more regularly.  You can do nightly or hourly backups of the DB and then weekly backups of the filesystem if (a) you have a reliable filesystem and (b) you protect the content from the orphan cleanup for at least a week.

zomurn
Champ in-the-making
"Your maximum backup interval is dictated by how much you can afford to lose. If you can't afford to lose a week when you restore, then you have to take backups more regularly. You can do nightly or hourly backups of the DB and then weekly backups of the filesystem if (a) you have a reliable filesystem and (b) you protect the content from the orphan cleanup for at least a week."

Thanks very much for your participation… but answers raise new questions, which call for new answers, and so on, you know :).

1) Can you explain point (b)?

2) "How much you can afford to lose": nothing, that is likely to be the customer's answer ;).
Why back up the database more often? Suppose I back up alf_data each Monday and the DB each night. If I have a crash before the next Monday… the DB backups from Tuesday to Sunday are useless because they are newer than alf_data, aren't they? The database backup must be older than alf_data (hot backup) or at the same version as alf_data (cold backup).

From this, and from the fact that I have to back up often (say every day), it follows that I need to back up alf_data every day… That costs a lot of space unless I do incremental backups (and I suppose I need to ;)). Moreover, the customer has a RAID 5 machine with hardware mirroring backup.
Personally, if I have a problem deploying to production, I would rather not fix anything on the spot (as with a Windows crash). I prefer to restore the latest backup, which is sure, safe and fast (?), and postpone the next version until I have identified the problem on my own machine (which can take a varying amount of time).

derek
Star Contributor
"but answers raise new questions, which call for new answers, and so on, you know"
Yes.  But at least I don't end up saying stuff you already know.

1) See http://wiki.alfresco.com/wiki/Content_Store_Configuration#Content_Binaries.27_Lifecycle.  Content (at the storage level) is only ever created by the user, never modified.  Content that has been orphaned is deleted by a background job.  This job doesn't clean out anything younger than 7 days, but this can be changed by overriding the bean from scheduled-jobs-context.xml.  It follows that, during that protected period, you can back up and restore your DB seamlessly.  The general ordering rules governing hot backup still apply: you back up the DB and then the filesystem, and the time between the two (DB and FS) must not exceed the time that orphaned content is protected.  So you can run regular DB backups and only run irregular FS backups.  You can also back up only the newer content by targeting subdirectories of the content store (since they're time-located).
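
To illustrate the last point, here is a rough Python sketch that picks out the content store subdirectories dated on or after a given day, so that only those need to be copied. It assumes the store is laid out as <root>/<year>/<month>/<day>/...; that layout and the function name recent_day_dirs are assumptions made for the example, so check your own content store before relying on it.

```python
from datetime import date
from pathlib import Path


def recent_day_dirs(store_root, since):
    """Return day-level directories dated on or after `since`, oldest first.

    Assumes (hypothetically) a <root>/<year>/<month>/<day>/ layout.
    """
    root = Path(store_root)
    found = []
    for day_dir in root.glob("*/*/*"):
        if not day_dir.is_dir():
            continue
        try:
            year, month, day = (int(p) for p in day_dir.relative_to(root).parts)
            stamp = date(year, month, day)
        except ValueError:
            continue  # not a year/month/day directory, so skip it
        if stamp >= since:
            found.append((stamp, day_dir))
    found.sort(key=lambda item: item[0])
    return [day_dir for _, day_dir in found]
```

The returned directories could then be handed to whatever copy tool you already use; anything older is assumed to be covered by the previous filesystem backup.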

2) Since the FS is running on a RAID system, it has redundancy and doesn't need to be backed up that often.  Once a week for RAID is probably OK.  To protect yourself against software and user-related issues (i.e. unpredictable reliability), you'll need to back up the DB a lot more often; this is especially true after an upgrade or modification of the software.  If you followed point 1, you'll take very frequent hot backups of the DB followed by hot backups of the filesystem.  You will therefore always have a DB that is less than 7 days old when you copy the content.
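
The rule behind both points can be written down as a small check. This is only a sketch: the function name is made up for the example, and the 7-day window is the default mentioned above, which you would need to adjust if you override the cleanup job.

```python
from datetime import datetime, timedelta

# Default protection period for orphaned content, as described above.
# Assumption: change this if the cleanup job has been reconfigured.
ORPHAN_PROTECT = timedelta(days=7)


def is_consistent_pair(db_backup_at, fs_backup_at, protect=ORPHAN_PROTECT):
    """True if a DB backup can be restored together with this content copy.

    The DB must be dumped first, and the content must be copied before
    anything the DB still references could have been cleaned up.
    """
    gap = fs_backup_at - db_backup_at
    return timedelta(0) <= gap <= protect
```

For a software crash where the content store on disk is intact, the "filesystem copy" is simply the live store at restore time, so any DB backup younger than the protection period passes the check.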

Regards