Clarification on Backup file paths

eswbitto
Confirmed Champ
I'm in the testing phase of backing up both the file content and the database, and I need some clarity on the database. I've followed the wiki on doing hot/cold backups, but when I look at my file paths I have two postgresql directories:

/opt/alfresco/alf_data/postgresql
/opt/alfresco/postgresql

Which one do I need to back up? Both?

11 REPLIES

fcorti
Elite Collaborator
Hi ESWBitto,

As described in the wiki, the correct sequence for backing up Alfresco is: back up the DB first, then back up the documents and indexes (I assume both are stored in the alf_data folder).
The DB backup is usually done with your database's own backup tools, in your case PostgreSQL's.

It looks like there is something unusual about your installation, because you have two different postgres folders; probably one is unused, or is used for a database other than Alfresco's.
Either way, you can sidestep the question by using the 'pg_dump -Ft -b alfresco > db_backup.sql' command, which dumps the database through the running server instead of copying folders.

The wiki is definitely clear, but if you want to see a practical example, I suggest this script: https://francescocorti.wordpress.com/2013/02/06/alfresco-backup-script/

Hope this helps.

I'm not sure why it's creating two different locations. I'm using the 4.2.c bin file to install Alfresco.

Also,

Thanks for making this script; it really does help a lot. When I run it, though, I get an error on line 27:

./alfrescoBackup.sh: line 27: /opt/postgresql/bin/pg_dump: No such file or directory

Any ideas? I looked at /opt/alfresco/postgresql/bin and the file is there, despite the error. Weird.

*edit* Ah, I just looked at it... it's looking for /opt/postgresql instead of /opt/alfresco/postgresql.
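
For anyone hitting the same path mismatch: a small guard at the top of the backup script can make this failure obvious before anything is stopped. This is only a sketch; `find_pg_dump` is a made-up helper name, and the candidate paths are just the two mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the first executable pg_dump among the
# candidate paths given as arguments, or fail with a message on stderr.
find_pg_dump() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "pg_dump not found in any of: $*" >&2
    return 1
}

# Example use inside a backup script (paths taken from this thread):
#   PG_DUMP=$(find_pg_dump /opt/alfresco/postgresql/bin/pg_dump \
#                          /opt/postgresql/bin/pg_dump) || exit 1
```

Failing early like this means the script never gets as far as stopping Alfresco when the dump tool is not where it expects.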

eswbitto
Confirmed Champ
I am getting this message though.

pg_dump.bin: [archiver (db)] connection to database "alfresco" failed: could not connect to server: Connection refused
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 5432?

My guess is that once the script stops the service, pg_dump can't connect because PostgreSQL is no longer running.
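
One way to test that guess is to have the script confirm that PostgreSQL answers before it runs the dump. The sketch below is an assumption about how that could look, not part of the script under discussion: `retry` is a made-up helper, and the psql call in the comment uses the same placeholder style as elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 as soon as the command succeeds, 1 otherwise.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Only sleep when another attempt is still coming.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: give PostgreSQL up to 10 tries, 3 seconds apart, to accept a
# trivial query before dumping.
#   retry 10 3 /opt/alfresco/postgresql/bin/psql -h localhost \
#       -U [alfresco_user_name] -d [alfresco_db_name] -c 'SELECT 1' || exit 1
```

If the check never succeeds, the server really is down at that point in the script, which would explain the "Connection refused" message.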

fcorti
Elite Collaborator
Hi ESWBitto,

Honestly, I'm not sure, because we don't usually use the bundled installation, but it's easy to check: stop the service manually and then try to connect to the DB or run the export.
If that fails, you can remove the stop command from the script, but remember that it then becomes a hot backup.

Let us know.

bopolissimus
Confirmed Champ
/opt/alfresco/alf_data/postgresql is where the data is.
/opt/alfresco/postgresql is where the postgresql software is.

if you want to do a cold backup, just stop alfresco (/opt/alfresco/alfresco.sh stop) and then back up the whole /opt/alfresco directory (or, for a minor space saving, just /opt/alfresco/alf_data).

if you want to do a hot backup, alfresco would still be running.  let's say you're going to put the backup in /var/backups/alfresco (this is what we do).

sudo mkdir /var/backups/alfresco
sudo chown [user]:[group] /var/backups/alfresco # set user and group to whoever runs the alfresco process.

# do an initial rsync, minimize time for later rsync.
sudo rsync -avv -P --delete /opt/alfresco/alf_data /var/backups/alfresco/alf_data

cd /opt/alfresco/postgresql/bin
./pg_dump -ORc -h localhost -U [alfresco_user_name] [alfresco_db_name] > /opt/alfresco/alf_data/alfresco-db-backup.sql

# do a refresh rsync, faster than the first one.
sudo rsync -avv -P --delete /opt/alfresco/alf_data /var/backups/alfresco/alf_data

that should do the trick.  note: you'll need to inject the correct values wherever I have [SOMETHING_IN_SQUARE_BRACKETS]

there's another way to do a cold backup (stop alfresco, then start only postgres and pg_dump that).  but I can't describe the procedure since I've forgotten what it is, and I don't need to perform that backup procedure since we never stop alfresco; we always do hot backups.

eswbitto
Confirmed Champ
Thanks Bopolissimus,

I will try that and see how it goes. I initially did something similar: I stopped Alfresco, copied the alf_data directory to a different location, and started Alfresco again. Then I went in and deleted a test document I had been working with, and tried to restore it by stopping Alfresco again, replacing the alf_data directory, and starting Alfresco. It didn't like that. I ended up having to re-install the whole thing.

I'm going to try the steps you outlined and I'll let you know how it goes.

Yes, as you've found out, you can't do that.  The reason is, the state of alfresco is the union of alf_data contents AND database contents at the same point in time (or close enough in time that no content changes have happened between taking a backup of alf_data and the database).

When you delete the test document and then restore the old alf_data, alfresco won't start because it'll notice that your database and alf_data are out of sync.  There is content in the old alf_data which the database doesn't know about (it doesn't know its name, for instance, or where its versions are, or who edited it last; all that metadata is in the database).

This is why, when taking a backup, you keep the database dump and the alf_data together, so that when you restore, you restore both.  Restoring just one breaks alfresco, since alfresco will see that data and database are out of sync.
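
To make the "keep them together" point concrete, here is a rough sketch of bundling one database dump and one copy of alf_data under a single timestamped directory, so a restore always takes both from the same point in time. `bundle_backup` and its argument order are invented for illustration; the hot backup steps earlier in the thread get a similar effect by writing the dump inside alf_data before the final rsync.

```shell
#!/bin/sh
# Hypothetical helper: copy a database dump and archive alf_data into one
# timestamped directory, then print that directory's path.
#   $1 = path to the SQL dump, $2 = alf_data directory, $3 = backup root
bundle_backup() {
    dump_file=$1
    data_dir=$2
    dest_root=$3
    stamp=$(date +%Y%m%d-%H%M%S)
    target="$dest_root/$stamp"
    mkdir -p "$target" || return 1
    cp "$dump_file" "$target/" || return 1
    # Archive alf_data relative to its parent so it unpacks as "alf_data/".
    tar -czf "$target/alf_data.tar.gz" \
        -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || return 1
    printf '%s\n' "$target"
}

# Example (paths from this thread):
#   bundle_backup /opt/alfresco/alf_data/alfresco-db-backup.sql \
#       /opt/alfresco/alf_data /var/backups/alfresco
```

When restoring, both files from the same timestamped directory go back together: unpack the archive and reload the dump, never one without the other.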

eswbitto
Confirmed Champ
@ fcorti

OK, I think I figured out where I went wrong. I commented out the stopping command and tried to run the script. When I did that, it asked for a password, so I gave it my database password (my assumption of what it wanted) and it worked. I looked through the shell script again, and there is a parameter for the password that I had missed filling in.
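
For unattended runs, an alternative to putting the password in the script is PostgreSQL's password file, which pg_dump reads when the server asks for a password. The sketch below assumes the standard `hostname:port:database:username:password` line format and the requirement that the file be readable only by its owner; `add_pgpass_entry` is a made-up helper, and the bracketed values are placeholders to fill in.

```shell
#!/bin/sh
# Hypothetical helper: append one entry to a PostgreSQL password file and
# restrict the file to its owner (PostgreSQL ignores it otherwise).
#   $1 = password file, $2 = host, $3 = port, $4 = database,
#   $5 = user, $6 = password
# Caveat: a ':' or '\' inside any field must be backslash-escaped by the
# caller; this sketch does not do that.
add_pgpass_entry() {
    pgpass_file=$1
    host=$2
    port=$3
    db=$4
    user=$5
    password=$6
    touch "$pgpass_file" || return 1
    chmod 600 "$pgpass_file" || return 1
    printf '%s:%s:%s:%s:%s\n' "$host" "$port" "$db" "$user" "$password" \
        >> "$pgpass_file"
}

# Example, run as the user that executes the backup script:
#   add_pgpass_entry "$HOME/.pgpass" localhost 5432 \
#       [alfresco_db_name] [alfresco_user_name] [password]
```

With that entry in place, pg_dump -h localhost -U [alfresco_user_name] [alfresco_db_name] should no longer stop to prompt.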