Deployment wedged in 'IN PROGRESS'

ftoth
Champ in-the-making
Hi,

I'm hoping someone out there has a fix for this. We're using Alfresco 2.1 Community.
Somehow we've got a WCM deployment stuck in "IN PROGRESS" and I can't find
any way of unsticking it. No amount of restarting affects it. If I try new
deployments, I get a NullPointerException each time I hit either the deploy
button or the revert button. If I try to revert the "IN PROGRESS" deployment,
I get an error about a missing file.

At this point we are completely stuck as we can't deploy anything at all.

Is there any way out of this? (Short of scraping the disk clean and building from backups!)

Is there any way to delete a snapshot? I'm not averse to hitting the database
directly if it avoids having to rebuild.

Any advice appreciated.

Thanks,

Fred
6 REPLIES

mrogers
Star Contributor
First some questions.  
Are you deploying to an ASR or FSR?
Have you restarted both client and server ends?

ftoth
Champ in-the-making
Hi,

I'm deploying to the Alfresco deployment receiver which shipped with 2.1. So
that must be the FSR. It's a small Java process that receives RMI deployment
requests.

Yes, I've restarted both sides of the connection. As it happens, everything
is on the same machine. We've been using this for months with no problems.

The "IN PROGRESS" message appears to reflect static state stored in the
database, since nothing happens on restart. Alfresco makes no attempt to actually
deploy anything at restart. I know this because I have the deployment receiver
configured to run custom code. The previous deployment went perfectly. Nothing
has happened on the receiver side since.

I'd like to completely delete the snapshot if that's possible.

If I click on the deploy button, for ANY snapshot, I get a long string of NullPointerExceptions
that seem to originate here:

java.lang.NullPointerException
    at org.alfresco.web.ui.wcm.component.UIDeployWebsite.encodeBegin(UIDeployWebsite.java:191)

If I try to revert the stuck deployment, or any previous deployment, I get another error:

org.alfresco.service.cmr.avm.AVMNotFoundException: Does not exist: ANDI-rd-landscape-abstracts.pdf
    at org.alfresco.repo.avm.AVMStoreImpl.removeNode(AVMStoreImpl.java:639)

I'm diving into the code next. Any help appreciated. This is a production system.

Thanks,

Fred

ftoth
Champ in-the-making
Note we were never able to get out of this mess and we decided to restore
from backup, which led to other problems:

http://forums.alfresco.com/en/viewtopic.php?f=14&t=14799

This particular problem turned up on two different running instances of Alfresco.
One had been running for 12 months or so with more than 500 deployments.
The other had been running for 4-6 months with 250+ deployments. Then, out
of the blue, I had the same problem on both systems within 24 hours of each other.

If anyone stumbles onto this thread with the same problem, please comment.

Thanks,

Fred

rdanner
Champ in-the-making
Are these virtual machines? Are they synced with a time server?

ftoth
Champ in-the-making
Hi,

Good questions, but no to both. Different machines, different time zones, different continents!

I went down this road too. I started thinking "Maybe this is the weekend when we
used to switch to daylight saving time…". But no.

I still have a few remaining ideas:

1. Full moon.
2. My karma is telling me I should be a sheep herder.
3. Coincidence.
4. Some strange JavaScript-related problem with Firefox 3
and the clever deployment monitor AJAX stuff.

The last one is a stretch, but while rebuilding one of the Alfresco instances I managed
to cause the same problem myself, somehow. At the same time, I got a
little JavaScript alert that said, if I recall, "Invalid", followed by an OK button.
This happened when I clicked the "Close" button after my deployments were
supposedly complete.

I dismissed it as a fluke, but…

Thanks,

Fred

ftoth
Champ in-the-making
Hi,

For those of you who might be following this thread:

My last speculation on this (Firefox interaction) has been disproved. We just hit
the bug again, but this time using IE.

Thanks,

Fred