
High Availability Deployment config question.

lfuller
Champ in-the-making
I notice that the wiki for high availability Alfresco deployment states that the Lucene indexes should not be shared but that the content should.  At present I've only been able to find one config that controls where both the content and the indexes are placed.  It would be a lot easier if there were a way to configure those locations separately.  Anyone out there mind sharing how you've handled this situation?
8 REPLIES

lfuller
Champ in-the-making
I am following the instructions for HA Config found here:

http://wiki.alfresco.com/wiki/High_Availability_Configuration_V1.4

I selected this because it seemed to make the most sense and makes use of ehcache's distributed features with which we've enjoyed a great deal of success on other products.

The difficulty I am encountering is in getting the non-shared Lucene indexes to function properly.

I am using Alfresco 2 and have copied the index-tracking-context.xml.sample file into the extension directory, renaming it index-tracking-context.xml.  If more than this is required, I'd be interested in hearing about it.

I have placed the dir.root directory on a shared filesystem.

I set this config up by first starting my node 1 against a clean db with no alfresco data in the shared directory.

I then copy the lucene-index to a local directory for both node1 and node2.  Then, in the shared directory, I remove the original lucene-index and create a symlink (ln -s) that points to the local lucene-index directory.  Upon startup I typically see this exception occur once on one of the nodes.

13:13:20,769 ERROR [quartz.core.JobRunShell] Job DEFAULT.ftsIndexerJobDetail threw an unhandled Exception:
java.lang.RuntimeException: Error during run with lock.
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.doWithFileLock(IndexInfo.java:2083)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.setStatus(IndexInfo.java:1104)
        at org.alfresco.repo.search.impl.lucene.LuceneBase2.setStatus(LuceneBase2.java:256)
        at org.alfresco.repo.search.impl.lucene.LuceneIndexerImpl2.prepare(LuceneIndexerImpl2.java:736)
        at org.alfresco.repo.search.impl.lucene.LuceneIndexerAndSearcherFactory2.prepare(LuceneIndexerAndSearcherFactory2.java:706)
….

From this point forward the application appears to function normally.
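For concreteness, the index-relocation steps described above can be sketched as follows (the paths are hypothetical stand-ins for the shared dir.root and node-local storage; substitute your own):

```shell
# Hypothetical layout: SHARED stands in for the shared dir.root,
# LOCAL for node-local disk on this cluster member.
SHARED=$(mktemp -d)/alf_data
LOCAL=$(mktemp -d)/node1
mkdir -p "$SHARED/lucene-indexes" "$LOCAL"
touch "$SHARED/lucene-indexes/segment_1"   # stand-in for an index file

# 1. Copy the index built against the shared store onto node-local disk
cp -r "$SHARED/lucene-indexes" "$LOCAL/lucene-indexes"

# 2. Remove the shared copy and replace it with a symlink to the local one
rm -rf "$SHARED/lucene-indexes"
ln -s "$LOCAL/lucene-indexes" "$SHARED/lucene-indexes"
```

The same two steps would be repeated on the second node, pointing at that node's own local directory.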

However, after shutting down and restarting both nodes, exceptions begin to occur at regular Quartz-defined intervals on both nodes.

java.nio.BufferUnderflowException
        at java.nio.ByteBuffer.get(ByteBuffer.java:650)
        at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:233)
        at java.nio.ByteBuffer.get(ByteBuffer.java:674)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.readString(IndexInfo.java:1905)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.setStatusFromFile(IndexInfo.java:1833)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.setStatusFromFile(IndexInfo.java:1603)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.access$1300(IndexInfo.java:113)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo$2.doWork(IndexInfo.java:477)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.doWithFileLock(IndexInfo.java:2070)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.<init>(IndexInfo.java:473)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.getIndexInfo(IndexInfo.java:331)
        at org.alfresco.repo.search.impl.lucene.LuceneBase2.initialise(LuceneBase2.java:104)
        at org.alfresco.repo.search.impl.lucene.LuceneSearcherImpl2.getSearcher(LuceneSearcherImpl2.java:113)


and most often:
log4j:ERROR Error occured while converting date.
java.lang.NullPointerException
        at java.lang.System.arraycopy(Native Method)
        at java.lang.AbstractStringBuilder.getChars(AbstractStringBuilder.java:331)
        at java.lang.StringBuffer.getChars(StringBuffer.java:202)
        at org.apache.log4j.helpers.AbsoluteTimeDateFormat.format(AbsoluteTimeDateFormat.java:108)
        at java.text.DateFormat.format(DateFormat.java:314)
        at org.apache.log4j.helpers.PatternParser$DatePatternConverter.convert(PatternParser.java:436)
        at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:56)
        at org.apache.log4j.PatternLayout.format(PatternLayout.java:495)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:292)
        at org.apache.log4j.DailyRollingFileAppender.subAppend(DailyRollingFileAppender.java:349)

andy
Champ on-the-rise
Hi

The location of indexes can be set in the properties - see repository.properties.

Andy
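A minimal sketch of the relevant entries, assuming the stock property names found in repository.properties (verify against the file shipped with your version):

```properties
# Shared content store: same path on every node
dir.root=/shared/alf_data

# Node-local Lucene indexes, overriding the default ${dir.root}/lucene-indexes
dir.indexes=/local/alf_indexes
```

With dir.indexes overridden per node, the symlink workaround described earlier should no longer be necessary.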

lfuller
Champ in-the-making
Thus far I've been able to get an HA deployment partially working.  Each node reflects the addition, removal, or update of content on the other node.  The only issue I'm currently having is when I try to actually view cm:content.  For instance, if I upload a JPEG to one node, both can see that a new cm:content has been added.  Unfortunately, when I try to download and view the file, only the node to which I originally uploaded the cm:content can deliver it.

lfuller
Champ in-the-making
org.alfresco.service.cmr.repository.ContentIOException: Failed to open stream onto channel:
   accessor: ContentAccessor[ contentUrl=store://2007/6/7/18/52/ba7926a2-1549-11dc-959f-1dc25b8d07d7.bin, mimetype=application/octet-stream, size=0, encoding=UTF-8]
        at org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:335)
        at org.alfresco.service.cmr.repository.datatype.DefaultTypeConverter$58.convert(DefaultTypeConverter.java:735)
        at org.alfresco.service.cmr.repository.datatype.DefaultTypeConverter$58.convert(DefaultTypeConverter.java:733)
        at org.alfresco.service.cmr.repository.datatype.TypeConverter.convert(TypeConverter.java:120)
        at org.alfresco.jcr.item.JCRTypeConverter.convert(JCRTypeConverter.java:265)
        … 10 more
This is the exception I get while trying to access a file from the other Alfresco cluster node.  If I try to access the same file from the node to which it was first uploaded, I can access it with no difficulties.

Caused by: org.alfresco.service.cmr.repository.ContentIOException: Failed to open file channel: ContentAccessor[ contentUrl=store://2007/6/7/18/52/ba7926a2-1549-11dc-959f-1dc25b8d07d7.bin, mimetype=application/octet-stream, size=0, encoding=UTF-8]
        at org.alfresco.repo.content.filestore.FileContentReader.getDirectReadableChannel(FileContentReader.java:233)
        at org.alfresco.repo.content.AbstractContentReader.getReadableChannel(AbstractContentReader.java:224)
        at org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:328)
        … 14 more
Caused by: java.io.IOException: File does not exist
        at org.alfresco.repo.content.filestore.FileContentReader.getDirectReadableChannel(FileContentReader.java:208)

lfuller
Champ in-the-making
Here is my replicating-content-services-context.xml config:

<beans>
                                                                                           
   <bean id="alternateContentStore"
         class="org.alfresco.repo.content.filestore.FileContentStore">
      <constructor-arg>
         <value>/export/home/portal/shared/store_b/alfresco_data</value>
      </constructor-arg>
   </bean>
                                                                                           
   <bean id="replicatingContentStore"
         class="org.alfresco.repo.content.replication.ReplicatingContentStore" >
      <!-- the preferred store for reads and writes -->
      <property name="primaryStore">
         <ref bean="fileContentStore" />
      </property>
      <!-- example of possible secondary store configuration -->
      <property name="secondaryStores">
         <list>
            <ref bean="alternateContentStore" />
         </list>
      </property>
      <!-- enable content missing from the primary store to be pulled in from the secondary stores -->
      <property name="inbound">
         <value>false</value>
      </property>
      <!-- enable replication from the primary to the secondary stores -->
      <property name="outbound">
         <value>false</value>
      </property>
      <!-- this is required if outbound replication is active, otherwise not -->
      <property name="transactionService">
         <ref bean="transactionComponent" />
      </property>
      <property name="retryingTransactionHelper">
         <ref bean="retryingTransactionHelper"/>
      </property>
      <!-- set this to force outbound replication to be asynchronous -->
      <property name="outboundThreadPoolExecutor">
         <ref bean="threadPoolExecutor" />
      </property>
   </bean>
                                                                                           
</beans>


The file not found error also occurs if I attempt to set both outbound and inbound replication to true.  Node B has a similar configuration which points at the store_a store.

To reiterate, I can search from both nodes and see accurate results.  I can create, update, and delete content and see the results reflected on both nodes.  I'm unable to actually pull down files that have been uploaded from both nodes… which makes this whole cluster config significantly less useful. :)

lfuller
Champ in-the-making
I have the HA configuration functioning properly at the moment.  In doing so I've discovered two things:

1) I had misconfigured the alternateContentStore to point at the base directory rather than the actual content store.

2) The ReplicatingContentStore, with outbound and inbound both set to true, did not actually replicate the changes.  And without that replication, the secondaryStore is not consulted when a file cannot be found in the primaryStore.

With that in mind I added the contentStoreReplicator bean, along with a cron trigger to replicate the content to the secondary store at a regular interval (similar to the sample, except run several times a minute).  This seems to do the trick, but makes me wonder:
What, if anything, is the ReplicatingContentStore providing in my current scenario?

Is there any way to get HA to work without, at minimum, doubling the space used by the content stores?  I was hoping that using the replicatingContentStore would allow me to access content not in my primary store, and thereby avoid duplicating my content.

Is running the ContentStoreReplicator every 10 seconds like I have below (similar to how the lucene indexes are being handled) sustainable?

Here is the config I am using now (we only have two nodes in the cluster currently):

   <bean id="alternateContentStore"
         class="org.alfresco.repo.content.filestore.FileContentStore">
      <constructor-arg>
         <value>/export/home/portal/shared/10.0.4.25/alfresco_data/contentstore</value>
      </constructor-arg>
   </bean>

   <bean id="contentStoreReplicator"
         class="org.alfresco.repo.content.replication.ContentStoreReplicator"
         depends-on="fileContentStore, alternateContentStore" >
      <property name="sourceStore">
          <ref bean="fileContentStore" />
      </property>
      <property name="targetStore">
          <ref bean="alternateContentStore" />
      </property>
   </bean>

   <bean id="contentStoreBackupTrigger" class="org.alfresco.util.CronTriggerBean">
      <property name="jobDetail">
         <bean class="org.springframework.scheduling.quartz.JobDetailBean">
            <property name="jobClass">
               <value>org.alfresco.repo.content.replication.ContentStoreReplicator$ContentStoreReplicatorJob</value>
            </property>
            <property name="jobDataAsMap">
               <map>
                  <entry key="contentStoreReplicator">
                     <ref bean="contentStoreReplicator" />
                  </entry>
               </map>
            </property>
         </bean>
      </property>
      <property name="scheduler">
         <ref bean="schedulerFactory" />
      </property>
      <property name="cronExpression">
         <!--<value>0 0 03 * * ?</value>-->
        <value>0,10,20,30,40,50 * * * * ?</value>
      </property>
   </bean>
   
   <bean id="replicatingContentStore"
         class="org.alfresco.repo.content.replication.ReplicatingContentStore" >
      <!-- the preferred store for reads and writes -->
      <property name="primaryStore">
         <ref bean="fileContentStore" />
      </property>
      <!-- example of possible secondary store configuration -->
      <property name="secondaryStores">
         <list>
            <ref bean="alternateContentStore" />
         </list>
      </property>
      <!-- enable content missing from the primary store to be pulled in from the secondary stores -->
      <property name="inbound">
         <value>true</value>
      </property>
      <!-- enable replication from the primary to the secondary stores -->
      <property name="outbound">
         <value>true</value>
      </property>
      <!-- this is required if outbound replication is active, otherwise not -->
      <property name="transactionService">
         <ref bean="transactionComponent" />
      </property>
      <property name="retryingTransactionHelper">
         <ref bean="retryingTransactionHelper"/>
      </property>
      <!-- set this to force outbound replication to be asynchronous -->
      <property name="outboundThreadPoolExecutor">
         <ref bean="threadPoolExecutor" />
      </property>
   </bean>




elwood
Champ in-the-making
Hi all,

I'm using Alfresco 2.1.0 R1 (community, Tomcat) on two nodes (two different Windows machines), MySQL as the DB, and a shared folder for the content …
the indexes are local to each node.

I'm following the http://wiki.alfresco.com/wiki/High_Availability_Configuration_V1.4

I've renamed the index-tracking-context.xml and ehcache-custom.xml sample files in the extension folder.

After the nodes start up (with no errors) I can see content creation replicated, but not property modifications (logged in at the same time as admin on the two nodes).

Where am I going wrong?
The ehcache configuration?
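For reference, ehcache's distributed mode normally needs peer discovery and a listener configured in ehcache-custom.xml, roughly along these lines (a sketch using ehcache's standard RMI multicast discovery; the multicast address and port are placeholders, and your distribution's ehcache-custom.xml.sample.cluster shows the exact layout):

```xml
<cacheManagerPeerProviderFactory
   class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
   properties="peerDiscovery=automatic,
               multicastGroupAddress=230.0.0.1,
               multicastGroupPort=4446"/>

<cacheManagerPeerListenerFactory
   class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"/>
```

If multicast traffic is blocked between the two Windows machines (a firewall, for example), cache replication can fail silently, which would be consistent with seeing no startup errors yet no property updates on the other node.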

thank you.

alfiasco
Champ in-the-making
I have exactly the same problem.