Hyland Connect

paul_lahitte · ‎06-09-2008

I am using Alfresco Community on Red Hat Linux, and lucene-indexes/workspace/SpacesStore contains 32001 directories. The system refuses to create a new directory ("too many links"), and Alfresco then fails to start because it tries to create a new directory.

The log:
java.io.IOException: Cannot create directory: /data/opt/alf_data/lucene-indexes/workspace/SpacesStore/e51ff042-3628-11dd-b952-2b903bc78fd4
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:175)
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:227)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo$Merger.mergeIndexes(IndexInfo.java:2943)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo$Merger.run(IndexInfo.java:2448)
        at java.lang.Thread.run(Thread.java:619)
ERROR [org.alfresco.repo.search.impl.lucene.index.IndexInfo] Failed to merge indexes
java.io.IOException: Cannot create directory: /data/opt/alf_data/lucene-indexes/workspace/SpacesStore/e55b2483-3628-11dd-b952-2b903bc78fd4
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:175

Does anyone know how to increase this number?

Thanks

paul_lahitte · ‎06-11-2008

I left the process running all night (I was quite depressed and tired and went to bed).
In the morning the Java process had stopped and the log was full of warnings.

I restarted Alfresco with index.recovery.mode=VALIDATE and it started!
Now my lucene-indexes directory contains only 37 directories and is 1.2 GB in size (sounds better!).

I still have lots of warnings and errors related to Lucene in the log file, but at least people can connect and work …

Part of catalina.out

        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:210)
        at $Proxy9.index(Unknown Source)
        at org.alfresco.repo.search.impl.lucene.fts.FTSIndexerJob.execute(FTSIndexerJob.java:52)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        … 1 more
Caused by: java.lang.UnsupportedOperationException: The content never exists
        at org.alfresco.repo.content.EmptyContentReader.getDirectReadableChannel(EmptyContentReader.java:59)
        at org.alfresco.repo.content.AbstractContentReader.getReadableChannel(AbstractContentReader.java:226)
        at org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:330)
        … 17 more
[WARNING] Unknown Ptg 1c (28)
[WARNING] Unknown Ptg 3c (60)
(the "Unknown Ptg 3c (60)" warning repeats several dozen times)
[WARNING] Unknown Ptg 14 (20) at cell (15,26)
09:31:18,983 ERROR [quartz.core.JobRunShell] Job DEFAULT.ftsIndexerJobDetail threw an unhandled Exception:
org.alfresco.service.cmr.repository.ContentIOException: Failed to open stream onto channel:
   accessor: ContentAccessor[ contentUrl=store://2008/6/11/9/31/61bba49c-3788-11dd-b926-13890b8e687a.bin, mimetype=null, size=0, encoding=UTF-8, locale=fr_FR]
        at org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:337)
        at org.alfresco.repo.search.impl.lucene.ADMLuceneIndexerImpl.indexProperty(ADMLuceneIndexerImpl.java:858)
        at org.alfresco.repo.search.impl.lucene.ADMLuceneIndexerImpl.createDocuments(ADMLuceneIndexerImpl.java:542)
        at org.alfresco.repo.search.impl.lucene.ADMLuceneIndexerImpl.updateFullTextSearch(ADMLuceneIndexerImpl.java:1248)
        at org.alfresco.repo.search.impl.lucene.fts.FullTextSearchIndexerImpl.index(FullTextSearchIndexerImpl.java:188)
        at sun.reflect.GeneratedMethodAccessor382.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:281)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:187)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:154)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:107)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:176)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:210)
        at $Proxy9.index(Unknown Source)
        at org.alfresco.repo.search.impl.lucene.fts.FTSIndexerJob.execute(FTSIndexerJob.java:52)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)
Caused by: java.lang.UnsupportedOperationException: The content never exists
        at org.alfresco.repo.content.EmptyContentReader.getDirectReadableChannel(EmptyContentReader.java:59)
        at org.alfresco.repo.content.AbstractContentReader.getReadableChannel(AbstractContentReader.java:226)
        at org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:330)
        … 17 more

Any clues?
Thanks again for your help.

Paul

jjf · ‎06-23-2008

We have encountered the same issue (the index folder has hit its limit). I do believe the merging process is working, but it is merging at a slower rate than our content updates arrive. So, over time, the folder will always fill with indexes and lock up the system. Is there any workaround besides clearing the indexes and doing a full reindex? Is there a way to stop the full-text indexer from running?

tman · ‎07-01-2008

Same problem here: after reindexing, Lucene creates another 32000 directories and hangs again, so another reindex is needed.
Is there any way to get around this limitation and the ongoing manual reindexing?

Thanks

karlovitz · ‎07-02-2008

This question is twofold. We are currently building out an Alfresco implementation for a client. I am relatively new to Alfresco, so I am trying to learn all I can before making bad decisions. This 32000-file limitation scares me, as we will be storing a large number of assets. Is there a way around this? Should we reindex on a regular basis?

This brings me to my next question: I am currently experiencing a performance problem when searching our assets. A basic Lucene text query takes anywhere from 20 to 30 seconds to return. My first step was to rebuild the indexes by passing in recovery_mode=FULL. Although the log file says that the indexing is complete, Tomcat didn't move on to finish starting up.

08:55:31,310 INFO [org.alfresco.repo.node.index.FullIndexRecoveryComponent] 90 % complete.
09:04:30,916 INFO [org.alfresco.repo.node.index.FullIndexRecoveryComponent] 100 % complete.
09:04:30,932 INFO [org.alfresco.repo.node.index.FullIndexRecoveryComponent] Index recovery completed.

I have waited about an hour now for the CPU to settle down and Tomcat to finish the startup process, but I am getting impatient. I saw that a previous poster stated that once the indexing is complete, an asynchronous process will kick off to do the rest of the job and Tomcat will continue. This, however, is not what is happening. Does anyone have any ideas?

karlovitz · ‎07-02-2008

One other note: while running this rebuild, my SQL Server has consumed over 850 MB of memory and is still climbing, while Tomcat is hovering around 600 MB. I am only rebuilding 14,000 transactions. Any ideas would be very helpful.

Steve

andy · ‎08-08-2008

Hi

Is this a simple Alfresco install? If not, what have you changed?
Was anything reported in the Alfresco logs?
Have you got any custom code?

Andy

andy · ‎08-22-2008

Hi

Please include the exact version of Alfresco.
Were there any exceptions related to index merging in the logs?
It is possible for the background merge to die and not recover on some versions of the product.
However, there will always be something in the log when this happens.

The background full-text search indexer can output errors from the converters.
Some documents cannot be converted, and in some cases content is missing, and so on.
Such messages do not mean that anything has gone wrong.

You can search for the offending files using nitf and ninc (as described on the search wiki page).

Andy

jjf · ‎09-30-2008

Andy,

Thanks for the response. It is a basic "out-of-the-box" setup using Community Release 2.0. We have also tried this using the 2.1 release in our DEV env. From what I can tell, over time, transactions in the system will always cause indexes to increase. At some point, the indexes will hit 32,000 files and cause the system to become unstable.

I've checked the logs and, as far as I can tell, the merging isn't reporting any errors. Is there a different logging level I need to use to see errors from the merge process? If I monitor the index directory, I see the folder count going up and down, which suggests to me that the merging is occurring. However, over time (weeks/months), the number of index folders always increases.
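For anyone who wants to watch the folder count the same way, here is a minimal sketch, assuming Python is available on the server. The helper names `count_subdirs` and `headroom` are hypothetical, and the default limit of 32000 is simply the figure quoted in this thread, not something read from the filesystem.

```python
import os


def count_subdirs(path):
    """Count the immediate subdirectories of `path` (non-recursive)."""
    with os.scandir(path) as entries:
        # Symlinks are not followed, so only real directories are counted.
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def headroom(path, limit=32000):
    """Return how many more subdirectories fit before `limit` is reached.

    `limit` is an assumption taken from the figure quoted in this thread;
    pass the value that applies to your own filesystem.
    """
    return limit - count_subdirs(path)
```

Calling `headroom()` on the SpacesStore index directory from a cron job and logging the result would show whether the count trends upward over weeks, as described above.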

Please let me know if I can give you any more information.

jjf · ‎09-30-2008

One thing we noticed: we wrote some custom code which adds users to and removes users from Alfresco in a batch. It's just a Java Quartz job which runs and calls personSvc.createPerson(), authSvc.createAuthentication(), and authoritySvc.addAuthority() (and the remove/delete methods too). This code seemed to be creating one index per user, so if we had 32,000 users, there would be problems. We disabled the batch so that this wouldn't happen. Is this the expected behavior (one index per user transaction)?

louise · ‎10-03-2008

Some background:

Since ext3 aims to be backwards compatible with the earlier ext2, many of the on-disk structures are similar to those of ext2. Because of that, ext3 lacks a number of features of more recent designs, such as extents, dynamic allocation of inodes, and block suballocation. There is a limit of 31998 sub-directories per one directory, stemming from its limit of 32000 links per inode.

The maximum number of inodes (and hence the maximum number of files and directories) is set when the file system is created. If V is the volume size in bytes, then the default number of inodes is given by V/2^13 (or the number of blocks, whichever is less), and the minimum by V/2^23. The default was deemed sufficient for most applications. The maximum number of subdirectories in one directory is fixed at 32000.
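To put numbers on the inode formula above, here is a small sketch of the arithmetic. The function names are hypothetical, and the 4096-byte block size is an assumption (a common choice, not something stated in this thread).

```python
def default_inode_count(volume_bytes, block_size=4096):
    """Default inode count: V / 2^13, capped at the number of blocks.

    `block_size` is an assumed value; use the block size of your own volume.
    """
    blocks = volume_bytes // block_size
    return min(volume_bytes // 2**13, blocks)


def minimum_inode_count(volume_bytes):
    """Minimum inode count: V / 2^23."""
    return volume_bytes // 2**23


# Example volume of 100 GiB.
volume = 100 * 2**30
default_inodes = default_inode_count(volume)
minimum_inodes = minimum_inode_count(volume)
```

Under these assumptions a 100 GiB volume gets millions of inodes by default, so the total inode count is unlikely to be what runs out here. The per-directory subdirectory limit is the constraint that matters for the index folder.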

ReiserFS is still limited by the 16 bit link count for directories, so you can't have more than around 65,000 subdirectories with the same parent. BTW: This limitation even exists in the "Mother of all network attached file systems", ONTAP on NetApp Filers.

The solution would be (or MUST be) a better Lucene indexing algorithm, not a change to the filesystem limit. How can we store more than 32000 users in Alfresco authorization?


linux 32000 files limitation