
linux 32000 files limitation

paul_lahitte
Champ in-the-making
I am using Alfresco Community on Red Hat Linux, and lucene-indexes/workspace/SpacesStore contains 32001 directories. The system refuses to create a new directory ("too many links"), and Alfresco then fails to start because it is trying to create a new directory.


The log:
java.io.IOException: Cannot create directory: /data/opt/alf_data/lucene-indexes/workspace/SpacesStore/e51ff042-3628-11dd-b952-2b903bc78fd4
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:175)
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:227)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo$Merger.mergeIndexes(IndexInfo.java:2943)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo$Merger.run(IndexInfo.java:2448)
        at java.lang.Thread.run(Thread.java:619)
ERROR [org.alfresco.repo.search.impl.lucene.index.IndexInfo] Failed to merge indexes
java.io.IOException: Cannot create directory: /data/opt/alf_data/lucene-indexes/workspace/SpacesStore/e55b2483-3628-11dd-b952-2b903bc78fd4
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:175

Does anyone know how to increase this number?

Thanks
24 REPLIES

paul_lahitte
Champ in-the-making
I left the process running all night (I was quite depressed and tired and went to bed).
In the morning the Java process had stopped and the log was full of warnings.

I restarted Alfresco with index.recovery.mode=VALIDATE and it started!
Now my lucene-indexes directory contains only 37 directories and is 1.2 GB in size (sounds better!).

I still have lots of warnings and errors related to Lucene in the log file, but at least people can connect and work.

Part of catalina.out:


at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:210)
   at $Proxy9.index(Unknown Source)
   at org.alfresco.repo.search.impl.lucene.fts.FTSIndexerJob.execute(FTSIndexerJob.java:52)
   at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
   … 1 more
Caused by: java.lang.UnsupportedOperationException: The content never exists
   at org.alfresco.repo.content.EmptyContentReader.getDirectReadableChannel(EmptyContentReader.java:59)
   at org.alfresco.repo.content.AbstractContentReader.getReadableChannel(AbstractContentReader.java:226)
   at org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:330)
   … 17 more
[WARNING] Unknown Ptg 1c (28)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 3c (60)
[WARNING] Unknown Ptg 14 (20) at cell (15,26)
09:31:18,983 ERROR [quartz.core.JobRunShell] Job DEFAULT.ftsIndexerJobDetail threw an unhandled Exception:
org.alfresco.service.cmr.repository.ContentIOException: Failed to open stream onto channel:
   accessor: ContentAccessor[ contentUrl=store://2008/6/11/9/31/61bba49c-3788-11dd-b926-13890b8e687a.bin, mimetype=null, size=0, encoding=UTF-8, locale=fr_FR]
   at org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:337)
   at org.alfresco.repo.search.impl.lucene.ADMLuceneIndexerImpl.indexProperty(ADMLuceneIndexerImpl.java:858)
   at org.alfresco.repo.search.impl.lucene.ADMLuceneIndexerImpl.createDocuments(ADMLuceneIndexerImpl.java:542)
   at org.alfresco.repo.search.impl.lucene.ADMLuceneIndexerImpl.updateFullTextSearch(ADMLuceneIndexerImpl.java:1248)
   at org.alfresco.repo.search.impl.lucene.fts.FullTextSearchIndexerImpl.index(FullTextSearchIndexerImpl.java:188)
   at sun.reflect.GeneratedMethodAccessor382.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:281)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:187)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:154)
   at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:107)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:176)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:210)
   at $Proxy9.index(Unknown Source)
   at org.alfresco.repo.search.impl.lucene.fts.FTSIndexerJob.execute(FTSIndexerJob.java:52)
   at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
   at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)
Caused by: java.lang.UnsupportedOperationException: The content never exists
   at org.alfresco.repo.content.EmptyContentReader.getDirectReadableChannel(EmptyContentReader.java:59)
   at org.alfresco.repo.content.AbstractContentReader.getReadableChannel(AbstractContentReader.java:226)
   at org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:330)
   … 17 more


Any clues?
Thanks again for your help.

Paul

jjf
Champ in-the-making
We have encountered the same issue (the index folder has hit its limitation).  I do believe the merging process is working, however, it is merging at a rate slower than our content updates.  So, over the course of time, it will always fill with indexes and lock up the system.  Is there any workaround besides clearing the indexes and doing a full index?  Is there a way to stop the full text index from running?
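To put rough numbers on "merging slower than updates", here is a minimal Python sketch. It assumes a constant net growth in index directories per day; the rates are hypothetical placeholders, not measurements from this thread, and the default limit of 31998 is the ext3 subdirectory figure.

```python
import math


def days_until_limit(current_dirs, created_per_day, merged_per_day, limit=31998):
    """Estimate the days until the index directory count reaches `limit`.

    All rates are hypothetical inputs; real merge behaviour is not constant.
    Returns 0 if the limit is already reached and None if the count is not
    growing (merging keeps up with creation).
    """
    if current_dirs >= limit:
        return 0
    net_growth = created_per_day - merged_per_day
    if net_growth <= 0:
        return None
    return math.ceil((limit - current_dirs) / net_growth)
```

For example, `days_until_limit(37, 500, 400)` models a freshly rebuilt index of 37 directories with 100 more directories created than merged each day.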

tman
Champ in-the-making
Same problem here: after reindexing, Lucene creates another 32000 directories and hangs again, so another reindex is needed.
Is there any way to get around this limitation and the recurring manual reindexing?

Thanks

karlovitz
Champ in-the-making
This question is twofold. We are currently building out an Alfresco implementation for a client. I am relatively new to Alfresco, so I am trying to learn all I can before making bad decisions. This 32000 file limitation scares me, as we will be storing a large number of assets. Is there a way around this? Should we reindex on a regular basis?

This brings me to my next question. I am currently experiencing a performance problem when searching our assets: a basic Lucene text query takes anywhere from 20 to 30 seconds to return. My first step was to rebuild the indexes with index.recovery.mode=FULL. Although the log file says that the indexing is complete, Tomcat didn't move on to start up.

08:55:31,310 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    90 % complete.
09:04:30,916 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    100 % complete.
09:04:30,932 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent] Index recovery completed.


I have waited about an hour now for the CPU to settle down and Tomcat to finish the startup process, and I am getting impatient. I saw that a previous poster stated that once the indexing is complete, an asynchronous process will kick off to do the rest of the job and Tomcat will continue. This, however, is not what is happening. Does anyone have any ideas?

karlovitz
Champ in-the-making
One other note: while running this rebuild, my SQL Server has consumed over 850 MB of memory and is climbing, while Tomcat is hovering around 600 MB. I am only rebuilding 14,000 transactions. Any ideas would be very helpful.

Steve

andy
Champ on-the-rise
Hi

Is this a simple Alfresco install? If not, what have you changed?
Was anything reported in the Alfresco logs?
Have you got any custom code?

Andy

andy
Champ on-the-rise
Hi

Please include the exact version of Alfresco.
Were there any exceptions related to index merging in the logs?
It is possible for the background merge to die and not recover on some versions of the product.
However, there will always be something in the log when this happens.

The background full-text search indexer can output errors from the converters.
Some documents cannot be converted, and in some cases content is missing, and so on.
These messages do not mean that anything has gone wrong.

You can search for the offending files using nitf and ninc (as described on the search wiki page).

Andy

jjf
Champ in-the-making
Andy,

Thanks for the response.  It is a basic "out-of-the-box" setup using Community Release 2.0.  We have also tried this using the 2.1 release in our DEV env.  From what I can tell, over time, transactions in the system will always cause indexes to increase.  At some point, the indexes will hit 32,000 files and cause the system to become unstable.

I've checked the logs and, as far as I can tell, the merging isn't reporting any errors. Is there a different logging level I need to use to see errors with the merge process? If I monitor the index directory, I see the folder count going up and down, which suggests to me that the merging is occurring. However, over time (weeks/months), the number of index folders always increases.

Please let me know if I can give you any more information.
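For anyone who wants to automate the folder-count monitoring described above, here is a minimal Python sketch. It counts the immediate subdirectories of an index directory and flags when the count gets close to the ext3 limit. The 31998 figure and the 80% warning threshold are assumptions, and the function names are made up for illustration.

```python
import os

# Assumption: ext3 allows 32000 links per inode, which leaves room for
# 31998 subdirectories under a single parent directory.
EXT3_MAX_SUBDIRS = 31998


def count_subdirs(path):
    """Return the number of immediate subdirectories of `path`."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def check_index_dir(path, limit=EXT3_MAX_SUBDIRS, warn_ratio=0.8):
    """Return (count, near_limit) for the given index directory.

    `near_limit` is True once the count reaches `warn_ratio` of `limit`.
    The 0.8 default is an arbitrary, hypothetical threshold.
    """
    count = count_subdirs(path)
    return count, count >= limit * warn_ratio
```

It could be called from a cron job with the SpacesStore path that appears in the log earlier in this thread, for example `check_index_dir("/data/opt/alf_data/lucene-indexes/workspace/SpacesStore")`, logging a warning whenever the second value is True.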

jjf
Champ in-the-making
One thing we noticed: we wrote some custom code which adds/removes users to Alfresco via a batch. It's just a Java Quartz job which runs and calls personSvc.createPerson(), authSvc.createAuthentication(), and authoritySvc.addAuthority() (and the remove/delete methods too). This code seemed to be creating one index per user. So if we had 32,000 users, there would be problems. We disabled the batch so that this wouldn't happen. Is this the expected behavior (one index per user transaction)?

louise
Champ in-the-making
Some background:

Since ext3 aims to be backwards compatible with the earlier ext2, many of the on-disk structures are similar to those of ext2. Because of that, ext3 lacks a number of features of more recent designs, such as extents, dynamic allocation of inodes, and block suballocation. There is a limit of 31998 subdirectories per directory, stemming from its limit of 32000 links per inode.

The maximum number of inodes (and hence the maximum number of files and directories) is set when the file system is created. If V is the volume size in bytes, then the default number of inodes is given by V/2^13 (or the number of blocks, whichever is less), and the minimum by V/2^23. The default was deemed sufficient for most applications. The maximum number of subdirectories in one directory is fixed by the 32000-link limit.
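The arithmetic behind those figures can be checked with a short Python sketch. It takes the inode defaults as V/2^13 and V/2^23 and assumes a 4096-byte block size (the block size is my assumption, not something stated above). The function names are made up for illustration.

```python
# Figure quoted in the text above; treated here as an assumption.
EXT3_MAX_LINKS = 32000


def max_subdirs(max_links=EXT3_MAX_LINKS):
    """Subdirectories that fit under one parent directory.

    A directory already holds two links (its entry in its own parent and
    its "." entry); every subdirectory adds one more through its ".." entry.
    """
    return max_links - 2


def default_inode_count(volume_bytes, block_size=4096):
    """Default inode count: V / 2^13, capped by the number of blocks."""
    return min(volume_bytes // 2**13, volume_bytes // block_size)


def minimum_inode_count(volume_bytes):
    """Minimum inode count: V / 2^23."""
    return volume_bytes // 2**23
```

So `max_subdirs()` reproduces the 31998 figure, and for a 100 GiB volume `default_inode_count(100 * 2**30)` comes out far above 32000. The volume has plenty of inodes; it is the per-directory link count that runs out.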

ReiserFS is still limited by the 16-bit link count for directories, so you can't have more than around 65,000 subdirectories with the same parent. BTW: This limitation even exists in the "Mother of all network attached file systems", ONTAP on NetApp Filers.

The solution would be (or MUST be) a better Lucene indexing algorithm, not a change to the filesystem limit. How can we store more than 32000 users in Alfresco authorization?