linux 32000 files limitation

paul_lahitte
Champ in-the-making
I am using Alfresco Community on Red Hat Linux, and lucene-indexes/workspace/SpacesStore contains 32001 directories. The system refuses to create a new directory ("too many links"), and Alfresco then fails to start because it is trying to create a new directory.


The log:
java.io.IOException: Cannot create directory: /data/opt/alf_data/lucene-indexes/workspace/SpacesStore/e51ff042-3628-11dd-b952-2b903bc78fd4
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:175)
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:227)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo$Merger.mergeIndexes(IndexInfo.java:2943)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo$Merger.run(IndexInfo.java:2448)
        at java.lang.Thread.run(Thread.java:619)
ERROR [org.alfresco.repo.search.impl.lucene.index.IndexInfo] Failed to merge indexes
java.io.IOException: Cannot create directory: /data/opt/alf_data/lucene-indexes/workspace/SpacesStore/e55b2483-3628-11dd-b952-2b903bc78fd4
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:175

Does anyone know how to increase this limit?

Thanks
24 REPLIES 24

pmonks
Star Contributor
I believe this is a filesystem limit that can't easily be changed.  That said, Lucene should not be creating so many directories since Alfresco includes a background job that periodically merges the Lucene index fragments together.  Would it be possible to do a full reindex and watch to see if the problem persists?
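
To see how close an index directory is to that limit, a small sketch like the one below may help. It assumes the filesystem is ext3, where the commonly cited ceiling is 32000 links per directory inode (so about 31998 subdirectories); the thread does not say which filesystem is in use. The names `count_subdirs`, `remaining_slots` and `EXT3_MAX_SUBDIRS` are illustrative and not part of Alfresco.

```python
import os

# Assumption: ext3 allows 32000 hard links per inode. A directory already
# uses two of them ("." and its entry in the parent), leaving 31998.
EXT3_MAX_SUBDIRS = 31998


def count_subdirs(path):
    """Count the immediate subdirectories of path (symlinks are not followed)."""
    with os.scandir(path) as entries:
        return sum(1 for e in entries if e.is_dir(follow_symlinks=False))


def remaining_slots(path, limit=EXT3_MAX_SUBDIRS):
    """Number of subdirectories that can still be created under the assumed limit."""
    return max(0, limit - count_subdirs(path))
```

For example, `remaining_slots("/data/opt/alf_data/lucene-indexes/workspace/SpacesStore")` would report how many more index fragments could be created before "too many links" is hit, under the ext3 assumption.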

A full reindex can be performed by:
  1. Setting index.recovery.mode=FULL in custom-repository.properties

  2. Restart Alfresco
Note that reindexing will take a while if you have a lot of content, so if this is a production system it's best to schedule this for a maintenance window.

Cheers,
Peter

paul_lahitte
Champ in-the-making
I didn't mention that I am running Community 2.1. I did a full index recovery.
It looks like it completed smoothly (sorry, the log is in French):
23:23:24,747 INFO  [node.index.FullIndexRecoveryComponent] Récupération de l'index débutée : {0} transactions.
23:27:42,557 INFO  [node.index.FullIndexRecoveryComponent]    10 % achevé.
23:34:29,295 INFO  [node.index.FullIndexRecoveryComponent]    20 % achevé.
23:44:26,847 INFO  [node.index.FullIndexRecoveryComponent]    30 % achevé.
23:54:44,340 INFO  [node.index.FullIndexRecoveryComponent]    40 % achevé.
00:05:02,756 INFO  [node.index.FullIndexRecoveryComponent]    50 % achevé.
00:16:06,846 INFO  [node.index.FullIndexRecoveryComponent]    60 % achevé.
00:27:30,280 INFO  [node.index.FullIndexRecoveryComponent]    70 % achevé.
00:38:05,534 INFO  [node.index.FullIndexRecoveryComponent]    80 % achevé.
00:47:27,022 INFO  [node.index.FullIndexRecoveryComponent]    90 % achevé.
00:59:55,407 INFO  [node.index.FullIndexRecoveryComponent]    100 % achevé.
00:59:55,411 INFO  [node.index.FullIndexRecoveryComponent] Récupération de l'index achevée. (index restore done)

Then I tried to restart Alfresco, but here I got an error:
Alfresco wouldn't stop!

# ./alfresco.sh stop
Using CATALINA_BASE:   /opt/alfresco/tomcat
Using CATALINA_HOME:   /opt/alfresco/tomcat
Using CATALINA_TMPDIR: /opt/alfresco/tomcat/temp
Using JRE_HOME:       /data/jdk1.6.0_03
CompilerOracle: exclude org/apache/lucene/index/IndexReader$1 doBody
CompilerOracle: exclude org/alfresco/repo/search/impl/lucene/index/IndexInfo$Merger mergeIndexes
CompilerOracle: exclude org/alfresco/repo/search/impl/lucene/index/IndexInfo$Merger mergeDeletions
10 juin 2008 04:30:41 org.apache.catalina.startup.Catalina stopServer
GRAVE: Catalina.stop:
java.net.ConnectException: Connection refused
   at java.net.PlainSocketImpl.socketConnect(Native Method)
   at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
   at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
   at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
   at java.net.Socket.connect(Socket.java:519)
   at java.net.Socket.connect(Socket.java:469)
   at java.net.Socket.<init>(Socket.java:366)
   at java.net.Socket.<init>(Socket.java:180)
   at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:395)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:344)
   at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:435)

and the Java process is still running:

root      5270     1 97 Jun09 pts/2    05:04:27 /data/jdk1.6.0_03/bin/java -Xms1024m -Xmx1024m -server -Dfile.encoding=UTF8 -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.util.logging.config.file=/opt/alfresco/tomcat/conf/logging.properties -Djava.endorsed.dirs=/opt/alfresco/tomcat/common/endorsed -classpath :/opt/alfresco/tomcat/bin/bootstrap.jar:/opt/alfresco/tomcat/bin/commons-logging-api.jar -Dcatalina.base=/opt/alfresco/tomcat -Dcatalina.home=/opt/alfresco/tomcat -Djava.io.tmpdir=/opt/alfresco/tomcat/temp org.apache.catalina.startup.Bootstrap start

Do you think the reindexing is done or not? When I look in lucene-indexes, it is very small compared to what it was, too small to be true.

lightWeightVersionStore:
total 12
drwxr-xr-x 2 root root 4096 jun 10 02:03 4b661ffa-366a-11dd-b803-4bb39d114b58
-rw-r--r-- 1 root root  106 jun 10 02:03 IndexInfo
-rw-r--r-- 1 root root  106 jun 10 02:03 IndexInfoBackup

SpacesStore:
total 8
-rw-r--r-- 1 root root 20 jun  9 23:12 IndexInfo
-rw-r--r-- 1 root root 20 jun  9 23:12 IndexInfoBackup

paul_lahitte
Champ in-the-making
It does not work. I tried on the huge workspace/SpacesStore directory, but it fails because Alfresco tries to create directories.
I am reloading a full backup (database + Alfresco content), but there are already 31830 directories in the Lucene workspace/SpacesStore.
I will first try to rebuild the indexes (in case fewer than 170 directories need to be created by Alfresco), and if this fails I will try to rebuild the indexes in an empty directory.

Do I really need lucene-indexes to have a running system? What if I try starting with index.recovery.mode=NONE or AUTO?
Help would be appreciated, since my production system has been down for 20 hours now!

pmonks
Star Contributor
I don't think a restore is necessary, since it's only the indexes that are problematic (and the indexes are a derived asset - they can be rebuilt from scratch at any time).  But if you do wish to restore, I'd encourage you to take another backup (see http://wiki.alfresco.com/wiki/Backup_and_Restore) beforehand, just in case your backup set is corrupted or whatever.

I would also recommend doing a slightly modified version of the full reindex (regardless of whether you restore or not) to see if that helps:
  1. Stop Alfresco

  2. Set index.recovery.mode=FULL in custom-repository.properties

  3. Move the lucene-indexes directory out of the way (rename, move or delete the lucene-indexes directory)

  4. Restart Alfresco
Note that the progress report you see during startup only reports on the synchronous (metadata indexing) part of the reindex process.  The asynchronous part (full text indexing) happens concurrently after the system has started, and depending on how much content you have could take quite a while (full text indexing involves extracting text from all of your documents, which is both an I/O and CPU intensive operation).

In other words, after Alfresco has started you should wait for CPU and I/O usage to settle down (ie. until the asynchronous portion of the reindex process has completed) before attempting to stop Alfresco again - otherwise you'll be back to square one again (invalid indexes).
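
Steps 2 and 3 above could be scripted roughly as follows. This is only a sketch: it assumes Alfresco has already been stopped, that the index directory is named lucene-indexes under the data directory, and that the properties file uses simple key=value lines. The helper names `move_indexes_aside` and `set_property` are made up for illustration.

```python
import os
import time


def move_indexes_aside(alf_data_dir):
    """Rename lucene-indexes to a timestamped name; return the new path, or None if absent."""
    src = os.path.join(alf_data_dir, "lucene-indexes")
    if not os.path.isdir(src):
        return None
    dst = src + ".old-" + time.strftime("%Y%m%d-%H%M%S")
    os.rename(src, dst)
    return dst


def set_property(properties_path, key, value):
    """Set key=value in a .properties file, replacing an existing entry if there is one."""
    lines = []
    if os.path.exists(properties_path):
        # Java .properties files are traditionally ISO-8859-1 encoded.
        with open(properties_path, encoding="latin-1") as f:
            lines = f.read().splitlines()
    new_line = "%s=%s" % (key, value)
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # leave comments untouched
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    with open(properties_path, "w", encoding="latin-1") as f:
        f.write("\n".join(lines) + "\n")
```

A call such as `set_property(path_to_custom_repository_properties, "index.recovery.mode", "FULL")` followed by `move_indexes_aside("/data/opt/alf_data")` would cover both steps; renaming rather than deleting keeps the old indexes around in case they are needed for diagnosis.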

Cheers,
Peter

paul_lahitte
Champ in-the-making
Thanks a lot for your valuable answers. I am doing it right now, but I have 2 questions:
After the synchronous part of the process, is the lucene-indexes directory supposed to get bigger? (On my system the size hasn't changed since the end of the synchronous part.)
Is Alfresco unavailable (via the web client) until the end of the whole index rebuild? (3 hours after the synchronous part of the process, I can see open files in the index directory and a Java process consuming CPU, but I still can't get access to the login page.)

Paul

pmonks
Star Contributor
After the synchronous part of the process, is the lucene-indexes directory supposed to get bigger?
While the asynchronous indexing is occurring, yes the indexes should grow.  Once the asynchronous portion of the reindex has completed it should stabilise, with slow growth as new content is added to the repository.
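
One rough way to watch for that growth and stabilisation is to sample the total size of the index directory at intervals. The sketch below uses hypothetical helper names (`dir_size_bytes`, `has_stabilised`) and an arbitrary window of three samples. A flat size only suggests that the indexer is idle; on its own it cannot tell a finished reindex from a stalled one.

```python
import os


def dir_size_bytes(path):
    """Total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Index files can disappear mid-walk when fragments are merged.
                pass
    return total


def has_stabilised(samples, window=3):
    """True if the last `window` size samples are all identical."""
    if len(samples) < window:
        return False
    recent = samples[-window:]
    return all(s == recent[0] for s in recent)
```

In practice one would append `dir_size_bytes(index_dir)` to a list every few minutes and check `has_stabilised` on it, alongside CPU and I/O usage.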

Is Alfresco unavailable (via the web client) until the end of the whole index rebuild?
No - it should be available the entire time (albeit with full text searches not returning any of the content that hasn't been full text indexed yet).

What exactly happens when the login page is requested?  Do you see anything unusual in the alfresco.log?

Cheers,
Peter

paul_lahitte
Champ in-the-making
There is nothing in catalina.out after the end of the synchronous part:

tail -f catalina.out
10 juin 2008 19:33:37 org.apache.catalina.core.StandardService start
INFO: Démarrage du service Catalina
10 juin 2008 19:33:37 org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/5.5.23
10 juin 2008 19:33:37 org.apache.catalina.core.StandardHost start
INFO: XML validation disabled
10 juin 2008 19:33:37 org.apache.catalina.startup.HostConfig deployWAR
INFO: Déploiement de l'archive alfresco.war de l'application web
19:33:52,252 WARN  [remoting.rmi.RmiRegistryFactoryBean] Could not detect RMI registry - creating new one
19:33:53,686 WARN  [alfresco.util.OpenOfficeConnectionTester] A connection to OpenOffice could not be established.
19:33:55,298 INFO  [domain.schema.SchemaBootstrap] Schema managed by database dialect org.hibernate.dialect.MySQLInnoDBDialect.
19:33:59,898 INFO  [domain.schema.SchemaBootstrap] Aucune modification n'a été apportée au schéma.
19:34:02,799 INFO  [node.index.FullIndexRecoveryComponent] Récupération de l'index débutée : {0} transactions.
19:47:01,948 INFO  [node.index.FullIndexRecoveryComponent]    20 % achevé.
19:57:50,672 INFO  [node.index.FullIndexRecoveryComponent]    30 % achevé.
20:08:54,842 INFO  [node.index.FullIndexRecoveryComponent]    40 % achevé.
20:18:59,557 INFO  [node.index.FullIndexRecoveryComponent]    50 % achevé.
20:30:34,372 INFO  [node.index.FullIndexRecoveryComponent]    60 % achevé.
20:41:58,095 INFO  [node.index.FullIndexRecoveryComponent]    70 % achevé.
20:53:31,487 INFO  [node.index.FullIndexRecoveryComponent]    80 % achevé.
21:03:36,333 INFO  [node.index.FullIndexRecoveryComponent]    90 % achevé.
21:16:59,832 INFO  [node.index.FullIndexRecoveryComponent]    100 % achevé.
21:16:59,835 INFO  [node.index.FullIndexRecoveryComponent] Récupération de l'index achevée.

The Java process is still running and files are still open, but the size of lucene-indexes has not changed since the end of the synchronous part.

java      5790      root  183u      REG      253,2        193   35882278 /data/opt/alf_data/lucene-indexes/system/system/IndexInfo
java      5790      root  188u      REG      253,2        193   35882279 /data/opt/alf_data/lucene-indexes/system/system/IndexInfoBackup
java      5790      root  189r      REG      253,2       2772   35882281 /data/opt/alf_data/lucene-indexes/system/system/69b57a6e-3713-11dd-a558-0f01f70fbb44/_0.cfs
java      5790      root  190u      REG      253,2        106   35882286 /data/opt/alf_data/lucene-indexes/user/alfrescoUserStore/IndexInfo
java      5790      root  191u      REG      253,2        106   35882287 /data/opt/alf_data/lucene-indexes/user/alfrescoUserStore/IndexInfoBackup
java      5790      root  192u      REG      253,2        106   35882290 /data/opt/alf_data/lucene-indexes/workspace/lightWeightVersionStore/IndexInfo
java      5790      root  193u      REG      253,2        106   35882291 /data/opt/alf_data/lucene-indexes/workspace/lightWeightVersionStore/IndexInfoBackup
java      5790      root  194u      REG      253,2          0   35882295 /data/opt/alf_data/lucene-indexes/archive/SpacesStore/IndexInfo
java      5790      root  195u      REG      253,2          0   35882296 /data/opt/alf_data/lucene-indexes/archive/SpacesStore/IndexInfoBackup
java      5790      root  196u      REG      253,2          0   35882298 /data/opt/alf_data/lucene-indexes/workspace/SpacesStore/IndexInfo
java      5790      root  197u      REG      253,2          0   35882299 /data/opt/alf_data/lucene-indexes/workspace/SpacesStore/IndexInfoBackup

The web client times out when trying to connect …
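
To tell a refused connection apart from a timeout at the TCP level, a probe along these lines could be used. It is a sketch only: `probe_port` is a made-up helper, and the ports to try (8080 for HTTP and 8005 for shutdown are the usual Tomcat defaults) are an assumption about this installation.

```python
import socket


def probe_port(host, port, timeout=5.0):
    """Try a TCP connect and return 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout (for example, packets dropped).
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc
```

An "open" result with a login page that still never loads would suggest that Tomcat is accepting connections but the request threads are stuck, which is a different situation from the "Connection refused" seen on shutdown.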

paul_lahitte
Champ in-the-making
Do you know if there is a way to disable this feature so that my users can access their documents?
The server has been unavailable for 2 days and it is getting critical for us.

cheers
paul

pmonks
Star Contributor
Have you taken a thread dump after the login page times out, and looked for liveness issues?  It sounds like something is deadlocked and a common source of deadlocks is a database connection pool that's too small.
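
On a Sun/HotSpot JVM a thread dump can be taken with `jstack <pid>` or `kill -3 <pid>` (the latter writes to the JVM's standard output, typically catalina.out). Once the dump is saved as text, a sketch like this can give a first overview. It assumes the HotSpot text layout (a quoted thread name, then a `java.lang.Thread.State:` line, with blank lines between threads); the function names are illustrative, and the search string passed to `threads_mentioning` is whatever class name the connection pool in use shows in its stack frames.

```python
import re
from collections import Counter

STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def summarise_thread_dump(text):
    """Count threads per state and note whether the JVM reported a deadlock."""
    states = Counter(STATE_RE.findall(text))
    return {
        "states": dict(states),
        "deadlock_reported": "Java-level deadlock" in text,
    }


def threads_mentioning(text, needle):
    """Names of threads whose dump entry contains the given substring."""
    names = []
    # Thread entries are separated by blank lines and start with a quoted name.
    for block in re.split(r"\n\s*\n", text):
        match = re.match(r'\s*"([^"]+)"', block)
        if match and needle in block:
            names.append(match.group(1))
    return names
```

Many request threads all waiting inside the same pool class, with no deadlock reported by the JVM, would be consistent with an exhausted connection pool rather than a true monitor deadlock.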

Cheers,
Peter