Lucene error: Too many open files?

Not applicable
Hello,

We are working with Alfresco 1.4. I think our index is corrupted, because I see a lot of errors in the Alfresco log file and we can no longer add files without getting an error dump.

Error log:

00:03:43,186 ERROR [org.alfresco.repo.search.impl.lucene.LuceneBase2] Error
java.io.FileNotFoundException: ./alf_data/lucene-indexes/workspace/SpacesStore/0e3909ea-79a7-11db-946d-3bededff6578/IndexInfoDeletions (Too many open files)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:106)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.getDeletions(IndexInfo.java:648)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.createMainIndexReader(IndexInfo.java:1377)
        at org.alfresco.repo.search.impl.lucene.index.IndexInfo.getMainIndexReferenceCountingReadOnlyIndexReader(IndexInfo.java:748)
        at org.alfresco.repo.search.impl.lucene.LuceneBase2.getSearcher(LuceneBase2.java:156)
        at org.alfresco.repo.search.impl.lucene.LuceneIndexerImpl2.updateFullTextSearch(LuceneIndexerImpl2.java:1772)
        at org.alfresco.repo.search.impl.lucene.fts.FullTextSearchIndexerImpl.index(FullTextSearchIndexerImpl.java:172)
        at sun.reflect.GeneratedMethodAccessor503.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:335)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:181)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:148)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:170)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:176)
        at $Proxy4.index(Unknown Source)
        at org.alfresco.repo.search.impl.lucene.fts.FTSIndexerJob.execute(FTSIndexerJob.java:36)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)


00:03:43,193 ERROR [org.quartz.core.JobRunShell] Job DEFAULT.ftsIndexerJobDetail threw an unhandled Exception:
org.alfresco.repo.search.impl.lucene.LuceneIndexException: Failed FTS update
        at org.alfresco.repo.search.impl.lucene.LuceneIndexerImpl2.updateFullTextSearch(LuceneIndexerImpl2.java:1897)
        at org.alfresco.repo.search.impl.lucene.fts.FullTextSearchIndexerImpl.index(FullTextSearchIndexerImpl.java:172)
        at sun.reflect.GeneratedMethodAccessor503.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:335)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:181)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:148)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:170)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:176)
        at $Proxy4.index(Unknown Source)
        at org.alfresco.repo.search.impl.lucene.fts.FTSIndexerJob.execute(FTSIndexerJob.java:36)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
Caused by: org.alfresco.repo.search.impl.lucene.LuceneIndexException: Failed to open IndexSarcher for ./alf_data/lucene-indexes/workspace/SpacesStore/
        at org.alfresco.repo.search.impl.lucene.LuceneBase2.getSearcher(LuceneBase2.java:172)
        at org.alfresco.repo.search.impl.lucene.LuceneIndexerImpl2.updateFullTextSearch(LuceneIndexerImpl2.java:1772)
        ... 14 more
Caused by: java.io.FileNotFoundException: ./alf_data/lucene-indexes/workspace/SpacesStore/0e3909ea-79a7-11db-946d-3bededff6578/IndexInfoDeletions (Too many open files)
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:282)
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:744)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:674)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:866)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
        at java.lang.Thread.run(Thread.java:595)


We tried increasing the number of files that can be open simultaneously: ulimit -n 65535

But it's still the same.
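
One thing worth checking at this point: ulimit -n only applies to the shell it is run in and to processes started from that shell afterwards, so a Tomcat that was already running (or is started by an init script with its own limits) keeps its old value. Below is a minimal sketch of reading the limit from inside a JVM. It assumes a Sun/Oracle-style JVM on Unix that exposes com.sun.management.UnixOperatingSystemMXBean; the class name FdLimitCheck is made up for illustration and is not part of Alfresco.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper, not part of Alfresco: reports descriptor usage of the JVM it runs in.
final class FdLimitCheck {

    /** Returns {open, max} descriptor counts, or null when the JVM does not expose them. */
    static long[] openAndMax() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // non-Unix platform, or a JVM without this bean
    }

    public static void main(String[] args) {
        long[] counts = openAndMax();
        if (counts == null) {
            System.out.println("Descriptor counts are not available on this JVM");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

Run standalone, this only reports on its own process. To say anything about Alfresco it would have to run inside the same JVM as Tomcat, or the same two attributes (OpenFileDescriptorCount and MaxFileDescriptorCount) could be read from the java.lang:type=OperatingSystem MBean over JMX.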

If I check the number of files opened by Lucene, this is what I see:

[root@fidji alfresco]# lsof | grep luc | wc -l
845


With this command, I can see that Lucene seems to be working on deleted files.

[root@fidji alfresco]# lsof | grep luc
java      3930    root  807r      REG        9,0     21298   18677931 /data/alfresco/alf_data/lucene-indexes/workspace/lightWeightVersionStore/d9838a8f-79a6-11db-946d-3bededff6578/_m.cfs (deleted)
java      3930    root  808r      REG        9,0     18657   18678420 /data/alfresco/alf_data/lucene-indexes/workspace/lightWeightVersionStore/f7a4dc0a-79a6-11db-946d-3bededff6578/_j.cfs (deleted)
java      3930    root  809r      REG        9,0     22366   18678493 /data/alfresco/alf_data/lucene-indexes/workspace/lightWeightVersionStore/0f1efd2e-79a7-11db-946d-3bededff6578/_n.cfs (deleted)
java      3930    root  810r      REG        9,0     21296   18564073 /data/alfresco/alf_data/lucene-indexes/workspace/lightWeightVersionStore/286aab1d-79a7-11db-946d-3bededff6578/_m.cfs (deleted)
java      3930    root  811r      REG        9,0      5789   18678520 /data/alfresco/alf_data/lucene-indexes/workspace/SpacesStore/a3bbbacb-79a7-11db-946d-3bededff6578/_4.cfs
java      3930    root  812r      REG        9,0     22409   18678521 /data/alfresco/alf_data/lucene-indexes/workspace/lightWeightVersionStore/34e16e87-79a7-11db-946d-3bededff6578/_n.cfs
java      3930    root  813r      REG        9,0      4019   18678523 /data/alfresco/alf_data/lucene-indexes/workspace/SpacesStore/a4082d21-79a7-11db-946d-3bededff6578/_2.cfs
java      3930    root  814r      REG        9,0     18656   18678536 /data/alfresco/alf_data/lucene-indexes/workspace/lightWeightVersionStore/a4d8eaac-79a7-11db-946d-3bededff6578/_j.cfs
java      3930    root  815r      REG        9,0     23181   18678553 /data/alfresco/alf_data/lucene-indexes/workspace/lightWeightVersionStore/11c3e8ff-79a8-11db-946d-3bededff6578/_o.cfs
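
A note on reading this listing: an entry marked (deleted) is a file that has been unlinked but is still held open by the process, so it keeps counting against the descriptor limit until it is closed. If the share of deleted entries keeps growing over time, that would hint at readers on old index segments not being released. The following is only a sketch of tallying such entries from captured lsof lines; the class and method names are hypothetical, and the sample lines are shortened copies of the ones above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for reading captured lsof output; names are illustrative only.
final class LsofDeletedCount {

    /** Counts lines that refer to a Lucene index path and end with lsof's "(deleted)" marker. */
    static int countDeleted(List<String> lsofLines) {
        int deleted = 0;
        for (String line : lsofLines) {
            if (line.contains("lucene-indexes") && line.trim().endsWith("(deleted)")) {
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        // Two lines in the shape of the listing above, with the column spacing collapsed.
        List<String> sample = Arrays.asList(
            "java 3930 root 807r REG 9,0 21298 18677931 /data/alfresco/alf_data/lucene-indexes/workspace/lightWeightVersionStore/d9838a8f-79a6-11db-946d-3bededff6578/_m.cfs (deleted)",
            "java 3930 root 811r REG 9,0 5789 18678520 /data/alfresco/alf_data/lucene-indexes/workspace/SpacesStore/a3bbbacb-79a7-11db-946d-3bededff6578/_4.cfs");
        System.out.println(countDeleted(sample) + " of " + sample.size() + " entries are deleted files");
    }
}
```

Comparing this count between two captures taken some time apart says more than a single snapshot does.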

What can I do?
Is it possible to recreate the index?

Thanks
Regards,
Chris
31 REPLIES

Not applicable
Hello,

I found that we can increase the number of files in the Lucene queue.

See this file:

/opt/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/repository.properties

and increase the value of:

lucene.indexer.batchSize=2048

Perhaps this can help too?

Does the index rebuild work with Alfresco Community?
And we are running CentOS 4.3.

Chris

lgr
Champ in-the-making
OK. I'll first try with ulimit modified.
Then if the error occurs once more, I'll test your modification.

And yes, index recovery works like a charm!

Laurent.

Not applicable
I already tried increasing ulimit.

But it doesn't seem to be sufficient.

Good luck.

Thanks
Chris

ribz33
Champ on-the-rise
Have you found a solution?

We have the same problem on Red Hat Enterprise Linux 3 Update 3.

We had ulimit -n = 1024.
We have increased it. I don't know yet whether this will solve the problem.

lgr
Champ in-the-making
I haven't seen the errors since the modification, but my Alfresco server hasn't been stressed, so I can't tell you if this solution is the right one.

Laurent.

adepue
Champ in-the-making
I found this thread because I just ran into this problem.  I happen to be using a very old version of Alfresco (PR3, custom embedded into my application), so I'm not sure if my problem is exactly related, but after running a test server for a few days, I get a FileNotFoundException: …/alfresco/contentstore/2006/11/6/bba05348-6deb-11db-8a92-41c1536b4be4.bin (Too many open files)

My application calls directly into Alfresco and uses the following access pattern:
1. Find NodeRef.
2. ContentReader reader = contentService.getReader(nodeRef)
3. InputStream in = reader.getContentInputStream();
4. … in.close() …
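
The pattern above only releases the descriptor if step 4 is actually reached. Below is a minimal sketch of making the close unconditional with try/finally. It uses plain java.io streams as a stand-in for the stream returned by reader.getContentInputStream(), and the class and method names (CloseInFinallySketch, drainAndClose) are hypothetical. Whether a skipped close is what causes the error in this thread is not established; this only rules out one common cause.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Stand-in sketch: plain java.io types replace Alfresco's ContentReader so the example runs alone.
final class CloseInFinallySketch {

    /** Reads the stream to the end and always closes it, even if read() throws. */
    static long drainAndClose(InputStream in) throws IOException {
        try {
            long total = 0;
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        } finally {
            in.close(); // runs on both the normal and the exceptional path
        }
    }

    public static void main(String[] args) throws IOException {
        // In the real application this stream would come from reader.getContentInputStream().
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println("bytes read: " + drainAndClose(in));
    }
}
```

If the stream is already closed this way on every path, the application code is unlikely to be the source of the leaked descriptors.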

This is a test server with only about 5 documents in the repository.  In this case, the same two or three documents are read over and over (no writing or creation of nodes had taken place).  It appears that after reading a certain number of times, things break.  The operating system is Linux (Mandrake community 10.1 - 2.6.10 #1 SMP Fri Feb 4 09:14:28 PST 2005 i686 AMD Athlon™ MP 2800+).  Alfresco is running embedded within our application, which is running in Tomcat 5.5.9 using Sun's JVM (build 1.6.0-b105).

The exception looks like this (the exception happened to be dumped by our client software to a log file in XML format, so excuse the format here - I'm also only including the relevant portions):

      <nestedThrowable class="org.alfresco.service.cmr.repository.ContentIOException" id="17">
        <detailMessage>Failed to open stream onto channel:
   accessor: Content[ url=file://2006/11/6/bba05348-6deb-11db-8a92-41c1536b4be4.bin, mimetype=application/octet-stream, encoding=null]</detailMessage>
        <cause class="org.alfresco.service.cmr.repository.ContentIOException" id="18">
          <detailMessage>Failed to open file channel: Content[ url=file://2006/11/6/bba05348-6deb-11db-8a92-41c1536b4be4.bin, mimetype=application/octet-stream, encoding=null]</detailMessage>
          <cause class="java.io.FileNotFoundException" id="19">
            <detailMessage>/home/tomcat-test/jakarta-tomcat-5.5.9/./alfresco/contentstore/2006/11/6/bba05348-6deb-11db-8a92-41c1536b4be4.bin (Too many open files)</detailMessage>
            <cause class="java.io.FileNotFoundException" reference="19"/>
            <stackTrace id="20">
              <trace id="21">java.io.RandomAccessFile.open(Native Method)</trace>
              <trace id="22">java.io.RandomAccessFile.&lt;init&gt;(RandomAccessFile.java:212)</trace>
              <trace id="23">org.alfresco.repo.content.filestore.FileContentReader.getDirectReadableChannel(FileContentReader.java:139)</trace>
              <trace id="24">org.alfresco.repo.content.AbstractContentReader.getReadableChannel(AbstractContentReader.java:184)</trace>
              <trace id="25">org.alfresco.repo.content.AbstractContentReader.getContentInputStream(AbstractContentReader.java:201)</trace>

prototribe
Champ in-the-making
Hi

Yes, that's the correct way to rebuild.

Could anyone with this problem let us know which Linux (distro & ver) they are using?  We did the 10 million document benchmarks on Red Hat, so it doesn't seem like it should be a general "Linux" problem.

Cheers
Paul.

Hello,

We just ran into this problem as well. The initial symptom was that mounted Windows shares showed empty folders and nothing could be written to them.

We're using:
CentOS 4.4
Alfresco 1.4.0 (build-105)
1.1TB Disk via 3ware raid 5
2GB RAM

It's only been up for about a week (we just migrated from a very old version of Alfresco), but in that time we've successfully pushed nearly 10,000 documents (consuming 250GB) into it via the Samba connection with no issues. This morning we switched to an external Java (Java(TM) SE Runtime Environment (build 1.6.0-b105)) and encountered the "too many open files" error after adding only a few documents. Previously we were using the one bundled with the Alfresco distribution.


The initial index rebuild failed at about 80% using the external version of Java. Here is the error:

15:30:46,155 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent] Index recovery started: 4,098 transactions.
15:35:34,210 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    10 % complete.
15:36:15,943 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    20 % complete.
15:36:46,570 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    30 % complete.
15:37:56,449 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    40 % complete.
15:40:38,260 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    50 % complete.
15:43:50,647 ERROR [net.sf.jooreports.openoffice.connection.SocketOpenOfficeConnection] disconnected unexpectedly
15:45:03,938 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    60 % complete.
15:48:28,601 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    70 % complete.
15:49:38,818 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    80 % complete.
15:50:41,532 ERROR [net.sf.jooreports.openoffice.connection.SocketOpenOfficeConnection] disconnected unexpectedly
15:50:42,019 ERROR [org.alfresco.repo.search.impl.lucene.index.IndexInfo] java.io.FileNotFoundException: /opt/alfresco14/alf_data/lucene-indexes/workspace/SpacesStore/8f6026a9-a735-11db-8c48-7b6dddb51ec3/segments (Too many open files)
15:50:42,082 ERROR [org.alfresco.repo.search.impl.lucene.index.IndexInfo] java.lang.NullPointerException

On the second try, with the Java bundled with Alfresco (v1.5.0_08-b03), the reindexing completed successfully.

16:01:14,552 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent] Index recovery started: 4,098 transactions.
16:01:59,752 ERROR [net.sf.jooreports.openoffice.connection.SocketOpenOfficeConnection] disconnected unexpectedly
16:05:06,762 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    10 % complete.
16:06:41,865 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    20 % complete.
16:07:41,861 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    30 % complete.
16:08:56,494 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    40 % complete.
16:11:44,463 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    50 % complete.
16:13:16,176 ERROR [net.sf.jooreports.openoffice.connection.SocketOpenOfficeConnection] disconnected unexpectedly
16:16:36,848 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    60 % complete.
16:20:30,813 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    70 % complete.
16:21:33,967 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    80 % complete.
16:23:38,006 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    90 % complete.
16:25:04,739 ERROR [net.sf.jooreports.openoffice.connection.SocketOpenOfficeConnection] disconnected unexpectedly
16:26:09,728 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    100 % complete.
16:26:09,729 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent] Index recovery completed.

ulimit -n is 1024

Could this be a leak in Java 6?

We've switched back to the bundled Java to see if that makes a difference for regular use.

adepue
Champ in-the-making

Could this be a leak in Java 6?

Come to think of it, we never ran into this issue until we upgraded our server to use Java 1.6.

prototribe
Champ in-the-making
Ran into the "too many open files" issue again, but it took a bit longer. I've increased ulimit to 4096. We'll see how long it takes now.

prototribe
Champ in-the-making
With the increased ulimit and the bundled Java I have not encountered the problem again. However, the JVM has been crashing, which is probably an unrelated issue.

I'll be moving back to 1.6.0 with the increased ulimit (4096) to see if that helps both scenarios.

Here's a snippet of the stack trace if anyone cares. The crash occurred during a CIFS access. I've read that the 1.5.0 JVMs have a few bugs.

#
# An unexpected error has been detected by HotSpot Virtual Machine:
#
#  SIGSEGV (0xb) at pc=0xb20b8ecd, pid=3688, tid=2061646768
#
# Java VM: Java HotSpot(TM) Server VM (1.5.0_08-b03 mixed mode)
# Problematic frame:
# J  java.util.HashMap$ValueIterator.next()Ljava/lang/Object;
#

---------------  T H R E A D  ---------------

Current thread (0x0851c950):  JavaThread "Sess_T19_192.168.0.103" daemon [_thread_in_Java, id=8461]

siginfo:si_signo=11, si_errno=0, si_code=1, si_addr=0x00005510

Registers:
EAX=0xad784f38, EBX=0x00000007, ECX=0x7f940c20, EDX=0x00005500
ESP=0x7ae22458, EBP=0xae0c5130, ESI=0xaecda5a0, EDI=0x00000004
EIP=0xb20b8ecd, CR2=0x00005510, EFLAGS=0x00010246

Top of Stack: (sp=0x7ae22458)
0x7ae22458:   8338fc00 b199c7f0 8338f7c8 aecda5a0
0x7ae22468:   ae0c5130 00000004 ad5cb5f0 b1ec7270
0x7ae22478:   7f940bd4 7f9405dc 00000003 7f940520
0x7ae22488:   7f940be0 8339f500 0851c950 7f940520
0x7ae22498:   7f9406d8 7f9404a8 8339f500 8338a758
0x7ae224a8:   7f940c20 00000003 7f940538 7f940bd4
0x7ae224b8:   00000026 7f940370 b7aeb55e b21a7ca4
0x7ae224c8:   afdc8078 00000000 7f93e9f8 7f940300

Instructions: (pc=0xb20b8ecd)
0xb20b8ebd:   53 24 8b 59 14 3b d3 0f 85 d7 00 00 00 8b 51 08
0xb20b8ecd:   8b 5a 10 89 74 24 0c 89 6c 24 10 89 7c 24 14 8b

Stack: [0x7ada3000,0x7ae24000),  sp=0x7ae22458,  free space=509k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J  java.util.HashMap$ValueIterator.next()Ljava/lang/Object;

UPDATE: 2007-04-09
The crashes were due to a hardware issue. We're still on Java 1.5 and it seems to be working fine.