I further investigated the problem I described in my earlier post.
It's not a resource leak, but since Alfresco 3.4 the Lucene indexer holds many more content store resources at the same time.
Let me explain this a bit:
1. content in a contentStore is read via an implementation of ContentStore.getReader()
2. the access to this content(stream) is released via an implementation of ContentStreamListener.contentStreamClosed()
In earlier versions these 2 methods were called one after the other: contentStreamClosed() was called before the next getReader() was called.
This is no longer the case. In ADMLuceneIndexerImpl.flushPending(), getReader() is called via readDocuments(), while contentStreamClosed() is called via writer.addDocument(doc), which runs in a loop over all documents in a batch. So this implementation keeps multiple streams open at once (roughly as many as lucene.indexer.batchSize).
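To make the difference concrete, here is a minimal sketch of the two call orders. It is not Alfresco code: CountingStore, peakInterleaved() and peakBatched() are hypothetical stand-ins that only count how many streams are open at once, and the batched variant assumes all readers in a batch are opened before any document is added.

```java
class OpenStreamDemo {

    /** Hypothetical stand-in for a content store; it only counts open streams. */
    static class CountingStore {
        int open = 0;
        int peak = 0;

        void getReader() {            // models ContentStore.getReader()
            open++;
            peak = Math.max(peak, open);
        }

        void contentStreamClosed() {  // models ContentStreamListener.contentStreamClosed()
            open--;
        }
    }

    /** Older behaviour as described: each stream is closed before the next one is opened. */
    static int peakInterleaved(int batchSize) {
        CountingStore store = new CountingStore();
        for (int i = 0; i < batchSize; i++) {
            store.getReader();
            store.contentStreamClosed();
        }
        return store.peak;
    }

    /** Behaviour as described for 3.4: open all readers first, close them while adding documents. */
    static int peakBatched(int batchSize) {
        CountingStore store = new CountingStore();
        for (int i = 0; i < batchSize; i++) {
            store.getReader();            // assumed to happen via readDocuments()
        }
        for (int i = 0; i < batchSize; i++) {
            store.contentStreamClosed();  // assumed to happen via writer.addDocument(doc)
        }
        return store.peak;
    }

    public static void main(String[] args) {
        int batchSize = 50; // arbitrary value for illustration
        System.out.println("interleaved peak: " + peakInterleaved(batchSize));
        System.out.println("batched peak:     " + peakBatched(batchSize));
    }
}
```

With the interleaved order the peak stays at 1 regardless of batch size; with the batched order it grows to the batch size.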
In the default contentStore (file system) this probably goes unnoticed because the OS provides a huge number of file handles. In our non-default contentstore (castorContentStore), however, we use HTTP connections to access the contentstore, and we had only configured a pool with a limited number of connections (much smaller than the batchSize). So we are able to solve the issue by sizing the castor connection pool to match the Lucene batch size.
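The effect on a bounded pool can be sketched like this. The pool is modelled with a plain java.util.concurrent.Semaphore, which is an assumption for illustration only and not how castorContentStore is actually implemented; the method name and the numbers are hypothetical too. A real pool would more likely block or time out than fail immediately.

```java
import java.util.concurrent.Semaphore;

class PoolExhaustionDemo {

    /**
     * Tries to open one connection per document in a batch without releasing any
     * in between, and returns how many attempts found the pool empty.
     */
    static int failedAcquisitions(int poolSize, int batchSize) {
        Semaphore pool = new Semaphore(poolSize); // stand-in for an HTTP connection pool
        int held = 0;
        int failed = 0;
        for (int i = 0; i < batchSize; i++) {
            if (pool.tryAcquire()) {
                held++;      // connection stays in use until the batch is written
            } else {
                failed++;    // pool exhausted: a real pool would block or time out here
            }
        }
        pool.release(held);  // streams are closed only after the whole batch is read
        return failed;
    }

    public static void main(String[] args) {
        // Hypothetical sizes: a pool much smaller than the batch, then a matching pool.
        System.out.println("pool 10, batch 50: " + failedAcquisitions(10, 50) + " failed");
        System.out.println("pool 50, batch 50: " + failedAcquisitions(50, 50) + " failed");
    }
}
```

Once the pool is at least as large as the batch, no acquisition fails, which matches the workaround of sizing the pool to the batch size.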
This leaves me with 2 questions:
1. Does the above analysis make sense?
2. Is there a real reason to keep these streams open concurrently, or was this introduced 'by accident'?