Lucene indexes are around 5 times larger than contentstore?
‎10-24-2007 09:24 AM
We've loaded around 13,000 Word and RTF documents into an Alfresco 2.1 instance on Linux. We're finding the lucene-indexes are around five times larger than the content we are indexing. Has anyone else seen this?
-> 2.8 Gb contentstore
-> 15.4 Gb lucene-indexes
Regards,
Damon.
6 REPLIES
‎10-25-2007 09:24 AM
We've now done some further testing around index sizes to see why they are so much larger than our content store.
Below are the different tests and the resulting index sizes for each.
These were done sequentially. The last one is the most interesting, in that
blowing the indexes away results in a large decrease. It seems old and
presumably unused indexes are hanging around?
Damon.
After Bootstrap Alfresco:
4.3M live/lucene-indexes
4.0K live/contentstore.deleted
4.6M live/contentstore
4.0K live/audit.contentstore
8.8M live/
After migrating 9 folders with a few hundred files:
27M live/lucene-indexes
4.0K live/contentstore.deleted
13M live/contentstore
4.0K live/audit.contentstore
40M live/
lucene-indexes directory breakdown:
80K ./lucene-indexes/user
4.0K ./lucene-indexes/locks
48K ./lucene-indexes/archive
100K ./lucene-indexes/system
27M ./lucene-indexes/workspace
After a server restart, the index size actually went down a little:
26M ./lucene-indexes
4.0K ./contentstore.deleted
13M ./contentstore
4.0K ./audit.contentstore
Set index.recovery.mode=FULL and restarted the server:
36M ./lucene-indexes
4.0K ./contentstore.deleted
13M ./contentstore
4.0K ./audit.contentstore
49M .
Blew away the index, set index.recovery.mode=FULL and restarted the
server:
9.6M ./lucene-indexes
4.0K ./contentstore.deleted
13M ./contentstore
4.0K ./audit.contentstore
23M .
‎10-29-2007 11:45 AM
Hi
How are you loading this data?
Andy
‎10-29-2007 12:03 PM
The data is loaded by a migration script that uses a combination of the following calls:
nodeService.createNode followed by a contentWriter.putContent into that node
fileFolderService.create
fileFolderService.copy (from space templates)
Chris
‎10-29-2007 12:42 PM
Hi
Do you do any queries? Do you make sure you close the result sets?
Andy
‎10-29-2007 01:30 PM
Yes I do, and no I don't! I'll close all handles, re-run and see what happens. Cheers.
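For anyone following along, the close-every-result-set pattern can be sketched as below. This is only a self-contained illustration: SketchIndex, SketchResultSet and CloseResultSetsSketch are hypothetical stand-ins written for this example, not Alfresco classes. In a real migration script the result set would come from the search service, and the assumption here is simply that each one holds index resources until close() is called.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a search result set. It is NOT Alfresco's class;
// it only models "a handle stays open until close() is called".
final class SketchResultSet {
    private final SketchIndex owner;
    private final List<String> rows;
    private boolean closed = false;

    SketchResultSet(SketchIndex owner, List<String> rows) {
        this.owner = owner;
        this.rows = rows;
    }

    int length() {
        return rows.size();
    }

    // Releases the handle exactly once, even if close() is called repeatedly.
    void close() {
        if (!closed) {
            closed = true;
            owner.release();
        }
    }
}

// Hypothetical stand-in for the index: counts result sets that are still open.
final class SketchIndex {
    private int openHandles = 0;

    SketchResultSet query(String text) {
        openHandles++;
        List<String> rows = new ArrayList<>();
        rows.add("match for " + text);
        return new SketchResultSet(this, rows);
    }

    void release() {
        openHandles--;
    }

    int openHandles() {
        return openHandles;
    }
}

final class CloseResultSetsSketch {
    // Runs `count` queries and returns the total number of rows seen.
    // Each result set is closed in a finally block only when closeAfterUse is true,
    // so the two behaviours can be compared.
    static int runQueries(SketchIndex index, int count, boolean closeAfterUse) {
        int totalRows = 0;
        for (int i = 0; i < count; i++) {
            SketchResultSet rs = index.query("doc" + i);
            try {
                totalRows += rs.length();
            } finally {
                if (closeAfterUse) {
                    rs.close();
                }
            }
        }
        return totalRows;
    }

    public static void main(String[] args) {
        SketchIndex leaky = new SketchIndex();
        runQueries(leaky, 100, false);
        System.out.println("open handles without close: " + leaky.openHandles());

        SketchIndex tidy = new SketchIndex();
        runQueries(tidy, 100, true);
        System.out.println("open handles with close: " + tidy.openHandles());
    }
}
```

The try/finally form matters because a query loop that throws part-way through would otherwise leave that result set open as well.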
‎10-29-2007 02:37 PM
Sweet. That's got it.
After a load (without closes):
25456 contentstore
51460 lucene-indexes
After a load (with closes):
15164 contentstore
16296 lucene-indexes
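To put those figures side by side, here is a small sketch that works out the index-to-content ratio for each run. The numbers are the ones quoted in this thread; the units are not stated, but both values in each pair use the same unit, so the ratio is unaffected. The class and method names are made up for this example.

```java
import java.util.Locale;

final class IndexRatioSketch {
    // Ratio of index size to content size; both arguments must use the same unit.
    static double ratio(double indexSize, double contentSize) {
        return indexSize / contentSize;
    }

    public static void main(String[] args) {
        // Original report: 15.4 Gb of indexes against 2.8 Gb of content.
        System.out.println(String.format(Locale.ROOT, "original report: %.2f", ratio(15.4, 2.8)));
        // Test load without closing result sets.
        System.out.println(String.format(Locale.ROOT, "without closes:  %.2f", ratio(51460, 25456)));
        // Test load with result sets closed.
        System.out.println(String.format(Locale.ROOT, "with closes:     %.2f", ratio(16296, 15164)));
    }
}
```

This gives roughly 5.5 for the original report, about 2.0 for the load without closes, and about 1.1 for the load with closes.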