
How to calculate the total number of documents in Alfresco?

yogeshpj
Star Contributor

I am planning to re-index a couple of terabytes of data in Alfresco.

To prepare, I am re-indexing a smaller data set first, benchmarking it, and tuning it as far as possible.
Once I have measured performance on the smaller data set, I will extrapolate the re-indexing time for the actual data.

To build the benchmark and get an initial idea of how long the re-index will take, I want to count the documents in Alfresco. (Is this the right approach, or should we consider data size or something else?)

What is the most accurate way to count the documents in Alfresco?

1) Count the rows in the alf_node table, filtered on the workspace store.
2) Write Java or JavaScript code to count all nodes of type cm:content.
3) Query Solr for nodes of type cm:content.

We are using Alfresco 5.2.2 Enterprise with Solr 4.

2 REPLIES

abhysunny
Champ on-the-rise

I have the same question. I have been relying on a CMIS query with IN_TREE on the root folder to count documents. In my case, the requirement was only an approximate count for a metric. What is the best approach?

charlesdaumont
Champ on-the-rise

Hi,

The best way to do this is through your database. Here is a query I used to count documents and folders; adapt it to your context (I hope it works :) ):

select a.local_name, b.protocol, count(*)
from alf_qname as a, alf_store as b, alf_node as c
where a.local_name in ('content', 'folder')
and b.protocol in ('workspace', 'archive') and b.identifier = 'SpacesStore'
and c.store_id = b.id and c.type_qname_id = a.id
group by a.local_name, b.protocol;

The number of documents alone is not enough to estimate your indexing time. You should also consider:
  • number of versions
  • number of indexed metadata properties
  • size of the text content, if you index the content of your documents

I hope these answers will help you.

Charles Daumont

Sopra Steria