
[solved] Optimize access performance with plenty of docs

zomurn
Champ in-the-making
Hi,

I have a folder in Alfresco (named "Archives") which is the final folder of a workflow, i.e. the last stop in the circuit a document follows.
As a result, this folder holds no fewer than 1750 documents, displayed in the web client ten per page by default (175 pages in all).
When I click on this folder, I have to wait 20 seconds for its content to be displayed, apparently just to calculate the pagination.
Is there a way to stream the content? For example, when I click on the folder, only one page (the first ten documents) would be fetched and displayed, regardless of how many documents the folder contains.

Thanks.
7 REPLIES 7

zomurn
Champ in-the-making
In fact, these are the parameters that influence the response time:

# The maximum time spent pruning results
system.acl.maxPermissionCheckTimeMillis=100000000
# The maximum number of results to perform permission checks against
system.acl.maxPermissionChecks=10000

I commented them out in my custom-repository.properties, and now the client queries the server again when I want to view the last page (175).
That's ok.

dwilson
Champ in-the-making
I commented them out in my custom-repository.properties, and now the client queries the server again when I want to view the last page (175).
That's ok.

zomurn, I am running into similar issues with my custom paging. Can you describe in more detail what you did here? Did you insert comment symbols in front of those two lines, or did you increase the values of those properties? What does the code look like that makes sure the docs on page 175 are only fetched when the user clicks on page 175?

Thanks!

zomurn
Champ in-the-making
These two lines must be set in custom-repository.properties if you want to query a large number of documents.
If you enter a large value, more time is allowed for the query, so you can afford bigger queries (ones that retrieve a lot of results).
For example, in 10 s (the default) the application can retrieve "only", let's say, 500 documents from the repository, so the user will see pagination from page 1 to 50 (at 10 docs per page).
If there are more documents inside the folder (or in any other query's results), the pages after 50 are not displayed because they are never calculated. The repository effectively says: "I give you all the documents matching your query that I could collect in the time you granted me." That is 500, no more, although there are probably more results.
On the other hand, it makes no sense to let someone go to page 175. 1750 documents inside a folder is unreadable! Documents should only be accessed by search!
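
To make the effect of the two limits concrete, here is a minimal Java sketch of time-boxed permission pruning. It is not Alfresco's actual code: the class `PruningSketch`, the methods `prune` and `pagesShown`, and the fake clock that charges 20 ms per check are all made up for illustration. Only the idea comes from the explanation above: stop checking results once a time budget or a check budget is used up, then paginate whatever was collected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/** Illustrative only: a hypothetical model, not Alfresco's implementation. */
class PruningSketch {

    /**
     * Checks candidates one by one and stops as soon as either budget
     * (number of checks or elapsed milliseconds) is exhausted.
     */
    static <T> List<T> prune(List<T> candidates, Predicate<T> canRead,
                             int maxChecks, long maxMillis, LongSupplier clockMillis) {
        List<T> visible = new ArrayList<>();
        long start = clockMillis.getAsLong();
        int checks = 0;
        for (T candidate : candidates) {
            if (checks >= maxChecks || clockMillis.getAsLong() - start > maxMillis) {
                break; // budget used up: the remaining candidates are never examined
            }
            checks++;
            if (canRead.test(candidate)) {
                visible.add(candidate);
            }
        }
        return visible;
    }

    /** Number of pages the paginator can offer for the results it actually received. */
    static int pagesShown(int visibleCount, int pageSize) {
        return (visibleCount + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 1750; i++) {
            docs.add(i);
        }
        // Assumption for the demo: every clock reading advances time by 20 ms.
        long[] now = {0};
        LongSupplier fakeClock = () -> now[0] += 20;
        List<Integer> seen = prune(docs, d -> true, 10_000, 10_000L, fakeClock);
        // prints "500 docs, 50 pages"
        System.out.println(seen.size() + " docs, " + pagesShown(seen.size(), 10) + " pages");
    }
}
```

With the much larger time budget from the properties quoted earlier in the thread (100000000 ms), the same call would examine all 1750 documents and the paginator would show 175 pages.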

dwilson
Champ in-the-making
Thanks for the quick response. :)

Yes, 1750 documents are too many in general. But if the folder is displayed not as a search result but more like a listing of documents inside a category, then being able to sort that list is important for getting the full picture, for example to find the smallest or largest documents among the 1750. Also, since it is displayed like a category, it is nice to show the user how many documents the category contains in total, and you can only get that total by giving Lucene enough time to do a full search. Otherwise you would need to say: "Results 1-10 of 500+ documents" (but in actuality the number "500" wouldn't be steady, more like 541, 397, 698, etc.) :D
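
As a small illustration of the "500+" wording, here is a hedged Java sketch of a label formatter. The class `ResultLabel`, the method `label`, and the `truncated` flag are hypothetical; the sketch assumes the caller knows whether the result set was cut short by the time limit.

```java
/** Illustrative only: builds a "Results a-b of N[+] documents" label. */
class ResultLabel {

    /**
     * @param pageNumber 1-based page number
     * @param pageSize   documents per page
     * @param knownCount number of results actually retrieved
     * @param truncated  true if the search stopped before seeing everything
     */
    static String label(int pageNumber, int pageSize, int knownCount, boolean truncated) {
        if (knownCount == 0) {
            return "No documents";
        }
        int first = (pageNumber - 1) * pageSize + 1;
        int last = Math.min(pageNumber * pageSize, knownCount);
        // A trailing "+" signals that the total is only a lower bound.
        String suffix = truncated ? "+" : "";
        return "Results " + first + "-" + last + " of " + knownCount + suffix + " documents";
    }

    public static void main(String[] args) {
        // prints "Results 1-10 of 500+ documents"
        System.out.println(label(1, 10, 500, true));
    }
}
```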

In your earlier post you implied that the user can view page 175 if they choose to. But my question is: how can your system show that page 175 even exists if it only saw ~500 results?

zomurn
Champ in-the-making
The pagination is calculated from the results retrieved: the more there are, the longer the pagination.
I don't think we can configure the pagination so that it always displays the number of the last page.
For that, you need to run a query like "select count(….)", and with the page length (10 docs by default) you will be able to compute the last page.
But if the user clicks the last page number, you must be able to query the last page from the repository (keeping the sort order if required).
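
The arithmetic behind that idea can be sketched in Java as follows. This is only an illustration under assumptions: `PageMath`, `totalPages`, `offsetOf` and `fetchPage` are hypothetical names, and an in-memory sorted list stands in for a repository query that accepts a skip count and a limit.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: page arithmetic for count-based pagination. */
class PageMath {

    /** Last page number for a given total, e.g. 1750 documents at 10 per page -> 175. */
    static int totalPages(long totalCount, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (int) ((totalCount + pageSize - 1) / pageSize);
    }

    /** Number of results to skip before the given 1-based page starts. */
    static long offsetOf(int pageNumber, int pageSize) {
        if (pageNumber < 1) {
            throw new IllegalArgumentException("pageNumber is 1-based");
        }
        return (long) (pageNumber - 1) * pageSize;
    }

    /**
     * Stand-in for a repository query with skip and limit: returns only the
     * requested page from an already sorted result list.
     */
    static <T> List<T> fetchPage(List<T> sortedResults, int pageNumber, int pageSize) {
        int from = (int) Math.min(offsetOf(pageNumber, pageSize), sortedResults.size());
        int to = Math.min(from + pageSize, sortedResults.size());
        return new ArrayList<>(sortedResults.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 1750; i++) {
            docs.add(i);
        }
        int last = totalPages(docs.size(), 10);
        // prints "last page = 175, first doc on it = 1740"
        System.out.println("last page = " + last
                + ", first doc on it = " + fetchPage(docs, last, 10).get(0));
    }
}
```

The point of separating the count from the page fetch is that the paginator can show "Page 1 of 175" without the 1750 results ever being loaded at once.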

Concerning the sort, I (really) hope that even if the results displayed don't go beyond page 50 (because of the timeout), the sort is applied over the whole result set and not only to the 50 pages (500 results) displayed. That seems logical; otherwise sorting is useless.
It is fast to query the data and sort the 1750 results in the database, but it is slow to retrieve the 1750 nodeRefs from the database into the business layer, so the timeout only concerns this latter, slow part.

dwilson
Champ in-the-making
Interesting. Thanks for the reply. So instead of:

Prev     Page 1 of 175     Next

Your paginator says something like:

Prev     Page 1 of LOTS     Next

zomurn
Champ in-the-making
Yes, but more precisely (the standard Alfresco one):

Prev     Page 1 of XXX     Next

where XXX < 175 when the time allowed is less than what is required.