CIFS slowdown

atariq
Champ in-the-making
We've been going through a departmental testing phase of Alfresco 2.1 (with CIFS and NTLM) here at the office and things have generally been looking good so far.

Our repository is currently set up with the alf_data directory mapped over the network to a larger storage unit. I made the necessary changes in the custom-repository.properties file to reflect the move, and it went through without a problem. People have now been using Alfresco via CIFS and the web client for a week, with a generally positive reaction.

There is, however, a fairly serious issue that we are looking to address ASAP. One of the directories in our repository contains approximately 250 MB of documents (1700+ docs and PDFs). This directory takes around 30 seconds to load fully each time via CIFS, and since it is a heavily used directory for the department, the slowdown is cutting efficiency. What's odd is that the directory listing doesn't seem to be that slow via the web client.

Also, for your information, the versioning aspect has been turned on for all directories.

If you have any insight on what could be causing the slowdown, I'll greatly appreciate it! Thank you in advance.
19 REPLIES

jenglert
Champ in-the-making
Any thoughts or suggestions on this would still be very much appreciated. :)

Are there known issues with NFS shares and the Alfresco repository?

We had problems like this quite a bit on a project we worked on. You are seeing inconsistent results because Lucene implements some internal caching (just like Oracle). When the results are in memory, they can be returned quickly. When they are on the HDD, it takes longer. Our solution was to enforce a limit of 500 files per directory. This worked pretty well for us. I'm not sure whether this would be a possibility for you?

Jim Englert
jenglert@tsgrp.com

atariq
Champ in-the-making
Thanks for the response, Jim.

Yup, I learned about the reindexing strategy and rebuilt the index successfully; however, it did not give any noticeable performance increase.

With regard to setting directory limits, that is something I'd have to discuss further with the relevant department here at the office. That said, I'm pretty sure Alfresco should be able to handle this directory size, since in previous tests with the repository on the same server everything seemed to be just fine. It's only AFTER moving the repository over to an NFS share on our fileserver that the slowdown has started to occur.

jenglert
Champ in-the-making
The actual location of the content should (hopefully) have no impact on search performance. Lucene might take longer to index the content initially, but queries against Lucene should be unaffected by the underlying content store location. Your findings are consistent with what we found (thus mandating the 500 file limit). Maybe there is some sort of issue where path-based Lucene queries (CIFS, the web interface, and WebDAV all use Lucene) are actually hitting the content store for some reason.

The problem is that this section of the Alfresco code is likely very hard to follow, so checking whether there is an issue won't be easy. I'll put it on the back burner though.

atariq
Champ in-the-making
Thanks for following this thread.

An interesting point I was informed of today is that a general slowdown is experienced anywhere in the repository whenever a content item is being saved. For instance, it was demonstrated to me that when a 1.5 MB Excel spreadsheet was saved via CIFS into any directory whose file count was considerably less than 500, we still saw around 15 to 20 seconds of delay before it got saved. Of course, for directories on which sizable traffic is expected this becomes rather problematic. As such, I really wonder whether the file count is the culprit at all.

If I really run out of ideas, I'm going to try moving the repository back to the Alfresco server and carry out some more tests. I'll keep you guys posted!

Thanks again.

andy
Champ on-the-rise
Hi

How does the performance of CIFS and FTP compare?

CIFS is a very chatty protocol and may well have some issues when backed by NFS.

If your Lucene indexes are stored locally then a search never goes to NFS. You should not have Lucene indexes mounted over NFS (this is actually not supported in Lucene until 2.1). Indexes in Alfresco over NFS will be supported in the future. At the moment they must be local or on some FS that is equivalent.

Is the CIFS performance consistent, or is it much better the next time you open the directory?

It is possible that your CIFS client is fetching some metadata for each file (which may end up pulling the content). Try changing what is displayed in the client (e.g. remove file size info, thumbnails, that kind of thing).
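To check whether per-file metadata lookups are what makes the listing slow, you could time a plain listing against one that also stats every entry. This is only a rough sketch: `time_listing` is a made-up helper name, and it just measures what the operating system sees on a mounted path, not anything inside Alfresco.

```python
import os
import time


def time_listing(path, fetch_metadata=False):
    """Return (entry_count, elapsed_seconds) for listing one directory.

    With fetch_metadata=True, every entry is also stat'ed, which roughly
    mimics a file browser that shows size and date columns.
    """
    start = time.perf_counter()
    count = 0
    for name in os.listdir(path):
        if fetch_metadata:
            # One extra round trip per file on a network share.
            os.stat(os.path.join(path, name))
        count += 1
    return count, time.perf_counter() - start
```

Run it twice against the mounted share, once with `fetch_metadata=False` and once with `True`, and repeat each run to see how much caching changes the numbers. A large gap between the two would point at the per-file lookups rather than the listing itself.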

Andy

callermd
Champ in-the-making
Using a local disk and Lucene, we are getting very slow CIFS browsing and saving. We have many directories with thousands of files plus large numbers of scanned images.

Running AC 2.1 under FC7.

loftux
Star Contributor
This may be a Windows Explorer problem:
http://www.ss64.com/nt/slow_browsing.html
Or try searching directly on Microsoft support; you will find lots of issues there (tip: search on both SMB and CIFS).

Can you try mounting CIFS from a Linux PC and see if you get the same response times when listing files?

Anyway, thousands of files will take longer to list than just a few :) Try segmenting into spaces. The actual save procedure should not be affected, unless you count the time it takes for the Windows app's save dialog to open and list the files that already exist.

/Peter Löfgren

callermd
Champ in-the-making
I found the query that is causing the save slowdown:

select * from
alf_node_status nodestatus0_
inner join alf_node nodeimpl1_
on nodestatus0_.node_id=nodeimpl1_.id
where nodeimpl1_.id in
(select nodeimpl3_.id
from alf_child_assoc childassoc2_
inner join alf_node nodeimpl3_ on childassoc2_.child_node_id=nodeimpl3_.id
where childassoc2_.parent_node_id=1595175 and childassoc2_.is_primary=1);

It takes 7 seconds to execute this query on my machine.

Interestingly the subquery returns 0 rows.

callermd
Champ in-the-making
And it looks like it is fixed in the latest HEAD.

From Node.hbm.xml:

   <query name="node.GetPrimaryChildNodeStatuses">
      select
         status
      from
         org.alfresco.repo.domain.hibernate.NodeStatusImpl as status,
         org.alfresco.repo.domain.hibernate.ChildAssocImpl as assoc
         join assoc.child as child
      where
         assoc.parent = :parent and
         assoc.isPrimary = true and
         status.node = child
   </query>
  

This is the correct way of doing this.


My save time went from 7 seconds to less than one.
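
For anyone who wants to convince themselves that the two query shapes return the same rows, here is a small sketch using Python's built-in sqlite3 module. The table names and the `node_id`, `child_node_id`, `parent_node_id` and `is_primary` columns come from the SQL quoted earlier in this thread; everything else (the cut-down schema, the `guid` column, the sample rows, the helper names) is made up for illustration. The join form is my hand translation of the HQL into SQL, not what Hibernate actually generates, and SQLite will not reproduce the timings seen on the real database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a heavily simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE alf_node (id INTEGER PRIMARY KEY);
        CREATE TABLE alf_node_status (guid TEXT PRIMARY KEY, node_id INTEGER);
        CREATE TABLE alf_child_assoc (
            id INTEGER PRIMARY KEY,
            parent_node_id INTEGER,
            child_node_id INTEGER,
            is_primary INTEGER
        );
    """)
    conn.executemany("INSERT INTO alf_node (id) VALUES (?)",
                     [(i,) for i in range(1, 7)])
    conn.executemany(
        "INSERT INTO alf_node_status (guid, node_id) VALUES (?, ?)",
        [(f"guid-{i}", i) for i in range(1, 7)])
    # Node 1 has primary children 2 and 3 and a secondary child 4;
    # node 5 has primary child 6.
    conn.executemany(
        "INSERT INTO alf_child_assoc "
        "(parent_node_id, child_node_id, is_primary) VALUES (?, ?, ?)",
        [(1, 2, 1), (1, 3, 1), (1, 4, 0), (5, 6, 1)])
    return conn


# Shape of the slow query: the children are found in an IN (...) subquery.
SUBQUERY_FORM = """
    SELECT s.guid
    FROM alf_node_status s
    INNER JOIN alf_node n ON s.node_id = n.id
    WHERE n.id IN (
        SELECT c.id
        FROM alf_child_assoc a
        INNER JOIN alf_node c ON a.child_node_id = c.id
        WHERE a.parent_node_id = ? AND a.is_primary = 1)
    ORDER BY s.guid
"""

# Shape of the fixed query: everything is expressed as plain joins.
JOIN_FORM = """
    SELECT s.guid
    FROM alf_child_assoc a
    INNER JOIN alf_node c ON a.child_node_id = c.id
    INNER JOIN alf_node_status s ON s.node_id = c.id
    WHERE a.parent_node_id = ? AND a.is_primary = 1
    ORDER BY s.guid
"""


def primary_child_statuses(conn, sql, parent_id):
    """Run one of the two query forms and return the matching guids."""
    return [row[0] for row in conn.execute(sql, (parent_id,))]
```

Calling `primary_child_statuses(conn, SUBQUERY_FORM, 1)` and `primary_child_statuses(conn, JOIN_FORM, 1)` on the demo data should give the same list. On the real database, running EXPLAIN on both forms would show whether the subquery is being re-evaluated for every outer row, which is one plausible reason for the 7 seconds but is not confirmed anywhere in this thread.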