03-10-2017 07:01 AM
Hello,
I use Alfresco Community 5.0.d and I would like to know whether it is possible to search Alfresco for all files that don't have an extension.
I do not know how to do the search.
Thanks,
Matthieu
03-11-2017 10:32 AM
Depends on which search you want to use. If you are using the "Aikau" search in Share or Alfresco FTS, the search string !=cm:name:*.??? should do it. It should find all nodes whose name does not end with a three-character extension.
03-13-2017 01:42 AM
The question isn't necessarily a matter of which UI you use (Aikau faceted search or Node Browser, for instance), but whether the search services support this type of query. The problem with a wildcard-based approach in FTS is that, by design, it only scales to a certain number of documents in the system. This is a result of how the query is translated to the underlying Lucene system in SOLR. Also, the pattern *.??? assumes that all extensions are exactly three letters long, which might have been the standard in the old DOS 8.3 world, but all modern MS Office extensions have four letters.
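To illustrate the extension-length point, here is a small Python sketch that uses shell-style globbing from the standard library as a stand-in for the FTS wildcard. This is only an approximation: fnmatch is not Lucene, and the file names are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, for illustration only.
names = ["report.pdf", "report.docx", "notes.md", "README"]

# "*.???" only matches names ending in a dot followed by exactly three characters,
# so "report.docx" and "notes.md" are not recognised as having an extension.
three_char_ext = [n for n in names if fnmatchcase(n, "*.???")]

# A simple "no dot anywhere" check treats every dotted name as having an extension.
no_extension = [n for n in names if "." not in n]

print(three_char_ext)
print(no_extension)
```

With a negated *.??? pattern, the .docx and .md files would be reported as having no extension, which is probably not what is wanted.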
Without having done a similar query myself on a large document base (i.e. more than just a couple tens of thousands of documents), I would assume the best way to work with this is by doing a CMIS query using the LIKE operator on cmis:name. The reasoning behind this is that a CMIS query using LIKE can actually be applied against the database instead of the SOLR index, and thus is not limited by the index query rewrite restrictions. The only thing you need to ensure is that the additional indexes for transactional metadata queries have been applied on the database system.
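As a rough sketch of what such a CMIS query could look like, the Python snippet below only builds the query string. It assumes that "no extension" can be approximated as "cmis:name contains no dot", and the helper name is hypothetical. I have not verified that NOT LIKE is answered from the database rather than SOLR on 5.0.d, so treat that as an assumption to test.

```python
def cmis_query_without_extension(type_id: str = "cmis:document") -> str:
    """Build a CMIS query for objects whose cmis:name contains no dot.

    Hypothetical helper for illustration; it does not talk to a repository.
    """
    # In CMIS LIKE patterns, % matches any run of characters,
    # so '%.%' means "the name contains a dot somewhere".
    return (
        f"SELECT cmis:objectId, cmis:name FROM {type_id} "
        "WHERE cmis:name NOT LIKE '%.%'"
    )


query = cmis_query_without_extension()
print(query)
```

The resulting string can be pasted into any CMIS client or workbench. Note that names containing a dot for other reasons (e.g. "v1.2 notes") would be treated as having an extension by this pattern.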
03-13-2017 04:32 AM
Hi Axel, I mentioned "Aikau" because it's the easiest way to test the FTS string. The query performs well on large document sets (tested with a 1,000,000-document repository), but paging through a large result set gets slower with each subsequent page.
It's true that it finds only three-character extensions, but it is easy to adapt 🙂
I used ??? because I thought Solr would internally invert the query string (???.*) which would not be so expensive - do you know if this is correct?
03-13-2017 04:50 AM
I can't say how SOLR / Lucene handles this at a low level. I just remember running into maxBooleanClauses limits with Alfresco SOLR in the past, due to the way Alfresco rewrote wildcard queries before sending them off to the SOLR / Lucene layer. This may have changed in Alfresco 5.0 or later versions, though...