[SOLVED] Lucene Search for metadata - yes again

mwildam
Champ in-the-making
I searched the forum, the wiki and even Google, and I found a lot of information about what should work when using the API and Lucene to search for metadata.

Basically, the only thing that works is:
[…]
            String searchCriteria = "+@cm\\:title:\"Test\"";
            Query query = new Query(Constants.QUERY_LANG_LUCENE, searchCriteria);
            Node[] nodes = repositoryService.get(new Predicate(null, spacesStore, query));
[…]
BUT: nowhere in the API can I read the short prefix ("cm" in this case). For other namespaces I have no idea what the prefix could be. However, I do know the full names, like:
{http://www.customer.com/model/content/1.0}projectName

How the heck can I use the full name in my search criteria? The documentation says I can, but I can't get it to work. How can this be achieved?
6 REPLIES

mikeh
Star Contributor
There's an example on the wiki: http://wiki.alfresco.com/wiki/Search#Finding_nodes_by_content_mimetype

e.g.
@\{http\://www.alfresco.org/model/content/1.0\}title:"Test"
That's how Lucene needs to receive it, so obviously you'll need some extra escaping in the client code.
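
As a rough illustration of that extra escaping, here is a minimal sketch of how the string might look as a Java literal. The class and constant names are made up for this example. The only point is that each backslash Lucene should receive is doubled in Java source, and the quotes around the phrase are escaped as well.

```java
public class FullQNameQuery {
    // Hypothetical constant: the Lucene query from the wiki example as a Java literal.
    // "\\" in source becomes a single backslash at runtime, and \" becomes a plain quote.
    public static final String SEARCH_CRITERIA =
            "@\\{http\\://www.alfresco.org/model/content/1.0\\}title:\"Test\"";

    public static void main(String[] args) {
        // Prints the query exactly as Lucene will receive it.
        System.out.println(SEARCH_CRITERIA);
    }
}
```

The printed string should match the wiki form above, so comparing the two is a quick way to check the escaping before passing the string to a Query.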

Thanks,
Mike

mwildam
Champ in-the-making
Please read the following sentence slowly word by word:
You, are, my, hero!
It works. Thanks a lot!

There's an example on the wiki: http://wiki.alfresco.com/wiki/Search#Finding_nodes_by_content_mimetype
@\{http\://www.alfresco.org/model/content/1.0\}title:"Test"
I have been on that page, but I overlooked that one sample. I have also seen places showing it wrong (I am not going to rediscover them now), or slides showing only half the information, like http://www.slideshare.net/JM.Pascal/alfresco-search-tutorial-presentation on slide 46, which shows just a portion, and even that is wrong. Most places just show the short form. Yesterday I had already looked at the topic http://forums.alfresco.com/en/viewtopic.php?f=3&t=19701 and tried what it says. What I noticed only now is that, in addition to the curly braces, the ":" after http also has to be escaped, but ONLY that first occurrence! I am pretty sure I overlooked and forgot that. In slide 46 mentioned above, the escaping of the second ":" is definitely wrong: I get an error when I try to escape that one as well.

That's how Lucene needs to receive it, so obviously you'll need some extra escaping in the client code.
The longer I think about it, the more I recognize that Lucene search queries must be seen more like regular expressions than like a Google-style search query. From this point of view it is clearer that every special character needs to be escaped.

Investigating the options, I found the OpenSearch-style REST method: http://wiki.alfresco.com/wiki/OpenSearch#Alfresco_Keyword_Search. This seems a lot easier, but it does not seem to allow searching in particular fields. That said, for many people the OpenSearch method probably fits perfectly (I guess it uses the fields configured for simple search; not tested).

BTW: I made a few quick tests, and using the Alfresco Explorer I get slightly different results than when using the API. One quick guess is that "Search All stores" in Alfresco Explorer means "everything but dictionary", while the API returns those too. I have to investigate this further.

Many, many thanks again. With the two most interesting (and most cited) links for more information linked here, we can probably get this topic made sticky, as I consider this a very common request.

andy
Champ on-the-rise
Hi

If you are writing the query in code you can get the escaping done for you.
Here is the example for building a phrase. (Do not do the last bit of escaping for non-phrases)

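           // The opening of this statement was cut off in the post; the builder and the "@" prefix are assumed.
           StringBuilder query = new StringBuilder();
           query.append("@").append(
           LuceneQueryParser.escape("{" + ContentModel.PROP_CONTENT.getNamespaceURI() + "}"
           + ISO9075.encode(ContentModel.PROP_CONTENT.getLocalName()))).append(":\"").append(
           LuceneQueryParser.escape(phrase)).append("\"");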

Alfresco FTS is more escape friendly.

Andy

mwildam
Champ in-the-making
I tried to use the automatic escaping you proposed, but I don't have the LuceneQueryParser. Where does it come from?

gyro_gearless
Champ in-the-making
Hi,

It's "org.alfresco.repo.search.impl.lucene.LuceneQueryParser" - Eclipse should easily find it 🙂

Cheers
Gyro

mwildam
Champ in-the-making
It's "org.alfresco.repo.search.impl.lucene.LuceneQueryParser" - Eclipse should easily find it 🙂
Don't have that - I only have "org.alfresco.webservice…." - I use the Remote SDK.
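
For a client that only has the web service classes, one possible workaround is to do the escaping by hand. The following is a minimal sketch, not the Alfresco implementation: the class and method names are invented for this example, the list of special characters is my assumption based on the classic Lucene query syntax, and it skips the ISO9075 encoding of the local name shown earlier (so it assumes the local name contains no characters that need it).

```java
public class LuceneEscapeSketch {
    // Assumed set of characters that the classic Lucene query syntax treats as special.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    /** Prefixes every special character in the input with a backslash. */
    public static String escape(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    /**
     * Builds an attribute phrase query from a full name such as "{namespace-uri}localName".
     * The colon between field and phrase is added unescaped, since it is the separator.
     */
    public static String attributePhraseQuery(String fullName, String phrase) {
        return "@" + escape(fullName) + ":\"" + escape(phrase) + "\"";
    }

    public static void main(String[] args) {
        // Uses the customer namespace from the first post as sample input.
        System.out.println(attributePhraseQuery(
                "{http://www.customer.com/model/content/1.0}projectName", "Test"));
    }
}
```

The result can be passed as the search criteria string to the Query constructor from the first post. The forward slashes are left alone here because the form that worked in this thread does not escape them.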