03-28-2022 07:51 AM
I'm doing a cmis query in SOLR in Search Services 1.2 for the conjunction of two predicates.
Property A is a text property and must match a regex with a fixed prefix followed by a wildcard.
Property B is a datetime property and must have a null value.
When I search for each of these in isolation, I can verify that there's nodes that match both.
IE.
SELECT * FROM xx:myNodeType WHERE xx.A LIKE 'ABC%'
and
SELECT * FROM xx:myNodeType WHERE xx.B IS NULL
intersects in found results.
Lets assume that the intersection contains a single node where property A is 'ABCD'.
If I replace the wildcard with an exact match instead of just the prefix and check that property B is null, I am also able to find the node.
I.e.
SELECT * FROM xx:myNodeType WHERE xx.A LIKE 'ABCD' and xx.B IS NULL
works.
However, when I use the same conjunction with the wildcard query, no results are found.
I.e.
SELECT * FROM xx:myNodeType WHERE xx.A LIKE 'ABC%' and xx.B IS NULL
There's no errors or anything in the log. The search simply returns an empty list.
Our model has evolved over time and we have around 10000 nodes where some were created before property B was added to the model. My best guess is that the problem is somehow related to this. I have deleted and recreated the SOLR indexes with no effect.
Any suggestions?
03-29-2022 03:56 AM
After more investigation it turns out that the LIKE is behaving weirdly.
When I run
SELECT * FROM xx:myNodeType
returns 3875 results (note that I have adjusted the search query limits to allow for this).
When I manually count the number of nodes in the result where property A starts with the prefix 'ABC' I find that 3672 of those do.
When I then run
SELECT * FROM xx:myNodeType WHERE xx.A LIKE 'ABC%'
I only find 998 nodes (suspiciously close to the 1000 limit, no?). Where's the rest of the 3800 nodes that should have been included?
When I search for the negation
SELECT * FROM xx:myNodeType WHERE xx.A NOT LIKE 'ABC%'
I find nodes where property A begins with something else than ABC, which is to be expected. But I also find nodes that DO start with ABC, which is obviously not expected. I find 2674 nodes with this property.
If I sum the number of nodes that match the LIKE for 'ABC%' and the negation of the like, I get the exact number of nodes that I expected to find with this property after manual counting of all the nodes.
2674 + 998 = 3672
So why does the prefix sometimes match the regex and sometimes don't? Could this be some encoding issue?
Explore our Alfresco products with the links below. Use labels to filter content by product module.