Categories and Advanced Search

sacco
Champ in-the-making
Hi

I've been looking at the implementation of Categories and, as far as I understand, the general classification with the user-visible categories is achieved by adding an aspect to a content object with a single multi-valued property which can contain references to nodes in a space of categories (rather like using URLs to identify namespaces, I suppose).

Then there is (according to the Wiki, although it doesn't seem to match the code entirely) a clever trick with indexing which allows one to search over child categories using paths (similar to XPaths) in the index, although I haven't yet found any trace in the Web Client of the special "member" QName mentioned here:

http://wiki.alfresco.com/wiki/Search#Category_Queries

OK so far? 
That said, I'm not sure what the point is of exposing the top-level categories uniformly with the rest: what could it mean when a user searches for something with the category 'Regions', for instance?

I get the impression that this mechanism may also be lurking somewhere in the background for other more system-oriented categorisations, i.e. one could use the same sort of Category hierarchy stuff with another property/aspect to do something else.


However, the semantics of Advanced Search with multiple categories, as implemented following
http://issues.alfresco.com/browse/AWC-479
aren't quite what users might expect: using the same notation as AWC-479, what is really required is:

+Folder +(catA1 catA2 catA3) +(catB1 catB2 catB3)

where catA?, etc. are all the subcategories that come under one top-level categorisation, e.g. Regions in the bootstrap setup.

To illustrate with an example: if a user makes a search specifying the two categories 'User Manual' and 'Japanese', it is extremely unlikely that they actually want a list of results which includes both all User Manuals in the repository and all documents of whatever type in Japanese.


In any case, if this really were required, it would probably be better achieved with two separate searches, whereas, as things stand, there's no easy way to get to all the User Manuals available in Japanese.


A better approach would be to 'OR' Categories sharing a common top-level "super-category", but to 'AND' the resulting disjunctions (and in the unlikely case that anybody has found a use for 'OR' across classifications, this could still be achieved simply by dropping the two category hierarchies under a common root 'super-category').
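
To make the difference between the two semantics concrete, here is a minimal Java sketch that filters in-memory documents both ways. It is not Alfresco code: the class `CategorySemantics`, the `Doc` type and both method names are hypothetical, and categories are plain strings rather than node references.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Alfresco.
final class CategorySemantics {

    // A document together with the category names applied to it.
    static final class Doc {
        final String name;
        final Set<String> categories;

        Doc(String name, Set<String> categories) {
            this.name = name;
            this.categories = categories;
        }
    }

    // Flat OR across every selected category: a hit on any one of them is enough.
    static boolean matchesAny(Doc doc, Collection<String> selected) {
        for (String category : selected) {
            if (doc.categories.contains(category)) {
                return true;
            }
        }
        return false;
    }

    // OR within each top-level classification, AND across classifications.
    // Map key: top-level classification; value: the categories selected under it.
    static boolean matchesAllGroups(Doc doc, Map<String, ? extends Collection<String>> selectedByRoot) {
        for (Collection<String> group : selectedByRoot.values()) {
            if (!matchesAny(doc, group)) {
                return false;
            }
        }
        return true;
    }
}
```

With 'User Manual' and 'Japanese' selected, `matchesAny` accepts an English user manual, while `matchesAllGroups` only accepts documents that carry a category from each of the two groups.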
10 REPLIES

sacco
Champ in-the-making
I'd contribute a patch to do this, but I first opened the Alfresco source code last night, so I don't really know my way around yet and I don't know whether the code should sit in SearchContext.java, AdvancedSearchBean.java, or a bit in each (or rather, I could patch it in SearchContext.java, but a better solution might involve adjusting both to work together).

The pseudo-code for a quick hack might be to put something like the following into the appropriate place in public String buildQuery():


    java.util.Arrays.sort(categories);
    preRoot = new Path(categories[0]);
    Chop this path to remove the trailing wildcards that include children,
        leaving at most 4 (?) Path.Elements
                ("/member" doesn't seem to be used here)
    Drop the tail of this Path until the third to last node is either
        not an instance of classifiable or
        is called categoryRoot (just being careful!)
        (the path should now end at the correct top-level super-category)

    Now loop appending the path strings with 'OR' as before, but
        check each time that the prefix of the Path matches the one found above.
    When this match fails (i.e. the prefix has changed),
        output " ) AND ( "
        get a new preRoot by trimming the new path and continue as before.
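
The loop above can be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions, not a patch against SearchContext.java: the class `GroupedCategoryQuery` and its methods are hypothetical, category paths are handled as plain strings instead of Alfresco `Path` objects, the top-level super-category is assumed to be the first two path elements, and the output uses the `+( ... )` notation from the first post rather than literal `AND`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; works on path strings, not on Alfresco Path objects.
final class GroupedCategoryQuery {

    // Assumed layout: the top-level super-category is the first two path
    // elements, e.g. "/cm:generalclassifiable/cm:Languages".
    static String rootOf(String path) {
        String[] parts = path.split("/");
        if (parts.length < 3) {
            return path;
        }
        return "/" + parts[1] + "/" + parts[2];
    }

    // Builds "+(a b) +(c d)": OR inside a super-category, AND between them.
    // Each path is emitted verbatim, including any wildcard suffix it carries.
    static String build(List<String> categoryPaths) {
        List<String> sorted = new ArrayList<>(categoryPaths);
        // Sort by super-category first so that each group is contiguous.
        sorted.sort((a, b) -> {
            int byRoot = rootOf(a).compareTo(rootOf(b));
            return byRoot != 0 ? byRoot : a.compareTo(b);
        });

        StringBuilder query = new StringBuilder();
        String currentRoot = null;
        for (String path : sorted) {
            String root = rootOf(path);
            if (!root.equals(currentRoot)) {
                // The prefix has changed: close the previous group, open a new one.
                if (currentRoot != null) {
                    query.append(") ");
                }
                query.append("+(");
                currentRoot = root;
            } else {
                query.append(' ');
            }
            query.append("PATH:\"").append(path).append('"');
        }
        if (currentRoot != null) {
            query.append(')');
        }
        return query.toString();
    }
}
```

Sorting on the super-category before the full path is a deliberate choice: a plain string sort can interleave groups when one selected path is itself a super-category, and the prefix-change check relies on each group being contiguous.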

sacco
Champ in-the-making
A second point about Categories is that a strict hierarchy seems unnecessarily inflexible:

    Sweden is a country in Northern Europe which also belongs to the European Union

    Norway is a country in Northern Europe which does not belong to the European Union

    Italy is a country in Southern Europe which belongs to the European Union

    Switzerland is a European country which does not belong to the European Union
There's no hierarchy which can really make sense of overlapping groupings like these.


A partial solution would be to allow the Associations between Categories to form a Directed Acyclic Graph (DAG).

It looks to me like almost all you would need to do would be to allow links to be made to Categories from within other Categories: given the way it's implemented (although I've only looked at a small part), the user model should be perfectly backward-compatible, and nothing else seems to need changing. 
It's probably not a good idea to let Categories link into their own children, but you already have links in the system, so at the worst I'd guess there's already code available to prevent endless recursive loops.
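
As a rough illustration of the loop prevention mentioned above, the sketch below keeps category links in a map and refuses any link that would make a category its own descendant. It is a generic DAG check, not Alfresco's implementation; the class `CategoryGraph` and its methods are hypothetical, and categories are identified by plain names.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory category graph; categories are identified by name.
final class CategoryGraph {

    // parent -> direct children
    private final Map<String, Set<String>> children = new HashMap<>();

    // Adds a parent -> child link unless it would create a cycle.
    // Returns true if the link was added, false if it was refused.
    boolean link(String parent, String child) {
        if (parent.equals(child) || reaches(child, parent)) {
            return false;
        }
        children.computeIfAbsent(parent, k -> new HashSet<>()).add(child);
        return true;
    }

    // True if 'to' can be reached from 'from' by following child links.
    boolean reaches(String from, String to) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        stack.push(from);
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (node.equals(to)) {
                return true;
            }
            if (seen.add(node)) {
                for (String c : children.getOrDefault(node, Collections.<String>emptySet())) {
                    stack.push(c);
                }
            }
        }
        return false;
    }
}
```

With this check, 'Sweden' can be linked under both 'Northern Europe' and 'European Union', but an attempt to link 'Europe' under 'Sweden' is rejected once 'Sweden' is already a descendant of 'Europe'.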

sacco
Champ in-the-making
Is this the right place to post this kind of stuff, or would it be better as a request on JIRA?

gavinc
Champ in-the-making
Yes, JIRA would be better.

andy
Champ on-the-rise
Hi

Categories can be linked in to multiple categories 🙂
It is not exposed in the UI.


Just like groups can be linked in to other groups too, but not from the UI.

Regards

Andy

sacco
Champ in-the-making
Hi

"Categories can be linked in to multiple categories 🙂
It is not exposed in the UI."

Good, I did wonder.

The question, then, is whether the rest of the code handling categories is set up to do the right thing.  I suspect that something needs to be done (perhaps it already is done) to use paths which are canonical in some sense.

That is, if I search for items with the category 'Norway', I need to find them all, regardless of which path to 'Norway' was used to classify the item or to set the search criteria.

Is this already the case?
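
To illustrate the concern, here is a small Java sketch that enumerates every path leading to one category once a category has more than one parent. It says nothing about how Alfresco actually indexes categories; the class `CategoryPaths` and its methods are hypothetical, and the links are assumed to form a DAG (no cycles).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; categories are identified by name.
final class CategoryPaths {

    // parent -> direct children, in insertion order
    private final Map<String, List<String>> children = new LinkedHashMap<>();

    void link(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // Every path from 'root' down to 'target'. Assumes the links contain no cycles.
    List<String> pathsTo(String root, String target) {
        List<String> out = new ArrayList<>();
        walk(root, target, "/" + root, out);
        return out;
    }

    private void walk(String node, String target, String pathSoFar, List<String> out) {
        if (node.equals(target)) {
            out.add(pathSoFar);
            return;
        }
        for (String child : children.getOrDefault(node, Collections.<String>emptyList())) {
            walk(child, target, pathSoFar + "/" + child, out);
        }
    }
}
```

If 'Sweden' is linked under both 'Northern Europe' and 'European Union', `pathsTo("Regions", "Sweden")` returns two paths. A search keyed on only one of them would miss items reached through the other, which is why matching on the category node itself, or on all of its paths, matters.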

sacco
Champ in-the-making
"Yes, JIRA would be better."

OK, I'll think about posting some of it there, then.

I didn't like to go straight there, as some of the post was about whether I had understood what was going on correctly.

alexandra
Champ on-the-rise
I have experience from the Documentum platform, where we used the metadata features extensively. I have not yet been able to figure out exactly what I can do with categories, but I am of course looking for features similar to:

Document Types: capable of holding a unique set of metadata attributes, with values in dropdowns/checkboxes.

Any number of metadata attributes within each DocType.

A possibility of hierarchical metadata values, allowing field selections like Continent/Country/Region or similar.

The effect of this is of course being able to do "Smart Folders" the same way as "Smart Playlists" work in iTunes (folders based on search queries).

Can I do any of this with categories, or are they just a way to choose ONE attribute per object?

andy
Champ on-the-rise
Hi

Categories are in a hierarchy. You can apply more than one category from a given classification to an object (it is a multivalued property). You can search over these. You can also define your own aspects to add new classifications, but this would not be supported in the standard UI. You can link bits of category/classification trees into others, but this is not exposed in the UI.

Categories can be queried by PATH. At some point we will have query based spaces.

There is no enum-style property; this is covered via categories.

Hope this helps

Andy