
semantic search in alfresco

jmarti
Champ in-the-making
Hi,

I'm wondering whether it is possible to implement "semantic search" in Alfresco. I'm not sure "semantic" is the right word…

What I would like to get is:
         - Search for a word.
         - You get docs containing and/or related to that word.
         - You also get the relations between those docs, displayed graphically. Customer says wow, I want this. ;-))
                     -  Relations are based on common words contained in those docs.
                     -  Those common words should also be similar to the word I started the search with.
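
To make the idea concrete, here is a minimal sketch in Python of relating search hits through the words they share. Everything in it is an assumption for illustration: the document names, the `tokenize` and `related_docs` helpers, and the naive word splitting are made up, and none of it is Alfresco or Lucene API. It also leaves out the "similar to the search word" part, which would need some similarity measure on top.

```python
import re
from itertools import combinations


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words (naive tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def related_docs(docs, query):
    """Find docs containing the query word and the words each pair of hits shares.

    docs  -- dict mapping a (hypothetical) document name to its text
    query -- the single word being searched for
    Returns (hits, edges): hits is a sorted list of matching document names,
    edges maps a pair of hits to the set of other words they have in common.
    """
    query = query.lower()
    words = {name: tokenize(body) for name, body in docs.items()}
    hits = sorted(name for name, w in words.items() if query in w)
    edges = {}
    for a, b in combinations(hits, 2):
        # The query word is shared by every hit, so it carries no information.
        common = (words[a] & words[b]) - {query}
        if common:
            edges[(a, b)] = common
    return hits, edges
```

The `edges` dictionary is what a graphical display would draw: one node per hit, one link per pair, labelled with the shared words.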

Could anyone tell me how or where I could start digging?
Is there any module or plugin already working on this?

Alfresco is based on Lucene indexes, so I tried looking for a Lucene module for this. No luck.

Any idea?

Should I start this from scratch?
10 REPLIES

jbarmash
Champ in-the-making
I am not aware of any semantic modules. There was some talk about enabling RDF-type stuff as a community project, but there has been no movement on that yet.

Most likely you'll have to start from scratch. 

Since our search is based on Lucene, if there are any Lucene modules that enable semantic-type search, that would be your best bet.

stevereiner
Champ in-the-making
It's coming (early beginnings, hopefully soon):

Semantics for Alfresco:
http://forge.alfresco.com/projects/semantics/
    * Provides a Flex UI (FlexSpaces add-on) and Flex components for semantic tag management, semantic search, and semantic tag clouds
      (later Flex and/or Ajax components for Alfresco 3.0)
    * REST-style API to semantic services implemented in Java (RDF, semantic tagging, etc.)
    * Automatic generation of semantic metadata from content
    * Search leveraging semantic metadata with Lucene
    * Auto-tagging, microformat tagging, RDF semantic tagging
    * Ontologies, taxonomies, semantic instance data
    * Semantic web (Linked data) technologies: RDF, SPARQL, OWL
    * Calais (Reuters / ClearForest) open API (free service)
    * Calais first, later Apache UIMA and GATE via UIMA
    * Alfresco ECM: Auto generation of semantic metadata from content with Calais and UIMA, semantic data relations with content nodes, semantic tagging, semantic search
    * Alfresco WCM: Integration to create and manage web sites leveraging the semantic services (use within web site / web app, optionally expose semantic web metadata externally)
    * Alfresco Collaboration: Support semantic tags and other features with collaboration features (blogs, wikis, 3.0 sites, etc.)

FlexSpaces
http://forge.alfresco.com/projects/flexspaces/
http://forums.alfresco.com/viewtopic.php?f=36&t=11876

Steve Reiner
http://www.integratedsemantics.org
http://www.integratedsemantics.com

jmarti
Champ in-the-making
Sounds great, I will give it a try as soon as it gets out.

stevereiner
Champ in-the-making
I had been delayed on the Semantics for Alfresco project.
By teaming up with Alexander Polev, who started a similar project (Calais integration), hopefully we will have the first part sooner (auto-tagging with Calais, multiple tag clouds).

http://forge.alfresco.com/projects/calais/
http://forge.alfresco.com/projects/semantics/

I also registered a Google Code project:
http://code.google.com/p/semantics4alfresco/

Steve

ttague
Champ in-the-making
Steve:

Tom Tague from Calais here.

Good to hear that this project may be regaining some traction.

I wanted to encourage you and Alexander to seriously consider storing the Calais URIs for extracted entities, events and relationships as well as the values themselves. We'll be blogging about it in the next day or so, but our next release (1/09) will allow you to de-reference many of the entity URIs, providing you with a path from unstructured text to semantic metadata to the linked data world of Freebase, DBpedia, etc. This should enable some pretty interesting content enhancement and exploration opportunities.
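
As a rough illustration of that suggestion, here is a minimal Python sketch of a tag record that keeps the URI next to the extracted value. The `SemanticTag` class, its field names, and the `dereferenceable` helper are hypothetical; they do not reflect the Calais response format or the Alfresco content model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SemanticTag:
    """One extracted entity, event or relationship (hypothetical record)."""
    value: str                 # the extracted text, e.g. a company name
    entity_type: str           # the kind of thing that was extracted
    uri: Optional[str] = None  # identifier returned by the service, if any


def dereferenceable(tags: List[SemanticTag]) -> List[SemanticTag]:
    """Return only the tags that carry a URI and can be followed to linked data later."""
    return [t for t in tags if t.uri]
```

Storing only `value` would lose the link; keeping `uri` alongside it leaves the door open for following the entity into other datasets once that is supported.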

Regards,

Tom

stevereiner
Champ in-the-making
Here is a link to Tom's blog post about Calais 4:

Life in the Linked Data Cloud - Calais Release 4 Coming Jan 09

"The Gist: Release 4 of Calais will be a big deal. In that release we’ll go beyond the ability to extract semantic data from your content. We will link that extracted semantic data to datasets from dozens of other information sources, from Wikipedia to Freebase to the CIA World Fact Book. In short – instead of being limited to the contents of the document you’re processing, you’ll be able to develop solutions that leverage a large and rapidly growing information asset: the Linked Data Cloud."

http://www.opencalais.com/node/9501


Steve

stevereiner
Champ in-the-making
http://integratedsemantics.org/2008/11/24/calais-integration-semantics-for-alfresco-sneak-peak-part-...

"Here is a sneak peak video of  semantic auto-tagging and multiple semantic tag clouds coming in the next release of FlexSpaces+AIR / FlexSpaces+Browser. This uses the  Calais Integration  Alexander and  I have been working on, the Open Calais service, and Alfresco.  The Calais integration auto-tag action can also be used from the "run action" UI in the Alfresco web client (now called Explorer)."

"Things in the works: Flex UI for tag suggestions, storing URIs in the integration for future linked data use, storing geo-location info from Calais 3.1, and map UI to display geo-location points to filter search results."

Steve

erictice
Champ in-the-making
How is this project coming along? I am curious about the granularity you plan to allow. For instance, we would be looking for individual words in a given document. In our business, that is potentially a word from the original scriptures of the Bible. Will it also support languages other than English? We would primarily be looking for Greek or Hebrew words.

stevereiner
Champ in-the-making
The Calais integration Alfresco server extension and the Calais Flex UI in FlexSpaces are complete:
http://forums.alfresco.com/en/viewtopic.php?f=32&t=15766

The Calais service now supports French in addition to English, and more languages will be added
(check the http://opencalais.com/non-english forum topic).
Note that Calais metadata extraction is currently focused on business and news content.

Adding support for open source / customizable engines is currently on the back burner.