Aspects and MetadataExtraction

gonenc
Champ in-the-making
Hi,

Is it possible to run metadata extraction on a node's content whenever that content is created, updated or edited? My use case is that I have a Domain Specific Language in the form of XML. Whenever this type of XML is uploaded or updated, I want to extract some data from the XML and add it to the node as aspect properties.

My exact use case is that I have a report definition in XML (BIRT), and the report may have parameters defined. For efficiency I want to import the parameters needed to run the report into Alfresco as metadata. For any report definition XML I have created a ReportDefinitionAspect. I have also created an aspect called ReportWithParametersAspect, which specifies ReportDefinitionAspect as mandatory. So whenever a report definition with parameters is added or updated in Alfresco, I want the parameter list kept up to date and ReportWithParametersAspect applied to the node.

For this reason I have created a metadata extractor registered for a custom mimetype, "application/reportdefinition", just to make the mimetype of my XML files unique. My metadata extractor extracts the parameters and sets them as properties of ReportWithParametersAspect. From reading Alfresco's code, the aspects that define the extracted properties are applied to the node. However, ReportDefinitionAspect is not applied.

Another problem with this approach is that whenever a report definition with parameters is updated and the parameters are removed, I want ReportWithParametersAspect to be removed too. If I had some hooks for the update and create operations, with access to the node itself, I could do this easily. Is this possible? Or should I add wizards to create and update my nodes instead of using the default Add Content and Edit Content dialogs of the web client?

Any comments or explanations are appreciated.

Gonenc Ercan
1 REPLY

rmills
Champ in-the-making
Let me make sure I understand you before I try to answer:
You have an XML file that you're extracting data from and applying a general aspect to. When the XML is updated and certain nodes are added, you want new aspects to be added as well. Is that correct?

I'm not sure if this will work, but try using a few selectors. For example, say you want the general aspect applied when the root node is MYROOT and the "sub-aspect" applied when MYROOT has a child PROPERTY:

<bean id="extracter.xml.sample.selector.XPathSelector" class="org.alfresco.repo.content.selector.XPathContentWorkerSelector" init-method="init">
    <property name="workers">
      <map>
        <entry key="/MYROOT">
          <ref bean="extracter.xml.company.MYROOTXPathSelector"/>
        </entry>
      </map>
    </property>
</bean>


<bean id="extracter.xml.company.MYROOTXPathSelector" class="org.alfresco.repo.content.selector.XPathContentWorkerSelector" init-method="init">
    <property name="workers">
      <map>
        <entry key="/MYROOT/PROPERTY">
          <ref bean="extracter.xml.company.PROPERTYMetadataExtracter"/>
        </entry>
      </map>
    </property>
</bean>

<bean id="extracter.xml.company.PROPERTYMetadataExtracter"
    class="org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter"
    parent="baseMetadataExtracter"
    init-method="init" >
    <property name="mappingProperties">
      <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
        <property name="properties">
          <props>
            <prop key="namespace.prefix.superaspect">http://www.company.com/alfresco/model/content/1.0/superaspect</prop>
            <prop key="namespace.prefix.subaspect">http://www.company.com/alfresco/model/content/1.0/subaspect</prop>
            <prop key="superaspect_title">superaspect:Title</prop>
            <prop key="subaspect_author">subaspect:Author</prop>
          </props>
        </property>
      </bean>
    </property>
    <property name="xpathMappingProperties">
      <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
        <property name="properties">
         
          <props>
            <prop key="namespace.prefix.superaspect">http://www.company.com/alfresco/model/content/1.0/superaspect</prop>
            <prop key="namespace.prefix.subaspect">http://www.company.com/alfresco/model/content/1.0/subaspect</prop>
            <prop key="superaspect_title">/MYROOT/PROPERTY/Title</prop>
            <prop key="subaspect_author">/MYROOT/PROPERTY/Author</prop>
          </props>
        </property>
      </bean>
    </property>
  </bean>


You'll have to create a number of different selectors and conditions, but it should get you where you want. As for updating the metadata when the content is updated, create Inbound and Update rules on the space where the content is stored, and have them extract the metadata. The rules should run every time you update the content, and if you set up the extracters correctly, it shouldn't matter whether or not your mimetype is different from the standard XML mimetype.
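Before wiring up many selectors, it may help to check outside Alfresco which documents a given XPath condition would match. The following is only a minimal sketch in plain Java using the JDK's built-in XPath support; it does not call any Alfresco API. The class name `SelectorCheck` and the method `matches` are hypothetical names made up for this illustration, and the sample documents reuse the MYROOT and PROPERTY element names from the example above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Alfresco: evaluates an XPath expression
// against an XML string and reports whether it selects at least one node.
class SelectorCheck {

    static boolean matches(String xml, String xpathExpression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // XPathConstants.BOOLEAN converts a node-set result to true
            // when the node-set is non-empty, false otherwise.
            return (Boolean) xpath.evaluate(
                    xpathExpression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            // Raised for an invalid expression or unparsable XML.
            throw new IllegalArgumentException("Could not evaluate " + xpathExpression, e);
        }
    }

    public static void main(String[] args) {
        String withProperty =
                "<MYROOT><PROPERTY><Title>t</Title><Author>a</Author></PROPERTY></MYROOT>";
        String withoutProperty = "<MYROOT/>";

        // Both documents satisfy the outer condition; only the first
        // satisfies the inner one.
        System.out.println(matches(withProperty, "/MYROOT"));
        System.out.println(matches(withProperty, "/MYROOT/PROPERTY"));
        System.out.println(matches(withoutProperty, "/MYROOT"));
        System.out.println(matches(withoutProperty, "/MYROOT/PROPERTY"));
    }
}
```

A document for which the inner expression returns false is the case where the "sub-aspect" properties would not be extracted. Note that this check says nothing about removing an aspect that was applied earlier; that part of the original question is not covered by this sketch.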

Hope this helps some.