Set model type of NodeRef based on Tika-determined MIME Type

pjaromin
Champ on-the-rise
I've created a custom "type" in my content model that models a file with a custom file format/extension.

I've configured Tika and Alfresco to recognize this file as a specific MIME type. Now, I would like to apply (setType) my custom model type to any file of this mime type added anywhere in the repository.

To that end, I've created a behavior that binds to onCreateNode for any cm:content. Unfortunately the mimetype is unavailable at this point in the lifecycle (throws NPE retrieving ContentData) and in any event, it appears that Tika runs *after* this point.

I've been looking at NodeMonitor and TransactionListener, but I'm not certain this is the right path either.

What would be an appropriate method for accomplishing what I want?

Thanks!
6 REPLIES

mrogers
Star Contributor
You probably want to register a post commit transaction listener in the onCreate event.

Another way may be to trigger your code from addAspect. If your custom metadata has any custom properties, an aspect will automatically be created for your custom property. You could use that event to run your code.

pjaromin
Champ on-the-rise
mrogers wrote: "You probably want to register a post commit transaction listener in the onCreate event."
This doesn't work; the listener is apparently still called *before* the content data is available, as the contentData.getMimetype() line below throws an NPE. The Tika message (…detected by Tika as being…) also appears in the debug logs *after* the stack trace…

public void afterCommit() {
   ContentData contentData = (ContentData) nodeService.getProperty(nodeRef, ContentModel.PROP_CONTENT);
   // contentData is still null at this point, so the next line throws the NPE
   String nodeMimeType = contentData.getMimetype();
}
mrogers wrote: "Another way may be to trigger your code from addAspect. If your custom metadata has any custom properties then an aspect will be automatically created for your custom property. You could use that event to run your code."

My only concern about this is that I was unable to get my metadata extracter to run when using FTP. It runs just fine when I upload via the webapp, but none of my extractors are triggered by other means. I found some very old, resolved, bug reports on this and assumed there was something else I needed to do…but haven't had time to hunt that one down yet.

I'm running the debugger now to see if I can learn the lifecycle here to better understand the order of events…is there a guide to this somewhere or an easier way for me to discover the order of things when new content is added to the repo, and why my extractor only runs with web uploads?

Thanks!

-Patrick

mrogers
Star Contributor
Umm - so metadata is post commit as well. Which interface are you using?

Metadata extraction is an action which should be triggered by a rule. But currently it is (probably incorrectly) hard-coded into half the public interfaces. To get FTP to extract metadata, define a rule.

pjaromin
Champ on-the-rise
Thanks for the reply, again.

mrogers wrote: "Umm - so metadata is post commit as well. Which interface are you using?"

I was originally using the NodeServicePolicies.OnCreateNodePolicy onCreateNode. I've just discovered that the ContentServicePolicies.OnContentUpdatePolicy is fired *after* Tika detection. So it looks like I should be using onContentUpdate, when newContent = true for my work.
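The decision described here can be sketched on its own, apart from the Alfresco wiring. This is only a minimal illustration in plain Java, assuming the mapping is a simple MIME type to prefixed type name map; the class and method names (TypeForMimetype, resolve) are hypothetical and not part of any Alfresco API. In a real behavior the mimetype would come from the node's ContentData and the result would be passed to nodeService.setType.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: decides which model type, if any, to apply to a node. */
final class TypeForMimetype {
    private TypeForMimetype() {}

    /**
     * @param mimetype   MIME type read from the node's content data; may be null
     * @param newContent the flag passed to onContentUpdate
     * @param mapping    MIME type -> prefixed type name, e.g. "myr:font"
     * @return the configured type name, or empty if nothing should be applied
     */
    static Optional<String> resolve(String mimetype, boolean newContent,
                                    Map<String, String> mapping) {
        // Only act on newly added content, and only once a mimetype is known
        if (!newContent || mimetype == null) {
            return Optional.empty();
        }
        // Unmapped mimetypes yield empty rather than null
        return Optional.ofNullable(mapping.get(mimetype));
    }
}
```

Returning an Optional keeps the "no content data yet" and "no type configured" cases explicit, so the caller can skip setType without risking an NPE.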

mrogers wrote: "Metadata extraction is an action which should be triggered by a rule. But currently it is (probably incorrectly) hard coded into half the public interfaces. To get ftp to extract meta-data define a rule."

Thanks, will do!

-Patrick

ak21
Champ in-the-making
I am having a similar issue, pjaromin. Were you able to resolve this? Please let me know.

pjaromin
Champ on-the-rise
I wound up binding a custom behavior to "onContentUpdate" with NotificationFrequency.FIRST_EVENT, which so far appears to work fine.

As a work-around for the extract metadata upon FTP issue, I just manually call the ContentMetadataExtracter action after setting the type.

Here's my behavior…


package com.jgsullivan.myriad.alfresco.behaviors.template;

import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;

import org.alfresco.model.ContentModel;
import org.alfresco.repo.action.executer.ContentMetadataExtracter;
import org.alfresco.repo.content.ContentServicePolicies;
import org.alfresco.repo.policy.Behaviour;
import org.alfresco.repo.policy.Behaviour.NotificationFrequency;
import org.alfresco.repo.policy.JavaBehaviour;
import org.alfresco.repo.policy.PolicyComponent;
import org.alfresco.repo.transaction.TransactionListenerAdapter;
import org.alfresco.service.ServiceRegistry;
import org.alfresco.service.cmr.action.Action;
import org.alfresco.service.cmr.action.ActionService;
import org.alfresco.service.cmr.repository.ContentData;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.namespace.NamespaceService;
import org.alfresco.service.namespace.QName;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
* Behavior that casts the node type to the appropriate Myriad model type
* based on the applied mime type.
* @author pjaromin
*
*/
public class ModelSetTypeBehavior extends TransactionListenerAdapter
      implements ContentServicePolicies.OnContentUpdatePolicy {
   
   private static final QName QNAME_ONCONTENTUPDATE = QName.createQName(NamespaceService.ALFRESCO_URI, "onContentUpdate");

   private Behaviour onContentUpdate;
   
   private PolicyComponent policyComponent;

   private ServiceRegistry serviceRegistry;
   
   private Map<String, String> mimeToModelTypeMap;
   
   private Map<String, QName> qnameMap;
   
   public void init() {
      if (log().isDebugEnabled()) {
         log().debug("Initializing Behavior");
      }
      this.onContentUpdate = new JavaBehaviour(this, "onContentUpdate", NotificationFrequency.FIRST_EVENT);
      this.policyComponent.bindClassBehaviour(QNAME_ONCONTENTUPDATE, ContentModel.TYPE_CONTENT, onContentUpdate);
   }
   
   /*
    * (non-Javadoc)
    * @see org.alfresco.repo.content.ContentServicePolicies.OnContentUpdatePolicy#onContentUpdate(org.alfresco.service.cmr.repository.NodeRef, boolean)
    */
   @Override
   public void onContentUpdate(NodeRef nodeRef, boolean newContent) {
      if (log().isDebugEnabled()) {
         log().debug("onContentUpdate, new[" + newContent + "]");
      }

      NodeService nodeService = serviceRegistry.getNodeService();
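      ContentData contentData = (ContentData) nodeService.getProperty(nodeRef, ContentModel.PROP_CONTENT);
      if (contentData == null) {
         // No content data on the node yet, so there is no mimetype to map
         return;
      }
      String nodeMimeType = contentData.getMimetype();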

      QName type = getQNameMap().get(nodeMimeType);
      if (type != null) {
         nodeService.setType(nodeRef, type);
         
         // Extract metadata here because it doesn't happen automatically when imported through FTP (for example)
         ActionService actionService = serviceRegistry.getActionService();
         Action extractMeta = actionService.createAction(ContentMetadataExtracter.EXECUTOR_NAME);
         actionService.executeAction(extractMeta, nodeRef);
      }
      else {
         log().warn("No type configured for mimetype [" + nodeMimeType + "]");
      }
   }
   
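   /**
    * Lazily builds the mimetype-to-QName map from the configured string map.
    * @return map of mimetype to resolved model type QName
    */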
   private Map<String, QName> getQNameMap() {
      if (qnameMap == null) {
         qnameMap = new HashMap<String, QName>();
         // Pre-resolve QNames…
         for (Entry<String,String> e : mimeToModelTypeMap.entrySet()) {
            QName qname = this.qnameFromMimetype(e.getKey());
            if (qname != null) {
               qnameMap.put(e.getKey(), qname);
            }
         }
      }
      return qnameMap;
   }

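   /**
    * Resolves the model type configured for a mimetype to a QName.
    * @param mimeType the mimetype key from the configured map
    * @return the resolved QName
    */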
   private QName qnameFromMimetype(String mimeType) {
      QName qname = null;

      String qNameStr = mimeToModelTypeMap.get(mimeType);
      qname = QName.createQName(qNameStr, serviceRegistry.getNamespaceService());

      return qname;
   }

   public PolicyComponent getPolicyComponent() {
      return policyComponent;
   }

   public void setPolicyComponent(PolicyComponent policyComponent) {
      this.policyComponent = policyComponent;
   }

   public ServiceRegistry getServiceRegistry() {
      return serviceRegistry;
   }

   public void setServiceRegistry(ServiceRegistry serviceRegistry) {
      this.serviceRegistry = serviceRegistry;
   }

   public void setMimeToModelTypeMap(Map<String, String> mimeToModelTypeMap) {
      this.mimeToModelTypeMap = mimeToModelTypeMap;
   }

   public Map<String, String> getMimeToModelTypeMap() {
      return mimeToModelTypeMap;
   }

   protected Log log() {
      return LogFactory.getLog(this.getClass());
   }

}


Then I configured it in the context with a map of custom QNames and mime-types…


   <!--                                             -->
   <!-- ModelSetTypeBehavior                        -->
   <!--                                             -->
   <bean id="model-behavior"
      class="com.jgsullivan.myriad.alfresco.behaviors.template.ModelSetTypeBehavior"
      init-method="init" depends-on="${artifactId}.dictionaryBootstrap">
      <property name="policyComponent" ref="policyComponent" />
      <property name="serviceRegistry" ref="ServiceRegistry" />
      <property name="mimeToModelTypeMap">
         <map>
            <entry key="application/myriad-template-package"><value>myr:templatePackage</value></entry>
…multiple custom models here…
            <entry key="application/x-font-ttf"><value>myr:font</value></entry>
         </map>
      </property>
   </bean>
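
The map values above are prefixed names such as myr:font, which the behavior turns into full QNames via QName.createQName and the NamespaceService. As a rough illustration of what that expansion amounts to, here is a plain-Java sketch. It is not the Alfresco API: the PrefixedNames class is hypothetical, a plain map stands in for the NamespaceService, and the namespace URI used in the usage note is an invented placeholder.

```java
import java.util.Map;

/** Hypothetical stand-in for prefix resolution; not the Alfresco QName API. */
final class PrefixedNames {
    private PrefixedNames() {}

    /** Expands "prefix:local" to "{uri}local" using the given prefix -> URI map. */
    static String expand(String prefixed, Map<String, String> prefixToUri) {
        int colon = prefixed.indexOf(':');
        // Reject values with no prefix or no local name, e.g. "font" or "myr:"
        if (colon <= 0 || colon == prefixed.length() - 1) {
            throw new IllegalArgumentException(
                "Expected prefix:localName but got [" + prefixed + "]");
        }
        String prefix = prefixed.substring(0, colon);
        String uri = prefixToUri.get(prefix);
        if (uri == null) {
            throw new IllegalArgumentException(
                "Unknown namespace prefix [" + prefix + "]");
        }
        return "{" + uri + "}" + prefixed.substring(colon + 1);
    }
}
```

For example, with "myr" mapped to the placeholder URI http://example.com/model/myriad/1.0, expanding myr:font gives {http://example.com/model/myriad/1.0}font. A typo in a configured prefix fails at the point of resolution instead of silently producing no match.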


Hope this helps.

-Patrick