Adding a content transformer - NOT working

leftright
Champ on-the-rise
On Debian I have written a shell script (it works perfectly when called from a terminal)
that converts a scanned PDF to a PDF with computer-readable text,
and I want to include this transformation in Alfresco.
Since I have never added a new content transformer before, I started
by adding only a PDF to TIFF transformer in Alfresco (the first step
in my transformation from scanned PDF to PDF with text).

I did the following:

  - Added the tiff mimetype to the mimetypes shown in "transform content to image":

  <config evaluator="string-compare" condition="Action Wizards">
     <image-transformers>
        <transformer name="image/tiff"/>
     </image-transformers>
  </config>

  - Applied the rule on a space for the transformation of inbound content from pdf to tiff.

  - Added a transformer in the folder tomcat/shared/classes/alfresco/extension with the following code


<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>
<beans>  
   <bean id="transformer.Ocr.pdfVtiff" class="org.alfresco.repo.content.transform.ProxyContentTransformer" parent="baseContentTransformer">
      <property name="worker">
         <ref bean="transformer.Ocr.pdfVtiff2" />
      </property>
   </bean>  
   <bean id="transformer.Ocr.pdfVtiff2" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformerWorker">
      <property name="checkCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
                <map>
                    <entry key=".*">
                        <value>chmod 755 /home/grega/ocr/ocrtiff.sh</value>
                    </entry>
                </map>
            </property>
            <property name="errorCodes">
               <value>1,2</value>
            </property>
         </bean>
      </property>
      <property name="transformCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
                <map>
                    <entry key=".*">
                        <value>/home/grega/ocr/ocrtiff.sh ${source}</value>
                    </entry>
                </map>
            </property>
            <property name="errorCodes">
               <value>1,2</value>
            </property>
         </bean>
      </property>
      <property name="explicitTransformations">
         <list>
            <bean class="org.alfresco.repo.content.transform.ExplictTransformationDetails">
               <property name="sourceMimetype">
                  <value>application/pdf</value>
               </property>
               <property name="targetMimetype">
                  <value>image/tiff</value>
               </property>
            </bean>
         </list>
      </property>
   </bean>   
</beans>

All of this loads without Alfresco complaining, but when I try to add a PDF file to the folder,
I get the following error (from the alfresco.log file):


01:36:38,839 ERROR [org.alfresco.web.ui.common.Utils] Failed to create content due to error: 06260001 Some error occurred during document transforming. Error message: 06260000 No transformation exists between mimetypes application/pdf and image/tiff
org.alfresco.service.cmr.rule.RuleServiceException: 06260001 Some error occurred during document transforming. Error message: 06260000 No transformation exists between mimetypes application/pdf and image/tiff
   at org.alfresco.repo.action.executer.TransformActionExecuter.executeImpl(TransformActionExecuter.java:281)
   at org.alfresco.repo.action.executer.ActionExecuterAbstractBase.execute(ActionExecuterAbstractBase.java:133)
   at org.alfresco.repo.action.ActionServiceImpl.directActionExecution(ActionServiceImpl.java:749)
   at org.alfresco.repo.action.executer.CompositeActionExecuter.executeImpl(CompositeActionExecuter.java:66)
   at org.alfresco.repo.action.executer.ActionExecuterAbstractBase.execute(ActionExecuterAbstractBase.java:133)
   at org.alfresco.repo.action.ActionServiceImpl.directActionExecution(ActionServiceImpl.java:749)
   at org.alfresco.repo.action.ActionServiceImpl.executeActionImpl(ActionServiceImpl.java:675)
   at org.alfresco.repo.action.ActionServiceImpl.executeAction(ActionServiceImpl.java:540)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
   at org.alfresco.repo.security.permissions.impl.AlwaysProceedMethodInterceptor.invoke(AlwaysProceedMethodInterceptor.java:34)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.alfresco.repo.security.permissions.impl.ExceptionTranslatorMethodInterceptor.invoke(ExceptionTranslatorMethodInterceptor.java:44)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.alfresco.repo.audit.AuditMethodInterceptor.proceedWithAudit(AuditMethodInterceptor.java:217)
   at org.alfresco.repo.audit.AuditMethodInterceptor.proceed(AuditMethodInterceptor.java:184)
   at org.alfresco.repo.audit.AuditMethodInterceptor.invoke(AuditMethodInterceptor.java:137)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:107)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at $Proxy37.executeAction(Unknown Source)
   at org.alfresco.repo.rule.RuleServiceImpl.executeRule(RuleServiceImpl.java:1165)
   at org.alfresco.repo.rule.RuleServiceImpl.executePendingRule(RuleServiceImpl.java:1133)
   at org.alfresco.repo.rule.RuleServiceImpl.executePendingRulesImpl(RuleServiceImpl.java:1092)
   at org.alfresco.repo.rule.RuleServiceImpl.executePendingRules(RuleServiceImpl.java:1065)
   at org.alfresco.repo.rule.RuleTransactionListener.beforeCommit(RuleTransactionListener.java:57)
   at org.alfresco.repo.transaction.AlfrescoTransactionSupport$TransactionSynchronizationImpl.doBeforeCommit(AlfrescoTransactionSupport.java:732)
   at org.alfresco.repo.transaction.AlfrescoTransactionSupport$TransactionSynchronizationImpl.doBeforeCommit(AlfrescoTransactionSupport.java:712)
   at org.alfresco.repo.transaction.AlfrescoTransactionSupport$TransactionSynchronizationImpl.beforeCommit(AlfrescoTransactionSupport.java:672)
   at org.springframework.transaction.support.TransactionSynchronizationUtils.triggerBeforeCommit(TransactionSynchronizationUtils.java:95)
   at org.springframework.transaction.support.AbstractPlatformTransactionManager.triggerBeforeCommit(AbstractPlatformTransactionManager.java:927)
   at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:737)
   at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:723)
   at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:394)
   at org.alfresco.util.transaction.SpringAwareUserTransaction.commit(SpringAwareUserTransaction.java:472)
   at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:403)
   at org.alfresco.web.bean.dialog.BaseDialogBean.finish(BaseDialogBean.java:124)
   at org.alfresco.web.bean.dialog.DialogManager.finish(DialogManager.java:528)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.myfaces.el.MethodBindingImpl.invoke(MethodBindingImpl.java:132)
   at org.apache.myfaces.application.ActionListenerImpl.processAction(ActionListenerImpl.java:61)
   at javax.faces.component.UICommand.broadcast(UICommand.java:109)
   at javax.faces.component.UIViewRoot._broadcastForPhase(UIViewRoot.java:97)
   at javax.faces.component.UIViewRoot.processApplication(UIViewRoot.java:171)
   at org.apache.myfaces.lifecycle.InvokeApplicationExecutor.execute(InvokeApplicationExecutor.java:32)
   at org.apache.myfaces.lifecycle.LifecycleImpl.executePhase(LifecycleImpl.java:95)
   at org.apache.myfaces.lifecycle.LifecycleImpl.execute(LifecycleImpl.java:70)
   at javax.faces.webapp.FacesServlet.service(FacesServlet.java:139)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.alfresco.web.app.servlet.AuthenticationFilter.doFilter(AuthenticationFilter.java:104)
   at sun.reflect.GeneratedMethodAccessor601.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.alfresco.repo.management.subsystems.ChainingSubsystemProxyFactory$1.invoke(ChainingSubsystemProxyFactory.java:116)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at $Proxy239.doFilter(Unknown Source)
   at org.alfresco.repo.web.filter.beans.BeanProxyFilter.doFilter(BeanProxyFilter.java:82)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.alfresco.repo.web.filter.beans.NullFilter.doFilter(NullFilter.java:68)
   at sun.reflect.GeneratedMethodAccessor601.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.alfresco.repo.management.subsystems.ChainingSubsystemProxyFactory$1.invoke(ChainingSubsystemProxyFactory.java:116)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at $Proxy239.doFilter(Unknown Source)
   at org.alfresco.repo.web.filter.beans.BeanProxyFilter.doFilter(BeanProxyFilter.java:82)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(GlobalLocalizationFilter.java:58)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
   at java.lang.Thread.run(Thread.java:619)

What did I do wrong?
And do I have any other options for including a transformation implemented as a shell script in Alfresco?
Thanks for any answers.
8 REPLIES

mrogers
Star Contributor
It doesn't look like your transformer has been loaded.

What's the name of your file in tomcat/shared/classes/alfresco/extension ?

Is your shared class loader configured in tomcat?  A crude simple test is to just create a deliberate syntax error in your file and restart - if alfresco fails to start or throws an ugly exception then your file is being found.

leftright
Champ on-the-rise
The name of the file is ocr-peti-transformers-context.xml.
If I write some garbage in the file, Alfresco no longer starts.
So does this mean that the file is being loaded?

mrogers
Star Contributor
Yes - the file is being loaded.

leftright
Champ on-the-rise
I have found the problem.

Instead of

<config evaluator="string-compare" condition="Action Wizards">
   <image-transformers>
      <transformer name="image/tiff"/>
   </image-transformers>
</config>

I used

<config evaluator="string-compare" condition="Action Wizards">
   <transformers>
      <transformer name="image/tiff"/>
   </transformers>
</config>

That is, I replaced <image-transformers> with <transformers>,
and the transformation can now proceed.


NOTE TO ADMIN: since I will probably encounter more problems while developing the transformer, I would rather not mark this topic as SOLVED
and instead reuse it when I encounter new problems. Is that OK?   Edit - Admin - yes it is, glad you got it working

leftright
Champ on-the-rise
Now I have tried to implement the whole transformation process, from scanned PDF to PDF with text.
Because both the source and the target type are PDF, I decided to split the transformation into two parts:
the first from PDF to TIFF, the second from TIFF to PDF. The TIFF to PDF part gets registered in Alfresco,
but the PDF to TIFF part does not. When I look at the list of mimetype transformations in Alfresco, I see that the TIFF to PDF
transformation goes via ProxyContentTransformer, while PDF to TIFF goes via image/png (the default PDF to TIFF transformation in Alfresco).
So is the problem the already existing transformation in Alfresco?
This is my transformer for PDF to TIFF:


<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>

<beans>
   <bean id="transformer.worker.img2ocrpdf" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformerWorker">

      <property name="mimetypeService">
         <ref bean="mimetypeService" />
      </property>
      <property name="checkCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
                <map>
                    <entry key=".*">
                        <value>chmod 755 /home/grega/ocr/pdf2tiff.sh</value>
                    </entry>
                </map>
            </property>
            <property name="errorCodes">
               <value>2</value>
            </property>
         </bean>
      </property>
      <property name="transformCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
                <map>
                    <entry key=".*">
                        <value>/home/grega/ocr/pdf2tiff.sh ${source} ${target}</value>
                    </entry>
                </map>
            </property>
            <property name="errorCodes">
               <value>2</value>
            </property>
         </bean>
      </property>
      <property name="explicitTransformations">
         <list>
            <bean class="org.alfresco.repo.content.transform.ExplictTransformationDetails">
               <property name="sourceMimetype">
                  <value>application/pdf</value>
               </property>
               <property name="targetMimetype">
                  <value>image/tiff</value>
               </property>
            </bean>
         </list>
      </property>
   </bean>
   <bean id="transformer.img2ocrpdf" class="org.alfresco.repo.content.transform.ProxyContentTransformer" parent="baseContentTransformer">
      <property name="worker">
         <ref bean="transformer.worker.img2ocrpdf" />
      </property>
   </bean>
</beans>
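
The transformCommand above passes ${source} and ${target} to the script as two positional arguments. The script itself is not shown in this thread, so the following is only a minimal sketch of what such a pdf2tiff.sh might look like, assuming ImageMagick's convert is on the PATH. The 300 dpi density and LZW compression are illustrative choices, not the poster's settings. It returns 2 on any failure because 2 is the only value listed under errorCodes in the configuration above, which, as far as I understand RuntimeExec, is the list of exit codes treated as errors.

```shell
#!/bin/sh
# Hypothetical pdf2tiff.sh: the original script is not shown in this thread.
# Usage: pdf2tiff.sh SOURCE_PDF TARGET_TIFF

pdf2tiff() {
    # Expect exactly two arguments: the source PDF and the target file
    if [ "$#" -ne 2 ]; then
        echo "usage: pdf2tiff.sh SOURCE_PDF TARGET_TIFF" >&2
        return 2
    fi
    # Fail early with a readable message if the source cannot be read
    if [ ! -r "$1" ]; then
        echo "cannot read source file: $1" >&2
        return 2
    fi
    # -density 300 rasterizes the PDF at 300 dpi (an assumed value);
    # the tiff: prefix forces TIFF output whatever the target's extension is.
    # Any convert failure is mapped to exit status 2.
    convert -density 300 "$1" -compress lzw "tiff:$2" || return 2
}

# Run only when arguments are given, so the file can also be sourced
if [ "$#" -gt 0 ]; then
    pdf2tiff "$@"
    exit $?
fi
```

Called by hand, this would be run the same way the transformer runs it: /home/grega/ocr/pdf2tiff.sh input.pdf output.tiff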

leftright
Champ on-the-rise
I have figured out why the pdf2tiff transformation wasn't picked but the tiff2pdf one was.
If you implement a transformer for a source and a target, and a transformation between that source and target already exists in Alfresco,
then you have to disable the existing transformation.
This tutorial shows how to do it:
http://blog.metasys.pl/2011/01/speeding-up-pdf-indexing-in-alfresco-3-3/

Now I have a new problem. The pdf2tiff conversion gets picked, but it doesn't work.
The transformer calls the external script, and I have found that the problem is the convert command (ImageMagick)
in the script. If I call the script from a terminal (Debian), the script works and so does the convert command, but the convert
command doesn't work when called from Alfresco. The script uses an external ImageMagick; my Alfresco also uses an external
ImageMagick, which as far as I can see works fine (I can preview pictures in Share and zoom in on them).

Any suggestions on what is wrong?
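
When a script works from a terminal but not from Alfresco, one possible cause is the environment the Tomcat process runs in (user, PATH, working directory). A way to check is a small wrapper that records that environment together with the script's output. This is a minimal sketch, not a known fix for the problem above; the function names and the /tmp/ocr-transform-debug.log location are made up for illustration.

```shell
#!/bin/sh
# Hypothetical debugging wrapper around the transformer script.
# LOG location is an assumption; override it with OCR_DEBUG_LOG.
LOG="${OCR_DEBUG_LOG:-/tmp/ocr-transform-debug.log}"

# Append the runtime environment of the calling process to the log
log_env() {
    {
        echo "--- $(date) ---"
        echo "user: $(id -un)"
        echo "cwd: $(pwd)"
        echo "PATH: $PATH"
        echo "convert: $(command -v convert || echo 'not found on PATH')"
    } >> "$LOG"
}

# Run a command, append its stdout and stderr to the log,
# and return the command's own exit status
run_logged() {
    "$@" >> "$LOG" 2>&1
}

# With two arguments (source and target), wrap the real script
if [ "$#" -eq 2 ]; then
    log_env
    run_logged /home/grega/ocr/pdf2tiff.sh "$1" "$2"
    exit $?
fi
```

Pointing the transformCommand at this wrapper instead of pdf2tiff.sh, then comparing the logged user and PATH with those of the terminal session, should show whether convert is even found and what it prints when it fails.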

Hi

Let me ask a related question, as I am trying to achieve something similar: a TIFF to OCR PDF transformation with the ABBYY OCR CLI for Linux.

We have an Alfresco 4.2.f CE instance and ABBYY OCR4Linux CLI 9 installed on our Ubuntu machine. We have checked that ABBYY OCR works fine, but we are not able to run the transformation in Alfresco.

First of all, we have included the following in our web-client-config-custom.xml:


<!-- Add the tiff mime type to the list of supported transformations -->
    <config evaluator="string-compare" condition="Action Wizards">
        <transformers>
            <transformer name="image/tiff"/>
        </transformers>
    </config>

and declared our transformation in tomcat/shared/classes/alfresco/extension


<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>

<beans>
    <bean id="transformer.tiff2pdf" class="org.alfresco.repo.content.transform.ProxyContentTransformer" parent="baseContentTransformer">
        <property name="worker">
            <ref bean="transformer.worker.tiff2pdf" />
        </property>
    </bean>


    <bean id="transformer.worker.tiff2pdf" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformerWorker">
        <property name="mimetypeService">
            <ref bean="mimetypeService" />
        </property>
        <property name="checkCommand">
            <bean class="org.alfresco.util.exec.RuntimeExec">
                <property name="commandMap">
                    <map>
                        <entry key=".*">
                            <list>
                                <value>abbyyocr9</value>
                            </list>
                        </entry>
                    </map>
                </property>
            </bean>
        </property>

        <property name="transformCommand">
            <bean class="org.alfresco.util.exec.RuntimeExec">
                <property name="commandMap">
                    <map>
                        <entry key=".*">
                            <list>
                                <value>
                                    abbyyocr9 -rl Spanish -fm -rdss -afoe -if ${source} -f PDF -pfpf LZWGray -pem ImageOnText -pfpr 300 -prl -o$
                                </value>
                            </list>
                        </entry>
                    </map>
                </property>
                <property name="errorCodes">
                    <value>1,2</value>
                </property>
            </bean>
        </property>
        <property name="explicitTransformations">
            <list>
                <bean class="org.alfresco.repo.content.transform.ExplictTransformationDetails">
                    <property name="sourceMimetype"><value>image/tiff</value></property>
                    <property name="targetMimetype"><value>application/pdf</value></property>
                </bean>
            </list>
        </property>
    </bean>
</beans>




Also, http://localhost:8080/alfresco/service/mimetypes?mimetype=image/tiff#image/tiff shows the expected transformation details:



image/tiff - tiff
Extractors: org.alfresco.repo.content.metadata.TikaAutoMetadataExtracter
Transformable To:

    application/eps = Proxy via: com.sun.proxy.$Proxy15(Version: ImageMagick 6.8.6-6 2013-07-24 Q16 http://www.imagemagick.org Copyright: Copyright (C) 1999-2013 ImageMagick Studio LLC Features: DPC Modules Delegates: freetype jng jpeg png ps tiff wmf zlib)
    application/pdf = Proxy via: org.alfresco.repo.content.transform.RuntimeExecutableContentTransformerWorker(ABBYY FineReader Engine 9.0 Sample © ABBYY. 2010.


However, when we launch this transformation, we get a PDF that contains only the TIFF image, with no OCR text, which makes me think that Alfresco is not calling our transformer to create this PDF.

I have also followed a suggestion I found in this module to disable ImageMagick by doing this:

# mv /opt/alfresco-4.2.c/common/bin/.convert.bin /opt/alfresco-4.2.c/common/bin/.convert.old

but if I do so I get this error message:


org.alfresco.service.cmr.repository.ContentIOException: 02220014 Content conversion failed:
   reader: ContentAccessor[ contentUrl=store://2016/3/22/19/8/c645e8f5-ef22-43a1-b593-eef80fb702b3.bin, mimetype=image/tiff, size=113244, encoding=UTF-8, locale=es_CL]
   writer: ContentAccessor[ contentUrl=store://2016/3/22/19/8/4cf527dd-d17f-41ef-843f-262b9c744357.bin, mimetype=application/pdf, size=0, encoding=UTF-8, locale=es_CL]
   options: {targetContentProperty={http://www.alfresco.org/model/content/1.0}name, contentReaderNodeRef=workspace://SpacesStore/16875bd0-0039-48a0-a670-8d3b87e21fc3, contentWriterNodeRef=workspace://SpacesStore/de83616b-0433-420c-84e2-e18c731aae1b, sourceContentProperty={http://www.alfresco.org/model/content/1.0}name, use=null, includeEmbedded=false}
   limits:



Could you please tell me exactly which steps you followed to get your transformation working?

Thanks a lot

vta
Champ in-the-making
Hi LeftRight,

I'm very interested in your script to transform a scanned PDF to a PDF with computer text.

Could you share it, please?

Thanks in advance