Transformation from PCL to PDF

domma
Champ in-the-making
Hi, I am trying to deploy a transformer that converts PCL files to PDF in the new version 1.3 of Alfresco. To do that, I created all my custom files in my tomcat/shared/extension directory.
When I debug the application, my transformation appears to execute correctly.
But right after the custom transformation runs, I get this exception:

16:58:45,922 ERROR [content.transform.AbstractContentTransformer] Content writer not closed by transformer: 
   writer: ContentAccessor[ contentUrl=store://2006/6/12/16/f0dee67e-fa23-11da-aec2-b965a7a2b11e.bin, mimetype=text/plain, size=0, encoding=UTF-8]
   transformer: PdfBoxContentTransformer[ average=10000ms]
My rule says to transform any input file with the PCL mimetype to PDF, so I don't see why the generic PdfBoxContentTransformer gets invoked, since that one only transforms text to PDF!
Here are my custom files:
mimetype-custom-extensions.xml
<alfresco-config area="mimetype-map">

   <config evaluator="string-compare" condition="Mimetype Map">
      <mimetypes>
         <mimetype mimetype="application/vnd.hp-pcl" display="PCL">
            <extension default="true">pcl</extension>
         </mimetype>
      </mimetypes>
   </config>
</alfresco-config>

transformers-custom-context.xml

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>

<beans>

   <bean id="transformer.pcl2pdf" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformer" parent="baseContentTransformer">
      <property name="explicitTransformations">
         <list>
            <bean class="org.alfresco.repo.content.transform.ContentTransformerRegistry$TransformationKey">
               <constructor-arg><value>application/vnd.hp-pcl</value></constructor-arg>
               <constructor-arg><value>application/pdf</value></constructor-arg>
            </bean>
         </list>
      </property>
   
      <property name="checkCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
                <map>
                    <entry key=".*">
                        <value>pcl2pdf32.exe</value>
                    </entry>
                </map>
            </property>
            <property name="errorCodes">
               <value>2</value>
            </property>
         </bean>
      </property>
    
      <property name="transformCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="errorCodes">
               <value>2</value>
            </property>
            <property name="commandMap">
                <map>
                    <entry key="Linux">
                        <value>pcl2pdf32.exe '${target}' '${source}'</value>
                    </entry>
                    <entry key="Windows.*">
                        <value>pcl2pdf32.exe "${target}" "${source}"</value>
                    </entry>
                </map>
            </property>
         </bean>
      </property>
   </bean>

</beans>

Could you help me with this one?

Cheers,

Manuela
1 REPLY

derek
Star Contributor
Hi, Manuela

What you are seeing is bug http://www.alfresco.org/jira/browse/AR-457.
In actual fact, the problem is that the PDF document could not be transformed to text for indexing purposes. There might be some incompatibility between the PDF generated and what PDFBox can read. The way to check this is to perform a few conversions using your custom transformer and then search for "nitf" (Not Indexed Transformation Failed) in the UI. You should see the converted documents show up in the search. So the real problem is that the converted documents cannot be indexed. This will not prevent the documents from being stored in the repository.

If you have a tool to convert PDF to Text, then you might try a similar approach with that.
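
As a rough illustration of that similar approach, a PDF-to-text transformer could be declared the same way as the PCL one above. This is only a sketch: the bean id transformer.pdf2text, the executable name pdf2text-tool, its argument order, and the error code are all placeholders made up for illustration, not a real tool or a tested configuration. Whether the registry would actually pick it ahead of PdfBoxContentTransformer is also something that would need to be checked.

```xml
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>

<beans>

   <!-- Hypothetical bean id; the structure mirrors transformer.pcl2pdf -->
   <bean id="transformer.pdf2text" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformer" parent="baseContentTransformer">
      <!-- Declare the PDF to plain text transformation explicitly -->
      <property name="explicitTransformations">
         <list>
            <bean class="org.alfresco.repo.content.transform.ContentTransformerRegistry$TransformationKey">
               <constructor-arg><value>application/pdf</value></constructor-arg>
               <constructor-arg><value>text/plain</value></constructor-arg>
            </bean>
         </list>
      </property>

      <!-- Used to check that the executable is available -->
      <property name="checkCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
               <map>
                  <entry key=".*">
                     <!-- Placeholder executable name: substitute your own tool -->
                     <value>pdf2text-tool</value>
                  </entry>
               </map>
            </property>
            <!-- Assumed error code, copied from the PCL example; adjust for your tool -->
            <property name="errorCodes">
               <value>2</value>
            </property>
         </bean>
      </property>

      <!-- Runs the conversion; the argument order here is an assumption -->
      <property name="transformCommand">
         <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandMap">
               <map>
                  <entry key=".*">
                     <value>pdf2text-tool "${source}" "${target}"</value>
                  </entry>
               </map>
            </property>
            <property name="errorCodes">
               <value>2</value>
            </property>
         </bean>
      </property>
   </bean>

</beans>
```

After adding something like this and restarting, repeating the "nitf" search on freshly converted documents would show whether they are now being indexed.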

Regards