Using POI transformer to transform excel documents

darkstar1
Confirmed Champ
I am trying to use the POI transformer to transform Excel to HTML (XLSX => HTML).
I have the following:

   In <Project-home>/src/main/amp/config/alfresco/extension/subsystems/Transformers/default/default/transformers.properties
   content.transformer.Poi.priority=70
   content.transformer.Poi.extensions.xlsx.html.supported=true


I also set the following log4j properties:
log4j.logger.org.alfresco.repo.content.transform.TransformerDebug=TRACE
log4j.logger.org.alfresco.util.exec.RuntimeExec=TRACE

But the POI transformer doesn't appear to be called for the HTML transformation. In fact, none seems to be called at all, according to this log:

2016-01-18 08:30:01,216  INFO  [management.subsystems.ChildApplicationContextFactory] [http-bio-8443-exec-2] Startup of 'Transformers' subsystem, ID: [Transformers, default] complete
2016-01-18 08:30:01,254  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0             store://2016/1/18/7/53/aa506647-1641-43ca-b9fe-9c6a91f38901.bin
2016-01-18 08:30:01,264  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/plain
2016-01-18 08:30:01,264  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0             xlsx txt  image in excel.xlsx 155.9 KB – index – SolrIndexer
2016-01-18 08:30:01,265  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0             **a) [120] TikaAuto           0 ms
2016-01-18 08:30:01,265  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0               b) [130] OOXML              0 ms
2016-01-18 08:30:01,266  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0.1           store://2016/1/18/7/53/aa506647-1641-43ca-b9fe-9c6a91f38901.bin
2016-01-18 08:30:01,266  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0.1           application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/plain
2016-01-18 08:30:01,266  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0.1           xlsx txt  <<TemporaryFile>> 155.9 KB TikaAuto
2016-01-18 08:30:01,577  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0.1           Finished in 310 ms
2016-01-18 08:30:01,580  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-2] 0             Finished in 1.378 ms

2016-01-18 08:30:02,318  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-3] 1             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:02,318  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-3] 1             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
2016-01-18 08:30:02,318  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-3] 1             xlsx png  image in excel.xlsx 171.7 KB – doclib – ContentService.getTransformer(…)
2016-01-18 08:30:02,318  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-3] 1             **a) [250] complex.OpenOffice.Image<<Complex>>           0 ms
2016-01-18 08:30:02,320  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-3] 1               b) [400] complex.Any.Image<<Complex>>                  0 ms
2016-01-18 08:30:02,320  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-3] 1             Finished in 22 ms Transformer NOT called

2016-01-18 08:30:02,343  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 2             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:02,344  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 2             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
2016-01-18 08:30:02,344  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 2             xlsx png  image in excel.xlsx 171.7 KB – doclib – ContentService.getTransformer(…)
2016-01-18 08:30:02,345  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 2             **a) [250] complex.OpenOffice.Image<<Complex>>           0 ms
2016-01-18 08:30:02,352  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 2               b) [400] complex.Any.Image<<Complex>>                  0 ms
2016-01-18 08:30:02,352  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 2             Finished in 11 ms Transformer NOT called

2016-01-18 08:30:02,368  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 3             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:02,368  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 3             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
2016-01-18 08:30:02,368  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 3             xlsx png  image in excel.xlsx 171.7 KB – doclib – ContentService.getTransformer(…)
2016-01-18 08:30:02,368  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 3             **a) [250] complex.OpenOffice.Image<<Complex>>           0 ms
2016-01-18 08:30:02,370  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 3               b) [400] complex.Any.Image<<Complex>>                  0 ms
2016-01-18 08:30:02,370  TRACE [content.transform.TransformerDebug] [defaultAsyncAction1] 3             Finished in 11 ms Transformer NOT called

2016-01-18 08:30:02,397  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:02,401  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
2016-01-18 08:30:02,401  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4             xlsx png  image in excel.xlsx 171.7 KB – doclib – ContentService.transform(…)
2016-01-18 08:30:02,401  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4             **a) [250] complex.OpenOffice.Image<<Complex>>           0 ms
2016-01-18 08:30:02,402  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4               b) [400] complex.Any.Image<<Complex>>                  0 ms
2016-01-18 08:30:02,403  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1           store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:02,403  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1           application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
2016-01-18 08:30:02,403  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4.1           xlsx png  image in excel.xlsx 171.7 KB complex.OpenOffice.Image<<Complex>>
2016-01-18 08:30:02,403  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.1         store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:02,403  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.1         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/pdf
2016-01-18 08:30:02,403  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.1         xlsx pdf  image in excel.xlsx 171.7 KB OpenOffice.2Pdf<<FailoverComponent>>
2016-01-18 08:30:02,406  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.1.1       store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:02,406  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.1.1       application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/pdf
2016-01-18 08:30:02,407  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.1.1       xlsx pdf  image in excel.xlsx 171.7 KB OpenOffice<<Proxy>>
2016-01-18 08:30:02,986  INFO  [security.sync.ChainingUserRegistrySynchronizer] [DefaultScheduler_Worker-5] Retrieving users changed since 18-01-2016 08:25:02 from user registry 'ldap1'
2016-01-18 08:30:03,350  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.1.1       Finished in 944 ms
2016-01-18 08:30:03,351  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.1         Finished in 948 ms
2016-01-18 08:30:03,351  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.2         store:///opt/alfresco/tomcat/temp/Alfresco/ComplextTransformer_intermediate_xlsx_6113403796769600068.pdf
2016-01-18 08:30:03,351  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.2         application/pdf image/png
2016-01-18 08:30:03,352  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.2         pdf  png  <<TemporaryFile>> 108 KB complex.PDF.Image<<Failover>>
2016-01-18 08:30:03,352  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.2.1       store:///opt/alfresco/tomcat/temp/Alfresco/ComplextTransformer_intermediate_xlsx_6113403796769600068.pdf
2016-01-18 08:30:03,352  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.2.1       application/pdf image/png
2016-01-18 08:30:03,352  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.2.1       pdf  png  <<TemporaryFile>> 108 KB ImageMagick<<Proxy>>
2016-01-18 08:30:03,958  DEBUG [util.exec.RuntimeExec] [pool-14-thread-1] Execution result:
   os:         Linux
   command:    /opt/alfresco/common/bin/convert /opt/alfresco/tomcat/temp/Alfresco/ImageMagickContentTransformerWorker_source_3406876904443566115.pdf[0] -auto-orient -resize "100x100>" /opt/alfresco/tomcat/temp/Alfresco/ImageMagickContentTransformerWorker_target_201353815630224920.png
   succeeded:  true
   exit code:  0
   out:
   err:
2016-01-18 08:30:03,967  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.2.1       Finished in 614 ms
2016-01-18 08:30:03,967  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1.2         Finished in 616 ms
2016-01-18 08:30:03,968  TRACE [content.transform.TransformerDebug] [pool-14-thread-1] 4.1           Finished in 1.564 ms
2016-01-18 08:30:03,968  DEBUG [content.transform.TransformerDebug] [pool-14-thread-1] 4             Finished in 1.575 ms


2016-01-18 08:30:30,070  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:30,070  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/plain
2016-01-18 08:30:30,071  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5             xlsx txt  image in excel.xlsx 171.7 KB – index – SolrIndexer
2016-01-18 08:30:30,072  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5             **a) [120] TikaAuto           310 ms
2016-01-18 08:30:30,072  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5               b) [130] OOXML              0 ms
2016-01-18 08:30:30,073  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5.1           store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
2016-01-18 08:30:30,073  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5.1           application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/plain
2016-01-18 08:30:30,074  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5.1           xlsx txt  <<TemporaryFile>> 171.7 KB TikaAuto
2016-01-18 08:30:30,269  TRACE [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5.1           Finished in 195 ms
2016-01-18 08:30:30,277  DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-9] 5             Finished in 216 ms

================================= Calling document-details ===========================
  2016-01-18 08:33:02,073  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 6             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
  2016-01-18 08:33:02,074  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 6             xlsx png  – avatar32 – ContentService.getMaxSourceSizeBytes() = unlimited
  2016-01-18 08:33:02,075  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 6             **a) [250] complex.OpenOffice.Image<<Complex>>           1.564 ms
  2016-01-18 08:33:02,084  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 6               b) [400] complex.Any.Image<<Complex>>                  0 ms
  2016-01-18 08:33:02,084  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 6             Finished in 25 ms Transformer NOT called

  2016-01-18 08:33:02,097  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 7             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
  2016-01-18 08:33:02,098  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 7             xlsx png  – doclib – ContentService.getMaxSourceSizeBytes() = unlimited
  2016-01-18 08:33:02,098  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 7             **a) [250] complex.OpenOffice.Image<<Complex>>           1.564 ms
  2016-01-18 08:33:02,102  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 7               b) [400] complex.Any.Image<<Complex>>                  0 ms
  2016-01-18 08:33:02,102  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 7             Finished in 10 ms Transformer NOT called

  2016-01-18 08:33:02,125  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 8             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/pdf
  2016-01-18 08:33:02,125  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 8             xlsx pdf  – pdf – ContentService.getMaxSourceSizeBytes() = 1.5 MB
  2016-01-18 08:33:02,132  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 8             **a) [110] OpenOffice<<Proxy>> < 1.5 MB  944 ms
  2016-01-18 08:33:02,133  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 8             Finished in 29 ms Transformer NOT called

  2016-01-18 08:33:02,136  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 9             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/x-shockwave-flash
  2016-01-18 08:33:02,136  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 9             xlsx swf  – webpreview – ContentService.getMaxSourceSizeBytes() = 1 MB
  2016-01-18 08:33:02,137  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 9             **a) [150] complex.OpenOffice.Pdf2swf<<Complex>> < 1 MB    0 ms
  2016-01-18 08:33:02,144  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 9             Finished in 10 ms Transformer NOT called

  2016-01-18 08:33:02,147  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 10             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/html
  2016-01-18 08:33:02,147  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 10             xlsx html ContentService.getMaxSourceSizeBytes() = unlimited
  2016-01-18 08:33:02,148  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 10             **a)  [70] Poi                                            0 ms
  2016-01-18 08:33:02,148  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 10               b) [110] OpenOffice<<Proxy>>                            0 ms
  2016-01-18 08:33:02,149  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 10               c) [120] TikaAuto                                       0 ms
  2016-01-18 08:33:02,155  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 10               d) [130] OOXML                                          0 ms
  2016-01-18 08:33:02,155  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 10               e) [150] complex.OpenOffice.PdfBox<<Complex>>           0 ms
  2016-01-18 08:33:02,156  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 10             Finished in 11 ms Transformer NOT called

  2016-01-18 08:33:02,169  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 11             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
  2016-01-18 08:33:02,170  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 11             xlsx png  – avatar – ContentService.getMaxSourceSizeBytes() = unlimited
  2016-01-18 08:33:02,170  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 11             **a) [250] complex.OpenOffice.Image<<Complex>>           1.564 ms
  2016-01-18 08:33:02,173  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 11               b) [400] complex.Any.Image<<Complex>>                  0 ms
  2016-01-18 08:33:02,173  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 11             Finished in 17 ms Transformer NOT called

  2016-01-18 08:33:02,184  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 12             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/jpeg
  2016-01-18 08:33:02,184  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 12             xlsx jpg  – medium – ContentService.getMaxSourceSizeBytes() = unlimited
  2016-01-18 08:33:02,185  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 12             **a) [100] OOXMLThumbnail                                0 ms
  2016-01-18 08:33:02,186  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 12               b) [250] complex.OpenOffice.Image<<Complex>>           0 ms
  2016-01-18 08:33:02,194  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 12               c) [400] complex.Any.Image<<Complex>>                  0 ms
  2016-01-18 08:33:02,200  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 12             Finished in 21 ms Transformer NOT called

  2016-01-18 08:33:02,205  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 13             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/jpeg
  2016-01-18 08:33:02,206  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 13             xlsx jpg  – imgpreview – ContentService.getMaxSourceSizeBytes() = unlimited
  2016-01-18 08:33:02,212  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 13             **a) [100] OOXMLThumbnail                                0 ms
  2016-01-18 08:33:02,212  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 13               b) [250] complex.OpenOffice.Image<<Complex>>           0 ms
  2016-01-18 08:33:02,215  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 13               c) [400] complex.Any.Image<<Complex>>                  0 ms
  2016-01-18 08:33:02,215  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 13             Finished in 14 ms Transformer NOT called

  2016-01-18 08:33:05,074  INFO  [web.scripts.MimetypesQuery] [ajp-apr-8009-exec-5] Successfully retrieved mimetypes information from Alfresco.
  2016-01-18 08:33:05,109  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 14             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
  2016-01-18 08:33:05,110  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 14             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/pdf
  2016-01-18 08:33:05,110  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 14             xlsx pdf  image in excel.xlsx 171.7 KB – pdf – ContentService.getTransformer(…)
  2016-01-18 08:33:05,113  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 14             **a) [110] OpenOffice<<Proxy>> < 1.5 MB  944 ms
  2016-01-18 08:33:05,113  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 14             Finished in 8 ms Transformer NOT called

  2016-01-18 08:33:05,217  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 15             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
  2016-01-18 08:33:05,217  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 15             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/pdf
  2016-01-18 08:33:05,218  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 15             xlsx pdf  image in excel.xlsx 171.7 KB – pdf – ContentService.getTransformer(…)
  2016-01-18 08:33:05,218  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 15             **a) [110] OpenOffice<<Proxy>> < 1.5 MB  944 ms
  2016-01-18 08:33:05,218  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-10] 15             Finished in 2 ms Transformer NOT called

  2016-01-18 08:33:05,221  TRACE [content.transform.TransformerDebug] [pool-13-thread-1] 16             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
  2016-01-18 08:33:05,221  TRACE [content.transform.TransformerDebug] [pool-13-thread-1] 16             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/pdf
  2016-01-18 08:33:05,221  DEBUG [content.transform.TransformerDebug] [pool-13-thread-1] 16             xlsx pdf  image in excel.xlsx 171.7 KB – pdf – ContentService.transform(…)
  2016-01-18 08:33:05,221  DEBUG [content.transform.TransformerDebug] [pool-13-thread-1] 16             **a) [110] OpenOffice<<Proxy>> < 1.5 MB  944 ms
  2016-01-18 08:33:05,222  TRACE [content.transform.TransformerDebug] [pool-13-thread-1] 16.1           store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
  2016-01-18 08:33:05,222  TRACE [content.transform.TransformerDebug] [pool-13-thread-1] 16.1           application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/pdf
  2016-01-18 08:33:05,222  DEBUG [content.transform.TransformerDebug] [pool-13-thread-1] 16.1           xlsx pdf  image in excel.xlsx 171.7 KB OpenOffice<<Proxy>>
  2016-01-18 08:33:05,498  TRACE [content.transform.TransformerDebug] [pool-13-thread-1] 16.1           Finished in 276 ms
  2016-01-18 08:33:05,498  DEBUG [content.transform.TransformerDebug] [pool-13-thread-1] 16             Finished in 278 ms


================================= Calling iframe-preview ===========================
2016-01-18 08:35:53,888  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
  2016-01-18 08:35:53,888  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/html
  2016-01-18 08:35:53,888  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17             xlsx html image in excel.xlsx 171.7 KB ContentService.getTransformer(…)
  2016-01-18 08:35:53,889  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17             **a)  [70] Poi                                            0 ms
  2016-01-18 08:35:53,890  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17               b) [110] OpenOffice<<Proxy>>                            0 ms
  2016-01-18 08:35:53,890  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17               c) [120] TikaAuto                                       0 ms
  2016-01-18 08:35:53,890  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17               d) [130] OOXML                                          0 ms
  2016-01-18 08:35:53,891  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17               e) [150] complex.OpenOffice.PdfBox<<Complex>>           0 ms
  2016-01-18 08:35:53,892  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17             Finished in 6 ms Transformer NOT called

  2016-01-18 08:35:54,075  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
  2016-01-18 08:35:54,076  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/html
  2016-01-18 08:35:54,076  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18             xlsx html image in excel.xlsx 171.7 KB ContentService.getTransformer(…)
  2016-01-18 08:35:54,077  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18             **a)  [70] Poi                                            0 ms
  2016-01-18 08:35:54,077  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18               b) [110] OpenOffice<<Proxy>>                            0 ms
  2016-01-18 08:35:54,077  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18               c) [120] TikaAuto                                       0 ms
  2016-01-18 08:35:54,078  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18               d) [130] OOXML                                          0 ms
  2016-01-18 08:35:54,078  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18               e) [150] complex.OpenOffice.PdfBox<<Complex>>           0 ms
  2016-01-18 08:35:54,079  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 18             Finished in 5 ms Transformer NOT called

  2016-01-18 08:35:54,085  TRACE [content.transform.TransformerDebug] [pool-13-thread-2] 19             store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
  2016-01-18 08:35:54,086  TRACE [content.transform.TransformerDebug] [pool-13-thread-2] 19             application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/html
  2016-01-18 08:35:54,086  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19             xlsx html image in excel.xlsx 171.7 KB ContentService.transform(…)
  2016-01-18 08:35:54,087  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19             **a)  [70] Poi                                            0 ms
  2016-01-18 08:35:54,087  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19               b) [110] OpenOffice<<Proxy>>                            0 ms
  2016-01-18 08:35:54,088  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19               c) [120] TikaAuto                                       0 ms
  2016-01-18 08:35:54,088  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19               d) [130] OOXML                                          0 ms
  2016-01-18 08:35:54,089  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19               e) [150] complex.OpenOffice.PdfBox<<Complex>>           0 ms
  2016-01-18 08:35:54,089  TRACE [content.transform.TransformerDebug] [pool-13-thread-2] 19.1           store://2016/1/18/8/29/2fce7d8e-8470-4eb3-975c-d41132b05757.bin
  2016-01-18 08:35:54,089  TRACE [content.transform.TransformerDebug] [pool-13-thread-2] 19.1           application/vnd.openxmlformats-officedocument.spreadsheetml.sheet text/html
  2016-01-18 08:35:54,090  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19.1           xlsx html image in excel.xlsx 171.7 KB Poi
  2016-01-18 08:35:54,281  TRACE [content.transform.TransformerDebug] [pool-13-thread-2] 19.1           Finished in 191 ms
  2016-01-18 08:35:54,282  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19             Finished in 198 ms
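A small sketch that may help when reading a TransformerDebug log like the one above. Requests ending in "Transformer NOT called" only list candidate transformers, while nested entries such as 19.1 name the transformer that was actually run (Poi, in that case). The code below collects those nested entries per request. It assumes the line layout shown in this log; `TransformerDebugReader` and `invoked` are hypothetical names, not part of Alfresco.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransformerDebugReader {
    // Matches nested step lines only, e.g. "19.1   xlsx html <description> Poi".
    // Assumption: the last whitespace-separated token is the transformer name.
    private static final Pattern STEP = Pattern.compile(
        "\\[content\\.transform\\.TransformerDebug\\] \\[[^\\]]+\\] "
        + "(\\d+(?:\\.\\d+)+)\\s+[a-z0-9]+\\s+[a-z0-9]+\\s+(?:.*\\s)?(\\S+)$");

    /** Maps each top-level request number to the transformers run in its nested steps. */
    static Map<String, List<String>> invoked(List<String> lines) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = STEP.matcher(line.trim());
            if (!m.find()) {
                continue; // store/mimetype/"Finished" lines and top-level candidate lists
            }
            String id = m.group(1);                           // e.g. "19.1"
            String request = id.substring(0, id.indexOf('.')); // e.g. "19"
            result.computeIfAbsent(request, k -> new ArrayList<>()).add(m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "2016-01-18 08:35:53,889  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17             **a)  [70] Poi                                            0 ms",
            "2016-01-18 08:35:53,892  TRACE [content.transform.TransformerDebug] [http-apr-8080-exec-4] 17             Finished in 6 ms Transformer NOT called",
            "2016-01-18 08:35:54,090  DEBUG [content.transform.TransformerDebug] [pool-13-thread-2] 19.1           xlsx html image in excel.xlsx 171.7 KB Poi",
            "2016-01-18 08:35:54,281  TRACE [content.transform.TransformerDebug] [pool-13-thread-2] 19.1           Finished in 191 ms");
        System.out.println(invoked(sample)); // prints {19=[Poi]}
    }
}
```

Request 17 does not appear in the result because it only selected a transformer; request 19 does, because a nested step ran.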


I am working with Alfresco 5.0.c (and 5.0.d) on both CentOS and OS X. On OS X I am using the Maven 2.1.1 SDK, while the CentOS setup is an Alfresco bin installation.
Any insights would be appreciated.

3 REPLIES

darkstar1
Confirmed Champ
I decided to create the following complex pipeline (Excel => PDF => HTML) instead, and then replace the out-of-the-box transformer for PDF to HTML (which seems to be <a href="https://github.com/Alfresco/community-edition/blob/dcbe8a0d5219614fbe8bc11c166d638152b9367c/projects...">PdfBox</a>) with pdf2htmlex for the PDF => HTML transformation.
<strong>My transformer Spring bean</strong>:

    <bean id="transformer.worker.pdf2htmlex" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformerWorker">
        <property name="mimetypeService">
            <ref bean="mimetypeService" />
        </property>
        <property name="checkCommand">
            <bean class="org.alfresco.util.exec.RuntimeExec">
                <property name="commandsAndArguments">
                    <map>
                        <entry key=".*">
                            <list>
                                <value>pdf2htmlex</value>
                                <value>-v</value>
                            </list>
                        </entry>
                    </map>
                </property>
                <!--<property name="errorCodes">
                    <value>1</value>
                </property>-->
            </bean>
        </property>
        <property name="transformCommand">
            <bean class="org.alfresco.util.exec.RuntimeExec">
                <property name="commandsAndArguments">
                    <map>
                        <entry key=".*" value="pdf2htmlEX --embed CFIJO ${source} ${target}"/>
                    </map>
                </property>
                <property name="processDirectory"
                          value="/"/>
            </bean>
        </property>
        <property name="explicitTransformations">
            <list>
                <bean class="org.alfresco.repo.content.transform.ExplictTransformationDetails" >
                    <constructor-arg><value>application/pdf</value></constructor-arg>
                    <constructor-arg><value>text/html</value></constructor-arg>
                </bean>
            </list>
        </property>
    </bean>

    <bean id="transformer.pdf2htmlex" class="org.alfresco.repo.content.transform.ProxyContentTransformer" parent="baseContentTransformer">
        <property name="worker">
            <ref bean="transformer.worker.pdf2htmlex" />
        </property>
    </bean>


<strong>The transformer properties file</strong>:

#disable ootb pdf->html and xlsx->html transformation path
content.transformer.OpenOffice.extensions.xlsx.html.supported=false
#content.transformer.complex.OpenOffice.PdfBox.available=false
content.transformer.complex.OpenOffice.PdfBox.extensions.*.html.supported=false

#PDF to html transformer
content.transformer.pdf2htmlex.available=true
content.transformer.pdf2htmlex.priority=50
content.transformer.pdf2htmlex.extensions.pdf.html.supported=true
content.transformer.pdf2htmlex.extensions.pdf.html.priority=50


#   XLSX to HTML pipeline
content.transformer.complex.Xlsx.Html.pipeline=*|pdf|pdf2htmlex
content.transformer.complex.Xlsx.Html.available=true
content.transformer.complex.Xlsx.Html.extensions.xlsx.html.priority=30
content.transformer.complex.Xlsx.Html.extensions.xlsx.html.supported=true
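The pipeline value above appears to follow the pattern transformer|extension|transformer, with * standing for any available transformer. A minimal Java sketch of how such a string breaks down into hops; `PipelineSketch` and `Step` are hypothetical names used for illustration only, not Alfresco classes, and this is not how Alfresco itself parses the property.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineSketch {
    /** One hop: the transformer to use ("*" = any) and the extension it should produce. */
    static final class Step {
        final String transformer;
        final String targetExt; // null means "the finally requested target format"

        Step(String transformer, String targetExt) {
            this.transformer = transformer;
            this.targetExt = targetExt;
        }

        @Override
        public String toString() {
            return transformer + " -> " + (targetExt == null ? "<target>" : targetExt);
        }
    }

    /** Splits "t1|ext|t2|ext|t3" into hops; expects an odd number of |-separated parts. */
    static List<Step> parse(String pipeline) {
        String[] parts = pipeline.split("\\|");
        if (parts.length % 2 == 0) {
            throw new IllegalArgumentException("expected transformer|ext|...|transformer");
        }
        List<Step> steps = new ArrayList<>();
        for (int i = 0; i < parts.length; i += 2) {
            String ext = (i + 1 < parts.length) ? parts[i + 1] : null;
            steps.add(new Step(parts[i], ext));
        }
        return steps;
    }

    public static void main(String[] args) {
        // The value used in the properties file above.
        for (Step step : parse("*|pdf|pdf2htmlex")) {
            System.out.println(step);
        }
    }
}
```

Read this way, the value above means: any transformer that can produce PDF, followed by pdf2htmlex to produce the requested HTML.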


One thing I noticed is that when I start up my repository, this shows up in the logs:

2016-01-21 14:18:37,962  TRACE [util.exec.RuntimeExec] [localhost-startStop-1] Execution result:
   os:         Mac OS X
   command:    pdf2htmlex -v
   succeeded:  true
   exit code:  0
   out:       
   err:        pdf2htmlEX version 0.13.6
Copyright 2012-2014 Lu Wang <coolwanglu@gmail.com> and other contributors
Libraries:
  poppler 0.40.0
  libfontforge 20160113
  cairo 1.14.6
Default data-dir: /usr/local/Cellar/pdf2htmlex/0.13.6_7/share/pdf2htmlEX
Supported
   existing environment:
        JAVA_MAIN_CLASS_7289=com.intellij.rt.execution.application.AppMain

I'm wondering why, if the check command has an exit code of 0, its output is reported under "err". The main problem, though, is that the transform command results in an error and a zero-byte target file.
Copying that exact command and executing it in a terminal always succeeds.

So it turns out that the failure was down to the way the arguments for the transformCommand are specified. They need to be given as a list:

        <property name="transformCommand">
            <bean class="org.alfresco.util.exec.RuntimeExec">
                <property name="commandsAndArguments">
                    <map>
                        <entry key=".*">
                            <list>
                                <value>pdf2htmlEX</value>
                                <value>--embed</value>
                                <value>CFIJO</value>
                                <value>${source}</value>
                                <value>${target}</value>
                            </list>
                        </entry>
                    </map>
                </property>
                <property name="processDirectory" value="/"/>
            </bean>
        </property>


as opposed to:


        <property name="transformCommand">
            <bean class="org.alfresco.util.exec.RuntimeExec">
                <property name="commandsAndArguments">
                    <map>
                        <entry key=".*" value="pdf2htmlEX --embed CFIJO ${source} ${target}"/>
                    </map>
                </property>
                <property name="processDirectory" value="/"/>
            </bean>
        </property>

I found the clue in the comments in org/alfresco/util/exec/RuntimeExec.java, the class responsible for executing the commands.
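The difference between the two forms can be illustrated outside Alfresco. This is a minimal sketch using plain `java.lang.ProcessBuilder`, not `RuntimeExec` itself. It assumes (I have not confirmed this against the RuntimeExec source) that the single-string form ends up being passed as one argument, so the whole string is treated as the name of the executable. `CommandFormDemo` is a hypothetical name, `echo` is only a stand-in for pdf2htmlEX, and a Unix-like system is assumed.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class CommandFormDemo {
    /** Runs the command; returns its exit code, or -1 if it could not be started. */
    static int run(List<String> command) {
        try {
            Process process = new ProcessBuilder(command).start();
            return process.waitFor();
        } catch (IOException e) {
            // Typically "Cannot run program ...": no executable with that exact name.
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        // One list element per argument: the program is "echo".
        System.out.println(run(Arrays.asList("echo", "--embed", "CFIJO")));

        // Everything in one element: the program name is the whole string,
        // spaces included, which does not exist on the PATH.
        System.out.println(run(Arrays.asList("echo --embed CFIJO")));
    }
}
```

If that assumption holds, it would also explain why pasting the same command into a terminal works: the shell splits the line into separate arguments before running it.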

Also, like Jeff Potts <a href="https://forums.alfresco.com/comment/160957#comment-160957">in his post</a>, I came across the exact same issue, and it was resolved in exactly the same way. I believe the reason is that the complex pipeline needs to be registered with the TransformerRegistry on startup; otherwise it is unable to recognise and declare the transformer as available for the pipeline.