<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Transform or thumbnailing scanned PDF - results a white page in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/transform-or-thumbnailing-scanned-pdf-results-a-white-page/m-p/211382#M164512</link>
    <description></description>
    <pubDate>Thu, 15 Oct 2009 18:19:39 GMT</pubDate>
    <dc:creator>schaffy</dc:creator>
    <dc:date>2009-10-15T18:19:39Z</dc:date>
    <item>
      <title>Transform or thumbnailing scanned PDF - results a white page</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/transform-or-thumbnailing-scanned-pdf-results-a-white-page/m-p/211381#M164511</link>
      <description>I've found a thumbnail generation problem in the Alfresco 3.1 Labs and 3.2 Community editions when using scanned PDF documents: the thumbnail comes out as a white page. ImageMagick and Ghostscript can create thumbnails from the same PDF document, but Alfresco 3.x doesn't use ImageMagick for this transformation; see PdfToImageContentTransformer.java.</description>
      <pubDate>Fri, 04 Sep 2009 14:44:23 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/transform-or-thumbnailing-scanned-pdf-results-a-white-page/m-p/211381#M164511</guid>
      <dc:creator>louise</dc:creator>
      <dc:date>2009-09-04T14:44:23Z</dc:date>
    </item>
    <item>
      <title>Re: Transform or thumbnailing scanned PDF - results a white page</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/transform-or-thumbnailing-scanned-pdf-results-a-white-page/m-p/211382#M164512</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Replying to say that you are not alone.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The primary cause appears to be that ImageMagick is not used for PDF thumbnail production.&amp;nbsp; Well… moving on then…&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The secondary cause is a change in Adobe's PDF specification starting with version 1.5.&amp;nbsp; Cross-reference information, previously always a plain-text table, may now be stored in a binary stream object instead.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Referring to &lt;/SPAN&gt;&lt;A href="http://www.adobe.com/devnet/pdf/pdfs/PDFReference15_v6.pdf" rel="nofollow noopener noreferrer"&gt;http://www.adobe.com/devnet/pdf/pdfs/PDFReference15_v6.pdf&lt;/A&gt;&lt;SPAN&gt;, section 3.4.7:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;3.4.7 Cross-Reference Streams&lt;BR /&gt;Beginning with PDF 1.5, cross-reference information may be stored in a cross-reference&lt;BR /&gt;stream, instead of a cross-reference table. Cross-reference streams provide&lt;BR /&gt;the following advantages:&lt;BR /&gt;• A more compact representation of cross-reference information.&lt;BR /&gt;• The ability to access compressed objects that are stored in object streams (see&lt;BR /&gt;Section 3.4.6, “Object Streams”), and to allow new cross-reference entry types&lt;BR /&gt;to be added in the future.&lt;/CODE&gt;&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;… a little later on…&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;Note that the value following the startxref keyword is now the offset of the cross-reference&lt;BR /&gt;stream rather than an xref keyword. For files that use cross-reference&lt;BR /&gt;streams entirely (that is, PDF 1.5 files that are not hybrid-reference files; see&lt;BR /&gt;“Compatibility with PDF 1.4” on page 85), the keywords xref and trailer are no&lt;BR /&gt;longer used. Therefore, with the exception of the “startxref address %%EOF” segment&lt;BR /&gt;and comments, a PDF 1.5 file is entirely a sequence of objects.&lt;/CODE&gt;&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;So, as you can imagine, the problem occurs when the pdf-renderer library tries to read a cross-reference stream as if it were a plain-text cross-reference table (an overgeneralization, but hopefully the point is clear).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Two possible solutions are:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;- Modify the largely abandoned pdf-renderer library (&lt;/SPAN&gt;&lt;A href="https://pdf-renderer.dev.java.net/" rel="nofollow noopener noreferrer"&gt;https://pdf-renderer.dev.java.net/&lt;/A&gt;&lt;SPAN&gt;) to update it for PDF versions later than 1.4 (the file to modify is src/com/sun/pdfview/PDFFile.java)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;- Modify transformInternal within &lt;/SPAN&gt;&lt;A href="http://svn.alfresco.com/repos/alfresco-open-mirror/alfresco/HEAD/root/projects/repository/source/java/org/alfresco/repo/content/transform/PdfToImageContentTransformer.java" rel="nofollow noopener noreferrer"&gt;PdfToImageContentTransformer.java&lt;/A&gt;&lt;SPAN&gt; to use ImageMagick instead of the largely abandoned pdf-renderer library&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Probably time to end this post now and go submit a ticket&amp;nbsp; :wink:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;For reference:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Mac OS 10.5.8&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Alfresco Community 3.2&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;ImageMagick&amp;nbsp; 6.5.5-10 (2009/09/14)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;pdf2swf - swftools 0.9.0&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;OpenOffice 3.0.0 [300m6(Build:9352)]&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 15 Oct 2009 18:19:39 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/transform-or-thumbnailing-scanned-pdf-results-a-white-page/m-p/211382#M164512</guid>
      <dc:creator>schaffy</dc:creator>
      <dc:date>2009-10-15T18:19:39Z</dc:date>
    </item>
  </channel>
</rss>

