<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Link content to TIFF files with Lucene in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/link-content-to-tiff-files-with-lucene/m-p/113638#M80012</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Our full-text Lucene integration uses the repository's to-text transformers to convert various mimetypes to text. So it would be a matter of writing a to-text transformer class and registering it for the "image/tiff" mimetype. It would then be called whenever a TIFF image is added to the repository or its content is updated. Your transformer class could do whatever work is required to obtain the OCR text (such as reading an association or a custom property in which you have saved the data) and return it as the result of the transformation.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;There are various examples of this in the Alfresco SDK, such as the PdfBoxContentTransformer class (org.alfresco.repo.content.transform.PdfBoxContentTransformer), which converts PDF to text for indexing.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://wiki.alfresco.com/wiki/Content_Transformations" rel="nofollow noopener noreferrer"&gt;http://wiki.alfresco.com/wiki/Content_Transformations&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Hope this helps,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Kevin&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 10 Jul 2007 10:13:00 GMT</pubDate>
    <dc:creator>kevinr</dc:creator>
    <dc:date>2007-07-10T10:13:00Z</dc:date>
    <item>
      <title>Link content to TIFF files with Lucene</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/link-content-to-tiff-files-with-lucene/m-p/113637#M80011</link>
      <description>Hello, We need to import TIFF files into the repository. We would like to store these images in Alfresco and be able to perform full-text searches on them by telling the Lucene engine that the text content of each TIFF file (extracted by OCR in a previous step) is related to the image file. Is it pos</description>
      <pubDate>Mon, 09 Jul 2007 09:16:50 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/link-content-to-tiff-files-with-lucene/m-p/113637#M80011</guid>
      <dc:creator>fguillaume</dc:creator>
      <dc:date>2007-07-09T09:16:50Z</dc:date>
    </item>
    <item>
      <title>Re: Link content to TIFF files with Lucene</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/link-content-to-tiff-files-with-lucene/m-p/113638#M80012</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Our full-text Lucene integration uses the repository's to-text transformers to convert various mimetypes to text. So it would be a matter of writing a to-text transformer class and registering it for the "image/tiff" mimetype. It would then be called whenever a TIFF image is added to the repository or its content is updated. Your transformer class could do whatever work is required to obtain the OCR text (such as reading an association or a custom property in which you have saved the data) and return it as the result of the transformation.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;There are various examples of this in the Alfresco SDK, such as the PdfBoxContentTransformer class (org.alfresco.repo.content.transform.PdfBoxContentTransformer), which converts PDF to text for indexing.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://wiki.alfresco.com/wiki/Content_Transformations" rel="nofollow noopener noreferrer"&gt;http://wiki.alfresco.com/wiki/Content_Transformations&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Hope this helps,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Kevin&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 10 Jul 2007 10:13:00 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/link-content-to-tiff-files-with-lucene/m-p/113638#M80012</guid>
      <dc:creator>kevinr</dc:creator>
      <dc:date>2007-07-10T10:13:00Z</dc:date>
    </item>
  </channel>
</rss>

