<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Searching in large text files (Lucene limited?) in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241298#M194428</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hello&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I have PDF documents that contain only image data. Those are accompanied by text files with the image data 'transformed' into text. I can link those two so that I get the PDF listed in search results.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The problem is that when the text content is too large, words from the beginning are found but words from the end are not… (The text content I've tested is 1.6MB.)&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I was under the impression that there's a limit of 65k characters for text properties, so I linked the text content via a property of type &lt;/SPAN&gt;&lt;STRONG&gt;d:content&lt;/STRONG&gt;&lt;SPAN&gt; that I add to the PDF document. Apparently that does not help. &lt;img id="smileysad" class="emoticon emoticon-smileysad" src="https://connect.hyland.com/i/smilies/16x16_smiley-sad.png" alt="Smiley Sad" title="Smiley Sad" /&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Any hints, please?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 26 Oct 2010 11:56:11 GMT</pubDate>
    <dc:creator>jzaruba</dc:creator>
    <dc:date>2010-10-26T11:56:11Z</dc:date>
    <item>
      <title>Searching in large text files (Lucene limited?)</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241298#M194428</link>
      <description>Hello. I have PDF documents that contain only image data. Those are accompanied by text files with the image data 'transformed' into a text. I can link those two so that I get the PDF listed in search results. The problem is when the text content is too large, then the words from the beginning are foun</description>
      <pubDate>Tue, 26 Oct 2010 11:56:11 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241298#M194428</guid>
      <dc:creator>jzaruba</dc:creator>
      <dc:date>2010-10-26T11:56:11Z</dc:date>
    </item>
    <item>
      <title>Re: Searching in large text files (Lucene limited?)</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241299#M194429</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Try setting&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;&lt;BR /&gt;lucene.indexer.maxFieldLength=1000000&lt;BR /&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;SPAN&gt;in alfresco-global.properties. The default is only 10000 terms per field, so by default large documents are only partially indexed &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;I consider this to be one of Alfresco's most annoying "features"…&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;HTH&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Gyro&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 26 Oct 2010 12:19:52 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241299#M194429</guid>
      <dc:creator>gyro_gearless</dc:creator>
      <dc:date>2010-10-26T12:19:52Z</dc:date>
    </item>
    <item>
      <title>Re: Searching in large text files (Lucene limited?)</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241300#M194430</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;You're my hero, sir! &lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://connect.hyland.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks a lot!&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 26 Oct 2010 12:32:18 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241300#M194430</guid>
      <dc:creator>jzaruba</dc:creator>
      <dc:date>2010-10-26T12:32:18Z</dc:date>
    </item>
    <item>
      <title>Re: Searching in large text files (Lucene limited?)</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241301#M194431</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;jzaruba, could you explain to me how to save to a "d:content" attribute? &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I can save to the "d:content" attribute, but when I run a search or an advanced search, it does not work: 0 results are found.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Could you help me?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks!&lt;/SPAN&gt;&lt;BR /&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 12 Nov 2013 20:47:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/searching-in-large-text-files-lucene-limited/m-p/241301#M194431</guid>
      <dc:creator>mjuarez</dc:creator>
      <dc:date>2013-11-12T20:47:25Z</dc:date>
    </item>
  </channel>
</rss>

