<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Increase Max File Size That Solr Indexes in Alfresco Forum</title>
    <link>https://connect.hyland.com/t5/alfresco-forum/increase-max-file-size-that-solr-indexes/m-p/85183#M25881</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello everyone,&lt;/P&gt;&lt;P&gt;I have installed Alfresco Community Edition 5.2 on Windows (using the exe installer). According to my log file, when I upload a PDF file larger than 10 MB, Alfresco (Solr) does not extract its text, so the file content cannot be searched. The log file says:&lt;/P&gt;&lt;P&gt;Metadata extraction rejected, Extracter: org.alfresco.repo.content.metadata.PdfBoxMetadataExtracter@39882d66 Reason: Max doc size exceeded 10 MB.&lt;/P&gt;&lt;P&gt;I would appreciate it if someone could tell me how to increase this limit. I have already tried some solutions (for example, increasing alfresco.contentStreamLimit in alfresco-community/solr4/archive-SpaceStore/conf/solrcore and alfresco-community/solr4/workspace-SpaceStore/conf/solrcore).&lt;/P&gt;&lt;P&gt;Thanks a lot in advance.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 24 Jul 2019 05:11:27 GMT</pubDate>
    <dc:creator>alinasrinazif</dc:creator>
    <dc:date>2019-07-24T05:11:27Z</dc:date>
    <item>
      <title>Increase Max File Size That Solr Indexes</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/increase-max-file-size-that-solr-indexes/m-p/85183#M25881</link>
      <description>Hello everyone, I have installed Alfresco Community Edition 5.2 on Windows (using the exe installer). According to my log file, when I upload a PDF file larger than 10 MB, Alfresco (Solr) does not extract its text, so the file content cannot be searched. The log file says: Metadata extr</description>
      <pubDate>Wed, 24 Jul 2019 05:11:27 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/increase-max-file-size-that-solr-indexes/m-p/85183#M25881</guid>
      <dc:creator>alinasrinazif</dc:creator>
      <dc:date>2019-07-24T05:11:27Z</dc:date>
    </item>
    <item>
      <title>Re: Increase Max File Size That Solr Indexes</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/increase-max-file-size-that-solr-indexes/m-p/85184#M25882</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The limit is defined in your Alfresco repository, which converts the PDF to text. Please check your transformer configuration, whose defaults are defined in &lt;A class="link-titled" href="https://github.com/ecm4u/alfresco-ce-repository/blob/5.2.g-patched/root/projects/repository/config/alfresco/subsystems/Transformers/default/transformers.properties" title="https://github.com/ecm4u/alfresco-ce-repository/blob/5.2.g-patched/root/projects/repository/config/alfresco/subsystems/Transformers/default/transformers.properties" rel="nofollow noopener noreferrer"&gt;alfresco-ce-repository/transformers.properties at 5.2.g-patched · ecm4u/alfresco-ce-repository · GitHub&lt;/A&gt; (sorry, I didn't find a valid tag in the Alfresco git repo for 5.2).&lt;/P&gt;&lt;P&gt;Depending on which transformer handles the task, you should increase its maxSourceSizeKBytes, e.g.&lt;/P&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;content.transformer.PdfBox.extensions.pdf.txt.maxSourceSizeKBytes=25600&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Then enable debug logging in your log4j properties&lt;/P&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;log4j.logger.org.alfresco.repo.content.transform.TransformerDebug=DEBUG&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;to find out which transformer is actually running for your documents, and/or install &lt;A class="link-titled" href="https://github.com/OrderOfTheBee/ootbee-support-tools" title="https://github.com/OrderOfTheBee/ootbee-support-tools" rel="nofollow noopener noreferrer"&gt;GitHub - OrderOfTheBee/ootbee-support-tools: OOTBee Support Tools addon to extend set of administrative tools on Reposit…&lt;/A&gt; to debug and modify the transformation config from your browser.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 24 Jul 2019 18:13:18 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/increase-max-file-size-that-solr-indexes/m-p/85184#M25882</guid>
      <dc:creator>heiko_robert</dc:creator>
      <dc:date>2019-07-24T18:13:18Z</dc:date>
    </item>
  </channel>
</rss>

