<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Tips on troubleshooting individual file indexing with Alfresco 5.2 / Solr 6 in Alfresco Forum</title>
    <link>https://connect.hyland.com/t5/alfresco-forum/tips-on-troubleshooting-individual-file-indexing-with-alfresco-5/m-p/61063#M21365</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I have an Alfresco Community (&lt;SPAN class=""&gt;201707&lt;/SPAN&gt;) installation which I am using to compare the default solr 4 vs solr 6 in the alfresco-search-services-1.1.0 install.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;After a full index with Solr 4, I get the following info from the solr4 admin page:&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P&gt;Num Docs: 163458&lt;/P&gt;&lt;P&gt;Max Docs: 163458&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;Deleted Docs: 0&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;Master (Searching) 1524504594659 159 &lt;STRONG&gt;6.5 GB&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;Nodes in Index: 70921&lt;BR /&gt;Transactions in Index: 80844&lt;BR /&gt;Approx transactions remaining: 0&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;Unindexed Nodes: 11441&lt;BR /&gt;Error Nodes in Index: 0&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;In the solr4 SUMMARY report, I can see that it's done:&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P&gt;Node count with FTSStatus Clean&amp;nbsp;&amp;nbsp; &amp;nbsp;69165&lt;BR /&gt;Node count with FTSStatus Dirty&amp;nbsp;&amp;nbsp; &amp;nbsp;0&lt;BR /&gt;Node count with FTSStatus New&amp;nbsp;&amp;nbsp; &amp;nbsp;0&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;When I test the solr 6 setup, I stop the alfresco app, make the changes to the alfresco install for Solr 6, start the solr server and the alfresco server, and let it re-index.&amp;nbsp; It plugs along for a few hours, and then completes with the following stats:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P&gt;Num Docs:164357&lt;/P&gt;&lt;P&gt;Max Doc:164357&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;Deleted Docs: 0&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;Master (Searching) &amp;nbsp;&amp;nbsp; &amp;nbsp;1524581958240 586 &lt;STRONG&gt;2.48 
GB&lt;/STRONG&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;And in the SUMMARY report:&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P&gt;Alfresco Nodes in Index&amp;nbsp;&amp;nbsp; &amp;nbsp;70937&lt;BR /&gt;Alfresco Transactions in Index&amp;nbsp;&amp;nbsp; &amp;nbsp;81470&lt;BR /&gt;Alfresco Unindexed Nodes&amp;nbsp;&amp;nbsp; &amp;nbsp;11698&lt;BR /&gt;Alfresco Error Nodes in Index&amp;nbsp;&amp;nbsp; &amp;nbsp;0&lt;/P&gt;&lt;P&gt;Node count with FTSStatus Clean&amp;nbsp;&amp;nbsp; &amp;nbsp;69181&lt;BR /&gt;Node count with FTSStatus Dirty&amp;nbsp;&amp;nbsp; &amp;nbsp;0&lt;BR /&gt;Node count with FTSStatus New&amp;nbsp;&amp;nbsp; &amp;nbsp;0&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;When I run the ERROR query, I get nothing:&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P&gt;{&lt;BR /&gt;&amp;nbsp; "responseHeader":{&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; "status":0,&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; "QTime":0,&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; "params":{&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "q":"ERROR*",&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "wt":"json"}},&lt;BR /&gt;&amp;nbsp; "response":{"numFound":0,"start":0,"docs":[]&lt;BR /&gt;&amp;nbsp; }}&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;So the indexer looks done and comparable volume-wise to the solr4 setup.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What first concerned me was the significantly smaller index size after a complete reindex: 6.5 GB with Solr4 vs 2.5 GB with Solr6, when I was expecting a 15% size increase with the introduction of fingerprints.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;There are also some docs that I can't get in a full-text search result set, even though the docs have the index aspect attached.&amp;nbsp; I tried to reindex one of those docs, but had no luck:&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P&gt;&lt;A class="jivelink2" 
href="http://10.155.34.115:8983/solr/admin/cores?action=reindex&amp;amp;query=sys%5C%3Anode%5C-dbid%3A135156" title="http://10.155.34.115:8983/solr/admin/cores?action=reindex&amp;amp;query=sys%5C%3Anode%5C-dbid%3A135156" rel="nofollow noopener noreferrer"&gt;http://[myip]:8983/solr/admin/cores?action=reindex&amp;amp;query=sys%5C%3Anode%5C-dbid%3A135156&lt;/A&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;At reindex time I saw a few&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P&gt;"FlateFilter: stop reading corrupt stream due to a DataFormatException"&lt;/P&gt;&lt;P&gt;and&lt;/P&gt;&lt;P&gt;"An error occured when reading table hmtx"&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;But no more than I saw on the solr4 setup.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any thoughts on how best to troubleshoot the inconsistencies?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, I know I can't upgrade to pdfbox 2.0.X in 5.2, but has anyone been able to replace the pdfbox-1.8.10.jar and pdfbox-1.8.10.jar with pdfbox-1.8.13.jar and pdfbox-1.8.13.jar to get past the pdfbox problems?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 24 Apr 2018 15:35:03 GMT</pubDate>
    <dc:creator>dbiggins</dc:creator>
    <dc:date>2018-04-24T15:35:03Z</dc:date>
    <item>
      <title>Tips on troubleshooting individual file indexing with Alfresco 5.2 / Solr 6</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/tips-on-troubleshooting-individual-file-indexing-with-alfresco-5/m-p/61063#M21365</link>
      <description>I have an Alfresco Community (201707) installation which I am using to compare the default solr 4 vs solr 6 in the alfresco-search-services-1.1.0 install. After a full index with Solr 4, I get the following info from the solr4 admin page: Num Docs: 163458 Max Docs: 163458 ... Deleted Docs: 0 ... Master (Sea</description>
      <pubDate>Tue, 24 Apr 2018 15:35:03 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/tips-on-troubleshooting-individual-file-indexing-with-alfresco-5/m-p/61063#M21365</guid>
      <dc:creator>dbiggins</dc:creator>
      <dc:date>2018-04-24T15:35:03Z</dc:date>
    </item>
    <item>
      <title>Re: Tips on troubleshooting individual file indexing with Alfresco 5.2 / Solr 6</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/tips-on-troubleshooting-individual-file-indexing-with-alfresco-5/m-p/61064#M21366</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I re-read a previous question that &lt;B&gt;Cesar Capillas&lt;/B&gt; had answered, and I think his answer explains the likely cause of the size discrepancy (&lt;A _jive_internal="true" href="https://community.alfresco.com/message/830710-request-for-solr-6-search-services-troubleshooting-advice" rel="nofollow noopener noreferrer"&gt;https://community.alfresco.com/message/830710-request-for-solr-6-search-services-troubleshooting-advice&lt;/A&gt;).&amp;nbsp; I needed to look at my shared.properties, so thanks for the previous answer.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 25 Apr 2018 12:55:00 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/tips-on-troubleshooting-individual-file-indexing-with-alfresco-5/m-p/61064#M21366</guid>
      <dc:creator>dbiggins</dc:creator>
      <dc:date>2018-04-25T12:55:00Z</dc:date>
    </item>
  </channel>
</rss>

