<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Solr 4 indexing lag in Alfresco Forum</title>
    <link>https://connect.hyland.com/t5/alfresco-forum/solr-4-indexing-lag/m-p/120466#M33086</link>
    <description>&lt;P&gt;Hi Angel, thank you for your response.&lt;/P&gt;&lt;P&gt;On the DB side, I enabled slow-query logging and observed the environment for about 10-15 minutes. Everything looks fine:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;15-25 JDBC connections (on average)&lt;/LI&gt;&lt;LI&gt;no slow queries logged (the logging threshold was a query duration greater than 5 seconds)&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I think Solr takes too long to index even a small commit. I noticed that the index folder contains a lot of files of about 2 GB each:&lt;/P&gt;&lt;PRE&gt;## /solr4/index/workspace/SpacesStore/index
-rw-r--r-- 1 alfresco alfresco 2.2G Mar 10 2023 _256j_Lucene41_0.tim
-rw-r--r-- 1 alfresco alfresco 2.2G Mar 10 2023 _2gzk_Lucene41_0.tim
-rw-r--r-- 1 alfresco alfresco 2.1G Mar 10 2023 _37iw_Lucene41_0.tim
-rw-r--r-- 1 alfresco alfresco 2.1G Mar 10 2023 _2tvr_Lucene41_0.tim
-rw-r--r-- 1 alfresco alfresco 2.0G Mar 9 2023 _1s66_Lucene41_0.tim
....
-rw-r--r-- 1 alfresco alfresco 1.2M Jul 4 10:07 _ifkj.nvd
-rw-r--r-- 1 alfresco alfresco 6.9M Jul 4 10:07 _ifkj_Lucene410_0.dvd
-rw-r--r-- 1 alfresco alfresco 3.6M Jul 4 10:06 _ifkj_Lucene41_0.doc&lt;/PRE&gt;&lt;P&gt;In addition, there's another folder named "content" with small gz files and a lot of numbered subfolders that contain many other gz files. That folder looks very heavy as well, and I'm unable to list all of its files within a reasonable time: even "ls -l" run from a terminal takes too long to respond.&lt;/P&gt;&lt;PRE&gt;## solr4/content/_DEFAULT_/db
drwxrwxr-x 2 solr solr 264K Jul  6 17:09 1962
drwxrwxr-x 2 solr solr 264K Jul  6 14:34 1963
drwxrwxr-x 2 solr solr 264K Jul  6 15:10 2086
drwxrwxr-x 2 solr solr 260K Jul 15 08:33 1105
drwxrwxr-x 2 solr solr 260K Jul 10 10:54 1106
....&lt;/PRE&gt;&lt;P&gt;It looks like Solr is always doing something with those huge files, and that takes quite a long time. The curious thing is that searches take 3-7 seconds (acceptable for paginated queries of 15 items on a huge repository), but indexing is 7-10 minutes behind the DB (the system clocks of both servers are synced).&lt;/P&gt;</description>
    <pubDate>Tue, 05 Nov 2024 12:36:21 GMT</pubDate>
    <dc:creator>joe_l3</dc:creator>
    <dc:date>2024-11-05T12:36:21Z</dc:date>
    <item>
      <title>Solr 4 indexing lag</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr-4-indexing-lag/m-p/120464#M33084</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;Has anyone experienced Solr 4 indexing lag with a huge content store?&lt;/P&gt;&lt;P&gt;I am facing a problem with Solr and NRT (near-real-time) indexing. Basically, in my environment Solr takes too long to sync its indexes with the DB data. Documents become searchable only after 8-10 minutes.&lt;/P&gt;&lt;P&gt;Here is my stack:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Alfresco Community 5.2&lt;/STRONG&gt; - 1 server - 12 vCPU - 40 GB RAM - JVM heap 20 GB&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Solr 4&lt;/STRONG&gt; - 1 server - 12 vCPU - 30 GB RAM - I/O throughput 1843.93 MB/s - JVM heap 18 GB&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;MySQL 5.7&lt;/STRONG&gt; - 1 server - 12 vCPU - 20 GB RAM&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Sizing and settings worth mentioning:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Size on disk of the content repository: 5 TB&lt;/LI&gt;&lt;LI&gt;Size on disk of the Solr indexes: 300 GB&lt;/LI&gt;&lt;LI&gt;Number of documents in Solr: 140 million&lt;/LI&gt;&lt;LI&gt;Content indexing disabled&lt;/LI&gt;&lt;LI&gt;Solr suggester disabled&lt;/LI&gt;&lt;LI&gt;Alfresco tracking every 8 seconds&lt;/LI&gt;&lt;LI&gt;11 indexing threads for each tracking transaction&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Fri, 19 Jul 2024 16:10:31 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr-4-indexing-lag/m-p/120464#M33084</guid>
      <dc:creator>joe_l3</dc:creator>
      <dc:date>2024-07-19T16:10:31Z</dc:date>
    </item>
    <item>
      <title>Re: Solr 4 indexing lag</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr-4-indexing-lag/m-p/120465#M33085</link>
      <description>&lt;P&gt;At first glance, it appears that the database might be the bottleneck. Do you have any metrics on the performance of the database queries?&lt;/P&gt;</description>
      <pubDate>Mon, 22 Jul 2024 07:23:41 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr-4-indexing-lag/m-p/120465#M33085</guid>
      <dc:creator>angelborroy</dc:creator>
      <dc:date>2024-07-22T07:23:41Z</dc:date>
    </item>
    <item>
      <title>Re: Solr 4 indexing lag</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr-4-indexing-lag/m-p/120466#M33086</link>
      <description>&lt;P&gt;Hi Angel, thank you for your response.&lt;/P&gt;&lt;P&gt;On the DB side, I enabled slow-query logging and observed the environment for about 10-15 minutes. Everything looks fine:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;15-25 JDBC connections (on average)&lt;/LI&gt;&lt;LI&gt;no slow queries logged (the logging threshold was a query duration greater than 5 seconds)&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I think Solr takes too long to index even a small commit. I noticed that the index folder contains a lot of files of about 2 GB each:&lt;/P&gt;&lt;PRE&gt;## /solr4/index/workspace/SpacesStore/index
-rw-r--r-- 1 alfresco alfresco 2.2G Mar 10 2023 _256j_Lucene41_0.tim
-rw-r--r-- 1 alfresco alfresco 2.2G Mar 10 2023 _2gzk_Lucene41_0.tim
-rw-r--r-- 1 alfresco alfresco 2.1G Mar 10 2023 _37iw_Lucene41_0.tim
-rw-r--r-- 1 alfresco alfresco 2.1G Mar 10 2023 _2tvr_Lucene41_0.tim
-rw-r--r-- 1 alfresco alfresco 2.0G Mar 9 2023 _1s66_Lucene41_0.tim
....
-rw-r--r-- 1 alfresco alfresco 1.2M Jul 4 10:07 _ifkj.nvd
-rw-r--r-- 1 alfresco alfresco 6.9M Jul 4 10:07 _ifkj_Lucene410_0.dvd
-rw-r--r-- 1 alfresco alfresco 3.6M Jul 4 10:06 _ifkj_Lucene41_0.doc&lt;/PRE&gt;&lt;P&gt;In addition, there's another folder named "content" with small gz files and a lot of numbered subfolders that contain many other gz files. That folder looks very heavy as well, and I'm unable to list all of its files within a reasonable time: even "ls -l" run from a terminal takes too long to respond.&lt;/P&gt;&lt;PRE&gt;## solr4/content/_DEFAULT_/db
drwxrwxr-x 2 solr solr 264K Jul  6 17:09 1962
drwxrwxr-x 2 solr solr 264K Jul  6 14:34 1963
drwxrwxr-x 2 solr solr 264K Jul  6 15:10 2086
drwxrwxr-x 2 solr solr 260K Jul 15 08:33 1105
drwxrwxr-x 2 solr solr 260K Jul 10 10:54 1106
....&lt;/PRE&gt;&lt;P&gt;It looks like Solr is always doing something with those huge files, and that takes quite a long time. The curious thing is that searches take 3-7 seconds (acceptable for paginated queries of 15 items on a huge repository), but indexing is 7-10 minutes behind the DB (the system clocks of both servers are synced).&lt;/P&gt;</description>
      <pubDate>Tue, 05 Nov 2024 12:36:21 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr-4-indexing-lag/m-p/120466#M33086</guid>
      <dc:creator>joe_l3</dc:creator>
      <dc:date>2024-11-05T12:36:21Z</dc:date>
    </item>
  </channel>
</rss>

