<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Solr4 query threading behaviour in Alfresco Forum</title>
    <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3804#M1526</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Are you sure the use of all the cores was really due to the query? SOLR will trigger a "new / updated content tracking" operation every 15 seconds for each SOLR core, which may also branch out to multiple threads when new content is found.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Setting &lt;SPAN style="font-size: 11.0pt;"&gt;alfresco.doPermissionChecks&lt;/SPAN&gt; to false in solrcore.properties is not really a smart thing to do. It just means that permissions will not be checked on the SOLR tier, but they will still be checked on the Repository tier, leading to far more performance overhead.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As I often tell people, a lot of the tuning advice should be taken with a grain of salt and a basic understanding of the effect any setting may have on performance. It would be great to know what you have tried, so we can judge the potential effect (benefit / damage) it could have.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What is the IO situation on your server? SOLR and DB are very sensitive to slow IO and you can easily tank the performance by not using a proper storage medium for the data.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have used 201612 GA (5.2) for a while now locally (on my work laptop) and even with about 6 million documents loaded into it, query time never reached the double-digit seconds range.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Sun, 22 Jan 2017 14:38:25 GMT</pubDate>
    <dc:creator>afaust</dc:creator>
    <dc:date>2017-01-22T14:38:25Z</dc:date>
    <item>
      <title>Solr4 query threading behaviour</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3803#M1525</link>
      <description>I have observed an unexpected behaviour in my Alfresco installation when doing advanced searches. The site has a complex advanced search form with about 30 criteria, many of them multi-valued and picked from multi-selection lists. When I do a search with a couple of criteria selected with a user with ad</description>
      <pubDate>Fri, 20 Jan 2017 00:53:16 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3803#M1525</guid>
      <dc:creator>chkk</dc:creator>
      <dc:date>2017-01-20T00:53:16Z</dc:date>
    </item>
    <item>
      <title>Re: Solr4 query threading behaviour</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3804#M1526</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Are you sure the use of all the cores was really due to the query? SOLR will trigger a "new / updated content tracking" operation every 15 seconds for each SOLR core, which may also branch out to multiple threads when new content is found.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Setting &lt;SPAN style="font-size: 11.0pt;"&gt;alfresco.doPermissionChecks&lt;/SPAN&gt; to false in solrcore.properties is not really a smart thing to do. It just means that permissions will not be checked on the SOLR tier, but they will still be checked on the Repository tier, leading to far more performance overhead.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As I often tell people, a lot of the tuning advice should be taken with a grain of salt and a basic understanding of the effect any setting may have on performance. It would be great to know what you have tried, so we can judge the potential effect (benefit / damage) it could have.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What is the IO situation on your server? SOLR and DB are very sensitive to slow IO and you can easily tank the performance by not using a proper storage medium for the data.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have used 201612 GA (5.2) for a while now locally (on my work laptop) and even with about 6 million documents loaded into it, query time never reached the double-digit seconds range.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 22 Jan 2017 14:38:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3804#M1526</guid>
      <dc:creator>afaust</dc:creator>
      <dc:date>2017-01-22T14:38:25Z</dc:date>
    </item>
    <item>
      <title>Re: Solr4 query threading behaviour</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3805#M1527</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks for the answer!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Load is definitely associated with the query process, as the server is nearly idle when I don't start a query. Nobody else is on the server and no new content is being created. However, it looks to me now that the load does not come from the SOLR search itself but from the retrieval of the nodes in Alfresco&amp;nbsp;after the initial search.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;While testing to find the root cause, I retrieved the JSON search results directly by opening &lt;EM&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;A class="jive-link-external-small" href="https://" rel="nofollow noopener noreferrer" target="_blank"&gt;https://&lt;/A&gt;&lt;SPAN&gt;&amp;lt;...url...&amp;gt;/share/page/dp/ws/faceted-search#searchTerm=&amp;amp;query=&amp;lt;...query...&amp;gt;&lt;/SPAN&gt;&lt;/EM&gt;&lt;SPAN style="font-size: 11.0pt;"&gt;&lt;EM&gt;&amp;amp;pageSize=25&amp;amp;maxResults=0&amp;amp;noCache=1485049983069&amp;amp;spellcheck=false"&lt;/EM&gt; and noticed that if I set "maxResults" to something greater than 0, the query executes nearly instantly, as in that case it only delivers the header information with the correct number of results but not the individual item list.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;Setting &lt;SPAN style="border: 0px; font-weight: inherit; font-size: 11pt;"&gt;alfresco.doPermissionChecks&lt;/SPAN&gt; to false in solrcore.properties is not really a smart thing to do. 
It just means that permissions will not be checked on the SOLR tier, but they will still be checked on the Repository tier leading to way more performance overhead.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;This is required for our solution: users are allowed to search all&amp;nbsp;content independent of access privileges and are only blocked from accessing full content, allowing them to request access if required.&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;This is implemented by a combination of the "doPermissionChecks=false" mentioned above and running the "processResultsSinglePage" function in "search.lib.js" as admin to get all results and then clearing out some sensitive fields in the "getDocumentItem" function if the user is not in the admin role.&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;As I often tell people, a lot of the tuning advice should be taken with a grain of salt and a basic understanding of the correlation any setting may have on the performance. 
It would be great to know what you have tried as to judge the potential effect (benefit / damage) it could have&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;UL&gt;&lt;LI style="color: #727174; background-color: #ffffff; border: 0px;"&gt;Java VM initial/max heap size increased from 2/4 GB to 8/16 GB&lt;/LI&gt;&lt;LI style="color: #727174; background-color: #ffffff; border: 0px;"&gt;Various other JVM options for garbage collection tested; current settings as advised in the Alfresco 5.2 documentation&lt;/LI&gt;&lt;LI style="color: #727174; background-color: #ffffff; border: 0px;"&gt;Caches for SOLR increased&lt;/LI&gt;&lt;LI style="color: #727174; background-color: #ffffff; border: 0px;"&gt;Database tables optimized&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Tested after each individual change, no real difference.&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;What is the IO situation in your server? SOLR and DB are very sensitive to slow IO and you can easily tank the performance by not using a proper storage medium for the data.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;Tested with "hdparm -t", throughput is about 180 MB / s.&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;Thankful for any further input &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;Chris&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 23 Jan 2017 06:46:32 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3805#M1527</guid>
      <dc:creator>chkk</dc:creator>
      <dc:date>2017-01-23T06:46:32Z</dc:date>
    </item>
    <item>
      <title>Re: Solr4 query threading behaviour</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3806#M1528</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P&gt;chkk _ wrote:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;SPAN&gt;While testing to find the root cause, I retrieved the JSON search results directly by opening &lt;EM&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;A class="jive-link-external-small" href="https://" rel="nofollow noopener noreferrer" target="_blank"&gt;https://&lt;/A&gt;&lt;SPAN&gt;&amp;lt;...url...&amp;gt;/share/page/dp/ws/faceted-search#searchTerm=&amp;amp;query=&amp;lt;...query...&amp;gt;&lt;/SPAN&gt;&lt;/EM&gt;&lt;SPAN style="font-size: 11.0pt;"&gt;&lt;EM&gt;&amp;amp;pageSize=25&amp;amp;maxResults=0&amp;amp;noCache=1485049983069&amp;amp;spellcheck=false"&lt;/EM&gt; and noticed that if I set "maxResults" to something greater than 0, the query executes nearly instantly, as in that case it only delivers the header information with the correct number of results but not the individual item list.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;P style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE class="jive_macro_quote jive-quote jive_text_macro"&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px none;"&gt;Setting &lt;SPAN style="border: 0px; font-weight: inherit; font-size: 11pt;"&gt;alfresco.doPermissionChecks&lt;/SPAN&gt; to false in solrcore.properties is not really a smart thing to do. 
It just means that permissions will not be checked on the SOLR tier, but they will still be checked on the Repository tier leading to way more performance overhead.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px none;"&gt;This is required for our solution: users are allowed to search all&amp;nbsp;content independent of access privileges and are only blocked from accessing full content, allowing them to request access if required.&lt;/P&gt;&lt;P style="min-height: 8pt; padding: 0px; color: #727174; background-color: #ffffff; border: 0px none;"&gt;&amp;nbsp;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px none;"&gt;This is implemented by a combination of the "doPermissionChecks=false" mentioned above and running the "processResultsSinglePage" function in "search.lib.js" as admin to get all results and then clearing out some sensitive fields in the "getDocumentItem" function if the user is not in the admin role.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;Executing the search as "admin" will NOT prevent overhead for permission checking on the Repository tier when doPermissionChecks is disabled. First of all, "admin" is a normal user and does not short-circuit permission checks. Second, "processResultsSinglePage" is too late to ensure that results won't be filtered incorrectly for your kind of customisation, as results are immediately permission checked / filtered as part of the "search.queryResultSet(queryDef)" call.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 23 Jan 2017 07:26:51 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3806#M1528</guid>
      <dc:creator>afaust</dc:creator>
      <dc:date>2017-01-23T07:26:51Z</dc:date>
    </item>
    <item>
      <title>Re: Solr4 query threading behaviour</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3807#M1529</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Authentication as admin seemed to work fine. Any suggestions on how to avoid that overhead if we want public search but private full document access?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 23 Jan 2017 16:34:06 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3807#M1529</guid>
      <dc:creator>chkk</dc:creator>
      <dc:date>2017-01-23T16:34:06Z</dc:date>
    </item>
    <item>
      <title>Re: Solr4 query threading behaviour</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3808#M1530</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The use case itself does not make much sense: Why allow people to search/see all documents that they then may not be able to access at all? Something like this cannot really be handled efficiently with out-of-the-box Alfresco without messing with the permission model / SOLR ACL indexing.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you want to continue with the current dirty hack, then you can use "System" as the runAs-user instead of "admin". At least that way the permission checks are properly short-circuited and overhead is minimized. Again, the runAs context needs to include the actual query call, not just the post processing...&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 23 Jan 2017 17:24:27 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3808#M1530</guid>
      <dc:creator>afaust</dc:creator>
      <dc:date>2017-01-23T17:24:27Z</dc:date>
    </item>
    <item>
      <title>Re: Solr4 query threading behaviour</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3809#M1531</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Checked the code, actually that was already running as "System" ( ...&amp;nbsp;&lt;EM&gt;AuthenticationUtil.getSystemUserName&lt;/EM&gt;() ... ), so that was not the problem. I will try your suggestion to do the "RunAs" directly in the query call to see if that makes a difference.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The use-case for this scenario is that people may request access to the documents based on the search results: the search output is a custom table showing only a couple of metadata columns from the documents. The information in these columns is considered public, but the full document data is restricted based on the document's location and the user's group memberships.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So would it be better to "fake" public access at indexing time?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 25 Jan 2017 07:28:44 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3809#M1531</guid>
      <dc:creator>chkk</dc:creator>
      <dc:date>2017-01-25T07:28:44Z</dc:date>
    </item>
    <item>
      <title>Re: Solr4 query threading behaviour</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3810#M1532</link>
      <description>&lt;P&gt;You need more RAM and CPU&lt;/P&gt;</description>
      <pubDate>Fri, 19 Feb 2021 12:18:16 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/solr4-query-threading-beaviour/m-p/3810#M1532</guid>
      <dc:creator>evaaa</dc:creator>
      <dc:date>2021-02-19T12:18:16Z</dc:date>
    </item>
  </channel>
</rss>

