cancel
Showing results for 
Search instead for 
Did you mean: 

FTS SQL query causing CPU spikes on database server

Ryan_Wakefield
World-Class Innovator
World-Class Innovator

So we have been getting a lot of reports about our SQL taking up 99% of our 8 CPU cores (4 physical, 8 hyper-threaded). With the assistance of our DBA's, we were able to determine the issue is with our Full-Text Search servers. Right now we have about 900k documents in day-forward to be OCR'd and this is because we are doing a ton of conversions with some old legacy systems and storing the documents in OnBase.

 

However, the problem is that the query that I will post below is being run so often (roughly every 30-90 seconds) that it causes out CPU usage to go from the 5-15% we are using to 99%.

 

SELECT        tbl1.hscatalognum,        tbl1.itemnum,        tbl1.lastupdate,        tbl1.ftobjectid,        tbl1.docrevnum,        tbl1.filetypenum,        tbl1.status FROM hsi.hscatalogdocs tbl1 LEFT OUTER JOIN hsi.itemdatapage tbl2        ON tbl1.itemnum = tbl2.itemnum               AND tbl1.docrevnum = tbl2.docrevnum               AND tbl1.filetypenum = tbl2.filetypenum WHERE tbl2.itemnum IS NULL

 

From what I gather, the query above is a query to see what additional documents need to be processed still, but I could be wrong.

 

I did determine the cause to be when I am running 2+ FTS servers at a time. We have a total of 3 FTS servers because in order to keep up with the amount of documents coming into our system, we can't go any lower or else we will never be able to keep up. (I have even considered spinning up a 4th FTS servers because of the load that we have right now coming in.)

 

Our virtual FTS servers have 12 vCPU  cores, 16GB of RAM, and our database servers has 4 physical (8 hyper-threaded) CPU cores, and 64GB of RAM. We have not applied any "optimizations" on our FTS servers yet (as can be found on page 30 of the EP3 Full-Text Search MRG file). I'm not opposed to applying some, but due to the nature of the issue I am not sure which ones might be beneficial to do. We are running FTS 20.3.16.1000 right now. 

 

And yes, I know I need to submit a ticket to FLOS, but the fact that it typically takes them a week before I hear back anything is why I am preferring to post my question here first. And I know there are newer releases of EP3, but there are absolutely no release notes that would point to any optimizations or bugs being found in the FTS that would warrant the upgrading of our FTS version we are running.

 

 

Any suggestions on what to look at or what optimizations to apply or anything really?

 

Thanks.

1 ACCEPTED ANSWER

Mike_Walkuski
Employee
Employee

To circle back on this. Ryan contacted Technical Support and, after further investigation, determined that the Parallelism setting on the SQL Server was incorrect per the Database Reference Guide. Once that was corrected performance improved not only for the Full Text Server, but in other areas of the environment as well.

 

Parallelism
Changing the parallelism option to use only one processor will eliminate parallelism as an option for the query optimizer in its costing algorithm, and it will prohibit SQL Server from executing a single query across multiple processors. Unless the system is experiencing unbounded searches (user-executed or other), OnBase queries are structured to be selective in nature to avoid long execution times. Generally, no query should take longer than 1 second to complete, given optimal factors and appropriately unique input criteria. With the expected index usage and optimal maintenance procedures in place (e.g., up-to-date statistics, etc.), the server should not experience long-running queries.


If you prohibit all processors from being dedicated to a single query, queries that are errant (either due to poor query plans or inexperienced users) may not impact other users as significantly. If a single query is permitted to execute across all processors, access to the system can be severely limited for the remaining users.

 

  • Recommendation: Because queries generated by OnBase do not benefit from parallelism, and this setting has at times demonstrated errant behavior, parallelism should be disabled.
    • Within SQL Server 2000, right-click on the server instance and select Properties, and then select the Processor tab. In the Parallelism section, enable the option for Use 1 processor(s) to disable parallelism (any value higher than 1 enables parallelism).
    • Within SQL Server 2005 and SQL Server 2008, this setting is referred to as Max Degree of Parallelism. Right-click on the server instance and select Properties, and then select the Advanced page. (Set the value to 1 to disable parallelism.)

View answer in original post

9 REPLIES 9

MichaelBertrand
Star Collaborator
Star Collaborator

At the base of it, I believe that the issue in the query is the WHERE outside the join, it should be part of the join. It's a question of when the filter (IS NULL) is performed, after doing the join of tb1l and tbl2 or before.

 

That said, I understand that you may not have control over the query generation.  You should at the least be able to try each version of the query and analyse the plan execution statistics. 

 

LEFT OUTER JOIN hsi.itemdatapage tbl2        ON tbl1.itemnum = tbl2.itemnum               AND tbl1.docrevnum = tbl2.docrevnum               AND tbl1.filetypenum = tbl2.filetypenum                AND tbl2.itemnum IS NULL

 

I have an SQL profiler set up to spot this sort query in various applications. 

@Michael Bertrand ,

 

If you don't mind, could you elaborate a little more on what you mean by you have a SQL profiler setup to spot this sort of query? I would be really interested in hearing more about it. You could post on here on Community or reach out to me directly (email is on my Community profile).

 

Thanks!

Here I am talking about the MS SQL profiler. I am sure that Oracle has an equivalent.

 

In the Events select I choose the various Completed items from the Stored Procedure and TSQL sections. Select all the columns.  In the Column Filters I have one set on the TextData column with a filter value for the Like section of %join%where%  , So, any query that has the words "join" followed anywhere by "where with get captured. 

 

I will also add other filters to help keep the noise down, such as the duration , to get the longer running queries, something like 5 seconds (5000 ms) is where I start. You can also filter by login name, but specifically with OnBase, given the default configuration this may not work as expected. 

 

There are much better tutorials on using Profiler out there, that can help you go go deeper. 

 

That said, if you have control of the query / query building process you can take action, but if this baked in to the FTS or whatever application there will be little that you can fix and you may need a workaround. 

 

 Here is a screenshot of the basic setup - SQL Profiler Jin Where

Mike_Walkuski
Employee
Employee

To circle back on this. Ryan contacted Technical Support and, after further investigation, determined that the Parallelism setting on the SQL Server was incorrect per the Database Reference Guide. Once that was corrected performance improved not only for the Full Text Server, but in other areas of the environment as well.

 

Parallelism
Changing the parallelism option to use only one processor will eliminate parallelism as an option for the query optimizer in its costing algorithm, and it will prohibit SQL Server from executing a single query across multiple processors. Unless the system is experiencing unbounded searches (user-executed or other), OnBase queries are structured to be selective in nature to avoid long execution times. Generally, no query should take longer than 1 second to complete, given optimal factors and appropriately unique input criteria. With the expected index usage and optimal maintenance procedures in place (e.g., up-to-date statistics, etc.), the server should not experience long-running queries.


If you prohibit all processors from being dedicated to a single query, queries that are errant (either due to poor query plans or inexperienced users) may not impact other users as significantly. If a single query is permitted to execute across all processors, access to the system can be severely limited for the remaining users.

 

  • Recommendation: Because queries generated by OnBase do not benefit from parallelism, and this setting has at times demonstrated errant behavior, parallelism should be disabled.
    • Within SQL Server 2000, right-click on the server instance and select Properties, and then select the Processor tab. In the Parallelism section, enable the option for Use 1 processor(s) to disable parallelism (any value higher than 1 enables parallelism).
    • Within SQL Server 2005 and SQL Server 2008, this setting is referred to as Max Degree of Parallelism. Right-click on the server instance and select Properties, and then select the Advanced page. (Set the value to 1 to disable parallelism.)

As Mike pointed out, this was the biggest cause of our problem. However, what more so is frustrating is that we only found this out by having Hyland run their utility that you even had to have FLOS put a password in to access the utility itself.

 

In my opinion, it's the fact that Hyland has these kind of utilities to run, but that we as customers don't have access to them makes me wonder what other utilities are out there that I could run against my system to validate configurations are correct on other things. Just annoying is all....