cancel
Showing results for 
Search instead for 
Did you mean: 

FTS SQL query causing CPU spikes on database server

Ryan_Wakefield
World-Class Innovator
World-Class Innovator

So we have been getting a lot of reports about our SQL taking up 99% of our 8 CPU cores (4 physical, 8 hyper-threaded). With the assistance of our DBA's, we were able to determine the issue is with our Full-Text Search servers. Right now we have about 900k documents in day-forward to be OCR'd and this is because we are doing a ton of conversions with some old legacy systems and storing the documents in OnBase.

 

However, the problem is that the query that I will post below is being run so often (roughly every 30-90 seconds) that it causes out CPU usage to go from the 5-15% we are using to 99%.

 

SELECT        tbl1.hscatalognum,        tbl1.itemnum,        tbl1.lastupdate,        tbl1.ftobjectid,        tbl1.docrevnum,        tbl1.filetypenum,        tbl1.status FROM hsi.hscatalogdocs tbl1 LEFT OUTER JOIN hsi.itemdatapage tbl2        ON tbl1.itemnum = tbl2.itemnum               AND tbl1.docrevnum = tbl2.docrevnum               AND tbl1.filetypenum = tbl2.filetypenum WHERE tbl2.itemnum IS NULL

 

From what I gather, the query above is a query to see what additional documents need to be processed still, but I could be wrong.

 

I did determine the cause to be when I am running 2+ FTS servers at a time. We have a total of 3 FTS servers because in order to keep up with the amount of documents coming into our system, we can't go any lower or else we will never be able to keep up. (I have even considered spinning up a 4th FTS servers because of the load that we have right now coming in.)

 

Our virtual FTS servers have 12 vCPU  cores, 16GB of RAM, and our database servers has 4 physical (8 hyper-threaded) CPU cores, and 64GB of RAM. We have not applied any "optimizations" on our FTS servers yet (as can be found on page 30 of the EP3 Full-Text Search MRG file). I'm not opposed to applying some, but due to the nature of the issue I am not sure which ones might be beneficial to do. We are running FTS 20.3.16.1000 right now. 

 

And yes, I know I need to submit a ticket to FLOS, but the fact that it typically takes them a week before I hear back anything is why I am preferring to post my question here first. And I know there are newer releases of EP3, but there are absolutely no release notes that would point to any optimizations or bugs being found in the FTS that would warrant the upgrading of our FTS version we are running.

 

 

Any suggestions on what to look at or what optimizations to apply or anything really?

 

Thanks.

1 ACCEPTED ANSWER

Mike_Walkuski
Employee
Employee

To circle back on this. Ryan contacted Technical Support and, after further investigation, determined that the Parallelism setting on the SQL Server was incorrect per the Database Reference Guide. Once that was corrected performance improved not only for the Full Text Server, but in other areas of the environment as well.

 

Parallelism
Changing the parallelism option to use only one processor will eliminate parallelism as an option for the query optimizer in its costing algorithm, and it will prohibit SQL Server from executing a single query across multiple processors. Unless the system is experiencing unbounded searches (user-executed or other), OnBase queries are structured to be selective in nature to avoid long execution times. Generally, no query should take longer than 1 second to complete, given optimal factors and appropriately unique input criteria. With the expected index usage and optimal maintenance procedures in place (e.g., up-to-date statistics, etc.), the server should not experience long-running queries.


If you prohibit all processors from being dedicated to a single query, queries that are errant (either due to poor query plans or inexperienced users) may not impact other users as significantly. If a single query is permitted to execute across all processors, access to the system can be severely limited for the remaining users.

 

  • Recommendation: Because queries generated by OnBase do not benefit from parallelism, and this setting has at times demonstrated errant behavior, parallelism should be disabled.
    • Within SQL Server 2000, right-click on the server instance and select Properties, and then select the Processor tab. In the Parallelism section, enable the option for Use 1 processor(s) to disable parallelism (any value higher than 1 enables parallelism).
    • Within SQL Server 2005 and SQL Server 2008, this setting is referred to as Max Degree of Parallelism. Right-click on the server instance and select Properties, and then select the Advanced page. (Set the value to 1 to disable parallelism.)

View answer in original post

9 REPLIES 9

Mike_Walkuski
Employee
Employee

Hey Ryan,

 

That query is part of our cleanup tasks to remove any documents that have been purged (removed from itemdatapage) that may not have been removed from the hscatalogdocs table. It will set those documents to the PurgeorDelete status and they will be removed from the catalog as part of standard processing. 

 

What's interesting is how often you are seeing the query. This query should only run once at each polling interval, which by default and the lowest possible value is, is 5 minutes. Since you have 3 servers, you could see 3 queries all within the same second, but they could also be spread out over as many as 100 seconds (600 second polling interval/3 servers). 

 

So the big question is, are you seeing them more often than 3 times within any given 5 minute period, once for each server? If so, then I would work with Support to get it tested internally and potentially a bug created. If not, then the software is most likely functioning to spec.

 

Unfortunately there most likely won't be anything that can be done about the query as it is needed to find any orphaned docs in the catalogs that need to be removed. However, one way to alleviate the frequency of the queries would be to increase your polling interval in the Full Text Server config file.

 

I hope this helps!

Hey @Mike Walkuski ,

 

From what I understand, it might be running that every 5 minutes. Unfortunately, running 3 FTS servers with the specs they are running isn't able to keep up. I started my day with around 900k documents in the pending state and then just checked and I have 1.1mil in pending. What's crazy is I have around 1.7mil In-Progress. So what this is saying to me is that while yes, it is really really easy to scale our your FTS solution, it definitely means that you will need to increase your resources (especially CPU wise) on your database to compensate for any additional servers.

 

I say this because when I only had one server running, I never had a spike. When I had 2 servers running, I had a spike once everything 5-10 minutes if not longer. Once I started up my third server, I now get the spike every 30-60 seconds. And if I were to add a 4th FTS server to help getting us caught up in our Day Forward Pending queue, I would have a spike out constantly which could mean I could be seriously SERIOUSLY adversely affecting our normal system performance.

 

So it definitely sounds like to me there is a lot of room for optimization, especially when it comes to this whole running of that cleanup query because telling us to increase the amount of CPU cores (as well as having to purchase additional SQL licensing because it is licensed by the amount of cores) is not a good answer in my opinion.

 

Thanks.

Ediaz
Content Contributor
Content Contributor

Hi Ryan.

 

Unfortunately I do not have a solution, however, in order to avoid these inconveniences in the future in other solutions I have a question, Are you doing Full text to all the Document Type that enter your system? Are there possibilities to do it to less Document Type?

Thanks in advance.

@Edwin Diaz ,

 

Yes, we are doing FTS on almost all documents that are in our system. This is because roughly 75% of the document types in our system are for medical purposes (clinical and financial) and thus the ability to use FTS is needed. So while I agree that it would be nice to reduce the amount of document types to process, I don't see that being as easy as I would hope.

 

As more of a general statement, this kind of thing should be something that Hyland should be well aware of with this module. They touted about how easily this new replacement to IDOL would be so much easier to scale, use, manage, etc. However, I have ran into more and more hurdles and issues than I care to admit. The FTS is good, don't get me wrong, but there is a BUNCH more things they need to add to it to make it better and less manual maintenance cumbersome.

 

Thanks.