cancel
Showing results for 
Search instead for 
Did you mean: 

How many keywords should be in a Keyword Type Group?

Ansley_Ingram
Elite Collaborator
Elite Collaborator

Using Keyword Type Groups as opposed to “loose” keywords can improve performance by reducing the number of round trips to the database for retrieval and indexing. The performance gains and caveats of using Keyword Type Groups have been well documented in the Module Reference Guides and other documentation but there are occasionally questions that exceed that basic message of “1 trip to the database is more efficient than 10 trips”.

The question “Is there a limit to the number of keywords in a Keyword Type Group” has come up a few times recently. While there are some technical limitations that are posed by the database platform, the question itself begs for further exploration of the solution design and actual use of the keywords in question. The true short answer is, “It depends.”

The first technical limitation is the limitation on the number of columns allowed in a table - 1024 for SQL Server 2005 and earlier; 30,000 for SQL Server 2008 and later; 1000 for Oracle 10g and 11g. This first limitation would not typically pose a problem as a solution would not typically have anywhere near 1000 keyword types on a document. (If you do, we should probably have a separate discussion.)

The next technical limitation, for SQL Server implementations, is the ultimate size of the row on the database’s data page. The maximum row size in SQL Server version 7 and higher is 8060 bytes. The quickest way to reach this would be to create a Keyword Type Group with 33 single alpha keywords each set to 250 characters. Again, this scenario is highly unlikely but it is possible. (FYI, SQL Server will actually prevent you from creating a table whose columns exceed the max row size.)

For Oracle implementations, the database block size ((typically 8KB for an OnBase installation, but configurable) does not control the maximum row size. Instead, the 33 keyword scenario mentioned previously would be possible because Oracle has the capability of row chaining. Row chaining can lead to its own set of performance issues though so exceeding an 8KB row size would still not be a standard recommendation.

Aside from the technical limitations, the number of keywords in a keyword type group comes down to evaluating the business process and understanding the actual use of the various keywords.

For example’s sake, let’s say that we have a document type that has 15 keyword types that are potential candidates for a Keyword Type Group. In our example, the keywords are of a variety of data types but all single table keywords. The questions below apply to each individual keyword being reviewed.

The first question to consider is, “Will there only ever be one instance of this keyword type on this document type?” If users will often use the F6 functionality to add an additional instance of one of the keyword types then that keyword should be left out of the Keyword Type Group. (If all of the keyword types may have multiple instances, this may be a good time to look into Multi-Instance Keyword Type Groups.)

The next question to consider is, “Will this keyword on this document type be used as part of a Workflow filter?” If so, then the rest of the keywords should be analyzed and then perhaps separated into two Keyword Type Groups – one containing the keywords displayed in the Workflow filter, and another containing the remaining keywords. The reasoning behind this is that when the Workflow filter is loaded, all of the display keywords for all of the documents in the queue will be loaded into memory. If the keywords are all in one keyword type group, then extra keyword values will be loaded into memory which would become a performance consideration with a large number of documents.

Another question to consider is, “Will this keyword be used for retrieval?” If so, then similar to the Workflow filter suggestion, it may be a good also idea to separate retrieval keywords from non-retrieval keywords. Further review of the retrieval keywords and retrieval tactics could also lead to identifying a subset of keywords that are always used for retrieval that might benefit from a compound index.

Also important to remember: a keyword type can belong to more than one Keyword Type Group. One document type might benefit from a Keyword Type Group containing keyword types X, Y and Z while another might be better configured with a KTG for X and Y leaving Z loose.

As an OnBase Administrator and/or a solution designer, you have the ability and responsibility to find out as much as possible about the business process and use to make the best design decisions possible and turn those “It depends” moments into a true answer starting with “In this case”.

Please feel free to discuss these considerations and share your own thoughts about Keyword Type Group design!

1 REPLY 1

Jonathan_Perreg
Confirmed Champ
Confirmed Champ

Something else that must happen is careful planning and a good database backup created just prior to KTG configuration.

Once a KTG has been created, you cannot append or remove keywords from it. If you make a mistake during the creation of the KTG, your best bet is to roll back to your fresh database backup you made just prior to configuration. Plan for this possibility! If the database is very large, plan on wasting a few hours performing database recovery as a worst-case scenario.

If despite your planning you cannot roll back to a viable database backup after making a mistake, you do briefly have the ability to de-assign the KTG from the document type and revert back to using stand-alone keywords without data loss. You could then "retire" the misconfigured KTG (rename it, since deletion is not an option) and create a new KTG using the correct keywords. This window of opportunity may be very short, though. As soon as you commit to actually using your KTG and new documents are processed by the system, you cannot de-assign the KTG to correct an error without also blowing away all keyword data on newly-created/indexed documents. (Old documents existing prior to KTG creation would still retain keyword values from the stand-alone configuration.)

Try to plan ahead as much as possible. If a future business process alters the keywords you are about to migrate, consider spending a little time coming up with the future keyword configuration now, then lock it into the KTG. Of course this may not be possible. If the process changes in the future, you'll be forced to add stand-alone keywords and retire unused keywords by hiding them via Access Restricted Keywords.

One other thing to consider is if Report Services reports were developed that hit your old keywords, the reports will continue to "work" and may even seem "correct" for a little while when a KTG is in place. This is because the reports would be querying the stand-alone keyword tables, which still retain all your old data. After a while, someone may notice that certain information in the reports no longer seem to be updating. It's because all new data is going into the KTG table, not the stand-alone tables. All of those reports would need to be redeveloped using entirely new SQL statements.