cancel
Showing results for 
Search instead for 
Did you mean: 

Data Model Review

robin1
Champ in-the-making
Champ in-the-making
Hi guys,

I've been taking a look at the Data Model for Activiti and noticed some areas that can be improved.  If I'm off on any of these, please comment back.

The Activiti tables fall into 3 categories: 

•   Identity
•   Engine
•   History

The Identity tables are:
•   ACT_ID_GROUP
•   ACT_ID_MEMBERSHIP
•   ACT_ID_USER
•   ACT_ID_INFO

The Engine tables are:
•   ACT_GE_PROPERTY (Activiti version)
•   ACT_GE_BYTEARRAY (Process version)
•   ACT_RE_DEPLOYMENT (When and how a process is deployed)
•   ACT_RE_PROCDEF (Process definition)
•   ACT_RU_EXECUTION (Instance of a process)
•   ACT_RU_JOB (Job within a process)
•   ACT_RU_TASK (Tasks within a process)
•   ACT_RU_IDENTITYLINK (Link between Engine and Identity)
•   ACT_RU_VARIABLE (Variables for tasks)
•   ACT_RU_EVENT_SUBSCR (Process events)

Generally speaking, the relationships between the tables are all messed up.  On a high level:  a process has executions, an execution has tasks, a task has variables.
•   ACT_RU_TASK shouldn’t have the field PROC_DEF_ID_ (it’s redundant)
•   ACT_RU_VARIABLE shouldn’t have the field EXECUTION_ID_  (it’s redundant)
•   There should be a foreign key constraint between ACT_RE_PROCDEF. ID_ and ACT_RU_EXECUTION. PROC_DEF_ID_

It’s really bothering me that there are no foreign key constraints between any of the Identity tables and the Engine tables.

There should be foreign key constraints between:
•   ACT_RU_IDENTITYLINK. GROUP_ID_
•   ACT_ID_GROUP.ID_

There should be foreign key constraints between:
•   ACT_ID_USER.ID_
•   ACT_RU_IDENTITYLINK. USER_ID_
•   ACT_RU_TASK. OWNER_
•   ACT_RU_TASK. ASSIGNEE_

There should be a foreign key constraint between:
•   ACT_RU_EXECUTION.ID_
•   ACT_RU_JOB. EXECUTION _ID_

There should be a foreign key constraint between:
•   ACT_RU_TASK.ID_
•   ACT_RU_VARIABLE. TASK_ID_

The following fields are redundant:
•   ACT_RU_EXECUTION. PROC_INST_ID_ (same as ID_)
•   ACT_RU_TASK. PROC_INST _ID_ (same as EXECUTION _ID_)
•   ACT_RU_VARIABLE. PROC_INST _ID_ (same as EXECUTION _ID_)
•   ACT_RU_JOB. PROCESS_INSTANCE _ID_ (same as EXECUTION _ID_)

The History tables are:
•   ACT_HI_PROCINST (History for Process Executions/Instances)
•   ACT_HI_ACTINST (History of Process Actions)
•   ACT_HI_TASKINST (History of Process Tasks)
•   ACT_HI_DETAIL (Doesn’t seem to be used – should be history of Task Variables)
•   ACT_HI_COMMENT (Task comments)
•   ACT_HI_ATTACHMENT (Task URL???  No idea)

There are no foreign key constraints on any of the History tables.  Not between each other.  Not to the Identity tables.  This isn’t a big deal.

I ran a bunch of workflows on my local database a number of times assigning tasks to different groups.  These History records are storing surprisingly little of the user information.  ACT_HI_TASKINST. ASSIGNEE_ and ACT_HI_ACTINST.ASSIGNEE_ are  the only fields that are ever populated but most of the time the value is null. 

Upon further review, I found that ACT_RU_TASK. OWNER_ and ACT_RU_TASK. ASSIGNEE_ are always null.  I imagine these field are no longer required and this information is now stored in ACT_RU_IDENTITYLINK.  The problem is, when a task is assigned to a group, GROUP_ID_ is not stored in any of the history tables.

For some reason USER_ID_ of the person who performs a task is rarely populated in the history tables?
4 REPLIES 4

frederikherema1
Star Contributor
Star Contributor
the relationships between the tables are all messed up

I would be careful making statements like these Smiley Wink My two cents on this:

1. Looking up tasks by process-definition ID is a VERY common use case. Since a single process can have multiple child-executions (and that can have more children as well), this would require extra joins (execution -> process instance -> process definition). We decided to add extra column… The increase in performance is more relevant than the extra column.

2. Since activity DOESN'T INFORCE CHECKING on candidates and groups on a task, there are no foreign keys. This is an explicit choice rather dat just "being all messed up". This way, activiti can also be integrated better with other identity-management solutions whiteout explicit ally having to have physical "user" and "group" records available. Keep in mind, activiti.jar is an engine, not a full product and is most likely to be used inside another apps, which can have their own ACL stuff…

3. A task is never "assigned" to a group… A group can be "candidate" for the task. So if no user explicitly claims a task, the finished task will have NO assignee. Again, activiti is a library. You could easily enforce your services to first CLAIM a task before completing. This way, assignee is always preserved (this is done in alfresco, for example).

4. The task-history contains all there is to the task, we're not going to invent fields Smiley Wink Task-variables CAN be stored as del in history depending on the history-settings…


I hope these answers take a way some of the concerns raised by this post?

robin1
Champ in-the-making
Champ in-the-making
Thanks for the reply frederikheremans - very well written response and I'm glad to see that clearly a lot of though went into the data model.

I've got a quick follow up question and I think it leads to something I'm not understanding - should there be a foreign key constraint between
ACT_RE_PROCDEF. ID_ and ACT_RU_EXECUTION. PROC_DEF_ID_?

Robin

frederikherema1
Star Contributor
Star Contributor
Since there shouldn't be any executions/process-instances for an inexistent process-definition. When using the API, there is no way of deleting a process-definition without cascading all process-instances and there sub-executions. So in theory, constraints are enforced by the engine.

However, in some queries, they are joined (e.g.. ExecutionQuery.processDefinitionKey(…)) and it would definitely make sense to add a foreign-key constraint and/or and index on those fields. https://jira.codehaus.org/browse/ACT-1279

bernd_ruecker
Champ in-the-making
Champ in-the-making
[x] done