cancel
Showing results for 
Search instead for 
Did you mean: 

Cleanup Activiti history tables and act_ge_bytearray

ifischer
Champ in-the-making
Champ in-the-making
We are using Activiti for quite a long time now (since 2011), version 5.9. Now our Activiti database is getting larger and larger (currently ~22GB).
Due to some unfortunate circumstances, I cannot use the Activiti Java API to cleanup the history, I can only use SQL.

I found out that these are the tables that consume most space:
public.act_ge_bytearray 11 GB
public.act_hi_detail 5671 MB
public.act_hi_actinst 5003 MB

Now we want to remove history data to reduce the size of the database, which should be save according to this post http://forums.activiti.org/content/best-way-cleanup-history if we are not using the HistoryService, which is the case.

I already know how I can cleanup the hi_detail and hi_actinst tables - there I guess I can remove all rows that link to finished processes using the query:
select min(start_time_) from act_hi_actinst p where exists (select * from act_ru_execution e where p.proc_inst_id_=e.proc_inst_id_);
…so I will remove all hi_detail and hi_actinst entries older than the result of this query.

But I found no way to cleanup public.act_ge_bytearray as the schema does not link to any process or act_inst.

So my questions are:
is it possible at all to find a safe way to cleanup the act_ge_bytearray table via SQL? Or is it only possible via the HistoryService API?
What can happen in general if we manually cleanup the activiti history and the one bytearray rows?

Thanks in advance!

4 REPLIES 4

jbarrez
Star Contributor
Star Contributor
Ok, so if you aren't using history, you can indeed remove history. For your query: check the runtime tables for executions that don't exist anymore. These can be simply removed in the history tables.

11GB sounds like a lot for the byte array. Besides process xml and process images, what do you guys store in it together with the process?
But the rule of thumb of that the byte arrays linked to a process definition you don't use anymore can be deleted.

To be safe, I would check what the repositoryService.deleteDeployment() does (check the SQL logging) and see if you can mimic it.

ujjwalchoudhari
Champ in-the-making
Champ in-the-making
1. Does any body got the way to clear ACT_GE_BYTEARRAY table? how to find the records from bytearray table to be deleted? what is the relation of bytearray table and ACT_HI_ACTINST table in DB? How can I fetch records from ACT_GE_BYTEARRAY table starting from ACT_HI_ACTINST table?

2. How to find what all history data i can delete? can anybody post any query ? what all checks i need to keep in mind regarding date or anything else?

3. Data in my ACT_HI_DETAIL table is empty. Does this mean that all the History data is not of any use for me?

ujjwalchoudhari
Champ in-the-making
Champ in-the-making
I have few doubts . Could you please help on the below mentioned points :-

1. Does any body got the way to clear ACT_GE_BYTEARRAY table? how to find the records from bytearray table to be deleted? what is the relation of bytearray table and ACT_HI_VARINST table in DB? How can I fetch records from ACT_GE_BYTEARRAY table starting from ACT_HI_VARINST table?


2. I executed the above mentioned query

select min(start_time_) from act_hi_actinst p where exists (select * from act_ru_execution e where p.proc_inst_id_=e.proc_inst_id_);

and i am not getting any record in my ACT_HI_VARINST table older then this min date. What does this mean? My history level is default.

3. Data in my ACT_HI_DETAIL table is empty. Does this mean that all the History data is not of any use for me?

jbarrez
Star Contributor
Star Contributor
1. This means that you are storing byte array variables, most likely serialized java objects. To get those values, you'd need to start follow the FK in the ACT_HI_VARINST table.

2. Once a process is finished, it is removed from act_ru_execution, hence why you won't find anything

3. that table is only filled when running on the highest history level