Workflow Status Reporting

brian_robinson
Champ in-the-making
Hi all,

I just posted about workflow status reporting on my blog (http://robinsontechnology.com/blog/2009/04/20/alfresco-workflow-status-reporting/), and I'm looking for some feedback.  Below is a copy of what I wrote.  Feel free to reply here or on the blog if you have ideas or suggestions.  Thanks!

As part of the Alfresco Consulting team, I’ve done some work on web scripts for reporting workflow status, keyed off of custom workflow metadata.  At the AIIM conference a few weeks ago, I was asked multiple times about Alfresco’s ability to expose workflow reports. Last week I came across a post from Kas Thomas who wondered, “Does workflow always have to suck?”, which touched on workflow reporting.  My conclusion after all of this: we need to build workflow status reporting. And I’m going to start on it.

Now there are a lot of ways in which we could build this out.  First, we need to consider the reporting context.  Do we want to report based on:

    * the system as a whole (“show me all workflow status in the entire system”)
    * web content management use (“show me all workflow status for web project X”)
    * user (“show me how user ‘bobsmith’ is involved in workflow”)
    * group (“show me how group ‘marketing’ is involved in workflow”)
    * asset type (“show me all workflow status pertaining to ‘.jpg’ files”)
    * asset (“show me all workflow status related to this particular document”)
    * path (“show me all workflow status related to assets in/under this path”)
    * workflow (“show me all status related to all instances of this named workflow”)
    * workflow task (“show me all status related to all instances of this named workflow task”)
    * something else I haven’t thought of yet (please comment here if you have other suggestions)

Since I want to get something going that will be of the most use to the most people, I’m going to start with reporting by user, then by group.  My next consideration will be exactly what to report.  I have not used any other workflow software besides Alfresco, so if you’ve got input on this, I would love to hear it (please comment below).  That said, so far I’m thinking that given a user id, the report will show:

    * the number of currently active workflows the user is or may be involved in (i.e. including pooled tasks assigned to a group that the user is a part of)
    * the number of currently active workflow tasks the user has assigned to them

Given a user id and a date range, the report will show:

    * the number of completed tasks
    * the number of completed workflows
    * the average duration of each completed task
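
As a rough illustration of the date-range metrics listed above, here is a minimal Python sketch. The `Task` record, its field names, and the `user_report` function are hypothetical stand-ins for whatever the workflow engine actually exposes; none of this is Alfresco API code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Task:
    # Hypothetical record; field names are assumptions, not Alfresco's schema.
    workflow_id: str
    assignee: str
    started: datetime
    completed: Optional[datetime] = None  # None while the task is still active


def user_report(tasks, user, start, end):
    """Summarize tasks completed by `user` with start <= completed < end."""
    done = [
        t for t in tasks
        if t.assignee == user
        and t.completed is not None
        and start <= t.completed < end
    ]
    total = sum((t.completed - t.started for t in done), timedelta())
    return {
        "completedTasks": len(done),
        # Simplification: counts workflows in which the user completed a task,
        # not workflows that finished as a whole.
        "workflowsTouched": len({t.workflow_id for t in done}),
        "averageTaskDuration": total / len(done) if done else None,
    }
```

Counting completed workflows properly would need the workflow's own end date, which this sketch does not model.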

Better still, given a user id, date range (Jan 1, 2008 - December 31, 2008), and recurring time period (monthly), the report could show a bar chart or line graph of the aforementioned metrics.  Perhaps you can even add the ability to compare one user to another on those metrics.

A few things are notably missing so far.  Nowhere have I mentioned the ability to key off of custom workflow metadata, or report on custom workflow metadata. Perhaps the former could be addressed via the inclusion of a name/value pair that serves as a filter.  For example, providing {http://www.mycompany.com/model/my-workflow/1.0}/customId=42 could serve to filter out any workflow that does not have that custom metadata set on the workflow’s start task.  Reporting custom workflow metadata probably doesn’t make sense as part of a “by user” oriented report, but probably does make sense as part of a named workflow report.  What do you think?
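
To make the name/value filter idea concrete, here is a small Python sketch. It assumes each workflow is a plain dictionary with a hypothetical `startTask` entry holding the start task's metadata; the real data model will differ, and in practice the name would be a fully qualified property name rather than the short `customId` used here.

```python
def filter_by_metadata(workflows, name, value):
    """Keep only workflows whose start-task metadata has `name` set to `value`.

    `workflows` is a list of dicts; the "startTask" key is an assumption
    made for this sketch, not an Alfresco structure.
    """
    return [
        wf for wf in workflows
        if wf.get("startTask", {}).get(name) == value
    ]


# Example: only the workflow with customId 42 on its start task survives.
workflows = [
    {"id": "wf1", "startTask": {"customId": "42"}},
    {"id": "wf2", "startTask": {"customId": "7"}},
    {"id": "wf3", "startTask": {}},
]
matching = filter_by_metadata(workflows, "customId", "42")
```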

Hopefully this post will drum up some food for thought, and ideally some feedback on what Alfresco users might want from a workflow reporting perspective.  Please feel free to share your thoughts on the subject by commenting here or participating in the workflow forum at http://forums.alfresco.com/en/viewforum.php?f=34.

darioruizlopez
Champ in-the-making
Just two suggestions:

1. It would be good if the reporting tool could provide an intermediate output in XML format, so that developers can prepare their own output from it, for example as a presentation template.

2. I think the fields to report depend heavily on the type and intent of the report. For example, in a business with a single workflow and a single manager, the user might be interested only in a simple statistic such as the number of workflows executed. In a business with several workflows, groups, and several users per group, it might be useful to provide the description of the workflow, the user who launched or approved it, and the group it belongs to. For this reason, I think it could make sense either to provide a way for the invoker to specify the output fields or, alternatively, to provide several types of report depending on their intent.

brian_robinson
Champ in-the-making
Hi Dario,

Great feedback, thanks!  I was thinking along the same lines regarding an intermediate output format (right now I'm thinking JSON), which would enable different presentation layers to format it appropriately (e.g. a Share dashlet, Liferay Portlet, etc.).

I've been thinking about the different types of workflows as well, and I agree that the reporting requirements are likely to differ in various situations.  I'm thinking it'll probably be several reports, with the ability to handle workflows generically down the road.  For example, any workflow can have any number of tasks; some may have 3 tasks, others 15 tasks.  The reporting mechanism should be built generically enough to support each of those cases and provide value to the viewer.

The idea of having the user specify what fields they want to see is interesting.  That could be handled by a (slow) web script that returns all data available (which in itself would be challenging to implement).  Then, using the intermediate format, a slick presentation layer application could deal with the user interaction (what fields they want to see) and slice and dice the data appropriately.  Actually, such an approach could handle ALL workflow reporting, but my guess is that the amount of data and poor performance would become prohibitive pretty quickly.

Another idea would be to enable dynamic input parameters into a single web script that does the appropriate slicing and dicing of the workflow data, returning only what is necessary to fulfill the request.  This could probably work, but might be ugly to implement.

What do you think?

-Brian

darioruizlopez
Champ in-the-making
Well, what I really meant was not as ambitious as letting the user choose the fields. My idea was less ambitious but simpler. I only meant that a developer (not the end user) could access an API and pass in the list of fields to return. These fields would come only from a fixed and limited list provided by the interface (that is, defined by you, according to the feedback that you get). This way, you would probably be able to optimize access to the data, because you would know in advance which stores are involved. It would let you provide most of the functionality with limited effort and a good chance of managing performance properly. It is also likely that you would be able to add new fields in the future if you need them.

It could be objected that this would not be friendly for the user, but nothing prevents you from also providing some web templates with predefined configurations for the most common reports, and a drop-down list for choosing the fields by their corresponding "readable" names. And, of course, the idea is that developers could also provide their own templates (possibly extending those provided by you) to generate more sophisticated output.
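
The fixed field list and its "readable" names could be sketched roughly like this in Python. The field names, labels, and the `select_fields` helper are all hypothetical, chosen only to illustrate the idea of rejecting anything outside the published list.

```python
# Hypothetical fixed list of reportable fields, mapped to readable labels
# that a drop-down list could display. Defined by the API author.
AVAILABLE_FIELDS = {
    "description": "Workflow description",
    "initiator": "Started by",
    "approver": "Approved by",
    "group": "Group",
}


def select_fields(record, requested):
    """Return only the requested fields, rejecting names outside the fixed list."""
    unknown = [f for f in requested if f not in AVAILABLE_FIELDS]
    if unknown:
        raise ValueError("Unsupported field(s): " + ", ".join(unknown))
    return {f: record.get(f) for f in requested}
```

Because the list is known in advance, an invalid field name fails immediately with a clear error instead of producing an unpredictable query.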

The idea of dynamic input parameters could be far more complicated, because the dynamic nature of the parameters makes it unpredictable which stores they involve. It would be more unpredictable for the developer providing the parameters, too. If a list of the available fields is provided in advance, it is reasonably certain that those fields will be supported. If the fields are fully dynamic, that is, if you have complete freedom to choose whatever field name you want, the chance that the name is not valid is much higher. Dynamic parameters are much more powerful and better able to evolve, but this power comes at the cost of making the system more fragile and more difficult for you to develop. And I am not quite sure how often this power would be needed.

jpotts
World-Class Innovator
"Data capture" and "Report generation" are two separate concepts. On the Data Capture side, the trick is that you don't know the data set that anyone will ever need because workflows are so business-specific. It's almost like you need a "workflow auditing" configuration that would allow a business process designer to fire an action at any point (or many points) during the business process that dumped the state of the workflow and all process variables to a workflow log entry somewhere. What gets dumped could be configurable by process.
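
A minimal Python sketch of that "workflow auditing" idea follows. The process name, the variable names, and the in-memory `audit_log` list are assumptions for illustration; a real implementation would write to a database table or similar store.

```python
from datetime import datetime, timezone

# Hypothetical per-process configuration: which variables to dump.
AUDIT_CONFIG = {"reviewAndApprove": ["customId", "priority"]}

audit_log = []  # stand-in for a database table or log store


def audit(process_name, node, variables):
    """Append a snapshot of the configured process variables to the log."""
    wanted = AUDIT_CONFIG.get(process_name)
    if wanted is None:
        snapshot = dict(variables)  # no config for this process: dump everything
    else:
        snapshot = {k: variables[k] for k in wanted if k in variables}
    entry = {
        "process": process_name,
        "node": node,
        "at": datetime.now(timezone.utc).isoformat(),
        "variables": snapshot,
    }
    audit_log.append(entry)
    return entry
```

A process designer would call `audit` from whichever points in the process matter to them.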

Then, you've got Report Generation. Some people will say, "Just stick the data in some database tables and let me use my favorite reporting tool to create custom reports against it," which is the same approach the Alfresco auditing mechanism takes. This has to be supported at a minimum. Beyond that, there is a need for canned reports, dashlets, or whatever, that can be incorporated into the web client UI. Hopefully those would leverage an API which would allow people developing custom UIs to gather workflow auditing data and build their own graphs, charts, or whatever.

For the canned reports, I think you've got a decent list started. I'd add something like "tasks by duration" or some other way to identify workflow bottlenecks.
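
One possible shape for a "tasks by duration" report, sketched in Python. The input format, a list of (task name, duration) pairs, is an assumption made for the example.

```python
from collections import defaultdict
from datetime import timedelta


def tasks_by_duration(completed):
    """Average duration per task name, slowest first.

    `completed` is an iterable of (task_name, duration) pairs.
    """
    buckets = defaultdict(list)
    for name, duration in completed:
        buckets[name].append(duration)
    averages = {
        name: sum(ds, timedelta()) / len(ds) for name, ds in buckets.items()
    }
    # Sort descending so likely bottlenecks appear at the top.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

The task names at the top of the result are the first candidates to examine as bottlenecks.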

brian_robinson
Champ in-the-making
Hi Jeff,

So data capture is already occurring (out of the box) at a sufficient level as far as I can tell.  Workflow process variables end up being stored as metadata either on the start task or other tasks in the workflow, or both.  I may be missing the concept completely here, I don't know.  If so, please clarify.

Because the metadata is already being captured in the database, you could leverage a reporting tool like BIRT, JasperReports, etc. today.  For now I'm more focused on an API to expose the data in order to build 'canned reports' as you suggest.  See http://robinsontechnology.com/blog/2009/05/12/alfresco-workflow-status-reporting-design/ for more details on the design of that API.

"I'd add something like "tasks by duration" or something or some other way to identify workflow bottlenecks."

Nice!  Hadn't thought of that one, thanks Jeff!

-Brian

stevereiner
Champ in-the-making

norgan
Champ in-the-making
Any update on this?