Plugin for upload with metadata

softwareloop
Champ in-the-making
Hi, I've started working on a new plugin, which is basically a file uploader that prompts for metadata. I thought I'd share my design and get your feedback. I also have a specific question, which I'll ask at the end of this message.

The idea is to replace Share's standard uploader with a similar component that:
- allows uploading one or more documents
- asks for the document type (for each file, from a list of standard and custom types)
- prompts for metadata (for each file, based on the document type)

Just to be sure I'm not reinventing the wheel, I've searched for existing add-ons and found a commercial one that does something similar:
https://addons.alfresco.com/addons/edit-meta-data-during-upload
… but it seems to consider only standard metadata (title, description, author, etc) with no option for custom types and properties.

There are a couple of open source ones too:
https://addons.alfresco.com/addons/content-uploader-pro
https://addons.alfresco.com/addons/upload-tag
… but these seem to be related more to categories and tagging than to metadata.

This post in the forum is interesting for a discussion of the problems and an outline of possible solutions
https://forums.alfresco.com/forum/developer-discussions/alfresco-share-development/edit-metadata-upo...

So what do you think?

As I mentioned at the beginning I have a specific question.
An important requirement for my plugin is that the three operations (uploading, giving a type and setting metadata) be atomic from the repository's perspective. So when an uploaded document appears in the destination folder it is already typed and has all the metadata. This is in contrast to a solution that uploads first and adds metadata later as a separate step.
The rationale is that in a real ECMS there is so much automation on folders (content rules, behaviours, etc) that when a doc enters the system it must have the type and properties in place from the beginning.

So the question is: how would you approach this?
The core upload webscript /alfresco/api/upload takes contentType as a parameter but doesn't handle metadata, correct? To handle metadata I could:
- enhance the core upload webscript
- first upload the document to a temporary folder, then apply the metadata as an update, finally move the document to its destination folder

I'd like to understand the pros and cons or if there are other approaches.

Thank you.

Paolo
8 REPLIES

afaust
Legendary Innovator
Hello,

great to see someone willing to tackle this issue on a general purpose level with a vision that includes customer-specific content models and flexibility.
If you ever need a sounding board for your ideas or someone to help with an issue that you don't find or get an answer for here in the forums, please know that you also always have the option of reaching out to the Order of the Bee (http://orderofthebee.org/). We are still in the process of organising ourselves but support of an active community and addon developers is one of our main "missions".

Concerning your question / options…

Option "enhanced upload web script":
I would refrain from enhancing the core upload web script - instead create a new web script (based on the original) with your enhancements. Not only does it isolate you from unknown customizations a customer / user applied to upload in his environment, it also decouples you from Alfresco releases and can allow you to easily support a wider range of Alfresco versions.
The biggest question in this option is how you intend to capture the metadata from the user and how to integrate this with your custom web script in a way that feels natural to people using the addon. A big requirement from my point of view would be the use of the Forms API / framework and simple form configuration via XML (or the Community form manager addon).
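
To make the "one request carries everything" idea concrete, here is a minimal client-side sketch of how the file, its type and its metadata could travel together in a single multipart/form-data body. It is only an illustration under assumptions: the field names `destination`, `contenttype` and `filedata`, the `prop_` prefix for metadata fields, and the `acme:` model are hypothetical and would have to match whatever the custom web script actually expects.

```python
import uuid


def build_upload_request(filename, content, destination, content_type, properties):
    """Encode one file plus its type and metadata as a single multipart body."""
    boundary = uuid.uuid4().hex
    # Assumed field names; a custom web script would define its own contract.
    fields = {"destination": destination, "contenttype": content_type}
    for qname, value in properties.items():
        # Assumed convention: "cm:title" becomes the form field "prop_cm_title".
        fields["prop_" + qname.replace(":", "_")] = value

    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The file itself is the last part of the same request.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="filedata"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + content + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, b"".join(parts)


# Hypothetical usage: one invoice with a custom type and two properties.
headers, body = build_upload_request(
    "invoice.pdf",
    b"%PDF-1.4",
    "workspace://SpacesStore/folder-id",  # placeholder destination
    "acme:invoice",
    {"cm:title": "Invoice 42", "acme:amount": "100.00"},
)
```

Because the server receives type and properties in the same request as the content, the web script can create the node, specialise its type and set its properties inside one transaction, which is the behaviour asked for in the original question.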

Option "upload to temporary folder":
We use this approach ourselves in a commercial addon of ours (not listed) for uploads in parts of Alfresco where the upload needs to be part of e.g. workflow task forms and those forms control the final "save" or "cancel" decision of the user, long after the upload has completed. The biggest issue and potential con is the need for "garbage management", i.e. a design and background process that take care of removing temporary upload files for which the metadata update step was never performed, without leaving anything in the database that impacts the performance of the system.
From my experience this can be quite complex to manage since there are a lot of subtle side effects to consider regarding the automation aspects you mentioned.


At the moment I would give the advantage in terms of pro/con to the "enhanced upload".
1. pros: transactionality is very simple, no garbage management required, few to no side effects with automation
2. cons:
   1. longer user interaction cycle: a temporary upload may allow splitting the unit of work into smaller pieces, so the user may take a break without fear of losing data (e.g. due to a session timeout)
   2. it may be more difficult to handle user interface state and compatibility without duplicating more code to decouple from Alfresco releases

Regards
Axel

softwareloop
Champ in-the-making
Hi Axel and thank you for your comments and for the pointer to the Order of the Bee.

I was indeed planning to use the Forms service. It saves me time, it abstracts me from the specifics of the custom models and contributes to a unified look-and-feel. I'm aware that things are radically changing in 5.0.x. I'll probably develop two versions, one for classic Surf and one for Aikau, but for the moment I'm targeting classic Surf with YUI on 4.2.x.

Regarding the garbage management, or how to deal with unfinished uploads, this opens two further options. In one, unfinished uploads are silently removed by the system (e.g. daily) and the user is never prompted to take any action about them. In the other, the unfinished uploads are kept for longer (e.g. a week) and the user is reminded (at login, in a message box or in other ways) that these uploads can be resumed (by filling in the missing metadata) or discarded permanently.
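
The two options could be sketched as two settings of the same scheduled job. This is only a rough model under assumptions: the `PendingUpload` record, the `next_action` helper and the action names are made up for illustration, and the retention periods are simply the "daily" and "a week" examples from above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PendingUpload:
    """Hypothetical record of an upload whose metadata is still missing."""
    name: str
    uploaded_at: datetime


def next_action(upload, now, retention, remind):
    """Decide what a scheduled job should do with one unfinished upload."""
    age = now - upload.uploaded_at
    if age >= retention:
        return "purge"
    return "remind" if remind else "keep"


# Option 1: silent cleanup after a day, the user is never prompted.
SILENT = {"retention": timedelta(days=1), "remind": False}
# Option 2: keep for a week and remind the user to resume or discard.
REMINDER = {"retention": timedelta(weeks=1), "remind": True}
```

A call such as `next_action(upload, datetime.now(), **REMINDER)` would then tell the job whether to leave the file alone, nudge the user, or delete it.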

This second option is more complicated but your "con 1." point seems to mention this possibility. Do you actually have experience with this scenario? Do the users accept the idea that unfinished uploads can be completed or is it just too much complication for them?

I agree at the moment the "enhanced upload" option sounds more promising, as it appears to be simpler and "atomic" in a stricter sense, but evaluating different perspectives and usage scenarios is always useful.

Regards

Paolo

afaust
Legendary Innovator
Hi Paolo,

in the use case we targeted, temporary uploads had a limited period of validity. Since they were part of filling out a form, once that form was either submitted, cancelled or the browser was closed, they lost their context and could be removed safely.

I would actually like the option to have a personal "inbox" with documents I have uploaded but not yet classified. The main question here is: "How is that really different to a regular upload and edit metadata later" (or never)?
Managers responsible for a business process may be reluctant to allow their users to postpone the metadata step for fear that important documents will end up in limbo with no visibility to other people. The method of inbox/reminder may need to be very different from one customer to the next, e.g. one might like an auto-delete after X hours, one might require formal escalation etc., up to having different inboxes for different parts of the Alfresco platform / application.

In my experience if users have the option to be lazy (not fill metadata) they are likely to be more lazy, so you'll end up with a large amount of "unfinished uploads". And having to do multiple steps (and remember to do them) is usually perceived as overly complex. And managers know that and wouldn't want a solution that allows this unless they have a specific requirement of using inboxes, e.g. as part of a formal input management process (mailroom automation).

I think a key deciding factor would be how far you can simplify the transactional upload option. If you can get a really easy to use solution with smart features / defaults, then it would solve 80% or more of the regular use cases without disrupting the flow of work with the system that much for end users. One smart feature would be bulk data assignment, e.g. you can assign common data for all parallel uploads in one form and only have to enter specifics to those uploads that need it.
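
As a rough illustration of the bulk assignment idea, the shared form could be treated as a set of defaults that each file's own form overrides. The `merge_metadata` helper and the `acme:` property names are hypothetical, not part of any existing API.

```python
def merge_metadata(common, per_file):
    """Return one property dict per file: shared values first, file-specific values on top."""
    return {name: {**common, **overrides} for name, overrides in per_file.items()}


# Hypothetical usage: two parallel uploads, only one needs specific values.
common = {"cm:author": "Paolo", "acme:project": "P-1"}
per_file = {
    "a.pdf": {},
    "b.pdf": {"cm:title": "Contract", "acme:project": "P-2"},
}
merged = merge_metadata(common, per_file)
```

With this shape the user fills in the common form once, and a per-file form is only needed for uploads that deviate from the defaults.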

A side note: There are not that many radical changes as you might think in Alfresco 5.0. I am not aware that the document library and upload functionalities will be replaced with Aikau based widgets in this version. Replacing YUI with Aikau is a gradual process and not a big-bang change. There are various people in the Community that want to have a big bang and remove legacy stuff, but Alfresco as a company is conservative and has to prioritize their efforts to "real features" (not developer candy) like improved search, reporting & analytics.

Regards
Axel

softwareloop
Champ in-the-making
Thanks for sharing the use case and for the insight.
As I was reading your reply I noticed that many terms you've used (inboxes, reminder, escalation, etc) in relation to the temporary upload folder are really borrowed from the realm of business processes and workflows.
So maybe the temporary upload folder solution is introducing a workflow in disguise, forcing the user to think of uploads as a two-step process: one for the upload proper and one for filling in metadata.

Certainly it wasn't my intention to give this new plugin a workflow flavour. I'd rather keep workflows orthogonal to uploads. Workflows and uploads intersect in some cases, but I wouldn't want my plugin to make it the norm. I'd rather implement the upload+typing+metadata as atomically as I can, and then leave it to the user or to a solution developer to introduce workflows using other tools or plugins, if needed.

Also as you mention with workflows the solutions can vary a lot between customers. Having a one-size-fits-all solution is very difficult.

Ok, my next goal is to understand the technical aspects of the upload, how to structure the dialog, etc.

I already have this project on GitHub: https://github.com/softwareloop/uploader-plus
I've mainly been working on the admin page. The actual upload extension for the document library is something I still need to start.

Regards

Paolo

heiko_robert
Star Collaborator
Ciao Paolo,
I just found this discussion from your blog and I think you understand what users really expect. We decided to move the task you'd like to address technically completely out of Share for two reasons:
- "qualified" upload functionality is required from different clients/scenarios, not only inside Share, but it may be integrated into Share the same way as an option.
- for us the Alfresco forms framework is too limited / inflexible to configure / validate / integrate master data / dynamic controls / form flows / post-process actions in the way people are used to from other enterprise platforms.

I don't want to discourage you from implementing this in Share, but think twice about the real use case, which experience the user expects, and how to avoid implementing the same logic several times if you accept that Share is only one option for adding documents. Our customers avoid Share if possible since it takes too long to navigate to the right folder and to modify metadata after upload, add comments, change workflow status, … Additionally, at the moment we conflict with the rendition service locking the whole node in 4.2.e/f after upload in Share.

Similar to Axel's points about uncompleted uploads, we implemented something like inboxes which fire a filing service once a document's missing metadata has been completed, but this is done in the background and user interaction is only one source of metadata. We collect as much data from master data as possible and store as little metadata in Alfresco as possible if the metadata is controlled in other systems (and I assume only very little data is really controlled in Alfresco).

Looking forward to continue the discussion …

Regards
Heiko

Hi Heiko, glad to continue the discussion.
I understand your reservation about the utility of an enhanced uploader if implemented in Share, that some parts of Share are perceived as limited by customers and that workarounds to such limitations are frequently sought outside Share.

But I also listen to customers who see Share as an integrated interface and as a productivity platform whose potential is often untapped.

This is my second "generic" plugin, so I can't say I'm an expert, but I follow two principles:
- design for minimum functionality, even if this means abstracting from real case scenarios
- allow further customisation which will be required anyway in a real project

Just to make an example, I'm using the forms service to prompt for metadata but I'm sure that somebody will want to use a completely custom form. Or worse, a standard form some of the time and a custom one the rest of the time 🙂
I try to make this sort of customisation easier, or at least to not get in the way.

If a plugin does too much it can become hard to customise. If it does just enough, it may be easier to adapt it, even if the specific project requirements are unforeseeable to the plugin developer.

I'm almost done with the first release so hopefully we'll be able to discuss the pros and cons of the plugin in practical terms soon.

Regards

Paolo

softwareloop
Champ in-the-making

approveme
Champ in-the-making
That's really great !