Hyland Connect

smcardle · ‎04-28-2013

Hi All

I am looking for a way to handle conditional transformation.

As an example, I want to perform a particular transformation on a PDF if and ONLY if it is an image-only PDF, i.e. if the output of, say,
'pdf2txt test.pdf | wc -w' is 0.
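
The check described above can be sketched outside Alfresco roughly as follows. This is only an illustration in Python, assuming an extractor command that prints the PDF's text to stdout the way `pdf2txt` is used above; the function names `extract_text` and `is_image_only` are hypothetical and not part of any Alfresco API.

```python
import subprocess
from typing import Callable, Sequence


def extract_text(pdf_path: str, command: Sequence[str] = ("pdf2txt",)) -> str:
    """Run the extractor on pdf_path and return whatever it prints to stdout.

    The default command name is an assumption taken from the example above;
    substitute whichever text extractor is installed.
    """
    result = subprocess.run(
        [*command, pdf_path],
        capture_output=True,
        text=True,
        check=True,  # raise if the extractor exits with an error
    )
    return result.stdout


def is_image_only(pdf_path: str,
                  extractor: Callable[[str], str] = extract_text) -> bool:
    """Return True when the extracted text has zero words (like `wc -w` giving 0)."""
    return len(extractor(pdf_path).split()) == 0
```

Passing the extractor as a parameter keeps the word-count decision separate from the external tool, so the decision can be exercised without a real PDF.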

Is this possible in the transformation configuration XML, or would I need to create a new ComplexContentTransformer?

I would really like to achieve this through configuration only.

Steve

mrogers · ‎04-29-2013

Transformation converts between different mimetypes, so assuming that your mimetype is application/pdf (or whatever the correct value is), it should simply work.

Perhaps the first question to ask is what you are starting with and what you want to end up with; then we can be clear whether you are talking about transformation or rendition.

smcardle · ‎04-29-2013

I want to transform from application/pdf to image/tiff.

I know Alfresco already handles this, however, as always, it's not that simple.

I only want this transformer to run on image-only PDFs (scanned pages) and NOT on any PDF in the same folder that already has a searchable text layer, such as an eBook.

Hope this clears it up a bit.

Steve

mitpatoliya · ‎04-30-2013

Hi Steve,
I think what you are looking for is to read the content of a scanned PDF that actually contains nothing but images. That requires an OCR (optical character recognition) tool.
You would need to integrate a third-party tool such as Kofax in order to achieve this.

smcardle · ‎04-30-2013

Hi

Actually I will be using Tesseract to do the complex transformation for OCR, but you have not answered my question.

Which is: how can I get the transformer to ONLY transform from application/pdf to image/tiff if the PDF is image-only?
The rest of the complex transformation is another matter; it already works for many image types and is applied as a rule on the space. The only problem I have at the moment is that I want the first transformation, from application/pdf to image/tiff, to occur only if the PDF has no text already.

Steve

mitpatoliya · ‎05-03-2013

The existing transformer is not intelligent enough to meet your requirement, so you need to customize that transformer.
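
To make the idea concrete, the customization amounts to putting a precondition in front of the existing PDF-to-TIFF step. The following is a language-neutral sketch written in Python, not Alfresco code: `transform_if_image_only`, `has_text` and `to_tiff` are hypothetical names, and in Alfresco the equivalent logic would have to live in the customized transformer itself.

```python
from typing import Callable, Optional


def transform_if_image_only(
    pdf_path: str,
    has_text: Callable[[str], bool],
    to_tiff: Callable[[str], str],
) -> Optional[str]:
    """Run to_tiff only when has_text reports no text layer.

    Returns the value produced by to_tiff, or None when the PDF is skipped.
    Both callables are supplied by the caller (hypothetical hooks).
    """
    if has_text(pdf_path):
        return None  # already searchable, so leave the document alone
    return to_tiff(pdf_path)
```

Returning None for skipped documents lets the caller tell "not applicable" apart from a completed conversion.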

Conditional transformation