Conditional transformation

smcardle
Champ in-the-making
Hi All

I am looking for a way to handle conditional transformation.

As an example, I want to perform a particular transformation on a PDF if and ONLY if it is an image-only PDF, i.e. if the output of, say,
'pdf2txt test.pdf | wc -w' is 0.

Is this possible in the transformation configuration XML, or would I need to create a new ComplexContentTransformer?

I would really like to achieve this through configuration only.


Steve

mrogers
Star Contributor
Transformation converts between different mimetypes, so assuming that your mimetype is application/pdf (or whatever the correct value is), it should simply work.

Perhaps the first question to ask is what you are starting with and what you want to end up with, then we can be clear whether you are talking about transformation or rendition.

smcardle
Champ in-the-making
I want to transform from application/pdf to image/tiff.

I know Alfresco already handles this, however, as always, it's not that simple.

I only want this transformer to run on image-only PDFs (scanned pages) and NOT on any PDF in the same folder that already has a searchable text layer, such as an eBook.

Hope this clears it up a bit.

Steve

mitpatoliya
Star Collaborator
Hi Steve,
I think what you are looking for is to read the content of a scanned PDF that actually contains nothing but images. That will require an OCR (optical character recognition) tool.
You need to integrate a third-party tool such as Kofax in order to achieve this.

smcardle
Champ in-the-making
Hi

Actually, I will be using Tesseract to do the complex transformation for OCR, but you have not answered my question.

Which is: how can I get the transformer to transform from application/pdf to image/tiff ONLY if the PDF is image-only?
The rest of the complex transformation is another matter. It already works for many image types and is applied as a rule on the space. The only problem I have at the moment is that I want the first transformation, from application/pdf to image/tiff, to occur only if the PDF has no text already.

Steve

mitpatoliya
Star Collaborator
See, the existing transformer is not intelligent enough to meet your requirement, so you need to customize that transformer.
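
For what it's worth, here is a minimal sketch of the check such a customized transformer would have to make before running. It is not Alfresco API code: the class and method names (ImageOnlyPdfCheck, countWords, isImageOnly, extractText) are made up for illustration, and it assumes the 'pdf2txt' command from the first post is on the PATH and writes the extracted text to standard output. It simply mirrors 'pdf2txt test.pdf | wc -w' returning 0.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Alfresco: decides whether a PDF is
// "image only" by counting the words in its extracted text.
class ImageOnlyPdfCheck {

    // Counts whitespace-separated words, like `wc -w`.
    public static int countWords(String text) {
        String trimmed = (text == null) ? "" : text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // A PDF is treated as image only when no words could be extracted.
    public static boolean isImageOnly(String extractedText) {
        return countWords(extractedText) == 0;
    }

    // Runs an external text extractor and returns whatever it printed.
    // The command (e.g. "pdf2txt", "test.pdf") is an assumption taken
    // from the original question.
    public static String extractText(String... command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Text extraction failed, exit code " + exitCode);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String text = extractText("pdf2txt", args[0]);
        System.out.println(isImageOnly(text)
                ? "image only: run the PDF to TIFF transformation"
                : "has text: skip the transformation");
    }
}
```

A custom transformer could call something like isImageOnly before deciding whether it applies to a given PDF; how that hooks into the transformer itself depends on your Alfresco version, so treat this only as the decision logic.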