Messed up international characters on PDF transformation

sansancasd
Champ in-the-making
Hey.

I followed the Alfresco tutorial. With the rule that transforms PDFs to text files, the resulting text file had garbled characters. The original PDF file contains Portuguese characters (accents, cedillas, etc.). Is there any way to fix this?

I can name a space with Portuguese characters and they appear fine, but not in PDF transformations. Any ideas?


Update: If I save the file to my desktop and open it with Notepad, the characters appear fine. If I open it with gvim, the characters are garbled.

Could this be related to the alfresco database using latin1_swedish_ci?
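
The Notepad/gvim difference described in the update is consistent with an encoding mismatch between how the file was written and how an editor reads it, rather than with damaged data. The sketch below is only an illustration of that effect, not a diagnosis of this particular file: it assumes, hypothetically, that the text was written as UTF-8 and read back as ISO-8859-1 (the reverse direction produces a similar symptom). The class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Encode the text with one charset, then decode the same bytes with another,
    // which is what happens when an editor guesses the wrong encoding.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // "acao" with cedilla and tilde, written with escapes so the
        // encoding of this source file does not matter.
        String original = "a\u00e7\u00e3o";

        String matched = reinterpret(original,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mismatched = reinterpret(original,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(matched.equals(original));    // true
        System.out.println(mismatched.equals(original)); // false
        // Each accented character became two characters: 4 grows to 6.
        System.out.println(mismatched.length());         // 6
    }
}
```

If this is what is happening, the file itself is intact and only the reader's assumed encoding needs to change.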
5 REPLIES

kevinr
Star Contributor
The transformations are performed using the OpenOffice server, and it is possible that it does not support these characters. Your database will probably work best in UTF-8. That said, if you are able to create space names correctly, the database may not be the cause.

Thanks,

Kevin

sansancasd
Champ in-the-making
Ah, I thought it wasn't OpenOffice, because the tutorial says that the PDF-to-text transformation is available even without OO. But I'll investigate OO then.

BTW, is there an easy way to migrate the alfresco database from latin1_swedish_ci to UTF-8? MySQL noob here :)


Thanks for the reply Kevin.

kevinr
Star Contributor
Ah you may well be correct - most of the transformations (certainly the interesting ones like Word->PDF) are performed by OpenOffice, but the one you mention may well be using PDFBox, an open source library we use for some PDF manipulation.

Thanks,

Kevin

sansancasd
Champ in-the-making
Kevin, I changed the OO language settings, and a new text file still showed garbled characters. So it must be PDFBox indeed. Any ideas on how to fix it? Does PDFBox have any settings for this? The Tomcat, Java, and JARs world is new to me :)

kevinr
Star Contributor
I'm not sure any settings can be changed on PDFBox - I'll take a look.

Cheers,

Kevin
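
While the PDFBox question is open, one possible workaround is to re-encode the generated text file after the transformation. The following is a minimal sketch using only the Java standard library. It assumes, without confirmation from this thread, that the file is in ISO-8859-1 and that the consumer expects UTF-8; swap the two charsets if the situation turns out to be the reverse. `TextTranscoder` and its methods are hypothetical names, not part of Alfresco or PDFBox.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TextTranscoder {

    // Decode the bytes using the charset they were written in,
    // then encode the resulting text in the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    // Read a whole file, transcode it, and write the result to a new file.
    static void transcodeFile(Path source, Path target, Charset from, Charset to)
            throws IOException {
        Files.write(target, transcode(Files.readAllBytes(source), from, to));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java TextTranscoder <source> <target>");
            return;
        }
        // Assumed direction: ISO-8859-1 in, UTF-8 out.
        transcodeFile(Paths.get(args[0]), Paths.get(args[1]),
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
    }
}
```

This only helps if the source charset is known. Transcoding from the wrong charset would garble the text further, so it is worth checking the file's actual bytes first.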