Imap preview eml

dranakan
Champ on-the-rise
Hello,

I would like to show a preview in Share of mails coming from Outlook. I have a problem with accents: if a mail contains "é", the preview shows "?%". It works with UTF-8 mails, but what do I change to preview all kinds of mails?
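
For illustration, the symptom can be reproduced outside Alfresco with plain JDK classes. This is a minimal sketch, assuming the Outlook mail body is encoded as ISO-8859-1 (an assumption; the charset Outlook actually used is not known here). The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AccentDecodingDemo {
    /** Decodes raw bytes with the given charset, much as an InputStreamReader would. */
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "é" (U+00E9) as a single byte 0xE9, assuming an ISO-8859-1 message body.
        byte[] mailBytes = "\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the wrong charset: 0xE9 alone is not valid UTF-8,
        // so the decoder substitutes the replacement character U+FFFD.
        String wrong = decode(mailBytes, StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in keeps the accent.
        String right = decode(mailBytes, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8:      " + wrong);
        System.out.println("decoded as ISO-8859-1: " + right);
    }
}
```

The point of the sketch is only that the bytes are fine; the accent is lost at the moment they are read with a charset that does not match the mail.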

Here is what I did to get a preview of eml files:

Transformer:
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.Charset;

import org.alfresco.error.AlfrescoRuntimeException;
import org.alfresco.repo.content.MimetypeMap;
import org.alfresco.service.cmr.repository.ContentReader;
import org.alfresco.service.cmr.repository.ContentWriter;
import org.alfresco.service.cmr.repository.TransformationOptions;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.pdfbox.TextToPDF;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.encoding.EncodingManager;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.font.PDTrueTypeFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

import org.alfresco.repo.content.transform.AbstractContentTransformer2;
/**
* Convert Text with accents (UTF-8) to PDF.
*
*
*/
public class TextUtfToPdfContentTransformer extends AbstractContentTransformer2
{
    private static final Log logger = LogFactory.getLog(TextUtfToPdfContentTransformer.class);
   
    private TextToPDF transformer;
   
    public TextUtfToPdfContentTransformer()
    {
        transformer = new TextToPDF();
    }
   
    public void setStandardFont(String fontName)
    {
        try
        {
            transformer.setFont(PDType1Font.getStandardFont(fontName));
        }
        catch (Throwable e)
        {
            throw new AlfrescoRuntimeException("Unable to set Standard Font for PDF generation: " + fontName, e);
        }
    }
   
    public void setTrueTypeFont(String fontName)
    {
        try
        {
            transformer.setFont(PDTrueTypeFont.loadTTF(null, fontName));
        }
        catch (Throwable e)
        {
            throw new AlfrescoRuntimeException("Unable to set True Type Font for PDF generation: " + fontName, e);
        }
    }
   
    public void setFontSize(int fontSize)
    {
        try
        {
            transformer.setFontSize(fontSize);
        }
        catch (Throwable e)
        {
            throw new AlfrescoRuntimeException("Unable to set Font Size for PDF generation: " + fontSize, e);
        }
    }
   
    /**
     * Only supports Text to PDF
     */
    public boolean isTransformable(String sourceMimetype, String targetMimetype, TransformationOptions options)
    {
        if ( (!MimetypeMap.MIMETYPE_TEXT_PLAIN.equals(sourceMimetype) &&
              !MimetypeMap.MIMETYPE_TEXT_CSV.equals(sourceMimetype) &&
              !MimetypeMap.MIMETYPE_XML.equals(sourceMimetype) ) ||
            !MimetypeMap.MIMETYPE_PDF.equals(targetMimetype))
        {
            // only support (text/plain OR text/csv OR text/xml) to (application/pdf)
            return false;
        }
        else
        {
            return true;
        }
    }

    @Override
    protected void transformInternal(
            ContentReader reader,
            ContentWriter writer,
            TransformationOptions options) throws Exception
    {
        PDDocument pdf = null;
        InputStream is = null;
        InputStreamReader ir = null;
        OutputStream os = null;
        try
        {
           logger.debug("Working…");
            is = reader.getContentInputStream();
            EncodingManager encodingManager = new EncodingManager();
            transformer.getFont().setEncoding(encodingManager.getEncoding(COSName.WIN_ANSI_ENCODING));

            pdf = transformer.createPDFFromText(new InputStreamReader(is, "UTF-8"));
            // dump it all to the writer
            os = writer.getContentOutputStream();
            pdf.save(os);
        }
        finally
        {
            if (pdf != null)
            {
                try { pdf.close(); } catch (Throwable e) {e.printStackTrace(); }
            }
            if (ir != null)
            {
                try { ir.close(); } catch (Throwable e) {e.printStackTrace(); }
            }
            if (is != null)
            {
                try { is.close(); } catch (Throwable e) {e.printStackTrace(); }
            }
            if (os != null)
            {
                try { os.close(); } catch (Throwable e) {e.printStackTrace(); }
            }
        }
    }
   
    protected InputStreamReader buildReader(InputStream is, String encoding, String node)
    {
        // If they gave an encoding, try to use it
        if(encoding != null)
        {
            Charset charset = null;
            try
            {
                charset = Charset.forName(encoding);
            } catch(Exception e)
            {
                logger.warn("JVM doesn't understand encoding '" + encoding +
                        "' when transforming " + node);
            }
            if(charset != null)
            {
                logger.debug("Processing plain text in encoding " + charset.displayName());
                return new InputStreamReader(is, charset);
            }
        }
       
        // Fall back on the system default
        logger.debug("Processing plain text using system default encoding");
        return new InputStreamReader(is);
    }
}

Bean to use the transformer:


<bean id="transformer.complex.Mail.Pdf2swf"
        class="org.alfresco.repo.content.transform.ComplexContentTransformer"
        parent="baseContentTransformer" >
      <property name="transformers">
         <list>
            <ref bean="transformer.RFC822" />
            <ref bean="transformer.PdfBox.TextUtfToPdf" />
            <ref bean="transformer.Pdf2swf" />
         </list>
      </property>
      <property name="intermediateMimetypes">
         <list>
            <value>text/plain</value>
            <value>application/pdf</value>
         </list>
      </property>
   </bean>

I also tried this code, but it only works with UTF-8 mails. What should I change?

            logger.debug("Working…");
            is = reader.getContentInputStream();

            EncodingManager encodingManager = new EncodingManager();
            transformer.getFont().setEncoding(encodingManager.getEncoding(COSName.WIN_ANSI_ENCODING));
            ir = buildReader(is, reader.getEncoding(), reader.getContentUrl());

            pdf = transformer.createPDFFromText(ir);
            // dump it all to the writer
            os = writer.getContentOutputStream();
            pdf.save(os);
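
One possible direction, sketched with JDK classes only: decode strictly as the preferred charset and, if the bytes turn out not to be valid in it, fall back to a single-byte charset instead of silently producing replacement characters. The fallback charset (ISO-8859-1 here) and the class and method names are assumptions for illustration and are not part of Alfresco or PDFBox.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecodeWithFallback {
    /**
     * Decodes raw bytes with the preferred charset in strict mode.
     * If the bytes are malformed for that charset, decodes them with the fallback instead.
     */
    public static String decode(byte[] raw, Charset preferred, Charset fallback) {
        try {
            return preferred.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Hypothetical policy: treat anything that is not valid in the preferred charset as the fallback.
            return new String(raw, fallback);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9};              // "é" in ISO-8859-1
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9};   // "é" in UTF-8
        System.out.println(decode(latin1, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
        System.out.println(decode(utf8, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

The decoded String could then be passed on through a java.io.StringReader, assuming createPDFFromText accepts any Reader. This only addresses how the bytes are read; whether the PDF font can render every decoded character is a separate question.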


Help :
http://issues.alfresco.com/jira/browse/ALF-3246?page=com.atlassian.jira.plugin.system.issuetabpanels...
http://forums.alfresco.com/fr/viewtopic.php?f=8&t=4049#p18699

(Alfresco 3.4D, Redhat).
14 REPLIES

mrogers
Star Contributor
There are a few issues with EML preview, which is why the issue you refer to is still open.
Before you start, check your database and connection string are UTF-8.

There are some fixes to the code since 3.4d to get accented characters working, you may need to patch.  In particular we replaced the old RFC822 transformer.  I suggest you look at JIRA for further details.

From memory, I'm fairly sure we have unit tests for French accented characters now.

My colleagues are currently investigating a fix to PDFBox to get it to work with Japanese. After that, we need to get Share to preview plain text files better.

dranakan
Champ on-the-rise
Thank you mrogers,

mrogers wrote: "check your database and connection string are UTF-8"

That is OK. In /opt/Alfresco/tomcat/shared/classes/alfresco-global.properties:
db.url=jdbc:mysql://localhost:3309/${db.name}?useUnicode=yes&characterEncoding=UTF-8
And in /opt/Alfresco/mysql/my.cnf:
default-character-set=utf8

mrogers wrote: "There are some fixes to the code since 3.4d to get accented characters working, you may need to patch. In particular we replaced the old RFC822 transformer. I suggest you look at JIRA for further details"

I have not found anything in JIRA about an updated transformer 😞 Should I replace org.alfresco.repo.content.transform.EMLTransformer (bean id="transformer.RFC822") with the latest version from SVN?

Thank you

dranakan
Champ on-the-rise
I think I have this problem: http://issues.alfresco.com/jira/browse/ALF-3757 (problems with character encoding).

Someone there proposed a Tika upgrade. I tried the latest Tika (put tika-app-0.9.jar in tomcat/webapps/alfresco/WEB-INF/lib/, removed tika-core-0.8-SNAPSHOT.jar and tika-parsers-0.8-SNAPSHOT.jar, restarted the server), but nothing changed. The accents are still wrong.
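
After swapping jars by hand like this, it may be worth confirming which Tika jars the webapp actually has in its lib directory. A minimal sketch that simply lists them; the default path is the one from this post, and the class and method names are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TikaJarCheck {
    /** Returns the file names that look like Tika jars, in the order given. */
    public static List<String> tikaJars(List<String> fileNames) {
        List<String> found = new ArrayList<>();
        for (String name : fileNames) {
            if (name.startsWith("tika-") && name.endsWith(".jar")) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Directory taken from the post; pass another one as the first argument.
        File lib = new File(args.length > 0 ? args[0] : "tomcat/webapps/alfresco/WEB-INF/lib");
        String[] names = lib.list(); // null when the directory does not exist
        List<String> all = names == null ? new ArrayList<String>() : Arrays.asList(names);
        for (String jar : tikaJars(all)) {
            System.out.println(jar);
        }
    }
}
```

If jars from two different Tika versions show up side by side, the one the class loader picks is not obvious, which would be one thing to rule out.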

Has anyone done a Tika upgrade? Did I forget something?

Thank you.

mrogers
Star Contributor
I think that particular change is unlikely to help.

It's far more likely to be ALF-5495 or one of the related issues.

dranakan
Champ on-the-rise
Thank you mrogers,

I see there is a patch (greenmail-1.3-patched.diff) changing several Java files (in http://issues.alfresco.com/jira/browse/ALF-5495). I am using 3.4D and I don't think it's a good idea to change the Java files manually and build a new alfresco.war.

I need to move to a newer version of Alfresco (trying 3.4.e or waiting for 4.0a).

mrogers
Star Contributor
3.4.e is unlikely to help.

dranakan
Champ on-the-rise
With Alfresco Enterprise, I could get a patch and my problem should be solved. Our firm is in discussions with Alfresco to become a partner. I hope it gets done quickly.

I have done a lot of searching and testing on this problem. With an Alfresco Enterprise licence, does support investigate this kind of problem for me, or do I need to raise it in JIRA and wait for a response?

Thank you

mrogers
Star Contributor
I'm not sure what you are asking, but with an Enterprise subscription Alfresco Support will investigate, and raise and track any issues if necessary.

nlaselva
Champ in-the-making
Hi dranakan,

I had the same issue with IMAP and accented (or non-ASCII) characters. I tested everything, without any luck getting this fixed.

Then I installed the Alfresco Enterprise trial (3.4.2, 30 days) and it works perfectly. Looking in JIRA, they already fixed it for build 3.3.x. This means they didn't bring the fix into the 3.4 Community edition and, IMHO, this was done on purpose. Business is business and Alfresco is not a charity.