
4.0 Community - document preview for cyrillic utf8(?) txt, html shows malformed text

i_
Champ in-the-making
The Alfresco 4.0 Community previewer displays UTF-8(?) txt and html documents containing Cyrillic as malformed text. How can I enable UTF-8 for the previewer, or at least force it to show Russian and English text correctly?

Txt preview: this shows the strangest behaviour, and there may be more than one problem here. I create or edit a text document with this content:
тестовый текст - test text
日本人テストテキスト - japanese test text
español - spanish

[img]http://i.imgur.com/rwsLzdN.jpg[/img]

Previewer shows it malformed:

[img]http://i.imgur.com/Nxr9G2f.jpg[/img]

When I download the document, it displays correctly in Notepad++, which reports its encoding as "ANSI as UTF-8" (UTF-8 without BOM).
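The Notepad++ observation can be cross-checked outside the editor. The following is a minimal sketch, not anything Alfresco-specific: `describe_utf8` is a hypothetical helper name, and it only tells you whether the raw bytes are strictly valid UTF-8 and whether they start with a BOM. Note that pure ASCII content also counts as "UTF-8 without BOM".

```python
import codecs


def describe_utf8(data: bytes) -> str:
    """Classify raw bytes as UTF-8 with BOM, UTF-8 without BOM, or not UTF-8."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode("utf-8")  # strict decoding: raises on any invalid sequence
    except UnicodeDecodeError:
        return "not valid UTF-8"
    return "UTF-8 with BOM" if has_bom else "UTF-8 without BOM"


# Same first line as the test document, encoded the way Notepad++ reports it.
sample = "тестовый текст - test text\n".encode("utf-8")
print(describe_utf8(sample))
```

For a downloaded file, the same check would be `describe_utf8(open("test.txt", "rb").read())`, where `test.txt` is a placeholder file name.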

When I inline-edit the document again, all fields display correctly in the inline editing form, but the previewer still shows malformed text.

Html preview: note that the preview encoding looks different from the txt preview in the create/inline-edit case:
[img]http://i.imgur.com/BPtvC3Z.jpg[/img]


POST captured in Chrome Developer Tools for HTML previewing:

Request URL:…:8080/share/proxy/alfresco/api/type/cm%3acontent/formprocessor
Request Method:POST
Status Code:200 OK
Request Headers
Accept:*/*
Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding:gzip,deflate,sdch
Accept-Language:ru,en-US;q=0.8,en;q=0.6
Connection:keep-alive
Content-Length:388
Content-Type:application/json
….
User-Agent:Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31
X-Requested-With:application/x-www-form-urlencoded

prop_app_editInline: "true"
prop_cm_content: "<p>теÑтовый текÑÑ‚ - test text</p>↵<p>日本人テストテも¹ãƒˆ - japanese test text</p>↵<p>espa&ntilde;ol - spanish</p>"
prop_cm_description: "html test"
prop_cm_name: "html test"
prop_cm_title: "html test"
prop_mimetype: "text/html"
Response Headers
Cache-Control:no-cache
Content-Length:166
Content-Type:application/json;charset=UTF-8
Date:Mon, 13 May 2013 23:10:36 GMT
Pragma:no-cache
Server:Apache-Coyote/1.1
For the fragment "теÑтовый текÑÑ", a decoder reports that converting CP1252 → UTF-8 yields "те�товый тек��", which is pretty close to "тестовый текст".
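The near-miss from the decoder is consistent with UTF-8 bytes being read as CP1252 somewhere along the way, though the thread does not confirm where that happens. The sketch below reproduces the effect in isolation; `misdecode` and `repair` are hypothetical helper names, not part of Alfresco. It also suggests why the repair is only "pretty close": the letter "с" is the UTF-8 byte pair D1 81, and 0x81 is undefined in Python's `cp1252` codec, so that byte cannot survive the round trip.

```python
def misdecode(text: str, wrong: str = "cp1252") -> str:
    """Encode text as UTF-8, then (wrongly) decode the bytes as `wrong`."""
    # Bytes undefined in the wrong codec become U+FFFD instead of raising.
    return text.encode("utf-8").decode(wrong, errors="replace")


def repair(garbled: str, wrong: str = "cp1252") -> str:
    """Try to undo misdecode(); information lost on the way stays lost."""
    raw = garbled.encode(wrong, errors="replace")
    return raw.decode("utf-8", errors="replace")


# Every byte of "español" is defined in CP1252, so the round trip is lossless.
print(misdecode("español"))          # espaÃ±ol
print(repair(misdecode("español")))  # español

# "с" (bytes D1 81) hits the undefined 0x81, so the repair is only approximate.
print(repair(misdecode("текст")))
```

This would also explain why the Spanish line tends to survive such accidents better than the Russian one: none of its bytes fall on the code points CP1252 leaves undefined.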

Environment: Alfresco 4.0.e on Ubuntu Linux Server 13.04, installed from alfresco-community-4.0.e-installer-linux-x64.bin

In rare cases, when creating a very simple text (or maybe it is just the influence of moon phases 😉), for example a single Russian word "текст", the previewer displays it correctly the FIRST time, and only for the first document created in the repository. After reloading the previewer, editing, or creating another document with the same text "текст", it is malformed again.

loftux
Star Contributor
You can try the Embed viewer from Share Extras: http://share-extras.github.io/media-viewers/
With it you can configure text files to be viewed directly in the browser (the browser itself renders the text file, so international characters work). The problem here is the transformation text -> pdf -> flash: something in that chain breaks the encoding.

You can also find some configuration info here: http://loftux.se/meetup/