Adding HTML content, Alfresco 1.4
11-23-2006 08:14 AM
There seems to be an unwanted transformation of content when adding HTML content to an Alfresco space. It shows up when I view the HTML document I just added. It looks as if some kind of parsing replaces non-breaking spaces (&nbsp;), apostrophes ( ' ), bullets and similar characters with a " ? ". As a result, a line with multiple tabs in the original (.doc) shows a run of question marks when viewed within Alfresco.
The original document was created in MS Word (.doc) and saved as filtered HTML before being added as content to Alfresco.
Would someone have a way around this problem?
11-23-2006 09:41 AM
Unless you have applied a transformation via a rule, then there is no modification of the content when it is added to Alfresco. HTML or other content types are not modified by default.
If you view the content directly in the browser before uploading does it still have the problem? If you compare the content in Alfresco with the original file does it actually contain different values?
Thanks,
Kevin
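
For anyone wanting to run the comparison Kevin asks about, here is a minimal sketch that checks whether the file downloaded from Alfresco is byte-for-byte identical to the original. The function names and the file names in the usage note are hypothetical, not part of Alfresco.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of the raw bytes in the file at path."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_bytes(original, downloaded):
    """True if both files contain exactly the same bytes."""
    return sha256_of(original) == sha256_of(downloaded)
```

Calling `same_bytes("original.htm", "downloaded.htm")` (hypothetical file names) and getting `True` would suggest the stored content is unchanged and that the difference is only in how it is displayed.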
11-24-2006 12:50 AM
Yes. The contents are different.
Before posting this question I did the following test:
1. Created a simple Word document with a bit of text with embedded tabs, saved as .doc and .htm
2. Viewed the HTML source with Nvu to confirm the tab spaces are coded as non-breaking spaces (&nbsp;)
3. Added the file as content to Alfresco
4. Clicked the document to view the contents: the tabs are all shown as question marks
5. Saving the document to disk and then viewing it does not show the unwanted characters
6. Checking out for editing and then editing from within the space shows the unwanted characters
7. Saving the working copy of the file to disk and then viewing or editing it does not show the unwanted characters
11-27-2006 07:12 AM
Try saving your web page (Word document) with Unicode or Unicode (UTF-8) encoding.
MS Word doc to HTML:
1. write your document with MS Word
2. select “File” -> “Save as Web Page”
3. from the “Save as” dialog select “Tools” -> “Web Options”
4. from the “Web Options” dialog select the “Encoding” tab
5. select “Unicode” or “Unicode (UTF-8)” encoding instead of windows-1252
6. click OK
7. type the file name and
8. click Save
OR in the HTML file:
1. change <meta http-equiv=Content-Type content="text/html; charset=windows-1252"> to <meta http-equiv=Content-Type content="text/html; charset=utf-8">
Regards,
Said
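
If the file has already been saved as windows-1252, changing only the meta tag may not be enough, because the bytes on disk would still be windows-1252. The following is a rough sketch of re-encoding such a file to UTF-8 and updating the declaration in one step. The function name `reencode_html` is hypothetical, and the sketch assumes the input really is windows-1252.

```python
import re


def reencode_html(raw: bytes) -> bytes:
    """Convert windows-1252 HTML bytes to UTF-8 and update the charset declaration."""
    # Assumption: the input is windows-1252. Bytes that code page leaves
    # undefined (for example 0x81) make decode() raise UnicodeDecodeError.
    text = raw.decode("cp1252")
    # Point the declared charset at the encoding we are about to write.
    text = re.sub(
        r"charset\s*=\s*windows-1252",
        "charset=utf-8",
        text,
        flags=re.IGNORECASE,
    )
    return text.encode("utf-8")
```

A typical use would be to read the saved .htm file in binary mode, pass the bytes through `reencode_html`, write the result to a new file, and add that file to Alfresco.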
11-27-2006 12:19 PM
"7. Saved the working copy of the file on disc, then viewed or edited it, and the unwanted characters did not show"
This proves that Alfresco has not modified the content. As user 'bensai' suggests, the problem is the character encoding the browser uses to display the page.
Thanks,
Kevin
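
To illustrate the kind of mismatch described here, the sketch below decodes the same bytes two ways. It assumes the saved file contains literal windows-1252 bytes for the non-breaking spaces, apostrophe and bullet, and that the page is displayed as UTF-8. Both are assumptions, since the thread does not confirm which encoding the browser applied. `view_as` is a hypothetical helper.

```python
def view_as(raw: bytes, assumed: str) -> str:
    """Decode raw bytes the way a viewer would if it assumed the given encoding."""
    # errors="replace" substitutes U+FFFD for bytes that are invalid in the
    # assumed encoding; many viewers render that character as a question mark.
    return raw.decode(assumed, errors="replace")


# Sample text: two non-breaking spaces, a curly apostrophe and a bullet,
# stored as windows-1252 bytes (assumed to match what Word wrote).
saved = "tab\u00a0\u00a0gap, it\u2019s, \u2022 item".encode("cp1252")

print(view_as(saved, "cp1252"))  # decoded with the encoding it was saved in
print(view_as(saved, "utf-8"))   # decoded with the wrong encoding
```

The bytes are identical in both calls and only the interpretation changes, which is consistent with the saved copy looking fine on disk.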
11-29-2006 05:02 AM
Thank you, Kevin, for your prompt reply, and bensai as well.
Things are in order now.
