cancel
Showing results for 
Search instead for 
Did you mean: 

URGENT(help needed) - Alfresco migration from latin1 to UTF8

zvukovic
Champ in-the-making
Champ in-the-making
Hi all,
We are using alfresco 3.2 instance and for database MySQL 5.1 in latin1 character set.
We will need to migrate this database from latin1 to utf-8 in order to provide support for filenames in Greek characters.
These are the steps we have folowed for migrating data
Dump database and convert data
> mysqldump alfresco > alfresco_latin1.sql
> iconv -f ISO8859-1 -t UTF-8 alfresco_latin1.sql > alfresco_utf8.sql
> cat alfresco_utf8.sql | sed  -e 's/DEFAULT CHARSET=latin1;/DEFAULT CHARSET=utf8 COLLATE utf8_bin;/' > alfresco_utf8-1.sql
> cat alfresco_utf8-1.sql | sed  -e 's/SET NAMES latin1/SET NAMES utf8/' > alfresco_utf8-2.sql
> mysql create database alfresco_utf8 DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;
> mysql alfresco_utf8 < alfresco_latin1.sql

Also next is added to connection parameter to support UTF8
db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}?useUnicode\=yes&characterEncoding\=UTF-8

Data is imported with no errors but when we tried to restart alfresco and put it to use new database we created it said we have inconsistent indexes and to do FUll reindexing.
We have specified index.recovery.mode=FULL but then we got some new errors saying it can not do re-indexing.


These are parts of log which describe issue

7:43:14,673 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    50 % complete.
17:43:35,556 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    60 % complete.
17:43:45,039 ERROR [org.alfresco.repo.node.index.AbstractReindexComponent.threads] Reindexer   182 failed with error: could not deserialize; nested exception is org.hibernate.type.SerializationException: could not deserialize.
17:43:53,811 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    70 % complete.
17:44:04,810 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    80 % complete.
17:44:23,245 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    90 % complete.
17:45:33,090 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    100 % complete.
21:05:05,786 ERROR [org.alfresco.repo.node.index.AbstractReindexComponent.threads] Reindexer   286 failed with error: could not deserialize; nested exception is org.hibernate.type.SerializationException: could not deserialize.
21:05:06,479 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent] Index recovery completed.
21:05:07,839 WARN  [org.alfresco.repo.cache.TransactionalCache.org.alfresco.cache.qnameEntityTransactionalCache] Transactional update cache 'org.alfresco.cache.qnameEntityTransactionalCache' is full (500).
21:05:18,206 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent] Rebuilding indexes for 26 AVM Stores
21:05:18,207 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent]     Rebuilding index for sitestore
21:05:18,401 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent]   Reindex   0% complete
21:05:18,401 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent]     Rebuilding index for project
21:05:18,620 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent]         Rebuilding index for snapshots 0 to 772




21:06:44,034 ERROR [org.springframework.web.context.ContextLoader] Context initialization failed
org.alfresco.error.AlfrescoRuntimeException: 04190000 Reindex failure for org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent
   at org.alfresco.repo.node.index.AbstractReindexComponent.reindex(AbstractReindexComponent.java:291)
   at org.alfresco.repo.node.index.IndexRecoveryBootstrapBean.onBootstrap(IndexRecoveryBootstrapBean.java:35)
   at org.alfresco.util.AbstractLifecycleBean.onApplicationEvent(AbstractLifecycleBean.java:62)
   at org.springframework.context.event.SimpleApplicationEventMulticaster$1.run(SimpleApplicationEventMulticaster.java:77)
   at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:49)
   at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:75)
   at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:246)
   at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:617)
   at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:355)
   at org.springframework.web.context.ContextLoader.createWebApplicationContext(ContextLoader.java:246)
   at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:189)
   at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:49)
   at org.alfresco.web.app.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:69)
   at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:3843)
   at org.apache.catalina.core.StandardContext.start(StandardContext.java:4342)
   at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
   at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
   at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:525)
   at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:627)
   at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:553)
   at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:488)
   at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1149)
   at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
   at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
   at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
   at org.apache.catalina.core.StandardHost.start(StandardHost.java:719)
   at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
   at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
   at org.apache.catalina.core.StandardService.start(StandardService.java:516)
   at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
   at org.apache.catalina.startup.Catalina.start(Catalina.java:578)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288)
   at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:413)
Caused by: org.springframework.jdbc.UncategorizedSQLException: SqlMapClient operation; uncategorized SQLException for SQL []; SQL state [null]; error code [0];  
— The error occurred in alfresco/ibatis/#resource.dialect#/avm-common-SqlMap.xml. 
— The error occurred while applying a result map. 
— Check the alfresco.avm.result_AVMNodeProperty. 
— Check the result mapping for the 'serializableValue' property. 
— Cause: org.alfresco.ibatis.SerializableTypeHandlerCallback$DeserializationException: java.io.StreamCorruptedException: invalid stream header: C2ACC3AD; nested exception is com.ibatis.common.jdbc.exception.NestedSQLException:  
— The error occurred in alfresco/ibatis/#resource.dialect#/avm-common-SqlMap.xml. 
— The error occurred while applying a result map. 
— Check the alfresco.avm.result_AVMNodeProperty. 
— Check the result mapping for the 'serializableValue' property. 
— Cause: org.alfresco.ibatis.SerializableTypeHandlerCallback$DeserializationException: java.io.StreamCorruptedException: invalid stream header: C2ACC3AD
   at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.translate(SQLStateSQLExceptionTranslator.java:120)
   at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.translate(SQLErrorCodeSQLExceptionTranslator.java:276)
   at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:197)
   at org.springframework.orm.ibatis.SqlMapClientTemplate.executeWithListResult(SqlMapClientTemplate.java:220)
   at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForList(SqlMapClientTemplate.java:267)


Please did anybody had experience with migrating latin1 to utf-8 character set on mysql and how you resolved issue?
Any help will be appreciated

Thanks
Zeljko
13 REPLIES 13

mrogers
Star Contributor
Star Contributor
The error posted above is, I suspect, because you have translated a serialized value which is stored in a BLOB.   Not being text, the character set conversion will have corrupted your data.

You need to take care that you only convert text colums to utf-8.

zvukovic
Champ in-the-making
Champ in-the-making
The error posted above is, I suspect, because you have translated a serialized value which is stored in a BLOB.   Not being text, the character set conversion will have corrupted your data.

You need to take care that you only convert text colums to utf-8.

Hi mrogers,
Thanks, for this info.
I will try to apply migration procedure only on text fields and hope that will resolve our issue.
Will let back to you when do migration only on text fields.

Thanks
Zeljko

zvukovic
Champ in-the-making
Champ in-the-making
Hi all,
We done migration only on text fiedls and then re-index was successful but now we are sseing some other issue.
I assume that is coming from content we have in ALFRESCO_HOME/alf_data/ folder.
Now we are checking that part. If someone need scripts which need to be executed in order to convert data from latin1 to utf-8 here is what we used:

Alter statements to migrate tables:
SELECT CONCAT( 'ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME, ' convert to CHARACTER SET utf8 COLLATE utf8_bin;' ) FROM information_schema.TABLES 
WHERE TABLE_SCHEMA = 'alfresco_utf8' AND TABLE_COLLATION = 'latin1_swedish_ci' ORDER BY TABLE_ROWS desc;
Alter statements to migrate text columns:
                         SELECT CONCAT( 'ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME,' MODIFY COLUMN ',COLUMN_NAME,' ',COLUMN_TYPE,' CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;') FROM information_schema.COLUMNS 
                         WHERE TABLE_SCHEMA = 'alfresco_utf8' AND COLLATION_NAME = 'latin1_swedish_ci'and IS_NULLABLE='NO'
                         UNION
                         SELECT CONCAT( 'ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME,' MODIFY COLUMN ',COLUMN_NAME,' ',COLUMN_TYPE,' CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL;') FROM information_schema.COLUMNS
                         WHERE TABLE_SCHEMA = 'alfresco_utf8' AND COLLATION_NAME = 'latin1_swedish_ci'and IS_NULLABLE='YES';

Thanks
Zeljko

stegbth
Champ in-the-making
Champ in-the-making
Hi,

so what is the correct way to upgrade
Alfresco 3.2r with Latin1 database to 3.4 with UTF-8 database?

Export Database from 3.2r with latin1 with mysqldump
install new 3.4d
import 3.2r into new MySQL with Latin1
convert the tables with

SELECT CONCAT( 'ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME, ' convert to CHARACTER SET utf8 COLLATE utf8_bin;' ) FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'alfresco_utf8' AND TABLE_COLLATION = 'latin1_swedish_ci' ORDER BY TABLE_ROWS desc;


SELECT CONCAT
( 'ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME,' MODIFY COLUMN ',COLUMN_NAME,' ',COLUMN_TYPE,' CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;') FROM information_schema.COLUMNS
  WHERE TABLE_SCHEMA = 'alfresco_utf8' AND COLLATION_NAME = 'latin1_swedish_ci'and IS_NULLABLE='NO'
      UNION
SELECT CONCAT( 'ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME,' MODIFY COLUMN ',COLUMN_NAME,' ',COLUMN_TYPE,' CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL;') FROM information_schema.COLUMNS
   WHERE TABLE_SCHEMA = 'alfresco_utf8' AND COLLATION_NAME = 'latin1_swedish_ci'and IS_NULLABLE='YES';
check the database connect Config includes UTF8

move in old alf_data and set rindex=full?

are the all information modified to UTF-8?

best regards
thomas