cancel
Showing results for 
Search instead for 
Did you mean: 

URGENT(help needed) - Alfresco migration from latin1 to UTF8

zvukovic
Champ in-the-making
Champ in-the-making
Hi all,
We are using alfresco 3.2 instance and for database MySQL 5.1 in latin1 character set.
We will need to migrate this database from latin1 to utf-8 in order to provide support for filenames in Greek characters.
These are the steps we have folowed for migrating data
Dump database and convert data
> mysqldump alfresco > alfresco_latin1.sql
> iconv -f ISO8859-1 -t UTF-8 alfresco_latin1.sql > alfresco_utf8.sql
> cat alfresco_utf8.sql | sed  -e 's/DEFAULT CHARSET=latin1;/DEFAULT CHARSET=utf8 COLLATE utf8_bin;/' > alfresco_utf8-1.sql
> cat alfresco_utf8-1.sql | sed  -e 's/SET NAMES latin1/SET NAMES utf8/' > alfresco_utf8-2.sql
> mysql create database alfresco_utf8 DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;
> mysql alfresco_utf8 < alfresco_latin1.sql

Also next is added to connection parameter to support UTF8
db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}?useUnicode\=yes&characterEncoding\=UTF-8

Data is imported with no errors but when we tried to restart alfresco and put it to use new database we created it said we have inconsistent indexes and to do FUll reindexing.
We have specified index.recovery.mode=FULL but then we got some new errors saying it can not do re-indexing.


These are parts of log which describe issue

7:43:14,673 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    50 % complete.
17:43:35,556 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    60 % complete.
17:43:45,039 ERROR [org.alfresco.repo.node.index.AbstractReindexComponent.threads] Reindexer   182 failed with error: could not deserialize; nested exception is org.hibernate.type.SerializationException: could not deserialize.
17:43:53,811 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    70 % complete.
17:44:04,810 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    80 % complete.
17:44:23,245 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    90 % complete.
17:45:33,090 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent]    100 % complete.
21:05:05,786 ERROR [org.alfresco.repo.node.index.AbstractReindexComponent.threads] Reindexer   286 failed with error: could not deserialize; nested exception is org.hibernate.type.SerializationException: could not deserialize.
21:05:06,479 INFO  [org.alfresco.repo.node.index.FullIndexRecoveryComponent] Index recovery completed.
21:05:07,839 WARN  [org.alfresco.repo.cache.TransactionalCache.org.alfresco.cache.qnameEntityTransactionalCache] Transactional update cache 'org.alfresco.cache.qnameEntityTransactionalCache' is full (500).
21:05:18,206 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent] Rebuilding indexes for 26 AVM Stores
21:05:18,207 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent]     Rebuilding index for sitestore
21:05:18,401 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent]   Reindex   0% complete
21:05:18,401 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent]     Rebuilding index for project
21:05:18,620 INFO  [org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent]         Rebuilding index for snapshots 0 to 772




21:06:44,034 ERROR [org.springframework.web.context.ContextLoader] Context initialization failed
org.alfresco.error.AlfrescoRuntimeException: 04190000 Reindex failure for org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent
   at org.alfresco.repo.node.index.AbstractReindexComponent.reindex(AbstractReindexComponent.java:291)
   at org.alfresco.repo.node.index.IndexRecoveryBootstrapBean.onBootstrap(IndexRecoveryBootstrapBean.java:35)
   at org.alfresco.util.AbstractLifecycleBean.onApplicationEvent(AbstractLifecycleBean.java:62)
   at org.springframework.context.event.SimpleApplicationEventMulticaster$1.run(SimpleApplicationEventMulticaster.java:77)
   at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:49)
   at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:75)
   at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:246)
   at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:617)
   at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:355)
   at org.springframework.web.context.ContextLoader.createWebApplicationContext(ContextLoader.java:246)
   at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:189)
   at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:49)
   at org.alfresco.web.app.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:69)
   at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:3843)
   at org.apache.catalina.core.StandardContext.start(StandardContext.java:4342)
   at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
   at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
   at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:525)
   at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:627)
   at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:553)
   at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:488)
   at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1149)
   at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
   at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
   at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
   at org.apache.catalina.core.StandardHost.start(StandardHost.java:719)
   at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
   at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
   at org.apache.catalina.core.StandardService.start(StandardService.java:516)
   at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
   at org.apache.catalina.startup.Catalina.start(Catalina.java:578)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288)
   at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:413)
Caused by: org.springframework.jdbc.UncategorizedSQLException: SqlMapClient operation; uncategorized SQLException for SQL []; SQL state [null]; error code [0];  
— The error occurred in alfresco/ibatis/#resource.dialect#/avm-common-SqlMap.xml. 
— The error occurred while applying a result map. 
— Check the alfresco.avm.result_AVMNodeProperty. 
— Check the result mapping for the 'serializableValue' property. 
— Cause: org.alfresco.ibatis.SerializableTypeHandlerCallback$DeserializationException: java.io.StreamCorruptedException: invalid stream header: C2ACC3AD; nested exception is com.ibatis.common.jdbc.exception.NestedSQLException:  
— The error occurred in alfresco/ibatis/#resource.dialect#/avm-common-SqlMap.xml. 
— The error occurred while applying a result map. 
— Check the alfresco.avm.result_AVMNodeProperty. 
— Check the result mapping for the 'serializableValue' property. 
— Cause: org.alfresco.ibatis.SerializableTypeHandlerCallback$DeserializationException: java.io.StreamCorruptedException: invalid stream header: C2ACC3AD
   at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.translate(SQLStateSQLExceptionTranslator.java:120)
   at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.translate(SQLErrorCodeSQLExceptionTranslator.java:276)
   at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:197)
   at org.springframework.orm.ibatis.SqlMapClientTemplate.executeWithListResult(SqlMapClientTemplate.java:220)
   at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForList(SqlMapClientTemplate.java:267)


Please did anybody had experience with migrating latin1 to utf-8 character set on mysql and how you resolved issue?
Any help will be appreciated

Thanks
Zeljko
13 REPLIES 13

loftux
Star Contributor
Star Contributor
Is it a typo or did you import back the original dump in the last step?
mysql alfresco_utf8 < alfresco_latin1.sql

mrogers
Star Contributor
Star Contributor
And please don't cross post.     I've deleted them for you :!:

zvukovic
Champ in-the-making
Champ in-the-making
Hi mrogers,
It is my typo, we have done correct import
> mysql alfresco_utf8 < alfresco_utf8-2.sql

Do you have any clue how to find solution for re-indexing(FullIndexRecoveryComponent,AVMFullIndexRecoveryComponent)?

Thanks
Zeljko

loftux
Star Contributor
Star Contributor
The only thing I can think of is that the conversion/import isn't correct after all. That during the reindexing Alfresco fails to read the database.
Have your tried to create a new repository (db+alf_data) that uses utf-8 and the connection string you are using? If all works well, then do an export of the database structure (excluding data) for that db and your converted db and compare them. Maybe you will find a difference there. If not, then there is something in the actual data.
What if you do a select in your converted db, does the data look correct then?

Some googling, with the import
mysql -h host -u username -p password –default_character_set utf8 database < file.sql 
since your file is now utf-8 encoded.

zvukovic
Champ in-the-making
Champ in-the-making
And please don't cross post.     I've deleted them for you :!:
Sorry about that is urgent question for us so that is the reason why I posted several times.

Thanks

zvukovic
Champ in-the-making
Champ in-the-making
The only thing I can think of is that the conversion/import isn't correct after all. That during the reindexing Alfresco fails to read the database.
Have your tried to create a new repository (db+alf_data) that uses utf-8 and the connection string you are using? If all works well, then do an export of the database structure (excluding data) for that db and your converted db and compare them. Maybe you will find a difference there. If not, then there is something in the actual data.
What if you do a select in your converted db, does the data look correct then?

Some googling, with the import
mysql -h host -u username -p password –default_character_set utf8 database < file.sql 
since your file is now utf-8 encoded.

Hi Loftux.
To explain whole scenario : we are having 2 ALfresco instances one with latin1 and second one with utf-8.
This one with utf-8 character set is our DEV environment and there is everything ok and verified regarding Greek character set.
On PROD we have latin1 and we need to migrate it to use utf-8 because of upcoming request to have Greek character set for filenames in Alfresco.

We have tested all this and it is ok on DEV environment where we are already using utf-8.
Steps we have done in order to migrate data
1) we created one more UTF-8 schema alfresco_utf8 in DEV environment
2) then we dumped data from PROD
3) converted it to utf-8 in steps I have desribed
4) restored it to DEV in this new UTF-8 schema

After that we have alfresco db.url to use this new schema and to do FULL re-indexing but we are stucked because of the error I have posted.

Do you have any clue how we can resolve the issue with re-indexing because alfresco doesn;t want to start with this error we get.
Also if we try to start it without re-indexing it wan;t to start because it says we need to re-index content

Please do you any idea how we can overcome this?

Can we delete complete alf_data folder and restart alfresco, will it regenerate alf_data with data from db?

Thanks
Zeljko

loftux
Star Contributor
Star Contributor
The only thing, as mentioned, is that the import
> mysql alfresco_utf8 < alfresco_utf8-2.sql
is importing using ISO8859-1 instead of UTF-8 (your data file is now in utf-8). Thats why I would suggest
> mysql –default_character_set utf8 alfresco_utf8 < alfresco_utf8-2.sql
Other than that, I have no idea.

zvukovic
Champ in-the-making
Champ in-the-making
The only thing, as mentioned, is that the import
> mysql alfresco_utf8 < alfresco_utf8-2.sql
is importing using ISO8859-1 instead of UTF-8 (your data file is now in utf-8). Thats why I would suggest
> mysql –default_character_set utf8 alfresco_utf8 < alfresco_utf8-2.sql
Other than that, I have no idea.


Hmm, this is the statement we used to create database

> mysql create database alfresco_utf8 DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;

Ok will import data like you stated and try once again re-indexing.

Thanks
Zeljko

zvukovic
Champ in-the-making
Champ in-the-making
Hi Loftux,
I have tried like you suggested but that didn't resolved our issue
> mysql –default_character_set utf8 alfresco_utf8 < alfresco_utf8-2.sql

Do you have any other idea what can help to migrate successfully from latin1 to utf-8?

Thanks
Zeljko