
How to preserve original document create and modified date during upload

sanjaybandhniya
Elite Collaborator

Hi,

I want to preserve a document's original created and modified dates during upload. How can I achieve that?

If this is possible, will the dates also be preserved during FTP upload?

3 ACCEPTED ANSWERS

OK, if you can locate the extracted metadata in the log output of AbstractMappingMetadataExtracter/PdfBoxMetadataExtracter, and Found: {..............} contains the 'created/modified' metadata but Mapped and Accepted: {............} does not, then here is what could be happening.

  • When a file is uploaded, the cm:auditable aspect is applied during node creation. It contains the "cm:created" and "cm:modified" properties, which are set at that point. Both are protected, mandatory properties (see the definitions below) defined in the OOTB contentModel.xml. When a property is defined as "protected", the repository maintains its value and clients cannot update it through the normal APIs or UI, i.e. it is effectively read-only.

         So these values are set by the repository at node creation, and the metadata extracter cannot overwrite them afterwards.

<property name="cm:created">
	<title>Created</title>
	<type>d:datetime</type>
	<protected>true</protected>
	<mandatory enforced="true">true</mandatory>
	<index enabled="true">
		<atomic>true</atomic>
		<stored>false</stored> 
		<tokenised>both</tokenised>
		<facetable>true</facetable>
	</index>
</property>

<property name="cm:modified">
	<title>Modified</title>
	<type>d:datetime</type>
	<protected>true</protected>
	<mandatory enforced="true">true</mandatory>
	<index enabled="true">
		<atomic>true</atomic>
		<stored>false</stored> 
		<tokenised>both</tokenised>
		<facetable>true</facetable>
	</index>
</property>

An alternative is to create custom properties in your own content model and keep the extracted created/modified metadata values on those properties. The only other option is to override the default behavior of the auditable aspect properties, which I believe would not be a good idea.

For example:

Create the following properties in your custom content model:

<aspect name="demo:customAuditMetadata">
	<title>Custom Audit Metadata</title>
	<description>Custom Audit Metadata</description>
	<properties>
		<property name="demo:originCreatedDate">
			<title>Original Created Date</title>
			<description>Created date of files based on incoming metadata extracted from metadata extractor</description>
			<type>d:text</type>
		</property>	
		<property name="demo:originModifiedDate">
			<title>Original Modified Date</title>
			<description>Modified date of files based on incoming metadata extracted from metadata extractor</description>
			<type>d:text</type>
		</property>	
	</properties>
</aspect>

Add the following bean definition, with the above properties listed in mappingProperties:

<bean id="extracter.PDFBox" class="org.alfresco.repo.content.metadata.PdfBoxMetadataExtracter" parent="baseMetadataExtracter">
	<property name="documentSelector" ref="pdfBoxEmbededDocumentSelector" />
	<property name="inheritDefaultMapping">
		<value>false</value>
	</property>
	<property name="overwritePolicy">
		<!-- Apply extracted values every time extraction runs (e.g. when content is updated or a new version is uploaded). -->
		<value>EAGER</value>
	</property>
	<property name="mappingProperties">
		<props>
			<prop key="namespace.prefix.demo">http://www.github.com/model/demo/1.0</prop>
			<prop key="created">demo:originCreatedDate</prop>
			<prop key="modified">demo:originModifiedDate</prop>
		</props>
	</property>
</bean>

- Update the Share config to display the newly added properties on the document-details page as needed.
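
A rough sketch of that Share change, assuming the aspect and property names from the model above and that the configuration goes into share-config-custom.xml. The first block makes the aspect selectable in Share and the second shows both properties on forms for nodes that carry the aspect. Treat it as a starting point rather than a tested configuration:

```xml
<!-- share-config-custom.xml (sketch; assumes the demo:customAuditMetadata aspect defined above) -->
<alfresco-config>
	<!-- Make the aspect available when managing aspects on a document -->
	<config evaluator="string-compare" condition="DocumentLibrary">
		<aspects>
			<visible>
				<aspect name="demo:customAuditMetadata" />
			</visible>
		</aspects>
	</config>

	<!-- Show the two properties on forms for nodes that have the aspect -->
	<config evaluator="aspect" condition="demo:customAuditMetadata">
		<forms>
			<form>
				<field-visibility>
					<show id="demo:originCreatedDate" />
					<show id="demo:originModifiedDate" />
				</field-visibility>
			</form>
		</forms>
	</config>
</alfresco-config>
```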

~Abhinav
(ACSCE, AWS SAA, Azure Admin)


@sanjaybandhniya  Find the demo project here:

https://github.com/abhinavmishra14/alfresco-metadataextraction-demo

I noticed a difference between the Community and Enterprise editions. The examples I gave above work perfectly on Enterprise 5.2.x (I used 5.2.6) and 6.1.x (I used 6.1), but on Community editions the properties files are not picked up reliably (the behavior is intermittent).

The only change I made for Community edition is shown below, and with it the file is always picked up correctly.

<property name="mappingProperties">
    <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
	<property name="location">
	   <value>classpath:alfresco/module/${project.artifactId}/metadataextraction/TikaAutoMetadataExtracter.properties</value>
	</property>
    </bean>
</property>

On Enterprise both work fine, the path above as well as the one below:

<value>classpath:alfresco/metadata/TikaAutoMetadataExtracter.properties</value>

This one also works on both editions:

<value>classpath:alfresco/extension/metadata/TikaAutoMetadataExtracter.properties</value>

https://github.com/abhinavmishra14/alfresco-metadataextraction-demo/blob/master/metadata-extractor-d...

I am not sure how the two editions (Community and Enterprise) differ in terms of extension points. I tried looking at the source code but found no clues. The good news is that the other path I shared above (available in the demo project) works for both Community and Enterprise.

Hope this helps narrow down your issue.

~Abhinav
(ACSCE, AWS SAA, Azure Admin)


I have created an alternative deployment for Docker that allows this configuration to be applied.

A sample project is available at https://github.com/aborroy/alfresco-custom-metadata-extractor

Hyland Developer Evangelist


24 REPLIES

bip1989
Star Contributor

I had a similar requirement in my project. To keep the original created/creator/modified/modifier values from when documents are first uploaded, we added a custom auditable aspect to our custom model. We used a document-creation behavior to apply that aspect with the same values that Alfresco's auditable aspect holds at creation time.

Alfresco then keeps updating its own auditable aspect on further updates to the document, while our custom aspect remains unchanged.
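
A minimal sketch of what such a custom auditable aspect could look like in a content model. The demo: prefix, the aspect name and all four property names are hypothetical and only illustrate the idea; the behavior that copies the values at creation time is not shown:

```xml
<!-- Hypothetical aspect; the demo: prefix and every name below are assumptions for illustration -->
<aspect name="demo:originAuditable">
	<title>Original Auditable</title>
	<properties>
		<!-- Snapshot of cm:created taken when the node is first created -->
		<property name="demo:originCreated">
			<title>Original Created</title>
			<type>d:datetime</type>
		</property>
		<!-- Snapshot of cm:creator -->
		<property name="demo:originCreator">
			<title>Original Creator</title>
			<type>d:text</type>
		</property>
		<!-- Snapshot of cm:modified -->
		<property name="demo:originModified">
			<title>Original Modified</title>
			<type>d:datetime</type>
		</property>
		<!-- Snapshot of cm:modifier -->
		<property name="demo:originModifier">
			<title>Original Modifier</title>
			<type>d:text</type>
		</property>
	</properties>
</aspect>
```

Because these properties are not marked protected, custom code can set them once and leave them alone on later updates.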

I am not talking about the Alfresco upload date. I want to preserve the document's original creation date during upload.

Please check the image below.

image

The OOTB metadata extractor does extract and map the created date. If you look at the PDFBox metadata extractor properties, you will notice that the "created" metadata is mapped to "cm:created".

# Mappings
author=cm:author
title=cm:title
description=cm:description
created=cm:created

https://github.com/Alfresco/alfresco-repository/blob/master/src/main/resources/alfresco/metadata/Pdf...

This is the class that should be parsing newly uploaded PDF files, extracting their available metadata and mapping it to content model properties:

https://github.com/Alfresco/alfresco-repository/blob/master/src/main/java/org/alfresco/repo/content/...

You can enable the following loggers to see whether metadata is being extracted:

log4j.logger.org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter=DEBUG
log4j.logger.org.alfresco.repo.content.metadata.PdfBoxMetadataExtracter=DEBUG
log4j.logger.org.alfresco.repo.content.metadata.TikaAutoMetadataExtracter=DEBUG

Have a look at this test class as well, which tests the "createdate" metadata:

https://github.com/Alfresco/alfresco-repository/blob/master/src/test/java/org/alfresco/repo/content/...

You can also look at the auto metadata extractor implementation for reference: https://github.com/Alfresco/alfresco-repository/blob/master/src/main/java/org/alfresco/repo/content/...

https://github.com/Alfresco/alfresco-repository/blob/master/src/main/resources/alfresco/metadata/Tik...

~Abhinav
(ACSCE, AWS SAA, Azure Admin)

Hi,

I have checked all these classes, but I still do not understand what customization I have to make to enable mapping of the original document created date to cm:created.