PDF extraction

‎03-14-2013 02:48 AM
Hi,
Are there any add-ons available for extracting PDFs to other formats (metadata extraction)? Your help would be greatly appreciated.
‎03-14-2013 05:47 AM
Hello,
Metadata extraction from PDF is already part of the standard platform. Usually, no add-ons are required for this unless you have a very specific requirement. What are you trying to do? Where do you think the default functionality of Alfresco provides too little support?
Regards
Axel

‎03-14-2013 10:03 AM
I want to extract information such as author, modified by, etc. Can you help me get these details?
‎03-14-2013 10:36 AM
Hello,
Author is one example of metadata that should already be extracted automatically when you upload a PDF via Share. You can also trigger metadata extraction by running the "extract-metadata" action (ContentMetadataExtracter) via a script or rule. "Modified By" (cm:modifier) isn't a property that should be extracted; it should only be maintained by the system. You could of course define a custom property and map the extracted value of any "Modified By" document header to it via the PdfBoxMetadataExtracter.properties mapping configuration.
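As a rough illustration of triggering the "extract-metadata" action from a script, here is a minimal sketch for an Alfresco repository JavaScript, for example one run by a folder rule. The helper name `triggerExtraction` is only illustrative, and the sketch assumes the script runs where Alfresco provides the `actions` and `document` root objects.

```javascript
// Sketch only: re-run metadata extraction on a node from a repository script.
// "triggerExtraction" is a hypothetical helper name, not part of Alfresco.
function triggerExtraction(actionService, node) {
  // Create the built-in action by its name and run it against the node.
  var action = actionService.create("extract-metadata");
  action.execute(node);
}

// Inside Alfresco, "actions" and "document" are supplied by the script runtime.
// The guard keeps this file harmless when it is loaded anywhere else.
if (typeof actions !== "undefined" && typeof document !== "undefined") {
  triggerExtraction(actions, document);
}
```

Once the action has run, the extracted value should be readable from the node, for example as `document.properties["cm:author"]`.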
Please also have a look at the wiki article about metadata extraction: http://wiki.alfresco.com/wiki/Metadata_Extraction
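For the custom-property idea, a mapping file might look roughly like the fragment below. It follows the layout of the default PdfBoxMetadataExtracter.properties, but the "my" namespace, the my:lastModifiedBy property and the "modifiedBy" key are all assumptions for illustration. Where the override file has to live, and how it is wired in, depends on your Alfresco version, so check the wiki article for that part.

```
# Hypothetical custom mapping, modelled on the default PdfBoxMetadataExtracter.properties

# Namespaces: "cm" is Alfresco's content model, "my" is an assumed custom model
namespace.prefix.cm=http://www.alfresco.org/model/content/1.0
namespace.prefix.my=http://www.example.com/model/custom/1.0

# Mappings: <extracted key>=<target property>
author=cm:author
title=cm:title
# "modifiedBy" is an assumed key; use whatever key the extracter actually reports
modifiedBy=my:lastModifiedBy
```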
Regards
Axel
‎03-18-2013 10:01 AM
Hi,
Could you please share some screenshots of the solution?
