Relation between Document and Database

shashi
Champ in-the-making

Hi All,

We want to export all documents from /contentstore, together with their metadata from the alf_content_url table.

However, the alf_content_url table stores only the paths of the files in *.bin format (e.g. store://2021/1/29/12/31/17dfa825-fe2f-4cc1-90c0-f6b3b287445a.bin).

We can also see the files and folders in the Alfresco UI under Company Home.

So where can we find the folder and file relations, i.e. which file belongs to which folder? Are they stored in a table or in a configuration file?

Thanks in Advance!

Kind regards,

Shashi

sufo
Star Contributor

EDIT: added the correct document name from the ALF_NODE_PROPERTIES table.

You've asked for it :)

select p2.string_value as content_name, p1.string_value as parent_name, u.content_url 
    from 
        alf_content_url u left join alf_content_data d on u.id=d.content_url_id 
        left join alf_node_properties p on d.id=p.long_value 
        left join alf_child_assoc a on a.child_node_id=p.node_id 
        left join alf_node_properties p1 on a.parent_node_id=p1.node_id 
        left join alf_node_properties p2 on a.child_node_id=p2.node_id
    where 
        p.qname_id=(  -- p: the cm:content property of the document
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='content' and n.uri='http://www.alfresco.org/model/content/1.0'
        ) 
        and
        p1.qname_id=(  -- p1: the cm:name of the parent folder
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
        )
        and
        p2.qname_id=(  -- p2: the cm:name of the document itself
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
        );

But you will also get thumbnails, so you have to filter the results further.

Document metadata is stored in the ALF_NODE_PROPERTIES table. That table also contains NODE_ID, which is used in the ALF_CHILD_ASSOC table that holds the parent-child relations. ALF_NODE_PROPERTIES also contains the ID of an ALF_CONTENT_DATA row, which in turn contains CONTENT_URL_ID, and that is the ID in the ALF_CONTENT_URL table.

This covers only the ideal scenario. If you have multifiling (one document in several folders), other associations defined, or more than one content property (custom model), things can get far more complicated.
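
To make the chain of IDs described above concrete, here is a minimal Python sketch that runs against an in-memory SQLite database. The schema is a heavily simplified, hypothetical stand-in for the real Alfresco tables: the real ones resolve property names through alf_qname and alf_namespace and have many more columns. The `qname` text column, the node ids and the sample rows are all invented for illustration.

```python
import sqlite3

# Toy stand-in for the Alfresco tables named in the post (NOT the real schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table alf_content_url (id integer primary key, content_url text);
create table alf_content_data (id integer primary key, content_url_id integer);
create table alf_node_properties (
    node_id integer, qname text, string_value text, long_value integer);
create table alf_child_assoc (parent_node_id integer, child_node_id integer);

insert into alf_content_url values (1, 'store://2021/1/29/12/31/example.bin');
insert into alf_content_data values (10, 1);
-- node 100 is a folder, node 200 is a document stored inside it
insert into alf_node_properties values (100, 'name', 'Reports', null);
insert into alf_node_properties values (200, 'name', 'test.doc', null);
insert into alf_node_properties values (200, 'content', null, 10);
insert into alf_child_assoc values (100, 200);
""")

# content URL -> content data -> content property -> node -> parent association
rows = conn.execute("""
select doc_name.string_value, parent_name.string_value, u.content_url
from alf_content_url u
join alf_content_data d on d.content_url_id = u.id
join alf_node_properties c on c.long_value = d.id and c.qname = 'content'
join alf_child_assoc a on a.child_node_id = c.node_id
join alf_node_properties parent_name
    on parent_name.node_id = a.parent_node_id and parent_name.qname = 'name'
join alf_node_properties doc_name
    on doc_name.node_id = a.child_node_id and doc_name.qname = 'name'
""").fetchall()

print(rows)
```

With multifiling, the same document would have several rows in alf_child_assoc, so this kind of join would return one row per parent folder.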

shashi
Champ in-the-making

Hi Sufo,

Thanks a lot for your answer. It is indeed a good answer that clears up my questions.

One more question: how can we get the complete folder and subfolder details for a file in a tree-structure format (for example, a test.doc file stored inside the src/main/test folder)?

Thanks in Advance!

Kind regards,

Shashi

sufo
Star Contributor

I love to learn new things, so now I know how to do recursive selects in the DB (to build the path using the ALF_CHILD_ASSOC table) :)

This SQL code works on Oracle DB and filters out thumbnails and older versions of content (it only takes workspace://SpacesStore into account):

create or replace function get_path (document_node_id in number) return varchar2 as
    document_path varchar2(32767);
begin
    with pth(parent_node_id, child_node_id, parent_name) as (
        select a.parent_node_id as parent_node_id, a.child_node_id as child_node_id, p1.string_value as parent_name from
            alf_child_assoc a left join alf_node_properties p1 on a.parent_node_id=p1.node_id 
        where 
            p1.qname_id=(
                select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
                where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
            )
            and
            a.child_node_id=document_node_id
        union all
        select a1.parent_node_id, a1.child_node_id, p2.string_value from 
            pth left join alf_child_assoc a1 on pth.parent_node_id=a1.child_node_id left join alf_node_properties p2 on a1.parent_node_id=p2.node_id
        where
            p2.qname_id=(
                select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
                where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
            )
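        -- Note: ordering by parent_node_id yields a root-first path only if every ancestor has a lower node id than its descendants.
    ) select listagg(parent_name, '/') within group(order by parent_node_id) into document_path from pth;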
    return document_path;
end get_path;

select get_path(a.child_node_id) as content_path, p2.string_value as content_name, u.content_url 
    from 
        alf_content_url u left join alf_content_data d on u.id=d.content_url_id 
        left join alf_node_properties p on d.id=p.long_value 
        left join alf_child_assoc a on a.child_node_id=p.node_id 
        left join alf_node_properties p1 on a.parent_node_id=p1.node_id 
        left join alf_node_properties p2 on a.child_node_id=p2.node_id
        left join alf_node n on a.child_node_id=n.id
    where 
        n.store_id=(
            select id from alf_store where protocol='workspace' and identifier='SpacesStore'
        )
        and not
        n.type_qname_id=(
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='thumbnail' and n.uri='http://www.alfresco.org/model/content/1.0'
        )
        and
        p.qname_id=(
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='content' and n.uri='http://www.alfresco.org/model/content/1.0'
        ) 
        and  
        p1.qname_id=(
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
        )
        and
        p2.qname_id=( 
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0' 
        );

PostgreSQL version:

create or replace function get_path(document_node_id in bigint) returns text as $$
declare document_path text;
begin
    with recursive pth(parent_node_id, child_node_id, parent_name) as (
        select a.parent_node_id as parent_node_id, a.child_node_id as child_node_id, p1.string_value as parent_name from
            alf_child_assoc a left join alf_node_properties p1 on a.parent_node_id=p1.node_id 
        where 
            p1.qname_id=(
                select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
                where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
            )
            and
            a.child_node_id=document_node_id
        union all
        select a1.parent_node_id, a1.child_node_id, p2.string_value from 
            pth left join alf_child_assoc a1 on pth.parent_node_id=a1.child_node_id left join alf_node_properties p2 on a1.parent_node_id=p2.node_id
        where
            p2.qname_id=(
                select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
                where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
            )
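        -- Note: ordering by parent_node_id yields a root-first path only if every ancestor has a lower node id than its descendants.
    ) select string_agg(parent_name, '/' order by parent_node_id) into document_path from pth;
    return document_path;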
end
$$ language plpgsql;

select get_path(a.child_node_id) as document_path, p2.string_value as content_name, u.content_url 
    from 
        alf_content_url u left join alf_content_data d on u.id=d.content_url_id 
        left join alf_node_properties p on d.id=p.long_value 
        left join alf_child_assoc a on a.child_node_id=p.node_id 
        left join alf_node_properties p1 on a.parent_node_id=p1.node_id 
        left join alf_node_properties p2 on a.child_node_id=p2.node_id
        left join alf_node n on a.child_node_id=n.id
    where 
        n.store_id=(
            select id from alf_store where protocol='workspace' and identifier='SpacesStore'
        )
        and not
        n.type_qname_id=(
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='thumbnail' and n.uri='http://www.alfresco.org/model/content/1.0'
        )
        and
        p.qname_id=(
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='content' and n.uri='http://www.alfresco.org/model/content/1.0'
        ) 
        and  
        p1.qname_id=(
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
        )
        and
        p2.qname_id=( 
            select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
            where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0' 
        );
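
As a rough illustration of the recursive walk used by the get_path functions above, the following Python sketch runs a recursive CTE in an in-memory SQLite database. The two tables (`node_name`, `child_assoc`) and all ids are hypothetical simplifications, not the Alfresco schema. As a variation on the queries above, it tracks an explicit `depth` column and orders by it, so the path does not depend on node ids increasing from parent to child.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical, simplified tables (not the real Alfresco schema)
create table node_name (node_id integer primary key, name text);
create table child_assoc (parent_node_id integer, child_node_id integer);
-- ids are deliberately out of order: folder 'main' (50) is newer than its child
insert into node_name values (1, 'src'), (50, 'main'), (3, 'test'), (7, 'test.doc');
insert into child_assoc values (1, 50), (50, 3), (3, 7);
""")


def get_path(document_node_id):
    """Return the folder path of a node, root first, joined with '/'."""
    rows = conn.execute(
        """
        with recursive pth(parent_node_id, parent_name, depth) as (
            -- anchor: the direct parent of the document
            select a.parent_node_id, n.name, 1
            from child_assoc a
            join node_name n on n.node_id = a.parent_node_id
            where a.child_node_id = ?
            union all
            -- recursive step: the parent of the previously found parent
            select a.parent_node_id, n.name, pth.depth + 1
            from pth
            join child_assoc a on a.child_node_id = pth.parent_node_id
            join node_name n on n.node_id = a.parent_node_id
        )
        select parent_name from pth order by depth desc
        """,
        (document_node_id,),
    ).fetchall()
    return "/".join(name for (name,) in rows)


print(get_path(7))
```

In this toy data, ordering by node id instead of depth would place 'test' (id 3) before 'main' (id 50), which is why the sketch carries the depth along.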

shashi
Champ in-the-making

Thank you Sufo!

It is a really helpful and good query!

But I tried to convert the query to MySQL, as we are using a MySQL database.

I am getting lots of syntax errors. I tried to convert it, but I am still getting errors, especially for the functions.

Any help for MySQL would be much appreciated :)

Thanks in advance!

Kind regards

sufo
Star Contributor

Shashi, it seems that you have MySQL 5.7 or an even older version (recursive CTEs are only available from MySQL 8.0 on). This should work:

drop procedure if exists get_path;
delimiter $$
create procedure get_path(in document_node_id bigint(20), out path text)
begin
    declare parent_name varchar(255);
    declare temppath text;
    declare tempparent bigint(20);
    declare continue handler for not found 
    begin
        set tempparent = null;
        set parent_name = '';
    end;
    set max_sp_recursion_depth = 255;
    select a.parent_node_id, p1.string_value from
            alf_child_assoc a left join alf_node_properties p1 on a.parent_node_id=p1.node_id 
        where 
            p1.qname_id=(
                select q.id from alf_qname q left join alf_namespace n on (q.ns_id=n.id) 
                where q.local_name='name' and n.uri='http://www.alfresco.org/model/content/1.0'
            )
            and
            a.child_node_id=document_node_id 
    into tempparent, parent_name;
    if tempparent is null
    then
        set path = parent_name;
    else
        call get_path(tempparent, temppath);
        set path = concat(temppath, '/', parent_name);
    end if;
end$$
delimiter ;

drop function if exists get_path;
delimiter $$
create function get_path(child_node_id bigint) returns text deterministic
begin
    declare res text;
    call get_path(child_node_id, res);
    return res;
end$$
delimiter ;

The original idea is from here: https://stackoverflow.com/a/42734226/14914433
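
For readers who find the stored procedure hard to follow, here is a small Python sketch of the same recursion. It only mirrors the control flow of the procedure above: `parent_of` and `name_of` are hypothetical in-memory stand-ins for alf_child_assoc and the cm:name rows of alf_node_properties, and the ids are invented.

```python
# Hypothetical sample data: child node id -> parent node id, node id -> name.
parent_of = {7: 3, 3: 2, 2: 1}
name_of = {1: "src", 2: "main", 3: "test", 7: "test.doc"}


def get_path(node_id):
    """Build the folder path of node_id by recursing towards the root."""
    parent = parent_of.get(node_id)
    if parent is None:
        # Plays the role of the NOT FOUND handler: no parent row, empty path.
        return ""
    # Same shape as: call get_path(tempparent, temppath);
    #                set path = concat(temppath, '/', parent_name);
    return get_path(parent) + "/" + name_of[parent]


print(get_path(7))
```

As in the procedure, the result starts with a slash because the top-level call concatenates onto an empty path.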

jpotts
World-Class Innovator

I don't know why you are doing this. It seems like you are making your life complicated for no good reason.

There is a perfectly good API that shields you from the database and file system details.

If you want to export files and metadata, use the API and stop looking at the database and the file system.

Maybe you have a good reason for avoiding the API, and, if so, best of luck to you.

shashi
Champ in-the-making

Thanks Jpotts,

Are you talking about the CMIS API?

We have already implemented it in our applications.

The thing is, we plan to migrate our current application to a new system, and we only need the data from Alfresco.

The physical data we can get from the filesystem, where it is stored in *.bin format. Now we want the metadata, so that we know which file in the filesystem belongs to which folder and subfolder (meaning parent/child folder/file.docx).

Is there any REST API from which we can get all the metadata in JSON or XML format?

Thanks in Advance!

Kind regards!
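
The link between a content URL from alf_content_url and the *.bin file on disk mentioned above can be sketched in Python as follows. This assumes the default file-based content store, where the part after store:// is a path relative to the contentstore directory. The function name and the root directory in the usage line are hypothetical.

```python
from pathlib import PurePosixPath


def content_url_to_path(content_url, contentstore_root):
    """Map a store:// content URL to the file path below the contentstore root."""
    prefix = "store://"
    if not content_url.startswith(prefix):
        raise ValueError(f"unsupported content URL: {content_url}")
    # Everything after the prefix is treated as a path relative to the root.
    return PurePosixPath(contentstore_root) / content_url[len(prefix):]


# Hypothetical root directory; adjust to the actual contentstore location.
path = content_url_to_path(
    "store://2021/1/29/12/31/17dfa825-fe2f-4cc1-90c0-f6b3b287445a.bin",
    "/opt/alfresco/alf_data/contentstore",
)
print(path)
```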

EddieMay
World-Class Innovator

Hi @shashi 

There is an API - the nodes api might be what you are looking for.

HTH,

Digital Community Manager, Alfresco Software.
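
As a rough idea of what working with the nodes API could look like on Alfresco versions that provide it, here is a Python sketch that builds the URL for listing a node's children and reads names out of a response body. The host name, the helper names and the sample JSON are assumptions made for illustration. The sample is hand-written to resemble the list/entries/entry layout of that API and was not captured from a real server.

```python
import json
from urllib.parse import quote


def children_url(base_url, node_id):
    """Build the 'list children' URL of the v1 public REST API for a node."""
    return (
        f"{base_url}/alfresco/api/-default-/public/alfresco/versions/1"
        f"/nodes/{quote(node_id)}/children"
    )


def list_entries(response_text):
    """Return (name, is_folder) pairs from a children listing."""
    body = json.loads(response_text)
    return [
        (item["entry"]["name"], item["entry"]["isFolder"])
        for item in body["list"]["entries"]
    ]


# Hand-written sample in the assumed response shape (ids are hypothetical).
sample = json.dumps({"list": {"entries": [
    {"entry": {"id": "hypothetical-id-1", "name": "src", "isFolder": True}},
    {"entry": {"id": "hypothetical-id-2", "name": "test.doc", "isFolder": False}},
]}})

print(children_url("http://localhost:8080", "-root-"))
print(list_entries(sample))
```

Walking the tree this way, folder by folder, yields the parent/child relations without touching the database tables.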

shashi
Champ in-the-making

Thank you EddieMay,

I checked the API link. It says that it works with Alfresco 5.2, but in our applications we use Alfresco 4.2.e,

and we use the CMIS interface to connect from our Java applications, not the REST API.

Anyway, thank you very much. I feel honoured to get a response from an Alfresco admin :)

Kind regards,

Shashi