
Search XML using Lucene: is that possible?

mmrs
Champ in-the-making
Hi all,
I have some XML files stored at a location like "Company Home > MyProject > Projects > Proposed > Submitted Project".
The "Submitted Project" folder contains XML files that represent the data and forms of that project.
I want to search these XML files, but I need the result as (tag : value) pairs,
something like "where projectId=2 and pipeWidth=8". Is that possible, and if so, how?
The structure of the XML looks like this:
<project id="">
   <properity1 />
   <properity2 />
    …
    ..
</project>

I did this using the org.xml.sax package, but I have to do it with Lucene to improve performance and reduce the execution time of the search.
I searched http://wiki.alfresco.com/wiki/Search but didn't find what I want.
Does anyone have an example, a link, or a guide?
Help is highly appreciated.

Thanks in advance,
mmrs
4 REPLIES 4

mmrs
Champ in-the-making
… so far I have come up with this:
// Phrase to look for in the file content, e.g. "<pipeWidth>8"
String xmlTag = "<" + searchIn + ">" + searchFor;
SearchParameters sp = new SearchParameters();
sp.addStore(new StoreRef(StoreRef.PROTOCOL_WORKSPACE, "SpacesStore"));
sp.setLanguage(SearchService.LANGUAGE_LUCENE);
// Restrict the search to content nodes below the project folder, then match on the full text
sp.setQuery("PATH:\"" + "/app:company_home/cm:MyProject/cm:Projects/cm:Proposed/cm:SubmittedProject" + "//*\" AND TYPE:\"" +
            ContentModel.TYPE_CONTENT + "\" AND TEXT:\"" + xmlTag + "\"");
ResultSet resultSet = getSearchService().query(sp);
where searchIn is the XML element name
and searchFor is its value (a number).

But the problem is that Lucene matches the value as a string here, not as an integer.
If the element "pipeWidth" has the value 08 in the XML file, and the search parameters are:
searchIn="pipeWidth"
searchFor=8, then resultSet will be null  :cry:
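The mismatch can be reproduced outside Alfresco with plain Java. This is only a minimal sketch of the idea, not how Lucene is implemented: the class and method names (NumericTermDemo, matchesAsText, matchesAsNumber) are made up for illustration. It compares the stored text "08" with the search value "8" first as raw text, then as integers.

```java
class NumericTermDemo {

    // Text-style match: the stored value and the search value must be the same characters.
    static boolean matchesAsText(String stored, String wanted) {
        return stored.trim().equals(wanted.trim());
    }

    // Number-style match: both sides are parsed first, so leading zeros no longer matter.
    static boolean matchesAsNumber(String stored, String wanted) {
        try {
            return Integer.parseInt(stored.trim()) == Integer.parseInt(wanted.trim());
        } catch (NumberFormatException e) {
            return false; // one side is not an integer, so there is nothing to compare
        }
    }

    public static void main(String[] args) {
        System.out.println("as text:   " + matchesAsText("08", "8"));
        System.out.println("as number: " + matchesAsNumber("08", "8"));
    }
}
```

The text comparison fails for "08" against "8" while the numeric one succeeds, which is the behaviour I am seeing with the TEXT query.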

How can I fix this?
Could someone please take a look at this post?

regards
mmrs

pmonks
Star Contributor
The XML Metadata Extractor can be used for this - it extracts element and/or attribute values from XML files and stores them in metadata properties (which can be indexed into the search engine).

There's a page on the wiki that documents how to configure XML Metadata Extraction.
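To show the general idea (pull values out of the XML once, keep them as separate fields, then match on those fields), here is a rough sketch in plain Java using only the JDK's DOM parser. It is not the Alfresco extractor and uses none of its configuration: the class ProjectXmlFields, its methods, the field name "projectId" for the id attribute, and the sample document with a pipeWidth element are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ProjectXmlFields {

    // Reads the root "id" attribute and every direct child element into a field map.
    static Map<String, String> extract(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("projectId", normalize(root.getAttribute("id")));
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), normalize(child.getTextContent()));
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse project XML", e);
        }
    }

    // Stores short all-digit values in canonical integer form, so "08" becomes "8".
    static String normalize(String raw) {
        String value = raw.trim();
        return value.matches("\\d{1,9}") ? String.valueOf(Integer.parseInt(value)) : value;
    }

    // True when every (tag : value) pair in the criteria equals the extracted field.
    static boolean matches(Map<String, String> fields, Map<String, String> criteria) {
        for (Map.Entry<String, String> c : criteria.entrySet()) {
            if (!normalize(c.getValue()).equals(fields.get(c.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical sample document; real files will have other elements.
        Map<String, String> fields = extract("<project id=\"2\"><pipeWidth>08</pipeWidth></project>");
        Map<String, String> criteria = new LinkedHashMap<>();
        criteria.put("projectId", "2");
        criteria.put("pipeWidth", "8");
        System.out.println(fields + " matches " + criteria + ": " + matches(fields, criteria));
    }
}
```

Because the values are normalized when they are extracted, a search for pipeWidth=8 finds a file that stores 08. In Alfresco the extracted values would live in metadata properties rather than in a map, and the property's data type would decide how they are indexed.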

Cheers,
Peter

mmrs
Champ in-the-making
Thank you for your response.

I will look for that page and try the XML Metadata Extractor.

camillo
Champ in-the-making
I have the same problem!
Did you solve it?