Hyland Connect

dynamolalit · ‎05-07-2010

Hi,

For my project, we are using Liferay 5.2.3 as the front end and Alfresco 3.2r as the CMS. I have a business requirement to integrate Apache SOLR with both, so that a portal user can search Liferay and Alfresco content together from a single portlet in Liferay.

As part of this, I need to integrate SOLR with Alfresco. How can I do it?

Since SOLR is based on Lucene and Alfresco also uses Lucene indexes, can I expose the Alfresco indexes to SOLR so that the two stay in sync?

Is it possible? Can SOLR make use of the existing Alfresco Lucene indexes in any way?

Or do the SOLR indexes need to be updated every time content is created in the repository? If so, how can that be done?

I would appreciate any help or suggestions.

dynamolalit · ‎05-10-2010

Any help please!!

jpfi · ‎05-10-2010

Hi,
Integrating the Alfresco Lucene index into a Solr system is a very hard task. Your main problem will be handling permissions, because permissions are not synchronized to the Lucene index.
I implemented a different solution as part of a customisation project:
- custom aspect solrIndexable: a marker aspect
- a bunch of custom policies that post metadata to Solr (OnContentUpdate, OnUpdateProperties etc.)

cheers, jan

dynamolalit · ‎05-10-2010

Hi Jan,

Thanks for your reply.

Can you please elaborate a bit more on your approach, especially on:

a bunch of custom policies that post metadata to solr (OnContentUpdate, OnUpdateProperties etc.)

How can I post metadata to Solr?

What I assume is that once content is approved, I can create an XML file containing the content metadata and hand it to Solr, so that Solr can index the XML and use it. Or maybe something along those lines.

Is it possible?
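For illustration, the idea of turning content metadata into an XML message for Solr can be sketched with the JDK's DOM and Transformer classes alone. This is a minimal sketch, not the thread's final code: the class name `SolrAddXml` and the sample values are hypothetical, and the field names (`id`, `title`) are assumed to be defined in the target Solr schema.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper: turns a metadata map into a Solr <add><doc> update message. */
final class SolrAddXml {

    static String build(Map<String, String> fields) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element add = document.createElement("add");
        Element doc = document.createElement("doc");
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            Element field = document.createElement("field");
            field.setAttribute("name", entry.getKey());
            // Text nodes are escaped by the serializer, so '&' and '<' are safe.
            field.appendChild(document.createTextNode(entry.getValue()));
            doc.appendChild(field);
        }
        add.appendChild(doc);
        document.appendChild(add);

        // Serialize the DOM tree to a string without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "example-uuid");        // sample value, not a real node id
        fields.put("title", "Tips & tricks");    // '&' is escaped on output
        System.out.println(build(fields));
    }
}
```

The resulting string can be posted to Solr's update handler by whatever transport is convenient (an HTTP client, cURL, or a client library).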

dynamolalit · ‎05-21-2010

Hi,

I managed to post content to SOLR from Alfresco using the cURL command-line tool:

http://curl.haxx.se/

Once content is approved in the workflow, I build an XML document with a DOM parser and post this well-formed XML using the cURL command.

Here is the source:

KMReviewProcessAction.java, from where I post the XML to SOLR:


/**
    * Method to form XML to post to SOLR.
    * @param nodeRef
    */
   @SuppressWarnings("unchecked")
   public void formSolrXml(NodeRef nodeRef){
      logger.debug("Inside formSolrXml with noderef "+nodeRef);
      Node contentNode = new Node(nodeRef);
      String repoPath = Utils.generateURL(FacesContext.getCurrentInstance(), contentNode, URLMode.WEBDAV);      
      //logger.debug("repoPath "+repoPath);      
      String webdavUrl = alfServerIp+repoPath;
      logger.debug("webdavUrl in formSolrXml() : "+webdavUrl);
      // NodeRef.getId() returns the UUID part directly, so the string form need not be split.
      String contentUuid = nodeRef.getId();
      String curlPreSyntax = "curl ";
        String curlPostSyntax = "/solr/update -F commit=true -F stream.file=@";
        
      //Create an XML using XMLFormationBean & pass it to SOLR.
      XmlFormationBean xmlFormationBean = new XmlFormationBean();
      QName statusQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}contentStatus");
      QName ownerQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}originalOwner");
      QName ratingQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}contentRating");
      QName nameQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}name");
      QName coAuthorQname = QName.createQName("{http://www.xxxx.com/model/km/content/1.0}coauthor");
      QName titleQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}title");
      QName descriptionQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}description");      
      QName authorQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}author");
      String cName = nodeService.getProperty(nodeRef, nameQname).toString();
      logger.debug("Content Name in formSolrXml : "+cName);
      String cStatus = nodeService.getProperty(nodeRef, statusQname).toString();
      logger.debug("Content Status in formSolrXml : "+cStatus);
      String cOwner = nodeService.getProperty(nodeRef, ownerQname).toString();
      logger.debug("Content Owner in formSolrXml :  "+cOwner);
      String cRating = nodeService.getProperty(nodeRef, ratingQname).toString();
      logger.debug("Content Rating in formSolrXml :  "+cRating);
      String cCoAuthor = ".";
      try{
         cCoAuthor = nodeService.getProperty(nodeRef, coAuthorQname).toString();
      }catch (NullPointerException e) {
         cCoAuthor = ".";
      }
      logger.debug("Content CoAuthor in formSolrXml : "+cCoAuthor);
      String cTitle = nodeService.getProperty(nodeRef, titleQname).toString();
      logger.debug("Content Title in formSolrXml :  "+cTitle);
      String cDesc = ".";
      try{
         cDesc = nodeService.getProperty(nodeRef, descriptionQname).toString();
      }catch (NullPointerException e) {
         cDesc = ".";
      }
      logger.debug("Content Description in formSolrXml :  "+cDesc);
      String cAuthor = ".";
      try{
         cAuthor = nodeService.getProperty(nodeRef, authorQname).toString();
      }catch (NullPointerException e) {
         cAuthor = ".";
      }
      logger.debug("Content Author in formSolrXml :  "+cAuthor);
      String cType =  getContentMimeType(nodeRef);
      logger.debug("Content Type in formSolrXml :  "+cType);   
      try{
         Collection<NodeRef> categories = (Collection<NodeRef>)nodeService.getProperty(nodeRef, ContentModel.PROP_CATEGORIES);
         Iterator itr11 =  categories.iterator();
         while(itr11.hasNext()){
            NodeRef catNodeRef = (NodeRef) itr11.next();
            String category = Repository.getNameForNode(nodeService, catNodeRef);
            logger.debug("category name in formSolrXml : "+category);
            categoryList.add(category);
         }         
      }catch (NullPointerException e) {
         logger.error("Error while retrieving categories in formSolrXml : "+e.getMessage());
      }   
      //InputStream iStream = (InputStream) contentService.getReader(nodeRef, ContentModel.PROP_CONTENT);
      contentService = services.getContentService();
      
        ContentReader reader = contentService.getReader(nodeRef, ContentModel.PROP_CONTENT);
       
        if (reader != null && reader.exists())
        {
                // get the transformer
               
                ContentTransformer transformer = contentService.getTransformer(reader.getMimetype(), MimetypeMap.MIMETYPE_TEXT_PLAIN);
                
                // is this transformer good enough?
                if (transformer != null)
                {
                   
                    // We have a transformer that is fast enough
                    ContentWriter writer = contentService.getTempWriter();
                     
                    writer.setMimetype(MimetypeMap.MIMETYPE_TEXT_PLAIN);
                     
                    try
                    {   
                       
                       transformer.transform(reader, writer);
                        // point the reader to the new-written content
                       
                        reader = writer.getReader();
                         
                        // Check that the reader is a view onto something concrete
                        if (!reader.exists())
                        {
                            
                            throw new ContentIOException("The transformation did not write any content, yet: \n"
                                    + "   transformer:     " + transformer + "\n" + "   temp writer:     " + writer);
                        }else {
                               
                              content = reader.getContentString();
                               
                        }
                        
                    }
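                    catch (ContentIOException e)
                    {
                        // Do not swallow the failure silently: the document would be indexed without its text.
                        logger.error("Could not transform content to plain text for " + nodeRef + " : " + e.getMessage());
                    }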
                }
            }


        logger.debug("Content as a string for SOLR indexing !! :  "+content);
       
        
        //Forming XML.
        String finalFileName =  xmlFormationBean.formXmlFromContent(contentUuid,content, cType, cAuthor, cCoAuthor, cTitle, cOwner, categoryList, cDesc, cStatus, cRating, cName, solrFileLoc,webdavUrl);
        logger.debug("finalFileName in KMReviewProcessAction : "+finalFileName);
      //Deploying to SOLR. Check for content type.
        String curlCommand = curlPreSyntax + solrServIp + curlPostSyntax;
        // All supported MIME types are posted the same way, so one check replaces the per-type branches.
        if(cType.equalsIgnoreCase("application/pdf")    //PDF
              || cType.equalsIgnoreCase("application/msword")    // Word
              || cType.equalsIgnoreCase("application/vnd.excel")    //Excel
              || cType.equalsIgnoreCase("application/vnd.powerpoint")    // Powerpoint
              || cType.equalsIgnoreCase("text/plain")    //Text
              || cType.equalsIgnoreCase("text/html")   //HTML
              || cType.equalsIgnoreCase("text/xml")){   //XML
           curlCommand += finalFileName;
           logger.debug("Calling Curl command for content type : "+cType+" : "+curlCommand);
        }
      Runtime systemShell = Runtime.getRuntime();
         try {
            // Wait for cURL to finish so that a failed post is not reported as a success.
            Process output = systemShell.exec(curlCommand);
            int outputCode = output.waitFor();
            if(outputCode == 0){
               logger.debug("Curl command called correctly for : "+cName);
            }else{
               logger.error("Curl command exited with code "+outputCode+" for : "+cName);
            }
      } catch (IOException e) {
         logger.error("Document named "+cName +" could not be indexed : "+e.getMessage());
      } catch (InterruptedException e) {
         // Restore the interrupt flag so callers can react to it.
         Thread.currentThread().interrupt();
         logger.error("Interrupted while waiting for Curl command for : "+cName);
      }
   }

XmlFormationBean.java forming XML.



public class XmlFormationBean {

   private static final Log logger = LogFactory.getLog(XmlFormationBean.class);
   private String add = "add";
   private String doc = "doc";
   private String field = "field";
   private String fieldName = "name";
   private String id = "id";
   private String text = "text";
   private String type = "content_type";
   private String author = "author";
   private String coAuthor = "coAuthor";
   private String title = "title";
   private String owner = "owner";
   private String description = "description";
   private String url = "link";
   private String category = "category";
   private String rating = "rating";
   private String lastModified = "last_modified";
   private String cdataStartTag = "<![CDATA[";
   private String cdataEndTag = "]]>";
   private String tilde = "~";
   private String cap = "^";
   // private String keywords = "keywords";

   /**
    * Method forming XML with passed parameters. 
    * It returns the full file path of XML to be posted to SOLR using cURL.
    * Text content is tagged in <![CDATA[]]> tag for XML transformation.
    * 
    * @param contentUuid
    * @param passedContent
    * @param contentType
    * @param contentAuthor
    * @param contentCoAuthor
    * @param contentTitle
    * @param contentOwner
    * @param contentCategories
    * @param contentDescription
    * @param contentUrl
    * @param contentStatus
    * @param contentRating
    * @param contentName
    * @return fileNameToReturn.
    */
   public String formXmlFromContent(String contentUuid, String passedContent,
         String contentType, String contentAuthor, String contentCoAuthor,
         String contentTitle, String contentOwner,
         List<String> contentCategories, String contentDescription,
         String contentStatus, String contentRating,
         String contentName, String solrFileLoc, String webdavUrl) {
      String nameOfContent = contentName;
      String xmlExt = ".xml";
      String forwardSlash = "/";
      logger.debug("Inside formXmlFromContent() with nameOfContent : "
            + nameOfContent);
      String fileNameToReturn = null;
      String cleanContent = passedContent.replaceAll("\\P{ASCII}+", ""); 

      try {
         DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
               .newInstance();
         DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
         Document document = docBuilder.newDocument();
         Element rootElement = document.createElement(add);
         Element docElement = document.createElement(doc);
         Element idField = document.createElement(field);
         idField.setAttribute(fieldName, id);
         idField.appendChild(document.createTextNode(contentUuid));
         docElement.appendChild(idField);
         Element titleField = document.createElement(field);
         titleField.setAttribute(fieldName, title);
         titleField.appendChild(document.createTextNode(contentTitle));
         docElement.appendChild(titleField);
         Element descriptionField = document.createElement(field);
         descriptionField.setAttribute(fieldName, description);
         descriptionField.appendChild(document
               .createTextNode(contentDescription));
         docElement.appendChild(descriptionField);
         Element authorField = document.createElement(field);
         authorField.setAttribute(fieldName, author);
         authorField.appendChild(document.createTextNode(contentAuthor));
         docElement.appendChild(authorField);
         Iterator<String> catItr = contentCategories.iterator();
         while (catItr.hasNext()) {
            String contentCategory = catItr.next();
            logger.debug("contentCategory " + contentCategory);
            Element categoryField = document.createElement(field);
            categoryField.setAttribute(fieldName, category);
            categoryField.appendChild(document
                  .createTextNode(contentCategory));
            docElement.appendChild(categoryField);
         }
         Element typeField = document.createElement(field);
         typeField.setAttribute(fieldName, type);
         typeField.appendChild(document.createTextNode(getMimetypeForSolrSearch(contentType)));
         docElement.appendChild(typeField);
         Element coAuthorField = document.createElement(field);
         coAuthorField.setAttribute(fieldName, coAuthor);
         coAuthorField.appendChild(document.createTextNode(contentCoAuthor));
         docElement.appendChild(coAuthorField);   
         Element contentField = document.createElement(field);
         contentField.setAttribute(fieldName, text);
         //CDATASection contentCdata = document.createCDATASection(passedContent);
         //contentCdata.deleteData(contentCdata.get, count)
         contentField.appendChild(document.createCDATASection(cleanContent));
         docElement.appendChild(contentField);
         //logger.debug("Content text : "+contentField.getTextContent());
         Element lastModifiedField = document.createElement(field);
         lastModifiedField.setAttribute(fieldName, lastModified);
         lastModifiedField.appendChild(document
               .createTextNode(getLastModifiedDate()));
         docElement.appendChild(lastModifiedField);
         /*
          * Commented out for future. String docKeywords = " "; Element
          * keywordsField = document.createElement(field);
          * keywordsField.setAttribute(fieldName, keywords);
          * keywordsField.appendChild(document.createTextNode(docKeywords));
          * docElement.appendChild(keywordsField);
          */
         Element ownerField = document.createElement(field);
         ownerField.setAttribute(fieldName, owner);
         ownerField.appendChild(document.createTextNode(contentOwner));
         docElement.appendChild(ownerField);
         Element ratingField = document.createElement(field);
         ratingField.setAttribute(fieldName, rating);
         ratingField.appendChild(document.createTextNode(contentRating));
         docElement.appendChild(ratingField);
         Element urlField = document.createElement(field);
         urlField.setAttribute(fieldName, url);
         urlField.appendChild(document.createTextNode(webdavUrl));
         docElement.appendChild(urlField);
         rootElement.appendChild(docElement);
         document.appendChild(rootElement);
         //Element contentText = document.getElementById("text");
         //String tempContent = contentText.getTextContent();
         //String cdataContent = cdataStartTag + tempContent +cdataEndTag;
         //contentText.setTextContent(cdataContent);
         DOMSource source = new DOMSource(document);
         fileNameToReturn = solrFileLoc + forwardSlash + contentUuid + xmlExt;         
         File file = new File(fileNameToReturn);
         Result result = new StreamResult(file);
         Transformer xformer = TransformerFactory.newInstance()
               .newTransformer();
         xformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
         // Write UTF-8 so the file matches the encoding used when it is read back and posted.
         xformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
         // xformer.setOutputProperty(OutputKeys.STANDALONE, "yes");
         xformer.transform(source, result);
      } catch (Exception e) {
         logger.error("Error while forming XML in formXmlFromContent() : "
               + e.getMessage());
         e.printStackTrace();
      }
      logger.debug("fileNameToReturn in  formXmlFromContent() : "   + fileNameToReturn);
      return fileNameToReturn;
   }

   /**
    * Method to get Last Modified date as per SOLR specified format i.e "yyyy-MM-dd'T'HH:mm:ss'Z'".
    * @return formattedDate.
    */
   public String getLastModifiedDate() {
      // logger.debug("Inside getLastModifiedDate()");
      Calendar lastModDate = Calendar.getInstance();
      lastModDate.setTime(new Date());
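      SimpleDateFormat format = new SimpleDateFormat(
            "yyyy-MM-dd'T'HH:mm:ss'Z'");
      // The literal 'Z' claims UTC, so format in UTC rather than the JVM default zone (needs java.util.TimeZone).
      format.setTimeZone(TimeZone.getTimeZone("UTC"));
      String formattedDate = format.format(lastModDate.getTime());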
      logger.debug("formattedDate in getLastModifiedDate() : "
            + formattedDate);
      return formattedDate;
   }
   
   /** 
    * Method to get simplified Mimetype for a content to be indexed with SOLR.
    * @param contentType
    * @return simplified Mimetype.
    */
   public String getMimetypeForSolrSearch(String contentType){
      String result = null;
      if(contentType.equalsIgnoreCase("application/pdf")){
         result = "pdf";
      }else if(contentType.equalsIgnoreCase("application/msword")){
         result = "word";
      }else if(contentType.equalsIgnoreCase("application/vnd.excel")){
         result = "excel";
      }else if(contentType.equalsIgnoreCase("application/vnd.powerpoint")){
         result = "powerpoint";
      }else if(contentType.equalsIgnoreCase("text/plain")){
         result = "text";
      }else if(contentType.equalsIgnoreCase("text/html")){
         result = "html";
      }else if(contentType.equalsIgnoreCase("text/xml")){
         result = "xml";
      }   
      logger.debug("Mimetype in getMimetypeForSolrSearch() " +result);
      return result;
   }
}

This seems to be a kind of workaround, so I am still looking for better alternatives, if any.
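One detail of the cURL call in formSolrXml() is that the whole command is passed to Runtime.exec() as a single string, which Java splits on whitespace, so a file path containing a space would break it. A minimal sketch of an alternative, assuming the same cURL options as above, builds the arguments as a list and runs them with ProcessBuilder. The class name `CurlSolrPost` and the sample URL and path are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds and runs the cURL post used in this thread. */
final class CurlSolrPost {

    /** Builds the argument list; each element reaches cURL as exactly one argument. */
    static List<String> buildCommand(String solrBaseUrl, String xmlFile) {
        List<String> command = new ArrayList<>();
        command.add("curl");
        command.add(solrBaseUrl + "/solr/update");
        command.add("-F");
        command.add("commit=true");
        command.add("-F");
        command.add("stream.file=@" + xmlFile);
        return command;
    }

    /** Runs the command, forwards its output to this JVM's console, returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call run(...) when cURL and a Solr server are available.
        System.out.println(buildCommand("http://localhost:8983", "/tmp/solr docs/example.xml"));
    }
}
```

The exit code returned by run() lets the caller decide whether the post succeeded before logging it as indexed.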

jpfi · ‎05-21-2010

Hi,
there is a nice Solr client API in Java that I'm using: http://wiki.apache.org/solr/Solrj
best, jan

dynamolalit · ‎05-25-2010

Hi Jan,

Thanks for your reply.

I got a good amount of help from SolrJ. Below is my utility for posting an XML file to SOLR.

SolrPostContentUtility.java :


package com.xxxx.alfresco.km.bpm;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UnsupportedEncodingException;
import java.io.Writer;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;

import org.apache.log4j.Logger;

import com.xxxx.alfresco.km.property.KmPropertyReader;

/**
 * Utility class for posting XML file to SOLR server.
 * For a given well-formed XML, it posts the data of file to 
 * SOLR server & get response back.
 */
public class SolrPostContentUtility {
   
  private static Logger logger = Logger.getLogger(SolrPostContentUtility.class); 
  public static final String POST_ENCODING = "UTF-8";
  private static final String SOLR_OK_RESPONSE_EXCERPT = "<int name=\"status\">0</int>";
  private String solrServIp = null;
  private URL solrUrl = null;
  
  /**
   * Constructs an instance for posting data to the specified SOLR URL.
   */
  public SolrPostContentUtility() {
    try{
       KmPropertyReader kmPropertyReader = new KmPropertyReader();
        solrServIp = kmPropertyReader.getProperty("solr.server.ip");
        logger.debug("solrServIp in SolrPostContentUtility : "+solrServIp);
        URL passedSolrUrl = new URL(solrServIp);
        solrUrl = passedSolrUrl;
    }catch (Exception e) {
      logger.error("Error while reading property in SolrPostContentUtility : "+e.getMessage());
   }
  }

  /**
   * Method posting XML to SOLR and committing the index changes on success.
   * @param fullyQldFileName full path of the XML file to post
   * @return 0 on success, 1 on error
   */
  public int postXmlToSolr(String fullyQldFileName){
     int result = 0;//0 for normal & 1 for error.
     logger.debug("Entering postXmlToSolr() with file name : "+fullyQldFileName);
    try {   
       result = postFileToSolr(fullyQldFileName);           
        //Checking for response
       if(result == 0){
           logger.debug("Committing Solr index changes after successful posting of data.");
           final StringWriter sw = new StringWriter();
           commit(sw);
           warnIfNotExpectedResponse(sw.toString(),SOLR_OK_RESPONSE_EXCERPT);
      }else if(result == 1){
         logger.error("Could not commit Solr index changes due to error.");
      }
    } catch(IOException ioe) {
       logger.error("Unexpected IOException in postXmlToSolr() : " + ioe);
    }
    logger.debug("result in postXmlToSolr() : "+result);  
    return result;
  }
 
 
/**
 * Method to post XML file to SOLR.
 * It takes file name & passes it with StringWriter object to postFile().
 * @param fileName
 * @return
 * @throws IOException
 */
public int postFileToSolr(String fileName) throws IOException {
   logger.debug("Entering postFileToSolr() with file name : "+fileName);
      int result = 0;//0 for normal & 1 for error.
      File srcFile = new File(fileName);
      final StringWriter sw = new StringWriter();      
      if (srcFile.canRead()) {
         //logger.debug("File name to be posted to SOLR server in postFileToSolr() : " + srcFile.getName());
         result = postFile(srcFile, sw);
        warnIfNotExpectedResponse(sw.toString(),SOLR_OK_RESPONSE_EXCERPT);
      } else {
         result = 1;//error: without this the caller would commit as if the post had succeeded.
         logger.error("Cannot read input file in postFileToSolr() : " + srcFile);
      }
      logger.debug("result in postFileToSolr() : "+result);  
    return result;
  }
  
  /**
   * Opens the file and posts its contents to the solrUrl,
   * writes the response to output.
   * XML should be formed using a real parser e.g. DOM & should be well-formed.
   * @throws UnsupportedEncodingException 
   */
  public int postFile(File file, Writer output) 
    throws FileNotFoundException, UnsupportedEncodingException {
   logger.debug("Entering postFile() ");
   int result = 0;//0 for normal & 1 for error.
    Reader reader = new InputStreamReader(new FileInputStream(file),POST_ENCODING);
    try {
      result = postDataToSolr(reader, output);
    } finally {
      try {
        if(reader != null) reader.close();
      } catch (IOException e) {
        throw new PostException("IOException while closing file in postFile()", e);
      }
    }
    logger.debug("result in postFile() : "+result);  
    return result;
  }

  /**
   * Reads data from the data reader and posts it to Solr,
   * writes the response to output.
   */
  public int postDataToSolr(Reader data, Writer output) {
   logger.debug("Entering postDataToSolr() ");
    HttpURLConnection urlc = null;
    int result = 0; //0 for normal & 1 for error.
    try {
      urlc = (HttpURLConnection) solrUrl.openConnection();
      try {
        urlc.setRequestMethod("POST");
      } catch (ProtocolException e) {
        throw new PostException("Shouldn't happen: HttpURLConnection doesn't support POST??", e);
      }
      urlc.setDoOutput(true);
      urlc.setDoInput(true);
      urlc.setUseCaches(false);
      urlc.setAllowUserInteraction(false);
      urlc.setRequestProperty("Content-type", "text/xml; charset=" + POST_ENCODING);
      
      OutputStream out = urlc.getOutputStream();
      
      try {
        Writer writer = new OutputStreamWriter(out, POST_ENCODING);
        pipeDataToSolr(data, writer);
        writer.close();
      } catch (IOException e) {
         result = 1;//error.
        throw new PostException("IOException while posting data in postDataToSolr() ", e);
      } finally {
        if(out!=null) out.close();
      }
      
      InputStream in = urlc.getInputStream();
      try {
        // Decode the response with an explicit charset instead of the platform default.
        Reader reader = new InputStreamReader(in, POST_ENCODING);
        pipeDataToSolr(reader, output);
        reader.close();
      } catch (IOException e) {
         result = 1;//error.
        throw new PostException("IOException while reading response in postDataToSolr()", e);
      } finally {
        if(in!=null) in.close();
      }
      
    } catch (IOException e) {
      result = 1;//error.
      try {
        // urlc is still null if openConnection() itself failed.
        if(urlc != null){
          logger.error("Solr returned an error in postDataToSolr() : " + urlc.getResponseMessage());
        }
      } catch (IOException f) { }
      logger.error("Connection error while connecting to SOLR server in postDataToSolr() : " + e);      
    } finally {
      if(urlc != null){
         urlc.disconnect();
      }
    }
    logger.debug("result in postDataToSolr() : "+result); 
    return result;
  }

  /**
   * Pipes everything from the reader to the writer via a buffer
   */
  private static void pipeDataToSolr(Reader reader, Writer writer) throws IOException {
   logger.debug("Entering pipeDataToSolr() ");
    char[] buf = new char[1024];
    int read = 0;
    while ( (read = reader.read(buf) ) >= 0) {
      writer.write(buf, 0, read);
    }
    writer.flush();
  }
  
  /**
   * Custom Exception class for utility.
   */
  private class PostException extends RuntimeException {
         private static final long serialVersionUID = 1L;
      PostException(String reason,Throwable cause) {
         super(reason + " POST URL = " + solrUrl ,cause);
       }
     }
  
  /** Check what SOLR replied to a POST, and complain if it's not what we expected.
   * Ideally the response would be parsed and checked as XML; here we just check it as an unparsed String.
   */
  static void warnIfNotExpectedResponse(String actual,String expected) {
    if(actual.indexOf(expected) < 0) {
       logger.error("Unexpected response from Solr: '" + actual + "' does not contain '" + expected + "'");
    }
  }  

  /**
   * Does a simple commit operation 
   */
  public void commit(Writer output) throws IOException {
   logger.debug("Entering commit()");
    postDataToSolr(new StringReader("<commit/>"), output);
  }     
}
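warnIfNotExpectedResponse() checks the reply with a substring match. As a hedged sketch of the stricter alternative, the response can be parsed with the JDK's DOM parser and the `status` value read from it. The class name `SolrResponseStatus` is hypothetical, and the sample response is an assumption modelled on the `<int name="status">0</int>` excerpt the utility already looks for.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads the status code out of a Solr XML response. */
final class SolrResponseStatus {

    /** Returns the value of the int element named "status", or -1 if there is none. */
    static int parseStatus(String responseXml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(responseXml)));
        NodeList ints = document.getElementsByTagName("int");
        for (int i = 0; i < ints.getLength(); i++) {
            Element element = (Element) ints.item(i);
            if ("status".equals(element.getAttribute("name"))) {
                return Integer.parseInt(element.getTextContent().trim());
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample response; a real one comes from the update handler.
        String sample = "<response><lst name=\"responseHeader\">"
                + "<int name=\"status\">0</int><int name=\"QTime\">5</int>"
                + "</lst></response>";
        System.out.println("status = " + parseStatus(sample));
    }
}
```

Unlike the substring check, this does not depend on the exact whitespace or attribute quoting of the response.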

KMSolrSearchActionHandler.java invoking utility :


package com.xxxx.alfresco.km.bpm;

import java.io.File;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import javax.faces.context.FacesContext;

import org.alfresco.model.ContentModel;
import org.alfresco.repo.content.MimetypeMap;
import org.alfresco.repo.content.transform.ContentTransformer;
import org.alfresco.repo.workflow.jbpm.JBPMSpringActionHandler;
import org.alfresco.service.ServiceRegistry;
import org.alfresco.service.cmr.repository.ContentData;
import org.alfresco.service.cmr.repository.ContentIOException;
import org.alfresco.service.cmr.repository.ContentReader;
import org.alfresco.service.cmr.repository.ContentService;
import org.alfresco.service.cmr.repository.ContentWriter;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.namespace.QName;
import org.alfresco.web.bean.repository.Node;
import org.alfresco.web.bean.repository.Repository;
import org.alfresco.web.ui.common.Utils;
import org.alfresco.web.ui.common.Utils.URLMode;
import org.apache.log4j.Logger;
import org.jbpm.context.exe.ContextInstance;
import org.jbpm.graph.exe.ExecutionContext;
import org.springframework.beans.factory.BeanFactory;

import com.xxxx.alfresco.km.property.KmPropertyReader;

/**
 * @author Lalit Jangra
 * Class to handle SOLR integration with Alfresco.
 * Once content is approved & moved back to original upload location,
 * it will check for 'cm:contentStatus' property of content.
 * If it is set to 'approved', another workflow named 'SolrSearchWF' is triggered.
 * This workflow will extract all required metadata properties from content noderef &
 * form an XML to be posted to SOLR along with content as text string.
 * Finally XML will be posted to SOLR using SOLR Content Post Utility.
 */
public class KMSolrSearchActionHandler extends JBPMSpringActionHandler{
   
   private static final long serialVersionUID = 1L;
   private static Logger logger = Logger
   .getLogger(KMSolrSearchActionHandler.class);
   private String solrServIp = null;
   private String solrFileLoc = null;
   private   String alfServerIp = null;
   private NodeService nodeService;
   private ServiceRegistry services;
   ContentService contentService = null;
   List<String> categoryList = null;   
   private String webdavUrl = null;
   //private Collection<NodeRef> categories = null;
   
   /**
    * Method to initialize services.
    */
   @Override
   protected void initialiseHandler(BeanFactory factory) {
      services = (ServiceRegistry) factory
      .getBean(ServiceRegistry.SERVICE_REGISTRY);
      nodeService = services.getNodeService();
      contentService = services.getContentService();   
   }

   /**
    * Method calling formSolrXml() passing noderef of the content 
    * forming SOLR search specific XML.
    */
   @Override
   public void execute(ExecutionContext context) throws Exception {
      logger.debug("Inside execute of KMSolrSearchActionHandler");
       try{
              KmPropertyReader kmPropertyReader = new KmPropertyReader();
               solrServIp = kmPropertyReader.getProperty("solr.server.ip");
               logger.debug("solrServIp in KMSolrSearchActionHandler : "+solrServIp);
               solrFileLoc = kmPropertyReader.getProperty("solr.file.location");
               logger.debug("solrFileLoc in KMSolrSearchActionHandler : "+solrFileLoc);
               alfServerIp = kmPropertyReader.getProperty("alfresco.server.ip");
               logger.debug("alfServerIp in KMSolrSearchActionHandler : "+alfServerIp);
           }catch (Exception e) {
            logger.error("Error while reading property in KMSolrSearchActionHandler : "+e.getMessage());
         }
      final ContextInstance contextInstance = context.getContextInstance();
      NodeRef nodeRef = (NodeRef) contextInstance.getVariable("nodeRef");
      //Forming SOLR XML.
      formSolrXml(nodeRef);      
   }
   
   /**
    * Method to form XML to be posted to SOLR.
    * Once well-formed XML is formed, it will call postSolrXML method to post 
    * the same XML to SOLR using Content Post Utility.
    * @param nodeRef
    */
   @SuppressWarnings("unchecked")
   public void formSolrXml(NodeRef nodeRef){
      logger.debug("Inside formSolrXml in KMSolrSearchActionHandler : "+nodeRef);
      Node contentNode = new Node(nodeRef);
      String repoPath = Utils.generateURL(FacesContext.getCurrentInstance(), contentNode, URLMode.WEBDAV);      
      //logger.debug("repoPath "+repoPath);      
      webdavUrl = alfServerIp+repoPath;
      logger.debug("webdavUrl in formSolrXml() : "+webdavUrl);
      // NodeRef exposes the node id (UUID) directly; no need to parse its string form.
      String contentUuid = nodeRef.getId();
        String category = "";
        String content = "";
        
      //Create an XML using XMLFormationBean & pass it to SOLR.
      XmlFormationBean xmlFormationBean = new XmlFormationBean();
      QName statusQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}contentStatus");
      QName ownerQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}originalOwner");
      QName ratingQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}contentRating");
      QName nameQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}name");
      QName coAuthorQname = QName.createQName("{http://www.xxxx.com/model/km/content/1.0}coauthor");
      QName titleQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}title");
      QName descriptionQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}description");      
      QName authorQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}author");
//      QName kTQName = QName.createQName("{http://www.alfresco.org/model/content/1.0}Knowledge Type");
//      QName kPQName = QName.createQName("{http://www.alfresco.org/model/content/1.0}KP Domain");
//      QName typeQname = QName.createQName("{http://www.alfresco.org/model/content/1.0}content");
      
      String cName = nodeService.getProperty(nodeRef, nameQname).toString();
      logger.debug("Content Name in formSolrXml : "+cName);
      String cStatus = nodeService.getProperty(nodeRef, statusQname).toString();
      logger.debug("Content Status in formSolrXml : "+cStatus);
      String cOwner = nodeService.getProperty(nodeRef, ownerQname).toString();
      logger.debug("Content Owner in formSolrXml :  "+cOwner);
      String cRating = nodeService.getProperty(nodeRef, ratingQname).toString();
      logger.debug("Content Rating in formSolrXml :  "+cRating);
      String cCoAuthor = "";
      try{
         cCoAuthor = nodeService.getProperty(nodeRef, coAuthorQname).toString();
      }catch (NullPointerException e) {
         logger.debug("Null CoAuthor in formSolrXml()");
      }
      logger.debug("Content CoAuthor in formSolrXml : "+cCoAuthor);
      String cTitle = "";
      try{
         cTitle = nodeService.getProperty(nodeRef, titleQname).toString();
      }catch (NullPointerException e) {
         logger.debug("Null Title in formSolrXml()");
      }
      logger.debug("Content Title in formSolrXml :  "+cTitle);
      String cDesc = "";
      try{
         cDesc = nodeService.getProperty(nodeRef, descriptionQname).toString();
      }catch (NullPointerException e) {
         logger.debug("Null Description in formSolrXml()");
      }
      logger.debug("Content Description in formSolrXml :  "+cDesc);
      String cAuthor = "";
      try{
         cAuthor = nodeService.getProperty(nodeRef, authorQname).toString();
      }catch (NullPointerException e) {
         logger.debug("Null Author in formSolrXml()");
      }
      logger.debug("Content Author in formSolrXml :  "+cAuthor);
      String cType =  getContentMimeType(nodeRef);
      logger.debug("Content Mimetype in formSolrXml :  "+cType);   
      try{
         Collection<NodeRef> categories = (Collection<NodeRef>)nodeService.getProperty(nodeRef, ContentModel.PROP_CATEGORIES);
         logger.debug("categories size : "+categories.size());
         Iterator itr11 =  categories.iterator();
         categoryList = new ArrayList<String>();   
         while(itr11.hasNext()){
            NodeRef catNodeRef = (NodeRef) itr11.next();
            category = Repository.getNameForNode(nodeService, catNodeRef);
            logger.debug("category name in formSolrXml : "+category);
            categoryList.add(category);
            logger.debug("categoryList size : "+categoryList.size());
         }         
      }catch (NullPointerException e) {
         logger.error("Null categories in formSolrXml() ");
      }   

      //Transforming content to plain text format & extracting text from content as a string.
      contentService = services.getContentService();
        ContentReader reader = contentService.getReader(nodeRef, ContentModel.PROP_CONTENT);
        if (reader != null && reader.exists())
        {
                // get the transformer
                ContentTransformer transformer = contentService.getTransformer(reader.getMimetype(), MimetypeMap.MIMETYPE_TEXT_PLAIN);
                if (transformer != null)
                {
                    // We have a transformer that is fast enough
                    ContentWriter writer = contentService.getTempWriter();
                    writer.setMimetype(MimetypeMap.MIMETYPE_TEXT_PLAIN);
                    try
                    {   
                       transformer.transform(reader, writer);
                        // point the reader to the new-written content
                        reader = writer.getReader();
                        // Check that the reader is a view onto something concrete
                        if (!reader.exists())
                        {
                           logger.error("Error while getting reader in KMSolrSearchActionHandler ");
                            throw new ContentIOException("The transformation did not write any content, yet: \n"
                                    + "   transformer:     " + transformer + "\n" + "   temp writer:     " + writer);
                        }else {
                              content = reader.getContentString();
                        }
                        
                    }
                    catch (ContentIOException e)
                    {
                       logger.error("Error in transforming content : "+e.getMessage());
                       
                    }
                }
            }       
        logger.debug("Length of content as a string for SOLR indexing !! :  "+content.length());
        
        //Forming well-formed SOLR search XML.
        String finalFileName =  xmlFormationBean.formXmlFromContent(contentUuid,content, cType, cAuthor, cCoAuthor, cTitle, cOwner, categoryList, cDesc, cStatus, cRating, cName, solrFileLoc,webdavUrl);
        logger.debug("finalFileName in formSolrXml : "+finalFileName);
        //Posting well-formed XML to SOLR server.
        postSolrXML(finalFileName,cName,nodeRef);
   }   
   
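   /**
    * Method to post well-formed XML to SOLR using content post utility.
    * @param fullFileName fully qualified name of the XML file to post
    * @param contentName name of the content, used for logging
    * @param nodeRef node whose km:underWorkflow property is set to 'indexed' on success
    */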
   public void postSolrXML(String fullFileName,String contentName, NodeRef nodeRef){
      logger.debug("Entering postSolrXML() with content to be posted to SOLR : "+contentName +" &  fullFileName : "+fullFileName);
      //Only if there is a non-empty file should it be posted to SOLR.
        File file = new File(fullFileName); 
        if(file.length() > 0){           
             try {
                //Calling  SolrPostContentUtility.
                logger.debug("Posting content to SOLR using utility by passing fully qualified file name,if result is 0, it's OK . if it's 1, its error!");
                SolrPostContentUtility utility = new SolrPostContentUtility();
                int outCome = utility.postXmlToSolr(fullFileName);
                //If outCome is 0, it is OK , if it is 1, it's error!
                if(outCome == 0){
                   //Set km:underWorkflow to indexed for this content.
                   logger.debug("\n *************** Content named : "+contentName+" : successfully posted to SOLR. *************** \n");
                   logger.debug("\n *************** nodeRef of content posted successfully : " + nodeRef + " ***************");
                   logger.debug("\n *************** webdavUrl of content posted successfully : " + webdavUrl + " ***************");
                   QName underWorkflowQname = QName.createQName("{http://www.xxxx.com/model/km/content/1.0}underWorkflow");
                   String underWorkflowFlag = nodeService.getProperty(nodeRef, underWorkflowQname).toString();
                   logger.debug("km:underWorkflow set value in postSolrXML after post : "+underWorkflowFlag);
                   // Update only the one property instead of rewriting the whole property map.
                   nodeService.setProperty(nodeRef, underWorkflowQname, "indexed");
                }else if(outCome == 1){
                   logger.error("\n *************** Content named : "+contentName +" : could NOT be posted successfully to SOLR. *************** \n");
                }                 
          } catch (Exception e) {
             // Log at error level, with the stack trace, instead of printing to stderr.
             logger.error("Error while posting file : "+e.getMessage(), e);
          }      
        }else{
           logger.error("Null XML formed");
        }
   }
   
   /**
    * Method to get MimeType of a content passing its nodeRef.
    * @param nodeRef
    * @return MimeType
    */
   public String getContentMimeType(NodeRef nodeRef){
      QName PROP_QNAME_CONTENT = QName.createQName("http://www.alfresco.org/model/content/1.0", "content");
       ContentData contentData = (ContentData) nodeService.getProperty(nodeRef, PROP_QNAME_CONTENT);
       String originalMimeType = contentData.getMimetype();
       return originalMimeType;
   }

}


It's working fine.

jain_kumar11 · ‎09-06-2011

Hi Lalit,

Could you please also share KmPropertyReader.java and XmlFormationBean.java? The existing code references both of them.

KJ

dynamolalit · ‎09-19-2011

Hi Jain,

KmPropertyReader.java is nothing but a Java class that reads properties from a property file.
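
For anyone who needs a starting point, here is a minimal sketch of what such a reader could look like. It is not the original class: the name `SimplePropertyReader`, the constructor taking a `Reader`, and the fail-fast check for missing keys are all assumptions made for illustration. Only the `getProperty(String)` call and the key names (`solr.server.ip`, `solr.file.location`, `alfresco.server.ip`) come from the handler above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/**
 * Hypothetical stand-in for KmPropertyReader (the original was not posted).
 * Loads key=value pairs once and hands them out by key.
 */
final class SimplePropertyReader {

    private final Properties props = new Properties();

    /** Loads properties from any character source, e.g. a FileReader on a .properties file. */
    SimplePropertyReader(Reader source) {
        try {
            props.load(source);
        } catch (IOException e) {
            // Assumption: a broken config file should stop the handler early.
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the trimmed value for the key, or throws if the key is missing or blank. */
    String getProperty(String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + key);
        }
        return value.trim();
    }
}
```

Failing fast on a missing key is a design choice here; the handler above instead logs the error and carries on with null values, which only shows up later when the WebDAV URL or file path is built.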

XmlFormationBean.java is a Java class that uses a SAX parser to create the XML to be posted to Solr.
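
As a rough idea of the output such a class has to produce, here is a sketch that builds a Solr `<add><doc>` update message. It is not the original XmlFormationBean: it uses the JDK's StAX `XMLStreamWriter` rather than SAX, returns a string instead of writing a file, and the class name `SolrAddXmlSketch` is made up. The field names (`id`, `title`, `category`, `content`) are assumptions too; they must match whatever is declared in your Solr schema.xml.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical sketch: builds a Solr XML update message for one document. */
final class SolrAddXmlSketch {

    /** Returns <add><doc>...</doc></add> with one <field> per non-empty value. */
    static String buildAddXml(String id, String title, List<String> categories, String content) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("add");
            xml.writeStartElement("doc");
            writeField(xml, "id", id);
            writeField(xml, "title", title);
            // A multi-valued field is simply repeated once per value.
            for (String category : categories) {
                writeField(xml, "category", category);
            }
            writeField(xml, "content", content);
            xml.writeEndElement(); // doc
            xml.writeEndElement(); // add
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not build Solr XML", e);
        }
        return out.toString();
    }

    /** Writes <field name="...">value</field>; null or empty values are skipped. */
    private static void writeField(XMLStreamWriter xml, String name, String value)
            throws XMLStreamException {
        if (value == null || value.isEmpty()) {
            return;
        }
        xml.writeStartElement("field");
        xml.writeAttribute("name", name);
        xml.writeCharacters(value); // escapes &, < and > in the text
        xml.writeEndElement();
    }
}
```

Letting the writer do the escaping matters for the extracted document text, which often contains `&` or `<`. Text extracted from binary formats can also carry control characters that are not legal in XML, so you may need to strip those before writing; the sketch does not do that.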

I have already shared all the other relevant classes here.

All the best.


How to integrate Apache SOLR with Alfresco?