HTTP proxy issues

matafagafo
Champ in-the-making
We have a problem with the Squid proxy caching system.
Documents are stored in the proxy cache, so when a document is updated in the repository, users still get the old version, because the old copy is served by the proxy server.
I know that I can disable the proxy cache, but that is not an option.
How can I configure Alfresco to instruct the proxy not to cache repository files?
Thanks for your help.
15 REPLIES

sancho78
Champ in-the-making
Workaround: tell the proxy to deny caching of content from Alfresco.
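
A minimal sketch of such a Squid rule (the host name is a placeholder and the exact directive depends on your Squid version; treat it as an illustration, not a tested configuration):

   # squid.conf - deny caching for the Alfresco host (hypothetical name)
   acl alfresco dstdomain alfresco.example.com
   cache deny alfresco
   # older Squid releases use 'no_cache deny alfresco' instead of 'cache deny'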

Mathias

matafagafo
Champ in-the-making
Hello, thanks for your help. I have already done this, but I have users who go through proxies outside my network (public networks and other companies), and I can't change their configurations.
I looked at the HTTP headers that my Alfresco 2.1.0 server sends when I download a document from the repository and realized that headers like Pragma: no-cache are missing.
In my opinion the repository should send these headers to instruct proxies and browsers not to cache these files.
Is there a way to configure Alfresco to send these headers with the download?

nuttinjeff
Champ in-the-making
Hi, I'm new to Alfresco, but we're having the same problem here. The browsers are caching the files, so when someone updates a document that was already downloaded and tries to open it again, the browser delivers the old version.

Is there a way to tell the browser not to cache repository files?

Version: Community Network - v2.1.0

Thanks.

braulio_moura
Champ in-the-making
Hi everybody,

I'm running Alfresco Community 2.1 and facing this problem too.

For some documents, when they get updated, the link doesn't bring up the updated version but the older version of the document.

Until now I didn't have any clue what it might be; I thought it was a bug in Alfresco.

But reading this thread, I've noticed that it might be a cache problem.

Is this proxy cache behavior confirmed? Can I assume for sure that it's the problem I'm facing? Has anyone found a solution?

Sorry for so many questions, but I have to find a solution. My users are not looking favorably on Alfresco because of this.

Can someone please help me?

Thanks in advance!

matafagafo
Champ in-the-making
Hi Braulio,
The only solution I found was to configure the proxy (Squid) not to cache Alfresco URLs, but this is not a good solution because, if users access the repository from other sites that use a proxy, they will have the same problem.
This is, in my opinion, an Alfresco bug: Alfresco doesn't set a response header instructing the proxy server not to cache the URL. It could also be done with a servlet filter; that's a simple solution to implement, but so far I haven't had time to do it.
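
For illustration, a minimal sketch of what such a filter could look like (the class name, and the idea of mapping it to the download URLs in web.xml, are assumptions for the example, not code from this thread):

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletResponse;

// Hypothetical filter: adds no-cache headers to every response it wraps.
// Map it to the download servlet URLs (e.g. /download/*) in web.xml.
public class NoCacheFilter implements Filter
{
   public void init(FilterConfig config) throws ServletException
   {
   }

   public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
         throws IOException, ServletException
   {
      HttpServletResponse httpRes = (HttpServletResponse)res;
      httpRes.setHeader("Cache-Control", "no-cache"); // HTTP 1.1
      httpRes.setHeader("Pragma", "no-cache");        // HTTP 1.0
      httpRes.setDateHeader("Expires", -1);           // already expired
      chain.doFilter(req, res);
   }

   public void destroy()
   {
   }
}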

braulio_moura
Champ in-the-making
Hi matafagafo!

Thanks! After talking to my company's corporate IT team, I've confirmed that this is not the best solution (as you said here before).

Evaluating the issue, I believe the best solution is to change Alfresco's source code.

I'm changing the download servlet code (BaseDownloadContentServlet.java) to introduce the following lines:

      res.setHeader("Cache-Control", "no-cache");//Http 1.1  
      res.setHeader("Pragma","no-cache");//Http 1.0        
      res.setDateHeader("Expires", -1);

After testing the solution, I'll publish the class here, along with what has to be done to deploy it.

Thanks!

PS: matafagafo, am I crazy or are you from Brazil?

braulio_moura
Champ in-the-making
Here's the new code for the BaseDownloadContentServlet class.

As you can see, I changed the processDownloadRequest method to include the following code:

res.setHeader("Cache-Control", "no-cache");//Http 1.1
res.setHeader("Pragma","no-cache");//Http 1.0
res.setDateHeader("Expires", -1);

Here is the full method:

protected void processDownloadRequest(HttpServletRequest req, HttpServletResponse res,
         boolean redirectToLogin)
         throws ServletException, IOException
   {  
      Log logger = getLogger();
      String uri = req.getRequestURI();
     
      if (logger.isDebugEnabled())
      {
         String queryString = req.getQueryString();
         logger.debug("Processing URL: " + uri +
               ((queryString != null && queryString.length() > 0) ? ("?" + queryString) : ""));
      }
     
      // TODO: add compression here?
      //       see http://servlets.com/jservlet2/examples/ch06/ViewResourceCompress.java for example
      //       only really needed if we don't use the built in compression of the servlet container
      uri = uri.substring(req.getContextPath().length());
      StringTokenizer t = new StringTokenizer(uri, "/");
      int tokenCount = t.countTokens();
     
      t.nextToken();    // skip servlet name
     
      // attachment mode (either 'attach' or 'direct')
      String attachToken = t.nextToken();
      boolean attachment = URL_ATTACH.equals(attachToken) || URL_ATTACH_LONG.equals(attachToken);
     
      ServiceRegistry serviceRegistry = getServiceRegistry(getServletContext());
     
      // get or calculate the noderef and filename to download as
      NodeRef nodeRef;
      String filename;
     
      // do we have a path parameter instead of a NodeRef?
      String path = req.getParameter(ARG_PATH);
      if (path != null && path.length() != 0)
      {
         // process the name based path to resolve the NodeRef and the Filename element
         PathRefInfo pathInfo = resolveNamePath(getServletContext(), path);
        
         nodeRef = pathInfo.NodeRef;
         filename = pathInfo.Filename;
      }
      else
      {
         // a NodeRef must have been specified if no path has been found
         if (tokenCount < 6)
         {
            throw new IllegalArgumentException("Download URL did not contain all required args: " + uri);
         }
        
         // assume 'workspace' or other NodeRef based protocol for remaining URL elements
         StoreRef storeRef = new StoreRef(t.nextToken(), t.nextToken());
         String id = URLDecoder.decode(t.nextToken(), "UTF-8");
         // build noderef from the appropriate URL elements
         nodeRef = new NodeRef(storeRef, id);
        
         if (tokenCount > 6)
         {
            // found additional relative path elements i.e. noderefid/images/file.txt
            // this allows a url to reference siblings nodes via a cm:name based relative path
            // solves the issue with opening HTML content containing relative URLs in HREF or IMG tags etc.
            List<String> paths = new ArrayList<String>(tokenCount - 5);
            while (t.hasMoreTokens())
            {
               paths.add(URLDecoder.decode(t.nextToken()));
            }
            filename = paths.get(paths.size() - 1);

            try
            {
               NodeRef parentRef = serviceRegistry.getNodeService().getPrimaryParent(nodeRef).getParentRef();
               FileInfo fileInfo = serviceRegistry.getFileFolderService().resolveNamePath(parentRef, paths);
               nodeRef = fileInfo.getNodeRef();
            }
            catch (FileNotFoundException e)
            {
               throw new AlfrescoRuntimeException("Unable to find node reference by relative path:" + uri);
            }
         }
         else
         {
            // filename is last remaining token
            filename = t.nextToken();
         }
      }
     
      // get the qualified name of the property to get content from - default to ContentModel.PROP_CONTENT
      QName propertyQName = ContentModel.PROP_CONTENT;
      String property = req.getParameter(ARG_PROPERTY);
      if (property != null && property.length() != 0)
      {
          propertyQName = QName.createQName(property);
      }
     
      if (logger.isDebugEnabled())
      {
         logger.debug("Found NodeRef: " + nodeRef);
         logger.debug("Will use filename: " + filename);
         logger.debug("For property: " + propertyQName);
         logger.debug("With attachment mode: " + attachment);
      }
     
      // get the services we need to retrieve the content
      NodeService nodeService = serviceRegistry.getNodeService();
      ContentService contentService = serviceRegistry.getContentService();
      PermissionService permissionService = serviceRegistry.getPermissionService();
     
      try
      {
         // check that the user has at least READ_CONTENT access - else redirect to the login page
         if (permissionService.hasPermission(nodeRef, PermissionService.READ_CONTENT) == AccessStatus.DENIED)
         {
            if (logger.isDebugEnabled())
               logger.debug("User does not have permissions to read content for NodeRef: " + nodeRef.toString());
           
            if (redirectToLogin)
            {
               if (logger.isDebugEnabled())
                  logger.debug("Redirecting to login page…");
              
               redirectToLoginPage(req, res, getServletContext());
            }
            else
            {
               if (logger.isDebugEnabled())
                  logger.debug("Returning 403 Forbidden error…");
              
               res.sendError(HttpServletResponse.SC_FORBIDDEN);
            } 
            return;
         }
        
         // check If-Modified-Since header and set Last-Modified header as appropriate
         Date modified = (Date)nodeService.getProperty(nodeRef, ContentModel.PROP_MODIFIED);
         long modifiedSince = req.getDateHeader("If-Modified-Since");
         if (modifiedSince > 0L)
         {
            // round the date to ignore the millisecond value, which is not supplied by the header
            long modDate = (modified.getTime() / 1000L) * 1000L;
            if (modDate <= modifiedSince)
            {
               if (logger.isDebugEnabled())
                  logger.debug("Returning 304 Not Modified.");
               res.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
               return;
            }
         }
         res.setDateHeader("Last-Modified", modified.getTime());
        
         if (attachment == true)
         {
            // set header based on filename - will force a Save As from the browser if it doesn't recognise it
            // this is better than the default response of the browser trying to display the contents
            res.setHeader("Content-Disposition", "attachment");
         }
        
         // get the content reader
         ContentReader reader = contentService.getReader(nodeRef, propertyQName);
         // ensure that it is safe to use
         reader = FileContentReader.getSafeContentReader(
                    reader,
                    Application.getMessage(req.getSession(), MSG_ERROR_CONTENT_MISSING),
                    nodeRef, reader);
        
         String mimetype = reader.getMimetype();
         // fall back if unable to resolve mimetype property
         if (mimetype == null || mimetype.length() == 0)
         {
            MimetypeService mimetypeMap = serviceRegistry.getMimetypeService();
            mimetype = MIMETYPE_OCTET_STREAM;
            int extIndex = filename.lastIndexOf('.');
            if (extIndex != -1)
            {
               String ext = filename.substring(extIndex + 1);
               String mt = mimetypeMap.getMimetypesByExtension().get(ext);
               if (mt != null)
               {
                  mimetype = mt;
               }
            }
         }
         // set mimetype for the content and the character encoding for the stream
         res.setContentType(mimetype);
         res.setCharacterEncoding(reader.getEncoding());
        
         // get the content and stream directly to the response output stream
         // assuming the repo is capable of streaming in chunks, this should allow large files
         // to be streamed directly to the browser response stream.
         res.setHeader("Accept-Ranges", "bytes");
        
         // Prevent the proxy from caching this content.
         res.setHeader("Cache-Control", "no-cache");//Http 1.1  
         res.setHeader("Pragma","no-cache");//Http 1.0        
         res.setDateHeader("Expires", -1);
                 

         try
         {
            boolean processedRange = false;
            String range = req.getHeader("Content-Range");
            if (range == null)
            {
               range = req.getHeader("Range");
            }
            if (range != null)
            {
               if (logger.isDebugEnabled())
                  logger.debug("Found content range header: " + range);
               // return the specific set of bytes as requested in the content-range header
               /* Examples of byte-content-range-spec values, assuming that the entity contains total of 1234 bytes:
                     The first 500 bytes:
                      bytes 0-499/1234

                     The second 500 bytes:
                      bytes 500-999/1234

                     All except for the first 500 bytes:
                      bytes 500-1233/1234 */
               /* 'Range' header example:
                      bytes=10485760-20971519 */
               try
               {
                  if (range.length() > 6)
                  {
                     StringTokenizer r = new StringTokenizer(range.substring(6), "-/");
                     if (r.countTokens() >= 2)
                     {
                        long start = Long.parseLong(r.nextToken());
                        long end = Long.parseLong(r.nextToken());
                       
                        res.setStatus(HttpServletResponse.SC_PARTIAL_CONTENT);
                        res.setHeader("Content-Range", range);
                        res.setHeader("Content-Length", Long.toString(((end-start)+1L)));
                       
                        InputStream is = null;
                        try
                        {
                           is = reader.getContentInputStream();
                           if (start != 0) is.skip(start);
                           long span = (end-start)+1;
                           long total = 0;
                           int read = 0;
                           byte[] buf = new byte[((int)span) < 8192 ? (int)span : 8192];
                           while ((read = is.read(buf)) != -1 && total < span)
                           {
                              total += (long)read;
                              res.getOutputStream().write(buf, 0, (int)read);
                           }
                           res.getOutputStream().close();
                           processedRange = true;
                        }
                        finally
                        {
                           if (is != null) is.close();
                        }
                     }
                  }
               }
               catch (NumberFormatException nerr)
               {
                  // processedRange flag will stay false if this occurs
               }
            }
            if (processedRange == false)
            {
               // As per the spec:
               //  If the server ignores a byte-range-spec because it is syntactically
               //  invalid, the server SHOULD treat the request as if the invalid Range
               //  header field did not exist.
               long size = reader.getSize();
               res.setHeader("Content-Range", "bytes 0-" + Long.toString(size-1L) + "/" + Long.toString(size));
               res.setHeader("Content-Length", Long.toString(size));
               reader.getContent( res.getOutputStream() );
            }
         }
         catch (SocketException e1)
         {
            // the client cut the connection - our mission was accomplished apart from a little error message
            if (logger.isInfoEnabled())
               logger.info("Client aborted stream read:\n\tnode: " + nodeRef + "\n\tcontent: " + reader);
         }
         catch (ContentIOException e2)
         {
            if (logger.isInfoEnabled())
               logger.info("Client aborted stream read:\n\tnode: " + nodeRef + "\n\tcontent: " + reader);
         }
      }
      catch (Throwable err)
      {
         throw new AlfrescoRuntimeException("Error during download content servlet processing: " + err.getMessage(), err);
      }
   }

How to deploy:

Place BaseDownloadContentServlet.class into alfresco-web-client.jar (located at %TOMCAT_HOME%\webapps\alfresco\WEB-INF\lib, in case you're using Tomcat as your application server). Then restart the application.

NOTE: I'm still testing this code, so if you encounter any errors, please tell me!

Bye!

kevinr
Star Contributor
Hi,

That change means the content will never be cached at all. The following change should be enough, and should mean it is cached when appropriate, based on the content date:

res.setHeader("Cache-Control", "must-revalidate");
res.setHeader("ETag", "\"" + Long.toString(modified.getTime()) + "\"");
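
For illustration, a sketch of how the servlet could answer the revalidation request a proxy or browser sends back once these headers are in place (variable names follow the method above; this check is an illustration, not part of the change described here):

         // build the ETag from the node's modification date, as above
         String etag = "\"" + Long.toString(modified.getTime()) + "\"";
         res.setHeader("Cache-Control", "must-revalidate");
         res.setHeader("ETag", etag);

         // a revalidating cache echoes the ETag back in If-None-Match;
         // if it still matches, return 304 and skip streaming the content
         String ifNoneMatch = req.getHeader("If-None-Match");
         if (ifNoneMatch != null && ifNoneMatch.equals(etag))
         {
            res.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return;
         }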

Please let us know if it works for you; if so, we'll add the fix.

Thanks!

Kevin

kevinr
Star Contributor
Hi,

I have done some testing with Apache mod_proxy, mod_cache, etc. and validated that my suggested fix works: modified file content is now correctly returned through the proxy. I will commit the change to HEAD later today, so the code change suggested above (if you want to make it yourself) will work, and so will a nightly build once the fix is in (or get the code from public SVN as usual). The fix will also go into all enterprise code branches for customers.

Thanks for reporting the issue!

Cheers,

Kevin