Hyland Connect

ftoth · ‎02-27-2007

Hi,

I've been working with 2.0, and it appears that the ability to enter structured XML via Xforms is tied in with the whole optional WCM feature? Is this necessary? It seems like it would be handy to be able to use Xforms as part of the general CMS, without confusing the issue with web site management.

Is there a way?

Many thanks,

Fred

jcox · ‎03-29-2007

Fred,

Thanks again for all your great input.

I'd like to help figure out exactly what can be done to help you prior
to the upcoming release with the built-in deployment stuff.

As you said, the CIFS projection of the AVM lets you deploy content using
pretty much anything you want (personally, I like to use rsync over ssh).
If you just want to deploy some files from one place to another, what is
the problem you're encountering? When I rsync from Alfresco's CIFS
projection to a remote file system, it works quite well – even for large
sites. The first version of our built-in deployment will be able to
take advantage of history information (and deal with meta info nicely),
but there are *lots* of bells & whistles we plan to add later. I want to
be certain that the "core" features in our first release meet your needs,
so if you can elaborate here, it would be very helpful.
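
The rsync-over-ssh approach mentioned above could be scripted roughly as follows. This is only a sketch: the mount point, remote host, and target directory are hypothetical placeholders, and the flags are ordinary rsync options rather than anything specific to Alfresco.

```python
import subprocess

def build_rsync_command(cifs_source, remote_target):
    """Build an rsync-over-ssh command that mirrors a directory tree."""
    # A trailing slash makes rsync copy the directory's contents
    # rather than the directory itself.
    source = cifs_source.rstrip("/") + "/"
    return [
        "rsync",
        "-a",          # archive mode: recurse, keep times and permissions
        "-z",          # compress data in transit
        "--delete",    # remove remote files that no longer exist in the source
        "-e", "ssh",   # use ssh as the transport
        source,
        remote_target,
    ]

def deploy(cifs_source, remote_target, dry_run=True):
    """Run the rsync command; with dry_run it only reports what would change."""
    command = build_rsync_command(cifs_source, remote_target)
    if dry_run:
        command.insert(1, "--dry-run")
    return subprocess.run(command, check=True)
```

A call such as deploy("/mnt/avm/mysite", "deploy@webhost:/var/www/mysite") would preview the transfer first; both paths are invented for the example and depend on where the CIFS projection is actually mounted.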

I would like to somehow move the functionality of the [deployment]
ant script into Alfresco. Perhaps a custom action of some sort?

Built-in deployment has been under very active development & will
be available quite soon (weeks, not months).

The multiple virtual sites, via sandboxes, is very powerful,
but it appears that I can't use that without adopting virtualization
and alfresco as the delivery engine?

The virtualization server is virtualizing the contents of the AVM itself.
The major components of Alfresco are:

DM repository (used for document management)

AVM repository (used for WCM currently, but may take over completely)

Alfresco webapp (the GUI for creating & manipulating content)

The virtualization server (creates a virtual view of *your* webapp/website)

The virtualization server is a modified version of Tomcat that knows how
to interpret the area-izing information embedded within a URL
(*.www--sandbox.*:8180/…), and fetch data via AVMRemote. It also
does some fancy footwork with classloaders to allow many users to have
what appears to be a separate version of your webapp in the "staging"
area, and yet still share the jar files they have in common, when possible
(this allows the system to scale).

Whether you fetch data via the CIFS projection of the AVM repository,
or if you use AVMRemote (like the virtualization server does via its
built-in JNDI binding), you're still accessing the AVM. Both end up using
AVMRemote behind the scenes to fetch the data being presented.
Incidentally, the virtualization server does not require the CIFS projection
to work for webapps that don't make use of the servlet method
getRealPath(). However, if a webapp *does* call getRealPath(),
then so long as you've mounted CIFS where you've said you will
within $VIRTUAL_TOMCAT_HOME/conf/alfresco-virtserver.properties,
then the JNDI names will coincide with the file system names; thus
even AVM-unaware code will work too.

Put differently, the fact that the virtualization server requires the AVM
is inherent in the fact that the core feature of the virtualization server
is to display the contents *of* the AVM. On the other hand, the fact
that the AVM is bound to the same process as the Alfresco webapp
is just an artifact of its current implementation. Some day, the AVM
may be in a different process (and/or on a different host) than the
one running the Alfresco webapp.

(Though I do remember one note somewhere about virtualization working
with any "well behaved" java technology).

There are some limits to what can be virtualized; for example, if you
have a singleton of any sort (not just a java singleton), then it will be
shared by all virtual webapps based off of the project's staging webapp.

For example, suppose your webapp writes to a database table.
If users Alice and Bob are viewing their "separate" virtualized
instances of the webapp within their own sandboxes, and Alice does
something that makes this webapp modify the 'moo' table, then Bob
will see the change Alice has made to the 'moo' table immediately.
In other words, the virtualization server can virtualize files but
can't magically restructure arbitrary hard-coded programs.
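
The Alice and Bob scenario can be modelled with a toy example. This is not how the virtualization server is implemented; the class, the copied file dictionary, and the module-level list standing in for the 'moo' table are all assumptions made for illustration.

```python
# Stand-in for an external singleton such as a database table.
# There is exactly one copy, no matter how many sandboxes exist.
MOO_TABLE = []

class VirtualWebapp:
    """A toy model of one user's virtualized view of a webapp."""

    def __init__(self, user, staging_files):
        self.user = user
        # Each sandbox is modelled here as a private copy of the
        # staging files, so edits stay local to that user.
        self.files = dict(staging_files)

    def edit_file(self, path, content):
        self.files[path] = content

    def write_row(self, row):
        # The table lives outside the sandbox and is shared by everyone.
        MOO_TABLE.append((self.user, row))

    def read_rows(self):
        return list(MOO_TABLE)

staging = {"/index.html": "original"}
alice = VirtualWebapp("alice", staging)
bob = VirtualWebapp("bob", staging)

alice.edit_file("/index.html", "alice's draft")  # file change: private
alice.write_row("hello")                         # table change: shared
```

After these calls, Bob's view of /index.html is unchanged, but bob.read_rows() already contains Alice's row, which is the behaviour described above.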

As described above, I can't use all of the cleverness of xforms without
the virtualization server, right? (I still don't understand that one!)

You can use the data generated by XForms, and the files derived from
them (e.g.: templatized output files) by simply deploying them to
a native file system. From there, any webserver will just be able to
serve them up as ordinary files … because that's exactly what they'll be.

But what I can do, it seems, is use xforms to capture XML, and, once
captured, use external techniques to process that XML in various ways.

Exactly.
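
A minimal sketch of that kind of external processing, assuming the captured XML has been deployed as an ordinary file or string. The element names (press_release, title, body) are invented for the example and do not come from any real form definition.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML of the kind a form might capture.
CAPTURED_XML = """<press_release>
  <title>New release</title>
  <body>Deployment support is on the way.</body>
</press_release>"""

def render_html(xml_text):
    """Turn the captured XML into a small HTML fragment."""
    root = ET.fromstring(xml_text)
    # findtext returns the default when the element is missing.
    title = root.findtext("title", default="")
    body = root.findtext("body", default="")
    # Escape the text so markup characters in the content stay literal.
    return "<h1>{}</h1>\n<p>{}</p>".format(escape(title), escape(body))
```

In practice the XML would be read from wherever the files were deployed (for example with ET.parse on a file path) rather than from an inline string.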

Again, a key enabler is the CIFS mount, which, from what I can
tell, works in 2.0 as well, with the benefit of presenting all of the
sandboxes. So once again, I can have the best of both worlds, without
adopting alfresco as the delivery engine.

With the stuff coming in 2.0.1, you won't even need CIFS for deployment.
When you talk about the Alfresco "delivery engine", I think what you
probably mean is "repository".

By the way, for the fun of it, you might want to try comparing the speed
of a vanilla tomcat instance fetching data from a native file system to
the virtualization server fetching data from the AVM. In most cases, there
isn't a heck of a lot of difference. When doing a bunch of tests a while
back, I was very pleasantly surprised. Of course, I might not want to use
a setup like this for the splash page of a super high volume website, but
then if you slapped a cache in front of it, it would probably be just fine
for that as well. Food for thought.

Another amusing experiment I did was set the docroot of an Apache 2.x
instance to be the /www/avm_webapps directory of the staging area
of a web project using the CIFS projection. This also worked much better
than I'd anticipated. The CIFS projection of AVM contents is quite fast
(much faster than the CIFS projection of the older DM repository's data).

Some time soon, I'll take a crack at making a virtualized version of
Apache 2.x, so we can have a full Apache/Tomcat stack (and maybe even
deal with PHP). My first priority is creating infrastructure for link checking
at submit/update time, but after that I think I'm going to have a look at this.

But what I think I'm seeing is that as long as my production environment
is something other than alfresco, the virtualization server can NOT be
used to preview my site.

The virtualization server's job is to allow data within certain portions of
the AVM repository to be interpreted as a set of virtual webapps. Suppose
that you have a production environment that wishes to access some data
from the native file system, and other data from the AVM. Nothing
prevents you from doing this – it's not an "all or nothing" choice.
There are at least three ways you could proceed:

[1] Invoke AVMRemote APIs directly, wherever you wish.
[2] Make portions of your site act as a reverse proxy to the virt server.
[3] Make portions of your site access AVM content via CIFS.
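
As a rough sketch of option [3], a site could route some request paths to a CIFS-mounted AVM directory and everything else to the native docroot. The "/managed/" prefix, the function name, and both root directories are hypothetical; nothing here is an Alfresco API.

```python
from pathlib import Path

def resolve(request_path, native_root, avm_mount, avm_prefix="/managed/"):
    """Map a request path to a file under one of two roots.

    Paths starting with avm_prefix are read from the CIFS-mounted AVM
    directory; everything else comes from the native docroot.
    """
    if request_path.startswith(avm_prefix):
        root = Path(avm_mount).resolve()
        relative = request_path[len(avm_prefix):]
    else:
        root = Path(native_root).resolve()
        relative = request_path
    candidate = (root / relative.lstrip("/")).resolve()
    # Refuse paths that escape the chosen root (for example via "..").
    if candidate != root and root not in candidate.parents:
        raise ValueError("path escapes document root")
    return candidate
```

For example, resolve("/managed/news/a.html", "/srv/native", "/mnt/avm") yields a path under /mnt/avm, while resolve("/static/b.css", "/srv/native", "/mnt/avm") stays under /srv/native.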

So, somewhere in the content freemarker template, I have something like
${quote.todaysQuote}

The quote bean is in my world (struts2), and not in Alfresco's.
I can still use Alfresco to manage the template that contains
the above snippet, but there's no way the virtualization server is ever
going to be able to preview that page, right?

It sure could.
As long as the webapp you are virtualizing contains the jar file
referenced by your function in its WEB-INF/lib dir, then this
becomes logic internal to your webapp. Think of it this way:
the virtualization server just provides a framework for automatically
virtualizing AVM content & serving it up (after all, it's just Tomcat with
a bunch of classes replaced to handle AVMRemote calls, bootstrapping,
webapp notifications, and some extra tap dancing to make it all scale).
Thus, if you can write a webapp to do something, the virtualization
server can (in most cases) just serve it up for you. To do this,
you've got to configure *your* webapp with the proper libraries….
but then this is no different from what you'd expect from *any* webapp
whether served via the virtualization server, or a "vanilla" Tomcat.

What IS the virtualization server doing, anyway?

Suppose you have a web project named "mysite",
and two users: Alice and Bob. Within Alice's sandbox, the "eyeball"
icon will have URLs for Alice of the form:

http://alice.mysite.www--sandbox.<virtualization-domain>:8180/…

The "eyeball" icon will create URLs for Bob like this:

http://bob.mysite.www--sandbox.<virtualization-domain>:8180/…

Let's say your virtualization server is sitting on IP address 192.168.1.5,
and that you're using the EchoDNS server at ip.alfrescodemo.net to
deal with DNS wildcards. Therefore, *.192-168-1-5.ip.alfrescodemo.net
will be resolved as the IP address 192.168.1.5. Consequently, the
following URL will resolve to 192.168.1.5 (on port 8180):

http://alice.mysite.www--sandbox.192-168-1-5.ip.alfrescodemo.net:8180/…

Note that the same is true of the following url for Bob:

http://bob.mysite.www--sandbox.192-168-1-5.ip.alfrescodemo.net:8180/…

The wiki page entitled Configuring the Virtualization Server covers this in greater detail.
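
The wildcard naming convention described above can be mimicked locally to see which IP address a given host name encodes. This sketch does no DNS lookup and makes no claim about how the EchoDNS server itself is implemented; the function name and regular expression are assumptions.

```python
import re

# A dash-separated IPv4 address directly in front of the EchoDNS domain.
_ECHO = re.compile(
    r"(?:^|\.)(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.ip\.alfrescodemo\.net$"
)

def echoed_ip(hostname):
    """Return the IP address encoded in the host name, or None."""
    match = _ECHO.search(hostname.lower())
    if not match:
        return None
    octets = match.groups()
    # Reject values that cannot be IPv4 octets.
    if any(int(octet) > 255 for octet in octets):
        return None
    return ".".join(octets)
```

With the example host name from above, echoed_ip("alice.mysite.www--sandbox.192-168-1-5.ip.alfrescodemo.net") gives "192.168.1.5".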

In any event, using the area info embedded into the "virtual host name"
the virtualization server is able to figure out the associated AVM repository
and area-ize the request path accordingly when it makes calls to
AVMRemote behind the scenes. There's a lot of other stuff going on
too, like how it knows when to reload a webapp, how it maps between
JNDI/webapp/CIFS/URL namespaces, and so on… but you get the gist
(I hope).
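
To make the host name decoding concrete, here is a minimal sketch that splits a virtual host name of the form shown above into user, project, and virtualization domain. It only handles the user.project.www--sandbox pattern from the examples in this thread; the names and return type are assumptions, and the real server does considerably more.

```python
from typing import NamedTuple, Optional

MARKER = "www--sandbox"

class SandboxHost(NamedTuple):
    user: str
    project: str
    domain: str

def parse_virtual_host(host: str) -> Optional[SandboxHost]:
    """Split '<user>.<project>.www--sandbox.<domain>[:port]'."""
    # Drop an optional port and normalise case.
    hostname = host.split(":", 1)[0].lower()
    labels = hostname.split(".")
    if MARKER not in labels:
        return None
    index = labels.index(MARKER)
    # Expect exactly two labels (user, project) before the marker
    # and at least one domain label after it.
    if index != 2 or index == len(labels) - 1:
        return None
    domain = ".".join(labels[index + 1:])
    return SandboxHost(labels[0], labels[1], domain)
```

For Alice's URL above, parse_virtual_host("alice.mysite.www--sandbox.192-168-1-5.ip.alfrescodemo.net:8180") returns user "alice", project "mysite", and the domain "192-168-1-5.ip.alfrescodemo.net".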

The virtual server logs give me the impression that there's some RMI
going on, which makes sense. So I'm guessing that the virtualization
server just decodes the URL and then requests the appropriate
documents from Alfresco. But there must be a fair amount of complexity
in there, right?

Interestingly, the amount of code involved is not that large, but in terms
of how and why it is designed the way that it is... yes, you're right.
Many late nights went into thinking it all through, and making it actually
work. I'm glad you're enjoying it & finding it useful.

Cheers,
-Jon

Xforms tied to WCM?