Maven2 Build for Alfresco

rdanner
Champ in-the-making
Hello everyone… we (the royal 'we' as in the Alfresco Community) need your input on a very important subject. BUILD.

Alfresco has been using a custom build harness for the last two years, and in that time there seems to be growing support for the use of Maven2.  I'd like to open the floor here for your input on Alfresco moving to a Maven2-based build.

Please at least cast a ballot – as I mentioned, this is very important, and if you have support or concerns which have not been voiced or require reinforcement, it makes sense that you speak up.

For the sake of discussion I think there are several things we need to talk about.

First and foremost are the pros and cons of moving to maven. 
Second would be concerning the logistics of the move if it were to happen. 

Just to be clear, there is no official movement from Alfresco to change the build process – this is just an exploratory question. There seems to be a lot of interest, and several people are willing to help make the transition if it makes sense, but there is no official plan from Alfresco to switch. We won't know the facts unless you vote and unless you speak up.

Please make your thoughts known!
13 REPLIES

rdanner
Champ in-the-making
We use maven 2 here for everything.  I love maven – I hate maven –  but – it’s the best we have and it makes sense for open source projects.  Once you know maven – you know it.  Everyone I know who has switched resisted it at first because Maven 1.x stank but they are all glad they made the shift to Maven2.  This is another area where we can seek de facto based standards and reduce the barrier to entry for Alfresco.


That said… my gripes with Maven are below (some of them are flaws that have nothing to do with Maven per se):

           First is the issue of Sun jars, which are not allowed to be housed in the Maven repository or any other repository because of some silliness on the part of Sun.  Alongside those are all the other jars that are not in the Maven repository because vendors (cough-fresco-cough <GRIN>) have not made them available.

           The second issue is that the central Maven repository is a superfund site where people have managed to place the same libraries under 30 different groups and names – this is Maven's fault because they don’t seem to manage the issue.

           The third issue is that the Eclipse plug-ins for Maven are Terrible (yes – with a capital ‘T’), but they are just as good as any support for home-cooked Ant builds, so I am not sure it matters – we just don’t gain much on the Eclipse front.  NetBeans, however, seems to have nice tools (and a profiler), so in some ways we are gaining there for people who use NetBeans.


           The last issue I can think of is that working offline with Maven has sometimes bitten me because I can't reach a dependency.  This is not much different than just not having the jar on hand, except that with Maven you don't ship with the jars the way the Alfresco source is currently shipped.  That is good for the download (it's small if you just want to sniff around the source), but it does mean you must be online for your first build.



Other than that – my concerns are with logistics:

           We need a strong plan with guidelines for defining groups, artifact ids, dependencies and other practices.

           We need a plan for the overall source configuration – to be honest Alfresco is pretty well suited for a port: we already have things broken down in to sub projects.

           I think we need a proof of concept that demonstrates
                * nightly builds will continue to work
                * junit tests will continue to run
                * the sun will continue to rise and so forth

           The actual porting of the code to a Maven2 build is simple and we could do it very quickly with the help of a few – that isn't the hard part.  The hard part would be actually making the switch.

           Alfresco's engineering team has to be on board and has to participate in the move, and the community has to be on board, know what's happening and when it's going to happen, and then have support after it has happened.

           That is the tough part – making the change where it touches the people :)

           If we do decide to make the move (aka there is strong support and the POC works well) we would need to move with the quickness (kung-fu sword noises here) and make this happen.  We can't sit on the fence with two builds, it is too far down the stack and way too important to mess around.  It is also essentially a fork until it is one build again … so once we decide to go (should we decide to go) we need to execute the act quickly.

jcox
Champ in-the-making
Good topic Russ!

Before we get too deeply into the fine points of Ant versus Maven II,
I think it's worth stepping back and defining our needs.  This way, we'll
be in much better shape to identify the best way to proceed
(or at least to weigh tradeoffs with a checklist in hand).

Here's what I'd like to see regardless of our choice of build framework:
  • Remove inter-project dependency issues for good

  • Remove all .classpath issues in Eclipse for good

  • Make the source tree reflect the java package organization

  • Default: one jar per java package; build others as-needed.

  • All build output can be configured to go outside the source tree

  • Make Javadoc build better docs & overall indices w/o special tweaking

  • Improve build speed

  • Allow builds to occur directly from read-only source archive
This could be done by reorganizing the code within Ant so that all the java
code is in a single project. Essentially, I think this is just the next logical
step along a path  begun when we got rid of recursive ant calls in our
build;   we're  letting javac do its job in sorting out dependencies rather
than  hard-coding them ourselves (in 2 places).   I don't know how easy
or hard that is to do in Maven II compared to Ant.
Does somebody care to jump in here?

There are always funny corner-cases when it comes to building
non-trivial programs so I thought it might be nice to think about
it collectively. The one real technical issue I can see up-front with
this is the following:

        
         Package  org.alfresco.foobar  has classes a, b, c, and d.
         Classes a & b are in project ab, and go into ab.jar, while
         classes c & d are in project cd, and go into cd.jar.  Due
         to some security/classloader/bundling reason, ab.jar and
         cd.jar must reside in different locations.  

    This could be resolved in three different ways:
  1. Punt.  If you need a jar in more than one place, the fact that it has a bit of extra
     baggage is often not a problem.
  2. Make targets that create more selective jars by picking & choosing which classes they want.
     This is a bit ugly but there aren't that many places where it would be necessary
     so the overall ugliness is probably not too great.
  3. Reorganizing the code so that packages don't need to be split across different jars.
     This is probably the cleanest & best thing to do, but it's a bit more work.
Note that [1], [2] and [3] are not mutually exclusive options.
We could get something working in a hackish way fairly quickly
via [1] and [2], and clean things up as we go via [3].
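
As a rough illustration of option [2], here is a minimal Java sketch that writes only a chosen set of class files into a jar using the standard java.util.jar API. The class name SelectiveJar, the writeJar helper, and the command-line shape are hypothetical; nothing here comes from the actual Alfresco build, where an Ant target would more likely do the same job.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical helper: packages only the selected class files into a jar.
class SelectiveJar {

    // classesDir is the root of the compiled output; entries are paths
    // relative to it, using '/' separators (e.g. "org/alfresco/foobar/a.class").
    static void writeJar(Path classesDir, List<String> entries, Path jarFile) throws IOException {
        try (OutputStream out = Files.newOutputStream(jarFile);
             JarOutputStream jar = new JarOutputStream(out)) {
            for (String entry : entries) {
                jar.putNextEntry(new JarEntry(entry));
                // Stream the compiled class straight into the current jar entry.
                Files.copy(classesDir.resolve(entry), jar);
                jar.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: SelectiveJar <classesDir> <jarFile> <entry>...");
            return;
        }
        List<String> entries = Arrays.asList(args).subList(2, args.length);
        writeJar(Paths.get(args[0]), entries, Paths.get(args[1]));
    }
}
```

With one shared output directory, ab.jar and cd.jar from the example above would simply be two calls with different entry lists.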

There are probably a bunch of other issues I'm not considering at all, so
if you can think of anything big offhand, it would be nice to hear about.
My guess is that the pros strongly outweigh the cons, but that could
certainly change if someone thinks of a  show-stopper.

In the end, I'd like us to focus on what problem we're solving first,
rather than the specific technology we're using.  The inter-project
dependency issues we've got now have become a bit painful,
a finer-grained set of jars would give those who accept "hot patches"
more peace of mind, the mapping of packages to paths has
become annoying in places, and I'm sick of javadoc indexes not
being quite right.   Those are the pain points I can see immediately,
but perhaps there are others too.   If so, what are they?

In what ways would Maven II help or hurt relative to what could be
done by reorganizing the 22 ant projects into 1 (as it probably should
have been originally)?  Ultimately, if Maven II helps with respect to
these specific "pain point" issues, and does not create other problems,
I'm very open to understanding how and why this is so.   Otherwise,
I'm inclined to stick with the  known pros/cons of Ant, but use it in a
better way.  When the dust settles a bit on 2.1, we're definitely going to
do some homework on Maven II.

I'm not speaking for all of engineering here, just myself.
From what I can tell, nobody in engineering has a very strong
opinion on this yet because  we just haven't had time to discuss
it or analyze it in detail.   It's a conversation that's well worth having.

Thanks for kicking off the discussion!

       Cheers,
       -Jon

rdanner
Champ in-the-making
> Good topic Russ!
>
> Before we get too deeply into the fine points of Ant versus Maven II,
> I think it's worth stepping back and defining our needs.  This way, we'll
> be in much better shape to identify the best way to proceed
> (or at least to weigh tradeoffs with a checklist in hand).


You make some really good points.  I think a bunch of us were looking at making a one for one switch just to head in the direction of a somewhat industry or at least open source standard for the sake of lowering the barrier to entry for folks who have never built Alfresco.  Frankly speaking, I wasn't even aware of two or three of the issues you surfaced.

There are some features that come with Maven II which are very good.  One that I am particularly fond of is the way it performs tests.  It goes through a lot to make sure that classloaders are 'sterile' and that the test environment has only what is needed (declared) for the test.

Other features include plugins for eclipse, javadoc, clean, and other such capabilities. 

I want to go back and look at your post more carefully.  Maybe there are some points I can address.  I have a couple of answers concerning maven and also some comments which don't have anything to do with maven that I'll take a swing at.

Like you said: it's probably a good idea to make sure we have a good list of the concerns we want to address before we go too far down the technology path but I'll throw out some maven trivia to address your points and maybe you can respond to my question concerning API(s) below.


RE: Remove inter-project dependency issues for good

This doesn't seem to be a build harness question per se, but rather a packaging question.  I'm trying to think through the issue in the moment here and one thing that pops out to me is that there are some basic dependencies like core that you will have between projects.  That is, other projects will need core and possibly other low level libraries in order to build.  I think you are addressing something different and I am just not quite sure what.

Something that I brought up out in CA was pulling the APIs out into separate jars.  I think there are several obvious pros, and only one constant con and one situational con that I can think of.

Pros:
    You can absolutely ensure that there is no dependence on implementation code by making sure implementations are not available to javac during build

    You have a more granular, safer jar/library you can put in common classloaders if you have to share an API (while the implementations can remain at different versions as long as binary compatibility exists)

    You have libraries which represent your contracts and only your contracts.

Cons:
    More jars to manage.

    If you share an API jar you need to make sure you have compatibility.

RE: Remove all .classpath issues in Eclipse for good

What do you mean by this?  Maven II will generate your Eclipse machinery, but perhaps the issue you are speaking of will still be present.  Typically the Eclipse machinery is not kept in SVN; it's generated by Maven by typing mvn eclipse:eclipse or by using the Maven plugin to import the project.

RE: Make the source tree reflect the java package organization

This is independent of the build harness, correct? We only need to make sure that we comply with this requirement, and that the build tool doesn't stop us from staying compliant.

RE: Default: one jar per java package; build others as-needed.
RE: All build output can be configured to go outside the source tree


Maven builds in a target folder.  I never looked to see if it can be configured to be put outside the source tree – by default it is in the source tree but at the top level (not in line with the java source files)

RE: Make Javadoc build better docs & overall indices w/o special tweaking

What is throwing them off currently? 

RE: Improve build speed

I think granularity and organization come to play here.  You need the capability to easily build only what you need to build.  I am not sure Maven would be "faster" for its own sake but we could probably make some optimizations in the organization which would impact build time. 
We should make sure that whatever we do, build time doesn’t get worse.


RE: Allow builds to occur directly from read-only source archive

I don't see any issues with this.


Good perspective Jon.  Does anyone else have concerns about the current build harness that we ought to investigate?

rdanner
Champ in-the-making
> Good topic Russ!
>
> Here's what I'd like to see regardless of our choice of build framework:
>   • Remove inter-project dependency issues for good
>   • Remove all .classpath issues in Eclipse for good
>   • Make the source tree reflect the java package organization
>   • Default: one jar per java package; build others as-needed.
>   • All build output can be configured to go outside the source tree
>   • Make Javadoc build better docs & overall indices w/o special tweaking
>   • Improve build speed
>   • Allow builds to occur directly from read-only source archive

I'll put together a wiki page on requirements so we have something we can use as a measuring stick when considering our next steps and so that we have something to build a test plan from.

jcox
Champ in-the-making
The inter-project dependencies are a needless pain in the neck
for developers.   It ends up forcing certain things to live in
places because of when they're built, rather than what would
be the ideal logical organization.  In a few cases, it forces
us to do lazy-init within spring…etc.   Cruft like this only gets worse
over time, so I'd like us to fix it now while it's still relatively easy.
This is a developer's perspective, not a user's… but that matters too! :)

I'd like to learn a lot more about Maven II.  If it can meet all our
requirements, and most folks prefer it, then that's great.   Ant is ok,
but I don't love it.   That said, I think we could be using Ant a lot
better than we are currently.  

> This doesn't seem to be a build harness question per se, but rather a packaging question.
It's not just about the packaging.   Structuring things so that one big
javac can see everything at once means perfect build dependencies
with 0 effort forever.   Otherwise we're right back to:

class X needs to be built before Y, but it needs class P
which must be put in location Q but then that generates a circular
dependency that Ant can handle but Eclipse can't… unless you
sprinkle lazy inits here and there in various spring config files…
and then muck around with .classpath files used by the continuous
builds done by IDEs such as Eclipse… and… and….

Ugh. The picture I'm painting isn't that bad yet, but it's always there,
gnawing, gnawing, gnawing….

Regarding your concern about many jar files,  you can actually
compose jars, so there can actually be *fewer* of them in the
final product regardless of the intermediate jars you choose to
create (or not).   If we go the many-smaller-jars route in the
final product, then the way of handling compatibility is via
a major/minor numbering scheme.   Tried and true.  The existence
proof:  UNIX libs   (kudos to Sun for getting it right).
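
One common reading of a major/minor scheme, and it is only an assumption about what is meant here, is that a jar remains a drop-in replacement while the major number stays the same and the minor number does not go backwards. A minimal Java sketch of that rule follows; the LibVersion class and its method names are hypothetical.

```java
// Hypothetical value type. The rule in satisfies() is one common convention
// (same major, provided minor >= required minor), assumed for illustration.
final class LibVersion {
    final int major;
    final int minor;

    LibVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // Parses strings of the form "major.minor", e.g. "2.1".
    static LibVersion parse(String text) {
        String[] parts = text.split("\\.");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected major.minor but got: " + text);
        }
        return new LibVersion(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    // True when a client built against 'required' can run against this version.
    boolean satisfies(LibVersion required) {
        return major == required.major && minor >= required.minor;
    }

    public static void main(String[] args) {
        LibVersion provided = parse("2.1");
        System.out.println(provided.satisfies(parse("2.0"))); // true: same major, newer minor
        System.out.println(provided.satisfies(parse("2.3"))); // false: minor too old
        System.out.println(provided.satisfies(parse("1.9"))); // false: different major
    }
}
```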

Check out One-JAR:
http://www.ibm.com/developerworks/library/j-onejar/index.html?ca=drs-j4904
As far as I'm concerned, everything is on the table.

> Make the source tree reflect the java package organization
> This is independent of the build harness correct?
It depends on the underlying capabilities of that harness.
For example,  the way we're using Ant currently, source code
that belongs to the same package cannot live in the same directory,
due to build dependency issues.  That's not a fundamental problem
with Ant, but rather with our current use of Ant.   It's my understanding
that Maven II lets you do things at a higher level than Ant, but sometimes
that can mean loss of lower-level control. 

For example, some tools force you to build things a certain way,
and if your goals and their model are badly matched, the tool gets
in the way.    I'm not saying that's the case with Maven II,
but I don't know for sure that it's not either… yet.

Being able to build from a read-only source is great.
It adds a lot of flexibility to the system in terms of
permissions, media types, etc.   It's also just cleaner
(e.g.: we'll be able to get rid of a lot of  svn:ignore tags).

As for build speed, one javac is faster than n, and only ever builds
exactly what's out of date.   When you have n projects, you've got
to be a lot more careful, and even then people tend to throw in a
"clean" operation because they get scared.   As soon as you do that,
you're rebuilding way more than you really need to;  if you fail to
do that when things are slightly wrong in the dependencies
(which is easy to do), then you waste time in a different way…
which is no fun either.

Essentially, as soon as you break things up into multiple projects,
you take ownership of hand-crafting dependencies rather than just
letting the compiler do its job.    It's bad for exactly the same reasons
recursive make is bad. 

How these compiled classes are *packaged* however, is a totally
orthogonal issue, so long as the build framework supports multiple
artifacts per project.   In  Maven I, that was an issue, but I don't
know what the deal is with Maven II.   Some stuff needs to go
into  server/lib while other stuff needs to be in common/lib,
so this actually is a hard requirement.   I don't want to be forced
into a different project  / javac invocation just because the build
framework thinks it knows better and refuses to generate multiple
artifacts in one project.   If that's the case with Maven II, it's a
non-starter in my view. 

Again, this isn't something we've had a chance to discuss a lot
internally, so I'm not summarizing a collectively held viewpoint,
nor am I even expressing misgivings regarding Maven II.
For all I know right now, I'd love it.   That said, this is at least
part of the checklist I'll use when I get around to forming my
own opinion on the topic.   I'm sure there will be other stuff
too, such as learning curve, interop with tools, how others
in the community & engineering feel about it all, and so forth.

Well, you asked for my opinion, and all you got was my meta-opinion!
I hope to do better than that soon.

  Cheers,
  -Jon

rdanner
Champ in-the-making
I've started a wiki page concerning issues with the current build and other requirements we would like to consider moving forward.

Please feel free to add your own requirements, thoughts, etc.  I only ask that you try to organize your thoughts along with everyone else's.  The page doesn't have much structure yet but I think we will see some very soon.

http://wiki.alfresco.com/wiki/Build_Harness_Requirements

rdanner
Champ in-the-making
> The inter-project dependencies are a needless pain in the neck
> for developers.   It ends up forcing certain things to live in
> places because of when they're built, rather than what would
> be the ideal logical organization.  In a few cases, it forces
> us to do lazy-init within spring…etc.   Cruft like this only gets worse
> over time, so I'd like us to fix it now while it's still relatively easy.
> This is a developer's perspective, not a user's… but that matters too! :)

The comments on the build dependencies are really interesting for me and I am going to give them a lot of thought.

I've always liked the idea of keeping things in separate projects and using Maven to include what needs to be included.  I tend to think of it as a virtue that each piece only gets compiled with what it needs, and that what it needs is either declared in the POM file or missing, in which case (need not declared) an explosion will occur.  The POM file is a sort of documentation on the cohesion of the code.  I also like the fact that you can enforce certain layering and strict use of APIs across libraries (check for cohesion with implementation) by breaking things into sub-projects and declaring dependencies.  I haven't fretted much that I am taking some of the dependency management off the shoulders of javac, but then I've never worried too much about build time.  That can become an issue if you have to rebuild and you freak and run a clean, and some of the symptoms you mention (like clean) are real.

Really good stuff Jon.  I have a lot to think about here because what you are saying is very different than what I am currently doing.  We have a lot of small projects linked together by dummy pom files, and Maven is responsible for determining the build order.  I always like having to come from a completely different direction and examine how we are doing things.  One of the reasons I thought Maven would be a natural fit for Alfresco was the fact that Alfresco is currently split up in a similar fashion.
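
For readers who have not used a multi-project build, deriving a build order from declared dependencies can be pictured as a topological sort. The Java sketch below is not Maven's implementation, only an illustration of the idea; the BuildOrder class and the module names are hypothetical. It also shows why a circular dependency between projects cannot be ordered at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: orders modules so that each one is built after
// everything it declares as a dependency (depth-first topological sort).
class BuildOrder {

    // deps maps each module to the modules it declares as dependencies.
    static List<String> order(Map<String, List<String>> deps) {
        List<String> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String module : deps.keySet()) {
            visit(module, deps, done, inProgress, result);
        }
        return result;
    }

    private static void visit(String module, Map<String, List<String>> deps,
                              Set<String> done, Set<String> inProgress, List<String> result) {
        if (done.contains(module)) {
            return;
        }
        // Seeing a module again while it is still being visited means a cycle.
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Circular dependency involving " + module);
        }
        for (String dep : deps.getOrDefault(module, Collections.<String>emptyList())) {
            visit(dep, deps, done, inProgress, result);
        }
        inProgress.remove(module);
        done.add(module);
        result.add(module);
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("web-client", Arrays.asList("repository", "core"));
        deps.put("repository", Arrays.asList("core"));
        deps.put("core", Collections.<String>emptyList());
        System.out.println(order(deps)); // prints [core, repository, web-client]
    }
}
```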

So if I get what you are saying: you would like packages to be atomic so that they can be individually built and deployed as a patch, but that the whole project can also be built in one javac.

It’s interesting because I really never thought of Alfresco as a single project.  I think of the core and repository as very different from the web client, for example, and I’ve always thought those lines would deepen with time.  Now, again, that is a bit of deployment versus source tree talk, and they are not the same thing, so again, I may have to go rethink.  I am stuck in that mindset because I do break my libraries up as separate projects and I build my applications as yet separate projects.  In most cases the applications have only spring configuration, web configuration and a few domain classes, and Maven pulls in all the dependencies.

The fact that the Alfresco approach was similar (not the same but not too far off) reinforced my thought process around that kind of organization.   I’ve been aware of at least some of the basic CONS for some time but I have always felt the PROS outweighed them.  I’ll be interested to hear what others throw in on this subject, but I really appreciate the thoughts you have put here, especially because they are very different from my own on the subject.

jcox
Champ in-the-making
> I've always liked the idea of keeping things in separate projects
> and using maven to include what needs to be included. I tend to
> think of it as a virtue that each piece only gets compiled with what
> it needs and what it needs
I think taking over any of these dependencies just invites brittleness,
crufty hand-written configs, needless compilations, and/or bad builds.
As for enforcing proper layering,  public/protected/package APIs
already handle this;  a separate package is the wrong mechanism for
the job.   In short, I want my *compiler* worrying about compilation,
not me.

Regarding code cohesion, I think that's what my packaging, distribution
system, SVN branches, and automatically enforced jar naming
conventions should handle, not my compiler.

> So if I get what you are saying: you would like packages to be atomic
> so that they can be individually built and deployed as a patch, but that
> the whole project can also be built in one javac.

Ideally, I'd like the package hierarchy to be mirrored exactly in the
directory structure, then have one javac compile all my .java files
into class files.  This means that anything that's out of date gets
rebuilt, and anything that's not isn't.  No build artifact (e.g.: class file)
should be forced back into the source tree.    Then, depending on
how I want things packaged up, jar files get created.   There are many
ways this last bit could be done.   We could have a set of explicit
package-to-jar rules,  and/or do jars-of-jars as we see fit via One-JAR
(again keeping these the heck away from the source tree).  One nice
thing about One-JAR is that we could have neater-looking installs
yet at the same time, clear partitions within the One-JAR if it came
time to do patches in the field.
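
A set of explicit package-to-jar rules could be as simple as a table of package prefixes where the longest matching prefix wins. The Java sketch below shows that lookup only; the JarRules class, the org.example package names, and the jar names are all hypothetical and are not taken from the Alfresco source tree.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: decides which jar a compiled class belongs in,
// based on package-prefix rules where the most specific prefix wins.
class JarRules {

    // Returns the jar for the longest matching package prefix,
    // or defaultJar when no rule applies.
    static String jarFor(String className, Map<String, String> rules, String defaultJar) {
        String best = null;
        for (String prefix : rules.keySet()) {
            // Require a trailing dot so "org.example.repo" does not match "org.example.repository".
            boolean matches = className.startsWith(prefix + ".");
            if (matches && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        return best == null ? defaultJar : rules.get(best);
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("org.example.repo", "repo.jar");
        rules.put("org.example.repo.security", "repo-security.jar");

        System.out.println(jarFor("org.example.repo.security.Acl", rules, "misc.jar")); // repo-security.jar
        System.out.println(jarFor("org.example.repo.Node", rules, "misc.jar"));         // repo.jar
        System.out.println(jarFor("org.example.web.Page", rules, "misc.jar"));          // misc.jar
    }
}
```

Because the rules only look at class names, they are independent of how many javac invocations produced the classes, which is the separation between compiling and packaging being argued for here.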

Different projects seem useful when you're integrating work across
different organization units, such as when you've got 3rd party plugins.
The benefit there is that if the 3rd party hands you something that
won't compile, the core system can forge ahead.  However, within an
organization, something that breaks the build can be rolled back
because you've got SVN history right there.   Therefore, if it's
in the same SVN branch, crippling javac's ability to work out the
dependencies for you by busting things up into different projects
represents a substantial maintenance cost for no gain. 

Hence, a single project with more flexibility on the backend
when it comes to packaging (jar/war creation & naming)
seems better to me than N projects hand-cobbled together
with hardcoded glue.

Sleepy now…

- Jon

stk137
Champ in-the-making
One thing I'd like to gain out of Alfresco moving to Maven would be a better SDK.  There could be some custom web client archetype to get started, and unlike the current SDK this should be a proper JEE-Web project in my IDE that's aware of all the JSF facets, tags, and so on.  As it is now, I open something like SDK CustomJSP and Eclipse (jee-europa) doesn't know it's a web project, and there's a bunch of errors for all the unknown tags.

The other benefit to replacing the current Eclipse-based SDK with Maven
and archetypes is that it'd work in Eclipse (and derivatives like Red Hat Developer Studio), NetBeans, and IntelliJ as a JEE-Web-JSF project.
(Last I checked, importing an Eclipse Web project in NetBeans didn't work, only basic Java projects.)

I also thought I'd mention this:
http://ancientprogramming.blogspot.com/2007/08/badly-packaged-project-will-break.html
upshot: create one distro rather than two like Spring and choose your group name carefully and don't change it.