Hi Vaibhav -
I think the dream of 50 million objects is not an illusion. Our target is ultimately to match Documentum's billion-object mark. But I don't think that can be done without a database, even if it isn't Oracle.
Databases give us two things that are of ultimate benefit: transaction control, and separation of the logical and physical schemas. They also give us fast access on various properties through indexing and caching techniques that we then don't need to build into the system ourselves. Partitioning metadata across many machines and table partitions is another example of something that is much easier on top of a relational database.
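To make two of those points concrete (transaction control and indexed access on a property), here is a minimal sketch. It uses Python's built-in sqlite3 module purely for illustration, not our actual Hibernate setup, and the table `doc`, its columns, and the function names are all made up for this example.

```python
import sqlite3


def build_store():
    """Create a throwaway in-memory metadata table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " owner TEXT NOT NULL)"
    )
    # The database maintains this index for us; lookups by owner can use it.
    conn.execute("CREATE INDEX idx_doc_owner ON doc (owner)")
    conn.commit()
    return conn


def add_documents(conn, rows):
    """Insert a batch of (name, owner) rows as one all-or-nothing unit."""
    try:
        # 'with conn' commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.executemany(
                "INSERT INTO doc (name, owner) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.Error:
        return False


def count_by_owner(conn, owner):
    """Count documents for one owner, a property lookup the index can serve."""
    cur = conn.execute("SELECT COUNT(*) FROM doc WHERE owner = ?", (owner,))
    return cur.fetchone()[0]
```

If a batch contains one bad row (say, a missing name), the whole batch is rolled back and none of its rows are stored, which is exactly the behaviour we would otherwise have to write and test ourselves.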
When I discussed with Doug Cutting, the author of Lucene, the type of information we want to manage and how we want to use it, even he said that some things are best left to a database. Using a database lets us use tools designed for databases, such as replication tools and query tools. Relational languages have evolved to perform many types of aggregations, joins, and other operations that would be very hard to do without a database. We don't need to build these features into the system; we can just integrate them.
The fact that we use Hibernate actually gives us a lot of choice in how we store our metadata. At the moment we have struck a good balance between flexibility in what metadata is stored and the ability to access that information. Time and experience with applications will tell us whether we have the balance right. The pendulum has already swung further toward your vision than where we were with Documentum. We already store a lot of information in a serialized, though not necessarily XML, form.
I gather from your vision that you would like to store many things, very quickly. How quickly do you want to get them back out, and for how many people? The balance really depends on the type of applications you are building.
What applications have you developed or are you considering developing? What types of information should be stored in XML vs. retrievable through a relational interface? I believe that it is possible to scale to the levels you describe using the current tools at hand, but it is hard to tell without specifics. It would be good to start a conversation on this.
-john