Thursday, July 26, 2007

What's Gilad Up To?

The legendary "computational theologist" Gilad Bracha is on the verge of another divine revelation.

Gilad, as many know, was for years one of the gurus of Java language design, delicately dancing on the infinitely thin edge between semantic integrity and linguistic flexibility. Until, that is, he got tired of continually having to choose between poor alternatives and quit Sun last year. Since then his new blog, the cheerily named Room 101 (one of my must-reads), has been putting forth all kinds of evidence about his new venture.

It's clear that it's a startup, and that they're working on a new language. But what kind of language? And for what purpose?

Start with Gilad's previously-mentioned-here paper on Objects As Software Services. This paper discusses the possibilities of downloadable software objects -- with behavior and state -- that can provide offline functionality, transparent synchronization to the server when reconnected, and dynamic upgrade. He goes into some detail on the language properties these objects would benefit from -- specifically, reflective tracking of field accesses (via a mirror-based reflective mechanism), and a typing semantics that is flexible enough to allow code upgrade without entering classloader hell.
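To make the "tracking of field accesses" idea a bit more concrete, here is a rough stand-in in plain Java. This is my own sketch, not anything from Gilad's paper: Java has no mirror-based reflection in his sense, so I'm approximating it with a dynamic proxy that records every accessor call. The names (AccessLogDemo, Account, LocalAccount, tracked) are all invented for the example, and a real system would presumably log far richer information than method names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class AccessLogDemo {

    /** Hypothetical stateful object, seen only through its accessors. */
    public interface Account {
        int getBalance();
        void setBalance(int balance);
    }

    /** Plain local implementation holding the actual field. */
    static class LocalAccount implements Account {
        private int balance;
        public int getBalance() { return balance; }
        public void setBalance(int balance) { this.balance = balance; }
    }

    /**
     * Wraps the target so that every accessor call is appended to the log
     * before being forwarded. The log stands in for the record a
     * synchronization layer could replay against the server later.
     */
    static Account tracked(Account target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());          // record the access
            return method.invoke(target, args); // then perform it
        };
        return (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(),
                new Class<?>[] { Account.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Account account = tracked(new LocalAccount(), log);
        account.setBalance(42);
        System.out.println(account.getBalance()); // 42
        System.out.println(log);                  // [setBalance, getBalance]
    }
}
```

Note what the proxy can't do: it only sees calls that go through the interface, whereas the paper is after tracking at the level of the fields themselves. That gap is part of why Gilad wants this in the language rather than bolted on.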

In more recent posts on Room 101, we've seen other pronouncements of what good languages look like. For one, there's his excellent recent post on why constructors are considered harmful: they are far too tightly coupled to the specific class in which they're declared. Most of the time you want object creation to be a matter of policy, but constructors are pure mechanism -- all you get is an instance of exactly the class you name. That is inadequate in the majority of cases, where what you want is an object that fulfills some interface, with the framework picking the appropriate class, since it knows more about the classes involved than you do. The canonical solution here is factory methods, but Gilad considers most of the factory method idioms in, e.g., Java to have many of the same problems -- you can't override a static factory method, and you can't extend or replace it. He suggests instead a mechanism for encapsulating object-creation methods within a parameterized class definition, allowing full virtualization of object creation without the overly tight coupling to the class itself. So Gilad's seeking a language that has the structural purity of Smalltalk, with some of the warts removed.
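Here's a small Java sketch of the distinction, as I read it. It is my illustration, not code from Gilad's post, and every name in it (Channel, Channels, ChannelFactory and so on) is made up. The static factory is resolved at compile time, so a subclass can at best hide it; the instance-method factory can be overridden like any other method, which gets you at least part of the way toward creation-as-policy.

```java
class VirtualFactoryDemo {

    interface Channel {
        String send(String message);
    }

    static class PlainChannel implements Channel {
        public String send(String message) { return "plain:" + message; }
    }

    static class EncryptedChannel implements Channel {
        public String send(String message) { return "encrypted:" + message; }
    }

    // Mechanism: a static factory is bound when the caller is compiled.
    // Subclasses cannot override it, so Channels.open() always yields
    // a PlainChannel.
    static class Channels {
        static Channel open() { return new PlainChannel(); }
    }

    // Policy: creation is an ordinary instance method, so a subclass can
    // substitute a different implementation without touching the callers.
    static class ChannelFactory {
        Channel open() { return new PlainChannel(); }
    }

    static class SecureChannelFactory extends ChannelFactory {
        @Override
        Channel open() { return new EncryptedChannel(); }
    }

    /** Client code: it never names a concrete Channel class. */
    static String greet(ChannelFactory factory) {
        return factory.open().send("hi");
    }

    public static void main(String[] args) {
        System.out.println(Channels.open().send("hi"));        // plain:hi
        System.out.println(greet(new ChannelFactory()));       // plain:hi
        System.out.println(greet(new SecureChannelFactory())); // encrypted:hi
    }
}
```

Of course, the sketch still has to say `new SecureChannelFactory()` somewhere, which is exactly the kind of residual coupling a language-level mechanism could remove.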

There's also his post on why tuples are good, which is relatively straightforward as Gilad's posts go. Again, tuples give you a single language mechanism that covers a variety of important uses. We're seeing a real focus here on The Right Thing. It must be a blast for Gilad to be working with a clean sheet of paper after a decade in the immensely constrained Java space.

Finally, there's his post on why message passing is good. This will be no surprise to anyone who's enamored of message passing languages, such as Erlang and (at least at the distributed level) E. Pure message passing gives you a degree of isolation and security that more tightly coupled languages can't match. It also has a very real performance cost -- we don't yet know how to build an optimizer that can make a message-passing loop as efficient as an inlined C++ function-call loop -- but Gilad's main concerns, as we've seen, are linguistic flexibility and expressiveness for use cases including distributed programming and dynamically upgradeable software objects, not achieving the maximum possible performance for typical modern architectures (which has always been one of Bjarne's key goals).

Also, Gilad recruited Peter Ahé away from Sun, and Peter's spilling some beans of his own.

So putting all this together, what do we have? Gilad is working on a new language that:
  • is a pure message-passing language (a la Self)
  • refines Smalltalk's metaclass structure to allow parameterized factories and basic constructors to be implemented as virtualizable instance methods
  • supports a flexible enough typing structure to enable dynamic upgrade of the code of stateful objects
  • has a mirror-based reflective mechanism suitable to allow, among other things, logging of all object field accesses for subsequent orthogonal synchronization
  • has tuples :-)
It's going to be very interesting to play with this language of his once it's done. And it even looks like he's hiring, so things may be pretty close to public visibility! If nothing else, his new venture is going to provide a lot of food for thought. New languages always have a hard time achieving critical mass, but with Gilad's sense of history and design skills behind it, this new one has a much better than average chance.

Offlining Soon (Temporarily)

My wife is within days of delivering our second child. We've already got a two-and-a-half year old girl, Sophie, who's a little bundle of shining happiness -- seriously, this kid is one of the cheeriest tots you're likely to meet, and my wife and I are totally smitten with her. Well, now her baby brother Matthew is on the way, and it's happening Real Soon Now.

So at some point in the not-very-distant-at-all future, I'm going to be going offline in a big way, possibly for up to a month. Do not fret. I shall return in force, once we've gotten the hang of juggling TWO little ones in the house, and once we've made darn sure that Sophie feels OK with everything.

Thursday, July 12, 2007

Classloader Hell

I really like Java. It's definitely my favorite language, and the one I've written the most code in, by far. Its balance of static typing, great libraries, OS independence, "no surprises" philosophy, and unsurpassed tool support mean that I'm more productive in Java than in any other language I've tried.

But that's not to say it's perfect. There can never be an objectively perfect language, since there are too many kinds of software that need to be written, and all languages have expressive tradeoffs that affect their usefulness for those purposes. But even on its own merits, Java's got some major issues.

The biggest issue is the infamously named "classloader hell". All reasonably experienced Java programmers know it bitterly well: the difficulties of running code within application servers or other kinds of complex deployment situations, where there are multiple sub-projects all loading code at runtime and attempting to operate with each other. (Briefly, for those who don't know, Java code is loaded by objects named "classloaders"; each classloader defines a namespace of loaded classes, and a single class, if loaded by two different classloaders, is considered to be two distinct -- and non-interoperable -- classes.)
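For anyone who hasn't been bitten yet, here's a minimal sketch of the effect. The names (ClassLoaderIdentityDemo, Greeter, IsolatingLoader, loadCopy) are mine, invented for the demo. It also assumes the compiled .class file can be read back as a resource, which holds for an ordinary javac-then-java run but may not in every environment. The loader defines its own copy of one class from the same bytes and delegates everything else to its parent as usual.

```java
import java.io.IOException;
import java.io.InputStream;

class ClassLoaderIdentityDemo {

    /** A trivial class that we will load a second time. */
    public static class Greeter {
    }

    /** Defines one named class itself; delegates all others to the parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String targetName;
        private final byte[] bytes;

        IsolatingLoader(String targetName, byte[] bytes, ClassLoader parent) {
            super(parent);
            this.targetName = targetName;
            this.bytes = bytes;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(targetName)) {
                return super.loadClass(name, resolve); // normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Reads the compiled bytes of a class from wherever it was loaded. */
    static byte[] bytecodeOf(Class<?> type) throws IOException {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            return in.readAllBytes();
        }
    }

    /** Loads a second copy of the given class through a fresh classloader. */
    static Class<?> loadCopy(Class<?> type) {
        try {
            ClassLoader loader = new IsolatingLoader(
                    type.getName(), bytecodeOf(type), type.getClassLoader());
            return loader.loadClass(type.getName());
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not reload " + type.getName(), e);
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> copy = loadCopy(Greeter.class);
        Object stranger = copy.getDeclaredConstructor().newInstance();
        System.out.println(copy.getName().equals(Greeter.class.getName())); // true
        System.out.println(copy == Greeter.class);                          // false
        System.out.println(stranger instanceof Greeter);                    // false
    }
}
```

Same name, same bytes, and yet a cast from one to the other would throw ClassCastException. Now imagine that happening three layers deep inside an application server.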

This is one area where Java's original standard was written in haste. The model of classloaders delegating to each other, the way that classes are considered to be different if loaded from different classloaders, and the many problems this can cause -- all are fundamental problems that manifest in a variety of confusing and confounding ways.

The problem gets much worse when you get into code interoperating between multiple systems on a network, each of which may have subtly different versions of the system's code. The entire Remote Method Invocation standard was predicated on the assumption that Java's ability to load code over the network would enable code to travel alongside the network messages that reference that code. It feels to me like a case of "wow, Java lets code run anywhere! Just think of what that lets you do for a networking protocol!" The problem was that the versioning issues weren't well understood, and that has caused bitter pain in actual usage.

It's not like Sun isn't aware of the issues. This technical report covers the basics: it's possible to set up RMI scenarios where perfectly innocent messaging patterns result in systems being unable to send messages to each other at all. Worse, the runtime behavior is difficult to analyze or understand.

These problems aren't merely abstract. Another Sun paper discusses an entire advanced networking system that was well into development, based on Jini and RMI, when these problems reared their ugly heads and fundamentally destabilized the entire project. The ultimate reason for abandoning the project was its business infeasibility, but it's clear that Sun's own technologies -- as currently implemented -- were deeply broken.

Now, security, versioning, and interoperability are notoriously hard problems, and it's far from clear that anyone has really great solutions to them. Certainly Sun can't be too deeply faulted for not getting it right -- Java did get many other things right (including OS independence, dynamic compilation, dynamic inlining, garbage collection, and much more). So I don't mean to pick on Java too much; I just mean to highlight some of the biggest outstanding issues with the language.

There's work going on to fix at least some of this: for example, JSR 277, the Java Module System, is intended to bring at least some rigor to the entire classloading system. The .NET framework has a better grasp of module loading issues, though I'm not aware of how .NET messaging compares to RMI or whether it's exposed to any of the same issues. The Fortress language is taking a much more rigorous approach to component versioning and component assembly as well. But most of these seem dedicated to fixing the module loading problems in a single environment; I don't know of much work on addressing the security and versioning issues associated with network messaging (though JSR 277 does address some of the naming issues that break RMI and Jini).

Still, that's the great thing about software: there's always more that needs fixing, and more great things to build! Expect many more posts on this general topic....

Every Two Weeks

[Crossposted to my other blog.]

Blogging's still new to me. The only thing I really know about it so far is that while there's an infinity of posts to make, there's only finite time to make them. And that time is in high demand, mainly from family, but also from all the other great projects that aren't blog-writing.

So I have to set some kind of deadline for myself. And that is: two weeks. I must, and will, post in both my blogs at least every two weeks.

I'm setting that deadline because I notice my reaction when going to other blogs: if I see the latest post is a month or more old, it's a sign that the person's really way too busy to blog, and that the blog's an afterthought to them. It also leaves me with no idea when the next post might appear.

But if the last post was a week or so ago, that's still timely. Two weeks is pushing it, but based on my experience so far, it's the most I can commit myself to.

So, that's my promise to you: I'll update both my blogs at least every two weeks. Feel free to flame me if I don't!

...Except for the next month, because we're expecting my wife to go into labor sometime in the next three weeks, which means all bets are off :-)