In January of 2003, we published an interview with virtual reality pioneer Jaron Lanier (http://java.sun.com/features/2003/01/lanier_qa1.html) that raised some basic questions about programming: Is there something fundamentally misguided about the way we write programs today? Why is it so difficult, if not impossible, to write bug-free programs that contain more than 20 to 30 million lines of code? Do we need a radical new paradigm shift in programming? If so, what might it look like? The interview provoked a strong response, both inside and outside of Sun Microsystems.
One response came from Sun's Victoria Livschitz, a senior IT architect and Java Evangelist who has an interesting history. Livschitz grew up in Lithuania, where she was the women's chess champion and a National Chess Master in 1988 -- the same year in which she won the prestigious Russian national junior mathematical competition. She studied applied mathematics at Kharkov University in Ukraine before coming to the US, where she subsequently received a degree in Computer Science from Case Western Reserve University. After a four-year stint at the Ford Motor Company, she came to Sun in 1997, where she has served as principal architect on several high-profile eCommerce and EAI projects, while managing all aspects of Sun's technical presence at General Motors. In 2001, she was named System Engineer of the Year for the company's Central Area, and won the Trusted Advisor Award at Sun. In addition, she is a founding member of the World Wide Institute of Software Architects.
We met with her recently to talk about programming, chess, and other challenging matters.
Before we get in to the details of programming, tell us about your background.
I come from several generations of mathematicians. My father, for example, is a world-renowned expert on some areas of functional analysis. I began my undergraduate studies in applied mathematics, but quickly gravitated towards discrete disciplines and programming; in the end, computer science turned out to be a much better fit.
Playing Chess and Creating Software
I'm intrigued by the fact that you were the women's chess champion in Lithuania and achieved the rank of master. I'm wondering if you see any parallels between the challenges of being creative in the use of logic in chess, and being creative in the use of logic in programming.
Chess, like any other complex intellectual activity, involves a combination of knowledge, creative vision, and technique. Good chess players are able to acquire and store a lot of information -- classic and trendy opening systems, standard endgame positions, novelties specific to their repertoire, and so forth. The creative vision of a master comes through his or her ability to discover patterns hidden in a position, correctly interpret the competing (and often incomplete) patterns as more or less relevant to the goal, and finally choose the best option. It also takes flawless technique to see a well-played game through to its logical conclusion, in the face of the creative resistance put up by a skilled opponent. This combination of knowledge, creativity, and technique is what has attracted people to chess for some fifteen hundred years.
The parallel between chess and programming is rather obvious. Programming is also about knowledge, creativity, and technique. Good programmers must have a vast body of knowledge at their fingertips: the programming syntax of one or more languages, standard and special-purpose data structures, typical (as well as advanced) coding techniques, many kinds of libraries and APIs, a multitude of design patterns, and so on. Good programmers use their creative vision to recognize many patterns that may be relevant to the solution of the specific design problem at hand, and correctly choose the best approach. Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless.
It's not surprising that so many people at Sun like chess. I am actually hoping to organize an internal chess championship sometime this year. Seems like that would be a lot of fun.
Of course, chess is also a sport, one that demands of a champion unique talents and qualities, such as willpower, stamina, the ability to take risks, and so forth; in that sense, competitive chess is really more similar to football than to programming. Then again, coding marathons to meet deadlines are very much like chess tournaments, don't you think?
What do you see as the big problems in writing software?
When I first became a developer on large, real-world projects at Ford, as part of an elite development group, I was shocked by the deficiencies of the software engineering process at large, and the subject has fascinated me ever since. It is widely known that few significant development projects, if any, finish successfully, on time, and within budget. At best, it takes at least one full release cycle to work out the major bugs. Many projects die quietly in development, crushed by costs, changing requirements, lack of communication between the various teams involved, and who knows what else.
And here's what's really sad -- the overwhelming majority of so-called "successful" development projects produce mediocre software. Take almost any corporate accounting application, and you'll find it poor in quality, unimpressive in capabilities, difficult to extend, misaligned with other enterprise systems, technologically obsolete by the time of release, and functionally identical to dozens of other accounting systems. Hundreds of thousands of dollars are spent on development, and millions afterwards on maintenance -- and for what? From an engineering standpoint, zero innovation and zero incremental value have been produced.
Jaron Lanier has argued that we cannot write big programs with a lot of code without creating many bugs, which he concludes is a sign that something is fundamentally wrong.
I agree with Jaron's thesis completely. The correlation between the size of a piece of software and its quality is overwhelming and very suggestive. I think his observations raise numerous questions: Why are big programs so buggy? And not just buggy, but buggy to a point beyond salvation. Is there an inherent complexity factor that makes bugs grow exponentially in number, in severity, and in how difficult they are to diagnose? If so, how do we define complexity and deal with it?
Jaron's emphasis on "pattern recognition" as a substitute for the rigid, error-prone, binary "match/no match" constructs that are dominant in today's programs is intriguing to me, especially because I've always thought that the principles of fuzzy logic should be exploited far more widely in software engineering. Still, my quest for the answer to Jaron's question seems to yield ideas orthogonal to his own.
I can see two reasonable ways to create complex programs that are less susceptible to bugs. As in medicine, there is prevention and there is recovery. Both the objectives and the means involved in prevention and recovery are so different that they should be considered separately.
Preventive measures attempt to ensure that bugs are not possible in the first place. A lot of progress has been made along these lines in the last twenty years. Programming practices such as strong typing, which allows compile-time checking of assignment safety; garbage collection, which manages memory automatically; and exception mechanisms, which trap and propagate errors in a traceable and recoverable manner, do make programming safer. The Java language, of course, exemplifies the modern general-purpose programming language with first-class systemic safety qualities. It's a huge improvement over its predecessor, C++. Much can also be said for the visual development tools that simplify and automate the more mundane and error-prone aspects of programming.
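These preventive safeguards can be illustrated with a minimal, hypothetical Java sketch (the class and method names here are illustrative, not from the interview): the type system rejects unsafe assignments at compile time, and the exception mechanism propagates errors to a point where they can be trapped and recovered from in a traceable way.

```java
// A sketch of two preventive safeguards: strong typing and exceptions.
public class SafetyDemo {

    // Strong typing: the result can only be an int; a bad string or an
    // out-of-range value raises an exception instead of silently corrupting state.
    public static int parsePort(String text) {
        int port = Integer.parseInt(text); // throws NumberFormatException on bad input
        if (port < 0 || port > 65535) {
            throw new NumberFormatException("port out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        System.out.println(parsePort("8080")); // prints 8080
        try {
            parsePort("not-a-number");         // the error is trapped, not ignored
        } catch (NumberFormatException e) {
            System.out.println("recovered: " + e.getMessage());
        }
    }
}
```

The point is not the specific check but that the failure path is explicit and traceable, in contrast to, say, a C function returning an unchecked error code.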
Having said that, these technological advances are still inadequate against many categories of bugs. You see, a "bug" is often just the recognition that a program is behaving undesirably. Such undesirability may indeed be caused by mechanical problems in which code does something different from what it was intended to do. But all too often the code is doing exactly what the programmer wanted at the time, which in the end turned out to be a really bad idea. The former is a programming bug; the latter is a design bug or, in some exceptionally lethal cases, an architectural bug. The constant security-related problems associated with Microsoft's products stem from fundamental platform architecture. Java technology, in contrast, enjoys exceptional immunity to viruses because of its sandbox architecture.
I don't believe that future advances in software engineering will prevent developers from making mistakes that lead to design bugs. Over time, any successful software evolves to address new requirements. A piece of code that behaved appropriately in previous versions suddenly turns out to have deficiencies -- or bugs. That's OK! The reality of the program domain has changed, so the program must change too. A bug is simply a manifestation of the newly discovered misalignment. It must be expected to happen, really! From that vantage point, it's not the prevention of bugs but the recovery -- the ability to gracefully exterminate them -- that counts.
In regard to recovery, I can't think of a recent technological breakthrough. Polymorphism and inheritance help developers write new classes without affecting the rest of the program. However, most bug fixes require some degree of refactoring, which is always dangerous and unpredictable.
Fighting Software Complexity
What about the notion of complexity as the primary reason for software bugs? Do you have any concrete ideas on how to reduce complexity?
Well, I see two principal weapons. One is the intuitiveness of the programming experience from the developer's point of view. Another is the ability to decompose the whole into smaller units and aggregate individual units into a whole. Let me start with the programming experience first.
Things appear simple to us when we can operate intuitively, at a level of consciousness well below that of fully focused, concentrated, strenuous thinking. Thus, the opposite of complexity -- and the best weapon against it -- is intuitiveness. Software engineering should flow from the intuitiveness of the programming experience. A programmer who works comfortably with complex programs does not see them as complex, thanks to the way our perception and cognition work. A forest is a complex ecosystem, but to the average hiker the woods do not appear complex.
How well do you think modern programming languages, particularly the Java language, have been able to help developers hide complexity?
Unfortunately, I believe modern computer science and software engineering have failed to make significant advances there. The syntax of all mainstream programming languages is rather esoteric. Mathematicians, who feel comfortable with purely abstract syntax, spend years of intense study mastering that skill. But unlike mathematicians, programmers are taught to think not in terms of absolute proof but in terms of working metaphors. To understand how a system works, a programmer doesn't build a system of mathematical equations; he or she comes up with a real-life metaphor whose correctness can be "felt" as a human being. Programmers are "average" folks; they have to be, since programming is a profession of millions of people, many without college degrees. Esoteric software doesn't scale to millions -- not in people, and not in lines of code.
Now back to your question. For a long time, programmers have been manipulating subroutines, functions, data structures, loops, and other totally abstract constructs that neglect -- no, numb -- human intuition. Then object-oriented programming took off. Developers could, for the first time, create programming constructs that resembled elements of the real world -- in name, characteristics, and relationships to other objects. Even a non-programmer understands, at a basic level, the concept of a "Bank Account" object. The power of intuitively understanding the meaning and relationship between things is the proverbial silver bullet, if there is one, in the war against complexity.
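The "Bank Account" metaphor can be made concrete with a minimal, hypothetical Java sketch (the class and method names are illustrative): even a non-programmer can guess what the names mean, because they mirror the real-world concept directly.

```java
// A "Bank Account" object: the names, state, and operations all map
// directly onto the everyday metaphor, which is what makes it intuitive.
public class BankAccount {
    private long balanceInCents; // the state the object encapsulates

    public void deposit(long cents) {
        balanceInCents += cents;
    }

    public void withdraw(long cents) {
        if (cents > balanceInCents) {
            throw new IllegalStateException("insufficient funds");
        }
        balanceInCents -= cents;
    }

    public long balance() {
        return balanceInCents;
    }

    public static void main(String[] args) {
        BankAccount account = new BankAccount();
        account.deposit(1000);
        account.withdraw(300);
        System.out.println(account.balance()); // prints 700
    }
}
```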
Object-oriented programming allowed developers to create industrial software that is far more complex than what procedural programming allowed. However, we seem to have reached the point where OO is no longer effective. No one can comfortably negotiate a system with thousands of classes. So, unfortunately, object-oriented programming has a fundamental flaw, ironically related to its main strength.
In object-oriented systems, "object" is the one and only basic abstraction. The universe always gets reduced to a set of pre-defined object classes, some of which are structural supersets of others. The simplicity of this model is both its blessing and its curse. Einstein once noted that an explanation should be as simple as possible, but no simpler. This is a remarkably subtle point that is often overlooked. Explaining the world through a collection of objects is just too simple! The world is richer than what can be expressed with object-oriented syntax.
Consider a few common concepts that people universally use to understand and describe all systems -- concepts that do not fit the object mold. The "before/after" paradigm, as well as that of "cause/effect," and the notion of the "state of the system" are among the most vivid examples. Indeed, the process of "brewing coffee," or "assembling a vehicle," or "landing a rover on Mars" cannot be decomposed into simple objects. Yes, they are treated that way in OO languages, but that's contrived and counter-intuitive. The sequence of the routine itself -- what comes before what, under what conditions, based on what causality -- simply has no meaningful representation in OO, because OO has no concept of sequencing, or state, or cause.
Processes are extremely common in the real world and in programming. Elaborate mechanisms have been devised over the years to handle transactions, workflow, orchestration, threads, protocols, and other inherently "procedural" concepts. Those mechanisms breed complexity as they try to compensate for the inherent time-invariant deficiency in OO programming. Instead, the problem should be addressed at the root by allowing process-specific constructs, such as "before/after," "cause/effect," and, perhaps, "system state" to be a core part of the language.
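The contrivance being described can be seen in a small, hypothetical Java sketch of the "brewing coffee" process (the names are illustrative): the before/after ordering and the causal rules are not language constructs here; they are hand-coded data that the compiler cannot reason about.

```java
// "Brewing coffee" forced into objects: the sequencing knowledge lives
// in an ordinary method body, invisible to the type system -- nothing
// stops a maintainer from reordering the steps incorrectly.
public class CoffeeProcess {
    enum Step { GRIND, BOIL, BREW, POUR, DONE }

    private Step current = Step.GRIND;

    // The "what comes after what" rules, hand-encoded as a switch.
    public void advance() {
        switch (current) {
            case GRIND: current = Step.BOIL; break;
            case BOIL:  current = Step.BREW; break;
            case BREW:  current = Step.POUR; break;
            case POUR:  current = Step.DONE; break;
            case DONE:  throw new IllegalStateException("process finished");
        }
    }

    public Step current() {
        return current;
    }

    public static void main(String[] args) {
        CoffeeProcess p = new CoffeeProcess();
        while (p.current() != Step.DONE) {
            p.advance();
        }
        System.out.println(p.current()); // prints DONE
    }
}
```

A language with first-class process constructs could, in principle, check the ordering and causality declaratively rather than leaving them buried in a switch statement.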
I envision a programming language that is a notch richer than OO. It would be based on a small number of primitive concepts, intuitively obvious to any mature human being and tied to well-understood metaphors, such as objects, conditions, and processes. I hope to preserve the many features of object-oriented systems that made them so safe and convenient, such as abstract typing, polymorphism, encapsulation, and so on. The work so far has been promising.
So, your basic thesis is that programming constructs should be more intuitive to developers, and more closely simulate and resemble the real world. That would enable developers to write software with fewer bugs, right?
Exactly. In the fight against complexity, an attempt to engage the programmer's intuition and subliminal perception is the best strategy, yet it is terribly neglected. If you want proof of just how important metaphors are to the simplification of the programming experience, look at what visual development tools like Visual Basic have done to demystify the previously "obscure art of programming" and attract millions of new people to programming. Rows and columns, cells, list boxes, and push buttons are modeled after simple, intuitive metaphors -- that's why it takes a well-rounded person with no previous programming skills only about a week to master plain Visual Basic!
But expanding the pure object-oriented paradigm to allow for a richer set of basic abstractions -- like processes and conditions -- is only half of the arsenal in the war on complexity. The other half is a powerful aggregation/decomposition model; what modern programming offers there is rather weak, convoluted, and fragmented. In order to deal with complexity, the organization of the software elements is of utmost importance.
Hierarchies and collections are pretty much the only tools we've got to define how things relate to each other and how they should be organized into manageable structures. Hierarchical aggregation fits well with the fractal nature of many organic and artificial systems, and it is intuitively obvious to most people. Plus, the depth of the aggregation grows only linearly while the number of elements grows exponentially, which is hugely important. Collections are similarly plentiful in the natural and virtual worlds, fit well with peer-to-peer systems, and, once again, are totally intuitive. Unfortunately, this wonderfully simple division of structures into hierarchies and collections is, again, too simple for our needs.
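The scaling property of hierarchies can be checked with a tiny, hypothetical Java sketch: in a balanced hierarchy, each additional level doubles the number of elements it can hold, so depth grows linearly while capacity grows exponentially.

```java
// Depth of a balanced hierarchy: each extra level doubles capacity,
// so a million elements need only about 20 levels.
public class HierarchyDepth {

    // Smallest depth d such that 2^d >= elements.
    public static int depthFor(long elements) {
        int depth = 0;
        long capacity = 1;
        while (capacity < elements) {
            capacity *= 2;
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(depthFor(1_000_000)); // prints 20
        System.out.println(depthFor(1_000_000_000L)); // prints 30
    }
}
```

This is why deep aggregation stays navigable for humans even as systems grow enormous: the number of levels a reader must hold in mind grows far more slowly than the number of parts.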
There is a plethora of other relationships that also don't fit very neatly. Master/slave, many-to-many, component/container, interval, element/metadata, and so on, are just a few common ones that we deal with every day. We treat each structural relationship differently every time. Theoretically, the object is the one and only "unit of software" in object-oriented systems, but is that really true? We have explicit distinctions between classes, packages, resource files and application bundles, containers and components, classes and interfaces, applications and services, and so on. Each new technology introduces new concepts. Inside the source code, we've got "Is-A" and "Has-A" as two alternative mechanisms to create new software components out of existing ones. Still, all these things combined cannot express the simplest aggregation of several elements with particular semantic relationships; therefore, an external graphical "design pattern" is needed to document which elements are aggregated and how the collective system works.
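The "Is-A" and "Has-A" mechanisms mentioned above can be sketched in a few lines of hypothetical Java (the class names are illustrative): inheritance expresses that one thing is a kind of another, while containment expresses that one thing is built out of another, and everything richer must be improvised on top of these two.

```java
// Is-A versus Has-A: the two built-in ways to compose software
// components out of existing ones in an OO language.
class Engine {
    int horsepower() {
        return 150;
    }
}

class Vehicle {
    String describe() {
        return "a vehicle";
    }
}

// Is-A: a Car *is a* Vehicle, inheriting its interface.
class Car extends Vehicle {
    // Has-A: a Car *has an* Engine and delegates to it.
    private final Engine engine = new Engine();

    int power() {
        return engine.horsepower();
    }

    @Override
    String describe() {
        return "a car with " + power() + " hp";
    }

    public static void main(String[] args) {
        Vehicle v = new Car(); // an Is-A relationship permits substitution
        System.out.println(v.describe()); // prints "a car with 150 hp"
    }
}
```

Note that neither mechanism can express, say, a master/slave or element/metadata relationship directly; those must be documented outside the code, which is exactly the gap being described.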
Talk about the complexity and counter-intuitiveness of programming! What seems to be missing is a unified component architecture rich enough to cover the whole spectrum of needs, from distribution to reuse. I am convinced that it isn't that hard to do. First, a notion of a "component" as a fully autonomous element of software must be strictly defined. An object, to be sure, is not a component, although many components may be implemented with objects. Then the rules of relation, composition, and aggregation of sub-components into higher-level components will be defined, in fully codifiable form. Familiar "Is-A" and "Has-A" relationships will be present, among many others. Finally, the rules of derivation will be defined and codified to enable a comprehensive reuse framework. Inheritance, for example, will be only one form of derivation made possible under the new model.
Equipped with such a powerful component architecture, a new theory of reuse may be developed, this time addressing the entire software lifecycle over a project's lifetime in a graceful, truly evolutionary way. Refactoring will no longer be a brutal, destructive operation. Instead, a safe, almost organic rejuvenation of the old components by the new ones -- guaranteed at compile time to be semantically, as well as syntactically, correct -- will become possible, analogous to the cyclical rejuvenation found in every corner of nature.
Software is a truly amazing medium, unlike anything else found in nature or created by humankind. Like information in general, software is not an entirely physical substance, for it has no mass, volume, or density. Neither is it an entirely metaphysical concept, for it interacts with real, physical entities and causes very concrete physical effects, such as the rotation of a turbine, the flow of electricity, or the imprint of an image on a page.
Software is a product of our imagination, like a book, a painting, or a movie, designed to synthesize a particular representation of the real world. But unlike all other forms of pure art, software is constructed for utilitarian purposes, to do more than merely reflect the real world; software interacts with the world and in many cases even controls it. And what is truly amazing -- software is replicable: instantaneously, in arbitrary numbers, at zero cost!
I believe there has to be a better way to harness the power of software media than what we came up with in the last millennium.
Advice to Developers
Do you have any concrete advice for Java developers? And are you optimistic about the direction software is headed?
I recall that you asked Jaron a similar question. My advice to developers will echo his sentiments. Don't take everything you've been told about good software engineering as gospel truth. Don't be bamboozled. Maintain your sense of skepticism and look for more intuitive metaphors.
As far as optimism about the future, I see a lot of interesting work around the presentation of data to end users. Sun's Project Looking Glass is a good example of the innovative thinking and good use of intuitive metaphors that make interactions with complex multimedia information effortless. Apple and Microsoft seem to be working on similarly interesting technologies. Sadly, none of that energy is going into basic research and the development of fundamentally innovative general-purpose programming languages. The complacency around C/C++ and the Java language is pervasive. C#, the first new mainstream programming language in years, looks more like the Java language than anything else. Enormous productivity gains remain to be uncovered, and difficult problems are yet to be solved. The world has gone crazy first with XML and then with web services; SOAP and UDDI are getting enormous attention, and yet, from a software engineering standpoint, they seem to me a setback rather than a step forward.
We now have a generation of young programmers who think of software in terms of angle brackets. The enormous mess of XML documents now being created by enterprises at an alarming rate will haunt our industry for decades. With all that excitement, no one seems to have the slightest interest in basic computer science. Still, there must be people out there who think differently. Jaron Lanier is clearly one of them. Recently, one project at Sun Labs appeared to be genuinely interested in beginning work on the "next thing after Java technology" as part of far-reaching research into new computing platforms. So, I don't know, things may begin turning around.