May 31st, 2006
Ed Kaim responded to yesterday’s "Scrap the Windows Codebase" post with some good comments and it’s worth a follow-up. Ed says "I was surprised by the negativity of the tone overall and felt it was very much in the style of Michael Moore". Well, maybe I deserve that. It is hard to talk about scrapping one of the industry’s most valuable codebases in positive terms.
Ed has a good point when he says:
All I care about is that the OS does what my customers and I expect it
to do and that the apps we build don’t break. If it takes Microsoft 10
years to ship each new OS, that is better for us because it
means less budget gets spent on migration and more on core projects.
However, if the rug gets pulled out from under billions of users by
drastic changes for questionable improvements, we’re all screwed.
I agree with him 100%. Microsoft has an overwhelming responsibility to their customers and shareholders not to cause needless market and consumer upheaval.
But here’s the point: It’s far worse to live in denial. If you have a problem, you need to face it full-on, even if it’s more severe than you want it to be. When a business has 10000 employees it can no longer use, it’s not easy to make the decision to have massive lay-offs. But it’s a mistake if the business ignores the problem.
Re-read my post on Windows Vista: Past Its Due Date Already, where I talk about this kind of denial in a similar situation with the product of a former software industry market leader:
Then a line is crossed. You know that something is wrong. Your
engineers can feel it. There’s a malaise in the air. But, nobody says
anything. At the lunch table, you read PC Week’s scathing criticism.
People stare around the room, some even laugh or scoff. Most say
nothing. You go back to your work, you immerse yourself in further
enhancements to your product. You convince yourself everything is OK.
You look at competitive products only for purposes of punching holes in
their strategy. You find the holes. You reassure yourself. Everyone
smiles.
Repeat until fail.
I recognize the pattern. That’s where Windows is right now.
A few MS employees have told me I’m not far off. And Robert Scoble, in his short comment to the post, says "I totally agree". Robert may not be on the team, but he’s at least a close observer.
Ed also makes another good point:
There are smatterings of anti-Windows sentiment in broadly sweeping
statements and quotations taken somewhat out of context that would
indicate that people are fleeing Windows due to the problems Gary
outlines. I don’t see it at all.
He’s right. I don’t think people are fleeing. Windows customers want Windows to be healthy. Sure they do. I was listening to an InfoWeek Podcast yesterday and Mitch Wagner said that the newest Vista Beta and Office 2007 have him ready to "eat his words" about former negative comments. It’s looking better, and we’re all happy.
Yes, even I am happy. Nobody who relies upon Windows wants it to fail. I’m not a Windows basher, trust me. I did try Linux as my primary OS for 2 years. I gave it a good try and ran my Windows apps under Wine or VMware. I should blog about it someday; it was an interesting experience in compromise. When I switched back to XP 18 months ago, I felt like an old friend had returned.
When I was working on the dBASE project I talked about in my earlier post, it was the same way. Everybody in the market wanted dBASE to be great. Everybody inside Ashton-Tate felt that. They wanted to produce the best product for the market. Nobody was "leaving for other products". There were no other alternatives! Very much like Windows. How can anyone leave? There are truly no alternatives.
My upcoming posts will be less negative. The last post was the "gosh we have a problem, Houston" post. Of course it feels bad to admit the codebase is doomed. Microsoft must eventually admit it. But, my next post won’t be "pro-Linux". It will be pro-Microsoft.
Thanks for the comments Ed. They’ll keep me on track.
May 30th, 2006
[Microsoft's New Win-Win Strategy: Post 2 of 5]
As a successful software architect, I’ve learned to recognize the results of a poorly managed design process. In Windows Vista: Past Its Due Date Already, I gave some insight into why Windows matches a "software implosion pattern" I’ve seen before. This post explores why Microsoft really should scrap the codebase. The next post will suggest the controversial idea that Microsoft should scrap Windows in favor of Linux.
The recognition that the code base is in trouble is no secret, and for the most part even Microsoft agrees. Microsoft has a team called "The Windows Code Excellence Team". They have a program for "driving broad changes efficiently into the Windows codebase". They call these "Strike Force Efforts" and the very name they chose reveals the adversarial relationship that even Microsoft insiders perceive with their own code base.
But still, Microsoft thinks they can fix it. This post is about why they can’t.
The Recipe Was Wrong From The Start
The biggest single reason why Microsoft can’t fix the problem is that the Windows architecture is flawed at its very foundation. They’re sinking money into repairs on an old car, trying to make it fuel efficient, trying to make it conform to current needs. But as we all know, sometimes you need to buy a new one.
Understanding why this is true is technically challenging for most people. The real issues are understood only by well-educated computer scientists. As a result, most of the common discussions around Windows architecture talk about the visible aspects of Windows, rather than the technical underpinnings. And, because the technical world is so full of geeks who argue pointlessly (a la Slashdot), there are rarely any efforts to bring clarity to Windows' most severe problem.
There are ample sources of technical information about this. While it may seem like a Windows vs. Mac argument, Daniel Eran’s excellent (if lengthy) article "Five Architectural Flaws in Windows Solved in Mac OS X" is one of the best examples. What can easily be missed about this article is that 4 out of 5 points Dan makes are directly the result of Apple’s decision to use Unix as an architectural basis for OS X. If Dan were talking about MacOS 9, only the first of his five arguments would apply.
So, by adopting Unix, Apple was able to push their own technology into the next generation.
Both anecdotal evidence and qualified research are available to illustrate the problem. Security researchers can point squarely at flawed API design as a primary reason why Windows architecture actually "encourages insecure applications". Those who remember Frederick Brooks's classic book "The Mythical Man-Month" see the same trip-ups in Vista that Brooks warned about in his book.
Because Windows architecture has fundamental design flaws, Microsoft is constantly adding layer after layer of new technologies to achieve what would be implicit in a better architecture. This increases the amount of work, technology, and conceptual baggage that developers need to learn and master. Thus, it decreases the reliability of overall applications. Developers have a limited amount of mental energy. Any Windows developer knows that negotiating Windows "idiosyncrasies" takes up almost as much time as the application itself.
The idiosyncrasies themselves become entire "worlds of new technology". Knowing how the Windows event loop operates, and mastering arcane topics such as using "assemblies" as a way to avoid "DLL Hell", become distinct new art forms. The Windows community has become so accustomed to this continual "band aid" approach that they don’t even recognize the original problem. Instead, the new technologies become job skills, and further serve to separate Windows programming expertise and culture from the larger world of shared computer science knowledge. (See an earlier post of mine for more on this "disconnect").
Now and then, some Windows developer asks the right questions (from the InfoWorld gripeline):
I am surprised that still, after all these years, that Windows has not
seen the solution that UNIX (and probably many OS’s) takes to DLL Hell
– use versioned DLL files so something linked against an old DLL will
use the old one while something linked against the new one will use the
new one. Viola. Problem solved.
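To make the reader's suggestion concrete, here is a toy sketch of the idea in Python. It is not how any real loader is implemented. The library name `libfoo`, the paths, and the `resolve` function are all made up for illustration. The only thing borrowed from Unix is the common convention of putting the major version in the file name (`libfoo.so.1`, `libfoo.so.2`), so two versions can sit side by side.

```python
# Toy model of versioned shared libraries (all names here are hypothetical).
# Each installed library file carries its major version in its name, the way
# Unix systems commonly name shared objects (libfoo.so.1, libfoo.so.2).

installed = {
    ("libfoo", 1): "/usr/lib/libfoo.so.1",
    ("libfoo", 2): "/usr/lib/libfoo.so.2",
}

def resolve(library, major_version, installed_libs):
    """Return the file an application gets for the version it was linked against."""
    try:
        return installed_libs[(library, major_version)]
    except KeyError:
        raise LookupError(f"{library} version {major_version} is not installed")

# An old application linked against version 1 keeps getting version 1,
# even after version 2 has been installed for newer applications.
old_app_library = resolve("libfoo", 1, installed)
new_app_library = resolve("libfoo", 2, installed)
print(old_app_library)
print(new_app_library)
```

The point of the sketch is that installing a new version adds an entry rather than overwriting the old one, so nothing linked against the old version breaks.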
Continual layers of technology to solve architectural problems lead to the next problem with the Windows codebase …
Escalating Complexity
In a recent New York Times article, the following appeared:
Several thousand engineers have labored to build and test Windows
Vista, a sprawling, complex software construction project with 50
million lines of code, or more than 40 percent larger than Windows XP.
Windows is growing beyond even Microsoft’s ability to manage it. In October 2004, Martin Taylor, then Microsoft GM of Platform Strategy, admitted that changes were needed and introduced a new "role-based" strategy for reducing the size of the code-base:
"Today, it’s still the entire code base. There’s no reduction in the
bits you get; things are just roped off," Taylor said Friday. "We want
to get to a model of role-based deployment where you might just have
the bits you need for that function. … It’s one of our design goals
for Longhorn."
In that ComputerWorld article, Taylor was talking about a new agenda for Longhorn to trim the size of the code base. The article later says:
A Microsoft spokeswoman confirmed that the goals of providing a smaller
Windows "footprint" are to cut maintenance costs and provide a "reduced
surface attack area."
Martin is now part of the Windows Live team. The new GM of platform strategy, Bill Hilf, hasn’t said a word about it in the last 18 months. So much for trying. If a smaller code-base was one of the design goals for Longhorn, it seems Microsoft has decided to put the idea (and Martin Taylor) on the back burner for now.
Windows 3.1 had 2.5 million lines of code, Windows 95 had 15 million, XP has 40 million, and Vista will have at least 50 million. Microsoft managers are stumped about how to reduce the complexity and size, and Michael Cherry, former Microsoft product manager, says "It’s such a collection of smart people that they’ve started to believe too much in themselves". (MercuryNews)
Not only is the problem growing, but the team is looking the other way.
Reinventing Microsoft Software Culture
It’s not enough to throw out the codebase. Here’s the real challenge: You need to retool the very culture which allowed it to happen. Before starting over, you need to figure out what (mis-)management style allowed Windows XP’s excellent security foundation to be completely disabled by other arms of the organization. Microsoft already tried to re-invent Windows once with Windows NT. The problem is, while one set of OS experts in Microsoft is devising an excellent security framework, another set of "experts" is violating all the rules in the interest of "dumbing down the features" for users. Security guru Steve Gibson (quoted in WindowsITPro) explains the phenomenon:
"With
Windows 2000 you could argue that Microsoft was at least preserving the
original NT security model," Gibson noted. "Regular users would log on
as Administrator only when doing system tasks like installing
applications or bug fixes, and then log on as a regular user to get
work done. This is much like a UNIX machine, where the root account is
tweaked very carefully, not generally used for day-to-day work. But
Microsoft moved that NT security model to the home and gave
Administrator power to users. [The company] discarded the traditional
security model because it was too hard to explain to users…"
Aaron Margosis (a Microsoft employee) has an entire blog dedicated to the topic of trying to help people run Windows as "Non-Admin". Despite his excellent advice, it is impossible for the average user to comprehend, much less follow such instructions when all of the default settings of Windows, and the expectations of third-party software are that average users will be running as Administrator. As a result, Microsoft’s excellent security model lies in the background, gathering dust, while clever hackers throughout the world know that it’s open season for attacking the average user’s desktop. All of this is no accident, and is engineered into the product—introduced by well-meaning product managers attempting to make things simpler while elsewhere in the organization people know better.
It is as if Microsoft is releasing products by "trial and error". They even recognize the "non-admin" problem and are moving to add yet another layer of complexity called User Account Control to Vista to help solve this problem. But, even on the second attempt, it’s not looking good. Beta users are annoyed and claiming it is far too complex and intrusive:
In its current incarnation, too many people are likely to dismiss [User Account Control] completely, and if that happens, everyone loses. [Ed Bott -ZDNet]
That Microsoft has launched a major product with a major introduced security flaw is the most brazen sort of incompetence. That they are still not getting it right reveals something much worse: ignorance. While scrapping the codebase is essential, it’s equally essential to establish new rules when moving forward.
Margosis's final paragraph in one of his articles is among the most interesting. Probably the biggest reason why Windows XP is so vulnerable, and why so many people run as root, is cultural:
Hey, y’all! We need to lead by example. People look to us for best practices, for the right way to do things. We are trying to convince the world that we are thought leaders in software and in software security. In the Unix world, they never run as root except when necessary. They “su”, do what they need to do, and revert back. We are not leaders when we run as root all the time. Comrades: you need to run as “User”, and your customers need to see you doing it. If you run into issues, don’t add yourself back to the admins group – file a bug against the offending product. Customers: if
you see any MS sales, MCS, Premier, PSS, etc., doing web or email as
admin, please tell them, “You’re not setting a very good example. I am disappointed.”
Spaghetti
While some experts agree that the above flaws are proven facts, I suspect there are more people in the industry that consider these things quite subjective. As we all know, a picture is worth a thousand words. In April, a ZDNet article with the dubious title "Why Windows is less secure than Linux" included two diagrams generated by Sana Security and shown below. These diagrams received little attention, but compare graphically how Windows and Linux process the service of a single web page.
The first diagram illustrates how Windows processes the page:
The second illustrates how Linux processes the page:

The orderly arrangement of the Linux traces is no accident. It is the result of years of thinking which goes back to the origins of Unix itself. Good operating system design assures that the operating system and application layers are distinct, separated by boundaries which are like immovable walls. Such walls manage the complexity of systems by isolating operations from one another to minimize dependencies. To a good system designer, these are not just guidelines, they are dogma.
To any good systems architect, the traces of the Windows diagram are like a giant black spot on your MRI. They represent an undisciplined and haphazard set of interrelationships resulting from years of unsystematic development and support of legacy code and processes.
Little Hope of Repair
There is hardly any hope of repair for such systems. The new Windows Vista may eventually work, but it will be by brute force testing and a bit of sleight of hand—not good design. Microsoft, however, is trying to fix it. On his blog, Microsoft employee Larry Osterman describes the problem:
As systems get older, and as features get added, systems grow more
complex. The operating system (or database, or whatever) that started
out as a 100,000 line of code paragon of elegant design slowly turns
into fifty million lines of code that have a distinct resemblance to a
really big plate of spaghetti.
This isn’t something specific to Windows, or Microsoft, it’s a
fundamental principal of software engineering. The only way to avoid
it is extreme diligence – you have to be 100% committed to ensuring
that your architecture remains pure forever.
It’s no secret that regardless of how architecturally pure the
Windows codebase was originally, over time, lots of spaghetti-like
issues have crept into the product over time.
Larry’s right about the problem. But unfortunately there is no way to turn back the hands of time and retroactively make sure that the architecture was pure from the start. Yet, he goes on to describe how Microsoft has developed internal tools "that perform static analysis of the windows binaries and they work out the architectural and engineering dependencies between various system components". The hope is that by knowing which layers should be isolated and why, changes can be put in place which fix the problem and eliminate the spaghetti.
Then, unfortunately, Larry goes into the tall grass when he says:
Well, most of the layering issues can be resolved via email, but for
some set of issues, email just doesn’t work – you need to get in front
of people with a whiteboard so you can draw pretty pictures and explain
what’s going on.
Software architecture may be an interesting thing to talk about in email or on the whiteboard. But such naive attempts will not make the sweeping architectural changes that are necessary to yield noticeable improvement. Only good design, enforced by software tools and disciplined coding practices, can result in well-layered systems with manageable complexity. Much of the Windows code itself predates the very tools and practices needed to fix it. For the fundamental design of Windows to change, you need to go back to the drawing board.
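What does "enforced by software tools" look like in practice? Here is a minimal sketch in Python, with no claim that it resembles Microsoft's internal tools. The layer names, the dependency map, and the `layering_violations` function are all hypothetical. The rule it checks is the simplest one: a component may depend on its own layer or a lower one, never on a higher one.

```python
# Hypothetical layer ranks: lower numbers sit closer to the hardware.
LAYERS = {"kernel": 0, "drivers": 1, "services": 2, "shell": 3}

# Hypothetical dependency map: component -> components it calls into.
DEPENDS_ON = {
    "kernel": [],
    "drivers": ["kernel"],
    "services": ["kernel", "drivers"],
    "shell": ["services"],
}

def layering_violations(layers, depends_on):
    """Return sorted (component, dependency) pairs where a component calls upward."""
    violations = []
    for component, dependencies in depends_on.items():
        for dependency in dependencies:
            # A dependency on a higher-ranked layer breaks the layering rule.
            if layers[dependency] > layers[component]:
                violations.append((component, dependency))
    return sorted(violations)

# The clean map passes; adding a kernel -> shell call is flagged.
print(layering_violations(LAYERS, DEPENDS_ON))
broken = dict(DEPENDS_ON, kernel=["shell"])
print(layering_violations(LAYERS, broken))
```

A check like this only helps if it runs on every build and fails the build when the list is non-empty. That is the difference between a rule that is enforced and one that is merely discussed on a whiteboard.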
Even if you believed, for a moment, that you could check every line of code and fix every dependency, the math would get you. As the number of function points increases, the number of side-effects and dependencies increases exponentially. Even a small software system can have millions of interdependent relationships. A large system like Vista with 50 million lines of code would have side-effects and causal relationships that would defy analysis.
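A rough way to see that math, under simplifying assumptions of my own: treat the system as n components in which any two might affect each other. The function names below are made up for illustration. Counted as pairs, the possible relationships grow with the square of n. Counted as groups of two or more components that might interact together, they grow exponentially.

```python
from math import comb

def pairwise_relationships(n):
    """Number of distinct pairs among n components: n * (n - 1) / 2."""
    return comb(n, 2)

def possible_interaction_sets(n):
    """Number of groups of two or more components: 2**n minus the
    empty set and the n single-component sets."""
    return 2 ** n - n - 1

for n in (10, 100, 2000):
    print(n, pairwise_relationships(n))

# The exponential count gets out of hand almost immediately.
print(possible_interaction_sets(10))
print(len(str(possible_interaction_sets(100))), "digits for n = 100")
```

By this count, a system of only 2,000 components already has close to two million possible pairwise relationships, before considering any interaction that involves three or more components at once.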
Confidence Building
In March, when Vista slipped, article after article appeared about the whys and whens of the slip. The popular journalism moved quickly into an editorial stance. The New York Times article "Burden of the years weighs on Windows" set the stage:
"Windows is now so big and onerous because of the size of its code
base, the size of its ecosystem and its insistence on compatibility
with the legacy hardware and software, that it just slows everything
down," noted David Yoffie, a professor at Harvard Business School.
"That’s why a company like Apple has such an easier time of innovation."
Microsoft was uncharacteristically silent. I believe, finally, there could be no disagreement.
Whether or not the problem is as egregious as I say, there is certainly a belief that Windows has reached a turning point in its lifecycle. Consumer confidence in Windows' behavior has waned, and the recognition that the underlying operating system is to blame is becoming widely accepted.
If, as part of a bold new strategy, Microsoft announced that the Vista codebase was the end-of-the-line, confidence in future solutions could finally increase. Instead of fighting the past, the talented teams of Microsoft engineers would be learning from their mistakes. As it is, there is far too much code and far too many problems for them to do anything other than trudge forward, making it work as best they can.
Conclusions
The Windows codebase is in bad shape. It’s unlikely that Microsoft, or indeed anyone, can fix it. I am certain they will create a usable version of Vista. But, I expect that one year after its release, we will not be looking back, happy that the problems are solved. Instead, such an albatross of design can only yield new problems, and new challenges for Microsoft.
Get rid of it, and replace it. But with what? In the next post (coming in a couple days), I’ll suggest that Linux be a part of a new strategy to revitalize the product line. It’s good for Microsoft in more ways than you think.
May 28th, 2006
Over the next week or two, I’ll be blogging about Microsoft’s product strategy. I have received a variety of interesting reactions to Windows Vista: Past its Due Date. I obviously struck a chord with many people both inside and outside of Microsoft. In the "Vista" article, I related a tale of another time a major product met its demise: dBASE. I was there for that one. But, history repeats itself, and it’s happening with Windows. Windows is headed down the same path.
Unless Microsoft does something dramatic. I call it Windows TNG.
Microsoft’s best, and perhaps only, opportunity to take their products to the next level involves four simple, bet-the-company steps. Each follow-up post will go into these steps in detail. Honestly, I don’t think Microsoft has what it takes to make such bold moves anymore. But, here they are:
- Scrap the Windows codebase forever. Release Vista, and announce publicly that it will be the last version of Windows based upon the NT/Win32 platform.
- Use Linux as the base operating system for the next generation of Windows. Do not modify it, do not "Microsoftize" it. Do not try to own it. Exploit it.
- Reinvent the Desktop. Call it Windows. Windows: The Next Generation. Outdo Apple, outdo the current platform, outdo every "Linux desktop" effort in existence.
- Put applications first. Office TNG, Project TNG, Excel TNG, Outlook TNG. Do not port. Rewrite. Do not create a Win32 compatibility layer. Do it right.
To some readers, this is an obvious win. To others, it’s ridiculous. Some would say it is heresy. The most analytical would say that it throws away Microsoft’s biggest IP asset, the Windows codebase, and puts Microsoft head-to-head on a level playing field, making them far too vulnerable. The stock price would plummet.
Maybe so. But, once you realize that the codebase is the problem, you also realize Microsoft has to devise something better.
My early career was fueled by the excitement that Microsoft brought to the playing field. They not only created a C compiler for me, they created a better one. They not only gave me a graphical desktop. They gave me a better one. They not only created a better way for the novice to write Windows applications, they amazed the world with VB and revolutionized the desktop development environment. They were hungry, unfettered by legacy strategies and technologies. They were the underdog, and they were my champion.
So, rather than bitch and moan (which is so easy), I’ll go into details about these steps in a multi-part post. I hope it’s useful.
May 3rd, 2006
For almost two years, I’ve been using a Firefox extension called "Tabbrowser Extensions" written by a Japanese guy who really tries hard, but doesn’t quite do English very well. Now, I have to say, he may not know English, but the controversial TBE packs more tabbing features than I even thought were conceivable into one extension. If there’s a feature you want… it’s there! Thumbnails, tree views, horizontal or vertical tabs, closebox management, look-and-feel, fonts for unread items, pop-up blocking and repurposing as tabs. The newest version is even more incredible.
Am I recommending it? I’m not sure. Even the author advises against using it! He says…
This extension strongly unrecommended. Tab Mix is recommended instead of this, because it is stable, light, and it covers most useful features of this…
If you think this is too heavy and too gigantic having many needless features, see a thread in MozillaZine: Rebuilding TBE’s featureset with other plugins. There are many tiny extensions which provide each feature of TBE’s.
If that doesn’t have you clicking to download, the author has an "Advantages and Disadvantages" page which is heavily weighted toward the disadvantages, including this gem…
Virtually, now no one can update TBE codes, excluding [sic] me. Its codes are
like as entwined spaghetti. Many unknown bugs are maybe there, many
known problems (with unknown reasons) too.
Fantastic! No wonder it works so well under Windows!
The poor guy! It’s obvious from looking at his page that he’s had nothing but complaints from lots of people. Maybe it’s no wonder. Just navigating the preferences pages makes you feel like you’re in the cockpit of a 747. Maybe a 747 is even easier. As a programmer, I can’t even imagine how you can make all those checkboxes operate in combination. I mean, when you add tab thumbnails, have tabs indicate which things you haven’t read, group tabs by color and launch group, there are so many fonts and special-features packed into the tab you really need a quick reference card to understand what it all means!
But, here’s the rub. I’ve been using it for 2 years. Not a problem. And I love it! When people look over my shoulder at me whizzing through dozens of tabs and sorting and rearranging and categorizing they say I look like some kind of frenzied information junkie. It’s true. I can consume and organise tons of web information using TBE. It saves me time.
And it’s almost like a video game in terms of the fun factor! I change the preferences all the time just because I can!
So, I may be alone. I may be weird. I may be totally off my nut.
But, I recommend Tabbrowser Extensions highly! Get it. Go nuts! In a world of dangerous awful stupid adware and browser viruses, at least THIS crazy thing does something helpful. It may not be for everybody. You really need to be a cockpit junkie. But, if you are, you’ll never go back.
And hey, let’s figure out how to help this guy out of his depressing mess. I’d love seeing this thing maintained and updated and made more official. If there are bugs, I haven’t found any. But, considering how over-the-top this thing is, there MUST be zillions in there somewhere!
Anyway! I give it FIVE UGLY STARS!