Friday, October 06, 2006

Some Comments from the MIT Emerging Technology Conference

Jeff Bezos was one of the presenters at the conference I went to last week, and he managed to mention Amazon's less glamorous "developer" products without sounding too much like he was marketing them. He pointed out that Amazon hopes to identify the fundamental technological problems it has already solved for itself and generate revenue by offering those solutions as services. It's clear that something interesting is happening here: fundamental infrastructure software as a service is actually becoming practical.

Amazon S3: some of you may have heard about this, but it's a remote file storage service for cheap. Highly available, and probably limited only by your own bandwidth, it lets you use Amazon's public APIs to push your files to some unknown location and retrieve them on demand. Some comments:

  • Only a Big Dog like Amazon could do this, because of branding and comfort. ISVs like us could even use such a service for asset management and say that we use Amazon's S3 service for reliable asset storage, or automate regular database dumps and push them to an S3 location transparently to the user (see the sketch after this list). If it were Bob's Remote Storage in Arkansas, people would not be likely to trust it, but Amazon has a good shot at getting this right.
  • For companies with high storage requirements and possibly little or no capital or expertise available to build a proper machine room, this is amazing enabling technology. TCO for SANs and such is rough, never mind the initial cost. One can make a really compelling price/performance argument for a service whose cost scales linearly with growth. In fact, the cost probably flattens out somewhat over time.
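Just to make that database-dump idea concrete, here is a rough sketch of what an automated push to S3 might look like in C#. The bucket, object key, file name, and credentials are all made up, and the signing recipe follows my reading of the S3 REST documentation, so treat this as a starting point rather than gospel:

```csharp
using System;
using System.IO;
using System.Net;
using System.Security.Cryptography;
using System.Text;

class S3DumpPush
{
    static void Main()
    {
        // All names and credentials here are invented for illustration.
        string accessKey = "MY-ACCESS-KEY-ID";
        string secretKey = "MY-SECRET-KEY";
        string bucket    = "acme-db-backups";
        string key       = "dump-2006-10-06.sql";
        byte[] payload   = File.ReadAllBytes("dump-2006-10-06.sql");

        // .NET restricts the Date header on HttpWebRequest, so use
        // x-amz-date, which S3 accepts in its place when signing.
        string date = DateTime.UtcNow.ToString("R");

        // S3 signs each request with HMAC-SHA1 over a canonical string:
        // verb, Content-MD5, Content-Type, Date, amz headers, resource.
        string stringToSign =
            "PUT\n" +                      // HTTP verb
            "\n" +                         // Content-MD5 (omitted)
            "\n" +                         // Content-Type (omitted)
            "\n" +                         // Date (empty; x-amz-date used)
            "x-amz-date:" + date + "\n" +  // canonicalized amz headers
            "/" + bucket + "/" + key;      // canonicalized resource
        HMACSHA1 hmac = new HMACSHA1(Encoding.UTF8.GetBytes(secretKey));
        string signature = Convert.ToBase64String(
            hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));

        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(
            "https://s3.amazonaws.com/" + bucket + "/" + key);
        req.Method = "PUT";
        req.ContentLength = payload.Length;
        req.Headers.Add("x-amz-date", date);
        req.Headers.Add("Authorization", "AWS " + accessKey + ":" + signature);

        using (Stream s = req.GetRequestStream())
            s.Write(payload, 0, payload.Length);
        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
            Console.WriteLine("S3 says: " + resp.StatusCode);
    }
}
```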

Amazon EC2: This is the Amazon Elastic Compute Cloud. At first, this just sounds like another grid technology for doing supercomputing. Actually, it's more like a virtual system-on-demand, done remotely. One example usage: you create a server image for your web application, then tell EC2 you want 10 load-balanced instances just like it; Amazon does some magic, and those systems are available to you. (A sketch of what such a request might look like follows this list.)

  • This is again huge for companies that may not have a machine room but need to handle spikes in traffic. Let's say you put out a project release and you know you will get slashdotted. Just for the week surrounding your release, you can order up 50 extra machines of capacity *just for that week*. When the buzz wears off, you release those systems, all without the ugly capital investment that is really hard to do well and that leaves a bunch of under-utilized systems taking up power and space in the garage. The cost is localized and incremental.
  • Bezos observes that 70% of application development goes into "heavy lifting" (what I call "dumb, annoying, and necessary") things like storage, system administration, and (for Amazon) shipping and packaging. Things like S3 and EC2 can remove those things from the equation for small companies. This supports the very compelling notion of micro-ISVs.
  • There was a time when only excellent mechanics were capable of owning automobiles. Now, even people like me can both own *and* drive a car, imagine that. The point is this: lowering the barrier on dumb stuff can really enable people with good ideas to get their ideas implemented.
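My notes on the API itself are thin, but based on the Query-style HTTP interface Amazon describes, asking for ten clones of your image looks roughly like the sketch below. The image id is invented, and I have left out the required credential and signature parameters to keep the shape of the call visible:

```csharp
using System;
using System.Net;

class Ec2RunInstances
{
    static void Main()
    {
        // A made-up image id; real "ami-..." ids come from bundling
        // and registering your own server image.
        string imageId = "ami-12345678";

        // The EC2 Query API is plain HTTP: an Action plus parameters.
        // Here we ask for ten identical instances in one call. (A real
        // request must also carry AWSAccessKeyId, Timestamp, and an
        // HMAC Signature; those are omitted from this sketch.)
        string url = "https://ec2.amazonaws.com/"
            + "?Action=RunInstances"
            + "&ImageId=" + imageId
            + "&MinCount=10"
            + "&MaxCount=10";

        using (WebClient client = new WebClient())
        {
            // The response is XML describing the new instances.
            string responseXml = client.DownloadString(url);
            Console.WriteLine(responseXml);
        }
    }
}
```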

Remote Web Search API: On 10/4, Amazon announced API-level integration with Alexa, a public web crawler (this may have been what she was alluding to); suddenly, searching the internet in vertical ways becomes more possible.


(I found the following non-sequitur in my notes I wrote as the MIT staff was fiddling with some wireless mics which were dropping in and out: “how geeky do you have to be to land the a-v tech job at MIT?”)

That's just Amazon. This week, I have been doing a bit of reading, and look what else is out there (just to name a few):

Yahoo BBAuth and Google Account Authentication: Yahoo and Google (and probably others) have an "outsourced authentication" API available. Yahoo's API is called BBAuth, or Browser-Based Authentication. This is nice for mash-ups, in that I can write a novel application that leverages Yahoo-resident content (e.g., the contents of my Yahoo Photos site) without having to build any actual authentication code into my application. My app would (in a browser-based app) redirect to the Yahoo login page, where the end user provides his Yahoo username and password. The app gets a token back and can then access the user's content, but the application is never responsible for the username, password, or profile data. Google's API is shaped the same way: an application can access data in a user's content area (such as Gmail) without ever needing to know the user's login or password. This holds good promise to allow my little application (whatever that is) to ignore the infrastructure required for solid authentication/authorization. I.e., no AD integration, no LDAP server to set up, no password security/aging support to write (yet again).
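To make the shape of this concrete, here is roughly what the Google flavor (AuthSub) looks like from C#. The return URL and the feed being fetched are my own inventions, and the endpoint details reflect my reading of the docs, so verify them before relying on this:

```csharp
using System;
using System.Net;

class AuthSubSketch
{
    // Step 1: send the user's browser to Google's login page. My app
    // never sees the username or password; Google handles the login.
    static string BuildLoginRedirectUrl()
    {
        // Made-up return URL for my hypothetical app.
        string next  = Uri.EscapeDataString("http://myapp.example.com/authdone");
        // The scope names which Google-resident data I want to touch.
        string scope = Uri.EscapeDataString("http://www.google.com/calendar/feeds/");
        return "https://www.google.com/accounts/AuthSubRequest"
             + "?next=" + next + "&scope=" + scope + "&session=0&secure=0";
    }

    // Step 2: Google redirects the browser back to 'next' with a token
    // in the query string. Present that token to fetch the user's content.
    static string FetchUserContent(string token)
    {
        using (WebClient client = new WebClient())
        {
            client.Headers.Add("Authorization", "AuthSub token=\"" + token + "\"");
            return client.DownloadString(
                "http://www.google.com/calendar/feeds/default/private/full");
        }
    }
}
```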

Google Checkout API: Implementing a shopping cart/secure checkout process was once a big pain. I was around in the internet bubble and implemented several of these, and it was always way too hard. Had low-cost technology like this been around in 1997 or 1998, many, many more good site ideas would have survived the crash. Anyway, as expected, this is a generic shopping cart application where the risky and tricky part actually belongs to Google. Sure, there are look-and-feel issues, but again, strong branding helps people feel better about blurring the lines between sites.
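As I understand the API (I have only read about it, not shipped with it), the integration is a single authenticated XML POST describing the cart; Google replies with a URL you redirect the buyer to, and the payment pages are all theirs. Everything below, from the merchant credentials to the endpoint to the cart schema, is my reconstruction from the docs, so double-check it:

```csharp
using System;
using System.Net;

class CheckoutSketch
{
    static void Main()
    {
        // Made-up merchant credentials; Google issues these at sign-up.
        string merchantId  = "1234567890";
        string merchantKey = "MY-MERCHANT-KEY";

        // A one-item cart, per my reading of the Checkout XML schema.
        string cartXml =
            "<checkout-shopping-cart xmlns=\"http://checkout.google.com/schema/2\">" +
            "<shopping-cart><items><item>" +
            "<item-name>Widget</item-name>" +
            "<item-description>A fine widget</item-description>" +
            "<unit-price currency=\"USD\">9.99</unit-price>" +
            "<quantity>1</quantity>" +
            "</item></items></shopping-cart>" +
            "</checkout-shopping-cart>";

        using (WebClient client = new WebClient())
        {
            // HTTP Basic auth with the merchant id/key; the XML response
            // contains the redirect URL where the buyer pays on Google's site.
            client.Credentials = new NetworkCredential(merchantId, merchantKey);
            client.Headers.Add("Content-Type", "application/xml");
            string response = client.UploadString(
                "https://checkout.google.com/api/checkout/v2/merchantCheckout/Merchant/"
                + merchantId, cartXml);
            Console.WriteLine(response);
        }
    }
}
```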

All these are consistent with the pattern: infrastructure software services are mature enough to be used in commercial applications, by micro-ISVs and Big Company ISVs (like us) alike.

(Thanks to Phil for being an excellent sounding board in Cambridge while assimilating this stuff!)

TRETC (this tag is for the blog stuff: they said to include it if you are blogging about the conference. It's an interesting experiment while I blog this at http://srehorn.blogspot.com/ to see what happens. Now I can go search for other people's impressions of this same content. What did they get out of the same talk?)

Thursday, October 05, 2006

MSFT Wants Me To Use Team Foundation Server

But what if I don't want to use it? I am sure TFS is a lovely piece of infrastructure, but I have had a heckuva time installing it. After numerous attempts on a spankin' new disk, I still cannot get it to run. I don't claim to be a MSFT products guru, so maybe I am just being a dork, but anything that is that hard to install is a little off-putting, at best. Am I off base? Maybe someone can set me straight on what I am doing wrong. Someday I'll get it installed, and then I can try it out. (For the record, I am hung up on the piece which requires the SharePoint persistence to be set up a certain way, but I cannot seem to make the SharePoint installer do the right thing, so I need to pursue that next.)

My philosophical bent right now is to use continuous, automated integration, test-driven development (TDD), and a strong IDE. Visual Studio (VS) is a strong IDE, you bet; maybe the best ever. My shop, however, uses TDD, StarTeam for source control (not SourceSafe), and centralized, continuous build with CruiseControl.NET. These things just don't seamlessly integrate with VS. Even with TFS, the unit-test support for TDD is pretty weak. I know MSFT is working on that, and we can expect them to get it really right in another two releases. Because there is a StarTeam plugin for VS which works pretty well, and a nice NUnit wrapper called TestRunner from Mailframe.net that lives in the IDE, we are partly there. But how to get the continuously integrated warm fuzzy I seek? Do I tell my boss that we can no longer use TDD or StarTeam, and can no longer have continuous integration, unless we switch to Team Foundation Server? I don't think so. My goal is to create a nice, TDD-friendly, continuous integration framework that isn't annoying for developers to use. This is tricky, and I'll post on various aspects of it over time to show how it can be done.

The good news is that MSFT knows that there are 5 or 10 people in the world who will, for whatever wacky reason, choose not to use TFS. In fact, IMHO, in order to create a platform for team development (TFS) that could support current best-practice development, MSFT had no choice but to improve their fundamental build technology. The venerable nmake finally ran out of steam. A key technology here is MSBuild, which is delivered as part of the .NET 2 distribution: not even an IDE feature, but used by the IDE anyway. I am happy about this because they did such a nice job with MSBuild that I would say it's useful beyond just building DLLs and such, but that is an essay for a different day.
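Here is the kind of thing I mean: a tiny, hand-written MSBuild file (file name and paths invented) that drives the very same .csproj files the IDE uses and then runs the tests. The point is that the official build becomes an ordinary, inspectable text file you can run from the command line with msbuild.exe build.proj:

```xml
<!-- build.proj : a minimal hand-written MSBuild file; names are my own -->
<Project DefaultTargets="Test"
         xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <!-- Build the same .csproj files the IDE uses, with the same flags,
       so developers and the build server make EXACTLY the same build. -->
  <ItemGroup>
    <ProjectsToBuild Include="src\**\*.csproj" />
  </ItemGroup>

  <Target Name="Build">
    <MSBuild Projects="@(ProjectsToBuild)" Properties="Configuration=Release" />
  </Target>

  <!-- Run the NUnit console after building; paths are assumptions. -->
  <Target Name="Test" DependsOnTargets="Build">
    <Exec Command="tools\nunit\nunit-console.exe build\MyApp.Tests.dll" />
  </Target>
</Project>
```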

So just as an overview, let's consider in some more detail what the players are:

  • MSBuild and project files (*.csproj): Automated builds and individual developers must make EXACTLY the same build. Historically in MSFT environments this is tricky because the IDE does the build in some magic way; thus, do you really know what bits are in the build and how they got there? Was there some goofball compiler flag introduced in your "production" build that no developer ever saw (and thus didn't test against)?
  • CruiseControl.NET: Continuous integration which builds and sends scandalous emails to whoever was sloppy enough to break the build for whatever reason.
  • StarTeam for source control: I don't use Visual SourceSafe. Shocking! Subversion, StarTeam, Perforce: they all work great, and each has its own annoyances. Using something besides VSS (much less TFS) shouldn't prevent me from setting up my build environment My Way.
  • NUnit: TDD requires that I have test code and support as part of my regular IDE "air". Test code is written at the same time as application code, so I need to partition my test code from my production code but still be able to work with the test code in the IDE environment without having to think about it. I want green bars during TDD, and TestRunner from MailFrame.net (among others) provides that. [Cautionary note: the current version of TestRunner has exhibited some behavior different from what you get when executing test DLLs with nunit-console.exe. In particular, the console run of the tests highlighted the fact that test B was expecting an object instance reference created in test A. This is pretty basically wrong from a test-development perspective, but the point is that TestRunner's wrapper on NUnit didn't see it. I recommend using nunit-console as the final word, especially because TestRunner's installation uses a particular version of the NUnit DLLs. A minimal fixture illustrating the point appears after this list.]
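Here is a minimal sketch of such a fixture; the Order class is invented purely as something to test. The [SetUp] method is the discipline that prevents exactly the test-A-leaks-into-test-B problem described above, because every test gets a fresh object:

```csharp
using System.Collections.Generic;
using NUnit.Framework;

// A tiny class under test, made up for the example.
public class Order
{
    private readonly List<decimal> prices = new List<decimal>();
    public void Add(string name, decimal price) { prices.Add(price); }
    public int ItemCount { get { return prices.Count; } }
    public decimal Total
    {
        get
        {
            decimal sum = 0m;
            foreach (decimal p in prices) sum += p;
            return sum;
        }
    }
}

[TestFixture]
public class OrderTests
{
    private Order order;

    [SetUp]
    public void PerTestSetup()
    {
        // Runs before EVERY test, so no test can depend on state
        // left behind by a previous one.
        order = new Order();
    }

    [Test]
    public void NewOrderIsEmpty()
    {
        Assert.AreEqual(0, order.ItemCount);
    }

    [Test]
    public void TotalSumsItemPrices()
    {
        order.Add("widget", 9.99m);
        order.Add("gadget", 5.01m);
        Assert.AreEqual(15.00m, order.Total);
    }
}
```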

That's the basic view of the stack, and further posts will elaborate on how exactly to set these up along with a sprinkling of what I believe to be best practices.