Tuesday, November 04, 2008

Amazon and Azure

I was reading my esteemed colleague Dmitry Sotnikov's overview comparison of the proliferation of cloud platforms, and it inspired me to write down my take on these two approaches to the world of 'cloud computing', whatever that is.

I am a big fan of Amazon Web Services (AWS) because it's available now and you pay only for what you use. No startup fee, no minimum subscription charges, no relentless mailings offering upgrades. I am able to boot up remote systems and interact with them without having to do an install from scratch, etc. This works nicely in my world, but then I am a technical guy, so I am not put off by having to write a program or log in to a machine with ssh to do something.

So it's great, you do everything yourself, and it's a pain because yes, you do everything yourself. As I am often reminded in yoga class, maximum flexibility yields maximum pain. Even though I am not afraid to write a program, there is a big conceptual space to absorb before I can start writing that program. First I need to understand what an Amazon machine image is, how to attach persistent block storage, etc. It's really no surprise that although AWS has received a lot of attention from the press, hobbyists, and academics, it's not what you could call mainstream.

Microsoft's Azure is not yet completely defined, but I think I can see where they are headed. While not as feature-complete as Amazon, they are focusing on providing a nice layer of abstraction that offers what will likely be full-on management of systems which physically reside who-knows-where. A small number of services that work simply and well is arguably better than a raw interface on a comprehensive set of services. Plus Microsoft is able to buy themselves some time to finish building out the data centers, and use the early adopters as guinea pigs to drive out the operational issues.

So as not to cannibalize their lucrative desktop biz, Microsoft is positioning this so that you have many options for creating applications. They say you can build your application with the new Azure APIs and tools such that you can leverage services in the cloud only if and when it makes sense to do so. In other cloud-computing scenarios, you need to commit to a particular application architecture which is either resident 'offsite' or very local. If they deliver on the promise of making it easy to dynamically leverage cloud services, or bring the whole thing inside your firewall, or mix these notions as needed, then there might truly be some useful software here. You bet you are still committing to the Microsoft stack with all its various pros and cons, and the essential feature they deliver with Azure is that you get a new architectural option for lateral scaling, hopefully without having to work too hard.

Somewhere, there is, no doubt, a giant room of people coding like crazy to build out more services, create the tutorials, and clean up the APIs. So while Microsoft is not offering as much granularity as Amazon, they are offering what will eventually be a highly usable interface on the web which allows people to do the stuff they normally do. You can build on your local machine, and push the result to the cloud right from your desktop. No ssh is (apparently) involved; for most people, that's the right answer, and it removes an important barrier to entry.

For Amazon, there are things like ElasticFox, a Firefox plug-in application for managing one's AWS services. But it's really just syntactic sugar on top of the raw APIs-- it lacks the abstraction that captures the work being done in the way that actual people think about it. Microsoft's focus on application hosting deployed right from the IDE represents an important distinction: Amazon offers the ability to do whatever you want, Microsoft gives you an application development environment AND a nice web front end to what you probably need to do anyway for production systems. There are compromises here because you naturally don't have the same flexibility with Azure that you have with AWS, but in many, many deployment scenarios, all that flexibility just gets in the way.

Monday, September 01, 2008

Enabling Semantic Infrastructure for Collaborative Systems

As more companies embrace the techniques of enterprise social software (ESS), I start wondering about what I am going to do with this extra data coming my way. ESS systems can promote collaboration, which often leads to Even More Data coming into my "information space." Currently it's not so bad-- I have a middling amount of email, about fifty thousand bookmarks, about a hundred RSS feeds-- how hard can that be to keep up with? But think about how much that space would expand if I added even my most immediate group of co-workers: in effect, I would then have all of their bookmarks, RSS feeds, etc. This is only a good thing if I have some help dealing with it all.

The only way out of the coming information glut will be to have the machines help dig us out. But how to do that? Enterprise search is heavy, expensive, local text search. It can be helpful, but it is not the way to handle the meaning of the data, i.e., its semantics. Semantic data management needs to be baked into a content system, so the generation of metadata just becomes part of the working environment. This can be in the very simple form of tags/folksonomies and standard representations of working groups using techniques such as friend of a friend (FOAF) and description of a project (DOAP). When semantic data management capability is part of the infrastructure in an ESS-type environment, it promises to allow data to be organized and queried in interesting and emergent ways. In a corporate environment, the system can easily link up associated projects, content, or people.
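To make that concrete, here is a minimal sketch of FOAF-style metadata generation using Python's rdflib (my choice of library for illustration; the people and URIs are made up):

import sys
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

g = Graph()
alice = URIRef("http://example.com/people/alice")
bob = URIRef("http://example.com/people/bob")

# The working-group metadata: who exists, and who knows whom.
g.add((alice, RDF.type, FOAF.Person))
g.add((alice, FOAF.name, Literal("Alice")))
g.add((bob, RDF.type, FOAF.Person))
g.add((bob, FOAF.name, Literal("Bob")))
g.add((alice, FOAF.knows, bob))

# Serialize for whatever store the ESS infrastructure provides.
sys.stdout.write(g.serialize(format="turtle"))

The point being: if the collaboration system emits little graphs like this as a side effect of normal work, linking up associated projects, content, and people comes along for free.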

This sounds fabulous in theory, but adding such capability to your environment is actually harder than putting in a big-time enterprise text search server. Semantic web technology is some of the newest 10-year-old technology in our bag of tricks, and it has the capability to capture, analyze, and derive value from relationships inside of the data. But, in order to get this kind of semantic connectivity, you need a purpose-built data store, a server to house it, and an analytical/query system to get to it. Even worse, semantic metadata gets big very quickly, which can lead to storage and query-response issues. So, not helpful, right? The software integration problem alone is daunting enough, and adding semantic infrastructure to my overall ESS platform means that I have to adopt yet another system to manage.

Naturally, I want to imagine that I have access to a semantic data server that looks like it lives inside my machine room, and that feels like software-as-a-service (SaaS). I have worked quite a bit with virtualization from external providers: Amazon's EC2, CohesiveFT, and others. These are amazing systems that allow me to sign up, submit some commands, and some magic happens. What's great about that is that I don't have to put the systems up, I don't even need to understand how they work. I can just use them as if I *had* spent 6 months adding a new wing to my server room. With a virtual semantic server, I can integrate the promise of the "linked web of data" into my ESS platform to manage and leverage the meaning of all that data.

Now, it's important to observe that virtualization works in the SaaS model because of the generic nature of the task. The vendors of such services are able to optimize for scale, performance, and reliability without needing to know precisely how the systems are going to be used. Semantic data as a service falls into the same category: it's generic, and it needs exactly that kind of active optimization for scale, performance, and reliability.

Talis, a UK-based company, is aiming to be the Amazon EC2 of the semantic web, and I think they have a good shot at it. The same principles apply: Talis is concerned with making a semantic store fast, reliable, and scalable, so you don't have to be. Your data is stored and processed somewhere else, but it's always your data. Via a straightforward HTTP-based interface, you add metadata and query against it.
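To give a taste of "straightforward HTTP-based", here is a rough sketch from Python of adding data and querying it. The store name is hypothetical, and the endpoint shapes reflect my reading of their developer wiki, so verify against n2.talis.com before trusting them (writes also require authentication, which I omit here):

import urllib.parse
import urllib.request

STORE = "http://api.talis.com/stores/my-ess-store"  # hypothetical store name

# Add metadata: POST a chunk of RDF/XML into the store.
rdf_xml = open("bookmarks.rdf", "rb").read()
req = urllib.request.Request(
    STORE + "/meta",
    data=rdf_xml,
    headers={"Content-Type": "application/rdf+xml"},
)
urllib.request.urlopen(req)

# Query against it: SPARQL over a plain HTTP GET.
query = "SELECT ?p WHERE { ?p <http://xmlns.com/foaf/0.1/knows> ?q }"
url = STORE + "/services/sparql?" + urllib.parse.urlencode({"query": query})
print(urllib.request.urlopen(url).read().decode())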

Mind you, this is all very new, and Talis themselves have not yet defined their precise business model, but they are working on it, and making developer access free-for-the-asking for the time being. Clearly, there are many real-world issues to resolve, such as SLAs, privacy, and billing models, but the key notion here is that semantic data processing is quite generic, and we should not each be creating our own semantic servers to manage this data. In consumer-land, "Web 2.0" is creeping toward the linked web of data, and more people are (finally) starting to understand what TimBL was talking about with this 'semantic web' stuff. Now, as ESS systems proliferate, is the time for those of us who glue these systems together to take advantage of what semantic web technology can do for us, and to skip the server set-up part by using a system such as the one Talis provides.

See the Talis.com website and their developer wiki (n2.talis.com) for some overview articles and a taste of how to interact with a Talis data store. A future post will include some of my experiments with the system.

Monday, January 21, 2008

SimpleDB From Amazon Web Services, Part I

I just heard about a great new website called Amazon.com! I don’t know what ‘Amazon’ has to do with selling things, but they have stuff for sale there. Tell your friends you heard it here first! I pride myself on staying hip to startups like these.

A little reading at this “web” site leads me to find that they offer more than a paltry assortment of books, CDs, power tools, sporting goods, etc. Turns out they also have web services for sale for virtual hosting, disk storage, and recently, a database. The database-in-the-sky notion piqued my interest, so I spent some time working with this new service. What follows is an overview only-- in “part 2” on this topic, I’ll talk more about code details, but for now, the goal is to discuss what’s interesting about the beta-level offering called "SimpleDB".

If you are not familiar with Amazon Web Services (AWS), the general idea is that many fundamental computing infrastructure components ought to be available on-demand in the network (or, more fancifully, in the “cloud”). Amazon has a gigantic infrastructure, proven ability to manage it, and, maybe more importantly, they also have a gigantic billing infrastructure. Thus, Amazon can cost-effectively provide virtual machines, disk storage, message queuing, etc. by reselling the bits and pieces of infrastructure that fall out on their machine room floors.

SimpleDB is an Amazon Web Service. All of Amazon's web services are structured as pay-for-what-you-use. There is no startup cost, and you pay tiny amounts for transactions, and for the storage you eventually use. It's a perfect long-tail kind of scenario. For all those people who need to maintain “only” a few thousand records in a table and don't want to run a system with a database on it (and deal with backups, power, redundancy, optimization, etc.) this sort of service is perfect.

I also find it interesting that they made the decision not to just present an interface on top of a standard relational database model.

Simple is Good

Most of us geek types would over-engineer this and create a multi-tenant database instance using Oracle or Postgres or something. The API would consist of ways to send strings containing SQL to such a service and get back XML chunks of data. But if you think about it, most applications, especially smallish web-based applications, have schemas that are not particularly complicated. Naturally, there are times when you need your own database with thousands of tables, referential integrity, and smart DBAs to keep it all tuned and running. With SimpleDB, AWS is betting that the vast majority of “Web 2.0” applications will be applications with simple data needs.

This is the sweet spot that Amazon is trying to capture with SimpleDB. You do not submit SQL strings; you simply perform something like name-value pair (attribute) gets and sets, and the underlying system performs a spell to produce your data. Offering a service like this gives users a simple interface for simple operations, plus high reliability (in theory-- it's just beta now!). Amazon wins too, because they then have a controllable, revenue-generating service.

Of course there are issues regarding service-level agreements (SLAs) if you are going to run your business on such a service, and that's why Amazon has such long beta programs-- so they can evaluate the usage patterns users create and figure out what they can reliably support. The stinkers, they still charge for the beta while using us early adopters as guinea pigs. How come I don’t get a discount for this?

Not Normal

So how can you possibly reap any value from what amounts to a “.properties” file in the cloud? DBAs recoil in horror at the idea of production data that looks more like FileMaker or Excel than Oracle. But if you make efficiency someone else's problem and don't care how “efficient” it is or isn't, then as long as you get your data back “soon enough”, it's all good.

Consider the following data set for a training log application. Some days I run, and some days I ride my bike. I have different data points for each type of workout. Normally (ha, a pun) I would create a table to hold this stuff like this:

Key, date, type, distance, time, heartrate, route

Which leads you to "CREATE TABLE workout (blah blah blah)" and "INSERT INTO workout VALUES (blah, blah, blah)" and the usual SQL hoops. Since I have about 20 different routes that I run or ride, I would likely end up with another table to hold route data with a key relationship, and you know the rest.
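For contrast, here is roughly what that normalized, two-table version looks like-- a quick sketch using Python's built-in sqlite3 module, with columns I invented for illustration:

import sqlite3

conn = sqlite3.connect("training.db")
conn.executescript("""
    CREATE TABLE route (
        route_id INTEGER PRIMARY KEY,
        name     TEXT
    );
    CREATE TABLE workout (
        workout_id INTEGER PRIMARY KEY,
        date       TEXT,
        type       TEXT,  -- 'run' or 'road ride'
        distance   TEXT,
        time       TEXT,
        heartrate  INTEGER,
        route_id   INTEGER REFERENCES route(route_id)
    );
""")
conn.execute("INSERT INTO route VALUES (1, 'lake loop')")
conn.execute("INSERT INTO workout VALUES "
             "(323, 'Jan 12, 2008', 'road ride', '53km', '1:50', NULL, 1)")

# ...and the join you are now stuck with whenever you want the route name:
for row in conn.execute("SELECT w.date, w.distance, r.name FROM workout w "
                        "JOIN route r ON w.route_id = r.route_id"):
    print(row)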

The big advantage of SimpleDB is that it's, well, simple. Everything looks like a name-value pair. As long as you can grab something by its key, you can set a name-value pair for that item. Even better, a given attribute for a given key can have multiple values. This is a trick that is sort of cheating in Relational Dataville-- you know, where you put a “magic string” (e.g., “Red, Green, Blue”) into a column value which then gets interpreted in code after it’s extracted, or parsed at query time by an incomprehensible stored procedure. SimpleDB treats this case as typical, and optimizes around it. So, in that case where you might have several potential values for a single attribute (think column), you just set that value too. The effect is that an attribute named “color” can have a query-able value of both “Red” and “Green” without having to make a separate table to achieve join-like behavior.

Another aspect that is a little jarring at first is that each item in your store can have its own collection of attributes. If you want to aggregate similar items, make sure they share an attribute that you can use for grouping. A lack of structure allows you to make monumentally messy databases because queries are at the mercy of your ability to follow conventions, but I think it’s a nice balance between perfectly normal square data, and a sparse matrix in Excel. That is, for every item you put into your SimpleDB database, if you want things to answer to queries that need to understand “Category”, make sure you provide an attribute called Category for each item.

Here is an illustration using (gasp) lisp-like notation (apologies to lisp purists):
(itemKey (name value) (name value) (name value) …)

E.g.,
(323 ('date' 'Jan 12, 2008') ('type' 'road ride')
('distance' '53km') ('time' '1:50'))

I.e., for unique key ‘323’, set a property called 'date' to Jan 12, a property called 'distance' to '53km', etc.

Earlier I said that you can assign multiple values to an attribute. So, using the training example, I might want a property which lists the songs I was listening to during the workout. In square-table land, you'd have to create a second table with a primary key relationship and do a join to see all that data put together. Instead, I can add that data in all its multi-dimensional glory inline like this:

(324 ('date' 'Jan 13, 2008')
('distance' '53km')
('time' '1:50') ('music' ('tom waits' 'david byrne' 'cat power')))
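To make that concrete, here is a minimal sketch of those two items using the boto library's SimpleDB support (my choice for illustration-- the real programmatic interface is the subject of part 2, and the domain name here is made up):

import boto

# Credentials come from the environment or a boto config file.
sdb = boto.connect_sdb()
domain = sdb.create_domain('training-log')

# Item 323: plain name-value pairs, no schema declared anywhere.
item = domain.new_item('323')
item['date'] = 'Jan 12, 2008'
item['type'] = 'road ride'
item['distance'] = '53km'
item['time'] = '1:50'
item.save()

# Item 324: a multi-valued attribute is just a list-- no second table, no join.
item = domain.new_item('324')
item['date'] = 'Jan 13, 2008'
item['distance'] = '53km'
item['time'] = '1:50'
item['music'] = ['tom waits', 'david byrne', 'cat power']
item.save()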

Seem interesting? In the second portion of this piece, I’ll talk more about queries and what the programmatic interface is like.

Comments, questions, and corrections are always welcome. Thanks for reading.

Monday, December 24, 2007

I am So Old

Two things happened recently which remind me that, considering how fast things are changing, I am really old. Many have noted that most kids today don't have any idea about vinyl records, even though the needle-across-the-record sound effect is so common in TV shows for the young-uns. I don't think my kids would even recognize an LP if they saw one. For that matter, I realize that calling it "vinyl" puts me into a particular generation. My mom would probably point out that I wouldn't recognize a lacquer phonograph disc if I saw one...

I digress. It's telephones that are really really different. We are seriously spoiled by cell phones now, and most kids have no concept of having to stop and find a pay phone (that's a good thing-- I certainly don't miss that particular annoyance). I'll bet you cannot find a 15-year old who can even recognize a pay phone, much less know how to use one. How about phone booths? What would Superman do today? Head for Starbucks I guess.

Alexei, my 10.9-year-old, asked me yesterday, "What is that thing on the phone that people put their finger in and turn it in a circle?" Haw! I told him the technical details are too much to go into (I look forward to the day when I can explain to him in detail why that particular user interface was chosen) but I said that is how people used to dial telephones. Weird, huh? Then I realized that we still use the word dial, but I haven't found the dial on my cell phone yet. In fact, I said, all telephones used to be like that.

I did not go on to say that those phones HAD to be connected to the wall by a wire. Hey, people-my-age, remember when you had to get the really long coily cord for the handset so you could at least walk around the kitchen while on the phone? When I was in college, my girlfriend and I had a phone which had a really long cord to the wall, and a really long cord between the handset and the base. The long cord to the wall was mostly transparent, so there were spare loops of it always getting hooked on someone's foot, which would spectacularly rip the phone off of whatever shelf it was resting on. This led us to label the phone the "Attack Phone" because it would ambush you when you walked by.


A few minutes ago, I heard a busy signal in a movie. Here's another thing that kids won't understand. Voice mail and call waiting are pervasive, so there is essentially no time at which a kid will hear a busy signal and know what it means. What, we are not busy anymore? For that matter, younger kids use cell phones so much, and land lines less and less, that the whole notion of a dial tone is unknown. What other sounds of early telephony are already lost?

Friday, December 14, 2007

Helio Ocean Review

I was a late adopter of cell phones in general, and I still don't use mine very much. But when it works, it's amazing. We've all forgotten just how amazing it is to have a telephone with you at the park to call a friend, or to call in a smoothie order from the freeway. Portable phone, good. How about portable internet on a 1-inch screen? Ugh, no. The whole WAP internet thing was a joke to me. In a pinch, accessing the internet on a vanilla cell phone does work, but it's sort of like eating a bowl of cereal with tweezers.

With the advent of "smart phones" or "internet enabled phones" or whatever the current term is, I became curious, but still skeptical. Sure, you can get a BlackBerry in color, but the internet still sucks and is virtually unusable. Apple's device finally demonstrated that it's possible to have a useful experience on a tiny machine. Good for them, and good for us, because the pressure is really on all the other vendors to provide an internet experience that is not completely compromised. I would have purchased an iPhone, but Apple just had to do that deal with AT&T-- my experience with their service in my area was so bad that a hard association with AT&T was a deal killer. No matter how nice the internet, if I cannot reliably use the thing as a phone, forget it.

I bought a Helio Ocean to see if it would work for me. I had seen a demo of it, and read a nice piece about it and its design in Technology Review (http://www.technologyreview.com, registration required, but worth it). What follows is my take on this device after a couple of weeks of use. Short summary: excellent value, very good software, worthwhile machine.

The Helio people giddily point out that theirs is the first dual-slide phone out there-- if you slide it along one axis, it's like a phone, and slide the other way, it's a thumb-qwerty keyboard. I suppose this is a differentiator, but, for me, it's just more moving parts to break. You can read all the official features at the Helio web site: http://www.helio.com.

  • Internet: good, very usable. It rates only "good" because the iPhone, with its larger screen and touchscreen interface, is just flat better; the Ocean is definitely the next best thing. There is a stock browser that comes with the phone, and Opera Mini is available for the phone as well (also free) and works well. The screen is large enough to view pages as HTML, but of course you have to scroll around a lot. One sensible feature is that the machine is set up to assume that if you open the keyboard and start typing something, then your goal is to perform an internet search. So by typing something in and hitting the "go" button, you get a tab-summarized search of Yahoo!, Google, Wikipedia, Amazon, etc. Sensible and useful.

  • Music/Video Player: very good. The stock Ocean has a few hundred megabytes of available storage to which you can easily copy mp3 files. I also added a 2GB microSD card ($35! cheap!) so I can carry several CDs' worth of music around with me. The audio quality on playback is just fine, but (as expected) my Creative Labs Zen is a bit better. I have been using the Ocean to carry around Jon Udell's podcasts and it's perfect for that. You can view video from YouTube on it, but the quality is not great. YouTube video quality is not fabulous to begin with, and once you down-level it for mobile use, it's even worse. But it works, and that alone is pretty amazing. Because YouTube is an endless source of bizarre entertainment, if your smartphone's main purpose is "interstitial time killer" then this feature is huge.

  • User Interface: very good. Because it's not a touch screen, there are physical buttons involved, and this means that you have to do lots of scrolling with what amounts to keyboard equivalents on a cell phone, rather than using the mouse (or your finger) to point and click. Both the Opera Mini browser and the Helio browser do a decent job of this click-click-click-select thing. Happily, there are several places where the "soft" buttons that surround the phone do the "right" thing; i.e., the thing you are most likely to do is associated with a button which is just clicked with a thumb. Another thing which the designers executed well was adjustment to the phone's orientation. Recall that this phone acts like a cell phone when you slide it longways, and more like a BlackBerry when you slide it out sideways. All the applications on this device do the right thing when you switch orientation, even down to the buttons. E.g., if the lower right button is "go" in landscape mode, if I flip it the other way, the lower right button is still "go" even though it's now a different physical button. Make sense? Probably not-- so suffice to say the Principle of Least Surprise is normally in effect on this phone. Well done.

  • Phone: excellent. Oh yeah, you can also use it as a telephone! Works great, clear sound, good microphone, tolerable speakerphone. Easy to redial, return calls, call from history, all that works great. The stock ringtones are uniformly obnoxious to the extent that you are compelled to go purchase something different, which I resent. Therefore, my phone is on buzz most of the time. Someday I will get a different ringtone. But even if I get around to downloading one, I simply don't want some top-40 song playing when the phone rings, or some cheesy 8-bit synth tune. Maybe if I got some Frank Zappa, or maybe the sound of a stomach rumbling...

  • Service Plan: very good. It's simple, like they all can and should be. Helio charges a competitive flat rate for the service I get, which includes the internet, text messaging, pictures, etc., and there are more-inclusive plans for fanatical users, and good "family plans". Alas, it's a 2-year deal, which I don't like, but oh well. Signing up is really simple, and you can do most of it online. I didn't have a great experience activating the phone because the overseas guy who helped me was on (I think) his first day and barely knew what he was doing. To his credit, he tried really hard and was fun to deal with-- youthful exuberance and all that. I had to call back to get my phone activated, because in fact he hadn't actually turned it on.

  • Camera: very good. Why the heck do I want to have a camera in my telephone? How about a cheese grater built into that gas grill? I admit, I am sufficiently old to have taken a long time to understand or appreciate the idea of having a camera on a phone. The gen-Y group got it right away, because they all have phones with them all the time. So of course, when you have a camera and you are with friends, it's a perfect match. And it's just plain nice to have a simple camera with you all the time, which you do when it's stuck to your phone. The camera on the Ocean is really very good, and the UI is sensible. Plus, it has a video camera which is nicely integrated with the YouTube application, so you can capture a decent-quality video and post it straight away. The newly-released YouTube viewing application has a really nice ajax-y feel to it, which I found to be quite usable. This particular application convinced me that the platform (which is SK Telecom's, for those of you interested in that sort of thing) is capable of supporting nice-looking and usable applications.

  • Reliability: good. I can only give this a "good" rating because I have seen the phone crash completely due to an application error. Of course there are some bugs, which I, as an early adopter, happily accept. It reboots quickly (15-20 seconds). I subscribe to Jerry Pournelle's observation that "any error rate large enough to measure is too high" for this sort of thing. An app can crash, but it shouldn't kill the OS. We as consumers need to insist on OS stability even with nutty applications.

  • Battery: good. Naturally, if you use the internet and camera a lot, the battery goes down relatively quickly if you judge it as a cell phone. The phone charges on USB or on the dedicated charger which comes in the box, and the battery life is great if you are only using the telephone. I have seen the charge plummet though while using the camera-- where it goes from apparently 80% to turning itself off in the course of 5 minutes or so. It doesn't normally do that with the camera, so I am thinking something went off in the weeds that day. I am anxiously watching for that to happen again.

  • Custom App Dev: not good. Because I am a developer, one of my first questions is "how can I create an application for this thing?". I dug deep on the internet trying to find an answer to this-- I know that the platform is SK Telecom's, so surely there is a dev community out there, and a way to put apps on the system. Well, no. An email to the Helio people requesting access to their SDK and documentation said, in effect, "pitch your idea and business plan first, and we'll decide if you can have access to the docs." Sorry, this is just plain stupid. Not that I have a million great ideas, but you have to understand the platform's capabilities and limitations to completely form a vision of what an application can do. I guess I'll just have to make something up and send it in, or, more likely, just target the iPhone.

  • GPS: The phone has GPS on it, but I am not enough of a GPS geek to know what chipset is being used, or what resolution it provides. What I have seen is the Google maps application on which the phone overlays your current position. It's a small thing, but a natural and extremely useful combination of features. For example, you can put in an address you want to get to and the phone knows where you are. Not helpful when I am at my house, but huge when you are lost in the urban jungle. Garmin has an app that runs on the Ocean which provides audio turn-by-turn directions, so for some extra bucks, you can add the features of a full-featured standalone GPS unit to your telephone. Very cool.

    All in all, I like this thing. There are lots of features on this phone which I may never use and didn't mention, but be sure to visit the Helio site for a complete rundown. I use the telephone while out and about, check my Yahoo! mail with it, send text messages with two thumbs like never before, and have even done some internet RSS reading on the bus with it. There are many cool and useful features-- even for my demographic. Now I can finally accept that it's possible to access the internet from a super-portable device in a useful way.

    Wednesday, November 21, 2007

    PowerShell V2.0 Background Processes: Tiny Overview

    Just Now?

    For many years I have hoped for some construct in Windows that works as well as the *nix way of putting long-running processes in the background. There are ways to hack it in Windows, but all of them seem to have issues. Essentially, if you are running a long-running application from a console with Windows, no matter how you slice it, you are going to have a console window somewhere. Or you could make your operation a proper first-class Windows service, but that's hardly helpful for a long-running WSH script. Happily, we now have backgrounding in PowerShell V2.0, along with lots of other new features. What follows is a slice of how the background jobs known as PSJobs in PowerShell behave.

    Typical of PowerShell, the implementers didn't just copy what was done in *nix, they took the opportunity to make things better, or at least more complicated. With *nix (Trademarknix?), capturing output from background jobs was sort of messy with that 2>&1 approach to redirect stdout and stderr, and then you still have to redirect that output somewhere. Even when you manage to track all the output of a process, once jobs are put in background there isn't much else you can do except kill them.

    Consistent with the PowerShell Way, creating a background process actually returns a handle to an object with methods and properties. This handle also goes into a table that can be inspected and manipulated directly or as part of a pipeline. This is a distinct improvement over the traditional *nix style, where the job is placed into an opaque system process table which is more complicated to access. Of course in *nix you can get the process ID of a running process by parsing the output of the 'ps' command, but that metadata on the process doesn't get you much. PowerShell does it better. Read on for a simple example.

    There is a family of PSJob cmdlets, shown here for fun:

    • Start-PSJob
    • Get-PSJob
    • Receive-PSJob
    • Stop-PSJob
    • Wait-PSJob
    • Remove-PSJob

    This post will focus on Start-, Get-, and Receive-PSJob.


    Basic Tour


    What happens when you use this feature? We have to have some cmdlet to run in the background for demonstration purposes, and I picked get-service as an example even though it doesn't really run very long. It's a useful example because it returns somewhat familiar output.


    But First

    The PSJob infrastructure depends on the new PowerShell remoting feature, and, for that, you need WinRM installed and running on your system. You probably already have WinRM installed, but it may not be running, so you may have to start it up by visiting the services controller (services.msc) or by using start-service in PowerShell. If you don't have WinRM at all, you can download it here: http://www.microsoft.com/downloads/details.aspx?FamilyId=845289CA-16CC-4C73-8934-DD46B5ED1D33&displaylang=en . And of course you need to have uninstalled any PowerShell 1.x and installed the PowerShell 2.0 CTP bits. Note that PowerShell doesn't show up in XP on the add/remove programs list unless you have "show updates" checked.


    If the remote management service is running, you likely will also need to run a WinRM setup script delivered as part of the PowerShell 2.0 CTP. This is in $pshome and you can run it from a PowerShell prompt this way:
    PS C:\> & $pshome\Configure-wsman.ps1

    (mysterious output)


    Onward

    Do this in a PowerShell prompt:

    PS C:\> start-psjob -command "get-service"

    You should wind up with something like this; your numbers will of course be different, and your font, and the spacing, and your time of day, but do I really need to tell you that?


    PS C:\> start-psjob -command "get-service"

    SessionId  Name  State      HasMoreData  Command
    ---------  ----  -----      -----------  -------
    26               Running    True         get-service


    So what happened to the output of the get-service command? It gets pulled into the PSJob identified by SessionId in the table, for later retrieval. Note that the State is reported as Running. Start-PSJob returns to you the handle to the job you just launched. This handle is also stored in the PSJob internal table, which is accessible in the session using Get-PSJob.




    PS C:\> get-psjob
    SessionId  Name  State      HasMoreData  Command
    ---------  ----  -----      -----------  -------
    1                Failed     False        get-service
    3                Failed     False        get-process
    5                Failed     False        get-process

    ...many other failed attempts, because I had neglected to actually turn on the WinRM service, silly me...

    24               Completed  False        get-service
    26               Completed  True         get-service



    You can get the handle to the job either by grabbing it at launch time, like this:
    PS C:\> $svc_job = start-psjob -command "get-service"
    PS C:\> $svc_job.SessionId
    30

    Or by using the -SessionID argument to the get-psjob cmdlet:
    PS C:\> $svc_job2 = Get-PSJob -SessionId 30
    PS C:\> $svc_job2
    SessionId  Name  State      HasMoreData  Command
    ---------  ----  -----      -----------  -------
    30               Completed  True         get-service

    However you arrive at the handle, you have what you need to check the running state, the command issued, and a HasMoreData flag which is useful to see if the handle's output buffer has been drained. All this is good to know, but where did the output go?

    To locate the output for this run, use the Receive-PSJob cmdlet. This cmdlet takes the handle to the job and returns the output and error results of the execution. In the example below, it's just returned to the console via default ToString behaviors, but you would normally capture the Receive-PSJob output and do Important Work with it. Be aware that this is, by default, a "destructive" read in that it drains the buffers that it reads. (There is a -keep parameter for Receive-PSJob you can use if you want to read the buffers multiple times. For that matter, you can inspect and read from the member directly like this: $svc_job.ChildJobs[0].Output).

    It's just a little disconcerting at first glance:

    PS C:\> $svc_job | receive-psjob

    ...
    Running winmgmt Windows Management Instrumentation
    Running WinRM Windows Remote Management (WS-Manag...
    Stopped WLSetupSvc Windows Live Setup Service
    Stopped WmdmPmSN Portable Media Serial Number Service
    Stopped Wmi Windows Management Instrumentation...
    Stopped WmiApSrv WMI Performance Adapter
    Stopped WMPNetworkSvc Windows Media Player Network Sharin...
    Stopped wscsvc Security Center
    Running wuauserv Automatic Updates
    Running WudfSvc Windows Driver Foundation - User-mo...
    Running WZCSVC Wireless Zero Configuration
    Stopped xmlprov Network Provisioning Service

    And then running the same Receive-PSJob on the same handle:
    PS C:\> $svc_job | receive-psjob
    PS C:\>

    No output the second time we call it, because the first invocation already drained that output buffer. Note also that if you inspect the $svc_job handle after the Receive-PSJob call, $svc_job.HasMoreData is False.

    As I mentioned earlier, unless you are a one-liner alpha-geek in PowerShell, you'll probably use (like me) constructs like this inside of short scripts:

    $svc_listing = Receive-PSJob $svc_job

    And then process the resulting array to locate something by service name and check its state.
    Of course, the practical application of Receive-PSJob will depend on the output of the operation being backgrounded. The main goal here is to show the basic recipe to background a process, and what to do with the result. I haven't looked closely (yet) at how the results from backgrounded cmdlets are serialized, except to see that they are serialized, and that it warrants more investigation. In the example of get-service, it appears that a serialized version of a service handle is available, as opposed to a live reference to a service-- you wouldn't (for example) be able to invoke a Start() on a service after retrieving it with Receive-PSJob.

    That's one quick tour of one small part of the process backgrounding in PowerShell v2.0. I am very happy to see this feature, and it looks like they did a nice job with it. I am also very happy about the remoting feature, which I'll look at another time.

    Friday, October 06, 2006

    Some Comments from the MIT Emerging Technology Conference

    Jeff Bezos was one of the presenters at the conference I went to last week, and he managed to mention Amazon’s less glamorous “developer” products without sounding too much like he was marketing them. He pointed out that Amazon hopes to identify the fundamental technological problems it has solved for itself and generate revenue by offering those solutions as services. It’s clear that something interesting is happening here: fundamental infrastructure software as a service is actually becoming practical.

    Amazon S3: some of you may have heard about this, but it’s a remote file storage service for cheap. Highly available and probably limited only by your own bandwidth, you use Amazon’s public APIs to push your files to some unknown location and retrieve them on demand. Some comments (a code sketch follows the list):

    • Only a Big Dog like Amazon could do this, because of branding and comfort. ISVs like us could even use such a service for asset management and say that we use Amazon’s S3 service for reliable asset storage, or automate a regular database dump and push to an S3 location, transparent to the user. If it were Bob’s Remote Storage in Arkansas, people would not be likely to trust it, but Amazon has a good shot at getting this right.
    • For companies with high storage requirements and possibly little or no capital or expertise available to build a proper machine room, this is amazing enabling technology. TCO for SANs and such is rough, never mind the initial cost. One can make a really compelling price/performance argument to use a service whose cost scales linearly with the growth. In fact, the cost probably flattens out somewhat over time.
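    As a flavor of how simple the push-and-retrieve part is, here is a minimal sketch using the boto3 library (the bucket and file names are hypothetical):

    import boto3

    # Credentials come from the environment or ~/.aws/credentials.
    s3 = boto3.client("s3")

    # Push a database dump to some unknown location Amazon manages...
    s3.upload_file("dump.sql.gz", "my-company-assets", "backups/dump.sql.gz")

    # ...and retrieve it on demand.
    s3.download_file("my-company-assets", "backups/dump.sql.gz", "restore/dump.sql.gz")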

    Amazon EC2: This is the Amazon elastic computing cloud. At first, this just sounds like another grid technology for doing supercomputing. Actually, it’s a virtual system-on-demand kind of thing done remotely. One example usage is that you would create some server image for your web application, then request from EC2 ten load-balanced instances of that image; Amazon then does some magic and these systems are available to you. Some comments (a code sketch follows the list):

    • This is again huge for companies who may not have a machine room but need to handle spikes in traffic. Let’s say you put out a project release and you know you will get slashdotted. You can order up 50 extra machines of capacity *just for the week* surrounding your release. When the buzz wears off, you release those systems-- all of this without the ugly capital investment that is really hard to do well and that leaves a bunch of under-utilized systems taking up power and space in the garage. The cost is localized and incremental.
    • Bezos observes that 70% of application development goes into “heavy lifting” (what I call “dumb, annoying, and necessary”) things like storage, system administration, and (for Amazon) shipping and packaging. Things like S3 and EC2 can remove those things from the equation for small companies. This supports the very compelling notion of micro-ISVs.
    • There was a time when only excellent mechanics were capable of owning automobiles. Now, even people like me can both own *and* drive a car, imagine that. The point is this: lowering the barrier on dumb stuff can really enable people with good ideas to get their ideas implemented.
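    Here is what “order up extra machines” might look like through the boto3 library; the image ID is a hypothetical stand-in for your own server image:

    import boto3

    ec2 = boto3.client("ec2")

    # Ask for ten instances stamped from the same server image.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # hypothetical web-app image
        InstanceType="t2.micro",
        MinCount=10,
        MaxCount=10,
    )
    instance_ids = [i["InstanceId"] for i in response["Instances"]]

    # When the buzz wears off, release them and stop paying.
    ec2.terminate_instances(InstanceIds=instance_ids)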

    Remote Web Search API: On 10/4, Amazon announced API-level integration with Alexa, a public web crawler; this may have been what she was alluding to, but suddenly searching the internet in vertical ways becomes more possible.


    (I found the following non-sequitur in my notes I wrote as the MIT staff was fiddling with some wireless mics which were dropping in and out: “how geeky do you have to be to land the a-v tech job at MIT?”)

    That’s just Amazon. This week, I have been doing a bit of reading, and look what else is out there (just to name a few):

    Yahoo BBAuth and Google Account Authentication: Yahoo and Google (probably others) have an “outsourced authentication” API available. Yahoo’s API is called BBAuth, or Browser-Based Authentication. This is nice for mash-ups, in that I can write a novel application that leverages Yahoo-resident content (e.g., the contents of my Yahoo photos site) without having to build any actual authentication code into my application. My app would (in a browser-based scenario) redirect to the Yahoo login page, where the end user provides his Yahoo username and password. The app gets a token back and can then access the user’s content, but the application is never responsible for the username, password, or profile data. Google’s API is shaped the same way, so that an application which accesses data contained in a user’s content area (such as Gmail) can do so without ever needing to know the user’s login or password. This holds good promise to allow my little application (whatever that is) to ignore the infrastructure required for solid authentication/authorization. I.e., no AD integration, no LDAP server to set up, no password security/aging support to write (yet again).
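    The shape of the flow is easier to see in code. Here is a deliberately generic sketch of the pattern; the endpoints and parameter names are hypothetical stand-ins, not the real BBAuth or Google APIs (both of which also involve request signatures I am ignoring here):

    import urllib.parse
    import urllib.request

    APP_ID = "my-little-app"  # hypothetical application identifier

    # Step 1: bounce the user's browser to the provider's own login page.
    login_url = "https://login.provider.example/auth?" + urllib.parse.urlencode(
        {"appid": APP_ID, "return_to": "https://myapp.example/callback"}
    )

    # Step 2: the provider redirects back to us with a short-lived token.
    def handle_callback(query_string):
        token = urllib.parse.parse_qs(query_string)["token"][0]
        # Step 3: use the token to fetch the user's content. The username,
        # password, and profile never pass through my application.
        url = "https://api.provider.example/photos?" + urllib.parse.urlencode(
            {"appid": APP_ID, "token": token}
        )
        return urllib.request.urlopen(url).read().decode()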

    Google Checkout API: Implementing a shopping cart/secure checkout process was once a big pain. I was around in the internet bubble and implemented several of these, and it was always way too hard. Had low-cost technology like this been around in 1997 or 1998, many many more good site ideas would have survived the crash. Anyway, as expected, this is a generic shopping cart application where the risky and tricky part actually belongs to Google. Sure, there are look and feel issues, but again, strong branding helps people feel better about blurring the lines between sites.

    All these are consistent with the pattern: infrastructure software services are mature enough to use in commercial applications, both as micro ISVs and Big Company ISVs (like us).

    (Thanks to Phil for being an excellent sounding board in Cambridge while assimilating this stuff!)

    TRETC (this tag is for the blog stuff-- they said to include this tag if you are blogging about the conference; it's an interesting experiment while I blog this at http://srehorn.blogspot.com/ to see what happens. Now I can go search for other people’s impressions of this same content. What did they get out of the same talk?)