Setup

Setup is...

Setup is, in its simplest form, the process of getting the resources (such as files and registry keys) that make up a software program transferred from a source (such as a CD or web server) and configured on a target machine.

Setup technologies are made up of software that falls into categories like installation engines (such as the Windows Installer), agents (such as the Automatic Update agent), and bootstrappers (just about every setup.exe today is a bootstrapper). Finally, acts such as patching, migration, and repair also typically fall into the domain of setup.

Now, let's go back and define some of those terms a bit more distinctly. First, let's define the all-important resources. I like to think of resources as the "physical building blocks" that make up a software program. Here's a short list of resources right off the top of my head: directories, files, registry keys, verbs, extensions, class ids, prog ids, typelibs, ini file settings, virtual sites, virtual directories, databases.

For example, almost every program out there is composed of at least one file. You'll find my favourite (remind me to explain why this word still has a "u" in it) example to point at is the Microsoft Office suite. In the case of the suite of programs that make up the Microsoft Office 2003 Professional product there are almost 4200 files. Another common set of resources in many Windows applications is registry keys. Microsoft Office 2003 Professional has over 14,000 registry keys that get configured.

Fundamentally, all of the stuff on your computer is a resource, application data, or user data. Hmm, two new terms introduced there. Let me quickly define them. Application data is the bits of information a program generates while it is running. Application data can be stored in files, registry keys, rows in a database table, and tons of other places depending on the type of application. Usually, application data is not "installed" but is almost always "uninstalled". I'll come back to this distinction later. User data is the stuff that programs generate for users. User data is that document you wrote for your boss last week, or the saved location in the game you were up till 5 am playing, or this blog entry for me. User data is typically very important to the user that created it. User data is never installed and should never be uninstalled unless the user says so.
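To make the install/uninstall rules for the three kinds of stuff concrete, here is a minimal sketch in Python. The names (`Kind`, `installed_by_setup`, `removed_by_uninstall`, `user_consented`) are made up for illustration and do not come from any real setup technology. It also flattens "usually" and "almost always" into plain yes/no answers, which is a simplification.

```python
from enum import Enum


class Kind(Enum):
    """The three categories of stuff on a machine (hypothetical names)."""
    RESOURCE = "resource"
    APPLICATION_DATA = "application data"
    USER_DATA = "user data"


def installed_by_setup(kind):
    # Only resources are laid down by setup. Application data shows up
    # while the program runs, and user data is created by the user.
    return kind is Kind.RESOURCE


def removed_by_uninstall(kind, user_consented=False):
    # User data is only removed when the user explicitly says so.
    if kind is Kind.USER_DATA:
        return user_consented
    # Resources and (almost always) application data are cleaned up.
    return True
```

For example, `removed_by_uninstall(Kind.USER_DATA)` returns `False` until the caller passes `user_consented=True`.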

To use the three words introduced thus far you might ponder this sentence, "Everybody at Microsoft is in the business of creating resources that, sometimes saving state in application data, try to make you more efficient at creating your user data."

User data is a fascinating topic with respect to setup. Rather than try to bloat this entry and cover more of the facets of user data, I'm going to queue a blog entry to discuss user data and move along with the definitions.

A source, the next item needing a definition on our list, is a place where you find resources. Today, most programs ship on a CD. So, the CD will be the source for the program's resources. In large corporations, it is very common to have a central server where the CD is copied and then multiple users can install at the same time. In those cases, the central server is the source. There are even some programs that can be installed from web servers. The original source (for example, the CD used to populate the central server) is sometimes also called the "original media" or just media. Sources are obviously very important during installation and keeping track of the source (or sources) for your program is important if you ever want to repair or update that program.

So, resources are copied from a source and installed on a target. Targets are machines (how I often refer to a computer). Most of us are relatively familiar with installing software on a "local target" (i.e. the machine your keyboard, mouse, and monitor are directly plugged into). However, you can also install software on a "remote target". Remote target installation is very popular in datacenters where there are hundreds or thousands of machines all locked away in some gigantic, well air conditioned room with only Ethernet cables coming out. Most large corporations also like to be able to install/update programs on the machines sitting in front of all of the employees in their company. Those are both examples of installing on remote targets. Targets and their importance are hopefully pretty self-explanatory so I'll move along.

Installation engines are programs responsible for managing the resources on your machine. These management tasks include adding new resources (install), fixing broken/deleted resources (repair, oh there's another definition for you), and removing resources (uninstall) from a target. I call these programs "engines" because they consume a file full of instructions and process each instruction in a predefined way. When I talk about compensating transactions later, I'll come back and explain why being an "engine" is so important.
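The "engine" idea can be sketched in a few lines of Python. This is only a toy under stated assumptions: the instructions are `(verb, file name)` pairs, the only resources are files, and the names `run_engine` and `HANDLERS` are hypothetical. It is not how the Windows Installer or any other real engine is implemented.

```python
import os
import shutil


def install(source, target, name):
    # Copy the resource from the source to the target.
    shutil.copyfile(os.path.join(source, name), os.path.join(target, name))


def repair(source, target, name):
    # Put back a resource that has gone missing from the target.
    if not os.path.exists(os.path.join(target, name)):
        install(source, target, name)


def uninstall(source, target, name):
    # Remove the resource from the target if it is there.
    path = os.path.join(target, name)
    if os.path.exists(path):
        os.remove(path)


# Each verb is handled in exactly one predefined way.
HANDLERS = {"install": install, "repair": repair, "uninstall": uninstall}


def run_engine(instructions, source, target):
    """Consume a list of (verb, name) instructions in order."""
    for verb, name in instructions:
        HANDLERS[verb](source, target, name)
```

The point of the design is that the instructions are data, not code. The engine decides how "install" happens, so every package gets the same behavior.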

Possibly the most popular installation engine on the Windows platform right now is the "Windows Installer." As noted in my introduction blog, I know quite a bit about this technology and will spend lots of time in future blog entries digging into the details of this technology. Other installation technologies (not all are engines) include winnt32.exe (where Windows setup starts), InstallShield and Wise installers (I don't know if either of them still support their old installers, but they each had an engine before the Windows Installer even existed), and batch files (which count as the first installation technology after plain old copy commands).

Now, if you're installing on a local target then you can just double-click on a file or program (like setup.exe) to get your installation started. However, if you're doing a remote install you don't have control of the mouse or keyboard on the remote machine, so you're going to need something to help you. Note, if you do have direct control of the machine, like through Terminal Server, that's basically just a local install and isn't interesting in this case. To do a remote install, you need an agent to kick off the installation engine on the remote machine. Agents are programs that either request install commands from a controlling machine (pull) or wait for a controlling machine to send an install command directly (push).
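Here is a rough sketch of the difference between pull and push, with no real networking involved. `Controller`, `PullAgent`, and `PushAgent` are hypothetical names, and the "engine" is just any callable that accepts a command.

```python
from collections import deque


class Controller:
    """Stand-in for the controlling machine; holds queued commands."""

    def __init__(self):
        self.pending = deque()

    def queue(self, command):
        self.pending.append(command)

    def next_command(self):
        return self.pending.popleft() if self.pending else None


class PullAgent:
    """Asks the controller for work whenever poll() is called."""

    def __init__(self, controller, engine):
        self.controller = controller
        self.engine = engine

    def poll(self):
        command = self.controller.next_command()
        if command is None:
            return False
        self.engine(command)  # kick off the installation engine
        return True


class PushAgent:
    """Waits until the controller sends it a command directly."""

    def __init__(self, engine):
        self.engine = engine

    def receive(self, command):
        self.engine(command)
```

In the pull case the agent initiates the conversation. In the push case the controller does, which means the agent has to be listening for it.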

If you are running Windows 2000 or Windows XP then you have a pull agent on your machine. This agent, often referred to as the Automatic Update Agent, is responsible for checking with Windows Update to keep your machine updated with the latest patches. The agent that comes with the Automated Deployment Services is a push agent that sits around waiting to execute the commands from the ADS controller. Obviously, agents have quite a bit of power and communicate off the local machine. Those two attributes make agents meaty targets for hackers. I'll talk more about this later.

Bootstrappers are, as far as I'm concerned, an unfortunate necessity. Bootstrappers get their name from the expression "pull yourself up by your own bootstraps" which means "to improve your situation in life by your own efforts". Bootstrappers basically do the work to get the installation engine started. Bootstrappers have quite a bit in common with agents but have historically remained independent of each other (although I may make suggestions later on how this could change). As mentioned above, if you've ever clicked on a program called setup.exe chances are high that you've dealt with a bootstrapper.
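As a sketch only, a bootstrapper's job might look like the function below. The prerequisite check (making sure the engine is present before starting it) is an assumption about what "the work" typically involves, and `bootstrap` and its parameters are hypothetical names.

```python
def bootstrap(engine_present, install_engine, start_engine, package):
    """Get the installation engine running for the given package.

    The three callables are supplied by the caller: one checks for the
    engine, one installs it, and one hands the package over to it.
    """
    steps = []
    if not engine_present():
        # Assumed prerequisite step: put the engine on the machine first.
        install_engine()
        steps.append("installed engine")
    start_engine(package)
    steps.append("started engine")
    return steps
```

Once the engine has been started, the bootstrapper has nothing left to do. All the real installation work belongs to the engine.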

Finally, I've already touched on repair but patching and migration deserve a bit more attention. Patching is the process of finding already installed resources and applying the appropriate bits to transform the resource into another resource. These bits are called a "binary difference" because you get them by calculating all of the differences between two resources. Patching is primarily useful when you are bandwidth constrained and can't send a full copy of the new resource. Those of you reading this blog over a 56K modem know exactly what I'm talking about.
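To illustrate the binary difference idea, here is a minimal sketch built on Python's standard `difflib.SequenceMatcher`. The patch format (a list of "copy" and "data" operations) and the names `make_patch` and `apply_patch` are invented for this example. Real patching tools use their own, far more compact delta formats.

```python
from difflib import SequenceMatcher


def make_patch(old, new):
    """Describe `new` as copies from `old` plus the bytes that differ."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # These bytes already exist on the target; just point at them.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Only these bytes need to travel over the wire.
            ops.append(("data", new[j1:j2]))
        # "delete" needs no entry: the old bytes are simply not copied.
    return ops


def apply_patch(old, ops):
    """Transform the installed resource into the new resource."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Calling `apply_patch(old, make_patch(old, new))` gives back `new`. When the two versions are mostly the same, the "data" operations carry far fewer bytes than a full copy would.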

Migration is all about preserving user data from one version of a program to the next. If you think about Office there are lots of settings that you can configure to get Word or Outlook or whatever to look exactly the way you like. Office is particularly good about keeping those settings even after you've upgraded from Office 2000 to Office XP to Office 2003. Migration is a rather tedious process, IMHO, but there are some tricks you can play to make life easier. One day when I'm really bored, maybe I'll write about those.

So there we go. A rather long blog entry (3 pages according to Word, eek) that hopefully provides a communication foundation for future blog entries as well as leaving plenty of fertile ground to cover.