Fairly Symmetrical

Programs vs. Data

10:33 AM by Eric: Programs vs. Data Tech

Ed Felten has an ongoing series of posts about the difference, if any, between "programs" and "data" in modern computing. The distinction, at least in current usage, can be very subtle and not at all intuitive.


Ed says at one point:

Maximillian Dornseif asks how one can draw the line between programs and data. This is important because the law often treats the two differently. He concludes that no clear line can be drawn.

This is a … difficult question … Some cases are easy. The English text of this paragraph is data. Microsoft Word, considered as a whole, is a program.

Then he gets into some slightly more complicated examples (formulas in spreadsheets, macros, etc.). In his latest post, he brings up the case of programs which are never intended to be run; such programs are, as he points out, often used as a logical proof instead of as executable programs.

Personally I think a program is anything which, when loaded into memory, is executable. This means that macros, formulas, and even most scripts written in scripting languages are not programs. In fact, this distinction means that even a program in source form is not actually a program! Loading

    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        printf("Hello world!\n");
        return 0;
    }

into memory will result in garbage: the processor would try to treat the bytes of the text as instructions, so it's not an executable program. In fact, the above is data which will be fed to a program (a compiler) whose binary output, loadable and executable, will be a program.
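To make that a little more concrete, here is a rough sketch of my own that holds the source above in an ordinary character array and checks that every byte is printable text. The helper name looks_like_text and the heuristic itself are assumptions for illustration only; this is not a test anyone has proposed for legal purposes, and plenty of data is not printable text either.

```c
#include <ctype.h>
#include <stddef.h>

/* The hello-world source, held as plain bytes. To the machine this is
 * just a character array, no different from any other piece of text. */
static const char hello_source[] =
    "#include <stdio.h>\n"
    "int main(int argc, char *argv[])\n"
    "{\n"
    "    printf(\"Hello world!\\n\");\n"
    "    return 0;\n"
    "}\n";

/* Hypothetical heuristic: returns 1 if every byte in buf is printable
 * ASCII or whitespace, and 0 as soon as any other byte is found. */
static int looks_like_text(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (!isprint(buf[i]) && !isspace(buf[i]))
            return 0;
    }
    return 1;
}
```

Calling looks_like_text on hello_source (passing sizeof hello_source - 1 to skip the terminating NUL) reports text, whereas the first four bytes of a typical Linux executable begin with 0x7f, which is neither printable nor whitespace. The check only says the source is text; it takes a compiler to turn it into something loadable.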

I think this is the most conclusive, simplest way to define the difference between program and data. It allows the special treatment of programs/software by law, in a manner that can be clearly and objectively determined. Programs can be handled one way, data another.

This distinction is not without its problems. It runs the risk of identifying something as a program sometimes and as data at other times; for instance, something which is a program on a Macintosh will not be a program on an x86 computer because the executable formats (among other things) differ. Compiled Java or .NET code might not ever be a program, given that its compiled form is an intermediate language which must then be either just-in-time compiled for the current platform or else run through a virtual machine (in exactly the same manner as a script is run through its interpreter). These problems could possibly be addressed by redefining programs as anything which, when loaded into memory, could be executed by an appropriate processor, but I wonder if this opens the barn door a little wide.
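The platform problem shows up as soon as you look at the first few bytes of a file. The following is a minimal sketch, assuming we only care about a handful of well-known leading "magic" bytes; the enum and the function classify_image are names I made up for this post, and a real loader checks far more than this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels, used only in this sketch. */
enum image_kind {
    IMAGE_UNKNOWN,
    IMAGE_ELF,       /* Linux and most Unix systems */
    IMAGE_PE,        /* Windows: the file begins with the DOS "MZ" stub */
    IMAGE_MACHO,     /* Mac OS X, 32-bit, single architecture */
    IMAGE_CAFEBABE   /* ambiguous: Mach-O fat binary or Java class file */
};

/* Guess the executable format from the leading bytes of buf.
 * The 64-bit Mach-O magic is left out to keep the sketch short. */
static enum image_kind classify_image(const unsigned char *buf, size_t len)
{
    static const unsigned char elf[4]      = {0x7f, 'E', 'L', 'F'};
    static const unsigned char macho_be[4] = {0xfe, 0xed, 0xfa, 0xce};
    static const unsigned char macho_le[4] = {0xce, 0xfa, 0xed, 0xfe};
    static const unsigned char cafebabe[4] = {0xca, 0xfe, 0xba, 0xbe};

    if (len >= 4) {
        if (memcmp(buf, elf, 4) == 0)
            return IMAGE_ELF;
        if (memcmp(buf, macho_be, 4) == 0 || memcmp(buf, macho_le, 4) == 0)
            return IMAGE_MACHO;
        if (memcmp(buf, cafebabe, 4) == 0)
            return IMAGE_CAFEBABE;
    }
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return IMAGE_PE;
    return IMAGE_UNKNOWN;
}
```

The same bytes get a different answer depending on which machine is asking: a file that classifies as IMAGE_MACHO is loadable on a Macintosh and is just data to a Windows loader. The IMAGE_CAFEBABE case is the Java problem in miniature, since compiled Java class files and Mach-O fat binaries happen to share the same four leading bytes.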

In a trade sense, obviously we will continue to treat certain kinds of data (for instance, source code) as if it were the program it will produce, but in a legal sense I think this distinction could be quite helpful.

This work is licensed under a Creative Commons License.

This page was last updated Sun 23 September 2007 at 09:00 AM CDT