Thursday, 29 January 2009

Fragment complete

The previous blog entry is the complete chunk #2 – An Introduction to Java – the whole 2,500ish words. Feel free to add your comments and suggestions.

Thursday, 1 January 2009

An Introduction to Java - FINAL VERSION

In this fragment we will introduce you to Java, the programming language that is at the core of the Processing development environment.

We will first review some fundamental concepts that will help you to understand the features that have contributed to the success of Java, setting it apart from other programming models.

Next, we will wind back the clock and briefly review the events that led to the creation of Java from a historic perspective.

To conclude, we will take a quick look at some aspects of the Java syntax by examining a very simple Processing program.

You don’t have to know how a car works to be able to drive one; but if you do know the basics of combustion engines, transmission systems, braking systems, and tyre grip physics, it will make you a better – and safer – driver. The same applies to programming. So let’s start this introduction by looking at what is going on under the bonnet of our PCs when we run a computer program.

Everything You Always Wanted to Know About Computers, but Were Afraid to Ask

Let’s face it. For all their wonderful modern multimedia capabilities, computers are, at heart, nothing more than number-crunching machines.

Inside every PC sits its single most important piece of hardware: the microprocessor, also known as the Central Processing Unit, or CPU. The CPU is responsible for executing program instructions and manipulating data. But there is one catch: CPUs understand only binary numbers (sequences of zeroes and ones). Program instructions and data expressed in binary format are commonly known as machine code.
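To make the idea of binary numbers a little more concrete, here is a small sketch in plain Java (not a Processing sketch) that prints a few everyday numbers the way they look in binary. The class name BinaryDemo and the method name toBinary are made up for this illustration; Integer.toBinaryString is part of the standard Java library. Don't worry about the syntax yet.

```java
// Hypothetical illustration: shows how ordinary numbers look in binary.
class BinaryDemo {
    // Returns the binary digits of a non-negative whole number as text.
    static String toBinary(int value) {
        return Integer.toBinaryString(value);
    }

    public static void main(String[] args) {
        int[] samples = {5, 70, 255};
        for (int n : samples) {
            System.out.println(n + " in binary is " + toBinary(n));
        }
    }
}
```

The number 5, for example, comes out as 101. Real machine code is made of much longer sequences of this kind.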

On the same circuit board where the CPU is located, you will find a bunch of silicon chips which make up your PC’s Random Access Memory (RAM). It is in the RAM that the machine code is stored for the CPU to fetch and execute. What happens is this: whenever you ask your operating system (OS) to run a program, say, by double-clicking on the program’s icon, your OS reads the relevant file(s) from disk and loads the program into RAM. Once the program is in memory, the OS can start feeding the machine code to the CPU for execution. The combination of a specific operating system version running on a particular type of hardware is often referred to as a platform.

Now you know (almost) everything a programmer needs to know about computer architecture; you are on your way to becoming a geek. But to write a computer program in machine code that the CPU understands would be a colossal task, even for the geekiest among geeks, because binary numbers don’t mean much to us humans. Instead, we humans like to communicate concepts through words and sentences. So we have a problem: computers and people “speak” two completely different languages.

Can We Have a Translator, Please?

If you need to communicate with a person who only understands Japanese, and you don’t know the first thing about the Japanese language, what do you do? You hire a translator, of course. And that’s exactly the solution that computer scientists came up with to solve the computer-human language mismatch: let’s create programming languages based on words and sentences that people understand (called high-level programming languages), and then use software to translate these “human-friendly” program files (the source code) into machine code for the CPU to execute. The translation process from source code to machine code is called compilation. There are two categories of software that perform compilation: source code compilers and source code interpreters.

Meet the Compiler

A compiler performs source code compilation before program execution, storing the resulting machine code on disk as an executable program. When the operating system loads a compiled program into RAM for execution, the CPU can execute it straight away without any delays since it is already in binary format. Compiled programs use the CPU speed to the full; they are said to be high-performing programs.

The C language and its object-oriented variant, C++, are probably the most significant examples of contemporary programming languages used in conjunction with compilers. C and C++ compilers, which originated on the Unix platform, are currently available on most platforms. More than that, most modern operating systems, including Windows, Linux and Mac OS X, are largely written in C.

The drawback of compiled programs is that, since each operating system communicates with application programs in its own specific way, they are not easily portable; a C++ program compiled for Windows, for example, will not work if you try to run it on a Linux box, and vice-versa, without modification.

Meet the Interpreter

An interpreter performs source code compilation during execution of the program (at runtime). When you request the execution of an interpreted program, your OS loads the interpreter into RAM, which, in turn, loads the program source code as-is and then compiles it one instruction at a time into machine code on-the-fly before passing it to the CPU for execution. The resulting machine code is never stored on disk, so every time an interpreted program is executed the compilation has to be performed all over again.

Interpreted programs are portable; in other words, once you have written a program in, say, Ruby, you can run it on any operating system for which there is a Ruby interpreter available, without having to change a single line of code.

The extra processing required by the on-the-fly compilation of the source code into machine code at runtime introduces a delay which makes interpreted programs run more slowly than compiled ones, all else being equal.

Meet Java - and the Amazing Virtual Machine

Java takes a different approach: it uses both a compiler and an interpreter. Java source code is first processed by a compiler, but the result of the compilation is not machine code as with traditional (native) compilers; the Java compiler produces something called bytecode. Bytecode is the closest you can get to machine code, whilst still maintaining platform-neutrality. But, unlike machine code, bytecode cannot be executed as-is; an interpreter is still required to run the compiled Java program. The Java interpreter is best known as the Java Virtual Machine, or JVM. Because the bytecode is already pretty close to machine code, the JVM can perform the on-the-fly compilation from bytecode to machine code very quickly, much faster than is possible with purely interpreted languages.
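As a rough illustration of the compile-then-interpret route, consider the tiny plain Java program below. The class name Adder and its add method are invented for this example; they are not part of Java or Processing.

```java
// Hypothetical example class used only to illustrate compilation to bytecode.
class Adder {
    // Adds two whole numbers and returns the result.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // Print the sum of 2 and 3.
        System.out.println(add(2, 3));
    }
}
```

Assuming you have a Java Development Kit installed and the code is saved as Adder.java, the command javac Adder.java runs the compiler and writes the bytecode to a file called Adder.class, and java Adder asks the JVM to execute it. The command javap -c Adder lists the bytecode itself; for the add method you should see a handful of instructions along the lines of iload_0, iload_1, iadd and ireturn, which is clearly not source code any more, but not yet machine code for any particular CPU either.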

As the bytecode is platform-neutral, compiled Java programs are entirely portable; you can run your compiled Java program on any platform for which there is a JVM implementation, without changing a single line of code. And the JVM's runtime execution of bytecode performs at levels close to those of programs compiled to machine code. Java manages to combine portability with high performance. And that’s no mean feat.

There are a couple of other services provided by the Java Virtual Machine to your programs that are worth mentioning for the sake of completeness only, as they are very advanced topics: garbage collection and multi-threading support. Garbage collection automatically frees up memory space no longer used by a program. Multi-threading support allows programmers to write multitasking software that performs better under certain conditions.

Meet the Java API (the Java Libraries)

Java comes with a large collection of ready-made software components providing a wide range of functionality that you can incorporate into your programs straight out-of-the-box. In computing parlance, such a collection of reusable software components is called a software library. And the way you access the functionality provided by a software library is by means of its Application Programming Interface (API). An API is essentially a well-documented set of component interfaces organized in a structured way. The range of functionality provided by the Java libraries is staggering. It covers the grouping of data (collections); utilities to manipulate text; mathematical operations; program input and output; networked programming; security; database connectivity; graphical user interface; 2D graphics; and a lot more.
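To give a flavour of what using the libraries looks like, here is a minimal sketch in plain Java that touches three of the areas listed above: collections, text utilities and mathematical operations. The class LibraryDemo and its two methods are hypothetical names chosen for this example; ArrayList, Collections, String.join and Math.sqrt are part of the standard Java libraries (String.join needs Java 8 or later).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example class showing a few ready-made library components.
class LibraryDemo {
    // Sorts the words alphabetically and joins them into one upper-case string.
    static String sortedUpper(List<String> words) {
        List<String> copy = new ArrayList<>(words);    // collections
        Collections.sort(copy);
        return String.join(", ", copy).toUpperCase();  // text utilities
    }

    // Returns the length of the diagonal of a rectangle.
    static double diagonal(double width, double height) {
        return Math.sqrt(width * width + height * height); // mathematical operations
    }

    public static void main(String[] args) {
        System.out.println(sortedUpper(Arrays.asList("oak", "java", "applet")));
        System.out.println(diagonal(3, 4));
    }
}
```

Notice that the sorting, the joining of text and the square root are not written here at all; the program simply calls components that the libraries already provide.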

A rich set of libraries makes a programming language more appealing as it allows programmers to produce complex software faster, with less code and higher quality. Few other programming environments can match the richness of functionality provided by the Java core libraries.

We have now completed our examination of the main features of the Java programming model. It is now time to wind back the clock, and look at the events that led to the creation of Java.

A Historic Perspective

By the early 1990s, computer networks had become an established reality. Thanks to advances in networking hardware and software technologies, it was both possible and viable to interconnect computers running different operating systems. Even more interesting was the possibility of interconnecting entire networks of computers. This is how the Internet was built. This networked computer model is known as distributed computing. Programmers were now required to write software that was capable of communicating over a network link with other software running on different computers and possibly different operating systems.

A newer software development paradigm, called object-oriented programming (OOP), had established itself as a superior way to tackle large, complex software projects. To make software systems more manageable, OOP splits them into objects; a software object groups together both the data and the set of instructions that operate on that data as a single entity. Objects can interact with one another to achieve the desired functionality. You will meet OOP concepts in detail later in the course.
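As a small preview, the plain Java sketch below shows one way an object can bundle data with the instructions that operate on it. The Flag class, its fields and its methods are hypothetical names invented for this illustration.

```java
// Hypothetical class: a flag object that carries its own dimensions (the data)
// together with the calculations that use them (the instructions).
class Flag {
    private final int width;   // data
    private final int height;  // data

    Flag(int width, int height) {
        this.width = width;
        this.height = height;
    }

    // Instruction operating on the object's own data: surface area in pixels.
    int area() {
        return width * height;
    }

    // Another instruction: true when the flag is wider than it is tall.
    boolean isLandscape() {
        return width > height;
    }

    public static void main(String[] args) {
        Flag flag = new Flag(350, 210);
        System.out.println("Area: " + flag.area());
        System.out.println("Landscape: " + flag.isLandscape());
    }
}
```

Code that uses a Flag does not need to know how the width and height are stored; it simply asks the object for its area.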

The dominant object-oriented programming language at that time was C++, partly because of the popularity of the Unix operating system as the backbone for most distributed financial, scientific and engineering applications. C++ was the result of adding object-oriented features to the C language. The resulting pastiche was a powerful but pretty tricky language to master, and therefore error-prone. The lack of standard libraries providing functionality required in real-world applications did not help either.

The Internet itself was gathering momentum. Although it had been around for some twenty years, offering services such as e-mail and file transfers, it only started to gain widespread popularity when the first user-friendly Web browser, called Mosaic, was created in 1993.

To summarize, the computing landscape in the early 1990s was characterized by the following factors:
  • Distributed computing, mixing disparate hardware and operating systems.
  • The growing popularity of the World Wide Web.
  • Network-oriented and object-oriented software development.

Meanwhile, at a Sun Microsystems research facility in Menlo Park, a team of technologists was working on the research and development of a platform-independent programming language that could be used to control any sort of programmable appliance. This language was code-named Oak by its inventor, James Gosling. The team marketed their creation to the digital cable television industry, without success. But they saw an opportunity in the growing popularity of the World Wide Web. The Oak language, now renamed Java, could be used to run interactive software applications (called applets) embedded in a Web page by plugging a JVM into a Web browser. The team built such a JVM-equipped browser, called HotJava, and used it to demonstrate the new technology at an industry event in 1994. Such was its success that Netscape later agreed to integrate Java into its Navigator Web browser, by far the most popular Web browser around. In May 1995, at the SunWorld exhibition, Sun showcased Java applet technology, and announced Netscape’s support. Java technology was officially born.

Although the initial success of Java was based on software delivery over the Web, it soon became apparent that it had more far-reaching potential. Its simpler, pure object-oriented language was much easier to master than C++. At the same time, the Java core libraries offered the functionality that developers needed to build networked applications. As an added bonus, developers could run their Java software on many platforms without changes. The adoption of Java technology was phenomenal.

Today Java has evolved to be one of the most successful and popular software development technologies in the history of computing, and is used to build anything from enterprise systems to mobile phone applications.

Java and Processing

The Processing development environment is also built with Java. The programming language used in Processing is Java, although Processing provides an extended graphics API and a simplified programming model that gets you started with graphic programming very quickly, whilst allowing you to use more advanced techniques as you gain more experience.

The Processing team selected Java technology because it performs well enough to tackle processor-intensive graphic applications, whilst retaining a level of syntactic simplicity adequate for a teaching environment. Another advantage of using Java is that it gives Processing the capability to export programs (sketches) as Java applets that can be embedded in a Web page.

To conclude this introduction, we will take a look at a very simple Processing sketch. Our sketch draws the St. George’s Cross, in the form of the national flag of England. The specification for our sketch is very simple: a rectangle with a 3:5 ratio, a white background, and a red cross with a thickness of 1/5 of the rectangle’s height.

Here is the source code:

/**
 * St George's Cross.
 * A very simple Processing program.
 */

// calculate and set variables
int flagDim = 70;
int flagWide = flagDim * 5;
int flagHeight = flagDim * 3;
int crossWidth = flagHeight / 5;

// call required Processing API functions
// define sketch window size
size(flagWide, flagHeight);
// set window background to white
background(255);
// set stroke weight for the cross
strokeWeight(crossWidth);
// set stroke color to red
stroke(255, 0, 0);
// draw the cross
line(0, flagHeight/2, flagWide, flagHeight/2);
line(flagWide/2, 0, flagWide/2, flagHeight);

This simple program starts with comments. Anything enclosed between a /* and */ or following a // is a comment. You can write whatever you want inside a comment area. This is useful to explain what you intend to do in the code. It is good practice to always include comments in your code.

Outside comment areas you can only use valid Java statements. If you mistype something, Processing will tell you and highlight the offending line(s). Note that each statement is terminated by a semicolon (;).

Next we have a number of variable declarations, where we calculate and store the data values for our flag. The equals sign = is used to assign a value to a variable; it is called the assignment operator. The basic arithmetic operators are + (addition), - (subtraction), * (multiplication) and / (division).

All the variables declared here hold whole numbers (integers), a type defined with the keyword int. In Java you always have to declare the type of a variable.
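One consequence of using int is worth seeing in action: dividing one int by another gives a whole number, with any fractional part discarded. The plain Java sketch below repeats the sketch's own calculation of the cross thickness; the class name FlagMaths and the method crossWidth are hypothetical names used only for this illustration.

```java
// Hypothetical helper class that repeats the arithmetic of the flag sketch.
class FlagMaths {
    // Thickness of the cross: one fifth of the flag height (3 * flagDim).
    static int crossWidth(int flagDim) {
        int flagHeight = flagDim * 3;
        return flagHeight / 5; // integer division: any fraction is dropped
    }

    public static void main(String[] args) {
        System.out.println(crossWidth(70)); // 210 / 5 divides exactly
        System.out.println(crossWidth(4));  // 12 / 5 is 2.4, stored as 2
    }
}
```

With flagDim set to 70 the height is 210 and the cross is exactly 42 pixels thick. With a value such as 4 the height is 12, and 12 / 5 yields 2 rather than 2.4, because an int cannot hold the fraction.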

The variables we set are used as parameters in the call to the Processing API functions that perform the tasks we need to draw our flag, as explained in the comments. Look also at the index.html file in the reference folder of your Processing distribution for a more detailed description of each function.

If you type these few lines of code into the Processing IDE, and click on the “Run” icon, a St. George’s Cross will appear in a separate window. You can change its dimensions by changing the value of the flagDim variable from 70 to some other value; try 50 or 100 for example, and run the program again.

This is a very limited introduction to the Java syntax, just to get our feet wet with Processing. You will learn many more constructs and keywords as we progress through the course. In the next fragment we will discuss how programs are designed, with an introduction to algorithms.

