Mark Robinson


Java Developer : Article

Types, Variables, Objects & References

"Variables have types, objects have classes." This phrase, borrowed from the Java Language Specification, succinctly answers several questions that, judging by the frequency of their appearance in the Java news groups, are a common source of confusion for many Java programmers. It reinforces that variables and objects are different things and helps explain why "casts" are necessary, why Java "constants" can appear to be modifiable and what it really means for an argument to be passed to a method "by value".

This month, I'll begin with some background information that will help those who are not yet comfortable with the terms types, variables, objects, classes and references to get up to speed. Then, I'll present a problem, based upon the exam questions for Sun's Java Certification Program, that will test your understanding of references and objects.

Variables are placeholders that programmers use to represent unknown or changing values which are to be manipulated in the manner described by the instructions of a program. Every variable has a name, a type, a lifetime and a value. Since Java is a statically typed language, the name and type of every variable must be declared to the compiler prior to the variable's first use. This is done by specifying a data type (I use the terms "data type" and "type" interchangeably) followed by a sequence of characters which represents the name of the variable. The data type of a variable determines both the amount of memory that needs to be reserved to hold the value that the variable represents, and the kinds of operations that can be performed using that variable. The following are some typical variable declarations:

int counter;
double angle;
String message;
MyClass myVar;

int, double, String, and MyClass are data types. counter, angle, message, and myVar are the names of the four variables that these statements declare. The data type MyClass corresponds to a user-defined Java class that would have to be available to the compiler for the declaration of the variable myVar to compile successfully. It demonstrates that the classes you implement in Java are a form of user-defined data type. Notice that, with few exceptions, user-defined data types (classes) are treated identically to the data types defined as part of the Java language and standard libraries. Essentially, every Java class you write is a new data type that further extends the Java language.

However, it is important to distinguish between the two kinds of Java "data-types": primitive types and reference types. The primitive types represent simple values, such as numbers, that can only be manipulated in a limited number of predetermined ways. The numeric primitive types consist of the integral types byte, short, int, long and char, and the floating-point types float and double. The remaining primitive type is the boolean type which represents the logical truth values true and false.
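As a small illustration of the primitive types just listed, the following sketch shows that char really does behave as an integral type and that boolean holds only true or false. The class name PrimitiveTypesDemo and the helper method next are my own inventions for this example, not part of any library.

```java
// Hypothetical demo class; the names here are illustrative only.
class PrimitiveTypesDemo {
    // Returns the character that follows c, using integer arithmetic on a char.
    static char next(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        char letter = 'A';
        int code = letter;      // a char widens to an int without a cast
        boolean done = false;   // boolean holds only true or false

        System.out.println(code);         // prints 65
        System.out.println(next(letter)); // prints B
        System.out.println(done);         // prints false
    }
}
```

The cast in next is required because the expression c + 1 is evaluated as an int, which the compiler will not silently narrow back to a char.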

When you declare a variable of a primitive type, the compiler reserves sufficient room in memory to hold the largest acceptable value for the declared type and associates the variable name with that location. For instance, when you declare a variable of type int, 32 bits of memory are reserved. Thereafter, when any assignments to the variable are made, the binary representation of the new integer value is stored in those 32 bits of memory.

This differs from the way that variables of a reference type work. The reference types are classes, interfaces and arrays. When you declare a variable of a reference type, sufficient room in memory is allocated to hold a reference. A reference is a value that can be used to find another location in memory. In some respects, references are comparable to domain names. When you type a domain name into your browser, the DNS process transparently resolves the domain name to a physical location from which the requested information is retrieved. Likewise, Java will use a reference to transparently determine how to access the object that a reference "refers" to. The actual form of a reference and the process that Java uses to resolve it depends upon the specific implementation of the Java Virtual Machine. Since references, in Java, are not directly accessible by the programmer, as long as you are aware of their existence, you needn't be concerned with how they are implemented.

Let's re-examine what happens as a result of the four variable declarations shown earlier. The declaration "int counter" causes 32 bits of memory to be reserved that are associated with the variable name "counter." The variable is marked as being of type "int" so that the compiler can ensure that the variable will only participate in those operations, specified in the Java language, that are permissible for integer variables. The declaration "double angle" works similarly, except that 64 bits of memory are reserved, which are then associated with the variable name "angle" and the data type "double."

The declarations "String message" and "MyClass myVar" each result in sufficient memory being reserved to store a reference. As before, the data type String is associated with the variable name "message" and the data type MyClass is associated with the variable name "myVar." Note that at this point, no objects exist. We simply have four variable names, each of which is associated with a data type and a location in which a value can be stored.
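The widths quoted above for int and double are defined by the language, and can be read back from the standard wrapper classes. The sketch below is only a way of checking those two numbers; the class name PrimitiveSizes and its two helper methods are hypothetical. The size of a reference is deliberately not shown, since Java offers no portable way to ask for it.

```java
// Hypothetical demo class; PrimitiveSizes is not a name from the article.
class PrimitiveSizes {
    // Number of bits the language uses to represent an int value.
    static int bitsOfInt() {
        return Integer.SIZE;
    }

    // Number of bits the language uses to represent a double value.
    static int bitsOfDouble() {
        return Double.SIZE;
    }

    public static void main(String[] args) {
        int counter = Integer.MAX_VALUE; // largest value an int can hold
        System.out.println("int: " + bitsOfInt() + " bits, max " + counter);
        System.out.println("double: " + bitsOfDouble() + " bits");
    }
}
```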

The first time a variable is explicitly assigned a value is referred to as "initializing the variable." We can initialize the variables we previously declared in the following way:

counter = 2;
angle = 3.333;
message = null;
myVar = null;

In practice, the declaration and initialization of a variable are often combined into a single statement:

int counter = 2;
double angle = 3.333;
String message = null;
MyClass myVar = null;

The actions performed as a result of each combined statement are identical to the actions that result from the separate corresponding declaration and initialization statements discussed previously. The value "null" is used to indicate that a reference variable does not currently refer to an object. Note again that we still do not have any objects. I'm belaboring that point to emphasize that a reference variable, and the object it refers to, are two distinct things.

Since reference variables that don't refer to an object are rarely useful, let's examine how we can create some objects. An object is the physical representation of a data type defined by a class. Objects are alternatively referred to as "instances." Correspondingly, creating a new object/instance of a class is referred to as "instantiating" an object. Like variables, objects have a lifetime. They are created, they exist for some period of time, and then they are destroyed. When an object is created, sufficient memory is allocated to hold all of the instance variables declared in the object's class definition. For example, if the class MyClass were declared as follows:

class MyClass
{
// Instance Variables
int age;
String name;
}

then, when an object of MyClass is created, sufficient memory would be allocated to hold an integer value (the variable "age") and a reference (the variable "name"). Unlike variables, objects are not named.

Objects are typically created using the keyword "new" followed by the name of the desired object's class and any required arguments. To create a new object, the Java Virtual Machine allocates sufficient memory to hold the new object's instance variables. Then, each instance variable is initialized to a default value (the numeric types are initialized to zero, the boolean type is initialized to false, and reference types are initialized to null). Finally, the appropriate constructor(s) are executed, and presuming no errors have occurred, a reference to the newly created object is returned.
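The default initialization step described above is easy to observe. The following sketch reuses the two-field MyClass from earlier and simply prints the fields of a freshly created object; the class DefaultValuesDemo is a hypothetical name used only for this example.

```java
// MyClass mirrors the two-field class shown earlier in the article.
class MyClass {
    int age;     // numeric instance variable, defaults to 0
    String name; // reference instance variable, defaults to null
}

// Hypothetical demo class for this example.
class DefaultValuesDemo {
    public static void main(String[] args) {
        // "new" allocates the object, applies the defaults, runs the
        // (implicit, empty) constructor and returns a reference.
        MyClass myVar = new MyClass();

        System.out.println(myVar.age);  // prints 0
        System.out.println(myVar.name); // prints null
    }
}
```

Because MyClass declares no constructor, the compiler supplies an empty one, so the values printed are exactly the defaults.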

Each Virtual Machine implements some mechanism to determine whether an object can still be reached through a reference held by the running program. When no such reference remains, the object is no longer accessible and, eventually, the built-in garbage collector will come along and reclaim the memory the object occupies.

So, there are actually three separate things that are happening in the rather typical statement:

Date today = new Date( 97, 2, 1 );

First, the expression "Date today" declares a new reference variable named today. As a result, sufficient memory to store a reference is allocated and associated with the variable name today and the data type Date. Second, the expression "new Date( 97, 2, 1 )" causes a new unnamed Date object to be created and a reference to the new Date object is returned. Third, the "=" (assignment) operator is evaluated, which results in the reference that was returned when the Date object was created being stored in the memory location associated with the variable named today.

After the statement has executed, there are two separate structures left in memory: the variable named today and the unnamed Date object. Suppose the next statement in the program were:

Date start = today;

First, a new reference variable named start would be created. Then, the "=" assignment operator would result in a copy of the reference stored in today being stored in start. There would then be three structures in memory: the two variables today and start, both of which would contain a reference to the third structure - the unnamed Date object.

Since both variables refer to the same object, either can be used to change the values in the object's instance variables. For instance, the statements:

today.setMonth( 11 );
start.setDate( 21 );

would modify the Date object so that the date it contains is December 21, 1997 (months are zero-based).
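The fragments above can be assembled into a small runnable sketch that checks the claim directly. The class name AliasDemo and the helper makeAliases are hypothetical, and the Date constructor and setters used here are the ones from the text, which current JDKs mark as deprecated (hence the annotation).

```java
import java.util.Date;

// Hypothetical demo class; AliasDemo is not a name from the article.
class AliasDemo {
    // Creates ONE Date object and returns the two references that point to it.
    @SuppressWarnings("deprecation") // these Date members are deprecated in current JDKs
    static Date[] makeAliases() {
        Date today = new Date(97, 2, 1); // one object: March 1, 1997
        Date start = today;              // copies the reference, not the object
        today.setMonth(11);              // change made through the first variable
        start.setDate(21);               // change made through the second variable
        return new Date[] { today, start };
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        Date[] refs = makeAliases();
        System.out.println(refs[0] == refs[1]); // prints true: same object
        System.out.println(refs[0].getMonth() + " " + refs[0].getDate()); // prints 11 21
    }
}
```

The == operator applied to two reference variables compares the references themselves, which is why it is a convenient test for "same object" here.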

Let's move on to a more complicated example. The question in Listing 1 is a slightly simplified example of the type of question given on the Sun Java Certification exams. (For more information on Sun's Java Certification Program, visit http://www.sun.com/sunservice/suned/java_information.) See if you can figure out the answer.

LISTING 1
 1. import java.util.Date;
 2.
 3. public class Example
 4. {  public static void main(String args[])
 5.    {  Date d1 = new Date( 99, 11, 31 );
 6.       Date d2 = new Date( 99, 11, 31 );
 7.       method( d1, d2 );
 8.       System.out.println( "d1 is " + d1
 9.                         + "\nd2 is " + d2 );
10.    }
11.    public static void method( Date d3, Date d4 )
12.    {  d4.setYear( 100 );
13.       d3 = d4;
14.    }
15. }

Which one or more of the following correctly describe the behavior when this program is compiled and run?
a) compilation is successful and the output is:
   d1 is Fri Dec 31 00:00:00 GMT 1999
   d2 is Fri Dec 31 00:00:00 GMT 1999
b) compilation is successful and the output is:
   d1 is Fri Dec 31 00:00:00 GMT 1999
   d2 is Sun Dec 31 00:00:00 GMT 2000
c) compilation is successful and the output is:
   d1 is Sun Dec 31 00:00:00 GMT 2000
   d2 is Sun Dec 31 00:00:00 GMT 2000
d) the assignment 'd3 = d4' is rejected by the compiler
   because the Date class cannot overload the operator '='.
e) the expression ("d1 is " + d1 + "\nd2 is " + d2) is rejected by
   the compiler because the Date class cannot overload the operator '+'.

The program begins execution at the beginning of the "main" method in line 4. Lines 5 and 6 result in four structures being created: the variables d1 and d2 and the two unnamed Date objects they refer to, both of which contain the date December 31, 1999 (recall that months are zero-based, i.e., 0 = Jan, 1 = Feb, etc.).

In line 7, the method named "method", which is declared in line 11, is invoked. The variables d1 and d2 are passed as arguments. In Java, arguments are always passed "by value". That means that a copy of the value contained in the variable being passed as an argument is assigned to the corresponding method parameter. In other words, passing the variables d1 and d2 (line 7) to the method parameters d3 and d4 (line 11) is equivalent to:

Date d3 = d1;
Date d4 = d2;

Thus, d3 is assigned a copy of the reference contained in d1 and d4 is assigned a copy of the reference contained in d2. Therefore, d3 now refers to the unnamed object that was created in line 5 and d4 now refers to the unnamed object that was created in line 6.

In line 12, the "setYear" method of the object referred to by the variable d4 is invoked. The "setYear" method adds 1900 to the year passed as an argument to calculate the new year. So, the year of the Date object referred to by d4 (and also by d2), is changed to 2000.

In line 13, d4 is assigned to d3. Again, this means that a copy of the reference contained in the variable d4 is stored in the variable d3. Consequently, the variables d2, d3, and d4 now all refer to the same Date object - the one whose year was set to 2000 in line 12.

In line 14, the end of the "method" method is reached. Consequently, the method parameters d3 and d4 are destroyed. Method parameters receive a copy of the values contained in the variables passed to them. Therefore, they can never affect the original values. The variables d1 and d2 still contain the same references they contained originally. Although the original references aren't changed, the copies of the references assigned to the method parameters can be used to access and modify the objects that the original references referred to. Thus, the objects can be modified.

Therefore, when the output statement that spans lines 8 and 9 is executed, the object referred to by d1 will print the String corresponding to the date December 31, 1999 and the object referred to by d2 will print the String corresponding to the date December 31, 2000. The correct answer is b.
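If you would rather confirm the result than take my word for it, the sketch below repeats the logic of Listing 1 but reports only the years, so the outcome does not depend on your machine's time zone. The class name PassByValueDemo and the helper yearsAfterCall are hypothetical additions; the body of method is unchanged from the listing.

```java
import java.util.Date;

// Hypothetical demo class; it repeats the logic of Listing 1 in checkable form.
class PassByValueDemo {
    @SuppressWarnings("deprecation") // setYear is deprecated in current JDKs
    static void method(Date d3, Date d4) {
        d4.setYear(100); // modifies the object that the caller's d2 also refers to
        d3 = d4;         // rebinds only the local parameter; the caller's d1 is untouched
    }

    // Runs the scenario and returns the calendar years seen through d1 and d2.
    @SuppressWarnings("deprecation")
    static int[] yearsAfterCall() {
        Date d1 = new Date(99, 11, 31);
        Date d2 = new Date(99, 11, 31);
        method(d1, d2);
        return new int[] { d1.getYear() + 1900, d2.getYear() + 1900 };
    }

    public static void main(String[] args) {
        int[] years = yearsAfterCall();
        System.out.println("d1 year: " + years[0]); // prints d1 year: 1999
        System.out.println("d2 year: " + years[1]); // prints d2 year: 2000
    }
}
```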

Next month, I'll wrap up the discussion of "Variables have types, objects have classes" by examining the relationship between the data types of variables and casting.

Finally, a caveat to soothe the purists: the Java Virtual Machine Specification permits a great deal of latitude in the way Java virtual machines are actually implemented. Details presented in this conceptual overview, such as the actual amount of memory allocated to hold the values of particular data types and the timing of memory allocations, should not be presumed to be accurate for any particular Java virtual machine.

More Stories By Mark Robinson

Mark Robinson is the president/CEO of Cyberian Foundations, a software development/consulting firm that specializes in developing object-oriented business applications. His clients include Xerox, IBM, Sealy and numerous small businesses.
