What Is the Java Virtual Machine?

By Ernest Rider, OCI Senior Software Engineer

February 2000


The Java Virtual Machine (JVM) is the Sun Microsystems specification of an (emulated) virtual 32-bit processor that directly supports the Java programming language.

The JVM executes bytecodes (virtual machine instructions), a binary format that is independent of both operating system and hardware. JVMs can currently execute 254 different types of bytecode, each identified by an opcode that is exactly one byte (8 bits) long. Bytecode 186 is unused.

What do bytecodes look like?

Well, they are all in binary format, as we might expect. Bytecodes exist to manipulate primitive types (byte, short, int, long, char, float, double, boolean, and JVM internal return address types) and reference types (objects/arrays).

All bytecodes look like this:

mnemonic <operand1>, <operand2>, . . .

where mnemonic can be one of the pseudo codes in the table below:

Bytecode      Pseudo Code     Bytecode      Pseudo Code     Bytecode      Pseudo Code
(Dec) (Hex)   (Sun's Format)  (Dec) (Hex)   (Sun's Format)  (Dec) (Hex)   (Sun's Format)

0     (0x00)  nop             86    (0x56)  sastore         171   (0xab)  lookupswitch
1     (0x01)  aconst_null     87    (0x57)  pop             172   (0xac)  ireturn
2     (0x02)  iconst_m1       88    (0x58)  pop2            173   (0xad)  lreturn
3     (0x03)  iconst_0        89    (0x59)  dup             174   (0xae)  freturn
4     (0x04)  iconst_1        90    (0x5a)  dup_x1          175   (0xaf)  dreturn
5     (0x05)  iconst_2        91    (0x5b)  dup_x2          176   (0xb0)  areturn
6     (0x06)  iconst_3        92    (0x5c)  dup2            177   (0xb1)  return
7     (0x07)  iconst_4        93    (0x5d)  dup2_x1         178   (0xb2)  getstatic
8     (0x08)  iconst_5        94    (0x5e)  dup2_x2         179   (0xb3)  putstatic
9     (0x09)  lconst_0        95    (0x5f)  swap            180   (0xb4)  getfield
10    (0x0a)  lconst_1        96    (0x60)  iadd            181   (0xb5)  putfield
11    (0x0b)  fconst_0        97    (0x61)  ladd            182   (0xb6)  invokevirtual
12    (0x0c)  fconst_1        98    (0x62)  fadd            183   (0xb7)  invokespecial
13    (0x0d)  fconst_2        99    (0x63)  dadd            184   (0xb8)  invokestatic
14    (0x0e)  dconst_0        100   (0x64)  isub            185   (0xb9)  invokeinterface
15    (0x0f)  dconst_1        101   (0x65)  lsub            186   (0xba)  xxxunusedxxx
16    (0x10)  bipush          102   (0x66)  fsub            187   (0xbb)  new
17    (0x11)  sipush          103   (0x67)  dsub            188   (0xbc)  newarray
18    (0x12)  ldc             104   (0x68)  imul            189   (0xbd)  anewarray
19    (0x13)  ldc_w           105   (0x69)  lmul            190   (0xbe)  arraylength
20    (0x14)  ldc2_w          106   (0x6a)  fmul            191   (0xbf)  athrow
21    (0x15)  iload           107   (0x6b)  dmul            192   (0xc0)  checkcast
22    (0x16)  lload           108   (0x6c)  idiv            193   (0xc1)  instanceof
23    (0x17)  fload           109   (0x6d)  ldiv            194   (0xc2)  monitorenter
24    (0x18)  dload           110   (0x6e)  fdiv            195   (0xc3)  monitorexit
25    (0x19)  aload           111   (0x6f)  ddiv            196   (0xc4)  wide
26    (0x1a)  iload_0         112   (0x70)  irem            197   (0xc5)  multianewarray
27    (0x1b)  iload_1         113   (0x71)  lrem            198   (0xc6)  ifnull
28    (0x1c)  iload_2         114   (0x72)  frem            199   (0xc7)  ifnonnull
29    (0x1d)  iload_3         115   (0x73)  drem            200   (0xc8)  goto_w
30    (0x1e)  lload_0         116   (0x74)  ineg            201   (0xc9)  jsr_w
31    (0x1f)  lload_1         117   (0x75)  lneg            203   (0xcb)  ldc_quick
32    (0x20)  lload_2         118   (0x76)  fneg            204   (0xcc)  ldc_w_quick
33    (0x21)  lload_3         119   (0x77)  dneg            205   (0xcd)  ldc2_w_quick
34    (0x22)  fload_0         120   (0x78)  ishl            206   (0xce)  getfield_quick
35    (0x23)  fload_1         121   (0x79)  lshl            207   (0xcf)  putfield_quick
36    (0x24)  fload_2         122   (0x7a)  ishr            208   (0xd0)  getfield2_quick
37    (0x25)  fload_3         123   (0x7b)  lshr            209   (0xd1)  putfield2_quick
38    (0x26)  dload_0         124   (0x7c)  iushr           210   (0xd2)  getstatic_quick
39    (0x27)  dload_1         125   (0x7d)  lushr           211   (0xd3)  putstatic_quick
40    (0x28)  dload_2         126   (0x7e)  iand            212   (0xd4)  getstatic2_quick
41    (0x29)  dload_3         127   (0x7f)  land            213   (0xd5)  putstatic2_quick
42    (0x2a)  aload_0         128   (0x80)  ior             214   (0xd6)  invokevirtual_quick
43    (0x2b)  aload_1         129   (0x81)  lor             215   (0xd7)  invokenonvirtual_quick
44    (0x2c)  aload_2         130   (0x82)  ixor            216   (0xd8)  invokesuper_quick
45    (0x2d)  aload_3         131   (0x83)  lxor            217   (0xd9)  invokestatic_quick
46    (0x2e)  iaload          132   (0x84)  iinc            218   (0xda)  invokeinterface_quick
47    (0x2f)  laload          133   (0x85)  i2l             219   (0xdb)  invokevirtualobject_quick
48    (0x30)  faload          134   (0x86)  i2f             221   (0xdd)  new_quick
49    (0x31)  daload          135   (0x87)  i2d             222   (0xde)  anewarray_quick
50    (0x32)  aaload          136   (0x88)  l2i             223   (0xdf)  multianewarray_quick
51    (0x33)  baload          137   (0x89)  l2f             224   (0xe0)  checkcast_quick
52    (0x34)  caload          138   (0x8a)  l2d             225   (0xe1)  instanceof_quick
53    (0x35)  saload          139   (0x8b)  f2i             226   (0xe2)  invokevirtual_quick_w
54    (0x36)  istore          140   (0x8c)  f2l             227   (0xe3)  getfield_quick_w
55    (0x37)  lstore          141   (0x8d)  f2d             228   (0xe4)  putfield_quick_w
56    (0x38)  fstore          142   (0x8e)  d2i             202   (0xca)  breakpoint
57    (0x39)  dstore          143   (0x8f)  d2l             254   (0xfe)  impdep1
58    (0x3a)  astore          144   (0x90)  d2f             255   (0xff)  impdep2
59    (0x3b)  istore_0        145   (0x91)  i2b
60    (0x3c)  istore_1        146   (0x92)  i2c
61    (0x3d)  istore_2        147   (0x93)  i2s
62    (0x3e)  istore_3        148   (0x94)  lcmp
63    (0x3f)  lstore_0        149   (0x95)  fcmpl
64    (0x40)  lstore_1        150   (0x96)  fcmpg
65    (0x41)  lstore_2        151   (0x97)  dcmpl
66    (0x42)  lstore_3        152   (0x98)  dcmpg
67    (0x43)  fstore_0        153   (0x99)  ifeq
68    (0x44)  fstore_1        154   (0x9a)  ifne
69    (0x45)  fstore_2        155   (0x9b)  iflt
70    (0x46)  fstore_3        156   (0x9c)  ifge
71    (0x47)  dstore_0        157   (0x9d)  ifgt
72    (0x48)  dstore_1        158   (0x9e)  ifle
73    (0x49)  dstore_2        159   (0x9f)  if_icmpeq
74    (0x4a)  dstore_3        160   (0xa0)  if_icmpne
75    (0x4b)  astore_0        161   (0xa1)  if_icmplt
76    (0x4c)  astore_1        162   (0xa2)  if_icmpge
77    (0x4d)  astore_2        163   (0xa3)  if_icmpgt
78    (0x4e)  astore_3        164   (0xa4)  if_icmple
79    (0x4f)  iastore         165   (0xa5)  if_acmpeq
80    (0x50)  lastore         166   (0xa6)  if_acmpne
81    (0x51)  fastore         167   (0xa7)  goto
82    (0x52)  dastore         168   (0xa8)  jsr
83    (0x53)  aastore         169   (0xa9)  ret
84    (0x54)  bastore         170   (0xaa)  tableswitch
85    (0x55)  castore

Can other languages produce Java bytecode?

In short, yes!

Does this mean we don't have to write in Java to get "bytecode" portability?

Well, that's one for debate. All I can say is that you get the wealth of a pure object-oriented language in Java, some of the best minds fixing and enhancing the technology, and some really nice frameworks that other languages are struggling to match.

In essence, the Java language embodies a design thought process that is conducive to good OOP practices.

What makes up a JVM?

1. A fetch, decode, execute module

This is really the heart of a JVM implementation. It mimics a "real" machine's fetch, decode, and execute cycles.

This module will usually consist of a 32-bit program counter (PC). All other registers are not stipulated by the JVM specification, since many different architectures have different register capabilities that can be best exploited without adhering to a strict architecture.

2. A stack per thread

The JVM stack is a last-in-first-out (LIFO) stack that stores frames. A frame is a local workspace created for each method invocation; it holds the local variables and operand stack for the code that appears between that method's "{" and "}" block specifiers.

In a REAL machine, a frame is analogous to a set of local block variables from the last base pointer (BP) to the current stack pointer (SP).

During a simulated multitasking (single processor) fetch, decode, execute cycle, the JVM can switch between stacks, saving and restoring the PC and any implementation registers. In some machines, this is done preemptively (forced), and in others, each thread must yield to another.
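
The fetch, decode, execute cycle and the per-frame operand stack can be sketched in a few lines of Java. This is only a toy illustration under stated assumptions, not how any real JVM is written: the class name MiniInterpreter and its run method are hypothetical, a single frame is modelled, and only six opcodes from the table above are handled (iconst_2, iconst_3, bipush, iadd, imul, ireturn).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter: illustrates the fetch, decode, execute cycle only.
class MiniInterpreter {
    // Opcode values taken from the bytecode table.
    static final int ICONST_2 = 0x05;
    static final int ICONST_3 = 0x06;
    static final int BIPUSH   = 0x10;
    static final int IADD     = 0x60;
    static final int IMUL     = 0x68;
    static final int IRETURN  = 0xac;

    // Runs one method body and returns the int handed back by ireturn.
    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack of the single frame
        int pc = 0;                                // program counter
        while (pc < code.length) {
            int opcode = code[pc++] & 0xff;        // fetch (mask because Java bytes are signed)
            switch (opcode) {                      // decode, then execute
                case ICONST_2: stack.push(2); break;
                case ICONST_3: stack.push(3); break;
                case BIPUSH:   stack.push((int) code[pc++]); break; // operand is one signed byte
                case IADD:     stack.push(stack.pop() + stack.pop()); break;
                case IMUL:     stack.push(stack.pop() * stack.pop()); break;
                case IRETURN:  return stack.pop();
                default:
                    throw new IllegalArgumentException("unsupported opcode " + opcode);
            }
        }
        throw new IllegalStateException("reached the end of the code without ireturn");
    }

    public static void main(String[] args) {
        // iconst_2, iconst_3, iadd, bipush 10, imul, ireturn  =>  (2 + 3) * 10
        byte[] code = {0x05, 0x06, 0x60, 0x10, 10, 0x68, (byte) 0xac};
        System.out.println(run(code)); // prints 50
    }
}
```

A real implementation would keep one such stack of frames per thread and save or restore the PC when switching between them; the sketch leaves that out to stay short.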

3. Heap

The Java heap is shared across all threads and contains dynamic object references, including dynamically constructed objects/methods.
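
The split between per-thread stacks and the shared heap can be seen from ordinary Java code. In the sketch below (the class SharedHeapDemo and the method runWorkers are hypothetical names chosen for illustration), each thread counts in a local variable that lives in its own frame, while all threads update one counter object that lives on the heap.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: local variables are per thread, heap objects are shared.
class SharedHeapDemo {
    static int runWorkers(int threads, int incrementsPerThread) {
        AtomicInteger shared = new AtomicInteger();   // one object on the heap, visible to every thread
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                int local = 0;                        // lives in this thread's own stack frame
                while (local < incrementsPerThread) {
                    local++;
                    shared.incrementAndGet();         // atomic update of the shared heap object
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();                        // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 1000)); // prints 4000
    }
}
```

The atomic counter is used so that the shared total is correct regardless of how the threads are scheduled; a plain int field would not give that guarantee.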

4. Method area

All compiled class and object methods are stored in the method area.

(Dynamic objects created at run time are stored in the heap and are exposed to the garbage collector.)

5. Constant pool

The constant pool holds all the constants referenced by the system. Some constants may be propagated through the bytecode in implementations.

6. Native method stack

This is a stack through which native methods accept parameters from, and return results to, the JVM.

Some JVMs pass parameters using a third-party native language framework, such as RNI/COM in the Microsoft VM.

7. Garbage collector

This is a thread or process that determines which heap objects can no longer be reached through any live reference (they have lost their parent/creator object) and marks them dirty for removal. The garbage collector is quite efficient, despite the absence of a C++-like "delete" operator.
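
One common way to decide what is still in use is to start from a set of roots and mark everything reachable from them. The sketch below models that idea only; it is not a description of Sun's collector. HeapObject and MarkSketch are hypothetical classes, and a real collector works on raw memory rather than on Java objects like these.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of an object on the heap and the references it holds.
class HeapObject {
    final String label;
    final List<HeapObject> references = new ArrayList<>();
    HeapObject(String label) { this.label = label; }
}

class MarkSketch {
    // Returns every object reachable from the roots (for example stack slots and static fields).
    static Set<HeapObject> mark(List<HeapObject> roots) {
        Set<HeapObject> live = new HashSet<>();
        Deque<HeapObject> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            HeapObject current = pending.pop();
            if (live.add(current)) {                 // first visit: follow its references
                pending.addAll(current.references);
            }
        }
        return live;                                 // anything not in this set is garbage
    }

    public static void main(String[] args) {
        HeapObject person = new HeapObject("person");
        HeapObject name = new HeapObject("name");
        HeapObject orphan = new HeapObject("orphan");
        person.references.add(name);
        orphan.references.add(person);               // orphan points at person, but nothing points at orphan

        Set<HeapObject> live = mark(Arrays.asList(person));
        System.out.println(live.contains(name));     // prints true
        System.out.println(live.contains(orphan));   // prints false
    }
}
```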

8. Bytecode verifier

The bytecode verification process is perhaps the most sophisticated part of a JVM implementation. It addresses the concern that bytecodes may do damage to a system by running amok on a given hardware architecture.

Sun's JVM verifies that bytecodes are well-formed before executing them. This involves making sure that the operand stack cannot overflow or underflow, that instructions receive operands of the expected types, and more. Some JVMs check while executing bytecodes, which can be expensive for performance.
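
To give a flavour of one such check, the sketch below walks a straight-line sequence of opcodes and rejects it if the operand stack would underflow or grow beyond a declared maximum. It is a heavily simplified illustration: StackDepthCheck and verify are hypothetical names, only a few operand-free opcodes from the table are understood, and a real verifier also tracks types and branches.

```java
// Hypothetical, heavily simplified check in the spirit of bytecode verification.
class StackDepthCheck {
    // Returns true if the straight-line code never underflows the stack or exceeds maxStack.
    static boolean verify(int[] opcodes, int maxStack) {
        int depth = 0;
        for (int opcode : opcodes) {
            int pops;
            int pushes;
            switch (opcode) {
                case 0x03: case 0x04: case 0x05:  // iconst_0, iconst_1, iconst_2
                    pops = 0; pushes = 1; break;
                case 0x57:                         // pop
                    pops = 1; pushes = 0; break;
                case 0x59:                         // dup
                    pops = 1; pushes = 2; break;
                case 0x60: case 0x64:              // iadd, isub
                    pops = 2; pushes = 1; break;
                case 0xac:                         // ireturn
                    pops = 1; pushes = 0; break;
                default:
                    return false;                  // unknown opcode: reject
            }
            if (depth < pops) return false;        // instruction would underflow the stack
            depth = depth - pops + pushes;
            if (depth > maxStack) return false;    // instruction would overflow the stack
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(verify(new int[] {0x04, 0x05, 0x60, 0xac}, 2)); // prints true
        System.out.println(verify(new int[] {0x04, 0x60, 0xac}, 2));       // prints false: iadd needs two values
        System.out.println(verify(new int[] {0x04, 0x59, 0x59, 0xac}, 2)); // prints false: second dup exceeds maxStack
    }
}
```

Doing this once before execution is what lets an interpreter skip the same checks on every instruction at run time.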

9. Other advanced features

As Java compiler technology improves, the separation of the JVM and the "real" machine gets narrower and narrower. Just-In-Time (JIT) compilation has become a key technology for improving JVM performance.

A typical JIT will pre-optimise method bytecodes into native blocks ready to run on demand. Some JITs use the native method stack to invoke native methods.

Implicit parallel computing support through the Java threading model is also making positive gains.

It is the author's opinion that through such research and development, some JVMs may be able to operate as real-time systems, despite the use of bytecode binaries.

Show us a program in bytecodes!

Example:

   1. public class Person {
   2.
   3.     String name = "Unknown";
   4.
   5.     public void setName(String n) {
   6.         name = n;
   7.     }
   8.
   9.     public String getName() {
  10.         return(name);
  11.     }
  12.
  13.     public static void main(String args[]) {
  14.         Person p = new Person();
  15.         p.setName(args[0]);
  16.         System.out.println("The name of the person entered was "+p.getName()+".");
  17.     }
  18. }

Using the command-line Java class file disassembler, javap, we can print the bytecode as pseudo ops (-c) along with the line number tables that map source lines to bytecode offsets (-l).

C:\>javac Person.java
C:\>javap -c -l Person
//Compiled from Person.java
public synchronized class Person extends java.lang.Object
/* ACC_SUPER bit set */
{
    java.lang.String name;
    public void setName(java.lang.String);
    public java.lang.String getName();
    public static void main(java.lang.String[]);
    public Person();
}

Method void setName(java.lang.String)
   0 aload_0
   1 aload_1
   2 putfield #14 -Field java.lang.String name-
   5 return

Line numbers for method void setName(java.lang.String)
   line 6: 0
   line 5: 5

Method java.lang.String getName()
   0 aload_0
   1 getfield #14 -Field java.lang.String name-
   4 areturn

Line numbers for method java.lang.String getName()
   line 10: 0

Method void main(java.lang.String[])
   0 new #4 -Class Person-
   3 dup
   4 invokespecial #9 -Method Person()-
   7 astore_1
   8 aload_1
   9 aload_0
   10 iconst_0
   11 aaload
   12 invokevirtual #17 -Method void setName(java.lang.String)-
   15 getstatic #15 -Field java.io.PrintStream out-
   18 new #7 -Class java.lang.StringBuffer-
   21 dup
   22 ldc #2 -String "The name of the person entered was "-
   24 invokespecial #11 -Method java.lang.StringBuffer(java.lang.String)-
   27 aload_1
   28 invokevirtual #13 -Method java.lang.String getName()-
   31 invokevirtual #12 -Method java.lang.StringBuffer append(java.lang.String)-
   34 ldc #1 -String "."-
   36 invokevirtual #12 -Method java.lang.StringBuffer append(java.lang.String)-
   39 invokevirtual #18 -Method java.lang.String toString()-
   42 invokevirtual #16 -Method void println(java.lang.String)-
   45 return

Line numbers for method void main(java.lang.String[])
   line 14: 0
   line 15: 8
   line 16: 15
   line 13: 45

Method Person()
   0 aload_0
   1 invokespecial #10 -Method java.lang.Object()-
   4 aload_0
   5 ldc #3 -String "Unknown"-
   7 putfield #14 -Field java.lang.String name-
   10 return

Line numbers for method Person()
   line 1: 0
   line 3: 4
   line 1: 10
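
The number in front of each pseudo op is its byte offset within the method, which is why setName jumps from 2 to 5: putfield is one opcode byte followed by a two-byte constant pool index. The sketch below turns raw bytes back into such a listing. It is a minimal illustration, not javap's implementation: TinyDisassembler is a hypothetical class, it knows only the opcodes used by setName and getName, and the byte values are reconstructed from the opcode table and the listing above rather than read from an actual class file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini disassembler covering only the opcodes used by setName and getName.
class TinyDisassembler {
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int start = pc;                           // offset printed in front of the pseudo op
            int opcode = code[pc++] & 0xff;           // every opcode is one unsigned byte
            String text;
            switch (opcode) {
                case 0x2a: text = "aload_0"; break;
                case 0x2b: text = "aload_1"; break;
                case 0xb0: text = "areturn"; break;
                case 0xb1: text = "return"; break;
                case 0xb4:                            // getfield: followed by a 2-byte constant pool index
                case 0xb5: {                          // putfield: followed by a 2-byte constant pool index
                    int index = ((code[pc] & 0xff) << 8) | (code[pc + 1] & 0xff);
                    pc += 2;
                    text = (opcode == 0xb4 ? "getfield" : "putfield") + " #" + index;
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported opcode " + opcode);
            }
            lines.add(start + " " + text);
        }
        return lines;
    }

    public static void main(String[] args) {
        // setName as raw bytes: aload_0, aload_1, putfield #14, return
        byte[] setName = {0x2a, 0x2b, (byte) 0xb5, 0x00, 0x0e, (byte) 0xb1};
        for (String line : disassemble(setName)) {
            System.out.println(line);                 // 0 aload_0, 1 aload_1, 2 putfield #14, 5 return
        }
    }
}
```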

How safe is my Java bytecode from Reverse Engineering?

Not very safe, out of the box. However, many products exist that use obfuscation to make decompiled code far harder to understand, to the point of being close to impossible to follow.

(obfuscate: to make so confused or opaque as to be difficult to perceive or understand.)

How do I measure the performance of a JVM?

Just like testing "real" machine performance, JVM performance is an open-ended battle.

Claim and counter claim over who has the best JVM could go on forever. This goes to show that computer people are very passionate when it comes to issues such as performance.

Performance means many things to many different applications.

So how do we meaningfully test performance?

Well, first you must accept that the 90/10 rule applies in general; i.e., for a given unit, 90% of the running time is spent in 10% of the code.

Then you must construct applications that typify your purpose or goal and perform some real-time analysis. Using a Java thread to time yourself is not good enough, as JVMs are – for the most part – non-real time.

Fortunately, the confusion over testing JVM performance has been confronted by the Java community. In an effort to gain a useful comparison of JVMs, a set of application performance benchmarks has been adopted (sometimes to suit the implementer).

Some useful ones:

Name               Proprietor           Description
CaffeineMark 3.0   Pendragon Software   http://www.pendragon-software.com/company.html
SpecJVM98          Spec                 http://www.spec.org/
JMark              ZDNET                http://www.zdnet.com/zdbop/jmark/jmark.html
VolanoMark         Volano               http://www.volano.com/benchmarks.html

It is important that when using these tools, the results have relevance to your needs; for instance, testing thread performance may not give you great graphics.

What does the future of the JVM look like?

The future of the JVM centers quite firmly around increasing its performance on any given architecture.

Researchers and developers will continue to peel away at its virtual-ness and add more and more real machine code while keeping the bytecode source intact.

Meanwhile, many different architectures and frameworks will be added to aid the industry in making open technology choices.

Finally, hardware architectures that are bytecode compatible are going to break performance ground earlier than software, but the holy grail is the hybrid interpreted/native compiler. This will continue to be developed and influence the ever-increasing demand on performance software technology.



Software Engineering Tech Trends (SETT) is a regular publication featuring emerging trends in software engineering.

