CS 375 Compilers Novak Vocabulary

392 terms by gordonnovak 

absolute address

the numeric address of a location in memory.
cf. relative address.

absolute code

computer program code that is executable without
further processing: all addresses in the code are absolute.
cf. relocatable code.

abstract syntax tree (AST)

a tree representation of a program
that is abstracted from the details of a particular programming
language and its surface syntax.

accepting state

a state of a finite automaton in which the input string is accepted
as being a member of the language recognized by the automaton.

activation record

stack frame.

actual parameter

a parameter used in a call to a subprogram. cf. formal parameter.

address alignment

see storage alignment.

address space

1. the set of memory addresses that a program may reference.
2. the amount of memory allocated to a program or user.
3. the amount of memory addressable by the address size of a
machine instruction.

adjacency matrix

a method of representing a graph by a Boolean
matrix M, where M_{ij} = 1 iff there is an arc from node i
to node j in the graph.
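
As a minimal sketch of this representation (the function name and the 3-node example graph are illustrative assumptions, not part of the original set), the matrix can be built from a list of arcs:

```python
def adjacency_matrix(n, arcs):
    """Return an n x n Boolean matrix M with M[i][j] == 1 iff (i, j) is an arc."""
    m = [[0] * n for _ in range(n)]
    for i, j in arcs:
        m[i][j] = 1
    return m

# Hypothetical 3-node graph with arcs 0->1, 1->2, 2->0.
M = adjacency_matrix(3, [(0, 1), (1, 2), (2, 0)])
```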


alias

an alternate name for a memory location. Whenever a given memory
location is denoted by more than one name, any of the names can be
considered to be an alias.


aliasing

the creation of an alternate name for data, either in the definition of
a program or during its operation.


alphabet

a set of symbols used in the definition of a language.


ambiguity

a case where more than one interpretation is possible.

ambiguous grammar

a grammar that allows some sentence or string to be generated
or parsed in more than one way (i.e., with distinct parse trees).

ambiguous value

a value with more than one name.


ancestor

a node in a tree that lies on a path between the given node and the root;
a parent of a node or an ancestor of its parent.


antisymmetric

a relation R is antisymmetric iff
for all a, b: a R b and b R a --> a = b.
Example: <=


arity

the number of arguments of a function.

assembly language

a language for writing computer programs, in which
one assembly language instruction usually corresponds to one machine
instruction.


associativity

a specification of the order in which operations should be performed
when two operators of the same precedence are adjacent. Most operators
are left-associative, e.g. the expression A - B - C
should be interpreted as ((A - B) - C).


AST

abstract syntax tree.

augmented transition network (ATN)

a formalism for describing parsers, especially for natural language.
Similar to a finite automaton, but augmented in that arbitrary tests may be
attached to transition arcs, subgrammars may be called recursively, and
structure-building actions may be executed as an arc is traversed.

automatic programming

synthesis of a program that satisfies a
specification, where the specification is higher-level than ordinary
programming languages.


automaton

an abstract, mathematically defined computer.
Plural is automata.


available

an expression is available if it has been computed previously in the
computation path preceding the current location and has not been killed.


backpatching

filling in the address of a label, which has just become defined, in
preceding parts of the program that made forward references to it.
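
The idea can be sketched for a toy one-pass assembler. The instruction format (`"label"` and `"jmp"` tuples) and the function name `assemble` are assumptions made for illustration only:

```python
def assemble(lines):
    """Resolve jump targets in one pass, filling in forward references later.

    Each line is ("label", name) or ("jmp", name); labels occupy no space.
    """
    code = []       # emitted instructions: ["jmp", target_address]
    labels = {}     # label name -> address
    pending = {}    # label name -> indices of jumps awaiting its address
    for op, name in lines:
        if op == "label":
            labels[name] = len(code)
            # The label just became defined: patch every earlier jump to it.
            for idx in pending.pop(name, []):
                code[idx][1] = labels[name]
        elif name in labels:
            code.append(["jmp", labels[name]])
        else:
            # Forward reference: emit a placeholder and remember where it is.
            code.append(["jmp", None])
            pending.setdefault(name, []).append(len(code) - 1)
    return code, pending

program = [("jmp", "end"), ("label", "top"), ("jmp", "top"), ("label", "end")]
code, unresolved = assemble(program)
```

Any names left in `unresolved` after the pass are labels that were referenced but never defined.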


bag

a collection of items, analogous to a set, but allowing multiple
occurrences of an item. Also, multiset.

base address

the address of the beginning of a data area. This address is added to
a relative address or offset to compute an absolute address.

basic block

a sequence of program statements such that if any of them is executed, all
of them are; a sequence of statements that has a label (if any) only at the
beginning and a branch (if any) only at the end.
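
A rough sketch of how a statement list might be split into basic blocks under this definition. The tuple layout `(label, text, is_branch)` is a hypothetical instruction format chosen for the example:

```python
def basic_blocks(instrs):
    """Split a list of instructions into basic blocks.

    Each instruction is a tuple (label_or_None, text, is_branch).
    A new block starts at a labeled instruction or right after a branch.
    """
    blocks, current = [], []
    for label, text, is_branch in instrs:
        if label is not None and current:
            blocks.append(current)   # a label may appear only at a block start
            current = []
        current.append(text)
        if is_branch:
            blocks.append(current)   # a branch may appear only at a block end
            current = []
    if current:
        blocks.append(current)
    return blocks

program = [
    (None, "i = 0", False),
    ("L1", "t = i * 4", False),
    (None, "i = i + 1", False),
    (None, "if i < 10 goto L1", True),
    (None, "return", False),
]
blocks = basic_blocks(program)
```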

basic type

a data type that is implemented in computer hardware
instructions, such as integer or real.


bignum

a result of arbitrary precision integer arithmetic.
(an abbreviation of BIG NUMber.)


binding

the association of a name with a variable or value.


bison

a Free Software Foundation program similar to yacc.

bit vector

a sequence of Boolean values (0 or 1) represented as the
bits of one or more computer words. It is an efficient representation
for sets that are subsets of a fairly small finite set of possible elements.
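
A small sketch, assuming set elements are small non-negative integers and using a Python integer as the word of bits (the helper names are illustrative). Set union and intersection become single bitwise operations:

```python
def to_bits(elements):
    """Encode a set of small non-negative integers as the bits of one int."""
    bits = 0
    for e in elements:
        bits |= 1 << e
    return bits

def members(bits):
    """Decode a bit vector back into a sorted list of elements."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

a = to_bits({0, 2, 5})
b = to_bits({2, 3})
union = a | b          # set union is bitwise OR
intersection = a & b   # set intersection is bitwise AND
```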


block

short for basic block.


BNF

Backus-Naur Form, a syntax for writing context-free grammars
that describe computer languages.

Boolean matrix

a matrix whose elements are Boolean values, 0
or 1.

bottom-up parsing

a parsing method in which input words are matched against the right-hand
sides of grammar productions in an attempt to build a parse tree from the
bottom towards the top.


BSS

a pseudo-operation for some assemblers, used to specify the
reservation of a block of storage, perhaps initialized to some constant
value. (an abbreviation of Block Starting with Symbol.)


busy

describes a variable whose value will be needed later during
program execution. Also, live.


cache

a fast memory, smaller than the total main memory, used
by the CPU for faster access to data.

cache miss

a reference to a memory location that is not in the
cache, causing processing to be delayed until the operand can be fetched
from main memory.

cache prefetch

an instruction that causes the contents
of a specified memory address to be fetched from main memory into the cache
memory so that it will be available for fast access when needed.

call by name

a form of parameter transmission in which the effect of a textual
substitution of the actual parameter for the formal parameter is achieved.

call by reference

a form of parameter transmission in which the address of the actual
parameter is transmitted to the subprogram. Any changes to the parameter
made by the subprogram will change the argument as seen by the calling
program.

call by value

a form of parameter transmission in which the value of the actual
parameter is transmitted to the subprogram. Changes to the parameter
made by the subprogram will not be seen by the calling program.

canonical derivation

rightmost derivation.

canonical form

a standardized form of expressions or data. If all programs put their
expressions into a canonical form, the number of cases that will have
to be considered by other programs is reduced.

Cartesian product

if A and B are sets, the Cartesian product
A X B is the set of ordered pairs (a, b) where
a in A and b in B.

cascading errors

a situation, e.g. in compiling a program, where one error causes many
reported errors. For example, failure to declare a variable may cause
an error every time that variable is referenced.


cast

to coerce a given value to be of a specified type.

Chomsky hierarchy

the hierarchy of formal language types: regular, context free, context
sensitive, and recursively enumerable languages, each of which is a proper
subset of the following class.


CISC

complex instruction set computer.


class

in object-oriented programming, a description of a set of similar objects.
For example, Fido, an object or instance, might be a member of the class Dogs.

class variable

in an object-oriented language, a variable associated with a class of
objects, e.g. the number of members of the class.

closed procedure

a procedure whose code is separate from that of calling programs; the
procedure is entered by a subroutine call, and it returns to the calling
program when it is finished.

closely coupled

describes a parallel computer architecture consisting of multiple CPUs
that are tightly connected, e.g. by sharing the same memory.
cf. loosely coupled.

code generation

the phase of a compiler in which executable output code is generated
from intermediate code.

code motion

the movement of code by a compiler to a place other than where it appears
in the source program. For example, an expensive but unchanging
computation might be moved outside a loop.


coercion

see type coercion.


collision

in a hash table, a case in which a symbol has the same
hash function value as another symbol.

column-major order

a method of storing arrays in which values in a column of the array
are in adjacent memory locations. cf. row-major order.
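
As an illustration, assuming 0-based indices and offsets counted in elements rather than bytes, the two storage orders give these offsets for element A[i][j] (function names are hypothetical):

```python
def column_major_offset(i, j, rows):
    """Offset of A[i][j] when each column is stored contiguously."""
    return j * rows + i

def row_major_offset(i, j, cols):
    """Offset of A[i][j] when each row is stored contiguously."""
    return i * cols + j

# In a 3 x 4 array, elements A[0][2] and A[1][2] share a column, so they
# are adjacent in column-major order but a full row apart in row-major order.
cm = (column_major_offset(0, 2, rows=3), column_major_offset(1, 2, rows=3))
rm = (row_major_offset(0, 2, cols=4), row_major_offset(1, 2, cols=4))
```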


COMMON

a statement in Fortran that describes a data area that is
named and whose variables can be referenced by any procedure that includes
the COMMON statement.

common subexpression

an expression that appears more than once
in a program.


compiler

a program that translates a source language into an
object language that is executable on a computer.


compiler-compiler

a program that produces a compiler for a language from a specification
of the syntax and semantics of the language, e.g. yacc.


compile-time

describes a task that is or can be done by the compiler,
without running the program. cf. static.

complex instruction set computer

a CPU design featuring a large number of relatively complex instructions.
Abbreviated CISC. cf. RISC.


computed

of a subexpression, having its value computed within a given block of code.


concatenation

making a sequence that consists of the elements of a first sequence
followed by those of a second sequence.

concatenation of languages

a language consisting of the set of sentences formed by concatenating a
sentence from the first language and a sentence from the second language.

condition code register

A CPU register that describes the result of the last arithmetic operation
or comparison. It typically contains bits for <0, =0,
>0, carry, and overflow.

constant folding

performing at compile time an operation whose
operands are constant, giving a new constant result.
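
A minimal sketch of the idea on a toy expression tree. The tuple encoding `(op, left, right)` and the three supported operators are assumptions made for this example:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr):
    """Fold constant subexpressions of a tuple-based expression tree.

    An expression is a number, a variable name (str), or (op, left, right).
    """
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)   # both operands constant: compute now
    return (op, left, right)          # otherwise keep the operation

# x + 2 * (3 + 4): only the constant part can be folded.
folded = fold(("+", "x", ("*", 2, ("+", 3, 4))))
```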

context-free grammar

a grammar in which the left-hand side of each production consists of a
single nonterminal symbol.

control flow analysis

analysis of the possible paths that control flow
may take in a program during execution.

current status variable

in a hand-written parser, a variable that
denotes the construct last seen, e.g., start of expression,
operator, or operand.


curry

or currying:
to replace a function of multiple arguments by a function of one
argument that returns as its value a function that can be applied to
the remaining arguments.
For example, (+ 1 x), the addition of 1
to x, can be replaced by (1+ x)
where 1+ is a function that adds 1 to a number.
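
The same idea in a short sketch for the two-argument case (the helper `curry` and the name `one_plus`, standing in for 1+, are illustrative):

```python
def curry(f):
    """Turn a two-argument function into a chain of one-argument functions."""
    return lambda a: lambda b: f(a, b)

def add(a, b):
    return a + b

one_plus = curry(add)(1)   # plays the role of 1+ above
result = one_plus(41)
```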


DAG

directed acyclic graph,
a graph consisting of a set of nodes and directed arcs (arrows) between
nodes, such that no circular paths (cycles) exist.

dangling reference

in execution of a program, a reference, usually by means of a pointer,
to storage that has been deallocated. For example, in a recursive
language, a pointer to storage that is allocated on the execution stack
could be retained after the routine associated with that stack frame has
exited, resulting in errors.

data area

a contiguous area of memory, specified by its base address and size.
Data within the area are referenced by the base address of the area and
the offset, or relative address, of the data within the area.

data flow analysis

analysis of places in a program where data
receive values and the places where those data values can subsequently be used.

dead code

parts of a program that cannot be reached during execution and therefore
can never be executed.


declaration

a statement in a programming language that provides
information to the compiler, such as the structure of a data record,
but does not specify executable code.


defined

of a variable, having received a value prior to a given point in a program.

definition-use chain

the portion of a program flow graph across which a
variable is both defined and live (busy), beginning at the point where
the variable receives a value and ending at the last place that value is used.
Also, du-chain.


dereference

to convert from a pointer (address) to the data that is pointed to.


derivation

a list of steps that shows how a sentence
in a language is derived from a grammar by application of grammar rules.


descendant

a node in a tree that is a child of another node or a descendant of one
of its children.

deterministic finite automaton

a finite automaton that has at most one transition from a state for each
input symbol and no empty transitions. Abbreviated DFA.


DFA

deterministic finite automaton.


dhrystone

a set of benchmark programs; a unit for comparing relative processor
performance. cf. whetstone.

disambiguating rules

rules that allow an ambiguous situation to be resolved to a single
outcome, e.g. rules of operator precedence.


discipline

a policy used in determining the order of actions, such as the order of
filling requests for service.


display

in an activation record, an array of pointers to the activation records
of surrounding blocks, used to access variables defined in those blocks.


dominator

a basic block of a program is a dominator of a second block if every
path from the entry of the program to the second block passes through it.
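
One common way to compute dominators is a simple iterative fixed-point calculation, sketched here. The flow-graph encoding (a dict mapping each block to its successors, with every block present as a key) is an assumption of the example, and this sketch treats each block as dominating itself:

```python
def dominators(succ, entry):
    """Compute dom[b], the set of blocks that dominate b (including b itself)."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            preds = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*preds) if preds else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Hypothetical diamond-shaped flow graph: A branches to B and C, which join at D.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
dom = dominators(cfg, "A")
```

In this graph neither B nor C dominates D, because D can be reached through either one.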


dynamic

refers to things that happen or can only be determined
during actual execution of a program. cf. static.

dynamic memory

memory that is assigned during execution of a program, especially heap
memory.

dynamic scoping

a convention in a language, such as Lisp, that a variable can be referenced
by any procedure that is executed after it has become bound and before
it becomes unbound; thus, the scope of the variable can depend on the
execution sequence.

dynamic type checking

testing of the types of the values of variables at runtime, as is done
in Lisp and object-oriented languages. cf. static type checking.

effective address

the address of a data element, taking into account
offsets due to array indexing and record accesses.

embedded language

a language that is built on another language implementation, by
interpretation or translation into the other language; e.g., an expert
system language embedded in Lisp.


encapsulation

a method of making a software system modular by creating
well-defined interface routines that deal with a particular kind of data
and allowing other programs to access the data only through those routines;
the interface routines encapsulate the data. cf. information hiding.


enumerate

to generate all of the members of a set.

enumerated type

a scalar type consisting of a finite set of
enumerated values, e.g. type boolean = (false, true);.


epilogue

a section of code that is executed just before leaving a subprogram
to restore register values, transfer the result of the subprogram to
the calling program, and branch to the return address.


EQUIVALENCE

a statement in Fortran that specifies that two variables
occupy the same or overlapping storage; it is possible that the variables
have different types.

equivalence relation

a relation that is reflexive, symmetric, and transitive.

equivalent grammars

grammars that denote the same language.

error production

a grammar production, as in a Yacc grammar,
that is executed if no other production matches the input.

execution stack

a stack of activation records or stack frames that is maintained
during execution of programs in a block-structured or recursive language.

external fragmentation

storage fragments consisting of unused memory between blocks that
are in use.


FA

finite automaton.


FAR

finite automaton recognizable.


field

a component part of a data record.

finite automaton

an abstract computer consisting of an alphabet of symbols, a finite
set of states, a starting state, a subset of accepting states, and
transition rules that specify transitions from one state to another
depending on the input symbol. The machine begins in the starting
state; for each input symbol, it makes a transition as specified by the
transition rules. If the automaton is in an accepting state at the end
of the input, the input is recognized. Also, finite state machine.
Abbreviated FA.
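
The definition above can be sketched as a small simulator for the deterministic case. The transition-table encoding and the example machine (strings over {0, 1} with an even number of 1s) are assumptions chosen for illustration:

```python
def accepts(transitions, start, accepting, text):
    """Run a deterministic finite automaton over text; True if it is accepted."""
    state = start
    for symbol in text:
        state = transitions.get((state, symbol))
        if state is None:          # no transition defined: reject
            return False
    return state in accepting      # accepted iff we end in an accepting state

# Hypothetical machine: accepts binary strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
ok = accepts(EVEN_ONES, "even", {"even"}, "1001")
```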

finite automaton recognizable

a language that is regular. Abbreviated FAR.

finite state machine

see finite automaton.


flex

a Free Software Foundation program similar to lex.


flush

1. to clear a buffer by writing out or transmitting its contents.
2. to discard remaining data in a buffer.

formal grammar

see grammar.

formal parameter

a parameter specified in the argument list of a procedure definition.
cf. actual parameter.

forward reference

reference to a label in a program that has not yet appeared in the program


fragmentation

the breaking up of memory into blocks that are too small to be of use.
cf. internal fragmentation, external fragmentation.


fringe

the set of leaf nodes of a tree.


garbage

storage that can no longer be accessed because no pointer to it exists.

garbage collection

the identification of unused storage and collection of it so that it can
be placed back on the heap for reuse.


GC

1. garbage collection.
2. the occurrence of a garbage collection during execution.
3. to perform garbage collection.


generator

a procedure that produces the elements of a sequence, returning the next
element each time it is called; e.g., a pseudo-random number generator.
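
A sketch of the pseudo-random example using a linear congruential recurrence. The multiplier, increment, and modulus below are illustrative constants, not values taken from the original set, and the sequence is not suitable for serious use:

```python
def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Yield an endless sequence of pseudo-random integers in [0, m)."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state

gen = lcg(seed=1)
first = next(gen)    # each call to next() returns the next element
second = next(gen)
```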

global analysis

analysis of the properties of an entire program or
procedure.

global optimization

optimization based on analysis of the entire
program or procedure.


grammar

a formal specification of a language, consisting of a set of nonterminal
symbols, a set of terminal symbols or words, and production rules that specify
transformations of strings containing nonterminals into other strings.


granularity

refers to the size of problem or data that is handled.


graph

a (directed) graph is a pair (S, Gamma) where S is a set of
nodes and Gamma, a subset of S X S, is a set of transitions between
nodes.

graph coloring

an algorithm for assigning a minimal number of "colors" to nodes of
a graph such that no two nodes that are connected have the same color.
Used in compilers as a method of register assignment: colors correspond
to registers, nodes to variables or def-use chains, and connections to
variables that are simultaneously live.
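
Finding a truly minimal coloring is expensive in general, so the sketch below uses a simple greedy heuristic instead: it always produces a valid coloring but does not guarantee the minimum number of colors. The interference graph and variable names are hypothetical:

```python
def greedy_coloring(interference):
    """Give each node the lowest color not used by an already-colored neighbor.

    interference maps each node to the set of nodes it conflicts with.
    Nodes are visited in order of decreasing degree (a common heuristic).
    """
    color = {}
    for node in sorted(interference, key=lambda n: -len(interference[n])):
        used = {color[nb] for nb in interference[node] if nb in color}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color

# Hypothetical variables a, b, c are live at the same time; d overlaps only c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
colors = greedy_coloring(graph)
```

Each color number would correspond to one register, so variables with different colors never share a register while both are live.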


groupware

software that facilitates the cooperative work of a group of people on
a single problem concurrently; members of the group may be at different
locations and communicate via networks.


handle

in bottom-up parsing, the substring that should next be reduced as a phrase.

handle pruning

in bottom-up parsing, the process of removing the handle from the parsing
stack and replacing it by a nonterminal symbol or data structure representing
the phrase.

hash function

a deterministic function that converts a
symbol or other input to a "randomized" integer value.

hash table

a table that associates key values with data by use of
a hash function.
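
A small sketch of a symbol table built this way. The hash function (a base-31 polynomial over character codes), the chaining strategy for collisions, and the class name are all assumptions for illustration:

```python
def hash_symbol(name, size):
    """Deterministically map a symbol to a bucket index in [0, size)."""
    h = 0
    for ch in name:
        h = (h * 31 + ord(ch)) % size
    return h

class SymbolTable:
    """Hash table with chaining: colliding symbols share a bucket list."""

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def insert(self, name, data):
        bucket = self.buckets[hash_symbol(name, len(self.buckets))]
        for entry in bucket:
            if entry[0] == name:
                entry[1] = data      # symbol already present: update it
                return
        bucket.append([name, data])

    def lookup(self, name):
        bucket = self.buckets[hash_symbol(name, len(self.buckets))]
        for key, data in bucket:
            if key == name:
                return data
        return None

table = SymbolTable()
table.insert("count", "integer")
table.insert("total", "real")
```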


heap

an area of contiguous memory and/or a set of unused storage records that
can be allocated to the running program as dynamic memory upon request;
the address of the record is returned and assigned to a pointer variable.
new in Pascal, malloc in C, and cons in Lisp allocate
heap memory.

higher-order logic

a logic that is more powerful than first-order predicate calculus,
e.g., one that allows quantification over predicate symbols.


hypercube

a parallel computer architecture in which many CPUs (the number of which
is a power of 2, say 2^n) are logically connected
as an n-dimensional hypercube, where each processor is at a corner of the
cube and is directly connected to the processors at neighboring corners.
A message can be transferred from any processor to any other in a number
of steps proportional to the logarithm of the number of processors.
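
One way to see the logarithmic bound, assuming processors are numbered 0 to 2^n - 1 so that neighbors differ in exactly one bit of their binary labels (the function names are illustrative):

```python
def neighbors(node, n):
    """Processors directly connected to node in an n-dimensional hypercube."""
    return [node ^ (1 << k) for k in range(n)]

def hops(src, dst):
    """Minimum number of links a message crosses: the count of differing bits."""
    return bin(src ^ dst).count("1")

near = neighbors(0b000, 3)
far = hops(0b000, 0b111)
```

With 2^n processors, labels have n bits, so no message needs more than n hops.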


identifier

a symbol that is used as the name of a variable, type,
constant, procedure, etc.


infix

an expression written with an operator between its operands,
e.g. a + b . cf. prefix, postfix.

implicit parameter

a parameter that is passed to a subprogram without being specified directly
by the programmer, e.g., the return address.

induction variable

a variable that is incremented during a loop and used to perform a similar
action on multiple data; also, loop index.

information hiding

a method of making programs modular by allowing programs to see only
a set of well-defined interfaces to a data type, but not the internal
implementation of the data. cf. encapsulation.


inherit

to use a method defined in a superclass.


inheritance

the availability of procedures or data by virtue of membership in a class,
as in an object-oriented system.
