CS 375 Compilers Novak Vocabulary

317 terms by gordonnovak 

absolute address

the numeric address of a location in memory.
cf. relative address.

absolute code

computer program code that is executable without
further processing: all addresses in the code are absolute.
cf. relocatable code.

abstract syntax tree (AST)

a tree representation of a program
that is abstracted from the details of a particular programming
language and its surface syntax.

accepting state

a state of a finite automaton in which the input string is accepted
as being a member of the language recognized by the automaton.

activation record

stack frame.

actual parameter

a parameter used in a call to a subprogram. cf. formal parameter.

address alignment

see storage alignment.

address space

1. the set of memory addresses that a program may reference.
2. the amount of memory allocated to a program or user.
3. the amount of memory addressable by the address size of a
machine instruction.

adjacency matrix

a method of representing a graph by a Boolean
matrix M, where M_{ij} = 1 iff there is an arc from node i
to node j in the graph.
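
As a minimal sketch in Python (the function name `adjacency_matrix` and the sample arcs are illustrative assumptions, not part of the original glossary), a graph whose nodes are numbered from 0 could be stored this way:

```python
def adjacency_matrix(n, arcs):
    """Return an n x n Boolean matrix M with M[i][j] = 1 iff there is an arc i -> j."""
    m = [[0] * n for _ in range(n)]
    for i, j in arcs:
        m[i][j] = 1
    return m

# Hypothetical 3-node graph with arcs 0 -> 1, 1 -> 2, and 2 -> 0.
M = adjacency_matrix(3, [(0, 1), (1, 2), (2, 0)])
```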


alias

an alternate name for a memory location. Whenever a given memory
location is denoted by more than one name, any of the names can be
considered to be an alias.

aliasing

the creation of an alternate name for data, either in the definition of
a program or during its operation.

alphabet

a set of symbols used in the definition of a language.

ambiguity

a case where more than one interpretation is possible.

ambiguous grammar

a grammar that allows some sentence or string to be generated
or parsed in more than one way (i.e., with distinct parse trees).


antisymmetric

a relation R is antisymmetric iff
for all a, b: (a R b and b R a) --> a = b.
Example: <=

arity

the number of arguments of a function.

assembly language

a language for writing computer programs, in which
one assembly language instruction usually corresponds to one machine
instruction.

associativity

a specification of the order in which operations should be performed
when two operators of the same precedence are adjacent. Most operators
are left-associative, e.g. the expression A - B - C
should be interpreted as ((A - B) - C).
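
The difference matters for subtraction. The following Python sketch (the helper names `eval_left` and `eval_right` are assumptions made for illustration) evaluates the same chain A - B - C under both groupings:

```python
from functools import reduce

def eval_left(values):
    """Evaluate v0 - v1 - v2 ... as ((v0 - v1) - v2) ..., i.e. left-associatively."""
    return reduce(lambda acc, v: acc - v, values)

def eval_right(values):
    """Evaluate the same chain as v0 - (v1 - (v2 ...)), i.e. right-associatively."""
    result = values[-1]
    for v in reversed(values[:-1]):
        result = v - result
    return result

left = eval_left([10, 4, 3])    # (10 - 4) - 3
right = eval_right([10, 4, 3])  # 10 - (4 - 3)
```

With A = 10, B = 4, C = 3 the two groupings give different results, which is why a language must fix the associativity of each operator.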


AST

abstract syntax tree.

augmented transition network (ATN)

a formalism for describing parsers, especially for natural language.
Similar to a finite automaton, but augmented in that arbitrary tests may be
attached to transition arcs, subgrammars may be called recursively, and
structure-building actions may be executed as an arc is traversed.

automatic programming

synthesis of a program that satisfies a
specification, where the specification is higher-level than ordinary
programming languages.


available

an expression is available if it has been computed previously in the
computation path preceding the current location and has not been killed.

backpatching

filling in the address of a label, which has just become defined, in
preceding parts of the program that made forward references to it.
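
A minimal sketch of the idea in Python, assuming a toy instruction format invented for this example (tuples of the form `("label", name)`, `("jmp", name)`, or `("op", text)`); it is not the format of any particular assembler:

```python
def assemble(lines):
    """Resolve jump targets in a toy program, patching forward references."""
    code = []      # emitted instructions
    labels = {}    # label name -> address (index into code)
    pending = {}   # label name -> indexes of jumps still waiting for that label
    for kind, arg in lines:
        if kind == "label":
            labels[arg] = len(code)
            # The label has just become defined: fill in earlier forward references.
            for idx in pending.pop(arg, []):
                code[idx] = ("jmp", labels[arg])
        elif kind == "jmp":
            if arg in labels:
                code.append(("jmp", labels[arg]))
            else:
                pending.setdefault(arg, []).append(len(code))
                code.append(("jmp", None))  # placeholder to be patched later
        else:
            code.append((kind, arg))
    return code

program = [("jmp", "end"), ("op", "add"), ("label", "end"), ("op", "halt")]
result = assemble(program)
```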

base address

the address of the beginning of a data area. This address is added to
a relative address or offset to compute an absolute address.

basic block

a sequence of program statements such that if any of them is executed, all
of them are; a sequence of statements that has a label (if any) only at the
beginning and a branch (if any) only at the end.

basic type

a data type that is implemented in computer hardware
instructions, such as integer or real.


binding

the association of a name with a variable or value.

bison

a Free Software Foundation program similar to yacc.

bit vector

a sequence of Boolean values (0 or 1) represented as the
bits of one or more computer words. It is an efficient representation
for sets that are subsets of a fairly small finite set of possible elements.
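
As a rough illustration in Python, a set over a small universe can be packed into the bits of one integer, so that union and intersection become single bitwise operations. The universe `ELEMENTS` and the helper names are assumptions made for this sketch:

```python
ELEMENTS = ["a", "b", "c", "d"]                     # hypothetical finite universe
INDEX = {name: i for i, name in enumerate(ELEMENTS)}

def to_bits(names):
    """Encode a set of element names as an integer bit vector."""
    bits = 0
    for name in names:
        bits |= 1 << INDEX[name]
    return bits

def to_names(bits):
    """Decode an integer bit vector back into a set of element names."""
    return {name for name, i in INDEX.items() if (bits >> i) & 1}

x = to_bits({"a", "c"})
y = to_bits({"c", "d"})
union = x | y          # set union is bitwise OR
intersection = x & y   # set intersection is bitwise AND
```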


block

short for basic block.

BNF

Backus-Naur Form, a syntax for writing context-free grammars
that describe computer languages.

Boolean matrix

a matrix whose elements are Boolean values, 0
or 1.

bottom-up parsing

a parsing method in which input words are matched against the right-hand
sides of grammar productions in an attempt to build a parse tree from the
bottom towards the top.


BSS

a pseudo-operation for some assemblers, used to specify the
reservation of a block of storage, perhaps initialized to some constant
value. (an abbreviation of Block Started by Symbol.)

busy

describes a variable whose value will be needed later during
program execution. Also, live.

cache

a fast memory, smaller than the total main memory, used
by the CPU for faster access to data.

cache miss

a reference to a memory location that is not in the
cache, causing processing to be delayed until the operand can be fetched
from main memory.

cache prefetch

an instruction that causes the contents
of a specified memory address to be fetched from main memory into the cache
memory so that it will be available for fast access when needed.

call by name

a form of parameter transmission in which the effect of a textual
substitution of the actual parameter for the formal parameter is achieved.

call by reference

a form of parameter transmission in which the address of the actual
parameter is transmitted to the subprogram. Any changes to the parameter
made by the subprogram will change the argument as seen by the calling
program.

call by value

a form of parameter transmission in which the value of the actual
parameter is transmitted to the subprogram. Changes to the parameter
made by the subprogram will not be seen by the calling program.

canonical derivation

rightmost derivation.

canonical form

a standardized form of expressions or data. If all programs put their
expressions into a canonical form, the number of cases that will have
to be considered by other programs is reduced.

Cartesian product

if A and B are sets, the Cartesian product
A × B is the set of ordered pairs (a, b) where
a in A and b in B.
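
For small finite sets the product can be enumerated directly. This Python sketch uses the standard-library `itertools.product`; the sample sets are arbitrary:

```python
from itertools import product

A = {1, 2}
B = {"x", "y"}

# Every ordered pair (a, b) with a in A and b in B.
pairs = set(product(A, B))
```

The number of pairs is the product of the sizes of the two sets.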

cascading errors

a situation, e.g. in compiling a program, where one error causes many
reported errors. For example, failure to declare a variable may cause
an error every time that variable is referenced.


cast

to coerce a given value to be of a specified type.

Chomsky hierarchy

the hierarchy of formal language types: regular, context free, context
sensitive, and recursively enumerable languages, each of which is a proper
subset of the following class.


CISC

complex instruction set computer.

class

in object-oriented programming, a description of a set of similar objects.
For example, Fido, an object or instance, might be a member of the class Dogs.

class variable

in an object-oriented language, a variable associated with a class of
objects, e.g. the number of members of the class.

closed procedure

a procedure whose code is separate from that of calling programs; the
procedure is entered by a subroutine call, and it returns to the calling
program when it is finished.

code generation

the phase of a compiler in which executable output code is generated
from intermediate code.

code motion

the movement of code by a compiler to a place other than where it appears
in the source program. For example, an expensive but unchanging
computation might be moved outside a loop.


coercion

see type coercion.

collision

in a hash table, a case in which a symbol has the same
hash function value as another symbol.

column-major order

a method of storing arrays in which values in a column of the array
are in adjacent memory locations. cf. row-major order.
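
The two storage orders differ only in how an index pair is mapped to an offset. A small Python sketch, assuming 0-based indexes and offsets counted in elements (the function names are illustrative):

```python
def column_major_offset(i, j, rows):
    """Offset of element [i][j] when the array is stored column by column."""
    return j * rows + i

def row_major_offset(i, j, cols):
    """Offset of element [i][j] when the array is stored row by row."""
    return i * cols + j

# Element [0][1] of a hypothetical array with 2 rows and 3 columns.
col_off = column_major_offset(0, 1, rows=2)
row_off = row_major_offset(0, 1, cols=3)
```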

common subexpression

an expression that appears more than once
in a program.


compiler

a program that translates a source language into an
object language that is executable on a computer.

compiler-compiler

a program that produces a compiler for a language from a specification
of the syntax and semantics of the language, e.g. yacc.

compile-time

describes a task that is or can be done by the compiler,
without running the program. cf. static.

complex instruction set computer

a CPU design featuring a large number of relatively complex instructions.
Abbreviated CISC. cf. RISC.


computed

of a subexpression, having its value computed within a given block of code.

concatenation

making a sequence that consists of the elements of a first sequence
followed by those of a second sequence.

concatenation of languages

a language consisting of the set of sentences formed by concatenating a
sentence from the first language and a sentence from the second language.

condition code register

A CPU register that describes the result of the last arithmetic operation
or comparison. It typically contains bits for <0, =0,
>0, carry, and overflow.

constant folding

performing at compile time an operation whose
operands are constant, giving a new constant result.
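
A minimal sketch in Python, assuming expressions are represented as nested tuples `(operator, left, right)` with integers as constants and strings as variable names; this representation and the name `fold` are invented for the example:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr):
    """Replace any subexpression whose operands are both constants by its value."""
    if not isinstance(expr, tuple):
        return expr                      # a constant or a variable name
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)      # computed at "compile time"
    return (op, left, right)

# x + 2 * (3 + 4): the constant part folds, the variable remains.
folded = fold(("+", "x", ("*", 2, ("+", 3, 4))))
```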

context-free grammar

a grammar in which the left-hand side of each production consists of a
single nonterminal symbol.

control flow analysis

analysis of the possible paths that control flow
may take in a program during execution.

current status variable

in a hand-written parser, a variable that
denotes the construct last seen, e.g., start of expression,
operator, or operand.


DAG

directed acyclic graph,
a graph consisting of a set of nodes and directed arcs (arrows) between
nodes, such that no circular paths (cycles) exist.

dangling reference

in execution of a program, a reference, usually by means of a pointer,
to storage that has been deallocated. For example, in a recursive
language, a pointer to storage that is allocated on the execution stack
could be retained after the routine associated with that stack frame has
exited, resulting in errors.

data area

a contiguous area of memory, specified by its base address and size.
Data within the area are referenced by the base address of the area and
the offset, or relative address, of the data within the area.

data flow analysis

analysis of the places in a program where data
receive values and the places where those data values can subsequently be used.

dead code

parts of a program that cannot be reached during execution and therefore
can never be executed.


declaration

a statement in a programming language that provides
information to the compiler, such as the structure of a data record,
but does not specify executable code.

defined

of a variable, having received a value prior to a given point in a program.

definition-use chain

the portion of a program flow graph across which a
variable is both defined and live (busy), beginning at the point where
the variable receives a value and ending at the last place that value is used.
Also, du-chain.


dereference

to convert from a pointer (address) to the data that is pointed to.

derivation

a list of steps that shows how a sentence
in a language is derived from a grammar by application of grammar rules.

descendant

a node in a tree that is a child of another node or a descendant of one
of its children.

deterministic finite automaton

a finite automaton that has at most one transition from a state for each
input symbol and no empty transitions. Abbreviated DFA.


DFA

deterministic finite automaton.

disambiguating rules

rules that allow an ambiguous situation to be resolved to a single
outcome, e.g. rules of operator precedence.


display

in an activation record, an array of pointers to the activation records
of surrounding blocks, used to access variables defined in those blocks.

dominator

a basic block of a program is a dominator of a second block if every
path from the entry of the program to the second block passes through it.
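
Dominator sets can be computed by a simple iterative method. The Python sketch below is one common textbook formulation, not necessarily the one used in the course; it assumes the flow graph is given as a dictionary mapping every block name to its list of successors, and that every block is reachable from the entry:

```python
def dominators(succ, entry):
    """Return a dict mapping each block to the set of blocks that dominate it."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            if preds[n]:
                new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            else:
                new = {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Hypothetical diamond-shaped flow graph: entry branches to a or b, which rejoin.
FLOW = {"entry": ["a", "b"], "a": ["join"], "b": ["join"], "join": []}
DOM = dominators(FLOW, "entry")
```

In the diamond, neither branch dominates the join block, because a path exists through the other branch.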


dynamic

refers to things that happen or can only be determined
during actual execution of a program. cf. static.

dynamic memory

memory that is assigned during execution of a program, especially heap
memory.

dynamic scoping

a convention in a language, such as Lisp, that a variable can be referenced
by any procedure that is executed after it has become bound and before
it becomes unbound; thus, the scope of the variable can depend on the
execution sequence.

dynamic type checking

testing of the types of the values of variables at runtime, as is done
in Lisp and object-oriented languages. cf. static type checking.

effective address

the address of a data element, taking into account
offsets due to array indexing and record accesses.
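
For an array of records, the computation might be sketched as follows in Python. The parameter names and the sample numbers are assumptions for illustration; real address arithmetic depends on the machine and the data layout:

```python
def effective_address(base, index, element_size, field_offset=0):
    """Base address plus the array-indexing offset plus the offset of a record field."""
    return base + index * element_size + field_offset

# Field at offset 4 within element 3 of an array of 8-byte records based at 1000.
addr = effective_address(1000, 3, 8, field_offset=4)
```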


encapsulation

a method of making a software system modular by creating
well-defined interface routines that deal with a particular kind of data
and allowing other programs to access the data only through those routines;
the interface routines encapsulate the data. cf. information hiding.

enumerate

to generate all of the members of a set.

enumerated type

a scalar type consisting of a finite set of
enumerated values, e.g. type boolean = (false, true);.


epilogue

a section of code that is executed just before leaving a subprogram
to restore register values, transfer the result of the subprogram to
the calling program, and branch to the return address.

equivalence relation

a relation that is reflexive, symmetric, and transitive.

equivalent grammars

grammars that denote the same language.

error production

a grammar production, as in a Yacc grammar,
that is executed if no other production matches the input.

execution stack

a stack of activation records or stack frames that is maintained
during execution of programs in a block-structured or recursive language.


FA

finite automaton.

FAR

finite automaton recognizable.

field

a component part of a data record.

finite automaton

an abstract computer consisting of an alphabet of symbols, a finite
set of states, a starting state, a subset of accepting states, and
transition rules that specify transitions from one state to another
depending on the input symbol. The machine begins in the starting
state; for each input symbol, it makes a transition as specified by the
transition rules. If the automaton is in an accepting state at the end
of the input, the input is recognized. Also, finite state machine.
Abbreviated FA.
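
The behavior described above can be simulated in a few lines of Python. This sketch handles the deterministic case only; the transition table `EVEN_ONES` and the function name `accepts` are hypothetical examples:

```python
def accepts(transitions, start, accepting, text):
    """Run a deterministic finite automaton over text; True if it ends in an accepting state."""
    state = start
    for symbol in text:
        key = (state, symbol)
        if key not in transitions:
            return False            # no transition defined: the input is rejected
        state = transitions[key]
    return state in accepting

# Hypothetical automaton over {0, 1} that accepts strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
```

For example, `accepts(EVEN_ONES, "even", {"even"}, "1001")` follows the states even, odd, odd, odd, even and ends in the accepting state.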

finite automaton recognizable

a language that is regular. Abbreviated FAR.

finite state machine

see finite automaton.


flex

a Free Software Foundation program similar to lex.

formal grammar

see grammar.

formal parameter

a parameter specified in the argument list of a procedure definition.
cf. actual parameter.

forward reference

a reference to a label that has not yet appeared in the program.

fragmentation

the breaking up of memory into blocks that are too small to be of use.
cf. internal fragmentation, external fragmentation.

garbage

storage that can no longer be accessed because no pointer to it exists.

garbage collection

the identification of unused storage and collection of it so that it can
be placed back on the heap for reuse.


gc

1. garbage collection.
2. the occurrence of a garbage collection during execution.
3. to perform garbage collection.

generator

a procedure that produces the elements of a sequence, returning the next
element each time it is called; e.g., a pseudo-random number generator.
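
Python has direct language support for this idea. The sketch below is a linear congruential pseudo-random number generator written as a Python generator function; the name `lcg` and the multiplier, increment, and modulus are illustrative choices, not recommendations:

```python
def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Yield an endless pseudo-random sequence; each value is computed from the previous one."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state

gen = lcg(seed=1)
first = next(gen)    # each call to next() returns the next element
second = next(gen)
```

The generator keeps its own state between calls, so the caller only ever asks for the next element.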

global analysis

analysis of the properties of an entire program or
procedure.

global optimization

optimization based on analysis of the entire
program or procedure.


grammar

a formal specification of a language, consisting of a set of nonterminal
symbols, a set of terminal symbols or words, and production rules that specify
transformations of strings containing nonterminals into other strings.

graph

a (directed) graph is a pair (S, Γ) where S is a set of
nodes and Γ ⊆ S × S is a set of transitions between
nodes.

graph coloring

an algorithm for assigning a minimal number of "colors" to nodes of
a graph such that no two nodes that are connected have the same color.
Used in compilers as a method of register assignment: colors correspond
to registers, nodes to variables or def-use chains, and connections to
variables that are simultaneously live.
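
Finding a truly minimal coloring is expensive in general, so practical register allocators use heuristics. The Python sketch below is a simple greedy heuristic, which is an assumption of this example and not guaranteed to use the fewest colors; the graph `GRAPH` is a made-up interference graph:

```python
def greedy_coloring(interference):
    """Assign each node the smallest color number not used by an already-colored neighbor."""
    colors = {}
    # Visit nodes with the most neighbors first (ties broken by name).
    order = sorted(interference, key=lambda n: (-len(interference[n]), n))
    for node in order:
        used = {colors[nb] for nb in interference[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors

# Hypothetical interference graph: an edge joins variables that are live at the same time.
GRAPH = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
COLORS = greedy_coloring(GRAPH)
```

Here each color number stands for one register; variables a, b, and c interfere pairwise and therefore need three distinct registers, while d can share a register with a or b.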

hash function

a deterministic function that converts a
symbol or other input to a "randomized" integer value.

hash table

a table that associates key values with data by use of
a hash function.
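
A minimal sketch of a hash table in Python, assuming string keys and resolving collisions by keeping a list of entries per bucket (chaining). The class name, the hash function, and the multiplier 31 are illustrative assumptions:

```python
def hash_symbol(symbol, size):
    """Deterministically map a string to a bucket index in range(size)."""
    h = 0
    for ch in symbol:
        h = (h * 31 + ord(ch)) % size
    return h

class HashTable:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def put(self, key, value):
        bucket = self.buckets[hash_symbol(key, len(self.buckets))]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # key already present: replace its value
                return
        bucket.append((key, value))        # new key (possibly a collision): chain it

    def get(self, key, default=None):
        bucket = self.buckets[hash_symbol(key, len(self.buckets))]
        for k, v in bucket:
            if k == key:
                return v
        return default

table = HashTable(size=2)                  # a tiny table, so collisions are certain
for name, kind in [("x", "integer"), ("y", "real"), ("z", "integer")]:
    table.put(name, kind)
```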


heap

an area of contiguous memory and/or a set of unused storage records that
can be allocated to the running program as dynamic memory upon request;
the address of the record is returned and assigned to a pointer variable.
new in Pascal, malloc in C, and cons in Lisp allocate
heap memory.

identifier

a symbol that is used as the name of a variable, type,
constant, procedure, etc.

infix

an expression written with an operator between its operands,
e.g. a + b. cf. prefix, postfix.

implicit parameter

a parameter that is passed to a subprogram without being specified directly
by the programmer, e.g., the return address.

induction variable

a variable that is incremented during a loop and used to perform a similar
action on multiple data; also, loop index.

information hiding

a method of making programs modular by allowing programs to see only
a set of well-defined interfaces to a data type, but not the internal
implementation of the data. cf. encapsulation.


inherit

to use a method defined in a superclass.

inheritance

the availability of procedures or data by virtue of membership in a class,
as in an object-oriented system.

inherited attribute

an attribute of a node in a parse tree that is derived from the context
in which the node appears. cf. synthesized attribute.


inline

inserting code of a subprogram directly into the code compiled for the
calling program, rather than compiling a subroutine call to an external
subprogram.


inorder

an order of visiting binary trees, in which the left subtree of a node
is examined, followed by the node itself, followed by the right subtree.
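
A short recursive sketch in Python, assuming a binary tree is represented as a nested tuple `(left, value, right)` with `None` for an empty subtree (a representation chosen only for this example):

```python
def inorder(node):
    """Return the node values in inorder: left subtree, node, right subtree."""
    if node is None:
        return []
    left, value, right = node
    return inorder(left) + [value] + inorder(right)

# Expression tree for a + (b * c).
TREE = ((None, "a", None), "+", ((None, "b", None), "*", (None, "c", None)))
VISITED = inorder(TREE)
```

For an expression tree, this order reproduces the operands and operators as they would appear in infix notation.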


instance

in object-oriented programming, an individual data object
that is an instance of a class of similar objects. Also, object.

instance variable

a data field in an instance.

intermediate code

see intermediate language.

intermediate language

an internal language used as the representation of a program during
compilation, such as trees or quadruples. The source language is
translated to intermediate language, then to the object language.

internal fragmentation

wasted storage within a block, either because the block is of fixed size
and is not all used, or because of padding.

interpreted code

a form of program that is read and executed by an interpreter program.
The interpreter reads an instruction, determines its meaning, and executes it.


interval

a set of basic blocks of a program that comprise a sequence of statements or
a simple loop.

invariant code

code whose value does not change during a certain
period of program execution.


keyword

a special word that is used to indicate the structure of a language,
such as the reserved words of computer languages.

killed

of a subexpression, having any previously computed value invalidated
by redefinition of a component of the subexpression. Note that the term
"killed" cannot properly be applied to a program variable.

Kleene closure

zero or more occurrences of a grammar item;
indicated by a superscript *.
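
Regular-expression libraries use the same star notation. As an illustration using Python's standard `re` module (the pattern and the helper name `in_closure` are arbitrary examples), the closure of the item `ab` matches zero or more repetitions of it:

```python
import re

PATTERN = re.compile(r"(ab)*")   # Kleene closure of the item "ab"

def in_closure(text):
    """True if text consists of zero or more occurrences of "ab"."""
    return PATTERN.fullmatch(text) is not None
```

Note that the empty string is included, since zero occurrences are allowed.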

language denoted by a grammar

L(G), the set of strings that can
be derived from a grammar, beginning with the start symbol.

left-associative operator

an operator in an arithmetic expression such that if there are two adjacent
occurrences of the operator, the left one should be done first.

left factoring

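a method of modifying a grammar in which alternative productions for a
nonterminal that begin with a common prefix are rewritten with the prefix
factored out, so that a top-down parser can choose among them without
backtracking. It is distinct from the elimination of left recursion.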

left recursion

in top-down parsing, a grammar rule whose right-hand side begins with the
nonterminal symbol on the left-hand side will cause an infinite recursion,
called left recursion. Also, describes such a production.

leftmost derivation

a derivation in which the leftmost nonterminal of the string is replaced
at each step.


lex

a popular software tool for constructing a lexical analyzer from a regular
grammar and actions associated with the grammar productions.

lexeme

a basic symbol in a language; e.g., a variable name would be a lexeme for
a grammar of a programming language.
