Dependency Manager Interface
The dependency manager tracks needs that dependents have of providers. This
is a general-purpose interface that is associated with a
DataDictionary object; in fact, the dependency manager is really the
data dictionary keeping track of dependencies between the objects that it
handles (descriptors) as well as prepared statements.
The primary example of this is a prepared statement's needs of
schema objects such as tables.
Dependencies are used so that we can determine when we
need to recompile a statement; compiled statements depend
on schema objects like tables and constraints, and may
no longer be executable when those tables or constraints are
altered. For example, consider an insert statement.
An insert statement is likely to have dependencies on the table it
inserts into, any tables it selects from (including
subqueries), the authorities it uses to do this,
and any constraints or triggers it needs to check.
A prepared insert statement has a dependency on the target table
of the insert. When it is compiled, that dependency is registered
from the prepared statement on the data dictionary entry for the
table. This dependency is added to the prepared statement's dependency
list, which is also accessible from an overall dependency pool.
A DDL statement will mark invalid any prepared statement that
depends on the schema object the DDL statement is altering or
dropping. We tend to want to track at the table level rather than
the column or constraint level, so that we are not overburdened
with dependencies. This does mean that we may invalidate when in
fact we do not need to; for example, adding a column to a table may
not actually cause an insert statement compiled for that table
to stop working, but our level of granularity forces us to
invalidate the insert anyway: every statement that depends on the
table must be invalidated, because some of them may in fact no
longer be valid.
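The trade-off above can be illustrated with a minimal sketch of table-level
tracking. The class names TrackedStatement and TableLevelRegistry, and the use
of table names as keys, are assumptions made for illustration; they are not
part of the interface described in this document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a prepared statement (an assumption, not from the design).
class TrackedStatement {
    final String sql;
    boolean valid = true;

    TrackedStatement(String sql) {
        this.sql = sql;
    }
}

// Hypothetical registry that tracks dependencies per table, not per column or constraint.
class TableLevelRegistry {
    private final Map<String, List<TrackedStatement>> dependentsByTable = new HashMap<>();

    // Registered when a statement that uses the table is compiled.
    void addDependency(TrackedStatement stmt, String tableName) {
        dependentsByTable.computeIfAbsent(tableName, k -> new ArrayList<>()).add(stmt);
    }

    // Called by DDL on the table. Every dependent is marked invalid, including
    // statements that would in fact still work after the change.
    // Returns how many statements changed from valid to invalid.
    int invalidateFor(String tableName) {
        int count = 0;
        for (TrackedStatement stmt : dependentsByTable.getOrDefault(tableName, new ArrayList<>())) {
            if (stmt.valid) {
                stmt.valid = false;
                count++;
            }
        }
        return count;
    }
}
```

In this sketch an insert compiled against table t is invalidated by any DDL on
t, while statements that depend only on other tables are left alone.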
It is up to the user of the dependency system at what granularity
to track dependencies, where to hang them, and how to identify when
objects become invalid. The dependency system basically supplies
the ability to find out which dependents are interested in knowing about
operations on other, distinct objects. The primary user is the language system,
and its primary use is for invalidating prepared statements when
DDL occurs.
The insert will recompile itself when its next execution
is requested (not when it is invalidated). We don't want it to
recompile when the DDL is issued, as that would increase the time
of execution of the DDL command unacceptably. Note that the DDL
command is also allowed to proceed even if it would make the
statement no longer compilable. It can be useful to have a way
to recompile invalid statements during idle time in the system,
but our first implementation will simply recompile at the next
execution.
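A minimal sketch of this deferred recompilation follows. The class
LazyStatement and its compile counter are hypothetical and exist only to make
the behavior observable; the actual recompilation work is elided.

```java
// Hypothetical prepared statement that recompiles lazily (an illustration only).
class LazyStatement {
    private boolean valid = false; // not yet compiled
    private int compileCount = 0;

    // Called when DDL touches a provider. Deliberately cheap: it only flips
    // a flag, so the DDL command does not pay for recompilation.
    void makeInvalid() {
        valid = false;
    }

    boolean isValid() {
        return valid;
    }

    int getCompileCount() {
        return compileCount;
    }

    // Recompilation happens here, at the next execution request.
    void execute() {
        if (!valid) {
            recompile();
        }
        // ... running the compiled plan is omitted in this sketch
    }

    private void recompile() {
        compileCount++;
        valid = true;
    }
}
```

With this shape, several invalidations between two executions cost a single
recompile, which happens on the second execution.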
The start of a recompile will release the connection to
all dependencies when it releases the activation class and
generates a new one.
The Dependency Manager is capable of storing dependencies to
ensure that other D.M.s can see them and invalidate them
appropriately. Dependencies held only in memory are visible only to
the current D.M.; stored dependencies are visible to other D.M.s
once the transaction in which they were stored is committed.
REVISIT: Given that statements are compiled in a separate top-transaction
from their execution, we may need/want some intermediate memory
storage that makes the dependencies visible to all D.M.s in the
system, without requiring that they be stored.
To ensure that dependencies are cleaned up when a statement is undone,
the compiler context needs to keep track of what dependent it was
creating dependencies for, and if it is informed of a statement
exception that causes it to throw out the statement it was compiling,
it should also call the dependency manager to have the
dependencies removed.
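The cleanup responsibility described above might look like the following
sketch. NameKeyedDependencyManager and SketchCompilerContext are hypothetical
names, and identifying dependents and providers by string is an assumption
made to keep the example short.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory manager keyed by names (not the interface in this document).
class NameKeyedDependencyManager {
    private final Map<String, Set<String>> providersByDependent = new HashMap<>();

    void addDependency(String dependent, String provider) {
        providersByDependent.computeIfAbsent(dependent, k -> new HashSet<>()).add(provider);
    }

    void clearDependencies(String dependent) {
        providersByDependent.remove(dependent);
    }

    int dependencyCount(String dependent) {
        Set<String> providers = providersByDependent.get(dependent);
        return providers == null ? 0 : providers.size();
    }
}

// Hypothetical compiler context that remembers which dependent it is compiling.
class SketchCompilerContext {
    private final NameKeyedDependencyManager dm;
    private String currentDependent; // null when no compilation is in progress

    SketchCompilerContext(NameKeyedDependencyManager dm) {
        this.dm = dm;
    }

    void beginCompile(String dependent) {
        currentDependent = dependent;
    }

    void noteDependency(String provider) {
        if (currentDependent == null) {
            throw new IllegalStateException("no compilation in progress");
        }
        dm.addDependency(currentDependent, provider);
    }

    void endCompile() {
        currentDependent = null;
    }

    // Called when a statement exception discards the statement being compiled:
    // the dependencies created so far for that statement are removed as well.
    void statementException() {
        if (currentDependent != null) {
            dm.clearDependencies(currentDependent);
            currentDependent = null;
        }
    }
}
```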
Several expansions of the basic interface may be desirable:
- to note a type of dependency, and to invalidate or perform
an invalidation action based on dependency type
- to note a type of invalidation, so the revalidation could
actually take some action other than recompilation, such as
simply ensuring the provider objects still exist.
- to control the order of invalidation, so that if (for example)
the invalidation action actually includes the revalidation attempt,
revalidation is not attempted until all invalidations have occurred.
- to get a list of dependencies that a Dependent or
a Provider has (this is included in the above, although the
basic system does not need to expose the list).
- to find out which of the dependencies for a dependent were marked
invalid.
To provide a simple interface that satisfies the basic need,
and yet supply more advanced functionality as well, we will present
the simple functionality as defaults and provide ways to specify the
more advanced functionality.
interface Dependent {
    boolean isValid();
    InvalidType getInvalidType(); // returns what it sees
                                  // as the "most important"
                                  // of its invalid types.
    void makeInvalid();
    void makeInvalid(DependencyType dt, InvalidType it);
    void makeValid();
}
interface Provider {
}
interface Dependency {
    Provider getProvider();
    Dependent getDependent();
    DependencyType getDependencyType();
    boolean isValid();
    InvalidType getInvalidType(); // returns what it sees
                                  // as the "most important"
                                  // of its invalid types.
}
interface DependencyManager {
    void addDependency(Dependent d, Provider p, ContextManager cm);
    void invalidateFor(Provider p);
    void invalidateFor(Provider p, DependencyType dt, InvalidType it);
    void clearDependencies(Dependent d);
    void clearDependencies(Dependent d, DependencyType dt);
    Enumeration getProviders(Dependent d);
    Enumeration getProviders(Dependent d, DependencyType dt);
    Enumeration getInvalidDependencies(Dependent d,
            DependencyType dt, InvalidType it);
    Enumeration getDependents(Provider p);
    Enumeration getDependents(Provider p, DependencyType dt);
    Enumeration getInvalidDependencies(Provider p,
            DependencyType dt, InvalidType it);
}
The simplest things for DependencyType and InvalidType to be are
integer IDs or strings, rather than complex objects.
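As a sketch of the integer option, the following shows a Dependent-like class
whose invalid types are plain int constants. The constant names, and the rule
that a larger value is "more important", are assumptions made for
illustration; the design above does not fix either.

```java
// Hypothetical integer IDs for invalid types. Larger means more important
// in this sketch; that ranking is an assumption.
final class InvalidTypes {
    static final int NONE = 0;                  // the dependent is valid
    static final int CHECK_PROVIDER_EXISTS = 1; // revalidate by checking the provider
    static final int RECOMPILE = 2;             // revalidate by recompiling

    private InvalidTypes() {
    }
}

// Hypothetical dependent that remembers only its most important invalid type.
class SketchDependent {
    private int invalidType = InvalidTypes.NONE;

    boolean isValid() {
        return invalidType == InvalidTypes.NONE;
    }

    int getInvalidType() {
        return invalidType;
    }

    // The simple form defaults to the strongest invalidation.
    void makeInvalid() {
        makeInvalid(0, InvalidTypes.RECOMPILE);
    }

    // dependencyType is accepted for symmetry with the interface above
    // but is ignored in this sketch.
    void makeInvalid(int dependencyType, int newInvalidType) {
        invalidType = Math.max(invalidType, newInvalidType);
    }

    void makeValid() {
        invalidType = InvalidTypes.NONE;
    }
}
```

Keeping the types as small integers makes "most important" a simple
comparison; string IDs would need an explicit ranking instead.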
To ensure that no makeInvalid calls are made until we have
identified all objects that could be invalidated, so that the calls are
made from "leaf" invalid objects (those not in turn relied on by other
dependents) to dependent objects upon which others depend, the
dependency manager will need to maintain an internal queue of
dependencies and make the calls once it has completed its analysis
of the dependencies of which it is aware. Since it is much simpler
and potentially faster for makeInvalid calls to be made as soon
as the dependents are identified, separate implementations may be
called for, or separate interfaces to trigger the different
styles of invalidation.
In terms of separate interfaces, the DependencyManager might have
two methods,
    void makeInvalidImmediate();
    void makeInvalidOrdered();
or a flag on the makeInvalid method to choose the style to use.
In terms of separate implementations, the ImmediateInvalidate
manager might have simpler internal structures for
tracking dependencies than the OrderedInvalidate manager.
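The ordered style can be sketched as a two-phase process: first find every
affected object, then hand back the order in which makeInvalid should be
called, leaves first. The class OrderedInvalidator, its string-keyed graph,
and the use of a post-order depth-first traversal are assumptions chosen for
illustration; the design does not prescribe an algorithm.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical ordered invalidator: names stand in for Dependent/Provider objects.
class OrderedInvalidator {
    // provider name -> names of the objects that depend on it
    private final Map<String, List<String>> dependentsOf = new HashMap<>();

    void addDependency(String dependent, String provider) {
        dependentsOf.computeIfAbsent(provider, k -> new ArrayList<>()).add(dependent);
    }

    // Phase 1: discover everything affected by a change to the provider.
    // The caller performs phase 2 by calling makeInvalid in the returned order.
    List<String> invalidationOrder(String provider) {
        Set<String> visited = new HashSet<>();
        Set<String> ordered = new LinkedHashSet<>();
        collect(provider, visited, ordered);
        ordered.remove(provider); // the changed provider itself is not a dependent
        return new ArrayList<>(ordered);
    }

    private void collect(String node, Set<String> visited, Set<String> ordered) {
        if (!visited.add(node)) {
            return; // already handled; also guards against cycles
        }
        for (String dependent : dependentsOf.getOrDefault(node, new ArrayList<>())) {
            collect(dependent, visited, ordered);
        }
        // Post-order: a node is queued only after everything that depends on it.
        ordered.add(node);
    }
}
```

For example, if statement s depends on view v and v depends on table t, then
invalidationOrder("t") in this sketch yields s before v. An immediate-style
manager could skip the queue entirely and call makeInvalid inside the loop.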
The language system doesn't tend to suffer from this ordering problem,
as it tends to handle the impact of invalidation by simply deferring
recompilation until the next execution. So, a prepared statement
might be invalidated several times by a transaction that contains
several DDL operations, and only recompiled once, at its next
execution. This is sufficient for the common use of a system, where
DDL changes tend to be infrequent and clustered.
There could be ways to push this "ordering problem" out of the
dependency system, but since it knows when it starts and when it
finishes finding all of the invalidating actions, it is likely
the best home for this.
One other problem that could arise is multiple invalidations occurring
one after another. The above design of the dependency system can
really only react to each invalidation request as a unit, not
to multiple invalidation requests.
Another extension that might be desired is for the dependency manager
to provide for cascading invalidations -- that is, if it finds
and marks one Dependent object as invalid, and that object can also
be a provider, to look for its dependent objects and cascade the
invalidation on to them. This can be a way to address the
multiple-invalidation request need, if it should arise. The simplest
way to do this is to always cascade the same invalidation type;
otherwise, dependents need to be able to say what a given
invalidation type gets changed to when it is handed on.
The basic language system does not need support for cascaded
dependencies -- statements do not depend on other statements
in a way that involves the dependency system.
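For systems that do need it, the simplest cascading variant (always handing on
the same invalidation type) might look like the following sketch. The class
CascadingNode and the convention that 0 means "valid" are assumptions made for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical object that is both a Dependent and a Provider.
class CascadingNode {
    final String name;
    private final List<CascadingNode> dependents = new ArrayList<>();
    private int invalidType = 0; // 0 means valid in this sketch

    CascadingNode(String name) {
        this.name = name;
    }

    void addDependent(CascadingNode dependent) {
        dependents.add(dependent);
    }

    boolean isValid() {
        return invalidType == 0;
    }

    int getInvalidType() {
        return invalidType;
    }

    void makeValid() {
        invalidType = 0;
    }

    // Marks this node invalid and cascades the SAME type to its dependents.
    // The type must be non-zero.
    void makeInvalid(int type) {
        if (invalidType == type) {
            return; // already marked with this type; also stops cycles
        }
        invalidType = type;
        for (CascadingNode dependent : dependents) {
            dependent.makeInvalid(type);
        }
    }
}
```

Translating the type as it is handed on would replace the recursive call's
argument with a per-dependent mapping, which this sketch leaves out.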
I do not know if it would be worthwhile to consider using the
dependency manager to aid in the implementation of the SQL DROP
statements or not. Past implementations
of database systems have not used the dependency system to implement
this functionality, but have instead hard-coded the lookups like so:
in DropTable:
    scan the TableAuthority table looking for authorities on
        this table; drop any that are found.
    scan the ColumnAuthority table looking for authorities on
        this table; drop any that are found.
    scan the View table looking for views on
        this table; drop any that are found.
    scan the Column table looking for rows for columns of
        this table; drop any that are found.
    scan the Constraint table looking for rows for constraints of
        this table; drop any that are found.
    scan the Index table looking for rows for indexes of
        this table; drop the indexes, and any rows that are found.
    drop the table's conglomerate
    drop the table's row in the Table table.
A direct approach such as the one outlined in the example will
probably be quicker than using a dependency system in this area,
and is definitely "known technology".