Updates
and Events in a Nested Relational
Programming
Language
Weizhong
Sun
School
of Computer Science
McGill
University, Montreal
March
2000
A
thesis submitted to the Faculty of Graduate Studies and Research in partial
fulfillment
of the requirements of the degree of Master of Science
Contents
Abstract i
Résumé ii
Acknowledgements iii
1 Introduction 1
1.1 Relational Model 1
1.1.1 Flat Relational Model 1
1.1.1.1 Introduction 1
1.1.1.2 Relational Algebra 2
1.1.1.3 Domain Algebra 3
1.1.1.4 Update Operation 3
1.1.2 Nested Relational Model 4
1.1.3 Discussion 7
1.2 Active Database Systems 8
1.2.1 Introduction 8
1.2.2 ECA Model 9
1.2.2.1 Basic Concepts 10
1.2.2.2 Rule Execution Granularity 10
1.2.2.3 Transaction and Coupling Modes 11
1.2.2.4 ECA Model in SQL3 14
1.2.3 Discussion 19
1.3 Summary and Thesis Outline 21
2 jRelix Overview 23
2.1 Declaration of Domain and
Relation 23
2.1.1 Domain Declaration 23
2.1.2 Relation Declaration and
Initialization 25
2.1.3 Example 26
2.2 Relational Algebra 27
2.2.1 Assignment and Incremental
Assignment 27
2.2.2 Relational Expressions 28
2.3 Domain Algebra 34
2.4 Unification of Relational Algebra and
Domain Algebra 36
2.5 Computation 37
3
User Manual on Updates and Event Handlers 42
3.1
User Manual on Updates 42
3.1.1
Introduction 42
3.1.2
Syntax and Semantics 42
3.1.3
Worked Example of Flat Relation 43
3.1.4
Worked Example of Nested Relation 48
3.2
User Manual on Event Handlers 53
3.2.1
Introduction 53
3.2.2
Computation as Event Handler 56
3.2.3 Cascading
Event Handler
56
3.2.4
Event Handler On/Off 58
3.2.5
Printing Event Handler 58
3.2.6
Deleting Event Handler 58
3.2.7
Trigger and New and Rest 59
3.2.8
Undo Command 61
3.2.9
Worked Example of Event Handler 62
4
Implementation of Update and Event Handler 65
4.1
Representation of Nested Relations 65
4.2
Implementation of Updates 66
4.2.1
EvaluateUpdate 68
4.2.1.1 Algorithm of EvaluateUpdate 68
4.2.1.2 Example of EvaluateUpdate 70
4.2.2
ExecuteUpdate 72
4.2.2.1 Algorithm of ExecuteUpdate 72
4.2.2.2 Example of ExecuteUpdate 72
4.2.3
SwitchTrigger 73
4.2.3.1 Algorithm of SwitchTrigger 73
4.2.3.2 Example of SwitchTrigger 74
4.3 Implementation of Event Handlers 75
4.3.1 Computation as Event Handler 75
4.3.2
Algorithm of ExecuteEventHandler 76
4.3.3 Example of ExecuteEventHandler 78
4.4
Combination of Updates and EventHandlers 80
4.5 Setting Event Handlers On/Off 81
4.6 Deleting Event Handlers 82
4.7 Printing Event Handlers 82
5 Conclusion 83
5.1 Conclusions 83
5.2 Future Work 83
Appendix
A 91
Bibliography
99
Abstract
This
thesis documents the design and implementation of Updates and Event Handlers in
a relational database programming language. Update operations are implemented
in the nested relational model and Event Handlers transform the database system
from passive to active.
The update operation allows the user to
change values of specified attributes in certain tuples. These attributes can
be selected by a "using" clause which uses a relational algebra
operation to select tuples from the relation we want to update. We can also use
updates to add or delete some tuples to or from the relation. The nested update
in nested relations will also be presented.
Event handlers are introduced as procedures, invoked by events
which are system generated procedure calls. We implement the event handler
based on computation - the procedural abstraction facility of the database
programming language.
An update statement can invoke multiple
event handlers. Event handlers may contain update statements which in turn
invoke other event handlers. This introduces cascading event handlers. We will
present the combination of update and event handler algorithm by which the
update operation is coupled with event handler closely and neatly, and the
cascading event handler is handled by the system's recursive execution.
In this thesis, we provide the user a
uniform syntax to update both flat relations and nested relations. The
unification of computation and procedure leads to a simpler language. More
over, the explicitness and intuitiveness of the definition and implementation
of both event and event handler are under substantial considerations.
Résumé
Cette thèse
décrit la conception et l'implémentation d'un manipulateur d'événement et de
mise à jour, dans un language de base de données relationnelle. Les opérations
de mise à jour sont implementées par un modèle relationnel emboîté, et les
manipulateurs d'événement transforment un système de base de données du passif
à l'actif.
L'opération de mise à jour permet à
l'usager de changer la valeur de certains attributs dans une série. Ces
attributs peuvent être choisis en se servant d'une clause qui utilise une
opération d'algèbre relationnelle afin de choisir la série dont la relation
doit être mise à jour. Il est aussi possible d'utiliser l'opération de mise à
jour pour ajouter ou enlever une serie d'une relation. La mise à jour emboîté
d'une relation emboîté sera aussi présenté.
Des manipulateurs d'évènements seront
présentés en tant que procédures, invoqués par des évènements, qui sont des
appels de procédure générés par un système. L'implémentation de ces
manipulateurs d'évènement est basée sur des traitement de données - la facilité
de l'abstraction procédural du language de programation de base de données.
L'instruction de mise à jour peut
invoquer de multiple manipulateurs d'évènement. Les manipulateurs d'évènement
peuvent à leur tour invoquer d'autres manipulateurs d'évènement. Seront
présenté, une combinaison d'algorithme de mise à jour et de manipulations
reliées de près de façon efficace, et la cascade d'un manipulateur d'évènement
par l'exécution récursive d'un système.
Dans cette thèse, une syntaxe uniforme de
mise à jour pour des relations emboîté ou non est présentée aux lecteurs.
L'unification des traitements et procédures mênent à un langage plus simple. De
plus, l'évidence et l'intuitivetée de la définition et l'implémentation des deux
manipulateurs d'évènement sont des considérations substantiels.
Acknowledgements
I would
like to thank many people who provided assistance to me. Among them, I
especially wish to thank Prof. T. H. Merrett, my thesis supervisor. His
invaluable advice, patient guidance, continuous encouragement, and financial
support provided me a favorable environment to write my thesis. Without these,
this thesis might never have been written.
I would like to thank McGill University
and the faculty of Computer Science for the financial support of Maxtern
Recruitment Fellowship and Tuition Fee Waiver Fellowship.
I would also like to thank Ian Garton,
for his assistance and advice during the project from which I benefit a lot. I
am also indebted to my friends, Heng Jia and Jian Fu, for their enthusiastic
assistance on preparing this thesis.
I would like to thank my family, for
their continuous support and encouragement throughout my studies at McGill.
Last but not least a special thanks goes
to Lynn, whose love and encouragement are the driving force for me.
Chapter
1
Introduction
This
thesis documents the design and implementation of Updates and Event Handlers
for jRelix. The basic aim of our work
is to support update operations in a nested relational model and transform
jRelix from a passive database system to an active one. We first have a
historical overview of the relational model including both the flat and the
nested relational model. Then we will discuss active database systems. Finally
we will present the thesis outline.
1.1 Relational Model
1.1.1 Flat Relational Model
1.1.1.1
Introduction
The relational
model introduced by Codd [Cod70] [Cod79] represents the database as a
collection of time-varying relations. The relation is a simple and uniform data
structure which consists of rows and columns. Each relation resembles a table,
and each row in the table represents a collection of related data values. The
terminology "tuple" is used to refer to a row, and
"attribute" is used to refer to a column header. "Domain"
refers to the data type of values that can appear in a column. Formally, a
relation is a subset of the Cartesian product of its domains.
If the domains are all simple, such a
relation has a tabular representation with the following properties:
* All
tuples are distinct from each other.
* A
relation is an unordered set of tuples.
* Each
attribute is unique and the order of columns is irrelevant.
* The
attribute values are atomic. By atomic it means each value in the domain is
undecomposable.
The
fourth property is the so-called first-normal-form (1NF) requirement. A
relation that satisfies 1NF is also called a flat relation.
The concept of normalization was
introduced in [Cod70] and dealt with more rigorously in his later papers
[Cod72a] [Cod72b]. It addresses how data ought to be organized within a
database in order to make the database as compact and as easy to manage as
possible and ensure the result is consistent. The theory is based on a series
of normal forms - which define increasingly stringent requirements. A thorough
discuss about normalization technique can be found in [Dat81] and [Ull82].
A number of early projects were
implemented after the introduction of the relational model, and the most
significant research may be attributed to three projects [CB99]. The first was
System R, which was developed at IBM's San Jose Research Laboratory in
California during the late 1970s [AB+76]. This project was designed to prove
the practicality of the relational model by providing an implementation of its
data structures and operations. The second project was the INGRES project at
the University of California at Berkeley [HSW75]. INGRES involved the
development of a prototype RDBMS which spawned the commercial product INGRES.
The third project was the Peterlee Relational Test Vehicle at the IBM UK
Scientific Center in Peterlee [Tod76]. This project was significant for
research into such issues as query processing and optimization, and functional
extension. The idea of procedural expression (computation) of an infinite
relation was first proposed in this project.
1.1.1.2
Relational Algebra
The
relational algebra is first suggested by [Cod70], and consists of a collection
of operations applied on relations. All operations take a relation as an
operand and return a relation as a result. The result relation can be further
manipulated. The property that any relational algebra operation evaluates to a
relation is called the "closure principle". This allows complex
relational expressions by concatenating a series of simple operations.
The relational algebra operations are
usually divided into two groups based on the number of their operands.
* Unary
operations, includes projection, selection.
*
Binary operations, includes mu-join and sigma-join.
1.1.1.3
Domain Algebra
The
need for arithmetic and related operations on the values of attributes in
individual tuples is encountered. Merrett [Mer76] proposed the domain algebra
which consists of a set of operations to manipulate attributes in individual
tuples. The user can create new attributes from the existing ones with two
kinds of operations:
*
horizontal operations: new value is generated from the values within a tuple
-
Constant
-
Rename
-
Mathematical operations
-
If-then-else
-
Predefined functions
*
vertical operations: new value is generated from values along an attribute
- Reduction
-
Equivalence
-
Functional Mapping
-
Partial Functional Mapping
1.1.1.4
Update Operation
As Codd
described, a relational database is a time-varying collection of data, all of
which can be accessed and updated as if they were organized as a collection of
time-varying tabular relations.
There are three ways to update a relation
as time processes:
*
Insert additional tuples into the relation.
*
Delete existing tuples from the relation.
*
Change the attribute values of the existing tuples in the relation.
1.1.2 Nested Relational Model
In the
late 1970s a significant direction in database research was established:
various extensions to the relational model were proposed to enrich the
semantics of the original model, and to open the model to better meet the
requirements.
In 1977 Makinouchi [Mak77] proposed to
generalize the relational model by removing the 1NF constraint. Jaeschke and
Schek [JS82] introduced a generalization of the ordinary relational model by
allowing relations with set-valued attributes and adding two restructuring
operators, the nest and the unnest operations, to manipulate such (one-level)
nested relations. Thomas and Fishcher
[TF86] generalized Jaeschke and Schek's model and allowed nested
relations of arbitrary (but fixed) depth. The definition of recursively nested
relations was also discussed [LS88]. Detailed examination can be found in
[AB84] [FT83] [KK89] [SS86].
Meanwhile, various query languages have
been introduced for the nested relational model. Roth, Korth and Silberschatz
[RKS88] defined a calculus-like query language for the nested relational
database model of Thomas and Fischer. Since then, many SQL-like query languages
[PA86] [KR89] [PT86], graphics-oriented query languages [HP87] and datalog-like
languages [BK86] [BNR+87] have been introduced for this model or slight
generalizations of it.
By removing the 1NF constraint of the
flat relational model, the attributes in the nested relational model can be
relation-valued. Let us consider an example of a database for employees in
different departments of a company. Each employee in a department has name and
salary. The relationship can be represented diagrammatically as a tree, as
shown in Figure 1.1.
Figure 1.1: Hierarchical structure example
As
well, the contents of above information structure can be illustrated like that
shown in Figure 1.2.
Figure 1.2: Example of a
hierarchical record
Alternatively,
we can use a nested relation to present the information as illustrated in Table
1.1. The nested relation consists of
two tuples each having two attributes:
*
DEPARTMENT: The name of the department. Its data type is string (atomic).
* EMPLOYEES:
A nested relation containing the information of each employee in the
department. Each tuple in relation
EMPLOYEES contains a whole relation as an attribute. The first tuple contains a
relation with 4 tuples. The second tuple contains a relation with 3 tuples.
Table 1.1: Example of a nested
relation presentation
If we
use the 1NF conformant relation to store such company information, it would
have repeated values for attribute DEPARTMENT as shown in Table 1.2.
Table 1.2: 1NF-conformant company record table
The
main advantages of nested relation model are the following [Lev91] [Yua98]:
* For
storage efficiency, nested relations minimize redundancy data
* For
programming flexibility, nested relations allow a very flexible interface at
the external level, since both flat and hierarchical data can be presented to
the user.
1.1.3 Discussion
The
flat relational model is usually thought to be not qualified to handle
applications such as CAD/CAM where the objects to be modeled are structured in
hierarchy (also referred to as complex objects) while the nested relational
model is more capable of handling such applications. But this is not true. As
shown in the implementation of jRelix (detailed discussion of jRelix will be
presented in the following chapters and can be found in [Hao98] [Yuan98]
[Bak98]), the nested relational model essentially does not extend the data
model with new concepts or operators. It just permits a more deliberate use of
type constructors (attribute could be of relation type) and the query language
(including relational algebra and domain algebra in jRelix) already available.
To structure hierarchical data, the user
can define the data with nested relation-valued attributes. The system will
then map the nested relations to flat relations in the storage and the query
language can be applied on both flat relations and nested relations. The only
extension is to integrate relational algebra expressions into domain
expressions thus any operations that are performed on the top-level relations
can be performed at the lower level attributes which are also relations.
It might be argued that mapping nested
relations to flat relations requires lots of joins, which is very time
consuming. But this is not a problem of mapping itself, but an issue of the
implementation of joins. The join operation can be implemented using a pointer
so that the cost of joins of flat relations is just the cost of pointer
de-referencing of nested relations. Queries with low selectivity are suitable
to be handled by pointer de-referencing.
A thorough examination of how to use
jRelix to build up CAD/CAM applications will not be discussed further as it is
beyond the scope of work presented in this thesis.
1.2 Active Database Systems
1.2.1 Introduction
Traditional
database systems are passive in the sense that commands are executed by the
database (e.g., query, update, delete) as and when requested by the user or
application program. Such systems lack the capability to respond to happenings
of interest without the user intervention.
One early thread of research in active
database systems focused on scaling up production rule systems so that they
could deal with large numbers of rules and facts stored in a database. The
primary goal of these large-scale production rule systems was to support
inferencing over a database [DE89]
[Han89] [SLR88] [SKM92].
In the mid to late 1970s, triggers were
proposed as a way of enforcing integrity constraints, and a trigger subsystem
for System R was described in [EC75] [Esw76]. In 1982, Stonebraker first proposed the use of situation-action
rules as a unifying mechanism for implementing constraints, view processing,
triggers and access control [Sto82]. This idea was implemented in the POSTGRES
Rule System I. In [DH+94], Situation-action rules were used to embed the shared
operational semantics of applications in databases. [KDM88] described the
event-trigger mechanism for enforcing complex constraints in CAD databases.
Research in active database really
exploded after the introduction of ECA-rule first proposed in HiPAC [DB+88].
Based on the idea of ECA, HiPAC developed concepts for time-constrained active,
object-oriented database systems. Besides HiPAC, there are another two
important pioneering research projects in active database systems: Postgres
Rule System II [SJ+90] and Starburst Rule System [Wid96], which introduced
various granularity of ECA rules
Following these early pioneering
projects, more and more research projects in active database systems have been
implemented, such as Chimera (an active and deductive DBMS) [FMT84], REACH (an
active real-time system) [BB+93], and Ode (a persistent C++ system with
triggers) [GJ91].
For example, an inventory control system
needs to monitor the inventory database, so that when the quantity in stock of
some item falls below a threshold, a re-ordering activity may be initiated.
This behavior could be implemented over a passive database system in the following
two ways, neither of which is satisfactory. First, a new application program is
written to poll the database periodically. However, polling too often can be
inefficient, polling too slowly may result in delayed responses to critical
situation. Alternatively, every program that updates the inventory database
could check the condition and invoke the reordering operation if necessary;
however, this is poor software engineering because the desired semantics are
embedded in many programs and also every updater needs to know which downstream
operations to call.
An Active Database System is a
conventional, passive database system extended with the capability of reactive
behavior. The desired behavior is expressed in rules that are defined and
stored in the database [VB98].
If we build up the above inventory
control system using an active database system, we can define a rule that when
quantity in stock falls below the threshold, the system will trigger a request
of order. The active database system is now responsible for detecting the
situation of interest and triggering the appropriate response. This has the
benefit that the rules can be shared by many application programs, and the
database system can optimize their implementation.
An active database system has some
advantages over its passive counterpart [VB98]:
1.
Perform functions that in passive database systems must be encoded in
applications.
2.
Perform tasks that require special-purpose subsystems in passive database
systems.
1.2.2 ECA Model
In this
section, we will discuss the event-condition-action (ECA) model that is widely
used in active database systems. Before we go any further we will give formal
definitions of several basic concepts in ECA model.
1.2.2.1
Basic Concepts
According
to [MD89], we have the following definitions:
* "An event is the occurrence of
pre-defined state which triggers the rule and causes the system to evaluate the
condition. Event can be defined as having typed formal parameters, which are
bound at the time an instance is signaled, and are passed to the condition and
action. Event can be either a single
primitive event or a composite event. Primitive event can be one of the
followings: database operations (data definition, data manipulation, or
transaction control), temporal events (absolute, relative or periodic), or
external notification (application defined events). Composite event is built up
from the primitive type events by operators such as disjunction, sequence,
closure, conjunction, and negation. For example, the event on insert to Emp or
update to salary of Emp is a composite event using the operator or."
* "Conditions are typically predicates or
queries against the database state."
*
"An action is a sequence of operations which are executed when the
condition of the triggering event is satisfied. These operations can be either
database operations or requests to application programs."
1.2.2.2
Rule Execution Granularity
Rules may
be executed at several granularities: at a tuple- or instance-level; at a
table- or statement-level; or at a transaction or other granularity. Some
projects support one granularity, others a mixture.
We will illustrate that the semantics can
be quite different for the different granularities with the following example.
EMP (
Name Salary )
M.Gordon 25000
P.McConnel 22000
Event:
delete the tuple where Salary >= 20000
Condition:
true
Action:
decrease Salary by 5000
If the rule is evaluated at the statement
level, i.e., after the entire update operation finishes, the result is the
empty set (because both tuples are deleted by the update operation). If the
rule is evaluated once for each tuple, then the P. McConnel tuple will remain
in the relation: after the M. Gordon tuple is deleted, the rule fires, causing
P. McConnel's salary to decrease to 17000, so that it no longer satisfies the
predicate of the delete operation.
1.2.2.3
Transaction and Coupling Mode
Before
we discuss various coupling modes of rule execution, we introduce the concept
of transaction which is closely relative to coupling modes.
* A
transaction can be considered as a collection of actions with the following
properties [GR93]:
- Consistency: a transaction is a correct
transformation of the state of the database.
The
actions taken as a group do not violate any integrity constraints
associated with the state.
-
Atomicity: the changes a transaction performs are atomic; either all happen or
none happen.
-
Isolation: although transactions execute concurrently, it appears to each
transaction, T, that each other transactions executes either before T or after
T, but not both.
-
Durability: once a transaction completes successfully (commits), its changes to
the state survive failures.
A
transaction may contain any number of nested transactions [Mos85] [GR93]. A
nested transaction is fully contained within another transaction (its parent).
A nested transaction can have as its parent either another nested transaction
or a top level transaction. The related transactions are organized in the form
of a transaction tree.
In the basic nested transaction model
proposed in [Mos85], a nested transaction is started explicitly by the parent
transaction, which is suspended until the nested transaction commits or aborts.
The effects of a nested transaction do not become permanent until it and all
its ancestors commit through a top level transaction. A nested transaction may
be aborted without causing its parent transaction to abort. The parent
transaction can create another transaction to retry the aborted one. When a
parent transaction aborts its effects and the effects of all its descendants
are ignored. If the top transaction aborts, the whole transaction tree is
aborted.
Three more types of nested transactions
are presented in [DHL90] as the extension of the basic nested transaction
model.
*
Deferred nested transactions are nested transactions whose execution is explicitly
delayed until the end of the user's top transaction. If more than one deferred
nested transaction is spawned within a user's transaction, they all execute in
parallel at the end of the user's transaction. If any of these deferred nested
transactions spawns itself another nested transaction, this is executed until
all deferred transactions from the level above have finished.
*
Nested top-transactions are top transactions started from within another
transaction and are represented by their own tree. However, it may not see any
non-committed objects of the spawning transaction and is not automatically
aborted when the spawning transaction aborts.
*
Causally-dependent-top-transactions (CD-top) are spawned from within another
transaction and are like nested top-transactions that have their own
transaction tree but are commit-dependent on the parent. Aborting the spawning
transaction aborts the CD-top transaction. However, aborting CD-top transaction
has no effect on the spawning transaction.
Now we move on to the topic of coupling
modes of rule execution. An important aspect of ECA rule execution is the exact
time of event detection, condition checking, and action execution relative to
the triggering operation and the end of the transaction. There can be various
delays in checking the condition of a rule after its event is detected or the
execution of a rule's action can be delayed after its condition is evaluated to
true. These delays can either occur within one transaction or spread over different
transactions. The latter case raises the issue of nested transactions.
The various possibilities of delays of
the rule execution are called coupling modes. There are three coupling modes,
both for the relative delay between event detection and condition checking (EC
coupling) and between condition checking and action execution (CA coupling):
*
Immediate: The condition (action) is evaluated (executed) immediately after the
event (condition) within the same transaction. Immediate coupling modes can be
used, for example, to enforce security constraints or to propagate updates. For
rule execution in small granularity, the condition (action) evaluation
(execution) will be immediately done for each rule execution in that
granularity.
*
Deferred: The condition (action) is evaluated (executed) within the same
transaction as the event (condition) of the rules, but at the end of the
triggering transaction and before the triggering transaction commits. Deferred
coupling modes can be used to defer constraint checking until the end of the
transaction, thereby, allowing temporary inconsistencies within a single
transaction. In small granularity, all the condition (action) evaluations
(executions) of rule execution in that granularity will be deferred until the end
of the transaction.
*
Detached: The condition (action) is evaluated (executed) within a different
transaction from the event (condition). The execution of the action can be
dependent upon or independent of the committing of the transaction in which the
event took place or the condition was evaluated. The benefit of this mode is
that it results in shorter transactions that can be committed earlier, improve
response time and concurrency. It can also be used to monitor access to
sensitive information, as running a detached independent transaction will
record information on the access even if the transaction in which the access
took place is aborted. The detached coupling mode is supposed to be implemented
in nested transaction model.
The selection of proper coupling mode(s)
for rule execution is highly dependent on the transaction model(s) supported by
the active database system.
1.2.2.4
ECA Model in SQL3
The
database language SQL is now firmly established as the predominant language
standard for database access. The first version of SQL standard, SQL-86, was
published in 1986. A substantial reversion of the standard appeared in 1992 as
SQL-92. One significant extension is that SQL-92 introduced the notion of
referential actions, which can be considered as a limited form of rule support.
More recently, significant active database functionality is being incorporated
into SQL3 [Day95] [Nor98] in the form of triggers.
A trigger in SQL3 is a named
event-condition-action rule that is activated by a database state transition.
Each trigger is associated with a particular table and is activated whenever
that table is modified and an optional condition, specified as part of the
trigger definition, evaluates to true. Triggers can be processed either at tuple-level
or table-level granularity.
To define a trigger in SQL3, the name of
the trigger must be unique among all trigger names. The triggering operation of
a trigger can only be an INSERT, DELETE or UPDATE statement. The specification
of the trigger condition is optional. If omitted, a trigger condition that
specifies WHEN (TRUE) is implicit.
Each trigger is associated with a trigger
activation time, which determines whether the trigger is activated before or
after the triggering operation. Therefore, there are two modes of trigger in
SQL3: BEFORE mode and AFTER mode. A trigger in BEFORE mode is activated before
the operation that modifies a table executes, while a trigger in AFTER mode is
activated after the modifying operation executes. The specification of the
trigger activation time is mandatory.
The two kinds of triggers offer different
benefits. Since AFTER triggers execute after the triggering operation, they are
quite useful for performing follow-on updates to other tables. In contrast,
BEFORE triggers are especially useful for enforcing transitional constraints.
Consider the following example:
Emp (
empno, name, dno, age )
Dept (
dno, dname, loc, nemps )
CREATE
TRIGGER deptdel
BEFORE
DELETE ON Dept
WHEN
Dept.nemps > 0
BEGIN
DELETE
FROM Emp WHERE Emp.dno = Dept.dno
END
We have
defined a trigger deptdel:
*
subject table: Dept
*
operation: DELETE
*
condition: WHEN Dept.nemps > 0
*
action: DELETE Emp WHERE Emp.dno = Dept.dno
* mode:
BEFORE
*
granularity: table-level
Trigger
deptdel is in the BEFORE mode. It enforces the constraint that when a
department tuple is deleted, if the number of employees in the department is
greater than zero, then delete all employees of the department first.
The trigger we defined above is processed
in the table-level granularity, which means the trigger is executed once for
all the changes applied on the subject table Dept. In the following example, we
will define a trigger of tuple-level granularity.
CREATE
TRIGGER empdel
AFTER
DELETE ON Emp
FOR
EACH ROW
BEGIN
UPDATE
Dept SET nemps = nemps - 1 WHERE Dept.dno = Emp.dno
END
In the
trigger empdel, we have
subject
table: EMP
operation:
DELETE
condition
is omitted, which implies WHEN(TRUE)
action:
UPDATE
Dept SET nemps = nemps - 1 WHERE Emp.dno
= Dept.dno.
mode:
AFTER
granularity:
FOR EACH ROW specifies the granularity as tuple-level.
When we
execute the statement:
DELETE
FROM Emp WHERE age >= 60;
the
trigger empdel will execute once for each of the tuple that is deleted from
Emp, which decreases the number of employees of the corresponding department by
1.
We
summarize the execution rules of trigger as follows:
* When
a triggering operation on its subject table is initiated and the trigger
condition evaluates to true, a trigger is to be activated. When a trigger is
activated, its trigger action is executed as part of the execution of its
triggering operation.
* The
trigger action is a list of SQL statements within a BEGIN/END block. When a
trigger is activated, each statement in the block is executed in order.
* The
execution of trigger actions of a trigger may activate other triggers
(including the same trigger). Because of this, it is possible for the execution
of a statement involving triggers to never terminate. It is up to the user to
make sure that cascade triggers do not lead to non-terminating statement
executions.
In the following example, we use the
trigger deptdel and empdel to illustrate the cascade trigger invocation.
CREATE
TRIGGER deptdel
BEFORE
DELETE ON Dept
WHEN
Dept.nemps > 0
BEGIN
DELETE
FROM Emp WHERE Emp.dno = Dept.dno
END
CREATE
TRIGGER empdel
AFTER
DELETE ON Emp
FOR
EACH ROW
BEGIN
UPDATE
Dept SET nemps = nemps - 1 WHERE Dept.dno = Emp.dno
END
We
execute the following statement
DELETE FROM
Dept WHERE Dept.dno = DeptNo
where
DeptNo is the number of the department to be deleted.
Before
the corresponding department is deleted from Dept, the table-level trigger
deptdel will be activated and the statement
DELETE
FROM Emp WHERE Emp.dno = Dept.dno
in the
BEGIN/END block will be executed. Subsequently this statement will activate the
tuple-level trigger empdel. Thus trigger empdel will execute once for each of
the tuple that is deleted from the Emp, which means the statement
UPDATE
Dept SET nemps = nemps - 1 WHERE Dept.dno = Emp.dno
will be
executed once after a tuple is deleted from Emp. The operations continue until
all the employees in that department are deleted from Emp; meanwhile Dept.nemps
becomes zero. Then the statement
DELETE FROM
Dept WHERE Dept.dno = DeptNo
is
executed so that the department DeptNo is deleted from Dept.
1.2.3 Discussion
1.
About Event, Condition and Action
In our
work, we make a simpler and more intuitive definition of event, condition and
action:
* Events
are system-generated procedure calls.
This
idea came from Visual Basic [SK+94], an event-driven programming language. In
this event-driven approach, programs can be invoked by a variety of pre-defined
actions such as a mouse click or a keyboard stroke, which are called events.
The system has a facility to detect occurrences of events and then perform some
specified actions which are referred to as event handlers.
* Event
handlers are procedures.
The
event handler executed in response to an event can be expressed as a procedure.
Consider
the following example:
procedure
mouse.leftclick
begin
print("hello")
end
Here we
define an event handler, which is a procedure that prints "hello" on
the screen, in response to the event of a left click on the mouse.
As shown above, the action print is taken
within the event handler. We do not have Condition part in our model yet.
Although we can evaluate condition in the event handler, it is desired to have
an explicit component in the model to perform this function. This will be
discussed further as future work in Chapter 5.
2.
About Composite Event
In our
model, in response to the composite event using the operator or such as
on
insert to Emp or update to salary of Emp
we can
define a common procedure which is called by the event handlers that deal with
insert to Emp and update to salary of Emp, respectively. If either of the two
events occurs, the corresponding event handler will be triggered and call that
common procedure. It is not obvious how to implement the composite event using
the operator and because it involves the detection of concurrent events which
is beyond the scope of event handling.
3.
About Rule Execution Granularity
In our model
we do not support rule execution in tuple-level granularity because there is no
concept of tuple in jRelix. The
tuple-level granularity can cause data inconsistency
problems.
As shown in example in 1.2.2.2, the relation resulting from the tuple-level
rule
execution
depends on the order of tuples in the original relation, which is contradictory
to the relational model theory that results should not depend on the order of
the tuples.
It is possible to work at sub-relation
granularity (i.e., with subset of the tuples of a relation, defined, say, by a
predicate) in our model, but it will still cause data inconsistency just as the
tuple-level granularity.
4.
Ideas to be followed in our work
We
summarize the ideas to be followed in our work and make the comparison between
our ideas and SQL3.
Table 1.3: Ideas to be
followed in our model
1.3 Summary and Thesis Outline
The
purpose of this chapter was to introduce the basic conception of the Nested
Relational Model and Active Database Systems. In the remainder of the thesis,
we will present our start at the implementation of Updates in a nested
relational database, jRelix, and the implementation of Event Handlers which
enable jRelix to be an Active Database System.
In chapter 2, we will introduce jRelix in
detail and discuss some of the features relevant to our work. Chapter 3 is a
user manual intended to provide a tutorial to readers about the features of
Updates and Event Handlers in jRelix and how to use them.
Chapter
4 documents the implementation details of Updates and Event Handlers in jRelix.
Finally, chapter 5 summarizes the thesis and suggests work to do in the future.
Chapter
2
jRelix
Overview
This
chapter is a tutorial about jRelix - a relational database programming
language. jRelix consists of a database management system (DBMS) which is
responsible for organizing and storing data, and a query language Aldat -
Algebraic Data Language, based on relational algebra and domain algebra
[Mer77]. The nested relation model presented in jRelix leads to a significant
extension of the domain algebra. The extended domain algebra now includes the
relational algebra based on the conceptually unified definition of both flat
and nested relation [Hao98, Yuan98].
Section 2.1 describes the declaration of
attributes and relations and the initialization of relations. Section 2.2
discusses the assignments, incremental assignments and relational expressions.
Section 2.3 discusses the domain algebra. Section 2.4 covers the unification of
relational algebra and domain algebra upon nested relations. Section 2.5
introduces computation [Bak98] which will be used to implement the event
handler in jRelix.
2.1 Declaration of Domain and Relation
In this
section we describe the syntax of declaration of domains and relations,
initialization of relations, as well as some system commands applied on domains
and relations.
2.1.1 Domain Declaration
We use
the key word domain to declare attributes used in a relation. The syntax (
refer to Appendix A for complete jRelix syntax ) goes as following:
domain
IDList Type
Here
IDList specifies the list of attributes being declared. And Type specifies the
type of these attributes. We summarize the valid types supported in nested
relations of jRelix in table 2.1.
Table 2.1: Attribute
types in jRelix
There
are four kinds of meta-type for attributes:
* type
IDList specifies the attributes on which the nested attributes are defined
* type
expression is for relational and domain expressions (to be implemented)
* type
statement is for statements (to be
implemented)
* type
computation is used to define
computations as attributes ( for details refer to [Bak98] )
We use
command sd to show the information (name, type, etc.) of a specific attribute
or all the attributes currently defined in the system if the optional parameter
Identifier is omitted. The syntax of this command is:
sd
(Identifier)?
The
command dd is used to delete attributes specified in IDList. However, if any of
the attributes are being used in any existing relation, the command will fail.
This requires that the user delete all the relations associated with the
specified attributes before deleting the attributes. The syntax is:
dd
IDList
2.1.2 Relation Declaration and Initialization
The
syntax of relation declaration goes as the following:
relation
IDList "(" IDList ")" ( Initialization )?
The
first IDList specifies the relation being created, and the second IDList
specifies the attributes on which the new relation(s) will be defined.
Initialization is optional.
We use
the command sr to show information (name, type,number of tuples, sort, etc.) of
a specific relation or information of all the relations currently defined in the
system if the optional parameter Identifier is omitted. The syntax goes as
follow:
sr (
Identifier )?
The
command pr is used to print a relation on the screen. The syntax is:
pr
Expression
The
command dr is used to remove the relations specified in IDList from the system.
The
syntax is:
dr
IDList
2.1.3 Example
domain
Name strg;
domain
Sal intg;
domain
EMP (Name, Sal);
domain
DETP strg;
relation
employees ( DEPT, EMP ) <- { ("stereo", {("M.Gordon",
25000), ("P.McConnel", 22000), ("T.Anastasio", 27000),
("J.Fishman", 24000)}),
("television",
{("B.Martin", 38000), ("C.Wood", 32000),
("J.Medeski", 35000)})};
Note
that the assignment (<-) operator will be formally introduced in Section
2.2. Here we defined a nested relation employees which has two attributes, DEPT
and EMP. Attribute DEPT is of primitive type string, while attribute EMP is of
non-primitive type, a relation itself, defined on attributes Name and Sal.
Attribute Name is of primitive type string and attribute Sal is of primitive
type integer. The relation is shown as follows:
2.2 Relational Algebra
In this
section we will discuss relational algebra which consists of a set of
functional operators. These operators are applied on one or two relations and
produce a result relation. This closure property of relational algebra allows
complex expressions to be constructed by various operators. The syntax of the
operations will be presented and examples are used to illustrate their
functionalities.
We start with how to use assignment and
incremental assignment. Then we discuss relational expressions.
2.2.1 Assignment and Incremental Assignment
jRelix
provides two assignment operators, one is assignment ( <- ) which creates a
relation using the result of a relational expression, the other is incremental
assignment
( <+
) which adds the result of a relational expression to an existing relation. The
syntax goes as follow:
Identifier ( "<-" |
"<+" ) Expression
|
Identifier "[" IDList (
"<-" | "<+" ) ExpressionList "]" Expression
The
assignment ( <- ) operator always creates a relation named by Identifier
which consists of the same domain and data as the source relation. The source
relation remains unaffected. If the relation which has the same name has
already existed in the system, it will be removed first.
The incremental assignment ( <+ )
operator, if there is no relation with the same specified name already existing
in the system, behaviors the same as the assignment operator. If such a
relation does exist in the system, the result of the right side Expression will
be added to the left side relation provided that both of them are defined on
the same set of attributes.
Consider
the following example:
domain
sno strg;
domain
mark intg;
relation
Record ( sno, mark ) <- {("1", 85), ( "2", 70),
("3", 90), ("4", 70)};
relation
moreRecord ( sno, mark ) <- { ("5", 65) };
oldRecord
<- Record;
Record
<+ moreRecord;
The
results are shown as follows:
oldRecord ( sno
mark )
1 85
2 70
3 90
4 70
Record
( sno mark )
1 85
2 70
3 90
4 70
5 65
The
above example shows how we copy the data of relation Record to oldRecord using
assignment and add the data of relation moreRecord to Record using incremental
assignment.
2.2.2 Relational Expressions
Relational
expressions include projection, selection and join. They can be divided into
two groups,
unary operations and binary operations. Unary operations require a relation
as
input and produce a relation as output. Binary operations require two relations
as input
and
produce a relation as output. The source relations will not be affected in both
kinds of
expressions.
Unary
Operations
*
Projection
Projection
extracts a subset of the source relation. The result relation consists of a
subset of the attributes of the source relation specified by IDList. Duplicate tuples
will be removed from the result relation. If the IDList is omitted, the
resulting relation will contain one tuple with one boolean attribute .bool. The
value is true if the operand relation has at least one tuple. Otherwise the
value is false. The syntax goes as follow:
"["
( IDList )? "]" in Expression
Example:
1.
Retrieve the marks in relation Record.
Result
<- [mark] in Record
Result
( mark )
65
70
85
90
2.
Check whether there is tuple in relation Record
Result
<- [ ] in Record
Result
( .bool )
true
*
Selection
Selection
returns a subset of the source relation specified by Expression. The tuples in
the result relation are those satisfying the condition of the SelectionClause.
The syntax goes as follow:
where
SelectClause in Expression
Example:
Retrieve
the tuples in which mark is higher than 80 in relation Record
Result
<- where mark > 80 in Record;
Result
( sno mark)
1 85
3 90
* T -
Selection
Projection
and selection can be combined into one expression called T - Selection. The
syntax goes as follow:
"["
( IDList )? "]" where SelectClause in Expression
Example:
Retrieve
the student number whose mark is higher than 80 in relation Record.
Result
<- [ sno ] where mark > 80 in Record;
Result
( sno )
1
3
* QT -
Selection
Support
for QT - Selection has not been implemented in jRelix. Refer to [Mer84] for
more information.
Binary
Operations
There are
two categories of binary operators: (-joins and (-joins. (-joins are a
generalization of set operations on relations, and (-joins are a generalization
of logical operations on relations [Mer84]. The results of (-joins and (-joins
are also relations.
The
syntax for join goes as follows:
Expression
JoinOperator Expression
|
Expression
"[" ExprList ":" JoinOperator (":")? ExprList
"]" Expression
In the
first production, the common attributes of the left and right side relations
are used as join attributes. In the case where the left and right side
relations have no common attributes, the user may specify which attributes form
the join attributes. This is handled in the second production.
mu-joins
mu-joins correspond to the binary set
operations of union, intersection and difference. We summarize mu-joins in the
following table.
Table 2.2: Summary of
mu-joins
In
general, mu-joins operators can be defined in terms of three components -
center, left and right. Given two relations R(X,Y), S(Y,Z), the three
components are defined as follows:
center(R,S)
= {(x,y,z) |(x,y) in R and (y,z) in S}
left(R,S)
= {(x,y,dc)|(x,y) in R and any z ((y,z)
not in S)}
right(R,S)
= {(dc,y,z)|(y,z) in S and any x ((x,y)
not in R)}
We
have:
R ujoin
S = center(R,S) U left(R,S) U right(R,S)
R ijoin
S = center(R,S)
R djoin
S = X,Y in left(R,S)
R
drjoin S = Y,Z in right(R,S)
R
lrjoin S = left(R,S) U center(R,S)
R rjoin
S = right(R,S) U center(R,S)
R sjoin
S = left(R,S) U right(R,S)
Notice that
here is a new symbol dc. This is one of the two null values in jRelix. The
other is dk. Details of dc and dk can be found in [Mer84].
Example: CLASS ( ITEM TYPE ) RECLASS (
ITEM TYPE )
Yarn a Yarn
a
String a String
b
Ball b Top a
Sandal c
Result1
<- CLASS ijoin RECLASS;
Result1
( ITEM TYPE )
Yarn a
Result2
<- CLASS ujoin RECLASS;
Result2
( ITEM TYPE )
Ball b
Sandal c
String a
String b
Top a
Yarn a
Result3
<- CLASS djoin RECLASS;
Result3
( ITEM TYPE )
Ball b
Sandal c
String a
sigma-joins
sigma-joins
correspond to the set comparison operators such as 'subset?' or 'equals?'. We
summarize sigma-joins
in the following table.
Table 2.3:
Summary of sigma-joins
As of
this writing, mu-joins have been implemented in jRelix while sigma-joins have not. But they will soon
be included in jRelix. Refer to [Mer84] for details of mu-joins and
sigma-joins.
2.3 Domain Algebra
Relational
algebra considers relations as the primitive data [Mer84], therefore does not
provide power to manipulate attributes. Domain algebra, which is proposed in
[Mer77], provides a set of operations applied on attributes. They can be
classified into two groups: horizontal and vertical. Figure 2.1 provides a
summary of the domain algebra.
Figure 2.1: Summary of
domain algebra
It is necessary to declare a virtual
domain when we make use of the domain algebra.
Virtual
domains provide a flexible way to manipulate domains independent of the context
of relations. They are declared on a set of actual domains or virtual domains
that are subsequently based on actual domains. And they are actualized when
they appear in the relational algebra, based on the actual domains' data in the
source relation.
Horizontal operations are used to
manipulate individual tuples, while vertical operations are applied over sets
of tuples. In the following example, we show the user a horizontal operation and
a vertical operation of domain algebra, respectively.
Example:
>
let SUBTOT be PRICE*QTY (horizontal -
works along a tuple)
>
let TOT be red + of SUBTOT (vertical -
combine values from more than one tuple)
A thorough
description of domain algebra can be found in [Mer84]. For details of its
implementation please refer to [Yua98]. Syntax of them is shown in Appendix A.
2.4 Unification of Relational Algebra and Domain
Algebra
jRelix
provides full support for nested relations in which an attribute of a relation
could itself be a relation. It not only requires the relational algebra to
handle different joins of nested relations, but also implies that the domain
algebra should be capable of handling nested attributes that actually are
relations. Therefore, with nested relations, domain algebra must absorb
relational algebra.
In the jRelix system, the basic strategy
of implementation of the domain algebra for nested relations is to extend the
domain algebra, which integrates relational expressions into domain expressions
so that the relational algebra can be applied to the nested attributes.
Consequently, any operations that are performed on the top level relations can
be performed at the lower level attributes which are also relations.
Take
the relation employees from section 2.1.3.
Example
1:
>
namesal <- [EMP] in employees;
The
above statement does a projection over nested attribute EMP in relation
employees. The result goes as the following:
EMP
( NAME SAL )
M.Gordon 25000
P.McConnel 22000
T.Anastasio 27000
J.Fishman 24000
B.Martin 38000
C.Wood 32000
J.Medeski 35000
Note
that there could be duplicate tuples in the result of projection on the nested
attribute in jRelix. For instance, if the tuple (M.Gordon, 25000) appears
twice, in both department stereo and television, it will be duplicate after the
projection over EMP.
Example
2:
> let
SenEmp be [NAME] where SAL >= 30000 in EMP;
>
seniors <- [SenEmp] in employess;
The
first statement defines a virtual attribute SenEmp using domain algebra. It is
observed that the relational statement
[NAME] where SAL >= 33000 in EMP
is
integrated in the domain expression. Thus we have defined an attribute SenEmp
which is a relation itself. The second statement actualizes the nested
attribute SenEmp in relation employees with a relational expression. We get the
following result:
Seniors
( SenEmp )
( NAME )
B.Martin
J.Medeski
In jRelix, the domain algebra is
implemented to be able to realize almost all the functionalities of relational
algebra, leading it to be a union of both traditional domain algebra and
relational algebra. For a complete discussion of the extended domain algebra in
jRelix please refer to [Yua98].
2.5 Computation
Computation
implements procedural abstraction in jRelix.
The basis of computation can be found in [Mer93]. The formal syntax
(details in appendix A) for the declaration of a computation goes as follow:
"comp"
Identifier "(" ( ParameterList )? ")" "is"
ComputationBody
We will
briefly discuss the computation with examples in this section. Please refer to
[Bak98] for a complete explanation of computations in jRelix. An example of the
declaration of a computation is shown as follows:
>
domain D,V,T float;
>
comp velocity (D, V, T) is
{ D <- V * T;
} alt
{ V <- D / T;
} alt
{ T <- D / V;
};
The computation
name is velocity. There are three parameters in this computation, and they are
all defined as domains. There are three "alt" blocks in this example.
All of them satisfy the constraint V = D / T. Given values for any two of these
variables, the value of the third will be calculated according to the
constraint.
The central design principle applied in
implementing computation is to make them resemble relations. We show the
relation corresponding to velocity as below:
VELOCITY
( D V T )
1 1 1
2 1 2
. . .
. . .
34.1 5.5 6.2
. . .
. . .
It is
an infinite relation, every tuple satisfies the constraint V = D / T. Further
more, all tuples satisfy this constraint is included in the relation. The
parameters of a computation become the domain of its associated relation.
As we have already discussed, the full
support of nested relations in jRelix means that the attribute of a relation
could itself be a relation. This implies that computation can also take
relations as its parameters.
Now we discuss the invocation of
computations.
*
Invoking a computation with a select expression
We can
use a select expression to supply specific values for the input parameters of a
computation.
Example:
>
result <- where D = 10 and T = 2 in velocity;
result
( D V T )
10
5 2
We set parameter
D to 10 and T to 2 using a selection clause. This causes these two domains to
be the input parameters, so the second block is executed in order to calculate
V.
*
Invoking a computation with array syntax
jRelix
provides an array syntax ( for details refer to [Hao98] ) which is syntax sugar
for a T-selection. This can be used to invoke a computation.
Example:
>
result <- velocity [10, ,2]
The
expression velocity [10, , 2] is equivalent to a T-select statement whose
projection list consists of only the second parameter of velocity. And the
first parameter has the input value 10 and the third parameter has the input
value 2. The expression is equivalent to
[V]
where D = 10 and T = 2 in velocity
Thus
the relation result should be
result
[ V ]
5
*
Invoking a computation by joining it with a relation
It has
been explained how a computation could be thought of as a relation, perhaps an
infinite one. Therefore, it makes sense to take a natural join of a computation
with a relation.
Example:
temprel
( D T )
20 10
15 3
16 4
> result <- velocity ijoin temprel;
We get
the relation result as follow:
result ( D V T )
20 2 10
15 5 3
16 4 4
As we have discussed, the parameters of a
computation can be thought of as domains of the computation. When two relations
are joined together, their common domains are called the join domains. This is
retained when a computation is joined with a relation.
Thus D
and T are the join domains in this example. The join domains D and T become the
input parameters of the computation while the remaining parameters become
outputs, so the second "alt" block of velocity is selected for
execution.
*
Stand-alone invocation of computations
Computation
may also be invoked by means of a top-level call in jRelix taking domains and
relations as its parameters or taking no parameters. Thus computation functions
just as procedure, which is a top level procedural abstraction implemented in
Relix, the predecessor of jRelix. Please refer to [RSL95] for a complete
discussion of procedure in Relix. Take the following example:
>
comp AssignComp ( ) is
{ result <- temprel;
};
>
comp JoinComp ( A, B, C ) is
{ C <- A ujoin B;
};
>
AssignComp ( );
>
JoinComp ( in oldRecord, in moreRecord, out Record);
Here we
have defined two computations. The stand-alone invocation of computation
AssignComp is a top level call which takes no parameter. The stand-alone
invocation of computation JoinComp is also a top level call, taking three
parameters where in and out are used to specify the input and output
parameters. The statement(s) in the computation body will be executed when the
computation is called.
The unification of computation and
procedure leads to a simpler language. In the next chapter, we will present to
the user how to use computations to define event handlers.
Chapter
3
User
Manual on Updates & Event Handlers
In this
chapter, we discuss how to use updates for both flat and nested relations, and
how to define event handlers in jRelix.
3.1 User Manual on Updates
3.1.1 Introduction
The
update operation allows us to change values of specified attributes in certain
tuples. These attributes could be selected by a using clause which uses a
relational algebra operation to select tuples from the relation we want to
update. We can also use updates to add or delete some tuples to or from the
relation. The nested update in nested relations will be presented.
3.1.2 Syntax and Semantics
Update
provides the mechanism for changing a relation. There are three basic update
operations on relations: add, delete and change. The syntax for update is:
UpdateStatement
:= update Identifier (add | delete) Expression
|
update Identifier change
(StatementList)? (UsingClause)?
UsingClause
:= using JoinOperator Expression
|
using
"[" IDList ":" JoinOperator (":")? ExpressionList
"]" Expression
Here
the first two productions add or delete the result of Expression to or from the
relation being updated. The semantics of add is the same as that of the
incremental assignment. The semantics of delete is the same as that of the
djoin. The third production updates part of the relation in the way specified
by StatementList. The part of the relation to be updated is the join result
(specified by JoinOperator) of the relation being updated and the result of
expression in UsingClause. If there is no UsingClause, the whole relation is updated.
The StatementList that follows the keyword change may contain update
statements. This nested update is used to update values of nested attributes in
nested relations.
Note
that the precedence ordering of update would be more intuitive if the syntax is
modified as following:
update
Identifier (UsingClause)? change (StatementList)?
3.1.3 Worked Example of Flat Relation
First
we show the user how to update tuples in non-nested relations.
CLASS RECLASS
( ITEM TYPE ) (
ITEM TYPE )
Yarn a
Yarn a
String a
String b
Ball b
Top a
Sandal c
In the
following examples assume that CLASS contains the above values every time we
enter a new command.
*
Example 1: update CLASS add RECLASS;
Result: CLASS ( ITEM TYPE )
Ball b
Sandal c
String a
String b
Top a
Yarn a
*
Example 2: update CLASS delete
RECLASS;
Result: CLASS ( ITEM TYPE )
Ball b
Sandal c
String a
*
Example 3: update CLASS change TYPE
<- "B" using ijoin RECLASS;
Result: The result of CLASS ijoin RECLASS is the
tuple {Yarn, a}
The change instruction, TYPE <-
"B", changes this to {Yarn, b}
So we have CLASS ( ITEM TYPE )
Ball b
Sandal c
String a
Yarn b
Example
4: update CLASS change TYPE <-
"b" using ijoin ( [ITEM] in RECLASS );
CLASS
ijoin [ITEM] in RECLASS
(
ITEM TYPE )
String
a
Yarn
a
Result: CLASS ( ITEM TYPE )
Ball b
Sandal c
String B
Yarn
B
*
Example 5: let NEWTYPE be TYPE;
update CLASS change TYPE <-
NEWTYPE
using ijoin ( [ITEM,
NEWTYPE] in RECLASS );
CLASS
ijoin ( [ITEM, NEWTYPE] in RECLASS )
(
ITEM TYPE NEWTYPE )
String
a b
Yarn
a a
Result:
CLASS ( ITEM TYPE )
Ball b
Sandal c
String b
Yarn a
*
Example 6: update CLASS change TYPE
<- NEWTYPE
using ujoin (
[ITEM, NEWTYPE] in RECLASS );
CLASS
ujoin ( [ITEM, NEWTYPE] in RECLASS )
(
ITEM TYPE NEWTYPE )
Ball
b dc
Sandal
c dc
String
a b
Top
dc a
Yarn
a a
Result: CLASS ( ITEM TYPE )
Ball b
Sandal c
String b
Yarn a
Top a
NOTE:
dc or dk does not replace a known value.
*
Example 7: update CLASS change TYPE
<- "B" using djoin RECLASS;
CLASS
djoin RECLASS
(
ITEM TYPE )
Ball
b
Sandal
c
String
a
Result: CLASS ( ITEM TYPE )
Ball B
Sandal B
String B
Yarn a
The
following two examples are of global change.
Example
8: update CLASS change TYPE <-
"C";
Result: CLASS ( ITEM TYPE )
Ball C
Sandal C
String C
Yarn C
Example
9: let NEWTYPE be if TYPE =
"c" then "B" else TYPE;
update CLASS change TYPE <-
NEWTYPE;
Result: CLASS ( ITEM TYPE )
Ball b
Sandal B
String a
Yarn a
The
following example uses a T-selector to select tuples to be updated.
Example
10: update CLASS change TYPE <-
"B"
using ijoin ( where
ITEM = "Yarn" in CLASS );
CLASS
ijoin ( where ITEM = "Yarn" in CLASS )
(
ITEM TYPE )
Yarn
a
Result: CLASS ( ITEM TYPE )
Ball b
Sandal c
String a
Yarn B
3.1.4 Worked Example of Nested Relation
The
following examples show the user how to update tuples in nested relations.
Consider
the following relations where relation EMP is a domain of relation employees:
employees
(
DEPT EMP )
( NAME SAL )
stereo M.Gordon
25000
P.McConnel 22000
T.Anastasio 27000
J.Fishman 24000
television B.Martin 38000
C.Wood 32000
J.Medeski 35000
NEWEMP
( NAME SAL ) RETIREEMP ( NAME )
P.Alger 30000
M.Gordon
B.Martin
We
assume that employees and EMP contain the above values every time we enter a
new command.
*
Example 11: update employees change (
update EMP add NEWEMP )
using ijoin ( where DEPT =
"television" in employees );
employees
ijoin ( where DEPT = "television" in employees )
(
DEPT EMP )
( NAME SAL )
television B.Martin 38000
C.Wood 32000
J.Medeski 35000
Result:
employees
(
DEPT EMP )
( NAME SAL )
stereo M.Gordon
25000
P.McConnel 22000
T.Anastasio 27000
J.Fishman 24000
television B.Martin 38000
C.Wood 32000
J.Medeski 35000
P.Alger 30000
*
Example 12: update employees change (
update EMP delete RETIREEMP );
Result:
employees
(
DEPT EMP )
( NAME SAL )
stereo P.McConnel
22000
T.Anastasio 27000
J.Fishman 24000
television C.Wood 32000
J.Medeski 35000
*
Example 13: update employees change (
update EMP change SAL <- SAL +10000 )
using ijoin ( where DEPT =
"stereo" in employees );
employees
ijoin ( where DEPT = "stereo" in employees )
(
DEPT EMP )
( NAME SAL )
stereo M.Gordon
25000
P.McConnel
22000
T.Anastasio 27000
J.Fishman 24000
Result:
employees
(
DEPT EMP )
( NAME SAL )
stereo M.Gordon
35000
P.McConnel 32000
T.Anastasio 37000
J.Fishman 34000
television B.Martin 38000
C.Wood 32000
J.Medeski 35000
*
Example 14: update employees change (
update EMP change SAL <- SAL +10000
using djoin RETIREEMP );
EMP
djoin RETIREEMP
(
DEPT EMP )
( NAME SAL )
stereo P.McConnel
22000
T.Anastasio 27000
J.Fishman 24000
television C.Wood 32000
J.Medeski 35000
Result:
employees
(
DEPT EMP )
( NAME SAL )
stereo M.Gordon
25000
P.McConnel 32000
T.Anastasio
37000
J.Fishman 34000
television B.Martin 38000
C.Wood 42000
J.Medeski 45000
*
Example 15: update employees change (
update EMP change SAL <- SAL +10000
using djoin RETIREEMP )
using ijoin ( where DEPT =
"stereo" in employees);
employees
ijoin ( where DEPT = "stereo" in employees )
(
DEPT EMP )
( NAME SAL )
stereo M.Gordon
25000
P.McConnel 22000
T.Anastasio 27000
J.Fishman 24000
Then
the above part of EMP (say EMP') is selected for nested update.
EMP'
djoin RETIREEMP
EMP
(
NAME SAL )
P.McConnel 22000
T.Anastasio 27000
J.Fishman 24000
Result:
employees
(
DEPT EMP )
( NAME SAL )
stereo M.Gordon
25000
P.McConnel 32000
T.Anastasio
37000
J.Fishman 34000
television B.Martin 38000
C.Wood 32000
J.Medeski 35000
*
Example 16: let NEWSAL be if SAL <
35000 then SAL + 10000 else SAL;
update employees change (
update EMP change SAL <- NEWSAL );
Result:
employees
(
DEPT EMP )
( NAME SAL )
stereo M.Gordon
35000
P.McConnel 32000
T.Anastasio 37000
J.Fishman 34000
television B.Martin 38000
C.Wood 42000
J.Medeski 35000
3.2 User Manual on Event Handlers
3.2.1 Introduction
Procedural
abstraction has been achieved in jRelix in the form of computation. And
computation is also implemented to function as procedure which was differently
implemented in previous versions of Relix. Therefore, computation is used to
define event handlers in jRelix. In this chapter we will present the reader the
mechanism by which he/she can define event handlers, properties of event
handlers and operations which can be performed on them. Event handlers defined
on nested updates will also be demonstrated. Finally, the usage of the undo
command within event handlers will be explored.
3.2.2 Computation as Event Handlers
Event
handlers are procedures (computations) and events are system-generated
procedural calls. So far, in our implementation, events are generated by
updates. But they may arise otherwise, e.g., mouse click, notice from the
Internet, or a database read.
To distinguish event handlers from
user-defined computations, we give them special names, constituted according to
the following syntax.
* Naming
Event Handlers
The
original syntax for declaration of computation goes like:
Computation
:= "comp" Identifier "(" (ParameterList)? ")"
"is" ComputationBody
We
modify the syntax as follows so that we can define a computation as an event
handler:
Computation
:= "comp" CompName "(" (ParameterList)? ")"
"is" ComputationBody
where
CompName
:= Identifier | EventName
EventName
:=
(prefix":")?action":"relation("["attribute-list"]")?
Now we
will discuss each component of the EventName.
*
prefix
Prefix
is an optional part of the name, could be either pre or post. If prefix is
omitted, it is post by default. Pre implies that the event handler would be
executed before the execution of the event which triggers it. Post implies that
the event handler would be executed after the execution of the event which
triggers it. The pre or post event handler is coupled with the event which
triggers it without interruption.
*
action
Action
specifies by what kind of update operation on the relation would the according
event handler be activated. There are three possible types of action: add,
delete, or change.
*
relation
Relation
identifies the name of the relation to be updated.
*
attribute-list
The
optional attribute-list is a list of attributes of the preceding relation. For
event handlers with action add or delete, the attribute-list is always omitted.
For event handlers with action change, the attribute-list could be any subset
of the original attributes, including the empty subset, which means change action
on any attribute(s) will trigger the event handler. Attributes in the
attribute-list are separated from one another by comma.
Here
are some examples of valid event names for a relation R with attributes a, b,
c:
pre:add:R
post:delete:R
add:R
(equivalent to post:add:R)
delete:R
(equivalent to post:delete:R)
pre:change:R[a,b]
post:change:R[a,b,c]
post:change:R
change:R[a]
(equivalent to post:change:R[a] )
The
following are not valid event names:
pre:add:R[a]
post:delete:R[b,c]
Event names
and handlers are related one to one. There will not be any two or more event
handlers with the same event name.
*
Defining Event Handlers
Because
computation is used to define event handler, defining an event handler is just
like defining a computation but with a special event-name.
comp
event-name ( ) is
{
statements;
}
Note
that event handlers do not take parameters because they are system-generated
procedural calls without the user intervention.
3.2.3 Cascading Event Handlers
As
shown above, in the body of event handler, there could be several statements
including updates which may have their own event handlers. This introduces
cascading event handlers which will be executed in turn. They would be cascaded
as deep as the user wants. The following figure illustrates the cascading
situation.
Figure 3.1: Cascading Event
Handlers
( Solid
lines represent pre or post event handler calls. Dotted lines represent
statements in an event handler body which in turn trigger cascading event
handlers. Event represents an update operation upon a relation. )
The
sequence of execution can be found by reading the above figure top down:
Pre-Event2-Handler
Event2
Post-Event2-Handler
Pre-Event3-Handler
Event3
Post-Event3-Handler
Event1
Pre-Event4-Handler
Event4
Post-Event4-Handler
Cascading
event handlers are useful in such applications as spread sheets.
3.2.4 Event Handler On/Off
When
first defined, an event handler is set to state on which means it will be
executed when the according event occurs. If it is set to state off afterwards,
then the according event will not trigger the event handler even it still
exists in the system. The user can activate it by setting it to state on again.
The
following are the commands used to set the event handler's state:
eventon
event-name
This
command sets the event-handler to state on.
eventoff
event-name
This
command sets the event-handler to state off.
3.2.5 Printing Event Handlers
We use the
same pr command that is used for relation to print an event handler.
pr
event-name
This
command will print out the definition of the event handler with the specified
event-name.
3.2.6 Deleting Event Handlers
If the
user wants to define a new event handler with an existing name, the old event
handler should be deleted first; otherwise, the system will prompt a warning
and the new definition will not be accepted. We use the same dr command that is
used for relation to delete an event handler.
dr event-name
3.2.7 Trigger and New and Rest
For the
update operation, the affected relation would be separated into three pieces
which are named Trigger, New and Rest. They only exist in the pre and post
event handler. These three pieces are useful to achieve the undo operation
which will be discussed later. Trigger is defined as being the tuples which
will be affected by an update operation on the original relation, or you can
call it Old. New is defined as being the new values of those old tuples. Rest
is defined as being the tuples which are not affected by an update operation on
the original relation. Note that Trigger, New and Rest are always created even
when there is no event handler defined for this operation as they could be
useful in further development of the system. For example, they could be used to
achieve the rollback issued by the user in a later time afte the update
operation or recover the system from crash.
Considering
the following examples:
R[X,
Y] P[X, Y] Q[X, Y] S[X, Y]
a
1 a 1 a
3 b 1
a 2
b 1
Trigger New Rest
update
R change X<-"c" using ijoin P; {(a,1)} {(c,1)}
{(a,2),(b,1)}
update
R add Q; {} Q R
update
R delete S; S {} R-S
Consider
the nested update-add:
em
(
DEPT EMP )
( NAME )
tele a
b
stereo c
d
NEWEMP[NAME]
e
update
em change (update EMP add NEWEMP) using ijoin where DEPT =
"tele" in em;
Trigger New Rest
em {(tele,a),(tele,b)} {(tele,a),(tele,b),(tele,e)} {(stereo,c),(stereo,d)}
Now
let's consider the nested update-change. Take original em.
update
em change (update EMP change NAME <-
"z" using ijoin where NAME = "a" in
EMP) using ijoin where DEPT
= "tele" in em ;
Trigger New Rest
em {(tele,a)} {(tele,z)}
{(tele,b),(stereo,c),(stereo,d)}
In our
system, Trigger is available to the user as a normal relation. Therefore, the
user can apply relational algebra upon Trigger (for examples, refer to the 3.2.9),
and use pr command to print out Trigger.
pr
trigger
As the
scope of Trigger is within the update statement and its event handlers (if
there is any), the commands related to Trigger are only used within the
corresponding event handlers. Note that New and Rest are hidden from the
user.
3.2.8 Undo Command
In
jRelix, whenever there is an update event, we produce the Trigger, New and
Rest. As mentioned before, their scope is within the event itself and its
pre/post mode event handlers. What the update operation does is switch these
three pieces so that Trigger is detached from Rest, and New is combined with
Rest. This creates the new relation for the update operation. Meanwhile, the
Trigger becomes New, and New becomes Trigger.
We add a new command, undo, in jRelix,
which is used in event handler. What undo does is to switch Trigger, New and
Rest in which Trigger and New exchange their places again, functioning the same
as update operation. Obviously the update operation is reversed by executing
command undo.
If we execute the undo command in the
pre-mode event handler which runs before the update event, it just updates the
relation as what the update operation does. So we could equally name undo do.
And when it comes to run the update operation, it functions as the command undo
by switching those three pieces again. All in all, the undo and the update
operation both function as a toggle. We illustrate the above explanation in
Figure
3.2.
Figure 3.2: Undo using
Trigger, New and Rest
The
syntax of the undo command goes as follow:
undo
Note
that the undo is also just available within the corresponding event handlers.
3.2.9 Worked Example of Event Handlers
In this
section we will introduce two examples to show how event handlers are used.
Example
1:
domain
X,Y intg;
relation
R(X,Y);
comp
pre:change:R() is
{ S
<- trigger;
}
update
R change X <- 1;
By
defining the event handler pre:change:R, trigger of R will be assigned to S, a copy
of tuples to be changed in R, each time before an update-change operation is
applied upon relation R.
Example
2:
This
example shows an extreme case of cascading event handlers, in which event
handler does further update on the same relation.
domain
n integer;
relation
start(n) <- {4};
comp
post:add:iota () is
{ let
nmin1 be (red min of n) - 1;
let n be nmin1;
update iota add [n] in [nmin1] where
nmin1>=0 in iota;
};
update
iota add start;
By
defining the event handler post:add:iota, each time there is an update-add
operation upon relation iota, the event handler will be executed after the
execution of update-add operation.
Since there is an update-add statement in
the event handler post:add:iota, after executing this statement, it will
trigger the event handler post:add:iota again. It seems that the system will
not ever stop. But it should not be the case. We have a scheme to stop the
update-eventhandler-update infinite loop.
Definition:
A void update is an update which does no change on the subject relation.
Stop
Scheme: If an update is a void update, then the cascading event handler should
stop.
Now we
examine how it is applied in the example. From the beginning, relation start is
{4}, relation iota is empty. After executing the statement update iota add
start, relation iota becomes {4}. Then the event handler post:add:iota is
triggered, so it comes into the event handler body to execute those statements
within it. The first two are let...be... statements which cause no change in
relation iota. The third one is an update-add statement upon relation iota
which produces a number whose value is the min value of all tuples in relation
iota minus 1 and the condition of adding it to iota is that it should be
non-negative. After executing it, relation iota becomes {4,3}. And the event
handler post:add:iota is triggered again. It automatically runs through the
event handler again following the above procedure. The procedure will repeat
for several times until the number produced by the min value of all tuples in
iota minus 1 is less then 0. Then the update-add statement in the event handler
will not do any change on iota, which means it satisfies the stop condition
according to the stop scheme. The following figures illustrate this procedure.
Figure 3.3: Cascading Event Handlers
Chapter
4
Implementation
of Updates and Event Handlers
4.1 Representation of Nested Relations
The
implementation of nested relation is built upon flat relations. When a
non-primitive type of attribute is defined, a relation of the same name, but
with a prefix ".", is automatically created by the system and hidden
from the user. The new relation consists of the attributes which make up the non-primitive
type of domain, as well as one additional attribute .id. This is the surrogate
field used to link up the upper level relation with nested non-primitive
attributes.
The relation employees from section 2.1.3
will serve as an example. The declaration of the nested attribute EMP leads to
the creation of the relation .EMP which is initially empty. The creation of
nested relation employees causes tuples to be added to .EMP. Surrogates are
stored under the EMP field of employees.
Note that in relation employees,
surrogate 1,2 of the nested attribute EMP link the value 1, 2 of attribute .id
in the relation .EMP, shown as the following:
employees
( DEPT EMP )
stereo .1
television .2
.EMP (
.id Name Salary )
.1 M.Gordon
25000
.1 P.McConnel
22000
.1 T.Anastasio
27000
.1 J.Fishman
24000
.2 B.Martin
38000
.2 C.Wood
32000
.2 J.Medeski
35000
Refer
to [Hao98, Yua98] for more details about implementation of nested relations.
4.2 Implementation of Updates
In this
section we will discuss the algorithm for update operations. There are three
ways to update a relation: add some tuples, delete some tuples, or change the
values of some specific tuples. We call them update-add, update-delete, and
update-change, respectively. The update-add is just syntax sugar for the
incremental assignment while the update-Delete is just syntax sugar for djoin.
For update-change, it could be a global change or a partial change. Considering
nested update-change, there could be update-add, update-delete or update-change
nested in the upper level update-change, as deep as the user wants. For syntax
refer to Appendix A.
Essentially, we create three pieces of
relation for the update operation, which are Trigger(Old), New and Rest. To do
the update, whether it's update-add, update-delete, or update-change, we just
switch these three pieces to produce the updated relation. This also applies to
undo operation. Both update and undo function like a toggle.
The nested updates will not do any change
on the upper level relation. So, for implementation, to be able to keep a copy
of the old relation and make the undo operation, we just keep track of the
Trigger and New and Rest down to the lowest level relation. Take the example of nested update-add from
section 3.2.7:
update
em change (update EMP add NEWEMP) using ijoin where DEPT = "tele" in
em;
We
don't keep the Trigger, New and Rest of em (which is the top level relation),
but those of EMP (which is the nested relation to be changed).
Trigger New
Rest
EMP {}
{(.1,e)} {(.1,a),(.1,b),(.2,c),(.2,d)}
Now
let's consider nested update-change.
update
em change (update EMP change NAME <-
"z" using ijoin where NAME = "a" in
EMP) using ijoin where DEPT
= "tele" in em ;
We get
Trigger and New and Rest of EMP, just as we do for the nested update-add.
Trigger New
Rest
EMP {(.1,a)} {(.1,z)}
{(.1,b),(.2,c),(.2,d)}
We only
need to keep these three pieces according to the lowest level relation which is
actually changed. This makes undo command easy by just switching Trigger and
New with Rest, like a toggle.
The structure of the update statement is
kept in a tree according to the syntax. We refer to the tree whenever we need
to analyze the statement.
There
are three major components of the algorithm:
*
EvaluateUpdate
*
ExecuteUpdate
*
SwitchTrigger
4.2.1 EvaluateUpdate
4.2.1.1
Algorithm of EvaluateUpdate
EvaluateUpdate
is the function that creates Trigger, New and Rest for the update operation
upon a relation.
Here are the explanations of the
variables in the algorithm:
*
upper_level_trigger: the input parameter of the function, which is the affected
part of the upper level relation for a nested update; set to the first level
relation when the function was firstly called
*
cur_level_rel: the current level relation to be updated
*
ad_rel: the relation to be added to the original relation
*
del_rel: the relation to be deleted to the original relation
*
using_rel: the relation appearing in the using clause
*
using_join: the join type appearing in the using clause
*
temp_rel: the temporary relation used in the algorithm
*
Trigger: the affected part of the original relation
* New:
the new values of the tuples affected in the original relation
* Rest:
the unaffected tuples of the original relation
The
algorithm of EvaluateUpdate( upper_level_trigger ) goes as follows:
* analyze
the statement tree to get the action_type of the current level update
* if
action_type equals to ADD
* if it is the first level
* set New to ad_rel
* else
* set temp_rel to the
result of upper_level_trigger ijoin
cur_level_rel on the surrogate
* for each surrogate in temp_rel
* for each tuple in ad_rel
* add surrogate to the tuple
* add the tuple into New
* set Trigger to empty
* set Rest to cur_level_rel
* if
action_type equals to DELETE
* if it is the first level
* set temp_rel to cur_level_rel
* else
* set temp_rel to the result of
upper_level_trigger ijoin cur_level_rel
* set Trigger to the result of temp_rel
ijoin del_rel
* set New to empty
* set Rest to the result of temp_rel djoin
del_rel
* if
action_type equals to CHANGE
* if there is a UsingClause
* get the using_rel and using_join from
the UsingClause
* set temp_rel to the result of
cur_level_rel using_join using_rel
* else
* set temp_rel to cur_level_rel
* if it is the first level
* set Trigger to temp_rel
* else
* set temp_rel to the result of
upper_level_trigger ijoin temp_rel
* set Trigger to temp_rel
* set Rest to the result of cur_level_rel
djoin Trigger
* for each statement in current level
* if this statement is of type ASSIGNMENT
* for each tuple in Trigger
* replace the value of the domain with new value and add the tuple
into
New
* adjust Trigger, New and Rest so that the Trigger only includes
the
tuples actually being
changed
* if this statement is of type UPDATE
* call EvaluateUpdate( temp_rel )
The
algorithm will recursively call itself until it arrives at the deepest nested
update operation and then create Trigger, New and Rest accordingly.
4.2.1.2
Example of EvaluateUpdate
Take
the relation employees and RETIREEMP from section 3.1.4 as the example.
update
employees change ( update EMP change SAL <- SAL +10000
using djoin
RETIREEMP )
using ijoin ( where DEPT = "stereo" in employees);
* The
system calls EvaluateUpdate ( employees )
* The
action_type of current level (first level) update is CHANGE
* There
is a UsingClause in the current level
* The using_rel
is the relation got from (where DEPT = "stereo" in employees)
* The
using_type is ijoin
* Set
temp_rel to the result of employees
ijoin (where DEPT ="stereo" in employees), so temp_rel is {(stereo,
.1)}
* Set
Trigger to temp_rel, so Trigger is {(stereo, .1)}
* Set
Rest to employees djoin Trigger, so Rest is {(television, .2)}
* The
statement of current level is:
update EMP change SAL <- SAL +10000
using djoin RETIREEMP
* This
statement is of type UPDATE, so call EvaluateUpdate (temp_rel)
* The
upper_level_relation is {(stereo, .1)}, got from the input parameter
* The action_type of current level (second
level) update is CHANGE
* There is a UsingClause in the current
level
* The using_rel is the relation RETIREEMP
* The using_type is djoin
* Set temp_rel to the result of EMP djoin RETIREEMP,
so temp_rel is {(.1 P.McConnel 22000)
(.1 T.Anastasio 27000)
(.1 J.Fishman 24000)
(.2 C.Wood 32000)
(.2 J.Medeski 35000)}
* This is not the first level, so set temp_rel to the result
of
upper_level_trigger ijoin
temp_rel, so temp_rel is
{(.1 P.McConnel
22000)
(.1 T.Anastasio
27000)
(.1 J.Fishman
24000)}
* Set Trigger to temp_rel, so Trigger is
{(.1 P.McConnel 22000)
(.1 T.Anastasio 27000)
(.1 J.Fishman 24000)}
* Set Rest to the result of cur_level_rel
djoin Trigger,
so Rest is {(.2 C.Wood
32000)
(.2 J.Medeski
35000)}
* The statement of current level is: SAL
<- SAL +10000
* The statement is of type ASSIGNMENT
* For each tuple in Trigger, replace the old value of domain SAL
with new
value which is 10000
larger than the old one, and add the tuple in New.
So New is {(.1 P.McConnel
32000)
(.1 T.Anastasio 37000)
(.1 J.Fishman
34000)}
* Adjust Trigger, New and Rest if necessary.
* Finish
4.2.2 ExecuteUpdate
4.2.2.1
Algorithm of ExecuteUpdate
ExecuteUpdate
is the function that really updates the relation by switching Trigger, New and
Rest produced by function EvaluateUpdate.
The
algorithm goes as follows:
ExecuteUpdate(
)
*
analyze the statement tree to get the action_type of current level update
* if
the action_type equals to ADD or DELETE
* call SwitchTrigger ( )
* if
the action_type equals to CHANGE
* for each statement in current level
* if the statement is of type ASSIGNEMNT
* call SwitchTrigger( )
* if the statement is of type UPDATE
* call ExecuteUpdate ( )
The
algorithm will recursively call itself until it arrives at the deepest nested
update operation and then switch Trigger, New and Rest accordingly.
4.2.2.2
Example of ExecuteUpdate
Following
the above example, we have got Trigger, New and Rest by calling EvaluateUpdate(
), and now we want to actually get the result of the update operation:
update
employees change ( update EMP change SAL <- SAL +10000
using djoin
RETIREEMP )
using ijoin ( where DEPT =
"stereo" in employees);
* The
system calls ExecuteUpdate( )
* The
action_type of current level (first level) update is CHANGE
* The
statement of current level is:
update EMP change SAL <- SAL + 10000
using djoin RETIREEMP
* The
statement is of type UPDATE, so call ExecuteUpdate ( )
* The action_type of current level (second
level) update is CHANGE
* The statement of current level is: SAL
<- SAL + 10000
* The statement is of type ASSIGNMENT
* Call SwitchTrigger ( )
* Finish
4.2.3 SwitchTrigger
4.2.3.1
Algorithm of SwitchTrigger
StwitchTrigger
is the function that switchs Trigger, New and Rest which are produced by
EvaluateUpdate. It is called by the function ExecuteUpdate.
The
algorithm goes as follows:
SwitchTrigger
( )
* set
the updated relation to be the result of New ujoin Rest
* store
Trigger to a temporary relation
* set
Trigger to New
* set
New to that temporary relation
The
algorithm that creates the updated relation by toggling Trigger and New with
Rest is quite straightforward. We use the same algorithm to achieve undo
operation. It is obvious that if we switch Trigger and New again with Rest
after the update operation, we actually get the original relation, thus undo
the update operation.
4.2.3.2
Example of SwitchTrigger
Based
on the above example, we have got Trigger, New and Rest as follows:
Trigger: {(.1 P.McConnel
22000)
(.1 T.Anastasio
27000)
(.1 J.Fishman
24000)}
Rest: {(.2
C.Wood 32000)
(.2 J.Medeski
35000)}
New:
{(.1 P.McConnel 32000)
(.1 T.Anastasio
37000)
(.1 J.Fishman
34000)}
Now it
is time to make the change:
* New ujoin
Rest -> updated relation
So the
updated relation is:
{(.1 P.McConnel 32000)
(.1
T.Anastasio 37000)
(.1
J.Fishman 34000)
(.2
C.Wood 32000)
(.2
J.Medeski 35000)}
*
Trigger -> temporary relation
So the
temporary relation is:
{(.1 P.McConnel 22000)
(.1
T.Anastasio 27000)
(.1
J.Fishman 24000)}
* New
-> Trigger
So
Trigger becomes:
{(.1 P.McConnel 32000)
(.1
T.Anastasio 37000)
(.1
J.Fishman 34000)}
*
temporary relation -> New
So New
becomes:
{(.1 P.McConnel 22000)
(.1
T.Anastasio 27000)
(.1
J.Fishman 24000)}
4.3 Implementation of Event Handlers
In this
section we will discuss the way we implement event handlers and explain the
syntax change we made to support event handlers. We will also describe the
algorithm used to select proper event handlers for an update operation. Finally
we will explain how event handlers are combined with the implementation of
updates.
4.3.1 Computation as Event Handlers
Consider
relation R (X, Y, Z),
a b 1
c d 2
we
define the following event handler:
>
comp post:change:R[X] ( ) is
{
statements;
};
Since
event handlers are computations, they are stored as regular computations, both
in the system table and on disk (refer to Chapter 2). We add a new attribute,
Active, in the system table .rel. It is used to indicate whether an event handler
is active (1) or not (0). The user can change the event handler's state by
using the command eventon or eventoff. For relation and regular computation,
the attribute Active has no effect, so always set to 0.
Now the system table .rel will look like
this:
.rel (
name type
nattr sort ntuples active)
R rel 3
1 2 0
post:change:R[X] comp 0 0 0 1
And the
event handler is stored on disk as post:change:R[X].comp.
Note
that the prefix of EventName is optional. If it is omitted, we take it as post.
Next we will discuss how the proper event
handlers are selected to execute when an update operation occurs.
4.3.2 Algorithm of ExecuteEventHandler
ExecuteEventHandler
is the function used to execute proper event handlers according to the update
operation. The function takes two input parameters: one is prefix, the other is
event_name.
*
prefix: pre or post; when the function is called before the update operation,
it is set to pre; when the function is called after the update operation, it is
set to post.
*
event_name: the event's name, indicating the action type (add/delete/change),
relation name and affected attributes of the update operation.
The
event handlers both satisfying the prefix and event_name will be put into
selection_list. The algorithm goes as follows:
ExecuteEventHandler
( prefix, event_name )
* get
the action_type from event_name
* get
the relation_name from event_name
* get
the attribute_list from event_name; if action_type is add/delete,
attribute_list must be empty.
* set
query1 to be prefix, set query2 to be (action_type + relation_name), set query3
to be attribute_list.
*
search the system table .rel, get those computations with attribute Active set
to 1 (which must be event handlers )
* for
each selected computation, analyze their name.
* separate the name to three parts: mode(pre or post), event and
attrlist; if
the name do not have
mode, set mode to post
* if mode equals to query1
* if action_type of query2 is ADD/DELETE
* if event equals to query2
* add the event handler name to the
selection_list
* if action_type of query2 is CHANGE
* if event equals to query2 and if
attrlist is the subset of query3
* add the event handler name to the
selection_list
* for
each event handler in the selection_list
* execute the event handler
4.3.3 Example of ExecuteEventHandler
Now we examine
how the selection of event handlers is done when we execute the following
update operation:
>
update R change X <- "e";
Assuming
we have defined six event handlers. The system table .rel goes as follows:
.rel (
name type nattr
sort ntuples active)
R
rel 3 1 2 0
change:R[Z] comp 0 0 0 1
pre:add:R comp 0 0 0 1
pre:change:R comp 0 0
0 1
pre:change:R[X] comp 0 0 0 1
pre:change:R[X,Y] comp 0
0 0 1
post:change:R[X] comp 0 0 0 1
post:change:R[Y] comp 0 0
0 0
Before
the update statement is executed, the system calls:
ExecuteEventHandler
("pre", "change:R:[X]")
*
action_type of event_name is CHANGE
*
relation_name of event_name is R
*
attribute_list of event_name is X
* set
query1 to be pre, set query2 to be change:R, query3 to be X get all
computations with attribute active set to 1,
which are change:R[Z], pre:add:R,
pre:change:R, pre:change:R[X,Y],
post:change:R[X]
* for change:R[Z],
because it does not have mode, set mode to post, set event to
change:R, set attrlist to Z
* mode does not equal to query1, so the
event handler is not added to
selection_list
* for
pre:add:R, set mode to pre, set event to add:R, set attrlist to empty
* mode equals to query1
* action_type of query2 is CHANGE
* event does not equal to query2, so the
event handler is not added to
selection_list
* for
pre:change:R, set mode to pre, set event to change:R, set attrlist to
empty
* mode equals to query1
* action_type of query2 is CHANGE
* event equals to query2 and attrlist is the
subset of query3, so the event
handler is added to selection_list
* for
pre:change:R[X], set mode to pre, set event to change:R, set attrlist to
X,
* mode equals to query1
* action_type of query2 is CHANGE
* event equals to query2 and attrlist is the subset of query3
(here attrrlist
equals to query3), so the
event handler is added to selection_list
* for
pre:change:R[X, Y], set mode to pre, set event to change:R, set attrlist
to X, Y
* mode equals to query1
* action_type of query2 is CHANGE
* event equals to query2 but attrlist is not
subset of query3, so the event
handler is not added to selection_list
* for post:change:R[X],
set mode to post, set event to change:R, set attrlist to
X
* mode does not equal to query1, so the
event handler is not added to
selection_list
* there
are two event handlers, pre:change:R and pre:change:R[X], in the
selection_list
* execute event handler pre:change:R
* execute event handler pre:change:R[X]
Finishing
the event handler's execution, the system will run the update statement. After
that, the system will call:
ExecuteEventHandler
("post", "change:R:[X]")
The
similar procedure goes through to select the event handler post:change:R[X] to
execute. Note that in the body of event handler, there could be update
statements which in turn have their own event handlers; namely, cascading event
handlers.
In the next section, we will introduce
the combination of update and event handler algorithm by which the update
operation is coupled with event handler closely and neatly, and the cascading
event handler is handled by the system's recursive execution.
4.4 Combination of Updates and Event handlers
To
combine updates and event handlers, we just need to insert the function call of
ExecuteEventHandler into ExecuteUpdate as follows:
ExecuteUpdate
( )
*
ExecuteEventHandler("pre", event_name)
*
analyze the statement tree to get the action_type of current level update
* if
the action_type equals to ADD or DELETE
* call SwitchTrigger ( )
* if
the action_type equals to CHANGE
* for each statement in current level
* if the statement is of type ASSIGNEMNT
* call SwitchTrigger( )
* if the statement is of type UPDATE
* call ExecuteUpdate ( )
*
ExecuteEventHandler("post", event_name)
Reading
an update statement, the Interpreter first calls the function EvaluateUpdate(
), then calls ExecuteUpdate( ).
In the function ExecuteEventHandler,
proper event handlers will be found and executed. Executing the statements in
an event handler body is an indirect recursive call to the interpreter. When
the interpreter reads an update statement from the event handler body, it will
call EvaluateUpdate ( ) and ExecuteUpdate ( ) again. Thus we achieve the
cascading event handler recursively.
We illustrate the whole procedure as the
following figure (dotted line shows indirect call to the interpreter):
Figure 4.1: Coupling of Update and Event Handler
4.5 Setting Event Handlers On/Off
The
algorithm of eventon and eventoff is quite straightforward. To implement these
two commands, we introduce the following syntax:
EventOn
: = eventon EventName ( "[" IDList "]" )?
EventOff
: = eventoff EventName ("[" IDList "]" )?
The
algorithm goes as follows:
*
separate the EventName and IDList
*
search the system table .rel, get a list of event handlers whose name equals to
EventName and whose attribute list equals to IDList
* for
each entry in the selected event handler list
* set Active to requested state
4.6 Deleting Event Handlers
We use
the same dr command used to delete relation to delete event handler. Once the Interpreter
detects that the dr command is followed by an event name, the following
algorithm is executed:
* get
the event handler to be deleted from the system table .rel
*
remove it from .rel
4.7 Printing Event Handlers
We use
the same pr command used to print relation to print event handler. Once the
Interpreter detects that the pr command is followed by an event name, the
following algorithm is executed:
* get
the event handler to be printed from the system table .rel
* print
the source code of the event handler
Chapter
5
Conclusion
In this
chapter we first summarize the work presented in this thesis. Then we discuss
some future work to do on updates, event handlers and transaction control.
5.1 Conclusions
We have
implemented Updates and Event Handlers in jRelix based on a nested relational
model. In our implementation, update statements can handle changes applied on
either atomic-valued attributes of flat relation or relation-valued attributes
of nested relations. We provide for the user a uniform syntax for updating both
flat relations and nested relations.
We have used jRelix computations to
perform event programming in jRelix. By modifying the syntax of computation
name, we can accept an event name as a computation name; thus, define an event
handler using computation. As we presented, computation can be unified with
procedure (refer to [Elk96] for details of procedure and defining event handler
with procedure in Relix), leading to a simpler language.
According to our definition, events are
nothing more than system-generated procedure calls, and event handlers are
procedures invoked by corresponding events. It is simpler to implement the
event handler execution model based on this definition and more intuitive to
use it (refer to section 1.2.3).
5.2 Future Work
In this
section, we will discuss some issues about further improvement on jRelix. First
we discuss the integration of computation in an update statement. Then we discuss
how to add condition evaluation into our event handler execution model. Finally
we move on to the topic of transaction control in jRelix.
*
Computation in Updates
Theoretically,
computation can appear after the key word change in the update statement.
As of
this writing, we have partially implemented the combination of updates and
computations. Consider the following example:
>domain
DEP, B intg;
>domain
DEPOSIT comp(DEP);
>domain
BALANCE comp(B);
>comp
ba(BALANCE, DEPOSIT) is
state bal intg;
state oldbal intg;
{ comp DEPOSIT(DEP) is
{
oldbal <- bal;
bal <- bal + DEP;
} alt
{
DEP <- bal - oldbal;
}
};
comp BALANCE (B) is
{ B <- bal;
};
bal <- 0;
};
>relation
accts(ACCNO, CLIENT) <- {(1729, "pat"), (4104. "suz")};
>accounts
<- accts ijoin ba;
>update
accounts change DEPOSIT(100) using where ACCNO = 4104 in accounts;
In the
above example, the computation ba contains two nested computations as its domains:
DEPOSIT and BALANCE, which are two methods of ba. Note that there are two state
variables that are defined within ba and used by the two methods. The relation
accounts is instantiated by the join of acct and ba, and thus all the tuples in
the relation accounts have their own copy of the two state variables. Actually
state variables are implemented as hidden domains of the relation. Please refer
to [Bak98] for details of stateful computations.
The above update statement will make a
deposit of $100 to the account 4104. We do this by the following two
steps:
* Find
the corresponding statements in the computation body of DEPOSIT.
*
Execute each statement for each selected tuple.
In this
case, for the selected account 4104 we first assign the value of the hidden
domain bal to oldbal, and then increase the value of domain bal by 100.
Due to time limitation, our
implementation can only accept parameters of integer type. More work needs to
be done on the implementation and testing of such statements. Obviously, the combination of updates and
computations improves the flexibility and functionality of jRelix.
*
Condition Evaluation
Comparing
to the ECA model, current implementation of event handler in jRelix does not
explicitly have a condition evaluation scheme. In the current model, we only
have two components: the events, which are system-generated procedure calls,
and the event handlers, which are procedures performing actions in response to
the events. Every time an event occurs, the corresponding event handler(s) will
be invoked without evaluating a specific condition.
There are two alternative ways to
integrate condition evaluation into our model. One way is that we evaluate the
condition within the event handler, in which case the current model is capable
of handling it and need no modification. The other way is that we add a
condition evaluation component in our model which will improve the explicitness
of the model. To do this, first we need
to modify the syntax of the definition of event handler. We suggest the
condition part be integrated with event name of the event handler. The syntax
can go as follows:
ComputationName
: = Identifier
|
EventName ( "["
IDList "]" )? (":" ConditionExpression)?
ConditionExpression
: = "when" Expression
The
following expressions can be valid condition expressions:
when
quantity < 5;
when
(salary < 10000) and (DEPT = "stereo");
when
([] where mark > 85 in Record);
Thus we
can define the following event handler:
comp
post:change:inventory[quantity]:when quantity < 5 ( ) is
{
update
inventory change quantity <- quantity + 20;
update
stock change quantity <- quantity - 20;
}
If one
or more items are sold, the quantity of the item in the inventory will be
changed. Only when the quantity goes below 5 will the above event handler be
activated and transfer 20 items from stock to inventory.
We suggest the condition evaluation be
performed within the procedure
ExecuteEventHandler
( prefix, event_name )
* get
the action_type from event_name
* get
the relation_name from event_name
* get
the attribute_list from event_name; if action_type is add/delete,
attribute_list must be empty.
* set
query1 to be prefix, set query2 to be (action_type + relation_name), set query3
to be attribute_list.
*
search the system table .rel, get those computations with attribute Active set
to 1 (which must be event handlers since only event handler could set Active to
be 1)
* for
each selected computation, analyze their name.
* separate the name to four parts:
mode("pre" or "post"), event, attrlist and
condition; if the name
does not have mode, set mode to "post"
* if mode equals to query1
* if action_type of query2 is ADD/DELETE
* if event equals to query2
* if condition evaluates to true
* add the event handler name to the
selection_list
* if action_type of query2 is CHANGE
* if event equals to query2 and if
attrlist is the subset of query3
* if condition evaluates to true
* add the event handler name to the
selection_list
* for each event handler in the
selection_list
* execute the event handler
The
modifications on the original algorithm are indicated by bold font.
Taking the above example, when we execute
the following statement:
update
inventory change quantity <- quantity - 1;
we
first decrease the quantity of the inventory by 1; then we find the
corresponding event handler based on the event name and prefix, which is
comp
post:change:inventory[quantity]:when quantity < 5 ( )
We
retrieve the condition part of the event handler, when quantity < 5, and
evaluate it. If the condition evaluates to true, we will invoke this event
handler; otherwise, not. The algorithm can be improved for efficiency. If the
change is to increase the quantity or it is an add operation, we do not need to
check this condition. The same applies to other update operations if we pay
additional attention to the specific conditions.
The algorithm
we adopt here corresponds to the immediate coupling mode (both EC coupling and
CA coupling) in which case the event detection, condition evaluation and action
execution all occur immediately one after another in the triggering
transaction.
More research
needs to be done on what can be set as conditions, and how to sensibly evaluate
them.
*
Transaction Control
In an
active database system, an update could trigger multiple event handlers. Event
handlers may contain update statements which in turn invoke other event
handlers. Each of these event handlers can be considered as a sub-transaction
and the way they are cascading is unforeseeable as they can be independently
defined in different applications. This introduces cascading transaction. We
will discuss more of it in the following section.
The
current implementation of jRelix does not support transaction control. The
essence of the concept of transaction is consistency, which requires that the
actions be taken as a group to transform the state of the database correctly.
This implies the concept of atomicity, which enforces all the actions in a
transaction either all happen or none happen. Partial execution of a
transaction may leave the database in an inconsistent state.
We suggest that a transaction in jRelix
be defined by using "begin_T" and "end_T" to group
statements together, shown as follows:
begin_T
{
statements
}end_T
Once
the interpreter encounters "begin_T" it should impose locks on all the
relations affected in the transaction and execute the grouped statements. If
"end_T" is reached and all operations succeed, changes upon relations
should be committed and all the locks imposed by the transaction should be
released. If an operation in the transaction fails or a "abort_T"
command is encountered, the changes applied on the relations within the
transaction should be undone and all the corresponding locks should be
released.
We have implemented an undo command which
can be used to reverse the changes applied on the relation by an update
statement. In current implementation, we only support to use this command
within the event handler of the corresponding update statement. But the idea of
undo can be applied upon the transaction level as a recovery mechanism of
transaction control. We can expand the scope of the Trigger, New and Rest from
statement to transaction, thus when a transaction is aborted, we can revert all
the changes happened within the transaction by switching all Trigger pieces and
New pieces with Rest pieces.
There are some other issues, such as
locking scheme and concurrency control, to be addressed in the future
implementation of transaction control. We can take advantage of the two
versions of data that are kept, Trigger and New, to improve the concurrency of
transactions.
One major source of concurrency
bottlenecks is queries. Query transactions typically run much longer than
update transactions and they access lots of data. So if they run using
two-phase locking - which means a transaction must get all of its locks before
releasing any of them (refer to [BN87] for details on two-phase locking) - they
often set many locks and hold these locks for a long time, creating long delays
of update transactions.
If no read locks are set and writes still
obey two-phase locking, the queries will not slow down the execution of update
transactions. Since we keep two versions of data, the old version of data will
not be overwritten until the next update applies on the same data. The query
transaction which begins before the update transaction commits can read the old
version data and the query transaction which begins after the update
transaction commits can read the new version of data (time-stamping the
transactions), both of which do not need to set read locks on the data. Thus
the query transaction will not block the update transaction. The update
transactions are still serializable with respect to each other as long as they
obey two-phase locking.
The anomaly of unrepeatable read will
occur under above assumption. Consider the following example:
E: r6 w7 c7 r6 w8 r6 c6 c8
R: R1 R2 R3
where E
represents the execution of three transactions 6, 7 and 8. R represents a data
item and R1, R2 and R3 are three versions of R. w means write, r means read and
c means commit. Suppose these transactions all operate on the same data item R.
The first r6 reads the current version of
data item R, say R1. The second r6 still reads R1 ignoring w7's effect because
transaction 6 begins before transaction 7 commits. The third r6 should still
read R1. But when transaction 8 produces R3, R1 will be discarded in our
current model because only two latest version of R will be kept. Thus the third
r6 can not access R1 which causes the unrepeatable read problem.
There are several ways to solve this
problem:
*
Restrict repeating reads on the same data item in one transaction
* Set
read locks: query transactions also obey two-phase locking.
*
Multi-version data: each data item consists of a sequence of versions, each
version corresponding to each update that was applied to it.
More
research is needed on cascading transaction control, especially the concurrency
among them, which is a nontrivial topic.
Appendix
A
Backus-Naur
Form for jRelix Commands
This
appendix documents jRelix grammar/syntax in the Backus-Naur Form (BNF) format.
Most of the grammar is taken from Biao's thesis [Hao98] and we made some
modifications for grammar of update (refer to section 3.1.2) and event handler
(refer to section 3.2.2.1).
The convention of this BNF definition is
explained in the following figure.
Figure A.1:
BNF convention
The grammar is created from the grammar specification
(in file Parser.jjt), using jjdoc which is the JavaCC documentation generator.
Because JavaCC is a top-down parser, left-recursion is not allowed in the
grammar specification.
There are five token definitions:
<EOF> for end-of-file; <IDENTIFIER> for identifier;
<INTEGER_LITERAL> for integer constants; <FLOAT_LITERAL> for
floating constants; and <STRING_LITERAL> string constants.
Start
:= Command ";" | Statement ";" | ";" |
<EOF>
Command
:= "help" (<IDENTIFIER>)?| "quit" | "input"
FilePath | "debug"
|
"time" | "trace" | "dd" IDList | "dr"
(IDList
|
EventName ("[" IDList "]")?)| "pr" (Expression
|
EventName ("[" IDList "]")?)| "sd"
(<IDENTIFIER>)?
|
"sr" (<IDENTIFIER>)?| "sc" <IDENTIFIER> |
"srd" | "ssd"
|
"ssr" | "undo" | "print" <STRING_LITERAL>
|
"eventon" EventName ("[" IDList "]")?
|
"eventoff" EventName ("[" IDList "]")?
Statement
:= SequentialStatement
SequentialStatement
:= ParallelStatement("--" ParallelStatement)*
ParallelStatement
:= ChoiceStatement ("||" ChoiceStatement)*
ChoiceStatement
:= PrimaryStatement ("??" PrimaryStatement)*
PrimaryStatement
:= Declaration | Assignment | Update
|
ComputationCall | Conditional | ForLoop | WhileLoop
| Exit
| DeadLock | Exec | "(" Statement ")"
|
StatementBlock
StatementBlock
:= "{" Statement (";" Statement)* (";")?
"}"
Conditional
:= "if" Expression "then" Statement
("else"
Statement)?
ForLoop
:= ("for" Identifier)? ("from" Expression)?
("to" Expression)?
("by" Expression)?
"do"
Statement
WhileLoop
:= "while" Expression "do" Statement
Exit :=
"exit"
DeadLock
:= "deadlock"
Exec :=
"exec" Identifier
Declaration
:= "relation" IDList "(" IDList ")"
(Initialization)?
|
Identifier ("initial" Expression)? "is" Expression
("target" Expression)?
|
"domain" IDList Type
|
"let" Identifier ("initial" Expression)? "be"
Expression
|
"comp" CompName "(" (ParameterList)? ")"
"is" ComputationBody
Initialization
:= "<-" ("{" ConstantTupleList "}"
|
Identifier | FilePath )
|
"rcopy" (Identifier | FilePath)
|
"is" (Identifier | FilePath)
ConstantTupleList
:= ConstantTuple ("," ConstantTuple)*
ConstantTuple
:= "(" Constant ("," Constant)* ")"
Constant
:= Literal | "{" ConstantTupleList "}"
Identifier
:= <IDENTIFIER>
FilePath
:= <STRING_LITERAL>
Assignment
:= Identifier (AssignOperator Expression
|
"[" IDList AssignOperator ExpressionList "]" Expression)
AssignOperator
:= "<-" | "<+"
Update
:= "update" Identifier(UpdateOperator Expression
|
"change" (StatementList)? ("using" UsingClause)?
|
"[" IDList UpdateOperator ExpressionList "]" Expression )
UpdateOperator
:= "add" | "delete"
StatementList
:= Statement ("," Statement)*
UsingClause
:= Identifier
|
"(" Expression ")"
|
JoinOperator Expression
|
"[" ExpressionList ":" JoinOperator (":")?
ExpressionList "]" Expression
IDList
:= Identifier ("," Identifier)*
ExpressionList
:= Expression ("," Expression)*
Expression
:= Disjunction
Disjunction
:= Conjunction ("or" Conjunction)*
Conjunction
:= Comparison ("and" Comparison)*
Comparison
:= Concatenation (ComparativeOperator Concatenation)?
Concatenation
:= MinMax ("cat" MinMax)*
MinMax
:= Summation (MinMaxOperator Summation)*
Summation
:= JoinExpression (AdditiveOperator JoinExpression)*
JoinExpression
:= Projection
((JoinOperator
Projection
| "[" ExpressionList ":"
JoinOperator (":")?
ExpressionList "]" Projection
)
)*
Projection
:= Projector (("in" | "from") Projection | Selection)
|
Selection
Projector
:= (QuantifierOperator)? "[" (ExpressionList)? "]"
Selection
:= Selector | Qselector | Term
Selector
:= ("where" | "when") Expression ("in" |
"from") Projection |
"edit" (Projection)? | "zorder" Projection
Qselector
:= "quant" QuantifierList
(("where" |
"when") Expression)?
("in" |
"from") Projection
QuantifierOperator
:= "." | "%" | "#"
QuantifierList
:= Quantifier ("," Quantifier)*
Quantifier
:= "(" Expression ")" Expression
Term :=
Factor (MultiplicativeOperator Factor)*
Factor
:= UnaryOperator Factor | Power
Power
:= Primary ("**" Power)*
Primary
:= Literal | QuantifierOperator | ArrayElement
|
PositionalRename | Identifier | Cast | "(" Expression ")"
| Pick
| Eval | Function | IfThenElseExpression
|
VerticalExpression
ArrayElement
:= Identifier "[" ArrayIndexList "]"
ArrayIndexList
:= (Expression)? ("," (Expression)?)*
PositionalRename
:= Identifier "(" (IDList)? ")"
Cast :=
"(" Type ")" Primary
Pick :=
"pick" Selection
Eval :=
"eval" Expression
Function
:= FunctionOperator "(" Expression ")"
Literal
:= "null" | "dc" | "dk" | "true" |
"false"
|
(SignOperator)? (<INTEGER_LITERAL> | <FLOAT_LITERAL>)
|
<STRING_LITERAL>
SignOperator
:= "plus" | "-"
IfThenElseExpression
:= "if" Expression "then" Expression
"else"
Expression
VerticalExpression
:= "red" AssoCommuOperator "of" Expression
| "equiv" AssoCommuOperator
"of" Expression
"by" ExpressionList
| "fun" OrderedOperator
"of" Expression
"order" ExpressionList
| "par" OrderedOperator
"of" Expression
("order" ExpressionList
"by" ExpressionList
| "by" ExpressionList "order"
ExpressionList
)
Type :=
("boolean" | "bool") | "short" |
("integer" | intg")
|
"long" | ("float" | "real") | "double"
|
("string" | "strg" )| "text" |
("statement" | "stmt")
|
("expression" | "expr")
|
("computation" | "comp") "(" IDList ")"
|
"(" IDList ")"
AssoCommuOperator
:= ("or" | "|")
|
("and" | "&") | "min" | "max" |
"+" | "*"
| ("ijoin" |
"natjoin") | "ujoin" | "sjoin"
OrderedOperator
:= AssoCommuOperator
|
"cat" | "-" | "\" | "mod" |
"**" | "pred" | "succ"
ComparativeOperator
:= "substr" | "=" | "!="
|
">" | "<" | ">=" | "<="
MinMaxOperator
:= "min" | "max"
AdditiveOperator
:= "+" | "-"
JoinOperator
:= "nop" | MuJoin | ("not")? SigmaJoin
MuJoin
:= ("ijoin" | "natjoin") | "ujoin" |
"djoin" | "sjoin"
|
"ljoin" | "rjoin" | "dljoin"
| "drjoin"
SigmaJoin
:= ("icomp" | "natcomp") | "eqjoin"
|
("gejoin" | "sup" | "div") | "ltjoin"
|
("lejoin" | "sub") | ("iejoin" | "sep")
MultiplicativeOperator
:= "*" | "\" | "mod"
UnaryOperator
:= "+" | "-" | "not"
FunctionOperator
:= "abs" | "sqrt" | "sin" | "asin"
|
"cos" | "acos" | "tan" | "atan" | "sinh" | "cosh"
|
"tanh" | "log" | "log10" | "round" |
"ceil"
|
"floor" | "isknown" | "chr" | "ord"
ParameterList
:= Parameter ("," Parameter)*
Parameter
:= <IDENTIFIER> (":" "seq")?
ComputationBody
:= ComputationVariableDeclaration
ComputationBlock
("alt" ComputationBlock)*
ComputationBlock
:= "{" ComputationStatements "}"
ComputationVariableDeclaration
:=
(LocalVariableDeclaration
| StateVariableDeclaration
| ComputationalInitialization
)*
LocalVariableDeclaration
:= "local" IDList Type ";"
StateVariableDeclaration
:= "start" IDList Type ";"
ComputationalInitialization
:= IDList "<-" Expression ";"
ComputationStatements
:= CompStatement ((";" CompStatement |
"also"
CompStatement))* (";")?
CompStatement
:= Statement | Command
ComputationCall
:= Identifier "(" (CallParameterList)? ")"
CallParameterList
:= CallParameter ("," CallParameter)*
CallParameter
:= ("in" | "out") <IDENTIFIER>
CompName
:= CompIdentifier | EventName ("[" IDList "]")?
CompIdentifier
:= <IDENTIFIER>
EventName
:= (Prefix ":")? Action ":" Identifier
Prefix
:= "pre" | "post"
Action
:= "add" | "delete" | "change"
Bibliography
[AB+76] M. M. Astrahan, M. W. Blasgen, D. D.
Chamberlin, K. P. Eswaran, J. N. Gray, P. P. Griffith, W. F. King, R. A. Lorie,
P. R. McJones, J. W. Mehl, G. R. Putzolu, I. L. Traiger, B. W. Wade and V.
Watson. System R: Relational Approach to Database Management. ACM Trans.
Database Systems, 1(2), 97-137, 1976.
[AB84] S. Abiteboul and N. Bidoit. Non first
normal form relations to represent hierarchically organized data. In
Proceedings of 2nd ACMSIGACT/SIGMOD Symposium on Principles of Database
Systems, pages 191-200, 1984.
[Bak98] Patrick Baker. Java Implementation of
Computations in a Database Programming Language. Master's thesis, McGill
University, 1998.
[BB+93] M. Berndtsson and B. Lings. On developing
reactive object-oriented databases. IEEE Data Engineering Bulletin, 15(4), pages
31-34, 1992.
[BK86] F. Bancilhon and S. Khoshafian. A calculus
for complex objects. In Proceedings 5th of ACM SIGACT-SIGMOD Symposium on
Principles of Database System, pages 53-59, 1986.
[BN87] P. A. Bernstein and E. Newcomer.
Principles of Transaction Processing. Morgan Kaufmann Publishers, Inc, 1987.
[BNR+87] C. Beeri, S. Naqvi, R. Ramakrishnan, O.
Shmueli, and S. Tsur. Sets and negation in logic database language (ldl1). In
Proceedings 6th PODS, San Diego, pages 21-37, 1987.
[CA+94] S. Chakravarthy, E. Anwar, L. Maugis, and
D. Mishra. Design of Sentinel: An object-oriented DBMS with event-based rules,
Information and Software Technology, 39(9), pages 555-568, 1994.
[CB99] T. M. Connolly and C. E. Begg. Database
Systems - A Practical Approach to Design, Implementation and Management,
[CCS94] C. Collet, T. Coupaye, and T. Svensen. Naos
- Efficient and modular reactive capabilities in an object-oriented database
system. Int. Conf. On Very Large Databases, Santiago, Chile, 1994, pages
132-143.
[Cod70] E. F. Codd. A relational model for large
shared data banks. Communications of the ACM, 13(6):377-387, 1970.
[Cod72a] E. F. Codd. Database Systems: Further
Normalization of the Data Base Relational Model. Prentice-Hall, Edited by R.
Rustin, 1972.
[Cod72b] E. F. Codd. Database Systems: Relational
Completeness of Data Base Sublanguages. Prentice-Hall, Edited by R. Rustin,
1972.
[Cod79] E. F. Codd. Extending the database
relational model to capture more meaning. ACM Transactions on Database Systems,
4:397-434, 1979.
[Dat81] C. J. Date. An Introduction to Database
Systems, 3rd Edition. Addison-Wesley, Reading, MA, 1981.
[Day95] Umeshwar Dayal. Ten Years of Activity in
Active Database Systems: What Have We Accomplished? In Active and Real-Time
Database Systems (ARTDB-95), pages 3-22, Springer- Verlag, Skövde, 1995.
[DB+88] U. Dayal, B. Blaustein, A. Buchmann, U.
Chakravarthy, M, Hsu, R. Ledin, D. McCarthy, A Rosenthal, and S. Sarin. The HiPAC
Project: Combining Active Databases and Timing Constraints. SIGMOD Record, Vol.
17, No. 1, March 1988, 51-70.
[DE89] L. M. L. Delcambre and J. N. Etheredge.
The relational Production Language: A production language for relational
database. In L. Kerschberg, editor, Expert Database Systems - Proceedings from
the Second International Conference, pages 333-351. Benjiamin/Cummings, Redwood
City, California, 1989.
[DH+94] U. Dayal, H. Y. Hwang, F. Manola, A.
Rosenthal, and J. M. Smith. Knowledge-Oriented Database Management. Phase 1
Final Technical Report, Computer Corporation of America, Cambridge, MA, Aug.
1994.
[DHL90] U. Dayal, M. Hsu and R. Ladin. Organizing
Long-Running Activities with Triggers and Transactions. In Proc. Of the 1990
ACM SIGMOD Intl. Conf. on Management of Data, Atlantic City, NJ, May 1990.
[EC75] K. P. Eswaran and D. D. Chamberlin.
Functional specifications of a subsystem for data base integrity. In
proceedings of the First International Conference on Very Large Data Bases,
pages 317-326, Barcelona, Spain, September 1991.
[Elk96] Ahmad Ziad El-kays. Implementing Event
Handlers in a Database Programming Language. Master's thesis, McGill
University, 1996.
[Esw76] K. P. Eswaran. Specifications,
implementations and interactions of a trigger subsystem in an integrated
database system. IBM Research Report RJ 1820, IBM San Jose Research Laboratory,
San Jose, California, August 1976.
[FMT84] P. Fraternali, D. Montesi, and L. Tanca.
Active Database Semantics. In Proceedings of the Fifth Australian Database
Conference, University of Canterbury, New Zealand, Jan. 1984.
[FT83] P. C. Fisher and S. J. Thomas. Operations
on non-first-normal-form relations. In Proceedings of IEEE COMPSAC '83, pages
464-475, 1983.
[GGD91] S. Gatziu, A. Geppert, and K. R. Dittrich.
Integrating active concepts into an object-oriented database systems. Workshop
on Database Programming Languages, Nafplion, Greece, 1991, pages 399-415.
[GJ91] N. Gehani and H. V. Jagadish. Ode as an
active database: Constraints and triggers. In Proceedings of the Seventeenth
International Conference on Very Large Data Bases, pages 327-336, Barcelona,
Spain, September 1991.
[GR93] Jim Gray and Andreas Reuter. Transaction
Processing: Concepts and Techniques. Morgan Kaufmann Publishers, Inc., San
Mateo, CA, 1993.
[Han89] E. N. Hanson. An initial report on the
design of Ariel: A DBMS with an integrated production rule system. SIGMOD
Record, Special Issue on Rule Management and Proceeding in Expert Database
Systems, 18(3):12-19, September 1989.
[Hao98] Biao Hao. Implementation of The Nested
Relational Algebra in Java. Master's
thesis, Mcgill University. 1998.
[HSW75] G. D. Held, M. R. Stonebraker and E. Wong.
INGRES: a relational database system. In Proceedings of AFIPS National Computer
Conference, Vol 44, pages 409-416, AFIPS Press, Montvale, N. J. 1975.
[HP87] G. Houben and J. Paredaens. The R-Algebra:
An extension of an Algebra for Nested Relations. Technology University,
Eindhoven, 1987.
[JS82] G. Jaeschke and H. J. Schek. Remarks on
the algebra of non-first-normal-form relations. In Proceedings of the First ACM
SIGACT-SIGMOD Symposium on Principles of Database Systems, pages 124-138, Los
Angeles, 1982.
[KDM88] A. M. Kotz, K. R. Dittrich, and J. A.
Mulle. Supporting semantic rules by a generalized event/trigger mechanism. In
Advances in Database Technology - EDBT '88, Lecture Notes in Computer Science
303, pages 76-91. Springer-Verlag, Berlin, March 1988.
[KK89] H. Kitagawa and T. L. Kunii. The
Unnormalized Relational Data Model for Office Form Processor Design.
Springer-Verlag, Tokyo, Japan, 1989.
[KR89] H. F. Korth and M. A. Roth. Query
languages for nested relational databases. In Nested Relations and Complex
Objects in Databases, Lecture Notes in Computer Science 361. Springer-Verlag,
Berlin, 1989.
[Lev91] M. Levene. The nested Universal Relation
Database Model. Springer-Verlag, Berlin, 1991.
[LS88] R. Lorie and H. J. Schek. On dynamically
defined objects and SQL. In Proceedings of 2nd Workshop on Object-Oriented
Database Systems, Bad Münster, 1988.
[Mak77] A. Makinouchi. A consideration on normal
form of not-necessarily-normalized relation in the relational data model. In
Proceedings of 3rd International Conference on Very Large Data Bases, pages
447-453, Tokyo, 1977.
[MD89] D. McCarthy and U. Dayal. The architecture
of an active data base management system. Proceedings of ACM SIGMOD, Portland,
Oregon 1989, 215-224.
[Mer76] T. H. Merrett. MRDS: An algebraic
relational database system. In Canadian Computer Conference, pages 102-124,
Montreal, 1976.
[Mer77] T. H. Merrett. Relations as programming
language elements. Information Processing Letters, 6(1):29-33, 1977.
[Mer84] T. H. Merrett. Relational Information
Systems. Reston Publishing Co., Reston, VA, 1984.
[Mos85] J. Eliot Moss. Nested Transactions: An
Approach to Reliable Distributed Computing. The MIT Press, Massachusetts, 1985.
[MUE94] Thomas A. Mueck. Active Databases: Concepts
and Design Support. Advances in Computers, Vol. 39, Academic Press Inc., 1994,
107-189.
[Nor98] Norman W. Paton. Active Rules in Database
Systems. Springer-Verlag,
New York, 1998.
[PA86] P. Pistor and F. Anderson. Designing a
generalized NF2 model with an SQL-type language interface. In Proceedings of
the 12th International Conference on Very Large Data Base, pages 278-285, 1986.
[PT86] P. Pistor and R. Traunmueller. A database
language for sets, lists and tables. Information Systems, 11(4):323-336, 1986.
[RKS88] M. Roth, H. Korth and A. Silberschatz.
Extended algebra and calculus for nested relational database. ACM Transactions
on Database Systems, 13(4):389-417, 1988.
[RSL95] Rebecca S. Lui. Implementing Procedures in
a Database Programming Language. Master's thesis, McGill University, 1996
[SJ+90] M. Stonebraker, A. Jhingran, J. Goh, and S.
Potamianos. On rules, procedures, caching and views in data base systems. In
Proceedings of the ACM SIGMOD International Conference on Management of Data,
pages 281-290, 1990.
[SLR88] T. Sellis, C.-C. Lin, and L. Raschid.
Implementing large production systems in a DBMS environment: Concepts and
algorithms. In proceedings of the ACM SIGMOD International Conference on
Management of Data, pages 404-412, Chicago, Illinois, June 1988.
[SK+94] J. Sayles, S. Karlen, P. Molchan, and G.
Bilodeau. GUI-BASED DESIGN AND DEVELOPMENT FOR CLIENT/SERVER APPLICATIONS. John
Wiley and Sons, Inc., New York, 1994.
[SKM92] E. Simon, J. Kiernan, and C. de
Maindreville. Implementing high level active rules on top of a relational DBMS.
In proceedings of the Eighteenth International Conference on Very Large Data
Bases, pages 315-326, Vancouver, British Columbia, August 1992.
[SS86] H. Schek and M. Scholl. The relational
model with relation-valued attributes. Information Systems, 11(2):137-147,
1986.
[Sto82] M. Stonebraker et al. A Rules System for
Relational Data Base Management Systems. Proceedings of the Second
International Conference on Databases, Jerusalem, June 1982.
[TF86] S. J. Thomas and P. C. Fischer. Nested
relational structures. In P. C. Kanellakis, editor, Advances in Computing
Research III, The Theory of Databases, pages 269-307. JAI Press, 1986.
[Tod76] S. J. P. Todd. Peterlee Relational Test
Vehicle PRTV, a relational overview. IBM Scientific Center Report UKSC0075,
Peterlee, England, July 1975.
[Ull82] J. D. Ullman. Principles of Database
Systems, 2nd Edition. Computer Science Press, Rockville, MD, 1982.
[VB98] I. Vlahavas and N. Bassiliades. Parallel,
Object-Oriented, and Active Knowledge Base Systems. Kluwer Academic Publishers,
Boston/Dordrecht/London, 1998.
[Wid96] J. Widom. The Starburst rule system, in
Active Database Systems: Trigger and Rules for Advanced Database Processing, J.
Widom and S. Ceri, Eds.: Morgan Kaufmann Publishers, 1996, pages 177-206.
[Yua98] Zhongxia Yuan. Java implementation of the
nested domain algebra in a database programming language. Master's thesis,
McGill University, 1998.