SimpleData provides a lightweight interface between a consumer of
data such as a web application and a data source such as a relational
database or an application API. It is normally used as a thin
wrapper around the Apache DdlUtils package.
SimpleData stores meta information about record-oriented data and
provides an interface that a connection adaptor can implement. In
particular, SimpleWebApp will use it to access a database via
DdlUtils and automatically produce CRUD-style user interfaces. However, it is
not limited to relational databases, and the DBasicTest example does
not use a database. Non-database sources could include LDAP, P11,
and an API to a CRUD-oriented application. The latter is
important.
The interface uses a generalized design (like SimpleORM) so that fields
in records are essentially properties in a property list. This
avoids reflection and enables meta data to be stored with data
instances. More importantly, it enables queries to be specified
in a formal, non-SQL manner, which is necessary if it is to be used on
non-SQL data sources.
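The property-list idea above can be sketched roughly as follows. This is only an illustration: FieldMeta, Record, GreaterThan and select are hypothetical names invented here, not the actual SimpleData classes. It shows field handles that carry meta data, a record stored as a property list, and a query criterion held as data so that it can either be rendered as SQL or evaluated directly by a non-SQL source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are the real SimpleData API.
final class PropertyRecordSketch {

    /** Field handle that carries its own meta data (name and type). */
    static final class FieldMeta {
        final String name;
        final Class<?> type;

        FieldMeta(String name, Class<?> type) {
            this.name = name;
            this.type = type;
        }
    }

    /** A record is a property list keyed by field handles, so no reflection is needed. */
    static final class Record {
        private final Map<FieldMeta, Object> values = new HashMap<>();

        Record set(FieldMeta field, Object value) {
            if (value != null && !field.type.isInstance(value)) {
                throw new IllegalArgumentException(field.name + " expects " + field.type.getSimpleName());
            }
            values.put(field, value);
            return this;
        }

        Object get(FieldMeta field) {
            return values.get(field);
        }
    }

    /** A criterion held as data: an adaptor can render it as SQL or evaluate it itself. */
    static final class GreaterThan {
        final FieldMeta field;
        final int bound;

        GreaterThan(FieldMeta field, int bound) {
            this.field = field;
            this.bound = bound;
        }

        String toSql() {
            return field.name + " > " + bound;
        }

        boolean matches(Record record) {
            Object value = record.get(field);
            return value instanceof Integer && ((Integer) value) > bound;
        }
    }

    static final FieldMeta NAME = new FieldMeta("NAME", String.class);
    static final FieldMeta BUDGET = new FieldMeta("BUDGET", Integer.class);

    /** What a non-SQL adaptor might do: filter the records in memory. */
    static List<Record> select(List<Record> source, GreaterThan criterion) {
        List<Record> result = new ArrayList<>();
        for (Record record : source) {
            if (criterion.matches(record)) {
                result.add(record);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Record> departments = new ArrayList<>();
        departments.add(new Record().set(NAME, "Sales").set(BUDGET, 500));
        departments.add(new Record().set(NAME, "Research").set(BUDGET, 90));

        GreaterThan criterion = new GreaterThan(BUDGET, 100);
        System.out.println(criterion.toSql()); // prints "BUDGET > 100"
        for (Record record : select(departments, criterion)) {
            System.out.println(record.get(NAME)); // prints "Sales"
        }
    }
}
```

The same GreaterThan object serves both a SQL adaptor (via toSql) and an in-memory one (via matches), which is the point of keeping the query formal rather than a SQL string.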
Unlike an ORM such as SimpleORM, it uses value semantics. There
is no "cache"; records are just retrieved by copy, and two requests for the same data
record will produce two different instances in memory. There is
no automatic update of dirty records. It is thus relational/table
oriented. This avoids many issues, particularly with detaching
records and lifecycles. A DataSet layer may be built on top of it
later which addresses these issues, but for normal applications the
complexity is not warranted.
Thus there are also no direct inter-record references, and none of the
trouble that they cause. Simple.
See DBasicTest for an example. Note the way that the query is
specified.
Design Issues
The use of DRecordMeta is good. Provides solid handles for column
names for queries etc, very concise, adaptable at run time, avoids
static XML.
The big issue is whether actual values should be stored in a Bean,
accessed via BeanUtils, rather than the old HashMap of
DFieldInstances. (DFieldInstance is gone.) Fixed
application beans are also OK -- BeanUtils will introspect (not
reflect). But with beans it is hard to store non-data info such as
previous values for optimistic locking, dirty bits, etc.
DdlUtils demands a SqlDynaBean which it generates. (Hence
the odd SimpleData instance generation code.) I.e.
SqlDynaBean author = database.createDynaBeanFor("author", false);
author.set("author_id", 123);
platform.insert(database, author);
DRecordInstance remains. Each bean is wrapped in a
DRecordInstance. Allows extra methods, and optionally a place to
store extra info. Could also change implementation.
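A minimal sketch of the wrapping idea, assuming a plain Map stands in for the DynaBean so that the example needs no libraries. RecordInstance and its methods are hypothetical names, not the real DRecordInstance API. The wrapper remembers the value each field had when the record was obtained, which is the kind of non-data info (dirty bits, previous values for optimistic locking) that a bare bean has no place for.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch; a Map stands in for the wrapped DynaBean.
final class RecordWrapperSketch {

    static final class RecordInstance {
        private final Map<String, Object> bean;                        // the wrapped "bean"
        private final Map<String, Object> previous = new HashMap<>();  // values before the first change

        RecordInstance(Map<String, Object> bean) {
            this.bean = bean;
        }

        Object get(String field) {
            return bean.get(field);
        }

        void set(String field, Object value) {
            // Remember the original value only once, on the first change.
            if (!previous.containsKey(field)) {
                previous.put(field, bean.get(field));
            }
            bean.put(field, value);
        }

        boolean isDirty(String field) {
            return previous.containsKey(field)
                    && !Objects.equals(previous.get(field), bean.get(field));
        }

        /** Value to compare against the stored row for an optimistic-locking check. */
        Object previousValue(String field) {
            return previous.containsKey(field) ? previous.get(field) : bean.get(field);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> bean = new HashMap<>();
        bean.put("name", "Ann");

        RecordInstance record = new RecordInstance(bean);
        record.set("name", "Anne");

        System.out.println(record.isDirty("name"));       // prints "true"
        System.out.println(record.previousValue("name")); // prints "Ann"
    }
}
```

Because the client only sees the wrapper, the stand-in Map could later be swapped for a DynaBean or a plain hash map of field values without changing client code.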
Alternative is to go back to simple hash map of field values in
DRecordInstance, and then copy values into SqlDynaBean just before
insert. However, updates should really be of DynaBeans that were
queried, although I doubt if DdlUtils really cares.
Hmm. Maybe a copy in/out approach would be better. But by
using the DynaBeans directly the SimpleData layer becomes simpler and
more transparent from a DdlUtils perspective.
Either way, the internal use of DynaBeans is hidden from the client,
although it is not hidden from the DConnection supplier.
Non-database implementations (hash maps) copy the bean contents into
the hash map. Thus an insert followed by a query does not
preserve == identity. That is good; it is how a database behaves.
By using value semantics it is possible to have sub record types, add
extra calculated values etc.
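The copy-in/copy-out behaviour described above might look like the sketch below. CopyStoreSketch is a hypothetical in-memory source written for this illustration, not the DBasicTest implementation; records are represented as plain maps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory data source with value (copy) semantics.
final class CopyStoreSketch {

    private final Map<Object, Map<String, Object>> rows = new HashMap<>();

    void insert(Object key, Map<String, Object> record) {
        rows.put(key, new HashMap<>(record)); // copy in: the store keeps its own instance
    }

    Map<String, Object> find(Object key) {
        Map<String, Object> stored = rows.get(key);
        return stored == null ? null : new HashMap<>(stored); // copy out: caller gets a fresh instance
    }

    public static void main(String[] args) {
        CopyStoreSketch store = new CopyStoreSketch();
        Map<String, Object> author = new HashMap<>();
        author.put("author_id", 123);
        author.put("name", "Ann");
        store.insert(123, author);

        Map<String, Object> first = store.find(123);
        Map<String, Object> second = store.find(123);
        System.out.println(first.equals(second)); // prints "true": same values
        System.out.println(first == second);      // prints "false": different instances

        first.put("name", "changed");                    // no automatic update of dirty records
        System.out.println(store.find(123).get("name")); // prints "Ann"
    }
}
```

Changing a retrieved copy has no effect until the caller explicitly writes it back, which is the same contract a database gives.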
Unlike monsters such as Hibernate, DdlUtils is simple and
transparent. It also has facilities for updating schemas with
Alter Table, which is cute. But it is rather limited, e.g. not
transactional at all and no optimistic locking. Will probably
have to make enhancements to it. And its use of Apache BeanUtils
DynaBeans is painful -- what a lot of noise just to essentially
implement a hash map of fields. But it is worth utilizing it and
its community rather than just copying code out of SimpleORM.
Old, Obsolete Design notes follow
Relationship to SimpleORM
The original design was to build our own SimpleJDBC instead of using
DdlUtils. SimpleJDBC could be dependent on (not independent of)
SimpleData, making record storage much easier. No need for
BeanUtils. Meta data stored in SimpleData could be used
directly. But in this case collaboration beats simplicity and
elegance.
I am still struggling with Object Identity, and whether it should be
kept or dropped as being too complex. Dropping it is tempting,
but writing per record business rules becomes tricky.
Certainly if it is kept then records will live in a DataSet object, and
only the whole DataSet can be detached, which avoids a number of thorny
semantic issues with detaching of records that refer to other
records. DataSets also give well defined semantics to the many
end of relationships, which has been worrying me for some time -- they
simply contain the records that have been loaded into the DataSet, not
what is on the disk.
Other aspects that might be purged from SimpleORM include:
- No references, just scalar values. No lazy fetch.
- No SProperty lists etc. for meta data. SRecord|FieldMeta
just have ordinary field variables to store information about them.
- No fancy logic to select some but not all fields etc. (But
I actually need this for my non-relational database.)
There are several non-trivial SimpleORM features that need to remain in
SimpleJDBC. These include:
- Generating correct DML queries on the different databases.
- DQuery interface, but keep it simple and rely on rawClauses for
non-trivial cases. (The vast majority of queries are
simple.)
(No separate SPreparedStatement interface, just the DQuery one.)
- Optimistic locking.
- DDL Create Table statements, with all the ugly data type problems.
- Generating record numbers.
- Threadlocal connection access.
- ValidateRecord/Field hooks.
I do not want to reinvent these wheels. I wonder if there is a
JDBC layer in Hibernate that could be used. (Probably not.)
Sub-Record Types
I think that we want to be able to define variants on record
types, so that we can query for a base record's fields plus some
extras. But no automatic discrimination -- the caller of SQuery
etc. specifies which record type they want. (Record type is separate
from table type.) Needs more thought.
This works because there is no identity:
there is no issue about querying the same record twice.
Issues
Identity is the key question
I.e. if we query the same record twice, do we get the same or different
record instances? (Within the context of a Connection/DataSet,
not the JVM.)
- Requires map of all records/primary keys within a dataset.
Fairly easy to do.
- No Subrecords. Or tricky partial record update logic.
- APIs, local and remote. hmm.
Employee.Swiggle(DeptId) vs.
Employee.Swiggle(DeptRecord)
- Multiple instances of a record in different states -- the
external, internal, temporary, permanent, previous...
- Thus no automatic flushing. The caller needs to explicitly
call Update, Delete etc.
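For comparison, the "map of all records/primary keys within a dataset" from the first bullet above could be as small as the following sketch. DataSet, find and loadDepartment are hypothetical names used only for illustration; the loader stands in for an identity-free core that returns a fresh copy on every call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of object identity scoped to a DataSet, not to the JVM.
final class IdentityMapSketch {

    static final class DataSet {
        private final Map<Object, Map<String, Object>> loaded = new HashMap<>();
        private final Function<Object, Map<String, Object>> loader;

        DataSet(Function<Object, Map<String, Object>> loader) {
            this.loader = loader;
        }

        /** Returns the same instance for the same primary key within this DataSet. */
        Map<String, Object> find(Object primaryKey) {
            return loaded.computeIfAbsent(primaryKey, loader);
        }
    }

    /** Stand-in for a value-semantics core: every call builds a fresh copy. */
    static Map<String, Object> loadDepartment(Object primaryKey) {
        Map<String, Object> row = new HashMap<>();
        row.put("dept_id", primaryKey);
        row.put("name", "Dept " + primaryKey);
        return row;
    }

    public static void main(String[] args) {
        DataSet first = new DataSet(IdentityMapSketch::loadDepartment);
        DataSet second = new DataSet(IdentityMapSketch::loadDepartment);

        System.out.println(first.find("300") == first.find("300"));  // prints "true"
        System.out.println(first.find("300") == second.find("300")); // prints "false"
    }
}
```

The map itself is easy. The hard parts are the ones listed in the other bullets, such as sub-records and multiple states of one record.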
Removal of object identity, references and the cache from this layer
does not mean that they cannot be added to a layer on top of
this.
Some sort of DataSet concept could be added. But the adaptors
should not need to know about those issues. In SimpleORM, object identity
issues and database access issues are intertwined, and they do not have
to be.
Today, Hibernate rules the complex ORM space. But it is generally
used in a very simplistic, value-oriented manner. The implementation is
huge, and projects get into great difficulty when they go beyond the
basics.
Keeping SimpleData/JDBC very simple could provide a clear advantage.
How could a multi record business rule be written with no identity?
EmployeeRecord {
onValidate() {
if (Employee.Department.profitable)....
}
}
Options:-
- Just read the extra Department record redundantly as needed.
Simple. Inefficient. May be stale (need to update db often).
- Insist that Department record is passed in manually as a simple
field. Messy.
- People don't write rules like this in practice. But we want
them to be able to. It is precisely what makes a data-oriented
application sing.
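The first two options above can be sketched side by side. Everything here is hypothetical: records are plain maps, the field names dept_id and profitable are invented, and the methods only answer the "is the department profitable" part of the rule, since the notes do not say what the rule does with the answer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch of a multi-record rule without object identity.
final class ValidateRuleSketch {

    /** Option 1: re-read the Department by key each time the rule runs. */
    static boolean profitableByLookup(Map<String, Object> employee,
                                      Function<Object, Map<String, Object>> findDepartment) {
        Map<String, Object> department = findDepartment.apply(employee.get("dept_id"));
        return department != null && Boolean.TRUE.equals(department.get("profitable"));
    }

    /** Option 2: the caller must pass the matching Department record in by hand. */
    static boolean profitableByArgument(Map<String, Object> employee,
                                        Map<String, Object> department) {
        if (!Objects.equals(employee.get("dept_id"), department.get("dept_id"))) {
            throw new IllegalArgumentException("department does not belong to employee");
        }
        return Boolean.TRUE.equals(department.get("profitable"));
    }

    public static void main(String[] args) {
        Map<String, Object> department = new HashMap<>();
        department.put("dept_id", "300");
        department.put("profitable", Boolean.TRUE);

        Map<Object, Map<String, Object>> departments = new HashMap<>();
        departments.put("300", department);

        Map<String, Object> employee = new HashMap<>();
        employee.put("dept_id", "300");

        System.out.println(profitableByLookup(employee, departments::get)); // prints "true"
        System.out.println(profitableByArgument(employee, department));     // prints "true"
    }
}
```

Option 1 costs an extra read per validation and may see stale data; option 2 pushes the bookkeeping onto every caller, which is the messiness noted above.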
What would it look like to add DataSet/Identity to an identity-less
core?
Messy. This is fairly deep. E.g. sub-records.
With 1.5 annotations, should we just use conventional reflection?
Could work for ORM, Not good for SimpleWeb.
How do we layer on Hibernate? Bean approach?
SimpleData as an Application API
Many application APIs could look like a database.
E.g. CRUD Employees, search them etc. If they are defined
using SimpleData then SimpleWeb can put simple user interfaces on
top. And if they are implemented using SimpleData adaptors
underneath there can be very little code indeed. This would be
quite cute, a bridge between the database style of interfacing and the
API style.
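As a rough illustration of that bridge, the sketch below exposes an application service through a small adaptor contract and renders it with generic code that knows nothing about what sits behind the contract. Adaptor, EmployeeService and renderTable are hypothetical names; the real SimpleData and SimpleWeb interfaces may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the real SimpleData adaptor interface.
final class ApplicationApiSketch {

    /** Minimal adaptor contract: meta data, insert, and a query that returns an iterator. */
    interface Adaptor {
        List<String> fieldNames();
        void insert(Map<String, Object> record);
        Iterator<Map<String, Object>> queryAll();
    }

    /** An application API (not a database) presented through the adaptor contract. */
    static final class EmployeeService implements Adaptor {
        private final List<Map<String, Object>> employees = new ArrayList<>();

        public List<String> fieldNames() {
            return Arrays.asList("emp_id", "name");
        }

        public void insert(Map<String, Object> record) {
            employees.add(new LinkedHashMap<>(record)); // copy in
        }

        public Iterator<Map<String, Object>> queryAll() {
            List<Map<String, Object>> copies = new ArrayList<>();
            for (Map<String, Object> employee : employees) {
                copies.add(new LinkedHashMap<>(employee)); // copy out
            }
            return copies.iterator();
        }
    }

    /** Generic "user interface": renders any adaptor as a text table. */
    static String renderTable(Adaptor adaptor) {
        StringBuilder out = new StringBuilder();
        out.append(String.join(" | ", adaptor.fieldNames())).append('\n');
        for (Iterator<Map<String, Object>> rows = adaptor.queryAll(); rows.hasNext();) {
            Map<String, Object> row = rows.next();
            List<String> cells = new ArrayList<>();
            for (String field : adaptor.fieldNames()) {
                cells.add(String.valueOf(row.get(field)));
            }
            out.append(String.join(" | ", cells)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        EmployeeService service = new EmployeeService();
        Map<String, Object> ann = new LinkedHashMap<>();
        ann.put("emp_id", 1);
        ann.put("name", "Ann");
        service.insert(ann);
        System.out.print(renderTable(service));
    }
}
```

A database-backed adaptor could implement the same three methods, and renderTable would not change.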
Miscellaneous Thoughts
- "Adaptor" does know about DRecord structures and also DQuery
structures.
- No DResultSet, just iterators. Otherwise implementing the adaptors
becomes needlessly messy.
- Rewrite of SimpleORM ok, really a new product.
- Adaptor/Database == Connection. No need for separate
Statement object BECAUSE adaptor knows about record structures.
- For relational, thread local connection etc. separate.
"Session" better term?
Related Work
JDBC
// Statement stmt0 = connection.createStatement(); res = stmt0.executeQuery("SELECT...");
PreparedStatement stmt = connection.prepareStatement("SELECT ... ? ...");
stmt.setString(1, "...");
ResultSet res = stmt.executeQuery();
while ( res.next() )
x = res.getString("COL");
SimpleORM
Department delDept = (Department)Department.meta.findOrCreate("300");
delDept.deleteRecord();
SResultSet res = Department.meta.newQuery()
.gt(Department.BUDGET, SJSharp.newInteger(qbudget))
.descending(Department.NAME)
.execute();
while (res.hasNext()) ... // possible but unusual to purge as you go.
SConnection.commit();
Hibernate
Session session = sessionFactory.getCurrentSession();
WTestRecord tr = new WTestRecord(0, id);
session.save(tr);
WTestRecord row = (WTestRecord) session
.createCriteria(WTestRecord.class)
.add( Expression.eq("idx", id) )
.uniqueResult();
session.delete(row);
ADO.Net -- SQLDataReader
// Forward only (ie. efficient), but can bind to web controls
SqlDataReader dreader = new SqlCommand("Select...where @foo", con)
(.Parameters.add(@foo, "...").ExecuteReader()
// (Param syntax @foo for MSSQL, ? for OleDb)
While (dreader.Read()) ... dreader("myColumn") // I.e. current row only, JDBC like, efficient.
myRepeater.DataSource = dreader // i.e. can bind JDBC directly to web control. Efficient.
<asp:Repeater ID="myRepeater">
...<%#Container.DataItem("myColumn")%> ...</>
xor <asp:DataGrid ID="myRepeater">
<columns><asp:BoundColumn DataField="myColumn"/>...<//> // p540
ADO.Net -- SQLDataSet
// Holds all the data. Can be detached. Not as efficient.
DataSet myDataSet = new DataSet();
SqlDataAdapter mydad = new SqlDataAdapter("SELECT...", con)
mydad.Fill(myDataSet, "myDSTableName")
myDataTable.DataSource = myDataSet; // Which table? Same interface as SqlDataReader?
// Repeater: low level, DataList: higher, DataGrid: table, DataTable: top, only with a DataSet?
// Problem: to sort/select from a DataSet requires all to be in memory, not passed through to SQL.
// (There is no query language other than SQL -- 2003 & still?)
myCols = myDataSet.Tables("myTable").Columns("myCol").Unique = True. // And presumably for fkeys.
myDataSet.Tables("myTable").Columns.Add(new DataColumn(...).Expression = "Price * Quantity") // p600 Also aggregation.
myDataSet.Relations.Add("myRelName", myDataSet.Tables("..").Columns("..."), myDataSet.T.C) // p607
foreach child in parent.GetChildRows("myRelName")...
myDataSet.Tables(...).NewRow()("myCol")=value...
myDataSet.Tables(...).Rows(2).Delete()
mydad.Update(myDataSet, "myTab") // p613
// Adapter has properties SelectCommand, InsertCommand etc. Can be redefined to use stored proc etc.
DataViews -- filter, sort, query etc. p616
Cache p623 -- cache DataSet in server memory; server can purge, app needs to recreate.
/* Performance: creating 100K records with 5 string fields each
 * ArrayList: 6 us/rec
 * LinkedHashMap: 10 us/rec
 * Both negligible; an RDBMS reads at best 1 ms/rec. */