[Gnue-dev] Re: DataObjects.txt
From: Jason Cater
Subject: [Gnue-dev] Re: DataObjects.txt
Date: Sun, 23 Dec 2001 01:40:28 -0600
Comments inline. This was a 1:00 a.m. response... if it makes no sense at
all, just troutslap me once or twice :)
On Saturday 22 December 2001 06:42 am, Neil Tiffin wrote:
> After i read the IRC log i look at dataobjects.txt and have some
> questions/comments.
>
> 1) In larger systems we need to have the ResultSets manipulated as
> subsets. So if i create a ResultSet of 1,000,000 records and I have a
> table in the GUI that displays 15 records I dont want to have to send
> all 1,000,000 records to the GUI. Now I see that we could use the
> nextRecord() and lastRecord() functions. But I really think we will
> need a getNRecords(startAt=0, number=1) function, even if the API is
> not immediately implemented (although it should not be hard to do).
>
I'm not entirely sure I follow the logic here, although implementing the
requested functionality certainly wouldn't be difficult (i.e., it's a natural
extension of the getRecord(), nextRecord(), and lastRecord() methods.)
However, by its very nature, a ResultSet is a cache of a query-set, so
passing 1,000,000 records back and forth should never happen (unless the
application was really processing 1,000,000 records at a time :) Based on
your example, if the UI only needed 15 records to be displayed, then only 15
records would be loaded into a ResultSet. Otherwise, I think ResultSets
aren't being used properly.
I am sure I missed your point, so I apologize in advance. As I stated in the
first sentence, such functionality wouldn't be difficult.
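For readers following along, the windowed fetch Neil proposes could be sketched roughly as below. This is only an illustration: the ResultSet class here is a hypothetical stand-in that uses a plain Python list as its cache, not the actual GNUe-Common class, and getNRecords() is the proposed method, not an existing one. Treating the last record of the window as the new current record is also an assumption.

```python
class ResultSet:
    """Hypothetical stand-in for a ResultSet: a local cache of query rows."""

    def __init__(self, records):
        self._cache = list(records)   # rows already loaded from the backend
        self._current = -1            # positioned before the first record

    def getRecord(self, index):
        """Return the record at a position and make it the current record."""
        record = self._cache[index]
        self._current = index
        return record

    def nextRecord(self):
        """Advance to the next record; return None when past the end."""
        if self._current + 1 >= len(self._cache):
            return None
        self._current += 1
        return self._cache[self._current]

    def getNRecords(self, startAt=0, number=1):
        """Proposed method: return up to `number` records from `startAt`."""
        window = self._cache[startAt:startAt + number]
        if window:
            # Assumption: the last record handed out becomes the current one.
            self._current = startAt + len(window) - 1
        return window
```

A GUI table showing 15 rows would then call something like rs.getNRecords(startAt=30, number=15) and continue with nextRecord() from there.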
> 2) Concerning the definition of "conditions" used to create
> ResultSets. These "conditions" must have both selection conditions
> and sorting conditions. I was not sure what a GConditions was? But
Yes, both are important. GConditions is a tree-like structure that defines,
to use your terminology, "selection conditions" of arbitrary complexity. The
name GConditions comes from the internal class name. I like the terminology
"selection conditions", as that's less ambiguous -- I'd never considered
"sorting" to be a "condition", but, in retrospect, I suppose it is.
Having said that, we do not currently (AFAIK) have an internal structure for
"sorting conditions". Internally (this is from memory, btw) we use lists (or
arrays) to pass the sorting field sequence. Will sorting definitions require
a type more complex than a list? The only reason the selection conditions
have their own class (GConditions) is that a tree-like structure was harder
to implement otherwise.
For those not following the IRC discussion, a GConditions tree would look
something like: (I hope my mail program doesn't wrap these lines)
                        OR
          ______________|______________
         /                             \
   GREATER-THAN                      EQUALS
   _____|_____                  _______|_______
  /           \                /               \
<Field 'Salary'>  60,000  <Field 'Position'>  'Manager'
Using in-order (infix) traversal, this would represent the SQL WHERE clause:
WHERE (Salary > 60000) OR (Position = 'Manager')
To represent sorting definitions, we might have a list like:
( StoreBranch, LastName, FirstName )
which would translate to the following SQL ORDER BY clause:
ORDER BY StoreBranch, LastName, FirstName
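The two translations above can be sketched in a few lines of Python. This is a minimal illustration, not the real GConditions code: the Condition and Field classes, the OPERATORS table, and the function names are all hypothetical, and a real driver would presumably pass literal values as bound parameters instead of pasting them into the SQL string.

```python
# Hypothetical mapping from tree node names to SQL operators.
OPERATORS = {"OR": "OR", "AND": "AND", "GREATER-THAN": ">", "EQUALS": "="}


class Field:
    """Leaf node naming a column."""
    def __init__(self, name):
        self.name = name


class Condition:
    """Inner node: an operator with a left and a right subtree."""
    def __init__(self, operator, left, right):
        self.operator = operator
        self.left = left
        self.right = right


def render(node, nested=False):
    """In-order traversal: left subtree, operator, right subtree."""
    if isinstance(node, Condition):
        text = "%s %s %s" % (render(node.left, True),
                             OPERATORS[node.operator],
                             render(node.right, True))
        return "(%s)" % text if nested else text
    if isinstance(node, Field):
        return node.name
    if isinstance(node, str):
        # Quote string literals, doubling embedded single quotes.
        return "'%s'" % node.replace("'", "''")
    return str(node)


def whereClause(tree):
    return "WHERE " + render(tree)


def orderByClause(fields):
    """Sorting needs nothing more than an ordered list of field names."""
    return "ORDER BY " + ", ".join(fields)


tree = Condition("OR",
                 Condition("GREATER-THAN", Field("Salary"), 60000),
                 Condition("EQUALS", Field("Position"), "Manager"))
print(whereClause(tree))
print(orderByClause(("StoreBranch", "LastName", "FirstName")))
```

Run as written, this prints the WHERE and ORDER BY clauses shown above. The nested flag only exists so the outermost condition is not wrapped in an extra pair of parentheses.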
> it did not sound like it did sorting. Again this must be done on the
> server to make sure getNRecords() (or for that matter nextRecord() )
> works correctly.
>
I assume by this comment that you are referring to GEAS' possible use of
ResultSets. ResultSets weren't meant to be passed back and forth, per se,
between a client and server. ResultSets were more of a local caching
mechanism. I'd imagine, though I don't participate much in GEAS' design so
may be off base here, that the client would maintain an internal ResultSet
and GEAS would maintain its own internal ResultSet -- with GEAS' internal
ResultSet being results from a relational query and Forms' ResultSet being
objects from a GEAS call.
Because a ResultSet is stateful, I am not sure how sorting affects a
getNRecords or a nextRecord implementation. Maybe this is from a lack of
understanding OODB principles, but in relational-land, when we query against
the database, we are guaranteed that if we traverse the results of that query
one record at a time, we will not come across the same record twice,
regardless of whether we sorted or not. Of course, without sorting, two
identical queries could return the same set of records, but with the records
in a different order. However, unless I misunderstand your statement, I am
not sure why a database would be requeried during the life of a ResultSet. A
requerying results in a new ResultSet, with its nextRecord now pointing to
the first record.
I see two ways that GEAS can use our ResultSet definition:
1. By using GNUe-Common as GEAS' database interface, GEAS would use
ResultSets internally since the DBDrivers use ResultSets.
2. GEAS can present the client applications with a ResultSet via CORBA.
James and Reinhard were discussing #1. I am guessing you are thinking in
terms of #2, but am not positive.
In #1, GEAS would maintain an active ResultSet for the client. This
ResultSet would not be visible to the client -- it is GEAS' link to the
relational database backend. If the client could see GEAS' internal
ResultSet, then any Object-to-Relational mapping would not be necessary and
GEAS-based business rules would not be practical. GEAS would, in essence,
simply be a caching relational layer.
However, I would envision GEAS maintaining a backend ResultSet as long as the
client was open and requesting rows based on a query.
#2 has possibilities, but I don't think it fits with GEAS' way of doing
things. It may be a little too "relational". Reinhard and James were
discussing (AFAICT) GEAS using something similar to the GConditions tree via
CORBA, but that was the extent of the discussion wrt exposed services. I also
think the GConditions selection tree should be exposed by GEAS unless OQL or
such is working. Also, will OQL allow us to specify an arbitrarily complex
selection clause? I am not at all familiar with OQL, but assume it will.
Forms will, I imagine, continue to maintain its own ResultSet as it has to
have some way of retaining records internally.
> 3) I did not see any introspection functions. For example get a list
> of field names, get characteristics of fields, get number of records
> in DataObject, etc. Now I admit that if we are planning on making
> the introspection data just system data objects with standard field
> names, then I would be very happy.
I see two types of "introspection" here:
1) System-level introspection -- What objects does the system provide and,
from each object, what fields are available and what are their
characteristics?
2) Result-level introspection -- I have a ResultSet... what fields are in
the ResultSet, how many records are in the ResultSet, etc.
I am not sure if you were asking about 1, 2, or both. I'll describe our
current offering for both, but be forewarned that neither is
feature-complete, though both are nonetheless usable today.
We are currently toying with system-level introspection. Our PostgreSQL and
Oracle dataobjects have been through round one of our testing. I apologize that these are
not well documented, but they are new and not necessarily the final way we
plan to do it. Currently, these methods are only utilized by Designer's form
definition Wizards. Basically, the implementation is:
A db driver (or a DataObject to be more precise) currently defines three
methods:
1) getSchemaTypes(): This returns a list of available top-level "object
types". For relational db drivers, this is usually (Tables, Views),
meaning the only top-level objects are tables and views. For object
backend drivers, this might return ( Classes ) as the only top-level
objects are classes (that might not be the correct term for geas
objects; I'm not sure.)
Currently, the only intended use of this method is so Designer can allow
the developer to limit the list of objects with a drop-down box (i.e.,
the developer can ask that only tables are presented for him to choose
from.)
2) getSchemaList(): This returns a list of available "objects". This
optionally takes a list of "object types" to return. "Object types" are
returned by getSchemaTypes().
This method returns a list of Schema objects. There is one Schema object
per table or class. This Schema object defines several methods, the most
important being getChildSchema(), which returns a list of child Schemas.
(Please excuse the names of these functions... however, we are trying
our best not to be tied to relational, 2-tier models.) For relational
databases, the "child Schema" of tables are fields. Therefore, a
table's getChildSchema() function returns a list of all the fields
defined by that table. An object database's child schema might return
"attributes" or "methods".
Each of these "child schemas" have attributes such as datatype,
nativeType, size, etc, that describe various properties of the fields.
Datatype is a limited list of datatypes (such as "string", "number",
"date", etc). The nativeType attribute describes, using the database's
terminology, the datatype. For example, Oracle defines an integer as
number() and Postgres defines an integer as, IIRC, int8(). The
nativetype for each would be, respectively, "number" and "int8", whereas
the datatype for each will always be "number". This is to simplify
client code -- the clients do not care about the specific storage
mechanism, they just need to know it's a "number". I believe GEAS
already has a similar predefined set of types. Reinhard suggested that
we reuse GEAS' definitions. I have not looked at GEAS' types, but I'm
sure we will adopt them as our definitions are fairly simple and
overly limited.
3) getSchemaByName(): This takes a string argument (the name of an object)
and returns a single Schema object defining the named object. For example,
getSchemaByName("ap_vendors") returns a Schema object defining the
characteristics of the ap_vendors table. This method isn't fully
defined and, IIRC, isn't implemented in any of the drivers.
I am sure we will need more methods than these three, but these currently,
pardon the humor, "scratch Designer's itch".
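To make the shape of these three methods concrete, here is a rough Python sketch. It is an assumption-laden toy, not the actual driver code: the DemoDataObject class, the Schema constructor arguments, the "Fields" type label, and the sample tables and columns are all invented for illustration. Only the method names and the datatype/nativeType/size attributes come from the description above.

```python
class Schema:
    """Describes one object (table, view, field, ...) and its children."""

    def __init__(self, name, schemaType, children=(),
                 datatype=None, nativeType=None, size=None):
        self.name = name
        self.schemaType = schemaType
        self.datatype = datatype      # generic type: "string", "number", ...
        self.nativeType = nativeType  # the backend's own name for the type
        self.size = size
        self._children = list(children)

    def getChildSchema(self):
        """For a relational table, the children are its fields."""
        return list(self._children)


class DemoDataObject:
    """Hypothetical driver holding a fixed, in-memory schema list."""

    def __init__(self, schemas):
        self._schemas = list(schemas)

    def getSchemaTypes(self):
        """Top-level object types, in first-seen order."""
        types = []
        for schema in self._schemas:
            if schema.schemaType not in types:
                types.append(schema.schemaType)
        return types

    def getSchemaList(self, types=None):
        """All top-level objects, optionally limited to some types."""
        if types is None:
            return list(self._schemas)
        return [s for s in self._schemas if s.schemaType in types]

    def getSchemaByName(self, name):
        """Return the Schema with the given name, or None if absent."""
        for schema in self._schemas:
            if schema.name == name:
                return schema
        return None


# Invented sample data standing in for a real database catalog.
db = DemoDataObject([
    Schema("ap_vendors", "Tables", children=[
        Schema("vendor_id", "Fields", datatype="number", nativeType="int8"),
        Schema("name", "Fields", datatype="string",
               nativeType="varchar", size=40),
    ]),
    Schema("ap_open_invoices", "Views"),
])
```

A wizard in Designer could then call db.getSchemaList(["Tables"]) to fill a drop-down, and db.getSchemaByName("ap_vendors").getChildSchema() to list that table's fields with their generic datatypes.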
As for result-level introspection, ResultSets define a number of methods,
such as
getRecordCount() -- Return the number of records in the ResultSet
getCacheCount() -- Return the number of physical records in the
resultset (may be less than or equal to getRecordCount().)
getRecordNumber() -- The positional location of the current record
isFirstRecord() -- Are we at the first record in the set
isLastRecord() -- Are we at the last record in the set
I don't recall if ResultSets define a method to return the fields and
datatypes of the cached records. However, this information is stored
internally, so a method could be easily added that simply returns this
internal information. This has not been an issue in the past as the client
application told the backend what fields it needed in the resultset and so
knew what fields were returned.
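The result-level methods listed above, plus the kind of field-listing method that could be added, might look roughly like this in Python. It is a sketch under several assumptions: the class is a toy in which a list stands in for the database cursor, record numbers are taken to be zero-based, and getFieldTypes() is a hypothetical name for the method that does not exist yet.

```python
class ResultSet:
    """Toy ResultSet that caches rows lazily as they are visited."""

    def __init__(self, fieldTypes, backendRows):
        self._fieldTypes = dict(fieldTypes)  # field name -> generic datatype
        self._backend = list(backendRows)    # stands in for a database cursor
        self._cache = []                     # rows physically loaded so far
        self._current = -1                   # before the first record

    def nextRecord(self):
        """Move to the next record, loading it into the cache if needed."""
        if self._current + 1 >= len(self._backend):
            return None
        self._current += 1
        if self._current == len(self._cache):
            self._cache.append(self._backend[self._current])
        return self._cache[self._current]

    def getRecordCount(self):
        """Number of records in the ResultSet."""
        return len(self._backend)

    def getCacheCount(self):
        """Number of records physically held; never exceeds the total."""
        return len(self._cache)

    def getRecordNumber(self):
        """Position of the current record (zero-based by assumption)."""
        return self._current

    def isFirstRecord(self):
        return self._current == 0

    def isLastRecord(self):
        return self._current == len(self._backend) - 1

    def getFieldTypes(self):
        """Hypothetical addition: expose the internally stored field info."""
        return dict(self._fieldTypes)
```

With three rows in the backend and a single nextRecord() call, getRecordCount() reports 3 while getCacheCount() reports 1, which is the "less than or equal" relationship noted in the list.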
>
By the way, I have described the way DataObjects currently work, not the way
they *must* work. We are definitely open to restructuring our classes if they
do not fit GNUe's needs. However, I am fairly confident, given some
fine-tuning and general polishing, our system is extensible enough to meet
GEAS' needs. Of course, I may be slightly biased. <grin>
I hope this helped.
-- Jason