[Gnue-dev] Re: DataObjects.txt


From:	Jason Cater
Subject:	[Gnue-dev] Re: DataObjects.txt
Date:	Sun, 23 Dec 2001 01:40:28 -0600
Comments inline.  This was a 1:00 a.m. response... if it makes no sense at 
all, just troutslap me once or twice :)

On Saturday 22 December 2001 06:42 am, Neil Tiffin wrote:
> After i read the IRC log i look at dataobjects.txt and have some
> questions/comments.
>
> 1) In larger systems we need to have the ResultSets manipulated as
> subsets. So if i create a ResultSet of 1,000,000 records and I have a
> table in the GUI that displays 15 records I dont want to have to send
> all 1,000,000 records to the GUI.  Now I see that we could use the
> nextRecord() and lastRecord() functions.  But I really think we will
> need a getNRecords(startAt=0, number=1) function, even if the API is
> not immediately implemented (although it should not be hard to do).
>

I'm not entirely sure I follow the logic here, although implementing the 
requested functionality certainly wouldn't be difficult (i.e., it's a natural 
extension of the getRecord(), nextRecord(), and lastRecord() methods.) 
However, by its very nature, a ResultSet is a cache of a query-set, so 
passing 1,000,000 records back and forth should never happen (unless the 
application was really processing 1,000,000 records at a time :)  Based on 
your example, if the UI only needed 15 records to be displayed, then only 15 
records would be loaded into a ResultSet. Otherwise, I think ResultSets 
aren't being used properly.

I am sure I missed your point, so I apologize in advance. As I stated in the 
first sentence, such functionality wouldn't be difficult.
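As a rough illustration of how getNRecords() could sit on top of getRecord() and nextRecord(), here is a minimal sketch. It is not GNUe's actual code: the class body, the lazy `source` iterator, and the `_load_through` helper are all assumptions made for the example. Only the method names come from this thread.

```python
class ResultSet:
    """Toy stand-in for a ResultSet: a cursor that caches rows on demand."""

    def __init__(self, source):
        self._source = iter(source)  # hypothetical backend row iterator
        self._cache = []             # records fetched so far
        self._current = -1           # index of the current record

    def _load_through(self, index):
        # Pull rows from the backend only until `index` is in the cache.
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                return False
        return True

    def getRecord(self, index):
        if index < 0 or not self._load_through(index):
            return None
        self._current = index
        return self._cache[index]

    def nextRecord(self):
        return self.getRecord(self._current + 1)

    def getNRecords(self, startAt=0, number=1):
        # Built purely on getRecord(), so nothing past the window is fetched.
        records = []
        for index in range(startAt, startAt + number):
            record = self.getRecord(index)
            if record is None:
                break
            records.append(record)
        return records

    def getCacheCount(self):
        return len(self._cache)


# A query that could yield 1,000,000 rows, of which the UI shows 15.
rs = ResultSet({"id": i} for i in range(1_000_000))
page = rs.getNRecords(startAt=0, number=15)
print(len(page), rs.getCacheCount())  # 15 15
```

In this sketch only the 15 displayed records are ever pulled from the backend, which is the behaviour described above.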

> 2) Concerning the definition of "conditions" used to create
> ResultSets.  These "conditions" must have both selection conditions
> and sorting conditions.  I was not sure what a GConditions was? But

Yes, both are important. GConditions is a tree-like structure that defines, 
to use your terminology, "selection conditions" of arbitrary complexity.  The 
name GConditions comes from the internal class name. I like the terminology 
"selection conditions", as that's less ambiguous -- I'd never considered 
"sorting" to be a "condition", but, in retrospect, I suppose it is. 

Having said that, we do not currently (AFAIK) have an internal structure for 
"sorting conditions".  Internally (this is from memory, btw) we use lists (or 
arrays) to pass the sorting field sequence.  Will sorting definitions require 
a type more complex than a list?  The only reason the selection conditions 
have their own class (GConditions) is that a tree-like structure was harder 
to implement otherwise. 

For those not following the IRC discussion, a GConditions tree would look 
something like:   (I hope my mail program doesn't wrap these lines)

                               OR 
                 _______________|___________
               /                             \
         GREATER-THAN                       EQUALS
       _______|_______                ________|_________
     /                 \            /                    \
 <Field 'Salary'>   60,000    <Field 'Position'>      'Manager'

Using an in-order (infix) traversal, this would represent the SQL WHERE clause: 

   WHERE (Salary > 60000) OR (Position = 'Manager')
 
To represent sorting definitions, we might have a list like:

 ( StoreBranch, LastName, FirstName ) 

which would translate to the following SQL ORDER BY clause:

  ORDER BY StoreBranch, LastName, FirstName
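The tree and the sort list above can be sketched in a few lines of Python. This is only an illustration of the idea, not the real GConditions class: the `Field`, `Const`, and `Op` types and the `to_sql` and `order_by` helpers are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Field:
    name: str


@dataclass
class Const:
    value: Union[int, float, str]


@dataclass
class Op:
    symbol: str   # e.g. "OR", ">", "="
    left: "Node"
    right: "Node"


Node = Union[Field, Const, Op]


def to_sql(node):
    """In-order traversal: left subtree, operator, right subtree."""
    if isinstance(node, Field):
        return node.name
    if isinstance(node, Const):
        if isinstance(node.value, str):
            # Literals are inlined only to keep the sketch short; real code
            # would more likely pass them as bind parameters.
            return "'" + node.value.replace("'", "''") + "'"
        return str(node.value)
    return f"({to_sql(node.left)} {node.symbol} {to_sql(node.right)})"


def order_by(fields):
    """A plain list of field names is enough for simple sorting."""
    return "ORDER BY " + ", ".join(fields)


tree = Op("OR",
          Op(">", Field("Salary"), Const(60000)),
          Op("=", Field("Position"), Const("Manager")))

where = "WHERE " + to_sql(tree)
sort = order_by(["StoreBranch", "LastName", "FirstName"])
print(where)  # WHERE ((Salary > 60000) OR (Position = 'Manager'))
print(sort)   # ORDER BY StoreBranch, LastName, FirstName
```

The extra outer parentheses are harmless. They fall out of wrapping every operator node, which keeps nested conditions unambiguous without any precedence rules.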


> it did not sound like it did sorting.  Again this must be done on the
> server to make sure getNRecords() (or for that matter nextRecord() )
> works correctly.
>
I assume by this comment that you are referring to GEAS' possible use of 
ResultSets.  ResultSets weren't meant to be passed back and forth, per se, 
between a client and server.  ResultSets were more of a local caching 
mechanism.  I'd imagine, though I don't participate much in GEAS' design so 
may be off base here, that the client would maintain an internal ResultSet 
and GEAS would maintain its own internal ResultSet -- with GEAS' internal 
ResultSet being results from a relational query and Forms' ResultSet being 
objects from a GEAS call. 

Because a ResultSet is stateful, I am not sure how sorting affects a 
getNRecords or a nextRecord implementation. Maybe this is from a lack of 
understanding OODB principles, but in relational-land, when we query against 
the database, we are guaranteed that if we traverse the results of that query 
one record at a time, we will not come across the same record twice, 
regardless of whether we sorted or not.  Of course, without sorting, two 
identical queries could return the same set of records, but with the records 
in a different order.   However, unless I misunderstand your statement, I am 
not sure why a database would be requeried during the life of a ResultSet.  A 
requery results in a new ResultSet, with its nextRecord now pointing to 
the first record.
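The relational behaviour described above can be checked with a small sketch. It uses Python's standard sqlite3 module purely as a stand-in for a relational backend, which is an assumption (GNUe's drivers target other databases), and the `employee` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO employee (name) VALUES (?)",
                 [(f"emp{i}",) for i in range(40)])

# One open query, deliberately without ORDER BY, walked 15 rows at a time.
cursor = conn.execute("SELECT id, name FROM employee")
seen = []
while True:
    batch = cursor.fetchmany(15)  # next rows of the *same* query
    if not batch:
        break
    seen.extend(row[0] for row in batch)

# Every row appears exactly once, sorted or not.
print(len(seen), len(set(seen)))

# Requerying gives a fresh cursor that starts over at its first row.
requery = conn.execute("SELECT id, name FROM employee")
first_again = requery.fetchone()
```

Sorting only starts to matter if each page is fetched by a separate query, because then nothing ties one page's ordering to the next.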

I see two ways that GEAS can use our ResultSet definition: 

1.  By using GNUe-Common as GEAS' database interface, GEAS would use 
ResultSets internally since the DBDrivers use ResultSets. 

2.  GEAS can present the client applications with a ResultSet via CORBA.

James and Reinhard were discussing #1.  I am guessing you are thinking in 
terms of #2, but am not positive.  

In #1, GEAS would maintain an active ResultSet for the client.  This 
ResultSet would not be visible to the client -- it is GEAS' link to the 
relational database backend.  If the client could see GEAS' internal 
ResultSet, then any Object-to-Relational mapping would not be necessary and 
GEAS-based business rules would not be practical. GEAS would, in essence, 
simply be a caching relational layer. 

However, I would envision GEAS maintaining a backend ResultSet as long as the 
client was open and requesting rows based on a query.  

#2 has possibilities, but I don't think it fits with GEAS' way of doing 
things. It may be a little too "relational". Reinhard and James were 
discussing (AFAICT) GEAS using something similar to the GConditions tree via 
CORBA, but that was the extent of the discussion wrt exposed services. I also 
think the GConditions selection tree should be exposed by GEAS unless OQL or 
such is working. Also, will OQL allow us to specify an arbitrarily complex 
selection clause? I am not at all familiar with OQL, but assume it will. 

Forms will, I imagine, continue to maintain its own ResultSet as it has to 
have some way of retaining records internally.

> 3) I did not see any introspection functions. For example get a list
> of field names, get characteristics of fields, get number of records
> in DataObject, etc.  Now I admit that if we are planning on making
> the introspection data just system data objects with standard field
> names, then I would be very happy.

I see two types of "introspection" here: 

  1) System-level introspection -- What objects does the system provide and, 
     from each object, what fields are available and what are their 
     characteristics? 
  2) Result-level introspection -- I have a ResultSet... what fields are in 
     the ResultSet, how many records are in the ResultSet, etc.  

I am not sure if you were asking about 1, 2, or both. I'll describe our 
current offering for both, but be forewarned that neither is 
feature-complete, though both are nonetheless currently usable.

We are currently toying with system-level introspection.  Our PostgreSQL and 
Oracle dataobjects have round-one of our testing.  I apologize that these are 
not well documented, but they are new and not necessarily the final way we 
plan to do it. Currently, these methods are only utilized by Designer's form 
definition Wizards.  Basically, the implementation is: 

A db driver (or a DataObject to be more precise) currently defines three 
methods: 

  1) getSchemaTypes():  This returns a list of available top-level "object
     types". For relational db drivers, this is usually (Tables, Views), 
     meaning the only top-level objects are tables and views. For object
     backend drivers,  this might return ( Classes ) as the only top-level 
     objects are classes  (that might not be the correct term for geas
     objects; I'm not sure.)          

     Currently, the only intended use of this method is so Designer can allow 
     the developer to limit the list of objects with a drop-down box (i.e., 
     the developer can ask that only tables are presented for him to choose 
     from.)

  2) getSchemaList(): This returns a list of available "objects". This 
     optionally takes a list of "object types" to return.  "Object types" are 
     returned by getSchemaTypes(). 

     This method returns a list of Schema objects. There is one Schema object 
     per table or class. This Schema object defines several methods, the most
     important being getChildSchema(), which returns a list of child Schemas. 
     (Please excuse the names of these functions... however, we are trying 
     our best not to be tied to relational, 2-tier models.)  For relational
     databases, the "child Schema" of tables are fields.  Therefore, a
     table's getChildSchema() function returns a list of all the fields 
     defined by that table.  An object database's child schema might return 
     "attributes" or "methods". 

     Each of these "child schemas" has attributes such as datatype, 
     nativeType, size, etc., that describe various properties of the fields. 
     The datatype attribute takes one of a limited list of values (such as 
     "string", "number", and "date").  The nativeType attribute describes 
     the datatype using the database's own terminology.  For example, Oracle 
     defines an integer as number() and Postgres defines an integer as, IIRC, 
     int8().  The nativeType for each would be, respectively, "number" and 
     "int8", whereas the datatype for each will always be "number".  This is 
     to simplify 
     client code -- the clients do not care about the specific storage 
     mechanism, they just need to know it's a "number".  I believe GEAS 
     already has a similar predefined set of types.  Reinhard suggested that 
     we reuse GEAS' definitions.  I have not looked at GEAS' types, but I'm 
     sure we will adopt them as our definitions are fairly simple and
     overly limited. 

  3) getSchemaByName(): This takes a string argument (the name of an object) 
     and returns a single Schema object defining the named object. For 
     example, getSchemaByName("ap_vendors") returns a Schema object defining 
     the characteristics of the ap_vendors table.  This method isn't fully 
     defined and, IIRC, isn't implemented in any of the drivers.
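To make the three methods concrete, here is a toy sketch of how they might fit together. It is not the PostgreSQL or Oracle driver code: `ToyDataObject`, the constructor arguments, the "table"/"view" type strings, and the sample field definitions are all assumptions for illustration. Only the method and attribute names are taken from the description above.

```python
class Schema:
    """Describes one object (table, view, class) or one child (field)."""

    def __init__(self, name, type, datatype=None, nativeType=None,
                 size=None, children=()):
        self.name = name
        self.type = type              # e.g. "table", "view", "field"
        self.datatype = datatype      # generic: "string", "number", "date"
        self.nativeType = nativeType  # backend's own name, e.g. "int8"
        self.size = size
        self._children = list(children)

    def getChildSchema(self):
        # For a relational table, the children are its fields.
        return list(self._children)


class ToyDataObject:
    """Hypothetical driver holding a fixed list of top-level Schemas."""

    def __init__(self, schemas):
        self._schemas = list(schemas)

    def getSchemaTypes(self):
        # Distinct top-level object types, in first-seen order.
        types = []
        for schema in self._schemas:
            if schema.type not in types:
                types.append(schema.type)
        return types

    def getSchemaList(self, types=None):
        # Optionally restrict the list to the given object types.
        if types is None:
            return list(self._schemas)
        return [s for s in self._schemas if s.type in types]

    def getSchemaByName(self, name):
        for schema in self._schemas:
            if schema.name == name:
                return schema
        return None


vendors = Schema("ap_vendors", "table", children=[
    Schema("vendor_id", "field", datatype="number", nativeType="int8"),
    Schema("name", "field", datatype="string", nativeType="varchar", size=40),
])
open_invoices = Schema("ap_open_invoices", "view")
db = ToyDataObject([vendors, open_invoices])

print(db.getSchemaTypes())                            # ['table', 'view']
print([s.name for s in db.getSchemaList(["table"])])  # ['ap_vendors']
for field in db.getSchemaByName("ap_vendors").getChildSchema():
    print(field.name, field.datatype, field.nativeType)
```

A client such as Designer would only need to look at `datatype`, while `nativeType` stays available for anything backend-specific.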

I am sure we will need more methods than these three, but these currently, 
pardon the humor, "scratch Designer's itch". 

As for result-level introspection, ResultSets define a number of methods, 
such as 

  getRecordCount() -- Return the number of records in the ResultSet
  getCacheCount() -- Return the number of physical records in the 
      ResultSet (may be less than or equal to getRecordCount().)
  getRecordNumber() -- The positional location of the current record
  isFirstRecord() -- Are we at the first record in the set
  isLastRecord() -- Are we at the last record in the set

I don't recall if ResultSets define a method to return the fields and 
datatypes of the cached records.  However, this information is stored 
internally, so a method could be easily added that simply returns this 
internal information.  This has not been an issue in the past as the client 
application told the backend what fields it needed in the resultset and so 
knew what fields were returned. 
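A sketch of what these result-level methods, plus a field-listing method, could look like is shown below. Everything in the class body is assumed for the example: the constructor arguments, the 0-based record numbering, and in particular `getFieldTypes()`, which is a hypothetical name for the not-yet-existing method that would return the internally stored field information.

```python
class ResultSet:
    """Toy ResultSet exposing result-level introspection."""

    def __init__(self, fields, cached, total):
        self._fields = dict(fields)  # field name -> datatype, kept internally
        self._cache = list(cached)   # records physically held in memory
        self._total = total          # records matched by the query
        self._current = 0 if self._cache else -1

    def getRecordCount(self):
        return self._total

    def getCacheCount(self):
        return len(self._cache)

    def getRecordNumber(self):
        return self._current         # 0-based here; an assumption

    def isFirstRecord(self):
        return self._current == 0

    def isLastRecord(self):
        return self._current == self._total - 1

    def getFieldTypes(self):
        # Hypothetical method: hand back a copy of the stored field info.
        return dict(self._fields)


rs = ResultSet(
    fields={"LastName": "string", "Salary": "number"},
    cached=[{"LastName": "Smith", "Salary": 61000}],
    total=250,
)
print(rs.getRecordCount(), rs.getCacheCount())  # 250 1
print(rs.getFieldTypes())
```

Returning a copy keeps callers from modifying the ResultSet's internal bookkeeping by accident.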


>

By the way, I have described the way DataObjects currently work, not the way 
they *must* work. We are definitely open to restructuring our classes if they 
do not fit GNUe's needs.  However, I am fairly confident, given some 
fine-tuning and general polishing, our system is extensible enough to meet 
GEAS' needs.  Of course, I may be slightly biased.  <grin>

I hope this helped. 

-- Jason