phpgroupware-developers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Phpgroupware-developers] SQL garbage in contacts backend


From: Alex Borges
Subject: Re: [Phpgroupware-developers] SQL garbage in contacts backend
Date: Wed, 25 Feb 2004 12:25:39 -0600

I understand exactly what you mean.

El mié, 25-02-2004 a las 11:15, Markus Kaemmerer escribió:
> Hi,
> 
> while working on synchonization support we found, that sync of 300
> Items from my PocketPC (connected through USB) to my local phpGW
> installation takes about 30 minutes (the same with ActiveSync and
> Outlook takes about 30 seconds). 
> 
> One reason for this is, that the contacts backend. During the tests we
> found, that there are the indizis missing on nearly all tables. An
> later I found that for one vCard export there the following SQL
> statements are executed.

Indexes shouldve been created upon setup....where yout testing from a
clean vanilla install or what kind of upgrade path did it follow?

Anyhow, this is fixed by adding indexes.

> 
> The first 5 SQL statements are fetching the WHOLE table and the last
> 3500 bytes SQL statement is nealy self describing :-).  In this
> statement there are things like: "(CASE WHEN ' ' IS NULL THEN '' ELSE
> ' ' END)". The SQL builder should suppress such things.

1.- 
ITs the way its handling the empty strings and all.... whomever you
appoint to this part of your project should get in touch with me so we
can work out a solution for sql builder working well for NULLS in
multiple databases (thats what its trying to automatically fix in that
case).

Get in touch with me so as to work closely with what you guys are doing
and we will try to see what needs to be done.



> 
> These things sould be fixed before phpGW can be used in larger
> environments or with synchronizing. 

I agree that indexes should. 

I disagreee that this query is unacceptably slow. I would call it
acceptably unreadable, but not necesarily inneficient (it has to run in
all databases).

As for the multiple queries, that is a problem inherent in the design. 

Communication types and contact types are catalogs (really small ones,
tops 50 entries). In the actual addressbook code, the whole contents of
them are cached. So, this is just a 'first hit' performance hit

When i say catalogs i mean note_types and address_types hold none of the
real data for either tables (addresses or notes), they hold a catalog of
types ('work', 'home', '..e.tc).


Before going any further marcus, i direct you to
THE EXPLANATION OF THE QUERIES
So you can get a clue before going for the neck in this kind of stuff:
> 
> --- 
> 
> SELECT contact_comm_descr.comm_descr_id AS comm_descr_id,
> contact_comm_descr.descr AS comm_description,
> contact_comm_descr.comm_type_id AS comm_type FROM
> phpgw_contact_comm_descr AS contact_comm_descr

Catalog, should be small, executed upon contacts_sql instantiation. You
can also tell it to lazyly not-cache it, but that would be stupid.

> 
> SELECT contact_comm_type.comm_type_id AS comm_type_id,
> contact_comm_type.type AS comm_type_description,
> contact_comm_type.active AS comm_active, contact_comm_type.class AS
> comm_class FROM phpgw_contact_comm_type AS contact_comm_type
> 
Same as above

> SELECT contact_addr_type.addr_type_id AS addr_type_id,
> contact_addr_type.description AS addr_description FROM
> phpgw_contact_addr_type AS contact_addr_type

Same as above

> 
> SELECT contact_note_type.note_type_id AS note_type_id,
> contact_note_type.description AS note_description FROM
> phpgw_contact_note_type AS contact_note_type

Same as above
> 
> SELECT contact_types.contact_type_id AS contact_type_id,
> contact_types.contact_type_descr AS contact_type_descr,
> contact_types.contact_type_table AS contact_type_table FROM
> phpgw_contact_types AS contact_types
Same as above: Maximum Two entries (persons and organizations)

> 
> SELECT contact_central.owner AS owner FROM phpgw_contact AS
> contact_central WHERE ((contact_central.contact_id = 233))

Gets the owner (one entry) of a single contact (direct search of a
contact_id (pk)) ...this doesnt slow us down just as is.... however, if
it does this query in a loop foreach address card uploaded, then its a
problem (should accumulate them and do an IN() query).

> 
> SELECT contact_central.contact_id AS contact_id,
> contact_central.access AS access, contact_central.owner AS owner,
> concat((CASE WHEN first_name IS NULL THEN '' ELSE first_name END),
> (CASE WHEN ' ' IS NULL THEN '' ELSE ' ' END), (CASE WHEN middle_name
> IS NULL THEN '' ELSE middle_name END), (CASE WHEN ' ' IS NULL THEN ''
> ELSE ' ' END), (CASE WHEN last_name IS NULL THEN '' ELSE last_name
> END)) AS per_full_name, contact_person.first_name AS per_first_name,

About the CASE and all... first understand why they are there: 
- We dont have a full name field, the concat and case gets the full name
if youre asking for it. By the way, if youre trying to sync the full
name back to phpgw, then you have a problem....:). 

- This is the most efficient name to get the full name out of that
table. PERIOD.


> contact_person.middle_name AS per_middle_name,
> contact_person.last_name AS per_last_name, contact_person.suffix AS
> per_suffix, contact_person.prefix AS per_prefix,
> contact_person.birthday AS per_birthday, contact_person.pubkey AS
> per_pubkey, contact_person.title AS per_title,
> contact_person.department AS per_department, contact_person.initials
> AS per_initials, contact_person.sound AS per_sound,
> contact_person.active AS per_active, contact_person.modified_by AS
> per_modby, contact_person.modified_on AS per_modon, contact_org.name
> AS org_name, contact_addr.contact_addr_id AS key_addr_id,
> contact_addr.add1 AS addr_add1, contact_addr.add2 AS addr_add2,
> contact_addr.add3 AS addr_add3, concat((CASE WHEN contact_addr.add1 IS
> NULL THEN '' ELSE contact_addr.add1 END), (CASE WHEN contact_addr.add2
> IS NULL THEN '' ELSE contact_addr.add2 END), (CASE WHEN
> contact_addr.add3 IS NULL THEN '' ELSE contact_addr.add3 END)) AS
> addr_address, contact_addr.postal_code AS addr_postal_code,
> contact_addr.city AS addr_city, contact_addr.state AS addr_state,
> contact_addr.country AS addr_country, contact_addr.preferred AS
> addr_preferred, contact_addr_type.description AS addr_description,
> contact_comm_descr.descr AS comm_description, contact_comm_type.type
> AS comm_type_description, contact_comm.comm_data AS comm_data,
> contact_note.contact_note_id AS key_note_id, contact_note.note_text AS
> note_text, contact_note_type.description AS note_description,
> contact_others.other_value AS other_value, contact_others.other_name
> AS other_name FROM phpgw_contact AS contact_central LEFT  JOIN
> phpgw_contact_note AS contact_note ON contact_note.contact_id =
> contact_central.contact_id LEFT  JOIN phpgw_contact_note_type AS
> contact_note_type ON contact_note_type.note_type_id =
> contact_note.note_type_id LEFT  JOIN phpgw_contact_addr AS
> contact_addr ON contact_addr.contact_id = contact_central.contact_id
> RIGHT  JOIN phpgw_contact_addr_type AS contact_addr_type ON
> contact_addr_type.addr_type_id = contact_addr.addr_type_id LEFT  JOIN
> phpgw_contact_comm AS contact_comm ON contact_comm.contact_id =
> contact_central.contact_id LEFT  JOIN phpgw_contact_comm_descr AS
> contact_comm_descr ON contact_comm_descr.comm_descr_id =
> contact_comm.comm_descr_id LEFT  JOIN phpgw_contact_comm_type AS
> contact_comm_type ON contact_comm_type.comm_type_id =
> contact_comm_descr.comm_type_id LEFT  JOIN phpgw_contact_others AS
> contact_others ON contact_others.contact_id =
> contact_central.contact_id INNER  JOIN phpgw_contact_person AS
> contact_person ON contact_person.person_id =
> contact_central.contact_id LEFT  JOIN phpgw_contact_org_person AS
> contact_org_person ON contact_org_person.person_id =
> contact_person.person_id LEFT  JOIN phpgw_contact_org AS contact_org
> ON contact_org.org_id = contact_org_person.org_id WHERE
> contact_central.contact_id = 233 

Yes its a big query. It builds big queries when you require data from
all the nine tables. I invite you to come up with a shorter generic SQL
query to get that list of fields in the select part of the query.


Recomendation Marcus:

Get your facts straight before just calling other peoples work
'garbage'. Ill be the first to admit the design has problems and that
the way we tackled those problems was not through a KISS design. 

But it works, im syncing with my xml-sync code+probiz-ipc 5000 users
today. 

A user created in my process includes account creation, email account
creation, account grouping,  person creation and complex categorization,
default acl syncronization and default preferences. 

Of all the bottlenecks im finding, the worse is preferences and
categories (since updates cannot be 'sqlized' but are encoded), NOT
CONTACTS.

Its acceptably fast and robust and its quite inteligent in building the
queries for which it was made. 

Thx again though for pointing out the indexes problem (chances are thats
90% of your problem). 

They are created on fresh installs as far as i know, im not shure if
they are also included in upgrades. 

> 
> 
> 
> 
> Markus
> 
> --
> Markus Kämmerer         Team Software Solutions
> pro|business AG, EXPO Plaza 1, 30539 Hannover
> E-Mail: address@hidden,  Phone.: 0511/60066-0
> WWW: http://www.probusiness.de/,    Mobile: 0177/5990932
> 
> **********************************************
>            address@hidden 2004
> 
> *     Halle 1, Stand 3k4 (Magirus Deutschland GmbH)
> *     Halle 1, Stand 7f2 (EMC Deutschland GmbH)
> *     Halle 6, Stand D46 (Land Niedersachsen)
> **********************************************
> 
> 
> _______________________________________________
> Phpgroupware-developers mailing list
> address@hidden
> http://mail.gnu.org/mailman/listinfo/phpgroupware-developers
> 





reply via email to

[Prev in Thread] Current Thread [Next in Thread]