[Chicken-users] Chicken Gazette

chicken-users
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Chicken-users] Chicken Gazette - Issue 13

From:	Peter Bex
Subject:	[Chicken-users] Chicken Gazette - Issue 13
Date:	Mon, 22 Nov 2010 21:22:35 +0100
User-agent:	Mutt/1.4.2.3i
     _/_/_/  _/        _/            _/
  _/        _/_/_/          _/_/_/  _/  _/      _/_/    _/_/_/
 _/        _/    _/  _/  _/        _/_/      _/_/_/_/  _/    _/
_/        _/    _/  _/  _/        _/  _/    _/        _/    _/
 _/_/_/  _/    _/  _/    _/_/_/  _/    _/    _/_/_/  _/    _/

--[ Issue 13 ]------------------------------------- G A Z E T T E
                               brought to you by the Chicken Team


== 0. Introduction

Welcome to issue 13 of the Chicken Gazette!

== 1. The Hatching Farm

  * Ivan Raikov has been hacking away at his as yet unreleased egg      
    "9ML-toolkit". He also released version 1.13 of npdiff which fixes  
    a bug that caused it to drop the context when two sequences are     
    completely different, and he released version 1.1 of miniML.        
  * Mario Goulart has been improving the documentation of Awful and     
    accents-substitute, and fixed a bug in Awful's handling of Ajax     
    output.                                                             
  * Alan Post has been putting a lot of work into his genturfa'i parser 
    to create a bootstrap compiler for parsing PEG grammars. This       
    bootstrap compiler will eventually be used to compile the           
  * Peter Bex released a fix for a problem in the postgresql egg with   
    the builtin "name" datatype, as well as a problem with NUL bytes in 
    byte-arrays.                                                        
  * David Krentzlin has been polishing his nomads egg further. He       
    also released version 0.2 _and_ version 0.3 of his log5scm egg. It  
    now includes support for stacked log contexts, as well as a lazy    
    version of "port-sender". Finally, he also fixed the setup and meta 
    files for sendfile and uri-dispatch.                                
  * Felix Winkelmann has made some bugfixes and enhancements in his     
    "bind" egg and added an `ecase` macro to miscmacros which works     
    like `case` except throws an error when no clause is matched.       
  * Moritz Heidkamp updated hyde's dependency list to work with Spiffy  
    4.9 and updated his "Vandusen" IRC bot to support reminders of      
    interesting changesets for the Chicken Gazette editor.              

== 2. Core development

The following things have been changed in Chicken core by Felix:

A workaround was added to the scheduler for a problem with threads
that accidentally get blocked on a filedescriptor but are not marked as
blocked for I/O, which may help fix a few race conditions.

The `,g` debugging toplevel command was fixed and some more feedback
was added to it so it shows the variable being fetched.

The scrutinizer was updated to give a warning when a one-armed `if`
is used in tail-position, as suggested on chicken-users by Jörg
Wittenberg.

Also suggested by Jörg Wittenberg, `make-mutex` was changed so it's
shorter and doesn't store the thread in the mutex.

The definition of `##sys#halt` was modified to accept an error message
argument, to make it match the call sites.

As suggested by Moritz Heidkamp, an SRFI-0 feature
"`irregex-is-core-unit`" was added to the new Chicken to make it
easier for egg authors to check whether it's running under an older or
a newer Chicken.

A new `equal=?` procedure was added, which works like `equal?` except
that it ignores exactness when comparing numbers.

The Chicken team has been discussing a feature request by Mario
Goulart for adding umask support to Chicken's `posix` unit. This will
be the first feature to go through the new Change Request procedure
outlined by Felix. Anyone who would like to take part in shaping
Chicken's evolution is free to add her or his insights about this
functionality to the ticket at http://bugs.call-cc.org/ticket/424 or
on the corresponding thread on the chicken-hackers mailinglist!

== 3. Chicken Talk

Just before the previous gazette was issued, Nicolas Pelletier posted
a follow-up to an old thread of his about an overflow problem with the
`current-time` procedure.  He reported that he had tested the changes
Felix made in the "flonum-milliseconds" branch, and they were working
perfectly for him.

The latter half of the week saw a flurry of messages on the list:

On Wednesday, Jörg Wittenberg posted some ideas about how to improve
Chicken's debugging information so that it would be able to detect a
difficult problem he had encountered (but already solved). Felix
responded that he agrees the scrutinizer could be smarter about
issuing warnings and promptly added a change to the experimental
branch which implements such a warning.

Later it also turned out that Jörg had been using the `-local` option
in an attempt to optimize his code, but didn't fully foresee the
consequences of that. The new version of Chicken does more aggressive
local optimizations which broke some of his assumptions because it now
inlines certain procedures. The lesson: specific optimizations like
`-local` are tricky things which change the semantics of Scheme, and
hence they can break code later on when they're improved, so don't use
them unless you're absolutely sure you need them!

Jörg then quickly followed with another post about "the hardship of
upgrading Chicken from 4.6.1 to 4.6.3" in which he points out some
possible regressions in the latter version, and he also mentions that
srfi-18's `make-mutex` procedure could use some improvements. Again,
Felix responded that he's lost his taste for square brackets (cue
collective sigh of relief) and that he will integrate those
improvements.

On Thursday, Felix posted a message to announce his "sequences" egg
which adds generic operations on various "sequence" types like lists,
strings etcetera, inspired by Common Lisp's sequences. He said he'd
like to get some feedback on this API, which he got -- in spades! Alan
Post, Thomas Chust, John Cowan, Hans Nowak and Jörg Wittenberg all
responded to his request for comments, asking for the addition of
several new procedures as well as sequence comprehensions, possibly
based on SRFI-42. There was also a short discussion about the naming
conventions in the sequence API.

On Friday, Daishi Kato posted a bugreport for the format-textdiff egg
which Ivan Raikov dealt with swiftly.

Jeronimo Pellegrini also posted about a problem with threads which he
ran into while working on an implementation of thread pools for
didactic purposes. He also mentioned he found a bug in the error
handling of the deadlock detection code. Jörg chimed in to tell him
that he also noticed that Chicken's behaviour of how thread port
buffering of newlines had changed recently.

On Saturday another new thread (!) was started by Alan Post in which
he reported a bug in Chicken's keyword argument handling. He created a
ticket in Trac to help track this bug, but with the help of Alex and
Felix he found out it was not a bug in Chicken but in his own code;
`string->symbol` does not produce keyword objects even when the string
ends with a colon. After he changed his code to use `string->keyword`
everything worked as it should. Keywords can be confusing things:
they're not quite the same as symbols because they're self-evaluating,
yet `symbol?` returns `#t`.

== 4. Omelette Recipes

With this week's omelette recipe we'll dip our toes in some of the
more advanced uses of the insanely great PostgreSQL database. With the
postgresql egg I've been trying to unlock some advanced features
through a high-level API, so it becomes a breeze to use them!

But let's start simple. First, we create a database and matching user
account so we can mess around with impunity:

     $ psql -U postgres  # or psql -U pgsql postgres (depending on OS/package)
     psql (8.4.4)
     Type "help" for help.
     
     postgres=# CREATE USER gazette WITH PASSWORD 'whatever';
     CREATE ROLE
     postgres=# CREATE DATABASE gazette WITH OWNER gazette;
     CREATE DATABASE

If the server is running, but the connection cannot be established,
you might need to configure `pg_hba.conf` to allow connections.
Please see the pg_hba.conf section of the Postgres manual for more
info. The easiest is to set "trust" for "local" connections, assuming
you can secure access to the unix domain socket file.

If you ever need a reminder about SQL syntax, you can type
"\h CREATE USER" or even just "\h CREATE".

Now, let's fire up Chicken and connect to the database:

    (use postgresql)
    (define con (connect '((dbname . "gazette")
                           (user . "gazette")
                       (password . "whatever"))))

The alist you pass to the `connect` procedure can contain all the
options described in the list of PQconnectdbParams parameter keywords.
You can also pass a connection string as accepted by `PQconnectdb`,
which is just a space-separated string that looks like this:
`"dbname=gazette user=gazette password=whatever"`

Let us assume we're going to store daily average temperature
measurements as part of a weather service. Assuming we have several
stations across the nation that measure the temperature and we
don't want to mess around with datetime types (as those are somewhat
problematic in Chicken in my opinion), we could set up a table like
this:

    (query con (conc "CREATE TABLE weather "
                     "( year integer, "
                     "  month integer, "
                 "  day integer, "
                 "  avg_temps float[], "
                     "  PRIMARY KEY (year, month, day))"))

_Disclaimer_: Note that the datatypes we're going to define are
probably not the best way to deal with this dataset. It's intended
as just a simple demonstration. Good database design is still an art,
and what's good design depends on how you're going to use your data!
Rule of thumb: think of the queries you're going to do most often and
optimise for those. It's just like choosing datatypes or algorithms in
a programming language...

The temperatures column contains arrays of floating point numbers. This
maps to a vector in Chicken:

    ;; Store temperatures at five measuring stations on 22 November
    (query con (conc "INSERT INTO weather (year, month, day, avg_temps) "
                     "VALUES ($1, $2, $3, $4)")
               2010 11 22 (vector 6 7.5 5.2 4.6 5.4))

The dollar signs are placeholders, on which the positional value
corresponding to the number after the dollar gets inserted. These
values are the extra arguments to `query` following the query string.
The library is smart enough to know how to convert vectors to Postgres
native arrays. By the way, always use placeholders like this. Never
/ever/ splice values directly into the query string! You'd risk a visit
from little Bobby Tables (see http://www.bobby-tables.com)...

When we request our data back into Scheme, we get a vector again:

    (value-at (query con (conc "SELECT avg_temps "
                               "FROM weather "
                               "WHERE year = $1 AND month = $2 AND day = $3")
                         2010 11 22))
    =>
    #(6 7.5 5.2 4.6 5.4)

value-at can return any value located at any row or column from a
result set matrix, but by default, it'll pick the first column of the
first row.

Now let's suppose that we want to store not only the average
temperature, but also the amount of rain in millimeters that
fell that day. We could just add another column, but let's say
we prefer to keep the data provided by each measuring station
together. In order to do so, we'll create a custom "composite type":

    (query con "CREATE type weatherdata AS ( rainfall float, avg_temp float )")

And now we'll need to convert our existing data to make use of this
new type. For safety, we'll wrap it in a transaction so if we make a
mistake we won't end up with a half-converted or corrupted table. If
you don't grok the UPDATE query, just skip it and assume it converts
properly from the old to the new representation, using NULL values for
the rainfall of the converted records.

    (with-transaction con
      (lambda ()
        (query con "ALTER TABLE weather ADD COLUMN data weatherdata[]")
        (query con (conc "UPDATE weather w "
                         ;; Set the data to a generated row array:
                         "SET data=arrays.r "
                     "FROM (SELECT array_agg(ROW(NULL, t[i])::weatherdata) AS 
r, "
                     "             flat.year, flat.month, flat.day "
    
                     ;; This next SELECT returns the valid indices for each
                     ;; array and that array, so you'd get something like this:
                     ;;
                     ;;  1 | {a, b}    | year | month | day
                     ;;  2 | {a, b}    | year | month | day
                     ;;  1 | {c, d, e} | year | month | day
                     ;;  2 | {c, d, e} | year | month | day
                     ;;  3 | {c, d, e} | year | month | day
                     ;; ... etc ...
    
                     "      FROM (SELECT generate_subscripts(avg_temps, 1) i, "
                     "                   avg_temps t, year, month, day "
                         "            FROM weather "
                         "            ORDER BY i) AS flat "
                         "      GROUP BY flat.year, flat.month, flat.day) AS 
arrays "
                     
                     ;; Match up the data belonging to the row being updated
                         "WHERE arrays.year = w.year "
                         "  AND arrays.month = w.month "
                         "  AND arrays.day = w.day"))
        (query con "ALTER TABLE weather DROP COLUMN avg_temps")))

It's a little convoluted, but that's one of the disadvantages of
working with arrays; all SQL operations are set-based so you will need
to jump through some hoops to convert back and forth.

The `with-transaction` protects us from stupid mistakes. If an
exception is thrown inside, it will automatically roll back the
transaction, and since Postgres supports transactions almost
everywhere, the alter table and update statements will be undone.
Anyone else using the table in the meanwhile will not even notice
anything was going on with it.

Anyway, we can now add some new values to the database for the next
day, and retrieve what we have so far:

    (query con (conc "INSERT INTO weather (year, month, day, data)"
                     "VALUES ($1, $2, $3, $4)")
               2010 11 23 '#((11 4.2) (8.3 5.8) (6.0 7.1) (7.3 6.2) (12 4)))
    (row-fold-right cons '() (query con (conc "SELECT year, month, day, data "
                                              "FROM weather ORDER BY day")))
    =>
    ((2010 11 22 #((#<sql-null-type> 6)
                   (#<sql-null-type> 7.5)
                   (#<sql-null-type> 5.2)
                   (#<sql-null-type> 4.6)
                   (#<sql-null-type> 5.4)))
     (2010 11 23 #((4.2 11) (5.8 8.3) (7.1 6) (6.2 7.3) (4 12))))

`Row-fold-right` simply performs a right fold of the supplied procedure
over the resulting rows, with the column values provided in a list, in
the order of the SELECT list's fields.

Now, the composite values from our weatherdata type are represented as
ordinary lists. The position in the list corresponds to the position
in the "ROW" type's constructor syntax, which itself corresponds to
the ordering of the types in the `CREATE TYPE` statement we entered
earlier. Of course this is not the most convenient way to deal with
types, so let's improve the situation by creating an abstract type in
Scheme to match the one in SQL, using the defstruct egg:

    (use defstruct)
    (defstruct weather-data rainfall average-temperature)
    
    ;; For debugging purposes:
    (define-record-printer (weather-data w out)
      (fprintf out "{rain: ~A temp: ~A}"
                   (weather-data-rainfall w)
                   (weather-data-average-temperature w)))

This struct somehow needs to be serialized to an SQL row value (you can
look up the exact expected syntax in the Composite Value Input section
of the Postgres manual):

    (define (weather-data-unparser conn w)
      (sprintf "(~A, ~A)"
               (weather-data-rainfall w)
               (weather-data-average-temperature w)))

In case of more complex types, you can also re-use the list-unparser
procedure, which simply unparses all elements in a list. This is
highly recommended when you are unparsing string values, since those
may contain characters which have special meaning in the ROW
syntax. That would look like this:

    (define (weather-data-unparser conn w)
      (list-unparser conn (list (weather-data-rainfall w)
                                (weather-data-average-temperature w))))

While we're at it, let's also write a weather-data parser which can
extract a weather data object from a string describing a ROW returned
by the database:

    (define weather-data-parser
      (let ((constituent-parser (make-composite-parser
                                  (list numeric-parser numeric-parser))))
        (lambda (str)
          (let ((l (constituent-parser str)))
            (make-weather-data rainfall: (car l)
                           average-temperature: (cadr l))))))

And now we hook these into the connection object:

    (update-type-parsers! con
      (cons (cons "weatherdata" weather-data-parser)
            (type-parsers con)))
    
    (update-type-unparsers! con
      (cons (cons weather-data? weather-data-unparser)
            (type-unparsers con)))

The first part says the "weatherdata" type (which you can find in the
`pg_type` system table) can be parsed by the `weather-data-parser`
procedure. Some magic ensures that parsers for arrays and other
composite types containing elements of this type are available. If
you need to, you could override the parser for weatherdata arrays by
providing a parser for the type called "_weatherdata".

The second part says that when `query` receives an argument for which
the `weather-data?` predicate returns `#t`, it should be unparsed by
the `weather-data-unparser` procedure.

Finally, we can work with a more readable (if more verbose) interface:

    (query con (conc "INSERT INTO weather (year, month, day, data)"
                     "VALUES ($1, $2, $3, $4)")
               2010 11 24 (vector (make-weather-data rainfall: 12
                                                 average-temp: 6)
                              (make-weather-data rainfall: 11.2
                                                 average-temp: 4)
                              (make-weather-data rainfall: 10.1
                                                 average-temp: 4.7)
                              (make-weather-data rainfall: 9.4
                                                 average-temp: 2.5)
                              (make-weather-data rainfall: 11
                                                     average-temp: 4.3)))
    (row-fold-right cons '() (query con (conc "SELECT year, month, day, data "
                                              "FROM weather ORDER BY day")))
    =>
    ((2010 11 22 #({rain: #<sql-null-type> temp: 6}
                   {rain: #<sql-null-type> temp: 7.5}
                   {rain: #<sql-null-type> temp: 5.2}
                   {rain: #<sql-null-type> temp: 4.6}
                   {rain: #<sql-null-type> temp: 5.4}))
     (2010 11 23 #({rain: 11 temp: 4.2}
                   {rain: 8.3 temp: 5.8}
                   {rain: 6 temp: 7.1}
                   {rain: 7.3 temp: 6.2}
                   {rain: 12 temp: 4}))
     (2010 11 24 #({rain: 12 temp: 6}
                   {rain: 11.2 temp: 4}
                   {rain: 10.1 temp: 4.7}
                   {rain: 9.4 temp: 2.5}
                   {rain: 11 temp: 4.3})))

If you want the parsers to be available for all connections, just
parameterize the `default-type-parsers` and `default-type-unparsers`.

Whew! This omelette recipe grew a little longer than I expected it to
become, but I hope it shows that PostgreSQL is an extensible database,
and as such can be used a lot like Lispy languages; you can extend
it to fit your program's problem domain, and best of all: you can
integrate those database-extensions right into Chicken!

For those who are motivated to go further down the rabbit hole, you
can try to figure out how to define type /hierarchies/ or custom /base
types/; atomic types which have their own custom "reader syntax"
inside Postgres. You'd write those readers/writers in C or another
language embedded in Postgres. Any takers for making Chicken one of
those?

== 5. About the Chicken Gazette

The Gazette is produced weekly by a volunteer from the Chicken
community. The latest issue can be found at http://gazette.call-cc.org
or you can follow it in your feed reader at
http://gazette.call-cc.org/feed.atom. If you'd like to write an issue,
check out the instructions and come and find us in #chicken on
Freenode!

[ --- End of this issue --- ]
[Prev in Thread]
Current Thread
[Next in Thread]
[Chicken-users] Chicken Gazette - Issue 13, Peter Bex <=
- [Chicken-users] Re: [Chicken Gazette - Issue 13] - ##sys#:keyword:'s, F. Wittenberger, 2010/11/22
  - Re: [Chicken-users] Re: [Chicken Gazette - Issue 13] - ##sys#:keyword:'s, Peter Bex, 2010/11/22
    - Re: [Chicken-users] Re: [Chicken Gazette - Issue 13] - ##sys#:keyword:'s, John Cowan, 2010/11/22
    - Re: [Chicken-users] Re: [Chicken Gazette - Issue 13] -, Matthew Flatt, 2010/11/22
  - Re: [Chicken-users] Re: [Chicken Gazette - Issue 13] - ##sys#:keyword:'s, Felix, 2010/11/24
- Re: [Chicken-users] Chicken Gazette - Issue 13, F. Wittenberger, 2010/11/22
  - Re: [Chicken-users] Chicken Gazette - Issue 13, John Cowan, 2010/11/22
  - Re: [Chicken-users] Chicken Gazette - Issue 13, Peter Bex, 2010/11/23
    - Re: [Chicken-users] Chicken Gazette - Issue 13, F. Wittenberger, 2010/11/24
    - Re: [Chicken-users] Chicken Gazette - Issue 13, Peter Bex, 2010/11/24
Prev by Date: Re: [Chicken-users] Re: sequences egg
Next by Date: [Chicken-users] Re: [Chicken Gazette - Issue 13] - ##sys#:keyword:'s
Previous by thread: [Chicken-users] Re: sequences egg
Next by thread: [Chicken-users] Re: [Chicken Gazette - Issue 13] - ##sys#:keyword:'s
Index(es):
- Date
- Thread