help-gnu-emacs

Re: Sorting on compound keys?


From: Andreas Röhler
Subject: Re: Sorting on compound keys?
Date: Sun, 29 May 2011 22:17:59 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; de; rv:1.9.2.17) Gecko/20110414 SUSE/3.1.10 Thunderbird/3.1.10

On 27.05.2011 00:49, Tim Landscheidt wrote:
Andreas Röhler <andreas.roehler@easy-emacs.de> wrote:

sometimes I want to sort unified diffs of CSV files
(separated by tabs (here: \t)):

| +A 1\t1\tx
| +A 1\t2\ty
| +B 2\t3\tz
| -A 1\t1\tx
| -B 2\t2\ty
| -B 2\t3\tz

by the second column, then the first column, then "+" vs.
"-". Unfortunately, it seems that sort-regexp-fields doesn't
allow more than one match field as a key. sort-fields
doesn't work either as it requires the fields to be
surrounded by white space (no "+" vs. "-") and doesn't allow
white space inside the fields.

     Is there any function in vanilla Emacs (23.1.1) that I
missed? I looked at pimping sort-regexp-fields, but it seems
to me that sort-subr would have to be rewritten from scratch
to achieve sorting on compound keys.

Last time I looked, that feature was indeed missing.
However, I did not see a need for a rewrite from
scratch, just for extending the existing routine, i.e.
introducing one or more levels of sorting.

I remember our discussion in de.comp.editoren :-), but as I
read sort-subr it is hard-coded that the sort key is one
literal, continuous part of the buffer as sort-lists is a
list of buffer positions.

sort-subr takes functions to determine the fields to sort.

No, it accepts functions to determine the *boundaries* of
the fields that have to be part of the buffer as I have
written above.
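
For what it is worth, the sort-subr docstring also says that STARTKEYFUN may return a non-nil value that is then used as the key, and that PREDICATE compares such keys. Assuming an Emacs whose sort-subr accepts PREDICATE, a compound key could perhaps be built on top of sort-subr without rewriting it. The following is only a sketch for the tab-separated diff lines from the original question; the `my-` names are hypothetical, and keys are compared as strings, so "10" sorts before "2".

```elisp
;; Sketch only: all `my-' names are made up for illustration.

(defun my-diff-line-key ()
  "Return the compound key (COL2 COL1 SIGN) for the line at point.
Assumes lines like \"+A 1\\t1\\tx\": a sign character followed by
tab-separated columns."
  (let ((line (buffer-substring-no-properties
               (line-beginning-position) (line-end-position))))
    (if (string= line "")
        (list "" "" "")                 ; keep empty lines harmless
      (let ((cols (split-string (substring line 1) "\t")))
        (list (or (nth 1 cols) "")      ; second column
              (or (nth 0 cols) "")      ; first column, sign stripped
              (substring line 0 1)))))) ; "+" or "-"

(defun my-key-list-less-p (a b)
  "Return non-nil if the list of strings A sorts before B.
Elements are compared pairwise; the first difference decides."
  (while (and a b (string= (car a) (car b)))
    (setq a (cdr a) b (cdr b)))
  (and b (or (null a) (string< (car a) (car b)))))

(defun my-sort-diff-lines (beg end)
  "Sort lines between BEG and END by column 2, column 1, then sign."
  (interactive "*r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      ;; Records are lines; the key is the Lisp value returned by
      ;; `my-diff-line-key', compared by `my-key-list-less-p'.
      (sort-subr nil #'forward-line #'end-of-line
                 #'my-diff-line-key nil
                 #'my-key-list-less-p))))
```

With the region around the six sample lines, `M-x my-sort-diff-lines` should group the "+" and "-" variants of each row next to each other.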

As for the functions as arguments, maybe have a look at
`ar-th-sort' in thingatpt-utils-base.el

https://code.launchpad.net/s-x-emacs-werkstatt/

How is this useful in this case?

Tim




Hi Tim,

you are right. It need not be done inside sort-subr, but on top of it.

BTW, since sort-fields treats whitespace as the field delimiter, there is no way to get +A treated as two fields. Apart from this limitation, the code below should provide sorting on multiple fields.

;;; sort-multiple-keys.el --- sort multiple fields

;; Author: Andreas Roehler <andreas.roehler@online.de>

;; Keywords: data

;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see <http://www.gnu.org/licenses/>.

;;; Commentary:

;; Sort lines in region lexicographically by the fields
;; in FIELD-LIST.  Lines that compare equal on an earlier
;; field are sorted among themselves by the next field in
;; the list.  Any number of fields may be given.

;; Fields are separated by whitespace and numbered from
;; 1 up.  With a negative number, sorting is by that field
;; counted from the right.  Called from a program, there
;; are three arguments: BEG, END and FIELD-LIST.  BEG
;; and END specify the region to sort.  The variable
;; `sort-fold-case' determines whether alphabetic case
;; affects the sort order.


;; Example - assume the code below uncommented at the
;; beginning of a buffer:

;; +C 2 1       x
;; +A 2 2       y
;; +A 1 2       y
;; +A 1 2       z
;; +C 1 1       x
;; +A 4 2       z
;; +A 3 2       y
;; +B 3 3       x
;; +C 2 1       x
;; +B 2 3       z
;; -A 6 1       x
;; -B 1 2       y
;; -A 2 1       x
;; -B 1 3       z

;; sort region hierarchically with first, fourth and second field
;; (sort-multiple-fields 1 126 '(1 4 2))
;; ==>

;; +A 1 2       y
;; +A 2 2       y
;; +A 3 2       y
;; +A 1 2       z
;; +A 4 2       z
;; +B 2 3       z
;; +B 3 3       x
;; +C 1 1       x
;; +C 2 1       x
;; +C 2 1       x
;; -A 2 1       x
;; -A 6 1       x
;; -B 1 2       y
;; -B 1 3       z



;;; Code:

(eval-when-compile (require 'cl))       ; for `lexical-let'

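(defun sort-multiple-fields (beg end fields)
  "Sort lines between BEG and END hierarchically by FIELDS.
FIELDS is a list of field numbers, most significant first.
Called interactively, prompt for the field numbers one at a time."
  (interactive "*r\nnSort for field: ")
  (save-excursion
    (when (interactive-p)
      ;; The interactive spec delivers a single number; turn it into
      ;; a list and collect further fields in the order given.
      (setq fields (list fields))
      (while (yes-or-no-p "Sort another field? ")
        (setq fields
              (append fields (list (read-number "Sort for field: ")))))
      (message "Sorting for fields %s" (prin1-to-string fields)))
    (let* ((positions (copy-sequence fields))
           ;; `max-field' is bound dynamically here and read by
           ;; `sort-multiple-fields-intern'.
           (max-field (car (sort positions #'>))))
      (sort-multiple-fields-base beg end fields))))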

(defun sort-multiple-fields-base (beg end fields)
  (lexical-let ((key (or (car-safe fields) (list fields)))
                (this-fields (copy-sequence fields))
                last)
    (save-restriction
      (narrow-to-region beg end)
      (sort-fields key beg end)
      (setq last (car fields))
      (when (cadr this-fields)
        (setq this-fields (cdr this-fields))
        (sort-multiple-fields-intern beg end last this-fields fields)))))

(defun sort-multiple-fields-intern (beg end &optional last this-fields fields)
  (lexical-let ((beg beg)
                (pos end)
                (end end)
                (last last)
                (fields fields)
                (this-fields (copy-sequence this-fields))
                regexp key erg)         ; keep `key' and `erg' local
    (setq key (pop this-fields))
    (dotimes (i max-field)
      ;; i starts with 0, first field is done above
      (cond ((eq 0 i)
             (if (eq 1 last)
                 (setq regexp "^[ \t\n]*\\([^ \t\n]+\\)")
               (setq regexp "^[ \t\n]*[^ \t\n]+")))
            ((eq last (1+ i))
             (setq regexp (concat regexp "[ \t\n]+\\([^ \t\n]+\\)")))
            (t (setq regexp (concat regexp "[ \t\n]+[^ \t\n]+")))))
    (setq regexp (concat regexp ".*$"))
    (goto-char beg)
    (while (and (re-search-forward regexp pos t 1)
                (setq beg (line-beginning-position))
                (setq erg (match-string-no-properties 1)))
      ;; at least one success
      (when (and (re-search-forward regexp pos t 1)
                 (string= (match-string-no-properties 1) erg)
                 (setq end (line-end-position)))
        (while (and (re-search-forward regexp pos t 1)
                    (string= (match-string-no-properties 1) erg)
                    (setq end (line-end-position))))
        (when (and beg end)
          ;; we really moved, there is another region to sort
          (save-restriction
            (narrow-to-region beg end)
            (sort-fields key beg end)
            (when (car this-fields)
              (setq last key)
              (sort-multiple-fields-intern beg end last this-fields))))))))

(provide 'sort-multiple-keys)
;;; sort-multiple-keys.el ends here
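
For comparison, here is a rough single-pass sketch of the same idea that does not nest sort-fields calls: each line is decorated with the list of its requested fields, and the lines are sorted once with a comparator that walks those lists. It is not the code above, the `my-` names are hypothetical, fields are compared as strings, only positive field numbers are handled, and empty lines in the region are dropped.

```elisp
;; Sketch only: all `my-' names are made up for illustration.

(defun my-line-fields-key (line fields)
  "Return the whitespace-separated fields of LINE named by FIELDS.
FIELDS is a list of 1-based field numbers, most significant first.
A missing field counts as the empty string."
  (let ((cols (split-string line "[ \t]+" t)))
    (mapcar (lambda (n) (or (nth (1- n) cols) "")) fields)))

(defun my-string-list-less-p (a b)
  "Return non-nil if the list of strings A sorts before B."
  (while (and a b (string= (car a) (car b)))
    (setq a (cdr a) b (cdr b)))
  (and b (or (null a) (string< (car a) (car b)))))

(defun my-sort-lines-by-fields (beg end fields)
  "Sort the lines between BEG and END by FIELDS in one pass."
  (let* ((text (buffer-substring-no-properties beg end))
         (lines (split-string text "\n" t))
         ;; Decorate each line with its compound key.
         (decorated (mapcar (lambda (l)
                              (cons (my-line-fields-key l fields) l))
                            lines))
         (sorted (sort decorated
                       (lambda (x y)
                         (my-string-list-less-p (car x) (car y))))))
    (save-excursion
      (goto-char beg)
      (delete-region beg end)
      (insert (mapconcat #'cdr sorted "\n"))
      ;; Restore the final newline if the region had one.
      (when (string-match "\n\\'" text)
        (insert "\n")))))
```

A call such as `(my-sort-lines-by-fields (point-min) (point-max) '(1 4 2))` sorts by the first field, then the fourth, then the second.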












