From dd657ad3f1428b026486db3ec36691df17ddf515 Mon Sep 17 00:00:00 2001 From: "Steve M. Robbins" Date: Sat, 22 Oct 2011 04:54:51 +0200 Subject: Import nyquist_3.05.orig.tar.gz [dgit import orig nyquist_3.05.orig.tar.gz] --- docsrc/xlisp/xlisp-doc/examples/hash-tables.htm | 496 ++++++++++++++++++++++++ 1 file changed, 496 insertions(+) create mode 100644 docsrc/xlisp/xlisp-doc/examples/hash-tables.htm (limited to 'docsrc/xlisp/xlisp-doc/examples/hash-tables.htm') diff --git a/docsrc/xlisp/xlisp-doc/examples/hash-tables.htm b/docsrc/xlisp/xlisp-doc/examples/hash-tables.htm new file mode 100644 index 0000000..3a74ffa --- /dev/null +++ b/docsrc/xlisp/xlisp-doc/examples/hash-tables.htm @@ -0,0 +1,496 @@ + + +Hash Tables + + + + + + + +Nyquist / XLISP 2.0 - +Contents | +Tutorials | +Examples | +Reference + +

+ +

Hash Tables

+ +

The internal XLISP 'hash' function from 'xlsym.c':

+ +

+/* hash - hash a symbol name string */
+int hash(char *str, int len)
+{
+    int i;
+    for (i = 0; *str; )
+        i = (i << 2) ^ *str++;
+    i %= len;
+    return (i < 0 ? -i : i);
+}
+

+ +

In XLISP this would look like:

+ +

+(defun lisp-hash (string table-size)
+  (let ((i 0))
+    (dotimes (index (length string))
+      (setq i (logxor (bsh i 2) (char-code (char string index)))))
+    (setq i (rem i table-size))
+    (if (minusp i) (- i) i)))
+

+ +

A hash function is a kind of random +number generator, where the same input always produces the same output +number. The XLISP hash +function computes equally distributed integer numbers in a given range from +the characters of an input string.

+ +

A very simple example:

+ +

We want to store 4 strings in 2 lists, +stored in 2 array elements:
+ +
```
+> (setq my-array (make-array 2))
+#(NIL NIL)  ; NIL NIL = two empty lists
+
```
+ +
If the array index is computed by the +hash function, then the equally +distributed numbers make sure that every list will contain approximately +the same number of strings:
+ +
```
+> (dolist (string '("a" "b" "c" "d") my-array)
+    (push string (aref my-array (hash string (length my-array)))))
+#(("d" "b") ("c" "a"))
+
```
+ +
The order of the strings in the array was computed by the +hash function, it is not the same order +as given to dolist.
If we now search for a string in the lists then the +hash function will tell us the number of +the array element with the list containing the string because the same input +string to the hash function always +produces the same output number, as long as the same +'table-size' [the same number of array elements] is used:
+ +
```
+> (dolist (string '("a" "b" "c" "d"))
+    (format t "~s = ~s~%" string
+              (aref my-array (hash string (length my-array)))))
+"a" = ("c" "a")
+"b" = ("d" "b")
+"c" = ("c" "a")
+"d" = ("d" "b")
+NIL
+
```
+ +
The hash function will always find +the correct list as long as the number of array elements has not +changed.

+ +

The two main tasks of the hash +function are:

+ +

Make sure that all lists contain approximately the same number of +elements, independent from the characters in the input strings, no matter if +the strings are very similar or completely different. With the +hash function it will nearly never +happen that one list contains all strings while all other lists are +empty.
With the same 'name' and 'table-size' arguments the +hash function will always return exactly +the same integer number, so a string can always be found no matter in what +order the strings are stored in the lists of the array.

+ +

Now we can find strings stored in lists, but we want to store and find +arbitrary things. Therefore we replace the ordinary lists with association +lists:

+ +

+> (setq my-array (make-array 2))
+#(() ())
+
+> (dolist (a-cons '(("a" . 1) ("b" . 2) ("c" . 3) ("d" . 4)) my-array)
+    (push a-cons (aref my-array (hash (car a-cons) (length my-array)))))
+#((("d" . 4) ("b" . 2)) (("c" . 3) ("a" . 1)))
+

+ +

We now have an array like this:

+ +

+ + + + + + + + + + + + + + + + + + + + + + +
Array
0 Association List 1 → (("d" . 4) ("b" . 2))
1 Association List 2 → (("c" . 3) ("a" . 1))

+ +

The association lists give the flexibility to store an arbitrary number +of key/value pairs, we are not limited by the fixed number of +array elements, while the array together with the +hash function gives much more speed than +a single association list if we want to manage a big number of + key/value pairs.

+ +

With a big number of key/value pairs it is faster to keep them in many +small association lists than in one single big list. Arrays +provide random access, where every element can be accessed in the same time, +while a list can only be searched from the beginning up to the matching +element. The longer the list, the slower the search +becomes.

+ +

With the hash function we find the +association list containing the key:

+ +

+> (dolist (key '("a" "b" "c" "d"))
+    (format t "~s = ~s~%" key
+              (aref my-array (hash key (length my-array)))))
+"a" = (("c" . 3) ("a" . 1))
+"b" = (("d" . 4) ("b" . 2))
+"c" = (("c" . 3) ("a" . 1))
+"d" = (("d" . 4) ("b" . 2))
+NIL
+

+ +

With the assoc function we find the +key/value pair:

+ +

+> (dolist (key '("a" "b" "c" "d"))
+    (format t "~s = ~s~%" key
+              (assoc key (aref my-array (hash key (length my-array)))
+                     :test #'equal)))
+"a" = ("a" . 1)
+"b" = ("b" . 2)
+"c" = ("c" . 3)
+"d" = ("d" . 4)
+NIL
+

+ +

With the cdr function we get the +value:

+ +

+> (dolist (key '("a" "b" "c" "d"))
+    (format t "~s = ~s~%" key
+              (cdr (assoc key (aref my-array (hash key (length my-array)))
+                          :test #'equal))))
+"a" = 1
+"b" = 2
+"c" = 3
+"d" = 4
+NIL
+

+ +

And now we have our first working hash-table.

+ +

But we still have one problem. The +hash function works only with +symbols or strings, while assoc can +also work with numbers, strings and even lists as 'key' argument. To +make our hash-table work with all types +assoc can handle, we must make the +hash function happy and convert the +'key' argument with format into a +string before computing the hash index:

+ +

+> (setq my-array (make-array 2))
+#(() ())
+
+> (dolist (a-cons '((#\x . 1) ((y z) . 2) (12 . 3) (6.5 . 4)) my-array)
+    (push a-cons (aref my-array (hash (format nil "~s" (car a-cons))
+                                      (length my-array)))))
+#(((12 . 3) (#\x . 1)) ((6.5 . 4) ((Y Z) . 2)))
+
+> (dolist (key '(#\x (y z) 12 6.5))
+    (format t "~s = ~s~%" key
+              (cdr (assoc key (aref my-array (hash (format nil "~s" key)
+                                                   (length my-array)))
+                          :test #'equal))))
+#\x = 1
+(Y Z) = 2
+12 = 3
+6.5 = 4
+NIL
+

+ +

Wonderful.

+ +

A final quirk still needs to be solved. Maybe you have noticed the :test +argument to assoc. Like with all Lisp +functions providing :test arguments, the +assoc :test defaults to +eql [because +eq is unreliable with numbers, and +eql is faster +than equal], but +eql doesn't work with +floating-point numbers, strings and lists, so we had to +use equal.

+ +

The typical Lisp solution is to provide a :test argument to the +'make-hash-table' function, so the programmer can choose which function to +use. The :test argument to 'make-hash-table' becomes a property +of the hash-table itself, so the :test only needs to be given +once, at the time when the hash-table is created, and not every +time the hash-table is accessed afterwards.

+ +

We have the problem that hash-tables are no +built-in XLISP data type and we want use +make-hash-table in the same way +as make-array:

+ +

+(setq my-hash-table (make-hash-table size :test #'equal))
+

+ +

Here the make-hash-table function has no access to the property list of +the 'my-hash-table' symbol, so the only solution is to make the +:test function become part of the hash-table itself:

+ +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Array
  0 Test Function → the :test argument to + assoc
  1 Association List 1 → ((key1 . value1) + ... (keyN . valueN))
  2 Association List 2 → ((key1 . value1) + ... (keyN . valueN))
  3 Association List 3 → ((key1 . value1) + ... (keyN . valueN))
...
  n Association List n → ((key1 . value1) + ... (keyN . valueN))

+ +

This is the final layout of our hash-tables, so we can start +to implement the hash-table functions.

+ +

+ + + +

+ +

make-hash-table

+ +

+(defun make-hash-table (size &optional (test #'eql))
+  (and (< size 1) (error "hash-table minimum size is 1" size))
+  (let ((hash-table (make-array (1+ size))))
+    (setf (aref hash-table 0) test)
+    hash-table))
+
+(defun gethash (key hash-table)
+  (let* ((size   (1- (length hash-table)))
+         (index  (1+ (hash (format nil "~s" key) size)))
+         (a-list (aref hash-table index))
+         (test   (aref hash-table 0)))
+    (cdr (assoc key a-list :test test))))
+
+(defun puthash (key value hash-table)
+  (let* ((size   (1- (length hash-table)))
+         (index  (1+ (hash (format nil "~s" key) size)))
+         (a-list (aref hash-table index))
+         (test   (aref hash-table 0))
+         (a-cons (assoc key a-list :test test)))
+    (setf (aref hash-table index)
+          (cons (cons key value)
+                (if a-cons
+                    (remove-if #'(lambda (x)
+                                   (funcall test key (car x)))
+                               a-list)
+                    a-list)))))
+
+(defun remhash (key hash-table)
+  (let* ((size   (1- (length hash-table)))
+         (index  (1+ (hash (format nil "~s" key) size)))
+         (a-list (aref hash-table index))
+         (test   (aref hash-table 0))
+         (a-cons (assoc key a-list :test test)))
+    (and a-cons
+         (setf (aref hash-table index)
+               (remove-if #'(lambda (x)
+                              (funcall test key (car x)))
+                          a-list)))
+    a-cons))
+
+(defun clrhash (hash-table)
+  (let ((size (1- (length hash-table))))
+    (do ((index 1 (1+ index)))
+        ((> index size))
+      (setf (aref hash-table index) nil))
+    hash-table))
+
+(defun hash-table-p (expr)
+  (and (arrayp expr)             ; expression is an array
+       (> (length expr) 1)       ; with more than one elements
+       (fboundp (aref expr 0))   ; first element is a function
+       (let ((size (1- (length expr))))   ; all other
+         (do ((index 1 (1+ index)))       ; elements are lists
+             ((or (> index size)
+                  (not (listp (aref expr index))))
+              (> index size))))))
+
+(defun hash-table-count (hash-table)
+  (let ((size (1- (length hash-table)))
+        (entries 0))
+    (do ((index 1 (1+ index)))
+        ((> index size))
+      (setf entries (+ entries (length (aref hash-table index)))))
+    entries))
+
+(defun hash-table-size (hash-table)
+  (1- (length hash-table)))
+
+(defun hash-table-test (hash-table)
+  (aref hash-table 0))
+
+(defun print-hash-table (hash-table)
+  (if (not (arrayp hash-table))
+      (format t ";; Not an array: ~s~%" hash-table)
+      (dotimes (index (length hash-table))
+        (let ((element (aref hash-table index)))
+          (cond ((not (listp element))
+                 (format t ";; array element ~a: ~s~%" index element))
+                ((null element)
+                 (format t ";; bucket ~a: ()~%" index))
+                (t
+                 (format t ";; bucket ~a:~%" index)
+                 (let ((entry-counter 1))
+                   (dolist (entry element)
+                     (format t ";; ~a.~a: ~s~%" index entry-counter entry)
+                     (incf entry-counter)))))))))
+

+ + + +

+ +

+ +Nyquist / XLISP 2.0 - +Contents | +Tutorials | +Examples | +Reference + + + -- cgit v1.2.3

	Array

0	Association List 1	→	`(("d" . 4) ("b" . 2))`

1	Association List 2	→	`(("c" . 3) ("a" . 1))`

	Array

0	Test Function	→	the :test argument to + assoc

1	Association List 1	→	((key1 . value1) + ... (keyN . valueN))

2	Association List 2	→	((key1 . value1) + ... (keyN . valueN))

3	Association List 3	→	((key1 . value1) + ... (keyN . valueN))
	...

n	Association List n	→	((key1 . value1) + ... (keyN . valueN))