summaryrefslogtreecommitdiff
path: root/src/de/lmu/ifi/dbs/elki/database/query/package-info.java
blob: fb3539bca5c2b6ce344101db6ba301622e6ce39c (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
/**
 * <p><b>Database queries</b> - computing distances, neighbors, similarities - API and general documentation.</p>
 * 
 * <h2>Introduction</h2>
 * 
 * <p>The database query API is designed around the concept of <b>prepared statements</b>.</p>
 * <p>When working with index structures, preprocessors, caches and external data, computing
 * a distance or a neighborhood is not as simple as running a constant time function.
 * Some functions may only be defined on a subset of the data, others can be computed much
 * more efficiently by performing a batch operation. When plenty of memory is available,
 * caching can be faster than recomputing distances all the time. And often there will be more
 * than one way of computing the same data
 * (for example by using an index or doing a linear scan).</p>
 *
 * <p>Usually, these operations are invoked very often.
 * Even deciding which method to use at every iteration can prove quite costly when the number
 * of iterations becomes large. Therefore the goal is to "optimize" once, then invoke the same
 * handler cheaply. This can be achieved by using "prepared statements" as this would be called
 * in a traditional RDBMS context.</p>
 * 
 * <h2>Prepared queries in ELKI</h2>
 * 
 * <p>Prepared statements in ELKI are currently available for:
 * <ul>
 * <li>Distance queries: {@link de.lmu.ifi.dbs.elki.database.query.distance.DistanceQuery DistanceQuery}</li>
 * <li>Similarity queries: {@link de.lmu.ifi.dbs.elki.database.query.similarity.SimilarityQuery SimilarityQuery}</li>
 * <li>kNN (k-nearest-neighbors) queries: {@link de.lmu.ifi.dbs.elki.database.query.knn.KNNQuery KNNQuery}</li>
 * <li>&epsilon;-range queries: {@link de.lmu.ifi.dbs.elki.database.query.range.RangeQuery RangeQuery}</li>
 * <li>rkNN (reverse k-nearest-neighbors) queries: {@link de.lmu.ifi.dbs.elki.database.query.rknn.RKNNQuery RKNNQuery}</li>
 * </ul>
 * with a quite similar API.</p>
 * 
 * <h2>Obtaining query objects</h2>
 * 
 * <p>The general process of obtaining a Query is to retrieve it from the database using:
 * <ul>
 * <li>{@link de.lmu.ifi.dbs.elki.database.Database#getDistanceQuery Database.getDistanceQuery(distance)}</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.Database#getSimilarityQuery Database.getSimilarityQuery(similarity)}</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.Database#getKNNQuery Database.getKNNQuery(distance)}</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.Database#getRangeQuery Database.getRangeQuery(distance)}</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.Database#getRKNNQuery Database.getRKNNQuery(distance)}</li>
 * </ul>
 * as appropriate. See the query class links above for the detailed API. Avoid calling this method within a loop construct!</p>
 * 
 * <p>The query can then be evaluated on objects as needed.</p>
 * 
 * <h2>Optimizer hints</h2>
 * 
 * <p>In order to assist the database layer to choose the most suitable implementation, one should
 * also provide so called "hints" as available. In general, any object could be a "hint" to the
 * database layer (for extensibility), but the following are commonly used:</p>
 * 
 * <ul>
 * <li>An Integer as maximum value of "k" used in kNN and rkNN queries
 * (since a preprocessor or index might only support a certain fixed maximum value)</li>
 * <li>A maximum distance used in range queries</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.query.DatabaseQuery#HINT_BULK DatabaseQuery.HINT_BULK} to request support for bulk operations</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.query.DatabaseQuery#HINT_EXACT DatabaseQuery.HINT_EXACT} to exclude approximate answers</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.query.DatabaseQuery#HINT_HEAVY_USE DatabaseQuery.HINT_HEAVY_USE} to suggest the use of a cache or preprocessor</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.query.DatabaseQuery#HINT_OPTIMIZED_ONLY DatabaseQuery.HINT_OPTIMIZED_ONLY} to disallow linear scans</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.query.DatabaseQuery#HINT_SINGLE DatabaseQuery.HINT_SINGLE} to disallow expensive optimizations, since the query will only be used once</li>
 * <li>{@link de.lmu.ifi.dbs.elki.database.query.DatabaseQuery#HINT_NO_CACHE DatabaseQuery.HINT_NO_CACHE} to disallow retrieving a cache class</li>
 * </ul>
 * 
 * <p>Please set these hints appropriately, since this can effect your algorithms performance!</p>
 * 
 * <h2>Full example:</h2>
 * 
 * <blockquote><pre> {@code // Get a kNN query with maxk = 10
 * KNNQuery<V, DoubleDistance> knnQuery = database.getKNNQuery(EuclideanDistanceFunction.STATIC, 10);
 * // run a 10NN query for each point, discarding the results
 * for(DBID id : database) {
 *   knnQuery.getKNNForDBID(id, 10);
 * }
 * }</pre></blockquote>
 * 
 * @apiviz.exclude java.util.*
 * @apiviz.exclude de.lmu.ifi.dbs.elki.utilities.*
 */
/*
This file is part of ELKI:
Environment for Developing KDD-Applications Supported by Index-Structures

Copyright (C) 2013
Ludwig-Maximilians-Universität München
Lehr- und Forschungseinheit für Datenbanksysteme
ELKI Development Team

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/
package de.lmu.ifi.dbs.elki.database.query;