Class: Queries::Query::Autocomplete
- Inherits: Queries::Query
  - Object
  - Queries::Query
  - Queries::Query::Autocomplete
- Includes: Arel::Nodes, Concerns::Identifiers
- Defined in: lib/queries/query/autocomplete.rb
Overview
Requires significant refactor.
To consider: in general our optimization follows this pattern:
a: Names that match exactly, full string
b: Names that match exactly, full Identifier (cached)
c: Names that match start of string exactly (cached), wildcard end of string, minimum 2 characters
d: Names that have a very high cutoff [good wildcard anywhere]?
d.1: Names that have wildcard either side (limit to 2 characters). Are results optimally better than d?
e: Names that have exact ID (internal) (will come to top automatically)
f: Names that match some special pattern (e.g. first letter, second name in taxon name search). These may need higher priority in the stack.
May also consider length, priority, similarity.
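The tiered pattern above can be illustrated with a small in-memory sketch. It is not the library's implementation: NAMES and ranked_matches are hypothetical names, only tiers a, c and d are modeled, and the two-character minimum is applied to every tier for simplicity.

```ruby
# Hypothetical, in-memory sketch of tiered matching (tiers a, c and d only).
# NAMES and ranked_matches are illustrative; they are not part of the library.
NAMES = ['Aus', 'Aus bus', 'Bus aus', 'Cus'].freeze

def ranked_matches(names, query, limit: 20)
  q = query.to_s.strip.downcase
  return [] if q.length < 2 # assumption: minimum length applied to all tiers

  tiers = [
    ->(n) { n.downcase == q },           # a: exact match, full string
    ->(n) { n.downcase.start_with?(q) }, # c: exact start, wildcard end
    ->(n) { n.downcase.include?(q) }     # d: wildcard anywhere
  ]

  # Earlier tiers win; uniq keeps the first (highest-priority) occurrence.
  tiers.flat_map { |tier| names.select(&tier) }.uniq.first(limit)
end

p ranked_matches(NAMES, 'aus')
```

Running the tiers in priority order and de-duplicating afterwards keeps exact matches at the top without a separate scoring step.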
Direct Known Subclasses
AnatomicalPart::Autocomplete, AssertedDistribution::Autocomplete, BiologicalAssociation::Autocomplete, BiologicalAssociationsGraph::Autocomplete, BiologicalRelationship::Autocomplete, CollectingEvent::Autocomplete, CollectionObject::Autocomplete, Container::Autocomplete, ControlledVocabularyTerm::Autocomplete, Conveyance::Autocomplete, DataAttribute::Autocomplete, DataAttribute::ValueAutocomplete, Depiction::Autocomplete, Descriptor::Autocomplete, Document::Autocomplete, Extract::Autocomplete, FieldOccurrence::Autocomplete, Gazetteer::Autocomplete, GeographicArea::Autocomplete, Identifier::Autocomplete, Image::Autocomplete, Lead::Autocomplete, Loan::Autocomplete, Namespace::Autocomplete, Note::Autocomplete, Observation::Autocomplete, ObservationMatrixRow::Autocomplete, Organization::Autocomplete, Otu::Autocomplete, Person::Autocomplete, PreparationType::Autocomplete, Serial::Autocomplete, Sound::Autocomplete, Source::Autocomplete, TaxonName::Autocomplete, TaxonName::Tabular, TypeMaterial::Autocomplete, User::Autocomplete
Instance Attribute Summary
- #dynamic_limit ⇒ Integer
- #project_id ⇒ Array
- #query_string ⇒ String?
  The initial, unparsed value, sanitized.
Attributes inherited from Queries::Query
Instance Method Summary
- #autocomplete ⇒ Array
  Defaults the autocomplete result to all. TODO: eliminate.
- #autocomplete_cached ⇒ ActiveRecord::Relation
- #autocomplete_cached_wildcard_anywhere ⇒ ActiveRecord::Relation
  Removes years/integers!
- #autocomplete_common_name_exact ⇒ Object
- #autocomplete_common_name_like ⇒ Object
  TODO: GIN/similarity.
- #autocomplete_exact_id ⇒ ActiveRecord::Relation
- #autocomplete_exactly_named ⇒ ActiveRecord::Relation
- #autocomplete_named ⇒ ActiveRecord::Relation
- #autocomplete_ordered_wildcard_pieces_in_cached ⇒ ActiveRecord::Relation
- #cached_facet ⇒ ActiveRecord::Relation?
  TODO: Used in taxon_name, source, identifier.
- #combine_or_clauses(clauses) ⇒ Arel::Nodes::Grouping
- #common_name_name ⇒ Object
- #common_name_table ⇒ Object
- #common_name_wild_pieces ⇒ Object
- #exactly_named ⇒ Arel::Nodes::Matches
- #fragments ⇒ Array
  Used in unordered AND searches.
- #initialize(string, project_id: nil, **keyword_args) ⇒ Autocomplete (constructor)
  A new instance of Autocomplete.
- #integers ⇒ Array
  Array of strings representing integers.
- #least_levenshtein(fields, value) ⇒ Object
  Calculate the levenshtein distance for a value across multiple columns, and keep the smallest.
- #match_wildcard_end_in_cached ⇒ Arel::Nodes::Matches
  Matches cached against end_wildcard.
- #match_wildcard_in_cached ⇒ Arel::Nodes::Matches
  Matches ALL wildcard fragments, unordered; fragments is populated only when 1 to 5 pieces are provided.
- #named ⇒ Arel::Nodes::Matches
- #only_ids ⇒ Arel::Nodes?
  Used in or_clauses; matches on id only if integers alone are provided.
- #only_integers? ⇒ Boolean
- #parent ⇒ Arel::Nodes::TableAlias
  Used in hierarchy joins.
- #parent_child_join ⇒ Scope
- #parent_child_where ⇒ Arel::Nodes::Grouping
  Match at two levels, for example, "wa te" will match "Washington Co., Texas".
- #pieces ⇒ Array
  TODO: used?!
- #safe_integers ⇒ Array<Integer>
  Array of integers parsed from query_string that fit within the 4-byte SQL integer range (1 to 2_147_483_647).
- #scope ⇒ Scope
  Stub. TODO: deprecate? Probably unused.
- #string_fragments ⇒ Array
  Used in unordered AND searches.
- #wildcard_wrapped_integers ⇒ Array
- #wildcard_wrapped_years ⇒ Array
- #with_cached ⇒ Arel::Nodes::Matches
- #with_cached_like ⇒ Arel::Nodes::Matches
- #with_id ⇒ Arel::Nodes?
  Used in or_clauses.
- #with_project_id ⇒ Arel::Nodes::Equality
  TODO: nil/or clause this.
- #year_letter ⇒ String?
- #years ⇒ Array
Methods inherited from Queries::Query
#alphabetic_strings, #alphanumeric_strings, base_name, #base_name, #base_query, #build_terms, #end_wildcard, #levenshtein_distance, #match_ordered_wildcard_pieces_in_cached, #no_terms?, referenced_klass, #referenced_klass, #referenced_klass_except, #referenced_klass_intersection, #referenced_klass_union, #start_and_end_wildcard, #start_wildcard, #table, #wildcard_pieces
Constructor Details
#initialize(string, project_id: nil, **keyword_args) ⇒ Autocomplete
Returns a new instance of Autocomplete.
    # File 'lib/queries/query/autocomplete.rb', line 40
    def initialize(string, project_id: nil, **keyword_args)
      @query_string = ::ApplicationRecord.sanitize_sql(string)&.delete("\u0000") # remove null bytes

      @project_id = project_id

      # should not need this
      # build_terms # TODO - should remove this for accessors
    end
Instance Attribute Details
#dynamic_limit ⇒ Integer
    # File 'lib/queries/query/autocomplete.rb', line 34
    def dynamic_limit
      @dynamic_limit
    end
#project_id ⇒ Array
    # File 'lib/queries/query/autocomplete.rb', line 26
    def project_id
      @project_id
    end
#query_string ⇒ String?
Returns the initial, unparsed value, sanitized.
    # File 'lib/queries/query/autocomplete.rb', line 30
    def query_string
      @query_string
    end
Instance Method Details
#autocomplete ⇒ Array
Returns all records by default, or an empty Array when query_string is blank. TODO: eliminate.
    # File 'lib/queries/query/autocomplete.rb', line 245
    def autocomplete
      return [] if query_string.blank?
      all.to_a
    end
#autocomplete_cached ⇒ ActiveRecord::Relation
    # File 'lib/queries/query/autocomplete.rb', line 283
    def autocomplete_cached
      if a = cached_facet
        base_query.where(a.to_sql).limit(20)
      else
        nil
      end
    end
#autocomplete_cached_wildcard_anywhere ⇒ ActiveRecord::Relation
Removes years/integers! Returns nil when there are no fragments to match.
    # File 'lib/queries/query/autocomplete.rb', line 267
    def autocomplete_cached_wildcard_anywhere
      a = match_wildcard_in_cached
      return nil if a.nil?
      base_query.where(a.to_sql)
    end
#autocomplete_common_name_exact ⇒ Object
    # File 'lib/queries/query/autocomplete.rb', line 315
    def autocomplete_common_name_exact
      return nil if no_terms?
      base_query.joins(:common_names).where(common_name_name.to_sql).limit(1)
    end
#autocomplete_common_name_like ⇒ Object
TODO: GIN/similarity
    # File 'lib/queries/query/autocomplete.rb', line 321
    def autocomplete_common_name_like
      return nil if no_terms?
      base_query.joins(:common_names).where(common_name_wild_pieces.to_sql).limit(5)
    end
#autocomplete_exact_id ⇒ ActiveRecord::Relation
    # File 'lib/queries/query/autocomplete.rb', line 251
    def autocomplete_exact_id
      if i = ::Utilities::Strings::only_integer(query_string)
        base_query.where(id: i).limit(1)
      else
        nil
      end
    end
#autocomplete_exactly_named ⇒ ActiveRecord::Relation
    # File 'lib/queries/query/autocomplete.rb', line 292
    def autocomplete_exactly_named
      return nil if no_terms?
      base_query.where(exactly_named.to_sql).limit(20)
    end
#autocomplete_named ⇒ ActiveRecord::Relation
    # File 'lib/queries/query/autocomplete.rb', line 298
    def autocomplete_named
      return nil if no_terms?
      base_query.where(named.to_sql).limit(20)
    end
#autocomplete_ordered_wildcard_pieces_in_cached ⇒ ActiveRecord::Relation
    # File 'lib/queries/query/autocomplete.rb', line 260
    def autocomplete_ordered_wildcard_pieces_in_cached
      return nil if no_terms?
      base_query.where(match_ordered_wildcard_pieces_in_cached.to_sql)
    end
#cached_facet ⇒ ActiveRecord::Relation?
TODO: Used in taxon_name, source, identifier
    # File 'lib/queries/query/autocomplete.rb', line 276
    def cached_facet
      return nil if no_terms?
      # TODO: or is redundant with terms in many cases
      (table[:cached].matches_any(terms)).or(match_ordered_wildcard_pieces_in_cached)
    end
#combine_or_clauses(clauses) ⇒ Arel::Nodes::Grouping
    # File 'lib/queries/query/autocomplete.rb', line 227
    def combine_or_clauses(clauses)
      clauses.compact!
      raise TaxonWorks::Error, 'combine_or_clauses called without a clause, ensure at least one exists' if clauses.empty?
      a = clauses.shift
      clauses.each do |b|
        a = a.or(b)
      end
      a
    end
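The folding done by combine_or_clauses can be tried outside Rails with a stand-in for Arel nodes. In this sketch Clause is a hypothetical Struct that models only #or, ArgumentError replaces TaxonWorks::Error, and compact is used instead of compact! so the caller's array is left untouched.

```ruby
# Clause is a hypothetical stand-in for an Arel node; only #or is modeled.
Clause = Struct.new(:sql) do
  def or(other)
    Clause.new("(#{sql} OR #{other.sql})")
  end
end

# Sketch of the same fold: drop nils, require one clause, OR the rest together.
def combine_or_clauses(clauses)
  present = clauses.compact # non-mutating, unlike compact! in the original
  raise ArgumentError, 'at least one clause is required' if present.empty?

  present.reduce { |a, b| a.or(b) }
end

combined = combine_or_clauses([Clause.new('id = 1'), nil, Clause.new("name = 'x'")])
puts combined.sql
```

Because nil clauses are dropped first, helpers such as with_id or only_ids can return nil freely and still be passed in the same array.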
#common_name_name ⇒ Object
    # File 'lib/queries/query/autocomplete.rb', line 307
    def common_name_name
      common_name_table[:name].eq(query_string)
    end
#common_name_table ⇒ Object
    # File 'lib/queries/query/autocomplete.rb', line 303
    def common_name_table
      ::CommonName.arel_table
    end
#common_name_wild_pieces ⇒ Object
    # File 'lib/queries/query/autocomplete.rb', line 311
    def common_name_wild_pieces
      common_name_table[:name].matches(wildcard_pieces)
    end
#exactly_named ⇒ Arel::Nodes::Matches
    # File 'lib/queries/query/autocomplete.rb', line 182
    def exactly_named
      table[:name].eq(query_string) if query_string.present?
    end
#fragments ⇒ Array
Used in unordered AND searches
    # File 'lib/queries/query/autocomplete.rb', line 93
    def fragments
      a = alphanumeric_strings
      if a.size > 0 && a.size < 6
        a.collect { |s| "%#{s}%" } # block variable renamed to avoid shadowing a
      else
        []
      end
    end
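A standalone sketch of the fragment wrapping follows. The tokenization is an assumption: alphanumeric_strings is inherited and not shown here, so the hypothetical wildcard_fragments helper simply scans for runs of alphanumeric characters.

```ruby
# Hypothetical helper mirroring #fragments outside the class.
# Assumption: tokens are runs of alphanumeric characters.
def wildcard_fragments(query)
  words = query.to_s.scan(/[[:alnum:]]+/)
  return [] unless words.size.between?(1, 5) # same bounds as size > 0 && size < 6

  words.map { |w| "%#{w}%" }
end

p wildcard_fragments('Aus bus 1999')
p wildcard_fragments('one two three four five six')
```

Six or more tokens yield an empty Array, which callers such as match_wildcard_in_cached treat as "nothing to match".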
#integers ⇒ Array
Returns an Array of strings representing integers.
    # File 'lib/queries/query/autocomplete.rb', line 72
    def integers
      Utilities::Strings.integers(query_string)
    end
#least_levenshtein(fields, value) ⇒ Object
Calculate the levenshtein distance for a value across multiple columns, and keep the smallest.
    # File 'lib/queries/query/autocomplete.rb', line 330
    def least_levenshtein(fields, value)
      levenshtein_sql = fields.map { |f| levenshtein_distance(f, value).to_sql }
      Arel.sql("least(#{levenshtein_sql.join(", ")})")
    end
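The method above builds a SQL least(...) expression; the same idea can be sketched in plain Ruby to show what is being computed. Both functions below are illustrative stand-ins operating on strings rather than columns, and nil values are skipped in this sketch.

```ruby
# Plain-Ruby Levenshtein distance (two-row dynamic programming).
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      # deletion, insertion, substitution
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Illustrative analogue of least(levenshtein(f1, v), levenshtein(f2, v), ...):
# compute the distance per field value and keep the smallest.
def least_levenshtein(values, target)
  values.compact.map { |v| levenshtein(v, target) }.min
end

p least_levenshtein(['Jones', 'Smyth'], 'Smith')
```

Taking the minimum across fields lets a record rank well when any one of its name-like columns is close to the query.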
#match_wildcard_end_in_cached ⇒ Arel::Nodes::Matches
Matches cached against end_wildcard.
    # File 'lib/queries/query/autocomplete.rb', line 214
    def match_wildcard_end_in_cached
      table[:cached].matches(end_wildcard)
    end
#match_wildcard_in_cached ⇒ Arel::Nodes::Matches
Matches ALL wildcard fragments, unordered. Returns nil when fragments is empty; fragments is populated only when 1 to 5 pieces are provided.
    # File 'lib/queries/query/autocomplete.rb', line 220
    def match_wildcard_in_cached
      b = fragments
      return nil if b.empty?
      table[:cached].matches_all(b)
    end
#named ⇒ Arel::Nodes::Matches
    # File 'lib/queries/query/autocomplete.rb', line 177
    def named
      table[:name].matches_any(terms) if terms.any?
    end
#only_ids ⇒ Arel::Nodes?
Used in or_clauses; matches on id only if integers alone are provided.
    # File 'lib/queries/query/autocomplete.rb', line 168
    def only_ids
      if only_integers?
        with_id
      else
        nil
      end
    end
#only_integers? ⇒ Boolean
    # File 'lib/queries/query/autocomplete.rb', line 86
    def only_integers?
      Utilities::Strings.only_integers?(query_string)
    end
#parent ⇒ Arel::Nodes::TableAlias
Used in hierarchy joins.
    # File 'lib/queries/query/autocomplete.rb', line 188
    def parent
      table.alias
    end
#parent_child_join ⇒ Scope
    # File 'lib/queries/query/autocomplete.rb', line 144
    def parent_child_join
      table.join(parent).on(table[:parent_id].eq(parent[:id])).join_sources
    end
#parent_child_where ⇒ Arel::Nodes::Grouping
Match at two levels, for example, "wa te" will match "Washington Co., Texas". When the query does not split into two parts, the clause matches nothing (id = -1).
    # File 'lib/queries/query/autocomplete.rb', line 150
    def parent_child_where
      a, b = query_string.split(/\s+/, 2)
      return table[:id].eq(-1) if a.nil? || b.nil?
      table[:name].matches("#{a}%").and(parent[:name].matches("#{b}%"))
    end
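The two-level match can be sketched in memory as follows. Area and parent_child_match? are hypothetical names for illustration, and case-insensitive prefix comparison is an assumption standing in for the database's pattern matching.

```ruby
# Area is a hypothetical record with its own name and its parent's name.
Area = Struct.new(:name, :parent_name)

# First query word must prefix the record's name, the remainder must prefix
# the parent's name. Assumption: comparison is case-insensitive.
def parent_child_match?(area, query)
  child, parent = query.to_s.split(/\s+/, 2)
  return false if child.nil? || parent.nil? # analogue of the id = -1 clause

  area.name.downcase.start_with?(child.downcase) &&
    area.parent_name.to_s.downcase.start_with?(parent.downcase)
end

county = Area.new('Washington Co.', 'Texas')
p parent_child_match?(county, 'wa te')
p parent_child_match?(county, 'wa')
```

A single-word query returns false here, mirroring the never-matching clause the method falls back to when either part is missing.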
#pieces ⇒ Array
TODO: used?!
    # File 'lib/queries/query/autocomplete.rb', line 117
    def pieces
      query_string.split(/\s+/)
    end
#safe_integers ⇒ Array<Integer>
Returns an Array of integers parsed from query_string that fit within the 4-byte SQL integer range (1 to 2_147_483_647).
    # File 'lib/queries/query/autocomplete.rb', line 79
    def safe_integers
      integers
        .map(&:to_i)
        .select { |i| i.between?(1, 2_147_483_647) }
    end
#scope ⇒ Scope
Stub. TODO: deprecate? Probably unused.
    # File 'lib/queries/query/autocomplete.rb', line 56
    def scope
      where('1 = 2')
    end
#string_fragments ⇒ Array
Used in unordered AND searches
    # File 'lib/queries/query/autocomplete.rb', line 105
    def string_fragments
      a = alphabetic_strings
      if a.size > 0 && a.size < 6
        a.collect { |s| "%#{s}%" } # block variable renamed to avoid shadowing a
      else
        []
      end
    end
#wildcard_wrapped_integers ⇒ Array
    # File 'lib/queries/query/autocomplete.rb', line 122
    def wildcard_wrapped_integers
      integers.collect { |i| "%#{i}%" }
    end
#wildcard_wrapped_years ⇒ Array
    # File 'lib/queries/query/autocomplete.rb', line 127
    def wildcard_wrapped_years
      years.collect { |i| "%#{i}%" }
    end
#with_cached ⇒ Arel::Nodes::Matches
    # File 'lib/queries/query/autocomplete.rb', line 203
    def with_cached
      table[:cached].eq(query_string)
    end
#with_cached_like ⇒ Arel::Nodes::Matches
    # File 'lib/queries/query/autocomplete.rb', line 208
    def with_cached_like
      table[:cached].matches(start_and_end_wildcard)
    end
#with_id ⇒ Arel::Nodes?
Used in or_clauses. Returns nil when query_string contains no safe integers.
    # File 'lib/queries/query/autocomplete.rb', line 158
    def with_id
      if safe_integers.any?
        table[:id].in(safe_integers)
      else
        nil
      end
    end
#with_project_id ⇒ Arel::Nodes::Equality
TODO: nil/or clause this
    # File 'lib/queries/query/autocomplete.rb', line 194
    def with_project_id
      if project_id.present?
        table[:project_id].in(project_id)
      else
        nil
      end
    end
#year_letter ⇒ String?
    # File 'lib/queries/query/autocomplete.rb', line 66
    def year_letter
      Utilities::Strings.year_letter(query_string)
    end