Class: Queries::Query::Autocomplete
- Inherits: Queries::Query
  - Object
  - Queries::Query
  - Queries::Query::Autocomplete
- Includes: Arel::Nodes, Concerns::Identifiers
- Defined in: lib/queries/query/autocomplete.rb
Overview
Requires significant refactor.
To consider: In general our optimization follows this pattern:
a: Names that match exactly, full string
b: Names that match exactly, full Identifier (cached)
c: Names that match start of string exactly (cached), wildcard end of string, minimum 2 characters
d: Names that have a very high cutoff [good wildcard anywhere]?
d.1: Names that have wildcard either side (limit to 2 characters). Are results optimally better than d?
e: Names that have exact ID (internal) (will come to top automatically)
f: Names that match some special pattern (e.g. first letter, second name in taxon name search). These may need higher priority in the stack.
May also consider length, priority, similarity
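The tiered pattern above can be sketched in plain Ruby as a list of query steps run from most to least specific, with earlier hits kept first and duplicates dropped. This is only an illustration under stated assumptions: `NAMES`, `TIERS`, and `tiered_autocomplete` are hypothetical names that do not exist in this class, and the real subclasses build ActiveRecord relations rather than filtering an in-memory array.

```ruby
# Hypothetical data standing in for a table of names.
NAMES = ['Aus bus', 'Aus', 'Ausa', 'Baus', 'Cus aus'].freeze

# Each tier takes the query string and returns its matches.
# Order loosely mirrors the overview: exact match, start of string
# (minimum 2 characters), then wildcard anywhere.
TIERS = [
  ->(q) { NAMES.select { |n| n.casecmp?(q) } },
  ->(q) { q.length >= 2 ? NAMES.select { |n| n.downcase.start_with?(q.downcase) } : [] },
  ->(q) { NAMES.select { |n| n.downcase.include?(q.downcase) } }
].freeze

# Runs every tier in order, keeps the first occurrence of each hit,
# and truncates to the requested limit.
def tiered_autocomplete(query, limit: 10)
  return [] if query.nil? || query.strip.empty?
  TIERS.flat_map { |tier| tier.call(query.strip) }.uniq.first(limit)
end

p tiered_autocomplete('aus')
```

Because `uniq` keeps the first occurrence, a record found by a more specific tier keeps its higher position even when later tiers return it again.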
Direct Known Subclasses
AssertedDistribution::Autocomplete, BiologicalAssociation::Autocomplete, BiologicalAssociationsGraph::Autocomplete, BiologicalRelationship::Autocomplete, CollectingEvent::Autocomplete, CollectionObject::Autocomplete, Container::Autocomplete, ControlledVocabularyTerm::Autocomplete, Conveyance::Autocomplete, DataAttribute::Autocomplete, DataAttribute::ValueAutocomplete, Depiction::Autocomplete, Descriptor::Autocomplete, Document::Autocomplete, Extract::Autocomplete, FieldOccurrence::Autocomplete, Gazetteer::Autocomplete, GeographicArea::Autocomplete, Identifier::Autocomplete, Image::Autocomplete, Lead::Autocomplete, Loan::Autocomplete, Namespace::Autocomplete, Note::Autocomplete, Observation::Autocomplete, ObservationMatrixRow::Autocomplete, Organization::Autocomplete, Otu::Autocomplete, Person::Autocomplete, Serial::Autocomplete, Sound::Autocomplete, Source::Autocomplete, TaxonName::Autocomplete, TaxonName::Tabular, TypeMaterial::Autocomplete, User::Autocomplete
Instance Attribute Summary
- #dynamic_limit ⇒ Integer
- #project_id ⇒ Array
- #query_string ⇒ String?
  The initial, unparsed value, sanitized.
Attributes inherited from Queries::Query
Instance Method Summary
- #autocomplete ⇒ Array
  Defaults the autocomplete result to all. TODO: eliminate.
- #autocomplete_cached ⇒ ActiveRecord::Relation
- #autocomplete_cached_wildcard_anywhere ⇒ ActiveRecord::Relation
  Removes years/integers!
- #autocomplete_common_name_exact ⇒ Object
- #autocomplete_common_name_like ⇒ Object
  TODO: GIN/similarity.
- #autocomplete_exact_id ⇒ ActiveRecord::Relation
- #autocomplete_exactly_named ⇒ ActiveRecord::Relation
- #autocomplete_named ⇒ ActiveRecord::Relation
- #autocomplete_ordered_wildcard_pieces_in_cached ⇒ ActiveRecord::Relation
- #cached_facet ⇒ ActiveRecord::Relation?
  TODO: Used in taxon_name, source, identifier.
- #combine_or_clauses(clauses) ⇒ Arel::Nodes::Grouping
- #common_name_name ⇒ Object
- #common_name_table ⇒ Object
- #common_name_wild_pieces ⇒ Object
- #exactly_named ⇒ Arel::Nodes::Matches
- #fragments ⇒ Array
  Used in unordered AND searches.
- #initialize(string, project_id: nil, **keyword_args) ⇒ Autocomplete (constructor)
  A new instance of Autocomplete.
- #integers ⇒ Array
  Array of strings representing integers.
- #least_levenshtein(fields, value) ⇒ Object
  Calculate the levenshtein distance for a value across multiple columns, and keep the smallest.
- #match_wildcard_end_in_cached ⇒ Arel::Nodes::Matches
  Matches `cached` against the `end_wildcard` pattern.
- #match_wildcard_in_cached ⇒ Arel::Nodes::Matches
  Matches ALL wildcards, but unordered, if 1 to 5 pieces are provided.
- #named ⇒ Arel::Nodes::Matches
- #only_ids ⇒ Arel::Nodes?
  Used in or_clauses; matches on id only if integers alone are provided.
- #only_integers? ⇒ Boolean
- #parent ⇒ Arel::Nodes::TableAlias
  Used in hierarchy joins.
- #parent_child_join ⇒ Scope
- #parent_child_where ⇒ Arel::Nodes::Grouping
  Match at two levels, for example, "wa te" will match "Washington Co., Texas".
- #pieces ⇒ Array
  TODO: used?!
- #safe_integers ⇒ Array<Integer>
  Array of integers parsed from `query_string` that fit within the 4-byte SQL integer range (1 to 2_147_483_647).
- #scope ⇒ Scope
  Stub. TODO: deprecate? Probably unused.
- #string_fragments ⇒ Array
  Used in unordered AND searches.
- #wildcard_wrapped_integers ⇒ Array
- #wildcard_wrapped_years ⇒ Array
- #with_cached ⇒ Arel::Nodes::Matches
- #with_cached_like ⇒ Arel::Nodes::Matches
- #with_id ⇒ Arel::Nodes?
  Used in or_clauses.
- #with_project_id ⇒ Arel::Nodes::Equality
  TODO: nil/or clause this.
- #year_letter ⇒ String?
- #years ⇒ Array
Methods inherited from Queries::Query
#alphabetic_strings, #alphanumeric_strings, base_name, #base_name, #base_query, #build_terms, #end_wildcard, #levenshtein_distance, #match_ordered_wildcard_pieces_in_cached, #no_terms?, referenced_klass, #referenced_klass, #referenced_klass_except, #referenced_klass_intersection, #referenced_klass_union, #start_and_end_wildcard, #start_wildcard, #table, #wildcard_pieces
Constructor Details
#initialize(string, project_id: nil, **keyword_args) ⇒ Autocomplete
Returns a new instance of Autocomplete.
# File 'lib/queries/query/autocomplete.rb', line 40
def initialize(string, project_id: nil, **keyword_args)
  @query_string = ::ApplicationRecord.sanitize_sql(string)&.delete("\u0000") # remove null bytes
  @project_id = project_id
  build_terms # TODO - should remove this for accessors
end
Instance Attribute Details
#dynamic_limit ⇒ Integer
# File 'lib/queries/query/autocomplete.rb', line 34
def dynamic_limit
  @dynamic_limit
end
#project_id ⇒ Array
# File 'lib/queries/query/autocomplete.rb', line 26
def project_id
  @project_id
end
#query_string ⇒ String?
Returns the initial, unparsed value, sanitized.
# File 'lib/queries/query/autocomplete.rb', line 30
def query_string
  @query_string
end
Instance Method Details
#autocomplete ⇒ Array
Defaults the autocomplete result to all. TODO: eliminate.
# File 'lib/queries/query/autocomplete.rb', line 243
def autocomplete
  return [] if query_string.blank?
  all.to_a
end
#autocomplete_cached ⇒ ActiveRecord::Relation
# File 'lib/queries/query/autocomplete.rb', line 281
def autocomplete_cached
  if a = cached_facet
    base_query.where(a.to_sql).limit(20)
  else
    nil
  end
end
#autocomplete_cached_wildcard_anywhere ⇒ ActiveRecord::Relation
Removes years/integers!
# File 'lib/queries/query/autocomplete.rb', line 265
def autocomplete_cached_wildcard_anywhere
  a = match_wildcard_in_cached
  return nil if a.nil?
  base_query.where(a.to_sql)
end
#autocomplete_common_name_exact ⇒ Object
# File 'lib/queries/query/autocomplete.rb', line 313
def autocomplete_common_name_exact
  return nil if no_terms?
  base_query.joins(:common_names).where(common_name_name.to_sql).limit(1)
end
#autocomplete_common_name_like ⇒ Object
TODO: GIN/similarity
# File 'lib/queries/query/autocomplete.rb', line 319
def autocomplete_common_name_like
  return nil if no_terms?
  base_query.joins(:common_names).where(common_name_wild_pieces.to_sql).limit(5)
end
#autocomplete_exact_id ⇒ ActiveRecord::Relation
# File 'lib/queries/query/autocomplete.rb', line 249
def autocomplete_exact_id
  if i = ::Utilities::Strings::only_integer(query_string)
    base_query.where(id: i).limit(1)
  else
    nil
  end
end
#autocomplete_exactly_named ⇒ ActiveRecord::Relation
# File 'lib/queries/query/autocomplete.rb', line 290
def autocomplete_exactly_named
  return nil if no_terms?
  base_query.where(exactly_named.to_sql).limit(20)
end
#autocomplete_named ⇒ ActiveRecord::Relation
# File 'lib/queries/query/autocomplete.rb', line 296
def autocomplete_named
  return nil if no_terms?
  base_query.where(named.to_sql).limit(20)
end
#autocomplete_ordered_wildcard_pieces_in_cached ⇒ ActiveRecord::Relation
# File 'lib/queries/query/autocomplete.rb', line 258
def autocomplete_ordered_wildcard_pieces_in_cached
  return nil if no_terms?
  base_query.where(match_ordered_wildcard_pieces_in_cached.to_sql)
end
#cached_facet ⇒ ActiveRecord::Relation?
TODO: Used in taxon_name, source, identifier
# File 'lib/queries/query/autocomplete.rb', line 274
def cached_facet
  return nil if no_terms?
  # TODO: or is redundant with terms in many cases
  (table[:cached].matches_any(terms)).or(match_ordered_wildcard_pieces_in_cached)
end
#combine_or_clauses(clauses) ⇒ Arel::Nodes::Grouping
# File 'lib/queries/query/autocomplete.rb', line 225
def combine_or_clauses(clauses)
  clauses.compact!
  raise TaxonWorks::Error, 'combine_or_clauses called without a clause, ensure at least one exists' unless !clauses.empty?
  a = clauses.shift
  clauses.each do |b|
    a = a.or(b)
  end
  a
end
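The folding step in `combine_or_clauses` can be tried without Arel. The sketch below is an assumption-laden illustration: `Clause` is a hypothetical stand-in that only imitates an Arel node's `or`, and `combine_or_clauses_sketch` raises `ArgumentError` rather than `TaxonWorks::Error`. Unlike the method above, it does not mutate its argument.

```ruby
# Hypothetical stand-in for an Arel node: it only knows how to OR itself
# with another clause and expose the resulting SQL-like string.
Clause = Struct.new(:sql) do
  def or(other)
    Clause.new("(#{sql} OR #{other.sql})")
  end
end

# Drops nil clauses, insists on at least one, then folds left with `or`.
def combine_or_clauses_sketch(clauses)
  present = clauses.compact
  raise ArgumentError, 'at least one clause is required' if present.empty?
  present.reduce { |a, b| a.or(b) }
end

combined = combine_or_clauses_sketch(
  [Clause.new('id = 1'), nil, Clause.new("name = 'Aus'"), Clause.new("cached = 'Aus'")]
)
puts combined.sql
```

`reduce` without an initial value uses the first element as the accumulator, which matches the `shift` followed by `each` in the original.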
#common_name_name ⇒ Object
# File 'lib/queries/query/autocomplete.rb', line 305
def common_name_name
  common_name_table[:name].eq(query_string)
end
#common_name_table ⇒ Object
# File 'lib/queries/query/autocomplete.rb', line 301
def common_name_table
  ::CommonName.arel_table
end
#common_name_wild_pieces ⇒ Object
# File 'lib/queries/query/autocomplete.rb', line 309
def common_name_wild_pieces
  common_name_table[:name].matches(wildcard_pieces)
end
#exactly_named ⇒ Arel::Nodes::Matches
# File 'lib/queries/query/autocomplete.rb', line 180
def exactly_named
  table[:name].eq(query_string) if query_string.present?
end
#fragments ⇒ Array
Used in unordered AND searches
# File 'lib/queries/query/autocomplete.rb', line 91
def fragments
  a = alphanumeric_strings
  if a.size > 0 && a.size < 6
    a.collect{|a| "%#{a}%"}
  else
    []
  end
end
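To see what `fragments` yields for a given input, the following plain-Ruby sketch reproduces the size guard and the wildcard wrapping. It assumes the inherited `#alphanumeric_strings` behaves roughly like a scan for alphanumeric runs; that approximation, along with the names `alphanumeric_strings_sketch` and `fragments_sketch`, is hypothetical.

```ruby
# Assumption: alphanumeric runs approximate the inherited #alphanumeric_strings.
def alphanumeric_strings_sketch(query_string)
  query_string.to_s.scan(/[[:alnum:]]+/)
end

# Wraps each piece in SQL LIKE wildcards, but only when 1 to 5 pieces
# are present; any other count yields an empty array.
def fragments_sketch(query_string)
  a = alphanumeric_strings_sketch(query_string)
  (1..5).cover?(a.size) ? a.map { |s| "%#{s}%" } : []
end

p fragments_sketch('Aus bus 1900')
p fragments_sketch('a b c d e f')
```

The upper bound keeps the number of LIKE conditions passed on to `matches_all` small; six or more pieces produce no fragments at all.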
#integers ⇒ Array
Returns an array of strings representing integers.
# File 'lib/queries/query/autocomplete.rb', line 70
def integers
  Utilities::Strings.integers(query_string)
end
#least_levenshtein(fields, value) ⇒ Object
Calculate the levenshtein distance for a value across multiple columns, and keep the smallest.
# File 'lib/queries/query/autocomplete.rb', line 328
def least_levenshtein(fields, value)
  levenshtein_sql = fields.map {|f| levenshtein_distance(f, value).to_sql }
  Arel.sql("least(#{levenshtein_sql.join(", ")})")
end
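The method above builds a SQL `least(...)` expression, so the database does the distance computation. As an in-memory analogue of the same idea, the sketch below computes the Levenshtein distance in Ruby for each listed field of one record and keeps the smallest. The helper names and the hash-based record are hypothetical and are not part of the class.

```ruby
# Classic dynamic-programming Levenshtein distance between two strings,
# keeping only the previous and current rows of the matrix.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Smallest distance between `value` and any of the record's listed fields.
def least_levenshtein_sketch(record, fields, value)
  fields.map { |f| levenshtein(record.fetch(f).to_s, value) }.min
end

record = { name: 'Smith', cached: 'Smyth, J.' }
p least_levenshtein_sketch(record, [:name, :cached], 'Smyth')
```

Taking the minimum across columns lets a record rank well when any one of its fields is close to the query.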
#match_wildcard_end_in_cached ⇒ Arel::Nodes::Matches
Matches `cached` against the `end_wildcard` pattern.
# File 'lib/queries/query/autocomplete.rb', line 212
def match_wildcard_end_in_cached
  table[:cached].matches(end_wildcard)
end
#match_wildcard_in_cached ⇒ Arel::Nodes::Matches
Matches ALL wildcards, but unordered, if 1 to 5 pieces are provided.
# File 'lib/queries/query/autocomplete.rb', line 218
def match_wildcard_in_cached
  b = fragments
  return nil if b.empty?
  table[:cached].matches_all(b)
end
#named ⇒ Arel::Nodes::Matches
# File 'lib/queries/query/autocomplete.rb', line 175
def named
  table[:name].matches_any(terms) if terms.any?
end
#only_ids ⇒ Arel::Nodes?
Used in or_clauses; matches on id only if integers alone are provided.
# File 'lib/queries/query/autocomplete.rb', line 166
def only_ids
  if only_integers?
    with_id
  else
    nil
  end
end
#only_integers? ⇒ Boolean
# File 'lib/queries/query/autocomplete.rb', line 84
def only_integers?
  Utilities::Strings.only_integers?(query_string)
end
#parent ⇒ Arel::Nodes::TableAlias
Used in hierarchy joins.
# File 'lib/queries/query/autocomplete.rb', line 186
def parent
  table.alias
end
#parent_child_join ⇒ Scope
# File 'lib/queries/query/autocomplete.rb', line 142
def parent_child_join
  table.join(parent).on(table[:parent_id].eq(parent[:id])).join_sources
end
#parent_child_where ⇒ Arel::Nodes::Grouping
Match at two levels, for example, "wa te" will match "Washington Co., Texas".
# File 'lib/queries/query/autocomplete.rb', line 148
def parent_child_where
  a,b = query_string.split(/\s+/, 2)
  return table[:id].eq(-1) if a.nil? || b.nil?
  table[:name].matches("#{a}%").and(parent[:name].matches("#{b}%"))
end
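The two-level split can be checked in isolation. This sketch reproduces only the string handling: the first whitespace-separated term becomes the child pattern and the remainder becomes the parent pattern. `parent_child_patterns` is a hypothetical helper, and it returns `nil` where the method above returns a clause that can never match (`id = -1`).

```ruby
# Returns [child_pattern, parent_pattern] for use with LIKE-style matching,
# or nil when fewer than two terms are given.
def parent_child_patterns(query_string)
  a, b = query_string.to_s.split(/\s+/, 2)
  return nil if a.nil? || b.nil?
  ["#{a}%", "#{b}%"]
end

p parent_child_patterns('wa te')
p parent_child_patterns('wa')
```

Because the split is limited to two parts, everything after the first run of whitespace is treated as a single prefix for the parent name.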
#pieces ⇒ Array
TODO: used?!
# File 'lib/queries/query/autocomplete.rb', line 115
def pieces
  query_string.split(/\s+/)
end
#safe_integers ⇒ Array<Integer>
Returns an array of integers parsed from `query_string` that fit within the 4-byte SQL integer range (1 to 2_147_483_647).
# File 'lib/queries/query/autocomplete.rb', line 77
def safe_integers
  integers
    .map(&:to_i)
    .select { |i| i.between?(1, 2_147_483_647) }
end
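The range filter matters because an id lookup with a value beyond the 4-byte limit would be rejected by an integer column. A self-contained sketch of the same filter follows; it assumes `Utilities::Strings.integers` can be approximated by scanning for digit runs, and the names `MAX_SQL_INTEGER`, `extract_integer_strings`, and `safe_integers_sketch` are hypothetical.

```ruby
# Largest value a 4-byte signed SQL integer column can hold.
MAX_SQL_INTEGER = 2_147_483_647

# Assumption: runs of digits approximate Utilities::Strings.integers.
def extract_integer_strings(query_string)
  query_string.to_s.scan(/\d+/)
end

# Converts the digit runs to Integers and keeps only those usable as ids.
def safe_integers_sketch(query_string)
  extract_integer_strings(query_string)
    .map(&:to_i)
    .select { |i| i.between?(1, MAX_SQL_INTEGER) }
end

p safe_integers_sketch('123 abc 99999999999 0')
```

Zero and oversized values are dropped silently rather than raising, so a query string with no usable integers simply yields an empty array.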
#scope ⇒ Scope
stub TODO: deprecate? probably unused
# File 'lib/queries/query/autocomplete.rb', line 54
def scope
  where('1 = 2')
end
#string_fragments ⇒ Array
Used in unordered AND searches
# File 'lib/queries/query/autocomplete.rb', line 103
def string_fragments
  a = alphabetic_strings
  if a.size > 0 && a.size < 6
    a.collect{|a| "%#{a}%"}
  else
    []
  end
end
#wildcard_wrapped_integers ⇒ Array
# File 'lib/queries/query/autocomplete.rb', line 120
def wildcard_wrapped_integers
  integers.collect{|i| "%#{i}%"}
end
#wildcard_wrapped_years ⇒ Array
# File 'lib/queries/query/autocomplete.rb', line 125
def wildcard_wrapped_years
  years.collect{|i| "%#{i}%"}
end
#with_cached ⇒ Arel::Nodes::Matches
# File 'lib/queries/query/autocomplete.rb', line 201
def with_cached
  table[:cached].eq(query_string)
end
#with_cached_like ⇒ Arel::Nodes::Matches
# File 'lib/queries/query/autocomplete.rb', line 206
def with_cached_like
  table[:cached].matches(start_and_end_wildcard)
end
#with_id ⇒ Arel::Nodes?
Used in or_clauses.
# File 'lib/queries/query/autocomplete.rb', line 156
def with_id
  if safe_integers.any?
    table[:id].in(safe_integers)
  else
    nil
  end
end
#with_project_id ⇒ Arel::Nodes::Equality
TODO: nil/or clause this
# File 'lib/queries/query/autocomplete.rb', line 192
def with_project_id
  if project_id.present?
    table[:project_id].in(project_id)
  else
    nil
  end
end
#year_letter ⇒ String?
# File 'lib/queries/query/autocomplete.rb', line 64
def year_letter
  Utilities::Strings.year_letter(query_string)
end