Class: Queries::Otu::Autocomplete
- Inherits: Query::Autocomplete
  - Object
  - Query
  - Query::Autocomplete
  - Queries::Otu::Autocomplete
- Defined in: lib/queries/otu/autocomplete.rb
Overview
See Query::Autocomplete for the optimization strategy applied per name. There are four classes of name, each of which uses the same strategy: OTU name, original TaxonName, TaxonName, and CommonName. A global priority is then applied, pulling the best names from each sub-strategy to the top.
Constant Summary
- QUERIES =
Keys are method names. Existence of method is checked before requesting the query
    {
      # OTU
      otu_name_exact: {priority: 1},
      autocomplete_exact_id: {priority: 1},
      autocomplete_identifier_cached_exact: {priority: 1},
      otu_name_start_match: {priority: 200},
      otu_name_similarity: {priority: 220},

      # TaxonName
      autocomplete_taxon_name: {priority: nil}, # Priority is slotted from 10 .. 20

      # These are all approximately covered in the blanket taxon_name autocomplete
      # taxon_name_name_exact: {priority: 10},
      # taxon_name_identifier_exact: {priority: 10},
      # taxon_name_name_start_match: {priority: 100},
      # taxon_name_name_high_cuttoff: {priority: 200},

      # CommonName
      # These should all be covered/moved to common_name_autocomplete,
      autocomplete_common_name_exact: {priority: 100},
      autocomplete_common_name_like: {priority: 1000}

      # common_name_identifier_exact: {priority: 10},
      # common_name_name_start_match: {priority: 100},
      # common_name_name_similarity: {priority: 200},
    }.freeze
Instance Attribute Summary
- #exact ⇒ Boolean
  &exact=<'true'|'false'>. If 'true', only results where #name = query_string are returned (no fuzzy matching).
- #having_taxon_name_only ⇒ Object
  Boolean, nil. true: only return OTUs with `name` = nil. false, nil: no effect.
- #include_common_names ⇒ Object
  Boolean, nil. true: pre-load common names with OTUs. false, nil: ignored.
- #include_taxon_name ⇒ Object
  Boolean, nil. true: pre-load taxon name with OTUs. false, nil: ignored.
- #with_taxon_name ⇒ Object
  Boolean, nil. true: OTU must have a taxon name. false: OTU must not have a taxon name. nil: ignored.
Attributes inherited from Query::Autocomplete
#dynamic_limit, #project_id, #query_string
Attributes inherited from Query
Instance Method Summary
- #api_autocomplete ⇒ Object
  DEPRECATED Maintains valid_taxon_name_id needed for API.
- #api_autocomplete_extended ⇒ Array
  An autocomplete result that permits displaying the TaxonName as originally matched.
- #autocomplete ⇒ Object
- #autocomplete_base(targets = QUERIES) ⇒ Object
- #autocomplete_taxon_name ⇒ Scope
  Pull the result of a TaxonName autocomplete.
- #autocomplete_taxon_name_extended ⇒ Object
- #base_query ⇒ Object
- #compact_priorities(otus) ⇒ Object
  Doesn’t work for extended, as we can have the same OTU with different labels.
- #initialize(string, project_id: nil, having_taxon_name_only: false, with_taxon_name: nil, exact: 'false', include_common_names: false, include_taxon_name: false) ⇒ Autocomplete
  constructor. A new instance of Autocomplete.
- #otu_name_exact ⇒ Object
- #otu_name_similarity ⇒ Object
  All records that meet the similarity cutoff. This is intended as a generic replacement for wildcarded results.
- #otu_name_start_match ⇒ Object
- #scope_autocomplete(query) ⇒ Object
Methods inherited from Query::Autocomplete
#autocomplete_cached, #autocomplete_cached_wildcard_anywhere, #autocomplete_common_name_exact, #autocomplete_common_name_like, #autocomplete_exact_id, #autocomplete_exactly_named, #autocomplete_named, #autocomplete_ordered_wildcard_pieces_in_cached, #cached_facet, #combine_or_clauses, #common_name_name, #common_name_table, #common_name_wild_pieces, #exactly_named, #fragments, #integers, #least_levenshtein, #match_wildcard_end_in_cached, #match_wildcard_in_cached, #named, #only_ids, #only_integers?, #parent, #parent_child_join, #parent_child_where, #pieces, #safe_integers, #scope, #string_fragments, #wildcard_wrapped_integers, #wildcard_wrapped_years, #with_cached, #with_cached_like, #with_id, #with_project_id, #year_letter, #years
Methods inherited from Query
#alphabetic_strings, #alphanumeric_strings, base_name, #base_name, #build_terms, #cached_facet, #end_wildcard, #levenshtein_distance, #match_ordered_wildcard_pieces_in_cached, #no_terms?, referenced_klass, #referenced_klass, #referenced_klass_except, #referenced_klass_intersection, #referenced_klass_union, #start_and_end_wildcard, #start_wildcard, #table, #wildcard_pieces
Constructor Details
#initialize(string, project_id: nil, having_taxon_name_only: false, with_taxon_name: nil, exact: 'false', include_common_names: false, include_taxon_name: false) ⇒ Autocomplete
Returns a new instance of Autocomplete.
    # File 'lib/queries/otu/autocomplete.rb', line 66
    def initialize(
      string,
      project_id: nil,
      having_taxon_name_only: false,
      with_taxon_name: nil,
      exact: 'false',
      include_common_names: false,
      include_taxon_name: false
    )
      super(string, project_id:)

      @having_taxon_name_only = boolean_param({having_taxon_name_only:}, :having_taxon_name_only)
      @with_taxon_name = boolean_param({with_taxon_name:}, :with_taxon_name)

      # TODO: move to mode
      @exact = boolean_param({exact:}, :exact)

      @include_common_names = boolean_param({include_common_names:}, :include_common_names)
      @include_taxon_name = boolean_param({include_taxon_name:}, :include_taxon_name)
    end
Instance Attribute Details
#exact ⇒ Boolean
Returns &exact=<'true'|'false'>. If 'true', only results where #name = query_string are returned (no fuzzy matching).
    # File 'lib/queries/otu/autocomplete.rb', line 27
    def exact
      @exact
    end
#having_taxon_name_only ⇒ Object
Returns Boolean, nil. true: only return OTUs with `name` = nil. false, nil: no effect.
    # File 'lib/queries/otu/autocomplete.rb', line 16
    def having_taxon_name_only
      @having_taxon_name_only
    end
#include_common_names ⇒ Object
Returns Boolean, nil. true: pre-load common names with OTUs. false, nil: ignored.
    # File 'lib/queries/otu/autocomplete.rb', line 32
    def include_common_names
      @include_common_names
    end
#include_taxon_name ⇒ Object
Returns Boolean, nil. true: pre-load taxon name with OTUs. false, nil: ignored.
    # File 'lib/queries/otu/autocomplete.rb', line 37
    def include_taxon_name
      @include_taxon_name
    end
#with_taxon_name ⇒ Object
Returns Boolean, nil. true: OTU must have a taxon name. false: OTU must not have a taxon name. nil: ignored.
    # File 'lib/queries/otu/autocomplete.rb', line 22
    def with_taxon_name
      @with_taxon_name
    end
Instance Method Details
#api_autocomplete ⇒ Object
DEPRECATED Maintains valid_taxon_name_id needed for API.
Considerations:
The chain otus -> taxon names -> valid taxon name id <- otus can return more OTUs than the original query, because there can be multiple OTUs for the valid name of an invalid original result. Right now we pick the first valid OTU for the name with distinct on().
    # File 'lib/queries/otu/autocomplete.rb', line 139
    def api_autocomplete
      @with_taxon_name = true

      # This limit() has more impact now. Since all
      # names are loaded large matches can swamp exact names
      # before priority ordering is applied. May require tuning.
      otus = compact_priorities( autocomplete_base.limit(30) )

      otu_order = otus.map(&:id).uniq

      f = ::Otu.where(id: otu_order)
        .joins('left join taxon_names t1 on otus.taxon_name_id = t1.id')
        .joins('left join otus o2 on t1.cached_valid_taxon_name_id = o2.taxon_name_id')
        .select('distinct on (otus.id) otus.id, otus.name, otus.taxon_name_id, COALESCE(o2.id, otus.id) as otu_valid_id')

      f.sort_by.with_index { |item, idx| [(otu_order.index(item.id) || 999), (idx || 999)] }
    end
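The final line of api_autocomplete restores the priority order after the second query, which may return rows in any order. A minimal sketch of that idiom in plain Ruby follows; the Row struct and the sample ids are hypothetical stand-ins for the Otu records.

```ruby
# Hypothetical stand-in for an Otu row returned by the second query.
Row = Struct.new(:id, :name)

otu_order = [7, 3, 9] # priority order from the first query
fetched = [Row.new(3, 'b'), Row.new(9, 'c'), Row.new(7, 'a')] # arbitrary order

# Sort by position in otu_order; ids not found sink to the end (999),
# and the original index breaks ties.
ordered = fetched.sort_by.with_index do |item, idx|
  [(otu_order.index(item.id) || 999), idx]
end

ordered.map(&:id) # => [7, 3, 9]
```

Array#index is a linear scan, so this is only reasonable for small result sets such as the 30-row limit used here.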
#api_autocomplete_extended ⇒ Array
An autocomplete result that permits displaying the TaxonName as originally matched. Note that otu: is really only useful when displaying OTUs without &having_taxon_name_only=true. We don’t, for example, make use of this element there.
    # File 'lib/queries/otu/autocomplete.rb', line 214
    def api_autocomplete_extended
      otu_queries = QUERIES.dup
      otu_queries.delete(:autocomplete_taxon_name)

      base_otus = autocomplete_base(otu_queries).limit(30)
      taxon_name_otus = autocomplete_taxon_name_extended

      r = []

      base_otus.each do |o|
        r.push({
          otu: o, # contains priority
          label_target: o
        })
      end

      taxon_name_otus.each do |o|
        r.push({
          otu: o,
          label_target: (o.label_target_taxon_name_id ? ::TaxonName.find(o.label_target_taxon_name_id) : o.taxon_name ) # is o.taxon_name true?!
        })
      end

      # Keep a unique set of otu + label (to render)
      seen = Set.new

      # The compacted result
      compact = []

      r.each do |h|
        g = h[:label_target].id.to_s + h[:label_target].class.name
        m = [ h[:otu].id, g ]

        next if seen.include?( m )
        seen << m
        compact.push h
      end

      compact.sort!{|c,d| (c[:otu].priority || 999) <=> (d[:otu].priority || 999 )}

      # TODO: Refactor to remove extra query and assignment of otu_valid_id. This is ugly.
      otu_order = compact.collect{|d| d[:otu].id}

      # Extra query is painful.
      f = ::Otu.where(id: otu_order)
        .joins('left join taxon_names t1 on otus.taxon_name_id = t1.id')
        # .joins('left join otus o2 on t1.cached_valid_taxon_name_id = o2.taxon_name_id')
        .joins('left join otus o2 on t1.cached_valid_taxon_name_id = o2.taxon_name_id and o2.taxon_name_id <> otus.taxon_name_id') # See https://github.com/sfg-taxonpages/orthoptera/issues/90
        .select('distinct on (otus.id) otus.id, otus.name, otus.taxon_name_id, COALESCE(o2.id, otus.id) as otu_valid_id')

      compact.each do |h|
        h[:otu_valid_id] = f.select{|j| j.id == h[:otu].id}.first.otu_valid_id
      end

      compact
    end
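The de-duplication step in the middle of this method keys each row on the OTU id plus the label's id and class, so the same OTU can appear once per distinct label. A sketch of that step in plain Ruby, assuming hypothetical ToyOtu and ToyTaxonName structs in place of the real models:

```ruby
require 'set'

# Hypothetical stand-ins for the Otu and TaxonName models.
ToyOtu = Struct.new(:id, :priority)
ToyTaxonName = Struct.new(:id)

otu_a = ToyOtu.new(1, 200)
otu_b = ToyOtu.new(1, 12.5) # the same OTU, reached via a taxon name
otu_c = ToyOtu.new(2, nil)  # no priority assigned
name = ToyTaxonName.new(1)

rows = [
  {otu: otu_a, label_target: otu_a},
  {otu: otu_b, label_target: name},
  {otu: otu_b, label_target: name}, # duplicate otu + label pair
  {otu: otu_c, label_target: otu_c}
]

seen = Set.new
compact = rows.select do |h|
  key = [h[:otu].id, h[:label_target].id.to_s + h[:label_target].class.name]
  seen.add?(key) # returns nil (falsy) when the key was already present
end

# Rows without a priority sort last, as with the 999 default above.
compact.sort_by! { |h| h[:otu].priority || 999 }

compact.map { |h| h[:label_target].class.name }
# => ["ToyTaxonName", "ToyOtu", "ToyOtu"]
```

OTU 1 survives twice because its two labels differ, which is the case compact_priorities cannot represent.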
#autocomplete ⇒ Object
    # File 'lib/queries/otu/autocomplete.rb', line 286
    def autocomplete
      compact_priorities( autocomplete_base.limit(40) )
    end
#autocomplete_base(targets = QUERIES) ⇒ Object
    # File 'lib/queries/otu/autocomplete.rb', line 290
    def autocomplete_base(targets = QUERIES)
      queries = []

      targets.each do |q, p|
        if self.respond_to?(q)
          a = send(q)
          next if a.nil? # query has returned nil

          y = p[:priority]

          a = scope_autocomplete(a)
          a = a.select("otus.*, #{y} as priority") unless y.nil?

          queries.push a
        end
      end

      queries.compact!

      q = referenced_klass_union(queries).order('priority')

      q = include_common_names ? q.includes(:common_names) : q
      q = include_taxon_name ? q.includes(:taxon_name) : q

      q
    end
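The dispatch pattern in autocomplete_base (look up each key as a method, skip missing methods and nil results, tag hits with the configured priority, then order by priority) can be sketched without ActiveRecord. Everything below is hypothetical: ToyAutocomplete, its STRATEGIES hash, and the in-memory sort that stands in for the SQL union and ORDER BY.

```ruby
# Hypothetical stand-in: each strategy returns an array of names
# (or nil) instead of an ActiveRecord scope.
class ToyAutocomplete
  STRATEGIES = {
    name_exact: {priority: 1},
    name_start_match: {priority: 200},
    not_implemented: {priority: 50} # skipped: no such method exists
  }.freeze

  NAMES = ['Aus', 'Aus bus', 'Bus'].freeze

  def initialize(query_string)
    @query_string = query_string
  end

  def name_exact
    NAMES.select { |n| n == @query_string }
  end

  def name_start_match
    NAMES.select { |n| n.downcase.start_with?(@query_string.downcase) }
  end

  # Check the method exists, call it, skip nil results, and tag each
  # hit with the strategy's priority before sorting.
  def autocomplete_base(targets = STRATEGIES)
    rows = []
    targets.each do |method_name, options|
      next unless respond_to?(method_name)
      hits = send(method_name)
      next if hits.nil?
      hits.each { |name| rows << {name: name, priority: options[:priority]} }
    end
    rows.sort_by.with_index { |row, i| [row[:priority], i] } # stable sort
  end
end

result = ToyAutocomplete.new('Aus').autocomplete_base
result.map { |r| [r[:name], r[:priority]] }
# => [["Aus", 1], ["Aus", 200], ["Aus bus", 200]]
```

'Aus' appears twice because two strategies matched it; removing such repeats is the job of compact_priorities.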
#autocomplete_taxon_name ⇒ Scope
Returns the result of a TaxonName autocomplete, maintaining the order returned and re-casting the result in terms of an OTU query. Expensive, but maintaining order is key.
    # File 'lib/queries/otu/autocomplete.rb', line 115
    def autocomplete_taxon_name
      taxon_names = Queries::TaxonName::Autocomplete.new(query_string, exact:, project_id:).autocomplete # an array, not a query
      ids = taxon_names.collect{|n| n.is_combination? ? n.cached_valid_taxon_name_id : n.id} # TODO: Experiment with :cached_valid_taxon_name_id) # We assume we want to land on Valid OTUs, but see #
      return nil if ids.empty?

      min = 10.0
      max = 20.0
      scale = (max - min) / ids.count.to_f

      # TODO: optimize *
      base_query.select("otus.*, ((#{min} + row_number() OVER ())::float * #{scale}) as priority") # small incrementing numbers for priority
        .joins("INNER JOIN ( SELECT unnest(ARRAY[#{ids.join(',')}]) AS id, row_number() OVER () AS row_num ) AS id_order ON otus.taxon_name_id = id_order.id")
        .order('id_order.row_num')
    end
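The priority arithmetic in the SQL above can be checked in plain Ruby. This is a sketch that assumes the expression is evaluated as it reads in the listing, that is (min + row_number) * scale; the method name slotted_priorities and the sample ids are hypothetical.

```ruby
# Reproduces the arithmetic of the SQL priority expression in Ruby.
# `ids` is hypothetical data standing in for the TaxonName result ids.
def slotted_priorities(ids, min: 10.0, max: 20.0)
  return [] if ids.empty? # nothing to slot, and avoids a zero divisor

  scale = (max - min) / ids.count.to_f
  ids.each_with_index.map do |id, index|
    row_number = index + 1 # SQL row_number() starts at 1
    [id, (min + row_number) * scale]
  end
end

slotted_priorities([101, 102, 103, 104])
# => [[101, 27.5], [102, 30.0], [103, 32.5], [104, 35.0]]
```

Under that reading, four ids give a scale of 2.5 and priorities from 27.5 to 35.0, while ten ids give a scale of 1.0 and priorities from 11.0 to 20.0. The values increase with result order in every case, which is what the final ordering relies on.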
#autocomplete_taxon_name_extended ⇒ Object
    # File 'lib/queries/otu/autocomplete.rb', line 158
    def autocomplete_taxon_name_extended
      taxon_names = Queries::TaxonName::Autocomplete.new(query_string, exact:, project_id:).autocomplete # an array, not a query

      ids = taxon_names.collect{|n|
        [
          (n.is_combination? ? n.cached_valid_taxon_name_id : n.id), # Points to the OTU target, if there is one
          n.id, # points to the label target
        ]
      }

      return ::Otu.none if ids.empty?

      ids.uniq!

      min = 10.0
      max = 20.0
      scale = (max - min) / ids.count.to_f

      # TODO: optimize *
      otus = base_query
        .select(<<~SQL.squish)
        .joins(<<~SQL.squish)
          INNER JOIN (
            SELECT
              unnest(ARRAY[#{ids.map(&:first).join(',')}]) AS id,
              unnest(ARRAY[#{ids.map(&:last).join(',')}]) AS label_target_taxon_name_id,
              row_number() OVER () AS row_num
          ) AS id_order
          ON otus.taxon_name_id = id_order.id
        SQL
        .order('id_order.row_num')

      otus = scope_autocomplete(otus)

      # We could currently get away with using .includes here, but if we were
      # to ever filter or group `otus` on a non-otu table like id_order then
      # .includes would do a join on the associated table below and we could
      # get duplicate otu.id result rows that would be de-duplicated by rails,
      # losing vital (non-dup) id_order info. So just always do preload here
      # instead.
      otus = include_taxon_name ? otus.preload(:taxon_name) : otus
      otus = include_common_names ? otus.preload(:common_names) : otus

      otus
    end
#base_query ⇒ Object
    # File 'lib/queries/otu/autocomplete.rb', line 84
    def base_query
      q = ::Otu.all
      q = q.where(project_id:) if project_id.any?
      q
    end
#compact_priorities(otus) ⇒ Object
Doesn’t work for extended, as we can have the same OTU with different labels
    # File 'lib/queries/otu/autocomplete.rb', line 273
    def compact_priorities(otus)
      # Mmmmarg!
      # We may have the same name at different priorities, strike all but the highest/first.
      r = []
      i = {}
      otus.each do |o|
        next if i[o.id]
        r.push o
        i[o.id] = true
      end
      r
    end
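For an in-memory array that is already sorted by priority, the same "keep the first occurrence of each id" behaviour can be sketched with Array#uniq. ToyRecord, compact_first_seen, and the sample data are hypothetical, and this is an equivalent for plain arrays rather than the method used in the source.

```ruby
# Hypothetical stand-in for an Otu row carrying a priority column.
ToyRecord = Struct.new(:id, :priority)

# Array#uniq with a block keeps the first element seen for each key,
# so on priority-ordered input the best-ranked copy survives.
def compact_first_seen(records)
  records.uniq(&:id)
end

records = [
  ToyRecord.new(5, 1),
  ToyRecord.new(8, 10),
  ToyRecord.new(5, 200), # repeat of OTU 5 at a worse priority
  ToyRecord.new(8, 220), # repeat of OTU 8 at a worse priority
  ToyRecord.new(2, 1000)
]

compact_first_seen(records).map { |r| [r.id, r.priority] }
# => [[5, 1], [8, 10], [2, 1000]]
```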
#otu_name_exact ⇒ Object
    # File 'lib/queries/otu/autocomplete.rb', line 90
    def otu_name_exact
      base_query.where(otus: {name: query_string})
    end
#otu_name_similarity ⇒ Object
All records that meet the similarity cutoff. This is intended as a generic replacement for wildcarded results.
Observations:
- was similarity(), experimenting with word_similarity
- 3 letter matches are going to be low probability, matches kick in at 4
    # File 'lib/queries/otu/autocomplete.rb', line 105
    def otu_name_similarity
      base_query
        .where('otus.name % ?', query_string)
        .where( ApplicationRecord.sanitize_sql_array(["word_similarity('%s', otus.name) > 0.33", query_string]))
        .order('otus.name, length(otus.name)')
    end
#otu_name_start_match ⇒ Object
    # File 'lib/queries/otu/autocomplete.rb', line 94
    def otu_name_start_match
      base_query.where('otus.name ilike ?', query_string + '%')
    end
#scope_autocomplete(query) ⇒ Object
    # File 'lib/queries/otu/autocomplete.rb', line 318
    def scope_autocomplete(query)
      query = query.joins(:taxon_name) if with_taxon_name
      query = query.where.missing(:taxon_name) if with_taxon_name == false
      query = query.where(otus: {name: nil}) if having_taxon_name_only
      query
    end
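The three-state handling of with_taxon_name (true, false, nil) together with having_taxon_name_only can be sketched over plain arrays. SketchOtu and scope_sketch are hypothetical, and the presence of taxon_name_id is used here as an approximation of the inner join and the where.missing check.

```ruby
# Hypothetical stand-in for an Otu row.
SketchOtu = Struct.new(:name, :taxon_name_id)

def scope_sketch(otus, with_taxon_name: nil, having_taxon_name_only: false)
  # true: must have a taxon name (approximates the inner join)
  otus = otus.select { |o| o.taxon_name_id } if with_taxon_name
  # false: must not have one (approximates where.missing); nil: no filter
  otus = otus.reject { |o| o.taxon_name_id } if with_taxon_name == false
  # only OTUs that have no name of their own
  otus = otus.select { |o| o.name.nil? } if having_taxon_name_only
  otus
end

otus = [
  SketchOtu.new('Aus sp. 1', nil),
  SketchOtu.new(nil, 10),
  SketchOtu.new('my label', 11)
]

scope_sketch(otus, with_taxon_name: true).size         # => 2
scope_sketch(otus, with_taxon_name: false).size        # => 1
scope_sketch(otus).size                                # => 3
scope_sketch(otus, having_taxon_name_only: true).size  # => 1
```

The explicit `== false` comparison matters: a plain falsy check would treat nil the same as false and filter when no filter was requested.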