Class: DwcOccurrence
- Inherits:
-
ApplicationRecord
- Object
- ActiveRecord::Base
- ApplicationRecord
- DwcOccurrence
- Includes:
- Housekeeping
- Defined in:
- app/models/dwc_occurrence.rb
Overview
A Darwin Core Record for the Occurrence core. Fields are generated from Ruby dwc-meta, which references the same spec that is used in the IPT and the Dwc Assistant. Each record references a specific CollectionObject or AssertedDistribution.
Important: This is a cache/index; data here are periodically destroyed and regenerated from multiple tables in TW.
DWC attributes are camelCase to facilitate matching. dwcClass is a replacement for the Rails reserved ‘Class’.
All DC attributes (attributes not in DwcOccurrence::TW_ATTRIBUTES) in this table are namespaced to dc (“purl.org/dc/terms/”, “rs.tdwg.org/dwc/terms/”)
README:
There is a two-part strategy to building the index. 1) An individual record will rebuild on request, via a parameter to `collection_objects/123/dwc*?build=true`.
2) Wipe and rebuild on some schedule. It would in theory be possible to track and rebuild whenever a record of any class that contributes a property was created (or updated); however,
this is a lot of overhead to inject/code for a lot of models. It would also inject latency at numerous stages, which could impact UI performance.
TODO: The basisOfRecord CVTs are not super informative.
We know collection object is definitely 1:1 with PreservedSpecimen, however
AssertedDistribution could be HumanObservation (if source is person), or ... what, if
it's a published record? Seems we need a 'PublishedAssertation', just like we model the data.
Gotchas.
* updated_at is set by touching the record, not via housekeeping.
Constant Summary
- DC_NAMESPACE =
'http://rs.tdwg.org/dwc/terms/'.freeze
- TW_ATTRIBUTES =
Not yet implemented, but likely needed (at an even higher level) ? :id
[ :id, :project_id, :created_at, :updated_at, :created_by_id, :updated_by_id, :dwc_occurrence_object_type, :dwc_occurrence_object_id ].freeze
- HEADER_CONVERTERS =
{ 'dwcClass' => 'class', }.freeze
Instance Attribute Summary
-
#occurrence_identifier ⇒ Object
Returns the value of attribute occurrence_identifier.
Class Method Summary
- .annotates? ⇒ Boolean
-
.asserted_distributions_join ⇒ ActiveRecord::Relation
that matches, consider moving to Shared.
-
.by_collection_object_filter(filter_scope: nil, project_id: nil) ⇒ Object
TODO: use filters Return scopes by a collection object filter.
-
.collection_objects_join ⇒ ActiveRecord::Relation
that matches, consider moving to Shared.
-
.computed_columns ⇒ Scope
The columns inferred to have data.
-
.empty_fields ⇒ Array
Of column names as symbols that are blank in ALL projects (not just this one).
-
.excluded_columns ⇒ Array
Of symbols.
-
.scoped_by_otu(otu) ⇒ Scope
TODO: Move to DwcOccurrence filter.
- .stale(kind = 'CollectionObject') ⇒ Object
-
.sweep ⇒ Object
Delete all stale indices, where stale = object is missing.
-
.target_columns ⇒ Array
!! TODO: When we come to adding AssertedDistributions, FieldOccurrences, etc.
Instance Method Summary
-
#as_json(options = {}) ⇒ Object
Strip nils when `to_json` used.
- #asserted_distribution ⇒ Object
- #basis ⇒ Object
- #collecting_event ⇒ Object
- #collection_object ⇒ Object
- #create_object_uuid ⇒ Object protected
-
#dwc_json ⇒ Object
Hash: legally formatted DwC fields only, with things like `dwcClass` translated; only fields with values returned; keys are sorted.
-
#generate_uuid_if_required(force = false) ⇒ Object
TODO: quick check if occurrenceID exists in table?! <-> locking sync !?.
-
#is_stale? ⇒ Boolean
!! This is a spot check, it’s not (yet) coded to be comprehensive.
- #is_stale_metadata ⇒ Object
- #otu ⇒ Object
- #set_metadata_attributes ⇒ Object protected
- #uuid_identifier_scope ⇒ Object
Methods included from Housekeeping
#has_polymorphic_relationship?
Methods inherited from ApplicationRecord
Instance Attribute Details
#occurrence_identifier ⇒ Object
Returns the value of attribute occurrence_identifier.
# File 'app/models/dwc_occurrence.rb', line 87
def occurrence_identifier
  @occurrence_identifier
end
Class Method Details
.annotates? ⇒ Boolean
# File 'app/models/dwc_occurrence.rb', line 124
def self.annotates?
  false
end
.asserted_distributions_join ⇒ ActiveRecord::Relation
that matches, consider moving to Shared
# File 'app/models/dwc_occurrence.rb', line 142
def self.asserted_distributions_join
  a = arel_table
  b = ::AssertedDistribution.arel_table
  j = a.join(b).on(a[:dwc_occurrence_object_type].eq('AssertedDistribution').and(a[:dwc_occurrence_object_id].eq(b[:id])))
  joins(j.join_sources)
end
.by_collection_object_filter(filter_scope: nil, project_id: nil) ⇒ Object
TODO: use filters Return scopes by a collection object filter
# File 'app/models/dwc_occurrence.rb', line 187
def self.by_collection_object_filter(filter_scope: nil, project_id: nil)
  return DwcOccurrence.none if project_id.nil? || filter_scope.nil?

  c = ::CollectionObject.arel_table
  d = arel_table

  # TODO: hackish
  k = ::CollectionObject.select('coscope.id').from( '(' + filter_scope.to_sql + ') as coscope ' )

  a = self.collection_objects_join
    .where('dwc_occurrences.project_id = ?', project_id)
    .where(dwc_occurrence_object_id: k)
    .select(::DwcOccurrence.target_columns) # TODO !! Will have to change when AssertedDistribution and other types merge in
  a
end
.collection_objects_join ⇒ ActiveRecord::Relation
that matches, consider moving to Shared
# File 'app/models/dwc_occurrence.rb', line 133
def self.collection_objects_join
  a = arel_table
  b = ::CollectionObject.arel_table
  j = a.join(b).on(a[:dwc_occurrence_object_type].eq('CollectionObject').and(a[:dwc_occurrence_object_id].eq(b[:id])))
  joins(j.join_sources)
end
.computed_columns ⇒ Scope
Returns the columns inferred to have data.
# File 'app/models/dwc_occurrence.rb', line 239
def self.computed_columns
  select(target_columns)
end
.empty_fields ⇒ Array
Returns an Array of column names as symbols that are blank in ALL projects (not just this one).
# File 'app/models/dwc_occurrence.rb', line 205
def self.empty_fields
  empty_in_all_projects = ActiveRecord::Base.connection.execute("select attname from pg_stats where tablename = 'dwc_occurrences' and most_common_vals is null and most_common_freqs is null and histogram_bounds is null and correlation is null and null_frac = 1;").pluck('attname').map(&:to_sym)

  empty_in_all_projects # - target_columns
end
.excluded_columns ⇒ Array
Returns an Array of symbols.
# File 'app/models/dwc_occurrence.rb', line 233
def self.excluded_columns
  ::DwcOccurrence.columns.collect{|c| c.name.to_sym} - (self.target_columns - [:dwc_occurrence_object_id, :dwc_occurrence_object_type])
end
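The relationship between `.target_columns` and `.excluded_columns` is plain array subtraction. The following is a minimal sketch in plain Ruby, assuming a small hypothetical column list; `ALL_COLUMNS`, `MAPPED_KEYS`, and the `*_sketch` method names are invented for illustration and are not part of the model.

```ruby
# Hypothetical column list standing in for DwcOccurrence.columns (assumption).
ALL_COLUMNS = %i[
  id project_id occurrenceID basisOfRecord
  dwc_occurrence_object_id dwc_occurrence_object_type scientificName
].freeze

# Hypothetical stand-in for CollectionObject::DwcExtensions::DWC_OCCURRENCE_MAP.keys.
MAPPED_KEYS = %i[scientificName].freeze

# Columns selected for export, including the two join-only columns.
def target_columns_sketch
  %i[id occurrenceID basisOfRecord dwc_occurrence_object_id dwc_occurrence_object_type] + MAPPED_KEYS
end

# Everything that is not an export column. The join-only columns are
# subtracted from the target list first, so they end up excluded.
def excluded_columns_sketch
  ALL_COLUMNS - (target_columns_sketch - %i[dwc_occurrence_object_id dwc_occurrence_object_type])
end

excluded = excluded_columns_sketch
puts excluded.inspect
```

The two polymorphic reference columns are needed in joins but appear in the excluded list, which matches the comment in `.target_columns` about trimming them afterwards.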
.scoped_by_otu(otu) ⇒ Scope
TODO: Move to DwcOccurrence filter
# File 'app/models/dwc_occurrence.rb', line 155
def self.scoped_by_otu(otu)
  a,b = nil, nil

  if otu.taxon_name_id.present?
    a = ::Queries::DwcOccurrence::Filter.new(
      asserted_distribution_query: {
        taxon_name_query: {
          taxon_name_id: otu.taxon_name_id,
          descendants: false, # include self
          synonymify: true
        } })

    b = ::Queries::DwcOccurrence::Filter.new(
      collection_object_query: {
        taxon_name_query: {
          taxon_name_id: otu.taxon_name_id,
          descendants: false, # include self
          synonymify: true
        } })
  else
    a = ::Queries::DwcOccurrence::Filter.new(
      asserted_distribution_query: { otu_id: otu.id})

    b = ::Queries::DwcOccurrence::Filter.new(
      collection_object_query: { otu_query: { otu_id: otu.id}})
  end

  from("((#{a.all.to_sql}) UNION (#{b.all.to_sql})) as dwc_occurrences")
end
.stale(kind = 'CollectionObject') ⇒ Object
# File 'app/models/dwc_occurrence.rb', line 118
def self.stale(kind = 'CollectionObject')
  tbl = kind.tableize
  DwcOccurrence.joins("LEFT JOIN #{tbl} tbl on dwc_occurrences.dwc_occurrence_object_id = tbl.id")
    .where("tbl.id IS NULL and dwc_occurrences.dwc_occurrence_object_type = '#{kind}'")
end
.sweep ⇒ Object
Delete all stale indices, where stale = object is missing
# File 'app/models/dwc_occurrence.rb', line 111
def self.sweep
  %w{CollectionObject AssertedDistribution}.each do |k|
    stale(k).delete_all
  end
  true
end
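The idea behind `.stale` and `.sweep` (an index row is stale when the object it points at no longer exists) can be sketched without a database. This is an illustrative in-memory version only; `IndexRow`, `stale_rows`, and `sweep_sketch` are hypothetical names, and the real methods do this with a LEFT JOIN and `delete_all`.

```ruby
require 'set'

# Hypothetical in-memory stand-in for a dwc_occurrences row.
IndexRow = Struct.new(:id, :object_type, :target_id)

# Rows of the given kind whose referenced object id is not among the
# ids that still exist (the in-memory analogue of "tbl.id IS NULL").
def stale_rows(index_rows, existing_ids, kind = 'CollectionObject')
  present = existing_ids.to_set
  index_rows.select { |row| row.object_type == kind && !present.include?(row.target_id) }
end

# Remove stale rows for every kind and return what remains.
def sweep_sketch(index_rows, existing_ids_by_kind)
  stale = existing_ids_by_kind.flat_map { |kind, ids| stale_rows(index_rows, ids, kind) }
  index_rows - stale
end

rows = [
  IndexRow.new(1, 'CollectionObject', 10),
  IndexRow.new(2, 'CollectionObject', 11),     # object 11 was deleted
  IndexRow.new(3, 'AssertedDistribution', 10)  # no asserted distributions remain
]

kept = sweep_sketch(rows, { 'CollectionObject' => [10], 'AssertedDistribution' => [] })
puts kept.map(&:id).inspect
```

Note that the kind must be checked together with the id: row 3 shares the id 10 with an existing CollectionObject but is still stale because it refers to an AssertedDistribution.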
.target_columns ⇒ Array
!! TODO: When we come to adding AssertedDistributions, FieldOccurrences, etc. we will have to make this more flexible
# File 'app/models/dwc_occurrence.rb', line 222
def self.target_columns
  [:id, # must be in position 0
   :occurrenceID,
   :basisOfRecord,
   :dwc_occurrence_object_id,   # !! We don't want this, but need it in joins, it is removed in trim via `.excluded_columns` below
   :dwc_occurrence_object_type, # !! ^
  ] + CollectionObject::DwcExtensions::DWC_OCCURRENCE_MAP.keys
end
Instance Method Details
#as_json(options = {}) ⇒ Object
Strip nils when `to_json` used
# File 'app/models/dwc_occurrence.rb', line 60
def as_json(options = {})
  super(options.merge(except: attributes.keys.select{ |key| self[key].nil? }))
end
#asserted_distribution ⇒ Object
# File 'app/models/dwc_occurrence.rb', line 93
def asserted_distribution
  dwc_occurrence_object_type == 'AssertedDistribution' ? dwc_occurrence_object : nil
end
#basis ⇒ Object
# File 'app/models/dwc_occurrence.rb', line 243
def basis
  case dwc_occurrence_object_type
  when 'CollectionObject'
    if dwc_occurrence_object.is_fossil?
      return 'FossilSpecimen'
    else
      return 'PreservedSpecimen'
    end
  when 'AssertedDistribution'
    # Used to fork b/b Source::Human and Source::Bibtex:
    case dwc_occurrence_object.source&.type || dwc_occurrence_object.sources.order(cached_nomenclature_date: :DESC).first.type
    when 'Source::Bibtex'
      return 'MaterialCitation'
    when 'Source::Human'
      return 'HumanObservation'
    else # Not recommended at this point
      return 'Occurrence'
    end
  end

  'Undefined'
end
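The decision table in `#basis` can be read more easily when separated from the ActiveRecord lookups. The sketch below is a plain-Ruby restatement under stated assumptions: `basis_sketch` and its `fossil:` and `source_type:` parameters are hypothetical, standing in for `is_fossil?` and the source type resolved from the asserted distribution.

```ruby
# Map an indexed object's type (plus a little context) to a basisOfRecord value.
# fossil:      stand-in for dwc_occurrence_object.is_fossil? (assumption)
# source_type: stand-in for the resolved source's type string (assumption)
def basis_sketch(object_type, fossil: false, source_type: nil)
  case object_type
  when 'CollectionObject'
    fossil ? 'FossilSpecimen' : 'PreservedSpecimen'
  when 'AssertedDistribution'
    case source_type
    when 'Source::Bibtex' then 'MaterialCitation'
    when 'Source::Human'  then 'HumanObservation'
    else 'Occurrence' # fallback, flagged as not recommended in the original
    end
  else
    'Undefined'
  end
end

puts basis_sketch('CollectionObject')
puts basis_sketch('CollectionObject', fossil: true)
puts basis_sketch('AssertedDistribution', source_type: 'Source::Bibtex')
```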
#collecting_event ⇒ Object
# File 'app/models/dwc_occurrence.rb', line 97
def collecting_event
  collection_object&.collecting_event
end
#collection_object ⇒ Object
# File 'app/models/dwc_occurrence.rb', line 89
def collection_object
  dwc_occurrence_object_type == 'CollectionObject' ? dwc_occurrence_object : nil
end
#create_object_uuid ⇒ Object (protected)
# File 'app/models/dwc_occurrence.rb', line 361
def create_object_uuid
  @occurrence_identifier = Identifier::Global::Uuid::TaxonworksDwcOccurrence.create!(
    identifier_object: dwc_occurrence_object,
    by: dwc_occurrence_object&.creator, # revisit, why required?
    project_id: dwc_occurrence_object&.project_id, # Current.project_id, # revisit, why required?
    is_generated: true)
end
#dwc_json ⇒ Object
Returns Hash
- Legally formatted DwC fields only, with things like `dwcClass` translated
- Only fields with values returned
- Keys are sorted.
# File 'app/models/dwc_occurrence.rb', line 69
def dwc_json
  # `reject` rather than `reject!`: the bang form returns nil when nothing is removed
  a = as_json.reject{|k,v| TW_ATTRIBUTES.include?(k.to_sym) || v.nil?}
  HEADER_CONVERTERS.keys.each do |k|
    a[ HEADER_CONVERTERS[k] ] = a.delete(k) if a[k]
  end
  a.sort.to_h
end
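To make the three guarantees of `#dwc_json` concrete, here is a minimal sketch that applies the same steps to a plain Hash instead of a model. The constants mirror the ones listed in the Constant Summary; `dwc_json_sketch` and the sample row are assumptions made for illustration.

```ruby
TW_ATTRIBUTES = %i[
  id project_id created_at updated_at created_by_id updated_by_id
  dwc_occurrence_object_type dwc_occurrence_object_id
].freeze

HEADER_CONVERTERS = { 'dwcClass' => 'class' }.freeze

# attributes: a Hash with String keys, as as_json would produce.
def dwc_json_sketch(attributes)
  # 1. Drop internal (TW) attributes and anything without a value.
  a = attributes.reject { |k, v| TW_ATTRIBUTES.include?(k.to_sym) || v.nil? }
  # 2. Translate internal header names to legal DwC terms.
  HEADER_CONVERTERS.each { |from, to| a[to] = a.delete(from) if a[from] }
  # 3. Sort by key.
  a.sort.to_h
end

row = {
  'id' => 1,
  'project_id' => 2,
  'scientificName' => 'Aus bus',
  'dwcClass' => 'Insecta',
  'country' => nil
}

result = dwc_json_sketch(row)
puts result.inspect
```

The renaming step runs before sorting, so the translated `class` key sorts by its DwC name rather than by `dwcClass`.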
#generate_uuid_if_required(force = false) ⇒ Object
TODO: quick check if occurrenceID exists in table?! <-> locking sync !?
# File 'app/models/dwc_occurrence.rb', line 277
def generate_uuid_if_required(force = false)
  if force # really make sure there is an object to work with
    create_object_uuid if !occurrence_identifier && !dwc_occurrence_object.nil? # TODO: can be simplified when inverse_of/validation added to identifiers
  else # assume if occurrenceID is not blank identifier is present
    if occurrenceID.blank?
      create_object_uuid if !occurrence_identifier && !dwc_occurrence_object.nil? # TODO: can be simplified when inverse_of/validation added to identifiers
    end
  end
end
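The two branches of `#generate_uuid_if_required` reduce to a single boolean condition. The sketch below states that condition in plain Ruby; `uuid_needed?` and its keyword arguments are hypothetical names, and `to_s.strip.empty?` is used as a stand-in for Rails' `blank?` on a string or nil.

```ruby
# True when a new UUID identifier should be created.
# identifier: the already-known occurrence identifier, or nil
# object:     the indexed object, or nil when it is missing
def uuid_needed?(force:, occurrence_id:, identifier:, object:)
  # Never create one if an identifier is already known or there is
  # nothing to attach it to.
  return false if identifier || object.nil?

  # Otherwise create when forced, or when no occurrenceID is recorded.
  force || occurrence_id.to_s.strip.empty?
end

puts uuid_needed?(force: false, occurrence_id: nil, identifier: nil, object: :some_object)
puts uuid_needed?(force: false, occurrence_id: 'abc', identifier: nil, object: :some_object)
puts uuid_needed?(force: true, occurrence_id: 'abc', identifier: nil, object: :some_object)
```

Read this way, `force` only bypasses the "occurrenceID is already set" shortcut; it does not override the checks for an existing identifier or a missing object.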
#is_stale? ⇒ Boolean
!! This is a spot check, it’s not (yet) coded to be comprehensive.
!! You should request a full rebuild (rebuild=true) at display time
!! to ensure an up-to-date individual record
# File 'app/models/dwc_occurrence.rb', line 295
def is_stale?
  case dwc_occurrence_object_type
  when 'CollectionObject'
    times = is_stale_metadata.values

    n = read_attribute(:updated_at)

    times.each do |v|
      return true if v > n
    end

    return false
  else # AssertedDistribution
    return dwc_occurrence_object.updated_at > updated_at
  end
end
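The CollectionObject branch of `#is_stale?` is a timestamp comparison: the index row is stale if any related record was updated after the row itself. A minimal sketch of that comparison, assuming a hypothetical `stale_sketch?` helper and hand-written timestamps in place of the values gathered by `#is_stale_metadata`:

```ruby
# index_updated_at:   when the index row was last touched
# related_timestamps: Hash of label => Time (or nil when nothing is related)
def stale_sketch?(index_updated_at, related_timestamps)
  related_timestamps.compact.values.any? { |t| t > index_updated_at }
end

indexed_at = Time.utc(2024, 1, 10)

fresh   = { collection_object: Time.utc(2024, 1, 9), collecting_event: nil }
changed = fresh.merge(georeferences: Time.utc(2024, 1, 11))

puts stale_sketch?(indexed_at, fresh)
puts stale_sketch?(indexed_at, changed)
```

Because only the related records that are explicitly listed get compared, a check like this can miss changes elsewhere, which is consistent with the warning that it is a spot check rather than a comprehensive one.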
#is_stale_metadata ⇒ Object
# File 'app/models/dwc_occurrence.rb', line 311
def is_stale_metadata
  case dwc_occurrence_object_type
  when 'CollectionObject'

    o = CollectionObject.select(:id, :updated_at, :collecting_event_id).find_by(id: dwc_occurrence_object_id)
    ce = CollectingEvent.select(:id, :updated_at).find_by(id: o.collecting_event_id)

    td = dwc_occurrence_object&.taxon_determinations.order(:position).first

    tdr = if td&.otu&.taxon_name&. != scientificName
            td.updated_at
          else
            nil
          end

    tc = if fieldNumber != o.dwc_field_number
           collecting_event.identifiers.where(type: 'Identifier::Local::FieldNumber').first.updated_at
         else
           nil
         end

    return {
      collection_object: o.updated_at, # Shouldn't be necessary since on_save rebuilds, but cheap here
      collecting_event: ce&.updated_at,
      trip_code: tc,
      taxon_determination: dwc_occurrence_object.taxon_determinations.order(:position)&.first&.updated_at,
      taxon_determination_reorder: tdr,
      taxon_determination_roles: dwc_occurrence_object.taxon_determinations.order(:position)&.first&.updated_at,
      biocuration_classification: dwc_occurrence_object.biocuration_classifications.order(:updated_at).first&.updated_at,
      georeferences: dwc_occurrence_object.georeferences.order(:updated_at).first&.updated_at,
      data_attributes: dwc_occurrence_object.data_attributes.order(:updated_at).first&.updated_at,
      collection_object_roles: dwc_occurrence_object.roles.order(:updated_at).first&.updated_at,
      collecting_event_data_attributes: dwc_occurrence_object.collecting_event&.data_attributes&.order(:updated_at)&.first&.updated_at,
      collecting_event_roles: dwc_occurrence_object.collecting_event&.roles&.order(:updated_at)&.first&.updated_at
      # citations?
      # tags?!
    }.select{|k,v| !v.nil?}

  else # AssertedDistribution
    {
      asserted_distribution: dwc_occurrence_object.updated_at,
      # TODO: Citations
    }
  end
end
#otu ⇒ Object
# File 'app/models/dwc_occurrence.rb', line 101
def otu
  case dwc_occurrence_object_type
  when 'AssertedDistribution'
    dwc_occurrence_object.otu
  when 'CollectionObject'
    collection_object.otu
  end
end
#set_metadata_attributes ⇒ Object (protected)
# File 'app/models/dwc_occurrence.rb', line 369
def set_metadata_attributes
  write_attribute( :basisOfRecord, basis)
  write_attribute( :occurrenceID, occurrence_identifier&.identifier) # TODO: Slightly janky to touch this here, might not be needed with new hooks
end
#uuid_identifier_scope ⇒ Object
# File 'app/models/dwc_occurrence.rb', line 265
def uuid_identifier_scope
  dwc_occurrence_object&.identifiers&.where('identifiers.type like ?', 'Identifier::Global::Uuid%')&.order(:position)
end