Module: Export::Dwca
- Defined in: lib/export/dwca.rb, lib/export/dwca/data.rb
Defined Under Namespace
Modules: GbifProfile
Classes: Data
Constant Summary
- INDEX_VERSION =
Each version records a date on which the indexing changed significantly enough that all or most of the index should be regenerated. To add a version, generate a timestamp with `Time.now` in IRB and append it to the list.
```ruby
[
  '2021-10-12 17:00:00.000000 -0500', # First major refactor
  '2021-10-15 17:00:00.000000 -0500', # Minor Excludes footprintWKT, and references to GeographicArea in gazetteer; new form of media links
  '2021-11-04 17:00:00.000000 -0500', # Minor Removes '|', fixes some mappings
  '2021-11-08 13:00:00.000000 -0500', # PENDING: Minor Adds depth mappings
  '2021-11-30 13:00:00.000000 -0500', # Fix inverted long,lat
  '2022-01-21 16:30:00.000000 -0500', # basisOfRecord can now be FossilSpecimen; occurrenceId exporting; adds redundant time fields
  '2022-03-31 16:30:00.000000 -0500', # collectionCode, occurrenceRemarks and various small fixes
  '2022-04-28 16:30:00.000000 -0500', # add dwcOccurrenceStatus
  '2022-09-28 16:30:00.000000 -0500'  # add phylum, class, order, higherClassification
]
```
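The page does not show how `INDEX_VERSION` is consumed. As a minimal sketch, assuming each indexed record carries a timestamp of when it was last indexed, a staleness check could compare that timestamp against the newest entry. `INDEX_VERSION_SAMPLE` and `stale_index?` are hypothetical names used only for illustration.

```ruby
require 'time'

# Abbreviated copy of the version list, for illustration only.
INDEX_VERSION_SAMPLE = [
  '2021-10-12 17:00:00.000000 -0500',
  '2022-04-28 16:30:00.000000 -0500',
  '2022-09-28 16:30:00.000000 -0500'
].freeze

# Hypothetical helper: true when a record was last indexed before the
# newest version entry, i.e. it would need to be re-indexed.
def stale_index?(indexed_at, versions = INDEX_VERSION_SAMPLE)
  latest = versions.map { |v| Time.parse(v) }.max
  indexed_at < latest
end

stale_index?(Time.parse('2022-05-01 00:00:00 -0500')) # indexed before the newest entry
stale_index?(Time.parse('2023-01-01 00:00:00 -0500')) # indexed after the newest entry
```

Parsing the strings with `Time.parse` keeps the UTC offset, so comparisons stay correct across time zones.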
Class Method Summary
- .build_index_async(klass, record_scope, predicate_extension_params: {}) ⇒ Object
  Re-indexes a large set of data in the background.
- .download_async(record_scope, request = nil, predicate_extension_params: {}) ⇒ Download
  The download object containing the archive.
- .index_metadata(klass, record_scope) ⇒ Object
Class Method Details
.build_index_async(klass, record_scope, predicate_extension_params: {}) ⇒ Object
Re-indexing a large set of data runs in the background. To determine when it is done, poll for the last record to be indexed.
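```ruby
# File 'lib/export/dwca.rb', line 62
def self.build_index_async(klass, record_scope, predicate_extension_params: {})
  # Order by id so that the first and last records of the scope are well defined
  s = record_scope.order(:id)
  # The scope is passed to the job as a SQL string
  ::DwcaCreateIndexJob.perform_later(klass.to_s, sql_scope: s.to_sql)
  # Return the metadata (total, start time, sample) that callers can poll against
  index_metadata(klass, s)
end
```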
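The polling step is described but not shown. The following is a minimal sketch, assuming the caller holds the metadata hash returned above and has some way to ask whether a given record has been indexed. `wait_for_index` and the `indexed:` callable are hypothetical, and the global ID strings are illustrative values only.

```ruby
# Hypothetical poller: `indexed` is any callable that reports whether the
# record behind a global ID string has been indexed yet.
def wait_for_index(metadata, indexed:, interval: 0.0, max_attempts: 10)
  last = metadata[:sample].last
  return true if last.nil? # empty scope, nothing to wait for

  max_attempts.times do
    return true if indexed.call(last)
    sleep(interval)
  end
  false # gave up before the last record was reported as indexed
end

# Usage with a fake check that succeeds on the third call.
calls = 0
fake_check = ->(_gid) { (calls += 1) >= 3 }
metadata = { total: 2, sample: ['gid://app/CollectionObject/1', 'gid://app/CollectionObject/9'] }
finished = wait_for_index(metadata, indexed: fake_check)
```

Because the job processes the scope ordered by id, checking only the last sampled record is a cheap proxy for completion of the whole run.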
.download_async(record_scope, request = nil, predicate_extension_params: {}) ⇒ Download
Returns the download object containing the archive.
```ruby
# File 'lib/export/dwca.rb', line 33
def self.download_async(record_scope, request = nil, predicate_extension_params: {})
  name = "dwc-a_#{DateTime.now}.zip"

  # Create the download record up front so it can be returned immediately
  download = ::Download::DwcArchive.create!(
    name: "DwC Archive generated at #{Time.now}.",
    description: 'A Darwin Core archive.',
    filename: name,
    request: request,
    expires: 2.days.from_now,
    total_records: record_scope.size # Was having problems with count() TODO: increment after when extensions are allowed.
  )

  # Note we pass a string with the record scope
  ::DwcaCreateDownloadJob.perform_later(download, core_scope: record_scope.to_sql, predicate_extension_params: predicate_extension_params)

  download
end
```
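The pattern here is to create a placeholder record, queue the work with the scope serialized as a SQL string, and return the placeholder right away. The sketch below imitates that flow in plain Ruby without Rails. `DownloadStub`, `JOB_QUEUE`, and `download_async_sketch` are hypothetical stand-ins, not part of the library.

```ruby
# Stand-ins for the real model and job queue; all names here are hypothetical.
DownloadStub = Struct.new(:name, :filename, :expires, :total_records, keyword_init: true)
JOB_QUEUE = []

def download_async_sketch(scope_sql, total_records, now: Time.now)
  download = DownloadStub.new(
    name: "DwC Archive generated at #{now}.",
    filename: "dwc-a_#{now}.zip",
    expires: now + (2 * 24 * 60 * 60), # two days, as in `2.days.from_now`
    total_records: total_records
  )
  # Queue a plain String rather than a live scope object, so the job
  # arguments are trivially serializable.
  JOB_QUEUE << { download: download, core_scope: scope_sql }
  download # returned immediately; the archive itself is built later
end

placeholder = download_async_sketch('SELECT * FROM collection_objects', 42)
```

The caller receives the placeholder before any archive exists, so anything that needs the file has to wait for the background job to finish.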
.index_metadata(klass, record_scope) ⇒ Object
```ruby
# File 'lib/export/dwca.rb', line 68
def self.index_metadata(klass, record_scope)
  a = record_scope.first&.to_global_id&.to_s
  b = record_scope.last&.to_global_id&.to_s

  t = record_scope.size # was having problems with count

  metadata = {
    total: t,
    start_time: Time.now,
    sample: [a, b].compact
  }

  if b && (t > 2)
    max = 9
    max = t if t < 9

    # Pick up to `max` records spaced evenly through the scope by row number
    ids = klass
      .select('*')
      .from("(select id, type, ROW_NUMBER() OVER (ORDER BY id ASC) rn from (#{record_scope.to_sql}) b ) a")
      .where("a.rn % ((SELECT COUNT(*) FROM (#{record_scope.to_sql}) c) / #{max}) = 0")
      .limit(max)
      .collect{|o| o.to_global_id.to_s}

    metadata[:sample].insert(1, *ids)
  end

  metadata[:sample].uniq!
  metadata
end
```
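The SQL above keeps rows whose row number is a multiple of `total / max`, then places them between the first and last records. As a rough plain-Ruby analogue, assuming the scope is ordered by id and using integer ids in place of global ID strings, the selection could look like this. `sample_ids` is a hypothetical helper for illustration only.

```ruby
# Plain-Ruby analogue of the SQL sampling: keep rows whose 1-based row number
# is a multiple of (total / max), capped at max rows, framed by first and last.
def sample_ids(ids, max = 9)
  sorted = ids.sort
  total = sorted.size
  return sorted if total <= 2 # mirrors the `t > 2` guard: first/last suffice

  max = total if total < max
  step = total / max # integer division, as in the SQL expression
  picked = sorted.each_with_index
                 .select { |_id, i| ((i + 1) % step).zero? }
                 .map(&:first)
                 .first(max)
  ([sorted.first] + picked + [sorted.last]).uniq
end

sample_ids((1..100).to_a) # step is 100 / 9 = 11, so rows 11, 22, ..., 99 are picked
```

Note that integer division makes the spacing uneven near the end of large scopes, and for scopes of fewer than 18 records the step is 1, so the sample is simply the first `max` records plus the last one.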