Module: Utilities::DarwinCore::Summary
- Defined in: lib/utilities/darwin_core/summary.rb
Overview
Summary helpers for compacted DarwinCore tables. Methods here operate on plain row Hashes and have no TaxonWorks or Rails dependencies.
Constant Summary
- COLUMN_ORDER =
Display order for compacted DwC columns. Derived columns appear after individualCount.
    %w[
      catalogNumber otherCatalogNumbers scientificName scientificNameAuthorship
      individualCount adultMale adultFemale immatureNymph exuvia
      sex lifeStage caste
      eventDate eventTime year month day startDayOfYear endDayOfYear
      country stateProvince county verbatimLocality waterBody habitat
      decimalLatitude decimalLongitude verbatimCoordinates verbatimLatitude verbatimLongitude
      coordinateUncertaintyInMeters geodeticDatum footprintWKT
      verbatimElevation minimumElevationInMeters maximumElevationInMeters
      verbatimDepth minimumDepthInMeters maximumDepthInMeters
      basisOfRecord occurrenceID occurrenceStatus recordNumber fieldNumber eventID
      samplingProtocol preparations typeStatus
      identifiedBy identifiedByID dateIdentified recordedBy recordedByID
      kingdom phylum dwcClass order higherClassification
      superfamily family subfamily tribe subtribe genus
      specificEpithet infraspecificEpithet taxonRank nomenclaturalCode previousIdentifications
      institutionCode institutionID
      georeferenceProtocol georeferenceRemarks georeferenceSources georeferencedBy georeferencedDate
      verbatimSRS verbatimEventDate
      associatedMedia associatedTaxa occurrenceRemarks verbatimLabel
    ].freeze
Class Method Summary
- .count_rows_compacted(rows) ⇒ Integer
  Count rows (with a catalogNumber) that shared their catalogNumber with at least one other row, i.e. rows that were involved in a merge.
- .ordered_headers(headers) ⇒ Array<String>
  Return headers sorted by COLUMN_ORDER, with any unrecognised columns appended at the end.
- .year_before_1700?(event_date) ⇒ Boolean
  Return true if the year portion of event_date is before 1700.
Class Method Details
.count_rows_compacted(rows) ⇒ Integer
Count rows (with a catalogNumber) that shared their catalogNumber with at least one other row — i.e. rows that were involved in a merge. Rows without a catalogNumber are excluded from this count.
    # File 'lib/utilities/darwin_core/summary.rb', line 113

    def self.count_rows_compacted(rows)
      with_catalog_number = rows.select { |r| r['catalogNumber'].to_s.strip.length > 0 }
      grouped = with_catalog_number.group_by { |r| r['catalogNumber'] }
      grouped.sum { |_key, group| group.size > 1 ? group.size : 0 }
    end
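A minimal usage sketch of the counting rule described above. To keep it runnable without the library, the method body is copied into a top-level stand-in (`count_rows_compacted` here is not the module method itself), and the sample rows and catalog numbers are invented for illustration.

```ruby
# Stand-alone copy of the documented method body, so the example runs
# without loading Utilities::DarwinCore::Summary.
def count_rows_compacted(rows)
  with_catalog_number = rows.select { |r| r['catalogNumber'].to_s.strip.length > 0 }
  grouped = with_catalog_number.group_by { |r| r['catalogNumber'] }
  grouped.sum { |_key, group| group.size > 1 ? group.size : 0 }
end

# Hypothetical rows: two share 'CAT-1', one is unique, two have no usable catalogNumber.
rows = [
  { 'catalogNumber' => 'CAT-1', 'sex' => 'male' },
  { 'catalogNumber' => 'CAT-1', 'sex' => 'female' },
  { 'catalogNumber' => 'CAT-2' },
  { 'catalogNumber' => '   ' },          # blank after strip, excluded
  { 'scientificName' => 'Aus bus' }      # key missing, excluded
]

puts count_rows_compacted(rows) # prints 2
```

Only the two `CAT-1` rows are counted. The `CAT-2` row has a catalogNumber but shares it with no other row, so it contributes 0.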
.ordered_headers(headers) ⇒ Array<String>
Return headers sorted by COLUMN_ORDER, with any unrecognised columns appended at the end.
    # File 'lib/utilities/darwin_core/summary.rb', line 101

    def self.ordered_headers(headers)
      ordered = COLUMN_ORDER.select { |h| headers.include?(h) }
      remaining = headers - ordered
      ordered + remaining
    end
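The ordering behaviour can be sketched as follows. This is a stand-alone approximation: `COLUMN_ORDER_SAMPLE` is a hypothetical, abbreviated stand-in for the real `COLUMN_ORDER`, the top-level `ordered_headers` copies the documented method body, and the custom header names are invented.

```ruby
# Hypothetical, shortened stand-in for COLUMN_ORDER (the real constant is much longer).
COLUMN_ORDER_SAMPLE = %w[catalogNumber scientificName individualCount eventDate country].freeze

# Stand-alone copy of the documented method body, using the sample constant.
def ordered_headers(headers)
  ordered = COLUMN_ORDER_SAMPLE.select { |h| headers.include?(h) }
  remaining = headers - ordered
  ordered + remaining
end

headers = %w[country myCustomField catalogNumber eventDate anotherField]

p ordered_headers(headers)
# prints ["catalogNumber", "eventDate", "country", "myCustomField", "anotherField"]
```

Recognised columns come out in constant order regardless of their input position, while unrecognised columns keep their relative input order at the end.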
.year_before_1700?(event_date) ⇒ Boolean
Return true if the year portion of event_date is before 1700. Handles ISO 8601 ranges (slash-separated) by inspecting the start date.
    # File 'lib/utilities/darwin_core/summary.rb', line 124

    def self.year_before_1700?(event_date)
      return false if event_date.nil? || event_date.empty?
      date_part = event_date.include?('/') ? event_date.split('/').first : event_date
      match = date_part.match(/\A(\d{4})/)
      match && match[1].to_i < 1700
    end
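A short sketch of how the check behaves on a few inputs, again using a top-level copy of the documented method body so it runs without the library. The sample dates are invented. Note that, as written, the method returns `nil` rather than `false` when the string does not start with four digits; callers that need a strict Boolean may want to coerce the result.

```ruby
# Stand-alone copy of the documented method body.
def year_before_1700?(event_date)
  return false if event_date.nil? || event_date.empty?
  date_part = event_date.include?('/') ? event_date.split('/').first : event_date
  match = date_part.match(/\A(\d{4})/)
  match && match[1].to_i < 1700
end

p year_before_1700?('1650-03-12')            # prints true
p year_before_1700?('1699-12-31/1701-01-01') # prints true (only the range start is inspected)
p year_before_1700?('1850-06-01')            # prints false
p year_before_1700?(nil)                     # prints false
p year_before_1700?('June 1650')             # prints nil (no leading four-digit year)
```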