Module: Export::Coldp::Files::Taxon
- Defined in:
- lib/export/coldp/files/taxon.rb
Overview
Concepts not mapped:
`namePhrase` - e.g. `sensu lato`; this would come from OTU#name
Notes
- The CoLDP importer has a normalizing step that recognizes that some names no longer point to any OTU.
- CoLDP cannot handle assertions that a name currently treated as invalid was used as a valid name for a previously valid concept, i.e. CoL does not track alternative past concept hierarchies.
TODO: create a map of all possible CoLDP-used IRIs and the ability to populate a project with them automatically
Constant Summary
- IRI_MAP =
    {
      extinct: 'https://api.checklistbank.org/datapackage#Taxon.extinct', # 1,0
      temporal_range_end: 'https://api.checklistbank.org/datapackage#Taxon.temporal_range_end', # from https://api.checklistbank.org/vocab/geotime
      temporal_range_start: 'https://api.checklistbank.org/datapackage#Taxon.temporal_range_end', # from https://api.checklistbank.org/vocab/geotime
      lifezone: 'https://api.checklistbank.org/datapackage#Taxon.lifezone', # from https://api.checklistbank.org/vocab/lifezone
    }
- SKIPPED_RANKS =
%w{ NomenclaturalRank::Iczn::SpeciesGroup::Superspecies NomenclaturalRank::Iczn::SpeciesGroup::Supersuperspecies }
Class Method Summary
- .according_to_date(otu) ⇒ Object
  Potentially reference Confidence level confidence_validated_at (last time this confidence level was deemed OK).
- .according_to_id(otu) ⇒ Object
  A reference to the publication of the person who established the taxonomic concept. TW has a plurality of sources that reference this concept; it's a straightforward map. It is somewhat unclear how/whether CoL will use this concept.
- .generate(otus, project_members, root_otu_id = nil, reference_csv = nil, prefer_unlabelled_otus: true) ⇒ Object
- .link(otu) ⇒ Object
- .name_phrase(otu, vocab_id) ⇒ Object
  Name phrase is for appended phrases like sensu stricto and sensu lato.
- .predicate_value(otu, predicate) ⇒ Object
- .provisional(otu) ⇒ Object
  return [Boolean, nil]. TODO - reason in TW this is a provisional name.
- .reference_id(sources) ⇒ Object
  “supporting the taxonomic concept”. Potentially all other Citations tied to the Otu; what exactly supports a concept?
- .remarks(otu, taxon_remarks_vocab_id) ⇒ Object
- .scrutinizer(otu) ⇒ Object
  The scrutinizer concept is unused at present. We're looking for the canonical implementation of it before we implement/extrapolate from data here.
- .scrutinizer_date(otu) ⇒ Object
- .scrutinizer_id(otu) ⇒ Object
  ORCID version of above.
Class Method Details
.according_to_date(otu) ⇒ Object
Potentially reference
Confidence level
confidence_validated_at (last time this confidence level was deemed OK)
    # File 'lib/export/coldp/files/taxon.rb', line 86
    def self.according_to_date(otu)
      # a) Dynamic - !! most recent updated_at stamp for *any* OTU tied data -> this is a big grind: if so add cached_touched_on_date to Otu
      # b) modify Confidence level to include date
      # c) review what SFs does in their model
      nil
    end
.according_to_id(otu) ⇒ Object
A reference to the publication of the person who established the taxonomic concept.
TW has a plurality of sources that reference this concept; it's a straightforward map.
It is somewhat unclear how/whether CoL will use this concept.
    # File 'lib/export/coldp/files/taxon.rb', line 79
    def self.according_to_id(otu)
      nil
    end
.generate(otus, project_members, root_otu_id = nil, reference_csv = nil, prefer_unlabelled_otus: true) ⇒ Object
    # File 'lib/export/coldp/files/taxon.rb', line 113
    def self.generate(otus, project_members, root_otu_id = nil, reference_csv = nil, prefer_unlabelled_otus: true)

      # Until we have RC5 articulations we are simplifying handling the fact
      # that one taxon name can be used for many OTUs. Track to see that
      # an OTU with a given taxon name does not already exist
      #   `taxon_name_id: nil` - unify via Ruby hash keys
      observed_taxon_name_ids = { }

      # TODO: optional Taxon.alternativeID field allows inclusion of external identifiers: https://github.com/CatalogueOfLife/coldp#alternativeid-1 https://github.com/CatalogueOfLife/coldp#identifiers
      # e.g., gbif:2704179,col:6W3C4,BOLD:AAJ2287,wikidata:Q157571

      ::CSV.generate(col_sep: "\t") do |csv|

        csv << %w{
          ID
          parentID
          nameID
          namePhrase
          provisional
          accordingToID
          scrutinizer
          scrutinizerID
          scrutinizerDate
          referenceID
          extinct
          temporalRangeStart
          temporalRangeEnd
          environment
          link
          remarks
          modified
          modifiedBy
        }

        taxon_remarks_vocab_id = Predicate.find_by(uri: 'https://github.com/catalogueoflife/coldp#Taxon.remarks',
                                                   project_id: otus[0]&.project_id)&.id
        name_phrase_vocab_id = Predicate.find_by(uri: 'https://github.com/catalogueoflife/coldp#Taxon.namePhrase',
                                                 project_id: otus[0]&.project_id)&.id

        otus.each do |o|

          # !! When a name is a synonym (combination), but that combination has no OTU
          # !! then the parent of the name in the taxon table is nil
          # !! Handle this edge case (probably resolved now)

          # TODO: alter way parent is set to conform to CoLDP status
          # For OTUs with combinations we might have to change the parenthood?!

          parent_id = nil
          if root_otu_id != o.id
            if pid = o.parent_otu_id(skip_ranks: SKIPPED_RANKS, prefer_unlabelled_otus: prefer_unlabelled_otus)
              parent_id = pid
            else
              puts 'WARNING no parent!!'
              # there is no OTU parent for the hierarchy, at present we just flat skip this OTU
              # Curators can use the create OTUs for valid ids to resolve this data issue
              next
            end
          end

          # TODO: This was excluding OTUs that were being excluded downstream previously
          # This should never happen now since parent ambiguity is caught above!
          #   can be removed in theory

          # TODO: remove once RC5 better modelled
          next if observed_taxon_name_ids[o.taxon_name_id]
          observed_taxon_name_ids[o.taxon_name_id] = nil

          # TODO: Use o.coordinate_otus to summarize across different instances of the OTU

          sources = o.sources
          source = o.source

          parent_id = (root_otu_id == o.id ? nil : parent_id )

          csv << [
            o.id,                                                               # ID (Taxon)
            parent_id,                                                          # parentID (Taxon)
            o.taxon_name.id,                                                    # nameID (Name)
            name_phrase(o, name_phrase_vocab_id),                               # namePhrase
            provisional(o),                                                     # provisional
            according_to_id(o),                                                 # accordingToID
            scrutinizer(o),                                                     # scrutinizer
            scrutinizer_id(o),                                                  # scrutinizerID
            scrutinizer_date(o),                                                # scrutinizerDate
            reference_id(sources),                                              # referenceID
            predicate_value(o, :extinct),                                       # extinct
            predicate_value(o, :temporal_range_start),                          # temporalRangeStart
            predicate_value(o, :temporal_range_end),                            # temporalRangeEnd
            predicate_value(o, :lifezone),                                      # environment (formerly named lifezone)
            link(o),                                                            # link
            Export::Coldp.sanitize_remarks(remarks(o, taxon_remarks_vocab_id)), # remarks
            Export::Coldp.modified(o[:updated_at]),                             # modified
            Export::Coldp.modified_by(o[:updated_by_id], project_members)       # modifiedBy
          ]

          Export::Coldp::Files::Reference.add_reference_rows(sources, reference_csv, project_members) if reference_csv
        end
      end
    end
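The row-filtering logic in `.generate` can be sketched outside of Rails as follows. This is a minimal illustration, not the module's implementation: `SketchOtu` and `sketch_taxon_rows` are hypothetical names, the parent id is a plain field rather than the result of `parent_otu_id`, and only three of the eighteen columns are written. The listing stores `nil` as the hash value, so this sketch tests membership with `key?` rather than reading the value.

```ruby
require 'csv'

# Hypothetical stand-in for an OTU record; only the fields used below.
SketchOtu = Struct.new(:id, :taxon_name_id, :parent_id)

# Write one tab-separated row per taxon name. Non-root OTUs without a
# parent are skipped, as are OTUs whose taxon name was already written.
def sketch_taxon_rows(otus, root_otu_id = nil)
  observed_taxon_name_ids = {}

  CSV.generate(col_sep: "\t") do |csv|
    csv << %w[ID parentID nameID]

    otus.each do |o|
      is_root = (root_otu_id == o.id)

      # No parent in the hierarchy and not the root: skip this OTU.
      next if !is_root && o.parent_id.nil?

      # The hash is used as a set; the stored value is irrelevant.
      next if observed_taxon_name_ids.key?(o.taxon_name_id)
      observed_taxon_name_ids[o.taxon_name_id] = nil

      csv << [o.id, (is_root ? nil : o.parent_id), o.taxon_name_id]
    end
  end
end

otus = [
  SketchOtu.new(1, 10, nil), # root
  SketchOtu.new(2, 20, 1),
  SketchOtu.new(3, 20, 1),   # same taxon name as OTU 2
  SketchOtu.new(4, 30, nil)  # orphan
]
puts sketch_taxon_rows(otus, 1)
```

A `nil` cell is written as an empty field, which is how the root row ends up with a blank `parentID`.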
.link(otu) ⇒ Object
    # File 'lib/export/coldp/files/taxon.rb', line 93
    def self.link(otu)
      # API or public interface
    end
.name_phrase(otu, vocab_id) ⇒ Object
Name phrase is for appended phrases like sensu stricto and sensu lato
    # File 'lib/export/coldp/files/taxon.rb', line 48
    def self.name_phrase(otu, vocab_id)
      da = DataAttribute.find_by(type: 'InternalAttribute',
                                 controlled_vocabulary_term_id: vocab_id,
                                 attribute_subject_id: otu.id)
      da&.value
    end
.predicate_value(otu, predicate) ⇒ Object
    # File 'lib/export/coldp/files/taxon.rb', line 27
    def self.predicate_value(otu, predicate)
      return nil unless IRI_MAP[predicate]
      otu.data_attributes.joins(:predicate).where(controlled_vocabulary_terms: {uri: IRI_MAP[predicate]}).first&.value
    end
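The lookup pattern in `.predicate_value` (translate a symbol to an IRI, then take the first attribute value recorded under that IRI) can be illustrated without a database. The sketch below is an assumption-laden stand-in: `SketchAttribute`, `SKETCH_IRI_MAP`, and `sketch_predicate_value` are hypothetical names, and an in-memory array replaces the ActiveRecord join used in the listing.

```ruby
# Hypothetical in-memory stand-in for a data attribute joined to its predicate.
SketchAttribute = Struct.new(:predicate_uri, :value)

# A two-entry subset of the IRI map, for illustration only.
SKETCH_IRI_MAP = {
  extinct: 'https://api.checklistbank.org/datapackage#Taxon.extinct',
  lifezone: 'https://api.checklistbank.org/datapackage#Taxon.lifezone'
}.freeze

# Return the value of the first attribute whose predicate URI matches the
# mapped IRI, or nil when the symbol is unmapped or no attribute matches.
def sketch_predicate_value(attributes, predicate)
  iri = SKETCH_IRI_MAP[predicate]
  return nil unless iri

  attributes.find { |a| a.predicate_uri == iri }&.value
end

attributes = [
  SketchAttribute.new(SKETCH_IRI_MAP[:lifezone], 'marine'),
  SketchAttribute.new(SKETCH_IRI_MAP[:extinct], '1')
]
puts sketch_predicate_value(attributes, :extinct)
```

Returning `nil` for an unmapped symbol keeps the corresponding export cell empty instead of raising.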
.provisional(otu) ⇒ Object
return [Boolean, nil]
TODO - reason in TW this is provisional name
    # File 'lib/export/coldp/files/taxon.rb', line 34
    def self.provisional(otu)
      # nomen dubium
      # incertae sedis
      # unresolved homonym, without replacement
      #
      #
      #
      # * if two OTUs for same name are in OTU set then both have to be provisional
      # * misapplication (?)
      nil
    end
.reference_id(sources) ⇒ Object
“supporting the taxonomic concept”. Potentially all other Citations tied to the Otu; what exactly supports a concept?
    # File 'lib/export/coldp/files/taxon.rb', line 107
    def self.reference_id(sources)
      i = sources.pluck(:id)
      return i.join(',') if i.any?
      nil
    end
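As a small illustration of the value `.reference_id` produces, the sketch below takes a plain array of ids in place of the `sources` relation. `sketch_reference_id` is a hypothetical name; the behavior mirrored from the listing is a comma-joined string, or `nil` when there are no sources.

```ruby
# Join source ids into a single comma-delimited cell; nil when there are none.
def sketch_reference_id(source_ids)
  return nil if source_ids.empty?

  source_ids.join(',')
end

puts sketch_reference_id([12, 40])
```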
.remarks(otu, taxon_remarks_vocab_id) ⇒ Object
    # File 'lib/export/coldp/files/taxon.rb', line 97
    def self.remarks(otu, taxon_remarks_vocab_id)
      if !taxon_remarks_vocab_id.nil? && otu.data_attributes.where(controlled_vocabulary_term_id: taxon_remarks_vocab_id).any?
        otu.data_attributes.where(controlled_vocabulary_term_id: taxon_remarks_vocab_id).pluck(:value).join('|')
      else
        nil
      end
    end
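The same selection can be sketched over an in-memory collection, which also shows that the matching attributes only need to be gathered once. `SketchDataAttribute` and `sketch_remarks` are hypothetical names, and the array stands in for `otu.data_attributes`; this is an illustration of the pipe-joined result, not a proposed replacement.

```ruby
# Hypothetical stand-in for a data attribute row.
SketchDataAttribute = Struct.new(:controlled_vocabulary_term_id, :value)

# Collect every value recorded under the remarks vocabulary term and join
# them with '|'. Returns nil when the term id is nil or nothing matches.
def sketch_remarks(data_attributes, taxon_remarks_vocab_id)
  return nil if taxon_remarks_vocab_id.nil?

  values = data_attributes
    .select { |da| da.controlled_vocabulary_term_id == taxon_remarks_vocab_id }
    .map(&:value)

  values.empty? ? nil : values.join('|')
end

attributes = [
  SketchDataAttribute.new(5, 'needs review'),
  SketchDataAttribute.new(9, 'unrelated'),
  SketchDataAttribute.new(5, 'see revision')
]
puts sketch_remarks(attributes, 5)
```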
.scrutinizer(otu) ⇒ Object
The scrutinizer concept is unused at present. We're looking for the canonical implementation of it before we implement/extrapolate from data here.
* crawl attribution for inference on higher/lower
* UI/methods to assign/spam/visualize throughout
* project preference (!! should project preferences have reference ids? !!)
"According to" is the curator responsible for this OTU, as a comma-delimited list of curators. We could also look at time-stamp data to detect “staleness” of an OTU concept.
    # File 'lib/export/coldp/files/taxon.rb', line 63
    def self.scrutinizer(otu)
      nil
    end
.scrutinizer_date(otu) ⇒ Object
    # File 'lib/export/coldp/files/taxon.rb', line 72
    def self.scrutinizer_date(otu)
      nil
    end
.scrutinizer_id(otu) ⇒ Object
ORCID version of above
    # File 'lib/export/coldp/files/taxon.rb', line 68
    def self.scrutinizer_id(otu)
      nil
    end