Class: Utilities::DarwinCore::Table
- Inherits: Object
- Defined in: lib/utilities/darwin_core/table.rb
Overview
A wrapper for DarwinCore Occurrence data as native Ruby objects. Accepts input as a CSV object, a TSV string, or a file path. Outputs a CSV object, a TSV string, or a TSV file.
Instance Attribute Summary
-
#errors ⇒ Array<Hash>
Error/warning log entries from compaction or validation. Each entry: { type: :error|:warning, column:, message:, values: }.
-
#headers ⇒ Array<String>
readonly
Column headers.
-
#rows ⇒ Array<Hash>
readonly
Each row is a Hash keyed by header string.
-
#skipped_rows ⇒ Array<Hash>
Rows excluded from compaction (e.g. no catalogNumber).
Instance Method Summary
-
#compact(by: :catalog_number, preview: false) ⇒ Utilities::DarwinCore::Table
Compact rows by merging on a key column.
-
#initialize(csv: nil, tsv_string: nil, file: nil) ⇒ Table
constructor
Construct a Table from exactly one of three input types: csv:, tsv_string:, or file:.
- #load_from_csv(csv) ⇒ Object private
- #load_from_file(path) ⇒ Object private
- #load_from_tsv_string(tsv_string) ⇒ Object private
-
#to_csv ⇒ CSV
A CSV object with headers.
-
#to_file(path) ⇒ String
Write TSV data to a file.
-
#to_tsv ⇒ String
TSV-formatted string.
Constructor Details
#initialize(csv: nil, tsv_string: nil, file: nil) ⇒ Table
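Construct a Table from exactly one of three input types: csv: (a parsed CSV object with headers), tsv_string: (a tab-separated String), or file: (a path to a TSV file). Raises ArgumentError if none or more than one is given.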
    # File 'lib/utilities/darwin_core/table.rb', line 31
    def initialize(csv: nil, tsv_string: nil, file: nil)
      @errors = []
      @skipped_rows = []
      @headers = []
      @rows = []

      sources = [csv, tsv_string, file].compact
      raise ArgumentError, 'Provide exactly one of csv:, tsv_string:, or file:' unless sources.size == 1

      if csv
        load_from_csv(csv)
      elsif tsv_string
        load_from_tsv_string(tsv_string)
      elsif file
        load_from_file(file)
      end
    end
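The "exactly one source" guard in the constructor can be tried in isolation. The sketch below is only an illustration: `pick_source` is a hypothetical helper, not part of Utilities::DarwinCore::Table, and it returns the name of the supplied keyword instead of loading any data.

```ruby
# Hypothetical helper (not part of Utilities::DarwinCore::Table) that mirrors
# the constructor's guard: exactly one of the three keywords must be non-nil.
def pick_source(csv: nil, tsv_string: nil, file: nil)
  # Hash#compact drops the keys whose value is nil.
  sources = { csv: csv, tsv_string: tsv_string, file: file }.compact
  unless sources.size == 1
    raise ArgumentError, 'Provide exactly one of csv:, tsv_string:, or file:'
  end

  sources.keys.first
end

chosen = pick_source(tsv_string: "catalogNumber\nA1\n")

# Supplying no source, or more than one, raises ArgumentError.
message =
  begin
    pick_source(tsv_string: "catalogNumber\nA1\n", file: 'occurrences.tsv')
  rescue ArgumentError => e
    e.message
  end
```

Because the guard counts non-nil values, passing an explicit `nil` for a keyword is treated the same as omitting it.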
Instance Attribute Details
#errors ⇒ Array<Hash>
Returns error/warning log entries from compaction or validation. Each entry: { type: :error|:warning, column:, message:, values: }.
    # File 'lib/utilities/darwin_core/table.rb', line 20
    def errors
      @errors
    end
#headers ⇒ Array<String> (readonly)
Returns column headers.
    # File 'lib/utilities/darwin_core/table.rb', line 11
    def headers
      @headers
    end
#rows ⇒ Array<Hash> (readonly)
Returns the rows; each row is a Hash keyed by header string.
    # File 'lib/utilities/darwin_core/table.rb', line 15
    def rows
      @rows
    end
#skipped_rows ⇒ Array<Hash>
Returns rows excluded from compaction (e.g. no catalogNumber).
    # File 'lib/utilities/darwin_core/table.rb', line 24
    def skipped_rows
      @skipped_rows
    end
Instance Method Details
#compact(by: :catalog_number, preview: false) ⇒ Utilities::DarwinCore::Table
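Compact rows by merging on a key column. Only by: :catalog_number is supported; any other value raises ArgumentError. The work is delegated to Utilities::DarwinCore::Compact.by_catalog_number, and the method returns self.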
    # File 'lib/utilities/darwin_core/table.rb', line 85
    def compact(by: :catalog_number, preview: false)
      case by
      when :catalog_number
        Utilities::DarwinCore::Compact.by_catalog_number(self, preview:)
      else
        raise ArgumentError, "Unknown compact strategy: #{by}"
      end
      self
    end
#load_from_csv(csv) ⇒ Object (private)
    # File 'lib/utilities/darwin_core/table.rb', line 97
    def load_from_csv(csv)
      @headers = csv.headers.map(&:to_s)
      csv.each do |row|
        @rows << headers.each_with_object({}) { |h, hash| hash[h] = row[h] }
      end
    end
#load_from_file(path) ⇒ Object (private)
    # File 'lib/utilities/darwin_core/table.rb', line 109
    def load_from_file(path)
      raise ArgumentError, "File not found: #{path}" unless File.exist?(path)
      load_from_tsv_string(File.read(path))
    end
#load_from_tsv_string(tsv_string) ⇒ Object (private)
    # File 'lib/utilities/darwin_core/table.rb', line 104
    def load_from_tsv_string(tsv_string)
      csv = ::CSV.parse(tsv_string, col_sep: "\t", headers: true)
      load_from_csv(csv)
    end
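The loading path (TSV string, then parsed table, then an Array of Hashes) can be reproduced with the standard csv library alone. This is a minimal sketch with made-up sample data; it uses local variables where the class uses @headers and @rows.

```ruby
require 'csv'

# Made-up sample data: two occurrence rows, the second with an empty field.
tsv = "catalogNumber\tscientificName\nA1\tAus bus\nA2\t\n"

# Same parse call as load_from_tsv_string: tab separator, first line as headers.
table = CSV.parse(tsv, col_sep: "\t", headers: true)

# Same conversion as load_from_csv: one Hash per row, keyed by header string.
headers = table.headers.map(&:to_s)
rows = table.map do |row|
  headers.each_with_object({}) { |h, hash| hash[h] = row[h] }
end

rows.first # {"catalogNumber"=>"A1", "scientificName"=>"Aus bus"}
rows.last  # {"catalogNumber"=>"A2", "scientificName"=>nil}
```

Empty unquoted fields come back as nil, not as empty strings. Note also that the parse call keeps the default quote character, so a field containing a stray double quote would likely raise CSV::MalformedCSVError.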
#to_csv ⇒ CSV
Returns a CSV object with headers. As implemented, this is the result of ::CSV.parse with headers: true, which is a CSV::Table.
    # File 'lib/utilities/darwin_core/table.rb', line 51
    def to_csv
      output = ::CSV.generate(col_sep: "\t", headers: headers, write_headers: true) do |csv_out|
        rows.each do |row|
          csv_out << headers.map { |h| row[h] }
        end
      end

      ::CSV.parse(output, col_sep: "\t", headers: true)
    end
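To see what the generate-then-parse round trip in #to_csv produces, the same two calls can be run on plain data. The sketch below assumes made-up headers and rows held in local variables; it does not use the Table class itself.

```ruby
require 'csv'

# Made-up data standing in for Table#headers and Table#rows.
headers = %w[catalogNumber scientificName]
rows = [
  { 'catalogNumber' => 'A1', 'scientificName' => 'Aus bus' },
  { 'catalogNumber' => 'A2', 'scientificName' => nil }
]

# Step 1: serialise to a tab-separated string, header line first.
output = CSV.generate(col_sep: "\t", headers: headers, write_headers: true) do |csv_out|
  rows.each { |row| csv_out << headers.map { |h| row[h] } }
end

# Step 2: parse that string back; with headers: true the result is a CSV::Table.
parsed = CSV.parse(output, col_sep: "\t", headers: true)

parsed.class                  # CSV::Table
parsed.headers                # ["catalogNumber", "scientificName"]
parsed[0]['scientificName']   # "Aus bus"
```

The returned table is a fresh copy built from the serialised text, so changing it does not affect the original rows.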
#to_file(path) ⇒ String
Writes the output of #to_tsv to the file at path and returns path.
    # File 'lib/utilities/darwin_core/table.rb', line 75
    def to_file(path)
      File.write(path, to_tsv)
      path
    end
#to_tsv ⇒ String
Returns TSV-formatted string.
    # File 'lib/utilities/darwin_core/table.rb', line 63
    def to_tsv
      lines = [headers.join("\t")]
      rows.each do |row|
        lines << headers.map { |h| row[h] }.join("\t")
      end
      lines.join("\n") + "\n"
    end
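Unlike #to_csv, #to_tsv builds its output with plain String joins and no CSV writer. The sketch below repeats that logic on made-up data to show two consequences: nil values become empty fields, and a value that itself contains a tab is not escaped.

```ruby
# Made-up data standing in for Table#headers and Table#rows.
headers = %w[catalogNumber locality]
rows = [
  { 'catalogNumber' => 'A1', 'locality' => nil },            # nil value
  { 'catalogNumber' => 'A2', 'locality' => "Ridge\ttrail" }  # embedded tab
]

# Same logic as to_tsv: header line, one joined line per row, trailing newline.
lines = [headers.join("\t")]
rows.each do |row|
  lines << headers.map { |h| row[h] }.join("\t")
end
tsv = lines.join("\n") + "\n"

# Count fields per output line; the -1 limit keeps trailing empty fields.
field_counts = tsv.lines.map { |line| line.chomp.split("\t", -1).size }
field_counts # [2, 2, 3]
```

The third line has three fields instead of two, so values containing tabs or newlines would need cleaning before export if downstream consumers expect a fixed column count.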