Module: TaxonWorks::Vendor::Serrano

Defined in:
lib/vendor/serrano.rb

Overview

A middle-layer wrapper between Serrano (a Ruby client for the CrossRef API) and TaxonWorks.

Defined Under Namespace

Classes: CrossRefLaTeX

Constant Summary

CUTOFF =
50.0

Class Method Details

.citation_is_valid_doi?(citation) ⇒ Boolean

Checks whether the value is a DOI, using the global identifier class (Identifier::Global::Doi). The check is not especially robust, but is probably adequate.

Returns:

  • (Boolean)

    true if validation reports no errors on the identifier attribute



# File 'lib/vendor/serrano.rb', line 100

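def self.citation_is_valid_doi?(citation)
  doi = Identifier::Global::Doi.new(identifier: citation)
  # Run validations only to populate doi.errors; the return value is ignored
  doi.valid?
  # Only errors on :identifier matter here; other attributes may still be invalid
  !doi.errors.has_key?(:identifier)
end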
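
The method above runs the full validation but inspects only the errors recorded for :identifier, presumably so that a record that is invalid for unrelated reasons still counts as a DOI. A minimal plain-Ruby sketch of that pattern follows. FakeDoi, its owner attribute, and the DOI regex are hypothetical stand-ins for illustration, not the TaxonWorks Identifier::Global::Doi class or its actual validation rules.

```ruby
# Hypothetical stand-in for an ActiveModel-style record (not the TaxonWorks class).
class FakeDoi
  attr_reader :errors

  def initialize(identifier:, owner: nil)
    @identifier = identifier
    @owner = owner
    @errors = {}
  end

  # Populates @errors and returns true only when nothing failed.
  def valid?
    @errors = {}
    # Assumed, simplified DOI shape: "10.", a registrant code, "/", a suffix.
    unless @identifier.to_s.match?(/\A10\.\d{4,9}\/\S+\z/)
      @errors[:identifier] = ['is not a DOI']
    end
    # An unrelated requirement, to show that it does not affect the DOI check.
    @errors[:owner] = ['must be present'] if @owner.nil?
    @errors.empty?
  end
end

# Same shape as citation_is_valid_doi?: validate, then look at one attribute only.
def identifier_ok?(record)
  record.valid?
  !record.errors.key?(:identifier)
end

record = FakeDoi.new(identifier: '10.3897/zookeys.20.205')
record.valid?          # => false (owner is missing)
identifier_ok?(record) # => true  (no error on :identifier)
```

The design choice is that the record as a whole need not be saveable; only the identifier's own validation decides the answer.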

.cutoff ⇒ Float

Returns:

  • (Float)


# File 'lib/vendor/serrano.rb', line 9

def self.cutoff
  CUTOFF
end

.get_bibtex_string(citation) ⇒ String?

Returns:

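  • (String, nil)

    the BibTeX string for the citation, or nil if no match scores at least CUTOFF, the response does not look like BibTeX, or an error is raised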


# File 'lib/vendor/serrano.rb', line 66

def self.get_bibtex_string(citation)
  # Resolve the citation to a DOI if it isn't one already
  unless citation_is_valid_doi?(citation)
    # First item should be the one with highest score/relevance: https://github.com/CrossRef/rest-api-doc#sort-order
    res = ::Serrano.works(query: citation, limit: 1)&.dig("message", "items")&.first

    # Discard the match when its score falls below CUTOFF
    score = res&.dig("score") || -1.0
    citation = (score >= CUTOFF) ? res&.dig("DOI") : nil
  end

  return nil if citation.nil?

  bibtex = ::Serrano.content_negotiation(ids: unurize_doi(citation), format: "bibtex")

  # Anything that does not start with a BibTeX entry marker is treated as a failure
  bibtex =~ /^\s*@/ ? bibtex : nil
rescue StandardError
  # Any error raised along the way yields nil
  nil
end
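
The two decisions inside this method (accept the top search hit only when its score reaches the cutoff, and accept the negotiated content only when it looks like BibTeX) can be sketched without any network access. In the sketch below, doi_from_response and bibtex_or_nil are hypothetical helper names, the response is a hand-built Hash shaped like the "message" / "items" structure the method digs into, and the scores are invented for illustration.

```ruby
CUTOFF = 50.0 # same value as the constant documented above

# Return the DOI of the first item when its score reaches the cutoff, else nil.
# `response` is a plain Hash standing in for a parsed works response.
def doi_from_response(response, cutoff: CUTOFF)
  best = response&.dig('message', 'items')&.first
  score = best&.dig('score') || -1.0 # a missing score can never pass the cutoff
  score >= cutoff ? best&.dig('DOI') : nil
end

# Accept only text in which some line starts with "@", as a BibTeX entry would.
def bibtex_or_nil(text)
  text.to_s.match?(/^\s*@/) ? text : nil
end

strong = { 'message' => { 'items' => [{ 'DOI' => '10.3897/zookeys.20.205', 'score' => 87.3 }] } }
weak   = { 'message' => { 'items' => [{ 'DOI' => '10.3897/zookeys.20.205', 'score' => 12.0 }] } }

doi_from_response(strong) # => "10.3897/zookeys.20.205"
doi_from_response(weak)   # => nil
doi_from_response({})     # => nil

bibtex_or_nil("@article{key, title={T}}") # returned unchanged
bibtex_or_nil('<html>Not found</html>')   # => nil
```

Defaulting a missing score to -1.0 means an empty result set and a low-scoring match are handled by the same comparison.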

.new_from_citation(citation: nil) ⇒ Source::Bibtex.new, ...

TODO: attempt to extract DOI from full string

Four possible paths:

  1) citation.
  2) citation which includes a DOI.
  3) naked DOI, e.g., '10.3897/zookeys.20.205'.
  4) DOI with preamble, e.g., 'dx.doi.org/10.3897/zookeys.20.205' or 'https://doi.org/10.3897/zookeys.20.205'.

Returns:

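  • (Source::Bibtex.new)

    a new instance, when BibTeX could be retrieved and encoded as UTF-8

  • (Source::Verbatim.new)

    a new instance, otherwise

  • (false)

    if the citation is shorter than 6 characters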


# File 'lib/vendor/serrano.rb', line 47

def self.new_from_citation(citation: nil)
  # Strip without mutating the caller's string, and guard against nil
  citation = citation&.strip
  return false if citation.nil? || citation.length < 6

  # check string encoding, if not UTF-8, check if compatible with UTF-8,
  # if so convert to UTF-8 and parse with latex, else use type verbatim
  a = get_bibtex_string(citation)
  b = ::Utilities::Strings.encode_with_utf8(a) if a

  if b
    Source::Bibtex.new_from_bibtex(
      BibTeX::Bibliography.parse(b, filter: CrossRefLaTeX.instance).first
    )
  else
    # Fall back to the raw BibTeX if there is any, else the original citation
    Source::Verbatim.new(verbatim: a || citation)
  end
end

.unurize_doi(doi) ⇒ String

Returns:

  • (String)

    the DOI, stripped of surrounding whitespace and of any leading http(s)://host/ prefix


# File 'lib/vendor/serrano.rb', line 87

def self.unurize_doi(doi)
  doi = doi.strip

  if matches = doi.match(/https?:\/\/[^\/]+\/(.*)/)
    matches[1]
  else
    doi
  end
end
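
To make the behaviour of the regular expression concrete, here is a standalone copy of the method written as a top-level function (an adaptation for illustration only), applied to the example DOIs used elsewhere on this page. Note that the pattern requires an explicit http:// or https:// scheme, so a scheme-less preamble such as 'dx.doi.org/...' is returned unchanged.

```ruby
# Standalone copy of the logic above, for experimentation outside TaxonWorks.
def unurize_doi(doi)
  doi = doi.strip

  # Capture everything after the first "/" that follows the host name.
  if (matches = doi.match(%r{https?://[^/]+/(.*)}))
    matches[1]
  else
    doi
  end
end

unurize_doi('https://doi.org/10.3897/zookeys.20.205')   # => "10.3897/zookeys.20.205"
unurize_doi('http://dx.doi.org/10.3897/zookeys.20.205') # => "10.3897/zookeys.20.205"
unurize_doi('  10.3897/zookeys.20.205  ')               # => "10.3897/zookeys.20.205"
unurize_doi('dx.doi.org/10.3897/zookeys.20.205')        # => "dx.doi.org/10.3897/zookeys.20.205"
```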