Module: Shared::Unify

Extended by:
ActiveSupport::Concern
Included in:
Topic
Defined in:
app/models/concerns/shared/unify.rb

Overview

A module to unify two objects into one, or to move data between objects.

!! The module works on relations, not attributes, which are ignored and untouched (but see `position` and `is_original`). !! For example, if two objects have differing `name` fields, the difference is ignored.

When :only or :except are provided, the remove_object IS NOT DESTROYED; only data are moved between objects.

If they are not provided, we attempt to destroy the `remove_object`.

* If a related object is now a duplicate, its annotations are moved to the retained identical object
* If `preview` is true, all changes are rolled back.
  • Annotation classes (e.g. Notes) can not be unified except through their relation to unified objects.

  • Users and projects can not be unified, though technically the approach should be a hard but robust solution to the problem, with some key exceptions (e.g. two root TaxonNames)

  • Classes that are exposed in the UI are defined at app/javascript/vue/tasks/unify/objects/constants/types.js.

  • Run `rake tw:development:linting:inverse_of_preventing_unify` judiciously when modifying models or this code. It will catch missing `inverse_of` parameters required to unify objects. Note that it will always report some missing relationships that do not matter.
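
The destroy rule described above can be sketched in plain Ruby. This is a minimal illustration, not code from the module: `destroy_remove_object?` is a hypothetical helper that mirrors the `unless only.any? || except.any?` guard wrapped around `o.destroy!` in #unify.

```ruby
# Hypothetical helper, not part of Shared::Unify. It mirrors the guard
# `unless only.any? || except.any?` that precedes `o.destroy!` in #unify.
def destroy_remove_object?(only: [], except: [])
  !(only.any? || except.any?)
end

destroy_remove_object?                      # full unify: destroy is attempted
destroy_remove_object?(only: [:citations])  # partial move: object is kept
destroy_remove_object?(except: [:notes])    # partial move: object is kept
```

Passing either option therefore turns the call into a data move rather than a merge.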

Constant Summary

EXCLUDE_RELATIONS =

Never auto-handle these, let the final destroy remove them. Housekeeping relations are not hit here, we don’t merge users at the moment.

[
  :versions,            # Not picked up, but adding in case
  :dwc_occurrence,      # Will be destroyed on related objects destruction
  :pinboard_items,      # Technically not needed here
  :cached_map_register, # Destroyed on merge of things like Georeferences and AssertedDistributions
  :cached_map_items,
]

Instance Method Summary

Instance Method Details

#deduplicate_update_target(object) ⇒ Object (private)



# File 'app/models/concerns/shared/unify.rb', line 278

def deduplicate_update_target(object)
  i = object.identical

  # There is exactly 1 match, merge is unambiguous
  if i.size == 1
    j = i.first
    j.unify(object.reload)
  else
    # Merge would be ambiguous, there are multiple matches
    return false
  end
end
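
The single-match rule can be illustrated without ActiveRecord. In this sketch `unambiguous_match` is a hypothetical helper and the arrays stand in for whatever `object.identical` returns; note that zero matches is treated the same way as several matches.

```ruby
# Hypothetical helper: returns the lone match, or false when the set of
# identical records is empty or has more than one member.
def unambiguous_match(matches)
  matches.size == 1 ? matches.first : false
end

unambiguous_match([:a])      # => :a
unambiguous_match([])        # => false
unambiguous_match([:a, :b])  # => false
```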

#except_relations ⇒ Object

Per class, skip these relations when merging.



# File 'app/models/concerns/shared/unify.rb', line 37

def except_relations
  []
end
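
A minimal sketch of how an including class might override one of these per-class hooks. `UnifyDefaults`, `Widget`, and `:audit_entries` are hypothetical names used only for illustration; the real defaults are the empty arrays defined in Shared::Unify.

```ruby
# Hypothetical stand-in for the empty defaults defined by Shared::Unify.
module UnifyDefaults
  def only_relations
    []
  end

  def except_relations
    []
  end
end

# Hypothetical model that never wants one of its relations merged.
class Widget
  include UnifyDefaults

  # Overrides the module default; methods defined in the class take
  # precedence over methods from included modules.
  def except_relations
    [:audit_entries]
  end
end

widget = Widget.new
widget.only_relations   # => []
widget.except_relations # => [:audit_entries]
```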

#inferred_relations ⇒ Object

Our target is a list of relations that we can iterate through and, by inspection, update related records to point to self.

* We don't want to target convenience relations, as they are in essence aliases of base-class relations and redundant
* We don't want anything that relates to a calculated cached value
* We *do* want to catch relations that are edges in which the same class of object is on both sides; these require
 an alias.  We inspect for `related_<name>` as a pattern to select these.

TODO: Revisit. Depending on the `related_XXX` naming pattern is brittle-ish; perhaps
converge on using `unify_relations` to force inclusion.

Note: class_name-based exclusions prevent a lot of duplicated effort, as much of their use is based on convenience relations on things like subclassed or scoped data.

Returns:

  • Array of ActiveRecord::Reflection



# File 'app/models/concerns/shared/unify.rb', line 69

def inferred_relations
  ( unify_relations +
   ::ApplicationEnumeration.klass_reflections(self.class) +
   ::ApplicationEnumeration.klass_reflections(self.class, :has_one))
    .delete_if{|r| r.options[:foreign_key] =~ /cache/}
    .delete_if{|r| EXCLUDE_RELATIONS.include?(r.name.to_sym)}
    .delete_if{|r| !r.name.match(/related/) && ( r.options[:through].present? || r.options[:class_name].present? )}
end
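
The three filters can be tried out on stand-in data. The sketch below is an approximation under stated assumptions: `RelationStub` replaces ActiveRecord::Reflection, `EXCLUDED_NAMES` is an abbreviated copy of EXCLUDE_RELATIONS, the relation names are invented, and plain `nil` checks stand in for ActiveSupport's `present?`.

```ruby
# Stand-in for ActiveRecord::Reflection: only name and options are modelled.
RelationStub = Struct.new(:name, :options)

# Abbreviated, illustrative copy of EXCLUDE_RELATIONS.
EXCLUDED_NAMES = %i[versions pinboard_items].freeze

def filter_relations(reflections)
  reflections
    .reject { |r| r.options[:foreign_key].to_s.match?(/cache/) } # cached values
    .reject { |r| EXCLUDED_NAMES.include?(r.name.to_sym) }       # never auto-handled
    .reject do |r|
      # Convenience relations (:through or :class_name) are dropped,
      # unless the name follows the `related_<name>` pattern.
      aliased = !r.options[:through].nil? || !r.options[:class_name].nil?
      !r.name.to_s.include?('related') && aliased
    end
end

candidates = [
  RelationStub.new(:citations, { inverse_of: :citation_object }),
  RelationStub.new(:versions, {}),
  RelationStub.new(:cached_things, { foreign_key: :cached_valid_id }),
  RelationStub.new(:sources, { through: :citations }),
  RelationStub.new(:related_taxon_name_relationships, { class_name: 'TaxonNameRelationship' })
]

kept = filter_relations(candidates).map(&:name)
# kept => [:citations, :related_taxon_name_relationships]
```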

#log_unify_result(object, relation, result) ⇒ Object (private)

During logging attempt to resolve duplicate objects issues by moving annotations from the would-be duplicate to an identical existing record.

Returns:

  • result Hash



# File 'app/models/concerns/shared/unify.rb', line 296

def log_unify_result(object, relation, result)
  n = relation.name.to_s.humanize

  # Handle an edge case, preserve Citations that
  # would only be invalid due to origin flag
  if object.class.name == 'Citation' && object.errors.key?(:is_original) && object.is_original
    object.is_original = false
    object.save
  end

  if object.errors.key?(:position)
    object.position = nil
    object.save
  end

  # One degree of separation issue
  #
  # Here we check to see that the error is related
  # to the object being unified, if not,
  # we don't know how to handle this with confidence.
  if object.errors.details.keys.include?(relation.options[:inverse_of])

    # object can't be updated, move its annotations to self
    unless deduplicate_update_target(object)
      result[:result][:unified] = false
      result[:details][n][:unmerged] += 1
      result[:details][n][:errors] ||= []
      result[:details][n][:errors].push( {id: object.id, message: object.errors.full_messages.join('; ')} )
    else # We unified and destroyed the duplicate
      result[:details][n][:deduplicated] += 1
    end

    # There are no errors we can fix, ensure we have a fresh copy
    # of the object and check for validity.
  else
    object.reload
    if object.invalid?
      result[:result][:unified] = false
      result[:details][n][:unmerged] += 1
      result[:details][n][:errors] ||= []
      result[:details][n][:errors].push( {id: object.id, message: object.errors.full_messages.join('; ')} )
    else
      result[:details][n][:merged] += 1
    end
  end

  result
end

#merge_relations(only: [], except: []) ⇒ Object

Perhaps used_inferred to hash

Returns:

  • Array of ActiveRecord::Reflection



# File 'app/models/concerns/shared/unify.rb', line 43

def merge_relations(only: [], except: [])
  o = (only_relations + [only&.map(&:to_sym)].flatten).uniq
  if o.any?
    used_inferred_relations.select{|a| o.include?(a.name)}
  else
    e = (except_relations + [except&.map(&:to_sym)].flatten).uniq
    used_inferred_relations.select{|a| !e.include?(a.name)}
  end
end
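
The precedence between `only` and `except` is easy to miss: when any `only` names are present, `except` is not consulted at all. A minimal plain-Ruby sketch, assuming `SimpleRelation` as a stand-in for a reflection and `select_relations` as a hypothetical equivalent of the method above (with the per-class lists passed in explicitly):

```ruby
# Stand-in for a reflection; only the name matters for selection.
SimpleRelation = Struct.new(:name)

# Hypothetical equivalent of merge_relations, with the per-class
# only_relations / except_relations passed as class_only / class_except.
def select_relations(available, only: [], except: [], class_only: [], class_except: [])
  o = (class_only + Array(only).map(&:to_sym)).uniq
  return available.select { |r| o.include?(r.name) } if o.any?

  e = (class_except + Array(except).map(&:to_sym)).uniq
  available.reject { |r| e.include?(r.name) }
end

available = %i[citations notes tags].map { |n| SimpleRelation.new(n) }

select_relations(available, only: ['notes']).map(&:name)                  # => [:notes]
select_relations(available, except: [:tags]).map(&:name)                  # => [:citations, :notes]
select_relations(available, only: [:notes], except: [:notes]).map(&:name) # => [:notes]
```

String names are accepted because each entry is converted with `to_sym` before comparison.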

#only_relations ⇒ Object

Per class, iterate through only these relations.



# File 'app/models/concerns/shared/unify.rb', line 32

def only_relations
  []
end

#pre_validate(remove_object, result) ⇒ Object (private)



# File 'app/models/concerns/shared/unify.rb', line 241

def pre_validate(remove_object, result)
  s = result

  if s[:result][:target_project_id].nil?
    if is_community?
      s[:result].merge!(
        unified: false,
        message: 'Can not merge community objects without project context.'
      )
    else
      s[:result][:target_project_id] = project_id
    end
  end

  if remove_object == self
    s[:result].merge!(
      unified: false,
      message: 'Can not unify the same objects.'
    )
  end

  if !is_community?
    if project_id != remove_object.project_id
      s[:result].merge!(
        unified: false,
        message: 'Danger, objects come from different projects.')
    end
  end

  if remove_object.class.name != self.class.name
    s[:result].merge!(
      unified: false,
      message: "Can not unify objects of different types (#{remove_object.class.name} and #{self.class.name}).")
  end
  s
end

#relation_label(relation) ⇒ Object



# File 'app/models/concerns/shared/unify.rb', line 210

def relation_label(relation)
  relation.name.to_s.humanize
end

#stub_unify_result(result, relation_name, attempted) ⇒ Object (private)



# File 'app/models/concerns/shared/unify.rb', line 346

def stub_unify_result(result, relation_name, attempted)
  result[:details].merge!(
    relation_name => {
      attempted:,
      merged: 0,
      unmerged: 0,
      deduplicated: 0
    }
  )
end
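
To make the bookkeeping concrete, the sketch below builds one per-relation entry in the shape written by stub_unify_result and then tallies outcomes the way log_unify_result does (exactly one counter is incremented per related record). The relation label and the three outcomes are invented for illustration.

```ruby
result = { result: { unified: nil, total_related: 0 }, details: {} }

# The per-relation entry, in the shape written by stub_unify_result.
result[:details]['Citations'] = { attempted: 3, merged: 0, unmerged: 0, deduplicated: 0 }

# Hypothetical outcomes for the three related records.
%i[merged deduplicated unmerged].each do |outcome|
  result[:details]['Citations'][outcome] += 1
  # Any record that could not be moved marks the whole unify as failed.
  result[:result][:unified] = false if outcome == :unmerged
end

counts = result[:details]['Citations']
accounted = counts[:merged] + counts[:unmerged] + counts[:deduplicated]
# accounted => 3, matching counts[:attempted]
```

When every related record is processed, the three counters should sum to `attempted`; if the cutoff is hit, records are counted as attempted but never processed, so the sum can be lower.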

#unify(remove_object, only: [], except: [], preview: false, cutoff: 250, target_project_id: nil) ⇒ Object

See header.

Parameters:

  • remove_object

    this object will be destroyed if possible

  • only (Array of Symbols) (defaults to: [])

    only operate on these relations, useful for partial merges/moving objects

  • except (Array of Symbols) (defaults to: [])

    don’t operate on these relations

  • preview (defaults to: false)

    Boolean; if true, roll back all operations

  • cutoff (defaults to: 250)

    Integer; if more than cutoff related records are observed, always roll back. TODO: add delayed job handling

  • target_project_id (Integer) (defaults to: nil)

    required when self is_community?, scopes operations to target project only

Returns:

  • Hash a result



# File 'app/models/concerns/shared/unify.rb', line 113

def unify(remove_object, only: [], except: [], preview: false, cutoff: 250, target_project_id: nil)
  s = {
    result: { unified: nil, total_related: 0, target_project_id:},
    details: {},
  }

  o = remove_object
  pre_validate(o, s)
  return s if s[:result][:unified] == false
  pid = s[:result][:target_project_id]

  self.class.transaction do
    # before_unify # potential hooks, appear not to be required

    merge_relations(only:, except:).each do |r|
      n = relation_label(r)

      case ::ApplicationEnumeration.relationship_type(r)

      when :has_many
        i = o.send(r.name)

        unless ::ApplicationEnumeration.relation_targets_community?(r)
          i = i.where(project_id: pid)
        end

        next unless i.any?

        t = i.size
        stub_unify_result(s, n, t)

        s[:result][:total_related] += t
        next if s[:result][:total_related] > cutoff

        i.find_each do |j|
          j.update(r.options[:inverse_of] => self)
          log_unify_result(j, r, s)
        end

      when :has_one, :belongs_to
        i = o.send(r.name)
        if !i.nil?
          stub_unify_result(s, n, 1)

          i.update(r.options[:inverse_of] => self)
          log_unify_result(i, r, s)
        end
      end
    end

    if cutoff_hit = s[:result][:total_related] > cutoff
      s[:result][:message] = "Related cutoff threshold (> #{cutoff}) hit, unify is not yet allowed on these objects."
    else

      begin
        o.reload # reset all in-memory has_many caches that would prevent destroy

        unless only.any? || except.any?
          o.destroy!
        end

      rescue ActiveRecord::InvalidForeignKey => e
        s[:result][:unified] = false
        s[:details].merge!(
          Object: {
            errors: [
              # InvalidForeignKey does not expose a record; report the object being removed
              { id: o.id, message: e.message }
            ]
          }
        )

        raise ActiveRecord::Rollback
      rescue ActiveRecord::RecordNotDestroyed => e
        s[:result][:unified] = false
        s[:details].merge!(
          Object: {
            errors: [
              { id: e.record.id, message: e.record.errors.full_messages.join('; ') }
            ]
          }
        )

        raise ActiveRecord::Rollback
      end
    end

    # after_unify # potential hooks, appear not to be required

    if preview || cutoff_hit
      raise ActiveRecord::Rollback
    end
  end

  s[:result][:unified] = true unless s[:result][:unified] == false
  s
end
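
The cutoff handling for has_many relations can be modelled without a database. In this sketch `plan_updates` is a hypothetical helper and the relation sizes are invented; it reproduces only the counting logic (`next unless i.any?`, the running `total_related`, and `next if total_related > cutoff`), not the updates themselves.

```ruby
# Hypothetical model of the has_many branch of #unify.
# relation_sizes maps a relation name to its number of related records.
def plan_updates(relation_sizes, cutoff: 250)
  total = 0
  updated = []

  relation_sizes.each do |name, size|
    next if size.zero?      # empty relations are skipped entirely
    total += size           # counted even when past the cutoff
    next if total > cutoff  # counted, but its records are not updated
    updated << name
  end

  { total_related: total, updated: updated, rollback: total > cutoff }
end

plan_updates({ citations: 200, notes: 100, tags: 10 })
# => { total_related: 310, updated: [:citations], rollback: true }
```

Relations seen after the threshold is crossed still add to the total, and crossing it forces a rollback of the whole transaction, so the `updated` list is informational only in that case.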

#unify_relations ⇒ Object

Override in including classes; see Serial for an example.



# File 'app/models/concerns/shared/unify.rb', line 85

def unify_relations
  []
end

#unify_relations_metadata(target_project_id: nil) ⇒ Object



# File 'app/models/concerns/shared/unify.rb', line 214

def unify_relations_metadata(target_project_id: nil)
  s = {}

  merge_relations.each do |r|
    name = relation_label(r)

    case ::ApplicationEnumeration.relationship_type(r)
    when :has_many
      if ::ApplicationEnumeration.relation_targets_community?(r)
        i = send(r.name)
      else
        i = send(r.name).where(project_id: target_project_id)
      end

      next unless i.count > 0
      s[r.name] = { total: i.count, name: }
    when :has_one
      if send(r.name).present?
        s[r.name] = { total: 1, name: }
      end
    end
  end
  s.sort.to_h
end
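
The returned hash is keyed by relation name and ordered by `s.sort.to_h`. A small sketch with invented counts shows the shape and the effect of the sort:

```ruby
# Invented metadata in the shape built by unify_relations_metadata:
# relation name => { total:, name: } where name is the humanized label.
metadata = {
  notes:     { total: 2, name: 'Notes' },
  citations: { total: 5, name: 'Citations' }
}

# Hash#sort yields [key, value] pairs ordered by key; to_h rebuilds the hash.
sorted = metadata.sort.to_h
sorted.keys # => [:citations, :notes]
```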

#used_inferred_relations ⇒ Object

Keep separated from inferred_relations so we can better audit all models in rake linting

Returns:

  • Array of ActiveRecord::Reflection



# File 'app/models/concerns/shared/unify.rb', line 80

def used_inferred_relations
  (inferred_relations.select{|r| !r.options[:inverse_of].nil?} + unify_relations).uniq
end