class MiGA::Dataset

Dataset representation in MiGA

Attributes

name[R]

Datasets are uniquely identified by name in a project

project[R]

MiGA::Project that contains the dataset

Public Class Methods

EXCLUDE_NOMARKER_TASKS()

Tasks to be excluded from datasets without markers

# File lib/miga/dataset/base.rb, line 37
def EXCLUDE_NOMARKER_TASKS
  @@EXCLUDE_NOMARKER_TASKS
end
EXCLUDE_NOREF_TASKS()

Tasks to be excluded from query datasets

# File lib/miga/dataset/base.rb, line 31
def EXCLUDE_NOREF_TASKS
  @@EXCLUDE_NOREF_TASKS
end
INFO_FIELDS()

Standard fields of metadata for datasets

# File lib/miga/dataset.rb, line 36
def INFO_FIELDS
  %w[name created updated type ref user description comments]
end
KNOWN_TYPES()

Supported dataset types

# File lib/miga/dataset/base.rb, line 18
def KNOWN_TYPES
  @@KNOWN_TYPES
end
ONLY_MULTI_TASKS()

Tasks to be executed only in datasets that are multi-organism. These tasks are ignored for single-organism datasets or for unknown types

# File lib/miga/dataset/base.rb, line 51
def ONLY_MULTI_TASKS
  @@ONLY_MULTI_TASKS
end
ONLY_NONMULTI_TASKS()

Tasks to be executed only in datasets that are single-organism. These tasks are ignored for multi-organism datasets or for unknown types

# File lib/miga/dataset/base.rb, line 44
def ONLY_NONMULTI_TASKS
  @@ONLY_NONMULTI_TASKS
end
OPTIONS()

Options supported by datasets

# File lib/miga/dataset/base.rb, line 57
def OPTIONS
  @@OPTIONS
end
PREPROCESSING_TASKS()

Returns an Array of tasks (Symbols) to be executed before project-wide tasks

# File lib/miga/dataset/base.rb, line 25
def PREPROCESSING_TASKS
  @@PREPROCESSING_TASKS
end
RESULT_DIRS()

Directories containing the results from dataset-specific tasks

# File lib/miga/dataset/base.rb, line 12
def RESULT_DIRS
  @@RESULT_DIRS
end
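
The accessors above all follow the same pattern: a class variable exposed through a class method with an uppercase name. A minimal sketch of that pattern, assuming a stand-in class (DemoDataset and its task list are hypothetical and not taken from MiGA):

```ruby
# Stand-in class; DemoDataset and the task names below are hypothetical.
class DemoDataset
  # Class variable holding the list, exposed only through the accessor.
  @@PREPROCESSING_TASKS = %i[raw_reads trimmed_reads assembly]

  # Uppercase method name: call it with an explicit receiver and a dot,
  # otherwise Ruby would look for a constant instead of a method.
  def self.PREPROCESSING_TASKS
    @@PREPROCESSING_TASKS
  end
end

tasks = DemoDataset.PREPROCESSING_TASKS
puts tasks.inspect
```

With the real class, the equivalent call would be written as MiGA::Dataset.PREPROCESSING_TASKS, with a dot rather than ::.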
exist?(project, name)

Does the project already have a dataset with that name?

# File lib/miga/dataset.rb, line 30
def exist?(project, name)
  project.dataset_names_set.include? name
end
new(project, name, is_ref = true, metadata = {})

Create a MiGA::Dataset object in project (a MiGA::Project) with a uniquely identifying name. is_ref indicates whether the dataset is treated as a reference (true, default) or as a query (false). Pass any additional metadata as a Hash. Raises an error if name contains anything other than alphanumerics and underscores.

# File lib/miga/dataset.rb, line 56
def initialize(project, name, is_ref = true, metadata = {})
  name = name.to_s
  name.miga_name? or
    raise 'Invalid name, please use only alphanumerics and underscores: ' +
          name

  @project, @name, @metadata = project, name, nil
  metadata[:ref] = is_ref
  metadata[:type] ||= :empty
  metadata[:status] ||= 'incomplete'
  @metadata_future = [
    File.join(project.path, 'metadata', "#{name}.json"),
    metadata
  ]
  return if File.exist? @metadata_future[0]

  save
  pull_hook :on_create
end
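
The name check and the metadata defaults applied by the constructor can be sketched without MiGA itself. Both helpers below are hypothetical: valid_dataset_name? assumes the rule is exactly what the error message states (alphanumerics and underscores only), and default_metadata works on a plain Hash.

```ruby
# Hypothetical helper: assumes the rule stated in the error message,
# i.e. one or more alphanumerics or underscores and nothing else.
def valid_dataset_name?(name)
  name.to_s.match?(/\A[A-Za-z0-9_]+\z/)
end

# Hypothetical helper applying the same defaults the constructor sets.
# The sketch copies the Hash; the constructor writes into the one it receives.
def default_metadata(is_ref = true, metadata = {})
  md = metadata.dup
  md[:ref] = is_ref             # always overwritten by is_ref
  md[:type] ||= :empty          # kept if the caller already set it
  md[:status] ||= 'incomplete'  # kept if the caller already set it
  md
end

puts valid_dataset_name?('E_coli_K12')   # true
puts valid_dataset_name?('E. coli K-12') # false
p default_metadata(false, { type: :genome })
```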

Public Instance Methods

activate!()

Activate a dataset. This removes the :inactive flag and clears any :warn entry that starts with "Inactive: ".

# File lib/miga/dataset.rb, line 125
def activate!
  metadata[:inactive] = nil
  metadata[:warn] = nil if metadata[:warn] && metadata[:warn] =~ /^Inactive: /
  metadata.save
  project.recalculate_tasks("Reference dataset activated: #{name}") if ref?
  pull_hook :on_activate
end
active?()

Is this dataset active?

# File lib/miga/dataset.rb, line 155
def active?
  metadata[:inactive].nil? || !metadata[:inactive]
end
Also aliased as: is_active?
closest_relatives(how_many = 1, ref_project = false)

Returns an Array of up to how_many pairs (two-element Arrays) sorted by decreasing AAI:

  • 0: A String with the name(s) of the reference dataset.

  • 1: A Float with the AAI.

This function is currently supported only for query datasets when ref_project is false (default), and only for reference datasets when ref_project is true. Multi-organism datasets are never supported. It returns nil if this analysis is not supported or if the required result is not available.

# File lib/miga/dataset.rb, line 186
def closest_relatives(how_many = 1, ref_project = false)
  return nil if (ref? != ref_project) || multi?

  r = result(ref_project ? :taxonomy : :distances)
  return nil if r.nil?

  require 'miga/sqlite'
  MiGA::SQLite.new(r.file_path(:aai_db)).run(
    'SELECT seq2, aai FROM aai WHERE seq2 != ? ' \
    'GROUP BY seq2 ORDER BY aai DESC LIMIT ?', [name, how_many]
  )
end
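
The selection made by the SQL query can be illustrated in memory. This is only a sketch: ROWS is hypothetical data standing in for the aai table, closest is a hypothetical helper, and it assumes one AAI value per reference name.

```ruby
# Hypothetical in-memory rows standing in for the aai table: [seq2, aai].
ROWS = [
  ['ref_A', 92.1],
  ['ref_B', 97.4],
  ['query_1', 100.0],
  ['ref_C', 88.0]
].freeze

# Mirrors the query: drop the dataset itself, keep one row per name,
# sort by AAI in descending order, and keep the top how_many rows.
def closest(rows, own_name, how_many = 1)
  rows.reject { |seq2, _| seq2 == own_name }
      .uniq { |seq2, _| seq2 }
      .sort_by { |_, aai| -aai }
      .first(how_many)
end

p closest(ROWS, 'query_1', 2) # [["ref_B", 97.4], ["ref_A", 92.1]]
```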
inactivate!(reason = nil)

Inactivate a dataset. This halts automated processing by the daemon.

If given, the reason string is saved as a metadata :warn entry prefixed with "Inactive: ".

# File lib/miga/dataset.rb, line 115
def inactivate!(reason = nil)
  metadata[:warn] = "Inactive: #{reason}" unless reason.nil?
  metadata[:inactive] = true
  metadata.save
  project.recalculate_tasks("Reference dataset inactivated: #{name}") if ref?
  pull_hook :on_inactivate
end
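
The flag handling in activate!, active?, and inactivate! can be followed on a plain Hash. The three functions below are hypothetical stand-ins: they reproduce only the metadata changes and leave out saving, task recalculation, and hooks.

```ruby
# Hypothetical stand-ins operating on a plain Hash instead of MiGA::Metadata.
def inactivate(md, reason = nil)
  md[:warn] = "Inactive: #{reason}" unless reason.nil?
  md[:inactive] = true
  md
end

def activate(md)
  md[:inactive] = nil
  # Only warnings written by inactivate are cleared; others are kept.
  md[:warn] = nil if md[:warn] && md[:warn] =~ /^Inactive: /
  md
end

def active?(md)
  md[:inactive].nil? || !md[:inactive]
end

md = {}
inactivate(md, 'low quality')
puts active?(md)        # false
puts md[:warn]          # Inactive: low quality
activate(md)
puts active?(md)        # true
puts md[:warn].inspect  # nil
```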
info()

Get standard metadata values for the dataset as an Array, in the order given by INFO_FIELDS

# File lib/miga/dataset.rb, line 135
def info
  MiGA::Dataset.INFO_FIELDS.map do |k|
    k == 'name' ? name : metadata[k]
  end
end
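
A rough illustration of the Array that info builds. The helper info_row is hypothetical, and a plain Hash with Symbol keys stands in for MiGA::Metadata, so the String field names are converted with to_sym here (an assumption of this sketch).

```ruby
# Field list copied from INFO_FIELDS above.
INFO_FIELDS = %w[name created updated type ref user description comments].freeze

# Hypothetical stand-in for info: the name comes from the dataset itself,
# every other field is looked up in the metadata (nil when absent).
def info_row(name, md)
  INFO_FIELDS.map { |k| k == 'name' ? name : md[k.to_sym] }
end

row = info_row('E_coli_K12', { type: :genome, ref: true, user: 'demo' })
p row
```

Fields missing from the metadata appear as nil, so the Array always has one entry per field.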
is_active?()

Same as active?, kept for backwards compatibility

Alias for: active?
is_query?()

Same as query?, kept for backwards compatibility

Alias for: query?
is_ref?()

Same as ref?, kept for backwards compatibility

Alias for: ref?
metadata()

MiGA::Metadata with information about the dataset

# File lib/miga/dataset.rb, line 78
def metadata
  if @metadata.nil?
    @metadata = MiGA::Metadata.new(*@metadata_future)
    pull_hook :on_load
  end
  @metadata
end
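
The accessor defers loading until the first call and then reuses the loaded object. A minimal sketch of that deferred-load pattern, assuming a hypothetical LazyRecord class whose loader is purely illustrative:

```ruby
# Hypothetical class showing the deferred-load pattern; not part of MiGA.
class LazyRecord
  attr_reader :loads

  def initialize(path, defaults)
    @data = nil
    @future = [path, defaults] # arguments kept for the first access
    @loads = 0
  end

  def data
    if @data.nil?
      path, defaults = @future
      @data = defaults.merge(path: path) # stands in for MiGA::Metadata.new
      @loads += 1                        # stands in for pull_hook :on_load
    end
    @data
  end
end

rec = LazyRecord.new('metadata/demo.json', { type: :empty })
rec.data
rec.data
puts rec.loads # 1
```

The second call returns the cached object, which is why the load counter stays at 1.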
query?()

Is this dataset a query (non-reference)?

# File lib/miga/dataset.rb, line 149
def query?
  !metadata[:ref]
end
Also aliased as: is_query?
ref?()

Is this dataset a reference?

# File lib/miga/dataset.rb, line 143
def ref?
  !query?
end
Also aliased as: is_ref?
remove!()

Deletes the dataset with all its contents (including results) and returns nil

# File lib/miga/dataset.rb, line 105
def remove!
  results.each(&:remove!)
  metadata.remove!
  pull_hook :on_remove
end
save()

Save any changes you've made in the dataset

# File lib/miga/dataset.rb, line 88
def save
  MiGA.DEBUG "Dataset.save: #{name}"
  metadata.save
  pull_hook :on_save
end
save!()

Forces a save even if nothing has changed in the metadata

# File lib/miga/dataset.rb, line 96
def save!
  MiGA.DEBUG "Dataset.save!: #{name}"
  metadata.save!
  pull_hook :on_save
end