class MiGA::Dataset

Dataset representation in MiGA

Attributes

name[R]

Datasets are uniquely identified by name in a project

project[R]

MiGA::Project that contains the dataset

Public Class Methods

EXCLUDE_NOREF_TASKS()

Tasks to be excluded from query datasets

# File lib/miga/dataset/base.rb, line 31
def EXCLUDE_NOREF_TASKS
  @@EXCLUDE_NOREF_TASKS
end

INFO_FIELDS()

Standard fields of metadata for datasets

# File lib/miga/dataset.rb, line 33
def INFO_FIELDS
  %w[name created updated type ref user description comments]
end

KNOWN_TYPES()

Supported dataset types

# File lib/miga/dataset/base.rb, line 18
def KNOWN_TYPES
  @@KNOWN_TYPES
end

ONLY_MULTI_TASKS()

Tasks to be executed only in datasets that are multi-organism. These tasks are ignored for single-organism datasets or for unknown types

# File lib/miga/dataset/base.rb, line 45
def ONLY_MULTI_TASKS
  @@ONLY_MULTI_TASKS
end

ONLY_NONMULTI_TASKS()

Tasks to be executed only in datasets that are single-organism. These tasks are ignored for multi-organism datasets or for unknown types

# File lib/miga/dataset/base.rb, line 38
def ONLY_NONMULTI_TASKS
  @@ONLY_NONMULTI_TASKS
end

OPTIONS()

Options supported by datasets

# File lib/miga/dataset/base.rb, line 51
def OPTIONS
  @@OPTIONS
end

PREPROCESSING_TASKS()

Returns an Array of tasks (Symbols) to be executed before project-wide tasks

# File lib/miga/dataset/base.rb, line 25
def PREPROCESSING_TASKS
  @@PREPROCESSING_TASKS
end

RESULT_DIRS()

Directories containing the results from dataset-specific tasks

# File lib/miga/dataset/base.rb, line 12
def RESULT_DIRS
  @@RESULT_DIRS
end

exist?(project, name)

Does the project already have a dataset with that name?

# File lib/miga/dataset.rb, line 27
def exist?(project, name)
  !project.dataset_names_hash[name].nil?
end

new(project, name, is_ref = true, metadata = {})

Create a MiGA::Dataset object in project (a MiGA::Project) with a uniquely identifying name. is_ref indicates whether the dataset is to be treated as a reference (true, default) or as a query (false). Pass any additional metadata as a Hash.

# File lib/miga/dataset.rb, line 53
def initialize(project, name, is_ref = true, metadata = {})
  name.miga_name? or
    raise 'Invalid name, please use only alphanumerics and underscores: ' +
          name.to_s
  @project, @name, @metadata = project, name, nil
  metadata[:ref] = is_ref
  @metadata_future = [
    File.join(project.path, 'metadata', "#{name}.json"),
    metadata
  ]
  return if File.exist? @metadata_future[0]

  save
  pull_hook :on_create
end
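
The constructor rejects names that fail the miga_name? check, and its error message asks for alphanumerics and underscores only. The sketch below illustrates that rule with a hypothetical helper, valid_dataset_name?, which is not part of MiGA; the regular expression is an assumption inferred from the error message, not the library's own definition.

```ruby
# Hypothetical stand-in for the name check performed by the constructor.
# Assumption: a valid name holds only ASCII letters, digits, and underscores.
def valid_dataset_name?(name)
  name.to_s.match?(/\A[A-Za-z0-9_]+\z/)
end

# Illustrative candidate names, not taken from any real project.
%w[Ecoli_K12 sample-01 genome.v2 G_001].each do |candidate|
  status = valid_dataset_name?(candidate) ? 'accepted' : 'rejected'
  puts "#{candidate}: #{status}"
end
```

Note that, per the source above, the constructor saves the dataset and fires the :on_create hook only when no metadata file exists yet for that name.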

Public Instance Methods

activate!()

Activate a dataset. This removes the :inactive flag and clears any :warn entry that begins with "Inactive: "

# File lib/miga/dataset.rb, line 121
def activate!
  metadata[:inactive] = nil
  metadata[:warn] = nil if metadata[:warn] && metadata[:warn] =~ /^Inactive: /
  metadata.save
  project.recalculate_tasks("Reference dataset activated: #{name}") if ref?
  pull_hook :on_activate
end

active?()

Is this dataset active?

# File lib/miga/dataset.rb, line 167
def active?
  metadata[:inactive].nil? or !metadata[:inactive]
end
Also aliased as: is_active?
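
A dataset counts as active whenever the :inactive entry is missing, nil, or false. The following sketch reproduces the same boolean expression over a plain Hash, which is only a stand-in for MiGA::Metadata.

```ruby
# Same expression as active?, applied to a plain Hash (stand-in for MiGA::Metadata).
def active?(metadata)
  metadata[:inactive].nil? || !metadata[:inactive]
end

puts active?({})                  # flag never set
puts active?({ inactive: nil })   # flag cleared, as activate! does
puts active?({ inactive: false }) # flag explicitly false
puts active?({ inactive: true })  # flag set, as inactivate! does
```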

closest_relatives(how_many = 1, ref_project = false)

Returns an Array of up to how_many pairs (two-element Arrays) sorted by decreasing AAI:

  • 0: A String with the name(s) of the reference dataset.

  • 1: A Float with the AAI.

This function is currently supported only for query datasets when ref_project is false (default), and only for reference datasets when ref_project is true. It returns nil if the analysis is not supported, which includes multi-organism datasets and datasets without the required result.

# File lib/miga/dataset.rb, line 198
def closest_relatives(how_many = 1, ref_project = false)
  return nil if (ref? != ref_project) || multi?

  r = result(ref_project ? :taxonomy : :distances)
  return nil if r.nil?

  require 'miga/sqlite'
  MiGA::SQLite.new(r.file_path(:aai_db)).run(
    'SELECT seq2, aai FROM aai WHERE seq2 != ? ' \
    'GROUP BY seq2 ORDER BY aai DESC LIMIT ?', [name, how_many]
  )
end
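
The first guard clause of closest_relatives decides which combinations of dataset kind and ref_project are supported. As a rough illustration, the hypothetical helper below (closest_relatives_supported?, not part of MiGA) evaluates the same condition on plain booleans; it does not cover the second guard, which returns nil when the required result is missing.

```ruby
# Hypothetical helper mirroring the first guard of closest_relatives:
#   return nil if (ref? != ref_project) || multi?
def closest_relatives_supported?(is_ref:, ref_project:, is_multi:)
  !((is_ref != ref_project) || is_multi)
end

[true, false].product([true, false], [true, false]).each do |is_ref, ref_project, is_multi|
  ok = closest_relatives_supported?(
    is_ref: is_ref, ref_project: ref_project, is_multi: is_multi
  )
  puts "ref?=#{is_ref} ref_project=#{ref_project} multi?=#{is_multi} -> #{ok}"
end
```

Only two of the eight combinations pass: a single-organism query dataset with ref_project false, and a single-organism reference dataset with ref_project true.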

inactivate!(reason = nil)

Inactivate a dataset. This halts automated processing by the daemon.

If given, the reason string is saved, prefixed with "Inactive: ", as the metadata :warn entry.

# File lib/miga/dataset.rb, line 111
def inactivate!(reason = nil)
  metadata[:warn] = "Inactive: #{reason}" unless reason.nil?
  metadata[:inactive] = true
  metadata.save
  project.recalculate_tasks("Reference dataset inactivated: #{name}") if ref?
  pull_hook :on_inactivate
end
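
The way inactivate! and activate! share the :warn entry is easier to see in isolation. This is a minimal sketch over a plain Hash, leaving out the metadata save, the task recalculation, and the hooks; the helper names inactivate and activate and the example reasons are hypothetical.

```ruby
# Flag handling of inactivate!, reduced to a plain Hash.
def inactivate(metadata, reason = nil)
  metadata[:warn] = "Inactive: #{reason}" unless reason.nil?
  metadata[:inactive] = true
  metadata
end

# Flag handling of activate!, reduced to a plain Hash.
def activate(metadata)
  metadata[:inactive] = nil
  # Only warnings carrying the "Inactive: " prefix are cleared.
  metadata[:warn] = nil if metadata[:warn] && metadata[:warn] =~ /^Inactive: /
  metadata
end

md = inactivate({}, 'contaminated assembly')
p md           # inactive flag set, prefixed warning stored
p activate(md) # both entries reset to nil

# A warning from another origin survives activation.
p activate({ inactive: true, warn: 'Low completeness' })
```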

info()

Get standard metadata values for the dataset as an Array

# File lib/miga/dataset.rb, line 131
def info
  MiGA::Dataset.INFO_FIELDS.map do |k|
    k == 'name' ? name : metadata[k]
  end
end
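
The Array returned by info follows the order of INFO_FIELDS, with the name taken from the dataset itself rather than from the metadata. Here is a small sketch of that mapping using a plain Hash; info_fields and info_row are hypothetical names, and the keys are converted to Symbols only because a plain Hash, unlike the metadata object assumed here, distinguishes 'type' from :type.

```ruby
# Field list copied from INFO_FIELDS above.
def info_fields
  %w[name created updated type ref user description comments]
end

# Hypothetical stand-in for info: metadata is a Hash with Symbol keys.
def info_row(name, metadata)
  info_fields.map { |k| k == 'name' ? name : metadata[k.to_sym] }
end

row = info_row('G_001', { type: :genome, ref: true })
# Pair each field with its value; fields absent from the metadata map to nil.
info_fields.zip(row).each { |field, value| puts "#{field}: #{value.inspect}" }
```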
is_active?()

Same as active? for backwards-compatibility

Alias for: active?
is_multi?()

Same as multi? for backwards-compatibility

Alias for: multi?
is_nonmulti?()

Same as nonmulti? for backwards-compatibility

Alias for: nonmulti?
is_query?()

Same as query? for backwards-compatibility

Alias for: query?
is_ref?()

Same as ref? for backwards-compatibility

Alias for: ref?

metadata()

MiGA::Metadata with information about the dataset

# File lib/miga/dataset.rb, line 71
def metadata
  if @metadata.nil?
    @metadata = MiGA::Metadata.new(*@metadata_future)
    pull_hook :on_load
  end
  @metadata
end
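
The accessor above loads the metadata lazily: the constructor only stores the arguments in @metadata_future, and the object is built on first access. The sketch below shows the same pattern with an illustrative class, LazyRecord, which is not part of MiGA; a Hash stands in for MiGA::Metadata and a counter stands in for the :on_load hook.

```ruby
# Illustrative class (not part of MiGA) that defers loading until first use.
class LazyRecord
  attr_reader :loads

  def initialize(path, defaults)
    @metadata = nil
    @metadata_future = [path, defaults] # arguments kept for the deferred load
    @loads = 0
  end

  def metadata
    if @metadata.nil?
      path, defaults = @metadata_future
      @metadata = { path: path }.merge(defaults) # stands in for MiGA::Metadata.new
      @loads += 1                                 # stands in for the :on_load hook
    end
    @metadata
  end
end

rec = LazyRecord.new('metadata/G_001.json', { ref: true })
rec.metadata
rec.metadata
puts rec.loads # the load ran once, on the first call only
```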

multi?()

Is this dataset known to be multi-organism?

# File lib/miga/dataset.rb, line 151
def multi?
  return false if metadata[:type].nil? || @@KNOWN_TYPES[type].nil?

  @@KNOWN_TYPES[type][:multi]
end
Also aliased as: is_multi?

nonmulti?()

Is this dataset known to be single-organism?

# File lib/miga/dataset.rb, line 159
def nonmulti?
  return false if metadata[:type].nil? || @@KNOWN_TYPES[type].nil?

  !@@KNOWN_TYPES[type][:multi]
end
Also aliased as: is_nonmulti?
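
multi? and nonmulti? are not complements of each other: both return false when the type is unset or unknown. The sketch below makes that three-way outcome visible; the known_types table is an illustrative stand-in, since the actual contents of KNOWN_TYPES are not shown on this page.

```ruby
# Illustrative type table (assumption, not the real KNOWN_TYPES).
def known_types
  { genome: { multi: false }, metagenome: { multi: true } }
end

def multi?(type)
  return false if type.nil? || known_types[type].nil?

  known_types[type][:multi]
end

def nonmulti?(type)
  return false if type.nil? || known_types[type].nil?

  !known_types[type][:multi]
end

[:genome, :metagenome, :unregistered, nil].each do |t|
  puts "#{t.inspect}: multi?=#{multi?(t)} nonmulti?=#{nonmulti?(t)}"
end
```

This is why the task lists above state that multi-only and single-only tasks are both skipped for unknown types.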

query?()

Is this dataset a query (non-reference)?

# File lib/miga/dataset.rb, line 145
def query?
  !metadata[:ref]
end
Also aliased as: is_query?

ref?()

Is this dataset a reference?

# File lib/miga/dataset.rb, line 139
def ref?
  !query?
end
Also aliased as: is_ref?

remove!()

Delete the dataset with all its contents (including results) and return nil

# File lib/miga/dataset.rb, line 101
def remove!
  results.each(&:remove!)
  metadata.remove!
  pull_hook :on_remove
end

save()

Save any changes you've made in the dataset

# File lib/miga/dataset.rb, line 81
def save
  MiGA.DEBUG "Dataset.metadata: #{metadata.data}"
  metadata.save
  pull_hook :on_save
end
Also aliased as: save!
save!()

Currently save! is simply an alias of save, for compatibility with the Project interface

Alias for: save

type()

Get the type of the dataset as a Symbol

# File lib/miga/dataset.rb, line 94
def type
  metadata[:type]
end