class MiGA::Dataset

Dataset representation in MiGA.

@package MiGA @license Artistic-2.0

Attributes

metadata[R]

MiGA::Metadata with information about the dataset.

name[R]

Datasets are uniquely identified by name in a project.

project[R]

MiGA::Project that contains the dataset.

Public Class Methods

INFO_FIELDS()

Standard fields of metadata for datasets.

# File lib/miga/dataset.rb, line 25
def INFO_FIELDS
  %w(name created updated type ref user description comments)
end
KNOWN_TYPES()
# File lib/miga/dataset/base.rb, line 9
def KNOWN_TYPES ; @@KNOWN_TYPES ; end
PREPROCESSING_TASKS()
# File lib/miga/dataset/base.rb, line 10
def PREPROCESSING_TASKS ; @@PREPROCESSING_TASKS ; end
RESULT_DIRS()
# File lib/miga/dataset/base.rb, line 8
def RESULT_DIRS ; @@RESULT_DIRS ; end
exist?(project, name)

Does the project already have a dataset with that name?

# File lib/miga/dataset.rb, line 19
def exist?(project, name)
  not project.dataset_names_hash[name].nil?
end
new(project, name, is_ref=true, metadata={})

Create a MiGA::Dataset object in a MiGA::Project, identified by a name that is unique within the project and contains only alphanumerics and underscores. is_ref indicates whether the dataset is to be treated as a reference (true, default) or as a query (false). Pass any additional metadata as a Hash.

# File lib/miga/dataset.rb, line 50
def initialize(project, name, is_ref=true, metadata={})
  raise "Invalid name '#{name}', please use only alphanumerics and " +
    "underscores." unless name.miga_name?
  @project = project
  @name = name
  metadata[:ref] = is_ref
  @metadata = MiGA::Metadata.new(
    File.expand_path("metadata/#{name}.json", project.path), metadata )
end
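The name check at the top of the constructor can be illustrated in isolation. The following is a minimal sketch, not MiGA's own code: valid_dataset_name? and check_dataset_name! are hypothetical helpers, and the regular expression is an assumption inferred from the error message ("only alphanumerics and underscores"), not the library's actual miga_name? implementation.

```ruby
# Hypothetical stand-in for MiGA's String#miga_name?; the exact pattern
# used by the library is an assumption based on the error message.
def valid_dataset_name?(name)
  !!(name =~ /\A[A-Za-z0-9_]+\z/)
end

# Hypothetical helper mirroring the guard at the top of the constructor.
def check_dataset_name!(name)
  unless valid_dataset_name?(name)
    raise "Invalid name '#{name}', please use only alphanumerics and " +
      "underscores."
  end
  name
end

puts valid_dataset_name?("E_coli_K12")   # alphanumerics and underscores only
puts valid_dataset_name?("E. coli K-12") # contains a dot, spaces and a hyphen
```

Validating the name before creating the object matters because the name is also used to build the metadata file path (metadata/<name>.json).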

Public Instance Methods

activate!()

Activate a dataset. This removes the :inactive flag.

# File lib/miga/dataset.rb, line 89
def activate!
  self.metadata[:inactive] = nil
  self.metadata.save
end
closest_relatives(how_many=1, ref_project=false)

Returns an Array of up to how_many pairs (two-element Arrays), sorted by decreasing AAI:

  • 0: A String with the name of the reference dataset.

  • 1: A Float with the AAI.

This method is currently supported only for query datasets when ref_project is false (default), and only for reference datasets when ref_project is true. It returns nil if the analysis is not supported (including for multi-organism datasets) or if the required result is not available.

# File lib/miga/dataset.rb, line 149
def closest_relatives(how_many=1, ref_project=false)
  return nil if (is_ref? != ref_project) or is_multi?
  r = result(ref_project ? :taxonomy : :distances)
  return nil if r.nil?
  db = SQLite3::Database.new(r.file_path :aai_db)
  db.execute("SELECT seq2, aai FROM aai WHERE seq2 != ? " +
    "GROUP BY seq2 ORDER BY aai DESC LIMIT ?", [name, how_many])
end
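To make the shape of the return value concrete, here is a plain-Ruby approximation of the selection performed by the SQL query. It is only a sketch under stated assumptions: AAI_ROWS is hypothetical in-memory data standing in for the aai table, top_relatives is a hypothetical helper, and each reference is assumed to appear once (the real query additionally groups by seq2).

```ruby
# Hypothetical rows standing in for the `aai` table: [seq2, aai].
AAI_ROWS = [
  ["ref_A", 92.5],
  ["ref_B", 97.1],
  ["query_1", 100.0],
  ["ref_C", 88.0]
]

# Plain-Ruby approximation of the query: skip the dataset itself,
# sort by AAI (highest first) and keep the first `how_many` pairs.
def top_relatives(rows, own_name, how_many = 1)
  rows.reject { |name, _| name == own_name }
      .sort_by { |_, aai| -aai }
      .first(how_many)
end

p top_relatives(AAI_ROWS, "query_1", 2)
# prints [["ref_B", 97.1], ["ref_A", 92.5]]
```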
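ignore_task?(task)

Should task be ignored for this dataset? All tasks are ignored for inactive datasets, and an explicit run_<task> entry in the metadata overrides the default rules.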

# File lib/miga/dataset.rb, line 132
def ignore_task?(task)
  return true unless is_active?
  return !metadata["run_#{task}"] unless metadata["run_#{task}"].nil?
  return true if task==:taxonomy and project.metadata[:ref_project].nil?
  pattern = [true, false]
  ( [@@_EXCLUDE_NOREF_TASKS_H[task], is_ref?     ]==pattern or
    [@@_ONLY_MULTI_TASKS_H[task],    is_multi?   ]==pattern or
    [@@_ONLY_NONMULTI_TASKS_H[task], is_nonmulti?]==pattern )
end
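The final expression relies on comparing [flag, state] pairs against [true, false]: a rule fires only when the task is flagged in a table and the dataset lacks the matching property. The sketch below reproduces that idiom in isolation; the tables, task names and helper methods are hypothetical and do not reflect MiGA's actual task lists.

```ruby
# Hypothetical task tables; the real ones are class variables in MiGA.
REF_ONLY_TASKS   = { ref_only_task: true }   # skipped for query datasets
MULTI_ONLY_TASKS = { multi_only_task: true } # skipped for non-multi datasets

# A rule fires only when the task is flagged (true) and the dataset
# lacks the required property (false), i.e. the pair is [true, false].
# Tasks missing from a table yield nil, so the rule never fires for them.
def rule_fires?(task_flag, dataset_has_property)
  [task_flag, dataset_has_property] == [true, false]
end

def skip?(task, is_ref:, is_multi:)
  rule_fires?(REF_ONLY_TASKS[task], is_ref) ||
    rule_fires?(MULTI_ONLY_TASKS[task], is_multi)
end

puts skip?(:ref_only_task, is_ref: false, is_multi: false) # query dataset
puts skip?(:ref_only_task, is_ref: true,  is_multi: false) # reference dataset
```

Comparing against a single pattern keeps each rule on one line and treats missing table entries the same as unflagged tasks.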
inactivate!()

Inactivate a dataset. This halts automated processing by the daemon.

# File lib/miga/dataset.rb, line 82
def inactivate!
  self.metadata[:inactive] = true
  self.metadata.save
end
info()

Get the standard metadata values for the dataset as an Array, in the order given by INFO_FIELDS.

# File lib/miga/dataset.rb, line 96
def info
  MiGA::Dataset.INFO_FIELDS.map do |k|
    (k=="name") ? self.name : metadata[k.to_sym]
  end
end
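As an illustration of the resulting Array, the following sketch applies the same mapping to a plain Hash. The field list is copied from INFO_FIELDS above; info_row and the sample metadata are hypothetical, and a Hash is used in place of a MiGA::Metadata object.

```ruby
# Field list as returned by MiGA::Dataset.INFO_FIELDS.
INFO_FIELDS = %w(name created updated type ref user description comments)

# Hypothetical helper: a plain Hash stands in for MiGA::Metadata.
# "name" comes from the dataset itself; all other fields from metadata.
def info_row(name, metadata)
  INFO_FIELDS.map { |k| k == "name" ? name : metadata[k.to_sym] }
end

p info_row("ds1", { type: :genome, ref: true, user: "alice" })
# prints ["ds1", nil, nil, :genome, true, "alice", nil, nil]
```

Fields absent from the metadata appear as nil, so the Array always has one entry per standard field.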
is_active?()

Is this dataset active?

# File lib/miga/dataset.rb, line 126
def is_active?
  metadata[:inactive].nil? or !metadata[:inactive]
end
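The interplay between inactivate!, activate! and is_active? comes down to a single :inactive metadata entry. A minimal sketch, assuming a plain Hash in place of MiGA::Metadata and a hypothetical active? helper:

```ruby
# Hypothetical helper: a dataset counts as active when the :inactive
# entry is missing, nil or false.
def active?(metadata)
  metadata[:inactive].nil? || !metadata[:inactive]
end

meta = {}
puts active?(meta)      # no flag set yet

meta[:inactive] = true  # what inactivate! stores
puts active?(meta)

meta[:inactive] = nil   # what activate! stores
puts active?(meta)
```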
is_multi?()

Is this dataset known to be multi-organism?

# File lib/miga/dataset.rb, line 112
def is_multi?
  return false if metadata[:type].nil? or @@KNOWN_TYPES[type].nil?
  @@KNOWN_TYPES[type][:multi]
end
is_nonmulti?()

Is this dataset known to be single-organism?

# File lib/miga/dataset.rb, line 119
def is_nonmulti?
  return false if metadata[:type].nil? or @@KNOWN_TYPES[type].nil?
  !@@KNOWN_TYPES[type][:multi]
end
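Note that is_multi? and is_nonmulti? are not simple negations of each other: both return false when the type is missing or unknown. The sketch below shows this three-state behavior with a hypothetical type table (TYPES) and hypothetical helpers; the real KNOWN_TYPES entries may differ.

```ruby
# Hypothetical type table; the real KNOWN_TYPES lives in MiGA::Dataset.
TYPES = {
  genome:     { multi: false },
  metagenome: { multi: true }
}

def multi?(type)
  return false if type.nil? || TYPES[type].nil?
  TYPES[type][:multi]
end

def nonmulti?(type)
  return false if type.nil? || TYPES[type].nil?
  !TYPES[type][:multi]
end

# An untyped dataset (nil) is neither multi nor non-multi.
[:genome, :metagenome, nil].each do |t|
  puts "#{t.inspect}: multi=#{multi?(t)} nonmulti=#{nonmulti?(t)}"
end
```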
is_query?()

Is this dataset a query (non-reference)?

# File lib/miga/dataset.rb, line 108
def is_query? ; !metadata[:ref] ; end
is_ref?()

Is this dataset a reference?

# File lib/miga/dataset.rb, line 104
def is_ref? ; !!metadata[:ref] ; end
remove!()

Delete the dataset with all its contents (including results) and return nil.

# File lib/miga/dataset.rb, line 75
def remove!
  self.results.each{ |r| r.remove! }
  self.metadata.remove!
end
save()

Save any changes you've made in the dataset. If the taxonomy namespace (metadata[:tax][:ns]) is "COMMUNITY", the type is set to :metagenome before saving.

# File lib/miga/dataset.rb, line 62
def save
  self.metadata[:type] = :metagenome if !metadata[:tax].nil? and
    !metadata[:tax][:ns].nil? and metadata[:tax][:ns]=="COMMUNITY"
  self.metadata.save
end
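The type rule applied before saving can be sketched on its own. This is an illustration only: apply_type_rule is a hypothetical helper, and nested Hashes stand in for the MiGA::Metadata object and its :tax entry.

```ruby
# Hypothetical helper: force the type to :metagenome when the taxonomy
# namespace is "COMMUNITY"; otherwise leave the metadata untouched.
def apply_type_rule(metadata)
  tax = metadata[:tax]
  metadata[:type] = :metagenome if !tax.nil? && tax[:ns] == "COMMUNITY"
  metadata
end

p apply_type_rule({ type: :genome, tax: { ns: "COMMUNITY" } })
p apply_type_rule({ type: :genome, tax: { ns: "NCBI" } })
p apply_type_rule({ type: :genome })
```

Only the first call changes the type; the other two return the metadata as given.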
type()

Get the type of the dataset as a Symbol.

# File lib/miga/dataset.rb, line 70
def type ; metadata[:type] ; end