Package starcluster :: Module cluster :: Class Cluster

Class Cluster


Instance Methods

__init__(self, ec2_conn=None, spot_bid=None, cluster_tag=None, cluster_description=None, cluster_size=None, cluster_user=None, cluster_shell=None, master_image_id=None, master_instance_type=None, node_image_id=None, node_instance_type=None, node_instance_types=[], availability_zone=None, keyname=None, key_location=None, volumes=[], plugins=[], permissions=[], refresh_interval=30, disable_queue=False, disable_threads=False, cluster_group=None, force_spot_master=False, **kwargs)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature

__repr__(self)
repr(x)

load_volumes(self, vols)
Iterate through vols and set device/partition settings automatically if not specified.

load_plugins(self, plugins)

update(self, kwargs)

_validate_running_instances(self)
Validate existing instances against this cluster's settings

get(self, name)

__str__(self)
str(x)

load_receipt(self, load_plugins=True)
Load the original settings used to launch this cluster into this Cluster object.

__getstate__(self)

get_nodes_or_raise(self)

get_node_by_dns_name(self, dns_name)

get_node_by_id(self, instance_id)

get_node_by_alias(self, alias)

_nodes_in_states(self, states)

get_spot_requests_or_raise(self)
 
create_node(self, alias, image_id=None, instance_type=None, zone=None, placement_group=None, spot_bid=None, force_flat=False)

create_nodes(self, aliases, image_id=None, instance_type=None, count=1, zone=None, placement_group=None, spot_bid=None, force_flat=False)
Convenience method for requesting instances with this cluster's settings.

_get_next_node_num(self)

add_node(self, alias=None, no_create=False)
Add a single node to this cluster

add_nodes(self, num_nodes, aliases=None, no_create=False)
Add new nodes to this cluster

remove_node(self, node, terminate=True)
Remove a single node from this cluster

remove_nodes(self, nodes, terminate=True)
Remove a list of nodes from this cluster

_get_launch_map(self, reverse=False)
Groups all node aliases that have similar instance types/image ids. Returns a dictionary that's used to launch all similar instance types and image ids in the same request.

_get_type_and_image_id(self, alias)
Returns (instance_type, image_id) for a given alias based on the map returned from self._get_launch_map

_create_flat_rate_cluster(self)
Launches cluster using flat-rate instances.

_create_spot_cluster(self)
Launches cluster using all spot instances.
 
is_spot_cluster(self)
Returns True if all nodes are spot instances

has_spot_nodes(self)
Returns True if any nodes are spot instances

is_ebs_cluster(self)
Returns True if all nodes are EBS-backed

has_ebs_nodes(self)
Returns True if any nodes are EBS-backed

is_stoppable(self)
Returns True if all nodes are stoppable (i.e. non-spot and EBS-backed)

has_stoppable_nodes(self)
Returns True if any nodes are stoppable (i.e. non-spot and EBS-backed)

is_cluster_compute(self)
Returns True if all instances are Cluster/GPU Compute type

has_cluster_compute_nodes(self)

is_cluster_up(self)
Check that all nodes are 'running' and that SSH is up on all nodes. This method will return False if any spot requests are in an 'open' state.

get_spinner(self, msg)
Logs a status msg, starts a spinner, and returns the spinner object.

wait_for_active_spots(self, spots=None)
Wait for all open spot requests for this cluster to transition to 'active'.

wait_for_active_instances(self, nodes=None)
Wait indefinitely for cluster nodes to show up.

wait_for_running_instances(self, nodes=None)
Wait until all cluster nodes are in a 'running' state

wait_for_ssh(self, nodes=None)
Wait until SSH is up on all cluster nodes

wait_for_cluster(self, msg='Waiting for cluster to come up...')
Wait for cluster to come up and display progress bar.

is_cluster_stopped(self)
Check whether all nodes are in the 'stopped' state

is_cluster_terminated(self)
Check whether all nodes are in a 'terminated' state
 
attach_volumes_to_master(self)
Attach each volume to the master node

detach_volumes(self)
Detach all volumes from all nodes

restart_cluster(self)
Reboot all instances and reconfigure the cluster

stop_cluster(self, terminate_unstoppable=False)
Shut down this cluster by detaching all volumes and 'stopping' all nodes

terminate_cluster(self)
Destroy this cluster by first detaching all volumes, shutting down all instances, cancelling all spot requests (if any), removing its placement group (if any), and removing its security group.

start(self, create=True, create_only=False, validate=True, validate_only=False, validate_running=False)
Creates and configures a cluster from this cluster template's settings.

_start(self, create=True, create_only=False)
Create and configure a cluster from this cluster template's settings (Does not attempt to validate before running)

_setup_cluster(self)
This method waits for all nodes to come up and then runs the default StarCluster setup routines followed by any additional plugin setup routines

run_plugins(self, plugins=None, method_name='run', node=None, reverse=False)
Run all plugins specified in this Cluster object's self.plugins list. Uses plugins list instead of self.plugins if specified.

run_plugin(self, plugin, name='', method_name='run', node=None)
Run a StarCluster plugin.

is_running_valid(self)
Checks whether the current running instances are compatible with this cluster template's settings

_validate(self)
Checks that all cluster template settings are valid.

is_valid(self)
Returns True if all cluster template settings are valid
 
_validate_spot_bid(self)

_validate_cluster_size(self)

_validate_shell_setting(self)

_validate_image_settings(self)

_validate_zone(self)

__check_platform(self, image_id, instance_type)
Validates whether an image_id (AMI) is compatible with a given instance_type.

_validate_instance_types(self)

_validate_cluster_compute(self)

_validate_ebs_aws_settings(self)
Verify EBS volumes exist and that each volume's zone matches this cluster's zone setting.

_validate_permission_settings(self)

_validate_ebs_settings(self)
Check EBS vols for missing/duplicate DEVICE/PARTITION/MOUNT_PATHs and validate these settings.

_has_all_required_settings(self)

_validate_credentials(self)

_validate_keypair(self)

ssh_to_master(self, user='root')

ssh_to_node(self, alias, user='root')

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

Properties
  zone
If volumes are specified, this method determines the common availability zone between those volumes.
  _security_group
  cluster_group
  placement_group
  master_node
  nodes
  running_nodes
  stopped_nodes
  spot_requests
  progress_bar

Inherited from object: __class__

Method Details

__init__(self, ec2_conn=None, spot_bid=None, cluster_tag=None, cluster_description=None, cluster_size=None, cluster_user=None, cluster_shell=None, master_image_id=None, master_instance_type=None, node_image_id=None, node_instance_type=None, node_instance_types=[], availability_zone=None, keyname=None, key_location=None, volumes=[], plugins=[], permissions=[], refresh_interval=30, disable_queue=False, disable_threads=False, cluster_group=None, force_spot_master=False, **kwargs)
(Constructor)

x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Overrides: object.__init__
(inherited documentation)

__repr__(self)
(Representation operator)

repr(x)

Overrides: object.__repr__
(inherited documentation)

load_volumes(self, vols)

Iterate through vols and set device/partition settings automatically if not specified.

This method assigns the first volume to /dev/sdz, the second to /dev/sdy, etc., for all volumes that do not include a device/partition setting.
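
The countdown from /dev/sdz can be sketched as follows. This is only an illustration, not StarCluster's implementation: the function name assign_devices, the dict layout of a volume, and the volume ids are hypothetical, and it assumes that volumes with an explicit device do not consume a letter.

```python
import string

def assign_devices(vols):
    """Give every volume without a 'device' a name counting down from /dev/sdz."""
    # Letters in reverse order: z, y, x, ...
    letters = iter(reversed(string.ascii_lowercase))
    for vol in vols:
        if not vol.get("device"):
            # Assumption: only volumes lacking a device consume a letter.
            vol["device"] = "/dev/sd" + next(letters)
    return vols

# Hypothetical volumes; the second one already names its device.
vols = assign_devices([
    {"id": "vol-1"},
    {"id": "vol-2", "device": "/dev/sdf"},
    {"id": "vol-3"},
])
print([v["device"] for v in vols])
```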

__str__(self)
(Informal representation operator)

str(x)

Overrides: object.__str__
(inherited documentation)

load_receipt(self, load_plugins=True)

Load the original settings used to launch this cluster into this Cluster object. The settings are loaded from the cluster group's description field.

create_nodes(self, aliases, image_id=None, instance_type=None, count=1, zone=None, placement_group=None, spot_bid=None, force_flat=False)

Convenience method for requesting instances with this cluster's settings. All settings (kwargs) except force_flat default to cluster settings if not provided. Passing force_flat=True ignores spot_bid completely, forcing a flat-rate instance to be requested.
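
The fallback to cluster settings and the effect of force_flat can be sketched like this. The helper resolve_request and the settings dict are hypothetical stand-ins for illustration; the real method talks to EC2 and is not reproduced here.

```python
def resolve_request(cluster_settings, force_flat=False, **overrides):
    """Merge per-call overrides over cluster-wide defaults."""
    request = dict(cluster_settings)
    # Only arguments that were actually provided replace the defaults.
    for key, value in overrides.items():
        if value is not None:
            request[key] = value
    if force_flat:
        # force_flat wins over any spot bid, whether default or explicit.
        request["spot_bid"] = None
    return request

# Hypothetical cluster-wide defaults.
defaults = {"image_id": "ami-17b15e7e", "instance_type": "m1.small", "spot_bid": 0.05}
print(resolve_request(defaults, instance_type="c1.xlarge"))
print(resolve_request(defaults, force_flat=True))
```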

add_nodes(self, num_nodes, aliases=None, no_create=False)

Add new nodes to this cluster

aliases - list of aliases to assign to new nodes (len must equal num_nodes)
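
A rough sketch of the length check on aliases. The helper resolve_aliases is hypothetical, ValueError stands in for whatever exception the real method raises, and the nodeNNN naming of default aliases is inferred from the example aliases elsewhere on this page rather than confirmed.

```python
def resolve_aliases(num_nodes, aliases=None, next_node_num=1):
    """Return the aliases to use for num_nodes new nodes."""
    if aliases is not None:
        if len(aliases) != num_nodes:
            raise ValueError("got %d aliases for %d nodes"
                             % (len(aliases), num_nodes))
        return list(aliases)
    # Assumption: default aliases continue the nodeNNN numbering.
    return ["node%03d" % n
            for n in range(next_node_num, next_node_num + num_nodes)]

print(resolve_aliases(2, next_node_num=4))
print(resolve_aliases(2, aliases=["worker-a", "worker-b"]))
```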

_get_launch_map(self, reverse=False)

Groups all node aliases that have similar instance types/image ids.
Returns a dictionary that's used to launch all similar instance types
and image ids in the same request. Example return value:

{('c1.xlarge', 'ami-a5c02dcc'): ['node001', 'node002'],
 ('m1.large', 'ami-a5c02dcc'): ['node003'],
 ('m1.small', 'ami-17b15e7e'): ['master', 'node005', 'node006'],
 ('m1.small', 'ami-19e17a2b'): ['node004']}

Passing reverse=True will return the same information only keyed by
node aliases:

{'master': ('m1.small', 'ami-17b15e7e'),
 'node001': ('c1.xlarge', 'ami-a5c02dcc'),
 'node002': ('c1.xlarge', 'ami-a5c02dcc'),
 'node003': ('m1.large', 'ami-a5c02dcc'),
 'node004': ('m1.small', 'ami-19e17a2b'),
 'node005': ('m1.small', 'ami-17b15e7e'),
 'node006': ('m1.small', 'ami-17b15e7e')}
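
The two shapes can be reproduced with a small stand-alone sketch. The function build_launch_map and its alias-to-settings input are hypothetical; only the example data comes from the return values shown above.

```python
from collections import defaultdict

def build_launch_map(node_settings, reverse=False):
    """node_settings maps alias -> (instance_type, image_id)."""
    if reverse:
        # Keyed by alias: one (instance_type, image_id) pair per node.
        return dict(node_settings)
    groups = defaultdict(list)
    for alias in sorted(node_settings):
        # Nodes sharing a type and image end up in the same launch group.
        groups[node_settings[alias]].append(alias)
    return dict(groups)

settings = {
    'master': ('m1.small', 'ami-17b15e7e'),
    'node001': ('c1.xlarge', 'ami-a5c02dcc'),
    'node002': ('c1.xlarge', 'ami-a5c02dcc'),
    'node003': ('m1.large', 'ami-a5c02dcc'),
    'node004': ('m1.small', 'ami-19e17a2b'),
    'node005': ('m1.small', 'ami-17b15e7e'),
    'node006': ('m1.small', 'ami-17b15e7e'),
}
launch_map = build_launch_map(settings)
print(len(launch_map))  # one entry per launch request
```

With this grouping, seven nodes need four launch requests instead of seven.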

_create_flat_rate_cluster(self)

Launches cluster using flat-rate instances. This method attempts to minimize the number of launch requests by grouping nodes of the same type/ami and launching each group simultaneously within a single launch request. This is especially important for Cluster Compute instances given that Amazon *highly* recommends requesting all Cluster Compute instances in a single launch request.

_create_spot_cluster(self)

Launches cluster using all spot instances. This method makes a single spot request for each node in the cluster since spot instances *always* have an ami_launch_index of 0. This is needed in order to correctly assign aliases to nodes.

is_stoppable(self)

Returns True if all nodes are stoppable (i.e. non-spot and EBS-backed)

has_stoppable_nodes(self)

Returns True if any nodes are stoppable (i.e. non-spot and EBS-backed)
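
The difference between the two checks is all versus any over the same per-node test. A minimal sketch, assuming a hypothetical Node record with is_spot and is_ebs_backed fields (the real node objects are not shown on this page):

```python
from collections import namedtuple

# Hypothetical stand-in for a cluster node.
Node = namedtuple("Node", ["alias", "is_spot", "is_ebs_backed"])

def node_is_stoppable(node):
    # Stoppable means non-spot and EBS-backed.
    return not node.is_spot and node.is_ebs_backed

def is_stoppable(nodes):
    return all(node_is_stoppable(n) for n in nodes)

def has_stoppable_nodes(nodes):
    return any(node_is_stoppable(n) for n in nodes)

nodes = [
    Node("master", is_spot=False, is_ebs_backed=True),
    Node("node001", is_spot=True, is_ebs_backed=True),
]
print(is_stoppable(nodes), has_stoppable_nodes(nodes))
```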

get_spinner(self, msg)

Logs a status msg, starts a spinner, and returns the spinner object.
This is useful for long-running processes:

    s = self.get_spinner("Long running process running...")
    # (do something)
    s.stop()

wait_for_cluster(self, msg='Waiting for cluster to come up...')

Wait for cluster to come up and display progress bar. Waits for all spot requests to become 'active', all instances to be in a 'running' state, and for all SSH daemons to come up.

msg - custom message to print out before waiting on the cluster
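
The three waits can be pictured as sequential polling loops. This sketch is an assumption about the general shape only: the helper names, the injected check callables, and the use of a 30-second interval (borrowed from the refresh_interval default in the constructor) are all hypothetical, and the progress bar is omitted.

```python
import time

def wait_until(check, interval=30, sleep=time.sleep):
    """Poll check() until it returns True; return the number of polls made."""
    polls = 1
    while not check():
        sleep(interval)
        polls += 1
    return polls

def wait_for_cluster(spots_active, instances_running, ssh_up,
                     interval=30, sleep=time.sleep):
    """Run the three waits in the documented order."""
    return [wait_until(stage, interval, sleep)
            for stage in (spots_active, instances_running, ssh_up)]

# Demo with stand-in checks that succeed immediately, so nothing sleeps.
print(wait_for_cluster(lambda: True, lambda: True, lambda: True))
```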

restart_cluster(self)

Reboot all instances and reconfigure the cluster

Decorators:
  • @print_timing('Restarting cluster')

stop_cluster(self, terminate_unstoppable=False)

Shut down this cluster by detaching all volumes and 'stopping' all nodes

In general, all nodes in the cluster must be 'stoppable', meaning all nodes are backed by flat-rate EBS-backed instances. If any 'unstoppable' nodes are found, an exception is raised. A node is 'unstoppable' if it is backed by either a spot or S3-backed instance.

If the cluster contains a mix of 'stoppable' and 'unstoppable' nodes, you can stop all stoppable nodes and terminate any unstoppable nodes by setting terminate_unstoppable=True.
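
The decision described above can be sketched as a pure function. The names plan_stop and UnstoppableNodesError are hypothetical; the actual exception type raised by stop_cluster is not given on this page.

```python
class UnstoppableNodesError(Exception):
    """Hypothetical exception used only in this sketch."""

def plan_stop(nodes, terminate_unstoppable=False):
    """nodes is a list of (alias, stoppable) pairs.

    Returns (aliases_to_stop, aliases_to_terminate).
    """
    to_stop = [alias for alias, stoppable in nodes if stoppable]
    unstoppable = [alias for alias, stoppable in nodes if not stoppable]
    if unstoppable and not terminate_unstoppable:
        raise UnstoppableNodesError(
            "cannot stop: %s" % ", ".join(unstoppable))
    return to_stop, unstoppable

mixed = [("master", True), ("node001", False)]
print(plan_stop(mixed, terminate_unstoppable=True))
```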

start(self, create=True, create_only=False, validate=True, validate_only=False, validate_running=False)

Creates and configures a cluster from this cluster template's settings.

create - create new nodes when starting the cluster. set to False to
         use existing nodes
create_only - only create the cluster node instances, don't configure
              the cluster
validate - whether or not to validate the cluster settings used.
           False will ignore validate_only and validate_running
           keywords and is effectively the same as running _start
validate_only - only validate cluster settings, do not create or
                configure cluster
validate_running - whether or not to validate the existing instances
                   being used against this cluster's settings
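
How the flags interact can be sketched as a function that only lists the steps that would run. The function plan_start and the step labels are hypothetical, and running both validation steps in this order when validate_running is set is an assumption, not something this page states.

```python
def plan_start(create=True, create_only=False, validate=True,
               validate_only=False, validate_running=False):
    """Return the ordered list of steps implied by the flags."""
    steps = []
    if validate:
        if validate_running:
            # Assumption: running instances are checked before settings.
            steps.append("validate running instances")
        steps.append("validate settings")
        if validate_only:
            # Nothing is created or configured.
            return steps
    # From here on the behaviour matches _start(create, create_only).
    if create:
        steps.append("create nodes")
    if not create_only:
        steps.append("configure cluster")
    return steps

print(plan_start())
print(plan_start(validate=False, validate_only=True))
```

Note that validate=False makes validate_only and validate_running irrelevant, as the parameter description says.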

_start(self, create=True, create_only=False)

Create and configure a cluster from this cluster template's settings
(Does not attempt to validate before running)

create - create new nodes when starting the cluster. set to False to
         use existing nodes
create_only - only create the cluster node instances, don't configure
              the cluster

Decorators:
  • @print_timing("Starting cluster")

run_plugins(self, plugins=None, method_name='run', node=None, reverse=False)

Run all plugins specified in this Cluster object's self.plugins list. Uses plugins list instead of self.plugins if specified.

Each entry in plugins must be a tuple: the first element is the plugin's name, the second element is the plugin object (an instance of a ClusterSetup subclass)
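
A simplified sketch of dispatching over (name, plugin) tuples. Everything here is illustrative: RecordingPlugin is a toy class rather than a ClusterSetup subclass, its methods take none of the cluster arguments a real plugin hook would presumably receive, and skipping plugins that lack the requested method is an assumption.

```python
class RecordingPlugin(object):
    """Toy plugin that records which of its methods were called."""
    def __init__(self):
        self.calls = []

    def run(self):
        self.calls.append("run")
        return "ran"

    def on_add_node(self, node):
        self.calls.append(("on_add_node", node))
        return node

def run_plugins(plugins, method_name="run", node=None, reverse=False):
    """Call method_name on each (name, plugin) tuple, optionally in reverse."""
    ordered = list(reversed(plugins)) if reverse else list(plugins)
    results = []
    for name, plugin in ordered:
        method = getattr(plugin, method_name, None)
        if method is None:
            continue  # assumption: plugins without the method are skipped
        # The node, when given, is passed as the first argument.
        args = (node,) if node is not None else ()
        results.append((name, method(*args)))
    return results

plugins = [("first", RecordingPlugin()), ("second", RecordingPlugin())]
print(run_plugins(plugins, reverse=True))
print(run_plugins(plugins, method_name="on_add_node", node="node001"))
```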

run_plugin(self, plugin, name='', method_name='run', node=None)

Run a StarCluster plugin.

plugin - an instance of the plugin's class
name - a user-friendly label for the plugin
method_name - the method to run within the plugin (default: "run")
node - optional node to pass as first argument to plugin method (used
       for on_add_node/on_remove_node)

_validate(self)

Checks that all cluster template settings are valid. Raises a ClusterValidationError exception if not.

__check_platform(self, image_id, instance_type)

Validates whether an image_id (AMI) is compatible with a given instance_type. image_id_setting and instance_type_setting are the setting labels in the config file.

_validate_ebs_settings(self)

Check EBS vols for missing/duplicate DEVICE/PARTITION/MOUNT_PATHs and validate these settings. Does not require AWS credentials.


Property Details

zone

If volumes are specified, this method determines the common availability zone between those volumes. If an availability zone is explicitly specified in the config and does not match the common availability zone of the volumes, an exception is raised. If the volumes are not all in the same availability zone, an exception is raised. If no volumes are specified, returns the user-specified availability zone if it exists.

Get Method:
zone(self) - If volumes are specified, this method determines the common availability zone between those volumes.
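
The rules for the zone property can be sketched as a stand-alone function. The name resolve_zone and its arguments are hypothetical, the zone names are example values, and ValueError stands in for whatever exception the property actually raises.

```python
def resolve_zone(volume_zones, configured_zone=None):
    """Pick the cluster zone from volume zones and an optional config value."""
    zones = set(volume_zones)
    if not zones:
        # No volumes: fall back to the user-specified zone, if any.
        return configured_zone
    if len(zones) > 1:
        raise ValueError("volumes are not all in the same zone: %s"
                         % ", ".join(sorted(zones)))
    common = zones.pop()
    if configured_zone is not None and configured_zone != common:
        raise ValueError("configured zone %s does not match volume zone %s"
                         % (configured_zone, common))
    return common

print(resolve_zone(["us-east-1a", "us-east-1a"]))
print(resolve_zone([], configured_zone="us-east-1b"))
```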

_security_group

Get Method:
_security_group(self)

cluster_group

Get Method:
cluster_group(self)

placement_group

Get Method:
placement_group(self)

master_node

Get Method:
master_node(self)

nodes

Get Method:
nodes(self)

running_nodes

Get Method:
running_nodes(self)

stopped_nodes

Get Method:
stopped_nodes(self)

spot_requests

Get Method:
spot_requests(self)

progress_bar

Get Method:
progress_bar(self)