aws_sagemaker_remote.training package¶
Submodules¶
aws_sagemaker_remote.training.args module¶
-
aws_sagemaker_remote.training.args.CHECKPOINT_LOCAL_PATH = '/opt/ml/checkpoints'¶
    import os

    def sagemaker_env_args(config):
        args = {}
        # Output and model directories provided by the SageMaker container
        output_dir = os.environ.get('SM_OUTPUT_DIR', None)
        if output_dir:
            args['output_dir'] = output_dir
        model_dir = os.environ.get('SM_MODEL_DIR', None)
        if model_dir:
            args['model_dir'] = model_dir
        # Each input channel is mounted at the path in SM_CHANNEL_<NAME>
        for channel in config.inputs.keys():
            env_key = 'SM_CHANNEL_{}'.format(channel.upper())
            channel_dir = os.environ.get(env_key, None)
            if channel_dir:
                args[channel] = channel_dir
        return args
-
aws_sagemaker_remote.training.args.is_sagemaker()¶
-
aws_sagemaker_remote.training.args.sagemaker_env_arg()¶
Check for the SM_TRAINING_ENV environment variable and return the object if it exists.
-
aws_sagemaker_remote.training.args.sagemaker_env_args(args: argparse.Namespace, config: aws_sagemaker_remote.training.config.SageMakerTrainingConfig)¶
Check for the SM_TRAINING_ENV environment variable and use it to override arguments.
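The override behaviour described above can be sketched roughly as follows. This is not the library's implementation: `read_training_env` and `override_from_env` are hypothetical helpers, and the sketch assumes that `SM_TRAINING_ENV` holds a JSON document and that each channel is exposed through an `SM_CHANNEL_<NAME>` variable.

```python
import argparse
import json
import os


def read_training_env(environ=os.environ):
    """Return the decoded SM_TRAINING_ENV object, or None if it is not set."""
    raw = environ.get("SM_TRAINING_ENV")
    return json.loads(raw) if raw else None


def override_from_env(args, channels, environ=os.environ):
    """Hypothetical helper: point each named channel at its SageMaker mount."""
    if read_training_env(environ) is None:
        return args  # not running on SageMaker; leave arguments unmodified
    overrides = dict(vars(args))
    for channel in channels:
        channel_dir = environ.get("SM_CHANNEL_{}".format(channel.upper()))
        if channel_dir:
            overrides[channel] = channel_dir
    return argparse.Namespace(**overrides)
```

Passing the environment as a parameter keeps the sketch easy to exercise locally with a plain dictionary instead of real container variables.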
-
aws_sagemaker_remote.training.args.sagemaker_training_args(parser: argparse.ArgumentParser, script, source='', base_job_name='training-job', job_name='', profile='default', run=False, wait=True, inputs=None, dependencies=None, additional_arguments=None, argparse_callback=None, model_dir='output/model', output_dir='output/output', checkpoint_dir='output/checkpoint', checkpoint_s3='default', checkpoint_container='/opt/ml/checkpoints', checkpoint_initial=None, training_image='aws-sagemaker-remote-training:latest', training_image_path='/home/docs/checkouts/readthedocs.org/user_builds/aws-sagemaker-remote/checkouts/latest/aws_sagemaker_remote/ecr/training', training_image_accounts=['763104351884'], training_instance='ml.m5.large', training_role='aws-sagemaker-remote-training-role', enable_sagemaker=True, experiment_name=None, trial_name=None, spot_instances=False, volume_size=30, max_run=43200, max_wait=86400, env=None, workers=2, output_json=None)¶ Configure
argparse.ArgumentParser for training scripts.
Parameters:
- parser (argparse.ArgumentParser) – Parser to configure.
- script (str) – Path to script file to execute. Sets the default for --sagemaker-script.
- source (str, optional) – Path of source directory to upload. Must include the script path. Defaults to the directory containing script if not provided.
- base_job_name (str, optional) – Job name will be generated from base_job_name and a timestamp if job_name is not provided. Sets the default for --sagemaker-base-job-name.
- job_name (str, optional) – Job name is used for tracking and organization. Generated from base_job_name if not provided. Use base_job_name and leave job_name blank for most use-cases. Sets the default for --sagemaker-job-name.
- profile (str, optional) – AWS profile to use for session. Sets the default for --sagemaker-profile.
- run (bool, optional) – Run on SageMaker. Sets the default for --sagemaker-run.
- wait (bool, optional) – Wait for the SageMaker job to complete. Sets the default for --sagemaker-wait.
- inputs (dict(str, str), optional) – Dictionary of input arguments. For each key and value, create an argument --key that defaults to value.
  * Running locally, input arguments are unmodified.
  * Running remotely, inputs are set to appropriate SageMaker mount points. Local inputs are uploaded automatically.
- dependencies (dict(str, str)) – Dictionary of modules. For each key and value, create an argument --module-key that defaults to value. This controls the path of a dependency of your code. The files at the given path will be uploaded to S3, downloaded to SageMaker, and put on PYTHONPATH.
- additional_arguments (list, optional) – List of tuples of positional args and keyword args for argparse.ArgumentParser.add_argument. Use to add additional arguments to the script.
- argparse_callback (function, optional) – Function accepting one argument, parser (argparse.ArgumentParser), that adds additional arguments. Use to add additional arguments to the script.
- model_dir (string, optional) – Directory to save the trained inference model. Sets the default for --model-dir.
- output_dir (string, optional) – Directory to save outputs (images, logs, etc.). Sets the default for --output-dir.
- checkpoint_dir (string, optional) – Directory to save checkpoints for saving and resuming training. Sets the default for --checkpoint-dir.
- checkpoint_s3 (string, optional) – S3 storage for checkpoints for saving and resuming training, or "default". Sets the default for --sagemaker-checkpoint-s3.
- checkpoint_container (string, optional) – Local directory for checkpoints when running remotely. Sets the default for --sagemaker-checkpoint-container.
- training_image (str, optional) – URI of ECR or DockerHub Docker image to use for training. Sets the default for --sagemaker-training-image.
- training_instance (str, optional) – Type of instance to use for training (e.g., ml.t3.medium). Sets the default for --sagemaker-training-instance.
- training_role (str, optional) – AWS IAM role name to use for training. Will be created if it does not exist. Sets the default for --sagemaker-training-role.
- experiment_name (str, optional) – Name of experiment. Required if trial_name is provided. Sets the default for --sagemaker-experiment-name.
- trial_name (str, optional) – Name of trial within experiment. Sets the default for --sagemaker-trial-name.
- enable_sagemaker (bool, optional) –
  - True: Include SageMaker command-line options.
  - False: Only include local command-line options.
- max_run (int, optional) – Maximum training time in seconds.
- max_wait (int, optional) – Maximum time to wait for a spot instance in seconds.
- workers (int, optional) – Number of workers.
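The way inputs and dependencies become command-line arguments can be illustrated with plain argparse. This is a minimal sketch of the idea only: `add_input_arguments` is a hypothetical stand-in, not a function of this package, and the paths are made-up example values.

```python
import argparse


def add_input_arguments(parser, inputs=None, dependencies=None):
    """Hypothetical stand-in for the per-key argument creation described above."""
    # One --key argument per input, defaulting to the configured value
    for key, value in (inputs or {}).items():
        parser.add_argument("--{}".format(key), default=value,
                            help="Input channel {}".format(key))
    # One --module-key argument per dependency, defaulting to its path
    for key, value in (dependencies or {}).items():
        parser.add_argument("--module-{}".format(key), default=value,
                            help="Path of dependency {}".format(key))
    return parser


parser = add_input_arguments(
    argparse.ArgumentParser(),
    inputs={"data": "./data"},           # example input, assumed path
    dependencies={"mylib": "./mylib"},   # example dependency, assumed path
)
defaults = parser.parse_args([])
overridden = parser.parse_args(["--data", "s3://my-bucket/data"])
```

Here `defaults.data` keeps the configured default while `overridden.data` takes the value given on the command line. argparse stores `--module-mylib` under the attribute `module_mylib`.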
-
aws_sagemaker_remote.training.args.sagemaker_training_channel_args(parser: argparse.ArgumentParser, inputs)¶
-
aws_sagemaker_remote.training.args.sagemaker_training_checkpoint_args(parser: argparse.ArgumentParser, checkpoint_dir, checkpoint_initial=None, checkpoint_s3='default', checkpoint_container='/opt/ml/checkpoints', enable_sagemaker=True)¶
-
aws_sagemaker_remote.training.args.sagemaker_training_dependency_args(parser: argparse.ArgumentParser, dependencies)¶
-
aws_sagemaker_remote.training.args.sagemaker_training_model_args(parser: argparse.ArgumentParser, model_dir='model')¶
-
aws_sagemaker_remote.training.args.sagemaker_training_output_args(parser: argparse.ArgumentParser, output_dir)¶
-
aws_sagemaker_remote.training.args.sagemaker_training_parser_for_docs()¶
aws_sagemaker_remote.training.channels module¶
-
aws_sagemaker_remote.training.channels.expand_folder_channels(channels, session)¶
-
aws_sagemaker_remote.training.channels.expand_list_channels(channels)¶
-
aws_sagemaker_remote.training.channels.expand_repeated_channels(channels)¶
-
aws_sagemaker_remote.training.channels.parse_channel_arguments(channels, session)¶
-
aws_sagemaker_remote.training.channels.process_channels(channels, args, session, prefix)¶
-
aws_sagemaker_remote.training.channels.read_channel_arguments(channels, args)¶
-
aws_sagemaker_remote.training.channels.remove_empty_channels(channels)¶
-
aws_sagemaker_remote.training.channels.set_suffixes(channels, session, hyperparameters)¶
-
aws_sagemaker_remote.training.channels.standardize_channel(channel)¶
-
aws_sagemaker_remote.training.channels.standardize_channels(channels)¶
-
aws_sagemaker_remote.training.channels.upload_local_channel(channel, session, s3_uri)¶
-
aws_sagemaker_remote.training.channels.upload_local_channels(channels, session, prefix)¶
aws_sagemaker_remote.training.config module¶
-
class aws_sagemaker_remote.training.config.SageMakerTrainingConfig(inputs=None, dependencies=None, env=None)¶
Bases: object
aws_sagemaker_remote.training.experiment module¶
-
aws_sagemaker_remote.training.experiment.ensure_experiment(client, experiment_name)¶
aws_sagemaker_remote.training.iam module¶
-
aws_sagemaker_remote.training.iam.ensure_training_role(iam, role_name)¶
aws_sagemaker_remote.training.main module¶
-
class aws_sagemaker_remote.training.main.TrainingCommand(main, script=None, help=None, metrics=None, **training_args)¶
Bases: aws_sagemaker_remote.commands.Command
-
configure(parser: argparse.ArgumentParser)¶
-
run(args)¶
-
aws_sagemaker_remote.training.main.sagemaker_training_handle(args, config, main, metrics=None)¶
-
aws_sagemaker_remote.training.main.sagemaker_training_main(main, script=None, script_fn=None, description=None, metrics=None, **training_args)¶ Entry point for training.
Example
    from aws_sagemaker_remote.training.main import sagemaker_training_main

    def main(args):
        # your code here
        pass

    if __name__ == '__main__':
        sagemaker_training_main(
            main=main,
            # ... additional configuration
        )
Parameters:
- main (function) – Main function. Must accept a single argument args (argparse.Namespace).
- script (str, optional) – Path to script file to execute. Set to __file__ for most use-cases. Empty or None defaults to the file containing main. An object is interpreted as the file containing that object.
- description (str, optional) – Script description for argparse.
- metrics (dict, optional) – Metrics to record. Dictionary of metric name (str) to RegEx (str) that extracts the metric. See SageMaker Training Metrics Docs.
- **training_args (dict, optional) – Keyword arguments to aws_sagemaker_remote.training.args.sagemaker_training_args()
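Before passing a metrics dictionary, it can help to check the patterns locally against a sample log line. The sketch below only emulates the name-to-RegEx lookup with the standard re module. The metric names, patterns, log format, and the `extract_metrics` helper are all illustrative assumptions; on SageMaker the extraction is performed by the service, not by this code.

```python
import re

# Hypothetical metric definitions: name -> RegEx with one capture group
metrics = {
    "train:loss": r"train_loss=([0-9.]+)",
    "validation:accuracy": r"val_acc=([0-9.]+)",
}


def extract_metrics(log_line, metrics):
    """Apply each pattern to a log line and collect the captured values."""
    found = {}
    for name, pattern in metrics.items():
        match = re.search(pattern, log_line)
        if match:
            found[name] = float(match.group(1))
    return found


sample = extract_metrics("epoch=3 train_loss=0.125 val_acc=0.75", metrics)
```

A pattern that never matches simply leaves its metric out of the result, which makes mistakes in the RegEx easy to spot before launching a job.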
aws_sagemaker_remote.training.train module¶
-
aws_sagemaker_remote.training.train.sagemaker_training_run(args, config: aws_sagemaker_remote.training.config.SageMakerTrainingConfig, metrics=None)¶
aws_sagemaker_remote.training.training_inputs module¶
-
aws_sagemaker_remote.training.training_inputs.build_training_input(channel, i, args)¶
-
aws_sagemaker_remote.training.training_inputs.build_training_inputs(channels, args)¶