aws_sagemaker_remote.processing package

Submodules

aws_sagemaker_remote.processing.args module

aws_sagemaker_remote.processing.args.is_sagemaker()
aws_sagemaker_remote.processing.args.sagemaker_processing_args(parser: argparse.ArgumentParser, script, run=False, wait=True, profile='default', role='aws-sagemaker-remote-processing-role', image='aws-sagemaker-remote-processing:latest', image_path='/home/docs/checkouts/readthedocs.org/user_builds/aws-sagemaker-remote/checkouts/latest/aws_sagemaker_remote/ecr/processing', image_accounts='763104351884', instance='ml.t3.medium', inputs=None, outputs=None, dependencies=None, input_mount='/opt/ml/processing/input', output_mount='/opt/ml/processing/output', module_mount='/opt/ml/processing/modules', base_job_name='processing-job', job_name='', runtime_seconds=3600, volume_size=30, python='python3', requirements=None, configuration_script=None, configuration_command=None, additional_arguments=None, argparse_callback=None, output_json=None, env=None)

Configure argparse.ArgumentParser for processing scripts.

Parameters:
  • parser (argparse.ArgumentParser) – Parser to configure
  • script (str) – Path to script file to execute. Set default for --sagemaker-script.
  • run (bool, optional) – Run on SageMaker. Set default for --sagemaker-run.
  • wait (bool, optional) – Wait for SageMaker processing to complete. Set default for --sagemaker-wait.
  • profile (str, optional) – AWS profile to use for session. Set default for --sagemaker-profile.
  • role (str, optional) – AWS IAM role name to use for processing. Will be created if it does not exist. Set default for --sagemaker-role.
  • image (str, optional) – URI of ECR Docker image to use for processing. Set default for --sagemaker-image.
  • image_path (str, optional) – Path to build docker if image does not exist. Set default for --sagemaker-image-path.
  • image_accounts (str, optional) – Accounts required to build docker image. Set default for --sagemaker-image-accounts.
  • instance (str, optional) – Type of instance to use for processing (e.g., ml.t3.medium). Set default for --sagemaker-instance.
  • inputs (dict(str,str), optional) –

    Dictionary mapping input argument keys to strings or aws_sagemaker_remote.args.PathArgument. Strings are converted to PathArgument with local set to your string, which should be sufficient for most use cases. For each key and value, an argument --key is created that defaults to value.

    • Running locally, input arguments are unmodified.
    • Running remotely, inputs are set to appropriate SageMaker mount points. Local inputs are uploaded automatically.

    For example:

    from aws_sagemaker_remote.args import OPTIONAL, PathArgument
    inputs = {
        "my_input_1": "path/to/data1", # implicit
        "my_input_2": PathArgument(local="path/to/data2"), # explicit
        "my_optional_input": OPTIONAL
    }
    

    Your script will now have arguments --my-input-1, --my-input-2, and --my-optional-input.

  • outputs (dict(str, str)) –

    Dictionary mapping output argument keys to strings or aws_sagemaker_remote.args.PathArgument. Strings are converted to PathArgument with local set to your string. This should be sufficient for most use cases.

    For each key:

    • Create an argument --key that defaults to value.local. This controls an output path.
    • Create an argument --key-s3 that defaults to value.remote. This controls where output is stored on S3.
        • Set to default to automatically create an output path based on the job name.
        • Set to an S3 URL to store output at a specific location on S3.
  • dependencies (dict(str, str)) – Dictionary of modules. For each key and value, create an argument --module-key that defaults to value. This controls the path of a dependency of your code. The files at the given path will be uploaded to S3, downloaded to SageMaker, and put on PYTHONPATH.
  • input_mount (str, optional) – Local path on SageMaker container where inputs are downloaded. Set default for --sagemaker-input-mount.
  • output_mount (str, optional) – Local path on SageMaker container where outputs are written before upload. Set default for --sagemaker-output-mount.
  • module_mount (str, optional) – Local path on SageMaker container where source code is downloaded. Mount point is put on PYTHONPATH. Set default for --sagemaker-module-mount.
  • base_job_name (str, optional) – Job name will be generated from base_job_name and a timestamp if job_name is not provided. Set default for --sagemaker-base-job-name.
  • job_name (str, optional) – Job name is used for tracking and organization. Generated from base_job_name if not provided. Use base_job_name and leave job_name blank for most use-cases. Set default for --sagemaker-job-name.
  • runtime_seconds (int, optional) – Maximum runtime in seconds before the job is killed. Set default for --sagemaker-runtime-seconds.
  • volume_size (int, optional) – Volume size in GB. Set default for --sagemaker-volume-size.
  • python (str, optional) – Python executable on container (default: python3). Set default for --sagemaker-python.
  • requirements (str, optional) – Set path to requirements file to upload and install with pip install -r. Set default for --sagemaker-requirements.
  • configuration_script (str, optional) – Set path to bash script to upload and source. Set default for --sagemaker-configuration-script.
  • configuration_command (str, optional) – Set command to be run to configure container, e.g. pip install mypackage && export MYVAR=MYVALUE. Set default for --sagemaker-configuration-command.
  • additional_arguments (list, optional) – List of tuples, each holding positional args and keyword args for argparse.ArgumentParser.add_argument. Use to add additional arguments to the script.
  • argparse_callback (function, optional) – Function accepting one argument parser:argparse.ArgumentParser that adds additional arguments. Use to add additional arguments to the script.
  • output_json (str, optional) – Write SageMaker processing job details to this path. Set default for --sagemaker-output-json.
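
The mapping from inputs keys to command-line options, and the shape of additional_arguments, can be illustrated with plain argparse. This is a minimal sketch and not the library's implementation: the helper names add_path_arguments and add_additional_arguments and the --batch-size option are hypothetical, and it assumes each additional_arguments entry is applied as parser.add_argument(*args, **kwargs).

```python
import argparse


def add_path_arguments(parser, paths):
    """Hypothetical helper: add one --key option per entry, defaulting to its value."""
    for key, default in paths.items():
        # Underscores become dashes on the command line: my_input_1 -> --my-input-1
        parser.add_argument("--" + key.replace("_", "-"), default=default)


def add_additional_arguments(parser, additional_arguments):
    """Hypothetical helper: apply each (positional args, keyword args) tuple."""
    for positional, keyword in additional_arguments or []:
        parser.add_argument(*positional, **keyword)


parser = argparse.ArgumentParser()
add_path_arguments(parser, {"my_input_1": "path/to/data1"})
add_additional_arguments(
    parser,
    [(("--batch-size",), {"type": int, "default": 8})],
)

# Values given on the command line override the configured defaults.
args = parser.parse_args(["--my-input-1", "other/data", "--batch-size", "16"])
defaults = parser.parse_args([])
print(args.my_input_1, args.batch_size)
```

argparse stores --my-input-1 under the attribute my_input_1, so the script reads the value with the same name used as the dictionary key.
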
aws_sagemaker_remote.processing.args.sagemaker_processing_input_args(parser: argparse.ArgumentParser, inputs=None, input_mount='/opt/ml/processing/input')
aws_sagemaker_remote.processing.args.sagemaker_processing_module_args(parser: argparse.ArgumentParser, dependencies=None, module_mount='/opt/ml/processing/modules')
aws_sagemaker_remote.processing.args.sagemaker_processing_output_args(parser: argparse.ArgumentParser, outputs=None, output_mount='/opt/ml/processing/output')
aws_sagemaker_remote.processing.args.sagemaker_processing_parser_for_docs()

aws_sagemaker_remote.processing.config module

class aws_sagemaker_remote.processing.config.SageMakerProcessingConfig(dependencies=None, inputs=None, outputs=None, env=None)

Bases: object

aws_sagemaker_remote.processing.iam module

aws_sagemaker_remote.processing.iam.ensure_processing_role(iam, role_name)

aws_sagemaker_remote.processing.main module

class aws_sagemaker_remote.processing.main.ProcessingCommand(script, main, help=None, **processing_args)

Bases: aws_sagemaker_remote.commands.Command

configure(parser: argparse.ArgumentParser)
run(args)
aws_sagemaker_remote.processing.main.sagemaker_processing_handle(args, config, main)
aws_sagemaker_remote.processing.main.sagemaker_processing_local_args(args, config: aws_sagemaker_remote.processing.config.SageMakerProcessingConfig)
aws_sagemaker_remote.processing.main.sagemaker_processing_main(main, script=None, description=None, **processing_args)

Entry point for processing.

Example

from aws_sagemaker_remote.processing.main import sagemaker_processing_main

def main(args):
    # your code here
    pass

if __name__ == '__main__':
    sagemaker_processing_main(
        main=main,
        # ... additional configuration
    )

Parameters:
  • main (function) – Main function. Must accept a single argument args:argparse.Namespace.
  • script (str, optional) – Path to script file to execute. Set to __file__ for most use-cases. Empty or None defaults to file containing main.
  • description (str, optional) – Script description for argparse
  • **processing_args (dict, optional) – Keyword arguments to aws_sagemaker_remote.processing.args.sagemaker_processing_args()
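
Because main only needs an argparse.Namespace, it can be exercised without SageMaker by building the namespace by hand. The sketch below is illustrative only: the argument names my_input and output are hypothetical stand-ins for keys you would declare in inputs and outputs, and the line-counting logic is a placeholder for real processing code.

```python
import argparse
import os
import tempfile


def main(args):
    """Count lines in each input file and write a summary to the output path."""
    os.makedirs(args.output, exist_ok=True)
    summary_path = os.path.join(args.output, "summary.txt")
    with open(summary_path, "w") as summary:
        for name in sorted(os.listdir(args.my_input)):
            with open(os.path.join(args.my_input, name)) as handle:
                count = sum(1 for _ in handle)
            summary.write(f"{name}\t{count}\n")
    return summary_path


# Simulate a local run: create a small input directory and call main directly.
with tempfile.TemporaryDirectory() as workdir:
    input_dir = os.path.join(workdir, "in")
    os.makedirs(input_dir)
    with open(os.path.join(input_dir, "a.txt"), "w") as handle:
        handle.write("one\ntwo\n")
    args = argparse.Namespace(my_input=input_dir, output=os.path.join(workdir, "out"))
    with open(main(args)) as handle:
        result = handle.read()
```

Writing main against directory paths taken from args keeps the same code usable whether the paths point at local folders or at the container mount points.
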

aws_sagemaker_remote.processing.process module

aws_sagemaker_remote.processing.process.ensure_eol(file)

Ensure that file has Linux line endings. Convert if it doesn’t.
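
One way such a conversion could work is sketched below. It is an assumption about the behavior, not the library's code: the function name ensure_lf is hypothetical, and it only rewrites Windows CRLF sequences to LF.

```python
import os
import tempfile


def ensure_lf(path):
    """Rewrite path in place with LF line endings; return True if it changed."""
    with open(path, "rb") as handle:
        data = handle.read()
    converted = data.replace(b"\r\n", b"\n")
    if converted == data:
        return False  # already uses LF endings, leave the file untouched
    with open(path, "wb") as handle:
        handle.write(converted)
    return True


# Demonstrate on a temporary script written with Windows line endings.
with tempfile.TemporaryDirectory() as workdir:
    script = os.path.join(workdir, "configure.sh")
    with open(script, "wb") as handle:
        handle.write(b"export MYVAR=MYVALUE\r\n")
    changed = ensure_lf(script)
    with open(script, "rb") as handle:
        content = handle.read()
```

Working in binary mode avoids any platform-dependent newline translation, which matters when a bash script authored on Windows is uploaded to a Linux container.
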

aws_sagemaker_remote.processing.process.make_arguments(args, config: aws_sagemaker_remote.processing.config.SageMakerProcessingConfig)
aws_sagemaker_remote.processing.process.make_processing_input(mount, name, source, s3, mode=None)
aws_sagemaker_remote.processing.process.process(session: sagemaker.session.Session, role, script, inputs=None, outputs=None, dependencies=None, requirements=None, configuration_script=None, configuration_command=None, base_job_name='processing-job', job_name=None, image='aws-sagemaker-remote-processing:latest', image_path='/home/docs/checkouts/readthedocs.org/user_builds/aws-sagemaker-remote/checkouts/latest/aws_sagemaker_remote/ecr/processing', image_accounts='763104351884', instance='ml.t3.medium', volume_size=30, runtime_seconds=3600, output_mount='/opt/ml/processing/output', input_mount='/opt/ml/processing/input', module_mount='/opt/ml/processing/modules', python='python3', wait=True, logs=True, arguments=None, tags=None, output_json=None, env=None)
aws_sagemaker_remote.processing.process.sagemaker_arguments(vargs)
aws_sagemaker_remote.processing.process.sagemaker_processing_run(args, config)

Module contents