Site Adapter
The site adapters provide interfaces to various Cloud APIs and batch systems in order to allow a on-demand provisioning of resources and a dynamic orchestration of pre-built VM images and containers.
Sites are generally configured in the Sites configuration block. One has to specify a site name, the adapter to use
and a site quota in units of cores. Negative values for the site quota are interpreted as infinity. Optionally a
minimum lifetime in seconds of the Drone
can be specified. This is defined as
the time the Drone
remains in
AvailableState
before draining it. If no value is given, infinite lifetime
is assumed. Multiple sites are supported by using SequenceNodes.
Note
Even if a minimum lifetime is set, it is not guaranteed that the Drone
is not
drained due to its dropping demand before its minimum lifetime is exceeded.
Generic Site Adapter Configuration
Available configuration options
Option |
Short Description |
Requirement |
---|---|---|
name |
Name of the site |
Required |
adapter |
Site adapter to use. Adapter will be auto-imported (class name without Adapter) |
Required |
quota |
Core quota to be used for this site. Negative values are interpreted as infinity |
Required |
drone_heartbeat_interval |
Time in seconds between two consecutive executions of |
Optional |
drone_minimum_lifetime |
Time in seconds the drone will remain in |
Optional |
For each site in the Sites configuration block. A site specific configuration block carrying the site name has to be added to the configuration as well.
The site specific MappingNode contains site adapter specific configuration options that you can find below in the particular site adapter documentation.
In addition, it is required to add the following MappingNodes:
MachineTypes containing a SequenceNode of available machine types to be supported at the given site.
MachineTypeConfiguration a MappingNode for each machine type containing machine type specific configurations, details can be found below in the particular site adapter documentation.
MachineTypeMetaData containing a MappingNode for each machine type specifying the amount of Cores, Memory and Disk available
Note
The amount of memory and disk space is always specified in units of Gigabytes (GB) in TARDIS. The amount of cores is equivalent to the number of single core job slots provided by a machine.
Example configuration
Sites:
- name: MySiteName_1
adapter: MyAdapter2Use
quota: 123
drone_heartbeat_interval: 10
drone_minimum_lifetime: 3600
- name: MySiteName_2
adapter: OtherAdapter2Use
quota: 987
MySiteName_1:
general_adapter_option: something
MachineTypes:
- Micro
- Fat
MachineTypeConfiguration:
Micro:
machine_type_specific_option_1: 124234-1245-1345-15
machine_type_specific_option_2: 4583453-3245-345-2345
Fat:
machine_type_specific_option_1: 0034532-345-234-2341
machine_type_specific_option_2: 1345-134-1345-134-1
MachineMetaData:
Micro:
Cores: 1
Memory: 2
Disk: 30
Fat:
Cores: 32
Memory: 128
Disk: 256
MySiteName_2:
general_adapter_option: something_else
MachineTypes:
- XL
MachineTypeConfiguration:
XL:
machine_type_specific_option_1: 9847867-467846-468748BC
MachineMetaData:
XL:
Cores: 128
Memory: 256
Disk: 1024
Cloud Stack Site Adapter
The CloudStackAdapter
implements an interface to the CloudStack API.
The following general adapter configuration options are available.
Available adapter configuration options
Option |
Short Description |
Requirement |
---|---|---|
api_key |
Your CloudStack API Key to authenticate yourself. |
Required |
api_secret |
Your CloudStack API secret to authenticate yourself. |
Required |
end_point |
The end point of the CloudStack API to contact. |
Required |
All configuration entries in the MachineTypeConfiguration section of the machine types are directly added as keyword arguments to the CloudStack API deployVirtualMachine call. All available options are described in the CloudStack documentation
Example configuration
Sites:
- name: Hexascale
adapter: CloudStack
quota: 300
Hexascale:
api_key: BlaBlubbFooBar123456
api_secret: AKshflajsdfjnASJFkajsfd
end_point: https://api.hexascale.com/compute
MachineTypes:
- Micro
- Tiny
MachineTypeConfiguration:
Micro:
templateid: 909ce5b7-2132-4ff0-9bf8-aadbb423f7d9
serviceofferingid: 71004023-bb72-4a97-b1e9-bc66dfce9470
zoneid: 35eb7739-d19e-45f7-a581-4687c54d6d02
securitygroupnames: "secgrp-WN,NFS-access,Squid-access"
userdata: ini/hexascale.ini
keypair: MG
rootdisksize: 70
Tiny:
templateid: 909ce5b7-2132-4ff0-9bf8-aadbb423f7d9
serviceofferingid: b6cd1ff5-3a2f-4e9d-a4d1-8988c1191fe8
zoneid: 35eb7739-d19e-45f7-a581-4687c54d6d02
securitygroupnames: "secgrp-WN,NFS-access,Squid-access"
userdata: ini/hexascale.ini
keypair: MG
rootdisksize: 70
HTCondor Site Adapter
The HTCondorAdapter
implements an interface to the HTCondor batch system.
Regular batch jobs are submitted that start the actual Drone, which than is integrated itself in overlay batch system
using the chosen BatchSystemAdapter.
Available adapter configuration options
Option |
Short Description |
Requirement |
---|---|---|
max_age |
The result of the condor_status call is cached for max_age in minutes. |
Required |
bulk_size |
Maximum number of jobs to handle per bulk invocation of a condor tool. Default: 100 |
Optional |
bulk_delay |
Maximum duration in seconds to wait per bulk invocation of a condor tool. Default: 1.0 |
Optional |
executor |
The executor used to run submission and further calls to the Moab batch system. Default: ShellExecutor is used! |
Optional |
Available machine type configuration options
Option |
Short Description |
Requirement |
---|---|---|
jdl |
Path to the templated jdl used to submit drones to the HTCondor batch system |
Required |
SubmitOptions |
Options to add to the condor_submit command. (see example) |
Optional |
Note
The template jdl is using the Python template string syntax (see example HTCondor JDL for details).
Warning
The $(…) used for HTCondor variables needs to be replaced by $$(…) in the templated JDL.
Note
In order to properly identify started drones in the overlay batch system and to limit the amount of resources
(CPU cores, memory, disk) announced to be available, a set of environment variables needs to be set inside the
drone. Preference is to use the environment
parameter in the HTCondor JDL. However, in case of using the
HTCondor grid universe the environment is usually dropped by the Grid Compute Element. In that case, we suggest
to pass the environment variables using the arguments
parameter and set the corresponding environment
variables inside the drone itself based on the command line arguments in long option syntax.
Example configuration
Sites:
- name: TOPAS
adapter: HTCondor
quota: 462
TOPAS:
max_age: 1
MachineTypes:
- wholenode
- remotenode
MachineTypeConfiguration:
wholenode:
jdl: pilot_wholenode.jdl
remotenode:
jdl: pilot_remotenode.jdl
SubmitOptions:
spool: null
pool: remote-pool.somewhere.de
MachineMetaData:
wholenode:
Cores: 42
Memory: 256
Disk: 840
remotenode:
Cores: 8
Memory: 20
Disk: 100
Example HTCondor JDL (Vanilla Universe)
executable = start_pilot.sh
transfer_input_files = setup_pilot.sh,grid-mapfile
output = logs/$$(cluster).$$(process).out
error = logs/$$(cluster).$$(process).err
log = logs/cluster.log
accounting_group=tardis
x509userproxy = /home/tardis/proxy
environment=${Environment}
request_cpus=${Cores}
request_memory=${Memory}
request_disk=${Disk}
The Environment
contains the following variables, TardisDroneCores
. TardisDroneMemory
. TardisDroneDisk
and TardisDroneUuid
.
Example HTCondor JDL (Grid Universe)
universe = grid
executable = start_pilot.sh
arguments = ${Arguments}
transfer_input_files = setup_pilot.sh,grid-mapfile
output = logs/$$(cluster).$$(process).out
error = logs/$$(cluster).$$(process).err
log = logs/cluster.log
accounting_group=tardis
x509userproxy = /home/tardis/proxy
request_cpus=${Cores}
request_memory=${Memory}
request_disk=${Disk}
The Arguments
contains the following command line arguments, --cores
. --memory
. --disk
and
--uuid
.
Moab Site Adapter
The MoabAdapter
implements an interface to the Moab batch system. Regular batch
jobs are submitted that start the actual Drone, which than is integrated itself in overlay batch system
using the chosen BatchSystemAdapter..
Available adapter configuration options
Option |
Short Description |
Requirement |
---|---|---|
bulk_size |
Maximum number of jobs to handle per bulk invocation of the Default: 100 |
Optional |
bulk_delay |
Maximum duration in seconds to wait per bulk invocation of the Default: 1.0 |
Optional |
StartupCommand |
The command executed in the batch job. (Deprecated: Moved to MachineTypeConfiguration!) |
Deprecated |
executor |
The executor used to run submission and further calls to the Moab batch system. Default: ShellExecutor is used! |
Optional |
SubmitOptions |
Options to add to the msub command. long and short arguments are supported (see example) |
Optional |
The available options in the MachineTypeConfiguration section are the expected WallTime of the placeholder jobs and the requested NodeType. For details see the Moab documentation.
Example configuration
Sites:
- name: moab-site
adapter: Moab
quota: 2000
moab-site:
executor: !TardisSSHExecutor
host: login.dorie.somewherein.de
username: clown
client_keys:
- /opt/tardis/ssh/tardis
MachineTypes:
- singularity_d2.large
- singularity_d1.large
MachineTypeConfiguration:
singularity_d2.large:
Walltime: '02:00:00:00'
NodeType: '1:ppn=20'
StartupCommand: startVM.py
SubmitOptions:
short:
M: "someone@somewhere.com"
long:
timeout: 60
singularity_d1.large:
Walltime: '01:00:00:00'
NodeType: '1:ppn=20'
StartupCommand: startVM.py
MachineMetaData:
singularity_d2.large:
Cores: 20
Memory: 120
Disk: 196
singularity_d1.large:
Cores: 20
Memory: 120
Disk: 196
OpenStack Site Adapter
The OpenStackAdapter
implements an interface to the OpenStack Cloud API.
The following general adapter configuration options are available.
Available adapter configuration options
Option |
Short Description |
Requirement |
---|---|---|
auth_url |
The end point of the OpenStack API to contact. |
Required |
username |
Your OpenStack API username to authenticate yourself. |
Optional |
password |
Your OpenStack API password to authenticate yourself. |
Optional |
user_domain_name |
The name of the OpenStack user domain. |
Optional |
project_domain_name |
The name of the OpenStack project domain. |
Optional |
application_credential_id |
Your application credential ID to authenticate yourself. |
Optional |
application_credential_secret |
Your application credential secret to authenticate yourself. |
Optional |
Note
Either username
, password
, user_domain_name
and project_domain_name
or
application_credential_id
and application_credential_secret
are mandatory to authenticate against the
OpenStack endpoint.
All configuration entries in the MachineTypeConfiguration section of the machine types are directly added as keyword arguments to the OpenStack API create-server call. All available options are described in the OpenStack documentation
Example configuration
Sites:
- name: Woohoo
adapter: OpenStack
quota: 10 # CPU core quota
Woohoo:
auth_url: https://whoowhoo:13000/v3
username: woohoo
password: Woohoo123
project_name: WooHoo
user_domain_name: Default
project_domain_name: Default
MachineTypes:
- m1.xlarge
MachineTypeConfiguration:
m1.xlarge:
flavorRef: 5 # ID of m1.xlarge
networks:
- uuid: fe0317c6-0bed-488b-9108-13726656a0ea
imageRef: bc613271-6a54-48ca-9222-47e009dc0c29
key_name: MG
user_data: tardis/cloudinit/woohoo.ini
MachineMetaData:
m1.xlarge:
Cores: 8
Memory: 16
Disk: 160
Slurm Site Adapter
The SlurmAdapter
implements an interface to the SLURM batch system. Regular
batch jobs are submitted that start the actual Drone, which than is integrated itself in overlay batch system
using the chosen BatchSystemAdapter..
Available adapter configuration options
Option |
Short Description |
Requirement |
---|---|---|
bulk_size |
Maximum number of jobs to handle per bulk invocation of the Default: 100 |
Optional |
bulk_delay |
Maximum duration in seconds to wait per bulk invocation of the Default: 1.0 |
Optional |
StartUpCommand |
The command executed in the batch job. (Deprecated: Moved to MachineTypeConfiguration!) |
Deprecated |
executor |
The executor used to run submission and further calls to the Moab batch system. Default: ShellExecutor is used! |
Optional |
Available machine type configuration options
Option |
Short Description |
Requirement |
---|---|---|
Walltime |
Expected walltime of drone |
Required |
Partition |
Name of the Slurm partition to run in |
Required |
StartupCommand |
The command to execute at job start |
Required |
SubmitOptions |
Options to add to the sbatch command. long and short arguments are supported (see example) |
Optional |
Example configuration
Sites:
- name: hpc2000
adapter: Slurm
quota: 100
hpc2000:
executor: !TardisSSHExecutor
host: hpc2000.hpc.org
username: billy
client_keys:
- /opt/tardis/ssh/tardis
MachineTypes:
- one_day
- twelve_hours
MachineTypeConfiguration:
one_day:
Walltime: '1440'
Partition: normal
StartupCommand: 'pilot_clean.sh'
SubmitOptions:
short:
C: "intel"
long:
gres: "gpu:2,mic:1"
six_hours:
Walltime: '360'
Partition: normal
StartupCommand: 'pilot_clean.sh'
SubmitOptions:
long:
gres: "gpu:2,mic:1"
twelve_hours:
Walltime: '720'
Partition: normal
StartupCommand: 'pilot_clean.sh'
MachineMetaData:
one_day:
Cores: 20
Memory: 62
Disk: 480
twelve_hours:
Cores: 20
Memory: 62
Disk: 480
six_hours:
Cores: 20
Memory: 62
Disk: 480
Kubernetes Site Adapter
The KubernetesAdapter
implements an interface to the Kubernetes API.
The following general adapter configuration options are available.
Available adapter configuration options
Option |
Short Description |
Requirement |
---|---|---|
host |
The end point of the Kubernetes Cluster. |
Required |
token |
Bearer token used to authenticate yourself. |
Required |
To create a token refer to: Kubernetes documentation
Available machine type configuration options
Option |
Short Description |
Requirement |
---|---|---|
namespace |
Namespace for pods to run in. |
Required |
image |
Image for the pods. |
Required |
args |
Arguments for the containers that run in your pods. |
Required |
hpa |
Set TrueFalse to enabledisable kubernetes horizontal pod autoscaler feature. |
Required |
min_replicas |
Minimum number of pods to scale to. (Only required when hpa is set to True) |
Required |
max_replicas |
Maximum number of pods to scale to. (Only required when hpa is set to True) |
Required |
cpu_utilization |
Average Cpu utilization to maintain across pods of a deployment. (Only required when hpa is set to True) |
Required |
Example configuration
Sites:
- name: Kube-site
adapter: Kubernetes
quota: 10
Kube-site:
host: https://127.0.0.1:443
token: 31ada4fd-adec-460c-809a-9e56ceb75269
MachineTypes:
- example
MachineTypeConfiguration:
example:
namespace: default
image: busybox:1.26.1
label: busybox
args: ["sleep", "3600"]
MachineMetaData:
example:
Cores: 2
Memory: 4
Your favorite site is currently not supported? Please, have a look at how to contribute.