vac.conf

NAME
DESCRIPTION
[SETTINGS] OPTIONS
[VACUUM_PIPE ...] SECTIONS
[MACHINETYPE ...] SECTIONS
USER_DATA SUBSTITUTIONS
SUPERSLOTS AND MULTIPROCESSOR LMs
VACUUM PIPES
SPACE CENSUS AND GOCDB
AUTHOR
SEE ALSO

NAME

vac.conf − Vac configuration file

DESCRIPTION

vacd is a daemon which implements the Vacuum model on a factory (hypervisor) machine. vacd reads its configuration from /etc/vac.conf and /var/run/vac.conf and .conf files in /etc/vac.d and these files are also read by the vac utility command to find default values.

The configuration files use the Python ConfigParser syntax, which is similar to MS Windows INI files. The files are divided into sections, with each section name in square brackets. For example: [settings]. Each section contains a series of option=value pairs. Sections with the same name are merged and if options are duplicated, later values overwrite values given earlier.

For ease of management, any configuration file ending in .conf in the directory /etc/vac.d will be read, in alphanumeric order by name, and then /etc/vac.conf and /var/run/vac.conf are read if present. Along with other files in that directory, /var/run/vac.conf will be deleted at the next boot and is suitable for temporary configuration changes in preparation for a reboot. eg setting shutdown_time

Vac creates and manages Virtual Machines (VMs), and Docker (DC) or Singularity (SC) Containers. Logical Machine (LM) is used to refer to VMs or Containers. Vac documentation always uses Machine to refer to VMs and Containers, and Factory to refer to hypervisors/hosts.

[SETTINGS] OPTIONS

The [settings] section is required and has options which apply to all virtual machines.

vac_space is required and gives the name of this Vac space. A single space will be sufficient for many sites. The space name should be a fully qualified domain name, like vac01.example.com and may be used as a virtual CE name in some monitoring and accounting systems. It should not be the canonical hostname of any of the factory machines, and does not need to registered in DNS.

factories is a space separated list of the fully qualified domain names of all the factories in this Vac space, including this factory. The factories are queried using UDP when a factory needs to decide which machinetype to start. The Vac responder process on the factories replies to these queries with a summary of the LMs and the outcome of recent attempts to run a LM of each machinetype.

udp_timeout_seconds is how long to wait before giving up on more UDP replies. Defaults to 10.0 seconds.

mb_per_processor sets the memory allocated for each processor in a LM in MiB (1024^2). If enough LMs will underuse memory or KSM is enabled, then it may be more efficient to increase this value and overcommit the factory’s physical memory to prevent the VM kernels from believing they need to swap. Defaults to 2048.

hs06_per_processor gives the HEPSPEC06 power of each processor in an LM. This is used to calculate the total values $MACHINEFEATURES/hs06 and $JOBFEATURES/hs06_job inside the LM, when writing APEL accounting records, and when comparing running LMs to target shares. If not set, the default value 1.0 is used internally, but the $MACHINEFEATURES and $JOBFEATURES values are not set and only the /var/lib/vac/apel-archive copy of the APEL accounting record for each VM is created.

processors_per_superslot sets size of the superslots, as the total number of logical processors which can be allocated to single or multiprocessor LMs in each superslot. See SUPERSLOTS AND MULTIPROCESSOR LMs for more details. Defaults to 1, disabling the superslot mechanism and multiprocessor LMs.

shutdown_time can be set to apply a limit on the shutdown time for all LMs. This value is used if it is ever earlier than the shutdown time calculated from the machinetype’s max_wallclock_seconds. In this case $MACHINEFEATURES/shutdowntime is updated for the LM and Vac will kill the LM if that shutdown time is reached. As shutdown_time is approached, LMs will be created with shorter and shorter lifetimes if their min_wallclock_seconds settings permit. shutdown_time is given in Unix seconds since 00:00:00 UTC on 1st of January 1970. If the shutdown time is being set in preparation for a reboot, then creating an additional [settings] section in /var/run/vac.conf just containing the shutdown_time option may be helpful, as this will be automatically be deleted during the reboot.

draining takes the values yes or no. If set to yes, then no more LMs are created but existing ones are allowed to finish naturally. Default no.

gocdb_sitename gives the GOCDB site name to use when writing APEL accounting record files to /var/lib/vac/apel-outgoing and /var/lib/vac/apel-archive. Please use your official site name to avoid the risk of misnamed records getting into the central APEL database. If gocdb_sitename is not given, then records are only written to apel-archive and the domain name of the Vac space name is used as a placeholder in the files.

gocdb_cert_file and gocdb_key_file are the absolute paths of an X.509 certificate and private key which are authorized to update the GOCDB capacity and VO information for this space. See SPACE CENSUS AND GOCDB for more about this feature.

description optionally gives a short description of the space which can be included in logging and monitoring.

version_logger can be set to 0 to disable the logging of the version in use. Normally Vac sends one UDP factory_status message per day to vac-version-logger.gridpp.ac.uk on port 8884 to report the Vac version number. This is used by the Vac developers to target patches for security and bugs at the versions currently in use. Defaults to 1.

vacmon_hostport If set, this option gives a space-separated list of HOST:PORT to send VacQuery UDP messages to. This can be used to monitor the ongoing status of factories and LMs via site or central VacMon services. The central GridPP VacMon service is vacmon.gridpp.ac.uk:8884

total_processors is derived from /proc/cpuinfo by default and does not usually need to be set explicitly. If set, then it provides an additional limit on the number of processors which may be assigned to virtual machines. In turn this limits the number of virtual machines which can be created.

overload_per_processor sets the level of load per processor on the factory machine which will temporarily prevent the creation of more LMs. The number of processors is counted using /proc/cpuinfo and the one minute load average is taken from /proc/loadavg. LMs typically generate high loads during their initial set up, and this mechanism throttles the VM creation rate in response to the current overall load figure. Values of around 2.0 are ok with well-behaved LMs, but the default is more cautious. Default 1.25.

volume_group can be used to set the volume group in which a logical volume will be created for each LM. The logical volumes will have the LMs’ fully qualified domain names as their names. For example, /dev/vac_volume_group/factory1-00.example.com/ would be used by the VM factory1-00.example.com. Defaults to vac_volume_group if that volume group exists.

disk_gb_per_processor explicitly sets the size of disks to create for LMs in GB (1000^3). For logical volumes, Vac normally calculates the disk size using the space available to Vac in volume_group, total_processors, and the number of processors allocated to the virtual machine. For file-backed virtual disks for VMs, the default value of 40 GB is used unless overriden by disk_gb_per_processor.

root_public_key is the file name of a public key which will be offered to VMs if it exists. For most VM machinetypes this will allow root ssh access from the factory. Default /root/.ssh/id_rsa.pub

singularity_user is the username to use when creating Singularity containers, and must be given if those containers are to be used. Vac will kill processes running as this user which are not part of a valid Singularity container managed by Vac.

fix_networking can be set to False to stop Vac trying to reset the lowlevel networking state if the vac_169.254.0.0 virtual network does not exist and cannot be created with libvirt. /var/log/vacd-factory receives details of what Vac detects and tries to do. Default True.

forward_dev can be used to specify which network interface on the factory will be used for outgoing network connections from the VMs to other machines. For example, eth1. This option is only needed if you have multiple network interfaces on the factory and only one should be used for traffic originating from the VMs.

user_data_option_XXX and user_data_file_XXX are locally defined substitutions which will be applied to all machinetypes’ user_data files, unless the value is overridden by a per-machinetype value elsewhere in the configuration. See [MACHINETYPE ...] SECTIONS for the full syntax of these options.

The LM names are formed by adding a hyphen and the LM number to the hostname component of its fully qualified domain name. For example, factory1.example.com would have factory1-00.example.com, factory1-01.example.com, ... as its LMs. MAC addresses are formed with the prefix 56:4D as the first two bytes, and the four bytes of the IP address as the remaining four bytes.

[VACUUM_PIPE ...] SECTIONS

[vacuum_pipe ...] sections define remote Vacuum Pipes from which logical machine definitions can be obtained. Each Vacuum Pipe takes the form of a JSON document which contains one or more machine definitions. Each of these definitions are identified by suffixes which are concatenated with the machinetype prefix included in the [vacuum_pipe ...] section name.

For example, if the local configuration file contains [vacuum_pipe example] and the remote JSON document has a machine definition with suffix prod, then the machinetype example-prod including the hyphen will be added to Vac’s configuration and processed as if declared by a [machinetype example-prod] section in the local configuration.

Where the machine definition requires local files such as certificates and keys, the path is /var/lib/vac/machinetypes/MACHINETYPEPREFIX or /var/lib/vac/machinetypes/MACHINETYPEPREFIX/files where MACHINETYPEPREFIX is given in the [vacuum_pipe ...] section name. The description of [machinetype ...] sections below explains how these paths may be needed by user_data substitutions.

vacuum_pipe_url is required in [vacuum_pipe ...] sections and gives the HTTP(S) URL of a remote Vacuum Pipe JSON document supplied by the VO which contains details of the boot images, contextualizations, and other options needed to create one or more types of logical machine.

target_share gives the desired share of the total LMs available in this space for the machinetypes created by this [vacuum_pipe ...] section. This value in turn is shared out among the machinetypes created according to target share values given in the Vacuum Pipe JSON document. The shares do not need to add up to 1.0, and if a share is not given, then it is set to 0. Vac factories consult these shares when deciding which machinetype to start as LMs become available.

Other options permitted in a machinetype section may be given in a vacuum_pipe section, from where they will be copied to each machinetype created due to that vacuum pipe.

[MACHINETYPE ...] SECTIONS

In addition to Vacuum Pipes, [machinetype ...] sections can be included in the local configuration, with the name of the machinetype given in the section name, such as [machinetype example]. Local machinetype sections should not be required in normal operation. A machinetype name must only consist of lowercase letters, numbers, and hyphens. Each of these sections contain option=value pairs that are specific to that machinetype.

target_share gives the desired share of the total LMs available in this space for this machinetype. The shares do not need to add up to 1.0, and if a share is not given for a machinetype, then it is set to 0. Vac factories consult these shares when deciding which machinetype to start as LMs become available.

machines_dir_days sets the expiration time in days for per-LM directories created under /var/lib/vac/machines. Default 3.

backoff_seconds is the delay after a LM of this machinetype aborts. If a LM aborts, then no new LMs of this type will be created for this amount of time. This can be used to prevent the unnecessary creation of many LMs when no work is available, and avoid overloading the matcher or task queue of the VO.

fizzle_seconds is used in three places within the backoff procedure and in two other parts of Vac:
(1) First, if a LM finishes without producing a shutdown message code and has lasted less than fizzle_seconds, then it is treated as aborted.
(2) Secondly, after the backoff_seconds time has expired for a LM abort, once at least one LM has been started in this Vac space, then no more new LMs can be started for another fizzle_seconds.
(3) Thirdly, these new LMs are identified because they are still in the starting phase of creating files, or because they have been running for less than fizzle_seconds.
(4) Additionally, when writing the accounting log files, any LMs which run for less than fizzle_seconds are excluded.
(5) Finally, the heartbeat file checking is only carried out once an initial period of fizzle_seconds has passed.

max_wallclock_seconds gives the maximum lifetime of a LM. Vac will set $MACHINEFEATURES/shutdowntime for the LM using this value to communicate it to the LM. Vac will destroy the LM if it is still running after this amount of time. Default 86400.

min_wallclock_seconds gives the minimum remaining time required when creating a LM. This can be used to stop Vac creating LMs with short lifetimes when shutdown_time has been set or when building superslots. Default max_wallclock_seconds.

min_processors and max_processors give the minimum and maximum number of logical processors which can be allocated to LMs of this type when they are created.

accounting_fqan is used to specify a FQAN to include when writing APEL accounting records, to associate usage with particular experiments.

machine_model is required and tells Vac how to configure the virtual hardware seen by the LMs of this machinetype. Currently cernvm3, vm-raw, singularity, or docker. Default cernvm3.

cvmfs_repositories is a space-separated list of CernVM-FS repositories in /cvmfs/ to be provided to a Singularity or Docker Container by a bind mount of /cvmfs/ . Vac will refresh the repositories to keep them mounted in /cvmfs/ even if the host uses the automounter to manage CernVM-FS.

tmp_binds is a space-separated list of empty temporary or working directories to be created on the factory for each Docker container instance, and shared into the container with a bind mount at the specified locations. This may be requested for performance reasons.

disk_gb_per_processor explicitly sets the size of disks to create for LMs in GB (1000^3) and overrides the global calculated size or the size given in [settings]. However, when there is insufficient space to create logical volumes of that size, then LMs of this type will not be created.

heartbeat_file allows the machinetype to nominate a file which will be created in $JOBOUTPUTS before fizzle_seconds has passed. If this file is not created by then and maintained for the lifetime of the LM, the LM will be destroyed.

heartbeat_seconds gives the frequency at which the heartbeat_file must be updated after fizzle_seconds has passed. If the file is not updated for heartbeat_seconds then the LM will be destroyed. If heartbeat_seconds is 0, then only the existence of the file will be checked. Default 0.

image_signing_dn is used to specify a regular expression to match the DN of an X.509 certificate used to verify the authenticity of the root image. Vac attempts to obtain the certificate and signature from a CernVM Signature Block at the end of the image file, verifies the certificate using the CA files in /etc/grid-security/certificates, and compares the certificate DN to image_signing_dn. If this option is given, all these verification steps must be satisified for the image to be used. As of 2016, CernVM images are signed with a DN matching the regular expression /CN=cvm-sign01\.cern\.ch$

root_device is the device name exposed to the VM that is associated with the root disk image. Default vda.

scratch_device is the device name exposed to the VM that is associated with a scratch logical volume in the vm-raw model. Ignored for CernVM. Default vdb.

container_command is the command to run inside a Docker or Singularity container. Only the command itself may be given, with no command-line arguments. Default /user_data .

legacy_proxy can be set to True to generate Globus legacy proxies rather than RFC 3820 proxies. Default False.

user_data_proxy set to true causes the files x509cert.pem and x509key.pem in the machinetype’s subdirectory of /var/lib/vac/machinetypes to be used to make a limited X.509 proxy. The two files can be identical if desired, and the X.509 certificate and RSA private key will be extracted from the files as appropriate. (Note that this location is one level above the files subdirectory in which the following options look by default.)

For the remaining options, if the file name begins with ’/’, then it will be used as an absolute path; otherwise the path will be interpreted relative to the files subdirectory of the machinetype’s subdirectory of /var/lib/vac/machinetypes (which may be named after a vacuum_pipe section and shared between multiple machinetypes as described above.) For values supplied in a remote Vacuum Pipe JSON document, only filenames without ’/’ characters and HTTP(S) URLs are allowed.

root_image is the path to the image file from which the LM will boot. With the VM and Singularity machine_models, this can also be a remote HTTP or HTTPS URL which Vac will cache in /var/lib/vac/imagecache. The remote server must supply a Last-Modified timestamp and Vac will re-request the image each time a LM starts using an If-Modified-Since request to minimise network load. Alternatively, the images may be files in the local filesystem. With cernvm3 machine_model, the files are ISO CDROM-style boot images; with the cernvm2 machine_model, they are the root hard disk image itself; with Singularity they can be images or directory hierarchies, including ones under /cvmfs/... ; for Docker, root_image must begin with docker:// and is interpreted by Docker as an image provided by a registry, either beginning with the registry host and port, or on Docker Hub itself.

user_data is the path of a contextualization file provided by the VO and perhaps modified by Vac. If the path is a remote HTTP or HTTPS URL, Vac will fetch it over the network each time a LM is started. However the file is obtained, Vac will apply a series of default and locally defined ##user_data___## substitutions to it. See USER_DATA SUBSTITUTIONS below for a list of the default substitutions. For VMs, the file is supplied using the EC2/OpenStack HTTP procedure; for containers, the file is bind mounted at /user_data .

user_data_option_XXX and user_data_file_XXX are locally defined substitutions which will be applied to the user_data file before the LM is started. user_data_option_XXX takes the string to be substituted. user_data_file_XXX takes the relative or absolute path to a file whose contents will be substituted for the pattern in the user_data file.

USER_DATA SUBSTITUTIONS

Before the user_data file is used in starting a LM, several pattern based substitutions are performed by Vac. These patterns are in the form ##user_data___##. String values given to the option user_data_option_XXX replace patterns of the form ##user_data_option_XXX##. The contents of the files given to user_data_file_XXX options also replace patterns of the form ##user_data_option_XXX##. In both cases XXX are arbitrary strings consisting of letters, numbers, and underscores.

The pattern ##user_data_option_x509_proxy## is replaced by the proxy created if user_data_proxy_cert is set to true.

In addition, the following substitutions are performed automatically by Vac using data it holds internally:

##user_data_uuid## is the UUID assigned to the LM by Vac or Docker.
##user_data_space##
is the Vac space name.
##user_data_url##
is the HTTP(S) from which the user_data template was obtained. Only given if the template was retrieved by HTTP(S) rather from a local path.
##user_data_machinefeatures_url##
and ##user_data_jobfeatures_url## and ##user_data_joboutputs_url## are the values of $MACHINEFEATURES, $JOBFEATURES, and $JOBOUTPUTS to set within the LM.
##user_data_machinetype##
and ##user_data_vmtype## (deprecated) are the name of the machinetype of this LM.
##user_data_machine_hostname##
and ##user_data_vm_hostname## (deprecated) are the hostname given to the LM by Vac.
##user_data_manager_version##
and ##user_data_vmlm_version## (deprecated) have the form "Vac v.v.v" where v.v.v is the Vac version.
##user_data_manager_hostname##
and ##user_data_vmlm_hostname## (deprecated) are the hostname of the Vac factory machine.

SUPERSLOTS AND MULTIPROCESSOR LMs

By setting processors_per_superslot in [settings] to a value greater than one, Vac will attempt to create LMs in groups with the same finishing time. This causes groups of processors to become available at the same time which enables the creation of LMs which require multiple virtual CPUs. When creating these LMs, the max_processors and min_processors values from the relevant machinetype section determine the LM’s requirements. processors_per_superslot also limits the largest number of processors which may be assigned to a single LM (which will occupy a whole superslot.) The min_wallclock_seconds value is used to determine whether there is sufficient time left to create a LM of that machinetype. max_wallclock_seconds determines whether a sufficiently long-lived LM can be created to match the superslot.

VACUUM PIPES

If vacuum_pipe sections exist as described above, then the corresponding JSON documents are fetched via HTTP(S) and used to create machinetypes in the configuration using default values supplied by the VOs for those machinetypes. The JSON documents contain a dictionary with keys cache_seconds and machinetypes. cache_seconds sets the maximum time the JSON document may be cached, and defaults to 3600 seconds if not set or the document has never been fetched successfully. machinetypes is a list of machinetype dictionaries, in which the following options are supported: accounting_fqan, backoff_seconds, container_command, cvmfs_repositories, fizzle_seconds, disk_gb_per_processor, heartbeat_file, heartbeat_seconds, image_signing_dn, legacy_proxy, machine_model, max_processors, max_wallclock_seconds, min_processors, min_wallclock_seconds, root_device, root_image, scratch_device, suffix, target_share, tmp_binds, user_data, user_data_option_XXXX, user_data_file_XXXX, user_data_proxy.

As explained above, options referring to files on the LM factory may not specify filesystem paths if obtained from a Vacuum Pipe: only filenames within the /var/lib/vac/machinetypes/MACHINETYPEPREFIX/files directory are acceptable.

SPACE CENSUS AND GOCDB

If the options gocdb_cert_file and gocdb_key_file are set, then Vac will attempt to update GOCDB with information about this Vac space. The space must already exist in GOCDB as part of the site with name given by gocdb_sitename. The list of VOs with machinetypes in this space is taken from the Vac configuration, with FQANs given by accounting_fqan. Both machinetypes defined locally and those from remote vacuum pipes are included. Capacity information for the whole space is combined from a static file and a dynamic census of the other Vac factories in this space.

Static information is taken from the file /var/lib/vac/space-catalogue.json if it exists. This JSON file consists of a dictionary with factory names as the keys and dictionaries of key/value pairs for each factory. For example: {’vac01.example.com’: {’max_processors’:2, ’max_machines’:1, ’max_hs06’:20.0}, ’vac02.example.com’: {’max_processors’:3, ’max_machines’:2, ’max_hs06’:30.0} }

Each Vac factory also maintains a census of all factories in this space by sending VacQuery messages every hour and recording the results in /var/lib/vac/space-census/ . Results more than one day old are ignored, and then deleted after two days (to allow debugging.)

Each Vac factory sends one update to GOCDB for the whole space once per day if there are 24 or fewer factories responding in the census. If there are more factories, then the time between updates is increased in proportion to the number of factories in the census. The update information sent is based on both space-catalogue.json and the census. If a factory is described in space-catalogue.json then its census results are ignored.

AUTHOR

Andrew McNab <Andrew.McNab@cern.ch>

More about Vac: http://www.gridpp.ac.uk/vac/

SEE ALSO

vacd(8), vac(1), check-vacd(8)