AWS configuration for H5CLUSTER#

Amazon Web Services (AWS) is a suite of distributed services provided worldwide by Amazon Inc. H5CLUSTER relies on AWS EC2 for server rental, the necessary interconnect, and the storage system. The storage may be locally attached, so-called ephemeral disks, remotely attached Elastic Block Store (EBS) volumes, or Amazon S3 based devices.

In addition, a shared cluster relies on the IAM service to automate the assignment of login accounts on a running cluster, and on the VPC service for inter-network access.

Spot Request Limit increase#

INFO: According to this announcement, AWS EC2 is moving towards vCPU-based request limits. For now it is unclear whether spot requests are affected. Updated: 2019-09-16

This edition of H5CLUSTER supports spot and spot block requests to reduce operating costs, as well as on-demand instances. The default limits provided with a fresh AWS EC2 account are confined to a few instances; to prevent boot-up failures, make sure the spot instance request limit is reasonably large. While more is better, at least 30 per cluster is recommended. Keep in mind that a running cluster may be shared among users.

The spot instance limits are meant to be interpreted as is. The AWS EC2 platform doesn't come with guarantees, but it does sell surplus inventory through a bidding process. These surplus instances are not equally available in all regions and zones; there may be significant price differences between zones, and the price is a function of date and time as well. Having said that, the average price floats around 0.03 USD/core and in some cases may drop to 0.01 USD/core. Screenshot
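
Since availability and price vary by zone and by time, it may be worth checking the recent spot price history before booting a cluster. A minimal sketch with the standard AWS CLI, where the instance type and zone are purely illustrative:

# show the last few spot prices for an example instance type and zone
aws ec2 describe-spot-price-history --instance-types c5.large \
    --product-descriptions "Linux/UNIX" \
    --availability-zone us-east-1a --max-items 5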

IAM identity management for shared cluster#

When operating a cluster in exclusive/private mode H5CLUSTER has no IAM group requirements. However, within a midsize to large organization it is recommended to share a running computing environment with others, since sharing resources reduces operating costs. To facilitate sharing, follow these steps:

1. Launch the IAM dashboard in your browser and create a group for each planned shared cluster. Screenshot
2. Similarly, create an IAM account for each cluster user, then assign each user to a given cluster group.
3. When a cluster is being instantiated it scans the matching group for members and creates the user login account on all cluster nodes, if and only if the IAM user is active and has a public SSH key uploaded.
4. To upload an SSH public key, click on the user name Screenshot and then select Upload SSH public key. All uploaded SSH public keys are copied to $HOME/.ssh/authorized_keys; multiple SSH keys are supported. It is advised that cluster users have permission to manage their own public keys. Screenshot

A few words on the two categories of people interacting with H5CLUSTER:

If you want personnel to be able to manage clusters, they are required to have the minimum level of access rights on the IAM console to do so; this allows them to start up and terminate an H5CLUSTER. Regular users, on the other hand, only require an IAM account, cluster group membership and an uploaded SSH public key. Once the cluster is in the running state they may log in using the matching SSH private key and their favourite SSH client.
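
The console steps above can also be scripted with the standard AWS CLI. A minimal sketch, where the group and user names are illustrative and the group name is assumed to match the cluster name the boot process scans for:

aws iam create-group --group-name my-cluster
aws iam create-user --user-name alice
aws iam add-user-to-group --group-name my-cluster --user-name alice
# upload the SSH public key that will end up in $HOME/.ssh/authorized_keys
aws iam upload-ssh-public-key --user-name alice \
    --ssh-public-key-body file://~/.ssh/id_rsa.pub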

IAM identity management for personal cluster#

Ad-hoc or personal clusters don't require IAM groups to be assigned or registered; as long as the administrator/user has IAM permission to EC2 and VPC resources, he or she can boot up a cluster. This personal cluster will only have a login account matching the computer the cluster is started from.
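
One way to grant such rights is to attach the AWS managed EC2 and VPC policies to the administrator's IAM user. A sketch with an illustrative user name; a narrower custom policy may be preferable in practice:

aws iam attach-user-policy --user-name alice \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
aws iam attach-user-policy --user-name alice \
    --policy-arn arn:aws:iam::aws:policy/AmazonVPCFullAccess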

Identity management for OpenID#

EBS volumes#

For performance and an improved user experience the system relies on the locally attached NVMe drives, so-called ephemeral drives, of the IO nodes. These drives provide not only a shared file system among all nodes but also local high performance scratch disk services. The downside of this fast, low latency storage is that its lifespan is tied to the running instances; its content is therefore not preserved after the system is shut down. To help with this, each cluster may take up to 15 EBS volumes, which may be specified in the .aws/config file for each cluster:

[node name-io-nodes]
volumes: my-ebs-volume [,...]
...
[volume ebs://my-ebs-volume] 
mount-dir: /mnt
uri: vol-0f2906XXXXXXX3a5c, vol-03c0XXXXXXXXXf55a,
     vol-05bca0XXXXXXXf450, vol-072bXXXXXXXXX3d4b,
     vol-006f83XXXXXXXc6b8, vol-0d1aXXXXXXXXXd859
assign: round-robin

These volumes will be available as /mnt/xvda, /mnt/xvdb, .. /mnt/xvdf, distributed round-robin among all nodes within the node group, making them available for PVFS backing; their content will survive cluster termination. The requirements are: an already existing Amazon EBS volume formatted with an ext4 file system, and the volumes must not be attached to any instance when starting the cluster. Here is an example to create an EBS volume from the command line:

aws ec2 create-volume --size 10 --availability-zone us-east-1a --volume-type gp2 

For more details please follow this AWS guide.
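
To satisfy the ext4 and not-attached requirements above, the new volume can be attached to a helper instance, formatted, and detached again. A sketch where all IDs and device names are placeholders:

# attach the freshly created volume to a helper instance
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
    --instance-id i-0123456789abcdef0 --device /dev/sdf
# on that instance, format the device to ext4 (it may appear as /dev/xvdf or /dev/nvme1n1)
sudo mkfs.ext4 /dev/xvdf
# detach it so the cluster can attach it at boot time
aws ec2 detach-volume --volume-id vol-0123456789abcdef0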

S3 volumes#

In order to keep up with technological changes and arising demand, h5cluster introduces S3 based volumes, each with different properties. The major difference from EBS backed volumes is that data stored in S3 is available across the federation, that is, from all nodes of all instantiated clusters and, in some cases, from POSIX capable workstations. Because the underlying S3 storage system differs from regular hard drives with respect to data consistency and access patterns, these differences affect performance and quality of access.

spack volume#

is based on a block device approach with local Linux kernel level caching, and therefore provides the highest throughput across all nodes, taking advantage of the fast AWS S3 interconnect to EC2 instances, with a peak performance of 3 GB/sec. The downside is that only a single instance can have write access to the stored data, because of the current limitations of the metadata service.

[volume s3://spack-us-east-1a]
region: us-east-1
mount-dir: /mnt/spack 
cache-file: /tmp
cache-size: 4000
block-size: 4M
chmod: 0550
chown: root:users
read-only: true

The above configuration snippet defines a read-only S3 device with mount directory /mnt/spack and a 4000 x 4M = 16 GB local cache where spack system files will be stashed to reduce latency. All H5CLUSTER system files, with the exception of the components connecting to the spack volume, are stored centrally on the provided S3 volume. For enterprise level installations a dedicated option is available.

s3 volume as block device#

The driver script's built-in utility allows you to create and mount S3 volumes. First you need to define a section in the .aws/config file with the bucket name as follows, where you can provide additional parameters to control block size, compression, cache sizes, etc.:

[volume s3://my-unique-name-volume]
...
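
A more complete section might look like the following sketch, reusing the parameters shown for the spack volume earlier; the values and mount directory are purely illustrative:

[volume s3://my-unique-name-volume]
region: us-east-1
mount-dir: /mnt/data
cache-file: /tmp
cache-size: 4000
block-size: 4M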

The following line will create 2 disks within the S3 bucket my-unique-name-volume, each 10G in size. Since an S3 volume may only be mounted once with read-write access, a dedicated S3 disk is needed for each PVFS volume. In this case we use 2 disks.

cluster mkfs s3 --name my-unique-name-volume --volume-size 10G --volume-count 2 

To verify the volume just run the standard Amazon utility: aws s3 ls s3://my-unique-name-volume, where you should see the disk directories and a volume.txt JSON descriptor for the volume:

                           PRE disk-00/
                           PRE disk-01/
2020-09-26 15:21:15        306 volume.txt

Within a disk-??/ directory there are compressed blocks of block-size with random prefix names, containing the file system the device is formatted with. In the default case: file-system: ext4 -E packed_meta_blocks=1 -E lazy_journal_init=1 -E lazy_itable_init=1 -E nodiscard -b 4096 -O ^has_journal

aws s3 ls s3://h5cluster-5/disk-00/
2020-09-26 15:21:07       1041 396624c0-00000800
2020-09-26 15:21:08       3012 3b16a13f-00000024
...
2020-09-26 15:21:07       2354 ea0565d2-00000180
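
For reference, the default file-system line above roughly corresponds to the following mke2fs invocation; the device name is a placeholder and the exact flags used by the driver script may differ:

mkfs.ext4 -b 4096 -O ^has_journal \
    -E packed_meta_blocks=1,lazy_journal_init=1,lazy_itable_init=1,nodiscard /dev/xvdf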

To mount the volume on your local host run: cluster mount s3 --name my-unique-name-volume --mount-dir /mnt/s3, and to unmount: cluster umount --dir /mnt/s3. Both operations require s3backer installed locally as well as superuser privileges.

user volume#

is based on a simpler approach that allows regular file access with standard S3 properties. It is a convenient storage for smaller, non-performance related files such as configurations. This type of volume may be mounted read/write on all nodes and POSIX clients at the same time. The restriction is that a user's data is not shared with others.

shared volume#

This is similar to user volumes in terms of performance and access patterns, but it is shared among all users.

Elastic IP#

AWS nodes have public IP addresses attached to them once they go through the initialization phase. These IP addresses are visible from the AWS console and to the H5CLUSTER driver scripts, but they are randomly assigned from a pool of IP addresses available to Amazon Inc. To help others connect to a running cluster it is recommended to reserve one IP address per cluster for a nominal fee, which may then be specified in the cluster configuration:

[cluster name-of-my-cluster]
ipv4: x.x.x.x
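
A reserved address can be allocated from the AWS CLI; the PublicIp returned by the sketch below is what goes into the ipv4 field above:

# allocate a VPC scoped Elastic IP address
aws ec2 allocate-address --domain vpc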

You can read more about Elastic IP addresses here.

H5CLUSTER provides an alternative mechanism: updating the workstation's /etc/hosts file with the master node's public IPv4 address. This requires the user to run the cluster driver script on the workstation before being able to log into the cluster.
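
The resulting entry is an ordinary hosts line; a sketch with a placeholder address and the cluster name from the earlier configuration example, noting that the actual host name is chosen by the driver script:

# illustrative /etc/hosts entry maintained by the cluster driver script
203.0.113.10   name-of-my-cluster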

What you should know about Amazon VPCs#

The AWS platform provides a feature rich environment for IP based interconnects and for how they may communicate with the outside world. A subset of these features is exploited to create an isolated environment for each running cluster. Think of this environment as a confined, private, high performance IPv4 intranet with a gateway to the Internet. The resources are consumable, and by default they permit only 5 active clusters; this limitation may be lifted with a limit increase. In addition to the AWS policy, due to glitches in the boto3 library and/or improper cluster disposal, automatically generated VPCs may survive a cluster terminate --name running_cluster_name request and keep hogging resources. In case of resource related errors a manual pruning often resolves the problem.
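
When pruning manually, the leftover VPC can be located and removed with the standard AWS CLI. A sketch where the VPC ID is a placeholder, keeping in mind that dependent resources must be removed before the VPC itself:

# list VPCs and identify the stale, cluster generated one
aws ec2 describe-vpcs
# remove it once its dependencies (subnets, internet gateway, ...) are gone
aws ec2 delete-vpc --vpc-id vpc-0123456789abcdef0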