Saturday, August 29, 2020

Synchronize AWS S3 bucket with AWS EC2 instance

Task at hand: synchronize an AWS S3 bucket with an AWS EC2 instance.



Infrastructure requirements:

  1. IAM user group with the following policies
    • AdministratorAccess
    • AmazonS3FullAccess
  2. IAM user in the group created above, with an access key generated
  3. S3 bucket
  4. EC2 instance with the following
    • Linux OS (any flavor)
    • httpd server
    • AWS command line interface (CLI)

Below are the detailed steps:

        1. IAM (Identity and Access Management): create a user group and attach the following policies

    • AmazonS3FullAccess
    • AdministratorAccess

        2. Create a user and add it to the group created above. In the last step of user creation, download the access key ID and secret access key as a CSV file.
        3. S3 bucket
    • Create an S3 bucket in AWS
    • Upload an HTML file to the S3 bucket and block public access
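
For readers who prefer the command line, steps 1 to 3 could also be scripted with the AWS CLI. The following is only a rough sketch, not the procedure used in this post (which used the AWS console). It assumes the CLI is already configured with credentials that are allowed to manage IAM and S3. The function names and every group, user, bucket and file name are hypothetical placeholders.

```shell
# Sketch only: function names and all arguments are placeholders, not from the post.

# Create a group, attach the two managed policies, create a user in it,
# and generate an access key (the secret is printed only once, so store it safely).
create_sync_user() {
    group="$1"
    user="$2"
    aws iam create-group --group-name "$group" &&
    aws iam attach-group-policy --group-name "$group" \
        --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess &&
    aws iam attach-group-policy --group-name "$group" \
        --policy-arn arn:aws:iam::aws:policy/AdministratorAccess &&
    aws iam create-user --user-name "$user" &&
    aws iam add-user-to-group --group-name "$group" --user-name "$user" &&
    aws iam create-access-key --user-name "$user"
}

# Create a bucket, block all public access to it, and upload one HTML file.
create_site_bucket() {
    bucket="$1"
    html_file="$2"
    aws s3 mb "s3://$bucket" &&
    aws s3api put-public-access-block --bucket "$bucket" \
        --public-access-block-configuration \
        BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true &&
    aws s3 cp "$html_file" "s3://$bucket/"
}
```

A possible invocation would be 'create_sync_user s3-sync-group s3-sync-user' followed by 'create_site_bucket my-s3-bucket index.html'.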

        4. EC2 instance
    • Create an EC2 instance with the Amazon Linux AMI and download your private key (PPK file). Do not share this PPK file with anyone: it grants SSH access to every EC2 instance launched with that key pair.
    • Once the instance is up and running, it appears in the EC2 instance list.

    • Change the inbound rules of the instance's security group to allow port 80, so that the instance accepts HTTP requests.
    • Log in to the EC2 instance using your favorite SSH client and the PPK file. I used PuTTY.
    • Once logged in, you see a Linux shell with ec2-user as the login user (the user name depends on the Linux flavor).
  • Now that you are logged in to the EC2 instance, install a couple of packages
    • Update the yum packages using 'sudo yum update'
    • Install the Apache server using 'sudo yum install httpd'. Here 'sudo' means run with elevated privileges, 'yum' is the package manager, 'install' is the command the package manager executes, and 'httpd' is the name of the package to install.
    • Installing httpd creates the /var/www/html folder. By default, all HTML files that Apache (httpd) serves are picked up from /var/www/html.
    • Change ownership and permissions on the /var/www/html folder using the 'sudo chown ec2-user /var/www/html' and 'sudo chmod -R o+r /var/www/html' commands
    • Check whether the AWS CLI package is installed using the following command - 'aws --version'
    • If the package exists, you see the installed version of the AWS CLI. Otherwise, install the AWS CLI using the following guide -  https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-linux.html
    • Configure the AWS CLI using the 'aws configure' command and provide the access key ID and secret access key from the CSV file downloaded while creating the user. Leave region and output format at their defaults.

    • Start the Apache server as a system service using the following command: $ sudo systemctl start httpd

    • Check the status of the httpd server: $ systemctl status httpd
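
The server setup commands above can be collected into one small shell function so that they are easy to repeat on a new instance. This is a minimal sketch assuming Amazon Linux with yum and systemd. The function name 'setup_web_server' is hypothetical, the '-y' flag (answer yes to yum prompts) and '--no-pager' are additions for non-interactive use, and 'aws configure' is left out because it prompts for input.

```shell
# Sketch: install and start Apache, then make the web root writable by the login user.
# $1: login user (ec2-user on Amazon Linux; it may differ on other distributions)
# $2: web root served by httpd
setup_web_server() {
    web_user="${1:-ec2-user}"
    web_root="${2:-/var/www/html}"
    sudo yum update -y &&
    sudo yum install -y httpd &&
    sudo chown "$web_user" "$web_root" &&
    sudo chmod -R o+r "$web_root" &&
    sudo systemctl start httpd &&
    systemctl status httpd --no-pager
}
```

Calling 'setup_web_server' with no arguments uses ec2-user and /var/www/html, matching the values used in this post.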

    • Now that all the necessary infrastructure is set up, let us sync the data. Use the following command to sync the S3 bucket to the EC2 instance: $ aws s3 sync s3://bucketname /localpath
    • Ex: $ aws s3 sync s3://my-s3-bucket /var/www/html
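
The sync command copies new and changed files from the bucket, but it runs only once and leaves local files in place even if they were deleted from the bucket. As a hedged sketch of how the sync could be previewed and kept current, the functions below use two standard 'aws s3 sync' options: '--dryrun' (list the operations without performing them) and '--delete' (remove local files that no longer exist in the bucket). The function names, the bucket name and the 5-minute schedule are assumptions for illustration only.

```shell
# Sketch: $1 is the bucket name, $2 the local folder.

# Show what a sync would do without changing anything.
preview_sync() {
    aws s3 sync "s3://$1" "$2" --delete --dryrun
}

# Perform the sync, also deleting local files that were removed from the bucket.
apply_sync() {
    aws s3 sync "s3://$1" "$2" --delete
}

# Hypothetical crontab entry (added with 'crontab -e') that repeats the sync every 5 minutes.
# It assumes 'aws' is on the PATH that cron uses; otherwise give the full path to the binary.
# */5 * * * * aws s3 sync s3://my-s3-bucket /var/www/html --delete
```

For example, run 'preview_sync my-s3-bucket /var/www/html' first and, if the listed operations look right, 'apply_sync my-s3-bucket /var/www/html'.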
Testing:
So far we have set up all the infrastructure needed to sync S3 to EC2 and run the sync. Now it is time to test.
  1. The sync copied the HTML file from the S3 bucket to the EC2 instance.
  2. The file now sits in the /var/www/html folder.
  3. Because httpd serves content from this folder, browsing to the Apache server using the public IP (or a DNS name mapped to this EC2 instance) should show the content of the HTML page. If the file is not named index.html, append its name to the URL.
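
The same check can be done from a terminal instead of a browser. The sketch below assumes curl is installed on the machine you test from. The function name 'check_page' and the address 203.0.113.10 are placeholders, so substitute the public IP or DNS name of your own instance.

```shell
# Sketch: succeed only if the page is served with HTTP status 200.
# $1: public IP or DNS name of the instance
# $2: optional file name, for pages not named index.html
check_page() {
    status="$(curl -s -o /dev/null -w '%{http_code}' "http://$1/${2:-}")"
    echo "HTTP status: $status"
    [ "$status" = "200" ]
}
```

For example, 'check_page 203.0.113.10' requests the home page, and 'check_page 203.0.113.10 mypage.html' requests a specific file. Any status other than 200 makes the function return a non-zero exit code.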