The notes below are part of the high-level notes I took during my first steps with AWS. They are not rocket science, but I’m publishing them because they might be helpful to others.

Notes

Recently I’ve been exploring AWS tools. Thanks to the Free Tier, you can explore AWS without worrying about the bills. Naturally, I ran into many difficulties, which I share below. They are split into 3 sections: Installation, Writing Code and Deploy. I installed and used the AWS tools on Ubuntu 16.04 together with the AWS extension for PyCharm, so I can’t confirm that the notes are valid for other operating systems and IDEs.

Installation

  1. Once installed, the AWS extension + PyCharm combo is very comfortable to use, but the installation itself can be painful.
  2. The official installation guide for the PyCharm AWS extension seems to be the best one, but be prepared to visit multiple pages to install additional requirements.
  3. Although the AWS CLI requires Python 3.3+, SAM requires Python 3.6, so it’s convenient to use Python 3.6 for the whole installation.
  4. I didn’t install the tools from point 12 in the order given. Docker was already installed on my machine. Just remember that Docker must be able to run without sudo.
  5. The private key file (.pem) generated by AWS might have the wrong permissions; chmod 400 mykey.pem should solve the problem.
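
The permission fix from point 5 can also be done from Python, for example in a script that prepares the machine before connecting. This is only a minimal sketch: the helper name restrict_key_permissions is hypothetical and not part of any AWS tool, and the key path is whatever file you downloaded.

```python
import os
import stat


def restrict_key_permissions(path):
    """Make the key file owner-read-only, the equivalent of chmod 400.

    Returns True if the mode had to be changed, False if it was already 0o400.
    """
    current_mode = stat.S_IMODE(os.stat(path).st_mode)
    if current_mode == 0o400:
        return False
    os.chmod(path, 0o400)
    return True
```

Calling restrict_key_permissions("mykey.pem") once before the first connection should be enough; a second call leaves the file untouched.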

Writing Code

  1. At the time of writing, AWS Lambda natively supports Python 3.6 and Python 3.7.
  2. To load and save data as files on AWS (in S3), the code uses boto3, the AWS SDK for Python, instead of the usual os and local file functions. If you’re not sure whether the code will be deployed on AWS or run on your own machine, decide in advance whether to adapt the code to the AWS way of doing things. Otherwise, be prepared to rewrite chunks of the code. You can also sidestep the problem by using Docker.
  3. Before writing the code, decide whether you want to run it on Lambda or EC2, because:
    1. Lambda doesn’t let you run Docker! Still, people emulate it (I haven’t tried it yet).
    2. Lambda times out after 15 minutes. You can work around this by configuring more resources for the function, running the computation asynchronously, optimizing the code, etc.
    3. The compressed code can be no larger than 50MB when used with Lambda. If it is larger than 3MB, it should first be uploaded to an S3 bucket and then referenced from there.
    4. Read about the other Lambda and EC2 limitations. Otherwise, you may be surprised during deployment.
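
One way to avoid rewriting chunks of code later (point 2 above) is to hide the storage decision behind two small functions. The sketch below is an assumption about how this could look, not the only way to do it: save_bytes and load_bytes are hypothetical names, and the S3 branch assumes boto3 is installed and AWS credentials are configured.

```python
import os


def save_bytes(data, key, bucket=None, local_dir="."):
    """Save data to S3 when a bucket is given, otherwise to a local file.

    Returns a string describing where the data was written.
    """
    if bucket is not None:
        # Imported lazily so the local branch works without boto3 installed.
        import boto3

        boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=data)
        return "s3://{}/{}".format(bucket, key)

    path = os.path.join(local_dir, key)
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(data)
    return path


def load_bytes(key, bucket=None, local_dir="."):
    """Load data from S3 when a bucket is given, otherwise from a local file."""
    if bucket is not None:
        import boto3

        response = boto3.client("s3").get_object(Bucket=bucket, Key=key)
        return response["Body"].read()

    with open(os.path.join(local_dir, key), "rb") as handle:
        return handle.read()
```

The rest of the program then calls save_bytes and load_bytes only, and the choice between your machine and AWS comes down to whether a bucket name is passed in.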
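
For the 15-minute timeout, one more option is to let the function stop on its own before Lambda kills it and hand back whatever work is left. A minimal sketch, assuming the event carries a list under a hypothetical "items" key; process and SAFETY_MARGIN_MS are placeholders of mine, while get_remaining_time_in_millis() is the method Lambda provides on the context object.

```python
SAFETY_MARGIN_MS = 10000  # hypothetical margin; tune it to your workload


def process(item):
    """Placeholder for the real work done on one item."""
    return item * 2


def handler(event, context):
    """Process items until the Lambda time budget is nearly used up."""
    items = list(event.get("items", []))
    processed = []
    while items and context.get_remaining_time_in_millis() > SAFETY_MARGIN_MS:
        processed.append(process(items.pop(0)))
    # Whatever is left can be sent to another invocation by the caller.
    return {"processed": processed, "remaining": items}
```

The caller can then invoke the function again with the "remaining" list, so no single run has to fit the whole job.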

Deploy

  1. It’s good practice to deploy NumPy and pandas as manylinux distributions. Otherwise, you can get misleading error messages. This tip also applies to other Linux servers!
  2. If you want to grant public access to specific files (e.g. the results of the program), a good approach is to split the project across two S3 buckets. One bucket can then be made public by editing its bucket policy.
  3. If you upload files to S3 with the PyCharm AWS extension, keep in mind that the old version of a file can remain visible for over a dozen minutes.
  4. If you’d like to connect to EC2 over SSH, configure the instance to automatically receive a public IP.
  5. EC2 + SSH requires a proper security group; otherwise, the code is not executed and the connection ends with a timeout error. Edit the security group and add an inbound rule for SSH (port 22).
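
The bucket policy from point 2 can be generated in Python instead of being typed by hand. This is a sketch under assumptions: public_read_policy and apply_policy are hypothetical helper names, the bucket name is yours to choose, and apply_policy needs boto3 plus credentials that are allowed to change the bucket.

```python
import json


def public_read_policy(bucket_name):
    """Build a bucket policy that lets anyone read objects in bucket_name."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::{}/*".format(bucket_name),
            }
        ],
    }


def apply_policy(bucket_name):
    """Attach the public-read policy to the bucket (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    boto3.client("s3").put_bucket_policy(
        Bucket=bucket_name,
        Policy=json.dumps(public_read_policy(bucket_name)),
    )
```

Printing json.dumps(public_read_policy("my-results"), indent=2) gives text that can also be pasted into the bucket policy editor in the console; "my-results" is just an example name.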
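
The security group from point 5 can be adjusted with boto3 as well. A minimal sketch, assuming you know the security group ID and the address range you connect from; ssh_ingress_rule and allow_ssh are hypothetical helper names, and the call itself needs boto3 with credentials allowed to edit the group.

```python
def ssh_ingress_rule(cidr):
    """Describe a rule that lets the given address range reach port 22 over TCP."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": cidr}],
    }


def allow_ssh(group_id, cidr):
    """Add the SSH rule to the security group (needs boto3 and credentials)."""
    import boto3  # imported lazily so the rule builder works without boto3

    boto3.client("ec2").authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[ssh_ingress_rule(cidr)],
    )
```

Passing your own address as a /32 range keeps the rule narrower than opening port 22 to everyone.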