Table of Contents
Shared Responsibility Model
Understanding AWS Accounts and Security Best Practices
Securing the Root User with Multi-Factor Authentication (MFA)
Best Practices for Securing Your AWS Account
Disaster Recovery Strategies
AWS Certified Solutions Architect – Associate Exam Prep
Module 1: Design Secure Architectures
Module 2: Design Resilient Architectures
Module 3: Design High-Performing Architectures
Module 4: Design Cost-Optimized Architectures
You must:
- Understand the Shared Responsibility Model, the AWS global infrastructure, and the resilience of different AWS services.
- Know how to design resilient architectures as a solutions architect.
- Be familiar with AWS accounts, how they work, and the importance of securing the root user with multi-factor authentication.
- Comprehend the best practices for securely handling AWS credentials and employing IAM users, groups, and roles.
- Ensure a deep understanding of security principles, disaster recovery strategies, and the AWS Well-Architected Framework.
Shared Responsibility Model:
* AWS is responsible for security and compliance of the Cloud, while the customer is responsible for security and compliance in the Cloud (i.e., what the customer brings into the cloud).
* https://continuumgrc.com/understanding-the-difference-between-hipaa-and-hitrust/
* https://d1.awsstatic.com/whitepapers/compliance/Using_AWS_in_GxP_Systems.pdf
Global Infrastructure and AWS Service Resilience
AWS Global Infrastructure
Key Components:
- Regions
- Geographical areas containing multiple Availability Zones (AZs).
- Each region is isolated and independent.
- Examples: us-east-1, eu-west-1, ap-southeast-1.
- Availability Zones (AZs)
- Distinct locations within a region engineered to be isolated from failures in other AZs.
- Each AZ has independent power, cooling, and physical security.
- Connected through low-latency links.
- Edge Locations
- Sites for content delivery (Amazon CloudFront) and DNS services (Amazon Route 53).
- Helps in reducing latency by serving content closer to users.
- Local Zones
- Extension of an AWS Region providing closer proximity to end-users for specific workloads.
- Example: us-west-2-lax-1.
- Wavelength Zones
- Infrastructure for deploying applications closer to 5G networks to minimize latency.
AWS Service Resilience
General Principles:
- Fault Tolerance: Capability of a system to remain operational even if some components fail.
- High Availability: Ensuring a service remains accessible as much as possible.
- Disaster Recovery: Strategies and tools for recovering from catastrophic failures.
Key AWS Services and Their Resilience Features:
- Amazon EC2 (Elastic Compute Cloud)
- Auto Scaling: Automatically adjusts the number of EC2 instances.
- Elastic Load Balancing (ELB): Distributes incoming traffic across multiple instances.
- Multi-AZ Deployments: Ensuring instances are spread across different AZs.
- Amazon S3 (Simple Storage Service)
- Cross-Region Replication (CRR): Automatically replicates objects across regions.
- Versioning: Keeps multiple versions of an object to protect against accidental deletions.
- S3 Standard-IA and S3 One Zone-IA: For infrequent access data with different resilience characteristics.
- Amazon RDS (Relational Database Service)
- Multi-AZ Deployment: Provides high availability and failover support.
- Read Replicas: Improve read performance and can be promoted to support recovery.
- Automated Backups and Snapshots: For data protection and recovery.
- Amazon DynamoDB
- Global Tables: Fully replicated tables across multiple AWS regions.
- On-Demand Backup and Restore: Creates full backups of DynamoDB tables.
- Amazon EFS (Elastic File System)
- Multi-AZ Deployment: Data is stored redundantly across multiple AZs.
- Automatic Scaling: Automatically grows and shrinks your file system as you add and remove files.
- Amazon Route 53
- Health Checks and DNS Failover: Automatically route traffic to healthy endpoints.
- Global DNS Network: Redundant and globally distributed.
- AWS Lambda
- Multi-AZ Execution: Functions run in multiple AZs within an AWS Region.
- Automatic Scaling: Automatically scales out in response to incoming traffic.
- Amazon CloudFront
- Global Edge Network: Caches copies of content at edge locations for low-latency delivery.
- DDoS Protection: Integrated with AWS Shield for protection against DDoS attacks.
- AWS Backup
- Centralized Backup Management: Manages and automates backups across AWS services.
- Cross-Region Backup: Store backup copies in multiple regions.
- AWS Global Accelerator
- Global Network: Directs traffic through the AWS global network for better performance and availability.
- Health Checks: Automatically redirects traffic away from unhealthy endpoints.
Best Practices for Building Resilient Architectures:
- Distribute resources across multiple AZs and Regions.
- Implement Auto Scaling and load balancing.
- Use Infrastructure as Code (IaC) for repeatable and reliable resource management.
- Regularly test disaster recovery plans.
- Leverage AWS managed services for critical infrastructure components.
This cheatsheet provides a high-level overview of AWS Global Infrastructure and the resilience features of various AWS services. It is designed to help you understand the core concepts and best practices for building resilient applications on AWS.
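As a concrete illustration of the S3 resilience features listed above, the sketch below enables versioning on a bucket with boto3. This is a minimal example, assuming credentials come from your environment; the bucket name is a placeholder.

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-resilience-notes"  # placeholder bucket name

# Enable versioning so accidental deletions and overwrites can be recovered.
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# Confirm the bucket now reports versioning as enabled.
status = s3.get_bucket_versioning(Bucket=bucket).get("Status")
print(f"Versioning status for {bucket}: {status}")
```

Note that Cross-Region Replication requires versioning to be enabled on both the source and destination buckets.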
Understanding AWS Accounts and Security Best Practices
AWS Accounts
- What is an AWS Account?
- An AWS account is a container for your AWS resources. It allows you to access and manage AWS services and resources.
- Each AWS account has a unique identifier and is linked to a billing account.
- Account Structure:
- Root User: The root user has full access to all AWS services and resources in the account. This user is created when the account is first set up and is linked to the email address used during registration.
- IAM Users: Identity and Access Management (IAM) users are created by the root user or other authorized users to grant specific permissions to individuals or services without giving full access to the AWS account.
- Importance of Securing the Root User:
- The root user has unlimited access and control over the AWS account, including billing information and the ability to close the account. Thus, securing this account is critical to prevent unauthorized access and potential misuse.
Securing the Root User with Multi-Factor Authentication (MFA)
- Multi-Factor Authentication (MFA):
- AWS provides MFA to add an extra layer of security. This typically involves using a virtual MFA device (like Google Authenticator) or a hardware MFA device (like YubiKey).
- Steps to Enable MFA for the Root User:
- Sign in as the root user, navigate to the IAM Console, and activate MFA on the root account.
- Learn how to enable MFA.
Best Practices for Securing Your AWS Account:
- Use Strong, Unique Passwords:
- Ensure your root account and all IAM users use strong, unique passwords.
- AWS Service: AWS IAM Password Policies.
- Learn about IAM Password Policies.
- Enable MFA for All Users:
- Encourage or require all IAM users to enable MFA for their accounts.
- AWS Service: AWS IAM.
- Learn about enabling MFA for IAM users.
- Create Individual IAM Users:
- Avoid using the root account for everyday tasks. Instead, create individual IAM users with the minimum permissions necessary.
- AWS Service: AWS IAM.
- Learn about creating IAM users.
- Use IAM Roles:
- For applications and services that need access to AWS resources, use IAM roles instead of hardcoding credentials.
- AWS Service: AWS IAM Roles.
- Learn about IAM roles.
- Regularly Review Permissions:
- Regularly audit and review permissions to ensure users have only the access they need.
- AWS Service: AWS IAM Access Analyzer, AWS CloudTrail.
- Learn about IAM Access Analyzer.
- Learn about AWS CloudTrail.
- Monitor Account Activity:
- Use AWS CloudTrail and other monitoring services to track account activity and detect suspicious actions.
- AWS Services: AWS CloudTrail, Amazon CloudWatch, AWS Config.
- Learn about AWS CloudTrail.
- Learn about Amazon CloudWatch.
- Learn about AWS Config.
By understanding how AWS accounts work, following these best practices for securing the root user and other accounts with MFA, and utilizing the AWS services and techniques above, you can significantly enhance the security of your AWS environment.
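For example, the strong-password best practice above can be enforced account-wide with an IAM password policy. The sketch below is a minimal boto3 example; the specific length and rotation values are illustrative, not prescriptive.

```python
import boto3

iam = boto3.client("iam")

# Apply an account-wide password policy for IAM users (values are examples).
iam.update_account_password_policy(
    MinimumPasswordLength=14,
    RequireUppercaseCharacters=True,
    RequireLowercaseCharacters=True,
    RequireNumbers=True,
    RequireSymbols=True,
    AllowUsersToChangePassword=True,
    MaxPasswordAge=90,            # force rotation every 90 days
    PasswordReusePrevention=24,   # disallow reuse of the last 24 passwords
)

# Read the policy back to confirm it was applied.
print(iam.get_account_password_policy()["PasswordPolicy"])
```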
Security Principles
Key Security Principles:
- Least Privilege: Grant only the permissions necessary for users to perform their tasks.
- AWS Service: AWS IAM Policies.
- Learn about IAM Policies.
- Defense in Depth: Implement multiple layers of security.
- AWS Services: AWS Security Hub, AWS WAF, AWS Shield.
- Learn about AWS Security Hub.
- Encryption: Use encryption to protect data at rest and in transit (see the sketch after this list).
- AWS Services: AWS KMS, Amazon S3 Encryption, AWS CloudHSM.
- Learn about AWS KMS.
- Monitoring and Logging: Continuously monitor and log activities.
- AWS Services: AWS CloudTrail, Amazon CloudWatch, AWS Config.
- Learn about AWS CloudTrail.
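To make the encryption principle above concrete, here is a minimal boto3 sketch that encrypts and decrypts a small payload with AWS KMS. The key alias is a placeholder; in practice you would reference a customer managed key you have created.

```python
import boto3

kms = boto3.client("kms")
key_id = "alias/example-app-key"  # placeholder alias for a customer managed key

# Encrypt a small secret; direct KMS encryption is intended for payloads up to 4 KB.
# Larger payloads typically use envelope encryption with generated data keys.
ciphertext = kms.encrypt(KeyId=key_id, Plaintext=b"database-password")["CiphertextBlob"]

# Decrypt it again; KMS identifies the key from metadata embedded in the ciphertext.
plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
print(plaintext == b"database-password")
```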
Disaster Recovery Strategies
Disaster Recovery (DR):
- Strategies to ensure the availability and resilience of systems in the event of a disaster.
- AWS Service: AWS Elastic Disaster Recovery.
- Learn about AWS Elastic Disaster Recovery.
Key DR Strategies:
- Backup and Restore: Regularly back up data and restore it in case of failure (see the sketch after this list).
- AWS Services: AWS Backup, Amazon S3, Amazon RDS Automated Backups.
- Learn about AWS Backup.
- Pilot Light: Maintain a minimal version of the environment always running.
- AWS Services: Amazon EC2, Amazon RDS, Amazon Route 53.
- Learn about the Pilot Light strategy.
- Warm Standby: Keep a scaled-down version of a fully functional environment.
- AWS Services: AWS Elastic Beanstalk, Amazon Aurora, AWS Elastic Load Balancing.
- Learn about the Warm Standby strategy.
- Multi-Site Active/Active: Run a full-scale version of your environment in multiple regions.
- AWS Services: AWS Global Accelerator, Amazon Route 53, AWS Elastic Load Balancing.
- Learn about the Multi-Site Active/Active strategy.
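As referenced in the Backup and Restore item above, a backup plan in AWS Backup can automate that strategy. The sketch below is a minimal boto3 example; the plan name, vault name, schedule, and retention values are illustrative.

```python
import boto3

backup = boto3.client("backup")

# Create a simple daily backup plan (names and schedule are examples).
plan = backup.create_backup_plan(
    BackupPlan={
        "BackupPlanName": "daily-backups",
        "Rules": [
            {
                "RuleName": "daily-at-05-utc",
                "TargetBackupVaultName": "Default",
                "ScheduleExpression": "cron(0 5 * * ? *)",
                "Lifecycle": {"DeleteAfterDays": 35},  # retain each recovery point for 35 days
            }
        ],
    }
)
print(plan["BackupPlanId"])
```

Resources are then associated with the plan through a backup selection (create_backup_selection), typically by tag, and cross-Region copies can be added to a rule with copy actions.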
Know how to:
- Design secure access to AWS resources, including public, private, hybrid, and multi-cloud environments.
- Use AWS services for creating traceability and security for access to AWS resources.
- Secure workloads and applications by managing access, operational permissions, and the security configurations of Amazon VPCs.
- Implement disaster recovery plans and understand RPO and RTO to meet business needs.
- Use AWS Backup capabilities for various services and understand cross-Region backup strategies.
Knowledge of:
- The AWS global infrastructure and its architectural components, such as Regions, Edge Locations, and Availability Zones.
- AWS services’ resilience at the zonal, regional, and global levels.
- Key AWS security services, such as AWS IAM, AWS Organizations, and service control policies.
- The functionalities of AWS networking services and storage options like EBS, S3, and CloudFront.
Skill of:
- Designing high-performing, cost-optimized, and secure AWS architectures.
- Applying the AWS Well-Architected Framework and understanding its pillars, including the new Sustainability pillar.
- Building and securing Amazon VPCs, including default and custom VPCs.
- Managing and configuring disaster recovery strategies such as Backup and Restore, Pilot Light, Warm Standby, and Multi-site Active-Active.
Difference Between:
- Highly available and fault-tolerant workloads, and understanding disaster recovery strategies.
- Horizontal and vertical scaling, and the use of Amazon EC2 Auto Scaling vs. AWS Auto Scaling.
- Public, private, hybrid, and multi-cloud environments in terms of designing secure access.
Use cases:
- When to use AWS Secrets Manager vs. AWS Systems Manager Parameter Store for storing secrets.
- Employing AWS Storage Gateway for hybrid environments.
- Using Amazon S3 Cross-Region Replication for disaster recovery purposes.
- Selecting appropriate connectivity options like VPN or Direct Connect for different performance and throughput needs.
- Utilizing AWS services like Elastic Beanstalk, CloudFormation, and OpsWorks for automated deployments.
Reasons:
- Prioritize security at every layer and implement security best practices to protect AWS environments.
- Use multi-factor authentication for root user accounts to mitigate security risks.
- Understand and employ disaster recovery strategies to ensure business continuity and meet regulatory requirements.
- Leverage the AWS Well-Architected Framework for designing and operating reliable, secure, efficient, and cost-effective systems.
You should:
- Be aware of the cost implications of different methods of connectivity and optimize costs by tweaking queries or database designs.
- Research and take advantage of caching rather than always scaling up and out.
- Know the configurations and pricing models for each AWS service.
- Understand services like Kinesis Data Firehose, Snow Family, Glue, AWS DataSync, AWS Transfer Family, Storage Gateway, Direct Connect, and Database Migration Service.
- Establish connectivity between VPCs and architect networking strategies for multi-account setups, including shared services VPC.
- Be able to select the appropriate data transfer service for migration or ingestion.
- Have in-depth knowledge of CloudFront use cases, how it works, and its benefits.
- Understand the functionality of Route 53 and how to develop solutions using it.
- Know how Transit Gateway works, its use cases, and how it simplifies network peering solutions.
- Be able to select appropriate connectivity options, comparing VPN and Direct Connect for performance and throughput.
- Select high-performing networking solutions for workloads given specific requirements.
- Have knowledge deeper than identifying which database service to use, understanding how to maximize performance and configure read replicas for RDS.
- Be comfortable with selecting appropriate compute services and understanding their benefits and limitations.
- Choose architectures and services that scale to meet performance requirements, determining which AWS service would be most scalable and performant based on use case.
- Be familiar with basic API calls or AWS CLI commands for S3, including multi-part uploads (see the sketch after this list).
- Know the major performance configurations for each service, especially Amazon S3.
- Understand which AWS storage services are best for low-latency requirements, and be familiar with the different types of EBS volumes.
- Know the general upper bounds for capacity for storage solutions to choose the right service for future data storage needs.
- Be able to determine the best storage solution based on future storage needs and understand the configuration options for key storage services.
- Add AWS edge networking services such as CloudFront, Route 53, and Global Accelerator, and configure them to transmit data securely with improved latency.
- Know which services can be used for caching, including CloudFront, ElastiCache, and DynamoDB Accelerator, and understand the fundamentals of caching.
- Use RDS Proxy to make applications more scalable, resilient to database failures, and secure.
- Manage security for S3 buckets, add control based on specific paths or objects, and know when to use S3 lifecycle configurations versus S3 intelligent tiering.
- Understand Amazon S3 encryption, including client-side and server-side encryption methods.
- Use AWS Certificate Manager to encrypt data in transit and know how certificates are renewed.
- Read and interpret policy documents, understand IAM decision logic, and evaluate how multiple policies affect an identity.
- Dive deep into AWS Cost Management Tools like Cost Explorer, tags, AWS Budgets, Savings Plans, Reserved Instances, and the AWS Cost and Usage Reports.
- Improve performance designs with AWS Services for storage, databases, networking, and compute, including serverless technologies and patterns for Lambda and Fargate.
- Review compute services, their use cases, and pricing models to optimize for cost.
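For the multi-part upload item above, boto3's transfer utilities handle multi-part uploads automatically once a file crosses a size threshold. A minimal sketch, assuming a local file and a placeholder bucket:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Files larger than the threshold are uploaded in parallel 8 MB parts.
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,
    multipart_chunksize=8 * 1024 * 1024,
    max_concurrency=4,
)

s3.upload_file(
    Filename="backup.tar.gz",          # local file (placeholder)
    Bucket="example-uploads-bucket",   # placeholder bucket name
    Key="archives/backup.tar.gz",
    Config=config,
)
```

If you need direct control, the underlying S3 API operations are create_multipart_upload, upload_part, and complete_multipart_upload.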
Understand:
- Understand the principles of the AWS Well-Architected Framework, including all six pillars: operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability.
- Understand the Shared Responsibility Model and the specific security responsibilities that AWS and customers share.
- Understand the different types of storage services AWS offers (e.g., S3, EBS, EFS) and their specific use cases and performance characteristics.
- Understand the basics of networking in AWS, including VPCs, subnets, route tables, and gateway configurations.
Be familiar with:
- Be familiar with AWS identity and access management (IAM) best practices, including the use of roles, policies, and multi-factor authentication (MFA).
- Be familiar with AWS monitoring and logging services, such as CloudWatch, CloudTrail, and VPC Flow Logs.
- Be familiar with the AWS pricing model and cost management tools, including how to use AWS Budgets, Cost Explorer, and Cost and Usage Reports.
- Be familiar with the different types of databases AWS offers, such as RDS, DynamoDB, Redshift, and Aurora, and their best use cases.
Plan:
- Plan for high availability and disaster recovery by understanding and implementing multi-AZ and multi-region architectures.
- Plan for security by designing and implementing secure access controls, encryption, and compliance measures.
- Plan for scalability by using auto-scaling, load balancing, and distributed architectures.
- Plan for cost optimization by choosing the right services and configurations to balance performance and cost.
Ensure:
- Ensure you have a backup strategy that includes regular backups, cross-region replication, and a tested disaster recovery plan.
- Ensure your architectures are designed to handle failure gracefully, with redundant components and failover mechanisms.
- Ensure your applications are designed to be stateless and scalable, using services like S3, DynamoDB, and SQS.
- Ensure you are following best practices for operational excellence, including automation, monitoring, and incident response.
Know:
- Know the differences between various AWS compute services, such as EC2, Lambda, ECS, and EKS, and when to use each.
- Know how to use AWS networking services, including Direct Connect, VPN, and AWS Transit Gateway, to connect on-premises and cloud resources.
- Know how to secure your AWS environment using services like AWS Shield, AWS WAF, and AWS Secrets Manager.
- Know the best practices for database optimization, including indexing, query optimization, and read replicas.
Identify:
- Identify the appropriate AWS service for different workloads, considering factors like performance, cost, and scalability.
- Identify potential security risks and implement measures to mitigate them, such as using IAM roles, security groups, and NACLs.
- Identify opportunities for automation using AWS services like CloudFormation, Elastic Beanstalk, and OpsWorks.
- Identify and use the appropriate AWS tools for monitoring, logging, and alerting to maintain operational excellence.
Learn:
- Learn how to design serverless applications using AWS Lambda, API Gateway, and DynamoDB (see the sketch after this list).
- Learn how to implement CI/CD pipelines using AWS CodePipeline, CodeBuild, and CodeDeploy.
- Learn how to manage and deploy containerized applications using Amazon ECS, EKS, and Fargate.
- Learn how to use AWS analytics services, such as Athena, Redshift, and QuickSight, to gain insights from your data.
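As a starting point for the serverless item above, the handler below sketches the Lambda piece of an API Gateway to Lambda to DynamoDB flow. The table name and event shape are assumptions for illustration, not a definitive implementation.

```python
import json
import boto3

# Table name is a placeholder; in practice pass it in via an environment variable.
table = boto3.resource("dynamodb").Table("orders")


def lambda_handler(event, context):
    # With API Gateway proxy integration, the request body arrives as a JSON string.
    body = json.loads(event.get("body") or "{}")

    # Persist the order; DynamoDB scales without managing servers.
    table.put_item(Item={"order_id": body["order_id"], "status": "received"})

    return {
        "statusCode": 200,
        "body": json.dumps({"order_id": body["order_id"], "status": "received"}),
    }
```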
You Should:
- You should be aware of the cost implications of each method of connectivity (for example, VPN versus Direct Connect).
- You should research how to take advantage of caching instead of always scaling up and out. You can also optimize for cost in some cases by tweaking queries or database design; it all depends on the service and the use case.
- You should know the configurations you can make and the pricing models for each service.
- You should definitely understand Kinesis Data Firehose, the Snow Family, AWS Glue, AWS DataSync, the AWS Transfer Family, Storage Gateway, Direct Connect, and the Database Migration Service.
- You should understand how to establish connectivity between VPCs, and read up on how to architect networking strategies for a multi-account setup, including a scenario that has a shared services VPC.
- You should be able to select the appropriate data transfer service for data migration or ingestion. There are multiple data transfer services and ways to move data to AWS that you should be familiar with so you can select the appropriate solution: AWS DataSync, the AWS Snow Family, the AWS Transfer Family, AWS Database Migration Service, and others you should feel comfortable designing architectures with.
- You should have in-depth knowledge of the use cases for CloudFront, how it works, and the benefits of using it.
- You should have a firm grasp of the functionality of Route 53 and how to develop solutions using it, including the record types you can create and why you would need to use Route 53.
- You should know how Transit Gateway works, its use cases, and how it can simplify network peering solutions. Another service you should know at a high level is AWS VPN CloudHub, which can also help you create a hub-and-spoke model for connecting networks. In addition to connecting a remote network to AWS, you should also know how to create connections between VPCs so that an application in one VPC can send and receive messages or data with an application hosted in another VPC or another AWS account.
- You should be able to select the appropriate connectivity option. There can be differences between the use cases for VPN and Direct Connect connections, and you should familiarize yourself with the performance and throughput capabilities of both so that you can properly compare and contrast them when making architecture decisions. On top of AWS managed VPN and Direct Connect, you should also be familiar with services like Transit Gateway, which can be used with either VPN or Direct Connect to connect multiple VPCs to a remote network.
- You should be able to select high-performing networking solutions for a workload given a defined set of requirements or circumstances. This doesn't mean you need to be an absolute networking expert to be an AWS solutions architect, but you should know enough about AWS networking services and solutions to make informed decisions and do basic troubleshooting.
- You should have knowledge that goes deeper than identifying which database service to use for a given use case. Dive deeper and know how these services work and operate so you can answer questions that go beyond "Do I need a relational database?" How do you maximize the performance of your relational database engines? What if a significant portion of your traffic is read operations on the database? Understand how to configure read replicas for RDS, and be sure you understand the different purposes of read replicas compared to Multi-AZ deployments.
- You should be very comfortable with choosing when to use which compute service and what benefits and limitations each compute service has.
- You should also be able to choose the appropriate architecture and services that scale to meet performance requirements. This means you should be able to determine, based on a use case, which AWS service would be the most scalable and performant given defined circumstances. For example, say an organization needs to host a backend web service on EC2, and usage of this web service will vary greatly throughout the day. The solution needs to be highly available and elastic. What type of architecture would be a scalable solution for this workload? In this case, you would likely want to research using EC2 with Elastic Load Balancing and Amazon EC2 Auto Scaling.
- You should be familiar with the basic API calls or AWS CLI commands for S3, and you should also familiarize yourself with multi-part uploads.
- You should be familiar with the major performance configurations for each service. For example, when designing solutions using Amazon S3, you may be asked to consider how you could improve the performance of data uploads and retrievals.
- You should be thinking about which AWS storage services are the best fit for low-latency requirements. EBS volumes are extremely low latency, and their performance is configurable. You should be familiar with the different types of EBS volumes and how performant each type is based on your use case and on IOPS.
- You should know the general upper bounds for capacity for storage solutions, because this will help you choose the right service for your future data storage needs.
- You should also be able to determine which storage solution is the best fit based on future storage needs. Know the configuration options for key storage services. For example, for object storage, ensure you understand the S3 storage classes, basic API calls, and multi-part uploads. S3 is also great for big data storage and backup and recovery. Ensure you understand scalability, availability, durability, security, performance, and cost optimization with S3.
- Also, ensure you know how to add AWS edge networking services such as CloudFront, Route 53, and Global Accelerator, and how to configure them to transmit your data securely and with improved latency. Understand how these services help to encrypt data, remove network hops, and give you more control over how traffic is routed.
- You should know which services can be used for caching, including but not limited to Amazon CloudFront, ElastiCache, and DynamoDB Accelerator. Again, ensure you understand the fundamentals of caching, read replicas, and more. Diving even deeper, ensure you know how to use RDS Proxy to make your applications more scalable, more resilient to database failures, and more secure.
- The last thing to bring to your attention is protecting data based on access patterns. Certain services such as S3 give you the ability to manage security for entire buckets and to add controls based on specific paths or objects. What capabilities exist for automatically managing the data lifecycle? For example, when would you use S3 lifecycle configurations instead of S3 Intelligent-Tiering (see the sketch after this list)? You should know which services provide this level of granularity, be able to read and build policies based on different access patterns and needs, and understand how those policies are evaluated by the service backend.
- You should also understand Amazon S3 encryption. S3 provides encryption at rest and in transit, and it supports two methods: client-side encryption and server-side encryption.
- You should also know how to use AWS Certificate Manager to encrypt data in transit and how certificates are renewed.
- You should be able to read and interpret policy documents. Learn the major parts of a policy statement, what's required, and the ways in which policies provide granularity with permissions. Make sure you also understand IAM decision logic when evaluating policies, including how that will affect an identity with multiple policies attached.
- You should also dive deep into the AWS Cost Management tools: Cost Explorer, tags, AWS Budgets, Savings Plans, Reserved Instances and Capacity Reservations, and the AWS Cost and Usage Reports.
- You should also have a lot of depth in improving the performance of your designs with AWS services for storage, databases, networking, and compute. This domain also includes serverless technologies and patterns for Lambda and Fargate.
- You should review all of the compute services, their use cases, and their pricing models. This will help you figure out how to optimize for cost.
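For the lifecycle question raised above, a lifecycle configuration is the right tool when access patterns are predictable, while S3 Intelligent-Tiering is better when they are not. A minimal boto3 sketch with placeholder names and example transition days:

```python
import boto3

s3 = boto3.client("s3")

# Transition objects under logs/ to cheaper storage classes over time, then expire them.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-log-bucket",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```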
AWS Certified Solutions Architect – Associate Exam Prep
Module 0: Welcome
Hi everyone, I’m Julie Elkins, an exam prep curriculum developer here at AWS, and welcome to the AWS Certified Solutions Architect Associate Certification exam prep course. Before we get started with the content, let’s take a minute to talk about this certification and what it provides.
I know that many of you are already aware of the benefits of becoming an AWS Certified Solutions Architect and that’s why you’re here. But to make sure everyone is on the same page, I wanted to talk about this certification. Preparing for certification exams should always start with understanding the focus and requirements needed for the specific certification.
This course is intended for those who are preparing to take the AWS Certified Solutions Architect Associate Exam. You should possess the knowledge, skills, and experience to design solutions based on the AWS Well-Architected Framework. There are no specific technical requirements to complete this course or sit for the certification exam, but having a general understanding of the solutions architect role, and how to use AWS technologies to design well architected solutions will help.
This certification exam is intended for individuals who perform a solutions architect role and have one or more years of experience designing highly available, cost efficient, fault tolerant, and scalable distributed systems. And the certification validates your ability to design and implement systems on AWS, highlighting your capacity to incorporate resiliency, high performance, security, and cost optimization.
During the exam, you’ll be asked to complete tasks such as evaluating the design and implementation of multi-tiered and highly available architectures using scaling for compute and storage solutions, identifying cost effective solutions, and more.
The AWS Certified Solutions Architect Associate Certification validates that you have the knowledge, depth, and experience to do the following: design solutions that incorporate AWS services to meet current business requirements and future projected needs; design architectures that are secure, resilient, high performing, and cost optimized; and review existing solutions and determine improvements.
Now, let's talk about this exam prep course. In this intermediate-level course from Amazon Web Services, you will learn how to assess your preparedness for the AWS Certified Solutions Architect Associate exam. In this course, you will learn how to prepare for the exam by exploring the exam's domains and task statements, mapping them to your development on AWS, and identifying specific areas to study. The certification exam guide, which outlines this course, is organized into four domains that cover the topics necessary for the exam.
Designing secure applications and architectures is covered in domain one. This domain reviews designing secure access to AWS resources, application tiers, and data security options.
Domain two covers designing resilient architectures. This domain focuses on multi-tier solutions, highly available and fault-tolerant architectures, AWS services used as decoupling mechanisms, and appropriate resilient storage options.
Domain three, which is about designing high performing architectures will guide you through identifying elastic and scalable compute and storage solutions and selecting high performing network and database solutions.
The fourth and final domain focuses on designing cost optimized architectures. This is where you’ll look at identifying cost effective storage, compute, database solutions, as well as designing cost optimized network architectures.
The new exam also covers over 30 new services and features, and it adds the latest pillar from the AWS Well-Architected Framework: Sustainability. Check the exam guide for key tools, technologies, and concepts that might be covered on your exam.
Let’s quickly run through some of the concepts and services you might be expected to know for each domain of the exam guide.
The SAA-C03 exam guide references the following four domains, broken down as follows. Domain one covers designing secure architectures; this domain covers 30% of the exam content and focuses on designing secure access to your AWS resources. I recommend diving deeper into AWS Identity and Access Management, or IAM, AWS Single Sign-On (recently renamed AWS IAM Identity Center), AWS Security Token Service, AWS Control Tower, and service control policies. Also make sure you understand how to secure your workloads.
Key services here are Amazon Cognito, Amazon GuardDuty, Amazon Macie, AWS Shield, AWS WAF, AWS Secrets Manager, AWS STS, and the controls available to secure your VPC. Along with this, we have to know how to secure our data. Make sure you understand the AWS Key Management Service, or AWS KMS, and AWS Certificate Manager, along with your data recovery and retention requirements.
How do you implement policies to your data and protect that data? What services would be key here? I recommend diving into all of the AWS storage services, plus understand the access policies for encryption keys and implementing policies for key rotation and certificate renewals.
Domain two covers designing resilient architectures. This domain covers 26% of the exam content and focuses on designing scalable and loosely coupled architectures and designing highly available and/or fault-tolerant architectures. For scalable and loosely coupled architectures, know how to integrate and when to use Amazon API Gateway, AWS Transfer Family, Amazon Simple Queue Service (SQS), AWS Secrets Manager, Application Load Balancers, Amazon SNS, AWS Fargate, AWS Lambda, Amazon Elastic Container Service, and Amazon Elastic Kubernetes Service. And just so you know, Kubernetes is featuring more on most of the updated certification exams. What are the key design principles for multi-tier, microservices, and event-driven architectures? Know the difference between horizontal and vertical scaling; this usually confuses most students. For the second focus area, high availability and fault-tolerant architectures, make sure you understand how these AWS services and the AWS global infrastructure can add high availability and fault tolerance. Dive deeper into Amazon Route 53, Amazon Comprehend, backup designs and strategies, RDS Proxy, and AWS X-Ray.
This is where fundamentals and basics become useful. What is immutable infrastructure? What are the best practices for disaster recovery? Understand different use cases for the AWS managed services, load balancing, distributed design patterns, and storage options depending on the requirements and the characteristics of each AWS storage service. This domain also includes automation and identification of metrics to track based on requirements, and then how to use that information to ensure high availability.
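To ground the loose-coupling discussion above, the sketch below shows the basic SQS producer and consumer pattern that lets two tiers scale independently. The queue URL is a placeholder; this is a minimal sketch, not a full worker implementation.

```python
import json
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"  # placeholder

# Producer: the web tier enqueues work instead of calling the worker tier directly.
sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps({"order_id": "1234"}))

# Consumer: a worker polls the queue at its own pace (long polling for 20 seconds).
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20)
for message in resp.get("Messages", []):
    order = json.loads(message["Body"])
    # ... process the order, then delete the message so it is not redelivered.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```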
Domain three covers designing high-performing architectures. This domain covers 24% of the exam content, and this is where we see a brand new section compared to the SAA-C02 version. For this updated exam, ensure you know how to determine requirements and implement high-performing data ingestion and storage. What are the available caching services? What are your data access patterns? Do you know how to scale these solutions?
Dive deeper into the scaling capabilities of AWS and these AWS services: Amazon CloudFront, AWS Direct Connect, AWS VPN, AWS PrivateLink, Amazon Simple Storage Service (Amazon S3), Amazon Elastic File System, Amazon Elastic Block Store, AWS Batch, Amazon ElastiCache, Amazon Athena, AWS Lake Formation, Amazon QuickSight, AWS DataSync, AWS Storage Gateway, AWS Glue, Amazon EMR, and Amazon Kinesis. And remember that you should also have a lot of depth in improving the performance of your designs with AWS services for storage, databases, networking, and compute. This domain also includes serverless technologies and patterns for Lambda and Fargate.
Here are two questions to consider: how do you decouple workloads so the components can scale independently? And what scaling metrics and conditions kick off scaling actions? We will dive deeper into these questions in just a bit. Know how to select the appropriate resource type and size, along with the compute options and features, to meet requirements. And before you move on, ensure you also understand data transformation and ingestion. Lake Formation appears in this updated version; know how to secure your ingestion access points and the speed needed for ingestion, transformation, and streaming data.
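One concrete answer to the scaling-metrics question above is a target tracking policy on an Auto Scaling group, which adds or removes instances to hold a metric near a target value. A minimal sketch with a placeholder group name:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Keep average CPU across the group near 50%; the group name is a placeholder.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="target-cpu-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 50.0,
    },
)
```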
Domain four covers designing cost-optimized architectures. This domain covers 20% of the exam content, and AWS is actually adding more cost optimization to all of its exams. Cost optimization is always important, and it's one of my favorite topics. Know how to cost optimize your storage, databases, networking, and compute environments. You will see scenario questions with different cost optimization requirements, and you will need to choose the best service for that design. You should also dive deep into the AWS Cost Management tools: Cost Explorer, tags, AWS Budgets, Savings Plans, Reserved Instances and Capacity Reservations, and the AWS Cost and Usage Reports.
Understand how to optimize the cost of your databases too, along with knowing which storage service will be best for your requirements. One key item of focus should be right sizing, not only your compute and database instances but also your storage. How do you determine the correct storage size and the lowest cost to transfer data? How do S3 lifecycle configurations help you cost optimize your storage? Again, we're going to dive deeper into these questions, but also focus on load balancing and connectivity cost savings. Key services would be AWS Transit Gateway, VPC peering, DNS, Route 53, and Elastic Load Balancers.
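The cost tools mentioned above also have APIs; for instance, Cost Explorer can be queried programmatically. A minimal sketch, grouping one month's unblended cost by service (the dates are placeholders):

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-02-01"},  # placeholder dates
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

# Print the cost per service for the requested period.
for group in resp["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = group["Metrics"]["UnblendedCost"]["Amount"]
    print(f"{service}: {amount}")
```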
And to wrap up this deeper dive into what you should expect from each domain, remember that earlier I mentioned the new pillar for the AWS Well-Architected Framework, the Sustainability pillar. New questions will appear on your exam for this pillar, but understand the other five pillars too. The AWS Well-Architected Framework and all of its pillars will help you in this exam, in other AWS certifications, and in the real world.
As mentioned, throughout the course you will examine each domain and learn what you should know to ensure you are not only comfortable with each domain and its task statements but can also design solutions for different requirements and levels of security, resiliency, and cost. Our goal here is not to provide you with everything you need to know and understand for each of the domains; our goal is to help you assess your readiness for the certification exam, provide structure, add clarity and context to each domain, and help you be ready to sit and pass your certification exam.
As we cover each domain, you should evaluate whether you're ready to move forward and take the exam or whether you need more preparation and depth in the topics from each domain. Using that information, you can then evaluate the areas you need to cover and understand how much more depth is needed in those topics. In this course, you will also review sample exam questions from the exam domains and learn how to interpret the concepts being tested so that you can better eliminate incorrect responses. Again, take the time to evaluate if you're ready to move forward to take the exam or if you need more preparation and depth in the topics from each domain.
There are two types of questions on the exam: multiple choice, which has one correct response and three incorrect responses, or distractors; and multiple response, which has two or more correct responses out of five or more response options. During the exam, you'll be asked to choose the best answer for scenarios involving designing and implementing systems on AWS. This highlights your capacity to incorporate resiliency, high performance, security, and cost optimization.
We will walk through example questions, but be aware that the overall length, complexity, and difficulty of the questions tend to be greater than what you might expect from an associate-level certification exam.
Most of the questions involve lengthy scenarios that are usually several sentences to a couple of paragraphs in length. Most of the answer choices will be several sentences long as well, so take your time as you're reading through these longer questions and be sure to process every word you read. Be on the lookout for repeated sentences across all of the possible answers with just a word or two changed. Those one or two words can make all the difference when it comes to determining which answer is correct and which answer might be a distractor.
Always do your best to eliminate these distractors as early as possible so you can focus more on the plausible answers and select the best possible answer or answers to each question. Call to action, you’ve set your goal, now it’s time to build knowledge and skills to propel your career. We recommend you take the time and effort to understand the content for each domain when studying for the Solutions Architect Associate Exam.
Use practice questions to build your reading comprehension skills and identify keywords and phrases. It's easy to sign up and schedule your exam; click the link in the resources to get your exam scheduled. At the end of this course, you will also practice skills with hands-on labs, test your knowledge with more practice question sets, and learn strategies for identifying incorrect responses by interpreting the concepts that are being tested in the exam. Plus, you will take a full-length official practice test that will help identify your strengths and weaknesses in each domain area.
This certification exam covers a broad range of topics. Your first resource should be the certification exam guide that outlines the certification exam and this course. A link to the exam guide will also be included in the course notes. AWS also provides additional resources and you can sign up for weekly exam tips and resources from AWS training and certification. Get exam ready tips on taking your exam, exam guides and sample questions and opportunities to dive deep on exam strategies with accredited AWS expert instructors.
Before we get started with the course, let’s cover a few fundamentals in the next video.
Fundamentals
Welcome back. Before we get started with the first domain, designing secure architectures, let’s first focus on the fundamentals needed for the certification exam.
If you wanna understand AWS for the exam and the real world, you need to learn the fundamentals and learn how AWS could fail. If you understand how AWS could fail, you can design your architectures to work around failures. AWS likes to say everything fails all the time, and the focus of this exam, as you can see from the domains, is to know and understand how to design secure, resilient, high-performing, and cost-optimized solutions.
First, ensure you understand and can explain what AWS is and what cloud computing is. Dive deeper into the five criteria of cloud computing. Also ensure you understand the AWS Global Infrastructure and the architectural components, Regions, Edge Locations, and Availability Zones. Dive deeper into globally resilient, regionally resilient, and availability zone resilient services.
Some services are resilient in their Availability Zone. Some are resilient in their Region and can cope with the failure of an entire Availability Zone. Some services are resilient globally and can cope with the failure of an entire Region and still continue to operate.
AWS also provides the Shared Responsibility Model to help define and provide clarity about which areas of system security belong to AWS and which are owned by you, the user. And then finally, my absolute favorite, the AWS Well-Architected Framework. The best practices for designing and operating reliable, secure, efficient, and cost-effective systems in the cloud have been consolidated into the AWS Well-Architected Framework. An understanding of these best practices is essential for those architecting solutions in AWS, but also for the certification exam. Dive deeper into the sixth pillar, Sustainability. As we mentioned, it is a new addition to this updated version.
There are several ways to learn about the AWS Well-Architected Framework.
First, there is the online documentation. I recommend reading through the AWS Well-Architected Framework multiple times and revisiting it as often as needed. This is going to be an excellent reference for you in preparing for the certification exam.
Second, there is the AWS Well-Architected Tool that is designed to help you review the state of your applications and workloads.
And third, there is the AWS Well-Architected Labs where you can get practice applying the Well-Architected concepts.
And the last fundamental you will need for this certification and in the real world is general fundamentals. Ensure you understand the fundamentals of networking, virtualization, encryption, storage, security, DNS, DNSSEC, databases, and so on.
To pass this certification and get a job in the real world, you need to know significantly more than what is required to pass this exam. If you are a solutions architect, you must know how to design resilient architectures. As a developer, you need to know how to create applications to cope with failures and be able to diagnose, repair, and prevent failed systems. For this certification and these roles, you need depth in fundamentals and in AWS.
Throughout this course and as you assess your knowledge level, we are here to help clarify and explain the context of the domains so that you’re able to understand what the exam is asking you. Feel free to reach out any time. Again, using that information, you can then evaluate the areas you need to cover and understand how much more depth is needed for those topics.
Good luck completing this course. Let’s get started with Domain 1, Design Secure Architectures. Scan the QR code to follow along and I’ll see you in the next video.
Module 1: Design Secure Architectures
Welcome back. Let’s get started with the first domain of the AWS Certified Solutions Architect Associate Exam Guide: Design Secure Architectures. The focus will be on designing secure applications and architectures.
Security is one of the most important aspects of any environment and a constant theme throughout this exam. It is important that you not only know the various security services available but also that you understand the security concepts and how they affect your decisions regarding the services and solutions you will end up evaluating.
AWS recommends that security be considered at every stage, level, and tier of your applications and architectures, not just in the sections about security. Consider secure solutions for every topic. Domain one, Design Secure Architectures, is broken into three task statements.
- Design secure access to AWS resources.
- Design secure workloads and applications.
- And determine appropriate data security controls.
Over the next several videos I will address each task statement individually breaking down the knowledge and skills expected of you to be successful.
Let’s get started to evaluate your readiness for the exam. In the next video, we will cover the first task statement from domain one, design secure access to AWS resources. Refer to the exam guide and scan here for a link to download and follow along and I’ll see you in the next video.
Secure access
Hi, and welcome back. Let’s begin with the first task statement from Domain one Design Secure Access to AWS Resources.
We just mentioned that security is one of the biggest and earliest considerations you will make when designing an architecture. This includes defining how people, tools, and applications you build will access the necessary AWS services and your data. This could involve determining not only who or what can launch or terminate your resources but also managing how and when access is given along with operational permissions and almost anything else that would involve calls to the services.
In the course introduction we mentioned the importance of understanding fundamentals specifically the Shared Responsibility Model, the AWS global infrastructure, and different AWS service resilience. This task statement covers securing access to your AWS resources and securing access to services in different types of cloud environments. Ensure you know the differences between public, private, hybrid, and multi-cloud environments and how to design secure access to all of them.
One area that is often overlooked when studying for this certification exam is AWS accounts. AWS accounts may seem too basic to take the time to understand, but I consider it part of the required, need-to-know fundamentals for AWS. You must understand what accounts are and how they work to ensure you have secure access to your AWS resources.
Back to our fundamentals. What level of resilience does AWS IAM have? Is it zonal, regional, or global? IAM is a global AWS service: its identities and policies are not tied to any single Region and are available across all AWS Regions. Ensure you understand IAM. Review the best practices for account root user security, applying the principle of least privilege, and multi-factor authentication.
AWS accounts begin with a single account root user with full permissions. So, why is it a risk to use the account root user? Well, the root user’s permissions cannot be reduced or restricted, and if the account root user is compromised, then so is the whole AWS environment. One way to secure the account root user is to add multi-factor authentication. But what else could we implement to secure our AWS account? How about adding other users with more limited permissions instead of using the account root user? When we add new users to our AWS account, those new users will be different people who need different levels of access.
What level of permissions do new IAM users have in your AWS account? IAM identities start with no permissions, but permissions can be granted. Dive deeper into IAM users, groups, and roles, what goes into deciding which to use, and how they might be combined. Understand how using the principle of least privilege limits your blast radius.
Here are a few questions to consider. How do you create IAM users, groups, and roles? What are their strengths and limitations? And what scenarios would dictate switching between the various user, group, and role-based permissions? Along with those identities, make sure you know how IAM and other AWS services give you the ability to secure the necessary credentials, and the best practices for handling those credentials.
Look at various methods of assuming roles, including assigning roles to AWS services. Understand how to use AWS Security Token Service within and across accounts, and how roles are used with federation. Your users might already have identities outside of AWS, such as in your corporate directory. If those users need to work with AWS resources, or with applications that access those resources, then those users will need security credentials. You can use an IAM role to specify permissions for users whose identity is federated from your organization or a third-party identity provider.
Do you know how to design and configure Active Directory federation to provide access to AWS IAM roles or users? Also dive into the best practices for controlling your application’s access to AWS APIs. When should you hard-code credentials into your application? The answer is never, but know the other ways to enable API access, and dive deeper into IAM policies.
For this certification, you do not need to know how to write advanced policies, but you should be able to read and interpret policy documents. Learn the major parts of a policy statement, which parts are required, and the ways in which policies provide granularity with permissions. Make sure you also understand IAM decision logic when evaluating policies, including how that affects an identity with multiple policies attached.
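To get comfortable reading policy documents, here is a minimal sketch, using boto3 and a hypothetical policy name and bucket, that creates an identity-based policy containing the core Effect, Action, and Resource elements. It is an illustration of the structure, not a recommended production policy.

```python
import json
import boto3

iam = boto3.client("iam")

# Identity-based policy: attached to a user, group, or role.
# Effect, Action, and Resource are the core elements of each statement.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowReadOnlyAccessToOneBucket",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",      # hypothetical bucket
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

response = iam.create_policy(
    PolicyName="ExampleReadOnlyS3Policy",           # hypothetical name
    PolicyDocument=json.dumps(policy_document),
)
print(response["Policy"]["Arn"])
```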
Also ensure you know the appropriate use of resource policies for AWS services. What is a policy? It’s an object in AWS that, when associated with an identity or resource, defines their permissions. There are two types of policies. Identity-based policies are attached to an IAM user, group, or role; they give you the ability to specify what that identity can do, that is, its permissions. With identity policies, we control which resources that identity can access. With resource policies, we control who can access the resource. Resource-based policies are attached to a resource, such as an Amazon S3 bucket, Amazon SQS queues, VPC endpoints, or AWS Key Management Service encryption keys, to specify who has access to the resource and what actions they can perform on it. Bucket policies are structured differently from identity policies. The difference is the Principal element of the policy: the Principal element of a resource policy defines which principals are affected by that resource policy.
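The Principal element is easiest to see in a resource policy. Here is a hedged sketch, with a hypothetical bucket name and account ID, of a bucket policy that grants another account read access to objects:

```python
import json
import boto3

s3 = boto3.client("s3")

# Resource-based policy: attached to the bucket itself.
# The Principal element names who the statement applies to.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCrossAccountRead",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},  # hypothetical account
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",             # hypothetical bucket
        }
    ],
}

s3.put_bucket_policy(
    Bucket="example-bucket",
    Policy=json.dumps(bucket_policy),
)
```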
And lastly, understand the methods, services, and tools available that help you create traceability for access to AWS resources. Traceability helps you monitor, alert, and audit actions and changes to your environment in real time. It also helps you integrate log and metric collection with systems that automatically investigate and take action. Just as you need to be aware of the performance and behaviors of your application components, you also need to have insight into who and what has access to your account resources and data.
You will be tested on designing security strategies for multiple AWS accounts, know how to gain that visibility, how to enforce security standards, and how to alert and automate based on that data. Ensure you understand AWS Control Tower, AWS Organizations and service control policies. Learning to design secure access to the resources is an important step in learning how to prioritize security at every step. Take your time when learning these topics and get practice with both the designing and implementation whenever possible.
Let’s get started with our first walkthrough question. I’ll see you in the next video.
Walkthrough question 1
–
Welcome to our first walkthrough question for the course. These walkthroughs are meant to assist you in a few ways.
First, this will be one of the opportunities to see the types of questions that you’ll encounter on the actual exam. While the questions aren’t pulled directly from the certification exam, they are of similar quality and difficulty and will give you exposure to this style of questioning.
Second, I want to show you methods I consider to be helpful when you’re working with multiple-choice questions. These methods help you focus on what you’re looking for and help you identify any distractors you may encounter on your exam.
And third, these questions will provide you with additional information. Any questions you feel confident in will reinforce your knowledge in that area, and any questions that reveal gaps in your knowledge will help you identify where to focus in your studies going forward.
As I go through each of these questions, I’ll generally follow a particular format. I’ll start by reading through the question, sometimes called the stem. Then I’ll identify key words and phrases in the stem that show you exactly what you’re looking for in the responses or answers. After that, I’ll go through the responses and I’ll give you time to figure out if you can identify the correct response yourself. After you’ve been given a chance to figure it out yourself, I’ll go through the responses and discuss why they’re correct or incorrect. Okay, now that I’ve given you some background information on how we’re gonna run these questions, let’s get started with our first question.
The stem reads, the CIO of a company is concerned about the security of the root user of their AWS account. How can the CIO ensure that the root account follows the best practices for securely logging in? Select two.
So, as you can see, this stem wants you to pick two answers out of the responses. Keep in mind that there is no partial credit on the exam. When you come across questions with multiple keys, be very careful to properly evaluate so that you’re able to find all possible answers. Reading this question, can you identify any key words or phrases? And also, what exactly is the question asking?
The stem is asking you to identify the two responses that will help the CIO ensure best practices for securely logging in with the root user. What about key words? A few key words I see are root user, best practices, and secure login. The question gives you all of the information you need to start looking at the responses.
Now let’s look at the responses or the answers.
The answers are A, enforce the use of an Access Key ID and Secret Access Key for the root user logins.
B, enforce the use of MFA for the root user logins.
C, enforce the root user to assume a role to access the root user’s own resources.
D, enforce the use of complex passwords for member account root user logins.
E, enforce the deletion of the root account so that it cannot be used.
I’ll give you some time to identify the keys and you can pause the video if you need more time too.
The correct answers or keys for this item are B, enforce the use of MFA for the root user logins, and D, enforce the use of complex passwords for member account root user logins. Both of these methods will provide additional security for the root user of their AWS account. These are also in line with the security best practices.
Let’s go through the distractors.
Starting with A, which recommends that you enforce the use of an Access Key ID and Secret Access Key for the root user logins. You use an Access Key ID and Secret Access Key to make programmatic requests to AWS. But you should not use your AWS account root user access key. This method gives the root user full access to all resources for all AWS services in the account, including the billing information. The permissions for this access method cannot be reduced.
For response C, you have, enforce the root user to assume a role to access the root user’s own resources. This is incorrect because the root user cannot assume a role within its own account. Roles are used by services within the account and for granting third parties in other AWS accounts access to your resources.
And last is E, which proposes you enforce the deletion of the root account so that it cannot be used. This will not work because if the root account is deleted, all resources will also be deleted. If this happens, you are no longer running environments in AWS.
How did you answer and understand this question? Remember when you are taking your exam, read each question and the answers completely to identify key words. Then eliminate the distractors and re-read the two plausible answers to choose your best answer.
Let’s get started with the second task statement from domain one, design secure workloads and applications. Refer to the Exam Guide and scan here for a link to download and follow along, and I’ll see you in the next video.
Secure workloads
–
Welcome back. Let’s get started with the second task statement, design secure workloads and applications. The focus for this task statement is how the people, tools, and applications you build will securely access the necessary AWS services. This could involve determining who or what can launch or terminate your resources, managing how and when access is given, operational permissions, and again, almost anything else that would involve calls to the services.
Again, let’s take a moment to go back to our fundamentals. Ensure you know how to design, build, and secure an Amazon Virtual Private Cloud or Amazon VPC. There are two types of Amazon VPC, default and custom. Make sure you know the differences and how the security for each is initially configured.
Let’s also check in and see if you know the resiliency of an Amazon VPC. Is an Amazon VPC a zonal, regional, or global service? Well, when you create a VPC, it exists in one Region and in one AWS account, so that makes an Amazon VPC a regional service.
Dive deeper and ensure you understand how to design secure VPC architectures. For example, many application architectures are multi-tiered. When studying about securing application tiers, pay attention to the use and functionality of security groups, network access control lists, route tables, and NAT gateways. These will provide management and security controls over your network traffic. They can provide granularity in the rules, restrictions, and allowances that you need. Understand how they work both together and individually.
Also understand how to build their rules, the pitfalls to avoid, rule processing logic, and methods to employ them for better combined functionality. Again, know the base configurations for security filters for both the default and custom VPC.
Back to fundamentals, ensure you understand networking fundamentals such as protocols, CIDR, subnetting, routing, security filters, gateways, and so on.
For network segmentation, understand the strategies behind when to use public and private subnets, what differentiates a public from a private subnet, and common practices around the use of these subnets. Back to our fundamentals, what is a subnet? Ensure you know the fundamentals going into this exam. A subnet is where our services sit and run inside our Amazon VPCs; subnets add structure and functionality to our VPCs. What resiliency do subnets have? Are they zonal, regional, or global? Subnets are a zonal feature of AWS: each subnet lives in a single Availability Zone.
You’ll also need to understand routing mechanisms within a VPC. This involves the implementation, use, and features of route tables. For example, suppose you need to allow specific types of traffic to access your application servers, but the traffic will be coming from your on-premises location, traversing a VPN, and your application servers are in a private subnet. How would you set that up? What could you do to make sure that the application servers are safe from access coming from the public internet, but also that there won’t be any issues at the VPC, subnet, or instance level for the requests coming in from the on-premises location connected by VPN to your VPC? This solution also requires you to know how to appropriately select and deploy other components, such as AWS service endpoints like PrivateLink, peering, transit gateways, VPN connections, Direct Connect, and other network connection tools and methods that are commonly used with VPCs.
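As a concrete illustration of that scenario, here is a hedged boto3 sketch. All IDs and the on-premises CIDR are hypothetical. It allows application traffic only from the on-premises range and routes return traffic through a virtual private gateway terminating the VPN:

```python
import boto3

ec2 = boto3.client("ec2")

ON_PREM_CIDR = "10.100.0.0/16"            # hypothetical on-premises range reached over the VPN
APP_SG_ID = "sg-0123456789abcdef0"        # hypothetical security group on the app servers
PRIVATE_RT_ID = "rtb-0123456789abcdef0"   # hypothetical route table for the private subnet
VGW_ID = "vgw-0123456789abcdef0"          # hypothetical virtual private gateway for the VPN

# Allow application traffic only from the on-premises range, not from 0.0.0.0/0.
ec2.authorize_security_group_ingress(
    GroupId=APP_SG_ID,
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": ON_PREM_CIDR, "Description": "On-premises over VPN"}],
    }],
)

# Return traffic to on premises goes through the virtual private gateway.
ec2.create_route(
    RouteTableId=PRIVATE_RT_ID,
    DestinationCidrBlock=ON_PREM_CIDR,
    GatewayId=VGW_ID,
)
```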
Endpoints are a great way to add secure access. Again, fundamentals: what is a VPC endpoint? Endpoints are objects we create inside our VPC, similar to an internet gateway or a NAT gateway, that let resources connect to AWS public services without needing an internet gateway or a NAT gateway. We just mentioned that AWS also has a VPC endpoint service, PrivateLink, which can help solve the issue of exposing an application by adding secure access for other VPCs in other AWS accounts.
Let’s say you have an application and you make that application public. Well, now you’re using the internet and your application is exposed, so how can you secure this application? We could set up VPC peering, but that’s gonna add more management overhead as you scale, and it also exposes the other applications in the peered VPCs to each other. PrivateLink is a secure and scalable way to expose your application or service to tens or hundreds of VPCs with no peering, internet gateway, NAT gateway, and so on.
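On the consumer side, connecting to a PrivateLink-exposed service comes down to creating an interface VPC endpoint. A hedged sketch with hypothetical IDs and a hypothetical endpoint service name:

```python
import boto3

ec2 = boto3.client("ec2")

# Interface endpoint in the consumer VPC that connects privately to the provider's
# endpoint service: no peering, internet gateway, or NAT gateway required.
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",                                         # hypothetical consumer VPC
    ServiceName="com.amazonaws.vpce.us-east-1.vpce-svc-0123456789abcdef0",  # hypothetical service
    SubnetIds=["subnet-0123456789abcdef0"],                                 # hypothetical private subnet
    SecurityGroupIds=["sg-0123456789abcdef0"],                              # hypothetical SG controlling access
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```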
Ensure you know how to secure external connections to and from AWS resources using private connections with AWS Site-to-Site VPNs, AWS Client VPN, and Direct Connect. You want to ensure you understand the capacity, security, and resilience options for configuring each of these services. Let’s step back again and talk about fundamentals and best practices for securing your data.
Here are a few questions to consider. How do you build in security to your networking tiers? How do you secure application use across these tiers? And what does the management of those security components look like?
For the exam, you could get a question asking which AWS service helps to secure personally identifiable information, or PII. PII is personal data used to establish an individual’s identity. This includes your name, home address, email address, Social Security number, driver’s license number, passport, date of birth, bank account information, credit card, and so on.
Amazon Macie is an AWS service that uses machine learning to discover, classify, and protect sensitive data stored in Amazon S3. More services to help here are Amazon Cognito and Amazon GuardDuty. For Cognito, ensure you have an understanding of Cognito user pools, Cognito identity pools, and how Cognito brokers single sign-on or ID federation. You will most likely see scenario-based questions around use cases for these services.
Another fundamental needed for designing secure workloads and applications is firewalls and proxy servers. You’ll wanna understand how to integrate security services to secure applications with services such as AWS Shield, AWS WAF, AWS IAM Identity Center, Amazon Cognito, Amazon GuardDuty, Amazon Macie, AWS Secrets Manager, and AWS Systems Manager Parameter Store. Understand the difference between Shield Standard and Shield Advanced.
Know when and why you might choose one security service over another, such as if you’re trying to prevent external DDoS or SQL injection attacks. For example, if you are storing a secret and need high-volume access with automatic credential rotation, what would you choose: AWS Secrets Manager or AWS Systems Manager Parameter Store? AWS Secrets Manager is designed to store secrets, more so than Systems Manager Parameter Store, and Secrets Manager can force the rotation of your secrets at a chosen interval of days.
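To make the Secrets Manager workflow concrete, here is a hedged boto3 sketch. The secret name, value, and rotation Lambda ARN are hypothetical, and rotation assumes you have already built a rotation function:

```python
import boto3

secrets = boto3.client("secretsmanager")

# Store a database credential (hypothetical name and value).
secrets.create_secret(
    Name="prod/app/db-credentials",
    SecretString='{"username": "appuser", "password": "example-only"}',
)

# Applications read the secret at runtime instead of hard-coding credentials.
value = secrets.get_secret_value(SecretId="prod/app/db-credentials")
print(value["SecretString"])

# Automatic rotation is driven by a rotation Lambda function (hypothetical ARN)
# on a schedule expressed in days.
secrets.rotate_secret(
    SecretId="prod/app/db-credentials",
    RotationLambdaARN="arn:aws:lambda:us-east-1:111122223333:function:rotate-db-secret",
    RotationRules={"AutomaticallyAfterDays": 30},
)
```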
Dive deeper and ensure you can choose the best service for the requirements. Another example: AWS WAF can only be deployed on certain AWS services, such as Application Load Balancers, Amazon API Gateway, and Amazon CloudFront.
A general understanding of these services will be crucial to knowing how they can be deployed in the scenarios you will encounter on your exam.
Let’s get started with our second walkthrough question. I’ll see you in the next video.
Walkthrough question 2
–
Welcome back. Let’s take a look at another sample question.
The stem for this item is: “A solutions architect must secure the network traffic for two applications running on separate Amazon EC2 instances in the same subnet. The applications are called Application A and Application B. Application A requires that inbound HTTP requests be allowed and all other inbound traffic be blocked. Application B requires that inbound HTTPS traffic be allowed and all other inbound traffic be blocked, including HTTP traffic. What should the solutions architect use to meet these requirements?”
Reading this question, can you identify any key words or phrases? Also, what exactly is this question asking? The question here is what should you use to meet the requirements to allow HTTP requests and block all other traffic for Application A, and what should you use to allow inbound traffic for HTTPS but block all other inbound traffic for Application B? What key words did you identify? A few key words I see are same subnet and the allow and block of inbound traffic, except HTTP and HTTPS.
Now let’s look at the answers.
A is to configure the access with network access control lists,
B is to configure the access with security groups,
C is to configure the network connectivity with VPC peering,
and D is to configure the network connectivity with route tables.
I’ll give you some time to figure it out, and remember you can pause the video if needed.
The key for this answer is B, configure the access with security groups. B is correct because a security group acts as a virtual firewall for your instance to control inbound and outbound traffic. Security groups support allow rules only, blocking all other traffic if a matching rule is not found. Security groups are applied at the instance level, so different instances in the same subnet can have different rules applied.
Let’s go through the distractors.
A is incorrect. While network access control lists can allow and block traffic, they operate at the subnet boundary. They use one set of rules for all traffic entering or leaving the particular subnet. Since the EC2 instances for both applications are in the same subnet, they would use the same network ACL. The question requires different security requirements for each application.
Answer C is incorrect. VPC peering permits separate distinct VPCs to communicate with each other and does not provide any traffic-blocking capabilities.
Answer D is also incorrect. A route table contains a set of rules called routes that are used to determine where network traffic from your subnet or gateway is directed. It does not provide any ability to block traffic as requested for applications that are in the same subnet.
How did you answer and understand this question? Remember when you’re taking your exam, read each question and the answers completely to identify key words, then eliminate the distractors and reread the plausible answers to choose your best answer.
Let’s get started with the third task statement from domain one, determine appropriate data security controls. Refer to the exam guide and scan here for a link to download and follow along. I’ll see you in the next video.
Data security controls
–
Welcome back. Let’s get started with a third task statement, determine appropriate data security controls. For this task statement, and when considering security at every level, the protection of data is one of the most important areas of focus.
We mentioned this briefly in the last lesson. However, whether the data is in transit or at rest, its security needs to be evaluated. Under AWS accounts and IAM fundamentals, we talked about the principle of least privilege, and this is important for security at each layer. Do you know how to design and implement the principle of least privilege to ensure only those who need access have only the degree of access that they need?
Diving deeper, do you also know how to design and implement securing access to your encryption keys? As a solutions architect, we need to understand the fundamentals of encryption. There are two types of encryption we need to know for this certification exam. Encryption at rest and encryption in transit.
So let’s pause again and cover briefly some fundamentals needed for this exam. In AWS, we need to make sure our data is encrypted at rest and also in transit.
Encryption at rest is designed to protect against unauthorized access and theft. Encryption at rest is usually used when only one party is involved.
Encryption in transit is designed to protect data as it’s being transferred between two places and two or more parties are involved.
Ensure you understand that using encryption adds a layer of protection around that data so no one from the outside can read it. And to understand encryption, it’s important to understand some terms. Ensure you know the following terms and concepts. Plaintext is not always text data, but it is always unencrypted; it could be documents, images, applications, and so on. An algorithm is code that takes the plaintext and an encryption key and generates your encrypted data. When an algorithm is being used, it needs the plaintext and also a key. A key works much like a password: it is used with the algorithm to produce ciphertext, and there are different types of keys for different types of encryption. Ciphertext is your encrypted data.
Now taking a step back, there are also two types of encryption and keys that we need to know, symmetric and asymmetric.
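To make the plaintext, key, algorithm, and ciphertext terms concrete, here is a small symmetric-encryption sketch using the third-party Python cryptography package. This is not an AWS service, just an illustration of the vocabulary:

```python
from cryptography.fernet import Fernet  # pip install cryptography

# The key: a secret value used by the algorithm.
key = Fernet.generate_key()
cipher = Fernet(key)

# The plaintext: any unencrypted data, not necessarily text.
plaintext = b"patient-record-1234"

# The algorithm takes the plaintext and the key and produces ciphertext.
ciphertext = cipher.encrypt(plaintext)
print(ciphertext)                   # unreadable without the key

# Anyone holding the same (symmetric) key can decrypt.
print(cipher.decrypt(ciphertext))   # b"patient-record-1234"
```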
Here are a few questions to consider. What methods are available to secure your data at rest? For example, why would you use AWS KMS instead of AWS CloudHSM for managing your data encryption keys, or how can you use those two services together? How do you manage encryption keys across Regions? What types of keys are there, and what are the differences in their capabilities? How often can you rotate each type of key?
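One pattern worth knowing here is envelope encryption with AWS KMS: a KMS key protects the data keys that you use to encrypt data locally. A hedged boto3 sketch, with a hypothetical key alias:

```python
import boto3

kms = boto3.client("kms")

# Ask KMS for a data key protected by a KMS key (the alias is hypothetical).
data_key = kms.generate_data_key(
    KeyId="alias/app-data-key",
    KeySpec="AES_256",
)

plaintext_key = data_key["Plaintext"]        # use locally to encrypt data, then discard
encrypted_key = data_key["CiphertextBlob"]   # store this alongside the encrypted data

# Later, decrypt the stored data key with KMS before decrypting the data itself.
restored = kms.decrypt(CiphertextBlob=encrypted_key)
assert restored["Plaintext"] == plaintext_key
```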
Also, understand how to implement access policies for encryption keys. For this exam, you’ll need to understand the main differences and why you might choose one service over another. Diving deeper, you should also know how to use AWS Certificate Manager to encrypt data in transit and how certificates are renewed.
You should also understand Amazon S3 encryption. S3 provides encryption at rest and in transit. With S3, you have two methods that support encryption, client side encryption and server side encryption.
With client-side encryption, the objects being uploaded to S3 are encrypted before the data leaves the source for S3, so the encryption happens on the client side. With server-side encryption, the data uses the default encryption in transit with HTTPS, and when the data arrives at S3, it is encrypted by S3; S3 manages this encryption. With client-side encryption, you control everything. With server-side encryption, there are a few options you need to know for this certification exam, so ensure you understand each: server-side encryption with customer-provided keys (SSE-C), server-side encryption with Amazon S3 managed keys (SSE-S3), and server-side encryption with keys stored in AWS KMS (SSE-KMS).
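As a quick illustration of server-side encryption, here is a hedged boto3 sketch uploading an object with SSE-KMS. The bucket name and key alias are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# S3 encrypts the object at rest with the specified KMS key (SSE-KMS).
s3.put_object(
    Bucket="example-secure-bucket",        # hypothetical bucket
    Key="reports/2024/q1.csv",
    Body=b"col1,col2\n1,2\n",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/app-data-key",      # hypothetical KMS key alias
)
```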
Let’s change the topic slightly and talk about compliance and compliance requirements. For this exam, you also need to know which AWS technologies can help to meet and satisfy compliance requirements. We mentioned earlier that security and compliance is a shared responsibility between AWS and the customer. Understand AWS Artifact and how to use this self-service central repository for AWS security and compliance reports and online agreements.
Let’s move on and talk about data retention, classification, and data recovery. There are best practices for securing sensitive data in AWS data stores. Ensure you have a good understanding of general data security patterns and a clear mapping of these patterns to cloud security controls. AWS provides the AWS Cloud Adoption Framework with a specific security perspective to help. Let’s dive deeper.
First, there are five capabilities: IAM, detective controls, infrastructure security, data protection, and incident response. But basically, we need to ensure we are thinking about the data we are protecting, how it is stored, and who has access to it.
Second, remember that not all data is created equal. We must ensure our data is classified properly to enforce the security of that data.
Third, add security controls or defense in depth. Most important here is layering multiple security controls to provide redundancy along with two categories, preventative and detective.
Understand what data protection looks like for your architecture and your requirements.
Here are questions to consider. How do you design data protection when using a VPN over the internet, over a private connection through AWS Direct Connect, for connections between VPCs, or for the transfer of your data between services such as Amazon S3 and your VPC? How do you protect data in transit when reaching end users over the public internet? Also, how do the various data management and storage services handle data protection?
For example, how will data management and storage differ when looking at S3 versus Amazon EBS? And does the use of those protections change the performance of the services? Let’s consider a scenario where the data is being generated on an instance that is using an EBS volume, and that data needs to be protected while maintaining durability. Would you want to store the data on an encrypted EBS volume or transfer the data to an encrypted S3 bucket? The least effort would be to use the encrypted EBS volume.
Another question to consider: will the use of encryption affect performance, and if so, how? Ensure you know which services have no impact and which services might have a slight impact on performance, for example, data retrieval speed with Amazon RDS and AWS KMS, or when reading data from S3.
Also know how to handle the root keys and how that method differs from your data keys. Dive deeper into AWS KMS and S3.
Are there managed services that can help you secure, evaluate, and audit the security of your data? Definitely dive into AWS KMS. Understanding your data security options will require understanding how the services operate, their security options, and how the services interact.
The last thing I wanna bring to your attention is protecting based on access patterns. Certain services such as S3 give you the ability to manage security for entire buckets and to add control based on specific paths or objects. What capabilities exist for automatically managing the data lifecycle? For example, when would you use S3 lifecycle configurations instead of S3 intelligent tiering? You should know which services provide this level of granularity, read and build policies based on different access patterns and needs, and understand how those policies are evaluated by the service backend.
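For the lifecycle side of this, here is a hedged boto3 sketch, with a hypothetical bucket and prefix, that transitions older objects to cheaper storage classes and eventually expires them, the kind of automation S3 Lifecycle provides when access patterns are predictable:

```python
import boto3

s3 = boto3.client("s3")

# Move log objects to Standard-IA after 30 days, Glacier after 90, delete after 365.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-log-bucket",                 # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire-logs",
                "Filter": {"Prefix": "logs/"},   # hypothetical prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```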
As I said earlier, data security will be extremely important when approaching this exam. The best solution will often be the most secure solution. When adding security at every layer, data protection will be crucial to the design and the implementation of your solutions on both the exam and in the real world. Prioritize security at every layer. Study with security in mind and evaluate how protection can be added both in original designs and reinforcing existing deployments.
Let’s wrap up this lesson and go back to fundamentals and talk about cloud storage. What is cloud storage? How does it work? What are the benefits, requirements, and types? Well, cloud storage is a cloud computing model that stores data with a cloud provider who manages and operates data storage as a service. The benefits are no hardware to purchase, no storage to provision, and no capital spending. Time to deployment is faster, and storage lifecycle management policies add automation and savings and can lock data in support of compliance requirements. Your storage requirements, along with any additional compliance requirements, ensure your data is safe, secure, and available when needed.
At a basic level, for requirements we need to ensure we plan and design for durability, availability, security, regulatory and governance requirements, and functional requirements. We also need to know which type of data we are storing: object, file, or block. Then there are five common ways to use cloud storage: backup and recovery, software test and development, data migration, compliance, and big data and data lakes.
Disaster recovery is a big part of this exam and must be included as part of your design for a resilient architecture. What options exist for different storage to protect data in the event of a disaster? Disaster recovery strategies available to you within AWS can be categorized into four approaches, ranging from the low cost and low complexity of making backups to more complex strategies using multiple active Regions.
Active-passive strategies use an active site, such as an AWS Region, to host the workload and serve traffic. The passive site, such as a different AWS Region, is used for recovery and does not actively serve traffic until a failover event is invoked. For a disaster event based on disruption or loss of one physical data center, a well-architected, highly available workload may only require a backup and restore approach to disaster recovery. If your definition of a disaster goes beyond the disruption or loss of a physical data center to that of a Region, or if you are subject to regulatory requirements that require it, then you should consider pilot light, warm standby, or multi-site active-active.
Understand the AWS services that use a backup strategy that runs periodically or is continuous. How often you run your backup will determine your achievable recovery point. The backup should also offer a way to restore it to the point in time in which it was taken.
Here are a few services to dive deeper into: EBS snapshots, DynamoDB backups, RDS snapshots, Aurora snapshots, EFS backups when using AWS Backup, Amazon Redshift snapshots, Neptune snapshots, and Amazon DocumentDB snapshots. For Amazon S3, you can use S3 Cross-Region Replication to asynchronously and continuously copy objects to an S3 bucket in the disaster recovery Region, while versioning the stored objects so that you can choose your restoration point.
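A common backup-and-restore building block is copying snapshots into the disaster recovery Region. A hedged boto3 sketch, with hypothetical Regions and snapshot ID:

```python
import boto3

# Call copy_snapshot from the destination (disaster recovery) Region.
ec2_dr = boto3.client("ec2", region_name="us-west-2")     # hypothetical DR Region

response = ec2_dr.copy_snapshot(
    SourceRegion="us-east-1",                             # hypothetical primary Region
    SourceSnapshotId="snap-0123456789abcdef0",            # hypothetical EBS snapshot
    Description="Nightly copy for disaster recovery",
    Encrypted=True,                                       # encrypt the copy in the DR Region
)
print(response["SnapshotId"])
```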
AWS Backup provides a centralized location to configure, schedule, and monitor backups. Know the AWS Backup capabilities for the following services and resources: EBS volumes, EC2 instances, RDS and Aurora databases, DynamoDB tables, EFS file systems, Storage Gateway volumes, and Amazon FSx for Windows File Server and Lustre. AWS Backup also supports copying backups across Regions, such as to a disaster recovery Region. Here’s a question: what service can we use for hybrid environments? Well, what about AWS Storage Gateway?
For this exam, focus on data security patterns and the corresponding AWS security controls that protect your data. Let’s get started with our third walkthrough question, and I’ll see you in the next video.
Walkthrough question 3
–
Welcome back. This time for the walkthrough question, I’m gonna change it up a bit. I’ll still show you the stem and responses as usual, but instead of showing you the key words in the question, I’ll just give you some time to identify them yourself. Remember to identify the important part to the question and ensure that any response you consider to be correct will need to meet all of the requirements.
The stem for this item is, a company needs to implement a secure data encryption solution to meet regulatory requirements. The solution must provide security and durability in generating, storing, and controlling cryptographic data keys. Which action should be taken to provide the most secure solution?
Okay, what key words can you identify and what is this question asking?
Let’s look at the answers.
You have first A, to use AWS KMS to generate AWS KMS keys and data keys. Use KMS key policies to control access to the AWS KMS keys.
B is to use AWS KMS to generate cryptographic keys and import the keys to AWS Certificate Manager. Use IAM policies to control access to the keys.
C is to use a third-party solution from AWS Marketplace to generate the cryptographic keys and store them on an encrypted instance store volume. Use IAM policies to control access to the encryption key APIs.
D is to use OpenSSL to generate the cryptographic keys and upload the keys to an Amazon S3 bucket with encryption activated. Apply AWS Key Management Service key policies to control access to the keys.
Take some time here to see if you can identify the correct response. And before we look at the key or correct answer, let’s identify some key words. So the first key words that stand out for me are regulatory requirements. If you have depth in AWS KMS and CloudHSM, you will know that KMS uses hardware security modules to protect and validate your KMS keys, but KMS manages the software for the encryption. With CloudHSM, AWS provisions the encryption hardware for us, but we have to use our own client to perform the encryption. AWS KMS and CloudHSM are both FIPS compliant. KMS is level two and CloudHSM is level three.
Now let’s look at the correct key.
Our key is response A. It suggests using KMS to generate and control access to keys. This solution provides a secure way to generate, store, and control the data keys and key durability.
Let’s go through the distractors one by one, starting with B, which recommends using KMS to generate keys and then import them to AWS Certificate Manager and then using IAM policies to control access. This is a distractor because AWS Certificate Manager is used for storing SSL keys and not data keys.
C says to use a third-party solution for key generation, encrypted instance store volumes for storage, and IAM policies for access control to the encryption keys’ APIs. The largest issue I see with this is the use of the instance store volumes. Those are specifically designed to be ephemeral, and that means that the durability requirement isn’t met.
Last, we have D, which points towards OpenSSL for generation, S3 for storage, and KMS policies for access control. As there is no way to do key management directly from S3, and KMS policies only work for keys stored in KMS, this is also an incorrect response.
How did you answer and understand this question? Remember, when you are taking your exam, read each question and the answers completely to identify key words. Then eliminate the distractors and re-read the plausible answers to choose your best answer.
In the next video, we will wrap up domain one and then we’ll get started with an introduction to domain two, design resilient architectures. And remember to get some hands-on experience in the first lab, design secure architectures. I’ll see you in the next video.
Secure architecture wrap up
–
Welcome back, and congratulations on finishing the first Domain for this certification. For your studies and preparation, dive deeper into protecting your resources, applications, and data.
One of the security best practices is to follow the principle of least privilege, where we only grant the permissions needed and no more. What services can we use to follow the principle of least privilege when working in a multi-account environment? Would you use AWS Control Tower, AWS Service Catalog, or AWS Organizations? Be sure you have a good understanding of IAM. For example, when would you use an AWS IAM role, and when would you use an IAM user?
Know the differences between an identity policy, a resource policy, a permissions policy, and a service control policy. How are policies evaluated when there are overlapping allow and deny rules? Know the different ways to federate into AWS. Be sure to know the AWS Single Sign-On service (now AWS IAM Identity Center) and the different use cases for AWS Directory Service.
Understand what monitoring services exist, in addition to AWS CloudTrail, Amazon CloudWatch, and VPC Flow Logs.
Understand how to set up your own VPCs with the appropriate security controls. Be sure you know the use cases and capabilities of AWS Shield, AWS WAF, AWS Secrets Manager, and AWS Systems Manager Parameter Store.
Make sure you know how you can protect your data in transit and at rest in AWS. In what cases would you need to choose AWS KMS or AWS CloudHSM?
Ensure you have a solid understanding of the different AWS security services’ capabilities, their appropriate use cases, and when another service might be a better option. Being able to choose between two different services based on the stem is gonna help you pass the certification exam.
Let’s get started with Domain 2. Refer to the exam guide and scan here for a link to download and follow along and I’ll see you in the next video.
Module 2: Design Resilient Architectures
Welcome back. Let’s get started with Domain 2 covering Design Resilient Architectures.
As defined in the AWS Well-Architected Framework, a resilient workload has the capability to recover when stressed by load, attacks (either accidental through a bug, or deliberate through intention), and failure of any component of the workload.
In this domain, we will focus on designing resilient architectures and dive deeper into high availability, fault tolerance, disaster recovery, elasticity, and scalability.
The second domain is broken into two task statements: design scalable and loosely coupled architectures, and design highly available and/or fault-tolerant architectures.
As with the other domains, the solutions you are considering should involve thorough examination of the architecture requirements and needs and the understanding that none of these topics stand alone.
Over the next several videos in this module, I will address each task statement individually, breaking down the knowledge and skills expected of you to be successful. Refer to the exam guide and scan here for a link to download and follow along.
Let’s start to evaluate your readiness for the exam in the next video, where we will cover the first task statement from Domain 2, Design Resilient Architectures. I’ll see you in the next video.
Scalable architectures
–
Welcome back. Let’s begin with the first task statement from domain two, design scalable and loosely coupled architectures. When thinking about resilient architectures, we want to design our architectures with scalable and loosely coupled services, and we want to ensure our application continues to run even when there is a failure of an application component. It is best practice to test your applications at scale to ensure your applications can perform well under load.
So, let’s again go back to our fundamentals to ensure we understand scaling. Scaling is the ability of a system to adjust its capacity as the load placed on it increases or decreases, and systems scale when they need to grow or shrink depending on the load or demand. There are two ways to scale, vertical and horizontal. Ensure you know the difference between these two, along with the cost optimization for each.
Still under fundamentals, what is elasticity? Well, it is using automation along with horizontal scaling to match your capacity to your demand. Demand is rarely linear; it is usually increasing and decreasing, and elasticity gives our capacity the ability to increase and decrease to meet that ever-changing demand. AWS provides launch configurations (or launch templates) and Auto Scaling to scale out our systems and match capacity to demand, giving our environment the ability to scale. It adds resources as demand increases, and when that demand decreases, we can scale back in to a smaller number of servers, or even no servers. This optimizes our environment for performance efficiency, operational excellence, and cost optimization, which are all pillars of the AWS Well-Architected Framework.
For this exam, know the difference between horizontal and vertical scaling and how to achieve them using different services. For example, what are the capabilities of AWS Auto Scaling and Amazon EC2 Auto Scaling? What are the different types of scaling policies you can use with Amazon EC2 Auto Scaling?
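As an example of one scaling policy type, here is a hedged boto3 sketch, with a hypothetical Auto Scaling group name, that attaches a target tracking policy keeping average CPU utilization near 50 percent:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: Amazon EC2 Auto Scaling adds or removes instances to keep
# the chosen metric near the target value.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-tier-asg",          # hypothetical Auto Scaling group
    PolicyName="keep-cpu-near-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```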
Let’s move on. You should also have a fundamental understanding of containers, serverless, and virtualization. There are many options for running your compute workloads in the cloud. Containers, serverless, and virtual instances in EC2, but when does it make more sense to use one over the other? What are the benefits and limitations of the different service options? Different workloads have different needs.
Know how to recommend the appropriate compute, storage, networking, and database options based on the requirements, and the right AWS services to meet those requirements. For example, when using Amazon EC2 for high performance computing, there are different instance types and deployment configurations that are optimal for HPC. For more depth, read the AWS Well-Architected Framework High Performance Computing Lens. Knowing the different types of placement groups and the advanced networking options for certain EC2 instance types will help you answer questions and build better designs.
There are many options for using storage in your multi-tier architecture. For example, many years ago, companies used relational databases for nearly every data need. At extreme scale, this presents challenges. However, AWS has many purpose-built databases, such as DynamoDB for NoSQL needs and low-latency performance at extreme scale, RDS for a managed relational database service, Amazon Aurora for a cloud-native relational database service, and Amazon Redshift for data warehouse needs.
Here are questions to consider. How can you handle scaling with the AWS database services? For resiliency in RDS, would you use Read Replicas or a Multi-AZ design? Read Replicas provide two main benefits: performance and availability. RDS Multi-AZ does not scale your reads; the standby cannot be accessed directly and does not provide added performance. It is only for high availability. Diving deeper, using a Read Replica is not a substitute for using a cache. Querying a Read Replica still has the overhead of making a database connection, authentication, SQL query parsing, optimization, locking, and so on. But there are many other options for caching.
Under extreme load, your application components may run into performance issues. You should know which services can be used for caching, including but not limited to Amazon CloudFront, ElastiCache, and DynamoDB Accelerator. Again, ensure you understand the fundamentals of caching, Read Replicas, and more. Diving even deeper, ensure you know how to use RDS Proxy to make your applications more scalable, more resilient to database failures, and more secure.
Also, ensure you know how to add AWS edge networking services such as CloudFront, Route 53 and Global Accelerator, and configure them to transmit your data securely and with improved latency. Understand how these services help to encrypt data, remove network hops, and control application access which helps with the resiliency and performance and when to use them for different requirements.
Here’s another scenario. What if you need to design a file transfer solution where you do not have to host and manage your own file transfer service? Your manager does not want to invest in operating and managing infrastructure, patching servers, monitoring for uptime and availability, and building one-off mechanisms to provision users and audit their activity. What do you do? You could implement the AWS Transfer Family to manage file processing workflows. The Transfer Family gives you the ability to create, automate, and monitor your file transfer and data processing without maintaining your own code or infrastructure. It also supports up to three Availability Zones and is backed by an auto scaling, redundant fleet for your connection and transfer requests. It is important to know and understand how AWS managed services scale based on your configuration.
This exam will test your ability to build highly scalable and reliable workloads using a service-oriented architecture or a microservices architecture. Service-oriented architecture is the practice of making software components reusable via service interfaces. Microservices architecture goes further to make components smaller and simpler. Distributed systems rely on communication networks to interconnect components such as servers or services. Your workload must operate reliably despite data loss or latency in these networks. Components of the distributed system must operate in a way that does not negatively impact other components or the workload.
Ensure you understand and can design with best practices to help your workloads withstand stresses or failures, recover from them more quickly, and mitigate the impact of such impairments. With the microservices approach, services communicate over well-defined application programming interfaces, or APIs. There are different patterns for these builds, including API-driven, event-driven, and data streaming. A highly scalable and fault-tolerant microservices architecture, with the user interface, the implementation of the microservices, and the data stores, can be built in AWS using containers and serverless services to reduce operational complexity and add resilience. There are many options for running your serverless workloads.
Back to fundamentals, ensure you know what serverless is and how it is defined. In AWS, serverless is an operational model with no infrastructure to provision or manage, automatic scaling by unit of consumption, a pay-for-value billing model, and built-in availability and fault tolerance. Amazon API Gateway scales automatically, and many API use cases do not require management.
Dive deeper and know the use cases for API Gateway. For an application to call publicly available AWS services, you can use Lambda to interact with the required services and expose Lambda functions through API methods in API Gateway. Lambda is a serverless, event-driven compute service that gives you the ability to run code for virtually any type of application or backend service without provisioning or managing servers. To scale Lambda, ensure you understand concurrency and transactions. Amazon Simple Queue Service queues can deliver very high throughput. To achieve high throughput, you must scale message producers and consumers horizontally.
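For context, a Lambda function exposed through an API Gateway method is often just a small handler. A minimal, hedged Python sketch, assuming an API Gateway proxy integration event shape:

```python
import json

def lambda_handler(event, context):
    """Handle an API Gateway proxy request and return a JSON response."""
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```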
Think about the scenario where your application frontend is processing much faster than some of your backend processing. How can we achieve independent scaling of application components as needed? Web applications commonly use synchronous APIs when communicating with backend services. For longer-running processes, asynchronous workflows can offer an improved user experience and help manage scaling. By using durable message stores like SQS or DynamoDB, you can separate the request ingestion and response from the request processing.
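Here is a hedged sketch of that decoupling with SQS, using a hypothetical queue URL: the frontend enqueues work and returns immediately, while a separately scaled worker consumes at its own pace:

```python
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/111122223333/orders-queue"  # hypothetical

# Frontend (producer): enqueue the request and respond to the user right away.
sqs.send_message(
    QueueUrl=QUEUE_URL,
    MessageBody=json.dumps({"order_id": "1234", "action": "process"}),
)

# Backend (consumer): poll, process, and delete at whatever rate the workers can sustain.
messages = sqs.receive_message(
    QueueUrl=QUEUE_URL,
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20,   # long polling
)
for message in messages.get("Messages", []):
    order = json.loads(message["Body"])
    # ... process the order here ...
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])
```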
Dive deeper into SQS to decouple and scale microservices, distributed systems, and serverless applications. We mentioned that one of the ways we can help ensure our application continues to run, despite a single point of failure, is to implement loosely coupled architectures.
Before looking specifically at the various services for decoupling, it might be best for me to make sure we’re all on the same page with what is meant by decoupling. Decoupling refers to components remaining autonomous and unaware of each other as they complete their work as part of a larger system. This can be used to describe components within a simple application or can be applied at a much larger scale. This task focuses on designing scalable and loosely coupled architectures.
Additionally, it’s important that you understand and can distinguish between two decoupling techniques, synchronous and asynchronous integration. Synchronous decoupling generally involves at least two components, and both must always be available in order for things to function properly. Knowing how to use decoupling services will be incredibly important. The use of tools such as the load balancer discussed earlier, or even handling more complicated builds with Amazon EventBridge, will be important to solving the examples you’ll be presented with.
Additionally, you’ll need to understand how to use serverless tools in AWS to build and improve decoupling mechanisms. What are the ways you can deploy SQS, API Gateway, DynamoDB, and many other services within the serverless toolkit?
Understanding how to implement these tools will help you to build flexibility and decoupling capabilities into your architectures, and it will also help in determining which responses will work. Knowing the textbook definitions and methodologies of decoupling will only get you so far.
It’s important that you learn about how the serverless and decoupling services can be used to build and improve the functionality of both your architectures and the scenarios presented on the exam. Evaluate the needs from the questions. And by understanding how the services work, know what will be best for the various implementations you encounter.
Given a set of use cases and knowing the capabilities of different AWS services, you should know how to design the appropriate type of architecture to meet the requirements. A wide variety of patterns including event-driven, microservices, and multi-tiered architectures might apply, but some will be more appropriate based on the user’s needs.
This task statement requires you to have depth of knowledge across API Gateway, Transfer Family, SQS, Secrets Manager, Application Load Balancers, Fargate, Lambda, Amazon Elastic Container Service, Amazon Elastic Kubernetes Service, and more.
Let’s get started with our fourth walkthrough question. I’ll see you in the next video.
Walkthrough question 4
–
Welcome back. Let’s get started with another walkthrough question.
Your stem is: an API receives a high volume of sensor data. The data is written to a queue before being processed to produce trend analysis and forecasting reports. With the current architecture, some data records are being received and processed more than once. How can a solutions architect modify the architecture to ensure that duplicate records are not processed?
Reading this question, can you identify any keywords or phrases, and again, exactly what is this question asking? The question is asking how to update the design and implement a solution so records only get processed once. What are some key words? I see high volume, data is being processed after it is in the queue, and that the records are being processed more than once.
Now let’s look at our choices for answers.
The responses are: A: Configure the API to send the records to Amazon Kinesis Data Streams.
B: Configure the API to send the records to Amazon Kinesis Data Firehose
C: Configure the API to send the records to Amazon Simple Notification Service
and D: Configure the API to send the records to Amazon Simple Queue Service FIFO queue.
Now, I’ll give you some time to evaluate your responses.
The key for this question is D, an Amazon SQS FIFO queue. This is the correct answer because the FIFO queue improves upon and complements the standard queue. The most important features of this queue type are first-in-first-out delivery and exactly-once processing. The order in which messages are sent and received is strictly preserved, and a message is delivered once and remains available until a consumer processes and deletes it. Duplicates are never introduced into the queue.
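For reference, producers target exactly-once processing on a FIFO queue by supplying a message group ID and a deduplication ID (or by enabling content-based deduplication on the queue). A hedged sketch with a hypothetical queue URL:

```python
import json
import boto3

sqs = boto3.client("sqs")
FIFO_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/111122223333/sensor-data.fifo"  # hypothetical

sqs.send_message(
    QueueUrl=FIFO_QUEUE_URL,
    MessageBody=json.dumps({"sensor_id": "s-42", "reading": 21.7}),
    MessageGroupId="sensor-s-42",                         # preserves ordering per sensor
    MessageDeduplicationId="s-42-2024-01-01T00:00:00Z",   # duplicates within the window are dropped
)
```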
Let’s go through the responses, starting with A, which is incorrect. Amazon Kinesis Data Streams can cause duplicate records. There are two primary reasons why records may be delivered more than once to your Amazon Kinesis Data Streams application: producer retries and consumer retries.
B is also incorrect; Amazon Kinesis Data Firehose can cause duplicate records. C is also incorrect, since Amazon SNS is a messaging service, not a queuing service. It would not be a viable solution for this situation, where many records might come in at the same time and need to be queued for processing.
How did you answer and understand this question? Remember, when you’re taking your exam, read each question and the answers completely to identify keywords. Then eliminate the distractors, and reread the plausible answers to choose your best answer.
Let’s get started with the second task statement from domain two. Design highly available and/or fault tolerant architectures. Refer to the exam guide, and scan here for a link to download and follow along, and I’ll see you in the next video.
High availability
–
Welcome back, let’s begin with the second task statement from domain two, design highly available and/or fault-tolerant architectures. While the two do share some overlap, they vary slightly in definition. For this exam, you need to know what high availability, fault tolerance, and disaster recovery are.
Let’s begin with high availability, since we’ve mentioned it several times already. High availability is designing your systems to keep running and providing service as often as possible, so that if a system component fails, that failure can be replaced or fixed as quickly as possible. It maximizes the system’s online time. However, high availability does not prevent failures, and it does not mean there will be no downtime or outages. What high availability does is respond when there is a failure, so it can be fixed as soon as possible and the system can be brought back into service.
What is an example of high availability? Well, let’s say we have an application running on a single server inside AWS, and this server is used by employees to complete their jobs. If this server goes down, then the employees cannot work because the server is experiencing an outage. If you design this architecture to be highly available, you could quickly spin up a new virtual machine to fail over to, or you could run two servers for this application, one active and one in standby mode. If one server goes down, you can fail over to the second server to serve your employees. In this situation, the employees may have to log back in, so there could be a bit of downtime, which is okay. It’s not what you want, but with high availability, some downtime can be expected depending on your design. The goal of high availability is to reduce outages and stay operational; fast and automatic recovery is best, but there is usually some downtime, even if it’s very brief.
Now let’s go over fault tolerance. What do you think of when you think of fault tolerance? It is similar to high availability, but fault tolerance is the ability of a system to keep operating in the event of a failure: one or more of the system’s components fails, yet the system is able to continue operating through those faults. A fault tolerant design must continue to operate.
Let’s go back to our earlier scenario. In this case, if we are designing for fault tolerance, then we would have two active servers serving the one application. So if one server goes down, the second server is already active and will continue to serve the employees, and there is no downtime in this situation. Fault tolerant designs are built to continue operating through failure, and these system designs are usually more expensive than high availability designs.
Disaster recovery is a bit different from fault tolerance and high availability; those two are about designing to operate through failures, while disaster recovery is about what we need to plan for and what we need to do in the event of a disaster. Having a disaster recovery plan is crucial because the worst time to recover from a disaster is in the middle of that disaster. Disaster recovery needs pre-planning and defined steps to complete the recovery process, so when a disaster occurs, you are already set with a plan to recover your systems as quickly as possible. Maybe you have a backup site to switch your environment to, or maybe you use AWS as your backup site and have CloudFormation templates ready to go to provision your environment inside AWS after the disaster. It is essential to have backups of your environment stored offsite or in the cloud, but definitely not stored in the same building as your systems, because if that building is damaged, then so are your backups.
You need a plan to protect your data and store backups elsewhere, and it’s best practice to run disaster recovery exercises to practice this process so that in a real disaster, the process goes smoothly. Regardless of the style of architecture, you also have to include in your design the amount of resources needed, including redundancy of your normal operational components such as servers and databases, as well as the components needed to handle data replication, traffic management, failure detection, and everything else necessary for a highly available or fault tolerant architecture.
Ensure you know how and why the AWS global infrastructure adds high availability and fault tolerance. In addition to determining the amount of resources, you also need to evaluate which AWS services you can use to improve the reliability of your architectures. This is especially important with legacy applications where component migrations are not possible. If components or applications cannot be run in AWS or you have reasons why the go-to services for something like data replication don’t fit your use case, you need to know how to meet reliability requirements.
Additionally, ensure you are ready to select highly available configurations to mitigate single points of failure, as well as selecting appropriate disaster recovery strategies to meet your business requirements. When trying to determine single points of failure, it often helps to work backwards from the failure. What I mean by this is when looking at an environment, take each component and assess what happens when that component fails. What happens when one of your web servers fails? What do you do when one of the application servers goes down? How about the database? Evaluate these failures, then determine how you’ll mitigate them, and remember to add the disaster recovery strategies to your design.
To determine the requirements, it’s common to look at disaster recovery objectives: evaluate the recovery time objective and the recovery point objective, or RTO and RPO respectively, and use those to help decide which strategies will work best.
RPO is the maximum acceptable amount of time since the last data recovery point. How often does the data need to be backed up?
RTO is the maximum acceptable delay between the interruption of service and restoration of service. What is the maximum time an application can be unavailable? In terms of strategies, make sure you look at both active/passive and active/active, as they each help meet different business needs.
For active/passive, study backup and restore, pilot light, and warm standby solutions. For active/active, research the multi-site active/active deployment style. These all have their ideal use cases, and they don’t have to be used individually. Make sure you understand their strengths and how they affect RPO and RTO needs.
You will need to have a backup plan to ensure you meet RTO and RPO requirements for your solutions. Amazon S3 can be used as a low-cost and extremely durable object storage solution. For file storage solutions, Amazon EFS and Amazon FSx can be used. Know the availability, encryption options, and limitations of each of these services. Also understand what capabilities exist within the different AWS database services to handle disasters.
For example, the RDS service has a Multi-AZ deployment capability to automatically fail over if the primary database goes down. Know how this works and how long the database access is down. Understand how Amazon Aurora global database works for cross Region failover and how long it takes to fail over. Also know how Amazon DynamoDB global tables work. What backup and recovery options exist for each of these services? Understand the different disaster recovery strategies such as backup and restore, pilot light, warm standby, and active/active failover.
Ensure you know how to improve availability and disaster recovery for legacy applications that are not cloud native. Understand how AWS Elastic Disaster Recovery can be used for both on premises and cloud-based applications. For applications running in Amazon EC2, understand how you can use AMIs or Amazon Machine Images and the EC2 Image Builder as part of your disaster recovery strategy.
To build resilient architectures, we need to minimize or eliminate downtime for our applications. Back to our fundamentals: a solid understanding of basic networking concepts and AWS networking services is foundational to ensuring high availability and/or fault tolerant architectures. Know how routing tables work, along with VPC peering connections, AWS Transit Gateway, AWS Site-to-Site VPN, AWS Direct Connect locations, AWS Direct Connect gateways, and Amazon Route 53 Resolver. Keep in mind the capabilities and limitations of the different services to ensure you meet your requirements.
You need to know which services can be used to automate deployments using services like Elastic Beanstalk, CloudFormation, and OpsWorks. For container deployments, you could use ECS and EKS. And to help ensure the integrity of your resources, you’ll wanna check for vulnerabilities in your infrastructure and code using services like Amazon Inspector and Amazon CodeGuru.
Know what services can be used to design your application to handle a single point of failure for an EC2 instance or a failure of an entire Availability Zone. In this case, we might look at adding an Elastic Load Balancer across multiple Availability Zones with an EC2 Auto Scaling group to ensure we have the appropriate number of instances based on our configuration.
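To picture how those pieces fit together, here is a minimal boto3 sketch, assuming a launch template and an Application Load Balancer target group already exist; every name and ARN is a made-up placeholder.

import boto3

autoscaling = boto3.client("autoscaling")

# Auto Scaling group spread across subnets in two Availability Zones and
# registered with a (hypothetical) ALB target group.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchTemplate={"LaunchTemplateName": "web-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=6,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # subnets in two AZs
    TargetGroupARNs=["arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/abc123"],
    HealthCheckType="ELB",
)

With the health check type set to ELB, instances that fail the load balancer health check are replaced automatically, which is the self-healing behavior this domain keeps coming back to.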
What if you want to ensure an application can automatically fail over to other Regions around the world? Know the various DNS routing capabilities of Amazon Route 53 to support global architectures including failover routing. Again, understand how you can use Global Accelerator to improve application availability and performance.
RDS Proxy, which we mentioned in the last lesson, is a fully managed, highly available database proxy for RDS that makes applications more scalable, more resilient to database failures, and more secure. We just talked about serverless architectures in our last lesson too. Serverless applications can have a large number of open connections to the database server and may open and close database connections at a high rate, exhausting database memory and compute resources. RDS Proxy gives applications the ability to pool and share connections established with the database, improving database efficiency and application scalability. With RDS Proxy, failover times for Aurora and RDS databases are reduced by up to 66%, and database credentials, authentication, and access can be managed through integration with Secrets Manager and IAM.
Part of the AWS Well-Architected Framework is staying current and implementing continual improvement of your designs. It is critical for continuous improvement to track and respond to key metrics for applications and infrastructure in your environment. Know how to use Amazon CloudWatch and AWS X-Ray to observe what is happening in your systems. Know how to initiate automated actions based on key metrics using CloudWatch alarms.
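As a quick illustration of initiating an automated action from a metric, here is a minimal boto3 sketch of a CloudWatch alarm; the Auto Scaling group name and SNS topic ARN are hypothetical placeholders.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when average CPU across a (hypothetical) Auto Scaling group stays
# above 70 percent for two consecutive 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="web-asg-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "web-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # placeholder topic
)

The same alarm could just as easily trigger a scaling policy instead of a notification, which is how the metric-to-action loop described above gets automated.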
Additionally, know how to respond in near real time to changes in your environment using EventBridge. Lately, I’ve heard from a lot of you that Amazon Polly is appearing in the exam again. What is a use case for Amazon Polly and high availability and fault tolerance? Well, how about if your organization deals with large volumes of IT service requests? Traditionally, the burden is put on the requester to choose the correct category for every issue. A manual error or misclassification of a ticket usually means a delay in resolving the IT service request. This can result in reduced productivity, a decrease in customer satisfaction, and an impact to service level agreements, as well as broader operational impacts. As your enterprise grows, the problem of getting the right service request to the right team becomes even more important, especially if there’s an outage.
Using an approach based on machine learning and artificial intelligence can help meet your enterprise’s ever-evolving needs. You can use Amazon Comprehend to classify IT service requests automatically, and add Amazon Polly to add spoken output to your system. Amazon Polly could also be used to create self-service experiences within your Amazon Connect contact center.
When it comes to high availability and fault tolerance architectures, always look at your architecture to identify potential single points of failure. Don’t forget that many AWS managed services are built for a specific purpose, such as Amazon SQS, which has high availability built in. And ensure you understand the AWS global infrastructure, plus Amazon Route 53, Amazon Comprehend, backup strategies, RDS Proxy, and AWS X-Ray.
Let’s get started with our fifth walkthrough question, and I’ll see you in the next video.
Walkthrough question 5
–
Welcome back. Let’s get started with our next question.
The Stem for this item is: a gaming company is experiencing exponential growth. On multiple occasions, customers have been unable to access resources. To keep up with increased demand, management is considering deploying a cloud-based solution. The company is looking for a solution that can match the on-premises resilience of multiple data centers and is robust enough to withstand the increased growth activity. What configuration should a solutions architect implement to deliver the desired results?
Reading this question, can you identify any keywords or phrases, and what the question is asking?
The question is asking which design solution will ensure resiliency, or high availability, and scale to meet the demand. A few keywords that I see are that customers are not able to access resources, the need to keep up with the demand, multiple data centers, and increased growth.
Now let’s look at our choices for answers.
The responses are: A. a VPC configured with an Application Load Balancer targeting an EC2 Auto Scaling group consisting of Amazon EC2 instances in one Availability Zone.
B. Multiple Amazon EC2 instances configured within peered VPCs across two Availability Zones.
C. A VPC configured with a Network Load Balancer targeting an EC2 Auto Scaling group consisting of Amazon EC2 instances spanning two Availability Zones.
and D. A VPC configured with an Application Load Balancer targeting an EC2 Auto Scaling group consisting of Amazon EC2 instances spanning two AWS Regions.
I’ll give you some time now to see if you can identify the key for yourself.
Let’s go through the responses.
Answer A is incorrect. High availability is achieved by deploying application resources across multiple Availability Zones. For high availability, you should host instances in multiple Availability Zones.
Answer B is incorrect. Peered VPCs will permit communication between the two VPCs, but will not address the requirement for high availability of the application.
Answer C is correct. The Network Load Balancer is capable of handling millions of requests per second while maintaining ultra-low latency. Combined with an EC2 Auto Scaling group whose targets span multiple Availability Zones, the Network Load Balancer can handle volatile traffic patterns, and this design is highly available.
Answer D is also incorrect. Application Load Balancer targets must be within a single Region.
How did you answer and understand this question? Remember, when you’re taking your exam, read each question and the answers completely to identify keywords. Then eliminate the distractors, and reread the plausible answers to choose the best answer.
In the next video, we’ll wrap up domain two, then we’ll get started with an introduction to domain three, Design High Performing Architectures, and remember to get some hands-on experience in the second lab, Configure High Availability in Your Amazon VPC. I’ll see you in the next video.
Resilient architectures wrap up
–
Welcome back. Let’s review Domain 2. One of the key benefits of AWS cloud computing is that we can stop guessing about capacity. When using AWS services in an optimized manner, we can automatically scale up and down to match our application needs. Since we’re only paying for the resources we use, we get the added benefit of reducing our costs. We’ll cover cost optimization in depth in Domain 4.
Know the difference between highly available and fault-tolerant workloads along with disaster recovery. An understanding of AWS Global Infrastructure is required in order to design resilient workloads across availability zones and across regions. Which two services can be used to create a self-healing environment that can tolerate a single point of failure of a single EC2 instance or a loss of an entire availability zone? Ensure you understand Elastic Load Balancing and Amazon EC2 Auto Scaling.
Now let’s look at cross-region resilience for the same architecture. A self-healing environment that can tolerate a single point of failure of a single EC2 instance or loss of an entire availability zone. What additional services could you use to ensure high availability? Amazon Route 53 with failover routing or latency-based routing? Perhaps you would look at AWS Global Accelerator. Be sure to understand different types of disaster recovery strategies, such as Backup and Restore, Pilot Light, and Warm Standby.
Here’s another question. What services can you use for monitoring your workloads? Amazon CloudWatch or AWS X-Ray? Be sure you know the AWS services well enough to choose one over another given a particular scenario.
In this Domain, the focus was on ensuring understanding of how to use AWS services to automatically scale up and down to match demand. Additionally, we can use loosely coupled architectures so that different parts of our application can scale independently as needed.
Remember, you need to know the difference between horizontal and vertical scaling and Amazon EC2 Auto Scaling and AWS Auto Scaling.
Ensure you understand the services you can use with AWS Auto Scaling. Many of the AWS-managed services have built-in resilience, such as SQS, Lambda, and Fargate. Know their capabilities and their appropriate use cases.
Ensure you understand serverless technologies and patterns and the differences between stateful and stateless applications. For example, if I have a legacy stateful application that writes to disk, which AWS service can you use to activate the application to scale horizontally? Amazon EFS or Amazon FSx? It is possible to have stateful containers using the notion of volume storage. This is a more tightly coupled architecture where the communication between the container and the storage isn’t service to service, but is achieved through common storage protocols. This allows, for example, mounting and attaching network volumes that are decoupled from the container lifecycle. The integration between ECS and EFS introduces a completely elastic managed file system available to containers. This file system can be seen as serverless storage; it complements containers and gives them the ability to persist state in a very cloud-native way. If the application also stores state information in-memory, is there something we can do to initiate horizontal scaling?
Ensure you know why sticky sessions keep customers on the same instance. By default, an Application Load Balancer routes each request independently to a registered target based on the chosen load-balancing algorithm. However, you can use the sticky session feature to give the load balancer the ability to bind a user session to a specific target.
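For reference, turning on stickiness is a target group attribute change; here is a minimal boto3 sketch with a hypothetical target group ARN and cookie duration.

import boto3

elbv2 = boto3.client("elbv2")

# Bind each user's session to one target using the load balancer cookie.
elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/abc123",
    Attributes=[
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "3600"},
    ],
)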
Definitely know which services you can use to decouple your architectures like SQS and Elastic Load Balancing.
Also know what services or capabilities you can use to offload some of your frequent traffic to data stores such as ElastiCache, DynamoDB Accelerator, and Read Replicas.
Let’s get started with domain three. Refer to the Exam Guide and scan here for a link to download and follow along, and I’ll see you in the next video.
Module 3: Design High-Performing Architectures
–
Welcome back. Let’s get started with Domain 3 which is design high-performing architectures.
There are a few major layers to think about when designing high-performing solutions on AWS. You have the compute layer, the storage layer, the database layer, the networking layer, but you also have to think about your data ingestion and transformation. There are other things too to consider like security and authentication, but for this domain, we will focus on these four layers and also data ingestion and transformation solutions.
The optimal solution for a particular workload varies. Well-architected workloads use multiple solutions and different features to improve performance, and they use a data-driven approach to select the patterns and implementation for your architecture to achieve a cost-effective solution.
Most high-performing architectures also use the AWS services that are specific to the optimization of your architecture’s performance. We mentioned this earlier in the course. It is important to use data obtained through benchmarking or load testing to continually optimize your architecture.
This domain is broken into five task statements: determine high-performing and/or scalable storage solutions; design high-performing and elastic compute solutions; determine high-performing database solutions; determine high-performing and/or scalable network architectures; and determine high-performing data ingestion and transformation solutions.
Over the next several videos in this module, I will address each task statement individually, breaking down the knowledge and skills expected of you to be successful.
Let’s get started with the first task statement covering high-performing and/or scalable storage solutions, and start to evaluate your readiness for the exam in the next video.
High-performing scalable storage solutions
–
Welcome back. Let’s begin with the first task statement from domain three, determine high-performing and/or scalable storage solutions. We’ve already mentioned cloud storage in an earlier lesson. When you’re building solutions on AWS, you are most likely going to need to figure out how and where to store the data you’ll be working with. This can seem like a straightforward task, but there are many different AWS storage services that can potentially meet your needs, and being able to choose the right one takes practice.
In AWS, storage is available in three forms: object, block, and file. The optimal storage solution for a system or a scenario question in your exam will vary based on the kind of access method, patterns of access, required throughput, frequency of access, frequency of updates, and availability and durability requirements.
Picking the right storage solution includes evaluating how a solution scales and how it performs under given use cases. For example, EBS has a totally different scaling mechanism and performance implications than storage solutions like S3 or EFS. Let’s say you’re hosting a web application on an EC2 instance using EBS volumes as the storage for a file system. When the EBS volumes begin to run out of storage capacity, you need to scale the volumes. How do you do this? Well, you can modify the volume type, size, and IOPS capacity. This is not something that happens automatically, but EBS volumes support live configuration changes while in production.
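For example, a live resize of an EBS volume is a single API call; this minimal boto3 sketch assumes a hypothetical gp3 volume ID and target values.

import boto3

ec2 = boto3.client("ec2")

# Grow the volume to 200 GiB and raise provisioned IOPS; EBS applies the
# change while the volume stays attached and in use.
ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",
    Size=200,
    VolumeType="gp3",
    Iops=6000,
)

After the modification completes, you would still extend the file system on the instance to use the new space, so the operation is live but not entirely hands-off.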
Let’s now compare that to using EFS. If you are hosting a file system on EFS and anticipate that you will need to vertically scale the storage at one time or another, how can you scale the storage? Is it a task that you need to carry out similar to EBS? In this case, EFS will scale automatically as you add or remove files from the file system. This depth of service knowledge can be useful if you get questions on the exam asking you to pick a storage solution with the least amount of operational support needed for storage scaling.
You should also be able to determine which storage solution is the best fit based on future storage needs, too. Know the configuration options for key storage services. For example, for object storage, ensure you understand the S3 storage classes, basic API calls, and multi-part uploads. S3 is great for big data storage and backup and recovery, too. Ensure you understand the scalability, availability, durability, security, performance, and cost optimization with S3.
Let’s pause for a quick fundamental service resilience question. What level of resiliency does S3 have? Well, S3 is a globally resilient service that runs in every AWS Region. Your data is stored in a specific Region and can tolerate the failure of an Availability Zone in that Region, and your data can be replicated across Regions, too.
For block storage, understand the different EBS volume types and their performance. Keywords for choosing block storage in a scenario question are DAS, SAN, and persistent storage for EC2. EC2 instances have two types of storage: instance store, which is local storage and is ephemeral, and EBS volumes, which are persistent storage. Also remember that EBS snapshots are a great EBS feature for backups and disaster recovery. EBS snapshots are stored in S3, so that data becomes regionally resilient because S3 can tolerate the failure of an Availability Zone and keep running.
For file storage, understand when you would choose Amazon EFS for your storage. EFS provides a shared file system and is great for large repositories, development environments, media stores, directories, and more. Keywords for EFS are network-based file systems, NFS, hybrid access (so access via VPN or Direct Connect), and Linux instances. There are different configuration options for performance, General Purpose and Max I/O, and also cost optimization lifecycle policies. And just a tip: Storage Gateway is also great for a hybrid storage solution.
Amazon FSx is a managed file service similar to EFS, but FSx for Windows File Server was created for Windows instances, and there is also FSx for Lustre, which is for Linux instances. FSx for Windows File Server is a fully managed Windows file share that supports the SMB protocol and Windows NTFS, so pay attention to those keywords. It also supports Active Directory integration and is highly available and scalable, too. It’s a great solution for hybrid storage.
Let’s go back to the lifecycle policies that we just mentioned. Data generally isn’t static. As you operate your applications over time, the rate of that data accumulation should be considered when making architecture decisions. Let’s say you have an application that needs three terabytes of data storage today, but over the next five years will grow into needing 100 terabytes of data storage. This is something you need to think about when choosing a solution. This also means you should know the general upper bounds for capacity for storage solutions because this will help you choose the right service for your future data storage needs.
Let’s say you have an application that is processing data and this application needs extremely low latency for reading the data. Based on this vague scenario, you should be thinking about which AWS storage services are best fits for low-latency requirements. EBS volumes are extremely low latency, and the performance is configurable. You should be familiar with the different types of EBS volumes and how performant each type is based on your use case and on IOPS. Choosing the correct configurations for EBS volumes can greatly impact performance, so being aware of these configurations when architecting solutions and answering questions will help.
Now diving deeper, as a solutions architect you also need to know the other services that integrate with and improve performance, like S3 Transfer Acceleration and caching with CloudFront. There may be other configurations for storage services that could impact performance, and you should be familiar with the major performance configurations for each service. For example, when designing solutions using Amazon S3, you may be asked to consider how you could improve the performance of data uploads and data retrievals.
For data uploads or transfers, you should be familiar with the basic API calls or the AWS CLI commands for S3. You should also familiarize yourself with using multipart uploads. Then there are features like S3 Transfer Acceleration for more performant uploads, or caching with CloudFront to make data retrieval faster. Knowing these can again help you make decisions and answer exam questions.
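As a concrete example, the high-level transfer utilities in the SDK handle multipart uploads for you once a file crosses a size threshold; this minimal boto3 sketch uses hypothetical file and bucket names.

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# upload_file switches to multipart automatically above the threshold, and
# failed parts can be retried without resending the whole file.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # 100 MB
    multipart_chunksize=16 * 1024 * 1024,
)

s3.upload_file(
    "video-0001.mp4",            # local file (placeholder)
    "example-media-bucket",      # bucket name (placeholder)
    "uploads/video-0001.mp4",    # object key (placeholder)
    Config=config,
)

The AWS CLI's aws s3 cp command behaves the same way, which is why it comes up in the next walkthrough question.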
Let’s get started with our sixth walkthrough question. I’ll see you in the next video.
Walkthrough question 6
–
Welcome back. Let’s walk through another question.
The Stem here is: a solutions architect has a large number of video files to upload to an Amazon S3 bucket. The file sizes are 100 to 500 megabytes. The solutions architect wants to easily resume failed upload attempts. How should the solutions architect perform the uploads in the least amount of time?
Reading this question, can you identify any keywords or phrases and what the question is asking? The question has a few keywords and phrases I’d like to call out. First is the fact that we have files we’re uploading to S3, so you know that it is a service to focus on. Then the Stem provides sizes for the file 100 to 500 megabytes. I generally pay attention when numbers or specifics like this are provided because it likely means that this file size matters. The problem to solve is that the solutions architect needs to easily resume failed upload attempts, and finally the question is asking for the solution that will take the least amount of time.
Now, let’s take a look at the responses.
A, split each file into five megabyte parts. Upload the individual parts normally and use S3 multipart upload to merge the parts into a complete object.
B, using the AWS CLI, copy individual objects into the Amazon S3 bucket with the AWS S3 copy command.
C, from the Amazon S3 console, select the Amazon S3 bucket and upload to the bucket by dragging and dropping the items into it.
D, upload the files with SFTP and the AWS Transfer Family.
Take some time to select the correct answer.
The key is B, using the AWS CLI, copy individual objects into the Amazon S3 bucket with the AWS S3 copy command. In general, when your object size reaches 100 megabytes, you should consider using multipart uploads instead of uploading the object in a single operation. Using multipart upload can improve your throughput, and it can also provide quick recovery from network failures because only the failed parts need to be restarted. The Stem talked about being able to easily resume failed upload attempts, and multipart upload helps us with this. Now, the reason you want to use the command line and the copy command is because the AWS S3 commands automatically perform multipart uploading and downloading based on the file size. This means this option will solve the problem, and it will take very little time because the command already does the multipart upload piece for you.
Now let’s review the incorrect responses. First up is A, split each file into five megabyte parts. Upload the individual parts normally and use S3 multipart upload to merge the parts into a complete object. This is incorrect because multipart upload for S3 is recommended for objects over 100 megabytes. So breaking the objects up into five megabyte parts is not necessary. And though using multipart upload for the five megabyte parts would work, it would not reassemble the object in S3 to the original size, since the object was broken up into five megabyte parts before uploading. This would also take more time than using the copy command. For these reasons, this response is incorrect.
Next up is C, from the Amazon S3 console, select the Amazon S3 bucket, upload the S3 bucket and drag and drop items into the bucket. This is incorrect because uploading files into S3 from the AWS management console does not provide any protection from network problems and does not automatically do multipart uploads. So this does not solve the problem presented in the Stem.
Lastly, there is D, upload the files with SFTP and the AWS transfer family. This option is incorrect because though you could upload files through SFTP, it will not solve the problem for easily restarting file uploads and it also requires more work and time to use this option than it would to use the command line in the copy command.
How did you answer and understand this question? Remember, when you’re taking your exam, read each question and the answers completely to identify those key words. Then eliminate the distractors and reread the plausible answers to choose your best answer.
Let’s get started with the second task statement from domain three, design high performing and elastic compute solutions. Refer to the exam guide and scan here for a link to download and follow along and I’ll see you in the next video.
High-performing elastic compute solutions
–
Welcome back. Let’s get started with the second task statement from domain three, design high performing and elastic compute solutions. We have mentioned scalability in AWS a few times already, but we have also mentioned that just because a solution is hosted on AWS doesn’t mean it is inherently scalable. The scalability and elasticity of your solution depend on your design and on which AWS services you configure, implement, and integrate for your current and future requirements.
Ensure you can select compute resources that meet your requirements, performance needs, and provide cost optimization. In AWS, compute is available in three forms: instances, containers, and functions. We briefly talked about instances in our last lesson.
Amazon EC2 instances are virtual servers, so again, back to our fundamentals, understanding virtualization is important as a solutions architect and in the real world, too. When we launch an EC2 instance, these virtual server instances come in different families and sizes, along with a wide variety of capabilities, including solid-state drives and graphics processing units. So at launch, we get a raw amount of CPU, memory, local storage, type of storage, and so on. And you must understand the performance of each EC2 instance type, because each instance type also has an amount of network bandwidth, resource ratios, and additional features and capabilities.
For containers, again, understand container fundamentals along with the benefits that container computing provides. Amazon Elastic Container Service and Amazon Elastic Kubernetes Service should be your focus here. ECS accepts containers along with instructions on where and how to run those containers. EKS helps us run Kubernetes on AWS. Diving deeper into ECS, there are two launch types, Amazon EC2 and Fargate. Fargate is serverless compute for containers, and Amazon EC2 can be used if you need to control the installation, configuration, and management of your compute environment. Understand how to integrate other services with your containers, such as Application Load Balancers for port mapping.
For functions, understand how to abstract the execution environment from the code you want to run. Lambda accepts a small piece of code written for a supported runtime like Python, Java, Node.js, and so on, and runs that code without you managing an EC2 instance. You are billed for the duration of the execution, functions are invoked when an event occurs, and they have a 15-minute execution limit, so if you need longer than 15 minutes, then Step Functions might be a better fit for your design and requirements. You can also use Lambda along with CloudFront to run your functions globally by deploying them to CloudFront edge locations with Lambda@Edge, which adds more responsiveness and better performance. Let’s dive deeper and discuss how to architect solutions so that your compute and application components can scale independently.
We already covered this in an earlier lesson, but this task statement also touches on integrating services like SQS and Elastic Load Balancing to decouple your workloads. And to scale, we need visibility. So ensure you understand the basics of CloudWatch metrics, alarms, and dashboards. Additionally, look at how metrics can be used as a basis for Amazon EC2 Auto Scaling. Know how to add in automation and alerts and remediation for scaling events based on alarms created in CloudWatch. These alarms are invoked when a metric crosses a certain threshold for a defined amount of time. Knowing how to choose what metrics to monitor for scaling is something you should be very familiar with for this exam.
Also know what types of metrics are not available by default, such as the amount of memory used on EC2 instances. There are other types of metrics you can use for scaling as well, such as a custom metric that you define, or metrics related to Elastic Load Balancing like HealthyHostCount and SurgeQueueLength. Which scaling metrics make the most sense really depends on the situation, and it’s a good idea to explore what metrics are available from the different compute-related services that can be used for scaling.
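As an illustration, publishing a metric that EC2 does not report by default, such as memory utilization, can look like this minimal boto3 sketch; the namespace, dimension, and value are hypothetical.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Push one custom memory data point; an agent on the instance would normally
# publish this on a schedule.
cloudwatch.put_metric_data(
    Namespace="Custom/WebApp",
    MetricData=[{
        "MetricName": "MemoryUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": "web-asg"}],
        "Value": 78.5,
        "Unit": "Percent",
    }],
)

In practice the CloudWatch agent does this publishing for you, but the point is the same: once the metric exists, you can alarm and scale on it like any other metric.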
Remember some services in AWS are elastic and scalable by nature, like Lambda. If you need to invoke your Lambda function once or 100 times per second, you can simply do so. You don’t need to scale the service or design architectures to scale Lambda. Also remember that that is not the case for every AWS service, and that is important for you to know when you make design decisions and choosing the metrics that can be used for scaling actions and identifying potential issues.
Amazon EC2 is a service that is not inherently scalable, though there are ways to make EC2 scalable and elastic. The first thing to consider when beginning to architect solutions using Amazon EC2 is what EC2 type will you use? In order to choose the correct instance type, you first must know what your application needs are.
For the exam, if there are questions around choosing an instance type, it’s likely the stem will provide some context around application needs. So given the application resource needs for compute, storage, and networking requirements, can you choose the appropriate instance family for the EC2 instance? You won’t need to memorize every single instance type, but instead, you should have a firm grasp on how picking the instance family for your workload can impact your application performance and scalability. Visit the Amazon EC2 instance type page to read about all of the different instance families and explore using Amazon EC2 to prepare for this exam.
Expanding outside of EC2 now, you should also be able to choose the appropriate architecture and services that scale to meet performance requirements. This means you should be able to determine based on a use case which AWS service would be the most scalable and performant given defined circumstances. For example, let’s say there’s an organization that needs to host a backend web service on EC2, and their usage of this web service will vary greatly throughout the day. This solution needs to be highly available and elastic. What type of architecture would be a scalable solution for this workload? In this case, you would likely want to research using EC2 with Elastic Load Balancing and Amazon EC2 Auto Scaling.
Knowing how to choose appropriate architectures for scaling compute workloads is crucial to being prepared for the Solutions Architect Associate exam. It’s also important to note that Amazon EC2 is not the only compute option out there. You should be able to answer architecture questions about when you would use EC2, Lambda, or a container service.
You should be very comfortable with picking when to use what compute services and what benefits and limitations each compute service has.
Let’s get started with our seventh walkthrough question, and I’ll see you in the next video.
Walkthrough question 7
–
Welcome back. Let’s get started with another walkthrough question.
The Stem reads, “A company has developed an application that processes photos and videos. When users upload photos and videos, a job processes the files. The job can take up to one hour to process long videos. The company is using Amazon EC2 On-Demand Instances to run web servers and processing jobs. The web layer and the processing layer have instances that run in an Auto Scaling group behind an Application Load Balancer. During peak hours, users report that the application is slow and that the application does not process some requests at all. During evening hours, the systems are idle. What should a solutions architect do so that the application will process all jobs in the most cost-effective manner?”
Reading this question, can you identify any keywords or phrases and what the question is asking? When reading an exam question, you should first try to identify the key points in the Stem, then take those key points and apply them to each response to evaluate which one is the key. Make sure you take the time to read each answer in full. In this case, there is a job that can take up to one hour to process, so the Stem is providing a timeframe for the process in question, which could be important. Then it gives details on the current architecture, with the web servers using On-Demand EC2 Instances and the processing layer running on EC2 Instances in an Auto Scaling group. This is contextual information. Then the problem statement you’re solving for is that during peak hours, the users report that the application is slow or at times does not process the jobs. Finally, when evaluating the answers, you want to see if the answers could solve the problem, but also keep in mind that the question is asking for the most cost-effective answer.
Now, let’s take a look at the responses.
A, use a larger instance size in the Auto Scaling groups of the web layer and the processing layer.
B, use Spot Instances for the Auto Scaling group of the web layer and the processing layer.
C, use an SQS standard queue between the web layer and the processing layer. Use a custom queue metric to scale the Auto Scaling group in the processing layer.
And D, use Lambda functions instead of EC2 Instances and Auto Scaling groups. Increase the service quota so that sufficient concurrent functions can run at the same time.
Now, I’ll give you some time to evaluate the responses yourself.
The key to this Stem is C, use an SQS standard queue between the web layer and the processing layer, and use a custom queue metric to scale the Auto Scaling group in the processing layer. The question sets up a scenario where the processing layer with the Auto Scaling group is not scaling up fast enough to complete the jobs at peak times. This means the scaling configuration needs to be tweaked so that instances can scale in sync with the volume of requests coming in. An Auto Scaling group can scale in response to changes in system load, such as the depth of an SQS queue. So, if you introduce an SQS queue into the architecture, you can use a custom metric to scale the processing instances when the queue reaches a certain depth, which signals that messages are beginning to back up. This solution also solves the issue where, at peak usage, the instances sometimes drop requests and never process the jobs at all. With the introduction of an SQS queue, even if the Auto Scaling group is at maximum capacity, jobs will be saved in the queue and will be processed when compute resources become available. In regard to cost-effectiveness, you’ll see, once we cover the incorrect responses, how this is the most cost-effective solution.
So, now let’s take a look at the other responses.
First is A, use a larger instance size in the Auto Scaling groups of the web layer and the processing layer. A larger instance size would enhance the performance of the processing instances; however, it also increases the cost. This is mostly an issue because the larger instances would incur greater costs than smaller instances while idle. We know with this application that at night the resources are idle, which means the additional costs incurred by this solution disqualify it as the correct response. In addition, a change in the instance size cannot guarantee that every job is processed; that depends on how the application is written and how effectively it uses resources.
Next is B, use spot instances for the Auto Scaling group of the web layer and the processing layer. For this response, it may, at first glance, seem like a good answer. Spot Instances are more cost-effective than On-Demand Instances, so you may be tempted to choose this response based on that alone. Taking a deeper look at this response though, you will see that it does not actually solve the problem set up for you in the question. The application runs slowly and can drop requests when it gets too overwhelmed. Using Spot Instances doesn’t inherently make an architecture more scalable. On top of that, Spot Instances and On-Demand Instances take the same amount of time to boot up and scale out, so swapping out one for another doesn’t really change the problem here we are having. Additionally, there’s no guarantee that Spot Instances are available at all times, so there is a chance that a Spot Instance isn’t available and the application would continue to drop requests. So, this is also incorrect.
Finally, there is D, use AWS Lambda functions instead of EC2 Instances and Auto Scaling groups, and increase the service quota so that sufficient concurrent functions can run at the same time. This is incorrect because you know the processing job can take up to one hour to run, but Lambda functions have a maximum run time of 15 minutes. This disqualifies the response, so it is incorrect.
How did you answer and understand this question? Remember, when you’re taking your exam, read each question and the answers completely to identify key words, then eliminate the distractors, and reread the plausible answers to choose your best answer. If you got this one wrong, I would recommend reading more on decoupling services, SQS, EC2 scaling practices, and Lambda functionality.
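If you want to see roughly what the queue-based scaling in the key could look like, here is a minimal boto3 sketch that pairs a simple scaling policy with an alarm on queue depth; the group name, queue name, and thresholds are all hypothetical.

import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Simple-scaling policy that adds two instances to the processing group.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="processing-asg",
    PolicyName="scale-out-on-queue-depth",
    PolicyType="SimpleScaling",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=2,
    Cooldown=120,
)

# Alarm on the queue backlog; when it breaches, the scaling policy runs.
cloudwatch.put_metric_alarm(
    AlarmName="jobs-queue-backlog",
    Namespace="AWS/SQS",
    MetricName="ApproximateNumberOfMessagesVisible",
    Dimensions=[{"Name": "QueueName", "Value": "jobs-queue"}],
    Statistic="Sum",
    Period=60,
    EvaluationPeriods=2,
    Threshold=100,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)

A target tracking policy on a backlog-per-instance metric is another common way to do this; the sketch above just keeps the moving parts easy to see.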
Let’s get started with the third task statement from domain three, determine high-performing database solutions. Refer to the exam guide and scan here for a link to download and follow along, and I’ll see you in the next video.
High-performing database solutions
–
Welcome back. Let’s get started with the third task statement from Domain 3, determine high-performing database solutions and focus on high performance for databases in our applications. In AWS, you can choose from multiple purpose-built database engines, including relational, key value, document, in-memory, graph, time series, and ledger databases. With AWS databases, you don’t need to worry about database management tasks, such as server provisioning, patching, setup, configuration, backups, or recovery.
AWS continually monitors your clusters to keep your workloads up and running with self-healing storage and automated scaling so that you can focus on higher value application development. The optimal database solution for a system varies based on the requirements for availability, consistency, partition tolerance, latency, durability, scalability, and query capability.
Many systems use different database solutions for various subsystems and turn on different features to improve performance. Selecting the wrong database solution and features for a system can lead to lower performance efficiency. Your workload’s database approach has a significant impact on performance efficiency, and it’s often an area that is chosen according to organizational defaults rather than through a data-driven approach. As with storage, you must consider the access patterns of your workload and whether a different kind of data store could solve the problem more efficiently, such as a graph, time series, or in-memory database.
For the exam, you should have knowledge that goes deeper than identification of which database service to use given a use case. Dive deeper and know how these services work and operate to be able to answer questions that go beyond, do I need a relational database? How do you maximize the performance of your relational database engines? What if a significant portion of your traffic is from read operations on the database? Understand how to configure read replicas for RDS. Also, be sure you understand the different purposes of read replicas compared to multi-AZ deployments.
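To make the read replica point concrete, here is a minimal boto3 sketch with hypothetical instance identifiers; remember that Multi-AZ is about failover for the primary, while a read replica offloads read traffic.

import boto3

rds = boto3.client("rds")

# Create a read replica of a (hypothetical) source instance to serve reads.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="app-db-replica-1",
    SourceDBInstanceIdentifier="app-db",
    DBInstanceClass="db.r6g.large",
    AvailabilityZone="us-east-1b",
)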
Understand how Amazon Aurora clusters and data replication work. Aurora is part of the RDS family, but it does have quite a few differences and improvements over RDS, and the Aurora architecture is very different. Aurora is built around a cluster made up of a single primary instance and zero or more read replicas. This may seem the same as RDS, but Aurora read replicas provide the benefits of read scaling and also the availability benefits of Multi-AZ, so they can be used to improve both availability and read performance on your cluster. Aurora storage is also different from RDS because Aurora does not use local storage on the compute instances. It uses a shared cluster volume, which provides faster provisioning, improved availability, and better performance.
Let’s dive deeper. The optimal database solution for a system varies based on the requirements for availability, consistency, partition tolerance, latency, durability, scalability, and query capability. Amazon has many purpose-built database engines, so understand when you might use each one. Would PostgreSQL be better suited to a scenario question or use case than MySQL? If you’re dealing with an application that needs consistent single-digit millisecond performance at extremely high volumes, does it make more sense to choose DynamoDB or Aurora?
DynamoDB is known for consistent response times in the single-digit milliseconds regardless of load and without tuning effort. DynamoDB’s data is replicated across multiple storage nodes by default, and it is fast and backed by SSDs. DynamoDB also handles backups for us, provides point-in-time recovery and encryption at rest, and can support event-driven integration to take actions when your table changes.
What if you need a relational database solution with a regional failover? Would you choose RDS MySQL or Aurora MySQL? Well, again, back to fundamentals, RDS is a regional resilient service that can handle the loss of one availability zone, but Aurora has an Aurora global database that can span multiple Regions.
High-performing databases also include high-performing compute. Know the various ways you can optimize your database throughput. Should you use caching such as ElastiCache or DynamoDB Accelerator? And if so, why? Again, you need to dive deeper to choose the best service for the requirements.
ElastiCache is a managed in-memory cache for two caching engines: Redis and Memcached.
DAX is an accelerator for DynamoDB that provides microsecond access to cached data and has two caches: an item cache and a query cache.
Perhaps you’re running a Lambda function for your compute that opens many different connections to your RDS instance. Can we use RDS Proxy to manage our connections more efficiently? We talked about RDS Proxy in an earlier lesson. Remember, RDS Proxy maintains a pool of established connections to your RDS instances. This reduces the stress on your compute and memory resources and supports a large number and frequency of connections to help your applications scale without compromising performance.
Ensure you can answer questions on database capacity planning. Know how to choose the correct instance type and size. What about RDS instance auto-scaling? If your workload is unpredictable, can you turn on storage auto-scaling for an RDS database instance? You can, but auto-scaling doesn’t occur if the maximum storage threshold would be equal to or exceeded by the storage increment.
Aurora Serverless is an on-demand, auto-scaling configuration for Aurora that scales compute capacity up and down based on your application’s needs. This provides a relatively simple, cost-effective option for infrequent, intermittent, or unpredictable workloads, because it automatically scales compute capacity to match your application’s usage and shuts down when it’s not in use. Aurora Serverless provides a version of the Aurora database product where you do not provision or manage the database instances, and it works a bit differently architecturally. Aurora Serverless provides the same shared cluster storage, so you get six copies of your data across three Availability Zones, but we do not provision the cluster the same way. With Aurora Serverless, we use ACUs, which are Aurora Capacity Units. Each ACU provides a certain amount of compute and a corresponding amount of memory. For an Aurora Serverless cluster, we choose a minimum ACU and a maximum ACU, and the cluster scales between those values, adding and removing capacity based on load. A really useful cost optimization feature of Aurora Serverless is that the capacity can go down to zero and the cluster can be paused after a period of inactivity. This is a great cost savings because when your cluster is paused, you are only paying for the storage you are using.
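As a rough sketch of that ACU configuration, here is what a Serverless v1-style cluster definition can look like in boto3; the identifiers and credentials are placeholders, and supported engines and versions vary, so treat this as illustrative only.

import boto3

rds = boto3.client("rds")

# Serverless cluster that scales between 2 and 16 ACUs and pauses after
# 30 minutes of inactivity (values are arbitrary examples).
rds.create_db_cluster(
    DBClusterIdentifier="reports-serverless",
    Engine="aurora-mysql",
    EngineMode="serverless",
    MasterUsername="admin",
    MasterUserPassword="REPLACE_ME",  # placeholder; use Secrets Manager in practice
    ScalingConfiguration={
        "MinCapacity": 2,
        "MaxCapacity": 16,
        "AutoPause": True,
        "SecondsUntilAutoPause": 1800,
    },
)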
Let’s get started with our eighth walkthrough question and I’ll see you in the next video.
Walkthrough question 8
–
Welcome back. Let’s get started with another walkthrough question.
The stem reads, “A company is building a distributed application which will send sensor IoT data, including weather conditions and wind speed, from wind turbines to AWS for further processing. As the nature of the data is spiky, the application needs to be able to scale. It is important to store the streaming data in a key-value database and then send it to a centralized data lake, where it can be transformed, analyzed, and combined with diverse organizational datasets to derive meaningful insights and make predictions. Which combination of solutions would accomplish the business need with minimal operational overhead? Select two.”
So reading this question, can you identify any keywords and what exactly the question is asking? A few key phrases stand out for me: data is spiky, needs to be able to scale, and the need for the data to be stored in a key-value database. Then I focus on the fact that we need to send this data to a data lake for analysis. Finally, I see that we need a solution with minimal operational overhead, and we need to select two answers that together meet the business requirement: store the data in a key-value database, then transfer it to a data lake where it can be transformed for insights and predictions.
Let’s look at the responses.
A is to configure Amazon Kinesis to deliver streaming data to an Amazon S3 data lake.
B, use Amazon DocumentDB to store IoT sensor data.
C, write Lambda functions to deliver streaming data to Amazon S3.
And D, use Amazon DynamoDB to store the IoT sensor data and turn on Kinesis Data Streams.
E, use Amazon Kinesis to deliver streaming data to Amazon Redshift and turn on Redshift Spectrum.
Take a minute to review the stem and responses. Are you ready?
The keys are A and D. Let’s discuss these responses and why they are correct. First, knowing that DynamoDB is a key-value database that can scale automatically and handle spiky access patterns is essential to getting this question correct. DynamoDB is a great choice for storing the sensor data, as it scales up easily without overhead and it gives you the ability to store key-value data easily in tables. The second thing to think about is how each entry in the DynamoDB table will be sent to the data lake. This is where Kinesis Data Streams comes in. Kinesis Data Streams for DynamoDB captures item-level changes to your DynamoDB tables in a stream. As soon as the changes are in the stream, you can create solutions that process those changes or replicate them to other data stores, like a data lake, for example. So response D, use Amazon DynamoDB to store the IoT sensor data and turn on Kinesis Data Streams, is also a correct response.
This doesn’t solve the entire problem, though. In our case, we need to find a solution that sends the data to a data lake. That is where answer A comes in, configure Amazon Kinesis to deliver streaming data to an Amazon S3 data lake. You can configure Amazon Kinesis to process or deliver data sent to DynamoDB streams and Kinesis can deliver data to Amazon S3. S3 is a very common service to use to host data lakes and Amazon Kinesis solves the problem and requires the least amount of operational overhead.
Now let’s review the incorrect responses.
B is use Amazon DocumentDB to store IoT sensor data. This is incorrect because Amazon DocumentDB is not a key-value database.
Next is C, write AWS Lambda functions to deliver streaming data to Amazon S3. This is a bit tricky because you could do this and invoke the Lambda function to run when data is added to the DynamoDB stream. This could work. The issue, however, is that it requires more operational overhead than using Kinesis to perform this task because writing a Lambda function would require custom code. So this is incorrect.
Finally there is E, use Amazon Kinesis to deliver streaming data to Amazon Redshift and turn on Redshift Spectrum. This is also incorrect because although you could use Kinesis to deliver the data to Amazon Redshift, Amazon S3 is a better choice for a data lake in this scenario, because in Amazon S3, you can combine data from many different sources for analysis, requiring less operational overhead.
How did you answer and understand this question? Remember, when you’re taking your exam, read each question and the answers completely to identify keywords, then eliminate the distractors and reread the two plausible answers to choose your best answer.
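To tie the two keys together, here is a minimal boto3 sketch of turning on Kinesis Data Streams for a DynamoDB table; the table name and stream ARN are hypothetical, and the delivery from the stream into the S3 data lake would be configured separately.

import boto3

dynamodb = boto3.client("dynamodb")

# Stream item-level changes from the (hypothetical) sensor table into an
# existing Kinesis data stream.
dynamodb.enable_kinesis_streaming_destination(
    TableName="TurbineSensorData",
    StreamArn="arn:aws:kinesis:us-east-1:123456789012:stream/turbine-telemetry",
)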
Let’s get started with the fourth task statement from domain 3, determine high-performing and/or scalable network architectures. Refer to the exam guide and scan here for a link to download and follow along and I’ll see you in the next video.
High-performing scalable network architectures
–
Welcome back. Let’s get started with the fourth task statement from Domain 3, “Determine high-performing and/or scalable network architectures.” In order to be prepared for the Solutions Architect Associate exam, you should be able to select high-performing networking solutions for a workload given a defined set of requirements or circumstances. Now, this doesn’t mean you need to be an absolute networking expert to be an AWS Solutions Architect, but you should know enough about AWS networking services and solutions to make informed decisions and do basic troubleshooting.
Again, ensure you understand networking fundamentals. I've seen a lot of people who do not take the time to learn networking fundamentals, but the network sits between all of the components of your architecture and can have a great impact, both positive and negative, on your performance.
On AWS, networking is virtualized, and it’s available in a number of different types and configurations. This makes it easier to match your networking methods with your needs and requirements for bandwidth, latency, jitter, and throughput. AWS offers product features, for example, enhanced networking, EBS-optimized instances, S3 Transfer Acceleration, and CloudFront to optimize network traffic. AWS also offers networking features, for example, Route 53 latency routing, VPC endpoints, Direct Connect, and Global Accelerator to reduce network distance or jitter.
Another consideration is your location when deploying your network. You can choose to place resources close to where they will be used to reduce distance. Know how to configure and implement your network across Regions, placement groups, and edge services to improve performance.
Ensure you know and understand how to integrate storage services, data transfer methods, and networking options to build solutions that protect your data with durability and security. Again, use network metrics to make changes to networking configurations as the workload evolves, and consider creating your network, and your whole architecture, as infrastructure as code to ensure your cloud-based networks can be quickly rebuilt or modified. Staying current and evolving your network architecture over time is necessary to maintain performance efficiency.
For this exam, ensure you know how to design and build an Amazon VPC and how to secure it. Remember, a VPC is a regionally resilient service, and a custom VPC is private by default until you explicitly configure otherwise. What components are needed for your VPC? I've seen VPCs created in different ways, but they all use the same components. I like the build order of VPC, subnets, route tables, internet gateway, network access control lists, security groups, and then I jump into customizing my VPC. At this point, I add resources, update my security groups, maybe add a NAT gateway if needed, peer to another VPC, and create endpoints.
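As a hedged sketch of that build order with boto3 (the CIDR ranges, Availability Zone, and names below are arbitrary examples, not recommendations), it might look like this:

```python
import boto3

ec2 = boto3.client("ec2")

# VPC first.
vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]

# Then a subnet.
subnet_id = ec2.create_subnet(
    VpcId=vpc_id, CidrBlock="10.0.1.0/24", AvailabilityZone="us-east-1a"
)["Subnet"]["SubnetId"]

# Internet gateway and a route table with a default route out through it.
igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

rtb_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
ec2.create_route(RouteTableId=rtb_id, DestinationCidrBlock="0.0.0.0/0", GatewayId=igw_id)
ec2.associate_route_table(RouteTableId=rtb_id, SubnetId=subnet_id)

# Network ACLs exist by default; a security group finishes the baseline.
sg_id = ec2.create_security_group(
    GroupName="web-sg", Description="Allow HTTPS in", VpcId=vpc_id
)["GroupId"]
ec2.authorize_security_group_ingress(
    GroupId=sg_id, IpProtocol="tcp", FromPort=443, ToPort=443, CidrIp="0.0.0.0/0"
)
```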
As a Solutions Architect, you will most likely see a hybrid model with an Amazon VPC in AWS and an on premises data center that need a way to privately communicate to transfer data and messages across systems. How would you design and secure this architecture?
For the exam, know how to design network architectures, including global architectures across two or more Regions, hybrid architectures connecting your on premises environment to resources in AWS, architectures spanning multiple VPCs, and multi-tier architectures. From our earlier steps to create and customize a VPC, we can move on to creating a VPN connection, a Direct Connect connection, or maybe both. You can privately connect your AWS resources to your on premises data center resources through VPN connections or the use of services like Direct Connect.
Given the volume of data, compliance standards, and performance requirements for a hybrid deployment, you should be able to select the appropriate connectivity option. There can be differences between the use cases for either type of connection, VPN or Direct Connect, and you should familiarize yourself with the performance and throughput capabilities of both VPN connections and Direct Connect connections so that you can properly compare and contrast them when making architecture decisions. On top of AWS managed VPN and Direct Connect, you should also be familiar with services like Transit Gateway, which can be used with either VPN or Direct Connect to connect multiple VPCs to a remote network.
You should know how Transit Gateway works, its use cases, and how it can simplify network peering solutions. Another service you should know about at a high level is AWS VPN CloudHub, which can also help you create a hub-and-spoke model for connecting networks. In addition to connecting a remote network to AWS, you should also know how to create connections between VPCs so that applications in one VPC can send and receive messages or data with an application hosted in another VPC or another AWS account.
We mentioned this earlier, but the focus here would be VPC peering, Transit Gateway, or Direct Connect gateway. Another private connection is endpoint services like PrivateLink and VPC endpoints. If we have an application that needs to communicate between VPCs, and we do not wanna make that application public and expose it to the internet or configure VPC peering because of the additional overhead, we could set up PrivateLink.
Beyond connecting your on premises data center to AWS through private connections, you should also be able to select appropriate features to optimize connectivity to AWS public services. Gateway endpoints are used for specific AWS public services, Amazon S3 and DynamoDB, and sit inside the AWS public zone.
Sometimes we wanna connect to these public services like S3 or DynamoDB from a private instance or subnet that does not have access to the internet and does not have a NAT gateway set up. There are many different ways you can create connections or send requests to public AWS resources, and you should be able to select appropriate features to optimize connectivity to AWS.
For example, suppose you have a website hosted across two different Regions and you wanna send traffic to the Region that is geographically closer to the end user accessing the website. What AWS service or feature would you use to route the traffic to the correct Region? The key to knowing this one is to know how Route 53 works and to know the different routing policies and what use cases they serve. In this example, you would use Route 53 geoproximity routing policies.
Beyond the routing policies, you should have a firm grasp on the functionality of Route 53 and how to develop solutions using it. This would include understanding the record types you can create and why you would need to use Route 53.
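As one hedged illustration of routing policies in code, the sketch below upserts latency-based alias records for the same name in two Regions; geoproximity records follow the same general pattern with a bias instead of a Region. Every hosted zone ID and DNS name here is a placeholder, not a real value.

```python
import boto3

route53 = boto3.client("route53")

# Two latency-based records for the same name; Route 53 answers each query
# with the record for the Region that offers the lowest latency to the caller.
records = [
    ("us-east-1", "web-use1.example-alb.us-east-1.elb.amazonaws.com", "Z_ALB_USE1"),
    ("eu-west-1", "web-euw1.example-alb.eu-west-1.elb.amazonaws.com", "Z_ALB_EUW1"),
]
for region, alb_dns, alb_zone_id in records:  # placeholder DNS names and zone IDs
    route53.change_resource_record_sets(
        HostedZoneId="Z_EXAMPLE_ZONE",  # placeholder hosted zone
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "www.example.com",
                "Type": "A",
                "SetIdentifier": f"www-{region}",
                "Region": region,  # this field makes it a latency routing policy
                "AliasTarget": {
                    "HostedZoneId": alb_zone_id,
                    "DNSName": alb_dns,
                    "EvaluateTargetHealth": True,
                },
            },
        }]},
    )
```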
I would also suggest you learn about Global Accelerator, which is a service that can improve your application’s network performance and may be considered when optimizing AWS solutions for network performance.
Another way you can improve network performance is to cache assets closer to your end users by taking advantage of AWS edge locations with services like CloudFront. You should have in-depth knowledge of the use cases for CloudFront, how it works, and the benefits of using it.
One more topic for network performance is that you should be able to select the appropriate data transfer service for data migration or ingestion. There are multiple data transfer services or ways you can transfer data to AWS that you should be familiar with so you can select the appropriate solutions. There’s AWS DataSync, the AWS Snow Family, the AWS Transfer Family, AWS Database Migration Service, and others you should feel comfortable designing architectures with.
Depending on the amount of data, type of data, and source and destination of the data migration, one service may be more appropriate than other services, so it’s important that you know the functional and performance differences between the data transfer and migration services.
For each of the architecture types, we need to ensure our applications can scale to meet our customer demand. How would you architect resources for a global customer base accessing a web-based application? Would you use CloudFront, Global Accelerator, or serverless solutions to meet scale while minimizing cost? How will you scale while meeting your database and storage needs?
Dive deeper into auto scaling and load balancing. EC2 Auto Scaling uses launch templates (or the older launch configurations) to define what you provision and how you scale. Understand minimum, maximum, and desired capacity. Also know the different scaling policies and when to use them. And of course, understand how to integrate auto scaling with a load balancer for better health checks and more application awareness.
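A minimal, hedged boto3 sketch of those pieces working together might look like the following; the launch template, subnets, and target group ARN are made-up placeholders.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchTemplate={"LaunchTemplateName": "web-template", "Version": "$Latest"},
    MinSize=2,           # floor the group never scales below
    MaxSize=10,          # ceiling for scale-out
    DesiredCapacity=2,   # where the group starts
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # two AZs for resilience
    TargetGroupARNs=[
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/abc123"
    ],
    HealthCheckType="ELB",         # use the load balancer's health checks
    HealthCheckGracePeriod=120,
)
```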
Know the different load balancers and use cases for each. Here’s a tip that may help with an exam question. Know the layer of the OSI model for each load balancer. The OSI model should be included under your network fundamentals. How do you ensure your customer latency requirements are met at scale?
What if you have customers in three or more Regions around the world? Let’s take a moment to walk through another exercise focusing on scalability. Given this architecture, what types of things can we do to improve scalability? If we have a requirement for real time communications, we need to align that requirement with service levels that provide availability from 99.9 to 99.999%. Depending on the degree of high availability that you want, you can ensure that your design scales along with the full life cycle of an application.
AWS recommends following these guidelines to achieve high availability. We’ve mentioned these before in the last module. Design the system to have no single point of failure. Use automated monitoring, failure detection, and failover mechanisms for both stateless and stateful components. To achieve high availability for your design, you must know which services, features, and tools are available to meet your requirements. For example, you could use a load balancer cluster for an active-standby pair. Route 53 provides a global DNS service that can be used as a public or private endpoint, and DNS health checks can be configured to route traffic to healthy endpoints or to independently monitor the health of your application. You can also use a combination of services such as Auto Scaling, lifecycle hooks with CloudWatch events, Route 53, and Lambda functions.
I’ll give you a few minutes to review this.
Okay. What types of things did you come up with? Adding CloudFront? What about ElastiCache? Perhaps we can improve our scaling policies on the Auto Scaling group by choosing smaller instance sizes and scale in more granular increments.
AWS helps to ensure scalability, elasticity, and high availability while meeting the key requirements, but we need to know the AWS services that meet our requirements and how to design, configure, and integrate them.
Okay, time for another challenge. How might you build a global, highly scalable serverless solution? Our goal here is to be able to handle extremely high volumes while still minimizing cost.
Take a few moments to think about how you could accomplish this.
One possible solution is provided in this architecture. Here we have an architecture using S3 website hosting to serve up static content. Route 53 latency and failover routing is used for optimal performance with some built in resilience. DynamoDB global tables are used for low latency, performance, and cross-Region data replication.
Given this architecture, what types of things can we do to improve scalability? I'll give you a few minutes to review this. Okay. What types of things did you come up with? Adding CloudFront would improve performance, reliability, and security, and add a cache. Would Global Accelerator be useful here? Amazon S3 Multi-Region Access Points, which are built on the AWS Global Accelerator network, give you the ability to define S3 endpoints that span buckets in multiple AWS Regions to improve your website's security, reliability, and performance.
Let’s get started with our ninth walkthrough question, and I’ll see you in the next video.
Walkthrough question 9
–
Welcome back. Let’s get started with another walkthrough question.
The question reads, a large international company has a management account in AWS Organizations and over 50 individual accounts for each country they operate in. Each of the country accounts has at least four VPCs set up for functional divisions. There is a high amount of trust across the accounts, and communication among all of the VPCs should be allowed. Each of the individual VPCs throughout the entire global organization will need to access an account and a VPC that provide shared services to all the other accounts. How can the member accounts access the shared services VPC with the least operational overhead?
Reading this question, can you identify any key words and exactly what the question is asking? Let's break it down. Right away we know we are looking at a setup with multiple AWS accounts, over 50 as the stem states. Then as you keep reading, you can see that the VPCs in these accounts need to access resources in a shared services account. Then the question asks how the AWS accounts can access the shared services VPC with the least operational overhead, or essentially, which option will require the least amount of work to operate.
Now, let’s look at the responses, keeping these key points in mind.
A, create an application load balancer, with a target of the private IP address of the shared services VPC. Add a certification authority authorization record for the application load balancer to Route 53. Point all requests for shared services in the VPCs routing tables to that CAA record.
B, create a peering connection between each of the VPCs and the shared services VPC.
C, create a network load balancer across availability zones in the shared services VPC. Create service consumer roles in IAM, and set endpoint connection acceptance to automatically accept. Create consumer endpoints in each division VPC and point to the network load balancer.
D, create a VPN connection between each of the VPCs and the shared service VPC.
Take a few minutes to review the responses. Are you ready?
The key is C, create a network load balancer across availability zones in the shared services VPC. Create a service consumer role in IAM, and set endpoint connection acceptance to automatically accept. Create consumer endpoints in each division VPC and point to the network load balancer. To get the right answer, you must know that setting up an AWS PrivateLink connection is more appropriate than using VPC peering. You would use PrivateLink when you have a client-server setup where you want to allow one or more consumer VPCs unidirectional access to a specific service or set of instances in the service provider VPC. With the scenario set up in the question, there are multiple consumers who need access to the shared services, so this use case fits. The second thing you need to know is how to establish a PrivateLink connection. To do this, you would create a network load balancer, create a service consumer role in IAM, set up an endpoint service in the shared services VPC and set it to auto accept, create consumer endpoints in each VPC trying to access the shared VPC, and then point them to the network load balancer in the shared services VPC. This is how you establish a PrivateLink connection across VPCs.
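A hedged boto3 sketch of that setup follows; the load balancer ARN, account ID, VPC, subnet, and security group IDs are placeholders, and in practice you would repeat the consumer-side call in every division VPC.

```python
import boto3

ec2 = boto3.client("ec2")

# Provider side (shared services VPC): publish the Network Load Balancer as an
# endpoint service and automatically accept connection requests.
service = ec2.create_vpc_endpoint_service_configuration(
    NetworkLoadBalancerArns=[
        "arn:aws:elasticloadbalancing:us-east-1:111111111111:loadbalancer/net/shared/abc"
    ],
    AcceptanceRequired=False,  # "set endpoint connection acceptance to automatically accept"
)["ServiceConfiguration"]

# Allow a member account to create endpoints to the service.
ec2.modify_vpc_endpoint_service_permissions(
    ServiceId=service["ServiceId"],
    AddAllowedPrincipals=["arn:aws:iam::222222222222:root"],
)

# Consumer side (run in each division VPC): an interface endpoint pointing at the service.
ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0abc1234def567890",
    ServiceName=service["ServiceName"],
    SubnetIds=["subnet-aaaa1111", "subnet-bbbb2222"],
    SecurityGroupIds=["sg-0abc1234def567890"],
)
```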
Now let’s review the incorrect responses.
A, create an application load balancer with a target of the private IP address of the shared services VPC. Add a certification authority authorization record for the application load balancer to Route 53. Point all requests for the shared services in the VPCs' routing tables to that CAA record. While this answer seems almost feasible, a CAA record specifies which certificate authorities can issue a certificate for a domain or subdomain, so a CAA record will not actually help in this scenario.
Next is B. Create a peering connection between each of the VPCs and the shared services VPC. This response is the strongest distractor. VPC peering connections have a limitation on the number of peering connections: one VPC can accept up to 125 peering connections, and when you look at the question, you can see there are over 50 accounts with four VPCs per account. This puts the number of needed peering connections over 125, so the shared services VPC could not accept all of those peering connection requests.
Last is D, create a VPN connection between each of the VPCs and the shared service VPC. This is incorrect because while you could create and manage VPN connections between all of these VPCs, you would need to maintain and operate those VPN connections which is more operational overhead than setting up the PrivateLink connection as described in response C.
How did you answer and understand this question? Remember when you’re taking your exam, read each question and the answers completely to identify keywords then eliminate the distractors and reread the plausible answers to choose your best answer. If you did not get this one right you should understand how to establish connectivity between VPCs. You should read how to architect networking strategies for a multi-account setup including a scenario that has a shared services VPC.
Let’s get started with the fifth task statement from domain three, determine high-performing data ingestion and transformation solutions. Refer to the exam guide and scan here for a link to download and follow along and I’ll see you in the next video.
High-performing data ingestion and transformation solutions
–
Welcome back. Let's get started with the fifth task statement from domain three, determine high-performing data ingestion and transformation solutions. AWS data ingestion is the process of getting data, so collecting, curating, and preparing data, from the source system into AWS for storage, data lakes, machine learning, and more.
The movement of your data is important, and we need to ensure you understand the fundamentals and requirements of data ingestion and can choose the appropriate approach to meet the performance, latency, scale, security, and governance needs.
Common ingestion patterns include homogeneous data ingestion patterns, where you move the data to the destination in the same format or same storage engine as the source. Here, your focus is speed of transfer, data protection and preserving integrity, and automating continuous ingestion of that data. This can be done by using one of the many cloud-based ETL services such as Amazon Athena or Amazon EMR.
There is also heterogeneous data ingestion patterns where the data must be transformed as it’s ingested to the destination data storage. The focus here is changing the data type or format to meet the destination requirements or running machine learning to derive new attributes of the data.
For streaming data, Amazon Kinesis and Amazon Managed Streaming for Apache Kafka provide the collection, processing, and analysis of data in real time. Kinesis provides a suite of capabilities to collect, process, and analyze real-time streaming data. Kinesis Data Streams is a service that provides ingestion of streaming data. Producers of data push data directly into the stream, which consists of a group of stored data units called records. The stored data is available for further processing or storage as part of the data pipeline. Ingestion of streaming video can be done using Kinesis Video Streams.
Understand the Kinesis family to know which service to use for a specific scenario or use case. Here are some questions to consider. Why would you choose Kinesis Data Streams instead of Kinesis Data Firehose? Data Streams is the real-time data streaming service in Kinesis with high scalability and durability. It can continuously capture multiple gigabytes of data every second from multiple sources. Data Firehose loads data streams into AWS data stores; it provides the simplest approach for capturing, transforming, and loading data streams into AWS data stores. Which Kinesis solution includes basic data transformation options? The answer is Kinesis Data Analytics.
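To make the producer-side difference concrete, here is a small hedged sketch that writes the same record to a Kinesis data stream (where you build or manage the consumers) and to a Firehose delivery stream (where delivery into a store such as S3 is fully managed); both stream names are hypothetical.

```python
import json
import boto3

kinesis = boto3.client("kinesis")
firehose = boto3.client("firehose")

record = json.dumps({"sensor_id": "s-17", "temp_c": 21.4}).encode()

# Kinesis Data Streams: you decide what consumes the stream (Lambda, KCL apps,
# Firehose, and so on).
kinesis.put_record(StreamName="sensor-stream", Data=record, PartitionKey="s-17")

# Data Firehose: fully managed delivery into a destination configured earlier,
# such as S3, with optional transformation along the way.
firehose.put_record(DeliveryStreamName="sensor-to-s3", Record={"Data": record})
```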
Once the data is extracted, it needs to be transformed and loaded into a data store for feeding into a machine learning model. It may also need to be cataloged and organized so that it's available for consumption.
What services can we use to optimize processing? Would you use Amazon EMR, AWS Glue, or AWS Lake Formation? If your data is stored in S3, you could choose to use EMR and Glue. If you have large jobs, there are best practices to optimize data access from EMR and Glue to S3. Remember that S3 can scale horizontally, and with EMR and Glue, you can process data in a highly distributed way at a massive scale. Depending on your design, you may need to adjust the number of concurrent S3 requests, modify the retry strategy for S3 requests, or adjust the number of S3 objects processed. But what are other ways to optimize your data ingestion and transformation? If you build your data lake on S3 using EMR clusters, you can transform your data assets to Parquet, or use Lambda functions to transform your data in a data lake built on S3. Dive deeper into data ingestion and transformation solutions.
Also ensure you know about data analytics and visualization services, and when to use them, for example, Amazon Athena, AWS Lake Formation, and Amazon QuickSight. Maybe your design requires a data storage and analytics solution that offers more agility and flexibility than traditional data management systems. A data lake is a popular way to store and analyze data because it gives companies the ability to manage multiple data types from a wide variety of sources and stores this data, structured and unstructured, in a centralized repository. This provides quick ingestion, elimination of data duplication and data sprawl, and centralized governance and management.
Here are some more questions to consider. How would you handle data ingestion into your AWS Lake Formation data lake? AWS provides services and capabilities to ingest different types of data into your data lake built on Amazon S3 depending on your use case, but you should definitely understand Kinesis Data Firehose, the Snow Family, Glue, AWS DataSync, the AWS Transfer Family, Storage Gateway, Direct Connect, and the Database Migration Service.
Also ensure you understand which ingestion method to use depending on the frequency at which data is streaming and on changes to that data. What services can we use to transfer data from an on premises environment to the cloud specifically? We just mentioned some: the Snow Family, the Transfer Family, Direct Connect, DMS, and so on.
Dive deeper and know when and how you would use each. For example, when would it be more appropriate to use the Snow Family to transfer data from on premises to the cloud? When might you use Amazon Kinesis to transfer data into AWS?
Let's talk about Glue too. In what ways can you use Glue? This data integration service helps to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning, and application development.
Understand how to manage security for data ingestion in hybrid environments. Know when you would use access policies, encryption with S3 and KMS, tagging, and so on. You can manage access to your S3 resources using access policy options. When you're creating a centralized data lake, you can provide access to various accounts and users with a defined bucket policy. You can also use user policies to link permissions to user roles and permissions for the data processing and analytics services and tools. You can add S3 cross-Region replication, S3 Object Lock, versioning, and more. User policies in IAM control who can review and access data in your data lake built on S3. It's also important to ensure that users who might inadvertently or maliciously gain access to those data assets can't review and use them. This is accomplished by using encryption keys to encrypt and decrypt data assets. S3 supports multiple encryption options. KMS helps scale and simplify management of encryption keys, and for even greater levels of protection, you can use API Gateway, Cognito, and, for compliance requirements for PII data, AWS CloudHSM.
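As a hedged sketch of two of those controls, the snippet below turns on default SSE-KMS encryption for a data lake bucket and applies a bucket policy that grants read access to one other account while denying unencrypted transport; the bucket name, KMS key ARN, and account IDs are placeholders.

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "example-data-lake-bucket"  # placeholder bucket name

# Default encryption: new objects are encrypted with a customer-managed KMS key.
s3.put_bucket_encryption(
    Bucket=bucket,
    ServerSideEncryptionConfiguration={"Rules": [{
        "ApplyServerSideEncryptionByDefault": {
            "SSEAlgorithm": "aws:kms",
            "KMSMasterKeyID": "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID",
        }
    }]},
)

# Bucket policy: cross-account read access plus a deny on unencrypted transport.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAnalyticsAccountRead",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        },
        {
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        },
    ],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```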
In the next video, we will wrap up domain three. Then we'll get started with an introduction to domain four, design cost-optimized architectures. And remember to get some hands-on experience in the lab: configure an Amazon CloudFront distribution with an Amazon S3 origin. I'll see you in the next video.
High-performing architectures wrap up
–
Welcome back. Let’s summarize this domain by taking some time to zoom out a little bit and think about how to design high performing architectures when working as an AWS solutions architect. And let’s review the types of topics you should be reviewing for the exam.
For this domain, be sure to understand how to architect high-performing storage, compute, database, network, data ingestion, and data transformation solutions. Additionally, it is important to think about scalability for these too.
Again, additionally, make sure you understand the difference between object storage, block storage, and file storage. Be sure to know services such as S3, EFS, FSx, and EBS. Know how you can scale with different compute services including EC2, Batch, EMR, Fargate, and Lambda.
Also, be familiar with the capabilities of SQS and SNS. For example, given an architecture with an SQS queue being processed by a Lambda function, how does error handling work at scale? What types of hard or soft quotas will impact your Lambda function?
And finally, understand other serverless technologies including API Gateway, Step Functions, EventBridge, and AWS AppSync. For example, with API Gateway, you can configure throttling and quotas for your APIs to prevent them from being overwhelmed by too many requests, which can improve the overall performance. What about AppSync? Well, it provides a robust, scalable GraphQL interface for application developers to combine data from multiple sources, including DynamoDB, Lambda, and HTTP APIs. So again, know your fundamentals. What is GraphQL? What are the features of AppSync?
Continuing on, know the capabilities and limitations of RDS, Aurora, DynamoDB, ElastiCache, and DynamoDB Accelerator, as well as Redshift. Which services support read replicas? How do read replicas work for each service that supports them? Which database services provide cross-Region support? How are global databases implemented? How is auto scaling implemented for the different database services? Understand the different caching strategies, such as lazy loading and write-through. How is capacity planning handled for the database solution you selected?
For example, let's say you're working on a new application that is using DynamoDB to achieve extremely low latency at scale. It is a relatively new application, and the customer's access patterns aren't clearly known yet. Would you choose on-demand or provisioned capacity for the initial launch?
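One way that decision shows up in code: on-demand capacity is a single setting at table creation, and the table can be switched to provisioned mode later once traffic is understood. A hedged sketch with a hypothetical table name:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# With unknown access patterns, on-demand mode avoids guessing read and write
# capacity units up front; you pay per request instead.
dynamodb.create_table(
    TableName="NewAppTable",  # hypothetical table
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",  # on-demand capacity mode
)
```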
Let's look at another scenario. Your team has migrated a legacy application into AWS and is using RDS MySQL for the data store. The initial migration was to an instance of comparable size to what was used in the on premises environment. It appears that much of the traffic is read only for the workload. Would you choose to set up read replicas? Or use a caching solution like ElastiCache?
Be sure to know which solutions can help optimize network traffic with services, such as CloudFront, Global Accelerator, and VPC endpoints.
You also need to know the basics of setting up VPCs from scratch, including subnets, routing, internet gateways, peering connections, and transit gateways. Also understand the configuration of hybrid environments with site to site VPNs and Direct Connect.
Finally, you need to know the different data ingestion patterns and the different data transfer services, such as DataSync and Storage Gateway. Also know the options within the AWS Transfer Family and the basics of how they work.
For massive data transfers, be aware of the Snow Family of products and the use cases for which they are designed.
Understand the different streaming options with Kinesis for data visualization.
Know the basics of Athena, Lake Formation, and QuickSight. Also know the capabilities of Glue for data transformation. For data processing, understand how you can use EMR.
Let’s get started with domain four. Refer to the exam guide and scan here for a link to download and follow along and I’ll see you in the next video.
Module 4: Design Cost-Optimized Architectures
Welcome back. Let’s get started with Domain 4 which is design cost-optimized architectures. In this domain, we will discuss cost-optimized solutions for storage, compute, databases, and networks.
Again, going back to our fundamentals, what is cost optimization? It is the ability to deliver business value at the lowest price point. For this pillar, it will help to understand the cost-optimization pillar of the AWS Well-Architected Framework along with the design principles and best practices.
This domain is broken into four task statements. Design cost-optimized storage solutions, design cost-optimized compute solutions, design cost-optimized database solutions, and design cost-optimized network architectures.
Over the next several videos in this module, I will address each of the task statements individually breaking down the knowledge and skills expected of you to be successful. Let’s get started with the first task statement covering design cost-optimized storage solutions, and start to evaluate your readiness for the exam in the next video.
Cost-optimized storage solutions
–
Welcome back. Let’s get started with the first task statement for Domain 4 which is to design cost optimized storage solutions. In an earlier lesson, we talked about storage solutions. This task statement focuses on cost-effective storage solutions.
So first, you must understand the different storage solutions that exist and know which one to use in a given scenario or use case. For example, if you have a scenario question that needs a file system in place for an application, you wouldn’t select S3 as the solution since S3 is object storage and not file storage. You will need to take your knowledge a step further to identify cost optimized solutions and know how to size your storage appropriately.
For example, one way to optimize for cost when using EBS volumes is to ensure you are right sizing your volumes, meaning you're not over-provisioning storage capacity. AWS storage solutions can meet different technical and price requirements from a cost optimization perspective. The lowest-cost compute-related storage option is locally attached ephemeral storage (instance store). This is the storage included in the run rate for EC2 instances. Because it is ephemeral, it isn't suitable for long-term persistent data storage.
For more persistent storage, EBS enables EC2 instances the ability to write to block level devices that can be migrated to different instances. It also provides snapshot functionality to back up and migrate these volumes.
Each option has advantages and disadvantages in terms of performance and cost. Ephemeral storage is included in the EC2 instance hourly rate. Amazon EBS comes at additional cost. You should consider these advantages and disadvantages for each workload during your initial assessment and then again after the workload has been running for a time. That way, you can see if you're using the resources you allocated and make adjustments if necessary.
For example, among the EBS volume types, you could switch from Provisioned IOPS volumes to General Purpose (gp2 or gp3) volumes if you find that your workload runs efficiently on a bursting level of IOPS. AWS makes volume information available to customers through Trusted Advisor, which will highlight unattached EBS volumes and other important deployment information. In this case, with a proper archive strategy, deleting unattached EBS volumes will minimize storage cost.
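A hedged sketch of both ideas in boto3 is below: list unattached volumes that are candidates for cleanup, and change an over-provisioned Provisioned IOPS volume to a General Purpose type. The volume ID is a placeholder, and in practice you would snapshot before deleting anything.

```python
import boto3

ec2 = boto3.client("ec2")

# Unattached volumes show up with a status of "available".
unattached = ec2.describe_volumes(
    Filters=[{"Name": "status", "Values": ["available"]}]
)["Volumes"]
for volume in unattached:
    print("Candidate for snapshot-then-delete:", volume["VolumeId"], volume["Size"], "GiB")

# Move an io1/io2 volume to a cheaper General Purpose type if burst IOPS are enough.
ec2.modify_volume(VolumeId="vol-0abc1234def567890", VolumeType="gp3")  # placeholder ID
```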
S3 is a low-cost storage option that is accessible from anywhere in the world with an internet connection. It provides 11 nines of durability, no limits on the amount of storage that can be used, and mechanisms for customers to delete data that is no longer needed or archive it to the S3 Glacier storage classes using S3 Lifecycle policies. S3 also provides customers with different storage classes with different levels of durability, availability, and pricing.
Let’s dive a bit deeper and talk about adding visibility to your environment with monitoring. For cost optimization, monitoring your cost and usage will ensure you analyze and attribute your expenditure. Know the AWS services, features, and tools you can use to accurately visualize and identify the usage and cost of systems to optimize and reduce your cost and that will help measure overall efficiency.
You’ll need to implement a good monitoring and cost management solution to understand your resource needs, how costs are allocated, and to measure and monitor your architecture. Your architecture will most likely be changing, so you need measures in place to monitor and measure that usage and your cost. You can monitor your utilization of CPU, RAM, storage, and so on to identify instances that could be downsized or may need to be increased.
AWS provides CloudWatch to track your metrics and to turn on alarms to immediately take action as needed. There are other services too, such as Trusted Advisor, the AWS Well-Architected Tool, and Cost Explorer, but it is crucial to define your metrics, set target goals, define and enforce your tagging strategy, use cost allocation tags, and make sure you review regularly for any infrastructure changes.
This will require a good understanding of cost management tools. For example, how do you use cost allocation tags? One way is to use tags to filter views in Cost Explorer to help analyze cost. Also know the capabilities of the AWS Cost Management tools. For example, if you wanted high-level interactive financial reports, would you use Cost Explorer or the cost and usage report? Cost Explorer gives you a high-level view and you can then drill down for more specifics. Cost and usage reports break down costs by the hour, day, month, product, resource tags, and so on. You can also create billing alarms, free tier alarms, and alarms with AWS Budgets.
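For a feel of how cost allocation tags and Cost Explorer fit together programmatically, here is a hedged sketch that pulls monthly cost for a hypothetical Project=data-lake tag, grouped by service (the tag key and values are assumptions):

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-04-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    Filter={"Tags": {"Key": "Project", "Values": ["data-lake"]}},  # cost allocation tag
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)
for period in response["ResultsByTime"]:
    print(period["TimePeriod"]["Start"],
          [(g["Keys"][0], g["Metrics"]["UnblendedCost"]["Amount"]) for g in period["Groups"]])
```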
Know different ways you can initiate automated actions based on defined budget thresholds. Understand Cost Explorer and Cost Management. AWS Organizations and Control Tower help to centrally manage billing, control access, compliance, security, and share resources too. Ensure you know the benefits of consolidated billing.
Be sure you know how to use AWS Auto Scaling and Amazon EC2 Auto Scaling to help you minimize cost by ensuring you only have the appropriate number of instances to match your customer demand.
We just mentioned that Trusted Advisor can give recommendations when it finds underutilized EBS volumes in your account. Having monitoring in place to establish a baseline of your usage is a big part of being able to right size your volumes. Ensure you continue to track the correct metrics to determine if you need to modify your solution for cost optimization. You can also optimize the cost for EBS by paying attention to how many IOPS you’re using and then comparing that to how many IOPS you are paying for. If you’re paying for the most optimized, most performant volume type, but are not using that capacity, then you may be able to change your volume type to save money without taking a performance hit.
Know how you can use data lifecycle rules to automatically delete data that is no longer needed. Be prepared to choose the most cost-effective storage when provided a set of requirements. Check out Amazon Data Lifecycle Manager and AWS Backup to learn more about solutions for automatically deleting old EBS snapshots or backups from other AWS services, as this could help reduce the storage costs within an AWS account.
For S3, you can optimize for cost by making use of the different storage classes when appropriate. You should be familiar with the different storage classes and how they impact both cost and retrieval times. You don’t need to know the exact pricing for the S3 storage tiers, but you should know the tradeoffs that are made between storage costs and retrieval costs for each tier and how that can impact your AWS bill given a scenario for data storage and access patterns. S3 has a wide variety of storage classes that you can use to minimize cost based on your use case.
Which S3 capability gives you the ability to automatically transition all objects in a bucket based on a period of time? S3 Lifecycle configuration or S3 Intelligent-Tiering? Lifecycle policies can help with this. Which S3 capability will automatically move each individual object to the most cost-effective access tier when access patterns change? Intelligent-Tiering is a great storage class for this requirement.
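A hedged sketch of a lifecycle rule follows, using an assumed bucket name and arbitrary day counts: transition to Standard-IA after 30 days, to a Glacier storage class after 90, and expire after a year.

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive-bucket",   # placeholder bucket
    LifecycleConfiguration={"Rules": [{
        "ID": "archive-then-expire",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},      # apply to every object in the bucket
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": 365},
    }]},
)
```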
Dive deeper into S3. You will most likely get a few storage cost optimization questions where S3 is the answer or a plausible distractor. Is it possible to configure an S3 bucket to have the requester pay the cost of the requests and the data download instead of the S3 bucket owner? Yes, it is, if you enable Requester Pays on the bucket; the requester then pays for requests and data transfer, while the bucket owner continues to pay for storage.
You also need to know the different options for optimizing the data migration cost for hybrid environments like DataSync, Snow Family, AWS Transfer Family, and Storage Gateway. If your organization decides to move 250 terabytes of archived data from their internal servers to S3, what is the fastest and most cost-effective way to import the data to S3? AWS Snowmobile, establish a Direct Connect connection to transfer the data to S3, order multiple Snowball devices, or upload it directly using their current dedicated line? The Snowball is usually a strong choice for data transfer especially if you need that transfer to be more secure and quickly transfer terabytes to petabytes of data to AWS.
Architecting for data transfer ensures that you minimize data transfer cost. This may involve using content delivery networks such as CloudFront to cache data closer to users or using dedicated network links from your on premises to AWS with Direct Connect.
Let’s get started with our tenth walkthrough question and I’ll see you in the next video.
Walkthrough question 10
–
Welcome back. Let’s get started with another walkthrough question.
The Stem reads, a data processing facility wants to move a group of Microsoft Windows servers to the AWS Cloud. These servers require access to a shared file system that can integrate with the facility’s existing Active Directory infrastructure for file and folder permissions. The solution needs to provide seamless support for shared files with AWS and on premises servers and make the environment highly available. The chosen solution should provide added security by supporting encryption at rest and in transit. Which storage solution would meet these requirements?
Reading this question, can you identify any keywords or phrases and what the question is asking? Okay, the first key phrase I see here is Microsoft Windows servers. So right away, I start thinking of storage solutions that work with Windows. Next, is shared file system and my first thought here is EFS, but remember, EFS is only for Linux. Then we see integration with the existing Active Directory and file and folder permissions. Next, we need high availability and shared files between on-premises and AWS. And finally, encryption at rest and in transit. So we talked about our initial thoughts.
Now let’s look at the possible answers.
A, an AWS S3 file gateway joined to the existing Active Directory domain.
B, an Amazon FSx for Windows File Server file system joined to the existing Active Directory Domain.
C, an Amazon Elastic File System file system, joined to an AWS managed Active Directory domain and D, an Amazon S3 bucket mounted on Amazon EC2 instances in multiple availability zones running Windows servers.
Let’s take some time to figure it out.
The key is Answer B. Amazon FSx provides a fully managed native Microsoft Windows file system, so you can easily move your Windows based applications that require file storage to AWS. With FSx, there are no upfront hardware or software costs and you only pay for the resources used, with no minimum commitments, set up costs or additional fees.
Now let’s look at each response, to see why answer B was the best choice.
Answer A is incorrect. Amazon S3 File Gateway supports a file interface into Amazon S3 and combines a service and a virtual software appliance. By using this combination, you can store and retrieve objects in S3 using industry-standard file protocols, such as Network File System (NFS) and Server Message Block (SMB). File Gateway can authenticate with Active Directory on the on-premises side, but not once the objects are stored in S3.
Answer C is incorrect. Amazon Elastic File System is a scalable elastic file system for Linux-based workloads. It is not supported for the Windows-based instances and we mentioned that in the last slide with our earlier thoughts.
Answer D is incorrect. S3 is object level storage. Although S3 does not provide native support for network file systems, it can be used as backup for a shared file system. However, it is not supported to mount S3 buckets to EC2 instances.
How did you answer this question? Remember when you’re taking your exam, read each question and the answers completely to identify key words. Then eliminate the distractors and reread the plausible answers to choose your best answer.
Let's get started with the second task statement from domain four, design cost-optimized compute solutions. Refer to the exam guide and scan here for a link to download and follow along and I'll see you in the next video.
Cost-optimized compute solutions
–
Welcome back. Let’s get started with the second task statement from domain four, which is design cost-optimized compute solutions. For compute cost optimization, using the appropriate instances and resources for your workload is key to cost savings. A well-architected workload uses the most effective resources, for example, rather than maintaining servers, you can use managed services to reduce costs.
For this certification exam, know that AWS offers a variety of flexible and cost-effective pricing options for Amazon EC2 and other services to meet your needs and requirements, for example, On-Demand Instances, Savings Plans, Reserved Instances, and Spot Instances. Dive deeper into Spot Instances and understand that they are great for systems that can tolerate using a fleet of servers where individual servers can come and go dynamically, such as stateless web servers, batch processing, or HPC and big data workloads.
Appropriate service selection can also reduce usage and costs, such as using CloudFront to minimize data transfer costs, or using Aurora and RDS to eliminate expensive database licensing costs.
Solutions architects need to know how to evaluate costs when selecting compute services, and this includes the resource size or number, pricing models, and data transfer costs.
Let's take a moment and talk about the cost optimization pillar. We mentioned right-sizing in the last lesson, and it should be your first step in cost optimization best practices. For this lesson, it would be right sizing your EC2 instances; in the last lesson covering storage, it was right sizing your storage volumes. By right sizing and choosing the correct instance type, you select the cheapest instance type that meets performance requirements and can save money. Be sure you know how to choose the right instance family for a workload. Do you need a solution that is more compute optimized for some type of batch processing? Or perhaps you need a memory optimized instance for processing large data sets in memory. Know the different instance families and their general usage patterns to choose the best instance for the cost and requirements.
What about hybrid compute options, for example, Outposts and Snowball Edge? How do you ensure you're right sizing this compute service? Well, AWS provides cost calculations and examples. Hybrid solutions include multiple layers and components, and you have to know what is included to optimize the cost. For example, when you use AWS managed services on Outposts, you are charged only for the services based on usage by instance hour, and that excludes underlying EC2 instance and EBS storage charges. Edge computing combines geography and networking to bring computing closer to the end users of the application, for example, using a content delivery network such as CloudFront. Another example might be an Internet of Things solution that helps a company run business logic in remote areas or with low latency. The big cost savings here is not running and managing a data center and not having servers to manage.
The second cost optimization pillar is to increase elasticity, only using resources when those resources are needed and not using them when they are not needed. This gives us a pay-for-what-we-use model. An example of this is using more smaller instances instead of a few larger instances for your workload to reduce your cost. You can also use auto scaling to scale these instances down when they're not being used, or during non-business hours, to save money; AWS provides a solution for this as a CloudFormation template, and it's managed by tags. So here's another reason to add a tagging strategy to your environment. Ensure you know how to select the appropriate instance size for your workload. Again, know the difference between horizontal and vertical scaling, and know which services to choose to meet your project requirements in the most cost-effective manner.
The third pillar for cost optimization is choosing the right pricing model. Ensure you understand the different pricing options for Amazon EC2: On-Demand, Savings Plans, Reserved Instances, Spot Instances, Dedicated Hosts, Dedicated Instances, Scheduled Instances, Capacity Reservations, and so on. For example, which option would you choose if you had a batch processing job that was designed to handle interruptions and you wanted to minimize your cost? Spot Instances stand out here. What option would you choose if you couldn't handle interruptions, but you wanted to save as much as possible and wanted the flexibility of using EC2, Fargate, or Lambda for your compute needs? Understand Savings Plans.
The fourth pillar of cost optimization is to match your storage to your usage, and we covered this in the last lesson, but for compute optimization specifically, right-size your storage for your compute environment. Know the different use cases of why and what storage to use to match your workloads.
The fifth pillar of cost optimization is continual improvement, with measuring and monitoring to optimize your architecture. We mentioned this in the last lesson too: your architecture will most likely be changing, so you need measures in place to monitor and measure your usage and cost. Monitor the utilization of CPU, RAM, storage, and so on to identify instances that could be downsized or may need to be increased. AWS provides CloudWatch to track our metrics and set alarms so we can immediately take action, and other services like Cost Explorer and so on to help define your metrics, set target goals, define and enforce your tagging strategy, use cost allocation tags, and make sure you regularly review any infrastructure changes.
For this task statement, right-sizing your instances can be combined with an elastic load balancer to horizontally scale to meet your demand. Ensure you understand the different characteristics of load balancers and when you might choose an Application Load Balancer, Network Load Balancer, or Gateway Load Balancer, for use cases and cost optimization too. After attaching a load balancer to your auto scaling group, you can configure your auto scaling group to use Elastic Load Balancing metrics, such as the Application Load Balancer metric ALBRequestCountPerTarget, to scale the number of instances in the group as demand fluctuates. You can also add Elastic Load Balancing health checks to your auto scaling group, so that Amazon EC2 Auto Scaling can identify and replace unhealthy instances based on these additional health checks. Otherwise, you can create a CloudWatch alarm that notifies you if the healthy host count of the target group is lower than permitted.
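As a hedged sketch of that metric-driven scaling, the policy below keeps roughly 500 requests per target; the Auto Scaling group name and the resource label (ALB and target group ARN suffixes) are placeholders.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="alb-requests-per-target",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ALBRequestCountPerTarget",
            # Format: <load-balancer-arn-suffix>/<target-group-arn-suffix> (placeholders).
            "ResourceLabel": "app/web-alb/1234567890abcdef/targetgroup/web-tg/fedcba0987654321",
        },
        "TargetValue": 500.0,  # Auto Scaling adds or removes instances around this value
    },
)
```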
Let’s get started with our eleventh walkthrough question and I’ll see you in the next video.
Walkthrough question 11
–
Welcome back. Let’s get started with another walkthrough question.
The Stem reads, a startup company is looking for a solution to cost-effectively run and access microservices without the operational overhead of managing infrastructure. The solution needs to be able to scale quickly and accommodate rapid changes in the volume of requests and protect against common DDoS attacks. What is the most cost-effective solution that meets requirements?
Reading this question, can you identify any keywords and phrases and do you know exactly what this question is asking? The first key phrase I identified is running microservices and specifically is looking to run microservices without the overhead of managing infrastructure. It also states that the solution must scale quickly and protect against DDoS. One last call out for this question is that the company is looking for the most cost-effective solution to meet these requirements.
Now it’s time for the responses.
A is run the microservices in containers using AWS Elastic Beanstalk.
B, run the microservices in AWS Lambda behind an Amazon API Gateway.
C, run the microservices on Amazon EC2 instances in an Auto Scaling group.
And D, run the microservices in containers using Amazon Elastic Container Service backed by EC2 instances.
Take a minute to review the Stem and the responses.
The key for this one is B, run the microservices in AWS Lambda behind an Amazon API Gateway. Let's break down why this is the correct answer. First of all, Lambda is a great solution for hosting backend microservices without the operational overhead of managing infrastructure, as Lambda is a serverless service, so this solution meets that need. Second, API Gateway and Lambda can scale up to meet rapid changes in the volume of requests without you needing to manage that scaling. The scaling mechanisms are built in and are very fast, so this meets that need as well. Third, we need to think about DDoS protection. Both services, API Gateway and Lambda, are managed services that have massive capabilities for scaling, so it's very hard to overwhelm these services. On top of that, API Gateway gives you the ability to throttle each method you create. This makes it so that your backend won't be flooded with requests; they would instead be dropped at the API Gateway level. Also, API Gateway automatically helps protect your backend systems from DDoS attacks, whether attacked with counterfeit requests at Layer 7 or SYN floods at the lower layers of the OSI model. Now finally, the fourth point here is that this is the most cost-effective solution. Since both API Gateway and Lambda are serverless services, you do not pay for idling resources. Given the requirements and the rapid changes up and down in volumes of requests, you can be sure that you're only paying for what you use at any given time, and you're not paying for idling resources like you may do at times with services like Amazon EC2.
All right, now let’s review the incorrect responses.
A, run the microservices in containers using Elastic Beanstalk. This answer is incorrect for a few reasons, one being the rapid scaling requirement. Elastic Beanstalk makes it really easy to host applications on EC2 instances inside of an Auto Scaling group fronted by an Elastic Load Balancer. This solution is scalable, but it can take a few minutes for the scaling to happen because Amazon EC2 instances need to boot up when a scaling event occurs, so this may not be fast enough for the requirements. The next reason why this solution is incorrect is that it is not the most cost-effective solution. Since this is an EC2-based service, resources may sit idle while volume is low, but would still incur costs. This means it is a less cost-effective solution than Lambda.
Next is C, run the microservices on Amazon EC2 instances in an Auto Scaling group. This one is incorrect for the same reasons as Elastic Beanstalk. Elastic Beanstalk makes it easier to set up, deploy, and manage a solution with EC2 and Auto Scaling groups, but it does not change the rate of scaling or the cost effectiveness, so this answer is incorrect.
Finally, there is D, run the microservices in containers using Amazon ECS backed by EC2 instances. This is also wrong for the same reasons as the last two answers. Because the microservices would be in containers, the scaling for an individual service would be very fast. If you have to scale the cluster, which you likely would for large changes in volumes of requests, then the cluster will take minutes to scale. This is not fast enough for the requirements. This solution also has the same potential for idling resources since it is Amazon EC2-based, which means it’s not as cost-effective as API Gateway and Lambda would be.
How did you answer and understand this question? Remember when you’re taking your exam, read each question and the answers completely to identify keywords, then eliminate the distractors and reread the plausible answers to choose your best answer. If you got this one incorrect, I would recommend reading about the scaling mechanisms, DDoS protection mechanisms, and pricing for all services involved. Once again, remember, this is an assessment course, so it’s supposed to help you find gaps in your knowledge for you to do some further research before taking the exam.
Let’s get started with the third task statement from Domain 4, design cost-optimized database solutions. Refer to the exam guide and scan here for a link to download and follow along, and I’ll see you in the next video.
Cost-optimized database solutions
–
Welcome back. Let's get started with the third task statement from domain four, design cost-optimized database solutions. As a solutions architect, you also need to know strategies to reduce your database costs and improve your availability. We mentioned this in the first lesson for this domain, but I want to repeat it here too.
Know the AWS storage services. You also need to know the type of data and what data store is needed. Back to your fundamentals, know the types of data stores.
Relational database management systems are at the core of most transaction processing systems, but you do not want to store all your data in a relational database because it can cause performance issues and increase your total cost of ownership. So, how do you choose the right data store for your data? Look at the access patterns, the expected scale and growth, and how often your data will be accessed. Some cost optimizations you can try when migrating data to AWS would be moving a subset of the relational database to EC2 or RDS, moving large objects from the relational database to S3, or moving a subset of the relational database to a NoSQL data store, such as DynamoDB.
Dive deeper into other considerations, such as: does the processing and storing of this data use database-specific features? Are table and entity definitions in your database schema going to remain fixed as your application evolves? Would you like to enforce constraints across different tables in your data model while storing the data? Let's step back to the third cost optimization migration option we just mentioned. What if we can move a subset of data to a NoSQL data store such as DynamoDB? Well, this is a great option because DynamoDB gives you the ability to offload the administrative burdens of operating and scaling a distributed database, so you don't have to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling. Understanding the characteristics of your data helps you decide which data to move to a NoSQL data store or S3, and which data you should continue to store in a relational database.
Another way to optimize for costs is to select the correct database service for your use case. For example, it may be more cost effective to use Aurora Serverless over Aurora. Diving deeper, another way to optimize for cost for databases is to select appropriate scaling strategies given a scenario.
You could right size storage for an RDS database by using automatic scaling for storage. For example, you could have a database that is experiencing performance issues due to high CPU utilization during periods of heavy read requests hitting the instance. In order to scale up, you could increase the instance size. This, however, would increase the costs associated with the database.
Horizontal scaling is usually more cost-effective, and the most cost-effective design here would be to add a read replica or some sort of cache to help absorb some of the read requests and offload them from the database instance. This would give you the ability to meet performance demands while also optimizing for cost, because introducing a read replica is often more cost-effective than scaling up the database instance. Making these sorts of scaling decisions requires deep knowledge of the different scaling mechanisms that exist for both AWS database services and AWS compute services. So, if you have gaps here, I would recommend reading the documentation for the relevant services.
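As a rough illustration of that choice, the sketch below uses boto3 to add a read replica to an existing RDS instance; the identifiers and instance class are hypothetical, and a replica can often be a smaller class than the primary.

```python
import boto3

rds = boto3.client("rds")

# Add a read replica to absorb heavy read traffic instead of scaling up
# the primary instance. Identifiers and instance class are hypothetical.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="orders-db-replica-1",
    SourceDBInstanceIdentifier="orders-db",
    DBInstanceClass="db.r6g.large",
)
```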
Remember, earlier we talked about RTO and RPO. Ensure you design your backup plan to meet your RPO requirements with the appropriate frequency. Understand which AWS managed database services support point-in-time recovery, and design your retention policy so that you don’t retain snapshots beyond their useful life, which eliminates unnecessary storage costs.
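For example, one hedged way to keep retention in check with boto3 might look like the following; the instance identifier and the seven-day window are assumptions for illustration.

```python
import boto3

rds = boto3.client("rds")

# Keep automated backups for 7 days, which enables point-in-time recovery
# within that window without retaining backups longer than needed.
# The instance identifier is hypothetical.
rds.modify_db_instance(
    DBInstanceIdentifier="orders-db",
    BackupRetentionPeriod=7,
    ApplyImmediately=True,
)
```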
Understand the different types of RDS engines and why you might choose one over the other.
One last tip for cost optimization for databases is to use managed services when possible. In the cloud, managed services remove the operational burden of maintaining servers for tasks like running applications or managing the databases. Because managed services operate at cloud scale, they can offer a lower cost per transaction or service.
Let’s get started with the fourth task statement from Domain 4, design cost-optimized network architectures. Refer to the exam guide and scan here for a link to download and follow along, and I’ll see you in the next video.
Cost-optimized network architectures
–
Welcome back. Let’s get started with the last task statement for Domain 4, design cost-optimized network architectures. Earlier we said that in AWS, networking is virtualized. To add to that statement, networking in AWS is delivered as a managed service. And you may pause here and ask, Julie, what are you talking about?
In AWS, there is no longer a need to manage and operate a fleet of switches, routers, and other associated network equipment. Networking resources are consumed and paid for in the same way you pay for CPU and storage: you only pay for what you use. However, API calls and data transfer costs are often forgotten. Earlier we mentioned the importance of establishing your account structure and defining metrics and goals to track, with a business design to achieve the optimal outcome that you or your organization needs to be successful. Efficient use of networking resources is required for cost optimization, both within AWS and for hybrid networking. Understand the areas to consider for a hybrid network and your workloads.
Let’s start with the data layer of your design, which provides a path for your network traffic. Ensure you know the most cost-optimized design for connections between AWS and your on-premises environment for different scenarios. Understand the different options for configuring network connectivity. For example, if you want a private connection from your on-premises environment, you can use a Site-to-Site VPN or a dedicated Direct Connect connection. Which one would you choose to minimize costs? Generally speaking, a solution using Direct Connect will be more costly than using VPN connections, so unless you need the throughput or consistent network performance that Direct Connect provides, you should look to use VPN connections, as they are more cost-effective.
Another type of connectivity to AWS is how you connect to AWS resources to manage them. For example, if you need to log on to an EC2 instance to check some logs or do some troubleshooting, there are many ways to achieve this. There is the classic method of SSH or RDP, there’s AWS Systems Manager Session Manager, and there’s also Amazon EC2 Instance Connect. You should be aware of the cost implications of each of these methods of connectivity.
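As a sketch of the Session Manager option, the call below starts a session against a hypothetical instance ID; note that an interactive shell normally goes through the AWS CLI with the Session Manager plugin, so treat this as an illustration of the underlying API rather than a full workflow.

```python
import boto3

ssm = boto3.client("ssm")

# Start a Session Manager session to a private instance: no inbound ports,
# bastion host, or SSH keys required. The instance ID is hypothetical.
response = ssm.start_session(Target="i-0123456789abcdef0")
print(response["SessionId"])
```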
Now let’s talk about connections between VPCs within a Region and across Regions. When would you use a peering connection instead of a transit gateway? Dive deeper, and know that data transfer charges are incurred when transferring data across AWS Regions. So, if you’re sending data between services across Regions, consider how you are connecting those services. Using something like VPC peering could be more cost-effective than something like a transit gateway, so knowing the network connectivity services and their pricing will be helpful.
You need to know how data transfer costs work in AWS and how to minimize them. For example, VPC gateway endpoints allow communication to S3 and DynamoDB without incurring data transfer charges within the same Region. Traffic that crosses an Availability Zone boundary typically incurs a data transfer charge, so use resources from the local Availability Zone whenever possible.
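A minimal sketch of creating a gateway endpoint for S3 with boto3 might look like this, assuming hypothetical VPC and route table IDs and a us-east-1 service name.

```python
import boto3

ec2 = boto3.client("ec2")

# Gateway endpoint so traffic to S3 stays on the AWS network and avoids
# NAT gateway data processing charges. IDs and Region are hypothetical.
ec2.create_vpc_endpoint(
    VpcId="vpc-0abc1234",
    ServiceName="com.amazonaws.us-east-1.s3",
    VpcEndpointType="Gateway",
    RouteTableIds=["rtb-0def5678"],
)
```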
One final thing to be aware of is the cost implications of other networking services like NAT gateways or transit gateways. You should know how to minimize costs for your NAT gateway configuration. In production environments, you will likely want a NAT gateway deployed in a public subnet for each Availability Zone. However, to save costs in development environments, you can use a single shared NAT gateway.
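One possible way to wire several development subnets to a single shared NAT gateway is to point their private route tables at the same gateway; the IDs below are hypothetical.

```python
import boto3

ec2 = boto3.client("ec2")

# Point the private route tables of several development subnets at one
# shared NAT gateway instead of one per Availability Zone. IDs are hypothetical.
for route_table_id in ["rtb-0aaa1111", "rtb-0bbb2222"]:
    ec2.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId="nat-0ccc3333",
    )
```

The trade-off is availability: if the Availability Zone hosting the shared NAT gateway fails, outbound internet access for all those subnets fails with it, which is usually acceptable in development but not in production.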
Also under connections, ensure you can set up an appropriate throttling strategy for your workloads. For example, let’s say you have a service implemented with API Gateway, and you want to control how many transactions per second each customer can make, as well as cap their weekly usage. How might you set up usage plans and API keys to accomplish this? After you create, test, and deploy your APIs, you can use API Gateway usage plans; once configured, usage plans and API keys give customers access to selected APIs and throttle requests to those APIs based on defined limits and quotas.
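A hedged sketch of that setup with boto3 could look like the following; the API ID, stage, plan name, and limits are all assumptions for illustration.

```python
import boto3

apigw = boto3.client("apigateway")

# Create a usage plan with throttling limits and a weekly quota,
# then attach an API key to it. IDs, names, and limits are hypothetical.
plan = apigw.create_usage_plan(
    name="basic-tier",
    apiStages=[{"apiId": "a1b2c3d4e5", "stage": "prod"}],
    throttle={"rateLimit": 10.0, "burstLimit": 20},
    quota={"limit": 10000, "period": "WEEK"},
)

key = apigw.create_api_key(name="customer-123", enabled=True)

apigw.create_usage_plan_key(
    usagePlanId=plan["id"],
    keyId=key["id"],
    keyType="API_KEY",
)
```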
Finally, make sure you design your network with the appropriate bandwidth. Let’s say your on-premises data center is connected to your VPCs through a Direct Connect connection. If you want to configure a failover with minimal cost, how would you do it: with a separate Direct Connect connection or a Site-to-Site VPN connection? Another Direct Connect connection adds cost plus time to configure. Site-to-Site VPN connections are more cost-effective and quicker to configure and implement.
S3 also has costs we need to know about. You are charged for the amount of data stored in the S3 bucket, and you also incur costs for the API calls made to S3 and the data transferred out of the bucket. This means that one good way to optimize for cost when using S3 is to reduce the number of API calls and the amount of data transfer occurring. One way you can do this is to use a content delivery service like CloudFront. Depending on the amount of data being accessed and the access patterns, it can be more cost-effective to front your S3 bucket with CloudFront. Coming up with an effective caching solution using CloudFront can be a great way to reduce data transfer out of S3 and therefore optimize cost. So for the exam, I would highly recommend you take a good look at when and how to use CloudFront to optimize for cost, and check out use cases beyond fronting an S3 bucket with a CloudFront distribution; that is not the only use case.
Know how CloudFront can aid in reducing data transfer and origin operating costs. For example, if AWS origins such as S3, EC2, or Elastic Load Balancing are used, there is no charge for data transferred from those origins to CloudFront edge locations. Additionally, CloudFront provides regional edge caches at no additional cost, which decreases the load on origins and lowers operating costs for request-intensive origins. Beyond optimizing data transfer using CloudFront, you should also be able to determine strategies for reducing data transfer costs within AWS.
Monitoring and config management is an important way to gain insights and improve the performance of your network.
AWS provides monitoring and config management with services such as CloudWatch, the Health Dashboard, VPC Reachability Analyzer, Transit Gateway Network Manager, Config, and Route Analyzer to help resolve the root cause of performance issues. You should know how to review your existing workloads for opportunities to minimize cost. Begin by monitoring your solution to look at historical cost patterns. Look for opportunities to keep data in the same Availability Zone and Region, when possible, to avoid data transfer costs. Some Regions cost more than others, so be sure to select less expensive Regions when possible. Take advantage of the free tier offered for many AWS services.
Let’s get started with our twelfth walkthrough question, and I’ll see you in the next video.
Walkthrough question 12
–
Welcome back. Let’s get started with the last walkthrough question.
The Stem reads, a company is developing an application that runs on Amazon EC2 instances in a private subnet. The EC2 instances use a NAT gateway to access the internet. A solutions architect must provide a secure option so that developers can log into the instances. Which solution meets these requirements most cost-effectively?
Reading the question, can you identify any keywords or phrases and exactly what the question is asking? What keywords do you identify? I see private instance, NAT gateway, and secure login.
Now let’s look at the responses.
A, configure AWS Systems Manager Session Manager for the EC2 instances to enable login.
B, configure a bastion host in a public subnet to log in to the EC2 instances in a private subnet.
C, use the existing NAT gateway to log into the EC2 instances in a private subnet.
D, configure AWS Site-to-Site VPN to log in directly to the EC2 instances.
Take a minute to choose an answer.
All right, the key for this one is A, configure AWS Systems Manager Session Manager for the EC2 instances to enable login. Session Manager gives you the ability to set up secure and auditable instance management without the need to open inbound ports, maintain a bastion host, or manage SSH keys. There is no additional charge for accessing Amazon EC2 instances by using Session Manager. This makes it a great option for accessing private instances because you do not need to open any ports, and it’s also very cost-effective because it comes at no charge.
Now let’s review the incorrect responses.
First up is B, configure a bastion host in a public subnet to log into the EC2 instances in a private subnet. This is a classic way to turn on connectivity to instances in private subnets. This can work, but the bastion host would incur charges as an EC2 instance, so this is not the most cost-effective solution.
Next up is C, use the existing NAT gateway to log in to the EC2 instances in a private subnet. This one is incorrect for a technical reason. NAT gateways only allow egress network traffic, not ingress. This means that you cannot establish an inbound connection to the instance in this manner. So having knowledge of how NAT gateways work and their functionality is important and helps you rule out this answer.
Lastly, there is D, configure AWS Site-to-Site VPN to log directly into the EC2 instances. This is a strong distractor because setting up VPN connections would work for this use case. However, VPN connections do incur additional costs, so this is not the most cost-effective solution which makes it an incorrect response.
How did you answer and understand this question? Remember, when you’re taking your exam, read each question and the answers completely to identify keywords, then eliminate the distractors and reread the plausible answers to choose your best answer. If you got this one wrong, ensure you read up on the cost implications of the connectivity services and features on AWS.
In the next video, we will wrap up Domain 4 and then we’ll get started with our course conclusion. And remember to get some hands-on experience in the last lab “Build a Serverless Architecture” and I’ll see you in the next video.
Cost-optimized architectures wrap up
–
Welcome back. Let’s wrap up this last domain by taking a step back and thinking about the implications of designing cost-optimized architectures. One of the benefits of the cloud is that you can take an iterative approach to your AWS bill, because the cloud follows a variable cost model and you only pay for what you use each month. You have the power to impact the costs you incur by changing or optimizing the solutions you build. Whenever architecting AWS solutions, it’s important to look at how you can minimize cost across storage, compute, database, and network resources.
Let’s say through monitoring, you discovered that your EC2 instances are underutilized. This gives you an opportunity to right size the instance for the application and workload or scale in. By doing so, you would be reducing your EC2-related charges from that point forward. Then the following month, maybe you turn on Intelligent-Tiering for S3 and begin to take advantage of the correct storage classes for your access patterns. Then the next month is something else you find to optimize.
Knowing where to look for cost optimization opportunities and knowing how to optimize for cost gives you the power to impact your bottom line in ways that on premises computing doesn’t give you. This is an awesome thing to be able to see a measurable difference in the cost due to architecture decisions.
When it comes to compute, choosing the most appropriate instance type and size is a good place to start if you’re using EC2. From there, you can research the pricing model. Should you use On-Demand, Reserved Instances, Savings Plans, Dedicated Instances, or Spot Instances? Each of these pricing models has a different impact on cost.
Beyond EC2, what about Lambda? With Lambda, you are charged for the resources the function consumes and the time the function runs. This means you can optimize by writing functions that consume fewer resources and run more efficiently. For containers, it can be a similar story: you can save by building smaller, more lightweight containers and choosing the correct compute platform for your clusters, whether that is Amazon EC2 or Fargate.
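To see why memory and duration drive the bill, here is a back-of-the-envelope estimate; the rates are placeholders rather than current prices, so check the Lambda pricing page for your Region.

```python
# Rough Lambda cost estimate: charges scale with memory x duration
# (GB-seconds) plus a per-request fee. The rates below are placeholders;
# they are not authoritative pricing.
PRICE_PER_GB_SECOND = 0.0000166667      # placeholder rate
PRICE_PER_MILLION_REQUESTS = 0.20       # placeholder rate


def estimate_monthly_cost(invocations, avg_duration_ms, memory_mb):
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    request_cost = (invocations / 1_000_000) * PRICE_PER_MILLION_REQUESTS
    return compute_cost + request_cost


# Halving memory or duration roughly halves the compute portion of the bill.
print(estimate_monthly_cost(5_000_000, avg_duration_ms=120, memory_mb=512))
print(estimate_monthly_cost(5_000_000, avg_duration_ms=120, memory_mb=256))
```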
To be fully prepared for the exam and your cloud career, you should review all of the compute services, their use cases, and their pricing models. This will help you figure out how to optimize for cost.
For this domain, you need to understand how to configure tags to be used with cost reporting tools. Be sure to understand the cost management tools such as Cost Explorer, the Cost and Usage Report, and Budgets.
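As one example of how cost-allocation tags feed those tools, the sketch below queries the Cost Explorer API grouped by a hypothetical tag key; it assumes the tag has already been activated for cost allocation in the Billing console.

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

# Break down one month's unblended cost by a cost-allocation tag.
# The tag key and date range are hypothetical.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-05-01", "End": "2024-06-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "project"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    print(group["Keys"], group["Metrics"]["UnblendedCost"]["Amount"])
```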
In some use cases, there can be many options for storage. Know how to pick the most cost effective storage solution that meets your use cases. Also know the different automated methods for switching between S3 storage classes. For example, let’s say you’re storing production logs in S3. After 30 days, those logs are barely accessed. After 90 days, the logs can be archived. Would you use Intelligent-Tiering or Lifecycle Configurations?
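For the log scenario above, the access pattern is predictable, so a lifecycle configuration fits well. Here is a minimal sketch, assuming a hypothetical bucket and prefix; the storage classes, day counts, and expiration are illustrative, and Intelligent-Tiering would be the alternative if access patterns were unpredictable.

```python
import boto3

s3 = boto3.client("s3")

# Transition logs to an infrequent-access tier after 30 days and archive
# them after 90. Bucket name, prefix, classes, and expiration are illustrative.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-prod-logs",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "log-tiering",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```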
Understand the different AWS purchasing options such as Savings Plans, Reserved Instances, and Spot Instances and how they work. Be sure to understand different scaling strategies such as Auto Scaling and hibernation. Given a particular workload, know how to determine the most cost effective services.
For storage, take advantage of the appropriate storage tiers. You can also optimize for cost by choosing the appropriate EBS volume types and sizes and by managing the lifecycle of snapshots for EBS volumes and database backups.
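As a small example of that kind of tuning, converting a gp2 volume to gp3 often lowers the per-GB rate; the volume ID below is hypothetical.

```python
import boto3

ec2 = boto3.client("ec2")

# Convert an existing gp2 volume to gp3; IOPS and throughput can be
# provisioned separately if the defaults are not enough.
ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",
    VolumeType="gp3",
)
```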
For databases, choosing the right database for your use case is a good starting point. From there, you should research how to take advantage of caching instead of always scaling up and out. You can also optimize for costs in some cases by tweaking queries or database design. It all depends on the service and the use case.
Then for networking, be sure to dive deep into how these services work, their use cases, and their cost implications. Again, being able to choose the appropriate service for your use case is the first step, but after that, you should know the configurations you can make and the pricing models for each service.
To really make an impact on costs when using AWS, you need a deep working knowledge of the services you’re planning to use. The most important thing is to remember to first find a solution that meets your requirements. This could result in you coming up with multiple designs that easily satisfy the requirements. From there, you can begin to evaluate the designs at a deeper level to see where you can optimize for cost. How can you minimize your VPC costs? For example, how can you minimize cost for NAT gateways in a development VPC? How can you configure your network routes to minimize data transfer costs, for example, Region to Region, Availability Zone to Availability Zone, VPC endpoints, and AWS Global Accelerator?
In the next lesson, we will wrap up this course and cover test taking tips along with more resources. Refer to the exam guide and scan here for a link to download and follow along, and I’ll see you in the next video.
Module 5: Course Close
Welcome back and great job completing this course!
I hope you’ve enjoyed this exam prep course for the AWS Certified Solutions Architect Associate Certification. Throughout the course, we’ve provided valuable exam basics and guidance on how to approach the various domains from the exam guide that will be covered in the certification exam.
As a reminder, this course was not intended to teach what will be on the exam but was intended to provide you a method for self-evaluation to determine your level of readiness to take the certification exam. Use the information provided in this course to help guide you in your studies and preparations and do not forget to get some hands-on experience too.
AWS provides official practice exams for this certification. Similar to this course, they will help you assess your readiness for the exam and highlight areas where you may still have gaps in your knowledge.
Let’s cover a few test-taking tips.
First, read and understand the question and the answer options.
Identify keywords and phrases and qualifiers. This is very important, if the question is looking for the lowest cost option and you are thinking of the most resilient solution, you might choose a different answer.
Eliminate some of the answer options based on what you know. Compare and contrast the remaining options. Keep in mind the key phrases and qualifiers identified.
If you’re spending too much time, pick your best guess answer and flag the question for review later. You aren’t penalized for incorrect answers, but you get zero points if you leave it blank.
If you’re looking for more in-depth and guided instruction and courses, scan the QR code, and keep in mind that the AWS Training and Certification team creates many courses, from the foundational to the professional level, and also offers AWS certifications at the foundational, associate, professional, and specialty levels.
For next steps, I recommend studying any areas that you have identified as gaps in your knowledge. When you’re ready to take the test, visit our certification site to schedule your exam.
Good luck with your studies and preparations, and good luck on your exam.
Feel free to reach out anytime, and remember, AWS is here for you and we are all cheering for you.
One last quick note, please complete our feedback survey or reach out to me directly with any and all feedback. Feedback is so important to ensure I’m creating content that you need.
Again, I wish you all the best. Now go crush your exam!!!