High Performance Computing

The video discusses the advancements and capabilities of Amazon Web Services (AWS) in handling high-performance computing (HPC) workloads. It covers the evolution of AWS’s offerings for HPC, including networking, storage, and compute services, and how these advancements help various industries run their most demanding workloads efficiently and cost-effectively. The video highlights specific examples and case studies demonstrating AWS’s impact on HPC, particularly in fields like COVID-19 research, weather forecasting, autonomous vehicle simulations, and more.

00:00 – 01:00: Introduction by Ian Colle, General Manager of High Performance Computing at AWS.
01:00 – 02:30: Evolution of AWS’s HPC capabilities since 2017, including C5n instances and Amazon FSx for Lustre.
02:30 – 04:00: Examples of customers using AWS for HPC, such as Formula One, Halliburton, and Maxar.
04:00 – 05:30: AWS’s aim to make HPC users the most productive scientists or engineers.
05:30 – 07:00: AWS’s contribution to the COVID-19 HPC Consortium and examples like novel tech sciences and Iowa State University.
07:00 – 09:00: Moderna’s use of AWS for vaccine development and other complex workflows.
09:00 – 11:00: Overview of key HPC components: orchestration, compute, networking, storage, and scheduling.
11:00 – 12:30: Introduction to AWS ParallelCluster for HPC orchestration.
12:30 – 15:00: The AWS Nitro System and its impact on HPC performance.
15:00 – 17:30: Details on EC2 instances optimized for HPC, including C5n, P4d, and Graviton2-based instances.
17:30 – 19:30: Elastic Fabric Adapter (EFA) and its benefits for tightly coupled HPC workloads.
19:30 – 21:00: Deeper dive into EFA’s architecture and performance advantages.
21:00 – 23:00: Amazon FSx for Lustre and its role in high-performance file systems.
23:00 – 24:30: AWS Batch for dynamic provisioning and scheduling of compute resources.
24:30 – 26:00: NICE DCV for remote visualization and its use by customers like Volkswagen.
26:00 – 29:00: Customer case studies: Lyft Level 5 AV, Maxar Technologies, Fred Hutch, Fannie Mae, and DTN.
29:00 – End: Conclusion and summary of AWS’s capabilities in enabling HPC workloads.

Related links in the video description:

AWS Lambda

The video discusses various serverless architectural patterns and best practices for implementing them in AWS environments. It focuses on providing practical examples and insights into serverless applications, including Lambda functions, API Gateway, DynamoDB, and Kinesis. The session is intended for developers and architects with a basic understanding of serverless concepts and aims to deepen their knowledge by presenting real-world scenarios and solutions.

00:00 – 02:00: Introduction by Adrian Hornsby, discussing the scope of the session and the focus on serverless architectural patterns.
02:00 – 04:00: Overview of serverless computing, its benefits, and the spectrum of serverless services.
04:00 – 08:00: Explanation of different serverless patterns, focusing on common practices and lessons learned from real-world implementations.
08:00 – 12:00: Discussion on creating applications using the Lambda console, including CI/CD pipelines and best practices.
12:00 – 16:00: Introduction to serverless CI/CD pipelines with advanced configurations for large enterprises.
16:00 – 20:00: Detailed exploration of Lambda power tuning, optimizing memory settings for cost and performance.
20:00 – 24:00: Explanation of the saga pattern for handling complex transactions in serverless applications.
24:00 – 28:00: Practical examples of using API Gateway with Lambda functions, DynamoDB, and Kinesis for different use cases.
28:00 – 32:00: Introduction to advanced GraphQL usage with AWS AppSync, including caching and real-time subscriptions.
32:00 – 36:00: Discussion on data streaming patterns, using Kinesis and Firehose for scalable data ingestion and processing.
36:00 – 40:00: Examples of real-world customer implementations, highlighting best practices and challenges faced.
40:00 – 44:00: Introduction to RDS Proxy for managing database connections in serverless applications.
44:00 – 48:00: Best practices for logging, monitoring, and tracing in serverless applications using AWS tools.
48:00 – 52:00: Conclusion and Q&A session, summarizing the key takeaways and addressing audience questions.
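
The power-tuning trade-off from the 16:00 – 20:00 segment can be sketched as simple arithmetic. Lambda bills for allocated memory multiplied by duration, so a larger memory setting that shortens the run can end up cheaper. This is a minimal sketch: the price per GB-second and the measured durations below are assumed, illustrative values, not figures from the video.

```python
# Illustrative only: the price and the measured durations are assumptions.
PRICE_PER_GB_SECOND = 0.0000166667  # assumed example rate; check current pricing


def invocation_cost(memory_mb, duration_ms, price_per_gb_second=PRICE_PER_GB_SECOND):
    """Cost of one invocation: GB allocated x seconds run x price per GB-second."""
    return (memory_mb / 1024) * (duration_ms / 1000) * price_per_gb_second


def cheapest_setting(measurements):
    """Return the memory size (MB) with the lowest cost per invocation.

    `measurements` maps a memory size in MB to an average duration in ms.
    """
    return min(measurements, key=lambda mb: invocation_cost(mb, measurements[mb]))


# Hypothetical measurements: memory (MB) -> average duration (ms)
measured = {128: 2400.0, 512: 500.0, 1024: 260.0, 2048: 250.0}
best = cheapest_setting(measured)
```

With these made-up numbers the 512 MB setting is cheapest, while 1024 MB is nearly as cheap and about twice as fast, which is the kind of cost-versus-latency choice that power tuning exposes.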

Related links in the video description:

Orchestrating an Application Process with AWS Batch Using AWS CloudFormation (Blog)

The blog post explains how to automate the deployment and management of batch processing environments using AWS Batch and AWS CloudFormation. The challenge is setting up a scalable and efficient workflow for processing large datasets. AWS CloudFormation simplifies the creation of AWS Batch resources, enabling users to deploy complex batch processing jobs using Docker containers.

Challenges and Solutions:

Automation: AWS CloudFormation automates infrastructure setup.
Scalability: AWS Batch handles large-scale processing efficiently.
Integration: Docker containers streamline the deployment process.
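
The three pieces a CloudFormation template for AWS Batch typically declares (a compute environment, a job queue, and a container job definition) can be sketched as a Python dictionary that serializes to JSON. This is a minimal sketch, not the template from the blog post: the function name, the vCPU and memory values, the placeholder command, and every argument are hypothetical.

```python
import json


def batch_template(image, subnet_ids, security_group_ids, instance_role, service_role):
    """Build a minimal CloudFormation template (as a dict) for an AWS Batch setup."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # Managed pool of EC2 capacity that Batch scales between 0 and 64 vCPUs.
            "ComputeEnvironment": {
                "Type": "AWS::Batch::ComputeEnvironment",
                "Properties": {
                    "Type": "MANAGED",
                    "ServiceRole": service_role,
                    "ComputeResources": {
                        "Type": "EC2",
                        "MinvCpus": 0,
                        "MaxvCpus": 64,
                        "InstanceTypes": ["optimal"],
                        "Subnets": subnet_ids,
                        "SecurityGroupIds": security_group_ids,
                        "InstanceRole": instance_role,
                    },
                },
            },
            # Queue that feeds submitted jobs to the compute environment above.
            "JobQueue": {
                "Type": "AWS::Batch::JobQueue",
                "Properties": {
                    "Priority": 1,
                    "ComputeEnvironmentOrder": [
                        {"Order": 1, "ComputeEnvironment": {"Ref": "ComputeEnvironment"}}
                    ],
                },
            },
            # Docker-based job definition; the command is a placeholder.
            "JobDefinition": {
                "Type": "AWS::Batch::JobDefinition",
                "Properties": {
                    "Type": "container",
                    "ContainerProperties": {
                        "Image": image,
                        "ResourceRequirements": [
                            {"Type": "VCPU", "Value": "2"},
                            {"Type": "MEMORY", "Value": "2048"},
                        ],
                        "Command": ["echo", "hello"],
                    },
                },
            },
        },
    }


def render(template):
    """Serialize the template to the JSON text CloudFormation accepts."""
    return json.dumps(template, indent=2)
```

Keeping the template as data makes it easy to review the resource wiring (the queue references the compute environment by its logical name) before deploying anything.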

For detailed information, refer to the blog post.

Enabling Job Accounting for HPC with AWS ParallelCluster and Amazon RDS (Blog)

The blog post explains how to enable job accounting for High Performance Computing (HPC) using AWS ParallelCluster and Amazon RDS. The challenge is tracking resource usage accurately across HPC jobs. AWS ParallelCluster integrates with Amazon RDS to provide detailed job accounting, helping organizations optimize resource allocation and manage costs effectively.

Challenges and Solutions:

Resource Usage Tracking: Accurate tracking of HPC job resource usage.
Integration: Using AWS ParallelCluster with Amazon RDS for detailed accounting.
Cost Management: Optimizing resource allocation to manage costs.
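
Once job accounting records are stored in a database, usage can be summarized per user for cost allocation. The sketch below assumes a simplified record shape (user, CPU count, elapsed seconds); the field names and the sample jobs are hypothetical and do not reflect the actual schema used in the blog post.

```python
from collections import defaultdict


def cpu_hours_by_user(records):
    """Sum CPU-hours (CPUs x elapsed hours) for each user across all jobs."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["user"]] += rec["cpus"] * rec["elapsed_seconds"] / 3600
    return dict(totals)


# Hypothetical accounting records
jobs = [
    {"user": "alice", "cpus": 36, "elapsed_seconds": 7200},
    {"user": "bob", "cpus": 72, "elapsed_seconds": 1800},
    {"user": "alice", "cpus": 4, "elapsed_seconds": 900},
]
usage = cpu_hours_by_user(jobs)
```

Multiplying each user's CPU-hours by an hourly instance rate would turn this summary into an approximate chargeback figure.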

For detailed information, refer to the blog post.

Formula 1 Transfers Race Car Data into AWS Using AWS DataSync (Blog)

The blog post discusses how Formula 1 uses AWS DataSync to transfer large volumes of race car telemetry data to AWS efficiently and securely. This solves the challenge of managing and analyzing data quickly. AWS DataSync automates and accelerates data transfers, ensuring timely data availability for performance analysis and decision-making.

Challenges and Solutions:

Large Data Volumes: Efficient management of vast telemetry data.
Data Transfer Speed: Automating transfers with AWS DataSync for quick access.
Security: Secure data movement using AWS Direct Connect and encryption.

For a detailed walkthrough, refer to the blog post.

Introducing Amazon S3 Object Lambda – Use Your Code to Process Data as It Is Being Retrieved from S3 (Blog)

The blog post introduces Amazon S3 Object Lambda, allowing users to add custom code to process data as it is retrieved from S3. The challenge of creating multiple views of the same data for different applications is solved by using AWS Lambda functions, which transform the data on-the-fly without changing the original dataset or maintaining separate copies.

Challenges and Solutions:

Multiple Data Views: Simplified by using Lambda functions to process S3 data as it’s retrieved.
Data Processing: Redacts PII, converts formats, augments data, compresses files, and more.
Implementation: Involves creating Lambda functions and configuring S3 Object Lambda Access Points.
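
The PII-redaction case can be sketched as a Lambda handler. This is a minimal sketch under stated assumptions: it treats the object as UTF-8 text, redacts only email addresses with a simple regular expression, and uses hypothetical function names. The event fields under getObjectContext and the write_get_object_response call are the parts supplied by S3 Object Lambda and boto3.

```python
import re
import urllib.request

# Simplified email pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text):
    """Replace anything that looks like an email address with a marker."""
    return EMAIL_PATTERN.sub("[REDACTED]", text)


def handler(event, context):
    """Fetch the original object, redact it, and return it to the caller."""
    import boto3  # imported here so redact_emails can be exercised without the SDK

    ctx = event["getObjectContext"]

    # Presigned URL for the original, unmodified object.
    with urllib.request.urlopen(ctx["inputS3Url"]) as response:
        original = response.read().decode("utf-8")

    # Send the transformed bytes back to the application that called GetObject.
    boto3.client("s3").write_get_object_response(
        Body=redact_emails(original).encode("utf-8"),
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"statusCode": 200}
```

The stored object is never modified; each application that needs a different view reads through its own Object Lambda Access Point.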

For detailed information, refer to the blog post.

Building Microservices with AWS Lambda

The video, presented by Chris Munns from AWS, discusses building microservices with AWS Lambda. It focuses on the architecture, benefits, and practical implementation of microservices using AWS Lambda, API Gateway, EventBridge, and other AWS services. The talk highlights the distinction between containers and microservices, emphasizes the event-driven nature of Lambda, and provides examples and best practices for setting up serverless applications.

00:00 – 02:00: Introduction to the session and its focus on building microservices with AWS Lambda.
02:00 – 04:00: Overview of microservices architecture and its benefits.
04:00 – 08:00: Explanation of containers vs. microservices, emphasizing that containers are a delivery mechanism and microservices are an architectural pattern.
08:00 – 12:00: Discussion on AWS’s long history with microservices and the evolution of services like Amazon S3.
12:00 – 16:00: Introduction to the microservices iceberg concept and how it relates to public and private interfaces.
16:00 – 20:00: Explanation of synchronous vs. asynchronous services and their typical uses in microservices architecture.
20:00 – 24:00: Overview of AWS Lambda and its characteristics, including compute service criteria and pay-for-value model.
24:00 – 28:00: Introduction to Lambda’s event-driven model and the various ways to invoke Lambda functions.
28:00 – 32:00: Discussion on API Gateway, its types (REST, WebSocket, HTTP API), and new capabilities.
32:00 – 36:00: Comparison of API Gateway, Application Load Balancer (ALB), and AWS AppSync for exposing APIs.
36:00 – 40:00: Deep dive into security configurations for API Gateway and Lambda, including IAM roles and policies.
40:00 – 44:00: Explanation of synchronous and asynchronous event processing with EventBridge, SNS, SQS, and Kinesis.
44:00 – 48:00: Introduction to Lambda Layers for sharing code and dependencies across multiple functions.
48:00 – 52:00: Overview of monitoring, logging, and debugging tools like AWS X-Ray and CloudWatch.
52:00 – 56:00: Discussion on concurrency controls and scaling Lambda functions to avoid overloading downstream services.
56:00 – 60:00: Best practices for securing Lambda functions and managing IAM policies.
60:00 – 64:00: Introduction to AWS SAM (Serverless Application Model) for deploying and managing serverless applications.
64:00 – 68:00: Conclusion and summary of key takeaways for building microservices with AWS Lambda.

Related links in the video description:

AWS Lambda Under the Hood

The video discusses the internal systems and technologies behind AWS Lambda, focusing on its architecture, scalability, and security features. Holly Mesrobian and Marc Brooker of AWS present an in-depth look at Lambda’s data plane, asynchronous invocation paths, streaming invocation paths, and the underlying Firecracker microVM technology.

00:00 – 02:00: Introduction by Holly Mesrobian, Director of AWS Lambda, and an overview of the session’s agenda.
02:00 – 05:00: Brief summary of the previous year’s discussion on synchronous invocation paths and Firecracker microVM technology.
05:00 – 08:00: Introduction to AWS Lambda’s scale, processing trillions of requests across hundreds of thousands of active customers each month.
08:00 – 10:00: Explanation of the control plane and data plane in Lambda, focusing on their roles and responsibilities.
10:00 – 14:00: Detailed look at the synchronous invocation path, including cold starts and warm starts.
14:00 – 18:00: Discussion on the asynchronous invocation path and its components like pollers, state manager, stream tracker, and leasing service.
18:00 – 22:00: Description of event processing for async invokes, event sources, and event destinations.
22:00 – 26:00: Scaling up and down with load in the async invoke path and handling errors.
26:00 – 30:00: Streaming invocation path for services like Kinesis and DynamoDB, including parallelization and scaling.
30:00 – 35:00: Introduction to Lambda’s provisioned concurrency and its benefits for latency-sensitive applications.
35:00 – 40:00: Discussion on concurrency vs. capacity and the importance of concurrency in measuring service scale.
40:00 – 45:00: Explanation of Little’s Law and its relevance to building scalable systems.
45:00 – 50:00: Further details on provisioned concurrency and its implementation in Lambda.
50:00 – 55:00: Overview of Firecracker and its role in providing isolation and security for Lambda functions.
55:00 – 60:00: Deep dive into Firecracker’s architecture, including its use of KVM and device emulation.
60:00 – 65:00: Security features of Firecracker, including sandboxing and performance isolation.
65:00 – 70:00: Conclusion and key takeaways, emphasizing the advancements in Lambda’s scalability and security.
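
Little’s Law, covered in the 40:00 – 45:00 segment, relates concurrency to request rate and duration: the average number of requests in flight equals the arrival rate multiplied by the average time each request takes. A minimal sketch, with hypothetical function names and made-up numbers:

```python
def required_concurrency(requests_per_second, avg_duration_seconds):
    """Little's Law: average in-flight requests = arrival rate x average duration."""
    return requests_per_second * avg_duration_seconds


def max_throughput(concurrency_limit, avg_duration_seconds):
    """The same law rearranged: sustainable rate = concurrency / average duration."""
    return concurrency_limit / avg_duration_seconds


# Hypothetical workload: 1,000 requests per second at 250 ms each.
needed = required_concurrency(1000, 0.25)
```

Here the workload needs about 250 concurrent executions. Halving the duration halves the required concurrency, which is why concurrency, rather than raw request count, is a useful measure of service scale.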

Related links in the video description:

Save Up to 90% and Run Production Workloads on Spot Instances

The video discusses the advantages and best practices for using Amazon EC2 Spot Instances to save costs and run large-scale production workloads. Chad Schmutzer, a principal developer advocate at AWS, introduces the topic, followed by customer success stories from John Weber of Adobe and Devaun Sampat from Hulu, who share their experiences and strategies for utilizing Spot Instances effectively.

00:00 – 02:00: Introduction to the session by Chad Schmutzer, focusing on saving costs with EC2 Spot Instances.
02:00 – 04:00: Overview of Spot Instances and their benefits, including cost savings up to 90%.
04:00 – 08:00: Introduction to Adobe and Hulu’s use cases for Spot Instances and their experiences.
08:00 – 12:00: Chad Schmutzer explains the evolution of EC2 instances and the concept of spare capacity leading to the creation of Spot Instances.
12:00 – 16:00: Explanation of different EC2 purchasing options: On-Demand, Savings Plans, and Spot Instances.
16:00 – 20:00: Use cases for Spot Instances, including fault-tolerant, stateless, and flexible workloads.
20:00 – 24:00: Detailed discussion on how Spot Instances operate, pricing, and handling interruptions.
24:00 – 28:00: Explanation of Spot pricing, showing the stability and predictability of Spot prices.
28:00 – 32:00: Introduction to the capacity-optimized allocation strategy for improving reliability.
32:00 – 36:00: Use of Spot Instances in big data, CI/CD, web services, high-performance computing, and containers.
36:00 – 40:00: New features and integrations for Spot Instances in 2019, including ECS, EKS, and AWS Batch.
40:00 – 44:00: Devaun Sampat from Hulu discusses running 10 million containers on EC2 Spot Instances.
44:00 – 48:00: Challenges Hulu faced, such as IP address exhaustion, and solutions implemented.
48:00 – 52:00: John Weber from Adobe shares how Adobe uses Spot Instances in a large enterprise environment.
52:00 – 56:00: Adobe’s operations platform, Ethos, for managing Spot Instances and ensuring compliance and security.
56:00 – 60:00: Best practices for managing Spot Instances and handling interruptions at Adobe.
60:00 – End: Conclusion, offering $50 in Spot credits and encouraging audience engagement.
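
Handling interruptions usually starts with the two-minute interruption notice, which EC2 exposes through the instance metadata service at the path spot/instance-action as a small JSON document. The sketch below is not taken from the video: it only parses such a document to work out how much time remains for draining work. The function name and the sample notice are hypothetical, and fetching the document from the metadata service is left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(instance_action_json, now):
    """Return the seconds left before the action time given in the notice."""
    notice = json.loads(instance_action_json)
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return (deadline - now).total_seconds()


# Hypothetical notice in the shape of the instance-action document
sample_notice = '{"action": "terminate", "time": "2019-12-03T08:22:00Z"}'
current_time = datetime(2019, 12, 3, 8, 20, 0, tzinfo=timezone.utc)
remaining = seconds_until_interruption(sample_notice, current_time)
```

A worker could poll for the notice every few seconds and, once it appears, stop accepting new tasks and checkpoint within the remaining time.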

Related links in the video description: