
New – Ready-to-use Models and Support for Custom Text and Image Classification Models in Amazon SageMaker Canvas

Today AWS announces new features in Amazon SageMaker Canvas that help business analysts generate insights from thousands of documents, images, and lines of text in minutes with machine learning (ML). Starting today, you can access ready-to-use models and create custom text and image classification models alongside previously supported custom models for tabular data, all without requiring ML experience or writing a line of code.

Business analysts across different industries want to apply AI/ML solutions to generate insights from a variety of data and respond to ad-hoc analysis requests from business stakeholders. By applying AI/ML in their workflows, analysts can automate manual, time-consuming, and error-prone processes such as inspection, classification, and extraction of insights from raw data, images, or documents. However, applying AI/ML to business problems requires technical expertise, and building custom models can take several weeks or even months.

Launched in 2021, Amazon SageMaker Canvas is a visual, point-and-click service that allows business analysts to use a variety of ready-to-use models or create custom models to generate accurate ML predictions on their own.

Ready-to-use Models
Customers can use SageMaker Canvas to access ready-to-use models that extract information and generate predictions from thousands of documents, images, and lines of text in minutes. These ready-to-use models include sentiment analysis, language detection, entity extraction, personal information detection, object and text detection in images, expense analysis for invoices and receipts, identity document analysis, and more generalized document and form analysis.

For example, you can select the sentiment analysis ready-to-use model and upload product reviews from social media and customer support tickets to quickly understand how your customers feel about your products. Using the personal information detection ready-to-use model, you can detect and redact personally identifiable information (PII) from emails, support tickets, and documents. Using the expense analysis ready-to-use model, you can easily detect and extract data from your scanned invoices and receipts and generate insights about that data.

These ready-to-use models are powered by AWS AI services, including Amazon Rekognition, Amazon Comprehend, and Amazon Textract.
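To get a feel for the kind of output these underlying services return, here is a minimal AWS CLI sketch that calls Amazon Comprehend directly for sentiment analysis (the review text is made up, and you would need IAM permissions for Comprehend); SageMaker Canvas performs the equivalent work for you without any code:

$ aws comprehend detect-sentiment \
      --language-code en \
      --text "The new sofa arrived quickly and the quality is excellent."

The response contains a Sentiment label (such as POSITIVE) and a SentimentScore with the confidence for each label.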

Ready-to-use models available

Custom Text and Image Classification Models
Customers who need custom models trained for their business-specific use cases can use SageMaker Canvas to create text and image classification models.

You can use SageMaker Canvas to create custom text classification models to classify data according to your needs. For example, imagine that you work as a business analyst at a company that provides customer support. When a customer support agent engages with a customer, they create a ticket and need to record the ticket type, for example, “incident”, “service request”, or “problem”. This field is often left blank, so when reporting is done, the data is hard to analyze. Now, using SageMaker Canvas, you can create a custom text classification model, train it with existing customer support ticket information and ticket types, and use it to predict the ticket type when working on a report with missing data.

You can also use SageMaker Canvas to create custom image classification models using your own image datasets. For instance, imagine you work as a business analyst at a company that manufactures smartphones. As part of your role, you need to prepare reports and respond to questions from business stakeholders related to quality assessment and its trends. Every time a phone is assembled, a picture is automatically taken, and at the end of the week, you receive all those images. Now with SageMaker Canvas, you can create a custom image classification model that is trained to identify common manufacturing defects. Then, every week, you can use the model to analyze the images and predict the quality of the phones produced.

SageMaker Canvas in Action
Let’s imagine that you are a business analyst for an e-commerce company. You have been tasked with understanding the customer sentiment towards all the new products for this season. Your stakeholders require a report that aggregates the results by item category to decide what inventory they should purchase in the following months. For example, they want to know if the new furniture products have received positive sentiment. You have been provided with a spreadsheet containing reviews for the new products, as well as an outdated file that categorizes all the products on your e-commerce platform. However, this file does not yet include the new products.

To solve this problem, you can use SageMaker Canvas. First, you will need to use the sentiment analysis ready-to-use model to understand the sentiment for each review, classifying them as positive, negative, or neutral. Then, you will need to create a custom text classification model that predicts the categories for the new products based on the existing ones.

Ready-to-use Model – Sentiment Analysis
To quickly learn the sentiment of each review, you can run a batch prediction on the product reviews and generate a file with all the sentiment predictions.

To get started, locate Sentiment analysis on the Ready-to-use models page, and under Batch prediction, select Import new dataset.

Using ready-to-use sentiment analysis with a batch dataset

When you create a new dataset, you can upload the dataset from your local machine or use Amazon Simple Storage Service (Amazon S3). For this demo, you will upload the file locally. You can find all the product reviews used in this example in the Amazon Customer Reviews dataset.
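If your reviews are already stored in S3, you can stage or fetch them with the AWS CLI before importing. A minimal sketch, assuming a hypothetical bucket name:

$ aws s3 cp product-reviews.csv s3://my-canvas-datasets/reviews/product-reviews.csv

You can then choose Amazon S3 as the data source when creating the dataset in Canvas.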

After you finish uploading the file and creating the dataset, you can select Generate predictions.

Select dataset and generate predictions

Generating the predictions takes a short time, depending on the size of the dataset (under a minute in this case), and then you can view or download the results.

View or download predictions

The results from this prediction can be downloaded as a .csv file or viewed from the SageMaker Canvas interface. You can see the sentiment for each of the product reviews.

Preview results from ready-to-use model

Now you have the first part of your task ready—you have a .csv file with the sentiment of each review. The next step is to classify those products into categories.

Custom Text Classification Model
To classify the new products into categories based on the product title, you need to train a new text classification model in SageMaker Canvas.

In SageMaker Canvas, create a New model of the type Text analysis.

The first step when creating the model is to select a dataset with which to train the model. You will train this model with a dataset from last season, which contains all the products except for the new collection.

Once the dataset has finished importing, you will need to select the column that contains the data you want to predict, which in this case is the product_category column, and the column that will be used as the input for the model to make predictions, which is the product_title column.

After you finish configuring that, you can start to build the model. There are two modes of building:

  • Quick build, which returns a model in 15–30 minutes.
  • Standard build, which takes 2–5 hours to complete.

To learn more about the differences between the build modes, you can check the documentation. For this demo, pick Quick build, as our dataset is smaller than 50,000 rows.

Prepare and build your model

When the model is built, you can analyze how the model performs. SageMaker Canvas uses the 80-20 approach; it trains the model with 80 percent of the data from the dataset and uses 20 percent of the data to validate the model.

Model score

When the model finishes building, you can check the model score. The scoring section gives you a visual sense of how accurate the predictions were for each category. You can learn more about how to evaluate your model’s performance in the documentation.

After you make sure that your model has a high prediction rate, you can move on to generate predictions. This step is similar to the ready-to-use models for sentiment analysis. You can make a prediction on a single product or on a set of products. For a batch prediction, you need to select a dataset and let the model generate the predictions. For this example, you will select the same dataset that you selected in the ready-to-use model, the one with the reviews. This can take a few minutes, depending on the number of products in the dataset.

When the predictions are ready, you can download the results as a .csv file or view how each product was classified. In the prediction results, each product is assigned only one category based on the categories provided during the model-building process.

Predict categories

Now you have all the necessary resources to conduct an analysis and evaluate the performance of each product category with the new collection based on customer reviews. Using SageMaker Canvas, you were able to access a ready-to-use model and create a custom text classification model without having to write a single line of code.

Available Now
Ready-to-use models and support for custom text and image classification models in SageMaker Canvas are available in all AWS Regions where SageMaker Canvas is available. You can learn more about the new features and how they are priced by visiting the SageMaker Canvas product detail page.

— Marcia

AWS Application Migration Service Major Updates: Import and Export Feature, Source Server Migration Metrics Dashboard, and Additional Post-Launch Actions

AWS Application Migration Service (AWS MGN) can simplify and expedite your migration to AWS by automatically converting your source servers from physical, virtual, or cloud infrastructure to run natively on AWS. In the post How to Use the New AWS Application Migration Service for Lift-and-Shift Migrations, Channy introduced us to Application Migration Service and how to get started.

By using Application Migration Service for migration, you can minimize time-intensive, error-prone manual processes by automating replication and conversion of your source servers from physical, virtual, or cloud infrastructure to run natively on AWS. Last year, we introduced major improvements such as new migration servers grouping, an account-level launch template, and a post-launch actions template.

Today, I’m pleased to announce three major updates of Application Migration Service. Here’s the quick summary for each feature release:

  • Import and export – You can now use Application Migration Service to import your source environment inventory list to the service from a CSV file. You can also export your source server inventory for reporting purposes, offline reviews and updates, integration with other tools and AWS services, and performing bulk configuration changes by reimporting the inventory list.
  • Server migration metrics dashboard – This new dashboard can help simplify migration project management by providing an aggregated view of the migration lifecycle status of your source servers.
  • Additional post-launch modernization actions – In this update, Application Migration Service added eight additional predefined post-launch actions. These actions are applied to your migrated applications when you launch them on AWS.

Let me share how you can use these features for your migration.

Import and Export
Before we go further into the import and export features, let’s discuss two concepts that you can define when migrating with Application Migration Service: applications and waves. An application represents a group of servers; by defining applications, you can identify sets of related servers and work with them as a unit. Within an application, you can perform various activities with Application Migration Service, such as monitoring, specifying tags, and performing bulk operations, for example, launching test instances. Additionally, you can group your applications into waves, which represent groups of servers that are migrated together as part of your migration plan.

With the import feature, you can now import your inventory list in CSV form into Application Migration Service. This makes it easy for you to manage large-scale migrations and ingest your inventory of source servers, applications, and waves, including their attributes.

To start using the import feature, I need to identify my server and application inventory. I can do this manually or by using discovery tools. The next thing I need to do is download the import template, which I can access from the console.

After I download the import template, I can start mapping my inventory list into it. While mapping my inventory, I can group related servers into applications and waves. I can also perform configurations, such as defining Amazon Elastic Compute Cloud (Amazon EC2) launch template settings and specifying tags for each wave.

The following screenshot is an example of the results of my import template:

The next step is to upload my CSV file to an Amazon Simple Storage Service (Amazon S3) bucket. Then, I can start the import process from the Application Migration Service console by referencing the CSV file containing my inventory list that I’ve uploaded to the S3 bucket.
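The console drives this flow end to end, but the import can also be started from the AWS CLI. A minimal sketch, assuming hypothetical bucket and key names (check the start-import parameters against the service documentation):

$ aws s3 cp inventory.csv s3://my-mgn-import/inventory.csv
$ aws mgn start-import \
      --s3-bucket-source s3Bucket=my-mgn-import,s3Key=inventory.csv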

When the import process is complete, I can see the details of the import results.

I can import inventory for servers that don’t have an agent installed, or haven’t yet been discovered by agentless replication. However, to replicate data, I need to use agentless replication, or install the AWS Replication Agent on my source servers.

Now I can view all my inventory inside the Source servers, Applications and Waves pages on the Application Migration Service console. The following is a screenshot for recently imported waves.

In addition, with the export feature, I can export my source servers, applications, and waves along with all configurations that I’ve defined into a CSV file.

This is helpful if you want to do reporting or offline reviews, or for bulk editing before reimporting the CSV file into Application Migration Service.
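The export can be scripted the same way. A minimal sketch with placeholder names:

$ aws mgn start-export \
      --s3-bucket my-mgn-export \
      --s3-key inventory-export.csv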

Server Migration Metrics Dashboard
We previously supported a migration metrics dashboard for applications and waves. In this release, we have specifically added a migration metrics dashboard for servers. Now you can view aggregated overviews of your server’s migration process on the Application Migration Service dashboard. Three topics are available in the migration metrics dashboard:

  • Alerts – Shows associated alerts for the respective servers.
  • Data replication status – Shows an overview of the data replication status for your source servers, giving you a quick view of the lifecycle status of the replication process.
  • Migration lifecycle – Shows an overview of the migration lifecycle of your source servers.

Additional Predefined Post-launch Modernization Actions
Post-launch actions allow you to control and automate actions performed after your servers have been launched in AWS. You can use predefined or custom post-launch actions.

Application Migration Service now has eight additional predefined post-launch actions to run in your EC2 instances on top of the existing four predefined post-launch actions. These additional post-launch actions provide you with flexibility to maximize your migration experience.

The new additional predefined post-launch actions are as follows:

  • Convert MS-SQL license – You can easily convert Windows MS-SQL BYOL to an AWS license using the Windows MS-SQL license conversion action. The launch process includes checking the SQL edition (Enterprise, Standard, or Web) and using the right AMI with the right billing code.
  • Create AMI from instance – You can create a new Amazon Machine Image (AMI) from your Application Migration Service launched instance.
  • Upgrade Windows version – This feature allows you to easily upgrade your migrated server to Windows Server 2012 R2, 2016, 2019, or 2022. You can see the full list of available OS versions on the AWSEC2-CloneInstanceAndUpgradeWindows page.
  • Conduct EC2 connectivity checks – You can conduct network connectivity checks to a predefined list of ports and hosts using the EC2 connectivity check feature.
  • Validate volume integrity – You can use this feature to ensure that Amazon Elastic Block Store (Amazon EBS) volumes on the launched instance are the same size as the source, properly mounted on the EC2 instance, and accessible.
  • Verify process status – You can validate the process status to ensure that processes are in a running state after instance launch. You will need to provide a list of processes that you want to verify and specify how long the service should wait before testing begins. This feature automates the needed validations, saving you from having to perform them manually.
  • CloudWatch agent installation – Use the Amazon CloudWatch agent installation feature to install and set up the CloudWatch agent and Application Insights features.
  • Join Directory Service domain – You can simplify the AWS join domain process by using this feature. If you choose to activate this action, your instance will be managed by the AWS Cloud Directory (instead of on premises).

Things to Know
Keep in mind the following:

  • Updated UI/UX – We have updated the user interface with card layout and table layout views for the action list on the Application Migration Service console. This update helps you determine which post-launch actions are suitable for your use case. We have also added filter options to make it easy to find relevant actions by operating system, category, and more.
  • Support for additional OS versions – Application Migration Service now supports CentOS 5.5 and later and Red Hat Enterprise Linux (RHEL) 5.5 and later operating systems.
  • Availability – These features are available now, and you can start using them today in all Regions where Application Migration Service is supported.

Get Started Today

Visit the Application Migration Service User Guide page to learn more about these features and understand the pricing. You can also visit Getting started with AWS Application Migration Service to learn more about how to get started to migrate your workloads.

Happy migrating!

Donnie

AWS Week in Review – March 27, 2023

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

In Finland, where I live, spring has arrived. The snow has melted, and the trees have grown their first buds. But I don’t get my hopes up, as usually around Easter we have what is called takatalvi. Takatalvi is a Finnish word for winter’s unexpected return in the spring.

Last Week’s Launches
Here are some launches that got my attention during the previous week.

AWS SAM CLI – Now the sam sync command will compare your local Serverless Application Model (AWS SAM) template with your deployed AWS CloudFormation template and skip the deployment if there are no changes. For more information, check the latest version of the AWS SAM CLI.

IAM – AWS Identity and Access Management (IAM) has launched two new global condition context keys. With these new condition keys, you can write service control policies (SCPs) or IAM policies that restrict the VPCs and private IP addresses from which your Amazon Elastic Compute Cloud (Amazon EC2) instance credentials can be used, without hard-coding VPC IDs or IP addresses in the policy. To learn more about this launch and how to get started, see How to use policies to restrict where EC2 instance credentials can be used from.
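As a concrete illustration, here is a sketch of a deny statement using one of the new keys; the VPC ID is a placeholder, and a production policy would need the exceptions described in the linked post (for example, for requests made on your behalf by AWS services):

$ cat > restrict-ec2-creds.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyEC2CredsOutsideMyVPC",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {"aws:EC2InstanceSourceVPC": "vpc-0123456789abcdef0"},
        "Null": {"ec2:SourceInstanceARN": "false"}
      }
    }
  ]
}
EOF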

Amazon SNS – Amazon Simple Notification Service (Amazon SNS) now supports setting content-type request headers for HTTP/S notifications, such as application/json, application/xml, or text/plain. With this new feature, applications can receive their notifications in a more predictable format.

AWS Batch – AWS Batch now allows you to configure ephemeral storage up to 200 GiB on AWS Fargate type jobs. With this launch, you no longer need to limit the size of your datasets or the size of the Docker images to run machine learning inference.
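For reference, a sketch of how the new setting might appear when registering a Fargate job definition; the image, role, and resource values are placeholders:

$ aws batch register-job-definition \
      --job-definition-name ml-inference \
      --type container \
      --platform-capabilities FARGATE \
      --container-properties '{
        "image": "public.ecr.aws/example/inference:latest",
        "resourceRequirements": [
          {"type": "VCPU", "value": "1"},
          {"type": "MEMORY", "value": "2048"}
        ],
        "executionRoleArn": "arn:aws:iam::111122223333:role/ecsTaskExecutionRole",
        "ephemeralStorage": {"sizeInGiB": 200}
      }'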

Application Load Balancer – Application Load Balancer (ALB) now supports Transport Layer Security (TLS) protocol version 1.3, enabling you to optimize the performance of your application while keeping it secure. TLS 1.3 on ALB works by offloading encryption and decryption of TLS traffic from your application server to the load balancer.
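Enabling TLS 1.3 is a matter of switching your HTTPS listener to one of the new TLS 1.3 security policies; a sketch with a placeholder listener ARN:

$ aws elbv2 modify-listener \
      --listener-arn arn:aws:elasticloadbalancing:us-east-1:111122223333:listener/app/my-alb/0123456789abcdef/0123456789abcdef \
      --ssl-policy ELBSecurityPolicy-TLS13-1-2-2021-06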

Amazon IVS – Amazon Interactive Video Service (IVS) now supports combining videos from multiple hosts into the source of a live stream. For a demo, refer to Add multiple hosts to live streams with Amazon IVS.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news that you may have missed:

I read the post Implementing an event-driven serverless story generation application with ChatGPT and DALL-E a few days ago, and since then I have been reading my child a lot of AI-generated stories. In this post, David Boyne explains step by step how you can create an event-driven serverless story generation application. This application produces a brand-new story every day at bedtime, with images, which can be played in audio format.

Podcast Charlas Técnicas de AWS – If you understand Spanish, this podcast is for you. Podcast Charlas Técnicas is one of the official AWS podcasts in Spanish, and every other week there is a new episode. The podcast is meant for builders, and it shares stories about how customers have implemented AWS services and what they learned along the way, how to architect applications, and how to use new services. You can listen to all the episodes directly from your favorite podcast app or at AWS Podcasts en español.

AWS open-source news and updates – The open source newsletter is curated by my colleague Ricardo Sueiras to bring you the latest open-source projects, posts, events, and more.

Upcoming AWS Events
Check your calendars and sign up for the AWS Summit closest to your city. AWS Summits are free events that bring the local community together, where you can learn about different AWS services.

Here are the ones coming up in the next months:

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia

AWS Clean Rooms Now Generally Available — Collaborate with Your Partners without Sharing Raw Data

Companies across multiple industries, such as advertising and marketing, retail, consumer packaged goods (CPG), travel and hospitality, media and entertainment, and financial services, increasingly look to supplement their data with data from business partners, to build a complete view of their business.

Let’s take a marketing use case as an example. Brands, publishers, and their partners need to collaborate using datasets that are stored across many channels and applications to improve the relevance of their campaigns and better engage with consumers. At the same time, they also want to protect sensitive consumer information and eliminate the sharing of raw data. Data clean rooms can help solve this challenge by allowing multiple companies to analyze their collective data in a private environment.

However, it’s difficult to build data clean rooms. It requires complex privacy controls, specialized tools to protect each collaborator’s data, and months of development time customizing analytics tools. The effort and complexity grows when a new collaborator is added, or a different type of analysis is needed, as companies have to spend even more development time. Finally, companies prefer to limit data movement as much as possible, usually leading to less collaboration and missed opportunities to generate new business insights.

Introducing AWS Clean Rooms
Today, I’m excited to announce the general availability of AWS Clean Rooms, which we first announced at AWS re:Invent 2022 and released in preview in January 2023. AWS Clean Rooms is an analytics service of AWS Applications that helps companies and their partners more easily and securely analyze and collaborate on their collective datasets without sharing or copying each other’s data. AWS Clean Rooms enables customers to generate unique insights about advertising campaigns, investment decisions, clinical research, and more, while helping them protect data.

Now, with AWS Clean Rooms, companies are able to easily create a secure data clean room on the AWS Cloud in minutes and collaborate with their partners. They can use a broad set of built-in, privacy-enhancing controls for clean rooms. These controls allow companies to customize restrictions on the queries run by each clean room participant, including query controls, query output restrictions, and query logging. AWS Clean Rooms also includes advanced cryptographic computing tools that keep data encrypted—even as queries are processed—to help comply with stringent data handling policies.

Key Features of AWS Clean Rooms
Let me share with you the key features and how easy it is to collaborate with AWS Clean Rooms.

Create Your Own Clean Rooms
AWS Clean Rooms helps you to start a collaboration in minutes and then select the other companies you want to collaborate with. You can collaborate with any of your partners that agree to participate in your clean room collaboration. You can create a collaboration by following several steps.

After creating a collaboration in AWS Clean Rooms, you can select additional collaboration members who can contribute. Currently, AWS Clean Rooms supports up to five collaboration members, including you as the collaboration creator.

The next step is to define which collaboration members can run queries in the collaboration by using the member abilities setting.

Then, collaboration members will get notifications in their accounts, see detailed information about the collaboration, and decide whether to join it by selecting Create membership in their AWS Clean Rooms dashboard.
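These steps can also be scripted. A minimal sketch of creating a collaboration with the AWS CLI, using placeholder names and account IDs (check the parameters against the AWS Clean Rooms documentation):

$ aws cleanrooms create-collaboration \
      --name "campaign-analysis" \
      --description "Ad campaign overlap analysis" \
      --creator-display-name "Brand" \
      --creator-member-abilities CAN_QUERY CAN_RECEIVE_RESULTS \
      --members '[{"accountId": "222233334444", "displayName": "Publisher", "memberAbilities": []}]' \
      --query-log-status ENABLED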

Collaborate without Moving Data Outside AWS
AWS Clean Rooms works by analyzing Amazon S3 data in place. This eliminates the need for collaboration members to copy and load their data into destinations outside their respective AWS environments or to use third-party services.

Each collaboration member can create configured tables, an AWS Clean Rooms resource that contains a reference to an AWS Glue Data Catalog table with the underlying data and defines how that data can be used. A configured table can be used across many collaborations.

Protecting Data
AWS Clean Rooms provides you with a broad set of privacy-enhancing controls to protect your customers’ and partners’ data. Each collaboration member has the flexibility to determine what columns can be accessed in a collaboration.

In addition to column-level privacy controls, as in the example above, AWS Clean Rooms also provides fine-grained query controls called analysis rules. With built-in and flexible analysis rules, customers can tailor queries to specific business needs. AWS Clean Rooms provides two types of analysis rules for customers to use:

  • Aggregation analysis rules allow queries that produce aggregate statistics, using COUNT, SUM, and AVG functions along optional dimensions, without revealing user-level information.
  • List analysis rules allow queries that output user-level attributes for the overlap between the data owner’s table and the tables of the member who can query.

Both analysis rule types allow data owners to require a join between their datasets and the datasets of the collaborator running the query. This limits the results to the intersection of the collaborators’ datasets.

After defining the analysis rules, the member who can query and receive results can start writing queries according to the restrictions defined by each participating collaboration member. The following is an example query in the collaboration.

Analysis rules allow collaboration members to restrict the types of queries that can be performed against their datasets and the usable output of the query results. The following screenshot is an example of a query that fails because it does not satisfy the analysis rule: the hashed_email column cannot be used in SELECT queries.

Full Programmatic Access
Any functionality offered by AWS Clean Rooms can also be accessed via the API using AWS SDKs or AWS CLI. This makes it easier for you to integrate AWS Clean Rooms into your products or workflows. This programmatic access also unlocks the opportunity for you to host clean rooms for your customers with your own branding.

Query Logging
This feature allows collaboration members to review and audit the queries that use their datasets to make sure data is being used as intended. With query logging, both the collaboration member who runs queries and the other members whose data is part of a query can receive logs, provided they enable query logging.

If this feature is enabled, query logs are written to Amazon CloudWatch Logs in each collaboration member’s account. You can access a summary of the queries logged in the last 7 days from the collaboration dashboard.
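Because the logs are delivered to CloudWatch Logs, you can also inspect them with the standard tooling. A sketch, with a hypothetical log group name (look up the actual group created in your account):

$ aws logs filter-log-events \
      --log-group-name /aws/cleanrooms/my-collaboration \
      --start-time $(($(date +%s) - 7*24*3600))000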

Cryptographic Computing
With this feature, you have the option to perform client-side encryption for sensitive data with cryptographic computing. You can encrypt your dataset to add a protection layer, and the data will use a cryptographic computing protocol called private-set intersection to keep data encrypted even as the query runs.

To use the cryptographic computing feature, you need to download and use the Cryptographic Computing for Clean Rooms (C3R) encryption client to encrypt and decrypt your data. C3R keeps your data cryptographically protected while in use in AWS Clean Rooms. C3R supports a subset of SQL queries, including JOIN, SELECT, GROUP BY, COUNT, and other supported statements on cryptographically protected data.

The following image shows how you can enable cryptographic computing when creating a collaboration:

Customer Voices
During the preview period, we heard lots of feedback from our customers about AWS Clean Rooms. Here’s what our customers say:

Comscore is a measurement and analytics company that brings trust and transparency to media. Brian Pugh, Chief Information Officer at Comscore, said, “As advertisers and marketers adapt to deliver relevant campaigns leveraging their combined datasets while protecting consumer data, Comscore’s Media Metrix suite, powered by Unified Digital Measurement 2.0 and Campaign Ratings services, will continue to support critical measurement and planning needs with services like AWS Clean Rooms. AWS Clean Rooms will enable new methods of collaboration among media owners, brands, or agency customers through customized data access controls managed and set by each data owner without needing to share underlying data.”

DISH Media is a leading TV provider that offers over-the-top IPTV service. “At DISH Media, we empower brands and agencies to run their own analyses of prior campaigns to allow for flexibility, visibility, and ease in optimizing future campaigns to reach DISH Media’s 31 million consumers. With AWS Clean Rooms, we believe advertisers will benefit from the ease of use of these services with their analysis, including data access and security controls,” said Kemal Bokhari, Head of Data, Measurement, and Analytics at DISH Media.

Fox Corporation is a leading producer and distributor of ad-supported content through its sports, news, and entertainment brands. Lindsay Silver, Senior Vice President of Data and Commercial Technology at Fox Corporation, said, “It can be challenging for our advertising clients to figure out how to best leverage more data sources to optimize their media spend across their combined portfolio of entertainment, sports, and news brands which reach 200 million monthly viewers. We are excited to use AWS Clean Rooms to enable data collaborations easily and securely in the AWS Cloud that will help our advertising clients unlock new insights across every Fox brand and screen while protecting consumer data.”

Amazon Marketing Cloud (AMC) is a secure, privacy-safe clean room application from Amazon Ads that supports thousands of marketers with custom analytics and cross-channel analysis.

“Providing marketers with greater control over their own signals while being able to analyze them in conjunction with signals from Amazon Ads is crucial in today’s marketing landscape. By migrating AMC’s compute infrastructure to AWS Clean Rooms under the hood, marketers can use their own signals in AMC without storing or maintaining data outside of their AWS environment. This simplifies how marketers can manage their signals and enables AMC teams to focus on building new capabilities for brands,” said Paula Despins, Vice President of Ads Measurement at Amazon Ads.


Availability
AWS Clean Rooms is generally available in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), and Europe (Stockholm).

Pricing & Free Tier
AWS Clean Rooms measures compute capacity in Clean Rooms Processing Units (CRPUs). You pay only for the compute capacity of the queries that you run, measured in CRPU-hours and billed on a per-second basis (with a 60-second minimum charge). AWS Clean Rooms automatically scales up or down to meet your query workload demands and shuts down during periods of inactivity, saving you administration time and costs. The AWS Clean Rooms free tier provides 9 CRPU-hours per month for the first 12 months for each new customer.
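To make the billing model concrete: under these rules, a query that consumes 4 CRPUs for 90 seconds is billed as 4 × 90 / 3,600 = 0.1 CRPU-hours, while a query that completes in 20 seconds is still billed for the 60-second minimum.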

AWS Clean Rooms helps companies and their partners more easily and securely analyze and collaborate on their collective datasets without sharing or copying each other’s data. Learn more about benefits, use cases, how to get started, and pricing details on the AWS Clean Rooms page.

Happy collaborating!

Donnie

AWS Chatbot Now Integrates With Microsoft Teams

I am pleased to announce that, starting today, you can use AWS Chatbot to troubleshoot and operate your AWS resources from Microsoft Teams.

Communicating and collaborating on IT operation tasks through chat channels is known as ChatOps. It allows you to centralize the management of infrastructure and applications, as well as to automate and streamline your workflows. It helps to provide a more interactive and collaborative experience, as you can communicate and work with your colleagues in real time through a familiar chat interface to get the job done.

We launched AWS Chatbot in 2020 with Amazon Chime and Slack integrations. Since then, the landscape of chat platforms has evolved rapidly, and many of you are now using Microsoft Teams.

AWS Chatbot Benefits
When using AWS Chatbot for Microsoft Teams or other chat platforms, you receive notifications from AWS services directly in your chat channels, and you can take action on your infrastructure by typing commands without having to switch to another tool.

Typically you want to receive alerts about your system health, your budget, any new security threat or risk, or the status of your CI/CD pipelines. Sending a message to the chat channel is as simple as sending a message on an Amazon Simple Notification Service (Amazon SNS) topic. Thanks to the native integration between Amazon CloudWatch alarms and SNS, alarms are automatically delivered to your chat channels with no additional configuration step required. Similarly, thanks to the integration between Amazon EventBridge and SNS, any system or service that emits events to EventBridge can send information to your chat channels.
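For example, any script that can publish to the topic can post to the channel. A sketch with a placeholder account ID (the topic name matches the one used later in this demo):

$ aws sns publish \
      --topic-arn arn:aws:sns:us-east-1:111122223333:alarmme \
      --message "Deployment of build 1234 completed successfully."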

But ChatOps is more than the ability to spot problems as they arise. AWS Chatbot allows you to receive predefined CloudWatch dashboards interactively and retrieve Logs Insights logs to troubleshoot issues directly from the chat thread. You can also type most AWS Command Line Interface (AWS CLI) commands directly in the chat channel to retrieve additional telemetry data or resource information or to run runbooks to remediate issues.

Typing and remembering long commands is difficult. With AWS Chatbot, you can define your own aliases to reference frequently used commands and their parameters. It reduces the number of steps to complete a task. Aliases are flexible and can contain one or more custom parameters injected at the time of the query.

And because chat channels are designed for conversation, you can also ask questions in natural language and have AWS Chatbot answer you with relevant extracts from the AWS documentation or support articles. Natural language understanding also allows you to make queries such as “show me my ec2 instances in eu-west-3.”

Let’s Configure the Integration Between AWS Chatbot and Microsoft Teams
Getting started is a two-step process. First, I configure my team in Microsoft Teams. As a Teams administrator, I add the AWS Chatbot application to the team, and I take note of the URL of the channel I want to use for receiving notifications and operating AWS resources from Microsoft Teams channels.

Second, I register Microsoft Teams channels in AWS Chatbot. I also assign IAM permissions that control what channel members can do in the channel and associate SNS topics to receive notifications. I may configure AWS Chatbot with the AWS Management Console, an AWS CloudFormation template, or the AWS Cloud Development Kit (AWS CDK). For this demo, I choose to use the console.

I open the Management Console and navigate to the AWS Chatbot section. On the top right side of the screen, in the Configure a chat client box, I select Microsoft Teams and then Configure client.

I enter the Microsoft Teams channel URL I noted in the Teams app.

Add the team channel URL to Chatbot

At this stage, Chatbot redirects my browser to Microsoft Teams for authentication. If I am already authenticated, I will be redirected back to the AWS console immediately. Otherwise, I enter my Microsoft Teams credentials and one-time password and wait to be redirected.

At this stage, my Microsoft Teams team is registered with AWS Chatbot and ready to add Microsoft Teams channels. I select Configure new channel.

Chatbot is now linked to your Microsoft Teams

There are four sections to enter the details of the configuration. In the first section, I enter a Configuration name for my channel. Optionally, I also define the Logging details. In the second section, I paste—again—the Microsoft Teams Channel URL.

Configure chatbot section one and two

In the third section, I configure the Permissions. I can choose between the same set of permissions for all Microsoft Teams users in my team, or I can set User-level roles permission to enable user-specific permissions in the channel. In this demo, I select Channel role, and I assign an IAM role to the channel. The role defines the permissions shared by all users in the channel. For example, I can assign a role that allows users to access configuration data from Amazon EC2 but not from Amazon S3. Under Channel role, I select Use an existing IAM role. Under Existing role, I select a role I created for my 2019 re:Invent talk about ChatOps: chatbot-demo. This role gives read-only access to all AWS services, but I could also assign other roles that would allow Chatbot users to take actions on their AWS resources.

To mitigate the risk that another person in your team accidentally grants more than the necessary privileges to the channel or user-level roles, you might also include Channel guardrail policies. These are the maximum permissions your users might have when using the channel. At runtime, the actual permissions are the intersection of the channel or user-level policies and the guardrail policies. Guardrail policies act like a boundary that channel users will never escape. The concept is similar to permission boundaries for IAM entities or service control policies (SCP) for AWS Organizations. In this example, I attach the ReadOnlyAccess managed policy.

Configure chatbot section three

The fourth and last section allows you to specify the SNS topic that will be the source for notifications sent to your team’s channel. Your applications or AWS services, such as CloudWatch alarms, can send messages to this topic, and AWS Chatbot will relay all messages to the configured Microsoft Teams channel. Thanks to the integration between Amazon EventBridge and SNS, any application able to send a message to EventBridge is able to send a message to Microsoft Teams.

For this demo, I select an existing SNS topic: alarmme in the us-east-1 Region. You can configure multiple SNS topics to receive alarms from various Regions. I then select Configure.

Configure chatbot section four

Let’s Test the Integration
That’s it. Now I am ready to test my setup.

On the AWS Chatbot configuration page, I first select Send test message. I also have an alarm defined for when my estimated billing goes over $500. In the CloudWatch section of the Management Console, I configure the alarm to post a message on the SNS topic shared with Microsoft Teams.

Within seconds, I receive the test message and the alarm message on the Microsoft Teams channel.

AWS Chatbot with Microsoft Teams, first messages received on the channel

Then I type a command to investigate where the billing alarm comes from. I want to know how many EC2 instances are running.

On the chat client channel, I type @aws to select Chatbot as the destination, then the rest of the CLI command, as I would do in a terminal: ec2 describe-instances --region us-east-1 --filters "Name=architecture,Values=arm64_mac" --query "Reservations[].Instances[].InstanceId"

Chatbot answers within seconds.

AWS chatbot describe instances

I can create aliases for commands I frequently use. Aliases may have placeholder parameters that I can give at runtime, such as the Region name for example.

I create an alias to get the list of my macOS instance IDs with the command: aws alias create mac ec2 describe-instances --region $region --filters "Name=architecture,Values=arm64_mac" --query "Reservations[].Instances[].InstanceId"

Now, I can type @aws alias run mac us-east-1 as a shortcut to get the same result as above. I can also manage my aliases with the @aws alias list, @aws alias get, and @aws alias delete commands.

I don’t know about you, but for me it is hard to remember commands. When I use the terminal, I rely on auto-complete to remind me of various commands and their options. AWS Chatbot offers similar command completion and guides me to collect missing parameters.

AWS Chatbot command completion

When using AWS Chatbot, I can also ask questions using natural English language. It can help to find answers from the AWS docs and from support articles by typing questions such as @aws how can I tag my EC2 instances? or @aws how do I configure Lambda concurrency setting?

It can also find resources in my account when AWS Resource Explorer is activated. For example, I asked the bot: @aws what are the tags for my ec2 resources? and @aws what Regions do I have Lambda service?

And I received these responses.

AWS Chatbot NLP Response 1

AWS Chatbot NLP Response 2

Thanks to AWS Chatbot, I realized that I had a rogue Lambda function left in ca-central-1. I used the AWS console to delete it.

Available Now
You can start to use AWS Chatbot with Microsoft Teams today. AWS Chatbot for Microsoft Teams is available to download from the Microsoft Teams app store at no additional cost. AWS Chatbot is available in all public AWS Regions, at no additional charge. You pay for the underlying resources that you use. You might incur charges from your chat client.

Get started today and configure your first integration with Microsoft Teams.

— seb

Amazon Linux 2023, a Cloud-Optimized Linux Distribution with Long-Term Support

I am excited to announce the general availability of Amazon Linux 2023 (AL2023). AWS has provided you with a cloud-optimized Linux distribution since 2010. This is the third generation of our Amazon Linux distributions.

Every generation of Amazon Linux distribution is secured, optimized for the cloud, and receives long-term AWS support. We built Amazon Linux 2023 on these principles, and we go even further. Deploying your workloads on Amazon Linux 2023 gives you three major benefits: a high-security standard, a predictable lifecycle, and a consistent update experience.

Let’s look at security first. Amazon Linux 2023 includes preconfigured security policies that make it easy for you to implement common industry guidelines. You can configure these policies at launch time or run time.

For example, you can configure the system crypto policy to enforce system-wide usage of a specific set of cipher suites, TLS versions, or acceptable parameters in certificates and key exchanges. Also, the Linux kernel has many hardening features enabled by default.

Amazon Linux 2023 makes it easier to plan and manage the operating system lifecycle. New Amazon Linux major versions will be available every two years. Major releases include new features and improvements in security and performance across the stack. The improvements might include major changes to the kernel, toolchain, glibc, OpenSSL, and any other system libraries and utilities.

During those two years, a major release will receive an update every three months. These updates include security updates, bug fixes, and new features and packages. Each minor version is a cumulative list of updates that includes security and bug fixes in addition to new features and packages. These releases might include the latest language runtimes such as Python or Java. They might also include other popular software packages such as Ansible and Docker. In addition to these quarterly updates, security updates will be provided as soon as they are available.

Each major version, including 2023, will come with five years of long-term support. After the initial two-year period, each major version enters a three-year maintenance period. During the maintenance period, it will continue to receive security bug fixes and patches as soon as they are available. This support commitment gives you the stability you need to manage long project lifecycles.

The following diagram illustrates the lifecycle of Amazon Linux distributions:

Last—and this policy is by far my favorite—Amazon Linux provides you with deterministic updates through versioned repositories, a flexible and consistent update mechanism. The distribution locks to a specific version of the Amazon Linux package repository, giving you control over how and when you absorb updates. By default, and in contrast with Amazon Linux 2, a dnf update command will not update your installed packages (dnf is the successor to yum). This helps to ensure that you are using the same package versions across your fleet. All Amazon Elastic Compute Cloud (Amazon EC2) instances launched from an Amazon Machine Image (AMI) will have the same version of packages. Deterministic updates also promote usage of immutable infrastructure, where no infrastructure is updated after deployment. When an update is required, you update your infrastructure as code scripts and redeploy a new infrastructure. Of course, if you really want to update your distribution in place, you can point dnf to an updated package repository and update your machine as you do today. But did I tell you this is not a good practice for production workloads? I’ll share more technical details later in this blog post.

How to Get Started
Getting started with Amazon Linux 2023 is no different than with other Linux distributions. You can use the EC2 run-instances API, the AWS Command Line Interface (AWS CLI), or the AWS Management Console, and one of the four Amazon Linux 2023 AMIs that we provide. We support two machine architectures (x86_64 and Arm) and two sizes (standard and minimal). Minimal AMIs contain the most basic tools and utilities to start the OS. The standard version comes with the most commonly used applications and tools installed.

To retrieve the latest AMI ID for a specific Region, you can use the AWS Systems Manager get-parameter API and query the /aws/service/ami-amazon-linux-latest/<alias> parameter.

Be sure to replace <alias> with one of the four aliases available:

  • For arm64 architecture (standard AMI): al2023-ami-kernel-default-arm64
  • For arm64 architecture (minimal AMI): al2023-ami-minimal-kernel-default-arm64
  • For x86_64 architecture (standard AMI): al2023-ami-kernel-default-x86_64
  • For x86_64 architecture (minimal AMI): al2023-ami-minimal-kernel-default-x86_64

For example, to search for the latest Arm64 full distribution AMI ID, I open a terminal and enter:

~ aws ssm get-parameters --region us-east-2 --names /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-arm64
{
    "Parameters": [
        {
            "Name": "/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-arm64",
            "Type": "String",
            "Value": "ami-02f9b41a7af31dded",
            "Version": 1,
            "LastModifiedDate": "2023-02-24T22:54:56.940000+01:00",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-arm64",
            "DataType": "text"
        }
    ],
    "InvalidParameters": []
}

To launch an instance, I use the run-instances API. Notice how I use Systems Manager resolution to dynamically look up the AMI ID from the CLI.

➜ aws ec2 run-instances \
       --image-id resolve:ssm:/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-arm64 \
       --key-name my_ssh_key_name \
       --instance-type c6g.medium \
       --region us-east-2
{
    "Groups": [],
    "Instances": [
        {
          "AmiLaunchIndex": 0,
          "ImageId": "ami-02f9b41a7af31dded",
          "InstanceId": "i-0740fe8e23f903bd2",
          "InstanceType": "c6g.medium",
          "KeyName": "my_ssh_key_name",
          "LaunchTime": "2023-02-28T14:12:34+00:00",

...(redacted for brevity)
}

When the instance is launched, and if the associated security group allows SSH (TCP 22) connections, I can connect to the machine:

~ ssh ec2-user@3.145.19.213
Warning: Permanently added '3.145.19.213' (ED25519) to the list of known hosts.
   ,     #_
   ~_  ####_        Amazon Linux 2023
  ~~  _#####       Preview
  ~~     ###|
  ~~       #/ ___   https://aws.amazon.com/linux/amazon-linux-2023
   ~~       V~' '->
    ~~~         /
      ~~._.   _/
         _/ _/
       _/m/'
Last login: Tue Feb 28 14:14:44 2023 from 81.49.148.9
[ec2-user@ip-172-31-9-76 ~]$ uname -a
Linux ip-172-31-9-76.us-east-2.compute.internal 6.1.12-19.43.amzn2023.aarch64 #1 SMP Thu Feb 23 23:37:18 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

We also distribute Amazon Linux 2023 as Docker images. The Amazon Linux 2023 container image is built from the same software components that are included in the Amazon Linux 2023 AMI. The container image is available for use in any environment as a base image for Docker workloads. If you’re using Amazon Linux for applications in EC2, you can containerize your applications with the Amazon Linux container image.

These images are available from Amazon Elastic Container Registry (Amazon ECR) and from Docker Hub. Here is a quick demo to start a Docker container using Amazon Linux 2023 from Elastic Container Registry.

$ aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
Login Succeeded
~ docker run --rm -it public.ecr.aws/amazonlinux/amazonlinux:2023 /bin/bash
Unable to find image 'public.ecr.aws/amazonlinux/amazonlinux:2023' locally
2023: Pulling from amazonlinux/amazonlinux
b4265814d5cf: Pull complete 
Digest: sha256:bbd7a578cff9d2aeaaedf75eb66d99176311b8e3930c0430a22e0a2d6c47d823
Status: Downloaded newer image for public.ecr.aws/amazonlinux/amazonlinux:2023
bash-5.2# uname -a 
Linux 9d5b45e9f895 5.15.49-linuxkit #1 SMP PREEMPT Tue Sep 13 07:51:32 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
bash-5.2# exit 

When pulling from Docker Hub, you can use this command to pull the image: docker pull amazonlinux:2023.

What Are the Main Differences Compared to Amazon Linux 2?
Amazon Linux 2023 has some differences compared to Amazon Linux 2. The documentation explains these differences in detail. The two differences I would like to focus on are dnf and the package management policies.

AL2023 comes with Fedora’s dnf, the successor to yum. But don’t worry: dnf provides commands similar to yum’s to search for, install, or remove packages. Where you used to run yum list or yum install httpd, you can now run dnf list or dnf install httpd. For convenience, we create a symlink for /usr/bin/yum, so you can run your scripts unmodified.

$ which yum
/usr/bin/yum
$ ls -al /usr/bin/yum
lrwxrwxrwx. 1 root root 5 Jun 19 18:06 /usr/bin/yum -> dnf-3

The biggest difference, in my opinion, is the deterministic updates through versioned repositories. By default, the software repository is locked to the AMI version. This means that a dnf update command will not return any new packages to install. Versioned repositories give you the assurance that all machines started from the same AMI ID are identical. Your infrastructure will not deviate from the baseline.

$ sudo dnf update 
Last metadata expiration check: 0:14:10 ago on Tue Feb 28 14:12:50 2023.
Dependencies resolved.
Nothing to do.
Complete!

Yes, but what if you want to update a machine? You have two options to update an existing machine. The cleanest one for your production environment is to create duplicate infrastructure based on new AMIs. As I mentioned earlier, we publish updates for every security fix and a consolidated update every three months for two years after the initial release. Each update is provided as a set of AMIs and their corresponding software repository.

For smaller infrastructure, such as test or development machines, you might choose to update the operating system or individual packages in place as well. This is a three-step process:

  • first, list the available updated software repositories;
  • second, point dnf to a specific software repository;
  • and third, update your packages.

To show you how it works, I purposely launched an EC2 instance with an “old” version of Amazon Linux 2023 from February 2023. I first run dnf check-release-update to list the available updated software repositories.

$ dnf check-release-update
WARNING:
  A newer release of "Amazon Linux" is available.

  Available Versions:

  Version 2023.0.20230308:
    Run the following command to upgrade to 2023.0.20230308:

      dnf upgrade --releasever=2023.0.20230308

    Release notes:
     https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes.html

Then, I might either update the full distribution using dnf upgrade --releasever=2023.0.20230308 or point dnf to the updated repository to select individual packages.

$ dnf check-update --releasever=2023.0.20230308

Amazon Linux 2023 repository                                                    28 MB/s |  11 MB     00:00
Amazon Linux 2023 Kernel Livepatch repository                                  1.2 kB/s | 243  B     00:00

amazon-linux-repo-s3.noarch                          2023.0.20230308-0.amzn2023                amazonlinux
binutils.aarch64                                     2.39-6.amzn2023.0.5                       amazonlinux
ca-certificates.noarch                               2023.2.60-1.0.amzn2023.0.1                amazonlinux
(redacted for brevity)
util-linux-core.aarch64                              2.37.4-1.amzn2022.0.1                     amazonlinux

Finally, I might run a dnf update <package_name> command to update a specific package.

This might look like overkill for a simple machine, but when managing enterprise infrastructure or large-scale fleets of instances, this facilitates the management of your fleet by ensuring that all instances run the same version of software packages. It also means that the AMI ID is now something that you can fully run through your CI/CD pipelines for deployment and that you have a way to roll AMI versions forward and backward according to your schedule.

Where is Fedora?
When looking for a base to serve as a starting point for Amazon Linux 2023, Fedora was the best choice. We found that Fedora’s core tenets (Freedom, Friends, Features, First) resonate well with our vision for Amazon Linux. However, Amazon Linux focuses on a long-term, stable OS for the cloud, a notably different release cycle and lifecycle than Fedora’s. Amazon Linux 2023 provides updated versions of open-source software, a larger variety of packages, and frequent releases.

Amazon Linux 2023 isn’t directly comparable to any specific Fedora release. The Amazon Linux 2023 GA version includes components from Fedora 34, 35, and 36. Some of the components are the same as those in Fedora, and some are modified. Other components more closely resemble those in CentOS Stream 9 or were developed independently. The Amazon Linux kernel, for its part, is sourced from the long-term support options on kernel.org and chosen independently of the kernel provided by Fedora.

Like every good citizen in the open-source community, we give back and contribute our changes to upstream distributions and sources for the benefit of the entire community. Amazon Linux 2023 itself is open source. The source code for all RPM packages that are used to build the binaries that we ship is available through the SRPM yum repository (sudo dnf install -y 'dnf-command(download)' && dnf download --source bash).

One More Thing: Amazon EBS Gp3 Volumes
Amazon Linux 2023 AMIs use gp3 volumes by default.

Gp3 is the latest generation general-purpose solid-state drive (SSD) volume for Amazon Elastic Block Store (Amazon EBS). Gp3 provides 20 percent lower storage costs compared to gp2. Gp3 volumes deliver a baseline performance of 3,000 IOPS and 125 MB/s at any volume size. What I particularly like about gp3 volumes is that I can provision performance independently of capacity: I can increase IOPS and throughput without incurring charges for extra capacity that I don’t actually need.
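
To make this concrete, here is a minimal boto3 sketch that provisions a gp3 volume with explicit IOPS and throughput; the Availability Zone and size are placeholder values.

import boto3

ec2 = boto3.client('ec2')

# With gp3, IOPS and throughput are provisioned independently of size.
# The values below are the included baseline (3,000 IOPS, 125 MB/s).
ec2.create_volume(
    AvailabilityZone='us-east-1a',  # placeholder Availability Zone
    Size=100,                       # capacity in GiB (placeholder)
    VolumeType='gp3',
    Iops=3000,                      # can be raised without adding capacity
    Throughput=125,                 # MB/s
)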

AL2023 AMIs are the first Amazon Linux AMIs backed by gp3 volumes, a common customer request since gp3 launched in 2020. Gp3 is now the default for these AMIs.

Price and Availability
Amazon Linux 2023 is provided at no additional charge. Standard Amazon EC2 and AWS charges apply for running EC2 instances and other services. This distribution includes full support for five years. When deploying on AWS, our support engineers will provide technical support according to the terms and conditions of your AWS Support plan. AMIs are available in all AWS Regions.

Amazon Linux is the most used Linux distribution on AWS, with hundreds of thousands of customers using Amazon Linux 2. Dozens of Independent Software Vendors (ISVs) and hardware partners are supporting Amazon Linux 2023 today. You can adopt this new version with the confidence that the partner tools you rely on are likely to be supported. We are excited about this release, which brings you an even higher level of security, a predictable release lifecycle, and a consistent update experience.

Now go build and deploy your workload on Amazon Linux 2023 today.

— seb

New – Use Amazon S3 Object Lambda with Amazon CloudFront to Tailor Content for End Users

With S3 Object Lambda, you can use your own code to process data retrieved from Amazon S3 as it is returned to an application. Over time, we added new capabilities to S3 Object Lambda, like the ability to add your own code to S3 HEAD and LIST API requests, in addition to the support for S3 GET requests that was available at launch.

Today, we are launching aliases for S3 Object Lambda Access Points. Aliases are now automatically generated when S3 Object Lambda Access Points are created and are interchangeable with bucket names anywhere you use a bucket name to access data stored in Amazon S3. Therefore, your applications don’t need to know about S3 Object Lambda and can consider the alias to be a bucket name.
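
As a quick sketch of what this means in practice, the boto3 call below reads an object through an S3 Object Lambda Access Point alias exactly as if the alias were a bucket name; the alias and object key are placeholders for illustration.

import boto3

s3 = boto3.client('s3')

# The alias is passed wherever a bucket name is expected; the application
# does not need to know that S3 Object Lambda is involved.
response = s3.get_object(
    Bucket='my-olap-alias--ol-s3',  # hypothetical Object Lambda alias
    Key='s3.txt',
)
print(response['Body'].read().decode('utf-8'))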

Architecture diagram.

You can now use an S3 Object Lambda Access Point alias as an origin for your Amazon CloudFront distribution to tailor or customize data for end users. You can use this to implement automatic image resizing or to tag or annotate content as it is downloaded. Many images still use older formats like JPEG or PNG, and you can use a transcoding function to deliver images in more efficient formats like WebP, BPG, or HEIC. Digital images contain metadata, and you can implement a function that strips metadata to help satisfy data privacy requirements.

Architecture diagram.

Let’s see how this works in practice. First, I’ll show a simple example using text that you can follow along by just using the AWS Management Console. After that, I’ll implement a more advanced use case processing images.

Using an S3 Object Lambda Access Point as the Origin of a CloudFront Distribution
For simplicity, I am using the same application as in the launch post, which changes all text in the original file to uppercase. This time, I use the S3 Object Lambda Access Point alias to set up a public distribution with CloudFront.

I follow the same steps as in the launch post to create the S3 Object Lambda Access Point and the Lambda function. Because the Lambda runtimes for Python 3.8 and later do not include the requests module, I update the function code to use urlopen from the Python Standard Library:

import boto3
from urllib.request import urlopen

s3 = boto3.client('s3')

def lambda_handler(event, context):
  print(event)

  object_get_context = event['getObjectContext']
  request_route = object_get_context['outputRoute']
  request_token = object_get_context['outputToken']
  s3_url = object_get_context['inputS3Url']

  # Get object from S3
  response = urlopen(s3_url)
  original_object = response.read().decode('utf-8')

  # Transform object
  transformed_object = original_object.upper()

  # Write object back to S3 Object Lambda
  s3.write_get_object_response(
    Body=transformed_object,
    RequestRoute=request_route,
    RequestToken=request_token)

  return

To test that this is working, I open the same file from the bucket and through the S3 Object Lambda Access Point. In the S3 console, I select the bucket and a sample file (called s3.txt) that I uploaded earlier and choose Open.

Console screenshot.

A new browser tab is opened (you might need to disable the pop-up blocker in your browser), and its content is the original file with mixed-case text:

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers...

I choose Object Lambda Access Points from the navigation pane and select the AWS Region I used before from the dropdown. Then, I search for the S3 Object Lambda Access Point that I just created. I select the same file as before and choose Open.

Console screenshot.

In the new tab, the text has been processed by the Lambda function and is now all in uppercase:

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

Now that the S3 Object Lambda Access Point is correctly configured, I can create the CloudFront distribution. Before I do that, in the list of S3 Object Lambda Access Points in the S3 console, I copy the Object Lambda Access Point alias that has been automatically created:

Console screenshot.

In the CloudFront console, I choose Distributions in the navigation pane and then Create distribution. In the Origin domain, I use the S3 Object Lambda Access Point alias and the Region. The full syntax of the domain is:

ALIAS.s3.REGION.amazonaws.com
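
For example, with a hypothetical alias in US East (N. Virginia), the origin domain would look like my-olap-alias--ol-s3.s3.us-east-1.amazonaws.com.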

Console screenshot.

S3 Object Lambda Access Points cannot be public, and I use CloudFront origin access control (OAC) to authenticate requests to the origin. For Origin access, I select Origin access control settings and choose Create control setting. I write a name for the control setting and select Sign requests and S3 in the Origin type dropdown.

Console screenshot.

Now, my Origin access control settings use the configuration I just created.

Console screenshot.

To reduce the number of requests going through S3 Object Lambda, I enable Origin Shield and choose the Origin Shield Region closest to the Region I am using. Then, I select the CachingOptimized cache policy and create the distribution. As the distribution is being deployed, I update permissions for the resources used by the distribution.

Setting Up Permissions to Use an S3 Object Lambda Access Point as the Origin of a CloudFront Distribution
First, the S3 Object Lambda Access Point needs to give access to the CloudFront distribution. In the S3 console, I select the S3 Object Lambda Access Point and, in the Permissions tab, I update the policy with the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3-object-lambda:Get*",
            "Resource": "arn:aws:s3-object-lambda:REGION:ACCOUNT:accesspoint/NAME",
            "Condition": {
                "StringEquals": {
                    "aws:SourceArn": "arn:aws:cloudfront::ACCOUNT:distribution/DISTRIBUTION-ID"
                }
            }
        }
    ]
}

The supporting access point also needs to allow access to CloudFront when called via S3 Object Lambda. I select the access point and update the policy in the Permissions tab:

{
    "Version": "2012-10-17",
    "Id": "default",
    "Statement": [
        {
            "Sid": "s3objlambda",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:REGION:ACCOUNT:accesspoint/NAME",
                "arn:aws:s3:REGION:ACCOUNT:accesspoint/NAME/object/*"
            ],
            "Condition": {
                "ForAnyValue:StringEquals": {
                    "aws:CalledVia": "s3-object-lambda.amazonaws.com"
                }
            }
        }
    ]
}

The S3 bucket needs to allow access to the supporting access point. I select the bucket and update the policy in the Permissions tab:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "*",
            "Resource": [
                "arn:aws:s3:::BUCKET",
                "arn:aws:s3:::BUCKET/*"
            ],
            "Condition": {
                "StringEquals": {
                    "s3:DataAccessPointAccount": "ACCOUNT"
                }
            }
        }
    ]
}

Finally, CloudFront needs to be able to invoke the Lambda function. In the Lambda console, I choose the Lambda function used by S3 Object Lambda, and then, in the Configuration tab, I choose Permissions. In the Resource-based policy statements section, I choose Add permissions and select AWS Account. I enter a unique Statement ID. Then, I enter cloudfront.amazonaws.com as Principal, select lambda:InvokeFunction from the Action dropdown, and choose Save. We are working to simplify this step in the future. I’ll update this post when that’s available.
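
If you prefer to script this step, a boto3 call along these lines is roughly equivalent to those console actions; the function name and statement ID below are placeholders.

import boto3

lambda_client = boto3.client('lambda')

# Allow CloudFront to invoke the function used by S3 Object Lambda.
lambda_client.add_permission(
    FunctionName='s3-object-lambda-uppercase',  # hypothetical function name
    StatementId='AllowCloudFrontInvoke',        # any unique statement ID
    Action='lambda:InvokeFunction',
    Principal='cloudfront.amazonaws.com',
)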

Testing the CloudFront Distribution
When the distribution has been deployed, I test that the setup is working with the same sample file I used before. In the CloudFront console, I select the distribution and copy the Distribution domain name. I can use the browser and enter https://DISTRIBUTION_DOMAIN_NAME/s3.txt in the navigation bar to send a request to CloudFront and get the file processed by S3 Object Lambda. To quickly get all the info, I use curl with the -i option to see the HTTP status and the headers in the response:

curl -i https://DISTRIBUTION_DOMAIN_NAME/s3.txt

HTTP/2 200 
content-type: text/plain
content-length: 427
x-amzn-requestid: a85fe537-3502-4592-b2a9-a09261c8c00c
date: Mon, 06 Mar 2023 10:23:02 GMT
x-cache: Miss from cloudfront
via: 1.1 a2df4ad642d78d6dac65038e06ad10d2.cloudfront.net (CloudFront)
x-amz-cf-pop: DUB56-P1
x-amz-cf-id: KIiljCzYJBUVVxmNkl3EP2PMh96OBVoTyFSMYDupMd4muLGNm2AmgA==

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

It works! As expected, the content processed by the Lambda function is all uppercase. Because this is the first invocation for the distribution, it has not been returned from the cache (x-cache: Miss from cloudfront). The request went through S3 Object Lambda to process the file using the Lambda function I provided.

Let’s try the same request again:

curl -i https://DISTRIBUTION_DOMAIN_NAME/s3.txt

HTTP/2 200 
content-type: text/plain
content-length: 427
x-amzn-requestid: a85fe537-3502-4592-b2a9-a09261c8c00c
date: Mon, 06 Mar 2023 10:23:02 GMT
x-cache: Hit from cloudfront
via: 1.1 145b7e87a6273078e52d178985ceaa5e.cloudfront.net (CloudFront)
x-amz-cf-pop: DUB56-P1
x-amz-cf-id: HEx9Fodp184mnxLQZuW62U11Fr1bA-W1aIkWjeqpC9yHbd0Rg4eM3A==
age: 3

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

This time the content is returned from the CloudFront cache (x-cache: Hit from cloudfront), and there was no further processing by S3 Object Lambda. By using S3 Object Lambda as the origin, the CloudFront distribution serves content that has been processed by a Lambda function and can be cached to reduce latency and optimize costs.

Resizing Images Using S3 Object Lambda and CloudFront
As I mentioned at the beginning of this post, one of the use cases that can be implemented using S3 Object Lambda and CloudFront is image transformation. Let’s create a CloudFront distribution that can dynamically resize an image by passing the desired width and height as query parameters (w and h respectively). For example:

https://DISTRIBUTION_DOMAIN_NAME/image.jpg?w=200&h=150

For this setup to work, I need to make two changes to the CloudFront distribution. First, I create a new cache policy to include query parameters in the cache key. In the CloudFront console, I choose Policies in the navigation pane. In the Cache tab, I choose Create cache policy. Then, I enter a name for the cache policy.

Console screenshot.

In the Query settings of the Cache key settings, I select the option to Include the following query parameters and add w (for the width) and h (for the height).

Console screenshot.

Then, in the Behaviors tab of the distribution, I select the default behavior and choose Edit.

There, I update the Cache key and origin requests section:

  • In the Cache policy, I use the new cache policy to include the w and h query parameters in the cache key.
  • In the Origin request policy, I use the AllViewerExceptHostHeader managed policy to forward query parameters to the origin.

Console screenshot.

Now I can update the Lambda function code. To resize images, this function uses the Pillow module that needs to be packaged with the function when it is uploaded to Lambda. You can deploy the function using a tool like the AWS SAM CLI or the AWS CDK. Compared to the previous example, this function also handles and returns HTTP errors, such as when content is not found in the bucket.

import io
import boto3
from urllib.request import urlopen
from urllib.error import HTTPError
from PIL import Image

from urllib.parse import urlparse, parse_qs

s3 = boto3.client('s3')

def lambda_handler(event, context):
    print(event)

    object_get_context = event['getObjectContext']
    request_route = object_get_context['outputRoute']
    request_token = object_get_context['outputToken']
    s3_url = object_get_context['inputS3Url']

    # Get object from S3
    try:
        original_image = Image.open(urlopen(s3_url))
    except HTTPError as err:
        s3.write_get_object_response(
            StatusCode=err.code,
            ErrorCode='HTTPError',
            ErrorMessage=err.reason,
            RequestRoute=request_route,
            RequestToken=request_token)
        return

    # Get width and height from query parameters
    user_request = event['userRequest']
    url = user_request['url']
    parsed_url = urlparse(url)
    query_parameters = parse_qs(parsed_url.query)

    try:
        width, height = int(query_parameters['w'][0]), int(query_parameters['h'][0])
    except (KeyError, ValueError):
        width, height = 0, 0

    # Transform object
    if width > 0 and height > 0:
        # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the equivalent filter
        transformed_image = original_image.resize((width, height), Image.LANCZOS)
    else:
        transformed_image = original_image

    transformed_bytes = io.BytesIO()
    transformed_image.save(transformed_bytes, format='JPEG')

    # Write object back to S3 Object Lambda
    s3.write_get_object_response(
        Body=transformed_bytes.getvalue(),
        RequestRoute=request_route,
        RequestToken=request_token)

    return

I upload a picture I took of the Trevi Fountain in the source bucket. To start, I generate a small thumbnail (200 by 150 pixels).

https://DISTRIBUTION_DOMAIN_NAME/trevi-fountain.jpeg?w=200&h=150

Picture of the Trevi Fountain with size 200x150 pixels.

Now, I ask for a slightly larger version (400 by 300 pixels):

https://DISTRIBUTION_DOMAIN_NAME/trevi-fountain.jpeg?w=400&h=300

Picture of the Trevi Fountain with size 400x300 pixels.

It works as expected. The first invocation with a specific size is processed by the Lambda function. Further requests with the same width and height are served from the CloudFront cache.

Availability and Pricing
Aliases for S3 Object Lambda Access Points are available today in all commercial AWS Regions. There is no additional cost for aliases. With S3 Object Lambda, you pay for the Lambda compute and request charges required to process the data, and for the data S3 Object Lambda returns to your application. You also pay for the S3 requests that are invoked by your Lambda function. For more information, see Amazon S3 Pricing.

Aliases are now automatically generated when an S3 Object Lambda Access Point is created. For existing S3 Object Lambda Access Points, aliases are automatically assigned and ready for use.

It’s now easier to use S3 Object Lambda with existing applications, and aliases open many new possibilities. For example, you can use aliases with CloudFront to create a website that converts content in Markdown to HTML, resizes and watermarks images, or masks personally identifiable information (PII) from text, images, and documents.

Customize content for your end users using S3 Object Lambda with CloudFront.

Danilo

AWS Week in Review – March 13, 2023

It seems like only yesterday I was writing the last Week in Review post, at the end of January, and now here we are almost midway through March, heading into spring in the northern hemisphere and nearly a quarter of the way through 2023. How time flies!

Last Week’s Launches
Here are some of the launches and other news from the past week that I want to bring to your attention:

New AWS Heroes: At the center of the AWS Community around the globe, Heroes share their knowledge and enthusiasm. Welcome to Ananda in Indonesia, and Aidan and Wendy in Australia, our newly announced Heroes!

General Availability of AWS Application Composer: Launched in preview during Dr. Werner Vogels’ re:Invent 2022 keynote, AWS Application Composer is a tool enabling the composition and configuration of serverless applications using a visual design surface. The visual design is backed by an AWS CloudFormation template, making it deployment ready.

What I find particularly cool about Application Composer is that it also works on existing serverless application templates, and round-trips changes to the template made in either a code editor or the visual designer. This makes it ideal for both new developers and experienced serverless developers with existing applications.

My colleague Channy’s post provides an introduction, and Application Composer is also featured in last Friday’s AWS on Air show, available to watch on-demand.

Get daily feature updates via Amazon SNS: One thing I’ve learned since joining AWS is that the service teams don’t stand still, and are releasing something new pretty much every day. Sometimes, multiple things! This can, however, make it hard to keep up. So, I was interested to read that you can now receive daily feature updates, in email, by subscribing to an Amazon Simple Notification Service (Amazon SNS) topic. As usual, Jeff’s post has all the details you need to get started.
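
If you’d rather subscribe programmatically than through the console, a boto3 call along these lines should do it; the topic ARN and email address below are placeholders, so check Jeff’s post for the actual topic.

import boto3

sns = boto3.client('sns')

# Subscribe an email address to the updates topic. SNS sends a
# confirmation email that must be accepted before delivery starts.
sns.subscribe(
    TopicArn='arn:aws:sns:us-east-1:123456789012:feature-updates',  # placeholder ARN
    Protocol='email',
    Endpoint='you@example.com',  # placeholder address
)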

Using up to 10GB of ephemeral storage for AWS Lambda functions: If you use Lambda for Extract-Transform-Load (ETL) jobs, or any data-intensive jobs that require temporary storage of data during processing, you can now configure up to 10GB of ephemeral storage, mounted at /tmp, for your functions in six additional Regions – Asia Pacific (Hyderabad), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Europe (Spain), Europe (Zurich), and Middle East (UAE). More information on using ephemeral storage with Lambda functions can be found in this blog post.
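
As a minimal sketch, this is how you might raise an existing function’s /tmp allocation with boto3; the function name is a placeholder, and the size is expressed in MB.

import boto3

lambda_client = boto3.client('lambda')

# Raise the function's ephemeral /tmp storage to the 10 GB maximum.
lambda_client.update_function_configuration(
    FunctionName='my-etl-function',    # hypothetical function name
    EphemeralStorage={'Size': 10240},  # MB; valid range is 512-10240
)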

Increased table counts for Amazon Redshift: Workloads that require large numbers of tables can now take advantage of up to 200K tables, avoiding the need to split tables across multiple data warehouses. The updated limit is available to workloads using the ra3.4xlarge, ra3.16xlarge, and dc2.8xlarge node types with Redshift Serverless and data warehouse clusters.

Faster, simpler permissions setup for AWS Glue: Glue is a serverless data integration and ETL service for discovering, preparing, moving, and integrating data intended for use in analytics and machine learning (ML) workloads. A new guided permissions setup process, available in the AWS Management Console, makes it simpler to grant AWS Identity and Access Management (IAM) roles and users access to Glue and to set a default role for running jobs and working with notebooks. This simpler, guided approach helps users start authoring jobs and working with the Data Catalog without further setup.

Microsoft Active Directory authentication for the MySQL-Compatible Edition of Amazon Aurora: You can now use Active Directory, either with an existing on-premises directory or with AWS Directory Service for Microsoft Active Directory, to authenticate database users when accessing Amazon Aurora MySQL-Compatible Edition instances, helping reduce operational overhead. It also enables you to make use of native Active Directory credential management capabilities to manage password complexities and rotation, helping you stay in step with your compliance and security requirements.

Launch of the 2023 AWS DeepRacer League and new competition structure: The DeepRacer tracks are one of my favorite things to visit and watch at AWS events, so I was happy to learn the new 2023 league is now underway. If you’ve not heard of DeepRacer, it’s the world’s first global racing league featuring autonomous vehicles, enabling developers of all skill levels not only to compete to complete the track in the shortest time but also to advance their knowledge of machine learning (ML) in the process. Along with the new league, there are now more chances to earn achievements and prizes using an all-new three-tier competition spanning national and regional races. Racers compete for a chance to win a spot in the World Championship, held at AWS re:Invent, and a $43,000 prize purse. What are you waiting for? Start your (ML) engines today!

AWS open-source news and updates: The latest newsletter highlighting open-source projects, tools, and demos from the AWS Community is now available. The newsletter is published weekly, and you can find edition 148 here.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Upcoming AWS Events
Here are some upcoming events you may be interested in checking out:

AWS Pi Day: March 14th is the third annual AWS Pi Day. Join in with the celebrations of the 17th birthday of Amazon Simple Storage Service (Amazon S3) and the cloud in a live virtual event hosted on the AWS on Air channel. There’ll also be news and discussions on the latest innovations across Data services on AWS, including storage, analytics, AI/ML, and more.

.NET developers and architects looking to modernize their applications will be interested in an upcoming webinar, Modernize and Optimize by Containerizing .NET Applications on AWS, scheduled for March 22nd. In this webinar, you’ll see demonstrations of how you can enhance the security of legacy .NET applications by modernizing them to containers, updating them to a modern version of .NET, and running them on the latest versions of Windows. Registration for the online event is open now.

You can find details on all upcoming events, in-person and virtual, here.

New Livestream Shows
There are some recently launched livestream shows I’d like to bring to your attention:

My colleague Isaac has started a new .NET on AWS show, streaming on Twitch. The second episode was live last week; catch up here on demand. Episode 1 is also available here.

I mentioned AWS on Air earlier in this post, and hopefully you’ve caught our weekly Friday show streaming on Twitch, Twitter, YouTube, and LinkedIn. Or, maybe you’ve seen us broadcasting live from AWS events such as Summits or AWS re:Invent. But did you know that some of the hosts of the shows have recently started their own individual shows too? Check out these new shows below:

  • AWS on Air: Startup! – hosted by Jillian Forde, this show focuses on learning technical and business strategies from startup experts to build and scale your startup in AWS. The show runs every Tuesday at 10am PT/1pm ET.
  • AWS On Air: Under the Hood with AWS – in this show, host Art Baudo chats with guests, including AWS technical leaders and customers, about cloud infrastructure. In last week’s show, the discussion centered around Amazon Elastic Compute Cloud (Amazon EC2) Mac Instances. Watch live every Tuesday at 2pm PT/5pm ET.
  • AWS on Air: Lockdown! – join host Kyle Dickinson each Tuesday at 11am PT/2pm ET for this show, covering a breadth of AWS security topics in an approachable way that’s suitable for all levels of AWS experience. You’ll encounter demos, guest speakers from AWS, AWS Heroes, and AWS Community Builders. 
  • AWS on Air: Step up your GameDay – hosts AM Grobelny and James Spencer are joined by special guests to strategize and navigate through an AWS GameDay, a fun and challenge-oriented way to learn about AWS. You’ll find this show every second Wednesday at 11am PT/2pm ET.
  • AWS on Air: AMster & the Brit’s Code Corner – join AM Grobelny and me as we chat about and illustrate cloud development. In Beginners Corner, we answer your questions and try to demystify this strange thing called “coding”, and in Project Corner we tackle slightly larger projects of interest to more experienced developers. There’s something for everyone in Code Corner, live on the 3rd Thursday of each month at 11am PT/2pm ET.

You’ll find all these AWS on Air shows in the published schedule. We hope you can join us!

That’s all for this week – check back next Monday for another AWS Week in Review.

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Meet the Newest AWS Heroes – March 2023

The AWS Heroes are passionate AWS experts who are dedicated to sharing their in-depth knowledge within the community. They inspire, uplift, and motivate the global AWS community, and today, we’re excited to announce and recognize the newest Heroes in 2023!

Aidan Steele – Melbourne, Australia

Serverless Hero Aidan Steele is a Senior Engineer at Nightvision. He is an avid AWS user and has been using the platform since 2008, when he first tried EC2. Fifteen years later, EC2 still has a special place in his heart, but his interests now lie in containers and serverless functions, and in blurring the distinction between them wherever possible. He enjoys finding novel uses for AWS services, especially when they have a security or network focus. This is best demonstrated through his open source contributions on GitHub, where he shares interesting use cases via hands-on projects.

Ananda Dwi Rahmawati – Yogyakarta, Indonesia

Container Hero Ananda Dwi Rahmawati is a Sr. Cloud Infrastructure Engineer, specializing in system integration between cloud infrastructure, CI/CD workflows, and application modernization. She implements solutions using powerful services provided by AWS, such as Amazon Elastic Kubernetes Service (EKS), combined with open source tools to achieve the goal of creating reliable, highly available, and scalable systems. She is a regular technical speaker who delivers presentations using real-world case studies at several local community meetups and conferences, such as Kubernetes and OpenInfra Days Indonesia, AWS Community Day Indonesia, AWS Summit ASEAN, and many more.

Wendy Wong – Sydney, Australia

Data Hero Wendy Wong is a Business Performance Analyst at Service NSW, building data pipelines with AWS Analytics and agile projects in AI. As a teacher at heart, she enjoys sharing her passion as a Data Analytics Lead Instructor for General Assembly Sydney, writing technical blogs on dev.to. She is both an active speaker for AWS analytics and an advocate of diversity and inclusion, presenting at a number of events: AWS User Group Malaysia, Women Who Code, AWS Summit Australia 2022, AWS BuildHers, AWS Innovate Modern Applications, and many more.

Learn More

If you’d like to learn more about the new Heroes or connect with a Hero near you, please visit the AWS Heroes website or browse the AWS Heroes Content Library.

Taylor

Amazon S3 Encrypts New Objects By Default

At AWS, security is the top priority. Starting today, Amazon Simple Storage Service (Amazon S3) encrypts all new objects by default. Now, S3 automatically applies server-side encryption (SSE-S3) for each new object, unless you specify a different encryption option. SSE-S3 was first launched in 2011. As Jeff wrote at the time: “Amazon S3 server-side encryption handles all encryption, decryption, and key management in a totally transparent fashion. When you PUT an object, we generate a unique key, encrypt your data with the key, and then encrypt the key with a [root] key.”

This change puts another security best practice into effect automatically—with no impact on performance and no action required on your side. S3 buckets that do not use default encryption will now automatically apply SSE-S3 as the default setting. Existing buckets currently using S3 default encryption will not change.

As always, you can choose to encrypt your objects using one of the three encryption options we provide: S3 default encryption (SSE-S3, the new default), customer-provided encryption keys (SSE-C), or AWS Key Management Service keys (SSE-KMS). To have an additional layer of encryption, you might also encrypt objects on the client side, using client libraries such as the Amazon S3 encryption client.

While it was simple to enable, the opt-in nature of SSE-S3 meant that you had to be certain that it was always configured on new buckets and verify that it remained configured properly over time. For organizations that require all their objects to remain encrypted at rest with SSE-S3, this update helps meet their encryption compliance requirements without any additional tools or client configuration changes.

With today’s announcement, we have now made it “zero click” for you to apply this base level of encryption on every S3 bucket.

Verify Your Objects Are Encrypted
The change is visible today in AWS CloudTrail data event logs. You will see the changes in the S3 section of the AWS Management Console, Amazon S3 Inventory, Amazon S3 Storage Lens, and as an additional header in the AWS CLI and in the AWS SDKs over the next few weeks. We will update this blog post and documentation when the encryption status is available in these tools in all AWS Regions.

To verify the change is effective on your buckets today, you can configure CloudTrail to log data events. By default, trails do not log data events, and there is an extra cost to enable it. Data events show the resource operations performed on or within a resource, such as when a user uploads a file to an S3 bucket. You can log data events for Amazon S3 buckets, AWS Lambda functions, Amazon DynamoDB tables, or a combination of those.
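
As a sketch of that setup, here is how you might enable S3 data event logging on an existing trail with boto3; the trail name and bucket ARN are placeholders.

import boto3

cloudtrail = boto3.client('cloudtrail')

# Log write-type S3 data events (such as PutObject) for one bucket.
cloudtrail.put_event_selectors(
    TrailName='my-trail',  # hypothetical trail name
    EventSelectors=[{
        'ReadWriteType': 'WriteOnly',
        'IncludeManagementEvents': True,
        'DataResources': [{
            'Type': 'AWS::S3::Object',
            # The trailing slash scopes logging to all objects in the bucket.
            'Values': ['arn:aws:s3:::amzn-s3-demo-bucket/'],
        }],
    }],
)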

Once enabled, search for the PutObject API for file uploads or InitiateMultipartUpload for multipart uploads. When Amazon S3 automatically encrypts an object using the default encryption settings, the log includes the following field as the name-value pair: "SSEApplied":"Default_SSE_S3". Here is an example of a CloudTrail log (with data event logging enabled) when I uploaded a file to one of my buckets using the AWS CLI command aws s3 cp backup.sh s3://private-sst.

Cloudtrail log for S3 with default encryption enabled
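
You can also check an individual object directly: a HeadObject call returns the encryption that was applied to it. Here is a small boto3 sketch, with placeholder bucket and key names.

import boto3

s3 = boto3.client('s3')

# Objects uploaded without an explicit encryption option now report
# SSE-S3, which appears as 'AES256'.
response = s3.head_object(
    Bucket='amzn-s3-demo-bucket',  # placeholder bucket name
    Key='backup.sh',
)
print(response.get('ServerSideEncryption'))  # expected: 'AES256'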

Amazon S3 Encryption Options
As I wrote earlier, SSE-S3 is now the new base level of encryption when no other encryption type is specified. SSE-S3 uses Advanced Encryption Standard (AES) encryption with 256-bit keys managed by AWS.

You can choose to encrypt your objects using SSE-C or SSE-KMS rather than with SSE-S3, either as “one click” default encryption settings on the bucket, or for individual objects in PUT requests.
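
For instance, here is a minimal boto3 sketch that overrides the default for a single object by requesting SSE-KMS in the PUT request; the bucket, key, and KMS key ID are placeholders.

import boto3

s3 = boto3.client('s3')

# Request SSE-KMS for this object only; other uploads keep the
# bucket's default encryption (SSE-S3 unless configured otherwise).
s3.put_object(
    Bucket='amzn-s3-demo-bucket',  # placeholder bucket name
    Key='report.csv',
    Body=b'example content',
    ServerSideEncryption='aws:kms',
    SSEKMSKeyId='1234abcd-12ab-34cd-56ef-1234567890ab',  # placeholder key ID
)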

SSE-C lets Amazon S3 perform the encryption and decryption of your objects while you retain control of the keys used to encrypt objects. With SSE-C, you don’t need to implement or use a client-side library to perform the encryption and decryption of objects you store in Amazon S3, but you do need to manage the keys that you send to Amazon S3 to encrypt and decrypt objects.

With SSE-KMS, AWS Key Management Service (AWS KMS) manages your encryption keys. Using AWS KMS to manage your keys provides several additional benefits. With AWS KMS, there are separate permissions for the use of the KMS key, providing an additional layer of control as well as protection against unauthorized access to your objects stored in Amazon S3. AWS KMS provides an audit trail so you can see who used your key to access which object and when, as well as view failed attempts to access data from users without permission to decrypt the data.

When using an encryption client library, such as the Amazon S3 encryption client, you retain control of the keys and complete the encryption and decryption of objects client-side using an encryption library of your choice. You encrypt the objects before they are sent to Amazon S3 for storage. The Java, .NET, Ruby, PHP, Go, and C++ AWS SDKs support client-side encryption.

You can follow the instructions in this blog post if you want to retroactively encrypt existing objects in your buckets.

Available Now
This change is effective now, in all AWS Regions, including the AWS GovCloud (US) and AWS China Regions. There is no additional cost for default object-level encryption.

— seb