Getting Started with Amazon ECS Anywhere – Now Generally Available

Since Amazon Elastic Container Service (Amazon ECS) was launched in 2014, AWS has released other options for running Amazon ECS tasks outside of an AWS Region, such as AWS Wavelength, an offering for mobile edge applications, and AWS Outposts, a service that extends AWS infrastructure to customers’ environments using hardware owned and fully managed by AWS.

But some customers have applications that need to run on premises due to regulatory, latency, and data residency requirements or the desire to leverage existing infrastructure investments. In these cases, customers have to install, operate, and manage separate container orchestration software and need to use disparate tooling across their AWS and on-premises environments. Customers asked us for a way to manage their on-premises containers without this added complexity and cost.

Following Jeff’s preannouncement last year, I am happy to announce the general availability of Amazon ECS Anywhere, a new capability in Amazon ECS that enables customers to easily run and manage container-based applications on premises, including virtual machines (VMs), bare metal servers, and other customer-managed infrastructure.

With ECS Anywhere, you can run and manage containers on any customer-managed infrastructure using the same cloud-based, fully managed, and highly scalable container orchestration service you use in AWS today. By installing simple agents, you no longer need to prepare, run, update, or maintain your own container orchestrators on premises, making it easier to manage your hybrid environment and leverage the cloud for your infrastructure.

ECS Anywhere provides consistent tooling and APIs for all container-based applications and the same Amazon ECS experience for cluster management, workload scheduling, and monitoring, both in the cloud and on customer-managed infrastructure. You can now enjoy the benefits of reduced cost and complexity by running container workloads such as data processing at edge locations on your own hardware to keep latency low, and in the cloud, all with a single, consistent container orchestrator.

Amazon ECS Anywhere – Getting Started
To get started with ECS Anywhere, register your on-premises servers or VMs (also referred to as External instances) in the ECS cluster. The AWS Systems Manager Agent, Amazon ECS container agent, and Docker must be installed on these external instances. Your external instances require an IAM role that permits them to communicate with AWS APIs. For more information, see Required IAM permissions in the ECS Developer Guide.
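For illustration, a minimal boto3 sketch of creating such an instance role might look like the following; the role name ecsAnywhereRole is an example, and the two attached managed policies cover the Systems Manager and Amazon ECS permissions mentioned above.

import json
import boto3

iam = boto3.client("iam")

# Trust policy that lets AWS Systems Manager assume the role on the external instance.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ssm.amazonaws.com"},
        "Action": "sts:AssumeRole"
    }]
}

iam.create_role(
    RoleName="ecsAnywhereRole",  # example name
    AssumeRolePolicyDocument=json.dumps(trust_policy)
)

# Managed policies for the Systems Manager Agent and the ECS container agent.
for policy_arn in [
    "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore",
    "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role",
]:
    iam.attach_role_policy(RoleName="ecsAnywhereRole", PolicyArn=policy_arn)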

To create a cluster for ECS Anywhere, on the Create Cluster page in the ECS console, choose the Networking Only template. This option is for use with either AWS Fargate or external instance capacity. We recommend that you use the AWS Region that is geographically closest to the on-premises servers you want to register.

This creates an empty cluster to register external instances. On the ECS Instances tab, choose Register External Instances to get activation codes and an installation script.

On the Step 1: External instances activation details page, in Activation key duration (in days), enter the number of days the activation key should remain active. The activation key can be used for up to 1,000 activations. In Number of instances, enter the number of external instances you want to register to your cluster. In Instance role, enter the IAM role to associate with your external instances.

Choose Next step to get a registration command.

On the Step 2: Register external instances page, copy the registration command. Run this command on the external instances you want to register to your cluster.

After you run the registration command on your on-premises servers or VMs, each external instance is registered as an AWS Systems Manager managed instance and added to your Amazon ECS cluster.
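If you prefer to script these steps instead of clicking through the console, a rough boto3 equivalent of creating the cluster and the activation could look like this sketch; the cluster name, registration limit, and expiration are example values.

import datetime
import boto3

ecs = boto3.client("ecs")
ssm = boto3.client("ssm")

# Create the empty cluster that the external instances will join.
ecs.create_cluster(clusterName="my-ecs-anywhere-cluster")

# Create an activation valid for 30 days and usable for up to 5 registrations.
activation = ssm.create_activation(
    Description="ECS Anywhere external instances",
    IamRole="ecsAnywhereRole",  # the instance role created earlier
    RegistrationLimit=5,
    ExpirationDate=datetime.datetime.utcnow() + datetime.timedelta(days=30)
)

# The activation ID and code are the values used by the registration script.
print(activation["ActivationId"], activation["ActivationCode"])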

Both x86_64 and ARM64 CPU architectures are supported. The following is a list of supported operating systems:

  • CentOS 7, CentOS 8
  • RHEL 7
  • Fedora 32, Fedora 33
  • openSUSE Tumbleweed
  • Ubuntu 18, Ubuntu 20
  • Debian 9, Debian 10
  • SUSE Enterprise Server 15

When the ECS agent has started and completed the registration, your external instance will appear on the ECS Instances tab.

You can also add external instances to an existing cluster. In this case, the cluster shows both Amazon EC2 instances and external instances; external instance IDs are prefixed with mi-*.

Now that the external instances are registered to your cluster, you are ready to create a task definition. Amazon ECS provides the requiresCompatibilities parameter to validate that the task definition is compatible with the EXTERNAL launch type when you create your service or run your standalone task. The following is an example task definition:

{
	"requiresCompatibilities": [
		"EXTERNAL"
	],
	"containerDefinitions": [{
		"name": "nginx",
		"image": "public.ecr.aws/nginx/nginx:latest",
		"memory": 256,
		"cpu": 256,
		"essential": true,
		"portMappings": [{
			"containerPort": 80,
			"hostPort": 8080,
			"protocol": "tcp"
		}]
	}],
	"networkMode": "bridge",
	"family": "nginx"
}

You can create a task definition in the ECS console. In Task Definitions, choose Create new task definition. For Launch type, choose EXTERNAL and then configure the task and container definitions to use external instances.

On the Tasks tab, choose Run new task. On the Run Task page, for Cluster, choose the cluster to run your task definition on. In Number of tasks, enter the number of copies of that task to run with the EXTERNAL launch type.
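The equivalent of this Run Task step with boto3 might look like the following sketch; the cluster name is the example used earlier, and nginx is the task definition family registered above.

import boto3

ecs = boto3.client("ecs")

ecs.run_task(
    cluster="my-ecs-anywhere-cluster",  # example cluster name
    taskDefinition="nginx",             # the task definition family above
    count=1,
    launchType="EXTERNAL"
)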

Alternatively, on the Services tab, choose Create. On the Configure service page, you specify how many copies of your task definition to run and maintain in the cluster. To run your task on the registered external instances, for Launch type, choose EXTERNAL. Note that when you choose this launch type, load balancers, tag propagation, and service discovery integration are not supported.
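Similarly, a service that maintains two copies of the task on external instances could be created with a call like this sketch; the service and cluster names are example values.

import boto3

ecs = boto3.client("ecs")

ecs.create_service(
    cluster="my-ecs-anywhere-cluster",  # example cluster name
    serviceName="nginx-external",       # example service name
    taskDefinition="nginx",
    desiredCount=2,
    launchType="EXTERNAL"
)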

The tasks you run on your external instances must use the bridge, host, or none network modes. The awsvpc network mode isn’t supported. For more information about each network mode, see Choosing a network mode in the Amazon ECS Best Practices Guide.

You can now run your tasks, associate a mix of EXTERNAL, FARGATE, and EC2 capacity provider types with the same ECS service, and specify how you would like your tasks to be split across them.

Things to Know
Here are a few things to keep in mind:

Connectivity: In the event of loss of network connectivity between the ECS agent running on the on-premises servers and the ECS control plane in the AWS Region, existing ECS tasks will continue to run as usual. If tasks still have connectivity with other AWS services, they will continue to communicate with them for as long as the task role credentials are active. If a task launched as part of a service crashes or exits on its own, ECS will be unable to replace it until connectivity is restored.

Monitoring: With ECS Anywhere, you can get Amazon CloudWatch metrics for your clusters and services, use the CloudWatch Logs driver (awslogs) to get your containers’ logs, and access the ECS CloudWatch event stream to monitor your clusters’ events.
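For example, the nginx task definition shown earlier could ship its container logs to CloudWatch Logs by adding a logConfiguration block that uses the awslogs driver. The following boto3 sketch uses an example log group name and Region; the log group must exist before the task starts.

import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="nginx",
    requiresCompatibilities=["EXTERNAL"],
    networkMode="bridge",
    containerDefinitions=[{
        "name": "nginx",
        "image": "public.ecr.aws/nginx/nginx:latest",
        "memory": 256,
        "cpu": 256,
        "essential": True,
        "portMappings": [{"containerPort": 80, "hostPort": 8080, "protocol": "tcp"}],
        # Send container logs to CloudWatch Logs; group and Region are examples.
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": "/ecs/nginx-anywhere",
                "awslogs-region": "us-east-1",
                "awslogs-stream-prefix": "ecs"
            }
        }
    }]
)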

Networking: ECS external instances are optimized for running applications that generate outbound traffic or process data. If your application requires inbound traffic, such as a web service, you will need to employ a workaround to place these workloads behind a load balancer until the feature is supported natively. For more information, see Networking with ECS Anywhere.

Data Security: To help customers maintain data security, ECS Anywhere only sends back to the AWS Region metadata related to the state of the tasks or the state of the containers (whether they are running or not running, performance counters, and so on). This communication is authenticated and encrypted in transit through Transport Layer Security (TLS).

ECS Anywhere Partners
ECS Anywhere integrates with a variety of partner offerings that help customers take advantage of the feature and add functionality on top of it. Here are some of the blog posts that our partners wrote to share their experiences and offerings. (I am updating this article with links as they are published.)

Now Available
Amazon ECS Anywhere is now available in all commercial AWS Regions where Amazon ECS is supported, except the AWS China Regions. With ECS Anywhere, there are no minimum fees or upfront commitments. You pay per instance hour for each managed ECS Anywhere on-premises instance. The ECS Anywhere free tier includes 2,200 instance hours per month for six months per account, across all Regions. For more information, see the pricing page.

To learn more, see ECS Anywhere in the Amazon ECS Developer Guide. Please send feedback to the AWS forum for Amazon ECS or through your usual AWS Support contacts.

Get started with Amazon ECS Anywhere today.

Channy

Update: Watch a cool demo of ECS Anywhere operating a Raspberry Pi cluster in a home office, and read the accompanying deep-dive blog post.

Introducing Amazon Kinesis Data Analytics Studio – Quickly Interact with Streaming Data Using SQL, Python, or Scala

The best way to get timely insights and react quickly to new information you receive from your business and your applications is to analyze streaming data. This is data that must usually be processed sequentially and incrementally on a record-by-record basis or over sliding time windows, and can be used for a variety of analytics including correlations, aggregations, filtering, and sampling.

To make it easier to analyze streaming data, today we are pleased to introduce Amazon Kinesis Data Analytics Studio.

Now, from the Amazon Kinesis console you can select a Kinesis data stream and with a single click start a Kinesis Data Analytics Studio notebook powered by Apache Zeppelin and Apache Flink to interactively analyze data in the stream. Similarly, you can select a cluster in the Amazon Managed Streaming for Apache Kafka console to start a notebook to analyze data in Apache Kafka streams. You can also start a notebook from the Kinesis Data Analytics Studio console and connect to custom sources.

Architectural diagram.

In the notebook, you can interact with streaming data and get results in seconds using SQL queries and Python or Scala programs. When you are satisfied with your results, with a few clicks you can promote your code to a production stream processing application that runs reliably at scale with no additional development effort.

For new projects, we recommend that you use the new Kinesis Data Analytics Studio over Kinesis Data Analytics for SQL Applications. Kinesis Data Analytics Studio combines ease of use with advanced analytical capabilities, which makes it possible to build sophisticated stream processing applications in minutes. Let’s see how that works in practice.

Using Kinesis Data Analytics Studio to Analyze Streaming Data
I want to get a better understanding of the data sent by some sensors to a Kinesis data stream.

To simulate the workload, I use this random_data_generator.py Python script. You don’t need to know Python to use Kinesis Data Analytics Studio. In fact, I am going to use SQL in the following steps. Also, you can avoid any coding and use the Amazon Kinesis Data Generator user interface (UI) to send test data to Kinesis Data Streams or Kinesis Data Firehose. I am using a Python script to have finer control over the data that is being sent.

import datetime
import json
import random
import boto3

STREAM_NAME = "my-input-stream"


def get_random_data():
    current_temperature = round(10 + random.random() * 170, 2)
    if current_temperature > 160:
        status = "ERROR"
    elif current_temperature > 140 or random.randrange(1, 100) > 80:
        status = random.choice(["WARNING","ERROR"])
    else:
        status = "OK"
    return {
        'sensor_id': random.randrange(1, 100),
        'current_temperature': current_temperature,
        'status': status,
        'event_time': datetime.datetime.now().isoformat()
    }


def send_data(stream_name, kinesis_client):
    while True:
        data = get_random_data()
        partition_key = str(data["sensor_id"])
        print(data)
        kinesis_client.put_record(
            StreamName=stream_name,
            Data=json.dumps(data),
            PartitionKey=partition_key)


if __name__ == '__main__':
    kinesis_client = boto3.client('kinesis')
    send_data(STREAM_NAME, kinesis_client)

This script sends random records to my Kinesis data stream using JSON syntax. For example:

{'sensor_id': 77, 'current_temperature': 93.11, 'status': 'OK', 'event_time': '2021-05-19T11:20:00.978328'}
{'sensor_id': 47, 'current_temperature': 168.32, 'status': 'ERROR', 'event_time': '2021-05-19T11:20:01.110236'}
{'sensor_id': 9, 'current_temperature': 140.93, 'status': 'WARNING', 'event_time': '2021-05-19T11:20:01.243881'}
{'sensor_id': 27, 'current_temperature': 130.41, 'status': 'OK', 'event_time': '2021-05-19T11:20:01.371191'}

From the Kinesis console, I select a Kinesis data stream (my-input-stream) and choose Process data in real time from the Process drop-down. In this way, the stream is configured as a source for the notebook.

Console screenshot.

Then, in the following dialog box, I create an Apache Flink – Studio notebook.

I enter a name (my-notebook) and a description for the notebook. The AWS Identity and Access Management (IAM) permissions to read from the Kinesis data stream I selected earlier (my-input-stream) are automatically attached to the IAM role assumed by the notebook.

Console screenshot.

I choose Create to open the AWS Glue console and create an empty database. Back in the Kinesis Data Analytics Studio console, I refresh the list and select the new database, which will hold the metadata for my sources and destinations. From here, I can also review the default Studio notebook settings. Then, I choose Create Studio notebook.

Console screenshot.

Now that the notebook has been created, I choose Run.

Console screenshot.

When the notebook is running, I choose Open in Apache Zeppelin to get access to the notebook and write code in SQL, Python, or Scala to interact with my streaming data and get insights in real time.

In the notebook, I create a new note and call it Sensors. Then, I create a sensor_data table describing the format of the data in the stream:

%flink.ssql

CREATE TABLE sensor_data (
    sensor_id INTEGER,
    current_temperature DOUBLE,
    status VARCHAR(6),
    event_time TIMESTAMP(3),
    WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
)
PARTITIONED BY (sensor_id)
WITH (
    'connector' = 'kinesis',
    'stream' = 'my-input-stream',
    'aws.region' = 'us-east-1',
    'scan.stream.initpos' = 'LATEST',
    'format' = 'json',
    'json.timestamp-format.standard' = 'ISO-8601'
)

The first line in the previous command tells Apache Zeppelin to provide a stream SQL environment (%flink.ssql) for the Apache Flink interpreter. I can also interact with the streaming data using a batch SQL environment (%flink.bsql), or with Python (%flink.pyflink) or Scala (%flink) code.

The first part of the CREATE TABLE statement is familiar to anyone who has used SQL with a database. A table is created to store the sensor data in the stream. The WATERMARK option is used to measure progress in the event time, as described in the Event Time and Watermarks section of the Apache Flink documentation.

The second part of the CREATE TABLE statement describes the connector used to receive data in the table (for example, kinesis or kafka), the name of the stream, the AWS Region, the overall data format of the stream (such as json or csv), and the syntax used for timestamps (in this case, ISO 8601). I can also choose the starting position for processing the stream; here I use LATEST so that processing starts from the most recent data.

When the table is ready, I find it in the AWS Glue Data Catalog database I selected when I created the notebook:

Console screenshot.

Now I can run SQL queries on the sensor_data table and use sliding or tumbling windows to get a better understanding of what is happening with my sensors.

For an overview of the data in the stream, I start with a simple SELECT to get all the content of the sensor_data table:

%flink.ssql(type=update)

SELECT * FROM sensor_data;

This time the first line of the command has a parameter (type=update) so that the output of the SELECT, which is more than one row, is continuously updated when new data arrives.

On the terminal of my laptop, I start the random_data_generator.py script:

$ python3 random_data_generator.py

At first I see a table that contains the data as it comes. To get a better understanding, I select a bar graph view. Then, I group the results by status to see their average current_temperature, as shown here:

Notebook screenshot.

As expected, given the way I am generating this data, the average temperature differs by status (OK, WARNING, or ERROR). The higher the temperature, the greater the probability that something is not working correctly with my sensors.

I can run the aggregated query explicitly using a SQL syntax. This time, I want the result computed on a sliding window of 1 minute with results updated every 10 seconds. To do so, I am using the HOP function in the GROUP BY section of the SELECT statement. To add the time to the output of the select, I use the HOP_ROWTIME function. For more information, see how group window aggregations work in the Apache Flink documentation.

%flink.ssql(type=update)

SELECT sensor_data.status,
       COUNT(*) AS num,
       AVG(sensor_data.current_temperature) AS avg_current_temperature,
       HOP_ROWTIME(event_time, INTERVAL '10' second, INTERVAL '1' minute) as hop_time
  FROM sensor_data
 GROUP BY HOP(event_time, INTERVAL '10' second, INTERVAL '1' minute), sensor_data.status;

This time, I look at the results in table format:

Notebook screenshot.

To send the result of the query to a destination stream, I create a table and connect the table to the stream. First, I need to give permissions to the notebook to write into the stream.

In the Kinesis Data Analytics Studio console, I select my-notebook. Then, in the Studio notebooks details section, I choose Edit IAM permissions. Here, I can configure the sources and destinations used by the notebook and the IAM role permissions are updated automatically.

Console screenshot.

In the Included destinations in IAM policy section, I choose the destination and select my-output-stream. I save changes and wait for the notebook to be updated. I am now ready to use the destination stream.

In the notebook, I create a sensor_state table connected to my-output-stream.

%flink.ssql

CREATE TABLE sensor_state (
    status VARCHAR(6),
    num INTEGER,
    avg_current_temperature DOUBLE,
    hop_time TIMESTAMP(3)
)
WITH (
'connector' = 'kinesis',
'stream' = 'my-output-stream',
'aws.region' = 'us-east-1',
'scan.stream.initpos' = 'LATEST',
'format' = 'json',
'json.timestamp-format.standard' = 'ISO-8601');

I now use this INSERT INTO statement to continuously insert the result of the select into the sensor_state table.

%flink.ssql(type=update)

INSERT INTO sensor_state
SELECT sensor_data.status,
    COUNT(*) AS num,
    AVG(sensor_data.current_temperature) AS avg_current_temperature,
    HOP_ROWTIME(event_time, INTERVAL '10' second, INTERVAL '1' minute) as hop_time
FROM sensor_data
GROUP BY HOP(event_time, INTERVAL '10' second, INTERVAL '1' minute), sensor_data.status;

The data is also sent to the destination Kinesis data stream (my-output-stream) so that it can be used by other applications. For example, the data in the destination stream can be used to update a real-time dashboard, or to monitor the behavior of my sensors after a software update.
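As an illustration of that last point (not part of the original walkthrough), a small boto3 consumer could read the aggregated records from my-output-stream; this sketch assumes a single shard and omits error handling.

import json
import time
import boto3

kinesis = boto3.client("kinesis")

# Find the (single) shard of the destination stream and start reading from the latest records.
shard_id = kinesis.describe_stream(StreamName="my-output-stream")["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName="my-output-stream",
    ShardId=shard_id,
    ShardIteratorType="LATEST"
)["ShardIterator"]

while True:
    response = kinesis.get_records(ShardIterator=iterator, Limit=100)
    for record in response["Records"]:
        # Each record is the JSON output of the INSERT INTO query above.
        print(json.loads(record["Data"]))
    iterator = response["NextShardIterator"]
    time.sleep(1)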

I am satisfied with the result. I want to deploy this query and its output as a Kinesis Analytics application.

First, I create a SensorsApp note in my notebook and copy the statements that I want to execute as part of the application. The tables have already been created, so I just copy the INSERT INTO statement above.

Then, from the menu at the top right of my notebook, I choose Build SensorsApp and export to Amazon S3 and confirm the application name.

Notebook screenshot.

When the export is ready, I choose Deploy SensorsApp as Kinesis Analytics application in the same menu. After that, I fine-tune the configuration of the application. I set parallelism to 1 because I have only one shard in my input Kinesis data stream and not a lot of traffic. Then, I run the application, without having to write any code.

From the Kinesis Data Analytics applications console, I choose Open Apache Flink dashboard to get more information about the execution of my application.

Apache Flink console screenshot.

Availability and Pricing
You can use Amazon Kinesis Data Analytics Studio today in all AWS Regions where Kinesis Data Analytics is generally available. For more information, see the AWS Regional Services List.

In Kinesis Data Analytics Studio, we run the open-source versions of Apache Zeppelin and Apache Flink, and we contribute changes upstream. For example, we have contributed bug fixes for Apache Zeppelin, and we have contributed to AWS connectors for Apache Flink, such as those for Kinesis Data Streams and Kinesis Data Firehose. Also, we are working with the Apache Flink community to contribute availability improvements, including automatic classification of errors at runtime to understand whether errors are in user code or in application infrastructure.

With Kinesis Data Analytics Studio, you pay based on the average number of Kinesis Processing Units (KPU) per hour, including those used by your running notebooks. One KPU comprises 1 vCPU of compute, 4 GB of memory, and associated networking. You also pay for running application storage and durable application storage. For more information, see the Kinesis Data Analytics pricing page.

Start using Kinesis Data Analytics Studio today to get better insights from your streaming data.

Danilo

Updated rollout information for the new Google Meet user experience, new option to revert to the legacy experience

Quick summary 

In April 2021, we announced a new and improved Google Meet experience to help make meetings more immersive, inclusive, and productive. We have revised the rollout for the new Google Meet experience: 
  • Rapid Release domains: A gradual rollout (up to 15 days for feature visibility) began on May 24, 2021. 
  • Scheduled Release domains: A gradual rollout (up to 15 days for feature visibility) will begin on June 1, 2021. 
  • Users using Google Meet with a personal Google account: Rollout began on May 24, 2021. 
Additionally, users will have the option to revert to the legacy Google Meet experience until June 10, 2021. On June 11, 2021, the new Google Meet experience will become the standard experience for all users. 
You can follow the Google Meet Help Community for more updates on rollout progress and support. 

Additional details 

Use of third-party Chrome extensions (including the Grid View extension) could impact your experience. Disabling any extensions that interact with Google Meet could resolve this. 

Getting started 

  • Admins: There is no admin control for this feature. 
  • End users: No action required. To revert to the legacy Google Meet experience, select “Use the legacy experience for this call” from the banner. You can also disable any Chrome extensions that interact with Meet if you notice issues. 

Rollout pace 

Google Workspace Customers 
  • Rapid Release domains: Gradual rollout (up to 15 days for feature visibility) began on May 24, 2021 
  • Scheduled Release domains: Gradual rollout (up to 15 days for feature visibility) will begin on June 1, 2021 
Users with personal Google accounts 
  • Rollout began on May 24, 2021 

Availability 

  • Available to all Google Workspace customers, as well as G Suite Basic and Business customers 
  • Available to users using Google Meet with a personal Google account 


View more insights on Keep activity with a new API and audit logs

What’s changing 

To help Workspace admins monitor and analyze their organizations’ Keep activity, starting today, admins will see audit logs for user activity in Google Keep in the Admin console under Reports > Audit log > Keep. Here, admins can monitor when domain users: 
  • Take action on a note, 
  • Add or remove a collaborator, 
  • Upload or delete an attachment. 
Admins who have enabled Keep for their users will see Keep audit events from May 1, 2021 onward. These audit logs are also available in the Reports API, where admins can expand their automated reporting to include Keep. The event log and Reports API do not include information about the content of notes or attachments, only that an internal user has taken a particular action on a note. 
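As an illustration only, admins who already automate reporting could pull these events with the google-api-python-client; in this sketch, the keep application name, the service account setup, and the admin user are assumptions, not details from this announcement.

from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/admin.reports.audit.readonly"]

# A service account with domain-wide delegation, impersonating an admin (both assumed).
credentials = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES
).with_subject("admin@example.com")

reports = build("admin", "reports_v1", credentials=credentials)

# List recent Keep activity; "keep" as applicationName is an assumption here.
response = reports.activities().list(
    userKey="all",
    applicationName="keep",
    maxResults=50
).execute()

for activity in response.get("items", []):
    print(activity["id"]["time"], activity["events"][0]["name"])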
Next, in the coming days, the rollout of a new API for Keep will begin. The API will allow admins and admin-enabled apps to manage Keep access controls for an organization, enabling support for cases such as enterprise data protection. For example, an admin could use the API to create Keep notes, or use an app to monitor Keep usage and help ensure that notes with sensitive information are not shared too widely. 
Note: These features are only available for Google Workspace customers, and are not available for users with personal Google accounts.

Who’s impacted

Admins and developers

Why it matters

Audit events for Keep allow administrators to better understand domain activity on Keep and monitor compliance with organizational guidelines. The Keep API gives admins more ways to manage how Keep data is accessed.
Together, we hope these features make it easier for admins to add a layer of security and compliance around Keep data in their organization.

Getting started

  • Admins: 
    • Keep audit logs will appear automatically for organizations that allow their users to use Keep. Visit the Help Center to learn more about Google Keep audit logs.
    • The Reports API can be configured by admins for Google Keep or for specific users defined by an admin. Use this guide to learn more about the Reports API.
    • Once the Keep API begins to roll out, API documentation will be published, and we will update this blog post to include links to that documentation.
  • End users: No action required.

Rollout pace

  • Keep audit logs: Available in the Admin console starting today, covering user activity from May 1, 2021 onward.
  • Keep API: Rolling out in the coming days.

Availability

  • Available to all Google Workspace customers, as well as G Suite Basic and Business customers
  • Not available to users with personal Google accounts.

Save photos from Gmail messages directly to Google Photos with a new “Save to Photos” button

What’s changing

Now when you get a photo attachment in a Gmail message, you can save it directly to Google Photos with a new “Save to Photos” button. You’ll see it next to the existing “Add to Drive” button on the attachment and while previewing the image attachment. Currently, this is only available for JPEG images.
Save to Photos Button from a Gmail attachment

Use the Save to Photos option while previewing an image in Gmail

Who’s impacted

End users

Why you’d use it

This new feature frees you from having to download photo attachments from Gmail messages and then manually upload them to Google Photos.

Getting started

  • Admins: There is no admin control for this feature.
  • End users: This feature will be ON by default. For an eligible photo, you can choose the “Save to Photos” button, which appears alongside the existing “Add to Drive” option. Visit the Help Center to learn more about how to upload files and folders to Google Drive.

Rollout pace

Rapid Release and Scheduled Release domains: Gradual rollout (up to 15 days for feature visibility) starting on May 26, 2021

Availability

Available to all Google Workspace customers, as well as G Suite Basic and Business customers. Also available to users with personal Google Accounts.
