Category: <span>LaunchNews</span>

New – Bring ML Models Built Anywhere into Amazon SageMaker Canvas and Generate Predictions

Amazon SageMaker Canvas provides business analysts with a visual interface to solve business problems using machine learning (ML) without writing a single line of code. Since we introduced SageMaker Canvas in 2021, many users have asked us for an enhanced, seamless collaboration experience that enables data scientists to share trained models with their business analysts with a few simple clicks.

Today, I’m excited to announce that you can now bring ML models built anywhere into SageMaker Canvas and generate predictions.

New – Bring Your Own Model into SageMaker Canvas
As a data scientist or ML practitioner, you can now seamlessly share models built anywhere, within or outside Amazon SageMaker, with your business teams. This removes the heavy lifting for your engineering teams to build a separate tool or user interface to share ML models and collaborate between the different parts of your organization. As a business analyst, you can now leverage ML models shared by your data scientists within minutes to generate predictions.

Let me show you how this works in practice!

In this example, I share an ML model that has been trained to identify customers that are potentially at risk of churning with my marketing analyst. First, I register the model in the SageMaker model registry. SageMaker model registry lets you catalog models and manage model versions. I create a model group called 2022-customer-churn-model-group and then select Create model version to register my model.

Amazon SageMaker Model Registry

To register your model, provide the location of the inference image in Amazon ECR, as well as the location of your model.tar.gz file in Amazon S3. You can also add model endpoint recommendations and additional model information. Once you’ve registered your model, select the model version and select Share.

Amazon SageMaker Studio - Share models from model registry with SageMaker Canvas users

You can now choose the SageMaker Canvas user profile(s) within the same SageMaker domain you want to share your model with. Then, provide additional model details, such as information about training and validation datasets, the ML problem type, and model output information. You can also add a note for the SageMaker Canvas users you share the model with.

Amazon SageMaker Studio - Share a model from Model Registry with SageMaker Canvas users

Similarly, you can now also share models trained in SageMaker Autopilot and models available in SageMaker JumpStart with SageMaker Canvas users.

The business analysts will receive an in-app notification in SageMaker Canvas that a model has been shared with them, along with any notes you added.

Amazon SageMaker Canvas - Received model from SageMaker Studio

My marketing analyst can now open, analyze, and start using the model to generate ML predictions in SageMaker Canvas.

Amazon SageMaker Canvas - Imported model from SageMaker Studio

Select Batch prediction to generate ML predictions for an entire dataset or Single prediction to create predictions for a single input. You can download the results in a .csv file.

Amazon SageMaker Canvas - Generate Predictions

New – Improved Model Sharing and Collaboration from SageMaker Canvas with SageMaker Studio Users
We also improved the sharing and collaboration capabilities from SageMaker Canvas with data science and ML teams. As a business analyst, you can now select which SageMaker Studio user profile(s) you want to share your standard-build models with.

Your data scientists or ML practitioners will receive a similar in-app notification in SageMaker Studio once a model has been shared with them, along with any notes from you. In addition to just reviewing the model, SageMaker Studio users can now also, if needed, update the data transformations in SageMaker Data Wrangler, retrain the model in SageMaker Autopilot, and share back the updated model. SageMaker Studio users can also recommend an alternate model from the list of models in SageMaker Autopilot.

Once SageMaker Studio users share back a model, you receive another notification in SageMaker Canvas that an updated model has been shared back with you. This collaboration between business analysts and data scientists will help democratize ML across organizations by bringing transparency to automated decisions, building trust, and accelerating ML deployments.

Now Available
The enhanced, seamless collaboration capabilities for Amazon SageMaker Canvas, including the ability to bring your ML models built anywhere, are available today in all AWS Regions where SageMaker Canvas is available with no changes to the existing SageMaker Canvas pricing.

Start collaborating and bring your ML model to Amazon SageMaker Canvas today!

— Antje

Introducing Amazon GameLift Anywhere – Run Your Game Servers on Your Own Infrastructure

In 2016, we launched Amazon GameLift, a dedicated hosting solution that securely deploys and automatically scales fleets of session-based multiplayer game servers to meet worldwide player demand.

With Amazon GameLift, you can create and upload a game server build once, replicate, and then deploy across multiple AWS Regions and AWS Local Zones to reach your players with low-latency experiences across the world. GameLift also includes standalone features for low-cost game fleets with GameLift FleetIQ and player matchmaking with GameLift FlexMatch.

Game developers asked us to reduce the wait time to deploy a candidate server build to the cloud each time they needed to test and iterate their game during the development phase. In addition, our customers told us that they often have ongoing bare-metal contracts or on-premises game servers and want the flexibility to use their existing infrastructure with cloud servers.

Today we are announcing the general availability of Amazon GameLift Anywhere, which decouples game session management from the underlying compute resources. With this new release, you can now register and deploy any hardware, including your own local workstations, under a logical construct called an Anywhere Fleet.

Because your local hardware can now be a GameLift-managed server, you can iterate on the server build in your familiar local desktop environment, and any server error can materialize in seconds. You can also set breakpoints in your environment’s debugger, thereby eliminating trial and error and further speeding up the iteration process.

Here are the major benefits for game developers to use GameLift Anywhere.

  • Faster game development – Instantly test and iterate on your local workstation while still leveraging GameLift FlexMatch and Queue services.
  • Hybrid server management – Deploy, operate, and scale dedicated game servers hosted in the cloud or on-premises, all from a single location.
  • Streamline server operations – Reduce cost and operational complexity by unifying server infrastructure under a single game server orchestration layer.

During the beta period of GameLift Anywhere, lots of customers gave feedback. For example, Nitro Games has been an Amazon GameLift customer since 2020 and have used the service for player matchmaking and managing dedicated game servers in the cloud. Daniel Liljeqvist, Senior DevOps Engineer at Nitro Games said “With GameLift Anywhere we can easily debug a game server on our local machine, saving us time and making the feedback loop much shorter when we are developing new games and features.”

GameLift Anywhere resources such as locations, fleets, and compute are managed through the same highly secure AWS API endpoints as all AWS services. This also applies to generating the authentication tokens for game server processes that are only valid for a limited amount of time for additional security. You can leverage AWS Identity and Access Management (AWS IAM) roles and policies to fully manage access to all the GameLift Anywhere endpoints.

Getting Started with GameLift Anywhere
Before creating your GameLift fleet in your local hardware, you can create custom locations to run your game builds or scripts. Choose Locations in the left navigation pane of the GameLift console and select Create location.

You can create a custom location of your hardware that you can use with your GameLift Anywhere fleet to test your games.

Choose Fleets from the left navigation pane, then choose Create fleet to add your GameLift Anywhere fleet in the desired location.

Choose Anywhere on the Choose compute type step.

Define your fleet details, such as a fleet name and optional items. For more information on settings, see Create a new GameLift fleet in the AWS documentation.

On the Select locations step, select the custom location that you created. The home AWS Region is automatically selected as the Region you are creating the fleet in. You can use the home Region to access and use your resources.

After completing the fleet creation steps to create your Anywhere fleet, you can see active fleets in both the managed EC2 instances and the Anywhere location. You also can integrate remote on-premises hardware by adding more GameLift Anywhere locations, so you can manage your game sessions from one place. To learn more, see Create a new GameLift fleet in the AWS documentation.

You can register your laptop as a compute resource in the fleet that you created. Use the fleet-id created in the previous step and add a compute-name and your laptop’s ip-address.

$ aws gamelift register-compute 
    --compute-name ChannyDevLaptop 
    --fleet-id fleet-12345678-abcdefghi 
    --ip-address 10.1.2.3

Now, you can start a debug session of your game server by retrieving the authorization token for your laptop in the fleet that you created.

$ aws gamelift get-compute-auth-token 
    --fleet-id fleet-12345678-abcdefghi 
    --compute-name ChannyDevLaptop

To run a debug instance of your game server executable, your game server must call InitSDK(). After the process is ready to host a game session, the game server calls ProcessReady(). To learn more, see Integrating games with custom game servers and Testing your integration in the AWS documentation.

Now Available
Amazon GameLift Anywhere is available in all Regions where Amazon GameLift is available.  GameLift offers a step-by-step developer guide, API reference guide, and GameLift SDKs. You can also see for yourself how easy it is to test Amazon GameLift using our sample game to get started.

Give it a try, and please send feedback to AWS re:Post for Amazon GameLift or through your usual AWS support contacts.

Channy

Announcing Amazon CodeCatalyst, a Unified Software Development Service (Preview)

Today, we announced the preview release of Amazon CodeCatalyst. A unified software development and delivery service, Amazon CodeCatalyst enables software development teams to quickly and easily plan, develop, collaborate on, build, and deliver applications on AWS, reducing friction throughout the development lifecycle.

In my time as a developer the biggest excitement—besides shipping software to users—was the start of a new project, or being invited to join a project. Both came with the anticipation of building something cool, cutting new code—seeing an idea come to life. However, starting out was sometimes a slow process. My team or I would need to update our local development environments—or entirely new machines—with tools, libraries, and programming frameworks. We had to create source code repositories and set up other shared tools such as Jira, Confluence, or Jenkins, configure build pipelines and other automation workflows, create test environments, and so on. Day-to-day maintenance of development and build environments consumed valuable team cycles and energy. Collaboration between the team took effort, too, because tools to share information and have a single source of truth were not available. Context switching between projects and dealing with conflicting dependencies in those projects, e.g., Python 3.6 for project X and Python 2.7 for project Y—especially when we had only a single machine to work on—further increased the burden.

It doesn’t seem to have gotten any better! These days, when talking to developers about their experiences, I often hear them express that they feel modern development has become even more complicated. This is due to having to select and configure a wider collection of modern frameworks and libraries, tools, cloud services, continuous integration and delivery pipelines, and many other choices that all need to work together to deliver the application experience. What was once manageable by one developer on one machine is now a sprawling, dynamic, complex net of decisions and tradeoffs, made even more challenging by the need to coordinate all this across dispersed teams.

Enter Amazon CodeCatalyst
I’ve spent some time talking with the team behind Amazon CodeCatalyst about their sources of inspiration and goals. Taking feedback from both new and experienced developers and service teams here at AWS, they examined the challenges typically experienced by teams and individual developers when building software for the cloud. Having gathered and reviewed this feedback, they set about creating a unified tool that smooths out the rough edges that needlessly slow down software delivery, and they added features to make it easier for teams to work together and collaborate. Features in Amazon CodeCatalyst to address these challenges include:

  • Blueprints that set up the project’s resources—not just scaffolding for new projects, but also the resources needed to support software delivery and deployment.
  • On-demand cloud-based Dev Environments, to make it easy to replicate consistent development environments for you or your teams.
  • Issue management, enabling tracing of changes across commits, pull requests, and deployments.
  • Automated build and release (CI/CD) pipelines using flexible, managed build infrastructure.
  • Dashboards to surface a feed of project activities such as commits, pull requests, and test reporting.
  • The ability to invite others to collaborate on a project with just an email.
  • Unified search, making it easy to find what you’re looking for across users, issues, code and other project resources.

There’s a lot in Amazon CodeCatalyst that I don’t have space to cover in this post, so I’m going to briefly cover some specific features, like blueprints, Dev Environments, and project collaboration. Other upcoming posts will cover additional features.

Project Blueprints
When I first heard about blueprints, they sounded like a feature to scaffold some initial code for a project. However, they’re much more! Parameterized application blueprints enable you to set up shared project resources to support the application development lifecycle and team collaboration in minutes—not just initial starter code for an application. The resources that a blueprint creates for a project include a source code repository, complete with initial sample code and AWS service configuration for popular application patterns, which follow AWS best practices by default. If you prefer, an external Git repository such as GitHub may be used instead. The blueprint can also add an issue tracker, but external trackers such as Jira can also be used. Then, the blueprint adds a build and release pipeline for CI/CD, which I’ll come to shortly, as well as other integrated tooling.

The project resources and integrated tools set up using blueprints, including the CI/CD pipeline and the AWS resources to host your application, make it so that you can press “deploy” and get sample code running in a few minutes, enabling you to jump right in and start working on your specific business logic.

Project blueprints when starting a new project

At launch, customers can choose from blueprints with Typescript, Python, Java, .NET, Javascript for languages and React, Angular, and Vue frameworks, with more to come. And you don’t need to start with a blueprint. You can build projects with workflows that run on anything that works with Linux and Windows operating systems.

Cloud-Based Dev Environments
Development teams can often run into a problem of “environment drift” where one team member has a slightly different version of a toolchain or library compared to everyone else or the test environments. This can introduce subtle bugs that might go unnoticed for some time. Dev Environment specifications, and the other shared resources, that blueprints create help ensure there’s no unnecessary variance, and everyone on the team gets the same setup to provide a consistent, repeatable experience between developers.

Amazon CodeCatalyst uses a devfile to define the configuration of an on-demand, cloud-based Dev Environment, which currently supports four resizable instance size options with 2, 4, 8, or 16 vCPUs. The devfile defines and configures all of the resources needed to code, test, and debug for a given project, minimizing the time the development team members need to spend on creating and maintaining their local development environments. Devfiles, which are added to the source code repository by the selected blueprint can also be modified if required. With Dev Environments, context switching between projects incurs less overhead—with one click, you can simply switch to a different environment, and you’re ready to start working. This means you’re easily able to work concurrently on multiple codebases without reconfiguring. Being on-demand, Dev Environments can also be paused, restarted, or deleted as needed.

Below is an example of a devfile that bootstraps a Dev Environment.

schemaVersion: 2.0.0
metadata:
  name: aws-universal
  version: 1.0.1
  displayName: AWS Universal
  description: Stack with AWS Universal Tooling
  tags:
    - aws
    - a12
  projectType: aws
commands:
  - id: npm_install
    exec:
      component: aws-runtime
      commandLine: "npm install"
      workingDir: /projects/spa-app
events:
  postStart:
    - npm_install
components:
  - name: aws-runtime
    container:
      image: public.ecr.aws/aws-mde/universal-image:latest
      mountSources: true
      volumeMounts:
        - name: docker-store
          path: /var/lib/docker
  - name: docker-store
    volume:
      size: 16Gi

Developers working in cloud-based Dev Environments provisioned by Amazon CodeCatalyst can use AWS Cloud9 as their IDE. However, they can just as easily work with Amazon CodeCatalyst from other IDEs on their local machines, such as JetBrains IntelliJ IDEA Ultimate, PyCharm Pro, GoLand, and Visual Studio Code. Developers can also create Dev Environments from within their IDE, such as Visual Studio Code or for JetBrains using the JetBrains Gateway app. Below, JetBrains IntelliJ is being used.

Editing an application source file in JetBrains IntelliJ

Build and Release Pipelines
The build and release pipeline created by the blueprint run on flexible, managed infrastructure. The pipelines can use on-demand compute or preprovisioned builds, including a choice of machine sizes, and you can bring your own container environments. You can incorporate build actions that are built in or provided by partners (e.g., Mend, which provides a software composition analysis build action), and you can also incorporate GitHub Actions to compose fully automated pipelines. Pipelines are configurable using either a visual editor or YAML files.

Build and release pipelines enable deployment to popular AWS services, including Amazon Elastic Container Service (Amazon ECS), AWS Lambda, and Amazon Elastic Compute Cloud (Amazon EC2). Amazon CodeCatalyst makes it trivial to set up test and production environments and deploy using pipelines to one or many Regions or even multiple accounts for security.

Running automated workflow

Project Collaboration
As a unified software development service, Amazon CodeCatalyst not only makes it easier to get started building and delivering applications on AWS, it helps developers of all levels collaborate on projects through a single shared project space and source of truth. Developers can be invited to collaborate using just an email. On accepting the invitation, the developer sees the full project context and can begin work at once using the project’s Dev Environments—no need to spend time updating or reconfiguring their local machine with required tools, libraries, or other pre-requisites.

Existing members of an Amazon CodeCatalyst space, or new members using their email, can be invited to collaborate on a project:

Inviting new members to collaborate on a project

Each will receive an invitation email containing a link titled Accept Invitation, which when clicked, opens a browser tab to sign in. Once signed in, they can view all the projects in the Amazon CodeCatalyst space they’ve been invited to and can also quickly switch to other spaces in which they are the owner or to which they’ve been invited.

Projects I'm invited to collaborate on

From there, they can select a project and get an immediate overview of where things stand, for example, the status of recent workflows, any open pull requests, and available Dev Environments.

CodeCatalyst project summary

On the Issues board, team members can see which issues need to be worked on, select one, and get started.

Viewing issues

Being able to immediately see the context for the project, and have access to on-demand cloud-based Dev Environments, all help with being able to start contributing more quickly, eliminating setup delays.

Get Started with Amazon CodeCatalyst in the Free Tier Today!
Blueprints to scaffold not just application code but also shared project resources supporting the development and deployment of applications, issue tracking, invite-by-email collaboration, automated workflows, and more are all available today in the newly released preview of Amazon CodeCatalyst to help accelerate your cloud development and delivery efforts. Learn more in the Amazon CodeCatalyst User Guide. And, as I mentioned earlier, additional blogs posts and other supporting content are planned by the team to dive into the range of features in more detail, so be sure to look out for them!

AWS Marketplace Vendor Insights – Simplify Third-Party Software Risk Assessments

AWS Marketplace Vendor Insights is a new capability of AWS Marketplace. It simplifies third-party software risk assessments when procuring solutions from the AWS Marketplace.

It helps you to ensure that the third-party software continuously meets your industry standards by compiling security and compliance information, such as data privacy and residency, application security, and access control, in one consolidated dashboard.

As a security engineer, you may now complete third-party software risk assessment in a few days instead of months. You can now:

  • Quickly discover products in AWS Marketplace that meet your security and certification standards by searching for and accessing Vendor Insights profiles.
  • Access and download current and validated information, with evidence gathered from the vendors’ security tools and audit reports. Reports are available for download on AWS Artifact third-party reports (now available in preview).
  • Monitor your software’s security posture post-procurement and receive notifications for security and compliance events.

As a software vendor, you can now reduce the operational burden of responding to buyer requests for risk assessment information. It gives your customers a self-service access experience. You can now:

  • Build your product’s security profile by uploading your ISO 27001 or SOC2 Type 2 report and completing a software risk assessment with AWS Audit Manager.
  • Store and share your compliance reports such as ISO 27001 and SOC2 Type 2, using AWS Artifact third-party reports (preview).
  • View and approve your buyer requests for viewing security controls and compliance artifacts stored in Vendor Insights.

Let’s See It in Action
I want to procure a solution on the AWS Marketplace. But before purchasing the product, as a security engineer, I want to review its compliance. I navigate to the AWS Marketplace page of the AWS Management Console. I use the faceted search on the left side to select vendors that are ISO 27001 compliant.

AWS MArketplace vendor insights - faceted searchI select a product. On the Product Overview page, I select View assessment data on the top right side (not shown on the screenshot). Then, I can see the overview page, which shows the Security certification received and the Expiration date.

AWS MArketplace vendor insights - certification receivedI select the Security and compliance tab and see that I need to request access to see the detailed security and compliance information. I select the Request access button on the top right side to ask the vendor for access to their compliance documents.

AWS MArketplace vendor insights - request access part 1

On the next page, I fill in the Your information form with my details, and I select Request access.

AWS MArketplace vendor insights - request access part 2The Next Steps section details what will happen next. The seller will contact me to sign a nondisclosure agreement (NDA). The seller will notify AWS Marketplace when the NDA is signed. Then, I will be granted access to Vendor Insights data.

The process can take a few days. For this demo, I switch to a fictional product—Everest—for which I have access to the compliance data. Here is the Security and compliance tab when my request for access is accepted.

The Summary section shows how many controls are available. It reports how many have been validated with evidence and how many have been self-reported by the seller. It also shows how many noncompliant controls are reported.

I can scroll down the page to see the details for multiple categories: Audit, compliance and security policy, Data security, Access management, Application security, Risk management and incident response, Business resiliency and continuity, End user device security, Infrastructure security, Human resources, and Security and configuration policy. The screenshot does not show all of them.

AWS Marketplace vendor insights - security and complianceI select the detail for Access control and see the list under Control name. For each of them, I can see the compliance for SOC2 Type 2, ISO 27001, and the Vendor self-assessment.

AWS Marketplace vendor insights - access controlI select the noncompliant one to get the details and the explanation the vendor provided.

AWS Marketplace vendor insights - non compliant details

If needed, I might also use AWS Artifact third-party reports (preview) to download the compliance reports.

For Software Vendors
As a software vendor, you can create a security profile for your SaaS products on AWS Marketplace and share this profile with your prospective and existing buyers. It helps you to reduce the manual work for engineering and security teams to respond to your customer questionnaires.

To create a security profile, you will need to complete a self-assessment using AWS Audit Manager on your marketplace management AWS account, share the current SOC2 Type II and ISO27001 compliance artifacts, if available, and turn on automated assessment using Audit Manager and AWS Config on your production AWS accounts.

Our team has created an AWS CloudFormation template to automate the onboarding steps. You can find the technical resources, such as the setup guide and the onboarding templates, on our GitHub repository. Once the profile is created, Vendor Insights will keep your security profile up to date by using automated evidence from Audit Manager and AWS Config. The updates to your profile are sent as notifications. Your security and compliance team can review the updates before they are shared with buyers.

With Vendor Insights, you manage access to your product’s security profile by approving the buyer’s subscription requests. When a buyer requests access, Vendor Insights shares their contact information over email to your compliance or deal-desk operations team. They can complete the NDA with the buyer and notify AWS Marketplace to grant the buyer access to your security profile. You can also request AWS Marketplace to revoke the buyer’s subscription on a later day if you don’t want to share your product’s security and compliance posture information with the buyer anymore.

The entire process is documented in the AWS Marketplace Vendor Insights seller guide.

Pricing and Availability
Vendor Insights is now available in all AWS Regions where AWS Marketplace is available.

The pricing model is very simple; there is no charge involved for using AWS Marketplace Vendor Insights.

For buyers, you can access and download assets during your procurement phase. You lose access to the Vendor Insights profile if you have not purchased the product after 60 days. When you purchase the product, you keep access to the product’s security profile for continuous monitoring of its compliance status.

For sellers, AWS Marketplace doesn’t charge to activate and use Vendor Insights. You will incur fees for using Audit Manager and AWS Config.

Go and start your risk assessments on the AWS Marketplace today.

— seb

Announcing Additional Data Connectors for Amazon AppFlow

Gathering insights from data is a more effective process if that data isn’t fragmented across multiple systems and data stores, whether on premises or in the cloud. Amazon AppFlow provides bidirectional data integration between on-premises systems and applications, SaaS applications, and AWS services. It helps customers break down data silos using a low- or no-code, cost-effective solution that’s easy to reconfigure in minutes as business needs change.

Today, we’re pleased to announce the addition of 22 new data connectors for Amazon AppFlow, including:

  • Marketing connectors (e.g., Facebook Ads, Google Ads, Instagram Ads, LinkedIn Ads).
  • Connectors for customer service and engagement (e.g., MailChimp, Sendgrid, Zendesk Sell or Chat, and more).
  • Business operations (Stripe, QuickBooks Online, and GitHub).

In total, Amazon AppFlow now supports over 50 integrations with various different SaaS applications and AWS services. This growing set of connectors can be combined to enable you to achieve 360 visibility across the data your organization generates. For instance, you could combine CRM (Salesforce), e-commerce (Stripe), and customer service (ServiceNow, Zendesk) data to build integrated analytics and predictive modeling that can guide your next best offer decisions and more. Using web (Google Analytics v4) and social surfaces (Facebook Ads, Instagram Ads) allows you to build comprehensive reporting for your marketing investments, helping you understand how customers are engaging with your brand. Or, sync ERP data (SAP S/4HANA) with custom order management applications running on AWS. For more information on the current range of connectors and integrations, visit the Amazon AppFlow integrations page.

Datasource connectors for Amazon AppFlow

Amazon AppFlow and AWS Glue Data Catalog
Amazon AppFlow has also recently announced integration with the AWS Glue Data Catalog to automate the preparation and registration of your SaaS data into the AWS Glue Data Catalog. Previously, customers using Amazon AppFlow to store data from supported SaaS applications into Amazon Simple Storage Service (Amazon S3) had to manually create and run AWS Glue Crawlers to make their data available to other AWS services such as Amazon Athena, Amazon SageMaker, or Amazon QuickSight. With this new integration, you can populate AWS Glue Data Catalog with a few clicks directly from the Amazon AppFlow configuration without the need to run any crawlers.

To simplify data preparation and improve query performance when using analytics engines such as Amazon Athena, Amazon AppFlow also now enables you to organize your data into partitioned folders in Amazon S3. Amazon AppFlow also automates the aggregation of records into files that are optimized to the size you specify. This increases performance by reducing processing overhead and improving parallelism.

You can find more information on the AWS Glue Data Catalog integration in the recent What’s New post.

Getting Started with Amazon AppFlow
Visit the Amazon AppFlow product page to learn more about the service and view all the available integrations. To help you get started, there’s also a variety of videos and demos available and some sample integrations available on GitHub. And finally, should you need a custom integration, try the Amazon AppFlow Connector SDK, detailed in the Amazon AppFlow documentation. The SDK enables you to build your own connectors to securely transfer data between your custom endpoint, application, or other cloud service to and from Amazon AppFlow‘s library of managed SaaS and AWS connectors.

— Steve

New Amazon FinSpace Simplifies Data Management and Analytics for Financial Services

Managing data is the core of the Financial Services Industry (FSI). I worked for private banking and fund management companies and helped analysts to collect, aggregate, and analyze hundreds of petabytes of data from internal data sources, such as portfolio management, order management, and accounting systems, but also from external data sources, such as real-time market feeds and historical equities pricing and alternative data systems. During that period, I spent my time trying to access data across organizational silos, to manage permissions, and to build systems to automate recurring tasks in ever-growing and more complex environments.

Today, we are launching a solution that would have reduced the time I spent on such projects: Amazon FinSpace is a data management and analytics solution purpose-built for the financial services industry. Amazon FinSpace reduces the time it takes to find and prepare data from months to minutes so analysts can spend more time on analysis.

What Our Customers Told Us
Before data can be combined and analyzed, analysts spend weeks or months to find and access data across multiple departments, each specialized by market, instrument, or geography. In addition to this logical segregation, data is also physically isolated in different IT systems, file systems, or networks. Because access to data is strictly controlled by governance and policy, analysts must prepare and explain access requests to the compliance department. This is a very manual, ad-hoc process.

Once granted access, they often must perform computational logic (such as Bollinger Bands, Exponential Moving Average, or Average True Range) on larger and larger datasets to prepare data for analysis or to derive information out of the data. These computations often run on servers with constrained capacity, as they were not designed to handle the size of workloads in the modern financial world. Even server-side systems are struggling to scale up and keep up with the ever-growing size of the datasets they need to store and analyze.

How Amazon FinSpace Helps
Amazon FinSpace removes the undifferentiated heavy lifting required to store, prepare, manage, and audit access to data. It automates the steps involved in finding data and preparing it for analysis. Amazon FinSpace stores and organizes data using industry and internal data classification conventions. Analysts connect to the Amazon FinSpace web interface to search for data using familiar business terms (“S&P500,” “CAC40,” “private equity funds in euro”).

Analysts can prepare their chosen datasets using a built-in library of more than 100 specialized functions for time series data. They can use the integrated Jupyter notebooks to experiment with data, and parallelize these financial data transformations at the scale of the cloud in minutes. Finally, Amazon FinSpace provides a framework to manage data access and to audit who is accessing what data and when. It tracks usage of data and generates compliance and audit reports.

Amazon FinSpace also makes it easy to work on historical data. Let’s imagine I built a model to calculate credit risk. This model relies on interest rate and inflation rate. These two rates get updated frequently. The risk level associated with a customer is not the same today as it was a few months ago, when inflation and interest rates were different. When data analysts are looking at data as it is now and as it was in the past, they call it bitemporal modeling. Amazon FinSpace makes it easy to go back in time and to compare how models are evolving alongside multiple dimensions.

To show you how Amazon FinSpace works, let’s imagine I have a team of analysts and data scientists and I want to provide them a tool to search, prepare, and analyze data.

How to Create an Amazon FinSpace Environment
As an AWS account administrator, I create a working environment for my team of financial analysts. This is a one-time setup.

I navigate to the Amazon FinSpace console and click Create Environment:

FinSpace Create environment

I give my environment a name. I select a KMS encryption key that will serve to encrypt data at rest. Then I choose either to integrate with AWS Single Sign-On or to manage usernames and passwords in Amazon FinSpace. AWS Single Sign-On integration allows your analysts to authenticate with external systems, such as a corporate Active Directory, to access the Amazon FinSpace environment. For this example, I choose to manage the credentials by myself.

FinSpace create environment details

I create a superuser who will have administration permission on the Amazon FinSpace environment. I click Add Superuser:

Finspace create super user 1Finspace create super user 2I take a note of the temporary password. I copy the text of the message to send to my superuser. This message includes the connection instructions for the initial connection to the environment.

The superuser has permission to add other users and to manage these users’ permissions in the Amazon FinSpace environment itself.

Finally, and just for the purpose of this demo, I choose to import an initial dataset. This allows me to start with some data in the environment. Doing so is just a single click in the console. The storage cost of this dataset is $41.46 / month and I can delete it at any time.

Under Sample data bundles, Capital Markets sample data, I click Install dataset. This can take several minutes, so it’s a good time to stand up, stretch your legs, and grab a cup of coffee.

FinSpace install sample dataset

How to Use an Amazon FinSpace Environment
In my role as financial analyst, my AWS account administrator sends me an email containing a URL to connect to my Amazon FinSpace Environment along with the related credentials. I connect to the Amazon FinSpace environment.

A couple of points are worth noting on the welcome page. First, on the top right side, I click the gear icon to access the environment settings. This is where I can add other users and manage their permissions. Second, you can browse the different data by categories on the left side, or search for specific terms by typing your search query on the search bar on top of the screen, and refine your search on the left side.

I can use Amazon FinSpace as my data hub. Data are fed through the API or I can load data directly from my workstation. I use tags to describe datasets. Datasets are containers for data; changes are versioned and I can create historical views of data or use the auto-updating data view that Amazon FinSpace maintains for me.

For this demo, let’s imagine I received a request from a portfolio manager who wants a chart showing realized volatility using 5 minute time bars for AMZN stock. Let me show you how I use the search bar to locate data and then use a notebook to analyze that data.

First, I search my dataset for stock price time bar summary, with 5 min intervals. I type “equity” in the search box. I’m lucky: The first result is the one I want. If needed, I could have refined the results using the facets on the left.

finspace search equity

Once I find the dataset, I explore its description, the schema, and other information. Based on these, I decide if this is the correct dataset to answer my portfolio manager’s request.

finspace dataset details

 

I click Analyze in notebook to start a Jupyter notebook where I’ll be able to further explore the data with PySpark. Once the notebook is open, I first check it is correctly configured to use the Amazon FinSpace PySpark kernel (starting the kernel takes 5-8 minutes).

Finspace select kernel

I click “play” on the first code box to connect to the Spark cluster.

finspace connect to cluster

To analyze my dataset and answer the specific question from my PM, I need to type a bit of PySpark code. For the purpose of this demo, I am using sample code from the Amazon FinSpace GitHub repository. You can upload the Notebook to your environment. Click the up arrow as shown on the top left of the screen above to select the file from your local machine.

This notebook pulls data from the Amazon FinSpace catalog “US Equity Time-Bar Summary” data I found earlier, and then uses the Amazon FinSpace built-in analytic function realized_volatility() to compute realized volatility for a group of tickers and exchange event types.

Before creating any graph, let’s have a sense of the dataset. What is the time range of the data ? What tickers are in this dataset ? I answer these questions with simple select() or groupby() functions provided by Amazon FinSpace. I prepare my FinSpaceAnalyticsAnalyser class with the code below :

from aws.finspace.analytics import FinSpaceAnalyticsManager
finspace = FinSpaceAnalyticsManager(spark = spark, endpoint=hfs_endpoint)

sumDF = finspace.read_data_view(dataset_id = dataset_id, data_view_id = view_id)

Once done, I can start to explore the dataset:
finspace analysis 1

I can see there are 561778 AMZN trades and price quotes between Oct. 1, 2019 and March 31, 2020.

To plot the realized volatility, I use Panda to plot the values:

finspace plot realized volatility code

When I execute this code block, I receive:

finspace plot realized volatility graph

 

Similarly, I can start a Bollinger Bands analysis to check if the volatility spike created an oversold condition on the AMZN stock. I am also using Panda to plot the values.

finspace plot bollinger bandcode

and generate this graph:

finspace plot realized volatility bolinger graph

 

I am ready to answer the portfolio manager’s question. But why was there a spike on Jan 30 2020 ? The answer is in the news: “Amazon soars after huge earnings beat.” 🙂

Availability and Pricing
Amazon FinSpace is available today in US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Canada (Central).

As usual, we charge you only for the resources your project uses. Pricing is based on three dimensions: the number of analysts with access to the service, the volume of data ingested, and the compute hours used to apply your transformations. Detailed pricing information is available on the service pricing page.

Give it a try today and let us know your feedback.

— seb