News & Events | Page 87 of 128 | Black Cat Security

Uncategorized

Public preview – Support for multiple currencies on Retail Rates API for all Azure products/services

April 14, 2021 April 14, 2021

An unauthenticated experience to get retail prices in different currencies for all Azure products/services (including Reserved Instances).

Amazon Redshift, Launch, News

AQUA (Advanced Query Accelerator) – A Speed Boost for Your Amazon Redshift Queries

April 14, 2021 April 14, 2021

Amazon Redshift already provides up to 3x better price-performance at any scale than any other cloud data warehouse. We do this by designing our own hardware and by using Machine Learning (ML).

For example, we launched the SSD-based RA3 nodes for Amazon Redshift at the end of 2019 (Amazon Redshift Update – Next-Generation Compute Instances and Managed, Analytics-Optimized Storage) and added additional node sizes last April (Amazon Redshift update – ra3.4xlarge Nodes), and last December (Amazon Redshift Launches RA3.xlplus Nodes With Managed Storage). In addition to high-bandwidth networking, RA3 nodes incorporate a sophisticated data management model. As I said when we launched the RA3 nodes:

There’s a cache of large-capacity, high-performance SSD-based storage on each instance, backed by S3, for scale, performance, and durability. The storage system uses multiple cues, including data block temperature, data blockage, and workload patterns, to manage the cache for high performance. Data is automatically placed into the appropriate tier, and you need not do anything special to benefit from the caching or the other optimizations.

Our customers use RA3 nodes to maintain very large data sets and are seeing great results. From digital interactive entertainment to tracking impressions and performance for media buys, Amazon Redshift and RA3 nodes help our customers to store and query data at world scale, with up to 32 PB of data in a single data warehouse.

On the downside, it turns out that advances in storage performance have outpaced those in CPU performance, even as data warehouses continue to grow. The combination of large amounts of data (often accessed by queries that mandate a full scan), and limits on network traffic, can result in a situation where network and CPU bandwidth become limiting factors.

We can do something about that…

Introducing AQUA
Today we are making the ra3.4xl and ra3.16xl nodes even more powerful with the addition of AQUA (Advanced Query Accelerator). Building on the caches that I told you about earlier, and taking advantage of the AWS Nitro System and custom FPGA-based acceleration, AQUA pushes the computation needed to handle reduction and aggregation queries closer to the data. This reduces network traffic, offloads work from the CPUs in the RA3 nodes, and allows AQUA to improve the performance of those queries by up to 10x, at no extra cost and without any code changes. AQUA also makes use of a fast, high-bandwidth connection to Amazon Simple Storage Service (Amazon S3).

You can watch this video to learn a lot more about how AQUA uses the custom-designed hardware in the AQUA nodes to accelerate queries. The benefit comes about in several different ways. Each node performs the reduction and aggregation operations in parallel with the others. In addition to getting the n-fold speedup due to parallelism, the amount of data that must be sent to and processed on the compute nodes is generally far smaller (often just 5% of the original). Here’s a diagram that shows how all of the elements come together to accelerate queries:

If you are already using ra3.4xl or ra3.16xl nodes to host your data warehouse, you can start using AQUA in minutes. You simply enable AQUA for your clusters, restart them, and benefit from vastly improved performance for your reduction and aggregation queries. If you are ready to move into the future with RA3 and AQUA, you can create a new RA3-based cluster from a snapshot of your existing one, or you can use Classic resize to do an in-place upgrade.

Using AQUA
I don’t happen to have a data warehouse! I used a snapshot provided by the Redshift team to create a pair of clusters. The first one (prod-cluster) does not have AQUA enabled, and the second one (test-cluster) does:

To create the AQUA-enabled cluster, I simply choose Turn on on the Cluster configuration page:

My queries will use the lineitem table, which has over 18 billion rows:

I create a session on each cluster and disable the Redshift result cache:

And then I run the same query on both clusters:

select sum(l_orderkey), count(*) from lineitem where
l_comment similar to 'slyly %' or
l_comment similar to 'plant %' or
l_comment similar to 'fina %' or
l_comment similar to 'quick %' or
l_comment similar to 'slyly %' or
l_comment similar to 'quickly %' or
l_comment similar to ' %about%' or
l_comment similar to ' final%' or
l_comment similar to ' %final%' or
l_comment similar to ' breach%' or
l_comment similar to ' egular%' or
l_comment similar to ' %closely%' or
l_comment similar to ' closely%' or
l_comment similar to ' %idea%' or
l_comment similar to ' idea%' ;

If you take a look at the diagram above (and perhaps watch the video), you can see why AQUA can handle queries of this type very efficiently. Instead of sequentially scanning all 18 billion or so rows on the compute nodes, AQUA distributes the collection of similar to expressions to multiple AQUA nodes where they are run in parallel.

The query on the cluster that has AQUA enabled finishes in less than a minute:

The query on the cluster that does not have AQUA enabled finishes in a little under 4 minutes:

As is always the case with databases, complex data, and equally complex queries, your mileage will vary. For example, you could imagine a query that did a complex JOIN of rows SELECTed from multiple tables, where each SELECT would benefit from AQUA, and the overall speedup could be even greater. As you can see from the simple query that I used for this post, AQUA can dramatically reduce query time and perhaps even enable some new types of somewhat real-time queries that were simply not possible or practical in the past.

Things to Know
Here are a couple of interesting facts about AQUA:

Cluster Version – Your clusters must be running Redshift version 1.0.24421 or later in order to be able to make use of AQUA. To learn more about how to enable and disable AQUA, read Managing an AQUA Cluster.

Relevant Queries – AQUA is designed to deliver up to 10X performance on queries that perform large scans, aggregates, and filtering with LIKE and SIMILAR_TO predicates. Over time we expect to add support for additional queries.

Security – All data cached by AQUA is encrypted using your keys. After performing a filtering or aggregation operation, AQUA compresses the results, encrypts them, and returns them to Redshift.

Regions – AQUA is available today in the US East (N. Virginia), US West (Oregon), US East (Ohio), Europe (Ireland), and Asia Pacific (Tokyo) Regions, and will be coming to Europe (Frankfurt), Asia Pacific (Sydney), and Asia Pacific (Singapore) in the first half of 2021.

Pricing – As I mentioned earlier, there’s no additional charge for AQUA.

Try AQUA Today
If you are using ra3.4xl or ra3.16xl nodes to power your Redshift cluster, you can enable AQUA, restart the cluster, and run some test queries within minutes. Take AQUA for a spin and let me know what you think!

— Jeff;

Uncategorized

General availability: Log analytics workspace name uniqueness is now per resource group

April 14, 2021 April 14, 2021

You can now use the same workspace name in deployments across all your environment without a conflict. This is useful in template deployments when the same name can be used for every deployment for consistency.

Uncategorized

Cognitive Services – New Computer Vision API v3.2 now generally available

April 14, 2021 April 14, 2021

Updated Computer Vision API now generally available to improve image tagging, content moderation, OCR language expansion, and more.

Uncategorized

General availability: New Azure Policy built-in definitions for data encryption in Azure Monitor

April 14, 2021 April 14, 2021

With Azure Monitor built-in policy definitions for data encryption, you can enforce organizational standards and assess compliance of data encryption settings in your environment.

Uncategorized

Public preview: Functions upgrade in Azure Monitor log analytics

April 14, 2021 April 14, 2021

We have upgraded the functions experience in log analytics, providing new UI and capabilities to allow you to do more with functions.

Uncategorized

Public preview: Cognitive Services- Anomaly Detector now supports multivariate anomaly detection

April 14, 2021 April 14, 2021

Enable organizations to identify anomalies across multiple variables with multivariate Anomaly Detector.

Uncategorized

Reduce dataflow execution time with cluster reuse public preview in Azure Data Factory

April 14, 2021 April 14, 2021

Azure Data Factory (ADF) has released a “quick re-use” option as public preview to the Azure Integration Runtime TTL to reduce data flow execution to from 2 mins to under 20 seconds.

Uncategorized

Public preview of Microsoft Build of OpenJDK

April 14, 2021 April 14, 2021

The Microsoft Build of OpenJDK is a new Long-Term Support distribution of OpenJDK for your Java workloads, in the cloud and everywhere else.

SAP on Azure; Azure Monitor for SAP solutions

Public preview of SAP NetWeaver, North Europe, and new updates in cluster monitoring

April 14, 2021 April 14, 2021

Azure Monitor for SAP Solutions (AMS) supports SAP NetWeaver monitoring in public preview, new metrics for high-availability (pacemaker) clusters, and availability in North Europe region.