AWS re:Invent 2019 demonstrated once again that AWS has been busy. The company announced 77 product launches, feature releases, and services at its annual conference, all of which, at some level, are driving IT evolution and business transformation. Here are five that we think will prove especially beneficial.
1. Amazon Redshift Updates
AWS made three notable updates to Redshift. First up, it’s now possible to unload an Amazon Redshift query result to an Amazon S3 data lake in the Apache Parquet format. Compared to text formats, the Parquet format is up to two times faster to unload and consumes up to six times less storage in Amazon S3. This lets users save the results of data transformation and enrichment done in Amazon Redshift to an Amazon S3 data lake in an open format. The data can then be analyzed with Redshift Spectrum or other AWS services.
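To make the Parquet unload concrete, here is a minimal sketch that builds the UNLOAD statement as a string. The table, bucket, and IAM role names are hypothetical placeholders, and the helper function is ours for illustration, not part of any AWS library.

```python
def build_unload_parquet(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a Redshift UNLOAD statement that writes query results as Parquet."""
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query are doubled.
    escaped_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


# Hypothetical table, bucket, and role names, for illustration only.
statement = build_unload_parquet(
    query="SELECT order_id, total FROM sales WHERE region = 'EMEA'",
    s3_prefix="s3://example-data-lake/sales/emea_",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftUnloadRole",
)
print(statement)
```

The resulting statement can be run from any SQL client connected to the cluster. The files it writes under the S3 prefix are then readable by Redshift Spectrum or other Parquet-aware tools.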
Another enhancement: RA3 nodes with managed storage are now available. RA3 nodes enable customers to scale and pay for compute and storage independently. A cluster can be sized based only on compute needs for more cost-effective data analysis.
RA3 nodes are built on the AWS Nitro System. They feature high bandwidth networking with performance that’s indistinguishable from bare metal. The nodes use large, high-performance SSDs as local caches. Leveraging workload patterns and advanced data management techniques, they deliver the performance of the local SSD while scaling storage automatically to S3. RA3 nodes remain fully compatible with existing workloads.
A cluster with RA3 nodes can be created using the AWS Management Console or the CreateCluster API. Migrating an existing cluster to RA3 simply requires taking a snapshot of the existing cluster and restoring it to an RA3 cluster. Alternatively, a classic resize can move the existing cluster to the new RA3 node type.
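As a rough sketch of the two paths just described, the helpers below assemble request parameters for creating an RA3 cluster and for restoring a snapshot onto RA3 nodes. We assume the `ra3.16xlarge` node type and a two-node minimum. The cluster names, snapshot name, and credentials are hypothetical, and the helper functions are ours, not part of an AWS SDK.

```python
RA3_NODE_TYPE = "ra3.16xlarge"  # assumption: the RA3 size used for this example


def ra3_create_cluster_params(cluster_id, master_username, master_password,
                              number_of_nodes=2):
    """Parameters for a new multi-node RA3 cluster (CreateCluster API)."""
    return {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "multi-node",
        "NodeType": RA3_NODE_TYPE,
        "NumberOfNodes": number_of_nodes,
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
    }


def ra3_restore_params(new_cluster_id, snapshot_id, number_of_nodes=2):
    """Parameters for restoring an existing snapshot onto RA3 nodes."""
    return {
        "ClusterIdentifier": new_cluster_id,
        "SnapshotIdentifier": snapshot_id,
        "NodeType": RA3_NODE_TYPE,
        "NumberOfNodes": number_of_nodes,
    }


# Hypothetical identifiers; in practice, load credentials from a secret
# store instead of hardcoding them.
create_params = ra3_create_cluster_params("example-ra3", "admin", "CHANGE_ME")
restore_params = ra3_restore_params("example-ra3-migrated", "example-snapshot")
print(create_params["NodeType"], restore_params["SnapshotIdentifier"])
```

With boto3 installed and credentials configured, these dictionaries could be passed as keyword arguments to the Redshift client's `create_cluster` and `restore_from_cluster_snapshot` calls. Check the current API reference for supported node types and node counts before relying on the defaults shown here.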
Yet another significant Redshift update is the Federated Query feature. It allows for querying and analyzing data across operational databases, data warehouses, and data lakes. Queries on live data in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL can be combined with queries across Amazon Redshift and Amazon S3 environments.
The Federated Query feature enables live data to be incorporated into business intelligence (BI) and reporting applications. The intelligent optimizer in Redshift pushes down and distributes a portion of the computation directly into the remote operational databases. This accelerates performance by reducing the data moved over the network. Redshift complements query execution, as needed, with its parallel processing capabilities.
Federated Query also makes it easy to ingest data into Redshift by allowing for direct queries of operational databases, applying transformations on the fly, and loading data into the target tables without requiring complex ETL pipelines.
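The ingestion pattern above can be sketched in two SQL statements, built here as Python strings: one that maps a PostgreSQL database into Redshift as an external schema, and one that transforms and loads live rows into a local table. Every schema, table, host, and ARN below is a hypothetical placeholder.

```python
def build_external_schema_sql(local_schema, database, host, iam_role_arn,
                              secret_arn, remote_schema="public", port=5432):
    """Return SQL that exposes a PostgreSQL database as an external schema."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {local_schema}\n"
        "FROM POSTGRES\n"
        f"DATABASE '{database}' SCHEMA '{remote_schema}'\n"
        f"URI '{host}' PORT {port}\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"SECRET_ARN '{secret_arn}';"
    )


def build_ingest_sql(target_table, external_schema):
    """Return SQL that aggregates live operational rows into a local table."""
    return (
        f"INSERT INTO {target_table} (order_date, order_count, revenue)\n"
        "SELECT CAST(created_at AS DATE), COUNT(*), SUM(total)\n"
        f"FROM {external_schema}.orders\n"
        "GROUP BY 1;"
    )


# Hypothetical names and ARNs, for illustration only.
schema_sql = build_external_schema_sql(
    local_schema="ops",
    database="appdb",
    host="example-db.cluster-abc123.us-east-1.rds.amazonaws.com",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleFederatedRole",
    secret_arn="arn:aws:secretsmanager:us-east-1:123456789012:secret:example",
)
ingest_sql = build_ingest_sql("analytics.daily_orders", "ops")
print(schema_sql)
print(ingest_sql)
```

The aggregation in the second statement is the "transformation on the fly": the operational rows are summarized as they are read, so no separate ETL job or staging table is involved.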
Get the full story on what’s new with Redshift here.
2. Amazon Managed Apache Cassandra Service (MCS)
Managing large Cassandra clusters has been difficult and time-consuming, and has required specialized expertise. That’s no longer the case with the introduction of Amazon MCS, a highly available, managed Apache Cassandra-compatible database service.
Amazon MCS is serverless, so there’s no administrative and maintenance burden. Customers only pay for the resources used. The service automatically scales tables up and down in response to application traffic. Users can build applications that serve thousands of requests per second with virtually unlimited throughput and storage.
Amazon MCS implements the Apache Cassandra version 3.11 CQL API, which enables the use of existing app code and drivers. It also provides consistent single-digit-millisecond read-and-write performance at any scale, so apps can be built with low latency for better user experience.
Data storage is fully managed and highly available, so there’s no need to provision storage. There’s also no limit on the size of a table or the number of items per table. For durability, table data is replicated automatically three times across multiple AWS Availability Zones (AZs).
Amazon MCS is integrated with AWS Identity and Access Management (IAM) to allow you to manage access to your data. Customer data is also encrypted at rest by default, and encryption keys are stored in AWS Key Management Service (KMS).
For more information, check out Amazon MCS features.
3. UltraWarm for Amazon Elasticsearch Service
UltraWarm, a performance-optimized warm storage tier, is now available in preview. It enables storing and interactively analyzing data using Elasticsearch and Kibana while reducing the cost per GB by up to 90% over existing Amazon Elasticsearch Service hot storage options.
UltraWarm enables Amazon Elasticsearch Service to support hot-warm domain configurations. It complements hot storage with lower priced, more durable storage for older data that’s not accessed often while maintaining the same interactive analysis experience.
UltraWarm combines Amazon S3 and optimized compute nodes powered by the AWS Nitro System to provide a hot-like experience for aggregations and visualizations.
To improve performance, the UltraWarm nodes use granular caching across all layers of the stack, adaptive prefetching, and query processing optimizations. UltraWarm provides similar or superior performance compared to traditional warm nodes that rely on high-density local storage.
Learn more about UltraWarm in the AWS News Blog.
4. Amazon EKS and Fargate
It’s now possible to use Amazon Elastic Kubernetes Service (EKS) to run Kubernetes pods on AWS Fargate with the upstream Kubernetes APIs. This enables the use of existing tooling to manage apps. You can now focus on designing and building your applications instead of managing the infrastructure that runs them.
The Kubernetes pods run with just the compute capacity requested. Each pod runs in its own VM-isolated environment without sharing resources with other pods. Users pay only for the pods they run, and only while those pods are running. This improves resource utilization and cost efficiency without any additional work.
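Which pods land on Fargate is decided by a Fargate profile, whose selectors match pods by namespace and, optionally, labels. The sketch below assembles the parameters for such a profile and mimics the matching rule locally. The cluster, role, subnet, and namespace names are hypothetical, and `pod_matches_profile` is our own illustration of the rule, not EKS code.

```python
def fargate_profile_params(cluster_name, profile_name, pod_execution_role_arn,
                           subnet_ids, namespace, labels=None):
    """Parameters describing a Fargate profile with a single selector."""
    selector = {"namespace": namespace}
    if labels:
        selector["labels"] = dict(labels)
    return {
        "clusterName": cluster_name,
        "fargateProfileName": profile_name,
        "podExecutionRoleArn": pod_execution_role_arn,
        "subnets": list(subnet_ids),
        "selectors": [selector],
    }


def pod_matches_profile(pod_namespace, pod_labels, selectors):
    """Return True if a pod matches at least one selector.

    A selector matches when the namespace is equal and every label it
    lists is present on the pod with the same value.
    """
    for selector in selectors:
        if selector["namespace"] != pod_namespace:
            continue
        wanted = selector.get("labels", {})
        if all(pod_labels.get(key) == value for key, value in wanted.items()):
            return True
    return False


# Hypothetical names, for illustration only.
profile = fargate_profile_params(
    cluster_name="example-cluster",
    profile_name="example-batch",
    pod_execution_role_arn="arn:aws:iam::123456789012:role/ExamplePodRole",
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
    namespace="batch",
    labels={"runtime": "fargate"},
)
print(pod_matches_profile("batch", {"runtime": "fargate", "app": "etl"},
                          profile["selectors"]))
print(pod_matches_profile("default", {"runtime": "fargate"},
                          profile["selectors"]))
```

With boto3 installed, the `profile` dictionary could be passed as keyword arguments to the EKS client's `create_fargate_profile` call. Pods that match no selector keep running on the cluster's regular worker nodes.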
With Amazon EKS and AWS Fargate, you get the serverless benefits of Fargate, the best practices of Amazon EKS, and the extensibility of Kubernetes out of the box.
Get more details at the AWS News Blog.
5. Amazon SageMaker
Amazon SageMaker, a fully managed service that provides the ability to build, train, and deploy machine learning (ML) models quickly, now has new tools and capabilities. They remove much of the heavy lifting from each step of the ML process and help developers better manage projects, experiments, and model accuracy.
Amazon SageMaker Studio is the first fully integrated development environment (IDE) for ML. It unifies all the tools needed for ML development, so developers can write code, track experiments, visualize data, and perform debugging and monitoring within a single, integrated visual interface.
Managing compute instances to view, run, or share a notebook is tedious. Amazon SageMaker Notebooks enables opening notebooks in seconds with a single click without having to provision instances. Compute resources can be increased or decreased without interruption. Notebook content is automatically copied and transferred to new instances.
Typical approaches to automated ML don’t give you the insights into the data used in creating the model or the logic that went into creating it. SageMaker Autopilot automates the creation of ML models and automatically chooses algorithms and tunes models. It also gives you complete control and visibility into your ML models.
Training an ML model typically entails many iterations to isolate and measure the impact of changing data sets, algorithm versions, and model parameters. Amazon SageMaker Experiments organizes, tracks, and compares ML training experiments on SageMaker. It’s integrated with Amazon SageMaker Studio, making it easy to browse active and past experiments, compare experiments on key performance metrics, and identify the best performing ones.
Amazon SageMaker Model Monitor continuously monitors ML models in production, detects deviations such as data drift that can degrade model performance over time, and sends alerts so remedial actions can be taken.
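To give a feel for what "data drift" means, here is a deliberately simple sketch that flags a feature whose live mean has moved far from its training baseline. This is our own toy check, not the method Model Monitor uses. The threshold of 3 and the sample values are arbitrary assumptions.

```python
import statistics


def mean_shift_score(baseline, live):
    """Shift of the live mean, in units of the baseline standard deviation."""
    baseline_std = statistics.stdev(baseline)
    shift = abs(statistics.mean(live) - statistics.mean(baseline))
    if baseline_std == 0:
        # A constant baseline: any change at all counts as an infinite shift.
        return 0.0 if shift == 0 else float("inf")
    return shift / baseline_std


def has_drifted(baseline, live, threshold=3.0):
    """Flag drift when the score exceeds an (arbitrary) threshold."""
    return mean_shift_score(baseline, live) > threshold


# Made-up feature values, for illustration only.
training_values = [10, 11, 9, 10, 10]
stable_traffic = [10, 10, 11, 9, 10]
shifted_traffic = [14, 15, 16, 15, 15]

print(has_drifted(training_values, stable_traffic))
print(has_drifted(training_values, shifted_traffic))
```

A production monitor would compare full distributions across many features and over time windows. The point here is only that drift is a measurable gap between the data a model was trained on and the data it now sees.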
Amazon SageMaker Debugger provides insights into the ML training process by automating data capture and analysis from training runs in real time with no code changes. When anomalies are detected, it sends alerts so developers can take remedial action. This can reduce debugging time from days to minutes. The debug data remains in the customer’s AWS account, so SageMaker Debugger can be used for even the most privacy-sensitive applications.
You can learn more about these and other SageMaker changes on the AWS News Blog.
There’s So Much More
We’ve highlighted just a few of the announcements that came out of re:Invent 2019 that we think are notable. However, there are many, many more that are going to play significant roles in how organizations approach business challenges, innovate, and stay on top of changing industry and customer needs.
If you’re interested in learning more — including how ClearScale can help you take advantage of what’s new from AWS — give us a call. We can set you up with one of our solution architects to discuss your current and future needs. As an AWS Premier Consulting Partner, we stay at the forefront of AWS services and best practices. We can put them to work for you.
Get in touch today to speak with a cloud expert and discuss how we can help.