From fraud detection and email filtering to self-driving cars and patient diagnosis, the potential applications of machine learning seem almost endless. But if you’re interested in deploying a machine learning (ML) application, where do you begin? Do you build it in-house or outsource it? What resources are required? What platform should you build it on?
In the first of this three-blog series, we cover some of the key considerations for developing and deploying ML applications.
The Basics of Machine Learning
Before making any big decisions about an ML application, it’s important to understand what ML is and how it works. Definitions vary by source, but machine learning is just what the name suggests: the process of a machine learning something from data rather than being explicitly programmed with the answer.
In simple terms, data is fed into an algorithm. As the algorithm processes more data, it adjusts itself and gets better at predicting an answer or solution. The output of an ML algorithm run on data is a model. The model represents what was learned by the ML algorithm, including any rules, numbers, or other algorithm-specific data structures required to make predictions.
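To make the idea of a model concrete, here is a minimal Python sketch. The data (ad spend versus orders) and the names fit_line and predict are invented for illustration, and ordinary least squares simply stands in for whatever algorithm a real project would use. The point is that the "model" the algorithm returns is nothing more than the numbers it learned.

```python
from statistics import mean

def fit_line(xs, ys):
    """Learn a slope and an intercept from examples using ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    # The model is just the learned numbers.
    return {"slope": slope, "intercept": intercept}

def predict(model, x):
    """Apply the learned numbers to a new input."""
    return model["slope"] * x + model["intercept"]

# Hypothetical data: ad spend (in $1,000s) and the orders that followed.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
orders = [12.0, 14.0, 16.0, 18.0, 20.0]

model = fit_line(ad_spend, orders)
print(model)                # the learned slope and intercept
print(predict(model, 6.0))  # prediction for an input the algorithm never saw
```

For this made-up data the learned slope is 2 and the intercept is 10, so the prediction for a spend of 6 is 22.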
The ML Process
Briefly, the ML process typically includes these steps:
Problem framing. The ML process starts with problem framing — determining what you want to predict and what kind of observation data is needed to make those predictions.
Data collection. Next, data is collected that contains the answer or solution you want predicted. There are a few general requirements for data. It should:
- Consist of large, diverse data sets integrated from multiple sources, covering various business entities and collected across multiple time frames.
- Be supported by a data management infrastructure that is itself large and diverse, with multiple data platforms, tools, and processing engines. While data can come from multiple sources, the trend is toward consolidating as much as possible into a data lake. Data lakes, in turn, are moving to elastic cloud platforms to simplify automation and optimization and to improve cost efficiency.
- Be labeled as required by most ML algorithms, particularly in supervised learning. (The data labeling process takes raw data, such as images or text files, and adds one or more informative labels to provide context so that an ML algorithm can learn from it.)
- Be cleaned prior to use by employing processes such as deduplication, normalization, and error correction.
- Be transformed into a form that the ML system can understand. ML algorithms can’t work directly on raw formats such as images or text, so that data must first be converted into numbers.
- Be split into two portions: a larger one devoted to training and a smaller one that’s reserved for evaluation.
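The last three requirements (cleaning, transforming, and splitting) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the records, the helper names clean, to_features, and split, the choice of one-hot encoding and min-max scaling, and the 80/20 split. Real projects usually rely on a data library rather than hand-written helpers.

```python
import random

# Hypothetical raw records: (product category, price, label).
raw = [
    ("roses", 40.0, 1),
    ("roses", 40.0, 1),    # exact duplicate
    ("Tulips ", 20.0, 0),  # inconsistent casing and stray whitespace
    ("tulips", 30.0, 1),
    ("lilies", None, 0),   # missing price
    ("lilies", 60.0, 1),
    ("roses", 50.0, 0),
]

def clean(records):
    """Correct formatting errors, drop incomplete rows, and remove duplicates."""
    seen, cleaned = set(), []
    for category, price, label in records:
        if price is None:
            continue
        row = (category.strip().lower(), price, label)
        if row not in seen:
            seen.add(row)
            cleaned.append(row)
    return cleaned

def to_features(records):
    """Turn each record into numbers: one-hot category plus price scaled to 0..1."""
    categories = sorted({category for category, _, _ in records})
    prices = [price for _, price, _ in records]
    low = min(prices)
    span = (max(prices) - low) or 1.0  # avoid dividing by zero if all prices match
    rows = []
    for category, price, label in records:
        one_hot = [1.0 if category == c else 0.0 for c in categories]
        rows.append((one_hot + [(price - low) / span], label))
    return rows

def split(rows, train_fraction=0.8, seed=0):
    """Shuffle reproducibly, then reserve the smaller portion for evaluation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

cleaned = clean(raw)
rows = to_features(cleaned)
train, test = split(rows)
print(len(raw), len(cleaned), len(train), len(test))
```

Of the seven raw records, five survive cleaning. Four of those go to training and one is held back for evaluation.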
Algorithm selection. There are various algorithms from which to choose. Finding the right one is partly trial and error, although the selection will be influenced by the size and type of data you’re working with, the insights you’re seeking from the data, and how those insights will be used. (More on this later.)
ML algorithms are generally classified into supervised, unsupervised, semi-supervised, and reinforcement learning. Currently, most business-oriented applications use supervised ML algorithms.
Supervised ML algorithms apply what has been learned in the past to new data using labeled examples in order to predict future events or behavior. Starting from the analysis of a known training data set, the algorithm produces an inferred function to make predictions about the output values. The algorithm can also compare its output with the correct, intended output and find errors in order to modify the model.
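One of the simplest algorithms that works this way is the perceptron, sketched below in Python as an illustration only. The two features (repeat customer, opened the email) and the labels are hypothetical, as are the function names. Each time the predicted label differs from the correct one, the error is used to adjust the model.

```python
def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Learn weights and a bias by correcting the model whenever it is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if activation > 0 else 0
            error = label - predicted  # compare output with the intended output
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every training example is now classified correctly
            break
    return weights, bias

def classify(model, features):
    """Apply the inferred function to one example."""
    weights, bias = model
    return 1 if sum(w * x for w, x in zip(weights, features)) + bias > 0 else 0

# Hypothetical labeled examples: [repeat customer, opened email] -> purchased.
labeled = [
    ([0, 0], 0),
    ([0, 1], 0),
    ([1, 0], 0),
    ([1, 1], 1),
]

model = train_perceptron(labeled)
print([classify(model, features) for features, _ in labeled])
```

The perceptron only settles on a solution when the classes can be separated by a straight line, as they can in this toy data set.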
Training. The algorithm is fed the training data. It learns patterns and a mapping from the features to the label. This yields a model that can be used to make predictions on unseen data. As additional data is fed to the algorithm, it continues to “learn” and outputs a more finely tuned model.
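A minimal sketch of what training can look like, assuming a one-feature linear model trained with batch gradient descent. The data is invented and noise-free (it follows y = 2x + 1 exactly), and the names train and mse are hypothetical. Training can be resumed from an existing model when additional data arrives.

```python
def mse(model, examples):
    """Mean squared error of the model on a set of (feature, label) pairs."""
    w, b = model
    return sum((w * x + b - y) ** 2 for x, y in examples) / len(examples)

def train(examples, model=(0.0, 0.0), epochs=100, learning_rate=0.05):
    """Repeatedly nudge the parameters in the direction that reduces the error."""
    w, b = model
    n = len(examples)
    history = [mse((w, b), examples)]
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(mse((w, b), examples))
    return (w, b), history

# Hypothetical training data, followed by a later batch of additional data.
first_batch = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
more_data = [(4.0, 9.0), (5.0, 11.0)]

# A short first round of training gives a rough model.
model, first_history = train(first_batch, epochs=50)
print(model, first_history[0], first_history[-1])

# Training resumes from that model once more data is available.
model, second_history = train(first_batch + more_data, model=model, epochs=2000)
print(model, second_history[-1])
```

The error drops during each round, and the parameters end up very close to the slope of 2 and intercept of 1 that generated the data. The learning rate of 0.05 is small enough for this data set; a value that is too large would make the updates diverge.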
Evaluation. Once training is completed, the model is evaluated using the remaining data, which it did not see during training, to estimate its real-world performance. When the model performs well enough on that held-out data, it can be deployed for prediction on real-world data in various applications.
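As a small illustration, evaluation can be as simple as counting how often the model is right on the held-out examples. The model, the data, and the evaluate helper below are hypothetical. Accuracy is only one possible metric, and the right one depends on the problem.

```python
def evaluate(model, examples):
    """Return accuracy and error counts for a model on labeled examples."""
    counts = {"correct": 0, "false_positive": 0, "false_negative": 0}
    for features, label in examples:
        predicted = model(features)
        if predicted == label:
            counts["correct"] += 1
        elif predicted == 1:
            counts["false_positive"] += 1
        else:
            counts["false_negative"] += 1
    counts["accuracy"] = counts["correct"] / len(examples)
    return counts

def threshold_model(features):
    """Stand-in for an already trained model: predict 1 when the score exceeds 0.5."""
    return 1 if features[0] > 0.5 else 0

# Hypothetical evaluation data that was held back from training.
held_out = [([0.9], 1), ([0.2], 0), ([0.7], 1), ([0.4], 1), ([0.1], 0)]

report = evaluate(threshold_model, held_out)
print(report)
```

Here the model gets four of the five held-out examples right (80% accuracy) and misses one positive case. A result like this is weighed against the needs of the application before deciding whether to deploy.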
ML in the Real World
In theory, the ML process seems simple enough. However, there’s much more involved in using it to develop real-world applications — and in building out a comprehensive solution that meets what can often be complex, multi-faceted customer needs.
That’s where cloud native application development and ML experience, as well as powerful resources like those available from AWS, come into play. A good example is the work ClearScale did with an online floral company, which culminated in a proof of concept for a cost-effective, efficient-to-use-and-deploy recommendation engine.
Drawing on its expertise in using AWS services, ClearScale architected a solution that not only makes recommendations that are likely to align with customers’ preferences but also uses the behaviors of past customers to make accurate predictions about new shoppers.
The prototype includes a search engine component that re-ranks results within a session, so it always presents customers with high-quality feeds. It can also deliver personalized notifications.
Services such as Amazon Personalize, Amazon Pinpoint, and Amazon Elasticsearch Service figure prominently in the solution. But there’s more to it than ML-driven functions.
ClearScale designed the recommendation engine to be serverless. With no infrastructure to buy and maintain, the online floral company can quickly scale on demand and only pay for the resources used.
ClearScale also used an Infrastructure as Code (IaC) approach, which allows for automatically managing, monitoring, and provisioning resources rather than manually configuring discrete hardware devices and operating systems. That means if the application should fail for any reason, the online floral company can quickly and easily redeploy the architecture.
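As a rough illustration of the IaC idea, the sketch below describes a resource as data instead of configuring it by hand. It assumes AWS CloudFormation's JSON template format as the IaC tool. This post does not say which tool ClearScale used, and the logical name RecommendationAssets is hypothetical.

```python
import json

# Hypothetical, minimal CloudFormation-style template written as a Python dict.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative stack only, not the template used in the project.",
    "Resources": {
        # The logical name is made up for this example.
        "RecommendationAssets": {"Type": "AWS::S3::Bucket"},
    },
}

# Because the architecture is plain text, it can be kept in version control
# and redeployed by submitting the same template again.
template_json = json.dumps(template, indent=2, sort_keys=True)
print(template_json)
```

A real stack for a serverless application would declare many more resources, but the principle is the same: the template, not a manual checklist, is the source of truth for the environment.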
It’s these kinds of benefits that make ML application solutions all the more powerful, but they require the expertise and experience of a team that understands the full picture of app development and deployment.
There’s More to Know
To learn more about ML and ClearScale’s experience in developing ML-powered applications, watch for parts 2 and 3 of our ML blog series.
You can also download our free eBook, “Transform Your Business With Machine Learning.”
Get in touch today to speak with a cloud expert and discuss how we can help.