Requirements for Building and Expanding AI Inference vs. Training

Investing in deep learning (DL) is a major decision that requires an understanding of each phase of the process, especially if you’re considering AI at the edge. Below are practical tips to help you expand your AI system.

Key Terminology for Deep Learning

  • Neural Network: An artificial neural network is a computing system inspired by the biological neural networks found in human and other animal brains. Its nodes (artificial neurons) are linked by connections (artificial synapses) so that they work together.
  • Training: Learning a new capability from existing data.
  • Inference: Applying this capability to new data (usually via an application or service).

Understanding AI Deep Learning Inference vs. Training

Training an artificial neural network requires teams to curate huge quantities of data into a designated structure, then feed that massive training dataset (the bigger, the better for training purposes) into a DL framework.

After the model is trained (it has learned which inputs lead to which conclusions), it can apply this new capability to novel data and make inferences about that data that allow action.

For example, after seeing 50,000 images of cats with solid color coats, the model should be able to infer that an image of a multicolored cat is also a cat and not something else, like a car or a bicycle. The app or service using the trained model then acts on that result.
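The split between the two phases can be sketched in a few lines of Python. This is a minimal illustration, not a real image classifier: it assumes a single artificial neuron, two made-up numeric features per sample, and hypothetical function names (train, infer). Training loops over labeled data many times to adjust the weights; inference is a single cheap pass over one new input.

```python
import math
import random

def train(samples, labels, epochs=50, learning_rate=0.5):
    """Training: fit a one-neuron logistic model with gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1.0 / (1.0 + math.exp(-z))
            error = prediction - y
            # Nudge each weight against the gradient of the loss.
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias

def infer(model, x):
    """Inference: one forward pass over a new input, no weight updates."""
    weights, bias = model
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0

# Hypothetical toy data: the label is 1 when the two features sum to more than 1.
rng = random.Random(0)
samples = [[rng.random(), rng.random()] for _ in range(100)]
labels = [1 if x[0] + x[1] > 1.0 else 0 for x in samples]

model = train(samples, labels)    # expensive, done once up front
print(infer(model, [0.9, 0.9]))   # cheap, repeated for every new input
print(infer(model, [0.1, 0.1]))
```

Even in this toy case, training touches every sample 50 times while inference touches one input once, which is why the two phases call for different hardware.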

However, the infrastructure needed for training differs from the infrastructure needed for inference in some critical ways.

What to Look for When Expanding DL Training Infrastructure

It is crucial to get as much raw compute power and as many nodes as you can afford. Think multi-core processors and GPUs. Why? The most critical issue our clients face today is getting accurately trained AI models. The more nodes and the more numerical precision you can build into your system, the faster and more accurate your training will be.

Training often requires the incremental addition of new data sets that remain clean and well-structured, so the workload keeps coming back rather than running once. For that reason, plan for training resources that are dedicated rather than shared with others in the datacenter. Focus on optimization for this workload and you’ll have better performance and more accurate training than if you try to build a general compute system on the assumption that it can take on other jobs in its free time.

Huge training datasets require massive networking and storage capabilities to hold and transfer the data, especially if your data is image-based or heterogeneous. Plan ahead for adequate networking and storage capacity, not just for strong computing.
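A back-of-the-envelope calculation can show how storage and networking interact. The sketch below is only an estimate under stated assumptions: the image count, average file size, link speeds, and the 80% link efficiency are all hypothetical figures chosen for illustration, not measurements.

```python
def dataset_size_bytes(num_items, bytes_per_item):
    """Raw storage needed for the dataset, ignoring replication and metadata."""
    return num_items * bytes_per_item

def transfer_seconds(size_bytes, link_gbit_per_s, efficiency=0.8):
    """Time to move the data once over a link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable after protocol overhead and contention.
    """
    usable_bytes_per_s = link_gbit_per_s * 1e9 / 8 * efficiency
    return size_bytes / usable_bytes_per_s

# Hypothetical workload: 10 million images at about 500 KB each.
size = dataset_size_bytes(10_000_000, 500_000)
print(f"dataset size: {size / 1e12:.1f} TB")

for gbit in (10, 100):
    minutes = transfer_seconds(size, gbit) / 60
    print(f"{gbit} Gbit/s link: about {minutes:.0f} minutes per full pass")
```

Under these assumptions a single pass over the data takes well over an hour on a 10 Gbit/s link and under ten minutes at 100 Gbit/s, and training typically makes many passes, so the network can become the bottleneck before the GPUs do.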

The greatest challenge in designing hardware for neural network training is scaling. Doubling the amount of training data doesn’t simply mean doubling the resources used to process it. Compute, storage, and networking needs can grow considerably faster than the data does.

What to Look for When Expanding DL Inference Infrastructure

Inference systems should be optimized for responsiveness. Think simpler, lower-power hardware than the training systems, but with the lowest latency possible.

Throughput is critical to inference. The process requires high I/O bandwidth and enough memory to hold both the required trained model(s) and the input data without having to make calls back to the storage components of the system.
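A rough memory check along these lines can be done before buying hardware. The sketch below is a simplification under stated assumptions: the parameter count, input shape, batch size, device sizes, and the 20% headroom are hypothetical, and it ignores intermediate activations and runtime overhead, which also consume memory in practice.

```python
def model_bytes(num_parameters, bytes_per_parameter=4):
    """Memory for the trained weights (4 bytes each for 32-bit floats)."""
    return num_parameters * bytes_per_parameter

def batch_bytes(batch_size, values_per_input, bytes_per_value=4):
    """Memory for one batch of input data."""
    return batch_size * values_per_input * bytes_per_value

def fits_in_memory(required_bytes, available_bytes, headroom=0.2):
    """True if the requirement fits while leaving an assumed safety margin."""
    return required_bytes <= available_bytes * (1 - headroom)

# Hypothetical model: 25 million parameters, 224x224 RGB inputs, batch of 32.
required = model_bytes(25_000_000) + batch_bytes(32, 3 * 224 * 224)
print(f"required: {required / 2**20:.1f} MiB")

for name, available in (("8 GiB accelerator", 8 * 2**30),
                        ("128 MiB edge device", 128 * 2**20)):
    print(name, "fits" if fits_in_memory(required, available) else "does not fit")
```

If the check fails, the usual options are a smaller batch, a lower-precision or smaller model, or a device with more memory.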

Datacenter resource requirements for a single inference instance are typically not as great as those for training. This is because the amount of data or number of users an inference platform can support is limited by the performance of the platform and the application requirements. Think of speech recognition software, which can only operate when there is one clear input stream. More than one input stream renders the application inoperable. It’s the same with inference input streams.
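Because each instance handles a bounded number of input streams, capacity planning for inference is mostly a matter of counting instances. The following sketch assumes a simple model in which every stream is processed one request at a time; the latency and request-rate figures and the function names are hypothetical.

```python
import math

def requests_per_second(latency_ms, concurrent_streams=1):
    """Peak rate one instance can sustain if each stream is served serially."""
    return concurrent_streams * 1000 / latency_ms

def instances_needed(peak_requests_per_second, latency_ms, concurrent_streams=1):
    """Smallest whole number of instances that covers the peak load."""
    per_instance = requests_per_second(latency_ms, concurrent_streams)
    return math.ceil(peak_requests_per_second / per_instance)

# Hypothetical service: 50 ms per inference, one input stream per instance.
print(requests_per_second(50))    # capacity of a single instance
print(instances_needed(90, 50))   # instances required for a 90 req/s peak
```

In this simplified model, lowering latency raises the capacity of every instance, which is one reason latency is the figure to optimize on inference hardware.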

To speak to a specialist about finding the right computing components to meet your AI needs, contact us.

