This document is relevant for: Inf1, Inf2, Trn1, Trn2

Announcing migration of NxD Core inference examples from the NxD Core repository to the NxD Inference repository starting this release#

Starting with Neuron Release 2.23, the following models and modules from the NxD Core inference examples are available only through the NxD Inference package:

  • Llama

  • Mixtral

  • DBRX

I currently use one of these inference samples from the NxD Core repository in my model code. What do I do?#

For customers who want to deploy models out of the box, the NxD Inference model hub is the recommended option. With NxD Inference, you can import these models and modules directly into your applications. Update your applications to use the examples in the NxD Inference repository: aws-neuron/neuronx-distributed-inference. Any models compiled with inference code from the NxD Core repository must be re-compiled. Refer to Migrating from NxD Core inference examples to NxD Inference for guidance, and see NxD Inference Overview for more information.
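As a minimal sketch of the application-side change, the check below verifies whether the NxD Inference package is importable before your code attempts to load models from it. The package name `neuronx_distributed_inference` is an assumption based on the repository name; consult the aws-neuron/neuronx-distributed-inference repository for the authoritative package layout and model import paths.

```python
import importlib.util

# Assumed importable package name for NxD Inference; verify against the
# aws-neuron/neuronx-distributed-inference repository documentation.
NXDI_PACKAGE = "neuronx_distributed_inference"


def nxd_inference_available() -> bool:
    """Return True if the NxD Inference package can be imported."""
    return importlib.util.find_spec(NXDI_PACKAGE) is not None


if nxd_inference_available():
    print("NxD Inference found: import models from its model hub.")
else:
    print("NxD Inference not found: install it before migrating "
          "model code off the NxD Core inference examples.")
```

A guard like this lets a migrating application fail fast with a clear message instead of raising an `ImportError` deep inside model-loading code.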

For customers who want to continue using NxD Core without NxD Inference, refer to the Llama 3.2 1B sample as a reference implementation: aws-neuron/neuronx-distributed
