Machine learning capabilities have traditionally been the domain of Python and other languages with established ML ecosystems. However, Elixir developers can now leverage cutting-edge machine learning models without leaving the comfort of their preferred language, thanks to the Bumblebee library. In this article, we’ll explore how to get started with Bumblebee, examine its architecture, and implement some practical examples.
What is Bumblebee?
Bumblebee is an application-level library that makes working with pre-trained neural network models straightforward in Elixir. It serves as the integration layer between Nx (Numerical Elixir) and HuggingFace’s extensive collection of pre-trained models. This combination brings powerful machine learning capabilities to the Elixir ecosystem without requiring deep expertise in neural networks.
The library’s primary goal is to make machine learning accessible to Elixir developers by providing:
- Easy loading and running of pre-trained models
- Integration with HuggingFace’s model hub
- High-level APIs for common ML tasks
- Seamless integration with existing Elixir applications
Setting Up Your Environment
To get started with Bumblebee, you’ll need to set up your project with the necessary dependencies. Let’s create a new Mix project and add Bumblebee along with its dependencies:
mix new ml_project
cd ml_project
Now, update your mix.exs file with the required dependencies:
defp deps do
  [
    {:nx, "~> 0.5.1"},
    {:exla, "~> 0.5.1"},
    {:bumblebee, "~> 0.3.0"},
    {:axon, "~> 0.5.1"},
    {:stb_image, "~> 0.6"},
    {:kino, "~> 0.8.0", only: [:dev]},
    {:scidata, "~> 0.1.9", only: [:dev]}
  ]
end
After adding these dependencies, run:
mix deps.get
Let’s break down the key dependencies:
- Nx: The numerical computing library that provides the foundation for machine learning in Elixir
- EXLA: Google’s XLA (Accelerated Linear Algebra) backend for Nx, which enables GPU acceleration
- Bumblebee: The main library for working with pre-trained models
- Axon: The neural network framework for Elixir
- StbImage: A small image decoder we’ll use to turn image files into Nx tensors for the vision example
- Kino: For interactive notebook-like functionality if you’re using Livebook
- Scidata: Provides access to common scientific datasets for experimentation
Understanding Bumblebee’s Architecture
Bumblebee connects several components of the Elixir machine learning ecosystem:
- Models: Pre-trained neural networks loaded from HuggingFace or custom sources
- Featurizers: Components that convert raw inputs (text, images, etc.) into model-compatible features
- Tasks: High-level abstractions for specific ML tasks like text classification or image generation
- Serving: Tools for deploying models in production environments
The flow typically looks like this:
Raw Input → Featurizer → Model → Post-processing → Results
This pipeline-based approach makes it easy to compose different components for various machine learning tasks.
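To make these stages concrete, here is a rough sketch of what the high-level task helpers do under the hood. It is simplified for illustration (the real servings also handle batching, padding, and compilation), but each call below is part of the public Bumblebee/Axon API:

{:ok, model_info} = Bumblebee.load_model({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

# Featurizer stage: raw text -> model inputs (token ids, attention mask)
inputs = Bumblebee.apply_tokenizer(tokenizer, "Bumblebee makes ML approachable")

# Model stage: run the Axon graph with its pre-trained parameters
outputs = Axon.predict(model_info.model, model_info.params, inputs)

# Post-processing stage: highest logit wins (softmax would give probabilities)
id = outputs.logits |> Nx.argmax() |> Nx.to_number()
label = model_info.spec.id_to_label[id]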
Your First Bumblebee Application
Let’s create a simple text classification application using Bumblebee. We’ll use a pre-trained model to determine the sentiment of text inputs.
Create a new file named lib/sentiment_analyzer.ex:
defmodule SentimentAnalyzer do
  def load_model do
    {:ok, model_info} = Bumblebee.load_model({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

    Bumblebee.Text.text_classification(model_info, tokenizer,
      compile: [batch_size: 1],
      defn_options: [compiler: EXLA]
    )
  end

  def analyze(serving, text) do
    %{predictions: predictions} = Nx.Serving.run(serving, text)
    prediction = Enum.max_by(predictions, & &1.score)
    {prediction.label, prediction.score}
  end
end
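Before wiring up a script, you can already try the module from an iex -S mix session. The exact score will differ, but the shape of the result looks like this:

iex> serving = SentimentAnalyzer.load_model()
iex> SentimentAnalyzer.analyze(serving, "Bumblebee is great!")
{"POSITIVE", 0.998}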
Now, let’s create a simple script to test our sentiment analyzer. Create a file named lib/run_sentiment.ex:
defmodule RunSentiment do
  def main do
    serving = SentimentAnalyzer.load_model()

    test_texts = [
      "I absolutely loved this movie! The acting was superb.",
      "The service at this restaurant was terrible and the food was cold.",
      "The product works as expected, nothing special.",
      "Elixir's pattern matching makes code so elegant and readable."
    ]

    for text <- test_texts do
      {sentiment, confidence} = SentimentAnalyzer.analyze(serving, text)
      IO.puts("Text: #{text}")
      IO.puts("Sentiment: #{sentiment}, Confidence: #{Float.round(confidence, 3)}")
      IO.puts("---")
    end
  end
end
To run the example:
mix run -e "RunSentiment.main()"
You should see output showing the sentiment (positive or negative) and confidence score for each test text.
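For example, the first entry should produce output roughly like this (exact scores will vary):

Text: I absolutely loved this movie! The acting was superb.
Sentiment: POSITIVE, Confidence: 0.999
---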
Deeper Dive: Image Classification
Now, let’s explore image classification with Bumblebee. Create a file named lib/image_classifier.ex:
defmodule ImageClassifier do
  def load_model do
    {:ok, model_info} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

    Bumblebee.Vision.image_classification(model_info, featurizer,
      compile: [batch_size: 1],
      defn_options: [compiler: EXLA]
    )
  end

  def classify_image(serving, image_path) do
    # The serving expects a decoded image tensor, not raw file bytes,
    # so we use StbImage (from our deps) to read and decode the file
    image =
      image_path
      |> StbImage.read_file!()
      |> StbImage.to_nx()

    output = Nx.Serving.run(serving, image)

    # Get the top 5 predictions
    output.predictions
    |> Enum.sort_by(& &1.score, :desc)
    |> Enum.take(5)
    |> Enum.map(fn %{label: label, score: score} -> {label, Float.round(score, 3)} end)
  end
end
And a script to test it:
defmodule RunImageClassification do
  def main(image_path) do
    serving = ImageClassifier.load_model()

    IO.puts("Classifying image: #{image_path}")
    predictions = ImageClassifier.classify_image(serving, image_path)

    IO.puts("Top predictions:")

    for {label, score} <- predictions do
      IO.puts("#{label}: #{score}")
    end
  end
end
To run this example, you would need an image file. Note the quoting: single quotes in Elixir create a charlist, so the path must be wrapped in double quotes:
mix run -e 'RunImageClassification.main("path/to/your/image.jpg")'
Running Inference Efficiently
One of the strengths of Bumblebee is its ability to leverage hardware acceleration through EXLA. Here’s how you can configure your application for optimal performance:
# Use EXLA as the default Nx backend
Application.put_env(:nx, :default_backend, EXLA.Backend)

# For GPU acceleration (if you have a compatible GPU), point the serving
# at the CUDA client in its defn options:
#   defn_options: [compiler: EXLA, client: :cuda]

# For CPU-only inference:
#   defn_options: [compiler: EXLA, client: :host]
For production deployments, you’ll want to leverage batching to process multiple inputs efficiently:
Bumblebee.Text.text_classification(model_info, tokenizer,
  compile: [batch_size: 8], # Increase batch size for production
  defn_options: [compiler: EXLA]
)
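Batching pays off most when the serving runs as a single shared process rather than being invoked directly. Nx.Serving can be placed under your supervision tree and queried with Nx.Serving.batched_run/2, which transparently groups concurrent requests into batches. A minimal sketch, where the MyApp.SentimentServing name is illustrative:

# In your application's supervision tree:
children = [
  {Nx.Serving,
   serving: SentimentAnalyzer.load_model(),
   name: MyApp.SentimentServing,
   batch_timeout: 100}
]

# Anywhere in your application; concurrent callers share a batch:
Nx.Serving.batched_run(MyApp.SentimentServing, "Great library!")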
Creating a Text Generation Pipeline
Let’s implement a more complex example: text generation using a pre-trained GPT-2 model. Create a file named lib/text_generator.ex:
defmodule TextGenerator do
  def load_model do
    {:ok, model_info} = Bumblebee.load_model({:hf, "gpt2"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "gpt2"})

    Bumblebee.Text.generation(model_info, tokenizer,
      compile: [batch_size: 1, sequence_length: 100],
      defn_options: [compiler: EXLA],
      min_new_tokens: 20,
      max_new_tokens: 100,
      strategy: %{type: :greedy}
    )
  end

  def generate(serving, prompt) do
    %{results: [%{text: text}]} = Nx.Serving.run(serving, prompt)
    text
  end
end
And a script to test it:
defmodule RunTextGeneration do
  def main do
    serving = TextGenerator.load_model()

    prompts = [
      "Elixir is a functional programming language that",
      "The best thing about machine learning is",
      "Once upon a time in a distant galaxy"
    ]

    for prompt <- prompts do
      IO.puts("Prompt: #{prompt}")
      generated_text = TextGenerator.generate(serving, prompt)
      IO.puts("Generated: #{generated_text}")
      IO.puts("---")
    end
  end
end
To run the example:
mix run -e "RunTextGeneration.main()"
Integrating Bumblebee with Phoenix
One of the most powerful aspects of Bumblebee is how easily it integrates with Phoenix applications. Let’s look at how we might set up an API endpoint for our sentiment analysis model.
First, you’d add Bumblebee and its dependencies to your Phoenix project. Then, create a module to initialize and hold your model:
defmodule MyApp.ML.SentimentModel do
  use GenServer

  def start_link(_) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  def analyze(text) do
    GenServer.call(__MODULE__, {:analyze, text})
  end

  @impl true
  def init(_) do
    {:ok, model_info} = Bumblebee.load_model({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

    serving =
      Bumblebee.Text.text_classification(model_info, tokenizer,
        compile: [batch_size: 4],
        defn_options: [compiler: EXLA]
      )

    {:ok, serving}
  end

  @impl true
  def handle_call({:analyze, text}, _from, serving) do
    %{predictions: predictions} = Nx.Serving.run(serving, text)
    prediction = Enum.max_by(predictions, & &1.score)

    result = %{
      sentiment: prediction.label,
      confidence: Float.round(prediction.score, 3)
    }

    {:reply, result, serving}
  end
end
Add this to your application supervision tree:
def start(_type, _args) do
  children = [
    # ... other children
    MyApp.ML.SentimentModel
  ]

  opts = [strategy: :one_for_one, name: MyApp.Supervisor]
  Supervisor.start_link(children, opts)
end
Then, create a controller for your API:
defmodule MyAppWeb.SentimentController do
  use MyAppWeb, :controller

  def analyze(conn, %{"text" => text}) do
    result = MyApp.ML.SentimentModel.analyze(text)
    json(conn, result)
  end
end
And add a route:
scope "/api", MyAppWeb do
pipe_through :api
post "/sentiment", SentimentController, :analyze
end
This setup creates an API endpoint that accepts text input and returns the sentiment analysis result.
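With the server running, you can exercise the endpoint from the command line (assuming Phoenix’s default port of 4000):

curl -X POST http://localhost:4000/api/sentiment \
  -H "Content-Type: application/json" \
  -d '{"text": "Elixir is wonderful!"}'

which returns a JSON body along the lines of {"sentiment":"POSITIVE","confidence":0.999}.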
Performance Considerations
When deploying Bumblebee in production environments, keep these performance considerations in mind:
- Memory Management: Large models require significant memory. Ensure your server has adequate RAM.
- GPU Acceleration: For production workloads, GPU acceleration can provide substantial performance improvements. Configure EXLA to use CUDA or ROCm for compatible hardware.
- Batching: Process inputs in batches when possible to maximize throughput.
- Model Size: Consider using smaller, distilled models for production if latency is a concern.
- Warm-up Time: The first inference takes longer due to compilation. Pre-warm your models at application startup.
Here’s an example of how to implement a pre-warming step:
@impl true
def init(_) do
  # ... load model and create serving

  # Pre-warm the model with a dummy inference
  Nx.Serving.run(serving, "This is a pre-warming example text.")

  {:ok, serving}
end
Conclusion
Bumblebee brings the power of modern machine learning to the Elixir ecosystem. By providing easy access to pre-trained models and integrating with the broader Nx ecosystem, it enables Elixir developers to incorporate sophisticated ML capabilities into their applications without leaving their preferred language environment.
While Bumblebee is still evolving, it already provides a solid foundation for implementing various ML tasks. Its integration with Phoenix makes it particularly powerful for web applications that need machine learning capabilities.
In future articles, we’ll explore more advanced Bumblebee topics, including fine-tuning models, creating custom pipelines, and implementing real-time ML services in Phoenix applications.
References
- Bumblebee GitHub Repository. https://github.com/elixir-nx/bumblebee
- Nx Documentation. https://hexdocs.pm/nx/Nx.html
- HuggingFace Model Hub. https://huggingface.co/models
- Livebook Project. https://github.com/livebook-dev/livebook
- Tan, S. (2021). Machine Learning in Elixir with Nx. Pragmatic Programmers blog. https://pragprog.com/categories/elixir-phoenix/