While working on a project to implement an LLM in Databricks using Hugging Face, I ran into the issue of not being able to download the required libraries and model files directly from the workspace. I had to find a workaround: download everything on my MacBook and upload it to DBFS so the model could be used in my Databricks notebook.

This guide explains how to download the sentence-transformers/all-MiniLM-L6-v2 model locally (on macOS), prepare it for offline use, upload it to Databricks DBFS, and load it successfully in a notebook.
🧰 Step 1: Setup Local Environment
# Use a Python 3.10 interpreter to create the virtual environment (example path from the python.org installer)
/Library/Frameworks/Python.framework/Versions/3.10/bin/python3 -m venv venv
source venv/bin/activate
pip install sentence-transformers huggingface_hub
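Optionally, a quick sanity check inside the activated venv confirms both packages import cleanly; the versions printed are simply whatever pip resolved:
# Sanity check: both libraries should import without error inside the venv
import sentence_transformers, huggingface_hub
print(sentence_transformers.__version__, huggingface_hub.__version__)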
🧱 Step 2: Download Model Files (Manually via curl)
mkdir all-MiniLM-L6-v2 && cd all-MiniLM-L6-v2
# Required files (sudo is not needed; the files download into a directory you own)
curl -L -O https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/config.json
curl -L -O https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/pytorch_model.bin
curl -L -O https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer_config.json
curl -L -O https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/vocab.txt
curl -L -O https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/sentence_bert_config.json
curl -L -O https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/modules.json
# Subfolder (the transformer config above lives at the repo root; only the pooling module has its own folder)
mkdir 1_Pooling
curl -L -o 1_Pooling/config.json https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/1_Pooling/config.json
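As an alternative to the manual curl calls: if your Mac has normal outbound access to huggingface.co, huggingface_hub (installed in Step 1) can mirror the whole repo in one call. A minimal sketch, assuming a reasonably recent huggingface_hub that supports local_dir:
# Pull every file in the repo into a local folder (alternative to the curl commands above)
from huggingface_hub import snapshot_download
snapshot_download(
    repo_id="sentence-transformers/all-MiniLM-L6-v2",
    local_dir="all-MiniLM-L6-v2",
)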
📦 Step 3: Zip & Upload to Databricks
cd all-MiniLM-L6-v2   # skip this if you are still inside the folder from Step 2
# Zip the contents so the model files sit at the top level of the archive
zip -r ../all-MiniLM-L6-v2.zip *
cd ..
databricks fs mkdirs dbfs:/FileStore/models/
databricks fs cp all-MiniLM-L6-v2.zip dbfs:/FileStore/models/all-MiniLM-L6-v2.zip --overwrite
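Before moving on, it is worth confirming the upload landed where expected. A quick check from any Databricks notebook cell (dbutils and display are notebook globals):
# List the uploaded zip and its size in DBFS
display(dbutils.fs.ls("dbfs:/FileStore/models/"))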
🧪 Step 4: Extract & Use in Databricks Notebook
# Copy the zip from DBFS to the driver's local disk, then extract it
dbutils.fs.cp("dbfs:/FileStore/models/all-MiniLM-L6-v2.zip", "file:/tmp/all-MiniLM-L6-v2.zip", True)

import zipfile
with zipfile.ZipFile("/tmp/all-MiniLM-L6-v2.zip", "r") as zip_ref:
    zip_ref.extractall("/tmp/all-MiniLM-L6-v2")

# Load the model from the local path and generate embeddings
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("/tmp/all-MiniLM-L6-v2")

sentences = ["Databricks is awesome.", "Transformers are powerful."]
embeddings = model.encode(sentences)
print(f"Embedding shape: {embeddings.shape}")
🧹 Tips
- Make sure to use curl -L to follow redirects.
- Verify that pytorch_model.bin is ~90 MB, not a small HTML error page.
- Adjust the extraction path if a nested folder appears inside the zip.
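The last two tips can be checked programmatically right after extraction; a small sketch assuming the /tmp paths used in Step 4:
# Inspect the extracted folder: look for an unexpected nested directory and confirm the weights size
import os
model_dir = "/tmp/all-MiniLM-L6-v2"
print(os.listdir(model_dir))
weights = os.path.join(model_dir, "pytorch_model.bin")
print(f"{os.path.getsize(weights) / 1e6:.1f} MB")  # expect roughly 90 MB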