🍎 feat: Apple MLX as Known Endpoint (#2580)
* add integration with Apple MLX

* fix: apple icon + image mkd link

---------

Co-authored-by: "Extremys" <[email protected]>
Co-authored-by: Danny Avila <[email protected]>
3 people authored May 1, 2024
1 parent 0e50c07 commit d21a056
Showing 8 changed files with 77 additions and 0 deletions.
9 changes: 9 additions & 0 deletions api/server/services/Config/loadConfigModels.spec.js
@@ -46,6 +46,15 @@ const exampleConfig = {
fetch: false,
},
},
{
name: 'MLX',
apiKey: 'user_provided',
baseURL: 'http://localhost:8080/v1/',
models: {
default: ['Meta-Llama-3-8B-Instruct-4bit'],
fetch: false,
},
},
],
},
};
Binary file added client/public/assets/mlx.png
1 change: 1 addition & 0 deletions client/src/components/Chat/Menus/Endpoints/UnknownIcon.tsx
@@ -9,6 +9,7 @@ const knownEndpointAssets = {
[KnownEndpoints.fireworks]: '/assets/fireworks.png',
[KnownEndpoints.groq]: '/assets/groq.png',
[KnownEndpoints.mistral]: '/assets/mistral.png',
[KnownEndpoints.mlx]: '/assets/mlx.png',
[KnownEndpoints.ollama]: '/assets/ollama.png',
[KnownEndpoints.openrouter]: '/assets/openrouter.png',
[KnownEndpoints.perplexity]: '/assets/perplexity.png',
34 changes: 34 additions & 0 deletions docs/install/configuration/ai_endpoints.md
@@ -271,6 +271,40 @@ Some of the endpoints are marked as **Known,** which means they might have speci

![image](https://github.com/danny-avila/LibreChat/assets/110412045/ddb4b2f3-608e-4034-9a27-3e94fc512034)

## Apple MLX
> MLX API key: ignored - [MLX OpenAI Compatibility](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md)

**Notes:**

- **Known:** icon provided.

- The API is strict about unrecognized parameters and may reject requests that include them.
- The server hosts only one model at a time; to offer multiple models, run additional server instances and configure each as a separate endpoint with its own `baseURL`.

```yaml
- name: "MLX"
apiKey: "mlx"
baseURL: "http://localhost:8080/v1/"
models:
default: [
"Meta-Llama-3-8B-Instruct-4bit"
]
fetch: false # fetching list of models is not supported
titleConvo: true
titleModel: "current_model"
summarize: false
summaryModel: "current_model"
forcePrompt: false
modelDisplayLabel: "Apple MLX"
addParams:
max_tokens: 2000
"stop": [
"<|eot_id|>"
]
```
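Because each MLX server instance hosts a single model, adding a second model means launching another `mlx_lm.server` instance and registering it as its own endpoint. A sketch of such a second entry, assuming a hypothetical second instance listening on port 8081 (the port, endpoint name, and model name are illustrative):

```yaml
# Hypothetical second MLX endpoint; assumes another mlx_lm.server
# instance is running on port 8081 with a different model loaded.
- name: "MLX-Mistral"
  apiKey: "mlx"
  baseURL: "http://localhost:8081/v1/"
  models:
    default: [
      "Mistral-7B-Instruct-v0.2-4bit"
      ]
    fetch: false
  titleConvo: true
  titleModel: "current_model"
  modelDisplayLabel: "Apple MLX (Mistral)"
```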

![image](https://github.com/danny-avila/LibreChat/blob/ae9d88b68c95fdb46787bca1df69407d2dd4e8dc/client/public/assets/mlx.png)

## Ollama
> Ollama API key: Required but ignored - [Ollama OpenAI Compatibility](https://github.com/ollama/ollama/blob/main/docs/openai.md)

1 change: 1 addition & 0 deletions docs/install/configuration/custom_config.md
@@ -1213,6 +1213,7 @@ Each endpoint in the `custom` array should have the following structure:
- "Perplexity"
- "together.ai"
- "Ollama"
- "MLX"

### **models**

30 changes: 30 additions & 0 deletions docs/install/configuration/mlx.md
@@ -0,0 +1,30 @@
---
title:  Apple MLX
description: Using LibreChat with Apple MLX
weight: -6
---
## MLX
Use [MLX](https://ml-explore.github.io/mlx/build/html/index.html) for:

* Running large language models locally on Apple Silicon hardware (M1, M2, M3 ARM chips with unified CPU/GPU memory)


### 1. Install MLX on macOS
#### Mac M series only
MLX supports GPU acceleration through the Apple Metal backend via the `mlx-lm` Python package. Follow the instructions at [Install `mlx-lm` package](https://github.com/ml-explore/mlx-examples/tree/main/llms)


### 2. Load Models with MLX
MLX can load common HuggingFace models directly, but it is recommended to use the converted and tested quantized models (choose one that fits your hardware) provided by the [mlx-community](https://huggingface.co/mlx-community).

Follow the instructions at [Install `mlx-lm` package](https://github.com/ml-explore/mlx-examples/tree/main/llms)

1. Browse the available models on [HuggingFace](https://huggingface.co/models?search=mlx-community)
2. Copy the model identifier `<author>/<model_id>` from the model page (e.g. `mlx-community/Meta-Llama-3-8B-Instruct-4bit`)
3. Check the model size; models that fit in CPU/GPU unified memory perform best.
4. Launch the model server as described in [Run OpenAI Compatible Server Locally](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md)

```sh
mlx_lm.server --model <author>/<model_id>
```

### 3. Configure LibreChat
Use the `librechat.yaml` [configuration file (guide here)](./ai_endpoints.md) to add MLX as a separate endpoint; an example with Llama-3 is provided there.
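The MLX server implements the OpenAI chat-completions schema, so the request LibreChat issues can be sketched with the Python standard library alone. The helper function below is illustrative, not LibreChat's actual client code; the model name and stop token come from the example configuration:

```python
import json

def build_chat_request(model, messages, max_tokens=2000, stop=None):
    """Build an OpenAI-compatible /v1/chat/completions request body.

    Field names follow the OpenAI chat API; which optional fields to
    include is an assumption for illustration.
    """
    payload = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "stream": False,
    }
    if stop:
        payload["stop"] = stop
    return json.dumps(payload)

# Request body matching the example config (Llama-3 with its EOT stop token)
body = build_chat_request(
    "Meta-Llama-3-8B-Instruct-4bit",
    [{"role": "user", "content": "Hello"}],
    stop=["<|eot_id|>"],
)
print(body)
```

Sending this body as a POST to `http://localhost:8080/v1/chat/completions` with a `Content-Type: application/json` header is what the configured `baseURL` ultimately resolves to.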
1 change: 1 addition & 0 deletions docs/install/index.md
@@ -24,6 +24,7 @@ weight: 1
* 🤖 [AI Setup](./configuration/ai_setup.md)
* 🚅 [LiteLLM](./configuration/litellm.md)
* 🦙 [Ollama](./configuration/ollama.md)
* 🍎 [Apple MLX](./configuration/mlx.md)
* 💸 [Free AI APIs](./configuration/free_ai_apis.md)
* 🛂 [Authentication System](./configuration/user_auth_system.md)
* 🍃 [Online MongoDB](./configuration/mongodb.md)
1 change: 1 addition & 0 deletions packages/data-provider/src/config.ts
@@ -299,6 +299,7 @@ export enum KnownEndpoints {
fireworks = 'fireworks',
groq = 'groq',
mistral = 'mistral',
mlx = 'mlx',
ollama = 'ollama',
openrouter = 'openrouter',
perplexity = 'perplexity',