Overview
The server by default
- runs on port 5146
- uses Swagger UI (`/swagger/index.html`)
- uses ELMAH error logging (endpoint: `/elmah`, local files: `~/logs`)
- uses Serilog logging (local files: `~/logs`)
- uses health checks (endpoint: `/healthz`)
Docker installation
(On Linux you might need root privileges; use sudo where necessary.)
- Set up the configuration
- Navigate to the `src` directory
- Build the Docker image: `docker build -t embeddingsearch-server -f Server/Dockerfile .`
- Run the Docker container: `docker run --net=host -t embeddingsearch-server` (the `-t` is optional, but you get more meaningful output; use `-d` instead to run it in the background)
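Assuming the defaults from the Overview (port 5146, the `/healthz` endpoint), a full build-run-verify session might look like the following; adjust the image tag and flags to taste:

```shell
# From the src directory: build the image
docker build -t embeddingsearch-server -f Server/Dockerfile .

# Run it in the background on the host network
docker run --net=host -d embeddingsearch-server

# Verify the server is up via the health check endpoint
curl http://localhost:5146/healthz
```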
Installing the dependencies
Ubuntu 24.04
- Install the .NET SDK:
sudo apt update && sudo apt install dotnet-sdk-10.0 -y
Windows
Download and install the .NET SDK, or follow these steps to use WSL:
- Install Ubuntu in WSL (`wsl --install` and `wsl --install -d Ubuntu`)
- Enter your WSL environment (`wsl.exe`) and configure it
- Update via `sudo apt update && sudo apt upgrade -y && sudo snap refresh`
- Continue here: Ubuntu 24.04
MySQL database setup
- Install the MySQL server:
  - Linux/WSL: `sudo apt install mysql-server`
  - Windows: MySQL Community Server
- Connect to it: `sudo mysql -u root` (or from outside of WSL: `mysql -u root`)
- Create the database: `CREATE DATABASE embeddingsearch; USE embeddingsearch;`
- Create the user (replace "somepassword!" with a secure password): `CREATE USER 'embeddingsearch'@'%' IDENTIFIED BY 'somepassword!'; GRANT ALL ON embeddingsearch.* TO embeddingsearch; FLUSH PRIVILEGES;`
  - Caution: the `'%'` in the command means this user can log in from outside the machine. Replace `'%'` with `'localhost'` or with the IP of your embeddingsearch server machine if that is a concern.
- Exit MySQL: `exit`
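Put together, the steps above amount to this SQL session (with `somepassword!` standing in for your real password, and `'%'` optionally narrowed to `'localhost'`):

```sql
CREATE DATABASE embeddingsearch;
USE embeddingsearch;
-- '%' allows logins from any host; use 'localhost' to restrict access to this machine
CREATE USER 'embeddingsearch'@'%' IDENTIFIED BY 'somepassword!';
GRANT ALL ON embeddingsearch.* TO embeddingsearch;
FLUSH PRIVILEGES;
```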
Configuration
Environments
The configuration is located in `src/Server/` and conforms to the ASP.NET configuration design pattern, i.e. `src/Server/appsettings.json` is the base configuration, and `src/Server/appsettings.Development.json` overrides it.
If you plan to use multiple environments, create any `appsettings.{YourEnvironment}.json` (e.g. Development, Staging, Prod) and set the environment variable `DOTNET_ENVIRONMENT` accordingly on the target machine.
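As a quick sketch, selecting a "Staging" environment (a hypothetical environment name) on Linux comes down to exporting the variable before starting the server:

```shell
# Select the configuration overlay; the server will read appsettings.json
# and then apply appsettings.Staging.json on top of it.
export DOTNET_ENVIRONMENT=Staging
echo "$DOTNET_ENVIRONMENT"
```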
Setup
If you just installed the server and want to configure it:
- Open `src/Server/appsettings.Development.json`
- Change the password in the "SQL" section (`pwd=<your password goes here>;`)
- Check the "AiProviders" section. If your Ollama/LocalAI/etc. instance does not run locally, update the "baseURL" to point to the correct URL.
- If you plan on using the server in production:
  - Set the environment variable `DOTNET_ENVIRONMENT` to something that is not "Development" (e.g. "Prod")
  - Rename `appsettings.Development.json`: replace "Development" with what you chose for `DOTNET_ENVIRONMENT`
  - Set API keys in the "ApiKeys" section (generate keys using the `uuid` command on Linux)
Structure
"Embeddingsearch": {
"ConnectionStrings": {
"SQL": "server=localhost;database=embeddingsearch;uid=embeddingsearch;pwd=somepassword!;",
"Cache": "Data Source=embeddings.db;Mode=ReadWriteCreate;Cache=Shared" // Name of the sqlite cache file
},
"Elmah": {
"LogPath": "~/logs" // Where the logs are stored
},
"AiProviders": {
"ollama": { // Name for the provider. Used when defining models for a datapoint, e.g. "ollama:mxbai-embed-large"
"handler": "ollama", // The type of API located at baseURL
"baseURL": "http://localhost:11434", // Location of the API
"Allowlist": [".*"], // Allow- and Denylist. Filter out non-embeddings models using regular expressions
"Denylist": ["qwen3-coder:latest", "qwen3:0.6b", "deepseek-v3.1:671b-cloud", "qwen3-vl", "deepseek-ocr"]
},
"localAI": { // e.g. model name: "localAI:bert-embeddings"
"handler": "openai",
"baseURL": "http://localhost:8080",
"ApiKey": "Some API key here",
"Allowlist": [".*"],
"Denylist": ["cross-encoder", "..."]
}
},
"ApiKeys": ["Some UUID here", "Another UUID here"], // (optional) Restrict access using API keys
"Cache": {
"CacheTopN": 10000, // Only cache this number of queries. (Eviction policy: LRU)
"StoreEmbeddingCache": true, // If set to true, the SQLite database will be used to store the embeddings
"StoreTopN": 10000 // Only write the top n number of queries to the SQLite database
}
}
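As an illustration of the override pattern described under Environments, a minimal `appsettings.Staging.json` (environment name and host chosen for illustration) needs only the keys it changes; everything else falls back to `appsettings.json`:

```json
{
  "Embeddingsearch": {
    "ConnectionStrings": {
      // Overrides only the SQL connection string; "Cache", "AiProviders", etc. are inherited
      "SQL": "server=db.example.internal;database=embeddingsearch;uid=embeddingsearch;pwd=somepassword!;"
    }
  }
}
```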
AiProviders
Each AI provider (Ollama/LocalAI/OpenAI/etc.) can be specified individually.
You can even specify multiple Ollama instances and name them however you please, e.g.:
"AiProviders": {
"ollama_1": {
"handler": "ollama",
"baseURL": "http://x.x.x.x:11434"
},
"ollama_2": {
"handler": "ollama",
"baseURL": "http://y.y.y.y:11434"
}
}
handler
Currently two handlers are implemented for embeddings generation:
- `ollama` - requests embeddings from `/api/embed`
- `openai` - requests embeddings from `/v1/embeddings`
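To see what the two handlers call under the hood, here is roughly what the raw requests look like (sketches based on the public Ollama and OpenAI-compatible embeddings APIs; model names and URLs are placeholders taken from the configuration example above):

```shell
# "ollama" handler: POST {baseURL}/api/embed
curl http://localhost:11434/api/embed \
  -d '{"model": "mxbai-embed-large", "input": "Hello world"}'

# "openai" handler: POST {baseURL}/v1/embeddings
# (OpenAI-compatible servers typically expect the configured ApiKey as a Bearer token)
curl http://localhost:8080/v1/embeddings \
  -H "Authorization: Bearer Some API key here" \
  -H "Content-Type: application/json" \
  -d '{"model": "bert-embeddings", "input": "Hello world"}'
```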
baseURL
Specified as `scheme://host:port`, e.g.: `"baseURL": "http://localhost:11434"`
Any specified absolute path will be disregarded (e.g. "http://x.x.x.x/any/subroute" -> "http://x.x.x.x/api/embed").
ApiKey
- `ollama` currently does not support API keys; specifying a key has no effect.
- `openai` implements the use of `ApiKey`, e.g. `"ApiKey": "Some API key here"`.
API
Accessing the API
Once started, the server's API can be viewed and exercised via Swagger.
By default it is accessible under: http://localhost:5146/swagger/index.html
To make an API request from within swagger:
- Open one of the actions ("GET" / "POST")
- Click the "Try it out" button. The input fields (if there are any for your action) should now be editable.
- Fill in the necessary information
- Click "Execute"
Authorization
Being logged in takes priority over the API key requirement (if API keys are set),
so a logged-in user is automatically authorized to use the endpoints.