With the new version of gates, you can now use meta tags in API requests. Meta tags let you set specific filters for similarity search, which is especially useful when the embeddings alone cannot express every feature you want to search by.

Currently, there are two options for the meta tags:

  • The "amount" parameter to set the number of results you would like to receive (this is important as too many results can fill up the context window of LLMs quickly).
  • The "filter" option, which expects a list of key-value pairs to filter the results by.

Meta tags in gates may look like this:

{
    "attribute_to_search_for": "This is a text I want to do similarity search with!",
    "@@meta@@": {
        "similarity": {
            "amount": 2,
            "filter": [
                {
                    "key": "communication_style",
                    "value": "fact_oriented"
                }
            ]
        }
    }
}
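
If you assemble requests in code, a body like the one above can be built with a small helper. The following is only a sketch: the function name build_similarity_request and its parameters are hypothetical and not part of gates; only the "@@meta@@", "similarity", "amount" and "filter" keys come from the example above.

```python
import json


def build_similarity_request(attribute, text, amount=2, filters=None):
    """Build a request body with similarity meta tags (hypothetical helper)."""
    similarity = {"amount": amount}
    if filters:
        # Each filter is sent as an object with a "key" and a "value" entry.
        similarity["filter"] = [
            {"key": key, "value": value} for key, value in filters.items()
        ]
    return {attribute: text, "@@meta@@": {"similarity": similarity}}


payload = build_similarity_request(
    "attribute_to_search_for",
    "This is a text I want to do similarity search with!",
    amount=2,
    filters={"communication_style": "fact_oriented"},
)
print(json.dumps(payload, indent=4))
```

Keeping "amount" small here is deliberate, since every additional result takes up space in the LLM's context window.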

We'll be adding more meta tags in the future.

Auto-pause for inactive gates containers

Being careful with resource usage is important to us. With gates v1.2.0, instances are now automatically paused after a period of inactivity, so that unused gates instances do not take up valuable computing power and memory. By default, a gates instance is paused after it has been inactive for 60 minutes.

Pausing a gates instance is a bit different from stopping it manually. An instance that was stopped manually has to be started manually before it can be used again. An instance that was paused automatically due to inactivity returns a 503 status code, then fully restarts on its own and is available again after a couple of seconds. If you include ?wait=true in your request, the service does not respond immediately; instead, it waits for the instance to restart and then responds normally, so you only need to make one call to the gates instance.
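
On the client side, the two ways of handling a paused instance could be sketched as follows. This is an illustration under assumptions, not an official client: the helper names, the retry count and the delay are made up for this example, authentication is left out, and the URL you pass in has to be that of your own gates instance.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request


def with_wait(url):
    """Return the URL with wait=true added to its query string."""
    parts = urllib.parse.urlsplit(url)
    query = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    query.append(("wait", "true"))
    return urllib.parse.urlunsplit(
        parts._replace(query=urllib.parse.urlencode(query))
    )


def post_json(url, payload):
    """POST a JSON body and return (status_code, response_bytes)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; hand the status back instead.
        return error.code, error.read()


def call_with_retry(send, retries=5, delay=2.0, sleep=time.sleep):
    """Call send() again while it returns 503, up to `retries` extra times."""
    for attempt in range(retries + 1):
        status, body = send()
        if status != 503:
            break
        if attempt < retries:
            sleep(delay)  # give the paused instance time to restart
    return status, body
```

With ?wait=true a single call such as post_json(with_wait(url), payload) should be enough. Without it, call_with_retry(lambda: post_json(url, payload)) repeats the request until the instance has restarted or the retries run out.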