Vera’s 2024 Glow Up, Part 1 — The Admin UI

The Vera Engineering Blog
3 min read · Feb 6, 2024


This release was a big one; it’s all about our new design and UI for Vera administrators! At the end of this post, we snuck in a pretty big functionality update… model response caching. What’s that, you ask? Read on to find out!

The New Admin UI

It’s time for us to move from MVP to production, and we couldn’t be more excited about it. Through our shiny new interface, now you can:

Assign and Manage API Keys

If you, like many of our clients, are living and working in a multi-model world, managing all the different teams’ and users’ model access can be a real pain. In Vera, you can assign permissions at the group level, so your entire marketing team can have the same experience. Or, you can assign keys to individuals who warrant exceptions! It all works in the web portal, so you can manage it on your own.
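To make that concrete, here's a rough sketch (our illustration only, not Vera's actual API; the names are hypothetical) of how group-level keys with per-user exceptions can resolve:

```python
# Hypothetical sketch: resolving which API key a user gets, assuming
# group-level assignments with optional per-user overrides.
from dataclasses import dataclass, field


@dataclass
class KeyAssignments:
    group_keys: dict[str, str] = field(default_factory=dict)      # group name -> API key
    user_overrides: dict[str, str] = field(default_factory=dict)  # user id -> API key

    def key_for(self, user_id: str, group: str) -> str | None:
        # Individual exceptions win; otherwise fall back to the group's key.
        return self.user_overrides.get(user_id) or self.group_keys.get(group)


assignments = KeyAssignments(
    group_keys={"marketing": "key-for-marketing"},
    user_overrides={"carla": "key-for-carla"},
)
print(assignments.key_for("alice", "marketing"))  # key-for-marketing
print(assignments.key_for("carla", "marketing"))  # key-for-carla (individual exception)
```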

Set and Enforce Policies

Vera offers a wide range of one-click policies you can toggle on and off at the group and/or user level, meaning you can have one set of policies for customers, another for Engineering, and one set just for Carla, your VPE.

With our new UI, you can toggle on (or off) policies that do the following (a quick sketch comes right after the list):

  • Control what input types users can send (code, video, etc.)
  • Allow or deny access to certain models
  • Restrict their spend by tokens or API calls
  • Apply redaction or blocking of inputs and/or outputs (e.g. PII, passwords, topics like politics)
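Here's a minimal, hypothetical sketch of what those toggles look like in practice (illustrative field names, not Vera's actual schema):

```python
# Hypothetical sketch of the toggles above as a simple policy object.
# Field names are illustrative, not Vera's actual schema.
from dataclasses import dataclass, field


@dataclass
class Policy:
    allowed_input_types: set[str] = field(default_factory=lambda: {"text"})
    allowed_models: set[str] = field(default_factory=set)   # empty set = allow all models
    max_tokens_per_month: int | None = None                 # spend limit by tokens
    max_api_calls_per_day: int | None = None                 # spend limit by API calls
    redact: set[str] = field(default_factory=set)            # e.g. {"pii", "passwords"}
    blocked_topics: set[str] = field(default_factory=set)    # e.g. {"politics"}


# One set of policies for customers, another for Engineering, one for Carla.
policies = {
    "customers": Policy(allowed_models={"gpt-3.5-turbo"}, redact={"pii"},
                        blocked_topics={"politics"}),
    "engineering": Policy(allowed_input_types={"text", "code"},
                          max_api_calls_per_day=5_000),
    "carla": Policy(allowed_input_types={"text", "code", "video"}),
}
print(policies["customers"].blocked_topics)  # {'politics'}
```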

Search Logs

You've always been able to search your logs, but now… it's prettier. Log search is now powered by our even-better, previously discussed vector database, and you can also surface the moderated terms that may have violated one of your internal policies.
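For the curious, here's a toy illustration of the idea behind vector-style log search: embed each log entry and the query, then rank by similarity. The bag-of-words "embedding" below is just a stand-in for a real embedding model and vector database:

```python
# Toy illustration of vector-style search over logs: embed entries and the
# query, then rank by cosine similarity. A real setup would use an embedding
# model and a vector database instead of this bag-of-words stand-in.
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in "embedding": lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


logs = [
    "user asked model to summarize the Q3 revenue report",
    "prompt blocked: input contained a password",
    "redacted PII in customer support transcript",
]
query = "which prompts were blocked for passwords"
ranked = sorted(logs, key=lambda entry: cosine(embed(query), embed(entry)), reverse=True)
print(ranked[0])  # "prompt blocked: input contained a password"
```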

Model Response Caching

If your user population is fairly large, you might notice that all too often, people ask the same questions twice. Why would you want to spend money on GPT-4 when you already have the answer?

Well, you shouldn't. With Vera's new model caching, users in the same group gain the benefit of the whole team's prior queries. If your team (or one of your customers) asks a question that's already been answered by an LLM, we store that response locally so you save the token spend. You heard that right: never pay twice for the same question.

And it gets even better. We all know how much latency can impact the customer experience, and current commercial models can take too long at inference time to be viable, unless, of course, you can store these commonly asked questions locally. Doing so lets us serve up lightning-fast model responses that are accurate, cost-effective, and aligned with your internal policies. Caching is a feature that's perfect for AI-powered products that need to serve answers quickly.
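If you're wondering how this works conceptually, here's a minimal sketch (our own illustration, not the production implementation) of a per-group cache keyed on the normalized question:

```python
# Minimal sketch of per-group response caching (illustrative only): identical
# questions from the same group hit the cache instead of the model, saving
# both token spend and inference latency.
import hashlib


class ResponseCache:
    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def _key(self, group: str, prompt: str) -> str:
        # Scope the cache to the group so teammates benefit from each other's queries.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(f"{group}:{normalized}".encode()).hexdigest()

    def get_or_call(self, group: str, prompt: str, call_model) -> str:
        key = self._key(group, prompt)
        if key not in self._store:            # cache miss: pay for one model call
            self._store[key] = call_model(prompt)
        return self._store[key]               # cache hit: no token spend, low latency


cache = ResponseCache()
fake_llm = lambda prompt: f"answer to: {prompt}"  # stand-in for a real model call
print(cache.get_or_call("marketing", "What is our refund policy?", fake_llm))
print(cache.get_or_call("marketing", "what is our refund policy?", fake_llm))  # served from cache
```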

What Else??

As always, we're hard at work squashing bugs, applying fixes, and building out beautiful documentation. We did a lot of that too, but we won't bore you with the details (yet).

And that’s about it for this week. If you haven’t already, don’t forget to get in touch with us!
