Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ETag + Cache API combo #3105

Open
tunnckoCore opened this issue Jul 6, 2024 · 5 comments
Open

ETag + Cache API combo #3105

tunnckoCore opened this issue Jul 6, 2024 · 5 comments
Labels

Comments

@tunnckoCore
Copy link

What version of Hono are you using?

4.4

What runtime/platform is your app running on?

Deno / JSR / ESM.sh / npm: specifiers

What steps can reproduce the bug?

Given you just have a basic app, and add the builtin ETag middleware, isn't it supposed to return 304 automatically, after a refresh of already loaded response? That's one.

Second, seems like by default it is using weak ETag, why obviously the default in code is false. Even if we force it to weak: false it doesn't make them non-weak. Weird.

Third, i'm trying to replicate a similar thing to Cloudflare R2 and/or AWS S3 - where we have ETags and caching, and 304 responses when nothing has changed.

import { Hono } from "npm:hono";
import { etag } from "npm:hono/etag";

// even tried with direct github import 
// import { etag } from "https://esm.sh/gh/honojs/hono/src/middleware/etag/index.ts";

import { cache } from "npm:hono/cache";

const app = new Hono();

app.get("/baw/*", etag());

// app.get(
//   "/baw/*",
//   cache({
//     cacheName: "my-3",
//     cacheControl: "max-age=120",
//     wait: true,
//   }),
// );

app.get("/baw/abc", async (c) => {
  await new Promise((resolve) => setTimeout(resolve, 5_000));

  return c.json({ ok: true, date: new Date().toISOString() });
});

Deno.serve(app.fetch);

Fourth, about the Cache middleware: seems like in Deno we don't need to make await cache.put, just remove the await and we won't need the "deno specific" wait: true option. Seems to work fine, tested it with similar logic to the cache's middleware with bare Deno.serve, eg.

Deno.serve(async (request) => {
  try {
    const url = new URL(request.url);
    const cacheKey = new Request(url.toString(), request);
    const cache = await caches.open("foobar-v3");

    console.log("Loading...", cacheKey.url);

    const response = await cache.match(cacheKey);

    if (response) {
      console.log("Cache HIT:", request.url);

      return response;
    }

    await new Promise((resolve) => setTimeout(resolve, 5_000));

    const resp = Response.json(
      { ok: true, date: new Date().toISOString() },
      { status: 200 },
    );

    // works fine: first MISS waits 5 seconds, then it's instant
    cache.put(cacheKey, resp.clone());

    console.log("Cache MISS:", request.url);

    return resp;
  } catch (e) {
    return Response.json({ error: true, message: e.message }, { status: 500 });
  }
});

Now, what's the problem for having both etag and cache middlewares? I mean, i don't get status 304 response for the /bar/abc on second load.

import { Hono, type Context } from "npm:hono";
import { etag } from "npm:hono/etag";
import { cache } from "npm:hono/cache";

const app = new Hono();

app.get("/baw/*", etag());
app.get(
  "/baw/*",
  cache({
    cacheName: "my-cache-4",
    cacheControl: "max-age=120",
    wait: true,
  }),
);

app.get("/baw/abc", async (c: Context) => {
  await new Promise((resolve) => setTimeout(resolve, 5_000));

  // even if it's static response (non Dynamic Date),
  // it still not returns 304 
  return c.json({ ok: true, date: new Date().toISOString() });
});

Deno.serve(app.fetch);

Tried with esm.sh http imports too. No difference.

What is the expected behavior?

comments above

What do you see instead?

Am I missing and messing something or it's some weird conflict, putting aside from the etag bugs?

What i expect is on first load to wait 5 seconds, then cache the thing, and on second load to be instant and respond with the cache result and status 304. I don't think there's something wrong with that, Cloudflare R2 achieves it similarly.

Additional information

Okay, after further debugging using just the etag middleware, seems like the default "non-weak" behavior doesn't work at all - ie. it doesn't return 304 on second load, even if the response is the same. When I enable it, the whole thing works and returns status 304.

The following works, meaning when i load /baw/abc it waits 5s and responds with the json and status 200. When i refresh it waits again, but responds with 304. All that is okay.

const app = new Hono();
const weak = true;
const id = 331311;

app.get("/baw/*", etag({ weak }));

app.get("/baw/abc", async (c) => {
  await new Promise((resolve) => setTimeout(resolve, 5_000));

  const data = { ok: true, id };
  const jsonStr = JSON.stringify(data);
  const hash = await sha1(jsonStr);
  const etag = weak ? `W/"${hash}"` : `"${hash}"`;

  console.log({ etag });

  return c.json(data, {
    status: 200,
    headers: { ETag: etag },
  });
});

Deno.serve(app.fetch);

And when we add the cache middleware after the etag one, all is great.

So, the whole problem is in the etag i think. Also noticing that there's no If-None-Match if it's "non-weak".

Sorry if i'm messing something but.. i can never get Cache Control and Etag, even after decades in open source..


Basically, the idea is to use both the web response Cache API and ETag. Why?

Because seems like that each one alone doesn't do what i need. What i need is when you hit first time, the response is delayed cuz the route handler is called. Then the response is cached, and an ETag is set.

Oooh.. hm. Problem is, that when i use only the Cache API (the cache middleware), when the response of the handler is changed, the cache stays and the load is instant. Presumable that's because the cache key is based on the url. Okay, i tried keyGenerator. But how that key can be based on the response, not on the URL?

And on the other hand, if i use only ETags, it works, but it always calls the handlers thus the response, even when correctly set as 304, is delayed.

@tunnckoCore
Copy link
Author

Okay, i tried keyGenerator. But how that key can be based on the response, not on the URL?

Okay, I've calculated a response hash, and set it on the context, the from the keyGen option, i use that hash as a key, yet it doesn't refresh the cache once the content of the handler changed.

Oh! I have an idea LOL.

But still, the other weird thing is, cacheControl doesn't seem to be respected, even if just max-age=10

🤦‍♂️

Such a limbo.

@tunnckoCore tunnckoCore changed the title ETag / Cache issues Jul 6, 2024
@yusukebe
Copy link
Member

yusukebe commented Jul 7, 2024

Hi @tunnckoCore

I've tried an app using ETag and Cache Middleware, as you write. I think it works correctly, is it not?:

Area.mp4

If you still have a problem, could you explain your current problems simply?

By the way, this ETag middleware is implanted with the same logic as the etag package.

@tunnckoCore
Copy link
Author

tunnckoCore commented Jul 11, 2024

Heya @yusukebe

Thing is, when cache expires, this 120s, it still serves from the previous cache - it's cached forever, it doesn't get invalidated or something like that, thus not calling the handler at all. I think it would be more correct that when the Cache Control expires, the cache should be invalidated and the handler called again, delayed again, and thus the response will be different and the ETag too.

I realize it's kind of tricky. But I believe we can make them work better when used together. Maybe detect if there's ETag middleware, and use the etag "as key" for the response cache.

Put all the ETag aside. Just using the Cache middleware. The response is forever cached. Which, kind of logical, is the "right" behavior because A) the Cache spec says it "doesn't honor HTTP caching headers", B) then why the cache control is there in the first place if it doesn't deal with some sort of checking and invalidating?

This

import {Hono, type Context} from '@hono/hono';
import {cache} from '@hono/hono/cache';
// import {etag} from '@hono/hono/etag';

const app = new Hono();

// app.get('/baw/*', etag({ weak: true }));

app.get('/baw/*', cache({ cacheName: 'foo-2', cacheControl: 'max-age=20', wait: true }));

app.get('/baw/abc', async (c: Context) => {
  await new Promise((resolve) => setTimeout(resolve, 4_000));
  return c.json({ ok: true, date: new Date().toISOString() });
});

Deno.serve(app.fetch);

Maybe we can use setInterval based on the set max-age, and delete with cache.delete?

Adding something like that in the middleware makes it work as expected, but is that a good way? Not the best for sure, but.. hm. Probably not suitable and working for serverless, right? Ah.

  const addHeader = (c: Context, key: string, cache: Cache) => {
    if (cacheControlDirectives) {
      const existingDirectives =
        c.res.headers
          .get("Cache-Control")
          ?.split(",")
          .map((d) => d.trim().split("=", 1)[0]) ?? [];

      for (const directive of cacheControlDirectives) {
        let [name, value] = directive.trim().split("=", 2);
        name = name.toLowerCase();
        if (!existingDirectives.includes(name)) {
          c.header("Cache-Control", `${name}${value ? `=${value}` : ""}`, {
            append: true,
          });
        }

+        if (name && value && name === 'max-age') {
+          setInterval(async () => {
+            await cache.delete(key);
+          }, (Number(value) * 1_000) - 1_000)
+        }
      }
    }

  // ...
@tunnckoCore
Copy link
Author

Or maybe just have two different middlewaares - cache, and response-cache cuz they are slightly different things.

@yusukebe
Copy link
Member

Hi @tunnckoCore

I think this works fine on Cloudflare Workers with a wrangler dev command:

Area.mp4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 participants