
I have a setup job in my CI which installs all npm packages and stores them in cache:

setup:
  stage: pre-build
  cache:
    untracked: true
    when: on_success
    key:
      files:
        - repo-hash
      prefix: workspace
  script:
    - yarn install --frozen-lockfile

Currently, if the repo-hash does not change between two pipelines, the job successfully downloads the existing cache, but it still runs yarn install --frozen-lockfile.

How can I change this behaviour so that this job is skipped entirely when the cache already exists?

  • You should cache the .yarn directory, per e.g. classic.yarnpkg.com/en/docs/install-ci; you still need to do the install, but then most (if not all) of the files are already local.
    – jonrsharpe
    Commented Mar 9, 2021 at 9:24
  • Thanks, actually I don't mind the downloading step, it is mainly the yarn install step which I'd like to skip :/
    – Nicoowr
    Commented Mar 9, 2021 at 9:28
  • Did you find a way of doing this? Skipping the yarn install step? Commented Mar 19, 2021 at 13:12
  • We finally decided not to do it. However, I think you can use the changes keyword of gitlab-ci.yml to run the job only if the specified files have changed: docs.gitlab.com/ee/ci/yaml/#ruleschanges
    – Nicoowr
    Commented Mar 19, 2021 at 13:41
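The rules:changes approach mentioned in the last comment could be sketched like this; a hypothetical adaptation of the setup job from the question, keeping its cache configuration as-is:

```yaml
# Sketch: only create the setup job at all when yarn.lock changed.
# On other pipelines the job (and its yarn install) is skipped entirely.
setup:
  stage: pre-build
  rules:
    - changes:
        - yarn.lock
  cache:
    untracked: true
    when: on_success
    key:
      files:
        - repo-hash
      prefix: workspace
  script:
    - yarn install --frozen-lockfile
```

Note that when the job is skipped this way, downstream jobs must tolerate the cache possibly being stale or absent (caches are best-effort in GitLab), so they typically still need a fallback install guard.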

3 Answers


Right now, on an npm-based Preact project, I'm using the only keyword in gitlab-ci.yml to run an install job when package-lock.json has changed (and to skip the install if it hasn't). My build job then picks up the node_modules cache from the install job:

install:
  stage: setup # gitlab.com shared runners are slow; don't run unless there's a reason to install
  cache:       # even a "echo 'hello world'" script would be about a 30s job, worse if queued
    key:
      files: [package-lock.json] # the key is a hash of package-lock; branches with it will use this cache :D
    paths: [node_modules] # we're not caching npm (ie `npm ci --prefer-offline --cache`) because it's slow
    policy: push # override the default behavior of wasting time downloading and unpacking node_modules
  script: [npm ci] # I believe in package-lock/npm ci's mission
  only:
    changes: [package-lock.json] # our aforementioned reason to install
    refs:
      - master
      - develop
      - merge_requests

build: # uses the default behavior, which runs even if install is skipped; but not if install failed
  stage: build
  cache:
    key:
      files: [package-lock.json]
    paths: [node_modules]
    policy: pull # override the default behavior of wasting time packing and uploading node_modules
  script:
    - npm run build
  artifacts:
    paths: [build]
  only:
    changes:
      - src/**/*
      - tsconfig.json
      - preact.config.js
      - package-lock.json
    refs:
      - master
      - develop
      - merge_requests

For Yarn you would simply replace any package-lock.json you see here with yarn.lock, and replace npm ci with yarn install --frozen-lockfile.
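Concretely, the Yarn variant of the install job would look something like this (a sketch assuming the same stage names and branch list):

```yaml
# Hypothetical Yarn adaptation of the install job above:
# yarn.lock replaces package-lock.json as both the cache key and the trigger.
install:
  stage: setup
  cache:
    key:
      files: [yarn.lock]
    paths: [node_modules]
    policy: push # this job only writes the cache; consumers pull it
  script: [yarn install --frozen-lockfile]
  only:
    changes: [yarn.lock]
    refs:
      - master
      - develop
      - merge_requests
```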

Previously, I was doing it all in the build job like this (similar to one of the other answers here):

build:
  stage: build
  cache:
    key:
      files: [package-lock.json]
    paths: [node_modules]
  script:
    - test -d node_modules || npm ci
    - npm run build
  artifacts:
    paths: [build]
  only:
    changes:
      - src/**/*
      - tsconfig.json
      - preact.config.js
      - package-lock.json
    refs:
      - master
      - develop
      - merge_requests

I changed to the more complicated approach because I thought I was also going to cache the npm cache directory, but I couldn't get that to improve actual install times -- things actually got worse. I stuck with the two-job setup anyway, simply because it squeezes another 20 seconds out of the lockfile-wasn't-changed scenario. Both approaches are valid and effective; in my specific case, caching node_modules has cut about a minute and a half off of deployments.
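The test -d node_modules || npm ci guard in that earlier version relies on ordinary shell short-circuiting; a minimal, self-contained illustration using a throwaway directory instead of node_modules:

```shell
# `test -d DIR || CMD` runs CMD only when DIR does not exist,
# which is how the job above skips the install on a cache hit.
rm -rf node_modules_demo
test -d node_modules_demo || echo "installing"   # directory missing: the "install" runs
mkdir -p node_modules_demo                       # simulate a restored cache
test -d node_modules_demo || echo "installing"   # directory present: nothing printed
rm -rf node_modules_demo
```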


Try exiting early when node_modules was restored from cache:

install:
  stage: install
  cache:
    key:
      files:
        - package-lock.json
      prefix: $CI_PROJECT_NAME
    paths:
      - node_modules/
  script:
    - |
      if [[ -d node_modules ]]; then
        exit 10
      fi
    - npm ci --cache .npm --prefer-offline
  allow_failure:
    exit_codes: 10
  • It would be better to just change the 10 exit code to 0. Commented Apr 28, 2022 at 15:10
  • This will work, but it would be better if this were possible at the GitLab level, because it can take a while to spin up the job (example: a Docker executor on a not-very-powerful Windows VM takes 1 minute to reach this exit). Commented Jun 15, 2022 at 8:02
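Following the first comment's suggestion, exiting with 0 instead of a whitelisted failure code would simplify the job to something like this sketch (exit 0 ends the script early with success, so the allow_failure block is no longer needed):

```yaml
install:
  stage: install
  cache:
    key:
      files:
        - package-lock.json
      prefix: $CI_PROJECT_NAME
    paths:
      - node_modules/
  script:
    # On a cache hit, end the job successfully without running npm ci.
    - |
      if [[ -d node_modules ]]; then
        echo "node_modules restored from cache, skipping npm ci"
        exit 0
      fi
    - npm ci --cache .npm --prefer-offline
```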

As per the official reference, this is not possible on the job level. However, you can add conditions to your job scripts and check for the existence of the should-be-cached files.

This is what the relevant parts of my setup look like:

stages:
- setup
- test
- build
- release
- publish

cache: &dependency-cache
  key:
    files:
    - "**/yarn.lock"
  paths:
  - node_modules
  - "*/node_modules"
  policy: pull

renew-cache:
  stage: setup
  except:
  - /master.*/
  - tags
  script:
  - test -d node_modules || yarn && yarn workspace app ngcc --properties es2015 browser module main
  cache:
    <<: *dependency-cache
    policy: pull-push

test-app:
  stage: test
  except:
  - /master.*/
  - tags
  script:
  - test -d node_modules || yarn
  - yarn workspace app test

build-app:
  stage: build
  only:
  - /master.*/
  script:
  - test -d node_modules || yarn
  - yarn workspace app build
  # Store build results ...

# The rest of the pipeline
  • I use yarn 2+ and yarn workspaces with node-modules yarn linker and dependency module hoisting limit
  • The cache is shared for all jobs with pull policy, the renew-cache job is the only one to write the cache
  • test -d node_modules || <COMMAND> runs the <COMMAND> only if the node_modules directory was not found, i.e. no cache was restored for the job
  • New cache is pushed only when there is none or when any yarn.lock is modified
  • The same cache is re-used until it expires and while the yarn.lock files do not change
  • The test for the cached node_modules has to be present in all other build scripts that rely on the cache - the cache could have expired and the pipeline must be deterministically repeatable

This approach does not eliminate the need to run the setup job every time, so there is still some overhead. However, the overall pipeline run times are reduced greatly.

Hope this helps a little :-)
