GitLab CI after_script is not executed
Summary
In GitLab CI, the steps of the after_script section are not executed anymore.
Experienced with gitlab-runner 12.10.0-rc1 (80ffd94f) on docker-auto-scale fa6cab46
Steps to reproduce
Create a project with the following .gitlab-ci.yml:
echo:
  image: alpine:latest
  script:
    - echo script
  after_script:
    - echo after_script
Example Project
https://gitlab.com/hs-karlsruhe/ci-test/-/jobs/518567755
What is the current bug behavior?
The steps of the after_script section are not executed.
What is the expected correct behavior?
The steps of the after_script section are executed.
Relevant logs and/or screenshots
Running with gitlab-runner 12.10.0-rc1 (80ffd94f)
on docker-auto-scale fa6cab46
Preparing the "docker+machine" executor
Using Docker executor with image alpine:latest ...
Pulling docker image alpine:latest ...
Using docker image sha256:a187dde48cd289ac374ad8539930628314bc581a481cdb41409c9289419ddb72 for alpine:latest ...
Preparing environment
00:02
Running on runner-fa6cab46-project-16554705-concurrent-0 via runner-fa6cab46-srm-1587389254-aa44b7cd...
Getting source from Git repository
00:02
$ eval "$CI_PRE_CLONE_SCRIPT"
Fetching changes with git depth set to 50...
Initialized empty Git repository in /builds/hs-karlsruhe/ci-test/.git/
Created fresh repository.
From https://gitlab.com/hs-karlsruhe/ci-test
* [new ref] refs/pipelines/137885685 -> refs/pipelines/137885685
* [new branch] after_script -> origin/after_script
Checking out 062ebbae as after_script...
Skipping Git submodules setup
Restoring cache
00:01
Downloading artifacts
00:01
Running before_script and script
00:01
$ echo script
script
Running after_script
00:01
Saving cache
00:01
Uploading artifacts for successful job
00:01
Job succeeded
Output of checks
This bug happens on GitLab.com
What happened
This bug was introduced in gitlab-runner!1990 (merged), specifically https://gitlab.com/gitlab-org/gitlab-runner/-/blob/1494bf0071cb93ceb9bd771ea990ef292746f5d7/executors/docker/docker.go#L880.
To understand the problem, we have to understand how we execute before_script, script, and after_script. We run before_script and script together as one script, inside a container that we call the build container. When before_script+script finish executing, we reuse the same container to run after_script, both for performance reasons and so that after_script has all the state that before_script+script generated. We check the container state to decide whether to get its exit code and finish execution. However, after_script uses the already exited (created, but not running) container, so this condition evaluates to false and we just look at the exit code again. This means we return the exit code without ever actually running after_script.
What we are doing
- Revert gitlab-runner!1990 (merged), which is being done in gitlab-runner!2026 (merged)
- Cherry-pick this commit into the 12.10 stable branch and tag 12.10.0-rc2
- Deploy 12.10.0-rc2 to the shared Runner fleet
We are rolling forward instead of rolling back the deployment because, for us, a rollback works just like a normal deploy, so it would take the same amount of time.
Follow up steps
- Create an integration test to assert that we run after_script inside the Docker executor.