First, find the GitLab Runner user's home directory:
$ ps aux | grep gitlab-runner
/usr/bin/gitlab-runner run --working-directory /var/lib/gitlab-runner --config /etc/gitlab-runner/config.toml --service gitlab-runner --syslog --user gitlab-runner
Now check whether .bash_logout exists:
$ ls -lah /var/lib/gitlab-runner/
total 20K
drwxr-xr-x 4 gitlab-runner root 4.0K Dec 1 02:53 .
drwxr-xr-x 42 root root 4.0K Nov 11 03:29 ..
-rwxr--r-- 1 gitlab-runner gitlab-runner 30 Dec 1 02:53 .bash_logout
drwxr-xr-x 2 gitlab-runner gitlab-runner 4.0K Dec 1 02:26 .terraform.d
drwxrwxr-x 3 gitlab-runner gitlab-runner 4.0K Nov 11 03:23 builds
Finally, open the .bash_logout file, comment out all of its lines, and save it:
$ vim /var/lib/gitlab-runner/.bash_logout
# ~/.bash_logout: executed by bash(1) when login shell exits.
# when leaving the console clear the screen to increase privacy
#if [ "$SHLVL" = 1 ]; then
# [ -x /usr/bin/clear_console ] && /usr/bin/clear_console -q
#fi
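The edit above can be scripted. A minimal sketch, assuming the runner's home directory found via ps aux; it operates on a scratch copy so it is safe to dry-run anywhere, falling back to a sample stock-Ubuntu .bash_logout when the real file is absent:

```shell
# Work on a scratch copy so this is safe to dry-run anywhere.
src="/var/lib/gitlab-runner/.bash_logout"   # path found via `ps aux` above
work="$(mktemp)"

if [ -r "$src" ]; then
  cp "$src" "$work"
else
  # Sample content matching a stock Ubuntu .bash_logout.
  printf 'if [ "$SHLVL" = 1 ]; then\n    [ -x /usr/bin/clear_console ] && /usr/bin/clear_console -q\nfi\n' > "$work"
fi

# Comment out every line that is not blank and not already a comment.
sed -i -e '/^[[:space:]]*#/b' -e '/^[[:space:]]*$/b' -e 's/^/#/' "$work"

cat "$work"
# Review the result, then apply for real:  sudo cp "$work" "$src"
```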
P.S.: This answer was originally given by Jonathan Allen (@sc_jallen) on the gitlab-runner issue below.
Shell executor fails to prepare environment in Ubuntu 20.04
Summary
After updating our build system to Ubuntu 20.04, any job using the Shell executor fails:
Running with gitlab-runner 13.1.1 (6fbc7474)
on [redacted]
Preparing the "shell" executor
00:00
Using Shell executor...
Preparing environment
00:00
Running on [redacted]...
ERROR: Job failed (system failure): prepare environment: exit status 1. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
When we provision a runner instance using Ubuntu 18.04, the Shell executor works as expected.
Steps to reproduce
Provision a new runner instance using Ubuntu 18.04, and register as a runner. It will run Shell jobs successfully.
Provision a new runner instance using Ubuntu 20.04, following the exact same steps as you did before, and register it as a runner. Shell jobs will fail.
We use Ansible to provision our runners, and this was the only change we made. We tried several times configuring 18.04 and 20.04 from the same scripts, with only the AMI ID changing.
What is the current bug behavior?
Jobs using the Shell executor fail at the "preparing environment" step.
What is the expected correct behavior?
Jobs using the Shell executor proceed past the "preparing environment" step and execute the shell script.
Relevant logs and/or screenshots
Output of checks
This bug happens on GitLab.com
Error 1) open /root/.ssh/known_hosts: no such file or directory
Using SSH executor...
ERROR: Preparation failed: ssh command Connect() error:
getting host key callback: open /root/.ssh/known_hosts:
no such file or directory
Will be retried in 3s ...
Solution:
Follow these steps to resolve it:
1) Log in to the GitLab instance via SSH.
2) Become root:
sudo su
3) Now connect from the GitLab instance to the host the runner is trying to reach:
ssh <host-username>@<host-ip>
<host-username> and <host-ip> should match the values configured for the GitLab runner. You will be prompted for the password, and then asked to accept the host key fingerprint.
Now try running the job with the runner again. It should work.
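A non-interactive alternative (a sketch, not from the original answer): pre-seed the known_hosts file with ssh-keyscan instead of performing a first manual login. HOST and the known_hosts path are placeholders; on the GitLab instance this would be /root/.ssh/known_hosts, run as root.

```shell
HOST="${HOST:-gitlab.com}"                      # placeholder target host
KH="${KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"     # /root/.ssh/known_hosts on the real box

mkdir -p "$(dirname "$KH")"
touch "$KH"

# -H hashes the hostnames in the resulting entries; drop it for plain entries.
# The scan is best-effort here, since it needs network access to HOST.
ssh-keyscan -H "$HOST" >> "$KH" 2>/dev/null || true

wc -l "$KH"
```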
Error 2) Job failed: prepare environment: Process exited with status 1.
If you get the following error when running a GitLab CI/CD job via gitlab-runner:
ERROR: Job failed: prepare environment: Process exited with status 1.
Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading
for more information
Solution:
Run the following command:
find / -name .bash_logout
and delete the following files if they exist:
sudo rm -r /home/gitlab-runner/.bash_logout
sudo rm -r /home/<username>/.bash_logout
Re-run the jobs; they should now succeed.
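The find-and-delete flow above can be sketched as one script. It demonstrates on a scratch directory so it is safe to run anywhere; point ROOT at /home (and run with sudo) to apply it for real.

```shell
ROOT="${ROOT:-$(mktemp -d)}"                 # use ROOT=/home (with sudo) for real

# Stand-in for a real runner home when demoing in the scratch directory.
mkdir -p "$ROOT/gitlab-runner"
touch "$ROOT/gitlab-runner/.bash_logout"

# List and delete every .bash_logout at most two levels deep.
find "$ROOT" -maxdepth 2 -name .bash_logout -print -delete
```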
Error 3) handshake failed: knownhosts: key is unknown
ERROR: Preparation failed: ssh command Connect() error: ssh Dial() error: ssh: handshake failed: knownhosts: key is unknown
Solution:
Solution A
Verify your login credentials
Solution B
Verify that SSH port is open
Solution C
Edit your runner and add disable_strict_host_key_checking = true
sudo nano /etc/gitlab-runner/config.toml
[[runners]]
  name = "..."
  url = "..."
  token = "..."
  executor = "ssh"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.ssh]
    user = "..."
    password = "..."
    host = "..."
    port = "..."
    disable_strict_host_key_checking = true
Then restart gitlab-runner:
sudo gitlab-runner restart
Solution D
If you’re using WHM as your hosting control panel, enable the following setting:
SSH Password Authorization Tweak
Topic | Jenkins, From Zero To Hero Become a DevOps Jenkins Master |
Source | jenkins/centos7/Dockerfile |
Error | Step 2/7 : RUN yum -y install openssh-server |
Solution | FROM centos:centos7 |
Remark | Related to new CentOS 8 Stream |
Topic | Jenkins, From Zero To Hero Become a DevOps Jenkins Master |
Source | jenkins docker-compose.yml user: root |
Error | jenkins | touch: cannot touch '/var/jenkins_home/copy_reference_file.log': Permission denied |
Solution | docker exec -it jenkins bash |
Remark | In docker-compose.yml, comment out user: root (change it to #user: root). |
Topic | Jenkins, From Zero To Hero Become a DevOps Jenkins Master |
Source | jenkins server > Build with Parameters > Console Output |
Error | [SSH] executing... ERROR: Failed to authenticate with public key com.jcraft.jsch.JSchException: invalid privatekey: [B@95a8cd7 |
Solution | Add Credentials > Kind: Username with password |
Remark | If Kind: SSH Username with private key does not work for you. |
Topic | Jenkins, From Zero To Hero Become a DevOps Jenkins Master |
Source | jenkins server > Build with Parameters > Console Output |
Error | /tmp/script.sh $MYSQL_HOST $MYSQL_PASSWORD $DATABASE_NAME $AWS_SECRET_KEY $BUCKET_NAME bash: line 6: /tmp/script.sh: Permission denied |
Solution | volumes: |
Remark | Make the script permanent outside of docker container by using volumes but remember to make it executable. |
Topic | Jenkins, From Zero To Hero Become a DevOps Jenkins Master |
Source | jenkins > docker-compose build |
Error | /bin/sh: 1: python: not found |
Solution | # Dockerfile |
Remark | Tutorial videos based on Python 2 and outdated installing Ansible with pip commands. |
Topic | GitLab, CI/CD Getting Started |
Source | GitLab > (your project) > CI/CD > Pipelines |
Error | Preparing environment |
Solution | Comment out the last 3 lines of /home/gitlab-runner/.bash_logout |
Remark | Deleting the file works too. |
Topic | GitLab, CI/CD chmod: unrecognized option '-----BEGIN' |
Source | GitLab > (your project) > CI/CD > Pipelines > Deployment stage |
Error | $ chmod og= $ID_RSA ERROR: Job failed: exit code 1 |
Solution | CI/CD > Variables > Update variable Change Type drop-down to File |
Remark | Apply chmod on a file instead of file content. |
Topic | GitLab, CI/CD Load key “/builds/.username./my-proj.tmp/ID_RSA”: invalid format |
Source | GitLab > (your project) > CI/CD > Pipelines > Deployment stage |
Error | $ ssh -p2288 -i $ID_RSA -o StrictHostKeyChecking=no $SERVER_USER@$SERVER_IP "docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY" |
Solution | user@server:~/.ssh$ ssh-keygen -p -m pem -f id_rsa |
Remark | Requires PEM format (-----BEGIN RSA PRIVATE KEY-----) instead of (-----BEGIN OPENSSH PRIVATE KEY-----) |
Topic | GitLab, CI/CD Permission denied (publickey,password) |
Source | GitLab > (your project) > CI/CD > Pipelines > Deployment stage |
Error | $ ssh -p2288 -i $ID_RSA -o StrictHostKeyChecking=no $SERVER_USER@$SERVER_IP "docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY" |
Solution | user@server:~/.ssh$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys |
Remark | Add the public key (generated within) to the deployment server itself in order for it to accept the RSA private key from GitLab CI/CD variable. |
Topic | GitLab, CI/CD stages: -publish (passed) but -deploy (did not happen) |
Source | GitLab > (your project) > CI/CD > Pipelines > under Stages |
Error | The subsequent -deploy stage did not run even after the -publish stage passed |
Solution | only: |
Remark | In March 2021, GitLab renamed the default 'Master' branch to 'Main' for new projects. |
Topic | GitLab, Install a runner in CentOS 7 |
Source | $ gitlab-runner install --user=gitlab-runner --working-directory=/home/gitlab-runner |
Error | FATAL: flag provided but not defined: -user |
Solution | $ sudo yum install gitlab-runner |
Remark | Using GitLab runner installation instructions which works for Ubuntu but not for CentOS 7. |
Topic | GitLab, Install a runner in CentOS 8 Stream |
Source | $ gitlab-runner install --user=gitlab-runner --working-directory=/home/gitlab-runner |
Error | sudo: gitlab-runner: command not found |
Solution | $ whereis gitlab-runner |
Remark | sudo visudo is reserved for updating /etc/sudoers |
Topic | Limited access (non-root) user: AWS API Error |
Source | EC2 Dashboard |
Error | [AWS EC2] Instances (running) x API Error | Dedicated Hosts x API Error Instances x API Error | Key pairs x API Error $ terraform apply |
Solution | Delete and re-create the (non-root) user and re-assign required permissions policies. |
Remark | stackoverflow.com mentioned the keys got revoked because AWS detected the access key/secret key was exposed or compromised. |
Topic | AWS Application Load Balancer – 503 Service Temporarily Unavailable |
Source | Terraform (main.tf) > AWS (EC2/ALB) > Browser |
Error | http://terraform-asg-example-15464xxxxx.us-east-2.elb.amazonaws.com/ 503 Service Temporarily Unavailable |
Solution | # Find aws_autoscaling_group resource and add this line in Terraform main.tf [Manual fix] Go AWS EC2 > click Target Groups > click affected group > Register targets > Check Instance ID (e.g. EC2) > Include as pending below > Register pending targets Go to browser and retry http://terraform-asg-example-15464xxxxx.us-east-2.elb.amazonaws.com/ |
Remark | The target group is created but contains no EC2 instances hence HTTP Error 503 is returned. |
Topic | Prometheus – err="opening storage failed: mmap files" |
Source | Prometheus on Docker > docker-compose up |
Error | prometheus | level=error ts=2022-07-20T13:20:57.653Z caller=main.go:787 err="opening storage failed: mmap files, file: data/chunks_head/000476: mmap: invalid argument" |
Solution | [List all volumes] docker volume ls -q shows prometheus_prometheus-data (prometheus is the container name and prometheus-data is the volume). [Delete the prometheus volume only] $ docker volume rm prometheus_prometheus-data |
Remark | Deleting data/chunks_head/000476 is one possible solution. However, since Prometheus runs in Docker and fails to start, trying $ docker exec -it prometheus bash results in Error: No such container: prometheus. Removing the volume loses historical data, but the configuration remains intact. |
Topic | http port [8000] – port is already bound. Splunk needs to use this port. |
Source | CentOS 8 | Splunk v9 |
Error | $ sudo /opt/splunk/bin/./splunk start Checking http port [8000]: not available ERROR: http port [8000] - port is already bound. Splunk needs to use this port. |
Solution | $ sudo dnf install nmap |
Remark | netstat -an | grep 8000 ; fuser -k 8000/tcp ; lsof -i TCP:8000 ; grep -rnw '/etc/httpd/conf.d/' -e '8000'. None of these commands find the PID bound to port 8000; only nmap shows it as alternate http. |
Topic | Remove ‘already committed’ .vscode directory from Git repository |
Source | GitLab |
Error | Many existing projects have committed and uploaded the sftp.json file, which stores server login details. |
Solution | Create a .gitignore that excludes hidden folders and files by adding .* That step is preventive; to remove a .vscode folder already in the Git repo, open Git Bash and run git rm -r --cached myFolder, then commit and push all changes. |
Remark | SFTP is a useful plugin for Visual Studio Code, but the sftp.json file will get pushed to Git together with the rest of the project files if no .gitignore is deployed. |
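The last table entry's fix, sketched end-to-end in a throwaway repository (assuming git is installed; the repository contents and identity here are purely illustrative):

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.email demo@example.com   # throwaway identity for the demo commits
git config user.name demo

# Simulate a project that already committed .vscode/sftp.json.
mkdir .vscode
echo '{}' > .vscode/sftp.json
git add -A
git commit -qm 'initial commit with .vscode'

# Preventive: ignore hidden files/folders going forward.
echo '.*' > .gitignore

# Curative: untrack the already-committed folder, then commit.
git rm -r -q --cached .vscode
git commit -qm 'stop tracking .vscode'

git ls-files   # .vscode/sftp.json is no longer tracked
```

Note that git rm --cached only untracks the files; they stay on disk, which is exactly what you want for local editor configuration.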
This post is not the end; we will keep adding troubleshooting guides as we continue exploring DevOps tools.
Last updated 30 Jan 2023
Contents
- GitLab CI Shell Executor failing builds with ERROR: Job failed: exit status 1
- Upload artifact fails but job succeeds
- Summary
- Steps to reproduce
- Actual behavior
- Expected behavior
- Relevant logs and/or screenshots
- Environment description
- Used GitLab Runner version
- Container killed while running / Output logs missing in GitLab console
- Summary
- Context
- Output
- Reproducibility
- More information
- Configuration
- Make it possible to control build status using exit codes
- Update
- Description
- Proposal
- Invalid configurations:
- References
- Sequencing
- gitlab runner ends with "Cleaning up file based variables 00:01 ERROR: Job failed: exit code 1"
- 4 answers
- Overview
- My code
- How I fixed it
GitLab CI Shell Executor failing builds with ERROR: Job failed: exit status 1
Tonight I’ve been playing around with GitLab CI’s shell executor, and my builds have been failing with this error:
After some searching online, it appeared that, similar to when receiving No Such Directory, this comment noted that it's an issue with SKEL; the solution was to delete .bash_logout from the gitlab-runner user's home, but I also removed .bashrc and .profile.
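Those profile files are copied from /etc/skel when the gitlab-runner user is created, so the cleanup can be sketched as below. The demo operates on a scratch directory standing in for the runner's home; set RUNNER_HOME=/home/gitlab-runner and run with sudo to apply it for real.

```shell
RUNNER_HOME="${RUNNER_HOME:-$(mktemp -d)}"   # /home/gitlab-runner in production

# Stand-ins for the files copied in from /etc/skel at user creation.
touch "$RUNNER_HOME/.bash_logout" "$RUNNER_HOME/.bashrc" "$RUNNER_HOME/.profile"

# The fix described above: remove all three profile files.
rm -f "$RUNNER_HOME/.bash_logout" "$RUNNER_HOME/.bashrc" "$RUNNER_HOME/.profile"

ls -A "$RUNNER_HOME"   # nothing left to source on shell exit
```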
How to work around `ERROR: Job failed: exit status 1` errors with GitLab CI’s shell executor.
Written by Jamie Tanna on Wed, 03 Jun 2020 21:13:41 BST, and last updated on Wed, 02 Mar 2022 13:34:19 UTC.
Content for this article is shared under the terms of the Creative Commons Attribution Non Commercial Share Alike 4.0 International, and code is shared under the Apache License 2.0.
Upload artifact fails but job succeeds
Summary
Occasionally, after one of our build jobs is finished, it will start uploading the artifacts, and silently fail: the job succeeds but the artifacts are not uploaded and no errors are reported.
- Jobs that depend on these artifacts then fail, which requires re-running the whole job again (as opposed to just re-running the upload)
Steps to reproduce
- I have managed to reproduce the behavior (the helper crashing but job succeeding)
- More details in this project https://gitlab.com/jpsamper/runner-helper-reproducer
- In production, we usually see it when there are a lot of jobs running/uploading artifacts at the same time
- We’ve seen it with as few as 10 jobs uploading 1.5GB zipped (4-5GB unzipped) concurrently
- By using a custom gitlab-runner-helper with a lot more print statements, we have found that the logs stop after invoking r.client.Do (i.e. if we add a print statement right before and right after, the one right after never appears)
- Naturally, this is the behavior when something goes wrong; when the artifact is uploaded correctly, we see both print statements
Actual behavior
If I understand correctly, the function call linked above is invoking Do from the net/http package, and that call seems to be crashing the gitlab-runner-helper with no additional error message/return code/etc.
Expected behavior
- If an artifact upload fails, the job fails or retries
- An informative error message too, ideally
Relevant logs and/or screenshots
When everything works as expected:
And when it doesn’t:
Environment description
- GitLab Runner with the Kubernetes executor (originally reported with the Docker executor)
- Latest docker version
- Default config.toml
Used GitLab Runner version
We’re currently on gitlab-runner 13.3.0 but have been seeing this since at least 12.9.0
Container killed while running / Output logs missing in GitLab console
Summary
The job output is not completely displayed in the GitLab console. This happens for both successful and failed jobs.
Context
The job's script simply calls a bash script:
Output
The GitLab job console output the following:
As one can see, the final echo Job completed with code $exit_code did not show up. I have also seen this behavior on a successful job. When following the container with docker logs -f, the output also stops in the middle of nowhere.
So I'm not sure whether the container sometimes gets killed too early, resulting in exit code 1, or whether it is a job failure in my script, since I have no way to access the error logs. It is also possible that some flush is missing, because sometimes the job still succeeds even though I don't see the end of the logs.
Reproducibility
This happens with gitlab-runner 12.10.2 but:
- it always happens on the same job (we have multiple jobs of the same size in the same stage that run concurrently).
- it happens only on one runner. If the job is executed on another runner (by pausing the problematic one), the job completes normally.
- on the problematic runner, the crash occurs almost always for that job, but sometimes it passes.
- the crash happens even if it is the only job running on the server (we have 4 executors on this runner).
I saw this happen directly: I was following the executor with docker logs -f. Suddenly the logs stopped and I got back to the shell. A subsequent docker logs reported that no such container exists, meaning the runner deleted it directly after it crashed.
More information
- We are running GitLab 12.10.3, with 2 runners version 12.10.2.
- The problem has also been seen on gitlab-runner 12.10.1 and 12.9.0 (but with a GitLab 12.10.3 instance)
It looks as if our job could be flaky, but that's not the case (as noted, this kind of error never occurred with previous GitLab versions, and even with flaky code we should still see the last echo line). Also, the same job on a stable branch (which has had no modifications since we updated GitLab) fails with the exact same behavior.
We updated the system’s runner and rebooted it, pruned all the containers, but nevertheless the error is still present.
At this point, it is very hard to get a deeper understanding of this problem because:
- we have multiple other jobs with similar, even longer tests that run without any problem.
- this same behaviour has been seen on both runners. If I shut down the runner and the job is executed by the other one, it succeeds, even with partial logs.
- it really looks like gitlab-runner either "returns" too early, omitting some log output from the container, or kills it prematurely.
- could it be related to #25348 (closed) ? 🤔
Configuration
We use Docker version 18.09.1, build 4c52b90 and docker-compose version 1.21.0 with the following configuration:
Make it possible to control build status using exit codes
Update
We've opened a separate issue #273157 (closed) to explore additional ways to solve this. Please add your comments to this issue only if the proposed solution does not fit your needs.
Description
It may be an interesting feature to be able to control build status using .gitlab-ci.yml and exit codes.
Proposal
Implement control for exit codes as follows, based on the list of possible ones that are supported (failed, warning, success):
script_exit_codes are applied to the job based on the first non-zero exit code encountered in the script, or the last zero if all exit codes were zero.
- If success: is not defined, it is evaluated as success: [0] . If it is defined, and you want 0 to be a success, it must be included in your array.
- If warning: is not defined, it is evaluated as warning: []
- If canceled: is not defined, it is evaluated as canceled: []
- If a job is set to allow_failure: true , all failure states will be also treated as warnings.
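A sketch of what the proposed keyword might look like in .gitlab-ci.yml (this reflects the proposal only, not an implemented feature; the job name, script, and exit codes are illustrative):

```yaml
run-tests:
  script:
    - ./run_tests.sh
  script_exit_codes:
    success: [0]        # the default when omitted; must include 0 if defined
    warning: [2, 3]     # treat these exit codes as warnings
    canceled: [4]       # defaults to [] when omitted
```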
Invalid configurations:
The following configuration should be considered invalid, and a syntax error should be returned, because of duplicate values (1 in this case). success and warning must have mutually exclusive values.
References
Idea from @ayufan during Slack conversation 🙂
Sequencing
- Add script_exit_codes to the YAML keywords
- Include script_exit_codes in the /request jobs response
- Add support for canceled state. This should already be done as part of gitlab-runner#4843
- The runner should compare the exit code with the ones defined in script_exit_codes (if available) before sending the status.
This page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.
gitlab runner ends with "Cleaning up file based variables 00:01 ERROR: Job failed: exit code 1"
Even though all of my steps pass, GitLab CI shows this: "Cleaning up file based variables 00:01 ERROR: Job failed: exit code 1"
and fails at the very end. Interestingly, this only happens on my main branch; it runs successfully on other branches. Has anyone run into this problem and found a solution?
4 answers
Overview
This drove me crazy, and I still don't know what the correct answer is. I ran into this problem myself and spent hours on it. I think GitLab broke something with command substitution (a new release shipped yesterday), though I may be wrong about the cause or its timing. It also seems to happen only for some command substitutions and not others. I initially suspected it might be related to redirecting output to /dev/null, but I didn't dig deeper. It always failed immediately after a command substitution was initiated.
My code
I had code similar to yours (a shortened version below); I tried manipulating it in several ways, but every use of command substitution produced the same error message:
I made the following attempts:
Both versions ran successfully on my local machine but failed in GitLab (there may be typos above; please don't pick them apart, as this is a reduced version of my real program).
How I fixed it
Instead of using command substitution $(...), I opted for process substitution, and it seems to work without issue.
I would try the same replacement in your code, if possible:
The problem could also be the line inside the if statement (the echo); you could replace it with the following:
Again, I'm not entirely sure this will solve it for you, since I'm still not sure of the cause, but it solved my problem. Good luck.
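The two constructs from that answer, side by side. This is an illustrative sketch (the commands and strings are placeholders, not the poster's original code); it requires bash, since process substitution is not POSIX sh:

```shell
#!/bin/bash
# Command substitution: the construct that reportedly failed on the runner.
version=$(printf 'v1.2.3')
echo "via command substitution:  $version"

# Process substitution: the workaround. The command's output is exposed
# as a file descriptor rather than being captured inline.
while IFS= read -r line; do
  result="$line"
done < <(printf 'v1.2.3\n')
echo "via process substitution: $result"
```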
In my case, my script ended with a curl command against a URL that returned 403 Forbidden, which probably caused the hang.
...in case that helps someone 🙂
We ran into the same problem on GitLab v13.3.6-ee with the following line in the script we use to open a new merge request:
And, as @ctwheels stated, changing that line to this:
Running with gitlab-runner 14.5.2 (e91107dd)
on ibrahimrunner a8c7nx2r
Preparing the "shell" executor
00:00
Using Shell executor…
Preparing environment
00:00
Running on ip-172-31-5-177…
ERROR: Job failed: prepare environment: exit status 1. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
I have installed the GitLab server on an AWS EC2 instance, and I installed gitlab-runner on the same server.
Now, when I commit the .gitlab-ci.yml file I created, it automatically triggers the CI/CD pipeline, but it throws the above error, i.e.:
ERROR: Job failed: prepare environment: exit status 1.
Could you help me, please?
According to this GitLab issue, the fix is to remove .bash_logout.
I had configured the GitLab server and the GitLab runner on the same Ubuntu 20.04 machine. When I installed the GitLab runner on a separate machine, the problem went away.