Last edited by ran.lu Nov 24, 2022
This is an old version of this page. You can view the most recent version or browse the history.

[Baixin on-site] git push fails in Code Development

[[171] [AIARTS] [Baixin on-site] git push fails in Code Development] https://www.tapd.cn/42483287/bugtrace/bugs/view?bug_id=1142483287001007608

Remote connection

Connect to the cluster:

ssh -p 8886 root@220.194.147.64 100Trust!@

ssh root@192.3.21.217 100trust!@


Connect to the Code Development environment as a regular user:

ssh -p 34221 org-dev@192.3.21.217 Nwo7jvCW


# Process

org-dev@dev-d09e033f-989f-45f2-858c-ab591110b265-vxsld:~/code$ git status
On branch master
Your branch is ahead of 'origin/master' by 2 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
org-dev@dev-d09e033f-989f-45f2-858c-ab591110b265-vxsld:~/code$ git push
Counting objects: 98, done.
Delta compression using up to 192 threads.
Compressing objects: 100% (94/94), done.
remote: error: file write error: Bad file descriptor
remote: fatal: unable to write loose object file
error: remote unpack failed: unpack-objects abnormal exit
error: failed to push some refs to 'git@gitea-ssh.apulis:gitea/ouc68z1NTiC02V7Z0mTAbg.git'



Adding plain text files seems to work fine?


Suspected cause: the checkpoint file? (ls -lh  code/pretrained_model/ssd_80C_500E.ckpt)

org-dev@dev-d09e033f-989f-45f2-858c-ab591110b265-vxsld:~/code$ git commit -m "try add code" git
[master 2a54d46] try add code
 69 files changed, 7027 insertions(+)
 create mode 100644 code/Dockerfile
 create mode 100644 code/create_data.py
 create mode 100644 code/demo.jpg
 create mode 100644 code/eval.py
 create mode 100644 code/export.py
 create mode 100644 code/infer.py
 create mode 100644 code/infer/convert/convert_om.sh
 create mode 100644 code/infer/data/classes.json
 create mode 100644 code/infer/data/classes_id.json
 create mode 100644 code/infer/data/coco_ssd_mobile_net_v2.name
 create mode 100644 code/infer/data/config.cfg
 create mode 100644 code/infer/data/ssd-mobilenet-v2.aipp
 create mode 100644 code/infer/data/ssd_mobile_net_v2_aipp.pipeline
 create mode 100644 code/infer/data/ssd_mobile_net_v2_no_aipp.pipeline
 create mode 100644 code/infer/mxbase/CommandFlagParser.h
 create mode 100644 code/infer/mxbase/FunctionTimer.h
 create mode 100644 code/infer/mxbase/MxBaseInfer.cpp
 create mode 100644 code/infer/mxbase/MxBaseInfer.h
 create mode 100644 code/infer/mxbase/MxImage.cpp
 create mode 100644 code/infer/mxbase/MxImage.h
 create mode 100644 code/infer/mxbase/MxUtil.cpp
 create mode 100644 code/infer/mxbase/MxUtil.h
 create mode 100644 code/infer/mxbase/SSDInfer.cpp
 create mode 100644 code/infer/mxbase/SSDInfer.h
 create mode 100644 code/infer/mxbase/SSDPostProcessor.h
 create mode 100644 code/infer/mxbase/build.sh
 create mode 100644 code/infer/mxbase/main.cpp
 create mode 100644 code/infer/mxbase/run_test.py
 create mode 100644 code/infer/sdk/mxpi/MxpiSSDMobileNetV2PostProcessor.cpp
 create mode 100644 code/infer/sdk/mxpi/MxpiSSDMobileNetV2PostProcessor.h
 create mode 100644 code/infer/sdk/mxpi/build_mxpi.sh
 create mode 100644 code/infer/sdk/sample/ResultProcess.h
 create mode 100644 code/infer/sdk/sample/aipp.cpp
 create mode 100644 code/infer/sdk/sample/build_aipp.sh
 create mode 100644 code/infer/sdk/sample/build_no_aipp.sh
 create mode 100644 code/infer/sdk/sample/no_aipp.cpp
 create mode 100644 code/mindspore_hub_conf.py
 create mode 100644 code/modelzoo_level.txt
 create mode 100644 code/on_platform/modelarts/README.md
 create mode 100644 code/on_platform/modelarts/init.py
 create mode 100644 code/on_platform/modelarts/start.py
 create mode 100644 code/on_platform/plat_cfg.yaml
 create mode 100644 code/requirements.txt
 create mode 100644 code/scripts/docker_start.sh
 create mode 100644 code/scripts/run_distribute_self_defined_train.sh
 create mode 100644 code/scripts/run_distribute_train.sh
 create mode 100644 code/scripts/run_distribute_train_gpu.sh
 create mode 100644 code/scripts/run_eval.sh
 create mode 100644 code/scripts/run_eval_gpu.sh
 create mode 100644 code/serve_desc.template
 create mode 100644 code/src/init.py
 create mode 100644 code/src/anchor_generator.py
 create mode 100644 code/src/box_utils.py
 create mode 100644 code/src/config.py
 create mode 100644 code/src/config_ssd300.py
 create mode 100644 code/src/config_ssd_mobilenet_v1_fpn.py
 create mode 100644 code/src/dataset.py
 create mode 100644 code/src/eval_utils.py
 create mode 100644 code/src/init_params.py
 create mode 100644 code/src/lr_schedule.py
 create mode 100644 code/src/mobilenet_v1_fpn.py
 create mode 100644 code/src/ssd.py
 create mode 100644 code/train.py
 create mode 100644 code/transformer/ext.proto
 create mode 100644 code/transformer/ext_pb2.py
 create mode 100644 code/transformer/postprocess.py
 create mode 100644 code/transformer/preprocess.py
 create mode 100644 code/transformer/serve.yaml
 create mode 100644 code/version.ini



The "try add infer" commit also has problems

org-dev@dev-d09e033f-989f-45f2-858c-ab591110b265-vxsld:~/code$ git push
Counting objects: 10, done.
Delta compression using up to 192 threads.
Compressing objects: 100% (9/9), done.
remote: fatal: error when closing loose object file: I/O error
error: remote unpack failed: unpack-objects abnormal exit
error: failed to push some refs to 'git@gitea-ssh.apulis:gitea/ouc68z1NTiC02V7Z0mTAbg.git'
org-dev@dev-d09e033f-989f-45f2-858c-ab591110b265-vxsld:~/code$ ls -lh infer
total 25M
-rw-r--r-- 1 org-dev domainusers  180 Nov 21 18:28 eval_result.json
drwxr-xr-x 2 org-dev domainusers 4.0K Nov 21 18:28 export
-rw-r--r-- 1 org-dev domainusers 295K Nov 21 18:28 predictions.json
-rw-r--r-- 1 org-dev domainusers 2.5K Nov 21 18:28 serve_desc.template
-rw-r--r-- 1 org-dev domainusers 2.5K Nov 21 18:37 serve_desc.yaml
-rw-r--r-- 1 org-dev domainusers  13M Nov 21 18:28 ssd.air
-rw------- 1 org-dev domainusers  12M Nov 21 18:37 ssd.om
drwxr-xr-x 2 org-dev domainusers 4.0K Nov 21 18:28 transformer
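Both failing pushes include binary artifacts (for example ssd.air at 13M and ssd.om at 12M in the listing above), so one way to narrow things down is to list the files above a size threshold before committing. The following is a minimal sketch; the helper name `find_large_files` and the 1 MiB default threshold are illustrative assumptions, not part of the original investigation.

```python
import os


def find_large_files(root, min_bytes=1024 * 1024):
    """Return (size, path) pairs for files under root of at least min_bytes, largest first."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own object store; only working-tree files are of interest.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; ignore it
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, reverse=True)
```

Calling `find_large_files(".")` from the `~/code` directory would list the candidates; committing and pushing them one at a time could then show whether a particular file, or only size in general, triggers the error.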



git config --global http.postBuffer 524288000
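Note that `http.postBuffer` is documented as a setting for git's smart HTTP transport, whereas the failing remote `git@gitea-ssh.apulis:gitea/ouc68z1NTiC02V7Z0mTAbg.git` uses the scp-like SSH form, so this setting is unlikely to influence the push. The sketch below classifies a remote URL by transport; `remote_transport` is a hypothetical helper written for illustration, and its URL handling is deliberately simplified.

```python
import re
from urllib.parse import urlparse


def remote_transport(url):
    """Classify a git remote URL as 'http', 'ssh', 'git', 'file' or 'unknown' (simplified)."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https"):
        return "http"
    if scheme in ("ssh", "git+ssh", "ssh+git"):
        return "ssh"
    if scheme in ("git", "file"):
        return scheme
    # scp-like syntax: [user@]host:path, with no "://" separator
    if "://" not in url and re.match(r"^(?:[^@/\s]+@)?[^:/\s]+:(?!//)", url):
        return "ssh"
    return "unknown"
```

With this helper, `remote_transport("git@gitea-ssh.apulis:gitea/ouc68z1NTiC02V7Z0mTAbg.git")` returns `"ssh"`, while an `https://` remote returns `"http"`.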



Tried the same operation on `https://gitlab.apulis.com.cn/ran.lu/psutil.git`, also adding this zip file; the problem did not occur there.
(
https://gitlab.apulis.com.cn/ran.lu/psutil/-/tree/feat/test-add-zip
)




[pid 523107] 14:11:09.606926 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=17, si_uid=1000} ---



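Edit the config file to add log settings:

cd /data/aistudio/nfs/apulis/pvc/aiplatform-gitea-data
ls gitea/conf/
app.ini


The configuration does not seem to take effect.

After changing the secret, it took effect. However, no new useful logs appeared.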



---

ssh process:
root     1039605  0.0  0.0   4292  3384 ?        Ss   15:19   0:00 sshd: /usr/sbin/sshd -D -e [listener] 0 of 10-100 startups

strace -f -T -tt -e trace=all -p 1039605

[pid 1207696] 15:46:28.089729 write(3, "$\207)(\356\327\306\267\271\256\v\233\255S\351\352\376[\324\330\370\32\254\332\360\216\312\252\33\220\260\336"..., 4096) = -1 EBADF (Bad file descriptor) <0.000053>
[pid 1207696] 15:46:28.089936 write(2, "error: file write error: Bad fil"..., 45) = 45 <0.000046>
[pid 1207438] 15:46:28.090051 <... read resumed> "error: file write error: Bad fil"..., 128) = 45 <9.879668>
[pid 1207696] 15:46:28.090106 write(2, "fatal: unable to write loose obj"..., 41 <unfinished ...>
[pid 1207438] 15:46:28.090167 write(1, "0032\2", 5 <unfinished ...>


Found 2 occurrences of EBADF:

ioctl(-1, TIOCGPGRP, 0xfffffebfa84c) = -1 EBADF (Bad file descriptor) <0.000029>
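The EBADF results can also be pulled out of a saved strace log mechanically instead of by eye. The sketch below assumes the log uses the line shapes seen on this page (an optional `[pid N]` or bare pid prefix, an optional timestamp, then the call); `find_errno` and the regular expression are illustrative and will not cover every strace output form, such as `<unfinished ...>` lines.

```python
import re

# Optional "[pid N] " or "N " prefix, optional HH:MM:SS.usec timestamp, then the call.
LINE_RE = re.compile(
    r"^(?:\[pid\s+(?P<pid1>\d+)\]\s+|(?P<pid2>\d+)\s+)?"
    r"(?:(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+)?"
    r"(?P<syscall>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+|\?)"
    r"(?:\s+(?P<errno>E[A-Z0-9]+))?"
)


def find_errno(lines, errno="EBADF"):
    """Return one dict per completed syscall line that failed with the given errno."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("errno") == errno:
            hits.append({
                "pid": m.group("pid1") or m.group("pid2"),
                "time": m.group("time"),
                "syscall": m.group("syscall"),
                # For write and ioctl the first argument is the file descriptor.
                "fd": m.group("args").split(",", 1)[0].strip(),
            })
    return hits
```

Applied to the lines quoted here, this separates the two cases: the `ioctl` is issued on fd -1, which is never a valid descriptor, so its EBADF is expected, while the `write` on fd 3 is the one that matches the error git reports.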


pid: 1237051

The file descriptor in question:

1237051 15:51:01.193577 openat(AT_FDCWD, "/data/git/gitea-repositories/gitea/ouc68z1ntic02v7z0mtabg.git/./objects/incoming-oEoffE/b3/tmp_obj_XwYObm", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0444) = 3 <0.033109>



EBADF
fd is not a valid file descriptor or is not open for writing.

https://linux.die.net/man/2/write





mmap
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
int munmap(void *addr, size_t length);

1237051 15:51:01.226765 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff82a9c000 <0.000043>
1237051 15:51:01.226908 mmap(NULL, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff7f790000 <0.000027>
1237051 15:51:01.227002 mmap(NULL, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff7f77f000 <0.000025>
1237051 15:51:01.227093 mmap(NULL, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff7f76e000 <0.000025>
1237051 15:51:01.227182 mmap(NULL, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff7f75d000 <0.000025>


These add up to 286720 bytes (8192 + 4 × 69632).
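The total can be checked by summing the length argument (the second parameter) of each `mmap` call in the trace. A small sketch, assuming the strace line shape shown above; `total_mmap_bytes` is a name made up for this example.

```python
import re

# Capture the second argument (length) of each mmap(...) call.
MMAP_RE = re.compile(r"\bmmap\(\s*[^,]+,\s*(\d+)\s*,")


def total_mmap_bytes(lines):
    """Sum the length argument of every mmap call found in the given lines."""
    return sum(int(m.group(1)) for line in lines for m in MMAP_RE.finditer(line))


trace = [
    "1237051 15:51:01.226765 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff82a9c000 <0.000043>",
    "1237051 15:51:01.226908 mmap(NULL, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff7f790000 <0.000027>",
    "1237051 15:51:01.227002 mmap(NULL, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff7f77f000 <0.000025>",
    "1237051 15:51:01.227093 mmap(NULL, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff7f76e000 <0.000025>",
    "1237051 15:51:01.227182 mmap(NULL, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff7f75d000 <0.000025>",
]
print(total_mmap_bytes(trace))  # 286720
```

Because the pattern uses `finditer`, it also works when several calls have been pasted onto one line.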


67159 lines
76406 lines

Probably not caused by exceeding a line-count limit…



NFS is also a suspect, because dmesg shows many error logs.
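The two push failures report errors on writing and on closing a loose object file, which is the kind of symptom an unhealthy NFS mount might produce. One way to test the storage independently of git is to write, fsync and close a file directly in the directory that backs the repositories. A minimal sketch; `probe_write` and its parameters are hypothetical and it has not been run against this cluster.

```python
import errno
import os
import tempfile


def probe_write(directory, size=1 << 20, chunk=4096):
    """Write size bytes to a temp file in directory, then fsync and close it.

    Returns None on success, or (step, errno_name) for the first OSError.
    """
    step = "create"
    fd = -1
    path = None
    try:
        fd, path = tempfile.mkstemp(dir=directory, prefix="probe_")
        step = "write"
        block = b"\0" * chunk
        remaining = size
        while remaining > 0:
            remaining -= os.write(fd, block[:min(chunk, remaining)])
        step = "fsync"
        os.fsync(fd)
        step = "close"
        closing, fd = fd, -1  # never retry close on the same descriptor
        os.close(closing)
        return None
    except OSError as exc:
        return step, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        if fd != -1:
            try:
                os.close(fd)
            except OSError:
                pass
        if path is not None:
            try:
                os.unlink(path)  # clean up the probe file
            except OSError:
                pass
```

For example, `probe_write("/data/aistudio/nfs/apulis/pvc/aiplatform-gitea-data")` could be run on the host, using the gitea data directory mentioned earlier on this page. A result such as `("close", "EIO")` would point at the storage layer and away from gitea or git.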


https://www.redhat.com/sysadmin/using-nfsstat-nfsiostat



Reading the gitea code

var (
	allowedCommands = map[string]models.AccessMode{
		"git-upload-pack":    models.AccessModeRead,
		"git-upload-archive": models.AccessModeRead,
		"git-receive-pack":   models.AccessModeWrite,
		lfsAuthenticateVerb:  models.AccessModeNone,
	}
)

			m.PostOptions("/git-receive-pack", repo.ServiceReceivePack)

git-receive-pack - Receive what is pushed into the repository

Invoked by git send-pack and updates the repository with the information fed from the remote end. This command is usually not invoked directly by the end user.


Indeed, git-receive-pack is invoked:

root@ubuntu:~# cat tmp_strace.log | grep -i receive
1236483 15:50:56.633331 newfstatat(AT_FDCWD, "/bin/git-receive-pack", 0x400162c338, 0) = -1 ENOENT (No such file or directory) <0.000029>
1236483 15:50:56.633420 newfstatat(AT_FDCWD, "/usr/bin/git-receive-pack", <unfinished ...>
1236569 15:50:56.640285 execve("/usr/bin/git-receive-pack", ["git-receive-pack", "gitea/ouc68z1ntic02v7z0mtabg.git"], 0x400167ec30 /* 24 vars */ <unfinished ...>
1236459 15:51:03.606437 write(2, "Received disconnect from 172.20."..., 76) = 76 <0.000016>
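Rather than grepping for a keyword, the `execve` calls can be extracted from the log together with their pid and argument vector, which makes it easier to follow the child process afterwards. A sketch under the assumption that lines start with a pid and a timestamp, as in `tmp_strace.log` above; `find_execs` is an illustrative helper name.

```python
import re

# pid (bare or in "[pid N]"), timestamp, then execve("path", ["arg0", "arg1", ...]
EXECVE_RE = re.compile(r'^(?:\[pid\s+)?(\d+)\]?\s+\S+\s+execve\("([^"]+)",\s*\[(.*?)\]')


def find_execs(lines, needle):
    """Return (pid, path, argv) for execve calls whose path contains needle."""
    results = []
    for line in lines:
        m = EXECVE_RE.match(line.strip())
        if m and needle in m.group(2):
            # Split the quoted argv entries, allowing backslash escapes inside them.
            argv = re.findall(r'"((?:[^"\\]|\\.)*)"', m.group(3))
            results.append((m.group(1), m.group(2), argv))
    return results
```

For the log above, `find_execs(open("tmp_strace.log"), "receive-pack")` would be expected to report pid 1236569, whose later `openat` and `write` calls can then be filtered by that pid.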
