BladeX Docker deployment: Saber does not work properly

Blade, unresolved
Posted by brucedong, 2020-10-06 12:26

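I. What are the steps to reproduce the problem?

0. Deployed with bladex/script/docker/app/deploy.sh. During deployment the Harbor images could not be pulled, so I changed the image addresses in docker-compose.yaml and rebuilt the report image. All Docker containers then started successfully; nothing else was changed:

image.png

1. Built the Saber project directly with yarn build to generate the dist files.

2. Uploaded everything under dist to the /usr/share/nginx/html/ directory of the app_web-nginx_1 container:

image.png

3. Opened http://192.168.1.102:8000 in a browser and got an "unknown error":

image.png

4. Debugging with F12 showed a 500 error: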


    Request URL: http://192.168.1.102:8000/api/blade-auth/oauth/captcha
    Request Method: GET
    Status Code: 500 Internal Server Error

    Response body:

    {"code":500,"data":null,"message":"Failed to handle request [GET http://192.168.1.102/blade-auth/oauth/captcha]: finishConnect(..) failed: Connection refused: /172.30.0.91:8100"}
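The `message` field of this response names the upstream address that refused the connection. As a small illustration, the address can be pulled out with the Python standard library. This is only a sketch: the helper name `refused_upstream` is made up for this example, and the regular expression is an assumption based on this single message, not a documented format.

```python
import json
import re

# Response body copied from the browser's network tab.
RESPONSE = '{"code":500,"data":null,"message":"Failed to handle request [GET http://192.168.1.102/blade-auth/oauth/captcha]: finishConnect(..) failed: Connection refused: /172.30.0.91:8100"}'

def refused_upstream(body):
    """Return (ip, port) of the refused upstream, or None if not present."""
    message = json.loads(body).get("message") or ""
    # Assumed pattern: "Connection refused: /<ipv4>:<port>"
    found = re.search(r"Connection refused: /(\d+\.\d+\.\d+\.\d+):(\d+)", message)
    if found is None:
        return None
    return found.group(1), int(found.group(2))

print(refused_upstream(RESPONSE))  # prints ('172.30.0.91', 8100)
```

The extracted address can then be compared with the container IPs on the Docker network to see which service the gateway failed to reach.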

5. The web-nginx configuration file is as follows:

/ # cat /etc/nginx/nginx.conf 
user  root;
worker_processes  1;
error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;
events {
    worker_connections  1024;
}
http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
    access_log  /var/log/nginx/access.log  main;
    sendfile        on;
    #tcp_nopush     on;
    keepalive_timeout  65;
    #gzip  on;
    #include /etc/nginx/conf.d/*.conf;
    upstream gateway {
                 server 172.30.0.81;
                 server 172.30.0.82;
                 server 172.30.0.83;
             }
    server {
      listen       8000;
      server_name  web;
      root         /usr/share/nginx/html;
      location / {
      }
      location ^~ /oauth/redirect {
           rewrite ^(.*)$ /index.html break;
      }
      location ^~ /api {
           proxy_set_header Host $host;
           proxy_set_header X-Real-IP $remote_addr;
           proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
           proxy_buffering off;
           rewrite ^/api/(.*)$ /$1 break;
           proxy_pass http://gateway;
      }
    }
}
/ # 
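To see which path the gateway actually receives, the `rewrite ^/api/(.*)$ /$1 break;` rule in this configuration can be mimicked with a regular expression. The following is only an illustrative sketch in Python, not nginx itself; the function name `rewrite_api_path` is hypothetical.

```python
import re

# Mirrors the nginx rule: rewrite ^/api/(.*)$ /$1 break;
API_PREFIX = re.compile(r"^/api/(.*)$")

def rewrite_api_path(path):
    """Return the path that would be passed upstream after the rewrite."""
    found = API_PREFIX.match(path)
    if found is None:
        # The rule does not match, so the path is left unchanged.
        return path
    return "/" + found.group(1)

print(rewrite_api_path("/api/blade-auth/oauth/captcha"))  # prints /blade-auth/oauth/captcha
```

This matches the URL in the error message, where the request reaches the gateway as /blade-auth/oauth/captcha without the /api prefix.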


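II. What result did you expect, and what did you actually see?

Expected a normal login; in practice the API calls return a 500 error.


III. Which product and version are you using, and on which operating system?

BladeX / latest version / CentOS 7 / Docker 19


IV. Please provide the detailed error stack trace; this is important.

The error is reported in the browser; the message is shown above.


V. If you have more details, please provide them below.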

2 answers
  • 2020-10-06 12:31

    The Docker private network information is as follows:

    Container Name        IPv4 Address     IPv6 Address  MacAddress
    app_nacos_1           172.30.0.48/16   -             02:42:ac:1e:00:30
    app_blade-admin_1     172.30.0.5/16    -             02:42:ac:1e:00:05
    app_blade-turbine_1   172.30.0.12/16   -             02:42:ac:1e:00:0c
    app_blade-zipkin_1    172.30.0.2/16    -             02:42:ac:1e:00:02
    app_blade-log_1       172.30.0.10/16   -             02:42:ac:1e:00:0a
    app_blade-gateway2_1  172.30.0.82/16   -             02:42:ac:1e:00:52
    app_blade-system_1    172.30.0.13/16   -             02:42:ac:1e:00:0d
    app_blade-auth1_1     172.30.0.91/16   -             02:42:ac:1e:00:5b
    app_blade-report_1    172.30.0.98/16   -             02:42:ac:1e:00:62
    app_blade-user_1      172.30.0.11/16   -             02:42:ac:1e:00:0b
    app_blade-desk_1      172.30.0.9/16    -             02:42:ac:1e:00:09
    app_blade-resource_1  172.30.0.7/16    -             02:42:ac:1e:00:07
    app_blade-auth2_1     172.30.0.92/16   -             02:42:ac:1e:00:5c
    app_blade-gateway1_1  172.30.0.81/16   -             02:42:ac:1e:00:51
    app_blade-flow_1      172.30.0.8/16    -             02:42:ac:1e:00:08
    app_blade-nginx_1     172.30.0.14/16   -             02:42:ac:1e:00:0e
    app_sentinel_1        172.30.0.58/16   -             02:42:ac:1e:00:3a
    app_blade-redis_1     172.30.0.4/16    -             02:42:ac:1e:00:04
    app_seata-server_1    172.30.0.68/16   -             02:42:ac:1e:00:44
    app_web-nginx_1       172.30.0.6/16    -             02:42:ac:1e:00:06


  • 2020-10-06 12:33

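     1. The log shows "Connection refused": the connection to the service at 172.30.0.91 could not be established, which suggests a network problem.

     2. So check whether all services registered successfully in Nacos, and please post the services registered in Nacos together with their registered IPs. Also use docker logs -f xxx to check whether each service started successfully.

     3. If the internal addresses cannot reach each other, requests will naturally be refused. In addition, this Docker deployment has to run on a single server; with multiple servers you need Docker Swarm to connect the internal networks.

     4. The deployment includes two nginx instances: one on port 8000 that serves the frontend, and one on port 88 that exposes the gateway externally. You can use the host IP with port 88 to test whether the backend APIs are working. For details see: https://sns.bladex.cn/article-14982.html

     5. You only deployed gateway1 and gateway2, which correspond to subnet IPs .81 and .82, so the .83 entry in the web-nginx reverse proxy can be removed.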
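The connectivity check suggested in this answer can be sketched as a small TCP probe, run from the Docker host or from a container on the same network. This is a minimal sketch using only the Python standard library; the helper name `can_connect` is hypothetical and not part of BladeX.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False
```

For example, `can_connect("172.30.0.91", 8100)` probes the blade-auth address named in the 500 response. A False result only says that the port could not be opened; it does not distinguish a service that failed to start from a routing problem, so the container logs still need to be checked.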

    Follow-up from the author: 2020-10-06 13:01

    #1. Docker is installed on a single server.

    #2. Nacos service list

    Service Name    Group          Clusters  Instances  Healthy Instances  Protection Threshold Triggered
    blade-user      DEFAULT_GROUP  1         1          1                  false
    blade-auth      DEFAULT_GROUP  1         2          2                  false
    blade-zipkin    DEFAULT_GROUP  1         1          1                  false
    blade-report    DEFAULT_GROUP  1         1          1                  false
    blade-admin     DEFAULT_GROUP  1         1          1                  false
    blade-turbine   DEFAULT_GROUP  1         1          1                  false
    blade-desk      DEFAULT_GROUP  1         1          1                  false
    blade-log       DEFAULT_GROUP  1         1          1                  false
    blade-system    DEFAULT_GROUP  1         1          1                  false
    blade-gateway   DEFAULT_GROUP  1         2          2                  false
    blade-resource  DEFAULT_GROUP  1         1          1                  false

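    #3. Nacos is registered at 172.30.0.48

    image.png

    #4. Port 88 is reachable and 88/doc.html can be opened, but the "Authorization module" and the "Workbench module" show the same error; only the System module loads normally:

    image.png

    #5. Gateway .83 has been removed from the nginx configuration, but it made no difference.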


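    Answer: 2020-10-06 13:08

    That means the services were not all deployed successfully. Run docker logs -f xxx against each Docker service and look at what blade-auth, blade-gateway, blade-user and the other services print in their logs.


    Also use Postman to bypass the frontend and call the token endpoint directly to see whether it returns correctly. If it does not, the problem has nothing to do with the frontend and you should focus on troubleshooting the deployed backend services. See this post: https://sns.bladex.cn/article-14982.html


    In general, problems with a single-server Docker deployment are mostly caused by the network segment or by the Nacos configuration not being read, so that the services cannot call each other. That is where you should focus your troubleshooting.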

    Follow-up from the author: 2020-10-06 13:10

    The unreachable port 8100 belongs to the blade-auth service:

    image.png

    Follow-up from the author: 2020-10-06 13:16


    2020-10-06 09:34:08.243  WARN 1 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : failed to request http://172.30.0.48:8848/nacos/v1/ns/instance/beat?app=blade-auth&namespaceId=public&port=8100&clusterName=DEFAULT&ip=172.30.0.91&serviceName=DEFAULT_GROUP%40%40blade-auth&encoding=UTF-8 from 172.30.0.48
    2020-10-06 09:34:08.244 ERROR 1 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : [NA] failed to request

    java.net.ConnectException: Connection refused (Connection refused)
        at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source) ~[na:1.8.0_265]
        at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source) ~[na:1.8.0_265]
        at java.net.AbstractPlainSocketImpl.connect(Unknown Source) ~[na:1.8.0_265]
        at java.net.SocksSocketImpl.connect(Unknown Source) ~[na:1.8.0_265]
        at java.net.Socket.connect(Unknown Source) ~[na:1.8.0_265]
        at sun.net.NetworkClient.doConnect(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.http.HttpClient.openServer(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.http.HttpClient.openServer(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.http.HttpClient.<init>(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.http.HttpClient.New(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.http.HttpClient.New(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(Unknown Source) ~[na:1.8.0_265]
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(Unknown Source) ~[na:1.8.0_265]
        at com.alibaba.nacos.client.naming.net.HttpClient.request(HttpClient.java:82) ~[nacos-client-1.2.1.jar!/:na]
        at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:433) [nacos-client-1.2.1.jar!/:na]
        at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:482) [nacos-client-1.2.1.jar!/:na]
        at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:401) [nacos-client-1.2.1.jar!/:na]
        at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:343) [nacos-client-1.2.1.jar!/:na]
        at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:108) [nacos-client-1.2.1.jar!/:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_265]
        at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.8.0_265]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) [na:1.8.0_265]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [na:1.8.0_265]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_265]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.8.0_265]
        at java.lang.Thread.run(Unknown Source) [na:1.8.0_265]

    2020-10-06 09:34:08.245 ERROR 1 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : request: /nacos/v1/ns/instance/beat failed, servers: [172.30.0.48:8848], code: 500, msg: java.net.ConnectException: Connection refused (Connection refused)
    2020-10-06 09:34:08.247 ERROR 1 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : [CLIENT-BEAT] failed to send beat: {"cluster":"DEFAULT","ip":"172.30.0.91","metadata":{"preserved.register.source":"SPRING_CLOUD"},"period":5000,"port":8100,"scheduled":false,"serviceName":"DEFAULT_GROUP@@blade-auth","stopped":false,"weight":1.0}, code: 500, msg: failed to req API:/api//nacos/v1/ns/instance/beat after all servers([172.30.0.48:8848]) tried: java.net.ConnectException: Connection refused (Connection refused)


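    Answer: 2020-10-06 13:22

     1. First check whether Nacos is running in standalone mode; if it is not, switch it to standalone (the yml defaults to standalone mode).

     2. Is only the .91 service reporting errors, or both .91 and .92? If it is only .91, stop the .91 service, start only .92, and see whether requests then go through.

     3. Get the services running first, then focus on troubleshooting IP connectivity.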
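To tell whether only .91 or both blade-auth instances are affected, one option is to ask Nacos which instances it has registered and then check each of them. The sketch below assumes the Nacos v1 open API endpoint /nacos/v1/ns/instance/list (the same API family as the /nacos/v1/ns/instance/beat call in the log) and assumes that the JSON response carries a `hosts` array with `ip`, `port` and `healthy` fields; verify both against your Nacos version. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Nacos address taken from the log in this thread.
NACOS = "http://172.30.0.48:8848"

def instance_list_url(service_name, base=NACOS):
    """Build the (assumed) Nacos v1 instance-list URL for a service."""
    query = urllib.parse.urlencode({"serviceName": service_name})
    return base + "/nacos/v1/ns/instance/list?" + query

def parse_instances(body):
    """Extract (ip, port, healthy) tuples from an instance-list response."""
    data = json.loads(body)
    return [
        (host["ip"], host["port"], host.get("healthy", False))
        for host in data.get("hosts", [])
    ]

def fetch_instances(service_name):
    """Query Nacos and return the registered instances of a service."""
    with urllib.request.urlopen(instance_list_url(service_name), timeout=5) as resp:
        return parse_instances(resp.read().decode("utf-8"))
```

Calling `fetch_instances("blade-auth")` from a machine that can reach Nacos would list the registered addresses, which can then be compared with the container IPs and container logs of the two blade-auth instances.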
