Restarting some of the system's business packages makes the system inaccessible; it recovers after the gateway is restarted.

Blade  Unresolved  2  121
263778608  2024-09-19 10:54
Bounty: 5

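I. What are the steps to reproduce the problem?

1. Restart some of the system's business packages, e.g. the order and oa packages. This sometimes leaves other business packages inaccessible: calls to interfaces in the user and system packages fail with 404, authentication in the auth package fails, the gateway becomes unavailable, and so on.

2. In the end the gateway has to be restarted, after which the system returns to normal.

3. Is it that after the restart the routes stored on the gateway can no longer be found? Or is there some kind of stale-version mismatch that causes connections to fail? For example, the data in Redis cannot be accessed either.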


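II. What result did you expect, and what did you actually see?

Expected: restarting any business package should not affect the system as a whole. Only the services in the package being restarted should be affected; other services should be untouched, and the gateway in particular must stay available.

Actual: the gateway is unavailable and other business packages cannot be accessed.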



III. Which product and version are you using, and on which operating system?

bladex 

springblade 2.6.0.RELEASE

IV. Please provide the detailed error stack trace. This is important.

Logs from the other business services:

2024-09-18 22:38:13.302 ERROR 324061 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : [NA] failed to request 

java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
        at sun.net.www.http.HttpClient.New(HttpClient.java:339)
        at sun.net.www.http.HttpClient.New(HttpClient.java:357)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
        at com.alibaba.nacos.client.naming.net.HttpClient.request(HttpClient.java:86)
        at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:427)
        at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:462)
        at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:395)
        at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:337)
        at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:108)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


2024-09-19 01:58:33.123 ERROR 324061 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : [NA] failed to request 

java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
        at sun.net.www.http.HttpClient.New(HttpClient.java:339)
        at sun.net.www.http.HttpClient.New(HttpClient.java:357)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
        at com.alibaba.nacos.client.naming.net.HttpClient.request(HttpClient.java:86)
        at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:427)
        at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:462)
        at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:395)
        at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:337)
        at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:108)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)




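An excerpt from the gateway log: requests are rejected outright with a 401 authentication failure. After the gateway is restarted, however, the same requests go through normally.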

2024-09-19 09:52:31.230  INFO 5476 --- [r-http-epoll-31] o.springblade.gateway.filter.AuthFilter  : -----AuthFilter filter isSkip-----/blade-user/getUserInfo?phone=17603059577
2024-09-19 09:52:31.233  INFO 5476 --- [r-http-epoll-31] o.s.g.filter.GlobalResponseLogFilter     : 

================ Gateway Response Start  ================
<=== 401 GET: /blade-user/getUserInfo?phone=17603059577
===Headers===  transfer-encoding: [chunked]
===Headers===  Content-Type: [application/json;charset=UTF-8]
================  Gateway Response End  =================




V. If you have any further details, please provide them below.

2 answers
  •  admin (original poster)
    2024-09-19 11:27

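    Restarting any service disconnects it from Nacos, and the gateway and Nacos work on a heartbeat cycle. Until that cycle comes round, the gateway still believes the service is offline, so naturally it cannot connect to it. Only once the next heartbeat cycle has completed is the service marked as healthy and reachable again.

    The quick fix is to restart the gateway once the service has restarted successfully. That way you don't have to wait for the heartbeat cycle; the change takes effect as soon as the gateway is back up.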
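
    The behaviour described in this answer can be illustrated with a toy model. The sketch below is not BladeX, Spring Cloud Gateway, or Nacos code: the `Gateway` class, the 30-second refresh interval, and the address `10.0.0.1:8100` are all hypothetical, chosen only to show why a gateway that refreshes its view of the registry once per cycle keeps failing requests until the next cycle, and why a freshly restarted gateway works straight away.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: every name and number here is an assumption, not real BladeX/Nacos code. */
public class StaleRouteCacheDemo {

    /** Stand-in for the gateway: keeps a local copy of the registry, refreshed once per cycle. */
    static class Gateway {
        private final Map<String, String> registry; // stand-in for the Nacos service list
        private final long refreshIntervalMs;       // hypothetical cycle length
        private Map<String, String> cached;
        private long lastRefreshMs;

        Gateway(Map<String, String> registry, long refreshIntervalMs, long nowMs) {
            this.registry = registry;
            this.refreshIntervalMs = refreshIntervalMs;
            refresh(nowMs); // a (re)started gateway pulls a fresh copy immediately
        }

        private void refresh(long nowMs) {
            cached = new HashMap<>(registry);
            lastRefreshMs = nowMs;
        }

        /** Returns the address to route to, or null if the gateway thinks the service is offline. */
        String route(String service, long nowMs) {
            if (nowMs - lastRefreshMs >= refreshIntervalMs) {
                refresh(nowMs);
            }
            return cached.get(service);
        }
    }

    public static void main(String[] args) {
        Map<String, String> registry = new HashMap<>();

        // The gateway's latest refresh (t = 0) lands while blade-user is down for its restart.
        Gateway running = new Gateway(registry, 30_000, 0);

        // blade-user finishes restarting and registers again.
        registry.put("blade-user", "10.0.0.1:8100");

        // t = 10 s: the running gateway still uses its stale copy.
        System.out.println(running.route("blade-user", 10_000));   // null

        // t = 10 s: a restarted gateway reads the registry at startup.
        Gateway restarted = new Gateway(registry, 30_000, 10_000);
        System.out.println(restarted.route("blade-user", 10_000)); // 10.0.0.1:8100

        // t = 30 s: the running gateway's next cycle arrives and it recovers on its own.
        System.out.println(running.route("blade-user", 30_000));   // 10.0.0.1:8100
    }
}
```

    In this model the only difference between waiting and restarting is when the refresh happens, which matches the explanation above: the restart simply forces the refresh early.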
