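I. What are the steps to reproduce this problem?
1. Restarting some of the system's business packages, such as the order or oa packages, sometimes makes other business packages unreachable: interface calls to the user and system packages fail with 404, the auth package fails authentication, the gateway becomes unavailable, and so on.
2. In the end the gateway has to be restarted before the system returns to normal.
3. Is it that the routes stored on the gateway can no longer be found after the restart? Or is there some kind of stale-version inconsistency that makes connections fail? For example, none of the data in redis can be accessed.
II. What result did you expect? What did you actually see?
Expected: restarting any business package has no impact on the rest of the system. It should affect only the service in that package; other services should be unaffected, and the gateway in particular must stay available.
Actual: the gateway is unavailable and the other business packages cannot be accessed.
III. Which product and which version are you using? On which operating system?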
bladex
springblade 2.6.0.RELEASE
IV. Please provide the detailed error stack trace. This is important.
Logs from the other business services:
2024-09-18 22:38:13.302 ERROR 324061 --- [ing.beat.sender] com.alibaba.nacos.client.naming : [NA] failed to request
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
at com.alibaba.nacos.client.naming.net.HttpClient.request(HttpClient.java:86)
at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:427)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:462)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:395)
at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:337)
at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:108)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2024-09-19 01:58:33.123 ERROR 324061 --- [ing.beat.sender] com.alibaba.nacos.client.naming : [NA] failed to request
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
at com.alibaba.nacos.client.naming.net.HttpClient.request(HttpClient.java:86)
at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:427)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:462)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:395)
at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:337)
at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:108)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
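Part of the gateway log: requests fail directly with a 401 authentication failure. After the gateway is restarted, however, access works normally.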
2024-09-19 09:52:31.230 INFO 5476 --- [r-http-epoll-31] o.springblade.gateway.filter.AuthFilter : -----AuthFilter filter isSkip-----/blade-user/getUserInfo?phone=17603059577
2024-09-19 09:52:31.233 INFO 5476 --- [r-http-epoll-31] o.s.g.filter.GlobalResponseLogFilter :
================ Gateway Response Start ================
<=== 401 GET: /blade-user/getUserInfo?phone=17603059577
===Headers=== transfer-encoding: [chunked]
===Headers=== Content-Type: [application/json;charset=UTF-8]
================ Gateway Response End =================
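V. If you have more details, please provide them below.
Restarting any service drops its connection to nacos. The gateway and nacos work on a heartbeat cycle; until the next cycle arrives, the gateway still treats the service as offline, so it naturally cannot connect. Only after the next heartbeat cycle completes is the service marked as available again.
The quick fix is to restart the gateway once the service has restarted successfully. That way you do not have to wait for the heartbeat cycle; it takes effect as soon as the restart finishes.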
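It does not behave the way you describe:
"Only after the next heartbeat cycle completes is the service marked as available again."
Several hours have passed and the business service is still unreachable. The error reported on the gateway side is that the business service refused the connection.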
Gateway error log:
================ Gateway Request End =================
ERROR 24327 --- [r-http-epoll-14] a.w.r.e.AbstractErrorWebExceptionHandler : [91d3fb11] 500 Server Error for HTTP POST "/blade-b2ecproject/b2ec/queryTodayHistoryBalance"
io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: /192.168.123.50:9116
Caused by: java.net.ConnectException: finishConnect(..) failed: 拒绝连接
at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124) ~[netty-transport-native-unix-common-4.1.51.Final.jar!/:4.1.51.Final]
at io.netty.channel.unix.Socket.finishConnect(Socket.java:243) ~[netty-transport-native-unix-common-4.1.51.Final.jar!/:4.1.51.Final]
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:672) [netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar!/:4.1.51.Final]
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:649) [netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar!/:4.1.51.Final]
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:529) [netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar!/:4.1.51.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:465) ~[netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar!/:4.1.51.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) ~[netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar!/:4.1.51.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[netty-common-4.1.51.Final.jar!/:4.1.51.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.51.Final.jar!/:4.1.51.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.51.Final.jar!/:4.1.51.Final]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191]
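The trace above is a TCP-level failure: "拒绝连接" is the localized "Connection refused" for 192.168.123.50:9116, which means nothing was accepting connections at that address when the gateway tried it. A minimal sketch for checking this from the gateway host is below. It is only an illustration: the class name PortProbe and the 2-second timeout are assumptions, and the default host and port are simply copied from the log.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of BladeX or SpringBlade.
final class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts, and unreachable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the address from the gateway error log above.
        String host = args.length > 0 ? args[0] : "192.168.123.50";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9116;
        System.out.println(host + ":" + port + " reachable = " + canConnect(host, port, 2000));
    }
}
```

If the probe fails from the gateway host while the service reports a successful start on port 9116, the address the gateway is routing to and the address the service actually listens on may differ, which is worth comparing before restarting anything.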
-------
The business service restarted normally; I did not see any errors.
The business service startup log is as follows:
2024-10-18 14:45:15.565 INFO 30885 --- [sync-executor-1] o.s.core.launch.StartEventListener : ---[BLADE-B2ECPROJECT]---启动完成,当前使用的端口:[9116],环境变量:[prod]---
2024-10-18 14:45:16.037 INFO 30885 --- [ XNIO-1 task-2] io.undertow.servlet : Initializing Spring DispatcherServlet 'dispatcherServlet'
2024-10-18 14:45:16.038 INFO 30885 --- [ XNIO-1 task-2] o.s.web.servlet.DispatcherServlet : Initializing Servlet 'dispatcherServlet'
2024-10-18 14:45:16.127 INFO 30885 --- [ XNIO-1 task-2] o.s.web.servlet.DispatcherServlet : Completed initialization in 88 ms
--
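I have tried restarting the gateway, and also restoring the old business package and restarting the service; either one works.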
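One way to narrow this down is to look at which instances nacos currently has registered for the service and compare their ip and port with the address the gateway is trying. The sketch below queries the Nacos 1.x Open API instance list endpoint (/nacos/v1/ns/instance/list) and prints the raw JSON. Several things here are assumptions rather than facts from this thread: the nacos address 127.0.0.1:8848, the service name blade-b2ecproject (inferred from the gateway path), the default namespace and group, no nacos authentication, and the class name NacosInstanceCheck.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical helper, not part of BladeX or SpringBlade.
final class NacosInstanceCheck {

    // Builds the instance list URL for a service in the default namespace and group.
    static String instanceListUrl(String serverAddr, String serviceName) {
        try {
            return "http://" + serverAddr + "/nacos/v1/ns/instance/list?serviceName="
                    + URLEncoder.encode(serviceName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Performs a plain HTTP GET and returns the response body as a string.
    static String fetch(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        // Both defaults are assumptions; pass the real values as arguments.
        String serverAddr = args.length > 0 ? args[0] : "127.0.0.1:8848";
        String serviceName = args.length > 1 ? args[1] : "blade-b2ecproject";
        System.out.println(fetch(instanceListUrl(serverAddr, serviceName)));
    }
}
```

In the returned JSON, the entries under hosts carry the ip, port and healthy fields for each registered instance. If the registered address differs from 192.168.123.50:9116, or the instance is missing or unhealthy long after the restart, that points at registration rather than at the gateway's heartbeat timing.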