cita-monitor deployment problem: cannot connect to the CITA service

The error is as follows:
[2020-02-28 17:18:19,711] ERROR in app: Exception on /metrics/cita [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2311, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1834, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1737, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 36, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1832, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1818, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "cita_monitor_agent.py", line 316, in exporter
    metadata_info = class_result.metadata(hex_number)
  File "cita_monitor_agent.py", line 163, in metadata
    return self.cli_request(payload)
  File "cita_monitor_agent.py", line 123, in cli_request
    req_result = os.popen(req).read()
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 71: ordinal not in range(128)

I also tested a single-container deployment that monitors only CITA, and the problem is the same.

Which guide did you follow for the deployment?

@Rain, could you take a look?

Alternatively, what steps did you follow? That would make it easier for us to pinpoint where the problem might be.

File "cita_monitor_agent.py", line 316, in exporter

This line calls metadata_info = class_result.metadata(hex_number).

The metadata method is defined as:

    def metadata(self, block_height):
        """Get metadata with cita-cli"""
        payload = "rpc getMetaData --height %s" % (block_height)
        return self.cli_request(payload)
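
The traceback ends in os.popen(req).read(), which decodes the child process's output with the locale's preferred encoding. In a container without a UTF-8 locale that encoding is ASCII, and 0xe6 is the lead byte of a three-byte UTF-8 sequence (for example a CJK character), so the read fails before the actual message from cita-cli can be seen. Below is a minimal sketch of a more tolerant reader. run_cli is a hypothetical helper, not the function in cita_monitor_agent.py, and it assumes the CLI writes UTF-8.

```python
import subprocess


def run_cli(argv):
    """Run a command and return its combined output decoded as UTF-8.

    Hypothetical helper for illustration only. Bytes that are not
    valid UTF-8 are replaced with U+FFFD instead of raising
    UnicodeDecodeError, so the underlying message stays readable.
    """
    completed = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error text together with normal output
    )
    return completed.stdout.decode("utf-8", errors="replace")
```

Reading the output this way should reveal what cita-cli actually printed at the failing position, which is probably the real error message.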

How are you running cita-monitor? With Docker?

Please tell us which CITA version you are using,

and provide more of the cita-monitor output.

I deployed it exactly as described here.
I start cita-monitor with docker-compose up -d.

The CITA version is the latest, v20.

We need more of the runtime output from the cita-monitor agent.

@uangian, could you check whether this is a compatibility problem with CITA v20?

What I gave you is the error screenshot from the citamon_agent_cita_exporter container.

Accessing http://192.168.98.242:1920/metrics/cita produces the following error:

[2020-02-28 17:59:19,726] ERROR in app: Exception on /metrics/cita [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2311, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1834, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1737, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 36, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1832, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1818, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "cita_monitor_agent.py", line 316, in exporter
    metadata_info = class_result.metadata(hex_number)
  File "cita_monitor_agent.py", line 163, in metadata
    return self.cli_request(payload)
  File "cita_monitor_agent.py", line 123, in cli_request
    req_result = os.popen(req).read()
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 214: ordinal not in range(128)

This is easy to test. You will see it as soon as you deploy it yourselves.

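I ran it and did not see the problem you describe.

So I suspect you are not reaching CITA, possibly because of a network problem. You can try to rule things out as follows: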

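  • Has CITA started correctly?
  • Did CITA map its ports to the host at startup? On Linux it uses the host's network directly, but on macOS you need to expose the ports manually. See here.
  • Is NODE_IP in cita-monitor set correctly? Note that if CITA and cita-monitor are not on the same host, you need to use a public IP or another IP through which the two can reach each other.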
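
The second and third checks above can be sketched as a plain TCP connection test. can_connect is a hypothetical helper, not part of cita-monitor. It only shows that something accepts connections on the address; it does not prove that the listener is CITA.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens the socket;
        # the with-block closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

The result is most meaningful when the check runs from the same place as the exporter (for example inside its container), using the node's IP and JSON-RPC port as the arguments.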

My steps were as follows:

  • Start CITA (I use the CITA Docker image directly):

Create a single-node chain:

docker run -v "`pwd`":/opt/cita-run cita/cita-ce:20.2.0-secp256k1-sha3 cita create --super_admin "0x37d1c7449bfe76fe9c445e626da06265e9377601" --nodes "127.0.0.1:4000"

Start the node:

docker run -d -p 1337:1337 -v "`pwd`":/opt/cita-run cita/cita-ce:20.2.0-secp256k1-sha3 /bin/bash -c 'cita setup test-chain/0 && cita start test-chain/0 && sleep infinity'
  • Edit cita-monitor/agent/.env. My .env file looks like this:
$ cat .env
# citamon_agent_cita_exporter HostName
HOSTNAME=nodehostname

# CITA node ip and port
NODE_IP=127.0.0.1
NODE_PORT=1337

# CITA node directory analysis
NODE_DIR=/root/rivtower/release/20.2.0/test-chain/0
SOFT_PATH=/root/rivtower/release/20.2.0

# CITA NODE INFO
CITA_NODENAME=node_0
CITA_CHAIN_ID=opc-devnet/1
CITA_NETWORKPORT=4000
  • Start cita-monitor:
$ docker-compose up -d
Creating network "agent_citamon-agent-net" with the default driver
Creating citamon_agent_proxy_exporter ...
Creating citamon_agent_host_exporter ...
Creating citamon_agent_cita_exporter_127.0.0.1_1337 ...
Creating citamon_agent_rabbitmq_exporter ...
Creating citamon_agent_proxy_exporter
Creating citamon_agent_host_exporter
Creating citamon_agent_process_exporter ...
Creating citamon_agent_cita_exporter_127.0.0.1_1337
Creating citamon_agent_rabbitmq_exporter
Creating citamon_agent_cita_exporter_127.0.0.1_1337 ... done

# CITA NODE INFO
CITA_NODENAME=node_0
CITA_CHAIN_ID=opc-devnet/1
CITA_NETWORKPORT=4000

Do these need to be configured?

In principle, these values are only used for display. Still, it is best to set them to match the actual node.

The documentation says that a 127.x address cannot be used for this IP. After starting with your configuration, can you access http://127.0.0.1:1920/metrics/cita?

Indeed, it cannot be accessed. The error is as follows:

$ curl http://127.0.0.1:1920/metrics/cita
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>

After I changed it to the host's IP, it worked.
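
A likely reason the loopback setting fails: inside a container that does not share the host's network namespace, 127.0.0.1 refers to the container itself, not to the host where CITA's port is published. The sketch below flags that case in a .env file. parse_env and node_ip_is_loopback are hypothetical helpers, not part of cita-monitor, and the parser handles only plain KEY=VALUE lines.

```python
import ipaddress


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def node_ip_is_loopback(env_text):
    """Return True if NODE_IP is a loopback address such as 127.0.0.1."""
    node_ip = parse_env(env_text).get("NODE_IP", "")
    try:
        return ipaddress.ip_address(node_ip).is_loopback
    except ValueError:
        # Not a literal IP address (for example a hostname or a placeholder).
        return False
```

For the first .env shown in this thread (NODE_IP=127.0.0.1) the check returns True, which matches the configuration that produced the 500 error.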

Start CITA, substituting the host's IP address for the $(HOST_IP) placeholder:

docker run -d -p $(HOST_IP):1337:1337 -v "`pwd`":/opt/cita-run cita/cita-ce:20.2.0-secp256k1-sha3 /bin/bash -c 'cita setup test-chain/0 && cita start test-chain/0 && sleep infinity'

The .env file (again, $(HOST_IP) stands for the host's IP address):

$ cat .env
# citamon_agent_cita_exporter HostName
HOSTNAME=nodehostname

# CITA node ip and port
NODE_IP=$(HOST_IP)
NODE_PORT=1337

# CITA node directory analysis
NODE_DIR=/root/rivtower/release/20.2.0/test-chain/0
SOFT_PATH=/root/rivtower/release/20.2.0

# CITA NODE INFO
CITA_NODENAME=node_0
CITA_CHAIN_ID=opc-devnet/1
CITA_NETWORKPORT=4000

Start cita-monitor:

$ docker-compose up -d

Access http://127.0.0.1:1920/metrics/cita:

$ curl http://127.0.0.1:1920/metrics/cita
# HELP Node_Get_ServiceStatus [ value is 1 or 0 ] Check the running status of the CITA service, service up is 1 or down is 0.
# TYPE Node_Get_ServiceStatus gauge
...
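
The same check can be scripted. The sketch below fetches the endpoint and extracts the samples for one metric from the Prometheus text format. fetch_metrics and metric_values are hypothetical helpers, and the parser is deliberately simple: it assumes label values do not contain a closing brace.

```python
import urllib.request


def fetch_metrics(url, timeout=5.0):
    """Return the body of a metrics endpoint as text.

    urlopen raises urllib.error.HTTPError on a 500 response.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def metric_values(text, name):
    """Return the float values of all samples named `name`."""
    values = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line and "}" in line:
            # Sample with labels: name{labels} value
            sample_name = line.split("{", 1)[0]
            rest = line.rsplit("}", 1)[1]
        else:
            # Sample without labels: name value
            sample_name, _, rest = line.partition(" ")
        fields = rest.split()
        if sample_name == name and fields:
            values.append(float(fields[0]))
    return values
```

For example, metric_values(fetch_metrics("http://127.0.0.1:1920/metrics/cita"), "Node_Get_ServiceStatus") should contain 1.0 when the CITA service is up, according to the HELP text above.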

Which version of cita-monitor are you using?

Oh! I cloned the latest version directly and then worked from cita-monitor/agent.

I am using [Latest release] v0.4.1.

$ curl http://127.0.0.1:1920/metrics/cita

500 Internal Server Error

Internal Server Error

The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

When you got this error, what did the container log show? Could you check that for me?

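I think your network is probably not configured correctly.

What is your host operating system? Have you configured the host IP?

Because CITA and the monitor both run in containers, they can only reach each other through the host's network.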

You want to see the error from the citamon_agent_proxy_exporter container, right?
One moment, let me restore the environment.