Troubleshooting on Cylon's Collection

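https://www.161616.top/tags/troubleshooting/
Recent content in Troubleshooting on Cylon's Collection
Hugo -- 0.125.7, zh, Sun, 08 Jun 2025 23:10:36 +0800

Resolving the AWS EKS error "You must be logged in to the server (Unauthorized)"
https://www.161616.top/aws-eks-error-you-must-be-logged-in-to-the-server/
Sat, 07 Jun 2025 00:00:00 +0000

Problem description

After obtaining the EKS kubeconfig, using it produces the following error:

    $ kubectl get pod --kubeconfig kubeconfig
    error: You must be logged in to the server (Unauthorized)

The IAM user, however, already has administrator permissions.

Cause

The referenced article describes this problem:

    You are not the cluster creator. If your IAM entity was not used to create the cluster, you are not the cluster creator. In that case, complete the following steps to map your IAM entity to the aws-auth ConfigMap so that it is allowed to access the cluster [1]

My setup matched what the article describes, yet the error persisted:

    $ aws sts get-caller-identity
    {
        "UserId": "AIDAXxxxxxxxIIWKMR22Q",
        "Account": "55555555496",
        "Arn": "arn:aws:iam::55555555496:user/eks-user"
    }

    This is because the user must be associated with the cluster as an Access entry; granting "EKS*" permissions alone is not enough.

Figure: selecting an existing IAM user as the cluster user

Resolving route conflicts between OpenVPN and other VPNs
https://www.161616.top/resolve-openvpn-tailscale-routers-conflict/
Sat, 07 Jun 2025 00:00:00 +0000

Requirements analysis

At the time of investigation, OpenVPN offered no dedicated split-tunneling setting, and the requirement was that turning OpenVPN on must not affect the existing network environment. Further research showed that directives in the OpenVPN configuration file can send traffic for specific addresses through the OpenVPN device, which achieves split tunneling.

Configuration directives

The following directives are the main ones used here:

dhcp-option: adds extra network parameters, either set on the client or pushed by the server; here it specifies DNS.
redirect-gateway def1: the def1 flag overrides the default route with 0.0.0.0/1 and 128.0.0.0/1; the advantage is that the original default gateway is not wiped.
route: adds routes automatically once the connection is established and removes them automatically when the TUN/TAP device is closed.
gateway: defaults to the second parameter or the default gateway. As keywords, vpn_gateway means the remote VPN endpoint address and net_gateway means the pre-existing IP default gateway.
route-nopull: when used on the client, this option effectively bars the server from adding routes to the client's routing table. Note that it still allows the server to set the TCP/IP properties of the client's TUN/TAP interface (used here mainly so that OpenVPN can create its own network interface).
pull-filter: ignores options pushed by the server, that is, options received through --pull, such as dhcp-option, route, and gateway.

The final configuration is:

    client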
    proto udp
    explicit-exit-notify
    remote x.

Troubleshooting an OSD failure caused by a physical machine losing power
https://www.161616.top/ch10-3-troubeshooting-osd-crash-by-poweroff/
Wed, 16 Apr 2025 00:00:00 +0000

Ceph version: nautilus

Procedure

Check the Ceph cluster status:

    ceph -s
      cluster:
        id: baf87797-3ec1-4f2c-8126-bf0a44051b13
        health: HEALTH_WARN
                3 osds down
                1 host (3 osds) down
                1 pools have many more objects per pg than average
                Degraded data redundancy: 1167403/4841062 objects degraded (24.115%), 391 pgs degraded, 412 pgs undersized
      services:
        mon: 3 daemons, quorum 10.

Goswagger - Skipping '', recursion detected
https://www.161616.top/goswagger-skipping-recursion-detected/
Sat, 21 Sep 2024 00:00:00 +0000

Problem: when the struct in use is nested, goswagger reports "recursion detected" or "cannot find type definition".

    type Instance struct {
        metav1.TypeMeta
        Instances       []InstanceItem    `json:"instances" yaml:"instances" form:"instances" binding:"required"`
        ServiceSelector map[string]string `json:"serivce_selector" yaml:"serivce_selector" form:"serivce_selector"`
    }

    type InstanceItem struct {
        Name         string            `json:"name" yaml:"name" form:"name" binding:"required"`
        PromEndpoint string            `json:"prom_endpoint" yaml:"prom_endpoint" form:"prom_endpoint" binding:"required"`
        Labels       map[string]string `json:"labels" yaml:"labels" form:"labels"`
    }

The go swagger annotation is:

    // deleteInstance godoc
    // @Summary Remove prometheus instance.
Gin - parameter default values
https://www.161616.top/gin-param-default-value/
Fri, 20 Sep 2024 00:00:00 +0000

Problem: default values were not populated when using Bind in gin. Changing to the code below makes them available.

    type User struct {
        Name string `form:"name,default=user1" json:"name,default=user2"`
        Age  int    `form:"age,default=10" json:"age,default=20"`
    }

    r := gin.Default()
    // way1 curl 127.0.0.1:8900/bind?name=aa
    // way2 curl -X POST 127.0.0.1:8900/bind -d "name=aa&age=30"
    // way3 curl -X POST 127.0.0.1:8900/bind -H "Content-Type: application/json" -d "{\"name\": \"aa\"}"
    r.

Gorm - BeforeDelete cannot fetch the correct records
https://www.161616.top/gorm-before-delete/
Fri, 20 Sep 2024 00:00:00 +0000

Problem: during a delete, the SQL issued from BeforeDelete is incorrect.

The BeforeDelete code is as follows:

    func (t *Target) BeforeDelete(tx *gorm.DB) (err error) {
        // Find all Labels associated with this Target
        var labels []Label
        if err := tx.Model(t).Association("Labels").Find(&labels); err != nil {
            klog.V(4).Infof("Error fetching labels: %v", err)
            return err
        }

        for _, label := range labels {
            if err := tx.Delete(&label).Error; err != nil {
                klog.
Java Pods ignore their limits after a forced GKE upgrade
https://www.161616.top/gke-invalid-pod-limits/
Wed, 11 Sep 2024 00:00:00 +0000

Today the GKE version reached EOL and kubelet was upgraded automatically to 1.28. After that, Java programs no longer recognized the limits in the resource manifest at startup and were OOMKilled in large numbers.

The Deployment manifest already sets resource limits, for example:

    resources:
      limits:
        memory: "1Gi"
      requests:
        memory: "600Mi"

The JAVA_OPS parameters are configured as percentages:

    -XX:+UseContainerSupport -XX:InitialRAMPercentage=70.0 -XX:MaxRAMPercentage=70.0

After startup, however, the JVM did not honor these settings. I logged in to the node with gcloud to inspect the JVM's runtime state (because the container uses a distroless image):

    project-20220325-asia-east-2-pool-221ab289-hgnf ~ # nsenter -t 274655 --mount --uts --ipc --net --pid /opt/java/openjdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintContainerInfo -version
    OSContainer::init: Initializing Container Support
    Detected cgroups v2 unified hierarchy
    Path to /cpu.

nacos code=403,msg=user not found!
https://www.161616.top/nacos-403-user-not-found/
Mon, 20 May 2024 00:00:00 +0000

The application gets a 403 when connecting to nacos, even though the username and password are correct, the user exists, and its permissions are correct.

    2024-05-20 10:36:11,593 - [ERROR] - [ropertySourceBuilder][main][n.c.NacosPropertySourceBuilder: 101][]: get data from Nacos error,dataId:admin-server.yaml - com.alibaba.nacos.api.exception.NacosException: http error, code=403,msg=user not found!,dataId=admin-server.yaml,group=DEFAULT_GROUP,tenant=cced143f-5adb-4cb8-a580-28802ea8f203
    at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.queryConfig(ClientWorker.java:979)

Cause: the nacos address did not match the one the application was connecting to; after making the two consistent, everything returned to normal. The nacos configuration file sets the context path to "/nacos", while the application connected directly to "/".

    server.servlet.contextPath=/nacos

A record of handling a Ceph cluster failure
https://www.161616.top/ch10-2-troubeshooting-crash-record/
Tue, 13 Feb 2024 00:00:00 +0000

Handling record

Ceph version: octopus

The first problem was that clients could not mount cephfs. The kernel log showed "bad authorize reply", which led me to suspect that the ceph keyring had been replaced.

    2019-01-30 17:26:58 localhost kernel: libceph: mds0
    10.80.20.100:6801 bad authorize reply
    2019-01-30 17:26:58 localhost kernel: libceph: mds0 10.80.20.100:6801 bad authorize reply
    2019-01-30 17:26:58 localhost kernel: libceph: mds0 10.80.20.100:6801 bad authorize reply
    2019-01-30 17:26:58 localhost kernel: libceph: mds0 10.80.20.100:6801 bad authorize reply
    2019-01-30 17:26:58 localhost kernel: libceph: mds0 10.80.20.100:6801 bad authorize reply
    2019-01-30 17:26:58 localhost kernel: libceph: mds0 10.

A cluster-wide failure in a K8s environment when cephfs is combined with fscache
https://www.161616.top/ch10-1-ceph-fscache/
Sat, 11 Nov 2023 00:00:00 +0000

This article records a mount failure that spread across an entire k8s cluster: with fscache enabled for cephfs in a kubernetes environment, a network problem or a ceph cluster problem triggered the failure.

Cluster-wide failure caused by using cephfs with fscache in kubernetes

With the background knowledge above covered, we can turn to the failure itself. The environment in which it occurred was configured as follows.

Failure environment

    Software    Version
    Centos      7.9
    Ceph        nautilus (14.20)
    Kernel      4.18.16

Symptoms

In the k8s cluster, where cephfs is mounted, newly started Pods fail to start with the following error:

    ContainerCannotRun: error while creating mount source path /var/lib/kubelet/pods/5446c441-9162-45e8-0e93-b59be74d13b/volumes/kubernetesio-cephfs/{dir name} mkdir /var/lib/kubelet/pods/5446c441-9162-45e8-de93-b59bte74d13b/volumes/kubernetes.io~cephfs/ip-ib file exists

The symptoms fall into roughly three characteristics:

- Pods that were running on the node before the failure keep running, but cannot write or read data
- Writing data fails with permission denied
- Reading data fails

The kubelet log errors are shown in the screenshot below.

Permanent fix

All Pods on the node that mount cephfs must be evicted; Pods scheduled to the node afterwards then start normally.

Failure analysis

When the network has problems, Pods that use cephfs fail in large numbers. The failures show up in the following ways:

- Newly deployed Pods stay in the Waiting state

Common ceph commands
https://www.161616.top/ch11-1-ceph-common-cmd/
Wed, 13 Sep 2023 00:00:00 +0000

Testing object upload/download

To access data, a client must first connect to a storage pool on the RADOS cluster; the relevant CRUSH rule then locates the data object by its object name. To test the cluster's storage functionality, first create a test pool named mypool with 16 PGs.

    ceph osd pool create mypool 16 16

Test files can then be uploaded to the pool. For example, the rados put command below uploads /etc/hosts

rados

    lspools    list pools
    rmpool     delete a pool
    mkpool     create a pool, e.g. rados mkpool mypool 32 32

    rados mkpool {name} {pgnum} {pgpnum}
    rados mkpool test 32 32

    $ ceph osd pool create testpool 32 32
    pool 'testpool' created

List pools

    $ ceph osd pool ls
    mypool
    rbdpool
    testpool
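    $ rados lspools
    mypool
    rbdpool
    testpool

Test files can then be uploaded to the pool. For example, rados put uploads the /etc/issue file to the testpool pool, keeping the file name issue as the object name, and rados ls lists the data objects in a given pool.

Pitfall: passing GET parameters through nginx proxy_pass
https://www.161616.top/nginx-proxy_pass/
Sat, 20 May 2023 00:00:00 +0000

Scenario

After configuring the proxy, none of the GET request variables came through. The configuration was:

    location /fw {
        proxy_pass http://127.0.0.1:2952;
    }

My requirement was that everything under /fw/ be sent to port 2952, but the actual result was a 404. The reason given: "when no URI is specified, versions after 1.12 pass the original URI". That leads to a 404 because my backend endpoints are themselves under /fw/xxx/, so the path ends up duplicated.

Next I passed the path through a variable:

    location ~* /fw/(?<section>.*) {
        proxy_pass http://127.0.0.1:2952/fw/$section;
    }

This introduces a new problem: the GET query parameters are not passed along.

Solution

The official nginx documentation gives an example and explains that in some cases nginx cannot determine which part of the request URI to replace:

- when the location is specified with a regular expression, or inside a named location

For example, in the following scenario proxy_pass ignores the original request URI and forwards the rewritten request:

    location /name/ {
        rewrite /name/([^/]+) /users?name=$1 break;
        proxy_pass http://127.0.0.1;
    }

This matches the problem I ran into, and the nginx documentation shows how to handle it: when variables are needed in proxy_pass, $request_uri can be used; alternatively, appending $is_args$args ensures the original query parameters are passed through.

Resolving the nginx error in docker [rewrite or internal redirection cycle while internally redirecting to "/index.html]
https://www.161616.top/ngx-in-docker-500/
Thu, 18 May 2023 00:00:00 +0000

A vue project ran fine when deployed on bare-metal Linux, but when deployed in docker, nginx produced the following error:

    Nginx "rewrite or internal redirection cycle while internally redirecting to "/index.html"

In the user interface this appears as 500 Internal Server Error.

Cause: the path in the nginx configuration was wrong; after correcting it, the site recovered.

Windows Terminal cannot load WSL [process exited with code 4294967295 (0xffffffff)]
https://www.161616.top/wsl-problem-with-windows-terminal/
Wed, 30 Mar 2022 00:00:00 +0000

WSL fails to open in Windows Terminal with the error code "process exited with code 4294967295 (0xffffffff)", yet launching it from the command line with "C:\Windows\System32\wsl.exe" -d ubuntu18 works.

The fix is to change the startup command to wsl.exe ~ -d Ubuntu. Adding the ~ in the middle resolves the problem cleanly.

This approach has one drawback: the WSL terminal opens in the home directory (~) instead of the current Windows directory.

Reference

Unable to launch WSL Ubuntu

Account locked due to 10 failed logins
https://www.161616.top/account-locked-due-to-10-failed-logins/
Tue, 19 Oct 2021 00:00:00 +0000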
Once in the boot entry editor, find the line beginning with linux16 and change ro to rw init=/sysroot/bin/sh.

Checking passwd and shadow showed that the user was not actually locked there, so the lock had to come from the pam settings.

    pam_tally2.so deny=6 onerr=fail unlock_time=120

The default log is /var/log/tallylog.

    chroot /sysroot
    # Unlock the account with the pam_tally2 command
    pam_tally2 --user=root --reset
    rw init=/sysroot/bin/sh

Reference

Centos7.x password recovery
Locking users with pam_tally2

An anomaly caused by innodb_large_prefix in mysql5.6
https://www.161616.top/mysql5.6-innodb_large_prefix-abnormal/
Sat, 02 Oct 2021 00:00:00 +0000

Symptom: Specified key was too long; max key length is 3072 bytes

While altering a database column, the column size was limited by the index prefix length rather than by the column's own capacity.

So what exactly is innodb_large_prefix? The DYNAMIC row format supports index prefixes of up to 3072 bytes, and this is controlled by the innodb_large_prefix variable.

    By default, the index key prefix length limit is 767 bytes. See Section 13.1.13, "CREATE INDEX Statement". For example, you might hit this limit with a column prefix index of more than 255 characters on a TEXT or VARCHAR column, assuming a utf8mb3 character set and the maximum of 3 bytes for each character.

A production incident caused by PIPE size
https://www.161616.top/pipe-size-problem/
Sat, 02 Oct 2021 00:00:00 +0000

Scenario: the Python code called subprocess.Popen(cmd, stdout=sys.stdout, stderr=sys.stderr, shell=True), so the Popen object's stdout and stderr attributes were None. When a command failed, the contents of stderr could not be captured. After changing the call to subprocess.Popen(cmd, stdout=PIPE, stderr=PIPE, shell=True), stderr became available, but a large number of tasks started hanging, which caused a production incident. That prompted a closer look at the documentation for some of subprocess's parameters.

stdin, stdout, stderr: if one of these arguments is PIPE, the corresponding attribute is a file handle; if anything else is passed (for example sys.stdout or None), the attribute is None, as the subprocess documentation describes.

Using PIPE, however, made the program hang. In general, waiting on a child with stdout=PIPE and stderr=PIPE is not recommended because it can deadlock: the child writes its output into the pipe and, once the OS pipe buffer is full, blocks until something reads from the buffer. The manual confirms that this is exactly the problem:

Reference

    Warning: This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.
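The warning quoted above points to communicate() as the way out. The following is a minimal sketch of that approach, not the code from the incident: the helper name run_and_capture, the timeout value, and the use of an argument list instead of shell=True are all assumptions made for illustration. communicate() drains both pipes while the child runs, so the child never blocks on a full pipe buffer.

```python
import subprocess
import sys


def run_and_capture(cmd, timeout=30):
    """Run cmd and return (returncode, stdout_bytes, stderr_bytes).

    Hypothetical helper: communicate() reads both pipes concurrently,
    which avoids the pipe-buffer deadlock described in the warning.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Stop a child that ran too long, then collect whatever it wrote.
        proc.kill()
        out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    # The child writes far more than a typical OS pipe buffer holds to stderr.
    child = [sys.executable, "-c", "import sys; sys.stderr.write('x' * 200000)"]
    code, out, err = run_and_capture(child)
    print(code, len(out), len(err))
```

Calling proc.wait() before reading the pipes in the same situation is the pattern the warning describes; the sketch only differs in letting communicate() do the reading and the waiting together.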
goland in mod mode does not look up dependencies in the vendor folder
https://www.161616.top/go-vendor-file-in-goland/
Sun, 13 Dec 2020 00:00:00 +0000

Using vendor as the dependency source in goland

Software versions:

    system: windows10 1709
    terminal: wsl ubuntu1804
    goland: 201903

When goland opens a project in mod mode, it cannot resolve the external package dependencies. According to goland's own hint, with this enabled the dependency descriptions in go.mod are ignored, so the corresponding dependencies cannot be found, although compilation still works.

As the figure below shows, External Libraries did not load the external libraries, which is why they are not recognized.

To get back to a working state, follow the hint: switch goland to gopath mode and run go mod vendor to sync the dependencies into vendor. Everything then works.

When dependencies change, add the new dependency manually and run go mod tidy. Because vendor does not yet contain the new dependency, run go mod vendor again by hand and things work as expected.

Building with vendor

At build time, the -mod=vendor flag makes the build resolve dependencies from the vendor directory under the main code directory: go build -mod=vendor. In that case go build ignores the dependencies in go.mod (only the vendor directory in the code root is used; any others are ignored).

GOFLAGS=-mod=vendor sets the top-level vendor directory as the dependency source. Set it with go env -w GOFLAGS="-mod=vendor" and unset it with go env -w GOFLAGS="-mod=".

zimbra installation failure notes
https://www.161616.top/zimbra-troubleshooing/
Fri, 02 Oct 2020 00:00:00 +0000

Startup failure: zimbra postsuper: fatal: scan_dir_push: open directory defer: Permission denied

    Host mail.domain.com
    Starting ldap...Done.
    Starting zmconfigd...Done.
    Starting dnscache...Done.
    Starting logger...Done.
    Starting mailbox...Done.
    Starting memcached...Done.
    Starting proxy...Done.
    Starting amavis...Done.
    Starting antispam...Done.
    Starting antivirus...Done.
    Starting opendkim...Done.
    Starting snmp...Done.
    Starting spell...Done.
    Starting mta...Failed.
    Starting saslauthd...done.
    postsuper: fatal: scan_dir_push: open directory defer: Permission denied
    postfix failed to start
    Starting stats.

Summary of Centos7 dbus problems
https://www.161616.top/centos7-dbus-troubleshooting/
Wed, 23 Sep 2020 00:00:00 +0000

Authorization not available. Check if polkit

    Authorization not available. Check if polkit service is running or see debug message for more information.
    dbus.socket failed to listen on sockets: Address family not supported by protocol
    Failed to listen on D-Bus System Message Bus Socket.

This happens because dbus.socket is in an abnormal state. Every service that depends on dbus connects to dbus through a system call at startup, so when dbus is unavailable, no service can be started or stopped normally through systemd. Check whether dbus.socket is healthy; for local use, make sure the unix socket listener is up.

Did not receive a reply

    Failed to open connection to "system" message bus: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

Q&A on using alpine as a base image
https://www.161616.top/alpine-trouble-q-and-a/
Sun, 20 Sep 2020 00:00:00 +0000

A go application's binary exists but cannot be executed

The binary is clearly present in the image, but running it reports not found, no such file, or standard_init_linux.go:211: exec user process caused "no such file or directory".

Online answers usually blame Windows line endings. Here the actual problem was that the binary had been compiled with dynamic linking.

Fix:

    CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build --ldflags "-extldflags -static"

Note: CGO_ENABLED=0 GOOS=linux GOARCH=amd64 and cgo_enabled=0 goos=linux goarch=amd64 are not the same thing; environment variable names are case-sensitive.

Error messages like the following all stem from the problem above:

    standard_init_linux.go:211: exec user process caused "no such file or directory"

    /tmp # ./envoy_end
    /bin/sh: ./envoy_end: not found

Switching to a mirror in China

    RUN sed -i 's@http://dl-cdn.alpinelinux.org/@https://mirrors.aliyun.com/@g' /etc/apk/repositories

Building a PHP image based on alpine

alpine package search https://pkgs.
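Handling failures when running the official envoy examples
https://www.161616.top/envoy-example-failed/
Sat, 12 Sep 2020 00:00:00 +0000

Handling package installation failures inside the image

Method 1: modify the Dockerfile and add the following.

ubuntu example

    RUN sed -i 's/archive.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list
    RUN sed -i 's/security.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list

alpine example

    RUN sed -i 's@http://dl-cdn.alpinelinux.org/@https://mirrors.aliyun.com/@g' /etc/apk/repositories

Method 2: use an http proxy; for ubuntu, see the notes on using a proxy from the command line.

Handling image download failures

Method 1: run ss on the docker host with LAN connections enabled. Every machine on the same LAN can then connect through this proxy directly.

Method 2: add an http proxy to the docker systemd service file.

With that in place, the envoy example runs successfully.

cannot bind '0.0.0.0:80': Permission denied

docker-compose file

    version: '3'
    services:
      envoy:
        image: envoyproxy/envoy-alpine:v1.15-latest
        volumes:
          - .

Pitfalls with tcp.validnode_checking
https://www.161616.top/oracle-tcp.validnode_checking/
Sun, 06 May 2018 00:00:00 +0000

To have Oracle check whether client IP addresses are allowed, the following parameters must be set in the sqlnet.ora file on the server:

    TCP.INVITED_NODES=(10.0.0.36,10.0.0.1,10.0.0.35)
    TCP.EXCLUDED_NODES=(10.0.0.2)

Starting the listener then produced the following error:

    $ lsnrctl status

    LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 12-MAR-2018 18:32:13

    Copyright (c) 1991, 2009, Oracle. All rights reserved.

    Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
    TNS-12541: TNS:no listener
    TNS-12560: TNS:protocol adapter error
    TNS-00511: No listener
    Linux Error: 111: Connection refused

The error output gave no detailed information. I started with listener.ora and tnsnames.ora but found nothing wrong in either file. Finally I checked sqlnet.ora and found that the TCP.INVITED_NODES parameter has the following constraints, which the official documentation does not spell out.

tcp.invited_nodes must meet these conditions for the listener to start successfully:

1. The parameter TCP.VALIDNODE_CHECKING must be set to YES to activate the feature.
2. The tcp.invited_nodes value must include the local address (127.0.0.1 / 10.0.0.36) or localhost, because lsnrctl reaches the listener through the local IP; if that address is blocked, lsnrctl cannot start or stop the listener.
3. IP ranges and wildcards cannot be used.
4. This method applies only to the tcp/ip protocol.
5. This method enforces the whitelist through the listener.
6. It applies to IP addresses, not to anything else (such as user names).
7. This configuration applies to version 9i and later. The pitfall here was hit on oracle11gr2.
8. After changing the configuration, the listener must be restarted for it to take effect.

TCP.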
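The first three conditions in the list above lend themselves to a quick pre-flight check before restarting the listener. The following is a rough sketch under stated assumptions, not an Oracle tool: the function names are hypothetical, each parameter is assumed to sit on a single NAME=(v1,v2,...) line, and the "no ranges or wildcards" rule is taken from the post as it applied to 11gR2.

```python
def parse_sqlnet(text):
    """Parse single-line NAME=VALUE entries into a dict with upper-cased names."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip().upper()] = value.strip()
    return params


def node_list(value):
    """Turn '(a, b, c)' into ['a', 'b', 'c']."""
    return [n.strip() for n in value.strip("() ").split(",") if n.strip()]


def check_invited_nodes(text, local_addrs=("127.0.0.1", "localhost")):
    """Return a list of problems found; an empty list means the checks passed."""
    params = parse_sqlnet(text)
    problems = []
    # Condition 1: the feature is only active when validnode checking is YES.
    if params.get("TCP.VALIDNODE_CHECKING", "").upper() != "YES":
        problems.append("TCP.VALIDNODE_CHECKING is not YES")
    nodes = node_list(params.get("TCP.INVITED_NODES", ""))
    # Condition 2: a local address must be invited so lsnrctl can reach the listener.
    if not any(n.lower() in local_addrs for n in nodes):
        problems.append("no local address in TCP.INVITED_NODES")
    # Condition 3: ranges and wildcards are rejected (per the post, on 11gR2).
    for n in nodes:
        if "*" in n or "/" in n:
            problems.append("wildcard or range not allowed: " + n)
    return problems


if __name__ == "__main__":
    sample = "TCP.INVITED_NODES=(10.0.0.36,10.0.0.1,10.0.0.35)\nTCP.EXCLUDED_NODES=(10.0.0.2)\n"
    for problem in check_invited_nodes(sample, local_addrs=("127.0.0.1", "localhost", "10.0.0.36")):
        print(problem)
```

Passing the server's own address through local_addrs, as in the usage lines, mirrors the post's setup where 10.0.0.36 is the local host.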
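Chinese display problems when connecting to an oracle database with the sqlplus client on windows
https://www.161616.top/sqlplus-windows/
Thu, 19 Apr 2018 00:00:00 +0000

Environment

Server: centos6.8
Server oracle version: oracle 11g R2 64-bit, character set ZHS32utf8.
Client: navicat 12x64 windows8.1x64

Problem analysis

When using sqlplus or navicat on a windows client, Chinese text in the database displays as "????".

This happens when Chinese text was entered from a client whose character set did not match the server's. Data entered that way cannot be displayed as Chinese even after the client character set is corrected.

Fix: exit sqlplus and set the corresponding NLS_LANG environment variable.

linux:

    export NLS_LANG="SIMPLIFIED CHINESE_CHINA.ZHS16GBK"

windows:

A further problem

At this point sqlplus in the system cmd prompt displays Chinese correctly, but navicat still shows ????.

Figure: querying through the sqlplus client from the cmd prompt

Figure: the sqlplus client opened from navicat with f6

The cause is that Navicat Premium ships with its own instant client, and that bundled client is the Basic Lite edition (Basic Lite: a slimmed-down version of Basic with only English error messages and support for Unicode, ASCII, and Western European character sets). It does not support Chinese character sets, and the oracle server in this article happens to use one, so the bundled version does not work.

The matching version has to be downloaded from the oracle website:

http://www.oracle.com/technetwork/database/database-technologies/instant-client/downloads/index.html

Extract the downloaded files over the files in navicat's instantclient directory.

Connecting to the oracle instance now shows the following message.

Even though the 64-bit version was downloaded, the message in the figure appears. This is because Navicat supports only the 32-bit client, so a 32-bit client has to be downloaded as well and placed in the instantclient directory.

After the replacement, connect to the instance. Querying with sqlplus through f6 shows that Chinese now displays correctly.

PHP installation error notes
https://www.161616.top/install-troubleshooting/
Sun, 02 Oct 2016 00:00:00 +0000

Compilation errors

Error: fpm and apxs2 specified at the same time

    You've configured multiple SAPIs to be build. You can build only one SAPI module and CLI binary at the same time

Cause: my configure arguments used both --enable-fpm and --with-apxs2, so the build failed. Removing either one of them lets the build succeed.

System is missing libtool

    make ***[libphp5.la] Error 1

Symptom: compiling PHP fails with make ***[libphp5.la] Error 1.
Cause: the system is missing libtool.
Fix: yum install libtool-ltdl-devel

Error during make

make: *** [sapi/cli/php] Error 1

Cause: ./configure did not pick up some environment variable values properly. The failure occurs while building "-o sapi/cli/php": the iconv library arguments needed for linking were not supplied.

Error output:

    libiconv.so.2: cannot open shared object file: No such file or directory
    mak /root/tools/php-7.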