
I am trying to use the Kubernetes 1.7.12 fluentd-elasticsearch addon: https://github.com/kubernetes/kubernetes/tree/v1.7.12/cluster/addons/fluentd-elasticsearch

ElasticSearch starts up and can respond with:

{
 "name" : "0322714ad5b7",
 "cluster_name" : "kubernetes-logging",
 "cluster_uuid" : "_na_",
 "version" : {
   "number" : "2.4.1",
   "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
   "build_timestamp" : "2016-09-27T18:57:55Z",
   "build_snapshot" : false,
   "lucene_version" : "5.5.2"
 },
 "tagline" : "You Know, for Search"
}

But Kibana is still unable to connect to it. The connection error starts out with:

{"type":"log","@timestamp":"2018-01-23T07:42:06Z","tags":["warning","elasticsearch"],"pid":6,"message":"Unable to revive connection: http://elasticsearch-logging:9200/"}
{"type":"log","@timestamp":"2018-01-23T07:42:06Z","tags":["warning","elasticsearch"],"pid":6,"message":"No living connections"}

And after ElasticSearch is up, the error changes to:

{"type":"log","@timestamp":"2018-01-23T07:42:08Z","tags":["status","plugin:[email protected]","error"],"pid":6,"state":"red","message":"Status changed from red to red - Service Unavailable","prevState":"red","prevMsg":"Unable to connect to Elasticsearch at http://elasticsearch-logging:9200."}

So it seems as though Kibana eventually gets a response from ElasticSearch, but still cannot establish a connection.

This is what the Kibana dashboard looks like: (screenshot omitted)

I tried to get the logs to output more information, but I don't know Kibana and ElasticSearch well enough to figure out what to try next.

I am able to reproduce the error locally using this docker-compose.yml:

version: '2'
services:
  elasticsearch-logging:
    image: gcr.io/google_containers/elasticsearch:v2.4.1-2
    ports:
      - "9200:9200"
      - "9300:9300"

  kibana-logging:
    image: gcr.io/google_containers/kibana:v4.6.1-1
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch-logging
    environment:
      - ELASTICSEARCH_URL=http://elasticsearch-logging:9200
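
To see the failure, I bring the stack up and curl from inside the kibana container (this assumes curl is available in the image, which it was in my case):

    docker-compose up -d
    docker-compose exec kibana-logging curl http://elasticsearch-logging:9200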

Based on this question (Kibana on Docker cannot connect to Elasticsearch) and this blog post (https://gunith.github.io/docker-kibana-elasticsearch/), it doesn't look like there should be much involved.

But I can't figure out what I'm missing.

Any ideas what else I might be able to try?

Thank you for your time. :)

Update 1:

curling http://elasticsearch-logging on the Kubernetes cluster resulted in the same output:

{
  "name" : "elasticsearch-logging-v1-68km4",
  "cluster_name" : "kubernetes-logging",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "2.4.1",
    "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
    "build_timestamp" : "2016-09-27T18:57:55Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}

curling http://elasticsearch-logging/_cat/indices?pretty on the Kubernetes cluster timed out because of a proxy rule. Using the docker-compose.yml and curling locally (e.g. curl localhost:9200/_cat/indices?pretty) results in:

{
  "error" : {
    "root_cause" : [ {
      "type" : "master_not_discovered_exception",
      "reason" : null
    } ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

The docker-compose logs show:

[2018-01-23 17:04:39,110][DEBUG][action.admin.cluster.state] [ac1f2a13a637] no known master node, scheduling a retry

[2018-01-23 17:05:09,112][DEBUG][action.admin.cluster.state] [ac1f2a13a637] timed out while retrying [cluster:monitor/state] after failure (timeout [30s])
[2018-01-23 17:05:09,116][WARN ][rest.suppressed          ] path: /_cat/indices, params: {pretty=}
MasterNotDiscoveredException[null]
     at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:234)
     at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:236)
     at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:804)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)
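
A quick way to confirm the node simply has no elected master is the standard ES 2.x cluster health endpoint (nothing image-specific here):

    curl localhost:9200/_cluster/health?pretty

While no master is elected, I would expect this to come back as a 503 with the same master_not_discovered_exception.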

Update 2: Running kubectl --namespace kube-system logs -c kubedns po/kube-dns-667321983-dt5lz --tail 50 --follow yields:

I0124 16:43:33.591112       5 dns.go:264] New service: kibana-logging
I0124 16:43:33.591225       5 dns.go:264] New service: nginx
I0124 16:43:33.591251       5 dns.go:264] New service: registry
I0124 16:43:33.591274       5 dns.go:264] New service: sudoe
I0124 16:43:33.591295       5 dns.go:264] New service: default-http-backend
I0124 16:43:33.591317       5 dns.go:264] New service: kube-dns
I0124 16:43:33.591344       5 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591369       5 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591390       5 dns.go:264] New service: kubernetes
I0124 16:43:33.591409       5 dns.go:462] Added SRV record &{Host:kubernetes.default.svc.cluster.local. Port:443 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591429       5 dns.go:264] New service: elasticsearch-logging
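
To double-check resolution from inside a pod (the pod name below is a placeholder, and nslookup assumes the image ships the usual busybox tools):

    kubectl --namespace kube-system exec <kibana-pod> -- nslookup elasticsearch-logging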

Update 3:

I'm still trying to get everything to work, but with the help of others I am fairly confident it is an RBAC issue. It looks like the elasticsearch nodes were not able to connect to a master (which I never knew was even needed) because the discovery process lacked the necessary permissions (see the RBAC sketch after the logs below).

Here are some outputs that helped narrow it down, in case they help others starting out:

with RBAC on:

# kubectl --kubeconfig kubeconfig.yaml --namespace kube-system logs po/elasticsearch-logging-v1-wkwcs
F0119 00:18:44.285773       9 elasticsearch_logging_discovery.go:60] kube-system namespace doesn't exist: User "system:serviceaccount:kube-system:default" cannot get namespaces in the namespace "kube-system". (get namespaces kube-system)
goroutine 1 [running]:
k8s.io/kubernetes/vendor/github.com/golang/glog.stacks(0x1f7f600, 0xc400000000, 0xee, 0x1b2)
        vendor/github.com/golang/glog/glog.go:766 +0xa5
k8s.io/kubernetes/vendor/github.com/golang/glog.(*loggingT).output(0x1f5f5c0, 0xc400000003, 0xc42006c300, 0x1ef20c8, 0x22, 0x3c, 0x0)
        vendor/github.com/golang/glog/glog.go:717 +0x337
k8s.io/kubernetes/vendor/github.com/golang/glog.(*loggingT).printf(0x1f5f5c0, 0xc400000003, 0x16949d6, 0x1e, 0xc420579ee8, 0x2, 0x2)
        vendor/github.com/golang/glog/glog.go:655 +0x14c
k8s.io/kubernetes/vendor/github.com/golang/glog.Fatalf(0x16949d6, 0x1e, 0xc420579ee8, 0x2, 0x2)
        vendor/github.com/golang/glog/glog.go:1145 +0x67
main.main()
        cluster/addons/fluentd-elasticsearch/es-image/elasticsearch_logging_discovery.go:60 +0xb53
[2018-01-19 00:18:45,273][INFO ][node                     ] [elasticsearch-logging-v1-wkwcs] version[2.4.1], pid[5], build[c67dc32/2016-09-27T18:57:55Z]
[2018-01-19 00:18:45,275][INFO ][node                     ] [elasticsearch-logging-v1-wkwcs] initializing ...
# kubectl --kubeconfig kubeconfig.yaml --namespace kube-system exec kibana-logging-2104905774-69wgv curl elasticsearch-logging.kube-system:9200/_cat/indices?pretty

{
  "error" : {
    "root_cause" : [ {
      "type" : "master_not_discovered_exception",
      "reason" : null
    } ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

With RBAC off:

#  kubectl --kubeconfig kubeconfig.yaml --namespace kube-system log elasticsearch-logging-v1-7shgk
[2018-01-26 01:19:52,294][INFO ][node                     ] [elasticsearch-logging-v1-7shgk] version[2.4.1], pid[5], build[c67dc32/2016-09-27T18:57:55Z]
[2018-01-26 01:19:52,294][INFO ][node                     ] [elasticsearch-logging-v1-7shgk] initializing ...
[2018-01-26 01:19:53,077][INFO ][plugins                  ] [elasticsearch-logging-v1-7shgk] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
#  kubectl --kubeconfig kubeconfig.yaml --namespace kube-system exec elasticsearch-logging-v1-7shgk curl http://elasticsearch-logging:9200/_cat/indices?pretty
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    40  100    40    0     0      2      0  0:00:20  0:00:15  0:00:05    10
green open .kibana 1 1 1 0 6.2kb 3.1kb 
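
For anyone keeping RBAC on, here is a minimal sketch of the kind of grant the discovery binary appears to need, modeled on the "cannot get namespaces" error above (the names and resource list are my assumption; newer addon versions ship their own ServiceAccount and bindings):

    # Sketch only: grants the elasticsearch pods read access to the
    # objects the discovery binary queries. Names are illustrative.
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: elasticsearch-logging
      namespace: kube-system
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: elasticsearch-logging
    rules:
    - apiGroups: [""]
      resources: ["services", "namespaces", "endpoints"]
      verbs: ["get"]
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: elasticsearch-logging
    subjects:
    - kind: ServiceAccount
      name: elasticsearch-logging
      namespace: kube-system
    roleRef:
      kind: ClusterRole
      name: elasticsearch-logging
      apiGroup: rbac.authorization.k8s.io

The elasticsearch controller's pod spec would then need serviceAccountName: elasticsearch-logging instead of running as the default service account.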

Thanks everyone for your help :)

  • Are elasticsearch and kibana deployed in the same namespace? Could you access the kibana container via a command line and launch some debugging commands?
    – whites11
    Commented Jan 23, 2018 at 8:11
  • @whites11, yes they are deployed to the same namespace, kube-system. I can do something like kubectl exec -it po/podname. Is that what you mean? What kind of debugging commands can I run?
    – Zhao Li
    Commented Jan 23, 2018 at 8:17
  • yeah that's what I mean. Try running curl http://elasticsearch-logging:9200 from the kibana pod
    – whites11
    Commented Jan 23, 2018 at 8:18
  • I'll try it on the kubernetes cluster tomorrow, but when I run it in the container using the docker-compose.yml, I get this: { "name" : "0322714ad5b7", "cluster_name" : "kubernetes-logging", "cluster_uuid" : "_na_", "version" : { "number" : "2.4.1", "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16", "build_timestamp" : "2016-09-27T18:57:55Z", "build_snapshot" : false, "lucene_version" : "5.5.2" }, "tagline" : "You Know, for Search" }
    – Zhao Li
    Commented Jan 23, 2018 at 8:19
  • Ok and what does curl http://elasticsearch-logging:9200/_cat/indices?pretty say?
    – whites11
    Commented Jan 23, 2018 at 8:21

2 Answers


A few troubleshooting tips:

1) Ensure ElasticSearch is running fine.

Enter the container running elasticsearch and run:

curl localhost:9200

You should get a JSON response with some data about elasticsearch.

2) Ensure ElasticSearch is reachable from the kibana container.

Enter the kibana container and run:

curl <elasticsearch_service_name>:9200

You should get the same output as above.

3) Ensure your ES indices are fine.

Run the following command from the elasticsearch container:

curl localhost:9200/_cat/indices?pretty

You should get a table with all the indices in your ES cluster and their status (which should be green, or yellow if you only have one ES replica).

If one of the above points fails, check the logs of your ES container for any error messages and try to solve them.
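
Since this runs on Kubernetes, the same three checks can be done without opening a shell in the containers (pod names are placeholders, taken from kubectl get pods -n kube-system; this assumes curl exists in the images, as the question's own output shows it does):

    kubectl -n kube-system exec <es-pod> -- curl -s localhost:9200
    kubectl -n kube-system exec <kibana-pod> -- curl -s http://elasticsearch-logging:9200
    kubectl -n kube-system exec <es-pod> -- curl -s localhost:9200/_cat/indices?pretty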


This exception indicates one of two misconfigurations:

1. The Kubernetes DNS addon is not working properly. Check your DNS addon logs.
2. Pod-to-pod communication is not working properly. This is related to your underlying SDN/CNI addon (e.g. Flannel or Calico).

You can check by pinging one pod from another. If that does not work, check your networking configuration, especially the kube-proxy component.
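
Some concrete checks along those lines (the label selector and pod names are illustrative, and ping assumes the image ships it):

    # 1. DNS addon logs
    kubectl --namespace kube-system logs -l k8s-app=kube-dns -c kubedns --tail 50
    # 2. pod-to-pod connectivity
    kubectl --namespace kube-system exec <kibana-pod> -- ping -c 3 <elasticsearch-pod-ip>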

  • Thanks for the tips. Sorry, I'm new to kubernetes; do you know what commands I can run to check those things? I'll do some googling as well, but finding those commands on my own would be slower.
    – Zhao Li
    Commented Jan 24, 2018 at 14:59
  • I updated the question with the dns logs. I don't quite know what I'm looking for though. I did some googling on CNI (flannel, calico), but was not able to figure out how to access those logs. They don't seem to operate the same way as the fluentd-elasticsearch addon and the dns addon. Thanks again for your time.
    – Zhao Li
    Commented Jan 24, 2018 at 16:58
  • Everything seems to work fine, especially DNS and networking. Do you have a chance to disable RBAC and deploy it like the link below? github.com/pires/kubernetes-elasticsearch-cluster. That setup does not use RBAC, so we can tell whether the problem is related to RBAC or not. If it works, then we can focus on RBAC. Is there an option to connect to your computer? Commented Jan 26, 2018 at 10:54
  • thank you for the link and the offer to help troubleshoot the RBAC issue further. We are currently trying to use the later version of the addon configurations (1.9.2, github.com/kubernetes/kubernetes/tree/v1.9.2/cluster/addons/…). Hopefully the later configurations will have better support for RBAC. Thank you again for your help.
    – Zhao Li
    Commented Jan 26, 2018 at 17:00
