Issues running on GKE #39

Open
howardjohn opened this issue Jan 31, 2022 · 14 comments · Fixed by #40

Comments

@howardjohn

Initial install fails with

  Warning  FailedCreate  5s (x13 over 25s)  daemonset-controller  Error creating: insufficient quota to match these scopes: [{PriorityClass In [system-node-critical system-cluster-critical]}]
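
(This looks like GKE's restriction that the system-node-critical / system-cluster-critical priority classes can only be used in namespaces that have a matching ResourceQuota, which only kube-system has by default. An untested sketch of a quota that would allow scheduling in istio-system; the quota name and pod limit below are placeholders:)

kubectl apply -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: istio-critical-pods    # placeholder name
  namespace: istio-system
spec:
  hard:
    pods: "100"                # arbitrary cap, adjust as needed
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values:
      - system-node-critical
      - system-cluster-critical
EOF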

Moving to kube-system instead:

curl -L https://raw.githubusercontent.com/merbridge/merbridge/main/deploy/all-in-one.yaml -s | sed 's/istio-system/kube-system/g' | kubectl apply -f -

This gets it running, but with some errors:

[ -f bpf/mb_connect.c ] && make -C bpf load || make -C bpf load-from-obj
make[1]: Entering directory '/app/bpf'
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_connect.c -o mb_connect.o
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_get_sockopts.c -o mb_get_sockopts.o
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_redir.c -o mb_redir.o
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_sockops.c -o mb_sockops.o
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_bind.c -o mb_bind.o
[ -f /sys/fs/bpf/cookie_original_dst ] || sudo bpftool map create /sys/fs/bpf/cookie_original_dst type lru_hash key 4 value 12 entries 65535 name cookie_original_dst
[ -f /sys/fs/bpf/local_pod_ips ] || sudo bpftool map create /sys/fs/bpf/local_pod_ips type hash key 4 value 4 entries 1024 name local_pod_ips
[ -f /sys/fs/bpf/process_ip ] || sudo bpftool map create /sys/fs/bpf/process_ip type lru_hash key 4 value 4 entries 1024 name process_ip
sudo bpftool prog load mb_connect.o /sys/fs/bpf/connect \
	map name cookie_original_dst pinned /sys/fs/bpf/cookie_original_dst \
	map name local_pod_ips pinned /sys/fs/bpf/local_pod_ips \
	map name process_ip pinned /sys/fs/bpf/process_ip
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 692
str_off: 692
str_len: 1816
btf_total_size: 2532
[1] PTR (anon) type_id=2
[2] STRUCT bpf_sock_addr size=72 vlen=10
	user_family type_id=3 bits_offset=0
	user_ip4 type_id=3 bits_offset=32
	user_ip6 type_id=5 bits_offset=64
	user_port type_id=3 bits_offset=192
	family type_id=3 bits_offset=224
	type type_id=3 bits_offset=256
	protocol type_id=3 bits_offset=288
	msg_src_ip4 type_id=3 bits_offset=320
	msg_src_ip6 type_id=5 bits_offset=352
	(anon) type_id=7 bits_offset=512
[3] TYPEDEF __u32 type_id=4
[4] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
[5] ARRAY (anon) type_id=3 index_type_id=6 nr_elems=4
[6] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
[7] UNION (anon) size=8 vlen=1
	sk type_id=8 bits_offset=0
[8] PTR (anon) type_id=27
[9] FUNC_PROTO (anon) return=10 args=(1 ctx)
[10] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[11] FUNC mb_sock4_connect type_id=9 vlen != 0

libbpf: Error loading .BTF into kernel: -22.
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ connect4 pinned /sys/fs/bpf/connect
[ -f /sys/fs/bpf/pair_original_dst ] || sudo bpftool map create /sys/fs/bpf/pair_original_dst type lru_hash key 12 value 12 entries 65535 name pair_original_dst
[ -f /sys/fs/bpf/sock_pair_map ] || sudo bpftool map create /sys/fs/bpf/sock_pair_map type sockhash key 12 value 4 entries 65535 name sock_pair_map
sudo bpftool prog load mb_sockops.o /sys/fs/bpf/sockops \
	map name cookie_original_dst pinned /sys/fs/bpf/cookie_original_dst \
	map name process_ip pinned /sys/fs/bpf/process_ip \
	map name pair_original_dst pinned /sys/fs/bpf/pair_original_dst \
	map name sock_pair_map pinned /sys/fs/bpf/sock_pair_map
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 1040
str_off: 1040
str_len: 1675
btf_total_size: 2739
[1] PTR (anon) type_id=2
[2] STRUCT bpf_sock_ops size=192 vlen=36
	op type_id=3 bits_offset=0
	(anon) type_id=5 bits_offset=32
	family type_id=3 bits_offset=160
	remote_ip4 type_id=3 bits_offset=192
	local_ip4 type_id=3 bits_offset=224
	remote_ip6 type_id=6 bits_offset=256
	local_ip6 type_id=6 bits_offset=384
	remote_port type_id=3 bits_offset=512
	local_port type_id=3 bits_offset=544
	is_fullsock type_id=3 bits_offset=576
	snd_cwnd type_id=3 bits_offset=608
	srtt_us type_id=3 bits_offset=640
	bpf_sock_ops_cb_flags type_id=3 bits_offset=672
	state type_id=3 bits_offset=704
	rtt_min type_id=3 bits_offset=736
	snd_ssthresh type_id=3 bits_offset=768
	rcv_nxt type_id=3 bits_offset=800
	snd_nxt type_id=3 bits_offset=832
	snd_una type_id=3 bits_offset=864
	mss_cache type_id=3 bits_offset=896
	ecn_flags type_id=3 bits_offset=928
	rate_delivered type_id=3 bits_offset=960
	rate_interval_us type_id=3 bits_offset=992
	packets_out type_id=3 bits_offset=1024
	retrans_out type_id=3 bits_offset=1056
	total_retrans type_id=3 bits_offset=1088
	segs_in type_id=3 bits_offset=1120
	data_segs_in type_id=3 bits_offset=1152
	segs_out type_id=3 bits_offset=1184
	data_segs_out type_id=3 bits_offset=1216
	lost_out type_id=3 bits_offset=1248
	sacked_out type_id=3 bits_offset=1280
	sk_txhash type_id=3 bits_offset=1312
	bytes_received type_id=8 bits_offset=1344
	bytes_acked type_id=8 bits_offset=1408
	(anon) type_id=10 bits_offset=1472
[3] TYPEDEF __u32 type_id=4
[4] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
[5] UNION (anon) size=16 vlen=3
	args type_id=6 bits_offset=0
	reply type_id=3 bits_offset=0
	replylong type_id=6 bits_offset=0
[6] ARRAY (anon) type_id=3 index_type_id=7 nr_elems=4
[7] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
[8] TYPEDEF __u64 type_id=9
[9] INT long long unsigned int size=8 bits_offset=0 nr_bits=64 encoding=(none)
[10] UNION (anon) size=8 vlen=1
	sk type_id=11 bits_offset=0
[11] PTR (anon) type_id=28
[12] FUNC_PROTO (anon) return=13 args=(1 skops)
[13] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[14] FUNC mb_sockops type_id=12 vlen != 0

libbpf: Error loading .BTF into kernel: -22.
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ sock_ops pinned /sys/fs/bpf/sockops
sudo bpftool prog load mb_get_sockopts.o /sys/fs/bpf/get_sockopts \
	map name pair_original_dst pinned /sys/fs/bpf/pair_original_dst
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 664
str_off: 664
str_len: 1226
btf_total_size: 1914
[1] PTR (anon) type_id=2
[2] STRUCT bpf_sockopt size=40 vlen=7
	(anon) type_id=3 bits_offset=0
	(anon) type_id=5 bits_offset=64
	(anon) type_id=7 bits_offset=128
	level type_id=8 bits_offset=192
	optname type_id=8 bits_offset=224
	optlen type_id=8 bits_offset=256
	retval type_id=8 bits_offset=288
[3] UNION (anon) size=8 vlen=1
	sk type_id=4 bits_offset=0
[4] PTR (anon) type_id=28
[5] UNION (anon) size=8 vlen=1
	optval type_id=6 bits_offset=0
[6] PTR (anon) type_id=0
[7] UNION (anon) size=8 vlen=1
	optval_end type_id=6 bits_offset=0
[8] TYPEDEF __s32 type_id=9
[9] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[10] FUNC_PROTO (anon) return=9 args=(1 ctx)
[11] FUNC mb_get_sockopt type_id=10 vlen != 0

libbpf: Error loading .BTF into kernel: -22.
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ getsockopt pinned /sys/fs/bpf/get_sockopts
sudo bpftool prog load mb_redir.o /sys/fs/bpf/redir \
	map name sock_pair_map pinned /sys/fs/bpf/sock_pair_map
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 664
str_off: 664
str_len: 608
btf_total_size: 1296
[1] PTR (anon) type_id=2
[2] STRUCT sk_msg_md size=72 vlen=10
	(anon) type_id=3 bits_offset=0
	(anon) type_id=5 bits_offset=64
	family type_id=6 bits_offset=128
	remote_ip4 type_id=6 bits_offset=160
	local_ip4 type_id=6 bits_offset=192
	remote_ip6 type_id=8 bits_offset=224
	local_ip6 type_id=8 bits_offset=352
	remote_port type_id=6 bits_offset=480
	local_port type_id=6 bits_offset=512
	size type_id=6 bits_offset=544
[3] UNION (anon) size=8 vlen=1
	data type_id=4 bits_offset=0
[4] PTR (anon) type_id=0
[5] UNION (anon) size=8 vlen=1
	data_end type_id=4 bits_offset=0
[6] TYPEDEF __u32 type_id=7
[7] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
[8] ARRAY (anon) type_id=6 index_type_id=9 nr_elems=4
[9] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
[10] FUNC_PROTO (anon) return=11 args=(1 msg)
[11] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[12] FUNC mb_msg_redir type_id=10 vlen != 0

libbpf: Error loading .BTF into kernel: -22.
sudo bpftool prog attach pinned /sys/fs/bpf/redir msg_verdict pinned /sys/fs/bpf/sock_pair_map
sudo bpftool prog load mb_bind.o /sys/fs/bpf/bind
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 428
str_off: 428
str_len: 255
btf_total_size: 707
[1] PTR (anon) type_id=2
[2] STRUCT bpf_sock_addr size=72 vlen=10
	user_family type_id=3 bits_offset=0
	user_ip4 type_id=3 bits_offset=32
	user_ip6 type_id=5 bits_offset=64
	user_port type_id=3 bits_offset=192
	family type_id=3 bits_offset=224
	type type_id=3 bits_offset=256
	protocol type_id=3 bits_offset=288
	msg_src_ip4 type_id=3 bits_offset=320
	msg_src_ip6 type_id=5 bits_offset=352
	(anon) type_id=7 bits_offset=512
[3] TYPEDEF __u32 type_id=4
[4] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
[5] ARRAY (anon) type_id=3 index_type_id=6 nr_elems=4
[6] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
[7] UNION (anon) size=8 vlen=1
	sk type_id=8 bits_offset=0
[8] PTR (anon) type_id=18
[9] FUNC_PROTO (anon) return=10 args=(1 ctx)
[10] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[11] FUNC mb_bind type_id=9 vlen != 0

libbpf: Error loading .BTF into kernel: -22.
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ bind4 pinned /sys/fs/bpf/bind
make[1]: Leaving directory '/app/bpf'
time="2022-01-31T22:28:17Z" level=info msg="pod watcher ready" func="main.main()" file="main.go:118"
W0131 22:28:47.143006       1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
I0131 22:28:47.143359       1 trace.go:205] Trace[1298498081]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167 (31-Jan-2022 22:28:17.142) (total time: 30000ms):
Trace[1298498081]: ---"Objects listed" error:Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout 30000ms (22:28:47.142)
Trace[1298498081]: [30.000970146s] [30.000970146s] END
E0131 22:28:47.143434       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
W0131 22:29:18.699484       1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
I0131 22:29:18.699584       1 trace.go:205] Trace[1427131847]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167 (31-Jan-2022 22:28:48.696) (total time: 30003ms):
Trace[1427131847]: ---"Objects listed" error:Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout 30003ms (22:29:18.699)
Trace[1427131847]: [30.003509416s] [30.003509416s] END
E0131 22:29:18.699615       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
W0131 22:29:51.001196       1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
I0131 22:29:51.001270       1 trace.go:205] Trace[911902081]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167 (31-Jan-2022 22:29:21.000) (total time: 30000ms):
Trace[911902081]: ---"Objects listed" error:Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout 30000ms (22:29:51.001)
Trace[911902081]: [30.000628622s] [30.000628622s] END
E0131 22:29:51.001303       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
W0131 22:30:26.399845       1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
I0131 22:30:26.399923       1 trace.go:205] Trace[140954425]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167 (31-Jan-2022 22:29:56.399) (total time: 30000ms):
Trace[140954425]: ---"Objects listed" error:Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout 30000ms (22:30:26.399)
Trace[140954425]: [30.000565552s] [30.000565552s] END
E0131 22:30:26.399938       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
W0131 22:31:03.807817       1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
I0131 22:31:03.807969       1 trace.go:205] Trace[208240456]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167 (31-Jan-2022 22:30:33.804) (total time: 30003ms):
Trace[208240456]: ---"Objects listed" error:Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout 30003ms (22:31:03.807)
Trace[208240456]: [30.003475488s] [30.003475488s] END
E0131 22:31:03.807989       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout
@howardjohn (Author)

$ uname -r
5.4.144+

This one is probably partially my bad :-). But there will still be issues, since it cannot even schedule the pod in its current state.
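
(For reference, a hedged way to check the node kernel version and whether it exposes BTF type information at all; this is informational only and assumes a reasonably standard node image:)

uname -r
ls /sys/kernel/btf/vmlinux   # present only when the kernel is built with CONFIG_DEBUG_INFO_BTF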

@howardjohn (Author)

OK, on a newer GKE version, now I just get:

[ -f bpf/mb_connect.c ] && make -C bpf load || make -C bpf load-from-obj
make[1]: Entering directory '/app/bpf'
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_connect.c -o mb_connect.o
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_get_sockopts.c -o mb_get_sockopts.o
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_redir.c -o mb_redir.o
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_sockops.c -o mb_sockops.o
clang -O2 -g  -Wall -target bpf -I/usr/include/x86_64-linux-gnu  -DMESH=1 -c mb_bind.c -o mb_bind.o
[ -f /sys/fs/bpf/cookie_original_dst ] || sudo bpftool map create /sys/fs/bpf/cookie_original_dst type lru_hash key 4 value 12 entries 65535 name cookie_original_dst
[ -f /sys/fs/bpf/local_pod_ips ] || sudo bpftool map create /sys/fs/bpf/local_pod_ips type hash key 4 value 4 entries 1024 name local_pod_ips
[ -f /sys/fs/bpf/process_ip ] || sudo bpftool map create /sys/fs/bpf/process_ip type lru_hash key 4 value 4 entries 1024 name process_ip
sudo bpftool prog load mb_connect.o /sys/fs/bpf/connect \
        map name cookie_original_dst pinned /sys/fs/bpf/cookie_original_dst \
        map name local_pod_ips pinned /sys/fs/bpf/local_pod_ips \
        map name process_ip pinned /sys/fs/bpf/process_ip
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ connect4 pinned /sys/fs/bpf/connect
[ -f /sys/fs/bpf/pair_original_dst ] || sudo bpftool map create /sys/fs/bpf/pair_original_dst type lru_hash key 12 value 12 entries 65535 name pair_original_dst
[ -f /sys/fs/bpf/sock_pair_map ] || sudo bpftool map create /sys/fs/bpf/sock_pair_map type sockhash key 12 value 4 entries 65535 name sock_pair_map
sudo bpftool prog load mb_sockops.o /sys/fs/bpf/sockops \
        map name cookie_original_dst pinned /sys/fs/bpf/cookie_original_dst \
        map name process_ip pinned /sys/fs/bpf/process_ip \
        map name pair_original_dst pinned /sys/fs/bpf/pair_original_dst \
        map name sock_pair_map pinned /sys/fs/bpf/sock_pair_map
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ sock_ops pinned /sys/fs/bpf/sockops
sudo bpftool prog load mb_get_sockopts.o /sys/fs/bpf/get_sockopts \
        map name pair_original_dst pinned /sys/fs/bpf/pair_original_dst
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ getsockopt pinned /sys/fs/bpf/get_sockopts
sudo bpftool prog load mb_redir.o /sys/fs/bpf/redir \
        map name sock_pair_map pinned /sys/fs/bpf/sock_pair_map
sudo bpftool prog attach pinned /sys/fs/bpf/redir msg_verdict pinned /sys/fs/bpf/sock_pair_map
sudo bpftool prog load mb_bind.o /sys/fs/bpf/bind
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ bind4 pinned /sys/fs/bpf/bind
make[1]: Leaving directory '/app/bpf'
time="2022-01-31T22:43:12Z" level=info msg="pod watcher ready" func="main.main()" file="main.go:118"
E0131 22:43:12.833151       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: unknown (get pods)
E0131 22:43:14.417228       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: unknown (get pods)
E0131 22:43:16.749130       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: unknown (get pods)
E0131 22:43:22.186900       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: unknown (get pods)
E0131 22:43:29.619632       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: unknown (get pods)

so the BPF issues are resolved at least.

BTW - how can I tell it's working?

@howardjohn (Author)

Seems the ClusterRole needs watch permission on pods.
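
A hedged sketch of the missing RBAC rule (the ClusterRole name here is assumed and should match the one created by all-in-one.yaml; applying this replaces the whole role, so merge it with any existing rules):

kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: merbridge              # assumed name, match the role from all-in-one.yaml
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
EOF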

@kebe7jun (Member)

kebe7jun commented Feb 1, 2022

BTW - how can I tell it's working?

The following ways of validating it are possible:

  1. Run iptables -t nat -vnL in the Pod's netns and observe the packet counters on the ISTIO_REDIRECT chain; with Merbridge in use they should not increase with the number of requests.
  2. When starting Merbridge, add the -d parameter (alongside the existing -m istio args), then view the log by running cat /sys/kernel/debug/tracing/trace_pipe. (A sketch of running both checks from a node follows below.)
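
A hedged sketch of running both checks from a node (the PID lookup is a placeholder; use whatever container runtime inspection is available, e.g. crictl inspect):

POD_PID=<pid of any container in the injected pod>   # placeholder

# 1. Dump the nat table counters inside the pod's network namespace.
sudo nsenter -t "$POD_PID" -n iptables -t nat -vnL ISTIO_REDIRECT

# 2. With merbridge started with -d, watch the eBPF debug output.
sudo cat /sys/kernel/debug/tracing/trace_pipe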

Also, the e2e test example may be helpful: https://github.com/merbridge/merbridge/blob/main/.github/workflows/e2e.yaml

@howardjohn (Author)

Also getting Get "https://10.40.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.40.0.1:443: i/o timeout 30003ms (17:27:01.366). I think that with host network, 10.40.0.1 (the kubernetes cluster IP) cannot be used. Instead we need to hit the external IP of the API server.

On Cilium I see something like k8s-api-server: https://34.xx.xx.xx:443 in its ConfigMap.
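
A hedged workaround sketch while hostNetwork is still enabled: client-go's in-cluster config honours the KUBERNETES_SERVICE_HOST / KUBERNETES_SERVICE_PORT environment variables, so pointing them at the external endpoint should avoid the cluster IP (the DaemonSet name, namespace, and address below are placeholders):

kubectl -n kube-system set env daemonset/merbridge \
  KUBERNETES_SERVICE_HOST=34.xx.xx.xx \
  KUBERNETES_SERVICE_PORT=443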

@kebe7jun (Member)

kebe7jun commented Feb 7, 2022

It seems to me that Merbridge just emulates the behavior of iptables by forwarding traffic to Envoy and does nothing else, so I am not sure why this is happening.
How can I reproduce this issue?

@howardjohn (Author)

Sorry, that problem is in the merbridge pod itself; it has nothing to do with iptables/eBPF at all. The problem is that the DaemonSet runs on the host network and tries to reach a cluster IP Service, which you cannot do from a hostNetwork pod.

@kebe7jun (Member)

kebe7jun commented Feb 8, 2022

Sorry, that problem is in the merbridge pod itself; it has nothing to do with iptables/eBPF at all. The problem is that the DaemonSet runs on the host network and tries to reach a cluster IP Service, which you cannot do from a hostNetwork pod.

I have disabled hostNetwork mode: #45

@howardjohn (Author)

Thanks! With the latest version:

            curl-836505  [001] d... 670508.770018: bpf_trace_printk: call from user container: ip: 0x30d200a, port: 80
            curl-836505  [001] d... 670508.770171: bpf_trace_printk: redirect 68 bytes with eBPF successfully

But I do see the iptables packet counters increase...

Chain PREROUTING (policy ACCEPT 44 packets, 2640 bytes)
 pkts bytes target     prot opt in     out     source               destination
   44  2640 ISTIO_INBOUND  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain INPUT (policy ACCEPT 44 packets, 2640 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 85 packets, 6069 bytes)
 pkts bytes target     prot opt in     out     source               destination
   39  2340 ISTIO_OUTPUT  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain POSTROUTING (policy ACCEPT 85 packets, 6069 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain ISTIO_INBOUND (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:15008
    0     0 RETURN     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:15090
   44  2640 RETURN     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:15021
    0     0 RETURN     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:15020
    0     0 ISTIO_IN_REDIRECT  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain ISTIO_IN_REDIRECT (3 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 REDIRECT   tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            redir ports 15006

Chain ISTIO_OUTPUT (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  *      lo      127.0.0.6            0.0.0.0/0
    0     0 ISTIO_IN_REDIRECT  all  --  *      lo      0.0.0.0/0           !127.0.0.1            owner UID match 1337
   29  1740 RETURN     all  --  *      lo      0.0.0.0/0            0.0.0.0/0            ! owner UID match 1337
   10   600 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            owner UID match 1337
    0     0 ISTIO_IN_REDIRECT  all  --  *      lo      0.0.0.0/0           !127.0.0.1            owner GID match 1337
    0     0 RETURN     all  --  *      lo      0.0.0.0/0            0.0.0.0/0            ! owner GID match 1337
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            owner GID match 1337
    0     0 RETURN     all  --  *      *       0.0.0.0/0            127.0.0.1
    0     0 ISTIO_REDIRECT  all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain ISTIO_REDIRECT (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 REDIRECT   tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            redir ports 15001

@kebe7jun (Member)

kebe7jun commented Feb 9, 2022

After processing with eBPF, it does not bypass iptables; the traffic will still go through iptables, and this is normal.
Merbridge works in compatibility mode when the iptables rules are present.
You can try disabling iptables (don't run the istio-init container), or use iptables -t nat -F to clear out all the rules and make the request again, and it will work fine.

Another way to verify this is to use curl with the -v parameter and watch the destination address of the request; if it is 127.128.x.x, then eBPF is also working properly.
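
A hedged sketch of both checks (the pod PID, workload, and service names are placeholders):

# Flush the sidecar's nat rules inside the application pod's network namespace
# (roughly equivalent to not running the istio-init container), then retry the request.
sudo nsenter -t "$POD_PID" -n iptables -t nat -F

# From the client pod, check which address curl actually connects to; with the eBPF
# redirect in effect, the "Connected to" line should show a 127.128.x.x address.
kubectl exec <client-pod> -c <app-container> -- curl -sv http://<service>:<port>/ 2>&1 | grep Connected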


@howardjohn (Author)

howardjohn commented Feb 9, 2022 via email

@linsun

linsun commented Mar 28, 2022

+1, it would be nice to have a mode to disable iptables. This was the biggest puzzle for me when following the Merbridge blog, since there is no modification to Istio. Only after reading the comments here did I confirm that the iptables counters still grow with Merbridge unless we disable iptables in the init container.

Curious whether disabling iptables would help with your performance numbers too @hanxiaop @kebe7jun

@Xunzhuo (Member)

Xunzhuo commented Mar 28, 2022

Hi @linsun, thanks for commenting, we are working in this direction as well :)

@kebe7jun (Member)

We have completed the development of CNI mode, but it is still in the beta stage, so try it if you need it: https://merbridge.io/blog/2022/05/18/cni-mode/
