I have a new test case for https://github.com/xdp-project/bpf-examples here:
https://github.com/tjcw/bpf-examples/tree/tjcw-integration-0.3/AF_XDP-filter
It is for filtering flows; the idea is to send the first packet of a
flow to userspace, have userspace determine (by looking at the
five-tuple of the packet) whether the flow is acceptable or not, and
set an entry in an eBPF map accordingly. Second and subsequent
packets of the flow are handled in the kernel by eBPF code.
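To make the intended behaviour concrete, here is a minimal sketch in plain C of the decision flow described above. It is not the code from the repository: the eBPF map is stood in for by a small in-memory table, and all names (flow_key, flow_lookup, userspace_policy, handle_packet) and the example policy are hypothetical. Ports are kept in host byte order for simplicity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum verdict { VERDICT_UNKNOWN = 0, VERDICT_ACCEPT, VERDICT_DROP };

/* Five-tuple used as the lookup key. The padding is explicit so that
 * a bytewise comparison of two keys is well defined once the caller
 * has zeroed the structure. */
struct flow_key {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t sport;
    uint16_t dport;
    uint8_t  proto;
    uint8_t  pad[3];
};
static_assert(sizeof(struct flow_key) == 16, "flow_key has implicit padding");

/* Stand-in for the eBPF map (assumption: a simple linear table). */
#define MAX_FLOWS 64
struct flow_entry { struct flow_key key; enum verdict verdict; };
static struct flow_entry flow_table[MAX_FLOWS];
static int flow_count;

/* Fast path: what the in-kernel code would do for a known flow. */
enum verdict flow_lookup(const struct flow_key *k)
{
    for (int i = 0; i < flow_count; i++)
        if (memcmp(&flow_table[i].key, k, sizeof(*k)) == 0)
            return flow_table[i].verdict;
    return VERDICT_UNKNOWN;
}

/* Hypothetical userspace policy: accept ICMP and TCP to port 22. */
enum verdict userspace_policy(const struct flow_key *k)
{
    if (k->proto == 1)
        return VERDICT_ACCEPT;
    if (k->proto == 6 && k->dport == 22)
        return VERDICT_ACCEPT;
    return VERDICT_DROP;
}

/* Returns 1 if the packet had to go to userspace (first packet of a
 * flow), 0 if it was handled by the fast path. The verdict is written
 * to *out in both cases. */
int handle_packet(const struct flow_key *k, enum verdict *out)
{
    enum verdict v = flow_lookup(k);
    if (v != VERDICT_UNKNOWN) {
        *out = v;
        return 0;
    }
    v = userspace_policy(k);
    if (flow_count < MAX_FLOWS) {
        flow_table[flow_count].key = *k;
        flow_table[flow_count].verdict = v;
        flow_count++;
    }
    *out = v;
    return 1;
}
```

Calling handle_packet() twice with the same zero-initialised key takes the slow path the first time and the fast path the second time, which is the split between userspace and in-kernel handling that the test case aims for.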
The test case works, except that the first packet (which is reinjected
to the kernel through a tun/tap interface) is then dropped by the
kernel as a 'martian'. The effect is that if you 'ping' a host running
this code, all packets are replied to except the first, and if you
try 'ssh' there is a short delay at the start while TCP on the
client times out and retransmits the SYN packet.
I am attaching the output of 'pwru' (cilium packet-where-are-you), captured while running:
tjcw@tjcw-Standard-PC-Q35-ICH9-2009:~$ ping -c 2 192.168.122.48
PING 192.168.122.48 (192.168.122.48) 56(84) bytes of data.
64 bytes from 192.168.122.48: icmp_seq=2 ttl=64 time=2.28 ms
--- 192.168.122.48 ping statistics ---
2 packets transmitted, 1 received, 50% packet loss, time 1028ms
rtt min/avg/max/mdev = 2.282/2.282/2.282/0.000 ms
tjcw@tjcw-Standard-PC-Q35-ICH9-2009:~$
Does anyone reading this know why the kernel is treating
the packet as a martian, and whether there is a way of overcoming this? I
am using Ubuntu 22.04, with uname -a showing
tjcw@tjcw-Standard-PC-Q35-ICH9-2009:~$ uname -a
Linux tjcw-Standard-PC-Q35-ICH9-2009 5.15.0-53-generic #59-Ubuntu SMP
Mon Oct 17 18:53:30 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
tjcw@tjcw-Standard-PC-Q35-ICH9-2009:~$
pwru output:
2022/11/25 11:37:02 Listening for events..
SKB CPU PROCESS FUNC
0xffff9478f5038300 1 [<empty>] pskb_expand_head
0xffff9478f5038300 1 [<empty>] skb_free_head
0xffff9478f5038300 1 [<empty>] bpf_prog_run_generic_xdp
0xffff9478f5038300 1 [<empty>] xdp_do_generic_redirect
0xffff9478f5038300 1 [<empty>] consume_skb
0xffff9478f5038300 1 [<empty>] skb_release_head_state
0xffff9478f5038300 1 [<empty>] skb_release_data
0xffff9478f5038300 1 [<empty>] skb_free_head
0xffff9478f5038300 1 [<empty>] kfree_skbmem
0xffff9478f5038300 1 [af_xdp_user] netif_receive_skb
0xffff9478f5038300 1 [af_xdp_user] skb_defer_rx_timestamp
0xffff9478f5038300 1 [af_xdp_user] __netif_receive_skb
0xffff9478f5038300 1 [af_xdp_user] __netif_receive_skb_one_core
0xffff9478f5038300 1 [af_xdp_user] ip_rcv
0xffff9478f5038300 1 [af_xdp_user] ip_rcv_core
0xffff9478f5038300 1 [af_xdp_user] sock_wfree
0xffff9478f5038300 1 [af_xdp_user] ip_route_input_noref
0xffff9478f5038300 1 [af_xdp_user] ip_route_input_rcu
0xffff9478f5038300 1 [af_xdp_user] ip_route_input_slow
0xffff9478f5038300 1 [af_xdp_user] fib_validate_source
0xffff9478f5038300 1 [af_xdp_user] __fib_validate_source
0xffff9478f5038300 1 [af_xdp_user] ip_handle_martian_source
0xffff9478f5038300 1 [af_xdp_user] kfree_skb_reason
0xffff9478f5038300 1 [af_xdp_user] skb_release_head_state
0xffff9478f5038300 1 [af_xdp_user] skb_release_data
0xffff9478f5038300 1 [af_xdp_user] skb_free_head
0xffff9478f5038300 1 [af_xdp_user] kfree_skbmem
0xffff9478c75d3000 1 [<empty>] pskb_expand_head
0xffff9478c75d3000 1 [<empty>] skb_free_head
0xffff9478c75d3000 1 [<empty>] bpf_prog_run_generic_xdp
0xffff9478c75d3000 1 [<empty>] ip_rcv
0xffff9478c75d3000 1 [<empty>] ip_rcv_core
0xffff9478c75d3000 1 [<empty>] skb_clone
0xffff9478c75d3000 1 [<empty>] consume_skb
0xffff9478f5038c00 1 [<empty>] ip_route_input_noref
0xffff9478f5038c00 1 [<empty>] ip_route_input_rcu
0xffff9478f5038c00 1 [<empty>] ip_route_input_slow
0xffff9478f5038c00 1 [<empty>] fib_validate_source
0xffff9478f5038c00 1 [<empty>] __fib_validate_source
0xffff9478f5038c00 1 [<empty>] ip_local_deliver
0xffff9478f5038c00 1 [<empty>] ip_local_deliver_finish
0xffff9478f5038c00 1 [<empty>] ip_protocol_deliver_rcu
0xffff9478f5038c00 1 [<empty>] raw_local_deliver
0xffff9478f5038c00 1 [<empty>] icmp_rcv
0xffff9478f5038c00 1 [<empty>] __skb_checksum_complete
0xffff9478f5038c00 1 [<empty>] icmp_echo
0xffff9478f5038c00 1 [<empty>] icmp_reply
0xffff9478f5038c00 1 [<empty>] __ip_options_echo
0xffff9478f5038c00 1 [<empty>] fib_compute_spec_dst
0xffff9478f5038c00 1 [<empty>] security_skb_classify_flow
0xffff9478f5038c00 1 [<empty>] consume_skb
0xffff9478f5038c00 1 [<empty>] skb_release_head_state
0xffff9478f5038c00 1 [<empty>] skb_release_data
0xffff9478f5038c00 1 [<empty>] kfree_skbmem
0xffff9478c75d3000 1 [<empty>] packet_rcv
0xffff9478c75d3000 1 [<empty>] consume_skb
0xffff9478c75d3000 1 [<empty>] skb_release_head_state
0xffff9478c75d3000 1 [<empty>] skb_release_data
0xffff9478c75d3000 1 [<empty>] skb_free_head
The first (dropped) packet is the section from the first
pskb_expand_head to the kfree_skbmem that follows
ip_handle_martian_source, and the second (passed) packet is the
section from the second pskb_expand_head to the end.