Wiki source code of IPv6

Version 1.2 by Sebastian Marsching on 2022/05/29 13:29

{{toc/}}

# IPv6 with 6to4 setup on Debian / Ubuntu

If you have a static IPv4 address, the easiest way to get IPv6 working is the 6to4 protocol. 6to4 automatically assigns a /48 IPv6 prefix to each global IPv4 address. As the minimal recommended size for an IPv6 subnet is /64, you get 2^16 (65,536) subnets that you can use behind one IPv4 address. This makes 6to4 an excellent choice if you have a network behind a NAT gateway with only one static IPv4 address.
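
The mapping from IPv4 address to 6to4 prefix is mechanical: the prefix is `2002:` followed by the 32 bits of the IPv4 address written in hexadecimal. The following sketch illustrates this; the function name is made up for this example and `192.0.2.1` is a documentation address, not a real host.

```bash
# Print the 6to4 prefix (2002:AABB:CCDD::/48) for a dotted-quad IPv4 address.
ipv4_to_6to4_prefix() {
    local old_ifs="${IFS}"
    # Split the dotted quad into its four octets ($1 .. $4).
    IFS=.
    set -- $1
    IFS="${old_ifs}"
    printf '2002:%02x%02x:%02x%02x::/48\n' "$1" "$2" "$3" "$4"
}

ipv4_to_6to4_prefix 192.0.2.1    # prints 2002:c000:0201::/48
```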

You can generate a 6to4 configuration for Debian's or Ubuntu's `/etc/network/interfaces` using a simple web-based [tool](http://debian6to4.gielen.name/).

## Performing a 6to4 setup on a host behind a NAT gateway

However, if you cannot or do not want to do the IPv6 routing for your network on the NAT gateway that holds the global IPv4 address, the setup gets a bit more complicated:

You have to forward all 6to4 traffic (IPv6 packets encapsulated in IPv4, that is IP protocol number 41) from your NAT gateway to the machine you are performing the 6to4 setup on. On Linux, you can do this with iptables using the following commands:

```bash
iptables -t nat -A PREROUTING -d ${EXTERNAL_IPV4_ADDRESS} -p ipv6 -j DNAT --to-destination ${TARGET_HOST}
iptables -A FORWARD -d ${TARGET_HOST} -p ipv6 -j ACCEPT
```

The second rule is only needed if you are not using policy `ACCEPT` for the `FORWARD` chain. Together, these rules forward all 6to4-related traffic that hits `${EXTERNAL_IPV4_ADDRESS}` to `${TARGET_HOST}`.

## MTU setting

With the default MTU setting of 1480, I experienced strange problems: Sometimes, connections got "stuck". Manually setting the MTU of the `tun6to4` interface to 1280 solved these problems. I suspect that they are related to packet fragmentation when the IPv6 packet is encapsulated within an IPv4 packet.
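
To see where the two numbers come from, here is a small arithmetic sketch. It assumes that the only tunnel overhead is the 20-byte IPv4 header; the PPPoE path MTU of 1492 is a hypothetical example and not a value measured in this setup.

```bash
# Tunnel MTU = MTU of the IPv4 path minus the 20-byte IPv4 header.
tunnel_mtu() {
    echo $(( $1 - 20 ))
}

tunnel_mtu 1500    # plain Ethernet path: 1480, the default mentioned above
tunnel_mtu 1492    # hypothetical PPPoE path: 1472, so 1480 would be too large

# 1280 is the minimum MTU that every IPv6 link has to support, which makes it
# a safe lower bound. Setting it by hand would look like this (needs root):
#   ip link set dev tun6to4 mtu 1280
```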

See also: [[Path MTU Discovery issues|doc:Miscellaneous.Network.IP.WebHome|anchor="HPathMTUDiscoveryIssues"]]

# IPv6 with Xen routed setup

If you want to add IPv6 support to Xen DomUs in a routed network setup, you can either use 6to4 on each individual DomU (as long as each has a unique, global IPv4 address), or you can create a routed setup for IPv6 in addition to the existing IPv4 setup.

This how-to assumes that you already have IPv6 running for the Xen Dom0. Refer to the section above if you have not configured IPv6 for the Dom0 yet.

The core of this setup is the following script, which should be saved in `/etc/xen/scripts/vif-route-ipv6` (do not forget to `chmod a+x` the file):

```bash
#!/bin/bash
#============================================================================
# /etc/xen/scripts/vif-route-ipv6
#
# Script for adding an IPv6 address to a routed Xen VM.
# This script is called by a modified version of /etc/xen/scripts/vif-route.
#
# Usage:
#   vif-route-ipv6 (online|offline)
#
# Environment vars:
#   vif          vif interface name (required).
#   XENBUS_PATH  path to this device's details in the XenStore (required).
#                This path is used to extract the VM's UUID.
#
# Read from the store:
#   domain       name of Xen domU
#============================================================================

command="$1"

# Read name of domU from Xen Store
domu_name="$(xenstore-read "${XENBUS_PATH}/domain")"

# Read configuration
CONFIG_FILE="/etc/xen/ipv6.cfg"
grepstr="ipv6_gateway_addr\[${domu_name}\]="
config_line="$(grep -i "${grepstr}" "${CONFIG_FILE}")"
ipv6_gateway_addr="${config_line##*=}"

# Nothing to do if no address is configured for this domU.
if [ -z "${ipv6_gateway_addr}" ] ; then
    exit 0
fi

case "$command" in
    online)
        ip -f inet6 addr add dev "${vif}" "${ipv6_gateway_addr}"
        ;;
    offline)
        ip -f inet6 addr del dev "${vif}" "${ipv6_gateway_addr}"
        ;;
esac
```

It refers to the config file `/etc/xen/ipv6.cfg`. This file might look like this:

    ipv6_gateway_addr[mydomu1]=2001:db8:0:1::1/64
    ipv6_gateway_addr[mydomu2]=2001:db8:0:2::1/64

As you can see, the configuration file contains one line for each DomU that shall be IPv6-enabled, with the DomU name in square brackets. The IPv6 address after the equals sign is the address that will be assigned to the virtual interface corresponding to the DomU in the Dom0.
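
The lookup can be tried without Xen. The following sketch runs the same `grep` and parameter expansion against a temporary copy of the example file; the function name and the temporary file are only there for illustration.

```bash
# Throw-away copy of the example configuration (the real file is
# /etc/xen/ipv6.cfg). It is removed again when the shell exits.
cfg="$(mktemp)"
trap 'rm -f "${cfg}"' EXIT
cat >"${cfg}" <<'EOF'
ipv6_gateway_addr[mydomu1]=2001:db8:0:1::1/64
ipv6_gateway_addr[mydomu2]=2001:db8:0:2::1/64
EOF

# Find the line for the given DomU and keep everything after the "=".
lookup_gateway_addr() {
    local line
    line="$(grep -i "ipv6_gateway_addr\[$1\]=" "${cfg}")" || return 0
    echo "${line##*=}"
}

lookup_gateway_addr mydomu2    # prints 2001:db8:0:2::1/64
lookup_gateway_addr unknown    # prints nothing, so the script would exit early
```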

This differs from the IPv4 routed setup, where you only specify the address of the DomU and a host route is created in order to connect the Dom0 with the DomU. For IPv6 we are using a different setup for three reasons:

1. Configuration gets easier: We do not have to create host routes; the routes are determined automatically by the subnet prefix of the address. In the example above, a route for target `2001:db8:0:1::/64` using the correct `vif` device will be created automatically. There is also no need to manually configure a host route to the gateway within the DomU: The gateway's address (for `mydomu1` in the example it is `2001:db8:0:1::1`) is within the subnet of the DomU.
1. We can easily add extra IP addresses to the DomU: As the Dom0 routes the whole subnet to the DomU, we can just add any address (except the gateway address) within the `/64` subnet to the DomU, without having to change any configuration within the Dom0.
1. The IPv6 address space is vast: If we have a `/48` subnet for the whole Xen host and we use a `/64` subnet for each DomU, we can create up to 2^16 DomUs on one Xen host. This is more DomUs than you will ever run on a single Xen host.
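
Because every DomU simply gets the next `/64`, the lines for `/etc/xen/ipv6.cfg` can be generated from a subnet number. This is only a sketch: the function name is made up, the `2001:db8:0::/48` documentation prefix stands in for your own prefix, and leaving subnet 0 unused is an assumption of this example.

```bash
# Print the configuration line for a DomU that uses subnet number N
# (1 .. 65535) of the /48. The Dom0 side always gets the address ::1.
domu_config_line() {
    local name="$1" n="$2"
    if [ "${n}" -lt 1 ] || [ "${n}" -gt 65535 ]; then
        echo "subnet number out of range: ${n}" >&2
        return 1
    fi
    printf 'ipv6_gateway_addr[%s]=2001:db8:0:%x::1/64\n' "${name}" "${n}"
}

domu_config_line mydomu1 1        # prints ipv6_gateway_addr[mydomu1]=2001:db8:0:1::1/64
domu_config_line mydomu300 300    # prints ipv6_gateway_addr[mydomu300]=2001:db8:0:12c::1/64
```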

In order to make this setup work, we still have to ensure that the script `/etc/xen/scripts/vif-route-ipv6` is called when a DomU is started or stopped. The easiest way is to patch `/etc/xen/scripts/vif-route` using the following patch:

```diff
--- vif-route.dpkg-dist 2010-01-09 15:34:48.000000000 +0100
+++ vif-route 2010-01-09 15:49:17.000000000 +0100
@@ -31,11 +31,13 @@
         echo 1 >/proc/sys/net/ipv4/conf/${vif}/proxy_arp
         ipcmd='add'
         cmdprefix=''
+        XENBUS_PATH="${XENBUS_PATH}" vif="${vif}" $dir/vif-route-ipv6 online
         ;;
     offline)
         do_without_error ifdown ${vif}
         ipcmd='del'
         cmdprefix='do_without_error'
+        XENBUS_PATH="${XENBUS_PATH}" vif="${vif}" $dir/vif-route-ipv6 offline
         ;;
 esac
```

Finally, the setup in the DomU is pretty easy: You can just use a statically configured `inet6` setup on `eth0`. Example:

    auto lo
    iface lo inet loopback

    auto eth0
    iface eth0 inet static
        address 192.0.2.31
        gateway 192.0.2.1
        netmask 255.255.255.255
        pointopoint 192.0.2.1
        post-up /usr/sbin/ethtool -K eth0 tx off

    iface eth0 inet6 static
        address 2001:db8:0:1::2
        netmask 64
        gateway 2001:db8:0:1::1

# NAT with a dynamic IPv6 prefix

Some Internet service providers only provide a dynamic prefix for IPv6 via DHCPv6 prefix delegation. This prefix changes from time to time (e.g. when the connection is interrupted and reestablished).

With such a setup, there are two challenges: First, one has to delegate the prefix to the various subnets / VLANs within one's network. Second, using addresses with a non-permanent prefix causes extra challenges, for example when defining firewall rules between the various VLANs.

For these reasons, I am going to describe a setup in which the LAN only uses addresses from the [unique local address](https://en.wikipedia.org/wiki/Unique_local_address) (ULA) space. When routing into the Internet, these addresses are replaced with addresses allocated using the dynamic prefix.

For this example, it is assumed that the ISP provides a sufficiently large prefix (typically a /56) via DHCPv6 prefix delegation and that the Internet router at the edge of the LAN is capable of further delegating (parts of) this prefix via DHCPv6 prefix delegation. In my case, I am using an AVM Fritz!Box 3370 connected to a VDSL2 line from Deutsche Telekom. Deutsche Telekom provides a dynamic /56 via prefix delegation and the Fritz!Box can be configured to serve prefix delegation requests from DHCPv6 clients.

A computer that is connected to the VLAN of the Internet router (Fritz!Box) and to all other VLANs acts as a firewall and router. On this computer, I run the DHCP client from the `dhcpcd5` Ubuntu package. Please note that the version of this package included before Ubuntu 16.04 LTS is too old because it lacks critical features. Luckily, the package from Ubuntu 16.04 LTS can be installed on Ubuntu 14.04 LTS (and presumably newer versions of Ubuntu) without any problems.

I use the following configuration file (`/etc/dhcpcd.conf`):

    # A sample configuration for dhcpcd.
    # See dhcpcd.conf(5) for details.

    # Allow users of this group to interact with dhcpcd via the control socket.
    #controlgroup wheel

    # Inform the DHCP server of our hostname for DDNS.
    hostname

    # Use the hardware address of the interface for the Client ID.
    #clientid
    # or
    # Use the same DUID + IAID as set in DHCPv6 for DHCPv4 ClientID as per RFC4361.
    # Some non-RFC compliant DHCP servers do not reply with this set.
    # In this case, comment out duid and enable clientid above.
    duid

    # Persist interface configuration when dhcpcd exits.
    persistent

    # Rapid commit support.
    # Safe to enable by default because it requires the equivalent option set
    # on the server to actually work.
    option rapid_commit

    # A list of options to request from the DHCP server.
    option domain_name_servers, domain_name, domain_search, host_name
    option classless_static_routes
    # Most distributions have NTP support.
    option ntp_servers
    # Respect the network MTU.
    # Some interface drivers reset when changing the MTU so disabled by default.
    #option interface_mtu

    # A ServerID is required by RFC2131.
    require dhcp_server_identifier

    # Generate Stable Private IPv6 Addresses instead of hardware based ones
    slaac private

    # A hook script is provided to lookup the hostname if not set by the DHCP
    # server, but it should not be run by default.
    nohook lookup-hostname

    # Limit interfaces used by dhcpcd (comma-separated).
    allowinterfaces eth0

    # Configuration for eth0
    interface eth0
    clientid "";
    persistent
    option rapid_commit
    nooption domain_name_servers, domain_name, domain_search, host_name
    nooption classless_static_routes
    nooption ntp_servers
    slaac hwaddr
    nohook hostname lookup-hostname mtu ntp.conf resolv.conf timezone wpa_supplicant
    ia_pd 1/::/58 nosuch0/0/58
    ipv6only
    nogateway
    noipv6rs
    noauthrequired
    script /etc/dhcpcd-pd-script

Everything before the `allowinterfaces` line is the default configuration. I use the `allowinterfaces` option because the DHCPv6 client shall only run on the interface `eth0` (the interface facing the Internet router in this example). All the other interfaces use a static configuration (remember that internally this computer is a router).

We are only interested in the delegated IPv6 prefix, so I disable everything else using the `nooption` lines and disable all hooks that would affect the computer's network configuration.

I request a /58 prefix using the `ia_pd` line (the Internet router gets a /56 from the ISP, so it should easily be able to provide a /58). I have to specify a network interface that uses the assigned prefix, otherwise `dhcpcd` will not work. However, I do not want to use this prefix on a network interface, but only in `ip6tables` rules. Therefore, I specify an interface name that does not exist (`nosuch0`). This causes the DHCP client to log a warning, but the prefix delegation still works.
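
The prefix sizes can be checked with a bit of shell arithmetic. The helper below is only an illustration (its name is made up); it counts how many prefixes of one length fit into a shorter one.

```bash
# Number of prefixes of length $2 that fit into one prefix of length $1.
subprefix_count() {
    echo $(( 1 << ($2 - $1) ))
}

subprefix_count 56 58    # 4:  the /56 from the ISP holds four /58 prefixes
subprefix_count 58 64    # 64: one /58 holds 64 /64 subnets for internal VLANs
```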

The `ipv6only` option disables DHCP for IPv4 (I use a static IP address for IPv4), and the `nogateway` and `noipv6rs` options ensure that the DHCP client will not add any routes.

The `noauthrequired` option ensures that the IPv6 prefix is updated when the Internet router gets a new prefix from the ISP. Obviously, actually configuring authentication would be preferable, but in my case the Internet router does not seem to support this. If there are only trusted devices in the network connecting the computer running dhcpcd with the Internet router, disabling authentication should be safe.

Finally, the script specified in the `script` line is called whenever a DHCP lease is obtained, renewed or released. It allows us to use the assigned prefix (we will soon see how).

In `/etc/network/interfaces` I use the following configuration for `eth0`:

    iface eth0 inet6 auto
        privext 0
        dhcp 0
        post-up sysctl -w net.ipv6.conf.$IFACE.accept_ra=2 >/dev/null

I disable the privacy extension because it does not make much sense for a router. I also disable DHCPv6 because Ubuntu uses ISC's DHCPv6 client (`dhclient`), which unfortunately cannot handle prefix delegations correctly. The `dhcpcd5` DHCPv6 client, on the other hand, does not have to be triggered by the network scripts explicitly but will detect the new interface and start its work automatically.

The `post-up` command is needed because Linux does not accept router advertisements (RAs) by default when forwarding is enabled. The rationale behind this is that a computer with forwarding enabled acts as a router and a router typically should not accept RAs from other routers. In our case, however, we want to accept RAs from our upstream router, and setting `accept_ra` to `2` overrides the default behavior.
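
For reference, the three values of the `accept_ra` setting can be summarized in a small helper. The function name is made up for this sketch, and the check at the end assumes that the interface is called `eth0`; it is skipped silently when the file does not exist.

```bash
# Explain the meaning of an accept_ra value.
describe_accept_ra() {
    case "$1" in
        0) echo "never accept router advertisements" ;;
        1) echo "accept router advertisements only while forwarding is disabled" ;;
        2) echo "accept router advertisements even if forwarding is enabled" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Show the current setting for eth0 (only on Linux with IPv6 enabled).
f=/proc/sys/net/ipv6/conf/eth0/accept_ra
if [ -r "${f}" ]; then
    describe_accept_ra "$(cat "${f}")"
fi
```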

Finally, we need the script that is called when a dynamic prefix is assigned so that we can create the corresponding rules for `ip6tables`.

I use the following script (`/etc/dhcpcd-pd-script`):

```bash
#!/bin/bash

set -e

ip6t="/sbin/ip6tables"
prefix_file="/var/run/ipv6-pd/current_prefix"
external_interface="eth0"
# Placeholder: replace this with a /58 from your own ULA prefix.
internal_prefix="fd00::/58"
internal_redirect_mark="0x8aebe875"

update_iptables() {
    local external_prefix
    external_prefix="$1"
    # Ensure that the chains exist. If they already exist, creating them causes
    # an error that we have to catch.
    "${ip6t}" -t nat -N external_dnat 2>/dev/null || true
    "${ip6t}" -t nat -N external_snat 2>/dev/null || true
    # Flush the (existing) chains.
    "${ip6t}" -t nat -F external_dnat
    "${ip6t}" -t nat -F external_snat
    "${ip6t}" -t nat -A external_dnat -i "${external_interface}" \
        -d "${external_prefix}" -j NETMAP --to "${internal_prefix}"
    "${ip6t}" -t nat -A external_snat -o "${external_interface}" \
        -s "${internal_prefix}" -j NETMAP --to "${external_prefix}"
    # Internal traffic directed to the external prefix should be rerouted to the
    # internal prefix. However, the source address has to be rewritten so that
    # the response will pass through this router again and thus the address in
    # the response packet can be rewritten again. We use a mark so that we can
    # know which packets need to be touched in the POSTROUTING (external_snat)
    # chain.
    "${ip6t}" -t nat -A external_dnat ! -i "${external_interface}" \
        -s "${internal_prefix}" -d "${external_prefix}" -j MARK \
        --set-mark "${internal_redirect_mark}"
    "${ip6t}" -t nat -A external_dnat ! -i "${external_interface}" \
        -s "${internal_prefix}" -d "${external_prefix}" -j NETMAP \
        --to "${internal_prefix}"
    "${ip6t}" -t nat -A external_snat ! -i "${external_interface}" \
        -s "${internal_prefix}" -m mark --mark "${internal_redirect_mark}" \
        -j NETMAP --to "${external_prefix}"
}

# If this script is called by the firewall script, we only try to restore the
# ip6tables rules.
if [ $# -ge 1 ] && [ "$1" = "restore-iptables" ]; then
    if [ -f "${prefix_file}" ]; then
        last_prefix="$(cat "${prefix_file}")"
        update_iptables "${last_prefix}"
    fi
    exit 0
fi

# If this script is called without a new prefix, there is nothing we can or have
# to do.
if [ -n "${new_dhcp6_ia_pd1_prefix1}" ]; then
    expected_prefix_length="${internal_prefix##*/}"
    # We expect a prefix of the right length because this is what we request.
    # However, as our script cannot work correctly when the length of the
    # internal and the external prefix do not match, we check this to be sure.
    if [ "${new_dhcp6_ia_pd1_prefix1_length}" -ne "${expected_prefix_length}" ]; then
        echo "Invalid prefix length: Expected ${expected_prefix_length} but got ${new_dhcp6_ia_pd1_prefix1_length}." >&2
        exit 1
    fi
    new_prefix="${new_dhcp6_ia_pd1_prefix1}/${expected_prefix_length}"
    if [ -f "${prefix_file}" ]; then
        last_prefix="$(cat "${prefix_file}")"
    else
        last_prefix=""
    fi
    if [ "${last_prefix}" = "${new_prefix}" ]; then
        # The chains live in the nat table, so this is where we have to look.
        if "${ip6t}" -t nat -L external_dnat -n -v 2>/dev/null | grep -q -F "${new_prefix}"; then
            # The prefix has not changed and the ip6tables rules have already
            # been created.
            exit 0
        fi
    fi
    update_iptables "${new_prefix}"
    mkdir -p "$(dirname "${prefix_file}")"
    echo -n "${new_prefix}" >"${prefix_file}"
fi
```

In this script, you have to adjust your internally used prefix. When choosing a ULA prefix, you should use a random number from the range `fc00::/7` (in practice, locally assigned prefixes come from `fd00::/8`) in order to avoid collisions when connecting different networks that use addresses from the ULA space. As in the other configuration files, you have to change the interface name from `eth0` to the name of the interface that connects to the Internet router.
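
One simple way to get such a random prefix is sketched below. It takes 40 random bits from `/dev/urandom` as the global ID, which is a simplification of the hash-based procedure suggested in RFC 4193; the function name is made up for this example. A /58 for `internal_prefix` can then be taken from the resulting /48.

```bash
# Generate a random ULA /48: the fd00::/8 prefix followed by a 40-bit
# global ID read from /dev/urandom.
random_ula_prefix() {
    local hex
    # Five random bytes as ten hexadecimal digits without separators.
    hex="$(od -An -N5 -tx1 /dev/urandom | tr -d ' \n')"
    printf 'fd%s:%s:%s::/48\n' \
        "$(echo "${hex}" | cut -c1-2)" \
        "$(echo "${hex}" | cut -c3-6)" \
        "$(echo "${hex}" | cut -c7-10)"
}

random_ula_prefix
```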

In order to work correctly when handling traffic that comes from the internal network and is directed at the internal network but uses a destination address with the external prefix, the `ip6tables` rules use a mark. This will only work correctly if no other rules affecting the packet (in particular in the `FORWARD` chain) set the mark.

This script does the following: When called with the `new_dhcp6_ia_pd1_prefix1` environment variable set, it uses this prefix to create `ip6tables` rules that replace the internal prefix with the dynamic external one when routing through the external interface.
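
What the `NETMAP` target does to a single address can be illustrated in plain shell: the first 58 bits are taken from the new prefix and the remaining 70 bits are kept. This is only a sketch of the bit arithmetic, not of the kernel implementation. The function name and all addresses are made up, and both arguments have to be written out without `::`.

```bash
# netmap_58 ADDRESS NEW_PREFIX
# ADDRESS needs all eight groups, NEW_PREFIX at least the first four.
netmap_58() {
    local addr="$1" prefix="$2" old_ifs="${IFS}" p1 p2 p3 p4
    IFS=:
    set -- ${prefix}
    p1="$1" p2="$2" p3="$3" p4="$4"
    set -- ${addr}
    IFS="${old_ifs}"
    # Groups 1-3 and the top 10 bits of group 4 come from the new prefix; the
    # low 6 bits of group 4 and the interface ID (groups 5-8) are kept.
    printf '%x:%x:%x:%x:%x:%x:%x:%x\n' \
        "0x${p1}" "0x${p2}" "0x${p3}" \
        "$(( (0x${p4} & 0xffc0) | (0x$4 & 0x3f) ))" \
        "0x$5" "0x$6" "0x$7" "0x$8"
}

# Outgoing direction: internal ULA address to external address.
netmap_58 fd12:3456:789a:5:0:0:0:10 2001:db8:1234:5640
# prints 2001:db8:1234:5645:0:0:0:10
```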

These rules are created in separate chains (`external_dnat` and `external_snat`), so that we can easily replace the rules without affecting any other rules that might be present. Please note that these chains need to be called from the `PREROUTING` and `POSTROUTING` chains like this:

```bash
ip6tables -t nat -A PREROUTING -j external_dnat
ip6tables -t nat -A POSTROUTING -j external_snat
```

You might have noticed that there is no code that removes the rules when the prefix delegation expires. The rationale behind this is simple: Usually, we only expect a prefix to be replaced with a different prefix. The only case when we would expect no prefix at all is when our Internet connection is down. In this case, however, it does not matter if we still have a rule with the old prefix.

## Using the DHCPv6 client in a fail-over setup

In my case, the actual setup is even a bit more complex: I do not want the internal router to be a single point of failure. For the DSL router on the edge of the network this is acceptable because there is no reasonable way to avoid this. A simple router box is also less likely to fail than a "real" computer and software updates requiring a reboot are less frequent, too.

I will not discuss the details of the fail-over setup of the network interfaces here. I use a [HA solution](https://sebastian.marsching.com/wiki/Linux/OpenVSwitch#Using_Open_vSwitch_for_a_high-availability_.2F_fail-over_interface) involving OpenVSwitch. For the rest of this tutorial, it is assumed that fail-over is working for the network interfaces and that the network interface facing the Internet router (`eth0`) uses the same MAC address on all nodes of the HA cluster and is only active on a single node at once.

The remaining challenge is to ensure that the DHCPv6 client uses the same prefix when a fail-over from one node to another one happens. If the prefix changed, existing connections would be interrupted.

DHCPv6 uses a DUID identifying the client when contacting the server. Typically, this ID is generated when the client runs for the first time and stored internally. This way, a client always has the same DUID when contacting a server. The DHCPv6 client from the `dhcpcd5` package stores this DUID in `/etc/dhcpcd.duid`. We could copy this file to all nodes so that they will identify as the same DHCPv6 client; however, this could be dangerous. If there happen to be other interfaces on which we want to use DHCPv6 and two nodes might be active on one of these interfaces at the same time, the same addresses could end up being assigned to two different clients. In addition to that, we would also have to keep the information about active leases in sync between nodes.

Luckily, there is a simpler approach to this issue: The DHCP client from `dhcpcd5` has been specifically designed to work on computers where there is no permanent storage available (e.g. some embedded devices). On these devices, it generates a DUID based on the hardware address of the interface and it does not store lease information. This is exactly what we want. Typically, a DHCPv6 server will assign the same lease if the same client requests a new lease and its current lease has not expired yet (at least, this is what the DHCPv6 server in the Fritz!Box does). So if we fail over from one node to another one, the delegated IPv6 prefix will be kept (the hardware address of the interfaces is the same, as described earlier).
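
To illustrate why identical MAC addresses lead to identical DUIDs, the following sketch builds a link-layer DUID (DUID-LL) as defined in RFC 8415. It assumes that this is the DUID type the client falls back to and that the interface is an Ethernet interface; the function name and the MAC address are made up.

```bash
# Build a DUID-LL: type 3, hardware type 1 (Ethernet), then the MAC address.
duid_ll_from_mac() {
    echo "00:03:00:01:$(echo "$1" | tr 'A-F' 'a-f')"
}

# Two nodes that use the same MAC address on eth0 end up with the same DUID.
duid_ll_from_mac 02:00:5E:10:00:01    # prints 00:03:00:01:02:00:5e:10:00:01
```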

Unfortunately, there is no way to tell the DHCP client to operate without storing the DUID and lease information. It will simply try to store this information and fall back to working without stored information if the write operation fails. We cannot use restrictive permissions on the relevant files because the DHCP client runs with `root` privileges. However, we can set the immutable attribute on them, so that they cannot be changed any longer. We do this by running the following two commands:

```bash
chattr +i /etc/dhcpcd.duid
chattr +i /var/lib/dhcpcd5
```

Before doing this, ensure that `/etc/dhcpcd.duid` is an existing empty file and that `/var/lib/dhcpcd5` is an existing empty directory.

Now, the write operations will fail and the DHCP client will show the desired behavior.