Help:Container Network

Docker Networking Fundamentals

One of the reasons why Kogence uses Docker as its HPC containerization technology is that Docker provides rich network isolation features that are not available in other HPC containerization technologies such as Singularity. Docker containers running on the same host or on different hosts of a cluster can be connected to each other as well as to the hosts on which they are running. This allows Kogence to orchestrate HPC workloads on containers such that these workloads are not even aware that they are running inside Docker. Similarly, distributed memory multi-node workloads such as MPI workloads are not aware whether their connected peer workloads are also Docker workloads or not. Whether your Docker hosts run Linux, Windows, or a mix of the two, we can use Docker to manage them in a platform-agnostic way.

Docker’s networking subsystem is pluggable, using drivers. Docker offers several drivers by default, such as bridge, host, macvlan, and overlay, which provide the core networking functionality. On the Kogence Container Cloud HPC platform we use each of these depending upon the workload you launch. The user remains agnostic to the details, and her workload automatically runs on a container cloud cluster orchestrated with the most optimal choice for her specific workload.
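
As a quick reference, the available drivers and networks on any Docker host can be inspected with the standard Docker CLI. The commands below are a generic illustration, not Kogence-specific; the network name my_bridge_net is only an example.

 docker network ls                              # lists the built-in bridge, host and none networks plus any user-defined ones
 docker network inspect bridge                  # shows the subnet, gateway and attached containers of the default bridge
 docker network create -d bridge my_bridge_net  # creates a user-defined network with an explicitly chosen driver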

Docker Bridge Network

If a Docker container is started with the default bridge network, then the container uses a separate network namespace, and the host's network interfaces, routing tables, ARP tables etc. are not visible inside the container.
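
A quick way to observe this isolation is to compare the interface and routing-table listings inside a bridge-networked container with those of the host. The alpine image below is just a convenient example of a container image that ships the ip tool.

 ip addr                                           # on the host: shows lo, eth0, docker0, veth* interfaces, etc.
 docker run --rm --network bridge alpine ip addr   # inside the container: only lo and the container's own eth0 are visible
 docker run --rm --network bridge alpine ip route  # the container has its own, separate routing table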

The bridge network works with a single host only. It provides network connectivity between the host and the containers running on that single host.

The Docker daemon creates a virtual switch on the host. This virtual switch typically shows up as the docker0 network interface in the list of interfaces on the host. The docker0 interface is connected to a private subnet and receives the first IP address of that private address space. As with any other LAN hardware, this switch (i.e. the docker0 interface) also gets a MAC address. The host is able to receive and send packets on the connected subnet using this network interface, either using the Ethernet broadcasting mechanism (e.g. when the destination MAC address is not known) or using the point-to-point mechanism through the destination MAC address of any other network device connected to the same subnet. As the host communicates with other network devices using this interface, the host kernel populates the MAC addresses of the other network devices connected to this subnet in its ARP table. The host kernel routing table is also populated with the IP address range (i.e. the subnet mask) of this subnet, so when the host receives a packet destined for an IP address in this subnet range, on any of the host's network interfaces and not just docker0, the host will use the docker0 network interface to route the packet further. Note that the host acts as the gateway for this subnet and, therefore, the host's kernel routing table does not get a default gateway entry for this interface.
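
For illustration, the docker0 interface, the corresponding host routing table entry and the host ARP/neighbour entries can be inspected as sketched below. The 172.17.0.0/16 subnet shown in the comments is only the typical default; the actual range may differ.

 ip addr show docker0        # docker0 typically holds the first address of the bridge subnet, e.g. 172.17.0.1/16
 ip route | grep docker0     # e.g. "172.17.0.0/16 dev docker0 ..." -- note there is no default gateway via docker0
 ip neigh show dev docker0   # ARP/neighbour entries for containers the host has communicated with on this subnet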

When a container, say cont#1, gets created (either using docker run or docker create) on this host, the Docker daemon adds a new virtual port (technically known as a veth-pair type network interface) on this virtual switch. This virtual port shows up as a vethXYZ@ifAB interface on the host machine. Let's say this interface has an interface ID of CD. The Docker daemon also creates a new eth0@ifCD network interface inside the container cont#1. This interface will have an interface ID of AB. The @ifAB at the end of vethXYZ@ifAB indicates that this port on the switch (identified by the interface ID CD) is connected to the interface ID AB (which is the eth0@ifCD interface on the container cont#1). Correspondingly, the @ifCD at the end of eth0@ifCD indicates that this Ethernet interface is connected to the interface ID CD (which is the vethXYZ@ifAB port on the virtual switch on the host). Please note that technically all interfaces are created on the host machine -- they are just in different network namespaces. The kernel assigns the numerical network interface IDs to all network interfaces sequentially across all namespaces.
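
The paired interface IDs can be seen by listing the links on the host and inside the container. The container name cont1, the alpine image, and the interface names and IDs shown in the comments are only examples of what the output typically looks like.

 docker run -d --name cont1 --network bridge alpine sleep 3600   # start an example container on the default bridge
 ip -o link | grep veth      # host side: e.g. "27: veth1a2b3c@if26: ..." -- interface ID 27, peer ID 26
 docker exec cont1 ip link   # container side: e.g. "26: eth0@if27: ..." -- the peer IDs match up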

Please note that these ports (i.e. the veth-pair type network interfaces) on the virtual switch do not receive an IP address even though they show up as proper network interfaces with a MAC address in the list of network interfaces. These MAC addresses will never get populated in an ARP table and these interfaces will never get used directly. The host will use the docker0 interface to send packets to the containers, and the containers will use their own interfaces like eth0@ifCD to send packets to the host as well as to the other containers. Interfaces like vethXYZ@ifAB should be mentally modelled as ports on the switch with an Ethernet wire connected. The interface on the other end (such as the eth0@ifCD interface) that this Ethernet wire connects to is the one that receives an IP address, not the port itself. Technically, vethXYZ@ifAB is a slave interface while docker0 is the master interface. Any packet that the host itself originates and sends to a container over this bridge carries the docker0 IP address as its source IP address.
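
The veth interfaces enslaved to the docker0 bridge can be listed as shown below; note that none of them carries an IP address.

 ip link show master docker0   # lists the veth interfaces attached to the docker0 bridge (slave interfaces)
 bridge link show              # equivalent view from the bridge utility: each veth port reports "master docker0"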

The eth0@ifCD interface on the container cont#1 is also connected to the same private subnet as docker0 and receives a unique private IP address from that private address space. As with any other LAN hardware, this interface also gets a MAC address. The processes on this container are able to receive and send packets on the connected subnet using this network interface, either using the Ethernet broadcasting mechanism (e.g. when the destination MAC address is not known) or using the point-to-point mechanism through the destination MAC address of any other network device connected to the same subnet. As the container communicates with other network devices using this interface, the container populates the MAC addresses of the other network devices connected to this subnet in its own ARP table. The kernel routing table of the container is also populated with the IP address range (i.e. the subnet mask) of this subnet, so when the container receives a packet destined for an IP address in this subnet range, on any of its network interfaces and not just eth0@ifCD, the container will use the eth0@ifCD network interface to route the packet further. Note that the host acts as the gateway for this subnet and therefore the container's routing table will also list the docker0 IP address as the default gateway entry for this interface. So any packet that is not destined for this subnet will be forwarded to the host machine on the docker0 interface. The host machine can then do NAT/PAT (if configured) and use its other network interfaces (if configured) to forward the packet to the correct destination outside of the host and the containers' private network. Please note that, by default, inbound NAT/PAT is not configured. So, by default, a container can access the internet provided the host has other network interfaces that are connected to the internet, but the reverse is not possible unless NAT/PAT is configured on the host by publishing ports from the container; i.e. the internet cannot access the containers even though the host itself may be made accessible on the internet.
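
The container's default route and the host's NAT rules can be checked as sketched below, assuming the example container cont1 from above is still running. Exact addresses and rule sets will vary.

 docker exec cont1 ip route   # e.g. "default via 172.17.0.1 dev eth0" -- docker0's address is the default gateway
 sudo iptables -t nat -L -n   # on the host: DNAT entries for containers appear here only once ports are published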

On the Kogence container cloud HPC platform, by default, we do not publish any of the container's ports. If your workload needs to expose some ports, then those specific ports are published specifically for your workload only. Corresponding changes to the host's network router and the host's iptables are also made automatically as needed.
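
On plain Docker, publishing a port is done with the -p flag of docker run; on Kogence this happens automatically for your workload, so the commands below are only an illustration of the underlying mechanism. The name web and the port numbers are examples.

 docker run -d --name web -p 8080:80 nginx   # maps host port 8080 to container port 80
 sudo iptables -t nat -L DOCKER -n           # shows the DNAT rule that Docker added for the published port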

Docker Host Network

If Docker's host network driver is used, then Docker containers do not run in a separate network namespace. There is no network isolation, and the workloads running in the container use the host network stack. This means that the host's network interfaces, routing tables, ARP tables etc. are visible inside the container.
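
This can be verified by running the same ip commands as before, this time with the host network driver; the output now matches the host's own view. The alpine image is again just an example.

 docker run --rm --network host alpine ip addr    # prints the host's interfaces (eth0, docker0, ...), not a private container view
 docker run --rm --network host alpine ip route   # the host's routing table, unchanged, as seen from inside the container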

By default, the Kogence container cloud HPC platform uses the host network, and therefore all network- and host-level security measures apply to the workloads running in the containers as well. Other network choices are made only for specific types of workloads for which the host network does not suffice. Please contact us for further details.