Thursday, December 29, 2011

Monday, December 12, 2011

The Jail Subsystem

Table of Contents
4.1 Architecture
4.2 Restrictions

Evan SarmientoCopyright © 2001 Evan Sarmiento On most UNIX® systems, root has omnipotent power. This promotes insecurity. If an attacker gained root on a system, he would have every function at his fingertips. In FreeBSD there are sysctls which dilute the power of root, in order to minimize the damage caused by an attacker. Specifically, one of these functions is called secure levels. Similarly, another function which is present from FreeBSD 4.0 and onward, is a utility called jail(8). Jail chroots an environment and sets certain restrictions on processes which are forked within the jail. For example, a jailed process cannot affect processes outside the jail, utilize certain system calls, or inflict any damage on the host environment.
Jail is becoming the new security model. People are running potentially vulnerable servers such as Apache, BIND, and sendmail within jails, so that if an attacker gains root within the jail, it is only an annoyance, and not a devastation. This article mainly focuses on the internals (source code) of jail. If you are looking for a how-to on setting up a jail, I suggest you look at my other article in Sys Admin Magazine, May 2001, entitled "Securing FreeBSD using Jail."

4.1 Architecture

Jail consists of two realms: the userland program, jail(8), and the code implemented within the kernel: the jail(2) system call and associated restrictions. I will be discussing the userland program and then how jail is implemented within the kernel.

4.1.1 Userland Code

The source for the userland jail is located in /usr/src/usr.sbin/jail, consisting of one file, jail.c. The program takes these arguments: the path of the jail, hostname, IP address, and the command to be executed.

4.1.1.1 Data Structures

In jail.c, the first thing I would note is the declaration of an important structure struct jail j; which was included from /usr/include/sys/jail.h.
The definition of the jail structure is:

/usr/include/sys/jail.h: 

struct jail {
        u_int32_t       version;
        char            *path;
        char            *hostname;
        u_int32_t       ip_number;
};

As you can see, there is an entry for each of the arguments passed to the jail(8) program, and indeed, they are set during its execution.

/usr/src/usr.sbin/jail/jail.c
char path[PATH_MAX];
...
if (realpath(argv[0], path) == NULL)
    err(1, "realpath: %s", argv[0]);
if (chdir(path) != 0)
    err(1, "chdir: %s", path);
memset(&j, 0, sizeof(j));
j.version = 0;
j.path = path;
j.hostname = argv[1];

4.1.1.2 Networking

One of the arguments passed to the jail(8) program is an IP address with which the jail can be accessed over the network. jail(8) translates the IP address given into host byte order and then stores it in j (the jail structure).

/usr/src/usr.sbin/jail/jail.c:
struct in_addr in; 
... 
if (inet_aton(argv[2], &in) == 0)
    errx(1, "Could not make sense of ip-number: %s", argv[2]);
j.ip_number = ntohl(in.s_addr);

The inet_aton(3) function "interprets the specified character string as an Internet address, placing the address into the structure provided." The ip_number member in the jail structure is set only when the IP address placed onto the in structure by inet_aton(3) is translated into host byte order by ntohl(3).

4.1.1.3 Jailing The Process

Finally, the userland program jails the process. Jail now becomes an imprisoned process itself and then executes the command given using execv(3).

/usr/src/usr.sbin/jail/jail.c
i = jail(&j); 
... 
if (execv(argv[3], argv + 3) != 0)
    err(1, "execv: %s", argv[3]);

As you can see, the jail() function is called, and its argument is the jail structure which has been filled with the arguments given to the program. Finally, the program you specify is executed. I will now discuss how jail is implemented within the kernel.

4.1.2 Kernel Space

We will now be looking at the file /usr/src/sys/kern/kern_jail.c. This is the file where the jail(2) system call, appropriate sysctls, and networking functions are defined.

4.1.2.1 sysctls

In kern_jail.c, the following sysctls are defined:

/usr/src/sys/kern/kern_jail.c:

int     jail_set_hostname_allowed = 1;
SYSCTL_PROC(_security_jail, OID_AUTO, set_hostname_allowed, CTLFLAG_RW,
    &jail_set_hostname_allowed, 0,
    "Processes in jail can set their hostnames");

int     jail_socket_unixiproute_only = 1;
SYSCTL_PROC(_security_jail, OID_AUTO, socket_unixiproute_only, CTLFLAG_RW,
    &jail_socket_unixiproute_only, 0,
    "Processes in jail are limited to creating UNIX/IPv4/route sockets only");

int     jail_sysvipc_allowed = 0;
SYSCTL_PROC(_security_jail, OID_AUTO, sysvipc_allowed, CTLFLAG_RW,
    &jail_sysvipc_allowed, 0,
    "Processes in jail can use System V IPC primitives");

static int jail_enforce_statfs = 2;
SYSCTL_PROC(_security_jail, OID_AUTO, enforce_statfs, CTLFLAG_RW,
    &jail_enforce_statfs, 0,
    "Processes in jail cannot see all mounted file systems");

int    jail_allow_raw_sockets = 0;
SYSCTL_PROC(_security_jail, OID_AUTO, allow_raw_sockets, CTLFLAG_RW,
    &jail_allow_raw_sockets, 0,
    "Prison root can create raw sockets");

int    jail_chflags_allowed = 0;
SYSCTL_PROC(_security_jail, OID_AUTO, chflags_allowed, CTLFLAG_RW,
    &jail_chflags_allowed, 0,
    "Processes in jail can alter system file flags");

int     jail_mount_allowed = 0;
SYSCTL_PROC(_security_jail, OID_AUTO, mount_allowed, CTLFLAG_RW,
    &jail_mount_allowed, 0,
    "Processes in jail can mount/unmount jail-friendly file systems");

Each of these sysctls can be accessed by the user through the sysctl(8) program. Throughout the kernel, these specific sysctls are recognized by their name. For example, the name of the first sysctl is security.jail.set_hostname_allowed.

4.1.2.2 jail(2) system call

Like all system calls, the jail(2) system call takes two arguments, struct thread *td and struct jail_args *uap. td is a pointer to the thread structure which describes the calling thread. In this context, uap is a pointer to the structure in which a pointer to the jail structure passed by the userland jail.c is contained. When I described the userland program before, you saw that the jail(2) system call was given a jail structure as its own argument.

/usr/src/sys/kern/kern_jail.c:
/*
 * struct jail_args {
 *  struct jail *jail;
 * };
 */ 
int 
jail(struct thread *td, struct jail_args *uap)

Therefore, uap->jail can be used to access the jail structure which was passed to the system call. Next, the system call copies the jail structure into kernel space using the copyin(9) function. copyin(9) takes three arguments: the address of the data which is to be copied into kernel space, uap->jail, where to store it, j and the size of the storage. The jail structure pointed by uap->jail is copied into kernel space and is stored in another jail structure, j.

/usr/src/sys/kern/kern_jail.c: 
error = copyin(uap->jail, &j, sizeof(j));

There is another important structure defined in jail.h. It is the prison structure. The prison structure is used exclusively within kernel space. Here is the definition of the prison structure.

/usr/include/sys/jail.h:
struct prison {
        LIST_ENTRY(prison) pr_list;                     /* (a) all prisons */
        int              pr_id;                         /* (c) prison id */
        int              pr_ref;                        /* (p) refcount */
        char             pr_path[MAXPATHLEN];           /* (c) chroot path */
        struct vnode    *pr_root;                       /* (c) vnode to rdir */
        char             pr_host[MAXHOSTNAMELEN];       /* (p) jail hostname */
        u_int32_t        pr_ip;                         /* (c) ip addr host */
        void            *pr_linux;                      /* (p) linux abi */
        int              pr_securelevel;                /* (p) securelevel */
        struct task      pr_task;                       /* (d) destroy task */
        struct mtx       pr_mtx;
      void            **pr_slots;                     /* (p) additional data */
};

The jail(2) system call then allocates memory for a prison structure and copies data between the jail and prison structure.

/usr/src/sys/kern/kern_jail.c:
MALLOC(pr, struct prison *, sizeof(*pr), M_PRISON, M_WAITOK | M_ZERO);
...
error = copyinstr(j.path, &pr->pr_path, sizeof(pr->pr_path), 0);
if (error)
    goto e_killmtx;
...
error = copyinstr(j.hostname, &pr->pr_host, sizeof(pr->pr_host), 0);
if (error)
     goto e_dropvnref;
pr->pr_ip = j.ip_number;

Next, we will discuss another important system call jail_attach(2), which implements the function to put a process into the jail.

/usr/src/sys/kern/kern_jail.c:
/*
 * struct jail_attach_args {
 *      int jid;
 * };
 */
int
jail_attach(struct thread *td, struct jail_attach_args *uap)

This system call makes the changes that can distinguish a jailed process from those unjailed ones. To understand what jail_attach(2) does for us, certain background information is needed.
On FreeBSD, each kernel visible thread is identified by its thread structure, while the processes are described by their proc structures. You can find the definitions of the thread and proc structure in /usr/include/sys/proc.h. For example, the td argument in any system call is actually a pointer to the calling thread's thread structure, as stated before. The td_proc member in the thread structure pointed by td is a pointer to the proc structure which represents the process that contains the thread represented by td. The proc structure contains members which can describe the owner's identity(p_ucred), the process resource limits(p_limit), and so on. In the ucred structure pointed by p_ucred member in the proc structure, there is a pointer to the prison structure(cr_prison).

/usr/include/sys/proc.h: 
struct thread {
    ...
    struct proc *td_proc;
    ...
};
struct proc { 
    ...
    struct ucred *p_ucred; 
    ...
};
/usr/include/sys/ucred.h
struct ucred {
    ...
    struct prison *cr_prison;
    ...
};

In kern_jail.c, the function jail() then calls function jail_attach() with a given jid. And jail_attach() calls function change_root() to change the root directory of the calling process. The jail_attach() then creates a new ucred structure, and attaches the newly created ucred structure to the calling process after it has successfully attached the prison structure to the ucred structure. From then on, the calling process is recognized as jailed. When the kernel routine jailed() is called in the kernel with the newly created ucred structure as its argument, it returns 1 to tell that the credential is connected with a jail. The public ancestor process of all the process forked within the jail, is the process which runs jail(8), as it calls the jail(2) system call. When a program is executed through execve(2), it inherits the jailed property of its parent's ucred structure, therefore it has a jailed ucred structure.

/usr/src/sys/kern/kern_jail.c
int
jail(struct thread *td, struct jail_args *uap)
{
...
    struct jail_attach_args jaa;
...
    error = jail_attach(td, &jaa);
    if (error)
        goto e_dropprref;
...
}

int
jail_attach(struct thread *td, struct jail_attach_args *uap)
{
    struct proc *p;
    struct ucred *newcred, *oldcred;
    struct prison *pr;
...
    p = td->td_proc;
...
    pr = prison_find(uap->jid);
...
    change_root(pr->pr_root, td);
...
    newcred->cr_prison = pr;
    p->p_ucred = newcred;
...
}

When a process is forked from its parent process, the fork(2) system call uses crhold() to maintain the credential for the newly forked process. It inherently keep the newly forked child's credential consistent with its parent, so the child process is also jailed.

/usr/src/sys/kern/kern_fork.c:
p2->p_ucred = crhold(td->td_ucred);
...
td2->td_ucred = crhold(p2->p_ucred);

Thursday, December 8, 2011

What is a Jail

BSD-like operating systems have had chroot(2) since the time of 4.2BSD. The chroot(8) utility can be used to change the root directory of a set of processes, creating a safe environment, separate from the rest of the system. Processes created in the chrooted environment can not access files or resources outside of it. For that reason, compromising a service running in a chrooted environment should not allow the attacker to compromise the entire system. The chroot(8) utility is good for easy tasks, which do not require a lot of flexibility or complex and advanced features. Since the inception of the chroot concept, however, many ways have been found to escape from a chrooted environment and, although they have been fixed in modern versions of the FreeBSD kernel, it was clear that chroot(2) was not the ideal solution for securing services. A new subsystem had to be implemented.
This is one of the main reasons why jails were developed.
Jails improve on the concept of the traditional chroot(2) environment, in several ways. In a traditional chroot(2) environment, processes are only limited in the part of the file system they can access. The rest of the system resources (like the set of system users, the running processes, or the networking subsystem) are shared by the chrooted processes and the processes of the host system. Jails expand this model by virtualizing not only access to the file system, but also the set of users, the networking subsystem of the FreeBSD kernel and a few other things. A more complete set of fine-grained controls available for tuning the access of a jailed environment is described in Section 16.5.
A jail is characterized by four elements:

A directory subtree -- the starting point from which a jail is entered. Once inside the jail, a process is not permitted to escape outside of this subtree. Traditional security issues which plagued the original chroot(2) design will not affect FreeBSD jails.
A hostname -- the hostname which will be used within the jail. Jails are mainly used for hosting network services, therefore having a descriptive hostname for each jail can really help the system administrator.
An IP address -- this will be assigned to the jail and cannot be changed in any way during the jail's life span. The IP address of a jail is usually an alias address for an existing network interface, but this is not strictly necessary.
A command -- the path name of an executable to run inside the jail. This is relative to the root directory of the jail environment, and may vary a lot, depending on the type of the specific jail environment.

Apart from these, jails can have their own set of users and their own root user. Naturally, the powers of the root user are limited within the jail environment and, from the point of view of the host system, the jail root user is not an omnipotent user. In addition, the root user of a jail is not allowed to perform critical operations to the system outside of the associated jail(8) environment.

Creating and Controlling Jails

Some administrators divide jails into the following two types: “complete” jails, which resemble a real FreeBSD system, and “service” jails, dedicated to one application or service, possibly running with privileges. This is only a conceptual division and the process of building a jail is not affected by it. The jail(8) manual page is quite clear about the procedure for building a jail:

# setenv D /here/is/the/jail
# mkdir -p $D 
# cd /usr/src
# make buildworld 
# make installworld DESTDIR=$D 
# make distribution DESTDIR=$D 
# mount -t devfs devfs $D/dev

: Selecting a location for a jail is the best starting point. This is where the jail will physically reside within the file system of the jail's host. A good choice can be /usr/jail/jailname, where jailname is the hostname identifying the jail. The /usr/ file system usually has enough space for the jail file system, which for “complete” jails is, essentially, a replication of every file present in a default installation of the FreeBSD base system.
: If you have already rebuilt your userland using make world or make buildworld, you can skip this step and install your existing userland into the new jail.
: This command will populate the directory subtree chosen as jail's physical location on the file system with the necessary binaries, libraries, manual pages and so on.
: The distribution target for make installs every needed configuration file. In simple words, it installs every installable file of /usr/src/etc/ to the /etc directory of the jail environment: $D/etc/.
: Mounting the devfs(8) file system inside a jail is not required. On the other hand, any, or almost any application requires access to at least one device, depending on the purpose of the given application. It is very important to control access to devices from inside a jail, as improper settings could permit an attacker to do nasty things in the jail. Control over devfs(8) is managed through rulesets which are described in the devfs(8) and devfs.conf(5) manual pages.

Once a jail is installed, it can be started by using the jail(8) utility. The jail(8) utility takes four mandatory arguments which are described in the Section 16.3.1. Other arguments may be specified too, e.g., to run the jailed process with the credentials of a specific user. The command argument depends on the type of the jail; for a virtual system, /etc/rc is a good choice, since it will replicate the startup sequence of a real FreeBSD system. For a service jail, it depends on the service or application that will run within the jail.
Jails are often started at boot time and the FreeBSD rc mechanism provides an easy way to do this.

A list of the jails which are enabled to start at boot time should be added to the rc.conf(5) file:
```
jail_enable="YES"   # Set to NO to disable starting of any jails
jail_list="www"     # Space separated list of names of jails
```
Note: Jail names in jail_list should contain alphanumeric characters only.
For each jail listed in jail_list, a group of rc.conf(5) settings, which describe the particular jail, should be added:
```
jail_www_rootdir="/usr/jail/www"     # jail's root directory
jail_www_hostname="www.example.org"  # jail's hostname
jail_www_ip="192.168.0.10"           # jail's IP address
jail_www_devfs_enable="YES"          # mount devfs in the jail
jail_www_devfs_ruleset="www_ruleset" # devfs ruleset to apply to jail
```
The default startup of jails configured in rc.conf(5), will run the /etc/rc script of the jail, which assumes the jail is a complete virtual system. For service jails, the default startup command of the jail should be changed, by setting the jail_jailname_exec_start option appropriately.

Note: For a full list of available options, please see the rc.conf(5) manual page.

The /etc/rc.d/jail script can be used to start or stop a jail by hand, if an entry for it exists in rc.conf:

# /etc/rc.d/jail start www
# /etc/rc.d/jail stop www

A clean way to shut down a jail(8) is not available at the moment. This is because commands normally used to accomplish a clean system shutdown cannot be used inside a jail. The best way to shut down a jail is to run the following command from within the jail itself or using the jexec(8) utility from outside the jail:

# sh /etc/rc.shutdown

More information about this can be found in the jail(8) manual page.

Opensource - Information

Thursday, December 29, 2011

Format Specifiers

Monday, December 12, 2011

The Jail Subsystem

The Jail Subsystem

4.1 Architecture

4.1.1 Userland Code

4.1.1.1 Data Structures

4.1.1.2 Networking

4.1.1.3 Jailing The Process

4.1.2 Kernel Space

4.1.2.1 sysctls

4.1.2.2 jail(2) system call

Thursday, December 8, 2011

What is a Jail

What is a Jail

Creating and Controlling Jails

Creating and Controlling Jails