Flask Security Architecture Overview

The document summarizes an operating system security architecture called Flask that aims to provide flexibility in supporting diverse security policies. It does this by controlling the propagation of access rights, enforcing fine-grained access controls, and supporting revocation of previously granted access rights. The architecture is implemented in a prototype Flask microkernel-based operating system, which initial evidence suggests impacts performance and code complexity modestly while demonstrating the ability to flexibly support a wide variety of security policies.

Copyright
© Attribution Non-Commercial (BY-NC)
The Flask Security Architecture: System Support for Diverse Security Policies

Ray Spencer Secure Computing Corporation


Stephen Smalley, Peter Loscocco National Security Agency
Mike Hibler, David Andersen, Jay Lepreau University of Utah


Abstract
Operating systems must be flexible in their support for security policies, providing sufficient mechanisms for supporting the wide variety of real-world security policies. Such flexibility requires controlling the propagation of access rights, enforcing fine-grained access rights and supporting the revocation of previously granted access rights. Previous systems are lacking in at least one of these areas. In this paper we present an operating system security architecture that solves these problems. Control over propagation is provided by ensuring that the security policy is consulted for every security decision. This control is achieved without significant performance degradation through the use of a security decision caching mechanism that ensures a consistent view of policy decisions. Both fine-grained access rights and revocation support are provided by mechanisms that are directly integrated into the service-providing components of the system. The architecture is described through its prototype implementation in the Flask microkernel-based operating system, and the policy flexibility of the prototype is evaluated. We present initial evidence that the architecture's impact on both performance and code complexity is modest. Moreover, our architecture is applicable to many other types of operating systems and environments.

Introduction

A phenomenal growth in connectivity through the Internet has made computer security a paramount concern, but no single definition of security suffices. Different computing environments, and the applications that run in them, have different security requirements. Because any notion of security is captured in the expression of a security policy, there is a need for many different policies and even many types of policies [1, 43, 48]. To be generally acceptable, any computer security solution must be flexible enough to support this wide range of security policies. Even in the distributed environments of today, this policy flexibility must be supported by the security mechanisms of the operating system [32].

Supporting policy flexibility in the operating system is a hard problem that goes beyond just supporting multiple policies. The system must be capable of supporting fine-grained access controls on low-level objects used to perform higher-level functions controlled by the security policy. Additionally, the system must ensure that the propagation of access rights is in accordance with the security policy. Lastly, policies are not, in general, static. To cope with policy changes or dynamic policies, the system must have a mechanism for revoking previously granted access rights. Earlier systems have provided mechanisms that allow several security policies to be supported, but they are inadequate to generally support policy flexibility because they fail to address at least one of these three areas.

This paper describes an operating system security architecture that demonstrates the feasibility of policy flexibility. This is done by presenting its prototype implementation, the Flask microkernel-based operating system, that successfully overcomes these obstacles to policy flexibility. The cleaner separation of mechanism and policy specified in the security architecture enables a richer set of security policies to be supported with less policy-specific customization than has previously been possible. Flask includes a security policy server to make access control decisions and a framework in the microkernel and other object managers in the system to enforce those access control decisions. Although the prototype system is microkernel-based, the security mechanisms do not depend on a microkernel architecture and will easily generalize beyond it.

The resulting system provides policy flexibility. It supports a wide variety of policies. It controls the propagation of access rights by ensuring that the security policy is consulted for every access decision. Enforcement mechanisms, directly integrated into the service-providing components of the system, enable fine-grained access controls and dynamic policy support that allows the revocation of previously granted access rights. Initial performance results, as well as statistics on the scale and invasiveness of the code changes, indicate that the impact of policy flexible security on the system can be kept to a minimum.

The remainder of the paper begins by elaborating on the meaning of policy flexibility. After a discussion of why two popular mechanisms employed in systems to provide security are limiting to policy flexibility, some related work is described. The Flask architecture is then presented through a discussion of its prototype design and implementation. The paper concludes with an evaluation of the policy flexibility of the system, an assessment of the performance impact, and a discussion of the scale and invasiveness of the Flask changes.

This research was supported in part by the Defense Advanced Research Projects Agency in conjunction with the Department of the Army under contract DABT63-94-C-0058 and with the Air Force Research Laboratory, Rome Research Site, USAF, under agreement F30602-96-2-0269. It was also supported in part by the Maryland Procurement Office, contract MDA904-97-C-3047. Authors: {sds,pal}@[Link], {mike,danderse,lepreau}@[Link], saltalk@[Link] (Spencer).

Policy Flexibility

When first attempting to define security policy flexibility, it is tempting to generate a list of all known security policies and define flexibility through that list. This ensures that the definition will reflect a real-world view of the degree of flexibility. Unfortunately, this simplistic definition is unrealistic. Real-world security policies in computer systems are limited by the mechanisms currently provided in such systems, and it is not always clear how security policies enforced in the pencil-and-paper world translate to computer systems, if at all [3, 48]. As such, a better definition is needed.

It is more useful to define security policy flexibility by viewing a computer system abstractly as a state machine performing atomic operations to transition from one state to the next. Within such a model, a system could be considered to provide total security policy flexibility if the security policy can interpose atomically on any operation performed by the system, allowing the operation to proceed, denying the operation, or even injecting operations of its own. In such a system, the security policy can make its decisions using knowledge of the entire current system state, where the current system state can be considered to encompass the history of the system. Because it is possible to interpose on all access requests, it is possible to modify the existing security policy and to revoke any previously granted access.

This second definition more correctly captures the essence of policy flexibility, but practical considerations force a slightly more limited point of view. It is unlikely that a real system could base security policy decisions for all possible operations on the entire current system state. Instead, a more realistic approach is to identify that portion of the system state that is potentially security relevant and to control operations that affect or are affected by that portion of the state. The degree of flexibility in such a system will naturally depend upon the completeness of both the set of controlled operations and the portion of the current system state that is available to the security policy. Furthermore, the granularity of the controlled operations affects the degree of flexibility because it impacts the granularity at which sharing can be controlled.

This description of policy flexibility seems limiting in three ways. It allows some operations to proceed outside of the control of the security policy, restricts the operations that may be injected by the security policy, and permits some system state to exist beyond the scope of the security policy. In actuality, each of these apparent limitations is a desirable property since many of the internal operations and state of any system are of no apparent use or concern to any security policy. Section 6.1 will discuss how these limitations were interpreted for the Flask system.

A system that is policy flexible must be capable of supporting a wide variety of security policies. Security policies may be classified according to certain characteristics, including such things as: the need to revoke previously granted accesses, the type of input required to make access decisions, the sensitivity of policy decisions to external factors like history or environment, and the transitivity of access decisions [43, Sec. 6]. The remainder of this section focuses on revocation, which is the most difficult of these characteristics to support.

Since even the simplest security policies undergo change (e.g., as user authorizations change), a policy flexible system must be capable of supporting policy changes. Since policy changes may be interleaved with the execution of controlled operations, there is the risk that the system will enforce access rights according to an obsolete policy. Thus, there must be effective atomicity in the interleaving of policy changes and controlled operations.
The fundamental difficulty in achieving this atomicity is ensuring that previously granted permissions can be revoked as required by a policy change. When a permission is to be revoked, the system must ensure that any service controlled by the permission will no longer be provided unless the permission is later granted again. Revocation can be a very difficult property to satisfy because permissions, once granted, have a tendency to migrate throughout the system. The revocation mechanism must guarantee that all of these migrated permissions are indeed revoked.

A basic example of a migrated permission surfaces in Unix. The access decision for writing to a file is performed when that file is opened, and the granted permission is cached in the file description for efficient validation of write access during write operations. Revoking write access to that file in Unix only prevents future attempts to open the file with write access and has no effect on the migrated permissions in existing file descriptions. This revocation support may be insufficient to meet the needs of a security policy. This type of situation is not uncommon, and migrated permissions can be found in other places throughout a system including: capabilities, access rights in page tables, open IPC connections, and operations currently in progress. More complicated systems are likely to yield more places to which permissions can migrate.

In most cases, revocation can be accomplished simply by altering a data structure. However, it is more complicated to revoke a permission when there is an operation in progress that has checked the permission already. The revocation mechanism must be able to identify all in-progress operations affected by such revocation requests and deal with each of them in one of three possible ways. The first is to abort the in-progress operation, returning an error status. Alternately, it could be restarted, allowing another check for the retracted permission. The third option is just to wait for the operation to complete on its own. In general, only the first two are safe. Only when the system can guarantee that the operation can complete without causing the revocation request to block indefinitely (e.g., if all appropriate data structures have already been locked and there are no external dependencies) may the third option be taken. This is critical because blocking the revocation effectively denies the revocation request and causes a security violation.
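The Unix example of a migrated permission given in this section can be reproduced with a short script. The following is a minimal sketch, assuming a POSIX system and Python's standard `os` and `tempfile` modules; the function name `write_after_revocation` is ours and purely illustrative. It opens a file for writing, removes write permission with `chmod`, and then writes through the descriptor that was opened earlier.

```python
import os
import tempfile


def write_after_revocation() -> bytes:
    """Write through an already-open descriptor after write access
    to the file has been removed, and return the file's contents."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.chmod(path, 0o600)
        # The access decision is made here, when the file is opened.
        writer = open(path, "wb", buffering=0)
        # "Revoke" write access: only later opens for writing are affected.
        os.chmod(path, 0o400)
        # The permission cached in the open file description is still honored.
        writer.write(b"written after revocation")
        writer.close()
        with open(path, "rb") as reader:
            return reader.read()
    finally:
        # Restore permissions so the temporary file can be cleaned up.
        os.chmod(path, 0o600)
        os.remove(path)


if __name__ == "__main__":
    print(write_after_revocation())
```

On a typical Unix system the write succeeds even though the mode change would cause a fresh open for writing by an unprivileged user to be refused. This is the behavior the text describes as insufficient revocation support.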

Insufficiency of Popular Mechanisms

This section discusses two popular mechanisms that are often employed to provide security to systems and the reasons why both are limiting to policy flexibility in normal usage. However, each has benefits despite its limitations, and both can be used within Flask in restricted ways that allow some of their benefits without incurring their limitations.

3.1 Capability-Based Systems

The goal of a single operating system mechanism capable of supporting a wide range of security policies is not a new goal. The Hydra operating system developed in the 1970s separated its access control mechanisms from the definition of its security policy [29, 52]. Hydra was a capability-based system, although the developers of the system recognized the limitations of a simple capability model and introduced several enhancements to the basic capability mechanisms. The Hydra approach was taken even further by the KeyKOS [40] and EROS [47] systems. Though popular, capability mechanisms are poorly suited to providing policy flexibility, because they allow the holder of a capability to control the direct propagation of that capability, whereas a critical requirement for supporting security policies is the ability to control the propagation of access rights in accordance with the policy. The enhancements introduced by Hydra and KeyKOS are intended to limit such propagation, but the resulting systems still generally only support the specific policies they were designed to satisfy, at the cost of significant complexity that diminishes the attraction of the capability model in the first place.

Primarily with an interest in solving the problem of supporting a multilevel security policy within a capability-based system, a few capability-based systems (e.g., SCAP [25], ICAP [18], Trusted Mach [4]) introduced mechanisms that validated every propagation or use of a capability against the security policy. Kain and Landwehr [23] developed a taxonomy to characterize such systems. In these systems, the simplicity of the capability mechanism is retained, but capabilities serve only as a least privilege mechanism rather than a mechanism for recording and propagating the security policy. This is a potentially valuable use of capabilities. However, the designs for these systems do not define the mechanisms by which the security policy is queried to validate capabilities, and those mechanisms are essential to providing policy flexibility. The Flask architecture described in this paper could be employed to provide the security decisions needed to validate the capabilities in these systems. In the Flask prototype, the architecture is used in exactly this way.

3.2 Intercepting Requests

A common approach used to add security to a system is to intercept service requests or to otherwise interpose a layer of security code between all applications and the operating system (e.g., Kernel Hypervisors [37], SPIN [20]), or between particular applications or sets of applications (e.g., L3/L4 [30], Lava [22], KeySAFE [28]). This may be done in capability systems or non-capability systems, and when applied to an operating system the security layer may lie within the operating system itself (as in Spring [36]) or in a component outside of the operating system to which all requests are redirected (as in Janus [17]).

However, this approach has some serious limitations. In order to add security by intercepting requests, the existing functional interface must expose all abstractions and information flows that the security policy wishes to control. To avoid maintaining redundant state in the access control layer, the functional interface must ensure that all security-relevant attributes are either directly available as parameters or easily derived from parameters. A policy that requires the use of some internal state of the object manager as an input to the decision cannot be implemented without either changing the manager to export the state or, if possible, replicating the state management in the enforcer itself. The level of abstraction provided by the interface may be inappropriate or may cause difficulties in guaranteeing uniqueness or atomicity. For example, typical name-based calls suffer from issues of aliasing, multi-component lookups, and preserving the tranquility of the name-to-object mapping from the time-of-check to the time-of-use. Finally, this approach is limited in that the security layer can only affect the operation of the system as requests pass through it. Hence, it is often impossible for the system to reflect subsequent changes to the security policy, in particular, the revocation of migrated permissions.

As was the case with capabilities, implementing access control within a security layer is a good approach when these disadvantages can be avoided through the use of other mechanisms. However, it is important to recognize that other mechanisms are necessary, often mechanisms that are more invasive than intercepting requests, in order to provide any degree of flexibility in supporting security policies.
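The aliasing problem mentioned for name-based calls can be illustrated with a small experiment. The sketch below is not taken from any of the cited systems. The two check functions are hypothetical stand-ins, one for an interposition layer that decides from the name in the request, the other for a check made against the object itself. It assumes a POSIX file system where symbolic links can be created.

```python
import os
import tempfile


def name_based_check(path, denied_names):
    """Hypothetical interposition layer: decides from the requested name."""
    return os.path.abspath(path) not in denied_names


def object_based_check(path, denied_objects):
    """Hypothetical object-level check: decides from the object's identity
    (device and inode number), which is the same for every alias."""
    st = os.stat(path)  # os.stat follows symbolic links
    return (st.st_dev, st.st_ino) not in denied_objects


def demo():
    with tempfile.TemporaryDirectory() as directory:
        secret = os.path.join(directory, "secret")
        alias = os.path.join(directory, "alias")
        with open(secret, "w") as f:
            f.write("x")
        os.symlink(secret, alias)  # a second name for the same object

        denied_names = {os.path.abspath(secret)}
        st = os.stat(secret)
        denied_objects = {(st.st_dev, st.st_ino)}

        return (
            name_based_check(secret, denied_names),   # denied by name
            name_based_check(alias, denied_names),    # alias slips through
            object_based_check(alias, denied_objects) # denied by identity
        )


if __name__ == "__main__":
    print(demo())
```

The name-based check denies the original name but permits the alias, while the identity-based check denies both. The sketch addresses aliasing only: a separate `stat` followed by a later open would still leave a window between time-of-check and time-of-use.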

Related Work

The previous section described the relationship between Flask and a variety of efforts that involved capability-based systems or the interception of requests. This section describes the relationship between Flask and other efforts not previously mentioned. We focus on the research most directly related to Flask, although there are many other efforts with some relation to our work.

The security architecture of the Flask system is derived from the architecture of our previous prototype system DTOS [35], which had similar goals. However, while the DTOS security mechanisms were independent of any particular security policy, the mechanisms were not sufficiently rich to support some policies [43], especially dynamic security policies.

At the highest level of abstraction, the flexible security model for Flask is consistent with the Generalized Framework for Access Control (GFAC) [2]. However, the GFAC model assumes that all controlled operations in the system are performed in the same atomic operation in which the policy is consulted, which is very difficult to achieve in a practical system and is the primary obstacle that the Flask system has had to overcome.

The specific issue of revocation is not a new issue in operating system design, although it has received surprisingly little recognition. Multics [39] effectively provided immediate revocation of all memory permissions by invalidating segment descriptors. Redell and Fabry [42], Karger [24] and Gong [18] all describe approaches for revoking previously granted capabilities, though none were actually implemented. Spring [49] implemented a capability revocation technique, though only the capabilities were revoked, not migrated permissions. Revocation of memory permissions is naturally provided by microkernel-based systems with external paging support, such as Mach [31], though revocation is not extended to other permissions. DTOS provided the security server with the ability to remove permissions previously granted and stored in the microkernel's permission cache. However, except for memory permissions where Mach's mechanisms could be used, DTOS did not provide for revocation of migrated permissions [38].

The Flask prototype is implemented within a microkernel-based operating system with hardware-enforced address space separation between processes. Several recent efforts (e.g., SPIN [5], VINO [46] and the Java protection models in [50]) have presented software-enforced process separation. The distinction is essentially irrelevant for the Flask architecture. It is essential that some form of separation between processes be provided, but the particular mechanism is not mandated by the Flask architecture.

The general applicability of key aspects of the Flask architecture to other systems was concretely demonstrated by the adoption of the DTOS architecture in the security framework of SPIN [20]. Indeed, we believe the abstract Flask architecture, and the lessons it teaches, can be applied to software other than operating systems, such as middleware or distributed systems, although of course vulnerability to insecurities in the underlying operating systems would remain.

Flask Design and Implementation

This section denes the components of the Flask secu rity architecture and identies the requirements on each component necessary to meet the goals of the system. The Flask security architecture is described here in the context of its implementation within a microkernel-based multiserver operating system. However, the security ar chitecture only requires that the operating system include a reference monitor [16, Ch. 10]. In particular, the ar chitecture requires the completeness and isolation prop erties, although veriability is also ultimately necessary for condence in any implementation of the architecture. The Flask prototype was derived from the Fluke microkernel-based operating system [14]. The Fluke mi crokernel is especially well-suited for implementing the Flask architecture due to its lack of global resources [14] and the atomic properties of its API [13]. However, the original Fluke system was capability-based and was not in itself adequate to meet the requirements of the Flask architecture. The remainder of this section starts by providing an

Figure 1:

The Flask architecture. Components which enforce secu rity policy decisions are referred to as object managers. Components which provide security decisions to the object managers are referred to as security servers. The decision making subsystem may include other components such as administrative interfaces and policy databases, but the interfaces among these components are policy-dependent and are therefore not addressed by the architecture.

security policy. Object managers are responsible for dening a mech anism for assigning labels to their objects. A control policy, which species how security decisions are used to control the services provided by the object manager, must be dened and implemented by each object man ager. This control policy addresses threats in the most general fashion by providing the security policy with control over all services provided by the object manager and by permitting these controls to be congurable based on threat. Each object manager must dene handling routines which are called in response to policy changes. For all uses of polyinstantiation, each object manager must dene the mechanism by which the proper instanti ation of a resource is chosen.

5.2

General Support Mechanisms

overview of the Flask architecture. Then, it describes general support mechanisms required for the basic Flask architecture. It discusses the specic changes required for the microkernel. It explains how the complications caused by the need for revocation were overcome. This section ends by describing the prototype security server.

This section describes general support mechanisms that were introduced for all of the object managers in order to support policy exibility. Despite the simplic ity of the Flask architecture, some subtleties arise in the implementation, as will be discussed below. 5.2.1 Object Labeling All objects that are controlled by the security policy are also labeled by the security policy with a set of security attributes, referred to as a security context. A fundamental issue in the architec ture is how the association between objects and security contexts is maintained. The simplest solution would be to dene a single policy-independent data type which is part of the data associated with each object. However, no single data type is well-suited to all of the differing ways in which labels are used in a system. The Flask architec ture addresses these conicting needs by providing two policy-independent data types for labeling. A security context, the rst policy-independent data type, is a variable-length string which can be interpreted by any application or user with an understanding of the security policy. A security context might consist of sev eral attributes, such as a user identity, a classication level, a role and a type enforcement [6] domain, but this depends on the particular security policy. As long as it is treated as an opaque string, a security context can be handled by an object manager without compromising the policy exibility of the object manager. However, using security contexts for labeling and policy decision lookups would be inefcient and would increase the like lihood of policy-specic logic being introduced into the object managers. The second policy-independent data type, the secu rity identier (SID), is dened by Flask to be a xedsize value which can be interpreted only by the security server and is mapped by the security server to a particu-

5.1

Architecture Overview

The Flask security architecture [44], as shown in Fig ure 1, describes the interactions between subsystems that enforce security policy decisions and a subsystem which makes those decisions, and the requirements on the com ponents within each subsystem. The primary goal of the architecture is to provide for exibility in the security policy by ensuring that these subsystems always have a consistent view of policy decisions regardless of how those decisions are made or how they may change over time. Secondary goals for the architecture include appli cation transparency, defense-in-depth, ease of assurance, and minimal performance impact. The Flask security architecture provides three primary elements for object managers. First, the architecture provides interfaces for retrieving access, labeling and polyinstantiation decisions from a security server. Ac cess decisions specify whether a particular permission is granted between two entities, typically between a subject and an object. Labeling decisions specify the security attributes to be assigned to an object. Polyinstantiation de cisions specify which member of a polyinstantiated set of resources should be accessed for a particular request. Second, the architecture provides an access vector cache (AVC) module that allows the object manager to cache access decisions to minimize the performance overhead. Third, the architecture provides object managers the abil ity to register to receive notications of changes to the

need to uniquely distinguish subjects and objects of cer tain classes even if they are created in the same security context. For such policies, the SID must be computed from the security context and a unique identier chosen by the security server. 5.2.2 Client and Server Identication Object man agers must be able to identify the SID of a client making a request when this SID is part of a security decision. It is also useful for clients to be able to identify the SID of a server to ensure that a service is requested from an ap propriate server. Hence, the Flask architecture requires that the underlying system provide some form of client and server identication for inter-process communica tion (IPC). However, this feature is not complete without providing the client and server a means of overriding their identication. For instance, the need of a subject to limit its privileges when making a request on behalf of another subject is one justication for capability-based mechanisms [21]. In addition to limiting privileges, overriding the actual identication can be used to provide anonymity in communications or to allow for transparent interposition, such as through a network IPC server con necting the client and server in a distributed system [11]. The Flask microkernel provides this service directly as part of IPC processing, rather than relying upon compli cated and potentially expensive external authentication protocols such as those in Spring and the Hurd [7]. The microkernel provides the SID of the client to the server along with the clients request. The client can identify the SID of the server by making a kernel call on the capabil ity to be used for communication. When making an IPC request, the client can specify a different SID as its effec tive SID to override its identication to the server. The server can also specify an effective SID when preparing to receive requests. 
In both cases, permission to specify a particular effective SID is decided by the security server and enforced by the microkernel. Thus, the Flask mi crokernel supports the basic access control and labeling operations required for the architecture and it provides the exibility needed for least privilege, anonymity or transparent interposition. 5.2.3 Requesting and Caching Security Decisions In the simplest implementation, the object manager can make a request to the security server every time a secu rity decision is needed. However, to alleviate the perfor mance impact of communicating with the security server for each decision and of the computation of the decision within the security server, the Flask architecture provides caching of security decisions within the object manager. The caching mechanisms in Flask provide much more than simply caching individual security decisions. The access vector cache (AVC) module, which is a common

Figure 2:

Object labeling in Flask. A client requests the creation of a new object from an object manager, and the microkernel supplies the object manager with the SID of the client. The object manager sends a request for a SID for the new object to the security server, with the SID of the client, the SID of a related object and the object type as parameters. The security server consults the labeling rules in the policy logic, determines a security context for the new object, and returns a SID that corresponds to that security context. Finally, the object manager binds the returned SID to the new object.

lar security context. Possession or knowledge of a SID for a given security context does not grant any authoriza tion for that security context. The SID mapping cannot be assumed to be consistent across executions (reboots) of the security server nor across security servers on dif ferent nodes. Consequently, SIDs may be lightweight; in the implementation, SIDs are simply 32-bit integers. There is no specied internal structure to a SID; any in ternal structure is known only by the security server. The SID allows most object manager interactions to be inde pendent of not just the content but even the format of a security context, simplifying object labeling and the in terfaces that coordinate the security policy between the security server and object managers. However, in some cases, such as labeling persistent objects or labeling ob jects which are exported to other nodes, object managers must handle security contexts. This is described further in the discussion of the le server and network server in Section A.1 and Section A.2. When an object is created, it is assigned a SID that rep resents the security context in which the object is created. This context typically depends upon the client requesting the object creation and upon the environment in which it is created. For example, the security context of a newly created le is dependent upon the security context of the directory in which it is created and the security context of the client that requested its creation. Since the computa tion of a security context for a new or transformed object may involve policy-specic logic, it cannot be performed by the object manager itself. The labeling of a new object is depicted in Figure 2. For some security policies, such as an ORCON policy [19, 34], the security policy may

Figure 3: Requesting and caching security decisions in Flask. A client requests the modification of an existing object from an object manager. The object manager queries its access vector cache (AVC) module for an access ruling for the (client SID, object SID, requested permissions) triple. If no valid entry exists, then the AVC module sends an access query to the security server. The security server consults the access rules in the policy logic, determines an access ruling, and returns the access ruling to the AVC module.
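The query path in this caption can be sketched as follows. This is a simplified illustration, not the Flask AVC: the class and method names, the permission bits, and the dictionary-based policy are assumptions. It does model one behavior from the architecture, namely that the security server answers with a whole vector of related permissions, so later checks on the same SID pair are served from the cache.

```python
READ, WRITE, APPEND = 1, 2, 4   # hypothetical file permissions, one bit each


class SecurityServer:
    """Toy security server backed by a fixed rule table."""

    def __init__(self, rules):
        self.rules = rules      # (source SID, target SID) -> allowed bitmask
        self.queries = 0        # counts how often the server is consulted

    def compute_access_vector(self, ssid, tsid):
        self.queries += 1
        return self.rules.get((ssid, tsid), 0)


class AccessVectorCache:
    """Toy AVC: caches one access vector per (source SID, target SID)."""

    def __init__(self, server):
        self.server = server
        self.entries = {}

    def has_permission(self, ssid, tsid, requested):
        key = (ssid, tsid)
        if key not in self.entries:
            # Cache miss: ask the security server and keep the full vector.
            self.entries[key] = self.server.compute_access_vector(ssid, tsid)
        allowed = self.entries[key]
        # Every requested bit must be present in the allowed vector.
        return (allowed & requested) == requested


server = SecurityServer({(7, 9): READ | WRITE})
avc = AccessVectorCache(server)
first = avc.has_permission(7, 9, READ)      # miss, consults the server
second = avc.has_permission(7, 9, WRITE)    # hit, answered from the cache
third = avc.has_permission(7, 9, APPEND)    # hit, and denied
```

After these three checks the toy server has been consulted only once, since the single cached vector answers all of them.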

library shared by the object managers, provides for the coordination of the policy between the object manager and the security server. This coordination addresses both requests from the object manager for policy decisions and requests from the security server for policy changes. The first of these is discussed in this section, while the second is discussed in Section 5.4.

For a typical controlled operation in Flask, an object manager must determine whether a subject is allowed to access an object with some permission or set of permissions. The sequence of requesting and caching security decisions is depicted in Figure 3. To minimize the overhead of security computations and requests, the security server can provide more decisions than requested, and the AVC module will store these decisions for future use. When a request for a security decision is received by the security server, it will return the current state of the security policy for a set of permissions with an access vector. An access vector is a collection of related permissions for the pair of SIDs provided to the security server. For instance, all file access permissions are grouped into a single access vector.

5.2.4 Polyinstantiation Support

A security policy may need to restrict the sharing of a fixed resource among clients by polyinstantiating the resource and partitioning the clients into sets which can share the same instantiation of the resource. For example, multi-level secure Unix systems frequently partition the /tmp directory, maintaining separate subdirectories for each security level [51]; the corresponding solution for Flask is discussed in Section A.1. A similar issue arises with the TCP or UDP port spaces, as discussed in Section A.2.

Figure 4: Polyinstantiation in Flask. A client requests the creation of a new object from an object manager, and the microkernel supplies the object manager with the SID of the client. The object manager sends a request for a SID for the member object to the security server, with the SID of the client, the SID of the polyinstantiated object and the object type as parameters. The security server consults the polyinstantiation rules in the policy logic, determines a security context for the member, and returns a SID that corresponds to that security context. Finally, the object manager selects a member based on the returned SID, and creates the object as a child of the member.

The Flask architecture supports polyinstantiation by providing an interface by which the security server may identify which instantiation can be accessed by a particular client. Both the client and the instance are identified by SIDs. The instantiations are referred to as members. The general sequence of selecting a member is depicted in Figure 4.
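Member selection can be illustrated with the /tmp example. The sketch below is an assumption-laden toy: the names, the per-level rule, and the hidden subdirectory layout are invented for illustration and are not taken from the Flask file server. What it shows is the division of labor, where the security server maps (client SID, polyinstantiated object SID) to a member SID and the object manager only maps that member SID to a concrete instance.

```python
class SecurityServer:
    """Toy server whose polyinstantiation rule keys members by client level."""

    def __init__(self, level_of_client, member_sids):
        self.level_of_client = level_of_client  # client SID -> security level
        self.member_sids = member_sids          # (poly SID, level) -> member SID

    def member_sid(self, client_sid, poly_sid):
        level = self.level_of_client[client_sid]
        return self.member_sids[(poly_sid, level)]


class FileServer:
    """Toy object manager that resolves names inside the selected member."""

    def __init__(self, server, member_paths):
        self.server = server
        self.member_paths = member_paths        # member SID -> directory path

    def resolve(self, client_sid, poly_sid, name):
        member = self.server.member_sid(client_sid, poly_sid)
        return f"{self.member_paths[member]}/{name}"


TMP_SID = 100                                   # SID of the polyinstantiated /tmp
server = SecurityServer(
    level_of_client={1: "secret", 2: "unclassified"},
    member_sids={(TMP_SID, "secret"): 101, (TMP_SID, "unclassified"): 102},
)
fs = FileServer(server, {101: "/tmp/.secret", 102: "/tmp/.unclassified"})
secret_path = fs.resolve(1, TMP_SID, "scratch.txt")
public_path = fs.resolve(2, TMP_SID, "scratch.txt")
```

Two clients asking for the same name under /tmp end up in different members, and neither the clients nor the file server need to know why.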

5.3 Microkernel-specific Features

The previous sections described the security functions that are common to all of the Flask object managers. In this section, we discuss the specific features that have been added to the microkernel. Support for revocation, however, will be discussed separately in Section 5.4. The specific features that were added to some of the other Flask object managers are described in Appendix A.

Due to the requirements of Fluke's architecture, each active kernel object is associated with a small chunk of physical memory [14]. Though memory is not itself an object within the microkernel, the microkernel provides the base service for memory management and binds a SID to each memory segment. The SID of each kernel object is identical to the SID of the memory segment with which it is associated. This relationship between the label of memory and the label of kernel objects associated with that memory permits the Flask microkernel controls to leverage the existing protection model of Fluke, rather than introducing an orthogonal protection model as in DTOS. However, it also creates a potential loss of labeling flexibility, since the memory allocation granularity is much coarser than the allocation granularity for kernel objects.

Flask provides direct security policy control over the propagation of memory access modes by associating a Flask permission with each mode, based on the SID of the address space and the SID of the memory segment. These memory access modes also act as capabilities to kernel objects associated with the memory. During the initial attempt to access mapped memory, the microkernel verifies that the security policy explicitly grants permission for each requested access mode. Memory permissions cannot be computed at the level of any interface in Fluke, and are computed instead during page faults; hence, these controls provide an example where merely intercepting requests would be insufficient. Since the SID of a memory segment is not allowed to change, the Flask permissions need only be revalidated if a policy change occurs, as discussed in Section 5.4.

In Fluke, a port reference serves as a capability for performing an IPC to a server thread waiting on the corresponding port set. Control over propagation in Fluke may be performed through typical interposition techniques. In contrast, Flask provides direct control over the use of such port references by only allowing an IPC connection between two subjects if the appropriate permissions shown in Table 1 are satisfied. These direct controls permit the policy to regulate the use of capabilities, addressing the concerns of Section 3.1.

An interesting aspect of the Flask microkernel is the controls that are imposed on relationships between objects. In Fluke, these relationships are defined through the use of object references (e.g. the state of a thread contains an address space reference). Unfortunately, these references are used in many different ways, in contrast to the way in which read and write access modes are used to control access to kernel objects. For example, a reference to an address space may be used to map memory into the space or to export memory from the space. Hence, Flask introduces separate controls over these relationships and provides finer-grained control than Fluke. Some of the controls simply require the two objects to have equal SIDs, while others involve explicit permissions, as described in detail in [44, Sec. 3].

| SOURCE | TARGET | PERMISSION |
|---|---|---|
| Client SID | Effective Client SID | SpecifyClient |
| Server SID | Effective Server SID | SpecifyServer |
| Effective Client SID | Effective Server SID | Connect |

Table 1: Permission requirements for an IPC connection to exist. The specify permissions are only required when a subject specifies an effective SID. If a subject does not specify an effective SID, then its effective SID is equal to its actual SID.

5.4 Revocation Support Mechanisms

The most difficult complication in the Flask architecture is that the object managers effectively keep a local copy of certain security decisions, both explicitly in an access vector cache and implicitly in the form of migrated permissions. Therefore a change to the security policy requires coordination between the security server and the object managers to ensure that their representations of the policy are consistent. This section is devoted to a more detailed discussion of the requirements on the components of the architecture during a change in security policy.

The need for effective atomicity stated in Section 2 is achieved by imposing two requirements on the system. The first is that after completion of a policy change, the behavior of the object manager must reflect that change. No further controlled operations requiring a revoked permission can be performed without a subsequent policy change. The second requirement is that object managers must complete policy changes in a timely manner.

The first requirement is only a requirement on the object managers, but it results in effective atomicity of system-wide policy when coupled with a well-defined protocol between the security server and the object managers. This protocol involves three steps. First, the security server notifies all object managers that may have been previously provided any portion of the policy that has changed. Second, each object manager updates its internal state to reflect the change. Finally, each object manager notifies the security server that the change is complete. Sequence numbers are used to address the interleaving of messages providing policy decisions to the object managers and messages requesting changes to the policy. Both the synchronization protocol, which has been implemented, and an alternative approach based on theories of database consistency are described in [45, Sec. 6].
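The three-step protocol can be sketched as below. The text does not spell out how sequence numbers are applied, so the rule used here (an object manager discards any decision stamped with a sequence number older than the latest policy change it has applied) is only one plausible reading, and all class and method names are hypothetical.

```python
class ObjectManager:
    """Toy object manager holding cached decisions as permission bitmasks."""

    def __init__(self):
        self.cache = {}          # (source SID, target SID) -> allowed bitmask
        self.latest_seqno = 0    # newest policy change applied so far

    def store_decision(self, ssid, tsid, allowed, seqno):
        # Assumed rule: ignore a decision computed before a policy change
        # that this manager has already applied.
        if seqno < self.latest_seqno:
            return False
        self.cache[(ssid, tsid)] = allowed
        return True

    def revoke(self, ssid, tsid, perms, seqno):
        # Step 2: update internal state to reflect the change.
        self.latest_seqno = max(self.latest_seqno, seqno)
        if (ssid, tsid) in self.cache:
            self.cache[(ssid, tsid)] &= ~perms
        # Step 3: report completion back to the security server.
        return seqno


class SecurityServer:
    """Toy security server that drives a revocation to completion."""

    def __init__(self, managers):
        self.managers = managers
        self.seqno = 0

    def revoke(self, ssid, tsid, perms):
        self.seqno += 1
        # Step 1: notify every object manager that may hold the permission.
        acks = [m.revoke(ssid, tsid, perms, self.seqno) for m in self.managers]
        # The change counts as complete only once every manager has answered.
        return all(ack == self.seqno for ack in acks)


om = ObjectManager()
ss = SecurityServer([om])
om.store_decision(1, 2, 0b11, ss.seqno)           # both permissions granted
completed = ss.revoke(1, 2, 0b10)                 # revoke the second permission
stale_accepted = om.store_decision(1, 2, 0b11, 0) # late reply from before the change
```

In this toy run the late reply is rejected, so the revoked bit cannot reappear in the cache after the security server has been told the change is complete.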
The alternative approach was drawn from a model of transactional consistency, but solutions related to distributed shared memory consistency may also serve as useful models.

The last step of the protocol is essential to support policies that require policy changes to occur in a particular order. For instance, a policy may require that certain permissions be revoked prior to granting new permissions. The security server cannot consider a policy change to be completed until it is completed by all affected object managers. This allows effective atomicity of system-wide policy changes since the security server can determine when the policy change is effective for all relevant object managers.

This protocol does not impose an undue burden in state management on the security server. The number of object managers in many systems is relatively small and the only transactions which require additional state are those where an object manager initially issues an access query for a permission that is granted. Furthermore, the security server may track permission grantings at various granularities to reduce the amount of state recorded by the security server.

The form of atomicity provided by the protocol is reasonable because of the timeliness requirement imposed on the object managers. It must not be possible for the revocation request to be arbitrarily delayed by actions of untrusted software. Each object manager must be capable of updating its own state without being indefinitely blocked by its clients. When this timeliness requirement is generalized for system-wide policy changes, it also involves two other elements of the system: the microkernel, which must provide timely communication between the security server and object managers, and the scheduler, which must provide the object manager with CPU resources.

The general AVC module handles the initial processing of all policy change requests and updates the cache appropriately. The only other operation that must be performed is revocation of migrated permissions. After updating the cache, the AVC module invokes any callbacks which have been registered by the object manager for revoking migrated permissions. The file server supports revocation of permissions which have migrated into file description objects, but currently lacks support for interrupting in-progress operations. Complete callbacks for revoking migrated permissions have currently been implemented only within the Flask microkernel, as shown in Figure 5.

Two properties of the Fluke API simplify revocation in the microkernel: it provides prompt and complete exportability of thread state and guarantees that all kernel operations are either atomic or cleanly subdivided into user-visible atomic stages [13]. The first property permits the kernel revocation mechanism to assess the kernel's state, including operations currently in progress. The revocation mechanism may safely wait for operations currently in progress to complete or restart due to the promptness guarantee. The second property permits Flask permission checks to be encapsulated in the same atomic operation as the service that they control, thereby avoiding any occurrences of the service after a revocation request has completed.

Figure 5: A revocation of microkernel permissions. Upon receipt of a revocation request from the security server, the microkernel first updates its access vector cache, and then proceeds to examine thread and memory state and perform revocations as necessary. The atomic properties of Fluke were leveraged to ease implementation of the revocation mechanism.

5.5 The Security Server

As stated earlier, the security server is required to provide security policy decisions, to maintain the mapping between SIDs and security contexts, to provide SIDs for newly created objects, to provide SIDs of member objects, and to manage object manager access vector caches. Additionally, most security policy server implementations will provide functionality for loading and changing policies. A security server might also benefit from providing its own caching mechanism, in addition to those contained in the object managers, to hold the results of access computations. This may prove advantageous because the security server can improve its response time by using cached results from previous, potentially expensive, access computations requested by any client.

The security server also is typically a policy enforcer over its own services. First of all, if the security server provides interfaces for changing the policy, it must enforce the policy over which subjects can access this interface. Second, it may limit the subjects that can request policy information. This is especially important in a policy where permission requests alter the policy, such as a dynamic conflict of interest policy. If the confidentiality of the policy information is important, then object managers that cache policy information must also be responsible for its protection.

In a distributed or networked environment, it is tempting to suggest that the security server of each node merely act as a local cache of the environment's policy. However, to support heterogeneous policy environments, it is desirable for each node to have its own security server with a locally defined policy component, with some degree of coordination at a higher level. Even in a homogeneous policy environment, a core portion of the security policy must be locally defined for the node in order to securely bootstrap the system into a state where it may consult the environment's policy. The development of a distributed security server for coordinating the per-node security servers within an environment remains as future work. For many policies, the security server should easily be scalable and replicable, since most policies will require little interaction among the individual nodes' security servers. However, some security policies, such as history-based policies, may require greater coordination among the security servers.

The security policy encapsulated by the Flask security server is defined through a combination of its code and a policy database. Any security policy that can be expressed through the prototype's policy database language may be implemented simply by altering the policy database. Supporting additional security policies requires changes to the security server's internal policy framework through code changes or by completely replacing the security server. It is important to note that even security policies that require altering the code of the security server do not require any changes to the object managers.

The current Flask security server prototype implements a security policy that is a combination of four subpolicies: multi-level security (MLS) [3], type enforcement [6], identity-based access control and dynamic role-based access control (RBAC) [10]. The access decisions provided by the security server must meet the requirements of each of these four subpolicies. The policy logic for the multi-level security policy is largely defined through the security server code, aside from the labels themselves. The policy logic for the other subpolicies is primarily defined through the policy database language. These four subpolicies are not all the policies supported by the architecture or its implementation in Flask. They were chosen for implementation in the security server prototype in order to exercise the major features of the architecture.

Because the Flask effort has focused on policy enforcement mechanisms and the coordination between these mechanisms and the security policy, the set of additional security policies that can be implemented solely through changes to this policy database is currently limited. This is simply a shortcoming of the current prototype rather than a characteristic of the architecture. We have yet to explore the development of a more expressive policy specification language or policy configuration tool for Flask. Such a tool would facilitate the definition of new security policies in the current prototype. There have been several recent projects that do consider flexible tools for configuring the security policies (e.g., Adage [53], ASP [8], Dynamic DTE [15], ARBAC [41]) that nicely complement the Flask effort by potentially providing ways to manage the mechanisms provided by Flask.
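One simple way to read the requirement that a decision "must meet the requirements of each of these four subpolicies" is as an intersection of per-subpolicy access vectors. The sketch below illustrates that reading only. How the prototype actually combines its subpolicies is not stated here, and the toy subpolicy functions, SIDs and permission bits are invented for the example.

```python
from functools import reduce
from operator import and_

READ, WRITE, EXECUTE = 1, 2, 4
ALL = READ | WRITE | EXECUTE


def combined_access_vector(subpolicies, ssid, tsid):
    """Grant a permission only if every subpolicy grants it."""
    vectors = (policy(ssid, tsid) for policy in subpolicies)
    return reduce(and_, vectors, ALL)


# Hypothetical subpolicies, each returning the bitmask it would allow.
def mls(ssid, tsid):
    return READ if ssid >= tsid else WRITE      # toy stand-in for MLS rules


def type_enforcement(ssid, tsid):
    return READ | WRITE                          # toy rule: never execute


def identity(ssid, tsid):
    return ALL                                   # toy rule: no restriction


allowed = combined_access_vector([mls, type_enforcement, identity], 5, 3)
```

With this shape, adding a further subpolicy can only narrow the resulting vector, never widen it.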

6 Results

This section describes the results of the effort in three areas: policy flexibility, performance impact, and the scale and invasiveness of the code changes.

6.1 Flexibility in the Flask Implementation

We evaluate the policy flexibility that the system provides based upon the description of policy flexibility in Section 2. The most important criterion discussed in that section was atomicity, i.e., the ability of the system to ensure that all operations in the system are controlled with respect to the current security policy. Section 5.4 described how the Flask architecture provides an effective atomicity for policy changes and how the microkernel in particular achieves atomicity for policy changes relating to its objects. Achieving this atomicity for the other object managers remains to be done.

Section 2 also identifies three other potential weaknesses in policy flexibility. The first is the range of operations that the system can control. As described in Section 5.3 and Appendix A, each Flask object manager defines permissions for all services which observe or modify the state of its objects and provides fine-grained distinctions among its services. The advantages of the Flask controls over merely intercepting requests were clearly illustrated.

The second potential source of inflexibility is the limitation on the operations that may be invoked by the security policy. In Flask, the security server may use any of the interfaces provided by the object managers. Furthermore, the Flask architecture provides the security server with the additional interfaces provided by the AVC module in each object manager. However, this is obviously not the same as having access to any arbitrary operation. For example, if the security policy requires the ability to invoke an operation which is strictly internal to some object manager, the object manager would have to be changed to support that policy.

The third potential source of inflexibility is the amount of state information available to the security policy for making security decisions. Based upon our previous analysis of policies for DTOS, the provision of a pair of SIDs is sufficient for most policies [43, Sec. 6.3]. However, the limitation to two SIDs is a potential weakness in the current Flask design. The description of the Flask file server in Section A.1 identifies one case where a permission ultimately depends upon three SIDs and must be reduced to a collection of permissions among pairs of SIDs.

An even worse situation is if the security decision should depend upon a parameter to a request that is not represented as a SID. Consider a request to change the scheduling priority of a thread. Here the security policy must certainly be able to make a decision based in part on the requested priority. This parameter can be considered within the current implementation by defining separate permissions for some classes of changes; for instance, increasing the priority can be a different permission than decreasing the priority. But it is not practical to define a separate permission for every possible change to the priority. This is not a weakness in the architecture itself, and the design could easily be changed to allow for a security decision to be represented as a function of arbitrary parameters. However, the performance of the system would certainly be impacted by such a change, because an access vector cache supporting arbitrary parameters would be much more complicated than the current cache. A better solution may be to expand the interface only for those specific operations that require decisions based upon more complex parameters, and to provide separate caching mechanisms for those decisions. The Flask prototype provides a research platform for exploring the need for a richer interface to better support policy flexibility.
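The priority example can be made concrete with a short sketch. The permission names and the convention that a larger number means a higher priority are assumptions for illustration. The point is that the request parameter is folded into a small fixed set of permissions, so the decision can still be cached per pair of SIDs.

```python
# Hypothetical permission bits for the thread scheduling operation.
RAISE_PRIORITY, LOWER_PRIORITY = 1, 2


def required_permission(current, requested):
    """Map a priority change onto a coarse permission class.

    Assumes a larger number means a higher priority. Returns 0 when the
    priority is unchanged and no permission is needed.
    """
    if requested > current:
        return RAISE_PRIORITY
    if requested < current:
        return LOWER_PRIORITY
    return 0


def may_set_priority(allowed_vector, current, requested):
    needed = required_permission(current, requested)
    return (allowed_vector & needed) == needed


# A subject that may only lower priorities:
can_lower = may_set_priority(LOWER_PRIORITY, current=10, requested=5)
can_raise = may_set_priority(LOWER_PRIORITY, current=10, requested=20)
```

The limitation discussed above is visible here: the policy can distinguish raising from lowering, but it cannot cap a raise at a particular value without adding more permission classes or a richer interface.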

| message size | Fluke (µs) | Flask: naive | Flask: client identification | Flask: client impersonation |
|---|---|---|---|---|
| Null | 13.5 | +2% | +9% | +6% |
| 16-byte | 15.0 | +2% | +4% | +6% |
| 128-byte | 15.8 | +1% | +2% | +5% |
| 1k-byte | 21.9 | +2% | +2% | +4% |
| 4k-byte | 42.9 | +1% | +1% | +2% |
| 8k-byte | 78.5 | +1% | +5% | +1% |
| 64k-byte | 503 | +0% | +6% | +0% |

Table 2: Performance of IPC in Flask relative to the base Fluke system. A Null IPC actually transfers a minimal message, 8 bytes in the current implementation. In Fluke, the tests use the standard Fluke IPC interfaces in a system configured with no Flask enforcement mechanisms. Absolute times are shown in this column as a basis for comparison. Naive runs the same tests on the Flask microkernel. In client identification, the tests have been modified to use the Flask-specific server-side IPC interface to obtain the SID of the client on every call. Client impersonation uses the client-side IPC interface to specify an effective SID for every call.

6.2 Performance

All measurements in this section were taken using the time-stamp counter register on a 200MHz Pentium Pro processor with a 256KB L2 cache and 64MB of RAM. While a complete assessment of performance requires analysis of all object managers, we limit ourselves to the microkernel, and primarily to IPC since it is a critical path which must be factored into all higher level measurements.

6.2.1 Object Labeling

The segment SID for any piece of mapped physical memory is readily available, since it is computed when a virtual-to-physical address translation is created and is stored along with that translation. As the address translation must be obtained at object creation time anyway, the additional cost of labeling is minimal. We verified this by measuring the cost to create the simplest kernel object in both Fluke and Flask, showing the worst case overhead. Flask added 1% to the operation (3.62 versus 3.66 µs).

6.2.2 IPC Operations

This section presents performance measurements for IPC operations under various message sizes and also measures the impact of caching within the microkernel. Table 2 presents timings for a variety of client-server IPC microbenchmarks for the base Fluke microkernel and under different scenarios in the Flask system. The tests measure cross-domain transfer of varying amounts of data, from client to server and back again.

For all of the tests performed on Flask in Table 2, the required permissions are available in the access vector cache at the location identified by a hint within the port reference structure. While we have provided the data structures to allow for fast queries of previously computed security decisions, we have not done any specific code optimization to speed up the execution. Therefore it was encouraging to find that the addition of these data structures alone is sufficient to almost completely eliminate any measurable impact of the permission checks.

The most interesting case in Table 2 is the naive column, because it represents the most common form of IPC in the Flask system. Along this path there is only a single Connect permission check. The results show a worst-case 2% (50 machine cycle) performance hit. As would be expected, the relative effect of the single access check diminishes as the size of the data transfer increases and memory copy costs become the dominating factor. The client identification column has a larger than expected impact due to the fact that, in the current implementation, the client SID is passed across the interface to the server in a register normally used for data transfer. This forces an extra memory copy (particularly obvious in the Null IPC test). The significant effect on large data transfers is unexpected and needs to be investigated. The client impersonation column shows the impact of checking both the Connect and SpecifyClient permissions.

The effect of not finding the permission through the hint is shown in Table 3, which presents the relative costs of retrieving a security decision from the cache and from the security server. The operation being performed is the most sensitive of the IPC operations, a round-trip transfer of a null message between a client and a server, and is consequently representative of the worst case. The cache column shows that the use of the hint is significant in that it reduces the overhead from 7% to 2%.

| | Fluke | Flask: using hint | Flask: using cache | Flask: calling trivSS | Flask: calling realSS |
|---|---|---|---|---|---|
| Null | 13.5 µs | 13.8 µs (+2%) | 14.4 µs (+7%) | 43.4 µs (+221%) | 82.5 µs (+511%) |

Table 3: Marginal cost of security decisions in Flask. The first two columns repeat data from Table 2, identifying the relative cost of Flask when the required permission is found in the access vector cache (AVC) using the hint. The third column is the time required when the hint was incorrect but the permission was still found in the AVC. The trivSS column is the time required when the permission is not found in the AVC, and a trivial security server, which immediately returns an access ruling with all permissions granted, is used. The realSS column is the time required when the permission is not found in the AVC and an access ruling is computed by our prototype security server.

| connections | 1 | 2 | 4 | 8 | 16 |
|---|---|---|---|---|---|
| revocation time | 1.55 ms | 1.56 ms | 1.57 ms | 1.60 ms | 1.65 ms |

Table 4: Measured cost of revoking IPC connections. A connection is established from a client to a server and then is immediately revoked. Increasing numbers of interposed threads are used to increase the work done for each revocation.
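The percentages in Table 3 and the trend in Table 4 can be cross-checked with a little arithmetic. The sketch below recomputes the Table 3 overheads from the absolute times and derives an average per-connection cost from the two endpoints of Table 4. The per-connection figure is our own derived estimate, not a number reported in the measurements.

```python
# Table 3: Null IPC round-trip times in microseconds.
fluke_us = 13.5
flask_us = {
    "using hint": 13.8,
    "using cache": 14.4,
    "calling trivSS": 43.4,
    "calling realSS": 82.5,
}
overhead_pct = {
    case: round(100 * (t - fluke_us) / fluke_us)
    for case, t in flask_us.items()
}

# Table 4: revocation time in milliseconds by number of connections.
revocation_ms = {1: 1.55, 2: 1.56, 4: 1.57, 8: 1.60, 16: 1.65}

# Average marginal cost per extra connection, taken between the endpoints
# (a derived estimate, converted to microseconds).
marginal_us = round(1000 * (revocation_ms[16] - revocation_ms[1]) / (16 - 1), 1)

# Share of the 16-connection time already present with one connection.
base_share = revocation_ms[1] / revocation_ms[16]
```

Under these numbers the fixed cost of a revocation accounts for well over nine tenths of the 16-connection time, with each additional connection adding only a few microseconds.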

The trivSS column shows a more than tripling of the time required in the base Fluke case. The IPC interaction between the microkernel and security server requires trans fer of 20 bytes of data to the security server (along with the client SID) and return of 20 bytes. Since the permis sion for this IPC interaction is found using the hint, we see from Table 2 that over half of the additional overhead is due to the IPC. The remainder of the overhead is due to the identication of the request for a security decision, construction of the security server request in the kernel, and the unmarshaling and marshaling of parameters in the security server itself. The additional overhead in the realSS column compared to the previous case is the time required to compute a security decision within our proto type security server. Though no attempt has been made to optimize the security server computations, this result points out that the access vector cache can potentially be important regardless of whether interactions with the se curity server require an IPC interaction. 6.2.3 Revocation Operations The possible microkernel revocation operations are described in Section 5.4. For demonstration purposes we chose to evaluate the most expensive of those operations, IPC revocation. Ta ble 4 shows the results with varying numbers of active connections. The large base case is due to the need to stop all threads in the system when an IPC revocation is processed. The Fluke kernel provides a mechanism to cancel a thread and wait for it to enter a stopped state when the kernel wishes to examine or modify the threads state. The stop operation cannot be blocked indenitely by the threads activities nor by the activities of any other thread. Since a thread must be stopped prior to examina tion in order to ensure that it is in a well-dened state, the current Flask implementation must stop all threads when an IPC revocation is processed. 
Thus, the current implementation meets the completeness and timeliness requirements of the architecture but is quite costly. In contrast, the actual cost to examine and update the state

of the affected threads is small in relation, and as ex pected scales linearly with the number of connections. Changing the Fluke kernel to permit greater concurrency during the processing of a revocation request remains as future work. The frequency of policy changes is obviously policy dependent, but the usual examples of policy changes are externally driven and therefore will be infrequent. Moreover, a performance loss in a system with frequent policy changes should not be unexpected as it is fundamentally a new feature provided by the system. Obviously, even these uncommon operations should be completed as fast as possible, but that has not been a major consideration in the current implementation. 6.2.4 Macrobenchmark A macrobenchmark evalu ation of the Flask prototype is difcult to perform. Since Flask is a research prototype, it has only limited POSIX support and many of the servers are not robust or well tuned. As a result, it is difcult to run non-trivial benchmark applications. Nevertheless, we performed a sim ple comparison, running make to compile and link an application consisting of 20 .c and 4 .h les for a to tal of 8060 lines of code (including comments and white space), about 190KB total. The test environment included three object managers (the kernel, BSD lesystem server and POSIX process manager) along with a shell and all the GNU utilities necessary to build the application (make, gcc, ld, etc.). The Flask conguration of the test includes the security server with the three object managers congured to in clude the security features described in Section 5.3 and Appendix A. For each conguration, we ran make ve times, ignored the rst run, and averaged the time of the nal four runs (the initial run primed the data and meta data caches in the lesystem). To give a sense of the absolute performance of the base Fluke system, we also ran the test under FreeBSD 2.1.5 on the same machine and lesystem. Table 5 summarizes the experiment. 
The slowdown for Flask over the base Fluke system is less than 5%. By running the Flask kernel with unmodified Fluke object managers (Flask-FFS-PM), we see that the overhead is pretty evenly divided between the kernel and the other object managers (primarily the filesystem server). However, this modest slowdown is against a Fluke system which is over twice as slow on the same test as a competitive Unix system (BSD). The bulk of this slowdown is due to the prototype filesystem server which does not do asynchronous or clustered I/O operations. To factor this out, we reran the tests using a memory-based filesystem which supports the same access checks as the disk-based filesystem. The last two lines of Table 5 show the results of these tests. Note that the Flask overhead has increased to 11%, as less is masked by the disk I/O latency.

OS Config        Time (sec)
BSD              18.6
Fluke            39.9
Flask            41.7 (4.5%)
Flask-FFS-PM     40.9 (2.5%)
Fluke-memfs      24.7
Flask-memfs      27.4 (11%)

Table 5: Results of running make to compile and link a simple application in various OS configurations. BSD is FreeBSD 2.1.5, Flask-FFS-PM is the Flask kernel with the unmodified Fluke filesystem server and process manager, and the memfs entries use a memory-based filesystem in place of the disk-based filesystem. Percentages are the slowdowns vs. the appropriate base Fluke configurations.

Table 6 reports the number of security decisions that were requested by each object manager during testing of the Flask configuration and how those decisions were resolved. The numbers include all five runs of make as well as the intervening removal of the object files. These results reaffirm the effectiveness of caching security decisions, with well over 99% of the requests never reaching the security server.

                                  Resolution
Object Manager   Total queries   using hint   using cache   calling SS
Kernel           603735          175585       428121        29
FFS              76708           N/A          76700         8
PM               892             N/A          890           2

Table 6: Resolution of requested security decisions during the compilation benchmark. Numbers are from the Flask configuration of Table 5 and include all five runs of make and make clean.

6.2.5 Performance Conclusions Initial microbenchmark numbers suggest that the overhead of the Flask microkernel mechanisms can be made negligible through the use of the access vector cache and local hints when appropriate. They also highlight the need for an access vector cache so that communications with the security server and security computations within the security server are minimized. They also point to several areas for potential optimization, such as the AVC implementation, the communications infrastructure and the prototype security server computations. A complete analysis of the effectiveness of the AVC remains as future work. Issues such as the optimal cache size and the sensitivity of the AVC hit ratios to policy changes remain to be explored.

Results of the simple macrobenchmark test are inconclusive. Although the performance impact numbers are encouraging (5-11% slowdown), the bad absolute performance of the prototype system cannot be ignored. More completely exploring the performance overhead of the Flask security architecture remains as future work, and will likely be done in the context of a Linux or OSKit implementation of the architecture. This will permit more realistic workloads to be measured.
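As a worked check of the figures quoted above, the slowdown percentages of Table 5 and the fraction of security decisions resolved without contacting the security server (Table 6) can be recomputed from the raw numbers. The following is only an illustrative sketch: the data are transcribed from the two tables, while the function and variable names are our own and not part of Flask.

```python
# Times in seconds, transcribed from Table 5.
MAKE_TIMES = {
    "Fluke": 39.9,
    "Flask": 41.7,
    "Flask-FFS-PM": 40.9,
    "Fluke-memfs": 24.7,
    "Flask-memfs": 27.4,
}

# (total queries, queries that reached the security server), from Table 6.
QUERIES = {
    "Kernel": (603735, 29),
    "FFS": (76708, 8),
    "PM": (892, 2),
}


def slowdown_pct(base, measured):
    """Relative slowdown of `measured` against `base`, in percent."""
    return 100.0 * (measured - base) / base


def fraction_without_ss(queries):
    """Fraction of all decisions resolved by a hint or a cache hit."""
    total = sum(t for t, _ in queries.values())
    to_ss = sum(s for _, s in queries.values())
    return 1.0 - to_ss / total


if __name__ == "__main__":
    print(slowdown_pct(MAKE_TIMES["Fluke"], MAKE_TIMES["Flask"]))
    print(slowdown_pct(MAKE_TIMES["Fluke-memfs"], MAKE_TIMES["Flask-memfs"]))
    print(fraction_without_ss(QUERIES))
```

The disk-based comparison uses Fluke as its base and the memory-based comparison uses Fluke-memfs, matching the note in the Table 5 caption that percentages are relative to the appropriate base configuration.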

6.3 Scale and Invasiveness of Flask Code

In Table 7 we present data that give a rough estimate of the scale and complexity of adding fine-grained security enforcement to the base Fluke components. Overall, the Fluke components increased in size less than 8%. Although the kernel increased the most at 19%, for large object managers the percentage is reassuringly small (4-6%). To gauge the magnitude of these modifications, we classified each changed location as trivial (e.g., one-line changes, #define changes, name or parameter changes, etc.) or non-trivial. For the process manager, 57% of the changes fell into the trivial category. For the kernel, a similar percentage of the changes were trivial, 61%, despite the fact that the kernel is an order of magnitude larger and more complicated than the process manager.

The changes required to implement the Flask security architecture did not involve any modifications to the existing Fluke API. Extended calls were added to the existing API to permit security-aware applications to use the additional security functionality, such as the client and server identification support. All applications that run on the base Fluke system can be executed unchanged on Flask.

Component     Fluke LOC   +Flask   %Incr.   #Locs.   %Locs.
Kernel        9271        1795     19.3     258      2.4
FFS           21802       1342     6.2      14       .06
Proc. Mgr     925         196      21.2     85       9.2
Net Server    24549       1071     4.4      224      9.1
Total         58435       4575     7.8      647      1.1

Table 7: Filtered source code size for various Flask components and the number of discrete locations in the base Fluke code that were modified. This count of source code lines filters out comments, blank lines, preprocessor directives, and punctuation-only lines, and typically is 1/4 to 1/2 the size of unfiltered code. The network server count includes the ISAKMP and IPSEC distributions, counting as modifications all Flask-specific changes to them and the base Fluke network component.

Summary

This paper describes an operating system security architecture capable of supporting a wide range of security policies, and the implementation of this architecture as part of the Flask microkernel-based operating system. It provides a usable definition of policy flexibility, identifies limitations of this definition and highlights the need for atomicity. It shows that capability systems and interposition techniques are inadequate for achieving policy flexibility. It presents the Flask architecture and describes how Flask overcomes the obstacles to achieving policy flexibility, including the need for atomicity. Although the performance evaluation of the Flask prototype is incomplete, this paper demonstrates that the architecture is practical to implement and flexible to use. Moreover, the architecture should be applicable to many other operating systems.

Figure 6: Labeling of persistent objects. The file server maintains a table within each file system which identifies the security context of the file system and every directory and file within the file system, thereby ensuring that the security attributes of these objects are preserved even if the file system is moved to another system. This table is partitioned into a mapping between each security context and an integer persistent SID (PSID) and a mapping between each object and its persistent SID. These persistent SIDs are purely an internal abstraction within the file system and have a distinct name space for each file system. Hence, persistent SIDs may be lightweight and the allocation of persistent SIDs may be optimized for each file system.
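The two-part table described in the Figure 6 caption can be illustrated with a small sketch. This is not the Flask file server's code: the class name, the method names, the use of integer object identifiers and the sequential PSID allocation are all assumptions made for illustration. Only the two mappings (security context to PSID, and object to PSID) and the per-file-system PSID name space come from the text.

```python
class PersistentLabelTable:
    """Hypothetical per-file-system label table.

    Holds a mapping between security contexts (opaque strings) and
    integer persistent SIDs (PSIDs), and a mapping from each object
    to its PSID. Each instance has its own PSID name space.
    """

    def __init__(self, fs_context):
        self._context_to_psid = {}
        self._psid_to_context = {}
        self._object_to_psid = {}
        self._next_psid = 1  # assumed allocation scheme: small sequential integers
        self.fs_psid = self._intern(fs_context)  # label of the file system itself

    def _intern(self, context):
        """Return the PSID for a context, allocating one on first use."""
        psid = self._context_to_psid.get(context)
        if psid is None:
            psid = self._next_psid
            self._next_psid += 1
            self._context_to_psid[context] = psid
            self._psid_to_context[psid] = context
        return psid

    def label(self, object_id, context):
        """Bind a file or directory (by hypothetical object id) to a context."""
        self._object_to_psid[object_id] = self._intern(context)

    def psid_of(self, object_id):
        return self._object_to_psid[object_id]

    def context_of(self, object_id):
        """Recover the full security context stored for an object."""
        return self._psid_to_context[self._object_to_psid[object_id]]
```

Because the table stores the full security context alongside each PSID, the labels remain meaningful if the file system is moved to another system, where each context can be mapped to a local SID by a query to that system's security server.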

Availability

The Flask software and documentation are available at <[Link]>.

A Other Flask object managers

This appendix describes the specific features that have been added to some of the Flask user-space object managers. Although the following subsections are not necessary for understanding the Flask architecture, they provide helpful insight into the details of providing policy flexibility in a complete system.

A.1 File Server

The Flask file server provides four types of controlled (labeled) objects: file systems, directories, files, and file description objects. Since file systems, directories and files are persistent objects, their labels must also be persistent. The binding of persistent labels to these objects is shown in Figure 6. The file server supports persistent labels without sacrificing policy flexibility or performance by treating security contexts as opaque strings and by mapping these labels to SIDs by a query to the security server for internal use in the file server. Control over file description objects is separated from control over the files themselves so that propagation of access to file description objects may be controlled by the policy. As noted in Section 3.1, the ability to control the propagation of access rights is critical to policy flexibility.

In contrast to the Unix file access controls, the Flask file server defines a permission for each service that observes or modifies the state of a file or directory. For example, whereas Unix permits a process to invoke stat or unlink on a file purely on the basis of the process's access to the file's parent directory, the Flask file server checks Getattr and Unlink permissions to control access to the file itself in addition to the directory-based permissions. Such controls are necessary to generally support nondiscretionary security policies. The Flask file server also supports fine-grained distinctions among services, such as separate Write and Append permissions for files and separate Add name and Remove name permissions for directories, which is important for supporting policy flexibility.

The file server provides operations to relabel files and directories, since the relabel operation has the potential of being much more efficient than merely copying such objects into new objects with different labels. There are a couple of complications of relabeling. First, migrated permissions pertaining to the file may need to be revoked. For instance, changing the SID of a file may affect the permission to write to a file that is stored in a file description object. Hence, all such permissions are recomputed and revoked if necessary. Second, a relabeling operation cannot be simply controlled through the SID of the client subject and the SID of the file, but must also involve the newly requested SID. This is addressed by requiring three permissions for a relabel to complete, as shown in Table 8. The provision of a single relabel operation is also helpful from a policy flexibility perspective, since the policy logic can be directly expressed in terms of any of these three possible SID pairs. In contrast, implementing the same policy logic in terms of the permissions controlling operations involved in copying an object would be complicated by the much weaker coupling among the relevant SIDs.

SOURCE         TARGET      PERMISSION
Subject SID    File SID    RelabelFrom
Subject SID    New SID     RelabelTo
File SID       New SID     Transition

Table 8: Permission requirements for relabeling a file. Additionally, the subject must possess Search permission to every directory in the path.

The file server design proposes the use of the Flask architecture's polyinstantiation support for security union directories (SUDs); however, the design for SUDs has not yet been implemented. SUDs are a generalization of the partitioned directory approach taken by multi-level secure Unix systems for dealing with /tmp. The SUD mechanism is designed to use the polyinstantiation support to determine the preferred member directory for each client to access by default. However, unlike the simple partitioned directory approach, the SUD mechanism provides a unified view of all accessible members within the polyinstantiated directory to clients based upon access decisions between the client and the member directories.

As was noted in Section 3.2, file server operations provide a simple example of the problems with implementing security controls at the server's external interface. The Flask file server draws its file system implementation from the OSKit [12] whose exported COM interfaces are similar to the internal VFS interface [27] used by many Unix file systems. It was possible to implement the Flask security controls at that interface where these problems do not exist.

SOURCE         TARGET               LAYER
Process SID    Socket SID           Socket
Message SID    Socket SID           Transport
Message SID    Node SID             Network
Node SID       Net Interface SID    Network

Table 9: Layered controls in the network protocol stack. Each layer applies controls based upon the SIDs of the abstractions directly accessible at that layer. Node SIDs are provided to the network server by a separate network security server, which may query distributed databases for security attributes, and network interface SIDs may be locally configured.
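The three-permission relabel check of Table 8 (Section A.1), together with the Search requirement stated in its caption, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the file server's implementation: `may_relabel` and the `allowed(source_sid, target_sid, permission)` callback are hypothetical names, with the callback standing in for a decision obtained from the security server or the access vector cache.

```python
def may_relabel(allowed, subject_sid, file_sid, new_sid, path_dir_sids=()):
    """Return True only if every permission needed for a relabel is granted.

    `allowed` is a hypothetical policy callback taking
    (source SID, target SID, permission name) and returning a bool.
    `path_dir_sids` holds the SIDs of the directories on the path.
    """
    # Search permission to every directory in the path (Table 8 caption).
    checks = [(subject_sid, dir_sid, "Search") for dir_sid in path_dir_sids]
    # The three SID pairs of Table 8.
    checks += [
        (subject_sid, file_sid, "RelabelFrom"),
        (subject_sid, new_sid, "RelabelTo"),
        (file_sid, new_sid, "Transition"),
    ]
    return all(allowed(src, tgt, perm) for src, tgt, perm in checks)
```

Keeping the three pairs in a single operation is what lets a policy constrain any one of them directly, for example by forbidding a particular old-label to new-label Transition regardless of which subject requests it.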

A.2 Network Server

Abstractly, the Flask network server ensures that every network IPC is authorized by the security policy. Of course, a network server cannot independently ensure that a network IPC is authorized by the policy of its node, since it does not have end-to-end control over data delivery to processes on peer nodes. Instead, a network server must extend some level of trust to its peer network servers to enforce its own security policy, in combination with their own security policies, over the peer processes. This requires a reconciliation of security policies, which would be handled by a separate negotiation server. The current negotiation server is limited to negotiating network security protocols and cryptographic mechanisms using the ISAKMP [33] protocol. The precise form of trust and the precise level of trust extended to peer network servers can vary widely and would be defined within the policy. Extending the concept of policy flexibility to a networked environment will require such support for complex trust relationships.

The principal controlled object type for the network server is the socket. For socket types that maintain message boundaries (e.g., datagram), the network server also binds a separate SID to each message sent or received on a socket. For other socket types, each message is implicitly associated with the SID of its sending socket. Since messages cross the boundary of control of the network server, and may even cross a policy domain boundary, the network server may need to apply cryptographic protections to messages in order to preserve the security requirements of the policy and must bind the security attributes of the message to the message. Our prototype network server uses the IPSEC [26] protocols for this purpose, with security associations established by the negotiation server. The negotiation server may not pass SIDs across the network, since they are only local identifiers; instead, the negotiation server must pass the actual security attributes to its peer, which can then establish its own SID for the corresponding security context. Although the negotiation server must handle security contexts, it does not interpret them, and thus remains policy-flexible. Attribute translation and interpretation must be performed by the corresponding security servers in accordance with the policy reconciliation.

The network server controls are layered to match the network protocol layering architecture. Hence, the abstract control over the high-level network IPC services consists of a collection of controls over the abstractions at each layer, as shown in Table 9. The layered controls provide the policy with the ability to precisely regulate network operations, using all the information relevant to security decisions, and they allow the policy to take advantage of specific characteristics of the different protocols (e.g., the client/server relationship in TCP). The network server provides another example of the problems with implementing security controls at the server's external interface. This is due to the need to control abstractions and interpose on operations which are not exported by the network server's external interface.

Since the TCP and UDP port spaces are fixed resources, the network server uses the Flask architecture's polyinstantiation support for security union port spaces (SUPs). SUPs are analogous to the SUDs discussed in Section A.1. The polyinstantiation support is used to determine the preferred member port space when a port number is associated with a socket and when an incoming packet has a destination port number which exists in multiple member port spaces. The SUP mechanism provides a unified view of all accessible port spaces within the polyinstantiated port space based on access decisions.

Many of the details of the Flask network server and other servers that support it are beyond the scope of this paper. A much more detailed description of an earlier version of the Flask network server can be found in [9].
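The layered controls of Table 9 can be pictured as a sequence of SID-pair checks, one or more per protocol layer, all of which must succeed. The sketch below is illustrative only. Table 9 names the source and target SIDs at each layer but not the permissions involved, so the hypothetical `allowed(layer, source_sid, target_sid)` callback simply receives the layer name; the function and dictionary keys are likewise our own.

```python
# Rows of Table 9 as (layer, source abstraction, target abstraction).
LAYERED_CHECKS = [
    ("Socket", "process", "socket"),
    ("Transport", "message", "socket"),
    ("Network", "message", "node"),
    ("Network", "node", "net_interface"),
]


def authorize_network_ipc(allowed, sids):
    """Apply every layer's check in order.

    `sids` maps each abstraction name used above to its SID.
    Returns (True, None) if all checks pass, otherwise (False, layer)
    naming the first layer whose check was denied.
    """
    for layer, source, target in LAYERED_CHECKS:
        if not allowed(layer, sids[source], sids[target]):
            return False, layer
    return True, None
```

Expressing the checks per layer, rather than as one check at the server's external interface, reflects the point made above: the message, node and network interface abstractions are internal to the protocol stack and are not visible at that interface.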

A.3 Process Manager

The Flask process manager implements the POSIX process abstraction, providing support for functions such as fork and execve. These higher-level process abstractions are layered on top of Flask processes, which consist of an address space and its associated threads. The process manager provides one controlled object type, the POSIX process, and binds a SID to each POSIX process. Unlike the SID of a Flask process, the SID of a POSIX process may change through an execve. Such SID transitions are controlled by the process Transition permission between the old and new SIDs. This control permits the policy to regulate a process's ability to transition to different security domains. Default transitions may be defined by the policy through the default object labeling mechanism described in Section 5.2.1.

In combination with the file server and the microkernel, the process manager is responsible for ensuring that each POSIX process is securely initialized. The file server ensures that the memory for the executable is labeled with the SID of the file. The microkernel ensures that the process may only execute memory to which it has Execute access. The process manager initializes the state of transformed POSIX processes, sanitizing their environment if the policy requires it.

Acknowledgments

We especially thank Jeff Turner for his many contributions to the Flask vision and architecture. Duane Olawsky contributed much to our understanding of the features required for policy flexibility. We also thank Dan Wallach, Grant Wagner, Andy Muckelbauer, Ruth Taylor, Charlie Payne, Tom Keefe and the anonymous reviewers for reviewing earlier drafts of this paper, Roland McGrath for recent Fluke implementation, Ajay Chitturi for implementing an earlier version of our secure network server design, and other members of the Flux group for help in numerous ways.

References

[1] M. D. Abrams. Renewed Understanding of Access Control Policies. In Proceedings of the 16th National Computer Security Conference, pages 87-96, Oct. 1993.
[2] M. D. Abrams, L. J. LaPadula, K. W. Eggers, and I. M. Olson. A Generalized Framework for Access Control: An Informal Description. In Proceedings of the 13th National Computer Security Conference, pages 135-143, Oct. 1990.
[3] D. E. Bell and L. J. La Padula. Secure Computer Systems: Mathematical Foundations and Model. Technical Report M74-244, The MITRE Corporation, Bedford, MA, May 1973.
[4] T. C. V. Benzel, E. J. Sebes, and H. Tajalli. Identification of Subjects and Objects in a Trusted Extensible Client Server Architecture. In Proceedings of the 18th National Information Systems Security Conference, pages 83-99, 1995.
[5] B. N. Bershad, S. Savage, P. Pardyak, E. G. Sirer, M. E. Fiuczynski, D. Becker, C. Chambers, and S. Eggers. Extensibility, Safety, and Performance in the SPIN Operating System. In Proc. of the 15th ACM Symp. on Operating Systems Principles, pages 267-284, Copper Mountain, CO, Dec. 1995.
[6] W. E. Boebert and R. Y. Kain. A Practical Alternative to Hierarchical Integrity Policies. In Proceedings of the Eighth National Computer Security Conference, 1985.
[7] M. I. Bushnell. Towards a New Strategy of OS Design. GNU's Bulletin, 1(16), Jan. 1994.
[8] M. Carney and B. Loe. A Comparison of Methods for Implementing Adaptive Security Policies. In Proceedings of the Seventh USENIX Security Symposium, pages 1-14, Jan. 1998.
[9] A. Chitturi. Implementing Mandatory Network Security in a Policy-flexible System. Master's thesis, University of Utah, 1998. pp. 70. [Link] [Link].
[10] D. F. Ferraiolo, J. A. Cugini, and D. R. Kuhn. Role-Based Access Control (RBAC): Features and Motivations. In Proceedings of the Eleventh Annual Computer Security Applications Conference, Dec. 1995.
[11] T. Fine and S. E. Minear. Assuring Distributed Trusted Mach. In Proceedings IEEE Computer Society Symposium on Research in Security and Privacy, pages 206-218, May 1993.
[12] B. Ford, G. Back, G. Benson, J. Lepreau, A. Lin, and O. Shivers. The Flux OSKit: A Substrate for OS and Language Research. In Proc. of the 16th ACM Symp. on Operating Systems Principles, pages 38-51, St. Malo, France, Oct. 1997.
[13] B. Ford, M. Hibler, J. Lepreau, R. McGrath, and P. Tullmann. Interface and Execution Models in the Fluke Kernel. In Proceedings of the 3rd USENIX Symposium on Operating Systems Design and Implementation, pages 101-116, Feb. 1999.
[14] B. Ford, M. Hibler, J. Lepreau, P. Tullmann, G. Back, and S. Clawson. Microkernels Meet Recursive Virtual Machines. In Proceedings of the Symposium on Operating Systems Design and Implementation, pages 137-151, Oct. 1996.
[15] T. Fraser and L. Badger. Ensuring Continuity During Dynamic Security Policy Reconfiguration in DTE. In Proceedings of the 1998 IEEE Symposium on Security and Privacy, pages 15-26, May 1998.
[16] M. Gasser. Building a Secure Computer System. Van Nostrand Reinhold Company, 1988.

[17] I. Goldberg, D. Wagner, R. Thomas, and E. A. Brewer. A Secure Environment for Untrusted Helper Applications. In Proceedings of the 6th Usenix Security Symposium, July 1996.
[18] L. Gong. A Secure Identity-Based Capability System. In Proceedings of the 1989 IEEE Symposium on Security and Privacy, pages 56-63, May 1989.
[19] R. Graubart. On the Need for a Third Form of Access Control. In Proceedings of the 12th National Computer Security Conference, pages 296-304, Oct. 1989.
[20] R. Grimm and B. N. Bershad. Providing Policy-Neutral and Transparent Access Control in Extensible Systems. In J. Vitek and C. Jensen, editors, Secure Internet Programming: Security Issues for Distributed and Mobile Objects, volume 1603 of Lecture Notes in Computer Science. Springer-Verlag, June 1999.
[21] N. Hardy. The Confused Deputy. Operating Systems Review, 22(4):36-38, Oct. 1988.
[22] T. Jaeger, J. Liedtke, and N. Islam. Operating System Protection for Fine-Grained Programs. In Proceedings of the Seventh USENIX Security Symposium, pages 143-157, Jan. 1998.
[23] R. Kain and C. Landwehr. On Access Checking in Capability-Based Systems. In Proceedings of the 1986 IEEE Symposium on Security and Privacy, pages 66-77, May 1986.
[24] P. A. Karger. New Methods for Immediate Revocation. In Proceedings of the 1989 IEEE Symposium on Security and Privacy, pages 48-55, May 1989.
[25] P. A. Karger and A. J. Herbert. An Augmented Capability Architecture to Support Lattice Security and Traceability of Access. In Proceedings of the 1984 IEEE Symposium on Security and Privacy, pages 2-12, May 1984.
[26] S. Kent and R. Atkinson. Security Architecture for the Internet Protocol. RFC 2401, Internet Engineering Task Force, Nov. 1998. [Link]
[27] S. R. Kleiman. Vnodes: An Architecture for Multiple File System Types in Sun UNIX. In Proc. of the Summer 1986 USENIX Conf., pages 238-247, Atlanta, GA, June 1986.
[28] C. R. Landau. Security in a Secure Capability-Based System. Operating Systems Review, pages 2-4, Oct. 1989.
[29] R. Levin, E. Cohen, W. Corwin, F. Pollack, and W. Wulf. Policy/mechanism separation in Hydra. In Proceedings of the Fifth Symposium on Operating Systems Principles, pages 132-140, University of Texas at Austin, Nov. 1975. ACM/SIGOPS.
[30] J. Liedtke. Clans and Chiefs. In Architektur von Rechensystemen. Springer-Verlag, Mar. 1992.
[31] K. Loepere. Mach 3 Kernel Interfaces. Open Software Foundation and Carnegie Mellon University, Nov. 1992.
[32] P. A. Loscocco, S. D. Smalley, P. A. Muckelbauer, R. C. Taylor, S. J. Turner, and J. F. Farrell. The Inevitability of Failure: The Flawed Assumption of Security in Modern Computing Environments. In Proceedings of the 21st National Information Systems Security Conference, pages 303-314, Oct. 1998. http:// [Link]/nissc/1998/proceedings/[Link].
[33] D. Maughan, M. Schertler, M. Schneider, and J. Turner. Internet Security Association and Key Management Protocol (ISAKMP). RFC 2408, Internet Engineering Task Force, Nov. 1998. ftp:// [Link]/in-notes/[Link].
[34] C. J. McCollum, J. R. Messing, and L. Notargiacomo. Beyond the pale of MAC and DAC - defining new forms of access control. In Proceedings of the 1990 IEEE Symposium on Security and Privacy, pages 190-200, May 1990.
[35] S. E. Minear. Providing Policy Control Over Object Operations in a Mach Based System. In Proceedings of the Fifth USENIX UNIX Security Symposium, pages 141-156, June 1995.

[36] J. G. Mitchell, J. J. Gibbons, G. Hamilton, P. B. Kessler, Y. A. Khalidi, P. Kougiouris, P. W. Madany, M. N. Nelson, M. L. Powell, and S. R. Radia. An Overview of the Spring System. In A Spring Collection. Sun Microsystems, Inc., 1994.
[37] T. Mitchem, R. Lu, and R. O'Brien. Using Kernel Hypervisors to Secure Applications. In Proceedings of the Annual Computer Security Applications Conference, Dec. 1997.
[38] D. Olawsky, T. Fine, E. Schneider, and R. Spencer. Developing and Using a Policy Neutral Access Control Policy. In Proceedings of the New Security Paradigms Workshop. ACM, Sept. 1996.
[39] E. I. Organick. The Multics System: An Examination of its Structure. MIT Press, 1972.
[40] S. A. Rajunas, N. Hardy, A. C. Bomberger, W. S. Frantz, and C. R. Landau. Security in KeyKOS. In Proceedings of the 1986 IEEE Symposium on Security and Privacy, pages 78-85, Apr. 1986.
[41] S. G. Ravi Sandhu, Venkata Bhamidipati and C. Youman. The ARBAC97 Model for Role-Based Administration of Roles: Preliminary Description and Outline. In Proceedings of the Second ACM Workshop on Role-Based Access Control, pages 41-50, Nov. 1997.
[42] D. Redell and R. Fabry. Selective Revocation of Capabilities. In Proceedings of the International Workshop on Protection in Operating Systems, pages 192-209, Aug. 1974.
[43] Secure Computing Corp. DTOS Generalized Security Policy Specification. DTOS CDRL A019, 2675 Long Lake Rd, Roseville, MN 55113, June 1997. http:// [Link]/randt/HTML/[Link].
[44] Secure Computing Corp. Assurance in the Fluke Microkernel: Formal Security Policy Model. CDRL A003, 2675 Long Lake Rd, Roseville, MN 55113, Feb. 1999. [Link] projects/flux/fluke/html/[Link].
[45] Secure Computing Corp. Assurance in the Fluke Microkernel: Formal Top-Level Specification. CDRL A004, 2675 Long Lake Rd, Roseville, MN 55113, Feb. 1999. [Link] projects/flux/fluke/html/[Link].
[46] M. I. Seltzer, Y. Endo, C. Small, and K. A. Smith. Dealing With Disaster: Surviving Misbehaved Kernel Extensions. In Proc. of the Second Symp. on Operating Systems Design and Implementation, pages 213-227, Seattle, WA, Oct. 1996. USENIX Assoc.
[47] J. S. Shapiro. EROS: A Capability System. Technical Report MS-CIS-97-04, University of Pennsylvania, Department of Computer and Information Science, 1997.
[48] D. F. Sterne, M. Branstad, B. Hubbard, and B. M. D. Wolcott. An Analysis of Application Specific Security Policies. In Proceedings of the 14th National Computer Security Conference, pages 25-36, Oct. 1991.
[49] SunSoft, Inc. Spring Programmer's Guide, 1995. On-line documentation included in the Spring Research Distribution 1.0.
[50] D. S. Wallach, D. Balfanz, D. Dean, and E. W. Felten. Extensible Security Architectures for Java. In Proc. of the 16th ACM Symp. on Operating Systems Principles, pages 116-128, Oct. 1997.
[51] R. M. Wong. A Comparison of Secure Unix Operating Systems. In Proceedings of the Sixth Annual Computer Security Applications Conference, pages 322-333, Dec. 1990.
[52] W. Wulf, R. Levin, and P. Harbison. Hydra/C.mmp: An Experimental Computer System. McGraw-Hill, 1981.
[53] M. E. Zurko and R. Simon. User-Centered Security. In Proceedings of the New Security Paradigms Workshop, Sept. 1996.
