Part III, Accounting, contains the following chapters:
Chapter 6, Administering the System Audit Trail
Chapter 7, System Accounting
The System Audit Trail features allow administrators to review a record of all system activity. The ongoing record of system activity shows general trends in system usage and also violations of your system use policy. For example, any unsuccessful attempts to use system resources can be recorded in the audit trail. If a user consistently attempts to access files owned by other users, or attempts to guess the root password, this can be recorded also. The site administrators can monitor all system activity through the audit trail. Sections of this chapter include:
References are made in this chapter to auditable "Mandatory Access Control" and "MAC" events, such as an event generated when an attempt is made to access a file protected by a higher MAC clearance. The audit system provides facilities to audit all events on all IRIX operating systems. Mandatory Access Control (MAC) is available only in the optional Trusted IRIX/B operating system. No MAC audit events are generated by standard IRIX. If you have installed Trusted IRIX/B, you will have received additional documentation describing the special security features in that product. Users of standard IRIX can safely ignore all references to MAC, labels, and the dbedit, chlabel and newlabel commands. To find out if a system is running Trusted IRIX/B, use the versions command to see if the trix_eoe product image is installed.
You can also determine if a system is running Trusted IRIX/B by using the sysconf command to see if MAC is configured (1 indicates it is):
sysconf MAC
1
Both standard IRIX and Trusted IRIX/B systems give a similar uname -a response:
IRIX64 SystemName 6.5 10301649 IP27
Discretionary Access Control (DAC) is the term used by the auditing subsystem for the standard UNIX system of file permissions. IRIX uses the standard permissions system common to all UNIX-based operating systems.
The audit subsystem is distributed with your IRIX operating system media, but is not installed by default. To enable auditing, you must use Inst to install the eoe.sw.audit software package from your distribution media. Inst is described in detail in IRIX Admin: Software Installation and Licensing. Once this package has been installed, reboot your system and use the chkconfig utility to enable auditing. The chkconfig(1M) reference page provides complete information on the use of chkconfig but, simply described, you will see a list of configurable options and a notation of off or on for each option. The list is in alphabetical order.
For example, here is a partial chkconfig listing that includes the audit option:
Flag            State
====            =====
audit           off
automount       on
windowsystem    on
xdm             off
The following command enables auditing on your system:
chkconfig audit on
The system immediately begins collecting audit data on the default set of audit events. The default audit events are listed and described below.
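As a minimal sketch of checking the flag from a script, the function below reads a chkconfig-style listing and prints the state of one flag. The function name flag_state is hypothetical, and the two-column layout is assumed from the partial listing shown above.

```shell
# Print the state ("on" or "off") of one flag from a chkconfig-style
# listing read on standard input. Returns nonzero if the flag is absent.
# flag_state is a hypothetical helper name, not an IRIX command.
flag_state() {
    awk -v flag="$1" '$1 == flag { print $2; found = 1 } END { exit !found }'
}

# Intended use on an IRIX system (not executed here):
#   chkconfig | flag_state audit
```

A result of off would mean that auditing is installed but not yet enabled with chkconfig audit on.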
The default auditing environment is already set up when you install IRIX. You need not take any action to maintain the default auditing environment. Within your default IRIX distribution, there is a file called /etc/init.d/audit. This file contains the default audit trail initialization. The default auditing selections produce a full record of system activity with a minimum of disk-space usage. Table 6-1 contains all event types audited by default. The individual event types are not described in this list, but a description for all event types is given in "Auditable Events".
| Default Audited Events | | |
|---|---|---|
| sat_access_denied | sat_domainname_set | sat_mount |
| sat_ae_custom | sat_exec | sat_open |
| sat_ae_dbedit | sat_exit | sat_proc_attr_write |
| sat_ae_identity | sat_fchdir | sat_proc_attr_write |
| sat_ae_mount | sat_fd_attr_write | sat_proc_attr_write2 |
| sat_bsdipc_create | sat_file_attr_write | sat_proc_read |
| sat_bsdipc_create_pair | sat_file_crt_del | sat_proc_write |
| sat_bsdipc_expl_addr | sat_file_crt_del2 | sat_svipc_change |
| sat_bsdipc_mac_change | sat_file_write | sat_svipc_create |
| sat_bsdipc_shutdown | sat_fork | sat_svipc_remove |
| sat_chdir | sat_hostid_set | sat_sysacct |
| sat_chroot | sat_hostname_set | sat_tty_setlabel |
| sat_clock_set | | |
When you have installed your system, you can select the level and type of auditing that you wish to use. The default auditing environment described above is created for you at installation time. For most purposes this auditing environment is satisfactory. However, remember that the System Audit Trail is completely configurable at any time through the sat_select and satconfig utilities.
The satconfig utility is the preferred tool for use on graphics systems, since it provides a convenient graphical interface for switching each auditable event type on or off. The sat_select command is useful for server users and others who do not wish to use the satconfig utility. These utilities are discussed in detail in "About satconfig" and "About sat_select".
You can audit all system activity or certain types of activity, such as file removal or access denial. Users are tracked through the audit trail by User ID (UID) numbers. Any audited activity is associated with the UID of the person who performed that action. It is a central feature of the System Audit Trail that though the effective UID changes with the use of the su command, the SAT ID does not. All of a user's actions after logging in are audited at the original login UID.
When you select the type of activities to audit, there are still several options for auditing. For example, if you wish to monitor the removal of files, you can generate an audit record under two conditions:
Many different types of activities take place on your trusted computer system. There are login attempts, file manipulations, use of devices (such as printers and tape drives), and administrative activity. Within this list of general activities, you may choose to audit many specific kinds of actions.
Below is a list of auditable actions with a short definition of each action and one or more of the appropriate event types that can be audited. Important actions contain a note that they should always be audited:
Any login attempt, whether successful or not, should be audited. Also, an audit record should be generated when the user logs out of the system.
Whenever a user invokes the su command, whether to assume an administrative account such as root or another user account, the event should be audited. This is especially true for unsuccessful attempts, as they may indicate attempts at unauthorized access.
Any time a user changes a MAC label on a Trusted IRIX/B system, it is wise to make an audit record of the event. (This does not happen under standard IRIX.)
Whenever a user changes his or her password, it is wise to make an audit record of the event.
Any activity related to system administration should be carefully audited; for example, editing the /etc/fstab file.
When a user invokes the chmod command to change the DAC permissions on a file or the chown command to change the ownership of a file.
Whenever a new link, file, or directory is created.
Whenever a link, file, or directory is removed.
When a new process is created, forked, exited, or killed.
The audit administrator (auditor) can change the audited events by entering a new sat_select command. It is possible to change the selected event types at different times of day, by using the cron utility to execute sat_select periodically.
To tailor your auditing for your specific needs, use the sat_select or satconfig utilities.
The following is a complete list of auditable event types:
All sat_ae events are used for application auditing, which means that a privileged program generated the record, rather than the kernel.
satconfig is a graphical utility that you use to configure exactly which events will be audited on your system. Any user can invoke satconfig, but only the superuser may actually change the auditing environment.
When you first begin using the audit trail, there is a default set of audited events. You can modify that selection using satconfig. The satconfig window also contains a pulldown menu labeled "edit" that you can use at any time to set the auditing environment to one of a few preset environments. These include the original SGI default audit selections, your local default selections, all event types selected, no event types selected, and a current events selection. The current events selection restores the auditing environment that was last saved on your machine. The local default environment can be any combination of event types that you choose. You create a local default environment by following the instructions in "Saving and Retrieving Your Auditing Environment".
When you invoke satconfig, a new window opens on your screen. The main body of the window has a list of all the available event types. Next to each event type name is a button. At any time, each button is either up or down. If the button is down, the event type is selected for auditing. If the button is up, the event type is not audited. Use your mouse and the left mouse button to select whether you want the event type in question to be on or off.
At the bottom of the satconfig screen there are three buttons. These buttons are labeled Apply, Revert, and Quit. When you have made your auditing selections, use the left mouse button to press the Apply button on the screen to activate the auditing selections. If you change your mind while making audit selections, you can use the Revert button to reset the individual event type buttons to the selections currently in use. The third button is labeled Quit and closes the satconfig window. If you have made selections that have not been applied, satconfig asks you if you really want to quit and discard the changes you have made without applying them.
The sat_select utility is a character-based program that modifies your audit event type selections. Additionally, you can use the sat_select utility to change your local default auditing environment or to read in a preselected set of event type choices from a file. In this way, you can have several preset auditing environments ready in files for various situations and switch between them conveniently. If you have a graphical system, satconfig is the suggested utility for administering your auditing event type selections. sat_select exists for non-graphics systems and for making large-scale, file-oriented changes.
For complete information on using sat_select, consult the sat_select(1M) reference page, but in general, the syntax most often used is
sat_select -on event
and
sat_select -off event
sat_select -on event directs the system audit trail to collect records describing the given event. If "all" is given as the event string, all event types are collected.
sat_select -off event directs the system to stop collecting information on that event type. If "all" is given as the event string, all event types are ignored.
With no arguments, sat_select lists the audit events currently being collected. The effect of successive sat_select invocations is cumulative. Help is available with the -h option.
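Because the effect of repeated invocations is cumulative, several event types can be switched on with a simple loop. The sketch below uses only the -on syntax shown above; the function name enable_events is hypothetical.

```shell
# Turn on collection for every event type named on the command line,
# one sat_select invocation per event. Stops at the first failure.
# enable_events is a hypothetical helper name.
enable_events() {
    for event in "$@"; do
        sat_select -on "$event" || return 1
    done
}

# Intended use on a system with the audit subsystem installed:
#   enable_events sat_open sat_open_ro
```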
From time to time you may wish to change your auditing environment. You do this with the sat_select command. If you are making a temporary change, you may wish to save your current auditing environment for easy replacement. To do so, use this command:
sat_select -out > /etc/config/sat_select.options
Then, to restore auditing to the saved state, use this command:
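sat_select `cat /etc/config/sat_select.options`
The backquotes (grave accents) in the above example are crucial and must not be omitted. They cause the shell to run the cat command and pass the contents of the file to sat_select as arguments; ordinary single quotation marks would pass the literal text instead.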
You may save as many different audit states as you wish, in different filenames. Simply insert the filename of the state you wish to use in the above example. The /etc/config/sat_select.options file is the default audit state file that is read at boot time. The /etc/config/sat_select.options file must be labeled dblow if you are running Trusted IRIX/B, and you should restrict DAC file permissions to root only regardless of your operating system type.
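The save and restore steps can be wrapped in two small functions, sketched here on the assumption that each saved state lives in its own file. The function names are hypothetical; the sat_select invocations are the ones given above.

```shell
# Save the current audit event selections to the file named in $1.
# save_audit_state and restore_audit_state are hypothetical helper names.
save_audit_state() {
    sat_select -out > "$1"
}

# Restore selections from a file written by save_audit_state.
# The command substitution is deliberately left unquoted so that the
# saved options are split into separate arguments for sat_select.
restore_audit_state() {
    sat_select `cat "$1"`
}

# Intended use:
#   save_audit_state /etc/config/sat_select.options
#   restore_audit_state /etc/config/sat_select.options
```

Keeping one file per audit state makes it easy to switch between states by name, as described above.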
The location of your audit record files is also configurable. You can direct your audit records to be saved to any location you desire, including magnetic tape. satd saves its input data in the directories or files named in its path arguments.
The -f option to satd specifies an output path, which may be a directory or a file. If the output path is a specific filename, satd writes to that file. If the output path is a directory, satd creates and fills uniquely named files under that directory; files are named for their creation time. For instance, sat_9101231636 was created in 1991 on January 23 at 4:36 pm. You can specify several output paths in the satd command line. To do so, you must precede each path with a -f or put commas (but no blank space) between each pathname. Taken together, all of the output paths specified in the command line are known as the path list. Here is a pair of examples of command lines that contain path lists:
satd -f /sat1 -f /sat2 -f /sat3 -f /dev/null
satd -f /sat1,/sat2,/sat3,/dev/null
If no output paths are specified after the -f flag, the audit trail records are not saved anywhere, and the system halts. If a path given as a command-line parameter is invalid for any reason, a warning is printed, that path is omitted from the path list, and satd continues operating with whatever specified paths are valid. If the specified path does not already exist, satd creates a file with that name.
A file or directory is full when the filesystem on which it resides has no more available space. If a directory is specified as an output path, an audit file is constructed under that directory. When the audit file is filled to an internally specified maximum size, it is closed and a new audit file is created under that directory.
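Since audit files created under a directory are named for their creation time, the name can be decoded when sorting or archiving them. The following sketch assumes the layout sat_YYMMDDhhmm, which is inferred from the single example sat_9101231636; the function name sat_file_stamp is hypothetical.

```shell
# Print the creation time encoded in a satd file name such as
# sat_9101231636 as "YY-MM-DD hh:mm". The century is not part of the
# name, so it is not printed. Names that do not match the assumed
# sat_YYMMDDhhmm layout produce no output.
sat_file_stamp() {
    name=`basename "$1"`
    echo "$name" | sed -n 's/^sat_\([0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\)$/\1-\2-\3 \4:\5/p'
}

# Example:
#   sat_file_stamp /sat1/sat_9101231636
```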
When one output path becomes full, satd replaces the current output path with a path that is not full. The method of replacement is configurable with the -r option. The output path is also replaced if satd receives a SIGHUP signal, for instance one sent with a kill command.
If an output path becomes nearly full, warnings are displayed to the system console to notify the administrator to move the audit trail to tape. If all of the output paths become completely full, the system state moves to single-user mode with a very short grace period.
In order to protect against the loss of data due to sudden system state changes, when satd begins operations, it creates a file called /satd.reserve, which is exactly 250,000 bytes long. If satd runs out of space, it immediately removes the /satd.reserve file to free the 250,000 bytes for use to store audit records while the system moves to single-user mode. While the system is coming down, satd stores audit records in a series of files named /satd.emergency-n, where n starts as 0. While satd is doing this, it issues a warning via wall to all users that they have ten seconds before system shutdown.
If the file /satd.emergency-0 already exists, satd immediately moves to the first available filename, typically /satd.emergency-1. To guard against this happening, a warning is issued at boot time if any /satd.emergency files exist.
For complete information on the audit daemon, see the audit(1M), satd(1M), and audit_filters(5) reference pages and the comments in /etc/init.d/audit.
At times, you may wish to examine the audit record of a particular user. For example, the user may have a history of violations of system security or may simply be leaving the project and an accounting of activity may be required.
If the user in question is being audited to determine if attempted security violations are taking place, use the command line:
sat_reduce -P satfile | sat_summarize -u user_name
This command line selects only the audit records that represent attempted violations. The -P flag to sat_reduce selects for attempted violations. The -u flag to the sat_summarize command lists the number of records generated by the user.
It is vitally important to remember that not every record of an attempted violation really represents malicious intent on the part of the user! Most of these records are generated in the course of normal work. The auditor should be looking for a trend, such as repeated attempts to access information unnecessary in the course of normal work (for example, a programmer attempting to access salary or hiring information).
In the second scenario, where the employee is leaving the project, the auditor is looking for a comprehensive list of files used by that employee so that the correct files and directories may be assigned a new owner who is remaining on the project.
The above listed command line provides a basic look at the user's activity. Next, to more closely examine the user's activities, issue the following command:
sat_reduce -u user_name satfile | sat_interpret | more
The sat_reduce command selects all of the audit records generated by the user. Then, the sat_interpret command puts the records into human readable form. The output of sat_interpret is very large. If it is impractical to direct this output to a file, you should direct the output to your screen and view it with a screen paging program such as more.
Using these two command lines, you should be able to view a user's activities and come to a reasonable knowledge of the types of actions the user is taking on the system. You can also generate a specific record, in human-readable form, of all security violations or files and resources accessed.
At times, you may wish to examine all audit records pertaining to an individual file. Perhaps some changes have been made to an important file and the user who made those changes must be identified. Or perhaps an accounting of all access to a sensitive file is needed. To obtain a record for each time the file was opened, you must first make certain that the audit daemon is recording sat_open and sat_open_ro events. Use the sat_select command to ensure that these events are logged. To search the audit log for these events, use the following command line:
sat_reduce -e sat_open -e sat_open_ro satfile |
sat_interpret | grep filename
If you are using Trusted IRIX/B, your system supports Mandatory Access Control (MAC) labels on all files and processes. This section explains how to check the audit trail of a given security label.
If you are using standard IRIX, your system does not support MAC labels, and attempts to read the audit trail for events relating to such labels will be futile.
Since the number of configurable labels in Trusted IRIX/B is great enough for each project or portion of a project at your site to have its own label, you may sometimes need to audit a specific label to generate a record of activity on that label. Use the following command to generate a log of activity on a label:
sat_reduce -l label satfile
The above command chooses only audit records that pertain to the given label. The following command syntax allows you to select more than one label for your report:
sat_reduce -l label -l label2 satfile
Once you have obtained output from sat_reduce, use the other auditing utilities, such as sat_interpret or sat_summarize, to view it according to your needs.
The audit trail for an active system with full auditing can be too large for a single person to read and understand, and the entries in the trail that alert you to trouble are small and rare. If you were to read the raw audit trail to find an instance of policy violation, it would be like trying to find a needle in a haystack. Therefore, several utilities exist to help you reduce and interpret the raw audit data. The sat_reduce, sat_interpret, and sat_summarize commands can be used to remove superfluous information and format the audit history in succinct packages. See the reference pages for these commands for specific information on their usage.
After your raw data has been reduced and interpreted, an individual record looks something like this:
Event type = sat_ae_identity
Outcome = Failure
Sequence number = 5
Time of event = Mon Mar 11 12:46:13.33 PST 1991
System call = syssgi,SGI_SATWRITE
Error status = 0 (No error)
SAT ID = anamaria
Identity event = LOGIN|-|/dev/ttyq4|anamaria|That user gave an invalid label.
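When scanning many interpreted records, it can help to pull out a single field. The sketch below assumes that sat_interpret prints each field on its own line in the form "name = value"; that layout is an assumption based on the sample record, and the function name sat_field is hypothetical.

```shell
# Print the value of one named field from interpreted audit records
# read on standard input. Assumes one "name = value" pair per line.
# sat_field is a hypothetical helper name.
sat_field() {
    awk -v key="$1" '
        {
            i = index($0, " = ")
            if (i > 0 && substr($0, 1, i - 1) == key)
                print substr($0, i + 3)
        }'
}

# Intended use:
#   sat_reduce -u user_name satfile | sat_interpret | sat_field Outcome
```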
The sat_summarize command provides a short listing of what types of records are in the audit trail and how many there are of each type. It's a useful tool for scanning the records quickly and identifying trends in system usage or consistent problems.
Remember that file pathnames within audit records are not the same as those in common usage through the shell on your system. Since the audit record is an exact log for security purposes, many attributes of the pathname that are designed to be transparent in normal usage are explicit in the audit log. For example, the double slash (//) means a directory level crossing (ordinarily represented through the shell with a single slash (/)). A slash followed by an exclamation point (/!) indicates crossing a filesystem mount point. The slash and at sign construction (/@) indicates that the path is following a symbolic link. If you are running Trusted IRIX/B, you may also see a slash followed by a right angle bracket (/>), which indicates that the directory level being crossed into is a multilevel directory. The egrep utility supports this notation, so it is possible to specify this form of pathname notation in regular expression searches. Below are two examples of audit record pathnames:
/usr/!orange2/@/fri//usr//src//lib//libmls//libmls.a
/usr/!tmp/>L_e//sat//sat_9012280805
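The markers described above can be detected mechanically. This sketch reports which of the three special markers appear in one audit-record pathname; the function name path_markers and the labels it prints are hypothetical.

```shell
# Report which special markers appear in an audit-record pathname:
#   /!  crossing a filesystem mount point
#   /@  following a symbolic link
#   />  entering a multilevel directory (Trusted IRIX/B)
# path_markers is a hypothetical helper name.
path_markers() {
    case "$1" in *'/!'*) echo "mount-point" ;; esac
    case "$1" in *'/@'*) echo "symlink" ;; esac
    case "$1" in *'/>'*) echo "multilevel-directory" ;; esac
}

# The same markers can be searched for directly, for example:
#   sat_reduce -u user_name satfile | sat_interpret | egrep '/!'
```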
The system places the audit data in files on your system. Each file begins with the starting date and time of the file, the machine name, and the host ID, and ends with the stopping date and time. If your system is interrupted (for example, by a power failure), the audit file being used at that time will have no ending entry. The audit daemon automatically closes a file when it reaches a certain manageable size and opens another. A new file is always started when the system is brought up. For information on these files and their format, see the satd(1M) reference page.
The overwhelming majority of records in an audit trail are the result of the normal actions of users doing their jobs. No automated tool exists to locate records that signify the actions of abusers trying to violate system security. Nonetheless, an administrator can apply some general rules to detect abuse or violation of security policy. This list of tips is neither complete nor universal. Each administrator must customize the list to meet the particular needs of each site.
Intrusion by outsiders is among the most feared of abuses. Fortunately, this kind of abuse produces distinctive audit record patterns and is easily detected. Below are descriptions of several subcategories of outsider abuse that the audit system can detect. Note, though, that these kinds of patterns can also be generated by an authorized user who makes a mistake or is misinformed.
All attempts at unauthorized entry generate audit records of the sat_ae_identity event type. (Use sat_select, sat_reduce, and sat_interpret to collect and view these records.) The interpreted output of these events contains a text string that describes the attempt at entry. Intruders from outside your organization have a much higher incidence of failed login attempts than your authorized users.
Three interesting text strings reveal attempts at unauthorized entry:
Here is an example of an interpreted audit record of an unsuccessful login attempt:
Event type = sat_ae_identity
Outcome = Failure
Sequence number = 1
Time of event = Mon Mar 11 12:45:40.34 PST 1991
System call = syssgi,SGI_SATWRITE
Error status = 0 (No error)
SAT ID = anamaria
Identity event = LOGIN|-|/dev/ttyq4|guest|Unsuccessful login attempt.
Usage of your system outside of normal working hours or, if your system maintains physical security of terminals, from unusual locations, is a matter of interest. In most cases, the usage of the system is legitimate, but each instance certainly bears notation and examination. Many potential violations of security from outside your user community happen during nonpeak hours, and rarely from within your physical site.
To observe activity at odd hours, enter the following commands in order:
If your site assigns a terminal to each user and maintains reasonable physical security for each terminal, you can monitor logins from unusual locations. For example, if a user normally working in a group computer lab makes a login attempt from a private office, this event may be cause for interest. To get a list of login events, enter the following command:
sat_reduce -e sat_ae_identity sat_file | sat_interpret | grep LOGIN
Bear in mind that it does not necessarily represent a violation of security if a user is working at an unusual terminal or even if a user is logged on at two or more terminals at once. For instance, the user may be correcting a mistake and may have logged in elsewhere explicitly for the purpose of terminating unwanted processes. You should be looking for instances where the user is not genuinely logged in twice, but where one instance of the login is an intruder.
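To compare login locations at a glance, the LOGIN lines can be reduced to terminal, user, and message. In this sketch the field order (terminal third, user fourth, message fifth) is inferred from the sample records in this chapter and is an assumption; the function name login_summary is hypothetical.

```shell
# Print "terminal user message" for each LOGIN identity event read on
# standard input. Fields are separated by "|" in the Identity event
# line; the field positions are inferred from the sample records.
# login_summary is a hypothetical helper name.
login_summary() {
    awk -F'|' '/Identity event = LOGIN/ { print $3, $4, $5 }'
}

# Intended use:
#   sat_reduce -e sat_ae_identity sat_file | sat_interpret | login_summary
```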
Whenever a user connects to a machine outside your trusted local network, an audit record should be generated. A connection to a host outside of the local network is worthy of notice but not necessarily a violation of security. You should be on the lookout for Trojan horse programs that cause your system to make an outward connection at a later time. You can identify outward connections with the following command sequence:
The above command sequence is dependent on the specific implementation of your networking software. You may need to modify commands to reflect your networking situation. For example, if the software you are using does not generate the sat_bsdipc_addr auditing event type, you should search for another event type that is generated.
Beyond use and abuse by intruders, unfortunately, the possibility arises of abuse from within your organization. The following types of events are the most common instances of security violations. It is extremely counterproductive to assume that a security violation on the part of an authorized user indicates that the user is not trustworthy or is involved in some attempt to break security for malicious purposes. Most violations of system security by users involve a failure on the part of the Administrator to adequately prepare the working environment. Users are most concerned with accomplishing their work tasks, not with fixing the computer system to provide themselves with the correct tools. Therefore, you should not be suspicious of the user who violates security unless a clear pattern of a specific and unnecessary security violation is apparent.
Although the system records each instance where access to a file or resource is denied, the information contained in these audit records is rarely indicative of a security violation. Many applications and utilities operate on a principle of access denial as part of normal operation. These events are always logged, but only in rare cases do they indicate a violation. For example, the library function getutent always tries to open /etc/utmp for read-write access. If this action fails, getutent immediately tries again, but requesting read-only access. Permissions on /etc/utmp prohibit all users except root from opening this file for reading and writing. When an unprivileged user runs a program that calls getutent(), a sat_access_denied record is generated, and it is immediately followed in the audit trail by a sat_open_ro record, indicating that access was granted. The lesson in this example is that access denial is usually not indicative of a security violation.
The sat_access_failed event is often confused with the denial event. The event type is completely different and is even more rarely a cause for concern than access denial. When a user enters a command to an interactive shell (such as /bin/csh), the shell tries to execute the command in each directory in the user's search path, failing at each attempt until it finds a directory that actually contains the command. Suppose a user enters xterm and his or her path variable contains
/bin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/bin/X11:~/bin
A sat_access_failed record is generated for each directory in the path until the command is found and executed. In this scenario, a record of failed access is generated for each of the following nonexistent programs: /bin/xterm, /usr/bin/xterm, /usr/sbin/xterm, /usr/local/bin/xterm and a successful sat_file_exec record for the real program: /usr/bin/X11/xterm.
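The search behavior described above can be illustrated by listing the pathnames a shell would try, in order. Every candidate before the one that actually exists corresponds to one failed access record. The function name search_candidates is hypothetical, and the sketch only prints names; it does not touch the filesystem.

```shell
# List, in search order, the pathnames a shell would try for a command
# ($1) given a colon-separated search path ($2).
# search_candidates is a hypothetical helper name.
search_candidates() {
    cmd=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        echo "$dir/$cmd"
    done
    IFS=$old_ifs
}

# Example:
#   search_candidates xterm /bin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/bin/X11
```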
Every interpreted audit record contains a line beginning with the keyword Outcome. The field following this keyword can be equal to one of Success, Failure, or Success due to privilege. The last case indicates that the user made a system call that would have failed except that superuser privilege was invoked to assure its successful completion. This is not necessarily a security violation or an unexpected use of root privilege. It is perfectly normal to see these outcomes. Any time an ordinary user runs a program that contains code that uses root privilege, Success due to privilege outcomes are generated. A good example of this kind of program is passwd. An ordinary user generates a record of this type simply by changing the password on his or her account.
What you should be looking for is an instance where the SAT ID or Effective ID field is different from the "User ID" field. This occurs when a user executes /bin/su to gain root privileges or otherwise promotes the privilege level of a session. In most cases, this is not a security violation, since the root password is necessary to successfully complete the /bin/su command.
An instance of using superuser privilege, though, is always worth examination in the audit trail. When you encounter an instance where a user has promoted his or her login session to root, you should check to see that the user is authorized to know the root password. If not, check whether the user indeed executed the /bin/su command, or if he or she promoted the privilege of the session by some other means, such as a Trojan horse setuid shell command.
Whenever a user runs /bin/su and thereby promotes the privilege of his or her login session, the auditor should also make a routine check of what actions the user took while the privilege was promoted.
Sometimes a particular user is under official scrutiny by the management of a site. He or she may be on probation or may have just left employment under less than ideal circumstances. The auditor can choose to look at the records describing that user's behavior just by directing the audit trail through the sat_reduce command as follows:
Rarely should any user be subjected to this kind of accounting, and this feature should be used carefully and with consideration of the individuals involved.
Sometimes a particular file or resource is of special interest. An information leak may have occurred and an investigation is proceeding into how the leak took place. Or a special file or resource may have been created as bait to trap browsing intruders. In either case, the file or resource should be closely accounted by the auditor.
sat_reduce -n interesting_file -e sat_open -e sat_open_ro sat_filename | sat_interpret
Frequently, actions taken by the Administrator or root result in unusual audit records. With the enhanced privilege of these accounts, it is not unusual for more audit records of potential concern to be generated. Again, it is rare for a record to be generated that cannot be explained by the normal usage of the system or by simple human error.
Every modification of system data files is of interest to the auditor. Since these data files are not only under system security but in fact define system security, any unauthorized access can result in a total breach of security.
Each site has individual policies on how users are added to or removed from the system, how access control of files and hardware is administered, how network connectivity is maintained and administered, and a host of other issues. It is the responsibility of the auditor at each site to enforce the policies of the site and to use the auditing tool effectively to exercise that responsibility.
If you are running Trusted IRIX/B, system data files should be modified only with the dedicated editing tool, dbedit, and never with general-purpose text editors. Only privileged users can use the dbedit tool, and only privileged users have permission to alter the contents of the system data files. Any use of any other editor on a system data file is a violation of security policy and should be noticed by the auditor. If your interpreted audit trail contains sat_open records where the Actual name field contains the string "/secadm", check that the Process ID field (which gives both the PID and the name of the program being executed) does not contain "vi", "ex", "emacs", or any other commonly available text editor. This field should contain only the name "dbedit".
The Administrator should never modify permissions, ownership, or labels of system programs. If your audit trail contains evidence that the administrator has attempted to change attributes of system programs, you should investigate and find the reason for the change. Again, the explanation given is likely to be valid, and this is not good cause to suspect your Administrator of subterfuge; however, you may want to examine your system's security policies and make certain that neither the users nor the administrators take a cavalier attitude toward the security policies.
The following command searches your audit trail for the type of records that can indicate this problem:
sat_reduce -e sat_file_attr_write -e sat_fd_attr_write < satfile
In the interpreted output, look for lines with the Actual name field. Any audit record showing modified attributes for resources in /bin, /sbin, /etc, /lib, /var, /usr/bin, /usr/lib, /usr/share, /usr/bsd, /usr/sbin, or /usr/bin/X11 is an audit record deserving follow-up.
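The directory test described above can be sketched in Python as follows. The sketch only decides whether a pathname taken from an Actual name field falls under one of the listed directories. How the pathname is extracted from the interpreted output is left out, and the function name is hypothetical.

```python
import posixpath

# Directories named in the text above.
SYSTEM_DIRS = (
    "/bin", "/sbin", "/etc", "/lib", "/var",
    "/usr/bin", "/usr/lib", "/usr/share",
    "/usr/bsd", "/usr/sbin", "/usr/bin/X11",
)


def needs_follow_up(actual_name):
    """Return True if the pathname is one of, or lies under, a system directory."""
    path = posixpath.normpath(actual_name)
    # Match on whole path components so that /binary does not match /bin.
    return any(path == d or path.startswith(d + "/") for d in SYSTEM_DIRS)
```

For example, needs_follow_up("/usr/bin/X11/xterm") is true, while needs_follow_up("/usr/people/jeff/notes") is not.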
The auditor should be the only person to access the audit trail. No other users should read from it, write to it, remove files, or modify file attributes. Look at all records generated by people other than the one who knows the auditor account password, and check that none of those records refer to files in /var/adm/sat or in any other directory you use to store audit trail information.
Since the audit trail is stored in ordinary system files, archiving your audit data is as easy as making a backup tape. Archive your audit data to conserve disk space but do keep copies of your audit trail; evidence of intrusion and damage to your system may not always be apparent immediately, and the ability to research your audit trail over time can be very valuable in tracking down a security breach. You can use the compress utility to reduce the size of your old audit files by up to 80 percent.
Since the audit trail is stored in ordinary system files, once it has been archived, audit trail files can be safely removed. If you enter the df command (disk free) and determine that the filesystem containing your audit trail is more than 90 percent full, you should remove old audit files. If your audit files are kept in /var/adm/sat, enter the command
df -k /var/adm/sat
The output should be similar to this:
Filesystem  Type  blocks  use     avail  %use  Mounted on
/dev/root   efs   245916  218694  27222  89%   /
In this example, the file system is 89 percent full, and the auditor should archive and remove audit trail files.
Do not allow your audit files to grow too large. Oversized audit files can use up your available disk space and cause the system to refuse new records and immediately cease operations. This can result in lost work and lost audit records. Maintain at least 10 percent free space in your audit filesystem at all times.
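A site that wants to automate this check could start from something like the following Python sketch. It assumes df output in the layout shown in the example above, with a header line containing a %use column. The function names and the default limit argument are illustrative only.

```python
def percent_used(df_output):
    """Return the %use figure from the last line of df -k output."""
    lines = df_output.strip().splitlines()
    column = lines[0].split().index("%use")   # locate the %use column
    return int(lines[-1].split()[column].rstrip("%"))


def over_limit(df_output, limit=90):
    """Return True if the filesystem is at or above the given percentage."""
    return percent_used(df_output) >= limit


sample = """Filesystem Type blocks use avail %use Mounted on
/dev/root efs 245916 218694 27222 89% /
"""
```

With the sample above, percent_used(sample) yields 89. A site that prefers to act before the 90 percent mark is reached, as in the example, can pass a lower limit.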
The audit daemon, satd(1M), must always be running on your system. The daemon eventually becomes unable to write to the audit file if free disk space drops to 0 percent. When it can no longer write to the audit file, the daemon exits with an error, and the system changes the run level to single-user mode. You must then archive and remove the audit files to free disk space before bringing the system back to multi-user mode. If the satd daemon is somehow killed or interrupted on your system, the system changes the run level to single user mode immediately. The daemon is respawned when the system is brought back up.
To make space on the disk for your audit trail, first boot the system into single-user mode. No audit records are generated in this mode. Once in single-user mode, archive your audit files and remove them from the disk. Once at least 10 percent of the filesystem is free, you may boot into multiuser mode without difficulty.
If your auditing system directs the audit files to the / (root) filesystem or the /usr file system and either filesystem becomes full, you will not be able to bring the system to single-user mode to archive and remove your old audit files. If you find yourself in this situation, perform the following procedures to remove old audit files:
From the shell, you must archive and remove the old audit files. Remember that when your system is running the Inst (also called miniroot) shell, your system's root directory appears as /root/ rather than /, and your /usr filesystem appears as /root/usr, because your system's filesystems are mounted on the Inst filesystem.
IRIX provides utilities to log certain types of system activity. These utilities perform process accounting and system accounting. This chapter contains the following sections:
The four initial sections describe the standard UNIX System V accounting procedures. IRIX also implements an extended accounting facility, discussed in the following section:
Ask your Silicon Graphics sales representative for information on additional tools available. For example, SHARE II for IRIX is an optional product allowing additional administrative control of system resources including disk space, CPU entitlement, memory (real or virtual), number of processes, printer pages, terminal and modem connect-time, network packets, and more.
The IRIX process accounting system can provide the following information:
Using this information, you can:
The next sections describe the parts of process accounting, how to turn on and off process accounting, and how to look at the various log files.
The IRIX process accounting system has several parts:
You must specifically turn on this function. See "Turning On Process Accounting".
Note that for XFS filesystems, disk quotas (installed with the subsystem eoe.sw.quotas) can be used as an efficient accounting tool to keep track of disk usage. Refer to IRIX Admin: Disks and Filesystems for more information.
To turn on process accounting:
chkconfig acct on
/usr/lib/acct/startup
This starts the kernel writing information into the file /var/adm/pacct.
Process accounting is started every time you boot the system, and every time the system boots, you should see a message similar to this:
System accounting started
Note that process accounting files, especially /var/adm/pacct, can grow very large. If you turn on process accounting, especially on a server, you should watch the amount of free disk space carefully. See "Accounting File Size Control".
To turn off process accounting, follow these steps:
chkconfig acct off
/usr/lib/acct/shutacct
This stops the kernel from writing accounting data into the file /var/adm/pacct. Process accounting is now turned off.
The directory /usr/lib/acct contains the programs and shell scripts necessary to run the accounting system. Process accounting uses a login (adm) to perform certain tasks. The directory /var/adm contains active data collection files used by process accounting. Here is a description of the primary subdirectories in /var/adm:
Process and disk accounting files can grow very large. On a busy system, they can grow quite rapidly.
To help keep the size of the file /var/adm/pacct under control, the cron command runs /usr/lib/acct/ckpacct to check the size of the file and the available disk space on the file system.
If the size of the pacct file exceeds 1000 blocks (by default), it runs the turnacct command with argument "switch." The "switch" argument causes turnacct to back up the pacct file (removing any existing backup copy) and start a new, empty pacct file. This means that at any time, no more than 2000 blocks of disk space are taken by pacct file information.
If the amount of free space in the file system falls below 500 blocks, ckpacct automatically turns off process accounting by running the turnacct command with the "off" argument. When at least 500 blocks of disk space are free, accounting is activated again the next time cron runs ckpacct.
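The decision logic described above can be summarized in a short Python sketch. It is a model of the behavior as described in this section, not a transcription of the ckpacct script itself. The function name and the accounting_on argument are hypothetical.

```python
def ckpacct_actions(pacct_blocks, free_blocks, accounting_on,
                    max_blocks=1000, min_free=500):
    """Return the turnacct arguments that one hourly check would use."""
    # Too little free space: stop accounting and do nothing else.
    if free_blocks < min_free:
        return ["off"] if accounting_on else []
    actions = []
    # Enough space again: reactivate accounting if it had been turned off.
    if not accounting_on:
        actions.append("on")
    # pacct has grown past the limit: back it up and start a new file.
    if pacct_blocks > max_blocks:
        actions.append("switch")
    return actions
```

For example, ckpacct_actions(1200, 5000, True) returns ["switch"], and ckpacct_actions(300, 400, True) returns ["off"].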
The files listed here are located in the /var/adm directory:
The following files are located in the /var/adm/acct/nite directory:
The following files are located in the /var/adm/acct/sum directory:
The following files are located in the /var/adm/acct/fiscal directory:
When IRIX enters multiuser mode, /usr/lib/acct/startup is executed as follows:
The ckpacct procedure is run through cron every hour of the day to check the size of /var/adm/pacct. If the file grows past 1000 blocks (default), the turnacct switch is executed. The advantage of having several smaller pacct files becomes apparent when you try to restart runacct after a failure processing these records.
The chargefee program can be used to bill users for file restores, and so on. It adds records to /var/adm/fee that are picked up and processed by the next execution of runacct and merged into the total accounting records. runacct is executed through cron each night. It processes the active accounting files, /var/adm/pacct, /etc/wtmp, /var/adm/acct/nite/disktacct, and /var/adm/fee. It produces command summaries and usage summaries by login name.
When the system is shut down using shutdown, the shutacct shell procedure is executed. It writes a shutdown reason record into /etc/wtmp and turns process accounting off.
After the first reboot each morning, the administrator should execute /usr/lib/acct/prdaily to print the previous day's accounting report.
If you have installed the system accounting option, all the files and command lines for implementation have been set up properly. You may wish to verify that the entries in the system configuration files are correct. In order to automate the operation of the accounting system, you should check that the following have been done:
/usr/lib/acct/startup
/usr/lib/acct/shutacct
The first line starts process accounting during the system startup process; the second stops it before the system is brought down.
0 4 * * 1-6 if /etc/chkconfig acct; then /usr/lib/acct/runacct 2> /var/adm/acct/nite/fd2log; fi
5 * * * 1-6 if /etc/chkconfig acct; then /usr/lib/acct/ckpacct; fi
Note that each of the above cron commands must occupy a single line in the crontabs file. The following command, which also occupies a single line in the crontabs file, should be in /var/spool/cron/crontabs/root:
0 2 * * 4 if /etc/chkconfig acct; then /usr/lib/acct/dodisk > /var/adm/acct/nite/disklog; fi
0 5 1 * * if /etc/chkconfig acct; then /usr/lib/acct/monacct; fi
The above command is all on one line in the source file, and takes advantage of the default action of monacct that uses the current month's date as the suffix for the file names. Notice that the entry is executed when runacct has sufficient time to complete. This will, on the first day of each month, create monthly accounting files with the entire month's data.
PATH=/usr/lib/acct:/bin:/usr/bin
chkconfig acct on
and
/usr/lib/acct/startup
The next time the system is booted, accounting starts.
The runacct command is the main daily accounting shell procedure. It is normally initiated by cron during nonpeak hours. runacct processes connect, fee, disk, and process accounting files. It also prepares daily and cumulative summary files for use by prdaily or for billing purposes.
The following files produced by runacct are of particular interest:
runacct takes care not to damage files in the event of errors. A series of protection mechanisms are used that attempt to recognize an error, provide intelligent diagnostics, and terminate processing in such a way that runacct can be restarted with minimal intervention. It records its progress by writing descriptive messages into the file active. (Files used by runacct are assumed to be in the nite directory unless otherwise noted.) All diagnostics output during the execution of runacct are written into fd2log. runacct complains if the files lock and lock1 exist when invoked. The lastdate file contains the month and day runacct was last invoked and is used to prevent more than one execution per day. If runacct detects an error, a message is written to /dev/console, mail is sent to root and adm, locks are removed, diagnostic files are saved, and execution is terminated.
To allow runacct to be restartable, processing is broken down into separate reentrant states. A file is used to remember the last state completed. When each state completes, statefile is updated to reflect the next state. After processing for the state is complete, statefile is read and the next state is processed. When runacct reaches the CLEANUP state, it removes the locks and terminates. States are executed as follows:
The runacct procedure can fail for a variety of reasons - usually due to a system crash, /usr running out of space, or a corrupted wtmp file. If the activeMMDD file exists, check it first for error messages. If the active file and lock files exist, check fd2log for any mysterious messages. The following are error messages produced by runacct and the recommended recovery actions:
ERROR: locks found, run aborted
The files /var/adm/acct/nite/lock and /var/adm/acct/nite/lock1 were found. These files must be removed before runacct can restart.
ERROR: acctg already run for date: check /var/adm/acct/nite/lastdate
The date in lastdate and today's date are the same. Remove lastdate.
ERROR: turnacct switch returned rc=?
Check the integrity of turnacct and accton. The accton program must be owned by root and have the setuid bit set.
ERROR: Spacct?.MMDD already exists
The file setups have probably already been run. Check the status of the files, then run the setups manually.
ERROR: /var/adm/acct/nite/wtmp.MMDD already exists, run setup manually
Self-explanatory.
ERROR: wtmpfix detected a corrupted wtmp file. Use fwtmp to correct the corrupted file.
Self-explanatory.
ERROR: connect acctg failed: check /var/adm/acct/nite/log
The acctcon1 program encountered a bad wtmp file. Use fwtmp to correct the bad file.
ERROR: Invalid state, check /var/adm/acct/nite/active
The file statefile is probably corrupted. Check statefile for irregularities and read active before restarting.
The runacct program, called without arguments, assumes that this is the first invocation of the day. The argument MMDD is necessary if runacct is being restarted and specifies the month and day for which runacct will rerun the accounting. The entry point for processing is based on the contents of statefile. To override statefile, include the desired state on the command line. For example, to start runacct, use the command:
nohup runacct 2> /var/adm/acct/nite/fd2log &
To restart runacct:
nohup runacct 0601 2>> /var/adm/acct/nite/fd2log &
To restart runacct at a specific state:
nohup runacct 0601 WTMPFIX 2>> /var/adm/acct/nite/fd2log &
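The restart behavior described above, in which a state file records where processing should resume and a state named on the command line overrides it, can be modeled with a small Python sketch. This is not how runacct itself is implemented. The state list is an illustrative subset that uses only names appearing in this chapter, and the function names are hypothetical.

```python
import os
import tempfile

# Illustrative subset of states; the real procedure has more.
STATES = ["SETUP", "WTMPFIX", "CLEANUP"]


def run_states(statefile, handlers, start_state=None):
    """Run each state in order, recording the next state in statefile."""
    if start_state is not None:
        state = start_state                  # explicit override
    elif os.path.exists(statefile):
        with open(statefile) as f:
            state = f.read().strip()         # resume after a failure
    else:
        state = STATES[0]                    # first invocation
    if state not in STATES:
        raise ValueError("Invalid state: %r" % state)
    completed = []
    for i in range(STATES.index(state), len(STATES)):
        handlers[STATES[i]]()                # may raise; statefile is kept
        completed.append(STATES[i])
        if i + 1 < len(STATES):
            with open(statefile, "w") as f:
                f.write(STATES[i + 1])       # remember the next state
    # The sketch simply removes the state file once the last state is done.
    if os.path.exists(statefile):
        os.remove(statefile)
    return completed


def demo():
    """Fail once in WTMPFIX, then restart and finish."""
    statefile = os.path.join(tempfile.mkdtemp(), "statefile")
    attempts = {"WTMPFIX": 0}

    def wtmpfix():
        attempts["WTMPFIX"] += 1
        if attempts["WTMPFIX"] == 1:
            raise RuntimeError("simulated failure")

    handlers = {"SETUP": lambda: None, "WTMPFIX": wtmpfix,
                "CLEANUP": lambda: None}
    try:
        run_states(statefile, handlers)      # stops in WTMPFIX
    except RuntimeError:
        pass
    return run_states(statefile, handlers)   # resumes at WTMPFIX
```

Because the state file is updated only after a state completes, the second call in demo() skips SETUP and resumes at the state that failed.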
Sometimes, errors occur in the accounting system, and a file is corrupted or lost. You can ignore some of these errors, or simply restore lost or corrupted files from a backup. However, certain files must be fixed in order to maintain the integrity of the accounting system.
The wtmp files are the most delicate part of the accounting system. When the date is changed and the IRIX system is in multiuser mode, a set of date change records is written into /etc/wtmp. The wtmpfix program is designed to adjust the time stamps in the wtmp records when a date change is encountered. However, some combinations of date changes and reboots will slip through wtmpfix and cause acctcon1 to fail.
The following steps show how to fix a wtmp file:
If the wtmp file is beyond repair, remove the file and create an empty wtmp file:
This prevents any charging of connect time. acctprc1 cannot determine which login owned a particular process, but it is charged to the login that is first in the password file for that user ID.
If the installation is using the accounting system to charge users for system resources, the integrity of sum/tacct is quite important. Occasionally, mysterious tacct records appear with negative numbers, duplicate user IDs, or a user ID of 65,535. First check sum/tacctprev with prtacct. If it looks all right, the latest sum/tacct.MMDD should be patched up, then sum/tacct recreated. A simple patchup procedure would be:
cd /var/adm/acct/sum
acctmerg -v < tacct.MMDD > xtacct
ed xtacct
acctmerg -i < xtacct > tacct.MMDD
acctmerg tacctprev < tacct.MMDD > tacct
Remember that the monacct procedure removes all the tacct.MMDD files at the end of each fiscal period. The tacct.MMDD files that remain therefore cover only the current period, so you can recreate sum/tacct by merging these files.
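The three symptoms mentioned above (negative numbers, duplicate user IDs, and a user ID of 65,535) can be screened for mechanically. The following Python sketch assumes the total accounting records have already been converted into (uid, login, values) tuples. That record layout and the function name are hypothetical and do not reflect the actual tacct file format.

```python
from collections import Counter


def suspicious_tacct(records):
    """Return (uid, login) pairs for records showing a known symptom."""
    records = list(records)
    uid_counts = Counter(uid for uid, _, _ in records)
    bad = []
    for uid, login, values in records:
        if (uid == 65535                       # bogus user ID
                or uid_counts[uid] > 1         # duplicate user ID
                or any(v < 0 for v in values)):  # negative totals
            bad.append((uid, login))
    return bad


# Hypothetical sample data.
sample_records = [
    (0, "root", [12, 340]),
    (101, "jeff", [5, -20]),
    (102, "ann", [7, 15]),
    (102, "ann", [1, 2]),
    (65535, "?", [0, 0]),
]
```

Records reported by such a check are candidates for removal or correction during the ed step of the patchup procedure.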
The file /usr/lib/acct/holidays contains the prime/nonprime table for the accounting system. The table should be edited to reflect your location's holiday schedule for the year. The format is composed of three types of entries:
1992 0900 1630
A special condition allowed for in the time field is that the time 2400 is automatically converted to 0000.
day-of-year Month Day Description of Holiday
The day-of-year field is a number in the range of 1 through 366, indicating the day for the corresponding holiday (leading white space is ignored). The other three fields are actually commentary and are not currently used by other programs.
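When preparing holiday entries, the day-of-year value can be computed rather than counted by hand. The following Python sketch uses the standard datetime module. The function name is only a suggestion.

```python
from datetime import date


def day_of_year(year, month, day):
    """Return the day-of-year number (1 through 366) for a calendar date."""
    return date(year, month, day).timetuple().tm_yday
```

For example, day_of_year(1992, 7, 4) gives 186, because 1992 is a leap year; the same date in 1993 gives 185.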
runacct generates five basic reports upon each invocation. They cover the areas of connect accounting, usage by person on a daily basis, command usage reported by daily and monthly totals, and a report of the last time users were logged in. The following paragraphs describe the reports and the meanings of their tabulated data.
In the first part of the report, the from/to banner should alert the administrator to the period reported on. This period runs from the time the last accounting report was generated until the time the current accounting report was generated. It is followed by a log of system reboots, shutdowns, power fail recoveries, and any other record dumped into /etc/wtmp by the acctwtmp program. See the acct(1M) reference page for more information.
The second part of the report is a breakdown of line utilization. The TOTAL DURATION field tells how long the system was in multiuser state (able to be accessed through the terminal lines). The columns are:
While the system is running, /etc/wtmp should be monitored, since this is the file on which connect accounting is based. If it grows rapidly, execute acctcon1 to see which line is the noisiest. If interrupts are occurring at a high rate, general system performance is affected.
The daily usage report gives a by-user breakdown of system resource utilization. Its data consists of:
These two reports are virtually the same except that Daily Command Summary reports only on the current accounting period, while Monthly Total Command Summary tells the story for the start of the fiscal period to the current date. In other words, the monthly report reflects the data accumulated since the last invocation of monacct.
The data included in these reports tell an administrator which commands are used most heavily. Based on those commands' characteristics of system resource utilization, the administrator can decide what to weigh more heavily when system tuning. These reports are sorted by TOTAL KCOREMIN, which is an arbitrary yardstick but often a good one for calculating "drain" on a system.
Large computing sites often have many unrelated users and must be able to charge them separately for resource usage. Although there are IRIX mechanisms to provide usage information, these mechanisms are inadequate for many sites. Standard IRIX accounting lacks some important metrics, uses lots of disk space, and provides little flexibility for usage billing. Third-party accounting software addresses some of these issues, but is still limited by the data provided by IRIX. Array clusters and hypercubes compound these problems by allowing a single user's resource usage to be spread over multiple systems.
IRIX provides three features to assist large computing sites with accounting needs: extended accounting, array sessions, and project IDs.
The original IRIX mechanism for resource accounting was based on standard System V accounting. Whenever a process exits, the kernel writes a record containing resource usage information to a file. Because the kernel itself does this file I/O, process accounting can become a minor bottleneck on heavily loaded systems. Another problem is with the format of data written by System V accounting: usage information is stored using an awkward compressed format that amounts to a 16-bit floating point number. Values quickly lose a significant amount of accuracy, and have a maximum value that is not difficult to exceed on modern systems (around 2^34, or 16 GB). Finally, there is no room for expansion in the accounting records. Metrics provided are fairly limited, and many customers need additional data. However, with no room for expansion, additional fields would require increasing the record size, which would break virtually all the existing software that uses accounting data.
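To make the accuracy and range limits concrete, the following Python sketch models a 16-bit compressed value. It assumes the conventional System V comp_t layout, a 13-bit mantissa with a 3-bit base-8 exponent. The text above does not spell out the layout, so treat that as an assumption. The function names are illustrative, and the encoder truncates rather than rounds.

```python
MANTISSA_BITS = 13
MANTISSA_MAX = (1 << MANTISSA_BITS) - 1   # 8191
EXPONENT_MAX = 7                          # 3-bit exponent


def encode_comp(value):
    """Pack a non-negative integer into 16 bits, discarding low-order bits."""
    exponent = 0
    while value > MANTISSA_MAX:
        value >>= 3                       # divide by 8, losing precision
        exponent += 1
        if exponent > EXPONENT_MAX:
            raise OverflowError("value too large for a 16-bit compressed field")
    return (exponent << MANTISSA_BITS) | value


def decode_comp(comp):
    """Expand a 16-bit compressed value back into an integer."""
    return (comp & MANTISSA_MAX) << (3 * (comp >> MANTISSA_BITS))
```

Under this layout the largest representable value is 8191 * 8^7, just under 2^34, and a value such as 1,000,000 comes back as 999,936 after a round trip.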
In IRIX release 6.1 and later, extended accounting is available, while System V accounting remains in place, essentially unchanged.
One significant change in extended accounting is the delivery mechanism: records are written using the system audit trail (SAT) facility, which uses a daemon to collect audit records from the kernel using special system calls. SAT writes records out to destinations chosen by the system administrator; see satd(1M). This gets the kernel out of the file I/O business, and gives system administrators flexibility in the handling of accounting data.
The sat_select command can be used to select accounting events for the audit subsystem to monitor; see sat_select(1M) for details.
Housekeeping tasks such as rotating audit files and handling file-system-full conditions are handled by the satd program. Third party software can either read the audit files in their entirety (files may contain records for non-accounting events if a site has elected to audit them) or use the existing sat_reduce program to filter out only relevant accounting records; see sat_reduce(1M). Contents of individual records can be dumped in ASCII format using the sat_interpret program; see sat_interpret(1M).
Resource data contained in extended accounting records is stored as uncompressed 64-bit values, which should be sufficient for most metrics into the near future. Records contain spare fields to allow for future expansion, and a version code to allow software to handle future format changes gracefully. In addition to all of the metrics reported by System V accounting, these new metrics have been added:
To begin using extended accounting on a system, follow these steps:
chkconfig audit on
For more information, see the extacct(1M) reference page. Appendix A of IRIX Admin: System Configuration and Operation lists kernel parameters for extended accounting.
To reduce disk space consumption and processing time for accounting records, IRIX can accumulate and report accounting information by array session. Process accounting is separately controlled - sites can use either accounting style, or both. Session accounting records contain data similar to process accounting records, except that counters and values reflect the accumulated total of all processes that were members of the session.
An array session is a set of processes all related to each other by a single unique identifier, the array session handle (ASH). A child process ordinarily inherits the ASH of its parent when created, thus becoming a member of its parent's array session. However, a system call is provided to allow a process to leave its parent's array session and start a new one. Programs like login and rshd use this system call so that logging into the system effectively starts a new session. Programs like cron, su, and several batch queuing systems use this system call so that work done on behalf of another user can have its own session. When the last process with a given ASH exits, the array session ends, and a session accounting record is written.
The ASH is a 64-bit value. A unique, increasing value (similar to a process ID) is assigned by default to each new array session as its handle. However, a system call is provided to change an array session's handle if desired. This can be used to synchronize the handles of array sessions on several systems in an array, thus allowing a distributed job to be considered a single entity for accounting purposes.
For more information, see the array_sessions(5) reference page.
The range of handles that the system assigns may be configurable, so it is possible to ensure that automatically assigned handles never conflict with process-specified ones. The system ensures that a particular ASH is never in use by more than one array session on that local system at one time.
In addition to accumulated totals of various process accounting data, session accounting records contain a 64-byte field intended for "service provider information." In particular, batch queuing systems can use this field to record data about the queue name, initiator, and so forth. By default, the service provider information for a new array session is inherited from the array session of its parent process.
The standard init program always has its service provider information set to all zeroes, and standard login utilities (login, su, rshd) never change service provider information. Batch queuing systems, on the other hand, are always expected to set service provider information to some non-zero value. Thus, it is possible to distinguish batch jobs from interactive sessions by checking if the service provider information is all zeroes or not.
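The all-zeroes test described above is simple to express in code. The following Python sketch assumes the 64-byte service provider information field has already been extracted from a session accounting record as a bytes object. The extraction step, the function name, and the sample queue string are hypothetical.

```python
SPI_LENGTH = 64   # size of the service provider information field, in bytes


def is_batch_session(spi):
    """Return True if the service provider information is not all zeroes."""
    if len(spi) != SPI_LENGTH:
        raise ValueError("expected a %d-byte field" % SPI_LENGTH)
    return any(spi)   # any non-zero byte marks a batch job


interactive = bytes(SPI_LENGTH)                       # all zeroes
batch = b"queue=night".ljust(SPI_LENGTH, b"\0")       # made-up contents
```

Here is_batch_session(interactive) is false and is_batch_session(batch) is true. The meaning of the non-zero contents is defined by the batch queuing system that wrote them.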
Many sites must be able to charge individual departments separately for their usage of a given system. Typically, this was done by billing total usage for each system user ID to the appropriate department. However, some sites have users that work for more than one department, so billing all usage to a single department is not appropriate.
To solve this accounting conundrum, the project ID feature was introduced into IRIX. A project ID is similar to a group ID, except that:
A default project ID is associated with every user ID. Whenever it is necessary for a user to do work that should be billed to a different project, the newproj command may be used to change project ID; see newproj(1) for details. This command starts a new shell and array session; background processes under the old shell continue being accounted for under the original project ID. Furthermore, the user ID and group ID remain unchanged, so access permissions are unaffected. To prevent users from specifying a project ID for which they are not authorized, the newproj command consults a file listing valid project IDs for each user. The system calls that set project ID require superuser privileges.
The file that contains user IDs and their authorized projects, /etc/project, is similar in style to /etc/passwd or /etc/group; see project(4) for details. This file also specifies the default project for each user, in order to avoid modifying /etc/passwd. Because the project ID is a simple number, an additional file, /etc/projid, associates mnemonic ASCII names with numeric project IDs; see projid(4) for details. The system administrator can configure a standard default project ID using the dfltprid variable of systune.
By default, an array session inherits the project ID of the session that spawned it. The standard login utilities (login, su, rshd) that start new array sessions have been updated to change project ID to the default project ID of the new user.
Library routines for reading project ID files are also provided, comparable to library routines for reading password file data. See projid(3C) for more information.