# Unix

## Resources

Beginner Unix Tutorial slides (2015-2016)

## What is UNIX?

UNIX is an operating system (OS).

UNIX has a convoluted history, but roughly speaking, it was originally invented by Ken Thompson in 1969. Dennis Ritchie, the inventor of the C programming language, is considered to be the co-author of the system. UNIX was reimplemented almost entirely in C during 1972-1974, making it the first source-portable OS. You can read more about UNIX in Wikipedia, a free encyclopedia, at: http://www.wikipedia.org/wiki/Unix

Technically, UNIX is proprietary software originally owned by AT&T, and now partially by Novell, SCO Group, and The Open Group - it's very unclear. Despite UNIX being one of the most influential operating systems in history, nowadays hardly anyone uses UNIX; instead people use variations of the original UNIX that have been created since the early 1980s, including many free variations (that is, free of restrictive copyrights), such as Debian GNU/Linux, RedHat Linux, FreeBSD, and OpenBSD. You can read more about the definition and philosophy of free software at: http://www.gnu.org/philosophy/free-sw.html

It is incorrect to refer to any of these variations as "UNIX"; it is better to say "UNIX-like". In practice, however, because 99% of the time people only speak of UNIX-like systems anyway, the intended generalization is understood.

In this document, to acknowledge the distinction between the original UNIX and the loose generalization, the spelling "Unix", with only the first letter capitalized, will be used to refer to any common UNIX-like system.

## Unix Basics

Unix is a multi-user and multi-tasking operating system. "Multi-" is a combining form indicating "more than one". Therefore, in this context, multi-user means that a system is capable of being used by more than one user at the same time. Multi-tasking means that any user may run more than one task (aka program, or job) at the same time.

"User" is often a combining form indicating "a person or thing that uses"; in this context it is short for "computer-user". A "task" is any piece of work. [*]

### Time-Sharing

If you are unfamiliar with a multi-user operating system you may wonder how so many users can use a computer at the same time.

To understand, first note the following terms. A device is a piece of computer hardware designed for a specific task. A channel is a path along which data can be transmitted.

Next consider the keyboard (an input device) and the monitor (an output device) as a pair forming an input/output channel to a computer. A device having input/output links with a computer is called a terminal. Now imagine many such terminals attached to one Unix machine.

Sharing one computer is possible because Unix is a time-sharing operating system: an operating system whose design and high speed allow its users (at different terminals of a single computer running Unix) to seemingly communicate with it at the same time.

### Accounts

For the sake of order and security, each user of a Unix system has a separate account. An account is something like an office, because it is a place to do work and store things related to that work.

Each account is uniquely identified by a number called a user identification number (or uid for short), as well as an account name, generally referred to as a username, and an associated password (a secret word that ensures admission by proving one's identity).
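
The uid and username of the account you are logged into can usually be inspected with the standard "id" utility. The following is a minimal sketch, assuming a POSIX shell; the variable names are only illustrative.

```shell
# Numeric user identification number (uid) of the current account.
my_uid=$(id -u)

# Username associated with that uid.
my_name=$(id -un)

echo "uid=$my_uid username=$my_name"
```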

### Files and Directories

Account information, along with absolutely all other information in a Unix system, is stored in entities called files, similar in concept to ordinary office files. Files can be grouped into directories, just as office files can be grouped into folders. A directory can contain files as well as other directories, just as an office folder can contain files and other folders. However, unlike office folders, which are limited by physical constraints, directories can be nested without any practical limit.

To be nested means to be positioned within something. For directories to be nested means for them to be positioned within other directories.

One structural constraint of directories is that all directories ultimately stem from one directory called root (abbreviated as a single forward slash: "/"). Another constraint is that any single directory must contain files with unique names, but files in different directories can have the same name without any conflict. [1] [*] [1 TUPEnvironment, p. 21]
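
Both ideas (nesting, and identical names in different directories) can be tried out safely in a scratch directory. This is only a sketch, assuming a POSIX shell with the standard "mktemp" and "mkdir -p" utilities; the names "projects", "unix", "notes", and "readme" are hypothetical.

```shell
# Work in a throwaway directory so nothing in the real filesystem is touched.
demo=$(mktemp -d)

# Create directories nested three levels deep in one step (-p makes parents).
mkdir -p "$demo/projects/unix/notes"

# The same file name can exist in two different directories without conflict.
echo "first"  > "$demo/projects/readme"
echo "second" > "$demo/projects/unix/readme"

first=$(cat "$demo/projects/readme")
second=$(cat "$demo/projects/unix/readme")
echo "$first $second"

# Clean up the scratch directory.
rm -rf "$demo"
```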

### Filesystem and File Hierarchy

A filesystem is a logical subdivision of hard disk space. Characteristic of a Unix filesystem is its hierarchical structure. Hierarchy refers to any system of things or persons arranged in a graded order, such as the structure of government, or the socioeconomic structure of our society [*]. In this case we are talking about a system, or structure, of directories and files in a nested order, beginning with the root directory. [*][*]

The filesystem is recursively defined in the form of an upside down tree, starting with the 'root' directory at the top. Recall that the root directory is simply named "/". It is not contained in any other directory, but from it all other directories descend, directly or indirectly. A file path is a description of a file's residence relative to some directory. A file path relative to the root is called an absolute path, and would start with "/". Note that all subsequent slashes in a path name are delimiters that separate directory names.
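
To see how the slashes divide an absolute path, the standard "dirname" and "basename" utilities can split one into its directory part and its final component. This is an illustrative sketch; the file name "notes.txt" is hypothetical, and the path does not need to exist for these utilities to work.

```shell
# A hypothetical absolute path; it starts at the root directory "/".
path=/home/users/abatko/notes.txt

# Everything before the last slash names the containing directory.
dir=$(dirname "$path")

# The final component names the file itself.
file=$(basename "$path")

echo "directory: $dir"
echo "file:      $file"
```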

### Home Directories

While introducing the notion of a user account, it was mentioned that each account has an associated uid, username, and password for the sake of order and security. Further to this reasoning, each account is also allocated a unique place on the filesystem to store the account owner's personal files and do work. That place is actually a unique directory within the filesystem, and is called a home directory (commonly abbreviated to home dir, hdir, or home). [*]

A path can be specified relative to a user's home directory, by a leading "~", or to the parent directory (the directory immediately above any given directory), by a leading "..". A path can also be specified from the present working directory (pwd) by a leading ".".
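
These relative forms can be tried in a scratch directory. The sketch below assumes a POSIX shell; the directory names "a" and "b" are hypothetical, and the commands "cd" and "pwd" are described under Basic Commands.

```shell
# Build a small scratch tree: <demo>/a/b
demo=$(mktemp -d)
mkdir -p "$demo/a/b"

cd "$demo/a/b"
start=$(pwd)

cd ..                 # ".." names the parent directory
parent=$(pwd)

cd ./b                # "." names the present working directory
back=$(pwd)

echo "start:  $start"
echo "parent: $parent"
echo "back:   $back"
echo "home:" ~        # a leading "~" stands for the home directory

# Leave the scratch tree before removing it.
cd /
rm -rf "$demo"
```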

No user is permitted to access another user's home directory, nor anything beneath it. In fact, other users can be prevented from even seeing what is within a user's home directory or their files. Equally, it is also possible to grant permission for certain users, or all users, to read, write, and/or access a file or directory. The concept of permissions will be discussed within the intermediate-level material.

### Log-in

To use a Unix system, a user must have an account on that system, and must gain access to that account by proving identity by a process of authentication called log-in. A user "logs-in" by inputting a valid username and associated password. [*]

Logging-in is possible in two ways: via a graphical user interface (aka graphic console), or via a text user interface (aka text console). An interface is a common point, boundary, or link between two things - in this case it is a link between a human and a computer.

Regardless of how you login (short for log-in), you must enter your username and password (each followed by pressing the ENTER key). Two things to note are that the username must be typed in lower case, and that when you type your password, the characters will not appear on the screen. The password is concealed in case someone is peeking over your shoulder.

A login session (or session for short) is established upon successful entry of the username and password. A "session" is a "lasting connection", which in this case is between a user and a system.

The appearance of the login session via the text console will be significantly different from the login session via the graphic console. Both have advantages and disadvantages depending on the circumstance, but as far as the following instructional example is concerned, the graphic console requires an additional step to arrive at something that the text console immediately provides.

So if you've logged-in via a graphic console, the additional step is that you will have to open what's called a terminal window (aka terminal, or term). This can be accomplished by using the mouse to click on a terminal window icon, or by selecting a "terminal window" option from a mouse menu, though the specifics will be very system-dependent.

Regardless of login method, the login event will be registered (or logged), in a log file. Following the opening of a terminal window, the things that follow will be the same as what appears on the text console.

### The Shell

After you have logged-in, and in the case of a graphic console, after you have also opened a terminal window, the system might display a message of important information, known as message of the day (or motd for short).

Following the possible motd, you will be presented with a command-line consisting of a command-prompt and a cursor.

A command-line is a row on which commands (instructions for the system) can be entered. A command-prompt (or prompt for short) is a symbol that indicates the beginning of a command-line. When a shell is ready to receive commands, it displays a command prompt. A cursor is a movable point that identifies a specific position on a visual display unit. In the case of a command-line, the cursor is usually either a blinking or solid underscore or vertical bar, that marks the current typing position.

A shell can read command lines from a terminal (interactively) or it can read them from a file (this is known as a shell script). Shell scripting will be discussed within the advanced-level material.
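
As a small preview of the second mode, the sketch below stores two commands in a file and asks a shell to read them from there instead of from the terminal. It assumes that a POSIX shell is available as "sh"; the temporary file name is generated by "mktemp" and is arbitrary.

```shell
# Write two commands into a file.
script=$(mktemp)
cat > "$script" <<'EOF'
echo "hello from a script"
pwd
EOF

# Ask a shell to read and execute the commands from the file.
output=$(sh "$script")
echo "$output"

rm -f "$script"
```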

The symbol that represents the prompt could be something as simple as a single character, like a dollar sign ("$"), or a percent sign ("%"). It could also be the name of the machine you're using in square brackets ("[tron]"), or something even more elaborate.

The command line, prompt, and cursor are visual aids of what's called a command interpreter (or shell for short). The shell understands a certain language known as a shell command language, which is used to access a computer system. Commands entered on the command-line are passed to the shell for execution by the system; therefore the shell is an interface to the system.

At this point, nothing will happen unless you enter a command. Charles Snow, a former instructor at McGill University, once said: "The shell waits with amazing patience and anxiety for you to type something." [*][2000.09.19]

### Commands

Every command has the same structure: the command name, which tells the shell what command you want the system to execute; and the arguments, which detail what you want the command to do, how you want it to do it, and to what you want it done. Some commands consist of only the command name, and do not take arguments; on the other hand, some commands require arguments.

    command_name [argument(s)/parameter(s)...]

The arguments are also known as parameters, and are separated from each other by spaces. The parameters of the command are further divided into options and names. Options are usually prefixed by a hyphen ("-"), or double hyphen ("--"). Names usually name the files that the command should use in its operation.

    command_name [option(s)...] [name(s)...]

### Entering a Command

Let's assume that the prompt is a percent sign. A simple command to try for the first time is one that invokes a program that reports the date and time. The program is called "date", and the command requires nothing more than that.
Type "date" and press ENTER. The system will respond by displaying (aka printing) the date and time, followed by another prompt.

    % date
    Tue Mar 4 03:59:55 EST 2003
    %

The ENTER key must be pressed to invoke all commands, so from this point on in this document that will be implicit.

A user's initial working directory is that user's home directory. Since we haven't done anything to change it, it will still be that, but we can confirm just out of curiosity. In fact, you can determine your present working directory at any time by typing the command "pwd", as follows.

    % pwd
    /home/users/abatko
    %

The proper way to log out (exit the terminal, or session) is to simultaneously press the CTRL key and D, although if that doesn't work, you can also log out using the command "exit".

### Text Editors

An editor is a program that enables the creation and modification of text files. For a detailed coverage of the most common text editors at SOCS, read the web page: http://www.cs.mcgill.ca/~navindra/editors/

This is a list of some editors described on the aforementioned web page:

- % gedit - Simple text editor for Gnome with syntax highlighting and spell check.
- % kate - Simple text editor for KDE with syntax highlighting and spell check.
- % xemacs - XEmacs is perhaps the easiest power editor for newcomers to use. XEmacs splintered off of the GNU Emacs effort some years ago; though today GNU Emacs and XEmacs are mostly still the same, each has a variety of features not supported by the other.
- % emacs - GNU Emacs is a remarkable editor and its various characteristics such as extensibility, customizability, and innumerable bundled packages are particularly appealing to the programmer.
- % nano F - Nano is a simple text-only editor, suitable for a novice or anyone wishing to invest only a minimum amount of time in learning an editor.
- % pico F - Pico is a simple text-only editor, suitable for a novice or anyone wishing to invest only a minimum amount of time in learning an editor.
- % vi - Vi is one of the most powerful and versatile editors with an abundance of features such as macros, and filter and scripting support. On top of this, vi is available on just about every UNIX-like system, and is very lightweight. It should also be noted that vi has two modes: a command mode, and an insert mode.
- % vim - VIM is Vi IMproved and has many advantages over standard vi.

If you try vi or vim and you don't know how to quit, press the escape key ("ESC"), followed by ":q!" (colon, q, exclamation).

## Basic Commands

The most important "basic" commands are ones that enable you to deal with the filesystem. Let the symbol "%" represent a "command prompt".

### Displaying the present working directory:

- % pwd - Display the absolute path name of the present working directory, also known as the "current working directory", or just "working directory".

### Viewing directory contents:

- % ls - List the contents of the present working directory, ordered in columns, and sorted alphabetically.
- % ls D - List the contents of directory D.

### Changing the present working directory:

- % cd - Change the pwd to the user's home directory.
- % cd D - Change the pwd to the directory D.
- % cd - - Change the pwd to the previous working directory. (Note that "previous" does not mean "parent"; it means the directory that was pwd immediately prior to what it currently is. Using a hyphen ("-") in this manner will only work for cd, it will not work for other commands.)

### Making directories:

- % mkdir D - Make directory D.

### Deleting directories:

- % rmdir D - Remove directory D. This will not be allowed if directory D is the pwd, is not empty, or the permissions of the directory don't allow for it to be removed.

### Viewing (nondirectory) files:

- % less F - View the contents of file F, one screenful at a time. To scroll forward press the SPACEBAR, or "f". To scroll backward press "b". To return to the top of the file, press "g". To quit viewing the file, press "q".
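
The directory commands above can be strung together into a short practice session. The following is a sketch only, assuming a POSIX shell; it works inside a scratch directory created by "mktemp", and the directory name "reports" is hypothetical.

```shell
# Work inside a throwaway directory.
demo=$(mktemp -d)
cd "$demo"

mkdir reports             # make a directory named reports
listing=$(ls)             # the new directory shows up in the listing

cd reports                # change the pwd to it
inside=$(pwd)             # absolute path ending in /reports

cd - > /dev/null          # back to the previous working directory
rmdir reports             # allowed: reports is empty and is not the pwd
after=$(ls)               # nothing is left to list

echo "listing: $listing"
echo "inside:  $inside"
echo "after:   $after"

# Leave the scratch directory before removing it.
cd /
rmdir "$demo"
```
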
### Copying files:

- % cp F1 F2 - Copy file F1 onto file F2. If file F2 is already present, it is overwritten by file F1, otherwise file F2 is created with the content of file F1.
- % cp F D - Copy file F to directory D.
- % cp F... D - Copy each given file to directory D.

Note that "cp" will overwrite the destination file if it already exists. To be prompted for confirmation before overwriting the destination file, use the "-i" flag.

### Moving (renaming) files:

- % mv F1 F2 - Move (rename) file F1 onto (as) file F2.
- % mv F D - Move file F into directory D.
- % mv F... D - Move each given file into directory D.

Note that "mv" will overwrite the destination file if it already exists. To be prompted for confirmation before overwriting the destination file, use the "-i" flag.

### Removing (nondirectory) files:

- % rm F - Remove file F.
- % rm F... - Remove each given file.

Note that "rm" will delete the specified files without confirmation. To be prompted for confirmation before removing any files use the "-i" flag.

## On-line reference to commands.

The on-line reference to commands is called the Reference Manual Pages, or manpages for short. Manpages describe most commands available on the system, providing both quick reference and detailed coverage. An interface to this on-line reference manual is a utility called "man". Manpages are separated into nine "sections". Section 1 is the most important section to users, since it describes all user commands (executable programs and shell commands).

The manpages themselves may be subdivided into "parts" most commonly labelled as follows:

- NAME - The name of the command being described.
- SYNOPSIS - A pattern describing all possible invocations of the command.
- DESCRIPTION - Detailed description of the command.
- OPTIONS - Description of arguments illustrated in SYNOPSIS.
- FILES - Description of files associated with the command.
- SEE ALSO - References to related commands.
- HISTORY - Brief history of the command's creation.
- BUGS - Outline of known problems with the command.
- AUTHOR - Who wrote the command.

How to use the man utility:

- % man C - Display the manpage for command C.
- % man -k K... - Search the short manpage descriptions and manpage names for the keyword(s) K, and display all matches.

The actual sections of the Reference Manual Pages vary from system to system; the following is just a sample:

    1 User Commands (executable programs or shell commands). ex. chmod(1)
    2 System calls (functions provided by the kernel). ex. creat(2)
    3 Library calls (functions within system libraries). ex. stdio(3)
    4 Special files (usually found in /dev). ex. pci(4)
    5 File formats and conventions. ex. /etc/passwd
    6 Games; (though really there are none).
    7 Macro packages and conventions.
    8 System administration commands (usually only for root).
    9 Kernel routines [Non standard].

Sometimes a command will appear in more than one section. To specify a particular section, precede the command name with a section number:

- % man N C - Display the manpage from section N of the manual, corresponding to command C.

## Computer resources.

There are two types of computer resources available for users to get work done, namely workstations and compute servers. Workstations are physically accessible machines for users to sit in front of to do their work. Compute servers are not physically accessible, and are ones designated for remote login (discussed below) for the purpose of running applications for any number of reasons such as more compute power, uninterrupted execution time, special applications, etc.

### Workstations:

All labs are located in Trottier building, on the 3rd floor; these are currently accessible by all students holding valid SOCS computer accounts, and taking CS and/or ECSE courses. Our lab workstations run Ubuntu Linux.

## Remote login.

Logging into a remote machine is possible by using a remote login program.
Further, to remotely access a SOCS machine it is necessary to use a secure remote login program, often referred to as a secure shell client. "ssh" is the program available on every SOCS machine for this purpose. To log in from home or elsewhere, you must also use a secure shell client. For more information about ssh please consult SSH.

Below is an example of how to log into the compute server mimi for the first time; it is necessary to answer "yes" to continuing the connection.

    % ssh mimi
    The authenticity of host 'mimi (132.206.51.5)' can't be established.
    RSA key fingerprint is 7e:c0:74:f3:ff:36:44:7f:2b:69:9a:da:7d:92:e0:fb.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'mimi,132.206.51.5' (RSA) to the list of known hosts.
    abatko@mimi's password: ********
    [mimi] %

After mimi has been added to "the list of known hosts", this question will not have to be answered unless mimi's "RSA key fingerprint" changes.

Once you're logged in, you can type commands on the remote compute server just as you would on the local host. To exit the remote server, type "exit".

    [mimi] % date
    Fri Sep 19 08:49:41 EDT 2003
    [mimi] %

## Filesystem disk quota and usage.

Recall that every account is given a home directory on the filesystem to store the account owner's personal files. Filesystem disk quota (or quota for short) is a prescribed limit to the amount of disk space that an account can use on a given filesystem. Accounts are limited in the amount of space that they can use so that disk space usage can be controlled. Each account has two filesystem quotas: hard quota, and soft quota.

### Hard quota.

The hard quota is the maximum amount of disk space that the system permits a given account to use. Reaching the hard quota can have nasty side effects: the system will not permit any operation that requires using additional disk space; attempting to edit files may corrupt them; and logging-in may fail (this problem is discussed below).
The hard quota takes effect as soon as it is reached.

### Soft quota.

The soft quota is less than the hard quota and it is the point at which the user will be warned about approaching the hard quota. The warnings depend on the operation performed when attempting to write data, and also on the system that the operation is being performed on. It is important that users resolve the quota problem as soon as possible because of the severe consequences that may result when the quota limit is approached. Exceeding the soft quota for more than seven days automatically turns the soft quota limit into the hard quota limit.

### Display quota and usage.

A user can display her/his own filesystem disk quota and usage using the command "quota -v". The command displays both quotas and usage in kilobytes, as in the following example:

    % quota -v
    Filesystem usage quota limit
    /net/u21   16522 19500 20000

### Resolving quota problems.

#### Can't login.

It is common that when attempting to login via a graphic console, the login process may momentarily display the graphical desktop environment, and then immediately throw the user back to the login prompt. This behaviour indicates that the account has approached too close to the hard quota for the windowing program to be able to write data to special files in the account's home directory - which is something that some windowing programs do as part of their operation. To resolve this problem it is necessary to login via the text console.

##### Switching from graphic console to text console and back.

Simultaneously press Ctrl-Alt-F1 to switch from a graphic console to a text console. To switch to a graphic console from a text console simultaneously press Alt-F3 or Alt-F7. Note that the "Ctrl" key is not necessary when going from text to graphic console. If you can't find what you're looking for in either direction, there is no harm in cycling through all the function keys (F1-F12).

#### What files to delete when over quota.
Once you're logged in, regardless of whether it is via the graphic console or the text console, you will need to delete some files to return to below the soft quota limit. Commands to delete some directories and files are listed below:

- rm -f ~/*core

If deleting these files doesn't help, you will have to visit a System Administrator for further assistance.

## Passwords at SOCS.

Passwords at SOCS are synchronized amongst Unix, Windows, and Mail accounts. To change your password, you must log into mimi and issue the passwd command:

    [mimi] % passwd

Any user whose username is longer than 8 characters must pass the username as an argument to the command:

    [mimi] % passwd username

For more information, please consult http://socsinfo.cs.mcgill.ca/wiki/Password

## World Wide Web, and browsing it.

The World Wide Web (WWW) is a universe of network-accessible information. Tim Berners-Lee invented the World Wide Web in late 1990 while working at CERN, the European Particle Physics Laboratory in Geneva, Switzerland. It has a set of protocols and conventions, and uses hypertext and multimedia techniques to enable anyone to browse it, or contribute to it. Evolution of WWW is coordinated by W3C - World Wide Web Consortium, http://www.w3c.org/

- % lynx - A text based, frame and java disabled WWW browser.

If you are logged-in via a graphic console, look around the graphical desktop environment for an icon to launch a graphical WWW browser. If you can't find an icon, you can enter one of these commands:

    % firefox
    % opera
    % konqueror
    % seamonkey
    % galeon
    % epiphany

## Mail system.

IMAP and POP3 are two different mail paradigms:

- IMAP = multiple client-server, multiple mailbox access.
- POP = single client-server, single mailbox access.

IMAP advantages:

- Saved-message folders may be stored on server (as well as INBOX).
- Allows access to INBOX (not just new mail) from multiple platforms.
- Allows selective transfer of messages/parts to client (local Save).

POP advantages:

- Can also use POP paradigm, for minimum connect time and server resources.

The mail system at SOCS is built around an independent mail server mail.cs.mcgill.ca, supporting both POP and IMAP. "webmail" is the preferred mail client for most users because it is web-based, intuitive, and full of important features such as spam control and mail filtering. Note that the incoming mail server is "mail.cs.mcgill.ca", and within the CS domain the outgoing mail (SMTP) server is also "mail.cs.mcgill.ca". If you are outside the CS domain, you can try "mailhost.mcgill.ca". For more information, please consult http://socsinfo.cs.mcgill.ca/wiki/Mail

## Programming.

### Compiling:

Once you've used an editor to create your program, you are ready to compile it. Depending on what language you used to write your program, you will need an appropriate compiler. Please note that the word "foo" is used very generally as a sample name for absolutely anything, especially programs and files. To avoid confusion, "foo" is never made the name of a real file.

- % cc foo.c - C compiler. Compiling the program foo.c may result in the creation of an executable program called a.out or an error message from the compiler.
- % gcc foo.c - GNU project C compiler. Compile the C program foo.c.
- % g++ foo.cpp - GNU project C++ compiler. Compile the C++ program foo.cpp.
- % javac foo.java - Java compiler. Compile the Java program foo.java into a file called foo.class.

At SOCS, we have several different versions of the Java compiler installed on the workstations. Please see http://socsinfo.cs.mcgill.ca/wiki/Java for information on how to switch between versions of Java.

### Executing:

To run an executable file (also known as a binary), simply type its name at the command prompt.

- % ./a.out - Run the binary a.out.

Compiled Java programs (also known as Java bytecodes) must be interpreted by a Java interpreter.

- % java foo - Execute the Java bytecodes.
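
The compile-then-run cycle described above for C can be walked through end to end. This is a hedged sketch: it assumes a C compiler is installed under the name "cc" (the example is skipped otherwise), it works in a scratch directory, and the file name "hello.c" is hypothetical.

```shell
# Skip quietly if no C compiler named "cc" is installed.
if command -v cc > /dev/null 2>&1; then
    demo=$(mktemp -d)
    cd "$demo"

    # A tiny C program written to the file hello.c.
    cat > hello.c <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("hello\n");
    return 0;
}
EOF

    cc hello.c              # on success, the executable a.out is created
    result=$(./a.out)       # run the binary and capture what it prints
    echo "$result"

    # Leave the scratch directory before removing it.
    cd /
    rm -rf "$demo"
fi
```
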

### Testing:

- % ./a.out > foo.txt - Run the binary a.out and send its output to the file foo.txt.
- % ./a.out >! foo.txt - Run the binary a.out and overwrite an existing version of the file foo.txt.

## Submitting assignments electronically.

- [mimi] % handin csxxx assx F... - Electronically submit the file(s) F, into assignment box assx, of the computer science course csxxx.

For more information, please consult: http://socsinfo.cs.mcgill.ca/wiki/Handin

Assignment submission may (depending on course and instructor) also be done via WebCT at: http://www.mcgill.ca/webct/

## Printing.

Please refer to McGill uPrint for printing services.

## Intermediate Tutorial

## Resources

Intermediate Unix Tutorial slides (2015-2016)

## Viewing directory contents.

When no argument is passed to "ls", the contents of the current directory are listed; when an ordinary file is passed to "ls" as an argument, the filename is repeated (along with other associated information, if it is requested by one or more options); when a directory is given as an argument, its contents are listed. The output of "ls" is sorted alphabetically and files usually appear before directories, though the latter is system specific.

### Dot-files.

Dot-files are files and directories whose names begin with a period ("."). The only difference between non-dot-files and dot-files is that dot-files are not displayed by the command "ls" unless the "-a" option or the "-A" option is used. Because dot-files are not shown by "ls" by default, they are also known as hidden files.

- % ls -a - List the contents of the current directory including files and directories that begin with a dot ".".

Dot-files are usually configuration files of programs, and are considered "uninteresting", and therefore they don't need to be displayed each time the command "ls" is executed.
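
The hiding behaviour is easy to observe in a scratch directory. The sketch below assumes a POSIX shell and the standard "touch" utility for creating empty files; the file names "notes.txt" and ".settings" are hypothetical.

```shell
# Work inside a throwaway directory.
demo=$(mktemp -d)
cd "$demo"

# Create one ordinary file and one dot-file.
touch notes.txt .settings

plain=$(ls)        # the dot-file is hidden
all=$(ls -a)       # the dot-file is shown

echo "ls:    $plain"
echo "ls -a:" $all

# Leave the scratch directory before removing it.
cd /
rm -rf "$demo"
```
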

Note that because of the nature of the Unix filesystem, "ls -a" will list the entries "." (working directory) and ".." (parent directory) as members of every directory, without exception. That is because those entries are indeed found in every directory in the filesystem. This fact should cause you to question what this means for the root directory. The root directory is a special directory because rules apply to it that do not apply to other "ordinary" directories. Amongst other rules, the ".." entry in the root directory still exists, but instead of pointing to a parent directory it points back to the root directory itself, and hence has no effect.

- % ls -A - Lists the contents of the current directory including files and directories that begin with a dot ".", with the exception of the working directory "." and the parent directory "..".

### File attributes.

File attributes are properties associated with files, such as the mode (which is the file type, and permissions), the number of links to the file, the owner, the group, the size in bytes, the time of most recent modification, and the file name. For completeness, it must be mentioned that everything in the filesystem is actually a file, including ordinary files, directories, and other special files such as links, device files, and named pipes; so in fact, all these entities have file attributes.

- % ls -l - List the contents of the pwd in long format, printing for each entity the mode (file type and permissions), number of links to the file, owner name, group name, size in bytes, timestamp, and filename.

The following are a number of common invocations of "ls":

- % ls -la - List the contents of the pwd in long format, including dot-files.
- % ls -ltr - List the contents of the pwd in long format, sorted by time stamp (most recent modification time), and reversed (such that the most recently modified files appear at the bottom).
- % ls -ld - List the contents of the pwd in long format, but if an argument is a directory, list only its name (not its contents).
- % ls -lF - List the contents of the pwd in long format, and mark each non-regular file with a symbol to indicate whether it is a directory, executable, symbolic link, etc.

Having shown a command that takes two options, it's necessary to point out that option order has no meaning. To illustrate this point, note that the following invocations have the same effect:

    % ls -la
    % ls -al
    % ls -l -a
    % ls -a -l

The following is an example of the output of "ls -laF":

    drwx--x--x 47 abatko 12345 3072 Sep 30 09:58 .
    drwxr-xr-x 34 root   other 1024 Jul  2 08:30 ..
    drwx------  3 abatko 12345  512 Jul 11 14:51 bin/
    drwx--x--x 21 abatko 12345  512 Jul 30  2002 courses/
    drwx------  4 abatko 12345 5632 Sep 30 09:58 mail/
    drwx--x--x 34 abatko 12345 1024 Sep 26 08:23 public_html/
    -rw-------  1 abatko 12345  732 Sep 24 15:16 reminders

The file attributes "owner", "group", and "mode" are described next. The file attribute for the "number of links" is described in the advanced-level material.

## File access permissions (owner, group, mode).

### Owner & group.

Unix file access is based on the concept of "users" and "groups". Every user on a system has a unique account with a unique login name and associated user identification number (UID). Likewise, every user on a system has a unique group identification number (GID). It is possible, and sometimes convenient, for different users to be able to share files that they are not owners of; for this purpose there exist "groups". If a user belongs to a group, they may have "read", "write", and/or "execute" access to particular files associated with that group. Every user belongs to a primary group; by default at SOCS, primary GIDs are the same as each user's UID. Every file belongs to one user and one group. A file is "owned" by the user who created it; its group is the user's primary group.

% id U          - Print UIDs, GIDs, and group membership of user U.
% groups U      - Print the groups user U is in.

% chgrp G F...  - Change the group ownership of file(s) F..., to group G.
% chgrp -R G D...
- Recursively change the group ownership of directories D...,
and their contents, to group G.
% chown O F...  - Change the ownership of file(s) F..., to owner O.
% chown -R O D...
- Recursively change the ownership of directories D..., and
their contents, to owner O.
% chown U:G F...
- If the username (or UID) U is followed by a colon and a group
(or GID) G, U is made the owner, and G is made the group of
file(s) F....


Note that if you need to change both a file's owner and group, change the group first; otherwise you won't have permission to change the group once you are no longer the owner.

### Mode.

The long listing shows a 10-character "mode" for every file, which governs how it can be accessed. The first character indicates what type of file it is, while the remaining nine (which themselves are called "the mode") indicate the access privileges. The mode grants or denies permission to read, write, and/or execute a file. The mode grants permission separately to the owner of a file, to users belonging to the group that the file is associated with, and to all other users (aka "the world").

#### With respect to non-directory files.

"Read" permission allows a user to see the content of a file. "Write" permission allows a user to make modifications to a file. Note that deleting a file is governed by the permissions of the directory that contains it, not by the file's own "write" permission. "Execute" permission allows the user to execute a file as if it were a command, that is, to run it.

#### With respect to directories.

For directories, "execute" permission allows a user to enter a directory. When a directory has "write" permission and "execute" permission it is possible to add a file to the directory. To delete an empty directory it is enough to have "write" and "execute" permission on its parent directory; to delete a non-empty directory, it is also necessary to have "read", "write", and "execute" permissions on the directory itself, in order to list and remove its contents.
To see the contents of a directory it is necessary to have "read" and "execute" permissions.

The following first line is the output of the command `ls -l index.html`:

-rw-r--r-- 1 abatko 12345 7970 Sep 6 15:47 index.html
|\_/\_/\_/
| |  |  |
| |  |  +----> Other permissions
| |  +-------> Group permissions
| +----------> User permissions
+------------> File type: - = file, d = directory, l = link


From the output of the command above, the leading dash `-` tells us that the file `index.html` is an ordinary file, as opposed to being a directory, link, or some other special file. The owner of this file is user "abatko", and the group owner is the group "12345". Looking at just the permissions part of the mode, "rw-r--r--" tells us that the user has read and write permissions on this file, while the group "12345", as well as all other users, have only read permission.

Note that the mode also contains three bits that perform special tasks: the set-user-id bit, the set-group-id bit, and the sticky bit; you can read more about file attributes by doing `man chmod`.

Only the owner of a file, or the superuser, may change a file's mode.

% chmod M F...  - Change the access permissions of file(s) F..., to mode M.
The mode M can be specified in symbolic mode according to the
format [ugo][+-=][rwx] or using the octal format.


#### Octal format.

In octal format each permission type is associated with a number, as shown in the table below. The numbers add up to a total of 7, which means that the maximum permissions, "rwx", is equal to "7". It is possible to sum any combination of the three types of permissions; for instance, to give "read and write" permission, the number would be "4+2=6".

type      octal   binary
------------------------
Read    =   4   =  100
Write   =   2   =  010
Execute =   1   =  001
           ---
            7


Below a user's home directory the most frequent permissions are 700 (for directories) and 600 (for non-directory files); below a user's website directory, the most frequent permissions are 711 (for directories) and 644 (for non-directory files). The reasoning is that the user (owner) should be able to "read, write, and execute" directories, and "read and write" files, while the group and world should be able to do the bare minimum, that is, to either have no access, or to have just "read" access to files, or just "execute" access to directories.

A user's home directory should have the mode set to "rwx--x--x", or "711". 711 gives the user (owner) full access, whereas the members of the group and the world are only able to execute the directory.

Some people like to think of permissions in binary format (as in the table below) because the "1" digits nicely correspond to the positions in the "rwx" sequence. To illustrate this, consider "read and write" access; it would be "rw-", which requires a "1" on the left, and a "1" in the middle, and those are "100" (read) and "010" (write). So following binary addition, "100 + 010 = 110", which is 6 (read and write).

type      binary
----------------
Read    =  100
Write   =  010
Execute =  001


#### Default permissions.

It is possible to force the initial file mode of a newly created file. This can be implemented for security reasons. For instance it is not advisable for newly created files to be world readable, since the creator could forget to adjust the permissions. The umask utility sets the file mode creation mask of a given shell's execution environment.

% umask         - Get the file mode creation mask of the current shell.
% umask M       - Set the file mode creation mask of the current shell to mask
M.  This mask affects the initial value of the file permission
bits of subsequently created files.
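
The octal modes and the mask can be tried out safely in a scratch directory. The following is a minimal sketch, assuming a POSIX shell with `mktemp` available; the names `page.html`, `site`, and `private.txt` are made up for illustration, and the mode string is read back from `ls -ld` rather than from any special tool.

```sh
#!/bin/sh
# Work in a scratch directory so nothing in the home directory is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1

touch page.html
mkdir site

chmod 644 page.html   # 6 = 4+2 (rw-) for the owner, 4 (r--) for group and other
chmod 711 site        # 7 = 4+2+1 (rwx) for the owner, 1 (--x) for group and other

# Keep only the 10-character mode string from the long listing.
file_mode=$(ls -ld page.html | cut -c1-10)
dir_mode=$(ls -ld site | cut -c1-10)
echo "$file_mode page.html"
echo "$dir_mode site"

# A mask of 077 clears every group and other bit on files created from now on.
umask 077
touch private.txt
private_mode=$(ls -ld private.txt | cut -c1-10)
echo "$private_mode private.txt"
```

The mask set here only lasts for the shell that runs the script, so the login shell's own umask is left unchanged.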

The mask operand may be specified using the octal number representation. The mask is three octal digits corresponding to user, group, and other, respectively. For example, umask 022 removes write permission for (g)roup and (o)ther, changing directories normally created with mode 777 to 755. Files are normally created with mode 666 with the umask bits cleared; therefore they will result in the mode 644. When `umask` is executed with no arguments, it may return a two-digit number; this is because the leading zero was omitted; therefore, 22 actually means 022.

##### Permanently setting the umask.

Setting the umask at the command line will take effect for the duration of the shell session. To avoid having to manually set the umask for each shell session it is possible to write the command in a personal shell configuration file (aka shell startup file) which is executed by the shell at startup (before it gives a user the prompt). To learn how to set the umask permanently, continue reading - the information is explained below.

## Personal webpages.

Users are permitted to have personal webpages at SOCS. Personal webpages are accessible via the address http://www.cs.mcgill.ca/~username. The following recipe outlines the creation of a very simple web page. By precisely carrying out these commands, you will create a publicly accessible webpage. The symbol `~` denotes the user's home directory.

% cd            - Change current directory to your home directory.  [This step
is optional given that all the following steps are relative
to the home directory].
% chmod 711 ~   - Change the mode of your home directory to 711.
% mkdir ~/public_html
- Make a directory called public_html.
% chmod 711 ~/public_html
- Change the mode of directory public_html to 711.
% xemacs ~/public_html/index.html
- Make a file called index.html within the directory
public_html, having the following content:

<html><body>Hello, world!</body></html>

Note that you need not use xemacs; use any text editor you
are familiar with.
% chmod 644 ~/public_html/index.html
- Change the mode of file index.html to 644.


For more information, please consult http://socsinfo.cs.mcgill.ca/wiki/Personal_Webpage

## Environment variables.

Every shell session is associated with its own environment which is, in part, a list of name-value pairs called environment variables. Note that variables are named entities that store values. Some environment variables are used by the shell and other programs as sources of information needed in part to accomplish what they do.

% env           - Writes the current environment variables, with one name=value
pair per line.


### PATH.

In the list of environment variables you will notice a variable called `PATH`. This is perhaps the most important environment variable as it is a list of directories (aka search path) in which the shell looks for commands.

When a command is entered, the shell checks if it contains slashes. If the command contains slashes, the shell executes the named command. If the command does not contain slashes, the shell attempts to locate the command and it executes the first occurrence found. The shell first checks whether a shell function by that name exists, then it scans the list of builtin shell commands. Finally the elements of the PATH variable are scanned for an executable file matching the command entered. As you can see, a command can reside in so many places that without the help of the shell, and especially the PATH variable, it would be very burdensome to enter commands.

% echo $PATH    - Returns the value of the environment variable PATH.  Note
that to indicate to the shell that a variable follows, the
variable name must be preceded by the symbol `$`.


The following is an example of the return value of `echo $PATH`.

/sbin:/usr/sbin:/usr/local/sbin:/bin:/usr/bin:/usr/local/bin


#### `.` in PATH.

Note that when a command is not prefaced by its location (the path to where it resides), it's not certain from which directory it will be executed. This bit of uncertainty can pose some risk to your account, though if `.` is listed in PATH, the risk is greater.

Imagine that a malicious person has gained access to the system, and has placed a trojan horse program (a program that does something other than what it's disguised as) in `/tmp`, a temporary world-writable directory found on all systems, with the hopes that eventually someone will execute it. If `.` is in your PATH, and one day you `cd` into `/tmp`, you may unknowingly execute that command. Imagine that the trojan horse program was named `les`, and one day, meaning to type `less`, you accidentally omitted the second `s`; in this case the trojan horse program would be executed, because there is no `les` command in the search path, other than in `.`.

Based on this explanation, you may realise that the worst position in PATH to put `.` is at the front. The reason is that at that point the trojan horse program does not have to be a misspelled command. The trojan horse program could be named `less`, and it will be executed because the shell will always execute the first occurrence in the search path.

##### ./foo vs foo.

The reason some users place `.` in PATH is that they want to be able to execute their own programs from their working directories. Indeed there are two ways to execute a program from a working directory: for reasons mentioned above, the wrong way is to put `.` in PATH; the proper way is to precede the program name by `./` on the command line.

% ./foo         - The proper way to execute the program foo from the current
working directory.

% foo           - The wrong way to execute the program foo from the current
working directory, because it relies on `.` being in PATH.


#### `bin` in PATH.

If you write your own commands or programs that you want to have available for execution, it is best to create the directory `~/bin` and put those commands or programs there. For convenience's sake you should append `~/bin` to your PATH. Modifying environment variables is discussed below.
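
The difference between running a program by explicit path and finding it through PATH can be sketched as follows. This is only an illustration: it uses a scratch directory from `mktemp` in place of `~/bin`, a made-up command name `hello-demo`, and the standard `command -v` lookup, none of which come from the text above.

```sh
#!/bin/sh
demo=$(mktemp -d)
mkdir "$demo/bin"

# A tiny hypothetical command named "hello-demo".
cat > "$demo/bin/hello-demo" <<'EOF'
#!/bin/sh
echo "hello from hello-demo"
EOF
chmod 700 "$demo/bin/hello-demo"

# The directory is not in PATH yet, so the shell cannot find the command by name.
if command -v hello-demo >/dev/null 2>&1; then
    found_before=yes
else
    found_before=no
fi

# Running it with an explicit ./ prefix works without touching PATH.
cd "$demo/bin" || exit 1
by_path=$(./hello-demo)

# Append the directory to the END of PATH, so that it can never shadow
# a system command of the same name.
PATH=$PATH:$demo/bin
export PATH
by_name=$(hello-demo)
where=$(command -v hello-demo)

echo "found before PATH change: $found_before"
echo "by path: $by_path"
echo "by name: $by_name (resolved to $where)"
```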

### which and whereis.

The following commands will display what will actually be executed when a given command is entered, as well as where things related to a given command are located.

% which C       - Displays the pathname or alias that would be executed if
argument C were given as a command.

% whereis C     - Locate the binary, source, and manual page files for
command C.


## Shells.

Dozens of shells have been written for Unix systems, and a handful of them have become widely accepted, among those are: Bourne shell (sh), GNU Bourne-Again Shell (bash), Berkeley Unix C shell (csh), and tcsh.

A shell normally has a set of associated files called startup files that the shell sources (i.e. it reads and executes each line of the file) at different points in its execution (for instance, startup and logout). There are two types of startup files - ones for system-wide settings, and ones for personal settings. The startup files containing system-wide configurations allow the system administrator to configure certain features for all users of a given shell, for instance, setting default environment variables and aliases. The personal startup files are intended for users to configure the shell to their liking, including modifying environment variables, setting aliases, creating shell functions, setting the way the prompt looks, etc. Modifying startup files is discussed below; however it must be mentioned that despite shells having a lot in common, they also have a lot of differences (with the most obvious differences being syntactic).

### Ways shells can be run.

#### Interactive shell.

An interactive shell session is one during which input is accepted from the keyboard; that is, it is a shell which presents a person with a command prompt and waits for input.

#### Non-interactive shell.

A non-interactive shell session is one in which input is accepted from a file of commands.

#### Login shell.

A login shell is the shell that a user has logged into (i.e. it runs when a user first logs in - it is the first shell seen). A login shell is distinguished from a non-login shell (see below) by the first character of argument zero ($0) of a login shell being `-`. A login shell can also be started if the shell was invoked with the `--login` option.

#### Non-login shell.

A non-login shell is one that is started by a means other than logging in. For example, if a user logs in, the first shell is a login shell; next if that user enters the command `bash`, a new bash shell will be launched as a non-login shell (because logging in was not the method by which it was launched).

### sh.

The Bourne shell is the oldest of the current Unix shells, thus it is a bit primitive and lacks job control features. Despite this, most Unix users consider the Bourne shell superior for shell programming (aka shell scripting). Csh and tcsh are considered "utterly inadequate" for shell programming. Shell programming is discussed in the advanced-level material.

#### Shell startup files.

/etc/profile    - System-wide settings and actions.
~/.profile      - User-specific settings and actions.


### bash.

GNU Bourne-Again SHell is an enhanced and compatible version of sh, developed by the Free Software Foundation. Bash has features from sh, ksh, and csh. Some improvements offered by bash include: interactive command-line editing, and unlimited size command history.

#### Shell startup files.

/etc/profile    - System-wide settings and actions, for login shells only.
~/.bash_profile - User-specific settings and actions, for login shells only.
~/.bashrc       - User-specific settings and actions, for interactive
non-login shells.


### tcsh.

Tcsh is an enhanced and compatible version of csh. Notable enhancements over csh include: an enhanced history mechanism; a command-line editor, which supports GNU Emacs or vi-style key bindings; programmable, interactive word completion and listing; and spelling correction of filenames, commands, and variables.

#### Shell startup files.

/etc/csh.cshrc  - System-wide settings and actions.  Note that it's possible
that the file may actually be different depending on
compile-time options.
~/.cshrc        - User-specific settings and actions executed by login and
non-login shells; though if ~/.tcshrc exists, it will be
sourced instead.
~/.login        - User-specific settings and actions executed by login shells
after ~/.cshrc or ~/.tcshrc.  Please note that the shell may
be compiled to read ~/.login before ~/.tcshrc instead of
after: see the version shell variable to check.


### Determine login shell.

% echo $0       - Returns the value of the name of the shell or shell script.
This variable is set at shell initialization.


The file /etc/shells contains valid login shells.

#### Change shell temporarily.

To change your shell temporarily, enter the name or absolute path of the shell you would like to use.

#### Change shell permanently.

To change your shell permanently, you must ssh to one of the servers, and use the passwd command as follows:

% passwd -r nis -e U    - Change the user login shell of user U.


## Modifying shell features.

### Temporary changes.

Entering any command at the command line to modify the shell's environment and features is temporary - the change lasts as long as the shell session - so the next time the shell is started those settings will be nonexistent.

### Permanent changes.

To retain any setting, such as a modified environment variable, an alias, a shell function, the prompt's appearance, etc., it is necessary to write the customization in your personal shell startup file.

For example, if you want to set the umask, and your shell is tcsh, the place to set your umask is in ~/.cshrc because the login program sources ~/.cshrc before carrying out commands on remote systems. If your shell is sh, then set the umask in ~/.profile, and if your shell is bash, then set the umask in ~/.bash_profile. Make sure to place the umask command before the non-interactive-shell check (if any).

After having modified the shell startup file, for the settings to take effect, it is necessary to do one of two things. The simplest is to source the file without logging out. The other way is to terminate the shell session and start another one.

% source F      - Read and execute commands from file F in the current shell
environment.

% . F           - (bash) Same as `source F`, though in bash the built-in
command `.` seems to be preferred over `source`.
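
Why a startup file has to be sourced rather than simply run can be seen in a small experiment. The sketch below assumes a Bourne-style shell; the file `settings.sh` and the variable `GREETING` are invented for the example.

```sh
#!/bin/sh
demo=$(mktemp -d)

# A hypothetical settings file that sets one variable.
cat > "$demo/settings.sh" <<'EOF'
GREETING="set by settings.sh"
EOF

unset GREETING

# Running the file starts a separate shell process; the variable is set
# there and disappears when that process exits.
sh "$demo/settings.sh"
after_run=${GREETING:-unset}

# Sourcing the file executes its lines in the current shell, so the
# variable is still set afterwards.
. "$demo/settings.sh"
after_source=${GREETING:-unset}

echo "after running:  $after_run"
echo "after sourcing: $after_source"
```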


## Setting environment variables.

### bash.

% export        - With no arguments, the names and values of all exported
variables are printed.

% export N=V    - Set the environment variable named N to value V.

% export PATH=$PATH:~/bin
- Append `~/bin` to PATH.


### tcsh.

% setenv        - With no arguments, the values of all environment variables
are printed.

% setenv N V    - Set the environment variable named N to value V.

% setenv PATH $PATH:~/bin
- Append `~/bin` to PATH.
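
What exporting changes is whether programs started from the shell can see the variable. A minimal sketch in Bourne-style (sh/bash) syntax follows; the variable names `DEMO_LOCAL` and `DEMO_SHARED` are made up for illustration.

```sh
#!/bin/sh
# An ordinary shell variable: visible in this shell only.
DEMO_LOCAL=1

# An exported variable: becomes part of the environment passed to children.
export DEMO_SHARED=2

# Ask a child shell what it can see.
child_view=$(sh -c 'echo "${DEMO_LOCAL:-unset} ${DEMO_SHARED:-unset}"')
echo "child sees: $child_view"
```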


## Setting command aliases.

Aliases are shorthand names for commands. An alias is a name that is assumed temporarily, that is, until it is removed or the shell session ends. To retain an alias, such that it is set on all future login sessions, the alias command must be added to the personal shell startup file.

% alias         - (bash, tcsh) Prints the list of current aliases, one per
line.

% alias B=A     - (bash) Create an alias B for command A.
% alias B A     - (tcsh) Create an alias B for command A.

% unalias B     - Unalias the alias B.


The following are examples of useful aliases (written for the bash shell). It is easy to translate bash aliases to tcsh aliases by replacing the equal sign (`=`) with a space.

% alias rm='rm -i'
% alias mv='mv -i'
% alias cp='cp -i'

% alias ..='cd ..'
% alias l='ls -l'
% alias j='jobs -l'


## Window system.

Note: Most of the following text is taken from related entries at http://www.wikipedia.org

A graphical user interface (or GUI, often pronounced "goo-ee") is a method of interacting with a computer that uses graphical images and "widgets" in addition to text. GUIs generally consist of graphical widgets such as windows, menus, buttons, and icons, and employ both a pointing device and a keyboard.

A window system is a system that provides functionality for drawing and moving windows on the screen and also provides a mouse cursor for the computer's graphical user interface. The most common window system for Unix systems is the X Window System (aka X); and the most widely used implementation of it is "XFree86", which is free (open source) software.

### Display manager.

A display manager is a program used to

• keep the X server process alive,
• connect the X server process to a physical screen,
• authenticate the user, and
• run a session.

A "session" is defined by the lifetime of a particular process; in the traditional character-based terminal world, a session is defined by the lifetime of the user's login shell. In the graphical display context, a session is defined by the lifetime of an arbitrary session manager.

The default display manager for X is called "XDM", while the program is named `xdm`. As part of its operation, xdm runs a script called `Xsession`, which looks for the user's `.xsession` file to issue commands that determine the user's X session (for instance, what desktop environment or window manager to launch). Xsession implements a default session if no user-specific session exists.

~/.xsession     - User-specific settings and actions, executed at the start
of an X session.


#### Desktop environments and window managers.

A [graphical] desktop environment is a complete GUI solution that extends the X Window System. Examples of popular desktop environments are:

• GNOME (gnome-session)
• KDE (startkde)

Desktop environments are complete GUI solutions to the point that they come with their own replacements for XDM. Though if XDM is the default display manager, it is still possible to use GNOME and KDE. See below for how to set your desktop environment.

Similar to a desktop environment in principle is a window manager, though it is not a "complete GUI solution" - it is a program for controlling the placement and appearance of application windows under X. Examples of popular window managers are:

• KDE
• Gnome
• XFCE
• FVWM (fvwm2)
• TWM (twm)
• WindowMaker (wmaker)

See below for how to set your window manager.

##### Setting your desktop environment or window manager.

When XDM is the default display manager, it is necessary to edit the file `~/.xsession` by inserting a command that launches the desired desktop environment or window manager. The commands that launch these desktop environments and window managers are written in parentheses in the two lists above.

## I/O redirection operators.

The standard I/O facility provides some simple defaults for managing Input/Output. There are three default I/O streams: standard input (stdin), standard output (stdout), and standard error (stderr), respectively corresponding to file descriptors 0, 1, and 2.

Stdout consists of all "normal" output from programs, while stderr consists of error messages generated by programs. Stdin is normally read from the keyboard while stdout and stderr are sent to the terminal. The point is that these streams can be redirected using special redirection operators.

While standard I/O is a basic feature of Unix, the syntax used to redirect standard I/O depends on the shell being used.

The following describes common I/O redirections:

% P > F         - Redirect stdout of program P into the file F.
% P >> F        - Redirect stdout of program P, appending it to file F.
% P >! F        - (tcsh) Redirect stdout of program P into the file F,
overwriting its contents (if any) even when the noclobber
variable is set.

% P &> F        - (bash) Redirect stdout and stderr of P into the file F.
% P 2> F        - (sh, bash) Redirect stderr of P into the file F.

% P < F         - Read stdin of P from the file F.
% P << c        - Read stdin of P from the lines that follow (a "here
document"), up to a line consisting only of the word c.


The following are examples of redirecting stdout:

% ./a.out > foo.txt
- Run the binary a.out and send its output to the file foo.txt.
% ./a.out >! foo.txt
- Run the binary a.out and overwrite an existing version of
the file foo.txt.
% ./a.out >> foo.txt
- Run the binary a.out and append its output to a possibly
existing version of the file foo.txt.


Redirecting more than one file descriptor at a time is shown by example below:

% ./a.out < iF.txt > oF.txt
- Run the binary a.out, using as input, the input file iF.txt,
and redirect the output to the output file oF.txt,
replacing its contents if it already exists (unless the
shell's noclobber option is set).
% ./a.out < iF.txt >! oF.txt
- Run the binary a.out, using as input, the input file iF.txt,
and redirect the output to the output file oF.txt,
overwriting it if it already exists.
% ./a.out < iF.txt >> oF.txt
- Run the binary a.out, using as input, the input file iF.txt,
and appending the output to the output file oF.txt.
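
The redirections above can be tried without a compiled a.out. The following sketch uses Bourne-style (sh/bash) syntax and a made-up shell function `report` as a stand-in for a program that writes to both stdout and stderr; the file names are likewise only illustrative.

```sh
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1

# Hypothetical stand-in for a.out: one line to stdout, one to stderr.
report() {
    echo "normal output"
    echo "error message" >&2
}

# Send stdout and stderr to two different files.
report > out.txt 2> err.txt

# Append a second run to out.txt and discard its error output.
report >> out.txt 2> /dev/null

# Read a file on stdin instead of the keyboard.
first_line=$(head -n 1 < out.txt)

out_count=$(grep -c "normal output" out.txt)
err_text=$(cat err.txt)

echo "out.txt holds $out_count lines; first is: $first_line"
echo "err.txt holds: $err_text"
```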


It is also possible to redirect stdout of one program into the input of another, by a method called piping. This is accomplished by placing the pipe character (`|`) between two commands as follows:

% P1 | P2       - Pipe the stdout of P1 into the stdin of P2.
The output of P1 is streamed into P2, and is not buffered
until the completion of P1.


The following are examples of piping:

% sort F | uniq
- Alphabetically sort the contents of file F, pipe the output
into the input of the command `uniq` so that only unique
lines are printed.

% P | less      - Execute some program P, and "pipe" its output to the input of
the program called `less`.  This is commonly done when the
output of some program P is longer than the terminal window.

% P1 2>&1 | P2  - Duplicate stderr of P1 to its stdout and pipe the single
stdout stream to P2.  If P2 were the command `less`, this
would be convenient for reading all the output of P1
(including stderr) if it scrolls beyond the height of your
terminal window.
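
A small experiment makes the effect of `2>&1` in a pipeline visible. This is a sketch in Bourne-style syntax; the function `noisy` and the sample words are invented, and `tr -d ' '` is used only to strip the padding some versions of `wc` print.

```sh
#!/bin/sh
# Sorting first puts duplicate lines next to each other so uniq can drop them.
unique=$(printf 'pear\napple\npear\nfig\napple\n' | sort | uniq)
echo "$unique"

# Hypothetical program that writes one line to stdout and one to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Without 2>&1 only stdout travels through the pipe.
stdout_only=$(noisy 2> /dev/null | wc -l | tr -d ' ')

# With 2>&1 both streams are merged before the pipe.
both=$(noisy 2>&1 | wc -l | tr -d ' ')

echo "lines through the pipe: $stdout_only without 2>&1, $both with it"
```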


## How to find files.

% find P... E   - Search for files matching the expression E, in paths P.

% find . -name '*foo*' -print
- Search for files in a directory hierarchy, for a base file
name that matches a shell pattern.

% find /u[0-9]*/[u]grad/ -name core -type f | xargs -i file {} \
| grep 'core file' | awk -F: '{print $1}' | xargs -i ls -l {}

% find /u[0-9]*/[u]grad/ -name core -type f | xargs -i file {} \
| grep 'core file' | awk -F: '{print $1}' | xargs -i rm -f {}

% find . ! \( -path '*oracle8.0.5*' -or -path '*developer*' \
-or -path '*matlab*' \) -name '*.html' -exec grep about.html {} \;

% find . ! -path '*oracle8.0.5*' ! -path '*developer*' \
! -path '*matlab*' -name '*.html' -exec grep about.html {} \;

% find . ! -path '*oracle8.0.5*' -name '*.html' -type f -exec grep -iH \
'www-staff' {} \;

% find . ! -path '*oracle8.0.5*' -name '*.html' -type f -exec grep -il \
'www-staff' {} \; | xargs -n 1 perl -pi -e 's/www-staff/staff/'
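
The pattern used in these examples (exclude some paths, match a name, then run grep on each hit) can be tried on a throwaway tree. The directory and file names below are made up for the sketch, and `grep -l` is used so that only the names of matching files are printed.

```sh
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1

# Build a small hypothetical tree to search.
mkdir -p site/docs site/matlab
echo '<a href="about.html">About</a>' > site/docs/index.html
echo '<a href="about.html">About</a>' > site/matlab/index.html
echo 'no link here' > site/docs/other.html
echo 'about.html' > site/docs/notes.txt

# Skip anything under a path containing "matlab", keep regular files named
# *.html, and list those whose contents mention about.html.
matches=$(find site ! -path '*matlab*' -name '*.html' -type f \
    -exec grep -l 'about.html' {} \;)

echo "$matches"
```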


## How to get more information about a command `foo`.

% foo --help    - `--help` is a parameter built into most commands.
% foo -h        - `-h` is a parameter built into some commands.
% foo -?        - `-?` is a parameter built into some commands.

% whatis foo    - Display a one-line summary about a keyword.

% apropos foo   - List manpages (to locate commands) by keyword lookup.
(This is the same as the `man -k foo` command).

% info foo      - A hypertext system for browsing documentation.

% locate foo    - Lists files in databases that match a pattern.

% locate '*foo*'
- Lists files in databases that match a pattern.
("locate" assumes that the command updatedb has been run.)

% locate '*foo*' | less
- Same as above, but the result, which is often big, is sent
into the `less` text pager.  Press `q` to quit it.


Check directories /usr/doc/foo and /usr/lib/foo. Do a `man` on info(1), apropos(1), locate(1), find(1), and undocumented(2).

## Job control (user process management).

% ps auxww      - Report process status of all users on the system, including
associated user names, including processes without a
controlling terminal, and using wide output.

% jobs -l       - List jobs (subprocesses) currently running, and
their associated job number and process IDs (PIDs).

% bg J          - Place job J in the background, as if it had been started
with &.  If job J is not specified, the shell's notion of
the current job is used.
% fg J          - Place job J in the foreground, and make it the current job.
If job J is not specified, the shell's notion of the current
job is used.
% kill J        - Kill job J; where J is specified with a job number preceded
by % (for example %1), or the PID.
% [CTRL-Z]      - Suspend (stop) the current foreground job.  It can then be
resumed with fg or bg.

% %n            - Bring the job having job number n to the foreground.
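
Job numbers such as %1 are mostly useful at an interactive prompt. In a script the same idea is usually expressed with PIDs; the following is a sketch under that assumption, using the standard `$!` variable, `kill -0`, and `wait`, none of which are covered in the list above.

```sh
#!/bin/sh
# Start a long-running command in the background, as if by "sleep 30 &".
sleep 30 &
pid=$!          # PID of the most recent background command

# kill -0 sends no signal; it only reports whether the process exists.
if kill -0 "$pid" 2> /dev/null; then
    running=yes
else
    running=no
fi

# Terminate the job by PID (default signal is TERM, number 15).
kill "$pid"

# wait returns the job's exit status: 128 + 15 = 143 after a TERM signal.
status=0
wait "$pid" || status=$?

echo "was running: $running, exit status after kill: $status"
```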


## Networking and Communications (remote operations).

### News.

Usenet is a collection of thousands of computers worldwide that exchange files called news articles. There are thousands of interactive discussion groups, known as newsgroups (a kind of electronic bulletin board), that talk about literally everything. The news server is `news.RISQ.QC.ca` and SOCS courses are listed under `mcgill.socs.courses.cs*`.

### FTP.

FTP (File Transfer Protocol), used through the `ftp` file transfer program, enables the transfer of files between networked computers with TCP/IP. For information about FTP at SOCS, please consult: http://socsinfo.cs.mcgill.ca/wiki/FTP

### SSH/SCP.

% scp           - Secure copy.  Uses ssh authentication and security to copy
files remotely.

% scp U@H:F U2@H2:F2
- Copy file F from host H (as user U) to file F2 on host H2,
(as user U2).

% scp index.html joe@mimi:~/public_html/.


### Who and w.

% who           - Show who is currently logged on the system.
% w             - Display information about users currently logged on the
system.


### Write.

% write U       - Send a message to user U.


### Talk.

% talk U        - Talk to user U.  Talk is a visual communication program which
copies lines from your terminal to that of another user.
Press Ctrl-C to terminate the session.


### Finger.

% finger U@H    - Display information about user U on host H, including
real name, terminal name, idle time, login time, etc.


### E-Mail.

% mail U        - Send an email to user U from the command line.
Mail will prompt you for the 'subject'.  Everything typed
on subsequent lines will compose the body of the message.
The message is sent when a new line contains a single
character ".".


## Intermediate commands and options.

Intermediate commands are ones that a beginner is not concerned with, as not knowing them does not inhibit getting work done.

% head F        - Display the first few lines (default is 10) of file F.
% head -N F     - Display the first N lines of file F.

% tail F        - Display the last few lines (default is 10) of file F.
% tail -N F     - Display the last N lines of file F.
% tail -f F     - Display the growth of input-file F.  This command may be
used to monitor the growth of a file that is being written
by some other process.
% tail -r F     - Print file F in reverse order.

% tac F         - Print file F in reverse order.

% wc F          - Print the number of lines, words, and bytes in file F.

% diff F1 F2    - Display line-by-line differences between pairs of text files
(file F1 and file F2).
% cmp F1 F2     - Compare file F1 and file F2.  Print the byte and line
numbers at which the first difference occurred.

% sort F        - Sort lines of file F, and print the result to the standard
output.
% uniq F        - Display unique lines of sorted file F.

% tty           - Print the filename of the terminal connected to standard
input.

% uname -a      - Print system information, including machine (hardware) type,
hostname, operating system release, name and version.
Solaris and Linux versions are overlapping, but somewhat
different.

% cat           - Writes standard input to standard output.  The output can be
redirected to a file.
% cat -         - Same as cat.
% cat F         - View the contents of file F.  (Write the contents of file F
to the standard output.)
% cat -n F      - Display the contents of file F, preceding each line output
with its line number.
% cat F...      - Write the contents of each given file to standard output.
% cat F1 F2 F3 > F4
- Concatenate files F1, F2, and F3, and redirect the
concatenation to file F4.  Without the redirection the
concatenation would be printed on the standard output.

% gzip F        - Compress the file F, and append to it the extension .gz.
% gzip -d F.gz
- Expand, or restore, the compressed file F to its original
form.
% gunzip F.gz   - Same as gzip -d.
% zcat F.gz     - Same as gunzip -c: write the uncompressed contents to the
standard output, leaving F.gz in place.

% tar -cvf F.tar D
- Create an archive of directory D in file F.tar.
% tar -cvf F.tar F...
- Create an archive of files F... in file F.tar.
% tar -cvvf F.tar D
- Same as tar -cvf, except more verbose.

% tar -xvf F.tar
- Extract the archived file F.tar.
% tar -xvvf F.tar
- Same as tar -xvf, except more verbose.

% tar -zxvf F.tar.gz
- Extract the tarred and gzipped file F.tar.gz.

% tar -tzf F.tar.gz
- List the contents of the zipped archive F.tar.gz, without
extracting.

% tar -tjf F.tar.bz2
- List the contents of the bz2 tarball.  (Some older versions
of GNU tar used -I instead of -j for bzip2.)
% tar -xvjf F.tar.bz2
- Extract the tarred and bz2 zipped archive.
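
The archive commands above can be tried end to end on scratch data. This is only a sketch: the directory and file names (project, restore, readme.txt) are invented for the example, and the -v flag is left out to keep the output short.

```shell
# Build a small directory tree to archive (all names are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p project/docs
echo 'hello' > project/docs/readme.txt

# Create the archive, compress it, then list it without extracting.
tar -cf project.tar project
gzip project.tar                 # produces project.tar.gz and removes project.tar
listing=$(tar -tzf project.tar.gz)

# Extract into a separate directory and read back the restored file.
mkdir restore
(cd restore && tar -zxf ../project.tar.gz)
restored=$(cat restore/project/docs/readme.txt)

echo "$listing"
echo "$restored"
```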

% du D          - Summarize disk space usage of directory D, and its
subdirectories.  The space is measured in 1k blocks.

% find D -name P
- Recursively find file(s) whose names match the pattern P,
starting in directory D.  Pattern P can simply be a string, or
a shell filename pattern (such as '*.c') quoted to protect it
from the shell.
% find D -type T
- Recursively find file(s) being of type T.  Check the "find"
manpage to see the acceptable types T.  If the expression
contains the option -print, the path of each file for which
the expression is true is printed.  (The -print option may be
used with any "find" expression.)
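
A minimal sketch of both forms of find, run against a scratch tree. The directory layout and file names are assumptions made up for the example.

```shell
# Create a small tree to search (names are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p src/lib
touch src/main.c src/lib/util.c src/notes.txt

# Quote the pattern so the shell does not expand it before find sees it.
c_files=$(find src -name '*.c' -print | sort)

# Type d selects directories only.
dirs=$(find src -type d -print | sort)

echo "$c_files"
echo "$dirs"
```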

% psnup -4 F1.ps F2.ps
- Put four (4) logical pages of input PostScript file F1.ps
onto each physical sheet of paper (output PostScript
file F2.ps).

% psnup -p letter -c -l -4 -m .1in F1.ps F2.ps

% history       - Display the list of all commands that were executed
(aka command history) during the time of the current login
session.
% !S            - Run the last command entered that began with the partial
string S.
% !!            - Execute the most recently executed command.
% !N            - Execute the Nth command in the history list.
% C !$          - Execute command C with the parameter that was the last
parameter to the most recently executed command.

% cal           - Display a calendar.

% clear         - Clear the terminal screen.  It is easier to clear the screen
using the key combination CTRL-L.

% mkdir -p D    - Make the directory D, including all mentioned parent
directories, if they don't yet exist.  For example, if D
is "code/perl/tests", then mkdir would create the directory
"code" (if it didn't already exist), within which it would
create the directory "perl" (if it didn't already exist),
within which it would create the directory "tests" (if it
didn't already exist).


## Common file types.

Very common file types are PostScript (having extension .ps), Portable Document Format (having extension .pdf), DeVice Independent (having extension .dvi), and plain text, aka ASCII. There are several tools for converting between these file types. One would typically convert a file to the PostScript format in order to send it to a printer. There are several programs and methods of viewing these file formats as well.

% ghostview F.ps    - View PS file F.ps.
% acroread F.pdf    - View PDF file F.pdf.
% evince F.pdf      - View PDF file F.pdf.
% a2ps F.txt F.ps
- Convert plain text (ASCII) file F.txt, to a PS format file
called F.ps.

% a2ps -nH -nu -nL F.txt F.ps
- Convert plain text (ASCII) file F.txt, to a PS format file
called F.ps, without page headers, without filename at
bottom of page, and without login ID at top of page.

% dvips F.dvi -o F.ps
- Convert DVI format file F.dvi, to a PS format file
called F.ps.

% pdf2ps F.pdf F.ps
- Convert PDF format file F.pdf, to a PS format file
called F.ps.


## Mounting filesystems.

Normally, only the superuser can mount file systems; however, when a line in the file /etc/fstab contains the "user" option, then anybody can mount the corresponding file system.

The following are a few lines extracted from an /etc/fstab file on a host named troy:

/dev/fd0        /disks/dosa     msdos           user,noauto,nosuid      0 0
/dev/fd0        /disks/win95a   vfat            user,noauto,nosuid      0 0
/dev/fd0        /disks/floppy   minix           user,noauto,nosuid      0 0
/dev/fd0        /disks/ext2     ext2            user,noauto,nosuid      0 0
/dev/sda4       /disks/zipwin95 vfat            user,noauto,nosuid,sync,rw 0 0
/dev/sda4       /disks/ziplinux ext2            user,noauto,nosuid,sync,rw 0 0
/dev/sda4       /disks/zipdos   msdos           user,noauto,nosuid,sync,rw 0 0
/dev/cdrom      /disks/cdrom    iso9660         user,noauto,nosuid      0 0


Because these lines contain the option "user", the corresponding file systems can be mounted by anybody. For instance, if we wanted to mount the Linux ext2 file system found on a zip disk on troy, we would type the following command:

% mount /disks/ziplinux
- Mount the ext2 file system on /dev/sda4.

% mount Dev     - Mount a file system; attach the file system found on
device Dev to the big file tree, at the mount point that
/etc/fstab lists for Dev.
% mount Dir     - Mount a file system; attach the file system that
/etc/fstab associates with directory Dir to the big file tree.
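
To see which mount points an ordinary user may mount, one can scan the options column of the fstab for "user". The sketch below works on a scratch copy rather than the real /etc/fstab; the first two lines are taken from the listing above, while the /dev/sda1 line is a made-up entry added only to show a line that is filtered out.

```shell
# sample_fstab is a hypothetical scratch file standing in for /etc/fstab.
sample_fstab=$(mktemp)
cat > "$sample_fstab" <<'EOF'
/dev/fd0        /disks/floppy   minix           user,noauto,nosuid      0 0
/dev/sda4       /disks/ziplinux ext2            user,noauto,nosuid,sync,rw 0 0
/dev/sda1       /               ext2            defaults                1 1
EOF

# Field 4 holds the comma-separated mount options; field 2 is the mount point.
# The pattern matches "user" only as a whole option, not as part of another word.
user_mountable=$(awk '$4 ~ /(^|,)user(,|$)/ { print $2 }' "$sample_fstab")

echo "$user_mountable"
```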


Before ejecting the zip disk, it must be unmounted. The command that accomplishes this task is 'umount'.

% umount /disks/ziplinux
- Unmount the ext2 file system mounted on /disks/ziplinux.

% umount Dev    - Unmount a file system; detach the file system associated
with device Dev, by giving the special device Dev on which
it lives.
% umount Dir    - Unmount a file system; detach the file system associated
with directory Dir, by giving the directory Dir where it has
been mounted.

% df            - List all currently mounted filesystems and their disk space
usage.
% mount         - List all currently mounted filesystems with their types and
mount options, as recorded in /etc/mtab.


## DOS floppy disk access.

A utility called mtools provides access to DOS disks in Unix. The mtools commands and corresponding command line parameters are described on the mtools(1) manpage. The following is a quick summary of useful commands.

% mcd D         - Change the mtools working directory on the MS-DOS disk.

% mcopy F1 F2   - Copy MS-DOS file F1 to file F2 in the current mtools
directory.

% mcopy F... D  - Copy the file(s) F... to the mtools directory D.
% mcopy F       - Copy MS-DOS file F into the current directory.

% mdel F...     - Delete the MS-DOS file(s) F.
% mdeltree D...
- Remove a directory and all the files and subdirectories it
contains.

% mdir          - Display the current mtools directory content.
% mdir D        - Display the content of the MS-DOS directory D.

% mmd D         - Make an MS-DOS directory D.

% mmount D A    - Mount an MS-DOS disk on drive D, with mount arguments A.

% mmove F1 F2   - Move or rename an MS-DOS file F1 to file F2.

% mrd D...      - Removes an MS-DOS directory D.

% mren F1 F2    - Renames MS-DOS file F1 to file F2.

% mtype F       - Display the specified MS-DOS file F on the screen.

% mformat       - Format a floppy disk (add a minimal MS-DOS filesystem: boot
sector, FAT, and root directory).  You may create a
Linux second extended file system with the command mke2fs.


## File properties and default permissions.

An "inode" is a data structure that describes a file. Within any file system, the number of inodes, and hence the maximum number of files, is set when the filesystem is created. An inode holds most of the important information about the file, including the on-disk address of the file's data blocks. Each inode has its own unique identification number, called an "i-number". An inode also stores the file's ownership, access mode, timestamp, and type.

% stat F                - Display the status of file F, including:
device number, device type, inode number,
access rights, number of hard links, UID, GID,
total size in bytes, number of blocks allocated,
time of last access, time of last modification,
and time of last change.


The following is the output of the command "stat welcome.html":

  File: "welcome.html"
Size: 7970       Blocks: 16        Regular File
Access: (0644/-rw-r--r--)         Uid: (11374/  abatko)  Gid: (   10/ wheel)
Device: 4          Inode: 1627279    Links: 1
Access: Thu Sep  9 11:44:37 1999
Modify: Mon Sep  6 15:47:12 1999
Change: Mon Sep  6 15:47:12 1999


It is possible to have special files called "links" that point to other files. UNIX provides two different kinds of links, namely hard links and soft (symbolic) links. A soft link is merely a pointer to the name of a file that is associated with a set of data blocks, whereas a hard link is another name for the set of blocks associated with the file to which it points; in essence, a hard link refers directly to those data blocks through the same inode.

Amongst other information, the filesystem associates a file's data blocks with an inode, a name, and the number of links to the data blocks.

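As an example, suppose foo is a regular file, fooH is a hard link to foo, and fooS is a symbolic link to foo, so that the link count of foo is 2. Assuming we had not deleted anything yet, if we delete fooH, then foo will still exist, and only the number of links to foo will be decremented, totalling 1. This is because data blocks are only "lost" when the link count goes to zero. Thus fooS will still be a valid symbolic link.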

There are some important differences between hard and symbolic links. A hard link is a directory entry that refers to the same inode, and therefore the same data, as the original name; it takes almost no additional space, and the data remains until the last hard link to it is removed. A symbolic link contains only the path to the file it points to. Unlike hard links, symbolic links (aka symlinks) can span filesystems, or even computer systems if a network file system is being used.

% ln F1 F2              - Create a hard link to file F1, naming it F2.
% ln -s F1 F2           - Create a soft link to file F1, naming it F2.
% ls -l F               - List in long format, information for file F.
If the file is a symbolic link, the filename is
printed followed by "->" and the path name of the
referenced file.
% ls -Ll F              - List in long format, information for the file linked
to by symbolic link.  The -L flag can be considered
as one that dereferences a symbolic link.
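
The behaviour of hard and symbolic links can be checked with a short experiment. This is a sketch in a scratch directory, using the names foo, fooH, and fooS as in the discussion above; the variable names are chosen only for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo 'some data' > foo
ln foo fooH        # hard link: a second name for the same inode
ln -s foo fooS     # symbolic link: a small file holding the path "foo"

# Field 2 of "ls -l" is the link count.
links_before=$(ls -l foo | awk '{ print $2 }')
rm fooH
links_after=$(ls -l foo | awk '{ print $2 }')
via_symlink=$(cat fooS)   # still resolves, because foo itself still exists

# Removing foo leaves fooS dangling: the link remains but its target is gone.
rm foo
if [ -L fooS ] && [ ! -e fooS ]; then dangling=yes; else dangling=no; fi

echo "$links_before $links_after $via_symlink $dangling"
```

The symbolic link does not count towards the link count, which is why deleting foo itself is enough to make fooS invalid.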


## Advanced commands.

Advanced commands are ones that an intermediate user is not concerned with. Not knowing about the existence of such commands does not inhibit getting work done; however, they can be considered power tools.

% tee F                 - Replicate the standard output, sending the copy to
file F.  This command is useful when used with a
pipe, since one copy of the output goes to standard
output (and can thus be redirected to the pipe),
while the second copy goes to file F.

% script F              - Make a typescript of a terminal session, saving the
dialogue in file F.  If no file name is provided,
the typescript is saved in a file called typescript.
Press Ctrl-D or type exit to quit script.

% stty -a               - Write to standard output all of the option settings
for the terminal.

% split -l n F          - Split a file F into a set of files having at most
n lines each.  The original file F is left unchanged.

% splitvt               - Run two shells in a split window.  Use Ctrl-W to
toggle between the windows.

% xargs -n num U A...   - Construct a command line consisting of the utility U
and the argument(s) A(...).  Invoke the constructed
command line and wait for its completion.  The -n
flag specifies how many standard input arguments
to use.

% basename F s          - Strip all directory components from the file name F,
as well as the possible suffix s.
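
A few of these power tools can be tried together on scratch data. The following is only a sketch; the file names (letters.txt, copy.txt, the part_ prefix) and the path /usr/src/hello.c are assumptions chosen for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a\nb\nc\nd\ne\n' > letters.txt

# tee: keep a copy of the stream in copy.txt while wc counts the lines.
count=$(tee copy.txt < letters.txt | wc -l)

# split: pieces of at most 2 lines each, named part_aa, part_ab, part_ac.
split -l 2 letters.txt part_
pieces=$(ls part_* | wc -l)

# xargs: hand the input words to echo two at a time.
pairs=$(xargs -n 2 echo < letters.txt)

# basename: drop the directory components and the .c suffix.
name=$(basename /usr/src/hello.c .c)

echo "$count $pieces $name"
echo "$pairs"
```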


## Regular Expressions:

A regular expression is a pattern that describes a set of strings. Used in combination with the grep utility, regular expressions aid in searching for character patterns in files (described later).

% man 7 regex           - User's regular expressions manual.
% man 3 regex           - Programmer's regular expressions manual.


Do a man on grep, and search for the part REGULAR EXPRESSIONS.

Most of the following is a shameless transcription of selected portions of the aforementioned. Note that regular expressions are defined in POSIX 1003.2, and come in two forms: modern or "extended", and obsolete or "basic".

% grep P F              - Search file F for all occurrences of pattern P.


Most characters including all letters and digits, are regular expressions that match themselves.

  a                     - Match the single character 'a'.
hello                 - Match the sequence of characters 'hello'.
i85                   - Match the sequence of characters 'i85'.
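
Patterns like these can be tried directly with grep. The sketch below uses a made-up scratch file; the first pattern is a literal as above, while the others use a bracket expression, a repetition operator, and a back reference, all of which this section goes on to describe.

```shell
# sample is a hypothetical scratch file with one test string per line.
sample=$(mktemp)
printf 'hello world\nroute i85 north\nbb\nbc\ncolour\ncolor\n' > "$sample"

literal=$(grep 'i85' "$sample")            # the literal sequence 'i85'
digits=$(grep '[0-9]' "$sample")           # any line containing a digit
optional=$(grep -E 'colou?r' "$sample")    # extended syntax: 'u' is optional
backref=$(grep '\([bc]\)\1' "$sample")     # basic syntax: same letter twice

echo "$literal"
echo "$digits"
echo "$optional"
echo "$backref"
```

Single quotes around each pattern keep the shell from interpreting the metacharacters before grep sees them.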


Any metacharacter with special meaning may be quoted by preceding it with a backslash. In extended regular expressions the metacharacters (described later) are

.   ?   *   +   ^   $   {   |   (   )   [   \

\\                    - Match the single character '\'.

A "bracket" expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character in the list is the caret ^, then it matches any character not in the list.

[234567]              - Match any single digit from '2' to '7'.
[^3x]                 - Match any single character other than '3' or 'x'.

A range of ASCII characters may be specified by giving the first and last characters, separated by a hyphen.

[2-7]                 - Match any single character in the ASCII range from
'2' to '7'.
[a-z]                 - Match any single character in the ASCII range from
'a' to 'z'.
[0-9A-Za-z]           - Depending upon the character encoding, this may
match any character that is a digit or an upper or
lower case letter.

Note that ranges cannot share endpoints. Note that there are predefined classes of characters that are independent of encoding, and are thus portable.

A collating sequence can be enclosed in "[." and ".]".

[[.ch.]]*c            - Matches the first five characters of chchccc.

Most metacharacters lose their special meaning inside lists, or "bracketed" expressions. To include a literal ] place it first in the list (following a possible caret ^). Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.

[]a-d]                - Match any single character in the list of ']' and
the range 'a' to 'd'.
[ab^d]                - Match any single character in the list of 'a', 'b',
'^', and 'd'.
[ad2-]                - Match any single character in the list of 'a', 'd',
'2', and '-'.

The period . matches any single character. A regular expression matching a single character may be followed by one of several repetition operators:

?                     - The preceding item will be matched 0 or 1 times.
*                     - The preceding item will be matched 0 or more times.
+                     - The preceding item will be matched 1 or more times.
{n}                   - The preceding item is matched exactly n times.
{n,}                  - The preceding item is matched n or more times.
{,m}                  - The preceding item is optional and is matched at
most m times.
{n,m}                 - The preceding item is matched at least n times, but
not more than m times.

Two regular expressions may be concatenated. Two regular expressions may be joined by the infix operator |, resulting in a regular expression matching any string that matches either subexpression. Repetition takes precedence over concatenation, which in turn takes precedence over alternation. Parentheses ( ) override these precedence rules.

Note that "basic" regular expressions are somewhat different from "extended" regular expressions. Two differences to keep in mind are that delimiters for bounds are \{ and \}, and parentheses for nested subexpressions are \( and \). There is one new type of basic atom (an atom being, for example, a regular expression enclosed in "()"), namely the back reference: \ followed by a non-zero decimal digit d. It matches the same sequence of characters matched by the dth parenthesized subexpression.

\([bc]\)\1            - Matches bb or cc but not bc.


## Matching patterns in files.

Grep searches the named input file(s) for lines containing a match to the given pattern. Grep understands "basic" (obsolete) regular expressions, and "extended" (modern) regular expressions. The pattern given to grep is by default (implicitly) interpreted as a basic regular expression. It can also be made explicit by the flag -G. To interpret the pattern as an extended regular expression, use the -E flag.

% grep P F              - Search the input file F for lines containing a match
to pattern P, a basic regular expression.
% grep -G P F           - Interpret pattern P as a basic regular expression
(default).
% grep -E P F           - Interpret pattern P as an extended regular
expression.
% grep -N P F           - Grep.  If a match is found, print N lines of leading
and trailing context.
% grep -c P F           - Grep.  Suppress normal output; print a count of
matching lines.
% grep -i P F           - Grep.  Ignore case distinctions in both the pattern
and input file.
% grep -n P F           - Grep.  Prefix each line of output with the line
number within input file F.
% grep -v P F           - Grep.  Invert the sense of matching, to select
non-matching lines.
% grep P                - Grep standard input for pattern P.
% grep P -              - Grep standard input for pattern P.


## Command-line Operability.

Luc Boulianne's Theorem (aka Luc's Theorem): "The study of Computer Science is the study of Minimizing Keystrokes."

bash:

C-a                   - Move to the (s)tart of the current line.
C-e                   - Move to the (e)nd of the current line.
C-f                   - Move (f)orward a character.
C-b                   - Move (b)ack a character.
M-f                   - Move (f)orward to the end of the next word.
M-b                   - Move (b)ack to the start of this, or the previous
word.
C-p                   - Fetch the (p)revious command from the history list,
moving back in the list.
C-n                   - Fetch the (n)ext command from the history list,
moving forward in the list.
C-h                   - Delete the character behind the cursor.
C-d                   - (d)elete the character under the cursor.
M-d                   - (d)elete from the cursor to the end of the current
word, or if between words, to the end of the next
word.
C-w                   - Kill the (w)ord behind the cursor.
C-k                   - (k)ill from the cursor to the end of the line.
C-u                   - (u)nix-line-discard from cursor to beginning of
line.
C-y                   - (y)ank the top of the kill ring into the buffer at
the cursor.
C-l                   - Clear the screen, leaving the current line at the
top of the screen.
C-t                   - (t)ranspose characters: drag the character before
point forward over the character at point.
M-t                   - (t)ranspose words: drag the word behind the cursor
past the word in front of the cursor.
C-_                   - Incremental undo, separately remembered for each
line.
C-x C-u               - Incremental undo, separately remembered for each
line.
M-#                   - Make the current line a shell comment.


## make.

make is a utility for maintaining, updating, and regenerating groups of related programs and files.
The purpose of the make utility is to automatically determine which pieces of a large program need to be recompiled, and issue the commands to recompile them. make can be used with any programming language whose compiler can be invoked from the shell. Note that make is not limited to programs; it can be used to update files from others, whenever the others change.

The command "make" relies on a file named "Makefile", which you must write to describe the relationships among files in your program and the commands for updating each file. make executes commands in the makefile associated with each target, typically to create or update a file of the same name. A target entry has the form:

target [:|::] [dependency] ... [; command ] ...
	[command ]
	...

If no target is specified upon invocation of make, the first target in the makefile is built, after being checked recursively against its dependencies. Once a makefile exists, typing make suffices to perform all necessary recompilations.

% make                  - Perform all necessary recompilations to programs
specified in the file called Makefile.  The make
program uses the makefile database and the
last-modification times of the files to decide which
of the files need to be updated.

The following is a great example of a short Makefile. If the first non-TAB character of a command line is a "@", the command will not be printed before being executed.

<MAKEFILE>

PODFILE = hash.pod
TITLE = 'Perl Hash Howto'
OUTFILE = index.html
JUNK = pod2htm*

all: $(PODFILE)
	@pod2html --infile=$(PODFILE) --outfile=$(OUTFILE) --title=$(TITLE)
	@rm -f $(JUNK)
	@if [ -r $(OUTFILE) ] ; then \
		chmod 644 $(OUTFILE); \
	fi

</MAKEFILE>


The following is another example of a Makefile:

<MAKEFILE>

COMPILER = gcc
MAIN_SOURCE = str2wrd.c
OBJ = someobjectfile.o
LIB = -lm
OUT_NAME = a.out

str2wrd: $(MAIN_SOURCE)
#$(COMPILER) -o $(OUT_NAME) $(OBJ) $(LIB) $(MAIN_SOURCE)
	$(COMPILER) -o $(OUT_NAME) $(MAIN_SOURCE)

clean:
	\rm -f $(OBJ) $(OUT_NAME)

</MAKEFILE>


## Revision Control System.

Programs, documentation, projects, and other such files that undergo frequent revisions or updates can be managed using the Revision Control System (RCS).

% man 1 rcsintro        - Manual containing an introduction to rcs.

Someone new to RCS need only learn two commands.

% ci                    - (c)heck (i)n.  Deposit the contents of a file into an
archival file called an RCS file.
% co                    - (c)heck (o)ut.  Retrieve revisions from an RCS file.

Consider an assignment that will undergo frequent revisions. Let the file be called foo.c. Let's assume that foo.c resides at ~/courses/2000.1/cs537/ass/ass04/foo.c.

% cd ~/courses/2000.1/cs537/ass/ass04/
- Change directory to the place where foo.c lives.
% mkdir RCS             - Make an RCS directory called RCS.
% ci foo.c              - Check in file foo.c, thereby creating a
corresponding RCS file in the RCS directory, storing
foo.c into it as revision 1.1, and deleting foo.c.


## Useful tricks.

Use xargs to help kill processes:

% ps axwww | grep http | awk '{print $1}' | xargs -n1 kill -9
- Run ps, pipe it to grep.  Pipe the grep output to
awk, sending field 1 of each line to xargs.  xargs
executes kill -9 on each process ID that awk outputs.

% tar -cvf - D1 | (cd /tmp; tar -xvf -)
- Archive directory D1 to standard output with tar, piping
it to a tar extraction in /tmp that reads from
standard input.
% tar -cvf - jsse1.0 | (cd /usr/local ; tar -xvf -)
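
The tar pipe above can be tried safely in a scratch directory. This is a sketch only: the destination directory dest and the file contents are made up for the example, and -v is omitted to keep the output quiet.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p D1/sub dest
echo 'payload' > D1/sub/file.txt

# The first tar writes the archive to standard output (-f -).  The subshell
# changes into dest, and the second tar reads the archive from standard input.
tar -cf - D1 | (cd dest && tar -xf -)

copied=$(cat dest/D1/sub/file.txt)
echo "$copied"
```

Using && instead of ; inside the parentheses means the extraction is skipped if the cd fails, so nothing is unpacked into the wrong directory.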

% for i in `grep "u1/" /var/etc/teaching.cs.mcgill.ca/passwd | grep \* | awk \
-F: '{ print $1 }'` ; do echo $i ; done

% perl -pi -e 's/hello/goodbye/g' F
- In-place text substitution, replacing every occurrence of
the word 'hello' with the word 'goodbye' in file F.

^Z                    - Suspend the current job.  Some programs don't allow
suspension.  For example pine must be invoked
with -z to enable suspension.
~^Z                   - Suspend current login session.


## Vim.

Vim is Vi IMproved. Vi stands for "visual editor". The underlying editor of both Vi and Vim is the line editor ex (itself descended from ed). Both Vi and Vim have two modes of operation: command mode and insert mode.

Range selection in vim can be done using v, V, and ^V.

  v                     - visual text selection per character
V                     - visual text selection per line
^V                    - visual text selection per block (rectangular shape)


After making the desired selection, press 'y' to 'yank' (copy) the text into vim's buffer. Next move the cursor to a desired location and press 'p' to paste the selected text after the cursor, or 'P' to paste it before the cursor.

Searching and replacing. Over the whole document (%), substitute (s) 'goodbye' for the first 'hello' on each line, and confirm (c) each replacement:

:%s/hello/goodbye/c

replace with goodbye (y/n/a/q/^E/^Y)?

  y                     - yes
n                     - no
a                     - all
q                     - quit
^E                    - scroll up one line
^Y                    - scroll down one line


Both vi and vim can be invoked with a -r flag followed by the name of a .swp file. This will recover the swp file which may have been left behind after a system crash or corrupted session.

% vim -r F.swp          - Recover file F using swap file F.swp.
Note: after recovering the file using the swap file
you should delete the swap file.  Especially before
attempting to vim the recovered file again.


Type ":help" in Vim to get started. Type ":help subject" to get help on a specific subject.

Some useful vim commands follow:

:set paste              - Turn on pasting mode.
:set nopaste            - Turn off pasting mode.
:set ai                 - Turn on autoindenting.
:set noai               - Turn off autoindenting.
:set textwidth=0        - Disable maximum text width.
:set textwidth=78       - Set max width that text can be inserted to 78.
:set list               - Show tabs as '^I' and end of line characters as '$'.
:set nolist             - Turn off list mode.
:syntax on              - Turn on syntax highlighting.
:syntax off             - Turn off syntax highlighting.


Variation of an Emacs/vi joke:

"Daddy! Daddy! Why are we hiding from the police?"
"Because they use Emacs, son, and we use vi."

## Subshells.

Like processes and subprocesses, when a shell starts another shell, the new shell is called a subshell. The child process (the subshell) inherits its parent's environment; however, changes to the child's environment do not affect the parent. A shell script runs in a subshell.
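
A minimal sketch of this one-way inheritance, using parentheses to start a subshell. The variable names are chosen only for the example.

```shell
greeting=parent
start_dir=$(pwd)

# Parentheses run the enclosed commands in a subshell, which starts with a
# copy of the parent's variables and working directory.
( greeting=child; cd /; echo "subshell: $greeting in $(pwd)" )

# Back in the parent, neither the variable nor the directory has changed.
after_dir=$(pwd)
echo "parent: $greeting in $after_dir"
```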

## Shell scripting (programming).

You can read Tom Christiansen's essay "Csh Programming Considered Harmful", to learn why csh (and by extension, tcsh) should not be used for writing shell scripts: http://www.cs.mcgill.ca/socsinfo/seminars/csh-whynot