What (really) happens when you type ls -l?
The “ls” command is probably the first, and the most frequently used, command you will learn when you start working in a Linux shell.
Put simply, the “ls” command lists the contents of a directory. When run with no extra arguments (that is, when you type only “ls” and hit enter), it prints the names of the files and sub-directories that currently exist in your working directory.
However, you can customize the output of “ls” by adding arguments to the command line, that is, by typing something else after “ls” before hitting enter. Some (and only some) of the most common arguments that can be passed to “ls” are shown in the image below:
Since describing every argument that “ls” accepts is beyond the scope of this article, we will focus only on the case where “-l” is added to the command line. In response to “ls -l”, the system lists the files in the working directory in “long format”.
As seen in the image above, the long format not only lays the information out differently (one entry per line), but also shows far more information about the directory contents than the plain, no-argument view.
This is a breakdown of the information shown in the long format response, reading each line from right to left (the hard-link count, which appears between the permissions and the owner, is left out here):
- File Name: The name of the file or directory.
- Modification Time: The last time the file was modified. If the last modification occurred more than six months in the past, the date and year are displayed. Otherwise, the date and the time of day are shown.
- Size: The size of the file in bytes.
- Group: The name of the group that has permissions on the file in addition to the file’s owner.
- Owner: The name of the user who owns the file.
- File Permissions: A representation of the file’s access permissions. The first character is the type of file: a “-” indicates a regular (ordinary) file and a “d” indicates a directory. The next three characters represent the read, write, and execute rights of the file’s owner. The three after that represent the rights of the file’s group, and the final three represent the rights granted to everybody else. File permissions are a complex topic that deserves an article of its own.
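To make the fields above concrete, here is a minimal sketch in Python that builds one long-format line for a single path. It is only an approximation of what “ls -l” prints, not its actual implementation: the function name long_format is made up for this example, the column widths are not aligned, and the six-month cutoff is approximated as 182 days. It also includes the hard-link count, which the real command prints right after the permissions.

```python
import grp
import os
import pwd
import stat
import time

SIX_MONTHS = 182 * 24 * 60 * 60  # rough cutoff in seconds (an assumption)

def long_format(path):
    """Return one line resembling `ls -l` output for `path` (a rough sketch)."""
    info = os.lstat(path)                  # lstat: describe the link itself, not its target
    mode = stat.filemode(info.st_mode)     # e.g. "-rw-r--r--" or "drwxr-xr-x"

    # Translate numeric IDs to names, falling back to the number if unknown.
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)

    # Old files show the year, recent files show the time of day.
    mtime = time.localtime(info.st_mtime)
    if time.time() - info.st_mtime > SIX_MONTHS:
        when = time.strftime("%b %d  %Y", mtime)
    else:
        when = time.strftime("%b %d %H:%M", mtime)

    name = os.path.basename(path)
    return f"{mode} {info.st_nlink} {owner} {group} {info.st_size} {when} {name}"
```

Calling long_format on every name returned by os.listdir(".") would give a rough equivalent of the long listing for the current directory.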
So far so good? Great, but the original question stands. What REALLY happens when you type “ls -l” and hit enter?
It is a fair question, because so far we have only covered what we get to see on the screen as the system’s response, and that is only half of the answer. There is a whole process running in the background, waiting to be explained.
Once you type the command and hit enter, the shell recognizes that you have entered something that needs interpretation. Have you entered “ls -l”, or have you typed “asdasdjkgjiii -156465”, which of course wouldn’t mean anything that the shell can interpret?
In order to “realize” this, the shell starts by reading your text entry into a single string of characters. Most shells are written in the C programming language, and a simple shell can do this with getline(), a POSIX C library function that reads a whole line of input into a string (full-featured shells such as bash use a line-editing library instead, but the idea is the same). This string is then split on predefined delimiters, which in the case of “ls -l” is the blank space between “ls” and “-l”. Each individual element left after removing the delimiters is known as a token. All tokens are then placed in an array of strings. This whole process is known as tokenization.
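The reading and tokenizing steps can be sketched as follows. This is an illustrative Python version, not what any real shell does internally: the names read_command and tokenize are invented for the example, and it splits only on spaces and tabs, ignoring quoting, escaping and everything else a real shell handles.

```python
import sys

def read_command(stream=sys.stdin):
    """Read one line of input, much as getline() would in C.

    Returns the line without its trailing newline, or None at end of input.
    """
    line = stream.readline()
    if line == "":          # an empty string means end of input (e.g. Ctrl-D)
        return None
    return line.rstrip("\n")

def tokenize(line, delimiters=" \t"):
    """Split a command line into tokens on the given delimiter characters."""
    tokens = []
    current = ""
    for ch in line:
        if ch in delimiters:
            if current:             # a delimiter closes the token in progress
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:                     # do not lose the last token
        tokens.append(current)
    return tokens
```

For the input “ls -l”, tokenize returns a list with the two tokens “ls” and “-l”; repeated or trailing spaces do not produce empty tokens.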
Next, the first token (the command name) is checked against any previously defined aliases. An alias is a shortcut for a longer command that is created by the user rather than built into the shell. Think twice before assigning an alias to “ls”, because the alias takes priority over the original command.
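A simplified view of alias expansion, assuming the aliases are kept in a plain dictionary, might look like this. The function name expand_alias and the example alias “ll” are hypothetical, and real shells apply extra rules (for instance, to avoid expanding the same alias recursively) that are left out here.

```python
def expand_alias(tokens, aliases):
    """Replace the command name with its alias definition, if it has one.

    `aliases` maps an alias name to the command line it stands for.
    Only the first token is looked up; the arguments are left untouched.
    """
    if not tokens:
        return tokens
    replacement = aliases.get(tokens[0])
    if replacement is None:
        return tokens                       # not an alias: nothing to do
    return replacement.split() + tokens[1:]
```

With aliases = {"ll": "ls -l"}, the tokens for “ll /tmp” become “ls”, “-l”, “/tmp”, while the tokens for “ls -l” pass through unchanged.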
Once the shell has checked that “ls” is not an alias, it checks whether the command is a built-in. A built-in is a Linux/Unix command that is implemented inside the shell interpreter itself. Because its code is already loaded in memory as part of the running shell, executing it is a bit faster than executing an external command, which is stored on disk and has to be started as a separate program.
If the command is a built-in, the shell runs it directly, without launching another program. For example, “cd” is a built-in. “ls”, however, is not, so the shell needs to find the executable file for it.
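One common way to organize built-ins is a lookup table from command name to a function inside the shell. The sketch below assumes that design; BUILTINS, builtin_cd and run_if_builtin are names invented for this example, and only “cd” is implemented.

```python
import os

def builtin_cd(args):
    """Change the shell's own working directory (default: $HOME)."""
    target = args[0] if args else os.environ.get("HOME", "/")
    os.chdir(target)

# Table of built-in commands known to this toy shell.
BUILTINS = {"cd": builtin_cd}

def run_if_builtin(tokens):
    """Run tokens[0] if it is a built-in; return True when it was handled."""
    handler = BUILTINS.get(tokens[0])
    if handler is None:
        return False            # not a built-in: the caller must look elsewhere
    handler(tokens[1:])
    return True
```

This also hints at why “cd” has to be a built-in: it must change the working directory of the shell process itself, which a separate program could not do on the shell’s behalf.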
Once the built-in option has been ruled out, the shell looks for an executable file named “ls” in the directories listed in the “$PATH” variable. “$PATH” is an environment variable that stores a colon-separated list of directories that the shell searches whenever a command is entered. The shell goes through these directories in order and, in a simple implementation, calls stat() on each candidate path to check whether a matching executable exists there.
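The search through “$PATH” can be sketched like this. It is a simplified illustration: find_in_path is a made-up name, and the sketch ignores details such as command names that already contain a slash (which real shells do not look up in “$PATH” at all) and any caching of previous lookups.

```python
import os
import stat

def find_in_path(command, path_value=None):
    """Return the first executable regular file named `command` in PATH.

    `path_value` is a colon-separated list of directories; it defaults to
    the PATH environment variable. Returns None when nothing matches.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for directory in path_value.split(":"):
        # An empty entry in PATH conventionally means the current directory.
        candidate = os.path.join(directory or ".", command)
        try:
            info = os.stat(candidate)       # does anything exist at this path?
        except OSError:
            continue                        # no: try the next directory
        if stat.S_ISREG(info.st_mode) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Because the directories are tried in order, the first match wins, which is why the order of the entries in “$PATH” matters.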
When the “ls” executable file is located (normally at /usr/bin/ls), the shell invokes the execve() system call to execute it.
execve() receives the path of the executable, the array of arguments (our tokens) and the environment variables, so the new program knows which arguments it was started with.
In our case, this executes the “ls” program found at “/usr/bin/ls” with the “-l” argument.
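Python’s os module exposes a thin wrapper around execve(), so the call can be sketched as below. The wrapper function exec_command is a name made up for this example. Note that on success execve() replaces the current program and never returns, so the call in the comment would end the Python interpreter that runs it and turn it into “ls”.

```python
import os

def exec_command(executable, tokens):
    """Replace the current process image with `executable`.

    `tokens` becomes the argument array of the new program; by convention
    tokens[0] is the command name itself. On success this never returns;
    on failure os.execve raises OSError.
    """
    os.execve(executable, tokens, os.environ)

# For the command in this article the call would be:
#   exec_command("/usr/bin/ls", ["ls", "-l"])
```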
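The shell does not call execve() in its own process, because execve() replaces the calling program and the shell itself would disappear. Instead, the shell first calls fork() to create a “child” process, and it is the child that calls execve(). The “child” process is created by duplicating the calling process. This means that at the time the fork happens, both processes (the calling process, or parent, and the child) are identical but have different PIDs (process ID, a number that identifies each process within the system). fork() returns 0 in the child and the child’s PID in the parent, which is how each copy knows which one it is. Because the child inherits the parent’s open file descriptors, the output of “ls” is written to the same STDOUT as the shell, that is, to your terminal.
While the child runs, the parent process waits (typically with wait() or waitpid()) until the child terminates. The kernel then releases the child’s resources, and the parent takes over again, prints the prompt and waits for the next input from the user.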
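The fork, execute and wait sequence can be sketched in Python using the os wrappers around the same system calls. This is a minimal illustration under a few assumptions: run_external is a made-up name, and the exit codes 127 (the program could not be executed) and 128 plus the signal number (the child was killed by a signal) follow a common shell convention rather than anything stated in this article.

```python
import os

def run_external(executable, tokens):
    """Run `executable` in a child process and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Become the requested program.
        try:
            os.execve(executable, tokens, os.environ)
        finally:
            # Reached only if execve failed; never fall back into shell code.
            os._exit(127)

    # Parent: fork() returned the child's PID. Wait for the child to finish.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)    # child was terminated by a signal
```

Calling run_external("/usr/bin/ls", ["ls", "-l"]) prints the long listing to the terminal, because the child shares the parent’s standard output, and then returns the exit status of “ls”.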
We can summarize the entire process like this:
1. Shell interprets the command.
2. Shell creates a child process and executes the command in the child process.
3. Shell waits for the child process’s completion.
So, that is pretty much it. Hope it helps somebody out there. Please let any of the authors know if you have any questions.