Alex Clemmer is a computer programmer. Other programmers love Alex, excitedly describing him as "employed here" and "the boss's son".
Alex is also a Hacker School alum. Surely they do not at all regret admitting him!
(My batch ended on August 22, 2013, but as they say, never graduate.)
After learning a bunch about concurrency primitives yesterday, I decided it would be fun to have an operational understanding of their implementation. So I decided to boot up dtruss (a dtrace-based tool for OS X, much like strace on Linux) and look at the syscall pattern under different cool concurrent scenarios.
I started by running dtruss on this very simple program that I wrote:
#include <stdio.h>
#include <semaphore.h>

int main()
{
    sem_t mutex;
    sem_init(&mutex, 0, 1);
    sem_post(&mutex);
    return 0;
}
… which promptly caused my terminal to explode with a huge chain of syscalls that I did not recognize:
SYSCALL(args) = return
open(".\0", 0x0, 0x1) = 3 0
fstat64(0x3, 0x7FFF6AFE1260, 0x0) = 0 0
fcntl(0x3, 0x32, 0x7FFF6AFE14E0) = 0 0
close(0x3) = 0 0
stat64("/Users/alex/Desktop/fun/scratch/locks\0", 0x7FFF6AFE11D0, 0x0) = 0 0
issetugid(0x7FFF6B01D530, 0x7FFF6AFE1A30, 0x7FFF6B01D530) = 0 0
csops(0x0, 0x0, 0x7FFF6AFE14BC) = 0 0
shared_region_check_np(0x7FFF6AFDF408, 0x2, 0x55) = 0 0
stat64("/usr/lib/dtrace/libdtrace_dyld.dylib\0", 0x7FFF6AFE05D0, 0x7FFF6AFE14C0) = 0 0
open("/usr/lib/dtrace/libdtrace_dyld.dylib\0", 0x0, 0x0) = 3 0
pread(0x3, "\312\376\272\276\0", 0x1000, 0x0) = 4096 0
pread(0x3, "\317\372\355\376\a\0", 0x1000, 0x1000) = 4096 0
mmap(0x10B3E6000, 0x2000, 0x5, 0x12, 0x3, 0x100001F) = 0xB3E6000 0
mmap(0x10B3E8000, 0x1000, 0x3, 0x12, 0x3, 0x100001F) = 0xB3E8000 0
mmap(0x10B3E9000, 0x1F40, 0x1, 0x12, 0x3, 0x100001F) = 0xB3E9000 0
close(0x3) = 0 0
[... AND SO ON ...]
Full printout here. Where’s the initial call to execve? Have I really never run dtrace on OS X before? There were some similarities here (for example, fstat is a Linux function too), but overall I had no idea what was going on.
UPDATE: found this neat article that actually has a remarkably similar trace! Wish I had found that hours ago.
A lot of this looked to me like dynamic library loading, so I tried to run clang with -static, but I couldn’t quite get it to work. Something about a missing library.
Anyway, a lot of the differences between my prior experience and this seem to be because OS X is a BSD rather than a Linux. In any event, I began looking everything up to see if I could piece together what was happening in kernel land to run this program.
If you don’t know OS X syscalls very well (I also don’t!), you can have a look at the appendix at the bottom of this entry, where I kept my notes as I looked each of them up.
Here is what I think is going on in the syscalls above. A lot of this I’m not quite sure about/not sure how to find out more about. So some of it is pure speculation.
I’m not sure what SYSCALL(args) does. It looks like the column header that dtruss prints rather than an actual syscall.
We call open on the current directory. This returns a read-only file with file descriptor 3. (Notice where it says = 3 0? I think the 3 is the file descriptor.)
We then call fstat64 on that descriptor and do something to it (e.g., duplicate it or something) with fcntl. Not sure what that something is, because the flag gets turned into an integer here, and I don’t know how to map it back. After that, we close the folder with close.
We use stat64 to find some information about the file of the binary I compiled, which you can see is called locks. After this, we check to see if it has elevated permissions with issetugid, and then use csops to inspect the signature of something at address 0x7FFF6AFE14BC. (I’m not quite sure what’s up here.)
I don’t know what shared_region_check_np is doing; I couldn’t find it on Google.
We use stat64 to inspect a dynamic library that’s part of libdtrace. Interesting! Would this go away if we weren’t using dtruss? We subsequently open it, again with file descriptor 3. After opening, we can clearly see two calls to pread, each of which will read 0x1000 bytes! The first one starts at offset 0x0 and the second starts at offset 0x1000, but it looks like they’re pointing at two different places. I wonder why.
We call mmap 3 times to drop file descriptor 3 (i.e., the dynamic library from libdtrace) into memory. I can’t tell what protections or flags they’re being passed in with (since I don’t know what flag, e.g., 0x12 corresponds to), but we are mapping 0x2000, 0x1000, and 0x1F40 bytes in, respectively, to different places in memory. Hmm.
After this, we check the status of quite a few library files:
stat64("/usr/lib/libSystem.B.dylib\0", 0x7FFF6AFE0410, 0x7FFF6AFE1290) = 0 0
stat64("/usr/lib/system/libcache.dylib\0", 0x7FFF6AFE0110, 0x7FFF6AFE0F90) = 0 0
stat64("/usr/lib/system/libcommonCrypto.dylib\0", 0x7FFF6AFE0110, 0x7FFF6AFE0F90) = 0 0
stat64("/usr/lib/system/libcompiler_rt.dylib\0", 0x7FFF6AFE0110, 0x7FFF6AFE0F90) = 0 0
stat64("/usr/lib/system/libcopyfile.dylib\0", 0x7FFF6AFE0110, 0x7FFF6AFE0F90) = 0 0
stat64("/usr/lib/system/libdispatch.dylib\0", 0x7FFF6AFE0110, 0x7FFF6AFE0F90) = 0 0
[...]
I guess it’s hard to know why we’d inspect so many of these files. I suspect they’re dynamic libraries, and we’re only loading libdtrace into memory with mmap because we need it. Notice that none of them return error codes, so we’re definitely asking for information about files that exist.
Something that’s a bit more mysterious is the huge number of calls to mmap that follow:
mmap(0x0, 0x2000, 0x3, 0x1002, 0x1000000, 0x4) = 0xB3EB000 0
mprotect(0x10B3EB000, 0x88, 0x1) = 0 0
mmap(0x0, 0x17000, 0x3, 0x1002, 0x1000000, 0x6) = 0xB3ED000 0
mprotect(0x10B3ED000, 0x1000, 0x0) = 0 0
[...]
Notice the file descriptor is 0x1000000. That seems very high, and I don’t know what it’s for.
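The protection and flag arguments in these mmap calls are bit masks built from constants in <sys/mman.h>, so they can be mapped back to names. Below is a rough sketch of such a decoder. The function names describe_prot and describe_flags are hypothetical, and only a handful of common flags are covered. On the OS X headers, as far as I can tell, MAP_PRIVATE is 0x2, MAP_FIXED is 0x10, and MAP_ANON is 0x1000, which would make 0x12 MAP_FIXED|MAP_PRIVATE and 0x1002 MAP_ANON|MAP_PRIVATE. Treat those numbers as an assumption and check the header on your own machine, since the values differ between systems.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Append `name` to `out` (joined with '|') when `bit` is set in `value`. */
static void append_if_set(char *out, size_t n, int value, int bit,
                          const char *name)
{
    size_t used = strlen(out);

    if ((value & bit) == 0) {
        return;
    }
    snprintf(out + used, n - used, "%s%s", used ? "|" : "", name);
}

/* Write a readable form of an mmap protection value into `out`. */
void describe_prot(int prot, char *out, size_t n)
{
    assert(out != NULL && n > 0);
    out[0] = '\0';
    append_if_set(out, n, prot, PROT_READ, "PROT_READ");
    append_if_set(out, n, prot, PROT_WRITE, "PROT_WRITE");
    append_if_set(out, n, prot, PROT_EXEC, "PROT_EXEC");
    if (out[0] == '\0') {
        snprintf(out, n, "PROT_NONE");
    }
}

/* Write a readable form of a few common mmap flags into `out`. */
void describe_flags(int flags, char *out, size_t n)
{
    assert(out != NULL && n > 0);
    out[0] = '\0';
    append_if_set(out, n, flags, MAP_SHARED, "MAP_SHARED");
    append_if_set(out, n, flags, MAP_PRIVATE, "MAP_PRIVATE");
    append_if_set(out, n, flags, MAP_FIXED, "MAP_FIXED");
#ifdef MAP_ANON
    append_if_set(out, n, flags, MAP_ANON, "MAP_ANON");
#endif
    if (out[0] == '\0') {
        snprintf(out, n, "(none)");
    }
}
```

Because the decoder uses the constants from the local header, it should be run on the same kind of system that produced the trace. On a system where PROT_READ is 1 and PROT_EXEC is 4, calling describe_prot with 0x5 writes PROT_READ|PROT_EXEC into the buffer, which is what one would expect for the first libdtrace mapping if that is the code segment.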
But all this comes to a head in the final two lines:
sem_init(0x7FFF6AFE1908, 0x0, 0x1) = -1 Err#78
sem_post(0x7FFF6AFE1908, 0x0, 0xFFFFFFFFFFFFFFFF) = -1 Err#9
We learn that I actually programmed these wrong: both calls error out, and nothing much happens.
Ah, oh well.
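The Err# values are errno codes, and the program could have reported them itself by checking return values. Here is a small sketch of that, using a made-up helper name, try_unnamed_semaphore. On OS X, errno 78 is ENOSYS ("function not implemented") and errno 9 is EBADF. That fits the idea that sem_init is not implemented for unnamed semaphores there, and that sem_post was then handed a semaphore that was never initialized.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: repeats the sem_init/sem_post sequence from the
 * test program, but checks each return value. Returns 0 if both calls
 * succeed, otherwise the errno value of the first call that failed.
 */
int try_unnamed_semaphore(void)
{
    sem_t sem;

    if (sem_init(&sem, 0, 1) == -1) {
        int err = errno;  /* save errno before fprintf can change it */
        fprintf(stderr, "sem_init failed: %s (errno %d)\n",
                strerror(err), err);
        return err;
    }
    if (sem_post(&sem) == -1) {
        int err = errno;
        fprintf(stderr, "sem_post failed: %s (errno %d)\n",
                strerror(err), err);
        sem_destroy(&sem);
        return err;
    }
    sem_destroy(&sem);
    return 0;
}
```

Unlike the original program, this sketch stops at the first failure, so on OS X it would presumably report only the sem_init error. The usual suggestion there is a named semaphore created with sem_open, though that is not what the trace above exercises.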
Oddly enough, when you run this multiple times, things execute in a different order! If you have a look at the other tests in the gist, you’ll see that many of them are quite alike, but occur in a different order.
I’ve never really been much of a systems programmer. So I started cracking them open one by one. These are some of my notes. Some of them might be wrong!
open takes a file path and opens that file for reading or writing, depending on the flag given as argument.
close closes a file descriptor. After closing, the descriptor number is available for reuse.
fstat64 and stat64 are functions that give us information (permissions, inode number, etc.) about some file. In the case of stat, the file is denoted by a path; in the case of fstat, that file is denoted by a file descriptor.
fcntl performs an operation (e.g., duplication) on a file denoted by a file descriptor.
issetugid helps library routines like those in libc determine whether a process has raised privileges, which might be an issue if, e.g., they accidentally load libraries off the path that might abuse these privileges.
csops is used by system daemons to verify code signatures, which ensure that certain important utilities have not been tampered with.
pread reads from a file descriptor starting at some offset.
mmap maps files into memory. In other words, the byte locations of the file are mapped directly into the virtual address space of the running process. An argument determines where to start the map; if NULL, the kernel determines this.
Another curious note: a lot of string literals in these calls have a \0 at the end. For example: open(".\0", 0x0, 0x1). I think this is because they’re all null-terminated C strings. So, this syscall is basically opening with no flags and (I think) read-mode.
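To tie several of these notes together, here is a minimal sketch that walks through the same open, fstat, pread, mmap, close sequence on an ordinary file. The helper name pread_matches_mmap is hypothetical, and the 4096-byte cap simply mirrors the 0x1000-byte reads in the trace. It reads the start of the file once with pread and once through a read-only private mapping, then checks that the two views agree.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reads up to `n` bytes (at most 4096) from the
 * start of `path` with pread() and through a read-only mmap().
 * Returns 1 if both views hold the same bytes, 0 if they differ,
 * and -1 on any error (missing file, empty file, short read, ...).
 */
int pread_matches_mmap(const char *path, size_t n)
{
    char buf[4096];
    struct stat info;

    assert(path != NULL && n > 0 && n <= sizeof buf);

    int fd = open(path, O_RDONLY);   /* flags 0x0 in the trace */
    if (fd == -1) {
        return -1;
    }
    if (fstat(fd, &info) == -1 || info.st_size <= 0) {
        close(fd);
        return -1;
    }
    if ((off_t)n > info.st_size) {
        n = (size_t)info.st_size;    /* do not read past end of file */
    }

    /* pread() takes an explicit offset and leaves the file position alone. */
    if (pread(fd, buf, n, 0) != (ssize_t)n) {
        close(fd);
        return -1;
    }

    /* NULL address: let the kernel choose where the mapping goes. */
    void *map = mmap(NULL, n, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    int same = (memcmp(buf, map, n) == 0);
    munmap(map, n);
    close(fd);                       /* the descriptor can now be reused */
    return same;
}
```

The mmap calls in the trace pass explicit addresses rather than NULL, so this is only the simplest variant of what the loader appears to be doing.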