For Arboretum I'm making a file browser for the Save As… and Open menus. Currently you can navigate around and pick files, but it's missing the ability to access another hard drive, USB storage, or SD card or anything like that.
Windows 10's file explorer has a This PC section that lists them under a heading called Devices and drives. Linux Mint's file browser Nemo lists them similarly under Devices. So in this article all I'm really looking at is: where to get this list.
We need at least two pieces of information about each device. One is the name that we're actually showing to the user as part of the list, which could be a name they've given it like "Media" or "Backup". And the second is a file path for that device so we know what folder to show in the browser when they click on it.
It turned out to be pretty confusing for Linux, so maybe it's easier to start with Windows.
On Windows you can relatively easily loop through the volumes with the following C code.
This produces the output:
These aren't particularly useful to us by themselves. They're just GUIDs and not much else. The important part is that this loop can be used to look at all the volumes!
What is wchar_t, and why do I use Windows functions ending in 'W'? See Appendix: Windows API And Unicode.
Instead of just printing each one out, we can introduce a Windows API function that takes a volume name and gives us a list of paths we could use to refer to it. There isn't a way to know up front how many characters long this list is, so it actually expects you to just try calling it with a guess. If there was enough room, great! If not, it tells you the amount you need and you can try a second time.
The "path chain" has a somewhat unusual format. In C, strings are represented as an array of characters followed by one null character, as follows. The symbol for null is used to represent the null character, here.
The list of paths has multiple null-terminated strings stored end to end. The end of each string has a null character and the end of the list is signified by an extra null character. You could also think of the end of the list being an empty string. So, it's more like this:
This path chain does admittedly have the nice property that, if we don't actually care about any of the alternate paths, we can use it as though it's just a single string containing the first path.
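If we do want every path in the chain, walking it is short. Here's a minimal sketch; the function name walk_path_chain is made up for illustration, and it relies only on the layout described above (strings stored end to end, with an empty string marking the end).

```c
#include <stdio.h>
#include <wchar.h>

/* Prints every string in a double-null-terminated list and returns how
   many there were. walk_path_chain is a hypothetical helper name. */
static size_t walk_path_chain(const wchar_t *chain)
{
    size_t count = 0;
    /* An empty string (a lone null character) marks the end of the list. */
    for (const wchar_t *p = chain; *p != L'\0'; p += wcslen(p) + 1) {
        printf("path %zu: %ls\n", count, p);
        count++;
    }
    return count;
}
```

Each step skips past one string plus its terminator, so the loop stops as soon as it lands on the extra null character. Passing the chain to anything that expects a plain string still just sees the first path.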
So, now we have the path to the device. Next, we need the user-facing name for it. This is probably the easiest part, as it's just one function call: GetVolumeInformationW. The name is guaranteed to fit in a buffer of MAX_PATH + 1 characters, so you don't have to worry about size this time.
Now you have all the pieces of information needed, so here's the completed example!
The most straightforward way to compile this code is with Visual Studio. It includes a C compiler with its Visual C++ stuff and generally bundles the two together. So look for the C++ things rather than C.
The only dependencies here are the Windows SDK and the C Run-Time library, though. So if you have another setup that has those then you're good to go.
So it turns out there's not just one set of standard Linux API calls to do this. The data is kept by various subsystems and in several informational directories and you're just kind of expected to piece it together.
This article is a decent start: List all the mounted file systems or drives on Linux in C (C99) using mntent.h. It lists a bunch of mounts that aren't useful to a user, so they'd need to be filtered out, and it doesn't have labels.
So let's give something like that a go.
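Here's a minimal sketch of that approach using the <mntent.h> functions. The function name list_mounts is made up, and the filter that keeps only sources under /dev/ is my assumption about what counts as useful to a user, not a rule from anywhere.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Prints each mount whose source is a device file and returns how many
   were printed, or -1 if the table could not be opened. */
static int list_mounts(const char *table_path)
{
    FILE *table = setmntent(table_path, "r");
    if (!table) {
        return -1;
    }
    int count = 0;
    struct mntent *entry;
    while ((entry = getmntent(table)) != NULL) {
        /* Assumption: only sources under /dev/ are interesting to a user. */
        if (strncmp(entry->mnt_fsname, "/dev/", 5) != 0) {
            continue;
        }
        printf("mountpoint: %s source: %s\n",
               entry->mnt_dir, entry->mnt_fsname);
        count++;
    }
    endmntent(table);
    return count;
}
```

Calling list_mounts(_PATH_MOUNTED) reads the default table; that macro comes along with <mntent.h> on glibc. Taking the path as a parameter also makes it easy to point the function at a test file.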
mountpoint: / source: /dev/sda5
mountpoint: /mnt/C0DA7331DA7322B8 source: /dev/sda2
This reads the system's table of mounted file systems. <mntent.h> is just a standard library header for parsing this type of file. It still needs the labels, though!
udev is a device manager for the kernel. It keeps all the device files on the system in the /dev directory. One of its subdirectories is /dev/disk/by-label, which contains device files that are named after the labels we need. We already have paths to device files, like /dev/sda2 in the output above. So, to get a label all you need to do is search /dev/disk/by-label for a matching file.
How Do You Match Files?
You could follow symlinks until you reach the end at a "real" file. Then, take the path name. Then do the same for the other file and at the end compare the two path names. Linux has a function called realpath that does the symlink-following part for you, so all that's left is comparing the two results.
The functions in
<dirent.h> can be used to actually
walk the directory. Then for each entry compare "real" paths and hopefully
find a label.
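Put together, the search could look something like this. It's a sketch rather than the exact code from my project: find_label is a made-up name, and it leans on realpath allocating the result when its second argument is NULL, which POSIX.1-2008 and glibc allow.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 and copies the entry name into label if an entry in label_dir
   resolves to the same file as device. Otherwise returns 0. */
static int find_label(const char *label_dir, const char *device,
                      char *label, size_t label_size)
{
    char *device_real = realpath(device, NULL);
    if (!device_real) {
        return 0;
    }
    DIR *dir = opendir(label_dir);
    if (!dir) {
        free(device_real);
        return 0;
    }
    int found = 0;
    struct dirent *entry;
    while (!found && (entry = readdir(dir)) != NULL) {
        char path[PATH_MAX];
        int n = snprintf(path, sizeof path, "%s/%s", label_dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path) {
            continue; /* Path would not fit, so skip this entry. */
        }
        char *entry_real = realpath(path, NULL);
        if (!entry_real) {
            continue; /* Dangling link or unreadable entry. */
        }
        if (strcmp(entry_real, device_real) == 0) {
            snprintf(label, label_size, "%s", entry->d_name);
            found = 1;
        }
        free(entry_real);
    }
    closedir(dir);
    free(device_real);
    return found;
}
```

A call like find_label("/dev/disk/by-label", "/dev/sda2", label, sizeof label) leaves label untouched and returns 0 when the volume has no label, which is why an unlabeled mount has to be handled separately.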
Now we've got the paths to each volume and a nice label for each!
mountpoint: / label: (null)
mountpoint: /media/andrew/OS Windows label: OS\x20Windows
mountpoint: /mnt/C0DA7331DA7322B8 label: \x7eMedia\x7e
Who's been messing with my labels?
udev Property Encoding
udev disallows certain characters in its strings and encodes them by replacing "potentially unsafe" characters with their hexadecimal value preceded by \x, like \x20. Since backslash is used for this, it also has to be replaced by its own code \x5C.
So to get the proper labels we have to decode and replace these hex codes.
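A minimal sketch of that decoding, done in place since the decoded text can only get shorter. The name decode_udev_string is mine, and anything that doesn't look like \x followed by two hex digits is copied through unchanged, which is an assumption about how to treat malformed input.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Replaces every \xNN sequence in s with the byte it encodes, in place.
   Sequences that are not \x plus two hex digits are left as they are. */
static void decode_udev_string(char *s)
{
    char *out = s;
    while (*s) {
        if (s[0] == '\\' && s[1] == 'x'
                && isxdigit((unsigned char)s[2])
                && isxdigit((unsigned char)s[3])) {
            char hex[3] = { s[2], s[3], '\0' };
            *out++ = (char)strtol(hex, NULL, 16);
            s += 4;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}
```

Because a backslash is itself encoded as \x5c, a single pass is enough: a decoded backslash is written to the output and never looked at again.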
Now things are looking ~real~ nice.
mountpoint: / label: (null)
mountpoint: /media/andrew/OS Windows label: OS Windows
mountpoint: /mnt/C0DA7331DA7322B8 label: ~Media~
This Is Okay
This is the completed Linux example.
I actually used Eclipse CDT to build this, similar to how I use Visual Studio on Windows. You can of course also compile on the command line with GCC directly. Pick up the gcc and libc-dev packages. With those you should be able to compile and run it with this command.
gcc -o list_volumes -std=gnu99 main_linux.c && ./list_volumes
So something I left out is that the mount table file used above isn't managed by the kernel. It's maintained by user-space tools like umount. It's still very dependable, but it's worth mentioning that the kernel maintains its own list that you can access at /proc/self/mountinfo. This has its own format and unfortunately doesn't have a corresponding library like <mntent.h> to help read it.
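To give a feel for that format, here's a small sketch that pulls the mount point and source out of a single line. It follows the field order documented in the proc(5) man page; the function name and the 256-byte buffers are arbitrary choices for the example, and it doesn't decode the octal escapes (like \040 for a space) that can appear in paths.

```c
#include <stdio.h>
#include <string.h>

/* Extracts the mount point and mount source from one mountinfo line.
   Returns 1 on success and 0 if the line does not have the expected shape. */
static int parse_mountinfo_line(const char *line,
                                char mount_point[256], char source[256])
{
    /* Before the separator: ID, parent ID, major:minor, root, mount point. */
    if (sscanf(line, "%*d %*d %*s %*s %255s", mount_point) != 1) {
        return 0;
    }
    /* A lone "-" ends the variable-length run of optional fields. */
    const char *separator = strstr(line, " - ");
    if (!separator) {
        return 0;
    }
    /* After the separator: file system type, mount source, super options. */
    return sscanf(separator + 3, "%*s %255s", source) == 1;
}
```

The optional fields are why a fixed sscanf pattern over the whole line doesn't work: there can be zero or more of them, so the code has to look for the separator before reading the source.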
As a bonus, I'm including the original example I put together. It includes Windows and Linux in the same file using preprocessor conditionals. It also does full parsing of the mount list on Linux and converts strings to UTF-8 on Windows.
I couldn't figure out a good way to make it digestible for this article, but anyone who's interested in code that's a bit closer to what I'm using in Arboretum can take a look at main_both.c!
Appendix: Windows API And Unicode
The Windows API has three versions for many of its functions that involve strings.
A version ending in the letter 'A' which uses Windows code pages.
The 'A' is for ANSI, because an early code page was fashioned after an American National Standards Institute draft. It's considered a bit of a misnomer because ANSI didn't have anything to do with making the specification. Windows code pages are Microsoft's standard.
A version ending in the letter 'W' which uses Unicode encoded in UTF-16.
The 'W' stands for "wide" because Windows uses a 16-bit "wide character" type, wchar_t, to store each UTF-16 code unit.
A generic version that can be compiled as either of the other two.
The generic version uses the code page version unless the preprocessor symbol UNICODE is defined before including Windows.h, in which case it uses the Unicode version. It also introduces a special type TCHAR, which stands in for either a char or a wchar_t and is switched between them by the same definition.
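The switching itself is plain preprocessor work. Here's a toy version of the pattern that compiles anywhere; every name starting with DEMO_ or demo_ is invented for this sketch and stands in for UNICODE, TCHAR, and a real 'A'/'W' function pair.

```c
#include <string.h>
#include <wchar.h>

/* DEMO_UNICODE, DEMO_TCHAR, DEMO_TEXT, and demo_length are hypothetical
   stand-ins that mimic how the generic names get switched. */
#ifdef DEMO_UNICODE
typedef wchar_t DEMO_TCHAR;
#define DEMO_TEXT(s) L##s
#define demo_length demo_length_w
#else
typedef char DEMO_TCHAR;
#define DEMO_TEXT(s) s
#define demo_length demo_length_a
#endif

/* The "code page" flavor works on char strings. */
static inline size_t demo_length_a(const char *s) { return strlen(s); }

/* The "wide" flavor works on wchar_t strings. */
static inline size_t demo_length_w(const wchar_t *s) { return wcslen(s); }
```

Code written against demo_length and DEMO_TCHAR compiles to one flavor or the other depending on whether DEMO_UNICODE is defined, which is also why forgetting the definition silently gives you the code page version.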
Generally, I think Windows expects you to use the generic version and the preprocessor switch. But code pages are obsolete, and Unicode is what Windows uses for file paths and internally. So, I usually prefer explicitly calling the 'W' versions of functions so nothing uses code pages accidentally.
Unicode is definitely the dominant representation for text in 2018. But on Linux, OS X, and the World Wide Web the preferred encoding is UTF-8 instead of UTF-16. If you want to write a cross-platform program, then, you either have to handle both UTF-8 and UTF-16 or choose one and convert to the other when needed.
I stick with UTF-8 and convert to and from UTF-16 only when I'm talking to the Windows API. This complicates code a bit, so I omit it in examples to keep things focused on the topic at hand.