Technical inspection of Rekordbox 6 and its new internals
Hey guys,
So since the new Rekordbox 6 launched, I've been scrambling to support it for rekordcloud. At first, it seemed that it was not possible since they removed the XML export option and the internal database that Rekordbox uses has been encrypted for a long time. Luckily, I've been able to figure things out and rekordcloud now fully supports Rekordbox 6. 🎉
But this article isn't about that. With Rekordbox 6 a lot of interesting new developments have happened under the hood that I'd like to share with everyone here. I'll go over the new database, the Rekordbox Agent and some other thoughts.
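A few things first:
- My main operating system is Windows, so all file paths in this article are Windows paths.
- In this article, %AppData% stands for C:\Users\username\AppData. Note that the actual Windows environment variable %APPDATA% already expands to C:\Users\username\AppData\Roaming, so a path written here as %AppData%\Roaming\Pioneer is %APPDATA%\Pioneer on a real system.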
New Database
Rekordbox 5 used a .EDB file to store the database at %AppData%\Roaming\Pioneer\rekordbox\datafile.edb. This file appears to be a DeviceSQL database and Pioneer has not made any information about it available. Luckily they allowed us to export our data as an XML file so accessing the database was never much of a priority.
Now with Rekordbox 6, I'm sure you've noticed that it took a while to convert your library from Rekordbox 5 to 6. This is because they've completely changed the database from DeviceSQL to SQLite. The new database now resides at %AppData%\Roaming\Pioneer\rekordbox\master.db. This is an SQLite 3 database encrypted with SQLCipher 4. A bit more about the security below.
My educated guess is that DeviceSQL had become too old for them to keep building on, especially with the Rekordbox Agent in the picture (more below). This was a good opportunity for them to get rid of some legacy baggage. SQLite is a file-based database (it doesn't require a server) and one of the most widely used databases in the world. It is probably the best choice for Rekordbox at this point.
Security
So, the new SQLite database is encrypted which means you can't just use it without the encryption key. Pioneer did this because they prefer that no one outside of Pioneer touches it (there is a forum post by Pulse but I can't find it right now). This is certainly understandable, but the data inside the database is yours so there should definitely be a way to get to it.
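A quick way to see the encryption for yourself is to look at the first bytes of the file. A plain SQLite 3 file always starts with the 16-byte string "SQLite format 3" followed by a NUL byte. With SQLCipher's default settings the whole file, header included, is encrypted, so that string is absent. The sketch below only checks this header; it does not decrypt anything. The function name is my own, and the default-settings behaviour is an assumption about how master.db was created.

```python
SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of every plain SQLite 3 file


def looks_like_plain_sqlite(path):
    """Return True if the file starts with the standard SQLite 3 header.

    False means the file is either not SQLite at all or is encrypted
    (for example with SQLCipher in its default configuration).
    """
    with open(path, "rb") as f:
        return f.read(16) == SQLITE_MAGIC
```

Running this on master.db should return False if the file is encrypted as described, while any ordinary SQLite file returns True.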
Since your data is stored and used locally, we already know that the key must (at some point) be present on our machine. Usually it's hidden in an executable file, possibly obfuscated (meaning the key is garbled on purpose so it's hard to use without understanding the code around it). Knowing that the key must be stored locally gives good hope that you can find it, although this can sometimes be really hard.
You're probably wondering what the encryption key is. I'm not sure it's wise for me to just post it here. I'm not sure how upset Pioneer would be with me so I prefer not to. But if you read this article thoroughly and poke around in the source code, I'm sure you'll find it.
I thought the key might be license or machine dependent, but it appears that all databases are encrypted with the same key.
Rekordbox Agent
The other really interesting addition is the Rekordbox Agent. This looks like what they're using to achieve the new cloud sync feature. It's a completely separate program from Rekordbox 6 and it launches when you launch Rekordbox 6. Even if you don't have cloud sync, it still launches.
The Rekordbox Agent is an Electron app. [Note: removed some detailed reverse engineering instructions here.] After looking inside the source, the app appears to run the Express web framework for Node.js. We can combine this information with what we see in the Rekordbox Agent log file at %AppData%\Roaming\Pioneer\rekordboxAgent\log.log. It contains lines like these:
Rekordbox Agent launches an Express instance so it can listen to incoming events. By default it uses port 30001. You appear to be able to change this in %AppData%\Roaming\Pioneer\rekordbox6\rekordbox3.settings by changing the value of the CloudAgentPortNumber key.
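If you want to check whether something is listening on that port on your own machine, a plain TCP connection attempt is enough. This is a minimal sketch: the function name is hypothetical, it assumes the agent binds to localhost, and an open port only tells you that some process is listening there, not that it is the Rekordbox Agent.

```python
import socket


def is_port_open(host="127.0.0.1", port=30001, timeout=0.5):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Call is_port_open() once with Rekordbox 6 running and once with it closed to compare. If you changed CloudAgentPortNumber, pass that value as port.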
There are lots of routes here and I haven't explored all of them yet, but it's fairly safe to say that the Rekordbox Agent is in charge of up- and downloading your tracks and database. There are references in the source code to copying files (eg controllers/file_copy_worker.js looks like a multi-threaded worker). The source code also contains the database model for every table (in the models folder).
This brings us to the reason for the SQLite database. The Rekordbox Agent needs database access and is built with Electron (which uses Node.js). Since DeviceSQL is an old and poorly supported database, they had to make a switch. There doesn't appear to be any package on NPM for DeviceSQL, so supporting it in JavaScript could have been a time sink. Switching was probably the cheaper, faster and more future-proof choice.
Other interesting things I noticed in the source code:
- Looks like your cloud sync backups go to Amazon S3. More specifically (in my case), the rb-cloud-data-eu bucket in region eu-west-1. I'm in the EU, so they might use multiple regions depending on your location. But since eu-west-1 is the reference I see more often, I think they only use one location. You can actually see this yourself in the Rekordbox Agent log file.
- There are references to Google Drive and OneDrive together with Dropbox, so they may support those in the future. This is paired with something called rekordbox cloud (hopefully not to be confused with rekordcloud in the future). Maybe they are just using Dropbox now as a quick fix (and letting you pay for it) and later they might include their own storage solution?
- There is a reference to a Spotify file type. Maybe this is coming in the future? Take this with a grain of salt though, it was one small reference that did not occur elsewhere.
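To check the region observation against your own log file, you can count how often each region name shows up. This is a rough sketch: the regular expression is a loose pattern of my own for names shaped like eu-west-1, not an official list of AWS regions, and the function name is hypothetical.

```python
import re
from collections import Counter

# Loose pattern for region names such as "eu-west-1" or "ap-southeast-2".
REGION_RE = re.compile(r"\b[a-z]{2}-[a-z]+-\d\b")


def count_regions(log_text):
    """Return a Counter mapping each region-like string to its number of occurrences."""
    return Counter(REGION_RE.findall(log_text))
```

Read the contents of the Rekordbox Agent log file into a string and pass it to count_regions(); calling .most_common() on the result shows which region dominates.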
Extracting Your Data
The XML used to contain (almost) all our data, with a few notable exceptions such as MyTags. Now that this is no longer an option, we'll have to do it ourselves. Luckily, the SQLite database is clearly structured so getting our data out of there is no problem. I'm not going into details here, because it's really not very interesting how the database is structured.
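If you do want to look at the structure yourself, SQLite can describe its own schema. The sketch below lists every table with its column names. It assumes you already have a working connection to a readable (decrypted) database: Python's built-in sqlite3 module cannot open the SQLCipher-encrypted master.db directly, so you would need an SQLCipher-enabled driver plus the key for that. The function name is hypothetical.

```python
import sqlite3


def list_tables(conn):
    """Return {table_name: [column names]} for every table in the database."""
    tables = {}
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (name,) in rows:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        tables[name] = [col[1] for col in columns]
    return tables


# Stand-in usage with an in-memory database instead of the real master.db.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo (id INTEGER, title TEXT)")
print(list_tables(demo))
```

The same function works unchanged on any connection object that follows the sqlite3 interface, which is why it takes the connection as a parameter instead of a file path.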
There is one problem here though. The beatgrids are not found in the database. They appear to only be located in the ANLZ0000.DAT files. These are binary files that contain the exact beat locations, along with other data such as cue points and waveforms. These files are found at (for example) %AppData%\Roaming\Pioneer\rekordbox\share\PIONEER\USBANLZ\00c\fc4d7-802f-430d-a845-0bd9a87cb09c. Look familiar? The same files are on the USB stick you use with Pioneer CDJs. They hold your pre-calculated waveforms, cue points, beatgrids and any other information the CDJs need.
The database points at the right .DAT file per track so finding out which file has the right beatgrid is easy. Getting them out of the .DAT file is a bit harder, but thankfully has been thoroughly researched and explained here. This certainly saved me a lot of time!
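To give an idea of what reading a beatgrid involves, here is a minimal sketch. The byte layout is an assumption taken from community reverse-engineering notes, not from Pioneer documentation: a big-endian file starting with the magic PMAI, followed by tagged sections that each begin with a four-character code, a header length and a total length, with the beatgrid stored in a PQTZ section as 8-byte entries (beat number within the bar, tempo as BPM times 100, time in milliseconds). Function names are my own.

```python
import struct


def iter_sections(data):
    """Yield (fourcc, section_bytes) for each tagged section after the file header."""
    magic, len_header, _len_file = struct.unpack_from(">4sII", data, 0)
    if magic != b"PMAI":
        raise ValueError("not an ANLZ file (missing PMAI magic)")
    pos = len_header
    while pos + 12 <= len(data):
        fourcc, _sec_header, sec_total = struct.unpack_from(">4sII", data, pos)
        if sec_total < 12:
            break  # malformed length; stop instead of looping forever
        yield fourcc, data[pos:pos + sec_total]
        pos += sec_total


def parse_beat_grid(data):
    """Return a list of (beat_in_bar, bpm, time_ms) from the first PQTZ section."""
    for fourcc, section in iter_sections(data):
        if fourcc != b"PQTZ":
            continue
        len_header = struct.unpack_from(">I", section, 4)[0]
        # Assumed layout: two unknown 4-byte fields, then the beat count at offset 20.
        count = struct.unpack_from(">I", section, 20)[0]
        beats = []
        for i in range(count):
            beat, tempo, time_ms = struct.unpack_from(">HHI", section, len_header + 8 * i)
            beats.append((beat, tempo / 100, time_ms))
        return beats
    return []  # no beatgrid section found
```

Pass the raw bytes of a .DAT file to parse_beat_grid(). If the layout assumption does not hold for your files, the PMAI check or the struct unpacking will fail loudly rather than return wrong beats silently.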
Writing Data
Rekordbox 6 can still import XML files. Honestly, I wonder why since the move to remove XML export is a step closer to a walled garden approach. But since you can still import XML, I've decided to use that method (for now) to import your data from rekordcloud back into Rekordbox. It is definitely possible to write data directly to the SQLite database, but since I don't have any documentation about it, it's safer not to do this yet. There might be unforeseen ramifications that could cause data loss.
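For illustration, generating such an XML file can look roughly like the sketch below. The element and attribute names (DJ_PLAYLISTS, COLLECTION, TRACK, TEMPO with Inizio/Bpm/Metro/Battito, the file://localhost/ location prefix) reflect my understanding of the rekordbox XML format and are not taken from this article, so treat them as assumptions. The track dictionary keys and the product name are made up for the example, and this is not rekordcloud's actual code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def build_rekordbox_xml(tracks):
    """Build an ElementTree in (assumed) rekordbox XML layout from a list of track dicts."""
    root = ET.Element("DJ_PLAYLISTS", Version="1.0.0")
    ET.SubElement(root, "PRODUCT", Name="example-exporter", Version="0.1", Company="")
    collection = ET.SubElement(root, "COLLECTION", Entries=str(len(tracks)))
    for track_id, t in enumerate(tracks, start=1):
        # Percent-encode the path but keep slashes and the drive-letter colon readable.
        location = "file://localhost/" + quote(t["path"].lstrip("/"), safe="/:")
        track = ET.SubElement(
            collection, "TRACK",
            TrackID=str(track_id), Name=t["name"], Artist=t["artist"], Location=location,
        )
        # One TEMPO element: position of the first beat in seconds plus a constant BPM.
        ET.SubElement(
            track, "TEMPO",
            Inizio=f"{t['first_beat_s']:.3f}", Bpm=f"{t['bpm']:.2f}",
            Metro="4/4", Battito="1",
        )
    playlists = ET.SubElement(root, "PLAYLISTS")
    ET.SubElement(playlists, "NODE", Type="0", Name="ROOT", Count="0")
    return ET.ElementTree(root)
```

Writing the result with tree.write("import.xml", encoding="UTF-8", xml_declaration=True) produces a file you can try importing. Test on a backup of your library first, since the format details above are assumptions.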
Possibilities
One pain point of the XML export was always that it did not include MyTags. With access to the Rekordbox database, we can now read (and write) MyTags. This opens up many new possibilities in managing your MyTags. Thoughts about this? Let me know!
Maybe there are novel ways to use the full information in the database that weren't possible earlier with the XML export. Discuss below! :)
Legal
Preface: I'm not a lawyer or legal expert by any means so take everything I say with a grain of salt.
Rekordbox encrypts the database that contains your own data. This is unlike any of the competitors. Traktor and VirtualDJ both have XML files that you can look into with any text editor. Serato saves most data into ID3 tags, which can still be viewed easily with a tool like kid3. Until now this was no big deal since the XML export provided most data.
If you're in the EU, you fall under the GDPR. Article 15 of the GDPR states that you have the right to access your own data. Pioneer must provide a way to do this. The XML export was a suitable method but no longer exists. There is no automatic way to get access to your own data anymore. We'll see where this goes, but I don't think they have any choice other than to either bring XML export back or open up the new database.
If you're in the US (or elsewhere): I don't think there are any applicable privacy laws here. Convince your representatives to take privacy more seriously.
Edit: most likely this does not fall under the GDPR since it is not "personal data" (see definition in Article 4.1).
It could be different
In case Pioneer reads this: do the right thing and open up Rekordbox.
Bringing back the XML export is an option but if Pioneer wants to do it right:
- Remove the database encryption
- Document the database
- Support third party developers
An open platform leads to innovation. Closing it up will cause Rekordbox to stagnate. Opening it up can lead to a whole ecosystem around Rekordbox with limitless possibilities and integrations.
Thoughts or corrections?
If you have any thoughts, or notice a mistake, let me know below.