Exploring Chromium User Data

[Image: Chromium icon. Credit: The Chromium Development Documentation Project / "The Chromium Authors" as per the open source development agreement, CC BY 2.5, via Wikimedia Commons]

Exposition

These days, browsers can store a lot of user data on your local computer if you let them. They also urge you to create an account and log in so that your browsing data, along with plenty of other data, can be synced to a server.

Those who sign in to Chrome or Chromium with a Google account have no control over what Google does with their data once it leaves their computer. The next best thing is knowing what data is stored on the local machine, where it lives and how it is used.

There are some interesting ways to view how your own data is being stored by Chromium, which I detail below. The first method is looking at internal pages, which can show you quite a bit of what Chromium does under the hood. The second method involves looking at user files, which are stored as SQLite databases or JSON.

One of the great things about Chromium being open source is that anyone can go read the docs, clone and build the source code or follow active development if they wish.

NOTE: There is really only one way to see how things work “under the hood”, but that would require someone skilled and brave enough to go through Chromium’s massive C++ codebase to see how user data actually gets stored and synced in code. I have no shame in admitting that I am not that person, at least not at this point in time.

Internal Pages

There are a number of unlisted internal Chromium pages that you can find by typing chrome://about or chrome://chrome-urls in the address bar.

There are lots of links there that show a wide variety of information about the state of your browser, your system and the current user.

I’ve tried to organize a list of the ones I found interesting.

System Info

URL | Description
chrome://system | Info about Chrome version, OS, extensions, and memory usage
chrome://gpu | Graphics features like OpenGL, Video Decoding and Hardware Acceleration
chrome://device-log | Shows input devices relevant to the browser (more useful for ChromeOS)

Network Info

URL | Description
chrome://network-errors | Shows every kind of network error possible (at least in the browser)
chrome://inspect | Shows every tab with an option to open DevTools for any page or extension

Omnibox

URL | Description
chrome://omnibox | Provides a debug view for the omnibox and shows extra information like where suggestions come from
chrome://predictors | Shows how Chrome “predicts” words frequently typed into the omnibox

Website Engagement

URL | Description
chrome://media-engagement | Shows your Media Engagement Index on sites
chrome://site-engagement | A measurement of time spent, scrolls, clicks and typing on a page

Local Browser State and User Preferences (in JSON format)

URL | Description
chrome://local-state | Some basic browser information
chrome://pref-internals | A large JSON file with lots of browser state and most if not all user settings
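
Since the JSON behind these pages is large and deeply nested, it can be easier to skim once flattened into dotted key paths. The following is a minimal sketch, assuming the JSON has been saved to a file first; the function names and the file name used in the usage note are hypothetical and not part of Chromium.

```python
import json


def flatten_keys(value, prefix=""):
    """Yield (dotted_path, leaf_value) pairs for nested dicts.

    Lists and empty dicts are treated as leaves.
    """
    if isinstance(value, dict) and value:
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from flatten_keys(child, path)
    else:
        yield prefix, value


def load_flat(json_path):
    """Load a JSON file and return its flattened {path: leaf} mapping."""
    with open(json_path, encoding="utf-8") as handle:
        return dict(flatten_keys(json.load(handle)))
```

For example, `sorted(load_flat("prefs.json"))` would list every setting path in a saved copy, which makes it easier to spot the sections worth reading in full.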

New Tab Data

URL | Description
chrome://newtab | The internal URL for the new tab page
chrome://ntp-tiles-internals | A view on data stored for the new tab page
chrome://suggestions | Suggestions that appear on the new tab page

Generated Events / Actions

URL | Description
chrome://user-actions | Every event/action that happens in the browser (not page) UI

Signin and Sync

URL | Description
chrome://sync-internals/ | All information about syncing with a Google Account
chrome://signin-internals/ | Information about signed-in Google Accounts

URLs for regular UI pages

URL | Description
chrome://settings | Settings
chrome://bookmarks | Bookmarks
chrome://history | History
chrome://downloads | Downloads
chrome://extensions | Extensions

Turn on experimental/disabled features

URL | Description
chrome://flags | Turn features on/off with flags, similar to setting command line flags in ~/.config/chromium-flags.conf

Chrome version info

URL | Description
chrome://version | Detailed/verbose version page
chrome://help | Less verbose version page

Find your user’s data files on disk

Much of the state of the browser is stored in readable SQLite databases, JSON files and other kinds of files (such as those in the cache) that contain binary data.

Windows

Found in: C:\Users\<username>\AppData\Local\Google\Chrome\User Data\Default

Mac

Found in: /Users/<username>/Library/Application Support/Google/Chrome/Default

Linux

Found in: /home/<username>/.config/google-chrome/Default

With Chromium, the files sit in an equivalent directory, but likely under the name chromium instead of chrome or google-chrome.
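
The locations above can be wrapped in a small helper. This is a sketch, not an official lookup: the Chrome paths are the ones listed above, while the Chromium variants (a Chromium or chromium directory) are an assumption based on common installs and may differ between distributions and packaging formats. The function name `default_profile_dir` is made up for this example.

```python
import sys
from pathlib import Path


def default_profile_dir(browser="chrome", platform=sys.platform, home=None):
    """Return the expected Default profile directory for Chrome or Chromium.

    browser is "chrome" or "chromium"; platform follows sys.platform values.
    The Chromium directory names are assumptions, not guaranteed locations.
    """
    home = Path(home) if home is not None else Path.home()
    is_chrome = browser == "chrome"
    if platform.startswith("win"):
        vendor = Path("Google") / "Chrome" if is_chrome else Path("Chromium")
        return home / "AppData" / "Local" / vendor / "User Data" / "Default"
    if platform == "darwin":
        vendor = Path("Google") / "Chrome" if is_chrome else Path("Chromium")
        return home / "Library" / "Application Support" / vendor / "Default"
    # Every other platform is treated as Linux in this sketch.
    name = "google-chrome" if is_chrome else "chromium"
    return home / ".config" / name / "Default"
```

Calling `default_profile_dir("chromium")` on a Linux machine should point at the same ~/.config/chromium/Default directory used in the commands in this post. It is worth checking that the directory exists before relying on it.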

Get a list of SQLite databases used by Chromium

$ file ~/.config/chromium/Default/* | grep -i sqlite | cut -d ':' -f 1 | xargs -I@ basename @
Affiliation Database
Cookies
Extension Cookies
Favicons
heavy_ad_intervention_opt_out.db
History
Login Data
Media History
Network Action Predictor
previews_opt_out.db
QuotaManager
Reporting and NEL
Shortcuts
Top Sites
Web Data

These files can all be opened with sqlite3 on the CLI or a SQL GUI such as DBeaver.
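
The listing above relies on the file utility. Where that is not available, the same check can be approximated by reading the first 16 bytes of each file, since every SQLite 3 database starts with the string "SQLite format 3" followed by a NUL byte. A minimal sketch, with a hypothetical function name:

```python
from pathlib import Path

# Every SQLite 3 database file begins with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"


def find_sqlite_files(profile_dir):
    """Return the sorted names of regular files in profile_dir that start with the SQLite header."""
    names = []
    for path in Path(profile_dir).iterdir():
        if not path.is_file():
            continue  # skip sub-directories such as the cache
        with path.open("rb") as handle:
            if handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC:
                names.append(path.name)
    return sorted(names)
```

Like the shell version, this only looks at the top level of the profile directory, so databases in sub-directories are not listed.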

Using SQLite to query data

Getting the schema and tables from a SQLite database:

sqlite3 <database> '.schema' '.tables' '.exit'
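
The same information can be read programmatically from the sqlite_master table. The sketch below uses Python's standard sqlite3 module and works on a temporary copy of the file, because a running browser may hold a lock on the original; note that a copy can miss very recent writes. The function name `list_tables` is made up for this example.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: CREATE statement} read from a temporary copy of db_path."""
    with tempfile.TemporaryDirectory() as tmp:
        # Copy first so a lock held by a running browser does not block the read.
        copy_path = Path(tmp) / "copy.sqlite"
        shutil.copy2(db_path, copy_path)
        con = sqlite3.connect(copy_path)
        try:
            rows = con.execute(
                "SELECT name, sql FROM sqlite_master "
                "WHERE type = 'table' ORDER BY name"
            ).fetchall()
        finally:
            con.close()
    return dict(rows)
```

Printing the keys of `list_tables("History")` gives roughly what `.tables` shows, and the values correspond to the `.schema` output for tables (indexes are left out here).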

Once you know the table and column where the data is stored, you can do pretty much anything with it. A few examples:

Export the whole urls table from the History database as a CSV

sqlite3 History '.mode csv' 'SELECT * FROM urls' '.exit' > ~/history.csv
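
One thing the raw export does not make obvious is that the timestamps are stored as microseconds since 1601-01-01 UTC rather than as Unix time. The sketch below writes a CSV with readable dates instead. The column names (url, title, visit_count, last_visit_time) are what I would expect in the urls table and should be treated as an assumption to verify with '.schema'. The function names are hypothetical.

```python
import csv
import sqlite3
from datetime import datetime, timedelta, timezone

# Chromium timestamps count microseconds from this epoch.
CHROMIUM_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def chromium_time(microseconds):
    """Convert microseconds since 1601-01-01 UTC to a timezone-aware datetime."""
    return CHROMIUM_EPOCH + timedelta(microseconds=microseconds)


def export_urls(db_path, csv_path):
    """Write url, title, visit_count and last visit (ISO 8601) to csv_path; return the row count."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT url, title, visit_count, last_visit_time FROM urls "
            "ORDER BY visit_count DESC"
        ).fetchall()
    finally:
        con.close()
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["url", "title", "visit_count", "last_visit"])
        for url, title, count, last in rows:
            writer.writerow([url, title, count, chromium_time(last).isoformat()])
    return len(rows)
```

If the browser is running, the query may fail with a "database is locked" error; running `export_urls` against a copy of the History file avoids that.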

Get all origin_urls from the logins table in the ‘Login Data’ database

sqlite3 'Login Data' 'SELECT origin_url FROM logins' '.exit' > ~/login_urls.txt

Get the “keywords” used for search engines stored in the ‘Web Data’ database. The output is in JSON format

sqlite3 'Web Data' '.mode json' 'SELECT * FROM keywords' | jq
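
The '.mode json' option is only present in fairly recent sqlite3 builds. Where it is missing, a similar result can be produced with Python's standard library. This is a sketch under that assumption; `query_as_json` is a hypothetical helper, not part of any tool mentioned here.

```python
import json
import sqlite3


def query_as_json(db_path, sql, params=()):
    """Run a query and return the result rows as a JSON array of objects."""
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row  # rows become addressable by column name
    try:
        rows = [dict(row) for row in con.execute(sql, params)]
    finally:
        con.close()
    # default=repr turns BLOB columns (bytes) into strings so they can be serialized.
    return json.dumps(rows, indent=2, default=repr)
```

Run from inside the profile directory, `print(query_as_json("Web Data", "SELECT * FROM keywords"))` should give output comparable to the sqlite3 and jq pipeline above.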

Conclusion

There are probably better ways to do this, but I learned a lot from looking at some of these tucked-away files. Specifically, I found it interesting that Chrome keeps media and site engagement metrics, and that it preloads and predicts pages when you start to type an often-visited URL. These are the kinds of features you don’t see or think about much as a user, but they are always there in the background.

I also learned that SQLite can output data in JSON format. That’s neat.

Compared to digging through the C++ implementation of how all these metrics are created and captured within Chromium, this was both insightful and relatively easy.