Checking passwords against haveibeenpwned.com

07.10.2018 11:34

haveibeenpwned.com is a useful website that aggregates database leaks with user names and passwords. They have an on-line form where you can check whether your email has been seen in publicly available database dumps. The form also tells you in which dumps they have seen your email. This gives you a clue which of your passwords has been leaked and should be changed immediately.

For a while now haveibeenpwned.com reported that my email appears in a number of combined password and spam lists. I didn't find that surprising. My email address is publicly displayed on this website and is routinely scraped by spammers. If there was any password associated with it, I thought it has come from the 2012 LinkedIn breach. I knew my old LinkedIn password has been leaked, since scam mails I get are commonly mentioning it.

Screenshot of a scam email.

However, it came to my attention that some email addresses are in the LinkedIn leak, but not in these combo lists I appear in. This seemed to suggest that my appearance in those lists might not only be due to the old LinkedIn breach and that some of my other passwords could have been compromised. I thought it might be wise to double-check.

haveibeenpwned.com also provides an on-line form where they can directly check your password against their database. This seems a really bad practice to encourage, regardless of their assurances of security. I was not going to start sending off my passwords to a third-party. Luckily the same page also provides a dump of SHA-1 hashed passwords you can download and check locally.

I used Transmission to download the dump over BitTorrent. After uncompressing the 7-Zip file I ended up with a 22 GB text file with one SHA-1 hash per line:

$ head -n5 pwned-passwords-ordered-by-hash.txt
000000005AD76BD555C1D6D771DE417A4B87E4B4:4
00000000A8DAE4228F821FB418F59826079BF368:2
00000000DD7F2A1C68A35673713783CA390C9E93:630
00000001E225B908BAC31C56DB04D892E47536E0:5
00000006BAB7FC3113AA73DE3589630FC08218E7:2

I chose the ordered by hash version of the file, so hashes are alphabetically ordered. The number after the colon seems to be the number of occurrences of this hash in their database. More popular passwords will have a higher number.

The alphabetical order of the file makes it convenient to do an efficient binary search on it as-is. I found the hibp-pwlookup tool tool for searching the password database, but that requires you to import the data into PostgreSQL, so it seems that the author was not aware of this convenience. In fact, there's already a BSD command-line tool that knows how to do binary search on such text files: look (it's in the bsdmainutils package on Debian).

Unfortunately the current look binary in Debian is somewhat broken and bails out with File too large error on files larger than 2 GB. It needs recompiling with a simple patch to its Makefile to work on this huge password dump. After I fixed this, I was able to quickly look up the hashes:

$ ./look -b 5BAA61E4 pwned-passwords-ordered-by-hash.txt
5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8:3533661

By the way, Stephane on the Debian bug tracker also mentions a method where you can search the file without uncompressing it first. Since I already had it uncompressed on the disk I didn't bother. Anyway, now I had to automate the process. I used a Python script similar to the following. The check() function returns True if the password in its argument is present in the database:

import subprocess
import hashlib

def check(passwd):
	s = passwd.encode('ascii')
	h = hashlib.sha1(s).hexdigest().upper()
	p = "pwned-passwords-ordered-by-hash.txt"

	try:
		o = subprocess.check_output([
			"./look",
			"-b",
			"%s:" % (h,),
			p])
	except subprocess.CalledProcessError as exc:
		if exc.returncode == 1:
			return False
		else:
			raise

	l = o.split(b':')[1].strip()

	print("%s: %d" % (
		passwd,
		int(l.decode('ascii'))))

	return True

def main():
	check('password')

main()

Before I was able to check those passwords I keep stored in Firefox, I stumbled upon another hurdle. Recent versions do not provide any facility for exporting passwords in the password manager. There are some third-party tools for that, but I found it hard to trust them. I also had my doubts on how complete they are: Firefox has switched many mechanisms for storing passwords over time and re-implementing the retrieval for all of them seems to be a non-trivial task.

In the end, I opted to get them from Browser Console using the following Javascript, written line-by-line into the console (I adapted this code from a post by cor-el on Mozilla forums):

var tokendb = Cc["@mozilla.org/security/pk11tokendb;1"].createInstance(Ci.nsIPK11TokenDB);
var token = tokendb.getInternalKeyToken();
token.login(true);

var passwordmanager = Cc["@mozilla.org/login-manager;1"].getService(Ci.nsILoginManager);
var signons = passwordmanager.getAllLogins({});
var json = JSON.stringify(signons, null, 1);

I simply copy-pasted the value of the json variable here into a text editor and saved it to a file. I used an encrypted volume, so passwords didn't end up in clear-text on the drive.

I hope this will help anyone else to check their passwords against the database of leaks without exposing them to a third-party in the process. Fortunately for me, this check didn't reveal any really nasty surprises. I did find more hits in the database than I'm happy with, however all of them were due to certain poor password policies about which I can do very little.

Posted by Tomaž | Categories: Life

Add a new comment


(No HTML tags allowed. Separate paragraphs with a blank line.)