3 Apr 2019

Hardenize Labs: Confusables

by Mike Cardwell

In our earlier blog post, we wrote about our research on detecting phishing hostnames in certificates obtained from Certificate Transparency logs. Today, we're taking our efforts further with the release of our project for automated discovery of phishing hostnames based on the continuous monitoring of all public certificates. This project is also the first to feature in our new Labs section, which we will use to publish our research.

Conceptually, our project is quite simple. We maintain a database of public hostnames, which is continuously updated with new discoveries. Today, this database is fed predominantly by hostnames from public certificates, but we plan to add other sources in the future (e.g., domain name registrations).
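To make the idea concrete, here is a minimal sketch of such a hostname store. It is not how our platform is built; the SQLite schema and the `open_store` and `add_hostnames` names are hypothetical, chosen only to illustrate recording each hostname once and handing the newly seen ones to a later search step.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical hostname store backed by SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hostnames ("
        " name TEXT PRIMARY KEY,"
        " source TEXT NOT NULL)"
    )
    return conn


def add_hostnames(conn, names, source="certificate"):
    """Insert hostnames and return only those not seen before."""
    fresh = []
    for raw in names:
        # Normalise case and drop a trailing dot so duplicates collapse.
        name = raw.strip().lower().rstrip(".")
        if not name:
            continue
        cur = conn.execute(
            "INSERT OR IGNORE INTO hostnames (name, source) VALUES (?, ?)",
            (name, source),
        )
        # rowcount is 1 when the row was inserted, 0 when it already existed.
        if cur.rowcount == 1:
            fresh.append(name)
    conn.commit()
    return fresh
```

Only the hostnames returned by `add_hostnames` would need to be searched, which keeps the matching work proportional to the number of new discoveries rather than to the size of the whole database.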

Separately, we continuously search the new hostnames in the database for phishing domains related to a handful of high-profile keywords, which we chose because they are the identities most often impersonated:

apple
facebook
instagram
paypal
twitter

For the purposes of this project, we don't look for exact keyword matches, but instead for variations that might indicate obfuscation. How we do that is the topic of our previous blog post, but one example is app1e vs apple (the letter l replaced with the digit 1; the two look quite similar as presented here on this page). Our goal is to gain a better understanding of what obfuscation techniques are used in real life.
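As a rough illustration of the app1e example, the sketch below folds a few look-alike characters back to the letters they imitate and reports keywords that appear only after folding. It is a simplification and not our production matcher: the `CONFUSABLES` table is a tiny, hand-picked assumption, the function names are hypothetical, and internationalised hostnames are assumed to have been decoded from their `xn--` form already.

```python
# Hypothetical, hand-picked substitution table. A real system would need a far
# larger set of look-alike characters.
CONFUSABLES = {
    "0": "o",
    "1": "l",
    "3": "e",
    "5": "s",
    "\u0430": "a",  # Cyrillic small letter a
    "\u0435": "e",  # Cyrillic small letter ie
    "\u043e": "o",  # Cyrillic small letter o
    "\u0440": "p",  # Cyrillic small letter er
}

KEYWORDS = ["apple", "facebook", "instagram", "paypal", "twitter"]


def fold(hostname: str) -> str:
    """Replace each look-alike character with the letter it imitates."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in hostname.lower())


def obfuscated_matches(hostname: str) -> list[str]:
    """Return keywords that appear only after folding, not verbatim."""
    lowered = hostname.lower()
    folded = fold(hostname)
    return [kw for kw in KEYWORDS if kw in folded and kw not in lowered]


print(obfuscated_matches("app1e.example.com"))   # obfuscated variation
print(obfuscated_matches("apple.example.com"))   # exact match, so ignored
```

One limitation of this sketch is that a hostname containing both an exact keyword and an obfuscated copy of the same keyword is skipped, because the check is made per hostname rather than per occurrence.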

The project page shows overall statistics and recent matches, and all the raw data is available for download. So head over to the project page to see for yourself:

Hardenize Labs: Confusables.

Detection of phishing hostnames is a feature of our monitoring platform. As of today, our customers at the highest subscription tier have access to an API that enables them to create custom keywords and monitor the discoveries. Please get in touch if this functionality is something you would like to use for your own monitoring needs.