Resources for the 4th IEEE SECURITY & PRIVACY ON THE BLOCKCHAIN (IEEE S&B 2020) paper
Introduction
The following links are provided for anyone who wants to follow the ideas in our paper "An Automatic Detection and Analysis of the Bitcoin Generator Scam".
Paper abstract
We investigate what we call the "Bitcoin Generator Scam" (BGS), a simple system in which the scammers promise to "generate" new bitcoins using the ones that were sent to them. A typical offer will suggest that, for a small fee, one could receive within minutes twice the amount of bitcoins submitted. BGS is clearly not a very sophisticated attack. The modus operandi is simply to put up some web page on which to find the address to send the money and wait for the payback. The pages are then indexed by search engines, and ready to find for victims looking for free bitcoins. We describe here a generic system to find and analyze scams such as BGS. We have trained a classifier to detect these pages, and we have a crawler searching for instances using a series of search engines. We then monitor the instances that we find to trace payments and bitcoin addresses that are being used over time. Unlike most bitcoin-based scam monitoring systems, we do not rely on analyzing transactions on the blockchain to find scam instances. Instead, we proactively find these instances through the web pages advertising the scam. Thus our system is able to find addresses with very few transactions, or even none at all. Indeed, over half of the addresses that have eventually received funds were detected before receiving any transactions. The data for this paper was collected over four months, from November 2019 to February 2020. We have found more than 1,300 addresses directly associated with the scam, hosted on over 500 domains. Overall, victims have paid (at least) over 5 million USD to the scam, with an average of 47.3 USD per transaction.
Train and Test Datasets
The following are the training and test datasets we used in our experiments. Each link below points to a directory in which the DOM of every instance is saved as an HTML page (.html extension) and its first and landing URLs are saved alongside it (.url extension).
Training dataset
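For readers who want to load these directories programmatically, the following is a minimal Python sketch, not the code used in the paper. It assumes that each .html file has a matching .url file with the same base name, and that the .url file lists the first URL and then the landing URL, one per line. Both are assumptions about the layout, and the function name load_instances is hypothetical.

```python
from pathlib import Path


def load_instances(root):
    """Pair each saved DOM (.html) with its URL file (.url) by base name."""
    root = Path(root)
    instances = []
    for html_path in sorted(root.rglob("*.html")):
        # Assumption: the URL file shares the base name of the HTML file.
        url_path = html_path.with_suffix(".url")
        urls = []
        if url_path.exists():
            # Assumption: one URL per line (first URL, then landing URL).
            text = url_path.read_text(encoding="utf-8", errors="replace")
            urls = [line.strip() for line in text.splitlines() if line.strip()]
        instances.append({
            "name": html_path.stem,
            "html": html_path.read_text(encoding="utf-8", errors="replace"),
            "urls": urls,
        })
    return instances
```

Calling load_instances on the path of an extracted dataset directory would return one dictionary per saved page. Pages without a matching .url file are kept with an empty URL list, so that no instance is silently dropped.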
Collected Datasets
The following are the scam instances collected for the analysis in the paper.
Scam domains
Bitcoin addresses with transactions at the time of writing
Bitcoin addresses with no transactions at the time of writing
Addresses for other cryptocurrencies
The search queries
Example of fake log