28 Nov Challenges in monitoring and collecting data on Darknet
In efforts to counter online criminal activity on the Darknet, Law Enforcement Agencies (LEAs) face a challenge in detecting activities promptly and correlating those events with other criminal activities conducted online and offline. Obtaining timely information about criminals’ intentions gives LEAs a significant advantage in preventing criminal activity. This can be accomplished by applying an automated process for continuous monitoring and data collection from Darknet websites.
A well-established process for monitoring and collecting data published on darknet websites requires a systematic approach to accessing, exporting, and analyzing the data so that the process can provide a meaningful outcome.
The most widely used approach for accessing and collecting data from the Darknet is to apply various scraping techniques. This approach can yield an enormous collection of darknet data, which then requires a defined process of analysis and correlation before the information can help LEAs counter online criminal activity. However, this process does not come without challenges. The following are some of the challenges that might need to be addressed and overcome.
- Darknet website location (.onion links)
The location of onion websites is difficult to determine because darknet websites can change their .onion address for various reasons. The new locations may be shared on clear web and darknet website repositories and among public and closed social media groups. This makes it hard to establish the current location of a darknet website, which in turn makes automating the whole process very difficult.
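One small part of this problem, picking candidate addresses out of repository pages or group posts, can be sketched in Python. This is only an illustration: it assumes version 3 onion addresses (56 base32 characters followed by ".onion"), and the function names and the in-memory `registry` dictionary are hypothetical, not part of any existing tool.

```python
import re

# Assumption: only v3 onion addresses are of interest. They consist of
# 56 base32 characters (a-z, 2-7) followed by ".onion".
ONION_V3 = re.compile(r"\b[a-z2-7]{56}\.onion\b")


def extract_onion_addresses(text):
    """Return the unique v3 .onion hostnames found in text, in order of appearance."""
    found = []
    for match in ONION_V3.finditer(text.lower()):
        host = match.group(0)
        if host not in found:
            found.append(host)
    return found


def update_registry(registry, site_name, candidates):
    """Record addresses not yet known for a site and return the new ones."""
    known = registry.setdefault(site_name, [])
    new = [address for address in candidates if address not in known]
    known.extend(new)
    return new
```

Addresses found this way are only candidates. A mirror announced in a public group may be a phishing copy, so the sketch records what was seen and leaves verification to a later step.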
- Challenges in accessing the website
In the next part of the process, the system should be aware of the different limitations on accessing the website’s resources. These limitations can stem from different types of CAPTCHA challenges that guard website resources and from the need for credible user credentials to log in to websites.
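Before any scraping starts, an automated system can at least recognise which of these barriers a fetched page presents, so that the page can be routed to an operator or to a stored session. The following Python sketch is a rough heuristic only. The keyword patterns and the three labels are assumptions chosen for illustration, and real sites will need site-specific rules.

```python
import re

# Assumption: these textual hints are illustrative, not an exhaustive rule set.
CAPTCHA_HINTS = re.compile(r"captcha|prove you are human|anti-?bot", re.IGNORECASE)
PASSWORD_FIELD = re.compile(r"<input[^>]+type=[\"']?password", re.IGNORECASE)


def classify_access_barrier(html):
    """Label a fetched page as 'captcha', 'login', or 'open' using simple hints."""
    if CAPTCHA_HINTS.search(html):
        return "captcha"
    if PASSWORD_FIELD.search(html):
        return "login"
    return "open"
```

A page labelled "captcha" or "login" would typically be queued for manual handling rather than retried automatically.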
- Technology for scraping
Another aspect of automating the monitoring and collection of data from darknet websites is choosing the right technology for scraping the website data. There is no universal solution to the problem. Some of the most used approaches are:
- a Python-based crawling application with Selenium as a browser automation package, and
- AppleScript, a process automation utility for macOS, similar to PowerShell for Microsoft Windows.
Both of these technologies are useful for scraping data, but they differ completely in how they are implemented.
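As an illustration of the first approach, the Python sketch below drives Firefox through Selenium and routes its traffic through a local Tor SOCKS proxy. It rests on several assumptions: a Tor client is already listening on 127.0.0.1:9050, Firefox and geckodriver are installed, and the helper names `tor_firefox_preferences` and `fetch_page_source` are hypothetical. It is a minimal starting point, not a complete crawler.

```python
def tor_firefox_preferences(socks_host="127.0.0.1", socks_port=9050):
    """Firefox preferences that send traffic and DNS lookups through a SOCKS5 proxy."""
    return {
        "network.proxy.type": 1,                 # 1 = manual proxy configuration
        "network.proxy.socks": socks_host,
        "network.proxy.socks_port": socks_port,  # assumed local Tor SOCKS port
        "network.proxy.socks_version": 5,
        "network.proxy.socks_remote_dns": True,  # resolve hostnames via the proxy
        "network.dns.blockDotOnion": False,      # let Firefox request .onion names
    }


def fetch_page_source(url, timeout_seconds=120):
    """Load a page in headless Firefox via Tor and return its HTML source."""
    # Imported lazily so the preference helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")
    for name, value in tor_firefox_preferences().items():
        options.set_preference(name, value)

    driver = webdriver.Firefox(options=options)
    try:
        # Darknet pages load slowly, so allow a generous timeout.
        driver.set_page_load_timeout(timeout_seconds)
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()
```

Keeping the proxy settings in a separate function makes it easy to point the same crawler at a different Tor instance, for example one per collector in a distributed setup.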
- Bandwidth requirements
The Darknet is a very slow network, so automated monitoring and data collection will not be fast. One solution is a distributed approach, in which data is collected from various locations and later saved in one database. This means that data filtering and cleaning should be incorporated into the process so that duplicate data is avoided.
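The deduplication step can be sketched in Python as follows. The record layout (dictionaries with `url` and `text` keys) and the function names are assumptions made for this example. The idea is simply to fingerprint normalised page text so that the same page fetched by two collectors is stored once.

```python
import hashlib


def content_fingerprint(text):
    """Hash whitespace-normalized, lowercased text so trivial differences collapse."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def merge_without_duplicates(batches):
    """Merge record batches from several collectors, keeping the first copy of each page."""
    merged = {}
    for batch in batches:
        for record in batch:
            # Assumed record layout: {"url": ..., "text": ...}
            key = (record["url"], content_fingerprint(record["text"]))
            merged.setdefault(key, record)
    return list(merged.values())
```

Because the key includes the content hash, a page whose text has changed since the last visit is kept as a new record, which preserves the history needed for monitoring.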
- Data correlation
Data collected from darknet websites can be correlated with data found on the clear web. This is a challenge, but it can provide useful context for the data found on the darknet. The data can be correlated with output from different OSINT tools (such as Maltego), with data from e-currency transactions, and with data purchased from data brokers. This can reveal more about users’ activities on the darknet and can help in profiling them.
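One simple form of correlation is to look for identifiers that appear in both data sets. The Python sketch below uses e-mail addresses as the shared identifier. That choice, the record layout (`source` and `text` keys), and the function names are assumptions for illustration. Wallet addresses or usernames could be handled the same way with a different pattern.

```python
import re
from collections import defaultdict

# Assumption: e-mail addresses serve as the shared identifier in this sketch.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return the set of lowercased e-mail addresses found in text."""
    return {match.lower() for match in EMAIL.findall(text)}


def correlate(darknet_records, clearweb_records):
    """Return sorted (identifier, darknet_source, clearweb_source) matches."""
    # Index clear-web sources by the identifiers they mention.
    index = defaultdict(list)
    for record in clearweb_records:
        for identifier in extract_emails(record["text"]):
            index[identifier].append(record["source"])

    # Look up each identifier seen on the darknet in that index.
    matches = set()
    for record in darknet_records:
        for identifier in extract_emails(record["text"]):
            for clearweb_source in index.get(identifier, []):
                matches.add((identifier, record["source"], clearweb_source))
    return sorted(matches)
```

A match produced this way is a lead, not proof of identity, since identifiers can be shared, spoofed, or reused by different people.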
- Information consolidation
The last challenge to overcome is consolidating the data so that it provides context for later analysis. This step is very important in the monitoring phase because it can yield relevant information about activities on darknet websites.
This is a general overview of the challenges in establishing a process for automated monitoring, collection, and content extraction from Darknet websites. The challenges are varied, but a successful implementation of this kind of process can be very rewarding in countering online criminal activities.