At Coalition, we need to gather a lot of information in order to solve cyber risk, which is no small task. In fact, we need to scan the entire internet. Scanning the internet is a lot like eating an elephant — there are many small steps needed to succeed.
If you fail to break down the problem into small enough pieces, you're going to make yourself miserable.
When looking at the internet as a whole, it can be overwhelming, especially when IPv6 is part of the conversation. If you step away from the concept of the internet as a whole, you can break down your scope into ever-smaller pieces: The whole internet, regional internet registries (RIR), autonomous system numbers (ASN), classless inter-domain routing blocks (CIDR), and individual internet protocol addresses (IP).
If that wasn't enough, there are 65,536 (2^16) ports that are addressable on each IP address for protocols like TCP and UDP. To keep the math simple (ha!), we'll look at IPv4 (2^32) and TCP (2^16).
The resulting complete set of endpoints is 2^32 × 2^16 = 2^48, or roughly 281 trillion address and port combinations. It's a big number, and scanning all ports for the whole internet over TCP is a recipe for pain.
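To put that number in perspective, here is a small sketch of the arithmetic. The function name `endpointCount` and the probe rate of one million per second are made up for illustration; they are not figures from our scanner.

```go
package main

import "fmt"

// endpointCount returns the number of (address, port) pairs for an address
// space of addrBits bits and a port space of portBits bits.
// The sum of the two must stay below 64 to fit in a uint64.
func endpointCount(addrBits, portBits uint) uint64 {
	return uint64(1) << (addrBits + portBits)
}

func main() {
	total := endpointCount(32, 16) // IPv4 addresses x TCP ports
	fmt.Println(total)             // 281474976710656

	// Hypothetical probe rate, chosen only to illustrate the scale.
	const probesPerSecond = 1_000_000
	days := float64(total) / probesPerSecond / 86400
	fmt.Printf("%.0f days at %d probes per second\n", days, probesPerSecond)
}
```

Even at that assumed rate, a single full pass works out to roughly nine years, which is why reducing scope matters so much.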
Thankfully, we can shrink the address space we need to contact by excluding ranges that can never answer from the public internet: private address space (RFC 1918), un-routable address blocks (by design and due to network issues), and other reserved ranges.
Choosing to scan all ports feels like a good strategy for completeness. But, in reality, if you're looking at a network with even modest defences, you're going to be placed on time out and possibly reported for abuse.
To further reduce scope, perhaps it's really a "horizontal scan" you're looking for. A horizontal scan selects a particular port to examine across the space you intend to scan. Now we're getting to a level of work that is manageable, and we can leverage existing tools like zmap to see if the port you're interested in is capable of communication. These tools are well documented and I encourage you to learn more.
This is where my story really begins. I was asked to improve our ability to detect and communicate with exposed Server Message Block services, more commonly known as SMB.
SMB is a networking protocol that originated at IBM, but was adopted and implemented by Microsoft circa 1990. Let's just say that 1990 was a more trusting time, and the ability to easily share information greatly outweighed the need to secure it. The internet as we know it today was not yet being viewed through an adversarial lens.
Fast forward to 2020, and there have been 4,390 SMB security vulnerabilities recorded as Common Vulnerabilities and Exposures (CVEs) by CVE Details. As such, SMB exposed to the internet presents an unacceptable security risk for Coalition policyholders, and we need to know about any exposed instance in a timely manner.
We were previously using software that was exceptionally solid at "speaking" SMB, but not particularly well suited for horizontal scans. To be clear: it was slow. My task was to make this process faster.
Go is ideally suited for tasks that run concurrently (like connecting to things on the internet), and its designers have made concurrency a first-class citizen; it's a core component of the language.
The features that enable this mean you can run lightweight "workers" as goroutines that communicate via channels. This helps by removing the network blocking that would be present in any serial model. Any single job may be slow, but it's unlikely that all jobs will be slow at the same time. So, even if a single worker is bogged down (don't accept the default timeouts, set your own!), other workers will continue onward.
Since Go allows you to easily fan out your work to multiple workers with channels, it does present a new challenge: managing the results from each worker. This is where channels shine and allow you to concentrate your results to a single process, which is also a goroutine, to handle output in a way that prevents collisions and resource contention.
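The fan-out and fan-in pattern described above can be sketched roughly as follows. This is not our production scanner: the names `scan`, `result`, and `probe` are hypothetical, and the probe is injected as a function so the example runs without touching the network.

```go
package main

import (
	"fmt"
	"sync"
)

// result is what each worker reports back for one target.
type result struct {
	target string
	open   bool
}

// scan fans targets out to n worker goroutines over the jobs channel and
// funnels every result back through a single results channel.
func scan(targets []string, n int, probe func(string) bool) []result {
	if n < 1 {
		n = 1 // always keep at least one worker so the feeder can finish
	}
	jobs := make(chan string)
	results := make(chan result)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range jobs {
				results <- result{target: t, open: probe(t)}
			}
		}()
	}

	// Feed the workers, then signal that no more jobs are coming.
	go func() {
		for _, t := range targets {
			jobs <- t
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	// A single consumer collects output, so writers never collide.
	var out []result
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	targets := []string{"192.0.2.1:445", "192.0.2.2:445", "192.0.2.3:445"}
	fake := func(t string) bool { return t == "192.0.2.2:445" } // stand-in for a real probe
	for _, r := range scan(targets, 2, fake) {
		fmt.Println(r.target, r.open)
	}
}
```

Results arrive in whatever order the workers finish, so anything that depends on ordering has to sort or key the output afterwards.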
Go is designed in such a way that it forces you to handle errors immediately.
Some programmers find the design burdensome. For things like internet scanning, however, it enforces the single most important principle when you're connecting at a protocol level: if you're going to fail, fail as quickly as possible!
Given the sheer scale of the undertaking, saving a few seconds, or even processing cycles, really begins to add up. For any given operation, make sure you fail as quickly as possible, without sacrificing the potential for data acquisition.
This is the part that would have broken my heart a few years ago: I wrote code that took longer than it should have, only to be thrown out. But, the code was fun to write, and working at the level of bytes is really satisfying.
The process of turning packet captures into working payloads sometimes makes you feel like you've invoked alchemy and are bending the bytes to your will. As I moved through the complexities of the SMB protocol, I began to realize that, while I could implement the entire protocol myself, it wasn't a good use of time.
People who have dedicated years to projects of inter-operation with SMB will have a much better understanding of the protocol than I do. The best use of my time was to work on integrating well-tested code rather than reinventing the wheel. Not all was lost though. I did release a Go package, slycer, for working with byte slices, so you too can invoke the alchemy of bytes.
This work helps us gather data faster and alert our customers to exposed SMB sooner. It delivered a speed improvement of 1400%, which allows us to collect more frequently and frees up resources to collect other meaningful signals. We're always looking to improve the breadth and depth of data collected to have a more complete picture of cyber risk!
If you are interested in work like this, and are passionate about solving cyber risk, join the Coalition team!