Commit 40b19865 authored by Robert David Graham

Merge pull request #8 from jlund/readme-cleanup

Cleaned up some typos in the README and VULNINFO files
parents a359aea1 e91d82c8
+20 −20
@@ -88,10 +88,10 @@ Usage is similar to `nmap`. To scan a network segment for some ports:
This will:
* scan the 10.x.x.x subnet, all 16 million addresses
* scans port 80 and the range 8000 to 8100, or 102 addresses total
-* print output to <stdout> that can be redirected to a file
+* print output to `<stdout>` that can be redirected to a file

To see the complete list of options, use the `--echo` feature. This
-dumps the current configuration and exits. This ouput can be used as input back
+dumps the current configuration and exits. This output can be used as input back
into the program:

	# masscan -p80,8000-8100 10.0.0.0/8 --echo > xxx.conf
@@ -108,7 +108,7 @@ with the entire Internet in mind. It might look something like this:
Scanning the entire Internet is bad. For one thing, parts of the Internet react
badly to being scanned. For another thing, some sites track scans and add you
to a ban list, which will get you firewalled from useful parts of the Internet.
-Therefore, you want to exlude a lot of ranges. To blacklist or exclude ranges,
+Therefore, you want to exclude a lot of ranges. To blacklist or exclude ranges,
you want to use the following syntax:

	# masscan 0.0.0.0/0 -p0-65535 --excludefile exclude.txt
@@ -186,7 +186,7 @@ differences are:

You can think of `masscan` as having the following settings permanently
enabled:
-* `-sS`: this does SYN scan only (currently, will change in future)
+* `-sS`: this does SYN scan only (currently, will change in the future)
* `-Pn`: doesn't ping hosts first, which is fundamental to the async operation
* `-n`: no DNS resolution happens
* `--randomize-hosts`: scan completely randomized
@@ -201,7 +201,7 @@ command:
## Transmit rate (IMPORTANT!!)

This program spews out packets very fast. On Windows, or from VMs,
-it can do 300,000 packets/second. On a Linux (no virtualization) it'll
+it can do 300,000 packets/second. On Linux (no virtualization) it'll
do 1.6 million packets-per-second. That's fast enough to melt most networks.

Note that it'll only melt your own network. It randomizes the target
@@ -256,12 +256,12 @@ scan all "private" IP addresses. That would be the table of ranges like:
    172.16.0.0/20

In this example, the first 64k indexes are appended to 192.168.x.x to form
-the target address. Then, the next 16-million are appenedd to 10.x.x.x.
+the target address. Then, the next 16-million are appended to 10.x.x.x.
The remaining indexes in the range are applied to 172.16.x.x.

In this example, we only have three ranges. When scanning the entire Internet,
we have in practice more than 100 ranges. That's because you have to blacklist
-or exlude a lot of sub-ranges. This chops up the desired range into hundreds
+or exclude a lot of sub-ranges. This chops up the desired range into hundreds
of smaller ranges.

This leads to one of the slowest parts of the code. We transmit 10 million
@@ -293,9 +293,9 @@ is fairly straight forward:
This leads to another expensive part of the code. The division/modulus
instructions are around 90 clock cycles, or 30 nanoseconds, on x86 CPUs. When
transmitting at a rate of 10 million packets/second, we have only
-100 nanoseconds per packet. I see now way to optimize this any better. Luckily,
+100 nanoseconds per packet. I see no way to optimize this any better. Luckily,
though, two such operations can be executed simultaneously, so doing two 
-of these as shown above is no more expesive than doing one.
+of these as shown above is no more expensive than doing one.

There are actually some easy optimizations for the above performance problems,
but they all rely upon `i++`, the fact that the index variable increases one
@@ -307,7 +307,7 @@ need to spread our traffic evenly over the target.
The way we randomize is simply by encrypting the index variable. By definition,
encryption is random, and creates a 1-to-1 mapping between the original index
variable and the output. This means that while we linearly go through the
-range, the output IP addresse are completely random. In code, this looks like:
+range, the output IP addresses are completely random. In code, this looks like:

    range = ip_count * port_count;
    for (i = 0; i < range; i++) {
@@ -327,7 +327,7 @@ This architecture allows for lots of cool features. For example, it supports
"shards". You can setup 5 machines each doing a fifth of the scan, or
`range / shard_count`. Shards can be multiple machines, or simply multiple
network adapters on the same machine, or even (if you want) multiple IP
-source addresses on the same network adapter
+source addresses on the same network adapter.

Or, you can use a 'seed' or 'key' to the encryption function, so that you get
a different order each time you scan, like `x = encrypt(seed, i)`.
@@ -361,14 +361,14 @@ bypasses in Masscan:
* user-mode TCP stack
* user-mode synchronization

-Masscan can use the PF_RING DNA driver. This driver DMA's packets directly
+Masscan can use the PF_RING DNA driver. This driver DMAs packets directly
from user-mode memory to the network driver with zero kernel involvement.
That allows software, even with a slow CPU, to transmit packets at the maximum
rate the hardware allows. If you put 8 10-gbps network cards in a computer,
this means it could transmit at 100-million packets/second.

-Masscan has it's own built-in TCP stack for grabbing banners from TCP
-connections. It means it can easily support 10 million concurrent TCP
+Masscan has its own built-in TCP stack for grabbing banners from TCP
+connections. This means it can easily support 10 million concurrent TCP
connections, assuming of course that the computer has enough memory.

Masscan has no "mutex". Modern mutexes (aka. futexes) are mostly user-mode,
@@ -378,17 +378,17 @@ is contention, they'll do a system call into the kernel, which kills
performance. Mutexes on the fast path of a program severely limits scalability.
Instead, Masscan uses "rings" to synchronize things, such as when the
user-mode TCP stack in the receive thread needs to transmit a packet without
-interferring with the transmit thread.
+interfering with the transmit thread.


## Portability

-The code runs well on Linux, Windows, and Mac OS X. All the importnat bits are
+The code runs well on Linux, Windows, and Mac OS X. All the important bits are
in standard C (C90). It therefore compiles on Visual Studio with Microsoft's
compiler, the Clang/LLVM compiler on Mac OS X, and GCC on Linux.

Windows and Macs aren't tuned for packet transmit, and get only about 300,000
-packets-per-second whereas Linux can do 1,500,000 packets/second. That's
+packets-per-second, whereas Linux can do 1,500,000 packets/second. That's
probably faster than you want anyway.


@@ -400,12 +400,12 @@ information.
This project uses safe functions like `strcpy_s()` instead of unsafe functions
like `strcpy()`.

-This project as automated unit regression tests (`make regress`).
+This project has automated unit regression tests (`make regress`).


## Compatibility

-A lot of effort has been made in make the input/output look like `nmap`, which
+A lot of effort has gone into making the input/output look like `nmap`, which
everyone who does port scans is (or should be) familiar with.


+2 −2
@@ -22,7 +22,7 @@ to buffer-overflow the scanner, or even cause it to crash.

But I'd consider other vulnerabilities as well. Does Kali ship this with suid
and there's a preload bug? That's not really a vuln in this code, but if it's 
-something I could fix, I'd consider it paying a bounty for it.
+something I could fix, I'd consider paying a bounty for it.


## Disclosure policy
@@ -35,7 +35,7 @@ heavily used software, so I'm lax on this.

## Threats

-The primary threat is from hostile targets on the Internet sending back back
+The primary threat is from hostile targets on the Internet sending back
responses in order to:
* exploit a buffer-overflow vulnerability
* spoof packets trying to give fraudulent scan results (mitigated with our