I found a bit of writing I did about NAT a while back. There's been some talk about this recently, so I modded it a little bit to make it more relevant to WC2, as I thought the more nerdy folks out there might find it helpful.
So NAT. What's it all about?
Lamby's super-abridged, mostly fictional, oversimplified, heavily biased synopsis: Network Address Translation is a messy, stop-gap measure that got dreamed up by psychotic, sadistic and under-medicated network engineer as a way to get more flexibility out of the already overcrowded IPV4 address space prior to the implementation of IPV6.
Most people know by now that IPV4 – that’s the 4 numbers with the dots in-between that we all recognise as an IP address - was just too small. There just wasn't enough numbers. Its a 32 bit value. 2^32 = just under 4.3 billion possible addresses. Seemed like a big number 50 years ago, but even 20 years ago people were realising there weren't enough to go around.
IPV6 has now been implemented, which has lots
of address space - like crazy lots =). We still see IPV4 addresses on the surface, but the backbone of the internet is now all IPV6, and all those things like credit-card gobbling parking meters that all need their own address just have IPV6 addresses. IPV4 is still around to interface with all the existing technology.
So IPV4 addresses were just not cutting it – it's 4 measly bytes, and that limit was (and for the most part still is) hard coded into every bit of network infrastructure in the world. But! Thought the madman, most of the traffic is TCP and UDP, and they have this port number. It's another 2 bytes! What can we do with that? NOTHING you maniac, it already has an important function, leave it alone
. But no…. We get Network Address Translation, and we'll probably be stuck with it until the end of days. Post nuclear holocaust the only things to survive will be cockroaches and NAT.
So yes, a port number is 16 bits (0 thru 65535) and only a fraction of them are actually used. Officially ports 0-1023 are 'system ports' and stuff above that can be registered by people who want to put their official stamp on them….. or you can just pick one that's not getting used too much
Wikipedia lists around 1300, including 3 entries for port 6112:
6112 TCP UDP dtspcd, execute commands and launch applications remotely, Official
6112 TCP Blizzard's Battle.net gaming service and some games, ArenaNet gaming service, Relic gaming service, Unofficial
6112 TCP Club Penguin Disney online game for kids, Unofficial
Anyone spot the mistake? … battle.net port 6112 listed as TCP only. (nvm, I already fixed it – the guy maintaining the page sent me a rather curt little message about 'original research', but he left it because he knew I was right;)
So how many possible port numbers do we need? Well I think we can safely say its more than 256 (1 byte) and less than (256x256) 2 bytes, so allowing 2 bytes is about right. But how many port numbers are actually used every day? Depends who you ask… let's see well 53 is DNS, that’s a cert, and 80:HTTP, more to the point these days 443:HTTPS, 20-21:FTP maybe not daily, 37:TIME gets used behind the scenes, 25:SMTP (email) used lots, but not so much from home any more, these days its all secure web pages (443) and poor old 110:POP3 is well on the way out. I could go on, hey I love that Doom still officially owns port 666, (a system port 8P) but in reality most of the traffic in an average network on an average day probably uses maybe 50 ports? Who knows, maybe 100? Ok so lets say, for the sake of argument, its 100…. but we have 65535 possible port numbers, so lets go ahead and divide that by 100 and now we can support a network of 655 individual nodes off a single IPV4 address.
How do we do this? Well we just have to introduce an agent at the crossover point between our NAT network and the real world. The following description is, of course, complete and total rubbish for so many reasons… there are different types of NAT and different ways to implement things within them, and I'm no expert. This is just here to hopefully help people who know nothing at all about the subject to get their head around the general concept.
Think of it like a (crazy) guy with a clipboard who writes really fast and looks at every packet that goes past. First he sees an outbound packet from puter 20 on port 3000, he says “call it 10” and writes down 10 = 20:3000, then he changes the source
port number to 10 and sends the packet on its way. Next there's an outbound packet from puter 44 on port 8008, so he writes down '11= 44:8008', changes the source port number to 11 and sends it on its way. Next there's a packet from puter 150 on port 3000…. '12 = 150:3000' …. off you go… he does this a lot.
The packets, with their re-assigned port numbers go to wherever they were going and (as is the nature of things) they get responses from remote computers somewhere in the world. Those computers do whatever they do with the payload – the chunk of data that this addressing system is carrying – then they send back their response. There's to and from IP addresses and a source port and a destination port. The remote computers received the packet on the destination port, and you guessed it, for the return journey its all switched around: the to address becomes the from address and vice-versa, and the source port becomes the destination port etc., and the return packets get sent back to the addresses that the initial packets were received from.
The thing is here, that as far as the remote computers are concerned, these packets all came from the same computer, because they were all using the same external IP address as the from address, so they all arrive back at the same place and are greeted by the loony with the clipboard who looks at the packets, then checks his clipboard and says “hmm, port 10… that's puter 20, port 3000”, so he changes the destination
port to 3000 and the to address to puter 10's local IP address and sends it on its way.. etc. When he gets the port 12 return he sends that to puter 150, port 3000.
So 2 different computers on the local network have simultaneously requested data over port 3000 from 2 separate remote servers using the same external IP address and both receive their replies, on the same port, and are none the wiser that for most of the journey their packets actually had completely different addresses and ports on them. Likewise the remote server has no idea that the originating computer is not receiving the packets on port 10 or 12, but on port 3000. But because of the way NAT works it never matters because the nutter in the middle keeps untangling his own knots on the way back.
Why do it this way? When you think about the amount of data that is sent around just for one Game of Thrones episode, its colossal compared with this tiny little 2-byte port number… so why mess with it? There is, of course, a very good reason. It's because every piece of data in every layer of every packet is ordered according to some protocol or other, and somewhere along the line, some computer is going to process that data. So if you add data, or introduce some new protocol, it means the computer at the other end has to understand that change and correctly process it, or even just know to ignore it (Unless you're meta-tagging an mp3, which is just deliciously naughty – but that's another story;). NAT works by sneakily messing with the existing data so that the remote computers just process it in according to their existing protocols without having to be aware that any translation is taking place. In this way it can be introduced in a single location, to suit local purposes without having to upgrade the entire internet (which wasn't in the budget that quarter).
Well. That sound's really clever… so what's wrong with that? Well…. nothing …. and everything. Hmm, how to explain…
You know when you have to move house? You pack everything into boxes and it gets loaded onto a moving van, then unloaded at the other end and you try to reconstruct your home. To make this a bit easier, you no doubt write “kitchen” or “garage” on the boxes, maybe even “kitchen – glassware” and “kitchen – plastics”. This way you have an idea about where everything goes when you get to your new home. Also the guys from the moving company might even notice that a box says “glassware” and decide not drop it 8 feet off the back of the van onto the road with the rest of them (maybe?). Good idea.
- - > So your address is where your new house is, and the extra note about what the box is being used to move, and exactly where in the new house it needs to go when it arrives, can be thought of as the port number. < - -
So now what if 500 people are all moving interstate on the same day, and they are all using the same moving company? Bit of a worry.. ok then, let's up the ante a bit more... What if there was a little snafu and instead of printing address labels for 500 different addresses somebody printed 10,000 labels all with the same address on it and for whatever reason they had to use those labels, and the only place to write any notes on the boxes is in the one little space provided on the address label and that note space just is not big enough to write the correct address? How can this work? Well, says the madman with the clipboard. It seems everyone is writing “kitchen” on these things, hey I've got an idea! There's lots more words than just that…
How about if I write “unicycle” on this, that means “bathroom at the Jones house”, but if I label it “golf balls”, that means “kitchen at the Smith house”, and if I put down “proverb” that means garage at the Bloggs house. Then I can just keep a list here on my clipboard, and when when we get to the next state we can just look at the list and send everything where it needs to go.
As insanity would have it, it turns out this guy really is obsessive-compulsive enough to do this and he correctly decodes the labels on 10,000 boxes and sends them to the right house. He even crosses out “golf balls” and re-writes “kitchen” so Mrs. Smith knows where to take her (broken) glassware. Pretty neat trick right? Perhaps, at least until you have to cross some border where there's a security check, and the customs agent decides to check out your unicycle and discovers instead, bottles of shampoo and other liquid product, which we all know is what terrorists use to blow up planes so… oh dear… Mr. Customs agent rather ominously snaps on a rubber glove and asks you to step into a small cubicle…. and Mrs. Jones never does get her hand sanitizer back.
There are some important protocols that include some form of auditing of their own packets, the most obvious being IPsec or “Internet Protocol Security”. IPsec encrypts packets at the internet layer, which of course includes the port data which exists at the transport layer, so if NAT changes what it assumes is the source port number the packet has been corrupted and can't be decrypted.
The real problem is that the crazy-guy is messing with the language that everybody speaks. Nowhere in the universe, apart from in his twisted head, does “unicycle” mean “shampoo”. Its all well and good for stuff that he handles first, but when Mrs. Jones' mother tries to send them a pot-plant as a house-warming gift, the crazy guy re-labels it 'special fried rice' and sends it to Alaska.
is the problem with NAT for WC2. Battle.net is NOT
a gaming server for WC2 games, its more like Craiglist. It just advertises the games. Initially the host's computer is the server, then it's mostly just peer-to-peer. The 'server' (BNet, PvPGN etc.) just relays messages and keeps track of the stats when the game ends. But, the other clients aren't replying to packets that the fruit-bat has already messed with, they are using the actual correct information that was posted in the game list, or supplied by the game host. So when a packet arrives at the NAT agent labeled “WC2 Game”, our friend looks at his clipboard and says, “oh, you're a medium sized flock of migrating geese” and points it towards the equator.
NAT is an IPV4 thing only. There is no IPV6 Network address translation, its just not needed, and here's why:
The IPV6 address space contains approximately:
The zeros on the end round it off to the nearest 10 billion addresses.
The entire IPV4 address space is under 4.3 billion addresses.
Assigning a trillion trillion
IPV6 addresses to every person on the planet would account for less than 0.01% of the address space. Pretty sure nobody wanted to have to upgrade the internet again.
An IPV6 address uses the same amount of bandwidth as writing BOOBIES!
…… Damn I love numbers (and boobies