SFU ResNet Stability (Jan/Feb/Mar)

April 3rd, 2010

Instead of one post per month, I’m doing one post for January, February and March of 2010 because it’s the same story. ResNet hasn’t gotten any better and, if the outage numbers are any indication, has managed to get worse.
January was notable for having 224 separate small outages, a new record, though March did come close with 214 separate outages. February clocked in at 107 outages, though it was likely lower due to the two-week Olympic break where many residents were not around (the two weeks of the break recorded 23 outages, while the other two had the remaining 84).

It appears that the ISP still has not configured their equipment properly, which in the case of a slow connection between Residence and SFU has been ongoing for over a year (the November post has more information on this issue).
Additionally, whenever the interconnect between SFU and Residence goes down, so does connectivity to SFU completely, which indicates that their router is not routing. This is in contrast with the way it should work, where a failed link is routed around, in this case going out over the ISP’s internet link and then back to SFU.
Why both of these issues continue to be issues, I can only guess at, but I suspect it is based on the ISP/Residence not caring enough to properly fix it.

Also noted were 13 outages each on Jan. 4th, Jan. 25th and March 23rd, and 14 outages on Feb. 5th. March 23rd also saw an outage of a few minutes and drastically higher packet loss and latency into the evening. The reason given was that someone had attacked the network internally, but there was no evidence provided, and given the veracity of previous statements, I would not put much weight behind it.
From my own logs and investigation, it looks more like someone, whether intentionally or not, was able to cause issues with the gateway due to poor network design/configuration.
Despite it being a preventable outage, no refund has been forthcoming, which was entirely expected due to Residence’s continued animosity towards taking steps to improve ResNet or treat their customers properly.

There was also an outage of a significant portion of an hour on the 15th of March and on the 29th of January. There was also notable instability, packet loss and latency on the evening of March 1st and 29th.
I realized when I was compiling these statistics that issues generally crop up in the evening, which is when the network is under the most load. In fact, upon more closely examining my logs, there is a definite trend in latency and packet loss throughout the day that peaks around midnight and falls off over the early morning hours, levels out in the late morning and afternoon and starts increasing around 5PM.
Which leads me to a conclusion I have previously reached, that the network capacity is too low and that the ISP is quite oversubscribed (though this is just further evidence, I reached that conclusion upon learning the 90:1 contention ratio).

Uptime was 98.1% in January, 98.9% (so close to 99%) in February and a more dismal 97.6% in March.
Downstream speed averages continued the trend, being at 10.3Mb/s in January, 11.0Mb/s in February and 9.8Mb/s in March, the first time the average speed fell below 10Mb/s, taking it out of the high-speed internet category (I consider 10Mb/s about minimum for something to be high speed, any lower than that and it’s not terribly usable for things requiring speed).
Average upstream bumped around .75Mb/s for January, with February and March coming in at .89Mb/s and .71Mb/s respectively.

Average roundtrip time with the gateway was 38ms in January, 29ms in February, and a stunningly bad 43ms in March. I say stunningly bad because it finally eclipsed the average roundtrip time on my Shaw connection, which goes a much further distance, both in terms of the network and geographically.
Whereas Shaw’s gateway is downtown (roughly 15km away), ResNet’s gateway is on Residence, or a few hundred meters away and connected by low latency Ethernet and fiber, versus Shaw’s higher latency hybrid fiber and coax network.
As is a recurring theme, this just shows… You know how this goes, it’s overloaded.

The best summary I can come up with is that I am paying too much for too little. The ISP has not fixed problems that have been ongoing for over a year now, and Residence seems to be more interested in saving face than actually fixing issues.

SFU ResNet Stability (December)

January 18th, 2010

December was another month of ResNet’s continual instability and, despite lower load due to people not being around for the winter break, it still managed to have more outages than any of the previous three months.
Due to a power outage scheduled with insufficient notice, I was unable to log around four days worth of data. But even with fewer days, ResNet was out in some form or another 126 separate times, which is more than any other month that detailed logs were kept for.

December 13th was notable for thirteen separate outages, which is among the highest I have seen. There was high latency (~600ms to about 3s) to the gateway for about two hours on the 7th.
Additionally, there was a rogue DHCP server causing some people connectivity issues, and somehow the ISP left it going for nearly a week. Whether or not they took it off or the owner fixed it is unknown.

There were no major outages, save the internet going down when the power was out. Why they do not have battery backups on their equipment is rather strange, but given the track record of the equipment, rather unsurprising. It would have been nice to have some notice that the internet was going to be interrupted when the power was out, but the ISP and Residence do not seem to make any attempts to keep their customers informed.

Interestingly, there were more minor outages this month than any month prior, but unlike the previous two months, no major outages occurred. I was actually hoping to see a few days where the internet worked properly over the winter break, when most residents were at home, but true to form, ResNet couldn’t even handle the reduced load placed upon it.

Unfortunately, I can’t give an overall uptime due to a technical glitch caused by the power outage, but I do have enough data to give average speeds that the network performed at.
Downstream speed continued being poor at roughly 12Mb/s (or roughly the same as last month’s), while upstream was also poor at around .72Mb/s.
Average ping to the gateway, on the other hand, was up roughly 15%, which on a roughly static network, is a little odd. It does seem to indicate poor network management practices.

All in all, I wish I had something better to write about, but the network continued to fail at anything other than basic service. Problems such as poor network configuration (manifesting itself as traffic shaping on the outbound link to SFU, which should not be shaped at all) still plague the network, even though some have existed for quite some time.
All in all, another month of lame service for too much money.

SFU ResNet Stability (November)

December 12th, 2009

November was another month of ResNet’s continual instability, clocking up 100 periods when the network was out in some form or another. This is down from 109 in October and 64 for part of September. There was only one major outage affecting only part of the network in addition to three extended periods of heightened instability.

November saw slightly fewer minor outages than October, with 100 outages of between about a minute (or less, the granularity of logging is only one minute) and around fifteen minutes.
To get the month off to a poor start, ResNet delivered very slow speeds during the evening of the first. There were also seven outages on the first, the most for any day of the month.
There were also intermittent bouts of packet loss for seven hours starting in the early evening of the fifteenth, which was a precursor of a major outage, similar to behaviour observed last month. Latency was very high that evening in part of the network, and the internet was out for around nine hours on some segments of the network the day after, the sixteenth.
The 23rd was the last day with high latency and packet loss in the late evening.

Interestingly, the nine hour outage on the sixteenth followed the same pattern of high latency and packet loss the day before as the major outage in October. Once again, it appears that the ISP did not respond to the warning signs that their network stability was declining.
I was unable to get an answer on why the internet was out for about half of Residence for so long, but if it was anything like previous issues, it was likely fully or at least partially preventable.
It is somewhat concerning that these major issues keep cropping up and that there has been little if any improvement in the network situation so far this semester (over three months so far).

On the plus side, the issue with slow routing from Residence to the SFU network might actually get taken care of as the ISP has finally (after eight months) figured out that the issue is at their end.
I was informed that this was due to the upstream traffic shaping that keeps people from uploading too much (due to the low transfer rate of the connection the ISP uses to get to the internet). Inexplicably, though possibly due to hardware or software limitation or misconfiguration, this is also being applied to routes over the dedicated link to SFU, which should not happen. It makes using remote desktop software virtually impossible, and file transfers are painfully slow.
As of this post, it was still not working properly.

The internet was up slightly more than last month, with an uptime of a (still) poor ~97.3% (or ~98.2% in the parts that didn’t experience the nine hour outage part way through the month). Average downstream speed was lower, around 12.5Mb/s (versus ~14Mb/s last month and ~17Mb/s in September) though upstream speed stayed about the same at about .75Mb/s (though lower than September’s .85Mb/s).
Average ping to the gateway was only slightly higher (around 2%).

Once again, the network seemed to struggle under the load that it was being put under, falling flat on its face partially once, highlighting the need for an infrastructure upgrade. (Last I checked, the network was running end-of-life Extreme Networks switches.)
Additionally, the upload and download speeds are still lacking, especially given the price we are paying for the service compared to the price of services from other ISPs.
Unfortunately, it does not look like the quality provided by ResNet is going to improve any time soon.

SFU ResNet Stability (October)

November 2nd, 2009

Once again, I went through my logs pertaining to the stability of my ResNet connection for the month of October.  Overall, the connection was down more, but there was one whole day (out of 31) where no issues appeared. It also experienced two major outages.

For October, there were 109 outages of varying lengths, including a three hour outage on the 7th and a roughly seven hour outage on the 13th/14th (more on this one below).
There was also a roughly hour long outage (it wasn’t out for the whole hour, but it was too slow to be usable) early in the morning of the second, as well as a large latency spike in the afternoon of the eighth and some during the day on the 19th.
The internet did work properly on the 11th, though this was likely due to reduced load due to many students not being in residence on account of Thanksgiving.

I was not actually at residence on the 13th, being home for the weekend and in transit from home, but I was able to monitor some of the situation from home via my Shaw connection.
The issues actually started on the 12th, with noted high levels of packet loss and general instability appearing in the late afternoon and early evening. The ISP was notified, but likely due to the long Thanksgiving weekend, wasn’t able to do anything about it.
The 13th saw some of these issues clear up, but in the late afternoon, things started degenerating again. This led to the internet going completely down around 6PM and staying out, with a few small exceptions, for another 7 hours!
Additionally, routing to SFU, which seems to be hard-coded on both routers at the end of the route (which makes absolutely no sense to me) was not working until that evening, and it still (as of right now on November 3rd) does not work properly.

I did inquire as to why the internet went down and spent so long down, and I was told that it was on account of an attack on the router. Why this was not fixed on the 12th when the issue first started cropping up has not been answered yet, and neither has why it took seven hours to fix a preventable problem.
Personally, I run software on my router (Snort, among others) that is really good at detecting and blocking potential attacks. I don’t know why the router wasn’t properly protected.

The internet connection was up a rather low 97% of the time, meaning that it was unusable for almost an entire day spread over the month!
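As a rough check on the “almost an entire day” figure, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this example, and it assumes a 31-day month and treats the 97% figure as exact rather than rounded.

```python
def downtime_hours(uptime_percent: float, days_in_month: int) -> float:
    """Hours of downtime implied by an uptime percentage over one month."""
    total_hours = days_in_month * 24
    return total_hours * (1 - uptime_percent / 100)


# October has 31 days; 97% uptime leaves 3% of 744 hours as downtime.
print(round(downtime_hours(97.0, 31), 1))  # 22.3
```

So 97% uptime works out to a little over 22 hours of downtime spread across the month.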
Additionally, transfer rate was lower this month than last, with an average downstream speed of around 14Mb/s (versus ~17Mb/s for September) and around .75Mb/s upstream (versus .85Mb/s for September). Variation throughout the day increased further, with speeds dropping to around 5Mb/s during the slowest, peak hours.
Average ping on the internal network this month also crept up by roughly 13%.

Based on my observations from this month, I think my conclusion from last month that the network just isn’t designed to handle the load it is under is correct. The day of relative stability when lots of people weren’t here seems to indicate that when the load is low, stability is increased.
The large outage also highlighted the need for proper equipment, which represents a big factor in stability.

SFU ResNet Stability (September)

October 3rd, 2009

I just went through my logs to see when ResNet up here was down or otherwise having issues. Since I moved in September 6th (and started keeping track), there wasn’t a single day where the internet wasn’t out.

Between September 6th and 30th, the service was down briefly 64 times. September 27th was notable for having seven different small outages, while the 17th and 28th saw five each.
Also notable was a general instability for two and a half hours in the late afternoon of the 8th, a few hours of connectivity issues to the internet (but not internally) early on the 10th, two hours of slowness on the 17th and DNS issues on the 27th.
For comparison purposes, over the summer there were many fewer outages (I logged less than eighty minor ones and a handful of larger ones for the whole four months I was here for summer semester) and stability seemed much higher.

Another interesting note was that average ping on the internal network nearly tripled between my logs for the summer and the average ping times for September, with latency to the rest of the internet about doubling. Connection speed between the summer and September’s average (as measured by scheduled connection speed tests) was about halved from around 39Mb/s to around 17Mb/s. There is also much more variation shown over the day, with speeds getting up to the 65Mb/s range in the early morning hours, and dropping to 7-12Mb/s during peak hours.
Upload speeds are even more abysmal than they were in the summer, averaging around 850kb/s, down from an average of around 1.5Mb/s in the summer.

The conclusion that I can draw from this is that the network is just not meant for the load it is being subjected to. Over the summer, when residence was at half-capacity or less, the network was generally fast and quite stable. One cause of the increased latency and decreased transfer rates could be that the outgoing connection is not fast enough for the load it is under, which is likely given that it was (last I had heard at least, it may have changed) a paltry 200Mb/s.

We have T-Shirts!

September 25th, 2009

I’d like to announce the availability of our new t-shirts: “Free as in Rum”.  This is taken from this post on my blog, and came about something like this:

Ffejery: “Is $foobar program Free as in Beer?”

Bob: “Not sure, but I’m sure it’s available on the Torrentz -grin-”

Ffejery: “So what would that make it… ‘Free as in Rum’?”

Behold:

Everything is Free.  This is the Third Kind.  Buy your t-shirts here.

Update: If you don’t get it, read this.

- The Ffejery

Interesting Known Issue for Gmail…

July 10th, 2009

I came across this interesting known issue for Gmail when I was poking around last night trying to figure out another problem.

Google's web email may not load in Google's browser. Use Firefox instead.

A rather interesting (and rather amusing) issue that Gmail won’t load in Chrome. I’ve been using Chrome on and off since it came out (more so now that I have a computer that has enough RAM for it, Chrome seems to like using lots of RAM), and I used to have the issue they’re describing happen to me quite often. It hasn’t happened for months now, so I’m assuming it’s a historical issue with old versions of Chrome.

Source (though there’s no guarantee it’s still there anymore): Gmail Known Issues – Gmail Help

My Experiences with a Virtualized pfSense Router

June 7th, 2009

Here in SFU’s residences, we have mandatory internet provided for us through a service called ResNet (or RezNet depending on where you look). It has been notoriously unreliable over the five semesters I have been in SFU, which I’ll discuss a little more later. The constant reliability issues over my first two semesters prompted me to get a second connection from Shaw so that I would generally always have internet connectivity.
But I got tired of manually swapping the connections between computers when the ResNet one wasn’t working properly. I just happened to have a server of reasonable speed with three network interfaces and a license of VMware Workstation. So it naturally followed that I decided to use that to run a router, but the issue was that I needed something that was good at load balancing, and in my research, Linux routers weren’t the best at it. Enter PFsense.

Initially, I had intended to set up NetBSD on my server to act as a Xen host and then run Linux on top of that for an OS I was familiar with, with NetBSD being a router host with PF, a central part of a router/firewall, running the sharing of the two connections.
Unfortunately, I could not get NetBSD to play nicely with my hardware (nor, in fact, any Linux kernel older than 2.6.27), so I gave up and finally settled on Arch Linux due to the fact that it is almost always up to date. However, it still sat in a configuration where I had to manually change configuration files in order to switch internet connections due to my inexperience with VMware Workstation.
After some reading, I finally figured out that Workstation already had a bunch of built in networking functions, including bridging, which had been my stumbling block. Once I figured out how easy it was to use it to bridge between physical and virtual connections, there wasn’t anything keeping me from using Workstation to host a router.
So I went in search of an OS that had strong load balancing capabilities, and finally settled on PFsense, which even had an already made VMware compatible virtual appliance.

So, with PFsense happily running on Workstation, and my two WAN facing NICs set up not to grab IP addresses for the host OS, I was ready to try PFsense out.
Before I continue, I think a little background on my network layout is probably helpful. I’ve got both internet connections fed into a smart-managed (read cheap but with some advanced functionality) switch and from there, through port-based VLANs to my two onboard NICs in my server. Those are bridged to the WAN side of my PFsense VM (Virtual Machine), and then there’s another hardware NIC going from the LAN side of the VM to the rest of my physical switch, which then connects all my other various devices and computers together. It is a nice arrangement, and allows me to reconfigure my network at will in software rather than in hardware (actually moving cables between ports).

PFsense itself is based on FreeBSD with PF, and has a great web-configuration interface, as well as SSH access and normal console access. Setup itself was easy, based on a wiki article for Windows setup here that also works with Linux with some changes. Once I had it configured for one connection, I followed the tutorials for load balancing here, with the changes that I don’t have any routers between me and the internet.
There is a word of warning here, and that is that running virtualized is less secure than if I was running PFsense on a standalone machine, but I think I’ve mitigated that as much as I can by making sure Linux passes traffic on my two WAN ports directly to the VM and doesn’t do anything else with it. Your mileage, as the saying goes, may vary in this department and it is suggested that you have a NAT router in front of the VM. That is something I was trying to avoid, as I routinely kill $300 pro-sumer/SoHo routers when using them heavily and I needed something that could take the load of my typical usage patterns, including over 1TB of transfer through it in some months.
So after I had the router configured, I proceeded to make a silly mistake (due to not quite understanding what I was doing in PFsense) and mis-set the failover rules so that they would never use the Shaw connection, which, given the then-current levels of packet loss on my ResNet connection, was annoying to say the least. Once I noticed that I’d set both failovers to go the same way (WAN1->WAN2) instead of one each way and fixed it, my internet experience got much better.
PFsense has nice graphs of connection quality, and I thought I’d include one here to illustrate just how bad ResNet was around the time when I first got PFsense up and running. Additional graphs show processor utilization, states, transfer and many other interesting and useful statistics.

2 day quality graph generated using RRDtool through PFsense. The part that actually looks decent was a period of 100% packet loss as the graphs are only generated to the gateway. This can be changed by manually editing a configuration file.

After I got the load balancing working, I set up some policy based routing (as per the tutorial) so that traffic to SFU’s IP ranges automatically goes to the ResNet connection (which ostensibly is connected to SFU’s network through a high-speed link, although I’m unsure about that some days) and when that is down, goes out through my Shaw connection. Additionally, SSL/HTTPS and other encrypted traffic is set to use my Shaw connection (which has had less than four hours of downtime in nine months) unless that is down, to keep banking websites and the like from complaining, as they would if the connection were load balanced normally and hopped between physical connections (with different IPs). Additionally, my Shaw email is set to only use the Shaw connection.
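The decision logic behind those rules can be sketched roughly in Python. To be clear, this is not PFsense configuration and not how PF evaluates rules internally; it is just the policy written out as a function. The network range, the port list, and every name below are placeholders invented for this example (SFU’s real ranges aren’t listed here), and the Shaw-only email rule is left out for brevity.

```python
import ipaddress

# Hypothetical placeholder range standing in for SFU's real IP ranges.
SFU_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
# Assumed examples of ports carrying encrypted traffic (HTTPS, IMAPS, POP3S).
ENCRYPTED_PORTS = {443, 993, 995}


def pick_gateway(dest_ip, dest_port, resnet_up, shaw_up):
    """Return 'resnet', 'shaw', 'balance', or None if both links are down."""
    if not resnet_up and not shaw_up:
        return None
    link_up = {"resnet": resnet_up, "shaw": shaw_up}
    addr = ipaddress.ip_address(dest_ip)

    if any(addr in net for net in SFU_NETWORKS):
        preferred, fallback = "resnet", "shaw"   # SFU traffic prefers ResNet
    elif dest_port in ENCRYPTED_PORTS:
        preferred, fallback = "shaw", "resnet"   # keep one stable source IP
    else:
        # Everything else is load balanced when both links are healthy.
        if resnet_up and shaw_up:
            return "balance"
        preferred, fallback = "resnet", "shaw"

    return preferred if link_up[preferred] else fallback


print(pick_gateway("192.0.2.10", 80, True, True))     # resnet
print(pick_gateway("198.51.100.5", 443, True, True))  # shaw
print(pick_gateway("198.51.100.5", 80, True, True))   # balance
```

The point of pinning encrypted traffic to one link is simply that the source IP stays the same for the life of a session.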

Now for some of my actual experiences using PFsense for the past 5+ months or so.

I have had one issue, though I’m not sure if it’s a PFsense issue or a VMware issue: one time when I restarted my server, PFsense had lost its configuration. Luckily, I had a configuration backed up and was able to easily and quickly restore it, but it was still an annoyance.
I’ve never had PFsense lock up or crash on me that wasn’t a fault of either me directly, or of my messing with something on the host OS. The only other issue I have seen is a marked slowdown when it is initially starting due to Snort hogging all of the system resources to start up. (The VM is running with 384MB of physical memory and 1 core of a Q6600 assigned to it.)
Performance is great. I have greatly increased the number of connections μTorrent allows BitTorrent transfers to use to take advantage of the two connections, allowing speeds to hit ~5.2MB/s down (~41.6Mb/s!) and around 250kB/s up (~2Mb/s). Even when running many torrent transfers (~50) and with nearly 25,000 connections open, the router barely consumes more than 25% of a single core of the 2.4GHz Q6600 and stays under the maximum of 384MB of RAM. The thing that consumes the most resources seems to be the Snort plugin from the PFsense repos.
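For anyone who trips over the bytes-versus-bits conversion in those speeds, here is a small Python sketch of it. The helper names are invented for this example, and it assumes decimal prefixes (1MB = 1000kB) and 8 bits per byte.

```python
BITS_PER_BYTE = 8


def megabytes_to_megabits(mb_per_s: float) -> float:
    """Convert MB/s (megabytes) to Mb/s (megabits)."""
    return mb_per_s * BITS_PER_BYTE


def kilobytes_to_megabits(kb_per_s: float) -> float:
    """Convert kB/s (kilobytes) to Mb/s (megabits), using 1000 kb per Mb."""
    return kb_per_s * BITS_PER_BYTE / 1000


print(megabytes_to_megabits(5.2))   # 41.6
print(kilobytes_to_megabits(250))   # 2.0
```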

A speedtest showing how fast two internet connections can be when load balanced. 15Mb/s Shaw connection (possibly up to 35Mb/s with PowerBoost) and 10Mb/s SFU ResNet connection (possibly up to 100Mb/s, but I've never seen it go over about 50Mb/s on its own).

But the speed isn’t why I wanted two internet connections (though it is a nice bonus), the reliability is. I have never had a time when both connections were down simultaneously, and PFsense wouldn’t mind if I added a third connection from a diverse provider (say Rogers Portable Internet or something similar from a local WISP) for added redundancy.

All in all, PFsense does everything I need it to do and I’ve been very happy with the results. I highly suggest it for straight up load balancing for those needing a higher speed connection (but beware, there are caveats, such as most things other than BitTorrent not running at speeds faster than one single connection) or for those who have a single unreliable connection and don’t have the option of switching providers.

pfSense Router Virtualization part 2 (ISP discussion)

June 7th, 2009

And now for a little bit on my experiences with my ISPs themselves, to give a little more background about why I even needed a second connection.
SFU’s residence has a mandatory internet connection that costs $127.16 per semester (or ~$32/month). During my first year here (2007/2008 school year) it was being run by ConnectWest Networks, a subsidiary of Data Fortress, and there were many reliability issues with it all school year, the largest of which was caused by a virus attack compounding poor network planning. A virus attack, which a friend and I diagnosed and informed ResNet about, but that’s a story for another post.
The unreliability was even worse in the fall semester of 2008: by the time the start of the semester rolled around, the service was completely out for almost two weeks. According to monitoring software I was running (just a version of MTR with aggregate stats, actually) for most of the fall semester, packet loss was an average of about 20% for the semester, although it was mostly grouped into large blocks of huge packet loss.

ResNet's quality since I installed PFsense. The little break is a period when I was having some issues with my server and the large break is from when it was down when I wasn't at school. This is over a six month period, showing twelve hour averages. Over the period, the internet was never out for longer than a couple of hours, which is why there is no spike to 100% loss shown.

This starkly contrasts with my Shaw connection, which has been down less than four hours while I’ve been monitoring it, and has only been down on three separate occasions. Once, and the longest, was about three hours due to a car crash which took out a power pole and Shaw’s plant on it. The other two I don’t know the causes of, but one may be due to someone’s improperly connected TV. Shaw wasn’t sure about that one either.
Oh, did I mention I pay about $35 a month for Shaw (through the student plan) which offers a more reliable connection that is only ~$3/month more money? Additionally, Shaw offered to credit me back every time I’ve called them about internet downtime, whereas for the number of times ResNet has been down/slow/unreliable, I only received a $16 credit for what should have amounted to a full refund for the semester (at home Shaw credited us for two months of somewhat unreliable service when we were among the first people in town to upgrade to DOCSIS).
ResNet is trying to make things better though, undergoing a large infrastructure upgrade to increase the bandwidth locally, which was not nearly enough for the number of people using the network and possibly transferring files between each other. For comparison, I have a faster internal network for my few computers and have deployed a faster network for a few hundred people than they did for around 1800 people.
Additionally, the off campus connection is a measly 200Mb/s (at least, last I knew about) for 1800 people, which gives us a contention ratio of 90:1 (1800 people with a 10Mb/s connection each, sharing 200Mb/s), which is much higher than Shaw, who supposedly is closer to 25:1-35:1 depending on neighborhood. This means that at peak times, the internet slows down quite a bit. I have never noticed a corresponding slowdown with my Shaw connection.
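The 90:1 figure is just the total sold bandwidth divided by the shared uplink, which can be sketched in Python as below. The function names are made up for this example, and it assumes all 1800 residents are sold the same 10Mb/s and that the 200Mb/s uplink figure is still accurate.

```python
def contention_ratio(subscribers: int, per_user_mbps: float, uplink_mbps: float) -> float:
    """Total bandwidth sold to subscribers divided by the shared uplink."""
    return subscribers * per_user_mbps / uplink_mbps


def fair_share_mbps(subscribers: int, uplink_mbps: float) -> float:
    """Bandwidth per person if everyone used the uplink at the same moment."""
    return uplink_mbps / subscribers


print(contention_ratio(1800, 10, 200))        # 90.0
print(round(fair_share_mbps(1800, 200), 2))   # 0.11
```

In other words, if everyone were online at once, each person would get roughly a ninth of a megabit.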

So, all in all, I’m happy with my Shaw connection. But my ResNet connection certainly isn’t worth what I’m paying for it, especially compared to my Shaw connection.
I do, however, appreciate the changes that the new ISP, Urban Networks, is making to improve the situation, but I still have at least one connection interruption about every day on average, though they generally only last a few seconds. How the network performs come fall, when there are many more people on it, remains to be seen.

OS fingerprinting fun with p0f

March 3rd, 2009

So, I’m new to this whole blogging thing, and I haven’t quite gotten the hang of it yet. Apparently I’m supposed to finish what I’m posting about rather than post and then finish.

A friend and I have discussed various sorts of datamining that we could do on the network traffic that firewalls routinely filter out and drop because they’re not actually intended for our particular IP. Somewhere in this discussion, we had talked about doing OS fingerprinting to see what else was on the local network (which is a small campus-sort of network, uses a /21 so only 2048 maximum hosts). But the conversation meandered somewhere else from there and never really got anywhere.

Which brings me to a few days ago. I was sitting in class and thinking about OS fingerprinting some more. I’ve played around with nmap’s OS fingerprinting before, but it is a rather brute force approach, being active. And active scans with nmap are something that people don’t like you doing on their network. So I started thinking about passive OS fingerprinting.
So, I did some searching on the internet and came across a tool called p0f, that does passive fingerprinting on all of the packets that it captures with pcap. Its developers even provide a handy webpage that will try to guess your OS (it’s pretty accurate, unless you’re like me and are running PF with scrub enabled), which you can try out here.

Which led me to decide that it would be interesting to try out. So I set up p0f and set it to listen on an interface outside of my firewall and watched as packets started rolling in, as well as all of the OS guesses.

Which brings me to the results. I ran the scan on a /21 (that’s 2048 IPs) network, and I only saw packets from 513 unique hosts over the run of about a day and a half. I also had a lot of unknown hosts, or ones that I would categorize as unknown based on widely differing results from p0f. In a few cases, I was able to find out the host OS at a specific IP that was listed as unknown by p0f and guessed that other fingerprints that looked the same were the same OS.

So, without further ado, here are the results (broken down by OS family):

143 hosts running XP.
113 hosts running Vista.
24 hosts running Windows XP or newer (so XP/Vista).
282 total running some version of Windows (I’ll explain the missing two in a moment).

54 hosts running OSX.

12 hosts running Linux of some flavour.

163 unknown OSs (of which I think one is running Linux and 6 are Windows).

2 Others.
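As a quick sanity check on the numbers above, here is a minimal Python sketch that re-adds the counts. The counts are copied from the list; the variable names are my own, and the percentages are simply derived from those counts, not new measurements.

```python
# Counts copied from the results list above.
windows_breakdown = {"XP": 143, "Vista": 113, "XP or newer": 24}
windows_total = 282

families = {
    "Windows": windows_total,
    "OSX": 54,
    "Linux": 12,
    "Unknown": 163,
    "Other": 2,
}

unique_hosts = 513              # unique hosts seen during the run
address_space = 2 ** (32 - 21)  # number of addresses in a /21

# Windows hosts that are in the total but not in the three listed groups.
unlisted_windows = windows_total - sum(windows_breakdown.values())

# The per-family counts should add up to the number of unique hosts.
assert sum(families.values()) == unique_hosts

# Share of each family among the hosts seen, in percent.
shares = {name: round(100 * count / unique_hosts, 1)
          for name, count in families.items()}

# Fraction of the /21 address space that showed up at all, in percent.
coverage = round(100 * unique_hosts / address_space, 1)

print("Windows hosts outside the three listed groups:", unlisted_windows)
for name, share in shares.items():
    print(f"{name}: {share}% of hosts seen")
print(f"Hosts seen: {coverage}% of the /21")
```

The family counts add up to the 513 unique hosts, so nothing is double counted between the categories.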

So, hosts of interest. Someone is still running Windows 98SE, and there is also a Windows 2000 SP4 based computer on the network.
Additionally, two people still have PPC based Macs (as evidenced by their earlier, non-Intel version of OSX). One person is running FreeBSD (okay, that’s what I seem to scan as when I’m not scanning as unknown) and one person seems to be running some strange sort of NAS operating system (Synology or something?).
And yes, there are some people with XP SP1 still out there, and most XP users seem to be on SP2 rather than SP3.

I had actually expected to see more hosts than I did, as this network apparently has 1800 people using it. Granted, not all of those will have computers, and not all of them would have been online over the two days or so that I collected data.
I think part of the reason may be that the switches don’t leak packets uniformly, only some of the time, which may have kept me from getting complete data. Additionally, I saw much less traffic from the upper /22 (1024 IPs) than from the lower /22, which may have to do with the way the network is set up. I suspect that there is a router in between the upper and lower halves, which is likely keeping me from seeing a lot of traffic from the upper half (insert graph of IP usage distribution here).
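To make the upper/lower split concrete, here is a small Python sketch using the standard ipaddress module. The 10.0.0.0/21 network and the sample addresses are made up for illustration; they are not the real network or my captured data.

```python
import ipaddress
from collections import Counter

# Hypothetical /21 standing in for the real network.
network = ipaddress.ip_network("10.0.0.0/21")

# Splitting a /21 one bit further gives exactly two /22 halves.
lower, upper = network.subnets(prefixlen_diff=1)


def half_of(address):
    """Return which /22 half of the network an address falls in."""
    ip = ipaddress.ip_address(address)
    if ip in lower:
        return "lower"
    if ip in upper:
        return "upper"
    return "outside"


# Made-up source addresses, as if pulled from a capture.
seen = ["10.0.0.17", "10.0.1.200", "10.0.3.9", "10.0.5.44"]
counts = Counter(half_of(address) for address in seen)

print(network.num_addresses, lower, upper)
print(dict(counts))
```

Counting the hosts seen per half like this is the kind of tally that would show whether one half is underrepresented.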

Now for some analysis of the data. I was expecting to see more Windows machines (I suspect that quite a few of the unknown ones were Windows machines, but I couldn’t get a positive match on lots of the suspicious ones). The mix of XP versus Vista is about what I expected, because I know that lots of new students (first years mostly) get new laptops when they go away to school, and Vista Home Basic is generally the OS of choice.
There were also more Macs than I expected, but that makes sense considering people seem to like Macbooks (I’m not sure why though).
There were a lot fewer Linux hosts than I expected as well, and I’m not sure why that is. Maybe they’re harder to fingerprint and they’re in my unknowns.

I should also add that I have suspicions that some of the unknowns were actually routers. And while I understand that p0f shouldn’t have any issues fingerprinting the OS behind a router, it seemed to be having issues sometimes. I also put hosts that returned multiple OSs into the unknown category; multiple returns likely indicate multiple computers behind a router. Interestingly enough, I saw lots of XP/OSX pairings here, so I think people may have brought their XP desktops with them to school and had also gotten a nice new Macbook to take to class.
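The rule of putting any IP with more than one OS guess into the unknown bucket can be sketched roughly like this in Python. The observations are invented, and the (IP, OS) tuple format is an assumption for illustration, not p0f’s actual output format or the exact process I followed.

```python
from collections import defaultdict

# Invented (source IP, OS guess) pairs; not real captured data.
observations = [
    ("10.0.0.17", "Windows XP"),
    ("10.0.0.17", "Windows XP"),
    ("10.0.1.200", "Windows XP"),
    ("10.0.1.200", "OSX"),
    ("10.0.3.9", "Linux"),
]


def classify(observations):
    """Map each IP to its single OS guess, or 'unknown' if guesses conflict."""
    guesses = defaultdict(set)
    for ip, os_name in observations:
        guesses[ip].add(os_name)
    return {
        ip: next(iter(names)) if len(names) == 1 else "unknown"
        for ip, names in guesses.items()
    }


result = classify(observations)
for ip, label in result.items():
    print(ip, label)
```

In this made-up sample, the address with both an XP and an OSX guess ends up as unknown, which is the pairing I suspect means a desktop and a Macbook sharing a router.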

All in all, p0f seems to work quite well under good conditions (its guesses for connections from others on the network to my router were very accurate) but didn’t do as well under less than stellar conditions. Still, it is a very useful tool, and it was able to identify more hosts than it wasn’t.
Another note, when I was compiling my aggregate statistics, I wanted to be 95% certain (or better) of what an OS actually was, so this probably contributed to the high number of detections that I classified as unknown (I have decided not to post my raw data due to possible privacy concerns, so you’ll just have to trust me on this one).

And what does this mean to the normal person? Well, if someone knows your OS, then they have narrowed down their list of exploits to run against you. And even if you are running something with packet normalization (like PF’s scrub), there are still other ways to fingerprint an OS, though they are much more intrusive. That’s something which I’ll hopefully expand on in a later post.