Comments on “3 TB disks are Here” from Linux Magazine

Samat K Jain

22 Feb 2011

Linux Magazine published an article last week, 3 TB Drives are Here. On Twitter, I originally said it was wrong, but that’s a bit harsh. Parts of it, however, are very misleading, and parts of it are unnecessarily confusing.

The “2.199 TB” limit describes Logical Block Addressing (aka LBA), a scheme for addressing sectors on modern disks. Sectors are numbered 0 to n, where n is a number dependent on the disk’s size (i.e. disk size in bytes divided by sector size). There’s nothing intrinsically limiting about LBA, other than how many bits you can devote to store such an address. With this in mind, the sentence:

The LBA scheme uses 32-bit addressing under the MBR partitions.

is very misleading. I hate to be a grammar nazi, but it’s a misuse of active versus passive voice. This phrasing makes it seem as if LBA is the limitation; it’s not. Master Boot Record (MBR) partition tables are what limit LBA addresses to 32 bits, and (with 512-byte sectors) are what limit partitions to 2.199 TB.
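The 2.199 TB figure falls straight out of the arithmetic. A quick sketch, assuming the traditional 512-byte logical sector (the variable names are mine, for illustration only):

```python
SECTOR_BYTES = 512   # logical sector size assumed here
LBA_BITS = 32        # width of a sector address in an MBR partition entry

max_sectors = 2 ** LBA_BITS
max_bytes = max_sectors * SECTOR_BYTES

print(max_bytes)             # 2199023255552 bytes
print(max_bytes / 10 ** 12)  # size in decimal terabytes (TB)
print(max_bytes / 2 ** 40)   # size in binary tebibytes (TiB)
```

In decimal units that is roughly 2.199 TB; in binary units it is exactly 2 TiB. A wider address field, or a larger reported sector size, raises the ceiling.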

The article then moves to discuss 4 KB sectors. While nothing here is wrong, it ignores the fact that current “4 KB sector disks” on the market (i.e. marketed as “Advanced Format”) do not work in the way described.

Most Advanced Format disks continue to report that their sectors are 512 bytes, a mode called 512e. Because of this, your “4 KB sector” disk still is limited to 2.199 TB when using MBR partition tables (the article, confusingly, implies otherwise).

However, they do use 4 KB sectors internally. That is, requests for sectors 0 and 3 both map, internally, to the same 4 KB sector. There are significant performance problems here: if you request sectors 7 and 8, these internally map to two different 4 KB sectors. This becomes a problem when your filesystem uses 4 KB blocks (i.e. most modern filesystems, including NTFS, ext4, XFS, etc.) that are not aligned to these boundaries: a 4 KB read may cause the drive to unnecessarily read 8 KB. The article does not mention anything about this sector alignment problem.
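To make the alignment problem concrete, here is a small sketch of the mapping, assuming a 512e drive with eight logical sectors per physical sector. The function names are hypothetical, and the two partition start sectors (63 and 2048) are simply common examples of an unaligned and an aligned start:

```python
LOGICAL = 512                 # sector size the drive reports (512e)
PHYSICAL = 4096               # sector size the drive uses internally
RATIO = PHYSICAL // LOGICAL   # 8 logical sectors per physical sector


def physical_sector(lba):
    """Physical 4 KB sector that holds the given 512-byte logical sector."""
    return lba // RATIO


def physical_sectors_touched(start_lba, length_bytes):
    """Number of physical sectors a read of length_bytes at start_lba spans."""
    sectors = (length_bytes + LOGICAL - 1) // LOGICAL
    last_lba = start_lba + sectors - 1
    return physical_sector(last_lba) - physical_sector(start_lba) + 1


def is_aligned(start_lba):
    """True if a partition starting here sits on a physical sector boundary."""
    return start_lba % RATIO == 0


# A 4 KB filesystem block at the start of a partition beginning at sector 63
# straddles two physical sectors; at sector 2048 it fits in exactly one.
for start in (63, 2048):
    print(start, is_aligned(start), physical_sectors_touched(start, 4096))
```

Every 4 KB block in the unaligned partition has the same problem, so the drive ends up reading (or worse, read-modify-writing) twice as much as it was asked to.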

Discussing other operating systems, the article vaguely mentions “several operating systems” have switched to GPT (GUID Partition Tables). I really hate how vague the article is here: as far as I know, the only OS that does this by default is Apple’s Mac OS X. The article sells Linux short when it says:

In the consumer world this is a downside since most motherboards don’t have a BIOS that is GPT capable. This can affect all operating systems including Linux.

because, in fact, most motherboards do have a BIOS that can boot from GPT, especially when you use a hybrid MBR. And Linux, with GRUB 2, works fantastically with them. Unfortunately, compatibility is a crapshoot, and is not advertised. However, all the systems I’ve experimented on, some as old as 2005, worked fine booting from GPT. Where Linux definitely falls short is that no distribution (AFAIK) will set up a GPT for you.

With that in mind, it’s difficult to say:

Linux is ready for 4KB drive sectors with 64-bit LBA addressing

when it really isn’t. The largest obstacle is the sector alignment problem that the article glosses over, best explained by Theodore Ts’o’s Aligning filesystems to an SSD’s erase block size. His post, in short:

Linux partitioning utilities are hard-coded to assume 512-byte sectors, which creates problems for 4 KB-sector disks and disks with larger block sizes (i.e. SSDs)
Various filesystem structures are not aligned to 4 KB boundaries (Ts’o points out LVM)

All of which kill performance, and in the case of SSDs, shorten lifespan.

One thing that bothers me about this article is that while it tries to explain the issues involved with 4 KB sector disks, it does nothing to tell you how to mitigate or avoid any of them. In the next couple of weeks, stay tuned for a few articles from me explaining how you can get around them with Linux.

gpxsplitter: Split GPX files with their waypoints

Samat K Jain

15 Feb 2011

gpxsplitter splits multi-track GPX files, containing waypoints, into individual one-track GPX files with their respective waypoints.

GPX files containing multiple tracks and waypoints jumbled together are produced on export by many GPS units, particularly MTK chipset-based devices such as the Qstarz Q1000 and Transystem i-Blue 474. Separating tracks and their associated waypoints was a headache until gpxsplitter came along. It’s meant to be run first-thing after downloading data from your unit via gpsbabel or mtkbabel. It’s a quick little script written in Python 2.x, with dependencies on mxDateTime and lxml.
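As a rough illustration of the idea (not gpxsplitter’s actual code), the sketch below groups waypoints with tracks by timestamp. It makes several assumptions of its own: Python 3 and the standard library’s ElementTree rather than lxml, GPX 1.1 input, every track point and waypoint carrying a time element, and a waypoint belonging to whichever track’s time span contains it. The function names and sample data are made up.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def parse_time(text):
    # Assumes UTC timestamps without fractional seconds.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")


def group_waypoints(gpx_text):
    """Return [(track name, [waypoint names])], one entry per track."""
    root = ET.fromstring(gpx_text)
    waypoints = [
        (w.findtext("gpx:name", namespaces=NS),
         parse_time(w.findtext("gpx:time", namespaces=NS)))
        for w in root.findall("gpx:wpt", NS)
    ]
    groups = []
    for trk in root.findall("gpx:trk", NS):
        times = [parse_time(t.text)
                 for t in trk.findall(".//gpx:trkpt/gpx:time", NS)]
        start, end = min(times), max(times)
        names = [name for name, when in waypoints if start <= when <= end]
        groups.append((trk.findtext("gpx:name", namespaces=NS), names))
    return groups


SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">
  <wpt lat="35.1" lon="-106.6"><time>2011-02-12T10:05:00Z</time><name>Trailhead</name></wpt>
  <wpt lat="35.2" lon="-106.5"><time>2011-02-13T09:30:00Z</time><name>Summit</name></wpt>
  <trk><name>Saturday</name><trkseg>
    <trkpt lat="35.1" lon="-106.6"><time>2011-02-12T10:00:00Z</time></trkpt>
    <trkpt lat="35.1" lon="-106.6"><time>2011-02-12T11:00:00Z</time></trkpt>
  </trkseg></trk>
  <trk><name>Sunday</name><trkseg>
    <trkpt lat="35.2" lon="-106.5"><time>2011-02-13T09:00:00Z</time></trkpt>
    <trkpt lat="35.2" lon="-106.5"><time>2011-02-13T10:00:00Z</time></trkpt>
  </trkseg></trk>
</gpx>"""

print(group_waypoints(SAMPLE))
```

Writing each group back out as its own one-track GPX file is the remaining step, which this sketch leaves out.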

You can get it from the gpxsplitter repository on gitorious, and the GpxSplitter wiki page is the one-stop place that will collect information about it.

I thought about turning this into a web service, where users can upload their GPX files and have them split, but I’d like to know the demand for such a service before writing it. Ideally, gpxsplitter should be part of gpsbabel or something… but yeah, I’ll save the XML parsing in C for a very, very rainy day.

There are probably any number of bugs. If you find one, please let me know—and send a testcase too!

Play WebM in Internet Explorer 9

Samat K Jain

14 Jan 2011

Update: Google now offers a WebM plugin for Internet Explorer 9, much easier than what I’ve detailed below.

Google’s recent announcement deprecating H.264 for Chrome (see my thoughts on it) means it’s likely that WebM will become the de facto standard for the HTML5 video tag, supported by Internet Explorer 9. Unfortunately, Internet Explorer 9 does not (yet) ship with WebM, despite a lot of misleading PR indicating some kind of “compatibility”.

So, how do you play WebM with Internet Explorer 9?

The easiest way is to use the DirectShow filter pack from Xiph.org. Download and install the installer, available for both 32-bit and 64-bit Windows, and not only will you be able to play WebM/VP8, but also Ogg/Theora, Vorbis, Speex, and FLAC. It’s a royalty-free, open-source standards smörgåsbord!

What do you do next? Of course, submit feedback! Click Send Feedback under Internet Explorer’s Tools menu, and simply ask Microsoft: please support WebM!

Clarification: Don’t install the Support for HTML <video> tag option. It installs an ActiveX control, which requires some extra markup (see the release notes).

Note: Internet Explorer 9 is a beta, as are Xiph’s DirectShow filters. IE9 doesn’t support a lot of <video> tag features, so many demos out there on the Internet don’t work.

Google Chrome deprecates H.264: the right move, but little change for HTML5 video

Samat K Jain

11 Jan 2011

Google has decided to deprecate H.264 in Chrome. This is nothing but good for the future of web video. With support in three major browsers (Firefox 4, Chrome, and Opera) it means that WebM/VP8, instead of H.264, will become the de facto codec for HTML5 video.

I’ve talked to several people who think that this move has killed HTML5 video. I’m not sure I follow the logic — little has changed, except what will become the dominant codec.

You can say it’s made Flash the least common denominator, which ignores the fact that Flash already IS the least common denominator for web video.

Regarding codec fragmentation, little has changed there either: Microsoft’s Internet Explorer 9 and Apple’s various WebKit products still do not have WebM/VP8 support. Content providers wanting to support HTML5 still need to encode to both H.264 and WebM.

With the codec fragmentation problem as yet unsolved, do content providers have any reason to use HTML5 video when Flash still is the least common denominator? Well, Flash is no longer included with Windows 7 or Mac OS X (and was never included with any reputable Linux distribution). Are content providers still willing to force users to download plugins, when they can just use the dominant HTML5 video codec?

I don’t have the answers to these questions, nor does anyone else. Nobody said that the problem of open video would be solved easily or overnight. But focusing on WebM is, in my opinion, a step in the right direction.

In the meantime, WebM is winning, so why don’t you start encoding your videos to WebM now? On SamatsWiki I’ve a sparse page on encoding to WebM (which will work with stock Debian/Ubuntu tools), as well as one on encoding to Ogg Theora. If you’re on Linux, the easiest way to convert videos is OggConvert, an easy-to-use GNOME-based GUI. Publishing them on the Web is just as easy. Check out the HTML5 video chapter in Mark Pilgrim’s Dive Into HTML5, or Jakub Steiner’s How to get your clips on the web.

Hardware review of the Hewlett-Packard ProLiant N36L Microserver

Samat K Jain

10 Dec 2010

[flickr-photo:id=5204509633,size=m]

Low-power systems are popular with enthusiasts everywhere. From the Linksys NSLU2 (thoughtfully also known as “the slug”) to the various Marvell SheevaPlug devices, there isn’t a shortage of options. With all of them, however, you need to make compromises, be it having to deal with ARM’s tics, lack of I/O expansion, bad performance, or lackadaisical manufacturers.

If you’re willing to compromise on: size, but still be much smaller than your average PC; power, but also consume less power than your average PC; performance, but still run circles around an ARM-based device—then take a look at the Hewlett-Packard ProLiant N36L “Microserver”. Introduced September 2010, reviews and photos of this system are few and far between. In this article, I review the hardware aspects of the N36L, while in another, I review its software aspects [coming soon].

Internals

[flickr-photo:id=5204528809,size=m]

The N36L is powered by an x86-based AMD Athlon II Neo processor running at 1.3 GHz intended for low-power systems like netbooks. While it has a slower clockspeed, this AMD CPU typically benchmarks faster than Intel’s Atom 1.6 GHz CPU. For the enterprise crowd, the Athlon II Neo is a 64-bit processor and supports hardware-accelerated virtualization and nested paging. This CPU is ideal for partitioning lightly-used services into lightweight VMs. With two DDR3 DIMM slots, the N36L can accommodate up to 8 GiB of RAM.

Graphics is provided by an integrated ATI Mobility Radeon HD 4200 (which also supports GPGPU/OpenCL via proprietary drivers), and the Gigabit NIC is a Broadcom NetXtreme BCM5723.

[flickr-photo:id=5205136774,size=m]

The mainboard provides a respectable amount of expansion. It has two PCIe slots, an x16 (you could easily use a discrete graphics card, though you’d have to be picky about dimensions) and an x1. Adjacent to the x1 slot is an x4 slot, supposedly for use with HP’s proprietary management card. You could probably hack a conventional x4 card into the slot, but I’d rather HP had made the x4 slot usable and used the x1 slot for its proprietary add-ons (does a management card really need more than PCIe x1?).

The chassis’ disk racks connect via a mini-SAS connector. There’s one internal SATA connector for the 5.25” bay, but the system’s eSATA connector faces outward so your dreams of easily putting six drives in this tiny system are dashed.

There’s an internal USB 2.0 port, a common feature on servers. It makes running an OS off a USB flash drive that much easier—sequestered internally, such a drive won’t accidentally get knocked off.

Externals

The frontside of the N36L is… “server-like”, whatever that means. Along the top are LED indicators for disk and network activity, as well as the system’s backlit power button. There are four USB 2.0 ports along the right side, and an HP logo that glows blue when the system is on. The chassis door is metal (not plastic!), and has a lock.

[flickr-photo:id=5204512873,size=m]

The backside of the N36L is austere. The only ports: two USB 2.0 ports, one D-sub VGA port, a Gigabit Ethernet port, and one eSATA port. There’s a security Kensington lock slot, as well as an “expander slot” for HP’s proprietary management card. The power supply, fortunately, is integrated (power bricks are a pet peeve of mine), and uses a standard AC power cord.

There are two fans: a 120 mm fan for the system’s main cooling, and a 40 mm fan internal to the PSU. Fortunately, both are quiet; HP rates the system at 21 dB. It’s not silent, but it is quiet. There are no top or side vents; air is drawn in through the front and exhausted out the back.

[flickr-photo:id=5204520585,size=m]

Unlike other PCs, the N36L does not use Phillips-head screws for the user-accessible bits. Two sizes of Torx screws are used (I’m unsure of the size), and HP was pleasant enough to include a Torx screwdriver that snaps into the inside of the machine’s front door. Screws for hard disks and the optical disk drive are also screwed into convenient holes in the front door—no little baggies of screws to lose here! There is a single thumbscrew on the top-back to remove the top cover, and two thumbscrews hold the motherboard plate in place.

[flickr-photo:id=5204523307,size=m]

Other than the handle mechanism, which has a metal spring, the N36L’s disk caddies are simple plastic affairs. The plastic does not appear to be particularly high quality, but since the only purpose of the things is to hold disks (and not face the environment), it’s probably good enough.

How much power does the N36L consume? Using my Kill-a-Watt, I measured 60 W on startup, which settled down to 45 W or so after booting and idling. This unfortunately is much more than I’d have liked, but with four spinning disks I suppose it’s reasonable.
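To put the idle figure in perspective, here is the back-of-the-envelope arithmetic for a machine left on around the clock. The 45 W draw is my measurement from above; the electricity price is a made-up placeholder, so substitute your own rate:

```python
IDLE_WATTS = 45        # measured idle draw, from the Kill-a-Watt reading
HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH = 0.10   # hypothetical rate in USD, not a measured value

kwh_per_year = IDLE_WATTS * HOURS_PER_YEAR / 1000
cost_per_year = kwh_per_year * PRICE_PER_KWH

print(f"{kwh_per_year:.1f} kWh per year")
print(f"${cost_per_year:.2f} per year at ${PRICE_PER_KWH:.2f}/kWh")
```

That works out to a little under 400 kWh a year at idle, before accounting for any time spent under load.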

Cons

I’m not trying to be a pessimist by not including a Pros list, but honestly, if you need one at this point you probably don’t need this machine. However, there are some cons I found annoying:

Low height clearance for RAM. I found this out the hard way when my heatspreader-equipped DIMMs would not fit
In the USA, at least, the N36L ships with 1 GiB of RAM, and either 160 GB or 250 GB disk… Which I immediately tossed for 8 GiB of RAM and four Western Digital 2 TB Green series disks. HP could have easily knocked $50 off the price by not including RAM and disk.
No SATA cable included for the 5.25” bay. This is a minor quibble, and was probably done to save that last extra $0.50, but it makes the decision to include the RAM and the disk seem that much more strange.

Conclusion

Why did I get an N36L? The short list:

It’s x86-based. I didn’t want to muck about with ARM—its benefits for me are few.
It can hold four 3.5” disks, and with Gigabit Ethernet functions as a great, inexpensive NAS
Does not come with an operating system—yes, you are NOT paying Microsoft’s Windows tax! Also, all of the hardware in the N36L is well-supported by Linux and free software. Most other systems in this class force a Windows Home Server license on you.
Cheap. I bought the N36L for $320 USD. It’s more expensive than most ARM-based alternatives, but simultaneously much more powerful.

If you’re looking for photos, see my HP ProLiant N36L set on Flickr. And, if you liked this article, please support this site and consider buying the N36L via affiliate link through Amazon (which only has the 250 GB disk model) or Newegg (which has both the 160 GB model and 250 GB model).

Bing Imagery Misaligned at Lower Zooms

Samat K Jain

30 Nov 2010

Microsoft’s Bing aerial imagery (that’s recently been donated to OSM for use in tracing) is offset by a few meters in some places.

For example, earlier this week I traced buildings with imagery from USDA’s NAIP program. Besides a history of being well-rectified, a GPS track in the parking lot confirms that I was spot on. However, on Bing with a low zoom, they’re misaligned:

[inline-old:Bing-Misaligned.jpg]

Zoom in a bit, and they magically align again:

[inline-old:Bing-Aligned.jpg]

Apparently, different, well-rectified imagery is used at higher zooms. If I’ve found one problem, it then follows that it exists elsewhere. Be careful when tracing!

High-resolution text console with uvesafb and Debian

Samat K Jain

9 Nov 2010

While you may rarely use the console on your server, it’s nice to have a high-resolution display just to see that many more columns and rows. Linux’s vesa module (via the vga= parameter) has been around for a while and makes this possible, provided you keep up with what VGA mode number to use and don’t mind the spotty hardware compatibility.

While KMS is the way to do this in the future, it doesn’t help us with the drivers and hardware we have now. A new kernel module, uvesafb, mainlined in 2.6.24, is another, new option. In addition to specifying modes in a more user-friendly way (e.g. 1280x1024-32 for 32-bit color, with a 1280x1024 resolution), hardware compatibility is better—in particular, you can now get a high-resolution text console with NVIDIA display adapters.
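As a small illustration of that mode notation, the sketch below parses the simple width x height - depth form and estimates the resulting console size. It handles only that simple form, the function names are hypothetical, and the 8x16 pixel font is an assumption (it is the usual default console font size, but yours may differ):

```python
import re


def parse_mode(mode):
    """Split a mode string like '1280x1024-32' into (width, height, bpp)."""
    match = re.fullmatch(r"(\d+)x(\d+)-(\d+)", mode)
    if match is None:
        raise ValueError(f"unsupported mode string: {mode!r}")
    width, height, bpp = (int(part) for part in match.groups())
    return width, height, bpp


def console_size(mode, font_width=8, font_height=16):
    """Text columns and rows for a mode, assuming a fixed-size console font."""
    width, height, _ = parse_mode(mode)
    return width // font_width, height // font_height


print(parse_mode("1280x1024-32"))    # (1280, 1024, 32)
print(console_size("1280x1024-32"))  # (160, 64)
```

Compared with the classic 80x25 text mode, that is twice the columns and more than twice the rows.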

In the following, I describe how to use uvesafb on Debian and derivative distributions (e.g. Ubuntu). The instructions assume kernel 2.6.27 or higher (Debian 6.0 (squeeze) and Ubuntu 8.10 (Intrepid Ibex), or later).

OpenStreetMap “Geolocate me” user script

Samat K Jain

20 Jul 2010

[inline-old:OpenStreetMap-Geolocate.png]

OpenStreetMap Geolocate is a user script that adds a “Geolocate me” link next to the OpenStreetMap.org search box. If your browser supports it and you’ve granted permission, clicking on this link will center your map window to your location, as reported by your browser via the HTML 5 geolocation API.

Say you’ve taken your laptop to a new cafe or conference—as soon as you open up OpenStreetMap, you can hit the “Geolocate me” link and quickly see what’s around you, without fiddling with search or endlessly dragging the slippy map. Or, better yet, quickly add what’s missing.

This definitely needs to be built into the OpenStreetMap website.

On most browsers, the geolocation API uses Google Location Services or Skyhook, which determine your location based on nearby wireless 802.11 access points. However, some browsers, like Firefox 3.6 on Linux, can talk to gpsd and your GPS unit, so geolocation can get quite accurate.

I’ve tested it on Firefox 3.6, Chrome 5, and Opera 10.60 (which, interestingly enough, is the first non-beta of Opera that supports geolocation). I’ve been told it also works on Safari 5.

I should make a note in the interest of accuracy: geolocation isn’t actually part of “HTML 5”; it’s a product of the W3C Geolocation Working Group. However, the need to be accurate didn’t keep the XML out of AJAX, and by and large geolocation is one of the technologies people think about when they hear HTML5.

This entry is cross-posted on my OpenStreetMap user diary.

Deciphering Intel’s new X25-M G2 SSD

Samat K Jain

22 Jul 2009

My laptop hard disk is beginning to die. In what seems like perfect timing, Intel has released a refresh of their X25-M solid state disk (SSD) lineup (via Engadget and Ars Technica). The new models offer much over the old ones:

Manufactured on a 34 nm vs 50 nm process
Faster seek times, both read and write, leading to more I/O operations per second (IOPS)
Significantly less expensive (Cited as a 60% price drop, though that’s comparing at-introduction MSRPs. It’s still at least 25% less.)
Greater shock tolerance (1500 G vs 1000 G)
Future TRIM command support, via firmware upgrade. The ATA TRIM command mitigates SSD fragmentation problems that have been the cause of many performance issues.

While die shrinks usually lead to parts that consume less power, the new X25-M uses the same amount of power when active (150 mW), and actually more power when idle (75 mW vs 60 mW). Still, it’s significantly less power than most laptop hard disk drives (my Hitachi 7K200 idles at 800 mW). [Source: Intel’s technical specifications]

Of course, with all these changes, Intel decided to name the drives the same as the old ones, making it difficult for people who want to buy one right now to know what device they’re actually getting.

This kind of inane marketing isn’t new, with the most infamous example on my mind being the Linksys WRT54G. Linksys (so far) has made 6 different revisions of the exact same model, drastically changing the internal hardware throughout the revisions. While most people don’t care, a few did, such as those in the modder community (like myself) who wanted to run modified firmwares. Purchasing anything took a lot of research on the part of the buyer. Manufacturers really should be in the business of making their products easier to buy, not more difficult.

Fortunately, I’ve done the research for you: the new Intel SSDs do have slightly different part numbers, so you can tell the old parts from the new. For example, the old X25-M 80 GB disk has a part number of SSDSA2MH080G1C1, while the newer model has a part number of SSDSA2MH080G201. That is, the part numbers contain either a “G1” or a “G2” corresponding to the revision.
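If you are checking a list of part numbers, the rule is easy to automate. The pattern below is my own guess, inferred only from the two part numbers quoted here: it assumes the revision marker directly follows a three-digit capacity. The function name is hypothetical.

```python
import re


def x25m_generation(part_number):
    """Return 1 or 2 for a G1/G2 part number, or None if no marker is found."""
    match = re.search(r"\d{3}G([12])", part_number)
    return int(match.group(1)) if match else None


print(x25m_generation("SSDSA2MH080G1C1"))  # 1
print(x25m_generation("SSDSA2MH080G201"))  # 2
```

Treat a None result as “go read the spec sheet” rather than as proof of anything.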

With the glowing positive reviews for the X25-M since its introduction a few months ago, its new lower price, and most importantly, the failure of my current laptop disk, I’m going to pick up one of these drives within a week.

Microsoft’s Hyper-V contribution is not outside their agenda

Samat K Jain

22 Jul 2009

If you pay attention to Linux-related news, you may have heard that Microsoft has contributed code adding Hyper-V acceleration to the Linux kernel. This event is not something that falls outside of their corporate agenda (whether it falls outside of their strategy, I’ll let Steve Ballmer voice).

Hyper-V is Microsoft’s hypervisor, included with the server editions of Windows (somewhat similar to VMware Workstation or Sun’s VirtualBox). It lets you run other guest operating systems within the currently running one (called the host OS). Typically, virtualizing guest OSes is slow. To improve performance, rather than virtualizing everything, special drivers and software can be installed into the guest OS to make certain things faster (such as graphics, disk I/O, etc).

The popular Linux hypervisors (Xen, KVM, etc) don’t have special drivers like these for Windows, so they won’t be able to run Windows particularly quickly. With Microsoft’s contribution, Linux now will ship with built-in acceleration for Microsoft’s hypervisor, making Linux run that much faster. If you were an IT shop that simultaneously needed to maximize performance and run both Linux and Windows, would you:

Run an open-source Linux hypervisor, and virtualize Windows (slow)
Run Microsoft’s hypervisor, included with expensive Windows Server licenses, and virtualize Linux (fast)

The answer’s clear. Microsoft’s kernel contribution brings them good PR and satisfies real-world customer demands, while continuing to promote their agenda to make running Windows seem like the best choice. Smart move!