Technology: October 2007 Archives

The problem with some scientific research is not the research itself but the way people choose to use it. What better example than research on the blogosphere itself to show how you can twist a reasonably simple study for self-interested ends or just get it completely back-asswards? The reaction to the study itself is potentially the source of new research into blogger psychology: "I bloviate therefore I am".

A team from Carnegie Mellon University decided to look at how blogs link to each other as part of a wider study into where to put sensors to detect pollution or disease as quickly as possible without spending a shedload of money putting them everywhere. The slightly non-intuitive conclusion is that points with a high overall flow do not provide the best positions - the sensors belong on those small channels that have the largest effect on the whole network.
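
To get a feel for the selection problem, here is a toy sketch of a greedy placement strategy in the same spirit; the event data, the reward function and the names (detection_reward, greedy_placement) are my inventions for illustration, not the CMU team's actual method.

    # Toy greedy sensor placement: repeatedly add the node whose
    # inclusion buys the biggest gain in early detection.

    def detection_reward(sensors, events):
        """Each event maps node -> time the event reached that node.
        An event caught by a sensor scores more the earlier the catch;
        an event no sensor sees scores nothing."""
        total = 0.0
        for event in events:
            times = [event[s] for s in sensors if s in event]
            if times:
                total += 1.0 / (1.0 + min(times))
        return total

    def greedy_placement(nodes, events, budget):
        """Choose up to `budget` sensor nodes by largest marginal gain."""
        chosen = []
        for _ in range(budget):
            base = detection_reward(chosen, events)
            best, best_gain = None, 0.0
            for node in nodes:
                if node in chosen:
                    continue
                gain = detection_reward(chosen + [node], events) - base
                if gain > best_gain:
                    best, best_gain = node, gain
            if best is None:
                break
            chosen.append(best)
        return chosen

    # Five nodes, three spreading events.
    events = [
        {"a": 0, "b": 1, "c": 2},
        {"b": 0, "d": 1},
        {"c": 0, "e": 3},
    ]
    print(greedy_placement(list("abcde"), events, budget=2))  # ['b', 'c']

The marginal-gain loop is the point: a quiet node sitting on a critical channel can add more detection than a busy hub, which is exactly the non-intuitive result described above.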

The team picked blogs as a study area largely because blogs have some interesting parallels with the spread of contagion through a network. They also make that spread easy to study: posts are time-stamped and link to other blogs, so you can trace the flow of 'information' relatively easily.

The researchers picked a large subset of blogs – 45,000 from a possible total of 2.5 million – and crunched through their links, taking account of which links went outside the dataset and which remained inside. They monitored posts that pointed to largish information cascades – effectively blogger pile-ons. To qualify as a cascade, a subject had to accumulate at least 10 posts. That's big enough for a small pile-on in my book.
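
As a rough sketch of what that cascade-counting step might look like – the (blog, subject, timestamp) layout is my guess at a convenient shape for the data, not the paper's format:

    from collections import defaultdict

    def find_cascades(posts, threshold=10):
        """Group posts by the subject they point at; keep only subjects
        that attracted at least `threshold` posts - the 10-post cut-off
        mentioned above."""
        by_subject = defaultdict(list)
        for blog, subject, timestamp in posts:
            by_subject[subject].append((timestamp, blog))
        return {subject: sorted(entries)
                for subject, entries in by_subject.items()
                if len(entries) >= threshold}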

The CMU team then computed which blogs – from the subset they picked – were most likely to take part in blogger pile-ons, as opposed to those whose posts mostly fell outside any cascade. This gave them a cost function that led to a final list of 100 'top' blogs.
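
Something with the flavour of that scoring – a guess at the shape of the cost function, not a reproduction of it – might rank blogs by the fraction of their posts that land inside cascades:

    def rank_blogs(total_posts, cascade_posts, top_n=100):
        """total_posts: blog -> post count; cascade_posts: blog -> number
        of posts that landed inside a cascade. Rank by cascade fraction."""
        scores = {blog: cascade_posts.get(blog, 0) / count
                  for blog, count in total_posts.items() if count > 0}
        return sorted(scores, key=scores.get, reverse=True)[:top_n]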

This is where the fun started. People on the list found that they were on some form of top 100 and started to brag about it. It's scientific, so it must be true, was Neville Hobson's considered opinion. Then people started to wonder why a really weird bunch of blogs was considered to be the researchers' top 100. A commenter at Nick Carr's Rough Type wondered why a blog that had effectively been run off the farm by an angry mob was in the listing. Had they spent about ten seconds looking at the text at the top of the list, they might have realised that the corpus used by CMU came from 2006. That's right, folks: this is not a list of current blogs - only those active up to about a year ago.

There is one other point that those crowing about being on the list might want to bear in mind. If it is any kind of ranking, this is a list of the pile-on addicts of 2006. If you wanted to know where to rubberneck at the biggest accidents on the blogiverse a year ago, these were your go-to guys.

Based on this, I think there is a strong argument for building a feedreader that uses this lot as a filter against your real list of RSS feeds: it would take out the mob rule and leave you with a lot more original information. (To be fair, there are some on the list I would want to keep in the feedreader.)
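
For the sake of the joke, the filter could be as blunt as this sketch; fetch_links is an imagined helper, because real feed parsing is beside the point here:

    PILE_ON_FEEDS = {"http://example.com/pileon.rss"}  # stand-in for the CMU 100

    def filtered_stories(my_feeds, fetch_links):
        """Yield links from my feeds that none of the pile-on blogs
        are currently linking to."""
        noisy = set()
        for feed in PILE_ON_FEEDS:
            noisy.update(fetch_links(feed))
        for feed in my_feeds:
            for link in fetch_links(feed):
                if link not in noisy:
                    yield link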

The irony that this post itself is part of a pile-on is not lost on me.

From a Reuters piece on Microsoft's attempts to put Windows on a crash diet for a sub-$200 laptop:

"We still have plenty of work to do in determining if the highly constrained performance, power, and memory in the first generation XO laptops will be compatible with Windows and popular Windows applications," [Microsoft corporate vice president Will Poole said in an interview].

I think they're still waiting for it to boot.

Hwang's costly law

25 October 2007

Samsung is trying hard to push the idea of Hwang's Law, as seen in the company's latest move to show off an experimental 64Gbit device. In 2002, Hwang Chang-gyu, the head of the Korean company's chip business, gave a speech at ISSCC, one of the chipmaking industry's biggest technical conferences. There, he claimed that flash memories would break away from the prevailing trend in the chip industry and double in size every year. That was something that happened only in the very early days of the business, at the point when Gordon Moore was putting together the graph that became Moore's Law.

For much of its history, the growth in the number of functions you can get onto one chip has wobbled between a doubling every 18 months and one every two years. And a lot depends on how you measure the number of functions – something that Intel has made use of on several occasions. Right now, the rate seems to be a doubling every two years, which fits neatly with Intel's own plans. That may explain why Moore has recently been reminding people that he settled on the two-year rate himself when he revised his original mid-1960s prediction a decade later.
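
The gap between those rates compounds quickly; a few lines of arithmetic show how far apart they drift over, say, six years:

    # Capacity grows as 2 ** (years / doubling_period). Over six years:
    for label, period in [("Hwang, 1 year", 1.0),
                          ("18 months", 1.5),
                          ("Moore, 2 years", 2.0)]:
        print(f"{label}: x{2 ** (6 / period):.0f}")
    # Hwang, 1 year: x64
    # 18 months: x16
    # Moore, 2 years: x8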

But Samsung seeks to break with convention by upping the rate, for flash memories at least, to a doubling every year. And, roughly every autumn, the company has produced an example of a chip that can store twice as much as the previous one. So far, so good.

Samsung's relentless push looks as though it is coming at a cost.