
I hate to reveal the extent of my geekiness, but would like to spare others the hours I spent debugging a problem I ran into last night running Apache 2 under Ubuntu from a flash drive. I run a number of web applications on a home server I have set up that uses a flash or pen drive for its filesystem, an unusual configuration, but one that gives me a silent fan-less server that can run 24/7 in my home office, which also doubles as our guest bedroom.
This week I decided to upgrade the server from an ancient version of Knoppix running Apache 1.3 and PHP4 to Ubuntu 7.10 running Apache 2.2 with PHP5. I followed the excellent directions at PenDrive Linux, and had the new system up and running in about an hour.
However, when I started upgrading my applications and installing them on the new system, I started running into problems where CSS stylesheets and Javascript files were not applied to pages. When I tried to load the CSS stylesheets directly into the browser, Firefox would display a blank page. IE would offer to download them, but then download nothing. I checked the error and access logs, but didn't see anything unusual. I tried a google search, but didn't find anything that seemed on topic. I thought of permission problems, which I always run afoul of in Linux, but after messing around with that for an hour or so, I concluded that wasn't it. Then I thought it might be different line endings, but that turned out not to be the case. Then I thought it might be Apache 2 configuration, which is significantly different with Apache 2 on Ubuntu than it was with the Apache 1.3 that I am used to, so I messed with that a while, without any luck. I did learn new information about permissions, line endings, and Apache 2 configuration, so it wasn't totally wasted time, but it was pretty frustrating.
I finally spent an hour googling around, running into my fellow Berkeley web geek Scot Hacker reporting a similar problem (but no solution) in 2004 with Apache 2.0 with Mandrake Linux. I finally ran across this post on a forum:
OK success! I upped the loglevel to "debug" in httpd.conf. When trying to access images the following messages appeared in error_log: (38)Function not implemented: core_output_filter: writing data to the network
A google search later and I found the following explanation of this error:
"Apache uses the sendfile syscall on platforms where it is available in order to speed sending of responses. Unfortunately, on some systems, Apache will detect the presence of sendfile at compile-time, even when it does not work properly. This happens most frequently when using network or other non-standard file-system. Symptoms of this problem include the above message in the error log and zero-length responses to non-zero-sized files. The problem generally occurs only for static files, since dynamic content usually does not make use of sendfile. To fix this problem, simply use the EnableSendfile directive to disable sendfile for all or part of your server. Also see the EnableMMAP, which can help with similar problems."
After setting "EnableSendfile off" in httpd.conf and restarting httpd it works as it should.
Following that post led me to the Apache doc about sendfile and more useful information on the Apache Wiki, with this wonderful quote:
Regrettably, there are occasions where sendfile will be found at compile time but not work when Apache tries to use it. This happens, in particular, when serving files from certain network file-systems (NFS or SMB, for example) or other non-standard file-systems.
I guessed that my flash drive might be one of those "regrettable" non-standard filesystems, so I added "EnableSendfile off" to my apache2.conf file and restarted Apache. Bingo, all my stylesheets were applied. Problem solved.
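For anyone hitting the same wall, here is a minimal sketch of how you might check an error log for the tell-tale message before touching the configuration. The function name and the sample log lines are my own assumptions for illustration; only the error text itself comes from the forum post quoted above, and it only shows up once the log level is raised to debug.

```python
import re

# Error text quoted in the forum post above (seen with LogLevel debug).
SENDFILE_ERROR = re.compile(
    r"\(38\)Function not implemented: core_output_filter"
)


def find_sendfile_errors(log_text):
    """Return the log lines that contain the sendfile failure message."""
    return [line for line in log_text.splitlines() if SENDFILE_ERROR.search(line)]


# Hypothetical log excerpt; real Apache log lines carry more fields.
sample_log = (
    "[debug] (38)Function not implemented: core_output_filter: "
    "writing data to the network\n"
    "[notice] server resuming normal operations\n"
)

for line in find_sendfile_errors(sample_log):
    print("possible sendfile problem:", line)
```

If a scan like this turns up matches for your static files, "EnableSendfile off" is the first thing to try.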
Lessons learned:
Three years ago, I wanted to put together a new website (SARS Watch Org), and wanted to use a different web host than the one my family site and blog were hosted on. Someone recommended dr2.net, and the price was right, $45 for a year's hosting. While there were occasional problems, the site stayed up even when SARS Watch Org briefly became very popular. Sometime in the second year, I found out that the owner of dr2.net was a teenager named Matthew Eli, but he gave great service, so while I knew that it wouldn't make sense to host a mission critical application on dr2.net, over the years I transferred more and more of the sites I built for family members and friends to dr2.net, and moved all my personal email over to dr2.net. About a year ago Matt merged his operation with a company called Mesopia, which quickly was renamed Netbunch. From my point of view, things stayed the same, so I didn't pay a lot of attention when they announced that the new company had been sold to another company, Web Host Plus (so Matt could go to college). What a mistake.
Web Host Plus, also known as Webhostplus.com, is the gang that couldn't shoot straight of web hosting. Unannounced in advance, they decided to migrate all the sites over to new servers, with different Apache and PHP configurations (and some of them are already listed with RBL as spammers). The first thing I knew about it was when my sites started breaking. And tech support on Webhostplus.com is a joke -- I sent in email and the answer to almost every question about site configuration is "wait for the [un-announced] nameserver change to resolve" or "it's a permissions problem, chmod all your files to 777" -- in other words invite myself to being hacked. Then I stopped getting incoming email to all my email accounts. I spent many hours emailing technical support and posting on the forums without any result until I eventually called the poor sap who is in charge of the old dr2.net customers, Carl Rudel, who got incoming mail turned back on .. for a little while. So far it has gone down 3 times this week, for 2 days at a time in one case. And each time Web Host Plus lost all my mail! It takes a lot of work to be as incompetent as Web Host Plus -- internet email was expressly designed to avoid this. But if you check the forums at Netbunch you can see I am just one among hundreds with similar problems. I don't understand what kind of business model Web Host Plus has that makes it worth it for them to acquire smaller webhosting companies if they drive all the customers away? I fear credit card fraud -- there are certainly allegations of such among all the horror stories I read about Webhostplus when I searched for information about WebHostPlus on Web Hosting Talk.
I have now spent a large part of the weekend doing what I should have done when I first found out that Netbunch had been acquired by Web Host Plus -- backing everything up. I have my mail moved over, but still need to move my sites over. Anybody have any recommendations for a web host that lets you host multiple low-traffic domains on one account? I liked working with a small company where I "knew" who the owner was, and emailed with him, but I've just seen the problem with that. Would I be better off getting an account with someone like Yahoo or Verio? Do you get more competence in return for giving up personalized service? When I put together the web portal for MyTurn during the dot com boom, I found that size of the data center or company was no guarantee of competence either. Hopefully things have changed. Suggestions?
Kudos to Chauncey Thorn for this fun little hack to use Google for geocoding that he posted on a FLOSS mailing list/forum. Google would probably notice, mind, and shut down enterprise level use, but I imagine informal web services like this will put pricing pressure on geocoding vendors, while increasing the number of apps that take advantage of geographic data.
/**
 * @param string $address full address
 * @return array assoc array: $location = array('lat' => $lat, 'lng' => $lng)
 */
function map_geocoder($address)
{
  $address = urlencode($address);
  $response = drupal_http_request('http://maps.google.com/maps?q='.$address);

  // let's find the lat/lng in the $response->data
  preg_match("//", $response->data, $matches);

  // cleaning up
  $match = preg_split('/ /', $matches[0]);
  $lat = str_replace('lat=', '', $match[1]);
  $lat = str_replace('"', '', $lat);
  $lng = str_replace('lng=', '', $match[2]);
  $lng = str_replace('/>', '', $lng);
  $lng = str_replace('"', '', $lng);

  // finally ...
  $location['lat'] = $lat;
  $location['lng'] = $lng;
  return $location;
}
One of the most striking developments in the web over the last year has been the sudden popularity of sites like Furl, Flickr and Del.icio.us, where users can categorize the data or photos they save with keywords, more colloquially called tags. Everybody in what Kellan has called the Internet chattering classes has been talking about tags, and a word for them, folksonomy, has even been coined, discussed and debated. Even Mr. Metacrap himself has signed on as an advisor to Flickr, and can be found on Flickr happily adding metadata to his photos.
I've always been reluctant to rely on someone else to store my data. I tried each service soon after it was released, but didn't find any of them compelling enough to use on a daily basis. Furl I liked, but I was nervous about having all my data stored for me on the net by a company without an obvious business model, and then I found a better way to store data locally using Slogger. Del.icio.us I tried but couldn't make heads or tails of until Joshua Schachter explained it in person at ETech 2004. Flickr I tried at the same ETech, but at the time I was blocking Flash in my browser, so all I ever got was a blank screen. So much for being an early adopter.
However, I have recently started to use Flickr and Del.icio.us on a regular basis. Why? Because they turn out to be great ways of following a conversation on the web. I display the RSS feed for my Del.icio.us subscriptions on one of my personal portal pages, and it updates hourly with what other people have bookmarked about topics that interest me. I couldn't make John Battelle's Web 2.0 conference this year, but in addition to reading the blog coverage and press coverage, I searched Flickr's web20 tag and got a good idea of who I know who was there.
Once, months before the fact that US soldiers were torturing Iraqis at Abu Ghraib was revealed to the world, I came across a site where American soldiers in Iraq were posting photographs on the internet and wrote about it. I wondered at the time what the effects on our democracy would be of soldiers being able to send photos of their experience directly to the citizens, unmediated by our media conglomerates. As we found out from the photos taken at Abu Ghraib and leaked to the press by soldiers outraged at what Bush has done to the proud American army, the effects can be very powerful. But it was still hard for the average American internet user to see photos taken by American soldiers in Iraq -- I still get hundreds of web searchers every day, drawn by my earlier post and looking for soldiers iraq photos.
Tonight when I went to Flickr, I decided to see if any soldiers in Iraq were using Flickr to post photos. I searched on the tag Iraq, and found 135 pictures that people had posted. Not all of them were from soldiers in Iraq -- some were from protests against the war, some of them were political posters for and against the war, but most of the pictures were from soldiers in Iraq. Almost all of the pictures are from who you would expect to have time and ability to post things to the internet, bored REMFs at American bases in Iraq. Still, any American with access to the Internet can go today and see these pictures posted recently by American soldiers at war (click on the pictures to go to the originals on Flickr):
Clearly, while there are points of view expressed in some of these pictures, Flickr is still small enough that no one is trying to game it for political purposes, although I suspect that is coming very soon. However, in a more interesting sign of things to come, besides American soldiers posting their pictures, I found at least one Iranian who had posted a picture of himself atop a destroyed Iraqi tank:
What happens when Iraqis start posting pictures on a (soon to be) popular photo portal where it is easy for Americans (geeks now, general populace to come soon) to find them? What happens when pro- and anti-occupation Iraqis start posting graphic pictures to make their points? What happens when we have an unmediated, high emotional impact, people-to-people conversation with video and pictures? Just like my last post on American soldiers in Iraq posting photos on the web, I don't have the answer to those questions either, except that I know the results have the potential to be as explosive as the Abu Ghraib photos.
If you want to follow the changing nature of our media consumption and communication, I recommend subscribing to the RSS feed for the Iraq tag on Flickr -- I expect it will be a very interesting journey over the next year, no matter who our next president is.
As people who read this site regularly may have noticed, I've been playing with Flickr recently. I don't want to like it -- it is Flash (ugh) based and very trendy -- but in spite of myself, I do. I won't consider using Flickr on a regular basis to store and display pictures until it has IPTC support, because my entire digital photo collection is already tagged using IPTC fields in the JPEG files, but there are indications that IPTC support will be added at some point in the future, hopefully soon. In the meantime, I have been having fun with Flickr.
One of Flickr's more interesting features is that you can search through the public photos displayed on the site by tag, à la del.icio.us. It also allows you to create a "badge" of your photos on your website. However, I'm not interested in seeing my pictures on my website. I want to see other people's photos, and I would like to use the tag to define what I want to see. For instance, I am interested in displaying photos tagged Berkeley on The Berkeley Blog. Flickr provides an RSS feed for each tag, so I can see recent Berkeley photos in my RSS reader if I want to, but it doesn't have a "badge" for tags. So I decided to create an unofficial "badge" to use until Flickr finds time to build an official one.
With a lot of assistance from the flexible MagpieRSS parser and the helpful folks on the magpierss mail list at sourceforge.net, I modified a little script someone drafted to use MagpieRSS to parse the tag RSS feed of your choice, and display the pictures on your site. You can see it in action at The Berkeley Blog and here:
The script itself is available for download if you want to check it out. Enjoy, and if you find it useful you can repay me by politely bugging the Flickr folks to build in IPTC support soon.
Note (11/24/2005): the script stopped working a while ago when Flickr changed their url structure, but I went and got an updated version of the script that Dave Kellan wrote and updated that to account for the new url structure, and all is working again. Thanks for all the improvements, Dave.
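For readers who just want the shape of the idea (parse a tag's RSS feed, then emit linked thumbnails), here is a rough sketch in Python rather than the PHP and MagpieRSS of the actual script. It is not that script. The function name and the assumption that each item carries its image in an RSS 2.0 enclosure element are mine for illustration, and the real Flickr feed may be laid out differently.

```python
import html
import xml.etree.ElementTree as ET


def badge_html(rss_text, limit=5):
    """Build a small HTML 'badge' from the first `limit` items of an RSS feed."""
    root = ET.fromstring(rss_text)
    parts = []
    for item in root.findall("./channel/item")[:limit]:
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        enclosure = item.find("enclosure")  # assumed location of the image URL
        if enclosure is None or not link:
            continue  # skip items without an image or a page to link to
        src = enclosure.get("url", "")
        parts.append(
            '<a href="{}"><img src="{}" alt="{}"></a>'.format(
                html.escape(link, quote=True),
                html.escape(src, quote=True),
                html.escape(title, quote=True),
            )
        )
    return "\n".join(parts)
```

Escaping the title and URLs matters here because the feed is other people's content being pasted into your own page.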
I received an email today from Furl founder Mike Giles, announcing that he had sold Furl to Looksmart. John Battelle, ever quick off the mark, posted that when he talked to Mike about it, Mike explained he picked Looksmart because they wouldn't make him move to California, and because they would allow him to keep developing the Furl service in the way he wanted to.
I liked Furl, although I haven't used it much since I created a home-grown system for saving and indexing web pages to create my personal version of the Internet Archive (more on that later, if anyone is interested). Furl works well, and I haven't noticed any appreciable downtime. I'm a little uncomfortable with Looksmart, because they are one of the few remaining search companies that mixes paid listings in with their search results. However, the great thing about Furl, for which Mike Giles should be commended, is that since very early on it provided functions to export the pages you had saved, and Mike has constantly improved upon them. I've used the functions several times in the past, and they work. After several unpleasant experiences during the dotcom era, I swore never to use a service that doesn't provide an easy way to get my data back onto my local machine. That's one of the reasons Geodog's gmail account sits unused -- while there are plenty of good hacks, there is no supported method to export your email from Google's servers. I feel safe using Furl, even after Looksmart bought them, knowing that no matter how their policies change, I can get a copy of my data any day I want to. If I wanted to, I could use cron and wget to make a backup daily.
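The daily backup idea can be sketched in a few lines. This is only an illustration in Python instead of wget: the export URL is whatever the service gives you, and the file naming, the ".dat" extension, and both function names are hypothetical choices of mine.

```python
import datetime
import pathlib
import urllib.request


def backup_path(dest_dir, day):
    """Name each backup after its date so daily runs never overwrite each other."""
    return pathlib.Path(dest_dir) / "furl-export-{}.dat".format(day.isoformat())


def save_export(export_url, dest_dir, day=None, fetch=None):
    """Download the export once and write it to a dated file; returns the path."""
    day = day or datetime.date.today()
    # `fetch` can be swapped out for testing; by default it does a plain HTTP GET.
    fetch = fetch or (lambda url: urllib.request.urlopen(url).read())
    target = backup_path(dest_dir, day)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(fetch(export_url))
    return target
```

Run from cron once a day, something like this leaves one dated copy of your data per day on your own disk.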
Gmail users, what happens to your email if tomorrow Google mistakenly decides you have violated their TOS and shuts down your account?
Google may at any time and for any reason terminate the Services, terminate this Agreement, or suspend or terminate your account. In the event of termination, your account will be disabled and you may not be granted access to your account or any files or other content contained in your account although residual copies of information may remain in our system.
With Yahoo you can spend $20 a year and buy POP3 access. With Google, you are currently SOL. Think about it. I hope Google and all the other companies providing us with services and eager to make money from the data we create also think about it, and start following Mike Giles' example.
One of the themes of Supernova is decentralization, and in preparation for the conference I've been rereading Steven Johnson's Emergence: The Connected Lives of Ants, Brains, Cities, and Software. I had a strange experience tonight where just before picking the book up I ran into this interesting op-ed, The Best Anti-Terror Force: Us in the Washington Post today arguing for the power of decentralization in so-called "homeland defense".
On Sept. 11, 2001, American citizens saved the government, not the other way around ... While the U.S. air defense system did fail to halt the attacks, our improvised, high-tech citizen defense "system" was extraordinarily successful.
Confronted by a cruel and diabolical surprise that day, those with formal responsibility for protecting our country from air attack could not defend us. ... This is not surprising given that the command-and-control structure required so many baton handoffs in the 77-minute response window between the crashes of the first and fourth terrorist aircraft.
What is surprising is that an alternative defense system, one with no formal authority or security funding, did succeed, and probably saved our seat of government. The downing of United Flight 93 in Pennsylvania was a heroic feat executed by the plane's passengers. But it was more: the culmination of a strikingly efficient chain of responses by networked Americans.
Requiring less time than it took the White House to gather intelligence and issue an attack order (which was in fact not acted on), American citizens gathered information from national media and relayed that information to citizens aboard the flight, who organized themselves and effectively carried out a counterattack against the terrorists, foiling their plans. Armed with television and cell phones, quick-thinking, courageous citizens who were fed information by loved ones probably saved the White House or Congress from devastation.
It is an interesting take on decentralization. I'm not sure that I buy the whole argument, but it is a nice illustration of the power of decentralization, and how rapid a response you can get when people tied together in an efficient decentralized communications infrastructure can do what Johnson argues ants do, which is act locally based on local knowledge.
I had told myself no more geek and/or digerati conferences this year. Too much money spent (and too little coming in) already. Also, I sometimes find being at the conferences difficult because, unlike so many other people there, I don't have anything to sell. I don't have an idea or product that I am trying to promote. I just like to learn. In addition, I'm pretty shy -- I've sat next to Esther Dyson, sat behind John Gilmore, and had lunch sitting next to David Weinberger, all people I admire tremendously, and barely exchanged a word with them.
However, when the opportunity to do a little work for Mike Masnick of Techdirt at Supernova 2004 came up, I jumped at the chance. Reporting is a role I am very comfortable with, and something I enjoy doing. Plus, Supernova 2002 was the first tech future conference I ever went to, and it was one of the best. Kevin Werbach has a pretty good finger on the pulse of technology, and he is very well connected. In the depths of the tech recession, he brought together a lot of very smart and interesting people to talk about the future of technology, and there were some great moments. Certainly a lot of the themes that were discussed at that first conference have blossomed since - decentralization, citizen journalism, (lousy, IMHO) social software, and a world of pervasive connection to the internet (it was the first conference that I went to besides the Wireless Planet conference that had Wi-Fi, or 802.11b as we called it back then). Supernova was also the place where for the first time I got to see in person a lot of the people whose writing I had been reading for years, which was fun. It was where, in spite of my shyness, I made friends with two of the nicest journalists in technology, Mike Masnick and Glenn Fleishman, and met one of the most prominent journalists, if not the cheeriest, Dan Gillmor.
It looks like this time Kevin has assembled another interesting group of smart people as speakers, and looking at the wiki, I imagine that Supernova 2004 will be a reunion of sorts, as well as an opportunity to see and hear a bunch of new people. I'm looking forward to it, to seeing some of you again, and to learning and reporting back on what I learn to those who aren't there.
The only thing I'm not looking forward to is the grief I'm going to catch from some of my online friends who haven't been afforded this opportunity.
I just finished reading Neil Turner's review of the latest version of Firefox, and my first thought is, "I'm not installing that." Of course, I probably will end up doing so at some point, but it is so disappointing to see a project that started with such promise getting worse and worse with every release (although to be fair, it is also getting faster). I'm still running Firebird 0.7 on one of my computers, and on the whole I prefer it to 0.8. If this review and the release notes are accurate, it looks like the situation just worsens with 0.9. The new download dialog foisted on users in 0.8 has been kept, the theme has been changed to one that looks quite ugly and is acknowledged as being worse than the current one, and the disregard for the most popular extensions and current users that was demonstrated when 0.8 was released is strikingly repeated. From the release notes, "when you run 0.9 for the first time all of your extensions will be automatically disabled." There were a lot of comments a year ago about all the problems with design by committee -- now we are starting to see some of the problems with design by dictatorship, and disregard of users. As someone said on the MozillaZine forum, "The capacity of this project to repeatedly shoot itself in the foot never ceases to amaze me." As an open source enthusiast, I find this really disappointing.
I hope that I am wrong, and that when the dust settles there is still a superior product to Internet Explorer in there somewhere, but the current direction isn't promising. At the moment I am considering returning to Mozilla as my default browser, or testing the Opera waters again.
If any of my online friends haven't received a Gmail invite and would like one, I have one to give away -- just send me an email. First come first served, friends or regular readers only. I'm not interested in selling it.
I'm not particularly wowed by the service, but maybe you are as curious as I was to see what all the fuss is about.
Update, June 19, 2004: I came by three more Gmail invites today, and also ran into Jonas Luster's inspiring Gmail for good give-away. I have donated all of my Gmail invites to him, so for those of you arriving every hour from Google, I suggest heading over to his site and letting him know what you are ready to do for the world.
Update August 22, 2004. Just as a sociological note, not one of the visitors from Google took the time to read the note. Not one of them took the time to find my email address on this site. Not one of them sent me an email. Also, of the people I did send invites to, via Jonas Luster's site, not one sent me a thank you. It's a strange world out there.
I've gotten really bored with comments on this entry, so I am closing off any further comments. Good luck finding an invite elsewhere. As I stated above, I recommend Jonas Luster's Gmail for good give-away.
I spent today at the University of California at Berkeley Electrical Engineering and Computer Science Departments Annual Research Symposium. It was a blast, in many ways the academic equivalent of the O'Reilly Emerging Technologies Conference I went to two weeks ago. Instead of the O'Reilly fare of Robots and Quantum Dots and Programmable Matter and Emergent Democracy Worldwide, they had Smart Dust, Electric Clothes (Transistors made from woven textiles), Circuits printed on Plastic and Technology Research for Developing Regions. While some of the subjects were similar to ETech, the crowd and format were very different. While anyone who stumbled across the website in the last month could register and attend for free, the crowd consisted almost entirely of invited academics and members of the research divisions of large corporations, plus a few Europeans and a very large crowd from Finland. Instead of young hackers giving talks then joining the audience, there were graduate students who gave presentations or demos but then went back to their labs/cubes. The conference appeared to be primarily Berkeley CS and EE showing their stuff to current and potential sponsors and collaborators. Nothing wrong with that, and I was delighted with the chance to attend and see the profs and grad students present their research results.
I was very impressed with the breadth of the research being done, and with the number of labs that are scattered around town, working on things as different as extremely low power self organizing sensors connected by wireless networks to very interesting design methodologies for real-time fault tolerant software. I suspect that the people who tied up Sprint's application to put up 3 cell antennas on a building in Berkeley for 2 years have no idea of all the wacky and creative things that the UC wireless researchers are up to with radio in Berkeley.
I probably won't get a chance to write up my notes, but if I don't and you are interested, I highly recommend the three (1, 2, 3) talks mentioned above, all of which are archived on the Berkeley CSEE web site.
I have been a longtime PGP key owner, but have almost never used it. A year ago, while I was talking with the Chandler folk about working for them, I ran into this great essay by Brad Templeton, Returning privacy to E-mail, and ever since then I have kept my eyes open for a more user-friendly approach to encrypting email. Tonight, I ran into this glowing review of a new product, Voltage's identity based encryption, which claimed to go a considerable way to solving the problem that Brad Templeton wrote about a while ago. I looked over Voltage's site, but didn't feel competent to evaluate the new approach myself. The first thing I did was shoot off an email to Bruce Schneier, Counterpane CTO and author of Crypto-Gram, as well as several very good books. I don't expect a personal answer from Schneier, whom I have never met, but I hoped (and hope) that perhaps he will cover the topic in the next edition of Crypto-Gram.
Just for fun, I posed the question on a recently much maligned IRC channel that I sometimes lurk on. Within minutes I had a link to the original paper on identity based encryption, a link to PGP Inc. CTO's critique of the approach, which blew a few good sized holes in it, and an intelligent discussion of it on the IRC channel. After half an hour, I felt like I had a slightly informed opinion on the subject (identity based encryption is not quite the panacea that its proponents claim, because it creates new problems while solving old ones).
The power of the web to harness many minds in common cause still amazes me, even in trivial examples like this one. Of course, there is always another point of view.

I finally put up my photos from the 2004 O'Reilly Emerging Technology conference in San Diego that I have written about. Most of them are just ho-hum, but there is a good picture of Esther Dyson and a good photo of Cory Doctorow and Bill Kearney, as well as a few good photos of the great em3rg1ngl0ft aka Dachb0den Labs party.
Enjoy.
I've been debating about upgrading from Firebird to Firefox, the new Mozilla based browser, but have seen enough problem reports to decide to hold off until after this week's trip to Colorado. Plus, I'm not looking forward to reinstalling all my extensions -- 26 at last count. And the notice on the official Firebird extensions site,
During periods of heavy use, the extensions are taken offline to help maintain a healthy, responsive server. Please check back later! For now, please use Extension Room.
seems especially clueless. As far as I'm concerned, the whole point of Firebird / Firefox is the extensions. I also don't understand why Ben Goodger and the other developers are so opposed to the Tab Browser Extensions feature set. I understand that they have some problems with how the features are implemented, but to reject including or creating a better extension for some of the most popular features of Firebird / Firefox is nuts.
I'd be curious to hear what other Firebird users' upgrade experiences have been.

After 4 days of conferencing, I'm wiped out. I saw Phil Wolff half asleep at the San Diego airport on the way home and thought: "I know exactly how he feels". While some of today's speakers were great, 4 days was just a bit too long for me. I don't think that I could absorb another new idea, or meet another person. My computer is acting funny too, I wonder if it played unsafely with strangers and "picked up" something at ETech. I do know that I need to change all my email and web passwords, since lots of people were actively sniffing the wireless traffic, and the VPN I had set up for the conference didn't work, so I ended up sending them in the clear. Apparently a lot of other people did too -- I heard someone chuckling over all the Orkut login names and passwords he had picked up, and what fun he could have with them.
Hopefully, after re-securing my computing environment, I'll get a chance to digest everything that I heard, go over my notes, and perhaps even post something interesting before I get on the plane for Colorado on Monday. Hopefully.
Dave Sifry is showing some really cool data from Technorati.
See who on your blogroll updated.
Check'em out. Thoughts later.
Today I ran across an entry on Dave Sifry's weblog that noted that Technorati was hiring. I went to Technorati's job page and found a job that looks like it was made for me, as good as the dream job that got away from me a year ago. Excitedly, I sat down and wrote up a cover letter, pointing out all the ways in which I would be a perfect fit for the job (and thinking about how much fun it would be).
At the end, I debated how to include my resume. If Dave is processing applicants through a recruiter, they probably want resumes in Microsoft Word, which seems to be the only format that most recruiters can deal with, for reasons unknown to me. On the other hand, Dave is a longtime Linux geek. I remember that I've exchanged emails with him in the past. I check my email archive -- aha, my last email from him was written using Ximian Evolution 1.4. Better include my resume inline as ASCII as well, with a link to the online version of my resume, I think.
So I send off my resume and cover letter and happily turn to other tasks. Half an hour later I check email, and do my usual quick scan through my junkmail folder. What do I spot but the cc'd copy of my cover letter and resume. I just started testing a new version of the Spamnix spam filter that added Bayesian filtering. Spamnix is a Eudora plug-in based on SpamAssassin. It decided that the email and resume that I sent to Dave was SPAM:
SPAM: ------------------------ Spamnix Spam Report -------------------------
SPAM: Spamnix identified this message as spam. This report shows which
SPAM: rules matched the message and how many points each rule contributed.
SPAM:
SPAM: Content analysis details: (5.4 hits, 5.0 required)
SPAM: 5.4 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
Apparently words like Standards, Marketing, System, Proven, Global, Experience, President, Technologies, Product, Project, Manager, Director, Startups, and Software, which appear in my resume, are also common in the spam I receive. To some degree this makes sense -- resumes and cover letters, like spam, are a form of self-promotion. This is probably one of the reasons that I find job hunting painful, since I was trained to be modest about my accomplishments, and am shy by nature. But it creates a real problem. I can always find a less zealous spam filter, but what about Dave and all the other people I have sent resumes to since I started job hunting again at the beginning of this year? What if their spam filters have been junking my resumes? While that might explain the low response rate, it is a scary thought, and another sign that the volume of spam, and the endless game of spam hide and seek, are breaking email. How can I be sure that my email is reaching the person it is destined for? Should I tone down my resume? Use intentional misspellings in it? In Dave's case, I can go see his talk next week and chat with him personally, but how about all the other resumes I sent out in January? Should I start snail mailing resumes to internet companies?
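To make the mechanism concrete, here is a minimal sketch of how a Bayesian filter can turn a handful of spam-flavored words into a near-certain spam verdict. It uses the simple naive Bayes combining rule, which is an assumption for illustration and not necessarily what Spamnix or SpamAssassin actually does. The per-word probabilities and the function name are made up; a real filter learns its numbers from the user's own mail.

```python
import math

# Hypothetical per-word spam probabilities, P(spam | word).
# These values are invented for illustration only.
WORD_SPAM_PROB = {
    "proven": 0.95,
    "global": 0.90,
    "marketing": 0.97,
    "president": 0.92,
    "director": 0.88,
    "linux": 0.05,
    "weblog": 0.03,
}


def spam_probability(text, word_probs=WORD_SPAM_PROB, unknown=0.5):
    """Combine per-word probabilities with the naive Bayes rule.

    Words are split on whitespace only, so punctuation is not handled.
    Unknown words get a neutral 0.5 and do not move the score.
    """
    log_spam = 0.0
    log_ham = 0.0
    for word in set(text.lower().split()):
        p = word_probs.get(word, unknown)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # P = prod(p) / (prod(p) + prod(1 - p)), computed in log space.
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


resume = "Proven global marketing director and president"
note = "linux weblog update"
print(round(spam_probability(resume), 3))
print(round(spam_probability(note), 3))
```

No single word has to be damning: five words that each lean toward spam multiply together, which is how a resume can land in the 99 to 100% bucket.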
When I started using email in the 80's, it might take 3 days for the person to get your email, but one could be reasonably confident that if you didn't get a bounce message, it had been delivered (and there were also the cheesy Return Receipts). Now, there is no assurance that your email has been delivered, just a reasonable probability that if you didn't use too many spam words, it reached the inbox of the intended user. Probably (I wonder how Pfizer communicates with its salespeople in the field about its best-selling product).
I don't have any brilliant new suggestions on how to combat the flood of spam. I've followed the debates on Politech and IP about whether it is possible to combat it with technological or legal means. I do believe that the longer this flood is allowed to increase exponentially, the more support there will be for draconian technological and legal solutions, and the less likely that email will be a free and easy medium of communication. And we will all be poorer if that happens.
I tried to get a number today and couldn't do it. Then I tried about 5 name + address pairings, and none of them generated a listing. I also used the rphonebook:query and didn't get any results. Anybody know anything about this?
Update: Turns out it was a Google server error. I even got a 500 server error page from Google, which I had never seen before.
As many books as your CD can hold, 609 books, to be exact, are available for downloading and burning onto CD through Project Gutenberg's "Best Of CD" project. And if you have a DVD burner, the numbers are even more amazing. Today, at the Berkeley Public Library, they were giving away DVDs with the text of thousands of books on them.
While I can't imagine reading the books in this form, it is easy to create readable books from the files, and it is certainly easy to search the files for quotes or information. What wealth we have at our fingertips! Of course, as is so often the case, it is easiest for those who are least in need of these resources to access them. As a friend of mine says, "to those who have, much is given."
However, it's a great start, and efforts like Brewster Kahle's Internet Bookmobile are working to bring its benefits to those without a computer or DVD drive.
I've been using Mozilla, then Firebird, as my second browser for over a year now, but stuck with CrazyBrowser added onto IE as my primary browser because I use Surfsaver and Mybase to save pages that I find on the web when I am doing research. Too many times I have found something on the net, made a few notes on it, copied the URL and gone on, only to discover when I go back that the page has been taken down (or changed). Mybase is better at reconstructing pages exactly as they were; Surfsaver has full-text searching across all files. Both of them only work with IE, and I haven't been able to find a similar program that stores web pages in a searchable database that works with Mozilla/Firebird. However, today I ran across a Firebird extension, Iview, that enables you to right click inside Mozilla or Firebird to open that page in IE. Brilliant! It isn't as good as having a Mozilla/Firebird aware web clipping application, but it is functional enough that I can demote IE to the position of backup browser, where it belongs.
Also worthy of note are four other Firebird extensions:
TabBrowser Extensions turns Firebird into a modern tabbed browser. I've been using CrazyBrowser for years and can't imagine using a non-tabbed browser. TabBrowser Extensions adds lots of functionality on top of Firebird's defaults.
RSS Panel turns your Firebird sidebar into a simple but full featured RSS reader. Great!
Flash Click to View replaces Flash objects with a button to click if you want to see them -- greatly reduces the annoyance of certain media pages.
And of course, who could live without the Googlebar.
Firebird is turning into a robust and extremely usable browser, with lots of functionality being added through extensions. Recommended.
I was having some problems with the DNS for one of my web sites, and I got referred to DNS Report, which helped me diagnose the problem. Recommended.
I just found out via Dave Farber's IP list that if you type your phone number into Google, 510-843-0610, you get a reverse directory listing with your name and address, if you have divulged such to the phone company. This is another example of how technology is changing the boundaries of our privacy, and how a difference in degree can also be a difference in quality. Reverse phone directories have been available to businesses for many years, but they were not particularly advertised or easy to get. Now anyone on the planet can find out a lot about me from my phone number. Security experts scoff at security through obscurity, because the obscurity often conceals flaws in the security, and because a determined person can almost always penetrate the obscurity, but they forget that our social interactions often rely on obscurity to create boundaries. Usually, when we give people our phone numbers we don't assume that we are giving them our home addresses as well. Now we are.
Google does provide a fairly painless, although not well advertised, page that lets you opt-out, but unless you know the feature exists, you aren't likely to opt out. I wonder why Google is willing to take the PR hit it will get for this?
And now for some blogtech for news junkies. Dave Sifry has whipped up a cool little addition to Technorati, Current Events. Find out what stories bloggers have been talking about for the last two hours! Find out what they are saying (after Dave bumps the type size up)!
A fun new toy.
The NYT is reporting that AOL is providing software to customers to block pop-ups. The sheer number of ironies in this article is simply delicious, and renders further comment unnecessary:
1) "AOL pioneered the often annoying but effective pop-up format"
2) "10 percent of its users had chosen not to receive pop-ups from AOL's own service, an option that has been harder to find than the new blocking software." Hard to find? Almost impossible.
3) "the number of sites that will accept pop-ups is increasing, including ever more sites owned by AOL Time Warner like Mapquest and CNN.com."