Archive for the ‘code’ Category

quran ubiquity plugin

May 19th, 2009 ahmedre No comments

updated and released the first version of the quran ubiquity plugin! you can go here to install it.

essentially, it contains two commands -
1. search-quran – takes a parameter of what to search for and will show the results that match that particular query. hitting enter will bring up the search results page.
2. get-ayah – takes a parameter of which ayah (ex 2:2) and an optional parameter of the language/translation you want the ayah in (in english – muhsin khan, for example – note that ubiquity will provide suggestions for these). hitting enter will insert the text into the selection space.

this is uber-useful for muslims imho :p perhaps i will try to provide a screencast later on that shows how to use this for those who are still afraid to try it :)

*update* – rather than make my own screencast, i’ve decided to record a set of audio instructions on how to use it.

by the way – if you haven’t used ubiquity before, i highly recommend that you watch this video first. it explains what ubiquity is and gives you an idea of what it is useful for. to put it quite simply, ubiquity is amazing. it’s an indispensable tool for your firefox. watch the video :)

and here is the audio tutorial on the quran plugin for ubiquity.

enjoy!

Categories: code

migrating jaiku to identi.ca or twitter

April 13th, 2009 ahmedre No comments

i recently decided to move my tech tips microblog to identi.ca (the original copy was on jaiku), as i felt it was a little more befitting, actively developed, etc (although jaiku is now open source).

anyway… so i wanted to migrate my posts over – so i wrote a php script to do it (it assumes your jaiku is public and reads it without hassling with oauth).

<?php
$sleepTime = 5;
$jaikuSource = "http://username.jaiku.com/json";

$mode = 'identi.ca';
$baseStatusUrl = 'http://identi.ca/api/statuses/update.json';

// thanks, php-twitter
if ($mode == 'twitter'){
   $baseStatusUrl = 'http://twitter.com/statuses/update.json';
   $headers = array('Expect:', 'X-Twitter-Client: ',
      'X-Twitter-Client-Version: ','X-Twitter-Client-URL: ');
}

$ctr = 0;
$entries = array();

print "destination account username: ";
$username = trim(fgets(STDIN));
print "password: ";
system('stty -echo');
$password = trim(fgets(STDIN));
system('stty echo');
print "\n";

$done = false;
$params = '';
while (true){
   $count = 0;
   $posts = fetchUrl($jaikuSource . $params);
   $json = json_decode($posts, true);
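   // fall back to an empty list if the response could not be decoded
   $stream = isset($json['stream']) ? $json['stream'] : array();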
   $lastEntry = null;
   foreach ($stream as $entry){
      if (isset($entry['comment_id'])) continue;
      $lastEntry = $entry;
      $count++;
      $entries[$ctr++] = $entry['title'];
   }
   if ($count == 0) break;
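   // created_at is taken apart as YYYY-MM-DDTHH-MM-SSZ; explode() replaces
   // split(), which was deprecated in php 5.3 and removed in php 7
   $lastPostTime = $lastEntry['created_at'];
   $ts = explode('-', $lastPostTime);
   $hd = explode('T', $ts[2]);
   $min = explode('Z', $ts[4]);
   // ask for the next page starting one second before the oldest post seen
   $gmtime = gmmktime($hd[1], $ts[3], $min[0], $ts[1], $hd[0], $ts[0]) - 1;
   $params = "?offset=$gmtime";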
}

for ($i = $ctr-1; $i>=0; $i--){
   $params = array('status' => $entries[$i]);
   if ($i != ($ctr-1)){
      print "sleeping $sleepTime seconds\n";
      sleep($sleepTime);
   }
   twitterApiCall($baseStatusUrl, $params);
   print "updated status to: " . $entries[$i] . "\n";
}

function fetchUrl($url){
   $ch = curl_init($url);
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
   $resp = curl_exec($ch);
   curl_close($ch);
   return $resp;
}

function twitterApiCall($url, $args = null){
   global $username, $password, $headers;

   // thanks, php-twitter
   $ch = curl_init($url);
   if (!is_null($args)){
      curl_setopt($ch, CURLOPT_POST, true);
      curl_setopt($ch, CURLOPT_POSTFIELDS, $args);
   }
   if ((!empty($username)) && (!empty($password)))
      curl_setopt($ch, CURLOPT_USERPWD, $username . ':' . $password);
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
   if (!empty($headers))
      curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
   $resp = curl_exec($ch);
   $info = curl_getinfo($ch);
   curl_close($ch);
   if ($info['http_code']!=200)
      print "error - got an http code of: " . $info['http_code'] . "\n";
}

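make sure you edit $jaikuSource (swap in your jaiku username) and $mode as necessary; $baseStatusUrl only needs changing if the api endpoint differs from the defaults above. enjoy!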
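
the timestamp juggling in the loop above could also be done in one step. this is only a minimal sketch, assuming php 5.3+ and that created_at always looks like 2009-04-13T10-20-30Z (the shape the splitting code implies); jaikuTimestamp() is a made-up helper name, not part of the original script:

```php
<?php
// hypothetical helper, not part of the original script.
// assumes created_at is always formatted like 2009-04-13T10-20-30Z (utc).
function jaikuTimestamp($createdAt) {
    $date = DateTime::createFromFormat('Y-m-d\TH-i-s\Z', $createdAt,
        new DateTimeZone('UTC'));
    if ($date === false) {
        return null;   // the string did not match the expected format
    }
    return $date->getTimestamp();
}

// build the same "?offset=" parameter the loop uses for the next page
$oldest = jaikuTimestamp('2009-04-13T10-20-30Z');
print "?offset=" . ($oldest - 1) . "\n";
```

returning null on a mismatch makes it easier to stop paging cleanly if the feed format ever changes.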

Categories: code

ubiquity rocks!

February 18th, 2009 ahmedre 1 comment

today, i felt like playing some more with ubiquity, which i had installed for a while now but had not played around with sufficiently. i decided to try to write a simple plugin that will search the quran for a particular set of words. to do this, i felt obliged to expose an api for the alpha version of quranicrealm first, which was good because i needed to do it eventually anyway.

and here’s the mandatory screenshot:
[screenshot: ubiquity - quran search preview]

it still needs a lot of work… things i still want to do if i get around to it:

  • add a favicon (for the site and for the plugin)
  • more options (ex, “search english,” or “search transliteration,” etc)
  • replace the current text with a link (or translation). this would be useful in im conversations or while writing blog posts.
  • a “get-ayah” command (to say, “get ayah 1 of sura fatiha in arabic,” for example).

anyway, i’ll post up the code when i’ve added some improvements insha’Allah. if you want it before then, post a comment.

Categories: code, islam, technology

faster and better text search

January 25th, 2009 ahmedre 4 comments

i have a set of ~6000 quotes (verses, if you will), along with multiple translations for each of those verses. before, i was searching across these verses using mysql. while this seemed to work, it was very limiting, and i began looking into alternatives.

so i did a little bit of research and tried out lucene and sphinx. for lucene, i specifically used the zend version (i’ll discuss standard lucene (java) towards the end of this post.)

i’ll show the results first, and then explain them after.

the graph above shows a quick overview of the tests run. a set of 3 different queries was run against 4 different backends. the numbers were generated using apache bench (ab) with 100 requests at a concurrency of 1.

backends:
lucene: this was the first implementation. in it, each verse was a “document.” each translation was a property of the document. the total number of documents was thus equivalent to the number of verses.

sphinx: this was the second sphinx implementation (see sphinx alt below for the first implementation). this implementation was just done to make the data model similar to that of lucene, which is exactly what it is. although this ended up being the fastest (by < 5ms in the tests run), i prefer the sphinx alt implementation because it’s closest to that of the database schema.

sphinx alt: although it is named “sphinx alt” in the graph above, this is really the initial sphinx implementation. in this model, a translation of one verse was a document. consequently, the total number of documents was (number of translations) * (number of verses). i sort of like this one most (even though it’s not the fastest) because it is the closest to the current database schema.

mysql: this is sort of the baseline, and, to be honest, it’s not a fair one either. the query used here is along the lines of getting the rows where the text is like ‘%word1%word2%’; this returns far fewer (and less valuable) results than either lucene or sphinx. one would need to do “where text like ‘%word1%word2%’ or text like ‘%word2%word1%’” to get a more accurate estimate, but for baseline purposes, i simply ran the first query. note that the query cache size is 0 (ie the query cache is on but effectively disabled for this set of tests). also note that the text field has a fulltext index on it, although a like query does not use that index (only match … against does).
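
to get the fairer mysql baseline mentioned above, the like patterns for every word order could be generated rather than typed by hand. a rough sketch, where permutations() and likePatterns() are hypothetical helpers and text is the column name from the query above:

```php
<?php
// hypothetical helpers for illustration; not the code used in the benchmark.

// returns every ordering of $words (n! entries, so keep n small).
function permutations(array $words) {
    if (count($words) <= 1) {
        return array($words);
    }
    $result = array();
    foreach ($words as $i => $word) {
        $rest = $words;
        unset($rest[$i]);
        foreach (permutations(array_values($rest)) as $tail) {
            $result[] = array_merge(array($word), $tail);
        }
    }
    return $result;
}

// builds one like pattern per ordering, e.g. %word1%word2% and %word2%word1%.
function likePatterns(array $words) {
    $patterns = array();
    foreach (permutations($words) as $ordering) {
        $patterns[] = '%' . implode('%', $ordering) . '%';
    }
    return $patterns;
}

$patterns = likePatterns(array('word1', 'word2'));
// one "text like ?" placeholder per pattern, joined with "or"
$where = implode(' or ', array_fill(0, count($patterns), 'text like ?'));
print "where $where\n";
print_r($patterns);
```

the patterns would then be bound as parameters to the generated where clause. the number of patterns grows factorially with the word count, and a % or _ inside a search word is not escaped here.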

results:
sphinx wins hands down. however, although it seems that lucene comes in last, this is not really accurate because of the type of mysql query being used. from my limited tests (using a more complicated sql query), lucene and mysql have comparable performance, but lucene of course has the added benefit of more advanced query options, etc.

times for queries 1, 2, and 3, respectively:

  • sphinx: 8.030 ms, 7.542 ms, 8.324 ms
  • sphinx alt: 8.304 ms, 7.898 ms, 11.131 ms
  • lucene: 285.759 ms, 116.222 ms, 224.381 ms
  • mysql: 106.254 ms, 106.561 ms, 108.747 ms

query 1 contained three words (+term1 +term2 +term3), query 2 contained one word (+term4), and query 3 contained two words (+term5 +term6).

additional details:
plain vanilla java lucene is usually faster than zend’s lucene implementation. the largest difference can be noted in indexing times (a few seconds for java versus 15+ minutes in php). if i had to index frequently, i’d use java lucene or sphinx because they are insanely faster.

for example, the first query takes 179.84 ms on average in java (over 100 queries) versus about 272.61 ms on average for php. the second query takes 173.22 ms on average in java versus about 103.30 ms in php. the third query takes 178.98 ms on average in java versus about 214.78 ms in php.

php only won at the second query, which also happens to be the simplest query. two things to note – first, the times here don’t include the jvm or php interpreter start times. these are times reported by taking the time before and after the search call and displaying them. second, unlike the first test, this was all run from the command line and not directly via web (didn’t want to bother setting up tomcat or solr, etc).
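
the “time before and after the search call” measurement could look roughly like this in php. it is only a sketch: averageSearchMs() and dummySearch() are hypothetical names, and the dummy workload stands in for the real lucene or sphinx call, which isn’t shown in this post:

```php
<?php
// sketch only: both function names are made up for illustration.

// runs $search $runs times and returns the mean wall-clock time per call in ms.
function averageSearchMs($search, $runs = 100) {
    $totalMs = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);                       // time before the call
        call_user_func($search);
        $totalMs += (microtime(true) - $start) * 1000;  // elapsed time in ms
    }
    return $totalMs / $runs;
}

// stand-in workload; swap in the real search call here.
function dummySearch() {
    return strpos(str_repeat('lorem ipsum ', 1000), 'ipsum');
}

printf("%.2f ms on average\n", averageSearchMs('dummySearch'));
```

because the timer only wraps the call itself, interpreter startup is left out of the number, which matches how the times above were taken.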

just for fun, i implemented the “sphinx alt” data scheme in java lucene as well and re-ran the 3 tests 100 times each. the results were 178.54 ms, 160.20 ms, and 172.72 ms – very much comparable to the java results with the original one-document-per-verse schema.

the summary of this very long post in 2 words: sphinx rocks.

Categories: code, technology

simple is beautiful – command line id3 tagging

March 16th, 2008 ahmedre No comments

generally speaking, my set of mp3s is very well tagged. for my personal mp3s, i used to exclusively use easytag to tag them, and now i use a combination of easytag and amarok (which is totally awesome by the way!)

but sometimes, i have to mass edit id3 tags for mp3s on the server, and i don’t have the luxury of using such gui tools for the editing. as such, i’ve mainly been using id3v2 within some perl scripts to tag mp3s. this turns out to work great, but i also wanted to be able to add album art to the mp3s from the command line.

i couldn’t figure out how to do it using id3v2 (perhaps there’s a way using the custom frames, but nothing extremely simple and obvious from what i was looking at). then i found the solution in the form of id3lib-ruby, a ruby wrapper for id3lib, the same library that id3v2 is based on.

with this, everything turns out to be extremely easy -

require 'id3lib'

tag = ID3Lib::Tag.new('myfile.mp3')
cover = {
   :id => :APIC,
   :mimetype => 'image/jpeg',
   :picturetype => 3,
   :data => File.read('cover.jpg')
}

tag << cover
tag.update!

and that’s it. nice and simple. by the way, a picturetype of 3 denotes a front cover and is the default value (just learned that from a quick search). oh and the output mp3 image cover shows up fine in both linux and on itunes. beautiful!

Categories: code

dealing with timezones in php

March 13th, 2008 ahmedre No comments

so i was working on some code in which i needed to know whether or not it was dst for a given country and/or timezone. luckily, with php 5.2, some sparsely documented (yet very useful) classes were introduced – more thorough documentation can be found here.

so let’s say i want to know whether or not egypt is in dst right now… first i need to know what zoneinfo file egypt uses (for egypt, it’s simple, but this trick is useful for more obscure places, like “isle of man,” for example):

cd /usr/share/zoneinfo
grep -i egypt iso*.tab        # get the iso country code for egypt

# the above command returns 'EG' - so...
grep EG zone.tab
# returns 'Africa/Cairo'

in many cases, a given country has several timezones. often it’s obvious which file you need, but sometimes it’s not. in those cases, i found it helpful to open the binary files and look at the very last line, which gives a hint about the offset of the timezone (in current zoneinfo files it is a posix-style tz string, something like EST5EDT,M3.2.0,M11.1.0 for new york).

anyway… once you have the zoneinfo file that you would use, it’s very easy to find whether or not you are in dst (well, assuming that you know what the standard, non-dst offset from utc is). for example:

$tz = new DateTimeZone('America/New_York');
$date = new DateTime();
$date->setTimezone($tz);
echo $date->format(DATE_RFC3339) . "\n";
echo $date->getOffset()/3600 . "\n";

running this returns the time in new york, and the offset (-4). since the standard est offset is -5 hours, -4 means we’re +1 which means we are currently on dst.

so if you don’t know the standard offset, another trick that you could do is pass some parameters to the new DateTime() constructor – so for example…

$tz = new DateTimeZone('America/New_York');
$date = new DateTime('2008-12-31');
$date->setTimezone($tz);
echo $date->getOffset()/3600 . "\n";

this returns -5, which is the standard (non-dst) offset. so if you don’t know the standard offset for a timezone, you can pass in 2 dates – something towards the middle of the year (july-ish) and something towards the end of the year (december-ish) – and compare the results. if the offsets are different, the place probably has dst.

also, do note that some places have things a little differently – so dst in windhoek, namibia, for example, ends in april and starts in september.
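
the two-date comparison above can be wrapped up as a small helper. this is just a sketch, assuming php 5.2+ with its bundled timezone database; hasDst() and isDstAt() are made-up names, not php built-ins. the second function uses the 'I' format character, which is 1 when dst is in effect at that moment:

```php
<?php
// sketch only: hasDst() and isDstAt() are hypothetical helpers.

// true if the zone's utc offset differs between mid-year and year-end.
function hasDst($zoneName, $year) {
    $tz = new DateTimeZone($zoneName);
    $july = new DateTime("$year-07-01 12:00:00", $tz);
    $december = new DateTime("$year-12-31 12:00:00", $tz);
    return $july->getOffset() != $december->getOffset();
}

// true if the zone is observing dst at the given moment (default: now).
function isDstAt($zoneName, $when = 'now') {
    $date = new DateTime($when, new DateTimeZone($zoneName));
    return $date->format('I') == '1';
}

var_dump(hasDst('America/New_York', 2008));                   // bool(true)
var_dump(isDstAt('America/New_York', '2008-07-01 12:00:00')); // bool(true)
var_dump(isDstAt('America/New_York', '2008-12-31 12:00:00')); // bool(false)
```

comparing the two offsets works in either hemisphere, since it only checks that they differ, though a zone that changed its standard offset mid-year would also show up as a difference.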

Categories: code

sublime rhymes for the times

February 5th, 2008 ahmedre No comments

launched my second facebook app: sublime rhymes! still a little rough around the edges, some things can be a little more intuitive, but for now, i am going to sleep :)

Categories: code, technology

behind the times…

January 30th, 2008 ahmedre No comments

today, i discovered that the quranapp i’d written for facebook has been broken for some time now (mainly due to the changed notification apis, which affects both notification sending and invites). so i’ve finally updated it and it properly works again.

i actually kind of like their changes – their new invite form and friend selector are pretty simple to use and are very feature full (not to mention well documented), and not having to catch return types from sent messages and forward to a confirmation page is always a very nice thing ™.

i guess sometimes, you can’t just write software and forget about it :)

Categories: code, technology

nice bash tip

January 20th, 2008 ahmedre No comments

i never really used this until i ran into this article by accident, but it’s fairly cool…

so:

echo {one,two}.sh

outputs:

one.sh two.sh

this means that you can, as the article says, do something like this:

cp /etc/apache2/httpd.conf{,.bak}

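to back up your apache conf, since the shell expands that to cp /etc/apache2/httpd.conf /etc/apache2/httpd.conf.bak before running it. cool huh? the article has more details and such, but that’s just a summary.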

Categories: code

quranicaudio.com redone!

September 4th, 2007 ahmedre 2 comments

keeping the seo statements from the previous post in mind, audio.islamicnetwork.com has become http://quranicaudio.com/ — it is now running on lighttpd (rather than apache – by the way, so far, i really like lighty masha’Allah) and al7amdulillah, so far, things look good.

quranicaudio.com is one of the web’s largest (if not the largest) collections of cd quality downloadable quran mp3s (and some oggs). check it out!

Categories: code, technology