An interesting bug in DBD::SQLite

For some time, there’s been a bug in my bw-whois script related to the SQLite database implementation. The record expiration feature wasn’t working with SQLite, where it was working fine with MySQL and PostgreSQL. It took me a while to get around to hunting down the problem. I finally did that this past weekend and I found what appears to be a bug in the DBD::SQLite module from Perl’s DBI interface.

When using bound parameters with the dbh->do() function, the parameters aren’t binding properly in some circumstances. I haven’t had the time yet to dig into the DBD::SQLite code to find the problem, but here’s a simple example of the problem:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use DBD::SQLite;

my $dbh;
my $sth;
my $r;
my $q;

sub main {
    print "DBI VERSION: $DBI::VERSION\n";
    print "DBD::SQLite VERSION: $DBD::SQLite::VERSION\n";

    # connect to the database
    $dbh = DBI->connect( 'DBI:SQLite::memory:', undef, undef, { PrintError => 0, AutoCommit => 1 } );
    die "cannot connect to database ($DBI::errstr)\n" unless $dbh;
    print "connected.\n";

    # create and populate an in-memory table
    $dbh->do( "create table foo ( a, b, c )" ) or error("create table: $DBI::errstr\n");
    $r = $dbh->selectrow_hashref("select * from sqlite_master where name = ?", undef, 'foo');
    print "table $r->{name}: $r->{sql}\n";
    $dbh->do( "insert into foo values (?, ?, ?)" , undef, 1, 2, 3) or die "insert: $DBI::errstr\n";
    $dbh->do( "insert into foo values (?, ?, ?)" , undef, 2, 3, 4) or die "insert: $DBI::errstr\n";
    $dbh->do( "insert into foo values (?, ?, ?)" , undef, 3, 4, 5) or die "insert: $DBI::errstr\n";
    print_rows();

    # this should delete the second row (2, 3, 4) because 2 + 3 = 5
    # the parameter is not binding properly so it doesn't work
    my $param = 5;
    $q = "delete from foo where (a + b) == ?";
    print "$q ($param)\n";
    $dbh->do( $q, undef, $param) or die "delete from: $DBI::errstr\n";
    print_rows("select * from foo");

    # this works because it's not using a bound parameter
    $q = "delete from foo where (a + b) == 5";
    print "$q\n";
    $dbh->do($q) or die "delete from: $DBI::errstr\n";
    print_rows("select * from foo");
}

sub print_rows {
    my $query = shift or return;
    print "$query\n";
    $r = $dbh->selectall_arrayref($query) or die "selectall_arrayref: $DBI::errstr\n";
    print ' ' . join( ', ', @$_ ) . "\n" for @$r;
}

main();

Then just to make sure it’s not a bug in SQLite, I tried the same exercise in C. This version works fine:

// test-sqlite.c by Bill Weinman <http://bw.org/contact/>
// created 2012-08-07
// $ cc -o test-sqlite test-sqlite.c -lsqlite3 -std=gnu99 && ./test-sqlite

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sqlite3.h>

void print_rows( sqlite3 * db, char * query );
void errexit ( const char * message );

const char * fn = ":memory:";

int main( int argc, char ** argv ) {
    int rc = 0;
    sqlite3 * db;
    sqlite3_stmt * sth;
    char * sql = NULL;

    // connect to the database
    rc = sqlite3_open(fn, &db);
    if( rc == SQLITE_OK ) printf("sqlte3 database opened successfully.\n");
    else errexit("cannot open sqlite3 database");

    // create the table
    sql = "create table foo ( a, b, c )";
    printf("%s\n", sql);
    if( sqlite3_exec( db, sql, NULL, NULL, NULL ) != SQLITE_OK )
        errexit(sqlite3_errmsg(db));
    print_rows(db, "select * from sqlite_master");

    // populate rows
    sql = "insert into foo values( ?, ?, ?)";
    printf("%s\n", sql);
    if( sqlite3_prepare(db, sql, strlen(sql), &sth, NULL) != SQLITE_OK )
        errexit(sqlite3_errmsg(db));
    for( int row = 0; row < 3; ++row ) {
        for( int col = 1; col <= 3; ++col )
            if( sqlite3_bind_int(sth, col, row + col) != SQLITE_OK ) errexit(sqlite3_errmsg(db));
        sqlite3_step(sth);
        sqlite3_reset(sth);
    }
    if( sqlite3_finalize(sth) != SQLITE_OK )
        errexit(sqlite3_errmsg(db));
    print_rows(db, "select * from foo");

    // this binds the parameter correctly and works perfectly
    int param = 5;
    sql = "delete from foo where a + b == ?";
    printf("%s (%d)\n", sql, param);
    if( sqlite3_prepare(db, sql, strlen(sql), &sth, NULL) != SQLITE_OK )
        errexit(sqlite3_errmsg(db));
    if( sqlite3_bind_int(sth, 1, param) != SQLITE_OK ) errexit(sqlite3_errmsg(db));
    sqlite3_step(sth);
    if( sqlite3_finalize(sth) != SQLITE_OK )
        errexit(sqlite3_errmsg(db));
    print_rows(db, "select * from foo");

    sqlite3_close(db);
}

void print_rows( sqlite3 * db, char * query ) {
    sqlite3_stmt * sth;
    int col_count = 0;

    printf("%s\n", query);
    if( sqlite3_prepare(db, query, strlen(query), &sth, NULL) != SQLITE_OK )
        errexit(sqlite3_errmsg(db));
    col_count = sqlite3_column_count(sth);
    for( int col = 0; col < col_count; ++col) printf(" %s", sqlite3_column_name(sth, col));
    puts("");
    while( sqlite3_step(sth) == SQLITE_ROW ) {
        for( int col = 0; col < col_count; ++col )
            printf(" %s", sqlite3_column_text(sth, col));
        puts("");
    }
}

void errexit ( const char * message ) {
    fputs( message, stderr );
    exit(1);
}

Of course, there’s no equivalent for dbh->do() in the SQLite C interface, but at least I’ve been able to narrow down the bug to the DBI code. And since this same code works fine with the MySQL and PostgreSQL DBI drivers, I’m presuming that the bug is in DBD::SQLite.

Bill Weinman
7 August 2012

Categories: Computers, perl, SQLite

Simple factorial solutions

18 December 2011 3 comments

A few days ago I posted a simple recursive factorial example to my Facebook page. I asked for non-recursive alternatives and I received quite a few responses. Here are some of my thoughts.

The problem with a recursive solution is that it uses a lot of stack memory. Every function call pushes data onto the stack, and this is expensive. While an iterative solution may still have a complexity of O(n), it will still tend to be faster because it’s not pushing the stack on every iteration.

For example, my original factorial example:

long rFactorial( long n )
{
    if(n <= 1) return 1;
    else return n * rFactorial(n - 1);
}

For every iteration this function will make another function call, and when it returns its value, it must return the value from each iteration of the function.

On the other hand, here’s an iterative version:

long iFactorial( long n )
{
    long r = 1;
    if(n > 1) r *= n--;
    return r;
}

For every iteration this function simply multiplies, assigns a value, and decrements an integer. No stack operations at all. My simple experiments on a Core 2 Duo Macbook Air show this to be about twice as fast as the recursive version.

The other problem here is precision. For any sizable value of n, n! is a huge number that cannot be represented by a long integer. For example, 100! would require 158 decimal digits to represent, which would take 525 bits to represent as a binary integer.

This is what bigint libraries are for. The Gnu project has an excellent bigint library called GMP (The Gnu Multiprecision Library).

Here’s how we would code this problem using GMP:

void gmpFactorial( mpz_t result, long n )
{
    long i;
    mpz_t temp;
    mpz_init(temp);
    mpz_set_ui(result, 1);
    for( i = 1; i <= n; i++ ) {
        mpz_set_ui(temp, i);
        mpz_mul(result, result, temp);
    }
    mpz_clear(temp);
}

It’s called like this:

int main( int argc, char ** argv )
{
    long n = 0;
    mpz_t result;

    if(argv[1]) {
        sscanf(argv[1], "%ld", &n);
    }

    mpz_init(result);
    gmpFactorial(result, n);
    gmp_printf("%ld! is %Zd\n", n, result);
    mpz_clear(result);
}

So there are really two problems at work here. One is the algorithm: the iterative solution clearly works better than the recursive version.

The other problem is precision: Factorials can get big really fast. Here the solution is to use a bigint library.

How AppleCare Saved Me $1,711 (+tax)

3 October 2011 1 comment

This weekend Apple replaced my computer. They were unable to repair it, so they replaced it. And they didn’t replace it with a refurbished computer of the same two-year-old model, they replaced it with a brand-new current model. The computer they were replacing was a top-of-the-line “late-2009” i7 iMac (Quad-core 2.8GHz, 8GB RAM, 1TB HDD, 512M Radeon 4850) from two years ago, and the replacement is a top-of-the-line “mid-2011” i7 iMac (Quad-core 3.4 Ghz, 8GB RAM, 1TB HDD, 1GB Radeon 6970).

This is a fantastic policy that is far more generous than I would have expected, and by itself it makes the AppleCare worth it. I paid about $180 for the AppleCare policy and it provided me what amounts to about over $1,500 net-cost upgrade. I checked on Gazelle and they would have paid me $688 for my old iMac (yes, I could probably get more on Craigslist, but for argument’s sake let’s just use this number) and Apple sells my new model online for $2,399 +tax. That’s a difference of $1,711 (+tax). As I said, I spent $180 on the AppleCare policy. I could do that nine times and still be ahead.

There are definitely some things that Apple does that are mind-numbingly, unfathomably, stupid (including many of the ways they handled this repair incident), but overall I am very happy with the results. I paid for an extended warranty, which I normally don’t do, and they took very good care of me. In fact, they did much better than I would have expected, and even better than I would have hoped for in the best of scenarios.

I’ve not always recommended AppleCare. In fact, I’ve even laughed at myself for buying it in the past. I’ve only justified it because I depend on my computers so much and didn’t want to be faced with a costly repair at a time when I may have not had much cash on hand. But with this result I can honestly say that, where other vendor’s extended warranty policies may not, AppleCare makes economic sense.

From this perspective, it seems an extraordinary value.

–Bill Weinman, October 2011

Simulating dynamic typing in Objective C

27 September 2010 2 comments

Objective C implements polymorphism by using a generic object pointer type (id) and by requiring methods that provide information about the capabilities of a class implementation.

Objective C’s NSObject superclass implements methods like isMemberOfClass and respondsToSelector. These methods allow us to simulate dynamic typing by inspecting an object to see what it can do. It’s important to draw a distinction between what a class is and what it can do. The idea of polymorphism is to use a class without concern for the exact definition of that class, as long as it can do what you need it to do.

Recently, I was playing with SQLite under iOS and I decided to write a generic wrapper that could do something like this:

- (NSNumber *) SQLdo:(NSString *) query, ...;

This method would allow me to write arbitrary SQL queries with arbitrary numbers and types of parameters, using C’s variadic parameter system for the list of parameters. The implementation looks like this:

// SQLdo:query,...
// executes a non-select query on the SQLite database
// bound parameters are passed in the variadic argument list
// objects in variadic list are tested for type
// Return value is the number of affect rows
- (NSNumber *) SQLdo:(NSString *) query, ... {
    int param_count;

    // NSLog(@"%s: %@", __FUNCTION__, query);
    va_list args;
    va_start(args, query);

    [self openDB];
    const char *cQuery = [query UTF8String];
    sqlite3_stmt *statement;
    
    // preparing the query here allows SQLite to determine
    // the number of required parameters
    if (sqlite3_prepare_v2(database, cQuery, -1, &statement, NULL)
            != SQLITE_OK) {
        NSLog(@"SQLdo: could not prepare statement (%s)",
            sqlite3_errmsg(database));
        return [NSNumber numberWithInt:0];
    }

    if ((param_count = sqlite3_bind_parameter_count(statement))) {
        for (int i = 0; i < param_count; i++) {
            id o = va_arg(args, id);

            // determine the type of the argument
            if ([o respondsToSelector:@selector(objCType)]) {
                if (strchr("islISLB", *[o objCType])) { // integer
                    sqlite3_bind_int(statement, i + 1, [o intValue]);
                } else if (strchr("fd", *[o objCType])) {   // double
                    sqlite3_bind_double(statement, i + 1, [o doubleValue]);
                } else {    // unhandled types
                    NSLog(@"SQLdo: Unhandled objCType: %s", [o objCType]);
                    return [NSNumber numberWithInt:0];
                }
            } else if ([o respondsToSelector:@selector(UTF8String)]) {
                // string
                sqlite3_bind_text(statement, i + 1, [o UTF8String], -1,
                    SQLITE_TRANSIENT);
            } else {    // unhhandled type
                NSLog(@"SQLdo: Unhandled parameter type: %@", [o class]);
                return [NSNumber numberWithInt:0];
            }
        }
    }

    va_end(args);
    sqlite3_step(statement);
    if(sqlite3_finalize(statement) == SQLITE_OK) {
        return [NSNumber numberWithInt: sqlite3_changes(database)];
    } else {
        NSLog(@"SQLdo: sqlite3_finalize failed (%s)",
            sqlite3_errmsg(database));
        return [NSNumber numberWithInt:0];
    }
}

Notice the for loop for processing the variadic list. va_arg() is used to get the next object using Objective C’s generic object pointer type, id. This is then tested using respondsToSelector to find out if it’s numeric. All the numeric types respond to the objCType selector. The UTF8String selector is used for providing SQLite’s TEXT type.

This is a really useful paradigm, and it was relatively easy to implement using Objective C. I’m looking forward to extending it while I build a generic SQLite wrapper for my iOS apps.

Categories: Programming Languages

On Choosing a Programming Language

1 April 2010 14 comments

People often ask me, “What language should I learn? PHP, Perl, ASP, Python, Java, C, C++, Objective C?” It seems like what they’re really asking is, “What is the ‘best’ language?” And the answer is that there is no “best.”

The most common reason for choosing one language over another is that a client, customer, or environment may demand one or a selection from a small set of languages. That makes the choice easy. But what if there are no such restrictions imposed? If you really have free choice, how do you choose one language over all the others?

Different languages have different strengths and weaknesses. They’ve usually been designed for a particular purpose, or to replace another language or set of tools that became cumbersome or inefficient after years of use.

For example, the C language was designed in the early 1970s as a general purpose language for writing operating systems and systems-level tools. Before C, most of this work was done in assembly language. In 1972, Dennis Ritchie was working on new operating systems at Bell Labs and needed a language that could be easily ported to different architectures and operating systems. Existing languages were either too big or not flexible enough. So, he created C (named after its predecessor, B). Because C is small, powerful, and portable, it remains remarkably popular after 40 years of use.

By the mid 1980s, a lot of systems administration tools were being created using a combination of shell scripts and other tools such as awk, sed, and grep. In 1987, an engineer named Larry Wall decided that there was a need for a more cohesive and robust tool for this purpose, so he created Perl. Since then, Perl has grown and evolved, and remains a viable language for scripting and even for mid-sized web application projects.

Today there are hundreds of languages available, dozens of which have a large enough footprint to be considered for any given project. So, given the chance, how do you choose?

Criteria for Choosing a Computer Language

First, what is your application? You will need a different kind of language for writing a desktop application, such as a word processor or a web browser, than if you are building system administration utilities, such as a whois client or configuration manager, or building web tools, such as a content management system or a database maintenance backend.

Desktop applications tend to be resource-intensive and use a lot of operating system features, such as graphics, widgets, dialog boxes, etc. These applications often require the use of a software development kit (SDK) that matches your operating system, so you will likely need to use a language that’s compatible with that SDK. For Microsoft Windows, that could be C++, or for Mac OS X it could be Objective C. Sometimes you can get around that limitation and use a different language, but it’s usually a lot more convenient to use the language that matches the SDK.

For web applications, a scripting language is usually called for. The dynamic nature of the scripting languages (Perl, PHP, Python) works well for web applications. Web servers tend to be dynamic creatures and a language that can easily move from one environment to another can be a real advantage. Also, these applications tend to get updated frequently, so the short development cycle of a scripting language that doesn’t require a separate compilation step helps a lot.

Sysadmin tools, like web applications, can also benefit from the more dynamic interpreted languages. Perl and Python were designed specifically for this purpose. Python’s superior object model makes it an ideal choice for larger sysadmin and web projects, while Perl’s long legacy and large installed base makes it a great choice for projects that require a larger programming staff or ready availability of skilled developers.

So, What Language Should You Learn Next?

Why can’t I just answer your question and tell you what language to learn?

If you’d like to learn Perl or Python, you’re in luck! I’ve just finished my Perl 5 Essential Training course on lynda.com, and I’m working on a similar Python 3 course to be released later this year. So I’m happy to recommend those languages. But I would never recommend them for systems work.

If you want to become a well-rounded programmer, I will always recommend that you start as low-level as possible, which for most people is C. You will get a great introduction to how computers work and develop skills that will serve you well as you learn other languages later on.

Where you go from there depends on your goals. For applications coding, I suggest C++ or Objective C; for web or sysadmin work, I suggest Perl and Python (both, not either). Object-oriented (OO) programming is here to stay, so C++ and Objective C will serve you well for years to come, and Python’s OO model is superb. Perl is simply essential because so much existing code is written in it, and it’s a great language for smaller projects that don’t require a lot of indirection or OO patterns.

Please note my biases here: I’ve never been a fan of PHP or ASP. These languages exist for purposes that have little to do with their merits, and their design and implementation show this. The question asks for my opinions, and my recommendations necessarily reflect them. That said, I have worked in both PHP and ASP and will continue to do so when clients or development environments require those languages.

The bottom line: learn as many languages as you can and pay attention to their distinctions and commonalities. As you do so, you will necessarily develop an eye for patterns and paradigms that can be borrowed from one environment and shoe-horned into another, making you a more effective and more efficient programmer. You will also gain satisfaction from your growing mastery along the way.

—Bill Weinman, April 2010