Extended info from "pewbot"

LibraryThing’s Tim Spalding has been in touch with me and made some suggestions, which I’ve now added to pewbot.

If you pop /extended onto the end of a request, then pewbot will return a richer set of information – e.g.:

http://webcat.hud.ac.uk:4128/pewbot/0750603054/extended/
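
If you wanted to build that URL in a script, something like this would do. It's only a rough sketch: the helper name is made up, and the base URL and trailing slash are simply copied from the example above.

```python
# Base URL copied from the example request above.
BASE_URL = "http://webcat.hud.ac.uk:4128/pewbot"


def extended_url(isbn: str) -> str:
    """Build the /extended request URL for an ISBN (hypothetical helper)."""
    return f"{BASE_URL}/{isbn.strip()}/extended/"


print(extended_url("0750603054"))
```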

The attributes returned for each ISBN are:

  • count
    the number of borrowers who borrowed both books
  • totalDays
    the total number of days that elapsed between each item being borrowed, added up across all the borrowers
  • sumDays
    the sum of days, where loans of the second item before the first count as negative and loans afterwards count as positive
  • ckoBefore
    the number of borrowers who borrowed the second item first
  • ckoSame
    the number of borrowers who borrowed the second item at the same time
  • ckoAfter
    the number of borrowers who borrowed the second item afterwards

At the time of writing, the first result for 0750603054 is:

<isbn count="51"
      totalDays="701"
      sumDays="697"
      ckoBefore="1"
      ckoSame="47"
      ckoAfter="3">0443040591</isbn>

51 borrowers took out both 0750603054 and 0443040591, and 47 of them took both out at the same time.  3 others took out 0443040591 after 0750603054, and only 1 took it out before.  So, if you were going to borrow both books then the chances are you’d be borrowing them at the same time.  In other words, if someone wanted to borrow one of the books, then you could strongly recommend that they borrow the other at the same time …or if one of the books was unavailable, then the other might make a good alternative choice.
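
For anyone who wants to pull those numbers out in code, here is a minimal sketch using Python's standard library. It only handles a single isbn element pasted in as a string (I'm not assuming anything about the rest of the response), and the function name is just for illustration.

```python
import xml.etree.ElementTree as ET

# The first result for 0750603054, copied from above.
SAMPLE = """<isbn count="51"
      totalDays="701"
      sumDays="697"
      ckoBefore="1"
      ckoSame="47"
      ckoAfter="3">0443040591</isbn>"""


def parse_suggestion(xml_text: str) -> dict:
    """Turn one <isbn> element into a dict of integers plus the suggested ISBN."""
    elem = ET.fromstring(xml_text)
    record = {name: int(value) for name, value in elem.attrib.items()}
    record["isbn"] = elem.text.strip()
    return record


suggestion = parse_suggestion(SAMPLE)
# Share of borrowers who took both books out at the same time.
same_share = suggestion["ckoSame"] / suggestion["count"]
print(suggestion["isbn"], f"{same_share:.0%} borrowed both at the same time")
```

Here 47 of the 51 borrowers works out at roughly 92%.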

Let’s look at another one — this is a suggestion for ISBN 0333577698:

<isbn count="20"
      totalDays="5205"
      sumDays="-99"
      ckoBefore="9"
      ckoSame="0"
      ckoAfter="11">0713145226</isbn>

This time, no one borrowed both at the same time — instead 9 borrowed 0713145226 first, and 11 afterwards.

If we divide the total number of days by the number of borrowers, we get about 260 days (5205 / 20). In other words, on average 260 days elapse between the two books being borrowed.

At first, this looks like we might not want to suggest to someone that they borrow both at the same time.  However, if we divide the sum of days by the number of borrowers then we get just -5 days (-99 / 20). This is because roughly the same number of people borrowed one before the other and vice versa (11 vs 9).  So, that means that we might want to suggest both at the same time after all.
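
The two averages are easy to work out in a few lines. This is just a sketch of the arithmetic above; the function and argument names are mine, not anything pewbot provides.

```python
def average_days(count: int, total_days: int, sum_days: int) -> tuple:
    """Return (average gap, average signed gap) in days per borrower."""
    if count <= 0:
        raise ValueError("count must be positive")
    return total_days / count, sum_days / count


# Figures for the suggestion 0713145226 shown above.
avg_gap, avg_signed = average_days(count=20, total_days=5205, sum_days=-99)
print(f"average gap: {avg_gap:.2f} days")            # 260.25
print(f"average signed gap: {avg_signed:.2f} days")  # -4.95
```

A large average gap next to a signed average near zero is the pattern described here: the loans are far apart, but with no consistent order.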

Here’s another suggestion for the same ISBN:

<isbn count="18"
      totalDays="11976"
      sumDays="11976"
      ckoBefore="0"
      ckoSame="1"
      ckoAfter="17">0138914257</isbn>

We can see that the majority of people (17) borrowed the suggested ISBN afterwards, and only 1 person borrowed both at the same time.  Not only that but, on average, there was a gap of 665 days (11976 / 18) between the two loans.  From that, we could deduce that if you were going to borrow 0333577698, then you might want to borrow 0138914257 at the same time, or more likely you’d want to borrow it at some point in the next couple of years.

One way to visualise all this is to imagine a bunch of students who are taking a 3 year course on the Java programming language.  During the entire course, they might all borrow “Java for Dummies” and “Advanced Java Development”, but (given the titles) you wouldn’t expect them to borrow both at the same time — maybe they’d get “Dummies” in the first year and “Advanced” in the last year?

So, “Advanced” would be a good suggestion to make if they’d already borrowed “Dummies”, but probably not the other way around.

Also, looking at the pewbot suggestions we might find that lots of people borrowed “Introduction to Java” at about the same time as borrowing “Dummies”, so that would make a good suggestion for someone thinking about borrowing “Dummies”.

Phew – hope all that kinda makes sense!

In summary:

  • if we wanted to suggest books that would be good to borrow at the same time, then look for a high ckoSame value or an average sumDays close to zero
  • if we wanted to make suggestions based on a user’s previous loan history, then look for a high ckoAfter and/or a positive average sumDays figure
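
Those two rules of thumb could be sketched in code along these lines. Everything specific here is an assumption on my part: the function name, the labels, and especially the three thresholds, which are illustrative guesses rather than values pewbot uses.

```python
def classify(count, sum_days, cko_same, cko_after,
             same_share=0.5, near_zero_days=30, after_share=0.75):
    """Label a suggestion using the two rules of thumb from the summary.

    The thresholds are illustrative guesses, not values used by pewbot.
    """
    labels = []
    avg_signed = sum_days / count
    # Rule 1: high ckoSame, or an average sumDays close to zero.
    if cko_same / count >= same_share or abs(avg_signed) <= near_zero_days:
        labels.append("borrow together")
    # Rule 2: high ckoAfter and/or a clearly positive average sumDays.
    if cko_after / count >= after_share or avg_signed > near_zero_days:
        labels.append("borrow later")
    return labels


# The three examples from this post.
print(classify(count=51, sum_days=697, cko_same=47, cko_after=3))
print(classify(count=20, sum_days=-99, cko_same=0, cko_after=11))
print(classify(count=18, sum_days=11976, cko_same=1, cko_after=17))
```

With these particular thresholds, the first two examples come out as "borrow together" and the third as "borrow later", which matches the reasoning above. Different thresholds would give different answers.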
2 comments
  1. Leto said:

    Great site. Was going to subscribe to the feed however all options are ‘Comments for “Self-plagiarism is style”’. No general posts feed? 🙂

  2. Davey P said:

    Thanks for spotting that Leto — looks like the old feed URL stopped working after I upgraded from WordPress 2.01 to 2.02. It should be fixed now.
