Collective Intelligence – Part 4: Calculating Similarity

This is the fourth of a series of posts on the topic of programming Collective Intelligence in web applications. This series of posts will draw heavily from Santam Alag’s excellent book Collective Intelligence in Action.

These posts will present a conceptual overview of key strategies for programming CI, and will not delve into code examples. For that, I recommend picking up Alag’s book. You won’t be disappointed!

Click on the following links to access previous posts in this series:

Determining Similarity using a Similarity Matrix

The essential task in developing collective intelligence is determining similarity between things – between users and items, between different items, and between groups of users.

In Collective Intelligence, this typically involves computing similarities in the form of a similarity matrix (or similarity table). A similarity matrix compares the values in two Term Vectors, and computes the relative similarity between comparable entries in each term vector. Please refer to this previous post, for a brief introduction to terms and term vectors.

In chapter 2 of his book, Alag calculates similarity tables using 3 basic approaches:

Cosine-based similarity
Correlation-based similarity
Adjusted-cosine-based similarity

I’m not going to get into the specific differences between the different methods, but I will provide a general example (from Alag’s book) to illustrate the approach.

User similarity in rating Photos

The example Alag gives involves 3 different users rating 3 different photos. They express their ranking of a photo as a number between 1 and 5. These ratings are displayed in the table below:

Image may be NSFW.
Clik here to view.

If we were to calculate a similiarity matrix (using the cosine-based approach) comparing how similar the photos are to each other, we’d get the following table:

Image may be NSFW.
Clik here to view.

This table tells us that Photo1 and Photo2 are very similar. The closer to 1 a value in the similarity table is, the more similar the items are to each other.

You can use the same approach to calculating similarity between users’ preferences for the photos. If we do the calculations, we get the following results:

Image may be NSFW.
Clik here to view.

Here we see that Jane and Doe are very similar.

In Alag’s book, he details the specific algorithm for caclulating each of the above similarity tables, and shows the different results obtained using the 3 methods listed above (i.e. cosine-based, correlation-based, and adjusted cosine-based methods). He also provides examples based on user ratings for photos, as well as user ranking of articles based on which articles they bookmarked. However, the basic approach is the same as illustrated above.

In Summary

In this post we looked at the basic task of calculating similarity between items and users. In the next post, we’ll look at the specific scenario of extracting intelligence from tags.

Also in this series

Image may be NSFW.
Clik here to view.

Collective Intelligence – Part 4: Calculating Similarity

Determining Similarity using a Similarity Matrix

User similarity in rating Photos

In Summary

Also in this series

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112