From Facemash to Songmash: How I Take Mark Zuckerberg’s Drunken Idea and Turn It Into Mine (Part I)

1. The Drunken Idea

“I’m a little intoxicated, not gonna lie. So what if it’s not even 10 pm and it’s a Tuesday night? The Kirkland facebook is open on my desktop and some of these people have pretty horrendous facebook pics. I almost want to put some of these faces next to pictures of farm animals and have people vote on which is more attractive.”  (9:48 pm) 

“Yea, it’s on. I’m not gonna do the farm animals, but I like the idea of comparing two people together. It gives the whole thing a very “Turing” feel.” (10:17 pm) 

These were the lines that Mark Zuckerberg, drunk and alone in his Harvard dorm room, wrote before hacking into the night: breaking into the university’s private student data, collecting images of thousands of female students, and putting them on his site, facemash.com, for the residence’s voyeurs to vote on who was hotter.

The site was an instant hit: it received a dazzling 22,000 page views in its first hours and swept through the Harvard campus like an avalanche, at one point temporarily bringing down the university’s network. Of course, it was a despicable idea; Zuckerberg was later accused of violating student privacy and was almost expelled from Harvard. However, the notoriety of Facemash turned out to be one of the greatest things that ever happened to Mark Zuckerberg, as it made his name at a university full of greats, and to the future of social media, as he later built on this simple yet brilliant idea to develop a small social network site that you might have heard of: Facebook.

The film The Social Network (2010), directed by David Fincher and dramatizing the rise of Facebook, has been hailed as one of the greatest films of the 21st century, and the scene in which Zuckerberg builds Facemash stands out: it was the first time I wholly immersed myself in programming and saw the elegant beauty in a simple idea. For me, coding has always been about joy: having one idea you are genuinely passionate about and scaling it with lines of code to see how much impact it can have.

That’s why in this edition of BraveBits’ monthly blog, I am going to try to do what Zuckerberg did and build my own version of Facemash. However, here are a few notes up front to reassure you all:

  1. I was not intoxicated while writing this blog.
  2. I did not intrude on any private data, nor did I bring down the company’s system.
  3. This blog is divided into two parts, taking you through the process of building my site, from ideating to deploying. Once you finish reading this (and inspecting the code), I hope you will be inspired to build your own X-mash site, with X being whatever you desire and the tech stack being whatever you want. I love music, so here we go with Songmash.

2. Core Functionality, and How a Chess Algorithm Comes Into Play

At its simplest, Songmash uses the core idea of comparing two things, with songs being the target here. The app picks two random songs from a set list of chosen artists, lets the user choose one over the other, then picks two more, and so on indefinitely. The users’ choices are then fed into an algorithm, and the songs are ranked accordingly.
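The "pick two random songs" step can be sketched roughly as below. This is only an illustration of the idea, not the actual Songmash code: the `MatchupSong` shape and the `pickPair` name are hypothetical, and the random source is passed in so that the function can be tested.

```typescript
// Hypothetical minimal song shape; the real app stores more fields.
interface MatchupSong {
  id: string;
  title: string;
}

// Pick two distinct songs uniformly at random.
// `random` is injectable so the behaviour can be checked deterministically.
function pickPair(
  songs: MatchupSong[],
  random: () => number = Math.random
): [MatchupSong, MatchupSong] {
  if (songs.length < 2) {
    throw new Error("Need at least two songs to make a matchup");
  }
  const first = Math.floor(random() * songs.length);
  // Draw the second index from the remaining n - 1 slots, then skip over `first`.
  let second = Math.floor(random() * (songs.length - 1));
  if (second >= first) {
    second += 1;
  }
  return [songs[first], songs[second]];
}
```

Drawing the second index from the remaining slots avoids the "re-roll until different" loop and guarantees a song is never matched against itself.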

That is all there is to it: dead simple as an idea, yet powerful, since you can plug anything in as the prefix of “mash”. The only question you might have right now is: what is the aforementioned algorithm?

The Elo rating system, named after its creator Arpad Elo, is familiar to anyone with an interest in chess. It was originally invented as an improvement over the chess-rating systems used before it, but it is also applied to a variety of other sports and games. Basically, it is used to calculate the ratings of players in competitive matches. One player’s loss of points is the other’s gain, and the less a player is expected to win a match, the more points they receive if they actually come out on top.

For the sake of brevity, we are not going to delve into the algorithm and what it all means at the moment (this article here explains it well enough). However, I am going to give a general explanation so that you have an idea before actually coding. Let’s say we have two players, A and B, competing with each other.
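For reference, in the standard Elo formulation the expected scores are usually written as follows (the base 10 and the scale of 400 are the conventional chess constants, assumed here rather than taken from the Songmash code):

$$
E_a = \frac{1}{1 + 10^{(R_b - R_a)/400}}, \qquad E_b = \frac{1}{1 + 10^{(R_a - R_b)/400}}
$$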

Ea and Eb are the expected probabilities that each player wins over the other, while Ra and Rb are the current ratings of A and B. Basically, after each match you want to calculate a new rating for both sides based on those expectations. In the event that A wins over B, the new ratings are calculated with the second set of equations:
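In the standard formulation, assumed here to be the one this post relies on, that update is usually written as:

$$
R'_a = R_a + K\,(S_a - E_a), \qquad R'_b = R_b + K\,(S_b - E_b)
$$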

with R’a and R’b being the new ratings. S denotes the match result: S=1 for the winner, S=0 for the loser, and S=0.5 for both players in a draw. The K factor is a little more confusing, and statisticians still debate which value should be used. It is basically a constant that controls how much a single result can move a player’s rating. If K is set too large, the system is too sensitive to a few recent results, with a large number of points exchanged in each game; if K is too low, the sensitivity is minimal, and the system does not respond quickly enough to changes in a player’s actual level of performance.

Here, we are going to follow the USCF (United States Chess Federation) convention for K, within a rating model that makes use of a logistic distribution as opposed to a normal distribution.

Applied to our Songmash app, the “players” are the songs: each user’s choice determines which of two songs is the better one, and the whole set list is then ranked accordingly.
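To make the rating step concrete, here is a minimal sketch of a standard Elo update applied to two songs. The function names are hypothetical, and `K_FACTOR = 32` is an assumed value chosen purely for illustration; this post does not state the exact K it uses.

```typescript
// Assumed K value for illustration only; the post does not state the one it uses.
const K_FACTOR = 32;

// Expected score of a song rated `ratingA` against a song rated `ratingB`.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// `scoreA` is the result from A's point of view: 1 = A wins, 0 = A loses, 0.5 = draw.
// Returns the new ratings as [newA, newB].
function updateRatings(
  ratingA: number,
  ratingB: number,
  scoreA: number,
  k: number = K_FACTOR
): [number, number] {
  const expectedA = expectedScore(ratingA, ratingB);
  const expectedB = expectedScore(ratingB, ratingA);
  const newA = ratingA + k * (scoreA - expectedA);
  const newB = ratingB + k * ((1 - scoreA) - expectedB);
  return [newA, newB];
}
```

With two songs both rated 1400, `updateRatings(1400, 1400, 1)` moves the winner to 1416 and the loser to 1384. The points one song gains are exactly the points the other loses, and an upset (a low-rated song beating a high-rated one) moves both ratings further than an expected result does.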

And now that we are clear on what we are doing and why, let’s look at how to do it.

3. Bob the Server-Builder and How He “Hacked” Into Spotify

In this section, we are going to discuss how I built the server, how I got the initial data for Songmash, and what the ultimate twist was. However, let me begin by saying that this is not a how-to-make-an-app tutorial. I am just going to skim through the process of building the server and jump right into the fun part.

But first, a quick look at the infrastructure of my server:

Database and Server

Database Schema of Songmash. One collection for songs, one collection for artists, and one for holding the head-to-head score between any two songs.
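As a rough illustration of those three collections, the document shapes could look something like the sketch below. Every field name here is hypothetical (the post only names the collections themselves), so treat it as one possible layout rather than the actual Songmash schema.

```typescript
// All field names are hypothetical; only the three collections come from the post.
interface ArtistDoc {
  id: string;
  name: string;
}

interface SongDoc {
  id: string;
  title: string;
  artistId: string; // reference to ArtistDoc.id
  rating: number;   // current Elo rating
}

interface HeadToHeadDoc {
  songAId: string;
  songBId: string;
  winsA: number; // times song A was chosen over song B
  winsB: number; // times song B was chosen over song A
}

// Order the two ids so that each pair is stored only once, whichever way it is shown.
function headToHeadKey(idX: string, idY: string): [string, string] {
  return idX < idY ? [idX, idY] : [idY, idX];
}

// Return a copy of the record with the winner's tally incremented.
function recordWin(record: HeadToHeadDoc, winnerId: string): HeadToHeadDoc {
  if (winnerId === record.songAId) {
    return { ...record, winsA: record.winsA + 1 };
  }
  if (winnerId === record.songBId) {
    return { ...record, winsB: record.winsB + 1 };
  }
  throw new Error("Winner is not part of this head-to-head record");
}
```

Ordering the ids before lookup means "song X vs song Y" and "song Y vs song X" land on the same head-to-head document.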

The structure of our server code. The backend is written with the MEN stack (MongoDB + Express + Node.js), with the database hosted on the MongoDB Atlas service. The code is divided into three main sections: controllers (respond to user input and perform interactions on the data model objects), models (manage the data of the application) and utils (hold the configurations and middleware).

Hacking Spotify

Along with the database comes the data: our goal now was to feed in song data automatically and systematically. Then comes the fun part: hacking Spotify! Well, what I did was not actually hacking or retrieving any private Spotify data, but I also refrained from using any API it provides. One reason is that its API is wrapped in complex authorization and does not return exactly the information we wanted; the other is that, simply, it took out all the fun! Put yourself in Zuckerberg’s shoes: if Harvard had somehow made its students’ data public, we would never have seen Facemash, and subsequently Facebook. Coding is all about pure joy, remember!

The method I used here was good old web scraping.

What I needed to do manually first was to add all the songs of each artist to their own playlist, since Spotify is somewhat disorganized in the way it structures an artist’s song data.

Once the playlists are available, it’s F12 time! By inspecting the page, I can find the tag and class names of the data I need to scrape. However, as you may notice, Spotify at first prevents us from inspecting the page and getting all the data we need. Damn you, Spotify! Or is that really the end of it?

There is a workaround that I used, which was to use curl requests to read the HTML of our playlist page.

And it worked like a charm! 

Once I had the HTML I needed to scrape the data, a few TypeScript tricks made the data quickly fall into place.
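One such trick might look like the sketch below: a small function that pulls track IDs out of the playlist HTML saved from the curl request. It assumes the HTML contains track links of the form `open.spotify.com/track/<22-character id>`, which is Spotify's public URL shape for tracks; the actual markup and the selectors I used may differ, and the name `extractTrackIds` is hypothetical.

```typescript
// Assumption: the fetched HTML contains links such as
// https://open.spotify.com/track/<22 alphanumeric characters>.
function extractTrackIds(html: string): string[] {
  const pattern = /open\.spotify\.com\/track\/([A-Za-z0-9]{22})/g;
  const seen: { [id: string]: boolean } = {};
  const ids: string[] = [];
  let match: RegExpExecArray | null;
  // exec() with the global flag walks through every occurrence in turn.
  while ((match = pattern.exec(html)) !== null) {
    const id = match[1];
    // The same track can be linked more than once; keep the first occurrence only.
    if (!seen[id]) {
      seen[id] = true;
      ids.push(id);
    }
  }
  return ids;
}
```

The function takes the HTML as a plain string, so it works the same whether the page came from curl, a saved file, or a test fixture.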

And voila! We have an up-and-running server, fed with data that we, well, call it hacked, who cares, got from Spotify. We also ran through the reasoning and the engine behind this app and, hopefully, it has inspired you a bit to start writing your own code!

Prenote for Part II

In the next part, we are going to discuss the frontend and, as a sneak peek, the framework I used is not React. Nor Angular. Nor Vue.

The GitHub repository for the server is as follows: link. See you next week!
