Scaling issues: The games table is huge

We store the game data for every game that’s played on the site. We’ve never faced a scaling issue as such but today we hit one, a big one. We’ve now over 2M+ unique automated games played on the site that the db size is now 298 GB and we started off with 300 GB thinking it would be more than enough. The index for the table was lost. I’m guessing this is because MySQL deletes them when the disk space is low? We’ll have to investigate deeply and maybe move the text/blob data to S3 and fetch from there. 

Apology:
I sincerely apologize for the slowness & the downtime of the site. We’re working super hard to fix it. As a quick solution to get back the site we’re expanding the storage to 2048 GB

Once this is done (in the next couple of hours), we’ll be pulling an all-nighter to optimize the game tables further. 

We really love you (all our users!) and we’ll be back with a super slick fast version. In fact, we’re revamping the interface as well which is going live next week. Again, very sorry about what happened. We never expected this and we’re on to fix it now.

I’d love to speak to each of you about your experience with the site and would really appreciate your feedback: vivek [at] hackerrank [dot] com. I promise to look through and reply to each one of them personally.

——
Vivek Ravisankar
Co-founder, hackerrank.com

Stock Prediction

The Stock Prediction Challenge was one of the vaguer challenges in the hackathon. Instead of having a set problem to solve, you needed to buy and sell stocks and try to earn the most money. While approximate challenges often involve trying out different approaches, this challenge especially did, since the future stock prices were unknown. However, a sample stock file was provided with the challenge that contained similar patterns to what was used in the actual contest. You could analyze this data (or use a program to analyze it) to detect the stock movement patterns. You could then write the actual program that would look for those patterns to buy and sell stocks.

Alexander did fairly well on this challenge using a very simple approach. Every day, he sold any stocks that had gone up in price and bought as many stocks as he could that had gone down by the highest percentage. He then worked on improving his approach. He started by selecting and sorting all the stocks that had gone down by at least 4% since the day before. He would then buy as many as possible from the stock that had gone down the most, which would often use up almost all the money. If there was leftover money, he would buy the next-most decreased stock that he could buy, and so on, until he used up his money. This simple strategy was enough to turn his $100 into $80,000 in a  few “years” of trading, and earned him 17th place on the back-to-school leaderboard. You can view Alexander’s code on Github.

The higher-ranking contestants were able to earn even more money by finding  more patterns in the data. While stocks often went up after going down and vice versa, this was not always true. Certain stocks fluctuated more, so it was worth buying more of those stocks when they went down then the less volatile stocks. Sometimes it might have been worth saving money from one turn to the next so you could buy a large number of cheap stocks. There was also another hidden pattern – certain stocks followed other stocks movements 30 days apart. If you realized such a pattern might exist, you could have stored the stock data from the previous 30 days, and then analyzed it to find out which stocks were following the movements of other stocks. This pattern was stronger than the simple up-and-down pattern, so you could have made more money by following it when possible. By using a more sophisticated analysis, you could have earned millions of dollars trading!

This challenge was somewhat “fuzzy” so it’s a bit more like the messy world than the clear-cut logic of some programming challenges. We may have more challenges like this in the future, perhaps with less clear-cut patterns. With enough of these challenges, people will soon be able to write their own algorithms to do real trading! (Note: Similar results are not guaranteed!)

Travelling Blimp Salesman

The Traveling Blimp Salesman challenge was one of the more complex challenges on the hackathon. There are many potential approaches to such a challenge, so one needs to try out different approaches and see what works. There are many different heuristics one could use when solving a standard traveling salesman problem, and these can be modified to work with this challenge. A simple approach is to always just visit the nearest city and continue until all the cites have been visited. This could be modified for this challenge to compare the cost of traveling directly to the next city vs. going back to headquarters first to pick up more blimps.

Many of the top scorers used modifications of this basic approach. Alexander tried many different approaches to the problem, but the more complex ones often ran into the time limit. He ended up using a dual-approach: His algorithm would visit a city and then check if other cities were worth visiting or if it was better to return to headquarters. Once time was running low, his code switched to an algorithm that just visited the cities one-at-a-time in order of the most revenue, until no more cities were worth visiting.  This simple approach was good enough to score in 6th-place on the back-to-school hackathon. You can view Alexander’s code on Github.

Neal Wu used a similar approach to score in first place on the challenge. He started with a randomized greedy algorithm that, after visiting a city, would check for nearby neighbors to visit. It would look at which city would provide the most profit for the next visit and whether that city was worth visiting directly. Since there could be up to 100,000 cities, Neal just checked a random subset of cities each time. After determining what order to visit, the cities, the next step was to figure out the best times to actually return to headquarters. Neal used dynamic programming to do perform these checks, which managed to boost his score into first place.

Some people tried more complex algorithms to solve this problem, but those didn’t end up doing well. Since there was such a large number of cities in the high-scoring test-cases, any overly-complex approach would run out of time trying to solve them. In addition, due to the distribution of cities and the costs of transporting blimps, it was often only worth visiting a small number of cities at a time (sometimes even one city was enough). In future challenges, we will probably use smaller test cases (and good distributions) so more sophisticated algorithms can work well. Approximate challenges can be both practical and highly competitive, so stay tuned for more of them in the future.

Mancala

If you just wanted to get a few points quickly on the hackathon, you could have done the Mancala challenge. Mancala is a simple game, and each turn there are only 6 possible moves. It would only take a minute or two to write some code that just checks which cells are occupied so it can return a random legal move. This code would have been enough to win a few games in the match-off to earn a few points and receive the free $100 AWS credit!

Basic Approach
However, if you wanted to do well on the leaderboard, you would need to do a little analysis before picking a move. Most contestants didn’t have enough time to develop a sophisticated Mancala AI, so you you could try to just out-play their simple bots. adr2370’s approach to Mancala was pretty simple, but was enough to earn 36.6 points for the challenge. He looked at which move would get the most points that full turn, including any extra moves. If two moves had the same score, he picked the move that left the most marbles on your side at the end. This simple heuristic works since marbles on your own side are likely to end up in your own store, either through your own moves, or at the end of the game. (If the other player empties first, you place the marbles in your side in your own store.) Since many bots were using simpler approaches, this code was good enough to beat 71% of the other bots and come in 31st place on the Mancala  Back-to-School Leaderboard. You can view adr2370’s full code on Github.

A Good Heuristic
With a better heuristic, you could score considerably higher, even without looking ahead at all. Mcjee used such an approach and received 41.3 points for the challenge. Mcjee wanted to try out Ruby and was able to write the code in under 40 lines (which could be shortened more), where the main part is only 14 lines (from lines 14 to 27):

In the main part, each move is analyzed with a couple of heuristics (‘overflows’, ‘scores’ and ‘extra) for its effect this turn. The first thing McJee examines is the “overflow” of a move, i.e. how many pieces are lost to the other side when they pass your own store. This goes further than adr2370’s check of the number of pieces left on each side, since Mcjee factors it into every move. This will cause Mcjee to move further back pieces earlier to avoid overflowing. This could lead to a buildup of pieces near your own store which will be forced to overflow later. However, by moving the further back pieces first, your pieces become safer from capture.

Next, Mcjee examines the ‘score’ of a move which simply looks at the index and the number of pieces of each cell. This might help counter the above tendency to move cells that are further back. The last metric is more intuitive: ‘extra’, the number of pieces captured that move (from the opponent). Capturing is the most important part of the game, since it immediately steals pieces from the opponent, and it helps cause the opponent to empty out sooner. Finally (line 26), Mcjee combines the metrics together with a special formula:

scores[x]-(overflows[x]*0.3)**2+extra;

I don’t know why these particular numbers works exactly, but it managed to place Mcjee in 8th place on the back-to-school leaderboard, and 20th overall for Mancala. This is particularly impressive since the other high-ranked bots were looking many moves ahead before doing a move.

More Advanced Play
The best bots in the Hackathon used more advanced techniques to win the game. Instead of just analyzing one move ahead, they looked many moved ahead with minimax searches of game trees and even techniques like alpha-beta pruning. HackerRank has many game challenges where you can practice such techniques, and in the future we will be adding more challenges so people can learn the techniques by doing the challenges. I will just provide a brief overview of the minimax approach, and you find out more with these notes.

A game tree is a tree of future possible moves. Each node in the tree are possible moves, and a node’s children are possible moves of the opponent. Each leaf of the tree receives a score which evaluates how much the position will be worth to you. This can either be an absolute evaluation, such as a game victory, or rely on some heuristic, such as current pieces per side. The score then gets passes up the tree through the minimax search. In minimax search, you pick the maximum score for each of your turn branches and assume your opponent will pick the move which minimizes your score for each of his turns. The actual move each turn will be the move on top of the tree with the highest minimax score.

The top bots in the Mancala competition all used such techniques to analyze each move. The 1st-place bot was created by Neal Wu (nealwu on HackerRank, @WuNeal on Twitter). It was able to search 14 moves ahead and even more moves if there was still time left! It could do this by using an efficient search in C++ and making use of alpha-beta pruning to optimize the search. Since Mancala only has a few possible moves per turn, very deep searches are possible. The bot would then return the best move it found after evaluating the end positions with a simple heuristic:

(my_mancala - your_mancala) + (marbles_on_my_side - marbles_on_your_side)

This heuristic treated pieces on your side as equivalent to being in your own store, since most pieces end up that way, either during gameplay or when one side empties. Although the heuristic was simple, the powerful search was enough to defeat every opponent the bot played.

Do you think you can now improve your bot? Although the hackathon is over, you can submit it on the Mancala challenge on the regular HackerRank site.

Leibniz CodeGolf

Leibniz was one of the simple challenges in the Hackathon. All you needed to do was use the Leibniz formula to calculate π/4 to the specified number of steps. This just involves alternating between adding and subtracting decreasing fractions. The goal was to accomplish this in as little code as possible. 

As is common for Code Golf challenges, the top entries were all in Perl. Before looking at a Perl solution, we’ll examine a Python solution by Alexander Ramirez (adr2370). He started off with a straightforward solution:

He then did a series of steps until he got the code down to 78 characters:

This is still a pretty straightforward solution. It uses simple tricks to shorten the code, such as -1i to alternate between adding and subtracting each fraction.

One of the high-scoring Perl solutions was submitted by Matthew Jee (“mcjee”), and he agreed to share his solution and explanation. This was his code, which was only 62 characters:

It is also straightforward (at least for Perl!). This is what each part of the code accomplishes:

  • <>  skips the input specifying the number of cases.
  • for(<>){…} iterates through each subsequent input.
  • $s=0; initializes the sum to 0.
  • $s+=(-1)**$_/(2*$_+1) for(0..$_-1); iterates through each term in the series, adding each one to $s .
  • Lastly, print”$s\n” prints the sum.

Bead Ornaments Challenge

The Bead Ornaments problem was one of the difficult challenges on Saturday’s Hackathon. While it was easy to score a few points on some of the challenges (such as Mancala or Leibniz), the Bead Challenge was a “standard” challenge which required finding the solution to pass the test cases.

The first part of the challenge was to figure out exactly what was being asked. The challenge was to find out how many possible ornaments you can create with colored beads (given as input). An ornament consists of multiple tree-structures of beads. Each tree is made up of beads of one color and then the trees are connected together to create an ornament. So you need to figure out how many possible arrangement of trees can be made out of each color, and then how many possible ways there are to connect the trees together.

Jerry Ma (“yuiop” on HackerRank), from Purdue University, was one of the winners of the Back-to-School hackathon, and he kindly agreed to share his solution to the Bead Challenge. This post is based on Jerry’s explanation.

Ways to Arrange Each Tree
The first task was figuring out how many possible formations can be created from ‘n’ beads of one color. I.e. how many ways can you arrange ‘n’ labeled nodes into a tree? We will call this function T(n).

Jerry was able to find the formula for this by working out some examples, and examining the sample inputs for the challenge:

   n | T(n)
 - 1 | 1
 - 2 | 1 
 - 3 | 3    (Any one of the nodes can be in the middle.)
 - 4 | 16  (Provided in sample case #3.)
 - 5 | 125 (Provided in sample case #5, with the equivalent case of 5 beads of different colors.)

The number of possible trees looks like perfect powers of some sort. More specifically, T(3) = 31, T(4) = 42, and T(5) = 53. This pattern leads to the correct formula for the possible number of trees, which is also known as Cayley’s Formula : T(n) = nn-2.

Combining the Trees
The next part of the challenge is more difficult: How many ways are there to connect the different trees of each color?

Read More

Solutions to Hackathon Challenges

One of the winners of the Back-to-School Hackathon was Alexander Ramirez. He helpfully posted about each challenge on his blog and linked to his code on Github: Life Hacks.

Click on the link to view his explanations; I’ll just provide a quick summary and link to his Github pages:

  • Leibniz challenge - he used Python instead of Perl, the usual Code-Golf choice.
  • Bead Ornaments - he figured out a formula to solve the code in only a few lined of Python. 
  • Stone Game - he just used a brute-force solution, but that only passed two test cases. 
  • Stock-Prediction - he just sold stocks that had gone up and bought ones that had gone down, making $80k in the process.
  • Traveling Salesman - he ended up using a one-at-a-time sales approach.
  • Mancala - he used a simple algorithm that focused mainly on getting the most points that turn.

We will post here in more detail about each challenge, but meanwhile you can check out those solutions.

HackerRank Hackathon Results

On Saturday, HackerRank ran our biggest contest yet, with contestants from all over the world competing to solve six challenging problems. The challenges ranged from a variation of the Traveling Salesman Problem to a math-involved question on the game of Nim to coding a bot to play Mancala.

The contest was split into two parts: The Back-To-School Hackathon for US university students, and the CodeSprint4 for everyone else in the world. The challenges were the same for each group. The top 10 winners of the The Back-To-School Hackathon will receive an all-expenses paid trip to Silicon Valley to tour top tech companies in the area. 

Over the next few days we will add posts and links to explain how the winners solved the challenges. In the meantime, here are some statistics from the top submissions:

Most popular languages
The most popular language (by a wide margin) for the contest was Python, followed by Java in second place. This is a chart by of all the submissions (above a nominal score) by programming language:

Most popular countries
The BackToSchool Hackathon was only open to US students, but what countries participated in CodeSprint4? The most participants came from India, followed by the US. Below is a chart of the top countries by the number of participants they had in the Top 200. You can view more stats by playing around with the filters on the leaderboard.

HackerRank Hackathon Tomorrow

What makes a great Hackathon?
  • Interesting variety of challenges - There should be an interesting mix of challenges and more than just standard algorithmic questions.
  • Serious competition - Winning should be hard. A good hackathon should have at least a thousand contestants. Also, one bot-to-bot challenge would be nice, where only the best bot will win!
  • Great Prizes - The winners should get real prizes (e.g. a free trip to California ;)
  • Job opportunities - Those who are interested should be able to connect with tech companies after the hackathon is over.
The HackerRank Back To School Hackathon will be all that and more! Its happening online this Saturday starting at 12noon PST. There will be exciting challenges, great competition and huge prizes! After the contest, participants will be able to apply to top tech companies such as Quora or Facebook. This job opportunity is purely opt-in for those who are interested.

Prizes:
Anyone eligible who completes a challenge will get a $100 AWS credit (see $100 AWS Post for details.)
The top 10 contestants will win an all expenses paid-trip to Silicon Valley and split up $4000 in cash (see Hackathon Prizes).

This contest is open to US university students. Everyone else can participate in the CodeSprint4 Contest at the same time.

Free $100 AWS Credit for Hackathon Participants!

The Back-to-School Hackathon is happening this Saturday, and here’s another incentive to join: $100 AWS credits, courtesy of Amazon Web Services! This prize will given out (after the contest) to every participant who gets on any challenge leaderboard in the hackathon.  To get on a leaderboard, you don’t need to beat other hackers, you just need to code a solution that gets more than 0 points. So even if you don’t win these huge prizes, you’ll still win just for participating! Many Amazon Web Services are super cheap, so $100 is enough to get your site up and running on the cloud.

Note that the Hackathon and prizes are for US university students. Everyone else can participate in the CodeSprint4 Contest, which will be at the same time.