Real-time messaging with Ruby, iOS & Android

Alex Egg,

The next feature on the TODO list for Curb Call is in-app chat: buyers need to be able to ask agents questions in real time. A few apps on the market already have this feature, and I know it is not a trivial problem. Plenty of companies make real-time communication or chat their sole business: WhatsApp, Kakao, Facebook Messenger, et al. So it is clear that this is both a valuable and complicated feature for Curb Call.

Requirements

To be clear, these are my requirements: client-to-client communication between iOS/Android clients, w/ messages passing through a middle-man server. Our current stack is a monolithic Rails app w/ Postgres and Redis at our disposal.

From my research there are a few ways to implement a chat system:

  1. Evented web server using a persistent socket connection (Node/WebSockets or EventMachine) w/ a custom chat protocol implementation
  2. In-house peer-to-peer solution: Faye, Redis pub/sub, etc. (sockets)
  3. Peer-to-peer turn-key solution: Pusher, PubNub, Layer
  4. Traditional polling by the client (fits into the traditional Rails stateless paradigm)

Option #1 - Evented server w/ sockets

This is the state of the art, especially if you consider WebSockets. A persistent socket connection to all your traditional clients (iPhone, Android) in addition to web browsers lets you push content instantaneously. Although web-browser support isn't one of our use cases, it is an interesting value addition considering there are now WebSocket client libraries for iOS and Android.

Using this option puts you relatively low in the network stack, so you would have to implement your protocols from scratch – this could be a positive or a negative thing.

Option #2 - in-house messaging systems

These solutions build on option #1 with a well-defined protocol layer on top of the socket. Examples I researched are Faye and Redis pub/sub. Most of them follow the pub/sub pattern, where clients publish messages to a channel and subscribe to a channel to receive updates. Most of these solutions are hard to implement alongside Rails b/c the subscribe step usually blocks: it creates a persistent socket connection, which doesn't work with a traditional web server – each request will block a thread/process indefinitely until you run out. One solution is to use an evented system like EventMachine, or to implement it in Node.
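A minimal in-process sketch of the pattern – plain Ruby with a Queue per subscriber standing in for Redis, purely for illustration – shows why the subscribe step blocks:

```ruby
# Toy pub/sub bus: each subscriber gets a Queue; subscribe never
# returns, which is what pins a thread in a traditional server.
class Bus
  def initialize
    @subscribers = Hash.new { |h, k| h[k] = [] }
  end

  def publish(channel, msg)
    @subscribers[channel].each { |q| q << msg }
  end

  def subscribe(channel)
    q = Queue.new
    @subscribers[channel] << q
    loop { yield q.pop } # blocks forever -- needs its own thread
  end
end
```

In a threaded Rails server, every such subscriber permanently occupies a thread, which is why an evented approach (or a separate process) is needed.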

Option #3 - Peer-to-peer solutions Pusher, PubNub, etc.

Given the difficulty of the real-time problem, it is no surprise that these businesses are popping up all over and have become very popular. Most of them are built on top of WebSockets. Although these systems aren't peer-to-peer per se, they are as far as we are concerned, b/c messages don't touch our server stack, only the provider's servers.

iOS Background Socket Disconnection

Pusher works fine on iOS; the problem is that when the app is backgrounded for too long, your connection gets severed. And since these services (the ones I've profiled) don't provide any replay functionality, this fails one of my requirements. We would have to build some type of protection into the client: if the app is backgrounded, somehow deliver the messages via push notification (APNs, GCM, etc.), and then when the app wakes up, sync w/ the server to get the missed history.

Option #4 Polling

In this scenario the clients do not have a persistent connection w/ the server, but rather periodically request updates. The advantage is that you can use your existing web stack, in this case Rails. The disadvantage is some added complexity in the client implementation. For example, the client has to implement a timer to ping the server, and make sure it doesn't block the UI thread. The side effect is increased CPU/radio usage on the device, which means increased battery drain. This implementation also becomes more and more unrealistic past a certain number of clients pinging your server every few seconds, as you put unnecessary load on your server. It is not unreasonable to consider the case where, for a given channel, 99% of the pings return empty while only 1% actually have any updates. One solution to this problem is long polling.

Long Polling

In this scenario the client periodically requests updates, as in any polling system, and when the server has data it returns it as normal. However, in the 99% case where there is no data, instead of returning nothing the server holds onto the connection until a given time-out expires or until data becomes available. This cuts down on the # of requests and thus on battery drain. Long polling is an interesting compromise between traditional polling and a persistent socket: it's not a persistent connection, but it's not really polling either.
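The server side of that idea can be sketched in a few lines of plain Ruby, assuming an in-process queue stands in for the channel:

```ruby
require 'timeout'

# Long-polling sketch: instead of answering "no data" immediately,
# the server waits up to `timeout` seconds for a message to arrive.
class Channel
  def initialize
    @queue = Queue.new
  end

  def publish(msg)
    @queue << msg
  end

  # Block until a message arrives or the time-out expires.
  def long_poll(timeout: 25)
    Timeout.timeout(timeout) { @queue.pop }
  rescue Timeout::Error
    nil # the 99% case: nothing new, the client simply re-polls
  end
end
```

The client sees either a message (possibly seconds after it asked) or an empty response once per time-out window, instead of an empty response every few seconds.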

Decision

The socket-based solutions are appealing in terms of robustness, but after trying to implement them in a non-blocking environment (Node and EventMachine), I don't believe the complexity added by socket/WebSocket support is justified over long polling.

I also didn't want to write custom message-state management code to handle the concepts of pub/sub and channels, so a fully custom Rails-based solution was out too. It seemed wise to leverage the pub/sub features of Redis, since it's already in my stack. But as I said, the pub/sub pattern blocks, so running it inside a Rails process is out of the question. That leaves some type of evented server w/ Redis pub/sub, and since I am in a Ruby stack, I think my best choice is EventMachine inside a Sinatra app that is exposed as a web service to the mobile clients.

Implementation

I found a good implementation of exactly what I wanted to build – a non-blocking pub/sub system based on Redis – already written, called MessageBus: https://github.com/SamSaffron/message_bus . If I just write a quick chat-app wrapper for message_bus and host it in an evented web server such as thin, we should be able to scale to a pretty large level w/ 1 thread and not block the threads/dynos on the main Rails app.
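For reference, hosting the wrapper under thin only takes a rackup file (the filename chat.rb is an assumption for this sketch):

```ruby
# config.ru
require './chat' # the Sinatra app, assumed saved as chat.rb
run Chat

# then start an evented server with:
#   thin start -R config.ru -p 9292
```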

Server (Sinatra)

I wrote the server using Sinatra. It is only 1 file, which includes the message_bus middleware and adds a publish endpoint (POST /message).

$LOAD_PATH.unshift File.expand_path('../../../lib', __FILE__)
require 'uri'
require 'json'
require 'message_bus'
require 'sinatra/base'
require 'parse-ruby-client'


class Chat < Sinatra::Base

  set :public_folder, File.expand_path('../assets', __FILE__)

  # MessageBus piggy-backs on Redis pub/sub, so hand it our existing
  # Redis connection info.
  redis_uri = URI.parse(ENV["REDIS_URI"])
  redis_config = {:host => redis_uri.host, :port => redis_uri.port, :password => redis_uri.password}
  MessageBus.redis_config = redis_config

  Parse.init :application_id => ENV.fetch("PARSE_APP_ID"),
             :api_key        => ENV.fetch("PARSE_API_KEY"),
             :quiet          => true

  # Adds the non-blocking long-polling endpoint: /message-bus/[client id]/poll
  use MessageBus::Rack::Middleware

  # Optional server-side subscription: fires for every published message.
  MessageBus.subscribe('/message') do |msg|
    data = msg.data
    message_id = msg.message_id

    # Send a push so backgrounded iOS clients wake up and re-sync
    data = { alert: "#{msg.data}", sound: "sosumi.aiff" }
    push = Parse::Push.new(data)
    user_id = 11 # hard-coded recipient for now
    query = Parse::Query.new(Parse::Protocol::CLASS_INSTALLATION).eq('user_id', user_id)
    push.where = query.where
    push.save
  end


  # Clients POST here to publish a message into the bus.
  post '/message' do
    # params example: {"data"=>"message text\n", "name"=>"eggie5"}
    MessageBus.publish '/message', params

    headers = {}
    headers["Content-Type"] = "application/json; charset=utf-8"
    h = { status: "OK" }
    json = JSON.dump(h)
    [200, headers, json]
  end

  run! if app_file == $0
end

Code analysis

Initialization

First we need to pass the messaging system our current Redis config info, because it piggy-backs on the Redis pub/sub system. Then we init the Parse API, which we use to wake up iOS clients when the app goes into the background. Next we include the message_bus Rack middleware, which gives our app a long-polling URL endpoint – /message-bus/[client id]/poll – that does not block the thread. In the background it uses EventMachine, so it is important to use an evented web server w/ this messaging system, or you will just block server threads until the server falls over.

Subscribe

This subscription callback is optional; the block is executed whenever a message is published. I'm using it here to send a push to an iPhone client. This covers the case where the client lost its connection because the app has been in the background past the mysterious Apple time threshold. If the user taps the push, it brings the app to the foreground and starts long-polling again.

Publish

The mobile clients call this endpoint when they want to post a message. It inserts the message into the message_bus system and then returns. (The previously mentioned subscribe callback will fire after this.)

Client

The client implementation is pretty simple. When the client wants to start a chat it must have two things: a channel name and a nickname. To join a channel, the client simply starts polling it by hitting /message-bus/[client id]/poll w/ a POST param [channel_name]=0. This syntax means: return ALL the messages on that channel that this client id is allowed to see. The server will return a JSON array of messages. This is an example of a message:

{"global_id":54,"message_id":50,"channel":"/message","data":{"data":"hello, from iOS!","name":"eggie5"}}

The client should keep track of the last message_id it received so that, if it wakes up from sleep (push notification), it can synchronize w/ the server by posting the body [channel_name]=[last_id]. The server will then return all the messages starting from that point.
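That client-side bookkeeping can be sketched in Ruby (the server URL and client id here are assumptions; the endpoint and param format follow the description above):

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical chat client: polls the message-bus endpoint, passing
# the last message_id seen so the server replays anything missed.
class ChatClient
  def initialize(server, client_id, channel = '/message')
    @server, @client_id, @channel = server, client_id, channel
    @last_id = 0 # 0 means "give me everything I'm allowed to see"
  end

  # The POST body: [channel_name]=[last_id]
  def poll_params
    { @channel => @last_id.to_s }
  end

  def poll
    uri = URI("#{@server}/message-bus/#{@client_id}/poll")
    res = Net::HTTP.post_form(uri, poll_params)
    JSON.parse(res.body).each do |msg|
      @last_id = msg['message_id'] # remember where we left off
      yield msg['data'] if block_given?
    end
  end
end
```

Calling poll in a background loop (off the UI thread) gives the sync-after-wake-up behavior: after a push notification, the next poll simply resumes from @last_id.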

For the chat UI on iOS I found a cool project that has done all the work – I simply implemented a few interfaces and added a data-source adapter for my messaging system: http://www.jessesquires.com/JSQMessagesViewController/ . I haven't tried to make an Android UI yet.

So in summary: the client polls every x seconds, and if there is no data on the server (no new messages), the connection stays open until data is returned (somebody posted a message).

Note: On an evented server this will work fine, but on a traditional process/thread-based server it will block each process/thread until you run out of resources.

I feel like the long-polling approach is a good compromise given the complexity of a full socket solution. It lets me keep my stack contained in the sphere of Ruby/Rails technologies and my deploys to Heroku simple.

See the system running in the video below!

Nov 19 2015 Update

It seems this is the same approach Rails is taking w/ its new Action Cable: WebSocket communication using an embedded evented server, EventMachine. So you end up running two web servers: your traditional stateless server (Puma/Unicorn et al.) to handle HTTP, and EventMachine on the side to handle the WebSockets.



Last edited by Alex Egg, 2016-10-05 19:07:03