Ruminations

Code better, not harder

VoicePing

Lessons learned while building VoicePing.

It’s neither about pursuing big ideas nor about the technical challenges of building a highly mutable state machine with a multitude of side effects.

It’s not even about achieving an elegant architecture of cooperating microservices that would scale to a billion clients.

Yes, at some point those big goals will seem to be the overall purpose, or the end result, of a real-time system. But right now:

80% of the work is about making sure things work as expected at the right time for:

  1. Consumers of the services (mobile clients).
  2. End users of the mobile clients.

The other 20% is about making sure every little part of the software is written and configured in the most understandable way, with as little memory and processing overhead as possible.

More HTTP server benchmark

Following up on my last post benchmarking a hello-world HTTP server on various platforms, I continued by increasing the number of requests and the concurrency. In addition, I added Scalatra, Play, and greenlet-based Tornado to the list.

What is a greenlet? In short, it’s much like a goroutine in Go. What’s a goroutine? Read about it here.
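
For anyone new to both terms, here is a minimal sketch of the idea in Go (my own illustration, not part of the benchmark code): a goroutine is a lightweight thread of execution, multiplexed onto OS threads by the Go runtime and started with the go keyword.

package main

import (
    "fmt"
    "time"
)

func main() {
    // The go keyword runs the function concurrently on a new goroutine,
    // a lightweight thread scheduled by the Go runtime.
    go func() {
        fmt.Println("hello from a goroutine")
    }()

    // Sleep briefly so main doesn't exit before the goroutine runs.
    time.Sleep(100 * time.Millisecond)
}

Greenlets differ in that they are cooperatively scheduled on a single thread, but the programming model (cheap, concurrent units of work) is similar.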

This time, I excluded node.js, twisted, and eventmachine from the list, because apache bench could not finish the benchmark at 100,000 requests with a concurrency of 100.

node.js and eventmachine could still perform well at 15,000 requests with a concurrency of 100, and the results for both are not significantly different from the last time I tested them. Beyond that, the test was aborted after 10 failures, with the same error:

apr_socket_connect(): Operation already in progress (37)

I have also upgraded the ab executable to Version 2.3 Revision 1430300, built from the latest httpd source code (version 2.4.4).

That being said, I may try to find out what the problems are in my test setup (Mac OS X 10.8.4, Retina MacBook Pro, Mid 2012, quad-core i7 at 2.3 GHz) that prevent apache bench from finishing the runs for node.js, twisted, and eventmachine. I figured that a full-scale benchmark with apache bench really needs to be run on a Linux server.
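
If I had to guess, the apr_socket_connect failures on OS X are related to the kernel’s listen backlog and ephemeral port limits (tunable via sysctl, e.g. kern.ipc.somaxconn), but I haven’t verified that this is the cause here.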

The apache bench command that I was using:

% ab -r -k -n 100000 -c 100 -e ~/Documents/each_platform_results.csv http://localhost:port/
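
For reference: -n is the total number of requests, -c is the concurrency level, -k enables HTTP keep-alive, -r tells ab not to exit on socket receive errors, and -e writes a CSV of percentage-served timings to the given file.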

The source code for greenlet with tornado:

import tornado.httpserver
import tornado.ioloop
import tornado.web
from greenlet_tornado import greenlet_asynchronous, greenlet_fetch

class MainHandler(tornado.web.RequestHandler):
    @greenlet_asynchronous
    def get(self):
        self.write("Hello World!")

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8001)
    print("Tornado listening at 8001")
    tornado.ioloop.IOLoop.instance().start()

As you can see, there’s not much difference except for the @greenlet_asynchronous decorator.
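
As I understand the greenlet_tornado library, the decorator only starts to pay off when the handler makes blocking-style calls (for example via the imported greenlet_fetch), which suspend the greenlet instead of blocking the IO loop; a bare hello-world never exercises that path.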

The result is not that different either:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
Server Software:        TornadoServer/3.0.2
Server Hostname: 0.0.0.0
Server Port: 8001

Document Path: /
Document Length: 12 bytes

Concurrency Level: 100
Time taken for tests: 32.202 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 100000
Total transferred: 23100000 bytes
HTML transferred: 1200000 bytes
Requests per second: 3105.36 [#/sec] (mean)
Time per request: 32.202 [ms] (mean)
Time per request: 0.322 [ms] (mean, across all concurrent requests)
Transfer rate: 700.52 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       5
Processing:     6   32  10.7     29     100
Waiting:        6   32  10.7     29     100
Total:         10   32  10.7     29     100

Percentage of the requests served within a certain time (ms)
50% 29
66% 29
75% 30
80% 30
90% 51
95% 61
98% 63
99% 68
100% 100 (longest request)

Actually, Tornado without greenlet is slightly faster, by about 0.01 ms per request (mean, across all concurrent requests).

Server Software:        TornadoServer/3.0.2
Server Hostname: 0.0.0.0
Server Port: 8001

Document Path: /
Document Length: 12 bytes

Concurrency Level: 100
Time taken for tests: 31.168 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 100000
Total transferred: 23100000 bytes
HTML transferred: 1200000 bytes
Requests per second: 3208.39 [#/sec] (mean)
Time per request: 31.168 [ms] (mean)
Time per request: 0.312 [ms] (mean, across all concurrent requests)
Transfer rate: 723.77 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     7   31  11.0     27     108
Waiting:        7   31  11.0     27     108
Total:          8   31  11.0     27     108

Percentage of the requests served within a certain time (ms)
50% 27
66% 28
75% 28
80% 29
90% 50
95% 59
98% 63
99% 73
100% 108 (longest request)

Next up is Scalatra, a Sinatra-inspired web framework written in Scala, though it takes much longer to set up than the original Sinatra.

Here’s the code:

package com.jessearmand.helloscala

import org.scalatra._
import scalate.ScalateSupport

class MyScalatraServlet extends HelloScalaAppStack {
  get("/") {
    "Hello World."
  }
}

Yes, it’s that simple! That is, once you have set it up, which takes some effort if you have never done any Scala programming.

How does it perform? Roughly twice as fast as Python’s Tornado.

Server Software:        Jetty(8.1.8.v20121106)
Server Hostname: 0.0.0.0
Server Port: 8080

Document Path: /
Document Length: 12 bytes

Concurrency Level: 100
Time taken for tests: 16.332 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 100000
Total transferred: 14700000 bytes
HTML transferred: 1200000 bytes
Requests per second: 6122.82 [#/sec] (mean)
Time per request: 16.332 [ms] (mean)
Time per request: 0.163 [ms] (mean, across all concurrent requests)
Transfer rate: 878.96 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     0   16   4.2     16      79
Waiting:        0   16   4.2     16      79
Total:          0   16   4.2     16      80

Percentage of the requests served within a certain time (ms)
50% 16
66% 17
75% 18
80% 19
90% 21
95% 23
98% 26
99% 29
100% 80 (longest request)

Now, let’s try the Play framework, which I’d guess is the best Scala framework for building a web application with non-blocking IO.

package controllers

import play.api._
import play.api.mvc._

object Application extends Controller {
  def index = Action {
    Ok("Hello World!")
    // Ok(views.html.index("Your new application is ready."))
  }
}

Note how I commented out the code responsible for rendering the default index.html view. This is only because I want a pure “Hello World” comparison, although I do understand that this is not a comprehensive way to benchmark what web apps can do in various other cases. Read here for a good example of doing this.

Result on initial launch of the app:

Server Software:
Server Hostname: 0.0.0.0
Server Port: 9000

Document Path: /
Document Length: 12 bytes

Concurrency Level: 100
Time taken for tests: 6.425 seconds
Complete requests: 100000
Failed requests: 10
(Connect: 5, Receive: 5, Length: 0, Exceptions: 0)
Write errors: 0
Keep-Alive requests: 99995
Total transferred: 11599420 bytes
HTML transferred: 1199940 bytes
Requests per second: 15563.80 [#/sec] (mean)
Time per request: 6.425 [ms] (mean)
Time per request: 0.064 [ms] (mean, across all concurrent requests)
Transfer rate: 1763.00 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.9      0     105
Processing:     2    6   4.9      5      96
Waiting:        0    6   4.9      5      96
Total:          2    6   5.1      5     157

Percentage of the requests served within a certain time (ms)
50% 5
66% 6
75% 7
80% 8
90% 11
95% 15
98% 22
99% 27
100% 157 (longest request)

Then, after the first benchmark:

Server Software:
Server Hostname: 0.0.0.0
Server Port: 9000

Document Path: /
Document Length: 12 bytes

Concurrency Level: 100
Time taken for tests: 3.933 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 100000
Total transferred: 11600000 bytes
HTML transferred: 1200000 bytes
Requests per second: 25422.96 [#/sec] (mean)
Time per request: 3.933 [ms] (mean)
Time per request: 0.039 [ms] (mean, across all concurrent requests)
Transfer rate: 2879.94 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     2    4   1.2      3      13
Waiting:        2    4   1.2      3      13
Total:          2    4   1.3      3      16

Percentage of the requests served within a certain time (ms)
50% 3
66% 4
75% 4
80% 5
90% 6
95% 7
98% 7
99% 8
100% 16 (longest request)

It seems the Play framework takes some time after its initial launch to reach a kind of steady state, which is what you would expect from a JVM-based server as the JIT compiler warms up.

Last, let’s benchmark Go again. This time, I didn’t set GOMAXPROCS, which caps the number of CPUs that can execute simultaneously, so it fell back to the default (which was 1 in the Go releases of that era).
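
Out of curiosity, here is a minimal sketch (my own, not part of the benchmark code) for checking what that default actually is; calling runtime.GOMAXPROCS with an argument of 0 queries the current value without changing it:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    // Passing 0 reports the current setting without modifying it.
    fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
    // NumCPU reports the number of logical CPUs on this machine.
    fmt.Println("NumCPU:", runtime.NumCPU())
}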

Result:

Server Software:
Server Hostname: 127.0.0.1
Server Port: 3000

Document Path: /
Document Length: 12 bytes

Concurrency Level: 100
Time taken for tests: 2.559 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 100000
Total transferred: 13800000 bytes
HTML transferred: 1200000 bytes
Requests per second: 39077.64 [#/sec] (mean)
Time per request: 2.559 [ms] (mean)
Time per request: 0.026 [ms] (mean, across all concurrent requests)
Transfer rate: 5266.32 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     0    3   0.6      3       6
Waiting:        0    3   0.6      3       6
Total:          0    3   0.6      3       6

Percentage of the requests served within a certain time (ms)
50% 3
66% 3
75% 3
80% 3
90% 3
95% 4
98% 4
99% 5
100% 6 (longest request)

Go is definitely still the clear winner, even at roughly half the requests per second of my previous run (which used GOMAXPROCS(4) and a smaller request count). Next to Go in performance is the Play framework.

On a related note, there’s an interesting article for Python developers if you’re thinking about migrating to Go.

I'm sold on golang

While I was waiting for WWDC, I ran apache bench against a simple hello-world server written in Go, Ruby’s EventMachine, Python’s Tornado and Twisted, and the well-known node.js.

% ab -r -k -n 15000 -c 50 -e ~/Documents/each_platform_results.csv http://localhost:port/

15,000 requests at a concurrency of 50 was the only setting I could test on all platforms without the run failing with an error like this:

apr_socket_connect(): Operation already in progress (37)

How are the results?

OK, let’s start with Tornado (I believe it’s used by Facebook for the real-time news feed?). Here’s the code:

import tornado.httpserver
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        self.write("Hello World\n")
        self.finish()

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8001)
    print("Tornado listening at 8001")
    tornado.ioloop.IOLoop.instance().start()

Result:

Server Software:        TornadoServer/3.0.2
Server Hostname: 0.0.0.0
Server Port: 8001

Document Path: /
Document Length: 12 bytes

Concurrency Level: 50
Time taken for tests: 5.072 seconds
Complete requests: 15000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 15000
Total transferred: 3465000 bytes
HTML transferred: 180000 bytes
Requests per second: 2957.67 [#/sec] (mean)
Time per request: 16.905 [ms] (mean)
Time per request: 0.338 [ms] (mean, across all concurrent requests)
Transfer rate: 667.21 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:     6   17   6.0     15      65
Waiting:        6   17   5.9     15      65
Total:          7   17   6.0     15      65

Percentage of the requests served within a certain time (ms)
50% 15
66% 16
75% 16
80% 17
90% 27
95% 30
98% 35
99% 38
100% 65 (longest request)

Then… the reputable Twisted:

from twisted.web import server, resource
from twisted.internet import reactor

class HelloResource(resource.Resource):
    isLeaf = True

    def render_GET(self, request):
        request.setHeader("Content-Type", "text/plain")
        return "Hello World\n"

port = 8001
reactor.listenTCP(port, server.Site(HelloResource()))
print "Twisted running at %d" % port
reactor.run()

Result:

Server Software:        TwistedWeb/13.0.0
Server Hostname: 0.0.0.0
Server Port: 8001

Document Path: /
Document Length: 12 bytes

Concurrency Level: 50
Time taken for tests: 5.893 seconds
Complete requests: 15000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 2115000 bytes
HTML transferred: 180000 bytes
Requests per second: 2545.19 [#/sec] (mean)
Time per request: 19.645 [ms] (mean)
Time per request: 0.393 [ms] (mean, across all concurrent requests)
Transfer rate: 350.46 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       2
Processing:    11   20   2.2     19      34
Waiting:       11   20   2.2     19      34
Total:         11   20   2.2     19      34

Percentage of the requests served within a certain time (ms)
50% 19
66% 20
75% 20
80% 21
90% 22
95% 23
98% 25
99% 27
100% 34 (longest request)

What about Ruby’s EventMachine?

require 'eventmachine'
require 'evma_httpserver'

class Handler < EventMachine::Connection
  include EventMachine::HttpServer

  def process_http_request
    resp = EventMachine::DelegatedHttpResponse.new( self )
    resp.status = 200
    resp.content = "Hello World!"
    resp.send_response
  end
end

EventMachine::run {
  EventMachine::start_server("0.0.0.0", 8002, Handler)
  puts "Listening... at 8002"
}

Result:

Server Software:
Server Hostname: 0.0.0.0
Server Port: 8002

Document Path: /
Document Length: 12 bytes

Concurrency Level: 50
Time taken for tests: 1.909 seconds
Complete requests: 15000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 780000 bytes
HTML transferred: 180000 bytes
Requests per second: 7858.99 [#/sec] (mean)
Time per request: 6.362 [ms] (mean)
Time per request: 0.127 [ms] (mean, across all concurrent requests)
Transfer rate: 399.09 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   4.4      0     245
Processing:     1    6  19.5      4     247
Waiting:        1    5  19.3      3     247
Total:          2    6  20.0      4     247

Percentage of the requests served within a certain time (ms)
50% 4
66% 4
75% 4
80% 4
90% 6
95% 8
98% 11
99% 80
100% 247 (longest request)

Wow, that’s faster than Tornado or Twisted in terms of requests per second, although it transferred less data, most likely because its responses carry almost no headers (note the empty Server Software field above). OK, let’s move on to node.js; I expect it to be faster.

var http = require('http');

http.createServer(function (request, response) {
  response.writeHead(200, { "Content-Type": "text/plain" });
  response.write("Hello World\n");
  response.end();
}).listen(8000);

console.log('Listening on port 8000...');

Result:

Server Software:
Server Hostname: 0.0.0.0
Server Port: 8000

Document Path: /
Document Length: 12 bytes

Concurrency Level: 50
Time taken for tests: 2.181 seconds
Complete requests: 15000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 1695000 bytes
HTML transferred: 180000 bytes
Requests per second: 6877.80 [#/sec] (mean)
Time per request: 7.270 [ms] (mean)
Time per request: 0.145 [ms] (mean, across all concurrent requests)
Transfer rate: 758.98 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.1      0     137
Processing:     1    7  11.9      6     142
Waiting:        1    7  11.8      6     142
Total:          2    7  11.9      6     142

Percentage of the requests served within a certain time (ms)
50% 6
66% 6
75% 6
80% 6
90% 7
95% 9
98% 12
99% 101
100% 142 (longest request)

Yes, unsurprisingly it has the highest transfer rate so far, although fewer mean requests per second than Ruby’s EventMachine. Based on the percentage of requests served within a given time, node.js is faster.

Are we done? No. Last but not least:

package main

import (
    "fmt"
    "io"
    "net/http"
    "runtime"
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.Header().Set("Content-Type", "text/plain")
        io.WriteString(w, "Hello World\n")
    })

    runtime.GOMAXPROCS(4)
    fmt.Println("Server running at http://127.0.0.1:3000/ with GOMAXPROCS(4) on i7")
    http.ListenAndServe("127.0.0.1:3000", nil)
}

Result:

Server Software:
Server Hostname: 0.0.0.0
Server Port: 3000

Document Path: /
Document Length: 12 bytes

Concurrency Level: 50
Time taken for tests: 0.228 seconds
Complete requests: 15000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 15000
Total transferred: 2070000 bytes
HTML transferred: 180000 bytes
Requests per second: 65771.30 [#/sec] (mean)
Time per request: 0.760 [ms] (mean)
Time per request: 0.015 [ms] (mean, across all concurrent requests)
Transfer rate: 8863.71 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       2
Processing:     0    1   0.1      1       2
Waiting:        0    1   0.1      1       2
Total:          0    1   0.2      1       3

Percentage of the requests served within a certain time (ms)
50% 1
66% 1
75% 1
80% 1
90% 1
95% 1
98% 1
99% 1
100% 3 (longest request)

Transfer rate is 8863.71 Kbytes/sec with 65771.3 requests per second?

Go lang, I’m sold.

Need to Load View

Just found out today that if a UIViewController subclass is constructed programmatically (that is, without using Interface Builder), and we don’t create and assign a UIView inside loadView:

- (void)loadView
{
    self.view = [[UIView alloc] initWithFrame:[[UIScreen mainScreen] bounds]];
}

then this can ruin the transition animation of presentViewController:animated:completion:. In other words, the transition animation simply doesn’t occur.

There’s a reason the documentation says we need to implement this. I’ve seen at least one project that doesn’t follow this rule.

Git merge and rebase tips

A few tips to deal with merge conflicts:

GitHub Workflow

A new contributor creates a separate branch, makes some changes, and opens a pull request. Only the core developers are allowed to review the pull request and merge the changes into the main branch.

The good thing about a pull request is that the changes can be discussed first, rather than the newcomer silently merging and pushing to the main branch.
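
In command form, the contributor’s side looks roughly like this (the branch name is illustrative):

# git checkout -b new_ui
# ...commit changes...
# git push origin new_ui

Then open a pull request on GitHub from new_ui against the main branch.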

Fetch, Merge

Read here.

Basically, if both developers are working on the same file, they can peek at the remote changes before doing a merge.

Let’s say the main developer, who works on the master branch, would like to look at a new feature on another branch:

# git fetch origin new_ui

Look at what has changed:

# git diff origin/new_ui

Once you’ve decided that it’s okay:

# git merge --no-ff origin/new_ui

Note the --no-ff here:

The --no-ff flag causes the merge to always create a new commit object, even if the merge could be performed with a fast-forward. This avoids losing information about the historical existence of a feature branch and groups together all commits that together added the feature.

From a good answer on stackoverflow:
http://stackoverflow.com/questions/2850369/why-does-git-use-fast-forward-merging-by-default

But this is a simple case… merging directly is usually okay if there are no conflicting changes. If there are, it’s usually better to use a fetch, rebase, merge workflow.

Fetch, Rebase, Merge

An article related to git flow

A good explanation is here (read especially the example in comment-1107)
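
In command form, that fetch, rebase, merge workflow looks roughly like this (branch names are illustrative):

# git fetch origin
# git checkout new_ui
# git rebase origin/master
# git checkout master
# git merge --no-ff new_ui

The rebase replays the feature commits on top of the latest master, so conflicts get resolved one commit at a time on the feature branch, and the final merge stays clean.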

Quote:
When Is the Merge Workflow OK?

The merge workflow will do you no damage at all if you

  • Only have one committer (or a very small number of committers, and you trust them all)
  • Don’t care much about reading your history