It’s neither about pursuing big ideas nor about the technical challenges of building a highly mutable state machine with a multitude of side effects.
Nor is it about achieving an elegant architecture of cooperating microservices that scales to a billion clients.
Yes, at some point these big goals will seem to be the overall purpose, or the end result, of a real-time system. But right now:
80% is about making sure things work as expected, at the right time, for:
Consumers of the services (mobile clients).
End users of the mobile clients.
20% is about making sure every little part of the software is written and configured in the most understandable way, with as little overhead as possible in terms of memory and processing resources.
Following up on my last post about benchmarking a hello world HTTP server on various platforms, I continued by increasing the number of requests and the concurrency level. In addition, I added Scalatra, Play, and greenlet-based Tornado to the list.
What is greenlet? To make it short, it’s just like a goroutine in Go. What’s a goroutine? Read about it here.
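To get a feel for greenlets, here’s a tiny self-contained example (illustrative only, not part of the benchmark): two greenlets hand control to each other explicitly with switch(), whereas goroutines are scheduled for you by the Go runtime.

from greenlet import greenlet

def ping():
    print("ping")
    gr_pong.switch()     # hand control to the other greenlet
    print("ping again")  # resumes here when pong switches back

def pong():
    print("pong")
    gr_ping.switch()     # switch back; pong never resumes after this

gr_ping = greenlet(ping)
gr_pong = greenlet(pong)
gr_ping.switch()         # prints: ping, pong, ping again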
This time, I excluded node.js, Twisted, and EventMachine from the list, because Apache Bench could not finish the benchmark at 100,000 requests with 100 concurrent requests at a time.
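For the record, a run at this scale boils down to an ab invocation along these lines, where -n is the total number of requests, -c the concurrency, and -k enables keep-alive (the port depends on the server under test):

# ab -k -n 100000 -c 100 http://0.0.0.0:8001/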
node.js and EventMachine could still perform well at 15,000 requests with 100 concurrent requests, and their results don’t differ significantly from the last time I tested them. Beyond that, the test was aborted after 10 failures, always with the same error:
apr_socket_connect(): Operation already in progress (37)
I have also upgraded the ab executable to Version 2.3 Revision 1430300, which can be built from the latest httpd source code, version 2.4.4.
That being said, I may try to find out what it is about my test setup (Mac OS X 10.8.4, Retina MacBook Pro, Mid 2012, quad-core i7 at 2.3 GHz) that makes Apache Bench fail to finish for node.js, Twisted, and EventMachine. I figure that to do a full-scale benchmark with Apache Bench, it needs to be run on a Linux server.
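First, Tornado with greenlet. Here’s a minimal sketch of what such a greenlet-flavored server can look like; the greenlet wrapping below is my assumption of the general shape, not necessarily the exact code I ran:

import greenlet
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        # Run the handler body in a child greenlet. For a hello world this
        # changes nothing functionally; the point is that blocking-style
        # code inside _handle could later switch back to the IOLoop
        # mid-request instead of stalling it.
        greenlet.greenlet(self._handle).switch()

    def _handle(self):
        self.write("Hello World!")

application = tornado.web.Application([(r"/", MainHandler)])

if __name__ == "__main__":
    application.listen(8001)
    tornado.ioloop.IOLoop.instance().start()

Here are its numbers at 100,000 requests with a concurrency of 100: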
Server Software:        TornadoServer/3.0.2
Server Hostname:        0.0.0.0
Server Port:            8001

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      100
Time taken for tests:   32.202 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    100000
Total transferred:      23100000 bytes
HTML transferred:       1200000 bytes
Requests per second:    3105.36 [#/sec] (mean)
Time per request:       32.202 [ms] (mean)
Time per request:       0.322 [ms] (mean, across all concurrent requests)
Transfer rate:          700.52 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       5
Processing:     6   32  10.7     29     100
Waiting:        6   32  10.7     29     100
Total:         10   32  10.7     29     100

Percentage of the requests served within a certain time (ms)
  50%     29
  66%     29
  75%     30
  80%     30
  90%     51
  95%     61
  98%     63
  99%     68
 100%    100 (longest request)
Actually, Tornado without greenlet is slightly faster, by about 0.01 ms per request (mean across all concurrent requests).
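Without the greenlet detour, the handler collapses to the canonical Tornado hello world; everything else stays the same:

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello World!")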
Server Software:        TornadoServer/3.0.2
Server Hostname:        0.0.0.0
Server Port:            8001

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      100
Time taken for tests:   31.168 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    100000
Total transferred:      23100000 bytes
HTML transferred:       1200000 bytes
Requests per second:    3208.39 [#/sec] (mean)
Time per request:       31.168 [ms] (mean)
Time per request:       0.312 [ms] (mean, across all concurrent requests)
Transfer rate:          723.77 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     7   31  11.0     27     108
Waiting:        7   31  11.0     27     108
Total:          8   31  11.0     27     108

Percentage of the requests served within a certain time (ms)
  50%     27
  66%     28
  75%     28
  80%     29
  90%     50
  95%     59
  98%     63
  99%     73
 100%    108 (longest request)
Next up is Scalatra, a Sinatra-inspired web framework written in Scala, although it takes much longer to set up than its Ruby inspiration.
Server Software:        Jetty(8.1.8.v20121106)
Server Hostname:        0.0.0.0
Server Port:            8080

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      100
Time taken for tests:   16.332 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    100000
Total transferred:      14700000 bytes
HTML transferred:       1200000 bytes
Requests per second:    6122.82 [#/sec] (mean)
Time per request:       16.332 [ms] (mean)
Time per request:       0.163 [ms] (mean, across all concurrent requests)
Transfer rate:          878.96 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     0   16   4.2     16      79
Waiting:        0   16   4.2     16      79
Total:          0   16   4.2     16      80

Percentage of the requests served within a certain time (ms)
  50%     16
  66%     17
  75%     18
  80%     19
  90%     21
  95%     23
  98%     26
  99%     29
 100%     80 (longest request)
Now, let’s try the Play framework which, I guess, is the best framework written in Scala for building a web application with non-blocking IO.
package controllers
import play.api._
import play.api.mvc._

object Application extends Controller {
  def index = Action {
    Ok("Hello World!")
    // Ok(views.html.index("Your new application is ready."))
  }
}
Note how I commented out the line responsible for rendering the default index view. This is only because I want a pure “Hello World” comparison, although I do understand that it’s not a comprehensive way to benchmark what web frameworks can do in various other cases. Read here for a good example of doing that.
Server Software:
Server Hostname:        0.0.0.0
Server Port:            9000

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      100
Time taken for tests:   6.425 seconds
Complete requests:      100000
Failed requests:        10
   (Connect: 5, Receive: 5, Length: 0, Exceptions: 0)
Write errors:           0
Keep-Alive requests:    99995
Total transferred:      11599420 bytes
HTML transferred:       1199940 bytes
Requests per second:    15563.80 [#/sec] (mean)
Time per request:       6.425 [ms] (mean)
Time per request:       0.064 [ms] (mean, across all concurrent requests)
Transfer rate:          1763.00 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.9      0     105
Processing:     2    6   4.9      5      96
Waiting:        0    6   4.9      5      96
Total:          2    6   5.1      5     157

Percentage of the requests served within a certain time (ms)
  50%      5
  66%      6
  75%      7
  80%      8
  90%     11
  95%     15
  98%     22
  99%     27
 100%    157 (longest request)
Here’s a second run of the exact same benchmark:

Server Software:
Server Hostname:        0.0.0.0
Server Port:            9000

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      100
Time taken for tests:   3.933 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    100000
Total transferred:      11600000 bytes
HTML transferred:       1200000 bytes
Requests per second:    25422.96 [#/sec] (mean)
Time per request:       3.933 [ms] (mean)
Time per request:       0.039 [ms] (mean, across all concurrent requests)
Transfer rate:          2879.94 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     2    4   1.2      3      13
Waiting:        2    4   1.2      3      13
Total:          2    4   1.3      3      16

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      4
  75%      4
  80%      5
  90%      6
  95%      7
  98%      7
  99%      8
 100%     16 (longest request)
It seems the Play framework takes some time after the initial launch to reach some kind of “steady state”; the second run is noticeably faster.
Last, let’s benchmark Go again. This time, I didn’t set GOMAXPROCS, which caps the number of CPUs that can execute simultaneously, so the runtime fell back to its default (which, as far as I know, is 1 unless overridden).
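For reference, pinning it explicitly is just a matter of setting the environment variable when launching the server (server.go is a placeholder file name):

# GOMAXPROCS=4 go run server.go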
Server Software:
Server Hostname:        127.0.0.1
Server Port:            3000

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      100
Time taken for tests:   2.559 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    100000
Total transferred:      13800000 bytes
HTML transferred:       1200000 bytes
Requests per second:    39077.64 [#/sec] (mean)
Time per request:       2.559 [ms] (mean)
Time per request:       0.026 [ms] (mean, across all concurrent requests)
Transfer rate:          5266.32 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     0    3   0.6      3       6
Waiting:        0    3   0.6      3       6
Total:          0    3   0.6      3       6

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      3
  75%      3
  80%      3
  90%      3
  95%      4
  98%      4
  99%      5
 100%      6 (longest request)
Go is definitely still the clear winner, even though it ran at roughly half the speed of my last benchmark. Next to Go comes the Play framework.
On a related note, there’s an interesting article for Python developers if you’re thinking about migrating to Go.
While I was waiting for WWDC, I ran Apache Bench against simple hello world servers written in Go, Ruby’s EventMachine, Python’s Tornado and Twisted, and the well-known node.js. First up, Tornado:
Server Software:        TornadoServer/3.0.2
Server Hostname:        0.0.0.0
Server Port:            8001

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      50
Time taken for tests:   5.072 seconds
Complete requests:      15000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    15000
Total transferred:      3465000 bytes
HTML transferred:       180000 bytes
Requests per second:    2957.67 [#/sec] (mean)
Time per request:       16.905 [ms] (mean)
Time per request:       0.338 [ms] (mean, across all concurrent requests)
Transfer rate:          667.21 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:     6   17   6.0     15      65
Waiting:        6   17   5.9     15      65
Total:          7   17   6.0     15      65

Percentage of the requests served within a certain time (ms)
  50%     15
  66%     16
  75%     16
  80%     17
  90%     27
  95%     30
  98%     35
  99%     38
 100%     65 (longest request)
Then… the reputable Twisted:
from twisted.web import server, resource
from twisted.internet import reactor

# A minimal Twisted resource serving the same 12-byte body
# (reconstructed from the standard Twisted hello world; the
# class name is illustrative)
class HelloWorld(resource.Resource):
    isLeaf = True

    def render_GET(self, request):
        return "Hello World!"

reactor.listenTCP(8001, server.Site(HelloWorld()))
reactor.run()
Server Software:        TwistedWeb/13.0.0
Server Hostname:        0.0.0.0
Server Port:            8001

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      50
Time taken for tests:   5.893 seconds
Complete requests:      15000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      2115000 bytes
HTML transferred:       180000 bytes
Requests per second:    2545.19 [#/sec] (mean)
Time per request:       19.645 [ms] (mean)
Time per request:       0.393 [ms] (mean, across all concurrent requests)
Transfer rate:          350.46 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       2
Processing:    11   20   2.2     19      34
Waiting:       11   20   2.2     19      34
Total:         11   20   2.2     19      34

Percentage of the requests served within a certain time (ms)
  50%     19
  66%     20
  75%     20
  80%     21
  90%     22
  95%     23
  98%     25
  99%     27
 100%     34 (longest request)
What about Ruby’s EventMachine?
require 'eventmachine'
require 'evma_httpserver'

class Handler < EventMachine::Connection
  include EventMachine::HttpServer

  def process_http_request
    resp = EventMachine::DelegatedHttpResponse.new(self)
    resp.status = 200
    resp.content = "Hello World!"
    resp.send_response
  end
end

EventMachine::run {
  EventMachine::start_server("0.0.0.0", 8002, Handler)
  puts "Listening... at 8002"
}
Server Software:
Server Hostname:        0.0.0.0
Server Port:            8002

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      50
Time taken for tests:   1.909 seconds
Complete requests:      15000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      780000 bytes
HTML transferred:       180000 bytes
Requests per second:    7858.99 [#/sec] (mean)
Time per request:       6.362 [ms] (mean)
Time per request:       0.127 [ms] (mean, across all concurrent requests)
Transfer rate:          399.09 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   4.4      0     245
Processing:     1    6  19.5      4     247
Waiting:        1    5  19.3      3     247
Total:          2    6  20.0      4     247

Percentage of the requests served within a certain time (ms)
  50%      4
  66%      4
  75%      4
  80%      4
  90%      6
  95%      8
  98%     11
  99%     80
 100%    247 (longest request)
Wow, that’s faster than Tornado or Twisted in terms of requests per second, but it transferred less data: 780000 bytes over 15,000 responses is just 52 bytes per response, against Twisted’s 141, so its responses must carry far fewer headers. OK, let’s move on to node.js, which I expect to be faster.
Server Software:
Server Hostname:        0.0.0.0
Server Port:            8000

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      50
Time taken for tests:   2.181 seconds
Complete requests:      15000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      1695000 bytes
HTML transferred:       180000 bytes
Requests per second:    6877.80 [#/sec] (mean)
Time per request:       7.270 [ms] (mean)
Time per request:       0.145 [ms] (mean, across all concurrent requests)
Transfer rate:          758.98 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.1      0     137
Processing:     1    7  11.9      6     142
Waiting:        1    7  11.8      6     142
Total:          2    7  11.9      6     142

Percentage of the requests served within a certain time (ms)
  50%      6
  66%      6
  75%      6
  80%      6
  90%      7
  95%      9
  98%     12
  99%    101
 100%    142 (longest request)
Yes, unsurprisingly it has the highest transfer rate, although fewer mean requests per second than Ruby’s EventMachine. Based on the percentage of requests served within a certain time, node.js is faster. Finally, let’s see Go:
Server Software:
Server Hostname:        0.0.0.0
Server Port:            3000

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      50
Time taken for tests:   0.228 seconds
Complete requests:      15000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    15000
Total transferred:      2070000 bytes
HTML transferred:       180000 bytes
Requests per second:    65771.30 [#/sec] (mean)
Time per request:       0.760 [ms] (mean)
Time per request:       0.015 [ms] (mean, across all concurrent requests)
Transfer rate:          8863.71 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       2
Processing:     0    1   0.1      1       2
Waiting:        0    1   0.1      1       2
Total:          0    1   0.2      1       3

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      1
  90%      1
  95%      1
  98%      1
  99%      1
 100%      3 (longest request)
Transfer rate is 8863.71 Kbytes/sec with 65771.3 requests per second?
The new guy creates a separate branch, makes some changes, and creates a pull request. Only the core developers are allowed to review the pull request and merge the changes into the main branch.
The good thing about a pull request is that the changes can be discussed first, instead of the new guy silently merging and pushing to the main branch.
Basically, if two developers are working on the same file, each can peek at the remote changes before doing a merge.
Let’s say the main developer, who works on the master branch, would like to check out a new feature that lives on another branch (new_ui):
# git fetch origin
Look at what has changed:
# git diff origin/new_ui
Then, having decided that it’s okay, merge it:
# git merge --no-ff origin/new_ui
Note the --no-ff here:
The --no-ff flag causes the merge to always create a new commit object, even if the merge could be performed with a fast-forward. This avoids losing information about the historical existence of a feature branch and groups together all commits that together added the feature.
But this is a simple case, and it’s usually fine when there are no conflicting changes. If there are, it’s usually better to use a fetch, rebase, merge workflow.
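A sketch of that workflow, assuming the feature branch is again new_ui and should be replayed on top of the latest master before merging:

# git fetch origin
# git checkout new_ui
# git rebase origin/master
# git checkout master
# git merge --no-ff new_ui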