[MessagePack] Benchmarking msgpack-python

MessagePack (msgpack) is a serialization format designed to be faster and more compact than JSON.
In this post I measure how msgpack-python performs in practice.

Features of MessagePack

  • Fast serialization and deserialization
  • Compact serialized data
  • Supports stream processing

Its closest rival is probably BSON, which is used by MongoDB among others.

msgpack-python vs simplejson

I compared msgpack with a JSON encoder/decoder (simplejson) in terms of speed and encoded size.

The code is a lightly modified version of benchmark.py from msgpack-python.

# Written for Python 2 (print statements, xrange).
from msgpack import fallback
try:
    from msgpack import _unpacker, _packer
    has_ext = True
except ImportError:
    has_ext = False
import timeit
import simplejson as json
import gc

def profile(name, func):
    # Call func 1000 times per round for 4 rounds and
    # print the total seconds taken by each round.
    times = timeit.repeat(func, number=1000, repeat=4)
    times = ', '.join(["%8f" % t for t in times])
    print("%-30s %40s" % (name, times))


def bench_msgpack(name, data):
    if has_ext:
        packer = _packer.Packer()
        profile("packing %s (ext)" % name, lambda: packer.pack(data))
    packer = fallback.Packer()
    profile('packing %s (fallback)' % name, lambda: packer.pack(data))
    print " msgpack packed size:  ", len(packer.pack(data)), "[bytes]"

    data = packer.pack(data)
    if has_ext:
        profile('unpacking %s (ext)' % name, lambda: _unpacker.unpackb(data))
    profile('unpacking %s (fallback)' % name, lambda: fallback.unpackb(data))
    # Note: len() of the unpacked object is its element count, not a size in bytes.
    print " msgpack unpacked size:  ", len(fallback.unpackb(data)), "[bytes]"

def bench_json(name, data):
    profile('dumping %s ' % name, lambda: json.dumps(data))
    print " json dumped size:  ", len(json.dumps(data)), "[bytes]"

    data = json.dumps(data)
    profile('loading %s ' % name, lambda: json.loads(data))
    # Note: len() of the loaded object is its element count, not a size in bytes.
    print " json loaded size:  ", len(json.loads(data)), "[bytes]"

def main():
    gc.disable()
    print "---msgpack---" 
    bench_msgpack("integers", [123]*1000)
    bench_msgpack("bytes", [b'x'*n for n in range(100)]*10)
    bench_msgpack("string", ['a'*(i % 128) for i in xrange(2**8)])
    bench_msgpack("lists", [[1,2,3]]*1000)
    bench_msgpack("dicts", [{'one':1}]*1000)

    print "---json---" 
    bench_json("integers", [123]*1000)
    bench_json("bytes", [b'x'*n for n in range(100)]*10)
    bench_json("string", ['a'*(i % 128) for i in xrange(2**8)])
    bench_json("lists", [[1,2,3]]*1000)
    bench_json("dicts", [{'one':1}]*1000)

main()
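
Each row of the benchmark output is the total time for 1000 calls, measured four times. As a minimal sketch of how such a row could be reduced to a single per-call figure, the helper below takes the fastest round and divides by the call count. The name best_per_call is hypothetical, and the sketch uses the standard-library json module instead of simplejson so that it runs without extra packages. It targets Python 3, unlike the benchmark itself.

```python
import json
import timeit


def best_per_call(func, number=1000, repeat=4):
    """Return the fastest observed time for a single call of func, in seconds."""
    # timeit.repeat returns one total per round; each total covers `number` calls.
    totals = timeit.repeat(func, number=number, repeat=repeat)
    # The minimum is the round least disturbed by other activity on the machine.
    return min(totals) / number


data = [123] * 1000
seconds = best_per_call(lambda: json.dumps(data), number=200, repeat=3)
print("json.dumps: %.2f microseconds per call" % (seconds * 1e6))
```

Taking the minimum rather than the mean is a common choice for micro-benchmarks, since slower rounds usually reflect interference rather than the code under test.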

The test machine is a MacBook Air (Darwin Kernel Version 13.2.0) with a 1.7 GHz Intel Core i5 CPU and 4 GB of DDR3 memory.

$ uname -a
Darwin MacBook-Air.local 13.2.0 Darwin Kernel Version 13.2.0: Thu Apr 17 23:03:13 PDT 2014; root:xnu-2422.100.13~1/RELEASE_X86_64 x86_64
$ python benchmark.py 
---msgpack---
packing integers (ext)           0.026387, 0.026884, 0.026210, 0.026112
packing integers (fallback)      1.917463, 1.893093, 1.889604, 1.898885
 msgpack packed size:   1003 [bytes]
unpacking integers (ext)         0.016021, 0.015831, 0.018296, 0.016544
unpacking integers (fallback)    3.501446, 3.502847, 3.499088, 3.518903
 msgpack unpacked size:   1000 [bytes]
packing bytes (ext)              0.112583, 0.114032, 0.104412, 0.104621
packing bytes (fallback)         3.467647, 3.460742, 3.747350, 3.727999
 msgpack packed size:   51863 [bytes]
unpacking bytes (ext)            0.069892, 0.073419, 0.060337, 0.051231
unpacking bytes (fallback)       8.099066, 7.324425, 7.163550, 7.199357
 msgpack unpacked size:   1000 [bytes]
packing string (ext)             0.027492, 0.032500, 0.027371, 0.030443
packing string (fallback)        0.894762, 0.921440, 0.908589, 0.947287
 msgpack packed size:   16899 [bytes]
unpacking string (ext)           0.016590, 0.012877, 0.015480, 0.015310
unpacking string (fallback)      1.912240, 1.898810, 1.909894, 1.890657
 msgpack unpacked size:   256 [bytes]
packing lists (ext)              0.311708, 0.321781, 0.322586, 0.308297
packing lists (fallback)       10.336071, 10.285126, 10.290976, 10.240862
 msgpack packed size:   4003 [bytes]
unpacking lists (ext)            0.248507, 0.262906, 0.245187, 0.245483
unpacking lists (fallback)     15.156253, 15.194266, 15.115340, 15.197394
 msgpack unpacked size:   1000 [bytes]
packing dicts (ext)              0.224640, 0.218614, 0.229561, 0.223867
packing dicts (fallback)       10.599498, 10.586775, 10.533657, 10.519049
 msgpack packed size:   6003 [bytes]
unpacking dicts (ext)            0.277766, 0.290569, 0.270391, 0.278007
unpacking dicts (fallback)     13.354813, 13.383455, 13.350145, 13.332089
 msgpack unpacked size:   1000 [bytes]
---json---
dumping integers       0.139068, 0.156903, 0.132771, 0.132070
 json dumped size:   5000 [bytes]
loading integers       0.176400, 0.169559, 0.171493, 0.165896
 json loaded size:   1000 [bytes]
dumping bytes          0.431933, 0.441784, 0.424687, 0.436677
 json dumped size:   53500 [bytes]
loading bytes          0.198330, 0.191880, 0.192878, 0.192028
 json loaded size:   1000 [bytes]
dumping string         0.153207, 0.145305, 0.142040, 0.139862
 json dumped size:   17280 [bytes]
loading string         0.063171, 0.062628, 0.061795, 0.062163
 json loaded size:   256 [bytes]
dumping lists          1.099332, 1.087891, 1.113273, 1.119309
 json dumped size:   11000 [bytes]
loading lists          0.642064, 0.660300, 0.663816, 0.642569
 json loaded size:   1000 [bytes]
dumping dicts          1.212356, 1.169995, 1.176135, 1.193705
 json dumped size:   12000 [bytes]
loading dicts          0.464122, 0.470241, 0.459511, 0.473723
 json loaded size:   1000 [bytes]

Speed comparison

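The figures below show the encode/decode time of msgpack (C extension, "ext") as a percentage of the JSON time, with JSON at 100%. Lower means faster than JSON. The pure-Python fallback, by contrast, is slower than simplejson in every test.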

  • int: serialize about 18%, deserialize about 9%
  • bytes: serialize about 26%, deserialize about 35%
  • string: serialize about 18%, deserialize about 26%
  • lists: serialize about 28%, deserialize about 39%
  • dicts: serialize about 19%, deserialize about 60%
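
The percentages can be recomputed from the raw timings in the benchmark output. The post does not say how the four runs per row were combined, so the sketch below assumes a simple mean; the names TIMINGS, mean, and relative_times are hypothetical. Under that assumption the results land within a few points of the figures listed here.

```python
# Timings copied from the benchmark output: seconds per 1000 calls, 4 runs each.
# Order per data type: msgpack pack (ext), msgpack unpack (ext), json dumps, json loads.
TIMINGS = {
    "integers": ([0.026387, 0.026884, 0.026210, 0.026112],
                 [0.016021, 0.015831, 0.018296, 0.016544],
                 [0.139068, 0.156903, 0.132771, 0.132070],
                 [0.176400, 0.169559, 0.171493, 0.165896]),
    "bytes": ([0.112583, 0.114032, 0.104412, 0.104621],
              [0.069892, 0.073419, 0.060337, 0.051231],
              [0.431933, 0.441784, 0.424687, 0.436677],
              [0.198330, 0.191880, 0.192878, 0.192028]),
    "string": ([0.027492, 0.032500, 0.027371, 0.030443],
               [0.016590, 0.012877, 0.015480, 0.015310],
               [0.153207, 0.145305, 0.142040, 0.139862],
               [0.063171, 0.062628, 0.061795, 0.062163]),
    "lists": ([0.311708, 0.321781, 0.322586, 0.308297],
              [0.248507, 0.262906, 0.245187, 0.245483],
              [1.099332, 1.087891, 1.113273, 1.119309],
              [0.642064, 0.660300, 0.663816, 0.642569]),
    "dicts": ([0.224640, 0.218614, 0.229561, 0.223867],
              [0.277766, 0.290569, 0.270391, 0.278007],
              [1.212356, 1.169995, 1.176135, 1.193705],
              [0.464122, 0.470241, 0.459511, 0.473723]),
}


def mean(values):
    return sum(values) / len(values)


def relative_times(timings):
    """Map each data type to (serialize %, deserialize %) with JSON = 100%."""
    result = {}
    for name, (pack, unpack, dumps, loads) in timings.items():
        result[name] = (100.0 * mean(pack) / mean(dumps),
                        100.0 * mean(unpack) / mean(loads))
    return result


RATIOS = relative_times(TIMINGS)
for name, (ser, de) in RATIOS.items():
    print("%-9s serialize %5.1f%%  deserialize %5.1f%%" % (name, ser, de))
```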

Size comparison

The figures below show the msgpack encoded size as a percentage of the JSON encoded size, with JSON at 100%. Lower means more compact than JSON.

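  • int: about 20% (1003 / 5000 bytes)
  • bytes: about 97% (51863 / 53500 bytes)
  • string: about 98% (16899 / 17280 bytes)
  • lists: about 36% (4003 / 11000 bytes)
  • dicts: about 50% (6003 / 12000 bytes)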
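
The encoded sizes in the benchmark output can be reproduced by hand, which also shows where the savings come from: small integers take one byte and containers carry only a short length header, while string payloads are stored as-is in both formats. The sketch below is not msgpack-python itself. It is a size estimator that assumes the older MessagePack "raw" encoding for strings and bytes (an assumption made because it reproduces the sizes in the output; newer msgpack releases may encode strings slightly differently). The names raw_header, container_header, and packed_size are hypothetical, and the code targets Python 3, so the "bytes" data set is built from str values because json.dumps cannot serialize bytes there.

```python
import json


def raw_header(n):
    """Header size for a raw string/bytes payload of n bytes (legacy raw format)."""
    if n <= 31:
        return 1   # fixraw: length stored in the type byte
    if n <= 0xFFFF:
        return 3   # raw 16: type byte + 2-byte length
    return 5       # raw 32: type byte + 4-byte length


def container_header(n):
    """Header size for an array or map with n entries."""
    if n <= 15:
        return 1   # fixarray / fixmap
    if n <= 0xFFFF:
        return 3   # array 16 / map 16
    return 5       # array 32 / map 32


def packed_size(obj):
    """Estimated MessagePack size in bytes for the value types used in the benchmark."""
    if isinstance(obj, int) and 0 <= obj <= 127:
        return 1   # positive fixint
    if isinstance(obj, (str, bytes)):
        # Assumes ASCII text, so the character count equals the byte count.
        return raw_header(len(obj)) + len(obj)
    if isinstance(obj, list):
        return container_header(len(obj)) + sum(packed_size(x) for x in obj)
    if isinstance(obj, dict):
        return container_header(len(obj)) + sum(
            packed_size(k) + packed_size(v) for k, v in obj.items())
    raise TypeError("type not covered by this sketch: %r" % type(obj))


DATASETS = {
    "integers": [123] * 1000,
    "bytes": ["x" * n for n in range(100)] * 10,
    "string": ["a" * (i % 128) for i in range(2 ** 8)],
    "lists": [[1, 2, 3]] * 1000,
    "dicts": [{"one": 1}] * 1000,
}

SIZES = {name: (packed_size(d), len(json.dumps(d))) for name, d in DATASETS.items()}
for name, (mp, js) in SIZES.items():
    print("%-9s msgpack %6d  json %6d  ratio %5.1f%%" % (name, mp, js, 100.0 * mp / js))
```

For the string and bytes data sets the payload dominates, so msgpack saves only the quotes and separators that JSON adds around each element.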

Results will vary with the kinds of objects involved, the language, and the runtime, so treat these numbers as a rough guide only.