
Monday, December 30, 2013

How to use GridFS to store big files in MongoDB

Important note: This article relates to the online MongoDB course. For more information about the course and other posts describing its content, please check my main page here: M101P: MongoDB for Developers course

MongoDB supports documents of up to 16 MB in size. To store bigger files, you need to use the GridFS feature.
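As a rough illustration of that limit, here is a minimal sketch that decides whether a file is too large for a single document. The names `MAX_BSON_DOCUMENT_BYTES` and `needs_gridfs` are made up for this example and are not part of any MongoDB API. The check ignores the few extra bytes the surrounding document needs, so a file just under the limit may still not fit.

```python
# 16 MB document size limit, expressed in bytes.
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024


def needs_gridfs(size_in_bytes):
    """Return True when the content alone reaches the document size limit.

    Hypothetical helper for illustration only.
    """
    return size_in_bytes >= MAX_BSON_DOCUMENT_BYTES


# The tarball used in this post is 95015187 bytes, well above the limit.
print(needs_gridfs(95015187))
```

For a file on disk, the size can be obtained with `os.path.getsize(path)` and passed to the function.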

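The file we are going to insert is the MongoDB 2.4.8 tarball (91 MB). A symlink in the home directory, my-big-files.img, points to the same file and is used in the Python example below.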

server# ls -lah mongodb-linux-x86_64-2.4.8.tgz
-rw-r--r-- 1 root root 91M Oct 31 22:25 mongodb-linux-x86_64-2.4.8.tgz

server:~/mongo-course/M101P/week2# ls -lah $HOME/my-big-files.img
lrwxrwxrwx 1 root root 30 Dec 30 17:55 /root/my-big-files.img -> mongodb-linux-x86_64-2.4.8.tgz

mongofiles and the mongo shell

server# mongofiles -d grid-cli-examle put mongodb-linux-x86_64-2.4.8.tgz
connected to: 127.0.0.1
added file: { _id: ObjectId('52c1b104232188a316e51d61'), filename: "mongodb-linux-x86_64-2.4.8.tgz", chunkSize: 262144, uploadDate: new Date(1388425483792), md5: "4954765464dc4d97870ddc5de147e05d", length: 95015187 }
done!

server# mongo grid-cli-examle
MongoDB shell version: 2.4.8
connecting to: grid-cli-examle
> show collections
fs.chunks
fs.files
system.indexes

> db.fs.files.find()
{ "_id" : ObjectId("52c1b104232188a316e51d61"), "filename" : "mongodb-linux-x86_64-2.4.8.tgz", "chunkSize" : 262144, "uploadDate" : ISODate("2013-12-30T17:44:43.792Z"), "md5" : "4954765464dc4d97870ddc5de147e05d", "length" : 95015187 }

> db.fs.chunks.find().count()
363
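The chunk count follows directly from the "length" and "chunkSize" values in fs.files: the file is cut into pieces of chunkSize bytes, and the last piece holds whatever is left. A small sketch of that arithmetic, using the numbers from the output above (the variable names are only for illustration):

```python
file_length = 95015187  # "length" from fs.files
chunk_size = 262144     # "chunkSize" from fs.files (256 KiB)

# Number of full chunks, and the bytes left over for the final chunk.
full_chunks, remainder = divmod(file_length, chunk_size)
expected_chunks = full_chunks + (1 if remainder else 0)

print(full_chunks)      # 362
print(remainder)        # 119059
print(expected_chunks)  # 363
```

The result matches the count returned by db.fs.chunks.find().count().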

Python

This little program reads the file from disk and inserts it into a GridFS collection in MongoDB.

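import os

import gridfs
import pymongo

# MongoClient replaces the deprecated pymongo.Connection;
# writes are acknowledged by default, so safe=True is not needed.
client = pymongo.MongoClient("mongodb://localhost")
db = client.grid_python_example
bigfiles = db.bigfiles

# Chunks go to myfile.chunks and file metadata to myfile.files.
grid = gridfs.GridFS(db, "myfile")

# Open in binary mode so the archive bytes are stored unchanged.
path = os.path.join(os.environ["HOME"], "my-big-files.img")
with open(path, "rb") as f:
    file_id = grid.put(f)

# Keep a reference to the GridFS file in a regular collection.
bigfiles.insert_one({"grid_id": file_id, "filename": "my-big-files.img"})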

Back in the mongo shell, we can confirm that the file was saved.

server# mongo grid_python_example
MongoDB shell version: 2.4.8
connecting to: grid_python_example

> show collections
bigfiles
myfile.chunks
myfile.files
system.indexes

> db.bigfiles.find()
{ "_id" : ObjectId("52c1b9d55f4cb27cabe6e650"), "filename" : "my-big-files.img", "grid_id" : ObjectId("52c1b97e5f4cb27cabe6e4e4") }
> db.myfile.chunks.find().count()
363
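To go one step further than counting chunks, the file can be read back from Python and compared with the original. The following is a minimal sketch, not a tested part of the original post. It assumes the database, the `bigfiles` collection and the `myfile` prefix shown above, and the helper names `md5_of_stream` and `open_stored_file` are made up for this example. The pymongo and gridfs imports sit inside the function so that the hashing helper can be used without a running server.

```python
import hashlib


def md5_of_stream(stream, block_size=256 * 1024):
    """Return the hex MD5 digest of a binary file-like object, read in blocks."""
    digest = hashlib.md5()
    for block in iter(lambda: stream.read(block_size), b""):
        digest.update(block)
    return digest.hexdigest()


def open_stored_file(filename="my-big-files.img"):
    """Return a file-like GridOut for the file referenced in db.bigfiles.

    Hypothetical helper; assumes a MongoDB server on localhost.
    """
    import gridfs
    import pymongo

    client = pymongo.MongoClient("mongodb://localhost")
    db = client.grid_python_example
    doc = db.bigfiles.find_one({"filename": filename})
    grid = gridfs.GridFS(db, "myfile")
    return grid.get(doc["grid_id"])


# Usage, with a server running and the file already stored:
#   import os
#   local_path = os.path.join(os.environ["HOME"], "my-big-files.img")
#   with open(local_path, "rb") as local:
#       assert md5_of_stream(local) == md5_of_stream(open_stored_file())
```

Reading in blocks keeps memory use small, which matters for files that are too large for a single document in the first place.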

References

http://docs.mongodb.org/manual/core/gridfs/
http://docs.mongodb.org/manual/reference/gridfs/
http://docs.mongodb.org/manual/reference/program/mongofiles/

