93

How can I find the names of all collections using PyMongo, and then find all fields in a chosen collection? I have the name of the database and the name of the chosen collection. (Scenario: the user inputs the name of a database; I need to find all collections and show them in a dropdown list. When the user clicks on one item, I need to find all fields in that collection.)

1
  • As Mongo is schema-less, how would you find a list of fields?
    – Dogbert
    Commented Mar 21, 2012 at 13:24
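
Because MongoDB does not enforce a schema, the only general way to learn which fields a collection has is to look at its documents. Below is a minimal sketch that tallies the top-level keys of whatever documents it is given. The function name `top_level_field_counts` is made up for illustration, and it works on any iterable of dicts, so a PyMongo cursor such as `db[name].find().limit(1000)` should fit as input.

```python
from collections import Counter


def top_level_field_counts(documents):
    """Count how many of the given documents contain each top-level field."""
    counts = Counter()
    for doc in documents:
        # Each document is a dict; its keys are the field names.
        counts.update(doc.keys())
    return counts
```

Sampling with `limit()` keeps this cheap on large collections, at the cost of possibly missing fields that only appear in documents outside the sample.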

7 Answers

135

To find the collections, you can use collection_names() - https://pymongo.readthedocs.io/en/stable/api/pymongo/database.html#pymongo.database.Database.collection_names

Update:

collection_names() has been deprecated since PyMongo 3.7 (and was removed entirely in PyMongo 4.0). It has been replaced by list_collection_names() - https://pymongo.readthedocs.io/en/stable/api/pymongo/database.html#pymongo.database.Database.list_collection_names
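
If the same code has to run against both old and new PyMongo releases, a small helper can pick whichever method exists. This is only a sketch, and `get_collection_names` is a hypothetical name. One detail worth knowing: a PyMongo `Database` turns any unknown attribute into a `Collection`, so `hasattr(db, ...)` on the instance is always true. The check below therefore looks at the class instead.

```python
def get_collection_names(db):
    """Return the collection names of a PyMongo Database-like object."""
    # Check the class, not the instance: Database.__getattr__ would happily
    # return a Collection called "list_collection_names" on old releases.
    if hasattr(type(db), "list_collection_names"):
        return db.list_collection_names()
    # Fallback for old PyMongo releases that only have the legacy method.
    return db.collection_names()
```

With a real connection this would be called as `get_collection_names(client["my_database"])`, where `my_database` is a placeholder name.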

2
  • 24
    I believe this has been replaced with list_collection_names() now
    – Alex
    Commented Aug 11, 2018 at 15:28
  • 3
    Both collection_names() and list_collection_names() work as of pymongo.__version__=='3.8.0'.
    – J-Eubanks
    Commented May 15, 2019 at 13:14
44

This is very simple. e.g.

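import json

import pymongo

if __name__ == '__main__':
    client = pymongo.MongoClient("localhost", 27017, maxPoolSize=50)
    # Map every database name to the list of its collection names.
    # On old PyMongo releases that lack these methods, use
    # database_names() and collection_names() instead.
    d = {db: client[db].list_collection_names()
         for db in client.list_database_names()}
    print(json.dumps(d))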

result -> {"database1":["collection1","collection2"...], "database2": [...], ...}, like:

{"test": ["score", "test4", "test5", "test6", "test3", "test7", "user", "test2", "test8"],
 "testdb": ["test5", "test8", "test2", "test9", "test3", "test4", "test6", "test"],
 "local": ["startup_log"],
 "stackoverflow": ["questions"]}
2
  • 2
    Just updating: client[db].collection_names() was deprecated in favor of client[db].list_collection_names(). api.mongodb.com/python/current/api/pymongo/…
    Commented Oct 2, 2018 at 14:59
  • I upvoted this for the core of the solution, but it would have been much more informative and transparent (and thus simply better) without the one-liner loop.
    – Geeocode
    Commented Oct 31, 2018 at 1:05
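
For readers who prefer the loop spelled out, here is a sketch of the same mapping written step by step. The function name `collections_by_database` is invented for this example; it only assumes that `client` offers `list_database_names()` and item access by database name, as `pymongo.MongoClient` does.

```python
def collections_by_database(client):
    """Build {database_name: [collection_name, ...]} with explicit loops."""
    result = {}
    for db_name in client.list_database_names():
        database = client[db_name]
        result[db_name] = database.list_collection_names()
    return result
```

Passing the result to `json.dumps()` should give the same kind of output as shown in the answer.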
11

DeprecationWarning: collection_names is deprecated. Use list_collection_names instead.

Changed in version 3.7: Deprecated. Use list_collection_names() instead.

An example that reads the database name from user input and then lists that database's collection names would be:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")

dbname = input("Enter database name: ")
mydb = myclient[dbname]

#list the collections
for coll in mydb.list_collection_names():
    print(coll)

Reference: Python MongoDB
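
One thing to watch for with user input: MongoDB creates databases lazily, so `myclient[dbname]` does not fail for a name that does not exist; it simply yields a database with no collections. For the dropdown scenario it may be nicer to reject unknown names up front. A sketch, with the hypothetical helper name `collections_for_dropdown`:

```python
def collections_for_dropdown(client, db_name):
    """Return sorted collection names, or raise if the database is unknown."""
    if db_name not in client.list_database_names():
        raise ValueError("No such database: %r" % db_name)
    return sorted(client[db_name].list_collection_names())
```

Sorting is only there to give the dropdown a stable order; drop it if the server's order is preferred.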

9

I always used this way to get all collection names from my MongoDB database.

import pymongo

db_connect = pymongo.MongoClient('192.168.4.202', 20020)
database_name = 'MY_DATABASE_NAME'
database = db_connect[database_name]
# collection_names() only exists in PyMongo releases before 4.0;
# newer releases provide list_collection_names() instead.
collection = database.collection_names(include_system_collections=False)
for collect in collection:
    print(collect)
1
  • This works like a charm for MongoDB version 2.x.
    Commented Apr 25, 2021 at 3:15
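
As far as I know, list_collection_names() does not take the `include_system_collections` argument, so on newer PyMongo the system collections have to be filtered out some other way. The simplest sketch is to do it on the client side after fetching the names (the helper name `without_system_collections` is made up here):

```python
def without_system_collections(names):
    """Drop collections in the reserved "system." namespace."""
    return [name for name in names if not name.startswith("system.")]
```

It would be used as `without_system_collections(database.list_collection_names())`. I believe the PyMongo documentation also describes a server-side variant that passes a `filter` on the `name` field to list_collection_names(), which avoids transferring the system names at all.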
6

Here is a script that I created that does essentially what you want.

It displays a list of all collections in the database (in this case the 'dh' database). The user types the collection of choice, and the script displays its fields, including fields inside subdocuments down two levels. The output uses Mongo's dot notation, so it can be copied directly into a Mongo query. It also checks the first-level fields for lists of dictionaries and displays those subfields in brackets, as 'field.[subfield_in_list]'.

The collection name can also be given as an optional command-line argument (e.g. python path/to/script/scriptname.py collection_name).

import sys

from pymongo import MongoClient  # Connection was removed in PyMongo 3.0

mon_con = MongoClient('localhost', 27017)
mon_db = mon_con.dh

if len(sys.argv) > 1:
    # Optional collection name given on the command line
    col = sys.argv[1]
else:
    for c in mon_db.list_collection_names():
        print(c)
    col = input('Input a collection from the list above to show its field names: ')

keylist = []
for item in mon_db[col].find():
    for key in item.keys():
        if key not in keylist:
            keylist.append(key)
        # Subdocuments: record 'field.subfield', down two levels
        if isinstance(item[key], dict):
            for subkey in item[key]:
                subkey_annotated = key + "." + subkey
                if subkey_annotated not in keylist:
                    keylist.append(subkey_annotated)
                if isinstance(item[key][subkey], dict):
                    # Fixed: the original indexed item[subkey] here
                    for subkey2 in item[key][subkey]:
                        subkey2_annotated = subkey_annotated + "." + subkey2
                        if subkey2_annotated not in keylist:
                            keylist.append(subkey2_annotated)
        # Lists of dictionaries: record 'field.[subfield]'
        if isinstance(item[key], list):
            for l in item[key]:
                if isinstance(l, dict):
                    for lkey in l.keys():
                        lkey_annotated = key + ".[" + lkey + "]"
                        if lkey_annotated not in keylist:
                            keylist.append(lkey_annotated)
keylist.sort()
for key in keylist:
    # Strip the display-only brackets so the path is a valid query key
    query_key = key.replace('[', '').replace(']', '')
    # count_documents() replaces cursor.count(), which PyMongo 4.0 removed
    keycnt = mon_db[col].count_documents({query_key: {'$exists': True}})
    print("%-5d\t%s" % (keycnt, key))

I'm sure you could write a function to iterate down the levels until there is no data left, but this was quick and dirty and serves my needs for now. You could also modify it to show the fields for just a particular set of records in a collection. Hope you find it useful.
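
A recursive version that goes down to any depth could look roughly like the sketch below. It is not the script above, just one way to generalize it, and both function names are invented. Unlike the bracket notation, it reports fields inside lists with plain dot notation ('field.subfield'), which is the form MongoDB queries accept.

```python
def collect_field_paths(value, prefix="", paths=None):
    """Recursively add the dotted path of every nested field to `paths`."""
    if paths is None:
        paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = prefix + "." + key if prefix else key
            paths.add(path)
            collect_field_paths(child, path, paths)
    elif isinstance(value, list):
        # List elements share the path of the field that holds the list.
        for element in value:
            collect_field_paths(element, prefix, paths)
    return paths


def field_paths_for(documents):
    """Return the sorted union of field paths over all given documents."""
    paths = set()
    for doc in documents:
        collect_field_paths(doc, "", paths)
    return sorted(paths)
```

Any iterable of dicts works as input, so something like `field_paths_for(mon_db[col].find())` should be enough to try it on a real collection.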

4

The simplest code using pymongo would be:

from pymongo import MongoClient
client = MongoClient('localhost',27017)
database = client.database_name
print(database.list_collection_names())
3

As a beginner, I find the official documentation of MongoDB and pymongo challenging to understand and digest. In addition, API changes make it harder to find relevant code examples. I would expect the basic official tutorial to start with the following:

from pymongo import MongoClient

db_url = 'mongodb://localhost:27017/test' # or some other default url
client = MongoClient(db_url)
db=client.admin

# Sanity check - we get server status
print(db.command("serverStatus"))

# Available databases    
print(client.list_database_names())

# Available collections in a specific database
print(db.list_collection_names())

# Bonus: A scheme of relation among databases / collections / documents

 
