=================
Install and usage
=================

`omidb` is a Python 3 command-line interface and package for parsing and
interacting with the OPTIMAM Mammography Image Database. Unless you have
authorised access to the official database, it is assumed that you have
downloaded the database (most likely a subset of it) via the OMI-DB Sync Tool.

For an overview of the database, see :doc:`structure`.

CLI
===

A simple command-line interface (CLI), ``omidb``, automates data extraction
tasks that researchers working with the database commonly implement by hand.
The CLI is currently limited to one (very useful) command, ``summarise``,
which can be applied to your local copy of OMI-DB::

    omidb summarise

This command traverses the JSON data (within the `data` directory of OMI-DB)
and generates a CSV file that summarises the database at the image level. For
example, the majority of images will be associated with a series, medical
procedure, study, NBSS episode and client. The command also extracts a few
useful DICOM tags, such as the manufacturer of the device and the presentation
intent. It does not require access to the DICOM images themselves.

The ``--clients-file`` option restricts parsing to a list of clients, rather
than traversing the entire database. It takes the path of a text file holding
one client per line::

    # my-client-list.txt
    demd1
    demd2

The ``omidb`` package logger provides detailed information about the parsing
process, e.g. studies that cannot be linked to an event. If this is of
interest, we recommend routing the log to a file with the ``--log-file``
option.
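The client list described above is a plain text file with one client ID per line, so it is easy to generate from a script. The following is a minimal sketch, not part of ``omidb``: the helper name ``write_clients_file`` is hypothetical, and stripping whitespace and dropping duplicate IDs are assumptions about what makes a tidy list rather than documented requirements of the CLI.

```python
from pathlib import Path
from typing import Iterable


def write_clients_file(client_ids: Iterable[str], path: str) -> Path:
    """Write one client ID per line (hypothetical helper, not an omidb API).

    Blank entries are skipped and duplicates are dropped, keeping the
    first occurrence of each ID. Both behaviours are assumptions.
    """
    seen = set()
    unique_ids = []
    for client_id in client_ids:
        client_id = client_id.strip()
        if client_id and client_id not in seen:
            seen.add(client_id)
            unique_ids.append(client_id)

    out = Path(path)
    # One ID per line, each terminated by a newline.
    out.write_text("".join(f"{cid}\n" for cid in unique_ids), encoding="utf-8")
    return out


if __name__ == "__main__":
    target = write_clients_file(["demd1", "demd2"], "my-client-list.txt")
    print(target.read_text(encoding="utf-8"), end="")
```

The resulting file can then be supplied to ``omidb summarise`` through the ``--clients-file`` option.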
Package Usage
=============

Import the package::

    >>> import omidb

The following code iteratively parses clients ``demd8482`` and ``demd11022``
from the database::

    >>> db = omidb.DB('./OMI-DB', clients=['demd8482', 'demd11022'])
    >>> clients = [client for client in db]
    >>> for client in clients:
    ...     print(client.id)
    demd11022
    demd8482

The hierarchical structure of OMI-DB is modelled by nested objects::

    >>> clients[0].episodes[0].studies[0].series[0].images[0].marks

NBSS data attributes are available through class members::

    >>> print(clients[0].classification.value)
    Normal

    >>> print(clients[0].episodes[0].value)
    RR

Access DICOM properties for an image (using pydicom)::

    >>> print(clients[0].episodes[0].studies[0].series[0].images[0].dcm.PresentationIntentType)
    FOR PRESENTATION

or via the provided JSON representations of the DICOM headers (so there is no
need for the DICOM files themselves)::

    >>> print(clients[0].episodes[0].studies[0].series[0].images[0].attributes['00080068'])
    {'vr': 'CS', 'Value': ['FOR PRESENTATION']}

Plot individual images (via matplotlib) and images within a series::

    >>> clients[0].episodes[0].studies[0].series[0].images[0].plot()
    >>> clients[0].episodes[0].studies[0].series[0].plot()

Use ``FilterImages`` to perform in-place, recursive filtering of images by
DICOM property::

    >>> image_filter = omidb.filters.FilterImages.dicom_filter(
    ...     {'PresentationIntentType': ['FOR PROCESSING']})
    >>> image_filter(clients[0])  # In-place filtering

See :doc:`omidb` for API documentation.

Installing
==========

Python 3.7 or later is required. For the CLI only, we recommend pipx::

    pipx install omidb

To install the package in your project::

    poetry add omidb

or::

    pip install omidb

For development::

    git clone https://bitbucket.org/scicomcore/omi-db.git
    cd omi-db
    poetry install

To build the documentation::

    cd ./docs
    poetry run make html