=================
Install and usage
=================
`omidb` is a Python 3 command-line interface and package for parsing and
interacting with the `OPTIMAM Mammography Image Database
`_. Unless you have authorised
access to the official database, it is assumed that you have downloaded the
database (most likely a subset of it) via the `OMI-DB Sync Tool
`_.
For an overview of the database, see :doc:`structure`.
CLI
===
A simple command-line interface (CLI), ``omidb``, has been developed to
automate useful data extraction tasks commonly implemented by the hands of
researchers working with the database.
The CLI is currently limited to one (very useful) command, ``summarise``, which
can be applied to your local copy of OMI-DB::
omidb summarise
This will pass over the JSON data (within the `data` directory of OMI-DB), and
generate a CSV file that summarises the database, at the image level. For
example, the majority of images will be associated with a series, medical
procedure, study, NBSS episode and client. The command also extracts a few
useful DICOM tags, such as the manufacturer of the device, and the intent of
presentation. This does not require access to the DICOM images themselves.
The ``--clients-file `` option can be added to specify a
list of clients to parse, rather than traversing the entire database. It should
point to the path of a text file holding one line per client::
# my-client-list.txt
demd1
demd2
The ``omidb`` package logger provides detailed information about the parsing
process, e.g. studies that can't be linked to an event, so, if interested, we
recommend you route logging to a file by adding the ``--log-file
`` option.
Package Usage
=============
Import the package::
>>> import omidb
The following code iteratively parses clients ``demd8482`` and ``demd11022`` from the database::
>>> db = omidb.DB('./OMI-DB', clients=['demd8482', 'demd11022'])
>>> clients = [client for client in db]
>>> [print(client.id) for client in clients]
demd11022
demd8482
The hierarchical structure of OMI-DB is modelled by nested objects::
>>> clients[0].episodes[0].studies[0].series[0].images[0].marks
NBSS data attributes are available through class members::
>>> print(clients[0].classification.value)
Noraml
>>> print(clients[0].episodes[0].value)
RR
Access dicom properties for an image (using `pydicom `_)::
>>> print(clients[0].episodes[0].studies[0].series[0].images[0].dcm.PresentationIntentType)
FOR PRESENTATION
or via provided JSON representations of the DICOM headers (so no need for the
DICOMs themselves)::
>>> print(clients[0].episodes[0].studies[0].series[0].images[0].attributes['00080068'])
{'vr': 'CS', 'Value': ['FOR PRESENTATION']}
Plot individual images (via `matplotlib `_) and images within a series::
>>> clients[0].episodes[0].studies[0].series[0].images[0].plot()
>>> clients[0].episodes[0].studies[0].series[0].plot()
Use ``FilterImages`` to perform inplace, recursive dicom property filtering over images::
>>> image_filter = omidb.filters.FilterImages.dicom_filter(
{'PresentationIntentType': ['FOR PROCESSING']})
>>> image_filter(clients[0]) # In-place filtering
See :doc:`omidb` for API documentation.
Installing
==========
You will need version >=3.7 of Python.
For the CLI only, we recommend `pipx `_::
pipx install omidb
To install the package in your project::
poetry add omidb
or::
pip install omidb
For development::
git clone https://bitbucket.org/scicomcore/omi-db.git
poetry install --dev
To build the documentation::
cd ./docs
poetry run make html